YouTube to MP3 using S3, Lambda & Elastic Transcoder

In this tutorial, I will show you how to convert a YouTube video 📺 to an MP3 file 💿 using AWS Elastic Transcoder. How can we do that?

We will create a Lambda function to consume events published by S3. For every video uploaded to a bucket, S3 invokes our Lambda function and passes it the event information. As the function executes, it reads the S3 event data, logs some of the event information to Amazon CloudWatch, and then kicks off a transcoding job.

Let’s start by creating an S3 bucket to store the input files (videos) and the output files (audio):

Next, let’s define a Transcoder pipeline. A pipeline essentially defines a queue for future transcoding jobs. To create a pipeline, we need to specify the input bucket (where the videos will be).

Note: Copy down the Pipeline ID; we will need it later on.

Having created a pipeline, go to the AWS Management Console, navigate to the Lambda service and click on “Create a Lambda Function”, then add S3 as the event source for the Lambda function:

I used the following Node.JS code:

The script does the following:

  • Extract the filename of the uploaded file from the event object
  • Create a Transcoder job and specify the required outputs
  • Launch the job
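A minimal sketch of those three steps, not necessarily the exact function used here. It assumes the AWS SDK for JavaScript v2 (`ElasticTranscoder.createJob`), an `outputs/` key prefix, and an environment variable named `PIPELINE_ID`; the prefix and the variable name are illustrative. The Transcoder client is injected so the parameter-building logic can be checked without AWS credentials.

```javascript
// Preset ID quoted in this tutorial (an MP3 audio preset).
const MP3_PRESET_ID = '1351620000001-300040';

// S3 event keys arrive URL-encoded, with '+' standing in for spaces.
function keyFromEvent(event) {
  const raw = event.Records[0].s3.object.key;
  return decodeURIComponent(raw.replace(/\+/g, ' '));
}

// Build the createJob request: one input video, one MP3 output.
function buildJobParams(event, pipelineId) {
  const inputKey = keyFromEvent(event);
  // Drop any folder prefix and the file extension to name the output.
  const baseName = inputKey.split('/').pop().replace(/\.[^.]+$/, '');
  return {
    PipelineId: pipelineId,
    Input: { Key: inputKey },
    OutputKeyPrefix: 'outputs/', // assumed output folder
    Outputs: [{ Key: baseName + '.mp3', PresetId: MP3_PRESET_ID }]
  };
}

// Handler factory: logs the object key, then launches the job.
function makeHandler(transcoder, pipelineId) {
  return function handler(event, context, callback) {
    const params = buildJobParams(event, pipelineId);
    console.log('Transcoding', params.Input.Key);
    transcoder.createJob(params, callback);
  };
}

// Wiring inside Lambda (requires the aws-sdk package):
// const AWS = require('aws-sdk');
// exports.handler = makeHandler(new AWS.ElasticTranscoder(), process.env.PIPELINE_ID);
```

Keeping `buildJobParams` separate from the SDK call makes it easy to try the key handling locally with a hand-written event.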

Note: you might notice the use of a preset (1351620000001-300040) in the function above. It describes how to encode the given file (in this case, as MP3). The full list of available presets can be found in the AWS documentation.

Finally, set the pipeline ID as an environment variable and select an IAM role with permission to access Elastic Transcoder:

Once created, upload a video file to the inputs bucket:

If everything went well, you should see the file in your outputs bucket:

S3 will trigger our Lambda function, which will then execute and log the S3 object name to CloudWatch Logs:

After a couple of seconds (or minutes, depending on the size of the video), you should see that a new MP3 file has been generated by the Elastic Transcoder job inside the outputs directory in the S3 bucket:

Setup AWS Lambda with Scheduled Events

This post is part of my “Serverless” series. In this part, I will show you how to set up a Lambda function that sends emails on a schedule defined by a CloudWatch event.

1 – Create Lambda Function

Start by cloning the project:

I implemented a simple Lambda function in Node.js to send an email using the Mailgun library.

Note: you could use another service like AWS SES or your own SMTP server
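As a rough idea of what such a function can look like, here is a minimal sketch, assuming the `mailgun-js` package and its `messages().send(data, callback)` call. The sender, recipient, and subject are placeholders, and the Mailgun client is passed in so the handler can be exercised without network access.

```javascript
// Build the message to send; all addresses here are placeholders.
function buildMessage(domain) {
  return {
    from: 'Scheduler <noreply@' + domain + '>',
    to: 'you@example.com',
    subject: 'Scheduled Lambda notification',
    text: 'Sent by a scheduled Lambda function at ' + new Date().toISOString()
  };
}

// Handler factory: sends one email per invocation and reports the result.
function makeHandler(mailgun, domain) {
  return function handler(event, context, callback) {
    mailgun.messages().send(buildMessage(domain), function (error, body) {
      callback(error, body);
    });
  };
}

// Wiring inside Lambda (requires the mailgun-js package):
// const mailgun = require('mailgun-js')({
//   apiKey: process.env.MAILGUN_API_KEY,
//   domain: process.env.MAILGUN_DOMAIN
// });
// exports.handler = makeHandler(mailgun, process.env.MAILGUN_DOMAIN);
```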

Then, create a zip file:

Next, we need to create an Execution Role for our function:

Execute the following AWS CLI command to create the Lambda function. We need to provide the zip file and the ARN of the IAM role we created earlier, and set MAILGUN_API_KEY and MAILGUN_DOMAIN as parameters.

Note: the --runtime parameter uses Node.JS 6.10 but you can also specify Node.JS 4.3
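The same request can also be expressed through the SDK. The sketch below builds the parameters for `Lambda.createFunction` in the AWS SDK for JavaScript v2. The function name and the `index.handler` entry point are hypothetical, and passing the two Mailgun settings as environment variables is an assumption about how they are set. The runtime string mirrors the tutorial; it has since been deprecated, so substitute a currently supported Node.js runtime.

```javascript
// Build the createFunction request from the zip contents and the role ARN.
function buildCreateFunctionParams(zipBuffer, roleArn, mailgun) {
  return {
    FunctionName: 'ScheduledMailer', // hypothetical name
    Runtime: 'nodejs6.10',           // as in the tutorial; now deprecated
    Handler: 'index.handler',        // assumes index.js exports "handler"
    Role: roleArn,
    Code: { ZipFile: zipBuffer },
    Environment: {
      Variables: {
        MAILGUN_API_KEY: mailgun.apiKey,
        MAILGUN_DOMAIN: mailgun.domain
      }
    }
  };
}

// Usage (requires the aws-sdk package and a real zip file):
// const AWS = require('aws-sdk');
// const fs = require('fs');
// const params = buildCreateFunctionParams(fs.readFileSync('function.zip'), roleArn, settings);
// new AWS.Lambda().createFunction(params, console.log);
```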

Once created, AWS Lambda returns function configuration information as shown in the following example:

Now, if we go back to the AWS Lambda dashboard, we should see that our function has been successfully created:

2 – Configure a CloudWatch Rule

Create a new rule which will trigger our Lambda function every 5 minutes:

Note: you can specify the value as a rate or in the cron expression format. All schedules use the UTC time zone, and the minimum precision for schedules is one minute
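To make the two formats concrete, here is a small sketch that builds a rate expression and the parameters for `CloudWatchEvents.putRule` (AWS SDK for JavaScript v2). The helper names and the rule name are illustrative. Note that the unit is singular only when the value is 1.

```javascript
// Build a rate expression such as "rate(5 minutes)" or "rate(1 hour)".
function rateExpression(value, unit) {
  if (!Number.isInteger(value) || value < 1) {
    throw new RangeError('value must be a positive integer');
  }
  if (!['minute', 'hour', 'day'].includes(unit)) {
    throw new RangeError('unit must be minute, hour or day');
  }
  return 'rate(' + value + ' ' + unit + (value === 1 ? '' : 's') + ')';
}

// The same 5-minute schedule in cron format (six fields, evaluated in UTC).
const EVERY_5_MINUTES_CRON = 'cron(0/5 * * * ? *)';

// Parameters for putRule; the rule still needs the function added as a target.
function buildRuleParams(name, scheduleExpression) {
  return { Name: name, ScheduleExpression: scheduleExpression, State: 'ENABLED' };
}

console.log(buildRuleParams('every-5-minutes', rateExpression(5, 'minute')));
```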

If you now go back to the Lambda function console and navigate to the Trigger tab, you should see that the CloudWatch trigger has been added:

After 5 minutes, CloudWatch will trigger the Lambda Function and you should get an email notification:

Getting started with AWS VPC

We use a Virtual Private Cloud (VPC) ☁ to define our own network and how the AWS resources (EC2, EB, RDS …) inside that network are exposed to the Internet 🌎.

As promised, today I will show you how to create a VPC and how to:

  • Create a Public Subnet with a couple of WebServers and a NAT instance.
  • Create a Private Subnet with a Database instance.
  • Configure AWS Flow logs to store VPC traffic and metrics in CloudWatch.
  • Configure a Network Access Control List (ACL), Route Table & Security Groups …

1 – Create a VPC

Go to the AWS Management Console and navigate to the VPC Dashboard then click on “Create VPC”:

Give your VPC a name and assign it a private IPv4 CIDR block (/16 or smaller) from the following ranges:

  • 192.168.0.0 – 192.168.255.255 (65,536 IP addresses)
  • 172.16.0.0 – 172.31.255.255 (1,048,576 IP addresses)
  • 10.0.0.0 – 10.255.255.255 (16,777,216 IP addresses)
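A quick way to sanity-check a candidate block is to convert the addresses to numbers and compare them against these three ranges. The sketch below does that and also enforces the VPC size limits (a netmask between /16 and /28). The function names are illustrative, and the check does not verify that the address is aligned to its prefix.

```javascript
// The three private ranges listed above, as [first, last] pairs.
const PRIVATE_RANGES = [
  ['192.168.0.0', '192.168.255.255'],
  ['172.16.0.0', '172.31.255.255'],
  ['10.0.0.0', '10.255.255.255']
];

// Convert dotted-quad notation to a number (plain arithmetic, no bitwise ops,
// to stay clear of JavaScript's signed 32-bit integers).
function ipToNumber(ip) {
  return ip.split('.').reduce((acc, octet) => acc * 256 + Number(octet), 0);
}

// Number of addresses in a block with the given prefix length.
function blockSize(prefix) {
  return Math.pow(2, 32 - prefix);
}

// True if the block fits in one private range and is sized /16 to /28.
function isValidPrivateVpcCidr(cidr) {
  const [ip, prefixText] = cidr.split('/');
  const prefix = Number(prefixText);
  if (!Number.isInteger(prefix) || prefix < 16 || prefix > 28) return false;
  const first = ipToNumber(ip);
  const last = first + blockSize(prefix) - 1;
  return PRIVATE_RANGES.some(
    ([low, high]) => first >= ipToNumber(low) && last <= ipToNumber(high)
  );
}

console.log(isValidPrivateVpcCidr('10.0.0.0/16'));   // inside 10.0.0.0/8
console.log(isValidPrivateVpcCidr('172.32.0.0/16')); // outside 172.16.0.0/12
```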

Leave the default settings unchanged and click on “Yes, Create”. When the work completes, you should see something like the following:

Note: When you create a VPC, a main Route Table, a default ACL, and a default Security Group are created automatically.

As shown in our network diagram, we need to create two subnets spanning two Availability Zones (AZ). Each AZ will have one subnet.

2 – Public Subnet

Click on the subnets menu on the left, then click on “Create Subnet”:

Populate the fields as above. Make sure you select the VPC we created earlier. Then, click on “Yes, Create”. Once created, you should see:

Note: We used a CIDR block of 10.0.1.0/24, which gives us 251 usable IPs: 5 IPs are reserved (2 for the network and broadcast addresses, and 3 reserved by default by Amazon).
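The arithmetic behind that figure can be sketched as follows: a /24 holds 256 addresses, and AWS keeps the first four and the last one in every subnet. The helper names are illustrative.

```javascript
// AWS reserves five addresses in every subnet.
const RESERVED_PER_SUBNET = 5;

function ipToNumber(ip) {
  return ip.split('.').reduce((acc, octet) => acc * 256 + Number(octet), 0);
}

function numberToIp(n) {
  return [
    Math.floor(n / 16777216) % 256,
    Math.floor(n / 65536) % 256,
    Math.floor(n / 256) % 256,
    n % 256
  ].join('.');
}

// Usable addresses in a subnet with the given prefix length.
function usableAddresses(prefix) {
  return Math.pow(2, 32 - prefix) - RESERVED_PER_SUBNET;
}

// The reserved addresses: network address, VPC router, DNS,
// one held for future use, and the broadcast address.
function reservedAddresses(cidr) {
  const [ip, prefixText] = cidr.split('/');
  const base = ipToNumber(ip);
  const size = Math.pow(2, 32 - Number(prefixText));
  return [base, base + 1, base + 2, base + 3, base + size - 1].map(numberToIp);
}

console.log(usableAddresses(24));
console.log(reservedAddresses('10.0.1.0/24'));
```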

That being said, let’s visualize what we have done so far:

3 – Private Subnet

We similarly create the private subnet:

Then:

Now, we have a VPC with two Subnets:

In order to make the first subnet (10.0.1.0/24) publicly accessible from the Internet, we need to create an Internet Gateway (IGW).

4 – Internet Gateway

So click on “Internet Gateway” menu on the left, then click on “Create Internet Gateway” and give it a name:

Then attach it to the VPC we created earlier by clicking on the “Attach to VPC” button:

So far we have created:

5 – Route Table

We will now create a new Route Table that allows instances inside the Public Subnet to direct all traffic to the Internet Gateway, so that the gateway can forward it out to the Internet. So, go to the “Route Tables” menu on the left, and click on “Create Route Table”. Give it a name and select our VPC:

Then click on “Yes, Create”, open the “Routes” tab and click the “Edit” button. Add a new route that directs all traffic (0.0.0.0/0) to the Internet Gateway that we created:
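Why does traffic between instances stay inside the VPC once a 0.0.0.0/0 route exists? A route table picks the most specific matching route (longest prefix), so the built-in local route wins for VPC addresses and only everything else goes to the gateway. Here is a small model of that selection, assuming a VPC CIDR of 10.0.0.0/16 and a made-up gateway ID.

```javascript
function ipToNumber(ip) {
  return ip.split('.').reduce((acc, octet) => acc * 256 + Number(octet), 0);
}

// True if the IP falls inside the CIDR block.
function matches(ip, cidr) {
  const [base, prefixText] = cidr.split('/');
  const start = ipToNumber(base);
  const size = Math.pow(2, 32 - Number(prefixText));
  const n = ipToNumber(ip);
  return n >= start && n < start + size;
}

// Pick the target of the matching route with the longest prefix.
function selectTarget(routes, destinationIp) {
  let best = null;
  for (const route of routes) {
    const prefix = Number(route.destination.split('/')[1]);
    if (matches(destinationIp, route.destination) &&
        (best === null || prefix > best.prefix)) {
      best = { prefix: prefix, target: route.target };
    }
  }
  return best ? best.target : null;
}

// Public route table: the local route (assumed VPC CIDR) plus the new default route.
const publicRouteTable = [
  { destination: '10.0.0.0/16', target: 'local' },
  { destination: '0.0.0.0/0', target: 'igw-0example' } // hypothetical gateway ID
];

console.log(selectTarget(publicRouteTable, '10.0.1.25')); // stays in the VPC
console.log(selectTarget(publicRouteTable, '8.8.8.8'));   // leaves via the IGW
```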

As a last step, we associate this route table with the public subnet:

Then “Save”. Let’s see what we have created so far:

Now 10.0.1.0/24 is a public subnet, as we associated it with a route table that has an IGW route. We can now launch an instance into the public subnet that can be accessed over the Internet.

6 – EC2 Instances

Note: In order to automatically assign a public IP address to every EC2 instance created in the public subnet, we need to enable “Auto-assign IP” on the public subnet:

6.1 – Webserver Instance

So now, let’s go ahead and deploy a new EC2 instance. I already did a tutorial on how to do that; there are a few things you will want to do differently from that tutorial:

  • On the configuration page, select the VPC we created earlier and the public subnet:

  • This instance will play the role of a webserver, so we need to install an Apache Server. Add the following script to the instance user data:

 

  • Don’t forget to assign a label to the instance, and enable HTTP port on the security group page:
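For readers who prefer the SDK to the console, the steps above map roughly onto a single `EC2.runInstances` request (AWS SDK for JavaScript v2). This is a sketch under assumptions: the user data script targets an Amazon Linux AMI with `yum` and is not necessarily the script used in this tutorial, the instance type is a guess, and every ID is supplied by the caller. The API expects user data to be base64-encoded.

```javascript
// Assumed user data: install and start Apache on Amazon Linux.
const USER_DATA = [
  '#!/bin/bash',
  'yum update -y',
  'yum install -y httpd',
  'service httpd start',
  'chkconfig httpd on'
].join('\n');

// Build the runInstances request for one webserver in the public subnet.
function buildWebserverParams(imageId, subnetId, securityGroupId, keyName) {
  return {
    ImageId: imageId,
    InstanceType: 't2.micro', // assumed size
    MinCount: 1,
    MaxCount: 1,
    SubnetId: subnetId,                   // the public subnet
    SecurityGroupIds: [securityGroupId],  // group with HTTP (port 80) open
    KeyName: keyName,
    UserData: Buffer.from(USER_DATA).toString('base64')
  };
}

// Usage (requires the aws-sdk package):
// new (require('aws-sdk')).EC2().runInstances(buildWebserverParams(...), console.log);
```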

Your instance is now launching, which may take a few minutes. Click the “View your instances” link and you will be redirected to your “My Instances” page, where you can monitor and configure your EC2 instance:

Now, if you point your browser to the instance public IP:

You can now use the instance key pair to connect to the server via SSH:

Awesome 😎, so we have successfully created an EC2 instance inside a public subnet:

6.2 – Database Instance

We will follow the same steps as in the previous section, except this time the instance will be inside the private subnet:

We will also allow access to the MySQL port (3306), SSH, and ping only from the public subnet.
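Expressed as security group rules, that restriction looks roughly like the sketch below, which builds the parameters for `EC2.authorizeSecurityGroupIngress` (AWS SDK for JavaScript v2). It uses MySQL’s default port 3306 and allows all ICMP types for simplicity; the group ID is supplied by the caller.

```javascript
// Build ingress rules that only admit traffic from the public subnet.
function buildDbIngressParams(groupId, publicSubnetCidr) {
  const source = [{ CidrIp: publicSubnetCidr }];
  return {
    GroupId: groupId,
    IpPermissions: [
      // MySQL
      { IpProtocol: 'tcp', FromPort: 3306, ToPort: 3306, IpRanges: source },
      // SSH
      { IpProtocol: 'tcp', FromPort: 22, ToPort: 22, IpRanges: source },
      // Ping: -1/-1 means all ICMP types and codes; narrow this if needed
      { IpProtocol: 'icmp', FromPort: -1, ToPort: -1, IpRanges: source }
    ]
  };
}

console.log(JSON.stringify(buildDbIngressParams('sg-0example', '10.0.1.0/24'), null, 2));
```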

Once created:

As you can see, our database server has only a private IP address, so we won’t be able to SSH into it directly 😕. But don’t be sad 🙂, we will use the webserver instance in the public subnet as a bastion server (jump box).

So first connect to the webserver instance via SSH and copy over the key pair using sftp, or copy and paste the key pair content into a new file on the webserver. Then, using the database server’s private IP address, we can connect to it:

As I mentioned before, the private subnet has no access to the Internet, therefore our database server cannot talk to the Internet:

Fortunately, if we want the private subnet instances to be able to reach the Internet, we have two solutions:

7.1 – NAT Instance

We need to create a NAT instance inside the public subnet and configure the private subnet’s route table so that all of the subnet’s Internet-bound traffic is routed through the NAT instance.

So let’s launch a t1.micro NAT instance. AWS provides public Linux NAT AMIs, so choose the first one:

In the launch configuration select the appropriate VPC and the public subnet (10.0.1.0/24) you configured. Make sure all the configurations are correct as below and launch the instance:

Once created, go back to “your instances” page, you should see:

Next, we will update the private subnet route table and add a new route that redirects all traffic (0.0.0.0/0) towards the NAT Instance.

The current VPC diagram:

So now the database server can access the internet via a NAT instance for tasks such as installing MySQL or downloading updates or security patches:

7.2 – NAT Gateway

On the “NAT Gateways” menu on the left, click on “Create NAT Gateway”. Provide the necessary details, such as the public subnet (10.0.1.0/24) and an Elastic IP, and create the NAT Gateway.

Once created you will see this:

Now let’s edit the route table to send traffic destined for the Internet toward the gateway.
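Both NAT options end with the same kind of route edit on the private subnet’s route table; only the target differs. A minimal sketch of the requests, assuming the AWS SDK for JavaScript v2 (`EC2.createNatGateway` and `EC2.createRoute`); the helper names are illustrative and all IDs come from the caller.

```javascript
// Request for a NAT Gateway in the public subnet, bound to an Elastic IP allocation.
function buildNatGatewayParams(publicSubnetId, allocationId) {
  return { SubnetId: publicSubnetId, AllocationId: allocationId };
}

// Default route for the private route table, pointing at either a
// NAT Gateway ({ natGatewayId }) or a NAT instance ({ instanceId }).
function buildDefaultRoute(routeTableId, target) {
  const route = { RouteTableId: routeTableId, DestinationCidrBlock: '0.0.0.0/0' };
  if (target.natGatewayId) {
    route.NatGatewayId = target.natGatewayId;
  } else if (target.instanceId) {
    route.InstanceId = target.instanceId;
  } else {
    throw new Error('target needs natGatewayId or instanceId');
  }
  return route;
}

// Usage (requires the aws-sdk package):
// const ec2 = new (require('aws-sdk')).EC2();
// ec2.createRoute(buildDefaultRoute(privateRouteTableId, { natGatewayId: id }), console.log);
```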

Which one to choose?

A NAT instance is a single point of failure: if it goes down, our EC2 instances in the private subnet won’t be able to talk to the Internet. Of course, you can always put it inside an Auto Scaling group and set the minimum number of running NAT instances to 1. But you will need an automated script configured and ready to redirect traffic from the private instances away from the old NAT instance to the new one.

With AWS’s managed NAT Gateway, things work differently. Instead of configuring, running, monitoring, and scaling a cluster of EC2 NAT instances, it’s a matter of a couple of clicks and you are all set. In short, all the configuration that was once the responsibility of the Ops team is now handled invisibly by AWS (you won’t need to apply security patches).

8 – Flow Logs & CloudWatch

To capture and log data about network traffic in our VPC, Flow Logs record information about the IP traffic going to and from network interfaces and store this raw data in Amazon CloudWatch, where it can be retrieved, filtered, and viewed.

Browse to your VPC Dashboard and follow the setup wizard to create a new flow log:

In the dialog box, complete the following information:

  • Filter: Select whether the flow log should capture rejected traffic, accepted traffic or all traffic.
  • Role: Specify the name of an IAM role that has permission to publish logs to CloudWatch Logs (check my previous tutorial on how to create an IAM role).
  • Destination Log Group: Enter the name for the log and where to store it in CloudWatch.
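The three fields above correspond to parameters of `EC2.createFlowLogs` in the AWS SDK for JavaScript v2. A minimal sketch of the request, with an illustrative helper name and all identifiers supplied by the caller:

```javascript
// The filter choices: accepted traffic, rejected traffic, or both.
const TRAFFIC_TYPES = ['ACCEPT', 'REJECT', 'ALL'];

// Build a createFlowLogs request that covers a whole VPC.
function buildFlowLogParams(vpcId, trafficType, roleArn, logGroupName) {
  if (!TRAFFIC_TYPES.includes(trafficType)) {
    throw new RangeError('trafficType must be one of ' + TRAFFIC_TYPES.join(', '));
  }
  return {
    ResourceIds: [vpcId],
    ResourceType: 'VPC',
    TrafficType: trafficType,          // Filter
    DeliverLogsPermissionArn: roleArn, // Role
    LogGroupName: logGroupName         // Destination Log Group
  };
}

// Usage (requires the aws-sdk package):
// new (require('aws-sdk')).EC2().createFlowLogs(buildFlowLogParams(...), console.log);
```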

To view your VPC log records, go to the CloudWatch console. In the navigation pane, choose Logs, select the log group you created earlier, then create a log stream:

It may take a few minutes after you’ve created your flow log to see the VPC log records:
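Each record is a single space-separated line. Assuming the default flow log format (version 2, fourteen fields), a small parser like the one below turns a line into a named object, which makes filtering on `action` or `dstPort` easier. The sample record is made up for illustration.

```javascript
// Field order of the default (version 2) flow log record.
const FIELDS = [
  'version', 'accountId', 'interfaceId', 'srcAddr', 'dstAddr',
  'srcPort', 'dstPort', 'protocol', 'packets', 'bytes',
  'start', 'end', 'action', 'logStatus'
];

// Split one record into named fields; values are kept as strings.
function parseFlowLogRecord(line) {
  const parts = line.trim().split(/\s+/);
  if (parts.length !== FIELDS.length) {
    throw new Error('Expected ' + FIELDS.length + ' fields, got ' + parts.length);
  }
  const record = {};
  FIELDS.forEach((name, index) => { record[name] = parts[index]; });
  return record;
}

// Illustrative record: an accepted SSH connection (TCP is protocol 6).
const sample = '2 123456789012 eni-0example 10.0.1.25 10.0.2.10 49152 22 6 20 4249 1418530010 1418530070 ACCEPT OK';
console.log(parseFlowLogRecord(sample));
```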

9 – Cleaning Up

Make sure to terminate all the AWS resources associated with the VPC, such as EC2 instances, before deleting the VPC:

That’s it for this tutorial. I hope you learnt something. In my next post, I will show you how to use Infrastructure as Code with Terraform to set up the same VPC architecture in less than 1 min. 😁