Amazon’s S3 is, as the name suggests, simple storage. It stores static files remotely and serves them on request. While it can be advantageous to use on a website, it is not a true content delivery network, as files are served from a single location (the server hosting the bucket). In many cases, the latency of S3 will be greater than that of a well-configured server (e.g. running nginx or lighttpd), and it lacks the ability to natively serve gzipped content (one must upload a separate, gzipped copy of each file and specifically request that copy).
Given that there is a cost for both storage and requests, it may not seem like a useful solution at first. However, there are some advantages: S3 will reduce server load somewhat, it will (usually) remain up even if a site’s server goes down, storage does not have to be pre-allocated, and the storage itself is very reliable. In essence, this makes S3 better suited to backups than to serving a site’s static files.
Amazon’s Cloudfront shares many of the characteristics of S3, with the significant exception that its content is distributed to servers globally and served from the server nearest to the visitor, thereby decreasing latency. For static content, Cloudfront can provide an advantage if you are trying to get that extra bit of performance out of a site.
Migrating from S3 to Cloudfront can be a simple matter, especially if a site accesses S3 through a CNAME.
As mentioned above, an S3 bucket can be referenced through a CNAME; the only condition is that the CNAME must match the bucket name. If you have this set up already, the migration to Cloudfront is trivial and can be accomplished with no downtime in a matter of minutes. The following provides a brief outline of the procedure.
In order to use Cloudfront, you must sign up for it (even though it is accessible through the AWS console, you cannot actually use it until you have signed up). An email will be sent to you confirming the completion of your Cloudfront registration.
Once you have subscribed, you can create a new distribution (using S3 as the origin) from the AWS console.
- Log in to the AWS console and, under the Amazon Cloudfront tab, click ‘Create Distribution’.
- Choose the S3 bucket you currently use, and enter the CNAME by which you reference that S3 bucket. The other settings can remain at their defaults (Delivery method: Download; Logging: Off; Distribution status: Enabled).
- Click ‘Create’.
Once you click ‘Create’, a new distribution will be listed with the status ‘InProgress’. After a few minutes, that will change to ‘Deployed’. The distribution will list a domain name by which it can be accessed.
If you wish, you may test that things are working by accessing a file that is on the S3 origin bucket, through the Cloudfront domain.
For example, if you have:
- cache.example.com as a CNAME pointing to cache.example.com.s3.amazonaws.com
- a file in the root of the above bucket called test.txt
- i.e. http://cache.example.com/test.txt
- and the domain of your Cloudfront distribution is abcd.cloudfront.net
then you can access the above file through Cloudfront at http://abcd.cloudfront.net/test.txt
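The naming rule and the test URL above can be sketched in a few lines of Python. This is only an illustration using the example names from this post (cache.example.com, abcd.cloudfront.net); the function names are made up for the sketch, and same_content is an optional helper that makes real HTTP requests, so it is defined but not called here.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def s3_endpoint_for(cname: str) -> str:
    """Return the S3 hostname the CNAME targets (the bucket is named after the CNAME)."""
    return f"{cname}.s3.amazonaws.com"


def via_cloudfront(url: str, cloudfront_domain: str) -> str:
    """Rewrite a URL so the same path is requested from the Cloudfront domain."""
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme, cloudfront_domain, parts.path, parts.query, parts.fragment)
    )


def same_content(url_a: str, url_b: str, timeout: float = 10.0) -> bool:
    """Fetch both URLs and compare the bodies (performs network requests)."""
    with urlopen(url_a, timeout=timeout) as a, urlopen(url_b, timeout=timeout) as b:
        return a.read() == b.read()


original = "http://cache.example.com/test.txt"
print(s3_endpoint_for("cache.example.com"))             # cache.example.com.s3.amazonaws.com
print(via_cloudfront(original, "abcd.cloudfront.net"))  # http://abcd.cloudfront.net/test.txt
```

With a real bucket and distribution, calling same_content on the two URLs would be one way to confirm that Cloudfront is serving the same file as S3 before touching DNS.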
At this point, the Cloudfront part of the setup is done. Of course, since your site still points to S3, it is not using Cloudfront yet. This, however, is a simple matter of updating your DNS entries (again, provided that the site originally was using S3, and referenced it through a CNAME).
If you are using your domain registrar’s nameservers, you will need to log in to your domain registrar to update your DNS settings; otherwise, you will simply modify the entry on your own nameserver (e.g. BIND, TinyDNS, etc.). If you have set up a local DNS server in addition to the authoritative nameservers (i.e. from your registrar), do not forget to update that as well.
You can use dig or nslookup to see when your DNS change has propagated.
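If you would rather script the check than run dig by hand, something like the following Python sketch could work. It relies on the standard library’s socket.gethostbyname_ex, and it assumes that your local resolver reports the end of the CNAME chain as the canonical name (which is typical, but platform dependent). The function names are hypothetical, and resolves_to_cloudfront performs a real DNS lookup, so it is defined but not called here.

```python
import socket


def is_cloudfront_name(hostname: str) -> bool:
    """True if the hostname is under cloudfront.net (a trailing dot is tolerated)."""
    return hostname.rstrip(".").lower().endswith(".cloudfront.net")


def resolves_to_cloudfront(cname: str) -> bool:
    """Resolve cname with the local resolver and inspect the names it reports.

    Assumption: the canonical name (or one of the aliases) returned by the
    resolver is the Cloudfront domain once the DNS change is visible locally.
    """
    canonical, aliases, _addresses = socket.gethostbyname_ex(cname)
    return any(is_cloudfront_name(name) for name in [canonical, *aliases])


# Pure string checks on the example names from this post (no network needed):
print(is_cloudfront_name("abcd.cloudfront.net."))                # True
print(is_cloudfront_name("cache.example.com.s3.amazonaws.com"))  # False
```

Note that this only tells you what your own resolver sees; other visitors may still have the old record cached until its TTL runs out.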
This does not result in any downtime because, for a site previously using S3, the DNS already points to S3, and you are not changing any code on your site. Cloudfront will only fetch and serve files that it receives requests for, pulling them from the S3 origin as needed. As such, a user whose resolver has not yet seen the DNS change will request the file from S3, while one who has received the updated DNS information will request the same file from Cloudfront.
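To make the reasoning concrete, here is a toy model in Python. It is not how Cloudfront is implemented; the class and variable names are invented, a dictionary stands in for the S3 bucket, and the "edge" simply pulls a file from it the first time the file is requested.

```python
class PullThroughCache:
    """Toy stand-in for an edge cache: pull from the origin on a miss, then reuse."""

    def __init__(self, origin: dict):
        self.origin = origin
        self.cached = {}

    def get(self, path: str) -> bytes:
        if path not in self.cached:
            # Cache miss: fetch the file from the origin and keep a copy.
            self.cached[path] = self.origin[path]
        return self.cached[path]


bucket = {"/test.txt": b"hello"}   # stands in for the S3 bucket
edge = PullThroughCache(bucket)    # stands in for the Cloudfront distribution

old_dns_answer = bucket["/test.txt"]    # visitor whose DNS still points at S3
new_dns_answer = edge.get("/test.txt")  # visitor whose DNS points at Cloudfront

# Both visitors receive the same bytes, whichever record their resolver has.
print(old_dns_answer == new_dns_answer)  # True
```

Because the bucket is never emptied or renamed during the switch, either path returns the file, which is why the order in which resolvers pick up the change does not matter.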
In theory, the same procedure should work for transitioning static content served from any subdomain (even if it is not S3 backed). Cloudfront would be set up to use a custom origin, and the DNS entry for the subdomain changed to point to the Cloudfront domain. As the files are never removed from the subdomain, and no code is changed, the shift should be transparent and occur without downtime.
As a final note, Cloudfront does not expire files instantly; it obeys the Cache-Control header settings. As a result, updating a file on S3 will not immediately cause the file being served by Cloudfront to be updated. One solution is to include a version in each file name: when your site requests the new file name, Cloudfront has no cached copy of it, so it will fetch the new file from S3 before serving it.
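One possible way to version file names is to embed a short hash of the file’s contents, so the name changes whenever the file does. The Python sketch below is just one such scheme and not something Cloudfront requires; versioned_name and its digest_len parameter are made up for illustration, and a plain counter or date in the name would work equally well.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the extension, e.g. style.<hash>.css."""
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old = versioned_name("css/style.css", b"body { color: black; }")
new = versioned_name("css/style.css", b"body { color: navy; }")

print(old)
print(new)
print(old != new)  # True: changed content yields a new name, hence a cache miss
```

You would upload the file to S3 under the versioned name and reference that name in your pages; the old copy can stay in place until nothing links to it.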
Great posts on your blog about AWS. Would love to see more of them. Would you be willing to do one on saving a custom AMI for future use? And/or one on backing up data files so that you can restore them separately from the AMI?
Thanks for the comment – glad you find the posts useful. I certainly can do both of those – I think I have notes on saving a custom AMI from before I started using EBS-root AMIs, and there are two ways I typically back up files separate from an AMI – either in the form of an EBS volume, or using S3 (individual files or snapshots), which I can also write about. Thanks for the suggestions, I’ll certainly add them to my list of articles to write.