How To Deploy a Simple Website on AWS with Terraform

I maintain an old web application that’s just short of its 20th anniversary. As Murphy’s Law dictates, the physical server it was hosted on died during the Covid-19 pandemic, pushing me to finally migrate towards a more modern infrastructure.

While a rewrite is in progress, this version is here to stay for a while, and I want a repeatable infrastructure, or Infrastructure as Code as the buzzword goes these days. Terraform and Amazon Web Services (AWS) are the gold standard for that, and I expected a very easy job. I was in for a surprise. I was surprised even more by how many sources and tutorials I had to scavenge the information from. This post provides a summary as of May 2020.

The Application Architecture

All of this is done to deploy a fairly simple application with two components:

  • A self-contained application server (in this case, written in legacy PHP)
  • A database the server is talking to, in this case MySQL. I opted for RDS because of compatibility issues; you may consider Aurora

This is it. As this is a hobby, low-traffic project, there are no load balancers: one instance is more than enough and ELB is relatively expensive. Note this means that certificate handling and TLS have to be sorted out on the instance.

Since the site is only maintained from time to time, deployment is done by creating new AMIs (see below) and configuration is uploaded from local files. If your website is maintained by more than one person or you deploy more often than quarterly, you should automate this in a deployment pipeline1.

The Prerequisites

One more thing to get started: AWS instances use the somewhat unusual PEM format for SSH keys. To generate one, run

ssh-keygen -P "" -t rsa -b 4096 -m pem -f ~/.ssh/aws-mycomputer.pem

On Windows, use PuTTY as described in the docs.

You will also need the public key to upload to AWS below. Get it by running

ssh-keygen -y -f ~/.ssh/aws-mycomputer.pem
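
If you prefer keeping the public key in a file (the complete file at the end of this post reads it through a public_key_path variable), just redirect the output; the .pub path below is only an example:

ssh-keygen -y -f ~/.ssh/aws-mycomputer.pem > ~/.ssh/aws-mycomputer.pub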

Preparing the Image

If you are running an app from the last decade, you may skip this step and use a public AMI, but I am documenting it for other people going down the same route. For reasons you don’t want to know, I am temporarily stuck with the unsupported Debian Squeeze. Now, there are some community-provided images, but…let’s say I don’t fully trust their origins.

Hence, I decided to make my own. This is the manual version, which I recommend automating for your purposes.

  • First, download the official ISO from a trusted source, like the Debian Squeeze page
  • Launch VirtualBox. Create a new machine and mount the ISO as a CD-ROM. When creating the machine, make sure the volume size matches the one you want in production and that you are using the VHD format. While Amazon claims to support more formats, I got errors with them
  • Install what you need on the machine, including the SSH key; it will not be managed by your Amazon settings. I have not added a file with secrets; instead, I am uploading them to the launched instances with the file provisioner below2
  • Upload the VHD to an S3 bucket
  • Launch the import task as described in the docs; a rough CLI sketch follows this list
  • Write down your AMI number to be used below
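
If you want to script the upload and import steps, a rough AWS CLI sketch could look like the one below. The bucket name and file names are placeholders, and VM Import requires a vmimport service role with access to the bucket, as described in the AWS docs:

aws s3 cp mywebsite.vhd s3://my-import-bucket/mywebsite.vhd

cat > containers.json <<'EOF'
[{"Description": "mywebsite root volume",
  "Format": "VHD",
  "UserBucket": {"S3Bucket": "my-import-bucket", "S3Key": "mywebsite.vhd"}}]
EOF

aws ec2 import-image --description "mywebsite Squeeze image" \
    --disk-containers file://containers.json

# poll until the task reports "completed" and note the resulting ImageId (ami-...)
aws ec2 describe-import-image-tasks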

The Terraform Complexity

Various tutorials may try to convince you that all you need are a few lines for a provider, an instance, and a database, and you are done, just as you would do when setting things up through the web interface.

It’s not that simple, and the main reason is security. The cloud forces you to set up a reasonable network and policies for what can run on it, and those need to be defined explicitly in Terraform.

But let’s go through it problem by problem.

The Initial Setup

First, define the provider with the region. You could skip it, but this is production: it shouldn’t change based on what your current region environment is. We’ll also store the state on S3 and declare availability zones. We need two, even though we are only going to use one, because of RDS requirements.

terraform {
    backend "s3" {
        bucket = "almad-terraform-states"
        key = "mywebapp/state"
        region = "eu-central-1"
    }
}

provider "aws" {
    version = "~> 2.0"
    region = "eu-central-1"
}

locals {
    internet_cidr = "0.0.0.0/0"
    az = "eu-central-1b"
    secondary_az = "eu-central-1a"
}

variable "RDS_PASSWORD" {}
variable "private_key_path" {}

resource "aws_key_pair" "mycomputer" {
    key_name = "mycomputer"
    public_key = "ssh-rsa AAAAB3Nza...Ch"
}

The public_key content is the output of the command from the prerequisites above.
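
Terraform reads variables prefixed with TF_VAR_ from the environment, so a minimal way to provide RDS_PASSWORD and initialize the S3 backend might look like this:

# enter the database password without echoing it, then export it for Terraform
read -s TF_VAR_RDS_PASSWORD && export TF_VAR_RDS_PASSWORD
terraform init    # configures the S3 state backend declared above
terraform plan    # review the changes before terraform apply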

The Components

Let’s now define the main elements. They refer to network resources that are not defined yet; we’ll get to those soon. First, the database.

resource "aws_db_instance" "mysql" {
    availability_zone = local.az
    allocated_storage = 20

    engine = "mysql"
    engine_version = "5.7"
    instance_class = "db.t3.micro"

    identifier = "mywebsite-mysql"
    name = "mywebsite_db"
    username = "root"
    password = var.RDS_PASSWORD

    parameter_group_name = "default.mysql5.7"
    skip_final_snapshot = true
    final_snapshot_identifier = "mywebsite-mysql-snap"
    multi_az = "false"

    db_subnet_group_name = aws_db_subnet_group.mywebsite_mysql.name
    vpc_security_group_ids = [aws_security_group.mywebsite.id]
    publicly_accessible = true

    storage_type = "standard"
}

Note the instance_class: RDS creates a managed EC2 instance under the hood, one that you pay for. You should definitely consider saving money by paying for it as a reserved instance, but do note that it is NOT purchased as a normal EC2 reserved instance! You have to buy it through the RDS menu; follow the Amazon tutorial.

Also, notice the publicly_accessible attribute: without it, you can’t talk to RDS from the Internet even if it’s exposed through networks and security groups.

Now, let’s define the application server instance.

resource "aws_instance" "mywebsite" {
  ami           = "ami-123456"
  instance_type = "t2.nano"
  key_name      = aws_key_pair.mycomputer.key_name
  availability_zone = local.az
  associate_public_ip_address = true
  vpc_security_group_ids = [aws_security_group.mywebsite.id]
  subnet_id              = aws_subnet.mywebsite_prod.id
  root_block_device {
    volume_type = "standard"
    volume_size = 4
    delete_on_termination = true
  }
  connection {
    type        = "ssh"
    user        = "root"
    private_key = chomp(file(var.private_key_path))
    host        = self.public_ip
  }
  provisioner "file" {
    source      = "etc/services/run"
    destination = "/etc/service/mywebsite.example/run"
  }
  provisioner "file" {
    source      = "etc/lighttpd.conf"
    destination = "/etc/lighttpd/lighttpd.conf"
  }
}

The AMI is the image we’ve created above. You may want to use a public image with a custom setup instead. In that case, remote-exec is what you need, unless you go all in for configuration management tools like Chef, Ansible, Puppet, or Salt.
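
For illustration, a remote-exec provisioner placed inside the aws_instance resource (reusing the connection block above) could look roughly like this; the packages are only an example of what such a setup might install:

  provisioner "remote-exec" {
    inline = [
      "apt-get update",
      "apt-get install -y lighttpd",
    ]
  }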

The connection’s private key has to be present on the computer you are running Terraform on. As mentioned in the architecture section, this should live in the deployment pipeline in the case of multiple contributors.

The volume type standard is the previous-generation magnetic (HDD) storage. It is well hidden in the AWS menus and deprecated, yet it’s still the cheapest one available and, at the time of this writing, available at smaller sizes than gp2 root volumes. That said, if you take more traffic or your instance RAM can’t cache all the code, I’d advise gp2.

It is also generally recommended to use a separate EBS volume rather than relying only on the root block device, as it allows features like snapshots and data persistence. I am intentionally not opting for those, since in a setup where new deploys are done by creating new AMIs, those features are not needed.

The files uploaded using the file provisioner are a poor man’s way to handle secrets. The same treatment as for private keys applies.

The Network Blocks

Now comes the fun part! Let’s start with the basic building block, the Virtual Private Cloud (VPC). This is akin to setting up your network at home, saying “this is mine and isolated from everything else”.

resource "aws_vpc" "mywebsite_prod" {
  cidr_block = "192.168.0.0/16"
  instance_tenancy = "default"
  enable_dns_support = "true"
  enable_dns_hostnames = "true"
}

If you want to read more into how it relates to availability zones and subnets, the documentation contains relevant diagrams.

As with your home network, this is not useful to the outer world until you get an ISP and a router. Let’s do that: the connection point between your network and the Internet is a “gateway”, and since we are going to use this for a project with a domain, let’s also get it a public and static IP address with an Elastic IP.

resource "aws_internet_gateway" "mywebsite_prod" {
  vpc_id = aws_vpc.mywebsite_prod.id
}

resource "aws_eip" "mywebsite" {
  instance = aws_instance.mywebsite.id
  vpc      = true
  depends_on = [aws_internet_gateway.mywebsite_prod, aws_instance.mywebsite]
}

Cool. Now, unlike your house, a VPC can span an entire city, and we have to create a network per “server room”, aka availability zone. We are not in the high-availability business with our simple website and will use just a single AZ and network, yet because of RDS constraints we’ll create two subnets.


resource "aws_subnet" "mywebsite_prod" {
  availability_zone = local.az
  vpc_id     = aws_vpc.mywebsite_prod.id
  cidr_block = "192.168.1.0/24"
  map_public_ip_on_launch = true
}
resource "aws_subnet" "mywebsite_secondary_az" {
  availability_zone = local.secondary_az
  vpc_id     = aws_vpc.mywebsite_prod.id
  cidr_block = "192.168.2.0/24"
  map_public_ip_on_launch = true
}

Have I mentioned that RDS is demanding? Especially for it, we have to create a subnet group that is going to be tied to the RDS instance.

resource "aws_db_subnet_group" "mywebsite_mysql" {
    name = "mywebsite-mysql-subnet"
    description = "RDS subnet group"
    subnet_ids = [aws_subnet.mywebsite_prod.id, aws_subnet.mywebsite_secondary_az.id]
}

Now we have all the building blocks! All that’s left is to specify how information flows.

The Network Wiring

Two things remain: defining how information flows (routing table) and who can do it (network access control list).

Our routing is simple: we just allow the whole internet to talk to our gateway, which allows information into the subnet where our application is.

resource "aws_route_table" "mywebsite_prod" {
 vpc_id = aws_vpc.mywebsite_prod.id
 route {
    cidr_block = local.internet_cidr
    gateway_id = aws_internet_gateway.mywebsite_prod.id
 }
}
resource "aws_route_table_association" "mywebsite_prod" {
  subnet_id      = aws_subnet.mywebsite_prod.id
  route_table_id = aws_route_table.mywebsite_prod.id
}

Simple enough. Now let’s allow the computers to do so. I am allowing all ports here; you may be more verbose and picky.

resource "aws_network_acl" "mywebsite_prod" {
  vpc_id = aws_vpc.mywebsite_prod.id
  subnet_ids = [ aws_subnet.mywebsite_prod.id ]
  ingress {
      protocol = "all"
      rule_no = 100
      action = "allow"
      cidr_block = local.internet_cidr
      from_port = 0
      to_port = 0
  }
  egress {
      protocol = "all"
      rule_no = 100
      action = "allow"
      cidr_block = local.internet_cidr
      from_port = 0
      to_port = 0
  }
}

A network ACL is essentially a firewall for your network, the one you’d set up on your router. With those rules, the traffic gets in, but it would still be declined by the security group layer. Security groups are tied to your instances and essentially represent an abstract computer firewall that runs on a whole group of computers at once, instead of you having to set it up one by one.

resource "aws_security_group" "mywebsite" {
  name        = "sg_mywebsite"
  description = "SG for mywebsite"
  vpc_id      = aws_vpc.mywebsite_prod.id
  ingress {
    description = "SSH from the world"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [local.internet_cidr]
  }
  ingress {
    description = "HTTP from the world"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = [local.internet_cidr]
  }
  ingress {
    description = "RDS from the world (because secondary usage from Heroku)"
    from_port   = 3306
    to_port     = 3306
    protocol    = "tcp"
    cidr_blocks = [local.internet_cidr]
  }
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = [local.internet_cidr]
  }
}

Phew. Done! Now you should be able to run Terraform (terraform apply) and connect to your instance. Note that this opens your RDS instance to the world, as well as SSH to your EC2 instances. This is useful for the initial setup and necessary if you want to use the instance from Heroku, but you may want to delete the corresponding ingress line.
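
To make connecting easier, you may want an output for the Elastic IP; the output name below is just my choice:

output "mywebsite_public_ip" {
  value = aws_eip.mywebsite.public_ip
}

With that in place, SSH in using the key from the prerequisites (on newer Terraform versions, use terraform output -raw):

ssh -i ~/.ssh/aws-mycomputer.pem root@$(terraform output mywebsite_public_ip)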

The Production Polish

There are two things I haven’t done with AWS Infrastructure and Terraform.

Setting the domain

I have an existing registrar and I don’t feel like switching it to Amazon. Thus, to make your site work, go there and set an A record pointing to the address you got from your EIP.

If you don’t mind having everything in one package, check out Terraform’s docs for Route 53.
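
If you do go the Route 53 way, a minimal sketch might look like the following; the zone and domain name are assumptions, and you could equally reference an existing hosted zone’s ID:

resource "aws_route53_zone" "mywebsite" {
  name = "mywebsite.example"
}

resource "aws_route53_record" "mywebsite_a" {
  zone_id = aws_route53_zone.mywebsite.zone_id
  name    = "mywebsite.example"
  type    = "A"
  ttl     = 300
  records = [aws_eip.mywebsite.public_ip]
}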

Loading the data

This is a migration, and I do have an existing database. Have I mentioned this is a small site? As I could afford downtime, mysqldump did the job and the standard mysql command worked as well.
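
For the record, the migration boiled down to something like this; the RDS endpoint is a placeholder (take it from the RDS console or expose it as a Terraform output), and the database name matches the Terraform configuration above:

# dump on the old server
mysqldump -u root -p mywebsite_db > mywebsite.sql

# load into RDS
mysql -h mywebsite-mysql.xxxxxxxxxx.eu-central-1.rds.amazonaws.com \
      -u root -p mywebsite_db < mywebsite.sql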

Improvements

This is all a very minimal skeleton that you can layer on. In particular, you may consider:

  • A separate private subnet for RDS that is not tied to a gateway, only to the instance; a minimal sketch follows below this list.

  • Only opening SSH to EC2 instances on demand, preferably never. This is more complicated if you are using a file provisioner, since it uploads the files over an SSH connection.

  • Instead of baking code and configuration into the image, have them on a separate EBS volume and mount it to the appropriate places; see the ebs_volume and volume_attachment docs. Alternatively, use EFS.

  • Running RDS with multi_az = "true" and having application servers in multiple availability zones.

  • Putting an ALB in front of your instances, potentially with an auto-scaling group to manage its size. A significant motivation for ELB or ALB is that you can use them to manage TLS certificates and termination, relieving you of the need to handle that on the instance.

Unfortunately, with the last two, you are increasing your expenses significantly, going fully into the “cloud native for serving the whole world” business.
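
Coming back to the first item on the list: a private subnet is simply one that is not associated with the route table pointing at the internet gateway (combined with turning publicly_accessible off on the RDS instance). A minimal sketch, with assumed names and CIDR:

resource "aws_subnet" "mywebsite_db_private" {
  availability_zone       = local.az
  vpc_id                  = aws_vpc.mywebsite_prod.id
  cidr_block              = "192.168.3.0/24"
  map_public_ip_on_launch = false
}

You would create one such subnet per availability zone, point the RDS subnet group at them, and restrict the 3306 ingress in the security group to the application subnet instead of the whole internet.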

The Complete File

This is the complete Terraform file for you to build on. Licensed under UPL. Also available as a gist.

terraform {
  backend "s3" {
    bucket = "almad-terraform-states"
    key    = "mywebapp/state"
    region = "eu-central-1"
  }
}


provider "aws" {
  version = "~> 2.0"
  region  = "eu-central-1"

}

locals {
    internet_cidr = "0.0.0.0/0"
    az = "eu-central-1b"
    secondary_az = "eu-central-1a"
}

variable "RDS_PASSWORD" {}

variable "public_key_path" {}
variable "private_key_path" {}

resource "aws_key_pair" "mycomputer" {
  key_name   = "mycomputer"
  public_key = chomp(file(var.public_key_path))
}

resource "aws_vpc" "mywebsite_prod" {
  cidr_block = "192.168.0.0/16"
  instance_tenancy = "default"
  enable_dns_support = "true"
  enable_dns_hostnames = "true"
}

resource "aws_subnet" "mywebsite_prod" {
  availability_zone = local.az
  vpc_id     = aws_vpc.mywebsite_prod.id
  cidr_block = "192.168.1.0/24"
  map_public_ip_on_launch = true
}

resource "aws_subnet" "mywebsite_secondary_az" {
  availability_zone = local.secondary_az
  vpc_id     = aws_vpc.mywebsite_prod.id
  cidr_block = "192.168.2.0/24"
  map_public_ip_on_launch = true
}


resource "aws_internet_gateway" "mywebsite_prod" {
  vpc_id = aws_vpc.mywebsite_prod.id
}

resource "aws_route_table" "mywebsite_prod" {
 vpc_id = aws_vpc.mywebsite_prod.id
 route {
    cidr_block = local.internet_cidr
    gateway_id = aws_internet_gateway.mywebsite_prod.id
 }
}
resource "aws_route_table_association" "mywebsite_prod" {
  subnet_id      = aws_subnet.mywebsite_prod.id
  route_table_id = aws_route_table.mywebsite_prod.id
}

resource "aws_network_acl" "mywebsite_prod" {
  vpc_id = aws_vpc.mywebsite_prod.id
  subnet_ids = [ aws_subnet.mywebsite_prod.id ]

  ingress {
      protocol = "all"
      rule_no = 100
      action = "allow"
      cidr_block = local.internet_cidr
      from_port = 0
      to_port = 0
  }

  egress {
      protocol = "all"
      rule_no = 100
      action = "allow"
      cidr_block = local.internet_cidr
      from_port = 0
      to_port = 0
  }
}

resource "aws_security_group" "mywebsite" {
  name        = "sg_mywebsite"
  description = "SG for mywebsite"
  vpc_id      = aws_vpc.mywebsite_prod.id

  ingress {
    description = "SSH from the world"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [local.internet_cidr]
  }

  ingress {
    description = "HTTP from the world"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = [local.internet_cidr]
  }

  ingress {
    description = "RDS from the world (because secondary usage from Heroku)"
    from_port   = 3306
    to_port     = 3306
    protocol    = "tcp"
    cidr_blocks = [local.internet_cidr]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = [local.internet_cidr]
  }

}

resource "aws_db_subnet_group" "mywebsite_mysql" {
    name = "mywebsite-mysql-subnet"
    description = "RDS subnet group"
    subnet_ids = [aws_subnet.mywebsite_prod.id, aws_subnet.mywebsite_secondary_az.id]
}


resource "aws_db_instance" "mysql" {
    availability_zone = local.az
    allocated_storage = 20
    engine = "mysql"
    engine_version = "5.7"
    instance_class = "db.t3.micro"
    identifier = "mywebsite-mysql"
    name = "mywebsite_db"
    username = "root"
    password = var.RDS_PASSWORD
    parameter_group_name = "default.mysql5.7"
    skip_final_snapshot = true
    final_snapshot_identifier = "mywebsite-mysql-snap"
    publicly_accessible = true

    multi_az = "false"

    db_subnet_group_name = aws_db_subnet_group.mywebsite_mysql.name
    vpc_security_group_ids = [aws_security_group.mywebsite.id]

    storage_type = "standard"
}

resource "aws_eip" "mywebsite" {
  instance = aws_instance.mywebsite.id
  vpc      = true
  depends_on = [aws_internet_gateway.mywebsite_prod, aws_instance.mywebsite]
}

resource "aws_instance" "mywebsite" {
  ami           = "ami-123456"
  instance_type = "t2.nano"
  key_name      = aws_key_pair.mycomputer.key_name
  availability_zone = local.az

  associate_public_ip_address = true
  vpc_security_group_ids = [aws_security_group.mywebsite.id]
  subnet_id              = aws_subnet.mywebsite_prod.id

  root_block_device {
    volume_type = "standard"
    volume_size = 4
    delete_on_termination = true
  }

  connection {
    type        = "ssh"
    user        = "root"
    private_key = chomp(file(var.private_key_path))
    host        = self.public_ip
  }

  provisioner "file" {
    source      = "etc/services/run"
    destination = "/etc/service/mywebsite.example/run"
  }

  provisioner "file" {
    source      = "etc/lighttpd.conf"
    destination = "/etc/lighttpd/lighttpd.conf"
  }
}

Conclusion

Using infrastructure programmatically exposes you to some of the complexity that the web tools hide from you. This makes the learning curve more challenging, but you get the benefits of repeatability and consistency.

Alternatives to this setup (without using Terraform):

  • AWS Lightsail allows you to get an instance as well as a managed database. It is a very neat starter pack: if you outgrow it, you can easily upgrade to EC2. If you feel like using other AWS resources with it, you can do so with VPC peering. I’d consider going down that route, except I needed an unsupported OS environment and I really want that IaC.
  • If I had more of these services, instead of having them on individual images and instances, I’d go for an ECS instance pool (or Fargate) and build Docker containers instead of AMIs. Note that in order to use those, you must have, and pay for, a load balancer
  • As an honorable mention, throw money at it and use Heroku. I think it’s mostly worth paying for at a small scale. Again in my case, it was a question of unsupported environments.

Thanks to Ladislav Prskavec for suggestions and script fixes.


  1. I originally went that way with Amazon CodeStar/CodePipeline, yet it turned out to be very error-prone for my use case and I’ve decided it’s not worth it. The main reason is that CodeStar acts as a scaffolder. This works well for the initial setup, but as you try to debug and tinker with your deployment, it breaks down very quickly. ↩︎

  2. If you can modify the application sufficiently, obviously use a reasonable secret storage instead. ↩︎
