Some Linux Guy

Just a blog from yet another Linux user

Clouding Properly

Dec 28, 2023 - 7 minute read - homelab

One effective way for me to learn something new is to use it to reproduce something that already exists. I did it a few years ago to learn how to program in Go (I had previously used lots of Perl), I did it a few years back to learn Docker (I had previously deployed VMs using VirtualBox or ESXi), and I am doing it now to learn how to use Terraform to deploy things to AWS (I previously managed all my personal servers using Ansible).

I recently decided to use some “spare time” to learn something new. Previously, Kubernetes was at the top of my “to-learn” list. However, Terraform and AWS quickly skipped to the head of the line because of how quickly I could add efficiency to things I’m currently managing. I haven’t had a chance to work with Terraform at all in the past, and I have minimal experience using AWS, mainly because I have mostly worked in roles that deal primarily with on-prem infrastructure and only minimal cloud environments.

Previously, I managed all of my personal cloud servers and services manually. Later I expanded my homelab’s Ansible configuration to also manage my cloud servers. Now, I’m migrating things to a proper Infrastructure as Code environment. I did some minor work in AWS many years ago (had to dig up some really old notes to find my original credentials), and I see a lot of people discussing Terraform to control their cloud infrastructure, so that is the path I went down. For those of you who are already Terraform (and/or AWS) gurus, this might be considered some boring reading…

My cloud footprint is quite small: a few static web sites and a couple of dynamic sites, hosted on a few Linux cloud servers. I figured it might be easy to migrate, but quickly learned that there are a lot of things to take into consideration. In the past I’ve seen cloud migrations fail because people wanted to just do a “lift and shift”. That is often absolutely the wrong thing to do, especially if you are migrating to AWS. Doing it properly means taking the work I currently put into managing BIND, NGINX, and Linux in general and transferring it over to AWS infrastructure.

I started with my static sites. It is fairly easy to set up a single site, but using Terraform to manage several of them means quickly learning how to iterate over things using for_each. Basic testing also exposed several other issues I had to work around, things that had been easy (sometimes just one-line) configuration changes on my old setup:

  • Redirecting users from www.domain.com to domain.com
  • Custom error pages
  • Configuring domains that only exist to redirect to other sites
  • Adding DNS entries for non-AWS addresses
  • Getting Terraform to properly refer to and link together all resources for each site

I ended up using four services:

  • AWS Certificate Manager – Encrypt all the things
  • Route53 – All DNS, all the time
  • CloudFront – Configured along with functions to handle all the required redirects
  • S3 – A single bucket for all of my static sites, with CloudFront using different origin paths
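As a rough illustration of how several sites can share one set of resource definitions, here is a minimal sketch using for_each. It is not my actual configuration: the sites local, the domain names, and the origin paths are hypothetical placeholders, and the CloudFront distribution itself is left out for brevity.

```hcl
locals {
  # Hypothetical site list. The map keys (the domain names) become the
  # for_each keys, so every per-site resource shares the same identifier.
  sites = {
    "example.com" = { origin_path = "/example.com" }
    "example.org" = { origin_path = "/example.org" }
  }
}

# One hosted zone per site.
resource "aws_route53_zone" "site" {
  for_each = local.sites
  name     = each.key
}

# One certificate per site, covering both the apex and the www name.
# Certificates used by CloudFront must be issued in us-east-1, so this
# sketch assumes the default provider is configured for that region.
resource "aws_acm_certificate" "site" {
  for_each                  = local.sites
  domain_name               = each.key
  subject_alternative_names = ["www.${each.key}"]
  validation_method         = "DNS"

  lifecycle {
    create_before_destroy = true
  }
}

# Other per-site resources can link back to these by key, for example
# aws_route53_zone.site[each.key].zone_id, and the (omitted) CloudFront
# distribution would read each.value.origin_path for its S3 origin.
```

Because every resource is keyed by the domain name, adding or removing a site only touches that site's resources.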

All of the work I did to build my Terraform file reminded me of work I did many years ago when I first started learning to build states in Salt, which is another example of learning something new (configuration management) to replace something that already exists (manually maintaining a Linux server environment at work). In Salt (just like other configuration management systems) you have to build everything in a way that refers back to previous resources, including looping over previous resources to configure/deploy new resources.

Back to the “learning something by reproducing something else” experiment… Here are a few things I’ve learned while poking around Terraform and AWS over the past couple weeks:

  • Lots of examples on the Internet are out of date – The first thing I wanted to do was upload files to an S3 bucket. I saw an example that suggested using aws_s3_bucket_object, and I was immediately greeted with a deprecation notice (the correct answer is aws_s3_object). Later, I tried setting up certificate validation for ACM and was immediately greeted with the “validation data is now a set instead of a list” issue (which completely changes how you create the certificate validation resources). It reinforced the fact that you will eventually need to pin to specific versions of Terraform and the AWS provider.

  • Design your loop labels/iterators correctly, or a simple change will cause your resources to be nuked and recreated – When creating the additional DNS records for non-AWS resources, I initially used count to create the loop (and later for_each, which changed the iterators from ints to strings). With index-based identifiers, adding/removing entries to/from the set of DNS entries would shift subsequent entries up/down, throwing things out of sync. This would manifest in weird errors (for example, being unable to add DNS entries that don’t currently exist) due to all of the indexes/identifiers changing. I redesigned the for loop so that the identifier was crafted from the DNS data instead of an index (probably basic knowledge to someone with Terraform experience), and a few terraform state mv’s later my config is now less brittle.

  • Design your code and data structures in an extensible way – When I first started, I built my code to work based on a single static site. Once I got that to work, I ended up re-writing most of my code to take the “single instance” logic and re-write each of the resources in a way that can be iterated over (proper loop iterators, proper resource names, etc.). Yes, this should be common sense, but planning ahead could save you from lots of terraform state mv’s in the future.

  • For small environments, terraform plan can help you build your configs – If you don’t know offhand the specific Terraform settings needed to implement a certain function/feature on a resource currently being tracked by Terraform, it is usually easy to find the needed directives. You can make the required changes in the AWS console, then run terraform plan. That will display all the settings it wants to remove, and you can just add those changes to your configuration.

  • The lack of a nested for_each, and the limited data types it supports – Even in my small environment I can see how the lack of a nested for_each can be very annoying. I want to deploy the same set of resources for multiple sites, and some of those resources require loops internally. For example, the combination of aws_acm_certificate_validation and the DNS records that are created by looping over the entries returned by domain_validation_options required individually configuring separate resources for each hostname. I was able to work around that by creating a local Terraform module to handle all of the certificate creation logic. Additionally, I can’t run for_each against a regular list (it only accepts a map or a set of strings), but wrapping that same data in toset() will work. I assume that is a design decision to ensure that you don’t apply for_each against the same value more than once (guaranteed by converting it to a set), but it is still annoying (particularly since I’ve iterated over many different data types in many different programming languages).
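To make a few of those points more concrete, here is a minimal sketch of version pinning, DNS records keyed by their own data rather than by list position, and the set-based ACM validation pattern. Treat it as an illustration under assumptions rather than my real configuration: the version constraints, the zone_id and domain variables, the extra_records local, and the sample addresses are all hypothetical.

```hcl
terraform {
  # Hypothetical constraints; pin to whatever versions you have tested.
  required_version = "~> 1.6"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

variable "zone_id" { type = string } # hypothetical: existing hosted zone
variable "domain" { type = string }  # hypothetical: e.g. "example.com"

locals {
  # Hypothetical records pointing at non-AWS addresses.
  extra_records = [
    { name = "mail", type = "A", value = "192.0.2.10" },
    { name = "blog", type = "CNAME", value = "host.example.net" },
  ]
}

resource "aws_route53_record" "extra" {
  # Build the key from the record data instead of its list index, so that
  # adding or removing an entry does not shift the others.
  for_each = { for r in local.extra_records : "${r.name}_${r.type}" => r }

  zone_id = var.zone_id
  name    = "${each.value.name}.${var.domain}"
  type    = each.value.type
  ttl     = 300
  records = [each.value.value]
}

resource "aws_acm_certificate" "cert" {
  domain_name       = var.domain
  validation_method = "DNS"

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_route53_record" "validation" {
  # domain_validation_options is a set, so it is turned into a map keyed
  # by domain name rather than indexed like a list.
  for_each = {
    for dvo in aws_acm_certificate.cert.domain_validation_options : dvo.domain_name => {
      name   = dvo.resource_record_name
      type   = dvo.resource_record_type
      record = dvo.resource_record_value
    }
  }

  allow_overwrite = true
  zone_id         = var.zone_id
  name            = each.value.name
  type            = each.value.type
  ttl             = 60
  records         = [each.value.record]
}

resource "aws_acm_certificate_validation" "cert" {
  certificate_arn         = aws_acm_certificate.cert.arn
  validation_record_fqdns = [for r in aws_route53_record.validation : r.fqdn]
}
```

With keys like mail_A, removing one record from the list leaves the addresses of the remaining records in the state unchanged, which is what avoids the destroy-and-recreate churn that index-based identifiers cause.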

My next steps will be to move over my dynamic sites. I haven’t decided on exactly how I will use EC2 for that part. I don’t have any high-traffic sites, so even RDS might be overkill when compared to a local MySQL/MariaDB (although using RDS would be considered “clouding properly”, and would be much easier to manage via Terraform).

Also, I don’t have anything set up for looking at my meager traffic stats (e.g., what I would normally pull from NGINX web server logs), so that is something else I need to get set up. I have my CloudFront distributions logging to an S3 bucket; I just need to visualize that log data.

Overall, it has been interesting learning about Terraform, although there is a big difference between managing a few static sites and a large multi-AZ deployment containing many different applications, storage types, caching layers, compute nodes, and databases. That would most likely require a much broader way of thinking about your environment, and an entirely different set of design choices. Looks like I still have quite a bit of learning to do…