Pushing AWS to its limits: an important service change with zero downtime.

José de Zárate
3 min read · Jun 4, 2023


How using “everything in the book” of AWS made an important change in an application possible with zero downtime.

The change

We have a complex app architecture in AWS based on services. One of them is the auth service: the other services ask it whether a user is authorized, what roles that user has, etc …

The internal database the auth service uses runs on an Aurora PostgreSQL engine, and we want to move it to an Aurora MySQL engine.
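A heterogeneous migration like this one is typically driven by AWS DMS. As a rough CloudFormation sketch of what that involves (every name, endpoint and instance class below is an illustrative placeholder, not our actual stack, and the source/target endpoint resources are assumed to exist elsewhere in the template):

```yaml
# Hypothetical DMS resources for a PostgreSQL -> MySQL migration sketch.
AuthDmsReplicationInstance:
  Type: AWS::DMS::ReplicationInstance
  Properties:
    ReplicationInstanceClass: dms.t3.medium   # assumed size

AuthDmsMigrationTask:
  Type: AWS::DMS::ReplicationTask
  Properties:
    # full-load-and-cdc = copy the existing rows, then keep replicating
    # ongoing changes, which is what keeps both databases in sync
    # while the old service is still live.
    MigrationType: full-load-and-cdc
    ReplicationInstanceArn: !Ref AuthDmsReplicationInstance
    SourceEndpointArn: !Ref AuroraPostgresSourceEndpoint   # assumed endpoint
    TargetEndpointArn: !Ref AuroraMysqlTargetEndpoint      # assumed endpoint
    TableMappings: >-
      {"rules": [{"rule-type": "selection", "rule-id": "1",
                  "rule-name": "all", "object-locator":
                  {"schema-name": "%", "table-name": "%"},
                  "rule-action": "include"}]}
```

The important knob is `MigrationType: full-load-and-cdc`: with ongoing replication, there is no moment where the MySQL copy is stale, so the switchover below can happen at any time.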

The basic architecture

Well, this auth service runs on some EC2 instances that belong to a Target Group and sit behind an ELB. Every auth request first goes to the ELB, which forwards it to the Target Group made of the EC2 instances that contain the auth service code (which uses its internal database to decide what to answer, and answers it).

In a nutshell, when some service of our app made an auth request to, say, http://cool-auth-service.some.subdomain.com, the ELB forwarded it to this auth Target Group.
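That wiring can be sketched in CloudFormation. Everything here is an assumption for illustration (ports, health-check path, and the referenced VPC and load balancer are placeholders, not our real resources):

```yaml
# Hypothetical sketch of the ELB -> Target Group wiring described above.
AuthTargetGroup:
  Type: AWS::ElasticLoadBalancingV2::TargetGroup
  Properties:
    VpcId: !Ref AppVpc              # assumed VPC
    Port: 8080                      # assumed service port
    Protocol: HTTP
    TargetType: instance            # the EC2 instances running the auth code
    HealthCheckPath: /health        # assumed health-check endpoint

AuthListener:
  Type: AWS::ElasticLoadBalancingV2::Listener
  Properties:
    LoadBalancerArn: !Ref AuthLoadBalancer   # assumed ELB
    Port: 80
    Protocol: HTTP
    DefaultActions:
      # every request to cool-auth-service.some.subdomain.com
      # is forwarded to the auth Target Group
      - Type: forward
        TargetGroupArn: !Ref AuthTargetGroup
```

The point to notice is that the listener's forward action is the only thing tying the public endpoint to a particular Target Group, which is exactly what makes the zero-downtime swap below possible.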

How to do it with zero downtime

  • First, run the database migration: that is, make sure there is a MySQL database which is completely in sync with the original PostgreSQL database. That is covered here
  • Then make the code changes so the auth service talks to the new database, but don’t deploy those changes just yet
  • Create a CloudFormation Stack that creates a Target Group of EC2 instances similar to those you’re using to run the auth service, and also creates a CodeDeploy application for deploying to that target group.
  • Now, using the brand-new CodeDeploy application, deploy the code that talks to the MySQL database to the instances of the recently created target group.
  • At this point, we have two target groups with auth code in them:
    - The original, still not replaced, “old” target group of EC2 instances running the code that lets them “talk” to the PostgreSQL database. It sits behind an endpoint on the ELB, so it is still the “active” auth service.
    - The newly created “new” target group of EC2 instances running the code that lets them “talk” to the MySQL database, sitting behind nothing and receiving requests from nobody.
  • Now we add a temporary endpoint to our ELB that points to this new target group. Nobody knows about it, but we can use it to test how the new auth service works.
  • If we’re satisfied, then on our ELB we make the endpoint that used to point to the old auth target group point to the new target group. Now, when one of our services makes an auth request to http://cool-auth-service.some.subdomain.com, that request is forwarded to the new Target Group.
  • And that’s all, folks! 😆
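The last three steps (temporary test endpoint, swap, cleanup) could be driven with the AWS CLI. This is a sketch against a live account, not something from our actual runbook: every ARN variable and the test hostname are placeholders you would fill in yourself.

```shell
# Hypothetical AWS CLI sketch of the test-and-swap steps above.
# $LISTENER_ARN, $NEW_TG_ARN and $TEMP_RULE_ARN are placeholders.

# 1. Add a temporary, host-based rule so the new target group is
#    reachable for testing without touching the live endpoint.
aws elbv2 create-rule \
  --listener-arn "$LISTENER_ARN" \
  --priority 10 \
  --conditions Field=host-header,Values=test-auth.some.subdomain.com \
  --actions Type=forward,TargetGroupArn="$NEW_TG_ARN"

# 2. Once satisfied, point the live endpoint at the new target group.
#    The switch is a single API call on the listener, so in-flight
#    clients keep being served and new requests go to the new group.
aws elbv2 modify-listener \
  --listener-arn "$LISTENER_ARN" \
  --default-actions Type=forward,TargetGroupArn="$NEW_TG_ARN"

# 3. Clean up the temporary test rule (the old target group can be
#    torn down later, once you are sure you won't roll back).
aws elbv2 delete-rule --rule-arn "$TEMP_RULE_ARN"
```

Keeping the old target group around for a while is a cheap insurance policy: rolling back is the same `modify-listener` call with the old target group's ARN.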

The AWS components involved

Just for the sake of having them listed somewhere:

  • DMS (Database Migration Services)
  • ELB (Elastic Load Balancing)
  • CloudFormation Stacks
  • CodeDeploy
  • Aurora PostgreSQL
  • Aurora MySQL

Disclaimer: my cool company

Doofinder


José de Zárate

I'm a Theoretical Physicist who plays rock'n'roll bass and gets his money from programming in some SaaS company.