7FACTOR NEWS
High-availability, auto-scaling, self-healing cloud infrastructure is as resilient as the many-headed Hydra of Greek myth. Make the most of it by using containers.
By Jeremy Duvall, founder, 7Factor Software
This article originally appeared on InfoWorld on January 30, 2023.
According to Greek mythology, if you were to venture to a certain lake in Lerna, you’d find the many-headed hydra, a serpentine water monster that carries the secret to modern cloud architecture. Why? The thing is hard to kill, much like you want your cloud infrastructure to be. You cut one head off, and it grows two more.
In the myth, even mighty Hercules needed help shutting down the resilient beast. Yet in the world of IT infrastructure, instead of spawning hydras we’re entrusting our digital futures to snowflake servers swirling in the cloud.
We’ve lost sight of the true potential of infrastructure automation to deliver high-availability, auto-scaling, self-healing solutions. Why? Because everyone in the C-suite expects a timely, tidy transition to the cloud, with as little actual transformation as possible.
This incentivizes teams to “lift and shift” their legacy codebase to virtual machines (VMs) that look just like their on-prem data center. While there are scenarios in which this approach is necessary and appropriate—such as when migrating away from a rented data center under a very tight deadline—in most cases you’re just kicking the can of a true transformation down the road. Immersed in a semblance of the familiar, teams will continue to rely on the snowflake configurations of yore, with even allegedly “automated” deployments still requiring manual tweaking of servers.
These custom, manual configurations used to make sense with on-prem virtual machines running on bare metal servers. You had to handle changes on a system-by-system basis. The server was like a pet requiring regular attention and care, and the team would keep that same server around for a long time.
Yet even as they migrate their IT infrastructure to the cloud, engineers continue to tend to VMs provisioned in the cloud through manual configurations. While seemingly the simplest way to satisfy a “lift and shift” mandate, this thwarts the fully automated promise of public cloud offerings to deliver high-availability, auto-scaling, self-healing infrastructure. It’s like buying a smartphone, shoving it in your pocket, and waiting by the rotary phone for a call.
The end result? Despite making substantial investments in the cloud, organizations fumble the opportunity to capitalize on its capabilities.
Why would you ever treat your AWS, Azure, Google Cloud, or other cloud computing service deployments the same way you treat a data center when they have fundamentally different governing ideologies?
Rage against the virtual machine. Go stateless.
Cloud-native deployment calls for an entirely different mindset: a stateless one, in which no individual server matters. The opposite of a pet. Instead, you effectively need to create your own virtual herd of hydras so that when something goes wrong or load is high, your infrastructure simply spawns new heads.
You can do this with auto-scaling rules in your cloud platform, a sort of halfway point along the road to a truly cloud-native paradigm. But container orchestration is where you fully unleash the power of the hydra: fully stateless, self-healing, and effortlessly scaling.
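To make that halfway point concrete, here’s a minimal sketch of a target-tracking auto-scaling rule on an existing EC2 Auto Scaling group, written with boto3. The group name and CPU target are hypothetical placeholders, and it assumes your AWS credentials are already configured.

```python
import boto3

# Hypothetical rule: keep average CPU across the Auto Scaling group near 60%.
autoscaling = boto3.client("autoscaling")

autoscaling.put_scaling_policy(
    AutoScalingGroupName="legacy-web-asg",   # hypothetical group name
    PolicyName="keep-cpu-near-60-percent",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 60.0,                 # hypothetical target
    },
)
```

With a rule like this, the platform adds and removes VMs for you, but each replacement still takes minutes to boot, which is exactly why the next step is containers.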
Imagine if, like VMs, the mythic Hydra required several minutes of downtime to regrow each severed head. Hercules could have dispatched it on his own during the wait. But because containers are so lightweight, horizontal scaling and self-healing can complete in less than five seconds (assuming well-designed containers) for true high availability that outpaces even the swiftest executioner’s sword.
We have Google to thank for the departure from big on-prem servers and the commoditization of workloads that makes this lightning-fast scaling possible. Picture Larry Page and Sergey Brin in the garage with 10 stacked 4GB hard drives in a cabinet made of LEGOs, wired to a bunch of commodity desktop computers. They created the first version of Google while also sparking the “I don’t need a big server anymore” revolution. Why bother when you can use standard computing power to deploy what you need, when you need it, then dispatch it as soon as you’re done?
Back to containers. Think of containers as the heads of the hydra. When one goes down, if you have your cloud configured properly in Kubernetes, Amazon ECS, or any other container orchestration service, the cloud immediately replaces it with a new container that picks up where the fallen one left off.
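As a rough sketch of what that looks like in practice, here’s a Deployment defined with the official Kubernetes Python client: three identical heads, each with a liveness probe so the orchestrator knows when to lop one off and grow a replacement. The image name, port, and /healthz path are hypothetical, and this assumes you have a cluster and a kubeconfig available.

```python
from kubernetes import client, config

config.load_kube_config()  # assumes a local kubeconfig pointing at your cluster
apps = client.AppsV1Api()

container = client.V1Container(
    name="web",
    image="registry.example.com/web:1.2.3",  # hypothetical image
    ports=[client.V1ContainerPort(container_port=8080)],
    liveness_probe=client.V1Probe(           # fail the probe, lose the head
        http_get=client.V1HTTPGetAction(path="/healthz", port=8080),
        initial_delay_seconds=3,
        period_seconds=5,
    ),
)

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="web"),
    spec=client.V1DeploymentSpec(
        replicas=3,  # the hydra keeps three heads at all times
        selector=client.V1LabelSelector(match_labels={"app": "web"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "web"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)

apps.create_namespaced_deployment(namespace="default", body=deployment)
```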
Yes, there’s a cost associated with implementing this approach, but in return, you’re unlocking unprecedented scalability that creates new levels of reliability and feature velocity for your operation. Plus, if you keep treating your cloud like a data center without the ability to capitalize the cost of that data center, you incur even more expenses while missing out on some of the key benefits the cloud has to offer.
What does a hydra-based architecture look like?
Now that we know why the heads of the hydra are necessary for today’s cloud architecture, how do you actually create them?
Separate config from code
Based on Twelve-Factor App principles, a hydra architecture should rely on environment-based configuration, keeping deploy-specific settings out of the codebase so replacement containers can spin up in any environment without code changes.
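In practice, that means the code asks its environment for anything that varies between deployments. Here’s a minimal Python sketch, with hypothetical variable names:

```python
import os

# All deploy-specific values come from the environment, never from files
# baked into the image. Variable names here are hypothetical.
DATABASE_URL = os.environ["DATABASE_URL"]  # fail fast if missing
REDIS_URL = os.environ.get("REDIS_URL", "redis://localhost:6379")
LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO")
NEW_CHECKOUT = os.environ.get("ENABLE_NEW_CHECKOUT", "false") == "true"
```

The same container image then runs unchanged in dev, staging, and production; only the environment around it changes.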
Never local, always automated
Think of file systems as immutable—and never local. I repeat: Local IO is a no. Logs and metrics should stream to services like Amazon CloudWatch or Prometheus, and files should go to blob storage like Amazon S3 or Azure Blob Storage. You’ll also want to make sure you’ve deployed automation services for continuous integration (CI), continuous delivery (CD), and disaster recovery (DR) so that new containers spin up automatically as necessary.
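Here’s a minimal sketch of both habits, assuming boto3 and a hypothetical bucket name: logs stream to stdout so the platform’s log collector picks them up, and anything that must persist goes straight to blob storage rather than the container’s disk.

```python
import io
import logging
import sys

import boto3

# Log to stdout; the platform (CloudWatch, a log agent, etc.) collects it.
logging.basicConfig(stream=sys.stdout, level=logging.INFO)
log = logging.getLogger("report-service")

# Treat the filesystem as immutable: persistent artifacts go to blob storage.
s3 = boto3.client("s3")

def save_report(report_bytes: bytes, report_id: str) -> str:
    key = f"reports/{report_id}.csv"
    s3.upload_fileobj(io.BytesIO(report_bytes), "my-company-reports", key)  # hypothetical bucket
    log.info("stored report %s in blob storage", report_id)
    return key
```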
Let bin packing be your guide
To control costs and reduce waste, refer to the principles of container bin packing. Some cloud platforms will bin pack for you while others will require a more manual approach, but either way you need to optimize your resources. Think of it like this: Machines are like storage space on a ship—you only have so much depending on CPU and RAM. Containers are the boxes you’re going to transport on the ship. You’ve already paid for the storage (i.e., the underlying machines), so you want to pack as much into it as you can to maximize your investment. In a 1:1 implementation, you would pay for multiple ships that carry only one box each.
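If your platform leaves the packing to you, the intuition is simple enough to sketch in a few lines of Python. This toy first-fit example uses made-up node sizes and container requests; real schedulers weigh far more than CPU and memory, but the cost math is the same.

```python
NODE_CPU, NODE_MEM = 4000, 16384  # one "ship": 4 vCPU (millicores), 16 GiB (MiB)

containers = [  # hypothetical (name, cpu_millicores, memory_mib) requests
    ("api", 500, 1024), ("worker", 1500, 4096), ("cache", 250, 2048),
    ("web", 500, 512), ("etl", 2000, 8192), ("cron", 100, 256),
]

def first_fit_decreasing(items):
    """Pack boxes (containers) onto ships (nodes), biggest CPU first."""
    nodes = []
    for name, cpu, mem in sorted(items, key=lambda i: i[1], reverse=True):
        for node in nodes:
            if node["cpu"] + cpu <= NODE_CPU and node["mem"] + mem <= NODE_MEM:
                node["cpu"] += cpu
                node["mem"] += mem
                node["names"].append(name)
                break
        else:  # nothing fit, so pay for another ship
            nodes.append({"cpu": cpu, "mem": mem, "names": [name]})
    return nodes

for i, node in enumerate(first_fit_decreasing(containers), start=1):
    print(f"node {i}: {node['names']} ({node['cpu']}m CPU, {node['mem']}Mi RAM)")
```

Here, six containers fit on two nodes instead of six; that difference is your bill.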
Right-size your services
Services should be as stateless as possible. Design right-size services—the sweet spot between microservices and monoliths—by building a suite of services that are the correct size to solve the problem, based on the context and the domain you're working in. At either extreme, microservices invite complexity and monoliths don't scale well. As with most things in life, right in the middle is likely the best choice.
How do you know if you’ve succeeded?
How do you know if you’ve configured your containers correctly to achieve horizontal scale? Here’s the litmus test: If I were to turn off a deployed server, or five servers, would your infrastructure come back to life without the need for manual intervention? If the answer is yes, congratulations. If the answer is no, go back to the drawing board and figure out why not, then solve for those cases. This concept applies no matter your public cloud: Automate everything, including your DR strategies wherever cost-effective. Yes, you may need to change how your application reacts to these scenarios.
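One way to run that litmus test, sketched here with the Kubernetes Python client and hypothetical names: kill a random pod and watch whether the Deployment regrows the head on its own.

```python
import random
import time

from kubernetes import client, config

config.load_kube_config()  # assumes a kubeconfig for the target cluster
core = client.CoreV1Api()
apps = client.AppsV1Api()

NAMESPACE, DEPLOYMENT, SELECTOR = "production", "web", "app=web"  # hypothetical

# Pick a victim at random and cut off that head.
victim = random.choice(core.list_namespaced_pod(NAMESPACE, label_selector=SELECTOR).items)
core.delete_namespaced_pod(victim.metadata.name, NAMESPACE)
print(f"killed {victim.metadata.name}; waiting for the hydra to regrow...")

# Pass if the Deployment returns to full strength without human help.
deadline = time.time() + 120
while time.time() < deadline:
    status = apps.read_namespaced_deployment(DEPLOYMENT, NAMESPACE).status
    if (status.ready_replicas or 0) >= (status.replicas or 0) > 0:
        print("back to full strength, no manual intervention required")
        break
    time.sleep(5)
else:
    print("did not self-heal; back to the drawing board")
```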
As a bonus, save time on compliance
Once you’re set up for horizontal auto-scaling and self-healing, you’ll also free up time previously spent on security and compliance. With managed services, you no longer have to spend as much time patching operating systems thanks to a shared responsibility model. Running container-based services on someone else’s machine also means letting them deal with host OS security and network segmentation, easing the way to SOC and HIPAA compliance.
Now, let’s get back to coding
Bottom line, if you’re a software engineer, you have better things to do than babysit your virtual pet cloud infrastructure, especially when it’s costing you more and negating the benefits you’re supposed to be getting from virtualization in the first place. When you take the time up front to ensure effortless horizontal auto-scaling and self-healing, you’re configured to enjoy a high-availability infrastructure while increasing the bandwidth your team has available for value-add activities like product development.
So go ahead and dive into your next project with the surety that the hydra will always spawn another head. Because in the end, there’s no such thing as flying too close to the cloud.