There are a lot of technical articles on implementing Kubernetes (K8s), and a whole lot of inspiring blog posts on using K8s to solve difficult technical challenges, but in this post, I want to cover the business reasons why a small, technology-driven insurance agency shifted to an enterprise-class platform over other solutions.
“What is Kubernetes,” I hear you cry… Well, in layman’s terms, Kubernetes is an open-source platform that manages the deployment and workloads of applications. It allows you to package up your code and configuration into a small container and run it in a managed pool of servers - along with definitions of all the plumbing necessary to access it.
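To make that concrete, here is a minimal, hypothetical example of the “code and configuration” you hand to Kubernetes - the app name, image, and port below are placeholders, not our actual setup:

```yaml
# A Deployment tells Kubernetes to keep 2 copies of a container running
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app            # placeholder name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: example.com/my-app:1.0.0   # placeholder image
          ports:
            - containerPort: 8080
---
# A Service is the "plumbing": a stable internal address in front of the pods
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
```

Kubernetes takes these declarations and continuously works to make reality match them.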
Containerization of applications has a storied past, and has become The Way for building and running apps. Docker is probably the best-known and most widely used tooling when people talk about ‘containerization.’ There are countless frameworks available that can run and manage Docker containers: from one-off deployment tools like docker-machine and scheduling (deployment) tools like Docker Swarm, to hosted providers like Amazon ECS, Heroku, Google Cloud Platform, CloudFoundry, etc. The list goes on.
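If containers are new to you, packaging an app is usually just a short Dockerfile. This is a generic sketch assuming a Node.js service - not our actual build:

```dockerfile
# Start from a small base image (placeholder runtime)
FROM node:18-alpine
WORKDIR /app
# Copy dependency manifests first so Docker's layer cache speeds up rebuilds
COPY package*.json ./
RUN npm ci --omit=dev
# Copy the rest of the application source
COPY . .
EXPOSE 8080
CMD ["node", "server.js"]
```

One `docker build` later, the same artifact runs identically on a laptop, in CI, or on any of the platforms listed above.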
Clearly Docker and containerization are a good choice for any technical team, as they give you a wide choice of platforms on which to deploy, and they don’t require that you run specialized software yourself. In addition, containerization is well-known in the tech space, so finding software engineers who know it isn’t difficult.
“That’s great, I guess. Why do I care?”
So back to the original intent, “Why is this important to an independent insurance agency?” It comes down to resources. As a company in a highly regulated industry, and as a technology-driven startup, we have a lot of business requirements:
- High up-time
- Be able to react to changing business needs (#startuplife)
- Be secure (properly handle personal and financial data)
- Be SOC compliant
- Do it with only 3 engineers!!!
The last is the biggest driver for us. Everything else is possible, but how do we do all of these things with just 3 engineers and no system administrators? Financial institutions and other service providers meet these goals every day, but they are usually supported by much larger teams.
In order to support all of these business needs, we had to make sure we had a solution that could be managed without requiring the hiring of a DevOps engineer and system administrator (aka, more resources), wouldn’t significantly impact current development flow, and wouldn’t cost more than our current solution of deploying to Amazon ECS (even better - reduce costs, please).
Without a dedicated sys admin, we knew we needed to adopt the Infrastructure as Code methodology. We had begun using Terraform to manage systems in AWS and to deploy code, but as the number of microservices increased, the amount and complexity of the Terraform ‘code’ were getting out of hand - not to mention the number of servers we had to run in AWS to support the high-availability requirements. From a time perspective, every time we added an app or updated a version, we had to go back and wade through all of the Terraform files and link the right containers together in this giant infrastructure project.
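To illustrate the sprawl (with hypothetical resource names, not our real configuration): on ECS, every new microservice meant hand-wiring several Terraform resources together inside one big project, roughly like this:

```hcl
# Illustrative sketch only: each service needed a task definition,
# a service, and load-balancer wiring, all kept in sync by hand.
resource "aws_ecs_task_definition" "billing" {
  family                = "billing"
  container_definitions = file("tasks/billing.json")
}

resource "aws_ecs_service" "billing" {
  name            = "billing"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.billing.arn
  desired_count   = 2

  load_balancer {
    target_group_arn = aws_lb_target_group.billing.arn
    container_name   = "billing"
    container_port   = 8080
  }
}

resource "aws_lb_target_group" "billing" {
  name     = "billing"
  port     = 8080
  protocol = "HTTP"
  vpc_id   = aws_vpc.main.id
}
```

Multiply that by dozens of services and every version bump touches the same giant project.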
What we needed was a more efficient way to isolate individual applications into separate deployment descriptors per project, with better software defined networking to make them available internally. So, when Amazon finally released their hosted Kubernetes environment we knew we had our answer.
Creating an EKS cluster with Terraform was fairly simple. Following instructions provided by the vibrant Terraform community, we were able to build out our cluster in no time and hook into our existing AWS security mechanisms. You could go with Google Cloud Platform as well (as they pioneered Kubernetes and open-sourced it), but we were already invested in Amazon. EKS gave us access to a fairly complicated platform with very little system administration knowledge - reducing the burden on our small team.
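The community route typically means the terraform-aws-modules/eks module; a hypothetical sketch (cluster name, version, and network references are placeholders, and the module’s inputs vary by version):

```hcl
# Hypothetical EKS cluster via the community module - not our exact config
module "eks" {
  source          = "terraform-aws-modules/eks/aws"

  cluster_name    = "my-cluster"          # placeholder
  cluster_version = "1.27"                # placeholder
  vpc_id          = aws_vpc.main.id
  subnet_ids      = aws_subnet.private[*].id
}
```

A handful of lines like these stand up the control plane and worker nodes that would otherwise take real sys admin expertise to operate.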
Next, came some standard tooling. Kubernetes deployments happen via command-line tools and APIs, and we chose Helm as a tool to allow us to package k8s resources and templatize them to work in standard dev/test/stage/production environments.
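Helm’s templating is what makes one package work across environments; here is a hypothetical chart excerpt (every name and value is a placeholder):

```yaml
# templates/deployment.yaml (excerpt): Helm fills in per-environment values
spec:
  replicas: {{ .Values.replicaCount }}
  template:
    spec:
      containers:
        - name: {{ .Chart.Name }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
```

Each environment then gets its own values file, so a deploy is roughly `helm upgrade --install my-app ./chart -f values-production.yaml`.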
With Kubernetes/Helm up and running, we suddenly had access to all sorts of tools! We were able to deploy standard Helm packages to get access to Jenkins for build/deploy automation, Grafana and Prometheus for metrics and monitoring, Logstash/Elasticsearch for log aggregation, NGINX proxies… - all the things usually managed by a DevOps team.
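Installing that tooling is a couple of commands per package; a hypothetical session (repository URLs and chart names vary over time and by Helm version):

```shell
# Hypothetical chart installs - not our exact commands or release names
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm install monitoring prometheus-community/kube-prometheus-stack
helm install grafana grafana/grafana
```

Each install stands up a production-grade service that would otherwise be days of manual setup.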
With all of these tools available, we then created standardized build and deployment scripts and put them in Yeoman Generators - so a developer can spin up a new project with FULL automated builds and push-button deployments in just a few minutes! These automated scripts (yo Jenkinsfile and yo Kubernator) also insert our SOC-required auditing steps, so we get Slack notifications on all builds and deployments, as well as a generated report of all production deployments and changes.
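The generated pipeline looks something like this hypothetical Jenkinsfile excerpt - slackSend comes from the Jenkins Slack plugin, and the stage contents, channel, and app name are placeholders, not our actual generator output:

```groovy
// Hypothetical excerpt of a generated Jenkinsfile
pipeline {
  agent any
  stages {
    stage('Build') {
      steps { sh 'docker build -t my-app:${GIT_COMMIT} .' }
    }
    stage('Deploy') {
      steps { sh 'helm upgrade --install my-app ./chart --set image.tag=${GIT_COMMIT}' }
    }
  }
  // Auditing hooks: every build and deployment posts to Slack
  post {
    success { slackSend(channel: '#deployments', message: "Deployed my-app ${GIT_COMMIT}") }
    failure { slackSend(channel: '#deployments', message: "FAILED my-app ${GIT_COMMIT}") }
  }
}
```

Because the generator writes this file, every new project gets the audit trail for free instead of relying on a developer remembering to add it.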
With all of the out-of-the-box configurations in our metrics bundle, we get monitoring and alerting of our infrastructure in Grafana - so we were able to build dashboards and get service alerts via email and Slack with almost no effort.
All while reducing our EC2 instances to just 5 servers, which support our test, preview, and production environments. We now run over 40 different services and jobs (over 100 containers) and haven’t had any issues with scaling or service availability. We run all external web access through a single set of ELB load balancers and proxies, which simplifies DNS and SSL certificate management - while significantly reducing our AWS costs.
If applications (or servers!) fall over, the system self-heals. Kubernetes will automatically restart any failed applications, and our AWS auto-scaling group will automatically replace any failing servers, which will then automatically rejoin the Kubernetes cluster.
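Much of that self-healing is driven by health checks; a hypothetical liveness probe (path, port, and timings are placeholders) tells Kubernetes when to restart a container:

```yaml
# Container spec excerpt: if /healthz stops answering, the kubelet
# automatically restarts the container
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 15
```

Combine that with a replica count and an AWS auto-scaling group, and no human has to be paged for a single crashed app or dead server.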
So, while many enterprises choose Kubernetes to handle large and complex workloads (just for fun, you can read about Tinder’s experience), we chose it for the opposite reason. We needed something that would enable us to execute small, simple, and standard workloads without any overhead - and without too much specialized experience.
We built Covered to make it easy to shop around for just the right homeowners insurance, taking the mystery out of the process. Check out just how easy it is.