Today’s post is a guest contribution from Shawn Xu (no relation). Shawn is one of the early engineers at Ascend.io, got his Master's in Human Computer Interaction at Carnegie Mellon, and writes an informative WeChat public account on SaaS called 硅谷成长攻略. He has written numerous fantastic posts on Interconnected, like Low Code No Code Landscape Part I and II, and "What Does Databricks Do?"
For anyone trying to make sense of the name of the open source unicorn HashiCorp, look no further than the name of one of its co-founders, Mitchell Hashimoto. Mitchell registered Hashicorp, the name, years ago as a placeholder before starting HashiCorp, the company. It is now worth more than $5 billion.
Similar to my previous article, “What Does Databricks Do?”, this article will explore and explain the technology and business model that have been powering HashiCorp to become one of the most prominent open source startups to date. We'll start by drawing an example to help anyone -- developer or not -- better understand what this company does and how it’s riding a larger industry wave that is just getting started.
Power of Automation
A pianola, more famously known as a player piano, is a self-playing piano, typically instructed by notes printed on a music roll. The futuristic HBO TV series "Westworld" featured a pianola in its Season 1 opening scene, as a metaphor and foreshadowing that certain roles in the show operate by pre-defined, automated scripts.
Whether you like this metaphor in the show or not, I find the invention of the pianola remarkable in two ways:
- It brought the beauty of live piano performance to the masses. Sure it will never reach the grandmaster level of Lang Lang or Evgeny Kissin, but its music is ear-pleasing enough for most people. What's groundbreaking is that the pianola commoditized live piano music.
- Given a script that’s recorded on a piano roll, the music quality is consistent. No hectic page turns. No distraction from the audience.
This consistency via automation is a future that HashiCorp is building today, just in the cloud infrastructure space. The work that used to take highly-skilled infrastructure engineers (the Lang Lang of IT, if you will) can now be recorded, and played back any time in any environment.
For example, the most common task performed on AWS -- provisioning a new EC2 server instance -- takes a whopping 16 steps. Some of these tasks may branch off into sub-tasks, and any task could introduce hard-to-debug errors for beginners.
Imagine having to do that ten times for ten slightly different servers. Even worse, imagine having to provision additional servers on Microsoft Azure or GCP, each with its own vastly different UI and CLI commands to learn and memorize.
Terraform, one of HashiCorp's flagship open source products, automates away many of these mundane and repetitive tasks, by introducing an easy-to-modify template for infrastructure provisioning. The template reduces those steps to a single descriptive document (the "piano roll"), allowing even the most novice engineer to “playback” and provision servers with consistency.
Unlike the pianola, Terraform does not replace infrastructure engineers, but instead makes their work easier and more predictable. Terraform is a great example of a low-code solution that we described in detail in “Low Code No Code Part 1” because it:
- Offloads engineers from code and processes that yield little impact;
- Abstracts away the complexity of directly interacting with different cloud environments, transitioning from “imperative” to “declarative”. (Terraform is cloud-agnostic; you can even use it on-premises.)
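To make the "imperative" versus "declarative" distinction concrete, here is a minimal Python sketch. It is not how Terraform is implemented; the `plan` function and the server names are hypothetical and exist only for illustration. The engineer writes down the desired end state, and the tool works out which actions are needed to get there.

```python
def plan(desired: dict, current: dict) -> list:
    """Compare desired state with current state and list the actions needed.

    Hypothetical helper for illustration; a real tool tracks far more detail.
    """
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in current:
            actions.append(("create", name))
        elif current[name] != spec:
            actions.append(("update", name))
    for name in sorted(current):
        if name not in desired:
            actions.append(("delete", name))
    return actions


# Desired state: what the engineer declares (the "piano roll").
desired = {
    "web-1": {"size": "small"},
    "web-2": {"size": "large"},
}
# Current state: what already exists in an imaginary cloud account.
current = {
    "web-1": {"size": "small"},
    "web-2": {"size": "small"},
    "old-box": {"size": "small"},
}

actions = plan(desired, current)
print(actions)
```

Running the same plan against an environment that already matches the desired state produces no actions, which is what makes the "playback" repeatable.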
The Origin of HashiCorp
Mitchell Hashimoto, who co-founded HashiCorp with Armon Dadgar while they were students at the University of Washington, played a lot of video games growing up. Many of these games involve a range of repetitive tasks. Take Neopets as an example. Players have to frequently feed and play with these digital pets, or otherwise, they would grow ill. Mitchell quickly grew tired of the chores and only wanted to focus on the "fun part": battling and trading. To do so, he started writing bots that can auto-perform the repetitive steps on his behalf, by triggering mouse clicks at the right place and the right time.
Mitchell’s obsession with "bots" and automation continued. He wrote scripts to set up web forums automatically and composed software to sign up for classes as soon as enrollment began.
A few years later in 2009, while working for a tech consultancy, he had to interact with many clients and a variety of tech stacks. Different clients run services on different operating systems, and oftentimes it’s the same software in slightly different versions. Mitchell, on the other hand, only had a single laptop to develop on locally. It often took him many hours to switch computing contexts, in order to simulate different customer environments. Born out of this repetitive frustration was HashiCorp’s first open source product, Vagrant.
Vagrant takes a configuration file, the "Vagrantfile", and returns a fully set up virtual environment by executing the instructions in the file. These instructions are "Vagrant-flavored" abstractions that are decoupled from the underlying operating system, so you can use them on any operating system.
This design pattern isn't new. In fact, another open source company, Puppet, introduced a similar solution back in 2006. This paradigm is now known as "Infrastructure as Code" (IaC). Vagrant took it further and provided even more integration and abstraction on top.
IaC, in a nutshell, is "f(.Configfile) => infrastructure", or the process of translating a configuration file into the desired infrastructure outcome. It's a simple concept, but HashiCorp took it to the next level with great developer experience and well-executed features.
HashiCorp makes money mostly from four open source products: Terraform, Vault, Consul, and Nomad. Vagrant, the company's first offering, saw a decline in traction as container technology gradually took over from virtual machines (VMs).
These four products each reside in a discrete domain and together form a convincing IaC story. The best way to navigate them is with a simplified real-world example without jargon.
Let's assume we are a team of infrastructure or DevOps engineers working for a new ecommerce startup. Our mission is to set up the right foundation and process for cloud infrastructure, so that other developers can self-serve and write applications on top of it.
Resource Provisioning (Terraform)
Our customer-facing ecommerce website has to run on some cloud servers. It needs to talk to the database to fetch and update orders. Between the server and the database, there has to be some sort of internal network that connects the two.
These components (computing servers, storage, network bandwidth, etc.) are the crucial "resources" in the HashiCorp context, which all other services depend on. Terraform allows developers to say “what I want to set up” (the declarative way) in a descriptive file, instead of the old approach of “what should I do to set these up” (the imperative way).
In the illustration below, we use one Terraform file to set up three different resources. They are empty for now, but don’t worry, we'll fill them in soon.
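As a rough stand-in for such a file, the sketch below describes the three resources as plain Python data and derives a safe creation order from their dependencies. The resource names and the `depends_on` field are assumptions made for this example, not Terraform syntax, and the ordering logic is only an illustration of why a provisioning tool needs to understand how resources relate to each other.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical description of our three resources and what each one needs.
resources = {
    "network": {"depends_on": []},
    "database": {"depends_on": ["network"]},
    "server": {"depends_on": ["network", "database"]},
}


def creation_order(resources: dict) -> list:
    """Return resource names ordered so that dependencies come first."""
    graph = {name: set(spec["depends_on"]) for name, spec in resources.items()}
    return list(TopologicalSorter(graph).static_order())


order = creation_order(resources)
print(order)
```

The network has to exist before anything can be attached to it, so it comes first; the server comes last because it needs both the network and the database.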
Service Orchestration (Nomad)
Now that we have the infrastructure backbone set up with our Terraform, it's time to run the awesome applications our developers have been building, so our e-commerce store can open for business.
Traditionally, this process would involve various manual commands: copy application executable files to servers, set up environment variables, and run these files. Similar to Terraform, Nomad automates these laborious, tedious tasks away by allowing programmers to describe the desired “end state” in a static “.nomad” file.
People commonly find overlap between Terraform and Nomad. In fact, some of Nomad's duties, especially spinning up applications, can be done by Terraform as well. However, there are two key differences:
- Terraform works best with lower-level components, e.g. the resources that applications run on like storage and servers. Nomad mainly deals with applications and services.
- Nomad itself is a service that runs in the environment (same for Consul and Vault as we will discuss later). It’s always present throughout an application’s lifecycle (aka, online). Terraform's mission is finished as soon as the infrastructure resources are set up (aka, offline).
Thus, when our website traffic and orders spike on Black Friday, Nomad is there to intervene and increase resources for different applications to handle that traffic. It can also kill and restart unhealthy applications if they go rogue. (See illustration below)
Consider Nomad as a mediator that’s always there to take care of services -- an orchestration provider. Another commonly used orchestration tool is Kubernetes, which was originally created by Google and is very popular in the industry. Compared to Kubernetes, Nomad is easier to set up and has a gentler learning curve. It also plays well with other HashiCorp products out of the box to form a “stack”.
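The "always-on" behavior of an orchestrator can be sketched as a reconciliation step that runs over and over. This is a simplified illustration in Python, not Nomad's scheduler; the `reconcile` function, the instance dictionaries, and the `web-N` naming are all hypothetical.

```python
import itertools


def reconcile(desired_count: int, instances: list, new_ids) -> list:
    """Run one reconciliation pass and return the resulting instance list.

    Unhealthy instances are dropped, then fresh ones are started until the
    number of running instances matches desired_count.
    """
    running = [inst for inst in instances if inst["healthy"]][:desired_count]
    while len(running) < desired_count:
        running.append({"id": next(new_ids), "healthy": True})
    return running


# Hypothetical ID generator for newly started instances.
ids = (f"web-{n}" for n in itertools.count(3))

instances = [
    {"id": "web-0", "healthy": True},
    {"id": "web-1", "healthy": False},  # this one has gone rogue
    {"id": "web-2", "healthy": True},
]

# Normal day: keep three copies running, replacing the unhealthy one.
normal = reconcile(3, instances, ids)
# Black Friday: the desired count goes up, so more copies are started.
black_friday = reconcile(5, normal, ids)

print([inst["id"] for inst in normal])
print([inst["id"] for inst in black_friday])
```

The point is that nobody restarts or scales anything by hand: the desired count is the only thing that changes, and the loop closes the gap.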
Service Discovery (Consul)
The concept of "service mesh" has been around for a while now, yet it is still vague, confusing, and ill-defined. Consul provides an easy-to-implement example.
As our ecommerce business grows, we can now afford to hire our own data analytics team and fraud detection team. Each team brings the services they need to do their job, which live on Nomad. Now we have a new problem: the core ecommerce web application needs to talk to the fraud detection service, then later write to a data ingestion service for analytics. The IP addresses and ports of these two services may change. Every time they do, we have to change the hard-coded fields in the web service and redeploy it, which is a pain in the butt and error-prone.
Consider Consul as an always-on post office situated in the middle. It documents where each service lives (its IP address and port), keeps track of changes, and makes sure packages are delivered to the right address. In addition, it helps determine who can send what packages to which service (known as service authorization, because you can’t just send whatever you want to whoever you want), checks in on the receiving service’s liveness (health check), and prevents overloading a recipient by distributing mail across similar services (load balancing). All these are user-defined policies stored in a descriptive file that the Consul service executes. With Consul or a similar “service mesh” solution like Istio, Envoy or Linkerd, you no longer need to keep track of hard-coded information on every single service and redeploy every time there is a change!
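The post office idea can be sketched with a tiny in-memory registry. This is only an illustration of service discovery, health checks, and round-robin load balancing; the `ServiceRegistry` class, its methods, and the addresses are hypothetical and do not reflect Consul's actual API.

```python
import itertools


class ServiceRegistry:
    """Toy registry: maps a service name to its known addresses."""

    def __init__(self):
        self._instances = {}  # service name -> {address: healthy flag}
        self._counters = {}   # service name -> round-robin counter

    def register(self, service: str, address: str) -> None:
        self._instances.setdefault(service, {})[address] = True
        self._counters.setdefault(service, itertools.count())

    def set_health(self, service: str, address: str, healthy: bool) -> None:
        self._instances[service][address] = healthy

    def resolve(self, service: str) -> str:
        """Return one healthy address, rotating through the candidates."""
        healthy = [a for a, ok in self._instances.get(service, {}).items() if ok]
        if not healthy:
            raise LookupError(f"no healthy instance of {service!r}")
        return healthy[next(self._counters[service]) % len(healthy)]


registry = ServiceRegistry()
registry.register("fraud-detection", "10.0.0.5:9000")
registry.register("fraud-detection", "10.0.0.6:9000")

# The web application asks by name instead of hard-coding an address.
first = registry.resolve("fraud-detection")
second = registry.resolve("fraud-detection")

# A failed health check takes one instance out of rotation.
registry.set_health("fraud-detection", "10.0.0.5:9000", False)
third = registry.resolve("fraud-detection")

print(first, second, third)
```

When an address changes, only the registry entry is updated; the web application keeps asking for "fraud-detection" and never needs to be redeployed.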
Credential Management (Vault)
Information Security (InfoSec) is a topic of major importance today. Without good security practices enforced, a careless developer may hard-code a database password in the application and risk a major leak. Vault is a service that centrally manages all credentials in the HashiCorp Stack.
When an application needs a set of credentials to access some resource, it talks to Vault to acquire them, instead of looking at local files or variables in the code. Similarly, all credentials should be written to Vault, whether they are created during provisioning or at runtime. This helps guarantee that few developers can easily grab credentials (or secrets) and abuse them (intentionally or not).
In addition, Vault makes it easy to rotate credentials, something a mature development team does every now and then.
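A central credential store with access control and rotation can be sketched as follows. This is a toy in-memory model, not Vault's API; the `SecretStore` class, the `db/orders` path, and the client names are hypothetical.

```python
import secrets


class SecretStore:
    """Toy central store: applications read secrets at runtime by path."""

    def __init__(self):
        self._values = {}   # path -> secret value
        self._readers = {}  # path -> set of clients allowed to read it

    def write(self, path: str, value: str, readers: set) -> None:
        self._values[path] = value
        self._readers[path] = set(readers)

    def read(self, path: str, client: str) -> str:
        if client not in self._readers.get(path, set()):
            raise PermissionError(f"{client!r} may not read {path!r}")
        return self._values[path]

    def rotate(self, path: str) -> None:
        """Replace the secret with a freshly generated random value."""
        self._values[path] = secrets.token_urlsafe(16)


store = SecretStore()
store.write("db/orders", "initial-password", readers={"web-app"})

# The application fetches the credential when it needs it.
before = store.read("db/orders", "web-app")

# Rotation happens in one place; nothing is hard-coded in the application.
store.rotate("db/orders")
after = store.read("db/orders", "web-app")

print(before != after)
```

Because the application always asks the store at runtime, a rotated password takes effect without a code change, and a client that is not on the reader list is refused.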
Behind HashiCorp's Success
IaC (Infrastructure as Code) is an irreversible industry trend, which has propelled HashiCorp to prominence in recent years.
But just like any success story, HashiCorp’s traction took many years of building, iterating, and its fair share of false starts. Being an open source company did not help either in its early days, when investors frequently questioned its profitability. To make things harder, when HashiCorp started in 2012, the DevOps concept was hardly the promising investment category it is now.
From this excellent interview with Mitchell, we can learn that there were two turning points that pulled HashiCorp out from the stale growth pit:
- Getting the right balance between bundling and unbundling
Before 2015, HashiCorp had actually built most of the pieces already -- Terraform, Consul, Vagrant, Packer, etc. -- but decided to bundle them in a massive offering called "Atlas". One could not buy or use, say, Terraform alone. It was an all-or-nothing offering. The product messaging was also vague: Atlas was trying to please both individual developers and Fortune 2000 companies, even though the two audiences have distinct needs and purchasing power.
After a horrible Friday board meeting, Mitchell and Armon decided to pivot: split "Atlas" up into atomic components and focus the company on winning the large enterprises. This move became what HashiCorp is today. While it seemed natural and obvious in hindsight, especially given the company's current success, the move was a bold and risky one back then.
- Docker taking off
Historically, HashiCorp has had a contentious relationship with Docker, with Docker's container technology taking over Vagrant's market share. However, as Docker got much wider adoption and became a unicorn in 2015, it also sparked more hype for DevOps tools and the notion of Infrastructure as Code. In a way, Docker’s popularity created the wave that HashiCorp is riding on right now! There is no permanent friend or foe in the complex and fast-evolving space of infrastructure technology.
There are two other factors that I believe catalyzed HashiCorp to its $5 Billion valuation:
- Growing infrastructure with no infrastructure talent:
Back in the day, when businesses just needed a dozen machines in a server room, things were much easier to control. In 2021, it is routine for companies to have thousands of machines, either on their own premises or in the cloud.
The infrastructure technology domain has evolved too quickly for engineering talent and training to meet the demand organically. Low-code tools like the ones HashiCorp provides, along with extensive tutorials and documentation, can essentially turn any developer into an infrastructure engineer. Over the years, its products have also matured in depth and complexity, enough to lend a helping hand to even the most hardcore infrastructure engineers.
- “Shoganai” and pragmatism
Mitchell’s favorite slang is "Shoganai" (しょうがない, a Japanese saying that means “it cannot be helped”, or “it has to be done”). This mindset bears similarity to Zhang Yiming’s last speech on “ordinary mind” that we published a few weeks ago. At the core, both founders practice “the power of now”, not the past nor the future.
HashiCorp’s DNA is encoded with pragmatism: Containers are taking over virtualization? Sure, let's work with containers. Different cloud vendors have drastically different APIs and backends? Bite the bullet and build better abstractions, so customers won't have to. Many large enterprises are still on-prem? Then our software must work with on-prem, so they are not left behind.
HashiCorp is also a very low-profile company. Many engineers could be using its open source products without knowing who’s supporting their open source development. We can find similar traits in some of the other open source success stories, like Mongo and Elastic. As a developer myself, I have to admit -- polishing a product that pleases developers is extremely hard. Developers are an opinionated crowd, and building for them requires immense patience and humility.
What’s Next for HashiCorp
In the short term, HashiCorp’s growth will likely continue, as the wave of cloud computing and Infrastructure-as-Code lifts all boats. Another DevOps darling, GitLab, released its S-1 filing last week; that public listing, if successful, may accelerate HashiCorp’s own IPO!
But becoming a public company is just one of many milestones in a company’s journey to become an enduring brand and institution. Just in the small realm of public open source companies alone, we’ve seen both successes and failures. Red Hat, the OG of open source companies, adapted to the fast-evolving landscape of cloud infrastructure with OpenShift, which resulted in its $34 Billion acquisition by IBM -- still the largest deal to date for an infrastructure technology company. Cloudera (and Hortonworks), on the other hand, could not survive and was taken private by KKR and other private equity firms.
Will HashiCorp be more Red Hat or Cloudera or something entirely different? We don't know, but we do know that its pragmatism, persistence, and "Shoganai" spirit will serve its future well.
The pianola was popular during the late 19th century and early 20th century, but it was replaced by phonographs, then recorders, and then MIDIs. While the original insight of automation-to-commoditization of music continues to evolve, we only see pianolas in museums and TV shows now. Similarly, Infrastructure-as-Code may be popular now, but the way cloud infrastructure gets used and provisioned by engineers will no doubt change 10 or 15 years from today. In fact, that change is already happening with new low-code and declarative products, pushing the boundary of simplicity and powering new paradigms, like the so-called “serverless” and Lambda architecture.
Can HashiCorp adapt? With the success of Nomad, which is working its way up the stack to automate workflows closer to the application layer, it looks like it can. With Mitchell recently deciding to step down from the Board and his other managerial duties to become an individual contributor again, it looks like the DNA of his (half) namesake company will persist for a long time.
Pragmatism may be boring, but it’s enduring.