Every year GitHub publishes a “State of the Octoverse” report to share insights on the state of open source software and the developer ecosystem at large. For each of the last two years, I’ve written a post to share my own thoughts, takeaways, and analysis of the report’s findings.

When I wrote about the 2020 edition, GitHub reported having 56 million developers and projected reaching the 100 million milestone by 2025. Back then, I wrote about where these next 50 million developers would come from based on this projection. But just two years later, this year’s Octoverse report says GitHub has 94 million developers on the platform. Looks like the “next 50 million developers” are already here!

Let’s look at some of the revelations from this year’s Octoverse report that I find noteworthy and worth pondering.

(Disclaimer: I have not participated, and do not participate, in the research and production of the Octoverse report. The views expressed in this post, and elsewhere on Interconnected, are all personal opinions. They do not represent GitHub, which happens to be my employer at the time of this writing.)

The Rise of India

The most eye-popping revelation and prediction for me is that India added the largest raw number of developers over the last year, and if this trend continues, India “will match the current United States GitHub developer population by 2025.”

Source: https://octoverse.github.com/2022/global-tech-talent

Open source, as a software development model, has always been a globally distributed phenomenon. Even though the concept took root in the corridors of MIT in the 1980s, the most widely adopted open source technology to date, Linux, started in the classrooms of the University of Helsinki in Finland. So any country with a large or reasonably educated population can become a leader in open source; it is not supposed to be a US-centric movement. Even in the very first issue of the Octoverse report, published 10 years ago in 2012, GitHub shared that only 28% of its traffic came from the US.

However, there is one often underappreciated element about India in the worldwide tech ecosystem that is uniquely contributing to its rise: global system integrators (GSIs).

GSIs are huge consulting and IT services companies that resell, implement, and deliver services for technology products. They are the “unsexy” part of the industry that enables massive trends (and jargon) like “digital transformation” or “cloud adoption”. These companies don’t come up with new products per se, but they have massive technical workforces, whose job is to learn how to implement whichever product a client wants into its existing systems. Thus, they are called “system integrators”, or SIs.

There is literally an SI for everything. Want to integrate a CRM into your system for your sales team so they don’t use spreadsheets anymore? There are SIs for that. Want to implement HR or accounting software so your back office operations move away from paper documents? There are SIs for that. Want to lift and shift your on-premises IT infrastructure to a public cloud? There are SIs for that. Because the companies that these SIs typically work with are Global 2000 enterprises, their reach and influence are also global. Thus, G + SIs. And the country that supplies the largest share of the technical workforce for the top GSIs is India!

Among the top five GSIs:

  1. Accenture
  2. Infosys
  3. Tata Consultancy Services
  4. HCL Technologies
  5. Cognizant

Three are headquartered in India: Infosys, Tata, and HCL. For Accenture, ~250,000 out of its roughly 700,000 employees are in India (about 35%). For Cognizant, ~240,000 out of its roughly 330,000 employees are in India (a whopping 72%). This is, of course, not to say that Indian developers only work at GSIs. The country’s growing domestic tech startup ecosystem and large multinational companies setting up engineering hubs in India both continue to be trends as well. But India’s place among all the GSIs is unique.

The second-order effect of India dominating the GSIs’ workforce is not only that there will continue to be demand for the country to train more engineers and developers, but that these developers are learning to build, integrate, and provide services for some of the most complex technical challenges facing big companies. This means that Indian developers are higher on the value chain and their skills are more valuable than those of developers in other IT outsourcing destinations. And as more big companies, both tech and non-tech, use more open source software, these developers will continue to gravitate towards open source, which shows up in developer growth counts on GitHub, where most of the open source activity in the world takes place.

Thus, the rise of India’s developers is both quantitatively and qualitatively noteworthy. And GSIs are the unsung heroes.

IAC and HCL

The one technical trend from the 2022 Octoverse report that caught my eye is the increasing popularity of IAC (infrastructure as code) and a new language, HCL (Hashicorp Configuration Language). This trend is neatly aligned with India’s rise fueled by the GSIs.

Here’s why.

Without getting overly technical, IAC is a paradigm that is designed to make provisioning and operating IT infrastructure resources (e.g. servers, virtual machines, containers, databases) easier, more scalable, and less error-prone for developers. The idea has been around for at least 15 years, so it is not new. But as more IT infrastructure moves to the cloud, as more cloud options become available, and as more services and applications are built on top of different clouds, the “IAC way” is becoming the preferred approach. That is because IAC is more consistent and templatized, no matter which cloud system you need to work with. That is also what many engineers who work at large GSIs do on a daily basis for their clients. Thus, two seemingly unconnected trends, the increase in IAC’s popularity and the increase in Indian developers working at GSIs, both of which show up in GitHub’s report, are actually two sides of the same coin.
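
To make the “consistent and templatized” point a bit more concrete, here is a toy sketch in Python. It is not how any real IAC tool is implemented. It only assumes that the core idea is declarative: you describe the resources you want as data, and a tool works out what to create, change, or delete. The resource names, their fields, and the `plan` function are all hypothetical, made up for illustration.

```python
# Toy illustration of the declarative idea behind IAC.
# All resource names and fields below are hypothetical.


def plan(current: dict, desired: dict) -> dict:
    """Compare what exists with what is described, and return
    the actions needed to reconcile the two."""
    return {
        "create": sorted(name for name in desired if name not in current),
        "update": sorted(
            name
            for name in desired
            if name in current and current[name] != desired[name]
        ),
        "delete": sorted(name for name in current if name not in desired),
    }


# What is running today (hypothetical).
current = {
    "web-server": {"size": "small", "count": 2},
    "old-cache": {"size": "small", "count": 1},
}

# What the checked-in description says should exist (hypothetical).
desired = {
    "web-server": {"size": "medium", "count": 2},
    "database": {"size": "large", "count": 1},
}

actions = plan(current, desired)
print(actions)
```

The description of the desired state lives in a file like any other code, so it can be reviewed, versioned, and reused across clients and clouds, which is roughly why this style fits the repetitive integration work described above.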

This “coin” is also deeply connected to HCL’s fast growth, which, admittedly, caught me a bit by surprise. Hashicorp is one of the leading companies in the open source IAC space. It is the creator of multiple popular open source projects – Terraform, Vault, Consul, Nomad, Vagrant – all of which address different dimensions of provisioning, operating, and securing IT infrastructure resources on different cloud platforms or in on-premises settings. HCL is Hashicorp’s branded IAC configuration language (the “H”). It initially only worked for Terraform, but has since extended itself to be compatible with other popular formats like JSON that are used often in the IAC context. This interoperability, combined with IAC’s popularity, likely contributed to HCL’s growth of more than 50% in the last year.

Source: https://octoverse.github.com/2022/top-programming-languages
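
As a rough sketch of what the JSON compatibility mentioned above can mean in practice: Terraform accepts configuration written as JSON (in files ending in `.tf.json`) as an alternative to native HCL syntax, so a configuration can be generated by a program. The Python below builds a document in that nested shape. The `instance_config` helper, the resource names, the image ID, and the instance sizes are all hypothetical and only there for illustration.

```python
import json


def instance_config(name: str, instance_type: str) -> dict:
    """Build one resource definition in a nested layout that mirrors
    Terraform's JSON syntax: resource -> type -> name -> arguments."""
    return {
        "resource": {
            "aws_instance": {
                name: {
                    "ami": "ami-00000000",  # hypothetical placeholder image ID
                    "instance_type": instance_type,
                }
            }
        }
    }


# Same template, different parameters per environment (hypothetical values).
environments = {"staging": "t3.micro", "production": "t3.large"}
configs = {
    env: instance_config(f"web_{env}", size)
    for env, size in environments.items()
}

# Print the staging configuration as it might be written to a .tf.json file.
print(json.dumps(configs["staging"], indent=2))
```

The point is less the specific cloud resource and more that one template, filled in with different parameters, yields a consistent description for every environment.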

Calling HCL a “programming language” can be confusing. Unlike general-purpose programming languages like C, C++, or Go, which you can use to build anything, HCL is a higher-level abstraction that, in its current form, is only used for configuring infrastructure resources – a decidedly narrower use case. For the same reason, people often question whether SQL should be considered a “programming language”, when its use is just for querying relational databases.

I will leave it up to the computer science academics to decide whether HCL qualifies as a “programming language” or not. I find HCL’s growth interesting because it is very hard for a new, “branded” language of any flavor from a company (in this case Hashicorp) to take off among developers. Many companies have tried, because, if successful, their products will be much stickier among developers, which will boost their long-term business prospects. There is a long list of different flavors of SQL and SQL-like languages in the database industry – MySQL (relational database of the same name), CQL (the Cassandra NoSQL database), Cypher (graph database from Neo4j), etc. – which have all tried to achieve that developer stickiness, with varying degrees of success. It is a difficult feat that is rarely achieved at scale. But if this Octoverse report’s findings are a crystal ball into the future, then Hashicorp’s HCL is showing signs of taking off.

Even though Hashicorp’s stock price has been dropping like a brick since its IPO last December, Hashicorp’s many open source projects are popular tools among DevOps engineers, system administrators, and GSIs, many of whom will be using more HCL in their daily work. The convergence of these trends will accrue value to Hashicorp in the long run. (For readers interested in Hashicorp, see our previous deep dive analysis on its history, products, and prospects.)

Understand (and Invest in) Developers

My long-held industry view and investment thesis is that in an increasingly technology-driven world, developers are the most important people (or persona) to understand, because they are the builders and kingmakers of technology. Reports like GitHub’s State of the Octoverse, Puppet’s State of DevOps, and Stack Overflow’s annual developer survey are all good resources for deepening our understanding of the developer persona.

However, while these reports are routinely consumed by VCs (at least the sharp ones) who invest in the infrastructure software space, public market investors and ETFs still lump companies that target developers or a technical audience into a broad basket of “cloud companies”, just because the end products all happen to be in the cloud. It’s like lumping residential, commercial, and data center REITs all into one ETF, just because they are all built on the physical ground of planet Earth!

Among the top holdings of two of the most popular cloud indices – BVP Nasdaq Emerging Cloud Index and WisdomTree Cloud Computing Fund – you will see companies like Toast (for restaurants), Shopify (for retailers), and MongoDB (a database for app developers) all crammed into one big list. It’s intellectually lazy and downright silly.

One silver lining of an otherwise tough year for both public and private markets is that our understanding may be forced to improve by the reality of the market. True cloud infrastructure software companies, most of which target developers on some level, are starting to separate themselves from the rest of the “cloud bucket”. In Battery Ventures’ recent State of OpenCloud report, 8 out of the 10 most valuable software companies are in the cloud infrastructure category, while the other two (Zoominfo and Bill.com) sell to non-technical audiences.

Source: https://www.battery.com/wp-content/uploads/2022/10/Battery-Ventures-OpenCloud-Report__2022.pdf

Now that the next 50 million developers are already here, the time is ripe for a deeper understanding of who developers really are and how to invest in them.

50 Million More Developers in Two Years: Has a New Era Arrived?


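(This Chinese-language edition of the post was translated and adapted by reader Ben Yu, and I made some edits before publishing it. Many thanks to Ben for the contribution!)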

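Every year GitHub publishes a report called “State of the Octoverse”, which mostly covers the state of open source software and the developer ecosystem. For the past two years, I have written a post each year analyzing the report and sharing my personal views.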

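In 2020, GitHub’s report said it had 56 million developers at the time and expected to reach 100 million by 2025. Back then, I made some predictions about where the other 50 million developers would come from. But just two years later, GitHub says in this year’s Octoverse report that the platform has 94 million developers. The era of the next 50 million developers has quietly arrived.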

In this post, let’s look at what in this year’s report is worth noting and thinking about.

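(Disclaimer: I did not participate in the research or writing of the Octoverse report. The views expressed in this post, and elsewhere on the Interconnected blog, are my personal views and have nothing to do with GitHub.)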

The Rise of India

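For me, the biggest surprise is that India had the largest growth in developer numbers over the last year. If this trend continues, the report predicts, India will match the current United States GitHub developer population by 2025.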

Source: https://octoverse.github.com/2022/global-tech-talent

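As a software development model, open source has always had a global character. The concept first took root at MIT in the 1980s, but the most widely adopted open source technology to date, Linux, started in the classrooms of the University of Helsinki in Finland. In fact, any country with a large population, or a reasonably high level of basic education, can become a leader in open source; it does not depend on the United States. Even the first Octoverse report, published 10 years ago in 2012, shared that only 28% of the traffic to GitHub’s website came from the US.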

However, there is one often underestimated factor about India in the global tech ecosystem that has fueled its rise: global system integrators (GSIs).

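GSIs are large consulting and IT services companies that specialize in selling, implementing, and delivering software. This work is the dull, less “sexy” part of the software industry, but it is also an important driver of big trends like enterprise “digital transformation” and “moving to the cloud”. These companies do not build new products themselves, but they tend to have huge teams of developers, whose job is to learn how to integrate the products a client wants with the client’s existing systems. That is why they are called system integrators (SIs).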

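These SIs serve almost every area imaginable. Want the CRM your sales team uses to be integrated with your company’s own systems, so they can stop using Excel? They can handle that. Want HR or accounting software so your office can go paperless? They can do that. Want to move your internal IT infrastructure to a public cloud? They can do that too. Because SIs work closely with the world’s 2,000 largest enterprises, their reach is also global. And the country that supplies the most developers to the top-tier GSIs is India.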

The top five GSIs globally are:

  1. Accenture
  2. Infosys
  3. Tata Consultancy Services
  4. HCL Technologies
  5. Cognizant

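Let’s go through them one by one. Three of these companies are headquartered in India: Infosys, Tata, and HCL. Accenture has about 700,000 employees, of whom about 250,000 are in India (about 35%). Cognizant has about 330,000 employees, of whom 240,000 are in India (as high as 72%). Of course, this is not to say that Indian developers only work at GSIs. India’s growing domestic tech startup ecosystem, and the large multinationals setting up engineering centers in India, are both becoming new trends. But India’s position among all the GSIs is unique.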

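The knock-on effect of India making up such a large share of the GSIs’ workforce is that India needs to train more engineers and developers, and these developers have to keep learning to build, integrate, and deliver services in order to tackle some of the most complex technical challenges facing big companies. This means Indian developers are moving up the value chain, and their skills are more valuable than those of other IT outsourcing firms. As more big companies, tech and non-tech alike, use more open source software, these Indian developers will continue to be drawn to open source. That shows up in GitHub’s developer growth numbers, because most of the world’s open source development happens on GitHub.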

So the growth of India’s developer population is worth watching in terms of both quantity and quality.

IAC and HCL

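One technical trend in the 2022 Octoverse report that caught my attention is the growing popularity of IAC (infrastructure as code, also written IaC) and a new language, HCL (Hashicorp Configuration Language). This trend lines up neatly with India’s GSI-driven rise.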

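Without getting too technical, IAC is a paradigm designed to make operating IT infrastructure resources (such as servers, virtual machines, containers, and databases) easier, more scalable, and less error-prone for developers. The concept has existed for at least 15 years, so it is not new. But as more IT infrastructure moves to the cloud, as more cloud services keep emerging, and as more services and applications are built on top of different clouds, IAC is becoming more and more popular. That is because the way you work with IAC stays consistent no matter which cloud system you need to use. This is also what many engineers at large GSIs do every day for their clients. So two seemingly unrelated trends, the rising popularity of IAC and the growing number of Indian developers working at GSIs, both of which appear in this year’s GitHub report, are actually two sides of the same coin.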

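This “coin” is also closely tied to the rapid growth of the HCL language, which, admittedly, surprised me a little. Hashicorp is one of the leading companies in open source IAC. It is the creator of several popular open source projects (Terraform, Vault, Consul, Nomad, Vagrant), all of which address different aspects of provisioning, operating, and securing IT infrastructure resources on different cloud platforms or in on-premises settings. HCL is Hashicorp’s branded IAC configuration tool (hence the “H”). It initially worked only with Terraform, but was later extended to be compatible with other popular formats, such as JSON, that are often used in the IAC context. This interoperability, combined with IAC’s popularity, likely contributed to HCL’s growth of more than 50% over the last year.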

Source: https://octoverse.github.com/2022/top-programming-languages

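Calling HCL a programming language may confuse some people. After all, it is not like general-purpose programming languages such as C, C++, or Go, which you can use to build anything. HCL is a higher-level abstraction that, in its current form, is only used for configuring infrastructure resources, a decidedly narrower use case. For the same reason, people often question whether SQL should be considered a programming language, given that it is only used for querying relational databases.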

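I don’t much care whether HCL meets the definition of a programming language. I find HCL’s growth worth watching because it is very hard for a new “branded” language of any flavor from a company (in this case Hashicorp) to catch on among developers. Many companies have tried, because if they succeed, their products become stickier among developers, which improves their long-term business prospects. The database industry has many different flavors of SQL: MySQL (the relational database of the same name), CQL (the Cassandra NoSQL database), Cypher (from the Neo4j graph database), and so on. It is a feat that is very hard to achieve at scale. But judging by the data in this Octoverse report, HCL has pulled it off and is taking off.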

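Although Hashicorp’s stock price has been falling since its IPO last December, many of Hashicorp’s open source projects are popular tools among DevOps engineers, system administrators, and GSIs, many of whom will be writing more HCL in their daily work. In the long run, the convergence of these trends will bring value to Hashicorp. (For readers interested in Hashicorp, see the earlier deep dive on its history, products, and prospects.)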

Understand (and Invest in) Developers

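My long-held industry view and investment view is that in a world increasingly driven by technology, developers are the most important persona, because they are the builders and decision-makers of technology. GitHub’s State of the Octoverse, Puppet’s State of DevOps, and Stack Overflow’s annual developer survey are all good resources for deepening our understanding of the developer persona.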

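However, although VCs who invest in infrastructure software read these reports regularly, public market investors and ETFs still lump companies that target developers or a technical audience into a basket of “cloud companies”, just because the end products all happen to be built on the cloud. It is like putting residential, commercial, and data center REITs into one ETF just because they are all built on the ground of planet Earth, which is logically absurd.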

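In the two most popular cloud indices, the BVP Nasdaq Emerging Cloud Index and the WisdomTree Cloud Computing Fund, you will see companies like Toast (restaurants), Shopify (retailers), and MongoDB (a database for software developers) all squeezed into one big list. That is what makes no sense, and it has made no sense for a long time.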

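This has been a tough year for both public and private market investing. One bright spot to come out of a tough year is that the reality of the market may force us to improve our understanding. True cloud infrastructure software companies (most of which, to some degree, have developers as their core audience) are starting to separate themselves from the other “cloud stocks”. In Battery Ventures’ recent State of OpenCloud report, 8 of the 10 most valuable software companies are in the cloud infrastructure category, while the other two (Zoominfo and Bill.com) sell to non-technical users.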

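Now that we have arrived early in a world of nearly 100 million developers, the time is all the more ripe to understand in depth who developers are and how to invest in them.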