Last May, I wrote a three-part series that surveyed the entire open source landscape in China.
Since then, the landscape has changed significantly as “open source” continues to receive top-down attention from the central government. Recently, the Ministry of Industry and Information Technology (MIIT, China’s main regulator of its tech industry) released a new set of development planning guidance. “Open source” is prominently featured; the term was used 27 times.
This guidance specifically called out the establishment of China’s first home-grown open-source foundation -- the OpenAtom Foundation -- as a major accomplishment (for more on the OpenAtom Foundation, see my previous analysis). Looking ahead, the guidance explicitly sets a target of building 2-3 open source communities with “global influence” by 2025.
The MIIT guidance is focused on advancing and modernizing China’s industrial technology infrastructure and supply chain. The infrastructure layer is where open source software tends to excel, starting with Linux, the open source operating system that runs most of the world’s servers. China’s open source community has been growing organically since the early 2000s. The country now sports the world’s second-largest developer community, with a burgeoning, if not crowded, open source ecosystem. Many projects have five-figure GitHub stars, thousands of forks, and hundreds of contributors, but those are mostly vanity metrics. Only projects with strategic technical value have the potential to reach a global developer audience.
With the MIIT expressing both top-level approval and high expectations of open source, which “Made in China” open source projects have the best chance of emerging as both a “local champion” and a “global influencer” by 2025? I share my prediction of six leading candidates -- three Big Tech projects and three independent projects -- and why they could become global projects.
Big Tech Projects
The three Big Tech projects are Baidu’s Apollo (self-driving), Huawei’s OpenHarmony (operating system), and Alibaba’s OpenXuantie (RISC-V based semiconductor design).
Apollo: First launched in 2017, Apollo has emerged as one of the leading self-driving open source projects. It’s at the heart of Baidu’s Apollo Go robotaxi service, which “beta launched” in a neighborhood in Beijing. That neighborhood just happens to be one of the venues of the upcoming Winter Olympics. Even though this Winter Olympics is receiving lots of geopolitical pushback from Western countries -- the US, Canada, Australia, and the UK have all announced diplomatic boycotts of the competition -- it will still be a major product launch moment for China’s self-driving ambition in front of a global audience.
How well Apollo works during the Olympics will indicate whether Baidu’s goal of launching the Apollo Go robotaxi service in 65 cities by 2025 is realistic. But Baidu’s goal aside, it is likely that Apollo, as an open source project with a permissive license (Apache 2.0), has already been copied, reused, and modified to run self-driving vehicles made by other manufacturers. By 2025, Apollo may well emerge as the leading open source alternative to Tesla’s self-driving software stack, which is at the moment totally closed source and proprietary. At first glance, Apollo’s documentation and README on GitHub are all in English and look well-structured and well-written. This is a positive signal that the project team is investing in making Apollo accessible to non-Chinese developers.
If Tesla’s system is the iOS of self-driving, Baidu’s Apollo may become self-driving’s Android.
Huawei’s OpenHarmony: Huawei first started the project in 2012 and intensified its development in 2019, in the face of US sanctions, to reduce its reliance on the Android operating system. The OpenHarmony project also became the “anchor project” of the OpenAtom Foundation, which Huawei is heavily involved in.
While OpenHarmony’s development was mostly a reaction to possible future US sanctions, there is a proactive, strategic element to OpenHarmony potentially becoming an alternative OS to both iOS and Android. For countries like Russia that are also concerned about US sanctions, OpenHarmony could be an appealing option.
Even though the MIIT guidance explicitly mentioned OpenHarmony (and no other project), the project is still relatively young and hosted on Gitee, China’s domestic alternative to GitHub, which has limited appeal to non-Chinese developers. From a strategic technical perspective, OpenHarmony has global influence potential, but the team has a long way to go to appeal to the global developer community.
Alibaba’s OpenXuantie: this project is the youngest of the three Big Tech projects, so it’s my “dark horse” pick. It was only open sourced in October, so OpenXuantie has hardly any community presence or traction, whether in Chinese or English.
Why do I think OpenXuantie has the promise to reach the kind of “global influence” that MIIT is targeting by 2025?
First, it is incubated within Alibaba, which has the strongest open source technical and community talent among all of China’s tech giants. Second, it leverages RISC-V, the open source semiconductor design architecture. (For more on RISC-V, please see my previous writings.) As far as I know, Alibaba still sports the fastest RISC-V based processor. Now it is taking a profound step forward to open source its semiconductor design development via OpenXuantie.
Given the obvious strategic importance of semiconductors to China, as well as to countries like India, Pakistan, Japan, Korea, and many European countries, open source projects that can take RISC-V to the next level will surely gain global influence. Of course, the future of the semiconductor industry is anything but certain, with the US Federal Trade Commission challenging Nvidia’s acquisition of Arm on antitrust grounds. But RISC-V will be a force no matter what happens to the Nvidia-Arm deal, and OpenXuantie may play a big role in RISC-V’s continued rise.
Independent Projects
The three independent projects are TiDB / TiKV (distributed database), OpenResty (API gateway), and OceanBase (distributed database).
TiDB / TiKV: started in 2015, TiDB is one of the most active open source distributed databases that is not incubated by any tech giant. TiKV, the database’s key-value storage layer, was donated to the Cloud Native Computing Foundation in 2018 as a separate project. Thus, this layer technically belongs to the Foundation, not PingCAP, the startup that originally created both TiDB and TiKV in tandem.
Disclaimer: I worked for two years at PingCAP to launch its global expansion operation. I don’t have any knowledge of the company’s latest planning or roadmap. This analysis is purely based on my outside observation of these open source projects.
Because databases are a crucial layer of any infrastructure stack, TiDB / TiKV very much fits the MIIT’s strategic focus on China’s industrial technology infrastructure and supply chain. And because of the projects’ popularity among database engineers, many of whom are open source enthusiasts, the core technology has advanced quickly. In fact, TiDB / TiKV has been considered a good-enough home-grown alternative to the likes of Oracle and IBM since 2018. With users like Square and Shopee, you can argue that TiDB / TiKV already has some global influence.
OpenResty: started in 2011, OpenResty is one of the oldest and most widely used open source projects that originated in China. This project is likely how many Western developers first came into contact with “Made in China” open source technology.
Similar to the database layer, the API gateway layer is also crucial to all flavors of infrastructure technology, especially ones that are cloud-native. In simple terms, an API gateway provides load balancing and routing of API calls and web traffic dynamically, so your system does not crash when too many requests come in, say on an online shopping holiday like Singles Day. It’s no surprise then that OpenResty was first created inside Yahoo China, then followed its creator to Taobao, then to Cloudflare. The project is now stewarded by its own foundation and commercialized by a startup, also called OpenResty.
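To make the “load balancing and routing” idea above a bit more concrete, here is a toy sketch in Python. It is not how OpenResty is implemented (OpenResty builds on Nginx and LuaJIT); the class name, the route prefixes, and the backend names are all hypothetical and chosen only for illustration. It shows the two basic jobs described above: pick a pool of servers based on the request path, then spread requests across that pool in round-robin order.

```python
from itertools import cycle


class ToyGateway:
    """Toy API gateway (hypothetical): maps a path prefix to a pool of
    backends and spreads requests across the pool in round-robin order."""

    def __init__(self, routes):
        # routes: {"/prefix": ["backend-a", "backend-b", ...]}
        # Longer prefixes are checked first, so "/cart/checkout" would
        # win over "/cart" if both were registered.
        self._pools = {
            prefix: cycle(backends)
            for prefix, backends in sorted(
                routes.items(), key=lambda item: len(item[0]), reverse=True
            )
        }

    def pick_backend(self, path):
        """Return the backend that should handle this request path."""
        for prefix, pool in self._pools.items():
            if path.startswith(prefix):
                return next(pool)  # round-robin: take the next server in the pool
        raise LookupError(f"no route for {path}")


# Hypothetical setup: two servers share cart traffic, one handles search.
gateway = ToyGateway({"/cart": ["cart-1", "cart-2"], "/search": ["search-1"]})
for _ in range(3):
    print(gateway.pick_backend("/cart/items"))
```

A real gateway also has to handle health checks, retries, rate limiting, and much more; the point here is only that no single server behind the gateway has to absorb a traffic spike alone.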
Given its long history and wide adoption, including users like Target and Lyft, OpenResty also already has “global influence”. It will be interesting to see how big the project becomes by 2025.
OceanBase: it may be a bit of a stretch to call OceanBase an independent open source project, because it was neither independent nor open source until very recently. First incubated inside Ant Financial as the primary transactional database for Alipay, Taobao, and then many other products in the vast Alibaba ecosystem, OceanBase was spun out in mid-2020 as an independent company. However, the company is still mostly owned by Ant. OceanBase’s open source history is even shorter than its independent corporate history; its core codebase was open sourced in mid-2021, only a few months before my writing this post.
However, OceanBase deserves a mention because it is literally the fastest database in the world, at least as measured by a well-known industry benchmark called TPC-C. I won’t go into the technical details of what this benchmark measures. What’s important to note is that OceanBase not only took first place on the TPC-C benchmark, it was also twice as fast as the second-place Oracle product.
Given its short open source history, it remains to be seen how much “global influence” OceanBase can accrue over the next four years. Its core technology is unquestionably top-notch. But technology alone is by no means sufficient for any open source project to reach global prominence.
Every one of these six projects is worthy of its own deep dive analysis; this overview post barely scratches the surface of their respective technology and strategic value. While we won’t know for four more years which open source projects will receive the MIIT’s “global influence” badge of honor, open source itself is clearly a force that the Chinese government has both recognized and (mostly) embraced. Thus, understanding how open source works is table stakes for anyone working in technology, foreign policy, and especially the intersection of both.
中国的开源世界:未来四年
去年五月,我写了一组三篇系列文章,系统地介绍了中国的开源生态。
一年多过去后,随着 "开源" 继续得到中央政府自上而下的关注,整个生态也在变化。最近,工信部发布了一套新的发展规划。在规划文件里,“开源”极为突出,此词共被提到了27次。
该规划特别指出中国第一个本土的开源基金会 —— 开放原子开源基金会 —— 的建立是一项重大成就(关于开放原子开源基金会,请读我对它之前写的分析)。展望未来,该规划明确设定了2025年要建立2-3个具有 "国际影响力" 的开源社区的目标。
工信部的政策重点是推进中国的“产业基础高级化、产业链现代化”。基础科技层是开源软件最擅长的地方,以Linux带头,它是运行世界上大多数服务器的开源操作系统。自21世纪初以来,中国的开源社区一直在自然增长。中国现在拥有世界上第二大的开发者社区,是一个蓬勃发展、甚至已经有些拥挤的开源生态。许多项目在GitHub上已有上万的star,数千个分叉,数百个贡献者。但这些大多是虚荣指标,只有具有战略技术价值的项目才有可能接触到全球的开发者受众。
随着工信部对开源表达了最高级别的认可和高度的期望,哪些 "中国制造" 的开源项目最有可能在2025年既成为 "本土佼佼者 ",又达到 "国际影响力"呢?在本文中我分享一下个人选的六个候选项目 —— 三个大厂孵化的,三个独立项目 —— 以及为什么它们可能成为有国际影响力的项目。
大厂项目
这三个大厂项目是百度的Apollo(无人驾驶),华为的OpenHarmony(操作系统)以及阿里的Open玄铁(基于RISC-V的半导体设计)。
百度的Apollo:Apollo于2017年首次推出,已成为领先的无人驾驶开源项目之一。它是百度的Apollo Go无人驾驶出租车服务的核心,该服务在北京的一个小区已经初步上线,而这个小区恰好是即将举行的冬奥会的场地之一。尽管本届冬奥会受到许多西方国家的抵制 —— 美国、加拿大、澳大利亚和英国都宣布对比赛进行外交抵制 —— 但它仍将是中国在全球观众面前展示无人驾驶科技产品的重要时刻。
Apollo在冬奥会期间的表现如何,将证明百度准备在2025年前在65个城市推出Apollo Go无人出租车服务的目标是否现实。但撇开百度的目标不谈,Apollo作为一项用了开放许可(Apache 2.0)的开源项目,很可能已经被复制、修改和使用到其他厂商的无人车里了。到2025年时,Apollo很可能成为特斯拉自驾软件栈的领先开源替代方案。特斯拉的自驾软件栈目前是完全专有和闭源的。乍一看,Apollo的文档和GitHub上的README都是英文的,看起来结构清晰,写得很好。这是一个好信号,表明项目团队正在努力让Apollo面向海外的开发者。
如果特斯拉的系统是无人驾驶的iOS,那百度的Apollo很可能会成为无人驾驶的安卓。
华为的OpenHarmony:于2012年启动的项目,由于美国的制裁,华为在2019年加强了对OpenHarmony的开发、投资,以减少对安卓操作系统的依赖。OpenHarmony项目也成为华为大力支持开放原子开源基金会而捐赠的 "标杆项目"。
虽然OpenHarmony的发展主要是对未来可能的美国制裁的反应,但OpenHarmony有可能成为iOS和安卓的替代操作系统,这其中有积极的战略因素。对于像俄罗斯这样同样担心美国制裁的国家,OpenHarmony可能是一个有吸引力的选择。
尽管工信部的规划文件点名提到了OpenHarmony(没有给任何其他项目这种待遇),但该项目仍然相对年轻,并且托管在Gitee上。Gitee是中国国内GitHub的替代品,对海外开发者的吸引力有限。从战略技术角度来看,OpenHarmony具有达到国际影响力的潜力,但该团队在吸引并打造一个全球开发者社区上还有很长的路要走。
阿里的Open玄铁:这个项目是我挑的三个大厂孵化项目中最年轻的一个,算是我的 "黑马"选择。它在今年10月份刚刚开源,目前几乎没有任何社区存在感或牵引力,无论是中文还是英文社群。
那为什么我还认为Open玄铁有希望在2025年达到工信部所期望的 "全球影响力"呢?
首先,它是阿里内部孵化的。阿里在中国所有的科技巨头中拥有最强大的开源技术和社区人才。其次,它利用了RISC-V这一开源的半导体设计架构(关于RISC-V的更多信息,请参考我以前的文章)。据我所知,目前行业里最快的基于RISC-V的处理器仍是阿里开发的。现在以Open玄铁的方式进一步把其半导体设计开发开源出来,意义深远。
鉴于半导体对中国以及印度、巴基斯坦、日本、韩国和许多欧洲国家的明显战略重要性,能够将RISC-V提升到新水平的开源项目必将获得国际影响力。当然,半导体行业的未来并不确定,美国联邦贸易委员会正在以反垄断的理由挑战Nvidia对Arm的收购。但是,无论Nvidia-Arm的交易最终有什么结果,RISC-V都将是一股强大的力量,而Open玄铁可能会在RISC-V的持续崛起中发挥重大作用。
独立项目
这三个独立项目是TiDB / TiKV(分布式数据库)、OpenResty(API网关)和OceanBase(分布式数据库)。
TiDB / TiKV:TiDB始于2015年,是最活跃的开源分布式数据库之一,完全独立起家,而不是大厂孵化的成果。TiKV是数据库的存储层,在2018年作为一个独立项目捐给了云原生计算基金会。因此,这一层技术严格上说属于基金会,而不是最初打造TiDB和TiKV的创业公司PingCAP。
声明:我在PingCAP工作过约两年,启动其全球扩张业务。我对该公司的最新规划或路线图没有任何内部信息,此分析纯粹是基于我对此开源项目从外部的观察。
因为数据库是任何基础设施堆栈的关键层,TiDB/TiKV非常符合工信部对中国产业基础高级化、产业链现代化的政策目标。由于这些项目在数据库工程师中很受欢迎,其中许多人都是开源爱好者,因此核心技术进展很快。自2018年以来,就有报道提出TiDB / TiKV已经被认为是足以替代甲骨文和IBM等公司的本土产品。目前TiDB / TiKV已有Square和Shopee这样的用户,可以说已经具有一定的国际影响力了。
OpenResty: 始于2011年,该项目是源于中国最老和最被广泛使用的开源项目之一。这个项目可能是许多西方开发者第一次接触到 "中国制造" 的开源技术。
与数据库层类似,API网关层对于所有类型的基础设施科技,尤其是云原生技术,至关重要。简单地说,API网关提供负载均衡以及API调用和网络流量的动态路由,帮助处理网络请求和流量洪峰,避免系统在双十一这类网购节期间崩溃。因此,OpenResty首先在雅虎中国内部孵化,然后跟随其创建者去了淘宝,又去了Cloudflare,这都不奇怪。该项目现在由自己的基金会管理,并由一家也叫OpenResty的创业公司进行商业化。
鉴于其多年历史和广泛应用,包括Target和Lyft这种用户,OpenResty也已经在某种程度上具有 "国际影响力" 了。我们可以看看到2025年时这个项目变得有多大,这样的观察会很有意思。
OceanBase: 说OceanBase是一个独立的开源项目有点牵强,因为直到最近它既不独立也不开源。OceanBase最初是在蚂蚁金服内部孵化的,作为阿里庞大的生态系统中支付宝、淘宝和其他许多产品的主要交易数据库。2020年中期被剥离出来,成为一家独立公司,但大股东仍是蚂蚁。OceanBase的开源历史比它的独立公司历史还要短:核心代码库是在2021年中期开源的,离我写这篇文章只有短短几个月时间。
然而,OceanBase仍然值得一提,因为它实打实是世界上最快的数据库:至少根据业界著名的基准TPC-C的测量跑分,它是最快的。我不在这里详细介绍这个基准测量的技术细节了,但值得记住的是,根据TPC-C测试,OceanBase不仅取得了第一名,速度还是第二名Oracle产品的两倍。
鉴于其短暂的开源历史,OceanBase在未来四年内能积累多少 "国际影响力" 还有待观察。它的核心技术无疑是一流的,但是单靠技术去影响全球的开发者,对任何开源项目都是远远不够的。
这六个项目中的每一个都值得深入分析,这篇综述文章仅仅触及了它们各自技术和战略价值的皮毛。虽然还要等四年才能知道哪些开源项目将获得工信部的 "国际影响力" 勋章,但开源显然是一股中国政府已经认识到并(大部分)认可的力量。因此,对于任何从事科技工作、外交政策工作,尤其是两者交叉点的工作的人士来说,了解开源项目如何运作和成长是必须的。