Hype or not, GPT-3, OpenAI’s newest natural language AI model, has been percolating in the back of my mind ever since it was announced two months ago. This post will be some open musings on what I think its implications are at the intersection of tech, business, and geopolitics.
Not to get too philosophical, but I see GPT-3 as a beautiful abstraction of the relationship and tradeoffs between time and money. (Don’t worry, this isn’t the machine talking. This is still written by me, the human author, not GPT-3.)
GPT-3 in a Nutshell
To quickly summarize what GPT-3 is in plain language: it’s the 3rd generation of OpenAI’s natural language processing model, the previous generation being GPT-2, released in early 2019. GPT-3 has 175 billion parameters compared to GPT-2’s 1.5 billion parameters. In the world of Deep Learning AI, if a model has more parameters, it’s (literally) bigger, more complex, and usually produces better results. In this case, GPT-3 is more than 100 times bigger than GPT-2. GPT-3 is also trained on a much larger dataset -- almost a trillion words from the Common Crawl dataset, which crawls the entire Internet and makes the data publicly available for research. Again, in Deep Learning AI, more training data usually produces a better model, thus better results from that model. Presumably, there will be GPT-4, 5, 6, etc. that will be more powerful than GPT-3.
(There are of course many more details to GPT-3’s underlying technology. Since Interconnected caters to a wide range of readers, I won’t belabor all the details here. I’d encourage readers who are technical to read the GPT-3 paper, which lays out all aspects of the model, including algorithm, architecture, training process, evaluation, limitations, and ethical implications.)
GPT-3 is also not open sourced. Instead, the model’s abilities are exposed through an API layer that can be controlled. OpenAI’s justification is to limit malicious use of the model; if someone is abusing GPT-3 for nefarious purposes, OpenAI can turn off that user’s access to the API. Given just the limited examples of “magical” things that hobbyists have produced using GPT-3 -- from business writing and poetry, to designing a website and other forms of simple coding -- I think this approach is well-justified, even though it goes against my default position that always prefers open source.
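The access model OpenAI describes amounts to a thin gatekeeping layer in front of the model: the weights never leave OpenAI’s servers, and every request passes through a checkpoint where access can be revoked. Here is a toy Python sketch of that idea; all names (`REVOKED_KEYS`, `generate`, the echoed output) are hypothetical illustrations, not OpenAI’s actual API:

```python
# Toy sketch of API-layer gating: the model itself is never shipped;
# every request passes through a checkpoint that can revoke access.
REVOKED_KEYS = set()  # keys shut off for abuse

def generate(api_key: str, prompt: str) -> str:
    """Forward a prompt to the (hypothetical) hosted model."""
    if api_key in REVOKED_KEYS:
        raise PermissionError("access revoked")
    # In reality this would call the hosted GPT-3 model;
    # here we just echo to keep the sketch self-contained.
    return f"[model output for: {prompt}]"

print(generate("key-123", "Write a haiku about clouds"))
REVOKED_KEYS.add("key-123")  # the provider pulls the plug on an abuser
```

The design choice is the point: because the only way to reach the model is through the gate, misuse can be cut off per user without recalling anything, which is impossible once weights are open sourced.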
This approach is not without its critics. When GPT-2 was first released, it was heavily criticized, both externally by other AI researchers and internally by OpenAI employees, for not being open sourced. These critiques also have merit: openness and transparency are important for other researchers to reproduce the technology. After all, good results can always be hard-coded and human-generated on the backend. OpenAI eventually open sourced GPT-2, as an archived public repository, after it felt comfortable that the model wouldn’t be abused. I expect GPT-3 to follow this measured path to eventually becoming open sourced.
Money Can Solve (a lot of) Problems
Money can’t solve all our problems, but in the case of GPT-3, it has solved a lot of problems. In this context, money means cloud computing resources to train the model, where the larger the model and training dataset (both of which are very large for GPT-3), the more expensive it is to compute.
Broadly speaking, the path to developing AGI (artificial general intelligence) falls in one of two camps. One end of the research believes that Deep Learning is not the ultimate technique and more innovation is necessary. The other end believes that the necessary techniques are all available, so the focus should be on how to put them together and scale their training process and production deployment. More succinctly, the former thinks more time is needed to create better techniques, the latter thinks more money is needed to scale the current Deep Learning techniques.
According to this MIT Technology Review profile of OpenAI, the group’s strategy squarely falls in the “more money” camp. That’s why it has a rather bizarre corporate structure -- a hybrid of a nonprofit and a normal VC-backed startup, with investors’ returns capped at a certain multiple: 100x for the earliest round of investors. Microsoft also invested $1 billion in OpenAI with a preferred partnership to use Azure’s cloud infrastructure, as Microsoft works to develop its own AI supercomputer for its cloud offering.
(Aside: this partnership was quite a coup for Microsoft and to the detriment of Google in particular, given that GPT-3’s theoretical foundation, known as the transformer architecture, originated from Google. Google has also been developing its special-purpose Deep Learning hardware, the TPU (tensor processing unit), to differentiate GCP as the most capable AI cloud on the market. Google of course owns DeepMind, the other AGI company, so forging a partnership with OpenAI may even be a conflict of interest. For what it’s worth, Satya Nadella was openly bullish about GPT-3 during Microsoft’s most recent earnings call when talking about Azure. Whether OpenAI will help propel Azure to become the market-leading AI cloud, or Google will maintain its technical advantage, is a competitive dynamic worth watching.)
That $1 billion Microsoft money is rumored to be a split between cash and Azure credits. Regardless of which form the money comes in, OpenAI could use it and is using it. GPT-3 is estimated to have a memory requirement of more than 350GB and a training cost of more than $12 million. That’s likely the cost of a single round of training. Most models go through several rounds of training to achieve good results. Training GPT-3 three or four times could easily sink around $50 million, and that doesn’t include the cost of data cleaning and pipelining that usually goes into preparing a model for training.
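As a back-of-the-envelope check on that $50 million figure, using the estimates cited above (these are third-party estimates, not official OpenAI numbers):

```python
# Rough training-cost estimate based on the figures cited above.
cost_per_run = 12_000_000   # estimated dollars per full training run
runs = 4                    # models typically go through several rounds of training
total = cost_per_run * runs

print(f"${total:,}")  # $48,000,000 -- roughly $50 million,
                      # before data cleaning and pipelining costs
```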
OpenAI has done some interesting research on the connection between the amount of compute used and progress in AI (though perhaps just to support its “more money” approach). Since 2012, the amount of compute used in the largest AI training exercises (e.g. AlphaZero) has been doubling every 3.4 months. The research compares and contrasts this finding with Moore’s Law, which observes the doubling of the number of transistors on a chip every two years. The comparison is useful mainly to show that while Moore’s Law may be reaching its limit, AI computing advancement is far from slowing down. And much of this advancement is focused on parallel computing (i.e. efficient use of multiple chips processing workloads at the same time) and creating more special-purpose AI chips (e.g. TPUs or Tesla’s Full Self-Driving chip, which I’ve written about before), thus it is not directly constrained by Moore’s Law either.
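To see how stark the gap between the two doubling rates is, compare the growth each implies over a single Moore’s Law period of two years (the 3.4-month figure is from OpenAI’s analysis):

```python
# Compute growth over 24 months under each doubling rate.
months = 24
ai_growth = 2 ** (months / 3.4)    # AI training compute: doubles every 3.4 months
moore_growth = 2 ** (months / 24)  # Moore's Law: doubles every 24 months

print(f"AI compute grows ~{ai_growth:.0f}x while transistor density grows {moore_growth:.0f}x")
```

Over the time it takes transistor density to merely double, AI training compute grows by a factor of roughly 133.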
I have a pragmatic attitude towards AI (and most things) and don’t have a strong opinion between the “more time” vs “more money” approach. If it’s working, it’s working.
It looks like GPT-3 is definitely working. Even if OpenAI’s observations and development strategy turn out to be only half true, we will see future generations of GPT advance rapidly within the next five years.
Time and Expertise Are at a Premium
So what is there left for humans to do? Plenty, but differently.
What AI models like GPT do to language understanding, and what state-of-the-art computer vision models do to visual understanding, is continuously commoditize rote memorization (X is Y) and basic pattern matching (when X happens, Y usually happens), driving their value toward zero. Thus, any task in one of these two broad categories that is currently done by human beings will increasingly be done by AI.
We can get a hint of which industries GPT-3 may impact in this way, just by looking at the initial list of beta users:
- Algolia (search engine)
- Koko (mental health and emotional well-being via chat)
- MessageBird (customer service via text or voice)
- Sapling (HR and operations processes)
- Replika (mental health bot that keeps you company)
- Quizlet (online education via flashcard quizzes)
- Casetext (legal research)
- Reddit (public discussion forum)
- Middlebury Institute (higher education, known for having one of the top foreign language study programs in the world)
This list of companies and institutions may seem to cover a wide range of unrelated use cases, but they all have an element of memorization or basic pattern matching to their workflow that can be standardized by a well-functioning AI model to ensure consistency and quality. Furthermore, the GPT-3 paper revealed that while training the model was expensive (prohibitively so for most companies), producing inferences (aka results) from the trained model is cheap and consumes little energy. Thus, from a cost-efficiency perspective, companies will be incentivized to build their services on top of something like GPT.
What all this means, I believe, is that there will be a premium for the one thing that requires more time, but not necessarily money, to acquire: deep domain expertise.
Vitalik Buterin, the creator of the Ethereum blockchain, captured this possibility well in this tweet:
What’s implicit in a “human describes, AI builds, human debugs” future is that the human must have enough expertise built over time to know “how to describe” a problem and “what to debug”. What this also means is that the very notion of “domain expertise” will become more niche and specific.
Knowing how to onboard a new employee may not be that valuable; knowing how to onboard a junior engineer two years out of college, so she can be productive within one month of starting, will be. Knowing how to set up a standard cloud environment may not be that valuable; knowing how to troubleshoot a cyber attack that brought your infrastructure down will be. Knowing how to translate the basic meaning between two different languages may not be that valuable; knowing how to parse the signals and connotations that may get lost in a standard translation will be. And so on.
This future will reward people who can (and will) spend the time and energy to deeply understand something, and do so continuously over time. If I don’t continue to spend time reading, thinking, and understanding the intersections between technology, business, and geopolitics or the relationships between the US and China, regulators and companies, etc., Interconnected will be written by GPT-15 one day.
Of course, that leaves the big, hairy question of what happens to people who can’t (or won’t) spend the time and energy to develop expertise in something. A universal basic income should probably be in the picture to ensure basic sustenance and survival. Though that’s likely not enough to incentivize everyone to develop niche expertise, which requires not just time and survival, but internal motivation.
Ethics: A Humanity-Scale Challenge
One challenge around AI that most certainly would need more time than money to solve is ethics. The challenge isn’t on a company-scale or national-scale, but humanity-scale. While developers of AGI, like OpenAI, are fond of analogizing this eventuality to a “utility” that will just flow seamlessly in people’s lives, like electricity, the analogy falls apart pretty quickly. For one, electricity does not make decisions; it’s a straightforward source of power. AI, even the relatively dumb, non-general kind we use today, makes decisions all the time, both for itself and for the humans it interacts with (see TikTok).
Tackling this challenge requires cross-border, cross-disciplinary, and cross-cultural collaboration. In a recent paper, academics from Cambridge and Beijing specifically advocated for more cross-culture cooperation on AI ethics and governance, while warning against the barrier that frosty relations between the AI superpowers -- US, China, and EU -- could erect.
To OpenAI’s credit, it recognizes that its work has a global impact and its audience is not just Americans and Europeans, but also Chinese. It’s not an accident that the OpenAI charter, the company’s overarching organizing principle and mission statement, has a Chinese version and no other language. Both the benefits and drawbacks of AI, AGI in particular, will be distributed and borderless, whether we like it or not.
As a pragmatist, I do believe for some necessities, it makes sense for countries to be self-sufficient and less connected with other countries. It is precisely the lack of self-sufficiency in medical supplies and devices that partially put the U.S. on its heels during the initial spread of COVID-19. It is precisely the lack of self-sufficiency in semiconductor design and manufacturing that is partially (or completely) hobbling Huawei and related tech sectors in China.
But for humanity-scale challenges like the ethical use of AGI, cross-border and cross-cultural cooperation is a necessity, not a nice-to-have. Money will likely get AGI to a level that will create a litany of difficult ethical challenges, as OpenAI has already demonstrated with GPT-3. However, time isn’t exactly on our side, if we don’t learn to cooperate soon.
These are obviously big problems that no single blog post can resolve. Until countries come together in the way those AI researchers from Cambridge, Beijing, and other places do, all I can do is patiently wait for my beta access to GPT-3.