Last week, DeepSeek released a new version of its model to much fanfare. Except the fanfare did not come from the model’s performance improvements, its hybrid nature (combining reasoning and non-reasoning into a single model, like GPT-5), or its strong agentic capabilities. It came from an unprompted comment from the DeepSeek team clarifying a numerical format choice in the new model version. The comment read:

“UE8M0 FP8 is designed for the next generation of domestically produced chips to be released soon.” 

This comment immediately sent the entire cohort of Chinese AI semiconductor stocks flying for two days. Cambricon, a GPU designer in China, led the way with a 20% pop in one day (the maximum daily move allowed on its exchange) and added another 11% the next day. Other AI-related names, from good old Alibaba to data center provider and Microsoft’s local partner 21Vianet, all jumped. The broader Shanghai and Shenzhen stock indices melted up and went along for the ride. (China possibly banning Nvidia’s H20 chips, even as the US allows their sale, further helped Chinese semiconductor companies’ prospects.)

At the time of publishing this post, this market-moving comment is garnering “likes” as fast as the release post itself.

This cryptic and comically nerdy comment about “UE8M0 FP8” or “unsigned exponent 8 mantissa 0 floating point 8”, which might as well be Morse code to most people, may be the prelude to the real DeepSeek moment. The one we had in January that sent the AI industry and the stock market into a tizzy was more of a headfake. 

The DeepSeek Headfake

Rewinding the clock to late January, when that so-called “DeepSeek moment” wiped $1 trillion of value off the stock market in a single day, Mr. Market was clearly in a panicky mood. He sold first, then asked questions later. That panic implied a few knee-jerk observations, some of which turned out to be real, some not so much:

  • Oh wow, Chinese labs are actually good at building AI models! (real)
  • Nvidia is toast because DeepSeek was so cheap to train! (not so much)
  • Open source models can compete with closed source models at the frontier! (real)
  • Large capital expenditures in AI infrastructure won’t continue! (not so much)
Chart from my screener and chart dashboard of choice: Koyfin

Nvidia dropped 17% that day. Meanwhile, cloud software companies (represented in blue by WCLD, the WisdomTree Cloud Computing Fund) held up in anticipation of more AI applications and less AI infrastructure. Chinese AI names like Alibaba, typically used as a proxy and bellwether, rose quickly then dropped when the initial enthusiasm about DeepSeek faded.

Over the last few months, the market got its bearings while navigating a bunch of other “moments”, from Liberation Day, to the military attack on Iran’s nuclear facilities, to whether and how the Federal Reserve would lower interest rates. Cloud companies returned to their underperforming form. Chinese tech names returned to being tied more to China’s macroeconomic conditions and its trade relations with the US and other countries than to their own technological progress, even though Chinese open source AI models have been dominating the scene thanks to some important structural advantages.

As for the AI trade, things returned to more or less where they were before DeepSeek came on the scene: a primarily hardware-led state of affairs, led by Nvidia, followed by AMD, supported by a long list of chip, memory, and server makers (TSMC, SK Hynix, Dell, Foxconn, etc.), powered by energy and cooling providers (Vertiv, Schneider Electric, etc.), and fueled by hundreds of billions of dollars in capex commitments from all the big tech companies.

That’s why the January moment was a headfake. 

What makes this moment different, more meaningful, and more real? DeepSeek v3.1 is ushering in a software-led AI development roadmap, which Chinese AI chip makers are designing for and possibly standardizing on, starting with UE8M0 FP8.

Software-Led vs Hardware-Led

The UE8M0 FP8 format deserves some explanation, but also should not be blown out of proportion. FP8, or floating point 8, is a data representation format that many AI labs are experimenting with to increase AI training efficiency. When DeepSeek released its V3 model in December of last year, I noted the model’s choice of FP8 as one of the key improvements and an example of that team’s unique software-hardware full stack expertise.

What FP8 does is trade off precision for computational efficiency. The higher-precision, more computationally intensive, and thus less efficient formats are FP16 and FP32. The higher number basically means more bits, and thus more decimal places, to represent a value, which is more precise but also takes up more space (i.e. memory) to represent and store.

There are many variants of a floating point format, whether it is 4 or 8 or 16 or 32 bits. Here is a visual that shows a few different variants. What’s important to remember is that each bit, 1 or 0, occupies a block of memory. E8, or exponent 8, means 8 blocks. M2, or mantissa 2, means 2 blocks. (Don’t worry about what “exponent” or “mantissa” means for the purpose of understanding the big picture.)
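To make the bit accounting concrete, here is a small sketch of my own (not from DeepSeek), comparing the bit budgets and rough memory footprints of a few common formats. The 671 billion figure is roughly DeepSeek V3’s parameter count and is used only for a sense of scale.

```python
# A rough, illustrative comparison of floating point bit budgets.
# Layout convention: sign bit(s) + exponent bits (E) + mantissa bits (M).
formats = {
    "FP32":     {"sign": 1, "exponent": 8, "mantissa": 23},  # 32 bits = 4 bytes
    "FP16":     {"sign": 1, "exponent": 5, "mantissa": 10},  # 16 bits = 2 bytes
    "FP8 E4M3": {"sign": 1, "exponent": 4, "mantissa": 3},   #  8 bits = 1 byte
    "FP8 E5M2": {"sign": 1, "exponent": 5, "mantissa": 2},   #  8 bits = 1 byte
    "UE8M0":    {"sign": 0, "exponent": 8, "mantissa": 0},   #  8 bits, unsigned, no mantissa
}

NUM_VALUES = 671e9  # roughly DeepSeek V3's parameter count, for scale only

for name, f in formats.items():
    bits = f["sign"] + f["exponent"] + f["mantissa"]
    gigabytes = NUM_VALUES * bits / 8 / 1e9
    print(f"{name:9} {bits:2} bits per value, ~{gigabytes:,.0f} GB to store {NUM_VALUES / 1e9:.0f}B values")
```

The point is simply that every bit shaved off a format is memory and bandwidth saved, which is exactly what a less capable GPU needs.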

UE8M0 is also a variant, and an extreme one. The mantissa goes to 0, occupying zero blocks of memory, while the exponent occupies 8. What this choice means in practice is trading away quite a lot of precision in exchange for more memory efficiency. Combining this variant with FP8 also makes the tradeoff more amenable to less powerful logic chips, i.e. less capable GPUs. It’s also important to note that DeepSeek chose this format for its scale factors, which more directly impacts efficiency during the training phase, not inference.
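For readers who want to see how a mantissa-free scale factor behaves, here is a minimal sketch, assuming a UE8M0-style scale: an unsigned 8-bit exponent, so every scale is an exact power of two, applied per block of values before they are squeezed into FP8 range. The bias of 127 and the plain rounding step are my own simplifications for illustration, not DeepSeek’s actual implementation.

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3
UE8M0_BIAS = 127      # assumed bias: stored codes 0..255 map to powers of two around 1.0

def quantize_block(block: np.ndarray):
    """Quantize one block of values using a power-of-two (UE8M0-style) scale."""
    max_abs = float(np.max(np.abs(block)))
    if max_abs == 0.0:
        return np.zeros_like(block), UE8M0_BIAS
    # Smallest power-of-two scale that brings the whole block within FP8 range.
    exponent = int(np.ceil(np.log2(max_abs / FP8_E4M3_MAX)))
    code = int(np.clip(exponent + UE8M0_BIAS, 0, 255))  # the 8-bit scale: no sign, no mantissa
    scale = 2.0 ** (code - UE8M0_BIAS)
    # The lossy step, simplified here to plain rounding instead of true E4M3 encoding.
    quantized = np.clip(np.round(block / scale), -FP8_E4M3_MAX, FP8_E4M3_MAX)
    return quantized, code

def dequantize_block(quantized: np.ndarray, code: int):
    """Recover approximate values by multiplying the power-of-two scale back in."""
    return quantized * (2.0 ** (code - UE8M0_BIAS))

block = np.array([0.013, -2.7, 91.5, 0.0004])
q, code = quantize_block(block)
print("scale code:", code)
print("recovered :", dequantize_block(q, code))
```

Notice how the smallest values get crushed toward zero: that is the precision being given up. The scale itself costs only 8 bits, and multiplying or dividing by a power of two is cheap for hardware to do, which is part of the appeal for less powerful chips.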

Adding all this up, UE8M0 FP8 as a scale factor gives AI GPUs produced on less advanced processes (e.g. 5nm or 7nm), with less memory attached and less bandwidth interconnecting all the pieces, a chance to punch above their weight. In other words, whatever SMIC is capable of at this moment, if it can scale up to volume production, might be enough!

UE8M0 FP8 in and of itself is nothing new. Nvidia’s Blackwell systems support this format. Cambricon supports FP8, hence the big pop in its share price. Huawei’s Ascend does not support FP8, but is rumored to in future releases, if only to keep up with what DeepSeek wants and needs.

And that’s the real moment: DeepSeek is setting the direction, roadmap, and expectations for the entire Chinese AI hardware ecosystem, from SMIC the chip foundry, to the GPU designers, to peer AI models. It is a software-led roadmap, not a hardware-led one. Because DeepSeek is open source, there is nothing stopping non-Chinese hardware vendors from following its lead either. That includes the incumbents, like Nvidia and AMD, as well as upstarts like Groq, especially if DeepSeek remains popular and its derivatives widely diffused.

This is a significant shift: an entire ecosystem’s north star moves from “match whatever Nvidia’s latest and greatest invention is” (while handicapped in doing so) to “make DeepSeek’s latest model run great” (clearer, and more doable).

Contours of Sovereign Chinese AI

So which way is better? Software-led or hardware-led? It is hard to say. Both can and do work. 

The AI boom we are living in today has been hardware-led so far. Along the way, Nvidia became the most valuable company in the world. Other examples, like Linux and Android, have to a certain extent standardized server operating systems and mobile operating systems, respectively, in a software-led fashion. The hardware that runs them has followed along. It is not an accident that both are open source; the leading open source option at a particular layer of the stack tends to become the default standard over time. AI models, as I and others have opined before, are but a middleware software layer.

It is unclear if DeepSeek (or any AI lab) can ever become that default standard at the AI model layer. What is clear, though, is that Chinese AI hardware vendors are willing to bet on DeepSeek and co-design their products to fit its research direction. The output is the contours of a full-fledged sovereign Chinese AI system.

Sovereign AI is an emerging trend that will dictate the global AI infrastructure buying and build-out pattern for years to come. From the recent White House AI Action Plan, it is clear that the US wants to aggressively sell the American AI stack to other countries. And the products are more or less ready to go! Although the implicit and explicit “enemy” in the AI Action Plan is the Chinese AI stack, it has not been clear what the Chinese stack really is, or whether the country has the capacity to sell overseas, when US export control restrictions have hobbled its semiconductor manufacturing capabilities and domestic demand has not even been met yet. My prior has been that even if Huawei’s Ascend GPUs and CloudMatrix systems are on par with and competitive against Nvidia’s products, it will take years for SMIC to progress to nodes more advanced than 5nm, at a yield high enough to satisfy China’s own market demand (with Nvidia still fighting for share there), before it can scale up volume to sell abroad.

DeepSeek 3.1 UE8M0 FP8 may have just changed all that. 

To be clear, UE8M0 FP8 is a bold R&D bet that DeepSeek is making. It could fail spectacularly. The tradeoffs I laid out earlier may be too costly, and models being trained this way could end up falling behind over time. 

But if FP8 becomes the default, and the UE8M0 variant can indeed train great models, SMIC’s 7nm and 5nm processes may just be enough to support an explosion of indigenous GPUs from Cambricon to Moore Threads to Alibaba and ByteDance’s in-house chips, not just Huawei. These could be general purpose GPUs or ASICs. They can support massive training runs – currently a big gap among Chinese GPUs – not just inference workloads. An entirely “Made in China” AI stack may emerge in volume to meet both domestic demand and expansion into overseas markets sooner than most people think, credibly challenging Nvidia and the American AI stack both at home and abroad.

Thus, the knee-jerk reaction (to the upside) has been mostly in the Chinese market, while the US market has stayed muted. Mr. Market always learns from past mistakes, but also tends to overcorrect. He made a mistake in January by overestimating the threat. Perhaps he is underestimating the real DeepSeek moment that just arrived.

(Note: after the initial publication of this post, the Financial Times reported that China is tripling its capacity to manufacture AI chips. This development further suggests that Chinese domestic chip fabrication capabilities may be reaching volume production.)