As I noted in Interconnected Weekly, the most interesting news item from last week was the revelation, reported by PingWest, that ByteDance is cutting off its China-based engineers’ data access to TikTok and its other international products.
While many Chinese tech companies have global ambitions, ByteDance is in a class of its own. Its various social media products have reached significant traction and cultural relevance in both the Chinese market and other large Internet markets, like the U.S., India, and Indonesia. And with that traction comes intense geopolitical scrutiny.
Instead of being reflexively skeptical about everything a Chinese tech company does, it’s worth deep-diving into how ByteDance could accomplish this internal separation from a technical angle to build trust and alleviate legitimate geopolitical concerns.
RBAC on Crack
The simple answer is RBAC (role-based access control). And lots of it.
By designing a set of identity and access management (IAM) policies around individual roles, ByteDance can restrict every single employee’s access on a product-by-product basis. China-based employees of all functions -- engineering, data analysis, growth, marketing, etc. -- will only have access to the product codebases, production databases, and all their replicas related to the China market products, e.g. Toutiao, Douyin, Xigua. The same rules can be applied to its non-China-based employees for its international products, e.g. TikTok, Helo, BaBe, Lark.
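To make the idea concrete, here is a minimal sketch of a role-to-product check. It is not ByteDance's actual implementation, which has not been made public. The role names, the `Employee` type, and the `can_access` function are all hypothetical; only the product names come from the examples above.

```python
from dataclasses import dataclass

# Product groupings taken from the examples in the text.
CHINA_PRODUCTS = frozenset({"Toutiao", "Douyin", "Xigua"})
INTERNATIONAL_PRODUCTS = frozenset({"TikTok", "Helo", "BaBe", "Lark"})

# Hypothetical roles: each role maps to the set of products it may touch.
ROLE_PRODUCTS = {
    "china-staff": CHINA_PRODUCTS,
    "international-staff": INTERNATIONAL_PRODUCTS,
}


@dataclass(frozen=True)
class Employee:
    name: str
    role: str


def can_access(employee: Employee, product: str) -> bool:
    """Deny by default; allow only if the employee's role lists the product."""
    return product in ROLE_PRODUCTS.get(employee.role, frozenset())


engineer = Employee("example-engineer", "china-staff")
print(can_access(engineer, "Douyin"))  # True
print(can_access(engineer, "TikTok"))  # False
```

The important design choice is the default: an unknown role or an unlisted product gets no access. A real system would need far finer-grained roles (per function, per environment, per dataset), which is where the "lots of it" comes in.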
This separation will have to be cleanly cut along the entire technical stack, from the pixels on the frontend UI, through the algorithm-driven content layer, through the backend server-side layer, through the database layer, and all the way down to computing and storage resources in its data center layer. PingWest’s report hinted that this type of separation may have already started with the splitting of the internal middleware team.
Consequently, ByteDance will need two sets of people for every function to support its product portfolio in and outside of China in parallel. This is arguably much harder to accomplish than separating the tech stack. Collaboration, and perhaps even routine communication, between the two parallel universes may have to be cut off. If that’s the direction ByteDance is going, it would explain why its headcount is planned to balloon from roughly 60,000 to 100,000 by the end of 2020. To put this 40,000-person increase in context, Facebook’s entire headcount at the end of 2019 was about 45,000. That’s a lot of IAM policies to administer in a short amount of time -- many, many YAML files. Clean separation may be hardest for HR and recruiting, who are trying to meet this aggressive hiring goal. (An anecdote: the day before this post was published, I received a cold LinkedIn message, written in Chinese, from a ByteDance recruiter about an international product role based in Beijing or Shanghai.)
The administrative details get even more challenging when such a large employee base starts traveling to and from various locations around the globe -- a non-issue at the moment only because of COVID-19. But ByteDance can draw from large Silicon Valley companies, most of which already have best practices for geofencing employee data access while traveling, especially in and out of China.
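A geofence can be thought of as a second check layered on top of the role check: even if a role normally allows access to a product, the request is denied when it originates from a restricted location. The sketch below is only an illustration under that assumption. The `BLOCKED_FROM` table, the `Session` type, and the use of a country code are all hypothetical, and how a real system determines location (device, network, travel records) is outside its scope.

```python
from dataclasses import dataclass

# Assumption for illustration: products that may not be accessed from a given country.
BLOCKED_FROM = {
    "CN": frozenset({"TikTok", "Helo", "BaBe", "Lark"}),
}


@dataclass(frozen=True)
class Session:
    role_products: frozenset  # products the employee's role normally allows
    country_code: str         # where the session currently originates


def allowed(session: Session, product: str) -> bool:
    """Both checks must pass: the role allows it AND the location does not block it."""
    if product not in session.role_products:
        return False
    return product not in BLOCKED_FROM.get(session.country_code, frozenset())


tiktok_role = frozenset({"TikTok"})
print(allowed(Session(tiktok_role, "US"), "TikTok"))  # True: at the home office
print(allowed(Session(tiktok_role, "CN"), "TikTok"))  # False: same person, traveling
```

The same employee gets different answers depending on where the request comes from, which is exactly what makes the administration harder once people start traveling again.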
Of course, there will have to be a central team that manages, evolves, and enforces this set of complex IAM policies for the long haul. Where should this team sit organizationally? Who can be on this team, since they can see both universes? Who should this team report to? These are some of the important questions that remain unanswered.
All these measures are both technically and operationally doable. They will be painful to implement and maintain. And they will present a massive challenge to building a unified company culture. But they can be done. If ByteDance gets this right, it’ll be a template Chinese tech companies with global expansion ambitions will follow.
(An aside: this setup reminds me a bit of how the Hatch Act works when a U.S. president is running for re-election. There is complete bifurcation between the rank and file employees of the White House and the re-election campaign operation, but high-level government officials appointed by the President, e.g. a Cabinet secretary, can legally access both.)
For something as granular and tedious as RBAC at the scale of 100,000-plus people, the devil is indeed in the details. Even assuming those details are implemented and operationalized to perfection, ByteDance will still need to give third-party organizations a way to audit and verify those implementations on an ongoing basis. While ByteDance’s intention to build various Transparency Centers is a nice touch (arguably more than what Facebook or YouTube has done in this regard), these centers appear to be focused on human content moderation and to serve as controlled PR spin machines for the media. They won’t convince many technical experts.
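What might an ongoing audit actually check? One simple possibility is a scan over an export of access grants that flags any grant crossing the China/international boundary. The sketch below assumes such an export exists; the `Grant` record, the "china"/"international" labels, and the sample grants are hypothetical and purely illustrative.

```python
from typing import NamedTuple

# Which market each product belongs to, per the examples in the text.
PRODUCT_MARKET = {
    "Toutiao": "china", "Douyin": "china", "Xigua": "china",
    "TikTok": "international", "Helo": "international",
    "BaBe": "international", "Lark": "international",
}


class Grant(NamedTuple):
    employee: str
    base: str     # "china" or "international" (hypothetical labels)
    product: str


def find_violations(grants):
    """Return every grant whose product market differs from the employee's base.

    Products missing from PRODUCT_MARKET are flagged too, so that an
    unclassified product cannot slip through unnoticed.
    """
    return [g for g in grants if PRODUCT_MARKET.get(g.product) != g.base]


# Made-up sample data, not real grants.
sample = [
    Grant("emp-1", "china", "Douyin"),
    Grant("emp-2", "china", "TikTok"),
    Grant("emp-3", "international", "Lark"),
]
for violation in find_violations(sample):
    print(violation.employee, violation.product)  # emp-2 TikTok
```

A check like this is only as trustworthy as the export it runs on, which is why the question of who can independently verify the underlying mechanism matters so much.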
The best way to build trust with the technical community? Open source.
Regular readers of Interconnected know I write a lot about open source. I’m a proponent and practitioner of the power of open source in building robust, sustainable, and trustworthy technologies.
Sunlight is always the best disinfectant. Open source is that sunlight in the technology world.
There are few examples of large tech companies open sourcing their RBAC implementations and IAM policies. I don’t think it is because there’s something inherent to RBAC that makes it unsuitable for open sourcing. You can remove the business and organizational logic and open source just the implementation mechanisms. But large tech companies usually open source something only if there is external strategic value in doing so, e.g. creating a developer ecosystem (Apple with Swift, Microsoft with VS Code). RBAC rarely holds any strategic value. It’s mundane and boring.
However, for ByteDance, how its RBAC implementation works is anything but mundane or boring to the government regulators, cybersecurity auditors, and privacy advocates -- three audiences who have lots of skepticism about the company.
Each tech company sees its strategic differentiation and focus differently. As I discussed in “Why Is Facebook Not in the Cloud Business?”, because Facebook is dead set on becoming the dominant social media company and not interested in having a cloud PaaS business, it open sourced its data center design for the strategic purpose of achieving more infrastructure efficiencies. Google, on the other hand, saw its data center design as a competitive advantage and kept it proprietary.
Facebook is differentiated by its algorithm-driven social network, and not by its data centers. Similarly, ByteDance is also differentiated by its algorithm-driven social network, and not by its RBAC implementation (nor is any company for that matter). What’s different is that open sourcing RBAC has strategic value to ByteDance to help it shore up the only currency it lacks: trust.
Unfortunately, ByteDance does not have a long or strong track record of creating, contributing to, or stewarding open source projects. Given its unique potential and predicament, ByteDance would be well-served to develop that competency sooner rather than later. There may not be a direct template for open sourcing RBAC at the scale of ByteDance, but there are many open source practices beyond just sharing code publicly that it can draw from to build trust -- community discussion forums, open documentation, transparent governance, etc.
Earning the trust of a global audience will take a lot more work than poaching “Captain America” from Disney. It’ll take a lot more work than hiring a former congressman to lobby on your behalf. As I’ve shared in “Why Huawei Should IPO in America”, even hiring one of Trump’s biggest fundraisers doesn’t buy you much.
But all the work required is hard work and tedious work, not impossible work.
Chinese Version Below
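The simple answer is RBAC (role-based access control). And lots and lots of it.
By designing a set of identity and access management (IAM) policies around specific job roles, ByteDance can restrict each employee’s access on a product-by-product basis. China-based employees in every department (engineering, data analysis, growth, marketing, etc.) will only be able to access the code, production databases, and replicas of the domestic products, e.g. Toutiao, Douyin, Xigua. The same rules apply to employees based overseas who work on the international products, e.g. TikTok, Helo, BaBe, Lark.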