China wants AI to rival ChatGPT. Censorship makes that tricky

Chinese firms including Baidu and Alibaba are racing to roll out versions of the chatbots taking the tech world by storm.

By Erin Hale

Taipei, Taiwan – As the arrival of artificial intelligence-powered chatbots sends shockwaves through the global tech industry, China is racing to produce versions of its own.

China’s search-engine giant Baidu has announced plans to release its chatbot ERNIE sometime in March, following the pioneering launch of ChatGPT, which has prompted existential questions about the future of sectors ranging from education to journalism and healthcare.

Chinese tech shares rallied in response to the news and authorities have pledged to beef up their support of the sector. Similar projects to ERNIE are under way at Chinese tech giants Huawei, Alibaba, Tencent, JD.com and top institutions including the Beijing Academy of Artificial Intelligence.

China’s Ministry of Science and Technology said last week it would push for the integration of AI across Chinese industry, while cities including Beijing have also announced plans to back developers.

But while China appears to be on the cusp of producing a fast follower to rival ChatGPT, which was developed by California-based OpenAI, there are big questions about how the technology will operate within an ecosystem that includes strict internet controls.

“The most general-purpose technology we have, artificial intelligence, should be something that’s a super general purpose,” Jeffrey Ding, an assistant professor at George Washington University who studies the Chinese tech sector, told Al Jazeera.

“But it’s really shaped by the specific, political, cultural, linguistic context in which these models are developed and deployed.”

Bots like ChatGPT rely on generative AI to formulate responses that draw from billions of data points scraped from the internet, which also makes their answers at times difficult to predict.

Lengthy conversations between ChatGPT and users have gone off the rails, leading Microsoft to limit its ChatGPT-powered search engine Bing to a maximum of five questions to keep it on task. ChatGPT’s answers have also rankled conservatives in the United States, who have accused the bot of being “woke” on hot-button social issues such as affirmative action and transgender rights.

In China, internet censors routinely ban keywords, delete posts and bar users in accordance with the sensitivities of the ruling Chinese Communist Party (CCP), leading creative internet users to use homophones, coded messages and screenshots to get around information controls.

For a chatbot, the censorship apparatus means a severely limited pool of information to rely on.

Baidu’s ERNIE chatbot is based on information scraped from both inside and outside China’s firewall – which is necessary to obtain an adequate data set – and draws on sources like Wikipedia and the famously unwieldy Reddit.

Assuming their products can technically perform at a level similar to ChatGPT, Chinese tech companies may find themselves choosing between restricting what chatbots can do, as Microsoft has with Bing, and restricting what they can say.

“It’ll make it a lot less useful, but it’ll make it a little bit safer politically,” Matt Sheehan, a fellow at the Carnegie Endowment for International Peace who studies AI and China, told Al Jazeera.

“Historically, almost every time they’ve been faced with a trade-off between information control and…business opportunities, they always come down on the side of information control and then they assume that businesses will figure it out.”

In 2017, Tencent pulled two chatbots from its QQ messaging app after they reportedly made comments deemed unpatriotic. One chatbot, developed by Microsoft, told users it dreamed of moving to the US, while the other chatbot, developed by Chinese tech company Turing Robot, told users it did not love the CCP.

Earlier this month, YuanYu Intelligence, a Hangzhou-based startup, suspended its chatbot after it provided negative answers about the Chinese economy, although the company’s lead developer insisted in an interview with the Washington Post that the suspension was simply due to technical errors.

Baidu itself has previously anticipated Beijing’s red lines, as seen with its ERNIE-ViLG image and art generator, released in demo form last year.

While widely praised for performing as well as or better than Western rivals, the app blocks users from generating content related to politically sensitive topics such as Tiananmen Square, democracy, Xi Jinping and Mao Zedong.

Shenzhen-based Tencent is among the Chinese tech companies working to develop a rival to ChatGPT [File: Aly Song/Reuters]

“With generative AI, the power of the tool is its ability to be creative and to connect things that you wouldn’t expect to be connected, and to do things in different styles that are expected,” Sheehan said.

“But how can you prevent the maybe subtler or less direct criticisms of the Communist Party’s core beliefs without completely neutering the tool itself? That seems like a really hard technical and sociopolitical problem.”

Before the release of ChatGPT, China was already taking steps to regulate AI. On Wednesday, the Cyberspace Administration of China began enforcing new rules governing search engine recommendations, giving users more control over how search engines use their personal data.

In January, China also passed legislation to regulate deep synthesis – a form of generative AI that can be used to create “deep fakes” – and last year set up a registry for algorithms, although the expected long-term effects of both measures are widely seen as unclear.

As part of a broader crackdown on the tech industry since 2020, authorities have shown no hesitation about reining in firms deemed to be acting beyond their authority, such as by pulling the plug on blockbuster IPOs by Ant Group and ride-hailing app Didi over alleged data concerns.

Despite being blocked by China’s firewall, ChatGPT has generated huge buzz among Chinese users accessing the site via virtual private networks (VPNs) and other roundabout methods.

Much of that excitement has stemmed from ChatGPT’s ability to perform in Chinese and other languages despite being trained in English, said Ding, the George Washington University professor.

“The excitement isn’t really about the business applications. Part of it is just excitement and wonder at how impressive the natural language capabilities of this technology are,” he said.

“And one aspect of that is ChatGPT was not even trained on any Chinese language texts. It was mostly all trained on English language texts but I’ve seen Chinese users ask questions in Mandarin and it will still perform very capably in a different language.”

Even so, Chinese could prove especially challenging for AI, Ding said, due to the language’s heavy use of idioms and sayings with historical context.

While Chinese developers have already released a number of chatbots, including Inspur’s Yuan 1.0 and Fudan University’s MOSS, none has come close to matching the capabilities of ChatGPT.

Unlike their Silicon Valley counterparts, Chinese tech companies have until now tended to focus on consumer-facing products with a short development cycle, said Chim Lee, a Chinese tech analyst at the Economist Intelligence Unit, putting them at a disadvantage in a nascent field like AI.

The arrival of ChatGPT provided Chinese firms with a “proof of concept”, Lee said, showing both the promise of generative AI and the need for longer-term investment.

“Baidu has been considering this kind of model for quite a while, but you need to justify this kind of investment just to train the model, not to mention researching or talking about the long-term foundational data related to the algorithm,” Lee told Al Jazeera.

“What is very helpful with ChatGPT is now these companies can say, ‘Hey, we want to develop these kinds of things and they can tell the government that’s what I want to do’.”

Tech analysts say that Chinese companies face hurdles to replicating the success of ChatGPT, including government censorship and an industry focus on consumer-facing products with a short development cycle [File: Jason Lee/Reuters]

Rui Ma, a tech analyst and creator of Tech Buzz China, said it is anyone’s guess which Chinese firm might come out on top in the race to match ChatGPT, although Baidu appears to be first out of the gate.

“I think right now the most excitement is still at the model level,” Ma said.

Alibaba told Al Jazeera that it is internally testing a ChatGPT-style bot for use in its apps and cloud services, but it did not provide further details or respond to questions about censorship.

JD.com directed Al Jazeera to a statement released last week about its plans to roll out its industrial chatbot ChatJD for use on its retail and financial website, based on 10 years of data from its various platforms.

Baidu, Tencent and Huawei did not respond to requests for comment.

Apart from the scrutiny of Beijing’s watchful eye, Chinese tech companies also face hurdles from overseas in the form of export controls.

In August, US President Joe Biden signed the CHIPS and Science Act, which requires tech companies receiving government subsidies to move the manufacturing of advanced chips out of China.

Although Chinese tech companies have strategic stockpiles of chips, Washington’s effort to hobble the sector poses a long-term threat, said the EIU’s Lee.

“The US specifically banned the export of these very advanced AI chips that would be used in model training, or even just employment, so all of these factors put Chinese AI developers in a disadvantaged position in many ways,” he said.

“A lot of Chinese companies and research institutions have indeed stockpiled some chips that would be used for these kinds of applications but if you look at the scale of chips that ChatGPT requires, there’s a very high possibility that the chips would run out at some point,” he added.

Source: Al Jazeera
