The troubles of AI scientists earning a million dollars a year
2021-08-08 15:44:23 【Shared by: Chang Zheng】
With just a few keystrokes, AI research scientist Alexis Conneau can send a torrent of information containing hundreds of billions of words scrolling across his computer screen.
Over the years, automated “crawlers” have sucked ancient poetry, angry comments, dessert recipes and everything else on the Internet, in 100 languages, into one huge database.
Reprinted from: Academic Headlines
As a human being, Conneau cannot read that much data piece by piece, but his work, XLM-RoBERTa, has “read” the database many times. This key technical achievement of Facebook AI is based on the transformer architecture and was trained on more than 2 TB of filtered public crawl data in 100 languages.
Conneau helped Facebook build a machine learning system that understands dozens of languages better than the best previous systems.
Figure | Alexis Conneau
In recent years, AI has made great progress. Fueled by tens of billions of dollars a year, huge amounts of data, powerful computing resources and influence, and driven by competition among global technology giants, the field has outgrown its academic roots. It has become strategic high ground for corporate giants and even countries, and the machines themselves are rewriting how technology is made.
At the most basic level, the technology driving this AI revolution is built by developers like Conneau. He is a clever machine learning addict who sees the race to the future as a series of engineering problems destined to be solved. But these achievements have drawn him into a new dilemma.
AI scientists beset by ethical problems
In April 2021, Conneau moved from Facebook to Google as a research scientist. At 30, he earns nearly 1 million dollars a year.
But as the field develops, researchers like Conneau find themselves in an intense “moment of global turmoil”. The outside world has raised many disturbing questions about the control that established technology groups hold over the systems developed in AI research laboratories, and about the destructive power of AI used in the real world.
Almost every aspect is controversial: racial bias in the algorithms and in the development teams; anxiety about intellectual freedom and corporate suppression; and the imbalance of funding, energy and power in the field.
The industry crisis has shaken people's confidence in this utopian, pioneering field known for its optimism. At a time of unprecedented competitive tension, researchers, thinkers, executives and engineers argue with one another, and the winners of these debates may dominate how AI is shaped and change the way millions of people around the world live.
Conneau has helped the industry advance natural language processing technology, which is redefining the way we communicate on the Internet. The AI research he led has been used by Facebook in automatic systems that intercept and screen bullying, prejudice and hate speech, dealing with online incivility faster and more strictly than any human moderator.
But in his daily work, Conneau does not look like a fighter battling for the future of the Internet. He often listens to trance music while coding, and the sticky notes on his laptop screen record his feats in AI training, such as “Prepare 1 billion words”. No one asks him about the policy decisions of large companies, even though developers like him give those policies their technical power.
In a perfect world, Conneau believes, his work can equip automatic speech moderators to protect people from the worst of human nature and build a friendlier, happier Internet. He thinks these systems are essential to managing the defining tug of war of online speech: encouraging freedom of expression while suppressing prejudice and anger.
“For our work on classifying hate speech and bullying, there is no way to solve these problems other than automatically,” Conneau said.
But in the eyes of his critics, his inventions will become tools abused by rich and powerful technology giants: a biased and invasive force that enables more targeted advertising, more automated surveillance and more large-scale deception around the world.
Conneau and other AI researchers are sometimes regarded as “mercenaries”, because their inventions earn huge sums, directly or indirectly, from almost every Internet user: with every search, click and scroll, users provide clues about themselves and are then “captured” by the algorithm.
Such judgments make Conneau and many developers like him very uncomfortable, because he is a machine learning scientist, not a politician or decision maker. He often thinks about the major social and moral problems his work raises, but he believes that answering these questions is still the task of those with a broader global perspective and public authority.
The accumulating problems have cast a shadow over the whole field that cannot be ignored. As the work of AI researchers gradually reshapes the foundations of a new society, will the tools they develop help create a better future?
The answer seems increasingly uncertain.
The more advanced the technology, the more vigilant we should be about its negative effects
Conneau grew up in northwest France. As a student, he fell in love with the puzzle-like character of abstract mathematics: how it decomposes existence into its most basic parts, namely numbers, patterns and ideas. After graduating from university, he entered the technology industry.
In 2012, the once-dormant field of machine learning underwent a technical revival. A new wave of “neural networks”, software loosely modeled on the neurons and chemical interactions in the brain that give us thoughts and memories, began to displace long-established methods of recognizing patterns and images.
Researchers used powerful graphics cards, chips designed mainly for video games, to break through old computing bottlenecks. The Internet world saw another information technology “big bang”: money flowed in, and everything was exciting.
In 2015, Conneau joined Facebook's new AI laboratory in Paris. The laboratory was launched as part of the global expansion of America's top technology giants: every technology company wanted to find the world's most talented research and engineering students.
Facebook in particular believed it was sitting on an AI gold mine: hundreds of millions of user photos are posted on the social network every day, and the right algorithm could turn this data into a lucrative technical masterpiece.
The development of “computer vision” in AI also spurred progress in other fields. The vast body of written works of human civilization could be quantified and analyzed mathematically through models, to discover patterns and predict future usage.
This technological advance has reshaped the way people interact with the Internet: it exists in the genetic code of almost every smartphone we see today, in autocorrect, personalized recommendations and search results. AI systems do not know the deeper definitions or meanings of these words, but they can guess which other words or phrases are likely to appear next.
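The idea of guessing the next word without knowing what any word means can be illustrated with a minimal sketch. It is not how XLM-RoBERTa or any production system works; it simply counts which word most often follows another in a tiny corpus. The corpus and the function names `train_bigrams` and `guess_next` are hypothetical, chosen only for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, how often every other word follows it."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def guess_next(following, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = following.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Hypothetical toy corpus, standing in for the crawled text in the article.
corpus = "the cat sat on the mat . the cat ate the fish . the cat slept"
model = train_bigrams(corpus)

# "cat" follows "the" three times, more often than "mat" or "fish".
print(guess_next(model, "the"))
```

The model has no notion of what a cat is; it only tracks co-occurrence statistics. Modern language models replace these raw counts with learned neural representations, but the prediction task is the same in spirit.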
But in 2016, Conneau was shaken by an event close to home: his wife's sister died by suicide at the age of 21. She had been subjected to relentless bullying at school and online, including on social networks. Conneau began to realize deeply the harm done by the other side of advanced technology.
Facebook funded a series of research laboratories, in New York, Pittsburgh, Seattle, London, Montreal, Paris and Tel Aviv, that chase some of the wildest ideas in the field, from virtual “AI habitats” to six-legged spider robots. But Conneau's applied AI research group was dedicated to building products for the “here and now”: making online advertising more attractive, news feeds stickier, and Facebook's software more necessary to its global audience in their daily lives.
In the lab, Conneau devoted himself to developing AI tools that can probe the depths of human language. He and his colleagues published papers including “Word Translation Without Parallel Data” and “Cross-lingual Language Model Pretraining”, which helped advance the state of the art in “unsupervised machine translation”.
Neural networks can turn words into points in a vast three-dimensional space. Words are then no longer fuzzy definitions but numerical calculations, which is what computers are very good at. In other words, language becomes mathematics.
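As a rough illustration of words becoming points that can be compared numerically, the sketch below places three words in a three-dimensional space and measures how close they are using cosine similarity. The coordinates are hand-picked and hypothetical, not taken from any trained model; real systems learn these vectors from data and typically use hundreds of dimensions.

```python
import math

# Hypothetical hand-picked coordinates; a trained model would learn these.
vectors = {
    "king": (0.9, 0.8, 0.1),
    "queen": (0.9, 0.7, 0.2),
    "apple": (0.1, 0.2, 0.9),
}


def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word):
    """Return the other word whose vector points in the most similar direction."""
    others = (w for w in vectors if w != word)
    return max(others, key=lambda w: cosine(vectors[word], vectors[w]))


print(nearest("king"))
print(round(cosine(vectors["king"], vectors["apple"]), 2))
```

With these toy coordinates, “king” lands much closer to “queen” than to “apple”, so a question about meaning becomes a question about geometry.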
In 2019, Conneau and other researchers began training an AI model that could handle different languages at the same time, speed-reading movie subtitles, minutes of United Nations meetings and other written works in different languages, and pairing sentences together.
“Training” a model requires a lot of preparation: writing the algorithms that follow a set of rules; collecting data and processing it into a machine-readable form; designing tests; analyzing the results; and so on.
To run the computations, Conneau relied on a huge cluster of data center processors that can perform trillions of calculations per second. Training for XLM-R's “unsupervised cross-lingual representation learning” relied on 500 Nvidia Tesla V100 graphics cards.
When Conneau and his colleagues ran the system on the language comprehension tests that international AI researchers use as benchmarks, they were stunned: the accuracy of the 100-language model was very close to that of its specialized monolingual competitors. This meant that the world's largest social network, whose core business model rests on news feed algorithms, relationship graphs and targeted advertising, could start using the system to scan, within tens of milliseconds, every post uploaded each day by its 3 billion users.
One of Conneau's last contributions at Facebook was working with other researchers to help design a new speech recognition system, wav2vec-U. Compared with competing technology, this tool has a key advantage: it learns not from large amounts of human-transcribed audio, but by listening to a lot of audio and working out the words by itself.
The “unsupervised” learning technique seen in wav2vec-U has long been a holy grail for AI researchers, because it can accomplish a great deal of language processing without large amounts of manually labeled training data.
Facebook and other companies say AI is humanity's best solution to divisive, hateful and harmful online speech. Facebook said in May that its AI systems can proactively detect 97% of hate speech, and that these comments are removed from the site before they are reported.
AI can now analyze the images, video, text and comments in a post together rather than separately. Conneau says it is a commitment to a better world that keeps him moving forward. “Maybe there is one piece of information that we classify as harmful, and that a user never receives, that completely changes their life.”
Even as he runs benchmarks on AI models, Conneau is aware that modern AI technology can be dangerous and arrogant. The systems researchers build face criticism for being prone to abuse, distorted by bias or too powerful to control.
The most influential players in AI are usually the largest technology companies, which recruit top talent with abundant cash. This means most major advances in the field quickly become new company products rather than products designed with the public interest in mind.
Some of Google's best-known researchers, such as Timnit Gebru and Margaret Mitchell, have resigned or been fired in recent months. Samy Bengio, who worked at Google for 14 years and managed hundreds of top AI researchers, also left for Apple this spring. The main reason is that they strongly protested Google's internal attitude toward people of color and its interference with so-called “independent” research.
People worry that all kinds of terrible things, such as racism, sexism and threats of violence, may eventually be absorbed by AI systems, which learn, process and replicate them. Ignoring these problems could be disastrous, because these systems will increasingly shape how the modern world lives and communicates.
“These optimization problems are hard to solve. But I think that by continuing the research, we will get closer to finding the best solutions to them. I don't think that is naive, but pragmatic. We might as well be optimistic.”
AI is not getting smarter; it is still a “machine”
For hard-working engineers, these questions of AI ethics and abuse are becoming harder and harder to avoid. NeurIPS, the industry's top machine learning conference, where thousands of elite researchers compete each year for attention and awards, announced last year that all submitted work would for the first time need to analyze “not only useful applications ... but also potential malicious uses and consequences of failure.”
The change has not been universally welcomed, and the shift will not happen overnight. One deep learning researcher said on Twitter that most researchers do not “have good enough academic grounding to make meaningful remarks on the social impact of this technology”. Some researchers also believe that considering downstream consequences is beyond their responsibility, and that their focus should be on scientific progress alone.
There are signs that AI researchers are now thinking more often about what they build. In May this year, when Google launched its new language AI system LaMDA, it said the system was trained to imitate the meandering style of human conversation. It acknowledged that such a system could learn to internalize online bias, replicate hate speech or misleading information, or be “abused” in other ways. Although Google said it “will work to ensure that such risks are minimized”, many people are skeptical and call such statements “ethics washing”.
As the work progresses, Conneau expects researchers to find themselves increasingly at the center of the debate over the evolution of online dialogue and the limits of machine learning. “We are all writing this history in some way,” he said.
“This is the role of everyone: society, democracy, including public authorities.” Much public opinion also expresses a fear of AI technology: a superintelligent threat that could conquer humanity.
But Conneau is full of hope for the future. When we train a language model, it can generate text, but that is not thinking. He believes AI is not getting smarter; it just reads, processes and manipulates data in a more convincing way.
“You give it an input, it produces an output. It's just a machine, right?” Conneau said.