Interview with Li Feifei: from top student at the High School Affiliated to Tsinghua University to Alibaba's "Flying Knife", explaining "cloud native" with a well

Summary: His path to Tsinghua University was set in his third year of junior high school, and he is now chief database scientist at Alibaba's DAMO Academy. Li Feifei moved from academia to industry and led the Alibaba Cloud technology team in building cloud native distributed databases, helping Alibaba win another round in its "All in Cloud" campaign. Today, he uses a well to explain cloud native.

If you had to pick one keyword to define China's current era of consumption, "Double 11" could not be more fitting.

Starting in November 2009, it took only ten years to grow from a discount day with just 27 participating merchants into a nationwide shopping carnival. In 2019, more than 180,000 brands took part, and cumulative turnover reached 268.4 billion yuan.

Every year, Double 11 is the peak of Internet traffic.
With transaction figures climbing year after year, what supports it all?

Li Feifei, vice president of Alibaba Group, president of the database product business unit of Alibaba Cloud Intelligence, and head of the Database and Storage Laboratory at DAMO Academy, revealed in an exclusive interview with Xinzhiyuan that last year's Double 11 peaked at 550,000 orders per second. Each order can be split into many database transactions, so in total millions of transactions occur at the instant of 0:00:01, and transaction volume surges 133-fold. In other words, where there was one transaction in the previous second, there are 133 in the next.

Such steep growth in so short a time poses an enormous challenge to back-end computing and storage systems. The system's elasticity, scalability, and high availability all have to be excellent; otherwise it cannot withstand the sudden surge in transactions.
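To make the scale of that surge concrete, here is a back-of-the-envelope sketch. Only the 133-fold factor comes from the interview; the baseline load and the per-node capacity are hypothetical numbers chosen purely for illustration, and real capacity planning is far more involved.

```python
import math


def nodes_needed(tps: float, per_node_tps: float) -> int:
    """Smallest number of nodes whose combined capacity covers `tps`."""
    return math.ceil(tps / per_node_tps)


SURGE_FACTOR = 133       # from the interview
PER_NODE_TPS = 10_000    # hypothetical capacity of one database node
BASELINE_TPS = 50_000    # hypothetical load just before midnight

peak_tps = BASELINE_TPS * SURGE_FACTOR
nodes_before = nodes_needed(BASELINE_TPS, PER_NODE_TPS)
nodes_at_peak = nodes_needed(peak_tps, PER_NODE_TPS)

print(f"before: {nodes_before} nodes, at peak: {nodes_at_peak} nodes")
```

Under these assumed numbers, a system that cannot add capacity within seconds would have to keep the peak-sized fleet running all year, which is the cost problem that elasticity is meant to address.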

This is a great challenge for data science, and the best possible stage for every aspiring "data scientist". Li Feifei spent 20 years preparing for it.

Taking off from the High School Affiliated to Tsinghua University: how did Li Feifei embark on the road of data science?

Li Feifei has liked digging deep into problems since childhood, and he took part in many math and physics competitions.

In the State Education Commission's science experimental class at the High School Affiliated to Tsinghua University, free from the pressure of the college entrance examination, Li Feifei threw himself into competitions.

The age of 16 was a watershed. That year, he graduated from junior high school and entered the science experimental class, which set his subsequent path to Tsinghua.

It was in his undergraduate graduation project that he first encountered data science. At that time, big data had not yet taken off, but in Li Feifei's eyes the work already looked forward-thinking.
Even in 2017, the position of "data scientist" was still a loosely defined, "perceptual" notion.

Back then, data science was not yet a hot subject and the concept of a "data scientist" was still superficial; few people could see past the tedium to appreciate its charm.

From early 2001 to the first half of 2002, for more than a year, Li Feifei immersed himself in his graduation project, processing massive amounts of data from foreign-language websites every day.

By parsing the pages of websites such as BBC and CNN and analyzing their hyperlinks, he successfully reconstructed the topology of each site, mapped out its complete back-end structure, and achieved automatic understanding of the site's information.

"Looking back now, that project was very advanced, and with the technology of the time it was unlikely to lead anywhere," Li Feifei said of his undergraduate project.

Although it used only simple methods such as word segmentation and word-frequency statistics, the outline of modern NLP technology was already visible in it. People who truly think ahead are always ahead of their times.
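The kind of analysis described above, following hyperlinks to recover a site's topology and counting word frequencies in page text, can be sketched in a few lines with today's standard library. This is only an illustration, not a reconstruction of the original project: the three tiny pages, the `LinkAndTextParser` class, and the `analyze` function are all made up for this example, and the "segmentation" here is a naive split on letters.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class LinkAndTextParser(HTMLParser):
    """Collects hyperlink targets and word counts from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = Counter()

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Naive word segmentation: lowercase runs of ASCII letters.
        self.words.update(re.findall(r"[a-z]+", data.lower()))


def analyze(pages):
    """Return (link graph, total word counts) for a dict of url -> html."""
    graph = {}
    totals = Counter()
    for url, html in pages.items():
        parser = LinkAndTextParser()
        parser.feed(html)
        parser.close()
        # Keep only links that point to pages we actually have.
        graph[url] = [link for link in parser.links if link in pages]
        totals += parser.words
    return graph, totals


# Hypothetical miniature "website" used only for this sketch.
PAGES = {
    "/": '<a href="/world">World news</a> <a href="/sport">Sport news</a>',
    "/world": '<p>World news today</p> <a href="/">Home</a>',
    "/sport": '<p>Sport results</p> <a href="/">Home</a>',
}

graph, counts = analyze(PAGES)
print(graph)
print(counts.most_common(3))
```

The `graph` dictionary is the site topology as an adjacency list, and `counts` is the word-frequency table; a real crawler would also need to fetch pages, resolve relative URLs, and handle far messier markup.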

"Around 2001 and 2002 there was no concept of big data yet, but in essence I think big data and databases share the same origin, namely the management and processing of data.

"This project was great training for me. The artificial intelligence of the time, including deep neural networks and NLP, had not yet developed. Neural networks did exist; I took a course on them as an undergraduate, and I still remember it well and thought the results were very good. But because of the limits of cluster size and computing power, there were no deep networks."

A glimpse of one part was enough to reveal the whole.

Through this project, Li Feifei recognized the power of the data-driven approach and the endless room for exploration it offered.

"In those days, much of the upper-layer application analysis was rule-based, or rule driven: you defined the rules and then built the system on top of them. But when we did that project, we already felt the power of being data driven.

"From the production and processing of data to its storage and consumption, there is no end to understanding, mining, and managing this whole chain, because the amount of data keeps growing, data types keep getting more complex, and the applications built on top of the data keep diversifying."

Data mining and management is like a wheel rolling forward, advancing with the times and with technology, and it sparked Li Feifei's passion for exploring the field.

Influenced by this forward-looking project, Li Feifei committed himself to data science and went to Boston University to study database systems and big data. He then taught in the computer science departments of Florida State University and the University of Utah, rising from assistant professor to associate professor and then to full professor over the course of 10 years.

This low-profile school is famous for its work in graphics and systems. John Warnock, co-founder of the well-known software company Adobe, completed both his undergraduate and doctoral degrees at the University of Utah.

Even the famous animation studio Pixar was founded by professors and doctoral students from the University of Utah, and the school has produced three Turing Award winners in graphics and imaging.

ARPANET, the forerunner of the Internet and usually regarded as its "genesis", initially consisted of 4 nodes in the western United States: the University of California, Los Angeles (UCLA), the Stanford Research Institute (SRI), the University of California, Santa Barbara (UCSB), and the University of Utah (UTAH). One of them was in the computer science department at the University of Utah.

The Jay Lepreau Best Paper Award at OSDI, a top conference in systems, is likewise named after Jay Lepreau, a professor of computer science at the University of Utah.

One sentence from Alibaba Cloud's president made him resolve to join Alibaba: "Technology creates new business"

Asked who guided him on the road to data science, Li Feifei names Michael Stonebraker, the famous Turing Award winner in the database field.
Ordinary readers may not be familiar with Stonebraker, but within the field almost everyone knows PostgreSQL, the mainstream open source database he created.

Stonebraker not only does excellent research but also does research driven by real systems. While a professor, he founded a number of database companies with far-reaching influence in the industry.

In the database world, he is a model of how academic research ability and product ability can be combined. In the eyes of "Flying Knife" (Li Feifei's nickname at Alibaba), Stonebraker is a true standard-bearer.

His eight or nine years of teaching at universities showed Li Feifei the difference between academia and industry. Academia has a freer atmosphere, and what it produces is purer: research reduces complex problems to their essence in order to find the most fundamental question. Companies are more goal oriented, guided by the market and by customer demand, and their short-term goals are clearer.

"What you have to think about in a company is how to turn technology into products, and how to turn products into commodities. In academia, you care more about technological innovation, which may not have reached the commodity stage yet or may still be far from it. The two demand completely different sets of abilities," Li Feifei said in the interview.

During his years of academic research, Li Feifei won numerous awards, including the IEEE ICDE 2014 10-Year Most Influential Paper Award, the ACM SIGMOD 2016 Best Paper Award, the ACM SIGMOD 2015 Best System Demonstration Award, and the IEEE ICDE 2004 Best Paper Award. But beyond the ivory tower, the appeal of engineering and products also drew him.

Before returning to China, Li Feifei had been in contact many times with Silicon Valley giants such as Google and Facebook, but in the end he chose to return and join Alibaba. That decision traces back to something said by Zhang Jianfeng, then Alibaba's CTO and now president of Alibaba Cloud.

During his job interview, one sentence from Zhang Jianfeng moved him deeply: "Technology creates new business." It made Flying Knife think about the essence of technology.
"In the end I came to understand that, from a technical point of view, what we think about is how much performance can be improved and how much cost can be reduced, but what ultimately drives the evolution of society is the power of commerce."

In this sense, Alibaba Cloud's DAMO Academy and database business unit organically combine research and business: one can study technology while also taking part in the commercialization of products, which was exactly what he was looking for.

In addition, Alibaba's diversified businesses, including e-commerce, logistics, new retail, and finance, along with the rich challenges brought by their massive data, gave Flying Knife plenty of room to put his skills to the test.

"Seen over the long course of history, only technology that can create business value is truly viable. I really feel that way," Flying Knife concluded.

Using "a well" to explain the cloud native database

In practice, many scenarios resemble Double 11 and need an elastic database to support them.

In 2018, Alibaba launched its "All in Cloud" campaign, moving the compute, storage, network, and databases of the Double 11 core systems onto Alibaba Cloud.

At this year's recently concluded Apsara Conference, Alibaba announced the establishment of its Cloud Native Technology Committee and launched a series of self-developed cloud native database products: the cloud native relational database PolarDB, the cloud native distributed database PolarDB-X, the cloud native data warehouse AnalyticDB (ADB), the cloud native data lake analytics service DLA, and the cloud native multi-model database Lindorm.
These products do not stand alone; together they form a complete system.

This also marks the entry of Alibaba Cloud's databases into the era of cloud native plus distributed. Wang Jian, chairman of Alibaba's technical committee, has said that this puts Alibaba Cloud and its customers "on the same airplane".

Traditional databases can be classified as OLTP, OLAP, and NoSQL. The biggest challenges they face are guaranteeing consistency and avoiding read and write errors under highly concurrent reads and writes, and providing low-cost storage with efficient computation and analysis over massive data.

Cloud native databases are used in all three areas.

To understand cloud native, one must first understand "cloud". The cloud is not just a matter of putting resources in the cloud. In a traditional computer architecture, resources are "tightly coupled" together.

Li Feifei gives a vivid example: buckets and wells. Water from the well has to be drawn out for use in the kitchen. If we compare the kitchen to the CPU, we can say that the well and the kitchen are tightly coupled.

When much more water is needed, besides making the well deeper and wider, you can also build "distributed" wells, connecting each household's well through some kind of device.

But dispatching water among every household's well through that device makes the "distributed" process very complicated, and it requires an efficient scheduling system.

With distributed systems understood, let us turn to the cloud.

Li Feifei says the first essence of "cloud" is "using virtualization technology to pool resources".

In terms of the well example, the "cloud" means that on the surface there are still 100 separate wells, but their bottoms are already connected, forming an invisible pond.

The second essence of "cloud" is "resource decoupling": storage and compute need to be decoupled and then pooled separately. The advantage is that scaling becomes very flexible; for example, the number of CPU cores and the amount of storage can each be expanded freely.
Through resource pooling, the separation of storage and compute, and resource decoupling, a cloud native database gains greater elasticity, high availability, and distributed capability, meeting the business's need for on-demand, pay-as-you-go usage.
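The difference between tightly coupled machines and decoupled resource pools can be illustrated with a toy model. This is only a sketch of the idea as described in the interview, not how any Alibaba Cloud product is implemented; the class names, server sizes, and workload numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CoupledServer:
    """Traditional machine: CPU and disk are bought and scaled as one unit."""
    cores: int
    storage_tb: int


@dataclass
class ResourcePool:
    """A pooled resource that can grow or shrink independently of the others."""
    name: str
    capacity: int

    def scale_to(self, capacity: int) -> None:
        self.capacity = capacity


def coupled_servers_for(cores_needed: int, server: CoupledServer) -> int:
    """Servers required to reach `cores_needed` (ceiling division)."""
    return -(-cores_needed // server.cores)


# Hypothetical workload: compute demand jumps 10x, data size stays at 10 TB.
DATA_TB = 10
CORES_AT_PEAK = 320

# Tightly coupled: more cores means more whole servers, disks included.
server = CoupledServer(cores=32, storage_tb=10)
servers = coupled_servers_for(CORES_AT_PEAK, server)
idle_storage_tb = servers * server.storage_tb - DATA_TB

# Decoupled: only the compute pool is scaled; storage stays where it was.
compute = ResourcePool("compute-cores", 32)
storage = ResourcePool("storage-tb", DATA_TB)
compute.scale_to(CORES_AT_PEAK)

print(f"coupled: {servers} servers, {idle_storage_tb} TB of storage sitting idle")
print(f"decoupled: {compute.capacity} cores, {storage.capacity} TB of storage")
```

In the coupled case, the extra cores drag along storage nobody asked for; in the decoupled case, each pool is sized to its own demand, which is what makes pay-as-you-go billing practical.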

On the surface, the cloud native relational database PolarDB, the cloud native distributed database PolarDB-X, and the cloud native data warehouse AnalyticDB (ADB) look little different from traditional databases: each has a storage engine, an optimization engine, an interface engine, and so on. But the way resources are used and scheduled underneath has changed dramatically. Although the bottom layer has changed, the hope is that for users the change is transparent and imperceptible.

Li Feifei also said that, in the future, multi-modal data processing and intelligent resource scheduling will be among the challenges that cloud native databases face.

During the epidemic, online education and the gaming industry changed fundamentally, and cloud native databases are better able to meet their demand for elasticity.

Beyond this, cloud native databases can also integrate online and offline workloads, integrate data processing with computation and analysis, and integrate big data with the database. They let users work with big data the way they work with a database: there is no need to write complex Hadoop or Spark programs, since simple SQL can handle complex tasks, which greatly reduces the difficulty of development. At the same time, the time many users need to locate problems has been shortened to within 7 minutes.
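To illustrate what "just simple SQL" means in practice, here is a small sketch of an aggregation that would otherwise need a hand-written map and reduce step. It uses Python's built-in `sqlite3` module purely as a stand-in so the example is runnable anywhere; it does not use or represent PolarDB, AnalyticDB, or any other Alibaba Cloud product, and the `orders` table and its rows are invented for the example.

```python
import sqlite3

# In-memory database as a stand-in for a real analytical database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (category TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("books", 30.0), ("books", 20.0), ("phones", 500.0)],
)

# One declarative statement replaces a hand-written grouping program.
rows = conn.execute(
    """
    SELECT category, COUNT(*) AS order_count, SUM(amount) AS revenue
    FROM orders
    GROUP BY category
    ORDER BY category
    """
).fetchall()

for category, order_count, revenue in rows:
    print(category, order_count, revenue)

conn.close()
```

The user states what result is wanted and leaves the question of how to distribute and execute the work to the database engine, which is the division of labor the paragraph above describes.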

Finally, Li Feifei said that Alibaba Cloud's cloud native databases are built on independent research and development and remain under its own control. In terms of ecosystem, they will be 100% compatible with existing databases, so users need not worry about being locked in and can migrate flexibly as their needs require.

Cloud native is the trend; technology is meaningful only when it creates value

The exploration of technology is meant to help humanity progress. What kind of technology is worth pursuing?

Li Feifei believes the essential question is whether a technology can eventually scale and keep generating business value.

It sounds like an empty question, but if you calm down and think about which innovative breakthrough is needed at which point, it becomes easy to put into practice. Take, for example, the question of why cloud native is a trend.

Once resources are decoupled, elasticity and scalability truly become available on demand. It is just like how every household used to draw its drinking water from a well and has since moved to tap water: you turn it on when you need it and turn it off whenever you don't.

So the arrival of the cloud native database essentially solves the problems of resource utilization efficiency and resource cost, which makes it a business problem.

Is technology meaningful only when it creates business value ?

Looking at the history of human civilization, many technologies have had no commercial value in the short term.

However, if a technology cannot be realized for another hundred or two hundred years, it does not make much sense. In today's rapidly changing environment, within three years at most, we must make clear the business value and logic that a technological evolution brings, because companies need to evolve and improve their operational efficiency.

After years of working on databases, Li Feifei says in his own words that he tends to think about problems from a logical perspective.

"For example, I pay more attention to causality and correlation. When I look at many things, I first ask whether they are correlated, and once there is a correlation, I ask whether it is merely a correlation or whether there is an underlying cause and effect."

In the future, everything will be data driven, and looking for connections between data can lead to new value.


