
Full text of the LVS speech by an Alibaba Cloud Video Cloud technical expert: "The Evolution of 'Cloud-Device Integrated' Intelligent Media Production"

2020-11-08 12:56:51 osc u.97wmavr6

On November 1, 2020, Alibaba Cloud Video Cloud appeared at the LiveVideoStackCon audio and video technology conference, where Zou Juan, senior technical expert of Alibaba Cloud Intelligent Video Cloud, delivered a keynote speech on intelligent media production: "The Evolution of 'Cloud-Device Integrated' Intelligent Media Production". The following is the full text of the speech:

[Slide]
Hello everyone, I'm Zou Juan from Alibaba Cloud Video Cloud, responsible for the architecture design and development of the media production platform in Video Cloud. The theme I'm sharing today is "the technological evolution of cloud-device integrated intelligent media production". My sharing has three parts.

Part 1 The evolution of media production technology


The first part is the evolution of media production technology. If we zoom in on the whole video pipeline, it can be abstracted into five links, starting with capture, going through production and management, and ending with distribution and consumption.

[Slide]

Over the past decades, video technology has developed across the whole industry, and in the flow from capture to consumption, every link of the video pipeline has changed greatly.

For example, in the capture stage, shooting used to require a professional camera from Sony or Panasonic, while today we can shoot video with mobile phones. In production, we used to need professional non-linear editing software and desktop tools, or studio vans and hardware switchers for post or real-time production; now we can tap on the phone to beautify a face, or do online editing in the browser.

From the management point of view, the traditional mode required manually cataloging metadata and then going through heavy review. Now we can use intelligence to build a dynamic metadata system, mine the relationships between materials with knowledge graphs, and use intelligent review to reduce review pressure and improve the efficiency of the whole flow.

The whole development path goes from fully manual at the very beginning to today, when intelligence can be integrated into the whole process to improve overall efficiency.

In the past, video production was done by professional organizations, such as TV stations or film and television production companies. Now every ordinary person can make videos. The overall trend is from manual to intelligent, from the few to the many.

Finally, distribution and consumption: from the very traditional passive reception, the passive mode of watching TV, to today, when we can interact and choose what to watch according to our needs. The whole evolution of media production is in fact a change from a very professional threshold to inclusiveness.

As for production itself, I think there are two driving factors. The first is mobile phone manufacturers: video shooting technology has been applied to phones more broadly and deeply, so we can shoot very high-definition video on a phone.

The other is short-video platforms such as Douyin (TikTok) and Kuaishou, which have raised ordinary people's pursuit of aesthetics and their requirements for video quality and production. So in the whole pipeline, the production part is becoming more and more important.

[Slide]

Let's zoom in on the production process itself and look at media production modes and how they have changed. In the earliest days, video production was a linear editing process: the editor had to record while playing back.

Even the earliest film production literally meant cutting film: making a positive print of the film, cutting it with scissors, and splicing it with tape. In the 1980s and 1990s, specialized production appeared and video editing could be done with software. From there we can divide production into two modes: live (on-site) production and post production.

In the live production stage, we usually produce in real time with equipment such as a studio, a hardware switcher, or an outside-broadcast van; post production is done with non-linear editing software. In that production system, audio, video, and graphics were made separately; there were dedicated subtitle and graphics machines. With further development, we have now added cloud production and rapid production: for example, in live production you can add many elements and do a lot of processing in real time during the broadcast, and replace the hardware switcher with a cloud switcher to do real-time, personalized switching in the cloud.

In the post production stage, we are no longer limited to non-linear editing software: we can do cloud editing in the cloud, and produce video with short-video production tools in mobile apps. The mode of production has changed greatly, with new scenarios and patterns layered on top of the original ones.

The development of cloud computing and AI has added many new modes to the production system and made the content produced much richer. For now, AI plays a supplementary role in changing production modes; in the future, we hope AI can intelligently create videos with stories.

This is the evolution route of our Video Cloud in intelligent production.

[Slide]

To meet the needs of intelligent production, what was the first step?

First, we have many AI algorithm capabilities that can be linked into the production process. For example, vision-related capabilities: shot segmentation, character recognition, video segmentation, and recognition of the main subject in video frames. There are also speech recognition and speech synthesis; color-related capabilities such as color analysis and color matching; and content-related capabilities such as smart covers, which can be static or dynamic. These are the atomic AI capabilities that may be used in this field, and our first step was to expose these atomic capabilities through APIs.
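As a rough illustration of what exposing one atomic capability through an API can look like, here is a minimal TypeScript sketch. The endpoint and the request/response shapes are hypothetical placeholders, not Alibaba Cloud's actual interface.

```typescript
// Minimal sketch of calling an "atomic" AI capability over HTTP.
// Endpoint, request, and response shapes are hypothetical placeholders.

interface SmartCoverRequest {
  videoUrl: string;   // source video to analyze
  coverCount: number; // how many candidate covers to return
  dynamic: boolean;   // static image vs. short animated cover
}

interface SmartCoverResponse {
  covers: { url: string; score: number; timeOffsetSec: number }[];
}

async function requestSmartCover(req: SmartCoverRequest): Promise<SmartCoverResponse> {
  const resp = await fetch("https://example.com/api/v1/smart-cover", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(req),
  });
  if (!resp.ok) throw new Error(`smart-cover failed: ${resp.status}`);
  return (await resp.json()) as SmartCoverResponse;
}
```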

The second stage was building an intelligent experience hall, because the atomic AI capabilities are hidden in the backend: if we only expose APIs, there is no way to give people an intuitive feel for them.

So in the second stage we built the experience hall, where many customers can try these capabilities and see the effects. After that stage, we found that some customers were more interested in certain points: AI has many capabilities, but customers in different scenarios focus on different things.

We abstracted several scenarios and several applications, from content planning to creative packaging and management. Customers can submit feedback based on the experience hall, and from that feedback we understand their needs.

So we turned it into a real cloud service process; that is the fourth stage. Taking an atomic AI capability and exposing it as an API is still far from truly providing a cloud service; the gap in between is huge. So we did some system building: we built the basic metadata layer, provided tag libraries, a people library, and a shot library, and built data service systems at the project level, including logging and monitoring. Once this system was finished, it could be considered a service we can provide to customers.

At the fifth stage, we found that providing these services stably is still not enough. What the customer needs may not be a face-recognition result, but a solution to a problem in a real situation, so we had to move to the next stage: combine these AI services with scenes so that they work for production itself. Here we abstracted some scenes: image-and-text-to-video synthesis, the template factory, template-based video production, live-stream clipping, smart subtitles, intelligent dubbing, and so on. These scenarios are what customers ultimately need. So in stage five, we combined production with AI and delivered a wave of scene-based production services.

In the whole process, we rely on systems such as the media asset system, the editing system, and the copyright system to do task scheduling and policy analysis, and the services in different scenarios are implemented with different strategies. So you can see that our Video Cloud's path to intelligent production is not an imagined process: AI capability must be combined with scenes to truly serve customers.

Part 2 Cloud-device integrated architecture design


[Slide]

Next is our cloud-device integrated architecture design for intelligent production.

Before we talk about the architecture design, I'd like to share the core links and core pain points of media production that we analyzed earlier. In the process of media production, the whole production flow can be abstracted into four stages.

[Slide]

The first stage is the creative process, which I think is still the most time-consuming part of the whole flow.

First of all, the threshold of creativity is relatively high, and the creative process burns a lot of brainpower. During it, I need to collect and organize a lot of material, so the collection and selection of materials becomes a problem. If the job requires many people to work together, sharing material becomes difficult: the raw material has to circulate among many people, but the files may be large. File size is also a prominent problem.

By the third stage I have probably found the material, but I need to edit or package it to achieve the effect I want, and at this point I find the tools very complicated to use.

For instance: one Friday I planned a roughly 4-minute video. The planning took me about 4 hours, collecting the material took another two hours, and the final editing and packaging took me hours again. I started at noon on Friday, and the video finally came out at 2 a.m. on Saturday.

So: the tools are complex, the material is huge and inconvenient to transfer, and collaboration is awkward. Such scenarios go beyond what one person can produce alone; they take many people working together.

So we designed an architecture like this.

[Slide]

One core point of our architecture is that it covers both the cloud and the device, and it is not the commonly understood SaaS-style tool architecture, but a very open cloud-plus-device architecture whose parts can be used separately or combined.

First, the middle part is the production tool. This part is the easiest to think of, because before cloud editing we all used client tools.

Our tool is abstracted into three components. At the heart is the storyboard component, that is, the timeline. There are two sub-components: the player, because the editing process is previewed on the player, and the effect-editing component. Together these components handle editing of video and audio, graphic overlays, and subtitles with all kinds of effects.

At the heart of it all is the preview rendering engine. These make up the device-side component of the production tool. On the device side, at first we only built the web end and the mobile end, and initially they did not share a unified timeline. We eventually arrived at the current architecture: the early one was simple, considering only the web end with no collaboration between ends, while now we have a unified multi-terminal architecture.

On the right as a whole is the server side of our production system, which divides the cloud service into three components. At its heart is the timeline processing center: when I receive a timeline, it carries a lot of track material and effects that I need to process, because the timeline may have been submitted by a client directly through the API, and its parameters may have many problems.

If I simply and crudely rejected it, the whole experience would be bad. So on the server side we do a lot of fault-tolerant checking, completion, and prediction, to bring the timeline to the state the customer intends. The template factory then lowers the overall threshold. Rendering and synthesis is the ultimate hard power: we support multi-layer video and multi-track mixing, and an intelligent engine that schedules jobs to different underlying special-effects engines for video rendering.
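A minimal sketch of that fault-tolerant completion idea, assuming a clip whose in/out points may be missing or out of range; instead of rejecting the request, the server clamps and fills defaults. All field names are illustrative.

```typescript
interface ClipInput {
  mediaUrl: string;
  in?: number;         // start offset within the source, seconds
  out?: number;        // end offset within the source, seconds
  durationSec: number; // known duration of the source media
}

// Fault-tolerant completion: clamp out-of-range values and fill defaults
// rather than rejecting the request outright. Illustrative only.
function completeClip(clip: ClipInput): Required<ClipInput> {
  const inPoint = Math.max(0, clip.in ?? 0);
  let outPoint = clip.out ?? clip.durationSec;
  outPoint = Math.min(Math.max(outPoint, inPoint), clip.durationSec);
  return { mediaUrl: clip.mediaUrl, in: inPoint, out: outPoint, durationSec: clip.durationSec };
}
```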

You can see the device part on the left of the API and the cloud part on the right (in the figure above). The design lets the two parts be used independently: I can use only the device SDK, use only the cloud part, or skip the device SDK entirely and call the cloud directly through API requests.

Of course, the two parts can also be combined into a SaaS-style tool. This is a cloud-device architecture design that can be split or combined: it was designed from the start to be neither a pure PaaS nor a pure SaaS, nor merely a device-plus-cloud structure, but an integrated yet detachable one. On top of this structure sit services and pages packaged on it, a part that can be done by Alibaba Cloud or by our customers. At the outermost are the scenarios: we abstract these technologies into scenarios and apply them there.

The far-left part is what we added later: when we made the first edition, there was no AI part. Adding AI makes it possible to intelligently arrange the timeline, and we abstracted that arrangement into three scenarios.

The first is the creation scenario, the second the enhancement scenario, and the third the replacement scenario. In all three, we analyze the material to get a preliminary timeline, then merge it with the manually built timeline to produce the final timeline.

So you can see that the key point of the whole intelligent production is the design of the timeline, because the timeline describes how multiple tracks and multiple materials are arranged around one idea into a product that fuses multiple effects.

So next, let's talk about the design of our timeline.

[Slide]

For the timeline, there is in fact no industry standard, neither among professional tools nor in the cloud.

Look at the professional NLEs, like the 3A (Apple/Avid/Adobe): each has its own self-defined timeline structure. These professional NLEs are all designed around multiple tracks: first of all tracks, visual tracks.

They have multiple tracks, and their material and effect designs all differ. There is also the traditional EDL design, which is relatively simple: it is single-track only and defines only material, not effects, because effect descriptions differ between manufacturers. Based on this situation, we made a design that cloud editing can reuse. At the heart of our timeline are four elements: track, material, effect, and stage, with trade-offs and balance among them.

First of all, special effects are complicated. In some professional designs, the effects track is independent and appears on its own. In our design, an effects track does not have to appear independently; an effect can appear as an attribute of the video material. This reduces complexity for cloud and internet users.

At the same time, we keep the design of track material, and the track material points to the original video only by reference. This increases applicability; otherwise the whole timeline design would become very bloated.

In addition, for later scalability, we made the whole timeline multi-track. At the beginning, a lot of intelligent production was designed single-track, but in our first design we already considered multi-track, because a multi-track design ensures that later program iterations will not require a subversive rebuild on top of a weak foundation.

So from the beginning we designed multiple tracks by material type. Finally, we have a canvas for output, the output stage design, which is automated, personalized, and customizable: with no stage layout given, it outputs automatically according to the resolution of the original material, and you can also customize the layout by specifying it.

A cloud design must consider many different scenarios. Most scenes need 4:3, 16:9, 9:16, or 3:4, but some special scenes need custom resolutions. So our whole design is a trade-off and balance among track, effect, stage, and material.

(In the figure) on the left are the four elements of the timeline, the core elements of our entire design: the timeline is abstracted into four progressive levels. A timeline has multiple tracks, each track has multiple materials, and each material has multiple effects. The effects can be arranged by people or by machine. Finally, everything is exported to the stage, the canvas.

This is the form of the final output video, and these four elements are the heart of the timeline design.
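To make the four-level structure concrete, here is a sketch of such a data model in TypeScript. The field names are assumptions for illustration, not the actual service schema; note how effects hang off clips as attributes rather than occupying a dedicated effects track, and how the output stage can be left to auto-sizing.

```typescript
// A sketch of the four-element timeline model described above.
// Field names are illustrative, not the real service schema.

type TrackKind = "video" | "audio" | "image" | "subtitle";

interface Effect {
  type: string;                    // e.g. "filter", "transition"
  params: Record<string, unknown>; // effect-specific parameters
}

interface Clip {
  mediaId: string;   // reference to the source material, not an embedded copy
  in: number;        // source in-point, seconds
  out: number;       // source out-point, seconds
  effects: Effect[]; // effects attach to the clip instead of a separate track
}

interface Track {
  kind: TrackKind;
  clips: Clip[];
}

interface OutputStage {
  width?: number;    // omitted: derive from the source material resolution
  height?: number;
  layout?: "auto" | "custom";
}

interface Timeline {
  tracks: Track[];   // multi-track from day one, for later extensibility
  stage: OutputStage;
}

// Example: a one-track timeline with a filtered clip, auto-sized output.
const example: Timeline = {
  tracks: [{
    kind: "video",
    clips: [{ mediaId: "clip-001", in: 0, out: 12.5, effects: [{ type: "filter", params: { name: "warm" } }] }],
  }],
  stage: { layout: "auto" },
};
```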

[Slide]

You can imagine that the timeline described above is fairly complicated as a whole. If I had to assemble such a timeline data structure myself, the workload would be heavy. To lower the threshold of using the timeline while preserving professionalism, we designed a template factory.

In the template factory design, we abstract out templates.

These templates are equivalent to abstracting a whole timeline, or a small part of one, and specifying the variable parts with parameters. The template design supports nesting and composition: for example, to make a cool video we need to arrange the material, including transition effects, and add animations or captions, and we can do this with nested and combined templates.

In this way templates can be reused to the greatest extent. The core problem the template factory solves is lowering the threshold of using the timeline and, most importantly, lowering the threshold of creative production. These two designs guarantee the professionalism of the whole production field.

The template factory really shows its value in packaging and use: it lowers the threshold while preserving professionalism, making the entire production design inclusive for everyone who wants to make videos. These two thresholds are what we consider the core of the whole production process.
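A sketch of the template factory idea follows: a template freezes most of a timeline and exposes a few named slots, and templates can nest so that, say, a transition pack and a caption pack combine into one composite. All names are illustrative; Timeline is declared loosely here so the snippet stands alone (see the earlier timeline sketch for the full shape).

```typescript
// Parameterized, nestable templates. Illustrative only.
type Timeline = Record<string, unknown>;

interface Template {
  id: string;
  slots: string[];                 // parameters the caller must fill
  children?: Template[];           // nested / combined sub-templates
  build(params: Record<string, string>): Timeline;
}

function instantiate(tpl: Template, params: Record<string, string>): Timeline {
  // Fail fast if the caller missed a required slot.
  for (const slot of tpl.slots) {
    if (!(slot in params)) throw new Error(`missing template parameter: ${slot}`);
  }
  return tpl.build(params);
}

// Usage: fill the slots instead of assembling the timeline by hand.
const lowerThird: Template = {
  id: "lower-third-v1",
  slots: ["text"],
  build: (p) => ({ tracks: [{ kind: "subtitle", text: p.text }] }),
};
const tl = instantiate(lowerThird, { text: "Hello LVS" });
```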

Based on the results above, this is the data flow of intelligent media production that we designed.

[Slide]

The architecture above is rather dry: pure technical architecture. So how does the data actually flow? How do I get from the raw material to the final synthesized video I want?

The process is like this. On the left is the material, the source of the video I want to make. Raw material comes in many types: audio, video, and text, some subtitles, even HTML code snippets. These are all my sources.

The middle of the flow is the core of the intelligent production link. First, my material goes through a series of AI processing steps to produce structured information.

Before getting structured information, the material is preprocessed. For example, we first analyze the audio/video stream information, including size and format; this information assists as input to the intermediate intelligent production process. With the preprocessing information in hand, I run the intelligent analysis, which is multi-dimensional: the output may be visual information aligned to the time axis, or about time intervals, or speech-related, or color mixes, or pixel sets extracted in real time. With this processed data, I can produce with tools.

Of course, not every tool uses every capability, but all these capabilities can serve as inputs to the tools. There are many kinds of tools, covering mobile and web ends, template-based mass production, and AI-assisted production. In the end we get a series of production effects.
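As a sketch of what that time-axis-aligned structured information might look like, assuming invented dimension and field names:

```typescript
// Time-axis-aligned analysis results feeding the production tools.
// Dimension and label names are invented for illustration.

interface TimeSpanLabel {
  startSec: number;
  endSec: number;
  dimension: "vision" | "speech" | "color" | "subject";
  label: string;      // e.g. "shot-change", "speaker:host", "warm-tone"
  confidence: number; // 0..1
}

interface AnalyzedMaterial {
  mediaId: string;
  width: number;      // from stream preprocessing
  height: number;
  format: string;     // e.g. "mp4"
  labels: TimeSpanLabel[];
}

// Tools consume only the dimensions they need:
function spansFor(m: AnalyzedMaterial, dim: TimeSpanLabel["dimension"]): TimeSpanLabel[] {
  return m.labels.filter((l) => l.dimension === dim);
}
```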

The right part of the intelligent production diagram shows the effect abstractions most commonly used in the production process.

For example, we use multiple image layers (a layer can be video or pictures), multi-track mixing, mixing pictures and text on the same track, material effects such as filters and transitions, real-time matting of foreground characters or subjects for live streams, intelligent subtitles, and intelligent highlight collections, that is, analyzing a video to extract its highlights into a compilation.

Of course, there are also comprehensive production flows that combine the manual and the intelligent to complete the whole production process.

The final output we also abstract into three categories.

  • The first category is finished films for distribution and playback, which we summarize as creation; a highlight collection is a kind of creation.
  • The second category is enhancement: a video had no subtitles, and speech recognition adds them.
  • The third is replacement: the background of the host's live broadcast is not very beautiful, so we replace it with a more attractive one.

Those are the three types of film output. Of course, you can also output material, and material output can be used for secondary production.

These materials are sometimes more valuable than finished films, because they are reusable. Our system can output material as well.

Finally, we are not technically opposed to professional NLEs; we have a cooperative technical relationship with them.

Our model is the internet-style way of new media editing. When a professional occasion calls for it, you can do a rough cut in the cloud and then do the fine arrangement in a professional tool; the timeline can be exchanged between the two, achieving the best overall effect.

In the whole process of media content consumption, feedback from the viewing experience flows back into the AI system. This is a closed loop on the data that pushes the algorithms to keep iterating. At the same time, the content we produce goes back into the media pool, where it serves as input for the next video production. You can see that throughout intelligent media production, Alibaba Cloud's central design idea is: production at the core, AI as the assistant.

Part 3 Production at the core, AI as the assistant


[Slide]

But why do we need AI at all? Why do we still attach so much importance to AI? This is a simple picture, but it reflects how we actually think about AI assisting our production.

[Slide]

At the very beginning, in the most primitive stage, everything was edited by people; the timeline was clear and completely human-led. But there are scenes where human leadership is too time-consuming or simply not feasible.

For example, kindergarten surveillance video. Parents say they especially want to see their children's performance in kindergarten, but finding your own child frame by frame in surveillance video is very hard. When you have to handle a huge volume of video, you find there is no way to do the recognition manually; the yield would be very low.

When we move from manual arrangement to mass production and need to greatly improve efficiency, we are bound to do it through a combination of cloud computing and AI.

In this whole process we use the capabilities of AI. I think this is also AI's greatest charm and value: combined well with cloud computing, it can help large-scale production and massive material analysis, and improve the efficiency of media production.

Let me share how AI technology integrates with the production process through three practical examples.

[Slide]

This is an example of directing a broadcast in the cloud. In traditional broadcasting there may be many seats on site, many cameras, and a lot of video footage.

But what we see on TV is only those channels, so a lot of video material is wasted: what we watch is the picture the live director chose to give us, while much footage never gets used.

So we built an architecture for directing over the cloud. The technical logic is this: first, we still ingest the video as live streams through the live broadcast center. Then we create multiple cloud director instances, and in each instance a different perspective can be used to direct the desired scene.

Because cloud directing can be distributed over the internet, the utilization of the original live streams and material is very high. We can also record the streams and enter the live-recording flow, doing fast AI processing on the live streams.

Before the Winter Olympics broadcasts, there was an example at the Youth Olympic Games, where we covered three sports. For those three events, we tracked the athletes' trajectories and analyzed them in the cloud; then, for each athlete's highlight shots in each segment, we used AI processing and cloud editing to quickly generate material, turned the material into a video stream, and fed it back as an input to the cloud director, which is equivalent to gaining one more live-stream input.

In other words, real-time technology automatically generates the replay collection, and effects can be added between shots. If we set aside the remaining gap in full real-time performance compared with a hardware switcher, the whole production mode is already very close to the traditional one.
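Schematically, the loop just described might look like the following sketch: live stream in, AI-detected highlights cut into a replay reel, reel fed back as another switcher input. Every function body here is a placeholder stub, not a real service call.

```typescript
// Schematic of the replay loop. All implementations are stubs.

interface Segment { startSec: number; endSec: number; athleteId?: string }

async function detectHighlights(liveStreamUrl: string): Promise<Segment[]> {
  // Stub: a real system would run tracking/analysis on the stream.
  return [{ startSec: 120, endSec: 135, athleteId: "athlete-7" }];
}

async function cloudEditToStream(src: string, segments: Segment[]): Promise<string> {
  // Stub: a real system would render the segments into a new stream.
  return `${src}#reel-${segments.length}`;
}

async function addSwitcherInput(switcherId: string, streamUrl: string): Promise<void> {
  // Stub: a real system would register the stream with the cloud switcher.
  console.log(`switcher ${switcherId}: new input ${streamUrl}`);
}

async function feedReplayReel(switcherId: string, liveUrl: string): Promise<void> {
  const highlights = await detectHighlights(liveUrl);           // AI analysis
  const reelUrl = await cloudEditToStream(liveUrl, highlights); // fast cloud edit
  await addSwitcherInput(switcherId, reelUrl);                  // back into the mix
}
```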

The charm of this is that we can make use of many more live streams. At some events, athletes from a certain country may not finish in the top three and get little screen time, yet people of that country care greatly about their own athletes. With this technology, every organization can become a director and run its own broadcast from the live streams. So cloud directing connects AI capability with real-time production and offline or post production, lets our system be used at scale, and lets every live stream realize its value.

This is one application of cloud directing technology.

[Slide]

This second example is also widely used. When making films, we cannot give every program a totally different idea. When I need to replicate my ideas, but less rigidly, I badly need this scenario: template-based film production backed by a large material library.

We've talked about the library before: it may contain live streams, offline video files, pure audio (human voice or background music), subtitles (external subtitle files or banner text), various pictures with text information, and even code, for example HTML code segments or canvas structures inside my code. These are the materials we produce with. Given these materials, how do we make the program?

We also need a template library, a library in concept, with a designer ecosystem behind it: designers design many templates in it. Then we use AI to take template production a step further. Where is the advance? We don't want to apply these templates intact, without any changes.

For example, a designer has created a background of bubbles bouncing back and forth that needs to fuse with my foreground image. When designing the bubbles, the designer can only fix the color scheme and a certain motion trajectory.

But in actual synthesis, if I compose every picture with this same background, the background may turn out to clash with my picture.

How can I use AI to make an improvement here?

We may analyze the color of the picture, the tonality of the whole frame, and the trajectory in the template. Through analysis, we bind the current material's segmented analysis parameters to the template's characteristics, combining parameter-level changes with the characteristics of my material. In this way the basic template is split into many personalized templates, one for each different material. The personalized template is then combined with the material set. At the front left is my complete material set, which may be massive; which materials will I use to make my video? There is a selection process here.

Selection consists of two parts: search and interception. The search process is where AI can be deeply involved; it can be customized by scene, and the AI analysis may be content-based, keyword-based, even knowledge-graph-based. After searching, interception decides which part of the video to take, based on my theme and the video content. If I'm making a character-related video, the material I get is footage related to the characters; if I want an action type, such as sports events for a highlights collection, I focus on motion pictures or shots related to camera movement.

We do it through the combination of the two parts: searching the massive material library for the material set each production needs, and using AI to split one template into personalized templates. Then we combine the template with the material set; that is our raw input. Finally we build the timeline from this combination.
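A sketch of that two-part combination, under assumed types and an assumed color rule: select material by search-then-intercept, derive a personalized template variant from the material's analyzed features, then pair them for timeline building.

```typescript
// Illustrative only: field names and the color rule are assumptions.

interface MaterialFeatures {
  mediaId: string;
  tags: string[];        // content/keyword labels from analysis
  dominantColor: string; // e.g. "#e8a13c"
  highlight: { startSec: number; endSec: number };
}

// Search: keep materials whose labels match the theme keyword.
function search(pool: MaterialFeatures[], keyword: string): MaterialFeatures[] {
  return pool.filter((m) => m.tags.includes(keyword));
}

// Personalize: adapt the base template's parameters to each material,
// e.g. tint the designer's bubble background toward the frame's palette.
function personalize(base: Record<string, string>, m: MaterialFeatures): Record<string, string> {
  return { ...base, backgroundColor: m.dominantColor };
}

// Pair each selected material (intercepted to its highlight span) with
// its own personalized template variant, ready for timeline building.
function combine(pool: MaterialFeatures[], keyword: string, base: Record<string, string>) {
  return search(pool, keyword).map((m) => ({
    mediaId: m.mediaId,
    clip: m.highlight,                // the intercepted span
    templateParams: personalize(base, m),
  }));
}
```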

The timeline is the basis for the final synthesis. The composed and rendered timeline can produce video or pan-media images. This is our example of templated film production. Its core is that every part of the process can be replaced by AI: AI is not just for the initial screening of material; it can be deeply involved in the whole production process.

[Slide]

The third example was just mentioned: sometimes we are not making films at all; my goal is to make materials, which can themselves be reused. There is a big difference between making materials and making films.

Take film making: I use many different effects to ensure visual impact. But when making materials, I try to ensure a clean result and don't want too much effect decoration. My core concern is what is in this video and which fragments can be reused.

Also, I use the principles and benchmarks of reuse to build my selection strategy. My material comes from two sources, in two categories: live streams and video files. After a preprocessing step of intelligent video production, you can see the focus is completely different from film production.

Film production focuses on all kinds of effects, arrangements, and the overlay of multiple tracks. Material production focuses on the video itself, and an important factor is a very rigorous analysis of the shots. The two core elements of shot language are the shot scale and the shooting method.

Shot scale divides into extreme long shots, long shots, medium shots, close shots, and close-ups, and each type of shot is used differently. We may use AI to identify the shot scale and tag the frames with it, not only on the timeline but on the video frames themselves, as in the sketch below.
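Here is a sketch of the five shot scales as a label type, plus a toy tagging rule. The thresholds are invented for illustration; a real system would use a trained classifier rather than a hand-written heuristic.

```typescript
// The five shot scales mentioned above. Thresholds are invented.

type ShotScale = "extreme-long" | "long" | "medium" | "close" | "close-up";

interface ShotSegment {
  startSec: number;
  endSec: number;
  subjectHeightRatio: number; // subject height / frame height, 0..1
}

function classifyScale(seg: ShotSegment): ShotScale {
  const r = seg.subjectHeightRatio;
  if (r < 0.15) return "extreme-long";
  if (r < 0.35) return "long";
  if (r < 0.6) return "medium";
  if (r < 0.85) return "close";
  return "close-up";
}
```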

Another very important dimension is the shooting method.

When we make different types of video, the focus of shooting differs. For a story-driven program, I pay close attention to the shooting method: the sequence of camera movements must not be chaotic; it should connect the viewer's attention in order instead of jumping all over the place. So we study the shooting method, analyzing the shot language in terms of fixed versus moving shots, and extract the shooting methods of different segments. In some scenarios, though, we deliberately blend these shooting methods.

For example, in a compilation of cool music or dance shows, I deliberately create this kind of dazzling, shifting perspective to achieve a cool effect.

So we analyze the shot language according to the scene combination to make shots recognizable, then attach labels by scene and shooting method. Only then are we prepared for the subsequent program and video production.

Meanwhile, we still need basic libraries: databases, a shot tag library, and a video library of the shots themselves. And because character creation is a very important part of a program,

we also build a character library. Based on these basic libraries, production preprocessing, and shot analysis, we can process the material's intelligent timeline. Analyzing the material gives us the shot-scale results, the shooting-method results, and the content-feature extraction results. With these in hand, we can start building the timeline.

In building the timeline, the intermediate results at this stage may be very fragmentary. Among these fragments, which pictures can be reused in the end? Here we combine the scene to define thesauri, or feature libraries.

Based on these feature libraries, we generate the structure of the material timeline we need; once we have that structure, we can actually split the material. From a full news program, for instance, we can extract valuable footage. In the traditional industry these fragments are known as clean material. This process is the difference between our intelligent production of materials and the production of finished films.
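A sketch of that splitting step, assuming time-aligned labels and a scene feature library (a set of tags that mark reusable content); names and the minimum-duration rule are illustrative.

```typescript
// Cut the source into "clean", reusable clips. Illustrative only.

interface Label { startSec: number; endSec: number; tag: string }

function splitReusable(labels: Label[], featureLib: Set<string>): Label[] {
  // Keep segments whose tag is in the feature library and that are long
  // enough to be worth reusing (2s minimum is an invented rule).
  return labels.filter((l) => featureLib.has(l.tag) && l.endSec - l.startSec >= 2);
}

// Example: from a news program, keep only the field footage.
const clips = splitReusable(
  [
    { startSec: 0, endSec: 12, tag: "anchor-intro" },
    { startSec: 12, endSec: 47, tag: "field-footage" },
  ],
  new Set(["field-footage"]),
);
```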

[Slide]

Those are our three examples of how AI capability fits into our production process in different scenarios.

Finally, let me summarize the technical layers of our Video Cloud intelligent media production. In our layered design (in the figure), the bottom right is the core: cloud production capability.

This cloud production capability is the hard currency and core competence: cutting and splicing, multi-track overlay, multi-track mixing, mixed graphics and text at multiple frame rates, adaptive fusion of multiple bitrates, subtitle capability, motion graphics capability, effect rendering, filters, transitions, and so on. These all belong to cloud production.

This is the core of the whole intelligent production: without these things, no matter how good the AI or the packaging, there is no foundation.

Above the production capability is the packaging capability we designed, a technical layer that scales the production capability. The first point is scale: through packaging you can extract and abstract things instead of building from scratch every time. The second point is that packaging can be diversified with AI.

For example, with some templates, AI packaging can split one raw material into a variety of effects. Then there is componentization: turning packaging capability into tools or SDKs gives a component effect, which is also our ability to generate video quickly and in batches. In short, production focuses on the core, and packaging focuses on application.

On the left side of the figure is the AI part.

In our entire system, AI is the handle for intelligence and scale: it is deeply integrated into every module of the cloud production and cloud packaging capabilities.

The top layer is the ecosystem part of our technology system: we need to integrate in many directions and handle the last mile.

In this process, we export these capabilities to build an ecosystem. We also have some prospects for this path of intelligence research.

At first we made videos in batches, possibly with templated production, or with AI-assisted content generation based on simple rules.

Those first three points are what we have already done. The fourth point is what we haven't done yet: template recommendation based on scenario understanding (today templates are still chosen by people) and AI filter selection based on video image analysis (today both templates and filters are specified by us).

We hope to use AI to do these things. My ultimate vision is that in the future AI can create independently and generate videos with stories.

[Slide]

Finally, our view on the future of intelligent production systems.

We think the future production system must be both of the following.

First, it will be more and more professional. Judging from the demand for video: at first, internet video was single-track production, while now it may be multi-track, multi-effect, multi-material, multi-type production.

The whole video production chain will become more and more professional. At the same time, more and more people are participating in video production, an inclusive process. Professional and inclusive seem to conflict, but they are not contradictory.

Through our core design and foundation building, the capabilities of the whole industry, including AI, will further improve, making future professional production possible.

Inclusiveness is what we achieve through a variety of tools: tool-based production lowers the thresholds of creativity and use, letting everyone enter the production process and make their own videos.

That is our overall view of the future. More specifically: first, in today's cloud-device collaboration there is WYSIWYG editing, but the rendering effects are not uniform; we hope that in the future the device-side and cloud-side effects will be consistent. This is a future trend where cloud rendering technology may be used. Second, real-time production and post production are currently rather fragmented; we hope the two can be fully integrated in the future.

Third, with larger screens and the arrival of 5G, ultra-high-definition production has been tried in some scenes, and professional production is also a direction.

Fourth, the inclusive process may later evolve into creation by everyone: video production technology is no longer so-called high-end technology but an inclusive one, so that everyone can make their own videos. Finally, many peers I've talked to in the professional production field also hope AI can truly evolve to the stage of creating videos with stories.

That's all for today's sharing. Thank you.


If you are interested in intelligent media production, welcome to join the WeChat communication group by scanning the QR code.

The Alibaba Cloud Video Cloud official technology account follows video cloud industry and technology trends, creating "new content" and "new interaction".

Copyright notice
This article was created by [osc u.97wmavr6]. Please include a link to the original when reprinting. Thank you.