[C++] Starting from scratch: building a player with only FFmpeg and the Win32 API (Part 1)

2021-05-04 12:08:15 The Last Gentleman

Preface

At first I just wanted to write a program that reads a video file and plays it as character animation. My idea was simple: as long as there's a ready-made library to parse the video file into raw picture data frame by frame, all I need to do is read each pixel's RGB values, compute its brightness, map that brightness to a character, and stitch all the characters together for display. Done. So I started studying how to use the FFmpeg library. Tutorials are easy to find online, and I soon got what I wanted.

image

Then I thought: why not just make a proper player? As a result, I ran into a whole pile of problems. The purpose of this article is to walk through those problems along with my own answers, and share the whole process.

Since I don't plan on cross-platform support, there's no build system; just open Visual Studio 2019 and start a new project. I'm not out to demonstrate great software engineering or perfect error handling, so the code goes all-in on being direct and simple. The point is to make clear what to do and how to get started; the rest is yours to play with.

I originally meant to finish this in one article, but it got too long, and the DirectX 11 rendering part later on is particularly involved while DirectX 9 is easy. So this first article ends with dx9, and the second covers dx11.

A simple window

It's 2021; nobody writes a real product's GUI directly with the Win32 API anymore. The reason I still do it here is that I want to lay the low-level things bare without pulling in extra dependencies such as Qt, SDL, or other GUI libraries; besides, I never really meant to turn this into a practical tool. In fact my first version was built on SDL 2.0; only later did I rip it out and write the rendering code myself.

image

First things first: in Project Properties - Linker - System - SubSystem, choose Windows (/SUBSYSTEM:WINDOWS) so that no console window appears when the program starts. It doesn't really matter, though; even with Console (/SUBSYSTEM:CONSOLE), the program's features work just fine.

The core function for creating a window is CreateWindow (strictly speaking, CreateWindowA or CreateWindowW, the names of the functions exported by User32.dll; for convenience I'll refer to functions by the macros defined in the Windows headers from here on, so keep that in mind). It takes a full 11 parameters, which is rather off-putting.

auto window = CreateWindow(className, L"Hello World Title", WS_OVERLAPPEDWINDOW, CW_USEDEFAULT, CW_USEDEFAULT, 800, 600, NULL, NULL, hInstance, NULL);

className is the window class name, which I'll get to in a moment. L"Hello World Title" is the text shown in the window's title bar. WS_OVERLAPPEDWINDOW is a macro for the window style; if you wanted, say, a borderless window with no title bar, you'd use other styles. CW_USEDEFAULT, CW_USEDEFAULT, 800, 600 are the window's position and size; here we take the default position and specify the size ourselves. The remaining parameters don't matter much for now, and passing NULL for all of them is fine.

Before calling CreateWindow, you usually call RegisterClass to register a window class, under whatever class name you like.

auto className = L"MyWindow";
WNDCLASSW wndClass = {};
wndClass.hInstance = hInstance;
wndClass.lpszClassName = className;
wndClass.lpfnWndProc = [](HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam) -> LRESULT {
	return DefWindowProc(hwnd, msg, wParam, lParam);
};

RegisterClass(&wndClass);

The WNDCLASSW structure has plenty of fields to set, but two are essential: lpszClassName and lpfnWndProc (hInstance isn't strictly required here). lpszClassName is the class name, and lpfnWndProc is a function pointer that gets called whenever the window receives a message. Here we can use a C++11 lambda expression: assigning it to lpfnWndProc converts it automatically to a plain function pointer, and we don't even have to worry about the stdcall/cdecl calling-convention mismatch. The one precondition is that the lambda must capture nothing.

return DefWindowProc(hwnd, msg, wParam, lParam); hands the message back to Windows for default processing; that's what makes the × in the top-right corner close the window and gives you the default maximize and minimize behavior. You can take any of these behaviors over yourself, and later we'll handle mouse, keyboard, and other messages right here.
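
To get a feel for taking over a message yourself, here's a minimal sketch (WM_LBUTTONDOWN is just an illustrative choice, not something the player needs): handle what you care about, and pass everything else back to DefWindowProc.

wndClass.lpfnWndProc = [](HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam) -> LRESULT {
	switch (msg) {
	case WM_LBUTTONDOWN:
		// Our own handling for a left mouse click.
		MessageBox(hwnd, L"Clicked!", L"Hint", MB_OK);
		return 0;
	default:
		// Everything else keeps the default behavior.
		return DefWindowProc(hwnd, msg, wParam, lParam);
	}
};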

The newly created window is hidden by default, so call ShowWindow to display it, and finally run a message loop to keep the window receiving messages.

ShowWindow(window, SW_SHOW);

MSG msg;
while (GetMessage(&msg, window, 0, 0) > 0) {
	TranslateMessage(&msg);
	DispatchMessage(&msg);
}

Finally, don't forget to call SetProcessDPIAware() at the very start of the program; it prevents Windows from stretching the window itself, which blurs the picture, when display scaling is above 100%.
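
On newer systems there's a finer-grained alternative; a sketch, assuming you're on Windows 10 1703 or later with a matching SDK:

// Per-monitor V2 awareness also copes with windows moving between monitors
// that use different scaling factors.
SetProcessDpiAwarenessContext(DPI_AWARENESS_CONTEXT_PER_MONITOR_AWARE_V2);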

The complete code looks like this:

#include <stdio.h>
#include <Windows.h>

int WINAPI WinMain (
	_In_ HINSTANCE hInstance,
	_In_opt_ HINSTANCE hPrevInstance,
	_In_ LPSTR lpCmdLine,
	_In_ int nShowCmd
) {
	SetProcessDPIAware();

	auto className = L"MyWindow";
	WNDCLASSW wndClass = {};
	wndClass.hInstance = NULL;
	wndClass.lpszClassName = className;
	wndClass.lpfnWndProc = [](HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam) -> LRESULT {
		return DefWindowProc(hwnd, msg, wParam, lParam);
	};

	RegisterClass(&wndClass);
	auto window = CreateWindow(className, L"Hello World Title", WS_OVERLAPPEDWINDOW, CW_USEDEFAULT, CW_USEDEFAULT, 800, 600, NULL, NULL, NULL, NULL);

	ShowWindow(window, SW_SHOW);

	MSG msg;
	while (GetMessage(&msg, window, 0, 0) > 0) {
		TranslateMessage(&msg);
		DispatchMessage(&msg);
	}

	return 0;
}

The result:

image

Introducing FFmpeg

We don't need to bother compiling from source; just grab the prebuilt binaries from https://github.com/BtbN/FFmpeg-Builds/releases. Make sure you download a build with shared in the name, e.g. ffmpeg-N-102192-gc7c138e411-win64-gpl-shared.zip. After unpacking there are three folders, bin, include, and lib, matching the three things we need to configure.

Next, create two environment variables (changing the directories to wherever you actually unpacked):

  • FFMPEG_INCLUDE = D:\Download\ffmpeg-N-102192-gc7c138e411-win64-gpl-shared\include
  • FFMPEG_LIB = D:\Download\ffmpeg-N-102192-gc7c138e411-win64-gpl-shared\lib

Note that every time you change an environment variable you need to restart Visual Studio. Then configure the Include Directories and Library Directories under VC++ Directories:

image

Now the FFmpeg headers can be included in the code, and it compiles normally:

extern "C" {
#include <libavcodec/avcodec.h>
#pragma comment(lib, "avcodec.lib")

#include <libavformat/avformat.h>
#pragma comment(lib, "avformat.lib")

#include <libavutil/imgutils.h>
#pragma comment(lib, "avutil.lib")

}

Finally, add D:\Download\ffmpeg-N-102192-gc7c138e411-win64-gpl-shared\bin to the PATH environment variable so the program can load FFmpeg's DLLs correctly.
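
As a quick smoke test that linking and DLL loading both work, you can print the library versions at startup. avcodec_version and avformat_version are standard FFmpeg calls; the printf assumes a console (or /SUBSYSTEM:CONSOLE, which as noted above works fine too):

// If a DLL is missing, the program won't even get this far.
printf("avcodec version: %u\n", avcodec_version());
printf("avformat version: %u\n", avformat_version());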

Decode the first frame

Next we write a function that grabs the pixels of the first frame.

AVFrame* getFirstFrame(const char* filePath) {
	AVFormatContext* fmtCtx = nullptr;
	avformat_open_input(&fmtCtx, filePath, NULL, NULL);
	avformat_find_stream_info(fmtCtx, NULL);

	int videoStreamIndex;
	AVCodecContext* vcodecCtx = nullptr;
	for (int i = 0; i < fmtCtx->nb_streams; i++) {
		AVStream* stream = fmtCtx->streams[i];
		if (stream->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) {
			const AVCodec* codec = avcodec_find_decoder(stream->codecpar->codec_id);
			videoStreamIndex = i;
			vcodecCtx = avcodec_alloc_context3(codec);
			avcodec_parameters_to_context(vcodecCtx, fmtCtx->streams[i]->codecpar);
			avcodec_open2(vcodecCtx, codec, NULL);
		}
	}

	while (1) {
		AVPacket* packet = av_packet_alloc();
		int ret = av_read_frame(fmtCtx, packet);
		if (ret == 0 && packet->stream_index == videoStreamIndex) {
			ret = avcodec_send_packet(vcodecCtx, packet);
			if (ret == 0) {
				AVFrame* frame = av_frame_alloc();
				ret = avcodec_receive_frame(vcodecCtx, frame);
				if (ret == 0) {
					// Got a complete frame: release everything else and return it.
					av_packet_free(&packet);
					avcodec_free_context(&vcodecCtx);
					avformat_close_input(&fmtCtx);
					return frame;
				}
				else if (ret == AVERROR(EAGAIN)) {
					// The decoder needs more packets before it can emit a frame;
					// free this frame and fall through to read the next packet.
					av_frame_free(&frame);
				}
			}
		}

		// av_packet_free also frees the AVPacket struct that av_packet_alloc
		// allocated each iteration; av_packet_unref alone would leak it.
		av_packet_free(&packet);
	}
}

The process is straightforward:

  1. Obtain an AVFormatContext; it represents the container of the video file.
  2. Obtain the AVStream. A video file can hold multiple streams (video, audio, and other resources); for now we only care about the video stream, hence the check stream->codecpar->codec_type == AVMEDIA_TYPE_VIDEO.
  3. Obtain the AVCodec, the decoder that matches the stream.
  4. Obtain an AVCodecContext, the decoding context for that decoder.
  5. Enter the decoding loop: call av_read_frame to get an AVPacket, check whether it belongs to the video stream, and if so hand it to the AVCodecContext with avcodec_send_packet. Sometimes one packet isn't enough to decode a complete frame, in which case we read the next packet and call avcodec_send_packet again until decoding succeeds.
  6. Finally, the AVFrame obtained through avcodec_receive_frame holds the raw picture data.

In many videos the first frame is entirely black, which is awkward for testing, so let's tweak the code slightly to read a few frames further in.

AVFrame* getFirstFrame(const char* filePath, int frameIndex) {
// ...
	// n is a frame counter (declare int n = 0; before the decoding loop),
	// incremented each time avcodec_receive_frame succeeds
	n++;
	if (n == frameIndex) {
		av_packet_free(&packet);
		avcodec_free_context(&vcodecCtx);
		avformat_close_input(&fmtCtx);
		return frame;
	}
	else {
		av_frame_free(&frame);
	}
// ...
}

The image's width and height can be read directly from the AVFrame:

AVFrame* firstframe = getFirstFrame(filePath.c_str(), 10);

int width = firstframe->width;
int height = firstframe->height;

The raw pixel data we're after lives in AVFrame::data. Its exact layout depends on AVFrame::format, the video's pixel format. Most videos today use YUV420P (AVPixelFormat::AV_PIX_FMT_YUV420P), and for simplicity that's the only format we'll handle.
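
As a minimal sketch of that layout (assuming AV_PIX_FMT_YUV420P), here's how you'd fetch the three samples of the pixel at (x, y). Note that rows are frame->linesize[i] bytes apart, which can be wider than the visible width:

// Y has one sample per pixel; U and V each have one sample per 2x2 block.
uint8_t Y = frame->data[0][y * frame->linesize[0] + x];
uint8_t U = frame->data[1][(y / 2) * frame->linesize[1] + (x / 2)];
uint8_t V = frame->data[2][(y / 2) * frame->linesize[2] + (x / 2)];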

Render the first frame

Contrary to what you might expect, most videos don't store pixels as RGB but as YUV: Y is luminance (brightness), and U and V carry the chrominance. The key point is the subsampling: in the most common scheme, YUV420P, every pixel stores 1 byte for its own Y value, while every 4 pixels share 1 U and 1 V value. So a 1920x1080 image occupies only 1920 * 1080 * (1 + (1 + 1) / 4) = 3110400 bytes, half the size of RGB. This exploits the human eye's sensitivity to brightness and relative insensitivity to color; even with the chroma bandwidth cut down, there's no perceptible distortion.

But Windows has no way to render YUV data directly, so a conversion is needed. To get a picture on screen as quickly as possible, let's use only the Y value for now and show a black-and-white image. Concretely:

struct Color_RGB
{
	uint8_t r;
	uint8_t g;
	uint8_t b;
};

AVFrame* firstframe = getFirstFrame(filePath.c_str(), 30);

int width = firstframe->width;
int height = firstframe->height;

// Note: indexing data[0] by i assumes linesize[0] == width (no row padding),
// which holds for common resolutions but not for arbitrary ones.
vector<Color_RGB> pixels(width * height);
for (int i = 0; i < pixels.size(); i++) {
	uint8_t r = firstframe->data[0][i];
	uint8_t g = r;
	uint8_t b = r;
	pixels[i] = { r, g, b };
}

YUV420P stores the Y, U, and V values in three separate arrays, and AVFrame::data[0] is the Y-channel array. Simply writing the luminance into R, G, and B all at once gives a black-and-white picture. Next, write a function that renders this RGB array, starting with the most traditional approach, GDI:

void StretchBits (HWND hwnd, const vector<Color_RGB>& bits, int width, int height) {
	auto hdc = GetDC(hwnd);
	for (int x = 0; x < width; x++) {
		for (int y = 0; y < height; y++) {
			auto& pixel = bits[x + y * width];
			SetPixel(hdc, x, y, RGB(pixel.r, pixel.g, pixel.b));
		}
	}
	ReleaseDC(hwnd, hdc);
}

Call our new StretchBits function after ShowWindow, and you'll see the image gradually appear in the window:

//...
ShowWindow(window, SW_SHOW);

StretchBits(window, pixels, width, height);

MSG msg;
while (GetMessage(&msg, window, 0, 0) > 0) {
	TranslateMessage(&msg);
	DispatchMessage(&msg);
}
// ...

image

One problem is obvious: rendering is far too slow. It takes several seconds to show a single frame, which is utterly unacceptable for ordinary 24fps video, so let's optimize the StretchBits function step by step.

Optimize GDI Rendering

SetPixel is obviously inefficient; the better choice is the StretchDIBits function, though it isn't as easy to use.

void StretchBits (HWND hwnd, const vector<Color_RGB>& bits, int width, int height) {
	auto hdc = GetDC(hwnd);
	BITMAPINFO bitinfo = {};
	auto& bmiHeader = bitinfo.bmiHeader;
	bmiHeader.biSize = sizeof(bitinfo.bmiHeader);
	bmiHeader.biWidth = width;
	bmiHeader.biHeight = -height;
	bmiHeader.biPlanes = 1;
	bmiHeader.biBitCount = 24;
	bmiHeader.biCompression = BI_RGB;

	// Each DIB scanline must be DWORD-aligned; with 24-bit pixels that's only
	// automatic when width * 3 is a multiple of 4.
	StretchDIBits(hdc, 0, 0, width, height, 0, 0, width, height, &bits[0], &bitinfo, DIB_RGB_COLORS, SRCCOPY);
	ReleaseDC(hwnd, hdc);
}

Note bmiHeader.biHeight = -height; the minus sign is required here, otherwise the picture comes out upside down; the BITMAPINFOHEADER structure documentation covers this in detail. With that, per-frame rendering time drops to a few milliseconds.

Playing a continuous picture

First let's take the getFirstFrame function apart, pull out the decoding loop, and split it into two functions: InitDecoder and RequestFrame.

struct DecoderParam
{
	AVFormatContext* fmtCtx;
	AVCodecContext* vcodecCtx;
	int width;
	int height;
	int videoStreamIndex;
};

void InitDecoder(const char* filePath, DecoderParam& param) {
	AVFormatContext* fmtCtx = nullptr;
	avformat_open_input(&fmtCtx, filePath, NULL, NULL);
	avformat_find_stream_info(fmtCtx, NULL);

	AVCodecContext* vcodecCtx = nullptr;
	for (int i = 0; i < fmtCtx->nb_streams; i++) {
		AVStream* stream = fmtCtx->streams[i];
		// Check the stream type via codecpar, as in getFirstFrame; calling
		// avcodec_find_decoder first would crash on streams without a decoder.
		if (stream->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) {
			const AVCodec* codec = avcodec_find_decoder(stream->codecpar->codec_id);
			param.videoStreamIndex = i;
			vcodecCtx = avcodec_alloc_context3(codec);
			avcodec_parameters_to_context(vcodecCtx, stream->codecpar);
			avcodec_open2(vcodecCtx, codec, NULL);
		}
	}

	param.fmtCtx = fmtCtx;
	param.vcodecCtx = vcodecCtx;
	param.width = vcodecCtx->width;
	param.height = vcodecCtx->height;
}

AVFrame* RequestFrame(DecoderParam& param) {
	auto& fmtCtx = param.fmtCtx;
	auto& vcodecCtx = param.vcodecCtx;
	auto& videoStreamIndex = param.videoStreamIndex;

	while (1) {
		AVPacket* packet = av_packet_alloc();
		int ret = av_read_frame(fmtCtx, packet);
		if (ret == 0 && packet->stream_index == videoStreamIndex) {
			ret = avcodec_send_packet(vcodecCtx, packet);
			if (ret == 0) {
				AVFrame* frame = av_frame_alloc();
				ret = avcodec_receive_frame(vcodecCtx, frame);
				if (ret == 0) {
					av_packet_free(&packet);
					return frame;
				}
				else if (ret == AVERROR(EAGAIN)) {
					av_frame_free(&frame);
				}
			}
		}
		else if (ret != 0) {
			// av_read_frame failed (e.g. end of file): return instead of
			// spinning forever; the caller gets nullptr.
			av_packet_free(&packet);
			return nullptr;
		}

		av_packet_free(&packet);
	}
}

Then, in the main function:

// ...
DecoderParam decoderParam;
InitDecoder(filePath.c_str(), decoderParam);
auto& width = decoderParam.width;
auto& height = decoderParam.height;
auto& fmtCtx = decoderParam.fmtCtx;
auto& vcodecCtx = decoderParam.vcodecCtx;

auto window = CreateWindow(className, L"Hello World Title", WS_OVERLAPPEDWINDOW, 0, 0, decoderParam.width, decoderParam.height, NULL, NULL, hInstance, NULL);

ShowWindow(window, SW_SHOW);

MSG msg;
while (GetMessage(&msg, window, 0, 0) > 0) {
	AVFrame* frame = RequestFrame(decoderParam);

	vector<Color_RGB> pixels(width * height);
	for (int i = 0; i < pixels.size(); i++) {
		uint8_t r = frame->data[0][i];
		uint8_t g = r;
		uint8_t b = r;
		pixels[i] = { r, g, b };
	}

	av_frame_free(&frame);

	StretchBits(window, pixels, width, height);

	TranslateMessage(&msg);
	DispatchMessage(&msg);
}
// ...

Run the program now and the picture still doesn't move; it only plays while the mouse keeps moving inside the window. That's because of GetMessage: when the window has no messages, the function blocks and doesn't return until a new message arrives. Moving the mouse over the window effectively feeds it mouse-event messages, which is what keeps the while loop going.

The fix is to use PeekMessage instead, which returns whether or not a message arrived. Let's adjust the message-loop code a little:

// ...
wndClass.lpfnWndProc = [](HWND hwnd, UINT msg, WPARAM wParam, LPARAM lParam) -> LRESULT {
	switch (msg)
	{
	case WM_DESTROY:
		PostQuitMessage(0);
		return 0;
	default:
		return DefWindowProc(hwnd, msg, wParam, lParam);
	}
};
// ...
while (1) {
	BOOL hasMsg = PeekMessage(&msg, NULL, 0, 0, PM_REMOVE);
	if (hasMsg) {
		if (msg.message == WM_QUIT) {
			break;
		}
		TranslateMessage(&msg);
		DispatchMessage(&msg);
	}
	else {
		AVFrame* frame = RequestFrame(decoderParam);

		vector<Color_RGB> pixels(width * height);
		for (int i = 0; i < pixels.size(); i++) {
			uint8_t r = frame->data[0][i];
			uint8_t g = r;
			uint8_t b = r;
			pixels[i] = { r, g, b };
		}

		av_frame_free(&frame);

		StretchBits(window, pixels, width, height);
	}
}

Note that after switching to PeekMessage you have to handle the WM_DESTROY and WM_QUIT messages yourself. Now the picture plays continuously even with the mouse still. But on my laptop's feeble i5-1035G1 the result is worse than a PowerPoint deck. At this point, just flip the VS build configuration from Debug to Release; it's like hitting fast-forward. Compiler optimization on versus off can make a world of difference.

Let me put in a word here for Visual Studio's performance profiling tools; they're simply excellent.

image

You can see clearly which function in the code consumes how much CPU, which makes it very convenient to find the spots most in need of optimization. Here you can see that allocating the vector takes most of the CPU time; we'll deal with that in a moment.

A color picture

FFmpeg has functions that handle the pixel-format conversion for us; we just need to pull in a new header:

// ...
#include <libswscale/swscale.h>
#pragma comment(lib, "swscale.lib")
// ...

Then write a new function that converts the pixel format:

vector<Color_RGB> GetRGBPixels(AVFrame* frame) {
	static SwsContext* swsctx = nullptr;
	swsctx = sws_getCachedContext(
		swsctx,
		frame->width, frame->height, (AVPixelFormat)frame->format,
		frame->width, frame->height, AVPixelFormat::AV_PIX_FMT_BGR24, NULL, NULL, NULL, NULL);

	// Windows expects BGR byte order, hence AV_PIX_FMT_BGR24 above; the
	// Color_RGB field names no longer match the actual byte order.
	vector<Color_RGB> buffer(frame->width * frame->height);
	uint8_t* data[] = { (uint8_t*)&buffer[0] };
	int linesize[] = { frame->width * 3 };
	sws_scale(swsctx, frame->data, frame->linesize, 0, frame->height, data, linesize);

	return buffer;
}

The sws_scale function scales the picture and can convert the pixel format at the same time. We don't need any scaling here, so the width and height stay the same.
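
As an aside, if you did want a smaller picture, a hypothetical call like the following would convert and downscale in one pass (SWS_BILINEAR selects the scaling filter; this sketch isn't used by the player):

SwsContext* halfCtx = sws_getContext(
	frame->width, frame->height, (AVPixelFormat)frame->format,
	frame->width / 2, frame->height / 2, AVPixelFormat::AV_PIX_FMT_BGR24,
	SWS_BILINEAR, NULL, NULL, NULL);
// ... use with sws_scale as above, then sws_freeContext(halfCtx) ...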

Then call it after decoding:

// ...
AVFrame* frame = RequestFrame(decoderParam);

vector<Color_RGB> pixels = GetRGBPixels(frame);

av_frame_free(&frame);

StretchBits(window, pixels, width, height);
// ...

Not bad at all:

image

Next, a small optimization: in Debug mode, the vector's memory allocation seems to cost a lot of performance, so let's allocate the buffer once before the loop.

void GetRGBPixels(AVFrame* frame, vector<Color_RGB>& buffer) {
	// Write into a caller-provided buffer. (Returning the reference parameter
	// by value would copy the whole frame each call and defeat the point.)
	static SwsContext* swsctx = nullptr;
	swsctx = sws_getCachedContext(
		swsctx,
		frame->width, frame->height, (AVPixelFormat)frame->format,
		frame->width, frame->height, AVPixelFormat::AV_PIX_FMT_BGR24, NULL, NULL, NULL, NULL);

	uint8_t* data[] = { (uint8_t*)&buffer[0] };
	int linesize[] = { frame->width * 3 };
	sws_scale(swsctx, frame->data, frame->linesize, 0, frame->height, data, linesize);
}

// ...
InitDecoder(filePath.c_str(), decoderParam);
auto& width = decoderParam.width;
auto& height = decoderParam.height;
auto& fmtCtx = decoderParam.fmtCtx;
auto& vcodecCtx = decoderParam.vcodecCtx;

vector<Color_RGB> buffer(width * height);
// ...
while (1) {
// ...
GetRGBPixels(frame, buffer);
// ...
}

Now it no longer turns into a slideshow even in Debug mode.

The right playback speed

Right now the playback speed is simply whatever the CPU can manage. How do we control the timing of each frame? A simple idea: get the video's frame rate, work out how long each frame should last, and after rendering each frame call the Sleep function to wait. Let's try it:

AVFrame* frame = RequestFrame(decoderParam);

GetRGBPixels(frame, buffer);

av_frame_free(&frame);

StretchBits(window, buffer, width, height);

// framerate.den / framerate.num = duration of one frame in seconds
double framerate = (double)vcodecCtx->framerate.den / vcodecCtx->framerate.num;
Sleep(framerate * 1000);

AVCodecContext::framerate gives the video's frame rate, i.e. how many frames should be rendered per second. It's an AVRational, which is a fraction: num is the numerator, den the denominator. Here we flip it over and multiply by 1000 to get the number of milliseconds to wait per frame.
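
FFmpeg ships a small helper for exactly this; an equivalent sketch using av_q2d (which converts an AVRational to a double):

double fps = av_q2d(vcodecCtx->framerate);  // e.g. 24.0
double frameDurationMs = 1000.0 / fps;      // e.g. ~41.7 ms per frame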

But in practice it feels noticeably slow. That's because decoding and rendering themselves take considerable time, and the Sleep wait stacks on top of that, so the actual interval between frames ends up longer. Let's try to fix it:

// ...
#include <chrono>
#include <thread>
// ...

using namespace std::chrono;
// ...

int WINAPI WinMain (
	_In_ HINSTANCE hInstance,
	_In_opt_ HINSTANCE hPrevInstance,
	_In_ LPSTR lpCmdLine,
	_In_ int nShowCmd
) {
// ...

	auto currentTime = system_clock::now();

	MSG msg;
	while (1) {
		BOOL hasMsg = PeekMessage(&msg, NULL, 0, 0, PM_REMOVE);
		if (hasMsg) {
			// ...
		} else {
			// ...
			
			av_frame_free(&frame);

			double framerate = (double)vcodecCtx->framerate.den / vcodecCtx->framerate.num;
			std::this_thread::sleep_until(currentTime + milliseconds((int)(framerate * 1000)));
			currentTime = system_clock::now();

			StretchBits(window, pixels, width, height);
		}
	}

std::this_thread::sleep_until sleeps until a specified point in time. Thanks to that, the time spent decoding and rendering no longer affects the overall pacing, unless decoding plus rendering a single frame already exceeds the per-frame interval.
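
One refinement worth noting: resetting currentTime to now() after each sleep lets small scheduling errors accumulate. A sketch of a drift-free variant (same assumptions as the code above) advances the deadline by exactly one frame duration instead:

auto frameDuration = duration_cast<system_clock::duration>(
	duration<double>((double)vcodecCtx->framerate.den / vcodecCtx->framerate.num));
currentTime += frameDuration;               // next frame's deadline
std::this_thread::sleep_until(currentTime); // errors no longer accumulate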

Don't worry, this clumsy approach certainly won't be our final solution.

Hardware decoding

This program still plays 1080p 24fps video smoothly on my laptop, but it can't keep up with 1080p 60fps. Let's first see where the CPU goes:

image

Clearly RequestFrame takes the lion's share; that's the decoding function. Let's try hardware decoding and see whether it helps:

void InitDecoder(const char* filePath, DecoderParam& param) {
	// ...

	//  Enable hardware decoder 
	AVBufferRef* hw_device_ctx = nullptr;
	av_hwdevice_ctx_create(&hw_device_ctx, AVHWDeviceType::AV_HWDEVICE_TYPE_DXVA2, NULL, NULL, NULL);
	vcodecCtx->hw_device_ctx = hw_device_ctx;

	param.fmtCtx = fmtCtx;
	param.vcodecCtx = vcodecCtx;
	param.width = vcodecCtx->width;
	param.height = vcodecCtx->height;
}

void GetRGBPixels(AVFrame* frame, vector<Color_RGB>& buffer) {
	// Copy the decoded picture from GPU memory back into a software frame.
	AVFrame* swFrame = av_frame_alloc();
	av_hwframe_transfer_data(swFrame, frame, 0);
	frame = swFrame;

	static SwsContext* swsctx = nullptr;
	
	// ...
	
	sws_scale(swsctx, frame->data, frame->linesize, 0, frame->height, data, linesize);

	av_frame_free(&swFrame);

}

av_hwdevice_ctx_create creates a hardware decoding device; then it's just a matter of assigning the device pointer to AVCodecContext::hw_device_ctx. AV_HWDEVICE_TYPE_DXVA2 is one kind of hardware decoding device, and the right choice depends on the platform you're running on. On Windows, AV_HWDEVICE_TYPE_DXVA2 and AV_HWDEVICE_TYPE_D3D11VA usually have the best compatibility; since I'll render with dx9 later, dxva2 it is.
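
If you're unsure which device types your particular FFmpeg build supports, a small sketch like this lists them (av_hwdevice_iterate_types and av_hwdevice_get_type_name come from libavutil/hwcontext.h):

AVHWDeviceType type = AV_HWDEVICE_TYPE_NONE;
while ((type = av_hwdevice_iterate_types(type)) != AV_HWDEVICE_TYPE_NONE) {
	// Prints names such as dxva2, d3d11va, cuda, qsv...
	printf("supported hw device: %s\n", av_hwdevice_get_type_name(type));
}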

The decoded AVFrame no longer gives direct access to the raw picture, because the decoded data still sits in GPU memory. It has to be copied back with av_hwframe_transfer_data (this is the copy-back option you see in some players), and the pixel format that comes out is AV_PIX_FMT_NV12 rather than the earlier AV_PIX_FMT_YUV420P. No need to worry, though; sws_scale handles it for us.
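
One hedge worth adding: the copy-back only makes sense when the frame actually lives in GPU memory. A sketch that keeps a software fallback working if hardware decoding wasn't available:

if (frame->hw_frames_ctx) {
	// Hardware frame: copy it back to a software frame first.
	AVFrame* swFrame = av_frame_alloc();
	av_hwframe_transfer_data(swFrame, frame, 0);
	// ... convert swFrame as before, then av_frame_free(&swFrame) ...
} else {
	// Already a software frame; use it directly.
}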

Run the program and you can indeed see some GPU usage in Task Manager:

image

But it's still not smooth enough. Back to the profiler:

image

Now sws_scale seems to be the performance hog, but it's an FFmpeg function; we can't optimize its internals. Let's shelve this for now and deal with it later.

Rendering the picture with D3D9

GDI is an ancient way to render. Now for the modern way: Direct3D 9.

First include the necessary headers:

#include <d3d9.h>
#pragma comment(lib, "d3d9.lib")

There's another freebie from Microsoft, ComPtr:

#include <wrl.h>
using Microsoft::WRL::ComPtr;

We're about to use a lot of COM (Component Object Model) technology, and ComPtr makes it much more convenient. There's far too much to say about COM to cover in this article; I suggest reading up on it first and coming back once you have a rough idea.

Next, initialize the D3D9 device:

// ...

ShowWindow(window, SW_SHOW);

// D3D9
ComPtr<IDirect3D9> d3d9 = Direct3DCreate9(D3D_SDK_VERSION);
ComPtr<IDirect3DDevice9> d3d9Device;

D3DPRESENT_PARAMETERS d3dParams = {};
d3dParams.Windowed = TRUE;
d3dParams.SwapEffect = D3DSWAPEFFECT_DISCARD;
d3dParams.BackBufferFormat = D3DFORMAT::D3DFMT_X8R8G8B8;
d3dParams.Flags = D3DPRESENTFLAG_LOCKABLE_BACKBUFFER;
d3dParams.BackBufferWidth = width;
d3dParams.BackBufferHeight = height;
d3d9->CreateDevice(D3DADAPTER_DEFAULT, D3DDEVTYPE_HAL, window, D3DCREATE_HARDWARE_VERTEXPROCESSING, &d3dParams, d3d9Device.GetAddressOf());

auto currentTime = system_clock::now();
// ...

ComPtr is a C++ template class that wraps COM pointers so you never have to worry about releasing resources; when the variable's lifetime ends, Release is called automatically.

The most important argument when creating the device is the D3DPRESENT_PARAMETERS structure. Windowed = TRUE selects windowed mode; we don't need fullscreen for now. SwapEffect is the swap-chain mode; just pick D3DSWAPEFFECT_DISCARD. BackBufferFormat matters more: it must be D3DFMT_X8R8G8B8, because that's the only format usable both as a back-buffer format and as a display format (see the table below), and sws_scale can convert to it correctly.

image

Flags must be D3DPRESENTFLAG_LOCKABLE_BACKBUFFER, because we'll be writing our data straight into the back buffer rather than going through 3D textures and the rest of the pipeline.

Adjust the GetRGBPixels function again:

void GetRGBPixels(AVFrame* frame, vector<uint8_t>& buffer, AVPixelFormat pixelFormat, int byteCount) {
	AVFrame* swFrame = av_frame_alloc();
	av_hwframe_transfer_data(swFrame, frame, 0);
	frame = swFrame;

	static SwsContext* swsctx = nullptr;
	swsctx = sws_getCachedContext(
		swsctx,
		frame->width, frame->height, (AVPixelFormat)frame->format,
		frame->width, frame->height, pixelFormat, NULL, NULL, NULL, NULL);

	uint8_t* data[] = { &buffer[0] };
	int linesize[] = { frame->width * byteCount };
	sws_scale(swsctx, frame->data, frame->linesize, 0, frame->height, data, linesize);

	av_frame_free(&swFrame);
}

The new pixelFormat parameter makes the output pixel format configurable; the goal is to output AV_PIX_FMT_BGRA data later, which corresponds to D3DFMT_X8R8G8B8. Different formats also occupy different numbers of bytes per pixel, hence the extra byteCount parameter (bytes per pixel). And vector<Color_RGB> is no longer needed; switch to a general-purpose vector<uint8_t>.

Adjust the StretchBits function as well:

void StretchBits(IDirect3DDevice9* device, const vector<uint8_t>& bits, int width, int height) {
	ComPtr<IDirect3DSurface9> surface;
	device->GetBackBuffer(0, 0, D3DBACKBUFFER_TYPE_MONO, surface.GetAddressOf());

	D3DLOCKED_RECT lockRect;
	surface->LockRect(&lockRect, NULL, D3DLOCK_DISCARD);

	// Note: one big memcpy assumes lockRect.Pitch == width * 4.
	memcpy(lockRect.pBits, &bits[0], bits.size());

	surface->UnlockRect();

	device->Present(NULL, NULL, NULL, NULL);
}

This writes the picture data into the back buffer, and the subsequent Present call puts it on the window.
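
If you ever hit a width where that memcpy assumption fails, a more defensive sketch copies row by row, honoring lockRect.Pitch (the surface's bytes per row):

auto src = bits.data();
auto dst = (uint8_t*)lockRect.pBits;
for (int y = 0; y < height; y++) {
	// Source rows are tightly packed; destination rows are Pitch bytes apart.
	memcpy(dst + y * lockRect.Pitch, src + y * width * 4, width * 4);
}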

Finally, adjust a few parts of the main function:

// ...

vector<uint8_t> buffer(width * height * 4);

auto window = CreateWindow(className, L"Hello World Title", WS_OVERLAPPEDWINDOW, 0, 0, decoderParam.width, decoderParam.height, NULL, NULL, hInstance, NULL);
// ...

AVFrame* frame = RequestFrame(decoderParam);

GetRGBPixels(frame, buffer, AVPixelFormat::AV_PIX_FMT_BGRA, 4);

av_frame_free(&frame);

double framerate = (double)vcodecCtx->framerate.den / vcodecCtx->framerate.num;
std::this_thread::sleep_until(currentTime + milliseconds((int)(framerate * 1000)));
currentTime = system_clock::now();

StretchBits(d3d9Device.Get(), buffer, width, height);
// ...

Note that the buffer size has changed, GetRGBPixels is now called with AV_PIX_FMT_BGRA, and StretchBits now takes the d3d9 device pointer.

Running the program looks no different from before, but CPU usage actually drops a little while GPU usage rises a little.

image

Farewell, sws_scale

First, make the window borderless; it looks cooler, and it also brings the picture's aspect ratio closer to normal:

// ...

auto window = CreateWindow(className, L"Hello World Title", WS_POPUP, 100, 100, 1280, 720, NULL, NULL, hInstance, NULL);
// ...

image

As mentioned earlier, a hardware-decoded AVFrame doesn't carry the raw picture directly. But check its format value and you'll find it corresponds to AV_PIX_FMT_DXVA2_VLD:

image

The comment in FFmpeg's headers mentions that data[3] is an LPDIRECT3DSURFACE9, i.e. an IDirect3DSurface9*. So we can render that surface straight to the window, with no need to copy the picture data from GPU memory back to system memory; sws_scale can be thrown away.

Let's write a new function, RenderHWFrame, to do exactly that; StretchBits and GetRGBPixels are no longer needed:

void RenderHWFrame(HWND hwnd, AVFrame* frame) {
	IDirect3DSurface9* surface = (IDirect3DSurface9*)frame->data[3];
	// GetDevice AddRefs the device; hold it in a ComPtr so it gets released.
	ComPtr<IDirect3DDevice9> device;
	surface->GetDevice(device.GetAddressOf());

	ComPtr<IDirect3DSurface9> backSurface;
	device->GetBackBuffer(0, 0, D3DBACKBUFFER_TYPE_MONO, backSurface.GetAddressOf());

	device->StretchRect(surface, NULL, backSurface.Get(), NULL, D3DTEXF_LINEAR);

	device->Present(NULL, NULL, hwnd, NULL);
}

int WINAPI WinMain (
	_In_ HINSTANCE hInstance,
	_In_opt_ HINSTANCE hPrevInstance,
	_In_ LPSTR lpCmdLine,
	_In_ int nShowCmd
) {
// ...

AVFrame* frame = RequestFrame(decoderParam);

double framerate = (double)vcodecCtx->framerate.den / vcodecCtx->framerate.num;
std::this_thread::sleep_until(currentTime + milliseconds((int)(framerate * 1000)));
currentTime = system_clock::now();

RenderHWFrame(window, frame);

av_frame_free(&frame);
// ...

Sharing resources between different d3d9 devices is troublesome, so we simply grab the d3d9 device that FFmpeg created and pass our own window handle to Present; that makes the picture show up in our window.

image

Now CPU usage really is low enough to ignore. But a new problem appears: look closely at the picture and it's blurry. The reason is that we're using the default swap chain of FFmpeg's d3d9 device, whose resolution is quite low, only 640x480, as you can see in its source (hwcontext_dxva2.c:46):

image

So we need to create our own swap chain on FFmpeg's d3d9 device:

void RenderHWFrame(HWND hwnd, AVFrame* frame) {
	IDirect3DSurface9* surface = (IDirect3DSurface9*)frame->data[3];
	ComPtr<IDirect3DDevice9> device;
	surface->GetDevice(device.GetAddressOf());

	// Created lazily on first call, at the video's native resolution.
	static ComPtr<IDirect3DSwapChain9> mySwap;
	if (mySwap == nullptr) {
		D3DPRESENT_PARAMETERS params = {};
		params.Windowed = TRUE;
		params.hDeviceWindow = hwnd;
		params.BackBufferFormat = D3DFORMAT::D3DFMT_X8R8G8B8;
		params.BackBufferWidth = frame->width;
		params.BackBufferHeight = frame->height;
		params.SwapEffect = D3DSWAPEFFECT_DISCARD;
		params.BackBufferCount = 1;
		params.Flags = 0;
		device->CreateAdditionalSwapChain(&params, mySwap.GetAddressOf());
	}

	ComPtr<IDirect3DSurface9> backSurface;
	mySwap->GetBackBuffer(0, D3DBACKBUFFER_TYPE_MONO, backSurface.GetAddressOf());

	device->StretchRect(surface, NULL, backSurface.Get(), NULL, D3DTEXF_LINEAR);

	mySwap->Present(NULL, NULL, NULL, NULL, NULL);
}

A d3ddevice can own multiple swap chains; create an extra one with the CreateAdditionalSwapChain function, then, just as before, copy the hardware-decoded surface into the new swap chain's back buffer.

image

Now even 4K 60fps video plays without breaking a sweat.

The current problems

  1. If your screen's refresh rate is 60Hz and the program plays a 60fps video, playback runs slower than real time. The reason is that IDirect3DSwapChain9::Present forces a wait for the screen's vertical sync, so each frame is presented later than scheduled (see the sketch after this list for the relevant knob).
  2. There are no playback controls; you can't pause, seek, and so on.
  3. There's no sound.
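
For problem 1, the relevant knob is the presentation interval. A hedged sketch (not a full fix; proper timing is part two's topic): when filling in the D3DPRESENT_PARAMETERS for CreateAdditionalSwapChain, request immediate presentation so D3D9 doesn't wait for vsync, at the cost of possible tearing:

// Sketch only: trade tearing for not being throttled to the refresh rate.
params.PresentationInterval = D3DPRESENT_INTERVAL_IMMEDIATE;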

The problems above are left for part two.

Copyright notice
This article was created by [The Last Gentleman]. If you repost it, please include a link to the original. Thanks.
https://chowdera.com/2021/05/20210504120240387y.html
