当前位置:网站首页>Huffman Tree Huffman Code Creation

Huffman Tree Huffman Code Creation

2022-11-24 21:24:48* idle fish

基础知识点

Huffman tree is also called optimal tree,It is a tree with the shortest weighted path length,有着广泛的应用.

最优二叉树

We give the notion of paths and path lengths.A divide and conquer from one node of the tree to another constitutes a path between those two nodes,路径上的分支数目称作路径长度.树的路径长度是从树根到每一个结点的路径长度之和.完全二叉树就是这种路径长度最短的二叉树.The length of the weighted path is the product of the path length from the node to the root of the tree and the weight on the node.树的带权路径长度为树中所有叶子结点的带权路径长度之和.
WPL=(W1L1+W2L2+W3L3+…+WnLn)
例如下图:在这里插入图片描述
WPL=24+34+43+92+72+82=80
The figure below shows three binary trees with the same leaf node weight and different paths,Calculate their weights as follows:
2*(9+4+5+2)=40
14+22+35+39=50
91+52+43+23=37
We found that different paths have the same weight,The weighted path lengths are different,And the smallest one is our Huffman tree.
在这里插入图片描述

How to construct a Huffman tree

Huffman first gave an algorithm with a general law,Commonly known as the Huffman algorithm.如下

  • 根据给定的n个权值{w1,w2,……,wn}构成n棵二叉树的集合F={T1,T2……Tn},其中每棵二叉树Ti中只有一个带权为wi的根结点,其左右子树均为空.
  • 在F中选取两棵根节点的权值最小的树作为左右子树构造一棵新的二叉树,且置新的二叉树的根节点的权值为其左,右子树上根节点的权值之和
  • 在Fdelete this tree,同时将新得到的二叉树加入F中.
  • 重复2,3步骤,直到F只含一棵树为止,这棵树便是赫夫曼树.

我们可以进行举例:Can only occur in a known system connection8{a,b,c,d,e,f,g,h}种字符,其出现的概率为0.05,0.29,0.07,0.08,0.14,0.23,0.03,0.11,Try to design a Huffman tree.
设权w={5,29,7,8,14,23,3,11},n=8,则m=15,A Huffman tree can be constructed according to the above algorithm.
The nodes in the figure below are all leaf nodes,We first find the two nodes with the smallest weight among them,Construct a binary tree,For example, in the following node, we will make a left child with a weight of 3The weight of the right child is 5,The parent weight is 8的二叉树,Then set the weight to 8The left child and right child are numbered as 7和1The nodes of are added to the following nodes,Those who have been selected in this way will not continue to choose,Add the newly created node to it,The two nodes with the smallest weight are selected each time,一直循环下去,until all nodes are traversed.
在这里插入图片描述
The Huffman tree after construction is complete:在这里插入图片描述

赫夫曼编码

We follow the Huffman tree constructed above to produce a binary tree as shown in the figure:在这里插入图片描述

We transcode according to the left and right children of the binary tree,左为0,右为1,eThe variable code is000,hThe code for001.Read down from the root node in turn.

Encode and compress files

文件压缩流程:

  • 统计文件中所有字符出现的频次
  • 将字符作为节点的值,字符出现的频次作为节点的权,构建哈夫曼树
  • 将哈夫曼树的每个节点的字符与对应的 01 编码作为映射存储到字典中
  • 再次读取文件,将每个字符转化为 01 编码的字符串格式
  • 将这个文件对应的 01 编码字符串转换为真正的比特并写入到文件中,完成压缩

解压流程:

  • 读取压缩文件,将压缩文件转换为 01 的字符串形式
  • 将哈夫曼树的每个节点对应的 01 编码与字符值作为映射存储到字典中(HashMap)
  • 对压缩文件转换成的字符串进行遍历;使用两个指针 i,j,第一个指针 i 指向起始点,第二个指针 j 以 i 所在的位置开始从左向右逐一扫描.如果 i~j 对应的字符串在字典中有匹配的 key,那么就在字典中取出对应的 value 写入到还原的文件中,并更新指针 i 的位置;无匹配的 key 则继续移动 j 指针直至匹配.当 i 移动到字符串最后的位置时,结束遍历,解压完成.

代码

结构体设计

#pragma once
const int n = 8;
const int m = n * 2;
typedef unsigned int WeigthType;
typedef unsigned int NodeType;
typedef struct {
    
	WeigthType weight;
	NodeType parent, leftchild, rightchild;
}HTNode;
typedef HTNode HuffManTree[m];

typedef struct
{
    
	char ch;
	char code[n + 1];
}HuffCodeNode;

typedef HuffCodeNode HuffCoding[n + 1];

struct IndexWeigth
{
    
	int index;
	WeigthType weight;
	operator WeigthType() const {
     return weight; }
};

创建赫夫曼树

void InitHuffManTree(HuffManTree hft, WeigthType *w) {
    
	memset(hft,0, sizeof(HuffManTree));
	for (int i = 0; i < n; i++) {
    
		hft[i+1].weight = w[i];
	}
}
void PrintHuffManTree(HuffManTree hft) {
    
	for (int i = 1; i <m; ++i) {
    
		printf("index %3d weight: %3d parent %3d Lchild %3d Rchild %3d\n"
			,i,hft[i].weight,hft[i].parent,hft[i].leftchild,hft[i].rightchild);
	}
}
void CreateHuffManTree(HuffManTree hft) {
    
	priority_queue<IndexWeigth, vector<IndexWeigth>, std::greater<IndexWeigth> >qu;
	for (int i = 1; i <= n; ++i) {
    
		qu.push(IndexWeigth{
    i,hft[i].weight});
	}
	int k = n + 1;
	while (!qu.empty()) {
    
		if (qu.empty()) break;
		IndexWeigth left = qu.top(); qu.pop();
		if (qu.empty()) break;
		IndexWeigth right = qu.top(); qu.pop();
		hft[k].weight = left.weight + right.weight;
		hft[k].leftchild = left.index;
		hft[k].rightchild = right.index;
		hft[left.index].parent = k;
		hft[right.index].parent = k;
		qu.push(IndexWeigth{
    k,hft[k].weight});
		k += 1;
	}
}

Create build Huffman codes

void InitHuffManCode(HuffCoding hc, char* ch) {
    
	memset(hc,0,sizeof(HuffCoding));
	for (int i = 1; i <= n; ++i) {
    
		hc[i].ch = ch[i-1];
		hc[i].code[0] = '\0';
	}
}
void PrintHuffManCode(HuffCoding hc) {
    
	for (int i = 1; i <= n; ++i) {
    
		printf("data: %c => code : %s \n", hc[i].ch, hc[i].code);
	}
}
void CreateHuffManCode(HuffManTree hft, HuffCoding hc) {
    
	char code[n + 1] = {
     0 };
	for (int i = 1; i <= n; ++i) {
    
		int k = n;
		code[k] = '\0';
		int c = i;
		int pa = hft[c].parent;
		while (pa != 0) {
    
			code[--k] = hft[pa].leftchild==c?'0' : '1';
			c = pa;
			pa = hft[c].parent;
		}
		strcpy_s(hc[i].code,n,&code[k]);
	}
}
原网站

版权声明
本文为[* idle fish]所创,转载请带上原文链接,感谢
https://chowdera.com/2022/328/202211242120106711.html

随机推荐