Storage rules of integers and floating point numbers in memory

2020-12-06 15:22:07 ybhuangfugui

Author | Night breeze

Layout | strongerHuang

Why does our code lose precision, or produce plainly wrong results, when it casts between floating-point numbers and integers or prints them?

To figure this out, you need to know how integers and floating-point numbers are stored.

Embedded column

1

Floating point storage rules

According to the IEEE (Institute of Electrical and Electronics Engineers) 754 standard, any binary floating-point number NUM can be written as:

NUM = (-1)^S * M * 2^E (S is the sign, E is the exponent, M is the significand)

① When S is 0, the number is positive; when S is 1, it is negative.

② M is the significand, with 1 <= M < 2.

③ 2^E is the exponent part.

For example, decimal 3.0 is 0011.0 in binary, which can be written as (-1)^0 * 1.1 * 2^1 (here 1.1 is a binary fraction, equal to 1.5 in decimal).

Similarly, decimal -3.0 is -0011.0 in binary, which can be written as (-1)^1 * 1.1 * 2^1.

The standard also specifies that the float type has 1 sign bit (S), 8 exponent bits (E), and 23 significand bits (M).

The double type has 1 sign bit (S), 11 exponent bits (E), and 52 significand bits (M).

Taking the float type as an example, the 32 bits are laid out from the highest bit down as: S (1 bit) | E (8 bits) | M (23 bits).

IEEE 754 has special rules for the significand M and the exponent E (again taking float as the example):

1. Because M always satisfies 1 <= M < 2, it can always be written in the form 1.xxxxxxx. The standard therefore omits the leading 1 when storing M and stores only the digits after the binary point.

This saves space. Taking the float type as an example, 23 bits of fraction are stored, and together with the omitted leading 1, those 23 bits represent 24 significant bits.

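2. The exponent E is stored as an unsigned integer, so for float its stored range is 0 to 255. In scientific notation, however, the exponent can be negative, so the standard requires adding a bias (127) to the real exponent when storing E and subtracting the bias (127) when using it. Subtracting 127 from 0 to 255 gives -127 to 128, but the stored values 0 and 255 are reserved for the special cases below, so the real exponent of a normal number ranges from -126 to 127.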

E falls into three cases:

① E is neither all 0s nor all 1s:

The normal rules apply: the real exponent is the stored value of E minus 127 (the bias), and the omitted leading 1 is added back to M.

② E is all 0s:

The real exponent is 1-127 (that is, -126), and the omitted leading 1 is not added back, so M is restored as the fraction 0.xxxxxxxx. This represents ±0 and very small numbers close to 0.

So take care when comparing floating-point numbers with 0.

③ E is all 1s:

When M is all 0s, the value is ± infinity (depending on the sign bit); when M is not all 0s, the value is not a number (NaN).

2

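Test

The code is as follows:

#include <stdio.h>

void test(void)
{
  float m = 134.375f;
  /* View the float as raw bytes; unsigned char keeps bytes >= 0x80 from printing as negative */
  unsigned char *a = (unsigned char *)&m;

  /* Print the address and value of each byte, lowest address first */
  printf("0x%p:%d\n", (void *)a, *a);
  printf("0x%p:%d\n", (void *)(a + 1), *(a + 1));
  printf("0x%p:%d\n", (void *)(a + 2), *(a + 2));
  printf("0x%p:%d\n", (void *)(a + 3), *(a + 3));
}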

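Code output: on a little-endian machine with IEEE 754 floats, the four bytes print as 0, 96, 6 and 67 from the lowest address up (the addresses themselves differ from run to run).

The specific calculation process is as follows: 134.375 is 10000110.011 in binary, that is 1.0000110011 * 2^7. So S = 0, the stored E is 7 + 127 = 134 (10000110 in binary), and M stores the 23 bits 00001100110000000000000. The full 32 bits are 0100 0011 0000 0110 0110 0000 0000 0000, or 0x43066000, stored lowest byte first as 0x00, 0x60, 0x06, 0x43.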

3

Loss of accuracy

To convert the fractional part of a decimal number to binary, multiply it by 2 and take the integer part of the result as the next binary digit; then keep multiplying the remaining fraction by 2 until no fraction remains.

For example, 0.2 is converted as follows (the last column is the binary digit produced):

0.2 x 2 = 0.4 0

0.4 x 2 = 0.8 0

0.8 x 2 = 1.6 1

0.6 x 2 = 1.2 1

0.2 x 2 = 0.4 0

0.4 x 2 = 0.8 0

0.8 x 2 = 1.6 1

That is: .0011001…

This is an infinitely repeating binary fraction, which is why some decimal fractions lose precision when they are converted to binary.

Not long ago I shared "What is the difference between single-precision, double-precision, multi-precision and mixed-precision computing?" with you. Perhaps it was not entirely clear at the time; now that you have seen the storage rules of floating-point numbers, does it make more sense?

4

Storage rules of integers

Once you understand the storage rules of floating-point numbers, integers are easy to understand.

Integers are stored in memory in two's complement form, and they can be positive or negative. When a signed number is stored, the highest bit indicates whether it is positive (0) or negative (1).

The ones' complement and two's complement of a positive number are the same as the number itself, so the following mainly covers negative numbers. The ones' complement is obtained from the sign-magnitude form by keeping the highest (sign) bit and inverting the remaining bits one by one; the two's complement is the ones' complement plus 1.

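Test code:

#include <stdint.h>
#include <stdio.h>

void test(void)
{
  int8_t n = -123;
  /* Point at the same byte as unsigned to see its raw bit pattern */
  uint8_t *p = (uint8_t *)&n;

  printf("%d\n", n);   /* signed interpretation */
  printf("%d\n", *p);  /* unsigned interpretation of the same byte */
}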

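Output results: the first line prints -123 and the second prints 133.

The calculation process is as follows: the magnitude 123 is 0111 1011, so the sign-magnitude form of -123 is 1111 1011. Inverting every bit except the sign bit gives the ones' complement 1000 0100, and adding 1 gives the two's complement 1000 0101. Read as an unsigned byte, 1000 0101 is 0x85 = 133.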

Source material:

https://blog.csdn.net/u014470361/article/details/79820892

Disclaimer: this article comes from the Internet, and the copyright belongs to the original author. If there is any copyright issue, please contact me to have it removed.

Copyright notice
This article was created by [ybhuangfugui]. Please include a link to the original when reposting. Thank you.
https://chowdera.com/2020/12/202012061521340189.html