Key points of C: pointers (so you fully understand pointers) | Understanding pointers from memory | A complete analysis of pointers

2020-11-07 16:50:08 itread01

> **Note:** after reading this article you should have a firm grasp of what pointers really are.

The core of the C language is the pointer, so the theme of this article is "pointers and the memory model".

You cannot talk about pointers without talking about memory. People who learn pointers fall into two groups: those who do not understand the memory model and those who do. The first group's understanding stops at the sentence "a pointer is the address of a variable", and they stay wary of pointers, especially of the more advanced operations. The second group uses pointers fluently and manipulates bytes at will, to the admiration of everyone watching.

### 1. The nature of memory

The essence of programming is manipulating data, and data lives in memory. So the better we understand the memory model and how C manages memory, the more clearly we can see how a program runs, and the further our programming ability goes.

Don't dismiss this as empty talk. Throughout my freshman year I did not dare write C programs of more than a thousand lines, and I resisted writing C at all. Past a thousand lines, all sorts of inexplicable memory errors kept appearing, a coredump would strike out of nowhere, and I had no way to track down the cause.

Back then I preferred Java. Whatever you wrote in Java, nothing like that happened; at most you got an occasional **NullPointerException**, which was much easier to track down.
Only later, once I understood memory and pointers more deeply, did I gradually become able to write thousand-line projects in C and rarely run into memory problems again. (Overconfident, perhaps.)

"A pointer stores the memory address of a variable" is a sentence every C book mentions. So to understand pointers thoroughly, we first have to understand where C variables are actually stored: in memory.

#### 1.1 Memory addressing

A computer's memory is a space for storing data. It consists of a series of contiguous storage cells, like this:

![](https://tva1.sinaimg.cn/large/0081Kckwgy1gk6p1iowcxj30t20acmz8.jpg)

Each cell represents 1 bit. To an EE student a bit is a high or low voltage level; to a CS student it is one of two states, 0 or 1.

Because one bit can only represent two states, 8 bits were grouped together and named a byte. The byte was made the smallest addressable unit of memory, that is, every byte is given a number, and that number is called its memory **address**.

![](https://tva1.sinaimg.cn/large/0081Kckwgy1gk6pcjje44j30qe09ggnr.jpg)

This is like assigning a door number to every household in a residential complex: 301, 302, 403, 404, 501...

In real life the numbers must be unique so that a door number locates exactly one household. Likewise, in a computer the number given to each byte must be unique, so that each number accesses exactly one definite byte.

#### 1.2 Memory address space

Above we gave every byte in memory a unique number, so the range of those numbers determines how much memory can be addressed. All the numbers taken together are called the memory address space, and this is what we usually mean when we call a computer 32-bit or 64-bit.
The early Intel 8086 and 8088 CPUs had 16-bit registers, and a 16-bit number can only address `2^16 bytes = 64 KB` of memory.

That is obviously not enough, so the 8086 used a 20-bit **address bus** (1 MB, reached through segmented addressing), and the 80286 later widened it to 24 bits. The 21st address line is the well-known A20 line. When I was writing a mini OS, I had to go through a BIOS interrupt to switch the A20 line on.

Today's computers start at 32 bits, and 32 bits means an addressable range of `2^32 bytes = 4 GB`. So if your computer is 32-bit, installing more than 4 GB of memory will not let you use all of it.

Well, that covers memory and memory addressing.

#### 1.3 The nature of variables

With memory in place, the next question is how variables such as int and double are stored in those 0/1 cells.

In C we define variables like this:

```c
int a = 999;
char c = 'c';
```

When you write a variable definition, you are really asking for a piece of memory to hold the variable.

We all know that an int takes 4 bytes (on most current platforms), and that computers represent integers in two's complement (if you don't know two's complement, search for it on Baidu).

The two's complement of `999` is `0000 0000 0000 0000 0000 0011 1110 0111`.

That is 4 bytes, so four cells are needed to store it:

![](https://tva1.sinaimg.cn/large/0081Kckwgy1gk73z5ahpjj30s00aimzc.jpg)

Did you notice that we put the high-order byte at the low address? Can it be the other way around?
Of course it can, and this brings us to **big-endian and little-endian**.

Putting the high-order byte at the low memory address, as above, is called **big-endian**. Conversely, putting the low-order byte at the low memory address is called **little-endian**:

![](https://tva1.sinaimg.cn/large/0081Kckwgy1gk74584w6tj30rs0b840p.jpg)

So far this only describes how an int is stored. float and char work the same way: the value is first encoded as a bit pattern (two's complement for integers, a floating-point format such as IEEE 754 for float). For types that span multiple bytes, the bytes are then written into memory cells in big-endian or little-endian order.

Remember the two pictures above. This is what every variable of a programming language looks like in memory, whether it is an int, a char, a pointer, an array, a struct, an object... they all sit in memory like this.

### 2. What is a pointer?

#### 2.1 Where do variables live?

As I said, defining a variable means requesting a piece of memory to store it. So how do we find out where the variable is?

The `&` operator gives us the actual address of a variable, which is the starting address of the memory block the variable occupies.

(PS: this is actually a virtual address, not an address in real physical memory.)

We can print the address:

```c
printf("%p\n", (void *)&a);
```

This prints a string of digits such as `0x7ffcad3b8f3c`.

#### 2.2 The nature of pointers

We just saw that `&` yields the memory address of a variable. But how do we say that a value is an **address** rather than an ordinary number?

**In other words, how does C express the concept of an address?**

Right, with a pointer. You can write:

```c
int *pa = &a;
```

The variable `pa` stores the address of `a`; we also say it is a pointer to `a`.

Here I want to discuss a few questions that may seem a little dull:

> Why do we need pointers? Can't we just use variable names?
Of course we can, but variable names have limits.

> What is a variable name, really?

It is a symbol standing for the variable's address. Variable names exist to make programming more convenient and friendlier to humans, but the computer knows nothing about a variable `a`; it only knows addresses and instructions.

So when you look at the assembly code compiled from C, you will find that the variable names have disappeared, replaced by a string of abstract addresses.

You can imagine that the compiler automatically maintains a mapping that converts each variable name in our program to the variable's address, and then reads and writes that address.

That is, there is a mapping table like this, which turns variable names into addresses:

```
a  | 0x7ffcad3b8f3c
c  | 0x7ffcad3b8f2c
h  | 0x7ffcad3b8f4c
....
```

Well said! But I still don't see why pointers are necessary.

That is the question. Look at the following code:

```c
int func(...) {
  ...
}

int main() {
  int a;
  func(...);
}
```

Suppose I have this requirement:

> `func` must be able to modify the variable `a` in `main`. Inside `main` we can read and write the memory of `a` directly through its name.
>
> But `a` is not visible inside `func`.

You might say: use the address-of operator `&` and pass in the address of `a`:

```
int func(int address) {
  ....
}

int main() {
  int a;
  func(&a);
}
```

Now `func` has the address of `a` and can read and write through it.

In theory that works, but here is the problem: how would the compiler tell whether an int holds an int value or the address of another variable (that is, a pointer)?

If it were left entirely to us programmers to remember, that would add complexity, and the compiler could not catch some errors for us.
By declaring a pointer variable with `int *`, it becomes unambiguous: **this is the address of another int variable.**

The compiler can then use type checking to rule out some errors at compile time.

This is why pointers need to exist.

In fact every language has this need. Many languages, for the sake of safety, simply put a layer of restraints on the pointer and package it as a reference.

Perhaps you accepted pointers as a matter of course when you learned them, but I hope this lengthy explanation gives you something to think about.

At the same time, here is a small question. Since a pointer is essentially the starting memory address of a variable, which is just an integer:

> Why do pointers come in so many types?
> For example int pointers and float pointers. Does the type affect the information stored in the pointer itself?
> When does the type come into play?

#### 2.3 Dereferencing

The questions above are there to lead into pointer dereferencing.

`pa` stores the memory address of the variable `a`. How do we get the value of `a` through that address?

This operation is called **dereferencing**. In C, the `*` operator yields the content at the address a pointer holds; for example `*pa` yields the value of `a`.

We said the pointer stores the starting address of the variable's memory. So how does the compiler know how many bytes to fetch starting from that address?

This is where the pointer type comes in: the compiler decides how many bytes to fetch from the type of the element the pointer points to.

For an int pointer the compiler generates instructions that fetch four bytes, for a char pointer only one byte, and so on.
Here is a diagram of a pointer in memory:

![](https://tva1.sinaimg.cn/large/0081Kckwgy1gk8awo5rq8j30xq0fcadf.jpg)

The pointer `pa` is first of all a variable itself. It occupies a piece of memory too, and what is stored in that memory is the starting address of the variable `a`.

When dereferencing, 4 bytes are fetched starting from that address and then interpreted according to the encoding of the int type.

#### 2.4 Putting it to use

This may look simple, but it is the key to a deep understanding of pointers.

Two examples will illustrate. First:

```c
float f = 1.0;
short c = *(short*)&f;
```

Can you explain what happens here? What changes for the variable `f` at the memory level? And what is the value of `c`? Is it 1?

Actually, at the memory level nothing about `f` changes. As shown in the picture:

![](https://tva1.sinaimg.cn/large/0081Kckwgy1gk8b66uofoj30sc0dg0vd.jpg)

Suppose this is the bit pattern of `f` in memory. The process takes the first two bytes of `f`, interprets them as a short, and assigns the result to `c`.

In detail:

1. `&f` obtains the starting address of `f`.
2. `(short*)&f` does nothing at run time. The expression only says: "Oh, I consider the address of `f` to hold a variable of type short."
3. Finally, when `*(short*)&f` is dereferenced, the compiler fetches the first two bytes, interprets them according to the encoding of short, and assigns the interpreted value to `c`.

Throughout this process the bit pattern of `f` does not change; only the way the bits are interpreted changes.

Of course the final value is definitely not 1. As for what it is, you can work it out yourself.

And what about the other way around?
```c
short c = 1;
float f = *(float*)&c;
```

As shown in the picture:

![](https://tva1.sinaimg.cn/large/0081Kckwgy1gk8babigguj30se0dm771.jpg)

The process is the same as above. The first example caused no error, but this one may. Why?

`(float*)&c` makes us fetch four bytes starting from the address of `c` and interpret them according to the encoding of float.

But `c` is a short and occupies only two bytes, so the access necessarily reaches into the two bytes after it. That is an out-of-bounds memory access.

If you only read, there is a good chance nothing visibly goes wrong. But sometimes you need to write a new value to this area, for example:

```c
*(float*)&c = 1.0;
```

Then a coredump may occur, that is, the memory access fails.

Even if there is no coredump, the write destroys whatever was stored in that memory. It is probably the storage of some other variable, and by overwriting someone else's content we are bound to create a hidden bug.

If you understand the above, you will be much more comfortable using pointers.

#### 2.6 A small problem

With that covered, let's look at a problem. A member of my group chat asked it. This was his requirement:

![](https://tva1.sinaimg.cn/large/0081Kckwgy1gk8bj1ximsj30ie0aajtn.jpg)

This is the code he wrote:

![](https://tva1.sinaimg.cn/large/0081Kckwgy1gk8bk5n64oj30a508xdi7.jpg)

He wrote a double to a file, read it back, and found that the printed value did not match.

The key part is here:

```
char buffer[4];
...
printf("%f %x\n", *buffer, *buffer);
```

He probably thought that `buffer` is a pointer (an array, to be precise), that dereferencing a pointer yields the value stored there, and that the value stored there was the 4 bytes read from the file, that is, the earlier float variable.
Note that all of this is what he thinks. What the compiler thinks is:

"Oh, `buffer` is a pointer to char, so I'll fetch just the first byte."

The value of that first byte is then passed to printf. printf sees `%f` and expects a floating-point argument, but what it was handed is the value of a single byte, so whatever gets printed has nothing to do with the float stored in the file.

That is the whole process.

The key mistake is that this student assumed that dereferencing any pointer yields "the value we have in mind". The compiler knows nothing about that; it interprets the bytes strictly according to the pointer's type.

So change it to:

```c
printf("%f %x\n", *(float*)buffer, *(unsigned int*)buffer);
```

This explicitly tells the compiler: "At the place `buffer` points to I stored a float, so interpret it as a float for me" (and, for `%x`, to read the same four bytes as an unsigned int).

### 3. Structs and pointers

A struct contains several members. How are those members laid out in memory?

For example:

```c
struct fraction {
	int num;   // integer part
	int denom; // fractional part
};

struct fraction fp;
fp.num = 10;
fp.denom = 2;
```

This is a fixed-point decimal struct. It occupies 8 bytes in memory (alignment is not considered here), and its two member fields are stored like this:

![image-20201030214416842](https://tva1.sinaimg.cn/large/0081Kckwgy1gk7p2vyuxzj30m00d2tb8.jpg)

We put 10 in the field at offset 0 from the struct's base address, and 2 in the field at offset 4.

Now let's do something no normal person would ever do:

```c
((struct fraction*)(&fp.denom))->num = 5;
((struct fraction*)(&fp.denom))->denom = 12;
printf("%d\n", fp.denom); // what does this print?
```

What will the code above print?
Think about it yourself first~

Next, let me analyze what happens:

![](https://tva1.sinaimg.cn/large/0081Kckwgy1gk7pkqjcmtj30v00d4acz.jpg)

First, `&fp.denom` takes the starting address of the denom field in the struct fp. The 8 bytes starting at that address are then treated as a fraction struct.

In this new struct, the upper four bytes become the denom field, and fp's denom field corresponds to the new struct's num field.

Therefore `((struct fraction*)(&fp.denom))->num = 5` actually changes `fp.denom`, while `((struct fraction*)(&fp.denom))->denom = 12` assigns 12 to the four bytes above it.

Of course, the result of writing to those four bytes is unpredictable. It may crash the program, because that memory may happen to hold key information of the function's call stack frame, or it may not be writable at all.

Many of the coredump errors we hit when first learning C have causes like this.

So the final output is 5.

Why discuss code that seems to make no sense?

To show that a struct is, by nature, a group of variables packed together, and that a field of a struct is accessed through the struct's starting address, also called the base address, plus the field's offset.

In fact, objects in C++ and Java are stored in a similar way. To implement certain object-oriented features they add some header information besides the data members, such as the virtual function table pointer in C++.

We can actually imitate this in C.

This is why we keep saying that C is the foundation. Once you truly understand pointers and memory in C, you can quickly understand the object model and memory layout of other languages.
### 4. Multi-level pointers

Speaking of multi-level pointers: as a freshman I could handle two levels at most. Any more really made me dizzy, and I often wrote the wrong code.

If you had handed me `int ******p`, it would have broken me. I suspect many students are in the same situation now.

Copyright notice
This article was written by [itread01]. Please include a link to the original when reposting. Thank you.