当前位置:网站首页>Interview question: why is the dynamic expansion of C + + vector 1.5 times or 2 times

Interview question: why is the dynamic expansion of C + + vector 1.5 times or 2 times

2021-10-14 06:28:15 The senming Gang is bigger than the black tiger gang

One 、 summary

During the interview vector The question of capacity expansion is often asked , such as :

  • vector How to expand the capacity ?
  • Capacity expansion will lead to inefficiency , How to avoid dynamic capacity expansion ?
  • Why choose to 1.5 Times or 2 Capacity expansion in multiple ways ? instead of 3 times 4 Double expansion ?
  • vs Why choose 1.5 times ,linux Why choose 2 times ?

A series of questions came down , Is there a feeling of being hanged ? In this section, we will delve into vector The details behind the expansion , Make your interview less embarrassing .

Two 、 Use it efficiently vector, Avoid expansion

1. Review of capacity expansion mechanism

Direction vector When you insert an element in , If the number of elements is valid size And space capacity capacity When equal ,vector The capacity expansion mechanism will be triggered internally :

Open up new space

  • Copy element .
  • Free up the old space .

Be careful : The new space for each expansion cannot be too large , It can't be too small , Too big is easy to cause a waste of space , Too small will lead to frequent capacity expansion and affect program efficiency .

2. How to avoid inefficiency caused by capacity expansion

If you want to avoid the problem of low program efficiency caused by capacity expansion , It's very simple : If before insertion , It can be estimated that vector Number of storage elements , Open up the bottom capacity in advance . If done before insertion reserve, As long as there is enough space , The capacity will not be expanded during insertion , without reserve, It will be expanded while inserting , Extremely inefficient .

#include <iostream>
#include <vector>
int main ()
{
    
    size_t sz;
    std::vector<int> foo;
    // foo.reserve(100); //  First the vector The bottom space is expanded to 100 individual , Then insert 
    sz = foo.capacity();
    std::cout << "making foo grow:\n";
    for (int i=0; i<100; ++i)
    {
    
        foo.push_back(i);
        if (sz!=foo.capacity())
        {
    
            sz = foo.capacity();
            std::cout << "capacity changed: " << sz << '\n';
        }
}
}


vs: Running results :
making foo grow:
capacity changed: 1
capacity changed: 2
capacity changed: 3
capacity changed: 4
capacity changed: 6
capacity changed: 9
capacity changed: 13
capacity changed: 19
capacity changed: 28
capacity changed: 42
capacity changed: 63
capacity changed: 94
capacity changed: 141

g++ Running results :
making foo grow:
capacity changed: 1
capacity changed: 2
capacity changed: 4
capacity changed: 8
capacity changed: 16
capacity changed: 32
capacity changed: 64
capacity changed: 128

3、 ... and 、 Why do you choose to expand in multiples

1. Expand the capacity with the number of equal length

Capacity expansion by equal length number , The new space size is to expand the original space size to capacity+K Space (capacity Is the size of the old space ). Suppose you need to ask vector Insert 100 Elements ,K by 10, So we need to expand 10 Time ; Each expansion needs to move the old space elements to the new space , The first i The number of elements in this expansion copy is :ki( The first 1 Second expansion , The new space size is 20, There are... In the old space 10 Elements , Need to move to a new space ; The first 2 Second expansion , The new space size is 30, There are... In the old space 20 Elements , All need to be moved to the new space ), Suppose the element insertion and element movement are 1 One unit operation , be n Elements push_back() The total number of operations is :
 Insert picture description here

2. Capacity expansion in multiples

Suppose there is n An element needs to be like vector Insert , The multiplication factor is m, Then finish n Elements like vector Of push_back Operation requires capacity expansion log With m For low n To the power of . such as : Double expansion , Direction vector Insert 1000 Elements , Need to expand log With 2 Bottom 1000 Power , It's expansion 10 Time , The first i This capacity increase will m Of i To move elements to a new space ,n Time push_back The total number of operations is :
 Insert picture description here
It can be seen that the capacity expansion in multiple is more efficient than the capacity expansion in equal length .

3. Why choose 1.5 Times or 2 Double expansion , instead of 3 times 、4 times

The expansion principle is : Apply for a new space , Copy element , Free up the old space , The ideal allocation scheme is in the N If it can be reused during the second expansion N-1 The space released at one time is great , If according to 2 Double expansion , The first i The size of the secondary expansion space is as follows :
 Insert picture description here
You can see , At each expansion , The space released in front cannot be used . such as : The first 4 During secondary capacity expansion , front 2 Secondary space has been released , The first 3 The secondary space has not been released yet ( Open up new space 、 Copy element 、 Free up the old space ), That is, the space released in front is only 1 + 2 = 3, Hypothesis number 1 3 Only when the secondary space has been released 1+2+4=7, And the fourth time you need 8 Space , Therefore, the previously released space cannot be used , But according to less than 2 Double expansion , After multiple capacity expansions, the previously released space can be reused . If the multiple exceeds 2 times ( contain 2 times ) The expansion mode will exist :

  • Space waste may be high , such as : After the expansion, I applied for 64 Space , But only 33 Elements , Nearly half of the space is unused .

  • Unable to use previously freed memory .

 Insert picture description here
Use 2 times (k=2) The capacity expansion mechanism is used for capacity expansion , The new memory size after each expansion must be greater than the sum of the previous .
While using 1.5 times (k=1.5) Add capacity , After several expansions , You can reuse the previous memory space .

because STL The standard does not strictly specify how to expand the capacity , Therefore, different implementation vendors expand capacity in their own way , namely :linux The following is according to 2 Capacity expansion in the way of times , and vs The following is according to 1.5 Capacity expansion in the way of times .

Four 、Windows and Linux The expansion principle of the bottom layer

1.Windows Expansion bottom

Windows The heap management system will merge the released heap blocks , therefore :vs Under the vector The capacity expansion mechanism chooses to use 1.5 Double the way to expand , After this expansion for many times , You can use the previously released space .

2.Linux The expansion of the bottom layer

 Insert picture description here

  • linux The following is mainly used glibc Of ptmalloc To apply for user space . If malloc The space is less than 128KB, It passes through brk() To expand , If it is greater than 128KB And arena When there is not enough space in , adopt mmap Map memory to process address space .
  • linux Introduction in Buddy system It provides an efficient allocation strategy for the kernel to allocate successive pages , Compromise between fixed partition and dynamic partition , There are internal fragments in the fixed partition , There are external fragments in the dynamic partition , Moreover, merging during dynamic partition recycling and slicing during allocation are time-consuming .
  • The partner system is to build the entire memory area into a basic size basicSize Of 1 times 、2 times 、4 times 、8 times 、16 Times equal , That is, memory space partition chains are required to correspond 2 A space of the size of an integral power of , Be neat and uniform , Regular rather than messy .
  • When allocating and freeing space , Can pass log2(request/basicSize) The hash algorithm of rounding up quickly finds the corresponding memory block .
  • Understanding of managing free partitions through partner systems , Sure You can see that every free partition chain in the partner system is hung with 2i Page size of , Space allocation and merging through hash thought , Very efficient . It is estimated that this may be the reason SGI-STL Choose to 2 Capacity expansion in multiple ways .

5、 ... and 、 summary

 Insert picture description here

  • vector stay push_back By multiplying, it can reach... After sharing equally O(1) Event complexity , Relative to the growth of a specified size O(n) Better time complexity .
  • In order to prevent the waste of application memory , Now the more used are 2 Times and 1.5 Double growth , and 1.5 The double growth mode can better realize the reuse of memory .
     Insert picture description here

版权声明
本文为[The senming Gang is bigger than the black tiger gang]所创,转载请带上原文链接,感谢
https://chowdera.com/2021/10/20211002145852964z.html

随机推荐