Interview disaster zone: the JVM's memory structure and garbage collection mechanism

2020-11-10 12:04:51 Stack

JVM introduction

1. JVM architecture (memory model)

In the architecture diagram, the green areas are private to each thread and the orange areas are shared by all threads.

2. Class loader

The class loader loads .class files into memory, converts the data structures in the file into data structures in the method area, and generates a Class object.

2.1 Class loader classification

  • Bootstrap class loader (Bootstrap ClassLoader). Loads the JDK's own packages.

    • %JAVA_HOME%/lib/rt.jar, i.e. the JDK's core class library
    • Written in C++
    • Asking a class loaded by this loader for its class loader returns null in a program
  • Extension class loader (Extension ClassLoader). Loads the JDK's extension packages.

    • Exists for future extensions
    • %JAVA_HOME%/lib/ext/*.jar
  • Application class loader, also called the system class loader (AppClassLoader or SystemClassLoader).

    • Loads user-defined classes
    • Loads from the CLASSPATH
  • Custom class loaders

    • Implemented by extending the abstract class ClassLoader
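
The loader hierarchy listed above can be observed from a running program. The following is a minimal sketch; the class name LoaderChainDemo and the helper chain are made up for illustration. Note that on JDK 9 and later the extension loader is replaced by a platform class loader, so the names printed depend on the JDK version.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: walks the parent chain of a class loader.
final class LoaderChainDemo {

    /** Returns the loader class names from the given loader up to the bootstrap loader. */
    static List<String> chain(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        // The bootstrap loader is native code, so Java code only ever sees it as null.
        names.add("bootstrap (null)");
        return names;
    }

    public static void main(String[] args) {
        // A JDK core class: its class loader is reported as null.
        System.out.println(String.class.getClassLoader());
        // An application class: the application loader first, then its parents.
        System.out.println(chain(LoaderChainDemo.class.getClassLoader()));
    }
}
```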

2.2 Parent delegation mechanism

When the application class loader receives a class-loading request, it does not handle the request immediately. Instead it delegates the request to its parent loader, which in turn delegates to its own parent. Only if the parent cannot load the class is the request passed back down to the child loader. The request travels one level at a time in this way until a loader that can load the class is found.

This mechanism is also known as the sandbox security mechanism. It prevents developers' code from tampering with the classes that the JDK itself loads.
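
The delegation order described above can be written out explicitly. This is a simplified sketch of the parent-first logic, not the JDK's actual source: ParentFirstLoader and loadOrNull are hypothetical names, and the sketch assumes a non-null parent loader.

```java
import java.util.Objects;

// Hypothetical loader that spells out the parent-first order.
final class ParentFirstLoader extends ClassLoader {

    ParentFirstLoader(ClassLoader parent) {
        super(Objects.requireNonNull(parent, "parent"));
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // 1. Was the class already loaded by this loader?
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                try {
                    // 2. Delegate upward first.
                    c = getParent().loadClass(name);
                } catch (ClassNotFoundException notFoundByParents) {
                    // 3. Only when every parent fails does this loader try itself.
                    //    The inherited findClass always throws ClassNotFoundException.
                    c = findClass(name);
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    /** Convenience helper for the demo: returns null instead of throwing. */
    static Class<?> loadOrNull(ClassLoader loader, String name) {
        try {
            return loader.loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = new ParentFirstLoader(ClassLoader.getSystemClassLoader());
        // java.lang.String is resolved by a parent, so the JDK's own class is returned.
        System.out.println(loadOrNull(loader, "java.lang.String") == String.class);
    }
}
```

Because the request always reaches the parents first, a user-supplied class named java.lang.String could never replace the real one through such a loader.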

2.3 Breaking the parent delegation mechanism

  • Write a custom class loader and override the loadClass method
  • Use the thread context class loader

2.4 Java virtual machine entry class

The sun.misc.Launcher class

3. Execution engine

The execution engine interprets and executes the bytecode instructions, handing them to the operating system for the actual execution.

4. Native interface

4.1 native methods

A native method is an operation that cannot be handled at the Java level. It can only call a native function library (a C library) through the native interface.

4.2 Native Interface

A set of interfaces for calling native function libraries.
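
Native methods can be spotted through reflection, because the native keyword is recorded as a method modifier. The following is a minimal sketch; NativeMethodCheck and isNative are hypothetical names, and it assumes Object.hashCode is declared native, as it is in the HotSpot JDKs I am aware of.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical helper: reports whether a declared method carries the native modifier.
final class NativeMethodCheck {

    static boolean isNative(Class<?> type, String methodName, Class<?>... parameterTypes) {
        try {
            Method method = type.getDeclaredMethod(methodName, parameterTypes);
            return Modifier.isNative(method.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        // Implemented in the JVM's native code.
        System.out.println("Object.hashCode native? " + isNative(Object.class, "hashCode"));
        // Implemented in plain Java.
        System.out.println("String.length native? " + isNative(String.class, "length"));
    }
}
```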

5. Native method stack

When a native method is loaded, the corresponding C library function is executed, and it runs in this stack area.

6. Program counter

Every thread has its own program counter. Its main job is to record which code instruction comes next, much like an execution plan.

Internally it maintains pointers to the method bytecode in the method area. The execution engine reads the next instruction to execute from the program counter.

It takes up very little space and acts as a line-number indicator for the code that the current thread is executing.

It never triggers an OOM.

7. Method area

A runtime memory area shared by all threads. It stores the structural information of each class (one Class object per class), including fields, methods, constructors, and the runtime constant pool.

Although the JVM specification describes the method area as a logical part of the heap, it also has the alias Non-Heap, whose purpose is to distinguish it from the heap.

Its main implementations are the permanent generation and metaspace. It is subject to GC.

Metaspace uses physical (native) memory directly, so by default its maximum size is limited only by the available native memory unless -XX:MaxMetaspaceSize is set.

8. Java stack

The stack is mainly responsible for executing methods. It is private to each thread and is released when the thread dies, so it has no garbage collection problem. Values of the eight primitive types and object references are allocated in the stack memory of the method.

The default size is 512K to 1024K and can be changed with a parameter such as -Xss1024k.

8.1 Stack and queue data structures

Stack (FILO): first in, last out

Queue (FIFO): first in, first out
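
The difference between the two orders can be shown with the standard ArrayDeque class. This is a small illustrative sketch; the class name StackVsQueue and its two methods are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Queue;

// Hypothetical demo: the same three values leave a stack and a queue in different order.
final class StackVsQueue {

    /** Pushes 1, 2, 3 onto a stack and returns the first value popped. */
    static int firstOutOfStack() {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(1);
        stack.push(2);
        stack.push(3);
        return stack.pop(); // the last value pushed comes out first
    }

    /** Offers 1, 2, 3 to a queue and returns the first value polled. */
    static int firstOutOfQueue() {
        Queue<Integer> queue = new ArrayDeque<>();
        queue.offer(1);
        queue.offer(2);
        queue.offer(3);
        return queue.poll(); // the first value offered comes out first
    }

    public static void main(String[] args) {
        System.out.println("stack: " + firstOutOfStack());
        System.out.println("queue: " + firstOutOfQueue());
    }
}
```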

8.2 Stored data

  • Local variables (Local Variable). These include the method's formal parameters and return value
  • Operand stack (Operand Stack). Holds the various push and pop operations
  • Frame data (Frame Data). Each one corresponds to a method. In stack space a method is called a stack frame

8.3 Execution process

The unit of execution in the stack is the stack frame, and each stack frame corresponds to one method call.

  • First the main method is pushed onto the stack and becomes a stack frame
  • Calling another method pushes another frame onto the stack
  • A stack frame stores the method's local variable table, operand stack, dynamic link, method exit, and so on
  • The stack size depends on the JVM implementation and is usually 256K to 756K
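
Since every call pushes a frame, unbounded recursion eventually exhausts the thread's stack. The sketch below (StackDepthDemo is a hypothetical name) counts how many frames fit before a StackOverflowError is thrown. The number it prints depends on the JVM, the frame size, and the -Xss setting, so no particular value should be expected.

```java
// Hypothetical demo: measures how deep recursion can go on the current thread.
final class StackDepthDemo {

    private static int depth;

    private static void recurse() {
        depth++;     // one more frame on the stack
        recurse();   // never returns normally
    }

    /** Recurses until the stack is exhausted and returns the depth reached. */
    static int maxDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The stack is full. Unwinding to this handler pops all the frames again.
        }
        return depth;
    }

    public static void main(String[] args) {
        // Try running with different values, for example -Xss256k and -Xss1024k.
        System.out.println("frames before overflow: " + maxDepth());
    }
}
```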

9. Relationship between the method area, stack, and heap

10. Heap

10.1 Heap memory structure

The default initial size is 1/64 of physical memory and the default maximum size is 1/4. In production these two values are usually set to the same number, so that the garbage collector does not have to recompute and resize the heap after each collection, which wastes resources.

Off-heap memory: objects are allocated in memory outside the Java virtual machine's heap. This memory is managed directly by the operating system rather than by the virtual machine, which reduces the impact of garbage collection on the application to some extent. Off-heap memory is created with the non-public Unsafe class or with ByteBuffer from the NIO package.

The off-heap memory size has a default and can be specified with -XX:MaxDirectMemorySize=.
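
Of the two ways to obtain off-heap memory, the NIO route is the public one. A minimal sketch, with DirectBufferDemo and its methods as hypothetical names:

```java
import java.nio.ByteBuffer;

// Hypothetical demo: allocates a buffer outside the Java heap and uses it.
final class DirectBufferDemo {

    /** Allocates the given number of bytes as a direct (off-heap) buffer. */
    static ByteBuffer allocateOffHeap(int bytes) {
        return ByteBuffer.allocateDirect(bytes);
    }

    /** Writes a value into off-heap memory and reads it back. */
    static int roundTrip(int value) {
        ByteBuffer buffer = allocateOffHeap(Integer.BYTES);
        buffer.putInt(0, value);
        return buffer.getInt(0);
    }

    public static void main(String[] args) {
        System.out.println("direct buffer: " + allocateOffHeap(1024).isDirect());
        System.out.println("heap buffer:   " + ByteBuffer.allocate(1024).isDirect());
        System.out.println("round trip:    " + roundTrip(42));
    }
}
```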

10.1.1 JDK1.7

Logically the heap is divided into three areas:

  • Young generation (Young Generation Space)

    • Eden (Eden Space)
    • Survivor space (Survivor 0 Space)
    • Survivor space (Survivor 1 Space)
  • Old generation (Tenure Generation Space)
  • Permanent generation (Permanent Space), i.e. the method area

Physically it is divided into two areas:

  • Young generation
  • Old generation

10.1.1.1 Heap memory GC process

The main process is as follows:

  • When Eden is full, a minor GC is triggered. Objects that are still alive have their age incremented by 1 and are stored in the from space
  • When Eden fills up again, another GC is triggered. Surviving objects in Eden are placed in the to space, then all surviving objects in the from space are also placed in the to space, and their age is incremented by 1. After every GC the from and to spaces swap roles: whichever space is empty is the to space
  • When the survivor space is full, another GC is triggered. When an object's age reaches 15, it is moved to the old generation

    • The MaxTenuringThreshold parameter sets the age at which objects are moved
  • When the old generation is full, a Full GC is triggered. If the old generation still cannot store the objects, an OOM is reported directly

Note: every GC increments the age of the surviving objects by 1.
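
The aging and promotion steps above can be modelled with a few lists. This is a toy simulation under the article's description, not how HotSpot is implemented: MinorGcSketch, Obj, and the alive flag are hypothetical, and liveness is set by hand instead of being computed from references.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy model of Eden, the two survivor spaces, and the old generation.
final class MinorGcSketch {

    static final int MAX_TENURING_THRESHOLD = 15;

    static final class Obj {
        int age;
        boolean alive = true; // set to false by hand to simulate a dead object
    }

    List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    Obj allocate() {
        Obj o = new Obj();
        eden.add(o);
        return o;
    }

    void minorGc() {
        // Survivors from Eden and from the from space are copied into the to space.
        for (List<Obj> space : Arrays.asList(eden, from)) {
            for (Obj o : space) {
                if (!o.alive) {
                    continue; // dead objects are simply not copied
                }
                o.age++;
                if (o.age >= MAX_TENURING_THRESHOLD) {
                    old.add(o); // old enough: promote
                } else {
                    to.add(o);
                }
            }
            space.clear();
        }
        // Swap roles: the space that is now empty becomes the next to space.
        List<Obj> emptied = from;
        from = to;
        to = emptied;
    }

    public static void main(String[] args) {
        MinorGcSketch heap = new MinorGcSketch();
        Obj survivor = heap.allocate();
        for (int gc = 1; gc <= MAX_TENURING_THRESHOLD; gc++) {
            heap.minorGc();
        }
        System.out.println("age " + survivor.age + ", promoted: " + heap.old.contains(survivor));
    }
}
```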

10.1.2 JDK1.8

Compared with 1.7, the only change is that the permanent generation is replaced by metaspace. Metaspace is stored in physical memory rather than inside the JVM.

As a result, the size of metaspace is no longer limited by the virtual machine's memory size. It is bounded by the memory currently available on the system.

The default size ratio of the young generation to the old generation is 1:2. Set it with -XX:NewRatio=n, where n is the old generation's share relative to the young generation.

The default ratio of Eden Space to each Survivor Space is 8:1. Set the ratio of Eden to a survivor space with -XX:SurvivorRatio.

Logical layers:

  • Young generation (Young Generation Space)

    • Eden (Eden Space)
    • Survivor space (Survivor 0 Space)
    • Survivor space (Survivor 1 Space)
  • Old generation (Tenure Generation Space)
  • Metaspace (method area)

Physical layers:

  • Young generation: occupies 1/3 of the heap
  • Old generation: occupies 2/3 of the heap
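
The two ratios translate into region sizes with simple arithmetic. The sketch below is a worked example only: HeapLayout is a hypothetical class, and a real JVM additionally rounds the sizes to its own alignment, so actual numbers will differ slightly.

```java
// Hypothetical calculator: derives region sizes from NewRatio and SurvivorRatio.
final class HeapLayout {

    final long young;
    final long old;
    final long eden;
    final long survivor; // size of ONE survivor space

    HeapLayout(long heap, int newRatio, int survivorRatio) {
        // NewRatio = old / young, so the young generation gets 1 part out of (newRatio + 1).
        young = heap / (newRatio + 1);
        old = heap - young;
        // SurvivorRatio = eden / survivor, and there are two survivor spaces.
        survivor = young / (survivorRatio + 2);
        eden = young - 2 * survivor;
    }

    public static void main(String[] args) {
        // A 300 MB heap with the default ratios NewRatio=2 and SurvivorRatio=8.
        HeapLayout layout = new HeapLayout(300, 2, 8);
        System.out.println("young=" + layout.young + " old=" + layout.old
                + " eden=" + layout.eden + " survivor=" + layout.survivor);
    }
}
```

With a 300 MB heap this gives a 100 MB young generation (80 MB Eden plus two 10 MB survivor spaces) and a 200 MB old generation, matching the 1/3 and 2/3 split.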

10.2 Heap parameter tuning

10.2.1 Common heap parameters

Parameter             Effect
-Xms                  Sets the initial heap size; defaults to 1/64 of physical memory
-Xmx                  Sets the maximum heap size; defaults to 1/4 of physical memory
-XX:+PrintGCDetails   Prints a detailed GC log

Simulating an OOM

// Set the maximum heap memory to 10m 
//-Xms10m -Xmx10m -XX:+PrintGCDetails
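
A program that can be run with the flags above might look like the following. It is a sketch rather than the author's original test code: HeapOomDemo and fillHeap are hypothetical names. On JDK 9 and later, -XX:+PrintGCDetails is deprecated in favor of -Xlog:gc*.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo. Run with: java -Xms10m -Xmx10m -XX:+PrintGCDetails HeapOomDemo
final class HeapOomDemo {

    /**
     * Keeps allocating 1 MB arrays and holding on to them until either maxChunks
     * is reached or the heap is exhausted. Returns the number of chunks allocated.
     */
    static int fillHeap(int maxChunks) {
        List<byte[]> chunks = new ArrayList<>();
        try {
            while (chunks.size() < maxChunks) {
                chunks.add(new byte[1024 * 1024]);
            }
        } catch (OutOfMemoryError e) {
            int allocated = chunks.size();
            chunks.clear(); // release the memory before doing anything else
            System.out.println("OOM after " + allocated + " MB: " + e.getMessage());
            return allocated;
        }
        return chunks.size();
    }

    public static void main(String[] args) {
        // With a 10 MB heap this fails after a handful of chunks.
        fillHeap(Integer.MAX_VALUE);
    }
}
```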

Let's look in detail at what the GC process does and how to read the GC log.

Format: name: occupancy before GC -> occupancy after GC (total size)

// GC caused by an allocation failure
// [young generation: occupancy before GC -> occupancy after GC (total size)] heap occupancy before GC -> heap occupancy after GC (total heap size)
[GC (Allocation Failure)
    [PSYoungGen: 1585K->504K(2560K)] 1585K->664K(9728K), 0.0009663 secs]
    [Times: user=0.00 sys=0.00, real=0.00 secs] 
    
    
[Full GC (Allocation Failure)
 [PSYoungGen: 0K->0K(2560K)] 
 [ParOldGen: 590K->573K(7168K)] 590K->573K(9728K),
 [Metaspace: 3115K->3115K(1056768K)], 0.0049775 secs] 
 [Times: user=0.00 sys=0.00, real=0.01 secs] 
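
The "before -> after (total)" triples in these log lines are regular enough to extract with a regular expression. A small sketch, assuming the JDK 8 style format shown above; GcLogLine, region, and freedKb are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for entries such as "PSYoungGen: 1585K->504K(2560K)".
final class GcLogLine {

    private static final Pattern REGION =
            Pattern.compile("(\\w+): (\\d+)K->(\\d+)K\\((\\d+)K\\)");

    /** Returns {before, after, total} in KB for the named region, or null if it is absent. */
    static long[] region(String line, String name) {
        Matcher m = REGION.matcher(line);
        while (m.find()) {
            if (m.group(1).equals(name)) {
                return new long[] {
                    Long.parseLong(m.group(2)),
                    Long.parseLong(m.group(3)),
                    Long.parseLong(m.group(4))
                };
            }
        }
        return null;
    }

    /** KB reclaimed in the named region; the region must be present in the line. */
    static long freedKb(String line, String name) {
        long[] r = region(line, name);
        return r[0] - r[1];
    }

    public static void main(String[] args) {
        String line = "[PSYoungGen: 1585K->504K(2560K)] 1585K->664K(9728K), 0.0009663 secs]";
        System.out.println("young generation freed " + freedKb(line, "PSYoungGen") + "K");
    }
}
```

For the minor GC line above, the young generation went from 1585K to 504K, so 1081K was reclaimed or moved out of it.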

11. Garbage collection algorithms

11.1 Garbage collection types

  • Ordinary GC (minor GC): happens in the young generation and is very frequent
  • Global GC (major GC): garbage collection in the old generation. A major GC is often accompanied by at least one minor GC

11.2 Classification of garbage collection algorithms

11.2.1 Reference counting

Main idea: each object carries a counter that is incremented every time a reference to the object is created. When the number of references to the object drops to zero, garbage collection is triggered for it. Generally not used.

Disadvantages:

  • Every new object needs a counter that must be maintained, which is wasteful
  • Circular references are hard to handle
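
The circular-reference problem is easy to reproduce with a hand-rolled counter. This is a toy model with hypothetical names (RefCountSketch, Node, acquire, release, setField); the JVM itself does not count references this way.

```java
// Hypothetical toy model of reference counting.
final class RefCountSketch {

    static final class Node {
        int refCount;
        Node field;
    }

    /** A variable starts pointing at the node. */
    static Node acquire(Node n) {
        n.refCount++;
        return n;
    }

    /** A variable stops pointing at the node. */
    static void release(Node n) {
        n.refCount--;
    }

    /** Stores target in owner.field, adjusting the counts of the old and new targets. */
    static void setField(Node owner, Node target) {
        if (owner.field != null) {
            owner.field.refCount--;
        }
        owner.field = target;
        if (target != null) {
            target.refCount++;
        }
    }

    /** Builds a two-object cycle, drops both outside references, returns the remaining counts. */
    static int[] cycleCounts() {
        Node a = acquire(new Node());
        Node b = acquire(new Node());
        setField(a, b);
        setField(b, a);
        release(a);
        release(b);
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = cycleCounts();
        // Neither count reaches zero, although nothing outside the cycle refers to a or b.
        System.out.println(counts[0] + ", " + counts[1]);
    }
}
```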

11.2.2 Copying algorithm

Main idea: copy the surviving objects directly into another area.

Advantage: no memory fragmentation

Disadvantage: takes up a lot of space

Use case: the young generation is collected with the copying algorithm. After a minor GC, the surviving objects are copied into the to space.

11.2.3 Mark-sweep algorithm

Main idea: traverse all references starting from the root references, mark which objects need to be cleaned up, and then clean them up. It completes in two steps.

Disadvantages: the whole program is paused during garbage collection, and memory fragmentation occurs.

11.2.4 Mark-compact algorithm

Main idea: the same as mark-sweep, with a final compaction step added that defragments memory. It completes in three steps.

Disadvantages: low efficiency, because objects have to be moved.
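
The extra compaction step can be pictured on an array that stands in for heap memory. In this toy sketch (CompactSketch is a hypothetical name) a null slot represents a hole left behind by the sweep, and compaction slides the live entries to the front.

```java
import java.util.Arrays;

// Hypothetical toy model: an array of slots, where null marks a swept (free) slot.
final class CompactSketch {

    /** Slides the live (non-null) entries to the front and returns how many there are. */
    static int compact(String[] heap) {
        int free = 0; // index of the next free slot at the front
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) {
                String live = heap[i];
                heap[i] = null;       // vacate the old position
                heap[free++] = live;  // move the object forward (may be the same slot)
            }
        }
        return free;
    }

    public static void main(String[] args) {
        String[] heap = { "A", null, "B", null, "C" }; // fragmented after a sweep
        int live = compact(heap);
        System.out.println(live + " live objects: " + Arrays.toString(heap));
    }
}
```

After compaction all free space is contiguous at the end, which is what removes fragmentation, at the cost of moving objects.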

11.3 Comparison of the main garbage collection algorithms

11.3.1 Memory efficiency

Copying > mark-sweep > mark-compact

11.3.2 Memory tidiness

Copying = mark-compact > mark-sweep

11.3.3 Memory utilization

Mark-compact = mark-sweep > copying

11.3.4 Optimal algorithm

Use different algorithms for different scenarios to get the best result.

Young generation: objects are short-lived and the death rate is high, so the copying algorithm is generally used.

Old generation: the area is large and the survival rate is high, so a hybrid of mark-sweep and mark-compact is generally used.

The old generation is generally implemented with mark-sweep or with a mix of mark-sweep and mark-compact. Take the CMS collector in HotSpot as an example: CMS is based on Mark-Sweep and is very efficient at reclaiming objects. To deal with fragmentation, CMS uses the Serial Old collector, which is based on the Mark-Compact algorithm, as a compensation measure: when memory reclamation goes badly (a Concurrent Mode Failure caused by fragmentation), Serial Old performs a Full GC to compact the old generation's memory.

11.3.5 GCRoots

The description of the mark-sweep algorithm above mentioned a term, the root reference. So what is a root reference?

A root reference is also called a GC root (GCRoots). It is the root node from which the garbage collection algorithm traverses objects: the traversal starts from this object and works downward to determine which objects need to be reclaimed.

The marking process of garbage collection is: start from the GCRoots objects and search downward. If no chain of references connects an object to any GC root, the object is unusable.

In other words, traverse from the GCRoots: whatever can be reached is not garbage, and whatever cannot be reached is garbage and is judged dead.

11.3.5.1 Reachable and unreachable objects

A reachable object is one whose reference chain has a GC root at the top.

An unreachable object is one whose reference chain does not have a GC root at the top.

In plain terms: reachable objects are objects that belong to something, and the thing they belong to is called a GC root. Unreachable objects are objects that belong to nothing.
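
Reachability is a plain graph traversal. The following toy sketch uses object names as strings and a map of outgoing references; ReachabilitySketch, mark, and garbage are hypothetical names, and the example graph is invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy model: objects are names, refs maps each object to what it references.
final class ReachabilitySketch {

    /** Returns every object reachable from the given roots. */
    static Set<String> mark(Map<String, List<String>> refs, Collection<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (marked.add(obj)) { // first visit: follow its outgoing references
                pending.addAll(refs.getOrDefault(obj, Collections.emptyList()));
            }
        }
        return marked;
    }

    /** Returns the objects in refs that no chain of references connects to a root. */
    static Set<String> garbage(Map<String, List<String>> refs, Collection<String> roots) {
        Set<String> unreachable = new TreeSet<>(refs.keySet());
        unreachable.removeAll(mark(refs, roots));
        return unreachable;
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("root", Arrays.asList("a"));
        refs.put("a", Arrays.asList("b"));
        refs.put("c", Arrays.asList("d")); // c and d only reference each other
        refs.put("d", Arrays.asList("c"));
        System.out.println(garbage(refs, Arrays.asList("root")));
    }
}
```

Note that the c and d cycle is found to be garbage here, which is exactly the case reference counting cannot handle.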

11.3.5.2 Which references can serve as GCRoots

  • Local variable references in the stack
  • Static field references in metaspace
  • Constant references in metaspace
  • References held by native methods in the native method stack

Put bluntly, they are all the references exposed to developers.

12. Garbage collector

A garbage collector is an implementation of the GC algorithms.

There are four main types of garbage collector, but seven ways to use them.

12.1 Four garbage collectors

12.1.1 Serial garbage collector (Serial)

A single thread performs garbage collection, and all other threads are paused while it runs.

Enabled with -XX:+UseSerialGC

12.1.2 Parallel garbage collector (Parallel)

Multiple threads perform garbage collection, and all other threads are paused while they run.

12.1.3 Concurrent garbage collector (CMS)

GC threads and user threads run at the same time.

12.1.4 G1 Garbage collector

Region-based garbage collection. Physically there is no split between a young generation and an old generation. Heap memory is divided into about 2048 small regions, each 1M to 32M in size, and every region can serve as Eden Space, Survivor 0 Space, Survivor 1 Space, or the old generation.

Overall it uses the mark-compact algorithm, and locally it uses the copying algorithm. The copying algorithm moves the objects that survive a GC from one region to another, and the overall mark-compact approach avoids memory fragmentation.

During garbage collection a region is reclaimed directly, and the surviving objects are copied to a to region or an old region.

Logically the heap has four kinds of zone, whose sizes vary and are allocated on demand: Eden Space, Survivor Space, Old, and Humongous. Humongous regions store very large objects, usually in contiguous regions. When there are not enough contiguous regions, a Full GC is triggered to clean up the surrounding regions so that the large object can be stored.

G1 heap memory diagram

G1 garbage collection

There is a large object that three regions cannot hold, so a Full GC is performed.

Execution process

  • Initial mark. GC runs with multiple threads and marks the GCRoots
  • Concurrent mark. User threads and GC threads run at the same time. The GC threads traverse all objects reachable from the GCRoots and mark them
  • Remark. Corrects the marks made during concurrent marking for objects whose state changed because the user program kept running, including objects that need to be unmarked. Run by GC threads
  • Filter and reclaim. Cleans up the marked objects. Run by GC threads
  • The user threads resume

12.1.4.1 Case study

  • Initial mark, with G1 triggered by a large object

  • Concurrent mark

  • Remark, filter-and-reclaim, and a Full GC for the large object

12.1.4.2 Common G1 parameters

-XX:+UseG1GC   Enables G1
-XX:G1HeapRegionSize=n : Sets the size of a G1 region. The value is a power of 2 in the range 1M to 32M. The goal is to divide the minimum Java heap size into about 2048 regions
-XX:MaxGCPauseMillis=n : Maximum pause time. This is a soft target: the JVM will try (but does not guarantee) to keep pauses below it

-XX:InitiatingHeapOccupancyPercent=n   Heap occupancy percentage that triggers GC; the default is 45
-XX:ConcGCThreads=n   Number of threads used by concurrent GC
-XX:G1ReservePercent=n  Percentage of memory reserved as free space, to reduce the risk of overflowing the target space; the default is 10%

12.2 Common parameters

DefNew      Default New Generation // serial garbage collector, name used for the young generation
Tenured     Old  // serial garbage collector, name used for the old generation
ParNew      Parallel New Generation // parallel garbage collector, name used for the young generation
PSYoungGen  Parallel Scavenge // parallel garbage collector for the young and old generations, name used for the young generation
ParOldGen   Parallel Old Generation // parallel garbage collector for the young and old generations, name used for the old generation

12.3 Young generation garbage collectors

The diagram above shows the garbage collectors that can be used in the young and old generations. Let's go through them one by one.

12.3.1 Serial GC (Serial/Serial Copying)

Young generation: uses the Serial Copying garbage collector, which uses the copying algorithm

Old generation: uses the Serial Old garbage collector by default, which uses the mark-sweep and mark-compact algorithms

Set with -XX:+UseSerialGC

12.3.2 Parallel GC (ParNew)

Young generation: uses the ParNew garbage collector, which uses the copying algorithm

Old generation: uses the Serial Old garbage collector (this is not recommended), which uses the mark-sweep and mark-compact algorithms

Enabled with -XX:+UseParNewGC

12.3.3 Parallel collection GC (Parallel/Parallel Scavenge)

Young generation: uses parallel garbage collection

Old generation: uses parallel garbage collection. This is the default garbage collector in Java 1.8

A question: what is the difference between the Parallel and Parallel Scavenge collectors?

Like ParNew, the Parallel Scavenge collector is a young generation garbage collector that uses the copying algorithm and collects with multiple threads in parallel. It is commonly known as the throughput-first collector.

Parallel Scavenge is an adaptive collector: the virtual machine collects performance monitoring information as the system runs and dynamically adjusts its parameters to provide the most suitable pause time or the maximum throughput.

Its focus is:

Controllable throughput. Throughput = time running user code / (time running user code + garbage collection time)
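
As a worked example of the formula, suppose the user code ran for 99 ms and garbage collection took 1 ms. The numbers are invented for illustration, and Throughput is a hypothetical helper class.

```java
// Hypothetical helper: evaluates the throughput formula for given timings.
final class Throughput {

    /** user time / (user time + GC time); both arguments use the same time unit. */
    static double of(double userMillis, double gcMillis) {
        return userMillis / (userMillis + gcMillis);
    }

    public static void main(String[] args) {
        // 99 ms of user code and 1 ms of GC: 99 / (99 + 1).
        System.out.println(Throughput.of(99, 1));
    }
}
```

Here the throughput is 99 / 100, so 99% of the elapsed time went to user code.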

In addition, when Parallel Scavenge is selected for the young generation, parallel garbage collection is activated for the old generation by default.

Enabled with -XX:+UseParallelGC or -XX:+UseParallelOldGC; each activates the other

-XX:ParallelGCThreads=n sets how many GC threads to start

With more than 8 CPUs, N is about 5/8 of the CPU count

With 8 or fewer CPUs, N is the actual CPU count

12.4 Old generation garbage collectors

12.4.1 Serial garbage collector (Serial Old/Serial MSC)

Serial Old is the old generation version of the Serial garbage collector. It is a single-threaded collector that uses the mark-compact algorithm and is the old generation collector used in Client mode.

It is paired with the young generation's Serial GC.

12.4.2 Parallel collection (Parallel Old/Parallel MSC)

Parallel Old is implemented with the mark-compact algorithm.

It is paired with the young generation's Parallel Scavenge GC.

12.4.3 Concurrent mark-sweep GC

The CMS collector (Concurrent Mark Sweep): a collector whose goal is the shortest possible collection pause time.

It suits Internet sites and the servers of B/S systems, where server response speed matters.

CMS is well suited to server applications with large heaps and many CPU cores, and it was the preferred collector for large applications before G1 appeared.

During marking the GC threads run; during sweeping they run alongside the user threads.

Enabled with the -XX:+UseConcMarkSweepGC option

It is used together with the Parallel New GC (ParNew) collector in the young generation.

When CMS cannot be used because CPU pressure is too high, SerialGC serves as the backup collector.

12.4.3.1 CMS execution process

  • Initial mark (CMS initial mark). Finds all the GCRoots. Executed by GC threads while the user threads are paused
  • Concurrent mark (CMS concurrent mark). Traverses from the GCRoots alongside the user threads and marks the objects that need to be cleared
  • Remark (CMS remark). Corrects the marks from the concurrent marking phase for objects that no longer need to be reclaimed because the user program kept running
  • Concurrent sweep (CMS concurrent sweep). Clears all marked objects alongside the user threads

12.4.3.2 Advantages and disadvantages

Advantages:

  • Concurrent collection with low pauses

Disadvantages:

  • Concurrent execution puts heavy pressure on CPU resources
  • The mark-sweep algorithm produces a lot of memory fragmentation

12.5 Garbage collector summary

Parameter (-XX:+...)   Young generation collector   Young generation algorithm   Old generation collector   Old generation algorithm
UseSerialGC            SerialGC                     Copying                      Serial Old GC              Mark-compact
UseParNewGC            Parallel New GC              Copying                      Serial Old GC              Mark-compact
UseParallelGC          Parallel Scavenge GC         Copying                      Parallel GC                Mark-compact
UseConcMarkSweepGC     Parallel New GC              Copying                      CMS and Serial Old GC      Mark-sweep
UseG1GC                Mark-compact overall, copying locally
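
To check which pair of collectors a given flag actually selected, the running JVM can be asked through the standard management API. A minimal sketch, with ActiveCollectors as a hypothetical class name:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the garbage collectors registered in this JVM.
final class ActiveCollectors {

    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Run with different flags, for example -XX:+UseSerialGC or -XX:+UseG1GC,
        // and compare the output.
        System.out.println(names());
    }
}
```

The reported names are the JVM's own labels and vary by collector and JDK version, so they will not match the parameter names in the table literally.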

General logic of garbage collection algorithm

12.6 Differences between CMS and G1

  • G1 does not cause memory fragmentation
  • G1 controls memory precisely and can collect garbage precisely: given a configured GC pause time, it collects the regions with the most garbage

13. JMM

The Java memory model. It is a specification.

When a thread operates on a variable, it first copies the variable from main memory (physical memory) into its own working memory (stack memory), and writes it back to main memory after updating it.

Properties:

  • Atomicity
  • Visibility
  • Ordering
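
Visibility is the property that is easiest to demonstrate. In the sketch below (VisibilityDemo is a hypothetical name) a worker thread spins on a flag that the main thread later sets. Declaring the flag volatile guarantees that the worker sees the update; without volatile the worker might keep reading a stale copy and never stop.

```java
// Hypothetical demo of the visibility guarantee given by volatile.
final class VisibilityDemo {

    // volatile: a write by one thread becomes visible to subsequent reads in other threads.
    private static volatile boolean stop;

    /** Starts a spinning worker, sets the flag, and reports whether the worker stopped. */
    static boolean workerStops() {
        stop = false;
        Thread worker = new Thread(() -> {
            while (!stop) {
                // spin until the write to stop becomes visible
            }
        });
        worker.setDaemon(true); // never keep the JVM alive because of the demo thread
        worker.start();
        try {
            Thread.sleep(50);
            stop = true;
            worker.join(2000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            stop = true;
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + workerStops());
    }
}
```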

Copyright notice
This article was written by [Stack]. Please include a link to the original when reposting. Thank you.