Process Concepts (Part 2)

2022-09-23 09:01:56 Shikokumaru

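Environment variables

Introduction

Q: Why do we have to give a path when we run an executable we compiled ourselves, but not when we run system commands such as ls and pwd? (Note: commands like ls and pwd are themselves executable programs, stored under /usr/bin/.)

A: The system has an environment variable that stores the search path for programs.

Basic concepts

  • Environment variables generally refer to parameters that describe the environment in which the operating system runs programs
  • For example, when we write C/C++ code we never say where the dynamic or static libraries we link against are located, yet linking still succeeds and produces an executable. The reason is that the toolchain knows a set of default search directories, and related environment variables help it find the libraries.
  • Environment variables usually serve a specific purpose and usually have a global character in the system

Note: the Linux shell has two kinds of variables, environment variables and ordinary variables. Environment variables are as described above; an ordinary variable looks like this: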

image-20220817150757732

The aaaa above is an ordinary variable; it does not show up in the output of env, the command that lists environment variables.

Q: How do we define an environment variable ourselves?

A: Use export NAME=xxx (NAME is the name of the environment variable and xxx is the value we want to give it).

image-20220817151148943

image-20220817151159703

Q: How do we remove an environment variable?

A: The unset NAME command removes an environment variable we defined ourselves:

image-20220817152758382

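Common environment variables

  • PATH: specifies the search path for commands
  • HOME: specifies the user's home directory (the default directory after the user logs in to the Linux system)
  • SHELL: the user's shell; its value is usually /bin/bash.

How to view environment variables

Use the env command to view environment variables: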

image-20220816202007581

Use echo $NAME to view a single variable, where NAME is the name of the environment variable.

Usage example:

image-20220816203631323

Now we can answer the opening question. Commands such as ls and pwd work without a path because their executables are located in directories listed in the PATH environment variable. Our own executables cannot be run that way because they are not in any directory listed in PATH.

Q: What if we want to run an executable we compiled ourselves directly, without a path?

A:

Method 1: Copy the compiled executable into a directory listed in PATH (for example, sudo cp NAME PATH, where NAME is the executable's file name and PATH is the destination directory).

Method 2: Add the directory that contains the executable to PATH.

image-20220817151924073

image-20220817152239604

Note: the method above appends to the environment variable rather than overwriting it. It is better not to overwrite PATH, but if you do overwrite it by accident, restarting the terminal fixes it: the change only exists in the memory of the current shell, so it does not persist.

Note: the which command can report where a command lives because it searches for the executable using the PATH environment variable.

image-20220817152509261

image-20220817155743199

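Commands related to environment variables

  1. echo: displays the value of an environment variable

  2. export: sets a new environment variable

  3. env: displays all environment variables

  4. unset: removes an environment variable

  5. set: displays locally defined shell variables as well as environment variables (that is, both ordinary variables and environment variables)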

    image-20220817182343567

How environment variables are organized

image-20220817182620998

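Every program receives an environment table. The environment table is an array of character pointers, each pointing to an environment string terminated by '\0'; the array itself ends with a NULL pointer.

How to get environment variables in code

First, look at the first two parameters of the C/C++ main() function:

Code: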
image-20220817184103206

Execution result:

image-20220817184148855

Analysis of the argv structure:

image-20220817184624796

As shown above, the argc and argv[] passed to main() carry the command-line arguments: the program name and the options typed on the command line.

Q: What is the point of passing command-line options to main()?

A: Passing in different arguments lets the same program take different execution paths and produce different results. This is why the same command can behave differently when given different options.

Example: a simple calculator:

Code:

#include <stdio.h>
#include <stdlib.h>   // atoi
#include <string.h>   // strcmp
// ./process -a 10 20
// 10 + 20 = 30
// ./process -s 10 20
// 10 - 20 = -10
// ./process -m 10 20
// 10 * 20 = 200
// ./process -d 10 20
// 10 / 20 = 0
int main(int argc, char* argv[])
{
    if (argc != 4)
    {
        printf("Usage: %s [-a|-s|-m|-d] firstData secondData\n", argv[0]);
        return 1;
    }
    int x = atoi(argv[2]); // atoi() converts a char* string to an int
    int y = atoi(argv[3]);
    if (strcmp("-a", argv[1]) == 0)
    {
        printf("%d + %d = %d\n", x, y, x + y);
    }
    else if (strcmp("-s", argv[1]) == 0)
    {
        printf("%d - %d = %d\n", x, y, x - y);
    }
    else if (strcmp("-m", argv[1]) == 0)
    {
        printf("%d * %d = %d\n", x, y, x * y);
    }
    else if (strcmp("-d", argv[1]) == 0 && y != 0)
    {
        printf("%d / %d = %d\n", x, y, x / y); // integer division
    }
    else
    {
        // Unknown option, or division by zero
        printf("Usage: %s [-a|-s|-m|-d] firstData secondData\n", argv[0]);
        return 1;
    }
    return 0;
}

Usage details:

image-20220817190828365

  • Through the third parameter of main()
#include <stdio.h>
int main(int argc, char *argv[], char *env[])
{
    // env is a NULL-terminated array of "NAME=value" strings
    for (int i = 0; env[i]; i++)
    {
        printf("%s\n", env[i]);
    }
    return 0;
}
  • Through the global variable environ
#include <stdio.h>
int main()
{
    extern char** environ; // declared here because standard headers do not declare it by default
    for (int i = 0; environ[i]; i++)
    {
        printf("%s\n", environ[i]);
    }
    return 0;
}

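The global variable environ, defined in libc, points to the environment table. Standard headers do not declare it by default (glibc's <unistd.h> declares it only when _GNU_SOURCE is defined), so declare it with extern before use.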

Getting or setting environment variables through library functions

  • getenv(): gets the value of an environment variable

image-20220817205527578

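Usage example:

#include <stdio.h>
#include <stdlib.h>   // getenv
int main()
{
    char* path = getenv("PATH");   // returns NULL if PATH is not set
    if (path == NULL)
    {
        printf("PATH is not set\n");
        return 1;
    }
    printf("PATH:%s\n", path);
    return 0;
}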

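Environment variables usually have global scope

  • Environment variables usually have global scope: they can be inherited by child processes

Code: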

image-20220818103721884

Execution result:

image-20220818103659442

Note: the environment variables in the myproc process we created are also inherited from the bash process.

Note: a so-called local variable is simply a variable defined inside bash itself; it is not inherited by child processes.

Q: Since local variables are not inherited by child processes, how can echo $NAME print a local variable?

A: Most Linux commands are executed by child processes, but some are not: bash executes them itself by directly calling its own corresponding function. We call commands of this kind built-in commands, and echo is one of them. (bash also expands $NAME itself before the command runs.)

Program address space

image-20220818105830721

Test code:

image-20220818115340693

Execution result:

image-20220818115318290

Note: static local variables and global variables are stored in the same place, the global data area.

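Getting a feel for it through code:

image-20220818161605052

Execution result:

image-20220818161636827

The value and the address of g_val are the same in the child process and the parent process.

Now modify the code as follows: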

image-20220818162027599

image-20220818161948228

The code above shows that the child and the parent read the same variable at the same address, yet after g_val is modified in the child, the value changes only in the child, where it becomes 20; in the parent, g_val is still 10.

Conclusion: the addresses we use in C/C++ are not physical addresses. If they were, the situation above could not happen, because one physical address can hold only one value at a time.

The addresses above are virtual addresses! The operating system is responsible for translating virtual addresses into physical addresses.

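Process address space

When a process starts, the operating system creates an address space for it; this is the process address space.

Note: the process address space is really a kernel data structure, similar to task_struct; its type is struct mm_struct.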

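struct mm_struct
{
    // Simplified sketch; the real kernel fields are named start_code, end_code, start_data, end_data, ...
    // code area
    long code_start;
    long code_end;
    // initialized global data area
    long init_start;
    long init_end;
    // ...
};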

Q: After a program is compiled but before it is loaded into memory, does it contain addresses internally? Is it divided into regions?

A: Yes. A compiled executable has its own set of addresses and is already divided into regions; this layout exists in the file on disk.

image-20220818185833608

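At compile time, the program is treated as if it were addressed over one flat range, written here as 0000~FFFF.

Explaining the g_val phenomenon above:

Before modification:

image-20220818192640546

After modification (copy-on-write):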

image-20220818192947381

Because what gets printed is always the virtual address, &g_val stays the same even after the modification.

Q: fork has two return values. pid_t id is a single variable, so why does it hold different values?

A: pid_t id is a variable defined on the parent process's stack. Inside fork, return executes twice, once in each process. In essence, a return writes the return value, via a register, into the variable that receives it. At id = fork(), whichever process returns first triggers copy-on-write first. So the same variable holds different values: the virtual address is the same in both processes, but the physical addresses behind it are different.

Q: Is the page table fully built at the start, or generated dynamically at run time?

A: Part of the page table is set up at the start, and part is generated dynamically at run time, for example the entries for heap space.

Q: Why have a virtual address space?

A:

  1. The virtual address space adds a layer of software and hardware between the process and memory. The translation can be checked along the way, and illegal accesses can be intercepted directly, which protects memory.
  2. When a process requests more memory, only its address space needs to grow at first, which saves memory. The address space also decouples two Linux functional modules, process management and memory management.
  3. It lets every process or program view memory in one uniform way: each process has its own process address space, so all executables can be compiled and loaded in the same way, which simplifies the design and implementation of the process itself.

Linux 2.6 kernel process scheduling queues

image-20220819092431404

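The figure above shows the data structure of the process queues in the Linux 2.6 kernel.

Each CPU has one runqueue

  • With multiple CPUs, load balancing of the number of processes across CPUs has to be considered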

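Priority

  • Normal priority: 100~139 as an index into the queue array (this corresponds to the nice range -20~19, which ps -l reports as PRI 60~99); indices 0~99 are real-time priorities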

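Active queue

  • All processes whose time slice has not run out are placed in this queue according to priority

  • nr_active: the total number of processes in the running state

  • queue[140]: each element is a queue of processes; processes with the same priority are scheduled in FIFO order, so the array index is the priority!

  • How is the most suitable process selected from this structure?

    1. Traverse queue[140] starting from index 0
    2. Find the first non-empty queue, which must be the highest-priority queue
    3. Take the first process in that queue and run it. Scheduling is done!
    4. Traversing queue[140] takes constant time, but it is still too inefficient!
  • bitmap[5]: there are 140 priorities and therefore 140 process queues. To find a non-empty queue faster, 5*32 bits are used to record whether each queue is empty, which greatly speeds up the lookup!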

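Expired queue

  • The expired queue has exactly the same structure as the active queue
  • Processes placed on the expired queue are those whose time slice has run out
  • Once all processes on the active queue have been handled, the time slices of the processes on the expired queue are recalculated

The active pointer and the expired pointer

  • The active pointer always points to the active queue
  • The expired pointer always points to the expired queue
  • The active queue keeps shrinking while the expired queue keeps growing, because every running process eventually uses up its time slice.
  • At the right moment, simply swapping the contents of the active and expired pointers yields a whole new batch of active processes!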

Summary

  • Finding the most suitable process to schedule takes constant time; the cost does not grow as the number of processes grows. We call this the O(1) process scheduling algorithm!
Copyright notice
This article was written by [Shikokumaru]. Please include a link to the original when reposting. Thanks.
https://chowdera.com/2022/266/202209230842135988.html
