Embedded assembly in iOS

2020-11-09 13:44:12 osc_zvjde550

Author | sindrilin

Source | sindrilin's Nest

http://sindrilin.com/2019/10/23/write_assembly_in_ios.html

I have wanted to write an article about using assembly in iOS for a long time, but never got around to it. Earlier, while optimizing startup time, I hooked objc_msgSend and inserted assembly instructions to measure how long method calls take, but that was as far as it went. Recently the project has been going through security hardening, which requires writing more assembly to improve security (the instruction set used in this article is ARM64), hence this article.

Embedded assembly format

__asm__ [qualifier](
    instructions
    : [output operand list]
    : [input operand list]
    : [clobbered register list]
);

For example, given three variables a, b and c, the statement a = b + c can be implemented with the following assembly code:

__asm__ volatile(
    "mov x0, %[b]\n"
    "mov x1, %[c]\n"
    "add x2, x0, x1\n"
    "mov %[a], x2\n"
    : [a]"=r"(a)
    : [b]"r"(b), [c]"r"(c)
    : "x0", "x1", "x2"    // registers overwritten by the instructions above
);
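For comparison, the same addition can be written without naming x0~x2 at all, so that the compiler picks the registers and no clobber list is needed. This is only a sketch and not part of the original article: the function name add_asm and the fallback branch for non-ARM64 targets are assumptions added so that the example compiles everywhere.

```c
#include <stdint.h>

/* Hypothetical helper: returns b + c, using inline assembly on ARM64. */
static int64_t add_asm(int64_t b, int64_t c) {
    int64_t a;
#if defined(__aarch64__)
    /* No fixed registers are named, so nothing has to be declared as
       clobbered: the compiler substitutes the registers it allocated. */
    __asm__ volatile(
        "add %[a], %[b], %[c]\n"
        : [a] "=r"(a)
        : [b] "r"(b), [c] "r"(c)
    );
#else
    /* Fallback for other architectures (assumption, for portability only). */
    a = b + c;
#endif
    return a;
}
```

Calling add_asm(2, 3) yields 5 on either branch. Leaving register selection to the compiler is usually the least fragile option when the instruction itself does not require a specific register.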

volatile

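The volatile keyword tells the compiler not to optimize the assembly block: it may not delete the block or merge duplicate copies of it, even when the outputs look unused. An assembly block without output operands is already treated as volatile, so for such blocks declaring it makes no difference.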
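The case where volatile does matter is a block that has an output but no inputs, such as reading a counter. The sketch below is an illustration under assumptions, not code from the original article: read_ticks is a hypothetical name, it assumes the ARM64 virtual counter cntvct_el0 is readable from user space, and other architectures fall back to the standard clock() function.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: reads a non-decreasing tick counter. */
static uint64_t read_ticks(void) {
#if defined(__aarch64__)
    uint64_t ticks;
    /* The block has an output and no inputs. Without volatile the compiler
       may assume two reads produce the same value and keep only one. */
    __asm__ volatile("mrs %[t], cntvct_el0" : [t] "=r"(ticks));
    return ticks;
#else
    /* Fallback for other architectures (assumption, for portability only). */
    return (uint64_t)clock();
#endif
}
```

Two consecutive calls are expected to return non-decreasing values; with volatile removed on ARM64, an optimizing build would be allowed to reuse the first result for the second call.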

Operands

An operand is written as "[modifier]constraint", that is, an access modifier followed by a constraint. For example, "=r" means that the operand is write-only and is held in a general-purpose register.

  • modifier

Keyword    Meaning
=          Write-only; generally used for output operands
+          Read-write; can only be used for output operands
&          Early clobber: the output register is written before all inputs have been read, so it must not share a register with an input

  • constraint

Keyword    Meaning
f          Floating-point register f0~f7
G/H        Floating-point constant immediate
I/L/K      Immediate used in data-processing instructions
J          Index in the range -4095~4095
l/r        General-purpose register r0~r15
M          Constant in the range 0~32, or a power of 2
m          Memory address
w          Vector register s0~s31
X          Operand of any type

Note: the register ranges in this table follow the 32-bit ARM constraint list. On ARM64, r selects a general-purpose register (x0~x30) and w selects a floating-point/SIMD register (v0~v31).
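The two less obvious modifiers, + and &, can be sketched as follows. This is an illustration rather than code from the original article: the function names increment and add_via_tmp are hypothetical, and the non-ARM64 branches exist only so that the example compiles on other machines.

```c
#include <stdint.h>

/* "+r": the operand is read first and then written in place. */
static int64_t increment(int64_t v) {
#if defined(__aarch64__)
    __asm__ volatile("add %[v], %[v], #1\n" : [v] "+r"(v));
#else
    v += 1;   /* fallback (assumption, for portability only) */
#endif
    return v;
}

/* "=&r": tmp is written by the first instruction, before b is read by the
   second one, so tmp must not be given the same register as b. */
static int64_t add_via_tmp(int64_t a, int64_t b) {
    int64_t tmp;
#if defined(__aarch64__)
    __asm__ volatile(
        "mov %[tmp], %[a]\n"
        "add %[tmp], %[tmp], %[b]\n"
        : [tmp] "=&r"(tmp)
        : [a] "r"(a), [b] "r"(b)
    );
#else
    tmp = a + b;   /* fallback (assumption, for portability only) */
#endif
    return tmp;
}
```

Without the & in add_via_tmp, the compiler would be allowed to place tmp and b in the same register, and the first mov would then destroy b before it is used.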

Instructions

ARM64 has too many instructions to cover here; references for them are listed at the end of the article. Only a few key notations used inside instructions are explained here:

  • %0~%N / %[param]

When C code is mixed with assembly, operands that begin with % refer to the C parameters. %[param] refers to a parameter by its declared name, while the anonymous form %N refers to it by position, counting the outputs first and then the inputs (so a, b and c correspond to 0, 1 and 2):

  __asm__ volatile(
      "mov x0, %1\n"
      "mov x1, %2\n"
      "add x2, x0, x1\n"
      "mov %0, x2\n"
      : "=r"(a)
      : "r"(b), "r"(c)
      : "x0", "x1", "x2"    // registers overwritten by the instructions above
  );

In practice, the %N positions shift whenever an operand is added or reordered, so the named form %[param] is recommended because it is more readable.

  • [reg]

Most of the time while a program runs, registers hold the addresses at which data is stored. Wrapping a register in [] means that the value in the register is used as an address for accessing memory. The following instructions load the data stored at address 0x10086 into register x1 and then store it to the memory at address 0x100086:

  "mov x0, #0x86\n"
  "movk x0, #0x1, lsl #16\n"     // x0 = 0x10086 (too wide for a single mov)
  "ldr x1, [x0]\n"               // x1 = value stored at address x0
  "mov x2, #0x86\n"
  "movk x2, #0x10, lsl #16\n"    // x2 = 0x100086
  "str x1, [x2]\n"               // store x1 to address x2
  • #1 / #0x1

A leading # marks an immediate (a constant). Writing immediates in hexadecimal is recommended.
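The [reg] addressing form can be tried out safely by passing real pointers as operands instead of hard-coded addresses. The following is a sketch under assumptions, not code from the original article: copy_word is a hypothetical name, and the non-ARM64 branch is only there so that the example compiles elsewhere.

```c
#include <stdint.h>

/* Hypothetical helper: copies one 64-bit word from *src to *dst. */
static void copy_word(const int64_t *src, int64_t *dst) {
#if defined(__aarch64__)
    int64_t tmp;
    __asm__ volatile(
        "ldr %[tmp], [%[src]]\n"   /* tmp = value stored at address src */
        "str %[tmp], [%[dst]]\n"   /* store tmp to address dst */
        : [tmp] "=&r"(tmp)         /* written before dst is read */
        : [src] "r"(src), [dst] "r"(dst)
        : "memory"                 /* the block reads and writes memory */
    );
#else
    *dst = *src;   /* fallback (assumption, for portability only) */
#endif
}
```

The "memory" clobber tells the compiler that the block touches memory it cannot see through the operands, so values cached in registers are written back before the block and reloaded after it.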

Calling convention

ARM64 uses the AAPCS64 calling convention. Arguments are passed from left to right in registers x0~x7; when there are more than 8 arguments, the remaining ones are put on the stack from right to left. The return value comes back in x0, or, when it is too large for registers, through memory whose address is passed in x8. The register rules are as follows:

Register   Special name   Rule
r31        SP             Holds the address of the top of the stack
r30        LR             Holds the function return address
r29        FP             Holds the stack frame address of the current function
r19~r28                   Callee-saved registers (the callee must preserve them)
r18                       Platform register; not recommended as a temporary register
r17        IP1            Intra-procedure-call register; not recommended as a temporary register
r16        IP0            Same as r17; also holds the system call number for the svc software interrupt
r9~r15                    Temporary registers (when a function address is passed into inline assembly as an operand, one of these is used to hold it)
r8                        Indirect result register (otherwise the same as r9~r15)
r0~r7                     Argument registers; r0 also serves as the return value register
NZCV                      Status flags register
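How these rules apply to ordinary C functions can be annotated as below. This is a plain C sketch with hypothetical names (sum10, make_pair, make_quad); the comments describe where AAPCS64 places the values on ARM64, on the assumption that composites of up to 16 bytes are returned in x0/x1 and larger ones through the buffer addressed by x8.

```c
#include <stdint.h>

/* a0..a7 travel in x0..x7; a8 and a9 do not fit and go on the stack. */
static int64_t sum10(int64_t a0, int64_t a1, int64_t a2, int64_t a3,
                     int64_t a4, int64_t a5, int64_t a6, int64_t a7,
                     int64_t a8, int64_t a9) {
    return a0 + a1 + a2 + a3 + a4 + a5 + a6 + a7 + a8 + a9;
}

/* 16 bytes: small enough to come back in x0 (lo) and x1 (hi). */
struct pair { int64_t lo, hi; };
static struct pair make_pair(int64_t lo, int64_t hi) {
    struct pair p = { lo, hi };
    return p;
}

/* 32 bytes: the caller reserves the result buffer and passes its
   address in x8; the callee writes the result there. */
struct quad { int64_t v[4]; };
static struct quad make_quad(int64_t x) {
    struct quad q = { { x, x + 1, x + 2, x + 3 } };
    return q;
}
```

Disassembling a caller of make_quad on ARM64 is a quick way to see x8 being loaded with a stack address before the call.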

In practice

Debugger detection

When hardening an iOS application, sysctl together with kinfo_proc can be used to detect whether the application is being debugged:

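#include <stdbool.h>
#include <string.h>
#include <sys/types.h>
#include <sys/sysctl.h>
#include <unistd.h>

static inline __attribute__((always_inline)) bool checkTracing(void) {
    size_t size = sizeof(struct kinfo_proc);
    struct kinfo_proc proc;
    memset(&proc, 0, size);
    
    // MIB that selects the kinfo_proc entry of the current process
    int name[4];
    name[0] = CTL_KERN;
    name[1] = KERN_PROC;
    name[2] = KERN_PROC_PID;
    name[3] = getpid();
    
    sysctl(name, 4, &proc, &size, NULL, 0);
    // P_TRACED is set while the process is being traced by a debugger
    return (proc.kp_proc.p_flag & P_TRACED) != 0;
}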

However, because tools such as fishhook can rewrite lazy symbol addresses directly, calling sysctl through its symbol is not safe. As a result, most developers replace the call with inline assembly that issues the system call itself:

size_t size = sizeof(struct kinfo_proc);
struct kinfo_proc proc;
memset(&proc, 0, size);

int name[4];
name[0] = CTL_KERN;
name[1] = KERN_PROC;
name[2] = KERN_PROC_PID;
name[3] = getpid();

__asm__(
    "mov x0, %[name_ptr]\n"
    "mov x1, #4\n"
    "mov x2, %[proc_ptr]\n"
    "mov x3, %[size_ptr]\n"
    "mov x4, #0x0\n"
    "mov x5, #0x0\n"
    "mov w16, #202\n"
    "svc #0x80\n"
    :
    :[name_ptr]"r"(&name), [proc_ptr]"r"(&proc), [size_ptr]"r"(&size)
);

return proc.kp_proc.p_flag & P_TRACED;

Pitfall

The fatal problem with embedding assembly in C code is that the function prologue places the temporary variables on the stack and keeps their addresses in registers. The assembly block above overwrites x0~x5 without declaring them as clobbered, so the compiler is free to use those same registers for the input operands. At run time, the mixed code above ends up as follows:

// Code generated at function entry for the temporary variables
add x0, sp, #0x24       // x0 holds name
add x1, sp, #0x34       // x1 holds proc
add x2, sp, #0x20       // x2 holds size

......

// Inline assembly
mov x0, x0              // name is assigned correctly
mov x1, #4              // x1 (proc) is overwritten
mov x2, x1              // x2 receives 4 instead of proc, and size is lost
mov x3, x2
mov x4, #0x0
mov x5, #0x0
mov w16, #0xca
svc #0x80

Because of the registers the compiler chose for the temporary variables, the svc call into sysctl does not receive the correct arguments, and the application ends up hanging.
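Besides the repairs described next, the clobbered register list from the format section can be used to keep the compiler away from the registers that the block overwrites. The sketch below is an assumption-laden illustration, not the author's fix: sum_with_fixed_regs is a hypothetical function that mimics the x0~x3 setup sequence with plain arithmetic instead of a system call, and the non-ARM64 branch is only for portability.

```c
#include <stdint.h>

/* Hypothetical helper: returns a + 4 + b + c, loading x0~x3 in the same
   order as the sysctl sequence loads its arguments. */
static int64_t sum_with_fixed_regs(int64_t a, int64_t b, int64_t c) {
    int64_t out;
#if defined(__aarch64__)
    __asm__ volatile(
        "mov x0, %[a]\n"
        "mov x1, #4\n"        /* harmless now: no input can live in x1 */
        "mov x2, %[b]\n"
        "mov x3, %[c]\n"
        "add x0, x0, x1\n"
        "add x0, x0, x2\n"
        "add x0, x0, x3\n"
        "mov %[out], x0\n"
        : [out] "=r"(out)
        : [a] "r"(a), [b] "r"(b), [c] "r"(c)
        /* Declared as overwritten, so the compiler will not allocate
           a, b, c or out to any of these registers. */
        : "x0", "x1", "x2", "x3"
    );
#else
    out = a + 4 + b + c;   /* fallback (assumption, for portability only) */
#endif
    return out;
}
```

Applied to the sysctl block, the list would presumably also need x4, x5, x16, "memory" and "cc", since the kernel writes into proc and may change the flags.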

Repair

Inserting a temporary variable

The compiled instructions give the following correspondence:

Variable   Register holding it   Argument register
name       x0                    x0
proc       x1                    x2
size       x2                    x3

If the registers that hold the temporary variables are the same as the argument registers of the svc call, nothing gets destroyed.

ARM64 calling convention: arguments are put on the stack from right to left.

Because the detection function takes no parameters, the temporary variables are kept in x0~x2 in the order name, proc, size. Inserting one unused temporary variable between name and proc is therefore enough to line the registers up with the arguments (this relies on the register allocation observed in this build):

size_t size = sizeof(struct kinfo_proc);
struct kinfo_proc proc;
memset(&proc, 0, size);

int placeholder;
int name[4];
name[0] = CTL_KERN;
name[1] = KERN_PROC;
name[2] = KERN_PROC_PID;
name[3] = getpid();

After compiling, the instructions become:

// Code generated at function entry for the temporary variables
add x0, sp, #0x24       // x0 holds name
add x1, sp, #0x34       // x1 holds placeholder
add x2, sp, #0x38       // x2 holds proc
add x3, sp, #0x20       // x3 holds size

......

// Inline assembly
mov x0, x0
mov x1, #4
mov x2, x2
mov x3, x3
mov x4, #0x0
mov x5, #0x0
mov w16, #0xca
svc #0x80

Changing the order of instructions

The instructions that set up the arguments destroy the values already held in the registers. As long as each register is read before it is overwritten, nothing is lost:

__asm__(
    "mov x0, %[name_ptr]\n"
    "mov x3, %[size_ptr]\n"
    "mov x2, %[proc_ptr]\n"
    "mov x1, #4\n"
    "mov x4, #0x0\n"
    "mov x5, #0x0\n"
    "mov w16, #202\n"
    "svc #0x80\n"
    :
    :[name_ptr]"r"(&name), [proc_ptr]"r"(&proc), [size_ptr]"r"(&size)
);

The compiled instructions are as follows:

// Inline assembly
mov x0, x0              // x0 holds name
mov x3, x2              // x3 holds size
mov x2, x1              // x2 holds proc
mov x1, #4
mov x4, #0x0
mov x5, #0x0
mov w16, #0xca
svc #0x80
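Another way to avoid depending on instruction order is to bind C variables to specific registers with register variables, so that no mov into x0 or x16 has to be written by hand. The sketch below is not from the original article and rests on several assumptions: raw_getpid is a hypothetical name, the svc path is only compiled on Apple ARM64, it uses system call number 20 for getpid as this article does later, and it assumes the kernel returns the result in x0 and may overwrite x1. Every other platform simply calls getpid().

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: obtains the process id with a raw system call. */
static pid_t raw_getpid(void) {
#if defined(__APPLE__) && defined(__aarch64__)
    register long x0 __asm__("x0");            /* result comes back here */
    register long x16 __asm__("x16") = 20;     /* 20 = getpid */
    __asm__ volatile(
        "svc #0x80\n"
        : "=r"(x0)
        : "r"(x16)
        : "x1", "memory", "cc"
    );
    return (pid_t)x0;
#else
    /* Fallback for other platforms (assumption, for portability only). */
    return getpid();
#endif
}
```

Because the compiler itself places the values into x0 and x16, it also knows that those registers are in use and will not hand them to anything else around the block.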

Full assembly implementation

When assembly is mixed with C code, there is no guarantee about which registers will be destroyed, so implementing the whole logic directly in assembly is a good choice. Three points need attention:

  1. Make sure that no prologue or epilogue instructions are generated around the function body; use __attribute__((naked)) for this

  2. All variables are stored on the stack, so the use of the stack has to be managed by hand

  3. Use the safe registers (r19~r28)

First, determine how much stack space is needed, based on the call sysctl(name, 4, &proc, &size, NULL, 0):

  • name occupies 4 * sizeof(int), that is 0x10

  • proc: on arm64, sizeof() gives a length of 0x288

  • &size points to a size_t, which takes 0x8

  • Total: 0x2a0

On entry, the function has to push the FP/LR registers onto the stack so that it can return correctly. In addition, the 10 registers r19~r28 have to be saved on the stack. This gives the following stack layout while the function is running:

---------- 
|   LR   |
----------  sp + 0x2f8
|   FP   |
----------  sp + 0x2f0
|   r20  |
----------  sp + 0x2e8
|   r19  |
----------  sp + 0x2e0
|   r22  |
----------  sp + 0x2d8
|   r21  |
----------  sp + 0x2d0
|   r24  |
----------  sp + 0x2c8
|   r23  |
----------  sp + 0x2c0
|   r26  |
----------  sp + 0x2b8
|   r25  |
----------  sp + 0x2b0
|   r28  |
----------  sp + 0x2a8
|   r27  |
----------  sp + 0x2a0
| p_size |
----------  sp + 0x298
|  proc  |
----------  sp + 0x10
|  name  |  
----------  sp
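The offsets in this layout can be double-checked with a few constants. This is only a sketch: the enumerator names are hypothetical, and the sizes are the ones stated in the text above (in particular 0x288 for struct kinfo_proc on arm64), not values measured here.

```c
#include <stddef.h>

enum {
    NAME_SIZE    = 0x10,    /* int name[4] */
    PROC_SIZE    = 0x288,   /* sizeof(struct kinfo_proc), as stated above */
    PSIZE_SIZE   = 0x8,     /* the size_t that &size points to */
    SAVED_REGS   = 12,      /* x19~x28 plus FP and LR */

    NAME_OFFSET  = 0x0,                            /* sp          */
    PROC_OFFSET  = NAME_OFFSET + NAME_SIZE,        /* sp + 0x10   */
    PSIZE_OFFSET = PROC_OFFSET + PROC_SIZE,        /* sp + 0x298  */
    LOCALS_SIZE  = PSIZE_OFFSET + PSIZE_SIZE,      /* 0x2a0       */
    FRAME_SIZE   = LOCALS_SIZE + SAVED_REGS * 8    /* 0x300       */
};

/* ARM64 requires sp to stay 16-byte aligned when it is used for access. */
static int frame_is_aligned(void) {
    return LOCALS_SIZE % 16 == 0 && FRAME_SIZE % 16 == 0;
}
```

Both the local area (0x2a0) and the whole frame (0x300) are multiples of 16, so sp remains properly aligned after each adjustment.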

After r19~r28 have been saved on the stack, six of them are used to hold the parameters and a scratch value:

------------------------
| Parameter | Register |
------------------------
|   name    |   r19    |
------------------------
|   proc    |   r20    |
------------------------
|  p_size   |   r21    |
------------------------
|   size    |   r22    |
------------------------
|    sp     |   r23    |
------------------------
|   temp    |   r24    |
------------------------

After confirming the use of space on the stack , You can start to implement it step by step :

Function entry and exit

Two things have to be done at function entry and exit: push and pop FP/LR, and push and pop r19~r28. The registers are popped in the reverse order of the pushes, and ret then returns to the caller:

__asm__ volatile(
    "stp x29, x30, [sp, #-0x10]!\n"
    "stp x19, x20, [sp, #-0x10]!\n"
    "stp x21, x22, [sp, #-0x10]!\n"
    "stp x23, x24, [sp, #-0x10]!\n"
    "stp x25, x26, [sp, #-0x10]!\n"
    "stp x27, x28, [sp, #-0x10]!\n"
    
    ......
    
    "ldp x27, x28, [sp], #0x10\n"
    "ldp x25, x26, [sp], #0x10\n"
    "ldp x23, x24, [sp], #0x10\n"
    "ldp x21, x22, [sp], #0x10\n"
    "ldp x19, x20, [sp], #0x10\n"
    "ldp x29, x30, [sp], #0x10\n"
    "ret\n"
);
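A complete, much smaller function built on the same save and restore pattern might look like the sketch below. It is an illustration with assumptions rather than the author's code: add_with_saved_regs is a hypothetical name, the naked version is only compiled with Clang on ARM64 (where the attribute is known to be honoured), and all other configurations use a plain C fallback.

```c
#include <stdint.h>

#if defined(__clang__) && defined(__aarch64__)
/* Naked: the compiler emits no prologue or epilogue, so the body must
   save what it uses, restore it in reverse order, and return by itself. */
__attribute__((naked)) int64_t add_with_saved_regs(int64_t a, int64_t b) {
    __asm__ volatile(
        "stp x29, x30, [sp, #-0x10]!\n"   /* push FP/LR */
        "stp x19, x20, [sp, #-0x10]!\n"   /* push callee-saved registers */
        "mov x19, x0\n"                   /* a arrives in x0 */
        "mov x20, x1\n"                   /* b arrives in x1 */
        "add x0, x19, x20\n"              /* result goes back in x0 */
        "ldp x19, x20, [sp], #0x10\n"     /* pop in reverse order */
        "ldp x29, x30, [sp], #0x10\n"
        "ret\n"
    );
}
#else
/* Fallback for other compilers and architectures (assumption). */
int64_t add_with_saved_regs(int64_t a, int64_t b) {
    return a + b;
}
#endif
```

The last pair pushed is the first pair popped; swapping the order would load x19/x20 with the saved FP/LR values and return to a wrong address.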

Allocating stack space

The temporary variables occupy 0x2a0 bytes in total, and 5 registers are needed to hold the variables

__asm__ volatile(
    ......
    "sub sp, sp, #0x2a0\n"
    
    // Allocate stack space and keep the variables in registers
    "mov x19, sp\n"             // x19 = name
    "add x20, sp, #0x10\n"      // x20 = proc
    "add x21, sp, #0x298\n"     // x21 = p_size
    "mov x22, #0x288\n"         // x22 = size
    "mov x23, sp\n"             // x23 = sp
    "str x22, [x21]\n"          // *p_size = size
    
    "add sp, sp, #0x2a0\n"      // release the stack space before returning
    ......
);

kinfo_proc

With the memory for proc determined, the following code:

size_t size = sizeof(struct kinfo_proc);
struct kinfo_proc proc;
memset(&proc, 0, size);

has to be converted into the corresponding assembly. proc is held in x20 and size in x22; memset takes three arguments, which are passed in order:

__asm__ volatile(
    ......
    
    "mov x24, %[memset_ptr]\n"
    "mov x0, x20\n"             // dst = proc
    "mov x1, #0x0\n"            // value = 0
    "mov x2, x22\n"             // length = size
    "blr x24\n"
    
    ......
    :
    :[memset_ptr]"r"(memset)
);

name

name is an int array. With its storage location fixed, its four 4-byte values have to be stored into the corresponding memory locations, which are laid out as follows:

-------------
|  name[3]  |  
-------------  sp + 0xc
|  name[2]  |  
-------------  sp + 0x8
|  name[1]  |  
-------------  sp + 0x4
|  name[0]  |  
-------------  sp

In addition, name[3] has to be filled in with getpid(), which can be obtained through an svc call (for the svc system call numbers, see Kernel Syscalls in the reading list at the end of the article)

#define CTL_KERN        1
#define KERN_PROC       14
#define KERN_PROC_PID   1

__asm__ volatile(
    ......
    
    // getpid
    "mov x0, #0\n"
    "mov w16, #20\n"        // 20 = system call number of getpid
    "svc #0x80\n"           // the pid is returned in x0
    "mov x3, x0\n"          // name[3]=getpid()

    // Set the values and store them
    "mov x0, #0x1\n"        // CTL_KERN
    "mov x1, #0xe\n"        // KERN_PROC
    "mov x2, #0x1\n"        // KERN_PROC_PID
    "str w0, [x23, #0x0]\n"
    "str w1, [x23, #0x4]\n"
    "str w2, [x23, #0x8]\n"
    "str w3, [x23, #0xc]\n"
    
    ......
);

sysctl

Finally, call sysctl, loading each argument into its register according to the correspondence between parameters and registers:

__asm__ volatile(
    ......

    "mov x0, x19\n"
    "mov x1, #0x4\n"
    "mov x2, x20\n"
    "mov x3, x21\n"
    "mov x4, #0x0\n"
    "mov x5, #0x0\n"
    "mov w16, #202\n"
    "svc #0x80\n"
            
    ......
);

Flag check

Finally, p_flag has to be read and tested against P_TRACED. This requires the offset of p_flag inside the structure in order to access the data; struct extern_proc is defined as follows:

struct extern_proc {
    union {
        struct {
            struct  proc *__p_forw; /* Doubly-linked run/sleep queue. */
            struct  proc *__p_back;
        } p_st1;
        struct timeval __p_starttime;   /* process start time */
    } p_un;
    
    #define p_forw p_un.p_st1.__p_forw
    #define p_back p_un.p_st1.__p_back
    #define p_starttime p_un.__p_starttime
    
    struct  vmspace *p_vmspace;     /* Address space. */
    struct  sigacts *p_sigacts;     /* Signal actions, state (PROC ONLY). */
    int     p_flag;                 /* P_* flags. */
    char    p_stat;                 /* S* process status. */
    pid_t   p_pid;                  /* Process identifier. */
    pid_t   p_oppid;         /* Save parent pid during ptrace. XXX */
    int     p_dupfd;         /* Sideways return value from fdopen. XXX */
    /* Mach related  */
    caddr_t user_stack;     /* where user stack was allocated */
    void    *exit_thread;   /* XXX Which thread is exiting? */
    int             p_debugger;             /* allow to debug */
    boolean_t       sigwait;        /* indication to suspend */
    /* scheduling */
    u_int   p_estcpu;        /* Time averaged value of p_cpticks. */
    int     p_cpticks;       /* Ticks of cpu time. */
    fixpt_t p_pctcpu;        /* %cpu for this process during p_swtime */
    void    *p_wchan;        /* Sleep address. */
    char    *p_wmesg;        /* Reason for sleep. */
    u_int   p_swtime;        /* Time swapped in or out. */
    u_int   p_slptime;       /* Time since last blocked. */
    struct  itimerval p_realtimer;  /* Alarm timer. */
    struct  timeval p_rtime;        /* Real time. */
    u_quad_t p_uticks;              /* Statclock hits in user mode. */
    u_quad_t p_sticks;              /* Statclock hits in system mode. */
    u_quad_t p_iticks;              /* Statclock hits processing intr. */
    int     p_traceflag;            /* Kernel trace points. */
    struct  vnode *p_tracep;        /* Trace to vnode. */
    int     p_siglist;              /* DEPRECATED. */
    struct  vnode *p_textvp;        /* Vnode of executable. */
    int     p_holdcnt;              /* If non-zero, don't swap. */
    sigset_t p_sigmask;     /* DEPRECATED. */
    sigset_t p_sigignore;   /* Signals being ignored. */
    sigset_t p_sigcatch;    /* Signals being caught by user. */
    u_char  p_priority;     /* Process priority. */
    u_char  p_usrpri;       /* User-priority based on p_cpu and p_nice. */
    char    p_nice;         /* Process "nice" value. */
    char    p_comm[MAXCOMLEN + 1];
    struct  pgrp *p_pgrp;   /* Pointer to process group. */
    struct  user *p_addr;   /* Kernel virtual addr of u-area (PROC ONLY). */
    u_short p_xstat;        /* Exit status for wait; also stop signal. */
    u_short p_acflag;       /* Accounting flags. */
    struct  rusage *p_ru;   /* Exit information. XXX */
};

The size of union p_un is 0x10, and the two pointers before p_flag occupy 0x8 each, which gives the following memory layout of the structure:

-------------------
|      p_flag     |  
-------------------  kinfo_proc + 0x20
|     p_sigacts   |  
-------------------  kinfo_proc + 0x18
|     p_vmspace   |  
-------------------  kinfo_proc + 0x10
|    union p_un   |  
-------------------  kinfo_proc
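The 0x20 offset can be checked with offsetof. Since struct extern_proc is only available in Apple's headers, the sketch below uses a hypothetical mirror structure, extern_proc_head, that copies just the leading fields with generic pointer types; it assumes a 64-bit target where pointers are 8 bytes and struct timeval is 16 bytes.

```c
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical mirror of the leading fields of struct extern_proc. */
struct extern_proc_head {
    union {
        struct {
            void *p_forw;
            void *p_back;
        } p_st1;
        struct timeval p_starttime;
    } p_un;              /* 0x10 bytes on a 64-bit target */
    void *p_vmspace;     /* offset 0x10 */
    void *p_sigacts;     /* offset 0x18 */
    int   p_flag;        /* offset 0x20 */
};

/* Returns the byte offset of p_flag inside the mirror structure. */
static size_t p_flag_offset(void) {
    return offsetof(struct extern_proc_head, p_flag);
}
```

On Apple platforms the same check can be made against the real type with offsetof(struct kinfo_proc, kp_proc.p_flag), because kp_proc is the first member of struct kinfo_proc.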

Test the flag and return the result in x0:

#define P_TRACED        0x00000800

__asm__ volatile(
    ......
    
    "ldr w24, [x20, #0x20]\n"       // x24 = proc.kp_proc.p_flag (a 32-bit int)
    "mov x25, #0x800\n"             // x25 = P_TRACED
    "and x0, x24, x25\n"            // x0 = x24 & x25
    
    ......
);
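In C terms, the three instructions amount to reading a 32-bit value at byte offset 0x20 and masking it. The following sketch is an equivalent written for illustration only: is_traced, P_FLAG_OFFSET and P_TRACED_BIT are hypothetical names chosen so that they do not collide with the system headers, and the buffer is treated as raw bytes instead of a struct kinfo_proc.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define P_FLAG_OFFSET 0x20          /* offset of p_flag derived above */
#define P_TRACED_BIT  0x00000800    /* same value as P_TRACED */

/* Hypothetical helper: kinfo points to a buffer laid out like
   struct kinfo_proc; returns true when the traced bit is set. */
static bool is_traced(const void *kinfo) {
    int32_t flag;
    /* Equivalent of: ldr w24, [x20, #0x20] */
    memcpy(&flag, (const unsigned char *)kinfo + P_FLAG_OFFSET, sizeof flag);
    /* Equivalent of: and x0, x24, x25 */
    return (flag & P_TRACED_BIT) != 0;
}
```

The assembly version returns the masked value itself (0 or 0x800) in x0, so a caller that expects a bool should treat any non-zero result as true.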

Further reading

[1]https://www.theiphonewiki.com/wiki/Kernel_Syscalls 
[2]https://juejin.im/post/5cadeda55188251ad87b0eed 
[3]https://juejin.im/post/5a786c555188257a6854b18c 
[4]http://infocenter.arm.com/help/topic/com.arm.doc.ihi0055b/IHI0055B_aapcs64.pdf


Copyright notice
This article was created by [osc_zvjde550]. Please include a link to the original when reposting. Thank you.