[In-depth understanding of computer systems] CSAPP-Experiment 2 The latest detailed explanation of BombLab 2020

Hits: 0

foreword

This chapter is based on the background of “demolition the bomb”, and analyzes the program through the gdb debugger. It is an interesting practice of assembly and [decompilation .]

This machine uses win10 +wsl2.0 + ubuntu18.04 to complete the experiment.

Click to see my full code

reference

【Construction completed】CSAPP bomb lab

【Actual combat 🚀】Teach you Bomb Lab by touching your hands

Answer

Border relations with Canada have never been better.
1 2 4 8 16 32
0 207
0 0
ionefg
4 3 2 1 6 5

Phase_1

Dump of assembler code for function phase_1:
   0x0000000000400ee0 < +0 > : sub $0x8,%rsp //Apply for stack space
   0x0000000000400ee4 < +4 > : mov $0x402400,%esi //put the memory space into the %esi register
   0x0000000000400ee9 < +9 > : callq 0x401338 < strings_not_equal > //Call the function to judge whether the strings are equal
   0x0000000000400eee <+14>:    test   %eax,%eax
   0x0000000000400ef0 < +16 > : je 0x400ef7 < phase_1+23 > //If they are equal, jump to the function exit. Otherwise explode.
   0x0000000000400ef2 < +18 > : callq 0x40143a < explode_bomb > 
   0x0000000000400ef7 < +23 > : add $0x8,%rsp //Return occupied space
   0x0000000000400efb <+27>:    retq
End of assembler dump.

Clear and clear.

So just check what the memory space $0x402400 is.

(gdb) x/s 0x402400
0x402400:       "Border relations with Canada have never been better."

Answer

Border relations with Canada have never been better.

2

Dump of assembler code for function phase_2:
   0x0000000000400efc < +0 > : push %rbp //These two lines save register information. so as not to contaminate other data
   0x0000000000400efd <+1>:     push   %rbx
   0x0000000000400efe < +2 > : sub $0x28,%rsp //Apply for space

   0x0000000000400f02 <+6>:     mov    %rsp,%rsi 
   0x0000000000400f05 < +9 > : callq 0x40145c < read_six_numbers > //read six numbers

   0x0000000000400f0a < +14 > : cmpl $0x1,(%rsp) //The top element of the stack is compared with 1, and if it is equal, jump to < phase_2+52 > . Otherwise explode. So let's look at < phase_2+52 > 
   0x0000000000400f0e < +18 > : je 0x400f30 < phase_2+52 > 
   0x0000000000400f10 < +20 > : callq 0x40143a < explode_bomb > 0x + 
   mp00000000400f15 < +25 > : j > phase

   0x0000000000400f17 < +27 > : mov -0x4(%rbx),%eax //eax saves the previous element
   0x0000000000400f1a < +30 > : add %eax,%eax //eax doubles
   0x0000000000400f1c < +32 > : cmp %eax,(%rbx) //Compare eax and rbx. Jump to < phase_2+41 > if equal , otherwise explode.
   0x0000000000400f1e <+34>:    je     0x400f25 <phase_2+41>
   0x0000000000400f20 <+36>:    callq  0x40143a <explode_bomb>

   0x0000000000400f25 < +41 > : add $0x4,%rbx //rbx takes the next element
   0x0000000000400f29 < +45 > : cmp %rbp,%rbx //See if the end element is reached, if not, continue the loop; if so, go to the program exit
   0x0000000000400f2c <+48>:    jne    0x400f17 <phase_2+27>
   0x0000000000400f2e <+50>:    jmp    0x400f3c <phase_2+64>

   0x0000000000400f30 < +52 > : lea 0x4(%rsp),%rbx //rbx takes the next element
   0x0000000000400f35 < +57 > : lea 0x18(%rsp),%rbp //rbp records the address of the last element of six elements

   0x0000000000400f3c < +64 > : add $0x28,%rsp //Return space
   0x0000000000400f40 < +68 > : pop %rbx //Restore register
   0x0000000000400f41 <+69>:    pop    %rbp


   0x0000000000400f42 <+70>:    retq
End of assembler dump.

Therefore, it can be determined that:

  • There are six numbers
  • The first number is 1
  • The next number is twice the previous number

Answer

1 2 4 8 16 32

Phase 3

Dump of assembler code for function phase_3:
   0x0000000000400f43 < +0 > : sub $0x18,%rsp //Apply for stack space

   0x0000000000400f47 < +4 > : lea 0xc(%rsp),%rcx //The rcx and rdx registers store two local variables respectively.
   0x0000000000400f4c <+9>:     lea    0x8(%rsp),%rdx

   //Here reads a memory content to the register esi (parameter), and at the same time clears eax (return), these two steps are to prepare for calling the function.
   0x0000000000400f51 < +14 > : mov $0x4025cf,%esi //Check the memory value, it is "%d %d", it can be determined that the parsing format is specified. Looking at gdb, Value returned is $1 = 2 . So the value of eax is 2 after the call ends. scanf returns the number of reads, which is 2
   0x0000000000400f56 <+19>:    mov    $0x0,%eax
   0x0000000000400f5b <+24>:    callq  0x400bf0 <__isoc99_sscanf@plt>

   0x0000000000400f60 < +29 > : cmp $0x1,%eax //If eax is greater than 1, jump to < phase_3+39 > , otherwise explode.
   0x0000000000400f63 <+32>:    jg     0x400f6a <phase_3+39>
   0x0000000000400f65 <+34>:    callq  0x40143a <explode_bomb>

   //Compare 0x8(%rsp) with 7. If it is greater than 7, it explodes. So the value of 0x8(%rsp) must be less than 7. ja is an unsigned comparison, so it can be seen that the value of 1 of the first parameter must be between 0 and 7
   0x0000000000400f6a <+39>:    cmpl   $0x7,0x8(%rsp)
   0x0000000000400f6f <+44>:    ja     0x400fad <phase_3+106>

   0x0000000000400f71 < +46 > : mov 0x8(%rsp),%eax //0x8(%rsp) is moved into register eax. switch initialization

   //switch to jump. We have to print out the jump table to know what the jump rule is. x/w can print one of them, 0x00400f7c, at this time %rax = 0. So jump to < phase_3+57 > , the corresponding function value is 0xcf(207), then jump to < phase_3+123 > 
   0x0000000000400f75 < +50 > : jmpq *0x402470(,%rax,8)

   0x0000000000400f7c <+57>:    mov    $0xcf,%eax
   0x0000000000400f81 <+62>:    jmp    0x400fbe <phase_3+123>

   0x0000000000400f83 <+64>:    mov    $0x2c3,%eax
   0x0000000000400f88 <+69>:    jmp    0x400fbe <phase_3+123>

   0x0000000000400f8a <+71>:    mov    $0x100,%eax
   0x0000000000400f8f <+76>:    jmp    0x400fbe <phase_3+123>

   0x0000000000400f91 <+78>:    mov    $0x185,%eax
   0x0000000000400f96 <+83>:    jmp    0x400fbe <phase_3+123>

   0x0000000000400f98 <+85>:    mov    $0xce,%eax
   0x0000000000400f9d <+90>:    jmp    0x400fbe <phase_3+123>

   0x0000000000400f9f <+92>:    mov    $0x2aa,%eax
   0x0000000000400fa4 <+97>:    jmp    0x400fbe <phase_3+123>

   0x0000000000400fa6 <+99>:    mov    $0x147,%eax
   0x0000000000400fab <+104>:   jmp    0x400fbe <phase_3+123>
   0x0000000000400fad <+106>:   callq  0x40143a <explode_bomb>

   0x0000000000400fb2 <+111>:   mov    $0x0,%eax
   0x0000000000400fb7 <+116>:   jmp    0x400fbe <phase_3+123>
   0x0000000000400fb9 <+118>:   mov    $0x137,%eax

   //Determine whether the values ​​of 0xc(%rsp) and %eax are equal, if so, jump to the program exit, otherwise explode.
   0x0000000000400fbe <+123>:   cmp    0xc(%rsp),%eax
   0x0000000000400fc2 <+127>:   je     0x400fc9 <phase_3+134>
   0x0000000000400fc4 <+129>:   callq  0x40143a <explode_bomb>

   0x0000000000400fc9 < +134 > : add $0x18,%rsp //Release stack space
   0x0000000000400fcd <+138>:   retq
End of assembler dump.

It can be inferred

  1. There are 8 combinations of answers: the front is the label value of the switch, and the back is a corresponding memory value.
  2. Any combination can pass.

answer (not unique)

0 207

Phase_4

Dump of assembler code for function phase_4:
   0x000000000040100c < +0 > : sub $0x18,%rsp //Apply for stack space

   0x0000000000401010 <+4>:     lea    0xc(%rsp),%rcx
   0x0000000000401015 <+9>:     lea    0x8(%rsp),%rdx

   0x000000000040101a < +14 > : mov $0x4025cf,%esi // "%d %d" is still two integers
   0x000000000040101f <+19>:    mov    $0x0,%eax
   0x0000000000401024 <+24>:    callq  0x400bf0 <__isoc99_sscanf@plt>

   0x0000000000401029 <+29>:    cmp    $0x2,%eax
   0x000000000040102c < +32 > : jne 0x401035 < phase_4+41 > //%eax must be equal to 2, otherwise it will explode
   0x000000000040102e < +34 > : cmpl $0xe,0x8(%rsp)//0x8(%rsp) must be less than or equal to 14, otherwise it will explode. here also unsigned
   0x0000000000401033 <+39>:    jbe    0x40103a <phase_4+46>
   0x0000000000401035 <+41>:    callq  0x40143a <explode_bomb>

   0x000000000040103a < +46 > : mov $0xe,%edx //The third parameter
   0x000000000040103f < +51 > : mov $0x0,%esi //The second parameter
   0x0000000000401044 < +56 > : mov 0x8(%rsp),%edi //The first parameter
   //Call < func4 > , the return value must be equal to 0, otherwise it will explode
   0x0000000000401048 <+60>:    callq  0x400fce <func4>
   0x000000000040104d <+65>:    test   %eax,%eax
   0x000000000040104f <+67>:    jne    0x401058 <phase_4+76>
   //The value of 0xc(%rsp) must be equal to 0, otherwise it will explode.
   0x0000000000401051 <+69>:    cmpl   $0x0,0xc(%rsp)
   0x0000000000401056 <+74>:    je     0x40105d <phase_4+81>
   0x0000000000401058 <+76>:    callq  0x40143a <explode_bomb>

   0x000000000040105d < +81 > : add $0x18,%rsp //Return stack space
   0x0000000000401061 <+85>:    retq
End of assembler dump.

It can be determined that:

  1. The input is two integers, there are 0x8(%rsp) and 0xc(%rsp) respectively, the former is the first input and the latter is the second input
  2. The value range of input1 is [0,14]
  3. The return value of calling func4(input1, 0,14) must be 0, otherwise it will explode
  4. The value of input2 must also be 0 after the call, otherwise it will explode. Note that func4 does not modify input2, so it can be determined that input2 = 0

So let’s take a look at the decompiled code of func4 to determine how much input1 is
(I didn’t know how to recurse when I wrote the comment at the beginning, so I calculated it directly with numbers. So I will use c language later to describe the abstract meaning )

Dump of assembler code for function func4:
    // %edi input1
    // %esi 0
    // %edx 14

   0x0000000000400fce <+ 0 >: sub $ 0x8 ,%rsp //Apply for stack space

   0x0000000000400fd2 <+ 4 >: mov %edx,%eax //(%eax) =14 
   0x0000000000400fd4 <+ 6 >: sub %esi,%eax //(%eax) =14-0=14 
   0x0000000000400fd6 <+ 8 >: mov %eax,%ecx //new, (%ecx) = 14, boundary length 
   0x0000000000400fd8 <+ 10 >: shr $ 0x1f ,%ecx //Logical shift right, ecx shift right by 31 bits, leaving sign bit 0 
   0x0000000000400fdb < + 13 >:     add     %ecx,%eax //(%eax) = 14+0 = 14 
   0x0000000000400fdd <+ 15 >: sar %eax //Arithmetic right shift, default 1, arithmetic shift 1 bit. That is, eax/=2, eax=7. The break point verification is correct here

   0x0000000000400fdf <+ 17 >: lea (%rax,%rsi, 1 ),%ecx //%ecx = 7 + 0*1 = 7 
   0x0000000000400fe2 <+ 20 >: cmp %edi,%ecx //notice the current value , %edi or %rdi have not been called at all, it can be determined that %ecx must be 7, so input1 must be compared with 7. If 7<=input1, jump to <func4+36>. 
   0x0000000000400fe4 <+ 22 >: jle     0x400ff2 <func4+ 36 >

   //Here we explain input1<7. 
   0x0000000000400fe6 <+ 24 >: lea     -0x1 (%rcx),%edx //(%edx) = 7 - 1 = 6 
    //A recursive call occurs here. 
        // %edi input1 
        // %esi 0 
        // %edx 6 
   0x0000000000400fe9 <+ 27 >: callq   0x400fce <func4>
    0x0000000000400fee <+ 32 >:     add     %eax,%eax //(%eax) double, then exit the program 
   0x0000000000400ff0 <+ 34 >: jmp     0x401007 <func4+ 57 >

   //Go here to explain input1>=7 
   0x0000000000400ff2 <+ 36 >: mov $ 0x0 ,%eax
    0x0000000000400ff7 <+ 41 >: cmp %edi,%ecx //Compare input1 and 7. If 7>=input1, jump to <func4+57>, is the program exit. At this point the return is indeed 0. 
   0x0000000000400ff9 <+ 43 >: jge     0x401007 <func4+ 57 >
    0x0000000000400ffb <+ 45 >: lea     0x1 (%rcx),%esi //%esi=7+1 = 8. 
   // recursion also occurs here 
        // %edi input1 
        // %esi 8 
        // %edx 14 
   0x0000000000400ffe <+ 48 >: callq   0x400fce<func4>
    0x0000000000401003 <+ 53 >: lea     0x1 (%rax,%rax, 1 ),%eax //%eax = 2* %rax +1. That is, double the return value and add 1

   0x0000000000401007 <+ 57 >:     add     $ 0x8 ,%rsp //Return stack space 
   0x000000000040100b <+ 61 >: retq
End of assembler dump.

Since recursive calls are involved, we must distinguish changes to the parameters of the call. By analyzing the parameters (the process is in the comments), we are also keenly aware that this is a bisection recursive process.

We rearranged it into pseudocode for easy viewing and analysis:

int func4(
      int edi = input1,
      int esi = 0,
      int edx = 14
   ){
      int eax;

      eax = edx; //assign right boundary 
      eax -= esi; //subtract left boundary

      int ecx = eax; //retain the interval length value 
      ecx>>= 31 ; //sign bit, at this time ecx=0 
      eax+=ecx; //eax = 14 
      eax>>= 1 ; //eax/=2, eax=7. At this time, eax stores half of the interval length

      ecx = rax (ie eax) + rsi (ie esi); //exc stores: left boundary value + half the length of the interval. So ecx stores the index of the midpoint of the interval

      //Prepare to enter the left interval recursion 
      if (edi<ecx){
         edx = rac(ie ecx) -1 ; //Update the right vertex value of the left interval 
         eax = func4(edi,esi,edx); //Recursion, except for edx, nothing else has changed. The reason why eax is directly assigned is because func does not store the current variable through the push and pop methods, so it can be inferred that the value is directly modified here. 
         eax *= 2 ;
      }
      else  if (ecx==edi) {   //ecx>=edi,edi>=ecx, only the two are equal 
            return  0 ;
      }
      //Prepare to enter right interval recursion. At this time edi>=ecx
      {
         esi = rac(ie ecx) + 1 ; //Update the left vertex value of the right interval 
         eax = func4(edi,esi,edx);   //Recursion, except for esi, nothing else has changed. 
         eax = 2 * rax (ie eax) + 1 ;
      }

      return eax;
   }

In order to make the return value equal to 0, it cannot enter the right interval recursion (because it will be + 1), that is, enter the left interval recursion every time. So simple, the first parameter is also 0.

Answer

0 0

Phase_5

Dump of assembler code for function phase_5:
   0x0000000000401062 <+0>:     push   %rbx 
   0x0000000000401063 < +1 > : sub $0x20,%rsp //Apply for memory

   0x0000000000401067 < +5 > : mov %rdi,%rbx //The first parameter is stored in rbx
   0x000000000040106a <+8>:     mov    %fs:0x28,%rax
   0x0000000000401073 <+17>:    mov    %rax,0x18(%rsp)
   0x0000000000401078 <+22>:    xor    %eax,%eax
   0x000000000040107a < +24 > : callq 0x40131b < string_length > 
   0x000000000040107f < +29 > : cmp $0x6,%eax //The input length is stored. The input length must be 6 or it explodes.
   0x0000000000401082 <+32>:    je     0x4010d2 <phase_5+112>
   0x0000000000401084 <+34>:    callq  0x40143a <explode_bomb>
   0x0000000000401089 <+39>:    jmp    0x4010d2 <phase_5+112>

       /*
            The next step is to enter the do while loop. From 0 to 5.
       */
   0x000000000040108b < +41 > : movzbl (%rbx,%rax,1),%ecx //rbx stores the six strings passed in. Here is the meaning of traversing each char
   0x000000000040108f < +45 > : mov %cl,(%rsp)//ecx contains cl. As char, cl is enough for storage. So %rsp (top of stack) now stores the char being traversed
   0x0000000000401092 < +48 > : mov (%rsp),%rdx //new register, storing the char being traversed
   0x0000000000401096 < +52 > : and $0xf,%edx //edx only keeps the last byte
   0x0000000000401099 < +55 > : movzbl 0x4024b0(%rdx),%edx //%rdx as the subscript index stored at $0x4024b0. Here edx is updated. Let's first see what 0x4024b0 stores. 0x4024b0 < array.3449 > : "maduiersnfotvbylSo you think you can stop the bomb with ctrl-c, do you?"
   0x00000000004010a0 < +62 > : mov %dl,0x10(%rsp,%rax,1)//Save the updated rdx to the position starting with (%rsp+0x10).
   0x00000000004010a4 <+66>:    add    $0x1,%rax
   0x00000000004010a8 <+70>:    cmp    $0x6,%rax
   0x00000000004010ac <+74>:    jne    0x40108b <phase_5+41>


   0x00000000004010ae <+76>:    movb   $0x0,0x16(%rsp)
   0x00000000004010b3 < +81 > : mov $0x40245e,%esi //"flyers", as the second parameter.
   0x00000000004010b8 < +86 > : lea 0x10(%rsp),%rdi //The first parameter is 0x10(%rsp). If 0x10(%rsp) is equal to the input string, the program exits. Otherwise explode.
   0x00000000004010bd <+91>:    callq  0x401338 <strings_not_equal>
   0x00000000004010c2 <+96>:    test   %eax,%eax
   0x00000000004010c4 <+98>:    je     0x4010d9 <phase_5+119>
   0x00000000004010c6 <+100>:   callq  0x40143a <explode_bomb>
   0x00000000004010cb <+105>:   nopl   0x0(%rax,%rax,1)
   0x00000000004010d0 <+110>:   jmp    0x4010d9 <phase_5+119>

       //Just empty the eax and jump back
   0x00000000004010d2 <+112>:   mov    $0x0,%eax
   0x00000000004010d7 <+117>:   jmp    0x40108b <phase_5+41>

   0x00000000004010d9 <+119>:   mov    0x18(%rsp),%rax
   0x00000000004010de <+124>:   xor    %fs:0x28,%rax
   0x00000000004010e7 <+133>:   je     0x4010ee <phase_5+140>
   0x00000000004010e9 <+135>:   callq  0x400b30 <__stack_chk_fail@plt>

   0x00000000004010ee < +140 > : add $0x20,%rsp //Release memory
   0x000000004010f2 < +144 > : pop %rbx
   0x00000000004010f3 <+145>:   retq
End of assembler dump.

Clear and clear. Determine as follows:

  1. The input is six chars, which I call InputString here.

  2. “flyers” must be equal to the mapped string of InputString in 0x4024b0 to pass.

  3. The string in 0x4024b0 is “maduiersnfotvbylSo you think you can stop the bomb with ctrl-c, do you?”

  4. Let’s analyze the mapping relationship:

    1. Similarly, the entire calculation process is as follows:

      target char The corresponding Index in 0x4024b0 The last byte is the char of the second column
      f 9 i
      l f o
      y e n
      e 5 e
      r 6 f
      s 7 g

answer (not unique)

ionefg

Phase_6

This assembly code is too long.

And the registers are also relatively unfamiliar, and it is still relatively difficult for people who are new to assembly.

So leave a blogger’s answer.

https://mcginn7.github.io/2020/02/16/CSAPP-bomblab/#phase-6

I temporarily give up Phase_6.

later

This lab is difficult for me. Because I have not learned assembly.

When learning OS today, I found that I also need a lot of assembly knowledge. But at present, it is not considered a demand, so let it go.

I will do this Phase_6 when I feel the time is right one day.

Welcome to leave a message.

You may also like...

Leave a Reply

Your email address will not be published.