6.828 Lab1 Booting a PC Experiment Summary

Part1 PC Bootstrap

Getting Started with x86 assembly

Simulating the x86

make qemu / make qemu-nox

The PC’s Physical Address Space

The original PC, built around the 16-bit 8088, could address only 1MB of physical memory, from 0x00000000 to 0x000FFFFF. The 80286 and 80386 extended this to 16MB and 4GB respectively. For backward compatibility, modern PCs keep the original layout of the low 1MB, which leaves a "hole" from 0x000A0000 to 0x00100000 that is reserved for hardware and the BIOS. This hole divides RAM into Low Memory (below the hole) and Extended Memory (above 1MB). Due to some design limitations, JOS uses only the first 256MB of physical address space.

The BIOS occupies the 64KB region from 0x000F0000 to 0x000FFFFF. Early PCs stored the BIOS in ROM; current PCs use flash memory. The BIOS is responsible for performing basic system initialization, such as activating the graphics card and checking the amount of memory installed. After performing this initialization, the BIOS loads the operating system from some suitable location (such as a floppy disk, hard disk, CD-ROM, or network) and passes control of the machine to the operating system.

The ROM BIOS

The first instruction executed after the PC starts up is:

[f000:fff0] 0xffff0:    ljmp   $0xf000,$0xe05b

  • The IBM PC starts executing at physical address 0x000ffff0, which is at the very top of the 64KB area reserved for the ROM BIOS.
  • The PC starts executing with CS = 0xf000 and IP = 0xfff0.
  • The first instruction to be executed is a jmp instruction, which jumps to the segmented address CS = 0xf000 and IP = 0xe05b.

In real mode, physical address = 16 * segment + offset.

The address 0xffff0 is only 16 bytes below the end of the BIOS region (0x100000), so the first instruction is an ljmp that jumps backward to an earlier location in the BIOS.
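The translation can be checked with a few lines of C. This is only a sketch: the function name real_mode_phys is made up for illustration, and the 20-bit mask assumes 8088-style wraparound.

```c
#include <stdint.h>

/* Real-mode address translation: physical = segment * 16 + offset.
 * Assumption of this sketch: the result wraps at 20 bits, as on the 8088. */
uint32_t real_mode_phys(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}
```

With CS = 0xf000 and IP = 0xfff0 this gives 0xffff0, and the jump target f000:e05b gives 0xfe05b, matching the addresses gdb prints.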

Exercise 2

[f000:e05b]    0xfe05b: cmpl   $0x0,%cs:0x6ac8
[f000:e062]    0xfe062: jne    0xfd2e1
[f000:e066]    0xfe066: xor    %dx,%dx
[f000:e068]    0xfe068: mov    %dx,%ss
[f000:e06a]    0xfe06a: mov    $0x7000,%esp
[f000:e070]    0xfe070: mov    $0xf34c2,%edx
[f000:e076]    0xfe076: jmp    0xfd15c
[f000:d15c]    0xfd15c: mov    %eax,%ecx
[f000:d15f]    0xfd15f: cli
[f000:d160]    0xfd160: cld
...

After the BIOS starts, it sets up the interrupt vector table and initializes various devices. After initializing the PCI bus, it searches for a bootable device such as a floppy disk, a hard disk, or an optical drive. When it finds a bootable disk, the BIOS reads the boot loader from the disk and passes control to it.

Part2 The Boot Loader

The minimum transfer granularity of floppy disks and hard disks is the sector, and each sector is 512 bytes. If a disk is bootable, its first sector is called the boot sector and stores the boot loader code. When the BIOS finds a bootable disk, it loads the boot sector into physical addresses 0x7c00 through 0x7dff and then uses a jmp instruction to set CS:IP to 0000:7c00, passing control to the boot loader.

In 6.828, the boot loader consists of the assembly source boot/boot.S and the C source boot/main.c.

In general, the boot loader does two things:

  1. Switch the CPU from 16-bit real mode to 32-bit protected mode
  2. Read the kernel code from the disk

Exercise 3

click here

Loading the Kernel

Part3 The Kernel

Using virtual memory to work around position dependence

The kernel's link address is not the same as its load address. An operating system kernel is generally linked at a very high virtual address, for example 0xf0100000, in order to leave the low part of the address space for user programs. But many machines do not have physical memory at address 0xf0100000, so the processor's memory management hardware is used to map virtual address 0xf0100000 to physical address 0x00100000 (the 1MB mark), where the kernel is actually loaded.
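For addresses inside this kernel mapping, the translation is a fixed offset. The sketch below assumes the JOS value KERNBASE = 0xF0000000; the helper names kva_to_pa and pa_to_kva are made up for illustration and are not the JOS macros.

```c
#include <stdint.h>

#define KERNBASE 0xF0000000u   /* assumed base of the kernel's virtual mapping */

/* Virtual -> physical for an address inside the kernel's linear mapping. */
uint32_t kva_to_pa(uint32_t va) { return va - KERNBASE; }

/* Physical -> virtual, the inverse of the above. */
uint32_t pa_to_kva(uint32_t pa) { return pa + KERNBASE; }
```

For example, the link address 0xf0100000 corresponds to the load address 0x00100000.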

Exercise 7

Looking up mov %eax,%cr0 in obj/kern/kernel.asm shows that its address is f0100025:

f0100025:       0f 22 c0                mov    %eax,%cr0

Start the debugger and set a breakpoint at 0x100025 (the load address of this instruction, since paging is not enabled yet). Before the instruction executes, the contents at 0x00100000 and 0xf0100000 are different.

(gdb) b *0x100025
Breakpoint 1 at 0x100025
(gdb) c
Continuing.
The target architecture is assumed to be i386
=> 0x100025:    mov    %eax,%cr0
Breakpoint 1, 0x00100025 in ?? ()
(gdb) x/16x 0x00100000
0x100000:       0x1badb002      0x00000000      0xe4524ffe      0x7205c766
0x100010:       0x34000004      0x2000b812      0x220f0011      0xc0200fd8
0x100020:       0x0100010d      0xc0220f80      0x10002fb8      0xbde0fff0
0x100030:       0x00000000      0x110000bc      0x0068e8f0      0xfeeb0000
(gdb) x/16x 0xf0100000
0xf0100000 <_start+4026531828>: 0x00000000      0x00000000      0x00000000      0x00000000
0xf0100010 <entry+4>:   0x00000000      0x00000000      0x00000000      0x00000000
0xf0100020 <entry+20>:  0x00000000      0x00000000      0x00000000      0x00000000
0xf0100030 <relocated+1>:       0x00000000      0x00000000      0x00000000      0x00000000

After movl %eax, %cr0 executes, the contents at the two addresses are exactly the same, which shows that the address mapping is in effect.

(gdb) x/16x 0xf0100000
0xf0100000 <_start+4026531828>: 0x1badb002      0x00000000      0xe4524ffe      0x7205c766
0xf0100010 <entry+4>:   0x34000004      0x2000b812      0x220f0011      0xc0200fd8
0xf0100020 <entry+20>:  0x0100010d      0xc0220f80      0x10002fb8      0xbde0fff0
0xf0100030 <relocated+1>:       0x00000000      0x110000bc      0x0068e8f0      0xfeeb0000

If the mapping is not established, the mov $relocated, %eax and jmp *%eax pair fails. relocated is a link address in the high range (the dump above shows mov $0xf010002f,%eax encoded at 0x100028), so without paging the jmp goes to 0xf010002f interpreted as a physical address. No memory exists there, which causes an error.

Formatted Printing to the Console

Exercise 8

num = getuint(&ap, lflag);
if ((long long) num < 0) {
    putch('-', putdat);
    num = -(long long) num;
}
base = 8;
goto number;

Question 1

The putch function in printf.c calls the cputchar function in console.c. The full call chain is: cprintf → vcprintf → vprintfmt → putch → cputchar

Question 2

if (crt_pos >= CRT_SIZE) {
    int i;

    memmove(crt_buf, crt_buf + CRT_COLS, (CRT_SIZE - CRT_COLS) * sizeof(uint16_t));
    for (i = CRT_SIZE - CRT_COLS; i < CRT_SIZE; i++)
        crt_buf[i] = 0x0700 | ' ';
    crt_pos -= CRT_COLS;
}

To understand what this code does, look at its context. CRT (cathode ray tube) refers to the display. According to the definitions in console.h, CRT_COLS is the number of character cells in each row of the display (each cell is one 2-byte word), and its value is 80; CRT_ROWS is the number of rows, and its value is 25; and #define CRT_SIZE (CRT_ROWS * CRT_COLS) is the number of cells the screen can hold, i.e. 2000.

When crt_pos is greater than or equal to CRT_SIZE, the screen is full, so the contents are moved up one line: rows 2 through 25 overwrite rows 1 through 24. The last row is then filled with spaces. ORing the space character with 0x0700 sets the attribute byte to 0x07 (light-grey text on a black background), so the blank cells show up as black. Finally, crt_pos is moved back by one row. In short, when the screen is full this code scrolls it up by one line and blanks the last line.
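The same scrolling logic can be tried outside the kernel on an ordinary array. This is a sketch only: the function name scroll_if_full is hypothetical, and buf and pos stand in for crt_buf and crt_pos.

```c
#include <stdint.h>
#include <string.h>

#define CRT_ROWS 25
#define CRT_COLS 80
#define CRT_SIZE (CRT_ROWS * CRT_COLS)

/* Scroll a text buffer up one row if pos has run past the last cell.
 * Returns the updated cursor position. */
uint16_t scroll_if_full(uint16_t *buf, uint16_t pos)
{
    if (pos >= CRT_SIZE) {
        /* Rows 2..25 overwrite rows 1..24. */
        memmove(buf, buf + CRT_COLS, (CRT_SIZE - CRT_COLS) * sizeof(uint16_t));
        /* Blank the last row: attribute 0x07, character ' '. */
        for (int i = CRT_SIZE - CRT_COLS; i < CRT_SIZE; i++)
            buf[i] = 0x0700 | ' ';
        pos -= CRT_COLS;
    }
    return pos;
}
```

Calling it with pos equal to CRT_SIZE returns CRT_SIZE - CRT_COLS, i.e. the cursor ends up at the start of the now blank last row.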

Question 3

int x = 1, y = 3, z = 4;
cprintf("x %d, y %x, z %d\n", x, y, z);

In the call to cprintf(), to what does fmt point? To what does ap point?

List (in order of execution) each call to cons_putc, va_arg, and vcprintf. For cons_putc, list its argument as well. For va_arg, list what ap points to before and after the call. For vcprintf list the values of its two arguments.

fmt points to the format string "x %d, y %x, z %d\n". The string literal itself lives in the kernel's read-only data, not on the stack; only the pointer to it is passed on the stack. ap points to the first argument to be printed, that is, the address of x on the stack.

List the status of each call to cons_putc, va_arg and vcprintf:

  1. cprintf first calls vcprintf. The first argument, fmt, is the address of the format string "x %d, y %x, z %d\n"; the second argument, ap, points to x.
  2. vcprintf calls vprintfmt, which calls va_arg and putch repeatedly. putch calls cputchar, and cputchar calls cons_putc, so the first argument of putch is eventually passed to cons_putc. The remaining items list each call in order of execution.
  3. 1st call to cons_putc: printfmt.c line 95, argument 'x'
  4. 2nd call to cons_putc: printfmt.c line 95, argument ' '
  5. 1st call to va_arg: printfmt.c line 75, lflag=0; ap points to x before the call and to y after the call
  6. 3rd call to cons_putc: printfmt.c line 49, argument '1'. The first argument to putch here is the compact expression "0123456789abcdef"[num % base]: the string literal is an array of the 16 hexadecimal digit characters, indexed by num % base
  7. 4th call to cons_putc: printfmt.c line 95, argument ','
  8. 5th call to cons_putc: printfmt.c line 95, argument ' '
  9. 6th call to cons_putc: printfmt.c line 95, argument 'y'
  10. 7th call to cons_putc: printfmt.c line 95, argument ' '
  11. 2nd call to va_arg: printfmt.c line 75, lflag=0; ap points to y before the call and to z after the call
  12. 8th call to cons_putc: printfmt.c line 49, argument '3'
  13. 9th call to cons_putc: printfmt.c line 95, argument ','
  14. 10th call to cons_putc: printfmt.c line 95, argument ' '
  15. 11th call to cons_putc: printfmt.c line 95, argument 'z'
  16. 12th call to cons_putc: printfmt.c line 95, argument ' '
  17. 3rd call to va_arg: printfmt.c line 75, lflag=0; ap points to z before the call and to the address of z plus 4 after the call
  18. 13th call to cons_putc: printfmt.c line 49, argument '4'
  19. 14th call to cons_putc: printfmt.c line 95, argument '\n'
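The interplay between the format string and va_arg can be reproduced in a toy formatter. This is a simplified sketch, not the JOS vprintfmt: the names mini_format and put_num are hypothetical, only %d and %x are handled, and characters go into a buffer instead of through putch.

```c
#include <stdarg.h>
#include <stddef.h>

/* Write num in the given base, most significant digit first. */
static size_t put_num(char *out, size_t n, unsigned num, unsigned base)
{
    if (num >= base)
        n = put_num(out, n, num / base, base);
    out[n++] = "0123456789abcdef"[num % base];
    return n;
}

/* Toy formatter supporting only %d and %x. The caller must supply a
 * large enough buffer. Returns the number of characters written; each
 * one corresponds to one putch / cons_putc call in the list above. */
size_t mini_format(char *out, const char *fmt, ...)
{
    va_list ap;
    size_t n = 0;

    va_start(ap, fmt);
    for (; *fmt; fmt++) {
        if (*fmt != '%') {          /* ordinary character: emit as is */
            out[n++] = *fmt;
            continue;
        }
        fmt++;
        if (*fmt == '\0')           /* lone '%' at the end of the string */
            break;
        if (*fmt == 'd') {
            int v = va_arg(ap, int);            /* ap advances by one int */
            unsigned u = (unsigned)v;
            if (v < 0) {
                out[n++] = '-';
                u = 0u - u;
            }
            n = put_num(out, n, u, 10);
        } else if (*fmt == 'x') {
            n = put_num(out, n, va_arg(ap, unsigned), 16);
        } else {
            out[n++] = *fmt;
        }
    }
    va_end(ap);
    out[n] = '\0';
    return n;
}
```

Calling mini_format(buf, "x %d, y %x, z %d\n", 1, 3, 4) performs three va_arg reads and writes 14 characters, the same counts as in the list above.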

Question 4

The output is He110 World. 57616 in hexadecimal is 0xe110, so %x prints e110. %s treats &i as the address of a string: on a little-endian host the 4 bytes of i (0x00646c72), from low address to high, are 'r', 'l', 'd', '\0', which prints rld. This only works on a little-endian host. On a big-endian host, the value of i would have to be changed to 0x726c6400.
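The byte-order dependence can be demonstrated in user space. The sketch below uses the standard snprintf in place of cprintf; the helper names is_little_endian and hello_world are made up for illustration.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 otherwise. */
int is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* look at the lowest-addressed byte */
    return first == 1;
}

/* Format the "H%x Wo%s" example into out, picking the constant that
 * spells "rld\0" in memory for the host's byte order. */
void hello_world(char *out, size_t len)
{
    uint32_t i = is_little_endian() ? 0x00646c72u : 0x726c6400u;
    char bytes[sizeof i];
    memcpy(bytes, &i, sizeof i); /* view the integer as raw bytes */
    snprintf(out, len, "H%x Wo%s", 57616u, bytes);
}
```

Only the constant for i depends on the byte order; 57616 is formatted from its numeric value, so it is the same in both cases.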

Question 5

The value printed for y is whatever the 4 bytes just above x on the stack happen to contain. After x is printed, ap points to the byte following the last byte of x. No matter how many arguments were actually passed to cprintf, each time "%d" is parsed va_arg reads the 4 bytes at the current ap as an int and advances ap, so the second %d prints an unspecified value.

Question 6

If arguments were pushed in declaration order, there are two ways to keep variable arguments working. The first is for the programmer to pass the arguments to cprintf in reverse order, so that the format string comes last. This runs against normal reading habits and hurts readability. The second is to add an int parameter at the end of the original interface that records the total size of all the arguments, so that the position of the format string can be found from the element at the top of the stack.

The Stack

Exercise 9

.data
    .p2align    PGSHIFT     # force page alignment
    .globl      bootstack
bootstack:
    .space      KSTKSIZE
    .globl      bootstacktop   
bootstacktop:

Under the relocated label there is the instruction movl $(bootstacktop),%esp, which loads the address of the top of the stack into %esp. Further down is the bootstack label, where the .space KSTKSIZE directive reserves a stack of KSTKSIZE = 8 * PGSIZE = 8 * 4096 = 32768 bytes, initialized to zero. The bootstacktop label is defined right after it, so the top of the stack is the highest address of the reserved region. Since the stack pointer starts at that highest address, the stack grows downward, from high addresses to low addresses.

Exercise 10

The C code corresponding to the test_backtrace function:

void test_backtrace(int x)
{
    cprintf("entering test_backtrace %d\n", x);
    if (x > 0)
        test_backtrace(x-1);
    else
        mon_backtrace(0, 0, 0);
    cprintf("leaving test_backtrace %d\n", x);
}

The assembly code corresponding to the test_backtrace function:

f0100040:   55                      push   %ebp
f0100041:   89 e5                   mov    %esp,%ebp
f0100043:   56                      push   %esi
f0100044:   53                      push   %ebx
f0100045:   e8 5b 01 00 00          call   f01001a5 <__x86.get_pc_thunk.bx>
f010004a:   81 c3 be 12 01 00       add    $0x112be,%ebx
f0100050:   8b 75 08                mov    0x8(%ebp),%esi
f0100053:   83 ec 08                sub    $0x8,%esp
f0100056:   56                      push   %esi
f0100057:   8d 83 18 07 ff ff       lea    -0xf8e8(%ebx),%eax
f010005d:   50                      push   %eax
f010005e:   e8 cf 09 00 00          call   f0100a32 <cprintf>
f0100063:   83 c4 10                add    $0x10,%esp
f0100066:   85 f6                   test   %esi,%esi
f0100068:   7f 2b                   jg     f0100095 <test_backtrace+0x55>
f010006a:   83 ec 04                sub    $0x4,%esp
f010006d:   6a 00                   push   $0x0
f010006f:   6a 00                   push   $0x0
f0100071:   6a 00                   push   $0x0
f0100073:   e8 f4 07 00 00          call   f010086c <mon_backtrace>
f0100078:   83 c4 10                add    $0x10,%esp
f010007b:   83 ec 08                sub    $0x8,%esp
f010007e:   56                      push   %esi
f010007f:   8d 83 34 07 ff ff       lea    -0xf8cc(%ebx),%eax
f0100085:   50                      push   %eax
f0100086:   e8 a7 09 00 00          call   f0100a32 <cprintf>
f010008b:   83 c4 10                add    $0x10,%esp
f010008e:   8d 65 f8                lea    -0x8(%ebp),%esp
f0100091:   5b                      pop    %ebx
f0100092:   5e                      pop    %esi
f0100093:   5d                      pop    %ebp
f0100094:   c3                      ret    
f0100095:   83 ec 0c                sub    $0xc,%esp
f0100098:   8d 46 ff                lea    -0x1(%esi),%eax
f010009b:   50                      push   %eax
f010009c:   e8 9f ff ff ff          call   f0100040 <test_backtrace>
f01000a1:   83 c4 10                add    $0x10,%esp
f01000a4:   eb d5                   jmp    f010007b <test_backtrace+0x3b>

Observe the test_backtrace function call stack

Now let's observe the call stack of the test_backtrace function. %esp holds the address of the top of the stack, %ebp holds the base of the current stack frame (the slot where the caller's %ebp is saved), and %eax stores the value of x. These registers are the ones to watch, so I use gdb's display command to print them automatically after each step. I also set up automatic printing of the stack memory in use, so that the changes to the stack can be seen clearly.

Enter test_backtrace(5)

f01000d1:   c7 04 24 05 00 00 00    movl   $0x5,(%esp)
f01000d8:   e8 63 ff ff ff          call   f0100040 <test_backtrace>
f01000dd:   83 c4 10                add    $0x10,%esp

The test_backtrace function is called in the i386_init function, and the incoming parameter x=5. We will start tracking the changes in the stack data from here. The data in each register and stack is shown below. It can be seen that a total of two 4-byte integers are pushed onto the stack:

  1. The value of the input parameter (that is, 5)
  2. The address of the next instruction of the call instruction (that is, f01000dd)

%esp = 0xf010ffdc 
%ebp = 0xf010fff8 
// stack info 
0xf010ffe0 : 0x00000005   // The input parameters of the first call: 5 
0xf010ffdc : 0xf01000dd   // The return address of the first call

After entering the test_backtrace function, the instructions involving data modification in the stack can be divided into three parts:

  1. At the beginning of the function, the value of some registers is pushed onto the stack so that it can be restored before the end of the function
  2. Push input arguments onto stack before calling cprintf
  3. Push the input parameters onto the stack before the second call to test_backtrace

// function start
f0100040:   55                      push   %ebp
f0100041:   89 e5                   mov    %esp,%ebp
f0100043:   56                      push   %esi
f0100044:   53                      push   %ebx
// call cprintf
f0100053:   83 ec 08                sub    $0x8,%esp
f0100056:   56                      push   %esi
f0100057:   8d 83 18 07 ff ff       lea    -0xf8e8(%ebx),%eax
f010005d:   50                      push   %eax
f010005e:   e8 cf 09 00 00          call   f0100a32 <cprintf>
f0100063:   83 c4 10                add    $0x10,%esp
// call test_backtrace(x-1)
f0100095:   83 ec 0c                sub    $0xc,%esp
f0100098:   8d 46 ff                lea    -0x1(%esi),%eax
f010009b:   50                      push   %eax
f010009c:   e8 9f ff ff ff          call   f0100040 <test_backtrace>

Enter test_backtrace(4)

Before entering test_backtrace(4), the data in the stack is as follows:

%esp = 0xf010ffc0
%ebp = 0xf010ffd8
// stack info
0xf010ffe0 : 0x00000005   // Input parameter of the first call: 5
0xf010ffdc : 0xf01000dd   // Return address of the first call
0xf010ffd8 : 0xf010fff8   // Value of register %ebp at the start of the first call
0xf010ffd4 : 0x10094      // Value of register %esi at the start of the first call
0xf010ffd0 : 0xf0111308   // Value of register %ebx at the start of the first call
0xf010ffcc : 0xf010004a   // Residual data, can be ignored
0xf010ffc8 : 0x00000000   // Residual data, can be ignored
0xf010ffc4 : 0x00000005   // Residual data, can be ignored
0xf010ffc0 : 0x00000004   // Input parameter of the second call: 4

Entering test_backtrace(3), test_backtrace(2), test_backtrace(1) and test_backtrace(0) works the same way as test_backtrace(4) and is not repeated here. The stack contents on entering mon_backtrace(0, 0, 0) are given directly below.

Enter mon_backtrace(0, 0, 0)

Before entering mon_backtrace(0, 0, 0), the data on the stack is as follows:

%esp = 0xf010ff20
%ebp = 0xf010ff38
// stack info
0xf010ffe0 : 0x00000005   // Input parameter of the 1st call: 5
0xf010ffdc : 0xf01000dd   // Return address of the 1st call
0xf010ffd8 : 0xf010fff8   // Value of %ebp at the start of the 1st call
0xf010ffd4 : 0x10094      // Value of %esi at the start of the 1st call
0xf010ffd0 : 0xf0111308   // Value of %ebx at the start of the 1st call
0xf010ffcc : 0xf010004a   // Reserved space, can be ignored
0xf010ffc8 : 0x00000000   // Reserved space, can be ignored
0xf010ffc4 : 0x00000005   // Reserved space, can be ignored
0xf010ffc0 : 0x00000004   // Input parameter of the 2nd call: 4
0xf010ffbc : 0xf01000a1   // Return address of the 2nd call
0xf010ffb8 : 0xf010ffd8   // Value of %ebp at the start of the 2nd call
0xf010ffb4 : 0x00000005   // Value of %esi at the start of the 2nd call
0xf010ffb0 : 0xf0111308   // Value of %ebx at the start of the 2nd call
0xf010ffac : 0xf010004a   // Reserved space, can be ignored
0xf010ffa8 : 0x00000000   // Reserved space, can be ignored
0xf010ffa4 : 0x00000004   // Reserved space, can be ignored
0xf010ffa0 : 0x00000003   // Input parameter of the 3rd call: 3
0xf010ff9c : 0xf01000a1   // Return address of the 3rd call
0xf010ff98 : 0xf010ffb8   // Value of %ebp at the start of the 3rd call
0xf010ff94 : 0x00000004   // Value of %esi at the start of the 3rd call
0xf010ff90 : 0xf0111308   // Value of %ebx at the start of the 3rd call
0xf010ff8c : 0xf010004a   // Reserved space, can be ignored
0xf010ff88 : 0xf010ffb8   // Reserved space, can be ignored
0xf010ff84 : (not shown)  // Reserved space, can be ignored
0xf010ff80 : 0x00000002   // Input parameter of the 4th call: 2
0xf010ff7c : 0xf01000a1   // Return address of the 4th call
0xf010ff78 : 0xf010ff98   // Value of %ebp at the start of the 4th call
0xf010ff74 : 0x00000003   // Value of %esi at the start of the 4th call
0xf010ff70 : 0xf0111308   // Value of %ebx at the start of the 4th call
0xf010ff6c : 0xf010004a   // Reserved space, can be ignored
0xf010ff68 : 0xf010ff98   // Reserved space, can be ignored
0xf010ff64 : 0x00000002   // Reserved space, can be ignored
0xf010ff60 : 0x00000001   // Input parameter of the 5th call: 1
0xf010ff5c : 0xf01000a1   // Return address of the 5th call
0xf010ff58 : 0xf010ff78   // Value of %ebp at the start of the 5th call
0xf010ff54 : 0x00000002   // Value of %esi at the start of the 5th call
0xf010ff50 : 0xf0111308   // Value of %ebx at the start of the 5th call
0xf010ff4c : 0xf010004a   // Reserved space, can be ignored
0xf010ff48 : 0xf010ff78   // Reserved space, can be ignored
0xf010ff44 : (not shown)  // Reserved space, can be ignored
0xf010ff40 : 0x00000000   // Input parameter of the 6th call: 0
0xf010ff3c : 0xf01000a1   // Return address of the 6th call
0xf010ff38 : 0xf010ff58   // Value of %ebp at the start of the 6th call
0xf010ff34 : 0x00000001   // Value of %esi at the start of the 6th call
0xf010ff30 : 0xf0111308   // Value of %ebx at the start of the 6th call
0xf010ff2c : 0xf010004a   // Reserved space, can be ignored
0xf010ff28 : 0x00000000   // 7th call (mon_backtrace), pushed first: third argument, 0
0xf010ff24 : 0x00000000   // 7th call (mon_backtrace), pushed second: second argument, 0
0xf010ff20 : 0x00000000   // 7th call (mon_backtrace), pushed last: first argument, 0

The mon_backtrace function is still empty at this point, so there is nothing to observe inside it.

Exit mon_backtrace(0, 0, 0)

The add $0x10,%esp instruction removes the three input parameters and the 4 reserved bytes from the stack. At this point %esp = 0xf010ff30 and %ebp = 0xf010ff38.

Exit test_backtrace(0)

Three consecutive pop instructions restore the ebx, esi and ebp registers from the stack in turn, and the ret instruction then returns to the caller. The exits from test_backtrace(1) through test_backtrace(5) are similar and are not repeated here.

Each time test_backtrace is called, it mainly does the following things:

  1. Push the return address (the address of the instruction after the call instruction) onto the stack
  2. Push the values of the three registers ebp, esi, and ebx onto the stack so that they can be restored before the function returns
  3. Call the cprintf function to print "entering test_backtrace x", where x is the value of the input parameter
  4. Reserve 3 double words on the stack and push the input parameter (x-1), 16 bytes in total, so that a single add $0x10,%esp cleans up the stack after the call
  5. Call test_backtrace(x-1)
  6. Call the cprintf function to print "leaving test_backtrace x", where x is the value of the input parameter

A total of 8 double words are pushed onto the stack each time test_backtrace is called:

  1. The return address
  2. The values of the three registers ebp, esi, ebx
  3. The value of the input parameter (x-1)
  4. 3 reserved double words (16 bytes together with the input parameter, which makes the stack easy to clean up)

Exercise 11

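int mon_backtrace(int argc, char **argv, struct Trapframe *tf)
{
    uint32_t *ebp;
    // Start from the frame pointer of mon_backtrace itself.
    ebp = (uint32_t *)read_ebp();
    cprintf("Stack backtrace:\r\n");
    // Follow the chain of saved frame pointers until it reaches 0.
    while (ebp)
    {
        // ebp[0] is the caller's ebp, ebp[1] is the return address,
        // and ebp[2] through ebp[6] are the first five arguments.
        cprintf("  ebp %08x  eip %08x  args %08x %08x %08x %08x %08x\r\n", 
                ebp, ebp[1], ebp[2], ebp[3], ebp[4], ebp[5], ebp[6]);
        ebp = (uint32_t *)*ebp;  // move up to the caller's frame
    }
    return 0;
}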
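The frame-pointer walk can be tried in user space on a simulated stack. This is a sketch under stated assumptions: the function name walk_frames is hypothetical, an "ebp" is an index into an array rather than a real address, and index 0 plays the role of the null frame pointer that ends the chain.

```c
#include <stddef.h>
#include <stdint.h>

/* Walk a chain of saved frame pointers in a simulated stack.
 * stack[ebp] holds the caller's ebp and stack[ebp + 1] the return
 * address. Stores up to max return addresses in eips and returns the
 * number of frames visited. */
size_t walk_frames(const uint32_t *stack, uint32_t ebp, uint32_t *eips, size_t max)
{
    size_t n = 0;

    while (ebp != 0 && n < max) {
        eips[n++] = stack[ebp + 1];  /* return address of this frame */
        ebp = stack[ebp];            /* follow the link to the caller's frame */
    }
    return n;
}
```

The loop has the same shape as the while (ebp) loop above: read the values next to the frame pointer, then replace the frame pointer with the saved one.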

Exercise 12

click here
