MIT 6.828 Study Notes 1

2022-11-22 06:01:04

Lab 1: Booting a PC

Part 1: PC Bootstrap

The PC's Physical Address Space

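Early PCs were based on Intel's 8088 processor, which could address 1MB of physical memory, from 0x00000000 to 0x000FFFFF. The lowest 640KB is labeled "Low Memory"; this was the only RAM an early PC could use.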

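Addressing 1MB of physical memory takes a 20-bit address bus, so the 8088 has 20 address lines. Its registers and ALU, however, are only 16 bits wide, so a single register cannot hold a full physical address. To bridge the gap, the 8088 provides four segment registers, CS, DS, SS and ES, used for the code segment, data segment, stack segment and extra segment respectively. Each segment register is 16 bits wide. Before an address is placed on the address bus, the segment register value is shifted left by 4 bits and added to the 16-bit offset, producing a 20-bit physical address.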

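The 384KB from 0x000A0000 to 0x000FFFFF is reserved by the hardware for special uses such as video display buffers and firmware held in non-volatile memory. The BIOS occupies the 64KB region from 0x000F0000 to 0x000FFFFF. In early PCs the BIOS was stored in true ROM; current PCs store it in updatable flash memory.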

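The BIOS performs basic system initialization, such as activating the video card and checking the amount of memory installed. After this initialization, the BIOS loads the operating system from an appropriate location (a floppy disk, hard disk, CD-ROM or the network) and passes control of the machine to it.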

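Later processors can address far more than 1MB: the 80286 can address 16MB and the 80386 can address 4GB. On these machines the BIOS location changed, but for backward compatibility the region from 0x000A0000 to 0x000FFFFF remains reserved.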

The ROM BIOS

Open two terminals and run make qemu-nox-gdb in one and make gdb in the other.

despot@ubuntu:~/6.828/lab$ make gdb
gdb -n -x .gdbinit
GNU gdb (Ubuntu 8.1-0ubuntu3) 8.1.0.20180409-git
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
+ target remote localhost:26000
warning: No executable has been specified and target does not support
determining executable automatically.  Try using the "file" command.
warning: A handler for the OS ABI "GNU/Linux" is not built into this configuration
of GDB.  Attempting to continue with the default i8086 settings.

The target architecture is assumed to be i8086
[f000:fff0]    0xffff0:	ljmp   $0xf000,$0xe05b
0x0000fff0 in ?? ()
+ symbol-file obj/kern/kernel
(gdb) 

The important line is this one:

 [f000:fff0] 0xffff0:	ljmp   $0xf000,$0xe05b

This is the first instruction that will be executed.
From it we can conclude:

  • The IBM PC starts executing at physical address 0x000ffff0, which is at the very top of the 64KB area reserved for the ROM BIOS.
  • The PC starts executing with CS = 0xf000 and IP = 0xfff0.
  • The first instruction to be executed is a jmp instruction, which jumps to the segmented address CS = 0xf000 and IP = 0xe05b.

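QEMU starts this way because the BIOS is "hard-wired" to the physical address range 0x000F0000 to 0x000FFFFF. This guarantees that the BIOS gets control first whenever the system is powered on or restarted.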

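The QEMU emulator comes with its own BIOS, which it places at this location in the processor's simulated physical address space. On processor reset, the (simulated) processor enters real mode and sets CS to 0xf000 and IP to 0xfff0, so that execution begins at that (CS:IP) segment address.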

The physical address of this instruction is CS shifted left by 4 bits plus IP, that is, 16 * 0xF000 + 0xFFF0 = 0xF0000 + 0xFFF0 = 0xFFFF0.
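The real-mode address calculation can be sketched in C as follows. This is only an illustration; the helper name real_mode_phys is made up for this note and is not part of the lab code.

```c
#include <stdint.h>

/* Real-mode translation: physical = segment * 16 + offset.
 * real_mode_phys is a hypothetical helper used only for illustration. */
static uint32_t real_mode_phys(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}
```

With CS = 0xf000 and IP = 0xfff0 this gives 0xffff0, and the target of the first ljmp (0xf000:0xe05b) gives 0xfe05b.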

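The BIOS's main job is to initialize the interrupt vector table and various devices. After initializing the PCI bus and the important devices, it searches for a bootable device such as a floppy disk, hard drive or CD-ROM, reads the boot loader from the disk, and transfers control to it. For the meaning of the individual instructions, see this article.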

Part 2: The Boot Loader

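On a PC, floppy and hard disks are divided into 512-byte regions called sectors. A sector is the minimum transfer granularity of the disk: every read or write operates on whole sectors. If a disk is bootable, its first sector is called the boot sector, and the boot loader code lives there. When the BIOS finds this sector, it loads its contents into memory at 0x7c00 through 0x7dff and then hands control to the boot loader.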

The 6.828 boot loader consists of two files: boot/boot.S and boot/main.c.

The boot loader has two main jobs:

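  • Switch from real mode to 32-bit protected mode, which is required to access address space above 1MB.
  • Read the kernel from the hard disk into memory using x86 I/O instructions.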

For more on real mode and protected mode, read this article or sections 1.2.7 and 1.2.8 of PC Assembly Language.

boot.S

  1 #include <inc/mmu.h>
  2 
  3 # Start the CPU: switch to 32-bit protected mode, jump into C.
  4 # The BIOS loads this code from the first sector of the hard disk into
  5 # memory at physical address 0x7c00 and starts executing in real mode
  6 # with %cs=0 %ip=7c00.
  7 
  8 .set PROT_MODE_CSEG, 0x8         # kernel code segment selector
  9 .set PROT_MODE_DSEG, 0x10        # kernel data segment selector
 10 .set CR0_PE_ON,      0x1         # protected mode enable flag

Line 1 includes a header file, lines 3 to 6 describe what the code does, and lines 8 to 10 define assembler constants.

 11 
 12 .globl start
 13 start:
 14   .code16                     # Assemble for 16-bit mode
 15   cli                         # Disable interrupts
 16   cld                         # String operations increment
 17 

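The cli instruction disables interrupts; its counterpart, sti, enables them. The cld instruction clears the direction flag (DF), i.e. sets DF = 0; its counterpart, std, sets DF. DF determines whether string instructions move through memory with increasing or decreasing addresses.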

 18   # Set up the important data segment registers (DS, ES, SS).
 19   xorw    %ax,%ax             # Segment number zero
 20   movw    %ax,%ds             # -> Data Segment
 21   movw    %ax,%es             # -> Extra Segment
 22   movw    %ax,%ss             # -> Stack Segment
 23 

Line 19 zeroes ax, which is then copied into each of the segment registers.

 24   # Enable A20:
 25   #   For backwards compatibility with the earliest PCs, physical
 26   #   address line 20 is tied low, so that addresses higher than
 27   #   1MB wrap around to zero by default.  This code undoes this.
 28 seta20.1:
 29   inb     $0x64,%al               # Wait for not busy
 30   testb   $0x2,%al
 31   jnz     seta20.1
 32 
 33   movb    $0xd1,%al               # 0xd1 -> port 0x64
 34   outb    %al,$0x64
 35 

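This code enables the A20 address line. In real mode the A20 line is disabled, which limits the addressable space to 1MB; it must be enabled before switching to protected mode.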
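To see what the disabled A20 line does to an address, here is a small C model. It is a sketch under the assumption that "A20 disabled" simply forces address bit 20 to zero, as the boot.S comment describes; the function name phys_addr is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of real-mode address generation with the A20 gate.
 * phys_addr is a made-up helper, not lab code. */
static uint32_t phys_addr(uint16_t seg, uint16_t off, bool a20_enabled)
{
    uint32_t addr = ((uint32_t)seg << 4) + off;  /* can reach 0x10FFEF */
    if (!a20_enabled)
        addr &= 0xFFFFF;  /* address line 20 tied low: wrap around at 1MB */
    return addr;
}
```

For example, 0xffff:0x0010 is physical address 0x100000 with A20 enabled, but wraps around to 0 with A20 disabled.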

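Line 29, inb $0x64,%al, reads one byte from port 0x64 into register al. Line 30, testb $0x2,%al, then tests bit 1 of al (the second-lowest bit). If that bit is 1, the code jumps back to seta20.1 and repeats; otherwise it loads 0xd1 into al and writes it to port 0x64.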

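According to this document, port 0x64 is an I/O port of the keyboard controller. Here we only care about bit 1, the bit tested on line 30. When it is 1, the input buffer still holds data that the controller has not consumed, so the CPU must wait until the keyboard controller's input buffer is empty.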

0064	r	KB controller read status (MCA)
		 bit 7 = 1 parity error on transmission from keyboard
		 bit 6 = 1 general timeout
		 bit 5 = 1 mouse output buffer full
		 bit 4 = 0 keyboard inhibit
		 bit 3 = 1 data in input register is command
			 0 data in input register is data
		 bit 2	 system flag status: 0=power up or reset  1=selftest OK
		 bit 1 = 1 input buffer full (input 60/64 has data for 804x)
		 bit 0 = 1 output buffer full (output 60 has data for system)

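Once the keyboard controller has consumed the data, the boot loader writes 0xd1 to port 0x64. 0xD1 can be viewed as a command: it says that the next byte written to port 0x0060 will be written by the keyboard controller to its output port. Some machines use bit 1 of the output port to control the A20 line.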

D1	dbl   write output port. next byte written  to 0060
			      will be written to the 804x output port; the
			      original IBM AT and many compatibles use bit 1 of
			      the output port to control the A20 gate.
		      Compaq  The system speed bits are not set by this command
			      use commands A1-A6 (!) for speed functions.

 36 seta20.2:
 37   inb     $0x64,%al               # Wait for not busy
 38   testb   $0x2,%al
 39   jnz     seta20.2
 40 
 41   movb    $0xdf,%al               # 0xdf -> port 0x60
 42   outb    %al,$0x60
 43 

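Lines 37 to 39 again wait for the keyboard controller to consume the previous command. Lines 41 and 42 write 0xdf to port 0x60; the keyboard controller copies this byte to its output port, bit 1 of the output port becomes 1, and the A20 line is enabled.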
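The two bit tests involved in this sequence can be written out in C. This is a minimal sketch: the macro and function names are invented here, and only the bit positions come from the port tables quoted in these notes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Names below are hypothetical; the bit positions follow the quoted tables. */
#define KBC_STATUS_INPUT_FULL 0x02  /* bit 1 of the status byte read from port 0x64 */
#define KBC_OUTPORT_A20       0x02  /* bit 1 of the controller's output port */

/* True while the controller has not yet consumed the last byte we sent. */
static bool kbc_busy(uint8_t status)
{
    return (status & KBC_STATUS_INPUT_FULL) != 0;
}

/* True if writing this value to the output port turns the A20 gate on. */
static bool outport_enables_a20(uint8_t value)
{
    return (value & KBC_OUTPORT_A20) != 0;
}
```

0xdf is 1101 1111 in binary, so bit 1 is set and outport_enables_a20(0xdf) is true.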

 44   # Switch from real to protected mode, using a bootstrap GDT
 45   # and segment translation that makes virtual addresses 
 46   # identical to their physical addresses, so that the 
 47   # effective memory map does not change during the switch.
 48   lgdt    gdtdesc
 49   movl    %cr0, %eax
 50   orl     $CR0_PE_ON, %eax
 51   movl    %eax, %cr0
 52 
 ……
 75 # Bootstrap GDT
 76 .p2align 2                                # force 4 byte alignment
 77 gdt:
 78   SEG_NULL                              # null seg
 79   SEG(STA_X|STA_R, 0x0, 0xffffffff)     # code seg
 80   SEG(STA_W, 0x0, 0xffffffff)           # data seg
 81 
 82 gdtdesc:
 83   .word   0x17                            # sizeof(gdt) - 1
 84   .long   gdt                             # address gdt
 85 

Lines 48 to 51 switch from real mode to protected mode.

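Line 48 loads the global descriptor table register. For details on this instruction, see here, here and here. The instruction also reads the data defined on lines 75 to 85.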

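lgdt reads 6 bytes: the first two bytes are loaded into the limit field of the GDTR register, and the other four into its base field. lgdt addresses the GDT indirectly: its operand points at this 6-byte descriptor, which in turn holds the linear address of the actual GDT.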
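To make the GDT contents concrete, the following C sketch packs an 8-byte segment descriptor the way I understand the SEG macro in inc/mmu.h to do it (limit in 4KB units, present bit set, 32-bit segment). The field layout and the STA_* values are reproduced from memory and should be treated as assumptions; seg_descriptor and gdt_limit are hypothetical helper names.

```c
#include <stdint.h>

/* Assumed to match inc/mmu.h; verify against the header before relying on them. */
#define STA_X 0x8   /* executable segment */
#define STA_W 0x2   /* writeable (data segments) */
#define STA_R 0x2   /* readable (code segments) */

/* Pack a descriptor as a 64-bit value, lowest byte first in memory. */
static uint64_t seg_descriptor(uint8_t type, uint32_t base, uint32_t lim)
{
    uint64_t d = 0;
    d |= (uint64_t)((lim >> 12) & 0xffff);              /* limit 15:0, 4KB units */
    d |= (uint64_t)(base & 0xffff) << 16;               /* base 15:0 */
    d |= (uint64_t)((base >> 16) & 0xff) << 32;         /* base 23:16 */
    d |= (uint64_t)(0x90 | type) << 40;                 /* present, code/data, type */
    d |= (uint64_t)(0xC0 | ((lim >> 28) & 0xf)) << 48;  /* G=1, D=1, limit 19:16 */
    d |= (uint64_t)((base >> 24) & 0xff) << 56;         /* base 31:24 */
    return d;
}

/* The limit stored in gdtdesc is the table size in bytes minus one. */
static uint16_t gdt_limit(unsigned entries)
{
    return (uint16_t)(entries * 8 - 1);
}
```

With three entries (null, code, data) the limit is 0x17, matching the .word 0x17 on line 83. The code descriptor's upper half works out to 0x00cf9a00, which is the value QEMU prints for CS in the register dump later in these notes.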

Lines 49 to 51 set the lowest bit (PE) of the CR0 control register, putting the processor into protected mode.

Besides those two links, see also here for more on real mode and protected mode.

 53   # Jump to next instruction, but in 32-bit code segment.
 54   # Switches processor into 32-bit mode.
 55   ljmp    $PROT_MODE_CSEG, $protcseg
 56 

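This executes a far jump. The jump reloads CS with the 32-bit code segment selector, so from the jump target onward the processor runs in 32-bit protected mode.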

 57   .code32                     # Assemble for 32-bit mode
 58 protcseg:
 59   # Set up the protected-mode data segment registers
 60   movw    $PROT_MODE_DSEG, %ax    # Our data segment selector
 61   movw    %ax, %ds                # -> DS: Data Segment
 62   movw    %ax, %es                # -> ES: Extra Segment
 63   movw    %ax, %fs                # -> FS
 64   movw    %ax, %gs                # -> GS
 65   movw    %ax, %ss                # -> SS: Stack Segment
 66 

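This sets up the segment registers. As mentioned earlier, in real mode a physical address is formed from a segment register and an offset: the segment register value shifted left by 4 bits plus the offset. In protected mode, a segment register instead selects an entry in the segment descriptor table. As this link points out, after changing the GDT we have to load new segment selectors into the segment registers.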

Whatever you do with the GDT has no effect on the CPU until you load new Segment Selectors into Segment Registers. For most of these registers, the process is as simple as using MOV instructions, but changing the CS register requires code resembling a jump or call to elsewhere, as this is the only way its value is meant to be changed.
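As a rough illustration of how a selector such as PROT_MODE_CSEG (0x8) or PROT_MODE_DSEG (0x10) picks a GDT entry, the sketch below splits a selector into its standard x86 fields. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Fields of an x86 segment selector (names chosen for this sketch). */
struct selector {
    unsigned index;  /* descriptor table index, bits 15:3 */
    unsigned ti;     /* table indicator: 0 = GDT, 1 = LDT */
    unsigned rpl;    /* requested privilege level, bits 1:0 */
};

static struct selector decode_selector(uint16_t sel)
{
    struct selector s;
    s.index = sel >> 3;
    s.ti = (sel >> 2) & 1;
    s.rpl = sel & 3;
    return s;
}
```

Selector 0x8 therefore refers to GDT entry 1 (the code segment) and 0x10 to GDT entry 2 (the data segment), both with privilege level 0.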

 67   # Set up the stack pointer and call into C.
 68   movl    $start, %esp
 69   call bootmain
 70 

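Set esp to the address of start (0x7c00), so the stack grows downward from just below the boot loader, then call bootmain.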

 71   # If bootmain returns (it shouldn't), loop.
 72 spin:
 73   jmp spin
 74 

If bootmain returns, spin in an infinite loop.

main.c

The main job of main.c is to load the kernel from disk into memory and then transfer control to it.

  1 #include <inc/x86.h>
  2 #include <inc/elf.h>
  3 

The first two lines include header files.

  4 /**********************************************************************
  5  * This a dirt simple boot loader, whose sole job is to boot
  6  * an ELF kernel image from the first IDE hard disk.
  7  *
  8  * DISK LAYOUT
  9  *  * This program(boot.S and main.c) is the bootloader.  It should
 10  *    be stored in the first sector of the disk.
 11  *
 12  *  * The 2nd sector onward holds the kernel image.
 13  *
 14  *  * The kernel image must be in ELF format.
 15  *
 16  * BOOT UP STEPS
 17  *  * when the CPU boots it loads the BIOS into memory and executes it
 18  *
 19  *  * the BIOS intializes devices, sets of the interrupt routines, and
 20  *    reads the first sector of the boot device(e.g., hard-drive)
 21  *    into memory and jumps to it.
 22  *
 23  *  * Assuming this boot loader is stored in the first sector of the
 24  *    hard-drive, this code takes over...
 25  *
 26  *  * control starts in boot.S -- which sets up protected mode,
 27  *    and a stack so C code then run, then calls bootmain()
 28  *
 29  *  * bootmain() in this file takes over, reads in the kernel and jumps to it.
 30  **********************************************************************/
 31

Lines 4 to 30 describe what main.c does and the boot-up steps.

 32 #define SECTSIZE        512
 33 #define ELFHDR          ((struct Elf *) 0x10000) // scratch space
 34 

These lines define two macros. SECTSIZE is the sector size, 512 bytes. ELFHDR is a memory address.

 35 void readsect(void*, uint32_t);
 36 void readseg(uint32_t, uint32_t, uint32_t);
 37

These are function declarations. readsect reads one sector of data; readseg calls readsect to read data.

 98 void
 99 waitdisk(void)
100 {
101         // wait for disk reaady
102         while ((inb(0x1F7) & 0xC0) != 0x40)
103                 /* do nothing */;
104 }
105 
106 void
107 readsect(void *dst, uint32_t offset)
108 {
109         // wait for disk to be ready
110         waitdisk();
111 
112         outb(0x1F2, 1);         // count = 1
113         outb(0x1F3, offset);
114         outb(0x1F4, offset >> 8);
115         outb(0x1F5, offset >> 16);
116         outb(0x1F6, (offset >> 24) | 0xE0);
117         outb(0x1F7, 0x20);      // cmd 0x20 - read sectors
118 
119         // wait for disk to be ready
120         waitdisk();
121 
122         // read a sector
123         insl(0x1F0, dst, SECTSIZE/4);
124 }
125

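Start with readsect. It takes two arguments: void *dst is the address at which to store the data, and offset is the number of the disk sector to read (readseg has already converted the kernel-relative byte offset into a sector number). Each call reads exactly one sector.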

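waitdisk waits until the disk is ready. (inb(0x1F7) & 0xC0) != 0x40 reads a byte from port 0x1F7 and examines its top two bits; the loop ends when the highest bit is 0 and the next-highest bit is 1, at which point the disk is ready. Details on port 0x1F7 can be found at this link.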

 01F7	r	status register
		 bit 7 = 1  controller is executing a command
		 bit 6 = 1  drive is ready
		 bit 5 = 1  write fault
		 bit 4 = 1  seek complete
		 bit 3 = 1  sector buffer requires servicing
		 bit 2 = 1  disk data read successfully corrected
		 bit 1 = 1  index - set to 1 each disk revolution
		 bit 0 = 1  previous command ended in an error

Bit 7 = 0 together with bit 6 = 1 means the controller is not executing a command and the drive is ready.
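The condition that ends the waitdisk loop can be isolated as a pure function for experimentation. This is a sketch; disk_ready and the two macro names are made up here, while the mask and expected value come straight from waitdisk.

```c
#include <stdbool.h>
#include <stdint.h>

#define IDE_BSY  0x80  /* bit 7: controller is executing a command */
#define IDE_DRDY 0x40  /* bit 6: drive is ready */

/* Same test as waitdisk: busy must be clear and ready must be set. */
static bool disk_ready(uint8_t status)
{
    return (status & (IDE_BSY | IDE_DRDY)) == IDE_DRDY;
}
```

A status byte of 0x50 (ready, seek complete) passes the test, whereas 0xd0 (busy and ready) and 0x10 (not ready) do not.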

outb is an inline function that takes two arguments, a port and a data byte.

static inline void
outb(int port, uint8_t data)
{
        asm volatile("outb %0,%w1" : : "a" (data), "d" (port));
}

 01F2	r/w	sector count
 01F3	r/w	sector number
 01F4	r/w	cylinder low
 01F5	r/w	cylinder high
 01F6	r/w	drive/head
		 bit 7	 = 1
		 bit 6	 = 0
		 bit 5	 = 1
		 bit 4	 = 0  drive 0 select
			     = 1  drive 1 select
		 bit 3-0      head select bits
 01F7	w	command register
		commands:
        ……
		 20	 read sectors with retry
        ……

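The table above explains the sequence of outb calls. First, 1 is written to 0x1F2, meaning that one sector is read at a time. The low 4 bits of 0x1F6 together with 0x1F3, 0x1F4 and 0x1F5 hold the starting sector number.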

0x1F3, 0x1F4 and 0x1F5 hold bits 0-7, 8-15 and 16-23 respectively, and the low four bits of 0x1F6 hold bits 24-27.

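Finally, command 0x20 (read sectors) is written to 0x1F7. After waiting for the controller to finish processing the command, insl performs the actual read.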
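The way readsect spreads a 28-bit sector number across the IDE registers can be sketched without touching any hardware. The struct and function below are hypothetical; they only compute the byte values that readsect would pass to outb.

```c
#include <stdint.h>

/* Values readsect writes to ports 0x1F2..0x1F6 (names invented for this sketch). */
struct ide_regs {
    uint8_t count;       /* 0x1F2: number of sectors */
    uint8_t lba0;        /* 0x1F3: sector number bits 0-7 */
    uint8_t lba1;        /* 0x1F4: bits 8-15 */
    uint8_t lba2;        /* 0x1F5: bits 16-23 */
    uint8_t drive_head;  /* 0x1F6: 0xE0 | bits 24-27 */
};

static struct ide_regs ide_read_regs(uint32_t sector)
{
    struct ide_regs r;
    r.count = 1;
    r.lba0 = (uint8_t)sector;           /* outb truncates to the low byte */
    r.lba1 = (uint8_t)(sector >> 8);
    r.lba2 = (uint8_t)(sector >> 16);
    r.drive_head = (uint8_t)((sector >> 24) | 0xE0);
    return r;
}
```

For sector 1, where the kernel image starts, the registers are 1, 0x01, 0x00, 0x00 and 0xE0.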

insl is also an inline function; its definition is in x86.h:

static inline void
insl(int port, void *addr, int cnt)
{
        asm volatile("cld\n\trepne\n\tinsl"
                : "=D" (addr), "=c" (cnt)
                : "d" (port), "0" (addr), "1" (cnt)
                : "memory", "cc");
}

01F0 r/w data register

insl reads cnt doublewords (4 bytes each) from port and stores them in memory starting at addr.

Next, look at readseg.

 69 // Read 'count' bytes at 'offset' from kernel into physical address 'pa'.
 70 // Might copy more than asked
 71 void
 72 readseg(uint32_t pa, uint32_t count, uint32_t offset)
 73 {
 74         uint32_t end_pa;
 75 
 76         end_pa = pa + count;
 77 
 78         // round down to sector boundary
 79         pa &= ~(SECTSIZE - 1);
 80 
 81         // translate from bytes to sectors, and kernel starts at sector 1
 82         offset = (offset / SECTSIZE) + 1;
 83 
 84         // If this is too slow, we could read lots of sectors at a time.
 85         // We'd write more to memory than asked, but it doesn't matter --
 86         // we load in increasing order.
 87         while (pa < end_pa) {
 88                 // Since we haven't enabled paging yet and we're using
 89                 // an identity segment mapping (see boot.S), we can
 90                 // use physical addresses directly.  This won't be the
 91                 // case once JOS enables the MMU.
 92                 readsect((uint8_t*) pa, offset);
 93                 pa += SECTSIZE;
 94                 offset++;
 95         }
 96 }
 97

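readseg takes three arguments: pa is the starting memory address at which the data is stored, count is the number of bytes to read, and offset is the byte offset of the data from the start of the kernel image.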

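end_pa is the address one past the end of the requested data. pa &= ~(SECTSIZE - 1) rounds pa down to a sector boundary; the equivalent assembly instruction is and $0xfffffe00, %ebx, which discards the low 9 bits. Line 82 converts the byte offset into a sector number: sector 0 holds the boot loader and the kernel starts at sector 1.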

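The loop then checks whether the read is complete and, if not, calls readsect to read more data. Because data is read one sector at a time, the total number of bytes read may exceed count.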
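The arithmetic in readseg can be tried out on its own. The sketch below repeats the rounding and the loop from readseg but only counts sectors instead of reading them; plan_readseg and struct read_plan are names made up for this note.

```c
#include <stdint.h>

#define SECTSIZE 512

/* What readseg would do for a given request (hypothetical summary type). */
struct read_plan {
    uint32_t first_pa;      /* pa rounded down to a sector boundary */
    uint32_t first_sector;  /* first disk sector passed to readsect */
    uint32_t nsectors;      /* number of readsect calls */
};

static struct read_plan plan_readseg(uint32_t pa, uint32_t count, uint32_t offset)
{
    struct read_plan p;
    uint32_t end_pa = pa + count;

    pa &= ~(uint32_t)(SECTSIZE - 1);         /* clear the low 9 bits */
    p.first_pa = pa;
    p.first_sector = offset / SECTSIZE + 1;  /* kernel starts at sector 1 */
    p.nsectors = 0;
    while (pa < end_pa) {                    /* same loop shape as readseg */
        pa += SECTSIZE;
        p.nsectors++;
    }
    return p;
}
```

For the call readseg(ELFHDR, SECTSIZE*8, 0) in bootmain this yields 8 sectors starting at sector 1. A 10-byte request still costs one full sector, which is why more than count bytes may be written.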

Now back to the main function, bootmain.

 38 void
 39 bootmain(void)
 40 {
 41         struct Proghdr *ph, *eph;
 42 
 43         // read 1st page off disk
 44         readseg((uint32_t) ELFHDR, SECTSIZE*8, 0);
 45 
 46         // is this a valid ELF?
 47         if (ELFHDR->e_magic != ELF_MAGIC)
 48                 goto bad;
 49 
 50         // load each program segment (ignores ph flags)
 51         ph = (struct Proghdr *) ((uint8_t *) ELFHDR + ELFHDR->e_phoff);
 52         eph = ph + ELFHDR->e_phnum;
 53         for (; ph < eph; ph++)
 54                 // p_pa is the load address of this segment (as well
 55                 // as the physical address)
 56                 readseg(ph->p_pa, ph->p_memsz, ph->p_offset);
 57 
 58         // call the entry point from the ELF header
 59         // note: does not return!
 60         ((void (*)(void)) (ELFHDR->e_entry))();
 61 
 62 bad:
 63         outw(0x8A00, 0x8A00);
 64         outw(0x8A00, 0x8E00);
 65         while (1)
 66                 /* do nothing */;
 67 }
 68 

Line 41 defines two pointers to struct Proghdr. The structure is defined in inc/elf.h:

struct Proghdr {
	uint32_t p_type;
	uint32_t p_offset;	// offset of this segment within the file
	uint32_t p_va;
	uint32_t p_pa;		// physical address at which the segment is loaded
	uint32_t p_filesz;
	uint32_t p_memsz;	// size of the segment in memory
	uint32_t p_flags;
	uint32_t p_align;
};

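Line 44 reads 4KB, starting at sector 1, into memory beginning at ELFHDR (0x10000). This data contains the ELF header of the kernel image. For more on ELF files, see this link or here. Our kernel is compiled into an ELF executable, which consists mainly of the ELF header, the program header table and the corresponding segments.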

ELF is a format for storing programs or fragments of programs on disk, created as a result of compiling and linking. An ELF file is divided into sections. For an executable program, these are the text section for the code, the data section for global variables and the rodata section that usually contains constant strings. The ELF file contains headers that describe how these sections should be stored in memory.

The structure of this header is also defined in inc/elf.h:

#define ELF_MAGIC 0x464C457FU	/* "\x7FELF" in little endian */

struct Elf {
	uint32_t e_magic;		// must equal ELF_MAGIC
	uint8_t e_elf[12];
	uint16_t e_type;
	uint16_t e_machine;
	uint32_t e_version;
	uint32_t e_entry;
	uint32_t e_phoff;		// offset of the program header table within the file
	uint32_t e_shoff;
	uint32_t e_flags;
	uint16_t e_ehsize;
	uint16_t e_phentsize;
	uint16_t e_phnum;		// number of program header entries, i.e. number of segments
	uint16_t e_shentsize;
	uint16_t e_shnum;
	uint16_t e_shstrndx;
};

Line 47 checks whether the file is a valid ELF image.

If it is not, the code executes two outw calls and then enters an infinite loop. outw is an inline function defined in x86.h.

static inline void
outw(int port, uint16_t data)
{
	asm volatile("outw %0,%w1" : : "a" (data), "d" (port));
}

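Once the check passes, ph points to the start of the program header table and eph points just past its last entry. A for loop then loads each segment into memory.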

Finally, ((void (*)(void)) (ELFHDR->e_entry))() transfers control to the kernel.
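The magic-number check on line 47 can be reproduced on an ordinary byte buffer. The following is a sketch that reads the first four bytes in little-endian order, as the x86 does, so it gives the same answer on any host; load_le32 and elf_magic_ok are hypothetical helpers.

```c
#include <stdbool.h>
#include <stdint.h>

#define ELF_MAGIC 0x464C457FU  /* "\x7FELF" in little endian */

/* Assemble a 32-bit value from four bytes, lowest address first. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Equivalent of ELFHDR->e_magic == ELF_MAGIC for an in-memory image. */
static bool elf_magic_ok(const uint8_t *image)
{
    return load_le32(image) == ELF_MAGIC;
}
```

An image beginning with the bytes 0x7f 'E' 'L' 'F' passes the check; anything else would send bootmain to the bad label.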

ELF header

An ELF binary starts with a fixed-length ELF header, followed by a variable-length program header listing each of the program sections to be loaded. The C definitions for these ELF headers are in inc/elf.h. The program sections we're interested in are:

  • .text: The program's executable instructions.
  • .rodata: Read-only data, such as ASCII string constants produced by the C compiler. (We will not bother setting up the hardware to prohibit writing, however.)
  • .data: The data section holds the program's initialized data, such as global variables declared with initializers like int x = 5

Running objdump -h obj/boot/boot.out shows that the link address (VMA) and load address (LMA) of the sections are the same:

obj/boot/boot.out:     file format elf32-i386

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         00000186  00007c00  00007c00  00000074  2**2
                  CONTENTS, ALLOC, LOAD, CODE
  1 .eh_frame     000000a8  00007d88  00007d88  000001fc  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .stab         0000087c  00000000  00000000  000002a4  2**2
                  CONTENTS, READONLY, DEBUGGING
  3 .stabstr      00000925  00000000  00000000  00000b20  2**0
                  CONTENTS, READONLY, DEBUGGING
  4 .comment      00000029  00000000  00000000  00001445  2**0
                  CONTENTS, READONLY

Running objdump -x obj/kern/kernel shows:

Program Header:
    LOAD off    0x00001000 vaddr 0xf0100000 paddr 0x00100000 align 2**12
         filesz 0x0000759d memsz 0x0000759d flags r-x
    LOAD off    0x00009000 vaddr 0xf0108000 paddr 0x00108000 align 2**12
         filesz 0x0000b6a8 memsz 0x0000b6a8 flags rw-
   STACK off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**4
         filesz 0x00000000 memsz 0x00000000 flags rwx

The entries marked LOAD are read into memory.

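The link address is the address assigned to instructions in the executable by the compiler and linker, i.e. the logical address. The load address is the address at which the executable is actually placed in memory, i.e. the physical address.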

The BIOS always loads the boot loader at 0x7c00, while its link address is given in boot/Makefrag:

……
$(OBJDIR)/boot/boot: $(BOOT_OBJS)
	@echo + ld boot/boot
	$(V)$(LD) $(LDFLAGS) -N -e start -Ttext 0x7C00 -o $@.out $^
	$(V)$(OBJDUMP) -S $@.out >$@.asm
	$(V)$(OBJCOPY) -S -O binary -j .text $@.out $@
	$(V)perl boot/sign.pl $(OBJDIR)/boot/boot
……

Here -Ttext 0x7C00 specifies the link address.

Now open obj/boot/boot.asm:

……
.globl start
start:
  .code16                     # Assemble for 16-bit mode
  cli                         # Disable interrupts
    7c00:	fa                   	cli    
  cld                         # String operations increment
    7c01:	fc                   	cld    
……

As we can see, the boot loader's link address is 0x7c00.

Now change the parameter in Makefrag from 0x7c00 to some other value, such as 0x6c00. In the lab directory, run make clean and then make, and open boot.asm again:

……
.globl start
start:
  .code16                     # Assemble for 16-bit mode
  cli                         # Disable interrupts
    6c00:	fa                   	cli    
  cld                         # String operations increment
    6c01:	fc                   	cld
……

The boot loader's link address has changed. Now boot again and see what happens. Open two terminals and run make qemu-nox-gdb in one and make gdb in the other.

Because the BIOS still loads the boot sector at 0x7c00, we set the breakpoint there.

(gdb) b *0x7c00
Breakpoint 1 at 0x7c00
(gdb) c
Continuing.
[   0:7c00] => 0x7c00:	cli    

Breakpoint 1, 0x00007c00 in ?? ()

The first instruction is correct.

[   0:7c1e] => 0x7c1e:	lgdtw  0x6c64

(gdb) x/6xb 0x6c64
0x6c64:	0x00	0x00	0x00	0x00	0x00	0x00
(gdb) x/6xb 0x7c64
0x7c64:	0x17	0x00	0x4c	0x6c	0x00	0x00

When execution reaches this instruction, the value loaded into GDTR is the 6 bytes at 0x6c64, and all six bytes are 0. In boot.asm we can see:

00006c64 <gdtdesc>:
    6c64:	17                   	pop    %ss
    6c65:	00 4c 6c 00          	add    %cl,0x0(%esp,%ebp,2)

gdtdesc was linked at 0x6c64 but loaded at 0x7c64, so the GDT is set up incorrectly. Continue executing:

(gdb) si
[   0:7c23] => 0x7c23:	mov    %cr0,%eax
0x00007c23 in ?? ()
(gdb) si
[   0:7c26] => 0x7c26:	or     $0x1,%eax
0x00007c26 in ?? ()
(gdb) si
[   0:7c2a] => 0x7c2a:	mov    %eax,%cr0
0x00007c2a in ?? ()
(gdb) si
[   0:7c2d] => 0x7c2d:	ljmp   $0x8,$0x6c32
0x00007c2d in ?? ()
(gdb) si
[   0:7c2d] => 0x7c2d:	ljmp   $0x8,$0x6c32
0x00007c2d in ?? ()

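The jump executed after enabling protected mode fails. The processor is now in protected mode, but the GDT base is 0 and its limit was also set to 0, so the processor cannot reach the instruction at the jump target and execution stays stuck on the same instruction.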
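The mismatch can be summarized with a one-line calculation: a symbol linked at some address ends up in memory at that address shifted by the difference between the load base and the link base. A small sketch, with runtime_addr as a made-up helper name:

```c
#include <stdint.h>

/* Where a symbol actually lands in memory when the image is linked at
 * link_base but loaded at load_base. Hypothetical helper for illustration. */
static uint32_t runtime_addr(uint32_t link_addr, uint32_t link_base, uint32_t load_base)
{
    return link_addr - link_base + load_base;
}
```

With link base 0x6c00 and load base 0x7c00, gdtdesc (linked at 0x6c64) really sits at 0x7c64, which is where GDB found the bytes 0x17 0x00 0x4c 0x6c. The lgdtw instruction, however, encodes the link-time address 0x6c64.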

Part 3: The Kernel

Using virtual memory to work around position dependence

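After entering the kernel, before mov %eax,%cr0 executes, the byte at address 0x00100000 is 0x02 and the byte at address 0xf0100000 is 0x00, which shows that the address mapping is not yet in place. After mov %eax,%cr0 executes, both addresses map to physical address 0x00100000, and both read 0x02.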

(gdb) b *0x100025
Breakpoint 1 at 0x100025
(gdb) c
Continuing.
The target architecture is assumed to be i386
=> 0x100025:	mov    %eax,%cr0

Breakpoint 1, 0x00100025 in ?? ()
(gdb) x/1b 0x00100000
0x100000:	0x02
(gdb) x/1b 0xf0100000
0xf0100000 <_start+4026531828>:	0x00
(gdb) stepi
=> 0x100028:	mov    $0xf010002f,%eax
0x00100028 in ?? ()
(gdb) x/1b 0x00100000
0x100000:	0x02
(gdb) x/1b 0xf0100000
0xf0100000 <_start+4026531828>:	0x02

After commenting out movl %eax, %cr0 in kern/entry.S, an error occurs when the add %al,(%eax) instruction is about to execute: Trying to execute code outside RAM or ROM at 0xf010002c. The address being fetched lies beyond physical memory.

+ as kern/entry.S
+ ld obj/kern/kernel
ld: warning: section `.bss' type changed to PROGBITS
+ mk obj/kern/kernel.img
***
*** Now run 'make gdb'.
***
qemu-system-i386 -nographic -drive file=obj/kern/kernel.img,index=0,media=disk,format=raw -serial mon:stdio -gdb tcp::26000 -D qemu.log  -S
qemu: fatal: Trying to execute code outside RAM or ROM at 0xf010002c

EAX=f010002c EBX=00010094 ECX=00000000 EDX=000000a4
ESI=00010094 EDI=00000000 EBP=00007bf8 ESP=00007bec
EIP=f010002c EFL=00000086 [--S--P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
CS =0008 00000000 ffffffff 00cf9a00 DPL=0 CS32 [-R-]
SS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
DS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
FS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
GS =0010 00000000 ffffffff 00cf9300 DPL=0 DS   [-WA]
LDT=0000 00000000 0000ffff 00008200 DPL=0 LDT
TR =0000 00000000 0000ffff 00008b00 DPL=0 TSS32-busy
GDT=     00007c4c 00000017
IDT=     00000000 000003ff
CR0=00000011 CR2=00000000 CR3=00112000 CR4=00000000
DR0=00000000 DR1=00000000 DR2=00000000 DR3=00000000 
DR6=ffff0ff0 DR7=00000400
CCS=00000084 CCD=80010011 CCO=EFLAGS  
EFER=0000000000000000
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
GNUmakefile:171: recipe for target 'qemu-nox-gdb' failed
make: *** [qemu-nox-gdb] Aborted (core dumped)
=> 0xf010002c <relocated>:	add    %al,(%eax)
relocated () at kern/entry.S:74
74		movl	$0x0,%ebp			# nuke frame pointer
(gdb) 
Remote connection closed

Formatted Printing to the Console

  1. Explain the interface between printf.c and console.c. Specifically, what function does console.c export? How is this function used by printf.c?

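cprintf (printf.c) calls vcprintf (printf.c), vcprintf calls vprintfmt (printfmt.c), vprintfmt calls putch (printf.c), and putch calls cputchar (console.c). In other words, console.c exports cputchar, and printf.c uses it through putch to output each character.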

  2. Explain the following from console.c:

1      if (crt_pos >= CRT_SIZE) {
2              int i;
3              memmove(crt_buf, crt_buf + CRT_COLS, (CRT_SIZE - CRT_COLS) * sizeof(uint16_t));
4              for (i = CRT_SIZE - CRT_COLS; i < CRT_SIZE; i++)
5                      crt_buf[i] = 0x0700 | ' ';
6              crt_pos -= CRT_COLS;
7      }

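In console.h, CRT_SIZE is defined as CRT_ROWS * CRT_COLS, where CRT_ROWS and CRT_COLS are 25 and 80. CRT stands for cathode ray tube display. The display has 80 columns and 25 rows, and each character cell occupies two bytes. When crt_pos is greater than or equal to CRT_SIZE, the screen is full.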

For background on how text-mode display works, see here.

memmove is defined in lib/string.c:

void *
memmove(void *dst, const void *src, size_t n)
{
	const char *s;
	char *d;

	s = src;
	d = dst;
	if (s < d && s + n > d) {
		s += n;
		d += n;
		if ((int)s%4 == 0 && (int)d%4 == 0 && n%4 == 0)
			asm volatile("std; rep movsl\n"
				:: "D" (d-4), "S" (s-4), "c" (n/4) : "cc", "memory");
		else
			asm volatile("std; rep movsb\n"
				:: "D" (d-1), "S" (s-1), "c" (n) : "cc", "memory");
		// Some versions of GCC rely on DF being clear
		asm volatile("cld" ::: "cc");
	} else {
		if ((int)s%4 == 0 && (int)d%4 == 0 && n%4 == 0)
			asm volatile("cld; rep movsl\n"
				:: "D" (d), "S" (s), "c" (n/4) : "cc", "memory");
		else
			asm volatile("cld; rep movsb\n"
				:: "D" (d), "S" (s), "c" (n) : "cc", "memory");
	}
	return dst;
}

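The function takes three arguments.

  • dst points to the destination buffer, passed as a void * pointer.
  • src points to the source data, passed as a void * pointer.
  • n is the number of bytes to copy.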

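The call above overwrites the first 24 rows of the display buffer with the last 24 rows, then fills the last row with 0x0700 | ' '. ORing the space character with 0x0700 sets the attribute byte to 0x07 (light grey on a black background), so the row appears blank. Finally, the current position is moved back by one row, to the start of the last row.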
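The scrolling logic can be exercised on an ordinary array instead of the real CGA buffer. The sketch below wraps the same statements in a function that takes the buffer and position as parameters; crt_scroll is a name invented for this note, and the standard memmove stands in for the one in lib/string.c.

```c
#include <stdint.h>
#include <string.h>

#define CRT_ROWS 25
#define CRT_COLS 80
#define CRT_SIZE (CRT_ROWS * CRT_COLS)

/* Scroll the text buffer up by one row when the cursor runs off the end. */
static void crt_scroll(uint16_t *buf, int *pos)
{
    if (*pos >= CRT_SIZE) {
        /* rows 1..24 move up to rows 0..23 */
        memmove(buf, buf + CRT_COLS, (CRT_SIZE - CRT_COLS) * sizeof(uint16_t));
        /* blank the last row: attribute 0x07, character ' ' */
        for (int i = CRT_SIZE - CRT_COLS; i < CRT_SIZE; i++)
            buf[i] = 0x0700 | ' ';
        *pos -= CRT_COLS;
    }
}
```

If every row initially holds a distinct letter, then after one call the first row shows what used to be the second row, the last row is blank, and the position has moved back by 80 cells.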

  3. Trace the execution of the following code step-by-step:
	int x = 1, y = 3, z = 4;
	cprintf("x %d, y %x, z %d\n", x, y, z);
  • In the call to cprintf(), to what does fmt point? To what does ap point?
  • List (in order of execution) each call to cons_putc, va_arg, and vcprintf. For cons_putc, list its argument as well. For va_arg, list what ap points to before and after the call. For vcprintf list the values of its two arguments.

First, the code of cprintf:

int
cprintf(const char *fmt, ...)
{
	va_list ap;
	int cnt;

	va_start(ap, fmt);
	cnt = vcprintf(fmt, ap);
	va_end(ap);

	return cnt;
}

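The function first declares a variable ap of type va_list. For more on this type, see this article; inc/stdarg.h also has some information. ap can be thought of as a char pointer that points into the list of variadic arguments. In the exercise, cprintf is passed x, y and z in addition to the format string.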

typedef __builtin_va_list va_list;

#define va_start(ap, last) __builtin_va_start(ap, last)

#define va_arg(ap, type) __builtin_va_arg(ap, type)

#define va_end(ap) __builtin_va_end(ap)

Continuing with the execution order: a variable cnt is declared, then va_start(ap, fmt) is called, which makes ap actually point to the variadic argument list.

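  • va_list is used to declare a variable. The variadic argument list is essentially a run of bytes, which is why va_list is traditionally a char pointer; the type declares a pointer to the argument list.
  • va_start(ap, v): the first argument is the variable that will point to the variadic arguments; the second is the last named parameter of the variadic function, which is often used to convey the number of variadic arguments.
  • va_arg(ap, t): the first argument is the variable pointing to the variadic arguments; the second is the type of the argument to fetch.
  • va_end(ap) cleans up the variable that points to the variadic arguments (in traditional implementations, by setting it to NULL).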
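The four pieces listed above fit together as in this minimal sketch using the standard <stdarg.h> interface. sum_ints is a made-up example function, unrelated to the lab code; its first parameter tells it how many int arguments follow.

```c
#include <stdarg.h>

/* Add up `count` int arguments passed after the named parameter. */
static int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);             /* ap now refers to the first unnamed argument */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);    /* fetch one int and advance ap */
    va_end(ap);                      /* done with the argument list */
    return total;
}
```

Calling sum_ints(3, 1, 3, 4) walks the same three values as the exercise's x, y and z.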

cprintf then calls vcprintf and assigns the return value to cnt.

int
vcprintf(const char *fmt, va_list ap)
{
	int cnt = 0;

	vprintfmt((void*)putch, &cnt, fmt, ap);
	return cnt;
}

vcprintf calls vprintfmt. The definition of vprintfmt is too long to show here; it lives in lib/printfmt.c.

static void
putch(int ch, int *cnt)
{
	cputchar(ch);
	*cnt++;
}
……
void
vprintfmt(void (*putch)(int, void*), void *putdat, const char *fmt, va_list ap)

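vprintfmt first walks the string that fmt points to, calling putch through the function pointer it was passed. putch then calls cputchar and updates the count, and cputchar calls cons_putc to output the character. The walk over fmt is a while loop that runs until it reaches '\0' or %.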

// output a character to the console
static void
cons_putc(int c)
{
	serial_putc(c);
	lpt_putc(c);
	cga_putc(c);
}

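When '\0' is reached, vprintfmt returns. When % is reached, a switch statement formats the output as requested. After vprintfmt finishes, vcprintf returns the number of bytes output; cprintf then runs va_end to clean up ap and returns cnt.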

  4. Run the following code.
	unsigned int i = 0x00646c72;
	cprintf("H%x Wo%s", 57616, &i);
  • What is the output? Explain how this output is arrived at in the step-by-step manner of the previous exercise.
  • The output depends on that fact that the x86 is little-endian. If the x86 were instead big-endian what would you set i to in order to yield the same output? Would you need to change 57616 to a different value?

Output: He110 World

vprintfmt leaves its while loop when it finds a % and enters the switch. The case for x is:

……
// (unsigned) hexadecimal
case 'x':
	num = getuint(&ap, lflag);
	base = 16;
number:
	printnum(putch, putdat, num, base, width, padc);
	break;
……

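First, our argument 57616 is fetched from the variadic argument list; the type it is fetched as is determined by lflag. Here % is followed directly by x, so lflag is 0 and an unsigned int is fetched.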

// Get an unsigned int of various possible sizes from a varargs list,
// depending on the lflag parameter.
static unsigned long long
getuint(va_list *ap, int lflag)
{
	if (lflag >= 2)
		return va_arg(*ap, unsigned long long);
	else if (lflag)
		return va_arg(*ap, unsigned long);
	else
		return va_arg(*ap, unsigned int);
}

After the argument is fetched, it is converted and printed as requested. 57616 in hexadecimal is e110.

On the next pass through the switch, % is followed by s, meaning that a string is to be printed.

// string
case 's':
	if ((p = va_arg(ap, char *)) == NULL)
		p = "(null)";
	if (width > 0 && padc != '-')
		for (width -= strnlen(p, precision); width > 0; width--)
			putch(padc, putdat);
	for (; (ch = *p++) != '\0' && (precision < 0 || --precision >= 0); width--)
		if (altflag && (ch < ' ' || ch > '~'))
			putch('?', putdat);
		else
			putch(ch, putdat);
	for (; width > 0; width--)
		putch(' ', putdat);
	break;

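0x00646c72 is printed byte by byte as characters. x86 is little-endian, so starting from the lowest address the stored bytes are 72 6c 64 00. From the ASCII table, 72 6c 64 00 correspond to the characters r l d \0. On a big-endian machine the variable would have to be defined as unsigned int i = 0x726c6400. 57616 would not need to change, because %x formats the numeric value rather than its bytes in memory.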
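Both halves of this answer can be checked on any host with a short sketch that lays out the bytes explicitly instead of relying on the machine's own byte order. The helper names store_le32, store_be32 and hex_of are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Write v into out[] the way a little-endian machine stores it. */
static void store_le32(uint32_t v, char out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((v >> (8 * i)) & 0xff);
}

/* Write v into out[] the way a big-endian machine stores it. */
static void store_be32(uint32_t v, char out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((v >> (8 * (3 - i))) & 0xff);
}

/* Format v in lowercase hexadecimal, as %x does. */
static void hex_of(unsigned v, char *out, size_t n)
{
    snprintf(out, n, "%x", v);
}
```

Storing 0x00646c72 in little-endian order and 0x726c6400 in big-endian order both produce the C string "rld", and formatting 57616 with %x yields "e110" regardless of byte order.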

  5. In the following code, what is going to be printed after 'y='? (note: the answer is not a specific value.) Why does this happen?
	cprintf("x=%d y=%d", 3);

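After fetching an argument, va_arg advances ap to point at the next argument. If the variadic list has too few arguments, the data va_arg reads is undefined: whatever happens to lie on the stack after 3 is printed. For details, see here.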

  6. Let's say that GCC changed its calling convention so that it pushed arguments on the stack in declaration order, so that the last argument is pushed last. How would you have to change cprintf or its interface so that it would still be possible to pass it a variable number of arguments?

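One option is to change the implementations of the va_start and va_arg macros so that they step through addresses in the opposite direction.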

The Stack

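The x86 stack grows downward. The stack pointer (esp) points to the lowest address of the stack currently in use. Pushing a value first decreases esp and then writes the value at the address it now points to. Popping a value first reads the value and then increases esp.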
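A toy model makes the order of operations explicit. This sketch uses a small array as stack memory and a byte offset in place of esp; the type and function names are invented for this note, and there is deliberately no overflow or underflow checking.

```c
#include <stdint.h>

#define STACK_BYTES 64

/* Toy descending stack: esp is a byte offset into mem. */
struct stack {
    uint32_t mem[STACK_BYTES / 4];
    uint32_t esp;  /* offset of the lowest word in use; STACK_BYTES when empty */
};

static void stack_init(struct stack *s)
{
    s->esp = STACK_BYTES;
}

static void push(struct stack *s, uint32_t v)
{
    s->esp -= 4;              /* first move the stack pointer down */
    s->mem[s->esp / 4] = v;   /* then store at the new top */
}

static uint32_t pop(struct stack *s)
{
    uint32_t v = s->mem[s->esp / 4];  /* first read the top word */
    s->esp += 4;                      /* then move the stack pointer up */
    return v;
}
```

Pushing an argument and then a return address, as a call sequence does, leaves the return address at the lower offset, so it is the first value popped.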

Determine where the kernel initializes its stack, and exactly where in memory its stack is located. How does the kernel reserve space for its stack? And at which "end" of this reserved area is the stack pointer initialized to point to?

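The boot loader hands control to the kernel with a final call. We have already analyzed the code before that point, and it does not initialize the kernel stack. So we start debugging from here and look at the instructions that follow.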

	((void (*)(void)) (ELFHDR->e_entry))();
    7d6b:	ff 15 18 00 01 00    	call   *0x10018

While stepping, we come across these two instructions:

(gdb) 
=> 0xf010002f <relocated>:	mov    $0x0,%ebp
relocated () at kern/entry.S:74
74		movl	$0x0,%ebp			# nuke frame pointer
(gdb) 
=> 0xf0100034 <relocated+5>:	mov    $0xf0110000,%esp
relocated () at kern/entry.S:77
77		movl	$(bootstacktop),%esp

Both instructions are in entry.S:

	# Clear the frame pointer register (EBP)
	# so that once we get into debugging C code,
	# stack backtraces will be terminated properly.
	movl	$0x0,%ebp			# nuke frame pointer

	# Set the stack pointer
	movl	$(bootstacktop),%esp

These two instructions initialize the stack, setting its initial address to 0xf0110000, which maps to physical address 0x00110000.

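0xf0110000 is clearly beyond the physical memory we actually have, and full virtual memory is not set up yet. entry.S uses the following code to map both 0xf0000000~0xf0400000 and 0x00000000~0x00400000 to physical addresses 0x00000000~0x00400000.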

	# Load the physical address of entry_pgdir into cr3.  entry_pgdir
	# is defined in entrypgdir.c.
	movl	$(RELOC(entry_pgdir)), %eax
	movl	%eax, %cr3
	# Turn on paging.
	movl	%cr0, %eax
	orl	$(CR0_PE|CR0_PG|CR0_WP), %eax
	movl	%eax, %cr0

In inc/memlayout.h we find this definition:

// Kernel stack.
#define KSTACKTOP	KERNBASE
#define KSTKSIZE	(8*PGSIZE)   		// size of a kernel stack
#define KSTKGAP		(8*PGSIZE)   		// size of a kernel stack guard

The code defines the stack size as 8 pages. A page is 4KB, for a total of 32KB, so the stack occupies 0xf0108000~0xf0110000, i.e. physical addresses 0x00108000~0x00110000.
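The stack bounds follow from a single subtraction, which the sketch below spells out. PGSIZE is assumed to be 4096, as stated in these notes, and stack_bottom is a hypothetical helper.

```c
#include <stdint.h>

#define PGSIZE   4096
#define KSTKSIZE (8 * PGSIZE)  /* size of a kernel stack */

/* Lowest address of a stack whose top (initial esp) is `top`. */
static uint32_t stack_bottom(uint32_t top)
{
    return top - KSTKSIZE;
}
```

With the initial esp of 0xf0110000 seen in GDB, the reserved area starts at 0xf0108000, and esp is initialized to the high end because the stack grows downward.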

The ebp (base pointer) register, in contrast, is associated with the stack primarily by software convention. On entry to a C function, the function's prologue code normally saves the previous function's base pointer by pushing it onto the stack, and then copies the current esp value into ebp for the duration of the function. If all the functions in a program obey this convention, then at any given point during the program's execution, it is possible to trace back through the stack by following the chain of saved ebp pointers and determining exactly what nested sequence of function calls caused this particular point in the program to be reached. This capability can be particularly useful, for example, when a particular function causes an assert failure or panic because bad arguments were passed to it, but you aren't sure who passed the bad arguments. A stack backtrace lets you find the offending function.

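The ebp register holds the stack frame information of the current function. When the current function makes a call, the callee saves that value on the stack and updates ebp to describe its own frame.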

To become familiar with the C calling conventions on the x86, find the address of the test_backtrace function in obj/kern/kernel.asm, set a breakpoint there, and examine what happens each time it gets called after the kernel starts. How many 32-bit words does each recursive nesting level of test_backtrace push on the stack, and what are those words?

In obj/kern/kernel.asm we can see the following:

……
// Test the stack backtrace function (lab 1 only)
void
test_backtrace(int x)
{
f0100040:	55                   	push   %ebp
……

In kern/init.c we can find the definition of this function:

// Test the stack backtrace function (lab 1 only)
void
test_backtrace(int x)
{
	cprintf("entering test_backtrace %d\n", x);
	if (x > 0)
		test_backtrace(x-1);
	else
		mon_backtrace(0, 0, 0);
	cprintf("leaving test_backtrace %d\n", x);
}

mon_backtrace currently does nothing:

int
mon_backtrace(int argc, char **argv, struct Trapframe *tf)
{
	// Your code here.
	return 0;
}

The call site in i386_init, as it appears in kernel.asm:

	// Test the stack backtrace function (lab 1 only)
	test_backtrace(5);
f01000e8:	c7 04 24 05 00 00 00 	movl   $0x5,(%esp)
f01000ef:	e8 4c ff ff ff       	call   f0100040 <test_backtrace>
f01000f4:	83 c4 10             	add    $0x10,%esp

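From the kernel.asm listing we can see that the first call to test_backtrace is set up at 0xf01000e8 (the call instruction itself is at 0xf01000ef) with the argument 5. We set a breakpoint here and trace the stack.

After the call instruction executes, we look at esp to see where the stack pointer currently points: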

(gdb) print $esp
$1 = (void *) 0xf010ffdc

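Now look at the data stored at this address and at the one before it (the stack grows downward, so the previous address is numerically larger):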

(gdb) print/x *0xf010ffdc@2
$2 = {0xf01000f4, 0x5}

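We can see that the argument 5 was placed on the stack, along with the address 0xf01000f4, which is the address of the first instruction to execute after test_backtrace returns.

Execution then enters test_backtrace. It first runs the following instructions, which save the caller's frame pointer on the stack, store its own frame pointer in ebp, and save the caller's esi and ebx.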

f0100040:	55                   	push   %ebp
f0100041:	89 e5                	mov    %esp,%ebp
f0100043:	56                   	push   %esi
f0100044:	53                   	push   %ebx

Check the stack contents:

(gdb) print/x *0xf010ffd0@5
$3 = {0xf0111308, 0x10094, 0xf010fff8, 0xf01000f4, 0x5}

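From the highest address down, these are: the argument 0x5, the address of the instruction to execute after test_backtrace returns, and the values of ebp, esi, and ebx that i386_init held when it called this function. At this point ebp holds the address of the third item (the saved ebp).

Then these three instructions execute: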

f0100045:	e8 72 01 00 00       	call   f01001bc <__x86.get_pc_thunk.bx>
f010004a:	81 c3 be 12 01 00    	add    $0x112be,%ebx
f0100050:	8b 75 08             	mov    0x8(%ebp),%esi

The first is a call instruction, which transfers control to this subroutine:

f01001bc <__x86.get_pc_thunk.bx>:
f01001bc:	8b 1c 24             	mov    (%esp),%ebx
f01001bf:	c3                   	ret    

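When call executes, it pushes the address of the instruction that follows the call onto the stack and then jumps to the target. When ret executes, it pops the address that call saved and loads it into eip. At this point the stack holds one more item: 0xf010004a.

The subroutine then executes a mov that copies this new item into ebx. After it returns, the add and mov instructions execute, and the last one loads a value into esi. Let's check the contents of esi: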

(gdb) print $esi
$4 = 5

Because global data sits at a fixed offset from the code, this technique lets the code reach its data (here, the string that will be passed to cprintf).

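The load into esi goes through the ebp that was set up earlier.
The current stack contents are:

0xf010ffe0:	0x00000005	// the argument passed in
0xf010ffdc:	0xf01000f4	// address of the next instruction after the function returns
0xf010ffd8:	0xf010fff8	// value of ebp while running init.c
0xf010ffd4:	0x00010094	// value of esi while running init.c
0xf010ffd0:	0xf0111308	// value of ebx while running init.c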
(gdb) print/x $ebp
$5 = 0xf010ffd8

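At this point ebp points to the third item, so the instruction at 0xf0100050, which reads 0x8(%ebp) = 0xf010ffe0, loads the value 0x5.

Next comes the call to cprintf: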

	cprintf("entering test_backtrace %d\n", x);
f0100053:	83 ec 08             	sub    $0x8,%esp
f0100056:	56                   	push   %esi
f0100057:	8d 83 18 07 ff ff    	lea    -0xf8e8(%ebx),%eax
f010005d:	50                   	push   %eax
f010005e:	e8 e6 09 00 00       	call   f0100a49 <cprintf>

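sub $0x8,%esp reserves 8 bytes on the stack. Nothing is stored there, so this is most likely padding that keeps esp 16-byte aligned at the call rather than storage for temporaries. The argument 0x5 is then pushed, the string is located through the offset between the code and the global data, its pointer is pushed, and finally cprintf is called.
Five words have now been added to the stack. From top to bottom: two blank words, the argument 5, the string pointer, and 0xf0100063, the address of the instruction that follows the call to cprintf.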
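
To make the address arithmetic concrete, here is a small sketch that redoes the two computations by hand, using only constants from the disassembly above. pic_base and lea_ebx are hypothetical helper names for illustration.

```c
#include <stdint.h>

/* Value of ebx after "call __x86.get_pc_thunk.bx" followed by
 * "add $imm,%ebx": the thunk loads the return address into ebx,
 * then the add applies a fixed offset to it. */
static uint32_t pic_base(uint32_t return_addr, uint32_t imm)
{
    return return_addr + imm;
}

/* Effective address computed by "lea disp(%ebx),%eax". */
static uint32_t lea_ebx(uint32_t ebx, int32_t disp)
{
    return ebx + (uint32_t)disp;   /* wraps modulo 2^32 like the CPU */
}
```

With the values from the listing, pic_base(0xf010004a, 0x112be) is 0xf0111308, and lea_ebx(0xf0111308, -0xf8e8) is 0xf0101a20, which would be the address of the format string handed to cprintf.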

After the call returns, the following instructions execute:

	if (x > 0)
f0100063:	83 c4 10             	add    $0x10,%esp
f0100066:	85 f6                	test   %esi,%esi
f0100068:	7f 2b                	jg     f0100095 <test_backtrace+0x55>

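When cprintf returns, esp is 0xf010ffc0, because ret has already popped the return address. The add then removes the four words that were set up for the cprintf call. Next the value of x is tested: if it is greater than 0 the function calls itself recursively; otherwise (x is 0) it calls mon_backtrace.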
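
The esp values in this step can be traced with a tiny model that only tracks the stack pointer. This is a sketch for checking the arithmetic, not a simulation of the real instructions; the helper name is hypothetical.

```c
#include <stdint.h>

/* Track esp across the cprintf call sequence shown above, starting from
 * the value esp has right after the prologue of test_backtrace. */
static uint32_t esp_after_cprintf_returns(uint32_t esp)
{
    esp -= 8;   /* sub  $0x8,%esp                  */
    esp -= 4;   /* push %esi   (argument x)        */
    esp -= 4;   /* push %eax   (string pointer)    */
    esp -= 4;   /* call cprintf: return address    */
    esp += 4;   /* ret inside cprintf pops it      */
    return esp;
}
```

Starting from 0xf010ffd0, the function returns 0xf010ffc0, and the following add $0x10,%esp brings esp back to 0xf010ffd0.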

		mon_backtrace(0, 0, 0);
f010006a:	83 ec 04             	sub    $0x4,%esp
f010006d:	6a 00                	push   $0x0
f010006f:	6a 00                	push   $0x0
f0100071:	6a 00                	push   $0x0
f0100073:	e8 0b 08 00 00       	call   f0100883 <mon_backtrace>
f0100078:	83 c4 10             	add    $0x10,%esp
	cprintf("leaving test_backtrace %d\n", x);
f010007b:	83 ec 08             	sub    $0x8,%esp
f010007e:	56                   	push   %esi
f010007f:	8d 83 34 07 ff ff    	lea    -0xf8cc(%ebx),%eax
f0100085:	50                   	push   %eax
f0100086:	e8 be 09 00 00       	call   f0100a49 <cprintf>
}
f010008b:	83 c4 10             	add    $0x10,%esp
f010008e:	8d 65 f8             	lea    -0x8(%ebp),%esp
f0100091:	5b                   	pop    %ebx
f0100092:	5e                   	pop    %esi
f0100093:	5d                   	pop    %ebp
f0100094:	c3                   	ret
		test_backtrace(x-1);
f0100095:	83 ec 0c             	sub    $0xc,%esp
f0100098:	8d 46 ff             	lea    -0x1(%esi),%eax
f010009b:	50                   	push   %eax
f010009c:	e8 9f ff ff ff       	call   f0100040 <test_backtrace>
f01000a1:	83 c4 10             	add    $0x10,%esp
f01000a4:	eb d5                	jmp    f010007b <test_backtrace+0x3b>

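Every recursive call repeats the steps above. Each recursive level pushes 8 words: three words of padding (sub $0xc,%esp), the argument x-1, the return address, and the saved ebp, esi, and ebx. The first call from i386_init contributes only the 5 words listed earlier, because its argument is written with movl rather than pushed after a sub. When execution reaches x = 0 and the test at 0xf0100068, the stack holds 5 + 5×8 = 45 words. They play the same roles as the first 5+3 words, only with different values. esp is then 0xf010ff30.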
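
The word count and the final esp can be cross-checked with a short sketch. The constants come from the walkthrough above; the function names are hypothetical.

```c
#include <stdint.h>

enum {
    FIRST_CALL_WORDS     = 5,  /* arg, return address, ebp, esi, ebx */
    RECURSIVE_CALL_WORDS = 8   /* 3 padding words + the same 5 roles */
};

/* Words on the stack after the first call plus n recursive calls. */
static uint32_t words_on_stack(uint32_t n)
{
    return FIRST_CALL_WORDS + RECURSIVE_CALL_WORDS * n;
}

/* Address of the lowest word when `words` 4-byte words have been stacked
 * downward starting at first_word_addr. */
static uint32_t lowest_word_addr(uint32_t first_word_addr, uint32_t words)
{
    return first_word_addr - 4u * (words - 1u);
}
```

For five recursive calls (x = 4 down to x = 0), words_on_stack(5) is 45, and lowest_word_addr(0xf010ffe0, 45) is 0xf010ff30, matching the esp value observed in gdb.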

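Now look directly at the x = 0 case. The program calls mon_backtrace: it first reserves one slot on the stack, pushes three arguments, calls the function, and after the return removes the three arguments and the reserved slot. It then calls cprintf again.

It reserves two slots, pushes the arguments, makes the call, and finally removes all of them.

Next, the instruction f010008e: 8d 65 f8 lea -0x8(%ebp),%esp points esp at the ebx value that this call saved on the stack, and the registers are restored. esp then points at the address where the caller will resume; ret loads the value esp points to into the program counter and pops it. When the x = 0 call returns, its return address is 0xf01000a1. Four words (three padding words and the argument) are removed there, and execution jumps to 0xf010007b, the second cprintf call of the x = 1 invocation. This continues until the x = 5 call returns to 0xf01000f4, the address saved on the stack by the call in init.c. At that point, of the words we care about, only the argument 5 remains on the stack.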

Implement the backtrace function as specified above. Use the same format as in the example, since otherwise the grading script will be confused. When you think you have it working right, run make grade to see if its output conforms to what our grading script expects, and fix it if it doesn't. After you have handed in your Lab 1 code, you are welcome to change the output format of the backtrace function any way you like.

The code is as follows:

int
mon_backtrace(int argc, char **argv, struct Trapframe *tf)
{
	// Your code here.
	int *ebp = (int *)read_ebp();
	cprintf("Stack backtrace:\r\n");
	while(ebp != 0) {
		cprintf("  ebp %08x  eip %08x  args %08x %08x %08x %08x %08x\r\n", ebp, ebp[1], ebp[2]
			, ebp[3], ebp[4], ebp[5], ebp[6]);
		ebp = (int *)ebp[0];
	}
	
	return 0;
}
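
The loop above relies on the saved-ebp chain ending at 0, which holds because entry.S clears ebp before any C code runs. The same walk can be tried on an ordinary machine against a fake stack. The sketch below is a host-side model with hypothetical names (FakeStack, walk_frames), not kernel code.

```c
#include <stddef.h>
#include <stdint.h>

/* A fake stack: words[i] lives at address base + 4*i. */
typedef struct {
    uint32_t base;
    const uint32_t *words;
    size_t nwords;
} FakeStack;

static int in_range(const FakeStack *s, uint32_t addr)
{
    return addr >= s->base && (addr - s->base) / 4 < s->nwords;
}

static uint32_t load(const FakeStack *s, uint32_t addr)
{
    return s->words[(addr - s->base) / 4];
}

/* Follow the saved-ebp links starting at ebp. Each frame's return address
 * is stored in eips[]; the number of frames visited is returned. */
static size_t walk_frames(const FakeStack *s, uint32_t ebp,
                          uint32_t *eips, size_t max)
{
    size_t n = 0;
    while (ebp != 0 && n < max && in_range(s, ebp) && in_range(s, ebp + 4)) {
        eips[n++] = load(s, ebp + 4);  /* return address sits above saved ebp */
        ebp = load(s, ebp);            /* move to the caller's frame */
    }
    return n;
}
```

The range checks are an addition for the model only; the kernel version trusts the chain and stops when it reads a saved ebp of 0.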

Output:

entering test_backtrace 5
entering test_backtrace 4
entering test_backtrace 3
entering test_backtrace 2
entering test_backtrace 1
entering test_backtrace 0
Stack backtrace:
  ebp f010ff18  eip f0100078  args 00000000 00000000 00000000 f010004a f0111308
  ebp f010ff38  eip f01000a1  args 00000000 00000001 f010ff78 f010004a f0111308
  ebp f010ff58  eip f01000a1  args 00000001 00000002 f010ff98 f010004a f0111308
  ebp f010ff78  eip f01000a1  args 00000002 00000003 f010ffb8 f010004a f0111308
  ebp f010ff98  eip f01000a1  args 00000003 00000004 00000000 f010004a f0111308
  ebp f010ffb8  eip f01000a1  args 00000004 00000005 00000000 f010004a f0111308
  ebp f010ffd8  eip f01000f4  args 00000005 00001aac 00000640 00000000 00000000
  ebp f010fff8  eip f010003e  args 00000003 00001003 00002003 00003003 00004003
leaving test_backtrace 0
leaving test_backtrace 1
leaving test_backtrace 2
leaving test_backtrace 3
leaving test_backtrace 4
leaving test_backtrace 5

Modify your stack backtrace function to display, for each eip, the function name, source file name, and line number corresponding to that eip.

For background on stabs, see the linked documentation. The Stab structure is defined in inc/stab.h.

// Entries in the STABS table are formatted as follows.
struct Stab {
	uint32_t n_strx;	// index into string table of name
	uint8_t n_type;         // type of symbol
	uint8_t n_other;        // misc info (usually empty)
	uint16_t n_desc;        // description field
	uintptr_t n_value;	// value of symbol
};

First open kern/kernel.ld and look at the relevant parts:

/* Include debugging information in kernel memory */
	.stab : {
		PROVIDE(__STAB_BEGIN__ = .);
		*(.stab);
		PROVIDE(__STAB_END__ = .);
		BYTE(0)		/* Force the linker to allocate space
				   for this section */
	}

	.stabstr : {
		PROVIDE(__STABSTR_BEGIN__ = .);
		*(.stabstr);
		PROVIDE(__STABSTR_END__ = .);
		BYTE(0)		/* Force the linker to allocate space
				   for this section */
	}

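__STAB_BEGIN__ and __STAB_END__ mark the start and end addresses of the .stab section; __STABSTR_BEGIN__ and __STABSTR_END__ do the same for the .stabstr section.

In the linker script, . stands for the current address.

Run: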

objdump -h obj/kern/kernel

The two entries we care about are:

Idx Name          Size      VMA       LMA       File off  Algn
	……
  2 .stab         00003c61  f010218c  0010218c  0000318c  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .stabstr      0000195b  f0105ded  00105ded  00006ded  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
	……

These two entries give the address and size of each section, from which we can compute their end addresses.

Run:
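
As a quick check of that computation, the sketch below adds each section's size to its VMA, using the numbers from the objdump -h output above. section_end is a hypothetical helper.

```c
#include <stdint.h>

/* First address past a section that starts at vma and is size bytes long. */
static uint32_t section_end(uint32_t vma, uint32_t size)
{
    return vma + size;
}
```

section_end(0xf010218c, 0x3c61) is 0xf0105ded, which is exactly where .stabstr begins, and section_end(0xf0105ded, 0x195b) is 0xf0107748.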

objdump -G obj/kern/kernel

This shows the contents of the .stab section.

obj/kern/kernel:     file format elf32-i386

Contents of .stab section:

Symnum n_type n_othr n_desc n_value  n_strx String

-1     HdrSym 0      1287   0000195a 1     
0      SO     0      0      f0100000 1      {standard input}
1      SOL    0      0      f010000c 18     kern/entry.S
2      SLINE  0      44     f010000c 0      
……
474    FUN    0      0      f0100883 4237   mon_backtrace:F(0,1)
475    PSYM   0      0      00000008 4129   argc:p(0,1)

According to the linked documentation, the fields of the first (header) entry, Symnum -1, have the following meaning:

  • n_type N_UNDF
  • n_othr Unused field, always zero. This may eventually be used to hold overflows from the count in the n_desc field.
  • n_desc Count of upcoming symbols, i.e., the number of remaining stabs for this source file.
  • n_value Size of the string table fragment associated with this source file, in bytes.
  • n_strx Relative to the start of the .stabstr section.

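Symnum can be treated as an index, and n_type is the entry type. FUN denotes a function, and its String is the function name followed by some extra information. So to find a function name in mon_backtrace, we need to locate this kind of entry.

The comments in kern/kdebug.c describe what stab_binsearch does: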
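
One way to sanity-check the header entry is to count the entries implied by the section size. The sketch below uses a host-side copy of the structure, here called Stab32, in which n_value is declared uint32_t (an assumption made so that the layout matches the 32-bit target even on a 64-bit host).

```c
#include <stdint.h>

/* Host-side mirror of struct Stab with a fixed 32-bit n_value. */
struct Stab32 {
    uint32_t n_strx;    /* index into string table of name */
    uint8_t  n_type;    /* type of symbol */
    uint8_t  n_other;   /* misc info (usually empty) */
    uint16_t n_desc;    /* description field */
    uint32_t n_value;   /* value of symbol */
};

/* Number of whole entries in a .stab section of section_size bytes. */
static uint32_t stab_entries(uint32_t section_size)
{
    return section_size / (uint32_t)sizeof(struct Stab32);
}
```

Each entry is 12 bytes, so the 0x3c61-byte section holds 1288 whole entries (the one leftover byte comes from the BYTE(0) in the linker script). That is the header plus the 1287 symbols it announces in n_desc.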

//	Given an instruction address, this function finds the single stab
//	entry of type 'type' that contains that address.
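
The core idea, stripped of the stab-type filtering that the real stab_binsearch performs, is a binary search for the last entry whose start address does not exceed the target. The sketch below shows only that idea on a plain sorted array; find_containing is a hypothetical name and this is not the function from kern/kdebug.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Return the index of the last element of the ascending array addrs[0..n)
 * that is <= addr, or -1 if addr lies before every element. */
static int find_containing(const uint32_t *addrs, size_t n, uint32_t addr)
{
    int lo = 0, hi = (int)n - 1, best = -1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (addrs[mid] <= addr) {
            best = mid;       /* candidate; look for a later one */
            lo = mid + 1;
        } else {
            hi = mid - 1;     /* starts after addr; look earlier */
        }
    }
    return best;
}
```

For example, with function start addresses {0xf0100040, 0xf01000a6, 0xf0100883} (the middle value is made up for illustration), the return address 0xf01000a1 falls in entry 0, the function starting at 0xf0100040.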

Run:

gcc -pipe -nostdinc -O2 -fno-builtin -I. -MD -Wall -Wno-format -DJOS_KERNEL -gstabs -c -S kern/init.c

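We can inspect the generated assembly file for more information.

To check whether the symbol table has been loaded into memory, we can use gdb to examine the data at the section's start address directly: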

(gdb) x/5s 0xf0105ded
0xf0105ded:	""
0xf0105dee:	"{standard input}"
0xf0105dff:	"kern/entry.S"
0xf0105e0c:	"kern/entrypgdir.c"
0xf0105e1e:	"gcc2_compiled."

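This is consistent with the stabs strings in the generated assembly, which shows that the symbol table has been loaded into memory. Note that this address is only valid once the kernel has been entered and the address mapping is in place; before that, examine 0x00105ded instead.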

Complete the implementation of debuginfo_eip by inserting the call to stab_binsearch to find the line number for an address.

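We now need to add some code to debuginfo_eip so that it can find the line number. This requires stab_binsearch.

Both the code of that function and a usage example are documented in kern/kdebug.c.

The added code is: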

	stab_binsearch(stabs, &lline, &rline, N_SLINE, addr);
	if (lline <= rline) {
		info->eip_line = stabs[lline].n_desc;
	} else return -1;

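The code comment says info->eip_line should be set to the "right line number". I read "right" here as "correct" rather than as a reference to rline; indexing with lline is what produced the correct output.

The updated mon_backtrace is: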

int
mon_backtrace(int argc, char **argv, struct Trapframe *tf)
{
	// Your code here.
	int *ebp = (int *)read_ebp();
	struct Eipdebuginfo info;
	cprintf("Stack backtrace:\r\n");
	while(ebp != 0) {
		cprintf("  ebp %08x  eip %08x  args %08x %08x %08x %08x %08x\r\n", ebp, ebp[1], ebp[2], ebp[3], ebp[4], ebp[5], ebp[6]);
		memset(&info, 0, sizeof(struct Eipdebuginfo));
		if (debuginfo_eip(ebp[1], &info)) {
			cprintf("failed to get debuginfo for eip %x.\r\n", ebp[1]);
		} else {
			cprintf("\t%s:%d: %.*s+%u\r\n", info.eip_file, info.eip_line, info.eip_fn_namelen, info.eip_fn_name, ebp[1] - info.eip_fn_addr);
		}
		ebp = (int *)ebp[0];
	}
	
	return 0;
}
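
The %.*s conversion matters because the name stored in the stab string table is followed by type information (as in mon_backtrace:F(0,1)), so only the first eip_fn_namelen characters should be printed. The sketch below reproduces the same formatting with the standard library on a host machine; fn_name_len and format_symbol are hypothetical helpers, and the file name and line number in the usage note are made-up values.

```c
#include <stdio.h>
#include <string.h>

/* Length of the function name in a stab string such as
 * "mon_backtrace:F(0,1)": everything before the first ':'. */
static int fn_name_len(const char *stab_str)
{
    const char *colon = strchr(stab_str, ':');
    return colon ? (int)(colon - stab_str) : (int)strlen(stab_str);
}

/* Format "file:line: name+offset" into buf, printing only the name part
 * of stab_str by passing its length as the precision for %.*s. */
static int format_symbol(char *buf, size_t size, const char *file, int line,
                         const char *stab_str, unsigned offset)
{
    return snprintf(buf, size, "%s:%d: %.*s+%u", file, line,
                    fn_name_len(stab_str), stab_str, offset);
}
```

Calling format_symbol(buf, sizeof buf, "kern/monitor.c", 60, "mon_backtrace:F(0,1)", 10) fills buf with kern/monitor.c:60: mon_backtrace+10.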

The command table gains one entry:

static struct Command commands[] = {
	{ "help", "Display this list of commands", mon_help },
	{ "kerninfo", "Display information about the kernel", mon_kerninfo },
	{ "mon_backtrace", "Display information about Stack trace", mon_backtrace },
};

Finally, make grade:

……
running JOS: (1.0s) 
  printf: OK 
  backtrace count: OK 
  backtrace arguments: OK 
  backtrace symbols: OK 
  backtrace lines: OK 
Score: 50/50

I still don't fully understand stabs; I'll add more when I get the chance.