3. The Boot Loader's main tasks and typical structure
Before discussing this section, we first state an assumption: the kernel image and the root file system image are both loaded into RAM and run from there. The assumption needs stating because, in an embedded system, the kernel image and root file system image can also run directly from a solid-state storage device such as ROM or Flash, although that approach undoubtedly sacrifices running speed. From the operating system's point of view, the overall goal of the Boot Loader is to call the kernel correctly.
In addition, because the implementation of a Boot Loader depends on the CPU architecture, most Boot Loaders are divided into two parts: Stage1 and Stage2. Code that depends on the CPU architecture, such as device initialization code, is usually placed in Stage1 and is usually written in assembly language to keep it short and efficient. Stage2 is usually written in C, which makes it possible to implement more complex functions and gives the code better readability and portability.
The Boot Loader's Stage1 usually includes the following steps (in order of execution):
· Initialize the hardware devices.
· Prepare RAM space for loading the Boot Loader's Stage2.
· Copy the Boot Loader's Stage2 into RAM.
· Set up the stack.
· Jump to the C entry point of Stage2.
The Boot Loader's Stage2 usually includes the following steps (in order of execution):
· Initialize the hardware devices used in this stage.
· Detect the system memory map.
· Read the kernel image and root file system image from Flash into RAM.
· Set the startup parameters for the kernel.
· Call the kernel.
3.1 Boot Loader Stage1
3.1.1 Basic hardware initialization
This is the first operation the Boot Loader performs. Its purpose is to prepare a basic hardware environment for the execution of Stage2 and, later, of the kernel. It typically includes the following steps (in order of execution):
1. Mask all interrupts. Servicing interrupts is usually the responsibility of the OS device drivers, so the Boot Loader does not need to respond to any interrupt during its entire run. Interrupts can be masked by writing to the CPU's interrupt mask register or status register (such as the CPSR register on ARM).
2. Set the CPU speed and clock frequency.
3. Initialize RAM. This includes correctly setting the function registers of the system's memory controller and the control registers of each memory bank.
4. Initialize the LED. Typically, the LED is driven through GPIO and is used to show whether the system state is OK or Error. If the board has no LED, the UART can be initialized instead so that the Boot Loader's logo string is printed to the serial port.
5. Turn off the CPU's internal instruction and data caches.
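As an illustration of step 1 on ARM, the following is a minimal sketch of the bit arithmetic only, written in C so that it can be checked on a host. It assumes a classic ARM core in which bit 7 of the CPSR is the IRQ disable bit (I) and bit 6 is the FIQ disable bit (F). The function names are hypothetical, and a real Stage1 would do the read-modify-write in assembly, as the comment indicates.

```c
#include <assert.h>
#include <stdint.h>

#define CPSR_I_BIT 0x80u  /* bit 7: 1 = IRQ masked */
#define CPSR_F_BIT 0x40u  /* bit 6: 1 = FIQ masked */

/*
 * Return the CPSR value with IRQ and FIQ masked, all other bits unchanged.
 * On the target this corresponds to something like:
 *     mrs r0, cpsr
 *     orr r0, r0, #0xC0
 *     msr cpsr_c, r0
 */
uint32_t cpsr_mask_interrupts(uint32_t cpsr)
{
    return cpsr | CPSR_I_BIT | CPSR_F_BIT;
}

/* Return 1 if both IRQ and FIQ are masked in the given CPSR value. */
int cpsr_interrupts_masked(uint32_t cpsr)
{
    uint32_t both = CPSR_I_BIT | CPSR_F_BIT;
    return (cpsr & both) == both;
}

/* Host-side check of the arithmetic (hypothetical starting value). */
void cpsr_mask_example(void)
{
    uint32_t cpsr = 0x00000013u;  /* SVC mode, IRQ and FIQ enabled */
    assert(!cpsr_interrupts_masked(cpsr));
    cpsr = cpsr_mask_interrupts(cpsr);
    assert(cpsr == 0x000000D3u);  /* SVC mode, IRQ and FIQ masked */
}
```

Keeping the mode bits untouched is a deliberate choice in this sketch: masking interrupts and selecting the CPU mode are separate concerns, even though both live in the same register.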
3.1.2 Preparing RAM space for loading Stage2
To get faster execution, Stage2 is usually loaded into RAM and run from there, so a range of usable RAM must be prepared for the Boot Loader's Stage2.
Because Stage2 is usually C code, the size of this range must account for the stack as well as the Stage2 executable image. In addition, the size should preferably be a multiple of the memory page size (usually 4 KB). In general, 1 MB of RAM is sufficient. The specific address range can be chosen freely; blob, for example, arranges for its Stage2 executable image to run in the 1 MB of system RAM starting at address 0xC0200000. However, placing Stage2 in the top 1 MB of the entire RAM space, that is, starting at (RamEnd - 1MB), is a recommended approach. For convenience in what follows, call the size of the chosen RAM range Stage2_size (in bytes) and its start and end addresses Stage2_start and Stage2_end (both aligned to a 4-byte boundary). Therefore:
Stage2_end = Stage2_start + Stage2_size
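The address arithmetic for the recommended "top 1 MB" placement can be sketched as follows. This is only an illustration under stated assumptions: RamEnd is taken to be the first address past the end of RAM and to be page-aligned, the struct and function names are hypothetical, and the initial stack pointer value (Stage2_end - 4) is the one suggested in Section 3.1.4.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE   0x1000u    /* 4 KB memory page */
#define STAGE2_SIZE 0x100000u  /* 1 MB for the Stage2 image plus its stack */

struct stage2_layout {
    uint32_t start;  /* Stage2_start */
    uint32_t end;    /* Stage2_end = Stage2_start + Stage2_size */
    uint32_t sp;     /* initial stack pointer; the stack grows downward */
};

/* Place Stage2 in the top STAGE2_SIZE bytes of RAM. */
struct stage2_layout stage2_layout_at_top(uint32_t ram_end)
{
    struct stage2_layout l;

    /* Assumptions of this sketch: page-aligned end, enough RAM below it. */
    assert(ram_end % PAGE_SIZE == 0);
    assert(ram_end >= STAGE2_SIZE);

    l.start = ram_end - STAGE2_SIZE;
    l.end   = l.start + STAGE2_SIZE;
    l.sp    = l.end - 4;
    return l;
}
```

For a hypothetical board whose RAM ends at 0xC2000000, this gives Stage2_start = 0xC1F00000, Stage2_end = 0xC2000000 and an initial stack pointer of 0xC1FFFFFC.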
In addition, you must make sure that the chosen address range really is readable and writable RAM, so the range must be tested. The test can use a method similar to blob's: test one memory page at a time, checking whether the two words at the start of each memory page can be written and read back. For convenience in what follows, we call this test algorithm test_mempage. Its steps are as follows:
1. Save the contents of the first two words of the memory page.
2. Write arbitrary values to these two words, for example 0x55 to the first word and 0xAA to the second.
3. Immediately read the two words back. The values read should be 0x55 and 0xAA respectively. If not, the address range occupied by this memory page is not valid RAM.
4. Write arbitrary values to these two words again, for example 0xAA to the first word and 0x55 to the second.
5. Immediately read the two words back. The values read should be 0xAA and 0x55 respectively. If not, the address range occupied by this memory page is not valid RAM.
6. Restore the original contents of the two words. The test is complete.
To get a clean RAM range, the chosen range can also be cleared to zero.
3.1.3 Copying Stage2 to RAM
When copying, two things must be determined: (1) the start and end addresses at which the Stage2 executable image is stored in the solid-state storage device; (2) the start address of the RAM space.
3.1.4 Setting the stack pointer SP
The stack pointer is set up in preparation for executing C code. Usually SP can be set to (Stage2_end - 4), that is, the top of the 1 MB RAM range arranged in Section 3.1.2 (the stack grows downward). In addition, the LED can be turned off before the stack pointer is set, to signal to the user that execution is about to jump to Stage2. After the steps above, the physical memory layout of the system should be as shown in Figure 2 below.
3.1.5 Jumping to the C entry point of Stage2
Once all of the above is ready, execution can jump to the Boot Loader's Stage2. On ARM systems, for example, this can be done by setting the PC register to the appropriate address.
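The six test_mempage steps from Section 3.1.2 can be sketched in C as shown below. This is a minimal sketch rather than blob's actual code: the signature and the return convention (1 for valid RAM, 0 otherwise) are assumptions, and the accesses go through a volatile pointer so that the compiler does not optimize the read-backs away.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Test whether the first two words of a memory page behave like RAM.
 * Returns 1 if both write/read-back passes succeed, 0 otherwise.
 */
int test_mempage(volatile uint32_t *page)
{
    uint32_t saved0, saved1;
    int ok = 1;

    assert(page != 0);

    /* Step 1: save the first two words. */
    saved0 = page[0];
    saved1 = page[1];

    /* Steps 2-3: write 0x55 / 0xAA and read them back. */
    page[0] = 0x55;
    page[1] = 0xAA;
    if (page[0] != 0x55 || page[1] != 0xAA)
        ok = 0;

    /* Steps 4-5: write the swapped pattern and read it back. */
    if (ok) {
        page[0] = 0xAA;
        page[1] = 0x55;
        if (page[0] != 0xAA || page[1] != 0x55)
            ok = 0;
    }

    /* Step 6: restore the original contents. */
    page[0] = saved0;
    page[1] = saved1;
    return ok;
}
```

Writing two different patterns with the values swapped catches a location that happens to read back a fixed value matching one of the patterns.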
Figure 2. System memory layout just after the Boot Loader's Stage2 executable image has been copied to RAM
3.2 Boot Loader Stage2
As mentioned earlier, Stage2 code is usually written in C, in order to implement more complex functions and achieve better code readability and portability. Unlike an ordinary C application, however, the Boot Loader cannot use any support functions from the glibc library when it is compiled and linked. The reason is obvious: there is no operating system underneath for such a library to rely on. This raises a question: from where should execution jump into the main() function? Using the start address of main() directly as the entry point of the whole Stage2 executable image is perhaps the most direct idea. But this has two drawbacks: 1) no arguments can be passed to main(); 2) the case where main() returns cannot be handled. A more clever approach is to use the trampoline concept. That is, write a small trampoline program in assembly language and use it as the entry point of the Stage2 executable image. The trampoline then jumps into main() with a CPU branch instruction, and when main() returns, the CPU's execution path naturally comes back to the trampoline. In short, the trampoline serves as an external wrapper around main(). The following is a simple trampoline example (from blob):
.text
.globl _trampoline
_trampoline:
    bl main
    /* if main ever returns we just call it again */
    b _trampoline
As can be seen, when main() returns, a branch instruction re-executes the trampoline and, with it, main(). This is what the word trampoline refers to.
3.2.1 Initializing the hardware devices used in this stage
This usually includes: (1) initializing at least one serial port, for input and output with the terminal user; (2) initializing timers, and so on. Before initializing these devices, the LED can also be turned back on to indicate that execution has entered main(). After the devices are initialized, some information can be printed, such as the program name string and version number.
3.2.2 Detecting the system memory map
The memory map describes which address ranges, within the entire 4 GB physical address space, are allocated for addressing the system's RAM. For example, on the SA-1100 CPU, the 512 MB address space starting at 0xC0000000 is used as the system's RAM address space, and on the Samsung S3C44B0X CPU, the 64 MB address space between 0x0C000000 and 0x10000000 is used as the system's RAM address space. Although a CPU typically reserves a large address space for system RAM, a specific embedded system does not necessarily implement all of it. That is, a specific embedded system often maps only part of the CPU's reserved RAM address space to actual RAM, and leaves the rest of that address space unused. Because of this, the Boot Loader's Stage2 must detect the memory map of the entire system before it does anything else (for example, before reading the kernel image stored in Flash into RAM). In other words, it must know which parts of the CPU's reserved RAM address space are really mapped to RAM and which are in the "unused" state.
(1) Describing the memory map
The following data structure can be used to describe a contiguous address range in the RAM address space:
typedef struct memory_area_struct {
    u32 start;  /* the base address of the memory region */
    u32 size;   /* the byte number of the memory region */
    int used;
} memory_area_t;
A contiguous address range in the RAM address space described this way is in one of two states: (1) used = 1: the address range is implemented, that is, it is really mapped to RAM. (2) used = 0: the address range is not implemented by the system and is in the unused state. Based on the memory_area_t data structure above, the entire RAM address space reserved by the CPU can be represented by an array of memory_area_t entries, as follows:
memory_area_t memory_map[NUM_MEM_AREAS] = {
    [0 ... (NUM_MEM_AREAS - 1)] = {
        .start = 0,
        .size = 0,
        .used = 0
    },
};
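To make the data structure above concrete, here is a small host-side sketch. Two things in it are assumptions rather than part of the original listing: the helper names memory_map_clear and memory_map_total are hypothetical, and the explicit loop is offered as a portable alternative, since the `[0 ... N]` range initializer is a GNU C extension that not every compiler accepts.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t u32;

typedef struct memory_area_struct {
    u32 start;  /* base address of the memory region */
    u32 size;   /* size of the memory region in bytes */
    int used;   /* 1 = really mapped to RAM, 0 = unused */
} memory_area_t;

/* Mark every entry as unused without relying on range initializers. */
void memory_map_clear(memory_area_t *map, int n)
{
    int i;

    for (i = 0; i < n; i++) {
        map[i].start = 0;
        map[i].size = 0;
        map[i].used = 0;
    }
}

/* Total number of bytes of implemented RAM described by the map. */
u32 memory_map_total(const memory_area_t *map, int n)
{
    u32 total = 0;
    int i;

    assert(map != 0 && n >= 0);
    for (i = 0; i < n; i++)
        if (map[i].used)
            total += map[i].size;
    return total;
}
```

For a hypothetical board with two 16 MB banks at 0xC0000000 and 0xC8000000, two entries would be marked used and memory_map_total() would return 0x02000000 (32 MB).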
(2) Detecting the memory map
Below is a simple and effective algorithm that can be used to detect the memory map of the entire RAM address space:
    /* Array initialization */
    for (i = 0; i < NUM_MEM_AREAS; i++)
        memory_map[i].used = 0;

    /* First write a 0 to all memory locations */
    for (addr = MEM_START; addr < MEM_END; addr += PAGE_SIZE)
        *(u32 *)addr = 0;

    for (i = 0, addr = MEM_START; addr < MEM_END; addr += PAGE_SIZE) {
        /*
         * Check whether the PAGE_SIZE bytes starting at addr are
         * valid RAM, using the test_mempage() algorithm from
         * Section 3.1.2 (assumed to return nonzero for valid RAM).
         */
        if (!test_mempage(addr)) {
            /* no RAM here */
            if (memory_map[i].used)
                i++;
            continue;
        }

        /*
         * The current page is a valid address range mapped to RAM,
         * but it may still be just an alias of another address page
         * in the 4 GB address space.
         */
        if (*(u32 *)addr != 0) { /* alias? */
            /* This page is an alias of another address page */
            if (memory_map[i].used)
                i++;
            continue;
        }

        /*
         * The current page is valid RAM and is not an alias. Mark it
         * with a nonzero value (its own address) so that any later
         * alias of this page fails the zero check above.
         */
        *(u32 *)addr = addr;

        if (memory_map[i].used == 0) {
            memory_map[i].start = addr;
            memory_map[i].size = PAGE_SIZE;
            memory_map[i].used = 1;
        } else {
            memory_map[i].size += PAGE_SIZE;
        }
    } /* end of for (...) */
After detecting the system's memory map with the algorithm above, the Boot Loader can also print detailed memory map information to the serial port.
3.2.3 Loading the kernel image and root file system image
(1) Planning memory usage
The planning covers two things: (1) the memory range occupied by the kernel image; (2) the memory range occupied by the root file system. For each, the main considerations are the base address and the size of the image. The kernel image is generally copied to a memory range of about 1 MB starting at the base address (MEM_START + 0x8000); an embedded Linux kernel generally does not exceed 1 MB. Why leave the 32 KB of memory from MEM_START to MEM_START + 0x8000 empty?
This is because the Linux kernel places some global data structures in this memory, such as the startup parameters and the kernel page tables. The root file system image is generally copied to the address MEM_START + 0x00100000. If a ramdisk is used as the root file system image, its size after decompression is generally 1 MB.
(2) Copying from Flash
Because embedded CPUs such as ARM usually address Flash and other solid-state storage devices within a single unified memory address space, reading data from Flash is no different from reading data from RAM. A simple loop is enough to copy an image from the Flash device:
    while (count) {
        *dest++ = *src++;  /* they are all aligned to a word boundary */
        count -= 4;        /* byte count */
    }
3.2.4 Setting the kernel startup parameters
After the kernel image and root file system image have been copied into RAM, the Linux kernel is almost ready to be started. But one preparation step remains before calling the kernel: setting the Linux kernel's startup parameters. Kernels from Linux 2.4.x onward expect the startup parameters to be passed in the form of a tagged list. The startup parameter tagged list begins with the tag ATAG_CORE and ends with the tag ATAG_NONE. Each tag consists of a tag_header structure that identifies the parameter being passed, followed by the data structure holding the parameter values. The data structures tag and tag_header are defined in the include/asm/setup.h header file of the Linux kernel source:
    /* The list ends with an ATAG_NONE node. */
    #define ATAG_NONE 0x00000000

    struct tag_header {
        u32 size;  /* note: size is counted in words */
        u32 tag;
    };
    ......
    struct tag {
        struct tag_header hdr;
        union {
            struct tag_core      core;
            struct tag_mem32     mem;
            struct tag_videotext videotext;
            struct tag_ramdisk   ramdisk;
            struct tag_initrd    initrd;
            struct tag_serialnr  serialnr;
            struct tag_revision  revision;
            struct tag_videolfb  videolfb;
            struct tag_cmdline   cmdline;
            /*
             * Acorn specific
             */
            struct tag_acorn     acorn;
            /*
             * DC21285 specific
             */
            struct tag_memclk    memclk;
        } u;
    };
In embedded Linux systems, the startup parameters that a Boot Loader commonly needs to set are: ATAG_CORE, ATAG_MEM, ATAG_CMDLINE, ATAG_RAMDISK, ATAG_INITRD, and so on. For example, the code to set ATAG_CORE is as follows:
    params = (struct tag *)BOOT_PARAMS;
    params->hdr.tag = ATAG_CORE;
    params->hdr.size = tag_size(tag_core);
    params->u.core.flags = 0;
    params->u.core.pagesize = 0;
    params->u.core.rootdev = 0;
    params = tag_next(params);
Here, BOOT_PARAMS is the base address in memory at which the kernel startup parameters begin, and params is a pointer of type struct tag. The macro tag_next() takes a pointer to the current tag and computes the start address of the tag immediately following it. Note that the device ID of the kernel's root file system is set here (the rootdev field). Below is example code for setting the memory map:
    for (i = 0; i < NUM_MEM_AREAS; i++) {
        if (memory_map[i].used) {
            params->hdr.tag = ATAG_MEM;
            params->hdr.size = tag_size(tag_mem32);
            params->u.mem.start = memory_map[i].start;
            params->u.mem.size = memory_map[i].size;
            params = tag_next(params);
        }
    }
As can be seen, each valid memory segment in the memory_map[] array corresponds to one ATAG_MEM parameter tag. The Linux kernel can receive information in the form of command-line parameters at startup. This can be used to give the kernel hardware parameters that it cannot detect by itself, or to override information that the kernel detects itself.
For example, the command-line parameter string "console=ttyS0,115200n8" tells the kernel to use ttyS0 as the console, with the serial port set to 115200 bps, no parity, and 8 data bits. Below is example code for setting the kernel command-line parameter string:
    char *p;

    /* eat leading white space */
    for (p = commandline; *p == ' '; p++)
        ;

    /* skip non-existent command lines so the kernel will still
     * use its default command line.
     */
    if (*p == '\0')
        return;

    params->hdr.tag = ATAG_CMDLINE;
    params->hdr.size = (sizeof(struct tag_header) + strlen(p) + 1 + 4) >> 2;
    strcpy(params->u.cmdline.cmdline, p);
    params = tag_next(params);
Note that in the code above, the size set in the tag_header must include the string terminator '\0', and the byte count must be rounded up to a multiple of 4 bytes, because the size member of the tag_header structure is a count of words. Below is example code for setting ATAG_INITRD, which tells the kernel where in RAM the initrd image (in compressed format) can be found and how large it is:
    params->hdr.tag = ATAG_INITRD2;
    params->hdr.size = tag_size(tag_initrd);
    params->u.initrd.start = RAMDISK_RAM_BASE;
    params->u.initrd.size = INITRD_LEN;
    params = tag_next(params);
Below is example code for setting ATAG_RAMDISK, which tells the kernel how large the ramdisk is after decompression (in KB):
    params->hdr.tag = ATAG_RAMDISK;
    params->hdr.size = tag_size(tag_ramdisk);
    params->u.ramdisk.start = 0;
    params->u.ramdisk.size = RAMDISK_SIZE;  /* note: the unit is KB */
    params->u.ramdisk.flags = 1;            /* automatically load ramdisk */
    params = tag_next(params);
Finally, set the ATAG_NONE tag to end the entire startup parameter list:
    static void setup_end_tag(void)
    {
        params->hdr.tag = ATAG_NONE;
        params->hdr.size = 0;
    }
3.2.5 Calling the kernel
The Boot Loader calls the Linux kernel by jumping directly to the kernel's first instruction, that is, to the address MEM_START + 0x8000.
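Because the tag list from Section 3.2.4 is what the kernel receives at this jump, it can be worth checking its layout first. The following is a minimal host-side sketch, not the kernel's or blob's code: it treats the list as a plain array of 32-bit words (word 0 of each tag is its size in words, word 1 is the tag identifier). The helper names put_tag, put_end_tag, cmdline_tag_words and taglist_check are hypothetical; the ATAG_* values are the ones defined in the ARM Linux setup.h header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t u32;

#define ATAG_NONE    0x00000000u
#define ATAG_CORE    0x54410001u
#define ATAG_MEM     0x54410002u
#define ATAG_CMDLINE 0x54410009u

#define HDR_WORDS 2u  /* struct tag_header: size word + tag word */

/* Append one tag at word offset pos; returns the offset of the next tag.
 * total_words is the tag size in 32-bit words, header included. */
size_t put_tag(u32 *list, size_t pos, u32 tag, u32 total_words,
               const void *payload, size_t payload_bytes)
{
    assert(total_words >= HDR_WORDS);
    assert(payload_bytes <= (total_words - HDR_WORDS) * sizeof(u32));
    memset(&list[pos], 0, total_words * sizeof(u32));
    list[pos] = total_words;
    list[pos + 1] = tag;
    if (payload_bytes)
        memcpy(&list[pos + HDR_WORDS], payload, payload_bytes);
    return pos + total_words;
}

/* Terminate the list: ATAG_NONE with a size field of 0. */
void put_end_tag(u32 *list, size_t pos)
{
    list[pos] = 0;
    list[pos + 1] = ATAG_NONE;
}

/* Size in words of an ATAG_CMDLINE tag, using the formula from the text. */
u32 cmdline_tag_words(const char *s)
{
    return (u32)((HDR_WORDS * sizeof(u32) + strlen(s) + 1 + 4) >> 2);
}

/* Walk the list. Returns the number of tags before ATAG_NONE, or -1 if the
 * list does not start with ATAG_CORE or runs past max_words. */
int taglist_check(const u32 *list, size_t max_words)
{
    size_t pos = 0;
    int count = 0;

    if (max_words < HDR_WORDS || list[1] != ATAG_CORE)
        return -1;
    while (pos + HDR_WORDS <= max_words) {
        u32 size = list[pos];
        u32 tag = list[pos + 1];

        if (tag == ATAG_NONE)
            return count;
        if (size < HDR_WORDS || size > max_words - pos)
            return -1;
        pos += size;  /* same step as tag_next() */
        count++;
    }
    return -1;
}
```

As a usage example, a list holding ATAG_CORE (5 words), one ATAG_MEM (4 words) and an ATAG_CMDLINE for "console=ttyS0,115200n8" (8 words by the formula above) occupies 17 words before the terminating ATAG_NONE, and taglist_check() reports three tags.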
At the time of the jump, the following conditions must be met:
1. CPU register settings:
· R0 = 0;
· R1 = machine type ID; for the machine type numbers, see linux/arch/arm/tools/mach-types;
· R2 = base address in RAM at which the startup parameter tagged list begins.
2. CPU mode:
· Interrupts (IRQs and FIQs) must be disabled;
· The CPU must be in SVC mode.
3. Cache and MMU settings:
· The MMU must be off;
· The instruction cache may be on or off;
· The data cache must be off.
If C is used, the kernel can be called as in the following example code:
    void (*theKernel)(int zero, int arch, u32 params_addr) = (void (*)(int, int, u32))KERNEL_RAM_BASE;
    ......
    theKernel(0, ARCH_NUMBER, (u32)kernel_params_start);
Note that the call to theKernel() should never return. If it does return, an error has occurred.
4. About the serial terminal
In the design and implementation of a Boot Loader, nothing is more exciting than correctly receiving printed output on the serial terminal. Printing information to the serial terminal is also a very important and effective debugging technique. However, we often run into the problem that the serial terminal shows garbled characters or shows nothing at all. There are two main causes of this problem: (1) the Boot Loader initializes the serial port incorrectly; (2) the terminal emulation program running on the host is configured incorrectly for the serial port, including the baud rate, parity, data bits, and stop bits. In addition, another problem sometimes appears: while the Boot Loader is running, information is output correctly to the serial terminal, but once the Boot Loader starts the kernel, the kernel's startup output cannot be seen. The causes of this problem can be considered from the following aspects:
(1) First confirm that your kernel was compiled with support for a serial terminal and that the correct serial driver was configured.
(2) Your Boot Loader's initialization settings for the serial port may be inconsistent with the kernel's initialization settings for the serial port. Furthermore, on CPUs such as the S3C44B0X, the CPU clock frequency setting also affects the serial port, so if the Boot Loader and the kernel set the CPU clock frequency differently, the serial terminal cannot display information correctly.
(3) Finally, confirm that the kernel base address used by the Boot Loader is the same as the run-time base address the kernel image was compiled with, especially for uClinux. Suppose your kernel image was compiled with the base address 0xC0008000, but your Boot Loader loads it at 0xC0010000 and runs it there: the kernel image then certainly cannot execute correctly.
5. Conclusion
Designing and implementing a Boot Loader is a very complex process. Until the exciting kernel startup message "Uncompressing Linux .................. done, booting the kernel ..." arrives on the serial port, probably no one can say: "Hey, my boot loader is up and running!"
---------------------------------------
This article is reproduced from IBM DW. Author: Zhan Rongkai, whose research interests include embedded Linux, the Linux kernel, drivers, and file systems. You can contact him at ZHANRK@sohu.com.