Inside Embedded Boot Loader Technology (2)
3. Main tasks and typical structure of the Boot Loader

Before continuing, we make one assumption: the kernel image and the root file system image are loaded into RAM and run from there. The assumption has to be stated because, in an embedded system, the kernel image and root file system image can also run directly from a solid-state storage device such as ROM or Flash. That approach, however, undoubtedly sacrifices execution speed.

From the operating system's point of view, the overall goal of the Boot Loader is to call the kernel correctly.

In addition, because the implementation of a Boot Loader depends on the CPU architecture, most Boot Loaders are divided into two parts: Stage1 and Stage2. Code that depends on the CPU architecture, such as device initialization code, is usually placed in Stage1 and is usually written in assembly language to keep it short and efficient. Stage2 is usually written in C, which makes more complex functionality possible and gives the code better readability and portability.

Stage1 of the Boot Loader typically includes the following steps (in order of execution):
• Initialize the hardware devices.
• Prepare RAM space for loading Stage2 of the Boot Loader.
• Copy Stage2 of the Boot Loader into that RAM space.
• Set up the stack.
• Jump to the C entry point of Stage2.

Stage2 of the Boot Loader typically includes the following steps (in order of execution):
• Initialize the hardware devices to be used in this stage.
• Detect the system memory map.
• Read the kernel image and root file system image from Flash into RAM.
• Set the startup parameters for the kernel.
• Call the kernel.

3.1 Stage1

3.1.1 Basic hardware initialization

This is the first operation the Boot Loader performs. Its purpose is to prepare a basic hardware environment for the execution of Stage2 and, later, of the kernel.
It typically includes the following steps (in order of execution):
1. Mask all interrupts. Servicing interrupts is usually the responsibility of the OS device drivers, so the Boot Loader does not need to respond to any interrupt during its entire execution. Interrupts can be masked by writing to the CPU's interrupt mask register or status register (such as the CPSR register on ARM).
2. Set the CPU speed and clock frequency.
3. Initialize RAM. This includes correctly setting the function registers of the system's memory controller and the individual memory bank control registers.
4. Initialize the LEDs. Typically the LEDs are driven through GPIO and indicate whether the system state is OK or Error. If the board has no LEDs, the same purpose can be served by initializing the UART and printing the Boot Loader's logo string to the serial port.
5. Turn off the CPU's internal instruction/data caches.

3.1.2 Preparing RAM space for loading Stage2

To obtain faster execution, Stage2 is usually loaded into RAM and run from there, so a usable range of RAM must be prepared for loading Stage2 of the Boot Loader. Because Stage2 is usually C code, the space must account for the stack as well as the Stage2 executable image. In addition, the size of the space should preferably be a multiple of the memory page size (usually 4KB). In general, 1MB of RAM is sufficient. The specific address range can be chosen freely. blob, for example, places its Stage2 executable image in the 1MB of system RAM starting at address 0xC0200000.
However, placing Stage2 in the topmost 1MB of the entire RAM space, that is, from (RamEnd - 1MB) to RamEnd, is a recommended approach.

For convenience in the discussion below, we denote the size of the arranged RAM range as stage2_size (in bytes), and its start and end addresses as stage2_start and stage2_end (both aligned to a 4-byte boundary). Therefore:

    stage2_end = stage2_start + stage2_size

In addition, the arranged address range must really be readable and writable RAM, so the range has to be tested. The test can follow a method similar to blob's: go through the range one memory page at a time and check whether the first two words of each memory page can be read and written. For convenience we call this detection algorithm test_mempage. Its steps are as follows:
1. Save the contents of the first two words of the memory page.
2. Write arbitrary values to these two words, for example 0x55 to the first word and 0xAA to the second.
3. Immediately read the two words back. We should read 0x55 and 0xAA respectively. If not, the address range occupied by this memory page is not valid RAM.
4. Write arbitrary values to these two words again, for example 0xAA to the first word and 0x55 to the second.
5. Immediately read the two words back. We should read 0xAA and 0x55 respectively. If not, the address range occupied by this memory page is not valid RAM.
6. Restore the original contents of the two words. The test is complete.

To obtain a clean RAM range, we can also zero the arranged RAM range.

3.1.3 Copying Stage2 to RAM

Two things must be determined when copying: (1) the start and end addresses of the Stage2 executable image on the solid-state storage device; (2) the start address of the RAM space.
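The page test of section 3.1.2 and the copy of section 3.1.3 can be sketched together in C. This is only an illustration under stated assumptions: real Stage1 code is usually assembly, the 4KB page size is the value mentioned above, and the names MEM_PAGE_SIZE, test_ram_range and copy_stage2 are hypothetical helpers introduced here, not taken from blob.

```c
#include <stddef.h>
#include <stdint.h>

#define MEM_PAGE_SIZE 4096u /* assumed page size (4KB, as in section 3.1.2) */

/* test_mempage: returns 1 if the first two words of the page behave like
 * RAM, 0 otherwise. This sketch restores the saved words in either case. */
int test_mempage(volatile uint32_t *page)
{
    uint32_t save0 = page[0];            /* step 1: save both words */
    uint32_t save1 = page[1];
    int ok = 1;

    page[0] = 0x55;                      /* step 2: first pattern */
    page[1] = 0xAA;
    if (page[0] != 0x55 || page[1] != 0xAA)   /* step 3: read back */
        ok = 0;

    if (ok) {
        page[0] = 0xAA;                  /* step 4: swapped pattern */
        page[1] = 0x55;
        if (page[0] != 0xAA || page[1] != 0x55)   /* step 5: read back */
            ok = 0;
    }

    page[0] = save0;                     /* step 6: restore */
    page[1] = save1;
    return ok;
}

/* Hypothetical helper: test every page in [start, start + size).
 * Returns 1 only if all pages pass. */
int test_ram_range(volatile uint32_t *start, size_t size)
{
    for (size_t off = 0; off < size; off += MEM_PAGE_SIZE) {
        if (!test_mempage(start + off / sizeof(uint32_t)))
            return 0;
    }
    return 1;
}

/* Hypothetical helper: copy the Stage2 image word by word from its
 * location on the storage device [src, src_end) to RAM at dest.
 * All addresses are assumed to be 4-byte aligned. */
void copy_stage2(volatile uint32_t *dest,
                 const uint32_t *src, const uint32_t *src_end)
{
    while (src < src_end)
        *dest++ = *src++;
}
```

On a target, one would call test_ram_range on stage2_start with stage2_size before calling copy_stage2 with the image's start and end addresses on the storage device. The volatile qualifier is there so that the compiler does not optimize away the read-back after each write.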
3.1.4 Setting the stack pointer sp

The stack pointer is set in preparation for executing C code. Usually sp can be set to (stage2_end - 4), that is, the top of the 1MB RAM range arranged in section 3.1.2 (the stack grows downward). In addition, before setting the stack pointer, the LEDs can be turned off to signal to the user that we are about to jump to Stage2. After the steps above, the physical memory layout of the system should be as shown in Figure 2 below.

3.1.5 Jumping to the C entry point of Stage2

Once everything above is in place, we can jump to Stage2 of the Boot Loader. On an ARM system, for example, this can be done by setting the PC register to the appropriate address.

Http://tech.ccidnet.com/pub/attachm...3/12/268047.gif
Figure 2 System memory layout just after the Boot Loader's Stage2 executable image has been copied to RAM

3.2 Boot Loader Stage2

As mentioned earlier, the Stage2 code is usually written in C to allow more complex functionality and to obtain better readability and portability. But unlike ordinary C applications, when compiling and linking a program such as a Boot Loader we cannot use any support functions from the glibc library.
The reason is obvious. This raises a question: from where do we jump into the main() function? Using the start address of main() directly as the entry point of the whole Stage2 executable image is perhaps the most direct idea, but it has two drawbacks: 1) no arguments can be passed to main(); 2) the case where main() returns cannot be handled. A more clever approach is to use the concept of a trampoline. That is, write a small trampoline program in assembly language and use it as the execution entry point of the Stage2 executable image. Inside the trampoline we can then use a CPU jump instruction to enter main(), and when main() returns, the CPU execution path obviously comes back to our trampoline. In short, the idea is to use the trampoline as an external wrapper around main().

A simple trampoline example (from blob) is given below:

    .text
    .globl _trampoline
    _trampoline:
        bl main
        /* if main ever returns we just call it again */
        b _trampoline

As can be seen, when main() returns we use a jump instruction to re-execute the trampoline, which of course re-executes main(). This is what the word trampoline means.

3.2.1 Initializing the hardware devices to be used in this stage

This usually includes: (1) initializing at least one serial port, for I/O with the terminal user; (2) initializing timers, and so on. Before initializing these devices, the LEDs can be turned on again to indicate that we have entered main(). After the devices are initialized, some information can be printed, such as the program name string and version number.

3.2.2 Detecting the system memory map

The memory map refers to which address ranges in the entire 4GB physical address space are allocated for addressing the system's RAM units.
For example, in the SA-1100 CPU, the 512MB address space starting at 0xC0000000 is used as the system's RAM address space, and in the Samsung S3C44B0X CPU, the 64MB address space from 0x0C000000 to 0x10000000 is used as the system's RAM address space. Although the CPU usually reserves a large address space for system RAM, a specific embedded system does not necessarily implement all of it. That is, a specific embedded system often maps only part of the CPU's reserved RAM address space to RAM units and leaves the rest unused. Because of this, Stage2 of the Boot Loader must detect the memory map of the whole system before it does anything else (for example, before reading the kernel image stored in Flash into RAM). In other words, it must know which parts of the CPU's reserved RAM address space are really mapped to RAM units and which are in the "unused" state.
(1) Describing the memory map

The following data structure can be used to describe a contiguous address range in the RAM address space:

    typedef struct memory_area_struct {
        u32 start; /* the base address of the memory region */
        u32 size;  /* the byte number of the memory region */
        int used;
    } memory_area_t;

A contiguous address range in the RAM address space can be in one of two states: (1) used = 1, meaning the range has been implemented, i.e. it is really mapped to RAM units; (2) used = 0, meaning the range is not implemented by the system and is in the unused state.

Based on the memory_area_t data structure above, the entire RAM address space reserved by the CPU can be represented by an array of type memory_area_t, as shown below:

    memory_area_t memory_map[NUM_MEM_AREAS] = {
        [0 ... (NUM_MEM_AREAS - 1)] = {
            .start = 0,
            .size = 0,
            .used = 0
        },
    };

(2) Detecting the memory map

Below we give a simple and effective algorithm that can be used to detect the memory map of the entire RAM address space:

    /* Array initialization */
    for (i = 0; i

3.2.3 Loading the kernel image and root file system image

(1) Planning the memory layout

The layout covers two aspects: (1) the memory range occupied by the kernel image; (2) the memory range occupied by the root file system. When planning the layout, two things matter most: the base address and the size of each image.

The kernel image is generally copied into roughly 1MB of memory starting at base address (MEM_START + 0x8000) (an embedded Linux kernel generally does not exceed 1MB). Why leave the 32KB from MEM_START to MEM_START + 0x8000 empty? Because the Linux kernel places some global data structures in this area, such as the startup parameters and the kernel page tables.

The root file system image is generally copied to MEM_START + 0x00100000. If a ramdisk is used as the root file system image, its size after decompression is generally 1MB.
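The planning rules above depend on knowing where RAM really starts, which is what the memory-map detection of section 3.2.2 provides. The two steps can be sketched together as follows. This is a minimal illustration, not blob's algorithm: it assumes a 4KB page, takes the page test as a function pointer so that it can run on a host, and does not handle address aliasing. The names probe_fn, demo_probe, detect_memory_map, load_plan_t, plan_layout, KERNEL_OFFSET and RAMDISK_OFFSET are hypothetical, and the size checks in plan_layout are this sketch's own addition.

```c
#include <stdint.h>

#define MEM_PAGE_SIZE  0x1000u      /* assumed page size (4KB) */
#define NUM_MEM_AREAS  4            /* hypothetical upper bound on areas */
#define KERNEL_OFFSET  0x00008000u  /* kernel goes to MEM_START + 0x8000 */
#define RAMDISK_OFFSET 0x00100000u  /* root fs goes to MEM_START + 0x00100000 */

typedef struct memory_area_struct {
    uint32_t start; /* base address of the memory region */
    uint32_t size;  /* size of the region in bytes */
    int used;       /* 1 if really mapped to RAM */
} memory_area_t;

/* Page probe: nonzero if the page at addr is RAM. On a target this would
 * wrap test_mempage from section 3.1.2. */
typedef int (*probe_fn)(uint32_t addr);

/* Host-only stand-in for a probe: pretends that 32MB at 0xC0000000 and
 * 16MB at 0xC8000000 are populated. Purely illustrative values. */
int demo_probe(uint32_t addr)
{
    return (addr >= 0xC0000000u && addr < 0xC2000000u) ||
           (addr >= 0xC8000000u && addr < 0xC9000000u);
}

/* Walk [mem_start, mem_end) page by page and merge consecutive RAM pages
 * into areas. Returns the number of areas found (at most max_areas). */
int detect_memory_map(memory_area_t *map, int max_areas,
                      uint32_t mem_start, uint32_t mem_end, probe_fn is_ram)
{
    int count = 0;
    int cur = -1; /* index of the area being extended, -1 if none */

    for (int i = 0; i < max_areas; i++) {
        map[i].start = 0;
        map[i].size = 0;
        map[i].used = 0;
    }
    for (uint32_t addr = mem_start; addr < mem_end; addr += MEM_PAGE_SIZE) {
        if (!is_ram(addr)) {
            cur = -1;           /* a hole ends the current area */
            continue;
        }
        if (cur < 0) {
            if (count == max_areas)
                break;          /* no room to record another area */
            cur = count++;
            map[cur].start = addr;
            map[cur].used = 1;
        }
        map[cur].size += MEM_PAGE_SIZE;
    }
    return count;
}

typedef struct {
    uint32_t kernel_base;
    uint32_t ramdisk_base;
} load_plan_t;

/* Derive load addresses from the first detected area. Returns 1 on
 * success, 0 if the images would overlap or not fit in that area. */
int plan_layout(const memory_area_t *map, int count,
                uint32_t kernel_size, uint32_t ramdisk_size,
                load_plan_t *plan)
{
    if (count < 1 || !map[0].used)
        return 0;
    if (kernel_size > RAMDISK_OFFSET - KERNEL_OFFSET)
        return 0;               /* kernel would run into the root fs image */
    if (map[0].size < RAMDISK_OFFSET ||
        ramdisk_size > map[0].size - RAMDISK_OFFSET)
        return 0;               /* root fs image would not fit */
    plan->kernel_base = map[0].start + KERNEL_OFFSET;
    plan->ramdisk_base = map[0].start + RAMDISK_OFFSET;
    return 1;
}
```

With the demo probe and the SA-1100 window (0xC0000000 to 0xE0000000), MEM_START becomes 0xC0000000, so the kernel would be loaded at 0xC0008000 and the root file system image at 0xC0100000. Note that a kernel placed at offset 0x8000 has 0xF8000 bytes (992KB) before it reaches the root file system image, which is why the sketch rejects larger kernels.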
(2) Copying from Flash

Because embedded CPUs such as ARM usually address Flash and other solid-state storage devices within a unified memory address space, reading data from Flash is no different from reading data from RAM units. A simple loop is enough to copy an image from the Flash device:

    while (count) {
        *dest++ = *src++; /* they are all aligned with word boundary */
        count -= 4;       /* byte number */
    }

3.2.4 Setting the kernel's startup parameters

Once the kernel image and root file system image have been copied into RAM, the Linux kernel is almost ready to start. But before calling the kernel, one more preparation step is needed: setting the Linux kernel's startup parameters.

Kernels from Linux 2.4.x onward expect the startup parameters to be passed in the form of a tagged list. The startup parameter tagged list begins with the ATAG_CORE tag and ends with the ATAG_NONE tag. Each tag consists of a tag_header structure that identifies the parameter being passed, followed by the data structure holding the parameter values. The data structures tag and tag_header are defined in the include/asm/setup.h header file of the Linux kernel source:

    /* The list ends with an ATAG_NONE node. */
    #define ATAG_NONE 0x00000000

    struct tag_header {
        u32 size; /* note: size here is in units of words */
        u32 tag;
    };
    ......
    struct tag {
        struct tag_header hdr;
        union {
            struct tag_core core;
            struct tag_mem32 mem;
            struct tag_videotext videotext;
            struct tag_ramdisk ramdisk;
            struct tag_initrd initrd;
            struct tag_serialnr serialnr;
            struct tag_revision revision;
            struct tag_videolfb videolfb;
            struct tag_cmdline cmdline;

            /* Acorn specific */
            struct tag_acorn acorn;

            /* DC21285 specific */
            struct tag_memclk memclk;
        } u;
    };

In an embedded Linux system, the startup parameters that the Boot Loader commonly needs to set are: ATAG_CORE, ATAG_MEM, ATAG_CMDLINE, ATAG_RAMDISK, ATAG_INITRD, etc.
For example, the code that sets ATAG_CORE is as follows:

    params = (struct tag *)BOOT_PARAMS;

    params->hdr.tag = ATAG_CORE;
    params->hdr.size = tag_size(tag_core);

    params->u.core.flags = 0;
    params->u.core.pagesize = 0;
    params->u.core.rootdev = 0;

    params = tag_next(params);

Here BOOT_PARAMS is the base address of the kernel startup parameters in memory, and params is a pointer of type struct tag. The macro tag_next() takes a pointer to the current tag and computes the start address of the tag that immediately follows it. Note that the device ID of the kernel's root file system is set here (the rootdev field). Below is example code for setting the memory map:

    for (i = 0; i
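To show how the pieces of a tagged list fit together, here is a minimal host-runnable sketch that writes ATAG_CORE, one ATAG_MEM tag per used memory area, and the closing ATAG_NONE into a buffer. It is an illustration under stated assumptions, not the original listing: the structure layouts and the tag_next/tag_size macros follow the ARM Linux setup.h header with only the members needed here, memory_area_t repeats the structure from section 3.2.2, and build_tag_list and its word-count return value are hypothetical.

```c
#include <stdint.h>

typedef uint32_t u32;

#define ATAG_NONE 0x00000000
#define ATAG_CORE 0x54410001
#define ATAG_MEM  0x54410002

struct tag_header { u32 size; u32 tag; };   /* size is in 32-bit words */
struct tag_core   { u32 flags; u32 pagesize; u32 rootdev; };
struct tag_mem32  { u32 size; u32 start; };

struct tag {
    struct tag_header hdr;
    union {
        struct tag_core  core;
        struct tag_mem32 mem;   /* other members omitted in this sketch */
    } u;
};

#define tag_next(t)    ((struct tag *)((u32 *)(t) + (t)->hdr.size))
#define tag_size(type) ((sizeof(struct tag_header) + sizeof(struct type)) >> 2)

typedef struct memory_area_struct {
    u32 start;
    u32 size;
    int used;
} memory_area_t;

/* Hypothetical helper: build the tagged list at boot_params and return
 * the number of 32-bit words written, including the ATAG_NONE header. */
u32 build_tag_list(u32 *boot_params, const memory_area_t *map, int num_areas)
{
    struct tag *params = (struct tag *)boot_params;

    /* The list must start with ATAG_CORE. */
    params->hdr.tag = ATAG_CORE;
    params->hdr.size = tag_size(tag_core);
    params->u.core.flags = 0;
    params->u.core.pagesize = 0;
    params->u.core.rootdev = 0;
    params = tag_next(params);

    /* One ATAG_MEM tag for every area that is really mapped to RAM. */
    for (int i = 0; i < num_areas; i++) {
        if (!map[i].used)
            continue;
        params->hdr.tag = ATAG_MEM;
        params->hdr.size = tag_size(tag_mem32);
        params->u.mem.start = map[i].start;
        params->u.mem.size = map[i].size;
        params = tag_next(params);
    }

    /* The list ends with ATAG_NONE, whose header size is 0. */
    params->hdr.tag = ATAG_NONE;
    params->hdr.size = 0;

    return (u32)((u32 *)params - boot_params) + 2;
}
```

Because hdr.size counts words rather than bytes, tag_next() advances by casting to a u32 pointer before adding the size. On a target, boot_params would be the BOOT_PARAMS address used above. The caller has to make sure the parameter area is large enough, since the sketch does no bounds checking.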