xv6-riscv/notes/chapter3-page-tables

Page tables give each process its own private memory space.
Memory is sliced into small pages (PGSIZE, 4 KiB) so it can be
distributed to many processes without much fragmentation. Page tables
also enable a few xv6 tricks:
- mapping the same memory (a trampoline page) in several address spaces
- guarding kernel and user stacks with an unmapped page
3.1 Paging hardware
RISC-V instructions (both user and kernel) manipulate virtual addresses.
Physical memory (RAM) = physical addresses (phy@ in the notes).
Virtual memory = fake memory addresses (virt@ in the notes).
RISC-V page table hardware maps each virt@ to a phy@.
XV6 = Sv39 RISC-V = only bottom 39 bits for virt@ (top 25 bits are not used).
2^39 addresses / 2^12 bytes per page = 2^27 page table entries (PTEs)
a PTE = 44 bits of physical page number (PPN) + flags
vocabulary, acronyms, et cetera:
- physical/virtual addresses = phy@/virt@
- physical address (what the computer can address) = 2^56 = 65536 TiB
- virtual address (what a single process can address) = 2^39 = 512 GiB
- Page Table Entry = PTE
- Physical Page Number = PPN
- Page Directory = the page table is a 3-level tree of "small page tables" called page directories
- Translation Look-aside Buffer = TLB
- Supervisor Address Translation and Protection = satp (it's a register)
- address space = set of valid virtual addresses in a given page table
= the kernel also has its own address space
- User memory = its address space + physical memory allowed by the page table
- Virtual memory = ideas and techniques associated with managing page tables
used to achieve isolation & such
- direct mapping = virt@ == phy@
- KVM/UVM = Kernel/User virtual memory
- a page directory = 512 PTEs (8 bytes each) = fits exactly in 1 memory page
virt@ = [ 25-bit EXT ; 27-bit index ; 12-bit offset ]
↑64 ↑39 ↑12 ↑0
index = index to the PPN in the page table
page table = 2^27 entries
page table entry = [ 44-bit PPN ; 10-bit flags ]
↑54 ↑10 ↑0
phy@ = [ 44-bit PPN (indexed by virt@ index) ; 12-bit virt@ offset ]
↑56 ↑12 ↑0
virt@ = 39 (usable) bits, phy@ = 56 bits
The paging hardware translates a virt@ by using the top 27 of its 39 bits to find a PTE.
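A minimal C sketch of this simplified single-table view: how a virt@ splits into a 27-bit index and a 12-bit offset, and how the PPN from a PTE is joined with the offset to form the phy@. The helper names (va_offset, va_index, pte_ppn, make_pa) are made up for these notes and are not xv6 functions.

```c
#include <assert.h>
#include <stdint.h>

#define PGSHIFT 12                    /* 4 KiB pages: 12-bit offset */
#define PGSIZE  (1ULL << PGSHIFT)
#define IDXBITS 27                    /* 39 usable bits - 12 offset bits */

/* 12-bit offset inside the page (bits 0..11 of the virt@). */
static uint64_t va_offset(uint64_t va) { return va & (PGSIZE - 1); }

/* 27-bit index selecting the PTE (bits 12..38 of the virt@). */
static uint64_t va_index(uint64_t va)
{
  return (va >> PGSHIFT) & ((1ULL << IDXBITS) - 1);
}

/* 44-bit PPN stored in bits 10..53 of a PTE (bits 0..9 are flags). */
static uint64_t pte_ppn(uint64_t pte)
{
  return (pte >> 10) & ((1ULL << 44) - 1);
}

/* phy@ = PPN taken from the PTE, followed by the unchanged offset. */
static uint64_t make_pa(uint64_t pte, uint64_t va)
{
  assert(va < (1ULL << 39));          /* only 39 bits are usable in Sv39 */
  return (pte_ppn(pte) << PGSHIFT) | va_offset(va);
}
```

For example, virt@ 0x12345678 has index 0x12345 and offset 0x678; with a PTE whose PPN is 0x80000 the resulting phy@ is 0x80000678.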
IN REAL LIFE the page table is split into 3 small tables, and the 27-bit
index is split into 3*9 bits used as indexes into these 3 tables.
These 3 parts can be called "directories":
1. a "root" (a 4096-byte page table)
contains 512 PTEs, each holding the phy@ of a next-level directory
2. a "middle" (idem)
3. a "final" (idem, but its PTEs hold the phy@ of the mapped pages)
In Sv48, there is an extra page table before "root" which takes bits 39
through 47 of a virt@.
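The 3*9-bit split can be sketched in C as below. It is modeled on the PX macro from xv6's riscv.h, but written here as a standalone function (px is a name chosen for these notes). Level 2 is the "root", level 0 the "final" directory; level 3 is only meaningful for Sv48.

```c
#include <assert.h>
#include <stdint.h>

#define PGSHIFT 12      /* skip the 12-bit page offset */
#define PXMASK  0x1FF   /* 9 bits = 512 entries per directory */

/* Index into the directory at `level` for virtual address `va`.
 * Sv39: level 2 = root, 1 = middle, 0 = final.
 * Sv48 would add level 3 (bits 39..47). */
static unsigned px(int level, uint64_t va)
{
  assert(level >= 0 && level <= 3);
  return (unsigned)((va >> (PGSHIFT + 9 * level)) & PXMASK);
}
```

Each level simply shifts 9 more bits to the right, which is why adding a fourth level for Sv48 needs no other change to the indexing.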
If any of the 3 PTEs needed to translate a virt@ is invalid, the paging
hardware raises a "page-fault exception" (exceptions are explained in chapter 4).
A RISC-V CPU caches page table entries in a Translation Look-aside Buffer
(TLB) to avoid costly loads of PTEs from physical memory.
PTE flags:
- PTE_V valid (is the PTE present or not)
- PTE_U user mode (if not set, the page can only be used in supervisor mode)
- PTE_R read: can the instructions read the page?
- PTE_W write: can the instructions write the page?
- PTE_X execute: can the CPU interpret the page's content as instructions and execute them?
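A small C sketch of how these flags combine into a permission check. The bit positions are those of the RISC-V PTE format (V is bit 0, R bit 1, W bit 2, X bit 3, U bit 4). The function pte_allows is hypothetical and deliberately simplified: it only models what is listed above and ignores the rules for supervisor accesses to PTE_U pages (sstatus.SUM).

```c
#include <assert.h>
#include <stdint.h>

#define PTE_V (1ULL << 0)   /* valid */
#define PTE_R (1ULL << 1)   /* readable */
#define PTE_W (1ULL << 2)   /* writable */
#define PTE_X (1ULL << 3)   /* executable */
#define PTE_U (1ULL << 4)   /* accessible from user mode */

/* Returns 1 if an access requiring the `need` bits (any of R/W/X) is
 * permitted by `pte`, 0 otherwise. `user` is nonzero for user mode.
 * Simplification: supervisor access to PTE_U pages is not modeled. */
static int pte_allows(uint64_t pte, uint64_t need, int user)
{
  assert((need & ~(PTE_R | PTE_W | PTE_X)) == 0);
  if (!(pte & PTE_V))
    return 0;                 /* not present: page fault */
  if (user && !(pte & PTE_U))
    return 0;                 /* supervisor-only page */
  return (pte & need) == need;
}
```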
satp register = where to put the phy@ of the root page table to be used by the CPU.
Once satp is set, subsequent instructions are interpreted with the provided page table.
Before satp is set, instructions use phy@.
Each CPU has its own satp, so different CPUs can run different processes, each with its own page table.
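As an illustration, the value written to satp for Sv39 can be computed as below: the mode field (8 for Sv39) goes in the top 4 bits and the PPN of the root page table in the low 44 bits. This mirrors the idea behind xv6's MAKE_SATP macro, but make_satp and satp_root_pa are names invented for these notes, and the ASID field is left at 0.

```c
#include <assert.h>
#include <stdint.h>

#define SATP_SV39 (8ULL << 60)   /* MODE field = 8 selects Sv39 */

/* Build a satp value from the phy@ of the root page table. */
static uint64_t make_satp(uint64_t root_pa)
{
  assert((root_pa & 0xFFF) == 0);          /* root must be page-aligned */
  return SATP_SV39 | (root_pa >> 12);      /* store the PPN, not the phy@ */
}

/* Recover the phy@ of the root page table from a satp value. */
static uint64_t satp_root_pa(uint64_t satp)
{
  return (satp & ((1ULL << 44) - 1)) << 12;
}
```

On real hardware the value is written with a csrw instruction, and an sfence.vma is needed afterwards so that stale TLB entries are not used.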
XV6 = 1 page table per process and 1 page table for the kernel
XV6 kernel page table:
- direct mapping for most pages
- no direct mapping for the trampoline page
- no direct mapping for stacks' pages
=> these pages are related to the processes (`kstack` in the `proc` structure)
=> each kstack has an invalid guard page (PTE_V not set) below it, so a
stack overflow faults instead of corrupting other memory
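A sketch of where the kernel stacks and their guard pages land in the kernel address space, assuming the layout used by xv6's memlayout.h (MAXVA, TRAMPOLINE and the KSTACK formula) and NPROC = 64. The function names kstack_va and kstack_guard_va are made up for these notes.

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE     4096ULL
#define NPROC      64                           /* assumed process slots */
#define MAXVA      (1ULL << (9 + 9 + 9 + 12 - 1))
#define TRAMPOLINE (MAXVA - PGSIZE)             /* highest mapped page */

/* virt@ of the (single-page) kernel stack of process slot p. */
static uint64_t kstack_va(unsigned p)
{
  assert(p < NPROC);
  return TRAMPOLINE - ((uint64_t)p + 1) * 2 * PGSIZE;
}

/* virt@ of the unmapped guard page just below that stack. The stack
 * grows downward, so an overflow lands here and faults. */
static uint64_t kstack_guard_va(unsigned p)
{
  return kstack_va(p) - PGSIZE;
}
```

Each slot takes two pages of virtual address space (one stack page, one guard page), but only the stack page consumes physical memory since the guard page is never mapped.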
Virtual memory functions:
- walk: find PTE for a virt@
- mappages: install PTEs for new mappings
- kvm* = kernel virtual memory functions (operate on the kernel page table)
- uvm* = same but for a user process's page table
- copyin = copy data from user space to kernel space
- copyout = copy data from kernel space to user space
- copyinstr = copy a null-terminated string from user to kernel
= used for paths given in syscalls for example
- kvminit sets the root kernel page table created by kvmmake
- kvmmake creates a direct-map page table for the kernel
1. create the root kernel page table with a call to `kalloc`
(kalloc provides a pointer to a page, which is the type `pagetable_t`)
2. call kvmmap multiple times to set a few direct-map pages
kvmmap adds mappings to the kernel page table (at boot only); it doesn't flush the TLB or enable paging
mapped stuff:
uart registers, virtio mmio disk interface, PLIC, kernel text and data,
and the trampoline (for trap entry/exit), mapped at the highest virtual address in the kernel address space
3. call proc_mapstacks to allocate and map a kernel stack for each process
Function signatures (for reference):
void kvminit(void); // set the kernel root page table
pagetable_t kvmmake(void); // create the kernel page table
void kvmmap(pagetable_t, uint64 virt@, uint64 phy@, uint64 sz, int perm); // add PTEs to the kernel page table
(this is only a call to mappages + a call to "panic" in case of an error)
int mappages(pagetable_t, uint64 virt@, uint64 size, uint64 phy@, int perm); // create PTEs
pte_t * walk(pagetable_t pagetable, uint64 va, int alloc); // virt@ -> PTE
uint64 walkaddr(pagetable_t pagetable, uint64 va); // virt@ -> phy@
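To make walk and mappages more concrete, here is a user-space simulation in C. It is a sketch under several assumptions, not the xv6 code: page_alloc stands in for kalloc (using C11 aligned_alloc), host pointers play the role of phy@ (so it assumes a 64-bit host whose pointers fit in 56 bits), and translate is a hypothetical helper that, unlike xv6's walkaddr, does not check PTE_U. Pages are never freed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint64_t pte_t;
typedef pte_t *pagetable_t;          /* one page holding 512 PTEs */

#define PGSIZE 4096
#define PTE_V (1ULL << 0)
#define PTE_R (1ULL << 1)
#define PTE_W (1ULL << 2)
#define PX(level, va) (((va) >> (12 + 9 * (level))) & 0x1FF)
#define PA2PTE(pa)  ((((uint64_t)(pa)) >> 12) << 10)
#define PTE2PA(pte) (((pte) >> 10) << 12)

/* Stand-in for kalloc: one zeroed, page-aligned page, or NULL. */
static void *page_alloc(void)
{
  void *p = aligned_alloc(PGSIZE, PGSIZE);
  if (p)
    memset(p, 0, PGSIZE);
  return p;
}

/* Return the final-level PTE for va, creating the intermediate
 * directories when alloc is nonzero. NULL if missing or out of memory. */
static pte_t *walk(pagetable_t pt, uint64_t va, int alloc)
{
  assert(va < (1ULL << 38));         /* xv6's MAXVA */
  for (int level = 2; level > 0; level--) {
    pte_t *pte = &pt[PX(level, va)];
    if (*pte & PTE_V) {
      pt = (pagetable_t)(uintptr_t)PTE2PA(*pte);
    } else {
      if (!alloc || (pt = page_alloc()) == NULL)
        return NULL;
      *pte = PA2PTE((uintptr_t)pt) | PTE_V;
    }
  }
  return &pt[PX(0, va)];
}

/* Install PTEs mapping [va, va+size) to [pa, pa+size). 0 on success. */
static int mappages(pagetable_t pt, uint64_t va, uint64_t size,
                    uint64_t pa, int perm)
{
  assert(va % PGSIZE == 0 && size % PGSIZE == 0 && size > 0);
  for (uint64_t off = 0; off < size; off += PGSIZE) {
    pte_t *pte = walk(pt, va + off, 1);
    if (pte == NULL)
      return -1;
    assert(!(*pte & PTE_V));         /* remapping is treated as a bug */
    *pte = PA2PTE(pa + off) | (uint64_t)perm | PTE_V;
  }
  return 0;
}

/* Hypothetical helper: virt@ -> phy@, or 0 if va is not mapped. */
static uint64_t translate(pagetable_t pt, uint64_t va)
{
  pte_t *pte = walk(pt, va, 0);
  if (pte == NULL || !(*pte & PTE_V))
    return 0;
  return PTE2PA(*pte) | (va & (PGSIZE - 1));
}
```

Mapping two pages at virt@ 0x10000000 to phy@ 0x80200000 and then translating 0x10000123 should give 0x80200123, while an address just past the mapping translates to 0.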