xv6-riscv/notes/chapter3-page-tables

Page tables give each process its own private memory space.
Memory is sliced into small pages (PGSIZE, 4 KiB) so it can be
distributed to many processes without much fragmentation. Page tables also
enable a few xv6 tricks:

- mapping the same memory (a trampoline page) in several address spaces
- guarding kernel and user stacks with an unmapped page

3.1 Paging hardware


RISC-V instructions (both user and kernel) manipulate virtual addresses.
Physical memory (RAM) = physical addresses (phy@ in the notes).
Virtual memory = fake memory addresses (virt@ in the notes).
RISC-V page table hardware maps each virt@ to a phy@.

XV6 = Sv39 RISC-V = only bottom 39 bits for virt@ (top 25 bits are not used).
2^39 virt@ / 2^12 bytes per page = 2^27 pages = 2^27 page table entries (PTEs)
a PTE = 44 bits of physical page number (PPN) + flags
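
The sizes that follow from these numbers (39-bit virt@, 44-bit PPN, 4 KiB pages) can be checked with a few shifts. This is only a sketch: PGSHIFT matches the name xv6 uses, while VA_BITS, PPN_BITS and the helper functions are names made up for these notes.

```c
#include <assert.h>
#include <stdint.h>

#define PGSHIFT  12  // 4 KiB pages: 12 offset bits
#define VA_BITS  39  // Sv39: usable virtual address bits (hypothetical name)
#define PPN_BITS 44  // physical page number bits in a PTE (hypothetical name)

// Number of pages needed to cover an address space of addr_bits bits.
static uint64_t num_pages(unsigned addr_bits) {
  assert(addr_bits >= PGSHIFT && addr_bits < 64);
  return 1ULL << (addr_bits - PGSHIFT);
}

// Width of a physical address: PPN bits plus page offset bits.
static unsigned phys_addr_bits(void) {
  return PPN_BITS + PGSHIFT;
}

// Size of an address space of addr_bits bits, expressed in GiB (2^30 bytes).
static uint64_t size_in_gib(unsigned addr_bits) {
  assert(addr_bits >= 30 && addr_bits < 64);
  return 1ULL << (addr_bits - 30);
}
```

With these helpers, num_pages(VA_BITS) gives the 2^27 PTE count, and size_in_gib() gives 512 GiB for 39 bits and 65536 * 1024 GiB for 56 bits.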

vocabulary, acronyms, et cetera:
- physical/virtual addresses = phy@/virt@
- physical address (what the computer can address) = 2^56 = 65536 TiB
- virtual address (what a single process can address) = 2^39 = 512 GiB
- Page Table Entry = PTE
- Physical Page Number = PPN
- Page Directory = the page table is a 3-level tree of "small page tables" (one 4096-byte page each) called page directories
- Translation Look-aside Buffer = TLB
- Supervisor Address Translation and Protection = satp (it's a register)
- address space = set of valid virtual addresses in a given page table
                = the kernel also has its own address space
- User memory = its address space + physical memory allowed by the page table
- Virtual memory = ideas and techniques associated with managing page tables
                   used to achieve isolation & such
- direct mapping = virt@ == phy@

virt@ = [ 25-bit EXT ; 27-bit index ; 12-bit offset ]
        ↑64          ↑39            ↑12             ↑0
                              index = index of the PTE (which holds the PPN) in the page table
page table = 2^27 entries
page table entry = [ 44-bit PPN ; 10-bit flags ]
                   ↑54          ↑10            ↑0
phy@ = [ 44-bit PPN (indexed by virt@ index) ; 12-bit virt@ offset ]
       ↑56                                   ↑12                   ↑0

virt@ = 39 (usable) bits, phy@ = 56 bits
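
The bit layouts above can be written as shifts and masks. The sketch below uses the simplified "one flat table" view of these notes; INDEX_BITS, PPN_BITS and the function names are made up for illustration and are not xv6 code.

```c
#include <assert.h>
#include <stdint.h>

#define PGSHIFT    12  // 12-bit offset inside a 4 KiB page
#define INDEX_BITS 27  // 27-bit index in the flat view (hypothetical name)
#define PPN_BITS   44  // 44-bit physical page number (hypothetical name)

// Bits 12..38 of the virt@: which PTE to use. The top 25 bits are ignored.
static uint64_t va_index(uint64_t va) {
  return (va >> PGSHIFT) & ((1ULL << INDEX_BITS) - 1);
}

// Bits 0..11 of the virt@: copied unchanged into the phy@.
static uint64_t va_offset(uint64_t va) {
  return va & ((1ULL << PGSHIFT) - 1);
}

// phy@ = [ PPN taken from the PTE ; offset taken from the virt@ ]
static uint64_t make_pa(uint64_t ppn, uint64_t va) {
  assert(ppn < (1ULL << PPN_BITS));
  return (ppn << PGSHIFT) | va_offset(va);
}
```

For example, virt@ 0x12345678 has index 0x12345 and offset 0x678; if the PTE at that index held PPN 0x80000, the phy@ would be 0x80000678.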

Paging hardware translates a virt@ by using the top 27 of its 39 bits as an index
to find a PTE; the PTE's PPN plus the 12-bit offset of the virt@ gives the phy@.

IN REAL LIFE the page table is split into 3 levels of small tables, and the
27-bit index is split into 3*9 bits, one 9-bit index per level.
These 3 parts can be called "directories":
1. a "root" (a 4096-byte page table)
   contains 512 PTEs which contain phy@ for next level directory
2. a "middle" (idem)
3. a "final"
In Sv48, there is an extra page table before "root" which takes bits 39
through 47 of a virt@.
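
The 3*9-bit split can be sketched with macros modeled on the PX/PXSHIFT macros in xv6's riscv.h (reproduced from memory, so treat the exact spelling as an assumption). The px() wrapper is a name made up for these notes; level 2 is the "root", level 0 the "final", and level 3 is only meaningful for the extra Sv48 table.

```c
#include <assert.h>
#include <stdint.h>

#define PGSHIFT 12
#define PXMASK  0x1FF                            // 9 bits per level
#define PXSHIFT(level) (PGSHIFT + 9 * (level))   // 12, 21, 30 (and 39 for Sv48)
#define PX(level, va)  ((((uint64_t)(va)) >> PXSHIFT(level)) & PXMASK)

// Index into the page-table page of the given level for this virt@.
static unsigned px(int level, uint64_t va) {
  assert(level >= 0 && level <= 3);
  return (unsigned)PX(level, va);
}
```

For a virt@ built as (1 << 30) | (2 << 21) | (3 << 12) | 0x45, px(2, va) is 1, px(1, va) is 2 and px(0, va) is 3; the low 12 bits (0x45) stay as the page offset.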

If any of the 3 PTEs needed to translate a virt@ is not present (invalid), the
paging hardware raises a "page-fault exception" (exceptions are explained in chapter 4).
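
To make the 3-level lookup and the fault case concrete, here is a host-side simulation, not xv6's actual walk() code. "Physical memory" is a tiny array of 3 page-table pages (sim_mem), and sim_map()/sim_translate() are hypothetical helpers; the PTE macros are modeled on xv6's riscv.h.

```c
#include <assert.h>
#include <stdint.h>

#define PGSHIFT 12
#define PXMASK  0x1FF
#define PXSHIFT(level) (PGSHIFT + 9 * (level))
#define PX(level, va)  ((((uint64_t)(va)) >> PXSHIFT(level)) & PXMASK)
#define PTE_V (1ULL << 0)
#define PTE_R (1ULL << 1)
#define PA2PTE(pa)  ((((uint64_t)(pa)) >> 12) << 10)
#define PTE2PA(pte) (((pte) >> 10) << 12)

// Simulated physical pages 0, 1, 2 hold the root, middle and final tables.
#define SIM_PAGES 3
static uint64_t sim_mem[SIM_PAGES][512];

// Install one virt@ -> phy@ mapping (the three tables are fixed in this sketch).
static void sim_map(uint64_t va, uint64_t pa) {
  sim_mem[0][PX(2, va)] = PA2PTE(1 << PGSHIFT) | PTE_V;   // root -> middle
  sim_mem[1][PX(1, va)] = PA2PTE(2 << PGSHIFT) | PTE_V;   // middle -> final
  sim_mem[2][PX(0, va)] = PA2PTE(pa) | PTE_V | PTE_R;     // final -> data page
}

// Returns 0 and stores the phy@, or -1 where hardware would raise a page fault.
static int sim_translate(uint64_t root_pa, uint64_t va, uint64_t *pa_out) {
  uint64_t pa = root_pa;
  for (int level = 2; level >= 0; level--) {
    assert((pa >> PGSHIFT) < SIM_PAGES);     // table must live in sim_mem
    uint64_t pte = sim_mem[pa >> PGSHIFT][PX(level, va)];
    if ((pte & PTE_V) == 0)
      return -1;                             // any invalid PTE => page fault
    pa = PTE2PA(pte);                        // next table, or the data page
  }
  *pa_out = pa | (va & 0xFFF);               // append the 12-bit offset
  return 0;
}
```

After sim_map(0x40201000, 0x80005000), translating 0x40201abc from root 0 yields 0x80005abc, while the neighbouring page 0x40202000 has no valid final PTE and returns -1.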

A RISC-V CPU caches page table entries in a Translation Look-aside Buffer
(TLB) to avoid costly loads of PTEs from physical memory.

PTE flags:
- PTE_V valid: is the PTE present or not?
- PTE_U user mode: if not set, the page can only be used in supervisor mode
- PTE_R read: can the instructions read the page?
- PTE_W write: can the instructions write the page?
- PTE_X execute: can the CPU interpret the page's content as instructions and execute them?
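
The flags can be read as a small permission check. The bit positions below follow xv6's riscv.h as far as I remember them; pte_allows() and the access enum are hypothetical names. The sketch deliberately ignores the extra rule about supervisor code touching PTE_U pages.

```c
#include <assert.h>
#include <stdint.h>

#define PTE_V (1ULL << 0)  // valid
#define PTE_R (1ULL << 1)  // readable
#define PTE_W (1ULL << 2)  // writable
#define PTE_X (1ULL << 3)  // executable
#define PTE_U (1ULL << 4)  // accessible from user mode

enum access { ACC_READ, ACC_WRITE, ACC_EXEC };

// Returns 1 if the access would be allowed by this PTE, 0 if it would fault.
static int pte_allows(uint64_t pte, enum access acc, int user_mode) {
  assert(acc == ACC_READ || acc == ACC_WRITE || acc == ACC_EXEC);
  if ((pte & PTE_V) == 0)
    return 0;                       // not present: always a fault
  if (user_mode && (pte & PTE_U) == 0)
    return 0;                       // supervisor-only page
  if (acc == ACC_READ)
    return (pte & PTE_R) != 0;
  if (acc == ACC_WRITE)
    return (pte & PTE_W) != 0;
  return (pte & PTE_X) != 0;        // ACC_EXEC
}
```

A page flagged PTE_V | PTE_R | PTE_U can be read from user mode but not written; a page flagged PTE_V | PTE_R | PTE_W without PTE_U is invisible to user code.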

satp register = where the kernel writes the phy@ of the root page table to be used by the CPU.
Once satp is set, subsequent instructions are translated with the provided page table.
Before satp is set, instructions use phy@ directly.
Each CPU has its own satp, so different CPUs can run different processes,
each with its own page table.
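
What actually goes into satp is not the raw phy@ but a mode field plus the root table's PPN. The sketch below is modeled on the SATP_SV39 and MAKE_SATP macros of xv6's riscv.h (quoted from memory); make_satp(), satp_mode() and satp_root_pa() are hypothetical helpers that only compute values on the host. Writing the register itself needs a privileged CSR instruction, which this sketch does not attempt.

```c
#include <assert.h>
#include <stdint.h>

#define SATP_SV39 (8ULL << 60)   // mode field (bits 60..63) = 8 selects Sv39
#define MAKE_SATP(pagetable) (SATP_SV39 | (((uint64_t)(pagetable)) >> 12))

// Build the satp value for a page-aligned root page table phy@.
static uint64_t make_satp(uint64_t root_pa) {
  assert((root_pa & 0xFFF) == 0);          // root table is one aligned page
  return MAKE_SATP(root_pa);
}

// Top 4 bits: translation mode.
static unsigned satp_mode(uint64_t satp) {
  return (unsigned)(satp >> 60);
}

// Low 44 bits: PPN of the root table; shift back to get its phy@.
static uint64_t satp_root_pa(uint64_t satp) {
  return (satp & ((1ULL << 44) - 1)) << 12;
}
```

For a root table at phy@ 0x80001000, make_satp() gives 0x8000000000080001: mode 8 in the top bits and PPN 0x80001 in the low bits.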

XV6 = 1 page table per process and 1 page table for the kernel
XV6 kernel page table:
- direct mapping for most pages
- no direct mapping for the trampoline page
- no direct mapping for stacks' pages
  => these pages are related to the processes (`kstack` in the `proc` structure)
  => each kstack has an invalid guard page (PTE_V not set) just below it
     (stacks grow downward), so a stack overflow faults instead of corrupting memory
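
The placement of the kernel stacks and their guard pages can be sketched from the MAXVA, TRAMPOLINE and KSTACK macros in xv6's memlayout.h and riscv.h (reproduced from memory, so the exact definitions are an assumption). kstack_va() and kstack_guard_va() are names made up for these notes.

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE     4096ULL
#define MAXVA      (1ULL << (9 + 9 + 9 + 12 - 1))  // one bit less than Sv39 allows
#define TRAMPOLINE (MAXVA - PGSIZE)                // highest page of the address space
#define KSTACK(p)  (TRAMPOLINE - ((p) + 1) * 2 * PGSIZE)

// Lowest virt@ of the kernel stack page of process slot p.
static uint64_t kstack_va(unsigned p) {
  assert(p < 64);  // small bound for the sketch, not an xv6 constant
  return KSTACK((uint64_t)p);
}

// The unmapped page just below that stack: an overflow lands here and faults.
static uint64_t kstack_guard_va(unsigned p) {
  return kstack_va(p) - PGSIZE;
}
```

Each slot takes two pages of virtual address space (one mapped stack page, one unmapped guard), so the guard below stack 0 is the page right above stack 1. None of these virt@ equal the phy@ of the pages behind them, which is why the stacks are not direct-mapped.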