# CS3600 — Systems and Networks

# SPRING 2013

#### Lecture 13: Paging

Prof. Alan Mislove (amislove@ccs.neu.edu)

# Paging

- Physical address space of a process can be noncontiguous; process is allocated physical memory whenever the latter is available
- Divide physical memory into fixed-sized blocks called frames
  - Size is power of 2, between 512 bytes and 16 Mbytes
  - Keep track of all free frames
- Divide logical memory into blocks of same size called pages
- To run a program of size N pages, need to find N free frames and load program
- Set up a page table to translate logical to physical addresses
- Backing store likewise split into pages

CS3600 — Systems and Networks

# **Address Translation Scheme**

- Address generated by CPU is divided into:
  - Page number (p) used as an index into a page table which contains base address of each page in physical memory
  - Page offset (d) combined with base address to define the physical memory address that is sent to the memory unit

| page number | page offset |  |
|-------------|-------------|--|
| p           | d           |  |
| m - n       | n           |  |

• For given logical address space 2<sup>m</sup> and page size 2<sup>n</sup>

# Paging Hardware



#### Paging Model of Logical and Physical Memory



# Paging Example

![](_page_5_Figure_1.jpeg)

## *n*=2 and *m*=4 32-byte memory and 4-byte pages

# Paging (Cont.)

- What is the internal fragmentation of paging?
  - Page size = 2,048 bytes
  - Process size = 72,766 bytes
  - 35 pages + 1,086 bytes
  - Internal fragmentation of 2,048 1,086 = 962 bytes
  - Worst case fragmentation = 1 frame 1 byte
  - Average fragmentation = 1 / 2 frame size
  - So small frame sizes desirable?
  - But each page table entry takes memory to track
  - Page sizes growing over time
    - Solaris supports two page sizes 8 KB and 4 MB
- Process view and physical memory now very different
- By implementation process can only access its own memory

### **Free Frames**

![](_page_7_Figure_1.jpeg)

#### **Before allocation**

### After allocation

# Implementation of Page Table

- Page table is kept in main memory
  - Page-table base register (PTBR) points to the page table
  - Page-table length register (PTLR) indicates size of the page table
- In this scheme every data/instruction access requires two memory accesses
  - One for the page table and one for the data / instruction
- The two memory access problem can be solved by the use of a special fastlookup hardware cache called associative memory or translation lookaside buffers (TLBs)
- Some TLBs store address-space identifiers (ASIDs) in each TLB entry uniquely identifies each process to provide address-space protection for that process
  - Otherwise need to flush at every context switch
- TLBs typically small (64 to 1,024 entries)
  - On a TLB miss, value is loaded into the TLB for faster access next time
    - Replacement policies must be considered
    - Some entries can be wired down for permanent access

### **Associative Memory**

Associative memory – parallel search

| Page # | Frame # |
|--------|---------|
|        |         |
|        |         |
|        |         |
|        |         |

- Address translation (p, d)
  - If p is in associative register, get frame # out
  - Otherwise get frame # from page table in memory

### Paging Hardware With TLB

![](_page_10_Figure_1.jpeg)

# **Effective Access Time**

- Associative Lookup =  $\varepsilon$  time unit
  - Can be < 10% of memory access time
- Hit ratio =  $\alpha$ 
  - Hit ratio percentage of times that a page number is found in the associative registers; ratio related to number of associative registers
- Effective Access Time (EAT) (expr in terms of memory access time)  $EAT = (1 + \varepsilon) \alpha + (2 + \varepsilon)(1 - \alpha)$   $= 2 + \varepsilon - \alpha$
- Consider  $\alpha$  = 80%,  $\epsilon$  = 20ns for TLB search, 100ns for memory access

• EAT = 0.80 x 120 + 0.20 x 220 = 140ns

- Consider slower memory but better hit ratio ->  $\alpha$  = 98%,  $\epsilon$  = 20ns for TLB search, 140ns for memory access
  - EAT = 0.98 x 160 + 0.02 x 300 = 162.8ns

# **Memory Protection**

- Memory protection implemented by associating protection bit with each frame to indicate if read-only or read-write access is allowed
  - Can also add more bits to indicate page execute-only, and so on
- Valid-invalid bit attached to each entry in the page table:
  - "valid" indicates that the associated page is in the process' logical address space, and is thus a legal page
  - "invalid" indicates that the page is not in the process' logical address space
  - Or use PTLR

#### Any violations result in a trap to the kernel

### Valid (v) or Invalid (i) Bit In A Page Table

![](_page_13_Figure_1.jpeg)

# **Shared Pages**

#### Shared code

- One copy of read-only (reentrant) code shared among processes (i.e., text editors, compilers, window systems)
- Similar to multiple threads sharing the same process space
- Also useful for interprocess communication if sharing of readwrite pages is allowed

#### Private code and data

- Each process keeps a separate copy of the code and data
- The pages for the private code and data can appear anywhere in the logical address space

# Shared Pages Example

![](_page_15_Figure_1.jpeg)

# Structure of the Page Table

- Memory structures for paging can get huge using straightforward methods
  - Consider a 32-bit logical address space as on modern computers
  - Page size of 4 KB (2<sup>12</sup>)
  - Page table would have 1 million entries (2<sup>32</sup> / 2<sup>12)</sup>
  - If each entry is 4 bytes -> 4 MB of physical address space / memory for page table alone
    - That amount of memory used to cost a lot
    - Don't want to allocate that contiguously in main memory
- Hierarchical Paging
- Hashed Page Tables
- Inverted Page Tables

CS3600 — Systems and Networks

# **Hierarchical Page Tables**

- Break up the logical address space into multiple page tables
- A simple technique is a two-level page table
- We then page the page table

### **Two-Level Page-Table Scheme**

![](_page_18_Figure_1.jpeg)

# **Two-Level Paging Example**

- A logical address (on 32-bit machine with 1K page size) is divided into:
  - a page number consisting of 22 bits
  - a page offset consisting of 10 bits
- Since the page table is paged, the page number is further divided into:
  - a 12-bit page number
  - a 10-bit page offset
- Thus, a logical address is as follows:

| page number |       | page offset |
|-------------|-------|-------------|
| $p_1$       | $p_2$ | d           |
| 12          | 10    | 10          |

### **Address-Translation Scheme**

![](_page_20_Figure_1.jpeg)

# 64-bit Logical Address Space

- Even two-level paging scheme not sufficient
- If page size is 4 KB (2<sup>12</sup>)
  - Then page table has 2<sup>52</sup> entries
  - If two level scheme, inner page tables could be 2<sup>10</sup> 4-byte entries

| outer<br>page | inner<br>page | page offset |  |
|---------------|---------------|-------------|--|
| $p_1$         | $p_2$         | d           |  |
| 42            | 10            | 12          |  |

- Outer page table has 2<sup>42</sup> entries or 2<sup>44</sup> bytes
- One solution is to add a 2<sup>nd</sup> outer page table
- But in the following example the 2<sup>nd</sup> outer page table is still 2<sup>34</sup> bytes in size
  - And possibly 4 memory access to get to one physical memory location

CS3600 — Systems and Networks

### **Three-level Paging Scheme**

| outer page | inner page | offset |
|------------|------------|--------|
| $p_1$      | $p_2$      | d      |
| 42         | 10         | 12     |

| 2nd outer page | outer page | inner page | offset |
|----------------|------------|------------|--------|
| $p_1$          | $p_2$      | $p_3$      | d      |
| 32             | 10         | 10         | 12     |

# Hashed Page Tables

- Common in address spaces > 32 bits
- The virtual page number is hashed into a page table
  - This page table contains a chain of elements hashing to the same location
- Each element contains (1) the virtual page number (2) the value of the mapped page frame (3) a pointer to the next element
- Virtual page numbers are compared in this chain searching for a match
  - If a match is found, the corresponding physical frame is extracted

### Hashed Page Table

![](_page_24_Figure_1.jpeg)

# Segmentation

- Memory-management scheme that supports user view of memory
- A program is a collection of segments
  - A segment is a logical unit such as:

main program procedure function method object local variables, global variables common block stack symbol table arrays

### User's View of a Program

![](_page_26_Figure_1.jpeg)

### Logical View of Segmentation

![](_page_27_Figure_1.jpeg)

# **Segmentation Architecture**

• Logical address consists of a two tuple:

<segment-number, offset>,

- Segment table maps two-dimensional physical addresses; each table entry has:
  - base contains the starting physical address where the segments reside in memory
  - limit specifies the length of the segment
- Segment-table base register (STBR) points to the segment table's location in memory
- Segment-table length register (STLR) indicates number of segments used by a program;

segment number **s** is legal if **s** < **STLR** 

# Segmentation Architecture (Cont.)

#### Protection

- With each entry in segment table associate:
  - validation bit =  $0 \Rightarrow$  illegal segment
  - read/write/execute privileges
- Protection bits associated with segments; code sharing occurs at segment level
- Since segments vary in length, memory allocation is a dynamic storage-allocation problem
- A segmentation example is shown in the following diagram

### **Segmentation Hardware**

![](_page_30_Figure_1.jpeg)

# **Example of Segmentation**

![](_page_31_Figure_1.jpeg)

CS3600 — Systems and Networks