Introduction to Operating Systems

In this part, we will introduce you to the world of Operating Systems! We will first introduce the 3 major parts of Operating Systems: Virtualization, Concurrency and Persistence, which we will go into the details in later days. Then, we will introduce some basic Operating System interfaces and concepts. At last, we will talk about the OS boot sequence, especially how the OS boot in OS/161.

Introduction to Operating Systems

Learning Materials

Readings:

OSTEP book Chapter 2

xv6 book Chapter 0

xv6 book Chapter 3

Address Spaces

Address Translation

How does an OS boot?

LAMEbus documentation

xv6 book Chapter 1

Short section on booting in The Little Book About OS Development

Lecture Slides:

Introduction

Bootstrapping. What does it mean to be the kernel?

3 Major Parts of Operating Systems

In this part, we will briefly introduce the 3 major parts of Operating Systems: Virtualization, Concurrency and Persistence.

Virtualization

The crux of the problem:

How does OS virtualize resources?

  • Virtualizing the CPU

OS creates an illusion that the system has a very large number of virtual CPUs.

Each running program is represented in the kernel as a process.

Each process has its own state: Program counter, registers, memory, open files, …

A process runs for a while…Then the system is interrupted by a timer interrupt.

The kernel performs a context switch:

Saves the state of the old process to memory

Loads the state of the new process onto the CPU

Let the new process run

After a while, there is another interrupt, and the kernel switches back to the old process.

  • Virtualizing Memory
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
1  #include <unistd.h>
2 #include <stdio.h>
3 #include <stdlib.h>
4 #include "common.h"
5
6 int
7 main(int argc, char *argv[])
8 {
9 int *p = malloc(sizeof(int)); // a1
10 assert(p != NULL);
11 printf("(%d) address pointed to by p: %p\n",
12 getpid(), p); // a2
13 *p = 0; // a3
14 while (1) {
15 Spin(1);
16 *p = *p + 1;
17 printf("(%d) p: %d\n", getpid(), *p); // a4
18 }
19 return 0;
20 }

output:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
prompt> ./mem & ./mem &
[1] 24113
[2] 24114
(24113) address pointed to by p: 0x200000
(24114) address pointed to by p: 0x200000
(24113) p: 1
(24114) p: 1
(24114) p: 2
(24113) p: 2
(24113) p: 3
(24114) p: 3
(24113) p: 4
(24114) p: 4
...

Each process accesses its own address space, which the OS maps onto the physical memory of the machine. We will talk more about address spaces and how virtual memory is translated into physical memory later.

Virtual memory may be much larger than physical memory.


Concurrency

A Multi-threaded Program:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
1  #include <stdio.h>
2 #include <stdlib.h>
3 #include "common.h"
4 #include "common_threads.h"
5
6 volatile int counter = 0;
7 int loops;
8
9 void *worker(void *arg) {
10 int i;
11 for (i = 0; i < loops; i++) {
12 counter++;
13 }
14 return NULL;
15 }
16
17 int main(int argc, char *argv[]) {
18 if (argc != 2) {
19 fprintf(stderr, "usage: threads <value>\n");
20 exit(1);
21 }
22 loops = atoi(argv[1]);
23 pthread_t p1, p2;
24 printf("Initial value : %d\n", counter);
25
26 Pthread_create(&p1, NULL, worker, NULL);
27 Pthread_create(&p2, NULL, worker, NULL);
28 Pthread_join(p1, NULL);
29 Pthread_join(p2, NULL);
30 printf("Final value : %d\n", counter);
31 return 0;
32 }

output:

1
2
3
4
prompt> gcc -o threads threads.c -Wall -pthread
prompt> ./threads 1000
Initial value : 0
Final value : 2000

Each thread increments counter 1000 times

1
2
3
4
5
6
prompt> ./threads 100000
Initial value : 0
Final value : 143012 // huh??
prompt> ./threads 100000
Initial value : 0
Final value : 137298 // what the??

What?

Incrementing counter takes 3 instructions:

One to load the value of the counter from memory into a register, one to increment it, and one to store it back into memory.

These three instructions do not execute atomically.

So it leads to the crux of the problem:

How to build correct concurrent problems?


Persistence

The crux of the problem:

How to store data persistently

The file system is the part of the OS in charge of managing persistent data


Design Goals

Finding the right set of trade-offs is a key to building systems.

Our goals:

  • build up some abstractions

  • provide high performance

provide virtualization and other OS features without excessive overheads (extra time, extra space).

  • provide protection between applications

main principles of OS: isolation

  • reliability

  • other: energy-efficiency, security, mobility

Operating System Interfaces

user space and kernel space

A process alternates between executing in user space and kernel space.


shell

An ordinary user program that reads commands from the user and executes them

The primary user interface to traditional Unix-like systems


Processes and memory

fork()

wait()

1
2
3
4
5
6
7
8
9
10
11
int pid = fork();
if(pid > 0){
printf("parent: child=%d\n", pid);
pid = wait();
printf("child %d is done\n", pid);
} else if(pid == 0){
printf("child: exiting\n");
exit();
} else {
printf("fork error\n");
}

exec(filename, *argv)

1
2
3
4
5
6
char *argv[3];
argv[0] = "echo";
argv[1] = "hello";
argv[2] = 0;
exec("/bin/echo", argv);
printf("exec error\n");

I/O and File descriptors

The file descriptor interface abstracts away the differences between files, pipes, and devices, making them all look like streams of bytes

  • I/O redirection
1
2
3
4
5
6
7
8
char *argv[2];
argv[0] = "cat";
argv[1] = 0;
if(fork() == 0) {
close(0);
open("input.txt", O_RDONLY);
exec("cat", argv);
}

Now it is clear why it is a good idea that fork and exec are separate calls. This separation allows the shell to fix up (e.g. I/O redirection) the child process before the child runs the intended program (exec).

  • fork system call

Although fork copies the file descriptor table, each underlying file offset is shared between parent and child.

1
2
3
4
5
6
7
if(fork() == 0) {
write(1, "hello ", 6);
exit();
} else {
wait();
write(1, "world\n", 6);
}

output: "hello world"

  • dup system call
1
2
3
fd = dup(1);
write(1, "hello ", 6);
write(fd, "world\n", 6);

output: "hello world"


Pipes

A small kernel buffer exposed to processes as a pair of file descriptors, one for reading and one for writing

The pipe file is just a memory buffer.

The mechanics of pipe is that it creates a special "pipe" file in memory. Parent process's two file descriptors refer to its vnode: one for reading (pfd[0]), one for writing (pfd[1]). Then the parent process forks a child process, which shares file handles with the parent and can read/write the same pipe. The parent and the child each close one side of the pipe(read/write) and can communicate with each other.

  • Redirecting Pipe I/O:

As shown below, parent redirects output end of the pipe (pfd[1]) to STDOUT via the dup2 system call. In this way, whatever the parent is writing to STDOUT can be read by the child from the input end of the pipe (pfd[0]).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
int p[2];
char *argv[2];
argv[0] = "wc";
argv[1] = 0;
pipe(p);
if(fork() == 0) {
close(0);
dup(p[0]); //duplicate p[0] to 0
close(p[0]);
close(p[1]);
exec("/bin/wc", argv);
} else {
write(p[1], "hello world\n", 12);
close(p[0]);
close(p[1]);
}

The program redirect child process's stdin to the read end of the pipe

p[0] is the read end of the pipe

p[1] is the write end of the pipe


File System

  • chdir system call

They open the same file:

1
2
3
4
5
chdir("/a");
chdir("b");
open("c", O_RDONLY);

open("/a/b/c", O_RDONLY);
  • O_CREATE flag and mknod system call

open with the O_CREATE flag creates a new data file

mknod creates a new device file

1
2
3
4
mkdir("/dir");
fd = open("/dir/file", O_CREATE|O_WRONLY);
close(fd);
mknod("/console", 1, 1);
  • fstat

fstat retrieves information about the object a file descriptor refers to.

1
2
3
4
5
6
7
8
9
10
#define T_DIR 1 // Directory
#define T_FILE 2 // File
#define T_DEV 3 // Device
struct stat {
short type; // Type of file
int dev; // File system’s disk device
uint ino; // Inode number
short nlink; // Number of links to file
uint size; // Size of file in bytes
};
  • link system call

The same underlying file, called an inode, can have multiple names, called links,

Create a new file named both a and b:

1
2
open("a", O_CREATE|O_WRONLY);
link("a", "b");

Each inode is identified by a unique inode number

  • unlink system call

Removes a name from the file system

1
unlink("a");
1
2
fd = open("/tmp/xyz", O_CREATE|O_RDWR);
unlink("/tmp/xyz");

Traps, Interrupts, and Drivers (in x86)

Systems Calls, Exceptions, and Interrupts

  • interrupt

An interrupts stops the normal processor loop and starts executing a new sequence called an interrupt handler

Processor should switch from user mode to kernel mode, and back

  • Difference between traps and interrupts

traps are caused by the current process running on a processor (e.g. system call)

interrupts are caused by devices and may not be related to the currently running process


x86 protection

To make a system call on the x86, a program invokes the int n instruction

n specifies the index into the IDT (interrupt descriptor table), representing different types of interrupt


Code: Assembly trap handlers

The user mode processor registers are saved in the trapframe on the kernel stack


Code: C trap handler

Each handler sets up a trap frame and then calls the C function trap

If the trap is T_SYSCALL, trap calls the system call handler syscall

After checking for a system call, trap looks for hardware interrupts

If the trap is not a system call and not a hardware device looking for attention, trap assumes it was caused by incorrect behavior as part of the code that was executing before the trap

If the code was a user program, clean it up

If it was kernel running, there must be a kernel bug: trap prints details about the surprise and then calls panic.


Code: System Calls

trap invoke syscall

syscall loads the system call number from the trap frame and indexes into the system call table


Code: Interrupts

Can occur at any time

Signal the CPU when a device needs attention


Drivers

A driver manage a particular device: it provides interrupt handlers for a device, causes a device to perform operations, causes a device to generate interrupts, …


Code: Disk driver

  • Disk sectors

A numbered sequence of 512-byte blocks

The disk driver represent disk sectors with a data structure called a buffer

Each buffer represents the contents of one sector on a particular disk device

1
2
3
4
5
6
7
8
9
3500 struct buf {
3501 int flags; //track the relationship between memory and disk
3502 uint dev;
3503 uint sector;
3504 struct buf *prev; // LRU cache list
3505 struct buf *next;
3506 struct buf *qnext; // disk queue
3507 uchar data[512];
3508 };

Address Space and Address Translation

Address Spaces

  • Time sharing

Where the programs actually lives:

  • Address Space

Layout of an user address space:

Layout of a virtual address space:


Address Translation

  • Dynamic (Hardware-based) Relocation

When a program starts running, the OS decides where in physical memory it should be loaded and sets the base register to that value.

physical address = virtual address + base

base is the process's physical starting address (first instruction)

bounds holds the size of the process's address space to help with protection

  • OS Issues

what OS does at boot time (After the kernel is loaded into memory by bootloader):

what happens when a process (Process A) starts running:

Address Translation is done by hardware only, except for exception

when starting a new process A,

allocate entry in process table

allocate memory for process (looking for free memory via free list)

set base/bound registers

return from trap into A

when a context switch occurs,

save regs(A) to proc-struct(A) (including base/bounds, PC…);

restore regs(B) from proc-struct(B) (including base/bounds, PC…)

return from trap into B

OS Booting

General booting sequence

Booting an operating system consists of transferring control along a chain of small programs

An example of the boot process. Each box is a program:

  • BIOS

Today, BIOS mainly runs some early diagnostics (power-on-self-test) and then transfers control to the bootloader.

  • The Bootloader

task: transfer control to the operating system

Using GRUB, the OS can be built as an ordinary ELF executable, which will be loaded by GRUB into the correct memory location.

  • Kernel ELF Information

Entry point address: where CPU executes the first instruction after the kernel is loaded into memory by the bootloader.

In OS161:

1
2
3
4
5
root@xxxxxxxxxxxx:~/os161/root# os161-readelf -h kernel
ELF Header:
...
Entry point address: 0x8002d6f0 //the first instruction to run
...

How does the OS boot in OS161?

Q: Where does the processor expect to find the very first instruction to run, after it is powered on?

We first need to know the design of MIPS processors.

  • MIPS processors

Physical Memory MIPS assumes:

Virtual Memory MIPS assumes:

Boot ROM area is where the bootloader lives. So 0xbfc00000 is where the very first instruction to run. Namely, after the processor is powered on, the bootloader runs first.

kernel area:

Draw a diagram showing how kseg0 and kseg1 map to the same physical memory area:

Q: Based on that, can you tell what is the maximum amount of virtual memory that a user program can use?

2G

Q: What is an exception address and why is it critical for the boot sequence? Where does the hardware expect to find exception code AFTER we boot (boot flag not set)?

Exception Address refers to a specific memory location where the processor jumps to when an exception or interrupt occurs.

After we boot up, boot flag is no longer set, we find exception code in kernel area(0x80000000 and 0x80000080)

Look at os161 source code:

Begin with main.c -- it is located in ~/os161/src/kern/main/main.c.

Look at the boot() function. In particular, trace all the way through ram_bootstrap(). Find the lowest-level function that reads the size of the RAM from the hardware:

1
2
3
4
5
6
7
8
9
10
11
12
void *
lamebus_map_area(struct lamebus_softc *bus, int slot, uint32_t offset)
{
uint32_t address;

(void)bus; // not needed

KASSERT(slot >= 0 && slot < LB_NSLOTS);

address = LB_BASEADDR + slot*LB_SLOT_SIZE + offset;
return (void *)address;
}

Take a look at ~/os161/src/kern/arch/sys161/main/start.S. Since we are running on the simulator, the early boot code is located not in the boot ROM, but in sys161 -- the hardware simulator. The code in start.S is the very first code that runs on the system! It sets up the operating system to run and gives you the idea of what the actual boot ROM code looks like.

Note that the code in start.S is executed by the hardware simulator and it serves the same purpose as the actual bootloader.

Q: How does kmain() get called?

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
/*
* Now, copy the exception handler code onto the first page of memory.
*/

li a0, EXADDR_UTLB
la a1, mips_utlb_handler
la a2, mips_utlb_end
sub a2, a2, a1
jal memmove
nop

li a0, EXADDR_GENERAL
la a1, mips_general_handler
la a2, mips_general_end
sub a2, a2, a1
jal memmove
nop
1
2
3
4
5
6
/*
* We're all set up!
* Fetch the copy of the bootstring as the argument, and call main.
*/
jal kmain
move a0, s0 /* in delay slot */

To sum up, after the MIPS processor is powered on, it starts executing the code in start.S, which is executed by the sys161 hardware simulator to serve the purpose of the bootloader. The code in start.S, or the bootloader, initialize hardware and load the kernel into memory (including copy the exception handlers to specific memory locations), then transfer controls to the kernel entry point (kmain).


Introduction to Operating Systems
http://oooscar8.github.io/2024/09/12/Intro/
作者
Alex Sun
发布于
2024年9月12日
许可协议