Processes and Threads

In this part, we will cover processes, threads, context switching, pipes, file descriptors, and some of the system calls that manage processes. This part is crucial for implementing OS/161 Assignments 4 and 5 correctly!

Processes and Threads

Learning Materials

Videos:

Context switching

Processes and threads

Process components

Process management

Fork and exec

OS161 man pages for the system calls you are implementing in A4.

Lecture Slides:

Implementing system calls

Processes and threads

Processes and file descriptors

Processes and threads

Process creation: fork() and shared file objects

fork.c, pipes.c, pipes2.c, pipes-exec.c

Processes: exec(), exec.c

Processes: waitpid/exit

Readings:

File descriptors in Unix

Chapter 2 in the xv6 book

[OSTEP] Introduction of the Unix process API (fork/exec, etc.)

Context Switching

terminology:

What does the kernel do on a Trap?

  1. The kernel finds a stack on which to run. (Every real* user-level thread has a corresponding kernel stack.)
  2. The kernel saves state.
  3. The kernel figures out what caused the trap.

Process Switch

  1. Change protection domain (user to kernel).
  2. Switch from using the user-level stack to using a kernel stack.
  3. Save execution state on the kernel stack.
  4. Do kernel stuff.
  5. Kernel thread switch.
  6. Restore user-level execution state.
  7. Change protection domain (kernel to user).

MIPS R3000 Hardware in Trap Handling

When a trap happens, the hardware saves the PC into the exception PC register and sets the PC to the address of the trap handler (a software routine). The hardware also sets the status register and the cause register.

The hardware registers need to be saved on the kernel stack by the trap-handling software; K0 and K1 are the two spare registers that make this possible.

After the hardware has done its job, the trap-handling software does the rest.


MIPS R3000 Software in Trap Handling

  1. Assembly code

Use as few hardware registers as possible and save all the state that you need.

Create a trap frame on the kernel stack: Save away all the registers and start using these registers to execute something else.

How to find the kernel stack to save the trapframe?

Each processor (CPU) has its own kernel stack on which to save the trapframe.

Extract the processor number from the context register into K1 and use it as an index (the base address is in K0).

Add K1 to K0 so that K0 points to the stack for the current processor.
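The lookup above is written in assembly in the real handler. As a rough C model of the same arithmetic, here is a minimal sketch; the table name cpustacks, the helper names, and the shift value CTX_CPU_SHIFT are assumptions made for illustration, not the actual kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CPUS 4
#define CTX_CPU_SHIFT 21  /* assumed bit position of the CPU number in the context register */

/* One kernel stack top per processor. The address of this table plays the role of K0. */
static uintptr_t cpustacks[MAX_CPUS];

/* Record the kernel stack top for a processor (illustrative helper). */
void set_kernel_stack(uint32_t cpu, uintptr_t stack_top)
{
    assert(cpu < MAX_CPUS);
    cpustacks[cpu] = stack_top;
}

/* Model of the lookup: extract the CPU number (K1), index the table (K0 + K1), load the entry. */
uintptr_t kernel_stack_for(uint32_t context_reg)
{
    uint32_t cpu = context_reg >> CTX_CPU_SHIFT;  /* processor number */
    assert(cpu < MAX_CPUS);
    return cpustacks[cpu];                        /* base plus index, then load */
}
```

In assembly the index is scaled by the entry size before the add; the array subscript hides that step here.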

  2. Trap C code

Syscall Details

How did we leave the arguments?

First four in a0-a3; rest on the stack.
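To make the argument convention concrete, here is a minimal sketch of how a dispatcher might fetch argument n. The struct trapframe_sketch, the toy user_mem array, and syscall_arg are hypothetical names; the sketch assumes the MIPS o32 layout in which the fifth argument sits at sp + 16, and a real kernel would use copyin() rather than memcpy() to read user memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a MIPS trapframe: only the fields this example needs. */
struct trapframe_sketch {
    uint32_t v0;              /* system call number on entry */
    uint32_t a0, a1, a2, a3;  /* first four arguments */
    uint32_t sp;              /* user stack pointer (here: an offset into user_mem) */
};

static uint8_t user_mem[256]; /* toy "user memory" holding the user stack */

/* Fetch argument n (0-based) of the system call described by tf. */
uint32_t syscall_arg(const struct trapframe_sketch *tf, unsigned n)
{
    switch (n) {
    case 0: return tf->a0;
    case 1: return tf->a1;
    case 2: return tf->a2;
    case 3: return tf->a3;
    default: {
        /* Assumed o32 layout: argument n lives at sp + 4*n, so the fifth is at sp + 16. */
        uint32_t value;
        uint32_t offset = tf->sp + 4u * n;
        assert(offset + sizeof(value) <= sizeof(user_mem));
        memcpy(&value, &user_mem[offset], sizeof(value));
        return value;
    }
    }
}
```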

How does the kernel deal with the fork system call?

  1. Make sure the forking process is not running at user level.
  2. Save all state from the forking process.
  3. Make a copy of the code, data, and stack.
  4. Copy the trapframe of the original process into the new one.
  5. Make the new process known to the dispatcher.

How does the kernel deal with the exec system call?

Replace the current program image with the code and data of a new program.
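Seen from user level, fork and exec are usually combined as in the following minimal sketch. It uses the standard POSIX calls; the helper name spawn_and_wait is made up for this example, and it assumes /bin/sh exists on the system.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `sh -c command` in a child process and return its exit status, or -1 on failure. */
int spawn_and_wait(const char *command)
{
    assert(command != NULL);

    pid_t pid = fork();            /* the child starts as a copy of the parent */
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace the copied program image with a new program. */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);                /* reached only if exec failed */
    }

    /* Parent: wait for the child and decode its exit status. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, spawn_and_wait("exit 3") returns 3: fork returns twice, exec never returns on success, and waitpid collects the code passed to exit.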

Processes and Threads

Multithreaded process:

User-level threads:

The stack of the additional thread is created on the heap by the user program.
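One way to see this is with the ucontext functions, which let a program run a function on a stack it allocated itself. This is a minimal sketch, assuming a glibc-style system where getcontext, makecontext, and swapcontext are available; run_user_thread and worker are illustrative names.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

static ucontext_t main_ctx, thread_ctx;
static int counter;

/* Body of the user-level thread: do some work, yield, then finish. */
static void worker(void)
{
    counter += 1;
    swapcontext(&thread_ctx, &main_ctx);  /* yield back to the main thread */
    counter += 10;
}                                         /* on return, uc_link (main_ctx) resumes */

/* Create one user-level thread on a heap-allocated stack and run it to completion. */
int run_user_thread(void)
{
    size_t stack_size = 64 * 1024;
    void *stack = malloc(stack_size);     /* the thread's stack lives on the heap */
    assert(stack != NULL);

    counter = 0;
    getcontext(&thread_ctx);
    thread_ctx.uc_stack.ss_sp = stack;
    thread_ctx.uc_stack.ss_size = stack_size;
    thread_ctx.uc_link = &main_ctx;       /* where to go when worker returns */
    makecontext(&thread_ctx, worker, 0);

    swapcontext(&main_ctx, &thread_ctx);  /* run worker until it yields */
    swapcontext(&main_ctx, &thread_ctx);  /* resume worker until it returns */

    free(stack);
    return counter;
}
```

The kernel is not involved in either switch; it still sees a single thread.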

What comprises a process's state?

1. Address space

2. Threads

3. Architectural state

4. File table

5. A kernel thread that gets assigned to a process when the process traps into the kernel

We can think of user processes and kernel, each with multiple threads, as running in their own address spaces.

In practice, however, the user address space often contains kernel mappings, which are only accessible when the processor is in kernel mode.

Pipes

Mechanics:

The pipe file is just a memory buffer.

Redirecting Pipe I/O:
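A minimal sketch of pipe redirection, using only standard POSIX calls: the child's standard output is pointed at the write end of a pipe before exec, and the parent reads from the other end. The helper name capture_stdout is made up for this example, and /bin/sh is assumed to exist.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run `sh -c command` with stdout redirected into a pipe and copy up to
 * cap-1 bytes of its output into buf (NUL-terminated). Returns the byte count or -1. */
ssize_t capture_stdout(const char *command, char *buf, size_t cap)
{
    int fds[2];                            /* fds[0]: read end, fds[1]: write end */
    assert(command != NULL && buf != NULL && cap > 0);
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);                     /* child does not read */
        dup2(fds[1], STDOUT_FILENO);       /* stdout now refers to the pipe */
        close(fds[1]);                     /* the original descriptor is no longer needed */
        execl("/bin/sh", "sh", "-c", command, (char *)NULL);
        _exit(127);
    }

    close(fds[1]);                         /* parent must close its write end to ever see EOF */
    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(fds[0], buf + total, cap - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';

    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

The program started by exec is unaware of the redirection; it simply writes to descriptor 1.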

Process Components

Parts of a process not in the Executable

  • Execution state

Execution state is a snapshot of hardware, so it is hardware-specific

  • Resources
  1. File information

Multiple threads within a process share a file descriptor table.

  2. Scheduling information
  • How much time the process has used

  • its priority

  • resource consumption

This information is provided to the scheduler so that it can decide which process to run.

  • Address space

  • Other

  1. Process ID (PID)
  2. Credentials (what permissions/capabilities a process has)
  3. Signal information

Process Data Structures

Which parts of the process are machine (in)dependent?

Which parts need to be per-thread and which are per-process? (The distinction is not strictly necessary in OS/161, which only considers single-threaded processes.)

Process Control Block (PCB)

PCB encapsulates an entire process, which contains references to other structures:

  • Address space
  • Threads associated with the process
  • Credentials

All the PCBs are gathered together into a process table.
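As a rough illustration, a PCB and a process table could look like the following minimal sketch. All type, field, and function names here are hypothetical and are not the OS/161 definitions; the sketch assumes a fixed-size table indexed by PID, with PIDs starting at 1.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PROCS 64

/* Opaque here: the PCB only holds references to these structures. */
struct addrspace;
struct thread;
struct credentials;

struct pcb_sketch {
    int pid;
    struct addrspace *as;        /* address space */
    struct thread *thread;       /* one thread per process in this sketch */
    struct credentials *cred;    /* permissions/capabilities */
};

static struct pcb_sketch *proc_table[MAX_PROCS];  /* index is the PID */

/* Register p under the lowest free PID; returns the PID, or -1 if the table is full. */
int proc_table_add(struct pcb_sketch *p)
{
    assert(p != NULL);
    for (int pid = 1; pid < MAX_PROCS; pid++) {
        if (proc_table[pid] == NULL) {
            proc_table[pid] = p;
            p->pid = pid;
            return pid;
        }
    }
    return -1;
}

/* Look up a PCB by PID; returns NULL if there is no such process. */
struct pcb_sketch *proc_table_lookup(int pid)
{
    if (pid < 1 || pid >= MAX_PROCS)
        return NULL;
    return proc_table[pid];
}

/* Drop a PID from the table, for example after the parent has collected the exit status. */
void proc_table_remove(int pid)
{
    if (pid >= 1 && pid < MAX_PROCS)
        proc_table[pid] = NULL;
}
```

A real kernel would also need a lock around the table, since several processes can fork or exit concurrently.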

File descriptors in Unix

Kernel three-table data structure for open files

  • Table1: Process table

Each process entry in the process table contains a file descriptor table (a table of open file descriptors).

  • Table2: Open file table

The file descriptors index the entries in the file descriptor table. Each entry contains a pointer to an entry in the open file table (a table that the kernel maintains for all open files).

Each entry in the open file table contains the file status flags (read, write, append, etc.), the current file offset, and a pointer to the entry for this file in the so-called v-node table.

  • Table3: V-node table

The v-node table (or part of it) is stored on the physical device. For now, you can think of its entries as the "real" file contents on a disk, with associated information such as the file's location on disk, size, name, owner, etc.

What if two processes want to open the same physical file independently, i.e. with each process maintaining its own access mode (read, write or append) and its own file offset?

Here the open file table has two independent entries for the same file, each associated to one of the processes.
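The same independence can be observed within a single process by calling open() twice on one file. In this minimal sketch (the function name and the scratch file under /tmp are assumptions for illustration), writing through the first descriptor leaves the offset of the second untouched.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Open one scratch file twice, write through the first descriptor, and
 * return the file offset seen through the second one (or -1 on setup failure). */
long offset_of_second_open(void)
{
    char path[] = "/tmp/open-twice-XXXXXX";
    int fd1 = mkstemp(path);               /* first open file table entry */
    if (fd1 < 0)
        return -1;

    int fd2 = open(path, O_RDONLY);        /* second, independent entry for the same file */
    unlink(path);                          /* the file persists until both descriptors close */
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }
    assert(fd1 != fd2);

    long result = -1;
    if (write(fd1, "abcdef", 6) == 6)      /* advances only fd1's offset */
        result = (long)lseek(fd2, 0, SEEK_CUR);

    close(fd2);
    close(fd1);
    return result;
}
```

The function returns 0: both entries point to the same v-node, but each keeps its own offset and flags.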

What if we need to have in one process two file descriptors opened for the same file?

Duplication of file descriptors:

How do we achieve this?

The dup() and dup2() system calls do this job.

/* duplicate the passed file descriptor, returning the smallest available one. */
int dup(int fildes);
/* specify in what file descriptor you want the copy */
int dup2(int fildes, int fildes2);
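Unlike two separate open() calls, a duplicated descriptor refers to the same open file table entry, so the file offset is shared. A minimal sketch that makes this visible (the function name and the scratch file under /tmp are assumptions for illustration):

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

/* Write through one descriptor and return the offset seen through its
 * duplicate (or -1 on setup failure). */
long offset_seen_through_dup(void)
{
    char path[] = "/tmp/dup-demo-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                          /* the file persists until the last descriptor closes */

    int copy = dup(fd);                    /* new descriptor, same open file entry */
    if (copy < 0) {
        close(fd);
        return -1;
    }
    assert(copy != fd);

    long result = -1;
    if (write(fd, "abcdef", 6) == 6)       /* advances the shared offset */
        result = (long)lseek(copy, 0, SEEK_CUR);

    close(copy);
    close(fd);
    return result;
}
```

The function returns 6, because the write through fd moved the single offset that both descriptors share.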

Why do we need to duplicate file descriptors?

...
if (fork() == 0)
{
    close(STDIN_FILENO);
    dup(datafd); /* datafd duplicated in stdin: 0 is now the smallest free descriptor */
    execlp("sed", "sed", (char *)0);
    perror("sed"); /* reached only if execlp fails */
}

What is the problem with dup()? How do we solve it?

The close-and-duplicate sequence above is not atomic.

...
if (fork() == 0)
{
    /* atomic operation: closes STDIN_FILENO if open and duplicates datafd onto it */
    dup2(datafd, STDIN_FILENO);
    execlp("sed", "sed", (char *)0);
    perror("sed"); /* reached only if execlp fails */
}

A potential failure case:

This special case illustrates the need for the reference count field in the file object structure.
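A minimal sketch of how such a reference count might be maintained: every descriptor that points at a file object holds one reference, and the object is freed only when the last reference is closed. The struct and function names are hypothetical, and locking is omitted for brevity.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_FDS 8

struct file_object_sketch {
    long offset;
    int refcount;              /* number of descriptors pointing at this object */
};

struct fdtable_sketch {
    struct file_object_sketch *slot[MAX_FDS];
};

static int lowest_free(const struct fdtable_sketch *t)
{
    for (int fd = 0; fd < MAX_FDS; fd++)
        if (t->slot[fd] == NULL)
            return fd;
    return -1;
}

/* Create a new file object with one reference; returns its descriptor or -1. */
int fd_open(struct fdtable_sketch *t)
{
    int fd = lowest_free(t);
    if (fd < 0)
        return -1;
    struct file_object_sketch *f = calloc(1, sizeof(*f));
    if (f == NULL)
        return -1;
    f->refcount = 1;
    t->slot[fd] = f;
    return fd;
}

/* Make the lowest free descriptor point at the same object as oldfd. */
int fd_dup(struct fdtable_sketch *t, int oldfd)
{
    if (oldfd < 0 || oldfd >= MAX_FDS || t->slot[oldfd] == NULL)
        return -1;
    int fd = lowest_free(t);
    if (fd < 0)
        return -1;
    t->slot[fd] = t->slot[oldfd];
    t->slot[fd]->refcount++;
    return fd;
}

/* Drop one reference; free the object when none remain.
 * Returns the number of references left, or -1 for a bad descriptor. */
int fd_close(struct fdtable_sketch *t, int fd)
{
    if (fd < 0 || fd >= MAX_FDS || t->slot[fd] == NULL)
        return -1;
    struct file_object_sketch *f = t->slot[fd];
    t->slot[fd] = NULL;
    assert(f->refcount > 0);
    int left = --f->refcount;
    if (left == 0)
        free(f);
    return left;
}
```

Without the count, closing one descriptor would free an object that another descriptor (or another process after fork) is still using.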

fork and exec (process creation)

The OS's job during process creation

A child process starts with a copy of its parent's file descriptor table whose entries refer to the same open file objects as the parent's, which is the basis of pipes.


sys_fork

sys_fork is responsible for:

  1. Copy address space

Use the functions in addrspace.c.

  2. Copy the file table

When a process is forked, file descriptor entries of the new process point to the same file objects as in the parent process.

What kind of race conditions do you have to worry about?

Two processes simultaneously modifying the values of the offset.

Can you hold a spinlock while doing I/O?

No. Spinlocks turn off interrupts, and I/O completion is signaled via an interrupt. If you turn off interrupts and then go do I/O, YOU WILL NEVER GET NOTIFIED!

Can you hold a regular lock? That will work. But do we have to hold the lock for the whole operation?

Not really. We can set up the uio struct with the right offset, release the lock, do the I/O, then acquire the lock again and update the offset if the I/O was successful. We can't update the offset before the I/O completes, because the I/O might fail.
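The protocol above can be sketched as follows. This is a single-threaded model under stated assumptions: lock_sketch is a toy stand-in for a sleep lock, do_io stands in for the blocking device read, and the file is an in-memory string; none of these are the OS/161 definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy lock that only records whether it is held, so the protocol can be checked. */
struct lock_sketch { int held; };

static void lock_acquire(struct lock_sketch *l) { assert(!l->held); l->held = 1; }
static void lock_release(struct lock_sketch *l) { assert(l->held);  l->held = 0; }

struct file_object_sketch {
    struct lock_sketch lock;   /* protects offset */
    size_t offset;
    const char *data;          /* stand-in for the file contents on disk */
    size_t size;
};

/* Stand-in for the blocking I/O; it must never run with the lock held. */
static long do_io(const struct file_object_sketch *f, size_t start, char *buf, size_t len)
{
    assert(!f->lock.held);
    if (start > f->size)
        return -1;
    size_t n = f->size - start;
    if (n > len)
        n = len;
    memcpy(buf, f->data + start, n);
    return (long)n;
}

/* Read up to len bytes at the current offset; returns bytes read or -1. */
long file_read(struct file_object_sketch *f, char *buf, size_t len)
{
    lock_acquire(&f->lock);
    size_t start = f->offset;            /* snapshot; this would go into the uio */
    lock_release(&f->lock);

    long n = do_io(f, start, buf, len);  /* the lock is not held while the I/O runs */

    if (n > 0) {                         /* update the offset only if the I/O succeeded */
        lock_acquire(&f->lock);
        f->offset = start + (size_t)n;
        lock_release(&f->lock);
    }
    return n;
}
```

One consequence of this design is that two callers that snapshot the same offset before either finishes will read the same bytes; whether that is acceptable depends on the semantics you want.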

  3. Copy architectural state (copy and tweak the trapframe)

In the implementation of A5, you will have to make sure the right value is returned to the right process!

  4. Copy kernel thread (thread_fork)

When you fork a new process, you fork a new kernel thread to provide to the child process.

To fork that thread, you need to provide it a function to run. You’ll need to write that function.

  5. Return to user mode

Both the parent and the child return to user mode.

The parent just returns from exception.

The child returns via the usermode function.

How does a thread enter user mode for the first time?

mips_usermode in trap.c


sys_execv

sys_execv is responsible for:

• Replace the old address space with a new one

• Copy the arguments into the new address space

• Return to user mode

The mechanics of exec():

Things to think about before starting exec():

Where are the arguments located when the exec system call is invoked? In whose address space? How do you know their address?

How do we get them into the kernel? How do you figure out their total size?

Where do the arguments need to end up before we return to user mode? Whose address space? Stack or heap?

How do we get them there? How do we organize them there?

Passing arguments (copyin and copyout):

It's important to emphasize that we need to pass not only the arguments, but also the pointers to those arguments.
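One possible layout is sketched below: the strings are packed downward from the top of the new user stack, each padded to a 4-byte boundary, followed by a NULL-terminated array of 32-bit user addresses pointing at them. This is a model under stated assumptions: pack_args, MAX_ARGS, and the byte buffer standing in for the user stack are hypothetical, and a real kernel would write to the user stack with copyout().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_ARGS 16
#define ALIGN4(n) (((n) + 3u) & ~3u)

/* stack models the top `size` bytes of the new user stack; stack_top_va is the
 * user virtual address just past its end. Returns the user address of argv[]
 * (which is also the new stack pointer), or 0 if the arguments do not fit. */
uint32_t pack_args(uint8_t *stack, uint32_t size, uint32_t stack_top_va,
                   int argc, const char *const argv[])
{
    assert(argc >= 0 && argc <= MAX_ARGS);
    assert(size % 4 == 0 && stack_top_va % 4 == 0 && size <= stack_top_va);

    uint32_t ptrs[MAX_ARGS + 1];
    uint32_t off = size;                      /* offset into stack; grows downward */

    for (int i = 0; i < argc; i++) {
        uint32_t len = (uint32_t)strlen(argv[i]) + 1;   /* include the NUL */
        uint32_t padded = ALIGN4(len);
        if (padded > off)
            return 0;
        off -= padded;
        memset(stack + off, 0, padded);
        memcpy(stack + off, argv[i], len);
        ptrs[i] = stack_top_va - size + off;  /* user address of this string */
    }
    ptrs[argc] = 0;                           /* argv[argc] must be NULL */

    uint32_t table = 4u * (uint32_t)(argc + 1);
    if (table > off)
        return 0;
    off -= table;
    memcpy(stack + off, ptrs, table);         /* the pointers go below the strings */
    return stack_top_va - size + off;
}
```

The pointers stored in the array must be addresses in the new user address space, not kernel addresses of the temporary copies.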

What if the arguments array is huge and you cannot allocate enough kernel memory contiguously?


Processes and Threads
http://oooscar8.github.io/2024/10/24/Processes-and-Threads/
Author: Alex Sun
Published: October 24, 2024