IT Programme, Theme 1 in term 4:

TTIT61 Processprogrammering och Operativ System

/Concurrent Programming and Operating Systems/


Lab 2: System Calls

Goal

In this assignment you will learn about user programs, system calls and memory layout. The main goal is to gain a clear understanding of mechanisms such as system calls and argument passing in user programs by implementing a set of system calls in Pintos.

Overview

This assignment covers:

  • An introduction to user programs in Pintos.
  • System calls.
  • Layout of the user memory and the kernel memory in Pintos.
  • Input/output management

User Programs

In the previous assignment (Lab 1), your test program resided in Pintos kernel memory and ran in kernel mode, that is, it had the same privileges as the operating system. This is, however, not the normal situation. An operating system must be able to run user programs, each with its own memory. The kernel should execute each user program in "user mode" instead of "kernel mode" in order to protect the kernel from the user.

Starting from this assignment you will be using real user mode programs and run them under Pintos. Here are the features and limitations of Pintos user programs:

  • They should be written in C.
  • Floating point operations cannot be used because Pintos does not save the corresponding information during process switch.
  • Multithreaded processes are not supported, therefore we will use the words thread and process interchangeably (although this might not be the case for other operating systems).
  • Pintos user programs can use only those system calls which you will implement in this and the following lab(s).
  • malloc() is not implemented and will not be implemented within these labs, therefore you cannot use dynamic data structures inside a user program.
  • A user program should be copied to and reside on a simulated disk used by Pintos. How to do it is discussed later.

System Calls

The communication between the user program and the kernel is done by system calls. System calls can be seen as functions called from the user program and performed by the kernel. Computers usually use interrupts to accomplish the switch from user code to system code, and so does the x86 machine.

When the programmer wants to invoke a system call in a user program, he or she calls one of the functions declared in pintos/src/lib/user/syscall.h. Those functions are implemented in pintos/src/lib/user/syscall.c and do nothing except push the function's arguments and the respective system call number onto the user stack and raise an internal (software) interrupt 0x30. The raised interrupt makes the processor temporarily stop the user program, switch from user mode to kernel mode and jump to the interrupt handler syscall_handler defined in the kernel (userprog/syscall.c).

The interrupt handler is the entrance to the kernel. All communication with the kernel via system calls must go through it. The interrupt handler must then determine which system call was made and handle it properly. Look into threads/interrupt.[h|c] and make sure you clearly understand the main structures and main functions of the interrupt handler in Pintos. Pay particular attention to the intr_frame structure and the stack pointer stored in it. You will need to use this stack pointer in order to access the system call arguments and to identify which system call was invoked.
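As a rough illustration of the stack access described above, the following sketch reads 32-bit words relative to a saved stack pointer. The struct demo_frame is a hypothetical stand-in for intr_frame (only the esp field is modelled), and stack_word is an invented helper name. The sketch deliberately skips any validation of the user-supplied pointer, which real kernel code must not do.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the part of intr_frame used here:
   the user stack pointer saved when the interrupt was raised. */
struct demo_frame {
    void *esp;
};

/* Returns the idx-th 32-bit word counted from the stack pointer.
   Word 0 is the system call number, words 1, 2, ... are its arguments.
   No pointer validation is done in this sketch. */
uint32_t stack_word(const struct demo_frame *f, size_t idx)
{
    const uint32_t *sp = (const uint32_t *) f->esp;
    return sp[idx];
}
```

With a simulated stack holding { 9, 1, 42 }, stack_word(&f, 0) yields the call number 9 and stack_word(&f, 2) yields the second argument 42.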

Note that there are two different pairs of files with the same names: pintos/src/lib/user/syscall.[h|c] and pintos/src/userprog/syscall.[h|c]. The first pair is visible from the user side and is merely a wrapper that raises an interrupt, while the second one contains the real implementation of the system calls. Currently, the interrupt handler contains no useful code and forces the calling program to exit.

In this assignment you will write the OS code to implement a number of system calls as well as some simple user programs to test your implementation. You can find names of system calls in Pintos by looking into lib/syscall-nr.h.

Examples of system calls are:

  • create - creates a file.
  • open - opens a file.
  • close - closes a file.
  • read - reads from a file or the console (the keyboard).
  • write - writes to a file or the console (the monitor).
  • halt - halts the processor.
  • exit - Terminates a program and deallocates resources occupied by the program, for example, closes all files opened by the program.

The following are also system calls, which you will continue implementing in Lab 3: Execution, Termination and Synchronization of User Programs, where you will have several processes in memory at the same time.

  • exec - Loads a program into memory and executes it in its own thread or process.
  • wait - Waits for a child process to exit and returns its exit status.

Preparation

System Calls

Get familiar with the code in the userprog/syscall.[h|c], threads/interrupt.[h|c], lib/syscall-nr.h, and src/lib/user/syscall.[h|c] files. The latter files contain some assembler code, which just pushes the arguments and the respective system call number onto the stack and raises the interrupt 0x30. You should get a clear picture of the system call architecture in Pintos after studying those files and reading the documentation.

Preparatory question 0:
What is the idea behind system calls? Why, for example, can the code of the system calls not simply be made available to user processes as a library? (Tip: you may look at an article in Wikipedia for the answer).
Preparatory question 1:
Pintos uses one interrupt (0x30) for all system calls, so there is only one interrupt handler, syscall_handler, which is called for all the different system calls. How is it possible to distinguish in syscall_handler which system call it is? Where are the arguments of a system call stored (if there are any) and how can you access them? (It may be easier to answer this question after preparing the next section about argument passing).

Memory Issues and Argument Passing

Read about Pintos virtual memory layout in the corresponding section of Pintos documentation.

Preparatory question 3:
Where are the user-mode stack and the kernel-mode stack of a process located? How can you address the user-mode stack in the kernel code, particularly in an interrupt handler? What is the reason for having two stacks instead of one?
Preparatory question 4:
When a user program executes a system call like open(), the address of a string containing the file name is provided as an argument. In which memory is this string stored, and how can we access it in the kernel code?
Preparatory question 5:
In some system calls the user passes pointers as arguments, which are supposed to point to some data in user space (for instance, a pointer to a string with a file name in create()). Specify the situations in which accessing the data via such a pointer can lead to problems. What can be done about it? (Tip: read the "Accessing User Memory" section).
Preparatory question 6:
Assume that in the kernel code you have a pointer to data in user space which potentially spans several pages. Reconsider the previous question. Does your solution still work? Improve it, if not. Is the solution efficient? (Tip: You may present and discuss the solution with your lab assistant before doing the lab if you are unsure about it).
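To make the "spans several pages" case concrete, here is a minimal sketch of the page arithmetic involved. It only computes how many pages a byte range touches and says nothing about whether those pages are mapped. DEMO_PAGE_SIZE (4 kB, the x86 page size), page_start and pages_spanned are names made up for this illustration and are not part of Pintos.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u   /* assumed page size: 4 kB */

/* Start address of the page that contains addr. */
uintptr_t page_start(uintptr_t addr)
{
    return addr & ~(uintptr_t) (DEMO_PAGE_SIZE - 1);
}

/* Number of distinct pages touched by the byte range [addr, addr + size).
   A zero-length range touches no page. The sketch assumes that
   addr + size does not wrap around the address space. */
size_t pages_spanned(uintptr_t addr, size_t size)
{
    if (size == 0)
        return 0;
    uintptr_t first = page_start(addr);
    uintptr_t last = page_start(addr + size - 1);
    return (size_t) ((last - first) / DEMO_PAGE_SIZE) + 1;
}
```

For instance, a 4-byte range starting at 0x1ffe touches two pages even though it is far smaller than a page, which is why checking only the first byte of a buffer is not sufficient.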

Making user programs

You already have a number of simple user programs in src/examples. You can compile them by issuing gmake in that directory. Modify src/examples/Makefile whenever you wish to compile your own user programs (You can find instructions inside the makefile). Write all your test programs in the src/examples directory.

Simulated Disk

Before running user programs, you first have to create a simulated disk, format it and copy the user programs to it.

  1. Go to userprog/build and issue the command pintos-mkdisk fs.dsk 2. This will create a file fs.dsk with a 2MB simulated disk in the directory.
  2. Format the disk with the command: pintos --qemu -- -f -q.
  3. Copy Pintos user programs to the simulated disk with the command:
    pintos --qemu -p programname -- -q
    (Remember to copy the compiled programs, not the .c source files.) If you wish to copy a file under a new name, modify the command as follows:
    pintos --qemu -p programname -a newname -- -q
    Most probably you will copy user program from the src/examples directory. Then the command will look like this:
    pintos --qemu -p ../../examples/programname -a programname -- -q
  4. If you need to copy a file from the simulated disk, use the command:
    pintos --qemu -g programname -- -q
    or
    pintos --qemu -g programname -a newname -- -q
    As you see, the only difference is in the switch: -p is used to put files to the disk and -g to get a file from the disk.
  5. If you need to run a user program that has been already copied:
    pintos --qemu -- run programname

File System

The current distribution contains a very simple but complete file system. Get acquainted with the file system interface (which is available only in the kernel code!) in the filesys/filesys.[h|c] files. You do not have to modify that code in this lab, so it is enough to look at the available functions and read their documentation. The same applies to the files filesys/file.[h|c]. It is a good idea to look into the other files in the filesys directory, although it is not strictly necessary to complete this lab.

Read about file system limitations in the corresponding section of Pintos documentation. Although the access to files is not synchronized in the current implementation of the file system, you should not worry about it at this point.

Preparatory question 7:
Why can a user program not just call the functions in filesys directly instead of making system calls?
Preparatory question 8:
All (or almost all) operating systems require that user programs first open a file before using it (reading from or writing to it) and close it when they are done. Explain the motivation behind such a requirement. In other words, why can we not have only read and write system calls in an operating system, without open and close?
When a file is opened, a file id is returned to the user program, which is used to refer to a specific opened file when doing file operations. Explain how to generate file identifiers and map them to the open-file objects (struct file in Pintos) that are created in the kernel.

Assignment in detail

If you try to run user programs at this point, you will get a page fault until you implement argument passing (in the next lab). Here is a makeshift solution if you wish to run a program and do some other things first: go into userprog/process.c, find the setup_stack() function and change the following line:
*esp = PHYS_BASE;
into
*esp = PHYS_BASE - 12;
With this provisional solution you can run any program that does not need to examine its arguments (although its name will be printed as "").

Make sure that you understand the problems which may arise if you access (in the kernel) data stored in the user memory using the pointers provided by the user program as system call arguments.

The assignment is to implement the following system calls:

void halt (void)
Shuts down the whole system. Use power_off() for that (declared in threads/init.h). Do not use this system call to terminate your user program!
bool create (const char *file, unsigned initial_size)
Creates a new file called file initially initial_size bytes in size. Returns true if successful, false otherwise.
int open (const char *file)

Opens the file called file. Returns a nonnegative integer handle called a "file descriptor" (fd), or -1 if the file could not be opened.

File descriptors numbered 0 and 1 are reserved for the console: fd 0 (STDIN_FILENO) is standard input, fd 1 (STDOUT_FILENO) is standard output. You do not have to open the standard input and output before using them.

Each process has an independent set of file descriptors. A user program should be able to have up to 128 files open at the same time. File descriptors are not inherited by child processes.

When a single file is opened more than once, whether by a single process or different processes, each open returns a new file descriptor. Different file descriptors for a single file are closed independently in separate calls to close and they do not share a file position.
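One possible way to organise per-process file descriptors is a fixed-size table, sketched below under a few assumptions: the names fd_table, fd_alloc, fd_lookup and fd_release are invented for illustration, the file objects are kept as opaque void pointers, and descriptors are simply "slot index + 2" so that 0 and 1 stay reserved for the console. Other designs are equally valid.

```c
#include <stddef.h>

#define MAX_OPEN_FILES 128   /* limit stated in the text above */
#define FIRST_FD 2           /* fd 0 and fd 1 are reserved for the console */

/* Hypothetical per-process table: slot i holds the kernel file object
   for descriptor i + FIRST_FD, or NULL if that descriptor is free. */
struct fd_table {
    void *files[MAX_OPEN_FILES];
};

/* Stores file in the first free slot and returns its descriptor,
   or -1 if file is NULL or the table is full. */
int fd_alloc(struct fd_table *t, void *file)
{
    if (file == NULL)
        return -1;
    for (int i = 0; i < MAX_OPEN_FILES; i++) {
        if (t->files[i] == NULL) {
            t->files[i] = file;
            return i + FIRST_FD;
        }
    }
    return -1;
}

/* Returns the file object for fd, or NULL if fd is out of range or unused. */
void *fd_lookup(const struct fd_table *t, int fd)
{
    if (fd < FIRST_FD || fd >= FIRST_FD + MAX_OPEN_FILES)
        return NULL;
    return t->files[fd - FIRST_FD];
}

/* Frees the slot for fd and returns the file object that was stored
   there (NULL if none), so that the caller can close it. */
void *fd_release(struct fd_table *t, int fd)
{
    void *file = fd_lookup(t, fd);
    if (file != NULL)
        t->files[fd - FIRST_FD] = NULL;
    return file;
}
```

Because every open stores a separate file object in a separate slot, opening the same file twice naturally gives two descriptors that are closed independently.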

void close (int fd)

Closes file descriptor fd. Exiting or terminating a process implicitly closes all its open file descriptors, as if by calling this function for each one.

int read (int fd, void *buffer, unsigned size)

Reads size bytes from the file open as fd into buffer. Returns the number of bytes actually read, or -1 if the file could not be read (due to a condition other than end of file). Fd 0 reads from the keyboard using input_getc() (defined in devices/input.h).
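The keyboard case can be sketched as a simple loop that fetches one character per iteration. So that the example compiles outside Pintos, demo_input_getc below is a stand-in for input_getc() that replays a fixed string; read_console is a hypothetical helper name. Validation of buffer is left out of the sketch.

```c
#include <stdint.h>

/* Stand-in for input_getc(): replays a fixed string and then returns 0.
   The real function waits for a key press instead. */
static const char *demo_keys = "hi\n";

uint8_t demo_input_getc(void)
{
    return *demo_keys ? (uint8_t) *demo_keys++ : 0;
}

/* Fills buffer with size characters taken from the keyboard stand-in
   and returns the number of bytes stored. */
int read_console(void *buffer, unsigned size)
{
    uint8_t *out = buffer;
    for (unsigned i = 0; i < size; i++)
        out[i] = demo_input_getc();
    return (int) size;
}
```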

int write (int fd, const void *buffer, unsigned size)

Writes size bytes from buffer to the open file fd. Returns the number of bytes actually written or -1 if the file could not be written.

Writing past end-of-file would normally extend the file, but file growth is not implemented by the basic file system. The expected behavior is to write as many bytes as possible up to end-of-file and return the actual number written or -1 if no bytes could be written at all.

When fd=1 then the system call should write to the console. Your code which writes to the console should write all of buffer in one call to putbuf() (check lib/kernel/stdio.h and lib/kernel/console.c), at least as long as size is not bigger than a few hundred bytes. (It is reasonable to break up larger buffers.) Otherwise, lines of text output by different processes may end up interleaved on the console, confusing the user.
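The chunking idea can be sketched as follows. The 256-byte DEMO_CHUNK is an assumed value for "a few hundred bytes", write_console is a hypothetical helper name, and demo_putbuf is a stand-in for putbuf() that records its output in a buffer so the example can run on its own.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_CHUNK 256u   /* assumed value for "a few hundred bytes" */

/* Stand-in for putbuf(): appends to a capture buffer and counts calls. */
char demo_console[1024];
size_t demo_console_len;
int demo_putbuf_calls;

void demo_putbuf(const char *buf, size_t n)
{
    size_t room = sizeof demo_console - demo_console_len;
    if (n > room)
        n = room;   /* drop what does not fit in the capture buffer */
    memcpy(demo_console + demo_console_len, buf, n);
    demo_console_len += n;
    demo_putbuf_calls++;
}

/* Writes size bytes to the console. Buffers of up to DEMO_CHUNK bytes
   go out in one call; larger ones are split into DEMO_CHUNK pieces. */
int write_console(const void *buffer, unsigned size)
{
    const char *p = buffer;
    unsigned left = size;
    while (left > 0) {
        unsigned n = left < DEMO_CHUNK ? left : DEMO_CHUNK;
        demo_putbuf(p, n);
        p += n;
        left -= n;
    }
    return (int) size;
}
```

A 600-byte buffer is therefore emitted in three calls (256 + 256 + 88 bytes), while anything up to 256 bytes is written in a single call and cannot be interleaved with output from another process.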

void exit (int status)

Terminates the current user program, returning status to the kernel. Conventionally, a status of 0 indicates success and nonzero values indicate errors. Remember to free all resources that will no longer be needed.

This system call will be improved in the following labs.

Tip: You can get the system call number and the arguments from the stack via the stack pointer. Some pointer arithmetic will be useful for moving up and down the stack.

The system call handler function will be used a lot in this and the following assignments, so you should structure your code in an organized and clear way that makes it easier to read and to extend later.
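One way to keep the handler easy to extend is a table of function pointers indexed by the system call number, sketched below. Everything here is illustrative: the DEMO_SYS_* numbers are made up (the real numbers are in lib/syscall-nr.h), the handlers are dummies, and the stack is passed in as a plain array of words. A switch statement would serve the same purpose.

```c
#include <stddef.h>
#include <stdint.h>

/* Made-up call numbers for the demo only. */
enum { DEMO_SYS_HALT, DEMO_SYS_EXIT, DEMO_SYS_COUNT };

/* Each handler receives the words that follow the call number on the
   stack and returns the value to hand back to the user program. */
typedef int32_t syscall_fn(const uint32_t *args);

/* Dummy handlers: halt ignores its arguments, exit echoes its status. */
int32_t demo_halt(const uint32_t *args) { (void) args; return 0; }
int32_t demo_exit(const uint32_t *args) { return (int32_t) args[0]; }

/* Dispatch table: adding a system call means adding one line here. */
syscall_fn *const demo_table[DEMO_SYS_COUNT] = {
    [DEMO_SYS_HALT] = demo_halt,
    [DEMO_SYS_EXIT] = demo_exit,
};

/* Reads the call number from stack[0] and runs the matching handler.
   Unknown numbers yield -1 instead of indexing past the table. */
int32_t dispatch(const uint32_t *stack)
{
    uint32_t nr = stack[0];
    if (nr >= DEMO_SYS_COUNT || demo_table[nr] == NULL)
        return -1;
    return demo_table[nr](stack + 1);
}
```

The bounds check matters because the call number comes from the user program and cannot be trusted.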

You may need to implement some additional functions and data structures that are not specified here, and which you must think of yourself. This is part of the assignment.

Assume that only one user program can run at a time (indeed, exec system call is not implemented in this lab). Therefore, synchronization of any thread-unsafe code can be delayed until the next lab.

Test program

An example test program for testing your system calls is examples/lab2test.c. Don't use the graphical qemu terminal for keyboard input (it behaves strangely), instead use the terminal from which you start pintos.

You are welcome to write your own user programs as well. Remember that we will judge your solution first of all based on your code.

Helpful Information

Code directory: /src/userprog, /src/lib/, /src/lib/kernel
Textbook chapters: Chapter 2.3: System Calls
Chapter 2.4: Types of System Calls
Chapter 8.4: Paging
Documentation: Pintos documentation related to Project 2
(Remember that the TDDB68 lab instructions always have higher precedence)

ddd man page (call man ddd)

Next Laboratory work

Laboratory Assignments 3

Page responsible: Sergiu Rafiliu
Last updated: 2011-09-12