University of Pennsylvania / CIS Department
Fall 2007
CSE-380 (Operating Systems)
Matt Blaze (blaze at-sign cis.upenn.edu)
Homework #2 due: Tuesday, October 2, 2007 (1800 EST)
(Revised 9/22/07)

Description

In this exercise, you will conduct some simple performance experiments and explore the POSIX standard user-level thread library.

To submit your work, put your source and text files in a directory and use the turnin command on the eniac cluster. All programs should be written in C and compile on the eniac Linux machines. Note that turnin now runs on all the Moore 100 lab eniac Linux machines. You can also use eniac.seas.upenn.edu to turn in your work from outside the lab.

Important: Take the time to write and code your answers clearly and lucidly, whether the language you are using is English or C.

VERY IMPORTANT: Submit only your source code and supporting text. Do not include compiled or large output files in your submission. Those files can be very large and submitting them can have a disruptive effect on our shared computing resources.

Part 1 (30% credit)

NOTE REVISED PARAMETERS (9/22/07) The number of iterations has been revised to make it easier to see any differences to between the three programs. If you already did the homework with the original parameters, that's fine, too.

One of the features of the Unix "standard I/O" library is the use of input and output "buffers" that aim to reduce the number of times the (expensive) read() and write() system calls are invoked. Functions like fprintf() and fputc() don't always call write() each time they're used. Instead, they add their output to a buffer and call write() only when the buffer becomes full. (Write() also gets called when the process ends, when the output file is closed and when the program explicitly requests that buffers be flushed).

By avoiding excessive use of system calls, programs that use the standard I/O library often have much better performance than they would had they used read() and write() directly. Write three simple C programs that explore this scheme:

The three programs should each produce 1,000,000 characters of output and perform no other significant computation.

Using the /usr/bin/time command on the Eniac cluster, compare the performance of the three programs. (Note that you must use the full pathname /usr/bin/time, since there's also a built-in shell command called "time", which has a similar function but provides less complete data.) Construct an experiment in which you time each program over ten (or more) consecutive invocations (with the program output redirected to a file). Give the output data from your time command experiments and write a brief summary of your results; be sure to explain the meanings of each of the figures reported by the time command.

Are there variations in the output of the time command across successive invocations of the same program? Why or why not?

Are there significant differences in the performance of the three programs? Which is fastest? Which is slowest? Why? Do your results suggest that the buffering strategy of the standard I/O library is successful? Explain.

Now write a C program or a shell script that invokes your three programs as three simultaneous processes. Are the X's, O's and e's output by each process intermixed with one another or do they come out in long bursts? Why do you think this is?

Part 2 (70% credit)

Many versions of Unix, including that running on the Eniac cluster, include a lightweight "threads" library package that allows multiple threads to be part of a single process. See the manual pages for pthreads on eniac for details. (Type the commands " man pthreads " and " man pthread_create " to get started).

Rewrite the three programs in the previous section as simultaneous threads in a single process using this library. Your multi-threaded process should start the three threads in parallel and report which one finishes first in a message printed to standard error (STDERR)..

Submit your source code and a brief discussion of its operation, especially addressing whether one would expect to get the same result each time the program is run. Does it matter which thread your program starts first? Does your experience running your program support your expectation?

Note: To complete this exercise, you will need to become familiar with the pthreads package, which we did not discuss extensively in class. Part of the purpose of this exercise is to give you experience interpreting and evaluating the interfaces to unfamiliar system services. Don't be shy about reading and posting questions to the course newsgroup.

Extra Credit (Up to 20% credit)

Using the pthreads package, write an original example program with two (or more) threads whose output is not consistently the same each time it is run due to a race condition in the order or timing of the thread execution, scheduling or blocking. Explain the different outputs that your program might have and the circumstances that might result in the various execution scenarios that produce these different outputs. Support your explanation with experiments that show at least two different outputs over multiple runs. Then write a second version of your program that incorporates critical section protection code to fix the race condition.

This is a rather challenging and open-ended problem. Be creative in constructing an example that you can explain clearly and succinctly. If you cannot construct a suitable multi-threaded program with inconsistent output, you may still receive some extra credit if you describe well the various approaches you tried and the hypotheses that led you to try them.



Course home page: http://www.crypto.com/courses/fall07/cse380/

20 September 2007;