C-with-Ease debugging

Under construction

Debugging parallel programs is generally something you don't want to do. In other words: get it right the first time.

Since this rarely, if ever, happens, debugging facilities are useful to have. Even if the program you are parallelizing works in its serial incarnation, a crash in parallel does not necessarily mean that the parallel parts are at fault. Running in parallel may simply tickle a memory allocation problem elsewhere in the program.

The manner of debugging depends on what libraries and systems your program has been linked with. See the section on implementations for more information about what attributes different implementations have.

The most difficult implementations to debug are pre-emptive and distributed ones, because invoking a debugger or adding debugging code will almost certainly change the order of execution.

The best way to locate bugs that depend on the order of execution is to use a non-preemptive threads implementation with pseudo-random scheduling for which you choose the random seed (to make runs repeatable). This way, changing the speed of a process or adding breakpoints will not upset program ordering. (Breakpoints upset ordering under pre-emptive scheduling because pre-emption is usually driven by a timer signal; the timer does not stop when the breakpoint is hit, so when you continue, a task switch will probably occur immediately.)
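The idea of seeded pseudo-random scheduling can be illustrated with a small stand-alone sketch. It is not taken from any Ease runtime: the names sketch_task, sketch_next and sketch_run are hypothetical, and each "task" is reduced to a counter of turns it still needs. The point is only that the order in which tasks run depends on the seed and on nothing else, so a run can be repeated exactly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task record: a cooperative task that needs `steps_left`
   more turns before it finishes. Not part of any Ease runtime. */
typedef struct {
    int id;
    int steps_left;
} sketch_task;

/* Small linear congruential generator, so that the sequence depends only
   on the seed and not on the platform's rand(). */
static unsigned sketch_next(unsigned *state)
{
    *state = *state * 1103515245u + 12345u;
    return (*state >> 16) & 0x7fffu;
}

/* Run every task to completion, choosing the next runnable task
   pseudo-randomly. The id of each task chosen is recorded in trace[];
   the return value is the number of entries written (at most max_trace). */
size_t sketch_run(sketch_task *tasks, size_t ntasks, unsigned seed,
                  int *trace, size_t max_trace)
{
    unsigned state = seed;
    size_t written = 0;

    assert(tasks != NULL && trace != NULL);

    for (;;) {
        size_t runnable = 0;
        size_t i, k;

        /* Count the tasks that still have work to do. */
        for (i = 0; i < ntasks; i++)
            if (tasks[i].steps_left > 0)
                runnable++;
        if (runnable == 0 || written == max_trace)
            break;

        /* Pick the k-th runnable task and give it one turn. */
        k = sketch_next(&state) % runnable;
        for (i = 0; i < ntasks; i++) {
            if (tasks[i].steps_left <= 0)
                continue;
            if (k-- == 0) {
                tasks[i].steps_left--;   /* task runs until it yields */
                trace[written++] = tasks[i].id;
                break;
            }
        }
    }
    return written;
}
```

Calling sketch_run twice on identical task arrays with the same seed produces identical traces; a different seed will generally give a different interleaving. Because no timer is involved, stopping in a debugger between turns does not change the trace.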

Invoking a debugger

The debugger you normally use on a particular platform can usually be used to debug an Ease program. A non-distributed (threads) implementation lets you use a single debugger for the whole program. A distributed implementation may allow one debugger for each Ease process.

For threads implementations you can run the program just as you would any normal sequential program. Unless the debugger supports the threads you are using (as is the case with Irix kernel threads and dbx), a breakpoint will stop the program when any thread reaches it. Generally, you cannot examine local data in a thread other than the one that is currently executing (or that the debugger stopped in), although all global data is accessible. To examine local data in another thread, allow the program to continue and stop it in the desired thread. Of course, the other thread must be runnable (not blocked) for the system to switch to it.

With a PVM application, you may run the initial process under a debugger as normal, and also choose whether created processes run under debuggers (one per process). To do this, use the =g runtime flag. The debuggers will be started in xterms on your display. You must have the PVM_EXPORT environment variable set to include DISPLAY (e.g. setenv PVM_EXPORT DISPLAY).
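If the debugger windows fail to appear, a quick check of PVM_EXPORT can save some head-scratching. The following is a minimal sketch, not part of Ease or PVM: export_list_has is a hypothetical helper, and it assumes that PVM_EXPORT holds variable names separated by colons, as PVM 3 documents.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return 1 if `name` appears as a whole entry in the
   colon-separated `list`, 0 otherwise. A NULL list (variable not set)
   contains nothing. */
int export_list_has(const char *list, const char *name)
{
    size_t len;
    const char *p;

    assert(name != NULL);
    len = strlen(name);
    if (list == NULL || len == 0)
        return 0;

    p = list;
    while (*p != '\0') {
        const char *end = strchr(p, ':');
        size_t n = end ? (size_t)(end - p) : strlen(p);

        /* Compare whole entries only, so "XDISPLAY" does not match. */
        if (n == len && strncmp(p, name, len) == 0)
            return 1;
        if (end == NULL)
            break;
        p = end + 1;
    }
    return 0;
}
```

A program could call export_list_has(getenv("PVM_EXPORT"), "DISPLAY") (with <stdlib.h> included) before creating processes and print a warning if the result is 0.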


Tim MacKenzie