The instructor will show you how to do this.
Log in to the workshop machine
The login procedure differs between workshops. The instructor will demonstrate it.
Override the LC limit on core file size
By default, LC limits core files to a minimal size, which makes them useless for debugging. The limit has to be overridden in order to produce a useful core file. After logging in, use the command below:
limit coredumpsize unlimited
Copy the example files
mkdir totalview
cd totalview
cp /usr/global/docs/training/blaise/totalview/* ~/totalview
List the contents of your totalview subdirectory
You should have the following files:
Compile the Exercise 1 example code
To produce an executable file that can be used with TotalView:
Start the TotalView debugger with your executable
totalview ex1 &
If everything is set up and working correctly, including your X Window System environment, you should then see TotalView's Root and Process windows appear, loaded with your Exercise 1 program.
Familiarize yourself with TotalView's windows
There isn't much you can do just yet, but take a few moments to notice the components of the Root Window and the Process Window, and the menus for both windows.
Run the program
Use any of the following methods to start running the program (remember at least one method for later):
Note that since no breakpoints were set, the program simply runs to completion. Note also that the program's output is displayed in the window where you started totalview.
Set a breakpoint
In the Process Window's Source Pane, left-click on the box for line 43. A STOP icon will appear there and also in the Action Points Pane, indicating that the breakpoint has been set.
This is just one of several ways to set a breakpoint, though it is probably the easiest and quickest.
Run the program again
Dive on a routine to view its source code
This can be done several ways. Only one is described here.
Undive from a routine
Dive on an array variable
Display an array slice
Modify a variable value
Stepping
Get Help
Quit TotalView
Use either of the following methods to quit the debugger:
This concludes TotalView Exercise 1.
This exercise assumes that you have completed Exercise 1, that you are still logged into the workshop machine, and that your environment is still set up to run TotalView.
Compile the Exercise 2 example code
Run the executable
Start TotalView with the core file and determine why the program crashed
Begin debugging the crashed program by loading the executable
To debug further, you must load the actual executable, not the core file. One way of doing this is shown below:
Set an Evaluation Point to trap the bug
Assuming that you reached the conclusion that the program crashes due to an array boundary condition, set up a test using an Evaluation Point to prove your hypothesis. One way of doing this is shown below.
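The kind of condition such a test checks can be sketched in plain C. This is only an illustration of the bounds check itself, not the ex2 source: the array length `ARRAY_SIZE` and the helper names `index_in_bounds` and `first_bad_index` are hypothetical, chosen for this sketch.

```c
#include <assert.h>

/* Hypothetical array length; the real ex2 source is not shown here. */
#define ARRAY_SIZE 10

/* Returns 1 when idx is a valid subscript for an array of ARRAY_SIZE
   elements, and 0 otherwise. A test for an array boundary bug checks
   for the negation of this condition. */
int index_in_bounds(long idx)
{
    return idx >= 0 && idx < ARRAY_SIZE;
}

/* Walks indices 0..limit-1 and returns the first index that would fall
   outside the array, or -1 if every access stays in range. */
long first_bad_index(long limit)
{
    assert(limit >= 0);
    for (long i = 0; i < limit; i++) {
        if (!index_in_bounds(i)) {
            return i;   /* the point where stopping in a debugger helps */
        }
    }
    return -1;
}
```

If a loop in the program runs past the end of the array, the first out-of-range index is the one worth stopping on: inspecting the loop counter at that moment either confirms or refutes the hypothesis.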
Run the program and catch the bug
Modify your Evaluation Point to patch around the bug and finish execution
Attach to a hung process
ex2hang &
ps ux
Debug the hung process
Resume execution of the hung process
Resume (Go) execution of the hung process now that you've "debugged" it. Use any of the methods you already know.
The hung process should now complete execution.
Make sure the hung process is gone
At the Unix prompt, issue the ps command to verify that the process successfully terminated. If not, then use the kill pid command to kill the process, where pid is the process ID number as shown by the ps command.
This concludes TotalView Exercise 2.
Compile the Exercise 3 OpenMP example code
Specify 4 threads and start TotalView with your executable
setenv OMP_NUM_THREADS 4
totalview ex3omp &
Review the source code
In this simple example, the master thread first initializes two vectors A and B, and then spawns a parallel region. Inside the parallel region, threads share the work of summing A and B into a third vector, C, by using the OpenMP DO (Fortran) or for (C) directive. Note the scoping of the variables used in this program.
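For readers without the source at hand, the structure described above might look roughly like the following C sketch. It is not the ex3omp source: the vector length `N`, the function name `vector_add`, and the initial values are assumptions made for illustration. If compiled without OpenMP support, the pragmas are ignored and the loop simply runs serially.

```c
#include <assert.h>

#define N 100   /* hypothetical vector length */

/* Initializes a and b, then sums them into c inside a parallel region.
   Returns the sum of all elements of c so the result can be checked. */
double vector_add(double a[N], double b[N], double c[N])
{
    int i;                /* loop index: private to each thread */
    double total = 0.0;

    assert(a && b && c);

    /* The master thread initializes the two input vectors. */
    for (i = 0; i < N; i++) {
        a[i] = i * 1.0;
        b[i] = i * 2.0;
    }

    /* a, b and c are shared among threads; i is private. The "for"
       directive divides the loop iterations among the team. */
    #pragma omp parallel shared(a, b, c) private(i)
    {
        #pragma omp for
        for (i = 0; i < N; i++) {
            c[i] = a[i] + b[i];
        }
    }

    /* Back on the master thread only: reduce c to a single value. */
    for (i = 0; i < N; i++) {
        total += c[i];
    }
    return total;
}
```

When a thread is stopped inside the parallel region, the shared vectors show the same contents from every thread, while each thread has its own value of `i`.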
Set two breakpoints
Set breakpoints on lines 42 and 53. The first breakpoint occurs inside the parallel region, and will affect all threads. The second one occurs outside the parallel region and will only affect the master thread.
Run (Go) the program. While it is running, notice the output that appears in the window where you started totalview. The program will stop when one of the threads hits the first breakpoint.
Find where thread information is displayed
Cycle through all threads
Open a new Process Window for at least one other thread
This can be done by selecting any thread (other than the current thread) in the Root Window thread list, and then selecting Dive in a New Window from the Root Window's View Menu.
Find the Master thread
View SHARED and PRIVATE variables
Laminate a variable
Disable the first breakpoint
Finish program and quit TotalView
Compile the Exercise 3 MPI example code
Set a few POE environment variables
The following should suffice, accepting the default settings for the rest (not shown):
Note: the MP_ADAPTER_USE setting is important for the workshop machine configuration.
setenv MP_ADAPTER_USE shared
setenv MP_EUIDEVELOP deb
setenv MP_PROCS 2
setenv MP_NODES 1
setenv MP_RMPOOL pclass
Start TotalView with POE and your executable
The header comments explain what's happening with this program. It follows the SPMD (Single Program Multiple Data) programming model, which means the same program is executed by all MPI tasks. Note, however, that some sections of code are executed by the master task (0) only, some by non-master tasks only, and some by all tasks.
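The branching pattern behind SPMD can be sketched without any MPI calls. The code below is a plain C simulation, not the exercise program: the rank is passed in as an ordinary integer (a real MPI program would obtain it from `MPI_Comm_rank`), and `work_log`, `spmd_body`, and `run_tasks` are hypothetical names used only here.

```c
#include <assert.h>

#define MASTER 0   /* rank of the master task, as in the text */

/* Counters that stand in for real computation. */
struct work_log {
    int master_only;   /* sections run only by task 0 */
    int worker_only;   /* sections run only by non-master tasks */
    int all_tasks;     /* sections run by every task */
};

/* Body of the "single program": every task runs this same code and
   branches on its own rank. */
void spmd_body(int rank, struct work_log *log)
{
    assert(rank >= 0);
    if (rank == MASTER) {
        log->master_only++;    /* e.g. distribute data, collect results */
    } else {
        log->worker_only++;    /* e.g. receive data, compute, send back */
    }
    log->all_tasks++;          /* code common to all tasks */
}

/* Simulates ntasks tasks by calling the same body once per rank. */
struct work_log run_tasks(int ntasks)
{
    struct work_log log = {0, 0, 0};
    for (int rank = 0; rank < ntasks; rank++) {
        spmd_body(rank, &log);
    }
    return log;
}
```

This is why a breakpoint placed in a master-only section is reached by task 0 alone, while one placed in common code can be reached by every task.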
Find where MPI task information is displayed
Experiment with breakpoints
The point of this section is to familiarize you with the behavior and options associated with action points, using breakpoints (the simplest kind) as an example. The default behaviors may or may not be what you expect or want.
Notice that your MPI processes are multi-threaded
You may already have noticed at this point that each MPI task is actually multi-threaded. Note that only one of these threads is of interest - the one which is executing your code. The others are created by the system or MPI library and are ordinarily not of interest to you.
Cycle through all MPI tasks
View Message Queue and Message Queue Graph
Set a barrier point accepting its default properties
Open a new Process Window for at least one other MPI task
This can be done by selecting any MPI task in the Root Window's process list, and then selecting Dive Anew from the Root Window's View Menu.
Examine variables
Experiment. Dive on any variables of your choice. Compare between tasks. Dive from the Source Pane or the Stack Frame Pane.
Laminate a variable or two
Finish execution and quit TotalView
This completes the exercise.