TotalView Exercises


TotalView Exercise for Part 1: The Very Basics

  1. First, make sure that your XWindows emulator program is running.

    The instructor will show you how to do this.

  2. Log in to the workshop machine

    Workshops differ in how this is done. The instructor will demonstrate how to do this.

  3. Override the LC limit on core file size

    LC sets a very small default limit on core file size, which makes core files useless for debugging. The limit has to be overridden in order to produce a useful core file. After logging in, use the command below:

    limit coredumpsize unlimited

  4. Copy the example files

    1. In your workshop home directory, create a subdirectory for the TotalView exercise codes and cd to it.

      mkdir totalview
      cd  totalview

    2. Copy the exercise files to your totalview subdirectory:
      cp  /usr/global/docs/training/blaise/totalview/*   ~/totalview

  5. List the contents of your TotalView subdirectory

    You should have the following files:

    C Files       Fortran Files  Description
    tvEx1.c       tvEx1.f        Exercise 1 example file for demonstrating TotalView basics. Serial code.
    tvEx2.c       tvEx2.f        Exercise 2 example file for showing additional TotalView functions and features. Serial code with a bug.
    tvEx2.dat     tvEx2.dat      Exercise 2 input file
    tvEx2Hang.c   tvEx2Hang.f    Exercise 2 example file demonstrating how to attach to a running code and "fix" it. Serial code with a bug.
    tvEx3omp.c    tvEx3omp.f     Exercise 3 example file demonstrating TotalView use with parallel OpenMP codes. Shared memory threads-based parallelism.
    tvEx3mpi.c    tvEx3mpi.f     Exercise 3 example file demonstrating TotalView use with parallel MPI codes. Distributed memory parallelism.

  6. Compile the Exercise 1 example code

    To produce an executable file that can be used with TotalView:

    C: xlc -g tvEx1.c -o ex1
    Fortran: xlf -g tvEx1.f -o ex1

  7. Start the TotalView debugger with your executable

    totalview ex1 &

    If everything is set up and working correctly, including your XWindows environment, you should then see TotalView's Root and Process windows appear, loaded with your Exercise 1 program.

  8. Familiarize yourself with TotalView's windows

    Obviously, there isn't much you can do just yet, but take a few moments to notice the various components of the Root Window, the Process Window, and the various menus for both windows.

  9. Run the program

    Use any of the following methods to start running the program (remember at least one method for later):

    Accelerator key: Type g in the Process Window
    Go Button: Selected from the Process Window's execution control button panel.
    Process/Group Menus: Select Go from either of these pull-down menus in the Process Window.

    Note that since no breakpoints were set, the program simply runs to completion. Note also that the program's output is displayed in the window where you started totalview. Sample output is available here.

  10. Set a breakpoint

    In the Process Window's Source Pane, left-click on the box for line 43. A STOP icon will appear here and also in the Action Points Pane, indicating that the breakpoint has been set (shown below).

    Note that this is just one of several ways to set a breakpoint - it is probably the easiest and quickest however.

    Example 1 - setting a breakpoint

  11. Run the program again

    1. Use any of several methods to run the program.
    2. When the program hits the breakpoint and stops, notice the Program Counter (yellow arrow) on line 43.
    3. Check the Root Window and note the B breakpoint status code.

    Example 1 - stopping at a breakpoint

  12. Dive on a routine to view its source code

    This can be done several ways. Only one is described here.

    1. Find the call to the printmatrix routine on line 43. Then, right-click (and hold) on the word printmatrix itself.
    2. When the pop-up window (shown at right) appears, select the Dive option.
    3. The routine should now appear in the Source Code Pane.
    Dive pop-up menu

  13. Undive from a routine
    As with diving, this can be done several ways. Only one is described here.

    1. Find the undive button located in the upper right corner of the Source Code Pane (shown at right)
    2. Left-click on the undive button
    3. This will return you to the source code for the main program.
    Undive button

  14. Dive on an array variable
    1. Find the array variable c - several occurrences of it appear between lines 35-42. Double left-click (another way of diving) on any occurrence.
    2. This will cause a new Variable Window to open, displaying the contents of the array (shown at right).
    3. Try diving on other arrays (or any variable). Each will open a new Variable Window.
    4. Leave the array c Variable Window open for the next step.
    Example Variable Window for array c

  15. Display an array slice
    1. In the Variable Window for array c, find the box that says: Slice:
    2. Left-click on the array dimension brackets. For C these will look like [:][:] and for Fortran they will look like (:,:).
    3. This will invoke the line editor, allowing you to type in a range of array elements, called a "slice".
    4. Try typing in a slice as shown at right.
    5. When you hit return, the array slice you specified will appear in the Variable Window. Scroll through the window contents to prove this.
    Using the line editor to specify an array slice

  16. Modify a variable value
    1. Scroll to any of the array elements shown in the Variable Window for array c.
    2. Left-mouse click on the element's value field to invoke the line editor, allowing you to modify the value (shown at right).
    3. Hit return for the modification to take effect.
    4. If you want to confirm that the modification took effect, close the Variable Window. Then, dive on array c again to open a new Variable Window. Find the array element you modified and verify that it was changed.
    Using the line editor to modify an array value

  17. Stepping

    1. Try stepping through the code's execution. This can be done several ways:

      Accelerator key: Type s in the Process Window
      Step Button: Selected from the Process Window's execution control button panel.
      Process/Group/Thread Menus: Select Step from any of these three pull down menus in the Process Window.

    2. Continue to step through the program several times, noticing what occurs in the Source Code Pane.
    3. When you are finished with stepping, Go the program so that it completes execution.

  18. Get Help

    1. In either the Root Window or the Process Window, pull down the Help Menu.
    2. Select "Documentation"
    3. Then, wait (and wait) until a new web browser window appears with the Etnus online help.
    4. Try viewing the Users Guide and/or other online docs.
    5. Now try bringing up context-sensitive help. This is done by left-clicking on any window, pane, etc. and then selecting Help or hitting the F1 key.
    6. Then wait (and wait) until the context-sensitive help information appears in the web browser window previously opened above.
    7. Close the web browser help window when you are done.

  19. Quit TotalView

    Use either of the following methods to quit the debugger:

    Accelerator key: Type CTRL-Q in the Root Window
    Menu: Select Exit from the Root Window's File Menu


This concludes TotalView Exercise 1.




TotalView Exercise for Part 2: Additional Functions


NOTE: As of 7/06, the Fortran compiler/TotalView combination causes the Fortran version of Exercise 2 to not work properly. You need to set an LLNL-specific environment variable before you do the Fortran compile.

This exercise assumes that you have completed Exercise 1, that you are still logged into the workshop machine, and that your environment is still set up to run TotalView.

  1. Compile the Exercise 2 example code

    C: xlc -g tvEx2.c -o ex2
    Fortran: setenv LLNL_COMPILE_SINGLE_THREADED TRUE
             xlf -g tvEx2.f -o ex2

  2. Run the executable

    1. In the same window where you just compiled, run your executable by simply entering the command: ex2
    2. What happens? It should crash and dump core.
    3. Check for a core file - note its name (it should be core)

  3. Start TotalView with the core file and determine why the program crashed

    1. Enter the command: totalview ex2 core &
    2. When the Process Window appears, the yellow PC should be pointing to the statement that caused the program to crash. Examples below.
      Fortran note: If you only see assembler code, click on the "tvex2" line in the Stack Trace pane as shown by the red arrow below.
      Fortran
      PC in core file - Fortran
      C
      PC in core file - C

    3. Figure out why this program crashed. Hint: dive on the variables/array indices that appear in line 41.
    4. After you are satisfied that you know why the program crashes, CLOSE THE PROCESS WINDOW so you don't confuse yourself with the next step.

  4. Begin debugging the crashed program by loading the executable

    In order to perform further debugging, the actual executable, not the core file, must be loaded. One way of doing this is shown below:

    1. Select the New Program command from the Root Window's File Menu.
    2. The New Program Dialog Box will appear (shown at right). Enter the program name ex2 in the Program box.
    3. Click the OK button. A new Process Window will appear and display the source code for the ex2 program.

  5. Set an Evaluation Point to trap the bug

    Assuming that you reached the conclusion that the program crashes due to an array boundary condition, set up a test using an Evaluation Point to prove your hypothesis. One way of doing this is shown below.

    1. Open a pop-up menu by right-clicking (and holding) on the statement in line 41.
    2. When the pop-up menu appears, select the Properties command (below).

      Pop-up menu

    3. An Action Point Properties Dialog Box will then appear.
    4. Select the Evaluate button and make sure the correct language is also selected.
    5. Enter an expression (C or Fortran) to trap the array boundary problem, as shown below.
      Fortran
      Fortran expression in an Action Point Properties Dialog Box
      C
      C expression in an Action Point Properties Dialog Box

    6. Click the OK button when completed. The source line and the Process Window's Action Points Pane should now display an EVAL icon (below).

      Evaluation Point icon displays

  6. Run the program and catch the bug

    1. Go the program by any of the methods you already know. The program should now run until it triggers the Evaluation Point condition you coded in the previous step.
    2. When the Evaluation Point condition is triggered, the program will stop. At this point, notice two things:
      • The Source Pane will show the code for your Evaluation Point with the yellow PC arrow pointing to the $stop built-in statement.
      • The Stack Trace Pane will show that the program is stopped in your Evaluation Point code.
    3. To view your actual source code, you must left-click on the main program in the Stack Trace Pane, as shown below for both C and Fortran. After doing this, you should see your source code in the Process Window.
      Fortran
      Fortran Stack Trace Pane after triggering Evaluation Point
      C
      C Stack Trace Pane after triggering Evaluation Point

    4. On the line where your Evaluation Point occurs, dive on the j index for the array a. When the new Variable Window opens, inspect its value. Is it out of bounds?

  7. Modify your Evaluation Point to patch around the bug and finish execution

    1. Right-mouse click (and hold) on the source line where your Evaluation Point occurs, to open a pop-up window. Then select Properties from the pop-up menu.
    2. An Action Point Properties Dialog Box will open. Note that it is displaying your previously entered Evaluation Point. Modify the code as shown below, to "patch" around the problem. The "patch" simply makes the program skip out of the loop that causes the crash.
      Fortran
      Fortran expression in an Action Point Properties Dialog Box
      C
      C expression in an Action Point Properties Dialog Box

    3. Click the OK button when done.
    4. Resume (Go) execution by any of the methods you already know. The program should now complete without crashing. Note that in the real world, you would now want to go back and fix your source code.
    5. Quit TotalView when you are done.

    FORTRAN ONLY: be sure to unset the environment variable you previously set before proceeding:
    unsetenv LLNL_COMPILE_SINGLE_THREADED

  8. Attach to a hung process

    Warning: This part of the exercise is very CPU intensive on the machine where it executes. Please make sure that the process is terminated before quitting!

    1. Compile the example program.

      C: xlc -g tvEx2Hang.c -o ex2hang
      Fortran: xlf -g tvEx2Hang.f -o ex2hang

    2. Start the program that will hang: On the same machine where you are running TotalView, start the program, and then verify that it is running (and consuming lots of CPU cycles). At the Unix prompt:

      ex2hang &
      ps ux

    3. Start TotalView by itself: totalview &
    4. The Root Window should appear along with the New Program Dialog Box.
    5. In the New Program Dialog Box select Attach to an existing process. A list of attachable processes will then display (shown below).
    6. Select ex2hang and click OK.
    7. A new Process Window will appear containing the running ex2hang process.

      Root Window Unattached Page

  9. Debug the hung process

    1. First, Halt the hung process by using any of the following methods:

      Accelerator key: Type h in the Process Window
      Halt Button: Selected from the Process Window's execution control button panel.
      Process Menu: Select Halt from the Process Window's Process Menu
      Group Menu: Select Halt from the Process Window's Group Menu
      Modifying variable i

    2. Examine the source code and determine the problem. The reason why this trivial program is hung is rather obvious.
    3. Dive (by any means you choose) on the index variable i. A new Variable Window will open.
    4. In the new Variable Window, left-click on the variable i value to invoke the field editor.
    5. Modify the variable's value so that the condition causing it to hang is resolved. Simply make i greater than 100 as shown at the right.
    6. Hit return for the modification to take effect.

  10. Resume execution of the hung process

    Resume (Go) execution of the hung process now that you've "debugged" it. Use any of the methods you already know.

    The hung process should now complete execution.

  11. Make sure the hung process is gone

    At the Unix prompt, issue the ps command to verify that the process successfully terminated. If not, then use the kill pid command to kill the process, where pid is the process ID number as shown by the ps command.

  12. Quit TotalView

    Use either of the following methods to quit the debugger:

    Accelerator key: Type CTRL-Q in the Root Window
    Menu: Select Exit from the Root Window's File Menu


This concludes TotalView Exercise 2.




TotalView Exercise for Part 3: Debugging Parallel Codes


This exercise assumes that you have completed Exercises 1 and 2, that you are still logged into the workshop machine, and that your environment is still set up to run TotalView.

Debugging OpenMP Programs

  1. Compile the Exercise 3 OpenMP example code

    C: xlc_r -qsmp=omp -g tvEx3omp.c -o ex3omp
    Fortran: xlf_r -qsmp=omp -g tvEx3omp.f -o ex3omp

  2. Specify 4 threads and start TotalView with your executable

    setenv OMP_NUM_THREADS 4
    totalview ex3omp &

  3. Review the source code

    In this simple example, the master thread first initializes two vectors A and B, and then spawns a parallel region. Inside the parallel region, threads share the work of summing A and B into a third vector, C, by using the OpenMP DO (Fortran) or for (C) directive. Note the scoping of the variables used in this program.

  4. Set two breakpoints

    Set breakpoints on lines 42 and 53. The first breakpoint occurs inside the parallel region, and will affect all threads. The second one occurs outside the parallel region and will only affect the master thread.

  5. Run the program

    Go the program. While it is running, notice the output that appears in the window where you started totalview. The program will stop when one of the threads hits the first breakpoint.

  6. Find where thread information is displayed

    1. Process Window: click on the Threads Pane
    2. Process Window: note the status bar that shows the process and thread ids, and also a unique color/pattern
    3. Root Window: click the process toggle to show the thread list

  7. Cycle through all threads

    1. Use the T- and T+ buttons
    2. Left-click on any thread in the Threads Pane
    3. Double left-click (Dive) on any thread in the Root Window list
    4. As you cycle through each thread, notice what changes in the Process Window (status bar info and colors/patterns, info in the various panes, etc.)

  8. Open a new Process Window for at least one other thread

    This can be done by selecting any thread (other than the current thread) in the Root Window thread list, and then selecting Dive in a New Window from the Root Window's View Menu.

  9. Find the Master thread

    1. The Master thread is always TotalView thread 1. By any of the means already discussed, select thread #1 so that its information appears in the Process Window.
    2. In the Stack Trace Pane, notice that the Master thread shows both the main program and the outlined routine for that program.

  10. View SHARED and PRIVATE variables

    1. OpenMP SHARED variables can only be viewed in the Master thread's main program. All worker threads (and the outlined routine of the master thread) only show private variables.
    2. While you are viewing the Master thread, select (left click) on the main program name in the Stack Trace Pane.
    3. Look at the contents of the Stack Frame Pane. Review the variables that are present and notice that the variables declared as SHARED in the source code appear.
    4. Now, select any worker thread (any thread except #1) so that its information fills the Process Window.
    5. Notice the Stack Frame Pane now. Only private variables should appear.

  11. Laminate a variable

    1. Source Pane: Find an occurrence of the tid variable.
    2. Dive on it (double left-click or right click menu) - a new Variable Window will appear
    3. In the Variable Window, select the View Menu, and then Laminate -> Thread
    4. The Variable Window display will toggle into a laminated display for the tid variable. Note the different values for this PRIVATE variable in each thread.
    5. Note that if a thread has not yet reached the point where it has obtained its tid, the laminate window will say "Has no matching call frame" for that thread.

  12. Disable the first breakpoint

    1. In the Action Points Pane (lower right corner of the Process Window), left-click on the first breakpoint icon.
    2. Notice that the red STOP icon is now dimmed in both the Source Pane and the Action Points Pane. This means that it is disabled (not deleted).

  13. Finish program and quit TotalView

    1. Go the program again. This will allow the master thread to hit the second breakpoint on line 53, outside the parallel region.
    2. Confirm that the Master thread is actually at the second breakpoint by making sure its information appears in the Process Window.
    3. Go the program one final time to complete its execution.
    4. Quit TotalView




Debugging MPI Programs

NOTE: For the 7/14/06 workshop, this exercise will be run with only 2 MPI tasks instead of the usual 4 tasks because so many people have signed up. The number of CPUs available on the workshop machine is only 32 and 14 people * 4 tasks = 56 CPUs!

  1. Compile the Exercise 3 MPI example code

    C: mpcc_r -g tvEx3mpi.c -o ex3mpi
    Fortran: mpxlf_r -g tvEx3mpi.f -o ex3mpi

  2. Set a few POE environment variables

    The following should suffice, accepting the default settings for the rest (not shown):

    setenv MP_ADAPTER_USE shared (Important for the workshop machine configuration!)
    setenv MP_EUIDEVELOP deb
    setenv MP_PROCS 2
    setenv MP_NODES 1
    setenv MP_RMPOOL pclass

  3. Start TotalView with POE and your executable

    Dialog box

    1. Issue the command: totalview poe -a ex3mpi &
    2. The poe process will appear in the Root and Process startup windows.
    3. Go the poe process
    4. Eventually, a dialog box (at right) will appear. Select Yes
    5. Your MPI task 0 should now appear in the Process Window and a list of all MPI tasks (as many as you set with MP_PROCS) plus the POE process should appear in the Root Window.

  4. Review the source code

    The header comments explain what's happening with this program. It follows the SPMD (Single Program Multiple Data) programming model, which means the same program is executed by all MPI tasks. Note however, that there are sections of code that are executed by the master task (0) only, by non-master tasks only, and by all tasks.

  5. Find where MPI task information is displayed

    1. Root Window: list of MPI tasks
    2. Process Window: status bar information and colors/patterns

  6. Experiment with breakpoints

    The whole point of this section is to familiarize you with the behavior and options associated with action points, using breakpoints (the simplest kind) as an example. The default behaviors may or may not be what you think or want.

    1. First, set a breakpoint on line 53 by left-clicking the line box. Note that this line occurs in the master (task 0) section of code. The other MPI tasks do not execute it.
    2. Go the Group. What happens when the master task hits the breakpoint? Notice that the other MPI tasks keep running when the master task hits the breakpoint.
      Pop-up menu
    3. Now, delete the breakpoint on line 53 (simply left-click the red stop icon again) and set a new one on line 73, which is still in the master only section of code - however this time right-click (and hold) until the pop-up menu (at right) appears and then select Properties.
    4. When the Action Point Properties Dialog Box appears you will see the current properties for this breakpoint. Override this behavior to stop all processes by clicking on the Group toggle. Then click OK.
    5. Go the Group. What happens this time when task 0 hits the breakpoint? Notice that the non-master tasks now stop even though they themselves don't hit the breakpoint.
    6. Moral of the story: know how you want your breakpoints to behave.

  7. Notice that your MPI processes are multi-threaded

    You may already have noticed at this point that each MPI task is actually multi-threaded. Note that only one of these threads is of interest - the one which is executing your code. The others are created by the system or MPI library and are ordinarily not of interest to you.

  8. Cycle through all MPI tasks
    1. Use the P- and P+ buttons in the upper left section of the Process Window
    2. Double left-click (dive) on any MPI process in the Root Window list
    3. As you cycle through each process, notice what changes in the Process Window (status bar info and colors/patterns, info in the various panes, etc.)
    4. Note: if you don't see source code as you cycle through the MPI processes, click on the main program name in the Stack Trace Pane.

  9. View Message Queue and Message Queue Graph

    1. Make sure that some task other than task 0 appears in the Process Window.
    2. Then, pull-down the Tools menu and select Message Queue. A Message Queue Window will then appear. Review the information it displays.
    3. Try the same thing with a different MPI task.
    4. Again, in the Process Window, pull-down the Tools menu but this time select Message Queue Graph. The Message Queue Graph Window will then appear.
    5. Click on the Receive and Send toggles at the top of the display. Then click the Update button. You may need to expand the window size to see all of its contents.
    6. Click on the Help button (upper right corner) to find out what this display means and what options are available. (Note that it may take a few moments for the web browser based help window to appear).
    7. Close the web browser help window when you are done.

  10. Set a barrier point accepting its default properties

    1. Find the call to the MPI_Finalize routine - at line 122.
    2. Right-click (and hold) to raise a pop-up menu. Then select Set Barrier.
    3. A blue Barrier icon will then appear on the line number and in the Action Points Pane.

  11. Run the program

    1. Go the Group. All MPI tasks will now execute until they hit the barrier point. The Root Window will indicate this by giving each task a B status code.
    2. Cycle through all of the MPI tasks - they should all be at the barrier point. Ignore the POE task, by the way.

  12. Open a new Process Window for at least one other MPI task

    This can be done by selecting any MPI task in the Root Window's process list, and then selecting Dive Anew from the Root Window's View Menu.

  13. Examine variables

    Experiment. Dive on any variables of your choice. Compare between tasks. Dive from the Source Pane or the Stack Frame Pane.

  14. Laminate a variable or two

    1. Remember that laminated variables only have meaning if they exist across multiple tasks (or threads, as seen previously).
    2. Two suggested variables to laminate include offset and mysum. They are unique for each MPI task.
    3. Dive on these variables. Each will open a new Variable Window.
    4. Pull-down the View menu and then select Laminate -> Process
    5. Note the different values between tasks.

  15. Finish execution and quit TotalView

    1. Go the Group. The program should complete.
    2. Quit TotalView.
    3. We're done. Phew.


This completes the exercise.

Evaluation Form: Please complete the online evaluation form if you have not already done so for this tutorial.
