Compaq Clusters Overview

NOTE: The LC Compaq Clusters are in the process of being phased out. TC2K was removed from service on 12/8/04. All references to TC2K herein are historical only. Also, because the Compaqs are being phased out, this tutorial is no longer being maintained as of 3/05.

Table of Contents

  1. LC Compaq Cluster Systems
  2. Hardware Overview
    1. Alpha 21264 Chip
    2. Alpha Processors and AlphaServers
    3. Quadrics Interconnect
  3. Software and Development Environment
  4. Compilers
  5. MPI
  6. Running on Compaq Clusters
    1. Overview
    2. Invoking the Executable
    3. Monitoring Job Status
    4. Interactive Job Specifics
    5. Batch Job Specifics
    6. Optimizing CPU Usage
    7. Some Troubleshooting Hints
  7. Debugging
  8. References and More Information
  9. Exercise


LC Compaq Cluster Systems


There are three Compaq cluster systems within LC: two in the OCF and one in the SCF. Their general configuration is shown below.

OCF

TC2K:

  • 128 nodes
  • 4 processors per node
  • Processor type: ES40 EV67 @667 MHz
  • 681 GFLOPS system peak performance
  • 2 GB memory per node; 2 login nodes with 8 GB
  • Quadrics interconnect
  • 7 TB CPFS parallel file system
  • Running Tru64 UNIX 5.1
  • Primarily used for parallel jobs
GPS:

  • 33 nodes
  • 4 or 32 processors per node
  • Processor type:
    • GPS1-16: ES45 EV6.8 @1 GHz
    • GPS17-32: ES40 EV6.7 @667 MHz
    • GPS320: GS320 EV6.8 @1 GHz
  • 277 GFLOPS system peak performance
  • 4, 8, 16 or 32 GB memory per node
  • No interconnect
  • Running Tru64 UNIX 5.1
  • Primarily used for serial and shared-memory parallel jobs
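
The 277 GFLOPS peak figure quoted for GPS can be reproduced with a little arithmetic. The sketch below assumes that each Alpha processor can complete 2 floating-point operations per clock cycle (an assumption, not stated in this tutorial), and that GPS1-16 and GPS17-32 are 16 four-processor nodes each while GPS320 is a single 32-processor node. The variable and function names are illustrative only.

```python
# Rough peak-performance estimate for GPS.
# ASSUMPTION: 2 floating-point operations per processor per clock cycle.
FLOPS_PER_CYCLE = 2

# (label, node count, processors per node, clock in GHz)
gps_groups = [
    ("GPS1-16", 16, 4, 1.0),
    ("GPS17-32", 16, 4, 0.667),
    ("GPS320", 1, 32, 1.0),
]


def peak_gflops(groups, flops_per_cycle=FLOPS_PER_CYCLE):
    """Sum nodes * processors * clock (GHz) * flops per cycle over all groups."""
    return sum(nodes * cpus * ghz * flops_per_cycle
               for _label, nodes, cpus, ghz in groups)


total = peak_gflops(gps_groups)
print(f"Estimated GPS peak: {total:.0f} GFLOPS")
```

Under these assumptions the three groups contribute 128, about 85.4, and 64 GFLOPS, which sums to roughly 277 GFLOPS and agrees with the figure listed above.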

SCF

SC:



Hardware Overview

Alpha 21264 Chip

Alpha 21264 Chip - Some History:


Alpha 21264 Architecture:



Hardware Overview

Alpha Processors and AlphaServers

Overview:

Alpha Processor Design:



Hardware Overview

Quadrics Interconnect

Primary components:

Topology:

Features:



Software and Development Environment


The software and development environment for the Compaq clusters is similar to what is generally described in the Introduction to LC Resources tutorial. Items specific to the Compaq clusters are discussed below.

Tru64 Operating System:

Compilers / Languages:

Math Libraries/Tools Specific to Compaq Clusters:

Batch System:

User Filesystems:



Compilers


Available Compilers:

Compiler Invocation Commands:

Parallel Usage:

Useful Tru64 Compiler Options:

Compiler Documentation:



MPI


What's Available?

Compiling with Compaq's MPI

Note: Compaq recommends the -pthread flag. If -pthread is used, include it on both the compilation and load commands, because this option automatically adds the appropriate thread-safe options for both the compiler and the loader. If your application is not threaded, you may omit -pthread, but this is not recommended, since thread-safe mode is also safe for single-threaded applications. In addition to its use as a compiling and loading option, KCC requires -pthread (alternatively, --thread_safe) when building a library archive (combining .o files into a .a file).

Compiling with MPICH

Compiling MPICH P4



Running on Compaq Clusters

Overview

Big Differences:

Job Limits:



Running on Compaq Clusters

Invoking the Executable

GPS and SC:

TC2K:



Running on Compaq Clusters

Monitoring Job Status

GPS and SC:

TC2K:



Running on Compaq Clusters

Interactive Job Specifics

In General:

Killing Interactive Jobs:



Running on Compaq Clusters

Batch Job Specifics

LCRM:

Submitting Batch Jobs:

GPS/SC: Specifying Memory (and Other) Constraints:

Note: The LCRM constraint specification is strict about formatting. For example, in -c 15000Mb,gps there is no space between "15000" and "Mb", nor on either side of the comma. In addition, memory can be specified only in megabytes. See the psub man page for additional rules and tips regarding constraints.
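
The formatting rules in the note above can be captured in a small helper that builds the constraint string, so that stray spaces or other units never reach the command line. This is only a sketch: format_constraint is a hypothetical function, not part of LCRM, and it encodes only the rules stated here (a whole number of megabytes, the "Mb" suffix with no space, comma-separated items with no surrounding spaces). The psub man page may impose further rules that it does not check.

```python
import re


def format_constraint(memory_mb, *features):
    """Build a constraint string such as '15000Mb,gps'.

    Hypothetical helper: memory is given as a whole number of megabytes,
    and each extra item (for example a machine name) is appended after a
    comma with no spaces anywhere in the result.
    """
    if not isinstance(memory_mb, int) or memory_mb <= 0:
        raise ValueError("memory must be a positive whole number of megabytes")
    parts = [f"{memory_mb}Mb"]
    for feature in features:
        # Reject empty items and anything containing whitespace or a comma.
        if not feature or re.search(r"[\s,]", feature):
            raise ValueError(f"invalid constraint item: {feature!r}")
        parts.append(feature)
    return ",".join(parts)


# Reproduces the example from the note: the value passed to -c.
print(format_constraint(15000, "gps"))
```

The printed string is the value that would follow -c in the example from the note.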

Quick Summary of Common Batch Commands:



Running on Compaq Clusters

Optimizing CPU Usage

TC2K: Effectively Utilizing CPUs:

GPS/SC: Effectively Utilizing CPUs:



Running on Compaq Clusters

Some Troubleshooting Hints



Debugging


Available Debuggers:

Using TotalView on the Compaq Clusters:


This completes the tutorial.




References and More Information