This tutorial provides an overview of Livermore Computing's (LC) supercomputing resources and how to use them effectively. It is intended as a "getting started" document for new users and for anyone who wants to know, in a nutshell, what supercomputing at LC is all about from a practical user's perspective. It is also the first presentation in a 4+ day, hands-on workshop that covers parallel programming on LC's supercomputing systems in detail.
A wide variety of topics are covered in what is intended to be a logical progression, starting with a description of the LC organization, a summary of the available supercomputing hardware resources, how to obtain an account, and how to access LC systems. Important aspects of the user environment are then addressed, such as the user's home directory, various files and file systems, how to transfer and share files, quotas, and archival storage. These are followed by a brief description of the software development environment (compilers, debuggers, and performance tools), a summary of video and graphics services, how to run jobs, and a few security reminders. The tutorial concludes with a discussion of where to obtain more information and help.
Level/Prerequisites: Some basic knowledge of high-end computing systems, particularly parallel computing, is useful but not required. This tutorial is geared to new users of such systems and can be considered a prerequisite for using LC systems and for the other tutorials, which cover parallel programming on LC systems in more detail.
"The acronyms can be a bit overwhelming" - Excerpt from a workshop attendee evaluation form
A more complete list of acronyms can be found at http://www.llnl.gov/computing/hpc/documentation/acronyms.html. A very succinct subset that is relevant to this tutorial appears below.
"Everything changes except the fact that everything changes."
Notes/Legend:
System Details: (production systems)
System Details:
Examples:
purple.llnl.gov
uv.llnl.gov
um012.llnl.gov
ace22.llnl.gov
alc.llnl.gov
up099.llnl.gov
DCE Passwords (SCF only):
One-time Passwords (OCF and SCF):
Categories of Access:
Services Available:
For Help and More Information:
SSH Required:
ssh mcr.llnl.gov
ssh -p922 alc.llnl.gov
SCF Access:
Tri-lab Exceptions:
ascpurple1.llnl.gov or purple1441-ext.llnl.gov
ascpurple2.llnl.gov or purple1442-ext.llnl.gov
ascpurple3.llnl.gov or purple1443-ext.llnl.gov
ascpurple4.llnl.gov or purple1444-ext.llnl.gov
SCP versus FTP:
SSH and SCP Examples:
Web Page Access:
OpenSSH:
RSA/DSA Authentication:
SSH Hints:
ssh -v [other options] <host>
Need SSH?
More Information:
.cshrc.blue    .kshrc.blue    .login.blue    .profile.blue
.cshrc.compaq  .kshrc.compaq  .login.compaq  .profile.compaq
.cshrc.linux   .kshrc.linux   .login.linux   .profile.linux
.cshrc.white   .kshrc.white   .login.white   .profile.white
% ls -l .snapshot
total 32
drwx------  40 joeuser  joeuser     4096 Sep 07 11:50 hourly.0
drwx------  40 joeuser  joeuser     4096 Sep 06 18:44 hourly.1
drwx------  40 joeuser  joeuser     4096 Sep 06 11:46 hourly.2
drwx------  40 joeuser  joeuser     4096 Sep 05 15:09 hourly.3
% ls -l .snapshot/hourly.0
total 24712
-rw-------   1 joeuser  joeuser    31575 Aug 30 12:19 Batch_Limits.doc
-rw-------   1 joeuser  joeuser  2120192 Sep 01 12:04 FY01Blueprint.doc
drwx------   2 joeuser  joeuser     4096 May 07 15:44 Mail
drwx------   2 joeuser  joeuser     4096 Nov 07  2000 Misc
drwx------  16 joeuser  joeuser     4096 Oct 24  1998 NPB2.3
-rw-------   1 joeuser  joeuser  3039744 Aug 30 10:22 WhitePIX.ppt
drwx------   2 joeuser  joeuser     4096 Mar 29 13:09 bin
-rw-------   1 joeuser  joeuser       39 May 09 09:20 blank.html
-r--------   1 joeuser  joeuser  2433035 Aug 24 14:01 cforaix.pdf
....
Linux Systems:
igs-mds-gm1a:/mds_p_gm1/mcr_gm1_client 90318316544 73100343808 17215692320 81% /p/gm1
igs-mds-gm2a:/mds_p_gm2/mcr_gm2_client 93040928032 26728821184 64827855200 30% /p/gm2
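The percentage column in the df listing above can be reproduced from the used and available block counts. The following is a minimal sketch, assuming the GNU df convention of computing used / (used + available) and rounding up to the next whole percent. The function name pct_used is hypothetical and is not part of any LC tool.

```shell
#!/bin/sh
# pct_used USED AVAIL
# Print the percentage of blocks in use, rounded up to a whole number.
# Assumption: GNU df convention, used / (used + avail), rounded up.
pct_used() {
    awk -v used="$1" -v avail="$2" 'BEGIN {
        total = used + avail
        if (total <= 0) { print 0; exit }
        pct = used * 100 / total
        ipct = int(pct)
        if (ipct < pct) ipct++    # round up to the next whole percent
        print ipct
    }'
}

# Used and available block counts copied from the listing above.
pct_used 73100343808 17215692320    # /p/gm1, prints 81
pct_used 26728821184 64827855200    # /p/gm2, prints 30
```

Note that the used and available counts do not sum exactly to the total in the first numeric column, which is why the sketch divides by used + available rather than by the reported total.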
Compaq Systems:
Access Methods:
quota -v
df
du
% quota -v
Disk quotas for joeuser:
Filesystem     used    quota   limit    timeleft  files  quota   limit    timeleft
/g/g0          -0-     8.0G    8.0G               -0-    n/a     n/a
/g/g10         -0-     16.0G   16.0G              -0-    n/a     n/a
/g/g11         -0-     16.0G   16.0G              -0-    n/a     n/a
/g/g12         -0-     16.0G   16.0G              -0-    n/a     n/a
/g/g13         -0-     16.0G   16.0G              -0-    n/a     n/a
/g/g14         -0-     16.0G   16.0G              -0-    n/a     n/a
/g/g15         -0-     16.0G   16.0G              -0-    n/a     n/a
/g/g16         816.7M  16.0G   16.0G              6.2K   n/a     n/a
/g/g17         -0-     16.0G   16.0G              -0-    n/a     n/a
/g/g18         -0-     16.0G   16.0G              -0-    n/a     n/a
/g/g19         -0-     16.0G   16.0G              -0-    n/a     n/a
/g/g20         -0-     16.0G   16.0G              -0-    n/a     n/a
/g/g21         -0-     16.0G   16.0G              -0-    n/a     n/a
/g/g22         -0-     16.0G   16.0G              -0-    n/a     n/a
/g/g23         -0-     16.0G   16.0G              -0-    n/a     n/a
/g/g24         -0-     16.0G   16.0G              -0-    n/a     n/a
/g/g90         -0-     16.0G   16.0G              -0-    n/a     n/a
/g/g91         -0-     16.0G   16.0G              -0-    n/a     n/a
/g/g92         -0-     16.0G   16.0G              -0-    n/a     n/a
/g/g99         -0-     8.0G    8.0G               -0-    n/a     n/a
/nfs/tmp2      -0-     100.0G  100.0G             0.0K   n/a     n/a
/nfs/tmp3      -0-     100.0G  100.0G             0.0K   n/a     n/a
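To see at a glance how close each file system is to its quota, the quota -v listing above can be post-processed. The sketch below is illustrative only. It assumes the column layout shown in the listing (field 1 is the file system, field 2 is "used", field 3 is "quota"), that "-0-" means zero, and that the K/M/G suffixes are powers of 1024. The function name quota_pct is hypothetical and is not an LC command.

```shell
#!/bin/sh
# quota_pct: read LC-style `quota -v` output on stdin and print, for each
# file system, the percentage of the block quota currently in use.
# Assumptions: field 2 is "used", field 3 is "quota", "-0-" means zero,
# and size suffixes are powers of 1024.
quota_pct() {
    awk '
    # Convert a size such as 816.7M or 16.0G to bytes.
    function to_bytes(s,    unit, mult) {
        if (s == "-0-") return 0
        unit = substr(s, length(s), 1)
        mult = 1
        if (unit == "K") mult = 1024
        else if (unit == "M") mult = 1024 ^ 2
        else if (unit == "G") mult = 1024 ^ 3
        else if (unit == "T") mult = 1024 ^ 4
        return (s + 0) * mult    # s + 0 keeps only the leading number
    }
    # Only rows that start with a path are file system entries.
    /^\// {
        used = to_bytes($2)
        quota = to_bytes($3)
        pct = (quota > 0) ? used * 100 / quota : 0
        printf "%s %.1f%%\n", $1, pct
    }'
}

# Two rows copied from the listing above.
printf '%s\n' \
    '/g/g0 -0- 8.0G 8.0G -0- n/a n/a' \
    '/g/g16 816.7M 16.0G 16.0G 6.2K n/a n/a' | quota_pct
```

On a system where quota -v produces this layout, the same function could be used as quota -v | quota_pct. Header lines are skipped because they do not begin with a slash.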
Exceeding quota:
scp thisfile user@host2:thatfile
File Sharing:
give user file
take user file
give jsmith input2
give jsmith input1 input2
give jsmith in*
give -u jsmith input2
take ljones data
take ljones data2 data3
take ljones
give, take
Notes:
More info: http://www.llnl.gov/LCdocs/ezfiles (File-Management Tools section) and man pages.
DEG:
Software Documentation:
Software List:
Types of Jobs and Where to Run:
Interactive vs. Batch:
What About the Details?
Dedicated Application Time (DAT):
The services and software provided by the Information Management and Graphics Group (IMGG) are discussed below.
Video Production:
Consulting:
Visualization Machine Resources:
PowerWalls:
Contacts & More Information:
Just a Few Reminders...
LC Hotline:
LC Users Home Page:
OCF: www.llnl.gov/computing
SCF: https://lc.llnl.gov
Most Important and Time Critical Information:
To: white-status@llnl.gov
Subject: Summary of weekend runs on White

The following top priority jobs are scheduled this weekend for the White machine. Any lower priority jobs interfering with this work are subject to being killed. Of course if any job is killed, the job owner will be notified.

SUMMARY OF WEEKEND RUNS REQUESTED ON WHITE
Friday August 24 - Monday August 27, 2001
----------------------------------------------------------------------------
SNL:   user01 \                                  tier 1
       user02  \ 128 nodes                       tier 1
       user03  /                                 tier 1
       user04 /                                  tier 1

LANL:  user05  128 nodes \ 132 nodes             tier 1
       user06    4 nodes / total                 tier 2
       plus 2x4=8 nodes on ice                   tier 2

LLNL:  64 nodes                        \         tier 1
       user07 )                         \ 144-   tier 2
       user08                           / 190    tier 2
       user09 important weekend debug  /  nodes  tier 2
       user10 16-24 nodes              /         tier 2
       plus 16 nodes on ice
----------------------------------------------
Total: 450 White nodes requested (457 possible)
        24 Ice nodes requested (26 possible)
Miscellaneous Documentation:
LC User Meeting:
This completes the tutorial.