 

TAPIR Computing

Page history last edited by Christian Ott 9 years, 2 months ago

Computing Information for TAPIR Group Members

 

There are multiple options for computing in TAPIR that together should accommodate almost any computing need. Details are below and up to date as of 2015/01/24. If you have questions regarding local TAPIR machines, please contact Chris Mach (cmach@tapir.caltech.edu). If you want to learn more about local and national high-performance computing resources, contact Christian Ott.

 


Small-scale computing within TAPIR:

 

There are several quite powerful desktop machines that are part of the TAPIR cluster (they use the same network file system and user authentication). These machines are managed by Chris Mach. You can log in to them using your TAPIR credentials. If you are planning to do serious computing that produces more than a few tens of megabytes of output, ask Chris to set you up with a local work directory on the machine(s) of your choice, since doing heavy I/O on the network file system is not a good idea!

 

List of machines:

 

jabberjaw.tapir.caltech.edu

psiphi.tapir.caltech.edu

cosmo.tapir.caltech.edu

mcbain.tapir.caltech.edu

esther.tapir.caltech.edu

 

These are 4-core machines with Intel Core i7 CPUs with hyperthreading (they show 8 logical CPU cores, but there are only 4 physical cores). Each machine has 16 GB of RAM. You can run single-threaded or multi-threaded jobs on them. They are, for example, ideal for running MESA. If any software is missing, Chris Mach can help you with that.

 

In addition, Christian has two group servers, fermi.tapir.caltech.edu and bethe.tapir.caltech.edu, which have 12 and 8 cores, respectively, and more RAM than the Core i7 machines. Please get in touch with him if you want to run on fermi or bethe.

 

Important:

Before starting any computations, make sure that nobody else is using the machine. The 'w' command shows who is logged on and what the current system load is (it should not exceed 4-8), and 'top' shows which processes are running.
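
The load check can also be scripted. The following is a minimal sketch, assuming Python 3 on a Unix machine. The function name machine_looks_idle and the default threshold of 4 (the number of physical cores on the desktop machines) are illustrative choices, not a TAPIR policy.

```python
import os


def machine_looks_idle(load_1min, max_load=4.0):
    """Return True if the 1-minute load average is below max_load.

    max_load=4.0 is an assumption based on the 4 physical cores of the
    desktop machines; pick a different value for fermi or bethe.
    """
    return load_1min < max_load


if __name__ == "__main__":
    # os.getloadavg() returns the 1-, 5-, and 15-minute load averages,
    # the same numbers that 'w' prints on its first line (Unix only).
    load_1min, load_5min, load_15min = os.getloadavg()
    print(f"load averages: {load_1min:.2f} {load_5min:.2f} {load_15min:.2f}")
    if machine_looks_idle(load_1min):
        print("Load is low; still check 'w' and 'top' for other users.")
    else:
        print("Machine is busy; consider using another one.")
```

A low load average does not replace looking at 'w' and 'top': someone may be logged in and about to start a job.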

 

Hint:

If you want to run a calculation in the background and do not want it to be interrupted or shut down when you log out of the machine, launch it in a virtual terminal using the incredibly handy 'screen' tool.
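
As an illustration of the hint above, here is a small sketch that builds the command line for starting a job in a detached 'screen' session. The helper name detached_screen_command, the session name myrun, and the program ./my_simulation are hypothetical placeholders. The sketch only prints the command and does not launch anything.

```python
import shlex


def detached_screen_command(session_name, command):
    """Build the argument list that runs `command` in a detached screen session.

    'screen -dmS NAME' starts a new session called NAME without attaching to it.
    """
    return ["screen", "-dmS", session_name] + shlex.split(command)


if __name__ == "__main__":
    # Session name and program are placeholders; substitute your own.
    args = detached_screen_command("myrun", "./my_simulation input.par")
    print("Would run:", " ".join(shlex.quote(a) for a in args))
```

After starting the session you can log out; 'screen -r myrun' reattaches to it later, and Ctrl-a d detaches again without stopping the job.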

 

 


Medium-scale computing on TAPIR-owned/shared computer clusters

 

Christian Ott, Mark Scheel, and Phil Hopkins are the PIs of the Zwicky compute cluster, which is located in the machine room of Powell-Booth and operated by IMSS. Zwicky has 2560 Intel Westmere/Sandy Bridge compute cores, 5.6 TB of RAM, QDR InfiniBand interconnect, 180 TB of parallel high-performance storage, and 180 TB of tier-2 storage. There is also an older, smaller, and less-used cluster called SHC.

 

Using Zwicky is practical if you want to run jobs that can use at least one entire compute node (12 or 16 cores) at once. The largest parallel MPI jobs that get pushed through Zwicky's queuing system are around 16-20 nodes (~192-240 cores), but larger jobs are possible. It is also possible to use Zwicky as a high-throughput machine (for example, if you need to run many small embarrassingly parallel jobs).
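
The node and core counts quoted above follow from simple arithmetic, sketched below for planning a job size. The function names are illustrative. The only inputs taken from this page are the node sizes (12 or 16 cores) and the 16-20 node range.

```python
import math


def cores_provided(nodes, cores_per_node):
    """Total cores available on a given number of whole nodes."""
    return nodes * cores_per_node


def nodes_needed(total_cores, cores_per_node):
    """Smallest number of whole nodes that provides at least total_cores."""
    return math.ceil(total_cores / cores_per_node)


if __name__ == "__main__":
    # 16-20 nodes of 12 cores each give the ~192-240 cores quoted above.
    print(cores_provided(16, 12))  # 192
    print(cores_provided(20, 12))  # 240
    # A hypothetical 200-core job needs 17 twelve-core or 13 sixteen-core nodes.
    print(nodes_needed(200, 12))   # 17
    print(nodes_needed(200, 16))   # 13
```

Since jobs occupy whole nodes, it usually makes sense to pick a core count that is a multiple of the node size.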

 

If you are interested in using Zwicky or SHC, please contact Christian, Mark, or Phil. Zwicky was funded by an NSF grant and a matching grant from the Sherman Fairchild Foundation. If you use Zwicky for your research, please acknowledge it as follows: "Computations were carried out on the Caltech Zwicky compute cluster (NSF MRI-R2 award no. PHY-0960291)."

 


Large-scale computing on national supercomputers

 

Phil Hopkins, Christian Ott, and Mark Scheel have allocations on national supercomputers. Talk to them if you need to run on more than a few hundred compute cores in parallel.

 

 

 

 
