...
I've collected tricks that a newbie might not know but that are useful for getting around computational work. I'll try to keep adding stuff as I learn it. Please add your tricks too!
Parallel computing
I'm trying to get a better sense of how to design parallel scripts. Typically, I've inherited someone's code and made it work for me. However, I have been looking for a good basic resource and have found at least one site that looks promising. It has a few free courses that look relevant, like "Parallel computing explained", "Introduction to MPI" and "Intermediate MPI". I found it through an MIT computing course that pointed to this site:
http://www.citutor.org/index.php
Although it has a lot of basic information, it's hard to figure out how it helps, because I'm really not sure what type of clusters I'm actually using (i.e. which parts are relevant to me). It hasn't really helped me do any actual coding yet, although some of the background about computers was semi-interesting.
How to find stuff out about computing clusters
I wanted to know whether there was a website where you could just find out about how to run stuff on a computer cluster (i.e. beagle, aces, or coyote). Basically, Scott said that only the sys admin knows all of the specific rules associated with each cluster and if you don't pick their brain about it, you won't really know how to use it right. I will hopefully pick brains for you and put it on this website in another post about each system. That's a work in progress.
You can find out about specifics of aces queues with:
qstat -Qf | more
or
qstat -q
Which results in this on aces:
server: login

Queue         Memory  CPU Time  Walltime  Node  Run  Que  Lm  State
------------  ------  --------  --------  ----  ---  ---  --  -----
geom          -       -         -         -     0    0    --  E R
one           -       -         06:00:00  1     8    319  10  E R
four-twelve   -       -         12:00:00  --    8    4    10  E R
four          -       -         02:00:00  16    8    437  10  E R
long          -       -         24:00:00  16    1    0    10  E R
all           -       -         02:00:00  1024  0    0    4   E R
mchen         -       -         02:00:00  1024  0    0    4   E R
mediumlong    -       -         96:00:00  30    0    0    10  E R
special       -       -         -         36    0    0    -   E R
toolong       -       -         168:00:0  4     0    0    10  E R
                                                ---  ---
                                                25   760
And this on coyote:
server: wiley

Queue         Memory  CPU Time  Walltime  Node  Run  Que  Lm  State
------------  ------  --------  --------  ----  ---  ---  --  -----
speedy        -       -         00:30:00  -     0    0    -   E R
short         -       -         12:00:00  -     2    -2   -   E R
long          -       -         48:00:00  -     68   46   -   E R
quick         -       -         03:00:00  -     0    0    -   E R
be320         -       -         00:30:00  -     0    0    -   E R
ultra         -       -         336:00:0  -     2    0    -   E R
                                                ---  ---
                                                72   44
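If you just want to know which queues will let a job run long enough, you can filter that listing instead of reading it by eye. Here is a rough sketch that assumes the walltime limit is always the fourth whitespace-separated column, as in the listings above. The function name queues_with_walltime is made up for this example, and the sample rows are copied from the coyote listing so the snippet runs without a cluster.

```shell
# Sample rows copied from the coyote listing (a stand-in for live `qstat -q` output)
sample='speedy - - 00:30:00 - 0 0 - E R
short - - 12:00:00 - 2 -2 - E R
long - - 48:00:00 - 68 46 - E R
quick - - 03:00:00 - 0 0 - E R
be320 - - 00:30:00 - 0 0 - E R
ultra - - 336:00:0 - 2 0 - E R'

# Hypothetical helper: print the queues whose walltime limit is at least
# the given number of hours. Assumes walltime is column 4 (hours:min:sec).
queues_with_walltime() {
    min_hours="$1"
    awk -v min="$min_hours" '
        # Only rows whose 4th field looks like a time; this skips the
        # header, the dashes, the totals, and queues with no limit ("-")
        $4 ~ /^[0-9]+:/ {
            split($4, t, ":")
            if (t[1] + 0 >= min) print $1
        }'
}

# Queues that allow at least 24 hours
long_enough=$(printf '%s\n' "$sample" | queues_with_walltime 24)
echo "$long_enough"
```

On a real cluster you would pipe the live output in instead, e.g. qstat -q | queues_with_walltime 24. Queues with no walltime limit listed are skipped by this sketch, so check those by hand.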
You can also find more information about qsub from its man page (run this from somewhere like the head node, because not all nodes have the same qsub information):
man qsub
The man page describes the various flags you can use with qsub.
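To give a feel for how those flags get used, here is a minimal sketch of a job script, assuming a PBS/Torque-style scheduler (which is what the qstat output above looks like). The job name and resource requests are made up for illustration, and "short" is just one of the queues from the coyote listing. Check man qsub and ask your sys admin before trusting any of these values on your own cluster.

```shell
#!/bin/bash
# Job name (hypothetical)
#PBS -N example_job
# Queue to submit to ("short" appears in the coyote listing)
#PBS -q short
# One core on one node (an assumed, minimal request)
#PBS -l nodes=1:ppn=1
# Stay under the queue's walltime limit (12:00:00 for short on coyote)
#PBS -l walltime=01:00:00
# Merge stderr into the stdout file
#PBS -j oe

# The scheduler sets PBS_O_WORKDIR to the directory qsub was run from;
# fall back to the current directory when the script is run by hand.
workdir="${PBS_O_WORKDIR:-$PWD}"
cd "$workdir" || exit 1

# PBS_JOBID is only set inside a real job
job_id="${PBS_JOBID:-none}"
echo "Job $job_id running on $(uname -n) in $workdir"
```

Saved as, say, example_job.sh (a hypothetical name), this would be submitted with qsub example_job.sh. The #PBS lines are ordinary comments to bash, so you can also run the script directly to check it before submitting.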
Queuing system on clusters
Never run anything on the head node!!! When you log into a cluster, you need to submit jobs to a queue or work interactively on a dedicated interactive node. The dedicated interactive nodes have different names on each cluster, so you just have to find them. Sometimes you can request an interactive session with qsub -I or qlogin, or log in to a dedicated interactive node (qubert on aces and super-genius on coyote), but this also depends on your system.
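One way to avoid running heavy stuff on the head node by accident is to have your script check where it is before it starts. This is only a sketch, and it assumes a PBS/Torque-style scheduler, which sets PBS_JOBID inside batch jobs and qsub -I sessions. The function name job_context is made up. It will not recognise a dedicated interactive node that you reached by plain ssh, so treat it as a reminder and not a guarantee.

```shell
# Hypothetical helper: report whether this shell is inside a scheduler job.
# Assumes PBS/Torque, which sets PBS_JOBID for batch and `qsub -I` sessions.
job_context() {
    if [ -n "${PBS_JOBID:-}" ]; then
        echo "job"
    else
        echo "login"
    fi
}

# Warn (on stderr) when we are probably on a login or head node
if [ "$(job_context)" = "login" ]; then
    echo "Not inside a scheduler job: submit with qsub or start an interactive session first." >&2
fi
```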
...
Works for scp too (and presumably other things).
Downloading directly to the clusters
You can get stuff from a website using wget, for example:
wget https://github.com/swo/lake_matlab_sens/archive/master.zip
Running something on a detached screen
Use screen. These will help you figure stuff out:
man screen
or
screen --help
This starts screen:
screen -S SPPtest
This detaches but keeps it running:
Hold the "control" and "A" keys, release them, then press "D"
To reattach to detached screen:
screen -R SPPtest
To get rid of the screen altogether type this from within a screen:
exit
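If you already know what you want to run, screen can also start a session that is detached from the beginning, so you never have to attach at all. Below is a small sketch that wraps the relevant screen flags in shell functions. The function names are made up for this example, and nothing is started until you call them.

```shell
# Hypothetical helper: start a command in a new session that is detached
# from the start (-d -m) and named with -S so it is easy to find later.
start_detached() {
    name="$1"
    shift
    screen -dmS "$name" "$@"
}

# List the sessions you currently have, attached or detached
list_sessions() {
    screen -ls
}

# Shut a named session down from outside, without attaching to it
kill_session() {
    screen -S "$1" -X quit
}
```

For example, start_detached SPPtest ./my_script.sh would run a (hypothetical) my_script.sh in a detached session called SPPtest, and you could later look at it with screen -R SPPtest.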