UMBC High Performance Computing Facility
 
How to run Bash programs on tara
Introduction
Now we'll see how to run a Bash script on the cluster. Before
proceeding, make sure you've read the How To Run tutorial.
We now know that we should be running our job on the compute nodes, 
rather than the front end node. However, we need to be careful with 
scripting, and make sure that the scheduler always has control over our job. 
We'll see some examples of how to do this correctly, along with some 
counterexamples. Use of other scripting languages and shells should be 
very similar.
Simple Bash example
Let's start with a script named pause.bash. It initiates a one-minute
sleep so that the job runs for a little while. The example is simple
enough that we could have included it directly in the batch script. In
practice, though, we'll usually want to keep our functional code
separate from our batch job running code.
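The contents of pause.bash are not reproduced on this page, so here is a minimal sketch of what it might look like, inferred from the output shown later (a start message, a one-minute sleep, an end message). The PAUSE_SECONDS variable is our own hypothetical addition for quick testing and is not part of the original script. The snippet writes the file into the current directory and marks it executable, since the batch script invokes it as ./pause.bash.

```bash
#!/bin/bash
# Create a sketch of pause.bash in the current directory.
# Assumption: the real script prints a start time, sleeps for one
# minute, and prints an end time, as the job output suggests.
cat > pause.bash <<'EOF'
#!/bin/bash
# PAUSE_SECONDS is a hypothetical override; the default is 60 seconds.
pause_seconds="${PAUSE_SECONDS:-60}"

echo "Script started at $(date)"
sleep "$pause_seconds"
echo "Script ended at $(date)"
EOF

# The batch script runs ./pause.bash, so the file must be executable.
chmod +x pause.bash
```

Running PAUSE_SECONDS=2 ./pause.bash on the command line is a quick way to check the script before submitting it to the scheduler.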
Here is the SLURM batch script we will use to launch it
#!/bin/bash
#SBATCH --job-name=pause
#SBATCH --output=slurm.out
#SBATCH --error=slurm.err
#SBATCH --partition=develop
./pause.bash
 
Download: 
../code-2010/bash_pause/run.slurm
 
Now we launch the job
[araim1@tara-fe1 bash_pause]$ sbatch run.slurm 
sbatch: Submitted batch job 2618
[araim1@tara-fe1 bash_pause]$ squeue
  JOBID PARTITION     NAME     USER  ST       TIME  NODES NODELIST(REASON)
   2620    serial    pause   araim1  R        0:00      1 (Resources)
[araim1@tara-fe1 bash_pause]$
 
 
After about a minute, we get the following output
[araim1@tara-fe1 bash_pause]$ cat slurm.err 
[araim1@tara-fe1 bash_pause]$ cat slurm.out 
Script started at Thu Aug 20 18:12:36 EDT 2009
Script ended at Thu Aug 20 18:13:36 EDT 2009
[araim1@tara-fe1 bash_pause]$
 
 
If we had killed the job during its execution, the scheduler 
would have been able to stop it cleanly, and no pieces of it would 
continue to run on the compute node.
A note to users familiar with background jobs and nohup: it would not
be a good idea to run the pause.bash script through either mechanism.
Such processes could potentially run outside of the scheduler. If
this happens, you would lose control of your job and need to
contact HPC Support to stop it. If such a job is
left running, other users' jobs could be scheduled on your busy
processors, and the leftover process could interfere with their execution.
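To see why backgrounding is risky, here is a small local demonstration that does not involve the scheduler at all. It is only a sketch: launcher.bash is a hypothetical stand-in for a batch script that starts its work in the background, and the 30-second sleep stands in for the real work. The point is that the launcher returns immediately while the process it started keeps running.

```bash
#!/bin/bash
# Local demonstration only; no scheduler is involved.
# launcher.bash is a hypothetical stand-in for a batch script that
# backgrounds its work instead of running it in the foreground.
workdir=$(mktemp -d)
cat > "$workdir/launcher.bash" <<'EOF'
#!/bin/bash
# Counterexample: start the work in the background and return at once.
sleep 30 > /dev/null 2>&1 &
echo $!   # report the PID of the background process
EOF

# Run the launcher to completion and capture the PID it printed.
child_pid=$(bash "$workdir/launcher.bash")

# The launcher has exited; check whether its background process is alive.
if kill -0 "$child_pid" 2> /dev/null; then
    child_survived=yes
else
    child_survived=no
fi
echo "Launcher finished; background process still running: $child_survived"

# Clean up the leftover process and the temporary directory.
kill "$child_pid" 2> /dev/null
rm -rf "$workdir"
```

On a typical Linux system we would expect the background process to outlive the launcher. Whether a scheduler can still track and stop such a process depends on how the cluster is configured, which is why we run pause.bash in the foreground of the batch script.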