
Using CBIF cluster

The CBIF cluster has 16 nodes (2 CPUs each) and uses Sun Grid Engine (SGE) as its resource management software. The SGE system works as follows:

  • Users submit computationally demanding tasks (jobs) to the SGE system.
  • The SGE system accepts the users' jobs and puts them in a holding area (queues) until they can be executed.
  • The SGE system sends jobs from the queues to an execution host, manages them during execution, and logs a record of their execution when they finish.

Users can submit batch jobs, interactive jobs, and parallel jobs to the SGE system.

Submitting Batch jobs

Before you start submitting batch scripts to the SGE system, check that your personal shell resource file (.cshrc) contains these lines:
if ( -f /usr/pkg/sge/default/common/settings.csh ) then
    source /usr/pkg/sge/default/common/settings.csh
endif

To submit a batch job, enter the following command:
qsub job.sh
Here job.sh is the name of the script file, which is located in the current directory.

Since batch jobs do not have a terminal connection, their standard output and standard error output have to be redirected into files. The default location for these files is the working directory in which the job executes. The default standard output file name is <Job_name>.o<Job_id>, and the default standard error file name is <Job_name>.e<Job_id>. <Job_name> is the script file name and <Job_id> is a unique identifier assigned to the job by SGE. Users can specify other output locations with the -o and -e options.
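To make the naming pattern concrete, the following sketch prints the two default file names for a given job name and job ID. It is written in POSIX sh rather than csh (an assumption made for portability), and default_output_names is a hypothetical helper, not an SGE command.

```shell
#!/bin/sh
# Print the default stdout and stderr file names for a job, following the
# <Job_name>.o<Job_id> and <Job_name>.e<Job_id> pattern described above.
# default_output_names is a hypothetical helper, not part of SGE.
default_output_names() {
    job_name=$1
    job_id=$2
    printf '%s.o%s\n' "$job_name" "$job_id"   # standard output file
    printf '%s.e%s\n' "$job_name" "$job_id"   # standard error file
}

# Example: the script job.sh submitted as job 1
default_output_names job.sh 1
```

For job.sh submitted as job 1, this gives job.sh.o1 and job.sh.e1. These defaults apply only when no -o or -e option is given.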

A simple example of an SGE batch script, job.sh:
#!/bin/csh 
#$ -S /bin/csh
#$ -o res.out -e res.err
#$ -cwd
date
sleep 20
date
Lines with a leading "#$" are treated as script-embedded command line options for qsub. Here are some options for qsub:
[-S path_list]   : command interpreter to be used
[-o path_list]   : specify standard output stream path
[-e path_list]   : specify standard error stream path
[-cwd]           : use current working directory
[-j y[es]|n[o]]  : merge stdout and stderr streams of the job

The qsub command should confirm the successful job submission as follows:
Your job 1 ("job.sh") has been submitted
In this example the job has been assigned the job ID 1.
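If the job ID is needed later, for example for qstat or qdel, it can be pulled out of the confirmation message. The sketch below assumes the message has exactly the form shown above (the wording may differ between SGE versions) and uses POSIX sh rather than csh. extract_job_id is a hypothetical helper name.

```shell
#!/bin/sh
# Extract the numeric job ID from a qsub confirmation line of the form
#   Your job 1 ("job.sh") has been submitted
# extract_job_id is a hypothetical helper; it reads the message on stdin
# and prints nothing if the line does not match the assumed format.
extract_job_id() {
    sed -n 's/^Your job \([0-9][0-9]*\) .*/\1/p'
}

# Sample message copied from the text above. On the cluster the input
# would instead come from:  qsub job.sh | extract_job_id
echo 'Your job 1 ("job.sh") has been submitted' | extract_job_id
```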

Retrieving the status information on your job

Enter the following command:
qstat

You should receive a status report containing information about all jobs currently known to the SGE system:

job-ID  prior  name    user  state  submit/start at    queue  function
----------------------------------------------------------------------
1       0      job.sh  cngo  r      09/05/07 15:20:15  yucca  MASTER

  • job-ID: the unique identifier assigned to the job by SGE
  • name: the name of the job script
  • user: the owner of the job
  • state: the state of the job (r - running, s - suspended, q - queued, w - waiting, E - error)
  • submit/start at, queue: the submit or start time and the name of the queue in which the job is executing

If you don't see any status report about your job, the job has finished and you should check the res.out file for its results.
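The check described above can be sketched as a small filter over qstat output. This is only an illustration in POSIX sh (not csh): job_state is a hypothetical helper, and it assumes the column layout of the sample report, with the job ID in field 1 and the state in field 5.

```shell
#!/bin/sh
# Print the state column for a given job ID from qstat-style output on stdin.
# Prints nothing if the job is not listed, which per the text above means
# the job has finished. job_state is a hypothetical helper; it assumes the
# job ID is field 1 and the state is field 5, as in the sample report.
job_state() {
    awk -v id="$1" '$1 == id { print $5 }'
}

# Sample report based on the one in the text. On the cluster the input
# would instead come from:  qstat | job_state 1
sample='job-ID  prior  name    user  state  submit/start at    queue  function
----------------------------------------------------------------------
1       0      job.sh  cngo  r      09/05/07 15:20:15  yucca  MASTER'

printf '%s\n' "$sample" | job_state 1
```

An empty result for a job ID that was submitted earlier suggests the job is no longer known to SGE, so its output files are the next place to look.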
 

Cancelling a job

Enter the following command:
qdel jobID
To delete a job, you must be the owner of the job, an SGE manager, or an SGE operator.

Resources

http://gridengine.sunsource.net
