README
------

The Rsge package (v0.4) is an extension to the R programming language that allows R users to programmatically integrate their R models with the SGE environment via qsub.

This implementation of Rsge is based on the source code of the Rlsf package (which is itself based on the snow package, and so on). Thanks to those guys for saving me tons of time on this!

The program functions as follows:

1. Call sge.par(L|s|C|R)Apply(X, fun, ..., join.method=cbind, njobs) or sge.apply.
2. The data object is split as specified by njobs (number of jobs).
3. Each data segment, along with the function and argument list (and optionally global variables and library names), is saved to a shared network location using the save call.
4. Each worker process loads its saved file, executes the function on the data segment, and saves the results to the shared network location.
5. The master script uses join.method to merge the results together and returns the result.

(sge.submit is similar, but does not split data and is asynchronous)
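
The split / save / load / join cycle described above can be illustrated with base R alone. The following is only a minimal sketch, not the Rsge implementation: it runs every "worker" sequentially in one process, uses a temporary directory in place of the shared network location, and never calls qsub. The helper names (run_worker, local_split_apply), the file naming, and the default join.method=c are assumptions made for illustration.

```r
# Hypothetical helper: what a single worker does (step 4).
run_worker <- function(task_file) {
  env <- new.env()
  load(task_file, envir = env)   # restores 'segment', 'fun' and 'fun_args'
  result <- do.call(lapply, c(list(env$segment, env$fun), env$fun_args))
  result_file <- paste0(task_file, ".ret")
  save(result, file = result_file)
  result_file
}

# Hypothetical driver: steps 2, 3 and 5, run sequentially in one process.
local_split_apply <- function(X, fun, ..., join.method = c, njobs = 2,
                              work_dir = tempfile("rsge-sketch-")) {
  dir.create(work_dir)
  fun_args <- list(...)

  # Step 2: assign each index of X to one of (at most) njobs contiguous groups.
  group_id <- ceiling(seq_along(X) * njobs / length(X))
  groups <- split(seq_along(X), group_id)

  # Step 3: save each segment together with the function and its arguments.
  task_files <- character(length(groups))
  for (i in seq_along(groups)) {
    segment <- X[groups[[i]]]
    task_files[i] <- file.path(work_dir, sprintf("task-%d.RData", i))
    save(segment, fun, fun_args, file = task_files[i])
  }

  # Step 4: on a cluster, each task file would be handled by its own SGE job.
  result_files <- vapply(task_files, run_worker, character(1))

  # Step 5: load every partial result and merge them with join.method.
  parts <- lapply(unname(result_files), function(f) {
    env <- new.env()
    load(f, envir = env)
    env$result
  })
  do.call(join.method, parts)
}

# Square ten numbers in three "jobs"; the extra argument is passed through '...'.
squares <- local_split_apply(1:10, function(x, power) x^power,
                             power = 2, njobs = 3)
```

Because each task file carries its own data segment, function, and arguments, a worker needs nothing from the master except read and write access to the shared directory, which is why the compute nodes must be able to reach the submission directory.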

This new version has the following changes

LICENSE
------
This code is licensed under the GPL. Use it, modify it, change it, but most of all, enjoy it.

WARRANTY
------

There is (of course) no warranty for this code. Use it at your own risk.

INSTALLATION
------

To install this package, first ensure that snow is installed, then run the following:

R CMD INSTALL Rsge

This package requires that SGE is installed with the following configuration:
  Nodes where this package is installed are referred to as submit nodes.
  Nodes where this package will distribute data for computation are referred to as compute nodes.

  SubmitNode:
    - SGE should be installed and configured
      - should be configured as a submit node
      - should have qsub and qstat in the path (optionally qacct)
    - R (tested with 2.6.1)
    - R packages snow and Rsge installed.
  
  ComputeNode:
    - R, Rsge, and snow should be installed
    - Other packages may be required if you use the “packages” option
    - access to the submission directory (via a shared filesystem like nfs)
    - The node should be properly configured to be an SGE execute host 
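
Some of the requirements above can be checked from within R before submitting anything. The snippet below is a rough sketch using only base R functions (Sys.which and system.file); the variable names are illustrative. It only tells you whether the tools and packages can be found on this machine, not whether the node is actually configured as a submit or execute host.

```r
# Look up the SGE command-line tools on the current PATH.
# Sys.which() returns "" for any command it cannot find.
sge_tools <- Sys.which(c("qsub", "qstat", "qacct"))
missing_tools <- names(sge_tools)[!nzchar(sge_tools)]

# Check that the required R packages are installed, without loading them.
# system.file() returns "" when the package cannot be found.
required_pkgs <- c("snow", "Rsge")
installed <- vapply(required_pkgs,
                    function(p) nzchar(system.file(package = p)),
                    logical(1))
missing_pkgs <- required_pkgs[!installed]

if (length(missing_tools) > 0) {
  message("Not found in the path: ", paste(missing_tools, collapse = ", "),
          " (qacct is optional)")
}
if (length(missing_pkgs) > 0) {
  message("R packages not installed: ", paste(missing_pkgs, collapse = ", "))
}
```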

Usage
------

To get more information about how to use this package, either:
  - read the R docs,
  - check the test directory to see the examples that I used for testing
       - ignore the framework, it's a little too much,
  - or check out the source code; there are only a few hundred lines :)

Configuration:
-------
Most of the configuration options were added because they could be useful in the future. ONLY the following options should be used (using the other config options will almost surely break things):

1. sge.block.size=100 - Number of elements, rows, or columns per task if njobs is not specified.
2. options(sge.qsub="qsub") - Location of qsub if it's not in the path.
3. options(sge.qstat="qstat") - Location of qstat if it's not in the path.
4. options(sge.user.options="-S /bin/bash") - feel free to change to target queues, things like that
5. options(sge.ret.ext="sge.ret") - I don't know why you would change this, but you can.
6. options(sge.use.qacct="FALSE") - This can provide more robust job status checking if qacct is installed (and configured) on the submit node, provided that the default accounting file is used.
7. options(sge.use.cluster="TRUE") - If false, all of the jobs will be run locally.
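
As a small illustration of how options like these are set and restored in an R session, here is a sketch that relies only on base R's options() and getOption(). The qsub path and the queue name (all.q, passed with qsub's -q flag) are hypothetical values chosen for the example; substitute whatever your site uses.

```r
# options() returns the previous values, so they can be restored afterwards.
old_opts <- options(
  sge.qsub = "/opt/sge/bin/qsub",             # hypothetical install location
  sge.user.options = "-S /bin/bash -q all.q"  # hypothetical queue name
)

# The values are now visible to any code that reads them with getOption().
current_user_options <- getOption("sge.user.options")

# ... submit jobs here ...

# Put the session back the way it was.
options(old_opts)
```

Saving the return value of options() makes it easy to try a different queue for one run without permanently changing the session.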

TODO:

The following are things that I want to do, or should at least consider doing ... if I have the time. I will have to get these things in a proper bug-tracking environment.

1. Write proper documentation. (getting closer)
2. Implement asynchronous job arrays (currently, asynchronous mode launches a qsub call for each submit call; it would be better to launch a pre-process call, then have a submit call, maybe...)
3. Allow users to run jobs without doing the worker prep work (requires that a previous run was run with debug=TRUE)
4. Remove the requirement for Rsge/snow installation on nodes.
5. sge.get.result should accept the actual filename to retrieve, not the input file from the worker.
6. Eventually, I should consider adding support for Rmpi or something comparable. I can reuse more Rlsf code for this.
========
$Id: README,v 1.2 2007/04/03 20:24:33 kuhna03 Exp $
