The lfe R-package is installed like any other R-package from CRAN,
with the R-command install.packages('lfe').

Inside the installed package, in the subdirectory exec (e.g.
/usr/lib64/R/library/lfe/exec or /usr/local/lib64/R/library/lfe/exec,
or wherever your installation keeps its packages; the location is
displayed during installation), there's a perl-script, lfescript.  It
takes a simple specification as its input-file and creates an R-script
which uses the lfe-package.  Here's a typical input-file:

----cut test.lfe ----
file smalldata.csv
vars x x2 year id firm y ife ffe yfe
model y ~ x + x2 + year + G(id)+G(firm)
dummy year
#nofe
#se
merge
----cut----

Most of this is self-explanatory.  The 'file' line gives the name of the
data-file, which is supposed to be a pure space-separated file (without any
header with variable names).  However, if the name ends in '.dta', it is
assumed to be a Stata-file.

The 'vars' line must list all the variables in the file, even if you don't
use them.  This line is ignored if the data-file is a Stata-file.
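
Since the 'vars' line has to match the data-file column for column, a quick
check before running can save a failed job.  The following is only a minimal
sketch and not part of lfe: the function name check_vars is made up for
illustration, and it assumes a space-separated data-file without a header,
as described above.

```shell
#!/bin/sh
# check_vars SPEC DATA
# Hypothetical helper (not part of lfe): compares the number of names on the
# 'vars' line of a specification file with the number of columns on the
# first line of a space-separated data-file.
check_vars() {
    spec=$1
    data=$2
    # Count the names after the keyword 'vars' (first matching line only).
    nvars=$(awk '$1 == "vars" { print NF - 1; exit }' "$spec")
    # Count whitespace-separated fields on the first line of the data-file.
    ncols=$(awk 'NR == 1 { print NF; exit }' "$data")
    ncols=${ncols:-0}              # an empty data-file has zero columns
    if [ -z "$nvars" ]; then
        echo "no 'vars' line found in $spec"
        return 2
    fi
    if [ "$nvars" -eq "$ncols" ]; then
        echo "ok: $nvars variables, $ncols columns"
    else
        echo "mismatch: $nvars variables, $ncols columns"
        return 1
    fi
}
```

It would be called as 'check_vars test.lfe smalldata.csv'.  The check is
meaningless for Stata-files, where the 'vars' line is ignored anyway.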

The 'model' line specifies the left-hand side and the right-hand side
as an R-style model specification.  Note that the fixed effects are
included here, each within the G() operator.  Any number of fixed
effects may be used, but currently identification in the case of more
than two fixed effects is not very well understood.  In the example,
the 'year' variable could in principle be included as a fixed effect,
but for variables with few distinct values it's usually wiser to include
them as dummies among the ordinary covariates, because then you also get
standard errors for them.


The 'dummy' line specifies which variables should be coded as dummies; a
reference level is automatically removed.  In the example, you'll get
variables like year1998, year1999, year2000, or whatever time-period you've
got.  If the data-file is a Stata-file, any variable with value-labels will
be treated this way.


The 'nofe' line (commented out above) is used if one does not need to
compute the fixed effects themselves, only to account for them when
estimating the other covariates.  In case of 3 or more fixed effects, the
standard errors provided will be slightly wrong: it is assumed that all
the fixed effects are implicitly present, even though an unknown number
of them should be references, so the degrees of freedom aren't entirely
correct.

The 'se' line (commented out above) is included if you want standard errors
on the fixed effects.  These are bootstrapped, and as such very time-consuming
to compute.  Avoid this option if you don't need them.

The line 'merge' will cause the fixed effects to be merged into the original
data-set and output as a file 'fe-merged.csv' for further analysis.

The program writes the covariate estimates both to standard-output and to
a file 'coef.csv'.

In case fixed effects are requested, one file is created for each fixed
effect; in the example, 'fe-id.csv' and 'fe-firm.csv'.  The output-files
contain a header, like 'id  effect  comp  obs', i.e. the id, the
estimated coefficient, its component number in the connection graph, and
the number of observations for this effect.

If the data-file is a Stata-file, the output files will also be Stata-files.

Note that fixed effects from different components are not directly comparable
because there's a separate reference value in each component.
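
Before comparing effects, it can therefore be useful to see how the levels
are distributed over the components.  The following is a minimal sketch and
not part of lfe: the function name summarize_components is made up, and it
assumes that the output-file is whitespace-separated with a single header
line and the columns in the order 'id effect comp obs' (set awk's -F option
if your files use another separator).

```shell
#!/bin/sh
# summarize_components FILE
# Hypothetical helper (not part of lfe): for a fixed-effect output-file with
# the columns 'id effect comp obs', print one line per component giving the
# component number, the number of levels in it, and their total number of
# observations.  Assumes whitespace-separated columns and one header line.
summarize_components() {
    awk 'NR > 1 {
             levels[$3]++        # one row per level of the fixed effect
             obs[$3] += $4       # observations behind that level
         }
         END {
             for (c in levels)
                 print c, levels[c], obs[c]
         }' "$1" | sort -n
}
```

It would be called as 'summarize_components fe-id.csv'.  Effects should only
be compared between rows that share the same value in the 'comp' column.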


----------
Usage:  
Run the lfescript on your specification file to create an R-script:
path-to-script/lfescript test.lfe > test.R

In case you have more than one cpu (e.g. 8), you should tell lfe so by
setting the OMP_NUM_THREADS environment variable:

export OMP_NUM_THREADS=8

Then you run the script:

R CMD BATCH test.R

Standard output is redirected to the file 'test.Rout'.
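
The steps above can be collected in a small wrapper.  This is only a sketch
and not part of lfe: the function name run_lfe and the variables LFESCRIPT
and DRY_RUN are made up for illustration.  LFESCRIPT should hold the full
path of the lfescript in your installation, and the specification file is
assumed to end in '.lfe'.

```shell
#!/bin/sh
# run_lfe SPEC [THREADS]
# Hypothetical wrapper (not part of lfe) around the steps described above.
# LFESCRIPT must point at the lfescript in your installation's exec directory.
# With DRY_RUN=1 the commands are only printed, not executed.
run_lfe() {
    spec=$1
    threads=${2:-1}                    # default: a single thread
    rscript="${spec%.lfe}.R"           # test.lfe -> test.R
    lfescript=${LFESCRIPT:-lfescript}
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$lfescript $spec > $rscript"
        echo "OMP_NUM_THREADS=$threads R CMD BATCH $rscript"
        return 0
    fi
    # Step 1: turn the specification into an R-script.
    "$lfescript" "$spec" > "$rscript" || return 1
    # Step 2: run it, setting the thread count for this R run only.
    OMP_NUM_THREADS=$threads R CMD BATCH "$rscript"
}
```

For the example it would be called as 'run_lfe test.lfe 8', which leaves the
R output in 'test.Rout' as before.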

At the Frisch Centre (with mostly Windows/Stata-users) we have set up
a server with home-directories for the users.  They may drop their
data-file in a 'data'-folder and their specification file in a
'submit'-folder, and then the computation is automatically forwarded
to our compute-cluster batch-system.

Notes:
The centering on the means is done separately for each covariate.  When you've
set the OMP_NUM_THREADS environment variable, the lfe-package picks this up
and sends each covariate to a separate cpu.  Thus the centering is sped up.

If you request estimation of fixed effects, there's nothing in lfe as
such which uses more than one cpu for this task.   On the other hand,
this is done with the Kaczmarz-method which is very fast when the
system is as sparse as is typical for such data.
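
For reference, the textbook form of the Kaczmarz iteration for a linear
system Ax = b is given below.  This is the generic method, not necessarily
the exact variant lfe implements; a_i denotes the i-th row of A and b_i the
i-th entry of b, and the rows are visited in turn.

```latex
x^{(k+1)} = x^{(k)}
  + \frac{b_i - \langle a_i, x^{(k)} \rangle}{\lVert a_i \rVert^2}\, a_i
```

Each step changes only the coordinates of x where a_i is nonzero, so when
every row has just a few nonzero entries a step is very cheap.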


