
"To do" list for R/qtl
----------------------------------------------------------------------
This file is intended to contain a list of many of the additions and
revisions that are planned for the R/qtl package.  

If you have any additions or revisions to suggest, please send an email to
Karl Broman, <kbroman@jhsph.edu>.
----------------------------------------------------------------------
 
SHORT TERM:

o clean.scantwo: zero out the values between two markers

o Add cross type "dh" for doubled haploids.  Treat this just like
  a backcross, but with different genotype labels.

o refineqtl -- like fitqtl, but refines the locations of the QTLs as
  in Zhao-Bang's MIM; the output could include both the final qtl
  object plus the set of LOD curves, and have class "refineqtl", with
  functions summary.refineqtl() to give the final positions and
  plot.refineqtl() to give a plot like Z-B's.

o max.scanqtl, summary.scanqtl, print.summary.scanqtl, plot.scanqtl
  + help file

o write.qtlcart: RILs seem to need coding 0/2.

o Changed default for plot.scantwo to lower="joint"; check that the
  tutorial reflects this.  

o Include Bjarke's code on eHK for scanone and scantwo.

o read.cross for "qtx" sometimes doesn't seem to interpret the
  genotype pattern correctly; e.g., a backcross is read in as if it
  were an F2.

o plot.pxg for X chr and autosome: results can be messed
  up, depending on the order of the markers.  

o Fix the help file for fitqtl() to emphasize that interactions among
  covariates are not allowed, but must be set up in advance.
 
o Add an explanation regarding the coding in the coefficient 
  estimates in fitqtl().  Add text to the help file for
  summary.fitqtl(). 

o write tools for converting the output from scanqtl() to the format
  for scanone() or scantwo(), according to whether it's a 1-d or
  2-d search, or print a warning otherwise.

o Documentation for scanone() and scantwo() regarding the X chromosome.

o scanone() with model="2part" gives NAs as LOD scores if there
  is complete penetrance (one of the p's goes to 1).  This doesn't
  happen with model="binary".

  The problem is that if everyone with a certain genotype survives,
  then you have great segregation distortion if you look at only
  the dead individuals, and so you can't estimate both means.

o In scanone (and possibly in scantwo), when a factor is used as
  a covariate, a message about "must be a matrix" is given, but
  it should instead say that the covariate needs to be numeric.

o Finish off the work to get coefficient estimates by imputation in
  fitqtl for the X chromosome in BC and F2. 

o geno.table(): the X chromosome needs special treatment.

o P-values from geno.table() when an intercross has some dominant
  markers.  

o summary.scantwo doesn't work if there are just two positions on a
  chromosome (or maybe it's for one position).

o Fix plot.scantwo for the case that incl.markers=TRUE, so that positions
  are not equally spaced, but are according to the genetic map.

o Add an example regarding the X chromosome to the help files for
  fake.f2 and scanone.

o Effectscan for X chromosome.

o An NA in the mapmaker data file caused an error in read.cross;
  the line became too long.  Maybe this is true whenever an item
  doesn't match what is expected.

o Add sample data to the web site.

o Speed up read.cross.mm; deliver meaningful errors if the map and
  genotypes don't match, or if there are too many genotypes in a row.

o Documentation for makeqtl/fitqtl/scanqtl (especially summary.fitqtl).

o Deal more cleanly with missing values in the output of scantwo.
  Make the warning message clearer, and perhaps don't automatically
  set them to 0.

o Add attributes regarding degrees of freedom and null log10lik
  to the output of scanone and scantwo.

o Revise the a.starting.point a bit.

o Revise c.cross so that you can combine crosses even if there are
  different numbers of chromosomes

o Turn the tutorial into a Sweave vignette, to conform to the
  Bioconductor requirements?

o Allow no p-values in summary.scantwo

o LOD thresholds for X chromosome in scanone/scantwo.

o Ensure consistency in use of chromosome numbers vs names/IDs when
  plotting results of genome scans, subsetting crosses, and so forth
  (sometimes #'s taken as indices and sometimes as names).

o Documentation on RI lines.


MEDIUM TERM:

o MIM for a set of QTLs at specified locations with specified
  interactions (as a new method in fitqtl and scanqtl).

o Incorporate the DIRECT algorithm stuff from Hao

o In MIM: allow return of SEs of effects.  Write coef.mim, resid.mim,
  and dev.mim to pull out the estimated coefficients, the residuals,
  and the "deviance" (2 * ln likelihood).

o In MIM, refinement of QTL location and plots of that.

o scanone with additive alleles at QTL

o Pull out results for an interval.

o Function to calculate variance due to QTL

o Modify the map expansion for RI lines for the X chromosome.
  [really need to add additional functions for mapping, as
   the marginal genotype distribution is 2:1 rather than 1:1]
  --the transition matrix is not symmetric 
  Pr(BB|AA) = 2r/(1+4r) and Pr(AA|BB) = 4r/(1+4r)

o Ability to get at the individual contributions to the LOD score? 

o Incorporate the code from Brian Yandell, Fei Zou and Amy Jin on
  semi-parametric QTL mapping methods.

o Have effectscan give output (silently)

o Allow plot.missing to give results color coded by marker genotype
  (like Saunak's cool plots).

o effectscan and effectplot: SEs and so forth using imputations

o effectscan: if one chromosome, plot map positions on the x-axis 
  rather than the chromosome ID.

o Add appropriate functions to analyze advanced intercrosses (AILs). 

o Analysis of binary traits by imputation

o Include widgets for getting easier access to the data.

o Calculate pairwise QTL probabilities by the simpler method,
  assuming independence, in scantwo.

o Permutation tests with scantwo should include the results comparing
  2 vs 1 QTL.

o Modify plot.rf and plot.errorlod to allow plot of a color scale, as
  in plot.scantwo.  

o Add a FAQ document to the R/qtl web page
    - Reading in data
    - Haley-Knott vs non-parametric vs EM
    - Do you plan to incorporate _____?
    - Extremely large LOD scores by EM
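
  Note on the RI-line X chromosome item above: the two transition
  probabilities given there can be checked against the stated 2:1
  marginal distribution.  The following is a minimal sketch in
  Python (not R/qtl code); the function names are hypothetical, and
  the state order (AA, BB) is an assumption made for illustration.

```python
def ri_x_transition(r):
    """Transition matrix between adjacent loci, states ordered (AA, BB).

    Uses the probabilities from the TODO item:
    Pr(BB|AA) = 2r/(1+4r) and Pr(AA|BB) = 4r/(1+4r).
    """
    denom = 1.0 + 4.0 * r
    p_aa_to_bb = 2.0 * r / denom
    p_bb_to_aa = 4.0 * r / denom
    return [[1.0 - p_aa_to_bb, p_aa_to_bb],
            [p_bb_to_aa, 1.0 - p_bb_to_aa]]


def stationary(tm):
    """Stationary distribution of a two-state chain (detailed balance)."""
    p_aa_to_bb = tm[0][1]
    p_bb_to_aa = tm[1][0]
    total = p_aa_to_bb + p_bb_to_aa
    return [p_bb_to_aa / total, p_aa_to_bb / total]


def recomb_prob(r):
    """Probability that adjacent loci carry different genotypes."""
    tm = ri_x_transition(r)
    pi = stationary(tm)
    return pi[0] * tm[0][1] + pi[1] * tm[1][0]
```

  Under these probabilities the stationary distribution is 2/3 : 1/3
  for any r > 0, which matches the 2:1 marginal noted in the item,
  and the matrix is not symmetric.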


LONG TERM:

o HMM stuff for BCn data.

o Allow phenotypes on multiple individuals (esp for recombinant inbred
  lines). 

o "embarrassingly parallel" processing for permutation tests (Rmpi, snow)

o Composite interval mapping, in an automated way.

o Imprinting/parent-of-origin effects.

o Treating a covariate as a random effect.

o Multiple phenotypes (esp. regarding pleiotropy).

o Model search for MIM etc...forward and stepwise selection.

o Function to plot, for a specified q1, LOD{q2|q1} vs q2 (using the
  output from scantwo).

o Take the fit of the null model outside of the C code for
  the imputation method in scanone and scantwo, so that it
  only has to be done once (rather than for each chr or chr pair).

o Starting values for EM for the two-part model (and more generally).
  Allow the option of an automatic selection of multiple starting
  points. 

o Generalized linear models in scanone and scantwo.

o Analysis functions such as scanone and scantwo might assign an
  attribute to their output which identifies the input data and/or
  function call.

o Re-write the C code for EM underneath scanone and scantwo so that it
  is not so tedious.
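
  Note on the parallel permutation test item above: permutation
  replicates are independent of one another, which is why they can
  be farmed out to workers.  The following is a rough sketch in
  Python (not R/qtl code, and not using Rmpi or snow).  The toy test
  statistic, the 0/1 genotype coding, and all function names are
  assumptions made for illustration only.

```python
import random
from functools import partial


def max_stat(pheno, geno):
    """Toy genome-wide statistic: the largest absolute difference in
    mean phenotype between the two genotype groups, over all markers.
    geno is a list of markers, each a list of 0/1 codes per individual."""
    best = 0.0
    for marker in geno:
        g0 = [y for y, g in zip(pheno, marker) if g == 0]
        g1 = [y for y, g in zip(pheno, marker) if g == 1]
        if g0 and g1:
            diff = abs(sum(g1) / len(g1) - sum(g0) / len(g0))
            best = max(best, diff)
    return best


def one_replicate(seed, pheno, geno):
    """One permutation replicate: shuffle phenotypes, recompute the maximum.
    Each replicate has its own seed, so it does not depend on the others."""
    rng = random.Random(seed)
    shuffled = list(pheno)
    rng.shuffle(shuffled)
    return max_stat(shuffled, geno)


def permutation_maxima(pheno, geno, n_perm, base_seed=0, map_fn=map):
    """Run n_perm replicates; map_fn may be the built-in map (serial)
    or the map method of a worker pool (parallel)."""
    task = partial(one_replicate, pheno=pheno, geno=geno)
    seeds = [base_seed + i for i in range(n_perm)]
    return list(map_fn(task, seeds))
```

  Passing the map method of a multiprocessing.Pool as map_fn should
  distribute the replicates across processes; because each replicate
  is seeded separately, the results do not depend on how the work is
  split.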

----------------------------------------------------------------------
end of TODO.txt
