
"To do" list for R/qtl
----------------------------------------------------------------------
This file is intended to contain a list of many of the additions and
revisions that are planned for the R/qtl package.  

If you have any additions or revisions to suggest, please send an email to
Karl Broman, <kbroman@jhsph.edu>.
----------------------------------------------------------------------
 
SHORT TERM:

o Finish off the work to get coefficient estimates by imputation in
  fitqtl for the X chromosome in BC and F2. 

o Fix scantwo for the case that incl.markers=TRUE, so that positions
  are not equally spaced, but are according to the genetic map.

o geno.table(): the X chromosome needs special treatment.

o P-values from geno.table() when an intercross has some dominant
  markers.  

o summary.scantwo doesn't work if there are just two positions on a
  chromosome (or maybe it's for one position).

o Changed default for plot.scantwo to lower="joint"; check that the
  tutorial reflects this.  

o An NA in the mapmaker data file caused an error in read.cross;
  the line became too long.  Maybe this is true whenever an item
  doesn't match what is expected.

o Add sample data to the web site.

o Speed up read.cross.mm; deliver meaningful errors if map/genotypes
  don't match, and if there are too many genotypes in a row.

o Documentation for scanone() and scantwo() regarding the X chromosome.

o Deal more cleanly with missing values in the output to scantwo.
  Make the warning message clearer, and perhaps don't automatically
  set them to 0.

o Incorporate the DIRECT algorithm stuff from Hao

o Add attributes regarding degrees of freedom and null log10lik
  to the output of scanone and scantwo.

o Revise the a.starting.point a bit.

o Revise the documentation regarding the new version of plot.scantwo. 

o Documentation for makeqtl/fitqtl/scanqtl (especially summary.fitqtl).

o Turn the tutorial into a Sweave vignette, to conform to the
  Bioconductor requirements?

o Bug in the warning message regarding switched genotypes in est.rf?
  (especially for 4-way crosses)

o Allow no p-values in summary.scantwo

o Documentation on how to do thresholds a la Gary's method for
  scantwo.  

o LOD thresholds for X chromosome in scanone/scantwo.

o Treatment of X chromosome in fitqtl, scanqtl?

o Add an example regarding the X chromosome to the help files for
  fake.f2 and scanone.

o Go through makeqtl, fitqtl, scanqtl carefully.

o Ensure consistency in use of chromosome numbers vs names/IDs when
  plotting results of genome scans, subsetting crosses, and so forth
  (sometimes #'s taken as indices and sometimes as names).

o Documentation on RI lines.


MEDIUM TERM:

o Data input format with two CSV files: one for phenotypes and one for
  genotypes.  Also, a CSV format that is the transpose of the current
  format, for things like RILs with many markers but few individuals.

o Pull out results for an interval.

o Function to calculate variance due to QTL

o Modify the map expansion for RI lines for the X chromosome.
  [really need to add additional functions for mapping, as
   the marginal genotype distribution is 2:1 rather than 1:1]
  -- the transition matrix is not symmetric:
     Pr(BB|AA) = 2r/(1+4r) and Pr(AA|BB) = 4r/(1+4r)

o Incorporate the code from Brian Yandell, Fei Zou and Amy Jin on
  semi-parametric QTL mapping methods.

o Allow use of individual IDs; a column "id" or "ID" in the phenotype
  data.  We could refer to these in functions like top.errorlod.

o Have effectscan give output (silently)

o Allow tailored allele labels (e.g. CC/CB/BB rather than AA/AB/BB)

o Allow plot.missing to give results color coded by marker genotype
  (like Saunak's cool plots).

o Negative LOD scores in fitqtl()?

o Ability to get at the individual contributions to the LOD score? 

o Revise c.cross so that you can combine crosses even if there are
  different numbers of chromosomes: match the marker names.

o effectscan and effectplot: SEs and so forth using imputations

o effectscan: if one chromosome, plot map positions on the x-axis 
  rather than the chromosome ID.

o Add appropriate functions to analyze advanced intercrosses (AILs). 

o Analysis of binary traits by imputation

o Pictures which indicate the locations of crossovers (e.g. in
  plot.geno). 

o Include widgets for easier access to the data.

o Calculate pairwise QTL probabilities by the simpler method,
  assuming independence, in scantwo.

o Permutation tests with scantwo should include the results comparing
  2 vs 1 QTL.

o MIM for a set of QTLs at specified locations with specified
  interactions (as a new method in fitqtl and scanqtl).

o refineqtl -- like fitqtl, but refines the locations of the QTLs as
  in Zhao-Bang's MIM

o Deal with the X chromosome appropriately in scantwo with method="em"

o max.scanqtl, summary.scanqtl, print.summary.scanqtl, plot.scanqtl

o Modify plot.rf and plot.errorlod to allow plot of a color scale, as
  in plot.scantwo.  

o In MIM: allow return of SEs of effects.  Write coef.mim, resid.mim,
  and dev.mim to pull out the est'd coefficients, the residuals, and
  the "deviance" (2 * ln likelihood).

o In MIM, refinement of QTL location and plots of that.

o Add a FAQ document to the R/qtl web page
    - Reading in data
    - Haley-Knott vs non-parametric vs EM
    - Do you plan to incorporate _____?
    - Extremely large LOD scores by EM

o calc.genoprob and sim.geno for "f2ss" (intercross with sex-specific
  maps) 
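
  Note on the RI-line X chromosome item above: a quick numerical
  check, written as a Python sketch rather than R/qtl code (the
  function names are made up for illustration).  Taking the two
  transition probabilities exactly as written in that item, the
  stationary distribution of the chain comes out 2:1 for AA:BB,
  which agrees with the marginal distribution mentioned there, and
  the overall proportion of recombinants works out to
  (8/3)r/(1+4r).

```python
# Sketch only: the two transition probabilities are copied from the
# to-do item; the function names are hypothetical, not part of R/qtl.

def ri_x_transition(r):
    """2x2 transition matrix between adjacent loci, states (AA, BB).

    r is the per-meiosis recombination fraction (assumed 0 < r <= 0.5).
    """
    p_ab = 2.0 * r / (1.0 + 4.0 * r)  # Pr(BB | AA)
    p_ba = 4.0 * r / (1.0 + 4.0 * r)  # Pr(AA | BB)
    return [[1.0 - p_ab, p_ab],
            [p_ba, 1.0 - p_ba]]


def stationary(P):
    """Stationary distribution (AA, BB) of a two-state chain."""
    p_ab, p_ba = P[0][1], P[1][0]
    total = p_ab + p_ba  # zero only if r == 0, which is excluded above
    return [p_ba / total, p_ab / total]


def recombinant_fraction(r):
    """Probability that adjacent loci differ in the RI line."""
    P = ri_x_transition(r)
    pi = stationary(P)
    return pi[0] * P[0][1] + pi[1] * P[1][0]


if __name__ == "__main__":
    for r in (0.01, 0.1, 0.5):
        print(r, stationary(ri_x_transition(r)), recombinant_fraction(r))
```

  Because the matrix is not symmetric, any mapping code for this
  case would need to track which allele is on which side of an
  interval, not just whether the two loci differ.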


LONG TERM:

o Allow phenotypes on multiple individuals (esp. for recombinant inbred
  lines).

o "embarrassingly parallel" processing for permutation tests (Rmpi, snow)

o Composite interval mapping, in an automated way.

o Imprinting/parent-of-origin effects.

o Treating a covariate as a random effect.

o Multiple phenotypes (esp. regarding pleiotropy).

o Data conversion functions to/from Chuck Berry's bqtl package

o Model search for MIM etc...forward and stepwise selection.

o Function to plot, for a specified q1, LOD{q2|q1} vs q2 (using the
  output from scantwo).

o Take the fit of the null model outside of the C code for
  the imputation method in scanone and scantwo, so that it
  only has to be done once (rather than for each chr or chr pair).

o Starting values for EM for the two-part model (and more generally).
  Allow the option of an automatic selection of multiple starting
  points. 

o Generalized linear models in scanone and scantwo.

o Analysis functions such as scanone and scantwo might assign an
  attribute to their output which identifies the input data and/or
  function call.

o Re-write the C code for EM underneath scanone and scantwo so that it
  is not so tedious.

o Individual numbers in plot.geno function; allow it to plot only
  individuals with apparent genotyping errors, with a mix of
  chromosomes. 

----------------------------------------------------------------------
end of TODO.txt
