OCEAN is a flexible feature set testing method for the analysis of multi-omics data. For a pair of omics, either a matrix of pairwise associations is derived from a pair of pre-processed omics data sets (using corPs, embedded in ocean) or a p-value matrix provided by the user is used. This matrix is the input for simesCT, which calculates the closed testing parameters based on Simes local tests for a given alpha. Then, for any given two-way feature set, three error rates are calculated using the ocean function. The three error rates are TDP: the proportion of omics pairs that are truly associated; row-TDP: the proportion of rows that are associated with at least one column; column-TDP: the proportion of columns that are associated with at least one row. TDP is calculated as an extension of the SEA algorithm (see the SEA R package). For row-TDP and column-TDP, a lower bound (B) and a heuristic (H) are calculated. Optionally, if the result is not exact (B is not equal to H), the branch and bound algorithm can be run (using runBaB, embedded in ocean) to get an exact result. There is no limit on the number of feature sets that can be tested, and the family-wise error rate is always controlled at level alpha, as set in the first step.
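To make the three proportions concrete, here is a minimal base R sketch that computes them for a toy feature set. It assumes a hypothetical ground-truth matrix `truth` (not part of the package) marking which pairs are truly associated. In real data this matrix is unknown, which is why the quantities have to be inferred from p-values.

```r
# Hypothetical ground truth for a 4 x 5 two-way feature set:
# TRUE marks a truly associated (row, column) pair.
truth <- matrix(FALSE, nrow = 4, ncol = 5)
truth[1, 1:3] <- TRUE
truth[2, 3]   <- TRUE

# TDP: proportion of pairs that are truly associated
pair_tdp <- mean(truth)               # 4 of 20 pairs = 0.2

# row-TDP: proportion of rows associated with at least one column
row_tdp <- mean(rowSums(truth) > 0)   # 2 of 4 rows = 0.5

# column-TDP: proportion of columns associated with at least one row
col_tdp <- mean(colSums(truth) > 0)   # 3 of 5 columns = 0.6
```

The same four associated pairs give three different proportions, depending on whether pairs, rows, or columns are counted.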
You can install the development version of OCEAN from GitHub with:
install.packages("devtools")
#install the package from GitHub
devtools::install_github("mitra-ep/rOCEAN")
library(rOCEAN)
#number of features per omics data set
n_cols<-1000
n_rows<-1200
#random matrix of p-values
set.seed(1258)
pvalmat<-matrix(runif(n_rows*n_cols, min=0, max=1)^3, nrow=n_rows, ncol=n_cols)
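As background for the next step, the Simes local test that the closed testing procedure builds on can be sketched in a few lines of base R. This is an illustration of the standard Simes test only. The helper `simes_reject` and the p-value vectors are hypothetical, and the sketch does not show how the package computes its parameters internally.

```r
# Simes local test for the intersection of m hypotheses (base R sketch).
# Rejects when some ordered p-value p_(i) is at most i * alpha / m.
simes_reject <- function(p, alpha = 0.05) {
  m <- length(p)
  any(sort(p) <= seq_len(m) * alpha / m)
}

# Hypothetical p-values, not taken from the example matrix
p_signal <- c(0.001, 0.012, 0.300, 0.800)
p_null   <- c(0.200, 0.450, 0.610, 0.900)

simes_reject(p_signal)  # TRUE: 0.001 <= 1 * 0.05 / 4
simes_reject(p_null)    # FALSE: no ordered p-value falls below its threshold
```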
Calculate the closed testing parameters:
Calculate TDPs for an imaginary two-way feature set.
#> pair-TDP done.
#> p-categories matrix for rows ready.
#> row-TDP done.
#> p-categories matrix for columns ready.
#> Running BaB for column-TDP...
#> column-TDP done.
#> $Pairs
#> pTDP
#> 0.007936508
#>
#> $Rows
#> row-TDP nStep
#> 0.5061728 1.0000000
#>
#> $Columns
#> cHeuristic cBound nStep
#> 0.3839286 0.3750000 2.0000000
In the example above, nMax=2, so only 2 steps of BaB were applied for
column-TDP. We can increase the number of steps to narrow the gap
between the bound and the heuristic:
#> p-categories matrix for columns ready.
#> Running BaB for column-TDP...
#> column-TDP done.
#> $Columns
#> cHeuristic cBound nStep
#> 0.3839286 0.3750000 100.0000000
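One way to read this output is to compare the bound with the heuristic directly. The sketch below copies the printed values into a plain named vector (`col_res` is a hypothetical name, not an object returned by the package) and checks whether the result is exact, that is, whether the two values coincide.

```r
# Values copied from the printed $Columns output above
col_res <- c(cHeuristic = 0.3839286, cBound = 0.3750000, nStep = 100)

# Difference between the heuristic and the lower bound
gap <- col_res[["cHeuristic"]] - col_res[["cBound"]]

# The result is exact only when the two values coincide
is_exact <- isTRUE(all.equal(col_res[["cHeuristic"]], col_res[["cBound"]]))

gap       # about 0.0089
is_exact  # FALSE
```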
Calculate TDPs for a case where the initial result is not exact (the bound and the heuristic differ), so branch and bound is applied for row-TDP.
#> p-categories matrix for rows ready.
#> Running BaB for row-TDP...
#> row-TDP done.
#> $Rows
#> rHeuristic rBound nStep
#> 0.9600000 0.9466667 100.0000000