StepMiner command line options

Copied from Appendix C of my thesis:
Ph.D. Thesis: Boolean analysis of high-throughput biological datasets
Ph.D. Advisor: David L. Dill, Co-advisor: Sylvia K. Plevritis
Download the StepMiner jar files:
stepminer-1.0.jar
stepminer-1.1.jar

prompt> java -jar stepminer-1.0.jar
Usage: tools.Analyze [-d/--debug] <filename>
                     [-r/--reduceLog] <filename>
                     --gui [<filename>]
                     [options] <filename>
Options:
    [-o/--outfile <file.ano>] [-o/--outfile <file.pcl>]
    [-o/--outfile [<tag>:]<file.ann>] [-o/--outfile <file.exp>]
    [-o/--outfile ] [-o/--outfile <file.gmx>]
    [-o/--outfile [:]] [-o/--outfile <file.png>]
    Example tags: All, Step, Up, Down, UpDown, DownUp, Rest
    [-t/--type <type>]
            <type> : OneStep, OneStepFdr, Fdr,
                     Order, Subset, ZeroCenter, MeanCenter,
                     None, ListGenes, Normalize,
                     KNNimpute, LLSimpute
    [--annFile <Gene Annotation File : 15 columns format>]
    [--onnFile <Ontology File : OBO format>]
    [--org <Organism: Mm/Hs/Sgd/Pombie>]
    [--geneIndex <arrayIndex of gene description>]
    [--splitString <Splitting regexp of the gene str>]
    [--splitIndex <Index of gene after splitting>]
    [--goPvalue <pvalue threshold of GOAnalysis>]
    [--range <ex: 4:17 Range of array indices for analysis>]
    [-p/--pvalue <pvalue threshold>]
    [--numMissing <Number of missing timepoints>]
    [--Intersect <file>]
    [--Union <file>]
    [--Select <file>] select ids with original order
    [--SelectOrder <file>] select ids with given order
    [--Diff <file>]
    [--SelectNames <file>] select names with original order

The following are a few command-line examples for running StepMiner.

prompt> java -Xms64m -Xmx512m -jar stepminer-1.0.jar
        - prints command line options

To invoke the GUI from the command line:

prompt> java -Xms64m -Xmx512m -jar stepminer-1.0.jar --gui

Time-course analysis of yeast.pcl, with the results saved in yeast-step.pcl:

java -Xms64m -Xmx512m -jar stepminer-1.0.jar \
     -t OneStep yeast.pcl -o yeast-step.pcl

StepMiner can also dump information about the calculated p-values for each gene, here to yeast-step.ano and yeast-step.ann:

java -Xms64m -Xmx512m -jar stepminer-1.0.jar \
     -t OneStep yeast.pcl -o yeast-step.ano -o yeast-step.ann

The columns in yeast-step.ann are as follows:

Name	-	probe id; corresponds to the first column of the PCL file.
num	-	number of timepoints.
numSteps	-	number of steps found.
geneIndex	-	the column index in the PCL file that has the gene name and the description of the gene.
pvalue	-	p-value for the fitted steps.
sstot	-	sum of squared errors for the no-step model (fitting with the mean).
sse	-	sum of squared errors for the fitted steps.
label	-	label of the matched step pattern:
	0 -	no significant step
	1 -	one step - Up
	2 -	one step - Down
	3 -	two step - UpDown
	4 -	two step - DownUp
step0	-	position of first step.
step1	-	position of the second step.
mean0	-	mean for the first segment.
mean1	-	mean for the second segment.
mean2	-	mean for the third segment.
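
To make the sstot, sse, step0, mean0 and mean1 columns concrete, here is a minimal Python sketch of a one-step fit by exhaustive search over split points. It is an illustration only, not StepMiner's own code: the function name fit_one_step, the convention that step0 is the index of the first point of the second segment, and the sample series are all assumptions, and no p-value is computed.

```python
def fit_one_step(values):
    """Fit a single step to a series by trying every split point (illustrative only)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two timepoints")

    # sstot: squared error of the no-step model (fitting with the overall mean).
    overall_mean = sum(values) / n
    sstot = sum((v - overall_mean) ** 2 for v in values)

    best = None
    # values[:k] is the first segment, values[k:] is the second segment.
    for k in range(1, n):
        left, right = values[:k], values[k:]
        mean0 = sum(left) / len(left)
        mean1 = sum(right) / len(right)
        # sse: squared error of the step model with one mean per segment.
        sse = (sum((v - mean0) ** 2 for v in left)
               + sum((v - mean1) ** 2 for v in right))
        if best is None or sse < best["sse"]:
            best = {"step0": k, "mean0": mean0, "mean1": mean1, "sse": sse}

    best["num"] = n
    best["sstot"] = sstot
    # Direction only; deciding "no significant step" would need a p-value test.
    best["direction"] = "Up" if best["mean1"] > best["mean0"] else "Down"
    return best


# Hypothetical series: low for three timepoints, then high for three.
fit = fit_one_step([1.0, 1.0, 1.0, 5.0, 5.0, 5.0])
print(fit)
```

A large drop from sstot to sse means the step model explains the series much better than a flat mean, which is the situation a small p-value in the pvalue column reflects.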

The columns in yeast-step.ano are as follows:

The first three columns are copied directly from the PCL file. The next five columns are as follows:

label	-	label of the matched step pattern:
	0 -	no significant step
	2 -	one step
	3 -	two step
dir	-	direction of the fitted steps:
	0 -	one step - Up
	1 -	one step - Down
	2 -	two step - UpDown
	3 -	two step - DownUp
step1	-	position of first step.
step2	-	position of the second step.
pvalue	-	p-value for the fitted steps.

The rest of the columns are directly copied from the PCL file.
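
The column layout above can be read back with a few lines of Python. This is a sketch under assumptions: the file is taken to be tab-delimited with the columns in exactly the order listed, header rows are not handled, and the function name parse_ano_row and the sample row are made up for illustration.

```python
# Code tables taken from the column descriptions above.
ANO_LABELS = {0: "no significant step", 2: "one step", 3: "two step"}
ANO_DIRS = {0: "Up", 1: "Down", 2: "UpDown", 3: "DownUp"}


def parse_ano_row(line):
    """Split one data row of an .ano file into its three parts (assumed tab-delimited)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("expected at least 8 tab-separated fields")
    label, direction, step1, step2, pvalue = fields[3:8]
    return {
        "pcl_head": fields[:3],            # first three columns, copied from the PCL file
        "label": ANO_LABELS.get(int(label), "unknown"),
        "dir": ANO_DIRS.get(int(direction), "unknown"),
        "step1": step1,                    # kept as text; the number format is not assumed
        "step2": step2,
        "pvalue": float(pvalue),
        "values": fields[8:],              # remaining columns, copied from the PCL file
    }


# Made-up row for illustration only.
sample = "YAL001C\tTFC3\t1\t2\t0\t3\t0\t0.001\t0.1\t0.2\t1.5\t1.6"
row = parse_ano_row(sample)
print(row["label"], row["dir"], row["pvalue"])
```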

GO Analysis

The following are more complex examples that perform GO analysis after StepMiner analysis:

java -Xms64m -Xmx512m -jar stepminer-1.0.jar \
     --onnFile "http://www.geneontology.org/ontology/gene_ontology.obo" \
     --annFile \
     "http://www.geneontology.org/gene-associations/gene_association.sgd.gz"\
     -o label.html --org "Sgd" \
     http://genepyramid.stanford.edu/home/public/StepMiner/yeast-batch1.pcl


java -Xms64m -Xmx512m -jar stepminer-1.0.jar \
     --onnFile "http://www.geneontology.org/ontology/gene_ontology.obo" \
     --annFile "gene_association.goa_human.gz" \
     --range 3:17 --geneIndex 1 --splitIndex 1 -o label.html --org "Hs" \
     http://genepyramid.stanford.edu/home/public/StepMiner/t-cell-control-cd3.pcl

Specifying replicates using StepMiner

Download stepminer-1.1.jar.
Analysis of four timepoints with three replicates each:

java -cp stepminer-1.1.jar tools.CustomAnalysis step \
     output.pcl yourfile.pcl pvalue 0.05 type TwoStep \
     timepoints "0x3,1x3,2x3,3x3"
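
One way to read the timepoints string is sketched below in Python. The interpretation is an assumption inferred from the example (four timepoints, three replicates each): each comma-separated entry "TxN" is taken to mean timepoint T measured on N consecutive arrays. The function name expand_timepoints is hypothetical, and StepMiner's own parser may accept other forms.

```python
def expand_timepoints(spec):
    """Expand a string such as "0x3,1x3" into one timepoint per array (assumed format)."""
    timepoints = []
    for part in spec.split(","):
        time, sep, count = part.strip().partition("x")
        if not sep:
            raise ValueError("expected entries of the form TxN, got %r" % part)
        # Repeat the timepoint once per replicate array.
        timepoints.extend([float(time)] * int(count))
    return timepoints


per_array = expand_timepoints("0x3,1x3,2x3,3x3")
print(per_array)
```

Under this reading, the example command describes 12 arrays, with arrays 1 to 3 at timepoint 0, arrays 4 to 6 at timepoint 1, and so on.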

Usage:
java -cp stepminer-1.1.jar tools.CustomAnalysis step \
     output.pcl yourfile.pcl [<command> <arg>]*

<command> <arg>:
    type        OneStep/TwoStep/BothStep/SelectTwoStep
    centering   NoCentering/Step
    range       n:m
    org         Hs/Mm/Sgd/Pombie/Dm/Affy/Card
    annFile     <Annotation file>
    onnFile     <Ontology file>
    timepoints  <timepoint string>
    pvalue      <number>
    goPvalue    <pvalue threshold for GO Analysis>
    fdr         true/false
    geneIndex   <index of PCL file that has the gene name>
    splitString <delimiter for the gene name>
    splitIndex  <index of gene name after splitting with splitString>
    numMissing  <number of missing values allowed>

More examples

To extract a geneset from StepMiner:
The gene descriptions are in the second column of the PCL file. Let's assume that the description format is "gene name: gene title", e.g. "CCNB2: Cyclin B2". StepMiner can extract the gene names from these descriptions as follows:

java -Xms64m -Xmx512m -jar stepminer-1.1.jar -p 0.01 expt.pcl -o expt.gmt \
     --splitString ":" --geneIndex 1 --splitIndex 0
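
The effect of the three options can be mimicked in a few lines of Python. This is a sketch of the idea, not StepMiner's code: it assumes zero-based indices (so index 1 is the second column), treats splitString as a regular expression as the usage text describes, and trims surrounding whitespace, which is an added assumption. The function name extract_gene_name and the probe id in the sample row are made up.

```python
import re


def extract_gene_name(row, gene_index, split_string, split_index):
    """Pick the description column, split it on a regexp, and return one piece."""
    description = row[gene_index]                 # --geneIndex
    parts = re.split(split_string, description)   # --splitString
    return parts[split_index].strip()             # --splitIndex


# Made-up PCL row: probe id, description, weight.
pcl_row = ["probe_1", "CCNB2: Cyclin B2", "1"]
gene = extract_gene_name(pcl_row, gene_index=1, split_string=":", split_index=0)
print(gene)
```

With split_index=1 the same call would return the gene title ("Cyclin B2") instead of the gene name.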

To run GO Analysis on the extracted geneset:

java -cp stepminer-1.1.jar tools.CustomAnalysis go \
    expt-goanalysis.html "gene_ontology.obo" \
    "gene_association.mgi" Mm 0.001 expt.gmt

Author: Debashis Sahoo
Stanford University