StepMiner command line options
Copied from Appendix C of my thesis:
Ph.D. Thesis: Boolean analysis of high-throughput biological datasets
Ph.D. Advisor: David L. Dill, Co-advisor: Sylvia K. Plevritis
Download StepMiner jar files:
stepminer-1.0.jar
stepminer-1.1.jar
prompt> java -jar stepminer-1.0.jar
Usage: tools.Analyze [-d/--debug] < filename>
[-r/--reduceLog] < filename>
--gui [< filename>]
[options] < filename>
Options:
[-o/--outfile < file.ano>] [-o/--outfile < file.pcl>]
[-o/--outfile [< tag>:]< file.ann>] [-o/--outfile < file.exp>]
[-o/--outfile ] [-o/--outfile < file.gmx>]
[-o/--outfile [:]] [-o/--outfile < file.png>]
Example tags: All, Step, Up, Down, UpDown, DownUp, Rest
[-t/--type < type>]
< type> : OneStep, OneStepFdr, Fdr,
Order, Subset, ZeroCenter, MeanCenter,
None, ListGenes, Normalize,
KNNimpute, LLSimpute
[--annFile < Gene Annotation File : 15 columns format>]
[--onnFile < Ontology File : OBO format>]
[--org < Organism: Mm/Hs/Sgd/Pombie>]
[--geneIndex < arrayIndex of gene description>]
[--splitString < Splitting regexp of the gene str>]
[--splitIndex < Index of gene after splitting>]
[--goPvalue < pvalue threshold of GOAnalysis>]
[--range < ex: 4:17 Range of array indices for analysis>]
[-p/--pvalue < pvalue threshold>]
[--numMissing < Number of missing timepoints>]
[--Intersect < file>]
[--Union < file>]
[--Select < file>] select ids with original order
[--SelectOrder < file>] select ids with given order
[--Diff < file>]
[--SelectNames < file>] select names with original order
The following are a few command-line examples for running StepMiner.
prompt> java -Xms64m -Xmx512m -jar stepminer-1.0.jar
- prints command line options
To invoke the GUI from the command line:
prompt> java -Xms64m -Xmx512m -jar stepminer-1.0.jar --gui
Run a time-course analysis of yeast.pcl and save the results in yeast-step.pcl:
java -Xms64m -Xmx512m -jar stepminer-1.0.jar \
-t OneStep yeast.pcl -o yeast-step.pcl
StepMiner can also write the calculated p-values and fit details for each gene to .ano and .ann files:
java -Xms64m -Xmx512m -jar stepminer-1.0.jar \
-t OneStep yeast.pcl -o yeast-step.ano -o yeast-step.ann
The columns in yeast-step.ann are as follows:
Name      - probe id; corresponds to the first column of the PCL file.
num       - number of timepoints.
numSteps  - number of steps found.
geneIndex - column index in the PCL file that holds the gene name and description.
pvalue    - p-value for the fitted steps.
sstot     - sum of squared errors for the no-step fit (fitting with the mean).
sse       - sum of squared errors for the fitted steps.
label     - label of the matched pattern:
            0 - no significant step
            1 - one step - Up
            2 - one step - Down
            3 - two step - UpDown
            4 - two step - DownUp
step0     - position of the first step.
step1     - position of the second step.
mean0     - mean of the first segment.
mean1     - mean of the second segment.
mean2     - mean of the third segment.
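To make the sstot, sse, step0, mean0 and mean1 columns concrete, here is a minimal Python sketch of a one-step fit by exhaustive search. It is an illustration only, not StepMiner's own code: the function name fit_one_step is hypothetical, the p-value calculation is left out, and reporting the step position as the index of the first timepoint after the step is an assumption.

```python
def fit_one_step(values):
    """Fit a single step to a time series by trying every split point.

    Returns (step0, mean0, mean1, sstot, sse), named after the .ann columns.
    Assumption: step0 is the index of the first value of the second segment.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two timepoints")

    # sstot: squared error of the no-step model (one overall mean).
    overall_mean = sum(values) / n
    sstot = sum((v - overall_mean) ** 2 for v in values)

    best = None
    for k in range(1, n):
        # values[:k] is the first segment, values[k:] the second.
        left, right = values[:k], values[k:]
        mean0 = sum(left) / len(left)
        mean1 = sum(right) / len(right)
        # sse: squared error when each segment is fitted with its own mean.
        sse = (sum((v - mean0) ** 2 for v in left)
               + sum((v - mean1) ** 2 for v in right))
        if best is None or sse < best[4]:
            best = (k, mean0, mean1, sstot, sse)
    return best


step0, mean0, mean1, sstot, sse = fit_one_step([1.0, 1.0, 1.0, 5.0, 5.0, 5.0])
direction = "Up" if mean1 > mean0 else "Down"
print(step0, mean0, mean1, sstot, sse, direction)
```

A large drop from sstot to sse means the step explains most of the variation; the pvalue column reports how significant that improvement is.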
The columns in yeast-step.ano are as follows:
The first three columns are copied directly from the PCL file. The next five columns are:
label  - label of the matched pattern:
         0 - no significant step
         2 - one step
         3 - two step
dir    - direction of the fitted step(s):
         0 - one step - Up
         1 - one step - Down
         2 - two step - UpDown
         3 - two step - DownUp
step1  - position of the first step.
step2  - position of the second step.
pvalue - p-value for the fitted steps.
The rest of the columns are copied directly from the PCL file.
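The .ann and .ano files use different codes for the step category: .ann has a single label column (0 to 4), while .ano has a label column (0, 2, 3) plus a dir column (0 to 3). The following Python sketch turns either encoding into the same description string. The function names are hypothetical, and it is an assumption that .ano label 2 always pairs with dir 0 or 1 and label 3 with dir 2 or 3.

```python
# Category descriptions taken from the .ann and .ano column tables.
ANN_LABELS = {
    0: "no significant step",
    1: "one step - Up",
    2: "one step - Down",
    3: "two step - UpDown",
    4: "two step - DownUp",
}
ANO_DIRS = {
    0: "one step - Up",
    1: "one step - Down",
    2: "two step - UpDown",
    3: "two step - DownUp",
}
# Assumption: .ano label 2 (one step) goes with dir 0/1,
# and label 3 (two step) goes with dir 2/3.
ANO_DIRS_FOR_LABEL = {2: (0, 1), 3: (2, 3)}


def describe_ann(label):
    """Describe the label column of a .ann row."""
    return ANN_LABELS[label]


def describe_ano(label, direction):
    """Describe the (label, dir) pair of a .ano row."""
    if label == 0:
        # Assumption: dir carries no information when no step was found.
        return ANN_LABELS[0]
    allowed = ANO_DIRS_FOR_LABEL.get(label)
    if allowed is None or direction not in allowed:
        raise ValueError(f"unexpected .ano label/dir pair: {label}/{direction}")
    return ANO_DIRS[direction]


print(describe_ann(3))
print(describe_ano(3, 2))
```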
GO Analysis
The following more complex examples perform GO analysis after the StepMiner analysis:
java -Xms64m -Xmx512m -jar stepminer-1.0.jar \
--onnFile "http://www.geneontology.org/ontology/gene_ontology.obo" \
--annFile \
"http://www.geneontology.org/gene-associations/gene_association.sgd.gz" \
-o label.html --org "Sgd" \
http://genepyramid.stanford.edu/home/public/StepMiner/yeast-batch1.pcl
java -Xms64m -Xmx512m -jar stepminer-1.0.jar \
--onnFile "http://www.geneontology.org/ontology/gene_ontology.obo" \
--annFile "gene_association.goa_human.gz" \
--range 3:17 --geneIndex 1 --splitIndex 1 -o label.html --org "Hs" \
http://genepyramid.stanford.edu/home/public/StepMiner/t-cell-control-cd3.pcl
Specifying replicates using StepMiner
Download stepminer-1.1.jar.
Analysis of four timepoints with three replicates each:
java -cp stepminer-1.1.jar tools.CustomAnalysis step \
output.pcl yourfile.pcl pvalue 0.05 type TwoStep \
timepoints "0x3,1x3,2x3,3x3"
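The timepoints string in this example can be read as a comma-separated list of "<timepoint>x<replicates>" items. The Python sketch below expands such a string into one timepoint per array column. It shows that reading only and is not StepMiner's parser: the function name expand_timepoints is hypothetical, and it is an assumption that replicates of the same timepoint sit in adjacent columns in the order given.

```python
def expand_timepoints(spec):
    """Expand "0x3,1x3" into [0.0, 0.0, 0.0, 1.0, 1.0, 1.0].

    Assumption: each item has the form "<timepoint>x<replicate count>".
    """
    columns = []
    for item in spec.split(","):
        time_text, sep, count_text = item.strip().partition("x")
        if not sep:
            raise ValueError(f"expected '<timepoint>x<count>', got {item!r}")
        # One entry per replicate, all sharing the same timepoint.
        columns.extend([float(time_text)] * int(count_text))
    return columns


columns = expand_timepoints("0x3,1x3,2x3,3x3")
print(len(columns), "arrays at", len(set(columns)), "distinct timepoints")
```

Under this reading, the string above describes 12 array columns covering 4 distinct timepoints.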
Usage:
java -cp stepminer-1.1.jar tools.CustomAnalysis step \
output.pcl yourfile.pcl [< command> < arg>]*
< command> < arg>:
type OneStep/TwoStep/BothStep/SelectTwoStep
centering NoCentering/Step
range n:m
org Hs/Mm/Sgd/Pombie/Dm/Affy/Card
annFile < Annotation file>
onnFile < Ontology file>
timepoints < timepoint string>
pvalue < number>
goPvalue < pvalue threshold for GO Analysis>
fdr true/false
geneIndex < index of PCL file that has the gene name>
splitString < delimiter for the gene name>
splitIndex < index of gene name after splitting with splitString>
numMissing < number of missing values allowed>
More examples
To extract a gene set from StepMiner:
The gene descriptions are in the second column of the PCL file (--geneIndex 1).
Let's assume that the description format is "gene name: gene title", e.g. "CCNB2: Cyclin B2". StepMiner can extract the gene names from this format
as follows:
java -Xms64m -Xmx512m -jar stepminer-1.1.jar -p 0.01 expt.pcl -o expt.gmt \
--splitString ":" --geneIndex 1 --splitIndex 0
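The way --geneIndex, --splitString and --splitIndex work together in the command above can be sketched in Python as follows. This is an illustration under assumptions, not StepMiner's code: the function name and the sample row are hypothetical, indices are taken to count from zero (as the values 1 and 0 in the command suggest), and stripping whitespace from the result is an added convenience. StepMiner is a Java program, so its regular-expression dialect may differ from Python's for more complex patterns.

```python
import re


def extract_gene_name(row, gene_index, split_string, split_index):
    """Pick the gene name out of one PCL row.

    row          - list of column values for one probe
    gene_index   - column holding the gene description (zero-based, assumed)
    split_string - regular expression used to split the description
    split_index  - which piece to keep after splitting (zero-based, assumed)
    """
    description = row[gene_index]
    parts = re.split(split_string, description)
    # Stripping whitespace is a convenience added here, not a documented step.
    return parts[split_index].strip()


# Hypothetical PCL row: probe id, gene description, weight.
row = ["probe_1", "CCNB2: Cyclin B2", "1"]
print(extract_gene_name(row, gene_index=1, split_string=":", split_index=0))
```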
To run GO analysis on the extracted gene set:
java -cp stepminer-1.1.jar tools.CustomAnalysis go \
expt-goanalysis.html "gene_ontology.obo" \
"gene_association.mgi" Mm 0.001 expt.gmt
Author: Debashis Sahoo
Stanford University