rsem-control-fdr - Filter EBSeq output for statistical
significance.
rsem-control-fdr [options] input_file fdr_rate output_file
- input_file
- This should be the main result file generated by 'rsem-run-ebseq', which
contains all genes/transcripts and their associated statistics.
- fdr_rate
- The desired false discovery rate (FDR).
- output_file
- A subset of 'input_file' that contains only the
genes/transcripts called as differentially expressed (DE). When more than
two conditions exist, a gene/transcript is called DE if it is not equally
expressed across all conditions. Because statistical significance does not
necessarily imply biological significance, users should also consult the
fold changes to decide which genes/transcripts are biologically
significant. When more than two conditions exist, this file does not
contain fold change information, and users need to calculate it themselves
from 'input_file.condmeans'.
- --hard-threshold
- Use the hard-threshold method to control FDR. If this option is set, only
genes/transcripts whose PPDE (posterior probability of being DE) is
>= 1 - fdr_rate are called DE. (Default: on)
- --soft-threshold
- Use the soft-threshold method to control FDR. If this option is set, the
program reports as many genes/transcripts as possible, as long
as their average PPDE >= 1 - fdr_rate. This option is equivalent to using
EBSeq's 'crit_fun' for FDR control. (Default: off)
- -h/--help
- Show help information.
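
The difference between the two threshold methods can be illustrated with a
minimal Python sketch. It is not the code used by rsem-control-fdr; the
function names and the in-memory dictionary of PPDE values are assumptions
made for illustration, and it only follows the rules stated above: the hard
threshold tests each PPDE on its own, while the soft threshold keeps the
longest top-ranked list whose average PPDE stays at or above 1 - fdr_rate.

```python
def hard_threshold(ppde, fdr_rate):
    """Keep every ID whose own PPDE is at least 1 - fdr_rate."""
    cutoff = 1.0 - fdr_rate
    return [name for name, p in ppde.items() if p >= cutoff]


def soft_threshold(ppde, fdr_rate):
    """Keep the longest top-ranked list whose average PPDE is >= 1 - fdr_rate."""
    cutoff = 1.0 - fdr_rate
    # Rank from most to least confident so the running average only decreases.
    ranked = sorted(ppde.items(), key=lambda item: item[1], reverse=True)
    selected = []
    total = 0.0
    for name, p in ranked:
        # Stop as soon as adding the next ID would pull the average below the cutoff.
        if (total + p) / (len(selected) + 1) < cutoff:
            break
        total += p
        selected.append(name)
    return selected


# Hypothetical PPDE values, for illustration only.
example_ppde = {"g1": 0.99, "g2": 0.97, "g3": 0.92, "g4": 0.50}
hard_calls = hard_threshold(example_ppde, 0.05)
soft_calls = soft_threshold(example_ppde, 0.05)
```

With these made-up values, 'g3' fails the hard threshold because its own
PPDE is below 0.95, but the soft threshold still reports it because the
average PPDE of 'g1', 'g2' and 'g3' is 0.96.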
This program controls the false discovery rate and reports
differentially expressed genes/transcripts.
Assume that 'GeneMat.results' is the input file. To control FDR at 0.05
using the hard-threshold method and name the output file
'GeneMat.de.txt', run:
rsem-control-fdr GeneMat.results 0.05 GeneMat.de.txt
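
When more than two conditions exist, fold changes have to be derived from
the per-condition means. The Python sketch below shows one possible way to
do this for a single gene/transcript once its condition means have been
read from 'input_file.condmeans'. The function name, the dictionary input,
and the pseudocount that guards against division by zero are all
assumptions made for illustration; they are not part of RSEM or EBSeq, and
the layout of the condmeans file is deliberately not parsed here.

```python
import math
from itertools import combinations


def pairwise_log2_fold_changes(cond_means, pseudocount=1.0):
    """Return log2 fold changes for every pair of conditions.

    cond_means maps a condition name to the mean expression of one
    gene/transcript. The pseudocount is an illustrative choice that keeps
    the ratio defined when a condition mean is zero.
    """
    fold_changes = {}
    for first, second in combinations(cond_means, 2):
        ratio = (cond_means[first] + pseudocount) / (cond_means[second] + pseudocount)
        fold_changes[(first, second)] = math.log2(ratio)
    return fold_changes


# Hypothetical condition means for one gene, for illustration only.
example_means = {"C1": 7.0, "C2": 3.0, "C3": 0.0}
example_fc = pairwise_log2_fold_changes(example_means)
```

A positive value for the pair (first, second) means higher expression in
the first condition. A different pseudocount changes the values for
lowly expressed genes/transcripts, so it should be chosen with care.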