rockhopper - system for analyzing bacterial RNA-seq data (command
line tool)
rockhopper is a comprehensive and user-friendly system for
computational analysis of bacterial RNA-seq data. As input, it takes RNA
sequencing reads output by high-throughput sequencing technology (FASTQ,
QSEQ, FASTA, SAM, or BAM files).
If the -g option is used, rockhopper aligns reads to one or
more reference genomes; otherwise, rockhopper performs de novo transcript
assembly.
- -g <DIR1,DIR2>
- a comma-separated list of directories, each containing a genome file
(*.fna), gene file (*.ptt), and RNA file (*.rnt)
- -c <boolean>
- reverse complement single-end reads (default is false)
- -ff, -fr, -rf, -rr
- orientation of the two mate reads for paired-end reads, f=forward and
r=reverse_complement (default is fr)
- -d <integer>
- maximum number of bases between mate pairs for paired-end reads (default
is 500)
- -a <boolean>
- identify one alignment (true) or all optimal alignments (false)
(default is true)
- -p <integer>
- number of processors (default is the number detected automatically)
- -e <boolean>
- compute differential expression for transcripts in pairs of experimental
conditions (default is true)
- -s <boolean>
- RNA-seq experiments are strand specific (true) or strand ambiguous
(false) (default is true)
- -L <comma separated list>
- labels for each condition
- -o <DIR>
- directory where output files are written (default is
Rockhopper_Results/)
- -v <boolean>
- verbose output including raw/normalized counts aligning to each gene
(default is false)
- -SAM
- output a SAM format file
- -TIME
- output time taken to execute program
- -m <number>
- allowed mismatches as a fraction of read length (default is 0.15)
- -l <number>
- minimum seed length as a fraction of read length (default is 0.33)
- -y <boolean>
- compute operons (default is true)
- -t <boolean>
- identify transcript boundaries including UTRs and ncRNAs (default is
true)
- -z <number>
- minimum expression of UTRs and ncRNAs, a number in range [0.0, 1.0]
(default is 0.5)
- -k <integer>
- size of k-mer, range of values is 15 to 31 (default is 25)
- -j <integer>
- minimum length required to use a sequencing read after trimming/processing
(default is 35)
- -n <integer>
- size of k-mer hashtable is ~ 2^n (default is 25). HINT: should normally be
25 or, if more memory is available, 26. WARNING: if increased above 25
then more than 1.2M of memory must be allocated
- -b <integer>
- minimum number of full-length reads required to map to a de novo assembled
transcript (default is 20)
- -u <integer>
- minimum length of de novo assembled transcripts (default is 2*k)
- -w <integer>
- minimum count of k-mer to use it to seed a new de novo assembled
transcript (default is 50)
- -x <integer>
- minimum count of k-mer to use it to extend an existing de novo assembled
transcript (default is 5)
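Several of the defaults above depend on the read length or on k. The following is a minimal sketch of that arithmetic, not rockhopper's own code. The read length of 100 is a hypothetical value, and the sketch assumes that fractional results are truncated to whole bases; rockhopper's actual rounding may differ.

```shell
#!/bin/sh
# Rough arithmetic for a few length-dependent options.
# ASSUMPTION: fractions are applied to the read length and truncated to
# whole bases; rockhopper's real rounding may differ.

read_len=100   # hypothetical read length in bases
k=25           # -k default
n=25           # -n default

# frac_of LEN HUNDREDTHS: LEN * HUNDREDTHS / 100 using integer arithmetic
frac_of() {
    echo $(( $1 * $2 / 100 ))
}

max_mismatches=$(frac_of "$read_len" 15)   # -m 0.15
min_seed=$(frac_of "$read_len" 33)         # -l 0.33
min_transcript=$(( 2 * k ))                # -u default is 2*k
table_slots=$(( 1 << n ))                  # -n: roughly 2^n hashtable slots

echo "mismatches=$max_mismatches seed=$min_seed min_transcript=$min_transcript slots=$table_slots"
```

Raising -n by one doubles the slot count, which is consistent with the memory warning attached to that option.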
reference based assembly with single-end reads
% rockhopper <options> -g genome_DIR1,genome_DIR2 \
    aerobic_replicate1.fastq,aerobic_replicate2.fastq \
    anaerobic_replicate1.fastq,anaerobic_replicate2.fastq
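Each directory passed to -g is expected to contain a genome file (*.fna), a gene file (*.ptt), and an RNA file (*.rnt). A small pre-flight check along these lines may catch an incomplete directory before a long run. The function names check_genome_dir and check_genome_list are hypothetical helpers written for this sketch and are not part of rockhopper.

```shell
#!/bin/sh
# check_genome_dir DIR: succeed only if DIR holds at least one
# *.fna, *.ptt and *.rnt file (hypothetical helper, not part of rockhopper).
check_genome_dir() {
    dir=$1
    for ext in fna ptt rnt; do
        # With no match the glob stays literal, so test whether the
        # first candidate actually exists.
        set -- "$dir"/*."$ext"
        if [ ! -e "$1" ]; then
            echo "missing *.$ext in $dir" >&2
            return 1
        fi
    done
    return 0
}

# check_genome_list DIR1,DIR2,...: check every directory in a
# comma-separated list of the kind given to -g.
check_genome_list() {
    old_ifs=$IFS
    IFS=,
    for d in $1; do
        IFS=$old_ifs
        check_genome_dir "$d" || return 1
    done
    IFS=$old_ifs
    return 0
}
```

For example, `check_genome_list genome_DIR1,genome_DIR2 && echo ok` prints "ok" only when both directories contain all three file types.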
de novo assembly with single-end reads
% rockhopper <options> \
    aerobic_replicate1.fastq,aerobic_replicate2.fastq \
    anaerobic_replicate1.fastq,anaerobic_replicate2.fastq
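In the examples above, the file names suggest that replicates of one condition are joined with commas and that conditions are separated by whitespace. Under that reading, a de novo command line could be assembled as in the sketch below. The helper join_by_comma is hypothetical, and the -L labels and -o directory are illustrative values only. The script prints the command as a dry run rather than executing it.

```shell
#!/bin/sh
# join_by_comma FILE...: join the arguments with commas, i.e. the
# replicates of a single condition (hypothetical helper).
join_by_comma() {
    out=""
    for f in "$@"; do
        if [ -z "$out" ]; then
            out=$f
        else
            out="$out,$f"
        fi
    done
    echo "$out"
}

aerobic=$(join_by_comma aerobic_replicate1.fastq aerobic_replicate2.fastq)
anaerobic=$(join_by_comma anaerobic_replicate1.fastq anaerobic_replicate2.fastq)

# Dry run: build and print the command instead of running it.
# The labels and output directory are illustrative, not defaults.
cmd="rockhopper -L aerobic,anaerobic -o results/ $aerobic $anaerobic"
echo "$cmd"
```

Adding `-g genome_DIR1,genome_DIR2` to the same command line would switch it from de novo assembly to reference-based alignment, as described for the -g option.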