MAF_PARSE(1) | User Commands | MAF_PARSE(1) |
maf_parse - Reads a MAF file and performs various operations on it.
Reads a MAF file and performs various operations on it. Performs parsing operations block-by-block whenever possible, rather than storing the entire alignment in memory. Can extract a sub-alignment from an alignment (by row or by column). Can extract features given a GFF, BED, or genepred file. Can also extract sub-features such as CDS1,2,3 or 4d sites. Can perform various functions such as gap stripping or re-ordering of sequences. Capable of reading and
--out-format, -o MAF|PHYLIP|FASTA|MPM|SS (Default MAF). Output file format. SS format is only available un-ordered. Note that some options, which involve reversing alignments based on strand, or stripping gaps, cannot be output in MAF format and use FASTA by default. Also note that when output format is not MAF, the entire output must be loaded into memory.
--pretty, -p
--start, -s <start_col> Start index of sub-alignment (indexing starts with 1). Coordinates are in terms of the reference sequence unless the --no-refseq option is used, in which case they are in terms of alignment columns. Default is 1.
--end, -e <end_col> End index of sub-alignment. Default is length of alignment.
--seqs, -l <seq_list>
--exclude, -x Exclude rather than include specified sequences.
--order, -O <name_list>
--no-refseq, -n Do not assume first sequence in MAF is refseq. Instead, use coordinates given by absolute position in alignment (starting from 1).
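To illustrate the difference between the two coordinate systems, the following Python sketch maps a 1-based position in a reference row to the 1-based alignment column that holds it, by skipping gap characters. The helper name ref_pos_to_column and the example row are hypothetical and are not part of maf_parse; the sketch works on a single row counted from its first base and makes no claim about how maf_parse is implemented.

```python
def ref_pos_to_column(ref_row, ref_pos, gap="-"):
    """Return the 1-based alignment column holding the ref_pos-th (1-based)
    non-gap base of ref_row. Hypothetical helper, for illustration only."""
    if ref_pos < 1:
        raise ValueError("positions are 1-based")
    seen = 0  # number of reference bases encountered so far
    for col, ch in enumerate(ref_row, start=1):
        if ch != gap:
            seen += 1
            if seen == ref_pos:
                return col
    raise ValueError("ref_pos exceeds the number of bases in the reference row")


# Hypothetical reference row containing gaps.
ref_row = "AC--GT-A"
print(ref_pos_to_column(ref_row, 3))  # third reference base 'G' sits in column 5
```

With gaps in the reference row, reference position 3 and alignment column 3 refer to different places, which is why the choice between reference coordinates and --no-refseq changes the meaning of --start and --end.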
--split, -S <length>
--out-root, -r <name>
--out-root-digits, -d <numdigits> (for use with --split). The minimum number of digits used to
--features, -g <fname> Annotations file. May be GFF, BED, or genepred format.
--by-category, -L
--do-cats, -C <cat_list> (For use with --by-category) Output sub-alignments for only the specified categories.
--catmap, -c <fname>|<string>
--catmap "NCATS = 3 ; CDS 1-3" or
--catmap "NCATS = 1; UTR 1".
--by-group, -P <tag> (Requires --features). Split by groups in annotation file, as defined by specified tag.
--mask-bases, -b <qscore> Mask all bases with quality score <= <qscore>. Note that <qscore> is in the same units as displayed in the MAF (ranging from 0-9), and represents min(9, floor(PHRED_score/5)). Bases without any quality score will not be masked.
--masked-file, -m <filename> (For use with --mask-bases). Write a file containing all the regions masked for low quality. The file will be in 0-based coordinates relative to the refseq, with an additional column giving the name of the species masked. Note that low-quality bases masked at alignment columns with a gap in the reference sequence may not be represented in the output file.
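The quality-score formula given for --mask-bases can be sketched in Python as follows. The function names, the use of N as the masking character, and the treatment of any non-digit quality character as "no quality score" are assumptions made for illustration; they are not taken from the maf_parse source.

```python
import math


def maf_quality(phred_score):
    """Convert a PHRED score to the single-digit MAF quality value,
    min(9, floor(PHRED_score / 5))."""
    return min(9, math.floor(phred_score / 5))


def mask_low_quality(bases, quals, threshold, mask_char="N"):
    """Mask every base whose quality digit is <= threshold.

    Hypothetical helper: quals has one character per base; characters that
    are not digits are treated as 'no quality score' and left unmasked.
    mask_char='N' is an assumption, not confirmed maf_parse behavior.
    """
    if len(bases) != len(quals):
        raise ValueError("bases and quals must have the same length")
    out = []
    for base, q in zip(bases, quals):
        if q.isdigit() and int(q) <= threshold:
            out.append(mask_char)
        else:
            out.append(base)
    return "".join(out)


print(maf_quality(12))                        # floor(12 / 5) = 2
print(maf_quality(60))                        # capped at 9
print(mask_low_quality("ACGTA", "90F59", 3))  # only the second base is masked
```

Because the MAF value is capped at 9, a threshold of 9 in this sketch would mask every base that carries a numeric quality score.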
--mask-features, -M <spec> (Requires --features). Mask all bases annotated in features in the given species (can be a comma-delimited list of species). Note that
--strip-i-lines, -I
--strip-e-lines, -E Remove lines in MAF starting with e.
--help, -h
May 2016 | maf_parse 1.4 |