tantan - low complexity and tandem repeat masker for biosequences
tantan [options] fasta-sequence-file(s)
Find simple repeats in sequences
Options (default settings in parentheses):
- -p  interpret the sequences as proteins
- -x  letter to use for masking, instead of lowercase
- -c  preserve uppercase/lowercase in non-masked regions
- -m  file for letter pair scores (+1/-1, but -p selects BLOSUM62)
- -r  probability of a repeat starting per position (0.005)
- -e  probability of a repeat ending per position (0.05)
- -w  maximum tandem repeat period to consider (100, but -p selects 50)
- -d  probability decay per period (0.9)
- -a  gap existence cost (0)
- -b  gap extension cost (infinite: no gaps)
- -s  minimum repeat probability for masking (0.5)
- -f  output type: 0=masked sequence, 1=repeat probabilities, 2=repeat counts, 3=BED (0)
- -h, --help  show help message, then exit
- --version  show version information, then exit
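
To illustrate how -s, -x and -c interact, the following Python sketch applies a masking threshold to per-position repeat probabilities (such as those reported with -f 1). It is not tantan's implementation: the function name mask_sequence and its parameters are hypothetical, and the behavior shown (uppercasing non-masked letters unless case is preserved) is an assumption read off the option descriptions.

```python
def mask_sequence(seq, repeat_probs, min_prob=0.5, mask_letter=None,
                  preserve_case=False):
    """Mask positions whose repeat probability reaches min_prob.

    Illustrative only (hypothetical helper, not part of tantan):
      min_prob      mirrors -s (minimum repeat probability for masking)
      mask_letter   mirrors -x (letter to use instead of lowercase)
      preserve_case mirrors -c (keep original case in non-masked regions)
    """
    if len(seq) != len(repeat_probs):
        raise ValueError("need exactly one probability per sequence letter")

    masked = []
    for letter, prob in zip(seq, repeat_probs):
        if prob >= min_prob:
            # Masked position: lowercase by default, or a fixed letter.
            masked.append(letter.lower() if mask_letter is None else mask_letter)
        else:
            # Non-masked position: assumed to be uppercased unless case
            # preservation is requested.
            masked.append(letter if preserve_case else letter.upper())
    return "".join(masked)


if __name__ == "__main__":
    probs = [0.1, 0.9, 0.9, 0.1, 0.1, 0.6, 0.2, 0.5]
    print(mask_sequence("ACGTacgt", probs))
    print(mask_sequence("ACGTacgt", probs, mask_letter="N"))
    print(mask_sequence("ACGTacgt", probs, preserve_case=True))
```

Raising min_prob in this sketch masks fewer positions, and lowering it masks more. That is the trade-off the -s option controls.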
Report bugs to: tantan@cbrc.jp
Home page: http://www.cbrc.jp/tantan/