alistat(1) | Biosquid Manual | alistat(1) |
alistat [options] alignfile
alistat reads a multiple sequence alignment from the file alignfile in any supported format (including SELEX, GCG MSF, and CLUSTAL) and shows a number of simple statistics about it. These statistics include the name of the format, the number of sequences, the total number of residues, the average and range of the sequence lengths, and the alignment length (i.e. including gap characters).
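As an illustration only, the length statistics described above can be sketched in Python. This is not alistat's source code: the function name alignment_stats, the dictionary keys, and the set of characters treated as gaps are all assumptions made for the example, and the input is assumed to be a list of equal-length aligned rows that has already been parsed.

```python
# Illustrative sketch, not alistat's implementation.
# Assumption: these characters mark gaps in an aligned row.
GAP_CHARS = set("-._~")


def alignment_stats(rows):
    """Summarize a parsed alignment given as equal-length strings."""
    # Unaligned length of each sequence: count the non-gap characters.
    lengths = [sum(1 for c in row if c not in GAP_CHARS) for row in rows]
    return {
        "num_seqs": len(rows),
        "total_residues": sum(lengths),
        "smallest": min(lengths),
        "largest": max(lengths),
        "average_length": sum(lengths) / len(rows),
        # Alignment length counts every column, gap characters included.
        "alignment_length": len(rows[0]),
    }
```

The point of the sketch is the distinction between the two kinds of length: sequence lengths ignore gap characters, while the alignment length counts every column.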
Also shown are some percent identities. A percent pairwise alignment identity is defined as (idents / MIN(len1, len2)), where idents is the number of exact identities and len1, len2 are the unaligned lengths of the two sequences. The "average percent identity", "most related pair", and "most unrelated pair" of the alignment are the average, maximum, and minimum over all N(N-1)/2 pairs, respectively. The "most distant seq" is calculated by finding the maximum pairwise identity (best relative) for each of the N sequences, then finding the minimum of these N numbers (hence, the most outlying sequence).
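The definitions above can be made concrete with a small Python sketch. It is a reading of the description, not alistat's actual code: the names pairwise_identity and identity_summary, the gap character set, and the case-insensitive comparison are assumptions, and columns where either sequence has a gap are assumed not to count as identities. Values are returned as fractions; multiply by 100 for percentages. At least two sequences are required.

```python
from itertools import combinations

# Assumption: these characters mark gaps in an aligned row.
GAP_CHARS = set("-._~")


def unaligned_length(row):
    """Length of a sequence with gap characters removed."""
    return sum(1 for c in row if c not in GAP_CHARS)


def pairwise_identity(a, b):
    """idents / MIN(len1, len2) for two aligned rows, as a fraction."""
    idents = sum(
        1
        for x, y in zip(a, b)
        if x not in GAP_CHARS and y not in GAP_CHARS and x.upper() == y.upper()
    )
    shorter = min(unaligned_length(a), unaligned_length(b))
    return idents / shorter if shorter else 0.0


def identity_summary(rows):
    """Average, maximum, and minimum identity over all N(N-1)/2 pairs."""
    n = len(rows)
    ident = {
        (i, j): pairwise_identity(rows[i], rows[j])
        for i, j in combinations(range(n), 2)
    }
    values = list(ident.values())
    # Best relative of each sequence: its highest identity to any other one.
    best_relative = [
        max(v for pair, v in ident.items() if k in pair) for k in range(n)
    ]
    return {
        "average": sum(values) / len(values),
        "most_related": max(values),
        "most_unrelated": min(values),
        # The most distant sequence has the lowest best-relative identity.
        "most_distant": min(best_relative),
    }
```

For the three rows "ACGT", "ACGA", and "TTTT", the pairwise identities are 0.75, 0.25, and 0.0, so the most related pair scores 0.75, the most unrelated pair 0.0, and the most distant sequence ("TTTT") has a best relative at 0.25.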
afetch(1), compalign(1), compstruct(1), revcomp(1), seqsplit(1), seqstat(1), sfetch(1), shuffle(1), sindex(1), sreformat(1), stranslate(1), weight(1).
Biosquid and its documentation are Copyright (C) 1992-2003 HHMI/Washington University School of Medicine. Freely distributed under the GNU General Public License (GPL). See COPYING in the source code distribution for more details, or contact me.
Sean Eddy
HHMI/Department of Genetics
Washington University School of Medicine
4444 Forest Park Blvd., Box 8510
St Louis, MO 63108 USA
Phone: 1-314-362-7666
FAX: 1-314-362-2157
Email: eddy@genetics.wustl.edu
January 2003 | Biosquid 1.9g |