PP_SIMSCORE(1)
pp_simScore - print similarity and alignments for block-profile and protein sequence on the standard output
pp_simScore [OPTIONS] --fasta=protein-sequence-file --prfl=protein-profile-file
pp_simScore calculates the similarity score and the optimal alignments of a block-profile and a protein sequence. The algorithm can optionally take intron positions into account. Results are printed to standard output.
-f, --fasta=file
>protein sequence header
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
XXXXXXX protein sequence XXXXXXXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
[Introns]
# index of the position after which an intron occurs | residual nucleotides before the intron
2 0
5 1
30 2
104 1
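The input layout shown above can be read with a few lines of Python. This is only a minimal sketch inferred from the example: the function name parse_protein_file is hypothetical, and it assumes that lines starting with "#" are comments, that the [Introns] section follows the sequence, and that each intron line holds two whitespace-separated integers. It is not the parser used by pp_simScore.

```python
from typing import List, Tuple


def parse_protein_file(text: str) -> Tuple[str, str, List[Tuple[int, int]]]:
    """Return (header, sequence, introns) from a FASTA text with an optional [Introns] section.

    Hypothetical helper; each intron is a (position, residual_nucleotides) pair.
    """
    header = ""
    seq_parts: List[str] = []
    introns: List[Tuple[int, int]] = []
    in_introns = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # assumption: blank lines and '#' lines carry no data
        if line.startswith(">"):
            header = line[1:].strip()
        elif line == "[Introns]":
            in_introns = True
        elif in_introns:
            pos, residual = (int(tok) for tok in line.split())
            if residual not in (0, 1, 2):
                # a codon has three nucleotides, so at most two can be left over
                raise ValueError(f"residual nucleotides must be 0, 1 or 2: {line!r}")
            introns.append((pos, residual))
        else:
            seq_parts.append(line)  # sequence may span several lines
    return header, "".join(seq_parts), introns
```

Checking the residual value early makes malformed intron lines fail in the script rather than in the later alignment step.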
-p, --prfl=file
[dist]
min max
[block]
B
[intron profile]
w
inter-block_profile_list
intra-block_profile_list
This structure can be repeated within the file. The file has to end either with a [dist] section or with a [dist] section followed by an [intron profile] section. The [intron profile] sections are optional.
[dist] explanation:
min, max denote the distance interval of an inter-block section
[block] explanation:
B denotes a (20 x t) matrix for a block of length t of the block-profile
[intron profile] explanation:
an intron profile describes the positions and frequencies of introns in and
before the associated block
w: number of protein family members used to build the intron profile
inter-block_profile_list: list of (h, v),
where h denotes the number of introns which occurred within a family member,
v the number of family members which have this number of introns
intra-block_profile_list: list of (s, f, v),
where s denotes the index of the position in the block after which an intron occurs,
f denotes the number of nucleotides which are left before the intron (0, 1 or 2),
v the number of family members which have an intron at that position
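To make the roles of w, (h, v) and (s, f, v) concrete, the intron profile can be modelled as a small Python data structure. This is an illustrative sketch only: the class name IntronProfile and its methods are hypothetical, and dividing v by w to obtain a relative frequency is an assumption about how the counts might be interpreted, not a statement of what pp_simScore computes internally.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class IntronProfile:
    """Hypothetical in-memory form of one [intron profile] section."""

    w: int  # number of protein family members used to build the profile
    inter_block: List[Tuple[int, int]] = field(default_factory=list)  # (h, v) pairs
    intra_block: List[Tuple[int, int, int]] = field(default_factory=list)  # (s, f, v) triples

    def inter_frequencies(self) -> Dict[int, float]:
        """Map h to v / w: the share of members that have h introns (assumed reading)."""
        return {h: v / self.w for h, v in self.inter_block}

    def intra_frequencies(self) -> Dict[Tuple[int, int], float]:
        """Map (s, f) to v / w: the share of members with an intron at that spot."""
        return {(s, f): v / self.w for s, f, v in self.intra_block}


# Example: 10 family members; 6 have no intron before the block, 4 have one,
# and 5 have an intron after block position 3 with 1 nucleotide left over.
profile = IntronProfile(w=10, inter_block=[(0, 6), (1, 4)], intra_block=[(3, 1, 5)])
```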
-g, --gap_inter=float
-b, --gap_intra=float
-r, --gap_intron=float
-e, --epsilon_intron=float
-n, --epsilon_noIntron=float
-i, --intron_weight_intra=float
-t, --intron_weight_inter=float
-a, --alignment=number
-o, --out=format
Denotes the output format, the following output options are implemented:
score
matrix
alignment
matrix+alignment
db
bp
consents
interblock
-h, --help
pp_simScore --fasta=EDW03868.1.fa --prfl=EOG09150290.prfl --out=alignment
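The same call can be scripted. The sketch below is a hypothetical Python wrapper, not part of AUGUSTUS: the function names build_command and run_pp_simscore are made up for illustration, and the wrapper assumes that pp_simScore is on the PATH. It uses only the long options and output formats listed above.

```python
import subprocess
from typing import List

# Output formats as listed for -o, --out in this page.
OUTPUT_FORMATS = {
    "score", "matrix", "alignment", "matrix+alignment",
    "db", "bp", "consents", "interblock",
}


def build_command(fasta: str, prfl: str, out: str) -> List[str]:
    """Assemble the argument list for one pp_simScore call (hypothetical helper)."""
    if out not in OUTPUT_FORMATS:
        raise ValueError(f"unknown output format: {out!r}")
    return ["pp_simScore", f"--fasta={fasta}", f"--prfl={prfl}", f"--out={out}"]


def run_pp_simscore(fasta: str, prfl: str, out: str) -> str:
    """Run pp_simScore and return what it printed to standard output."""
    result = subprocess.run(
        build_command(fasta, prfl, out),
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode output as str
        check=True,           # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with file names. For the example above, the call would be run_pp_simscore("EDW03868.1.fa", "EOG09150290.prfl", "alignment").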
AUGUSTUS was written by M. Stanke, O. Keller, S. König, L. Gerischer, L. Romoth and L. Gabriel.