FREECONTACT(1)                   User Commands                   FREECONTACT(1)
freecontact - fast protein contact predictor
freecontact [OPTION]
freecontact --parprof [evfold|psicov|psicov-sd] < alignment.aln > contacts.out
/usr/share/freecontact/a2m2aln --query '^RASH_HUMAN/(\d+)' < alignment.fa | freecontact --parprof evfold > contacts.out
freecontact --ali=ALIFILE --apply-gapth=BOOL --clustpc=NUM --density=NUM --cov20=BOOL --estimate-ivcov=BOOL --gapth=NUM --icme-timeout=NUM --input-format=[flat|xml] --mincontsep=NUM --output-format=[evfold|pfrmat_rr|bioxsd] --pseudocnt=NUM --pscount-weight=NUM --rho=NUM --threads=NUM --veczw=BOOL
freecontact --help --debug --quiet --version
FreeContact is a protein residue contact predictor optimized for speed. It can serve as an accelerated drop-in replacement for the published contact predictors EVfold-mfDCA of D.S. Marks et al. (2011) [1] and PSICOV of D.T. Jones et al. (2011) [2].
FreeContact is accelerated by a combination of vector instructions, multiple threads, and faster implementation of key parts. Depending on the alignment, 8-fold or higher speedups are possible.
A sufficiently large alignment is required for meaningful results. At a minimum, the effective (after-weighting) sequence count of the alignment should exceed the length of the query sequence. Alignments with tens of thousands of effective sequences are considered good input.
For example, jackhmmer(1) or hhblits(1) can be used to generate the alignments.
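The minimum-size rule above can be checked before a long run. The following is a minimal sketch only: it assumes the common reweighting scheme in which each sequence is weighted by the inverse of the number of rows that share at least a given fraction of identical columns with it. The function names and the 0.62 threshold are hypothetical illustrations, not FreeContact's actual implementation or defaults, and the pairwise loop is quadratic in the number of rows.

```python
def effective_sequence_count(rows, identity_threshold=0.62):
    """Sum of per-row weights 1/n_i, where n_i counts the rows (self included)
    whose fractional identity to row i is at least identity_threshold.

    Assumption: this weighting scheme and the default threshold are
    illustrative; they are not taken from FreeContact itself.
    """
    if not rows:
        return 0.0
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all alignment rows must have the same length")
    total = 0.0
    for a in rows:
        neighbours = 0
        for b in rows:
            matches = sum(1 for x, y in zip(a, b) if x == y)
            if matches >= identity_threshold * length:
                neighbours += 1
        # Every row matches itself, so neighbours is at least 1.
        total += 1.0 / neighbours
    return total


def is_large_enough(rows, query_length, identity_threshold=0.62):
    """True if the effective sequence count exceeds the query length."""
    return effective_sequence_count(rows, identity_threshold) > query_length
```

Two identical rows share one unit of weight, so an alignment of many near-duplicates can fail this check even when its raw row count is large.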
[1] PLoS One. 2011;6(12):e28766. doi: 10.1371/journal.pone.0028766. Epub 2011 Dec 7. Protein 3D structure computed from evolutionary sequence variation. Marks DS, Colwell LJ, Sheridan R, Hopf TA, Pagnani A, Zecchina R, Sander C.
[2] Bioinformatics. 2012 Jan 15;28(2):184-90. Epub 2011 Nov 17. PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments. Jones DT, Buchan DW, Cozzetto D, Pontil M.
The following formats are supported:
    # querystart=5
    # query=QUERYwithinsertionSEQUENCEWITHNOGAPSORINSERTIONS
    QUERYSEQUENCEWITHNOGAPSORINSERTIONS
    -ALIGNED---SEQUENCE--WITH-GAPS-----
    ANOTHER-ALIGNED------------SEQUENCE
The '#' header lines are optional. Header lines are used to calculate contact residue numbers and to look up respective query residues for certain output formats.
If no query is defined, the first sequence in the alignment is used as the query sequence. The query sequence must not contain gaps in the alignment.
All alignment rows must be the same length and may contain only the characters [ABCDEFGHIJKLMNOPQRSTUVWXYZ-]. [B] is mapped to [D], [Z] is mapped to [E], and [JOUX] are mapped to [X]. Throughout the program, [X] matches only itself.
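The row constraints and letter mappings described above can be sketched as a small pre-check. This is an illustration only: the function name normalize_alignment and its error messages are hypothetical, and it assumes the first row is the query sequence, as is the case when no query header is given.

```python
import string

# Characters permitted in alignment rows: uppercase letters and the gap symbol.
ALLOWED = set(string.ascii_uppercase + "-")

# B -> D, Z -> E, and J, O, U -> X (X already maps to itself).
RESIDUE_MAP = str.maketrans({"B": "D", "Z": "E", "J": "X", "O": "X", "U": "X"})


def normalize_alignment(rows):
    """Validate alignment rows and apply the residue letter mappings.

    Assumption: rows[0] is treated as the query sequence.
    """
    if not rows:
        raise ValueError("alignment is empty")
    length = len(rows[0])
    normalized = []
    for number, row in enumerate(rows, start=1):
        if len(row) != length:
            raise ValueError(f"row {number}: expected length {length}, got {len(row)}")
        bad = sorted(set(row) - ALLOWED)
        if bad:
            raise ValueError(f"row {number}: invalid characters {bad}")
        normalized.append(row.translate(RESIDUE_MAP))
    if "-" in normalized[0]:
        raise ValueError("the query sequence must not contain gaps")
    return normalized
```

Lowercase letters are rejected rather than upcased, since the flat format uses lowercase only in the query header to mark insertions.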
A2M input alignments can be converted to the above format using /usr/share/freecontact/a2m2aln. a2m2aln can be used to pipe the alignment directly into freecontact.
Example: /usr/share/doc/freecontact/examples/PF00071_v25_1000.xml.
The original EVfold-mfDCA or PSICOV output format is used by default when the respective parameter profile is selected.
    5 K 6 L 0.332129 3.59798
    | | | | |        + corrected norm (CN) contact score
    | | | | + mutual information (MI) score
    | | | + contact amino acid residue code
    | | + contact residue number
    | + contact amino acid residue code
    + contact residue number
Contacts are sorted by residue number.
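For downstream processing, a line in the six-field layout shown above can be read as follows. This is a minimal sketch: the names EvfoldContact, parse_evfold_line and top_by_cn are hypothetical, and the parser assumes whitespace-separated fields exactly as in the example line.

```python
from typing import NamedTuple


class EvfoldContact(NamedTuple):
    i: int        # first contact residue number
    aa_i: str     # amino acid code of the first residue
    j: int        # second contact residue number
    aa_j: str     # amino acid code of the second residue
    mi: float     # mutual information (MI) score
    cn: float     # corrected norm (CN) contact score


def parse_evfold_line(line):
    """Parse one whitespace-separated EVfold-style output line."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    return EvfoldContact(
        int(fields[0]), fields[1], int(fields[2]), fields[3],
        float(fields[4]), float(fields[5]),
    )


def top_by_cn(contacts, n):
    """Return the n contacts with the highest CN score."""
    return sorted(contacts, key=lambda c: c.cn, reverse=True)[:n]
```

Because this format is ordered by residue number, a re-sort such as top_by_cn is needed to obtain the strongest predicted contacts first.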
    55 67 0 8 10.840280
    |  |  | | + contact score
    |  |  +-+ range [Å] of Cb-Cb distance predicted for the residue pair
    |  |      (C-alpha for glycines)
    |  |      These two fields are invariant in the output.
    |  + contact residue number
    + contact residue number
Contacts are sorted by score in descending order.
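The five-field layout above can be read in the same way. This is again only a sketch: PsicovContact, parse_psicov_line and is_sorted_by_score are hypothetical names, and the distance range fields are parsed as floats although the example shows integers.

```python
from typing import NamedTuple


class PsicovContact(NamedTuple):
    i: int            # first contact residue number
    j: int            # second contact residue number
    dist_min: float   # lower bound of the predicted distance range, in Å
    dist_max: float   # upper bound of the predicted distance range, in Å
    score: float      # contact score


def parse_psicov_line(line):
    """Parse one whitespace-separated PSICOV-style output line."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    return PsicovContact(
        int(fields[0]), int(fields[1]),
        float(fields[2]), float(fields[3]), float(fields[4]),
    )


def is_sorted_by_score(contacts):
    """True if scores never increase from one contact to the next."""
    return all(a.score >= b.score for a, b in zip(contacts, contacts[1:]))
```

Since the output is already in descending score order, taking the first N parsed lines yields the N highest-scoring contacts without re-sorting.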
[3] <http://predictioncenter.org/casp10/index.cgi?page=format>
Example: /usr/share/doc/freecontact/examples/PF00071_v25_1000.evfold.50.xml.
Note: as BioXSD is under active development in collaboration with FreeContact, the FreeContact schema may actually be derived from a version not yet available at [5].
[4] <file:///usr/share/freecontact/freecontact.xsd>
[5] <http://bioxsd.org>
The output may not list all possible contacts.
Command line arguments can be used to override profile values.
    /usr/share/freecontact/a2m2aln --query '^RASH_HUMAN/(\d+)' < '/usr/share/doc/freecontact/examples/PF00071_v25_1000.fa' | \
        freecontact --parprof evfold > PF00071_v25_1000.evfold

    freecontact --parprof evfold -i xml -o bioxsd < '/usr/share/doc/freecontact/examples/PF00071_v25_1000.xml' > PF00071_v25_1000.evfold.xml

    freecontact --parprof psicov < /usr/share/doc/freecontact/examples/demo_1000.aln > demo_1000.psicov
For optimal performance, use the Automatically Tuned Linear Algebra Software (ATLAS) library compiled on the machine where freecontact is run.
László Kaján <lkajan@rostlab.org>
1.0.21                             2023-05-18                    FREECONTACT(1)