MUSCLE(1) User Commands MUSCLE(1)

muscle - Multiple alignment program of protein sequences

MUSCLE is a multiple alignment program for protein sequences. MUSCLE stands for multiple sequence comparison by log-expectation. In the author's tests, MUSCLE achieved the highest scores of all tested programs on several alignment accuracy benchmarks, and it is also one of the fastest programs available.

Align FASTA input, write aligned FASTA (AFA) output:

muscle -align input.fa -output aln.afa

Align large input using the Super5 algorithm if -align is too expensive; this is typically needed with more than a few hundred sequences:

muscle -super5 input.fa -output aln.afa
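
The choice between the two commands above can be scripted. The following Python sketch counts the records in a FASTA file and builds the matching muscle argument list. The cutoff of 300 sequences and the helper names are illustrative assumptions; the text only says "more than a few hundred sequences".

```python
from typing import Iterable, List

# Hypothetical cutoff, chosen for illustration only.
SUPER5_THRESHOLD = 300


def count_fasta_records(lines: Iterable[str]) -> int:
    """Count FASTA records by counting header lines that start with '>'."""
    return sum(1 for line in lines if line.startswith(">"))


def build_command(input_path: str, output_path: str, n_seqs: int) -> List[str]:
    """Return a muscle argument list, using -super5 for large inputs."""
    mode = "-super5" if n_seqs > SUPER5_THRESHOLD else "-align"
    return ["muscle", mode, input_path, "-output", output_path]
```

The returned list can be passed to subprocess.run() once the sequence count has been read from the input file.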

Single replicate alignment:

muscle -align input.fa -perm PERM -perturb SEED -output aln.afa
muscle -super5 input.fa -perm PERM -perturb SEED -output aln.afa
PERM is guide tree permutation none, abc, acb, bca (default none). SEED is perturbation seed 0, 1, 2... (default 0 = don't perturb).
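
To run several single replicates by hand, the PERM and SEED values can be enumerated. The sketch below builds one "muscle -align" argument list per (PERM, SEED) pair. The output file naming scheme and the function name are assumptions made for this example; only the flags come from the usage shown above.

```python
from itertools import product
from typing import List

# Guide tree permutations accepted by -perm, as listed above.
PERMS = ["none", "abc", "acb", "bca"]


def replicate_commands(input_path: str, seeds: List[int]) -> List[List[str]]:
    """Build one `muscle -align` argument list per (PERM, SEED) pair."""
    commands = []
    for perm, seed in product(PERMS, seeds):
        # Hypothetical naming scheme: one output file per replicate.
        output = f"aln.{perm}.{seed}.afa"
        commands.append([
            "muscle", "-align", input_path,
            "-perm", perm, "-perturb", str(seed),
            "-output", output,
        ])
    return commands
```

With seeds [0, 1] this yields 4 x 2 = 8 argument lists, one for each permutation and seed.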

Ensemble of replicate alignments, output in Ensemble FASTA (EFA) format. EFA has one aligned FASTA for each replicate, preceded by a header line "<PERM.SEED":

muscle -align input.fa -stratified -output stratified_ensemble.efa
muscle -align input.fa -diversified -output diversified_ensemble.efa

-replicates N

Number of replicates, defaults 4, 100, 100 for stratified, diversified, resampled. With -stratified there is one replicate per guide tree permutation, total is 4 x N.
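
The EFA layout described above (a "<PERM.SEED" header line followed by one aligned FASTA per replicate) can be read with a few lines of Python. This is a minimal sketch, not MUSCLE's own parser; it assumes that sequence data may wrap across several lines and that blank lines carry no meaning.

```python
from typing import Dict, Iterable, List, Tuple


def parse_efa(lines: Iterable[str]) -> Dict[str, List[Tuple[str, str]]]:
    """Split EFA text into {replicate_name: [(label, aligned_seq), ...]}."""
    ensemble: Dict[str, list] = {}
    current = None  # records of the replicate currently being read
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("<"):
            # Replicate header, e.g. "<abc.1"
            current = ensemble.setdefault(line[1:], [])
        elif line.startswith(">"):
            if current is None:
                raise ValueError("sequence header before any '<' header")
            current.append([line[1:], ""])
        else:
            if not current:
                raise ValueError("sequence data before any '>' header")
            current[-1][1] += line  # join wrapped sequence lines
    return {
        name: [(label, seq) for label, seq in records]
        for name, records in ensemble.items()
    }
```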

Generate resampled ensemble from existing ensemble by sampling columns with replacement:

muscle -resample ensemble.efa -output resampled.efa

-maxgapfract F

Maximum fraction of gaps in a column (F=0..1, default 0.5).

-minconf CC

Minimum column confidence (CC=0..1, default 0.5).
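
Sampling columns with replacement can be illustrated on a single alignment. The sketch below is a simplified stand-in and not MUSCLE's implementation: it applies a gap filter in the spirit of -maxgapfract (treating the limit as inclusive, which is an assumption) and ignores -minconf, since column confidence requires a whole ensemble. All function and parameter names are hypothetical.

```python
import random
from typing import List


def gap_fraction(column: str) -> float:
    """Fraction of '-' characters in one alignment column."""
    return column.count("-") / len(column)


def resample_columns(rows: List[str], n_cols: int,
                     max_gap_fract: float = 0.5, seed: int = 0) -> List[str]:
    """Draw n_cols columns with replacement from columns passing the gap filter."""
    # Transpose rows into columns, e.g. ["AC", "A-"] -> ["AA", "C-"].
    columns = ["".join(col) for col in zip(*rows)]
    kept = [c for c in columns if gap_fraction(c) <= max_gap_fract]
    if not kept:
        raise ValueError("no columns pass the gap filter")
    rng = random.Random(seed)  # seeded for reproducible sampling
    sampled = [rng.choice(kept) for _ in range(n_cols)]
    # Transpose the sampled columns back into rows.
    return ["".join(col) for col in zip(*sampled)]
```

Because sampling is with replacement, the same column may appear several times in the result and others may be absent.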

If ensemble output filename has @, then one FASTA file is generated for each replicate where @ is replaced by perm.s, otherwise all replicates are written to one EFA file.
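
The filename rule above can be sketched as follows. The function name is hypothetical, and reading "perm.s" as the permutation name followed by the seed (as in the "<PERM.SEED" header) is an assumption; how MUSCLE treats more than one @ in a name is not stated, so this sketch simply replaces every occurrence.

```python
def replicate_output_name(pattern: str, perm: str, seed: int) -> str:
    """Return the per-replicate filename if pattern has '@', else pattern unchanged."""
    if "@" not in pattern:
        # No placeholder: all replicates go to this single EFA file.
        return pattern
    return pattern.replace("@", f"{perm}.{seed}")
```

For example, the pattern "ens_@.afa" with perm "abc" and seed 3 gives "ens_abc.3.afa".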

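Calculate dispersion of an ensemble:

muscle -disperse ensemble.efa

Extract replicate with highest total CC (diversified input recommended):

muscle -maxcc ensemble.efa -output maxcc.afa

Extract aligned FASTA files from EFA file:

muscle -efa_explode ensemble.efa

Convert FASTA files to EFA, input has one filename per line:

muscle -fa2efa filenames.txt -output ensemble.efa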

Update ensemble by adding two sequences of digits to each replicate. The digits are column confidence (CC) values, e.g. "73" means CC=0.73 and "++" means CC=1.0:

muscle -addconfseqs ensemble.efa -output ensemble_cc.efa
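
The two-digit CC notation can be sketched in Python. This example assumes that the first added sequence holds the tens digit and the second the units digit of each column's CC, and that values are rounded to the nearest hundredth; neither detail is confirmed by the text above, and the function names are hypothetical.

```python
from typing import List, Tuple


def encode_cc(cc_values: List[float]) -> Tuple[str, str]:
    """Encode per-column CC values as two digit strings (tens, units)."""
    tens, units = [], []
    for cc in cc_values:
        if not 0.0 <= cc <= 1.0:
            raise ValueError(f"CC out of range: {cc}")
        pct = int(round(cc * 100))  # rounding mode is an assumption
        if pct >= 100:
            tens.append("+")  # "++" stands for CC=1.0
            units.append("+")
        else:
            t, u = divmod(pct, 10)
            tens.append(str(t))
            units.append(str(u))
    return "".join(tens), "".join(units)


def decode_cc(tens: str, units: str) -> List[float]:
    """Decode two digit strings back into per-column CC values."""
    if len(tens) != len(units):
        raise ValueError("digit sequences differ in length")
    values = []
    for t, u in zip(tens, units):
        if t == "+" and u == "+":
            values.append(1.0)
        elif t.isdigit() and u.isdigit():
            values.append((int(t) * 10 + int(u)) / 100)
        else:
            raise ValueError(f"invalid CC digits: {t}{u}")
    return values
```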

Calculate letter confidence (LC) values. -ref specifies the alignment to compare against the ensemble (e.g. from -maxcc). Output is in aligned FASTA format with LC values 0, 1 ... 9 instead of letters:

muscle -letterconf ensemble.efa -ref aln.afa -output letterconf.afa

-html aln.html

Alignment colored by LC in HTML format.

-jalview aln.features

Jalview feature file with LC values and colors.

https://drive5.com/muscle


This manpage was written by Andreas Tille for the Debian distribution and
can be used for any other usage of the program.

January 2022 muscle 5.1