plasmidID - plasmid identification tool
plasmidID is a computational pipeline that reconstructs and annotates
the most likely plasmids present in one sample
usage : ./plasmidID <-1 R1> <-2 R2> <-d database(fasta)> <-s sample_name> [-g group_name] [options]
- Mandatory input data:
    -1 | --R1 <filename>         reads corresponding to paired-end R1 (mandatory)
    -2 | --R2 <filename>         reads corresponding to paired-end R2 (mandatory)
    -d | --database <filename>   database to map and reconstruct (mandatory)
    -s | --sample <string>       sample name (mandatory), less than 37 characters
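
The mandatory inputs can be checked before launching a long run. The following is a minimal sketch, not part of plasmidID itself: the function name check_inputs is hypothetical, and the only rules it encodes are the ones stated above (three input files plus a sample name shorter than 37 characters).

```sh
#!/bin/sh
# Hypothetical pre-flight check; plasmidID does not ship this function.
# Usage: check_inputs R1 R2 DATABASE SAMPLE_NAME
check_inputs() {
    r1=$1; r2=$2; db=$3; sample=$4

    # Each mandatory file must exist and be readable.
    for f in "$r1" "$r2" "$db"; do
        if [ ! -r "$f" ]; then
            echo "ERROR: cannot read input file: $f" >&2
            return 1
        fi
    done

    # The sample name is mandatory and must be shorter than 37 characters.
    if [ -z "$sample" ] || [ "${#sample}" -ge 37 ]; then
        echo "ERROR: sample name must be 1-36 characters long" >&2
        return 1
    fi
    return 0
}
```

It could be chained in front of the real call, for instance: check_inputs ecoli_R1.fastq.gz ecoli_R2.fastq.gz database.fasta ECO_553 && ./plasmidID.sh ...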
- Optional input data:
    -g | --group <string>        group name (optional). If unset, samples are gathered in the NO_GROUP group
    -c | --contigs <filename>    file with contigs. If supplied, plasmidID will not assemble reads
    -a | --annotate <filename>   configuration file for specific annotation
    -o <output_dir>              output directory, by default the current directory
- Pipeline options:
    --explore              relaxes default parameters to find less reliable relationships between the supplied data and the database
    --only-reconstruct     the supplied database will not be filtered and all sequences will be used as scaffold.
                           This option does not require R1 and R2; a contig file can be supplied instead
    -w                     undo the winner-takes-it-all algorithm when clustering by kmer (QUICKER MODE)
- Trimming:
    --trimmomatic-directory   directory holding the trimmomatic .jar executable
    --no-trim                 reads supplied will not be quality trimmed
- Coverage and Clustering:
    -C | --coverage-cutoff <int>    minimum coverage percentage to select a plasmid as scaffold (0-100), default 80
    -S | --coverage-summary <int>   minimum coverage percentage to include plasmids in summary image (0-100), default 90
    -f | --cluster <int>            kmer identity to cluster plasmids into the same representative sequence (0 means identical) (0-1), default 0.5
    -k | --kmer <int>               identity to filter plasmids from the database with kmer approach (0-1), default 0.95
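
To illustrate how the two coverage thresholds differ, here is a small sketch. It assumes a hypothetical tab-separated table with a plasmid id in column 1 and its covered percentage in column 2; this layout and the function name classify_coverage are illustrative and do not describe plasmidID's internal files. The sketch also treats the two cutoffs as independent checks, which is an assumption.

```sh
#!/bin/sh
# Hypothetical helper: label each plasmid against the -C and -S cutoffs.
# Input (stdin): <plasmid_id> TAB <coverage_percentage>
# Usage: classify_coverage [SCAFFOLD_CUTOFF] [SUMMARY_CUTOFF]
classify_coverage() {
    scaffold_cutoff=${1:-80}   # mirrors the -C default
    summary_cutoff=${2:-90}    # mirrors the -S default
    awk -F '\t' -v c="$scaffold_cutoff" -v s="$summary_cutoff" '
        {
            printf "%s\tscaffold=%s\tsummary=%s\n", $1,
                   ($2 >= c ? "yes" : "no"),
                   ($2 >= s ? "yes" : "no")
        }
    '
}
```

With the defaults, a plasmid covered at 85% would be kept as a scaffold but left out of the summary image.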
- Contig local alignment:
    -i | --alignment-identity <int>     minimum identity percentage aligned for a contig to annotate, default 90
    -l | --alignment-percentage <int>   minimum length percentage aligned for a contig to annotate, default 20
    -L | --length-total <int>           minimum alignment length to filter blast analysis
    --extend-annotation <int>           look for annotation over regions with no homology found (base pairs), default 500bp
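
The identity and length-percentage thresholds can be pictured as a filter over tabular alignment rows. The sketch below is an illustration under stated assumptions, not plasmidID's actual filtering code: it assumes tab-separated rows with contig id, subject id, percent identity, alignment length and contig length (the layout BLAST+ produces with -outfmt "6 qseqid sseqid pident length qlen"), and it assumes that the length percentage is measured against the contig length. The function name filter_alignments is hypothetical.

```sh
#!/bin/sh
# Hypothetical filter mirroring -i and -l.
# Input (stdin): qseqid TAB sseqid TAB pident TAB length TAB qlen
# Usage: filter_alignments [MIN_IDENTITY] [MIN_LENGTH_PERCENTAGE]
filter_alignments() {
    min_identity=${1:-90}      # mirrors the -i default
    min_length_pct=${2:-20}    # mirrors the -l default
    awk -F '\t' -v id="$min_identity" -v lp="$min_length_pct" '
        # Keep a row only if identity and aligned fraction both pass.
        $5 > 0 && $3 >= id && ($4 / $5) * 100 >= lp
    '
}
```

For example, a contig of 1000 bp aligned over 500 bp at 95% identity passes the defaults, while one aligned over only 100 bp does not.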
- Draw images:
    --config-directory <dir>                 directory holding config files, default config_files/
    --config-file-individual <file-name>     file name of the individual file used to reconstruct
- Additional options:
    -M | --memory <int>    max memory allowed to use
    -T | --threads <int>   number of threads
    -v | --version         version
    -h | --help            display usage message
example:
    ./plasmidID.sh -1 ecoli_R1.fastq.gz -2 ecoli_R2.fastq.gz -d database.fasta -s ECO_553 -g ENTERO
    ./plasmidID.sh -1 ecoli_R1.fastq.gz -2 ecoli_R2.fastq.gz -d PacBio_sample.fasta -c scaffolds.fasta -C 60 -s ECO_60 -g ENTERO --no-trim
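
When several samples belong to the same group, the command line can be generated per sample. The following sketch only prints the commands (a dry run). It assumes a hypothetical naming scheme in which reads are stored as <sample>_R1.fastq.gz and <sample>_R2.fastq.gz; the function name plasmidid_cmd is likewise illustrative.

```sh
#!/bin/sh
# Hypothetical helper: print (not run) the plasmidID command for one sample.
# Assumes reads are named <sample>_R1.fastq.gz and <sample>_R2.fastq.gz.
# Usage: plasmidid_cmd SAMPLE GROUP DATABASE
plasmidid_cmd() {
    sample=$1; group=$2; database=$3
    printf './plasmidID.sh -1 %s_R1.fastq.gz -2 %s_R2.fastq.gz -d %s -s %s -g %s\n' \
        "$sample" "$sample" "$database" "$sample" "$group"
}

# Dry run: one command line per sample, all in the ENTERO group.
for s in ECO_553 ECO_60; do
    plasmidid_cmd "$s" ENTERO database.fasta
done
```

Once the printed lines look right, they can be piped to sh to actually run them.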
The full documentation for plasmidID is maintained as a
Texinfo manual. If the info and plasmidID programs are
properly installed at your site, the command
    info plasmidID
should give you access to the complete manual.