MIRTOP(1) | mirtop | MIRTOP(1) |
mirtop - mirtop Documentation
Looking for a logo, enter the competition here. Deadline 07/07/2018. Win a t-shirt and stickers if your logo is selected!
We got a logo: https://github.com/miRTop/mirtop/tree/master/artwork
# Installation
## bioconda
` conda install mirtop -c bioconda `
## pypi
` pip install mirtop `
## update to develop version from pip
` pip install --upgrade --no-deps git+https://github.com/miRTop/mirtop.git#egg=mirtop `
## install develop version
The best solution is to install conda to get an independent environment.
```
wget http://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh
bash Miniconda-latest-Linux-x86_64.sh -b -p ~/mirtop_env
# conda's executables live in the bin/ directory of the install prefix
export PATH=~/mirtop_env/bin:$PATH
conda install -c bioconda bedtools samtools pip nose pysam pandas dateutil pyyaml pybedtools biopython setuptools
git clone http://github.com/miRTop/mirtop
cd mirtop
git fetch origin dev
git checkout dev
python setup.py develop
```
# Quick Start
## Importer
### From Bam files to GFF3
` git clone http://github.com/miRTop/mirtop && cd mirtop/data `
You can use the example data. Here the reads have been mapped to the precursor sequences.
` mirtop gff --sps hsa --hairpin examples/annotate/hairpin.fa --gtf examples/annotate/hsa.gff3 -o test_out sim_isomir.bam `
### From seqbuster::miraligner files to GFF3
miRNA annotation generated from [miraligner](https://github.com/lpantano/seqbuster) tool:
` mirtop gff --format seqbuster --sps hsa --hairpin examples/annotate/hairpin.fa --gtf examples/annotate/hsa.gff3 -o test_out examples/seqbuster/reads.mirna `
### From sRNAbench files to GFF3
miRNA annotation generated from [sRNAbench](http://bioinfo2.ugr.es:8080/ceUGR/srnabench/) tool:
` mirtop gff --format srnabench --sps hsa --hairpin examples/annotate/hairpin.fa --gtf examples/annotate/hsa.gff3 -o test_out srnabench examples/srnabench `
### From PROST! files to GFF3
miRNA annotation generated from the [PROST!]() tool. Export the isomiRs tab from the Excel file to a tab-delimited text file.
` mirtop gff --format prost --sps hsa --hairpin examples/annotate/hairpin.fa --gtf examples/annotate/hsa.gff3 -o test_out examples/prost/prost.example.txt `
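The importer commands above all share the same shape: an optional `--format`, the species, the hairpin FASTA, the annotation file, an output directory, and the input files. As a minimal sketch, assuming only the flags shown in those commands, the helper below (the name `build_gff_command` is hypothetical and not part of mirtop) assembles the argument list so the same call can be scripted for several tools:

```python
import shlex


def build_gff_command(inputs, sps, hairpin, gtf, out_dir, fmt=None):
    """Return the argument list for a `mirtop gff` call.

    Hypothetical helper: it only mirrors the flags used in the examples
    above and does not validate them against mirtop itself.
    """
    cmd = ["mirtop", "gff"]
    if fmt is not None:
        # BAM input needs no --format in the examples above
        cmd += ["--format", fmt]
    cmd += ["--sps", sps, "--hairpin", hairpin, "--gtf", gtf, "-o", out_dir]
    cmd += list(inputs)
    return cmd


cmd = build_gff_command(
    ["examples/seqbuster/reads.mirna"],
    sps="hsa",
    hairpin="examples/annotate/hairpin.fa",
    gtf="examples/annotate/hsa.gff3",
    out_dir="test_out",
    fmt="seqbuster",
)
# Print the command in a copy-and-paste friendly form
print(shlex.join(cmd))
```

To run the command rather than print it, pass `cmd` to `subprocess.check_call`, which requires mirtop to be installed and on the `PATH`.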
### From isomiR-SEA files to GFF3
miRNA annotation generated from [isomiR-SEA]() tool.
` mirtop gff --format isomirsea --sps hsa --hairpin examples/annotate/hairpin.fa --gtf examples/annotate/hsa.gff3 -o test_out examples/isomir-sea/tagMir-all.gff `
## Operations
### Validator
To validate your mirGFF3 file and make sure it follows the current format:
` mirtop validate examples/gff/correct_file.gff `
### Get statistics from GFF
Get number of isomiRs and miRNAs annotated in the GFF file by isomiR category.
` cd mirtop/data && mirtop stats -o test_out examples/gff/correct_file.gff `
### Compare GFF file with reference
Compare the sequences from two or more GFF files. The first one will be used as the reference data.
` cd mirtop/data && mirtop compare -o test_out examples/gff/correct_file.gff examples/gff/alternative.gff `
### Update mirGFF3
Update files written in older versions of the format to the most current one.
` cd mirtop/data && mirtop update -o test_out_mirs examples/versions/version1.0.gff `
## Export
### Export file to isomiRs format
To be compatible with the [isomiRs](https://bioconductor.org/packages/release/bioc/html/isomiRs.html) Bioconductor package, use:
` cd mirtop/data && mirtop export -o test_out_mirs --hairpin examples/annotate/hairpin.fa --gtf examples/annotate/hsa.gff3 examples/gff/correct_file.gff `
### Export file to FASTA format
` cd mirtop/data && mirtop export -o test_out_mirs --format fasta -d -vd --hairpin examples/annotate/hairpin.fa --gtf examples/annotate/hsa.gff3 examples/gff/correct_file.gff `
### Export file to VCF format
` cd mirtop/data && mirtop export -o test_out_mirs --format vcf --hairpin examples/annotate/hairpin.fa --gtf examples/annotate/hsa.gff3 examples/gff/correct_file.gff `
### Get count file
This file is useful to load into R as a matrix. It contains the minimal information about each sequence, plus one column of count data per sample.
` cd mirtop/data && mirtop counts -o test_out_mirs --hairpin examples/annotate/hairpin.fa --gtf examples/annotate/hsa.gff3 examples/synthetic/let7a-5p.gtf `
# Output
## GFF command
The `mirtop gff` command generates the adapted GFF3 format (mirGFF3) used to capture miRNA variations. The output is explained [here](https://github.com/miRTop/incubator/blob/master/format/definition.md).
## Stats command
The `mirtop stats` command generates a table with different statistics for each type of isomiR.
It also generates a JSON file with the same information, which integrates easily with QC tools like [MultiQC](https://multiqc.info/).
## Compare command
The `mirtop compare` command generates a tabular file describing the differences and similarities between files. The first file on the command line is treated as the reference, and each following file is compared to it. Each line of the output has the following information for each file:
## Counts command
The `mirtop counts` command generates a tabular file with the minimal information about each sequence and one count column per sample.
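Because the counts file is plain tabular text, it can be split into annotation columns and a numeric matrix with the standard library alone. The sketch below is an illustration under stated assumptions: the column names (`UID`, `Read`, `miRNA`, `Variant`), the sample names and the values are all made up for the example and are not taken from real mirtop output, so the annotation column list should be adjusted to match the actual file.

```python
import csv
import io

# Made-up miniature table, for illustration only (not real mirtop output).
EXAMPLE = (
    "UID\tRead\tmiRNA\tVariant\tsample1\tsample2\n"
    "uid1\tTGAGGTAGTAGGTTGTATAGTT\thsa-let-7a-5p\tNA\t10\t12\n"
    "uid2\tTGAGGTAGTAGGTTGTATAGT\thsa-let-7a-5p\tiso_3p:-1\t3\t0\n"
)


def load_counts(handle, annotation_columns):
    """Split a tab-separated counts table into annotation rows and a matrix.

    Every column not listed in `annotation_columns` is treated as a sample
    and its values are parsed as integers.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    samples = [c for c in reader.fieldnames if c not in annotation_columns]
    annotation, matrix = [], []
    for row in reader:
        annotation.append({c: row[c] for c in annotation_columns})
        matrix.append([int(row[s]) for s in samples])
    return samples, annotation, matrix


samples, annotation, matrix = load_counts(
    io.StringIO(EXAMPLE), ["UID", "Read", "miRNA", "Variant"]
)
print(samples)
print(matrix)
```

For a file on disk, replace `io.StringIO(EXAMPLE)` with an open file handle.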
## Export command
The `mirtop export` command generates different files from a mirGFF3 file: input for the isomiRs package, FASTA, or VCF, depending on the chosen format.
To add new sub-commands, modify the following:
## How to add a new sub-command
First, clone and install the tool in [develop mode](installation.html).
Suppose you want to add a new operation to mirtop, for instance one similar to the stats command that works with sGFF3 files. For this example, assume a test function that just reads the file and prints Hello GFF3.
```
from mirtop.gff.body import read_gff_line
import mirtop.libs.logger as mylog
logger = mylog.getLogger(__name__)
```
```
def add_subparser_test(subparsers):
    ...
```
```
elif "test" in kwargs:
    logger.info("Run test.")
    test(kwargs["args"])
```
` from mirtop.test import test `
Try the new operation:
` mirtop test data/examples/correct_file.gff `
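The wiring described above follows the usual `argparse` sub-command pattern: register a sub-parser, then dispatch on the chosen command. The sketch below is a self-contained stand-in that does not import mirtop. The program name `toy`, the `files` argument, and the function name `run_test` (used here in place of the `test` function named above) are assumptions made for illustration, not mirtop's actual code.

```python
import argparse
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)


def add_subparser_test(subparsers):
    """Register a hypothetical `test` sub-command on an argparse sub-parser set."""
    parser = subparsers.add_parser("test", help="print a greeting for each GFF3 file")
    parser.add_argument("files", nargs="+", help="GFF3 files to read")
    return parser


def run_test(args):
    """Read each file and log a greeting with its number of non-header lines."""
    seen = []
    for fn in args.files:
        with open(fn) as handle:
            n_records = sum(1 for line in handle if not line.startswith("#"))
        logger.info("Hello GFF3: %s (%d records)", fn, n_records)
        seen.append((fn, n_records))
    return seen


def main(argv=None):
    """Parse the command line and dispatch to the selected sub-command."""
    parser = argparse.ArgumentParser(prog="toy")
    subparsers = parser.add_subparsers(dest="command", required=True)
    add_subparser_test(subparsers)
    args = parser.parse_args(argv)
    if args.command == "test":
        logger.info("Run test.")
        return run_test(args)
```

Calling `main(["test", "some_file.gff"])` exercises the same path that a command-line invocation would take.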
## Add a unit test
## for the internal function
Add to the end of test/test_functions.py, but inside class FunctionsTest(unittest.TestCase): this code:
```
@attr(fn_test=True)
def test_function_test(self):
    ...
```
## for the sub-command
Add to the end of test/test_function.py, but inside class AutomatedAnalysisTest(unittest.TestCase): this code:
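```
@attr(cmd_test=True)
def test_srnaseq_annotation_bam(self):
    # Command under test, taken from the "Try the new operation" step;
    # the test module needs `import subprocess` at the top
    clcode = ["mirtop", "test", "data/examples/correct_file.gff"]
    print("")
    print(" ".join(clcode))
    subprocess.check_call(clcode)
```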
## test the unit
nose is needed: ` pip install nose `
Run the function test from the top parent folder:
` ./run_test.sh fn_test `
Run the command test from the top parent folder:
` ./run_test.sh cmd_test `
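The `fn_test` and `cmd_test` labels work because the `attr` decorator tags each test method, and the runner then selects tests by tag. The sketch below illustrates that mechanism with the standard library only. The `attr` function here is a stand-in written for this example (nose's own decorator is assumed to behave similarly by setting attributes on the function), and `select` is a hypothetical helper, not something `run_test.sh` is known to use.

```python
import unittest


def attr(**tags):
    """Stand-in for nose's attr decorator: attach tags as function attributes."""
    def decorate(func):
        for name, value in tags.items():
            setattr(func, name, value)
        return func
    return decorate


class FunctionsTest(unittest.TestCase):
    @attr(fn_test=True)
    def test_function_test(self):
        self.assertEqual("Hello GFF3".split()[1], "GFF3")

    @attr(cmd_test=True)
    def test_command(self):
        self.assertTrue(True)


def select(case, tag):
    """Build a suite holding only the test methods of `case` that carry `tag`."""
    names = unittest.TestLoader().getTestCaseNames(case)
    return unittest.TestSuite(
        case(name) for name in names if getattr(getattr(case, name), tag, False)
    )


# Run only the tests tagged fn_test, leaving cmd_test ones untouched
suite = select(FunctionsTest, "fn_test")
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Tagging keeps fast function-level tests separate from slower command-level ones, so each group can be run on its own.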
Read bam files
clean: Use mirtop.filter.clean_hits() to remove lower score hits.
>>> {'read_id': mirtop.realign.hits, ...}
Returns:
precursor (str): sequence of the precursor.
start (int): start position of sequence on the precursor, +1.
cigar (str): similar to SAM CIGAR attribute.
Returns:
add (list): nt added to the end
cigar (str): updated cigar
Read GFF files and output isomiRs compatible format
Read a GFF file and produce an output file containing expression counts
Read GFF files and output FASTA format
GFF reader and creator helpers
>>> {'iso_3p': -3, ...}
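A dictionary such as the one shown above can be derived from the Variant attribute of a mirGFF3 line. The sketch below assumes the attribute is a comma-separated list in which entries are either `name:value` pairs (for example `iso_3p:-3`) or bare flags, and that `NA` marks a sequence without variants. The function name `parse_variant` is hypothetical and this is not mirtop's own implementation.

```python
def parse_variant(variant):
    """Convert a Variant string into a dictionary.

    Assumed input shape: 'iso_5p:+1,iso_3p:-3,iso_snv'. Entries with a
    colon become integer values; bare entries become True; 'NA' and empty
    entries are skipped.
    """
    parsed = {}
    for item in variant.split(","):
        item = item.strip()
        if not item or item == "NA":
            continue
        if ":" in item:
            name, value = item.split(":", 1)
            parsed[name] = int(value)  # int() accepts a leading + or -
        else:
            parsed[item] = True
    return parsed


print(parse_variant("iso_5p:+1,iso_3p:-3,iso_snv"))
```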
Compare multiple GFF files to a reference
Helpers to define the header of the GFF file
Produce stats from GFF3 format
Update gff3 files to newest version
Read isomiR GFF files
database(str): database name.
Read prost! files
database(str): database name.
Read seqbuster files
database(str): database name.
Read sRNAbench files
database(str): database name.
Read isomiR GFF files from optimir tool
database(str): database name.
Read Manatee files
database(str): database name.
Centralize running of external commands, providing logging and tracking. Integrated from bcbio package with some changes.
Helpers to work with fastq files
utils from http://www.github.com/chapmanb/bcbio-nextgen.git
>>> with chdir(temporal):
do_something()
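The usage shown above is that of a context manager that changes the working directory and restores it afterwards. A minimal sketch of that pattern, written with `contextlib` and not copied from the bcbio or mirtop sources, could look like this:

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def chdir(new_dir):
    """Switch to new_dir for the duration of the with-block, then switch back."""
    previous = os.getcwd()
    os.chdir(new_dir)
    try:
        yield
    finally:
        # Restore the original directory even if the block raises
        os.chdir(previous)


start = os.getcwd()
with tempfile.TemporaryDirectory() as temporal:
    expected = os.path.realpath(temporal)
    with chdir(temporal):
        inside = os.path.realpath(os.getcwd())
after = os.getcwd()
```

The `try`/`finally` is the important part: without it, an exception inside the block would leave the process in the temporary directory.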
Read bam files
Read precursor fasta file
Read database information
TODO: this needs to be generic to other databases.
>>> {'parent': {mirna: [start, end]}}
variants(str): string from Variant attribute in GFF file.
mature(list): [start, end].
exact(boolean): not add 4+/- flanking nts.
nt(int): number of nts to get.
It uses the code from mirtop.mirna.keys().
Inspired by MINTplate: https://cm.jefferson.edu/MINTbase https://github.com/TJU-CMC-Org/MINTmap/tree/master/MINTplates
>>> AAATTTT
position(int): >>> 3
>>> A
>>> AAATTTT
position(int): >>> 3
>>> T
>>> AAATTTT
position(int): >>> 3
>>> TT
Lorena Pantano, Thomas Desvignes, Karen Eilbeck, Ioannis Vlachos, Bastian Fromm, Marc K. Halushka, Michael Hackenberg, Gianvito Urgese
2021, Lorena Pantano, Thomas Desvignes, Karen Eilbeck, Ioannis Vlachos, Bastian Fromm, Marc K. Halushka, Michael Hackenberg, Gianvito Urgese
June 30, 2021 | 0.3 |