NAME

topdiff - Top-down mass spectrometry-based identification of Differentially expressed proteoforms

SYNOPSIS

topdiff [options] database-file-name spectrum-file-names

DESCRIPTION

TopDiff (Top-down mass spectrometry-based identification of Differentially expressed proteoforms) compares proteoform abundances across several protein samples and finds differentially expressed proteoforms. It relies on proteoform identifications obtained from the top-down mass spectrometry data of those samples.

1. Input

A protein database file in the FASTA format
Several mass spectrum data files in the msalign format
Proteoform identification files for those mass spectrum data files, in the XML format, e.g., spectra_ms2_toppic_proteoform.xml
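
The three kinds of input listed above can be checked before a run. The sketch below is only an illustration: the helper names expected_xml and check_inputs are hypothetical, and the naming rule (NAME.msalign is paired with NAME_<tool>_proteoform.xml) is an assumption extrapolated from the single example file name given above, not documented TopDiff behavior.

```shell
#!/bin/sh
# Hypothetical helper: derive the identification file name that is assumed
# to accompany a spectrum file (NAME.msalign -> NAME_<tool>_proteoform.xml).
expected_xml() {
    msalign=$1
    tool=${2:-toppic}
    printf '%s_%s_proteoform.xml\n' "${msalign%.msalign}" "$tool"
}

# Hypothetical helper: report missing spectrum or identification files.
# Returns non-zero if at least one file is missing.
check_inputs() {
    tool=$1
    shift
    status=0
    for f in "$@"; do
        xml=$(expected_xml "$f" "$tool")
        [ -f "$f" ] || { echo "missing spectrum file: $f" >&2; status=1; }
        [ -f "$xml" ] || { echo "missing identification file: $xml" >&2; status=1; }
    done
    return $status
}

# Example use (not run here):
#   check_inputs toppic spectra1_ms2.msalign spectra2_ms2.msalign
```

If the identification files are named differently on a given system, only expected_xml needs to be adapted.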

2. Output

TopDiff outputs a csv file containing proteoform identifications and their abundances in the input mass spectrum data. The default output file name is sample_diff.csv.

OPTIONS

-h [ --help ] Print the help message.

-f [ --fixed-mod ] <C57|C58|a fixed modification file> Set fixed modifications. Three options are available: C57, C58, or the name of a text file specifying fixed modifications. When C57 is selected, carbamidomethylation on cysteine is the only fixed modification. When C58 is selected, carboxymethylation on cysteine is the only fixed modification.

-e [ --error-tolerance ] <a positive number> Set the error tolerance, in daltons, for mapping identified proteoforms across multiple samples. Default value: 1.2 Da.
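
To give a feel for what a tolerance in daltons means, the sketch below tests whether two masses differ by no more than a given tolerance. It is an illustration only: the function masses_match is hypothetical, and the assumption that the comparison reduces to an absolute mass difference is a simplification. TopDiff's actual mapping of proteoforms across samples may use further criteria.

```shell
#!/bin/sh
# Hypothetical helper: succeed if |mass1 - mass2| <= tolerance (in Da).
# The tolerance defaults to 1.2, the default value of the -e option.
masses_match() {
    awk -v a="$1" -v b="$2" -v tol="${3:-1.2}" 'BEGIN {
        d = a - b
        if (d < 0) d = -d
        exit !(d <= tol)   # exit status 0 means "within tolerance"
    }'
}

# Two masses 0.9 Da apart fall within the default 1.2 Da tolerance.
if masses_match 12000.50 12001.40; then
    echo "within tolerance"
fi
```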

-t [ --tool-name ] <toppic|topmg> Specify the name of the database search tool: toppic or topmg. Default: toppic.

-o [ --output ] <a file name> Specify the output file name. Default: sample_diff.csv.

EXAMPLES

Compare proteoform abundances using TopPIC identifications of two spectrum files, spectra1_ms2.msalign and spectra2_ms2.msalign. The protein sequence database file name is proteins.fasta.

topdiff proteins.fasta spectra1_ms2.msalign spectra2_ms2.msalign

Compare proteoform abundances using TopPIC identifications of the same two spectrum files, where a fixed modification, carbamidomethylation on cysteine, was used in the database search.

topdiff -f C57 proteins.fasta spectra1_ms2.msalign spectra2_ms2.msalign

Compare proteoform abundances using TopMG identifications of the same two spectrum files, where a fixed modification, carbamidomethylation on cysteine, was used in the database search.

topdiff -f C57 -t topmg proteins.fasta spectra1_ms2.msalign spectra2_ms2.msalign
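
The -e and -o options described in OPTIONS can be combined with the examples above in the same way. The sketch below assembles such a command line and prints it instead of running it, so that it can be reviewed first. The helper topdiff_cmdline, the tolerance value 1.5 and the output name treated_vs_control.csv are hypothetical choices made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print a topdiff command line using the documented
# -e (error tolerance) and -o (output file) options. Nothing is executed.
topdiff_cmdline() {
    tolerance=$1
    output=$2
    database=$3
    shift 3
    printf 'topdiff -e %s -o %s %s' "$tolerance" "$output" "$database"
    for spectrum in "$@"; do
        printf ' %s' "$spectrum"
    done
    printf '\n'
}

topdiff_cmdline 1.5 treated_vs_control.csv proteins.fasta \
    spectra1_ms2.msalign spectra2_ms2.msalign
```

The printed line can then be pasted into a shell on a system where topdiff is installed.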

MAN PAGE PRODUCTION

This man page was written by Filippo Rusconi <lopippo@debian.org>. Material was taken from http://proteomics.informatics.iupui.edu/software/toppic/manual.html.

AUTHOR

Filippo Rusconi <lopippo@debian.org> and upstream authors (Dr. Xiaowen Liu's Lab at Indiana University-Purdue University Indianapolis and others)

COPYRIGHT

Filippo Rusconi and Indiana University-Purdue University Indianapolis

20200521