TFBS::Matrix::ICM(3pm) | User Contributed Perl Documentation | TFBS::Matrix::ICM(3pm) |
TFBS::Matrix::ICM - class for information content matrices of nucleotide patterns
    my $matrixref = [ [ 2.00, 0.30, 0.00, 0.00, 0.24, 0.00 ],
                      [ 0.00, 0.00, 0.00, 1.45, 0.42, 0.00 ],
                      [ 0.00, 0.89, 2.00, 0.00, 0.00, 0.00 ],
                      [ 0.00, 0.00, 0.00, 0.13, 0.06, 2.00 ]
                    ];
    my $icm = TFBS::Matrix::ICM->new(-matrix => $matrixref,
                                     -name   => "MyProfile",
                                     -ID     => "M0001"
                                    );
    # or

    my $matrixstring = <<ENDMATRIX
    2.00 0.30 0.00 0.00 0.24 0.00
    0.00 0.00 0.00 1.45 0.42 0.00
    0.00 0.89 2.00 0.00 0.00 0.00
    0.00 0.00 0.00 0.13 0.06 2.00
    ENDMATRIX
    ;
    my $icm = TFBS::Matrix::ICM->new(-matrixstring => $matrixstring,
                                     -name         => "MyProfile",
                                     -ID           => "M0001"
                                    );
(See documentation of individual TFBS::DB::* modules to learn how to connect to different types of pattern databases and retrieve TFBS::Matrix::* objects from them.)
    my $db_obj = TFBS::DB::JASPAR2->new
                    (-connect => ["dbi:mysql:JASPAR2:myhost",
                                  "myusername", "mypassword"]);
    my $icm = $db_obj->get_Matrix_by_ID("M0001", "ICM");
    # or
    my $icm = $db_obj->get_Matrix_by_name("MyProfile", "ICM");
(See documentation of TFBS::MatrixSet to learn how to create objects for storage and manipulation of multiple matrices.)
my @icm_list = $matrixset->all_patterns(-sort_by=>"name");
* drawing a sequence logo
    $icm->draw_logo(-file        => "logo.png",
                    -full_scale  => 2.25,
                    -xsize       => 500,
                    -ysize       => 250,
                    -graph_title => "C/EBPalpha binding site logo",
                    -x_title     => "position",
                    -y_title     => "bits");
TFBS::Matrix::ICM is a class whose instances are objects representing information content matrices (ICMs). An ICM is normally calculated from a raw position frequency matrix (see TFBS::Matrix::PFM for the explanation of position frequency matrices). For example, given the following position frequency matrix,
    A:[ 12     3     0     0     4     0  ]
    C:[  0     0     0    11     7     0  ]
    G:[  0     9    12     0     0     0  ]
    T:[  0     0     0     1     1    12  ]
the standard computational procedure is applied to convert it into the following information content matrix:
    A:[2.00  0.30  0.00  0.00  0.24  0.00]
    C:[0.00  0.00  0.00  1.45  0.42  0.00]
    G:[0.00  0.89  2.00  0.00  0.00  0.00]
    T:[0.00  0.00  0.00  0.13  0.06  2.00]
which contains the "weights" associated with the occurrence of each nucleotide at the given position in a pattern.
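The conversion in the example above can be reproduced with a few lines of plain Perl. The sketch below is not the TFBS implementation; pfm_to_icm is a hypothetical helper written for illustration. It assumes a uniform background (0.25 per nucleotide, so at most 2 bits per column) and no small-sample correction, and under those assumptions it yields the values shown above when rounded to two decimals.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of TFBS): convert a position frequency
# matrix (rows A, C, G, T) into an information content matrix.
# Assumes a uniform background and no small-sample correction.
sub pfm_to_icm {
    my ($pfm) = @_;
    my $width = scalar @{ $pfm->[0] };
    my @icm   = map { [ (0) x $width ] } 0 .. 3;

    for my $col (0 .. $width - 1) {
        my $total = 0;
        $total += $pfm->[$_][$col] for 0 .. 3;
        next unless $total > 0;    # leave empty columns at zero

        # Shannon entropy of the column, in bits
        my $entropy = 0;
        for my $row (0 .. 3) {
            my $p = $pfm->[$row][$col] / $total;
            $entropy -= $p * log($p) / log(2) if $p > 0;
        }

        # Column information content, distributed over the four
        # nucleotides in proportion to their observed frequencies
        my $ic = 2 - $entropy;
        for my $row (0 .. 3) {
            $icm[$row][$col] = $pfm->[$row][$col] / $total * $ic;
        }
    }
    return \@icm;
}

my $pfm = [
    [ 12, 3,  0,  0, 4,  0 ],    # A
    [  0, 0,  0, 11, 7,  0 ],    # C
    [  0, 9, 12,  0, 0,  0 ],    # G
    [  0, 0,  0,  1, 1, 12 ],    # T
];

my $icm   = pfm_to_icm($pfm);
my @bases = qw(A C G T);
for my $row (0 .. 3) {
    printf "%s:[%s]\n", $bases[$row],
        join("  ", map { sprintf "%.2f", $_ } @{ $icm->[$row] });
}
```

For real work, prefer the conversion provided by the TFBS modules themselves, which may use a different background model or apply corrections that this sketch omits.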
A TFBS::Matrix::PWM object, which can be obtained from an ICM with the to_PWM method, is equipped with methods to search nucleotide sequences and pairwise alignments of nucleotide sequences for the pattern it represents, and to return the set of sites found (a TFBS::SiteSet object for a single-sequence search, and a TFBS::SitePairSet object for an alignment search).
Please send bug reports and other comments to the author.
Boris Lenhard <Boris.Lenhard@cgb.ki.se>
The rest of the documentation details each of the object methods. Internal methods are preceded with an underscore.
    Title   : new
    Usage   : my $icm = TFBS::Matrix::ICM->new(%args)
    Function: constructor for the TFBS::Matrix::ICM object
    Returns : a new TFBS::Matrix::ICM object
    Args    : # you must specify exactly one of the following three:

              -matrix,      # reference to an array of arrays of numbers
                         #or
              -matrixstring,# a string containing four lines
                            # of tab- or space-delimited numbers
                         #or
              -matrixfile,  # the name of a file containing four lines
                            # of tab- or space-delimited numbers
              #######

              -name,        # string, OPTIONAL
              -ID,          # string, OPTIONAL
              -class,       # string, OPTIONAL
              -tags         # an array reference, OPTIONAL
    Title   : to_PWM
    Usage   : my $pwm = $icm->to_PWM()
    Function: converts an information content matrix
              (a TFBS::Matrix::ICM object) to a position weight matrix.
              At present it assumes a uniform background distribution
              of nucleotide frequencies.
    Returns : a new TFBS::Matrix::PWM object
    Args    : none; in future releases, it should be able to accept
              a user-defined background probability of the four
              nucleotides
    Title   : draw_logo
    Usage   : my $gdImageObj = $icm->draw_logo(%args)
    Function: Draws a "sequence logo", a graphical representation
              of a possibly degenerate fixed-width nucleotide
              sequence pattern, from the information content matrix
    Returns : a GD::Image object;
              if you only need the image file you can ignore it
    Args    : -file,       # the name of the output PNG image file
                           # OPTIONAL: default none
              -xsize       # width of the image in pixels
                           # OPTIONAL: default 600
              -ysize       # height of the image in pixels
                           # OPTIONAL: default 5/8 of -xsize
              -startpos    # start position in the logo for x axis
                           # OPTIONAL: default is 1
              -margin      # size of image margins in pixels
                           # OPTIONAL: default 15% of -ysize
              -full_scale  # the maximum value on the y-axis, in bits
                           # OPTIONAL: default 2.25
              -graph_title,# the graph title
                           # OPTIONAL: default none
              -x_title,    # x-axis title; OPTIONAL: default none
              -y_title     # y-axis title; OPTIONAL: default none
              -error_bars  # reference to an array of S.D. values
                           # for each column; OPTIONAL
              -ps          # if true, produces a postscript string
                           # instead of a GD::Image object
              -pdf         # if true AND the -file argument is used,
                           # produces an output pdf file
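In a sequence logo, the height of the letter stack at each position corresponds to the total information content of that column, which is why the y-axis is measured in bits. The sketch below sums the columns of the example matrix from this page, which can help when choosing a value for -full_scale. column_totals is a hypothetical helper written for illustration, not a TFBS method.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of TFBS): sum each column of a
# four-row information content matrix (rows A, C, G, T).
sub column_totals {
    my ($icm) = @_;
    my $width = scalar @{ $icm->[0] };
    my @totals;
    for my $col (0 .. $width - 1) {
        my $sum = 0;
        $sum += $_->[$col] for @$icm;
        push @totals, $sum;
    }
    return @totals;
}

# The example information content matrix used throughout this page
my $icm = [
    [ 2.00, 0.30, 0.00, 0.00, 0.24, 0.00 ],    # A
    [ 0.00, 0.00, 0.00, 1.45, 0.42, 0.00 ],    # C
    [ 0.00, 0.89, 2.00, 0.00, 0.00, 0.00 ],    # G
    [ 0.00, 0.00, 0.00, 0.13, 0.06, 2.00 ],    # T
];

my @totals = column_totals($icm);
printf "position %d: %.2f bits\n", $_ + 1, $totals[$_] for 0 .. $#totals;
```

With a uniform background no column can exceed 2 bits, so the default -full_scale of 2.25 leaves a small amount of headroom above the tallest possible stack.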
    Title   : _draw_ps_logo
    Usage   : my $postscript_string = $icm->_draw_ps_logo(%args)
              Internal method, should be accessed using draw_logo()
    Function: Draws a "sequence logo", a graphical representation
              of a possibly degenerate fixed-width nucleotide
              sequence pattern, from the information content matrix
    Returns : a postscript string;
              if you only need the image file you can ignore it
    Args    : -file,       # the name of the output PNG image file
                           # OPTIONAL: default none
              -xsize       # width of the image in pixels
                           # OPTIONAL: default 600
              -ysize       # height of the image in pixels
                           # OPTIONAL: default 5/8 of -xsize
              -full_scale  # the maximum value on the y-axis, in bits
                           # OPTIONAL: default 2.25
              -graph_title,# the graph title
                           # OPTIONAL: default none
              -x_title,    # x-axis title; OPTIONAL: default none
              -y_title     # y-axis title; OPTIONAL: default none
The above methods are common to all matrix objects. Please consult TFBS::Matrix to find out how to use them.
2020-11-09 | perl v5.32.0 |