TFBS::Matrix::ICM(3pm) User Contributed Perl Documentation TFBS::Matrix::ICM(3pm)

TFBS::Matrix::ICM - class for information content matrices of nucleotide patterns

  • creating a TFBS::Matrix::ICM object manually:

        my $matrixref = [ [ 2.00, 0.30, 0.00, 0.00, 0.24, 0.00 ],
                          [ 0.00, 0.00, 0.00, 1.45, 0.42, 0.00 ],
                          [ 0.00, 0.89, 2.00, 0.00, 0.00, 0.00 ],
                          [ 0.00, 0.00, 0.00, 0.13, 0.06, 2.00 ]
                        ];  
        my $icm = TFBS::Matrix::ICM->new(-matrix => $matrixref,
                                         -name   => "MyProfile",
                                         -ID     => "M0001"
                                        );
     
        # or
     
        my $matrixstring = <<ENDMATRIX
        2.00   0.30   0.00   0.00   0.24   0.00
        0.00   0.00   0.00   1.45   0.42   0.00
        0.00   0.89   2.00   0.00   0.00   0.00
        0.00   0.00   0.00   0.13   0.06   2.00
        ENDMATRIX
        ;
        my $icm = TFBS::Matrix::ICM->new(-matrixstring => $matrixstring,
                                         -name         => "MyProfile",
                                         -ID           => "M0001"
                                        );
        
  • retrieving a TFBS::Matrix::ICM object from a database:

    (See documentation of individual TFBS::DB::* modules to learn how to connect to different types of pattern databases and retrieve TFBS::Matrix::* objects from them.)

        my $db_obj = TFBS::DB::JASPAR2->new
                        (-connect => ["dbi:mysql:JASPAR2:myhost",
                                      "myusername", "mypassword"]);
        my $icm = $db_obj->get_Matrix_by_ID("M0001", "ICM");
        # or
        my $icm = $db_obj->get_Matrix_by_name("MyProfile", "ICM");
        
  • retrieving a list of individual TFBS::Matrix::ICM objects from a TFBS::MatrixSet object:

    (See documentation of TFBS::MatrixSet to learn how to create objects for storage and manipulation of multiple matrices.)

        my @icm_list = $matrixset->all_patterns(-sort_by=>"name");
        

  • drawing a sequence logo:

        $icm->draw_logo(-file=>"logo.png", 
                        -full_scale =>2.25,
                        -xsize=>500,
                        -ysize =>250, 
                        -graph_title=>"C/EBPalpha binding site logo", 
                        -x_title=>"position", 
                        -y_title=>"bits");
        

TFBS::Matrix::ICM is a class whose instances are objects representing information content matrices (ICMs). An ICM is normally calculated from a raw position frequency matrix (see TFBS::Matrix::PFM for an explanation of position frequency matrices). For example, given the following position frequency matrix,

    A:[ 12     3     0     0     4     0  ]
    C:[  0     0     0    11     7     0  ]
    G:[  0     9    12     0     0     0  ]
    T:[  0     0     0     1     1    12  ]

the standard computational procedure is applied to convert it into the following information content matrix:

    A:[2.00  0.30  0.00  0.00  0.24  0.00]
    C:[0.00  0.00  0.00  1.45  0.42  0.00]
    G:[0.00  0.89  2.00  0.00  0.00  0.00]
    T:[0.00  0.00  0.00  0.13  0.06  2.00]

which contains the information content, in bits, that each nucleotide contributes at the given position in the pattern.
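The conversion illustrated above can be sketched in plain Perl. This is a minimal illustration, not the library's implementation: pfm_to_icm is a hypothetical helper, and it assumes a uniform background distribution and no small-sample correction. Under those assumptions, each relative frequency is multiplied by the information content of its column (2 bits minus the column's Shannon entropy).

```perl
use strict;
use warnings;

# Position frequency matrix from the example above (rows: A, C, G, T).
my @pfm = (
    [ 12, 3,  0,  0, 4,  0 ],
    [  0, 0,  0, 11, 7,  0 ],
    [  0, 9, 12,  0, 0,  0 ],
    [  0, 0,  0,  1, 1, 12 ],
);

# Hypothetical helper, NOT part of TFBS: convert a PFM into an ICM,
# assuming a uniform background and no small-sample correction.
sub pfm_to_icm {
    my ($pfm) = @_;
    my $ncols = scalar @{ $pfm->[0] };
    my @icm;
    for my $col (0 .. $ncols - 1) {
        # Total number of observations in this column.
        my $total = 0;
        $total += $_->[$col] for @$pfm;

        # Relative frequency of each nucleotide in this column.
        my @p = map { $total ? $_->[$col] / $total : 0 } @$pfm;

        # Shannon entropy in bits; terms with p = 0 contribute nothing.
        my $entropy = 0;
        for my $p (@p) {
            $entropy -= $p * log($p) / log(2) if $p > 0;
        }

        # Column information content: 2 bits (the maximum for four
        # letters) minus the entropy, shared out by relative frequency.
        my $ic = 2 - $entropy;
        $icm[$_][$col] = $p[$_] * $ic for 0 .. $#p;
    }
    return \@icm;
}

my $icm    = pfm_to_icm(\@pfm);
my @labels = qw(A C G T);
for my $row (0 .. $#labels) {
    printf "%s:[%s]\n", $labels[$row],
        join("  ", map { sprintf "%.2f", $_ } @{ $icm->[$row] });
}
```

Rounded to two decimals, the printed rows agree with the information content matrix shown above.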

A TFBS::Matrix::PWM object (which the to_PWM method produces from an ICM) is equipped with methods to search nucleotide sequences and pairwise alignments of nucleotide sequences for the pattern it represents, and to return the set of sites found (a TFBS::SiteSet object for a single-sequence search, and a TFBS::SitePairSet object for an alignment search).

Please send bug reports and other comments to the author.

Boris Lenhard <Boris.Lenhard@cgb.ki.se>

The rest of the documentation details each of the object methods. Internal methods are preceded with an underscore.

 Title   : new
 Usage   : my $icm = TFBS::Matrix::ICM->new(%args)
 Function: constructor for the TFBS::Matrix::ICM object
 Returns : a new TFBS::Matrix::ICM object
 Args    : # you must specify exactly one of the following three:
 
           -matrix,      # reference to an array of arrays of numbers
              #or
           -matrixstring,# a string containing four lines
                         # of tab- or space-delimited numbers
              #or
           -matrixfile,  # the name of a file containing four lines
                         # of tab- or space-delimited numbers
           #######
 
           -name,        # string, OPTIONAL
           -ID,          # string, OPTIONAL
           -class,       # string, OPTIONAL
           -tags         # an array reference, OPTIONAL
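To make the relationship between the -matrixstring and -matrix forms concrete, the following plain-Perl sketch splits a four-line string into an array of arrays. The helper parse_matrixstring is hypothetical and is not the parser the constructor uses; it only shows how the two representations correspond.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of TFBS: split a four-line matrix string
# into a reference to an array of arrays (rows: A, C, G, T).
sub parse_matrixstring {
    my ($string) = @_;
    my @rows;
    for my $line (split /\n/, $string) {
        $line =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        next if $line eq '';        # skip blank lines
        push @rows, [ split /\s+/, $line ];   # tab- or space-delimited
    }
    die "expected 4 rows, got " . scalar(@rows) . "\n" unless @rows == 4;

    # All rows must describe the same number of positions.
    my $width = @{ $rows[0] };
    for my $row (@rows) {
        die "rows differ in length\n" unless @$row == $width;
    }
    return \@rows;
}

my $matrixstring = <<'ENDMATRIX';
2.00   0.30   0.00   0.00   0.24   0.00
0.00   0.00   0.00   1.45   0.42   0.00
0.00   0.89   2.00   0.00   0.00   0.00
0.00   0.00   0.00   0.13   0.06   2.00
ENDMATRIX

my $matrixref = parse_matrixstring($matrixstring);
printf "%d rows x %d columns\n",
    scalar @$matrixref, scalar @{ $matrixref->[0] };
```

The resulting $matrixref has the same shape as the array reference passed to -matrix in the synopsis.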

 Title   : to_PWM
 Usage   : my $pwm = $icm->to_PWM()
 Function: converts an information content matrix (a TFBS::Matrix::ICM object)
           to a position weight matrix. At present it assumes uniform
           background distribution of nucleotide frequencies.
 Returns : a new TFBS::Matrix::PWM object
 Args    : none; in future releases, it should be able to accept
           a user-defined background probability of the four
           nucleotides

 Title   : draw_logo
 Usage   : my $gdImageObj = $icm->draw_logo(%args)
 Function: Draws a "sequence logo", a graphical representation
           of a possibly degenerate fixed-width nucleotide
           sequence pattern, from the information content matrix
 Returns : a GD::Image object;
           if you only need the image file you can ignore it
 Args    : -file,       # the name of the output PNG image file
                        # OPTIONAL: default none
           -xsize       # width of the image in pixels
                        # OPTIONAL: default 600
           -ysize       # height of the image in pixels
                        # OPTIONAL: default 5/8 of -xsize
           -startpos    # start position in the logo for x axis
                        # OPTIONAL: default is 1
           -margin      # size of image margins in pixels
                        # OPTIONAL: default 15% of -ysize
           -full_scale  # the maximum value on the y-axis, in bits
                        # OPTIONAL: default 2.25
           -graph_title,# the graph title
                        # OPTIONAL: default none
           -x_title,    # x-axis title; OPTIONAL: default none
           -y_title     # y-axis title; OPTIONAL: default none
           -error_bars  # reference to an array of S.D. values for each column; OPTIONAL
           -ps          # if true, produces a postscript string instead of a GD::Image object
           -pdf         # if true AND the -file argument is used, produces an output pdf file

 Title   : _draw_ps_logo
 Usage   : my $postscript_string = $icm->_draw_ps_logo(%args)
           Internal method, should be accessed using draw_logo()
 Function: Draws a "sequence logo", a graphical representation
           of a possibly degenerate fixed-width nucleotide
           sequence pattern, from the information content matrix
 Returns : a postscript string;
           if you only need the image file you can ignore it
 Args    : -file,       # the name of the output PNG image file
                        # OPTIONAL: default none
           -xsize       # width of the image in pixels
                        # OPTIONAL: default 600
           -ysize       # height of the image in pixels
                        # OPTIONAL: default 5/8 of -xsize
           -full_scale  # the maximum value on the y-axis, in bits
                        # OPTIONAL: default 2.25
           -graph_title,# the graph title
                        # OPTIONAL: default none
           -x_title,    # x-axis title; OPTIONAL: default none
           -y_title     # y-axis title; OPTIONAL: default none
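As a rough guide to choosing -full_scale, the sketch below sums each column of the example ICM. In a conventional sequence logo the height of a letter stack equals the total information content of its column; that draw_logo lays out its stacks exactly this way is an assumption here, not something stated in this documentation. The variable names are illustrative only.

```perl
use strict;
use warnings;
use List::Util qw(sum max);

# Example ICM from the description above (rows: A, C, G, T).
my @icm = (
    [ 2.00, 0.30, 0.00, 0.00, 0.24, 0.00 ],
    [ 0.00, 0.00, 0.00, 1.45, 0.42, 0.00 ],
    [ 0.00, 0.89, 2.00, 0.00, 0.00, 0.00 ],
    [ 0.00, 0.00, 0.00, 0.13, 0.06, 2.00 ],
);

# Value intended for -full_scale (2.25 is the documented default).
my $full_scale = 2.25;

# Total information content per position: the sum over the four rows.
my $ncols   = @{ $icm[0] };
my @heights = map {
    my $col = $_;
    sum(map { $_->[$col] } @icm);
} 0 .. $ncols - 1;

printf "position %d: %.2f bits\n", $_ + 1, $heights[$_] for 0 .. $#heights;

# Assumption: a column taller than the y-axis maximum would not fit.
warn "tallest column exceeds -full_scale\n" if max(@heights) > $full_scale;
```

For a four-letter alphabet no column can exceed 2 bits, so the default of 2.25 leaves a small amount of headroom above the tallest possible stack.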

ID

The above methods are common to all matrix objects. Please consult TFBS::Matrix to find out how to use them.

2022-10-20 perl v5.36.0