
Finding Similar Spans

This method was first described by McLachlan (McLachlan, A.D. Tests for comparing related amino acid sequences. J. Mol. Biol. 61, 409-424 (1971)). It involves calculating a score for each position in the plot by summing the points found when looking forwards and backwards along a diagonal line of a given length (the window length). The algorithm does not simply look for identity but uses a score matrix that contains scores for every possible pair of character types. At each point where the score is at or above the minimum score, a match is saved. Each match is plotted as a single point in the SPIN Sequence Comparison Plot, corresponding to the centre of the matching span (see section SPIN Sequence Comparison Plot), although see "Rescan matches", below.
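The scoring described above can be sketched as follows. This is a minimal illustration, not the SPIN implementation: the function names are hypothetical, and a simple identity score (1 for identical characters, 0 otherwise) stands in for a real score matrix. It reports the start positions of each span; the plotted point would be the centre of the span.

```python
def identity_score(a, b):
    """Hypothetical stand-in for a score matrix: 1 if identical, else 0."""
    return 1 if a == b else 0


def find_similar_spans(h_seq, v_seq, window, min_score, score=identity_score):
    """Return (h_start, v_start, span_score) for spans scoring >= min_score.

    Start positions are 1-based. Every diagonal of the comparison is scanned
    with a sliding window of `window` characters.
    """
    matches = []
    n_h, n_v = len(h_seq), len(v_seq)
    # A diagonal is identified by the offset between the two sequences.
    for offset in range(-(n_v - 1), n_h):
        h0 = max(offset, 0)    # 0-based start of the diagonal in h_seq
        v0 = max(-offset, 0)   # 0-based start of the diagonal in v_seq
        length = min(n_h - h0, n_v - v0)
        if length < window:
            continue           # diagonal too short to hold one window
        pair = [score(h_seq[h0 + k], v_seq[v0 + k]) for k in range(length)]
        total = sum(pair[:window])
        for k in range(length - window + 1):
            if k > 0:
                # Slide the window: add the new pair, drop the oldest one.
                total += pair[k + window - 1] - pair[k - 1]
            if total >= min_score:
                matches.append((h0 + k + 1, v0 + k + 1, total))
    return matches


# One 11-character span with 9 identities, window 11, minimum score 9.
print(find_similar_spans("agcctatcaac", "agcctatgagc", 11, 9))
```

Updating a running total as the window slides avoids re-summing the whole window at each position along the diagonal.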


The dialogue box (shown above) requests the horizontal and vertical sequences and their ranges (see section Selecting a sequence), the window span length and the minimum score. Only results at or above this minimum score are plotted. The default value for the minimum score is one that would produce approximately 500 matches between two random sequences of the same composition as the two under investigation (see section Probabilities and expected numbers of matches). This value of 500 can be changed using the "Configure default number of matches" option of the "Options" menu on the main menubar (see section Changing the default number of matches). The upper and lower limits of the minimum score are determined in the same way, except that the expected number of matches is 0 for the upper limit and "maximum number of matches" for the lower limit. If more matches need to be plotted, the "maximum number of matches" value can be altered using the "Configure maximum number of matches" option of the "Options" menu (see section Changing the maximum number of matches).
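As an illustration of how a default minimum score could be derived from an expected number of matches, the sketch below builds the score distribution of a random span from the composition of the two sequences and picks the lowest score whose expected number of matches does not exceed the target. It assumes that positions within a span are independent and that every pair of span start positions is compared; the function names are hypothetical, and the calculation SPIN actually uses is the one described in the section on probabilities and expected numbers of matches.

```python
from collections import Counter, defaultdict


def identity_score(a, b):
    """Hypothetical stand-in for a score matrix: 1 if identical, else 0."""
    return 1 if a == b else 0


def pair_score_distribution(h_seq, v_seq, score):
    """Probability of each pair score for random sequences of this composition."""
    dist = defaultdict(float)
    for a, n_a in Counter(h_seq).items():
        for b, n_b in Counter(v_seq).items():
            dist[score(a, b)] += (n_a / len(h_seq)) * (n_b / len(v_seq))
    return dict(dist)


def span_score_distribution(pair_dist, window):
    """Distribution of the summed score over `window` independent positions."""
    dist = {0: 1.0}
    for _ in range(window):
        nxt = defaultdict(float)
        for s, p in dist.items():
            for t, q in pair_dist.items():
                nxt[s + t] += p * q
        dist = dict(nxt)
    return dist


def default_min_score(h_seq, v_seq, window, score, target=500):
    """Lowest score whose expected number of random matches is <= target."""
    pair_dist = pair_score_distribution(h_seq, v_seq, score)
    span_dist = span_score_distribution(pair_dist, window)
    # Assumption: every pair of span start positions is compared once.
    n_spans = max(len(h_seq) - window + 1, 0) * max(len(v_seq) - window + 1, 0)
    best = max(span_dist)
    tail = 0.0  # P(span score >= s), accumulated from the top score down
    for s in sorted(span_dist, reverse=True):
        tail += span_dist[s]
        if tail * n_spans > target:
            break
        best = s
    return best


# Two toy sequences of 1000 bases with uniform composition.
print(default_min_score("acgt" * 250, "acgt" * 250, 11, identity_score))
```

Setting `target` to 0 or to the maximum number of matches gives, under the same assumptions, analogues of the upper and lower limits described above.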

Further operations available for Find similar spans are:

Information
This command gives a brief description of the sequences used in the comparison, the input parameters used and the number of matches found.

horizontal EMBL: hsproperd 
vertical EMBL: mmproper
window length 11 min match 9
number of matches 1772

List results
A detailed listing of all the hits found is displayed in the Output Window.

Positions          2 h        630 v and score          9

 Percentage mismatch  18.2
                2        12
              H agcctatcaac
                ::::::: : :
              V agcctatgagc
              630       640

Positions          7 h        369 v and score          9

 Percentage mismatch  18.2
                7        17
              H atcaacccaga
                :  ::::::::
              V aggaacccaga
              369       379
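The match line and the percentage mismatch in a listing like the one above can be reproduced with a short sketch. It assumes that ":" marks identical characters and that the percentage is the number of non-identical positions divided by the span length; the function name is hypothetical.

```python
def describe_span(h_span, v_span):
    """Return (match_line, percent_mismatch) for two equal-length spans."""
    if len(h_span) != len(v_span) or not h_span:
        raise ValueError("spans must be non-empty and of equal length")
    # ':' under identical characters, a space under mismatches.
    marks = "".join(":" if a == b else " " for a, b in zip(h_span, v_span))
    mismatches = marks.count(" ")
    return marks, round(100.0 * mismatches / len(h_span), 1)


marks, percent = describe_span("agcctatcaac", "agcctatgagc")
print("H", "agcctatcaac")
print(" ", marks)
print("V", "agcctatgagc")
print("Percentage mismatch", percent)  # 2 mismatches in 11 positions: 18.2
```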

Tabulate Scores
This option lists scores, probabilities, and their expected and observed numbers of matches.

score    9 probability 1.73e-04 expected          365 observed 1772
score   10 probability 1.17e-05 expected           25 observed 601
score   11 probability 3.60e-07 expected            1 observed 149

Rescan matches
It is also possible to plot a dot for each residue with a score above a minimum value within each matching span using the "Rescan matches" command. This is only a temporary result and will be destroyed if the SPIN Sequence Comparison Plot is altered (see section Controlling and Managing Results).
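A rescan of a single matching span might look like the sketch below. It is an assumption-laden illustration: the function name is hypothetical, an identity score stands in for the score matrix, and positions scoring at or above the minimum are kept (whether SPIN treats the minimum as inclusive is not stated here).

```python
def identity_score(a, b):
    """Hypothetical stand-in for a score matrix: 1 if identical, else 0."""
    return 1 if a == b else 0


def rescan_match(h_seq, v_seq, h_start, v_start, window, min_pair_score,
                 score=identity_score):
    """Return 1-based (h, v) positions inside one span that deserve a dot.

    h_start and v_start are the 1-based start positions of the span.
    Assumption: a position is kept when its pair score >= min_pair_score.
    """
    dots = []
    for k in range(window):
        h = h_start - 1 + k
        v = v_start - 1 + k
        if score(h_seq[h], v_seq[v]) >= min_pair_score:
            dots.append((h + 1, v + 1))
    return dots


# Positions 8 and 10 of this span are mismatches, so they get no dot.
print(rescan_match("agcctatcaac", "agcctatgagc", 1, 1, 11, 1))
```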
Configure
This option allows the line width and colour of the matches to be altered (see section Colour Selector). A colour browser is displayed from which the desired line width or colour can be configured. Pressing OK will update the SPIN Sequence Comparison Plot.
Display sequences
Selecting this command invokes the SPIN Sequence Comparison Display (see section Sequence Comparison Display). Moving the cursor in the sequence display will move the cursors of the same sequence in any SPIN Sequence Comparison Plot (see section Cursors). To force the sequence display to show the nearest match, use the "nearest match" button in the sequence display plot. To force the sequences to maintain their current register, activate the "Lock" button.
Hide
This option removes the points from the SPIN Sequence Comparison Plot but retains the information in memory.
Reveal
This option will redisplay previously hidden points in the SPIN Sequence Comparison Plot.
Remove
This command removes all the information regarding this particular invocation of Find similar spans, and access to this data is lost.

This page is maintained by staden-package. Last generated on 22 October 2002.