
Finding the Best Diagonals

This option is among the fastest and can be useful for a quick comparison of two long DNA sequences. The algorithm is as follows. First it finds the positions of runs of identical characters ("words") of length "word length", as in the find matching words algorithm. These words are accumulated in an imaginary SPIN Sequence Comparison Plot and the number of hits on each diagonal is summed to produce a histogram. The histogram is analysed to find its mean and standard deviation. The diagonals whose hit counts lie above a cutoff (defined in standard deviation units and set by the "minimum standard deviation" parameter) are rescanned using the find similar spans algorithm. Any window of length "window length" that reaches the "minimum score" produces a dot, which is plotted in the usual way.
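The steps above can be sketched in Python. This is only an illustration of the described procedure, not the SPIN implementation: the function name `best_diagonals` is hypothetical, the window score is assumed to be a plain count of identical characters (SPIN may use a score matrix), the cutoff is assumed to be the mean plus "minimum sd" standard deviations, and the statistics are assumed to include diagonals with no hits.

```python
from collections import Counter, defaultdict
from statistics import mean, pstdev


def best_diagonals(h, v, word_length=8, min_sd=3.0,
                   window_length=11, min_score=9):
    """Return 1-based (h, v) start positions of windows that pass the score.

    Illustrative sketch only; scoring and cutoff details are assumptions.
    """
    if not h or not v:
        return []

    # Step 1: index every word of the horizontal sequence by its content.
    words = defaultdict(list)
    for i in range(len(h) - word_length + 1):
        words[h[i:i + word_length]].append(i)

    # Step 2: count identical-word hits on each diagonal (d = i - j).
    hits = Counter()
    for j in range(len(v) - word_length + 1):
        for i in words.get(v[j:j + word_length], ()):
            hits[i - j] += 1

    # Step 3: mean and standard deviation of the diagonal histogram,
    # taken over every possible diagonal (empty ones count as zero).
    diagonals = range(-(len(v) - 1), len(h))
    counts = [hits[d] for d in diagonals]
    cutoff = mean(counts) + min_sd * pstdev(counts)

    # Step 4: rescan only the diagonals above the cutoff with a sliding
    # window, scoring each window by its number of identical characters.
    dots = []
    for d in diagonals:
        if hits[d] <= cutoff:
            continue
        i0 = max(d, 0)          # first horizontal index on this diagonal
        j0 = i0 - d             # matching vertical index
        length = min(len(h) - i0, len(v) - j0)
        for k in range(length - window_length + 1):
            score = sum(h[i0 + k + t] == v[j0 + k + t]
                        for t in range(window_length))
            if score >= min_score:
                dots.append((i0 + k + 1, j0 + k + 1))
    return dots
```

Because only the few diagonals that stand out in the histogram are rescanned, the expensive window scoring is skipped for most of the plot, which is consistent with the speed noted above.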

[picture]

The dialogue box requests the horizontal and vertical sequences and their ranges (see section Selecting a sequence), the minimum number of identical characters in a run ("word length"), the minimum standard deviation, the window length and the minimum score.

The points are plotted to the SPIN Sequence Comparison Plot (see section SPIN Sequence Comparison Plot).

Further operations available for find best diagonals are:

Information
This command gives a brief description of the sequences used in the comparison and the input parameters used.

horizontal EMBL: hsproperd
vertical EMBL: mmproper
window length 11 minimum score 9 word length 8 minimum sd 3.000000

Results
A listing of all the matches is obtained in the Output Window. The horizontal (h) and vertical (v) positions of the beginning of the match are listed.

Positions       1066 h        905 v 
Positions       1067 h        906 v 
Positions       1068 h        907 v 
Positions       1069 h        908 v 
Positions       1070 h        909 v 
Positions       1071 h        910 v 
Positions       1072 h        911 v 
Positions       1073 h        912 v 
Positions       1074 h        913 v 
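A listing in this format can be post-processed outside SPIN. The sketch below, with hypothetical helper names `parse_matches` and `diagonal_counts`, assumes each line has exactly the layout shown above and groups the matches by diagonal (horizontal position minus vertical position).

```python
import re
from collections import Counter

# Assumed line layout: "Positions <h> h <v> v", as in the listing above.
LINE = re.compile(r"Positions\s+(\d+)\s+h\s+(\d+)\s+v")


def parse_matches(text):
    """Extract (h, v) integer pairs from a Results listing."""
    return [(int(h), int(v)) for h, v in LINE.findall(text)]


def diagonal_counts(matches):
    """Count how many matches fall on each diagonal (h - v)."""
    return Counter(h - v for h, v in matches)


listing = """\
Positions       1066 h        905 v
Positions       1067 h        906 v
Positions       1068 h        907 v
"""
print(diagonal_counts(parse_matches(listing)))  # Counter({161: 3})
```

All nine matches in the listing above have the same difference, 161, so they lie on a single diagonal.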

Configure
This option allows the line width and colour of the matches to be altered (see section Colour Selector). A colour browser is displayed in which the desired line width or colour can be set. Pressing OK updates the SPIN Sequence Comparison Plot.
Display sequences
Selecting this command invokes the sequence display (see section Sequence Comparison Display). Moving the cursor in the sequence display moves the cursors of the same sequence in any SPIN Sequence Comparison Plot (see section Cursors). To force the sequence display to show the nearest match, use the "nearest match" button in the sequence display plot.
Hide
This option removes the points from the SPIN Sequence Comparison Plot but retains the information in memory.
Reveal
This option will redisplay previously hidden points in the SPIN Sequence Comparison Plot.
Remove
This command removes all the information for this particular invocation of find best diagonals; access to this data is then lost.

This page is maintained by staden-package. Last generated on 22 October 2002.
URL: http://www.mrc-lmb.cam.ac.uk/pubseq/manual/spin_unix_36.html