
Consensus Calculation Using Base Frequencies

This algorithm can be used for any data, with or without confidence values. Each standard base type is given the same weight. The consensus will be the most frequent base type in a given column provided that the consensus cutoff parameter is low enough. All unrecognised base types, including IUB codes, are treated as dashes. Dashes are given a weight of 1/10th that of recognised base types. Pads are given a weight which is the average of their neighbouring bases.

The confidence of a consensus base for this method is expressed as a percentage. For example, a column containing the bases A, A, A and T gives a consensus base of A with a confidence of 75. A consensus cutoff of 76 or higher will therefore give a consensus base of "-".
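The calculation described above can be sketched in Python as follows. This is an illustrative sketch only, not the gap4 source: the function name `frequency_consensus` is hypothetical, pads are not modelled, and two details are assumptions rather than statements from this manual, namely that dash weights are included in the total when the percentage is formed and that a base is accepted when its confidence is greater than or equal to the cutoff (inferred from the 75/76 example).

```python
from collections import Counter

STANDARD_BASES = "ACGT"
DASH_WEIGHT = 0.1  # dashes weigh 1/10th of a recognised base (from the text)


def frequency_consensus(column, cutoff):
    """Return (consensus_base, confidence_percent) for one alignment column.

    Hypothetical helper. Assumptions: every symbol outside A, C, G, T is
    treated as a dash, dash weights count towards the total, pads are not
    modelled, and the cutoff test is "confidence >= cutoff".
    """
    counts = Counter(column.upper())

    # Each recognised base type gets weight 1 per occurrence.
    weights = {base: float(counts.get(base, 0)) for base in STANDARD_BASES}

    # Everything else (including IUB codes) is counted as a dash.
    n_other = sum(n for sym, n in counts.items() if sym not in STANDARD_BASES)
    total = sum(weights.values()) + DASH_WEIGHT * n_other

    # max() returns the first maximum, so ties resolve in A, C, G, T order.
    best = max(STANDARD_BASES, key=lambda base: weights[base])
    if total == 0 or weights[best] == 0:
        return "-", 0.0

    confidence = 100.0 * weights[best] / total
    if confidence >= cutoff:
        return best, confidence
    return "-", confidence


print(frequency_consensus("AAAT", 75))  # ('A', 75.0)
print(frequency_consensus("AAAT", 76))  # ('-', 75.0)
```

With the column A, A, A, T the sketch reproduces the figures in the text: a confidence of 75, and a consensus of "-" once the cutoff reaches 76.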

If more than one base type is calculated to have the same confidence, and that confidence exceeds the consensus cutoff, the consensus base is chosen in descending order of precedence: A, C, G and T.
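The precedence rule can be illustrated with a short Python sketch. The function name `break_tie` and the input format (a mapping from base to confidence percentage) are hypothetical, and "exceeds the cutoff" is taken here to mean "meets or exceeds", which is an assumption consistent with the 75/76 example earlier in this section.

```python
PRECEDENCE = "ACGT"  # descending order of precedence, as stated in the text


def break_tie(confidences, cutoff):
    """Pick the consensus base from per-base confidence percentages.

    Hypothetical helper: `confidences` maps base letters to percentages.
    Returns "-" when no base reaches the cutoff (assumed to be ">=").
    """
    top = max(confidences.get(base, 0.0) for base in PRECEDENCE)
    if top <= 0 or top < cutoff:
        return "-"
    # Walk the bases in precedence order and return the first one at the top.
    for base in PRECEDENCE:
        if confidences.get(base, 0.0) == top:
            return base
    return "-"  # unreachable; kept for clarity


print(break_tie({"A": 50.0, "T": 50.0}, 50))             # A
print(break_tie({"C": 40.0, "G": 40.0, "T": 20.0}, 30))  # C
print(break_tie({"G": 50.0, "T": 50.0}, 60))             # -
```

In the second call C and G tie at 40, and C is returned because it precedes G in the A, C, G, T ordering.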

The quality cutoff parameter (Q in the Contig Editor) has no effect on this algorithm.


This page is maintained by staden-package. Last generated on 22 October 2002.
URL: http://www.mrc-lmb.cam.ac.uk/pubseq/manual/gap4_unix_118.html