Quality Clip

Description
This module determines where the sequence quality is too poor to use for reliable assembly. It supersedes the Uncalled Base Clip module. It uses the qclip program, which reads and writes Experiment Files.

The default quality evaluation is based on the range of values produced by the Estimate Base Accuracies module (quality value 70, averaged over 100 bases). For use with phred, try lower values such as quality value 15 averaged over 50 bases. When quality values are not available, it uses the same method as the Uncalled Base Clip module: it analyses the base calls and counts the number of undetermined bases within a given window of sequence.

Both 5' and 3' ends may be quality clipped. In the confidence mode of clipping, the method starts from the point of highest average quality and steps outwards in both directions until the average quality falls below a defined threshold. In the sequence mode of clipping, the method starts from a defined position and steps outwards in both directions until the number of uncalled bases within a given window length exceeds a predefined threshold. For more details see the qclip documentation (see section qclip).

Note that the Phrap assembly algorithm works best without quality clipping: because it uses the Phred confidence values, it can make use of the full length of readings.
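The confidence mode of clipping can be illustrated with a minimal Python sketch. This is not the qclip source: the function name, the 0-based half-open coordinates, and the choice to place the clip points at the outer edges of the last passing windows are all assumptions made for illustration.

```python
def clip_by_confidence(conf, window_length, min_average):
    """Return (left, right) bounds of the good region, or None if there is none.

    conf is a list of per-base confidence values. The bounds are 0-based and
    half-open (an assumption of this sketch, not qclip's convention).
    """
    n = len(conf)
    if n < window_length:
        return None

    # Prefix sums let us compute any window average in constant time.
    prefix = [0]
    for q in conf:
        prefix.append(prefix[-1] + q)

    def window_average(start):
        return (prefix[start + window_length] - prefix[start]) / window_length

    # Start from the window with the highest average confidence.
    best = max(range(n - window_length + 1), key=window_average)
    if window_average(best) < min_average:
        return None

    # Step outwards towards the 5' end while the average stays acceptable.
    left = best
    while left > 0 and window_average(left - 1) >= min_average:
        left -= 1

    # Step outwards towards the 3' end in the same way.
    right = best
    while right < n - window_length and window_average(right + 1) >= min_average:
        right += 1

    return left, right + window_length


# Example: 60 good bases flanked by 20 poor bases on each side.
confidences = [5] * 20 + [40] * 60 + [5] * 20
print(clip_by_confidence(confidences, window_length=10, min_average=20))
```

Because the test is on a window average, a few low-confidence bases at each edge can remain inside the reported good region; a shorter window length tightens this at the cost of more sensitivity to isolated poor bases.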

Option: Clip mode
This may be either "by sequence" or "by confidence". The "by sequence" mode is equivalent to the Uncalled Base Clip module. The "by confidence" mode uses Phred-scaled confidence values to determine the quality for clipping; it does not work with eba confidence values.

Option: Minimum extent
The lowest allowable 5' clip position.

Option: Maximum extent
The largest allowable 3' clip position.

Option: Minimum length
If, after quality clipping, the good portion of a sequence is shorter than the specified length, the file is rejected with the message "qclip: Sequence too short".
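As a rough illustration of how the minimum extent, maximum extent and minimum length options could combine, the following Python sketch clamps a pair of clip points and then applies the length check. The function name and the 0-based half-open coordinates are hypothetical; only the rejection message is taken from the text above.

```python
def apply_clip_limits(left, right, min_extent, max_extent, min_length):
    """Clamp clip points to the allowed extents, then enforce the minimum length.

    left and right are the 5' and 3' clip points found by quality clipping
    (0-based, half-open; an assumption of this sketch).
    """
    # The 5' clip point may not fall below the minimum extent.
    left = max(left, min_extent)
    # The 3' clip point may not exceed the maximum extent.
    right = min(right, max_extent)

    # Reject readings whose remaining good portion is too short.
    if right - left < min_length:
        raise ValueError("qclip: Sequence too short")
    return left, right


print(apply_clip_limits(5, 500, min_extent=20, max_extent=400, min_length=50))
```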

Option: Window length
The window length over which the confidence will be averaged. This option is only relevant for the "clip by confidence" mode.

Option: Average confidence
The minimum average confidence (over "window length" bases) for sequence to be accepted as good quality. This option is only relevant for the "clip by confidence" mode.

Option: Start offset
The base number from which both the 5' and 3' good quality searches start. This option is only relevant for the "clip by sequence" mode.

Option: 3' window length
The window length in which to count uncalled bases when searching for the 3' clip point. This option is only relevant for the "clip by sequence" mode.

Option: 3' number of uncalled bases
The maximum allowed count of uncalled bases in a single 3' window length. This option is only relevant for the "clip by sequence" mode.

Option: 5' window length
The window length in which to count uncalled bases when searching for the 5' clip point. This option is only relevant for the "clip by sequence" mode.

Option: 5' number of uncalled bases
The maximum allowed count of uncalled bases in a single 5' window length. This option is only relevant for the "clip by sequence" mode.
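The "clip by sequence" options above can be tied together in a short Python sketch. It is an illustration only, not the qclip implementation: the function and parameter names, the set of characters treated as uncalled, the one-base step size and the 0-based half-open coordinates are all assumptions.

```python
# Characters treated as uncalled bases (an assumption of this sketch).
UNCALLED = set("-Nn")


def count_uncalled(seq, lo, hi):
    """Count uncalled bases in seq[lo:hi]."""
    return sum(1 for base in seq[lo:hi] if base in UNCALLED)


def clip_by_sequence(seq, start, win5, max5, win3, max3):
    """Return (left, right) bounds of the good region, 0-based and half-open.

    The search begins at the start offset and moves outwards one base at a
    time until a window holds more uncalled bases than the side allows.
    """
    n = len(seq)
    start = min(max(start, 0), n)

    # 3' search: the window covers the bases just ahead of the current point.
    right = start
    while right < n:
        if count_uncalled(seq, right, min(right + win3, n)) > max3:
            break
        right += 1

    # 5' search: the window covers the bases just behind the current point.
    left = start
    while left > 0:
        if count_uncalled(seq, max(left - win5, 0), left) > max5:
            break
        left -= 1

    return left, right


# Example: 40 called bases flanked by 5 uncalled bases on each side.
reading = "NNNNN" + "ACGT" * 10 + "NNNNN"
print(clip_by_sequence(reading, start=25, win5=5, max5=2, win3=5, max3=2))
```

Keeping separate window lengths and thresholds for each end allows the 5' and 3' searches to be tuned independently, which is why the options come in pairs.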


This page is maintained by staden-package. Last generated on 22 October 2002.
URL: http://www.mrc-lmb.cam.ac.uk/pubseq/manual/pregap4_unix_24.html