
Reference Traces and Reference Sequences

This module specifies the reference traces and reference sequences used by the two mutation detection modules (see section Trace Difference and section Heterozygote Scanner). The left and right clip points for each trace can also be specified.

A reference trace should be as similar as possible to the traces it is compared against. It should be prepared by sequencing the wild type from the same primer and with the same chemistry as the readings being screened. One good way to produce a reference trace is to run the wild type sequence on the gel along with the other samples.

If the input files have been sequenced from both strands, a reference trace for each strand may be specified here. For pregap4 to choose the appropriate wild type trace it needs to know the strand of each input sequence. This is specified by the PR record in the experiment file, which is typically generated using a naming convention (see section Pregap4 Naming Schemes). If pregap4 cannot determine the strand, or if only one reference trace is specified, each input sequence is compared against the +ve strand reference trace.

When the reference data supplied in this module is entered using gap4 shotgun assembly, REFS and REFT notes (see section Notes) are added to the gap4 database. A reference sequence is used to number bases in the Contig Editor (see section Reference sequences and traces) and to report the positions of mutations (see section Report Mutations).
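The fallback rule for choosing a reference trace can be summarised in a short sketch. The Python below is illustrative only and is not pregap4 code. The function name choose_reference_trace, the strand labels "+" and "-", and the use of None for an unknown strand or an unset option are all assumptions made for this example.

```python
from typing import Optional


def choose_reference_trace(strand: Optional[str],
                           pos_ref: Optional[str],
                           neg_ref: Optional[str]) -> Optional[str]:
    """Pick the reference trace for one input sequence (illustrative only)."""
    # The -ve strand reference is used only when the strand is known to be
    # "-" and reference traces for both strands were supplied.
    if strand == "-" and pos_ref is not None and neg_ref is not None:
        return neg_ref
    # Unknown strand, +ve strand, or only one reference trace specified:
    # fall back to the +ve strand reference trace.
    return pos_ref
```

With hypothetical filenames, choose_reference_trace("-", "wt_fwd.ztr", "wt_rev.ztr") selects the -ve strand trace, while passing None as the strand selects "wt_fwd.ztr".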

Option: Reference Trace (+ve strand)
Option: Reference Trace (-ve strand)
These are the filenames of the chromatograms for the reference trace on each strand. The files may be in any supported trace format (ZTR, SCF, ABI, CTF or ALF). The filenames are entered into the experiment file as WT records by the "Augment Database" phase of pregap4, so that module must also be enabled.
Option: Clip left
Option: Clip right
These values determine which region of the reference trace (in bases) is used for mutation detection. They can be used to exclude poor quality regions or to restrict the range over which mutation detection occurs. Restricting the range will also speed up the algorithms. If you specify -1 for all values, the entire trace is used, i.e. no clipping occurs. If the range specified is too small, the mutation detection algorithms may report an error, since there must be a useful overlap between the sequences in order to process them.
Option: Reference Sequence
This specifies the reference sequence, which is typically an annotated EMBL entry. This field is optional.
Option: Start base number
If a reference sequence was specified, this indicates which base number it will start counting from within gap4's Contig Editor. It also defines the positions of mutations, as output by the Report Mutations function of gap4 (see section Report Mutations).
Option: Circular
Option: Sequence length
If the reference sequence is defined to be circular then the length needs to be known too. When the base number reaches the sequence length the next base in the sequence will be renumbered to base 1. This may be useful if the circular reference sequence needs to be chopped to form a linear sequence at a different position than the standard numbering. (For example this is typical when sequencing the mitochondrial variable loop, which by standard conventions contains base number 1.)
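
The wrap-around numbering described for circular reference sequences amounts to simple modular arithmetic. The following Python sketch shows one way to express it. It is not taken from gap4, and the function name reference_base_number, its parameters, and the convention that offset 0 is the first base of the reference sequence are assumptions made for illustration.

```python
def reference_base_number(offset: int, start_base: int,
                          circular: bool = False,
                          seq_length: int = 0) -> int:
    """Base number for the base `offset` positions into the reference.

    offset is 0 for the first base of the reference sequence (assumed).
    """
    number = start_base + offset
    if circular:
        if seq_length <= 0:
            raise ValueError("a circular sequence needs a positive length")
        # Wrap so that the base after number seq_length is numbered 1.
        number = (number - 1) % seq_length + 1
    return number
```

As an example with made-up values, a start base number of 16024 and a sequence length of 16569 give base number 16569 at offset 545 and base number 1 at offset 546.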

Note that it is possible (though no longer recommended) to use gap4 to produce a consensus trace. This requires using pregap4 twice. First, process the sequences through pregap4 with all the appropriate options but with the mutation detection modules disabled. Assemble these sequences into gap4. Within gap4, for each contig, start up the Contig Editor and select Save Consensus Trace from the command menu (available only in expert mode). This will produce a trace which is the average of the traces in that contig. Then delete the gap4 database and reprocess the sequences using pregap4, this time using mutation detection to compare against the consensus trace. Best results are usually obtained by first deleting pads in the consensus sequence. You should inspect the resulting consensus trace carefully to ensure that the pad deletions have not introduced any discontinuities.

This page is maintained by staden-package. Last generated on 22 October 2002.