


tracediff -- Compare two trace files for differences to detect mutations.


tracediff [-a peak-alignment-deviation] [-c complement-reverse-strand-tags] [-d output-difference-traces] [-f file-of-filenames] [-n analysis-window-length] [-q quiet-mode] [-s analysis-sensitivity] [-t noise-threshold] [-w maximum-peak-width] experiment_file(s)


tracediff compares a pair of traces to look for mutations. It aligns the traces and then subtracts one from the other to produce a "difference trace". This difference trace is analysed to distinguish between mutations and incorrect base calls. The method is described in: Bonfield, J.K., Rada, C. and Staden, R. Automated detection of point mutations using fluorescent sequence trace subtraction. Nucl. Acids Res. 26, 3404-3409 (1998).

For an overview and more details about mutation detection, see section Search for Mutations.

To detect mutations, tracediff computes the mean and standard deviation of the difference trace, and then locates bases associated with a significant pair of peaks, one positive and the other negative. For example, a base change from an A to a T will cause a positive A trace difference and a negative T trace difference. If both the positive and the negative difference lie more than num_sd standard deviations from the mean (num_sd being the sensitivity threshold; see -s below), the base is flagged as a potential mutation. Mutations are written to the experiment file as MUTA tags.
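
The rule described above can be illustrated with a minimal Python sketch. This is not tracediff's implementation: it assumes the two traces are already aligned, represents each channel as a plain list of samples, and uses global rather than windowed statistics. The names difference_trace and flag_candidates are hypothetical and chosen only for this example.

```python
from statistics import mean, pstdev


def difference_trace(first, second):
    """Subtract one aligned channel from another, sample by sample."""
    return [a - b for a, b in zip(first, second)]


def flag_candidates(diffs, num_sd):
    """Find positions with a significant positive/negative peak pair.

    diffs maps a base letter to the difference trace of that channel.
    Returns (position, positive_base, negative_base) tuples for every
    position where one channel lies more than num_sd standard deviations
    above the mean and another lies more than num_sd below it.
    """
    # Assumption: statistics are pooled over all channels and positions.
    pooled = [v for channel in diffs.values() for v in channel]
    mu = mean(pooled)
    sd = pstdev(pooled)
    upper = mu + num_sd * sd
    lower = mu - num_sd * sd

    length = len(next(iter(diffs.values())))
    hits = []
    for pos in range(length):
        positive = [b for b, ch in diffs.items() if ch[pos] > upper]
        negative = [b for b, ch in diffs.items() if ch[pos] < lower]
        for up in positive:
            for down in negative:
                hits.append((pos, up, down))
    return hits
```

A position is reported only when both halves of the pair pass the threshold, so an isolated positive or negative excursion in a single channel is ignored.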

The experiment_file contains records specifying the input trace, the reference trace and the strand direction. It also contains the clipping points for the input trace. A minimal experiment file for tracediff might look like this:

LN   27_17f.ztr
PR   1
QL   10
QR   839
WT   C:/my_dataset/09_5f

Here the LN record specifies the name of the input trace, the PR record specifies the strand direction (1=forward, 2=reverse), the QL and QR records specify the left and right clip points of the input trace, and the WT record specifies the wildtype trace. Clip points for the wildtype trace can optionally be given as WL and WR records. Pregap4 generates suitable experiment files automatically, so they would not normally be created by hand.
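
As an illustration of the record layout, the following Python sketch reads such a file into a dictionary. It assumes one record per line, with a two-letter record type followed by whitespace and a value, and it covers only the single-valued records named above. It does not handle repeated or multi-line records, and the function names are hypothetical.

```python
def parse_experiment_records(text):
    """Map each record type (e.g. 'LN') to its value as a string."""
    records = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split the record type from the rest of the line.
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        # Assumption: a later record of the same type replaces an earlier one.
        records[key] = value
    return records


def strand_name(records):
    """Translate the PR record into a readable strand direction."""
    return {"1": "forward", "2": "reverse"}.get(records.get("PR"), "unknown")


example = """LN   27_17f.ztr
PR   1
QL   10
QR   839
WT   C:/my_dataset/09_5f
"""
records = parse_experiment_records(example)
clip_left, clip_right = int(records["QL"]), int(records["QR"])
```

The optional WL and WR records can be read with records.get("WL") and records.get("WR"), which return None when the wildtype clip points are absent.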


-a peak-alignment-deviation
The centres of each individual half-peak of a double peak above and below the baseline must align reasonably well for them to be considered to be a real mutation. The amount of half-peak alignment deviation allowable is specified in bases by this parameter, usually as a fraction of one base.
-c complement-reverse-strand-tags
After mutation detection, and once readings have been assembled into a GAP4 database, GAP4 displays both forward and reverse readings in a single direction in the contig editor. This makes it much easier to compare sequences and traces in both directions simultaneously. When the corresponding traces are displayed, any reverse strand traces are complemented automatically, so that the bases are interchanged. The original mutation tag generated by tracediff would then be of the wrong sense, so when this option is specified the tag base labels are complemented to match the complemented trace displayed by GAP4.
-d output-difference-traces
After trace difference analysis, the generated difference traces are normally discarded and not written to disk. Specifying this option saves them to the same directory as the original traces, in .ZTR trace format. The original filename is retained and a "_diff.ztr" suffix is appended.
-f file-of-filenames
Specifies the filename of a simple text file containing a list of experiment files to be processed by tracediff.
-n analysis-window-length
Analysis of the trace difference is done over a local region to counter the effects of non-stationarity in the trace signal. The analysis region is defined by a short window whose length is specified in bases. The window is asymmetric in that it's located to the left of the base it's positioned on. This avoids measurement problems when mutations are encountered. The window size is a tradeoff. If it's too big, low level mutations may be missed. If it's too small, there may be insufficient data to give unbiased measurements leading to many false positives.
-q quiet-mode
If specified, no information is output to stdout. The mutations will still be written to the experiment file as tags.
-s analysis-sensitivity
This threshold determines when an above/below baseline double peak in the difference trace is considered to be a mutation. It is specified in standard deviations from the mean over the analysis window. The higher the value, the more stringent the test. The algorithm reduces this value dynamically in the presence of mutations, since small mutations near larger ones can often be missed with a uniform sensitivity setting. It's likely that some experimentation with this parameter will be required for optimal mutation detection in your data.
-t noise-threshold
This threshold is used to filter out low level noise during the analysis phase. It is specified as a percentage of the maximum peak-to-peak trace difference value. A high threshold will lead to fewer false positives but you run the additional risk of missing low level mutations.
-w maximum-peak-width
During analysis, the width of each peak is measured to avoid problems caused by gel artifacts. These often appear as broad peaks that overlay many bases. The maximum peak width is specified in bases. A lower value will lead to fewer false positives, but you run the additional risk of missing smeared mutations towards the end of a trace.
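
To show how the noise threshold (-t), the analysis window (-n) and the sensitivity (-s) relate to one another, here is a simplified Python sketch. It is an illustration under stated assumptions, not tracediff's algorithm: it works on one difference value per base, takes the window to be the bases strictly to the left of the base under test, and leaves out the dynamic sensitivity adjustment, peak alignment and peak width checks. All function names are hypothetical.

```python
from statistics import mean, pstdev


def apply_noise_threshold(diff, percent):
    """Zero values smaller than `percent` of the peak-to-peak range."""
    floor = (max(diff) - min(diff)) * percent / 100.0
    return [v if abs(v) >= floor else 0 for v in diff]


def left_window_stats(values, index, window):
    """Mean and standard deviation of up to `window` values left of index.

    Assumption: the base at `index` itself is excluded, so a mutation at
    that base does not inflate the statistics it is tested against.
    """
    start = max(0, index - window)
    region = values[start:index]
    if len(region) < 2:
        return None  # too little data for a meaningful estimate
    return mean(region), pstdev(region)


def is_significant(values, index, window, sensitivity):
    """True if values[index] deviates from the local mean by more than
    `sensitivity` standard deviations of the left-hand window."""
    stats = left_window_stats(values, index, window)
    if stats is None:
        return False
    mu, sd = stats
    return abs(values[index] - mu) > sensitivity * sd
```

In this sketch a longer window averages over more bases, while a very short one yields unstable estimates, which mirrors the tradeoff described for -n.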

This page is maintained by staden-package. Last generated on 22 October 2002.