tracediff -- Compare two trace files for differences to detect mutations.
tracediff [-a peak-alignment-deviation] [-c complement-reverse-strand-tags]
          [-d output-difference-traces] [-f file-of-filenames]
          [-n analysis-window-length] [-q quiet-mode]
          [-s analysis-sensitivity] [-t noise-threshold]
          [-w maximum-peak-width] experiment_file(s)
tracediff compares a pair of traces to look for mutations. It aligns
the traces, and then subtracts one trace from the other to produce a "difference
trace". This difference trace is analysed to distinguish between mutations and
incorrect base calls.
Reference: Bonfield, J.K., Rada, C. and Staden, R. Automated detection of point
mutations using fluorescent sequence trace subtraction. Nucl. Acids Res. 26,
3404-3409 (1998).
For an overview and more details about mutation detection, see section Search for Mutations.
To detect mutations, tracediff computes the mean and standard deviation of the
difference trace, and then locates bases associated with a significant
pair of peaks, one positive and the other negative. For example, a base change
from an A to a T will cause a positive A trace difference and a negative T
trace difference. If both the positive and negative differences lie more than
num_sd standard deviations from the mean, the base is flagged as a potential
mutation. Mutations are written to the experiment file as MUTA tags.
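The paired-peak test described above can be sketched in Python. This is only an illustration under simplifying assumptions, not the tracediff implementation: the difference trace is reduced to one value per base for each of the four channels, the mean and standard deviation are pooled over all channels, and the function name flag_candidates is hypothetical.

```python
from statistics import mean, pstdev

def flag_candidates(diff, num_sd):
    """Return (position, positive_base, negative_base) for paired peaks.

    diff maps a base letter to a list of difference values, one per base
    position (an assumed simplification of a real difference trace).
    """
    pooled = [v for channel in diff.values() for v in channel]
    mu, sd = mean(pooled), pstdev(pooled)
    length = len(next(iter(diff.values())))
    hits = []
    for i in range(length):
        up = max(diff, key=lambda b: diff[b][i])    # most positive channel
        down = min(diff, key=lambda b: diff[b][i])  # most negative channel
        # Both halves of the pair must be significant.
        if diff[up][i] - mu > num_sd * sd and mu - diff[down][i] > num_sd * sd:
            hits.append((i, up, down))
    return hits

# Toy data: a positive A and negative T difference at position 10,
# plus an unpaired positive C peak at position 5.
diff = {base: [0.0] * 20 for base in "ACGT"}
diff["A"][10] = 100.0
diff["T"][10] = -100.0
diff["C"][5] = 100.0
print(flag_candidates(diff, num_sd=3.0))
```

Only position 10 is reported in this toy example, because the C peak at position 5 has no significant negative partner.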
The experiment_file contains records specifying the input trace, the reference
trace and the strand direction. It also contains the clipping points for the input
trace. A minimal experiment file for tracediff might look like this:
LN 27_17f.ztr
PR 1
QL 10
QR 839
WT C:/my_dataset/09_5f
where the LN record specifies the name of the input trace, the PR record
specifies the strand direction (1=forward, 2=reverse), the QL and QR records
specify the left and right clip points of the input trace respectively, and
the WT record specifies the wildtype trace. You can also optionally specify
clip points for the wildtype trace as WL and WR records.
Pregap4 generates suitable experiment files automatically, so these would not
normally be created manually.
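For scripts that need to inspect such a file, the records shown above could be read with a few lines of Python. This is a minimal sketch that assumes each line holds a record type followed by whitespace and a value, and that each record type occurs only once (later duplicates overwrite earlier ones); the function name parse_experiment is hypothetical and is not part of the Staden package.

```python
def parse_experiment(text):
    """Map each record type (LN, PR, QL, ...) to its value as a string."""
    records = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # record type, then the rest of the line
        if len(parts) == 2:
            records[parts[0]] = parts[1].strip()
    return records

example = """LN 27_17f.ztr
PR 1
QL 10
QR 839
WT C:/my_dataset/09_5f
"""

records = parse_experiment(example)
strand = {"1": "forward", "2": "reverse"}[records["PR"]]
print(records["LN"], strand, int(records["QL"]), int(records["QR"]), records["WT"])
```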
-a peak-alignment-deviation
The centres of each individual half-peak of a double peak above and below
the baseline must align reasonably well for them to be considered to be
a real mutation. The amount of half-peak alignment deviation allowable is
specified in bases by this parameter, usually as a fraction of one base.
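The alignment test can be pictured with a small Python sketch. It assumes the half-peak centres are known as sample positions and that an average number of samples per base is available to convert the offset into bases; both inputs, the example values, and the function name halves_aligned are hypothetical.

```python
def halves_aligned(pos_centre, neg_centre, samples_per_base, max_deviation):
    """True if the two half-peak centres lie within max_deviation bases."""
    deviation_in_bases = abs(pos_centre - neg_centre) / samples_per_base
    return deviation_in_bases <= max_deviation

# Centres 3 samples apart at 12 samples per base: a quarter of a base.
print(halves_aligned(1204, 1207, 12.0, 0.3))
# Centres 6 samples apart: half a base, outside a 0.3 base tolerance.
print(halves_aligned(1204, 1210, 12.0, 0.3))
```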
-c complement-reverse-strand-tags
After mutation detection and after readings have been assembled into a GAP4
database, GAP4 displays both forward and reverse readings in a single direction
in the contig editor. This makes it much easier to compare sequences and traces
in both directions simultaneously. When the corresponding traces are displayed,
any reverse strand traces are complemented automatically such that the bases are
interchanged. In this case, the original mutation tag generated by tracediff would
be of the wrong sense, so this option, if specified, complements the tag base
labels to match the complemented trace displayed by GAP4.
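Complementing the base labels amounts to swapping A with T and C with G. The Python sketch below illustrates the idea; the "A->T" label text is a hypothetical stand-in, since the exact layout of the tag written by tracediff is not described here.

```python
# Translation table that swaps each base for its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complement_label(label):
    """Complement every base letter in a tag label, leaving other text alone."""
    return label.translate(COMPLEMENT)

print(complement_label("A->T"))  # prints T->A
```

Note that the label is complemented but not reversed: the order of the two bases stays the same.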
-d output-difference-traces
After trace difference analysis, the generated difference traces are normally
discarded and not written to disk. Specifying this option saves them to
the same directory as the original traces, in ZTR format. The original
filename is retained and a "_diff.ztr" suffix is appended.
-f file-of-filenames
Specifies the filename of a simple text file containing a list of experiment
files to be processed by tracediff.
-n analysis-window-length
Analysis of the trace difference is done over a local region to counter
the effects of non-stationarity in the trace signal. The analysis region is
defined by a short window whose length is specified in bases. The window is
asymmetric in that it's located to the left of the base it's positioned on.
This avoids measurement problems when mutations are encountered. The window
size is a tradeoff. If it's too big, low level mutations may be missed. If
it's too small, there may be insufficient data to give unbiased measurements
leading to many false positives.
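The left-sided window can be sketched in Python as follows. This is an illustration only: it assumes one difference value per base, assumes the window excludes the base it is positioned on, and uses a hypothetical function name, left_window_stats.

```python
from statistics import mean, pstdev

def left_window_stats(values, index, window):
    """Mean and standard deviation over up to `window` values left of index.

    Returns None when fewer than two values are available.
    """
    start = max(0, index - window)
    region = values[start:index]  # excludes values[index] itself
    if len(region) < 2:
        return None
    return mean(region), pstdev(region)

values = [1, 2, 3, 4, 100]  # the large value at index 4 mimics a mutation
print(left_window_stats(values, 4, 3))
```

Because the window stops just before index 4, the large value there does not inflate the statistics it is tested against, which is the measurement problem the asymmetric window avoids.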
-q quiet-mode
If specified, no information is output to stdout. The mutations will still
be written to the experiment file as tags.
-s analysis-sensitivity
This threshold is used to determine when an above/below baseline double
peak in the difference trace is considered to be a mutation. It is specified
in standard deviations from the mean over the analysis window. The higher the
value, the more stringent the test. This value is reduced dynamically
by the algorithm in the presence of mutations since small mutations near
larger ones can often be missed with a uniform sensitivity setting. It's
likely that some experimentation with this parameter will be required for
optimal mutation detection in your data.
-t noise-threshold
This threshold is used to filter out low level noise during the analysis
phase. It is specified as a percentage of the maximum peak-to-peak trace
difference value. A high threshold will lead to fewer false positives but
you run the additional risk of missing low level mutations.
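As an illustration of how a percentage threshold translates into an absolute level, the Python sketch below takes the peak-to-peak value to be the maximum minus the minimum of the difference trace. That interpretation, the sample values, and the function names are assumptions made for this example.

```python
def noise_floor(diff_samples, percent):
    """Absolute threshold: percent of the peak-to-peak difference value."""
    peak_to_peak = max(diff_samples) - min(diff_samples)
    return peak_to_peak * percent / 100.0

def above_noise(diff_samples, percent):
    """Indices of samples whose magnitude reaches the noise floor."""
    floor = noise_floor(diff_samples, percent)
    return [i for i, v in enumerate(diff_samples) if abs(v) >= floor]

samples = [0, 5, -3, 120, -80, 4]
print(noise_floor(samples, 10))  # peak-to-peak is 200, so this prints 20.0
print(above_noise(samples, 10))  # prints [3, 4]
```

Raising the percentage raises the floor, which removes more small excursions and, with them, any genuine low-level mutations of similar size.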
-w maximum-peak-width
During analysis, the width of each peak is measured to avoid problems caused
by gel artifacts. These often appear as broad peaks that overlay many bases.
The maximum peak width is specified in bases. A lower value will lead to
fewer false positives, but you run the additional risk of missing smeared
mutations towards the end of a trace.
This page is maintained by staden-package.
Last generated on 22 October 2002.
URL: http://www.mrc-lmb.cam.ac.uk/pubseq/manual/manpages_unix_15.html