
Staden Package Program Summary

Assembly

Assembly program

gap4 Performs assembly, contig joining, assembly checking, repeat searching, experiment suggestion, read pair analysis and contig editing. Has graphical views of contigs, templates, readings and traces, which all scroll in register. The contig editor searches and the experiment suggestion routines use phred confidence values to calculate the confidence of the consensus sequence, and hence identify only those places requiring visual trace inspection or extra data. The result is extremely rapid finishing and a consensus of known accuracy.
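
As a rough illustration of how confidence values can single out the positions worth inspecting, the sketch below uses only the standard phred definition, Q = -10 log10(P), where P is the estimated error probability. It is not gap4's consensus algorithm; the function names and the threshold of 30 are assumptions made for the example.

```python
def phred_to_error_prob(q):
    """Convert a phred-scaled confidence value Q to an error probability."""
    return 10 ** (-q / 10.0)


def expected_errors(confidences):
    """Sum of per-base error probabilities: the expected number of wrong bases."""
    return sum(phred_to_error_prob(q) for q in confidences)


def positions_needing_attention(confidences, min_q=30):
    """Return the 0-based positions whose confidence is below min_q.

    min_q=30 (1 error in 1000) is an arbitrary threshold chosen for this sketch.
    """
    return [i for i, q in enumerate(confidences) if q < min_q]


if __name__ == "__main__":
    consensus_conf = [40, 12, 35, 29, 50]
    print(positions_needing_attention(consensus_conf))
    print(expected_errors(consensus_conf))
```

Summing the per-base error probabilities gives an estimate of the number of errors left in a stretch of consensus, which is one way to read the phrase "a consensus of known accuracy".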

Preparing sequence trace data for analysis or assembly

pregap4 Provides a graphical user interface for setting up the processing required to prepare trace data for assembly or analysis, and also provides a way to automate that processing. The processes which can be set up include trace format conversion, quality analysis, vector clipping, contaminant screening and repeat searching.

Sequence screening

vector_clip Finds and marks (with tags) vector segments of sequence readings stored as Experiment Files. Rapid and sensitive, and usually used via pregap4.
screen_seq Searches sequence readings stored as Experiment Files for matches against sets of possible contaminant sequences. Typically used to look for E. coli or yeast contamination. Very fast, and usually used via pregap4.
repe Finds and marks (with tags) known repeat sequences (e.g. Alus) in sequence readings stored as Experiment Files. Usually used via pregap4.

Trace viewing

trev A rapid and flexible viewer and editor for ABI, ALF or SCF trace files. Provides good support for interaction with Experiment Files.

Mutation detection

tracediff Automatically locates point mutations by comparing new traces against a reference trace. Handles any number of files in a single run and prepares results which can be viewed in gap4.
hetscan Used in conjunction with tracediff to search for heterozygous positions and, where they coincide, to label tracediff results appropriately.
gap4 For viewing aligned sequences and traces and checking automatic mutation assignments. Can subtract traces and display their differences.


Sequence analysis

spin A combination of the older nip4 and sip4 programs. Spin compares pairs of sequences in many ways, often presenting its results graphically. Has very rapid dot matrix analysis, global and local alignment, plus a sliding sequence window linked to the graphical plots. Can compare nucleic acid against nucleic acid, protein against protein, and protein against nucleic acid. Analyses nucleotide sequences to find genes, restriction sites, motifs, etc. Performs translations, finds open reading frames, counts codons, etc.
make_weights Analyses a multiple alignment to produce a weight matrix for use within spin.
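
To give a flavour of the simplest analysis listed for spin above, codon counting, here is a minimal sketch. It is independent of spin and does not reproduce its implementation; the function name and the `frame` parameter are assumptions made for the example.

```python
from collections import Counter


def count_codons(seq, frame=0):
    """Count the non-overlapping codons of a nucleotide sequence.

    frame is the 0-based offset (0, 1 or 2) at which reading starts.
    A trailing partial codon is ignored.
    """
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return Counter(codons)


if __name__ == "__main__":
    for codon, n in sorted(count_codons("atgaaaatgtaa").items()):
        print(codon, n)
```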


Sequence trace and reading file manipulation

Any trace file

convert_trace Converts traces from any format to any format. Also handles trace background subtraction and normalisation.
get_comment Extracts text from the comment fields from any trace format. Replaces the get_scf_field program.
index_tar Produces a text index from a tar file. Used for speeding up RAWDATA access within gap4.

ABI files

getABIstring Displays arbitrary string fields from an ABI trace file.
getABIhex Displays arbitrary fields from an ABI trace file as hex codes.
getABIraw Displays arbitrary fields from an ABI trace file in the raw format.
getABIcomment Displays the comments from an ABI trace file. Equivalent to getABIstring CMNT.
getABISampleName Displays the sample name (reading name) stored in an ABI trace file. Equivalent to getABIstring SMPL.
getABIdate Displays the run date from an ABI trace file.

ALF files

alfsplit Splits the Pharmacia ALF gel file into multiple files. This is necessary before processing by pregap4.

SCF files

makeSCF Converts existing trace files (of any supported format) into SCF files.
scf_info Displays details stored in the header of an SCF file.
scf_dump Displays the entire SCF file contents in a human readable format.
scf_update Converts between SCF file versions (2 to 3 and vice versa).
get_scf_field Extracts data from the SCF comment section. Superseded by get_comment.
eba Estimates the base accuracy of each base in an SCF file.

Sequence quality clipping

qclip Performs simple quality clipping of Experiment Files based on confidence values or on the sequence composition.
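
The general idea of clipping on confidence values can be sketched with a sliding-window average. This is a generic illustration, not qclip's actual algorithm; the function name, the window size and the threshold are all assumptions made for the example.

```python
def quality_clip(confidences, window=5, min_avg=20.0):
    """Return (left, right) so that confidences[left:right] is the kept region.

    The kept region runs from the start of the first window whose mean
    confidence reaches min_avg to the end of the last such window.
    Returns (0, 0) when no window qualifies.
    """
    n = len(confidences)
    if n < window:
        return (0, 0)
    good_starts = [
        start
        for start in range(n - window + 1)
        if sum(confidences[start:start + window]) / window >= min_avg
    ]
    if not good_starts:
        return (0, 0)
    return (good_starts[0], good_starts[-1] + window)


if __name__ == "__main__":
    conf = [5, 5, 8, 30, 32, 35, 40, 38, 36, 33, 10, 6, 4]
    left, right = quality_clip(conf, window=3, min_avg=20)
    print(left, right, conf[left:right])
```

Because the test is applied to a window average, a single poor base next to good ones can survive at either edge of the kept region; a stricter clip would trim those afterwards.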

Gap4 database utilities

convert Converts between the various assembly database formats.
copy_db Copies and garbage collects gap4 databases.
copy_reads Aligns two gap4 databases and copies overlapping sequences from one to the other.

Other sequencing utilities

extract_seq Extracts the sequence component from trace files or Experiment Files.
init_exp Extracts the sequence and related information from trace files to output in Experiment File format.


Scripting utilities

stash General-purpose scripting interface to Gap4 and Spin. Can be used for producing graphical scripts and interfaces.


Misc

splitseq_da Splits large sequences into a set of overlapping smaller sequences. Outputs the sequences in Experiment File format with attributes suitable for input using Directed Assembly.
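
Splitting a sequence into overlapping pieces can be sketched as follows. This shows only the splitting step, not splitseq_da itself: the Experiment File output and its attributes are not reproduced, and the function name and parameters are assumptions made for the example.

```python
def split_overlapping(seq, piece_len, overlap):
    """Split seq into pieces of at most piece_len that overlap by `overlap`.

    Returns a list of (start, piece) tuples, where start is the 0-based
    offset of the piece within seq. The last piece may be shorter.
    """
    if not 0 <= overlap < piece_len:
        raise ValueError("overlap must be >= 0 and smaller than piece_len")
    if not seq:
        return []
    step = piece_len - overlap
    pieces = []
    start = 0
    while True:
        pieces.append((start, seq[start:start + piece_len]))
        if start + piece_len >= len(seq):
            break
        start += step
    return pieces


if __name__ == "__main__":
    for start, piece in split_overlapping("ABCDEFGHIJ", piece_len=4, overlap=2):
        print(start, piece)
```

Keeping the start offset with each piece preserves where it came from in the original sequence, which is the kind of positional information a later assembly step could use to put the pieces back in order.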

