first previous next last contents

Text Identifiers

These are for use in the TEXT segments. None are required, but if any of these identifiers are present they must confirm to the description below. Much (currently all) of this list has been taken from the NCBI Trace Archive [2] documentation. It is duplicated here as the ZTR spec is not tied to the same revision schedules as the NCBI trace archive (although it is intended that any suitable updates to the trace archive should be mirrored in this ZTR spec).

The Trace Archive specifies a maximum length of values. The ZTR spec does not have length limitations, but for compatibility these sizes should still be observed.

The Trace Archive also states some identifiers are mandatory; these are marked by asterisks below. These identifiers are not mandatory in the ZTR spec (but clearly they need to exist if the data is to be submitted to the NCBI).

Finally, some fields are not appropriate for use in the ZTR spec, such as BASE_FILE (the name of a file containing the base calls). Such fields are included only for compatibility with the Trace Arhive. It is not expected that use of ZTR would allow for the base calls to be read from an external file instead of the ZTR BASE chunk.

[ Quoted from TraceArchiveRFC v1.17 ]

Identifier      Size       Meaning			 Example value(s)
----------      -----      ----------------------------  -----------------
TRACE_NAME *      250      name of the trace             HBBBA1U2211
                           as used at the center
                           unique within the center
                           but not among centers.
                           
SUBMISSION_TYPE *   -      type of submission
                           
CENTER_NAME *     100      name of center                BCM
CENTER_PROJECT    200      internal project name         HBBB
                           used within the center
                           
TRACE_FILE *      200      file name of the trace	 ./traces/TRACE001.scf
                           relative to the top of
                           the volume.
                           
TRACE_FORMAT *     20      format of the tracefile
                           
SOURCE_TYPE *       -      source of the read
                           
INFO_FILE         200      file name of the info file
INFO_FILE_FORMAT   20        
                           
BASE_FILE         200      file name of the base calls
QUAL_FILE         200      file name of the base calls
                           
                           
TRACE_DIRECTION     -      direction of the read
TRACE_END           -      end of the template
PRIMER            200      primer sequence
PRIMER_CODE                which primer was used
                           
STRATEGY            -      sequencing strategy
TRACE_TYPE_CODE     -      purpose of trace
                           
PROGRAM_ID         100     creator of trace file         phred-0.990722.h
                           program-version
                           
TEMPLATE_ID         20     used for read pairing         HBBBA2211
                           
CHEMISTRY_CODE       -     code of the chemistry         (see below)
ITERATION            -     attempt/redo                  1
                           (int 1 to 255)
                           
CLIP_QUALITY_LEFT          left clip of the read in bp due to quality
CLIP_QUALITY_RIGHT         right " " " " "
CLIP_VECTOR_LEFT           left clip of the read in bp due to vector
CLIP_VECTOR_RIGHT          right " " " " "

                           
SVECTOR_CODE        40     sequencing vector used        (in table)
SVECTOR_ACCESSION   40     sequencing vector used        (in table)
CVECTOR_CODE        40     clone vector used             (in table)
CVECTOR_ACCESSION   40     clone vector used             (in table)
                           
INSERT_SIZE          -     expected size of insert       2000,10000
                           in base pairs (bp)
                           (int 1 to 2^32)
                           
PLATE_ID            32     plate id at the center          
WELL_ID                    well                          1-384

SPECIES_CODE *       -     code for species
SUBSPECIES_ID       40     name of the subspecies
                           Is this the same as strain

CHROMOSOME           8     name of the chromosome        ChrX, Chr01, Chr09
                           
                           
LIBRARY_ID          30     the source library of the clone
CLONE_ID            30     clone id                      RPCI11-1234 
 
ACCESSION           30     NCBI accession number         AC00001
                           
PICK_GROUP_ID       30     an id to group traces picked
                           at the same time.
PREP_GROUP_ID       30     an id to group traces prepared
                           at the same time
                           
                           
RUN_MACHINE_ID      30     id of sequencing machine
RUN_MACHINE_TYPE    30     type/model of machine
RUN_LANE            30     lane or capillary of the trace
RUN_DATE             -     date of run
RUN_GROUP_ID        30     an identifier to group traces
                           run on the same machine

[ End of quote from TraceArchiveRFC ]

More detailed information on the format of these values should be obtained
from the Trace Archive RFC [2].

first previous next last contents
This page is maintained by staden-package. Last generated on 22 October 2002.
URL: http://www.mrc-lmb.cam.ac.uk/pubseq/manual/formats_unix_16.html