
The GDatabase Structure

#define GAP_DB_VERSION 2
#define GAP_DNA		   0
#define GAP_PROTEIN	   1

typedef struct { 
    GCardinal version;		/* Database version - GAP_DB_VERSION */
    GCardinal maximum_db_size;	/* MAXDB */
    GCardinal actual_db_size;	/* */
    GCardinal max_gel_len;	/* 4096 */
    GCardinal data_class;	/* GAP_DNA or GAP_PROTEIN */

    /* Used counts */
    GCardinal num_contigs;	/* number of contigs used */
    GCardinal num_readings;	/* number of readings used */

    /* Bitmaps */
    GCardinal Nfreerecs;	/* number of bits */
    GCardinal freerecs;		/* record no. of freerecs bitmap */

    /* Arrays */
    GCardinal Ncontigs;		/* elements in array */
    GCardinal contigs;		/* record no. of array of type GContigs */

    GCardinal Nreadings;	/* elements in array */
    GCardinal readings;		/* record no. of array of type GReading */

    GCardinal Nannotations;	/* elements in array */
    GCardinal annotations;	/* record no. of array of type GAnnotation */
    GCardinal free_annotations; /* head of list of free annotations */

    GCardinal Ntemplates;	/* elements in array */
    GCardinal templates;	/* record no. of array of type GTemplates */

    GCardinal Nclones;		/* elements in array */
    GCardinal clones;		/* record no. of array of type GClones */

    GCardinal Nvectors;		/* elements in array */
    GCardinal vectors;		/* record no. of array of type GVectors */

    GCardinal contig_order;	/* record no. of array of type GCardinal */

    GCardinal Nnotes;		/* elements in array */
    GCardinal notes_a;		/* records that are GT_Notes */
    GCardinal notes;		/* Unpositional annotations */
    GCardinal free_notes;	/* SINGLY linked list of free notes */
} GDatabase; 

This is always the first record in the database. It contains information about the Gap4 database as a whole and can be viewed as the root from which all other records are eventually referenced. Care must be taken when dealing with counts of contigs and readings, as there are two copies: one for the number used and one for the number allocated.
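The distinction between the used and allocated counts can be illustrated with a minimal sketch. The DbCounts type and the contig_in_use function are hypothetical and not part of Gap4; GCardinal is a stand-in typedef, assumed here to be a 32-bit integer. The sketch also assumes that the contigs in use occupy numbers 1 to num_contigs.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t GCardinal;  /* stand-in for Gap4's GCardinal (assumed 32-bit) */

/* Hypothetical cut-down view of GDatabase holding only the paired counts. */
typedef struct {
    GCardinal num_contigs;   /* number of contigs used */
    GCardinal Ncontigs;      /* number of contigs allocated */
    GCardinal num_readings;  /* number of readings used */
    GCardinal Nreadings;     /* number of readings allocated */
} DbCounts;

/*
 * Return 1 if contig_num refers to a contig that is in use, else 0.
 * Assumes used contigs are numbered 1..num_contigs; slots between
 * num_contigs and Ncontigs are allocated but hold no live contig.
 */
int contig_in_use(const DbCounts *db, GCardinal contig_num)
{
    assert(db->num_contigs <= db->Ncontigs);
    return contig_num >= 1 && contig_num <= db->num_contigs;
}
```

Looping to Ncontigs instead of num_contigs would visit allocated slots that hold no live contig, which is the mistake the paragraph above warns against.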

The structure contains several database record numbers of arrays. These arrays in turn contain record numbers of structures. Most other structures, and indeed functions within Gap4, then reference structure numbers (e.g. a reading number) and not their record numbers. The conversion from one to the other is done by accessing the arrays listed in the GDatabase structure.

For instance, to read the structure for contig number 5 we could do the following.

GContigs c;
GT_Read(io, arr(GCardinal, io->contigs, 5-1), &c, sizeof(c), GT_Contigs);

In the above code, io->contigs is the array of GCardinals whose record number is contained within the contigs element of the GDatabase structure. In practice, this is hidden away by simply calling "contig_read(io, 5, c)" instead.
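The conversion from a structure number to a record number can be sketched without the Gap4 library. The ContigIndex type and contig_to_record function below are hypothetical, as is the GCardinal stand-in typedef (assumed 32-bit). They only mimic the indexing step that arr(GCardinal, io->contigs, 5-1) performs in the code above.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t GCardinal;  /* stand-in for Gap4's GCardinal (assumed 32-bit) */

/* Hypothetical in-memory copy of the array whose record number is stored
 * in the contigs element of GDatabase. */
typedef struct {
    const GCardinal *recs;  /* recs[i] is the record number of contig i+1 */
    GCardinal n_alloc;      /* number of elements allocated (Ncontigs) */
} ContigIndex;

/* Map a 1-based contig number to the record number of its structure. */
GCardinal contig_to_record(const ContigIndex *idx, GCardinal contig_num)
{
    assert(contig_num >= 1 && contig_num <= idx->n_alloc);
    return idx->recs[contig_num - 1];  /* contig numbers start at 1 */
}
```

The returned record number is what would then be handed to a record-level read such as GT_Read.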

version
Database record format version control. The current version is held within the GAP_DB_VERSION macro.
maximum_db_size
actual_db_size
These are essentially redundant as Gap4 can support any number of readings up to maximum_db_size, and maximum_db_size can be anything the user desires. It can be specified using the -maxdb command line argument to gap4.
max_gel_len
This is currently hard coded as 4096 (but is relatively easy to change).
data_class
This specifies whether the database contains DNA or protein sequences. In the current implementation only DNA is supported.
num_contigs
num_readings
These specify the number of used contigs and readings. They may be different from the number of records allocated.
Nfreerecs
freerecs
freerecs is the record number of a bitmap with a single element per record in the database. Each free bit in the bitmap corresponds to a free record. The Nfreerecs variable holds the number of bits allocated in the freerecs bitmap.
Ncontigs
contigs
contigs is the record number of an array of GCardinals. Each element of the array is the record number of a GContigs structure. Ncontigs is the number of elements allocated in the contigs array. Note that this is different from num_contigs, which is the number of elements used.
Nreadings
readings
readings is the record number of an array of GCardinals. Each element of the array is the record number of a GReadings structure. Nreadings is the number of elements allocated in the readings array. Note that this is different from num_readings, which is the number of elements used.
Nannotations
annotations
free_annotations
annotations is the record number of an array of GCardinals. Each element of the array is the record number of a GAnnotations structure. Nannotations is the number of elements allocated in the annotations array. free_annotations is the record number of the first free annotation, which forms the head of a linked list of free annotations.
Ntemplates
templates
templates is the record number of an array of GCardinals. Each element of the array is the record number of a GTemplates structure. Ntemplates is the number of elements allocated in the templates array.
Nclones
clones
clones is the record number of an array of GCardinals. Each element of the array is the record number of a GClones structure. Nclones is the number of elements allocated in the clones array.
Nvectors
vectors
vectors is the record number of an array of GCardinals. Each element of the array is the record number of a GVectors structure. Nvectors is the number of elements allocated in the vectors array.
contig_order
This is the record number of an array of GCardinals of size Ncontigs. Each element of the array is a contig number. The index of the array element indicates the position of this contig. Thus the contigs are displayed in the order that they appear in this array.
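
The contig_order description can be sketched as a simple lookup. The contig_position function is hypothetical and not part of Gap4, and GCardinal is a stand-in typedef (assumed 32-bit). The order array is taken to be an in-memory copy of the array whose record number is contig_order, and the caller supplies the number of entries to search.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t GCardinal;  /* stand-in for Gap4's GCardinal (assumed 32-bit) */

/*
 * Return the 0-based display position of contig_num, or -1 if it does
 * not appear. order[i] holds the contig number shown at position i.
 */
int contig_position(const GCardinal *order, GCardinal n, GCardinal contig_num)
{
    assert(order != NULL && n >= 0);
    for (GCardinal i = 0; i < n; i++) {
        if (order[i] == contig_num)
            return (int)i;
    }
    return -1;
}
```

With an order array of {3, 1, 2}, contig 3 is displayed first and contig 2 last, regardless of their contig numbers.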

This page is maintained by staden-package. Last generated on 1 March 2001.
URL: http://www.mrc-lmb.cam.ac.uk/pubseq/manual/scripting_113.html