
Writing New Modules

An Overview of a Module

A pregap4 module is a single file containing a series of functions with predefined interfaces. Pregap4 uses these functions to communicate with the module.

This section is for system managers and programmers only.

The module itself is written in the Tcl/Tk language. A definition of this language is outside the scope of this manual, but several books exist on the subject. Each module executes inside its own Tcl "namespace". This means that a module may use global variables and global function names without fear of clashing with other modules. Indeed, the use of specific function names and global variables is central to designing a new module.
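The effect of namespaces can be seen in a small sketch that runs in a plain tclsh. The namespace names `module_a` and `module_b` are hypothetical stand-ins; how pregap4 itself loads each module file into its namespace is not shown here.

```tcl
# Two stand-in "modules", each in its own namespace (names are hypothetical).
namespace eval ::module_a {
    variable report "report from A"
    proc name {} { return "Module A" }
}
namespace eval ::module_b {
    variable report "report from B"
    proc name {} { return "Module B" }
}

# Both define "name" and "report", yet neither overwrites the other.
puts [::module_a::name]
puts [::module_b::name]
puts $::module_a::report
```

Each module can therefore use the fixed function names described below without regard to what other modules define.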

Functions

The basic structure of a module is that it has a series of known functions which pregap4 expects to use. Some of these functions are mandatory, whilst others will only be called by pregap4 if they have been defined.
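As an illustration of "called only if defined", the following sketch tests for a function before calling it. It is not pregap4's own dispatch code: the helper `call_if_defined` and the namespace `demo_module` are hypothetical, and the `{*}` syntax assumes Tcl 8.5 or newer.

```tcl
namespace eval ::demo_module {
    proc name {} { return "Demo Module" }
    proc run {files} { return $files }
    # No init or shutdown here: both are optional.
}

# Hypothetical helper: call a module function only if the module defines it.
# Returns the function's result, or the supplied default when it is absent.
proc call_if_defined {ns func default args} {
    if {[llength [info commands ${ns}::$func]] == 0} {
        return $default
    }
    return [${ns}::$func {*}$args]
}

# run is defined, so it is called with the file list.
puts [call_if_defined ::demo_module run {} {a.scf b.scf}]
# init is not defined, so the default is returned instead.
puts [call_if_defined ::demo_module init "init not defined"]
```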

name
Mandatory. Arguments: none. Returns: the textual name of the module. This function is used to query a human-readable name for the module (e.g. "ALF/ABI to SCF Conversion"). This name is used in the module list at the left side of the pregap4 window.

init
Optional. Arguments: none. Returns: none. This sets up any data structures needed for this module. It can be used to provide defaults for global variables when they are not known (e.g. they have no settings in the system or user pregap4rc files) and to set up any other data structures required.

run
Optional. Arguments: a Tcl list of files to process. Returns: a new Tcl list of files for subsequent processing. This is the main workhorse. It is optional, but it will be needed in all but the most esoteric cases. The single argument is a Tcl list of sequence names. These are either filenames on disk or identifiers used for fetching data from a database. The module should loop through the sequences which it can process (which may not be all of them, depending on the known information and file types). When finished, it needs to return a new list of files. If a file has been rejected by this module (e.g. it consists entirely of sequencing vector) then its sequence name should be omitted from the returned list. However, do make sure that every failed file has an error string attached to it by setting the file_error(seq name) array element.

shutdown
Optional. Arguments: a Tcl list of files to process. Returns: a new Tcl list of files for subsequent processing. Deallocates any data structures that have been set up during the init or run stages. Most modules will not need this function. As with the run function, the return value should be the list of files to pass on, which is generally the same as the list passed into this function. A special module, which is always included by pregap4, is the shutdown.p4m module. This is always the last module to have shutdown called. It produces the reports for pregap4 and does some general housekeeping.

create_dialogue
Optional. Arguments: a Tk pathname. Returns: none. This creates a dialogue controlling the parameters for this module. The Tk pathname passed into this function should be the root for all components of this dialogue. (Note though that this is not a toplevel window, but a subwindow of the main pregap4 dialogue.)

check_params
Optional. Arguments: none. Returns: a variable name or a blank string. This checks that this module has valid answers to all of its mandatory questions. If this is the case a blank string is returned, otherwise the first variable name which needs a value is returned.

process_dialogue
Optional. Arguments: a Tk pathname. Returns: 0 for failure, 1 for success. This is executed in all modules before the run functions are executed. Its purpose is to extract any information from user-editable entries or checkboxes ready for the run function to use. It may also be used to check that the data entered is valid. The return code indicates whether this module has sufficient data to execute. If 0 is returned, pregap4 will beep and make sure that the dialogue 'tab' for this module is displayed. Further processing then stops until the 'Run' button is pressed again. For instance, if a module needs to know the sequencing vector to screen against, then this function should check whether the value has been entered or can be obtained via a command. If so, it returns 1.

configure_dialogue path mode
Optional. Arguments: a Tk pathname, the configure mode. Returns: none. If this function is present, pregap4 will add a button to the top of the module dialogue inviting the user to save the parameters for this module to the configuration file. In early releases of pregap4 (2000.0 and before) a "Select parameters to save" button was also available. To maintain compatibility with older modules the "mode" parameter is still used. If you wish the module to be backwards compatible with old pregap4 releases, then check that "mode" contains "save_all"; if it does not, no action should be taken. In the 2001 release and newer the "mode" parameter will always contain "save_all", so no check is required. To save the dialogue information this function should use the pregap4 mod_save and glob_save functions.
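The following is a minimal sketch of how several of these functions might fit together; it is not taken from an existing module. The namespace `demo_module`, the variable `vector_file`, the `saved` array and the sample file names are all hypothetical. The global arrays are filled in by hand so that the sketch runs in a plain tclsh outside pregap4, and the Tk dialogue functions are left out.

```tcl
# Stand-ins for state that pregap4 itself would provide (assumptions).
array set file_type  {seq1.scf SCF seq2.abi ABI seq3.txt UNK}
array set file_error {seq1.scf {} seq2.abi {} seq3.txt {}}

namespace eval ::demo_module {
    proc name {} {
        return "Demo Trace Filter"
    }

    # Provide a default only when no value has been configured already.
    proc init {} {
        variable vector_file
        if {![info exists vector_file]} {
            set vector_file ""
        }
    }

    # Return the name of the first mandatory variable lacking a value,
    # or a blank string when everything needed is present.
    proc check_params {} {
        variable vector_file
        if {$vector_file eq ""} {
            return vector_file
        }
        return ""
    }

    # Keep files of a known type; reject the rest with an error string
    # that starts with the module name followed by a colon.
    proc run {files} {
        global file_type file_error
        set new_files {}
        foreach f $files {
            if {$file_type($f) eq "UNK"} {
                set file_error($f) "demo_module: unknown file type"
                continue
            }
            lappend new_files $f
        }
        return $new_files
    }

    # Act only on "save_all", for compatibility with older releases.
    proc configure_dialogue {path mode} {
        variable vector_file
        variable saved
        if {$mode ne "save_all"} {
            return
        }
        # A real module would call pregap4's mod_save and glob_save here.
        # Their signatures are not given in this section, so this sketch
        # just records the value in a hypothetical "saved" array.
        set saved(vector_file) $vector_file
    }
}

::demo_module::init
puts [::demo_module::check_params]
set ::demo_module::vector_file vectors.seq
puts [::demo_module::run {seq1.scf seq2.abi seq3.txt}]
puts $file_error(seq3.txt)
```

The rejected file is dropped from the list returned by run, and its reason is left in file_error for later reporting.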

Module Variables

mandatory
The existence of this variable (set to anything) states that this module cannot be disabled.

hidden
The existence of this variable states that the module's name shall not appear in the module list (although the module will still be used).

report
The contents of this variable are displayed at the end of the pregap4 run by the shutdown.p4m module.
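A minimal sketch of a module that declares itself mandatory and builds up a report as it runs is given here. The namespace `demo_module` and the report wording are hypothetical, and the explicit `namespace eval` wrapper is only there so that the sketch runs outside pregap4.

```tcl
namespace eval ::demo_module {
    # Only the existence of "mandatory" matters, not its value.
    variable mandatory 1
    # "hidden" is deliberately left unset, so the module stays listed.
    variable report ""

    proc name {} { return "Demo Module" }

    proc run {files} {
        variable report
        # Accumulate text for the end-of-run report.
        append report "Processed [llength $files] file(s)\n"
        return $files
    }
}

::demo_module::run {seq1.scf seq2.scf}
puts [info exists ::demo_module::mandatory]
puts [info exists ::demo_module::hidden]
puts -nonewline $::demo_module::report
```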

Global Variables

Several global variables exist which may need to be updated within the modules. For pregap4 to operate correctly, modules must update these when applicable.

file_type
This is a Tcl array indexed by file name. Each element is initialised by the General Configuration module to one of ABI, ALF, EXP, PLN, SCF or UNK.

file_error
This is a global array indexed by the current file name. If a file has been rejected by a module (i.e. not returned from the run function) then the appropriate array element must be filled with a reason. Typically the reason starts with the module name followed by a colon. For example "makeSCF: unknown file type".

file_id
This is a global array, indexed by filenames, containing the sequence identifiers (which are often different to the sequence filenames). It is initialised by the General Configuration module.

file_orig_name
This is a global array holding any original filename for each currently processed file. It is initialised by the General Configuration module such that each file points to its own filename.

When creating and returning a new file (such as when switching from SCF files to Experiment Files in the Initialise Experiment Files module) the arrays must all be updated correctly. This involves creating a new element, indexed by the new filename, in each of the above four arrays. The file_type element should contain the new file type (e.g. set file_type(seq10.exp) EXP). The file_error element should be set to a blank string. The file_id element should inherit the sequence identifier from the original file (e.g. set file_id(seq10.exp) $file_id(seq10.scf)). The file_orig_name element should point to the old filename (not the original filename pointed to by the old filename). In this way file_orig_name can be considered as a list of the intermediate files generated for each final sequence file.
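The four updates can be gathered into one helper, sketched below. The procs `register_new_file` and `file_history` are hypothetical and not part of pregap4. The initial array contents stand in for what the General Configuration module would set up, with the sequence identifier `seq10` assumed for illustration.

```tcl
# Stand-ins for arrays the General Configuration module would initialise.
array set file_type      {seq10.scf SCF}
array set file_error     {seq10.scf {}}
array set file_id        {seq10.scf seq10}
array set file_orig_name {seq10.scf seq10.scf}

# Hypothetical helper: record a new file derived from an existing one.
proc register_new_file {old new type} {
    global file_type file_error file_id file_orig_name
    set file_type($new)      $type
    set file_error($new)     ""
    set file_id($new)        $file_id($old)
    set file_orig_name($new) $old
}

# Hypothetical helper: follow file_orig_name back to the first file,
# which points to its own name.
proc file_history {name} {
    global file_orig_name
    set chain [list $name]
    while {$file_orig_name($name) ne $name} {
        set name $file_orig_name($name)
        lappend chain $name
    }
    return $chain
}

register_new_file seq10.scf seq10.exp EXP
puts [file_history seq10.exp]
```

Because each new file points only to its immediate predecessor, walking file_orig_name recovers the intermediate files in reverse order of creation.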

Builtin Functions

Apologies, but this section of documentation is still unfinished.

The full definition of these functions may be found in the Tcl code for Pregap4 itself. It is recommended that you use the Unix grep utility to find the definitions and example uses.

An Example Module

The best examples are the existing modules. Try looking at the Compress Trace Files module as an example. This may be found in "$STADENROOT/lib/pregap4/modules/compress_trace.p4m".


This page is maintained by staden-package. Last generated on 22 October 2002.
URL: http://www.mrc-lmb.cam.ac.uk/pubseq/manual/pregap4_unix_93.html