
Version-2001.0 Release Notes

James Bonfield, Kathryn Beal, Yaping Cheng, Mark Jordan and Rodger Staden

The major visible changes in this release include a new sequencing project experiment suggestion program, "pre_finish"; spin, a replacement for and combination of nip4 and sip4 which, very importantly, provides the first graphical user interface to EMBOSS (http://www.hgmp.mrc.ac.uk/Software/EMBOSS/); plus improvements and additions to our existing programs pregap4, gap4 and trev.

At the time of the last release we stated that the MS Windows version would only be available commercially, but we are very pleased to say that at the beginning of this year we obtained permission to distribute the MS Windows version under identical conditions to those for the UNIX versions. The functionality of the programs is the same on all systems (except that there is no EMBOSS release for Windows).

Providing the MS Windows version has had quite a big startup cost in terms of programming effort, but the problems have now been solved and lessons learnt, and it should not slow down progress in the future. Simple things like allowing spaces in file names or different window manager behavior can create a lot of work.

This release is the first to use our new trace file format, ZTR, which has the advantages of reduced file size and greater flexibility over SCF and removes the need for external compression programs. Over time we expect it to replace SCF as the preferred trace file format. In addition to ZTR, Gap4 now also contains an interface to Perkin Elmer's BioLIMS database.

Sequencing projects are growing ever larger, for example shotgunning whole bacteria, and although it may not be evident to users doing smaller projects, a major improvement in this release of gap4 is numerous speedups of tasks that were becoming slow. These larger projects also require more readings, so we have increased the maximum number of readings to 99,999,999; some sites are already using around 200,000. The permitted length of reading names has also been increased to 40 characters.

We have always recommended making as few changes as possible to the original reading data. This results in many pads appearing in contigs (the pads are stripped when the final consensus is created), and it meant that some searches in gap4 failed when the targets included pad characters. We have improved some of the affected routines by stripping pads before the searches are applied.
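The idea of stripping pads before a search can be illustrated with a minimal sketch. This is not gap4's code: the function names are hypothetical, and the pad character is assumed to be "*". The sketch removes pads from both the sequence and the target, searches the unpadded text, and maps each hit back to padded coordinates.

```python
PAD = "*"  # assumption: pads are written as "*"


def strip_pads(padded):
    """Return the unpadded sequence and a map from unpadded to padded index."""
    bases = []
    index_map = []
    for i, base in enumerate(padded):
        if base != PAD:
            bases.append(base)
            index_map.append(i)
    return "".join(bases), index_map


def find_ignoring_pads(padded, target):
    """Find every occurrence of target, ignoring pads in both strings.

    Returns (start, end) pairs in padded coordinates, end inclusive.
    """
    seq, index_map = strip_pads(padded)
    target = target.replace(PAD, "")
    hits = []
    if not target:
        return hits
    pos = seq.find(target)
    while pos != -1:
        hits.append((index_map[pos], index_map[pos + len(target) - 1]))
        pos = seq.find(target, pos + 1)
    return hits
```

A plain substring search for "ACGT" in "AC*GT**ACGT" finds only the second copy; the sketch above reports both, with the first hit spanning the pad.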

Hyperlinks have been introduced to results in the Output window and to the results produced by the new reading name, template name and tag content searches. The hyperlinks can be used to invoke the Contig Editor etc.

The gap4_viewer has been free for some time, even for commercial users, and provides full gap4 functionality in a read-only mode. Making the gap4_viewer free to all means that anybody (including commercial sequencing companies returning results to customers) can usefully send their gap4 databases to colleagues at other sites: the databases are machine independent and anybody with access to MS Windows or UNIX/LINUX can obtain and use the viewer. Anyone who has done any sequencing will know that seeing the assembly and the traces is the best way of assessing the reliability of the consensus. From this release we will not be distributing a separate gap4_viewer as we have incorporated its limited functionality into the standard version of gap4. This means that the downloaded version of gap4 will work with full functionality on the demo and course data included in the package and in viewer mode on any other data. Full functionality on all data will require a licence as before (free to academic users).

A more detailed list of the program changes made is given below. The manual has also been updated. Whenever we receive queries on topics that are not documented or which are inadequately explained we add to or improve the corresponding sections in the manual.

Note that we welcome comments and suggestions about the package, particularly ideas about what users would like to see added. For this release we are looking especially for sites that would like to try out our new experiment suggestion program (pre_finish) and to contribute to the design of its user interface. See the $STADENROOT/lib/finish/METHODS file for more information.


Program version numbers


Operating systems

The binaries have been created in the following build environments. Newer environments for the same operating system should typically work fine, but older systems will not necessarily do so. (For example, the binaries will not run under RedHat Linux 5.x, but will run on RedHat Linux 7.x.)

Demo data sets

The course (see course/*_docs/*.pdf) may be run in demonstration mode (i.e. without needing a licence). Specifically, all demonstration data files are considered valid sequences and so are exempt from the licence restrictions. All pathnames listed below are relative to the installation root for the package.

Here is a list of sequences which may be loaded into spin:

For a good example of protein-protein similarity plots, try using mysa_drome.seq and mysa_human.seq.

For dna-protein plots, try using cemyo1.seq against mysa_caeel.seq.

To see how spin handles large sequences try using ecoli.00003 and lambda.seq. This is a large comparison: 250Kb against 48.5Kb. Hence the slower searches, such as Find Similar Spans, will take a long time. We suggest searching with Find Matching Words using a word length of 12.
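To show why an exact word search stays fast on a comparison of this size, here is a minimal sketch of word matching using a lookup table of fixed-length words. It is an illustration only and not spin's implementation of Find Matching Words; the function name and return format are hypothetical.

```python
from collections import defaultdict


def matching_words(seq_a, seq_b, word_length=12):
    """Return (i, j) pairs where the word at seq_a[i] equals the word at seq_b[j].

    Every word of seq_a is indexed once, then each word of seq_b is looked up,
    so the cost grows with the two lengths and the number of matches rather
    than with the product of the lengths.
    """
    index = defaultdict(list)
    for i in range(len(seq_a) - word_length + 1):
        index[seq_a[i:i + word_length]].append(i)

    matches = []
    for j in range(len(seq_b) - word_length + 1):
        # .get avoids adding empty entries to the index for unmatched words
        for i in index.get(seq_b[j:j + word_length], ()):
            matches.append((i, j))
    return matches
```

Each returned pair corresponds to one point on a dot plot of the two sequences.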

Gap4 in demonstration mode allows access to:

Pregap4 in demonstration mode allows access to the same files listed for Gap4.


Sequence Library Access Using EMBOSS

The spin interface to EMBOSS provides access to the sequence libraries provided you know the names of the files you want to extract!


Linux, Gnome and Enlightenment

There is a known problem with the Gap4 contig editor when using Enlightenment as the window manager, although this may have been fixed by now. Enlightenment is the default window manager for earlier versions of Gnome, but it is not known whether the problem arises when using Enlightenment in other environments. The symptom is that the program terminates instantly upon starting the contig editor with a complaint about X_ConfigureEvents.

The solution is to change window manager, which can be done using the Gnome control panel.


Change log

Here is a list of changes since the 2000.0 release.

Gap4

Changes

Bug fixes

Nip4/Sip4 (now spin)

Changes

Bugs fixed

Pregap4

Changes

Bug fixes

Vector_clip

Changes

Bug fixes

Trace file handling (io_lib)

Changes

Bug fixes

Trev

Changes

Bug fixes

Misc

Changes

Bug fixes

