vector_clip -- finds and marks vector segments in sequence readings
vector_clip -[schr] [-w word_length (4)] [-n num_diags (7)]
    [-d diagonal_score (0.35)] [-l minimum_match (20/70%)]
    [-m minimum_5'_position] [-t] [-p passed_fofn] [-f failed_fofn]
    input_fofn
vector_clip finds and marks vector segments in sequence readings stored
in experiment file format. For sequencing vectors it can be used to find the
5' primer and, for short inserts, the sequence to the 3' side of the cloning
site. It can also be used to find 3' primer sequences. A further option can do
a final check for any vector rearrangements that could be missed by the more
specific searches around the cloning site. For cloning vectors it will search
both orientations of the sequence and mark any segments found. The vector
sequences must be stored as simple text files. For cloning vector
searches the reading's experiment file must contain the name of the
cloning vector file. For sequencing vector searches, either the experiment
file for each reading must contain the information about the vector
sequence (the file name, cloning site and primer offset) or
vector-primer files must be used. Vector-primer files contain sets of
sequences from around cloning sites, and vector_clip can use these to
find the vector that matches each reading best. If the match is above
the cutoff score the reading is clipped. Vector-primer files are the
simplest method of providing vector_clip with the data it needs for
finding sequencing vectors. More information is available elsewhere
(see section Screening Against Vector Sequences).
The program processes batches of readings using files of file names: one for input and two for output. The input file lists the names of all the readings to process, one name per line. One output file receives the names of the readings that pass the screening and the other receives the names of those that fail.
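The input file described above can be built in one line. The following is a minimal sketch, not part of vector_clip itself: the working directory, the ".exp" suffix and the reading names are assumptions made only so the example runs on its own.

```shell
#!/bin/sh
# Sketch: build an input file of file names, one reading per line.
# The ".exp" suffix and the reading names are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Empty stand-in files so the sketch runs without real readings.
for name in read1 read2 read3; do
    : > "$name.exp"
done

# One name per line, as the input file of file names requires.
printf '%s\n' *.exp > files.in

# Count the names that were written.
count=$(wc -l < files.in | tr -d ' ')
echo "$count readings listed in files.in"
```

With real experiment files in place of the stand-ins, files.in can be given to vector_clip as the final argument.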
-s   mark sequencing vector
-c   mark cloning vector
-h   hgmp primer
-i   vector_primer filename
-r   vector rearrangements
-t   test only
-L   minimum percentage match 5' end (60)
-R   minimum percentage match 3' end (80)
-m   minimum 5' position
-v   vector-primer-pair filename
-V   vector_primer length
-w   word_length (4)
-P   probability
-n   num_diags (7)
-d   diagonal score (0.35)
-l   minimum match (20)
-M   maximum vector length (100000)
-p   passed fofn
-f   failed fofn
Usage: vector_clip [options] file_of_filenames
Where options are:
    [-s mark sequencing vector]
    [-c mark cloning vector]
    [-h hgmp primer]
    [-r vector rearrangements]
    [-w word_length (4)]
    [-n num_diags (7)]
    [-d diagonal score (0.35)]
    [-l minimum match (20)]
    [-L minimum % 5' match (60)]
    [-R minimum % 3' match (80)]
    [-m default 5' position]
    [-t test only]
    [-M Max vector length (100000)]
    [-P max Probability]
    [-v vector_primer filename]
    [-i vector_primer filename]
    [-V vector_primer length]
    [-p passed fofn]
    [-f failed fofn]
Screen for sequencing vector using a 5' cutoff of 70%, a 3' cutoff of 90% and a default 5' primer position of 30. The files to process are named in files.in; the names of the files that pass are written to files.pass and the names of those that fail to files.fail.
vector_clip -s -L70 -R90 -m30 -pfiles.pass -f files.fail files.in
Screen for sequencing vector using the default 5' cutoff of 60% and 3' cutoff of 80%, and a default 5' primer position of 30. The files to process are named in files.in; the names of the files that pass are written to files.pass and the names of those that fail to files.fail. Because neither -s nor -c is given, this also shows that the default search is for sequencing vector.
vector_clip -m30 -pfiles.pass -f files.fail files.in
Screen for sequencing vector using the default 5' cutoff of 60% and 3' cutoff of 80%, and a vector-primer-pair file called vector_primer_file. The files to process are named in files.in; the names of the files that pass are written to files.pass and the names of those that fail to files.fail.
vector_clip -v vector_primer_file -pfiles.pass -f files.fail files.in
Screen transposon data using a 5' cutoff of 80%, a 3' cutoff of 85%, a match length of 10 and a vector-primer file (given with -i) called vector_primer_file. The files to process are named in files.in; the names of the files that pass are written to files.pass and the names of those that fail to files.fail.
vector_clip -i vector_primer_file -L 80 -R 85 -l 10 -pfiles.pass \
-f files.fail files.in
Screen for cloning vector using the old algorithm with a word length of 4, summing 7 diagonals and using a diagonal cutoff score of 0.4. The files to process are named in files.in; the names of the files that pass are written to files.pass and the names of those that fail to files.fail.
vector_clip -c -w4 -n7 -d0.4 -pfiles.pass -f files.fail files.in
Screen for cloning vector using the probability-based algorithm with a word length of 4 (the default) and a probability cutoff of 1.0e-13. The files to process are named in files.in; the names of the files that pass are written to files.pass and the names of those that fail to files.fail.
vector_clip -c -P 1.0e-13 -pfiles.pass -f files.fail files.in
Screen for 3' primer using a cutoff of 75%. The files to process are named in files.in; the names of the files that pass are written to files.pass and the names of those that fail to files.fail.
vector_clip -h -R75 -pfiles.pass -f files.fail files.in
Screen for sequencing vector rearrangements using a minimum match of 20 bases. The files to process are named in files.in; the names of the files that pass are written to files.pass and the names of those that fail to files.fail.
vector_clip -r -l20 -pfiles.pass -f files.fail files.in
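After a run like those above, it can be useful to confirm that no reading was lost between the input list and the two output lists. The sketch below assumes that every name in files.in ends up in exactly one of files.pass and files.fail; the text implies this but does not state it outright, so treat a mismatch as a prompt to investigate, not as proof of an error. The three files here are hypothetical stand-ins written by the script itself.

```shell
#!/bin/sh
# Sketch: check that every input name landed in exactly one output list.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical stand-ins for the three files of file names.
printf '%s\n' read1.exp read2.exp read3.exp > files.in
printf '%s\n' read1.exp read3.exp > files.pass
printf '%s\n' read2.exp > files.fail

# comm needs sorted input.
sort files.in > in.sorted
sort files.pass > pass.sorted
sort files.fail > fail.sorted
sort files.pass files.fail > out.sorted

# Names in the input that appear in neither output list.
missing=$(comm -23 in.sorted out.sorted)
# Names that appear in both output lists.
both=$(comm -12 pass.sorted fail.sorted)

if [ -z "$missing" ] && [ -z "$both" ]; then
    status="consistent"
else
    status="inconsistent"
fi
echo "$status"
```

To use it on a real run, drop the three printf lines and run the remainder in the directory holding files.in, files.pass and files.fail.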
The following error messages can be generated.
SL, SR, CL, CR, CS, PS, PR and SF records are written to the experiment files.
See section Experiment File.
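To see which of the records listed above a given reading has received, the experiment file can be searched by record type. This is a rough sketch that assumes each record sits on its own line, starting with its two-letter type followed by a space; the file name and the values in the sample file are made up for illustration.

```shell
#!/bin/sh
# Sketch: report which clip-related record types a reading file contains.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical experiment file; the values are invented for the example.
cat > read1.exp <<'EOF'
ID   read1
SL   45
SR   412
EOF

found=""
for tag in SL SR CL CR CS PS PR SF; do
    # A record is assumed to start the line with its two-letter type.
    if grep -q "^$tag " read1.exp; then
        found="$found $tag"
    fi
done
found=${found# }   # drop the leading space
echo "records present: $found"
```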
For notes on defining the cloning and primer sites, see section Defining the Positions of Cloning and Primer Sites for Vector_Clip.
See section scf(4).