clip -- an Experiment File sequence clipper
clip [-v] [-s start_offset] [-m minimum_extent] [-M maximum_extent]
[-w r_length_1] [-u r_unknown_1] [-W r_length_2] [-U r_unknown_2]
[-l l_length_1] [-y l_unknown_1] [-L l_length_2] [-Y l_unknown_2] file ...
Clip is a simple program to decide how much of the 3' end of a
sequence, stored as an Experiment File, should be clipped off and ignored
during assembly. The decision is made by counting the number of
unknown bases (e.g. - or N) found within windows slid left to
right along the sequence.
The file arguments, of which there can be several, are processed one at a
time. Each argument is assumed to be a valid Experiment File. The sequence
is read from the Experiment File SQ identifier; clipping is performed;
and QL and QR identifiers are appended to the file.
The right clip position is calculated by sliding a window of length
r_length_1 to the right along the sequence, starting from base
start_offset. The scan stops once the window contains at least
r_unknown_1 unknown bases. At this stage two choices are available: place
the clip at the start position of the first window, or proceed from the
current position plus half of r_length_1 using a second window. In the
latter case a similar scan is performed using the r_length_2 and
r_unknown_2 parameters, and the clip is set to the start position of this
second window.
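The right clip procedure can be sketched in Python as follows. This is an
illustration only, not the code used by clip. The function names, the 0-based
indexing, the set of characters counted as unknown, the rightward direction of
the second scan, and the choice to return the sequence length when no window
reaches the threshold are all assumptions made for the sketch.

```python
UNKNOWN = frozenset("-Nn")  # assumption: characters treated as unknown bases


def first_bad_window(seq, start, length, threshold):
    """Scan rightwards from `start`; return the 0-based start of the first
    window of `length` bases holding at least `threshold` unknown bases,
    or None if no complete window qualifies."""
    if length <= 0:
        return None
    for pos in range(max(start, 0), len(seq) - length + 1):
        window = seq[pos:pos + length]
        if sum(base in UNKNOWN for base in window) >= threshold:
            return pos
    return None


def right_clip(seq, start_offset=70, len1=100, unk1=5, len2=0, unk2=0):
    """Hypothetical right clip position using one or two windows."""
    first = first_bad_window(seq, start_offset, len1, unk1)
    if first is None:
        return len(seq)  # assumption: nothing needs clipping
    if len2 <= 0:
        return first     # single-window mode
    # Second window: continue from the current position plus half of len1.
    second = first_bad_window(seq, first + len1 // 2, len2, unk2)
    return len(seq) if second is None else second
```

With a single window the clip lands at the start of the first window that
reaches the threshold, which can be well before the first unknown base. The
second, usually shorter, window moves the clip closer to where the unknown
bases begin.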
The left clip position is calculated by sliding a window to the left starting
from base start_offset. The algorithm used is identical to that for the right
clip position except that the l_length_1, l_unknown_1, l_length_2 and
l_unknown_2 parameters are used.
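A mirrored, single-window sketch of the left scan is given below. It is a
guess at the behaviour under stated assumptions rather than the program's
code: the window is assumed to end at start_offset and extend leftwards, the
returned position is the window edge nearest start_offset (the mirror of the
"start position" in the right scan), indices are 0-based, and 0 is returned
when no window reaches the threshold. The name left_clip is hypothetical.

```python
UNKNOWN = frozenset("-Nn")  # assumption: characters treated as unknown bases


def left_clip(seq, start_offset=70, length=20, threshold=3):
    """Hypothetical left clip position using a single leftward window."""
    if length <= 0:
        return 0
    pos = min(start_offset, len(seq))
    # The window covers seq[pos - length:pos] and moves one base left per step.
    while pos - length >= 0:
        window = seq[pos - length:pos]
        if sum(base in UNKNOWN for base in window) >= threshold:
            return pos
        pos -= 1
    return 0  # assumption: nothing needs clipping on the left
```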
To use only one window (the default) set the relevant length_2 parameter
to 0, using -W 0 for the right clip or -L 0 for the left clip.
The default arguments are
"-s 70 -m 0 -M 999999 -w 100 -u 5 -W 0 -U 0 -l 20 -y 3 -L 0 -Y 0".
-v
-s offset
The base (start_offset) from which the windows start sliding.
-m extent
If the clipping algorithm yields a QL clip value of less than
extent bases into the sequence then use extent as the QL
value.
-M extent
If the clipping algorithm yields a QR clip value of more than
extent bases into the sequence then use extent as the QR
value.
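-w length
Length of the first window used for the right clip (r_length_1).
-u unknown
Number of unknown bases that stops the first right window (r_unknown_1).
-W length
Length of the second window used for the right clip (r_length_2). A value
of 0 disables the second window.
-U unknown
Number of unknown bases that stops the second right window (r_unknown_2).
-l length
Length of the first window used for the left clip (l_length_1).
-y unknown
Number of unknown bases that stops the first left window (l_unknown_1).
-L length
Length of the second window used for the left clip (l_length_2). A value
of 0 disables the second window.
-Y unknown
Number of unknown bases that stops the second left window (l_unknown_2).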
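The effect of the -m and -M limits on the computed clip values can be
illustrated with a short Python sketch. The function name apply_extents is
hypothetical and the code shows only the rule as described for those two
options, not the program's own implementation.

```python
def apply_extents(ql, qr, minimum_extent=0, maximum_extent=999999):
    """Limit the clip values as described for -m and -M."""
    # -m: a QL value below minimum_extent is raised to minimum_extent.
    ql = max(ql, minimum_extent)
    # -M: a QR value above maximum_extent is lowered to maximum_extent.
    qr = min(qr, maximum_extent)
    return ql, qr
```

For example, with "-m 20 -M 700", computed clips of QL 5 and QR 800 would
become QL 20 and QR 700, while values already inside the limits are left
unchanged.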
To clip a batch of sequences listed in the file `fofn' with a minimum left clip value of 20 bases, use:
clip -m 20 `cat fofn`
See section ExperimentFile(4). See section trace_clip.