trace_clip -- an Experiment File sequence clipper that analyses traces
trace_clip [-w winlen_nonc] [-W winlen_drop] [-s start]
[-c cut_nonc_r] [-C cut_drop_r] [-f fract_nonc_r]
[-k cut_nonc_l] [-K cut_drop_l] [-F fract_nonc_l]
[-m max_right] [-M min_left] [-L] [-R] [-b] [-v] [-t] [-p] file ...
trace_clip is used to "clip" the 3' and 5' ends of machine-produced
sequences. It adds QR and QL records to the reading's experiment file,
and bases to the right of the QR position and to the left of the QL
position will be ignored by many subsequent processing steps (although
note that the clipped data can be used to help find joins between
contigs (see section Find Internal Joins), and to confirm single
stranded regions (see section Double stranding)).
The clip position is selected by analysing the reading's
traces using two simple measures.
The first (nonc, or non-called over called) calculates the ratio of the
area under the trace for the called base to the maximum area under each
of the non-called bases at the same position. The second (drop) measures,
for the called base, the ratio of the height of the trace at its peak to
its height at the
mid-point between the peak and the next base.
In our hands, for ABI-produced traces both of these calculations give
values that start off high, drop to a minimum and then increase 5' to 3'.
For the majority of the sequence the measures are averaged over windows
of length winlen_nonc and winlen_drop, but near the 5' end of the
sequence the windows are progressively shortened. For example, if the
window length is 101 then from base position 51 rightwards the
calculations are averaged over windows of 101 bases; for base
position 50 the window length is 99, for 49 it is 97, and so on, until
the 5' end of the reading is reached. (Actually, bases at positions 1 to
min_left are given the value at position min_left.)
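The shrinking window described above can be sketched in Python as follows. This is an illustration only, not trace_clip's own source: the names effective_window and windowed_average are hypothetical, a centred window with an odd window length is assumed, and the behaviour near the 3' end is not described here, so the sketch simply truncates the window there.

```python
def effective_window(pos, winlen):
    """Window length used at 1-based base position pos (assumes odd winlen)."""
    # A centred window needs (winlen - 1) // 2 bases on each side. Near the
    # 5' end only pos - 1 bases exist on the left, so the window loses two
    # bases for every position closer to the start.
    return min(winlen, 2 * pos - 1)


def windowed_average(values, winlen, min_left=1):
    """Average per-base values (index 0 holds position 1) over centred windows."""
    n = len(values)
    averaged = []
    for pos in range(1, n + 1):
        half = (effective_window(pos, winlen) - 1) // 2
        first = pos - half          # first position in the window (1-based)
        last = min(n, pos + half)   # assumption: truncate at the 3' end
        window = values[first - 1:last]
        averaged.append(sum(window) / len(window))
    # Positions 1 .. min_left take the value found at position min_left.
    if n >= min_left:
        for pos in range(1, min_left):
            averaged[pos - 1] = averaged[min_left - 1]
    return averaged
```

With a window length of 101, effective_window gives 101 at position 51, 99 at position 50 and 97 at position 49, matching the example in the text.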
To select the right end clip point the program starts either at the
base having the minimum observed average value or, if the user defines
start, at that position, and searches rightwards until it finds a
position whose averaged value exceeds the cutoff: cut_nonc_r for nonc
and cut_drop_r for drop. The clip point is the weighted mean (using
fract_nonc_r) of the positions at which the two searches stop. The left
clip point is calculated in a similar manner.
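A minimal Python sketch of the right end selection follows, under several assumptions that the text does not confirm: each measure is searched from its own minimum when start is not given, fract_nonc_r is the weight on the nonc position (with 1 - fract_nonc_r on the drop position), the result is rounded to a whole base, and a search that never exceeds its cutoff falls back to the last base. The function names are hypothetical.

```python
def first_exceeding(averaged, begin, cutoff):
    """First 1-based position >= begin whose averaged value exceeds cutoff."""
    for pos in range(begin, len(averaged) + 1):
        if averaged[pos - 1] > cutoff:
            return pos
    return len(averaged)  # assumption: no crossing means the last base


def right_clip(nonc, drop, cut_nonc_r=0.3, cut_drop_r=1.1,
               fract_nonc_r=0.25, start=None):
    """Combine the nonc and drop stop positions into one right clip point."""
    def begin(averaged):
        # Start at the user-supplied position, else at the minimum value.
        if start is not None:
            return start
        return averaged.index(min(averaged)) + 1

    pos_nonc = first_exceeding(nonc, begin(nonc), cut_nonc_r)
    pos_drop = first_exceeding(drop, begin(drop), cut_drop_r)
    # Assumed weighting: fract_nonc_r on nonc, the remainder on drop.
    return round(fract_nonc_r * pos_nonc + (1.0 - fract_nonc_r) * pos_drop)
```

The default values in the signature are the program defaults for -c, -C and -f; the averaged inputs would be the windowed nonc and drop values.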
The file arguments, of which there can be several, are processed one at a
time. Each argument is assumed to be a valid Experiment File. The trace
file name is read from the Experiment File; clipping is performed;
and a QR or QL identifier is appended to the Experiment File.
The default arguments are -w 101 -W 101 -c 0.3 -C 1.1 -f 0.25 -k
0.3 -K 0.3 -F 0.5 -m 550 -M 5. Also by default, only 3' clipping is
performed. Left end clipping can be forced using the -L or -b
options. -L does the left end only unless the -R option is
also used, and -b does both ends.
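The way -L, -R and -b combine can be summarised with a small Python sketch. The function ends_to_clip is hypothetical and models only the rules stated above; it assumes that -R on its own behaves like the default.

```python
def ends_to_clip(left=False, right=False, both=False):
    """Return the ends clipped for a given combination of -L, -R and -b."""
    if both or (left and right):
        return ("left", "right")   # -b, or -L together with -R
    if left:
        return ("left",)           # -L alone: left end only
    return ("right",)              # default: 3' (right) end only
```

For example, ends_to_clip(left=True) gives the left end only, while ends_to_clip(left=True, right=True) and ends_to_clip(both=True) both give both ends.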
Using the -p option the program will output the averaged values
nonc and drop to two files with names derived from the input file name.
For example, for input file fred.1 the output files fred.1.d and
fred.1.n will contain respectively the drop and nonc values as x,y
coordinates suitable for use by a graph plotting program.
Using the -t option will leave the experiment file unchanged. If
used in conjunction with the -v option the program will write the
clip points to the terminal screen.
The parameters cut_nonc_r, cut_drop_r, fract_nonc_r, cut_nonc_l,
cut_drop_l and fract_nonc_l can be chosen by use of scale_trace_clip
(see section scale_trace_clip).
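-w winlen_nonc
    Window length over which the nonc measure is averaged (default 101).
-W winlen_drop
    Window length over which the drop measure is averaged (default 101).
-c cut_nonc_r
    Cutoff for nonc when searching for the right clip point (default 0.3).
-C cut_drop_r
    Cutoff for drop when searching for the right clip point (default 1.1).
-f fract_nonc_r
    Weight used in the weighted mean that gives the right clip point
    (default 0.25).
-k cut_nonc_l
    Cutoff for nonc when searching for the left clip point (default 0.3).
-K cut_drop_l
    Cutoff for drop when searching for the left clip point (default 0.3).
-F fract_nonc_l
    Weight used in the weighted mean that gives the left clip point
    (default 0.5).
-v
    Write the clip points to the terminal screen.
-t
    Leave the experiment file unchanged.
-p
    Write the averaged nonc and drop values to files named after the
    input file.
-s start
    Position at which the search for the right clip point starts.
-M min_left
    Bases at positions 1 to min_left are given the value at position
    min_left (default 5).
-m max_right
    Default 550.
-L
    Clip the left end only, unless -R is also used.
-R
    Clip the right end as well when used with -L.
-b
    Clip both ends.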
See section ExperimentFile(4). See section scale_trace_clip.