Submitting a sequence
A query amino acid sequence should be pasted into the text area of the
REPRO form. Alternatively, a sequence file can be specified using the
upload option. Several parameters are preset to default values, but
can be changed by the user:
- A choice of substitution matrices is available; the default is PAM250.
- Gap opening and extension penalties used in the pair-wise local
  alignments can be selected; the defaults are 10 and 1
  respectively (for use with the default substitution matrix).
- The number (N) of non-overlapping local alignments should be set
  between 1 and 99; the default is 50.
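The defaults listed above can be summarised in a small sketch. The class and field names below are hypothetical and are not part of the REPRO interface; only the default values and the 1 to 99 range for N come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step1Params:
    """Illustrative container for the step 1 form fields (names are assumptions)."""
    matrix: str = "PAM250"   # default substitution matrix named in the text
    gap_open: float = 10.0   # default gap opening penalty
    gap_extend: float = 1.0  # default gap extension penalty
    n_alignments: int = 50   # N, number of non-overlapping local alignments

    def __post_init__(self):
        # The text restricts N to the range 1..99.
        if not 1 <= self.n_alignments <= 99:
            raise ValueError("N must be between 1 and 99")


defaults = Step1Params()
```

Note that the default penalties are stated to be intended for the default matrix, so a different matrix would probably call for different penalty values.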
Pressing the 'Run Repro' button will execute REPRO. The N top-scoring
non-overlapping alignments found are then printed with their
corresponding positions within the sequence. Obtaining the top
alignments is the most time-consuming part of the repeat-determining
procedure. Therefore, the
alignments resulting from the first step are written to a file and
stored in a personal directory assigned to the user at the server
site. REPRO can then easily be restarted by the user to attempt
further analysis (step 2) with various numbers of alignments and/or
parameter settings.
Analysis of top alignments
Submitting top alignments
After completion of step 1, the user can commence the graph-based
clustering of the alignments. In the REPRO method, graphs are used to
represent different repeat sets. The graph nodes correspond to
individual repeats, represented by their N-terminal residue, while
the edges represent matches between these residues, labeled with the
top alignment providing the match.
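The graph representation described above can be illustrated with a minimal sketch. It assumes each top alignment is reduced to the pair of N-terminal positions it matches, and it groups nodes by plain connectivity; the actual REPRO clustering (Heringa and Argos, 1993) applies further consistency criteria that are not modelled here. All function names are hypothetical.

```python
from collections import defaultdict


def build_repeat_graph(alignments):
    """Build an undirected graph from top alignments.

    alignments: list of (start_a, start_b) N-terminal positions,
    one pair per top alignment, in rank order.
    Returns adjacency: node -> {neighbour: alignment_rank}.
    """
    graph = defaultdict(dict)
    for rank, (a, b) in enumerate(alignments, start=1):
        # Each edge is labelled with the alignment that provides the match.
        graph[a][b] = rank
        graph[b][a] = rank
    return graph


def connected_sets(graph):
    """Group nodes into connected components (candidate repeat sets)."""
    seen, sets = set(), []
    for node in sorted(graph):
        if node in seen:
            continue
        stack, component = [node], []
        seen.add(node)
        while stack:
            current = stack.pop()
            component.append(current)
            for neighbour in graph[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        sets.append(sorted(component))
    return sets


example = build_repeat_graph([(5, 40), (40, 75), (110, 150)])
print(connected_sets(example))
```

In this toy input, positions 5, 40 and 75 end up in one candidate set because alignments 1 and 2 share the node at position 40.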
To activate step 2, the form in the results page from step 1 should be
completed by specifying the number M of alignments (M ≤ N)
or by selecting individual alignments. After clicking on the second
'Run Repro' button to assemble the repeat sets, the results will
appear almost instantaneously on the screen, as the clustering
protocol is very fast.
The parameters are preset to optimal values, but can be altered for complex repeat situations:
- MGC - Minimal graph connection
- MRFL - Minimum repeating fragment length
- VSSV - Virtual fragment start site variation
- Filter - Filter
- DFE - Determination of fragment ends
- VNSS - Virtual nodes start sites
Analysis of results
Presentation of the final results is split into four categories:
Repeats
Pair-wise top alignments: as described for step 1 above.
Stacked alignments
Multiple alignment through stacking of pair-wise top alignments: This allows the user to assess repetitive patterns by eye.
Evaluation
Repeat sets with evaluation statistics: three types of graph set are given
according to the phase of the REPRO graph-clustering procedure
(Heringa and Argos, 1993): (a) initial repeat graphs, which are
based on consistent matching of the top alignment N-termini; (b)
non-decomposed repeat graphs, which give the complete set of
fragments for each repeat type recognised; and (c) decomposed repeat
sub-graphs, which are the graphs given in (b) split according to
single graph connecting nodes (Heringa and Argos, 1993).
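One way to picture the decomposition in (c) is sketched below. It assumes that a "single graph connecting node" corresponds to what graph theory calls an articulation point, i.e. a node whose removal disconnects the graph; this reading is an assumption and is not confirmed by the text. The function names are hypothetical, and the brute-force approach is chosen for clarity rather than speed.

```python
def count_components(adjacency, skip=None):
    """Count connected components, optionally ignoring one node."""
    seen, count = set(), 0
    for start in adjacency:
        if start == skip or start in seen:
            continue
        count += 1
        stack = [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            for nxt in adjacency[node]:
                if nxt != skip and nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
    return count


def single_connecting_nodes(adjacency):
    """Return nodes whose removal increases the number of components."""
    base = count_components(adjacency)
    return sorted(
        node for node in adjacency
        if count_components(adjacency, skip=node) > base
    )


# Toy graph: a triangle (1, 2, 3) joined to a chain (4, 5) through node 3.
toy = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3, 5}, 5: {4}}
print(single_connecting_nodes(toy))
```

Under this assumption, splitting the toy graph at nodes 3 and 4 would separate the tightly connected triangle from the chain.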
Fragments
As a final output category, the repeat fragments corresponding to the
decomposed graphs are given in PIR format. Each output category can
be conveniently reached by following the navigation bars on the
screen. Evaluation of the three types of graphs for each repeat set
is facilitated by navigation click buttons, which interlink the
start, non-decomposed and sub-graphs for each repeat set. This
enables the user to easily verify the significance and consistency
of individual repeats, so that manual adjustments can be made to the
repeat sets produced by REPRO. The final output can be downloaded or
printed in a simple text format by following the link at the top of
the page. The user can thus subject the final repeat sets (iv) in
PIR format to further analysis, for example by constructing a
multiple alignment with PRALINE (Heringa, 1999).
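For further analysis outside the server, the downloaded fragments can be read with a few lines of code. The sketch below assumes the standard PIR (NBRF) layout: a header line of the form `>P1;name`, one description line, and sequence lines terminated by `*`. Whether the REPRO output follows this layout exactly is an assumption, and the function name and example records are hypothetical.

```python
def parse_pir(text):
    """Parse PIR-formatted text into (name, description, sequence) tuples."""
    records = []
    lines = iter(text.splitlines())
    for line in lines:
        if not line.startswith(">"):
            continue
        # Header looks like ">P1;name"; keep only the part after ';'.
        _, _, name = line[1:].partition(";")
        description = next(lines, "")
        chunks = []
        for seq_line in lines:
            chunks.append(seq_line.strip())
            # A trailing '*' marks the end of the sequence.
            if seq_line.rstrip().endswith("*"):
                break
        records.append((name, description, "".join(chunks).rstrip("*")))
    return records


sample = ">P1;frag1\nrepeat 1\nACDEFG\nHIK*\n>P1;frag2\nrepeat 2\nLMN*\n"
for name, description, sequence in parse_pir(sample):
    print(name, description, sequence)
```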
Further Information
Questions and comments regarding the interface to Repro should be
sent to Richard George, and questions regarding the program Repro
should be sent to Jaap Heringa.
Contact Info