REPRO - Help


Step 1 - Calculation of a list of N top-scoring non-overlapping local alignments.
    Submitting a sequence

    A query amino acid sequence should be pasted into the text area of the REPRO form. Alternatively, a sequence file can be specified using the upload option. The following parameters have default values but can be changed by the user:

    1. A choice of substitution matrices is available; the default is PAM250.
    2. Gap opening and extension penalties used in the pair-wise local alignments can be selected; the defaults are 10 and 1, respectively (for use with the default substitution matrix).
    3. The number (N) of non-overlapping local alignments should be set between 1 and 99; the default is 50.
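
    To illustrate how the gap opening and extension penalties enter a pair-wise local alignment score, the following is a minimal sketch and not the REPRO implementation. It assumes a simple match/mismatch scheme in place of PAM250 (the values 5 and -4 are hypothetical), assumes that a gap of length k costs gap_open + (k - 1) * gap_extend, and scores only the single best local alignment rather than the N top-scoring non-overlapping ones. The function name local_alignment_score is chosen for illustration only.

```python
def local_alignment_score(seq_a, seq_b, match=5, mismatch=-4,
                          gap_open=10, gap_extend=1):
    """Best local alignment score with affine gap penalties.

    Illustrative only: match/mismatch stand in for a substitution
    matrix such as PAM250, and the gap cost convention is an assumption.
    """
    neg_inf = float("-inf")
    m = len(seq_b)
    prev_h = [0] * (m + 1)        # best scores for the previous row
    prev_f = [neg_inf] * (m + 1)  # previous row, alignments ending in a vertical gap
    best = 0
    for i in range(1, len(seq_a) + 1):
        cur_h = [0] * (m + 1)
        cur_f = [neg_inf] * (m + 1)
        e = neg_inf               # current row, alignments ending in a horizontal gap
        for j in range(1, m + 1):
            # Open a new gap or extend an existing one.
            e = max(cur_h[j - 1] - gap_open, e - gap_extend)
            cur_f[j] = max(prev_h[j] - gap_open, prev_f[j] - gap_extend)
            s = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            # A local alignment may restart anywhere, hence the floor at 0.
            cur_h[j] = max(0, prev_h[j - 1] + s, e, cur_f[j])
            best = max(best, cur_h[j])
        prev_h, prev_f = cur_h, cur_f
    return best


if __name__ == "__main__":
    print(local_alignment_score("ACDEFGHIKL", "ACDEFHIKL"))
```

    With the default penalties of 10 and 1, a single-residue gap is only worth bridging when the matches gained on the far side outweigh the opening penalty, which is why these values are tied to the scale of the substitution matrix.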


    Pressing the 'Run Repro' button will execute REPRO. The N top-scoring non-overlapping alignments found are then printed with their corresponding positions within the sequence. Obtaining the top alignments is the most time-consuming part of the repeat-determining procedure and accounts for the bulk of the computation time. The alignments resulting from the first step are therefore written to a file and stored in a personal directory assigned to the user at the server site. REPRO can then easily be restarted by the user to attempt further analysis (step 2) with various numbers of alignments and/or parameter settings.


    Analysis of top alignments

Step 2 - Graph-based clustering of M top-alignments to assemble the repeat sets.
    Submitting top alignments

    After completion of step 1, the user can commence the graph-based clustering of the alignments. In the REPRO method, graphs are used to represent different repeat sets. The graph nodes correspond to individual repeats, represented by their N-terminal residue, while the edges represent matches between these residues, labeled with the top alignment providing the match.
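
    The graph representation described above can be sketched as follows. This is a simplified illustration and not the REPRO clustering protocol: it only groups N-terminal positions into connected components and ignores the parameters of the actual method. The input format, a list of (start_a, start_b, alignment_id) tuples, and the function name repeat_sets are assumptions made for the example.

```python
from collections import defaultdict


def repeat_sets(matches):
    """Group repeat start positions into connected components.

    matches: iterable of (start_a, start_b, alignment_id), where the two
    start positions are matched N-terminal residues and alignment_id
    labels the top alignment providing the match (hypothetical format).
    """
    graph = defaultdict(dict)
    for start_a, start_b, alignment_id in matches:
        # Undirected edge, labeled with the alignment that supports it.
        graph[start_a][start_b] = alignment_id
        graph[start_b][start_a] = alignment_id

    seen = set()
    components = []
    for node in sorted(graph):
        if node in seen:
            continue
        # Depth-first traversal collects one candidate repeat set.
        stack = [node]
        seen.add(node)
        component = []
        while stack:
            current = stack.pop()
            component.append(current)
            for neighbour in graph[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


if __name__ == "__main__":
    print(repeat_sets([(1, 40, 1), (40, 80, 2), (10, 55, 3)]))
```

    In this sketch, positions 1, 40 and 80 end up in one set because alignments 1 and 2 share the node at position 40, while positions 10 and 55 form a separate set.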

    To activate step 2, the form in the results page from step 1 should be completed by specifying the number M of alignments (M ≤ N) or by selecting individual alignments. After clicking on the second 'Run Repro' button to assemble the repeat sets, the results will appear almost instantaneously on the screen, as the clustering protocol is very fast. The following parameters are preset to their optimum values, but can be altered for complex repeat situations:

    1. MGC - Minimal graph connection
    2. MRFL - Minimum repeating fragment length
    3. VSSV - Virtual fragment start site variation
    4. Filter - Filter
    5. DFE - Determination of fragment ends
    6. VNSS - Virtual nodes start sites


    Analysis of results

    Presentation of the final results is split into four categories:

    Repeats

    Pair-wise top alignments: as described for step 1 above.

    Stacked alignments

    Multiple alignment through stacking of pair-wise top alignments: This allows the user to assess repetitive patterns by eye.

    Evaluation

    Repeat sets with evaluation statistics: three types of graph set are given according to the phase of the REPRO graph-clustering procedure (Heringa and Argos, 1993): (a) initial repeat graphs, which are based on consistent matching of the top alignment N-termini; (b) non-decomposed repeat graphs, which give the complete set of fragments for each repeat type recognised; and (c) decomposed repeat sub-graphs, which are the graphs given in (b) split according to single graph connecting nodes (Heringa and Argos, 1993).

    Fragments

    As a final output category, the repeat fragments corresponding to the decomposed graphs are given in PIR format. Each output category can be conveniently reached by following the navigation bars on the screen. Evaluation of the three types of graphs for each repeat set is facilitated by navigation click buttons, which interlink the initial graphs, non-decomposed graphs and sub-graphs for each repeat set. This enables the user to easily verify the significance and consistency of individual repeats, so that manual adjustments can be made to the repeat sets produced by REPRO. The final output can be downloaded or printed in a simple text format by following the link at the top of the page. The user can thus subject the final repeat sets (iv) in PIR format to further analysis, for example by constructing a multiple alignment with PRALINE (Heringa, 1999).
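
    For readers who want to post-process or regenerate the fragment output, the following minimal sketch writes a fragment in PIR format: a header line starting with ">P1;" and an identifier, a description line, and the sequence terminated by "*". The function name to_pir, the identifier and the line width of 60 are assumptions for illustration; the exact identifiers and descriptions that REPRO emits are not specified here.

```python
def to_pir(identifier, sequence, description="", width=60):
    """Format one protein fragment as a PIR entry (illustrative sketch)."""
    # Wrap the sequence into fixed-width lines; keep one empty line
    # for an empty sequence so the terminator has a place to go.
    body = [sequence[i:i + width] for i in range(0, len(sequence), width)] or [""]
    body[-1] += "*"  # PIR sequences end with an asterisk
    return "\n".join([f">P1;{identifier}", description] + body)


if __name__ == "__main__":
    print(to_pir("repeat_1", "ACDEFGHIKLMNPQRSTVWY", "hypothetical repeat fragment"))
```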



    Further Information

    Questions and comments regarding the interface to REPRO should be sent to Richard George, and questions regarding the REPRO program itself should be sent to Jaap Heringa.

    Contact Info