Fold recognition with minimal gaps

Citation:

Chen, W., Mirny, L. & Shakhnovich, E.I. Fold recognition with minimal gaps. Proteins: Structure, Function, and Bioinformatics 51(4), 531–543 (2003).

Date Published:

2003

Abstract:

Here we present a simplified form of threading that uses only a 20 × 20 two-body residue-based potential and a restricted number of gaps. Despite its simplicity and transparency, the Monte Carlo-based threading algorithm performs very well in a rigorous test of fold recognition. The results suggest that by simplifying and constraining the decoy space, one can achieve better fold recognition. Fold recognition results are compared with and supplemented by a PSI-BLAST search. The statistical significance of threading results is rigorously evaluated from statistics of extremes by comparison with optimal alignments of a large set of randomly shuffled sequences. The statistical theory, based on the Random Energy Model, yields a cumulative statistical parameter, ϵ, that attests to the likelihood of correct fold recognition. A large ϵ indicates a significant energy gap between the optimal alignment and decoy alignments and, consequently, a high probability that the fold is correctly recognized. For a particular number of gaps, the ϵ parameter reaches its maximal value, and the fold is recognized. As the number of gaps further increases, the likelihood of correct fold recognition drops off. This is because the decoy space is small when gaps are restricted to a small number, but the native alignment is still well approximated, whereas an unrestricted increase in the number of gaps leads to rapid growth of the number of decoys and their statistical dominance over the correct alignment. It is shown that the best results are obtained when a combination of one-, two-, and three-gap threading is used. To this end, use of the ϵ parameter is crucial for rigorous comparison of results across the decoy spaces that correspond to different numbers of gaps. Proteins 2003;51:531–543. © 2003 Wiley-Liss, Inc.
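The idea of scoring a threading against shuffled-sequence decoys can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the 20 × 20 table holds made-up random energies rather than a real potential, the alignment is gapless and fixed rather than optimized by Monte Carlo, the contact list is hypothetical, and the score is an ordinary Z-score standing in for the ϵ parameter, whose definition the abstract does not give.

```python
import random
import statistics

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def make_toy_potential(seed=0):
    """Build a symmetric 20 x 20 table of made-up contact energies.

    Placeholder values only; a real study would load a published potential.
    """
    rng = random.Random(seed)
    potential = {}
    for i, a in enumerate(AMINO_ACIDS):
        for b in AMINO_ACIDS[i:]:
            energy = rng.uniform(-1.0, 1.0)
            potential[(a, b)] = energy
            potential[(b, a)] = energy
    return potential


def threading_energy(sequence, contacts, potential):
    """Sum two-body energies over structural contacts for a gapless alignment.

    `contacts` is a list of (i, j) position pairs that touch in the template
    structure; residue k of the sequence is placed at position k.
    """
    return sum(potential[(sequence[i], sequence[j])] for i, j in contacts)


def shuffled_energies(sequence, contacts, potential, n_shuffles=200, seed=1):
    """Energies of randomly shuffled copies of the sequence on the same template.

    Shuffling preserves amino acid composition. Unlike the paper, no alignment
    optimization is performed for each shuffled sequence.
    """
    rng = random.Random(seed)
    residues = list(sequence)
    energies = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)
        energies.append(threading_energy(residues, contacts, potential))
    return energies


def z_score(native_energy, decoy_energies):
    """Standard Z-score of the native energy against the decoy distribution.

    A strongly negative value means the native energy lies well below the
    decoys. This is a stand-in, not the paper's parameter.
    """
    mean = statistics.mean(decoy_energies)
    std = statistics.pstdev(decoy_energies)
    if std == 0:
        raise ValueError("decoy energies have zero spread")
    return (native_energy - mean) / std
```

A typical use would compute `threading_energy` for the query sequence on a template, generate decoys with `shuffled_energies`, and pass both to `z_score`. With the random placeholder potential the result carries no biological meaning; the sketch only shows the shape of the comparison.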
