Some operations research methods for analyzing protein sequences and structures

Annals of Operations Research

Abstract

Operations research is probably one of the most successful fields of applied mathematics, used in economics, physics, chemistry, and almost everywhere else that huge amounts of data must be analyzed. Recently, operations research techniques have been introduced in biology, especially in protein analysis, to support biologists. The rapid growth of protein data makes operations research important in bioinformatics, a science that lies on the border between computer science and biology. This paper gives a short overview of the operations research techniques currently used to support the structural and functional analysis of proteins.

Author information

Corresponding author

Correspondence to Piotr Łukasiak.

Additional information

Partially supported by KBN grant No 3T11F00227. This is an updated version of the paper that appeared in 4OR, 4, 2006, pp. 91–123.

About this article

Cite this article

Łukasiak, P., Błażewicz, J. & Miłostan, M. Some operations research methods for analyzing protein sequences and structures. Ann Oper Res 175, 9–35 (2010). https://doi.org/10.1007/s10479-009-0652-y

  • DOI: https://doi.org/10.1007/s10479-009-0652-y