Abstract
The operations research is probably one of the most successful field of applied mathematics used in economics, physics, chemistry, almost everywhere where one has to analyze huge amounts of data. Lately, these techniques of operations research were introduced in biology, especially in the protein analysis area to support biologists. The fast growth of protein data makes operations research an important issue in bioinformatics, a science which lays on the border between computer science and biology. This paper gives a short overview of the operations research techniques currently used to support structural and functional analysis of proteins.
Similar content being viewed by others
References
Althaus, E., Kohlbacher, O., Lenhof, H.-P., & Muller, P. (2002). A combinatorial approach to protein docking with flexible side-chains. Journal of Computational Biology, 9(4), 597–612.
Altschul, S. F., Gish, W., Miller, W., Myers, E. W., & Lipman, D. J. (1990). Basic local alignment search tool. Journal of Molecular Biology, 215, 403–410.
Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W., & Lipman, D. J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research, 25, 3389–3402.
Andonov, R., Balev, S., & Yanev, N. (2004). Protein threading: from mathematical models to parallel implementations. INFORMS Journal on Computing, 16(4).
Andrade, M. A., & Valencia, A. (1997). Automatic annotation for biological sequences by extraction of keywords from MEDLINE abstracts. Development of a prototype system. In T. Gaasterland, P. Karp, K. Karplus, C. Ouzounis, & C. Sander et al. (Eds.), Fifth international conference on intelligent systems for molecular biology (pp. 25–32). Halkidiki: AAAI Press.
Andreeva, A., Howorth, D., Brenner, S. E., Hubbard, T. J. P., Chothia, C., & Murzin, A. G. (2004). SCOP database in 2004: refinements integrate structure and sequence family data. Nucleic Acid Research, 32, 226–229.
Anfinsen, C. B. (1973). Principles that govern the folding of protein chains. Science, 181, 223–230.
Anfinsen, C. B., Haber, E., Sela, M., & White, F. Jr. (1961). The kinetics of formation of native ribonuclease during oxidation of the reduced polypeptide chain. Proceedings of the National Academy of Sciences of the USA, 47(9), 1309–1314.
Apweiler, R., Attwood, T. K., Bairoch, A., Bateman, A., Birney, E., Bucher, P., Codani, J. J., Corpet, F., Croning, M. D. R., & Durbin, R. (2000). InterPro—an integrated documentation resource for protein families, domains and functional sites. Bioinformatics, 16, 1145–1150.
Arbib, M. (1995). The handbook of brain theory and neural networks. Cambridge: Bradford Books/The MIT Press.
Asai, K., Hayamizu, S., & Handa, K. (1993). Prediction of protein secondary structure by the hidden Markov model. Bioinformatics, 9, 141–146.
Attwood, T. K. (2000). The quest to deduce protein function from sequence: the role of pattern databases. International Journal of Biochemistry & Cell Biology, 32, 139–155.
Attwood, T. K., Croning, M. D., Flower, D. R., Lewis, A. P., Mabey, J. E., Scordis, P., Selley, J. N., & Wright, W. (2000). PRINTS-S: the database formerly known as prints. Nucleic Acid Research, 28, 225–227.
Bairoch, A., & Apweiler, R. (2000). The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Research, 28, 45–48.
Baldi, P., & Brunak, S. (1998). Bioinformatics: the machine learning approach. Cambridge: MIT Press.
Baldi, P., Brunak, S., Frasconi, P., Soda, G., & Pollastri, G. (1999). Exploiting the past and the future in protein secondary structure prediction. Bioinformatics, 15, 937–946.
Balev, S. (2004). Solving the protein threading problem by Lagrangian relaxation. In Proceedings of the annual workshop on algorithms in bioinformatics (WABI) (pp. 182–193). Berlin: Springer.
Barnes, E., Sokol, J. S., & Strickland, D. M. (2005). Optimal protein structure alignment using maximum cliques. Operations Research, 53, 389–402.
Bateman, A., Birney, E., Durbin, R., Eddy, S. R., Howe, K. L., & Sonnhammer, E. L. (2000). The Pfam protein families database. Nucleic Acids Research, 28, 263–266.
Baum, L. E., & Petrie, T. (1966). Statistical inference for probabilistic functions of finite state Markov chains. Annals of Mathematical Statistics, 37.
Benner, S. A., & Gerloff, D. (1990). Patterns of divergence in homologous proteins as indicators of secondary and tertiary structure of the catalytic domain of protein kinases. Advances in Enzyme Regulation, 31, 121–181.
Bertsekas, D. P. (1995). Dynamic programming and optimal control (Vols. 1, 2). Belmont: Athena Scientific.
Bertsekas, D. P., & Tsitsiklis, J. N. (1996). Neuro-dynamic programming. Belmont: Athena Scientific.
Blazewicz, J., Kasprzak, M., Sterna, M., & Węglarz, J. (1997). Selected combinatorial optimization problems arising in molecular biology. Ricerca Operativa, 26, 35–63.
Blazewicz, J., Hammer, P. L., & Lukasiak, P. (2004a). Logical analysis of data as a predictor of protein secondary structures. In N. Kolchanov & R. Hofestaedt (Eds.), Bioinformatics of genome regulations and structure, chapter Computational structural biology (pp. 145–154). Boston: Kluwer Academic Publisher.
Blazewicz, J., Dill, K. A., Lukasiak, P., & Milostan, M. (2004b). A Tabu search strategy for finding low energy structures of proteins in HP-model. Computational Methods in Science and Technology, 10, 7–19.
Blazewicz, J., Formanowicz, P., & Kasprzak, M. (2005a). Selected combinatorial problems of computational biology. European Journal of Operational Research, 161, 585–597.
Blazewicz, J., Hammer, P. L., & Lukasiak, P. (2005b). Predicting secondary structures of proteins. IEEE Engineering in Medicine and Biology, 24(3), 88–94.
Blazewicz, J., Lukasiak, P., & Milostan, M. (2005c). Application of tabu search strategy for finding low energy structure of protein. Artificial Intelligence in Medicine, 35(1–2), 135–145.
Blazewicz, J., Lukasiak, P., & Milostan, M. (2006). Some operations research methods for analyzing protein sequences and structures. 4OR: A Quarterly Journal of Operations Research, 4(2), 91–123.
Blom, N., Hansen, J., Blaas, D., & Brunak, S. (1996). Cleavage site analysis in picornaviral polyproteins: discovering cellular targets by neural networks. Protein Science, 5, 2203–2216.
Bohr, H., Bohr, J., Brunak, S., Cotterill, R. M., Lautrup, B., Norskov, L., Olsen, O. H., & Petersen, S. B. (1988). Protein secondary structure and homology by neural networks. The alpha-helices in rhodopsin. FEBS Letters, 241, 223–228.
Bowie, J. U., Luthy, R., & Eisenberg, D. (1991). A method to identify protein sequences that fold into a known three-dimensional structure. Science, 253, 164–170.
Branden, C., & Tooze, J. (1999). Introduction to protein structure (2nd edn., pp. 89–120). New York: Garland Science Publishing.
Brunak, S. (1991). Non-linearities in training sets identified by inspecting the order in which neural networks learn. In O. Benhar, C. Bosio, P. Del Giudice, & E. Tabet (Eds.), Neural networks from biology to high energy physics (pp. 277–288). Elba, Italy.
Bryant, S. H., & Altschul, S. F. (1995). Statistics of sequence-structure threading. Biology Current Opinions with Evaluated MEDLINE, 5, 236–244.
Bystroff, C., & Baker, D. (1998). Prediction of local structure in proteins using a library of sequence-structure motifs. Journal of Molecular Biology, 281, 565–577.
Bystroff, C., Thorsson, V., & Baker, D. (2000). HMMSTR: A hidden Markov model for local sequence-structure correlations in proteins. Journal of Molecular Biology, 301, 173–190.
Caprara, A., & Lancia, G. (2002). Structural alignment of large-size proteins via Lagrangian relaxation. In Proceedings of the annual international conference on computational molecular biology (RECOMB) (pp. 100–108). New York: ACM Press.
Caprara, A., Carr, B., Istrail, S., Lancia, G., & Walenz, B. (2004). 1001 optimal pdb structure alignments: Integer programming methods for finding the maximum contact map overlap. Journal of Computational Biology, 11(1), 27–52.
Carr, R. D., & Lancia, G. (2004). Compact optimization can outperform separation: a case study in structural proteomics. 4OR, 2(3), 221–233.
Chazelle, B., Kingsford, C., & Singh, M. (2003). The side-chain positioning problem: a semidefinite programming formulation with new rounding schemes. In PCK50—principles of computing & knowledge, Paris C Kanellakis memorial workshop (pp. 86–94). New York: ACM Press.
Chazelle, B., Kingsford, C., & Singh, M. (2004). A semidefinite programming approach to side chain positioning with new rounding strategies. INFORMS Journal on Computing, 16(4).
Corpet, F., Servant, F., Gouzy, J., & Kahn, D. (2000). ProDom and ProDom-CG: tools for protein domain analysis and whole genome comparisons. Nucleic Acids Research, 28, 267–269.
Cuff, J. A., & Barton, G. J. (1999). Evaluation and improvement of multiple sequence methods for protein secondary structure prediction. Proteins, 34, 508–519.
Dickerson, R. E., Timkovich, R., & Almassy, R. J. (1976). The cytochrome fold and the evolution of bacterial energy metabolism. Journal of Molecular Biology, 100, 473–491.
Doye, J. P. K., Leary, R. H., Locatelli, M., & Schoen, F. (2004). Global optimization of morse clusters by potential energy transformations. INFORMS Journal on Computing, 16(4).
Durbin, R., Eddy, S., Krogh, A., & Mitchison, G. (1998). Biological sequence analysis. Cambridge: Cambridge University Press.
Eddy, S. R. (1998). Profile hidden Markov models. Bioinformatics, 14, 755–763.
Edler, L., Grassmann, J., & Suhai, S. (2001). Role and results of statistical methods in protein fold class prediction. Mathematical and Computer Modelling, 33, 1401–1417.
Efimov, A. V. (1997). Structural trees for protein superfamilies. Proteins, 28, 241–260.
Eriksson, O., Zhou, Y., & Elofsson, A. (2001). Side chain-positioning as an integer programming problem. In O. Gascuel & B. M. E. Moret (Eds.), Lecture notes in computer science : Vol. 2149. Proceedings of annual workshop on algorithms in bioinformatics (WABI) (pp. 128–141). Berlin: Springer.
Eskow, E., Bader, B., Byrd, R., Crivelli, S., Head-Gordon, T., Lamberti, V., & Schnabel, R. (2004). An optimization approach to the problem of protein structure prediction. Mathematical Programming, 101(3), 497–514.
Eyrich, V. A., Standley, D. M., & Friesner, R. A. (1999). Prediction of protein tertiary structure to low resolution: performance for a large and structurally diverse test set. Journal of Molecular Biology, 288(4), 725–742.
Ferrán, E. A., & Pflugfelder, B. (1993). A hybrid method to cluster protein sequences based on statistics and artificial neural networks. Computer Applications in the Biosciences, 9, 671–680.
Fiesler, E., & Beale, R. (1996). Handbook of neural computation. New York: Oxford Univ. Press.
Finkelstein, A. V., & Ptitsyn, O. B. (1987). Why do globular proteins fit the limited set of folding patterns? Progress in Biophysics and Molecular Biology, 50, 171–190.
Frampton, J., Leutz, A., Gibson, T. J., & Graf, T. (1989). DNA-binding domain ancestry. Nature, 342, 134.
Frishman, D., & Argos, P. (1992). Recognition of distantly related protein sequences using conserved motifs and neural networks. Journal of Molecular Biology, 228, 951–962.
Godzik, A., Skolnick, J., & Kolinski, A. (1992). Topology fingerprint approach to the inverse protein folding problem. Journal of Molecular Biology, 227, 227–238.
Gough, J., Karplus, K., Hughey, R., & Chothia, C. (2001). Assignment of homology to genome sequences using a library of hidden Markov models that represent all proteins of known structure. Journal of Molecular Biology, 313, 903–919.
Greenberg, H., Hart, W., & Lancia, G. (2004). Opportunities for combinatorial optimization in computational biology. INFORMS Journal on Computing, 16(3), 1–22.
Gribskov, M., McLachlan, A. D., & Eisenberg, D. (1987). Profile analysis: detection of distantly related proteins. Proceedings of the National Academy of Sciences of the USA, 84, 4355–4358.
Hadley, C., & Jones, D. T. (1999). A systematic comparison of protein structure classifications: SCOP, CATH and FSSP. Structure, 7, 1099–1112.
Han, K. F., & Baker, D. (1996). Global properties of the mapping between local amino acid sequence and local structure in proteins. Proceedings of the National Academy of Sciences of the USA, 93, 5814–5818.
Hansen, J. E., Lund, O., Tolstrup, N., Gooley, A. A., Williams, K. L., & Brunak, S. (1998). NetOglyc: Prediction of mucin type O-glycosylation sites based on sequence context and surface accessibility. Glycoconjugate Journal, 15, 115–130.
Haykin, S. (1999). Neural networks (2nd edn.). New York: Prentice Hall.
Henikoff, J. G., Greene, E. A., Pietrokovski, S., & Henikoff, S. (2000). Increased coverage of protein families with the blocks database servers. Nucleic Acids Research, 28, 228–230.
Hirst, J. D., & Sternberg, M. J. E. (1991). Prediction of ATP-binding motifs a comparison of a perceptron-type neural network and a consensus sequence method. Protein Engineering, 4, 615–623.
Hirst, J. D., & Sternberg, M. J. E. (1992). Prediction of structural and functional features of protein and nucleic acid sequences by artificial neural networks. Biochemistry, 31, 615–623.
Hofmann, K., Bucher, P., Falquet, L., & Bairoch, A. (1999). The PROSITE database, its status in 1999. Nucleic Acids Research, 27, 215–219.
Holley, H., & Karplus, M. (1989). Protein secondary structure prediction with a neural network. Proceedings of the National Academy of Sciences of the USA, 86, 152–156.
Holm, L., & Sander, C. (1993). Protein structures comparision by alignment of distance matrices. Journal of Molecular Biology, 233, 123–138.
Holm, L., & Sander, C. (1994). The FSSP database of structurally aligned protein fold families. Nucleic Acids Research, 22, 3600–3609.
Holm, L., & Sander, C. (1997). Dali/FSSP classification of three-dimensional protein folds. Nucleic Acids Research, 25, 231–234.
Hua, S., & Sun, Z. (2001). A novel method of protein secondary structure prediction with high segment overlap measure: support vector machine approach. Journal of Molecular Biology, 308, 397–407.
Jagla, B., & Schuchhardt, J. (2000). Adaptive encoding neural networks for the recognition of human signal peptide cleavage sites. Bioinformatics, 16, 245–250.
Johnson, S. C. (1967). Hierarchical clustering schemes. Psychometrika, 32, 241–254.
Johnson, S. C. (1985). This week’s citation classic. Current Contents, 5, 16.
Jones, D. T. (1999a). Protein secondary structure prediction based on position-specific scoring matrices. Journal of Molecular Biology, 292, 195–202.
Jones, D. T. (1999b). GenTHREADER: an efficient and reliable protein fold recognition method for genomic sequences. Journal of Molecular Biology, 287, 797–815.
Jones, D. T., Taylor, W. R., & Thornton, J. M. (1992). A new approach to protein fold recognition. Nature, 358, 86–89.
Karplus, K., Barrett, C., Cline, M., Diekhans, M., Grante, L., & Hughey, R. (1999). Predicting protein structure using only sequence information. Proteins, 3, 121–125.
Kelley, L. A., MacCallum, R. M., & Sternberg, M. J. E. (2000). Enhanced genome annotation using structural profiles in the program 3D-PSSM. Journal of Molecular Biology, 299, 499–520.
Kim, D., Xu, D., Guo, J., Ellrott, K., & Xu, Y. (2003). PROSPECT II: protein structure prediction program for genome-scale applications. Protein Engineering, 16(9), 641–650.
Kingsford, C., Chazelle, B., & Singh, M. (2005). Solving and analyzing side-chain positioning problems using linear and integer programming. Bioinformatics, 21(7), 1028–1039.
Kneller, D., Cohen, F., & Langridge, R. (1990). Improvements in protein secondary structure prediction by an enhanced neural network. Journal of Molecular Biology, 214, 171–182.
Koh, S. H., Ananthasurehs, G. K., & Croke, C. (2004). Design of reduced protein models by energy minimization using mathematical programming. In 10th AIAA/ISSMO multidisciplinary analysis and optimization conference (pp. 1–10).
Kohonen, T. (1982). Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43, 59–69.
Kolinski, A., & Bujnicki, J. M. (2004). Combination of fold-recognition with De Novo Folding and evaluation of models. http://www.forcasp.org/upload/2165.6.pdf.
Kolinski, A., & Skolnick, J. (2004). Reduced models of proteins and their applications. Polymer, 45, 511–524.
Kriventseva, E. V., Biswas, M., & Apweiler, R. (2001). Clustering and analysis of protein families. Current Opinion in Structural Biology, 11, 334–339.
Ladunga, I., Czakó, F., Csabai, I., & Geszti, T. (1991). Improving signal peptide prediction accuracy by simulated neural network. Computer Applications in the Biosciences, 7, 485–487.
Lancia, G., Carr, R., Walenz, B., & Istrail, S. (2001). 101 optimal PDB structure alignments: a branch-and-cut algorithm for the maximum contact map overlap problem. In Proceedings of the annual international conference on computational biology (RECOMB) (pp. 193–202). New York: ACM Press.
Lathrop, R. H. (1994). The protein threading problem with sequence amino acid interaction preferences is NP-complete. Protein Engineering, 7, 1059–1068.
Lee, Y. (2005). Hidden Markov models with states depending on observations. Pattern Recognition Letters, 26, 977–984.
Lesk, A. M. (2001). Introduction to protein architecture. London: Oxford University Press.
Levinthal, C. (1968). Are there pathways to protein folding? Journal of Chemical Physics, 65, 44–45.
Li, W., Jaroszewski, L., & Godzik, A. (2002). Tolerating some redundancy significantly speeds up clustering of large protein databases. Bioinformatics, 18, 77–82.
Lindahl, E., & Elofsson, A. (2000). Identification of related proteins on family, superfamily and fold level. Journal of Molecular Biology, 295, 613–625.
Lipman, D. J., & Pearson, W. R. (1985). Rapid and sensitive protein similarity searches. Science, 227, 1435–1441.
Liu, J., & Rost, B. (2003). Domains, motifs and clusters in protein universe. Current Opinion in Chemical Biology, 7, 5–11.
Lukasiak, P. (2004). Algorithmic aspects of protein secondary structure prediction. PhD Thesis, Poznan University of Technology.
Ma, Q., Chirn, G.-W., Cai, R., Szustakowski, J., & Nirmala, N. R. (2005). Clustering protein sequences with a novel metric transformed from sequence similarity scores and sequence alignments with neural networks. Bioinformatics, 6, 242.
Markowetz, F., Edler, L., & Vingron, M. (2003). Support vector machines for protein fold class prediction. Biometrical Journal, 45(3), 377–389.
Mewes, H. W., Frishman, D., Gruber, C., Geier, B., Haase, D., Kaps, A., Lemcke, K., Mannhaupt, G., Pfeiffer, F., & Schuller, C. (2000). MIPS: a database for genomes and protein sequences. Nucleic Acids Research, 28, 37–40.
Mizuguchi, K., Deane, C. M., Blundell, T. L., & Overington, J. P. (1998). HOMSTRAD: a database of protein structure alignments for homologous families. Protein Science, 7, 2469–2471.
Mohseni-Zadeh, S., Brzellec, P., & Risler, J.-L. (2004). Cluster-C, an algorithm for the large-scale clustering of protein sequences based on the extraction of maximal cliques. Computational Biology and Chemistry, 28(3), 211–218.
Murvai, J., Vlahovicek, K., Barta, E., Cataletto, B., & Pongor, S. (2000). The SBASE protein domain library, release 7.0: a collection of annotated protein sequence segments. Nucleic Acids Research, 28, 260–262.
Murzin, A. G., Brenner, S. E., Hubbard, T., & Chothia, C. (1995). SCOP: a structural classification of proteins database for the investigation of sequences and structures. Journal of Molecular Biology, 247, 536–540.
Nanias, M., Chinchio, M., Ołdziej, S., Czaplewski, C., & Scheraga, H. A. (2005). Protein structure prediction with the UNRES force-field using replica-exchange Monte Carlo-with-minimization; comparison with MCM, CSA and CFMC. Journal of Computational Chemistry, 26, 1472–1486.
Needleman, S., & Wunsch, C. (1970). A general method applicable to the search for similarities in the amino acid sequences of two proteins. Journal of Molecular Biology, 48, 443–453.
Nielsen, H., Engelbrecht, J., Brunak, S., & von Heijne, G. (1997). Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Engineering, 10, 1–6.
Niermann, T., & Kirschner, K. (1990). Improving the prediction of secondary structure of ‘TIM-barrel’ enzymes. Protein Engineering, 4, 137–147.
Orengo, C. A., Michie, A. D., Jones, S., Jones, D. T., Swindells, M. B., & Thornton, J. M. (1997). CATH-a hierarchic classification of protein domain structures. Structure, 5, 1093–1108.
Ouali, M., & King, R. D. (2000). Cascaded multiple classifiers for secondary structure prediction. Protein Science, 9, 1162–1176.
Panchenko, A. R., Marchler-Bauer, A., & Bryant, S. H. (2000). In Quantitative challenges in the post-genome sequence era: a workshop and symposium. The La Jolla interfaces in science, La Jolla, CA (Vol. 2).
Papoulis, A. (1984). Brownian movement and Markov processes, Chap. 15. In Probability, random variables, and stochastic processes (2nd edn., pp. 515–553). New York: McGraw-Hill.
Pearl, F., Todd, A., Sillitoe, I., Dibley, M., Redfern, O., Lewis, T., Bennett, C., Marsden, R., Grant, A., Lee, D., Akpor, A., Maibaum, M., Harrison, A., Dallman, T., Reeves, G., Diboun, I., Addou, S., Lise, S., Johnston, C., Sillero, A., Thornton, J., & Orengo, C. (2005). The CATH Domain Structure Database and related resources Gene3D and DHS provide comprehensive domain family information for genome analysis. Nucleic Acids Research, 33, D247–D251.
Pearson, W. R., & Lipman, D. J. (1988). Improved tools for biological sequence comparison. Proceedings of National Academy Sciences of the USA, 85, 2444–2448.
Pevzner, P. A. (2001). Computational molecular biology an algorithmic approach. Cambridge: MIT Press.
Pollastri, G., & Baldi, P. (2002). Prediction of contact maps by GIOHMMs and recurrent neural networks using lateral propagation from all four cardinal corners. Bioinformatics, 18(1), S62–S70.
Pollastri, G., Przybylski, D., Rost, B., & Baldi, P. (2002). Improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and profiles. Proteins, 47, 228–235.
Przybylski, D., & Rost, B. (2002). Alignments grow, secondary structure prediction improves. Proteins, 46, 197–205.
Ptitsyn, O. B., & Finkelstein, A. V. (1980). Similarities of protein topologies: evolutionary divergence, functional convergence or principles of folding? Quarterly Reviews of Biophysics, 13, 339–386.
Qian, N., & Sejnowski, T. (1988). Predicting the secondary structure of globular proteins using neural network models. Journal of Molecular Biology, 202, 865–884.
Rabiner, L. R. (1989). A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2), 257–286.
Riis, S. K., & Krogh, A. (1996). Improving prediction of protein secondary structure using structured neural networks and multiple sequence alignments. Journal of Computation Biology, 3, 163–183.
Rost, B., & Sander, C. (1993a). Improved prediction of protein secondary structure by use of sequence profiles and neural networks. Proceedings of the National Academy of Sciences of the USA, 90, 7558–7562.
Rost, B., & Sander, C. (1993b). Prediction of protein secondary structure at better than 70% accuracy. Journal of Molecular Biology, 232, 584–599.
Rost, B., Sander, C., & Schneider, R. (1994). PHD—an automatic server for protein secondary structure prediction. Computer Applications in the Biosciences, 10, 53–60.
Rumelhart, D. E., & McClelland, J. L. (1986). Parallel distributed processing. Explorations in the microstructure of cognition. Cambridge: MIT Press.
Rychlewski, L., Jaroszewski, L., Li, W., & Godzik, A. (2000). Comparison of sequence profiles. Strategies for structural predictions using sequence information. Protein Science, 9, 232–241.
Schneider, G., & Wrede, P. (1993). Development of artificial neural filters for pattern recognition in protein sequences. Journal of Molecular Evolution, 36, 586–595.
Setubal, J., & Meidanis, J. (1997). Introduction to computational biology. Boston: PWS Publishing.
Shi, J., Blundell, T. L., & Mizuguchi, K. (2001). FUGUE: sequence-structure homology recognition using environment-specific substitution tables and structure-dependent gap penalties. Journal of Molecular Biology, 310, 243–257.
Smith, T. F., & Waterman, M. S. (1981). Identification of common molecular subsequences. Journal of Molecular Biology, 147, 195–197.
Sonnhammer, E. L., Eddy, S. R., Birney, E., Bateman, A., & Durbin, R. (1998). Pfam: Multiple sequence alignments and HMM-profiles of protein domains. Nucleic Acids Research, 26, 320–322.
Taylor, W. R. (2000). Searching for the ideal forms of proteins. Biochemical Society Transactions, 28, 264–269.
Taylor, W. R. (2002a). In B. Mewes & H. S. Weiss (Eds.), Bioinformatics and genome analysis. Ernst Schering research foundation workshop (Vol. 38, pp. 133–148). Berlin: Springer.
Taylor, W. R. (2002b). A ‘periodic table’ for protein structures. Nature, 416, 657–660.
Tendulkar, A. V., Wangikar, P. P., Sohoni, M. A., Samant, V. V., & Mone, Ch. Y. (2003). Parameterization and classification of the protein universe via geometric techniques. Journal of Molecular Biology, 334(1), 157–172.
Tolstrup, N., Toftgård, J., Engelbrecht, J., & Brunak, S. (1994). Neural network model of the genetic code is strongly correlated to the GES scale of amino acid transfer free energies. Journal of Molecular Biology, 243, 816–820.
Tsigelny, I., Sharikov, Y., & Ten Eyck, L. F. (2002). Hidden Markov models-based system (HMMSPECTR) for detecting structural homologies on the basis of sequential information. Protein Engineering, 15(5), 347–352.
Veber, P., Yanev, N., Andonov, R., & Poirriez, V. (2005). Optimal protein threading by cost-splitting. In Proceedings of the annual workshop on algorithms in bioinformatics (WABI) (pp. 365–375). Berlin: Springer.
Wagner, M., Meller, J., & Elber, R. (2004). Large-scale linear programming techniques for the design of protein folding potentials. Mathematical Programming, 101(2), 301–318.
Waterman, M. S. (1995). Introduction to computational biology. London: Chapman and Hall.
Wilbur, W. J., & Lipman, D. J. (1983). Rapid similarity searches of nucleic acid and protein data banks. Proceedings of the National Academy of Sciences of the USA, 80, 726–730.
Wu, C. H., Zhao, S., Chen, H.-L., Lo, C.-J., & McLarty, J. (1996). Motif identification neural design for rapid and sensitive protein family search. Computer Applications in the Biosciences, 12, 109–118.
Xu, J. (2003). Speedup LP approach to protein threading via graph reduction. In Proceedings of the annual workshop on algorithms in bioinformatics (WABI) (pp. 374–388). Berlin: Springer.
Xu, J., & Li, M. (2003). Assessment of RAPTOR’s linear programming approach in CAFASP3. Proteins: Structure, Function, and Genetics, 53(6), 579–584.
Xu, J., Li, M., Kim, D., & Xu, Y. (2003). RAPTOR: Optimal protein threading by linear programming. Journal of Bioinformatics and Computational Biology, 1(1), 95–117.
Xu, J., Li, M., & Xu, Y. (2004). Protein threading by linear programming, Theoretical analysis and computational results. Journal of Combinatorial Optimization, 8(4), 403–418.
Yona, G., & Levitt, M. (2002). Within the twilight zone: a sensitive profile-profile comparison tool based on information theory. Journal of Molecular Biology, 315, 1257–1275.
Yuan, X., Hou, Y., Huang, Y., Shao, Y., & Bystroff, Ch. (2004). Contact map prediction using HMMSTR. http://www.bioinfo.rpi.edu/bystrc/pub/casp6abstract.pdf.
Zhang, Y., & Skolnick, J. (2004). SPICKER: a clustering approach to identify near-native protein folds. Journal of Computational Chemistry, 25, 865–871.
Author information
Authors and Affiliations
Corresponding author
Additional information
Partially supported by KBN grant No 3T11F00227. This is an updated version of the paper that appeared in 4OR, 4, 2006, pp. 91–123.
Rights and permissions
About this article
Cite this article
Łukasiak, P., Błażewicz, J. & Miłostan, M. Some operations research methods for analyzing protein sequences and structures. Ann Oper Res 175, 9–35 (2010). https://doi.org/10.1007/s10479-009-0652-y
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10479-009-0652-y