Some operations research methods for analyzing protein sequences and structures

Łukasiak, Piotr; Błażewicz, Jacek; Miłostan, Maciej

doi:10.1007/s10479-009-0652-y

Published: 05 November 2009

Volume 175, pages 9–35, (2010)

Annals of Operations Research

Piotr Łukasiak^1,2,
Jacek Błażewicz^1,2 &
Maciej Miłostan^1

Abstract

Operations research is probably one of the most successful fields of applied mathematics, used in economics, physics, chemistry, and almost everywhere else one has to analyze huge amounts of data. Recently, operations research techniques have been introduced in biology, especially in protein analysis, to support biologists. The fast growth of protein data makes operations research an important topic in bioinformatics, a science that lies on the border between computer science and biology. This paper gives a short overview of the operations research techniques currently used to support structural and functional analysis of proteins.



Author information

Authors and Affiliations

Institute of Computing Science, Poznan University of Technology, Piotrowo 2, 60-965, Poznan, Poland
Piotr Łukasiak, Jacek Błażewicz & Maciej Miłostan
Institute of Bioorganic Chemistry, Polish Academy of Sciences, ul. Z. Noskowskiego 12/14, 61-704, Poznan, Poland
Piotr Łukasiak & Jacek Błażewicz

Corresponding author

Correspondence to Piotr Łukasiak.

Additional information

Partially supported by KBN grant No 3T11F00227. This is an updated version of the paper that appeared in 4OR, 4, 2006, pp. 91–123.

About this article

Cite this article

Łukasiak, P., Błażewicz, J. & Miłostan, M. Some operations research methods for analyzing protein sequences and structures. Ann Oper Res 175, 9–35 (2010). https://doi.org/10.1007/s10479-009-0652-y

Issue Date: March 2010
