QSAR Challenges and Opportunities: A Commentary

Eugene A. Coats

Introduction

When I began my scientific career with a year in the laboratories of Corwin Hansch, the world of quantitative structure-activity relationships (QSAR) seemed to be very clearly defined. QSAR referred to statistical analysis of potential relationships between chemical structure and biological activity. Chemical structure was characterized by experimentally measured substituent constants or other physicochemical properties, and biological activity was usually a concentration giving a predefined effect. Since then, virtually all forms of molecular modeling and computer-assisted drug (or agrochemical) design have been included under the heading of QSAR at one time or another. QSAR has often become QSSR, QSPR, or QSTR in efforts to reflect the more specific nature of various types of quantitative relationships. This difficulty in defining the term QSAR may be frustrating to some, but I prefer to view it as an indication of the explosion of techniques, procedures, and ideas, all attempting in some fashion to summarize chemical and biological information in a form that allows one to generate and test hypotheses to facilitate an understanding of interactions between molecules. The variety of papers included in this issue of Network Science illustrates my point. In many applications, from studies of receptor binding and docking to the development of new types of potential energy fields for comparative molecular field analyses to the elucidation of potential pharmacophoric patterns, QSAR has become truly three dimensional. The spectrum of computer-assisted methods available is both amazing and exciting.

It is not my intent in these brief comments to review QSAR as it was or as it is. Rather, I have arbitrarily selected three areas posing some new and some old challenges. Combinatorial library design and high throughput screening present unique opportunities for the application of QSAR principles in information management and analysis. The ever-expanding ability to compute hundreds or even thousands of descriptors to characterize chemical structure, coupled with statistical ways to correlate these with receptor interaction, re-emphasizes an old question: is it possible or even desirable to attempt to relate chemical descriptors to physicochemical properties? And finally, a continuing challenge to QSAR is highlighted: the correlation of pharmacokinetic properties.

Lead Discovery and Lead Optimization

When it has been possible to elucidate quantitative relationships between chemical structure and biological activity, these relationships have proven useful in describing possible mechanisms of interaction. However, it has always been tempting to use the quantitative relationships in another fashion: to predict new structures with better properties than those used to formulate the original QSAR. This, of course, is driven by the desire to devise ever more potent, selective pharmaceuticals and agrochemicals, but it unfortunately contradicts one of the basic tenets of statistical analysis: that the results of an analysis only serve to characterize trends in properties within the bounds of the learning set of data. Advances in technology may have offered a solution to this dilemma while also posing new challenges.
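The restriction to the bounds of the learning set can be made concrete with a simple range check. What follows is only a minimal sketch, not a procedure taken from any of the work cited here: the function names and the descriptor values are hypothetical, and a per-descriptor minimum/maximum box is assumed to be an acceptable (if crude) description of the learning set.

```python
def descriptor_ranges(training):
    """Return a (min, max) pair for each descriptor column of the learning set."""
    columns = list(zip(*training))
    return [(min(col), max(col)) for col in columns]


def outside_training_range(ranges, compound):
    """Return the indices of descriptors on which `compound` lies outside the learning set."""
    return [
        j
        for j, (value, (low, high)) in enumerate(zip(compound, ranges))
        if value < low or value > high
    ]


# Hypothetical learning set: each row is one compound, each column one descriptor.
training_set = [
    [0.5, -0.2],
    [1.5, 0.3],
    [2.5, 0.1],
]
ranges = descriptor_ranges(training_set)

# A candidate whose first descriptor exceeds anything in the learning set:
# a prediction for it would be an extrapolation.
flagged = outside_training_range(ranges, [3.0, 0.0])
```

An empty list from `outside_training_range` means only that the candidate sits inside the box spanned by the learning set; it does not by itself guarantee a reliable prediction.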

The development of automated synthesis capability along with the formulation of the combinatorial chemistry approach has enabled the rapid synthesis of large numbers of molecules. This huge increase in synthetic capacity has been accompanied by the automation of in vitro bioassays affording high throughput screening systems capable of generating massive amounts of data in a relatively short period of time.

If one has this ability to assay large numbers of compounds, it would seem to be the ideal way to identify new lead structures, provided that the molecules chosen for screening represent an evenly distributed cross section of all potentially bioactive structures. This requirement is, of course, the basis for the concept of molecular diversity. An obvious approach is to characterize structures in terms of their physicochemical properties, their substructures (molecular fingerprints), and any other possibly relevant and calculable features. Then one needs only to require that the ranges and occurrences of these various features adequately describe the universe of molecules of interest. In other words, the design concepts first demonstrated by classical QSAR are quite applicable here. While the approach is obvious, the practice is not. Just what properties and property ranges are sufficient to ensure a diverse set of molecules? Must these properties be 3D as well as 2D, or are 2D features sufficient [1]? One approach which appears to have met with some degree of success involves computing molecular descriptors selected from classes including chemical functionality, receptor recognition, shape, physicochemical characteristics, and topological parameters. This collection of descriptors is then subjected to a D-optimal design to select structures, or structural building blocks, to be included in creating a combinatorial library for biological evaluation [2, 3]. An alternative approach makes exclusive use of the molecular interaction properties of molecules. That is, a so-called affinity fingerprint for each molecule in a library is determined by measuring the binding of the molecules to a diverse set of proteins [4, 5]. Thus, the molecules in a library are described not by their purely structural features but rather by their ability to interact with the panel of proteins.

Efforts to adequately describe molecular diversity and thereby to create libraries for high throughput screening continue, yet concepts such as those mentioned above demonstrate that principles learned from years of QSAR in its various forms are indeed applicable here as well.
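To give a feel for what a D-optimal selection does with a table of descriptors, here is a minimal sketch. It is a greedy approximation, not the procedure used in the cited work [2, 3]: it adds one candidate at a time, each time taking the row that most increases the determinant of the information matrix X'X. The function name, the small ridge term, and the toy descriptor matrix are all assumptions made for illustration.

```python
def greedy_d_optimal(X, k, ridge=1e-6):
    """Greedily pick k row indices of X that increase det(X_S' X_S + ridge*I) the most.

    Uses det(M + x x') = det(M) * (1 + x' M^-1 x), so each step picks the row
    with the largest x' M^-1 x and then updates M^-1 by Sherman-Morrison.
    """
    n, p = len(X), len(X[0])
    # Inverse of the starting information matrix ridge * I.
    m_inv = [[(1.0 / ridge if i == j else 0.0) for j in range(p)] for i in range(p)]
    chosen = []
    for _ in range(k):
        best_i, best_gain, best_u = None, -1.0, None
        for i in range(n):
            if i in chosen:
                continue
            x = X[i]
            u = [sum(m_inv[r][c] * x[c] for c in range(p)) for r in range(p)]  # M^-1 x
            gain = sum(x[r] * u[r] for r in range(p))  # x' M^-1 x
            if gain > best_gain:
                best_i, best_gain, best_u = i, gain, u
        chosen.append(best_i)
        # Sherman-Morrison update of M^-1 after adding the chosen row to the design.
        denom = 1.0 + best_gain
        for r in range(p):
            for c in range(p):
                m_inv[r][c] -= best_u[r] * best_u[c] / denom
    return chosen


# Toy candidate pool (hypothetical values): a constant term plus one scaled descriptor.
candidates = [
    [1.0, 0.0],
    [1.0, 0.4],
    [1.0, 0.5],
    [1.0, 0.6],
    [1.0, 1.0],
]
picked = greedy_d_optimal(candidates, 2)
```

For this toy pool the two picks are the rows at the extremes of the descriptor range, which is the behavior one expects of a D-optimal design for a linear model. A real library design would involve many descriptors and, typically, an exchange algorithm rather than this one-pass greedy selection.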

A further problem accompanying the evolution of high throughput screening methods is that of devising ways of handling the information generated. The screening results may often be semiquantitative at best and, depending upon the design of the library, the bioassay may be run on mixtures of compounds. Is it possible, or even desirable, to attempt any type of quantitative analysis on such results? Can one relate the properties used in library design to the results of the biological evaluation, or are completely different types of characterization and QSAR required? If analyses are not attempted, how can the huge amounts of information be described, summarized, and effectively used? Should negative as well as positive bioassay results be stored and examined? The answers to such questions depend as much on the type of library being evaluated as on the screen itself. A combinatorial library designed for random screening may contain large mixtures of very highly diverse structures, such that screening results would not be amenable to any type of statistical analysis and could only be employed in the design of more focused libraries. Devising ways of organizing, analyzing, storing, and utilizing results from high throughput screening remains a challenge.

Ligand-Receptor Interaction

A large variety of parameters to characterize changes in chemical structure can now be quickly computed; however, there are clear limitations on the manner in which such computed properties may be employed. Sets of computed properties have been quite useful in setting up measures of molecular diversity and in designing screening libraries. They should also be of considerable value in correlating changes in observed biological activity, especially where large numbers of structures are involved. Conversely, the use of computed molecular descriptors, as opposed to measured physicochemical properties, often provides little insight into the actual mechanisms of interactions occurring between the ligand and the biological target. This problem is highlighted by the continuing difficulties in properly characterizing ligand-receptor (enzyme) binding affinities such that predictions can be made. It is also evident if one asks how, for example, two CoMFA (Comparative Molecular Field Analysis) studies [6] of different sets of molecules might be compared or how QSARs using molecular similarities might be contrasted. The many methods of developing structure-activity relationships, both 2D and 3D, are extremely valuable and often afford the only practical solution to an otherwise intractable problem. Yet computed structural parameters may only be applicable to the set of molecules under investigation, while properly measured physicochemical characteristics of molecules can, in principle, be extended to other groups of structures. Thus, a physicochemical relationship between activity and structure found for any given series of molecules describes both the biological target and the set of molecules, and this information is transferable to other biological targets and other sets of molecules.

Such quantitative relationships between experimentally based physicochemical properties and biological activities, under the heading of "classical" QSAR, provide an important scientific foundation for our understanding of molecular interactions. With the increasing availability of isolated receptor assays, much could be learned from detailed QSAR studies involving physicochemical characterization of ligand-receptor interactions, but progress has been very slow. One limitation is the experimental determination of physicochemical properties, which can be both time consuming and expensive. A solution is to improve our understanding of the relationships between computed structural descriptors and experimentally determined physicochemical properties by designing studies aimed at identification of basic underlying mechanisms, and there are some efforts in this direction [7, 8].
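For readers less familiar with the classical form, the simplest such relationship is a straight line relating activity, expressed as log(1/C), to a single measured property such as the hydrophobic substituent constant pi. The sketch below fits that line by ordinary least squares. The function name is hypothetical, and the numbers are made up for illustration (they were constructed to lie exactly on a line); they are not measured data from any study.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up illustration values, NOT experimental data:
# log(1/C) was constructed as 1.0 * pi + 3.0 for each hypothetical substituent.
pi_values = [-0.5, 0.0, 0.5, 1.0, 1.5]
log_inv_c = [2.5, 3.0, 3.5, 4.0, 4.5]

slope, intercept = fit_line(pi_values, log_inv_c)
```

Because the descriptor here is a measured property rather than a quantity specific to one data set, the fitted slope can, in principle, be compared with slopes obtained for other series or other targets, which is the transferability argued for above.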

Bioavailability QSAR?

While most of the effort in QSAR is directed towards enhancing potency and selectivity using in vitro biological assay systems, identification of factors controlling oral bioavailability is equally crucial to developing effective and useful drugs. Just what is bioavailability and how is it related to drug structure? While simple in its statement, this question has no simple answer because of the competing and complex in vivo processes of absorption, distribution, metabolism, and excretion that are experienced by a potential drug.

It is probably correct to assume that molecular hydrophobicity and/or partitioning properties are responsible for many of the passive distribution properties contributing to observed pharmacokinetics. Thus, understandably, past efforts to relate molecular structure to pharmacokinetic properties have focused on elucidation of quantitative relationships using octanol/water partition coefficients and ionization. Although the octanol/water partition coefficient is now firmly established as a standard measure of hydrophobicity, it has recently been demonstrated in a number of laboratories that viable alternatives do exist and often afford superior relationships. Leahy, Taylor and coworkers [9, 10, 11, 12] provided a lucid discussion of solvent pairs for partitioning studies as related to selective membrane transport and proposed the use of propylene glycol dipelargonate/water as one of several additional partitioning systems. Other laboratories have investigated the relationships between solvent pair partitioning and in vitro cell monolayers such as Caco-2 as models for biological transport and thus oral absorption [13]. Studies on phospholipid vesicles and phospholipid monolayers have also shown promise as a means of measuring molecular characteristics more directly related to in vivo pharmacokinetics than octanol/water partition coefficients [14].
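The way partition coefficient and ionization are usually combined can be shown with the standard textbook expression for the distribution coefficient of a monoprotic compound, log D = log P - log10(1 + 10^(pH - pKa)) for an acid (with the exponent reversed for a base). The sketch below is a minimal illustration of that expression, assuming that only the neutral species partitions into the organic phase; the function name and the example values are hypothetical.

```python
import math


def log_d(log_p, pka, ph, acid=True):
    """Distribution coefficient of a monoprotic acid or base at a given pH.

    Assumes only the neutral form enters the organic phase, so the ionized
    fraction simply lowers the apparent partition coefficient.
    """
    exponent = (ph - pka) if acid else (pka - ph)
    return log_p - math.log10(1.0 + 10.0 ** exponent)


# Hypothetical acid with log P = 2.0 and pKa = 4.0.
at_pka = log_d(2.0, 4.0, 4.0)  # half ionized: log P lowered by log10(2)
at_ph7 = log_d(2.0, 4.0, 7.0)  # largely ionized: roughly three log units lower
```

The same functional form applies whichever solvent pair supplies the log P value, so an alternative partitioning system changes the input, not the treatment of ionization.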

These and numerous other studies not mentioned here indicate an increased effort toward developing a better understanding of the physicochemical relationships between hydrophobicity, hydrogen bonding, ionization, and membrane transport. While earlier studies may have been based on intuition, there is now ample evidence that passive transport across membranes can be a very selective process and that no single model partitioning system can hope to account for transport across all types of membranes.

The challenge, then, is to make use of these findings and apply knowledge of the differing character of membranes toward enhancing drug bioavailability as well as improving selectivity. The need to continue evaluation and development of efficient and relevant models for membrane interaction and membrane transport systems will become even greater as a result of the current expansion in high throughput screening technology.

REFERENCES

1. Brown, R.D.; Martin, Y.C. A Comparison of Some Commercially Available Structural Descriptors and Clustering Algorithms. J. Chem. Inf. Comput. Sci., 1996, in press.
2. Spellmeyer, D.C. Computational Approaches to Molecular Diversity. ACS Satellite Television Seminar, 1995, American Chemical Society, Washington, D.C.
3. Martin, E.J.; Blaney, J.M.; Siani, M.A.; Spellmeyer, D.C.; Wong, A.K.; Moos, W.H. Measuring Diversity: Experimental Design of Combinatorial Libraries for Drug Discovery. J. Med. Chem., 1995, 38, 1431-1436.
4. Villar, H.O. Affinity Fingerprints: Applications and Implications. Network Science (http://www.netsci.org/), 1995, 1(5).
5. Kauvar, L.M.; Higgins, D.L.; Villar, H.O.; Sportsman, J.R.; Engqvist-Goldstein, A.E.; Bukar, R.; Bauer, K.E.; Dilley, H.; Rocke, D.M. Predicting Ligand Binding to Proteins by Affinity Fingerprinting. Chem. & Biol., 1995, 2, 107-118.
6. Cramer, R.D., III; Patterson, D.E.; Bunce, J.D. Comparative Molecular Field Analysis (CoMFA). 1. Effect of Shape on Binding of Steroids to Carrier Proteins. J. Am. Chem. Soc., 1988, 110, 5959-5967.
7. Lowrey, A.H.; Cramer, C.J.; Urban, J.J.; Famini, G.R. Quantum Chemical Descriptors for Linear Solvation Energy Relationships. Comput. Chem., 1995, 19, 209-15.
8. Famini, G.R.; Wilson, L.Y. Using Theoretical Descriptors in Linear Solvation Energy Relationships. Theor. Comput. Chem., 1994, 1, 213-241.
9. Leahy, D.E.; Taylor, P.J.; Wait, A.R. Model Solvent Systems for QSAR. Part I. Propylene Glycol Dipelargonate (PGDP). A New Standard Solvent for Use in Partition Coefficient Determination. Quant. Struct.-Act. Relat., 1989, 8, 17-31.
10. Leahy, D.E.; Morris, J.J.; Taylor, P.J. Model Solvent Systems for QSAR. Part II. Fragment Values (f-values) for the Critical Quartet. J. Chem. Soc., Perkin Trans. 2, 1992, 705-722.
11. Leahy, D.E.; Morris, J.J.; Taylor, P. Model Solvent Systems for QSAR. Part III. An LSER Analysis of the Critical Quartet. New Light on Hydrogen Bond Strength and Directionality. J. Chem. Soc., Perkin Trans. 2, 1992, 723-731.
12. Leahy, D.E.; Morris, J.J.; Taylor, P. Model Solvent Systems for QSAR. Part IV. The Hydrogen Bond Acceptor Behavior of Heterocycles. J. Phys. Org. Chem., 1994, 7, 743-750.
13. Paterson, D.A.; Conradi, R.A.; Hilgers, A.R.; Vidmar, T.J.; Burton, P.S. A Non-aqueous Partitioning System for Predicting the Oral Absorption Potential of Peptides. Quant. Struct.-Act. Relat., 1994, 13, 4-10.
14. Seydel, J.K. Nuclear Magnetic Resonance and Differential Scanning Calorimetry as Tools for Studying Drug-Membrane Interactions. Trends Pharmacol. Sci., 1991, 12, 368-71.
