US 20050281819 A9
Nature evolves biological molecules such as proteins through iterated rounds of diversification, selection, and amplification. The present invention provides methods, compositions, and systems for synthesizing, selecting, amplifying, and evolving non-natural molecules based on nucleic acid templates. The sequence of a nucleic acid template is used to direct the synthesis of non-natural molecules such as unnatural polymers and small molecules. Using this method, combinatorial libraries of these molecules can be prepared and screened. Upon selection of a molecule, its encoding nucleic acid template may be amplified and/or evolved to yield the same molecule or related molecules for re-screening. The inventive methods and compositions allow for the amplification and evolution of non-natural molecules in a manner analogous to the amplification of natural biopolymers such as polynucleotides and proteins.
47. A method for synthesizing a templated molecule comprising a plurality of functional groups, said method comprising the steps of
i) providing at least one template comprising a sequence of n coding nucleotides, wherein each coding nucleotide comprises at least one base capable of recognizing a predetermined complementary nucleotide, and wherein n is an integer of more than 1,
ii) providing a plurality of transfer units, wherein each transfer unit comprises
a) at least one complementary nucleotide comprising at least one base capable of recognizing a predetermined coding nucleotide,
b) at least one reactive unit comprising at least one functional group and at least one reactive group, and
c) at least one linker separating the at least one reactive unit from the at least one complementary nucleotide,
iii) contacting each of said coding nucleotides with a complementary nucleotide capable of recognizing said coding nucleotide,
iv) optionally, obtaining a strand complementary to the template, and
v) obtaining a templated molecule comprising covalently linked, functional groups by linking, by means of a reaction involving reactive groups, a functional group of at least one reactive unit to a functional group of another reactive unit,
wherein the templated molecule is capable of being linked by means of a linker to the strand complementary to the template or to the template that templated the synthesis of the templated molecule, and
wherein the synthesis of the templated molecule does not involve ribosome mediated translation of an mRNA.
48. The method of
49. The method of
50. The method of
51. The method of
52. The method of
53. The method of
54. The method of
55. The method of
56. The method according to
57. The method of
58. A templated molecule obtainable in accordance with
59. The templated molecule according to
60. A templated molecule comprising a sequence of covalently linked, functional groups, wherein the templated molecule does not comprise or consist of a peptide or a polynucleotide.
61. A complex comprising a templated molecule and the template that templated the synthesis of the templated molecule.
62. A templated molecule comprising a sequence of covalently linked, functional groups, wherein the templated molecule is linked by means of a linker to the template or to the strand complementary to the template that templated the synthesis of the templated molecule, wherein the templated molecule does not comprise or consist of an α-peptide or a polynucleotide.
63. A molecule comprising a sequence of covalently linked building blocks, wherein the sequence of covalently linked building blocks comprises a sequence of complementary nucleotides complementary to the template that templated the synthesis of the templated molecule, and wherein the templated molecule is linked to the complementary nucleotides or to the template that templated its synthesis.
64. A method for selecting complexes or templated molecules having a predetermined activity, said method comprising the step of performing a selection procedure and selecting templated molecules based on predetermined selection criteria.
65. A method for screening a composition of molecules having a predetermined activity comprising:
i) establishing a first composition of templated molecules, said molecules being produced as defined in
ii) exposing the first composition to conditions enriching said first composition with templated molecules having the predetermined activity, and
iii) optionally amplifying the templated molecules of the enriched composition to obtain a second composition,
iv) further optionally repeating step ii) to iii), and
v) obtaining a further composition having a higher ratio of templated molecules having the specific predetermined activity.
66. A method for amplifying a strand complementary to the template or the template that templated the synthesis of the templated molecule having, or potentially having a predetermined activity, said method comprising the step of contacting the template with amplification means, and amplifying the template.
67. A method for altering the sequence of a templated molecule, including generating a templated molecule comprising a novel or altered sequence of functional groups, wherein said method preferably comprises the steps of
i) providing a first strand complementary to a template or a first template capable of templating the first templated molecule, or a plurality of such first strands complementary to templates or first templates capable of templating a plurality of first templated molecules,
ii) mutating or modifying the sequence of the first complementary strand or the first template, or the plurality of first complementary strands or first templates, and generating a second template or a second complementary strand, or a plurality of second templates or second complementary strands,
wherein said second template(s) or complementary strand(s) is capable of templating the synthesis of a second templated molecule, or a plurality of second templated molecules,
wherein said second templated molecule(s) comprises a sequence of covalently linked, functional groups that is not identical to the sequence of functional groups of the first templated molecule(s), and optionally
iii) templating by means of said second template(s) or complementary strand(s) a second templated molecule, or a plurality of such second templated molecules.
68. A transfer unit comprising
i) a complementary nucleotide capable of specifically recognizing a coding nucleotide having a base,
ii) at least one functional entity selected from a precursor of peptides wherein the amino acid residues are in the L-form or in the D-form, polyamides, polyesters, polycarbamates, polycarbonates, polyureas, polyethers, poly-thioethers, polyethylenes, PNAs, polyimines, polyacetals, polyacetates, polyvinyl, and polycyclic compounds, and
iii) a linker separating the functional entity from the complementary nucleotide.
This application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional patent application 60/277,081, filed Mar. 19, 2001, entitled “Nucleic Acid Directed Synthesis of Chemical Compounds”; 60/277,094, filed Mar. 19, 2001, entitled “Approaches to Generating New Molecular Function”; and 60/306,691, filed Jul. 20, 2001, entitled “Approaches to Generating New Molecular Function”, and the entire contents of each of these applications are hereby incorporated by reference.
The classic “chemical approach” to generating molecules with new functions has been used extensively over the last century in applications ranging from drug discovery to synthetic methodology to materials science. In this approach (
In contrast, Nature generates proteins with new functions using a fundamentally different method that overcomes many of these limitations. In this approach (
Acknowledging the power and efficiency of Nature's approach, researchers have used molecular evolution to generate many proteins and nucleic acids with novel binding or catalytic properties (see, for example, J. Minshull et al. Curr. Opin. Chem. Biol. 1999, 3, 284-90; C. Schmidt-Dannert et al. Trends Biotechnol. 1999, 17, 135-6; D. S. Wilson et al. Annu. Rev. Biochem. 1999, 68, 611-47). Proteins and nucleic acids evolved by researchers have demonstrated value as research tools, diagnostics, industrial reagents, and therapeutics and have greatly expanded our understanding of the molecular interactions that endow proteins and nucleic acids with binding or catalytic properties (see, M. Famulok et al. Curr. Opin. Chem. Biol. 1998, 2, 320-7).
Despite nature's efficient approach to generating function, nature's molecular evolution is limited to two types of “natural” molecules (proteins and nucleic acids) because thus far the information in DNA can only be translated into proteins or into other nucleic acids. However, many synthetic molecules of interest do not in general resemble nucleic acid backbones, and the use of DNA-templated synthesis to translate DNA sequences into synthetic small molecules would be broadly useful only if synthetic molecules other than nucleic acids and nucleic acid analogs could be synthesized in a DNA-templated fashion. An ideal approach to generating functional molecules would merge the most powerful aspects of molecular evolution with the flexibility of synthetic chemistry. Clearly, enabling the evolution of non-natural synthetic small molecules and polymers, similarly to the way nature evolves biomolecules, would lead to much more effective methods of discovering new synthetic ligands, receptors, and catalysts that are difficult or impossible to generate using rational design.
The recognition of the need to be able to amplify and evolve classes of molecules besides nucleic acids and proteins led to the present invention providing methods and compositions for the template-directed synthesis, amplification, and evolution of molecules. In general, these methods use an evolvable template to direct the synthesis of a chemical compound or library of chemical compounds (i.e., the template actually encodes the synthesis of a chemical compound). Based on a library encoded and synthesized using a template such as a nucleic acid, methods are provided for amplifying, evolving, and screening the library. In certain embodiments of special interest, the chemical compounds are compounds that are not, or do not resemble, nucleic acids or analogs thereof. In certain embodiments, the chemical compounds of these template-encoded combinatorial libraries are polymers and more preferably are unnatural polymers (i.e., excluding natural peptides, proteins, and polynucleotides). In other embodiments, the chemical compounds are small molecules.
In certain embodiments, the method of synthesizing a compound or library of compounds comprises first providing one or more nucleic acid templates, which one or more nucleic acid templates optionally have a reactive unit associated therewith. The nucleic acid template is then contacted with one or more transfer units designed to have a first moiety, an anti-codon, which hybridizes to a sequence of the nucleic acid, and is associated with a second moiety, a reactive unit, which includes a building block of the compound to be synthesized. Once these transfer units have hybridized to the nucleic acid template in a sequence-specific manner, the synthesis of the chemical compound can take place due to the interaction of reactive moieties present on the transfer units and/or the nucleic acid template. Significantly, the sequence of the nucleic acid can later be determined to decode the synthetic history of the attached compound and thereby its structure. It will be appreciated that the method described herein may be used to synthesize one molecule at a time or may be used to synthesize thousands to millions of compounds using combinatorial methods.
It will be appreciated that libraries synthesized in this manner (i.e., having been encoded by a nucleic acid) have the advantage of being amplifiable and evolvable. Once a molecule is identified, its nucleic acid template, besides acting as a tag used to identify the attached compound, can also be amplified using standard DNA techniques such as the polymerase chain reaction (PCR). The amplified nucleic acid can then be used to synthesize more of the desired compound. In certain embodiments, mutations are introduced into the nucleic acid during the amplification step in order to generate a population of chemical compounds that are related to the parent compound but are modified at one or more sites. The mutated nucleic acids can then be used to synthesize a new library of related compounds. In this way, the library being screened can be evolved to contain more compounds with the desired activity or to contain compounds with a higher degree of activity.
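The select, amplify, and mutate cycle described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions: templates are plain strings, `toy_score` is a hypothetical stand-in for whatever screen or selection is applied to the encoded compounds, and the function names, keep fraction, and mutation rate are illustrative rather than taken from the specification.

```python
import random

BASES = "ACGT"

def mutate(template, rate, rng):
    """Copy a template, redrawing each base with probability `rate`.

    A redrawn base may happen to equal the original base.
    """
    return "".join(
        rng.choice(BASES) if rng.random() < rate else base
        for base in template
    )

def evolve_round(population, score, keep_fraction, mutation_rate, rng):
    """One round: rank by score, keep the top fraction, amplify with mutation."""
    ranked = sorted(population, key=score, reverse=True)
    survivors = ranked[:max(1, int(len(ranked) * keep_fraction))]
    # Amplification: refill the population with (possibly mutated) copies.
    return [
        mutate(rng.choice(survivors), mutation_rate, rng)
        for _ in range(len(population))
    ]

def toy_score(template):
    """Hypothetical activity measure: the number of G bases (assumption)."""
    return template.count("G")
```

Repeated calls to `evolve_round` with a small nonzero `mutation_rate` model the iterated enrichment described in the text; setting the rate to zero models amplification without diversification.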
The methods of the present invention may be used to synthesize a wide variety of chemical compounds. In certain embodiments, the methods are used to synthesize and evolve unnatural polymers (i.e., excluding polynucleotides and peptides), which cannot be amplified and evolved using standard techniques currently available. In certain other embodiments, the inventive methods and compositions are utilized for the synthesis of small molecules that are not typically polymeric. In still other embodiments, the method is utilized for the generation of non-natural nucleic acid polymers.
The present invention also provides the transfer molecules (e.g., nucleic acid templates and/or transfer units) useful in the practice of the inventive methods. These transfer molecules typically include a portion capable of hybridizing to a sequence of nucleic acid and a second portion with monomers, other building blocks, or reactants to be incorporated into the final compound being synthesized. It will be appreciated that the two portions of the transfer molecule are preferably associated with each other either directly or through a linker moiety. It will also be appreciated that the reactive unit and the anti-codon may be present in the same molecule (e.g., a non-natural nucleotide having functionality incorporated therein).
The present invention also provides kits and compositions useful in the practice of the inventive methods. These kits may include nucleic acid templates, transfer molecules, monomers, solvents, buffers, enzymes, reagents for PCR, nucleotides, small molecule scaffolds, etc. The kit may be used in the synthesis of a particular type of unnatural polymer or small molecule.
The term antibody refers to an immunoglobulin, whether natural or wholly or partially synthetically produced. All derivatives thereof which maintain specific binding ability are also included in the term. The term also covers any protein having a binding domain which is homologous or largely homologous to an immunoglobulin binding domain. These proteins may be derived from natural sources, or partly or wholly synthetically produced. An antibody may be monoclonal or polyclonal. The antibody may be a member of any immunoglobulin class, including any of the human classes: IgG, IgM, IgA, IgD, and IgE. Derivatives of the IgG class, however, are preferred in the present invention.
The term, associated with, is used to describe the interaction between or among two or more groups, moieties, compounds, monomers, etc. When two or more entities are “associated with” one another as described herein, they are linked by a direct or indirect covalent or non-covalent interaction. Preferably, the association is covalent. The covalent association may be through an amide, ester, carbon-carbon, disulfide, carbamate, ether, or carbonate linkage. The covalent association may also include a linker moiety such as a photocleavable linker. Desirable non-covalent interactions include hydrogen bonding, van der Waals interactions, hydrophobic interactions, magnetic interactions, electrostatic interactions, etc. Also, two or more entities or agents may be “associated” with one another by being present together in the same composition.
A biological macromolecule is a polynucleotide (e.g., RNA, DNA, RNA/DNA hybrid), protein, peptide, lipid, natural product, or polysaccharide. The biological macromolecule may be naturally occurring or non-naturally occurring. In a preferred embodiment, a biological macromolecule has a molecular weight greater than 500 g/mol.
Polynucleotide, nucleic acid, or oligonucleotide refers to a polymer of nucleotides. The polymer may include natural nucleosides (i.e., adenosine, thymidine, guanosine, cytidine, uridine, deoxyadenosine, deoxythymidine, deoxyguanosine, and deoxycytidine), nucleoside analogs (e.g., 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, 5-methylcytidine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methylguanine, and 2-thiocytidine), chemically modified bases, biologically modified bases (e.g., methylated bases), intercalated bases, modified sugars (e.g., 2′-fluororibose, ribose, 2′-deoxyribose, arabinose, and hexose), or modified phosphate groups (e.g., phosphorothioates and 5′-N-phosphoramidite linkages).
A protein comprises a polymer of amino acid residues linked together by peptide bonds. The term, as used herein, refers to proteins, polypeptides, and peptides of any size, structure, or function. Typically, a protein will be at least three amino acids long. A protein may refer to an individual protein or a collection of proteins. A protein may refer to a full-length protein or a fragment of a protein. Inventive proteins preferably contain only natural amino acids, although non-natural amino acids (i.e., compounds that do not occur in nature but that can be incorporated into a polypeptide chain; see, for example, http://www.cco.caltech.edu/~dadgrp/Unnatstruct.gif, which displays structures of non-natural amino acids that have been successfully incorporated into functional ion channels) and/or amino acid analogs as are known in the art may alternatively be employed. Also, one or more of the amino acids in an inventive protein may be modified, for example, by the addition of a chemical entity such as a carbohydrate group, a hydroxyl group, a phosphate group, a farnesyl group, an isofarnesyl group, a fatty acid group, a linker for conjugation, functionalization, or other modification, etc. A protein may also be a single molecule or may be a multi-molecular complex. A protein may be just a fragment of a naturally occurring protein or peptide. A protein may be naturally occurring, recombinant, or synthetic, or any combination of these.
The term small molecule, as used herein, refers to a non-peptidic, non-oligomeric organic compound either synthesized in the laboratory or found in nature. Small molecules, as used herein, can refer to compounds that are “natural product-like”, however, the term “small molecule” is not limited to “natural product-like” compounds. Rather, a small molecule is typically characterized in that it possesses one or more of the following characteristics including having several carbon-carbon bonds, having multiple stereocenters, having multiple functional groups, having at least two different types of functional groups, and having a molecular weight of less than 1500, although this characterization is not intended to be limiting for the purposes of the present invention.
The term small molecule scaffold, as used herein, refers to a chemical compound having at least one site for functionalization. In a preferred embodiment, the small molecule scaffold may have a multitude of sites for functionalization. These functionalization sites may be protected or masked as would be appreciated by one of skill in this art. The sites may also be found on an underlying ring structure or backbone.
The term transfer unit, as used herein, refers to a molecule comprising an anti-codon moiety associated with a reactive unit, including, but not limited to a building block, monomer, monomer unit, or reactant used in synthesizing the nucleic acid-encoded molecules.
As discussed above, it would be desirable to be able to evolve and amplify chemical compounds including, but not limited to small molecules and polymers, in the same way that biopolymers such as polynucleotides and proteins can be amplified and evolved. It has been demonstrated that DNA-templated synthesis provides a possible means of translating the information in a sequence of DNA into a synthetic small molecule. In general, DNA templates linked to one reactant may be able to recruit a second reactive group linked to a complementary DNA molecule to yield a product. Since DNA hybridization is sequence-specific, the result of a DNA-templated reaction is the translation of a specific DNA sequence into a corresponding reaction product. As shown in
However, although the ability of nucleic acid templates to accelerate the formation of a variety of non-natural nucleic acid analogues has been demonstrated, nearly all of these reactions previously shown to be catalyzed by nucleic acid templates were designed to proceed through transition states closely resembling the structure of the natural nucleic acid backbone (
Because synthetic molecules of interest do not in general resemble nucleic acid backbones, the use of DNA-templated synthesis to translate DNA sequences into synthetic small molecules would be broadly useful only if synthetic molecules other than nucleic acids and nucleic acid analogs could be synthesized in a DNA-templated fashion. The ability of DNA-templated synthesis to translate DNA sequences into arbitrary non-natural small molecules therefore requires demonstrating that DNA-templated synthesis is a much more general phenomenon than has been previously described.
Significantly, for the first time it has been demonstrated herein that DNA-templated synthesis is indeed a general phenomenon and can be used for a variety of reactions and conditions to generate a diverse range of compounds, specifically including for the first time, compounds that are not, or do not resemble, nucleic acids or analogs thereof. More specifically, the present invention extends the ability to amplify and evolve libraries of chemical compounds beyond natural biopolymers. The ability to synthesize chemical compounds of arbitrary structure allows researchers to write their own genetic codes incorporating a wide range of chemical functionality into novel backbone and side-chain structures, which enables the development of novel catalysts, drugs, and polymers, to name a few examples. For example, the ability to directly amplify and evolve these molecules by genetic selection enables the discovery of entirely new families of artificial catalysts which possess activity, bioavailability, solvent or thermal stability, or other physical properties (such as fluorescence, spin-labeling, or photolability) which are difficult or impossible to achieve using the limited set of natural protein and nucleic acid building blocks. Similarly, developing methods to amplify and directly evolve synthetic small molecules by iterated cycles of mutation and selection enables the isolation of novel ligands or drugs with properties superior to those isolated by traditional rational design or combinatorial screening drug discovery methods. Additionally, extending the approaches described herein to polymers of significance in material science would enable the evolution of new plastics.
In general, the method of the invention involves 1) providing one or more nucleic acid templates, which one or more nucleic acid templates optionally have a reactive unit associated therewith; and 2) contacting the one or more nucleic acid templates with one or more transfer units designed to have a first moiety, an anti-codon which hybridizes to a sequence of the nucleic acid, and is associated with a second moiety, a reactive unit, which includes specific functionality, a building block, reactant, etc. for the compound to be synthesized. It will be appreciated that in certain embodiments of the invention, the transfer unit comprises one moiety incorporating the hybridization capability of the anti-codon unit and the chemical functionality of the reactive unit. Once these transfer units have hybridized to the nucleic acid template in a sequence-specific manner, the synthesis of the chemical compound can take place due to the interaction of reactive units present on the transfer units and/or the nucleic acid template. Significantly, the sequence of the nucleic acid can later be determined to decode the synthetic history of the attached compound and thereby its structure. It will be appreciated that the method described herein may be used to synthesize one molecule at a time or may be used to synthesize thousands to millions of compounds using combinatorial methods.
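Decoding the synthetic history from a template sequence amounts to a table lookup, one codon at a time. The sketch below assumes a fixed codon length of three and a hypothetical code table; the codon sequences, the `block_N` labels, and the function name are illustrative and are not defined in the specification.

```python
CODON_LENGTH = 3  # assumed for illustration

# Hypothetical code table: codon -> label of the encoded building block.
CODE_TABLE = {
    "GTT": "block_1",
    "GTC": "block_2",
    "GTA": "block_3",
    "GCT": "block_4",
}

def decode_template(template, table=CODE_TABLE, codon_length=CODON_LENGTH):
    """Return the ordered building-block labels encoded by a template."""
    if len(template) % codon_length != 0:
        raise ValueError("template length is not a multiple of the codon length")
    codons = [
        template[i:i + codon_length]
        for i in range(0, len(template), codon_length)
    ]
    unknown = [c for c in codons if c not in table]
    if unknown:
        raise KeyError("codons missing from the code table: %s" % unknown)
    return [table[c] for c in codons]
```

Because the lookup is deterministic, sequencing the template of a selected molecule is sufficient to recover the order in which building blocks were incorporated, provided the code table used for the synthesis is known.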
It will be appreciated that a variety of chemical compounds can be prepared and evolved according to the method of the invention. In certain embodiments of the invention, however, the methods are utilized for the synthesis of chemical compounds that are not, or do not resemble, nucleic acids or nucleic acid analogs. For example, in certain embodiments of the invention, small molecule compounds can be synthesized by providing a template which has a reactive unit (e.g., building block or small molecule scaffold) associated therewith (attached directly or through a linker as described in more detail in Example 5 herein), and contacting the template simultaneously or sequentially with one or more transfer units having one or more reactive units associated therewith. In certain other embodiments, non-natural polymers can be synthesized by providing a template and contacting the template simultaneously with one or more transfer units having one or more reactive units associated therewith under conditions suitable to effect reaction of the adjacent reactive units on each of the transfer units (see, for example,
Certain embodiments are discussed in more detail below; however, it will be appreciated that the present invention is not intended to be limited to those embodiments discussed below. Rather, the present invention is intended to encompass these embodiments and equivalents thereof.
As discussed above, one or more templates are utilized in the method of the invention and hybridize to the transfer units to direct the synthesis of the chemical compound. As would be appreciated by one of skill in this art, any template may be used in the methods and compositions of the present invention. Templates which can be mutated and thereby evolved can be used to guide the synthesis of another chemical compound or library of chemical compounds as described in the present invention. As described in more detail herein, the evolvable template encodes the synthesis of a chemical compound and can be used later to decode the synthetic history of the chemical compound, to indirectly amplify the chemical compound, and/or to evolve (i.e., diversify, select, and amplify) the chemical compound. The evolvable template is, in certain embodiments, a nucleic acid. In certain embodiments of the present invention, the template is based on a nucleic acid.
The nucleic acid templates used in the present invention are made of DNA, RNA, a hybrid of DNA and RNA, or a derivative of DNA and RNA, and may be single- or double-stranded. The sequence of the template is used in the inventive method to encode the synthesis of a chemical compound, preferably a compound that is not, or does not resemble, a nucleic acid or nucleic acid analog (e.g., an unnatural polymer or a small molecule). In the case of certain unnatural polymers, the nucleic acid template is used to align the monomer units in the sequence they will appear in the polymer and to bring them in close proximity with adjacent monomer units along the template so that they will react and become joined by a covalent bond. In the case of a small molecule, the template is used to bring particular reactants within proximity of the small molecule scaffold in order that they may modify the scaffold in a particular way. In certain other embodiments, the template can be utilized to generate non-natural polymers by PCR amplification of a synthetic DNA template library consisting of a random region of nucleotides, as described in Example 9 herein.
As would be appreciated by one of skill in the art, the sequence of the template may be designed in a number of ways without going beyond the scope of the present invention. For example, the length of the codon must be determined and the codon sequences must be set. If a codon length of two is used, then using the four naturally occurring bases only 16 possible combinations are available to be used in encoding the library. If the length of the codon is increased to three (the number Nature uses in encoding proteins), the number of possible combinations is increased to 64. Other factors to be considered in determining the length of the codon are mismatching, frame-shifting, complexity of library, etc. As the length of the codon is increased up to a certain extent the number of mismatches is decreased; however, excessively long codons will hybridize despite mismatched base pairs. In certain embodiments of special interest, the length of the codon ranges between 2 and 10 bases.
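The codon counts quoted above follow from the fact that a codon of length n over four bases admits 4^n distinct sequences. A minimal sketch of that arithmetic (the function name is illustrative):

```python
def codon_count(codon_length, alphabet_size=4):
    """Number of distinct codons of a given length over an alphabet."""
    if codon_length < 1:
        raise ValueError("codon_length must be at least 1")
    return alphabet_size ** codon_length

# Codon lengths 2 through 10, the range named in the text.
counts = {n: codon_count(n) for n in range(2, 11)}
```

These figures are upper bounds on the number of building blocks that can be encoded; restricting the codon format to limit mismatching or frame-shifting reduces the usable number.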
Another problem associated with using a nucleic acid template is frame shifting. In Nature, the problem of frame-shifting in the translation of protein from an mRNA is avoided by use of the complex machinery of the ribosome. The inventive methods, however, will not take advantage of such complex machinery. Instead, frameshifting may be remedied by lengthening each codon such that hybridization of a codon out of frame will guarantee a mismatch. For example, each codon may start with a G, and subsequent positions may be restricted to T, C, and A (
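The G-initial codon format mentioned above can be checked exhaustively for a given codon length. The sketch below assumes that every codon starts with G and draws its remaining positions from T, C, and A, and it treats an exact sequence match as a proxy for hybridization; the function names and the codon length of three used in the usage note are illustrative assumptions.

```python
from itertools import product

def make_codons(length):
    """All codons that start with G and continue with T, C, or A only."""
    return ["G" + "".join(rest) for rest in product("TCA", repeat=length - 1)]

def out_of_frame_windows(codons, length):
    """Yield every codon-sized window that straddles two adjacent codons."""
    for first, second in product(codons, repeat=2):
        joined = first + second
        for offset in range(1, length):
            yield joined[offset:offset + length]

def frame_is_unambiguous(length):
    """True if no out-of-frame window is itself a valid codon."""
    codons = make_codons(length)
    codon_set = set(codons)
    return all(
        window not in codon_set
        for window in out_of_frame_windows(codons, length)
    )
```

For example, `frame_is_unambiguous(3)` examines the nine codons produced by `make_codons(3)`. Every out-of-frame window begins with a base other than G, so it cannot coincide with any in-frame codon.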
It will be appreciated that the template can vary greatly in the number of bases. For example, in certain embodiments, the template may be 10 to 10,000 bases long, preferably between 10 and 1,000 bases long. The length of the template will of course depend on the length of the codons, complexity of the library, length of the unnatural polymer to be synthesized, complexity of the small molecule to be synthesized, use of spacer sequences, etc. The nucleic acid sequence may be prepared using any method known in the art to prepare nucleic acid sequences. These methods include both in vivo and in vitro methods including PCR, plasmid preparation, endonuclease digestion, solid phase synthesis, in vitro transcription, strand separation, etc. In certain embodiments, the nucleic acid template is synthesized using an automated DNA synthesizer.
As discussed above, in certain embodiments of the invention, the method is used to synthesize chemical compounds that are not, or do not resemble, nucleic acids or nucleic acid analogs. Although it has been demonstrated that DNA-templated synthesis can be utilized to direct the synthesis of nucleic acids and analogs thereof, it has not been previously demonstrated that the phenomenon of DNA-templated synthesis is general enough to extend to other more complex chemical compounds (e.g., small molecules, non-natural polymers). As described in detail herein, it has been demonstrated that DNA-templated synthesis is indeed a more general phenomenon and that a variety of reactions can be utilized.
Thus, in certain embodiments of the present invention, the nucleic acid template comprises sequences of bases that encode the synthesis of an unnatural polymer or small molecule. The message encoded in the nucleic acid template preferably begins with a specific codon that brings into place a chemically reactive site from which the polymerization can take place, or in the case of synthesizing a small molecule the “start” codon may encode for an anti-codon associated with a small molecule scaffold or a first reactant. The “start” codon of the present invention is analogous to the “start” codon, ATG, found in Nature, which encodes for the amino acid methionine. To give but one example for use in synthesizing an unnatural polymer library, the start codon may encode for a start monomer unit comprising a primary amine masked by a photolabile protecting group, as shown below in Example 5A.
In yet other embodiments of the invention, the nucleic acid template itself may be modified to include an initiation site for polymer synthesis (e.g., a nucleophile) or a small molecule scaffold. In certain embodiments, the nucleic acid template includes a hairpin loop on one of its ends terminating in a reactive group used to initiate polymerization of the monomer units. For example, a DNA template may comprise a hairpin loop terminating in a 5′-amino group, which may be protected or not. From the amino group polymerization of the unnatural polymer may commence. The reactive amino group can also be used to link a small molecule scaffold onto the nucleic acid template in order to synthesize a small molecule library.
To terminate the synthesis of the unnatural polymer a “stop” codon should be included in the nucleic acid template preferably at the end of the encoding sequence. The “stop” codon of the present invention is analogous to the “stop” codons (i.e., UAA, UAG, UGA; TAA, TAG, TGA in the corresponding DNA) found in mRNA transcripts. In Nature, these codons lead to the termination of protein synthesis. In certain embodiments, a “stop” codon is chosen that is compatible with the artificial genetic code used to encode the unnatural polymer. For example, the “stop” codon should not conflict with any other codons used to encode the synthesis, and it should be of the same general format as the other codons used in the template. The “stop” codon may encode for a monomer unit that terminates polymerization by not providing a reactive group for further attachment. For example, a stop monomer unit may contain a blocked reactive group such as an acetamide rather than a primary amine as shown in Example 5A below. In yet other embodiments, the stop monomer unit comprises a biotinylated terminus providing a convenient way of terminating the polymerization step and purifying the resulting polymer.
As described above, in the method of the invention, transfer units are also provided which comprise an anti-codon and a reactive unit. It will be appreciated that the anti-codons used in the present invention are designed to be complementary to the codons present within the nucleic acid template, and should be designed with the nucleic acid template and the codons used therein in mind. For example, the sequences used in the template as well as the length of the codons would need to be taken into account in designing the anti-codons. Any molecule which is complementary to a codon used in the template may be used in the inventive methods (e.g., nucleotides or non-natural nucleotides). In certain embodiments, the codons comprise one or more bases found in nature (i.e., thymine, uracil, guanine, cytosine, adenine). In certain other embodiments, the anti-codon comprises one or more nucleotides normally found in Nature with a base, a sugar, and an optional phosphate group. In yet other embodiments, the bases are strung out along a backbone that is not the sugar-phosphate backbone normally found in Nature (e.g., non-natural nucleotides).
As discussed above, the anti-codon is associated with a particular type of reactive unit to form a transfer unit. It will be appreciated that this reactive unit may represent a distinct entity or may be part of the functionality of the anti-codon unit (see, Example 9). In certain embodiments, each anti-codon sequence is associated with one monomer type. For example, the anti-codon sequence ATTAG may be associated with a carbamate residue with an iso-butyl side chain, and the anti-codon sequence CATAG may be associated with a carbamate residue with a phenyl side chain. This one-for-one mapping of anti-codon to monomer units allows one to decode any polymer of the library by sequencing the nucleic acid template used in the synthesis and allows one to synthesize the same polymer or a related polymer by knowing the sequence of the original polymer. It will be appreciated by one of skill in this art that by changing (e.g., mutating) the sequence of the template, different monomer units will be brought into place, thereby allowing the synthesis of related polymers, which can subsequently be selected and evolved. In certain preferred embodiments, several anti-codons may code for one monomer unit as is the case in Nature.
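The one-for-one mapping described above can be illustrated with a minimal Python sketch. Only the two anti-codon sequences and their monomers are taken from the text; the five-base codon length, the 5′→3′ writing convention for both strands, the absence of spacer sequences, and all function names are assumptions made for illustration, not part of the disclosed method.

```python
# Hypothetical anti-codon table; only these two entries come from the text.
ANTICODON_TO_MONOMER = {
    "ATTAG": "carbamate (iso-butyl side chain)",
    "CATAG": "carbamate (phenyl side chain)",
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the Watson-Crick partner of seq, written 5'->3' (assumed convention)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def decode_template(template, codon_length=5):
    """Read a template codon by codon and list the monomer each codon recruits.

    Assumes contiguous codons of equal length with no spacer sequences.
    """
    if len(template) % codon_length != 0:
        raise ValueError("template length must be a multiple of the codon length")
    monomers = []
    for start in range(0, len(template), codon_length):
        codon = template[start:start + codon_length]
        anticodon = reverse_complement(codon)
        monomers.append(ANTICODON_TO_MONOMER[anticodon])
    return monomers


if __name__ == "__main__":
    # A template built from the codons that pair with ATTAG and CATAG.
    template = reverse_complement("ATTAG") + reverse_complement("CATAG")
    for monomer in decode_template(template):
        print(monomer)
```

Because each anti-codon maps to exactly one monomer, sequencing the template is sufficient to recover the monomer order. A degenerate code in which several anti-codons encode one monomer would simply add entries to the same table.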
In certain other embodiments of the present invention where a small molecule library is to be created rather than a polymer library, the anti-codon is associated with a reactant used to modify the small molecule scaffold. In certain embodiments, the reactant is associated with the anti-codon through a linker long enough to allow the reactant to come in contact with the small molecule scaffold. The linker should preferably be of such a length and composition to allow for intramolecular reactions and minimize intermolecular reactions. The reactants include a variety of reagents as demonstrated by the wide range of reactions that can be utilized in DNA-templated synthesis (see Example 2, 3 and 4 herein) and can be any chemical group, catalyst (e.g., organometallic compounds), or reactive moiety (e.g., electrophiles, nucleophiles) known in the chemical arts.
Additionally, the association between the anti-codon and the monomer unit or reactant in the transfer unit may be covalent or non-covalent. In certain embodiments of special interest, the association is through a covalent bond, and in certain embodiments the covalent linkage is severable. The linkage may be cleaved by light, oxidation, hydrolysis, exposure to acid, exposure to base, reduction, etc. For examples of linkages used in this art, please see Fruchtel et al. Angew. Chem. Int. Ed. Engl. 35:17, 1996, incorporated herein by reference. The anti-codon and the monomer unit or reactant may also be associated through non-covalent interactions such as ionic, electrostatic, hydrogen bonding, van der Waals interactions, hydrophobic interactions, pi-stacking, etc. and combinations thereof. To give but one example, the anti-codon may be linked to biotin, and the monomer unit linked to streptavidin. The propensity of streptavidin to bind biotin leads to the non-covalent association between the anti-codon and the monomer unit to form the transfer unit.
Synthesis of Certain Exemplary Compounds
It will be appreciated that a variety of compounds and/or libraries can be prepared using the method of the invention. As discussed above, in certain embodiments of special interest, compounds that are not, or do not resemble, nucleic acids or analogs thereof, are synthesized according to the method of the invention.
In certain embodiments, polymers, specifically unnatural polymers, are prepared according to the method of the present invention. The unnatural polymers that can be created using the inventive method and system include any unnatural polymers. Exemplary unnatural polymers include, but are not limited to, polycarbamates, polyureas, polyesters, polyacrylates, polyalkylenes (e.g., polyethylene, polypropylene), polycarbonates, polypeptides with unnatural stereochemistry, polypeptides with unnatural amino acids, and combinations thereof. In certain embodiments, the polymers comprise at least 10 monomer units. In certain other embodiments, the polymers comprise at least 50 monomer units. In yet other embodiments, the polymers comprise at least 100 monomer units. The polymers synthesized using the inventive system may be used as catalysts, pharmaceuticals, metal chelators, materials, etc.
In preparing certain unnatural polymers, the monomer units attached to the anti-codons and used in the present invention may be any monomers or oligomers capable of being joined together to form a polymer. The monomer units may be carbamates, D-amino acids, unnatural amino acids, ureas, hydroxy acids, esters, carbonates, acrylates, ethers, etc. In certain embodiments, the monomer units have two reactive groups used to link the monomer unit into the growing polymer chain. Preferably, the two reactive groups are not the same so that the monomer unit may be incorporated into the polymer in a directional sense, for example, at one end may be an electrophile and at the other end a nucleophile. Reactive groups may include, but are not limited to, esters, amides, carboxylic acids, activated carbonyl groups, acid chlorides, amines, hydroxyl groups, thiols, etc. In certain embodiments, the reactive groups are masked or protected (Greene & Wuts Protective Groups in Organic Synthesis, 3rd Edition Wiley, 1999; incorporated herein by reference) so that polymerization may not take place until a desired time when the reactive groups are deprotected. Once the monomer units are assembled along the nucleic acid template, initiation of the polymerization sequence results in a cascade of polymerization and deprotection steps wherein the polymerization step results in deprotection of a reactive group to be used in the subsequent polymerization step (see,
The monomer units to be polymerized may comprise two or more units depending on the geometry along the nucleic acid template. As would be appreciated by one of skill in this art, the monomer units to be polymerized must be able to stretch along the nucleic acid template and particularly across the distance spanned by its encoding anti-codon and optional spacer sequence. In certain embodiments, the monomer unit actually comprises two monomers, for example, a dicarbamate, a diurea, a dipeptide, etc. In yet other embodiments, the monomer unit actually comprises three or more monomers.
The monomer units may contain any chemical groups known in the art. As would be appreciated by one of skill in this art, reactive chemical groups especially those that would interfere with polymerization, hybridization, etc. are masked using known protecting groups. (Greene & Wuts Protective Groups in Organic Synthesis, 3rd Edition Wiley, 1999; incorporated herein by reference). In general, the protecting groups used to mask these reactive groups are orthogonal to those used in protecting the groups used in the polymerization steps.
In synthesizing an unnatural polymer, in certain embodiments, a template is provided encoding the sequence of monomer units. Transfer units are then allowed to contact the template under conditions that allow for hybridization of the anti-codons to the template. Polymerization of the monomer units along the template is then allowed to occur to form the unnatural polymer. The newly synthesized polymer may then be cleaved from the anti-codons and/or the template. The template may be used as a tag to elucidate the structure of the polymer or may be used to amplify and evolve the unnatural polymer. As will be described in more detail below, the present method may be used to prepare a library of unnatural polymers. For example, in certain embodiments, as described in more detail in Example 9 herein, a library of DNA templates can be utilized to prepare unnatural polymers. In general, the method takes advantage of the fact that certain DNA polymerases are able to accept certain modified nucleotide triphosphate substrates and that several deoxyribonucleotides and ribonucleotides bearing modified groups that do not participate in Watson-Crick bonding are known to be inserted with high sequence specificity opposite natural DNA templates. Accordingly, single stranded DNA containing modified nucleotides can serve as efficient templates for the DNA-polymerase catalyzed incorporation of natural or modified nucleotides.
It will be appreciated that the inventive methods may also be used to synthesize other classes of chemical compounds besides unnatural polymers. For example, small molecules may be prepared using the methods and compositions provided by the present invention. These small molecules may be natural product-like, non-polymeric, and/or non-oligomeric. The substantial interest in small molecules is due in part to their use as the active ingredient in many pharmaceutical preparations although they may also be used as catalysts, materials, additives, etc.
In synthesizing small molecules using the method of the present invention, an evolvable template is also provided. The template may either comprise a small molecule scaffold upon which the small molecule is to be built, or a small molecule scaffold may be added to the template. The small molecule scaffold may be any chemical compound with sites for functionalization. For example, the small molecule scaffold may comprise a ring system (e.g., the ABCD steroid ring system found in cholesterol) with functionalizable groups off the atoms making up the rings. In another example, the small molecule may be the underlying structure of a pharmaceutical agent such as morphine or a cephalosporin antibiotic (see Examples 5C and 5D below). The sites or groups to be functionalized on the small molecule scaffold may be protected using methods and protecting groups known in the art. The protecting groups used in a small molecule scaffold may be orthogonal to one another so that protecting groups can be removed one at a time.
In this embodiment, the transfer units comprise an anti-codon similar to those described in the unnatural polymer synthesis; however, these anti-codons are associated with reactants or building blocks to be used in modifying, adding to, or taking away from the small molecule scaffold. The reactants or building blocks may be electrophiles (e.g., acetyl, amides, acid chlorides, esters, nitriles, imines), nucleophiles (e.g., amines, hydroxyl groups, thiols), catalysts (e.g., organometallic catalysts), side chains, etc. See, for example, reactions in aqueous and organic media as described herein in Examples 2 and 4. The transfer units are allowed to contact the template under hybridizing conditions, and the attached reactant or building block is allowed to react with a site on the small molecule scaffold. In certain embodiments, protecting groups on the small molecule template are removed one at a time from the sites to be functionalized so that the reactant of the transfer unit will react at only the desired position on the scaffold. As will be appreciated by one of skill in the art, the anti-codon may be associated with the reactant through a linker moiety (see, Example 3). The linker facilitates contact of the reactant with the small molecule scaffold and in certain embodiments, depending on the desired reaction, positions DNA as a leaving group (“autocleavable” strategy), or may link reactive groups to the template via the “scarless” linker strategy (which yields product without leaving behind additional chemical functionality), or a “useful scar” strategy (in which the linker is left behind and can be functionalized in subsequent steps following linker cleavage). The reaction condition, linker, reactant, and site to be functionalized are chosen to avoid intermolecular reactions and accelerate intramolecular reactions.
It will also be appreciated that the method of the present invention contemplates both sequential and simultaneous contacting of the template with transfer units depending on the particular compound to be synthesized. In certain embodiments of special interest, the multi-step synthesis of chemical compounds is provided in which the template is contacted sequentially with two or more transfer units to facilitate multi-step synthesis of complex chemical compounds.
After the sites on the scaffold have been modified, the newly synthesized small molecule is linked to the template that encoded its synthesis. Decoding of the template tag will allow one to elucidate the synthetic history and thereby the structure of the small molecule. The template may also be amplified in order to create more of the desired small molecule and/or the template may be evolved to create related small molecules. The small molecule may also be cleaved from the template for purification or screening.
As would be appreciated by one of skill in this art, a plurality of templates may be used to encode the synthesis of a combinatorial library of small molecules using the method described above. This would allow for the amplification and evolution of a small molecule library, a feat which has not been accomplished before the present invention.
Method of Synthesizing Libraries of Compounds
In the inventive method, a nucleic acid template, as described above, is provided to direct the synthesis of an unnatural polymer, a small molecule, or any other type of molecule of interest. In general, a plurality of nucleic acid templates is provided wherein the number of different sequences provided ranges from 2 to 10¹⁵. In one embodiment of the present invention, a plurality of nucleic acid templates is provided, preferably at least 100 different nucleic acid templates, more preferably at least 10000 different nucleic acid templates, and most preferably at least 1000000 different nucleic acid templates. Each template provided comprises a unique nucleic acid sequence used to encode the synthesis of a particular unnatural polymer or small molecule. As described above, the template may also have functionality such as a primary amine from which the polymerization is initiated or a small molecule scaffold. In certain embodiments, the nucleic acid templates are provided in one “pot”. In certain other embodiments, the templates are provided in aqueous media, and subsequent reactions are performed in aqueous media.
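The relationship between the number of codons, the number of coding positions, and the number of different templates can be sketched as follows. This is an illustrative calculation only: it assumes every codon may appear at every position, and the function names and the example codons are hypothetical.

```python
from itertools import product


def library_size(num_codons, num_positions):
    """Number of distinct templates when any codon may occupy any position."""
    return num_codons ** num_positions


def positions_needed(num_codons, target_size):
    """Smallest number of coding positions giving at least target_size templates."""
    if num_codons < 2:
        raise ValueError("at least two different codons are required")
    positions = 0
    while num_codons ** positions < target_size:
        positions += 1
    return positions


def enumerate_templates(codons, num_positions):
    """List every template sequence (practical only for small libraries)."""
    return ["".join(combo) for combo in product(codons, repeat=num_positions)]


if __name__ == "__main__":
    # With ten codons, how many positions reach one million templates?
    print(positions_needed(10, 1_000_000))
    # A tiny library built from two hypothetical five-base codons.
    print(enumerate_templates(["CTAAT", "CTATG"], 2))
```

Integer arithmetic is used throughout so that the position count is exact even for very large libraries.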
To the template are added transfer units with anti-codons, as described above, associated with a monomer unit, as described above. In certain embodiments, a plurality of transfer units is provided so that there is an anti-codon for every codon represented in the template. In a preferred embodiment, certain anti-codons are used as start and stop sites. In general, a large enough number of transfer units is provided so that all corresponding codon sites on the template are filled after hybridization.
The anti-codons of the transfer units are allowed to hybridize to the nucleic acid template thereby bringing the monomer units together in a specific sequence as determined by the template. In the situation where a small molecule library is being synthesized, reactants are brought in proximity to a small molecule scaffold. The hybridization conditions, as would be appreciated by those of skill in the art, should preferably allow for only perfect matching between the codon and its anti-codon. Even single base pair mismatches should be avoided. Hybridization conditions may include, but are not limited to, temperature, salt concentration, pH, concentration of template, concentration of anti-codons, and solvent. The hybridization conditions used in synthesizing the library may depend on the length of the codon/anti-codon, the similarity between the codons present in the templates, the content of G/C versus A/T base pairs, etc (for further information regarding hybridization conditions, please see, Molecular Cloning: A Laboratory Manual, 2nd Ed., ed. by Sambrook, Fritsch, and Maniatis (Cold Spring Harbor Laboratory Press: 1989); Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins eds. 1984); the treatise, Methods in Enzymology (Academic Press, Inc., N.Y.); Ausubel et al. Current Protocols in Molecular Biology (John Wiley & Sons, Inc., New York, 1999); each of which is incorporated herein by reference).
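Because even single-base mismatches should be avoided, it can help to check a proposed codon set for pairs that are too similar and to compare G/C content before choosing hybridization conditions. The following is a minimal sketch of such a check; the threshold, the function names, and the example codons are assumptions for illustration and do not replace experimental optimization of hybridization conditions.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def gc_fraction(seq):
    """Fraction of G or C bases in a sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def too_similar_pairs(codons, min_distance=2):
    """Return codon pairs differing at fewer than min_distance positions."""
    return [
        (a, b)
        for a, b in combinations(codons, 2)
        if hamming(a, b) < min_distance
    ]


if __name__ == "__main__":
    codons = ["CTAAT", "CTATG", "CTAAG"]  # hypothetical codon set
    for a, b in too_similar_pairs(codons):
        print(a, b, "differ at only", hamming(a, b), "position(s)")
    for codon in codons:
        print(codon, "G/C fraction:", gc_fraction(codon))
```

Pairs flagged by this check differ by a single base and would therefore rely entirely on the stringency of the hybridization conditions for discrimination.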
After hybridization of the anti-codons to the codons on the template has occurred, the monomer units are then polymerized in the case of the synthesis of unnatural polymers. The polymerization of the monomer units may occur spontaneously or may need to be initiated, for example, by the deprotection of a reactive group such as a nucleophile or by providing light of a certain wavelength. In certain other embodiments, polymerization can be catalyzed by DNA polymerases capable of effecting polymerization of non-natural nucleotides (see, Example 9). The polymerization preferably occurs in one direction along the template with adjacent monomer units becoming joined through a covalent linkage. The termination of the polymerization step occurs by the addition of a monomer unit that is not capable of being added onto. In the case of the synthesis of small molecules, the reactants are allowed to react with the small molecule scaffold. The reactant may react spontaneously, or protecting groups on the reactant and/or the small molecule scaffold may need to be removed. Other reagents (e.g., acid, base, catalyst, hydrogen gas, etc.) may also be needed to effect the reaction (see, Examples 5A-5E).
After the unnatural polymers or small molecules have been created with the aid of the nucleic acid template, they may be cleaved from the nucleic acid template and/or anti-codons used to synthesize them. In certain embodiments, the polymers or small molecules are assayed before being completely detached from the nucleic acid templates that encode them. Once the polymer or small molecule is selected, the sequence of the template or its complement may be determined to elucidate the structure of the attached polymer or small molecule. This sequence may then be amplified and/or evolved to create new libraries of related polymers or small molecules that in turn may be screened and evolved.
The methods and compositions of the present invention represent a new way to generate molecules with desired properties. This approach marries the extremely powerful genetic methods, which molecular biologists have taken advantage of for decades, with the flexibility and power of organic chemistry. The ability to prepare, amplify, and evolve unnatural polymers by genetic selection may lead to new classes of catalysts that possess activity, bioavailability, stability, fluorescence, photolability, or other properties that are difficult or impossible to achieve using the limited set of building blocks found in proteins and nucleic acids. Similarly, developing new systems for preparing, amplifying, and evolving small molecules by iterated cycles of mutation and selection may lead to the isolation of novel ligands or drugs with properties superior to those isolated by slower traditional drug discovery methods (see, Example 7).
Performing organic library synthesis on the molecular biology scale is a fundamentally different approach from traditional solid phase library synthesis and carries significant advantages. A library created using the inventive methods can be screened using any method known in this art (e.g., binding assay, catalytic assay). For example, selection based on binding to a target molecule can be carried out on the entire library by passing the library over a resin covalently linked to the target. Those biopolymers that have affinity for the resin-bound target can be eluted with free target molecules, and the selected compounds can be amplified using the methods described above. Subsequent rounds of selection and amplification can result in a pool of compounds enriched with sequences that bind the target molecule. In certain embodiments, the target molecule mimics a transition state of a chemical reaction, and the chemical compounds selected may serve as a catalyst for the chemical reaction. Because the information encoding the synthesis of each molecule is covalently attached to the molecule at one end, an entire library can be screened at once and yet each molecule is selected on an individual basis.
Such a library can also be evolved by introducing mutations at the DNA level using error-prone PCR (Cadwell et al. PCR Methods Appl. 2:28, 1992; incorporated herein by reference) or by subjecting the DNA to in vitro homologous recombination (Stemmer Proc. Natl. Acad. Sci. USA 91:10747, 1994; Stemmer Nature 370:389, 1994; each of which is incorporated herein by reference). Repeated cycles of selection, amplification, and mutation may afford biopolymers with greatly increased binding affinity for target molecules or with significantly improved catalytic properties. The final pool of evolved biopolymers having the desired properties can be sequenced by sequencing the nucleic acid cleaved from the polymers. The nucleic acid-free polymers can be purified using any method known in the art including HPLC, column chromatography, FPLC, etc., and their binding or catalytic properties can be verified in the absence of covalently attached nucleic acid.
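The diversification step of such a cycle can be modeled in silico as random point substitution of the selected templates. The sketch below is a simplified model only: it assumes a uniform per-base error rate and substitutions only (no insertions, deletions, or recombination), which is not how error-prone PCR behaves in detail, and every name and parameter value in it is hypothetical.

```python
import random

BASES = "ACGT"


def mutate(template, error_rate, rng):
    """Copy a template, substituting each base with probability error_rate."""
    copied = []
    for base in template:
        if rng.random() < error_rate:
            # Substitute with one of the three other bases, chosen uniformly.
            copied.append(rng.choice([b for b in BASES if b != base]))
        else:
            copied.append(base)
    return "".join(copied)


def diversify(selected_templates, copies_per_template, error_rate, seed=0):
    """Amplify each selected template into several possibly mutated copies."""
    rng = random.Random(seed)
    return [
        mutate(template, error_rate, rng)
        for template in selected_templates
        for _ in range(copies_per_template)
    ]


if __name__ == "__main__":
    survivors = ["CTAATCTATG"]  # hypothetical template recovered from a selection
    for variant in diversify(survivors, copies_per_template=5, error_rate=0.1):
        print(variant)
```

Each variant produced this way would encode the same or a related polymer or small molecule, which can then be translated and subjected to a further round of selection.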
The polymerization of synthetically-generated monomer units independent of the ribosomal machinery allows the incorporation of an enormous variety of side chains with novel chemical, biophysical, or biological properties. Terminating each biopolymer with a biotin side chain, for example, allows the facile purification of only full-length biopolymers which have been completely translated by passing the library through an avidin-linked resin. Biotin-terminated biopolymers can be selected for the actual catalysis of bond-breaking reactions by passing these biopolymers over resin linked through the substrate to avidin (
In this manner unnatural biopolymers may be isolated which serve as artificial receptors to selectively bind molecules or which catalyze chemical reactions. Characterization of these molecules would provide important insight into the ability of polycarbamates, polyureas, polyesters, polycarbonates, polypeptides with unnatural side chain and stereochemistries, or other unnatural polymers to form secondary or tertiary structures with binding or catalytic properties.
The present invention also provides kits and compositions for use in the inventive methods. The kits may contain any item or composition useful in practicing the present invention. The kits may include, but are not limited to, templates, anti-codons, transfer units, monomer units, building blocks, reactants, small molecule scaffolds, buffers, solvents, enzymes (e.g., heat stable polymerase, reverse transcriptase, ligase, restriction endonuclease, exonuclease, Klenow fragment, polymerase, alkaline phosphatase, polynucleotide kinase), linkers, protecting groups, polynucleotides, nucleosides, nucleotides, salts, acids, bases, solid supports, or any combinations thereof.
As would be appreciated by one of skill in this art, a kit for preparing unnatural polymers would contain items needed to prepare unnatural polymers using the inventive methods described herein. Such a kit may include templates, anti-codons, transfer units, monomer units, or combinations thereof. A kit for synthesizing small molecules may include templates, anti-codons, transfer units, building blocks, small molecule scaffolds, or combinations thereof.
The inventive kit may also be equipped with items needed to amplify and/or evolve a polynucleotide template such as a heat stable polymerase for PCR, nucleotides, buffer, and primers. In certain other embodiments, the inventive kit includes items commonly used in performing DNA shuffling such as polynucleotides, ligase, and nucleotides.
In addition to the templates and transfer units described herein, the present invention also includes compositions comprising complex small molecules, scaffolds, or unnatural polymer prepared by any one or more of the methods of the invention as described herein.
The representative examples that follow are intended to help illustrate the invention, and are not intended to, nor should they be construed to, limit the scope of the invention. Indeed, various modifications of the invention and many further embodiments thereof, in addition to those shown and described herein, will become apparent to those skilled in the art from the full contents of this document, including the examples which follow and the references to the scientific and patent literature cited herein. It should further be appreciated that the contents of those cited references are incorporated herein by reference to help illustrate the state of the art.
The following examples contain important additional information, exemplification and guidance that can be adapted to the practice of this invention in its various embodiments and the equivalents thereof.
The Generality of DNA-Templated Synthesis: Clearly, implementing the small molecule evolution approach described above requires establishing the generality of DNA-templated synthesis. The present invention, for the first time, establishes the generality of this approach and thus enables the synthesis of a variety of chemical compounds using DNA-templated synthesis. As shown in
Additionally, sequence-specific DNA-templated reactions spanning a variety of reaction types (SN2 substitutions, additions to α,β-unsaturated carbonyl systems, and additions to vinyl sulfones), nucleophiles (thiols and amines), and reactant structures all proceeded in good yields with excellent sequence selectivity (
Since sequence discrimination is important for the faithful translation of DNA into synthetic structures, the reaction rate of a matched reagent compared with that of a reagent bearing a single mismatched base near the center of its 10-base oligonucleotide was measured. At 25° C., the initial rate of reaction of matched thiol reagents with iodoacetamide-linked H templates is 200-fold faster than that of reagents bearing a single mismatch (k_app = 2.4×10⁴ M⁻¹ s⁻¹ vs. 1.1×10² M⁻¹ s⁻¹,
In addition to reaction generality and sequence specificity, DNA-templated synthesis also demonstrates remarkable distance independence. Both H and E templates linked to maleimide or α-iodoacetamide groups promote sequence-specific reaction with matched, but not mismatched, thiol reagents annealed anywhere on the templates examined thus far (up to 30 bases away from the reactive group on the template). Reactants annealed one base away react with similar rates as those annealed 2, 3, 4, 6, 8, 10, 15, 20, or 30 bases away (
To determine the basis of the distance independence of DNA-templated synthesis, a series of modified E templates were first synthesized in which the intervening bases were replaced by a series of DNA analogs designed to evaluate the possible contribution of (i) interbase interactions, (ii) conformational preferences of the DNA backbone, (iii) the charged phosphate backbone, and (iv) backbone hydrophilicity. Templates in which the intervening bases were replaced with any of the analogs in
The distance independent reaction rates may be explained if the bond-forming events in a DNA-templated format are sufficiently accelerated relative to their nontemplated counterparts such that DNA annealing, rather than bond formation, is rate-determining. If DNA annealing is at least partially rate limiting, then the rate of product formation should decrease as the concentration of reagents is lowered because annealing, unlike templated bond formation, is a bimolecular process. Decreasing the concentration of reactants in the case of the E template with one or ten intervening bases between reactive groups resulted in a marked decrease in the observed reaction rate (
These findings raise the possibility of using DNA-templated synthesis to translate in one pot libraries of DNA into solution-phase libraries of synthetic molecules suitable for PCR amplification and selection. The ability of DNA-templated synthesis to support a variety of transition state geometries suggests its potential in directing a range of powerful water-compatible synthetic reactions (see, Li, C. J. Organic Reactions in Aqueous Media, Wiley and Sons, New York: 1997). The sequence specificity described above suggests that mixtures of reagents may be able to react predictably with complementary mixtures of templates. Finally, the observed distance independence suggests that different regions of DNA “codons” may be used to encode different groups on the same synthetic scaffold without impairing reaction rates. As a demonstration of this approach, a library of 1,025 maleimide-linked templates was synthesized, each with a different DNA sequence in an eight-base encoding region (
Digestion with the restriction endonuclease Tsp45I, which cleaves GTGAC and therefore cuts the biotin encoding template but none of the other templates, revealed a 1:1 ratio of biotin encoding to non-biotin encoding templates following selection (
Taken together, these results suggest that DNA-templated synthesis is a surprisingly general phenomenon capable of directing, rather than simply encoding, a range of chemical reactions to form products unrelated in structure to nucleic acid backbones. For several reactions examined, the DNA-templated format accelerates the rate of bond formation beyond the rate of a 10-base DNA oligonucleotide annealing to its complement, resulting in surprising distance independence. The facile nature of long-distance DNA-templated reactions may also arise in part from the tendency of water to contract the volume of nonpolar reactants (see, C.-J. Li et al. Organic Reactions in Aqueous Media, Wiley and Sons: New York, 1997) and from possible compactness of the intervening single-stranded DNA between reactive groups. These findings may have implications for prebiotic evolution and for understanding the mechanisms of catalytic nucleic acids, which typically localize substrates to a strand of RNA or DNA.
DNA synthesis. DNA oligonucleotides were synthesized on a PerSeptive Biosystems Expedite 8909 DNA synthesizer using standard protocols and purified by reverse phase HPLC. Oligonucleotides were quantitated spectrophotometrically and by denaturing polyacrylamide gel electrophoresis (PAGE) followed by staining with ethidium bromide or SYBR Green (Molecular Probes) and quantitation using a Stratagene Eagle Eye II densitometer. Phosphoramidites enabling the synthesis of 5′-NH2-dT, 5′ tetrachlorofluorescein, abasic backbone spacer, C3 backbone spacer, 9-bond polyethylene glycol spacer, 12-bond saturated hydrocarbon spacer, and 5′ biotin groups were purchased from Glen Research. Thiol-linked oligonucleotide reagents were synthesized on C3 disulfide controlled pore glass (Glen Research).
Template functionalization. Templates bearing 5′-NH2-dT groups were transformed into a variety of electrophilic functional groups by reaction with the appropriate electrophile-NHS ester (Pierce). Reactions were performed in 200 mM sodium phosphate pH 7.2 with 2 mg/mL electrophile-NHS ester, 10% DMSO, and up to 100 μg of 5′-amino template at 25° C. for 1 h. Desired products were purified by reverse-phase HPLC and characterized by gel electrophoresis and MALDI mass spectrometry.
DNA-templated synthesis reactions. Reactions were initiated by mixing equimolar quantities of reagent and template in buffer containing 50 mM MOPS pH 7.5 and 250 mM NaCl at the desired temperature (25° C. unless stated otherwise). Concentrations of reagents and templates were 60 nM unless otherwise indicated. At various time points, aliquots were removed, quenched with excess β-mercaptoethanol, and analyzed by denaturing PAGE. Reaction products were quantitated by densitometry using their intrinsic fluorescence or by staining followed by densitometry. Representative products were also verified by MALDI mass spectrometry.
In vitro selection for avidin binding. Products of the library translation reaction were isolated by ethanol precipitation and dissolved in binding buffer (10 mM Tris pH 8, 1 M NaCl, 10 mM EDTA). Products were incubated with 30 μg of streptavidin-linked magnetic beads (Roche Biosciences) for 10 min at room temperature in 100 μL total volume. Beads were washed 16 times with binding buffer and eluted by treatment with 1 pmol free biotin in 100 μL binding buffer at 70° C. for 10 minutes. Eluted molecules were isolated by ethanol precipitation and amplified by standard PCR protocols (2 mM MgCl2, 5° C. annealing, 20 cycles) using the primers 5′-TGGTGCGGAGCCGCCG and 5′-CCACTGTCCGTGGCGCGACCCCGGCTCCTCGGCTCGG. Automated DNA sequencing used the primer 5′-CCACTGTCCGTGGCGCGACCC.
DNA Sequences. Sequences not provided in the figures are as follows: matched reagent in
As discussed above, the generality of DNA-templated synthetic chemistry was examined (see, Liu et al. J. Am. Chem. Soc. 2001, 123, 6961). Specifically, the ability of DNA-templated synthesis to direct a modest collection of chemical reactions without requiring the precise alignment of reactive groups into DNA-like conformations was demonstrated. Indeed, the distance independence and sequence fidelity of DNA-templated synthesis allowed the simultaneous, one-pot translation of a model library of more than 1,000 templates into the corresponding thioether products, one of which was enriched by in vitro selection for binding to the protein streptavidin and amplified by PCR.
As described in detail herein, the generality of DNA-templated synthesis has been further expanded and it has been demonstrated that a variety of chemical reactions can be utilized for the construction of small molecules and in particular, for the first time, DNA-templated organometallic couplings and carbon-carbon bond forming reactions other than pyrimidine photodimerization. These reactions clearly represent an important step towards the in vitro evolution of non-natural synthetic molecules by enabling the DNA-templated construction of a much more diverse set of structures than has previously been achieved.
The ability of DNA-templated synthesis to direct reactions that require a non-DNA-linked activator, catalyst or other reagent in addition to the principal reactants has also been demonstrated herein. To test the ability of DNA-templated synthesis to mediate such reactions without requiring structural mimicry of the DNA-templated backbone, DNA-templated reductive aminations between an amine-linked template (1) and benzaldehyde- or glyoxal-linked reagents (3) were performed with millimolar concentrations of NaBH3CN at room temperature in aqueous solutions. Significantly, products formed efficiently when the template and reagent sequences were complementary, while control reactions in which the sequence of the reagent did not complement that of the template, or in which NaBH3CN was omitted, yielded no significant product (see
It will be appreciated that carbon-carbon bond forming reactions are also important in both chemical and biological syntheses and thus several such reactions are utilized in DNA-templated format. Both the reaction of nitroalkane-linked reagent (10) with aldehyde-linked template (11) (nitro-aldol or Henry reaction) and the conjugate addition of 10 to maleimide-linked template (12) (nitro-Michael addition) proceeded efficiently and with high sequence specificity at pH 7.5-8.5, 25° C. (
In addition to the reactions described above, organometallic coupling reactions can also be utilized in the present invention. For example, DNA-templated Heck reactions were performed in the presence of water-soluble Pd precatalysts. In the presence of 170 mM Na2PdCl4, aryl iodide-linked reagent 19 and a variety of olefin-linked templates including maleimide 12, acrylamide 17, vinyl sulfone 18 or cinnamamide 20 yielded Heck coupling products in modest yields at pH 5.0, 25° C. (
It was previously discovered that the same DNA-templated reactions demonstrate distance independence, the ability to form product at a rate independent of the number of intervening bases between annealed reactants. It was hypothesized (
In addition to the DNA-templated SN2 reaction, conjugate addition, vinyl sulfone addition, amide bond formation, reductive amination, nitro-aldol (Henry reaction), nitro-Michael, Wittig olefination, 1,3-dipolar cycloaddition and Heck coupling reactions described directly above, a variety of additional reagents can also be utilized in the method of the present invention. For example, as depicted in
Taken together, these results expand considerably the reaction scope of DNA-templated synthesis. A wide variety of reactions proceeded efficiently and selectively only when the corresponding reactants were programmed with complementary sequences. By augmenting the repertoire of known DNA-templated reactions to now include carbon-carbon bond forming and organometallic reactions (nitro-aldol additions, nitro-Michael additions, Wittig olefinations, dipolar cycloadditions, and Heck couplings) in addition to previously reported amide bond formation (see, Schmidt et al. Nucleic Acids Res. 1997, 25, 4792; Bruick et al. Chem. Biol. 1996, 3, 49), imine formation (Czlapinski et al. J. Am. Chem. Soc. 2001, 123, 8618), reductive amination (Li et al. J. Am. Chem. Soc. 2002, 124, 746; Gat et al. Biopolymers, 1998, 48, 19), SN2 reactions (Gartner et al. J. Am. Chem. Soc. 2001, 123, 6961; Xu et al. Nat. Biotechnol. 2001, 19, 148; Herrlein et al. J. Am. Chem. Soc. 1995, 117, 10151), conjugate addition of thiols (Gartner et al. J. Am. Chem. Soc. 2001, 123, 6961), and phosphoester or phosphonamide formation (Orgel et al. Acc. Chem. Res. 1995, 28, 109; Luther et al. Nature, 1998, 396, 245), these results may enable the sequence-specific translation of libraries of DNA into libraries of structurally and functionally diverse synthetic products. Since minute quantities of templates encoding desired molecules can be amplified by PCR, the yields of DNA-templated reactions are arguably less critical than the yields of traditional synthetic transformations. Nevertheless, many of the reactions developed above proceed efficiently.
In addition, by demonstrating that DNA-templated synthesis in the absence of proteins can direct a large diversity of chemical reactions, these findings support previously proposed hypotheses that nucleic acid-templated synthesis may have translated replicable information into some of the earliest functional molecules such as polyketides, terpenes and polypeptides prior to the evolution of protein-based enzymes. The diversity of chemistry shown here to be controllable simply by bringing reactants into proximity by DNA hybridization without obvious structural requirements provides an experimental basis for these possibilities. The translation of amplifiable information into a wide range of structures is a key requirement for applying nature's molecular evolution approach to the discovery of non-natural molecules with new functions.
Methods for Exemplary Reactions for Use in DNA-Templated Synthesis:
Functionalized templates and reagents were typically prepared by reacting 5′-NH2 terminated oligonucleotides (for template 1), 5′-NH2—(CH2O)2 terminated oligonucleotides (for all other templates) or 3′-OPO3—CH2CH(CH2OH)(CH2)4NH2 terminated oligonucleotides (for all reagents) with the appropriate NHS esters (0.1 volumes of a 20 mg/mL solution in DMF) in 0.2 M sodium phosphate buffer, pH 7.2, 25° C., 1 h to provide the template and reagent structures shown in
Functionalized templates and reagents were purified by gel filtration using Sephadex G-25 followed by reverse-phase HPLC (0.1 M triethylammonium acetate-acetonitrile gradient) and characterized by MALDI mass spectrometry. DNA-templated reactions were conducted under the conditions described in
The sequences of oligonucleotide templates and reagents are as follows (5′ to 3′ direction, n refers to the number of bases between reactive groups when template and reagent are annealed as shown in
Reaction yields were quantitated by denaturing polyacrylamide gel electrophoresis followed by ethidium bromide staining, UV visualization, and CCD-based densitometry of product and template starting material bands. Yield calculations assumed that templates and products stained with equal intensity per base; for those cases in which products are partially double-stranded during quantitation, changes in staining intensity may result in higher apparent yields.
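The equal-staining-per-base assumption can be expressed as a short calculation. In the sketch below, each band intensity is divided by the length of the species in bases to give a relative molar amount, and the yield is the product's share of the total. The function name and the example intensities and lengths are hypothetical and are not taken from the experiments described.

```python
def templated_yield(product_intensity, product_len, template_intensity, template_len):
    """Fractional yield from band intensities, assuming equal staining per base."""
    product_mol = product_intensity / product_len      # relative moles of product
    template_mol = template_intensity / template_len   # relative moles of unreacted template
    return product_mol / (product_mol + template_mol)


# Hypothetical example: a 30-base template and a 40-base product (template plus reagent DNA).
print(templated_yield(400.0, 40, 300.0, 30))  # equal molar amounts, so the yield is 0.5
```

Because a longer product stains more strongly per molecule, comparing raw intensities without the length correction would overstate the yield.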
As will be appreciated by one of ordinary skill in the art, it is frequently useful to leave the DNA moiety of the reagents linked to products during reaction development to facilitate analysis by gel electrophoresis. The use of DNA-templated synthesis to translate libraries of DNA into corresponding libraries of synthetic small molecules suitable for in vitro selection, however, requires the development of cleavable linkers connecting reactive groups of reagents with their decoding DNA oligonucleotides. As described below and herein, three exemplary types of linkers have been developed (see,
As demonstrated herein, a variety of DNA-templated reactions can occur in aqueous media. It has also been demonstrated, as discussed below, that DNA-templated reactions can occur in organic solvents, thus greatly expanding the scope of DNA-templated synthesis. Specifically, DNA templates and reagents have been complexed with long chain tetraalkylammonium cations (see, Jost et al. Nucleic Acids Res. 1989, 17, 2143; Mel'nikov et al. Langmuir, 1999, 15, 1923-1928) to enable quantitative dissolution of reaction components in anhydrous organic solvents including CH2Cl2, CHCl3, DMF and MeOH. Surprisingly, it was found that DNA-templated synthesis can indeed occur in anhydrous organic solvents with high sequence selectivity. Depicted in
As detailed above, the generality of DNA-templated synthesis has been established by performing several distinct DNA-templated reaction types, none of which are limited to producing structures that resemble the natural nucleic acid backbone, and many of which are highly useful carbon-carbon bond forming or complexity-building synthetic reactions. It has been shown that the distance independence of DNA-templated synthesis allows different regions of a DNA template to each encode different synthetic reactions. DNA-templated synthesis can maintain sequence fidelity even in a library format in which more than 1,000 templates and 1,000 reagents react simultaneously in one pot. As described above and below, linker strategies have been developed, which together with the reactions developed as described above, have enabled the first multi-step DNA-templated synthesis of simple synthetic small molecules. Additionally, the sequence-specific DNA-templated synthesis in organic solvents has been demonstrated, further expanding the scope of this approach.
A) Synthesis of a Polycarbamate Library: One embodiment of the strategy described above is the creation of an amplifiable polycarbamate library. Of the sixteen possible dinucleotides used to encode the library, one is assigned a start codon function, and one is assigned to serve as a stop codon. An artificial genetic code is then created assigning each of the up to 14 remaining dinucleotides to a different monomer. For geometric reasons one monomer actually contains a dicarbamate containing two side chains. Within each monomer, the dicarbamate is attached to the corresponding dinucleotide (analogous to a tRNA anticodon) through a silyl enol ether linker which liberates the native DNA and the free carbamate upon treatment with fluoride. The dinucleotide moiety exists as the activated 5′-2-methylimidazole phosphate, that has been demonstrated (Inoue et al. J. Mol. Biol. 162:201, 1982; Rembold et al. J. Mol. Evol. 38:205, 1994; Chen et al. J. Mol. Biol. 181:271, 1985; Acevedo et al. J. Mol. Biol. 197:187, 1987; Inoue et al. J. Am. Chem. Soc. 103:7666, 1981; each of which is incorporated herein by reference) to serve as an excellent leaving group for template-directed oligomerization of nucleotides yet is relatively stable under neutral or basic aqueous conditions (Schwartz et al. Science 228:585, 1985; incorporated herein by reference). The dicarbamate moiety exists in a cyclic form linked through a vinyloxycarbonate linker. The vinylcarbonate group has been demonstrated to be stable in neutral or basic aqueous conditions (Olofson et al. Tetrahedron Lett. 18:1563, 1977; Olofson et al. Tetrahedron Lett. 18:1567, 1977; Olofson et al. Tetrahedron Lett. 18:1571, 1977; each of which is incorporated herein by reference) and further has been shown to provide carbamates in very high yields upon the addition of amines (Olofson et al. Tetrahedron Lett. 18:1563, 1977; incorporated herein by reference).
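The dinucleotide code described above can be sketched as a simple lookup table. In the following illustration the choice of AA as the start codon and TT as the stop codon, and the monomer labels M1 to M14, are hypothetical placeholders. The table is keyed on the template dinucleotide for simplicity, although in practice each monomer would carry the complementary dinucleotide.

```python
from itertools import product

BASES = "ACGT"
DINUCLEOTIDES = ["".join(p) for p in product(BASES, repeat=2)]  # all 16 dinucleotides


def build_code(start, stop, monomers):
    """Assign one start codon, one stop codon, and up to 14 monomers to dinucleotides."""
    coding = [d for d in DINUCLEOTIDES if d not in (start, stop)]
    if len(monomers) > len(coding):
        raise ValueError("more monomers than available dinucleotides")
    code = {start: "START", stop: "STOP"}
    code.update(zip(coding, monomers))
    return code


def translate(template, code):
    """Read a template two bases at a time from the start codon to the stop codon."""
    codons = [template[i:i + 2] for i in range(0, len(template) - 1, 2)]
    if not codons or code.get(codons[0]) != "START":
        raise ValueError("template must begin with the start codon")
    monomers = []
    for codon in codons[1:]:
        unit = code.get(codon)
        if unit is None:
            raise ValueError(f"unassigned codon {codon}")
        if unit == "STOP":
            return monomers
        monomers.append(unit)
    raise ValueError("no stop codon found")


# Hypothetical assignments: AA = start, TT = stop, M1..M14 = placeholder monomer names.
CODE = build_code("AA", "TT", [f"M{i}" for i in range(1, 15)])
print(translate("AAACCGTT", CODE))
```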
When attacked by an amine from a nascent polycarbamate chain, the vinyl carbonate linker, driven by the aromatization of m-cresol, liberates a free amine. This free amine subsequently serves as the nucleophile to attack the next vinyloxycarbonate, propagating the polymerization of the growing carbamate chain. Such a strategy minimizes the potential for cross-reactivity and bidirectional polymerization by ensuring that only one nucleophile is present at any time during polymerization.
Using the monomer described above, artificial translation of DNA into a polycarbamate can be viewed as a three-stage process. In the first stage, single stranded DNA templates encoding the library are used to guide the assembly and polymerization of the dinucleotide moieties of the monomers, terminating with the “stop” monomer which possesses a 3′-methyl ether instead of a 3′-hydroxyl group (
Once the nucleotides have assembled and polymerized into double-stranded DNA, the “start” monomer ending in an o-nitrobenzylcarbamate is photodeprotected to reveal the primary amine that initiates carbamate polymerization. Polymerization proceeds in the 5′ to 3′ direction along the DNA backbone, with each nucleophilic attack resulting in the subsequent unmasking of a new amine nucleophile. Attack of the “stop” monomer liberates an acetamide rather than an amine, thereby terminating polymerization (
Following polymerization the polycarbamate is cleaved from the phosphate backbone of the DNA upon treatment with fluoride. Desilylation of the enol ether linker and the elimination of the phosphate driven by the resulting release of phenol provides the polycarbamate covalently linked at its carboxy terminus to its encoding single-stranded DNA (
At this stage the polycarbamate may be completely liberated from the DNA by base hydrolysis of the ester linkage. The liberated polycarbamate can be purified by HPLC and retested to verify that its desired properties are intact. The free DNA can be amplified using PCR, mutated with error-prone PCR (Cadwell et al. PCR Methods Appl. 2:28, 1992; incorporated herein by reference) or DNA shuffling (Stemmer Proc. Natl. Acad. Sci. USA 91:10747, 1994; Stemmer Nature 370:389, 1994; U.S. Pat. No. 5,811,238, issued Sep. 22, 1998; each of which is incorporated herein by reference), and/or sequenced to reveal the primary structure of the polycarbamate.
Synthesis of monomer units. After the monomers are synthesized, the assembly and polymerization of the monomers on the DNA scaffold should occur spontaneously. Shikimic acid 1, available commercially, biosynthetically (Davis Adv. Enzymol. 16:287, 1955; incorporated herein by reference), or by short syntheses from D-mannose (Fleet et al. J. Chem. Soc., Perkin Trans. I 905, 1984; Harvey et al. Tetrahedron Lett. 32:4111, 1991; each of which is incorporated herein by reference), serves as a convenient starting point for the monomer synthesis. The syn hydroxyl groups are protected as the p-methoxybenzylidene, and the remaining hydroxyl group as the tert-butyldimethylsilyl ether to afford 2. The carboxylate moiety of the protected shikimic acid is then reduced completely by LAH reduction, tosylation of the resulting alcohol, and further reduction with LAH to provide 3.
Commercially available and synthetically accessible N-protected amino acids serve as the starting materials for the dicarbamate moiety of each monomer. Reactive side chains are protected as photolabile ethers, esters, acetals, carbamates, or thioethers. Following chemistry previously developed (Cho et al. Science 261:1303, 1993; incorporated herein by reference), a desired amino acid 4 is converted to the corresponding amino alcohol 5 by mixed anhydride formation with isobutylchloroformate followed by reduction with sodium borohydride. The amino alcohol is then converted to the activated carbonate by treatment with p-nitrophenylchloroformate to afford 6, which is then coupled to a second amino alcohol 7 to provide, following hydroxyl group silylation and FMOC deprotection, carbamate 8.
Coupling of carbamate 8 onto the shikimic acid-derived linker proceeds as follows. The allylic hydroxyl group of 3 is deprotected with TBAF, treated with triflic anhydride to form the secondary triflate, then displaced with aminocarbamate 8 to afford 9. Presence of the vinylic methyl group in 3 should assist in minimizing the amount of undesired product resulting from SN2′ addition (Magid Tetrahedron 36:1901, 1980; incorporated herein by reference). Michael additions of deprotonated carbamates to α,β-unsaturated esters have been well documented (Collado et al. Tetrahedron Lett. 35:8037, 1994; Hirama et al. J. Am. Chem. Soc. 107:1797, 1985; Nagasaka et al. Heterocycles 29:155, 1989; Shishido et al. J. Chem. Soc. Perkin Trans. I 993, 1987; Hirama et al. Heterocycles 28:1229, 1989; each of which is incorporated herein by reference). By analogy, the secondary amine is protected as the o-nitrobenzyl carbamate (NBOC), and the resulting compound is deprotonated at the carbamate nitrogen. This deprotonation can typically be performed with either sodium hydride or potassium tert-butoxide (Collado et al. Tetrahedron Lett. 35:8037, 1994; Hirama et al. J. Am. Chem. Soc. 107:1797, 1985; Nagasaka et al. Heterocycles 29:155, 1989; Shishido et al. J. Chem. Soc. Perkin Trans. I 993, 1987; Hirama et al. Heterocycles 28:1229, 1989; each of which is incorporated herein by reference), although other bases may be utilized to minimize deprotonation of the nitrobenzylic protons. Additions of the deprotonated carbamate to α,β-unsaturated ketone 10, followed by trapping of the resulting enolate with TBSCl, should afford silyl enol ether 11. The previously found stereoselectivity of conjugate additions to 5-substituted enones such as 10 (House et al. J. Org. Chem. 33:949, 1968; Still et al. Tetrahedron 37:3981, 1981; each of which is incorporated herein by reference) suggests preferential formation of 11 over its diastereomer.
Ketone 10, the precursor to the fluoride-cleavable carbamate-phosphate linker, may be synthesized from 2 by one pot decarboxylation (Barton et al. Tetrahedron 41:3901, 1985; incorporated herein by reference) followed by treatment with TBAF, Swern oxidation of the resulting alcohol to afford 12, deprotection with DDQ, selective nitrobenzyl ether formation of the less-hindered alcohol, and reduction of the α-hydroxyl group with samarium iodide (Molander In Organic Reactions, Paquette, Ed. 46:211, 1994; incorporated herein by reference).
The p-methoxybenzylidene group of 11 is transformed into the α-hydroxy PMB ether using sodium cyanoborohydride and TMS chloride (Johansson et al. J. Chem. Soc. Perkin Trans. I 2371, 1984; incorporated herein by reference) and the TES group deprotected with 2% HF (conditions that should not affect the TBS ether (Boschelli et al. Tetrahedron Lett. 26:5239, 1985; incorporated herein by reference)) to provide 13. The PMB group, following precedent (Johansson et al. J. Chem. Soc. Perkin Trans. I 2371, 1984; Sutherlin et al. Tetrahedron Lett. 34:4897, 1993; each of which is incorporated herein by reference), should remain on the more hindered secondary alcohol. The two free hydroxyl groups may be macrocyclized by very slow addition of 13 to a solution of p-nitrophenyl chloroformate (or another phosgene analog), providing 14. The PMB ether is deprotected, and the resulting alcohol is converted into a triflate and eliminated under kinetic conditions with a sterically hindered base to afford vinyloxycarbonate 15. Photodeprotection of the nitrobenzyl ether and nitrobenzyl carbamate yields alcohol 16.
The monomer synthesis is completed by the sequential coupling of three components. Chlorodiisopropylaminophosphine 17 is synthesized by the reaction of PCl3 with diisopropylamine (King et al. J. Org. Chem. 49:1784, 1984; incorporated herein by reference). Resin-bound (or 3′-o-nitrobenzylether protected) nucleoside 18 is coupled to 17 to afford phosphoramidite 19. Subsequent coupling of 19 with the nucleoside 20 (Inoue et al. J. Am. Chem. Soc. 103:7666, 1981; incorporated herein by reference) provides 21. Alcohol 16 is then reacted with 21 to yield, after careful oxidation using MCPBA or I2 followed by cleavage from the resin (or photodeprotection), the completed monomer 22. This strategy of sequential coupling of 17 with alcohols has been successfully used to generate phosphates bearing three different alkoxy substituents in excellent yields (Bannwarth et al. Helv. Chim. Acta 70:175, 1987; incorporated herein by reference).
The unique start and stop monomers used to initiate and terminate carbamate polymerization may be synthesized by simple modification of the above scheme.
B) Evolvable Functionalized Peptide-Nucleic Acids (PNAs): In another embodiment an amplifiable peptide-nucleic acid library is created. Orgel and co-workers have demonstrated that peptide-nucleic acid (PNA) oligomers are capable of efficient polymerization on complementary DNA or RNA templates (Böhler et al. Nature 376:578, 1995; Schmidt et al. Nucl. Acids Res. 25:4792, 1997; each of which is incorporated herein by reference). This finding, together with the recent synthesis and characterization of chiral peptide nucleic acids bearing amino acid side chains (Haaima et al. Angew. Chem. Int. Ed. Engl. 35:1939-1942, 1996; Püschl et al. Tetrahedron Lett. 39:4707, 1998; each of which is incorporated herein by reference), allows the union of the polymer backbone and the growing nucleic acid strand into a single structure. In this example, each template consists of a DNA hairpin terminating in a 5′ amino group; the solid-phase and solution syntheses of such molecules have been previously described (Uhlmann et al. Angew. Chem. Int. Ed. Engl. 35:2632, 1996; incorporated herein by reference). Each extension monomer consists of a PNA trimer (or longer) bearing side chains containing functionality of interest. An artificial genetic code is written to assign each trinucleotide to a different set of side chains. Assembly, activation (with a carbodiimide and appropriate leaving group, for example), and polymerization of the PNA dimers along the complementary DNA template in the carboxy- to amino-terminal direction affords the unnatural polymer (
The experimental approach towards implementing an evolvable functionalized peptide nucleic acid library comprises (i) improving and adapting known chemistry for the high efficiency template-directed polymerization of PNAs; (ii) defining a codon format (length and composition) suitable for PNA coupling of a number of diverse monomers on a complementary strand of encoding DNA free from significant infidelity, frameshifting, or spurious initiation of polymerization; (iii) choosing an initial set of side chains defining our new genetic code and synthesizing corresponding monomers; and (iv) subjecting a library of functionalized PNAs to cycles of selection, amplification, and mutation and characterizing the resulting evolved molecules to understand the basis of their novel activities.
(i) Improving coupling chemistry: While Orgel and coworkers have reported template-directed PNA polymerization, reported yields and number of successful couplings are significantly lower than would be desired. A promising route towards improving this key coupling process is exploring new coupling reagents, temperatures, and solvents which were not previously investigated (presumably because previous efforts focused on conditions which could have existed on prebiotic earth). The development of evolvable functionalized PNA polymers involves employing activators (DCC, DIC, EDC, HATU/DIEA, HBTU/DIEA, PyBOP/DIEA, chloroacetonitrile), leaving groups (2-methylimidazole, imidazole, pentafluorophenol, phenol, thiophenol, trifluoroacetate, acetate, toluenesulfonic acids, coenzyme A, DMAP, ribose), solvents (aqueous at several pH values, DMF, DMSO, chloroform, TFE), and temperatures (0° C., 4° C., 25° C., 37° C., 55° C.) in a large combinatorial screen to isolate new coupling conditions. Each well of a 384-well plate is assigned a specific combination of one activator, leaving group, solvent, and temperature. Solid-phase synthesis beads covalently linked to DNA hairpin templates are placed in each well, together with a fluorescently labeled PNA monomer complementary to the template. A successful coupling event results in the covalent linking of the fluorophore to the beads (
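The size of such a screen and one possible well assignment can be sketched as follows. The sketch treats "aqueous" as a single solvent entry because the individual pH values are not specified, and it assumes the conventional 16-row by 24-column layout of a 384-well plate; the plate numbering and well labels are illustrative only.

```python
from itertools import product
import string

ACTIVATORS = ["DCC", "DIC", "EDC", "HATU/DIEA", "HBTU/DIEA", "PyBOP/DIEA",
              "chloroacetonitrile"]
LEAVING_GROUPS = ["2-methylimidazole", "imidazole", "pentafluorophenol", "phenol",
                  "thiophenol", "trifluoroacetate", "acetate", "toluenesulfonic acid",
                  "coenzyme A", "DMAP", "ribose"]
# Assumption: "aqueous" is one entry; the separate pH values are not listed in the text.
SOLVENTS = ["aqueous", "DMF", "DMSO", "chloroform", "TFE"]
TEMPS_C = [0, 4, 25, 37, 55]

ROWS = string.ascii_uppercase[:16]  # rows A-P of a 384-well plate
COLS = 24
WELLS_PER_PLATE = len(ROWS) * COLS  # 384


def layout():
    """Map (plate number, well label) to one activator/leaving group/solvent/temperature set."""
    plan = {}
    combos = product(ACTIVATORS, LEAVING_GROUPS, SOLVENTS, TEMPS_C)
    for idx, combo in enumerate(combos):
        plate, pos = divmod(idx, WELLS_PER_PLATE)
        row, col = divmod(pos, COLS)
        plan[(plate + 1, f"{ROWS[row]}{col + 1}")] = combo
    return plan


PLAN = layout()
print(len(PLAN), "conditions on", max(plate for plate, _ in PLAN), "plates")
```

Under these assumptions the full factorial screen exceeds a single plate, so the conditions would be spread over several plates or a subset would be chosen.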
(ii) Defining a codon format: While Nature has successfully employed a triplet codon in protein biosynthesis, a new polymer assembled under very different conditions without the assistance of enzymes may require an entirely novel codon format. Frameshifting may be remedied by lengthening each codon such that hybridizing a codon out of frame guarantees a mismatch (for example, by starting each codon with a G and by restricting subsequent positions in the codon to T, C, and A). Thermodynamically, one would also expect fidelity to improve as codon length increases to a certain point. Codons that are excessively long, however, will be able to hybridize despite mismatched bases and moreover complicate monomer synthesis. An optimal codon length for high fidelity artificial translation can be defined using an optimized plate-based combinatorial screen developed above. The length and composition of each codon in the template is varied by solid-phase synthesis of the appropriate DNA hairpin. These template hairpins are then allowed to couple with fluorescently labeled PNA monomers of varying sequence. The ideal codon format allows only monomers bearing exactly complementary sequences to couple with templates, even in the presence of mismatched PNA monomers (which are labeled differently to facilitate assaying of matched versus mismatched coupling). Triplet and quadruplet codons in which two bases are varied among A, T, and C while the remaining base or bases are fixed as G to ensure proper registration during polymerization are first studied.
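The registration property of a leading-G codon format can be checked by enumeration. The sketch below builds every codon that begins with G and draws its remaining positions from A, C, and T, then confirms that no out-of-frame window in a string of such codons is itself a valid codon. The function names are hypothetical, and the check considers sequence identity only, not hybridization thermodynamics.

```python
from itertools import product


def codons(length, anchor="G", variable="ACT"):
    """All codons that start with the anchor base and vary only among the other bases."""
    return [anchor + "".join(rest) for rest in product(variable, repeat=length - 1)]


def is_valid_codon(window, anchor="G", variable="ACT"):
    """True if the window has the anchor first and only variable bases afterwards."""
    return window[0] == anchor and all(base in variable for base in window[1:])


def out_of_frame_hits(message_codons):
    """Start positions of out-of-frame windows that would still read as a valid codon."""
    seq = "".join(message_codons)
    n = len(message_codons[0])
    return [start for start in range(len(seq) - n + 1)
            if start % n != 0 and is_valid_codon(seq[start:start + n])]


TRIPLETS = codons(3)
print(len(TRIPLETS), "triplet codons")
# Any out-of-frame window spans at most two adjacent codons, so checking pairs suffices.
print(all(not out_of_frame_hits(pair) for pair in product(TRIPLETS, repeat=2)))
```

With two variable positions the triplet format gives nine codons, and every shifted reading frame places a non-G base where a G is required, so a mismatch is guaranteed.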
(iii) Writing a new genetic code: Side chains are chosen which provide interesting functionality not necessarily present in natural biopolymers, which are synthetically accessible, and which are compatible with coupling conditions. For example, a simple genetic code which might be used to evolve a Ni+2 chelating PNA consists of a variety of protected carboxylate-bearing side chains as well as a set of small side chains to equip polymers with conformational flexibility and structural diversity (
(iv) Selecting for desired unnatural polymers: Many of the methods developed for the selection of biological molecules can be applied to selections for evolved PNAs with desired properties. Like nucleic acid or phage-display selections, libraries of unnatural polymers generated by the DNA-templated polymerization methods described above are self-tagged and therefore do not need to be spatially separated or synthesized on pins or beads. Selection for Ni+2 binding PNA may be done simply by passing the entire library resulting from translation of a random oligonucleotide through commercially available Ni-NTA (“His-Tag”) resin precharged with nickel. Desired molecules bind to the resin and are eluted with EDTA. Sequencing these PNAs after several cycles of selection, mutagenesis, and amplification reveals which of the initially chosen side chains can assemble together to form a Ni+2 receptor. In addition, the isolation of a PNA Ni+2 chelator represents the PNA equivalent of a histidine tag which may prove useful for the purification of subsequent unnatural polymers. Later efforts will involve more ambitious selections. For example, PNAs that fluoresce in the presence of specific ligands may be selected by FACS sorting of translated polymers linked through their DNA templates to beads. Those beads that fluoresce in the presence, but not in the absence, of the target ligand are isolated and characterized. Finally, the use of a biotinylated “stop” monomer as described above allows for the direct selection for the catalysis of many bond-forming or bond-breaking reactions. Two examples depicted in
C) Evolvable Libraries of Small Molecules: In yet another embodiment of the present invention, the inventive methods are used in preparing amplifiable and evolvable unnatural nonpolymeric molecules including synthetic drug scaffolds. Nucleophilic or electrophilic groups are individually unmasked on a small molecule scaffold attached by simple covalent linkage or through a common solid support to an encoding oligonucleotide template. Electrophilic or nucleophilic reactants linked to short nucleic acid sequences are hybridized to the corresponding templates. Sequence-specific reaction with the appropriate reagent takes place by proximity catalysis (
Following synthetic functionalization of all positions in a manner determined by the sequence of the attached DNA (
Encoding DNA is cleaved from each bead identified in the screen and subjected to PCR, mutagenesis, sequencing, or homologous recombination before reattachment to a solid support. Ultimately, this system is most flexible when the encoding DNA is directly linked to the combinatorial synthetic scaffold without an intervening bead. In this case, entire libraries of compounds may be screened or selected for desired activities, their encoding DNA liberated, amplified, mutated, and recombined, and new compounds synthesized all in a small series of one-pot, massively parallel reactions. Without a bead support, however, reactivities of hybridized reactants must be highly efficient since only one template molecule directs the synthesis of the entire small molecule.
The development of evolvable synthetic small molecule libraries relies on chemical catalysis provided by the proximity of DNA-hybridized reactants. It will be appreciated that acceptable distances between hybridized reactants and unmasked reactive groups must first be defined for efficient DNA-templated functionalization by hybridizing radiolabeled electrophiles (activated esters in our first attempts) attached to short oligonucleotides at varying distances from a reactive nucleophile (a primary amine) on a strand of DNA. At given timepoints, aliquots are subjected to gel electrophoresis and autoradiography to monitor the course of the reaction. Plotting the reaction as a function of the distance (in bases) between the nucleophile and electrophile will define an acceptable distance window within which proximity-based catalysis of a DNA-hybridized reaction can take place. The width of this window will determine the number of distinct reactions we can encode on a strand of DNA (a larger window allows more reactions) as well as the nature of the codons (a larger window is required for longer codons) (
Once acceptable distances between functional groups on a combinatorial synthetic scaffold and hybridized reactants are determined, the codon format is determined. The nonpolymeric nature of small molecule synthesis simplifies codon reading, as frameshifting is not an issue and relatively large codons may be used to ensure that each set of reactants hybridizes only to one region of the encoding DNA strand.
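The requirement that each set of reactants hybridize to only one region of the encoding strand can be illustrated with a minimal sketch. The template sequence, the codon sequences, and the function names below are hypothetical and are not taken from the specification; only perfect Watson-Crick pairing is considered, so partial or mismatched annealing is not modeled.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def count_sites(template: str, reagent_oligo: str) -> int:
    """Count template positions where the reagent oligo can pair perfectly."""
    site = reverse_complement(reagent_oligo)
    return sum(
        1
        for i in range(len(template) - len(site) + 1)
        if template[i:i + len(site)] == site
    )


# Hypothetical 20-base template (assumption for illustration only).
template = "GCTAGGATCCAATTCGTACG"

# A relatively large (8-base) codon finds a single site on the template,
# whereas a very short (2-base) codon finds more than one.
long_oligo = reverse_complement("GGATCCAA")
short_oligo = reverse_complement("CG")

print(count_sites(template, long_oligo))
print(count_sites(template, short_oligo))
```

In this illustration the longer codon is unambiguous while the shorter one is not, which is the reason relatively large codons are favored when frameshifting is not a concern.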
Once the distance of the linker between the functional group and synthetic small molecule scaffold and the codon format have been determined, one can synthesize small molecules based on a small molecule scaffold such as the cephalosporin scaffold shown in
D) Multi-Step Small Molecule Synthesis Programmed by DNA Templates: Molecular evolution requires the sequence-specific translation of an amplifiable information carrier into the structures of the evolving molecules. This requirement has limited the types of molecules that have been directly evolved to two classes, proteins and nucleic acids, because only these classes of molecules can be translated from nucleic acid sequences. As described generally above, a promising approach to the evolution of molecules other than proteins and nucleic acids uses DNA-templated synthesis as a method of translating DNA sequences into synthetic small molecules. DNA-templated synthesis can direct a wide variety of powerful chemical reactions with high sequence-specificity and without requiring structural mimicry of the DNA backbone. The application of this approach to synthetic molecules of useful complexity, however, requires the development of general methods to enable the product of a DNA-templated reaction to undergo subsequent DNA-templated transformations. The first DNA-templated multi-step small molecule syntheses are described in detail herein. Together with recent advances in the reaction scope of DNA-templated synthesis, these findings set the stage for the in vitro evolution of synthetic small molecule libraries.
Multi-step DNA-templated small molecule synthesis faces two major challenges beyond those associated with DNA-templated synthesis in general. First, the DNA used to direct reagents to appropriate templates must be removed from the product of a DNA-templated reaction prior to subsequent DNA-templated synthetic steps in order to prevent undesired hybridization to the template. Second, multi-step synthesis often requires the purification and isolation of intermediate products, yet common methods used to purify and isolate reaction products are not appropriate for multi-step synthesis on the molecular biology scale. To address these challenges, three distinct strategies implemented in solid-phase organic synthesis were adopted for linking chemical reagents with their decoding DNA oligonucleotides, and two general approaches for product purification after any DNA-templated synthetic step were developed.
When possible, an ideal reagent-oligonucleotide linker for DNA-templated synthesis positions the oligonucleotide as a leaving group of the reagent. Under this “autocleaving” linker strategy, the oligonucleotide-reagent bond is cleaved as a natural chemical consequence of the reaction (
Reagents bearing more than one functional group can be linked to their decoding DNA oligonucleotides through second and third linker strategies. In the “scarless linker” approach, one functional group of the reagent is reserved for DNA-templated bond formation, while the second functional group is used to attach a linker that can be cleaved without introducing additional unwanted chemical functionality. The DNA-templated reaction is followed by cleavage of the linker attached through the second functional group to afford desired products (
In some cases it may be advantageous to introduce new chemical groups as a consequence of linker cleavage. Under a third linker strategy, linker cleavage generates a “useful scar” that can be functionalized in subsequent steps. As an example of this class of linker, amino acid reagents such as the (L)-Phe derivative 10 were generated linked through 1,2-diols (Fruchart et al. Tetrahedron Lett. 1999, 40, 6225) to their decoding DNA oligonucleotides. Following DNA-templated amide bond formation with amine-terminated template (5), this linker was quantitatively cleaved by oxidation with 50 mM aqueous NaIO4 at pH 5.0 to afford product 12 containing an aldehyde group appropriate for subsequent functionalization (for example, in a DNA-templated Wittig olefination, reductive amination, or nitroaldol addition (
Desired products generated from DNA-templated reactions using the scarless or useful scar linkers can be readily purified using biotinylated reagent oligonucleotides (
Integrating the recently expanded repertoire of synthetic reactions compatible with DNA-templated synthesis and the linker strategies described above, multi-step DNA-templated small molecule syntheses can be conducted.
In one embodiment, a solution phase DNA-templated synthesis of a non-natural peptide library is described generally below and is shown generally in
It will be appreciated that a virtually unlimited assortment of amino acid building blocks can be incorporated into a non-natural peptide library. Unlike peptide libraries generated using the protein biosynthetic machinery such as phage displayed libraries (O'Neil et al. Curr. Opin. Struct. Biol. 1995, 5, 443-9), mRNA displayed libraries (Roberts et al. Proc. Natl. Acad. Sci. USA 1997, 94, 12297-12302), ribosome displayed libraries (Roberts et al. Curr. Opin. Chem. Biol. 1999, 3, 268-73; Schaffitzel et al. J. Immunol. Methods 1999, 231, 119-35), or intracellular peptide libraries (Norman et al. Science 1999, 285, 591-5), amino acids with non-proteinogenic side chains, non-natural side chain stereochemistry, or non-peptidic backbones can all be incorporated into this library. In addition, the many commercially available di-, tri- and oligopeptides can also be used as building blocks to generate longer library members. The presence of non-natural peptides in this library may confer enhanced pharmacological properties such as protease resistance compared with peptides generated ribosomally. Similarly, the macrocyclic library members may yield higher affinity ligands since the entropy loss upon binding their targets may be less than that of their more flexible linear counterparts. Based on the enormous variety of commercially available amino acids fitting these descriptions, the maximum diversity of this non-natural cyclic and linear tetrapeptide library can exceed 100×100×100×100=10^8 members.
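The diversity figure above is simply the product of the number of building blocks available at each position. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the value of 100 building blocks per position is the illustrative figure used in the text.

```python
from math import prod


def library_diversity(choices_per_position):
    """Number of distinct library members when positions vary independently."""
    return prod(choices_per_position)


# Four positions with 100 candidate amino acid building blocks each.
tetrapeptide_members = library_diversity([100, 100, 100, 100])
print(tetrapeptide_members)  # 100000000, i.e. 10^8
```

The same function applies when the positions draw on building block sets of different sizes, for example when one position is restricted to a smaller set of reagents.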
Another example of a library using the approach described above includes the DNA-templated synthesis of a diversity-oriented macrobicyclic library containing 5- and 14-membered rings (
As but one example of a specific library generated from the first general approach described above, three iterated cycles of DNA-templated amide formation, traceless linker cleavage, and purification with streptavidin-linked beads were used to generate a non-natural tripeptide (
The progress of each reaction, purification, and sulfone linker cleavage step was followed by denaturing polyacrylamide gel electrophoresis. The final tripeptide linked to template (16) was digested with the restriction endonuclease EcoRI and the digestion fragment containing the tripeptide was characterized by MALDI mass spectrometry. Beginning with 2 nmol (˜20 μg) of starting material, sufficient tripeptide product was generated to serve as the template for more than 10^6 in vitro selections and PCR reactions (Kramer et al. in Current Protocols in Molecular Biology, Vol 3 (Ed.: F. M. Ausubel), Wiley, 1999, pp. 15.1) (assuming 1/10,000 molecules survive selection). No significant product was generated when the starting material template was capped with acetic anhydride, or when control reagents containing sequence mismatches were used instead of the complementary reagents (
A non-peptidic multi-step DNA-templated small molecule synthesis (
The commercial availability of many substrates for DNA-templated reactions including amines, carboxylic acids, α-halo carbonyl compounds, olefins, alkoxyamines, aldehydes, and nitroalkanes may allow the translation of large libraries of DNA into diverse small molecule libraries. The direct one-pot selection of these libraries for members with desired binding or catalytic activities, followed by the PCR amplification and diversification of the DNA encoding active molecules, may enable synthetic small molecules to evolve in a manner paralleling the powerful methods Nature uses to generate new molecular function. In addition, multi-step nucleic acid-templated synthesis is a requirement of previously proposed models (A. I. Scott, Tetrahedron Lett. 1997, 38, 4961; Li et al. Nature 1994, 369, 218; Tamura et al. Proc. Natl. Acad. Sci. USA 2001, 98, 1393) for the prebiotic translation of replicable information into functional molecules. These findings demonstrate that nucleic acid templates are indeed capable of directing iterative or non-iterative multi-step small molecule synthesis even when reagents anneal at widely varying distances from the growing molecule (in the above examples, zero to twenty bases). As described in more detail below, libraries of synthetic molecules can then be evolved towards active ligands and catalysts through cycles of translation, selection, amplification and mutagenesis.
E) Evolving Plastics: In yet another embodiment of the present invention, a nucleic acid (e.g., DNA, RNA, derivative thereof) is attached to a polymerization catalyst. Since nucleic acids can fold into complex structures, the nucleic acid can be used to direct and/or affect the polymerization of a growing polymer chain. For example, the nucleic acid may influence the selection of monomer units to be polymerized as well as how the polymerization reaction takes place (e.g., stereochemistry, tacticity, activity). The synthesized polymers may be selected for specific properties such as molecular weight, density, hydrophobicity, tacticity, stereoselectivity, etc., and the nucleic acid which formed an integral part of the catalyst which directed its synthesis may be amplified and evolved (
To give but one example, a library of DNA molecules is attached to Grubbs' ruthenium-based ring opening metathesis polymerization (ROMP) catalyst through a dihydroimidazole ligand (Scholl et al. Org. Lett. 1(6):953, 1999; incorporated herein by reference) creating a large, diverse pool of potential catalytic molecules, each unique by nature of the functionalized ligand. Undoubtedly, functionalizing the catalyst with a relatively large DNA-dihydroimidazole (DNA-DHI) ligand will alter the activity of the catalyst. Each DNA molecule has the potential to fold into a unique stereoelectronic shape which potentially has different selectivities and/or activities in the polymerization reaction (
A polymer is subsequently selected from the library based on a desired property by electrophoresis, gel filtration, centrifugal sedimentation, partitioning into solvents of different hydrophobicities, etc. Amplification and diversification of the coding nucleic acid via techniques such as error-prone PCR or DNA shuffling followed by attachment to a DHI backbone will allow for production of another pool of potential ROMP catalysts enriched in the selected activity (
Characterization of DNA-Templated Synthetic Small Molecule Libraries: The non-natural peptide and bicyclic libraries described above are characterized in several stages. Each candidate reagent is conjugated to its decoding DNA oligonucleotide, then subjected to model reactions with matched and mismatched templates. The products from these reactions are analyzed by denaturing polyacrylamide gel electrophoresis to assess reaction efficiency, and by mass spectrometry to verify anticipated product structures. Once a complete set of robust reagents is identified, the complete multi-step DNA-templated syntheses of representative single library members are performed on a large scale and the final products are characterized by mass spectrometry.
More specifically, the sequence fidelity of each multi-step DNA-templated library synthesis is tested by following the fate of single chemically labeled reagents through the course of one-pot library synthesis reactions. For example, products arising from building blocks bearing a ketone group are captured with commercially available hydrazide-linked resin and analyzed by DNA sequencing to verify sequence fidelity during DNA-templated synthesis. Similarly, when using non-biotinylated model templates, building blocks bearing biotin groups are purified after DNA-templated synthesis using streptavidin magnetic beads and subjected to DNA sequencing (Liu et al. J. Am. Chem. Soc. 2001, 123, 6961-6963). Codons that show a greater propensity to anneal with mismatched DNA are identified by screening in this manner and removed from the genetic code of these synthetic libraries.
In Vitro Selection of Protein Ligands from Evolvable Synthetic Libraries: Because every library member generated in this approach is covalently linked to a DNA oligonucleotide that encodes and directs its synthesis, libraries can be subjected to true in vitro selections. Although direct selections for small molecule catalysts of bond-forming or bond-cleaving reactions are an exciting potential application of this approach, the simplest in vitro selection that can be used to evolve these libraries is a selection for binding to a target protein. An ideal initial target protein for the synthetic library selection both plays an important biological role and possesses known ligands of varying affinities for validating the selection methods.
One receptor of special interest for use in the present invention is the αvβ3 receptor. The αvβ3 receptor is a member of the integrin family of transmembrane heterodimeric glycoprotein receptors (Miller et al. Drug Discov Today 2000, 5, 397-408; Berman et al. Membr Cell Biol. 2000, 13, 207-44). The αvβ3 integrin receptor is expressed on the surface of many cell types such as osteoclasts, vascular smooth muscle cells, endothelial cells, and some tumor cells. This receptor mediates several important biological processes including adhesion of osteoclasts to the bone matrix (van der Pluijm et al. J. Bone Miner. Res. 1994, 9, 1021-8), smooth muscle cell migration (Choi et al. J. Vasc. Surg. 1994, 19, 125-34), and tumor-induced angiogenesis (the outgrowth of new blood vessels) (Brooks et al. Cell 1994, 79, 1157-64). During tumor-induced angiogenesis, invasive endothelial cells bind to extracellular matrix components through their αvβ3 integrin receptors. Several studies (Brooks et al. Cell 1994, 79, 1157-64; Brooks et al. Cell 1998, 92, 391-400; Friedlander et al. Science 1995, 270, 1500-2; Varner et al. Cell Adhes Commun 1995, 3, 367-74; Brooks et al. J Clin Invest 1995, 96, 1815-22) have demonstrated that the inhibition of this integrin binding event with antibodies or small synthetic peptides induces apoptosis of the proliferative angiogenic vascular cells and can inhibit tumor metastasis.
A number of peptide ligands of varying affinities and selectivities for the αvβ3 integrin receptor have been reported. Two benchmark αvβ3 integrin antagonists are the linear peptide GRGDSPK (IC50=210 nM; Dechantsreiter et al. J. Med. Chem. 1999, 42, 3033-40; Pfaff et al. J. Biol. Chem. 1994, 269, 20233-8) and the cyclic peptide cyclo-RGDfV (f=(D)-Phe, IC50=10 nM; Pfaff et al. J. Biol. Chem. 1994, 269, 20233-8). While peptide antagonists for integrins commonly contain RGD, not all RGD-containing peptides are high affinity integrin ligands. Rather, the conformational context of RGD and other peptide sequences can have a profound effect on integrin affinity and specificity (Wermuth et al. J. Am. Chem. Soc. 1997, 119, 1328-1335; Geyer et al. J. Am. Chem. Soc. 1994, 116, 7735-7743; Rai et al. Bioorg. Med. Chem. Lett 2001, 11, 1797-800; Rai et al. Curr. Med. Chem. 2001, 8, 101-19). For this reason, combinatorial approaches towards αvβ3 integrin receptor antagonist discovery are especially promising.
The biologically important and medicinally relevant role of the αvβ3 integrin receptor together with its known peptide antagonists and its commercial availability (Chemicon International, Inc., Temecula, Calif.) make the αvβ3 integrin receptor an ideal initial target for DNA-templated synthetic small molecule libraries. The αvβ3 integrin receptor can be immobilized by adsorption onto microtiter plate wells without impairing its ligand binding ability or specificity (Dechantsreiter et al. J. Med. Chem. 1999, 42, 3033-40; Wermuth et al. J. Am. Chem. Soc. 1997, 119, 1328-1335; Haubner et al. J. Am. Chem. Soc. 1996, 118, 7461-7472). Alternatively, the receptor can be immobilized by conjugation with NHS ester or maleimide groups covalently linked to sepharose beads and the ability of the resulting integrin affinity resin to maintain known ligand binding properties can be verified.
To perform the actual protein binding selections, DNA template-linked synthetic peptide or macrocyclic libraries are dissolved in aqueous binding buffer in one pot and equilibrated in the presence of immobilized αvβ3 integrin receptor. Non-binders are washed away with buffer. Those molecules that may be binding through their attached DNA templates rather than through their synthetic moieties are eliminated by washing the bound library with unfunctionalized DNA templates lacking PCR primer binding sites. Remaining ligands bound to the immobilized αvβ3 integrin receptor are eluted by denaturation or by the addition of excess high affinity RGD-containing peptide ligand. The DNA templates that encode and direct the syntheses of αvβ3 integrin binders are amplified by PCR using one primer designed to bind to a constant 3′ region of the template and one pool of biotinylated primers functionalized at its 5′ end with the library starting materials (
For reasons similar to those that make the αvβ3 integrin receptor an attractive initial target for the approach to generating synthetic molecules with desired properties, the factor Xa serine protease also serves as a promising protein target. Blood coagulation involves a complex cascade of enzyme-catalyzed reactions that ultimately generate fibrin, the basis of blood clots (Rai et al. Curr. Med. Chem. 2001, 8, 101-109; Vacca et al. Curr. Opin. Chem. Biol. 2000, 4, 394-400). Thrombin is the serine protease that converts fibrinogen into fibrin during blood clotting. Thrombin, in turn, is generated by the proteolytic action of factor Xa on prothrombin. Because thromboembolic (blood clotting) diseases such as stroke remain a leading cause of death in the world (Vacca et al. Curr. Opin. Chem. Biol. 2000, 4, 394-400), the development of drugs that inhibit thrombin or factor Xa is a major area of pharmaceutical research. The inhibition of factor Xa is a newer approach thought to avoid the side effects associated with inhibiting thrombin, which is also involved in normal hemostasis (Maignan et al. J. Med. Chem. 2000, 43, 3226-32; Leadley et al. J Cardiovasc. Pharmacol. 1999, 34, 791-9; Becker et al. Bioorg. Med. Chem. Lett. 1999, 9, 2753-8; Choi-Sledeski et al. Bioorg. Med. Chem. Lett. 1999, 9, 2539-44; Choi-Sledeski et al. J. Med. Chem. 1999, 42, 3572-87; Ewing et al. J. Med. Chem. 1999, 42, 3557-71; Bostwick et al. Thromb Haemost 1999, 81, 157-60). Although many agents including heparin, hirudin, and hirulog have been developed to control the production of thrombin, these agents generally have the disadvantage of requiring intravenous or subcutaneous injection several times a day in addition to possible side effects, and the search for synthetic small molecule factor Xa inhibitors remains the subject of great research effort.
Among factor Xa inhibitors with known binding affinities are a series of tripeptides ending with arginine aldehyde (Marlowe et al. Bioorg. Med. Chem. Lett. 2000, 10, 13-16) that can easily be included in the DNA-templated non-natural peptide library described above. Depending on the identities of the first two residues, these tripeptides exhibit IC50 values ranging from 15 nM to 60 μM (Marlowe et al. Bioorg. Med. Chem. Lett. 2000, 10, 13-16) and therefore provide ideal positive controls for validating and calibrating an in vitro selection for synthetic factor Xa ligands (see below). Both factor Xa and active factor Xa immobilized on resin are commercially available (Protein Engineering Technologies, Denmark). The resin-bound factor Xa is used to select members of both the DNA-templated non-natural peptide and bicyclic libraries with factor Xa affinity in a manner analogous to the integrin receptor binding selections described above.
Following PCR amplification of DNA templates encoding selected synthetic molecules, additional rounds of translation, selection, and amplification are conducted to enrich the library for the highest affinity binders. The stringency of the selection is gradually increased by increasing the salt concentration of the binding and washing buffers, decreasing the duration of binding, elevating the binding and washing temperatures, and increasing the concentration of washing additives such as template DNA or unrelated proteins. Importantly, in vitro selections can also select for specificity in addition to binding affinity. To eliminate those molecules that possess undesired binding properties, library members bound to immobilized αvβ3 integrin or factor Xa are washed with non-target proteins such as other integrins or other serine proteases, leaving only those molecules that bind the target protein but do not bind non-target proteins.
Iterated cycles of translation, selection, and amplification result in library enrichment rather than library evolution, which requires diversification between rounds of selection. Diversification of these synthetic libraries is achieved in at least two ways, both analogous to methods used by Nature to diversify proteins. Random point mutagenesis is performed by conducting the PCR amplification step under error-prone PCR (Caldwell et al. PCR Methods Applic. 1992, 2, 28-33) conditions. Because the genetic code of these molecules is written to assign related codons to related chemical groups, similar to the way that the natural protein genetic code is constructed, random point mutations in the templates encoding selected molecules will diversify progeny towards chemically related analogs. In addition to point mutagenesis, synthetic libraries generated in this approach are also diversified using recombination. Templates to be recombined have the structure shown in
Small molecule evolution using mutation and recombination offers two potential advantages over simple enrichment. If the total diversity of the library is much less than the number of molecules made (typically 10^12 to 10^15), every possible library member is present at the start of the selection. In this case, diversification is still useful because selection conditions almost always change as rounds of evolution progress. For example, later rounds of selection will likely be conducted under higher stringencies, and may involve counter selections against binding non-target proteins. Diversification gives library members that have been discarded during earlier rounds of selection the chance to reappear in later rounds under altered selection conditions in which their fitness relative to other members may be greater. In addition, it is quite possible using this approach to generate a synthetic library that has a theoretical diversity greater than 10^15 molecules. In this case, diversification allows molecules that never existed in the original library to emerge in later rounds of selections on the basis of their similarity to selected molecules, similar to the way in which protein evolution searches the vastness of protein sequence space one small subset at a time.
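The two regimes described above (theoretical diversity well below, or above, the number of molecules made) can be illustrated with a short calculation. This is a minimal sketch that assumes every library member is equally likely to be synthesized, so that the number of copies of a given member is approximately Poisson distributed; that assumption and the function names are illustrative and are not part of the specification.

```python
import math


def expected_copies(molecules_made: float, diversity: float) -> float:
    """Average number of copies of each library member."""
    return molecules_made / diversity


def fraction_missing(molecules_made: float, diversity: float) -> float:
    """Poisson estimate of the fraction of members with zero copies."""
    return math.exp(-molecules_made / diversity)


# Diversity of 10^8 with 10^12 molecules made: essentially complete coverage.
print(expected_copies(1e12, 1e8), fraction_missing(1e12, 1e8))

# Diversity of 10^16 with 10^15 molecules made: most members are absent.
print(expected_copies(1e15, 1e16), fraction_missing(1e15, 1e16))
```

Under these assumptions, a library whose diversity exceeds the number of molecules made starts with most members absent, which is the case in which diversification lets previously nonexistent molecules emerge in later rounds.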
Characterization of Evolved Compounds: Following multiple rounds of selection, amplification, diversification, and translation, molecules surviving the selection will be characterized for their ability to bind the target protein. To identify the DNA sequences encoding evolved synthetic molecules surviving the selection, PCR-amplified templates are cloned into vectors, transformed into cells, and sequenced as individual clones. DNA sequencing of these subcloned templates reveals the identity of the synthetic molecules surviving the selection. To gain general information about the functional groups being selected during rounds of evolution, populations of templates are sequenced in pools to reveal the distribution of A, G, T, and C at every codon position. The judicious design of each functional group's genetic code allows considerable information to be gathered from population sequencing. For example, a G at the first position of a codon may designate a charged group, while a C at this position may encode a hydrophobic substituent.
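The population-level analysis described above amounts to tabulating base frequencies at each codon position across a pool of templates. A minimal sketch follows; the pool of three-base codons is invented for illustration, and the function name and the assumption that all templates are aligned and of equal length are hypothetical.

```python
from collections import Counter


def base_distribution(templates, codon_start, codon_length):
    """Fraction of A, G, T and C at each position of one codon across a pool."""
    distribution = []
    for offset in range(codon_length):
        counts = Counter(t[codon_start + offset] for t in templates)
        total = sum(counts.values())
        distribution.append({base: counts.get(base, 0) / total for base in "AGTC"})
    return distribution


# Hypothetical pool of selected templates, each holding a single 3-base codon.
pool = ["GAT", "GCA", "GTT", "CAT"]

first_position = base_distribution(pool, codon_start=0, codon_length=3)[0]
print(first_position)
```

If, as in the example given in the text, a G at the first codon position designates a charged group, a pool in which G predominates at that position would suggest that charged groups are being favored by the selection.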
To validate the integrin binding selection and to compare selected library members with known αvβ3 integrin ligands, linear GRGDSPK and a cyclic RGDfV analog (cyclic iso-ERGDfV) are also included in the DNA-templated cyclic peptide library. The selection conditions are adjusted until verification that libraries containing these known integrin ligands undergo enrichment of the DNA templates encoding the known ligands upon selection for integrin binding. In addition, the degree of enrichment of template sequences encoding these known αvβ3 integrin ligands is correlated with their known affinities and with the enrichment and affinity of newly discovered αvβ3 integrin ligands.
Once the enrichment of template sequences encoding known and new integrin ligands is confirmed, novel evolved ligands will be synthesized by non-DNA-templated synthesis and assayed for their αvβ3 integrin receptor antagonist activity and specificity. Standard in vitro binding assays to integrin receptors (Dechantsreiter et al. J. Med. Chem. 1999, 42, 3033-40) are performed by competing the binding of biotinylated fibrinogen (a natural integrin ligand) to immobilized integrin receptor with the ligand to be assayed. The inhibition of binding to fibrinogen is quantitated by incubation with an alkaline phosphatase-conjugated anti-biotin antibody and a chromogenic alkaline phosphatase substrate. Comparison of the binding affinities of randomly chosen library members before and after selection will validate the evolution of the library towards target binding. Assays for binding non-target proteins reveal the ability of these libraries to be evolved towards binding specificity in addition to binding affinity.
Similarly, the selection for factor Xa binding is validated by including the known factor Xa tripeptide inhibitors in the library design and verifying that a round of factor Xa binding selection and PCR amplification results in the enrichment of their associated DNA templates. Synthetic library members evolved to bind factor Xa are assayed in vitro for their ability to inhibit factor Xa activity. Factor Xa inhibition can be readily assayed spectrophotometrically using the commercially available chromogenic substrate S-2765 (Chromogenix, Italy).
While the DNA sequence alone of a non-natural peptide library member is likely to reveal the exact identity of the corresponding peptide, the final step in the bicyclic library synthesis is a non-DNA-templated intramolecular 1,3-dipolar cycloaddition that may yield diastereomeric pairs of regioisomers. Although modeling strongly suggests that only the regioisomer shown in
Translating DNA into Non-Natural Polymers Using DNA Polymerases: An alternative approach to translating DNA into non-natural, evolvable polymers takes advantage of the ability of some DNA polymerases to accept certain modified nucleotide triphosphate substrates (D. M. Perrin et al. J. Am. Chem. Soc. 2001, 123, 1556; D. M. Perrin et al. Nucleosides Nucleotides 1999, 18, 377-91; T. Gourlain et al. Nucleic Acids Res. 2001, 29, 1898-1905; S. E. Lee et al. Nucleic Acids Res. 2001, 29, 1565-73; K. Sakthivel et al. Angew. Chem. Int. Ed. 1998, 37, 2872-2875). Several deoxyribonucleotides (
The functionalized nucleotides incorporated by DNA polymerases to date, shown in
In contrast with simple general acid and general base functionality, chiral metal centers would expand considerably the chemical scope of nucleic acids. Functionality aimed at binding chemically potent metal centers has yet to be incorporated into nucleic acid polymers. Natural DNA has demonstrated the ability to fold into complex three-dimensional structures capable of stereospecifically binding target molecules (C. H. Lin et al. Chem. Biol. 1997, 4, 817-32; C. H. Lin et al. Chem. Biol. 1998, 5, 555-72; P. Schultze et al. J. Mol. Biol. 1994, 235, 1532-47) or catalyzing phosphodiester bond manipulation (S. W. Santoro et al. Proc. Natl. Acad. Sci. USA 1997, 94, 4262-6; R. R. Breaker et al. Chem. Biol. 1995, 2, 655-60; Y. Li et al. Biochemistry 2000, 39, 3106-14; Y. Li et al. Proc. Natl. Acad. Sci. USA 1999, 96, 2746-51), DNA depurination (T. L. Sheppard et al. Proc. Natl. Acad. Sci. USA 2000, 97, 7802-7807), and porphyrin metallation (Y. Li et al. Biochemistry 1997, 36, 5589-99; Y. Li et al. Nat. Struct. Biol. 1996, 3, 743-7). Non-natural nucleic acids augmented with the ability to bind chemically potent, water-compatible metals such as Cu, La, Ni, Pd, Rh, Ru, or Sc may possess greatly expanded catalytic properties. For example, a Pd-binding oligonucleotide folded into a well-defined structure may possess the ability to catalyze Pd-mediated coupling reactions with a high degree of regiospecificity or stereospecificity. Similarly, non-natural nucleic acids that form chiral Sc binding sites may serve as enantioselective cycloaddition or aldol addition catalysts. The ability of DNA polymerases to translate DNA sequences into these non-natural polymers coupled with in vitro selections for catalytic activities would therefore enable the direct evolution of desired catalysts from random libraries.
Evolving catalysts in this approach addresses the difficulty of rationally designing catalytic active sites with specific chemical properties that has inspired recent combinatorial approaches (K. W. Kuntz et al. Curr. Opin. Chem. Biol. 1999, 3, 313-319; M. B. Francis et al. Curr. Opin. Chem. Biol. 1998, 2, 422-8) to organometallic catalyst discovery. For example, Hoveyda and co-workers identified Ti-based enantioselective epoxidation catalysts by serial screening of peptide ligands (K. D. Shimizu et al. Angew. Chem. Int. Ed. 1997, 36). Serial screening was also used by Jacobsen and co-workers to identify peptide ligands that form enantioselective epoxidation catalysts when complexed with metal cations (M. B. Francis et al. Angew. Chem. Int. Ed. Engl. 1999, 38, 937-941). Recently, a peptide library containing phosphine side chains was screened for the ability to catalyze malonate ester addition to cyclopentenyl acetate in the presence of Pd (S. R. Gilbertson et al. J. Am. Chem. Soc. 2000, 122, 6522-6523). The current approach differs fundamentally from previous combinatorial catalyst discovery efforts, however, in that it enables catalysts with desired properties to spontaneously emerge from one-pot, solution-phase libraries after evolutionary cycles of diversification, amplification, translation, and selection. This strategy allows up to 10^15 different catalysts to be generated and selected for desired properties in a single experiment. The compatibility of our approach with one-pot in vitro selections allows the direct selection for reaction catalysis rather than screening for a phenomenon associated with catalysis such as metal binding or heat generation. In addition, properties difficult to screen rapidly such as substrate stereospecificity or metal selectivity can be directly selected using our approach (see below).
Key intermediates for a number of C5-functionalized uridine analogs and C7-functionalized 7-deazaadenosine analogs have been synthesized for incorporation into non-natural DNA polymers. In addition, the synthesis of six C8-functionalized adenosine analogs as deoxyribonucleotide triphosphates has been completed. Because only limited information exists on the ability of DNA polymerases to accept modified nucleotides, we chose to synthesize analogs that not only will bring metal-binding functionality to nucleic acids but also will provide insights into the determinants of DNA polymerase acceptance.
The strategy for the synthesis of metal-binding uridine and 7-deazaadenosine analogs is shown in
Several steps have been completed towards the synthesis of 13, the key intermediate for 7-deazaadenosine analogs (
As alternative functionalized adenine analogs that will both probe the structural requirements of DNA polymerase acceptance and provide potential metal-binding functionality, six 8-modified deoxyadenosine triphosphates (
The ability of thermostable DNA polymerases suitable for PCR amplification to accept these modified nucleotide triphosphates containing metal-binding functionality was examined. Non-natural nucleotide triphosphates were purified by ion exchange HPLC and added to PCR reactions containing Taq DNA polymerase, three natural deoxynucleotide triphosphates, pUC19 template DNA, and two DNA primers. Primers were chosen to generate PCR products ranging from 50 to 200 base pairs in length. Control PCR reactions contained the four natural deoxynucleotide triphosphates and no non-natural nucleotides. PCR reactions were analyzed by agarose or denaturing acrylamide gel electrophoresis. Amino-modified uridine derivative 7 was efficiently incorporated by Taq DNA polymerase over 30 PCR cycles, while the triphosphate of 23 was not an efficient polymerase substrate (
Non-Natural Metal-Binding Deoxyribonucleotide Triphosphate Synthesis: The syntheses of the C5-functionalized uridine, C7-functionalized 7-deazaadenosine, and C8-functionalized adenosine deoxynucleotide triphosphates will be completed. Synthesis of the 7-deazaadenosine derivatives from 4-chloro-7-iodo-deazaadenine (30) proceeds by glycosylation of 30 with protected deoxyribosyl chloride 38 followed by ammonolysis to afford 7-iodo-adenosine (39) (
To generate rapidly a collection of metal-binding uridine and adenosine analogs, a variety of metal-binding groups as NHS esters will be coupled to C5-modified uridine intermediate 7 (already synthesized) and C7-modified 7-deazaadenosine intermediate 13. Metal-binding groups that will be examined initially are shown in
Evaluating Non-Natural Nucleotides: Each functionalized deoxyribonucleotide triphosphate is then assayed in two stages for its suitability as a building block of an evolvable non-natural polymer library. First, simple acceptance by thermostable DNA polymerases is measured by PCR amplification of fragments of DNA plasmid pUC19 of varying length. PCR reactions contain synthetic primers designed to bind at the ends of the fragment, a small quantity of pUC19 template DNA, a thermostable DNA polymerase (Taq, Pfu or Vent), three natural deoxyribonucleotide triphosphates, and the non-natural nucleotide triphosphate to be tested. The completely successful incorporation of the non-natural nucleotide results in the production of DNA products of any length at a rate similar to that of the control reaction. Those nucleotides that allow incorporation of 10 or more non-natural nucleotides in a single product molecule with at least modest efficiency are subjected to the second stage of evaluation.
Non-natural nucleotides accepted by thermostable DNA polymerases are evaluated for their possible mutagenic properties. If DNA polymerases insert a non-natural nucleotide opposite an incorrect (non-Watson-Crick) template base, or insert an incorrect natural nucleotide opposite a non-natural nucleotide in the template, the fidelity of library amplification and translation is compromised. To evaluate this possibility, PCR products generated in the above assay are subjected to DNA sequencing using each of the PCR primers. Deviations from the sequence of the pUC19 template imply that one or both of the mutagenic mechanisms are taking place. Error rates of less than 0.7% per base per 30 PCR cycles are acceptable, as error-prone PCR generates errors at approximately this rate (Caldwell et al. PCR Methods Applic. 1992, 2, 28-33) yet has been successfully used to evolve nucleic acid libraries.
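The fidelity check described above can be illustrated with a minimal sketch. It assumes that the sequencing reads have already been aligned to the pUC19 fragment and contain only substitutions (no insertions or deletions); the function names and the example sequences are hypothetical and serve only to show how a per-base error rate might be compared with the 0.7% threshold.

```python
# Sketch only: per-base substitution rate of sequenced PCR products
# relative to the known template, compared with the 0.7% threshold.
# Assumption: reads are pre-aligned and equal in length to the template.

MAX_ERROR_RATE = 0.007  # 0.7% per base per 30 PCR cycles (from the text)


def per_base_error_rate(template, reads):
    """Return the fraction of sequenced positions that differ from the template."""
    reference = template.upper()
    mismatches = 0
    total = 0
    for read in reads:
        if len(read) != len(reference):
            raise ValueError("reads must be pre-aligned to the template")
        mismatches += sum(1 for t, r in zip(reference, read.upper()) if t != r)
        total += len(reference)
    if total == 0:
        raise ValueError("no sequenced bases supplied")
    return mismatches / total


def is_acceptable(template, reads, threshold=MAX_ERROR_RATE):
    """True when the observed error rate is below the threshold."""
    return per_base_error_rate(template, reads) < threshold


if __name__ == "__main__":
    fragment = "ACGT" * 25                # hypothetical 100-base fragment
    one_error = "T" + fragment[1:]        # single substitution at position 1
    print(per_base_error_rate(fragment, [fragment, one_error]))
```

A real analysis would also need to handle insertions, deletions, and sequencing errors, none of which are modeled here.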
Pairs of promising non-natural adenosine analogs and non-natural uridine analogs are also tested together for their ability to support DNA polymerization in a PCR reaction containing both modified nucleotide triphosphates together with dGTP and dCTP. Successful PCR product formation with two non-natural nucleotide triphosphates enables the incorporation of two non-natural metal-binding bases into the same polymer molecule. Functionalized nucleotides that are especially interesting yet are not compatible with Taq, Pfu, or Vent thermostable DNA polymerases can still be used in the libraries provided that they are accepted by a commercially available DNA polymerase such as the Klenow fragment of E. coli DNA polymerase I, T7 DNA polymerase, T4 DNA polymerase, or M-MuLV reverse transcriptase. In this case, the assays require conducting the primer extension step of the PCR reaction at 25-37° C., and fresh polymerase must be added at every cycle following the 94° C. denaturation step. DNA sequencing to evaluate the possible mutagenic properties of the non-natural nucleotide is still performed as described above.
Generating Libraries of Metal-Binding Polymers: Based on the results of the above non-natural nucleotide assays, several libraries of ~10^15 different nucleic acid sequences will be made containing one or two of the most polymerase compatible and chemically promising non-natural metal-binding nucleotides. Libraries are generated by PCR amplification of a synthetic DNA template library consisting of a random region of 20 or 40 nucleotides flanked by two 15-base constant priming regions (
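As a rough illustration of the library sizes involved, the following sketch compares the number of possible sequences in a random region of 20 or 40 nucleotides with a pool of about 10^15 molecules. It assumes a four-letter alphabet at every random position and treats 10^15 as the number of molecules in the pool; both are simplifying assumptions, and the names used are hypothetical.

```python
# Sketch only: size of sequence space versus size of the library pool.
# Assumptions: four possible bases per random position; pool of 10**15 molecules.

LIBRARY_MOLECULES = 10**15


def sequence_space(random_length, alphabet_size=4):
    """Number of distinct sequences possible in the random region."""
    return alphabet_size ** random_length


def coverage(random_length, molecules=LIBRARY_MOLECULES):
    """Average number of pool molecules per possible sequence."""
    return molecules / sequence_space(random_length)


if __name__ == "__main__":
    for length in (20, 40):
        print(length, sequence_space(length), coverage(length))
```

Under these assumptions a 20-nucleotide random region spans roughly 1.1 × 10^12 sequences, so a pool of 10^15 molecules would contain each sequence many times over, whereas a 40-nucleotide region spans roughly 1.2 × 10^24 sequences, of which the pool samples only a small fraction.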
Each library is then incubated in aqueous solution with a metal of interest from the following non-limiting list of water-compatible metal salts (Fringuelli et al. Eur. J. Org. Chem. 2001, 2001, 439-455; Zaitoun et al. J. Phys. Chem. B 1997, 101, 1857-1860): ScCl3, CrCl3, MnCl2, FeCl2, FeCl3, CoCl2, NiCl2, CuCl2, ZnCl2, GaCl3, YCl3, RuCl3, RhCl3, PdCl2, AgCl, CdCl2, InCl3, SnCl2, La(OTf)3, Ce(OTf)3, Pr(OTf)3, Nd(OTf)3, Sm(OTf)3, Eu(OTf)3, Gd(OTf)3, Tb(OTf)3, Dy(OTf)3, Ho(OTf)3, Er(OTf)3, Tm(OTf)3, Yb(OTf)3, Lu(OTf)3, IrCl3, PtCl2, AuCl, HgCl2, HgCl, PbCl2, or BiCl3. Metals are chosen based on the specific chemical reactions to be catalyzed. For example, libraries aimed at reactions such as aldol condensations or hetero Diels-Alder reactions that are known (Fringuelli et al. Eur. J. Org. Chem. 2001, 2001, 439-455) to be catalyzed by Lewis acids are incubated with ScCl3 or with one of the lanthanide triflates, while those aimed at coupling electron-deficient olefins with aryl halides are incubated with PdCl2. The metalated library is then purified away from unbound metal salts by gel filtration using Sephadex or acrylamide cartridges, which separate DNA oligonucleotides 25 bases or longer from unbound small molecule components.
The ability of the polymer library (or of individual library members) to bind metals of interest is verified by treating the metalated library free of unbound metals with metal staining reagents such as dithiooxamide, dimethylglyoxime, KSCN (Francis et al. Curr. Opin. Chem. Biol. 1998, 2, 422-8) or EDTA (Zaitoun et al. J. Phys. Chem. B 1997, 101, 1857-1860) that become distinctly colored in the presence of different metals. The approximate level of metal binding is measured by spectrophotometric comparison with solutions of free metals of known concentration and with solutions of positive control oligonucleotides containing an EDTA group (which can be introduced using a commercially available phosphoramidite from Glen Research).
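The spectrophotometric comparison with solutions of known concentration can be sketched as a simple calibration line. The example below assumes that absorbance of the stained metal complex is linear in metal concentration over the range of the standards, which is an assumption not stated in the text; the function names and the numerical values are hypothetical.

```python
# Sketch only: estimate bound-metal concentration from a linear standard
# curve built with free-metal solutions of known concentration.
# Assumption: absorbance is linear in concentration over the range used.


def fit_line(concentrations, absorbances):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(concentrations)
    if n != len(absorbances) or n < 2:
        raise ValueError("need at least two paired standards")
    mean_x = sum(concentrations) / n
    mean_y = sum(absorbances) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    if sxx == 0:
        raise ValueError("standards must span more than one concentration")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, absorbances))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_concentration(absorbance, slope, intercept):
    """Invert the calibration line for a sample absorbance."""
    if slope == 0:
        raise ValueError("calibration line has zero slope")
    return (absorbance - intercept) / slope


if __name__ == "__main__":
    standards = [0.0, 10.0, 20.0, 40.0]   # hypothetical concentrations
    readings = [0.02, 0.12, 0.22, 0.42]   # hypothetical absorbances
    slope, intercept = fit_line(standards, readings)
    print(estimate_concentration(0.27, slope, intercept))
```

The same calibration could be repeated with the EDTA-containing positive control oligonucleotides to check that the staining reagent responds comparably to oligonucleotide-bound metal.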
In Vitro Selections for Non-Natural Polymer Catalysts: Metalated libraries of evolvable non-natural polymers containing metal-binding groups are then subjected to one-pot, solution-phase selections for catalytic activities of interest. Library members that catalyze virtually any reaction that causes bond formation between two substrate molecules or that results in bond breakage into two product molecules are selected using the schemes proposed in
In an analogous manner, library members that catalyze bond cleavage reactions such as retro-aldol reactions, amide hydrolysis, elimination reactions, or olefin dihydroxylation followed by periodate cleavage can also be selected. In this case, metalated library members are covalently linked to biotinylated substrates such that the bond breakage reaction causes the disconnection of the biotin moiety from the library members (
Catalysts of three important and diverse bond-forming reactions will initially be evolved: Heck coupling, hetero Diels-Alder cycloaddition, and aldol addition. All three reactions are water compatible (Kobayashi et al. J. Am. Chem. Soc. 1998, 120, 8287-8288; Fringuelli et al. Eur. J. Org. Chem. 2001, 2001, 439-455; Li et al. Organic Reactions in Aqueous Media: Wiley and Sons: New York, 1997) and are known to be catalyzed by metals. As Heck coupling substrates, both electron-deficient and unactivated olefins will be used together with aryl iodides and aryl chlorides. Heck reactions with aryl chlorides in aqueous solution, as well as room temperature Heck reactions with non-activated aryl chlorides, have not yet been reported to our knowledge. Libraries for Heck coupling catalyst evolution use PdCl2 as a metal source. Hetero Diels-Alder substrates include simple dienes and aldehydes, while aldol addition substrates consist of aldehydes together with either silyl enol ethers or simple ketones. Representative selection schemes for Heck coupling, hetero Diels-Alder, and aldol addition catalysts are shown in
Evolving Non-Natural Polymers: Diversification and Selecting for Stereospecificity
Following each round of selection, active library members are amplified by PCR with the non-natural nucleotides and subjected to additional rounds of selection to enrich the library for desired catalysts. These libraries are truly evolved by introducing a diversification step before each round of selection. Libraries are diversified by random mutagenesis using error-prone PCR (Caldwell et al. PCR Methods Applic. 1992, 2, 28-33) or by recombination using modified DNA shuffling methods that recombine small, non-homologous nucleic acid fragments. Because error-prone PCR is inherently less efficient than normal PCR, error-prone PCR diversification will be conducted with only natural dATP, dTTP, dCTP, and dGTP and using primers that lack chemical handles or biotin groups. The resulting mutagenized products are then subjected to PCR translation into non-natural nucleic acid polymers using standard PCR reactions containing the non-natural nucleotide(s), the biotinylated primer, and the amino- or thiol-terminated primer.
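The diversification step can be illustrated with a minimal in silico sketch of random point mutagenesis. It assumes uniform substitution among the four natural bases at a fixed per-base rate and leaves the two 15-base constant priming regions untouched; real error-prone PCR has a biased mutation spectrum, and the function names, the rate, and the seed are hypothetical.

```python
# Sketch only: random point mutagenesis of the random region of each
# library member, leaving the constant priming regions unchanged.
# Assumptions: uniform substitutions; sequences use uppercase A, C, G, T.
import random

BASES = "ACGT"


def mutate(sequence, rate, rng, flank=15):
    """Substitute each base of the interior region with probability `rate`."""
    start, end = flank, len(sequence) - flank
    mutated = []
    for index, base in enumerate(sequence):
        if start <= index < end and rng.random() < rate:
            mutated.append(rng.choice([b for b in BASES if b != base]))
        else:
            mutated.append(base)
    return "".join(mutated)


def diversify(pool, rate, seed=0):
    """Apply `mutate` to every member of a pool with a reproducible seed."""
    rng = random.Random(seed)
    return [mutate(member, rate, rng) for member in pool]


if __name__ == "__main__":
    member = "A" * 15 + "C" * 20 + "G" * 15   # hypothetical library member
    print(diversify([member], rate=0.007))
```

Recombination by DNA shuffling is not modeled in this sketch.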
In addition to simply evolving active catalysts, the in vitro selections described above are used to evolve non-natural polymer libraries in powerful directions difficult to achieve using other catalyst discovery approaches. An enabling feature of these selections is the ability to select either for library members that are biotinylated or for members that are not biotinylated. Substrate specificity among catalysts can therefore be evolved by selecting for active catalysts in the presence of the desired substrate and then selecting in the same pot for inactive catalysts in the presence of one or more undesired substrates. If the desired and undesired substrates differ by the configuration at one or more stereocenters, enantioselective or diastereoselective catalysts can emerge from rounds of selection. Similarly, metal selectivity can be evolved by selecting for active catalysts in the presence of desired metals and selecting for inactive catalysts in the presence of undesired metals. Conversely, catalysts with broad substrate tolerance can be evolved by varying substrate structures between successive rounds of selection.
Finally, the observations of sequence-specific DNA-templated synthesis in DMF and CH2Cl2 suggest that DNA-tetraalkylammonium cation complexes can form base-paired structures in organic solvents. This finding raises the possibility of evolving our non-natural nucleic acid catalysts in organic solvents using slightly modified versions of the selections described above. The actual bond forming and bond cleavage selection reactions will be conducted in organic solvents, the crude reactions will be ethanol precipitated to remove the tetraalkylammonium cations, and the immobilized avidin separation of biotinylated and non-biotinylated library members will be performed in aqueous solution. PCR amplification of selected members will then take place as described above. The successful evolution of reaction catalysts that function in organic solvents would expand considerably both the scope of reactions that can be catalyzed and the utility of the resulting evolved non-natural polymer catalysts.
Characterizing Evolved Non-Natural Polymers: Libraries subjected to several rounds of evolution are characterized for their ability to catalyze the reactions of interest both as pools of mixed sequences and as individual library members. Individual members are extricated from evolved pools by ligating PCR amplified sequences into DNA vectors, transforming dilute solutions of ligated vectors into competent bacterial cells, and picking single colonies of transformants. Assays on pools or individual sequences are conducted both in the single turnover format and in a true multiple turnover catalytic format. For the single turnover assays, the rate at which substrate-linked bond formation catalysts effect their own biotinylation in the presence of free biotinylated substrate, or the rate at which biotinylated bond breakage catalysts effect the loss of their biotin groups, will be measured. Multiple turnover assays are conducted by incubating evolved catalysts with small molecule versions of substrates and analyzing the rate of product formation by TLC, NMR, mass spectrometry, HPLC, or spectrophotometry.
Once multiple turnover catalysts are evolved and verified by these methods, detailed mechanistic studies can be conducted on the catalysts. The DNA sequences corresponding to the catalysts are revealed by sequencing PCR products or DNA vectors containing the templates of active catalysts. Metal preferences are evaluated by metalating catalysts with a wide variety of metal cations and measuring the resulting changes in activity. The substrate specificity and stereoselectivity of these catalysts are assessed by measuring the rates of turnover of a series of substrate analogs. Diastereoselectivities and enantioselectivities of product formation are revealed by comparing reaction products with those of known stereochemistry. Previous studies suggest that active sites buried within large chiral environments often possess high degrees of stereoselectivity. For example, peptide-based catalysts generated in combinatorial approaches have demonstrated poor to excellent stereoselectivities that correlate with the size of the peptide ligand (Jarvo et al. J. Am. Chem. Soc. 1999, 121, 11638-11643), while RNA-based catalysts and antibody-based catalysts frequently demonstrate excellent stereoselectivities (Jäschke et al. Curr. Opin. Chem. Biol. 2000, 4, 257-262; Seelig et al. Angew. Chem. Int. Ed. Engl. 2000, 39, 4576-4579; Hilvert, D. Annu. Rev. Biochem. 2000, 69, 751-93; Barbas et al. Science 1997, 278, 2085-92; Zhong et al. Angew. Chem. Int. Ed. Engl. 1999, 38, 3738-3741; Zhong et al. J. Am. Chem. Soc. 1997, 119, 8131-8132; List et al. Org. Lett. 1999, 1, 59-61). The direct selections for substrate stereoselectivity described above should further enhance this property among evolved catalysts.
Structure-function studies on evolved catalysts are greatly facilitated by the ease of automated DNA synthesis. Site-specific structural modifications are introduced by synthesizing DNA sequences corresponding to “mutated” catalysts in which bases of interest are changed to other bases. Changing the non-natural bases in a catalyst to a natural base (U* to C or A* to G) and assaying the resulting mutants may identify the chemically important metal-binding sites in each catalyst. The minimal polymer required for efficient catalysis is determined by synthesizing and assaying progressively truncated versions of active catalysts. Finally, the three-dimensional structures of the most interesting evolved catalysts complexed with metals are solved in collaboration with local macromolecular NMR spectroscopists or X-ray crystallographers.
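The set of progressively truncated sequences to be synthesized can be enumerated with a short sketch. It assumes truncation from one end at a time in fixed increments down to a chosen minimum length; the step size, the minimum length, the labels, and the function name are hypothetical choices for illustration only.

```python
# Sketch only: list progressively truncated versions of a catalyst
# sequence, trimming from the 5' end or the 3' end in fixed steps.
# Assumptions: step size and minimum length are arbitrary illustrative values.


def truncation_series(sequence, step=5, min_length=10):
    """Return (label, truncated_sequence) pairs for each truncation."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    variants = []
    for removed in range(step, len(sequence) - min_length + 1, step):
        variants.append(("5'-del%d" % removed, sequence[removed:]))
        variants.append(("3'-del%d" % removed, sequence[:len(sequence) - removed]))
    return variants


if __name__ == "__main__":
    catalyst = "ACGTTGCAAGCTTGGCACTGGCCGTCGTTT"   # hypothetical 30-base sequence
    for label, variant in truncation_series(catalyst):
        print(label, len(variant), variant)
```

Each listed variant would then be synthesized and assayed as described above; truncations from both ends simultaneously are not enumerated here.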