WO1993000352A1 - Sequence-specific binding polymers for duplex nucleic acids

Info

Publication number: WO1993000352A1
Application number: PCT/US1992/005208
Authority: WO
Inventors: James E. Summerton; Dwight D. Weller
Original assignee: Antivirals Inc.
Priority date: 1991-06-20
Filing date: 1992-06-18
Publication date: 1993-01-07
Also published as: ATE157098T1; EP0592511B1; TW466243B; US5166315A; CA2110988C; CA2110988A1; AU2253192A; KR100225090B1; EP0592511A1; AU665560B2; DE69221730D1; JPH06508526A; DE69221730T2

Abstract

The present invention describes a polymer composition effective to bind in a sequence-specific manner to a target sequence of a duplex polynucleotide containing at least two differently oriented Watson/Crick base-pairs at selected positions in the target sequence. The composition includes an uncharged backbone with 5- or 6-membered cyclic backbone structures and selected bases attached to the backbone structures effective to hydrogen bond specifically with differently oriented base-pairs in the target sequence. Also disclosed are subunits useful for the construction of the polymer composition. The present invention also includes methods for (i) coupling a first free or polymer-terminal subunit to a second free or polymer-terminal subunit, and (ii) isolating, from a liquid sample, a target duplex nucleic acid fragment having a selected sequence of base-pairs.

Description

SEQUENCE-SPECIFIC BINDING POLYMERS FOR DUPLEX NUCLEIC ACIDS

The present invention is a continuation-in-part application of U.S. patent application Serial No. 719,732, filed June 20, 1991, now allowed, which is a continuation-in-part of application Serial No. 07/454,055, filed December 20, 1989 (now issued as U.S. Patent No. 5,034,506).

1. Field of the Invention

The present invention relates to an uncharged polymer capable of binding with sequence specificity to double stranded nucleic acids containing a selected base-pair sequence.

2. References

Aboderin, Delpierre, and Fruton (1965). J. Amer. Chem. Soc. 87 5469.

Aoyama (1987). Bull. Chem. Soc. Jpn. 60 2073.

Arnott & Bond (1973). Science 181 68; Nature New Biol. 244 99.

Arnott & Selsing (1974). J. Molec. Biol. 88 509.

Balgobin, McBride, Kierzek, Beaucage and Caruthers Bassingdale (1986). J. Amer. Chem. Soc. 108 2040.

Barwolff and Langen, in "Nucleic Acid Chemistry," Townsend and Tipson, Ed. Wiley, New York, 1978 page 359.

Belikova, Zaratova, & Grineva (1967). Tet. Letters 37 3557.

Bischofberger (1987). Tetrahedron Letters 28 2821.

Bredereck, et al. (1968). Chemische Berichte 101 41.

Bunemann et al. (1981). Biochem. 20 2864.

Carnelley and Dutt, J. Chem. Soc. 125 2483.

Chamberlin & Patterson (1965). J. Molec. Biol. 12 410.

Chelsky et al. (1989). Mol. Cell. Biol. 9 2487.

Cooney et al. (1988). Science 241 456.

Corey, Gilman, and Ganem (1968). J. Am. Chem. Soc. 90 5616.

Elguero et al. (1976). The Tautomerism of Heterocycles. Adv. in Heterocyclic Chem. Supplement I. Academic Press, NY.

Fischer-Fantuzzi & Vesco (1988). Molec. & Cell. Biol. 8 5495.

Flavell & Van den Berg (1975). FEBS Letters 58 90.

Gregoriadis & Needunjun (1975). Biochem. Biophys. Res. Commun. 65 537.

Himmelsbach and Pfleiderer (1983). Tet. Lett. 24 3583.

Hoffer (1960). Chemische Berichte 93 2777.

Hoogsteen (1959). Acta Cryst. 12 822.

Inman (1964). J. Mol. Biol. 10 137.

Jones (1979). Int. J. Biol. Macromolec. 1 194.

Jurgens (1907). Chemische Berichte 40 4409.

Kabanov (1989). FEBS Letters 258 343.

Kamimura, Tsuchiya, Urakami, Koura, Sekine, Shinozaki, Miura and Hata (1984). J. Amer. Chem. Soc. 106 4552.

Karpova et al. (1980). FEBS Letters 122 21.

Katritzky and Yates (1976). J. Chem. Soc., Perkin Trans. 1 309.

King, McWhirter, and Barton (1945). J. Am. Chem. Soc. 67 2089.

Kosturko et al. (1979). Biochem. 18 5751.

Kundu & Heidelberger (1974). Biochem. Biophys. Res. Comm. 60 561.

Kundu et al. (1975). J. Med. Chem. 18 395 & 399.

Kundu (1980). J. Med. Chem. 23 512.

Lemaitre, Bayard & Lebleu (1987). PNAS 84 648.

Maeba et al. (1983). J. Org. Chem. 48 2998.

Mahler, Wold, Dervan (1989). Science 245 725.

Miller et al. (1979). Biochemistry 18 5134.

Miller et al. (1980). J. Biol. Chem. 255 9659.

Miller et al. (1985). Biochimie 67 769.

Miura and Hata (1984). J. Amer. Chem. Soc. 106 4552.

Morgan & Wells (1968). J. Molec. Biol. 37 63.

Moser & Dervan (1987). Science 238 645.

Myers and Lee (1984). Carb. Res. 132 61.

Ozdowska (1974). Rocz. Chem. 48 1065.

Pelaprat et al. (1980). J. Med. Chem. 22 1330, 1336.

Peltier (1956). Belg. Soc. Science, Bretagne 31 26.

Phillips (1928). J. Chem. Soc. 2393.

Pickering, Srivastava, Witkowski, and Robins, Nucleic Acid Chemistry, Part 1, Ed. by Townsend and Tipson, John Wiley and Sons, New York, p 145.

Pitha & Pitha (1970). Biopolymers 9 965.

Poisel and Schmidt (1975). Chemische Berichte 108 2547.

Povsic (1989). J. Amer. Chem. Soc. 111 3059.

Rich & Seeman (1975). Handbook of Biochemistry and Molecular Biology, 3rd Edition, Vol. 2, pages 465-466.

Robins, Naik, and Lee (1974). J. Org. Chem. 39 1891.

Robins, Hansske, Bernier (1981). Can. J. Chem. 59 3360.

Sakore et al. (1969). J. Molec. Biol. 43 385.

Schmitz & Galas (1979). Nucleic Acids Res. 6 111.

Schneller and Christ (1981). J. Heterocyclic Chem. 18 654.

Schultz, Taylor, & Dervan (1982). JACS 104 6861.

Schultz & Dervan (1983). PNAS 80 6834.

Sekine, Peshakova, Hata, Yokoyama and Miyazawa (1987). J. Org. Chem. 52 5061.

Shioiri, Ninomiya, Yamada (1972). J. Amer. Chem. Soc. 94 6203.

Sluka et al. (1987). Science 238 1129.

Smith, Rammler, Goldberg and Khorana (1961). J. Amer. Chem. Soc. 84 430.

Stirchak, Summerton, & Weller (1987). J. Org. Chem. 52 4202.

Stirchak, Summerton, & Weller (1989). Nucleic Acids Res. 17 6129.

Summerton & Bartlett (1978b). J. Molecular Biology 122 145-162.

Summerton (1979). J. Theor. Biol. 78 61-76.

Summerton (1979). J. Theor. Biol. 78 77-99.

Tamura and Okai (1984). Carb. Res. 133 207.

Toulme et al. (1986). PNAS 83 1227.

Trattner, et al. (1964). J. Org. Chem. 29 2674.

Trichtinger, Charbula and Pfleiderer (1983). Tet. Lett. 24 711.

Voet & Rich (1970). Progress in Nucleic Acids Res. & Molec. Biol. 10 183-265.

Youngquist & Dervan (1985). JACS 107 5528.

Zamecnik & Stephenson (1978). PNAS 75 280.

Zuidema, Van den Berg & Flavell (1978). Nucleic Acids Res. 5 2471.

3. Background of the Invention

Oligonucleotides or oligonucleotide analogs designed to inactivate selected single-stranded genetic sequences unique to a target pathogen were first reported in the late 1960's by Belikova, 1967, and subsequently by: Pitha, 1970; Summerton, 1978a,b, 1979a,b; Zamecnik, 1978; Jones, 1979; Karpova, 1980; Miller, 1979, 1980, 1985; Toulme, 1986; Stirchak, 1987, 1989. Polymeric agents of this type achieve their sequence specificity by exploiting Watson/Crick base pairing between the agent and its complementary single-stranded target genetic sequence. Because such polymers only bind single-stranded target genetic sequences, they are of limited value where the genetic information one wishes to inactivate exists predominantly in the double-stranded state.

For many pathogens and pathogenic states duplex genetic sequences offer a more suitable target for blocking genetic activity. One of the earliest attempts to develop a sequence-specific duplex-directed nucleic acid binding agent was reported by Kundu, Heidelberger, and coworkers during the period 1974 to 1980 (Kundu 1974; Kundu 1975; Kundu 1980). This group reported two monomeric agents, each designed to hydrogen-bond to a specific base-pair in duplex nucleic acids. However, these agents were ineffective, probably for two reasons. First, they utilized a nonrigid ambiguous hydrogen-bonding group (an amide) which can act as either a proton donor or acceptor (in the hydrogen-bonding sense). Secondly, they provided an insufficient number of hydrogen bonds (two) for complex stability in aqueous solution. Experimental results from a variety of systems suggest that hydrogen-bonded complexes are stable in aqueous solution only if there are a substantial number (probably at least 12) of cooperative intermolecular hydrogen bonds, or if there are additional stabilizing interactions (electrostatic, hydrophobic, etc.).

Another early attempt was reported by Dattagupta and Crothers at Yale and coworkers in Germany (Kosturko 1979; Bunemann 1981). These workers employed a polymer prepared from a dye known to intercalate into duplex DNA rich in G:C base-pairs and another dye which preferentially binds to duplex DNA rich in A:T base-pairs, probably via minor-groove sites. Preparation of the polymer involved modification of the two dyes by adding acrylic moieties and then polymerization of a mixture of the modified dyes in the presence of duplex DNA of defined sequence (the template). The expectation was that the resultant polymer would show a specific affinity for duplex DNA having the same sequence as the template DNA. However, such material proved to exhibit only nominal sequence specificity. A variety of bis-intercalating agents designed to bind to specific sequences in duplex DNA have also been reported (Pelaprat, 1980), but such agents inherently give only minimal sequence specificity.

More recently, Dervan has taken a natural B-form-specific minor-groove-binding antibiotic (Distamycin) and systematically extended its structure to achieve a significant level of sequence specificity (Schultz 1982; Schultz 1983; Youngquist 1985). He has also appended to this oligomer an EDTA/Fe complex which under certain conditions acts to cleave the duplex target sequence near the agent's binding site. However, this particular approach will not lead to the high level of specificity which is needed for therapeutic applications because the inherent symmetry of the H-bonding sites in the minor groove provides too little sequence information.

Still more recently, Dervan and coworkers reported a binding agent which utilizes the informationally-richer polar major-groove sites of a target genetic duplex for sequence-specific recognition (Sluka 1987). This entailed adapting a synthetic polypeptide, comprising the DNA-sequence-recognition portion of a DNA-binding protein, for cleaving DNA at the protein's binding site on duplex DNA. The cleaving activity was achieved by linking an EDTA/Fe complex to the amino terminus of the synthetic peptide and demonstrating that this complex selectively cleaved duplex DNA at or near the parent protein's natural target sequence.

Another approach to duplex targeting has grown out of studies first reported in the late 1950's that demonstrated, via X-ray diffraction, that under high salt conditions an all-thymine or all-uracil polynucleotide can bind to specific polar major-groove sites on a Watson/Crick genetic duplex having all adenines in one strand and all thymines or uracils in the other strand (Hoogsteen 1959). Subsequently, it was reported that in high salt and at pH values lower than 7, an all-cytosine polynucleotide, having the cytosine moieties protonated, can bind in a similar manner to a Watson/Crick duplex having all guanines in one strand and all cytosines in the other strand.

Thereafter, it was demonstrated that under high salt and at a pH below 7, a polynucleotide containing both cytosines and thymines (or uracils) can bind to a Watson/Crick duplex having the appropriate sequence of purines in one strand and pyrimidines in the other strand (Morgan, 1968).

In the 1970's this Hoogsteen binding mechanism was exploited for affinity chromatography purification of duplex genetic fragments containing runs of purines in one strand and pyrimidines in the other strand (Flavell, 1975; Zuidema, 1978). In 1987 Dervan and coworkers exploited this Hoogsteen binding mechanism to position an all-pyrimidine polynucleotide, carrying an EDTA/Fe cleaving moiety, onto a target genetic duplex having a specific sequence of purines in one strand and pyrimidines in the other strand (Moser, 1987).

A major-groove binding mode different from the Hoogsteen mode was reported in the mid-1960's and involves binding of an all-purine polynucleotide, poly(dI), to a poly(dI)/poly(rC) duplex (Inman 1964) and to a poly(dI)/poly(dC) duplex (Chamberlin 1965). Similarly, a mostly-purine polynucleotide has been recently used by Hogan and coworkers (Cooney, 1988) for blocking the activity of a selected natural duplex genetic sequence. These workers reported that in the presence of 6 mM Mg⁺⁺ a mostly-purine polynucleotide (24 purines, 3 pyrimidines) of a specific sequence inhibits transcription of the human c-myc gene in a cell-free system.

To date, reported polynucleotides used for binding to genetic duplexes fail to satisfy one or more important criteria for effective use within living organisms. First, the Hoogsteen-binding polynucleotides (polypyrimidines) containing cytosines require a lower-than-physiological pH in order to achieve effective binding (due to the necessity of protonating the cytosine moieties), although it has recently been demonstrated by Dervan and coworkers that the use of 5-methylcytosines in place of cytosines allows Hoogsteen binding at a pH somewhat closer to physiological (Mahler, 1989), and use of both 5-methylcytosines in place of cytosines and 5-bromouracils in place of thymines (or uracils) improves binding still further (Povsic, 1989).

Secondly, in the case of polypurine polynucleotides, both inosine (hypoxanthine) and adenine moieties lack adequate sequence specificity and adequate binding affinity for effective major-groove binding in intracellular applications. The inadequate sequence specificity for inosine (Inman, 1964) and adenine (Cooney, 1988) moieties derives from the fact that inosine can bind with similar affinity to the central polar major-groove sites of both a C:I (or C:G) base-pair (i.e., NH4 of C and O6 of G or I) and an A:T or A:U base-pair (i.e., NH6 of A and O4 of T or U), and because adenine can bind with similar affinity to the central polar major-groove sites of both a T:A or U:A base-pair (i.e., O4 of T or U and NH6 of A) and a G:C base-pair (i.e., O6 of G and NH4 of C), as discussed further below. The low binding affinity of inosine for its target base-pairs and of adenine for its target base-pairs is due to the fact that these purines can form only two less-than-optimal hydrogen-bonds to the major-groove sites of their respective target base-pairs.

Thirdly, both polypyrimidine and polypurine polynucleotides fail to achieve effective binding to their target genetic duplexes under physiological conditions, due to the substantial electrostatic repulsion between the three closely-packed polyanionic backbones of the three-stranded complexes. Although this repulsion can be attenuated by high salt (Morgan, 1968), divalent cations (Cooney, 1988), or polyamines (Moser, 1987), for applications in living cells, and particularly cells within intact organisms, control of intracellular cation concentrations is generally not feasible.

In addition, for therapeutic applications polynucleotides are less than optimal because: they are rapidly sequestered by the reticuloendothelial lining of the capillaries, they do not readily cross biological membranes, and they are sensitive to degradation by nucleases in the blood and within cells.

Finally, for many in vivo applications of sequence-specific duplex-directed nucleic acid-binding agents, the principal target is DNA, which appears to exist within cells predominantly in a B or B-like conformation. In this context, polynucleotides which have been used for major-groove binding to genetic duplexes (Moser, 1987; Cooney, 1988) have a unit backbone length which is shorter than optimal for binding to duplex genetic sequences existing in a B-type conformation.

4. Summary of the Invention

The present invention includes a polymer composition effective to bind in a sequence-specific manner to a target sequence of a duplex polynucleotide containing at least two different oriented Watson-Crick base-pairs at selected positions in the target sequence. The polymer is formed of a specific sequence of subunits selected from the following forms:

where Y is a 2- or 3-atom length, uncharged intersubunit linkage group; R' is H, OH, or O-alkyl; the 5'-methylene has a β stereochemical orientation in the 5-membered ring and a uniform stereochemical orientation in the 6-membered ring; R_i has a β stereochemical orientation; and at least about 70% of R_i groups in the polymer are selected from two or more of the following base-pair-specificity groups: (a) for a T:A or U:A oriented base-pair, R_i is 2,6-diaminopurine; (b) for a C:G oriented base-pair, R_i is guanine or 6-thioguanine; (c) for a G:C oriented base-pair, R_i is selected from the group consisting of planar bases having the following skeletal ring structures and hydrogen bonding arrays, where B indicates the polymer backbone:

where the * ring position may carry a hydrogen-bond acceptor group, such as a carbonyl oxygen; and (d) for an A:T or A:U oriented base-pair, R_i is selected from the group consisting of planar bases having the following skeletal ring structures and hydrogen bonding arrays, where B indicates the polymer backbone:

where the * ring position may carry a hydrogen-bond donating group, such as NH2.

In one embodiment, for use in sequence-specific binding to a duplex nucleic acid sequence in an A conformation, the Y linkage group is two atoms in length. In another embodiment, for use in sequence-specific binding to a B-form DNA-DNA duplex nucleic acid sequence, the Y linkage group is three atoms in length.

In another aspect, the invention includes a method for coupling a first free or polymer-terminal subunit having one of the following subunit forms:

where R_i is a planar ring structure having two or more hydrogen-bonding sites, with a second free or polymer-terminal subunit having one of the following subunit forms:

where Z is a 2-atom or 3-atom long moiety. The method includes (i) oxidizing the first subunit to generate a dialdehyde intermediate; (ii) contacting the dialdehyde intermediate with the second subunit under conditions effective to couple a primary amine to a dialdehyde; and (iii) adding a reducing agent effective to give a coupled structure selected from the following forms:

In still another aspect, the invention provides a method for isolating, from a liquid sample, a target duplex nucleic acid fragment having a selected sequence of base-pairs. The method includes first contacting the sample with a polymer reagent containing a structure which allows isolation of the reagent from solution, and attached to this structure, a polymer composition of the type described above, where the polymer composition has a subunit sequence effective to bind in a sequence-specific manner with the selected sequence of base-pairs. The contacting is carried out under conditions effective for sequence-specific binding of the polymer composition to the selected sequence of base-pairs.

Further, the polymers of the present invention can be used to detect the presence of a target nucleic acid sequence. For example, a support-bound polymer composition can be contacted with a test solution containing the selected duplex genetic sequence under conditions effective for sequence-specific binding of the polymer composition to its target sequence of base-pairs. The support-bound polymer with bound selected duplex genetic sequence is then separated from the test solution. The presence of the polymer/target duplex sequence is then detected. Detecting the selected genetic sequence may, for example, utilize one of the following: fluorescent compounds, such as ethidium bromide and propidium iodide, effective to intercalate into duplex genetic sequences; or reporter moieties linked to oligocationic moieties effective to bind to the polyanionic backbone of nucleic acids.

Also forming part of the invention is a subunit composition for use in forming a polymer composition effective to bind in a sequence-specific manner to a target sequence in a duplex polynucleotide. The composition includes one of the following subunit structures:

(a) (b) (c) (d)

where R' is H, OH, or O-alkyl; the 5'-methylene has a β stereochemical orientation in subunit forms (a), (c), and (d) and a uniform stereochemical orientation in subunit form (b); X is hydrogen or a protective group or a linking group suitable for joining the subunits in any selected order into a linear polymer; Y is a nucleophilic or electrophilic linking group suitable for joining the subunits in any selected order into a linear polymer; and X and Y together are such that when two subunits of the subunit set are linked the resulting intersubunit linkage is 2 or 3 atoms in length and uncharged; Z is a 2-atom or 3-atom long moiety; and R_i, which may be in the protected state and has a β stereochemical orientation, is selected from the group consisting of planar bases having the following skeletal ring structures and hydrogen bonding arrays, where B indicates the aliphatic backbone moiety:

where the * ring position may carry a hydrogen-bond acceptor group; or, where R_i is selected from the group consisting of planar bases having the following skeletal ring structures and hydrogen bonding arrays, where B indicates the aliphatic backbone moiety:

where the * ring position may carry a hydrogen-bond donating group.

Another embodiment of the present invention includes a method for inhibiting the biological activity of a selected duplex genetic sequence. In this method, a suitable target sequence of base-pairs is selected within the selected duplex genetic sequence whose activity is to be inhibited. A polymer composition, as described above, is provided which is effective to bind in a sequence-specific manner to the target sequence. The polymer composition is contacted with the selected duplex genetic sequence under substantially physiological conditions. In this method, contacting the polymer composition with the selected genetic sequence may entail targeting the polymer composition to a tissue or site containing the selected genetic sequence to be inactivated. Some methods of delivery include delivering the polymer composition in the form of an aerosol to the respiratory tract of a patient and/or injecting an aqueous solution of the polymer composition into a patient.

These and other objects and features of the present invention will become more fully apparent when the following detailed description of the invention is read in conjunction with the accompanying drawings.

Brief Description of the Drawings

Figures 1A-1D illustrate T:A (1A), A:T (1B), C:G (1C) and G:C (1D) oriented Watson-Crick base-pairs, showing the major-groove hydrogen-bonding sites of the base-pairs (arrows);

Figures 2A and 2B illustrate tautomeric forms of 2-amino pyrimidine (2A), and 2-pyrimidinone (2B);

Figures 3A and 3B illustrate rigid (3A) and non-rigid (3B) hydrogen-bonding arrays;

Figures 4A-4D illustrate standard positioning for a U:A base-pair in an A conformation and the approximate position of helical axis for an A-form duplex (4A), the use of this positioning scheme for assessing R_a, θ_a, and A values for a subunit base hydrogen bonded to the polar major-groove sites of a U:A base-pair in an A conformation (4B), the standard positioning for a T:A base-pair in a B conformation and the approximate position of helical axis for a B-form duplex (4C), and the use of this positioning scheme for assessing R_b, θ_b, and A values for a subunit base hydrogen bonded to the polar major-groove sites of a T:A base-pair in a B conformation (4D);

Figures 5A-5C show representative 2'-deoxyribose (5A), ribose (5B), and ribose-derived backbone structures (5C) suitable for use in forming the polymer of the invention;

Figures 6A-6F show representative morpholino backbone structures suitable for use in forming the polymer of the invention;

Figures 7A-7E show representative acyclic backbone structures suitable for forming the polymer of the invention;

Figure 8A shows a representative coupled acyclic backbone structure with a 4-atom unit backbone length, 8B-8C show coupled acyclic backbone structures with a 5-atom unit backbone length, and 8D-8E show coupled acyclic backbone structures with a 6-atom unit backbone length;

Figures 9A-9D show representative coupled cyclic backbone structures with a 6-atom unit backbone length, and 9E-9F show representative coupled cyclic backbone structures with a 7-atom unit backbone length;

Figures 10A and 10B illustrate a guanine base and its binding to a C:G oriented Watson-Crick base-pair (10A) and a diaminopurine base and its binding to a T:A oriented Watson-Crick base-pair (10B);

Figures 11A and 11B show hydrogen bonding of a cytosine base to a G:C (11A) and T:A (11B) oriented base-pair;

Figures 12A and 12B show hydrogen bonding of a uracil base to an A:T (12A) and C:G (12B) oriented base-pair;

Figures 13A-13D illustrate the general skeletal ring structure, hydrogen bonding array, and backbone attachment position of a tautomeric base designed for binding to a G:C or A:T Watson-Crick base-pair (13A), and three specific embodiments of the 13A structure (13B-13D);

Figures 14A and 14B show the hydrogen bonding of the Figure 13B structure to a G:C (14A) and A:T (14B) oriented base-pair;

Figures 15A-15D illustrate the general skeletal ring structure, hydrogen bonding array, and backbone attachment position of a base designed for binding to a G:C Watson-Crick base-pair (15A), and three specific embodiments of the Figure 15A structure (15B-15D);

Figure 16 shows the hydrogen bonding of the Figure 15 structure to a G:C oriented base-pair;

Figure 17A illustrates the general skeletal ring structure, hydrogen bonding array, and backbone attachment position of a base designed for binding to a G:C Watson-Crick base-pair, and Figure 17B shows a specific embodiment of the 17A structure hydrogen bonded to a G:C oriented base-pair;

Figures 18A-18D illustrate the general skeletal ring structure, hydrogen bonding array, and backbone attachment position of a base designed for binding to an A:T or A:U Watson-Crick base-pair (18A), and three specific embodiments of the 18A structure (18B-18D);

Figure 19 shows the hydrogen bonding of the Figure 18D structure to an A:T oriented base-pair;

Figure 20 illustrates the coupling cycle used in an exemplary solid-phase synthesis of one embodiment of this binding polymer;

Figure 21 illustrates a segment of a polymer constructed according to the invention, and designed to bind to a region of an A-form genetic duplex having the sequence of base-pairs: C:G, A:T, T:A, and G:C;

Figure 22 illustrates the coupling cycle in a novel method for assembling nucleic acid-binding polymers.

Detailed Description of the Invention

I. Polymer Subunit Construction

The polymer of the invention is designed for binding with base-pair specificity to a selected sequence (the target sequence) in a strand of duplex nucleic acid. As used herein, duplex sequence refers to a sequence of contiguous oriented Watson/Crick base-pairs, where the four oriented base-pairs are: A:T (or A:U), T:A (or U:A), G:C, and C:G, where A, T, U, G, and C refer to adenine, thymine, uracil, guanine, and cytosine nucleic acid bases, respectively.

The polymer is formed of subunits, each of which comprises a cyclic backbone structure and linkage group, which collectively form an uncharged backbone, and a base attached to the cyclic backbone structure, which provides base-pair-specific hydrogen-bonding to the target. The requirements of the backbone structure, linkage group, and attached base in the polymer subunits are detailed below. In the context of these duplex binding polymers, the term "base" refers to planar base-pair-specific hydrogen-bonding moieties.

A. Subunit Base Requirements

Because of the symmetry of the polar minor-groove sites and the asymmetry of polar major-groove sites in Watson/Crick base-pairs, to achieve a given level of sequence specificity a minor-groove-binding agent would have to recognize twice as many base-pairs as would a corresponding major-groove-binding agent. Accordingly, hydrogen-bonding of the subunit base is to the polar sites in the major groove of the target duplex.
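The factor of two stated above can be illustrated with a short counting sketch. It assumes, as a simplification that is not spelled out in the text, that a major-groove reader can distinguish all four oriented base-pairs at each position while a minor-groove reader can distinguish only two classes. The specificity target `target_bits` is a hypothetical value chosen for illustration.

```python
import math

# Assumed number of distinguishable classes per base-pair position.
# Four oriented base-pairs for the major groove; two classes for the
# symmetric minor groove (an illustrative simplification).
MAJOR_GROOVE_CLASSES = 4
MINOR_GROOVE_CLASSES = 2


def base_pairs_needed(bits_required: float, classes_per_position: int) -> int:
    """Smallest number of base-pairs whose readout carries at least bits_required bits."""
    bits_per_position = math.log2(classes_per_position)
    return math.ceil(bits_required / bits_per_position)


target_bits = 32.0  # hypothetical specificity target, for illustration only
major = base_pairs_needed(target_bits, MAJOR_GROOVE_CLASSES)
minor = base_pairs_needed(target_bits, MINOR_GROOVE_CLASSES)
print(major, minor)  # 16 32
```

Under these assumptions the minor-groove reader needs twice as many positions for the same information content, which matches the statement in the text.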

Figures 1A-1D show T:A, A:T, C:G, and G:C oriented Watson/Crick base-pairs, with the major-groove hydrogen-bonding sites indicated by arrows in the figure. For the T:A and A:T oriented base-pairs, the polar major-groove sites include the N7 and a hydrogen on the N6 of adenine and the O4 of thymine (or uracil). For the C:G and G:C oriented base-pairs, the polar major-groove sites include the O6 and N7 of guanine and a hydrogen on the N4 of cytosine.

In order to make a significant contribution to the free energy of binding and to provide adequate base-pair specificity, the subunit base should form at least two hydrogen bonds to its target base-pair. That is, each subunit base in the polymer should contain at physiological pH a hydrogen-bonding array suitable for binding to two or three of the polar major-groove sites on its respective oriented target base-pair. Table 1 shows the hydrogen-bonding arrays comprising the polar major-groove sites for each of the four oriented Watson/Crick base-pairs, and the corresponding hydrogen-bonding array of the subunit base suitable for hydrogen-bonding to said polar major-groove sites.

Table 1

Oriented     Hydrogen-bonding array    Required hydrogen-bonding
base-pair    of base-pair              array of subunit base

A:T          **  H   **
T:A          **  H   **
C:G          H   **  **
G:C          **  **  H

In the table, X is generally an N, O, or S atom, but can also be F, Cl, or Br, having a non-bonded pair of electrons suitable for hydrogen bonding, and ** represents the nonbonded pair of electrons suitable for hydrogen-bonding.

As indicated above, the polymer subunit base should contain the specified hydrogen-bonding array at physiological pH (in contrast to the case for cytosine moieties used for Hoogsteen-type major-groove binding). This assures that at physiological pH, binding of the subunit base makes a substantial contribution to the free energy of binding between the polymer and its target duplex.
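The donor/acceptor matching described for Table 1 can be sketched in a few lines. The example below is illustrative only: it takes the base-pair arrays in the order they appear in Table 1, and the helper `complementary_array` is a hypothetical name. It derives a fully complementary three-site array, although the text requires a subunit base to engage only two or three of the sites.

```python
# Major-groove arrays of the oriented base-pairs, in the order listed in Table 1.
# "H" marks a donor hydrogen; "**" marks a non-bonded electron pair (acceptor).
BASE_PAIR_ARRAYS = {
    "A:T": ("**", "H", "**"),
    "T:A": ("**", "H", "**"),
    "C:G": ("H", "**", "**"),
    "G:C": ("**", "**", "H"),
}


def complementary_array(array):
    """Swap donor and acceptor at every site to get a matching subunit-base array."""
    swap = {"H": "**", "**": "H"}
    return tuple(swap[site] for site in array)


for base_pair, array in BASE_PAIR_ARRAYS.items():
    print(base_pair, array, "->", complementary_array(array))
```

Each acceptor site on the base-pair is matched by a donor hydrogen on the subunit base, and each donor hydrogen by an electron pair on an X atom of the subunit base.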

At physiological pH the subunit base should be predominantly non-ionized. More specifically, basic moieties should have pKb values of at least 7.5, and acidic moieties should have pKa values of at least 7.7. This lack of substantial ionic charge provides two advantages. First, for applications in living cells, the lack of ionic groups on the binding polymers facilitates passage of the polymer across biological membranes. Second, lack of negative charges avoids the problem of charge repulsion between the binding polymer and the negatively charged phosphates of its target duplex.
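As a rough check on these thresholds, the ionized fraction of a group can be estimated with the Henderson-Hasselbalch relation. The sketch below assumes a physiological pH of 7.4 and a water ion product of pKw = 14; neither value is stated in the text, and the function names are hypothetical.

```python
def fraction_ionized_acid(pka: float, ph: float) -> float:
    """Fraction of an acidic group present as the anion (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def fraction_ionized_base(pkb: float, ph: float, pkw: float = 14.0) -> float:
    """Fraction of a basic group present as the protonated cation."""
    pka_conjugate = pkw - pkb  # pKa of the conjugate acid
    return 1.0 / (1.0 + 10.0 ** (ph - pka_conjugate))


PH = 7.4  # assumed physiological pH

acid_at_threshold = fraction_ionized_acid(7.7, PH)
base_at_threshold = fraction_ionized_base(7.5, PH)
print(round(acid_at_threshold, 2))  # 0.33
print(round(base_at_threshold, 2))  # 0.11
```

At the stated threshold values the neutral form is already the majority species under these assumptions, and the ionized fraction falls further as pKa or pKb increases.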

Major-groove hydrogen-bonding arrays of the four oriented Watson/Crick base-pairs are illustrated in Table 2.

In Table 2, H is a hydrogen bound to a nitrogen, and ** is an electron pair of nitrogen or oxygen available for hydrogen bonding.

The respective positioning of the base-pair H-bonding arrays shown in Table 2, which approximates their relative positions in the major-groove of a duplex genetic sequence, illustrates the fact that two of the H-bonding sites of a C:G base-pair (NH4 and O6) are positioned nearly the same as two of the H-bonding sites of an A:T(U) base-pair (NH6 and O4). Likewise, two of the hydrogen-bonding sites of a G:C base-pair (O6 and NH4) are positioned nearly the same as two of the H-bonding sites of a T(U):A base-pair (O4 and NH6). Because of these similarities in positioning between central hydrogen-bonding sites of the oriented base-pairs, subunit bases which hydrogen-bond only to the polar sites near the center of the major-groove (underlined in the above table) lack adequate specificity for a given base-pair. Accordingly, in order for a subunit base to achieve high specificity for a single oriented base-pair, the base should hydrogen bond to the N7 of its respective target base-pair.

If a subunit base is to bind to only one of the four oriented Watson/Crick base-pairs, the tautomeric state of that subunit base should be sufficiently fixed under conditions of use so that at least two of the hydrogen-bonding groups positioned for base-pair binding will not tautomerize to give a structure capable of H-bonding with comparable affinity to a base-pair other than the intended one. To illustrate, Figure 2A shows an acceptably fixed structure (2-amino pyrimidine, which exists almost exclusively in the 2-amino tautomeric form). Figure 2B shows a second structure which lacks specificity for a single base-pair due to its facile tautomerization under physiological conditions (2-pyrimidinone). Dominant tautomeric forms of a wide assortment of representative heterocyclic structures have been tabulated in a book edited by Elguero, Marzin, Katritzky & Linda (1976).
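The positional argument made above about the central hydrogen-bonding sites, and the resulting need to engage N7, can be sketched as a small lookup. The four-slot grid below is a hypothetical simplification of the relative site positions, introduced only for illustration; the slot assignments are assumptions consistent with the site pairings named in the text.

```python
# Hypothetical 4-slot grid across the major groove. None means no polar site
# in that slot. "H" is a donor hydrogen, "**" an acceptor electron pair.
SITES = {
    "A:T": ("**", "H", "**", None),   # N7, NH6, O4
    "C:G": (None, "H", "**", "**"),   # NH4, O6, N7
    "G:C": ("**", "**", "H", None),   # N7, O6, NH4
    "T:A": (None, "**", "H", "**"),   # O4, NH6, N7
}
CENTRAL_SLOTS = (1, 2)
ALL_SLOTS = (0, 1, 2, 3)


def indistinguishable_groups(slots):
    """Group base-pairs that present identical patterns at the given slots."""
    groups = {}
    for base_pair, array in SITES.items():
        key = tuple(array[i] for i in slots)
        groups.setdefault(key, []).append(base_pair)
    return sorted(sorted(group) for group in groups.values())


print(indistinguishable_groups(CENTRAL_SLOTS))  # [['A:T', 'C:G'], ['G:C', 'T:A']]
print(indistinguishable_groups(ALL_SLOTS))      # [['A:T'], ['C:G'], ['G:C'], ['T:A']]
```

Reading only the two central slots merges A:T with C:G and G:C with T:A, while including the outer N7 slots separates all four oriented base-pairs in this simplified model.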

The subunit bases should have structures which provide a relatively rigid arrangement of at least two of the base-pair H-bonding groups positioned for base-pair binding. Such rigidity is best afforded by a ring structure wherein at least two of the polar hetero atoms to be involved in H-bonding to the target base-pair are either part of the ring or directly attached to the ring. To illustrate, Figure 3A shows a structure (2-amino-3-cyano pyrrole) which satisfies this rigidity requirement. Figure 3B shows a structure (2-carboxamide pyrrole) which fails to satisfy the requirement.

The simplest sequence-specific binding polymers are those which bind to a target which is composed of contiguous base-pairs in the polynucleotide duplex. This, in turn, requires that the subunit bases of the binding polymer be no thicker than the target base-pairs to which they are to bind. Accordingly, each subunit structure should be planar. This is best achieved by using subunit bases having aromatic character and/or having plane trigonal bonding for most or all ring atoms.

B. Subunit-Binding Constraints

Considering now the geometric requirements of the polymer subunits, most duplex nucleic acids adopt either of two general conformations. RNA/RNA and RNA/DNA duplexes adopt an A-type conformation. DNA/DNA duplexes adopt a B-type conformation, but can readily convert to an A conformation under certain conditions, such as high salt or low polarity solvent.

In duplex nucleic acids the polar major-groove sites on each of the Watson/Crick base-pairs are fairly regularly positioned with respect to corresponding arrays of major-groove sites on neighboring base-pairs, with the relative positions being defined by the helical conformation parameters of axial position, axial rise, and axial rotation. In principle, the backbone attachment positions of the different subunit bases, when the bases are hydrogen-bonded to their respective target base-pairs, need not be positioned in any regular way relative to their target base-pairs. However, when there is significant variability in the relative backbone attachment positions of the different subunit bases relative to their target base-pairs, each of the backbone structures of the component subunits in the polymer must be custom tailored with respect to backbone length and position of subunit base attachment, leading to extremely high development and production costs.

However, if all of the subunit bases of a given subunit set have similar backbone attachment positions and angles relative to their respective target base-pairs, then all subunits of the set can have identical backbone structures, greatly simplifying the synthetic effort required for polymer construction. To this end, the polymer subunits used in the present invention are selected, according to criteria described below, to have similar backbone attachment positions and angles.

To understand what is meant by similar backbone positions and angles, reference is made to Figure 4A, which shows a Watson/Crick base-pair (W/C bp) positioned relative to the helical axis (denoted K_t) of an A-form genetic duplex, i.e., (A, 12, 0.326) RNA. The lower horizontal line in the figure connects the two ribose C1' atoms of the Watson-Crick base-pair, and the vertical line (denoted PB) is the perpendicular bisector of the first-mentioned line.

The backbone attachment position and angles of a subunit base are then determined by positioning the subunit base on its corresponding target base-pair in this standardized position, with the subunit base being hydrogen bonded to the appropriate polar major-groove sites on the Watson-Crick base-pair, as shown for a 2,6-diaminotriazine subunit base in Figure 4B.

The backbone attachment position of the subunit base, relative to its A-form target duplex, can then be described by an R_a and θ_a value, where R_a is the radial distance, in angstroms, from the helical axis of the A-form target duplex to the center of the backbone atom (denoted B) to which the subunit base is attached, and θ_a is the angle, in degrees, about this helical axis, measured clockwise from the perpendicular bisector to the center of the aforementioned backbone atom. The attachment angle, A, is defined as the angle, in degrees, measured clockwise from the perpendicular bisector, between the perpendicular bisector and a line parallel to the bond between the subunit base and the backbone moiety.

Figure 4B illustrates R_a, θ_a, and A parameters for a 2,6-diaminotriazine subunit base hydrogen-bonded to a U:A base-pair in an A conformation. Figure 4C illustrates a correspondingly positioned base-pair of a B-form duplex, and Figure 4D illustrates R_b, θ_b, and A parameters for this 2,6-diaminotriazine subunit base hydrogen-bonded to a T:A base-pair in a B conformation.
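The definitions of R, θ, and A given above can be sketched numerically. The following is a minimal illustration only, assuming a two-dimensional view down the helical axis in which the axis sits at the origin and the perpendicular bisector (PB) points along +y, so that a clockwise angle runs from +y toward +x. The function name, the coordinate frame, and the choice of bond-vector direction are assumptions made for this sketch and are not taken from the figures.

```python
import math


def attachment_parameters(backbone_xy, bond_vector_xy):
    """Return (R, theta, A) for a backbone attachment atom.

    Assumed frame (hypothetical): helical axis at the origin, perpendicular
    bisector PB along +y, view down the axis so clockwise runs +y -> +x.
    backbone_xy    : (x, y) of the backbone atom B, in angstroms.
    bond_vector_xy : (dx, dy) direction of the base-to-backbone bond.
    """
    x, y = backbone_xy
    dx, dy = bond_vector_xy
    radius = math.hypot(x, y)                         # radial distance R
    theta = math.degrees(math.atan2(x, y)) % 360.0    # clockwise from PB
    angle_a = math.degrees(math.atan2(dx, dy)) % 360.0
    return radius, theta, angle_a


# Illustrative input only: an atom placed 7 angstroms from the axis,
# 28 degrees clockwise from PB, with the bond pointing along PB.
example_xy = (7.0 * math.sin(math.radians(28.0)),
              7.0 * math.cos(math.radians(28.0)))
print(attachment_parameters(example_xy, (0.0, 1.0)))
```

In practice the coordinates would come from a CPK or computer model of the subunit base hydrogen-bonded to its target base-pair; the sketch only shows how the three parameters follow from such coordinates.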

In order to unambiguously define the target base-pair for a selected subunit base with a given backbone attachment site, two orientations for each Watson/Crick base-pair in the target duplex must be considered. The resultant 4 oriented base-pairs are designated as A:T, T:A, C:G, and G:C (and corresponding base-pairs where U replaces T). The orientations of these base-pairs are defined in Table 3.

Table 3

Oriented Base-pair Designation     θ value for N7 of Purine of Target Base-pair
A:T (A:U)                          > 180°
T:A (U:A)                          < 180°
C:G                                < 180°
G:C                                > 180°

In principle, the backbone attachment position for any given subunit base, in position on its target base-pair, can have a θ value, X°, in the range of 0° to 180°. By flipping the target base-pair, the θ value of that same target-bound subunit base is changed to 360° - X°. The convention used in the following discussion is that the θ value for each subunit base of the binding polymer is less than 180°.

Thus, in the context of selecting a subunit set suitable for assembling the binding polymers disclosed herein, to explicitly define which orientation of a given base-pair constitutes the target for a specified subunit base, it is important to designate the orientation of that target base-pair such that the backbone attachment position of the base-pair-bound subunit base has a θ value less than 180°. To illustrate, a 2,6-diaminotriazine subunit base having a backbone moiety attached through the C4 of the triazine (Figure 4B) can bind to a U:A base-pair in an A conformation to give a θ value of 28°. When this same subunit base is hydrogen-bonded to that same base-pair in the base-pair's opposite orientation (i.e., A:U), the θ value for the subunit base is 332° (i.e., 360° - 28°). The convention used herein dictates that the target base-pair for this subunit base is U:A (where θ is < 180°), and not A:U (where θ is > 180°).
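The orientation convention just described can be expressed as a small helper. This is a sketch under stated assumptions: the function and table names are hypothetical, and a θ value of exactly 180° is simply left unchanged, since the text does not address that case.

```python
# Each oriented base-pair and its flipped designation.
FLIPPED = {
    "A:T": "T:A", "T:A": "A:T",
    "A:U": "U:A", "U:A": "A:U",
    "C:G": "G:C", "G:C": "C:G",
}


def canonical_target(base_pair, theta_deg):
    """Return (oriented base-pair, theta) with theta reported below 180 deg.

    If the measured theta exceeds 180 degrees, the target is re-designated
    in its opposite orientation and theta becomes 360 - theta.
    """
    theta = theta_deg % 360.0
    if theta > 180.0:
        return FLIPPED[base_pair], 360.0 - theta
    return base_pair, theta


# The 2,6-diaminotriazine example from the text: 332 deg measured on A:U
# corresponds to 28 deg on U:A.
print(canonical_target("A:U", 332.0))
```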

Acceptable values of R, θ, and A for prospective recognition moieties can be readily obtained with CPK molecular models (The Ealing Corp., South Natick, Mass., USA). Slightly more accurate values can be estimated by optimization of the hydrogen-bonding in the subunit base/base-pair triplex via a computer molecular mechanics program, such as are available commercially. The subunit bases should be so selected that a given subunit set (the set of subunits used in assembly of a given polymer) all have R values within about 2 angstroms of each other, θ values within about 20° of each other, and A values within about 30° of each other.

In order for a subunit base to have a high specificity for only one of the oriented base-pairs, it is important that the subunit base not be able to bind to a given base-pair in both orientations (e.g., G:C and C:G) simply by rotation of the subunit base about its linkage to its backbone structure. Therefore, the earlier-described backbone attachment position or angle should be asymmetrical with respect to the C1' positions of the target base-pair. Specifically, θ for the subunit base should have a value greater than about 10°, or the attachment angle, A, for the subunit base should have a value greater than about 25°.
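The matching and asymmetry criteria stated above lend themselves to a simple check. The sketch below is illustrative only: the class and function names are hypothetical, the numeric values in the example are invented placeholders rather than measured parameters, and "within about X of each other" is interpreted as the spread (maximum minus minimum) across the set, with no handling of angles that wrap past 360°.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubunitBase:
    name: str
    r: float      # radial distance R, angstroms
    theta: float  # theta, degrees
    a: float      # attachment angle A, degrees


def _spread(values):
    return max(values) - min(values)


def is_matched_set(bases, r_tol=2.0, theta_tol=20.0, a_tol=30.0):
    """True if R, theta, and A each vary by no more than the tolerances."""
    if not bases:
        raise ValueError("subunit set is empty")
    return (
        _spread([b.r for b in bases]) <= r_tol
        and _spread([b.theta for b in bases]) <= theta_tol
        and _spread([b.a for b in bases]) <= a_tol
    )


def is_asymmetric(base, theta_min=10.0, a_min=25.0):
    """True if theta > about 10 deg or A > about 25 deg, per the text."""
    return base.theta > theta_min or base.a > a_min


# Placeholder values for illustration; not taken from any table.
base_x = SubunitBase("X", 6.5, 28.0, 40.0)
base_y = SubunitBase("Y", 7.2, 35.0, 55.0)
base_z = SubunitBase("Z", 9.5, 30.0, 45.0)
print(is_matched_set([base_x, base_y]), is_matched_set([base_x, base_y, base_z]))
```

Because the tolerances in the text are stated as approximate ("about"), the default thresholds are exposed as parameters rather than treated as hard limits.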

C. Backbone Structure Constraints

This section considers the backbone structure constraints for a selected subunit set. Principally, the structure should be joinable in any selected order to other subunit structures via uncharged linkages having the general properties discussed in Section D below. Further, the subunit backbone structures and linkages must provide proper spacing and allow correct orientation and positioning of their respective subunit bases for effective binding of the subunit bases to their respective oriented base-pairs in the target duplex sequence.

A principal requirement for the subunit backbone structure and linkage is that it provide a means for joining the subunits in essentially any specified order. This requirement can be satisfied by structures containing either heterologous or homologous linking groups. Heterologous type backbone moieties contain a nucleophilic group (N) on one end and an electrophilic group (E) on the other end, as illustrated below.

N ----- E

The preferred functional groups for the N component include primary and secondary amine, hydrazine, hydroxyl, sulfhydryl, and hydroxylamine. The preferred functional groups for the E component include the following acids and derivatives thereof: carboxylic, thiocarboxylic, phosphoric, thiophosphoric, esters, thioesters, and amides of phosphoric and thiophosphoric, phosphonic and thiophosphonic, and sulfonic acid. Other suitable E groups include aldehyde, dialdehyde (or vicinal hydroxyls suitable for conversion to a dialdehyde), alkyl halide, and alkyl tosylate.

Homologous type backbone moieties can be of two types, one type having nucleophilic end groups and the other type having electrophilic end groups; or, a single homologous backbone moiety can be alternated with an appropriate linker. These alternatives are illustrated below:

N ----- N alternated with E ----- E

N ----- N alternated with E linker

N linker alternated with E ----- E

Preferred functional groups for N and E are as in the heterologous backbone moieties. Preferred E linkers include carbonyl; thiocarbonyl; alkyl, ester, thioester, and amide of phosphoryl and thiophosphoryl; phosphonyl and thiophosphonyl; sulfonyl; and, oxalic acid. A preferred N linker is 1,2-Dimethylhydrazine.

The present invention contemplates a variety of both cyclic and acyclic backbone structures, as will be illustrated in Figures 5-9 below. One limitation of acyclic backbone structures is that activation of the electrophilic linking groups, preparatory to polymer assembly, can lead to varying amounts of undesired intramolecular attack on sites of the subunit base. By contrast, with properly structured cyclic backbone moieties, the activated electrophile can be effectively isolated from reactive sites on the subunit base, thereby reducing unwanted intramolecular reactions.

However, use of aliphatic cyclic backbone moieties does entail the presence of multiple chiral centers in each backbone structure. With proper selection of cyclic backbone structures, synthetic challenges associated with such multiple chiral centers can be largely circumvented, by utilizing readily available natural products for the backbone moiety or, preferably, for the entire subunit, or as a proximal precursor thereto.

This preference for backbone structures, or entire subunits, from natural sources reflects the difficulty, and corresponding greater expense, of de novo preparation of aliphatic ring structures having multiple chiral centers. Accordingly, preferred categories of cyclic backbone moieties are those comprising, or readily derived from, deoxyribose or ribose. In addition, certain other natural cyclic structures wherein a single enantiomer is available, or can be readily prepared or isolated, are also preferred. Figures 5A-5C illustrate exemplary cyclic backbone structures comprising or derived from deoxyribosides or ribosides. R' in the figure indicates H or alkyl, and R_i indicates the subunit base, which, as seen, has the same β-orientation as natural nucleosides. Figures 6A-6F illustrate exemplary cyclic morpholino backbone structures derivable from ribosides, having either a β-orientation (Figures 6A-6C) or an α-orientation (Figures 6D-6F) for the 5'-methylene (numbered as in the parent ribose), again with a β-orientation of the R_i base. The synthesis of such subunits will be described below and in Examples 1-5. Figures 7A-7E show representative types of acyclic backbone structures.

D. Intersubunit Linkages

This section considers several types and properties of intersubunit linkages used in linking subunits to form the polymer of the invention. First, the backbone must be stable in neutral aqueous conditions. Since the binding polymers are designed for use under physiological conditions, it is necessary that the intersubunit linkages be stable under said conditions. The linkages must also be stable under those conditions required for polymer assembly, deprotection, and purification. To illustrate this stability requirement, an alkyl sulfonate (R-(SO₂)-O-CH₂-R') is precluded because the resultant structure is unduly sensitive to nucleophilic attack on the CH₂. Further, while carbonates (R-O-(C=O)-O-R') and esters (R-(C=O)-O-R') can be successfully prepared, their instability under physiological conditions renders them of little practical value.

Secondly, the backbone must be adaptable to a conformation suitable for target binding. If the intersubunit linkage is such that it exhibits specific rotational conformations (as is the case for amides, thioamides, ureas, thioureas, carbamates, thiocarbamates, carbazates, hydrazides, thiohydrazides, sulfonamides, sulfamides, and sulfonylhydrazides) then it is important either that the rotamer compatible with target binding be the lowest energy conformation, or that the barrier to rotation between the conformations be relatively low (i.e., that the conformations be rapidly interchangeable at physiological temperatures). Thus, a secondary amide (N-alkyl amide, which prefers to adopt a trans conformation) would be acceptable if the trans conformation is suitable for pairing to the target duplex. By contrast, tertiary amides and related N,N-dialkyl structures generally have two approximately equal low energy conformations, and so to be useful in a binding polymer, the linkages should have a relatively low energy barrier to interconversion between the two conformations.

The barrier to rotation between two conformers can be assessed by NMR as follows: At a temperature where the two conformers are interconverting slowly relative to the NMR time scale (on the order of 10⁻⁸ sec) two distinct signals are often seen, each representing a single conformer. As the NMR spectra are taken at progressively higher temperatures, the two conformer signals coalesce, indicating rapid interconversion. The coalescence temperature (Tc) thus provides a useful measure of the rotational freedom of various linkage types. For example, N,N-dimethylformamide exhibits a Tc of about 114°C (Bassindale, 1984) and conformers of analogous tertiary amides have been found to interconvert slowly in biological macromolecules. By contrast, an N,N-dialkyl carbamate-containing structure exhibits a Tc just under 44°C (unpublished results obtained in support of the present invention), indicating reasonable conformational freedom at physiological temperature.

An N,N-dialkylsulfinamide (which should have a rotational energy barrier similar to that of sulfonamide and related substances) has been reported to have a Tc lower than minus 60°C (Tet. Let. 10 509 (1964)). Based on these considerations, backbone linkages containing N,N-dialkyl-type carbamate, thiocarbamate, carbazate, and various amidates of phosphorus and sulfur are preferred, while N,N-dialkyl-type amide, thioamide, urea, thiourea, hydrazide, and thiohydrazide linkages are generally unacceptable.

Third, the backbone should be uncharged. For therapeutic applications it is desirable to design these binding polymers so that they i) are not sequestered by the reticuloendothelial lining of the capillaries; ii) readily cross cell membranes; iii) are resistant to degradation by nucleases; and, iv) are not repelled by the high density of negative charge on the backbones of the target duplex. These design objectives are best achieved by using both intersubunit linkages and backbone moieties which are largely uncharged (non-ionic) at physiological pH.

When the subunit bases are positioned on contiguous base-pairs of their target sequence via hydrogen-bonding, and if all recognition moieties of the subunit set have well matched R, θ, and A values, then the distance from the subunit base attachment position of one backbone moiety to the attachment position of the next backbone moiety is the square root of:

(R sin(rot))² + (R cos(rot) - R)² + (rise)²

where R is the distance from the helical axis to the center of the atom of the backbone moiety to which the subunit base is attached, rot is the axial rotation value for the target duplex (typically about 30° to 33° for an A-form duplex and 36° for a B-form duplex), and rise is the axial rise value for the target duplex (typically about 2.8 to 3.3 Å for an A-form duplex and 3.4 Å for a B-form duplex).
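The spacing expression above can be evaluated directly. The sketch below is a minimal illustration; the function name is hypothetical, and the R values used in the example are placeholders, since R depends on the particular subunit set. The rotation and rise values are taken from the typical ranges quoted in the text.

```python
import math


def attachment_spacing(radius, rot_deg, rise):
    """Distance between successive backbone attachment positions.

    radius  : R, distance from the helical axis to the attachment atom
              (angstroms).
    rot_deg : axial rotation per base-pair (degrees).
    rise    : axial rise per base-pair (angstroms).
    """
    rot = math.radians(rot_deg)
    return math.sqrt(
        (radius * math.sin(rot)) ** 2
        + (radius * math.cos(rot) - radius) ** 2
        + rise ** 2
    )


# Placeholder R values; rotation and rise from the typical values quoted.
print(attachment_spacing(7.0, 30.0, 2.8))    # A-form-like parameters
print(attachment_spacing(11.0, 36.0, 3.4))   # B-form-like parameters
```

The first two terms together equal the squared chord length (2R sin(rot/2))², so the spacing grows with R for a fixed rotation; this is one reason a set with larger R values calls for a longer unit backbone.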

It is this distance which must be spanned by the unit backbone length of the binding polymer, i.e., the length of one backbone structure plus the intersubunit linkage between backbone structures. However, it should be emphasized that both A-form (RNA/RNA and RNA/DNA duplexes) and B-form (DNA/DNA) target duplexes are somewhat flexible and so can generally accommodate binding polymers which have unit backbone lengths which are a fraction of an angstrom shorter or longer than the calculated length requirement. Further, it should be appreciated that DNA/DNA in a B conformation can be converted to an A conformation under certain conditions.

In selecting a particular backbone structure, the following factors bear on the required length and so should be taken into consideration: first, any conformational restrictions imposed by hindered rotations about bonds such as amides and carbamates; second, when the subunit bases are in position on their target base-pairs, any steric interactions between these bases and the target duplex, and between the bases and the polymer backbone; third, steric interactions between different components of the backbone structure; and fourth, for cyclic backbone moieties, favored conformations of the component ring structure of the subunit backbone structures.

A generally satisfactory way to determine whether or not a prospective polymer backbone is likely to be acceptable for use against a particular target conformation (e.g., A-form or B-form) is to assemble with CPK molecular models a representative target genetic duplex in the desired conformation, with subunit bases H-bonded thereto, and then add the prospective polymer backbone. If the prospective polymer backbone can be easily attached without having to adopt an energetically unfavorable conformation, and if the attachment of the polymer backbone does not cause significant perturbation of the target structure, and if there are no unacceptable steric interactions, then the backbone should be operable. Additional support for the suitability of a prospective backbone structure can be obtained by modeling the polymer/target triplex on a computer using a molecular mechanics program to obtain an optimized bonding structure via an energy minimization procedure. Such modeling can, on occasion, identify significant unfavorable interactions (e.g., dipole-dipole repulsions) which might be overlooked in the initial CPK modeling.

As noted above, such factors as R, θ, and A values for the subunit bases of a given subunit set, and steric and rotational constraints of particular subunit structures and intersubunit linkages, bear on how long a unit backbone must be in order to provide the correct spacing of subunit bases for binding to a target duplex in a given conformation. However, as a rule, subunit sets wherein the subunit bases of the set have R_a values less than about 7 angstroms and θ values clustered within about 12° of each other, and A values clustered within about 20° of each other, generally require a 4-atom or 5-atom unit-length acyclic-type backbone, such as shown in Figures 8A-8C, or a 6-atom unit-length cyclic-type backbone, such as shown in Figures 9A-9D, for binding to target duplexes in an A-type conformation.

Subunit base sets having R_b values less than about 11.5 angstroms, θ_b values within about 9° of each other, and A values clustered within about 20° of each other generally require a 6-atom unit-length acyclic-type backbone, such as shown in Figures 8D-8E, or a 7-atom unit-length cyclic-type backbone, such as shown in Figures 9E-9F, for binding to target duplexes in a B-type conformation.

However, it should be noted that DNA/DNA duplexes, which generally exist in a B conformation, can readily convert to an A conformation. Two such conditions which cause this B to A transition are high salt and low polarity solvent. It also appears that a B to A conformational transition of the target duplex can be induced by duplex-directed binding polymers having backbone unit-lengths shorter than optimal for binding to a B-form duplex. However, such conformation transitions incur a cost in free energy of binding, and so, to compensate, the binding polymer's affinity for its target must be increased accordingly. Because of the feasibility of this B to A conformational transition of target duplexes, for some applications the shorter unit-length backbones suitable for A-form target duplexes can also be used for targeting genetic sequences which exist normally in a B conformation.

E. Subunit Sets

When the subunit bases of a set have acceptably matched R, θ, and A values, and when subunit backbone structures which are identical or very similar in length and subunit base attachment position and orientation are used for all subunits of the set, the subunits of that set can be assembled in any desired order for targeting a selected duplex sequence. Each subunit of such a matched set consists of a subunit base linked at a standard position to a standard-length backbone structure. The subunit base of each subunit of the set has an R, θ, and A value closely matched to the R, θ, and A values of the subunit bases of the other subunits of that set.

According to an important feature of the invention, the polymer subunits in a set must contain at least two different subunit types, each specific for a different oriented base-pair. Specifically, the base of each of at least two different subunits of the set is effective to form at least two hydrogen bonds with the major-groove sites of its respective target base-pair, where one of those hydrogen bonds is to the purine N7 nitrogen of the target base-pair, as discussed above.

The other subunit or subunits in the set may, but do not necessarily, bind with high specificity to oriented base-pairs in the target sequence. Thus, another subunit of the set may bind satisfactorily to two different oriented base-pairs, as will be seen below. Such low-specificity or non-specific subunits serve (a) to provide required spacing between high-specificity subunits in the polymer and (b) to contribute to stacking interactions between the planar bases in the polymer/duplex complex.

In addition, and according to an important feature of the invention, the subunits in the polymer must provide high-specificity base binding to at least about 70% of the oriented base-pairs in the target sequence. Thus, where a subunit set includes only two high-specificity bases, the target duplex sequence must contain at least 70% oriented base-pairs which are specifically bound by those two high-specificity bases.

E1. Basic Subunit Set for C:G and T:A or U:A Oriented Base-pairs

The most basic subunit set is suitable for targeting duplex genetic sequences containing only C:G and T:A or U:A oriented base-pairs.

The first member of this basic subunit set is a high-specificity guanine subunit containing a guanine or 6-thioguanine subunit base effective to hydrogen bond specifically to a C:G oriented base-pair. As illustrated in Figure 10A, guanine (or 6-thioguanine) forms three hydrogen bonds to the polar major-groove sites of a C:G oriented base-pair, including the guanine N7 of that target base-pair. The subunit may be formed with any of a variety of deoxyribose, ribose or morpholino backbone structures, with the base attached to the backbone structure in the β-stereochemical orientation, as illustrated in Example 2.

The second member of the basic set is a high-specificity diaminopurine subunit containing a 2,6-diaminopurine subunit base effective to hydrogen bond specifically to a T:A or U:A oriented base-pair. As illustrated in Figure 10B, the 2,6-diaminopurine base forms three hydrogen bonds to the polar major-groove sites of a T:A or U:A oriented base-pair, including the adenine N7 of that target base-pair. As with the guanine subunits, a variety of diaminopurine subunits with deoxyribose, ribose and morpholino backbone structures, and having the desired β-stereochemical attachment of the base to the backbone structure, can be prepared by modifications of commercially available nucleosides, also as illustrated in Example 2.

CPK molecular modeling showed that the guanine and diaminopurine moieties should effectively and specifically bind their target base-pairs. Additional support for this major-groove hydrogen-bonding mode was obtained from a best fit analysis carried out for these two trimolecular complexes, C:G:G and U:A:D. An exhaustive review by Voet and Rich (1970) tabulates the lengths and angles of hydrogen-bonds from x-ray diffraction studies of crystalline complexes of purines and pyrimidines. In those tabulations NH:N bonds range in length from 2.75 Å to 3.15 Å and their angles range from 115° to 145°. NH:O bonds range in length from 2.60 Å to 3.20 Å and their angles range from 110° to 145°.

In the best fit calculations, structural parameters used for the purines and pyrimidines in the Watson-Crick base-pairs are those given by Rich and Seeman (1975). Those parameters were obtained from x-ray diffraction of ApU and GpC crystals (right handed anti-parallel Watson-Crick) which were solved at atomic resolution. The guanine structural parameters referenced above were also used for the subunit base in Figure 10A. The 2,6-diaminopurine subunit base of Figure 10B was assumed to have structural parameters essentially identical to those of 9-ethyl-2,6-diaminopurine obtained from x-ray diffraction studies of crystalline trimolecular complexes of 9-ethyl-2,6-diaminopurine hydrogen-bonded to two 1-methylthymines (one thymine bonded in the Watson-Crick mode and the other thymine bonded in the reverse-Hoogsteen mode) as reported by Sakore et al. (1969).

To simplify the analysis, the approximation was made that all atoms are in the same plane. Table 4 gives the results of this analysis. In this table the standard purine and pyrimidine numbering system is used throughout, subunit base-G stands for the subunit base of Figure 10A (guanine) and subunit base-D for the subunit base of Figure 10B (2,6-diaminopurine). Angles are measured as in Voet and Rich referenced above.

Table 4

Guanine subunit base H-bonded to a C:G base-pair

W/C hydrogen-bonds: O2(C):NH2(G); N3(C):NH1(G); NH4(C):O6(G)

Major-Groove hydrogen-bonds: NH2(subunit base-G):N7(G); NH1(subunit base-G):O6(G); O6(subunit base-G):NH4(C)

Diaminopurine subunit base H-bonded to a U:A base-pair

W/C hydrogen-bonds: NH3(U):N1(A); O4(U):NH6(A)

Major-Groove hydrogen-bonds: NH2(subunit base-D):N7(A); N1(subunit base-D):NH6(A); NH6(subunit base-D):O4(U)

As can be seen from this table, all hydrogen-bond angles and lengths in the subunit base/base-pair complexes fall within established angle and length limits for hydrogen-bonds.

E2. Spacer Subunits for A:T and G:C Oriented Base-pairs

The basic guanine plus diaminopurine subunit set can be easily prepared from readily available guanosine or deoxyguanosine. However, binding polymers assembled from only these two subunits, and targeted against sequences of at least 16 contiguous base-pairs, are expected to have targets in only quite large viruses having genome sizes on the order of 65,000 base-pairs or greater.
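One plausible way to rationalize the genome-size figure, which the text does not derive explicitly, is to assume that each position of a duplex is equally likely to hold any of the four oriented base-pairs. Under that assumption, a given 16-base-pair window consists solely of the two targetable oriented base-pairs with probability (2/4)¹⁶ = 1/65,536. The sketch below simply evaluates that estimate; the function name and the random-sequence model are assumptions, and real genomes are not random.

```python
def expected_target_sites(genome_bp, target_len=16, targetable=2, total=4):
    """Expected number of windows composed only of targetable base-pairs.

    Assumes independent, equiprobable oriented base-pairs at each position
    (a simplifying model, not a property of any real genome).
    """
    windows = max(genome_bp - target_len + 1, 0)
    return windows * (targetable / total) ** target_len


# A genome of roughly 65,000 base-pairs is expected to hold about one site,
# whereas a 3,200 base-pair genome is expected to hold far fewer than one.
print(expected_target_sites(65_000))
print(expected_target_sites(3_200))
```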

However, it is desirable to have binding polymers which can be targeted against a much broader range of viruses, including even quite small viruses such as Hepatitis B, which has a genome size of only 3,200 base-pairs. One effective approach to extending the targeting range of these binding polymers, without substantially increasing their cost of production, is to target sequences composed predominantly (at least about 70%) of target base-pairs for the guanine and diaminopurine high-specificity subunit bases (i.e., oriented base-pairs C:G and T:A or U:A). The remaining base-pairs in the target sequence (i.e., no more than about 30% G:C and/or A:T or A:U) can then be accommodated by low-specificity "spacer" bases in the binding polymer, which serve primarily to provide continuity of stacking interactions between the contiguous subunit bases of the binding polymer when that polymer is in position on its target duplex.

Thus, in one embodiment, a polymer assembled from the basic subunit set described in Section E1 additionally includes one or more low-specificity spacer subunit bases. When the binding polymer is in position on its target duplex, with the subunit bases stacked, the spacer subunit bases (which are not necessarily hydrogen-bonded to their respective base-pairs) should have R, θ, and A values which can closely match the R, θ, and A values of the high-specificity subunit bases. Specifically, for the full subunit set, the R values should all be within about 2 Å, θ values should all be within about 20°, and A values should all be within about 30°. Preferably, the spacer subunit bases should also provide modest hydrogen-bonding to their respective target base-pairs so as to make some contribution to target binding specificity and affinity.

Where the target sequence contains a G:C oriented base-pair, one preferred spacer subunit in the subunit set contains a cytosine base, which can hydrogen-bond weakly to G:C and to T:A oriented base-pairs. Figure 11A shows cytosine hydrogen bonded to the major-groove sites of a G:C base-pair, and Figure 11B shows cytosine hydrogen bonded to a T:A base-pair. In neither case does this include a hydrogen bond to the N7 of the purine of a target base-pair.

Where the target sequence contains an A:T or A:U oriented base-pair, one preferred spacer subunit in the subunit set contains a uracil (or thymine) base, which can hydrogen-bond weakly to A:T and to C:G oriented base-pairs. Figure 12A shows uracil hydrogen bonded to the major-groove sites of an A:T base-pair, and Figure 12B shows uracil hydrogen bonded to a C:G base-pair. As with the cytosine spacer, neither of these hydrogen bonding interactions involves the N7 of the purine of a target base-pair.

Although these two subunit spacer bases provide only low-specificity and low affinity binding to their target base-pairs, nonetheless: i) they effectively provide for continuity of subunit base stacking in the target-bound binding polymer; ii) they have R, θ, and A values which are acceptably matched with the R, θ, and A values of the high-specificity guanine and diaminopurine subunit bases of the subunit set; and iii) the spacer subunits, or close precursors thereto, are commercially available and relatively inexpensive.
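The assignment of subunit bases to a target sequence, together with the requirement that at least about 70% of the target base-pairs be bound by high-specificity bases, can be sketched as follows. This is an illustration only: the function and table names are hypothetical, the target is supplied as a list of oriented base-pair designations, uracil is chosen arbitrarily over thymine as the A:T spacer, and the ordering of subunits along the polymer relative to the duplex strands is not modeled.

```python
# High-specificity bases: guanine (G) for C:G, 2,6-diaminopurine (D)
# for T:A or U:A.
HIGH_SPECIFICITY = {"C:G": "G", "T:A": "D", "U:A": "D"}
# Low-specificity spacers: cytosine (C) for G:C, uracil (U) for A:T or A:U.
SPACER = {"G:C": "C", "A:T": "U", "A:U": "U"}


def design_polymer(target, min_high_fraction=0.70):
    """Map oriented base-pairs to subunit bases.

    Returns (base string, fraction of high-specificity positions,
    whether that fraction meets the threshold).
    """
    if not target:
        raise ValueError("target sequence is empty")
    bases = []
    high = 0
    for base_pair in target:
        if base_pair in HIGH_SPECIFICITY:
            bases.append(HIGH_SPECIFICITY[base_pair])
            high += 1
        elif base_pair in SPACER:
            bases.append(SPACER[base_pair])
        else:
            raise ValueError(f"unknown oriented base-pair: {base_pair!r}")
    fraction = high / len(target)
    return "".join(bases), fraction, fraction >= min_high_fraction


# Hypothetical 10-base-pair target containing two spacer positions.
example_target = ["C:G", "T:A", "G:C", "C:G", "T:A",
                  "A:T", "C:G", "T:A", "C:G", "T:A"]
print(design_polymer(example_target))
```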

Syntheses of subunit sets containing the four subunit bases guanine, diaminopurine, cytosine, and uracil (or thymine), and having various deoxyribose, ribose and morpholino backbone structures, are described in Example 2. The sets described in the example have the following backbone structures:

(a) 2'-deoxyribose, seen in Figure 5A, Example 2A;

(b) 2'-O-methylribose, seen in Figure 5B (R = methyl), Example 2B;

(c) morpholino, seen in Figure 6A, Example 2C;

(d) N-carboxymethylmorpholino-5'-amino, seen in Figure 6C, Example 2D;

(e) N-carboxymethylmorpholino-(alpha)5'-amino, seen generally in Figure 6F, Example 2E;

(f) ribose with 5'carbazate, seen in Figure 5C, Example 2F;

(g) ribose with 5'sulfonylhydrazide, seen in Figure 5C, but where the carbonyl group is replaced by a sulfonyl group, Example 2G;

(h) ribose with 5'glycinamide, seen in Figure 5C, but where the OCONHNH₂ group is replaced by NHCOCH₂NH₂, Example 2H; and,

(i) ribose with 5'(aminomethyl)(ethyl)phosphate, seen in Figure 5C, but where the OCONHNH₂ group is replaced by OPO₂EtCH₂NH₂, Example 2I.

Table 5 shows the base-pair specificities and approximate R, θ, and A values for the subunit bases of this guanine, diaminopurine, cytosine, and uracil (or thymine) subunit set.

It will be appreciated that binding polymers prepared with the above G, D, C and U or T subunit set also have the potential to bind to single-stranded genetic sequences. Specifically, the polymer will be able to bind in a Watson-Crick pairing mode to a single-stranded polynucleotide of the appropriate base sequence.

Since the spacer subunits, C and U or T, in the polymer are degenerate in binding specificity, at least two of these low-specificity spacer subunits are required to provide a level of target specificity equivalent to that provided by one high-specificity subunit. Thus, a binding polymer containing 16 high-specificity subunit bases provides about the same level of target specificity as a binding polymer containing 12 high-specificity subunit bases and 8 low-specificity spacer subunit bases.
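The equivalence stated above can be illustrated with a short information-counting sketch. It assumes, as an illustration only, that a high-specificity base selects one of the four oriented base-pairs (2 bits) and that each spacer base accepts two of the four (1 bit); the function name `recognition_bits` is hypothetical and not part of the described method.

```python
import math

# Assumption for illustration: a high-specificity base accepts 1 of the 4
# oriented base-pairs, and a spacer base accepts 2 of the 4.
PAIRS_ACCEPTED_HIGH = 1
PAIRS_ACCEPTED_SPACER = 2
ORIENTED_BASE_PAIRS = 4


def recognition_bits(n_high: int, n_spacer: int) -> float:
    """Return the sequence information (in bits) recognized by a polymer."""
    bits_high = math.log2(ORIENTED_BASE_PAIRS / PAIRS_ACCEPTED_HIGH)      # 2 bits
    bits_spacer = math.log2(ORIENTED_BASE_PAIRS / PAIRS_ACCEPTED_SPACER)  # 1 bit
    return n_high * bits_high + n_spacer * bits_spacer


print(recognition_bits(16, 0))  # 32.0
print(recognition_bits(12, 8))  # 32.0
```

Under these assumptions the two polymers in the example recognize the same amount of sequence information, which is consistent with two spacer subunits counting as one high-specificity subunit.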

E3. Subunit Set with a Tautomeric Subunit Specific for A:T and G:C Oriented Base-pairs

In another embodiment, the guanine plus diaminopurine subunit set described in Section E1 includes an additional subunit having a tautomeric subunit base capable of hydrogen bonding to either G:C or A:T oriented base-pairs. A generalized skeletal ring structure and hydrogen bonding array of one preferred base type is shown in Figure 13A, where X₁ is H or NH₂; X₂ is H, F, or Cl; and B indicates the polymer backbone. Figures 13B-13D show three preferred embodiments of this tautomeric base, as discussed further below. The hydrogen bonding to target base-pairs by different tautomeric forms of the base from Figure 13D is shown in Figures 14A and 14B for G:C and A:T oriented base-pairs, respectively. As seen from Figure 14, X₁ can be a hydrogen-bond acceptor when the tautomer is hydrogen bonded to a G:C base-pair, to provide three hydrogen bonds to the base-pair. Similarly, X₁ can be a hydrogen-bond donor when the tautomer is hydrogen bonded to an A:T base-pair, to provide three hydrogen bonds to the base-pair.

Table 6 shows the base-pair specificities and approximate R, θ, and A values for the subunit bases guanine, diaminopurine, and the subunit base of Figure 14:

Table 6

Subunit Base        Base-pair Specificity    R        θ      A

G                   C:G                      5.8 Å    33°    60°
D                   T:A                      5.6 Å    32°    60°
Tautomeric Base     G:C & A:T                6.3 Å    36°    55°
of Figure 13B
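As a minimal sketch, the three tabulated parameters for this subunit set can be checked against the matching tolerances stated earlier for a full subunit set (about 2 Å, about 20°, and about 30°). The key names `R`, `theta`, and `A` are labels chosen here for the three tabulated parameters, and "within" is interpreted, as an assumption, as the spread between the largest and smallest value in the set.

```python
# Tolerances stated in the text for a full subunit set (treated as hard
# limits here for illustration; the text says "about").
TOLERANCES = {"R": 2.0, "theta": 20.0, "A": 30.0}

# Values transcribed from Table 6 (R in angstroms, angles in degrees).
TABLE_6 = {
    "G": {"R": 5.8, "theta": 33.0, "A": 60.0},
    "D": {"R": 5.6, "theta": 32.0, "A": 60.0},
    "tautomeric": {"R": 6.3, "theta": 36.0, "A": 55.0},
}


def spreads(subunits: dict) -> dict:
    """Return max - min of each parameter across the subunit set."""
    return {
        key: max(s[key] for s in subunits.values())
        - min(s[key] for s in subunits.values())
        for key in TOLERANCES
    }


def is_matched(subunits: dict) -> bool:
    """True if every parameter spread is within its tolerance."""
    found = spreads(subunits)
    return all(found[key] <= TOLERANCES[key] for key in TOLERANCES)


print({k: round(v, 1) for k, v in spreads(TABLE_6).items()})
print(is_matched(TABLE_6))  # True
```

With the Table 6 values, the spreads are roughly 0.7 Å, 4°, and 5°, all well inside the stated tolerances.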

The syntheses of a number of specific embodiments of a tautomeric subunit are described in Example 3. The syntheses of the structures seen in Figures 13B and 13C are described in Example 3A for the 2'-deoxyribose backbone structure; in Example 3B for the 2'-O-methylribose backbone; and in Example 3C for the morpholino backbone.

E4. Subunit Set with High-Specificity Subunits for A:T and G:C Oriented Base-pairs

In still another embodiment, the guanine plus diaminopurine subunit set described in Section El includes an additional subunit whose base is specific for hydrogen bonding to a G:C oriented base-pair, or an additional subunit whose base is specific for hydrogen bonding to an A:T (or A:U) oriented base-pair, or the set includes two additional subunits whose bases are specific for hydrogen bonding to a G:C oriented base-pair and to an A:T or A:U oriented base-pair, respectively.

Figure 15A shows the ring structure and hydrogen bonding array of a general type of base effective to bind a G:C oriented base-pair. Three preferred embodiments of this structure type are shown in Figures 15B-15D. Figure 16 shows the structure in Figure 15D hydrogen-bonded to its G:C target base-pair. As seen from Figure 15A and Figure 16, the X- position in the Figure 15A structure may be a hydrogen bond acceptor, e.g., O, for forming three hydrogen bonds between the base and its target G:C base-pair. Syntheses for subunits having a morpholino backbone structure and the G:C-specific bases of Figures 15B and 15C are described in Example 4D.

Figure 17A shows the skeletal ring structure and hydrogen bonding array of another general type of base effective to bind a G:C oriented base-pair. A preferred embodiment of this structure type hydrogen-bonded to its G:C target base-pair is shown in Figure 17B.

Synthesis of a subunit having a morpholino backbone structure and the G:C-specific base of Figure 17B is described in Example 4E.

Figure 18A shows the skeletal ring structure and hydrogen bonding array of a general type of base effective to bind an A:T or A:U oriented base-pair. Three preferred embodiments of this structure type are shown in Figures 18B-18D. Figure 19 shows the structure in Figure 18D hydrogen-bonded to its A:T target base-pair.

Syntheses for subunits having a morpholino backbone structure and the A:T or A:U-specific bases of Figures 18B and 18C are described in Example 4C.

The subunits described in this section whose bases are specific for G:C, A:T and A:U oriented base-pairs, with the guanine and diaminopurine subunits described in Section E1, provide a complete set of subunits providing high-specificity hydrogen bonding for each of the four possible oriented base-pairs in duplex nucleic acids. A subunit set formed in accordance with one aspect of the invention may include any three of these high-specificity subunits effective to bind to three different oriented base-pairs in a duplex target sequence. For example, in a target sequence containing T:A, C:G, and G:C base-pairs, the selected subunit set would include three different subunits containing a common or similar backbone structure and diaminopurine, guanine (or thioguanine), and one of the above G:C-specific bases. A subunit set suitable for a target sequence containing all four oriented base-pairs would additionally include a subunit whose base is one of the above high-specificity bases for an A:T oriented base-pair.

Table 7 shows the base-pair specificities and approximate R, θ, and A values for the subunit bases comprising guanine, diaminopurine, and the high-specificity bases of Figures 15, 17, and 18.

The table illustrates the general suitability of this set of bases in regard to R, θ, and A values.

II. Polymer Preparation

This section describes assembly of the subunits comprising a subunit set described above, to give a sequence-specific duplex-binding polymer.

A. Polymer Sequence and Length

The polymer of the invention is designed to bind to and inactivate a target duplex sequence, such as a sequence essential for a given pathogen, without inactivating normal host genetic sequences. Thus, the sequence information recognized by the polymer should be sufficient to rigorously distinguish the pathogen sequence from all normal host sequences.

A reasonable estimate of the amount of sequence information which a duplex nucleic acid-binding polymer should recognize in a disease-specific sequence, in order to avoid concomitant attack on normal cellular sequences, can be calculated as follows. The human genome contains roughly 3 billion base-pairs of unique-sequence DNA. For a gene-inactivating agent to have an expectation of having no fortuitous target sequences in a cellular pool of 3 billion base-pairs of unique-sequence genetic material, it should recognize at least n base-pairs in its target, where n is calculated as 4ⁿ = 3 x 10⁹, giving a minimal target recognition requirement of approximately 16 base-pairs. This suggests that a gene-inactivating polymer recognizing in excess of 16 base-pairs in its target sequence will likely have no targets in the cellular pool of inherent DNA. Obviously, as the number of base-pairs recognized in the target sequence increases over this value, the probability that the polymer will attack inherent cellular sequences continues to decrease. It is noteworthy that as the number of base-pairs recognized by the agent increases linearly, this "safety factor" increases exponentially.

To illustrate, Table 8 tabulates the number of base-pairs recognized in a target sequence and the corresponding expected number of fortuitous targets in a pool of 3 billion base-pairs of unique-sequence genetic material.

Table 8

Number of base-pairs          Expected number of fortuitous
recognized in target duplex   targets in human genome

 8                            45,776
10                            2,861
12                            179
14                            11.2
16                            0.7
18                            0.044
20                            0.0027

The numbers in Table 8 indicate that in order to achieve adequate specificity for the pathogen or pathogenic state, a binding agent for duplex nucleic acids should recognize at least 16, and preferably 18 or more base-pairs of the target sequence.
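The Table 8 values are consistent with dividing the pool size by 4ⁿ, as in the 4ⁿ = 3 x 10⁹ estimate above. The following is a minimal sketch of that arithmetic, assuming equiprobable, randomly ordered base-pairs; the function names are hypothetical.

```python
GENOME_BP = 3_000_000_000  # unique-sequence base-pairs in the pool, per the text


def expected_fortuitous_targets(n: int, pool_bp: int = GENOME_BP) -> float:
    """Expected number of chance matches to an n-base-pair target.

    Assumes each of the four oriented base-pairs is equally likely at
    every position, so a given n-mer occurs with probability 1 / 4**n.
    """
    return pool_bp / 4 ** n


def minimal_recognition_length(pool_bp: int = GENOME_BP) -> int:
    """Smallest n for which 4**n is at least the pool size."""
    n = 1
    while 4 ** n < pool_bp:
        n += 1
    return n


for n in (8, 10, 12, 14, 16, 18, 20):
    print(f"{n:>2}  {expected_fortuitous_targets(n):.4g}")

print(minimal_recognition_length())  # 16
```

Each additional pair of recognized base-pairs lowers the expected count by a factor of 16, which is the exponential "safety factor" noted above.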

In addition to target sequence length, it is important to consider how many of the four possible oriented base-pairs in duplex nucleic acids (i.e., A:T, C:G, G:C, and T:A) must be specifically recognized by the polymer bases in order to allow practical targeting of various viral pathogens. Table 9 shows the approximate number of targets expected in a relatively small viral genome (about the size of the HIV provirus) as a function of the number of different base-pair-binding specificities in a 16-subunit polymer. The values in the table were calculated on the assumption that the purine to pyrimidine ratio in a given strand of the pathogen's genome is approximately 1.0 and that the bases are effectively in a random order.

Table 9

Number of base-pair-binding    Expected number of contiguous
specificities in subunit set   16-base-pair targets in a
                               10,000 base-pair viral genome

1                              0.000002
2                              0.15
3                              100
4                              10,000

The tabulated values demonstrate that, in general, homopolymers (i.e., polymers assembled from subunits having specificity for just one oriented base-pair) are unlikely to have any practical targets in natural duplex genetic sequences. Further, copolymers of just two subunit types with specificities for only two of the four oriented base-pairs are expected to have contiguous 16-base-pair targets in only quite large viruses (e.g., Herpes). In contrast, binding polymers assembled from subunit sets having specificities for three or four of the oriented base-pairs have a quite adequate number of targets in even the smallest DNA viruses (e.g., Hepatitis B with a genome size of only 3200 base-pairs).
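The Table 9 values can be reproduced with a simple estimate: if a subunit set covers k of the four oriented base-pairs, a random position is targetable with probability k/4, and a contiguous 16-base-pair site with probability (k/4)¹⁶. The sketch below assumes this model (it is inferred from the tabulated numbers, not stated as a formula in the text) and ignores end effects; the function name is hypothetical.

```python
def expected_contiguous_targets(
    k: int, polymer_len: int = 16, genome_bp: int = 10_000
) -> float:
    """Expected number of targetable sites of length polymer_len.

    Assumes random base order with all four oriented base-pairs equally
    frequent, and approximates the number of start positions by genome_bp.
    """
    if not 1 <= k <= 4:
        raise ValueError("k must be between 1 and 4")
    return genome_bp * (k / 4) ** polymer_len


for k in (1, 2, 3, 4):
    print(k, f"{expected_contiguous_targets(k):.2g}")
```

For k = 3 the estimate is about 100 sites, and for k = 2 about 0.15, which shows why a two-specificity set is expected to find 16-base-pair targets only in much larger genomes.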

As described in Section I, the basic two-subunit set formed in accordance with the present invention includes two subunits which are specific for two different oriented base-pairs, C:G and T:A or U:A. To increase targeting versatility, another embodiment includes an expanded subunit set which includes one or two spacer subunits. Still another embodiment comprises the basic two-subunit set plus an additional semi-specific subunit whose base is capable of hydrogen bonding to either of two different oriented base-pairs. As noted above, this semi-specific subunit base recognizes only half the sequence information recognized by a high-specificity subunit base, and thus its use will require a correspondingly longer polymer in order to achieve adequate specificity for its target. Yet another embodiment comprises the basic two-subunit set plus one or two additional subunits whose high-specificity bases are each capable of hydrogen bonding to just one of the four oriented base-pairs. Such a subunit set containing subunits for all four of the oriented base-pairs allows targeting of essentially any desired duplex genetic sequence.

B. Subunit Activation and Polymer Assembly

The subunits, prepared as in Examples 1 - 5, can be activated and then coupled in a controlled sequential manner to give the desired binding polymer. Representative polymer assembly procedures for deoxyribose-containing and 2'-O-methylribose-containing subunits are described in Example 6. Representative activation procedures for morpholino-containing subunits are described in Example 7; Example 8 describes an exemplary procedure for assembling these activated subunits via solid-phase stepwise addition to give the desired binding polymers; and Example 9 describes their purification. Figure 20 illustrates one subunit addition cycle of this stepwise assembly procedure using a representative morpholino subunit prepared as in Example 2C and activated as in Example 7A. Figure 21 illustrates a four-subunit-long segment of a representative polymer assembled from the subunit set prepared as in Example 4A-4D, and activated as in Example 7A.

C. Novel Polymer Assembly Comprising: Oxidation/Ring Closure/Reduction

In addition to the above, a novel coupling procedure can also be used for assembling the desired nucleic acid-binding polymers, of which one embodiment is illustrated in Figure 22. This procedure involves:

i) providing a subunit, or block of linked subunits, which contains vicinal aliphatic hydroxyls, but no free primary amine (e.g., structure 1 of Figure 22);

ii) oxidizing those vicinal hydroxyls to give a dialdehyde component (e.g., structure 2 of Figure 22);

iii) providing a subunit, or block of subunits, which contains a free primary aliphatic amine (e.g., structure 3 of Figure 22, and subunits prepared as in Examples 2F - 2I);

iv) contacting the dialdehyde component with the primary amine component to effect coupling of the two components via formation of a cyclic morpholino structure having hydroxyls on the carbons adjacent to the morpholino nitrogen (e.g., structure 4 of Figure 22); and,

v) during or after the coupling reaction, or after completion of polymer assembly, adding a reducing agent to remove the hydroxyls on the carbons adjacent to the morpholino nitrogen, to give the desired morpholino ring structure (e.g., structure 5 of Figure 22).

The vicinal-hydroxyl-containing moiety can be other than ribose, such as galactose or glucose. Further, this coupling method can be used in either a solution-phase or a solid-phase mode for polymer assembly. Also, the oxidation step and the subsequent coupling step are preferably carried out in alcohol or water or a mixture thereof, and at a pH near neutrality. Although the reduction can be carried out during or after the coupling, best results are obtained when the reducing agent, e.g., NaCNBH₃, is present during the coupling step. Complete reduction and disruption of borate complexes (generated when NaCNBH₃ is used for the reduction) is best achieved by a final acidic wash having a pH in the range of 3 to 5, which can be carried out after each coupling, or after all couplings are completed.

Example 10 describes a representative application of this "oxidation/ring closure/reduction" coupling method for stepwise solid-phase assembly of a binding polymer.

D. Polymer Modifications

Some of the polymer types of the invention have relatively poor solubilities for polymer sizes above about 15-20 subunits, e.g., in the low-micromolar range. It may thus be desirable to enhance the solubility of the polymer by addition of a hydrophilic moiety, such as a polyethylene glycol (PEG) chain. This can be accomplished, according to one approach, by deprotecting the polymer terminus, and reacting the polymer with an excess of activated hydrophilic compound, e.g., PEG activated by bis(p-nitrophenyl)carbonate. Thereafter the binding polymer is cleaved from the synthesis support and treated with ammonium hydroxide to remove the base-protecting groups, and then purified, preferably by ion exchange chromatography at pH 10.5. One preferred hydrophilic molecule is PEG having an average molecular weight of about 1000 daltons (commercially available from Polysciences, Inc. and Aldrich Chem. Co.).

For some applications it may be desirable to modify the polymer to favor its cellular uptake via endocytosis. This may be done, for example, by derivatizing the polymer with a polycationic molecule, such as polylysine. Coupling of such a molecule containing one or more primary amine moieties may be by reaction of the base-protected polymer with a bifunctional coupling agent, such as disuccinimidyl suberate, or other commercially available agent (e.g., from Pierce Chemical Company) and then adding the amine-containing polycationic molecule.

Where the polymer molecules are to be attached to a solid support, for use in a diagnostic system, the terminal N-protective group can be cleaved (leaving the bases still in the protected state), and reacted with a suitable cross-linking agent, such as disuccinimidyl suberate. This preparation is then added to the support material, such as latex microparticles containing suitable linker arms terminating in primary amine moieties.

Alternatively, if it is desired to purify the binding polymer prior to attachment to a support, a methoxytrityl-protected 6-aminocaproic acid can be linked to the unprotected N-terminus of the binding polymer using DCC. The binding polymer is then treated with ammonium hydroxide to deprotect the bases, purified by standard methods, and the terminal methoxytrityl is cleaved from the aminocaproic acid moiety. Finally, the purified polymer is mixed with support material having suitable linker arms terminating in p-nitrophenylester moieties, to give covalent coupling of the polymer molecules to the support.

Binding polymers constructed from subunits having cyclic backbone moieties have a strand polarity analogous to the 5' to 3' strand polarity exhibited by standard phosphodiester-linked polynucleotides. As a consequence, for a given heteromeric target sequence of base pairs, two binding polymers can be constructed, one having the proper sequence of bases ordered from 5' to 3', and the other having the same sequence of bases, but ordered 3' to 5'. The preferred polymer for any selected target sequence of base pairs is readily determined by assembling the two binding polymers containing the appropriate sequence of bases in both possible orientations, and testing these two polymers for their respective binding affinities for the selected duplex target sequence. Similar approaches for determining proper binding orientations for standard polynucleotides are well-known in the art. It should be appreciated that these binding polymers have the potential to bind their target duplex in either or both of the two orientations.

III. Utility

A. Diagnostics: Detection of Sequences in Duplex Form

In one application, the polymer of the invention is used in a diagnostic method for detecting a duplex target nucleic acid sequence in an analyte. The target sequence is typically a pathogen-specific sequence, such as a viral or bacterial genome sequence, which is to be detected in a biological sample, such as a blood sample.

The target sequence is preferably 15 to 25 subunits in length, to provide the requisite sequence specificity, as discussed above.

In one assay format, the diagnostic reagent is a solid support, such as a micro-bead, coated by covalently-bound polymers effective to specifically bind to the duplex target sequence. After sample treatment to release the analyte duplex from bacterium or virus in free form, if necessary, the sample is contacted with the solid support under conditions sufficient to effect base-pair-specific binding of the analyte duplex to the support-bound polymer. Typically, the binding reaction is performed at 20°-37°C for 10 minutes to 2 hours. After washing the solid support to remove unbound material, the support is contacted with a reporter reagent effective to bind to the captured target duplex, to allow detection of said duplex. The reporter may be a soluble duplex-binding polymer, formed in accordance with the present invention, which is base-pair-specific for a second analyte-specific target sequence in the analyte duplex, and which is labeled with a suitable signal group, such as a fluorescent moiety, for signal detection. The signal group is coupled to the polymer by standard coupling methods, such as described in Section II.

After washing the support, it is examined for bound reporter, which will be proportional to the amount of analyte bound to the support via the sequence-specific binding polymer.

Alternatively, the washed support containing bound analyte duplex may be reacted with a fluorescent intercalating agent specific for nucleic acids, such as ethidium bromide, and then the polymer-bound analyte is assessed by its fluorescence. Another alternative is to react the washed support containing bound analyte duplex with a reporter-labeled polycationic molecule, such as a fluorescent-labeled oligo-cation, as described in co-owned published PCT Application No. PCT/US86/00545 (WO 86/05519). The reporter molecule binds by electrostatic interactions with the negatively charged analyte duplex backbone, but does not bind the substantially uncharged polymer molecules on the solid support. After washing the support to remove unbound material, the reporter bound to the solid support, via the sequence-specific analyte/polymer complex, is measured.

B. In situ Hybridization

In many applications, the in situ hybridization is directed toward a target sequence in a double-stranded duplex nucleic acid, typically a DNA duplex associated with a pathogen or with a selected sequence in chromosomal DNA. In the method, as it has been practiced heretofore, a labeled nucleic acid probe is added to the permeabilized structure, the structure is heated to a temperature sufficient to denature the target duplex nucleic acid, and the probe and denatured nucleic acid are allowed to react under suitable hybridization conditions. After removing unbound (non-hybridized) probe, the structure is examined for the presence of reporter label, allowing the site(s) of probe binding to target nucleic acid to be localized in the biological structure.

The method has been widely applied to chromosomal DNA, for mapping the location of specific gene sequences and determining distances between known gene sequences, for studying chromosomal distribution of satellite or repeated DNA, for examining nuclear organization, for analyzing chromosomal aberrations, and for localizing DNA damage in single cells or tissue. Several studies have reported on the localization of viral sequences integrated into host-cell chromosomes. The method has also been used to study the position of chromosomes, by three-dimensional reconstruction of sectioned nuclei, and by double in situ hybridization with mercurated and biotinylated probes, using digital image analysis to study interphase chromosome topography (Emmerich). Another general application of the in situ hybridization method is for detecting the presence of virus in host cells, as a diagnostic tool.

In the present application, the polymer of the invention is designed for targeting a specific duplex genetic sequence associated with a cellular or subcellular structure of interest, such as a chromosomal preparation. The polymer is derivatized with a suitable label such as a fluorescent tag. The polymer is preferably added directly to cells or tissue containing the structure being studied, without first permeabilizing the material. Because the polymer is uncharged it can more readily penetrate into living cells without the need for a permeabilization treatment. It further offers the advantage of being resistant to nuclease degradation.

Once in contact with the duplex target material of interest, base-pair-specific binding can occur at normal physiological temperatures, again allowing detection of duplex targets under conditions of normal cell activity, and without heat disruption of the material being studied. After a time sufficient for binding to the target duplex, and washout of unbound polymer, the structure being studied may be examined directly, e.g., by fluorescence microscopy, to observe site-specific localization of the duplex target sequence and possible movement thereof. Alternatively, to reduce fluorescence background, the material may be fixed, e.g., by ethanol treatment, washed to remove unbound reporter, and viewed in fixed form by microscopy.

C. Isolation of Duplexes Containing Target Sequence

Another general application of the polymer invention is for isolating duplex nucleic acid structures from a nucleic acid mixture, such as a mixture of genomic fragments, a blood sample containing a selected viral duplex, or a mixture of plasmids with different duplex inserts in different orientations.

The binding polymer used in the method is (a) designed for base-pair-specific binding to a selected target duplex sequence and (b) capable of being isolated from a liquid sample after capture of the target duplex. To this end, the polymer may be bound to a solid support, as described above, or may be derivatized with a ligand moiety, such as biotin, which permits capture on a solid support, or immunoprecipitation, after binding to the target duplex.

The polymer is added to the sample material and incubated under conditions which allow binding of the polymer to its target sequence, typically for 10 minutes to 2 hours at 20°-37°C. After binding has occurred, the polymer and bound material are isolated from the sample. The isolated material may be released from the polymer by heating, or by chaotropic agents, and further amplified, if necessary, by polymerase chain reaction methods and/or clonal propagation in a suitable cloning vector.

D. Site Specific DNA Modification

The polymer of the invention is also useful for producing selected site-specific modifications of duplex DNA in vitro. These may include cutting a duplex species at a selected site, or protecting a selected region against restriction or methylating enzymes. The latter application is useful particularly in recombinant DNA technology, where it is often advantageous to be able to protect a vector or heterologous DNA sequence against cutting by a selected restriction endonuclease, or where it is desired to selectively prevent methylation at a given restriction site.

To produce site-specific cleavage in a selected base sequence, the polymer is derivatized with a cleaving moiety, such as a chelated iron group, capable of cleaving duplex DNA in a polymer-bound state. The polymer sequence is selected to place the cleaving group, which is typically coupled at one polymer end, adjacent to the site to be cleaved.

To protect a selected region of duplex target sequence against restriction or methylase enzymes, the polymer includes a sequence for binding to the 4-8 base-pair sequence which specifies a selected restriction enzyme sequence, plus any additional proximal bases effective to give increased specificity for a unique target sequence. After addition of the polymer to the duplex material, the material is treated with the selected restriction or methylating enzyme. After enzyme treatment, the treated duplex is "deprotected" by heating.

E. Therapeutic Application

The polymers of the invention, by their ability to bind to duplex target sequences, have the potential to inactivate or inhibit pathogens or selected genomic sequences, such as oncogenes, associated with disease. Origins of replication and enhancer and promoter sequences are particularly sensitive to inactivation by duplex-directed binding agents, because the agent can occupy a target site required for initiation of replication or transcription of the targeted gene. Such gene-control sequences are known for many pathogenic genes, and also for a variety of oncogenes which have been characterized in humans.

For some therapeutic applications, it may be desirable to modify the binding polymer to favor its delivery to certain cells or tissues, or to favor its delivery to certain subcellular organelles, such as the nucleus (Chelsky). This can be accomplished, for example, by linking the binding polymer to a suitable signal structure, such as desialylated galactosyl-containing proteins (Gregoriadis, 1975) or a cluster of galactose moieties, which favors uptake by liver cells; or such as D-mannose or L-fucose, which favor uptake by Kupffer cells and macrophages; or such as insulin or related peptides, which may then be actively transported across the blood/brain barrier. Additionally, the binding polymers can be incorporated into surfactant micelles, with or without brain-specific antibodies, to enhance delivery across the blood/brain barrier (Kabanov).

For the reasons discussed above, the polymer should generally contain at least 16 base-pair-specific subunits, to minimize the possibility of undesired binding to sequences other than the intended target sequence. Candidate target structures can be determined from analysis of genomic sequences, such as are available in a variety of sequence databases. Preferred target structures are those which (a) are well conserved across strains, and (b) have a base-pair sequence which is compatible with the set of subunits available for forming the polymer. For example, if the subunit set includes a guanine, diaminopurine, and one or two spacer subunits, as detailed in Section I, the target sequence preferably contains at least about 70% C:G and T:A oriented base-pairs, and the remainder G:C and/or A:T.
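The two screening criteria above (at least 16 positions, and at least about 70% C:G and T:A oriented base-pairs for the guanine/diaminopurine-plus-spacer set) can be sketched as a simple filter. This is a minimal illustration only: the function names, the list-of-labels input format, and the treatment of "about 70%" as a hard 0.70 cutoff are assumptions made here.

```python
# Oriented base-pairs bound with high specificity by guanine (C:G) and
# diaminopurine (T:A), per the subunit set described in the text.
HIGH_SPECIFICITY_PAIRS = {"C:G", "T:A"}


def high_specificity_fraction(oriented_pairs: list) -> float:
    """Fraction of target positions that are C:G or T:A oriented."""
    if not oriented_pairs:
        raise ValueError("target must contain at least one base-pair")
    hits = sum(1 for pair in oriented_pairs if pair in HIGH_SPECIFICITY_PAIRS)
    return hits / len(oriented_pairs)


def is_preferred_target(
    oriented_pairs: list, min_length: int = 16, min_fraction: float = 0.70
) -> bool:
    """True if the candidate meets both the length and composition criteria."""
    return (
        len(oriented_pairs) >= min_length
        and high_specificity_fraction(oriented_pairs) >= min_fraction
    )


# Hypothetical 16-position candidate: 12 high-specificity positions, 4 spacers.
candidate = ["T:A"] * 7 + ["C:G"] * 5 + ["G:C"] * 2 + ["A:T"] * 2
print(high_specificity_fraction(candidate))  # 0.75
print(is_preferred_target(candidate))        # True
```

Conservation across strains, the other criterion named above, is not modeled here and would need a separate comparison of aligned sequences.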

As an example, a search was made of the HIV-I genome, in the duplex proviral stage, for sequences which are both well conserved across strains and suitable targets for binding polymers assembled predominantly from guanine and 2,6-diaminopurine-containing subunits. Table 10 shows several such selected target sequences and, positioned thereon, binding polymers assembled from the "two subunit plus spacers" set of the type described in Section I.E2.

Table 10

Position    Gene    Polymer/Target Complex
in Genome

                       DDDDDUGDUDGGGGGDD               Polymer
                       -----*--*--------
2431        Pol     5'-AAAAATGATAGGGGGAA               Target
                       TTTTTACTATCCCCCTT-5'            Duplex

                       DUDDDGDDDDDDGDCDG               Polymer
                       -*------------*--
2735        Pol     5'-ATAAAGAAAAAAGACAG               Target
                       TATTTCTTTTTTCTGTC-5'            Duplex

                       GGDDDGGUGDDGGGGCDGUDGUDD        Polymer
                       -------*-------*--*--*--
4956        Pol     5'-GGAAAGGTGAAGGGGCAGTAGTAA        Target
                       CCTTTCCACTTCCCCGTCATCATT-5'     Duplex

In the table, "-" represents a high-specificity base-pair binding, and "*" represents a low-specificity base-pair binding.
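The polymer rows of Table 10 follow a fixed letter-for-letter correspondence with the upper (5' to 3') target strand as printed: A is paired with D, G with G, T with U, and C with C. The sketch below reproduces that correspondence and the binding marks; the mapping is read off the table itself, and the function names are hypothetical.

```python
# Polymer subunit positioned over each base of the printed 5'->3' strand
# (correspondence inferred from the rows of Table 10).
SUBUNIT_FOR_TARGET_BASE = {"A": "D", "G": "G", "T": "U", "C": "C"}

# Guanine (G) and diaminopurine (D) are the high-specificity bases;
# cytosine (C) and uracil (U) are the low-specificity spacers.
HIGH_SPECIFICITY_SUBUNITS = {"G", "D"}


def design_polymer(target_strand: str) -> str:
    """Return the polymer subunit sequence for a printed target strand."""
    return "".join(SUBUNIT_FOR_TARGET_BASE[base] for base in target_strand.upper())


def binding_marks(polymer: str) -> str:
    """Return '-' for high-specificity and '*' for low-specificity positions."""
    return "".join(
        "-" if subunit in HIGH_SPECIFICITY_SUBUNITS else "*" for subunit in polymer
    )


polymer = design_polymer("AAAAATGATAGGGGGAA")
print(polymer)                 # DDDDDUGDUDGGGGGDD
print(binding_marks(polymer))  # -----*--*--------
```

Applied to the position-2431 target, this gives 15 high-specificity and 2 spacer positions out of 17.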

The following examples detail synthetic methods for preparing a variety of subunits, subunit sets, and polymers, in accordance with the invention. The examples are intended to illustrate but not limit the invention.

Example 1

Subunit Protection Methods

A. General procedure for the protection of primary amino groups on bases of subunits.

Unless otherwise indicated, chemicals are purchased from Aldrich Chemical Co., Milwaukee, WI.

The subunit, generally a nucleoside or nucleoside analog (10 mmol, which has been dried by coevaporation with pyridine several times), is dissolved or suspended in pyridine (50-100 mL), and treated with chlorotrimethylsilane (2-3 equivalents of silane per hydroxyl group in the substrate). The solution is stirred one hour, or until solution is complete (sonication may be employed with difficultly soluble substrates). An alkyl chloroformate, acid chloride, or anhydride, or other suitable activated carboxylic acid derivative is added (1.05-4.0 equivalents per amino group in the substrate). After stirring for 1-24 hours at room temperature, the reaction is cooled to 0°C, and treated slowly with a 1:1 mixture of pyridine/water (20 mL). After 10 minutes concentrated ammonium hydroxide (20 mL) is added and stirring continued for 15 minutes. The solution is concentrated under vacuum and dissolved in ethyl acetate (or ether or chloroform) and shaken with water. The organic phase is removed and the product allowed to crystallize. If no crystallization occurs, the solvent is removed and the residue chromatographed on silica to yield the N-acylated species. Typical chloroformates which are useful include 9-fluorenylmethoxycarbonyl chloride, 2-(p-nitrophenyl)ethoxycarbonyl chloride (Himmelsbach), and 2-(phenylsulfonyl)ethoxycarbonyl chloride (Balgobin). Typical acid chlorides include benzoyl, isobutyryl, and trichloroacetyl. Typical anhydrides include acetic, isobutyric, and trifluoroacetic. Other acid derivatives include acyl hydroxybenzotriazolides (prepared from the acid chloride and dry hydroxybenzotriazole in acetonitrile). The latter are advantageously used to introduce the phenylacetyl group. Alternatively, primary amino groups may be protected as amidines by the procedure of McBride, et al.

B. Procedure for the differential protection of primary diamines on base-pair recognition moieties.

2,6-Diaminopurineriboside (Pfaltz and Bauer, Inc.) is converted by the general procedure in Example 1A into the N-2,N-6 bis-(phenylacetyl)amide. The acyl group at the N-6 position is selectively cleaved by treatment of the nucleoside with 1N LiOH in pyridine/ethanol at 0°C. The reaction mixture is neutralized with aq. HCl and the solvents evaporated. The residue may be recrystallized from ethyl acetate/ethanol or purified by silica gel chromatography. The crude product, or the purified nucleoside, is resubjected to acylation by the general procedure using benzoyl chloride to introduce the N-6 benzoyl group. For this second acylation only a slight excess of the acylating agent (1.05-1.2 equivalents) is employed.

C. Procedure for the protection of oxo groups in the recognition moieties.

2',3',5'-Tri-O-isobutyryl N2-isobutyryl deoxyguanosine is converted by the procedure of Trichtinger, et al, into the O6 2-(p-nitrophenyl)ethyl derivative. Alternatively, guanosine may be converted into the O6 diphenylcarbamoyl derivative by the method of Kamimura, et al. Following treatment with ammonia (1:1 conc. ammonium hydroxide/DMF) or 1N LiOH in pyridine/ethanol at 0°C, the N2-propionyl O6-diphenylcarbamoyl guanosine is produced. These procedures are applicable to the preparation of N-2 acylated O-4 protected 2-amino-4(3H)-quinazolinone derivatives and N-7 acylated O-9 protected 7-amino-9(8H)-imidazo[4,5-f]quinazolinone derivatives.

D. General procedure for the introduction of a dimethoxytrityl substituent at a primary alcohol.

The alcohol bearing substrate (10 mmol) is dissolved or suspended in pyridine (50-100 mL) and treated with 4,4'-dimethoxytrityl chloride, triethylamine (20 mmol) and 4-dimethylaminopyridine (0.5 mmol). After several hours at room temperature the mixture is treated with water (5 mL) then poured into cold, satd. aq. sodium bicarbonate solution. The mixture is extracted with ethyl acetate (or chloroform) and the combined organic layers are dried (sodium sulfate) and evaporated. The residue is chromatographed on silica to give pure dimethoxytritylated compound.

Example 2 Preparation of "2-Subunits plus Spacers" Set.

A. Subunits containing 2'-Deoxyribose moiety.

The 5'-O-dimethoxytrityl protected derivatives of the following are available from Sigma (St. Louis, MO, USA) : N-4 benzoyldeoxycytidine, N-2 isobutyryldeoxyguanosine, thymidine. 2,6-Diaminopurine-2'-deoxyriboside is available from Sigma and is protected at the primary amino groups and the primary hydroxy group by the methods in Example 1.

B. Subunits containing 2'-O-Methylribose moiety.

The 2'-O-methylribonucleosides of uracil, cytosine, guanine, adenine, and 7-deazaadenine may be obtained by the method of Robins, et al (1974) or Sekine, et al. The guanosine and 2-aminoadenosine 2'-O-methyl ethers are also advantageously prepared by the method of Robins, et al (1981). They may be converted into their base protected analogues by the general methods in Example 1 (for example, N-2 isobutyryl for the guanosine derivative, N-2 phenylacetyl, N-6 benzoyl for the 2-aminoadenosine derivative, N-4 benzoyl for the cytidine derivative). The primary hydroxy is protected as in Example 1.

C. Subunits containing Morpholino moiety.

A ribose-containing subunit, having the base in the protected form, is oxidized with periodate to a 2'-3' dialdehyde. The dialdehyde is closed on ammonia or primary amine and the 2' and 3' hydroxyls (numbered as in the parent ribose) are removed by reduction with cyanoborohydride.

An example of this general synthetic scheme is described below with reference to the synthesis of a base-protected cytosine (R_i*) morpholino subunit. To 1.6 L of methanol is added, with stirring, 0.1 mole of N4-benzoylcytidine and 0.105 mole sodium periodate dissolved in 100 ml of water. After 5 minutes, 0.12 mole of ammonium biborate is added, and the mixture is stirred 1 hour at room temperature, chilled and filtered. To the filtrate is added 0.12 mole of sodium cyanoborohydride. After 10 minutes, 0.2 mole of toluenesulfonic acid is added. After another 30 minutes, another 0.2 mole of toluenesulfonic acid is added and the mixture is chilled and filtered. The solid precipitate is dried under vacuum to give the tosylate salt of the free amine. The use of a moderately strong (pKa < 3) aromatic acid, such as toluenesulfonic acid or 2-naphthalenesulfonic acid, provides ease of handling, significantly improved yields, and a high level of product purity.
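
To make the stoichiometry of this example easier to compare with other runs, the molar charges quoted above can be expressed as equivalents relative to the nucleoside and rescaled to a different batch size. The sketch below uses only the quantities given in the text; the dictionary keys, function names, and the 0.025 mole target are assumptions introduced for illustration, and solvent volumes are deliberately left out.

```python
# Molar charges quoted in the N4-benzoylcytidine example (moles).
CHARGES_MOL = {
    "N4-benzoylcytidine": 0.1,
    "sodium periodate": 0.105,
    "ammonium biborate": 0.12,
    "sodium cyanoborohydride": 0.12,
    "toluenesulfonic acid (total of two portions)": 0.4,
}

LIMITING = "N4-benzoylcytidine"


def equivalents(charges, limiting=LIMITING):
    """Express every charge as equivalents of the limiting reagent."""
    base = charges[limiting]
    return {name: round(mol / base, 2) for name, mol in charges.items()}


def scale(charges, target_mol, limiting=LIMITING):
    """Rescale all charges so the limiting reagent equals target_mol."""
    factor = target_mol / charges[limiting]
    return {name: mol * factor for name, mol in charges.items()}


if __name__ == "__main__":
    for name, eq in equivalents(CHARGES_MOL).items():
        print(f"{name}: {eq} equiv")
    # Hypothetical quarter-scale run
    for name, mol in scale(CHARGES_MOL, 0.025).items():
        print(f"{name}: {mol:.4f} mol")
```

Linear rescaling of this kind is only a starting point; filtration and precipitation behaviour may not scale proportionally.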

Filtration of the tosylate salt of the 2,6-diaminopurine-containing morpholino subunit also works well. However, the tosylate salts of the guanine-containing and uracil-containing subunits are generally more soluble in methanol. Thus, for G and U subunits the methanol is removed under reduced pressure and the residue partitioned between brine and isopropanol - with the desired product going into the organic phase. The base-protected morpholino subunit can then be protected at the annular nitrogen of the morpholino ring using trityl chloride.

As an example of the tritylation step, to 2 liters of acetonitrile is added, with stirring, 0.1 mole of the tosylate salt from above, followed by 0.26 mole of triethylamine and 0.15 mole of trityl chloride. The mixture is covered and stirred for 1 hour at room temperature, after which 100 ml of methanol is added, followed by stirring for 15 minutes. The solvent is removed under reduced pressure and then 400 ml of methanol is added. After the solid is thoroughly suspended as a slurry, 5 liters of water is added, the mixture is stirred for 30 minutes, and filtered. The solid is washed with 1 liter of water, filtered and dried under vacuum. The solid is resuspended in 500 ml of dichloromethane, filtered, and rotovaped until precipitation just begins, after which 1 liter of hexane is added and stirred for 15 minutes. The solid is removed by filtering, and dried under vacuum.

The above procedure yields the base-protected morpholino subunit tritylated on the morpholino nitrogen and having a free 5' hydroxyl (numbered as in the parent ribose).

D. Subunits containing N-Carboxymethylmorpholino-5'-amino moiety.

A ribose-containing subunit, having the base-pair recognition moiety in the protected form, is converted to the 5' amine and that 5' amine tritylated, as per Stirchak, Summerton, and Weller (1987), or by the method described in Example 2E below. Following the general procedures of Example 2C above, the vicinal 2' and 3' hydroxyls of the ribose are then oxidized with periodate to give a 2'-3' dialdehyde. The dialdehyde is closed on glycine in the presence of triethylamine. The 2' and 3' hydroxyls (numbered as in the parent ribose) are subsequently removed by reduction with cyanoborohydride.

Alternatively, the dialdehyde can be closed on ammonia and reduced as in Example 2C, and then the morpholino nitrogen alkylated with bromoacetic acid buffered with N,N-diethylaniline.

These procedures yield the base-protected morpholino subunit having a tritylated 5' amine and a carboxymethyl group on the morpholino nitrogen.

E. Subunits containing N-Carboxymethylmorpholino-alpha(5'-amino) moiety.

Examples 2C and 2D illustrate the preparation of morpholino-containing subunits wherein the 5' methylene is in the beta orientation - that is, the same orientation as in the parent ribose. Analogous morpholino-containing subunits wherein the 5' methylene is in the alpha orientation can be prepared by the following general approach.

The 5' hydroxyl of a ribose-containing subunit, having the base-pair recognition moiety in the protected form, is converted to a secondary amine by established methods (see Example 2D above). Thereafter, following the general procedures of Example 2C above, the vicinal 2' and 3' hydroxyls of the ribose are oxidized with periodate to give a 2'-3' dialdehyde. The 2' aldehyde rapidly closes on the secondary amine at the 5' position (numbered as in the parent ribose). Reduction with cyanoborohydride then generates a structure containing a morpholino ring wherein the annular morpholino nitrogen is tertiary, and containing a 5' aldehyde in the alpha orientation. Subsequent addition of ammonia or a primary amine, in the presence of excess cyanoborohydride, generates a 5' amine (primary or secondary, respectively) in the alpha orientation.

The above general strategy can be applied to prepare subunits containing the N-carboxymethylmorpholino-alpha(5'-amino) moiety, as well as a number of other useful variations. One method to introduce the desired secondary amine at the 5' position of the ribose moiety entails: a) conversion of the 2',3' hydroxyls to an acetal as per the method of Smith, Rammler, Goldberg and Khorana (1961); b) oxidation of the 5' hydroxyl to an aldehyde using DMSO/pyridine/trifluoroacetic acid/diisopropylcarbodiimide (the Moffatt oxidation); c) reacting this 5' aldehyde with glycine (or the tert-butyl ester of glycine) in the presence of cyanoborohydride; and d) regeneration of the 2',3' hydroxyls by acid cleavage of the acetal.

F. Subunits containing Ribose with 5'-Carbazate.

A ribose-containing subunit can be converted to the 5'-carbazate as follows. To 10 mMole of ribose-containing subunit, having exocyclic amines of the base-pair recognition moiety in the protected state, add 100 ml of anisylaldehyde and 0.5 g of tosic acid. Stir at room temperature for 48 hours. Add the reaction mixture to 500 ml hexane and collect the precipitate. Purify the product by silica gel chromatography developed with ether. The resulting product is reacted with 2 equivalents of bis(p-nitrophenyl)carbonate plus 2 equivalents of triethylamine in acetonitrile for 8 hours at 30 deg. C. The product is purified by silica gel chromatography developed with a 5% to 15% acetone/chloroform mixture. The product is reacted with 4 equivalents of t-butylcarbazate in DMF for 4 hrs at 50 deg. C. The reaction mixture is added to water and the precipitate collected and suspended in DMF/conc. NH₄OH, 1:1 by vol, overnight at 30 deg. C. The ammonium solution is added to brine and the insoluble product collected and dried under vacuum. The dry product is dissolved in trifluoroacetic acid and, after 5 minutes, ether is added to precipitate the product, which is triturated twice with ether. The product is dissolved in methanol containing sufficient N-ethylmorpholine to neutralize all residual trifluoroacetic acid and the product again precipitated by addition of ether, and the product dried under vacuum. The desired 5'-carbazate product can generally be purified by silica gel chromatography developed with N-ethylmorpholine/methanol/chloroform, 1:4:6 by volume, or preferably, purified by recrystallization from a suitable aqueous/organic mixture.

G. Subunits containing Ribose with 5'-Sulfonylhydrazide.

A ribose-containing subunit can be converted to the 5'-sulfonylhydrazide as follows. Ten mMole of ribose-containing subunit, having exocyclic amines of the base-pair recognition moiety in the protected state, is converted to the anisylacetal derivative as described in Example 2F above.

To 10 mMole of sulfonyl chloride in dichloromethane chilled on dry ice add 15 mMole of N,N-diethylaniline. Next, slowly add, with rapid stirring, a dilute solution of 10 mMole of N-aminophthalimide in dichloromethane.

After 20 minutes, add the anisylacetal subunit derivative to this chlorosulfonylhydrazide solution. Slowly add, with rapid stirring, 30 mMole of diisopropylethylamine in 30 ml of dichloromethane. After stirring 1 hour at room temperature, remove the solvent under reduced pressure and purify the product by silica gel chromatography developed with an acetone/chloroform mixture.

The product is then treated with hydrazine acetate in methanol, the solvent removed under reduced pressure, and DMF/conc. NH₄OH, 1:1 by vol, is added and the preparation incubated at 30 deg. C overnight. Lastly, the product is treated with trifluoroacetic acid and worked up as in Example 2F.

H. Subunits containing Ribose with 5'-glycinamide

A primary amine is introduced into the 5' position of a ribose-containing subunit following the oxidation/reductive alkylation procedure described in Example 2E, except that ammonia is used instead of glycine. This 5' primary amine is then acylated with N-tert-butoxycarbonyl glycine, p-nitrophenyl ester. After purification, the protective groups are removed by treatment with DMF/conc. NH₄OH, and then with trifluoroacetic acid, and the final 5'-glycinamide derivative is worked up as in Example 2F.

I. Subunits containing Ribose with an aminomethylethylphosphate group linked to the 5' oxygen.

Aminomethylphosphonic acid (Aldrich Chem. Co.) is reacted with trityl chloride in the presence of triethylamine. The di-anionic phosphonate product, where the counter ions are triethylammonium, is suspended in ethanol and then a carbodiimide, such as dicyclohexylcarbodiimide (DCC), is added. The resultant mono-anionic product is shaken with a mixture of water and chloroform containing pyridinium hydrochloride. This procedure gives a mono-ionic phosphonic acid having a pyridinium counter ion. This product is added to chloroform, followed by addition of the ribose-containing subunit wherein the exocyclic amines of the base are in the protected form and the 2' and 3' hydroxyls are protected as the anisylacetal. DCC is added to couple the phosphonate to the 5' oxygen of the subunit. The product is dried and chromatographed on silica using methanol/chloroform mixtures. The pure product is next base-deprotected with DMF/conc. NH₄OH, 1:1 by vol, and then suspended in trifluoroacetic acid to remove the trityl and the anisyl protective groups.

Example 3 Preparation of Subunits With Tautomeric Base

A. Subunit containing 2'-Deoxyribose moiety.

1. Preparation of N-glycosyl isoindoles

4-Acetylamino-2-methylbenzoic acid (Peltier) is converted into the 5-nitro compound by treatment with cold fuming nitric acid. The reaction mix is poured into crushed ice and the solid product collected by filtration and purified by recrystallization from DMF/water or by silica chromatography. The acetamide is removed by alkaline hydrolysis with 1-10% NaOH solution in 90% ethanol. The reaction mixture is added to excess dilute HCl and the solvent evaporated. The crude acid is esterified with satd. methanolic HCl at room temperature for several days. After removal of solvent the product is partitioned between ethyl acetate and satd. sodium bicarbonate. After washing with water the organic phase is evaporated and the residue purified by silica chromatography. The nitro group is reduced to the amino using hydrogen and palladium on carbon in ethanol or DMF. After filtration through celite and evaporation, the crude diamine is converted to the methyl 2-amino-6-methylbenzimidazole-5-carboxylate using cyanogen bromide in methanol at reflux. The mixture is cooled and poured into satd. aq. sodium bicarbonate and the solid product filtered and purified by recrystallization. The exocyclic amino group is acylated by refluxing with phthaloyl dichloride in pyridine followed by reaction of the diazepine with pyrazole in refluxing acetonitrile according to the method of Katritzky. The compound is reacted with either bromine or N-bromosuccinimide or 1,3-dibromo-5,5-dimethylhydantoin either neat or in carbon tetrachloride or chloroform or 1,1,1-trichloroethane with the aid of a high-intensity sun lamp and/or benzoyl peroxide, to provide the benzylic bromide. It is possible to acylate the diazepine further with isobutyryl chloride in pyridine to produce a triply acylated benzimidazole species. This is normally done prior to the bromination.
The crude benzylic bromide is reacted with sodium azide in dry DMF and reduced with hydrogen over platinum or palladium to produce the lactam. This is O-silylated with one equivalent of trimethylsilyl trifluoromethanesulfonate or tert-butyldimethylsilyl trifluoromethanesulfonate to produce the O-silyl lactim ether/benzimidazole trifluoromethanesulfonate salt. This is reacted with 3,5-di-O-toluoyl-alpha-D-erythro-pentofuranosyl chloride (Hoffer) in THF or acetonitrile in the presence of p-nitrophenol by the method of Aoyama to give the protected nucleoside, which is purified by silica chromatography. The acyl groups are all removed by a two-step procedure requiring first hydrazinolysis with hydrazine/ethanol at room temperature, then evaporation of solvent and heating the crude residue in refluxing ethanol to fully cleave the phthaloyl residue. The aminobenzimidazole is protected by reaction with 4-(dimethoxymethyl)-morpholine (prepared from 4-formyl morpholine by the general procedure of Bredereck et al.) in methanol to form the amidine. The remaining reactive site of the benzimidazole is protected by reaction with pivaloyl chloride under the conditions of Example 1. Alternatively, the final acylation may be done with - (dimethylamino)benzoyl chloride. An alternative amino protecting group is formed by reaction of the unprotected benzimidazole with 4-(dimethylamino)benzaldehyde in methanol in the presence of piperidine (10 mole%) and methanesulfonic acid (5 mole%). The resulting imine is acylated as for the amidine. The primary hydroxyl group is protected with the dimethoxytrityl group as per Example 1.

2. Preparation of 2-glycosyl benzoxazoles

3-Acetamidophenol (Aldrich Chemical Co.) is nitrated to give the 2-nitro-5-acetamidophenol. Reduction with hydrogen and palladium/carbon and reaction with trifluoroacetic anhydride or trichloroacetic anhydride give the 2-trihaloacetamido derivative. This is nitrated to give the 4-nitro species and the trihaloacetyl group removed by brief ammonolysis to give 5-acetamido-2-amino-4-nitrophenol.

2,5-Anhydro-3,4,6-tri-O-benzoyl-D-allonothioamide (Pickering) is treated with methyl iodide and sodium hydride to give the corresponding methyl thioimidate. Alternatively the thioamide is reacted with di-tert-butyl dicarbonate (Aldrich) and 4-dimethylaminopyridine in dichloromethane to produce the imide. Alternatively, the imide is treated with methyl iodide or methyl triflate in the presence of diisopropylethylamine to give the N-tert-butoxycarbonyl methyl thioimidate. Any of these are suitable for reaction with aromatic 1,2-diamines or ortho aminophenols to produce benzimidazole or benzoxazole derivatives of deoxyribosides, respectively.

The aminophenol is reacted with the appropriate activated thioamide from the previous paragraph to produce the 2-(tri-O-benzoyl-beta-deoxyribosyl)benzoxazole. The N-acetyl and O-benzoyl groups are removed by ammonolysis or hydrazinolysis and the nitro group reduced with hydrogen and palladium/carbon. The aromatic diamine is reacted with cyanogen bromide in refluxing methanol, and the product 6-amino-2-(tri-O-benzoyl-beta-deoxyribosyl)imidazo[4,5-f]benzoxazole derivative is protected as in Example 3A1, and the primary hydroxy is protected as per Example 1.

B. Subunit containing Ribose moiety.

1. N-glycosyl isoindoles

The ribose nucleoside is prepared as for the deoxyribonucleoside in Example 3A1 except that the O-silylated lactam is reacted in the presence of mercuric bromide or silver trifluoromethanesulfonate with the ribosyl bromide prepared by treatment of 1-O-acetyl-2,3,5-tri-O-benzoyl-D-ribofuranose with HBr in benzene as per the procedure of Maeba et al.

2. 2-glycosyl benzoxazoles

2,5-Anhydro-3-deoxy-4,6-di-O-toluoyl-D-ribo-hexanothioamide (Pickering) is converted into the methyl thioimidate, the imide, or the N-tert-butoxycarbonyl methyl thioimidate as in Example 3A2. Any of these are suitable for reaction with aromatic 1,2-diamines or ortho aminophenols to produce benzimidazole or benzoxazole derivatives of ribosides, respectively.

By the same procedures in Example 3A2, the aminophenol is reacted with the activated thioamide from the previous paragraph to produce the benzoxazole, which is further converted into the protected nucleoside by the procedure in Example 3A2.

C. Subunit containing Morpholino moiety.

1. N-glycosyl isoindoles

The morpholine nucleoside is prepared by reaction of the O-silylated lactam from Example 3A1 with tetraacetyl alpha-D-glucopyranosyl bromide (Sigma) (with or without the presence of mercuric bromide or silver trifluoromethanesulfonate). The glycoside is converted into the morpholino nucleoside in the usual way except that twice the normal amount of sodium periodate is employed. Following N-tritylation (Example 2C) and hydrazinolysis of the base protecting groups, the base is reprotected as in Example 3A1.

Alternatively, the morpholine nucleoside is prepared by reaction of the benzylic bromide from Example 3A1 with beta-D-glucopyranosylamine (Tamura) to give the glycosyl lactam directly. This is converted into the morpholino nucleoside by the usual procedure except that twice the amount of sodium periodate must be employed in the oxidation step. Following N-tritylation (Example 2C) and hydrazinolysis of the base, reprotection is accomplished as in Example 3A1.

Alternatively, the methyl 4-acetamido-2-methyl-5-nitrobenzoate from Example 3A1 is brominated as in Example 3A1 and reacted with beta-D-glucopyranosylamine. The N-acetyl is removed with 1-10% NaOH in 90% ethanol, the nitro is reduced with palladium/carbon and hydrogen, and the aminobenzimidazole is formed by reaction with cyanogen bromide in refluxing ethanol. The aminobenzimidazole is protected as in Example 3A1. Alternatively, the riboside prepared in Example 3B1 is converted into a morpholine-containing subunit following the procedure in Example 2C. This procedure is accomplished prior to deacylation of the phthaloyl group from the aminobenzimidazole. After morpholine formation and protection as the N-trityl species, the phthaloyl group is removed as in Example 3A1.

The morpholine nitrogen is protected as the N-trityl by reaction of the free amine or the tosylate salt with trityl chloride in acetonitrile containing triethylamine. The reaction mix is poured into water and the solid product isolated by filtration and purified by silica gel chromatography.

2. 2-glycosyl benzoxazoles

By the procedures described in Myers, 2,3,4,6-tetra-O-acetyl-alpha-D-galactopyranosyl bromide is converted into 2,3,4,6-tetra-O-acetyl-alpha-D-galactopyranosyl cyanide and then into the corresponding thioamide by the method of Pickering, et al, and then into its activated thioamide derivatives as in Example 3A2. These are suitable for reacting with 1,2-diamines or ortho aminophenols to produce benzimidazole or benzoxazole derivatives of galactosides, respectively. A similar procedure may be employed beginning with other hexose nitriles (Myers).

By the same procedures in Example 3B2, the aminophenol is reacted with the activated thioamide from the paragraph above to produce the benzoxazole, which is further converted into the N-protected galactoside by the procedures in Example 3B2. This is converted into the morpholine nucleoside by the usual procedure except that twice the normal amount of periodate must be employed in the oxidation step. The N-trityl group is introduced by the method in Example 3C1.

Example 4 Preparation of 4-Membered High-Specificity Subunit Set Containing Morpholino Backbone Moieties

A. CG-specific subunit.

Guanosine is converted into its 2-phenylacetyl derivative by the method in Example 1. This is converted into the morpholine nucleoside tosylate salt by the methods in Example 2C. It may be tritylated by reaction with triphenylmethyl chloride in acetonitrile containing triethylamine. The reaction mixture is poured into water and the product filtered. It is purified by recrystallization from acetonitrile.

B. TA-specific subunit.

2,6-Diaminopurineriboside is converted into its N2-phenylacetyl N6-benzoyl derivative by the method in Example 1. This is converted into the morpholine nucleoside by the methods in Example 2C. It is tritylated by the procedure in Example 5A.

C. AT-specific recognition moiety.

1. 2-glycosylbenzoxazoles

5-Hydroxy-2(3H)-benzoxazolone (Ozdowska) is acetylated with acetic anhydride and then nitrated with cold fuming nitric acid to the 6-nitro-5-acetoxy species. This is dissolved in ethanol and treated with potassium carbonate, then hydrogenated over palladium to reduce the nitro group to an amino group. The isolated aminophenol is reacted with an active thioamide derivative from Example 3C to give the 6-(2,3,4,6-tetra-O-acetyl-galactosyl)-oxazolo[4,5-f]-2(3H)-benzoxazolone. Reaction with phosphoryl chloride followed by ammonolysis gives the 2-aminobenzoxazole. This is N-protected by the usual procedure to prepare the benzoyl, isobutyryl, acetyl, methoxyacetyl, phenoxyacetyl or trichloroacetyl amides.

The morpholine nucleoside is prepared from the galactosyl species above by the procedures in Example 2C except with double the usual amount of sodium periodate in the oxidation step in order to form the dialdehyde required for reductive amination. The latter step is performed by the usual methods. The morpholine is tritylated as in Example 5A and purified by silica gel chromatography.
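
The doubled periodate charge for hexopyranosyl species follows from general glycol-cleavage stoichiometry, which the text does not spell out: each carbon-carbon bond between adjacent hydroxyl-bearing ring carbons consumes one equivalent of periodate. A ribofuranoside has two such carbons (2' and 3') and needs one cleavage, whereas a hexopyranoside has three (2, 3 and 4) and needs two, the middle carbon being lost as formic acid. The sketch below encodes this rule as an idealized estimate; the function name is hypothetical, and the default 1.05 excess factor is an assumption borrowed from the 0.105 mole of periodate per 0.1 mole of nucleoside used in Example 2C.

```python
def periodate_equivalents(n_contiguous_hydroxyl_carbons, excess=1.05):
    """Idealized periodate requirement for ring opening to a dialdehyde.

    n_contiguous_hydroxyl_carbons: number of adjacent ring carbons that
        each bear a free hydroxyl (2 for a ribofuranoside, 3 for a
        gluco- or galactopyranoside).
    excess: multiplicative excess over the theoretical amount; 1.05 is
        assumed here by analogy with Example 2C.
    """
    if n_contiguous_hydroxyl_carbons < 2:
        raise ValueError("glycol cleavage needs at least two adjacent hydroxyls")
    cleavages = n_contiguous_hydroxyl_carbons - 1
    return cleavages * excess


print("ribofuranoside:", periodate_equivalents(2), "equiv")
print("hexopyranoside:", periodate_equivalents(3), "equiv")
```

The hexopyranoside result is exactly twice the ribofuranoside result, matching the "double the usual amount" instruction.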

2. 2-glycosylisoindoles

2-Methyl-4-hydroxybenzoic acid (King) is nitrated with cold fuming nitric acid to give the 5-nitro derivative, which is reduced using palladium catalyst in a hydrogen atmosphere to the 5-amino species. This is converted into the methyl ester by the procedure in Example 3A1. This is converted to the 2-aminobenzoxazole using cyanogen bromide and the exocyclic amino group acylated by the methods in Example 1 with acetyl, methoxyacetyl, trichloroacetyl, isobutyryl or benzoyl. The compound is converted into the benzylic bromide by the methods in Example 3A1.

The morpholine nucleoside is prepared first by reaction of the benzylic bromide with beta-D-glucopyranosylamine as in Example 3C. Then, methanolic periodate cleavage using twice the usual amount of sodium periodate and reductive amination give the morpholine nucleoside. This is tritylated by the procedure in Example 5A and purified by silica gel chromatography.

Alternatively, the benzylic bromide is reacted with ammonia to produce the lactam, which is O-silylated with trimethylsilyl trifluoromethanesulfonate or tert-butyldimethylsilyl trifluoromethanesulfonate and 2,6-di-tert-butylpyridine. The O-silylated lactam is reacted with tetraacetyl alpha-D-glucopyranosyl bromide (with or without the presence of silver trifluoromethanesulfonate or mercuric bromide), followed by ammonolysis and reprotection of the primary amino group as in Example 5C1. The glycoside is converted into the morpholine nucleoside in the usual way except that twice the normal amount of sodium periodate is employed. The morpholine is tritylated as in Example 5A and purified by silica gel chromatography.

D. GC-specific subunit.

1. 2-glycosylbenzoxazoles

5-Chloro-2,4-dinitrophenol (Carnelley) is treated with chloromethyl benzyl ether and diisopropylethyl amine, and the ether is treated with the sodium salt of methyl cyanoacetate (or malononitrile) followed by reduction with iron in acetic acid. Cleavage of the acetal (hydrogen/palladium on carbon) and reaction with an activated thioimide derivative from Example 3C produces the pyrrolobenzoxazole which, after ammonolysis, may be base protected by the procedure in Example 1 to prepare the benzoyl, isobutyryl, acetyl, methoxyacetyl, phenoxyacetyl or trichloroacetyl amides. The morpholine nucleoside is prepared by reaction of the galactoside with double the usual amount of sodium periodate in order to form the dialdehyde required for reductive amination. The latter step is performed by the usual methods. The molecule is tritylated by the method in Example 5A and purified by silica gel chromatography.

2. 2-glycosylisoindoles

4-Chloro-2-methylbenzoic acid (Pfaltz and Bauer Chemical Co) is converted into its methyl ester (HCl/methanol) and further converted into the benzylic bromide by the procedure in Example 3C. Reaction with two equivalents of ammonia provides the lactam, which is nitrated in fuming nitric acid to give the 4-nitro-5-chloro-2-oxoisoindole.

The lactam from above is O-silylated as in Example 5C2. The lactim ether is reacted with tetraacetyl alpha-D-glucopyranosyl bromide (with or without the presence of silver trifluoromethanesulfonate or mercuric bromide). This is reacted with the sodium salt of methyl cyanoacetate (or malononitrile) followed by reduction with iron in acetic acid. The acyl groups are all removed by ammonolysis and the base reprotected by the usual procedure as the benzoyl, isobutyryl, acetyl, methoxyacetyl, phenoxyacetyl or trichloroacetyl amides.

Alternatively, 4-chloro-2-methylbenzoic acid is nitrated with fuming nitric acid in concentrated sulfuric acid to give the 5-nitro derivative. Following esterification by the method in Example 3A, this is reacted with the sodium salt of methyl cyanoacetate (or malononitrile) followed by reduction with iron in acetic acid. The amine is protected by reaction with trichloroacetic anhydride, methoxyacetic anhydride, acetic anhydride, isobutyryl chloride or benzoyl chloride. This is converted into the benzylic bromide by the methods in Example 3C. The benzylic bromide is converted into the lactam glucoside by treatment with beta-D-glucopyranosylamine.

The glucoside above is reacted with methanolic periodate using twice the usual amount of sodium periodate followed by reductive amination to give the morpholino nucleoside. This is tritylated by the procedure in Example 5A and purified by silica gel chromatography.

E. Synthesis of pyrimidopyridine.

5-Formyl-2'-deoxyuridine (Barwolff and Langen) is dissolved in methanol and treated with manganese dioxide in the presence of sodium cyanide and acetic acid according to the general procedure of Corey to provide the methyl ester. The ester is reacted with tert-butyldimethylsilyl triflate in dichloromethane in the presence of diisopropylethyl amine to protect the alcohols. The heterocycle is activated by the method of Bischofberger (NaH, triisopropylbenzenesulfonyl chloride, THF). The 4-O-sulfonated heterocycle is treated with the tosylate salt of benzhydryl alanine (Aboderin) in the presence of diisopropylethyl amine in DMF to give the cytosine derivative. The cytosinyl alanine derivative is oxidized to the dehydroamino acid by the general procedure of Poisel and Schmidt (tert-butyl hypochlorite in THF, followed by one equivalent of potassium tert-butoxide in THF). The product is treated with a catalytic amount of potassium tert-butoxide in hot THF to provide the pyrimidopyridine. The benzhydryl ester is removed by hydrogenolysis using hydrogen over palladium/carbon. The acid is treated with diphenylphosphoryl azide in benzyl alcohol (or benzyl alcohol/dioxane) containing triethylamine according to Shioiri, et al. Following hydrogenolysis to cleave the carbamate, and HF-pyridine to remove the silyl groups, the molecule is N-protected as the trichloroacetamide or phenylacetamide by the usual procedure.

In a similar manner, 5-formyluridine, prepared from 5-methyluridine by the procedures in Barwolff and Langen, is converted into the corresponding pyrimidopyridine riboside. The riboside is converted into the morpholine nucleoside by the usual procedure, and protected as the N-trityl derivative.

Example 5 Preparation of 4-Membered High-Specificity Subunit Set Containing N-Carboxymethylmorpholino-5'-amino Backbone.

Subunits containing ribose, galactose, or glucose moieties are prepared as in Example 4, and their respective sugar moieties are converted to the N-carboxymethylmorpholino-5'-tritylated amine form by the method described in Example 2D.

Example 6 Representative Polymer Assembly Procedures for 2'-O-Methylribose and 2'-Deoxyribose-containing subunits

The protected 2'-Deoxyriboside-containing subunits and the protected 2'-O-Methylriboside-containing subunits are converted into their corresponding 3'-H-phosphonate salts by the methods given in Sakatsume, Yamane, Takaku, Yamamoto, Nucleic Acids Res. 1990, 18, 3327 and polymerized on solid support by the method in this source. When the assembly of the polymer chain is complete, the supported molecule is treated with a primary or secondary amine in the presence of either iodine or carbon tetrachloride as per the method of Froehler, Tetrahedron Lett. 1986, 27, 5575. The phosphoramidate-linked polymer is removed from the support and deprotected by the usual methods involving ammonolysis (see second ref.).

Example 7 Representative Activation Procedures for Morpholino-Containing subunits

A. Activation of 5'-Hydroxyl of Morpholino Subunit.

Dimethylaminodichlorophosphate is prepared as follows: a suspension containing 0.1 mole of dimethylamine hydrochloride in 0.2 mole of phosphorus oxychloride is refluxed for 12 hours and then distilled (boiling point is 36°C at 0.5 mm Hg).

Activation of the 5'-hydroxyl of a morpholino-containing subunit prepared as in Example 2C entails dissolving one mmole of 5'-hydroxyl subunit, base-protected and tritylated on the morpholino nitrogen, in 20 ml of dichloromethane. To this solution 4 mmole of N,N-diethylaniline and 1 mmole of 4-methoxypyridine-N-oxide are added. After dissolution, 2 mmole of dimethylaminodichlorophosphate is added. After two hours the product is isolated by chromatography on silica gel developed with 10% acetone/90% chloroform. The same procedure, except substituting ethyldichlorothiophosphate for dimethylaminodichlorophosphate, gives an activated subunit with similar utility.
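The quantities in the activation procedure above are all stated relative to one mmole of subunit, so they can be tabulated and scaled. The following is a minimal sketch, assuming the amounts scale linearly with the quantity of subunit (an assumption not stated in the procedure); the names `PER_MMOLE_SUBUNIT` and `scale_activation` are hypothetical and used only for illustration.

```python
# Reagent quantities per 1 mmole of 5'-hydroxyl subunit, as given in Example 7A.
PER_MMOLE_SUBUNIT = {
    "dichloromethane (ml)": 20.0,
    "N,N-diethylaniline (mmole)": 4.0,
    "4-methoxypyridine-N-oxide (mmole)": 1.0,
    "dimethylaminodichlorophosphate (mmole)": 2.0,
}


def scale_activation(subunit_mmole: float) -> dict:
    """Scale the Example 7A quantities to a given amount of subunit.

    Linear scaling is an assumption made for illustration only.
    """
    if subunit_mmole <= 0:
        raise ValueError("subunit_mmole must be positive")
    return {name: qty * subunit_mmole for name, qty in PER_MMOLE_SUBUNIT.items()}


if __name__ == "__main__":
    # Print the scaled recipe for 2.5 mmole of subunit.
    for name, qty in scale_activation(2.5).items():
        print(f"{name}: {qty:g}")
```

The same table applies to the ethyldichlorothiophosphate variant if the activating reagent entry is renamed, since the procedure states that only that reagent is substituted.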

B. Activation of 5'-Amine of Morpholino-containing Subunit

The 5'-hydroxyl of a morpholino-containing subunit, having exocyclic amino groups of the base-pair recognition moiety in the protected form, prepared as in Example 2C, can be converted to the amine as follows. To 500 ml of DMSO is added 1.0 mole of pyridine (Pyr), 0.5 mole of trifluoroacetic acid (TFA), and 0.1 mole of the morpholino subunit. The mixture is stirred until dissolved, and then 0.5 mole of diisopropylcarbodiimide (DIC) or dicyclohexylcarbodiimide (DCC) is added. After 2 hours the reaction mixture is added to 8 liters of rapidly stirred brine, which is stirred for 30 minutes and filtered. The solid is dried briefly, washed with 1 liter of ice-cold hexanes, and filtered. The solid is then added to 0.2 mole of sodium cyanoborohydride in 1 liter of methanol and stirred for 10 minutes; 0.4 mole of benzotriazole or p-nitrophenol is added, followed by 0.2 mole of methylamine (40% in H₂O), and the preparation is stirred four hours at room temperature [Note: the benzotriazole or p-nitrophenol buffers the reaction mixture to prevent racemization at the 4' carbon of the subunit at the iminium stage of the reductive alkylation]. Finally, the reaction mixture is poured into 5 liters of water, stirred until a good precipitate forms, and the solid is collected and dried. This dried product is next suspended in DMF and 4 equivalents of SO₃/pyridine complex are added. Over a period of several hours, 8 equivalents of triethylamine are added dropwise with stirring. After an additional two hours the preparation is dumped into a large volume of brine and the solid collected by filtration and dried. This sulfamic acid preparation is then purified by silica gel chromatography.

Ten mmole of the triethylamine salt of sulfated subunit protected on the recognition moiety and on the nitrogen of the morpholino ring is dissolved in 10 ml of dichloromethane and then 40 mmole of pyridine is added. This solution is chilled for 15 minutes on a bed of dry ice and then 1.1 mmole of phosgene (20% in toluene) is slowly added while the solution is rapidly stirred. After addition, the solution is allowed to come to room temperature and then washed with aqueous NaHCO₃, dried, and chromatographed on silica gel eluted with a mixture of chloroform and acetone to give the desired sulfamoyl chloride.

C. Activation of Annular Morpholino Nitrogen

This example describes the preparation of a morpholino subunit protected on its 5' oxygen and sulfated on its morpholino ring nitrogen. Morpholino-containing subunit prepared as in Example 2C, but not carried through the last tritylation step, is silylated on its 5' hydroxyl with t-butyldimethylsilyl chloride. This product is then treated with SO₃/pyridine complex (with excess pyridine) in dimethylformamide (DMF) to give a sulfamic acid on the annular morpholino nitrogen.

It should be mentioned that the salts of sulfamic acids can be chromatographed on silica gel using triethylamine/methanol/chloroform mixtures if the silica is first pre-eluted with 2% triethylamine in chloroform.

This sulfamic acid on the morpholino nitrogen is converted to the sulfamoyl chloride and purified as in Example 7B above.

D. Activation of N-Carboxymethyl of Morpholino

Carboxylate-containing subunits, such as prepared in Examples 2D and 2E, are activated as follows. Ten mmole of the subunit is dissolved in DMF containing 20 mmole of p-nitrophenol and 15 mmole of dicyclohexylcarbodiimide. After 1 hour the mixture is concentrated by rotary evaporation and the product is then purified by silica gel chromatography developed with a mixture of acetone and chloroform.

Example 8 Representative Solid-Phase Polymer Assembly of Morpholino-containing Subunits

This example describes a method which is generally applicable for assembly of activated subunits, prepared as in Examples 7A and 7B, to give phosphorodiamidate-linked, ethylthiophosphoramidate-linked, and sulfamate-linked binding polymers. A similar scheme, wherein the coupling step includes the addition of silver trifluoromethanesulfonate and the use of N,N-diisopropyl-2-methoxyethylamine instead of diisopropylethanolamine, is suitable for assembly of subunits prepared as in Example 7 to give sulfamate-linked polymers. A similar scheme, wherein the coupling step is carried out in dimethylformamide instead of dichloromethane, is suitable for assembly of subunits activated as in Example 7D to give amide-linked polymers.

A. Linker

Aminomethyl polystyrene resin (Catalog no. A1160, from Sigma Chemical Co.), 1% divinylbenzene crosslinked, 200 to 400 mesh, 1.1 mMole of N per gram, is suspended in dichloromethane and transferred to a 1 cm diameter column having a frit on the bottom, to give a resin bed volume of 2.5 ml.

One mMole of bis[2-(succinimidooxycarbonyloxy)-ethyl]sulfone (Pierce Chemical Co. of Rockford, Illinois, USA) is added to a dichloromethane solution containing 1 mMole of N-tritylated piperazine. After 2 hours the reaction mixture is chromatographed on silica gel developed with an acetone/chloroform mixture to give a mono-activated beta-elimination-cleavable linker.

134 micromole of the above linker is dissolved in 1 ml of dichloromethane and added to the resin in the synthesis column, and the resin suspension is agitated for 3 hours at 30 deg. C. Next, 1 mMole of diisopropylaminoethanol and 1 mMole of acetic anhydride are added and agitation continued for 10 minutes, followed by addition of 2 mMole of benzylmethylamine and agitation for 20 minutes. The column is washed with 30 ml dichloromethane. Based on release of trityl, the above procedure typically gives on the order of 100 to 110 micromoles of bound linker.
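The loading figures above imply that roughly three quarters or more of the added linker ends up bound to the resin. The arithmetic can be sketched as follows; the function name `incorporation_fraction` and the constants are hypothetical labels for the values quoted in the text, and the calculation assumes the trityl-release assay counts every bound linker.

```python
LINKER_ADDED_UMOL = 134.0          # micromoles of linker added to the resin
BOUND_RANGE_UMOL = (100.0, 110.0)  # typical bound linker, judged by trityl release


def incorporation_fraction(bound_umol: float,
                           added_umol: float = LINKER_ADDED_UMOL) -> float:
    """Fraction of the added linker that is found bound to the resin."""
    if added_umol <= 0 or bound_umol < 0:
        raise ValueError("amounts must be non-negative and added_umol positive")
    return bound_umol / added_umol


if __name__ == "__main__":
    low, high = (incorporation_fraction(b) for b in BOUND_RANGE_UMOL)
    print(f"{low:.0%} to {high:.0%} of the added linker is bound")
```

With the quoted values this works out to roughly 75% to 82% incorporation.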

B. Coupling cycle (Detritylation/Coupling/Capping)

The coupling cycle described below is used for adding each subunit in an order appropriate to give a polymer having the desired sequence of subunits.

i) Detritylation. Add a solution containing 53 ml of dichloromethane, 6 ml of trifluoroethanol, and 1 gram of cyanoacetic acid. After this solution has passed through, wash the column with 40 ml of dichloromethane, followed by 20 ml of dichloromethane containing 4 mMole of diisopropylaminoethanol. Wash the column with 10 ml of dichloromethane.

ii) Coupling. Add 1 ml of dichloromethane containing 12 microliters of diisopropylaminoethanol to 0.25 mMole of activated subunit (prepared as in Example 7A or 7B), add to the column, and agitate at 37 deg. C for 1 hr. Wash the column with 30 ml dichloromethane. Note: excess unreacted activated subunit can be conveniently recovered simply by adding 4 volumes of hexane to this eluant and filtering.

iii) Capping. Add to the column 2 ml of dichloromethane containing 1 mMole of diisopropylaminoethanol and 1 mMole of acetic anhydride and agitate at 37 deg. C for 10 minutes. Add to the column 10 ml of dichloromethane containing 1 mMole of benzylmethylamine, and agitate the resin bed at 37 deg. C for 20 min. Wash the column with 30 ml dichloromethane.
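The three stages of the coupling cycle can be written down as an ordered table, which makes it easy to tally solvent use. The sketch below transcribes the dichloromethane volumes from steps i) to iii); the names `CYCLE_STEPS`, `dcm_by_stage`, and `dcm_total` are hypothetical, and the total for a whole polymer assumes one full cycle per subunit.

```python
from collections import defaultdict

# (stage, description, ml of dichloromethane) for one coupling cycle,
# transcribed from steps i) to iii) of Example 8B.
CYCLE_STEPS = [
    ("detritylation", "acid solution (with 6 ml trifluoroethanol, 1 g cyanoacetic acid)", 53),
    ("detritylation", "wash", 40),
    ("detritylation", "wash containing 4 mMole diisopropylaminoethanol", 20),
    ("detritylation", "wash", 10),
    ("coupling", "0.25 mMole activated subunit, 37 deg. C, 1 hr", 1),
    ("coupling", "wash", 30),
    ("capping", "acetic anhydride and diisopropylaminoethanol, 10 min", 2),
    ("capping", "benzylmethylamine, 20 min", 10),
    ("capping", "wash", 30),
]


def dcm_by_stage(steps=CYCLE_STEPS):
    """Sum the dichloromethane volume (ml) used in each stage of one cycle."""
    totals = defaultdict(int)
    for stage, _description, ml in steps:
        totals[stage] += ml
    return dict(totals)


def dcm_total(n_cycles, steps=CYCLE_STEPS):
    """Total dichloromethane (ml) for n_cycles, assuming identical cycles."""
    if n_cycles < 0:
        raise ValueError("n_cycles must be non-negative")
    return n_cycles * sum(ml for _stage, _description, ml in steps)


if __name__ == "__main__":
    print(dcm_by_stage())
    print(dcm_total(24), "ml for 24 cycles")
```

Keeping the steps as data rather than prose also makes it straightforward to adapt the table for the variant schemes mentioned at the start of this example, for instance by changing the coupling solvent.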

C. Cleavage from support and deprotection

After all the subunits have been added by the above coupling procedure, the full-length polymer is cleaved from the support by eluting the column with a solution consisting of 2.5 ml of diethylmalonate, 5 ml of 1,8-diazabicyclo[5.4.0]undec-7-ene, and 43 ml dichloromethane. The polymer is then precipitated from this eluant by adding ether.

If it is desirable to add a moiety to enhance aqueous solubility, or to enhance target binding affinity, or to facilitate uptake by specific cell or tissue types, then the secondary aliphatic amine generated upon cleavage from the polystyrene support provides an excellent site for attachment of said moieties at this stage of the polymer preparation. The polymer product is next dissolved in DMF and an equal volume of concentrated NH₄OH is added; the preparation is capped tightly and incubated 18 hrs at 37 deg. C. Subsequently, the preparation is dried under reduced pressure to give a polymer preparation wherein the base-pair recognition moieties are deprotected, one end of the polymer carries a trityl moiety, and the other end carries a secondary aliphatic amine which, as noted above, may be derivatized prior to the ammonia treatment.

Example 9

Polymer Purification Methods.

The full-length polymer having a terminal trityl moiety (typically greater than 50% of the total mass of the preparation for a 24-subunit long polymer) can be separated from the capped failure sequences by low pressure chromatography on a column of chromatographic grade polypropylene (Catalog No. 4342 from PolySciences Inc.) developed with an acetonitrile/water gradient, with the eluant monitored photometrically at 254 nm. Purifications generally go better when the polymer is suspended in water and then the solution adjusted to pH 11 with dimethylamine and the eluting solvents also adjusted to pH 11 with dimethylamine. In this system, the tritylated full-length polymer elutes appreciably later than the non-trityl-containing capped failure sequences.

The fractions containing full-length polymer are collected and dried down under reduced pressure. The polymer preparation is then detritylated by suspending in trifluoroethanol (1 g polymer in 25 ml TFE) and adding 1.5 ml of mercaptoacetic acid. After 10 minutes, 100 ml of ether is added and the final pure product collected by centrifugation or filtration.
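The statement that full-length polymer typically exceeds 50% of a 24-subunit preparation can be related to an average per-cycle coupling yield. The following is a rough sketch under stated assumptions that are not part of the example: every coupling cycle has the same yield, there are 24 cycles, and the mass fraction is treated as if it were the fraction of chains that are full length. The function names are hypothetical.

```python
def per_cycle_yield(full_length_fraction: float, n_cycles: int) -> float:
    """Average per-cycle yield y such that y ** n_cycles equals the fraction."""
    if not 0.0 < full_length_fraction <= 1.0 or n_cycles < 1:
        raise ValueError("fraction must be in (0, 1] and n_cycles at least 1")
    return full_length_fraction ** (1.0 / n_cycles)


def full_length_fraction(cycle_yield: float, n_cycles: int) -> float:
    """Fraction of chains that are full length after n_cycles equal-yield cycles."""
    if not 0.0 <= cycle_yield <= 1.0 or n_cycles < 0:
        raise ValueError("cycle_yield must be in [0, 1] and n_cycles non-negative")
    return cycle_yield ** n_cycles


if __name__ == "__main__":
    y = per_cycle_yield(0.5, 24)
    print(f"average per-cycle yield of about {y:.1%}")
```

Under these simplifying assumptions, a full-length fraction above 50% for a 24-subunit polymer corresponds to an average coupling yield above about 97% per cycle.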

Example 10 Polymer Assembly Via Novel Oxidation/Ring Closure/Reduction Method

A. Synthesis support

The solid support used in this synthesis should be hydrophilic, but should not contain vicinal hydroxyls. Add an aqueous slurry of Macro-Prep 50 CM (Catalog No. 156-0070 from Bio-Rad Laboratories, Richmond, Calif., USA) to a fritted column to give a 5 ml packed bed volume (containing approximately 1 mMole of carboxylate). Wash this synthesis support with 100 ml of 0.1 N HCl and then 50 ml water.

Pass 50 ml of DMF (dimethylformamide) through the column and drain. Add 5 ml of DMF containing 5 mMole of diisopropylcarbodiimide and 5 mMole of p-nitrophenol and incubate with agitation at 30 deg. C for 3 hours. Wash the column with 100 ml of DMF and then add 20 mMole of piperazine in 10 ml of DMF and agitate for 15 minutes. Wash the column with 50 ml of DMF and drain.

B. Addition of linker and first subunit

To 1 mMole of a ribose-containing subunit having a carbazate moiety at the 5' of the ribose (prepared as in Example 2F) in 5 ml of DMF, add 3 mMole of bis[2-(succinimidooxycarbonyloxy)-ethyl]sulfone (Pierce Chemical Co. of Rockford, Illinois, USA) and incubate at 30 deg. C for 3 hours. To the reaction mixture add ether and collect the precipitate. Wash the precipitated linker-subunit with ether, resuspend in 5 ml of DMF, add to the synthesis support, and incubate with agitation for 3 hrs at 30 deg. C. Wash the support with 50 ml of DMF, and then with 100 ml of water.

C. Coupling cycle

i) Oxidation of vicinal hydroxyls

Dissolve 5 mMole of sodium periodate in 10 ml of water, add to column, and agitate for 10 minutes. Wash column with 50 ml of water and drain.

ii) Morpholino ring closure/reduction

Dissolve 2 mMole of sodium cyanoborohydride in 5 ml of water, adjust pH to between 7 and 8 with trimethylacetic acid, add 1.5 mMole of the next ribose-containing 5'-carbazate subunit, and add to the column containing the synthesis support. Incubate with agitation for 30 min at 30 deg. C. Add formic acid to reduce pH to between 3 and 4, and incubate at 30 deg. C for 10 minutes. Wash column with 100 ml of water.

Repeat this coupling cycle until all subunits have been added to give the desired full-length polymer.
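The repetition of the oxidation and ring closure/reduction steps can be pictured as a simple loop over the subunit sequence. The sketch below only generates an ordered list of step descriptions taken from sections B and C of this example; the function name `assembly_plan` and the subunit labels are hypothetical placeholders, and the sketch does not model the chemistry itself.

```python
def assembly_plan(subunits):
    """Return the ordered assembly steps for a list of subunit labels.

    The first subunit is attached through the linker (section B); every
    later subunit is added by one oxidation step followed by one ring
    closure/reduction step (section C).
    """
    if not subunits:
        raise ValueError("at least one subunit is required")
    plan = [f"attach linker-derivatized subunit {subunits[0]} to the support"]
    for nxt in subunits[1:]:
        plan.append(
            "oxidize vicinal hydroxyls of the support-bound terminal subunit "
            "(sodium periodate, 10 min)"
        )
        plan.append(
            f"ring closure/reduction with 5'-carbazate subunit {nxt} "
            "(sodium cyanoborohydride, pH 7 to 8, 30 min; "
            "then formic acid, pH 3 to 4, 10 min)"
        )
    return plan


if __name__ == "__main__":
    # "S1", "S2", "S3" are placeholder labels, not subunits named in the text.
    for number, step in enumerate(assembly_plan(["S1", "S2", "S3"]), start=1):
        print(number, step)
```

A polymer of n subunits therefore needs n - 1 passes through the coupling cycle after the first subunit is attached.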

Addition of terminal moieties

If it is desirable to add to the binding polymer a moiety to enhance aqueous solubility, or to enhance target binding affinity, or to facilitate uptake by specific cell or tissue types, this can be conveniently achieved at this stage by oxidizing the vicinal hydroxyls of the terminal subunit of the polymer and, by the morpholino ring closure/reduction procedure described above, adding said moieties containing a primary aliphatic amine.

Cleavage from the support

After all the subunits of the polymer, and any desired additional groups, have been added by the above coupling procedure, the polymer is cleaved from the support by washing the column with 50 ml of DMF, and then eluting the column with a solution consisting of 2.5 ml of diethylmalonate, 5 ml of 1,8-diazabicyclo[5.4.0]undec-7-ene, and 43 ml of DMF. The polymer is then precipitated from this eluant by adding ether.

The full-length polymer can be purified by low pressure chromatography on a column of chromatographic grade polypropylene (Catalog No. 4342 from PolySciences Inc.) developed with an acetonitrile/water gradient, with the eluant monitored photometrically at 254 nm. Purifications generally go better when the polymer is suspended in water and then the solution adjusted to pH 11 with dimethylamine and the eluting solvents also adjusted to pH 11 with dimethylamine.

Example 11

Polymer Structural Characterization.

NMR, and even two-dimensional NMR, appears to provide little useful structural information for these heteropolymers when they are of any significant length. Likewise, elemental analysis has not been found to be of value.

Polymers prepared as in Example 8 and cleaved from the solid support, but not yet treated with ammonium hydroxide, generally show relatively clean parent ions for polymers up to about 16 to 18 subunits in length, when assessed by positive fast atom bombardment mass spectrometry. For longer polymers, and for polymers lacking protective groups on the bases (such as prepared in Example 10), effective mass analysis requires procedures such as laser desorption or electrospray.

Although the invention has been described with respect to particular polymer subunits, methods of preparing the subunits, and polymer assembly, it will be appreciated that various modifications and changes may be made without departing from the invention.

Claims

IT IS CLAIMED:

1. A polymer composition effective to bind in a sequence-specific manner to a target sequence of a duplex polynucleotide containing at least two different-oriented Watson/Crick base-pairs at selected positions in the target sequence, comprising a specific sequence of subunits having the form:

where Y is a 2- or 3-atom length, uncharged subunit linkage group; R' is H, OH, or O-alkyl; the 5'-methylene has a β stereochemical orientation in the 5-membered ring and a uniform stereochemical orientation in the 6-membered ring; Rᵢ has a β stereochemical orientation; and at least about 70% of Rᵢ groups in the polymer are selected from two or more of the following base-pair-specificity groups:

(a) for a T:A or U:A oriented base-pair, Rᵢ is 2,6-diaminopurine;

(b) for a C:G oriented base-pair, Rᵢ is guanine or 6-thioguanine;

(c) for a G:C oriented base-pair, Rᵢ is selected from the group consisting of planar bases having the following skeletal ring structures and hydrogen bonding arrays, where B indicates the polymer backbone:

where the * ring position may carry a hydrogen-bond acceptor group; and

(d) for an A:T or A:U oriented base-pair, Rᵢ is selected from the group consisting of planar bases having the following skeletal ring structures and hydrogen bonding arrays, where B indicates the polymer backbone:

where the * ring position may carry a hydrogen-bond donating group.

2. The polymer composition of claim 1, containing one or more subunits of the form:

3. The polymer composition of claim 1, containing one or more subunits of the form:

4. The polymer composition of claim 3, containing one or more subunits of the form:

5. The polymer composition of claim 1, for use in sequence-specific binding to a B-form DNA-DNA duplex nucleic acid, wherein the Y linkage group is three atoms in length.

6. The polymer composition of claim 5, wherein one or more subunits of the polymer are selected from the group consisting of:

7. The polymer composition of claim 6, wherein one or more subunits of the polymer are selected from the group consisting of:

8. The polymer composition of claim 1, for use in sequence-specific binding to an A-form duplex nucleic acid, wherein the Y linkage group is two atoms in length.

9. The polymer composition of claim 8, wherein one or more subunits of the polymer are selected from the group consisting of:

10. The polymer composition of claim 8, wherein one or more subunits of the polymer are selected from the group consisting of:


11. The polymer composition of claim 1, wherein the Rᵢ structure is selected from the group consisting of:


12. The polymer of claim 1, wherein the Rᵢ structure specific for a G:C target orientation is selected from the group consisting of the following bases:

13. The polymer of claim 1, wherein the Rᵢ structure specific for an A:T target orientation is selected from the group consisting of the following bases:

14. The polymer composition of claim 1, wherein up to about 30% of the Rᵢ groups in the polymer are cytosine, at polymer subunits corresponding to a G:C base-pair orientation in the target sequence, and thymine, at polymer subunits corresponding to A:T base-pair orientations in the target sequence.

15. The composition of claim 1, wherein the polymer contains one or more attached moieties effective to enhance the solubility of the polymer in aqueous medium.

16. A method for coupling a first free or polymer-terminal subunit having one of the following subunit forms:

where Z is a 2-atom or 3-atom long moiety, said method comprising:

i) oxidizing the first subunit to generate a dialdehyde intermediate;

ii) contacting the dialdehyde intermediate with the second subunit under conditions effective to couple a primary amine to a dialdehyde; and

iii) adding a reducing agent effective to give a coupled structure selected from the following forms:


17. A method for isolating, from a liquid sample, a target duplex nucleic acid fragment having a selected sequence of base-pairs, comprising:

i) contacting the sample with a polymer reagent containing a structure which allows isolation of the reagent from solution, and attached to this structure, the polymer composition of claim 1 having a subunit sequence effective to bind in a sequence-specific manner with the selected sequence of base-pairs, under conditions effective for sequence-specific binding of the polymer composition to the selected sequence of base-pairs; and

ii) separating the polymer reagent from the fluid sample.

18. The method of claim 17, for use in detecting the presence of such target fragment in a liquid sample, which further includes testing the separated polymer reagent for the presence of bound duplex nucleic acid.

19. The method of claim 18, wherein said polymer reagent includes a solid support with bound polymer composition, and said testing includes adding to the duplex nucleic acid, a fluorescent compound effective to intercalate into duplex DNA.

20. A subunit composition for use in forming a polymer composition effective to bind in a sequence-specific manner to a target sequence in a duplex polynucleotide, comprising one of the following subunit structures: (a) (b) (c) (d)


where R' is H, OH, or O-alkyl; the 5'-methylene has a β stereochemical orientation in subunit forms (a), (c), and (d) and a uniform stereochemical orientation in subunit form (b); X is hydrogen or a protective group or a linking group suitable for joining the subunits in any selected order into a linear polymer; Y is a nucleophilic or electrophilic linking group suitable for joining the subunits in any selected order into a linear polymer; and X and Y together are such that when two subunits of the subunit set are linked the resulting intersubunit linkage is 2 or 3 atoms in length and uncharged; Z is a 2-atom or 3-atom long moiety; and, Rᵢ, which may be in the protected state and has a β stereochemical orientation, is selected from the group consisting of planar bases having the following skeletal ring structures and hydrogen bonding arrays, where B indicates the aliphatic backbone moiety:

where the * ring position may carry a hydrogen-bond acceptor group; or,

Rᵢ is selected from the group consisting of planar bases having the following skeletal ring structures and hydrogen bonding arrays, where B indicates the aliphatic backbone moiety:


where the * ring position may carry a hydrogen-bond donating group.