CA2332129A1 - Dna encoding methymycin and pikromycin - Google Patents


Info

Publication number
CA2332129A1
Authority
CA
Canada
Prior art keywords
ala
leu
gly
arg
host cell
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002332129A
Other languages
French (fr)
Inventor
David H. Sherman
Hung-Wen Liu
Yongquan Xue
Lishan Zhao
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Minnesota
Original Assignee
Individual


Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/11 DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C12N15/52 Genes encoding for enzymes or proenzymes
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P19/00 Preparation of compounds containing saccharide radicals
    • C12P19/44 Preparation of O-glycosides, e.g. glucosides
    • C12P19/60 Preparation of O-glycosides, e.g. glucosides having an oxygen of the saccharide radical directly bound to a non-saccharide heterocyclic ring or a condensed ring system containing a non-saccharide heterocyclic ring, e.g. coumermycin, novobiocin
    • C12P19/62 Preparation of O-glycosides, e.g. glucosides having an oxygen of the saccharide radical directly bound to a non-saccharide heterocyclic ring or a condensed ring system containing a non-saccharide heterocyclic ring, e.g. coumermycin, novobiocin the hetero ring having eight or more ring members and only oxygen as ring hetero atoms, e.g. erythromycin, spiramycin, nystatin
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12P FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
    • C12P7/00 Preparation of oxygen-containing organic compounds
    • C12P7/40 Preparation of oxygen-containing organic compounds containing a carboxyl group including Peroxycarboxylic acids
    • C12P7/42 Hydroxy-carboxylic acids

Abstract

A biosynthetic gene cluster for methymycin and pikromycin as well as a biosynthetic gene cluster for desosamine is provided.

Description

JUMBO APPLICATIONS/PATENTS
THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE VOLUME. THIS IS VOLUME ___ OF ___.
NOTE: For additional volumes, please contact the Canadian Patent Office.

DNA ENCODING METHYMYCIN AND PIKROMYCIN
This invention was made with a grant from the Government of the United States of America (grants GM48562, GM35906 and GM54346 from the National Institutes of Health and a grant from the Office of Naval Research). The Government may have certain rights in the invention.
Background of the Invention
Polyhydroxyalkanoates (PHAs) are one class of biodegradable polymers. The first identified member of the PHA thermoplastics was polyhydroxybutyrate (PHB), the polymeric ester of D(-)-3-hydroxybutyrate. The biosynthetic pathway of PHB in the gram-negative bacterium Alcaligenes eutrophus is depicted in Figure 1. PHAs related to PHB differ in the structure of the pendant arm, R (Figure 2). For example, R=CH3 in PHB, while R=CH2CH3 in polyhydroxyvalerate, and R=(CH2)4CH3 in polyhydroxyoctanoate.
The genes responsible for PHB synthesis in A. eutrophus have been cloned and sequenced (Peoples et al., J. Biol. Chem., 264, 15293 (1989); Peoples et al., J. Biol. Chem., 264, 15298 (1989)). Three enzymes, β-ketothiolase (phbA), acetoacetyl-CoA reductase (phbB), and PHB synthase (phbC), are involved in the conversion of acetyl-CoA to PHB. The PHB synthase gene encodes a protein of Mr = 63,900 which is active when introduced into E. coli (Peoples et al., J. Biol. Chem., 264, 15298 (1989)).
Although PHB represents the archetypical form of a biodegradable thermoplastic, its physical properties preclude significant use of the homopolymer form. Pure PHB is highly crystalline and, thus, very brittle. However, unique physical properties resulting from the structural characteristics of the R groups in a PHA copolymer may result in a polymer with more desirable characteristics. These characteristics include altered crystallinity, UV weathering resistance, glass to rubber transition temperature (Tg), melting temperature of the crystalline phase, rigidity and durability (Holmes et al., EPO 00052 459; Anderson et al., Microbiol. Rev., 54, 450 (1990)). Thus, these polyesters behave as thermoplastics, with melting temperatures of 50-180°C, which can be processed by conventional extrusion and molding equipment.
Traditional strategies for producing random PHA copolymers involve feeding short- and long-chain fatty acid monomers to bacterial cultures. However, this technology is limited by the monomer units which can be incorporated into a polymer by the endogenous PHA synthase and the expense of manufacturing PHAs by existing fermentation methods (Haywood et al., FEMS Microbiol. Lett., 52, 1 (1989); Doi et al., Int. J. Biol. Macromol., 12, 106 (1990); Steinbuchel et al., In: Novel Materials from Biological Sources, D. Byrom (ed.); MacMillan, NY (1991); Valentin et al., Appl. Microbiol. Biotechnol., 36, 507 (1992)).
The production of diverse hydroxyacyl-CoA monomers for homo- and co-polymeric PHAs also occurs in some bacteria through the reduction and condensation pathway of fatty acids. This pathway employs a fatty acid synthase (FAS) which condenses malonate and acetate. The resulting β-keto group undergoes three processing steps, β-keto reduction, dehydration, and enoyl reduction, to yield a fully saturated butyryl unit. However, this pathway provides only a limited array of PHA monomers which vary in alkyl chain length but not in the degree of alkyl group branching, saturation, or functionalization along the acyl chain.
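The three processing steps can be pictured as an ordered series of state changes at the β-carbon. The following is a minimal sketch of that idea only; the state labels and the `skip` parameter are hypothetical names chosen for illustration and do not come from the specification. A FAS runs all three steps; omitting a step leaves the corresponding functional group in place.

```python
# Illustrative model only: state labels and the `skip` argument are
# assumptions made for this sketch, not terms defined in the patent.
PROCESSING_STEPS = [
    # (step name, required substrate state, product state)
    ("beta-keto reduction", "beta-keto", "beta-hydroxy"),
    ("dehydration", "beta-hydroxy", "enoyl"),
    ("enoyl reduction", "enoyl", "saturated"),
]


def process_beta_carbon(state="beta-keto", skip=()):
    """Apply the three processing steps in order, optionally skipping some.

    Processing stops at the first skipped step, because each step needs
    the product of the previous step as its substrate.
    """
    for name, substrate, product in PROCESSING_STEPS:
        if name in skip or state != substrate:
            break
        state = product
    return state


if __name__ == "__main__":
    print(process_beta_carbon())                       # full FAS-style cycle
    print(process_beta_carbon(skip=("dehydration",)))  # hydroxyl retained
```

Running the full cycle ends in the saturated state, which matches the fully saturated butyryl unit described above.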
The biosynthesis of polyketides, such as erythromycin, is mechanistically related to formation of long-chain fatty acids. However, polyketides, in contrast to FASs, retain ketone, hydroxyl, or olefinic functions and contain methyl or ethyl side groups interspersed along an acyl chain comparable in length to that of common fatty acids. This asymmetry in structure implies that the polyketide synthase (PKS), the enzyme system responsible for formation of these molecules, although mechanistically related to a FAS, results in an end product that is structurally very different than that of a long-chain fatty acid.
Because PHAs are biodegradable polymers that have the versatility to replace petrochemical-based thermoplastics, it is desirable that new, more economical methods be provided for the production of defined PHAs. Thus, what is needed are methods to produce recombinant PHA monomer synthases for the generation of PHA polymers.
Moreover, there is a continuing need for the identification and isolation of novel polyketide synthase genes, e.g., a polyketide synthase which encodes polypeptides that synthesize an antibiotic such as a macrolide.
WO 00/00620 PCT/US99/1~1398
The invention provides an isolated and purified nucleic acid segment comprising a nucleic acid sequence comprising a sugar (desosamine) biosynthetic gene cluster, a biologically active variant or fragment thereof, wherein the nucleic acid sequence is not derived from the eryC gene cluster of Saccharopolyspora erythraea. As described hereinbelow, the desosamine biosynthetic gene cluster from Streptomyces venezuelae was isolated, cloned and sequenced. The isolated nucleic acid segment comprising the gene cluster preferably includes a nucleic acid sequence comprising SEQ ID NO:3, or a fragment or variant thereof. The cluster was found to encode nine polypeptides including DesI (e.g., SEQ ID NO:8 encoded by SEQ ID NO:7), DesII (e.g., SEQ ID NO:10 encoded by SEQ ID NO:9), DesIII (e.g., SEQ ID NO:12 encoded by SEQ ID NO:11), DesIV (e.g., SEQ ID NO:14 encoded by SEQ ID NO:13), DesV (e.g., SEQ ID NO:16 encoded by SEQ ID NO:15), DesVI (e.g., SEQ ID NO:18 encoded by SEQ ID NO:17), DesVII (e.g., SEQ ID NO:20 encoded by SEQ ID NO:19), DesVIII (e.g., SEQ ID NO:22 encoded by SEQ ID NO:21), and DesR (e.g., SEQ ID NO:24 encoded by SEQ ID NO:23) (see Figure 24). It is also preferred that the nucleic acid segment of the invention encoding DesR is not derived from the eryB gene cluster of Saccharopolyspora erythraea or the oleD gene from Streptomyces antibioticus.
Preferably, the nucleic acid segment comprising the desosamine biosynthetic gene cluster hybridizes under moderate, or more preferably stringent, hybridization conditions to SEQ ID NO:3, or a fragment thereof. Moderate and stringent hybridization conditions are well known to the art; see, for example, sections 9.47-9.51 of Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY (1989). For example, stringent conditions are those that (1) employ low ionic strength and high temperature for washing, for example, 0.015 M NaCl/0.0015 M sodium citrate (SSC); 0.1% sodium lauryl sulfate (SDS) at 50°C, or (2) employ a denaturing agent such as formamide during hybridization, e.g., 50% formamide with 0.1% bovine serum albumin/0.1% Ficoll/0.1% polyvinylpyrrolidone/50 mM sodium phosphate buffer at pH 6.5 with 750 mM NaCl, 75 mM sodium citrate at 42°C. Another example is use of 50% formamide, 5 x SSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5 x Denhardt's solution, sonicated salmon sperm DNA (50 µg/ml), 0.1% sodium dodecylsulfate (SDS), and 10% dextran sulfate at 42°C, with washes at 42°C in 0.2 x SSC and 0.1% SDS.
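The salt concentrations quoted above are all multiples of a single SSC stock. As a small worked example, the sketch below derives the 1 x SSC composition from the stated "5 x SSC (0.75 M NaCl, 0.075 M sodium citrate)" and converts between SSC strength and molarity. The function and variable names are hypothetical and chosen only for this illustration.

```python
# 1 x SSC composition, inferred from the text's definition of 5 x SSC
# (0.75 M NaCl, 0.075 M sodium citrate) by dividing by five.
SSC_1X = {"NaCl_M": 0.15, "sodium_citrate_M": 0.015}


def ssc(strength):
    """Return molar concentrations for an SSC solution of the given strength."""
    return {name: round(conc * strength, 6) for name, conc in SSC_1X.items()}


def ssc_strength(nacl_molar):
    """Return the SSC strength ("x") corresponding to a NaCl molarity."""
    return round(nacl_molar / SSC_1X["NaCl_M"], 6)


if __name__ == "__main__":
    print(ssc(5))               # hybridization buffer in the second example
    print(ssc(0.2))             # wash buffer in the second example
    print(ssc_strength(0.015))  # strength of the low ionic strength wash
```

On this arithmetic, the low ionic strength wash in condition (1) corresponds to 0.1 x SSC, and the 750 mM NaCl, 75 mM sodium citrate buffer in condition (2) corresponds to 5 x SSC.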
The invention also provides a variant polypeptide having at least about 80%, more preferably at least about 90%, and even more preferably at least about 95%, but less than 100%, contiguous amino acid sequence identity to the polypeptide having an amino acid sequence comprising SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, SEQ ID NO:24, or a fragment thereof. A preferred variant polypeptide, or a subunit or fragment of a polypeptide, of the invention includes a variant or subunit polypeptide having at least about 1%, more preferably at least about 10%, and even more preferably at least about 50%, the activity of the polypeptide having the amino acid sequence comprising SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, or SEQ ID NO:24. Thus, for example, the glycosyltransferase activity of a polypeptide of SEQ ID NO:20 can be compared to a variant of SEQ ID NO:20 having at least one amino acid substitution, insertion, or deletion relative to SEQ ID NO:20.
A variant nucleic acid sequence of the invention has at least about 80%, more preferably at least about 90%, and even more preferably at least about 95%, but less than 100%, contiguous nucleic acid sequence identity to a nucleic acid sequence comprising SEQ ID NO:3, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, or a fragment thereof.
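The identity thresholds above (80%, 90%, 95%, and less than 100%) can be illustrated with a simple calculation. This section does not define how "contiguous sequence identity" is computed, so the sketch below makes a loudly stated assumption: the two sequences are already aligned and of equal length, and identity is the fraction of matching positions. The function names and the toy sequences are hypothetical and are not taken from any SEQ ID NO.

```python
def percent_identity(reference, candidate):
    """Percent of positions that match between two pre-aligned sequences.

    Assumption for this sketch: both sequences are already aligned and
    have equal length; no gaps or alignment step are handled here.
    """
    if not reference or len(reference) != len(candidate):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(reference, candidate))
    return 100.0 * matches / len(reference)


def variant_tier(identity):
    """Map a percent identity onto the preference tiers named in the text."""
    if identity >= 100.0:
        return "identical (not a variant)"
    if identity >= 95.0:
        return "at least about 95%"
    if identity >= 90.0:
        return "at least about 90%"
    if identity >= 80.0:
        return "at least about 80%"
    return "below 80%"


if __name__ == "__main__":
    ref = "MKTAYIAKQR"   # toy sequence, not from the sequence listing
    var = "MKTAYIAKQK"   # one substitution at the last position
    pid = percent_identity(ref, var)
    print(pid, variant_tier(pid))
```

A real comparison would normally run an alignment first; that step is deliberately left out of this sketch.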
Also provided is an expression cassette comprising a nucleic acid sequence comprising a desosamine biosynthetic gene cluster, a biologically active variant or fragment thereof operably linked to a promoter functional in a host cell, as well as host cells comprising an expression cassette of the invention. Thus, the expression cassettes of the invention are useful to express individual genes within the cluster, e.g., the desR gene which encodes a glycosidase or the desVII gene which encodes a glycosyltransferase having relaxed substrate specificity for polyketides and deoxysugars, i.e., the glycosyltransferase processes sugar substrates other than TDP-desosamine. Thus, the desVII gene can be employed in combinatorial biology approaches to synthesize a library of macrolide compounds having various polyketide and deoxysugar structures. Moreover, the expression of a glycosylase in a host cell which synthesizes a macrolide antibiotic may be useful in a method to reduce toxicity of, e.g., inactivate, the antibiotic. For example, a host cell which produces the antibiotic is transformed with an expression cassette encoding the glycosyltransferase. The recombinant glycosyltransferase is expressed in an amount that reversibly inactivates the antibiotic. To activate the antibiotic, the antibiotic, preferably the isolated antibiotic which is recovered from the host cell, is contacted with an appropriate native or recombinant glycosidase.
Preferably, the nucleic acid segment encoding desosamine in the expression cassette of the invention is not derived from the eryC gene cluster of Saccharopolyspora erythraea. Preferred host cells are prokaryotic cells, although eukaryotic host cells are also envisioned. These host cells are useful to express desosamine, analogs or derivatives thereof as well as individual polypeptides which can then be isolated from the host cell. Also provided is an expression cassette or host cell comprising antisense sequences from at least a portion of the desosamine biosynthetic gene cluster.
Another embodiment of the invention is a recombinant host cell, e.g., a bacterial cell, in which at least a portion of a nucleic acid sequence encoding desosamine in the host chromosome is disrupted, e.g., deleted or interrupted (e.g., by an insertion) with heterologous sequences, or substituted with a variant nucleic acid sequence of the invention, so as to alter, preferably so as to result in a decrease or lack of, desosamine synthesis and/or so as to result in the synthesis of an analog or derivative of desosamine. Preferably, the nucleic acid sequence which is disrupted is not derived from the eryC gene cluster of Saccharopolyspora erythraea. Thus, the recombinant host cell of the invention has at least one gene, i.e., desI, desII, desIII, desIV, desV, desVI, desVII, desVIII or desR, which is disrupted. One embodiment of the invention includes a recombinant host cell in which the desVI gene, which encodes an N-methyltransferase, is disrupted, for example, by replacement with an antibiotic resistance gene. Preferably, such a host cell produces an aglycone having an N-acetylated aminodeoxy sugar, 10-deoxymethynolide, a compound of formula (7), a compound of formula (8), or a combination thereof. Thus, the deletion or disruption of the desVI gene may be useful in a method for preparing novel sugars.
Another preferred embodiment of the invention is a recombinant bacterial host cell in which the desR gene, which encodes a glycosidase such as β-glucosidase, is disrupted. Preferably, the host cell synthesizes C-2' β-glucosylated macrolide antibiotics, for example, a compound of formula (13), a compound of formula (14), or a combination thereof.
Therefore, the invention further provides a compound of formula (8), (9), (13) or (14). It will be appreciated by those skilled in the art that each atom of the compounds of the invention having a chiral center may exist in and be isolated in optically active and racemic forms. Some compounds may exhibit polymorphism. It is to be understood that the present invention encompasses any racemic, optically active, polymorphic or stereoisomeric form, or mixtures thereof, of a compound of the invention, which possess the useful properties described herein, it being well known in the art how to prepare optically active forms (for example, by resolution of the racemic form by recrystallization techniques, by synthesis from optically active starting materials, by chiral synthesis, or by chromatographic separation using a chiral stationary phase) and how to determine activity using the standard tests described herein, or using other similar tests which are well known in the art.

Also provided is a method for directing the biosynthesis of specific glycosylation-modified polyketides by genetic manipulation of a polyketide-producing microorganism. The method comprises introducing into a polyketide-producing microorganism a DNA sequence encoding enzymes in desosamine biosynthesis, e.g., a DNA sequence comprising SEQ ID NO:3, a variant or fragment thereof, so as to yield a microorganism that produces specific glycosylation-modified polyketides. Alternatively, an anti-sense DNA sequence of the invention may be employed. Then the glycosylation-modified polyketides are isolated from the microorganism. It is preferred that the DNA sequence is modified so as to result in the inactivation of at least one enzymatic activity in sugar biosynthesis or in the attachment of the sugar to a polyketide.
Further provided is an isolated and purified nucleic acid segment comprising a nucleic acid sequence comprising a macrolide biosynthetic gene cluster (the "met/pik" or "pik" gene cluster) encoding polypeptides that synthesize methymycin, pikromycin, neomethymycin, narbomycin, or a combination thereof, or a biologically active variant or fragment thereof. It is preferred that the nucleic acid segment comprises SEQ ID NO:5, or a fragment or variant thereof, or hybridizes under moderate, or more preferably stringent, conditions to SEQ ID NO:5 or a fragment thereof. It is also preferred that the isolated and purified nucleic acid segment is from Streptomyces sp., such as Streptomyces venezuelae (e.g., ATCC 15439, ATCC 15068, MCRL 0306, SC 2366 or 3629), Streptomyces narbonensis (e.g., ATCC 19790), Streptomyces eurocidicus, Streptomyces zaomyceticus (MCRL 0405), Streptomyces flavochromogenes, Streptomyces sp. AM400, and Streptomyces felleus, although isolated and purified nucleic acid from other organisms which produce methymycin, narbomycin, neomethymycin and/or pikromycin is also within the scope of the invention.
The cloned genes can be introduced into an expression system and genetically manipulated so as to yield novel macrolide antibiotics, e.g., ketolides, as well as monomers for polyhydroxyalkanoate (PHA) biopolymers. Preferably, the nucleic acid sequence encodes PikR1 (e.g., SEQ ID NO:27 encoded by SEQ ID NO:26), PikR2 (e.g., SEQ ID NO:29 encoded by SEQ ID NO:28), PikAI (e.g., SEQ ID NO:31 encoded by SEQ ID NO:30), PikAII (e.g., SEQ ID NO:33 encoded by SEQ ID NO:32), PikAIII (e.g., SEQ ID NO:35 encoded by SEQ ID NO:34), PikAIV (e.g., SEQ ID NO:37 encoded by SEQ ID NO:36), PikB (which is the desosamine gene cluster described above), PikC (e.g., SEQ ID NO:39 encoded by SEQ ID NO:38), and PikD (e.g., SEQ ID NO:41 encoded by SEQ ID NO:40), a variant or a fragment thereof, or hybridizes under moderate or preferably stringent conditions to such a nucleic acid sequence.
The invention also provides a variant polypeptide having at least about 80%, more preferably at least about 90%, and even more preferably at least about 95%, but less than 100%, contiguous amino acid sequence identity to the polypeptide having an amino acid sequence comprising SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, SEQ ID NO:41, or a fragment thereof. A preferred variant polypeptide, or a subunit or fragment of a polypeptide, of the invention includes a variant or subunit polypeptide having at least about 1%, more preferably at least about 10%, and even more preferably at least about 50%, the activity of the polypeptide having the amino acid sequence comprising SEQ ID NO:27, SEQ ID NO:29, SEQ ID NO:31, SEQ ID NO:33, SEQ ID NO:35, SEQ ID NO:37, SEQ ID NO:39, or SEQ ID NO:41. The activities of polypeptides of the macrolide biosynthetic pathway of the invention are described below.
A variant nucleic acid sequence of the pik biosynthetic gene cluster of the invention has at least about 80%, more preferably at least about 90%, and even more preferably at least about 95%, but less than 100%, contiguous nucleic acid sequence identity to a nucleic acid sequence comprising SEQ ID NO:5, SEQ ID NO:26, SEQ ID NO:28, SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, SEQ ID NO:36, SEQ ID NO:38, SEQ ID NO:40, or a fragment thereof.
The pikA gene encodes a polyketide synthase which synthesizes the macrolactones 10-deoxymethynolide and narbonolide, pikB encodes desosamine synthases which catalyze the formation and transfer of a deoxysugar moiety onto aglycones, the pikC gene encodes a P450 hydroxylase which catalyzes the conversion of YC-17 and narbomycin into methymycin, neomethymycin, and pikromycin, and the pikR1, pikR2 (possibly one for a 12-membered ring and the other for a 14-membered ring) and desR genes encode enzymes associated with bacterial self-protection. Thus, the isolated nucleic acid molecule of the invention encodes four active macrolide antibiotics, two of which have a 12-membered ring while the other two have a 14-membered ring. The genetic mechanism underlying the alternative termination of polyketide synthesis may be useful to prepare novel compounds, e.g., antibiotics, and PHA monomers. The invention further provides isolated and purified nucleic acid segments, e.g., in the form of an expression cassette, for each of the individual genes in the macrolide biosynthetic gene cluster. For example, the invention provides an isolated and purified pikAV gene that encodes a thioesterase II. In particular, the thioesterase may be useful to enhance the structural diversity of antibiotics and in PHA production, as the thioesterase modulates chain release and cyclization. For example, a thioesterase II gene having acyl-ACP coenzyme A transferase activity (e.g., a mutant pik TEII, bacterial, fungal or plant medium-chain-length thioesterase, an animal fatty acid thioesterase or a thioesterase from a polyketide synthase) is introduced at the end of a recombinant monomer synthase (see Figure 36), which, in the presence of a PHA synthase, e.g., phaC1, produces a novel polyhydroxyalkanoate polymer. Alternatively, in the absence of a TEII domain, a fusion of a portion of PKS gene cluster with a PHA synthase may result in the transfer of an acyl chain from the PHA to the polymerase.
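The division of labor among pikA, pikB and pikC described above can be summarized as a small lookup model. This is a simplified sketch, not a description of the actual experiments: the `pikB` and `pikC` flags are hypothetical parameters meaning "this step is functional", and the pairing of each aglycone with its glycoside and hydroxylated products (10-deoxymethynolide to YC-17 to methymycin and neomethymycin; narbonolide to narbomycin to pikromycin) is stated here as an assumption consistent with the ring sizes given in the text.

```python
# Ring sizes of the two macrolactones made by the pikA polyketide synthase.
RING_SIZE = {"10-deoxymethynolide": 12, "narbonolide": 14}

# pikB step (desosamine formation and transfer onto the aglycone).
# The aglycone-to-glycoside pairing is an assumption of this sketch.
GLYCOSYLATION = {"10-deoxymethynolide": "YC-17", "narbonolide": "narbomycin"}

# pikC step (P450 hydroxylase): two products from the 12-membered ring
# substrate, one product from the 14-membered ring substrate.
HYDROXYLATION = {
    "YC-17": ["methymycin", "neomethymycin"],
    "narbomycin": ["pikromycin"],
}


def products(aglycone, pikB=True, pikC=True):
    """Return the compounds this simplified model predicts for an aglycone.

    pikB and pikC are hypothetical flags for whether each step is functional.
    """
    if aglycone not in RING_SIZE:
        raise ValueError(f"unknown aglycone: {aglycone}")
    if not pikB:
        return [aglycone]
    glycoside = GLYCOSYLATION[aglycone]
    if not pikC:
        return [glycoside]
    return list(HYDROXYLATION[glycoside])


if __name__ == "__main__":
    for aglycone in RING_SIZE:
        print(aglycone, "->", products(aglycone))
    print(products("narbonolide", pikC=False))
```

The model ignores shunt products and partial conversions; it is meant only to make the gene-to-step mapping easy to follow.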
Also provided is a pikC gene that encodes a hydroxylase which is active at two positions on a 12-membered ring or at one position on a 14-membered ring. Such a gene may be particularly useful to prepare novel compounds through bioconversion or biotransformation.
The invention also provides an expression cassette comprising a nucleic acid segment comprising a macrolide biosynthetic gene cluster encoding polypeptides that synthesize methymycin, pikromycin, neomethymycin, narbomycin, or a combination thereof, or a biologically active variant or fragment thereof, operably linked to a promoter functional in a host cell. Further provided is a host cell comprising the nucleic acid segment encoding methymycin, pikromycin, neomethymycin, narbomycin, or a combination thereof, or a biologically active variant or fragment thereof. Moreover, the invention provides isolated and purified polypeptides of the invention, preferably obtained from host cells having the nucleic acid molecules of the invention. In addition, expression cassettes and host cells comprising antisense sequences of at least a portion of the macrolide biosynthetic gene cluster of the invention are envisioned.
Yet another embodiment of the invention is a recombinant host cell, e.g., a bacterial cell, in which a portion of the macrolide biosynthetic gene cluster of the invention is disrupted or replaced with a heterologous sequence or a variant nucleic acid segment of the invention, so as to alter, preferably so as to result in a decrease or lack of methymycin, pikromycin, neomethymycin, narbomycin, or a combination thereof, and/or so as to result in the synthesis of novel macrolides. Therefore, the invention provides a recombinant host cell in which a pikAI gene, a pikAII gene, a pikAIII gene (12-membered rings), a pikAIV gene (14-membered rings), a pikB gene cluster, a pikAV gene, a pikC gene, a pikD gene, a pikR1 gene, a pikR2 gene, or a combination thereof, is disrupted or replaced. A preferred embodiment of the invention is a host cell wherein the pikB (e.g., the desVI and desV genes), pikAI, pikAV or pikC gene is disrupted.
Although the sixth (final) condensation cycle is not required for 10-deoxymethynolide formation, as described hereinbelow genetic disruption of Pik module 6 (encoded by pikAIV) prevented production of both the 12- as well as the 14-membered ring macrolactones. Thus, expression of alternative forms of PikAIV controls the final step in polyketide chain elongation and termination. Specifically, an N-terminal truncated form of PikAIV leads to 10-deoxymethynolide formation while full-length PikAIV results in narbonolide production.
The expression of a truncated PKS module represents a novel method of polyketide chain length determination. Moreover, as the expression of such a module may produce multiple polyketides, the use of such a module may result in the more rapid identification of novel products.
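The relationship between the form of PikAIV that is expressed and the macrolactone produced can be written as a small table. The sketch below only restates the three outcomes given in the text (full-length, N-terminal truncated, disrupted); the dictionary keys, the `Macrolactone` type and the function name are hypothetical labels chosen for this illustration.

```python
from typing import NamedTuple, Optional


class Macrolactone(NamedTuple):
    name: Optional[str]
    ring_size: Optional[int]


# Outcomes as stated in the text; the key strings are labels for this sketch.
PIKAIV_OUTCOMES = {
    "full-length": Macrolactone("narbonolide", 14),
    "n-terminal-truncated": Macrolactone("10-deoxymethynolide", 12),
    "disrupted": Macrolactone(None, None),  # neither macrolactone is made
}


def macrolactones(forms_expressed):
    """Return the macrolactones expected for the PikAIV forms expressed."""
    outcomes = (PIKAIV_OUTCOMES[form] for form in forms_expressed)
    return [outcome for outcome in outcomes if outcome.name is not None]


if __name__ == "__main__":
    # A cell expressing both forms is expected to yield both ring sizes.
    for product in macrolactones(["full-length", "n-terminal-truncated"]):
        print(product.name, product.ring_size)
```

Expressing both forms in one cell returns both products, which is the sense in which a single module may produce multiple polyketides.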
The invention also provides a method for combinatorial biosynthesis. The method comprises expressing in a host cell an expression cassette comprising a DNA fragment of a biosynthetic gene cluster, e.g., a polyketide synthase gene, wherein the expression cassette is present on a plasmid, wherein the genome of the host cell comprises a portion of the gene which is different than the portion of the gene present on the plasmid. Preferably, the DNA fragment and the portion of the gene which is on the host chromosome together comprise the entire gene. Synchronized expression of genes from the plasmid and the chromosome thus creates a combinatorial pathway that produces a product. The smaller size of the plasmid facilitates gene manipulation so that a large library of recombinant pathways can thus be generated in a short time. Preferably, the DNA fragment and the portion of the gene cluster on the host chromosome are linked to the native promoter, e.g., pik genes are linked to PpikA.
Moreover, as the nucleic acid segment comprising the macrolide biosynthetic gene cluster of the invention encodes a polyketide synthase, modules of that synthase are useful in methods to prepare recombinant polyhydroxyalkanoate monomer synthases and polymers in addition to macrolide antibiotics and derivatives thereof.
Thus, the invention provides an isolated and purified DNA molecule comprising a first DNA segment encoding a first module and a second DNA segment encoding a second module, wherein the DNA segments together encode a recombinant polyhydroxyalkanoate monomer synthase, and wherein at least one DNA segment is derived from the pikA gene cluster of Streptomyces venezuelae. Preferably, no more than one DNA segment is derived from the eryA gene cluster of Saccharopolyspora erythraea. In one embodiment of the invention, the 3' most DNA segment of the isolated DNA molecule of the invention encodes a thioesterase II. Also provided is an expression cassette comprising a nucleic acid molecule encoding the polyhydroxyalkanoate monomer synthase operably linked to a promoter functional in a host cell.
Yet another embodiment of the invention is a method of providing a polyhydroxyalkanoate monomer. The method comprises introducing into a host cell a DNA molecule comprising a DNA segment encoding a recombinant polyhydroxyalkanoate monomer synthase operably linked to a promoter functional in the host cell. The DNA molecule comprises a plurality of DNA segments, e.g., a first module and a second module, wherein at least one DNA segment is derived from the pikA gene cluster of Streptomyces venezuelae. The DNA encoding the recombinant polyhydroxyalkanoate monomer synthase is then expressed in the host cell so as to generate a polyhydroxyalkanoate monomer. Optionally, a second DNA molecule may be introduced into the host cell. The second DNA molecule comprises a DNA segment encoding a polyhydroxyalkanoate synthase operably linked to a promoter functional in the host cell. The two DNA molecules are expressed in the host cell so as to generate a polyhydroxyalkanoate polymer.
Another embodiment of the invention is an isolated and purified DNA molecule comprising a first DNA segment encoding a fatty acid synthase and a second DNA segment encoding a module from the pikA gene cluster of Streptomyces venezuelae. Such a DNA molecule can be employed in a method of providing a polyhydroxyalkanoate monomer. Thus, a DNA molecule comprising a first DNA segment encoding a fatty acid synthase and a second DNA segment encoding a polyketide synthase is introduced into a host cell. The first DNA segment is 5' to the second DNA segment and the first DNA segment is operably linked to a promoter functional in the host cell. The first DNA segment is linked to the second DNA segment so that the linked DNA segments express a fusion protein. The DNA molecule is expressed in the host cell so as to generate a polyhydroxyalkanoate monomer.
Further provided is a method of providing a polyhydroxyalkanoate monomer synthase. The method comprises introducing an expression cassette comprising a DNA molecule encoding a polyhydroxyalkanoate synthase operably linked to a promoter functional in a host cell. The DNA molecule comprises a first DNA segment encoding a first module and a second DNA segment encoding a second module wherein the DNA segments together encode a polyhydroxyalkanoate monomer synthase. At least one DNA segment is derived from the pikA gene cluster of Streptomyces venezuelae. The DNA molecule is expressed in the host cell. Optionally, the DNA molecule further comprises a DNA segment encoding a polyhydroxyalkanoate synthase. Alternatively, a second, separate DNA molecule encoding a polyhydroxyalkanoate synthase is introduced into the host cell.
A further embodiment of the invention is an isolated and purified DNA
molecule comprising a DNA segment which encodes a Streptomyces venezuelae polyketide synthase, e.g., a polyhydroxyalkanoate monomer synthase, a biologically active variant or subunit (fragment) thereof. Preferably, the DNA segment encodes a polypeptide having an amino acid sequence comprising SEQ ID NO:2. Preferably, the DNA segment comprises SEQ ID
NO:1. The DNA molecules of the invention are double stranded or single stranded. A
preferred embodiment of the invention is a DNA molecule that has at least about 70%, more preferably at least about 80%, and even more preferably at least about 90%, but less than 100%, contiguous sequence identity to the DNA segment comprising SEQ ID NO:1, e.g., a "variant" DNA molecule. A variant DNA molecule of the invention can be prepared by methods well known to the art, including oligonucleotide-mediated mutagenesis.
See Adelman et al., DNA, 2, 183 (1983) and Sambrook et al., Molecular Cloning: A Laboratory Manual (1989).
The invention also provides an isolated, purified polyhydroxyalkanoate monomer synthase, e.g., a polypeptide having an amino acid sequence comprising SEQ ID
NO:2, a biologically active subunit, or a biologically active variant thereof. Thus, the invention provides a variant polypeptide having at least about 80%, more preferably at least about 90%, and even more preferably at least about 95%, but less than 100%, contiguous amino acid sequence identity to the polypeptide having an amino acid sequence comprising SEQ ID
NO:2. A preferred variant polypeptide, or a subunit of a polypeptide, of the invention includes a variant or subunit polypeptide having at least about 10%, more preferably at least about 50%, and even more preferably at least about 90%, the activity of the polypeptide having the amino acid sequence comprising SEQ ID NO:2. Preferably, a variant polypeptide of the invention has one or more conservative amino acid substitutions relative to the polypeptide having the amino acid sequence comprising SEQ ID NO:2. For example, conservative substitutions include aspartic-glutamic as acidic amino acids;
lysine/arginine/histidine as basic amino acids; leucine/isoleucine, methionine/valine, alanine/valine as hydrophobic amino acids; serine/glycine/alanine/threonine as hydrophilic amino acids. The biological activity of a polypeptide of the invention can be measured by methods well known to the art, including but not limited to, methods described hereinbelow.
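By way of illustration only, the conservative substitution groupings recited above can be sketched in Python as a simple lookup. The one-letter residue codes, the function names `is_conservative` and `classify_substitutions`, and the toy sequences are assumptions introduced for this sketch and form no part of the disclosure; the groupings themselves are taken from the preceding paragraph.

```python
# Illustrative sketch only. Groupings follow the preceding paragraph;
# one-letter codes and all function names are assumptions for this example.
CONSERVATIVE_GROUPS = [
    {"D", "E"},            # acidic: aspartic/glutamic
    {"K", "R", "H"},       # basic: lysine/arginine/histidine
    {"L", "I"},            # hydrophobic: leucine/isoleucine
    {"M", "V"},            # hydrophobic: methionine/valine
    {"A", "V"},            # hydrophobic: alanine/valine
    {"S", "G", "A", "T"},  # hydrophilic: serine/glycine/alanine/threonine
]


def is_conservative(ref_residue, var_residue):
    """Return True if the two residues differ but share one of the groups."""
    a, b = ref_residue.upper(), var_residue.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def classify_substitutions(reference, variant):
    """List (position, ref, var, conservative?) for each differing position.

    Both sequences must already be aligned and of equal length.
    Positions are 1-based.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to equal length")
    changes = []
    for pos, (r, v) in enumerate(zip(reference, variant), start=1):
        if r != v:
            changes.append((pos, r, v, is_conservative(r, v)))
    return changes


if __name__ == "__main__":
    # Hypothetical four-residue peptides, not taken from SEQ ID NO:2.
    for change in classify_substitutions("ADKL", "AEKW"):
        print(change)
```

Such a lookup only classifies substitutions against the listed groups; it does not predict whether a given variant retains biological activity, which is measured as described in the text.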
Thus, the modules encoded by the nucleic acid segments of the invention may be employed in the methods described hereinabove to prepare polyhydroxyalkanoates of varied chain length or having various side chain substitutions and/or to prepare glycosylated biopolymers.
The compounds produced by the recombinant host cells of the invention are useful as biopolymers, e.g., in packaging or biomedical applications, to engineer PHA
monomer synthases, or to prepare biologically active agents, such as those useful to prepare a medicament for the treatment of a pathological condition or a symptom in a mammal, e.g., a human. The agents include pharmaceuticals such as chemotherapeutic agents, immunosuppressants, agents to treat asthma, chronic obstructive pulmonary disease as well as other diseases involving respiratory inflammation, cholesterol-lowering agents, or macrolide-based antibiotics which are active against a variety of organisms, e.g., bacteria, including multi-drug-resistant pneumococci and other respiratory pathogens, as well as viral and parasitic pathogens; or as crop protection agents (e.g., fungicides or insecticides) via expression of polyketides in plants. Methods employing these compounds, e.g., to treat a mammal, bird or fish in need of such therapy, such as a patient having a bacterial, viral or parasitic infection, cancer, respiratory disease, or in need of immunosuppression, e.g., during cell, tissue or organ transplantation, are also envisioned.
Brief Description of the Figures
Figure 1. The PHB biosynthetic pathway in A. eutrophus.
Figure 2. Molecular structure of common bacterial PHAs. Most of the known PHAs are polymers of 3-hydroxy acids possessing the general formula shown. For example, R=CH3 in PHB, R=CH2CH3 in polyhydroxyvalerate (PHV), and R=(CH2)4CH3 in polyhydroxyoctanoate (PHO).
Figure 3. Comparison of the natural and recombinant pathways for PHB synthesis. The three enzymatic steps of PHB synthesis in bacteria involving 3-ketothiolase, acetoacetyl-CoA reductase, and PHB synthase are shown on the left. The two enzymatic steps involved in PHB synthesis in the pathway in Sf21 cells containing a rat fatty acid synthase with an inactivated dehydrase domain (ratFAS206) are shown on the right.

Figure 4. Schematic diagram of the molecular organization of the tyl polyketide synthase (PKS) gene cluster. Open arrows correspond to individual open reading frames (ORFs) and numbers above an ORF denote a multifunctional module or synthase unit (SU). AT=acyltransferase; ACP=acyl carrier protein; KS=β-ketoacyl synthase; KR=ketoreductase; DH=dehydrase; ER=enoyl reductase; TE=thioesterase; MM=methylmalonyl CoA; M=malonyl CoA; EM=ethylmalonyl CoA. Module 7 in tyl is also known as Module F.
Figure 5. Schematic diagram of the molecular organization of the met PKS gene cluster.
Figure 6. Strategy for producing a recombinant PHA monomer synthase by domain replacement.
Figure 7. (A) 10% SDS-PAGE gel showing samples from various stages of the purification of PHA synthase; lane 1, molecular weight markers; lane 2, total protein of uninfected insect cells; lane 3, total protein of insect cells expressing a rat FAS (200 kDa; Joshi et al., Biochem. J., 296, 143 (1993)); lane 4, total protein of insect cells expressing PHA synthase; lane 5, soluble protein from sample in lane 4; lane 6, pooled hydroxylapatite (HA) fractions containing PHA synthase. (B) Western analysis of an identical gel using rabbit-α-PHA synthase antibody as probe. Bands designated with arrows are: a, intact PHB synthase with N-terminal alanine at residue 7 and serine at residue 10 (A7/S10); b, 44 kDa fragment of PHB synthase with N-terminal alanine at residue 181 and asparagine at residue 185 (A181/N185); c, PHB synthase fragment of approximately 30 kDa apparently blocked based on resistance to Edman degradation; d, 22 kDa fragment with N-terminal glycine at residue 187 (G187). Band d apparently does not react with rabbit-α-PHB synthase antibody (B, lane 6). The band of similar size in B, lane 4 was not further identified.
Figure 8. N-terminal analysis of PHA synthase purified from insect cells. (a) The expected N-terminal 25 amino acid sequence of A. eutrophus PHA synthase. (b&c) The two N-terminal sequences determined for the A. eutrophus PHA synthase produced in insect cells.
The bolded sequences are the actual N-termini determined.
Figure 9. Spectrophotometric scans of substrate, 3-hydroxybutyrate CoA (HBCoA) and product, CoA. The wavelength at which the direct spectrophotometric assays were carried out (232 nm) is denoted by the arrow; substrate, HBCoA (~) and product, CoA (o).
Figure 10. Velocity of the hydrolysis of HBCoA as a function of substrate concentration. Assays were carried out in 40 or 200 µl assay volumes with enzyme concentration remaining constant at 0.95 mg/ml (3.8 µg/40 µl assay). Velocities were calculated from the linear portions of the assay curves subsequent to the characteristic lag period. The substrate concentration at half optimal velocity, the apparent Km value, was estimated to be 2.5 mM from this data.
Figure 11. Double reciprocal plot of velocity versus substrate concentration. The concave upward shape of this plot is similar to results obtained by Fukui et al. (Arch. Microbiol., 110, 149 (1976)) with granular PHA synthase from Z. ramigera.
Figure 12. Velocity of the hydrolysis of HBCoA as a function of enzyme concentration. Assays were carried out in 40 µl assay volumes with the concentration of HBCoA remaining constant at 8 µM.
Figure 13. Specific activity of PHA synthase as a function of enzyme concentration.
Figure 14. pH activity curve for soluble PHA synthase produced using the baculovirus system. Reactions were carried out in the presence of 200 mM Pi. Buffers of pH < 10 were prepared with potassium phosphate, while buffers of pH > 10 were prepared with the appropriate proportion of Na3PO4.
Figure 15. Assays of the hydrolysis of HBCoA with varying amounts of PHA synthase. Assays were carried out in 40 µl assay volumes with the concentration of HBCoA remaining constant at 8 µM. Initial A232 values, originally between 0.62 and 0.77, were normalized to 0.70. Enzyme amounts used in these assays were, from the uppermost curve, 0.38, 0.76, 1.14, 1.52, 1.90, 2.28, 2.66, 3.02, 3.42, 7.6, and 15.2 µg, respectively.
Figure 16. SDS/PAGE analysis of proteins synthesized at various time points during infection of Sf21 cells. Approximately 0.5 mg of total cellular protein from various samples was fractionated on a 10% polyacrylamide gel. Samples include: uninfected cells, lanes 1-4, days 0, 1, 2, 3, respectively; infection with BacPAK6::phbC alone, lanes 5-8, days 0, 1, 2, 3, respectively; infection with baculoviral clone containing ratFAS206 alone, lanes 9-12, days 0, 1, 2, 3, respectively; and ratFAS206 and BacPAK6 infected cells, lanes 13-16, days 0, 1, 2, 3, respectively. A = mobility of FAS, B = mobility of PHA synthase. Molecular weight standard lanes are marked M.
Figure 17. Gas chromatographic evidence for PHB accumulation in Sf21 cells.
Gas chromatograms from various samples are superimposed. PHB standard (Sigma) is chromatogram #7 showing a propylhydroxybutyrate elution time of 10.043 minutes (s, arrow). The gas chromatograms of extracts of the uninfected (#1); singly infected with ratFAS206 (#2, day 3); and singly infected with PHA synthase (#3, day 3) are shown at the bottom of the figure. Gas chromatograms of extracts of dual-infected cells at day 1 (#4), 2 (#5), and 3 (#6) are also shown exhibiting a peak eluting at 10.096 minutes (x, arrow). The peak of dual-infected, day 3 extract (#6) was used for mass spectrometry (MS) analysis.
Figure 18. Gas chromatography-mass spectrometry analysis of PHB. The characteristic fragmentation of propylhydroxybutyrate at m/z of 43, 60, 87, and 131 is shown. A) standard PHB from bacteria (Sigma), and B) peak X from ratFAS206 and BacPAK6::phbC baculovirus infected, day 3 (#6, Figure 17) Sf21 cells expressing rat FAS dehydrase inactivated protein and PHA synthase.
Figure 19. Map of the vep (Streptomyces venezuelae polyene encoding) gene cluster.
Figure 20. Plasmid map of pDHS502.
Figure 21. Plasmid map of pDHS505.
Figure 22. Cloning protocol for pDHS505.
Figure 23. Nucleotide sequence (SEQ ID NO:1) and corresponding amino acid sequence (SEQ ID NO:22) of vep ORFI.
Figure 24. Schematic diagram of the desosamine biosynthetic pathway and the enzymatic activity associated with each of the desosamine biosynthetic polypeptides.
Figure 25. Schematic of the conversion of the inactive (diglycosylated) form of methymycin and pikromycin to the active form of methymycin and pikromycin.
Figure 26. Schematic diagram of the desosamine biosynthetic pathway.
Figure 27. Pathway for the synthesis of a compound of formula 7 and 8 in desVI-mutants of Streptomyces.
Figure 28. Structure and biosynthesis of methymycin, pikromycin, and related compounds in Streptomyces venezuelae ATCC 15439. Methymycin: R1=OH, R2=H; neomethymycin: R1=H, R2=OH; pikromycin: R3=OH; narbomycin: R3=H. Polyketide synthase components PikAI, PikAII, PikAIII, PikAIV, and PikAV are represented by solid bars. Each circle represents an enzymatic domain in the Pik PKS system. KS: β-ketoacyl-ACP synthase, AT: acyltransferase, ACP: acyl carrier protein, KR: β-ketoacyl-ACP reductase, DH: β-hydroxyl-thioester dehydratase, ER: enoyl reductase, KSQ: a KS-like domain, KR with a cross: nonfunctional KR, TE: thioesterase domain, and TEII: type II thioesterase. Des represents all eight enzymes for desosamine biosynthesis and transfer and PikC is the cytochrome P450 monooxygenase responsible for hydroxylation at R1, R2, and R3 positions (Xu et al., 1998).
Figure 29. Organization of the pik cluster in S. venezuelae. Each arrow represents an open reading frame (ORF). The direction of transcription and relative sizes of the ORFs deduced from nucleotide sequence are indicated. The cluster is composed of four genetic loci: pikA, pikB (des), pikC, and pikR. Cosmid clones are denoted as overlapping lines.
Figure 30. Conversion of YC-17 and narbomycin by PikC P450 hydroxylase.
Figure 31. Nucleotide sequence (SEQ ID NO:5) and inferred amino acid sequence (SEQ ID NO:6) of the pik gene cluster.
Figure 32. Nucleotide sequence (SEQ ID NO:3) and inferred amino acid sequence (SEQ ID NO:4) of the desosamine gene cluster.
Figure 33. S. venezuelae AX916 construct useful to prepare a polyketide having a shorter chain length compared to wild-type pikA. pik module 2 is fused to pik module 5, and modules 3 and 4 are deleted, so as to encode a three module PKS which produces two macrolides, a triketide and a tetraketide.
Figure 34. Recombinant PKS having a wild-type thioesterase II.
Figure 35. pAX703 construct, an expression and complementation vector. The PikTEII gene can be replaced with an EcoRI-NsiI fragment. The phaC1 gene can be replaced with a PacI-DraI fragment.
Figure 36. Strategy for C7 polymer production. mTEII is a mutant pikTEII, an acyl-ACP CoA transferase; phaC1 is a PHA polymerase 1 from P. oleovorans which may have racemase activity. In a strain having these constructs, AX916, a PHA polymer is produced.
Figure 37. Strategy for C5 polymer production. A PHA polymerase gene phaC1 is directly fused to pik module 2, so as to result in a fusion that transfers an acyl chain from the PKS protein directly to the polymerase by the prosthetic group on the ACP domain of the PKS.
Figure 38. Codons for specified amino acids.
Figure 39. Exemplary and preferred amino acid substitutions.
Figure 40. Plasmid complementation of S. venezuelae AX912. The relevant genotype (on the chromosome and on the plasmid) is listed on the left side and the corresponding phenotype is listed on the right side. The pikA genes are indicated by open arrows with divided boxes indicating domains in the PKS. An internal alternative translation start site for PikAIV is indicated by an * above the KS6 domain and a hexa-histidine was introduced into mutant AX912 chromosome (position marked by a) to facilitate the detection of PikAIV expression. Antibiotic production was determined following complementation of mutant AX912 with the corresponding plasmids. Antibiotic production was normalized by using AX912 as 0% and full-length pikAIV complementation (pDHS707) as 100% standards.
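The two-point normalization described for Figure 40 can be illustrated with a short Python sketch. It assumes, since the text does not state the formula, that a raw production measurement is rescaled linearly so that the AX912 value maps to 0% and the pDHS707-complemented value maps to 100%. The function name `normalize_production` and the numerical values are hypothetical and are not data from the figure.

```python
# Illustrative sketch only: linear two-point normalization, assuming the
# AX912 measurement defines 0% and the pDHS707 measurement defines 100%.
def normalize_production(raw, baseline, full):
    """Rescale a raw production value to a percentage.

    raw      -- measurement for the strain of interest
    baseline -- measurement for the 0% standard (AX912 in the text)
    full     -- measurement for the 100% standard (AX912 with pDHS707)
    """
    if full == baseline:
        raise ValueError("0% and 100% standards must differ")
    return 100.0 * (raw - baseline) / (full - baseline)


if __name__ == "__main__":
    # Hypothetical raw values in arbitrary units, not taken from the figure.
    ax912, pdhs707, test_strain = 5.0, 25.0, 15.0
    print(normalize_production(test_strain, ax912, pdhs707))
```

Under this assumed scaling a strain producing less than AX912 would give a negative percentage, and one producing more than the pDHS707 standard would exceed 100%.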

Figure 41. Mechanistic models for alternative termination by PikAIV. Proteins PikAIII and PikAIV are stacked one on top of the other according to their order in polyketide biosynthesis (PikAI and PikAII are not shown). A sphere represents an enzymatic domain in the PKSs with its diameter proportional to the size of the domain. Each PKS module/protein was first dimerized (each peptide chain is shown as either red or blue) and then twisted 180 degrees to form a half helix following the model for erythromycin PKS (Staunton et al., 1999). Two sets of independent active sites are thus formed along two grooves of the helix that lead to the production of two polyketides in each biosynthetic cycle. A) Wild type S. venezuelae under culture conditions for pikromycin production. B) Wild type S. venezuelae under culture conditions for methymycin production. C) S. venezuelae AX912 (pDHS704) under culture conditions for methymycin production. D) S. venezuelae AX912 (pDHS704) under culture conditions for pikromycin production. E) S. venezuelae AX912 (pDHS708) under culture conditions for pikromycin production. F) S. venezuelae AX912 (pDHS708) under culture conditions for methymycin production. Gene products expressed from the plasmid construct used for complementation are underlined.
Figure 42. Pathway for desosamine biosynthesis.
Figure 43. Schematic of pathway leading to methymycin/neomethymycin analogs 18 and 19.
Figure 44. Macrolide having D-quinovose.
Figure 45. Products produced by desl mutant.
Figure 46. Pik sequences from Streptomyces spp. A) PikA3 pikA4 from S. venezuelae ATCC 15068 (SEQ ID NO:54). B) PikA3 pikA4 from S. narbonensis ATCC 19790 (SEQ ID NO:55). C) TEII gene from S. venezuelae ATCC 15068 (SEQ ID NO:56). D) TEII gene from S. narbonensis ATCC 19790 (SEQ ID NO:57).
Detailed Description of the Invention
Definitions
As used herein, a "linker region" is an amino acid sequence present in a multifunctional protein which is less well conserved in an amino acid sequence than an amino acid sequence with catalytic activity.
As used herein, an "extender unit" catalytic or enzymatic domain is an acyl transferase in a module that catalyzes chain elongation by adding 2-4 carbon units to an acyl chain and is located carboxy-terminal to another acyl transferase. For example, an extender unit with methylmalonylCoA specificity adds acyl groups to a methylmalonylCoA molecule.
As used herein, a "polyhydroxyalkanoate" or "PHA" polymer includes, but is not limited to, linked units of related, preferably heterologous, hydroxyalkanoates such as 3-hydroxybutyrate, 3-hydroxyvalerate, 3-hydroxycaproate, 3-hydroxyheptanoate, 3-hydroxyhexanoate, 3-hydroxyoctanoate, 3-hydroxyundecanoate, and 3-hydroxydodecanoate, and their 4-hydroxy and 5-hydroxy counterparts.
As used herein, a "Type I polyketide synthase" is a single polypeptide with a single set of iteratively used active sites. This is in contrast to a Type II polyketide synthase which employs active sites on a series of polypeptides.
As used herein, a "recombinant" nucleic acid or protein molecule is a molecule where the nucleic acid molecule which encodes the protein has been modified in vitro, so that its sequence is not naturally occurring, or corresponds to naturally occurring sequences that are not positioned as they would be positioned in a genome which has not been modified.
A "recombinant" host cell of the invention has a genome that has been manipulated in vitro so as to alter, e.g., decrease or disrupt, or, alternatively, increase, the function or activity of at least one gene in the macrolide or desosamine biosynthetic gene cluster of the invention.
As used herein, a "multifunctional protein" is one where two or more enzymatic activities are present on a single polypeptide.
As used herein, a "module" is one of a series of repeated units in a multifunctional protein, such as a Type I polyketide synthase or a fatty acid synthase.
As used herein, a "premature termination product" is a product which is produced by a recombinant multifunctional protein which is different than the product produced by the non-recombinant multifunctional protein. In general, the product produced by the recombinant multifunctional protein has fewer acyl groups.
As used herein, a DNA that is "derived from" a gene cluster is a DNA that has been isolated and purified in vitro from genomic DNA, or synthetically prepared on the basis of the sequence of genomic DNA.
As used herein, the "pik" or "pik/met" gene cluster includes sequences encoding a polyketide synthase (pikA), desosamine biosynthetic enzymes (pikB, also referred to as des), a cytochrome P450 (pikC), regulatory factors (pikD) and enzymes for cellular self-resistance (pikR).

As used herein, the terms "isolated and/or purified" refer to in vitro isolation of a DNA or polypeptide molecule from its natural cellular environment, and from association with other components of the cell, such as nucleic acid or polypeptide, so that it can be sequenced, replicated and/or expressed. Moreover, the DNA may encode more than one recombinant Type I polyketide synthase and/or fatty acid synthase. For example, "an isolated DNA molecule encoding a polyhydroxyalkanoate monomer synthase" is RNA or DNA containing greater than 7, preferably 15, and more preferably 20 or more sequential nucleotide bases that encode a biologically active polypeptide, fragment, or variant thereof, that is complementary to the non-coding, or complementary to the coding strand, of a polyhydroxyalkanoate monomer synthase RNA, or hybridizes to the RNA or DNA encoding the polyhydroxyalkanoate monomer synthase and remains stably bound under stringent conditions, as defined by methods well known to the art, e.g., in Sambrook et al., supra.
An "antibiotic" as used herein is a substance produced by a microorganism which, either naturally or with limited chemical modification, will inhibit the growth of or kill another microorganism or eukaryotic cell.
An "antibiotic biosynthetic gene" is a nucleic acid, e.g., DNA, segment or sequence that encodes an enzymatic activity which is necessary for an enzymatic reaction in the process of converting primary metabolites into antibiotics.
An "antibiotic biosynthetic pathway" includes the entire set of antibiotic biosynthetic genes necessary for the process of converting primary metabolites into antibiotics. These genes can be isolated by methods well known to the art, e.g., see U.S. Patent No. 4,935,340.
Antibiotic-producing organisms include any organism, including, but not limited to, Actinoplanes, Actinomadura, Bacillus, Cephalosporium, Micromonospora, Penicillium, Nocardia, and Streptomyces, which either produces an antibiotic or contains genes which, if expressed, would produce an antibiotic.
An antibiotic resistance-conferring gene is a DNA segment that encodes an enzymatic or other activity which confers resistance to an antibiotic.
The term "polyketide" as used herein refers to a large and diverse class of natural products, including but not limited to antibiotic, antifungal, anticancer, and anti-helminthic compounds. Antibiotics include, but are not limited to anthracyclines and macrolides of different types (polyenes and avermectins as well as classical macrolides such as erythromycins). Macrolides are produced by, for example, S. erythreus, S. antibioticus, S. venezuelae, S. fradiae and S. narbonensis.

The term "glycosylated polyketide" refers to any polyketide that contains one or more sugar residues.
The term "glycosylation-modified polyketide" refers to a polyketide having a changed glycosylation pattern or configuration relative to that particular polyketide's unmodified or native state.
The term "polyketide-producing microorganism" as used herein includes any microorganism that can produce a polyketide naturally or after being suitably engineered (i.e., genetically). Examples of actinomycetes that naturally produce polyketides include but are not limited to Micromonospora rosaria, Micromonospora megalomicea, Saccharopolyspora erythraea, Streptomyces antibioticus, Streptomyces albireticuli, Streptomyces ambofaciens, Streptomyces avermitilis, Streptomyces fradiae, Streptomyces griseus, Streptomyces hygroscopicus, Streptomyces tsukubaensis, Streptomyces mycarofaciens, Streptomyces platensis, Streptomyces violaceoniger, Streptomyces thermotolerans, Streptomyces rimosus, Streptomyces peucetius, Streptomyces coelicolor, Streptomyces glaucescens, Streptomyces roseofulvus, Streptomyces cinnamonensis, Streptomyces curacoi, and Amycolatopsis mediterranei (see Hopwood, D. A. and Sherman, D. H., Annu. Rev. Genet., 24:37-66 (1990), incorporated herein by reference). Other examples of polyketide-producing microorganisms that produce polyketides naturally include various Actinomadura, Dactylosporangium and Nocardia strains.
The term "sugar biosynthesis genes" as used herein refers to nucleic acid sequences from organisms such as Streptomyces venezuelae that encode sugar biosynthesis enzymes and is intended to include sequences of DNA from other polyketide-producing microorganisms which are identical or analogous to those obtained from Streptomyces venezuelae.
The term "sugar biosynthesis enzymes" as used herein refers to polypeptides which are involved in the biosynthesis and/or attachment of polyketide-associated sugars and their derivatives and intermediates.
The term "polyketide-associated sugar" refers to a sugar that is known to attach to polyketides or that can be attached to polyketides by the processes described herein.
The term "sugar derivative" refers to a sugar which is naturally associated with a polyketide but which is altered relative to the unmodified or native state, including but not limited to, N-3-α-desdimethyl D-desosamine.
The term "sugar intermediate" refers to an intermediate compound produced in a sugar biosynthesis pathway.

As used herein, the term "derivative" means that a particular compound produced by a host cell of the invention or prepared in vitro using polypeptides encoded by the nucleic acid molecules of the invention, is modified so that it comprises other moieties, e.g., peptide or polypeptide molecules, such as antibodies or fragments thereof, nucleic acid molecules, sugars, lipids, fats, a detectable signal molecule such as a radioisotope, e.g., gamma emitters, small chemicals, metals, salts, synthetic polymers, e.g., polylactide and polyglycolide, surfactants and glycosaminoglycans, which are covalently or non-covalently attached or linked to the compound.
A "recombinant" host cell of the invention has a genome that has been manipulated in vitro so as to alter, e.g., decrease or disrupt, or alternatively, increase, the function or activity of at least one gene, e.g., in the pik biosynthetic gene cluster, of the invention.
It will be appreciated by those skilled in the art that each atom of the compounds of the invention having a chiral center may exist in and be isolated in optically active and racemic forms. Some compounds may exhibit polymorphism. It is to be understood that the present invention encompasses any racemic, optically active, polymorphic or stereoisomeric form, or mixtures thereof, of a compound of the invention, which possess the useful properties described herein, it being well known in the art how to prepare optically active forms (for example, by resolution of the racemic form by recrystallization techniques, by synthesis from optically active starting materials, by chiral synthesis, or by chromatographic separation using a chiral stationary phase) and how to determine activity using the standard tests described herein, or using other similar tests which are well known in the art.
The term "sequence homology" or "sequence identity" means the proportion of base matches between two nucleic acid sequences or the proportion of amino acid matches between two amino acid sequences. When sequence homology is expressed as a percentage, e.g., 50%, the percentage denotes the proportion of matches over the length of sequence that is compared to some other sequence. Gaps (in either of the two sequences) are permitted to maximize matching; gap lengths of 15 bases or less are usually used, 6 bases or less are preferred with 2 bases or less more preferred. When using oligonucleotides as probes, the sequence homology between the target nucleic acid and the oligonucleotide sequence is generally not less than 17 target base matches out of 20 possible oligonucleotide base pair matches (85%); preferably not less than 9 matches out of 10 possible base pair matches (90%), and more preferably not less than 19 matches out of 20 possible base pair matches (95%).
Two amino acid sequences are homologous if there is a partial or complete identity between their sequences. For example, 85% homology means that 85% of the amino acids are identical when the two sequences are aligned for maximum matching. Gaps (in either of the two sequences being matched) are allowed in maximizing matching; gap lengths of 5 or less are preferred with 2 or less being more preferred. Alternatively and preferably, two protein sequences (or polypeptide sequences derived from them of at least 30 amino acids in length) are homologous, as this term is used herein, if they have an alignment score of more than 5 (in standard deviation units) using the program ALIGN with the mutation data matrix and a gap penalty of 6 or greater. See Dayhoff, M. O., in Atlas of Protein Sequence and Structure, 1972, volume 5, National Biomedical Research Foundation, pp. 101-110, and Supplement 2 to this volume, pp. 1-10. The two sequences or parts thereof are more preferably homologous if their amino acids are greater than or equal to 50% identical when optimally aligned using the ALIGN program.
The following terms are used to describe the sequence relationships between two or more polynucleotides: "reference sequence", "comparison window", "sequence identity", "percentage of sequence identity", and "substantial identity". A "reference sequence" is a defined sequence used as a basis for a sequence comparison; a reference sequence may be a subset of a larger sequence, for example, as a segment of a full-length cDNA or gene sequence given in a sequence listing, or may comprise a complete cDNA or gene sequence. Generally, a reference sequence is at least 20 nucleotides in length, frequently at least 25 nucleotides in length, and often at least 50 nucleotides in length. Since two polynucleotides may each (1) comprise a sequence (i.e., a portion of the complete polynucleotide sequence) that is similar between the two polynucleotides, and (2) may further comprise a sequence that is divergent between the two polynucleotides, sequence comparisons between two (or more) polynucleotides are typically performed by comparing sequences of the two polynucleotides over a "comparison window" to identify and compare local regions of sequence similarity.
A "comparison window", as used herein, refers to a conceptual segment of at least 20 contiguous nucleotides and wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) of 20 percent or less as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. Optimal alignment of sequences for aligning a comparison window may be conducted by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2: 482, by the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48: 443, by the search for similarity method of Pearson and Lipman (1988) Proc. Natl. Acad. Sci. (U.S.A.) 85: 2444, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by inspection, and the best alignment (i.e., resulting in the highest percentage of homology over the comparison window) generated by the various methods is selected.
The term "sequence identity" means that two polynucleotide sequences are identical (i.e., on a nucleotide-by-nucleotide basis) over the window of comparison. The term "percentage of sequence identity" is calculated by comparing two optimally aligned sequences over the window of comparison, determining the number of positions at which the identical nucleic acid base (e.g., A, T, C, G, U, or I) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison (i.e., the window size), and multiplying the result by 100 to yield the percentage of sequence identity. The term "substantial identity" as used herein denotes a characteristic of a polynucleotide sequence, wherein the polynucleotide comprises a sequence that has at least 85 percent sequence identity, preferably at least 90 to 95 percent sequence identity, more usually at least 99 percent sequence identity as compared to a reference sequence over a comparison window of at least 20 nucleotide positions, frequently over a window of at least 20-50 nucleotides, wherein the percentage of sequence identity is calculated by comparing the reference sequence to the polynucleotide sequence which may include deletions or additions which total 20 percent or less of the reference sequence over the window of comparison.
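The calculation of "percentage of sequence identity" defined above can be sketched in Python as follows. The sketch assumes that the two sequences have already been optimally aligned to equal length by one of the cited programs and that gap positions are written as "-"; it does not perform the alignment. The function names `percent_identity` and `is_substantially_identical` and the example sequences are hypothetical. The 85 percent threshold and 20-position minimum window are those recited in the text.

```python
# Illustrative sketch only: percentage of sequence identity over a
# comparison window, for two sequences that are ALREADY aligned.
def percent_identity(reference, query, gap="-"):
    """Matched positions / window size * 100.

    A position counts as matched only when both sequences carry the same
    base there; a gap in either sequence is counted in the window size
    but never as a match.
    """
    if len(reference) != len(query):
        raise ValueError("aligned sequences must have equal length")
    if not reference:
        raise ValueError("comparison window is empty")
    matches = sum(
        1
        for r, q in zip(reference.upper(), query.upper())
        if r == q and r != gap
    )
    return 100.0 * matches / len(reference)


def is_substantially_identical(reference, query, threshold=85.0, min_window=20):
    """Apply the 85 percent / 20-position criterion recited in the text."""
    return (
        len(reference) >= min_window
        and percent_identity(reference, query) >= threshold
    )


if __name__ == "__main__":
    # Hypothetical 20-nucleotide window with three mismatched positions.
    ref = "ATCGATCGATCGATCGATCG"
    qry = "TTCGAACGATGGATCGATCG"
    print(percent_identity(ref, qry))
    print(is_substantially_identical(ref, qry))
```

The same function can be applied to aligned amino acid sequences, in which case the thresholds recited for polypeptides would be passed as `threshold`.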
As applied to polypeptides, the term "substantial identity" means that two peptide sequences, when optimally aligned, such as by the programs GAP or BESTFIT
using default gap weights, share at least about 80 percent sequence identity, preferably at least about 90 percent sequence identity, more preferably at least about 95 percent sequence identity, and most preferably at least about 99 percent sequence identity.
In accordance with the present invention there is provided an isolated and purified nucleic acid molecule which encodes the entire pathway for methymycin, pikromycin, neomethymycin, narbomycin, or a combination thereof, which includes sugar biosynthetic genes that are linked thereto. Desirably, the nucleic acid molecule is DNA
isolated from Streptomyces spp. The present invention further includes isolated and purified nucleic acid sequences which hybridize under standard or stringent conditions to the nucleic acid molecules of the invention. It is also understood that the invention encompasses isolated and purified polypeptides which may be encoded by the nucleic acid molecules of the invention.
The invention described herein can be used for the production of a diverse range of novel compounds including polyketides, e.g., antibiotics, and biodegradable PHA polymers through genetic redesign of DNA encoding a FAS or a PKS such as that found in Streptomyces spp. Thus, the isolation and characterization of this gene cluster allows for the selective production of antibiotics, the overproduction or under production of particular compounds, e.g., overproduction of certain antibiotics, and the production of novel compounds. For example, combinational biosynthetic-based modification of compounds may be accomplished by selective activation or disruption of specific genes within the cluster or incorporation of the genes into biased biosynthetic libraries which are assayed for a wide range of biological activities, to derive greater chemical diversity. A
further example includes the introduction of biosynthetic gene(s) into a particular host cell so as to result in the production of a novel compound due to the activity of the biosynthetic gene(s) on other metabolites, intermediates or components of the host cells.
Further, different PHA synthases can be tested for their ability to polymerize monomers produced by the recombinant PKS or PHA monomer synthase into a biodegradable polymer. The invention also provides a method by which various PHA
synthases can be tested for their specificity with respect to different monomer substrates.

The potential uses and applications of PHAs produced by PHA monomer synthases and PHA synthases include both medical and industrial applications. Medical applications of PHAs include surgical pins, sutures, staples, swabs, wound dressings, blood vessel replacements, bone replacements and plates, stimulation of bone growth by piezoelectric properties, and biodegradable carriers for long-term dosage of pharmaceuticals.
Industrial applications of PHAs include disposable items such as baby diapers, packaging containers, bottles, wrappings, bags, and films, and biodegradable carriers for long-term dosage of herbicides, fungicides, insecticides, or fertilizers.
In animals, the biosynthesis of fatty acids de novo from malonyl-CoA is catalyzed by FAS. For example, the rat FAS is a homodimer with a subunit structure consisting of 2505 amino acid residues having a molecular weight of 272,340 Da. Each subunit consists of seven catalytic activities in separate physical domains (Amy et al., Proc. Natl. Acad. Sci. USA, 86, 3114 (1989)). The physical location of six of the catalytic activities, ketoacyl synthase (KS), malonyl/acetyltransferase (M/AT), enoyl reductase (ER), ketoreductase (KR), acyl carrier protein (ACP), and thioesterase (TE), has been established by (1) the identification of the various active site residues within the overall amino acid sequence by isolation of catalytically active fragments from limited proteolytic digests of the whole FAS, (2) the identification of regions within the FAS that exhibit sequence similarity with various monofunctional proteins, (3) expression of DNA encoding an amino acid sequence with catalytic activity to produce recombinant proteins, and (4) the identification of DNA that does not encode catalytic activity, i.e., DNA encoding a linker region (Smith et al., Proc. Natl. Acad. Sci. USA, 73, 1184 (1976); Tsukamoto et al., J. Biol. Chem., 263, 16225 (1988); Rangan et al., J. Biol. Chem., 266, 19180 (1991)).
The seventh catalytic activity, dehydrase (DH), was identified as physically residing between AT and ER by an amino acid comparison of FAS with the amino acid sequences encoded by the three open reading frames of the eryA polyketide synthase (PKS) gene cluster of Saccharopolyspora erythraea. The three polypeptides that comprise this PKS are constructed from "modules" which resemble animal FAS, both in terms of their amino acid sequence and in the ordering of the constituent domains (Donadio et al., Gene, 111, 51 (1992); Bevitt et al., Eur. J. Biochem., 204, 39 (1992)).
One embodiment of the invention employs a FAS in which the DH is inactivated (FAS DH-). The FAS DH- employed in this embodiment of the invention is preferably a eukaryotic FAS DH- and, more preferably, a mammalian FAS DH-. The most preferred embodiment of the invention is a FAS where the active site in the DH has been inactivated by mutation. For example, Joshi et al. (J. Biol. Chem., 268, 22508 (1993)) changed the His878 residue in the rat FAS to an alanine residue by site-directed mutagenesis. In vitro studies showed that a FAS with this change (ratFAS206) produced 3-hydroxybutyrylCoA as a premature termination product from acetyl-CoA, malonyl-CoA and NADPH.
As shown below, a FAS DH- effectively replaces the β-ketothiolase and acetoacetyl-CoA reductase activities of the natural pathway by producing D(-)-3-hydroxybutyrate as a premature termination product, rather than the usual 16-carbon product, palmitic acid. This premature termination product can then be incorporated into PHB by a PHB
synthase (See Example 2).
Another embodiment of the invention employs a recombinant Streptomyces spp. PKS to produce a variety of β-hydroxyCoA esters that can serve as monomers for a PHA synthase.
One example of a DNA encoding a Type I PKS is the eryA gene cluster, which governs the synthesis of erythromycin aglycone deoxyerythronolide B (DEB). The gene cluster encodes six repeated units, termed modules or synthase units (SUs). Each module or SU, which comprises a series of putative FAS-like activities, is responsible for one of the six elongation cycles required for DEB formation. Thus, the processive synthesis of asymmetric acyl chains found in complex polyketides is accomplished through the use of a programmed protein template, where the nature of the chemical reactions occurring at each point is determined by the specificities in each SU.
Two other Type I PKS are encoded by the tyl (tylosin) (Figure 4) and met (methymycin) (Figure 5) gene clusters. The macrolide multifunctional synthases encoded by tyl and met provide a greater degree of metabolic diversity than that found in the eryA gene cluster. The PKSs encoded by the eryA gene cluster only catalyze chain elongation with methylmalonylCoA, as opposed to tyl and met PKSs, which catalyze chain elongation with malonylCoA, methylmalonylCoA and ethylmalonylCoA. Specifically, the tyl PKS includes two malonylCoA extender units and one ethylmalonylCoA extender unit, and the met PKS includes one malonylCoA extender unit. Thus, a preferred embodiment of the invention includes, but is not limited to, replacing catalytic activities encoded in met PKS open reading frame 1 (ORF1) to provide a DNA encoding a protein that possesses the required keto group processing capacity and short-chain acylCoA ester starter and extender unit specificity necessary to provide a saturated β-hydroxyhexanoylCoA or unsaturated β-hydroxyhexenoylCoA monomer.

In order to manipulate the catalytic specificities within each module, DNA
encoding a catalytic activity must remain undisturbed. To identify the amino acid sequences between the amino acid sequences with catalytic activity, the "linker regions," amino acid sequences of related modules, preferably those encoded by more than one gene cluster, are compared.
Linker regions are amino acid sequences which are less well conserved than amino acid sequences with catalytic activity (Witkowski et al., Eur. J. Biochem., 198, 571 (1991)).
In an alternative embodiment of the invention, to provide a DNA encoding a Type I PKS module with a TE and lacking a functional DH, a DNA encoding a module F, containing KS, MT, KR, ACP, and TE catalytic activities, is introduced at the 3' end of a DNA encoding a first module (Figure 6). Module F introduces the final (R)-3-hydroxyl acyl group at the final step of PHA monomer synthesis, as a result of the presence of a TE domain. DNA encoding a module F is not present in the eryA PKS gene cluster (Donadio et al., supra, 1991).
A DNA encoding a recombinant monomer synthase is inserted into an expression vector. The expression vector employed varies depending on the host cell to be transformed with the expression vector. That is, vectors are employed with transcription, translation and/or post-translational signals, such as targeting signals, necessary for efficient expression of the genes in various host cells into which the vectors are introduced. Such vectors are constructed and transformed into host cells by methods well known in the art.
See Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor (1989). Preferred host cells for the vectors of the invention include insect, bacterial, and plant cells. Preferred insect cells include Spodoptera frugiperda cells such as Sf21, and Trichoplusia ni cells. Preferred bacterial cells include Escherichia coli, Streptomyces and Pseudomonas.
Preferred plant cells include monocot and dicot cells, such as maize, rice, wheat, tobacco, legumes, carrot, squash, canola, soybean, potato, and the like.
Moreover, the appropriate subcellular compartment in which to locate the enzyme in eukaryotic cells must be considered when constructing eukaryotic expression vectors. Two factors are important: the site of production of the acetyl-CoA substrate, and the available space for storage of the PHA polymer. To direct the enzyme to a particular subcellular location, targeting sequences may be added to the sequences encoding the recombinant molecules.
The baculovirus system is particularly amenable to the introduction of DNA
encoding a recombinant FAS or a PKS monomer synthase because an increasing variety of transfer plasmids are becoming available which can accommodate a large insert, and the virus can be propagated to high titers. Moreover, insect cells are adapted readily to suspension culture, facilitating relatively large-scale recombinant protein production. Further, recombinant proteins tend to be produced exclusively as soluble proteins in insect cells, thus, obviating the need for refolding, a task that might be particularly daunting in the case of a large multifunctional protein. The Sf21/baculovirus system has routinely expressed milligram quantities of catalytically active recombinant fatty acid synthase. Finally, the baculovirus/insect cell system provides the ability to construct and analyze different synthase proteins for the ability to polymerize monomers into unique biodegradable polymers.
A further embodiment of the invention is the introduction of at least one DNA
encoding a PHA synthase and a DNA encoding a PHA monomer synthase into a host cell.
Such synthases include, but are not limited to, A. eutrophus 3-hydroxy, 4-hydroxy, and 5-hydroxy alkanoate synthases, Rhodococcus ruber C3-C5 hydroxyalkanoate synthases, Pseudomonas oleovorans C6-C14 hydroxyalkanoate synthases, P. putida C6-C14 hydroxyalkanoate synthases, P. aeruginosa C5-C10 hydroxyalkanoate synthases, P. resinovorans C4-C10 hydroxyalkanoate synthases, Rhodospirillum rubrum C4-C7 hydroxyalkanoate synthases, R. gelatinosus C4-C7 hydroxyalkanoate synthases, Thiocapsa pfennigii C4-C8 hydroxyalkanoate synthases, and Bacillus megaterium C4-C5 hydroxyalkanoate synthases.
The introduction of DNA(s) encoding more than one PHA synthase may be necessary to produce a particular PHA polymer due to the specificities exhibited by different PHA synthases. As multifunctional proteins are altered to produce unusual monomeric structures, synthase specificity may be problematic for particular substrates. Although the A. eutrophus PHB synthase utilizes only C4 and C5 compounds as substrates, it appears to be a good prototype synthase for initial studies since it is known to be capable of producing copolymers of 3-hydroxybutyrate and 4-hydroxybutyrate (Kunioka et al., Macromolecules, 22, 694 (1989)) as well as copolymers of 3-hydroxyvalerate, 3-hydroxybutyrate, and 5-hydroxyvalerate (Doi et al., Macromolecules, 19, 2860 (1986)). Other synthases, especially those of Pseudomonas aeruginosa (Timm et al., Eur. J. Biochem., 209, 15 (1992)) and Rhodococcus ruber (Pieper et al., FEMS Microbiol. Lett., 96, 73 (1992)), can also be employed in the practice of the invention. Synthase specificity may be alterable through molecular biological methods.

In yet another embodiment of the invention, a DNA encoding a FAS and a PHA
synthase can be introduced into a single expression vector, obviating the need to introduce the genes into a host cell individually.
A further embodiment of the invention is the generation of a DNA encoding a recombinant multifunctional protein, which comprises a FAS, of either eukaryotic or prokaryotic origin, and a PKS module F. Module F will carry out the final chain extension to include two additional carbons and the reduction of the β-keto group, which results in a (R)-3-hydroxy acyl CoA moiety.
To produce this recombinant protein, DNA encoding the FAS TE is replaced with a DNA encoding a linker region which is normally found in the ACP-KS interdomain region of bimodular ORFs. DNA encoding a module F is then inserted 3' to the DNA
encoding the linker region. Different linker regions, such as those described below which vary in length and amino acid composition, can be tested to determine which linker most efficiently mediates or allows the required transfer of the nascent saturated fatty acid intermediate to module F for the final chain elongation and keto reduction steps. The resulting DNA
encoding the protein can then be tested for expression of long-chain β-hydroxy fatty acids in insect cells, such as Sf21 cells, or Streptomyces, or Pseudomonas. The expected 3-hydroxy C-18 fatty acid can serve as a potential substrate for PHA synthases which are able to accept long-chain alkyl groups. A preferred embodiment of the invention is a FAS that has a chain length specificity between 4-22 carbons.
Examples of linker regions that can be employed in this embodiment of the invention include, but are not limited to, the ACP-KS linker regions encoded by the tyl ORF1 (ACP1-KS2; ACP2-KS3) and ORF3 (ACP5-KS6), and eryA ORF1 (ACP1-KS1; ACP2-KS2), ORF2 (ACP3-KS4) and ORF3 (ACP5-KS6).
This approach can also be used to produce shorter chain fatty acid groups by limiting the ability of the FAS unit to generate long-chain fatty acids. Mutagenesis of DNA encoding various FAS catalytic activities, starting with the KS, may result in the synthesis of short-chain (R)-3-hydroxy fatty acids.
The PHA polymers are then recovered from the biomass. Large-scale solvent extraction can be used, but is expensive. An alternative method involving heat shock with subsequent enzymatic and detergent digestive processes is also available (Byrom, Trends Biotechnol., 5, 246 (1987); Holmes, In: Developments in Crystalline Polymers, D. C. Bassett (ed.), pp. 1-65 (1988)). PHB and other PHAs are readily extracted from microorganisms by chlorinated hydrocarbons. Refluxing with chloroform has been extensively used; the resulting solution is filtered to remove debris and concentrated, and the polymer is precipitated with methanol or ethanol, leaving low-molecular-weight lipids in solution. Longer side-chain PHAs show a less restricted solubility than PHB and are, for example, soluble in acetone. Other strategies adopted include the use of ethylene carbonate and propylene carbonate as disclosed by Lafferty et al. (Chem. Rundschau, 30, 14 (1977)) to extract PHB from biomass. Scandola et al. (Int. J. Biol. Macromol., 10, 373 (1988)) reported that 1 M HCl-chloroform extraction of Rhizobium meliloti yielded PHB of MW = 6 × 10^4 compared with 1.4 × 10^6 when acetone was used.
Methods are well known in the art for the determination of the PHB or PHA content of microorganisms, the composition of PHAs, and the distribution of the monomer units in the polymer. Gas chromatography and high-pressure liquid chromatography are widely used for quantitative PHB analysis. See Anderson et al., Microbiol. Rev., 54, 450 (1990) for a review of such methods. NMR techniques can also be used to determine polymer composition, and the distribution of monomer units.
The present invention also contemplates nucleic acid sequences which hybridize under stringent hybridization conditions to the nucleic acid sequences set forth herein.
Stringent hybridization conditions are well known in the art and define a degree of sequence identity greater than about 80 to about 90%. Thus, nucleic acid sequences encoding variant polypeptides (Figure 38), or nucleic acid sequences having conservative (silent) nucleotide substitutions (Figure 37), are within the scope of the invention. Preferably, variant polypeptides encoded by the nucleic acid sequences of the invention are biologically active.
The present invention also contemplates naturally occurring allelic variations and mutations of the nucleic acid sequences described herein.
As is well known in the art, because of the degeneracy of the genetic code, there are numerous other DNA and RNA molecules that can code for the same polypeptides as those encoded by the exemplified biosynthetic genes and fragments thereof. The present invention, therefore, contemplates those other DNA and RNA molecules which, on expression, encode the polypeptides of, for example, portions of SEQ ID NO:4 or SEQ ID NO:6.
Having identified the amino acid residue sequence encoded by a sugar biosynthetic or macrolide biosynthetic gene, and with knowledge of all triplet codons for each particular amino acid residue, it is possible to describe all such encoding RNA and DNA sequences.
DNA and RNA molecules other than those specifically disclosed herein, which molecules are characterized simply by a change in a codon for a particular amino acid, are within the scope of this invention.
The 20 common amino acids and their representative abbreviations, symbols and codons are well known in the art (see, for example, Molecular Biology of the Cell, Second Edition, B. Alberts et al., Garland Publishing Inc., New York and London, 1989). As is also well known in the art, codons constitute triplet sequences of nucleotides in mRNA molecules and as such, are characterized by the base uracil (U) in place of base thymidine (T) which is present in DNA molecules. A simple change in a codon for the same amino acid residue within a polynucleotide will not change the structure of the encoded polypeptide. By way of example, it can be seen from SEQ ID NO:6 that a TCT codon for serine exists at nucleotide positions 1735-1737. However, it can also be seen from that same sequence that serine can be encoded by a TCA codon (see, e.g., nucleotide positions 1738-1740) and a TCC codon (see, e.g., nucleotide positions 1874-1876). Substitution of the latter codons for serine with the TCT codon for serine or vice versa, does not substantially alter the DNA sequence of SEQ ID NO:6 and results in production of the same polypeptide. In a similar manner, substitutions of the recited codons with other equivalent codons can be made without departing from the scope of the present invention.
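By way of illustration only, the effect of a synonymous codon substitution may be sketched in Python using the standard genetic code. The function names and the 12-nucleotide toy sequence used in the usage note are hypothetical and are not taken from SEQ ID NO:6; the sketch merely shows that exchanging one serine codon for another leaves the encoded polypeptide unchanged.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def normalize(seq: str) -> str:
    """Upper-case the sequence and read RNA (U) as DNA (T)."""
    return seq.upper().replace("U", "T")


def translate(seq: str) -> str:
    """Translate an in-frame coding sequence into one-letter amino acids."""
    seq = normalize(seq)
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def synonymous_codons(amino_acid: str) -> list:
    """All codons of the standard code that encode the given amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def swap_codon(seq: str, codon_index: int, new_codon: str) -> str:
    """Replace the codon at codon_index (0-based) with a synonymous codon."""
    seq, new_codon = normalize(seq), normalize(new_codon)
    start = 3 * codon_index
    old_codon = seq[start:start + 3]
    if CODON_TABLE[old_codon] != CODON_TABLE[new_codon]:
        raise ValueError("replacement codon is not synonymous")
    return seq[:start] + new_codon + seq[start + 3:]
```

For instance, for the hypothetical sequence ATGTCTTCAGGC, replacing the third codon (TCA) with TCT via `swap_codon(seq, 2, "TCT")` yields a different DNA sequence for which `translate` returns the same amino acid sequence as for the original.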
A nucleic acid molecule, segment or sequence of the present invention can also be an RNA molecule, segment or sequence. An RNA molecule contemplated by the present invention corresponds to, is complementary to or hybridizes under stringent conditions to any of the DNA sequences set forth herein. Exemplary and preferred RNA molecules are mRNA
molecules that encode sugar biosynthetic or macrolide biosynthetic enzymes of this invention.
Mutations can be made to the native nucleic acid sequences of the invention and such mutants used in place of the native sequence, so long as the mutants are able to function with other sequences to collectively catalyze the synthesis of an identifiable polyketide or macrolide. Such mutations can be made to the native sequences using conventional techniques such as by preparing synthetic oligonucleotides including the mutations and inserting the mutated sequence into the gene using restriction endonuclease digestion. (See, e.g., Kunkel, T. A. Proc. Natl. Acad. Sci. USA (1985) 82:448; Geisselsoder et al. BioTechniques (1987) 5:786.) Alternatively, the mutations can be effected using a mismatched primer (generally 10-20 nucleotides in length) which hybridizes to the native nucleotide sequence (generally cDNA corresponding to the RNA sequence), at a temperature below the melting temperature of the mismatched duplex. The primer can be made specific by keeping primer length and base composition within relatively narrow limits and by keeping the mutant base centrally located. Zoller and Smith, Methods Enzymol. (1983) 100:468. Primer extension is effected using DNA polymerase, the product cloned and clones containing the mutated DNA, derived by segregation of the primer extended strand, selected.
Selection can be accomplished using the mutant primer as a hybridization probe. The technique is also applicable for generating multiple point mutations. See, e.g., Dalbie-McFarland et al., Proc. Natl. Acad. Sci. USA (1982) 79:6409. PCR mutagenesis will also find use for effecting the desired mutations.
Random mutagenesis of the nucleotide sequence can be accomplished by several different techniques known in the art, such as by altering sequences within restriction endonuclease sites, inserting an oligonucleotide linker randomly into a plasmid, by irradiation with X-rays or ultraviolet light, by incorporating incorrect nucleotides during in vitro DNA synthesis, by error-prone PCR mutagenesis, by preparing synthetic mutants or by damaging plasmid DNA in vitro with chemicals. Chemical mutagens include, for example, sodium bisulfite, nitrous acid, hydroxylamine, agents which damage or remove bases thereby preventing normal base-pairing such as hydrazine or formic acid, analogues of nucleotide precursors such as nitrosoguanidine, 5-bromouracil, 2-aminopurine, or acridine intercalating agents such as proflavine, acriflavine, quinacrine, and the like. Generally, plasmid DNA or DNA fragments are treated with chemicals, transformed into E. coli and propagated as a pool or library of mutant plasmids.
Large populations of random enzyme variants can be constructed in vivo using "recombination-enhanced mutagenesis." This method employs two or more pools of, for example, 10^6 mutants each of the wild-type encoding nucleotide sequence that are generated using any convenient mutagenesis technique and then inserted into cloning vectors.
The gene sequences can be inserted into one or more expression vectors, using methods known to those of skill in the art. Expression vectors may include control sequences operably linked to the desired genes. Suitable expression systems for use with the present invention include systems which function in eukaryotic and prokaryotic host cells.
Prokaryotic systems are preferred, and in particular, systems compatible with Streptomyces spp. are of particular interest. Control elements for use in such systems include promoters, optionally containing operator sequences, and ribosome binding sites.
Particularly useful promoters include control sequences derived from the gene clusters of the invention.
However, other bacterial promoters, such as those derived from sugar metabolizing enzymes, such as galactose, lactose (lac) and maltose, will also find use in the expression cassettes encoding desosamine. Preferred promoters are Streptomyces promoters, including but not limited to the ermE*, pikA and tipA promoters. Additional examples include promoter sequences derived from biosynthetic enzymes such as tryptophan (trp), the β-lactamase (bla) promoter system, bacteriophage lambda PL, and T5. In addition, synthetic promoters, such as the tac promoter (U.S. Pat. No. 4,551,433), which do not occur in nature, also function in bacterial host cells.
Other regulatory sequences may also be desirable which allow for regulation of expression of the genes relative to the growth of the host cell. Regulatory sequences are known to those of skill in the art, and examples include those which cause the expression of a gene to be turned on or off in response to a chemical or physical stimulus, including the presence of a regulatory compound. Other types of regulatory elements may also be present in the vector, for example, enhancer sequences.
Selectable markers can also be included in the recombinant expression vectors. A variety of markers are known which are useful in selecting for transformed cell lines and generally comprise a gene whose expression confers a selectable phenotype on transformed cells when the cells are grown in an appropriate selective medium. Such markers include, for example, genes which confer antibiotic resistance or sensitivity to the plasmid. Alternatively, several polyketides are naturally colored and this characteristic provides a built-in marker for selecting cells successfully transformed by the present constructs.
The various subunits of interest can be cloned into one or more recombinant vectors as individual cassettes, with separate control elements, or under the control of, e.g., a single promoter. The subunits can include flanking restriction sites to allow for the easy deletion and insertion of other subunits so that hybrid PKSs can be generated. The design of such unique restriction sites is known to those of skill in the art and can be accomplished using the techniques described above, such as site-directed mutagenesis and PCR.
For sequences generated by random mutagenesis, the choice of vector depends on the pool of mutant sequences, i.e., donor or recipient, with which they are to be employed.
Furthermore, the choice of vector determines the host cell to be employed in subsequent steps of the claimed method. Any transducible cloning vector can be used as a cloning vector for the donor pool of mutants. It is preferred, however, that phagemids, cosmids, or similar cloning vectors be used for cloning the donor pool of mutant encoding nucleotide sequences into the host cell. Phagemids and cosmids, for example, are advantageous vectors due to the ability to insert and stably propagate therein larger fragments of DNA than in M13 phage and λ phage, respectively. Phagemids which will find use in this method generally include hybrids between plasmids and filamentous phage cloning vehicles. Cosmids which will find use in this method generally include λ phage-based vectors into which cos sites have been inserted. Recipient pool cloning vectors can be any suitable plasmid. The cloning vectors into which pools of mutants are inserted may be identical or may be constructed to harbor and express different genetic markers (see, e.g., Sambrook et al., supra). The utility of employing such vectors having different marker genes may be exploited to facilitate a determination of successful transduction.
Thus, for example, the cloning vector employed may be an E. coli/Streptomyces shuttle vector (see, for example, U.S. Patent Nos. 4,416,994, 4,343,906, 4,477,571, 4,362,816, and 4,340,674), a cosmid, a plasmid, an artificial bacterial chromosome (see, e.g., Zhang and Wing, Plant Mol. Biol., 35, 115 (1997); Schalkwyk et al., Curr. Opin. Biotech., 6, 37 (1995); and Monaco and Lavin, Trends Biotechnol., 12, 280 (1994)), or a phagemid, and the host cell may be a bacterial cell such as E. coli, Penicillium patulum, and Streptomyces spp. such as S. lividans, S. venezuelae, or S. lavendulae, or a eukaryotic cell such as fungi, yeast or a plant cell, e.g., monocot and dicot cells, preferably cells that are regenerable.
Moreover, recombinant polypeptides having a particular activity may be prepared via "gene-shuffling". See, for example, Crameri et al., Nature, 391, 288 (1998); Patten et al., Curr. Opin. Biotech., 8, 724 (1997); U.S. Patent Nos. 5,837,458, 5,834,252, 5,830,727, 5,811,238, 5,605,793.
For phagemids, upon infection of the host cell which contains a phagemid, single-stranded phagemid DNA is produced, packaged and extruded from the cell in the form of a transducing phage in a manner similar to other phage vectors. Thus, clonal amplification of mutant encoding nucleotide sequences carried by phagemids is accomplished by propagating the phagemids in a suitable host cell. Following clonal amplification, the cloned donor pool of mutants is infected with a helper phage to obtain a mixture of phage particles containing either the helper phage genome or phagemids carrying mutant alleles of the wild-type encoding nucleotide sequence.

Infection, or transfection, of host cells with helper phage is generally accomplished by methods well known in the art (see, e.g., Sambrook et al., supra; and Russell et al. (1986) Gene 45:333-338).
The helper phage may be any phage which can be used in combination with the cloning phage to produce an infective transducing phage. For example, if the cloning vector is a cosmid, the helper phage will necessarily be a λ phage. Preferably, the cloning vector is a phagemid and the helper phage is a filamentous phage, and preferably phage M13.
If desired, after infecting the phagemid with helper phage and obtaining a mixture of phage particles, the transducing phage can be separated from helper phage based on size difference (Barnes et al. (1983) Methods Enzymol. 101:98-122), or other similarly effective technique.
The entire spectrum of cloned donor mutations can now be transduced into clonally amplified recipient cells into which has been transduced or transformed a pool of mutant encoding nucleotide sequences. Recipient cells which may be employed in the method disclosed and claimed herein may be, for example, E. coli, or other bacterial expression systems which are not recombination deficient. A recombination deficient cell is a cell in which the frequency of recombinatorial events is greatly reduced, such as rec mutants of E. coli (see, Clark et al. (1965) Proc. Natl. Acad. Sci. USA 53:451-459).
These transductants can now be selected for the desired expressed protein property or characteristic and, if necessary or desirable, amplified. Optionally, if the phagemids into which each pool of mutants is cloned are constructed to express different genetic markers, as described above, transductants may be selected by way of their expression of both donor and recipient plasmid markers.
The recombinants generated by the above-described methods can then be subjected to selection or screening by any appropriate method, for example, enzymatic or other biological activity.
The above cycle of amplification, infection, transduction, and recombination may be repeated any number of times using additional donor pools cloned on phagemids.
As above, the phagemids into which each pool of mutants is cloned may be constructed to express a different marker gene. Each cycle could increase the number of distinct mutants by up to a factor of 10^6. Thus, if the probability of occurrence of an inter-allelic recombination event in any individual cell is f (a parameter that is actually a function of the distance between the recombining mutations), the transduced culture from two pools of 10^6 allelic mutants will express up to 10^12 distinct mutants in a population of 10^12/f cells.
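By way of illustration only, the arithmetic of the preceding paragraph may be sketched in Python. The function names are hypothetical, and the value of f used in the usage note is an arbitrary assumption, since no numerical value of f is given herein.

```python
import math


def max_distinct_recombinants(pool_sizes):
    """Upper bound on distinct mutants: the product of the pool sizes."""
    return math.prod(pool_sizes)


def cells_required(pool_sizes, f):
    """Population size needed to express that upper bound when an
    inter-allelic recombination event occurs in a fraction f of cells."""
    if not 0 < f <= 1:
        raise ValueError("f must be a probability greater than zero")
    return max_distinct_recombinants(pool_sizes) / f
```

For two pools of 10^6 mutants each, `max_distinct_recombinants([10**6, 10**6])` gives 10^12. With an assumed f of 0.001, `cells_required` gives approximately 10^15 cells, and each further donor pool of 10^6 mutants multiplies the upper bound by a further factor of 10^6.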
The invention will be further described by the following non-limiting examples.
Materials and Methods
Materials. Sodium R-(-)-3-hydroxybutyrate, coenzyme-A, ethylchloroformate, pyridine and diethyl ether were purchased from Sigma Chemical Co. Amberlite IR-120 was purchased from Mallinckrodt Inc. 6-O-(N-Heptylcarbamoyl)methyl α-D-glucopyranoside (Hecameg) was obtained from Vegatec (Villejuif, France). Two-piece spectrophotometer cells with pathlengths of 0.1 cm (#20/0-Q-1) and 0.01 cm (#20/0-Q-0.1) were obtained from Starna Cells Inc. (Atascadero, CA). Rabbit anti-A. eutrophus PHA synthase antibody was a gracious gift from Dr. F. Srienc and S. Stoup (Biological Process Technology Institute, University of Minnesota). Sf21 cells and T. ni cells were kindly provided by Greg Franzen (R&D Systems, Minneapolis, MN) and Stephen Harsch (Department of Veterinary Pathobiology, University of Minnesota), respectively.
Plasmid pFAS206 and a recombinant baculoviral clone encoding FAS206 (Joshi et al., J. Biol. Chem., 268, 22508 (1993)) were generous gifts of A. Joshi and S. Smith. Plasmid pAet41 (Peoples et al., J. Biol. Chem., 264, 15298 (1989)), the source of the A. eutrophus PHB synthase, was obtained from A. Sinskey. Baculovirus transfer vector, pBacPAK9, and linearized baculoviral DNA, were obtained from Clontech Inc. (Palo Alto, CA).
Restriction enzymes, T4 DNA ligase, E. coli DH5α competent cells, molecular weight standards, lipofectin reagent, Grace's insect cell medium, fetal bovine serum (FBS), and antibiotic/antimycotic reagent were obtained from GIBCO-BRL (Grand Island, NY). Tissue culture dishes were obtained from Corning Inc. Spinner flasks were obtained from Bellco Glass Inc. Seaplaque agarose GTG was obtained from FMC Bioproducts Inc.
Preparation of R-3HBCoA. R-(-)-3-HBCoA was prepared by the mixed anhydride method described by Haywood et al., FEMS Microbiol. Lett., 57, 1 (1989). 60 mg (0.58 mmol) of R-(-)-3-hydroxybutyric acid was freeze dried and added to a solution of 72 mg of pyridine in 10 ml diethyl ether at 0°C. Ethylchloroformate (100 mg) was added, and the mixture was allowed to stand at 4°C for 60 minutes. Insoluble pyridine hydrochloride was removed by centrifugation. The resulting anhydride was added, dropwise with mixing, to a solution of 100 mg coenzyme-A (0.13 mmol) in 4 ml 0.2 M potassium bicarbonate, pH 8.0 at 0°C. The reaction was monitored by the nitroprusside test of Stadtman, Methods Enzymol., 3, 931 (1957), to ensure sufficient anhydride was added to esterify all the coenzyme-A. The concentration of R-3-HBCoA was determined by measuring the absorbance at 260 nm (ε = 16.8 mM^-1 cm^-1; 18).
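The concentration determination at 260 nm is a direct application of the Beer-Lambert law, c = A / (ε · l). A minimal sketch follows, taking the extinction coefficient as 16.8 mM^-1 cm^-1; the absorbance reading, pathlength, and dilution factor in the example are hypothetical values, not data from this work.

```python
EPSILON_260_MM = 16.8  # mM^-1 cm^-1, extinction coefficient quoted in the text

def hbcoa_concentration_mM(a260, path_cm=1.0, dilution_factor=1.0):
    """Return the R-3-HBCoA concentration (mM) of the undiluted stock."""
    if path_cm <= 0:
        raise ValueError("pathlength must be positive")
    # Beer-Lambert: concentration in the cuvette, scaled back by the dilution.
    return a260 / (EPSILON_260_MM * path_cm) * dilution_factor

# Hypothetical reading: A260 = 0.42 for a 100-fold dilution in a 1 cm cell.
stock_mM = hbcoa_concentration_mM(0.42, path_cm=1.0, dilution_factor=100)
print(round(stock_mM, 3))
```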
Construction of pBP-phbC. The phbC gene (approximately 1.8 kb) was excised from pAet41 (Peoples et al., J. Biol. Chem., 264, 15293 (1989)) by digestion with BstBI and StuI, purified as described by Williams et al. (Gene, 109, 445 (1991)), and ligated to pBacPAK9 digested with BstBI and StuI. This resulted in pBP-phbC, the baculovirus transfer vector used in formation of recombinant baculovirus particles carrying phbC.
Large-scale expression of PHA synthase. A 1 L culture of T. ni cells (1.2 x 10^6 cells/ml) in logarithmic growth was infected by the addition of 50 ml recombinant viral stock solution (2.5 x 10^8 pfu/ml) resulting in a multiplicity of infection (MOI) of 10.
This infected culture was split between two Bellco spinners (350 ml/500 ml spinner, 700 ml/1 L spinner) to facilitate oxygenation of the culture. These cultures were incubated at 28°C and stirred at 60 rpm for 60 hours. Infected cells were harvested by centrifugation at 1000 x g for 10 minutes at 4°C. Cells were flash frozen in liquid N2 and stored in 4 equal aliquots at -80°C until purification.
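The stated multiplicity of infection can be checked from the culture and viral stock figures given above. The sketch below is illustrative; the function name is not from the source.

```python
def multiplicity_of_infection(virus_ml, titer_pfu_per_ml, culture_ml, cells_per_ml):
    """MOI = total plaque-forming units added / total cells infected."""
    total_pfu = virus_ml * titer_pfu_per_ml
    total_cells = culture_ml * cells_per_ml
    return total_pfu / total_cells

# 50 ml of 2.5 x 10^8 pfu/ml stock added to 1 L of cells at 1.2 x 10^6 cells/ml.
moi = multiplicity_of_infection(50, 2.5e8, 1000, 1.2e6)
print(round(moi, 1))
```

The result is slightly above 10, consistent with the rounded MOI of 10 reported in the text.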
Insect cell maintenance and recombinant baculovirus formation. Sf21 cells were maintained at 26-28°C in Grace's insect cell medium supplemented with 10% FBS, 1.0% pluronic F68, and 1.0% antibiotic/antimycotic (GIBCO-BRL). Cells were typically maintained in suspension at 0.2-2.0 x 10^6/ml in 60 ml total culture volume in 100 ml spinner flasks at 55-65 rpm. Cell viability during the culture period was typically 95-100%. The procedures for use of the transfer vector and baculovirus were essentially those described by the manufacturer (Clontech, Inc.). Purified pBP-phbC and linearized baculovirus DNA were used for cotransfection of Sf21 cells using the liposome-mediated method (Felgner et al., Proc. Natl. Acad. Sci. USA, 84, 7413 (1987)) utilizing Lipofectin (GIBCO-BRL). Four days later cotransfection supernatants were utilized for plaque purification. Recombinant viral clones were purified from plaque assay plates containing 1.5% Seaplaque GTG after 5-7 days at 28°C. Recombinant viral clone stocks were then amplified in T25-flask cultures (4 ml, 3 x 10^6/ml on day 0) for 4 days; infected cells were identified by their morphology and size and then screened by SDS/PAGE using 10% polyacrylamide gels (Laemmli, Nature, 227 (1970)) for production of PHA synthase.

Purification of PHA synthase from BTI-TN-5B1-4 T. ni cells. Purification of PHA synthase was performed according to the method of Gerngross et al., Biochemistry, 33 (1994) with the following alterations. One aliquot (110 mg protein) of frozen cells was thawed on ice and resuspended in 10 mM KPi (pH 7.2), 5% glycerol, and 0.05% Hecameg (Buffer A) containing the following protease inhibitors at the indicated final concentrations: benzamidine (2 mM), phenylmethylsulfonyl fluoride (PMSF, 0.4 mM), pepstatin (2 µg/ml), leupeptin (2.5 µg/ml), and Nα-p-tosyl-L-lysine chloromethyl ketone (TLCK, 2 mM). EDTA was omitted at this stage due to its incompatibility with hydroxylapatite (HA). This mixture was homogenized with three series of 10 strokes each in two Thomas homogenizers while partially submerged in an ice bath and then sonicated for 2 minutes in a Branson Sonifier 250 at 30% cycle, 30% power while on ice. All subsequent procedures were carried out at 4°C.
The lysate was immediately centrifuged at 100000 x g in a Beckman 50.2Ti rotor for 80 minutes, and the resulting supernatant (10.5 ml, 47 mg) was immediately filtered through a 0.45 µm Uniflow filter (Schleicher and Schuell Inc., Keene, N.H.) to remove any remaining insoluble matter. Aliquots of the soluble fraction (1.5 ml, 7 mg) were loaded onto a 5 ml BioRad Econo-Pac HTP column that had been equilibrated with Buffer A (+ protease inhibitor mix) attached to a BioRad Econo-system, and the column was washed with 30 ml Buffer A. All chromatographic steps were carried out at a flow rate of 0.8 ml/minute. PHA synthase was eluted from the HA column with a 32 x 32 ml linear gradient from 10 to 300 mM KPi.
Fraction collection tubes were prepared by addition of 30 µl of 100 mM EDTA to provide a metalloprotease inhibitor at 1 mM immediately after HA chromatography. PHA synthase was eluted in a broad peak between 110-180 mM KPi. Fractions (3 ml) containing significant PHA synthase activity were pooled and stored at 0°C until the entire soluble fraction had been run through the chromatographic process. Pooled fractions then were concentrated at 4°C by use of a Centriprep-30 concentrator (Amicon) to 3.8 mg/ml. Aliquots (0.5 ml) were either flash frozen and stored in liquid N2, or glycerol was added to a final concentration of 50% and samples (1.9 mg/ml) were stored at -20°C.
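The EDTA pre-load of the fraction tubes follows the usual dilution relation C1·V1 = C2·V2. The sketch below assumes 30 µl of 100 mM stock per 3 ml fraction, the reading under which the stated 1 mM final concentration is obtained; the function name is illustrative.

```python
def final_concentration_mM(stock_mM, stock_ul, fraction_ml):
    """Final concentration after a small stock volume is diluted by a collected fraction."""
    stock_ml = stock_ul / 1000.0
    total_ml = fraction_ml + stock_ml        # the stock volume itself adds to the total
    return stock_mM * stock_ml / total_ml

# 30 ul of 100 mM EDTA in a tube that then receives a 3 ml fraction.
edta_mM = final_concentration_mM(100, 30, 3.0)
print(round(edta_mM, 2))
```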
Western analysis. Samples of T. ni cells were fractionated by SDS-PAGE on 10% polyacrylamide gels, and the proteins then were transferred to 0.2 µm nitrocellulose membranes using a BioRad Trans-Blot SD Semi-Dry electrophoretic transfer cell according to the manufacturer's instructions. Proteins were transferred for 1 hour at 15 V. The membrane was rinsed with doubly distilled H2O, dried, and treated with phosphate-buffered saline (PBS) containing 0.05% Tween-20 (PBS-Tween) and 3% nonfat dry milk to block non-specific binding sites. Primary antibody (rabbit anti-PHA synthase) was applied in fresh blocking solution and incubated at 25°C for 2 hours. Membranes were then washed four times for 10 minutes with PBS-Tween followed by the addition of horseradish peroxidase-conjugated goat-anti-rabbit antibody (Boehringer-Mannheim) diluted 10,000X in fresh blocking solution and incubated at 25°C for 1 hour. Membranes were washed finally in three changes (10 minutes) of PBS, and the immobilized peroxidase label was detected using the chemiluminescent LumiGLO substrate kit (Kirkegaard and Perry, Gaithersburg, MD) and X-ray film.
N-terminal analysis. Approximately 10 µg of purified PHA synthase was run on a 10% SDS-polyacrylamide gel, transferred to PVDF (Immobilon-PSQ, Millipore Corporation, Bedford, MA), stained with Amido Black, and sequenced on a 494 Procise Protein Sequencer (Perkin-Elmer, Applied Biosystems Division, Foster City, California).
Double-infection protocol. Four 100 ml spinner flasks were each inoculated with 8 x 10^7 cells in 50 ml of fresh insect medium. To flask 1, an additional 20 ml of fresh insect medium was added (uninfected control); to flask 2, 10 ml BacPAK6::phbC viral stock (1 x 10^8 pfu/ml) and 10 ml fresh insect medium were added; to flask 3, 10 ml BacPAK6::FAS206 viral stock (1 x 10^8 pfu/ml) and 10 ml fresh insect medium were added; and to flask 4, 10 ml BacPAK6::phbC viral stock (1 x 10^8 pfu/ml) and 10 ml BacPAK6::FAS206 viral stock (1 x 10^8 pfu/ml) were added. These viral infections were carried out at a multiplicity of infection of approximately 10. Cultures were maintained under normal growth conditions and 15 ml samples were removed at 24, 48, and 72 hour time points. Cells were collected by gentle centrifugation at 1000 x g for 5 minutes, the medium was discarded, and the cells were immediately stored at -70°C.
PHA synthase assays. Coenzyme A released by PHA synthase in the process of polymerization was monitored precisely as described by Gerngross et al. (supra) using 5,5'-dithiobis(2-nitrobenzoic acid) (DTNB) (Ellman, Arch. Biochem. Biophys., 82, 70 (1959)).
The presence of HBCoA was monitored spectrophotometrically. Assays were performed at 25°C in a Hewlett Packard 8452A diode array spectrophotometer equipped with a water jacketed cell holder. Two-piece Starna Spectrosil spectrophotometer cells with pathlengths of 0.1 and 0.01 cm were employed to avoid errors arising from the compression of the absorbance scale at higher values. Absorbance was monitored at 232 nm, and an ε232 of 4.5 x 10^3 M^-1 cm^-1 was used in calculations. One unit (U) of enzyme is the amount required to hydrolyze 1 µmol of substrate minute^-1. Buffer (0.15 M KPi, pH 7.2) and substrate were equilibrated to 25°C and then combined in an Eppendorf tube also at 25°C.
Enzyme was added and mixed once in the pipet tip used to transfer the entire mixture to the spectrophotometer cell. The two-piece cell was immediately assembled and placed in the spectrophotometer with the cell holder (type CH) adapted for the standard 10 mm pathlength cell holder of the spectrophotometer. Manipulations of sample, from mixing to initiation of monitoring, took only 10-15 seconds. Absorbance was continually monitored for up to 10 minutes. Calibration of reactions was against a solution of buffer and enzyme (no substrate), which led to absorbance values that represented substrate only.
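The conversion from the linear decrease in A232 to enzyme units can be sketched as below, taking the extinction coefficient as 4.5 x 10^3 M^-1 cm^-1 and one unit as 1 µmol of substrate hydrolyzed per minute. The slope, pathlength, and reaction volume in the example are hypothetical inputs, and the function name is illustrative.

```python
EPSILON_232 = 4.5e3  # M^-1 cm^-1, thioester extinction coefficient quoted in the text

def synthase_units(slope_abs_per_min, path_cm, volume_ml, epsilon=EPSILON_232):
    """Convert a linear A232 change (absorbance/min) into units (umol/min)."""
    # Beer-Lambert gives the rate of concentration change in mol L^-1 min^-1.
    molar_per_min = abs(slope_abs_per_min) / (epsilon * path_cm)
    # Scale by the reaction volume (L) and convert mol to umol.
    return molar_per_min * (volume_ml / 1000.0) * 1e6

# Hypothetical reading: -0.045 A/min in a 0.1 cm cell holding 0.1 ml of reaction mix.
units = synthase_units(-0.045, path_cm=0.1, volume_ml=0.1)
print(units)
```

Dividing the result by the mass of enzyme in the cell (mg) would give specific activity in U/mg, the quantity reported in the purification summary.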
PHB assay. PHB was assayed from Sf21 cell samples according to the propanolysis method of Riis et al., J. Chromo., 445, 285 (1988). Cell pellets were thawed on ice, resuspended in 1 ml cold ddH2O and transferred to 5 ml screwtop test tubes with teflon seals. Two ml of ddH2O were added, the cells were washed and centrifuged and then 3 ml of acetone were added and the cells washed and centrifuged. The samples were then desiccated by placing them in a 94°C oven for 12 hours. The following day 0.5 ml of 1,2-dichloroethane, 0.5 ml acidified propanol (20 ml HCl, 80 ml 1-propanol) and 50 µl benzoic acid standard were added and the sealed tubes were heated to 100°C in a boiling water bath for 2 hours with periodic vortexing. The tubes were cooled to room temperature and the organic phase was used for gas-chromatographic (GC) analysis using a Hewlett Packard 5890A gas chromatograph equipped with a Hewlett Packard 7673A automatic injector and a fused silica capillary column, DB-WAX 30W of 30 meter length. Positive samples were further subjected to GC-mass spectrometric (MS) analysis for the presence of propylhydroxybutyrate using a Kratos MS25 GC/MS. The following parameters were used: source temperature, 210°C; voltage, 70 eV; and accelerating voltage, 4 keV.
Catalytic activities.
Ketoacyl synthase (KS) activity was assessed radiochemically by the condensation-14CO2 exchange reaction (Smith et al., PNAS USA, 73, 1184 (1976)).
Transferase (AT) activity was assayed, using malonyl-CoA as donor and pantetheine as acceptor, by determining spectrophotometrically the free CoA released in a coupled ATP citrate-lyase-malate dehydrogenase reaction (see, Rangan et al., J. Biol. Chem., 266, 19180 (1991)).
Ketoreductase (KR) was assayed spectrophotometrically at 340 nm: assay systems contained 0.1 M potassium phosphate buffer (pH 7), 0.15 mM NADPH, enzyme and either 10 mM trans-1-decalone or 0.1 mM acetoacetyl-CoA substrate.
Dehydrase (DH) activity was assayed spectrophotometrically at 270 nm using S-DL-β-hydroxybutyryl N-acetylcysteamine as substrate (Kumar et al., J. Biol. Chem., 245, 4732 (1970)).
Enoyl reductase (ER) activity was assayed spectrophotometrically at 340 nm essentially as described by Strom et al. (J. Biol. Chem., 254, 8159 (1979)); the assay system contained 0.1 M potassium phosphate buffer (pH 7), 0.15 mM NADPH, 0.375 mM crotonoyl-CoA, 20 µM CoA and enzyme.
Thioesterase (TE) activity was assessed radiochemically by extracting and assaying the [14C]palmitic acid formed from [1-14C]palmitoyl-CoA during a 3 minute incubation (Smith, Methods Enzymol., 71, 181 (1981)); the assay contained, in a final volume of 0.1 ml, 25 mM potassium phosphate buffer (pH 8), 20 µM [1-14C]palmitoyl-CoA (20 nCi) and enzyme.
Assay of overall fatty acid synthase activity was performed spectrophotometrically as described previously by Smith et al. (Methods Enzymol., 35, 65 (1975)). All enzyme activities were assayed at 37°C except the transferase, which was assayed at 20°C. Activity units indicate nmol of substrate consumed/minute. All assays were conducted, at a minimum, at two different protein concentrations with the appropriate enzyme and substrate blanks included.
Expression of A. eutrophus PHA Synthase Using a Baculovirus System
Recent work has shown that PHA synthase from A. eutrophus can be overexpressed in E. coli in the absence of 3-ketothiolase and acetoacetyl-CoA reductase (Gerngross et al., supra) and can be expressed in plants (see Poirier et al., Bio/Technology, 13, 142 (1995) for a review). Isolation of the soluble form of PHA synthase provides opportunities to examine the mechanistic details of the priming and initiation reactions. Because the baculovirus system has been successful for the expression of a number of prokaryotic genes as soluble proteins, and insect cells, unlike bacterial expression systems, carry out a wide array of post-translational modifications, the baculovirus expression system appeared ideal for the expression of large quantities of soluble PHA synthase, a protein that must be modified by phosphopantetheine in order to be catalytically active (Gerngross et al., supra).
Purification of PHA synthase. The purification procedure employed for PHA synthase is a modification of Gerngross et al. (supra) involving the elimination of the second liquid chromatographic step and inclusion of a protease-inhibitor cocktail in all buffers. All steps were carried out on ice or at 4°C except where noted. Frozen cells were thawed on ice in 10 ml of Buffer A (10 mM KPi, pH 7.2, 5% glycerol, and 0.05% Hecameg) and then immediately homogenized prior to centrifugation and HA chromatography.
The results of these efforts are summarized in Table 1 and Figure 7. A prominent band at 64 kDa is visible in total, soluble, and HA eluate protein samples fractionated by SDS/PAGE (lanes 4, 5, and 6 of Figure 7, respectively). The initial specific activity of the isolated PHA synthase was 20-fold higher than in previous attempts at expression and purification of this polypeptide. Approximately 1000 units of PHB synthase have been purified, based on calculations from the direct spectrophotometric assay detailed below, with an overall recovery of activity of 70%. The large proportion of synthase present in the membrane fraction, and the fact that over 90% of the initial activity was found in the soluble fraction, suggest either that the synthase in the membrane fraction is in an inactive form or that the direct assay is not applicable to the initial, 12 U/mg, crude extract.
sample                 total units   vol (mL)   protein (mg)   protein (mg/ml)   specific activity   recovery
total protein          1430          11.5       113            9.8               12.7                100
soluble protein        1340          10.5       47             4.5               28.6                93
pooled HA fractions    1020          7.9        30             3.8               34.2                71

N-terminal sequencing of the 64 kDa protein confirmed its identity as PHA synthase (Figure 8). Two prominent N-termini, at amino acid residue 7 (alanine) and residue 10 (serine), were obtained in a 3:2 ratio. This heterogeneous N-terminus presumably is the result of aminopeptidase activity. Western analysis using a rabbit-anti-PHA synthase antibody corroborated the results of the sequencing and indicated the presence of at least three bands that resulted from proteolysis of PHA synthase (Figure 7B, lanes 4-6). The antibody was specific for PHA synthase since neither T. ni nor baculoviral proteins showed reactivity (Figure 7B, lanes 2 and 3). N-terminal protein sequencing (Figure 8) showed directly that the 44 kDa (band b) and 32 kDa (band d) proteins were derived from PHA synthase (fragments beginning at A181/N185 and at G387, respectively). The 35-40 kDa (band c) protein gave low sequencing yields and may contain a blocked N-terminus. Inspection of Figure 7B suggests that most degradation occurs following cell disruption since the total protein sample of this gel (lane 4) was prepared by boiling intact cells directly in SDS sample buffer while the HA sample (lane 6) went through the purification procedure described above.
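The specific activity and recovery columns of the purification summary can be recomputed from the total units and total protein reported for each step. The sketch below uses only those tabulated figures; the function and variable names are illustrative. The recomputed values agree with the table to within about one unit in the last reported digit, which presumably reflects rounding of the reported inputs.

```python
# (step, total units, total protein in mg) as reported in the purification summary.
steps = [
    ("total protein", 1430, 113),
    ("soluble protein", 1340, 47),
    ("pooled HA fractions", 1020, 30),
]

def summarize(rows):
    """Return (step, specific activity in U/mg, percent recovery) for each row."""
    starting_units = rows[0][1]
    summary = []
    for name, units, protein_mg in rows:
        specific_activity = round(units / protein_mg, 1)
        recovery = round(100 * units / starting_units)
        summary.append((name, specific_activity, recovery))
    return summary

for row in summarize(steps):
    print(row)
```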
Assay of synthase activity. Due to the significant level of expression obtained using the baculovirus system, the synthase activity could be assayed spectrophotometrically by monitoring hydrolysis of the thioester bond at 232 nm, the wavelength at which there is a maximum decrease in absorbance upon hydrolysis. The difference between substrate (HBCoA) and product (CoA) at this wavelength is shown in Figure 9. Absorbance of HBCoA and CoA at 232 nm occurs at a trough between two well-separated peaks.
Assays were carried out at pH 7.2 for comparative analysis with previous studies (Gerngross et al., supra). Substrate (R-(-)-3-HBCoA) for these studies was prepared using the mixed anhydride method (Haywood et al., supra), and its concentration was determined by measuring A260. The short pathlength cells (0.1 cm and 0.01 cm) allowed use of relatively high reaction concentrations while conserving substrate and enzyme. Assay results showed an initial lag period of 60 seconds prior to the linear decrease in A232, and velocities were determined from the slope of these linear regions of the assay curves. The length of the lag period was variable and was inversely related to enzyme concentration. These data are consistent with those using PHA synthase purified from E. coli (Gerngross et al., supra).
Figures 10 and 11 show the V versus S and 1/V versus 1/S plots, respectively.
The double reciprocal plot was concave upward, which is similar to results obtained from studies of the granular PHA synthase from Zooglea ramigera (Fukui et al., Arch. Microbiol., 110, 149 (1976)) and suggests a complex reaction mechanism. Examinations of velocity and specific activity as a function of enzyme concentration are shown in Figures 12 and 13.
These results confirm that specific activity of the synthase depends upon enzyme concentration. The pH activity curve for A. eutrophus PHA synthase purified from T. ni cells is shown in Figure 14. The curve shows a broad activity maximum centered around pH 8.5.
This result agrees well with prior work on the A. eutrophus PHB synthase although it is significantly different from results obtained for the PHB synthase from Z. ramigera, for which the optimum was determined to be pH 7.0.
The effect of varying enzyme concentration in the presence of a fixed amount of substrate revealed an intriguing trend (Figure 15). From these data it appears that the extent of polymerization is dependent on the amount of enzyme included in the reaction mixture.
This could be explained if there is a "terminal length" limitation of the polymer, which, once reached, cannot be extended any further. If this is the case, it would also suggest that termination of the polymerization reaction, the release of the synthase from the polymer, and/or reinitiation of polymerization by the newly released synthase are relatively slow events since no evidence of these reactions is seen within the time course of these studies. The phenomenon observed in Figure 15 is not the result of decay of the enzyme over the course of the assay since virtually identical results are obtained following a 10 minute preincubation of the synthase at 25°C.
It must also be noted that comparisons of the direct spectrophotometric assays used here and the more common assay involving the use of Ellman's reagent, DTNB (Ellman, supra), in the formation of thiolate of coenzyme-A showed that the values determined by the direct method were approximately 70% of the values determined using Ellman's reagent.
This may be due to phase separation occurring in the cuvettes as the relatively insoluble polymer is formed. In support of this notion, a faint haze or opalescence in the cuvette developed during the course of the reaction, particularly at higher substrate concentrations.
PHA synthase purified from insect cells appears to be relatively stable. Examination of activity following storage in liquid N2 and at -20°C in the presence of 50% glycerol showed that approximately 50% of synthase activity remained after 7 weeks when stored in liquid N2 and approximately 75% of synthase activity remained after 7 weeks when stored at -20°C in the presence of 50% glycerol.
The expression of PHA synthase from A. eutrophus in a baculovirus expression system results in the synthase constituting approximately 50% of total protein 60 hours post-infection; however, approximately 50-75% of the synthase is observed in the membrane-associated fraction. This elevated level of expression allowed purification of the soluble PHA synthase using a single chromatographic step on HA. The purity of this preparation is estimated to be approximately 90% (intact PHA synthase and 3 proteolysis products).
The initial specific activity of 12 U/mg was approximately 20-fold higher than the most successful previous efforts at overexpression of A. eutrophus PHA synthase. The synthase reported here was isolated from a 250 ml culture with 70% recovery, which represents an improvement of 500-fold (1000 U / 64 U x 8 L / 0.25 L) when compared to an 8 L E. coli culture with 40% recovery. This high expression level should provide sufficient PHA synthase for extensive structural, functional, and mechanistic studies.
Furthermore, it is clear that the baculovirus expression system is an attractive option for isolation of other PHA synthases from various sources.
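The 500-fold figure is a per-volume yield comparison and can be reproduced from the numbers quoted in the text (1000 U from 0.25 L of insect cell culture versus 64 U from 8 L of E. coli culture). The function name below is illustrative.

```python
def fold_improvement(units_new, litres_new, units_old, litres_old):
    """Ratio of purified units per litre of culture, new system over old."""
    yield_new = units_new / litres_new   # U per litre, baculovirus/insect cells
    yield_old = units_old / litres_old   # U per litre, E. coli
    return yield_new / yield_old

fold = fold_improvement(1000, 0.25, 64, 8)
print(fold)
```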
PHA synthase produced in the baculovirus system was of sufficient potency to allow direct spectrophotometric analysis of the hydrolysis of the thioester bond of HBCoA at 232 nm. These assays revealed a lag period of approximately 60 seconds, the length of which was variable and inversely related to enzyme concentration. Such a lag period presumably reflects a slow step in the reaction, perhaps correlating to dimerization of the enzyme, the priming, and/or initiation steps in formation of PHB. Size exclusion chromatographic examination of the PHB synthase native MW indicated two forms of the synthase. One form showed a MW of approximately 100-160 kDa and the other showed a MW of approximately 50-80 kDa; these two forms likely represent the dimer and monomer of PHA synthase, respectively. Similar results have been reported previously in which two forms of approximately 60 and 130 kDa were observed. Comparisons of the direct assay reported here and the indirect assay using DTNB revealed that the former resulted in values that were 70% of the values determined by the DTNB indirect assay. Although the reason for this difference has not been examined in detail, it is probable that the apparent phase separation that occurred upon PHB formation in the short pathlength cuvettes used, particularly with high [HBCoA], results in this discrepancy.
Enzymatic analyses of the PHA synthase have found that the enzyme has a broad pH optimum centered at pH 8.5; however, the studies described herein have been performed at pH 7.2 to provide comparative values with the results of others. Moreover, the specific activity of this enzyme is dependent upon enzyme concentration, which confirms and extends earlier results (Gerngross et al., supra).
In studies intended to examine the dependence of activity upon enzyme concentration, it became apparent that the extent of the polymerization reaction is dependent on the amount of enzyme included in the reaction mixture. Specifically, decreasing the amount of enzyme leads not only to decreased velocity of reaction but also to a decreased extent of condensation (Figure 15). One possible explanation is that the enzyme is thermally labile; however, identical assays in which the enzyme is preincubated at 25°C for 10 minutes prior to initiation of the reaction had similar results. Another possibility is that a terminal length of the polymer is reached, precluding further condensations until the particular synthase molecule is released from the terminal-length polymer.

This work clearly demonstrates the value of the baculovirus expression system for the production of A. eutrophus PHA synthase and for the potential application to studies of other PHA synthases. Furthermore, the high level of expression obtained using the baculoviral system should allow convenient analysis for substrate-specificity and structure-function studies of PHA synthases from relatively crude insect cell extracts.
PHB Synthase Gene in Insect Cells
Expression of a rat FAS DH- cDNA in Sf cells has been reported previously (Rangan et al., J. Biol. Chem., 266, 19180 (1991); Joshi et al., Biochem. J., 295, 143 (1993)).
Once activity of the phbC gene product had been established in insect cells (see Example 1), baculovirus clones containing the rat FAS DH- cDNA and BacPAK6::phbC were employed in a double-infection strategy to determine if PHB would be produced in insect cells. It was not known if an intracellular pool of R-(-)-3-hydroxybutyrate would be stable or available as a substrate for the PHB synthase. In order for the R-(-)-3-hydroxybutyrylCoA to be available as a substrate, the R-(-)-3-hydroxybutyrylCoA released from rat FAS DH- protein must be trapped by the PHB synthase and incorporated into a polymer at a rate faster than β-oxidation, which would regenerate acetylCoA. It was also not known if the stereochemical configuration of the 3-hydroxyl group, which must be in the R form, would be recognized as a substrate by PHB synthase. Fortunately, previous biochemical studies on eukaryotic FASs indicated that the R form of 3-hydroxybutyrylCoA would be generated (Wakil et al., J. Biol. Chem., 237, 687 (1962)).
SDS-PAGE of protein samples from a time course of uninfected, single-infected, and dual-infected Sf21 cells was performed (Figure 16). From these data, it is clear that the rat FAS DH- mutant and PHB synthase polypeptides are efficiently co-expressed in Sf21 cells. However, co-expression results in approximately 50% reduced levels of both polypeptides compared to Sf21 cells that are producing the individual proteins. Western analysis using anti-rat FAS (Rangan et al., supra) and anti-PHA synthase antibodies confirmed simultaneous production of the corresponding proteins.
To provide further evidence that PHB was being synthesized in insect cells, T. ni cells which had been infected with a baculovirus vector encoding rat FAS DH- and/or a baculovirus vector encoding PHA synthase were analyzed for the presence of granules. Infected cells were fixed in paraformaldehyde and incubated with anti-PHA synthase antibodies (Williams et al., Protein Expr. Purif., 7, 203 (1996)). Granules were observed only in doubly infected cells (Williams et al., Appl. Environ. Microbiol., 62, 2540 (1996)).
PHB production in insect cells. In order to determine if de novo synthesis of PHB was occurring in Sf21 cells that co-express the rat FAS DH- mutant and PHB synthase, fractions of these samples were extracted, the extract subjected to propanolysis, and analyzed for the presence of propylhydroxybutyrate by gas chromatography (Figure 17). A unique peak with a retention time that coincided with a propylhydroxybutyrate standard was detected only in the double infection samples at 48 and 72 hours, in contrast to the individually expressed gene products and uninfected controls, which were negative. These samples were analyzed further by GC/MS to confirm the identity of the product. Figure 18 shows mass spectroscopy data corresponding to the material obtained from peak 10.1 in the gas chromatograph compared to a propylhydroxybutyrate standard. The results show that PHB synthesis is occurring only in Sf21 cells co-expressing the rat FAS DH- mutant cDNA and the phbC gene from A. eutrophus. Integration of the peak in the gas chromatograph corresponding to propylhydroxybutyrate revealed that approximately 1 mg of PHB was isolated from 1 liter culture of Sf21 cells (approximately 600 mg dry cell weight of Sf21 cells). Thus, the rat FAS206 protein effectively replaces the β-ketothiolase and acetoacetyl-CoA reductase functions, resulting in the production of PHB by a novel pathway.
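For orientation, the reported yield can be expressed as a fraction of dry cell weight. The sketch below uses only the two approximate figures given in the text (1 mg PHB and 600 mg dry cell weight per liter); the function name is illustrative.

```python
def percent_of_dry_weight(product_mg, dry_cell_mg):
    """Product accumulation as a percentage of dry cell weight."""
    if dry_cell_mg <= 0:
        raise ValueError("dry cell weight must be positive")
    return 100.0 * product_mg / dry_cell_mg

# Approximately 1 mg PHB per approximately 600 mg dry cell weight.
phb_percent = percent_of_dry_weight(1.0, 600.0)
print(round(phb_percent, 2))
```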
The approach described here provides a new strategy to combine metabolic pathways that are normally engaged in primary anabolic functions for production of polyesters. The premature termination of the normal fatty acid biosynthetic pathway to provide suitably modified acylCoA monomers for use in PHA synthesis can be applied to both prokaryotic and eukaryotic expression since the formation of polymer will not be dependent on specialized feedstocks. Thus, once a recombinant PHA monomer synthase is introduced into a prokaryotic or eukaryotic system, and co-expressed with the appropriate PHA synthase, novel biopolymer formation can occur.
Cloning and Sequencing of the vep ORFI PKS Gene Cluster
The entire PKS cluster from Streptomyces venezuelae was cloned using a heterologous hybridization strategy. A 1.2 kb DNA fragment that hybridized strongly to a DNA encoding an eryA PKS β-ketoacyl synthase domain was cloned and used to generate a plasmid for gene disruption. This method generated a mutant strain blocked in the synthesis of the antibiotic. A S. venezuelae genomic DNA library was generated and used to clone a cosmid containing the complete methymycin aglycone PKS DNA. Fine-mapping analysis was performed to identify the order and sequence of catalytic domains along the multifunctional PKS (Figure 19). DNA sequence analysis of the vep ORFI showed that the order of catalytic domains is KSQ/AT/ACP/KS/AT/KR/ACP/KS/AT/DH/KR/ACP. The complete DNA sequence, and corresponding amino acid sequence, of the vep ORFI is shown in Figure 23 (SEQ ID NO:1 and SEQ ID NO:2, respectively).
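The domain order reported for the vep ORFI can be grouped into modules by treating each ACP domain as the end of a module, which is the usual reading of Type I PKS architecture. The sketch below is only an illustration of that grouping: it reads the first domain as KSQ, and the function name is hypothetical.

```python
def split_modules(domain_order):
    """Split a '/'-separated domain string into modules, each ending at an ACP."""
    modules, current = [], []
    for domain in domain_order.split("/"):
        current.append(domain)
        if domain == "ACP":          # the acyl carrier protein closes a module
            modules.append(current)
            current = []
    if current:                      # keep any trailing domains without an ACP
        modules.append(current)
    return modules

vep_orf1 = "KSQ/AT/ACP/KS/AT/KR/ACP/KS/AT/DH/KR/ACP"
for index, module in enumerate(split_modules(vep_orf1)):
    label = "loading" if index == 0 else f"module {index}"
    print(label, "-".join(module))
```

Under this grouping the ORFI carries a loading module (KSQ-AT-ACP) followed by two extension modules, the second of which adds a DH domain.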
The sequence data indicated that the PKS gene cluster encodes a polyene of twelve carbons. The vep gene cluster contains 5 polyketide synthase modules, with a loading module at its 5' end and an ending domain at its 3' end. Each of the sequenced modules includes a keto-ACP synthase (KS), an acyltransferase (AT), a dehydratase (DH), a keto-reductase (KR), and an acyl carrier protein domain. The six acyltransferase domains in the cluster are responsible for the incorporation of six acetyl-CoA moieties into the product.
The loading module contains a KSQ, an AT and an ACP domain. KSQ refers to a domain that is homologous to a KS domain except that the active site cysteine (C) is replaced by glutamine (Q). There is no counterpart to the KSQ domain in the PKS clusters which have been previously characterized.
The ending domain (ED) is an enzyme which is responsible for the attachment of the nascent polyketide chain onto another molecule. The amino acid sequence of ED
resembles an enzyme, HetM, which is involved in Anabaena heterocyst formation. The homology between vep and HetM suggests that the polypeptide encoded by the vep gene cluster may synthesize a polyene-containing composition which is present in the spore coat or cell wall of its natural host, S. venezuelae.
Preparation of a Vector Encoding a Saturated β-Hydroxyhexanoyl CoA Monomer or an Unsaturated β-Hydroxyhexanoyl CoA Monomer
To provide a recombinant monomer synthase that generates a saturated β-hydroxyhexanoylCoA or unsaturated β-hydroxyhexanoylCoA monomer, the linear correspondence between the genetic organization of the Type I macrolide PKS and the catalytic domain organization in the multifunctional proteins is assessed (Donadio et al., supra, 1991; Katz et al., Ann. Rev. Microbiol., 47, 875 (1993)). First, a DNA encoding a TE is added to the 3' end of an ORFI of a Type I PKS, preferably the met ORFI (Figure 6), as recently described by Cortes et al. (Science, 268, 1487 (1995)) in the erythromycin system.
To ensure that the DNA encoding the TE is completely active, DNA encoding a linker region separating a normal ACP-TE region in a PKS, for example, the one found in met PKS ORF5 (Figure 5), will be incorporated into the DNA. The resulting vector can be introduced into a host cell and the TE activity, rate of release of the CoA product, and identity of the fatty acid chain determined.
The acyl chain that is most likely to be released is the CoA ester, specifically the 3-hydroxy-4-methyl heptenoylCoA ester, since the fully elongated chain is presumably released in this form prior to macrolide cyclization. If the CoA form of the acyl chain is not observed, then a gene encoding a CoA ligase will be cloned and co-expressed in the host cell to catalyze formation of the desired intermediate.
There is clear precedent for release of the predicted premature termination products from mutant strains of macrolide-producing Streptomyces that produce intermediates in macrolide synthesis (Huber et al., Antimicrob. Agents Chemother., 34, 1535 (1990); Kinoshita et al., J. Chem. Soc., Chem. Comm., 14, 943 (1988)). The structure of these intermediates is consistent with the linear organization of functional domains in macrolide PKSs, particularly those related to eryA, tyl, and met. Other known PKS gene clusters include, but are not limited to, the gene clusters encoding 6-methylsalicylic acid synthase (Beck et al., Eur. J. Biochem., 192, 487 (1990)), soraphen A (Schupp et al., J. Bacteriol., 177, 3673 (1995)), and sterigmatocystin (Yu et al., J. Bacteriol., 177, 4792 (1995)).
Once the release of the 3-hydroxy-4-methyl heptenoylCoA ester is established, DNA encoding the extender unit AT in met module 1 is replaced to change the specificity from methylmalonylCoA to malonylCoA (Figures 4-6). This change eliminates methyl group branching in the β-hydroxy acyl chain. While comparison of known AT amino acid sequences shows high overall amino acid sequence conservation, distinct regions are readily apparent where significant deletions or insertions have occurred. For example, comparison of malonyl and methylmalonyl transferase amino acid sequences reveals a 37 amino acid deletion in the central region of the malonyltransferase. Thus, to change the specificity of the methylmalonyl transferase (MMT) to that of a malonyl transferase (MT), the met ORFI DNA encoding the 37 amino acid sequence of MMT will be deleted, and the resulting gene will be tested in a host cell for production of the desmethyl species, 3-hydroxyheptenoylCoA. Alternatively, the DNA encoding the entire MMT can be replaced with a DNA encoding an intact MT to effect the desired chain construction.
After replacing MMT with MT, DNA encoding DH/ER will be introduced into DNA encoding met ORFI module 1. This modification results in a multifunctional protein that generates a methylene group at C-3 of the acyl chain (Figure 6). The DNA encoding DH/ER will be PCR amplified from the available eryA or tyl PKS sequences, including the DNA encoding the required linker regions, employing a primer pair to conserved sequences 5' and 3' of the DNA encoding DH/ER. The PCR fragment will then be cloned into the met ORFI.
The result is a DNA encoding a multifunctional protein (MT* DH/ER*TE*). This protein possesses the full complement of keto group processing steps and results in the production of heptenoylCoA.
The DNA encoding dehydrase in met module 2 is then inactivated, using site-directed mutagenesis in a scheme similar to that used to generate the rat FAS DH- described above (Joshi et al., J. Biol. Chem., 268, 22508 (1993)). This preserves the required (R)-3-hydroxy group which serves as the substrate for PHA synthases and results in (R)-3-hydroxyheptanoylCoA species.
The final domain replacement will involve the DNA encoding the starter unit acyltransferase in met module 1 (Figure 5), to change the specificity from propionyl CoA to acetyl CoA. This shortens the (R)-3-hydroxy acyl chain from heptanoyl to hexanoyl. The DNA encoding the catalytic domain will need to be generated based on a FAS or 6-methylsalicylic acid synthase model (Beck et al., Eur. J. Biochem., 192, 487 (1990)) or by using site-directed mutagenesis to alter the specificity of the resident met PKS propionyltransferase sequence. Limiting the initiator species to acetylCoA can result in the use of this starter unit by the monomer synthase. Previous work with macrolide synthases has shown that some are able to accept a wide range of starter unit carboxylic acids. This is particularly well documented for avermectin synthase, where over 60 new compounds have been produced by altering the starter unit substrate in precursor feeding studies (Dutton et al., J. Antibiotics, 44, 357 (1991)).
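The shortening from heptanoyl to hexanoyl follows from simple carbon counting. The sketch below is illustrative only and assumes that the starter unit contributes its own carbons (3 for propionyl, 2 for acetyl) and that each of the two extension modules discussed above adds two backbone carbons; the function and dictionary names are hypothetical.

```python
# Backbone carbons contributed by each starter unit.
STARTER_CARBONS = {"acetyl": 2, "propionyl": 3}

# Chain names for the backbone lengths relevant here.
CHAIN_NAMES = {6: "hexanoyl", 7: "heptanoyl"}


def backbone_length(starter, n_extension_modules=2):
    """Return the number of backbone carbons in the released acyl chain.

    Assumption: each extension module adds two backbone carbons,
    regardless of whether the extender is malonyl or methylmalonyl
    (a methylmalonyl extender adds a methyl branch, not backbone length).
    """
    return STARTER_CARBONS[starter] + 2 * n_extension_modules


for starter in ("propionyl", "acetyl"):
    n = backbone_length(starter)
    print(starter, "starter:", n, "carbons,", CHAIN_NAMES[n])
```

With two extension modules, the propionyl starter gives a seven-carbon backbone and the acetyl starter a six-carbon backbone, in line with the heptanoyl to hexanoyl change described above.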
Preparation of a Vector Encoding a Recombinant Monomer Synthase that Synthesizes 3-Hydroxyl-4-hexenoic Acid
To provide a recombinant monomer synthase that synthesizes 3-hydroxyl-4-hexenoic acid, a precursor for polyhydroxyhexenoate, the DNA segment encoding the loading and the first module of the vep gene cluster was linked to the DNA segment encoding module 7 of the tyl gene cluster so as to yield a recombinant DNA molecule encoding a fusion polypeptide which has no amino acid differences relative to the corresponding amino acid sequence of the parent modules. The fusion polypeptide catalyzes the synthesis of 3-hydroxyl-4-hexenoic acid. The recombinant DNA molecule was introduced into SCP2, a Streptomyces vector, under the control of the act promoter (pDHS502, Figure 20). A polyhydroxyalkanoate polymerase gene, phaC1 from Pseudomonas oleovorans, was then introduced downstream of the recombinant PKS cluster (pDHS505; Figures 22 and 23). The DNA segment encoding the polyhydroxyalkanoate polymerase is linked to the DNA segment encoding the recombinant PKS synthase so as to yield a fusion polypeptide which synthesizes polyhydroxyhexenoate in Streptomyces. Polyhydroxyhexenoate, a biodegradable thermoplastic, is not naturally synthesized in Streptomyces, or as a major product in any other organism. Moreover, the unsaturated double bond in the side chain of polyhydroxyhexenoate may result in a polymer which has superior physical properties as a biodegradable thermoplastic over the known polyhydroxyalkanoates.
Deletion of the desR Gene of the Desosamine Biosynthetic Gene Cluster
As some macrolides have more than one attached sugar moiety, the assignment of sugar biosynthetic genes to the appropriate sugar biosynthetic pathway can be quite difficult.
Since methymycin (a compound of formula (1)) and neomethymycin (a compound of formula (2)) (Figure 24) (Donin et al., 1953; Djerassi et al., 1956), two closely related macrolide antibiotics produced by Streptomyces venezuelae, contain desosamine as their sole sugar component, the organization of the sugar biosynthetic genes in the methymycin/neomethymycin gene cluster may be less complicated. Thus, this system was chosen for the study of the biosynthesis of desosamine, an N,N-dimethylamino-3,4,6-trideoxyhexose, which also exists in the erythromycin structure (Flinn et al., 1954).
To study the formation of this unusual sugar, a DNA library was constructed by partially digesting the genomic DNA of S. venezuelae (ATCC 15439) with Sau3A I into 35-40 kb fragments which were ligated into the cosmid vector pNJ1 (Tuan et al., 1990). The recombinant DNA was packaged into bacteriophage λ, which was used to transfect E. coli DH5α. The resulting cosmid library was screened for desired clones using the tylA1 and tylA2 genes from the tylosin biosynthetic cluster as probes (Baltz et al., 1988; Merson-Davies et al., 1994). These two probes are specific for sugar biosynthetic genes whose products catalyze the first two steps universally followed by all unusual 6-deoxyhexoses studied thus far. The initial reaction involves conversion of glucose-1-phosphate to TDP-D-glucose by α-D-glucose-1-phosphate thymidylyltransferase (TylA1), and subsequently TDP-D-glucose is transformed to TDP-4-keto-6-deoxy-D-glucose by TDP-D-glucose 4,6-dehydratase (TylA2).
Three cosmids were found to contain genes homologous to tylA1 and tylA2. Further analysis of these cosmids led to the identification of nine open reading frames (ORFs) downstream of the PKS genes (Figure 24). Based on sequence similarities to other sugar biosynthetic genes, especially those derived from the erythromycin cluster (Gaisser et al., 1997; Summers et al., 1997), eight of these nine ORFs are believed to be involved in the biosynthesis of TDP-D-desosamine. Interestingly, the ery cluster lacks homologs of the tylA1 and tylA2 genes that are responsible for the first two steps in the desosamine pathway. It is possible that the erythromycin biosynthetic machinery may rely on a general cellular pool of TDP-4-keto-6-deoxy-D-glucose for mycarose and desosamine formation. Depicted in Figure 24 is a biosynthetic pathway for TDP-D-desosamine.
Although eight of the nine ORFs have been assigned to desosamine formation, the presence of desR, which shows strong sequence homology to β-glucosidases (as high as 39% identity and 46% similarity) (Castle et al., 1998), within the desosamine gene cluster is puzzling. To investigate the function of DesR relative to the biosynthesis of methymycin/neomethymycin, a disruption plasmid (pBL1005) derived from pKC1139 (containing an apramycin resistance marker) (Bierman et al., 1992) was constructed in which a 1.0 kb NcoI/XhoI fragment of the desR gene was deleted and replaced by the thiostrepton resistance (tsr) gene (1.1 kb) (Bibb et al., 1985) via blunt-end ligation.
This plasmid was used to transform E. coli S17-1, which serves as the donor strain to introduce the pBL1005 construct through conjugal transfer into the wild-type S. venezuelae (Bierman et al., 1992).
The double crossover mutants in which chromosomal desR had been replaced with the disrupted gene were selected according to their thiostrepton-resistant and apramycin-sensitive characteristics. Southern blot hybridization analysis was used to confirm the gene replacement.
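The selection criterion described above can be written as a small decision rule. This is a hypothetical sketch for illustration: only the double crossover phenotype (thiostrepton-resistant, apramycin-sensitive) is stated in the text, and the interpretation of the other phenotype combinations is an assumption.

```python
def classify_exconjugant(thiostrepton_resistant, apramycin_resistant):
    """Classify a colony by its two resistance phenotypes.

    Stated in the text: double crossover mutants are thiostrepton-resistant
    and apramycin-sensitive. The other labels are assumptions.
    """
    if thiostrepton_resistant and not apramycin_resistant:
        # tsr is present, but the apramycin marker of the vector is gone.
        return "double crossover candidate"
    if thiostrepton_resistant and apramycin_resistant:
        # Assumption: the vector backbone is still present.
        return "vector retained (for example, single crossover)"
    # Assumption: without thiostrepton resistance the tsr cassette is absent.
    return "no tsr cassette"


print(classify_exconjugant(True, False))
```

As in the text, candidates selected this way would still need confirmation, for example by Southern blot hybridization.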
The desired mutant was first grown at 29°C in seed medium for 48 hours, and then inoculated and grown in vegetative medium for another 48 hours (Cane et al., 1993). After the fermentation broth was centrifuged at 10,000 g to remove cellular debris and mycelia, the supernatant was adjusted to pH 9.5 with concentrated KOH, and extracted four times with an equal volume of chloroform. The organic layer was dried over sodium sulfate and evaporated to dryness. The amber oil-like crude products were first subjected to flash chromatography on silica gel using a gradient of 0-40% methanol in chloroform, followed by HPLC purification on a C18 column eluted isocratically with 45% acetonitrile in 57 mM ammonium acetate (pH 6.7). In addition to methymycin (a compound of formula (1)) and neomethymycin (a compound of formula (2)), two new products were isolated. The yields of a compound of formula (13) and a compound of formula (14) were each in the range of 5-10 mg/L of fermentation broth. However, a compound of formula (1) and a compound of formula (2) remained the major products. High-resolution FAB-MS revealed that both compounds have identical molecular compositions that differ from methymycin/neomethymycin by an extra hexose. The structures of these two new compounds were elucidated to be C-2' β-glucosylated methymycin and neomethymycin (a compound of formula (13) and formula (14), respectively) by extensive spectral analysis.
The spectral data of (13): 1H NMR (acetone-d6) δ 6.56 (1H, d, J = 16.0, 9-H), 6.46 (1H, d, J = 16.0, 8-H), 4.67 (1H, dd, J = 10.8, 2.0, 11-H), 4.39 (1H, d, J = 7.5, 1'-H), 4.32 (1H, d, J = 8.0, 1"-H), 3.99 (1H, dd, J = 11.5, 2.5, 6"-H), 3.72 (1H, dd, J = 11.5, 5.5, 6"-H), 3.56 (1H, m, 5'-H), 3.52 (1H, d, J = 10.0, 3-H), 3.37 (1H, t, J = 8.5, 3"-H), 3.33 (1H, m, 5"-H), 3.28 (1H, t, J = 8.5, 4"-H), 3.23 (1H, dd, J = 10.5, 7.5, 2'-H), 3.15 (1H, dd, J = 8.5, 8.0, 2"-H), 3.10 (1H, m, 2-H), 2.75 (1H, 3'-H, buried under H2O peak), 2.42 (1H, m, 6-H), 2.28 (6H, s, NMe2), 1.95 (1H, m, 12-H), 1.9 (1H, m, 5-H), 1.82 (1H, m, 4'-H), 1.50 (1H, m, 12-H), 1.44 (3H, d, J = 7.0, 2-Me), 1.4 (1H, m, 5-H), 1.34 (3H, s, 10-Me), 1.3 (1H, m, 4-H), 1.25 (1H, m, 4'-H), 1.20 (3H, d, J = 6.0, 5'-Me), 1.15 (3H, d, J = 7.0, 6-Me), 0.95 (3H, d, J = 6.0, 4-Me), 0.86 (3H, t, J = 7.5, 12-Me). High-resolution FAB-MS: calc for C31H54NO12 (M+H)+ 632.3646, found 632.3686.
Spectral data of (14): 1H NMR (acetone-d6) δ 6.69 (1H, dd, J = 16.0, 5.5 Hz, 9-H), 6.55 (1H, dd, J = 16.0, 1.3, 8-H), 4.71 (1H, dd, J = 9.0, 2.0, 11-H), 4.37 (1H, d, J = 7.0, 1'-H), 4.31 (1H, d, J = 8.0, 1"-H), 3.97 (1H, dd, J = 11.5, 2.5, 6"-H), 3.81 (1H, dq, J = 9.0, 6.0, 12-H), 3.72 (1H, dd, J = 11.5, 5.0, 6"-H), 3.56 (1H, m, 5'-H), 3.50 (1H, bd, J = 10.0, 3-H), 3.36 (1H, t, J = 8.5, 3"-H), 3.32 (1H, m, 5"-H), 3.30 (1H, t, J = 8.5, 4"-H), 3.23 (1H, dd, J = 10.2, 7.0, 2'-H), 3.13 (1H, dd, J = 8.5, 8.0, 2"-H), 3.09 (1H, m, 2-H), 3.08 (1H, m, 10-H), 2.77 (1H, ddd, J = 12.5, 10.2, 4.5, 3'-H), 2.41 (1H, m, 6-H), 2.28 (6H, s, NMe2), 1.89 (1H, t, J = 13.0, 5-H), 1.83 (1H, ddd, J = 12.5, 4.5, 1.5, 4'-H), 1.41 (3H, d, J = 7.0, 2-Me), 1.3 (1H, m, 4-H), 1.25 (1H, m, 5-H), 1.2 (1H, m, 4'-H), 1.20 (3H, d, J = 6.0, 5'-Me), 1.17 (6H, d, J = 7.0, 6-Me, 10-Me), 1.12 (3H, d, J = 6.0, 12-Me), 0.96 (3H, d, J = 6.0, 4-Me). 13C NMR (acetone-d6) δ 204.1 (C-7), 175.8 (C-1), 148.2 (C-9), 126.7 (C-8), 108.3 (C-1"), 104.2 (C-1'), 85.1 (C-3), 83.0 (C-2'), 78.2 (C-3"), 78.1 (C-5"), 76.6 (C-2"), 76.4 (C-11), 71.8 (C-4"), 69.3 (C-5'), 66.1 (C-12), 66.0 (C-3'), 63.7 (C-6"), 46.2 (C-6), 44.4 (C-2), 40.8 (NMe2), 36.4 (C-10), 34.7 (C-5), 34.0 (C-4), 29.5 (C-4'), 21.5 (5'-Me), 21.5 (12-Me), 17.9 (6-Me), 17.7 (4-Me), 17.2 (2-Me), 9.9 (10-Me). High-resolution FAB-MS: calc for C31H54NO12 (M+H)+ 632.3646, found 632.3648.
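The calculated (M+H)+ value quoted for both glucosylated products can be checked by summing monoisotopic atomic masses. The following is a minimal sketch, not the procedure used by the inventors. It assumes that the ion formula is C31H54NO12 and that the mass is taken as the plain sum of atomic masses, without subtracting the electron mass, which is the convention that reproduces the quoted 632.3646. The helper names are hypothetical.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}


def formula_mass(formula):
    """Sum monoisotopic masses for a simple formula such as 'C31H54NO12'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def ppm_error(found, calculated):
    """Relative deviation of a measured mass from the calculated one, in ppm."""
    return (found - calculated) / calculated * 1e6


calc = formula_mass("C31H54NO12")  # ion formula of (13) and (14), (M+H)+
print(f"calculated: {calc:.4f}")
for label, found in (("(13)", 632.3686), ("(14)", 632.3648)):
    print(label, f"{ppm_error(found, 632.3646):.1f} ppm")
```

The same helper can be applied to other ion formulas in the specification, provided the formula contains only C, H, N and O.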
The coupling constant (d, J = 8.0 Hz) of the anomeric hydrogen (1"-H) of the added glucose and the magnitude of the downfield shift (11.8 ppm) of C-2' of desosamine are both consistent with the assigned C-2' β-configuration (Seo et al., 1978).
The antibiotic activity of a compound of formula (13) and (14) against Streptococcus pyogenes was examined by separately applying 20 µL of each sample (1.6 mM in MeOH) to sterilized filter paper discs which were placed onto the surface of S. pyogenes grown on Mueller-Hinton agar plates (Mangahas, 1996). After being grown overnight at 37°C, the plates of the controls (a compound of formula (1) and (2)) showed clearly visible inhibition zones. In contrast, no such clearings were discernible around the discs of a compound of formula (13) and (14). Evidently, β-glucosylation at C-2' of desosamine in methymycin/neomethymycin renders these antibiotics inactive.
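For orientation, the amount of compound applied to each disc follows directly from the stated volume and concentration. The sketch below is only an illustration; it reads the applied volume as 20 µL of a 1.6 mM solution, and the function name is hypothetical.

```python
def nanomoles_applied(volume_ul, concentration_mm):
    """Amount applied per disc in nmol.

    Unit check: µL x mmol/L = 1e-6 L x 1e-3 mol/L = 1e-9 mol = nmol.
    """
    return volume_ul * concentration_mm


amount_nmol = nanomoles_applied(volume_ul=20, concentration_mm=1.6)
print(f"{amount_nmol:.0f} nmol per disc")
```

Multiplying the result by the molecular weight of a given compound (in g/mol) gives the applied mass in nanograms.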
It should be noted that similar phenomena involving inactivation of macrolide antibiotics by glycosylation are known (Celmer et al., 1985; Kuo et al., 1989;
Sasaki et al., 1996). For example, it was found that when erythromycin was given to Streptomyces lividans, which contains a macrolide glycosyltransferase (MgtA), the bacterium was able to defend itself by glycosylating the drug (Cundliffe, 1992; Jenkins et al., 1991). Such a macrolide glycosyltransferase activity has been detected in 15 out of a total of 32 actinomycete strains producing various polyketide antibiotics (Sasaki et al., 1996).
Interestingly, the co-existence of a macrolide glycosyltransferase (OleD) capable of deactivating oleandomycin by glucosylation (Hernandez et al., 1993), and an extracellular β-glucosidase capable of removing the added glucose from the deactivated oleandomycin in Streptomyces antibioticus (Vilches et al., 1992) has led to the speculation of glycosylation as a possible self-resistance mechanism in S. antibioticus. Although the genes of the aforementioned glycosyltransferases have been cloned in a few cases, such as mgtA of S. lividans and oleD of S. antibioticus, the whereabouts of macrolide β-glycosidase genes remain obscure. Interestingly, the recently released eryBI sequence, which is part of the erythromycin biosynthetic cluster, is highly homologous to desR (55% identity) (Gaisser et al., 1997).
The discovery of desR, a macrolide β-glucosidase gene, within the desosamine gene cluster is thus significant, and the accumulation of deactivated compounds of formula (13) and (14) after desR disruption provides direct molecular evidence indicating that a similar self-defense mechanism via glycosylation/deglycosylation may also be operative in S. venezuelae. However, because a significant amount of methymycin and neomethymycin also exists in the fermentation broth of the mutant strain, glucosylation of desosamine may not be the primary self-resistance mechanism in S. venezuelae. Indeed, an rRNA methyltransferase gene found upstream from the PKS genes in this cluster may confer the primary self-resistance protection. Thus, these results are consistent with the fact that antibiotic-producing organisms generally have more than one defensive option (Cundliffe, 1989). In light of this observation, it is conceivable that methymycin/neomethymycin may be produced in part as the inert diglycosides (a compound of formula (13) or (14)), and the macrolide β-glucosidase encoded by desR is responsible for transforming methymycin/neomethymycin from their dormant state to their active form. Supporting this idea, the translated desR gene has a leader sequence characteristic of secretory proteins (von Heijne, 1986; von Heijne, 1989). Thus, DesR may be transported through the cell membrane and hydrolyze the modified antibiotics extracellularly to activate them (Figure 25).
Summary
Inspired by the complex assembly and the enzymology of aminodeoxy sugars that are frequently found as essential components of macrolide antibiotics, the entire desosamine biosynthetic gene cluster from the methymycin and neomethymycin producing strain Streptomyces venezuelae was cloned, sequenced, and mapped. Eight of the nine mapped genes were assigned to the biosynthesis of TDP-D-desosamine based on sequence similarities to those derived from the erythromycin cluster. The remaining gene, designated desR, showed strong sequence homology to β-glucosidases.
To investigate the function of the encoded protein (DesR), a disruption mutant was constructed in which a NcoI/XhoI fragment of the desR gene was deleted and replaced by the thiostrepton resistance (tsr) gene. In addition to methymycin and neomethymycin, two new products were isolated from the fermentation of the mutant strain. These two new compounds, which are biologically inactive, were found to be C-2' β-glucosylated methymycin and neomethymycin. Since the translated desR gene has a leader sequence characteristic of secretory proteins, the DesR protein may be an extracellular β-glucosidase capable of removing the added glucose from the modified antibiotics to activate them. Thus, the occurrence of desR within the desosamine gene cluster and the accumulation of deactivated glucosylated methymycin/neomethymycin upon disruption of desR provide strong molecular evidence suggesting that a self-resistance mechanism via glucosylation may be operative in S. venezuelae.
Thus, the desR gene can be used as a probe to identify homologs in other antibiotic biosynthetic pathways. Deletion of the corresponding macrolide glycosidase gene in other antibiotic biosynthetic pathways may lead to the accumulation of the glycosylated products which may be used as prodrugs with reduced cytotoxicity. Glycosylation also holds promise as a tool to regulate and/or minimize the potential toxicity associated with new macrolide antibiotics produced by genetically engineered microorganisms. Moreover, the availability of macrolide glycosidases, which can be used for the activation of newly formed antibiotics that have been deliberately deactivated by engineered glycosyltransferases, may be useful in the development of novel antibiotics using the combinatorial biosynthetic approach (Hopwood et al., 1990; Katz et al., 1993; Hutchinson et al., 1995; Carreras et al., 1997; Kramer et al., 1996; Khosla et al., 1996; Jacobsen et al., 1997; Marsden et al., 1998).
The emergence of pathogenic bacteria resistant to many commonly used antibiotics poses a serious threat to human health and has been the impetus of the present resurgent search for new antimicrobial agents (Box et al., 1997; Davies, 1996; Service, 1995). Since the first report on using genetic engineering techniques to create "hybrid" polyketides (Hopwood et al., 1995), the potential of manipulating the genes governing the biosynthesis of secondary metabolites to create new bioactive compounds, especially macrolide antibiotics, has received much attention (Kramer et al., 1996; Khosla et al., 1996). This class of clinically important drugs consists of two essential structural components: a polyketide aglycone and the appended deoxy sugars (Omura, 1984). The aglycone is synthesized via sequential condensations of acyl thioesters catalyzed by a highly organized multi-enzyme complex, polyketide synthase (PKS) (Hopwood et al., 1990; Katz, 1993; Hutchinson et al., 1995; Carreras et al., 1997). Recent advances in the understanding of polyketide biosynthesis have allowed recombination of the PKS genes to construct an impressive array of novel skeletons (Kramer et al., 1996; Khosla et al., 1996; Hopwood et al., 1990; Katz, 1993; Hutchinson et al., 1995; Carreras et al., 1997; Epp et al., 1989; Donadio et al., 1993; Arisawa et al., 1994; Jacobsen et al., 1997; Marsden et al., 1998). Without the sugar components, however, these new compounds are usually biologically impotent.
Hence, if one plans to make new macrolide antibiotics by a combinatorial biosynthetic approach, two immediate challenges must be overcome: assembling a repertoire of novel sugar structures and then having the capacity to couple these sugars to the structurally diverse macrolide aglycones.
Unfortunately, knowledge of the formation of the unusual sugars in these antibiotics remains limited (Liu et al., 1994; Kirschning et al., 1997; Johnson et al., 1998). Part of the reason for this comes from the fact that the sugar genes are generally scattered at both ends of the PKS genes. Such an organization within the macrolide biosynthetic gene cluster makes it difficult to distinguish the sugar genes from those encoding regulatory proteins or aglycone modification enzymes that are also interspersed in the same regions. The task can be made even more formidable if the macrolides contain multiple sugar components. In view of the "scattered" nature of the sugar biosynthetic genes, the antibiotic methymycin (a compound of formula (1) in Figure 24) and its co-metabolite, neomethymycin (a compound of formula (2) in Figure 24), of Streptomyces venezuelae present themselves as an attractive system to study the formation of deoxy sugars (Donin et al., 1953; Djerassi et al., 1956).
First, they carry D-desosamine (a compound of formula (3)), a prototypical aminodeoxy sugar that also exists in erythromycin. Second, since desosamine is the only sugar attached to the macrolactone of formula (1) and (2), identification of the sugar biosynthetic genes within the methymycin/neomethymycin gene cluster should be possible with much more certainty.
A 10 kb stretch of DNA downstream from the methymycin/neomethymycin gene cluster, which is about 60 kb in length, was found to harbor the entire desosamine biosynthetic gene cluster (Figure 26). Among the nine open reading frames (ORFs) mapped in this segment, eight are likely to be involved in desosamine formation, while the remaining one, desR, encodes a macrolide β-glycosidase that may be involved in a self-resistance mechanism. Their identities, shown in Figure 26, are assigned based on sequence similarities to other sugar biosynthetic genes (Gaisser et al., 1997; Summers et al., 1997). The proposed pathway is well founded on literature precedent and mechanistic intuition for the construction of aminodeoxy sugars (Liu et al., 1994; Kirschning et al., 1997; Johnson et al., 1998).
To determine whether new methymycin/neomethymycin analogues carrying modified sugars could be generated by altering the desosamine biosynthetic genes, the desVI gene, which has been predicted to encode the N-methyltransferase, was chosen as a target (Gaisser et al., 1997; Summers et al., 1997). The deduced desVI product is most closely related to that of eryCVI from the erythromycin producing strain Saccharopolyspora erythraea (70% identity), and also strongly resembles the predicted products of rdmD from the rhodomycin cluster of Streptomyces purpurascens (Niemi et al., 1995), srmX from the spiramycin cluster of Streptomyces ambofaciens (Geistlich et al., 1992), and tylM1 from the tylosin cluster of Streptomyces fradiae (Gandecha et al., 1997). All of these enzymes contain the consensus sequence LLDV(I)ACGTG (SEQ ID NO:25) (Gaisser et al., 1997; Summers et al., 1997) near their N-terminus, which is part of the S-adenosylmethionine binding site (Ingrosso et al., 1989; Haydock et al., 1991).
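A consensus sequence of this kind can be located in a protein sequence with a regular expression. The sketch below is illustrative only. It assumes that the notation V(I) means either valine or isoleucine at the fourth position, and the two test sequences are invented for the example rather than taken from any of the enzymes named above.

```python
import re

# Consensus LLDV(I)ACGTG, reading V(I) as "V or I" (an assumption).
SAM_MOTIF = re.compile(r"LLD[VI]ACGTG")


def find_motif(protein_sequence):
    """Return (zero-based start, matched text) for every motif occurrence."""
    return [(m.start(), m.group()) for m in SAM_MOTIF.finditer(protein_sequence)]


# Hypothetical N-terminal fragments, for demonstration only.
examples = {
    "toy_valine_variant": "MSTLLDVACGTGRLA",
    "toy_isoleucine_variant": "MAAALLDIACGTGKK",
    "toy_no_motif": "MSTLLDAACGTGRLA",
}
for name, seq in examples.items():
    print(name, find_motif(seq))
```

A search of this kind only flags candidate sites; assignment of a methyltransferase function would still rest on overall sequence similarity and experimental evidence, as described in the text.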
The deletion of desVI should have little polar effect (Lin et al., 1984) on the expression of other desosamine biosynthetic genes because the ORF (desR) lying immediately downstream from desVI is not directly involved in desosamine formation, and those lying further downstream are transcribed in the opposite direction.
Second, since N,N-dimethylation is almost certainly the last step in the desosamine biosynthetic pathway (Liu et al., 1994; Kirschning et al., 1997; Johnson et al., 1998; Gaisser et al., 1997; Summers et al., 1997), perturbing this step may lead to the accumulation of a compound of formula (4), which stands the best chance among all other intermediates of being recognized by the glycosyltransferase (DesVII) for successful linkage to the macrolactone of formula (6) (Figure 25). Deletion and/or disruption of a single biosynthetic gene often affects the pathway at more than one specific step. In fact, disruption of eryCVI, the desVI equivalent in the erythromycin cluster, which has been predicted to encode a similar N-methylase to make desosamine in erythromycin (Gaisser et al., 1997; Summers et al., 1997), led to the accumulation of an intermediate devoid of the entire desosamine moiety (Summers et al., 1997).
A plasmid, pBL3001, in which desVI was replaced by the thiostrepton gene (tsr) (Bibb et al., 1985), was constructed and introduced into wild type S. venezuelae by conjugal transfer using E. coli S17-1 (Bierman et al., 1992). Two identical double crossover mutants, KdesVI-21 and KdesVI-22, with phenotypes of thiostrepton resistance (ThioR) and apramycin sensitivity (ApmS) were obtained. Southern blot hybridization using tsr or a 1.1 kb HincII fragment from the desVII region further confirmed that the desVI gene was indeed replaced by tsr on the chromosome of these mutants. The KdesVI-21 mutant was first grown at 29°C in seed medium (100 mL) for 48 hours, and then inoculated and grown in vegetative medium (3 L) for another 48 hours (Cane et al., 1993). The fermentation broth was centrifuged to remove the cellular debris and mycelia, and the supernatant was adjusted to pH 9.5 with concentrated KOH, followed by extraction with chloroform. No methymycin or neomethymycin was found; instead, the 10-deoxy-methynolide (6) (350 mg) (Lambalot et al., 1992) and two new macrolides containing an N-acetylated amino sugar, a compound of formula (7) (20 mg) and a compound of formula (8) (15 mg), were isolated.
Their structures were determined by spectral analyses and high-resolution MS.
Spectral data of formula 7 are: 1H NMR (CDCl3) δ 6.62 (1H, d, J = 16.0, H-9), 6.22 (1H, d, J = 16.0, H-8), 5.75 (1H, d, J = 7.5, N-H), 4.75 (1H, dd, J = 10.8, 2.2, H-11), 4.28 (1H, d, J = 7.5, H-1'), 3.95 (1H, m, H-3'), 3.64 (1H, d, J = 10.5, H-3), 3.56 (1H, m, H-5'), 3.16 (1H, dd, J = 10.0, 7.5, H-2'), 2.84 (1H, dq, J = 10.5, 7.0, H-2), 2.55 (1H, m, H-6), 2.02 (3H, s, NAc), 1.95 (1H, m, H-12), 1.90 (1H, m, H-4'), 1.66 (1H, m, H-5), 1.50 (1H, m, H-12), 1.41 (3H, d, J = 7.0, 2-Me), 1.40 (1H, m, H-5), 1.34 (3H, s, 10-Me), 1.25 (1H, m, H-4), 1.22 (1H, m, H-4'), 1.21 (3H, d, J = 6.0, H-6'), 1.17 (3H, d, J = 7.0, 6-Me), 1.01 (3H, d, J = 6.5, 4-Me), 0.89 (3H, t, J = 7.2, 12-Me); 13C NMR (CDCl3) δ 204.3 (C-7), 175.1 (C-1), 171.8 (Me-C=O), 149.1 (C-9), 125.3 (C-8), 104.4 (C-1'), 85.4 (C-3), 76.3 (C-11), 75.4 (C-2'), 74.1 (C-10), 68.6 (C-5'), 51.9 (C-3'), 45.0 (C-6), 44.0 (C-2), 38.5 (C-4'), 33.8 (C-5), 33.3 (C-4), 23.1 (Me-C=O), 21.1 (C-12), 20.6 (C-6'), 19.2 (10-Me), 17.5 (6-Me), 17.2 (4-Me), 16.2 (2-Me), 10.6 (12-Me). High-resolution FABMS: calc for C25H42O8N (M+H)+ 484.2910, found 484.2903.
Spectral data of formula 8 are: 1H NMR (CDCl3) δ 6.76 (1H, dd, J = 16.0, 5.5, H-9), 6.44 (1H, dd, J = 16.0, 1.5, H-8), 5.50 (1H, d, J = 6.5, N-H), 4.80 (1H, dd, J = 9.0, 2.0, H-11), 4.28 (1H, d, J = 7.5, H-1'), 3.95 (1H, m, H-3'), 3.88 (1H, m, H-12), 3.62 (1H, d, J = 11.0, H-3), 3.57 (1H, m, H-5'), 3.18 (1H, dd, J = 10.0, 7.5, H-2'), 3.06 (1H, m, H-10), 2.86 (1H, dq, J = 11.0, 7.0, H-2), 2.54 (1H, m, H-6), 2.04 (3H, s, NAc), 1.98 (1H, m, H-4'), 1.67 (1H, m, H-5), 1.40 (1H, m, H-5), 1.39 (3H, d, J = 7.0, 2-Me), 1.25 (1H, m, H-4), 1.22 (1H, m, H-4'), 1.22 (3H, d, J = 6.0, H-6'), 1.21 (3H, d, J = 6.0, 6-Me), 1.19 (3H, d, J = 7.0, 12-Me), 1.16 (3H, d, J = 6.5, 10-Me), 1.01 (3H, d, J = 6.5, 4-Me); 13C NMR (CDCl3) δ 205.1 (C-7), 174.6 (C-1), 171.9 (Me-C=O), 147.2 (C-9), 126.2 (C-8), 104.4 (C-1'), 85.3 (C-3), 75.7 (C-11), 75.4 (C-2'), 68.7 (C-5'), 66.4 (C-12), 52.0 (C-3'), 45.1 (C-6), 43.8 (C-2), 38.6 (C-4'), 35.4 (C-10), 34.1 (C-5), 33.4 (C-4), 23.1 (Me-C=O), 21.0 (12-Me), 20.7 (C-6'), 17.7 (6-Me), 17.4 (4-Me), 16.1 (2-Me), 9.8 (10-Me). High-resolution FABMS: calc for C25H42O8N (M+H)+ 484.2910, found 484.2892.

The fact that compounds of formula (7) and (8) bearing modified desosamine are produced by the desVI deletion mutant is a thrilling discovery. However, this result is also somewhat surprising since the sugar component in the products is expected to be the aminodeoxy hexose (4). As illustrated in Figure 27, it is possible that a compound of formula (7) and (8) are derived from the predicted compound of formula (9) and (10), respectively, by a post-synthetic nonspecific acetylation of the attached aminodeoxy sugar. It is also conceivable that N-acetylation of (4) occurs first, followed by coupling of the resulting sugar (11) to the 10-deoxymethynolide (6). Nevertheless, the lack of N-methylation of the sugar component in these new products provides convincing evidence sustaining the assignment of desVI as the N-methyltransferase gene. Most significantly, the production of a compound of formula (7) and (8) by the desVI deletion mutant attests to the fact that the glycosyltransferase (DesVII) in the methymycin/neomethymycin pathway is capable of recognizing and processing sugar substrates other than TDP-desosamine (5).
Since both compounds of formula (7) and (8) are new compounds synthesized in vivo by the S. venezuelae mutant strain, the observed N-acetylation might be a necessary step for self-protection (Cundliffe, 1989). In view of these results, the potential toxicity associated with new macrolide antibiotics produced by genetically engineered microorganisms can be minimized and newly formed antibiotics that have been deactivated (either deliberately or not) during production can be activated. Such an approach can be part of an overall strategy for the development of novel antibiotics using the combinatorial biosynthetic approach.
Indeed, purified compounds of formula (7) and (8) are inactive against Streptococcus pyogenes grown on Mueller-Hinton agar plates (Mangahas, 1996), while the controls (a compound of formula (1) and (2)) show clearly visible inhibition zones.
It should be pointed out that a few glycosyltransferases involved in the biosynthesis of 25 antibiotics have been shown to have relaxed specificity towards modified macrolactones (Jacobsen et al., 1997; Marsden et al., 1998; Weber et al., 1991 ). However, a similar relaxed specificity toward sugar substrates ha,~ only been reported for the daunorubicin glycosyltransferase, which is able to recognize a modified daunosamine and catalyze its coupling to the aglycone, e-rhodomycinone (Madduri et al., 1998). Thus, the fact that the 30 methymycin/neomethymycin glycosyltransferase can also tolerate structural variants of its sugar substrate indicates that at least some glycosyltransferases in antibiotic biosynthetic pathways may be useful to create biologically active hybrid natural products via genetic engineering.

The appended sugars in macrolide antibiotics are indispensable to the biological activities of these clinically important drugs. Therefore, the development of new antibiotics via a biological combinatorial approach requires detailed knowledge of the biosynthesis of these unusual sugars, as well as the ability to manipulate the biosynthetic genes to create novel sugars that can be incorporated into the final macrolide structures. A targeted deletion of the desVI gene of Streptomyces venezuelae, which has been predicted to encode an N-methyltransferase based on sequence comparison, was prepared to determine whether new methymycin/neomethymycin analogues bearing modified sugars can be generated by altering the desosamine biosynthetic genes. Growth of the S. venezuelae deletion mutant strain resulted in the accumulation of a methymycin/neomethymycin analogue carrying an N-acetylated aminodeoxy sugar. Isolation and characterization of these derivatives not only provide the first direct evidence confirming the identity of desVI as the N-methyltransferase gene, but also demonstrate the feasibility of preparing novel sugars by the gene deletion approach. Most significantly, the results also revealed that the glycosyltransferase of methymycin/neomethymycin exhibits a relaxed specificity towards its sugar substrates.
Cloning and Sequencing of the Met/Pik Biosynthetic Gene Cluster
Materials and Methods
Bacterial Strains and Media. E. coli DH5α was used as a cloning host. E. coli LE392 was the host for a cosmid library derived from S. venezuelae genomic DNA. LB medium was used for E. coli propagation. Streptomyces venezuelae ATCC 15439 was obtained as a freeze-dried pellet from ATCC. Media for vegetative growth and antibiotic production were used as described (Lambalot et al., 1992). Briefly, SGGP liquid medium was used for propagation of S. venezuelae mycelia. Sporulation agar (SPA) was used for production of S. venezuelae spores. Methymycin production was conducted in either SCM or vegetative medium, and pikromycin production was performed in Suzuki glucose-peptone medium.
pUC119 was the routine cloning vector, and pNJ1 was the cosmid vector used for genomic DNA library construction. Plasmid vectors for gene disruption were either pGM160 (Muth et al., 1989) or pKC1139 (Bierman et al., 1992). Plasmid, cosmid, and genomic DNA preparation, restriction digestion, fragment isolation, and cloning were performed using standard procedures (Sambrook et al., 1989; Hopwood et al., 1985). The cosmid library was made according to instructions from the Packagene λ-packaging system (Promega).
DNA Sequencing and Analysis. An Exonuclease III (ExoIII) nested deletion series combined with PCR-based double-stranded DNA sequencing was employed to sequence the pik cluster. The ExoIII procedure followed the Erase-a-Base protocol (Stratagene), and DNA sequencing reactions were performed using the Dye Primer Cycle Sequencing Ready Reaction Kit (Applied Biosystems). The nucleotide sequences were read on an ABI PRISM 377 sequencer on both DNA strands. DNA and deduced protein sequence analyses were performed using GeneWorks and the GCG sequence analysis package. All analyses were performed using the default parameters of the specific program.
Gene Disruption. A replicative plasmid-mediated homologous recombination approach was developed to conduct gene disruption in S. venezuelae. Plasmids for insertional inactivation were constructed by cloning a kanamycin resistance marker into target genes, and plasmids for gene deletion/replacement were constructed by replacing the target gene with a kanamycin or thiostrepton resistance gene in the plasmid. Disruption plasmids were introduced into S. venezuelae by either PEG-mediated protoplast transformation (Hopwood et al., 1985) or RK2-mediated conjugation (Bierman et al., 1992). Spores from individual transformants or transconjugants were then cultured on non-selective plates to induce recombination. The cycle was repeated three times to enhance the opportunity for recombination. Double crossovers yielding targeted gene disruption mutants were selected and screened using the appropriate combination of antibiotics and finally confirmed by Southern hybridization.
Antibiotic Extraction and Analysis. Methymycin, pikromycin, and related compounds were extracted following published procedures (Cane et al., 1993). Thin layer chromatography (TLC) was routinely used to detect methymycin, neomethymycin, narbomycin and pikromycin. Further purification was conducted using flash column chromatography and HPLC, and the purified compounds were analyzed by 1H and 13C NMR spectroscopy and mass spectrometry.
Cloning and Identification of the pik Cluster. Heterologous hybridization was used to identify genes for methymycin, neomethymycin, narbomycin and pikromycin biosynthesis in S. venezuelae. Initial Southern blot hybridization analysis using a type I PKS DNA probe revealed two multifunctional PKS clusters of uncharacterized function in the genome. Since these four antibiotics all contain an identical desosamine residue, a tylA1 α-D-glucose-1-phosphate thymidylyltransferase DNA probe (for mycaminose/mycarose/mycinose biosynthesis in the tylosin pathway) (Merson-Davies et al., 1994) was used to locate the corresponding biosynthetic gene cluster(s). This analysis established that only one of the PKS pathways contained a cluster of desosamine biosynthetic genes. Nine overlapping cosmid clones were isolated, spanning over 80 kilobases (kb) on the bacterial chromosome and encompassing the entire gene cluster (pik) for methymycin, neomethymycin, narbomycin and pikromycin biosynthesis (Figure 28). Through subsequent gene disruption, the other PKS cluster (vep, devoid of linked desosamine biosynthetic genes) was found to play no role in production of methymycin, neomethymycin, narbomycin or pikromycin.
Nucleotide Sequence of the pik Cluster. The nucleotide sequence of the pik cluster was completely determined and shown to contain 18 open reading frames (ORFs) that span approximately 60 kb. Central to the cluster are four large ORFs, pikAI, pikAII, pikAIII, and pikAIV, encoding a multifunctional PKS (Figure 28). Analysis of the six modules comprising the pik PKS indicated that it would specify production of narbonolide, the 14-membered ring aglycone precursor of narbomycin and pikromycin (Figure 28).
Initial analysis unveiled two significant architectural differences in the pikA-encoded PKS. First, in comparison with eryA (Donadio et al., 1998) and oleA (Swan et al., 1994), two PKS clusters that produce the 14-membered ring macrolides erythromycin and oleandomycin, which are similar to pikromycin, the presence of separate ORFs, pikAIII and pikAIV, encoding Pik module 5 and Pik module 6 as individual modules, rather than one bimodular protein as in eryAIII and oleAIII, is striking. Second, the presence of a type II thioesterase immediately downstream of the type I PKS cluster is also unprecedented (Figure 28). These two characteristics suggest that pikA may produce the 12-membered ring macrolactone 10-deoxymethynolide as well. Indeed, the domain organization of PikAI-AIII (module L-5) is consistent with the predicted biosynthesis of 10-deoxymethynolide except for the absence of a TE function at the C-terminus of Pik module 5 (PikAIII). The lack of a TE domain in PikAIII may be compensated by the type II TE (encoded by pikAV) immediately downstream of pikAIV. Consistent with the supposition that two distinct polyketide ring systems are assembled from the pik PKS, two macrolide-lincosamide-streptogramin B type resistance genes, pikR1 and pikR2, are found upstream of the pik PKS (Figure 29), which presumably provide cellular self-protection for S. venezuelae.

The genetic locus for desosamine biosynthesis and glycosyl transfer is immediately downstream of pikA. Seven genes, desI, desII, desIII, desIV, desV, desVI, and desVIII, are responsible for the biosynthesis of the deoxysugar, and the eighth gene, desVII, encodes a glycosyltransferase that apparently catalyzes transfer of desosamine onto the alternate (12- and 14-membered ring) polyketide aglycones. The existence of only one set of desosamine genes indicates that DesVII can accept both 10-deoxymethynolide and narbonolide as substrates (Jacobsen et al., 1997). The largest ORF in the des locus, desR, encodes a β-glycosidase that is involved in a drug inactivation-reactivation cycle for bacterial self-protection.
Just downstream of the des locus are a gene (pikC) encoding a cytochrome P450 hydroxylase, PikC, similar to eryF (Andersen et al., 1992) and eryK (Stassi et al., 1993), and a gene (pikD) encoding a putative regulator protein, PikD (Figure 28).
Interestingly, PikC is the only P450 hydroxylase identified in the entire pik cluster, suggesting that the enzyme can accept both 12- and 14-membered ring macrolide substrates and, more remarkably, that it is active on both C-10 and C-12 of YC-17 (the 12-membered ring intermediate) to produce methymycin and neomethymycin (Figure 30). PikD is a putative regulatory protein similar to ORFH in the rapamycin gene cluster (Schwecke et al., 1995).
The combined functionality encoded by the eighteen genes in the pik cluster predicts biosynthesis of methymycin, neomethymycin, narbomycin and pikromycin (Table 2).
Flanking the pik cluster locus are genes presumably involved in primary metabolism and genes that may be involved in both primary and secondary metabolism. An S-adenosyl-methionine synthase gene is located downstream of pikD and may help to provide the methyl group in desosamine synthesis. A threonine dehydratase gene was identified upstream of pikR1 and may provide precursors for polyketide biosynthesis. It is not apparent that any of these genes are dedicated to antibiotic biosynthesis, and they are not directly linked to the pik cluster.

Table 2. Deduced function of ORFs in the pik cluster

Polypeptide (ORF)    Amino acids, no.   Proposed function or sequence similarity detected
PikAI                4,613              PKS
  Loading module                        KSQ AT(P) ACP
  Module 1                              KS AT(P) KR ACP
  Module 2                              KS AT(A) DH KR ACP
PikAII               3,739              PKS
  Module 3                              KS AT(P) KR ACP
  Module 4                              KS AT(P) DH ER KR ACP
PikAIII              1,562              PKS
  Module 5                              KS AT(P) KR ACP
PikAIV               1,346              PKS
  Module 6                              KS AT(P) ACP TE
PikAV                281                Thioesterase II (TEII)
DesI                 415                4-Dehydrase
DesII                485                Reductase?
DesIII               292                α-D-Glucose-1-phosphate thymidylyltransferase
DesIV                337                TDP-glucose 4,6-dehydratase
DesV                 379                Transaminase
DesVI                237                N,N-dimethyltransferase
DesVII               426                Glycosyl transferase
DesVIII              402                Tautomerase?
DesR                 809                β-Glucosidase (involved in resistance mechanism)
PikC                 418                P450 hydroxylase
PikD                 945?               Putative regulator
PikR1                336                rRNA methyltransferase (mls resistance)
PikR2                288?               rRNA methyltransferase (mls resistance)

AT(A), acyltransferase incorporating an acetate extender unit; AT(P), acyltransferase incorporating a propionate extender unit. KR°, an inactive KR. Enzymes of uncertain function are denoted with a question mark.
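The relationship between the modular layout in Table 2 and the length of the polyketide chain can be illustrated with a small data structure. The following is a minimal sketch, not part of the described methods: it assumes one acyl unit per module (the loading module supplying the starter unit) and uses the protein sizes and module assignments from Table 2. The names PIK_PKS and acyl_units are hypothetical.

```python
# PKS proteins from Table 2: (name, amino acids, modules carried by the protein).
PIK_PKS = [
    ("PikAI", 4613, ["Loading module", "Module 1", "Module 2"]),
    ("PikAII", 3739, ["Module 3", "Module 4"]),
    ("PikAIII", 1562, ["Module 5"]),
    ("PikAIV", 1346, ["Module 6"]),
]

KETIDE_NAMES = {6: "hexaketide", 7: "heptaketide"}


def acyl_units(last_protein: str) -> int:
    """Count acyl units assembled when the chain is released after last_protein.

    Assumption: the loading module contributes the starter unit and each
    extension module adds exactly one extender unit.
    """
    units = 0
    for name, _size, modules in PIK_PKS:
        units += len(modules)
        if name == last_protein:
            return units
    raise ValueError(f"unknown PKS protein: {last_protein}")


if __name__ == "__main__":
    for protein in ("PikAIII", "PikAIV"):
        n = acyl_units(protein)
        print(f"release after {protein}: {n} units ({KETIDE_NAMES[n]})")
    print("total PKS length:", sum(size for _, size, _ in PIK_PKS), "amino acids")
```

Under this assumption, release after PikAIII corresponds to a hexaketide (the 12-membered ring precursor) and release after PikAIV to a heptaketide (the 14-membered ring precursor).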
Table 3. Summary of mutational analyses of the pik cluster

                                                  Antibiotic production/Intermediate accumulation
Mutant    Type of mutation        Target gene     Methymycin & neomethymycin     Pikromycin
AX903     Insertion               pikAI           No/No                          No/No
LZ3001    Deletion/replacement    desVI           No/10-deoxymethynolide         No/narbonolide
LZ4001    Deletion/replacement    desV            No/10-deoxymethynolide         No/narbonolide
AX905     Deletion/replacement    pikAV           <5%/No                         <5%/No
AX906     Insertion               pikC            No/YC-17                       No/narbomycin

Mutational Analysis of the pik Cluster. Extensive disruption of genes in the pik cluster was carried out to address the role of key enzymes in antibiotic production (Table 3).
First, PikAI, the first putative enzyme involved in the biosynthesis of 10-deoxymethynolide and narbonolide, was inactivated by insertional mutagenesis. The resulting mutant, AX903, produced neither methymycin and neomethymycin nor narbomycin and pikromycin, indicating that pikA encodes a PKS required for both 12- and 14-membered ring macrolactone formation.
Second, deletion of either desVI or desV abolished methymycin, neomethymycin, narbomycin and pikromycin production, and the resulting mutants, LZ3001 and LZ4001, accumulate 10-deoxymethynolide and narbonolide in their culture broth, indicating that enzymes for desosamine synthesis and transfer are also shared by the 12- and 14-membered ring macrolides.
In order to understand the mechanism of polyketide chain termination at PikAIII (PikAIII (module 5) is presumed to be the termination point in construction of 10-deoxymethynolide), the pik TEII gene, pikAV, was deleted. The deletion/replacement mutant, AX905, produces less than 5% of methymycin and neomethymycin, and less than 5% of pikromycin, compared to wild type S. venezuelae. This abrogation in product formation occurs without significant accumulation of the expected aglycone intermediates, suggesting that pik TEII is involved in the termination of 12- as well as 14-membered ring macrolides at PikAIII and PikAIV, respectively. Although polar effects could in principle influence the phenotype observed in AX905, this possibility was ruled out by mutant LZ3001, in which a mutation in a gene downstream of pikAV led to accumulation of 10-deoxymethynolide and narbonolide. The fact that mutant AX905 failed to accumulate these intermediates suggested that the polyketide chains were not efficiently released from this PKS protein in the absence of Pik TEII. Therefore, Pik TEII plays a crucial role in polyketide chain release and cyclization, and it presumably provides the mechanism for alternative termination in pik polyketide biosynthesis.
Finally, disruption of pikC confirmed that PikC is the sole enzyme catalyzing hydroxylation of both YC-17 (at C-10 and C-12) and narbomycin (at C-12). The relaxed substrate specificity of PikC and its regiochemical specificity at C-10 and C-12 provide another layer of metabolite diversity in the pik-encoded biosynthetic system.
The work described herein has established that methymycin, neomethymycin, narbomycin and pikromycin biosynthesis is encoded by the pik cluster in S. venezuelae. Three key enzymes as well as the unique architecture of the cluster enable this relatively compact system to produce multiple macrolide antibiotics. Foremost, the presence of pik modules 5 and 6 as separate proteins, PikAIII and PikAIV, and the activity of pik TEII enable the bacterium to terminate the polyketide chain at two different points of assembly, thereby producing two macrolactones of different ring size. Second, DesVII, the glycosyltransferase in the pik cluster, can accept both 12- and 14-membered ring macrolactones as substrates. Finally, PikC, the P450 hydroxylase, has a remarkable substrate and regiochemical specificity that introduces another layer of diversity into the system.
It is interesting to consider that pikA evolved in a line analogous to eryA and oleA, since each of these PKSs specifies the synthesis of 14-membered ring macrolactones. Therefore, pik may have acquired the capacity to generate methymycin when a mutation in the primordial pikAIII-pikAIV linker region caused splitting of Pik modules 5 and 6 into two separate gene products. This notion is raised by two features of the nucleotide sequence. First, the intergenic region between pikAIII and pikAIV, which is 105 bp, may be the remnant of an intramodular linker peptide of 35 amino acids. Moreover, the potential for independently regulated expression of pikAIV is implied by the presence of a 100 nucleotide region at the 5' end of the gene that is relatively AT-rich (62% G+C as compared with 74% G+C content in the coding region). Thus, as the mutation in an original ORF encoding the bimodular multifunctional protein (PikAIII-PikAIV) occurred, so too may have evolved a mechanism for regulated synthesis of the new gene product (PikAIV).
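The two sequence features cited here rest on simple arithmetic: a 105 bp region corresponds to 35 codons, and the AT-richness of a region is judged from its G+C fraction. A minimal sketch of both calculations follows; the function names are hypothetical, and the short sequence in the usage lines is an arbitrary illustration, not data from the pik cluster.

```python
def codon_count(length_bp: int) -> int:
    """Number of whole codons encoded by an in-frame stretch of length_bp bases."""
    if length_bp % 3 != 0:
        raise ValueError("length is not a multiple of three")
    return length_bp // 3


def gc_fraction(sequence: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


if __name__ == "__main__":
    print(codon_count(105))                 # length of the pikAIII-pikAIV intergenic region
    print(f"{gc_fraction('GGCCAT'):.2f}")   # arbitrary example sequence
```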
The role of Pik TEII in alternative termination of polyketide chain elongation intermediates provides a unique aspect of diversity generation in natural product biosynthesis.
Engineered polyketides of different chain length are typically generated by moving the TE catalytic domain to alternate positions in a modular PKS (Cortes et al., 1995). Repositioning of the TE domain necessarily abolishes production of the original full-length polyketide, so only one macrolide is produced each time. In contrast to the fixed-position TE domain, the independent Pik TEII polypeptide presumably has the flexibility to catalyze termination at different stages of polyketide assembly, thereby enabling the system to produce multiple products of variant chain length. Combinatorial biology technologies can now exploit this system for generating molecular diversity through construction of novel PKS systems with TEIIs for simultaneous production of several new molecules, as opposed to the TE domains alone, which limit catalysis to a single termination step.
It is noteworthy that sequences similar to Pik TEII are found in almost all known polyketide and non-ribosomal polypeptide biosynthetic systems (Marahiel et al., 1997). Currently, the pik TEII is the first to be characterized in a modular PKS. However, recent work on a TEII gene in the lipopeptide surfactin biosynthetic cluster (Schneider et al., 1998) demonstrated that srf TEII plays an important role in polypeptide chain release, and may suggest that srf TEII reacts at multiple stages in peptide assembly as well (Marahiel et al., 1997).
The enzymes involved in post-polyketide assembly of 10-deoxymethynolide and narbonolide are particularly intriguing, especially the glycosyltransferase, DesVII, and the P450 hydroxylase, PikC. Both have the remarkable ability to accept substrates with significant structural variability. Moreover, disruption of desVI demonstrated that DesVII also tolerates variations in deoxysugar structure (Example 6). Likewise, PikC has recently been shown to convert YC-17 to methymycin/neomethymycin and narbomycin to pikromycin in vitro.
Targeted gene disruption of ORFI abolished both pikromycin and methymycin production, indicating that the single cluster is responsible for biosynthesis of both antibiotics. Deletion of the TE2 gene substantially reduced methymycin and pikromycin production, which demonstrates that TE2, in contrast to the position-fixed TE1 domain, has the capacity to release the polyketide chain at different points during the assembly process, thereby producing polyketides of different chain length.
The results described above were unexpected in several respects: that one PKS cluster produces two macrolides which differ in the number of atoms in their ring structure, that module 5 and module 6 of the PKS are in ORFs that are separated by a spacer region, that PikAIII lacked a TE, that there was a Type II thioesterase, that the TEI domain was not separate, and that two resistance genes were identified which may be specific for either a 12- or 14-membered ring.
With eighteen genes spanning less than 60 kb of DNA capable of producing four active macrolide antibiotics, the pik cluster represents the least complex yet most versatile modular PKS system so far investigated. This simplicity provides the basis for a compelling expression system in which novel active ketoside products are engineered and produced with considerable facility for discovery of a diverse range of new biologically active compounds.
Complex polyketide synthesis follows a processive reaction mechanism, and each module within a PKS harbors a string of three to six enzymatic domains that catalyze reactions in nearly linear order as described in particular detail for the erythromycin-producing PKS (Katz, 1997; Khosla, 1997; Staunton et al. 1997). The combined set of PKS
modules and catalytic domains along with genes that encode enzymes for post-polyketide tailoring (e.g., glycosyl transferases, hydroxylases) typically limits a biosynthetic system to the generation of a single polyketide product.
Combinatorial biology involves the genetic manipulation of multistep biosynthetic pathways to create molecular diversity in natural products for use in novel drug discovery.
PKSs represent one of the most amenable systems for combinatorial technologies because of their inherent genetic organization and ability to produce polyketide metabolites, a large group of natural products generated by bacteria (primarily actinomycetes and myxobacteria) and fungi with diverse structures and biological activities. Complex polyketides are produced by multifunctional PKSs involving a mechanism similar to long-chain fatty acid synthesis in animals (Hopwood et al., 1990). Pioneering studies (Cortes et al., 1990; Donadio et al., 1991) on the erythromycin PKS in Saccharopolyspora erythraea revealed a modular organization. Characterization of this multidomain protein system, followed by molecular analysis of rapamycin (Aparicio et al., 1996), FK506 (Motamedi et al., 1997), soraphen A (Schupp et al., 1995), niddamycin (Kakavas et al., 1997), and rifamycin (August et al., 1998) PKSs, demonstrated a co-linear relationship between modular structure of a multifunctional bacterial PKS and the structure of its polyketide product.
In a survey of microbial systems capable of generating unusual metabolite structural variability, Streptomyces venezuelae ATCC 15439 is notable in its ability to produce two distinct groups of macrolide antibiotics. Methymycin and neomethymycin are derived from the 12-membered ring macrolactone 10-deoxymethynolide, while narbomycin and pikromycin are derived from the 14-membered ring macrolactone, narbonolide.
The cloning and characterization of the biosynthetic gene cluster for these antibiotics reveals the key role of a type II thioesterase in forming a metabolic branch through which polyketides of different chain length are generated by the pikromycin multifunctional polyketide synthase (PKS).
Immediately downstream of the PKS genes (pikA) is a set of genes for desosamine (des) biosynthesis and macrolide ring hydroxylation. The glycosyl transferase (encoded by desVII) has the remarkable ability to catalyze glycosylation of both the 12- and 14-membered ring macrolactones. Moreover, the pikC-encoded P450 hydroxylase provides yet another layer of structural variability by introducing regiochemical diversity into the macrolide ring systems.
One strategy to exploit modular PKSs, e.g., modules of pikA or a FAS, to provide PHA monomers is to harvest polyketide intermediates as CoA derivatives using a TEII which is converted to an acyl-CoA transferase (mTEII). PikTEII is a small enzyme (281 amino acids) encoded by pikAV in S. venezuelae. The primary function of the wild-type enzyme is to catalyze the release of a polyketide chain at the fifth module in the pikA pathway as 10-deoxymethynolide. The enzyme most likely binds to the fifth module (PikAIII) ACP (ACP5) and releases the acyl chain attached to it. This relationship between TEII and its cognate ACP5 can be exploited to produce polyketides having different chain lengths by moving Pik ACP5 to a different position in the cluster. For example, by moving ACP5 into the second module in place of ACP2, a triketide instead of a hexaketide may be produced by the cluster. Further, by moving KR5 together with ACP5 into the second module to replace the DH, KR, and ACP domains, a 3-hydroxyl triketide is produced that is structurally suitable as a PHA monomer. A mutant TEII (mTEII) catalyzes the release of the triketide in its CoA form. The triketide-CoA, 3,5-dihydroxyl-4-methyl-heptanoyl-CoA, is a substrate for PHA polymerase, e.g., PhaC1 from P. oleovorans, which, in turn, can incorporate the monomer into a polymer.
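The effect of the domain swap described above can be sketched with a simple rule-based model of a PKS module. This is an illustration under stated assumptions, not a description of the actual engineering: it applies the textbook rule that the reductive domains present in a module (KR, DH, ER) determine the oxidation state at the β-carbon, and it counts one acyl unit per module plus the starter unit. The function names are hypothetical.

```python
def beta_carbon_state(domains: list[str]) -> str:
    """Predict the beta-carbon state left by a module from its reductive domains.

    Textbook rule (an assumption here, and it presumes every listed domain is active):
    no KR -> keto; KR only -> hydroxyl; KR + DH -> enoyl; KR + DH + ER -> methylene.
    """
    if "KR" not in domains:
        return "keto"
    if "DH" not in domains:
        return "hydroxyl"
    if "ER" not in domains:
        return "enoyl"
    return "methylene"


def ketide_length(release_module: int) -> int:
    """Acyl units in the chain when release occurs at the given extension module."""
    return release_module + 1  # starter unit plus one unit per extension module


# Module 2 as listed in Table 2, and the same module after replacing its
# DH-KR-ACP domains with the KR and ACP domains of module 5.
MODULE_2_WILD_TYPE = ["KS", "AT", "DH", "KR", "ACP"]
MODULE_2_ENGINEERED = ["KS", "AT", "KR", "ACP"]

if __name__ == "__main__":
    print(beta_carbon_state(MODULE_2_WILD_TYPE))    # state in the native module
    print(beta_carbon_state(MODULE_2_ENGINEERED))   # state after the swap
    print(ketide_length(2), ketide_length(5))       # release at module 2 vs module 5
```

In this model, release at module 2 gives a three-unit chain (a triketide) carrying a β-hydroxyl group after the swap, whereas release at module 5 gives a six-unit chain.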
A second strategy includes the harvesting of a polyketide intermediate as a CoA derivative using a TEI which has been converted to an acyl-CoA transferase (mTE). Thus, the second strategy for 3-hydroxyacyl-CoA monomer production is to exploit the TE domain (TEI) within the PKS module. It has been demonstrated that the TE domain can release polyketide intermediates attached to the ACP domain within the same module. Moving the TEI to a different position in a PKS cluster results in the production of a polyketide having a different chain length. Similarly, a mutant TEI (mTEI) (i.e., one which is an acyl-CoA transferase) releases the polyketide intermediate as an acyl-CoA, which is then polymerized by PHA synthetase. Preferably, a mutant TE domain in the pikA gene cluster is moved into pik module 1 by fusing it immediately downstream of ACP1. The recombinant enzyme produces 2-(S)-methyl-3(R)-hydroxyvaleryl-CoA, which is a suitable substrate for PHA polymerase PhaC1. Therefore, the coexpression of the polymerase with the recombinant PKS produces a polymer.
A third strategy is to directly collect polyketide intermediates as substrates for PHA synthesis by fusing a PHA polymerase with a polyketide synthase. The first two strategies produce 3-hydroxylacyl-CoA as a substrate for PHA synthesis by employing a mutant PKS enzyme (TEI or TEII). As PHA polymerase may be active on acyl-ACP itself if the acyl-ACP is properly oriented, the third strategy fuses a PHA polymerase downstream of an ACP in a PKS protein. The PHA synthetase then serves as a domain within the chimeric multifunctional enzyme in place of a TE domain. The PKS portion of the protein catalyzes the synthesis of a 3-hydroxylacyl-ACP intermediate, and the PHA synthetase domain then accepts it as substrate and adds the 3-hydroxylacyl monomer to the growing polyhydroxyalkanoate chain. The process regenerates ACP function so that the reaction can proceed repeatedly to synthesize a PHA of multiple units. For example, a phaC1 gene is fused directly downstream of pik ACP1 so as to produce a chimeric enzyme that catalyzes the synthesis of a polymer.
The strategies described above can produce PHAs of complex structure, and having superior properties. In addition, the structure can be easily fine-tuned by modifying the PKS
gene, thus resulting in PHAs having desired properties or functions.
Media. Streptomyces venezuelae ATCC 15439 produces two groups of macrolide antibiotics: the 12-membered ring macrolides methymycin and neomethymycin, and the 14-membered ring macrolides pikromycin and narbomycin (Figure 28). Methymycin and neomethymycin are derived from the 12-membered ring macrolactone 10-deoxymethynolide and are produced in SCM medium (Lambalot et al., 1992), whereas pikromycin and narbomycin are derived from the 14-membered ring macrolactone narbonolide and are produced in PGM medium (Xue et al., 1998).
Genetic Manipulation of S. venezuelae. Mutants AX910 and AX912 were created by targeted gene replacement. The mutation plasmid pDHS910 was created by ligating two DNA fragments flanking the TE domain so that the TE domain was deleted and a hexa-histidine sequence was introduced at its position. The primer pairs used to amplify the flanking DNA by polymerase chain reaction (PCR) were 5'-CCCGAATTCGCCGCCGCCAT~GGCCGAA-3' (SEQ ID NO:42) and 5'-GTGATGCATCGGCTCGGCGACGGCCCAGTTCCGCT-3' (SEQ ID NO:43); and 5'-ATGCATCACCACCACCACCACTGAGGGGGCGGGCAAGTGACCGAC-3' (SEQ ID NO:44) and 5'-GGGTCTAGAGCTGCACCGGCGGGTCGTAGCGGA-3' (SEQ ID NO:45).
Plasmid pDHS910 was introduced into S. venezuelae AX905 (Xue et al., 1998), which has a kanamycin resistance marker at the position of pikAV. Following procedures established by Xue et al. (1998), mutant AX910 (12 colonies) was isolated by screening for a kanamycin-sensitive phenotype. The expected genotype of the mutant was confirmed by genomic Southern hybridization. Mutation plasmid pDHS912 was generated by replacing a BamHI-BglII fragment (the DNA fragment corresponding to the pikAV gene immediately downstream of the TE domain) in pDHS910 with a kanamycin resistance gene (Denis et al., 1992). Thus, the TE domain as well as the TEII gene pikAV were disrupted in the mutant AX912. Plasmid pDHS912 was transferred into wild type S. venezuelae, and mutant AX912 (12 colonies) was selected according to the procedures of Xue et al. (1998).
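The hexa-histidine sequence mentioned above can be seen directly in the primer of SEQ ID NO:44. The sketch below translates that primer with the standard genetic code, assuming for illustration that the reading frame starts at the first base of the primer (the actual frame in the construct depends on the upstream junction, which is not shown here). The names CODON_TABLE and translate are hypothetical.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Primer of SEQ ID NO:44 as given in the text.
SEQ_ID_NO_44 = "ATGCATCACCACCACCACCACTGAGGGGGCGGGCAAGTGACCGAC"


def translate(dna: str, frame: int = 0) -> str:
    """Translate a DNA string codon by codon; '*' marks a stop codon."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


if __name__ == "__main__":
    print(translate(SEQ_ID_NO_44))
```

In this frame the primer encodes six consecutive histidine codons (CAT followed by five CAC) immediately followed by a TGA stop codon.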
Western Blot Analysis. Western blot analysis of PikAIV followed standard procedures (Sambrook et al., 1989). The total protein of S. venezuelae AX910, AX912, or wild type was first prepared from a four-day culture in either SCM or PGM medium. The protein extract was separated on a 10% SDS-PAGE, transferred to PVDF membrane (Bio-Rad, Hercules, CA), hybridized with anti-6xHis antibody (Qiagen, Valencia, CA), and visualized using a secondary antibody conjugated to alkaline phosphatase (Sigma, St. Louis, MO).
Construction of Complementation Plasmids. The pikA promoter, PpikA, was isolated as an EcoRV-EcoRI fragment between pikAI and pikR1 in the pik cluster (Xue et al., 1998). To create a plasmid for complementation, a DNA fragment encoding PikAV was first PCR-amplified and placed downstream of the EcoRI site in such a way that PikAV was translationally coupled to the leader sequence of pikAI in PpikA to give plasmid pDHS702. Then, plasmids pDHS704, pDHS705, pDHS706, pDHS707, and pDHS708 were constructed by cloning various lengths of the pikAIV-pikAV region into pDHS702, replacing pikAV. The various lengths of pikAIV were PCR-amplified from cosmid pLZ51 (Xue et al., 1998) using the following primer pairs: 5'-GAATTCATCGAGGGGGCGGGCAAGTGA-3' (SEQ ID NO:46) and 5'-ATGCATCAGGTCGTCGGTCACCGTGGGTTCT-3' (SEQ ID NO:47) for pDHS702; 5'-GGATCCGCGCCGGGATGTTCCGCGCCCTGT-3' (SEQ ID NO:48) and 5'-AAAATGCATCAGAGGTCTGTCGGTCACTTGC-3' (SEQ ID NO:49) for pDHS704; 5'-AAAAGATCTTGATGGTGCAGGCGCTGCGCCACGGGGTGCTG-3' (SEQ ID NO:50) and 5'-AAAATGCATCAGAGGTCTGTCGGTCACTTGC-3' (SEQ ID NO:49) for pDHS708; and 5'-AAAAGATC'CCCAACGAACAGTTGGTGGACGCT-3' (SEQ ID NO:51) and 5'-AAAATGCATCAGAGGTCTGTCGGTCACTTGC-3' (SEQ ID NO:49) for pDHS707.
The fragments in pDHS705 (EcoRI-BamHI) and pDHS706 (EcoRI-BglII) were isolated directly from restriction digestion of cosmid pLZ51 (Xue et al., 1998) and ligated into EcoRI-BglII treated pDHS702.
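The cloning sites built into these primers can be located by a simple string search. The sketch below scans four of the primers listed above (SEQ ID NO:46 to NO:49, which are legible in the text) for the recognition sequences of EcoRI, BamHI, BglII and NsiI. The recognition sequences are standard; that the ATGCAT motif in the reverse primers was intended as an NsiI site is an inference, since the text does not name that enzyme. The dictionary and function names are hypothetical.

```python
# Standard recognition sequences of the enzymes considered.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "BglII": "AGATCT",
    "NsiI": "ATGCAT",
}

# Primers as given in the text (5' to 3').
PRIMERS = {
    "SEQ ID NO:46": "GAATTCATCGAGGGGGCGGGCAAGTGA",
    "SEQ ID NO:47": "ATGCATCAGGTCGTCGGTCACCGTGGGTTCT",
    "SEQ ID NO:48": "GGATCCGCGCCGGGATGTTCCGCGCCCTGT",
    "SEQ ID NO:49": "AAAATGCATCAGAGGTCTGTCGGTCACTTGC",
}


def find_sites(sequence: str) -> dict[str, int]:
    """Return {enzyme: 0-based position} for each recognition site found in sequence."""
    sequence = sequence.upper()
    return {
        enzyme: sequence.find(site)
        for enzyme, site in RECOGNITION_SITES.items()
        if site in sequence
    }


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(name, find_sites(primer))
```

Each of the four primers carries a single site from this set near its 5' end, consistent with directional cloning of the amplified fragments.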
Antibiotic Extraction and Analysis. Extraction, identification, and quantitation of methymycin and related compounds followed a procedure developed by Cane et al. (1993), which is summarized in Xue et al. (1998).
Results and Discussion
Deletion of the TE Domain from PikAIV. Production of both 10-deoxymethynolide and narbonolide is mediated by a single PKS cluster (pikA) in S. venezuelae (Xue et al., 1998). The pikA-encoded PKS is composed of the PikAI, PikAII, PikAIII, and PikAIV (Figure 28) multifunctional proteins, which are similar to EryAI-AIII except that PikAIII and PikAIV each contain a single module, in contrast to the bimodular EryAIII (Donadio et al., 1991). Moreover, PikAV is an independent thioesterase (TEII) that is distinct from the thioesterase domain (TE) located at the C-terminus of PikAIV. The modular organization of PikA indicates that PikAI-PikAIII produces a hexaketide that cyclizes into 10-deoxymethynolide, and that PikAI-PikAIV produces a heptaketide that cyclizes into narbonolide (Figure 28). Termination of polyketide assembly at the heptaketide stage is likely catalyzed by the C-terminal TE domain in PikAIV, which is analogous to chain termination in the erythromycin pathway. However, it was not clear how the PikA system terminates polyketide assembly to produce the 12-membered ring aglycone, 10-deoxymethynolide. Genetic evidence excluded PikAV (TEII) as the determining factor in alternative termination since deletion of pikAV reduced the production of both macrolactones (Xue et al., 1998).
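The chain-length bookkeeping in the paragraph above (hexaketide to 12-membered ring, heptaketide to 14-membered ring) can be written out as a small calculation. This is a hedged sketch under a stated assumption: that the macrolactone closes through the hydroxyl on carbon 2n-1 of an n-ketide chain, so the ring holds carbons C-1 through C-(2n-1) plus one oxygen. The function name is hypothetical and used only for illustration.

```python
def macrolactone_ring_size(ketide_units):
    """Ring size for an n-ketide chain, ASSUMING lactonization through the
    hydroxyl on carbon 2n-1 (ring = C-1..C-(2n-1) plus the ester oxygen)."""
    if ketide_units < 2:
        raise ValueError("need at least a starter unit and one extender unit")
    ring_carbons = 2 * ketide_units - 1
    return ring_carbons + 1  # add the ring oxygen


if __name__ == "__main__":
    # Hexaketide (PikAI-PikAIII) and heptaketide (PikAI-PikAIV) from the text
    print(macrolactone_ring_size(6))  # 12
    print(macrolactone_ring_size(7))  # 14
```

Under that assumption, skipping the final condensation by PikAIV shortens the chain by one ketide unit and the ring by two atoms, which matches the 12- and 14-membered products named in the text.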
To study the role of PikAIV in alternative termination, two mutant strains of S. venezuelae were created in which PikAIV was disrupted by deleting the C-terminal thioesterase (TE) domain. In mutant AX910, an in-frame deletion was engineered to remove the TE domain from the S. venezuelae chromosome. In a second mutant, AX912, the TE domain as well as the downstream TEII gene (pikAV) was removed from the bacterial chromosome. As expected, S. venezuelae AX912 is devoid of antibiotic production since the mutant lacks the thioesterase activities that are necessary to release the polyketide chain from the Pik PKS protein. It was expected that the AX910 mutant strain would at least produce the 12-membered ring macrolides methymycin and neomethymycin because the sixth condensation cycle catalyzed by PikAIV is not required for 10-deoxymethynolide formation. Surprisingly, mutant AX910 produced trace amounts of pikromycin; however, methymycin and neomethymycin were completely absent from the fermentation broth. Since the mutant contains an in-frame deletion of the pikAIV-encoded TE domain, the potential for a downstream polar effect (on the pikAV-encoded TEII enzyme) was avoided. This result suggested that PikAIV, or at least the TE domain within PikAIV, is involved directly in the production of the 12- as well as 14-membered ring macrolactones.
Probing the Expression of pikAIV. To investigate the differential expression of pikAIV using culture conditions for methymycin (SCM medium) or pikromycin (PGM medium) production, the PikAIV protein was first tagged by a hexa-histidine sequence replacing the TE domain at its C-terminus. Expression of PikAIV was then probed with anti-6xHis antibody in a Western blot that revealed a single protein band under conditions for either methymycin or pikromycin production in the mutant strains (AX910 and AX912). Interestingly, the protein detected from cell extracts obtained under culture conditions for methymycin production (SCM medium) was approximately 25 kDa lower in molecular weight compared to the protein detected under conditions for pikromycin production (PGM medium). The molecular weight of the protein detected under pikromycin culture conditions is 110 kDa, which is consistent with the predicted TE-truncated (6xHis-tag replaced) form of PikAIV. Therefore, the protein detected under conditions for methymycin production must be an N-terminal truncated form of PikAIV (Figure 41). Indeed, two potential alternative translation start sites have been located in the pikAIV sequence, with either predicted to generate the truncated form of PikAIV. The presumed alternative expression of pikAIV creates a protein product that contains only half of the Pik module 6 KS (KS6) domain (Figure 41). This result immediately pointed to a mechanism for alternative termination in the PikA system. Since the KS6 domain is responsible for the condensation of the final extender unit, a PKS that is unable to catalyze this reaction could only produce the 12-membered ring macrolactone.
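The size argument from the Western blot can be made concrete with a rough calculation. The sketch below is illustrative only: it takes the 110 kDa band and the approximately 25 kDa shift from the text and converts the shift to an approximate residue count. The average residue mass of 110 Da is a common rule of thumb and is an assumption here, not a value from the text.

```python
# Assumption: rule-of-thumb average mass of one amino acid residue, in daltons.
AVG_RESIDUE_MASS_DA = 110.0


def truncated_mass_kda(full_length_kda, shift_kda):
    """Apparent mass of the shorter band, given the larger band and the shift."""
    return full_length_kda - shift_kda


def approx_residues(mass_kda):
    """Approximate number of residues that account for a given mass."""
    return round(mass_kda * 1000.0 / AVG_RESIDUE_MASS_DA)


if __name__ == "__main__":
    print(truncated_mass_kda(110.0, 25.0))  # 85.0 kDa band under SCM conditions
    print(approx_residues(25.0))            # roughly 227 residues missing
```

A loss on the order of 200 or more N-terminal residues is of the right magnitude to remove a substantial part of a KS domain, which is in line with the interpretation given above; SDS-PAGE mobilities are approximate, so the figure should be read only as an estimate.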
Complementation of PikAIV. To investigate the functioning of the truncated form of PikAIV, the contribution of various domains in the multifunctional protein was tested by genetic complementation of S. venezuelae mutant strain AX912. An SCP2*-based low copy number plasmid (Lydiate et al., 1985) was designed and the target gene (comprised of alternative-length forms of pikAIV) was placed under the control of the native pikA promoter (Xue et al., 1998). Using this system, the expression of pikAIV from the plasmid would most closely resemble its normal temporal expression profile, and would also be synchronized with expression of the pikA cluster encoded on the S. venezuelae chromosome. This system was used to test the ability of alternative forms of the pikAIV-pikAV region (Figure 41) to complement the TE-TEII double mutant strain AX912.
The results clearly demonstrated that the TE domain in PikAIV is critical for deoxymethynolide formation. Specifically, all of the plasmid constructs that contain the TE domain, including pDHS704 (TE alone), pDHS705 (ACP6-TE), pDHS706 (ACP6-TE::TEII), pDHS708 (AT6-ACP6-TE), and pDHS707 (KS6-AT6-ACP6-TE), complemented mutant AX912 to give 10-deoxymethynolide. Interestingly, other domains in the truncated form of PikAIV, especially the AT domain, were necessary for effective production of deoxymethynolide. The most efficient production of 10-deoxymethynolide resulted from complementation by pDHS708 (AT6-ACP6-TE), which contains the AT domain and closely mimics the truncated form of PikAIV detected in wild type S. venezuelae under conditions for methymycin production (Figure 41). The relatively efficient complementation by the TE domain alone (pDHS704) leading to 10-deoxymethynolide is especially intriguing and may result from two possible (or one of the two) complementation scenarios. Specifically, it may involve interaction of the TE domain directly with PikAIII (Figure 42C) and/or formation of a wild type-like PKS complex (Figure 42B) by the TE domain expressed from the plasmid interacting with the rest of PikAIV (expressed from the corresponding AX912 chromosomal allele) through noncovalent interactions.
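The construct series is easier to compare when the domain content of each plasmid is tabulated. The following sketch simply encodes the domain compositions stated in the text (pDHS702 carries pikAV, that is, TEII alone) and filters them by domain; the dictionary and function names are hypothetical and used only for illustration.

```python
# Domain content of each complementation plasmid, as stated in the text.
CONSTRUCTS = {
    "pDHS702": ["TEII"],
    "pDHS704": ["TE"],
    "pDHS705": ["ACP6", "TE"],
    "pDHS706": ["ACP6", "TE", "TEII"],
    "pDHS708": ["AT6", "ACP6", "TE"],
    "pDHS707": ["KS6", "AT6", "ACP6", "TE"],
}


def constructs_with(domain):
    """Return the sorted names of plasmids whose insert includes the domain."""
    return sorted(name for name, domains in CONSTRUCTS.items() if domain in domains)


if __name__ == "__main__":
    # Plasmids carrying the TE domain: the set reported to restore
    # 10-deoxymethynolide production in AX912.
    print(constructs_with("TE"))
    # Plasmids that also carry the AT6 domain.
    print(constructs_with("AT6"))
```

Membership is tested against whole domain names, so "TE" does not match "TEII"; pDHS702 is therefore excluded from the TE-containing set, in keeping with its role as the TEII-only control.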

Interestingly, the TE domain alone did not complement AX912 (TE-TEII double mutant) to give narbonolide production (Figure 41). This is consistent with a recent result (Gokhale et al., 1999) obtained from the erythromycin PKS system suggesting that the TE domain may not interact significantly with its natural endogenous module (e.g., EryAIII or PikAIV) but must be covalently linked to be functional. However, the failure to complement may be due in part to introduction of the hexa-histidine at the C-terminus of the engineered PikAIV protein in AX912. Interestingly, pDHS708 (AT6-ACP6-TE) did complement under culture conditions for pikromycin production, resulting in equal amounts of 10-deoxymethynolide and narbonolide (Figure 41). This product pattern occurs due to formation of hetero- and homodimeric structures of PikAIV as shown in Figure 42E and Figure 42F, respectively. These results are in accord with a model in which an N-terminal truncated form of PikAIV is responsible for 10-deoxymethynolide formation while expression of full-length PikAIV is responsible for narbonolide production.
Comparing the complementation of pDHS705 (ACP6-TE) and pDHS706 (ACP6-TE::TEII) further revealed the activity of pik TEII. Although TEII alone is not sufficient for polyketide termination (as shown in pDHS702 complementation, see Figure 41), the independent thioesterase did enhance the production of both 10-deoxymethynolide and narbonolide (Figure 41). Particularly in the case of narbonolide formation, the presence of TEII in pDHS706 (ACP6-TE::TEII) complementation helped to boost polyketide production to a level that was otherwise undetectable in AX912 (pDHS705 (ACP6-TE)). This accessory role of TEII is consistent with previous observations in the pikromycin system (Xue et al., 1998), as well as with other PKS (Rangaswamy et al., 1998) and non-ribosomal peptide synthetase (NRPS) systems (Schneider et al., 1998).
Mechanistic Model for the Alternative Termination by PikAIV. The complementation experiments described above strongly suggest that TE is the key enzymatically active domain in the truncated PikAIV polypeptide, although the entire protein (including AT, ACP, TE, and probably a partial KS domain) is much more effective for polyketide production. A structural model based on the proposed helical form of the erythromycin PKS complex (Staunton et al., 1996) was developed to illustrate the role of PikAIV in alternative termination in the pik-encoded PKS. Under conditions for pikromycin production, wild type S. venezuelae expresses a full-length PikAIV module, which interacts with PikAIII and elongates the growing polyketide chain on ACP5 by adding a methylmalonate unit (the activity of KS6) to ultimately produce the 14-membered ring macrolactone, narbonolide (Figure 42A). On the other hand, the truncated form of PikAIV that lacks KS6 is expressed under culture conditions for methymycin production. The molecular space left unoccupied by KS6 truncation is then presumably filled by the TE domain that would be aligned to interact directly with ACP5 to release the 12-membered ring macrolactone (Figure 42B). In both cases, the main part of PikAIV is predicted to remain fixed. A small movement of the TE domain into the unoccupied space (left by truncation) would result in the bypass of the AT6-ACP6 catalytic domains in the truncated PikAIV, while retaining thioesterase activity. Evidently, the main function of truncated PikAIV is to serve as a scaffold that orients the TE domain and stabilizes the interacting complex between PikAIII and PikAIV, thereby greatly increasing the production of 10-deoxymethynolide.
Efficient production of 10-deoxymethynolide by a truncated form of PikAIV suggests that the AT, rather than the KS, domain plays a pivotal role in the structure and function of modular PKS. The KS6-truncated form of PikAIV generated from the pDHS708 (AT6-ACP6-TE) complementation plasmid probably forms a heterodimer with the product of the corresponding AX912 chromosomal allele to generate narbonolide (Figure 42E), and it also efficiently forms a homodimer to produce 10-deoxymethynolide (Figure 42F). However, this dimerization capacity was severely limited when the AT6 domain was truncated in pDHS705 (ACP6-TE). Furthermore, the complete absence of complementation by pDHS704 (TE alone) to give narbonolide (under culture conditions for pikromycin production) suggests that a dominant interaction exists between KS6 and PikAIII (Figure 42D), which may be the primary basis of module-module recognition and docking in multifunctional PKS systems. The pikA system in S. venezuelae provides a unique opportunity as well as a powerful tool to study these fundamental interactions in further detail.
It is valuable to compare alternative termination by differential expression of PikAIV in S. venezuelae with engineered polyketide chain-length manipulations from other PKS systems. In the erythromycin PKS, the TE domain from EryAIII was moved to upstream domains and covalently linked to alternative ACPs, resulting in truncated polyketides (Cortes et al., 1995; Kao et al., 1995). In each case, the capacity for producing the full-length polyketide product was subsequently eliminated. In contrast, by linking the TE domain of PikAIV to an upstream module by protein-protein interactions, S. venezuelae retains the capacity to generate two alternative-sized macrolactones. Sequence analysis (Xue et al., 1998) suggested that pikA may have evolved from a six-module PKS that generated a 14-membered ring macrolactone. It is, therefore, interesting to consider that the structural and regulatory evolution of pikA to produce the rare 12-membered ring macrolactone may be the result of endogenous genetic selection to overcome antibiotic resistance within the ecological milieu of the antibiotic-producing microorganism. The pikA system provides a natural example of a branched metabolic pathway with the capacity to generate multiple macrolactone systems that may be readily exploited for combinatorial biosynthetic creation of novel natural products.
A mutant of S. venezuelae (KdesV-41) was constructed that had the desV gene disrupted (Zhao et al., J. Am. Chem. Soc., 120, 12159 (1998)). Since desV encodes the 3-aminotransferase that catalyzes the conversion of the 3-keto sugar 17 (Figure 42) to the corresponding amino sugar 4, deletion of this gene should prevent C-3 transamination, resulting in the accumulation of 17. It was expected that if the glycosyltransferase (DesVII) of this pathway is capable of recognizing and processing the keto sugar intermediate 17, the macrolide product(s) produced by the KdesV-41 mutant should have an attached 3-keto sugar. Surprisingly, the two products isolated were the methymycin/neomethymycin analogues 18 and 19, each carrying a 4,6-dideoxyhexose (Figure 43). While this result demonstrated a relaxed specificity for the glycosyltransferase toward its sugar substrate, it also indicated the existence of a pathway-independent reductase in S. venezuelae that can stereospecifically reduce the C-3 keto group of the sugar metabolite.
To explore the possibility of generating a mutant capable of synthesizing new macrolides of this class containing an engineered sugar, the desI gene, which has been proposed to encode the dehydrase responsible for the C-4 deoxygenation in the biosynthesis of desosamine, was altered with the prediction that it would lead to the incorporation of D-quinovose (22; Figure 44), also known as 6-deoxy-D-glucose, into the final product(s). The rationale was based on the following: (1) Desosamine biosynthesis will be "terminated" at the C-4 deoxygenation step due to desI deletion and, thus, should result in the accumulation of 3-keto-6-deoxyhexose 16 (Figure 42). (2) By taking advantage of the existence of a 3-ketohexose reductase in S. venezuelae, the sugar intermediate 15 is expected to be reduced stereospecifically to D-quinovose (22). (3) The glycosyltransferase (DesVII), with its relaxed specificity toward the sugar substrate, should catalyze the coupling of 22 to the macrolactones to give new macrolides 20 and 21 containing the engineered sugar D-quinovose (Figure 44).

A disruption plasmid, pDesI-K, derived from pKC1139, which contains an apramycin resistance marker, was constructed in which desI was replaced by the neomycin resistance gene, which also confers resistance to kanamycin. This construct was then introduced into wild type S. venezuelae by conjugal transfer using Escherichia coli S17-1 as the donor strain (Bierman et al., 1992). Several double crossover mutants were identified on the basis of their kanamycin-resistant (KanR) and apramycin-sensitive (AprS) phenotypes. One mutant, KdesI-80, was selected and grown at 29°C in seed medium (100 mL) for 48 hours and then inoculated and grown in vegetative medium (5 L) for another 48 hours (Cane et al., 1993). The fermentation broth was centrifuged to remove cellular debris and mycelia, and the supernatant was adjusted to pH 9.5 with concentrated potassium hydroxide solution. The resulting solution was extracted with chloroform, and the pooled organic extracts were dried over sodium sulfate and evaporated to dryness. The yellow oil was subjected to flash chromatography on silica gel using a gradient of 0-12% methanol in chloroform, and the isolated products were further purified by HPLC using a C18 column eluted isocratically with 50% acetonitrile in water. As expected, no methymycin or neomethymycin was detected; instead, 10-deoxymethynolide 23 was found as the major product (approximately 600 mg). Significant quantities of methynolide 24 (approximately 40 mg) and neomethynolide 25 (approximately 2 mg) were also isolated (Figure 45). A new macrolide 15 containing D-quinovose (3.2 mg) was produced by this mutant. Its structure was fully established by spectral analyses. Spectral data (J values are in hertz) for 15: 1H NMR (CDCl3) δ 6.76 (1H, dd, J = 16.0, 5.5, 9-H), 6.43 (1H, d, J = 16.0, 8-H), 4.97 (1H, ddd, J = 8.4, 5.9, 2.5, 11-H), 4.29 (1H, d, J = 8.0, 1'-H), 3.62 (1H, d, J = 10.5, 3-H), 3.49 (1H, t, J = 9.0, 3'-H), 3.36 (1H, dd, J = 9.0, 8.0, 2'-H), 3.32 (1H, dd, J = 8.5, 5.5, 5'-H), 3.23 (1H, dd, J = 9.0, 8.5, 4'-H), 2.82 (1H, dq, J = 10.5, 7.0, 2-H), 2.64 (1H, m, 10-H), 2.55 (1H, m, 6-H), 1.70 (1H, m, 12a-H), 1.66 (1H, bt, J = 12.5, 5b-H), 1.56 (1H, m, 12b-H), 1.40 (1H, dd, J = 12.5, 4.5, 5a-H), 1.35 (3H, d, J = 7.0, 2-Me), 1.31 (3H, d, J = 5.5, 5'-Me), 1.24 (1H, bdd, J = 10.0, 4.5, 4-H), 1.21 (3H, d, J = 7.0, 6-Me), 1.11 (3H, d, J = 6.5, 10-Me), 1.00 (3H, d, J = 7.0, 4-Me), 0.92 (3H, t, J = 7.5, 12-Me); 13C NMR (CDCl3) δ 205.0 (C-7), 174.7 (C-1), 146.9 (C-9), 125.9 (C-8), 102.9 (C-1'), 85.4 (C-3), 76.5 (C-3'), 75.5 (C-4'), 74.7 (C-2'), 73.9 (C-11), 71.6 (C-5'), 45.0 (C-6), 43.9 (C-2), 37.9 (C-10), 34.1 (C-5), 33.4 (C-4), 25.2 (C-12), 17.7 (6-Me), 17.5 (5'-Me), 17.4 (4-Me), 16.2 (2-Me), 10.3 (12-Me), 9.6 (10-Me); high-resolution FAB-MS calculated for C23H38O8 (M + H)+ 443.2644, found 443.2661.
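The high-resolution mass figures can be cross-checked with a short calculation. The sketch below is illustrative and rests on stated assumptions: the molecular formula is read as C23H38O8, standard monoisotopic masses are used, and the (M + H)+ ion is modeled as the neutral molecule plus one proton. The function names are hypothetical.

```python
# Standard monoisotopic masses (u); the formula C23H38O8 is read from the text.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON_MASS = 1.00727646  # assumption: (M + H)+ modeled as M plus one proton


def mh_plus(formula):
    """Monoisotopic m/z of the singly protonated molecule."""
    neutral = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    return neutral + PROTON_MASS


def ppm_error(found, calculated):
    """Mass error of a measurement in parts per million."""
    return (found - calculated) / calculated * 1e6


if __name__ == "__main__":
    macrolide_15 = {"C": 23, "H": 38, "O": 8}
    print(round(mh_plus(macrolide_15), 4))
    # Error between the reported found and calculated values
    print(round(ppm_error(443.2661, 443.2644), 1))
```

With these inputs the computed value agrees with the reported calculated mass to within about 0.001 u (the small residual depends on whether a hydrogen atom or a bare proton is added), and the reported found value lies roughly 4 ppm above the reported calculated value.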

The fact that macrolide 15 containing D-quinovose is indeed produced by the desI mutant is significant. First, the formation of quinovose as predicted further corroborates the presence of a pathway-independent reductase in S. venezuelae that reduces the 3-keto sugars. Interestingly, this reductase is able to act on the 4,6-dideoxy sugar 17 as well as the 6-deoxy sugar 16, suggesting that it is oblivious to the presence of a hydroxyl group at C-4. However, it is not clear at this point whether the reduction occurs on the free sugar or after it is appended to the aglycone. Second, the retention of the 4-OH in quinovose as a result of desI deletion provides strong evidence supporting the assigned role of desI to encode a C-4 dehydrase. Moreover, the results again show that the glycosyltransferase (DesVII) of this pathway can recognize alternative sugar substrates whose structures are considerably different from the original amino sugar substrate desosamine. While the incorporation of quinovose is important, another noteworthy, albeit unexpected, result was the fact that the aglycone of the isolated macrolide 15 was 10-deoxymethynolide 23 instead of methynolide 24 and neomethynolide 25. It is possible that the cytochrome P450 hydroxylase (PikC), which catalyzes the hydroxylation of 10-deoxymethynolide at either its C-10 or C-12 position (Xue et al., Chem. Biol., 5, 661 (1998)), is sensitive to structural variations in the appended sugar. It could be argued that the presence of the 4-OH group in the sugar moiety is somehow responsible for decreasing or preventing hydroxylation of the macrolide.
Thus, the results demonstrate the feasibility of combining pathway-dependent genetic manipulations and pathway-independent enzymatic reactions to engineer a sugar of designed structure. It is conceivable that the pathway-independent enzymes could also be used in concert with the natural biosynthetic machinery to generate further structural diversity, which can provide an array of random compounds.
References
Andersen, J. R., Hutchinson, C. R. J. Bacteriol., 174:725-735 (1992).
Aparicio, J. F., Molnar, I., Schwecke, T., Konig, A., Haydock, S. F., Khaw, L. E., Staunton, J., Leadlay, P. F. Gene, 169:9-16 (1996).
Arisawa, A., Kawamura, N., Takeda, K., Tsunekawa, H., Okamura, K., Okamoto, R. Appl. Environ. Microbiol., 60:2657-2660 (1994).
August, P. R., Tang, L., Yoon, Y. J., Ning, S., Muller, R., Yu, T. W., Taylor, M., Hoffmann, D., Kim, C. G., Zhang, X., Hutchinson, C. R. & Floss, H. G. Chem. Biol., 5:69-79 (1998).

Baltz, R. H., Seno, E. T. Annu. Rev. Microbiol., 42:547-574 (1988).
Bibb, M. J., Bibb, M. J., Ward, J. M., Cohen, S. N. Mol. Gen. Genet., 199:26-36 (1985).
Bierman, M., Logan, R., O'Brien, K., Seno, G., Nagaraja, R., Schoner, B. E. Gene, 116:43-49 (1992).
Box, R. P. Clin. Infect. Dis., 24:S151 (1997).
Cane, D. E., Lambalot, R. H., Prabhakaran, P. C., Ott, W. R. J. Am. Chem. Soc., 115:522-526 (1993).
Carreras, C. W., Pieper, R., Khosla, C. In Bioorganic Chemistry: Deoxysugars, Polyketides & Related Classes: Synthesis, Biosynthesis, Enzymes, Rohr, J. (ed.), Springer: Berlin, 85-126 (1997).
Castle, L. A., Smith, K. D., Morris, R. O. J. Bacteriol., 174:1478-1486 (1992).
Celmer, W. D., Nagel, A. A., Wadlow, J. W., Tatematsu, H., Ikenaga, S., Nakanishi, S. Abstracts of Papers of 24th Intersci. Conf. on Antimicrob. Agents Chemother., No. 1142, Washington, D. C. (1985).
Cortes, J., Haydock, S. F., Roberts, G. A., Bevitt, D. J., Leadlay, P. F. Nature, 348:176-8 (1990).
Cortes, J., Wiesmann, K. E., Roberts, G. A., Brown, M. J., Staunton, J., Leadlay, P. F. Science, 268:1487-9 (1995).
Cundliffe, E. Annu. Rev. Microbiol., 43:207-233 (1989).
Cundliffe, E. Antimicrob. Agents Chemother., 35:348-352 (1992).
Davies, J. Nature, 383:219-220 (1996).
Denis, F., Brzezinski, R. Gene, 111:115-118 (1992).
Djerassi, C., Zderic, J. A. J. Am. Chem. Soc., 78:6390-6395 (1956).
Donadio, S., McAlpine, J. B., Sheldon, P. J., Jackson, M., Katz, L. Proc. Natl. Acad. Sci. U.S.A., 90:7119-7123 (1993).
Donadio, S., Staver, M. J., McAlpine, J. B., Swanson, S. J., Katz, L. Science, 252:675-9 (1991).
Donadio, S., Katz, L. Gene, 111:51-60 (1992).
Donin, M. N., Pagano, J., Dutcher, J. D., McKee, C. M. Antibiot. Annu., 1:179-185 (1953-1954).
Epp, J., Huber, M. L. B., Turner, J. R., Goodson, T., Schoner, B. E. Gene, 85:293- (1989).

Flinn, E. H., Sigal, M. V., Jr., Wiley, P. F., Gerzon, K. J. Am. Chem. Soc., 76:3121-3131 (1954).
Gaisser, S., Bohm, G. A., Cortes, J., Leadlay, P. F. Mol. Gen. Genet., 256:239- (1997).
Gandecha, A. R., Large, S. L., Cundliffe, E. Gene, 184:197-203 (1997).
Geistlich, M., Losick, R., Turner, J. R., Rao, R. N. Mol. Microbiol., 6:2019- (1992).
Gokhale, R. S., Hunziker, D., Cane, D. E., Khosla, C. Chem. Biol., 6:117-125 (1999).
Haydock, S. F., Dowson, J. A., Dhillon, N., Roberts, G. A., Cortes, J., Leadlay, P. F. Mol. Gen. Genet., 230:120-128 (1991).
Hernandez, C., Olano, C., Mendez, C., Salas, J. A. Gene, 134:139-140 (1993).
Hopwood, D. A., Sherman, D. H. Annu. Rev. Genet., 24:37-66 (1990).
Hopwood, D. A., Malpartida, F., Kieser, H. M., Ikeda, H., Duncan, J., Fujii, I., Rudd, B. A., Floss, H. G., Omura, S. Nature, 314:642-644 (1985).
Hopwood, D. A., Bibb, M. J., Chater, K. J., Kieser, T., Bruton, C. J., Kieser, H. M., Lydiate, D. J., Smith, C. P., Ward, J. M., Schrempf, H., Genetic Manipulation of Streptomyces: A Laboratory Manual (The John Innes Foundation) (1985).
Hori et al., Chem. Comm., 304 (1971).
Hutchinson, C. R., Fujii, I. Annu. Rev. Microbiol., 49:201-238 (1995).
Ingrosso, D., Fowler, A. V., Bleibaum, J., Clarke, S. J. Biol. Chem., 264:20130-20139 (1989).
Jacobsen, J. R., Hutchinson, C. R., Cane, D. E., Khosla, C. Science, 277:367- (1997).
Jenkins, G., Cundliffe, E. Gene, 108:55-62 (1991).
Kakavas, S. J., Katz, L., Stassi, D. J. Bacteriol., 179:7515-22 (1997).
Kao, C. M., Luo, G. L., Katz, L., Cane, D. E., Khosla, C. J. Am. Chem. Soc., 117:9105-9106 (1995).
Katz, L., Donadio, S. Annu. Rev. Microbiol., 47:875-912 (1993).
Katz, L., Chem. Rev., 97:2557-2575 (1997).
Khosla, C., Chem. Rev., 97:2577-2590 (1997).
Khosla, C., Zawada, R. J. Trends Biotechnol., 14:335-341 (1996).

Kirschning, A., Bechthold, A. F.-W., Rohr, J. In Bioorganic Chemistry: Deoxysugars, Polyketides & Related Classes: Synthesis, Biosynthesis, Enzymes, Rohr, J. (ed.), Springer: Berlin, 1-84 (1997).
Kramer, P. J., Khosla, C. Ann. N.Y. Acad. Sci., 799:32-45 (1996).
Kuo, M.-S., Chirby, D. G., Argoudelis, A. D., Cialdella, J. I., Coats, J. H., Marshall, V. P. Antimicrob. Agents Chemother., 33:2089-2091 (1989).
Lambalot, R. H., Cane, D. E. J. Antibiot., 45:1981-1982 (1992).
Lin, E. C. C., Goldstein, R., Syvanen, M. Bacteria, Plasmids, and Phage: An Introduction to Molecular Biology, Harvard University Press: Cambridge, p. 123 (1984).
Liu, H.-w., Thorson, J. S. Annu. Rev. Microbiol., 48:223-256 (1994).
Lydiate, D. J., Malpartida, F., Hopwood, D. A. Gene, 35:223-235 (1985).
Madduri, K., Kennedy, J., Rivola, G., Inventi-Solari, A., Filippini, S., Zanuso, G., Colombo, A. L., Gewain, K. M., Occi, J. L., MacNeil, D. J., Hutchinson, C. R. Nat. Biotech., 16:69-74 (1998).
Mangahas, F. R. MS Thesis, University of Minnesota, 1996.
Marahiel, M. A., Stachelhaus, T., Mootz, H. D., Chem. Rev., 97:2651-2673 (1997).
Marsden, A. F. A., Wilkinson, B., Cortes, J., Dunster, N. J., Staunton, J., Leadlay, P. F. Science, 279:199-201 (1998).
Merson-Davies, L. A., Cundliffe, E. Mol. Microbiol., 13:349-355 (1994).
Merson-Davies, L. A., Cundliffe, E. Mol. Microbiol., 13:347-355 (1994).
Motamedi, H., Cai, S. J., Shafiee, A., Elliston, K. O. Eur. J. Biochem., 244:74-80 (1997).
Muth, G., Nussbaumer, B., Wohlleben, W., Puhler, A. Mol. Gen. Genet., 219:341-348 (1989).
Niemi, J., Mantsala, P. J. Bacteriol., 177:2942-2945 (1995).
Omura, S. (ed.) Macrolide Antibiotics: Chemistry, Biology and Practice, Academic Press: New York (1984).
Omura et al., J. Antibiot., 24, 316 (1971).
Rangaswamy, V., Mitchell, R., Ullrich, M., Bender, C. J. Bacteriol., 180:3330-3338 (1998).
Sambrook, J., Fritsch, E. F., Maniatis, T. Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory Press), 2nd edition (1989).
Sasaki, J., Mizoue, K., Morimoto, S., Omura, S. J. Antibiot., 49:1110-1118 (1996).

Schneider, A., Marahiel, M. A., Arch. Microbiol., 169:404-410 (1998).
Schupp, T., Toupet, C., Cluzel, B., Neff, S., Hill, S., Beck, J. J., Ligon, J. M., J. Bacteriol., 177:3673-9 (1995).
Schwecke, T., Aparicio, J. F., Molnar, I., Konig, A., Khaw, L. E., Haydock, S. F., Oliynyk, M., Caffrey, P., Cortes, J., Lester, J. B., et al. Proc. Natl. Acad. Sci. U.S.A., 92:7839-7843 (1995).
Seo, S., Tomita, Y., Tori, K., Yoshimura, Y. J. Am. Chem. Soc., 100:3331-3339 (1978).
Service, R. F. Science, 270:724-727 (1995).
Stassi, D., Donadio, S., Staver, M. J., Katz, L. J. Bacteriol., 175:182-189 (1993).
Staunton, J., Caffrey, P., Aparicio, J. F., Roberts, G. A., Bethell, S. S., Leadlay, P. F. Nat. Struct. Biol., 3:188-192 (1996).
Staunton, J., Wilkinson, B., Chem. Rev., 97:2611-2629 (1997).
Summers, R. G., Donadio, S., Staver, M. J., Wendt-Pienkowski, E., Hutchinson, C. R., Katz, L. Microbiology, 143:3251-3262 (1997).
Swan, D. G., Rodriguez, A. M., Vilches, C., Mendez, C., Salas, J. A. Mol. Gen. Genet., 242:358-362 (1994).
Tuan, J. S., Weber, J. M., Staver, M. J., Leung, J. O., Donadio, S., Katz, L. Gene, 90:21-29 (1990).
Vilches, C., Hernandez, C., Mendez, C., Salas, J. A. J. Bacteriol., 174:161-165 (1992).
von Heijne, G. Nucleic Acids Res., 14:4683-4690 (1986).
von Heijne, G., Abrahmsen, L. FEBS Lett., 244:439-446 (1989).
Weber, J. M., Leung, J. O., Swanson, S. J., Idler, K. B., McAlpine, J. B. Science, 252:114-117 (1991).
Xue, Y., Zhao, L., Liu, H.-w., Sherman, D. H. Proc. Natl. Acad. Sci. U.S.A., 95:12111-12116 (1998).
The complete disclosures of all patents, patent documents and publications cited herein are incorporated herein by reference as if individually incorporated. The foregoing detailed description and examples have been given for clarity of understanding only. No unnecessary limitations are to be understood therefrom. The invention is not limited to the exact details shown and described, for variations obvious to one skilled in the art will be included within the invention defined by the claims.

SEQUENCE LISTING
<110> Regents of the University of Minnesota, et al.
<120> DNA encoding methymycin and pikromycin
<130> 600.438W01
<150> US 09/105,537
<151> 1998-06-26
<160> 53
<170> FastSEQ for Windows Version 3.0
<210> 1
<211> 15872
<212> DNA
<213> Streptomyces Venezuelae <400> 1 ttaattaaggaggaccatcatgaac:gaggccatcgccgtcgtcggcatgtcctgccgcct60 gccgaaggcctcgaacccggccgccatctgggagctgctgcggaacggggagagcgccgt120 25caccgacgtgccctccggccggtgc~acgtcggtgctcgggggagcggacgccgaggagcc180 ggcggagtccggtgtccgccggggc:ggcttcctcgactccctcgacctcttcgacgcggc240 cttcttcggaatctcgccccgtgac~gccgccgccatggacccgcagcagcgactggtcct300 cgaactcgcctgggaggcgctggac~gacgccggaatcgtccccggcaccctcgccggaag360 ccgcaccgccgtcttcgtcggcacc:ctgcgggacgactacacgagcctcctctaccagca420 30cggcgagcaggccatcacccagcac:accatggcgggcgtgaaccggggcgtcatcgccaa480 ccgcgtctcgtaccacctcggcctc~cagggcccgagcctcaccgtcgacgccgcgcagtc540 gtcctcgctcgtcgccgtgcacctc~gcctgcgagtccctgcgcgccggggagtccacgac600 ggcgctcgtcgcrggcgtgaacctc:aacatcctcgcggagagcgccgtgacggaggagcg660 cttcggtggactctccccggacggc:accgcctacaccttcgacgcgcgggccaacggatt720 35cgtccggggcgagggcggcggagtc:gtcgtactcaagccgctctcccgcgccctcgccga780 cggcgaccgtgtc:cacggcgtcatc:cgcgccagcgccgtcaacaacgacggagccacccc840 gggtctcaccgtgcccagcagggcc:gcccaggagaaggtgctgcgcgaggcgtaccggaa900 ggcggccctggacccgtccgccgtccagtacgtcgaactccacggcaccggaacccccgt960 cggcgaccccatcgaggccgccgcc~ctcggcgccgtcctcggctcggcgcgccccgcgga1020 cgaacccctgctcgtcggctcggccaagacgaacgtcgggcacctcgaaggcgccgccgg 1080 catcgtcggcctcatcaagacgctcctcgcgctcggccggcgccggatcccggcgagcct 1140 caacttccgtacgccccacccggacatcccgctcgacaccctcgggctcgacgtgcccga 1200 cggcctgcgggagtggccgcacccg~gaccgcgaactcctcgccggcgtcagctcgttcgg 1260 Scatgggcggcaccaacgcccacgtc:gtcctcagcgaaggccccgcccagggcggcgagca 1320 gcccggcatcgatgaggagacccccgtcgacagcggggccgcactgcccttcgtcgtcac 1380 cggccgcggcggcgaggccctgcgcgcccaggcccggcgcctgcacgaggccgtcgaagc 1440 ggacccggagctcgcgcccgccgca.ctcgcccggtcgctggtcaccacccgtacggtctt 1500 cacgcaccggtcggtcgtcctcgccccggaccgcgcccgcctcctcgacggcctcggcgc 1560 l0cctcgccgccgggacgcccgcgcccggcgtggtcaccggcacccccgcccccgggcgcct 1620 cgccgtcctgttcagcggccagggtgcccaacgtacgggcatgggcatggagttgtacgc 1680 cgcccaccccgccttcgcgacggccttcgacgccgtcgccgccgaactggaccccctcct 1740 
cgaccggcccctcgccgaactcgtcgcggcgggcgacaccctcgaccgcaccgtccacac 1800 acagcccgcgctcttcgccgtggaggtcgccctccaccgcctcgtcgagtcctggggcgt 1860 l5cacgcccgacctgctcgccggccactccgtcggcgagatcagcgccgcccacgtcgccgg 1920 ggtcctgtcgctgcgcgacgccgcccgcctcgtcgcggcgcgcggccgcctcatgcaggc 1980 gctccccgagggcggcgcgatggtcgcggtcgaggcgagcgaggaggaagtgcttccgca 2040 cctcgcgggacgcgagcgggagctctccctcgcggccgtgaacggcccccgcgcggtcgt 2100 cctcgcgggcgccgagcgcgccgtcctcgacgtcgccgagctgctgcgcgaacagggccg 2160 20ccggacgaagcggctcagcgtctcgcacgccttccactcgccgctcatggagccgatgct 2220 cgacgacttccgccgggtcgtcgaagagctggacttccaggagccccgcgtcgacgtcgt 2280 gtccacggtgacgggcctgcctgtcacagcgggccaatggaccgatcccgagtactgggt 2340 ggaccaggtccgcaggcccgtacgcttcctcgacgccgtacgcaccctggaggaatcggg 2400 cgccgacaccttcctggagctcggtcccgacggggtctgctccgcgatggcggcggactc 2460 25cgtacgcgaccaggaggccgccacggcggtctccgccctgcgcaagggccgcccggagcc 2520 ccagtcgctgctcgccgcactcaccaccgtcttcgtccggggccacgacgtcgactggac 2580 cgccgcgcacgggagcaccggcacggtcagggtgcccctgccgacctacgccttccagcg 2640 cgaacgccactggttcgacggcgccgcgcgaacggcggcgccgctcacggcgggccgatc 2700 gggcaccggtgcgggcaccggcccggccgcgggtgtgacgtcgggcgagggcgagggcga 2760 30gggcgagggcgcgggtgcgggtggcggtgatcggccggctcgccacgagacgaccgagcg 2820 cgtgcgcgcacacgtcgccgccgtcctcgagtacgacgacccgacccgcgtcgaactcgg 2880 cctcaccttcaaggagctgggcttcgactccctcatgtccgtcgagctgcggaacgcgct 2940 cgtcgacgacacgggactgcgcctgcccagcggactgctcttcgaccacccgacgccgcg 3000 cgccctcgccgcccacctgggcgacctgctcaccggcggcagcggcgagaccggatcggc 3060 3Scgacgggataccgcccgcgaccccggcggacaccaccgccgagcccatcgcgatcatcgg 3120 catggcctgccgctaccccggcggcgtcacctcccccgaggacctgtggcggctcgtcgc 3180 cgaggggcgcgacgccgtctcggggctgcccaccgaccgcggctgggacgaggacctctt 3240 cgacgccgaccccgaccgcagcggc:aagagctcggtccgcgagggcggattcctgcacga 3300 cgccgccctgttcgacgccggcttcttcgggatatcgccccgcgaggccctcggcatgga 3360 40cccgcagcagcggctgctcctggag~acggcatgggaggccgtggagcgcgcagggctcga 3420 ccccgaaggcctcaagggcagccggacggccgtcttcgtcggcgccaccgccctggacta3480 cggcccgcgcatgcacgacggcgccgagggcgtcgagggccacctcctgaccgggaccac3540 
gcccagcgtgatgtcgggccgcatc:gcctaccagctcggcctcaccggtcctgcggtcac3600 cgtcgacacggccagctcgtcctc<~ctcgtcgcgctgcacctggccgtccgttcgctgcg3660 Sgcagggcgagtcgagcctcgcgcac:gccggcggagcgaccgtcatgtcgacaccgggcat3720 gttcgtcgagttctcgcggcagcgcggcctcgccgccgacggccgctccaaggccttctc3780 cgactccgccgacggcacctcctgggccgagggcgtcggcctcctcgtcgtcgagcggct3840 ctcggacgccgagcgcaacggcc:ac:cccgtgctcgccgtgatccggggcagcgcggtcaa3900 ccaggacggcgcctccaacgggctc:accgcccccaacggcccgtcccagcagcgcgtcat3960 lOccgacaggccctggccgacgccgggctcac.cccggccgacgtcgacgccgtcgaggcgca4020 cggtacgggtacccggctcggcgac:cccatcgaggccgaggcgatcctcggcacctacgg4080 ccgggaccggggcgagggcgctccc~ctccagctcggctcgctgaagtcgaacatcggcca4140 cgcgcaggccgccgcgggcgtggg<:gggctcatcaagatggtcctcgcgatgcgccacgg4200 cgtcctgcccaggacgctccacgtc~gaccggcccaccacccgcgtcgactgggaggccgg4260 l5cggcgtcgagctcctcaccgaggac~cgggagtggccggagacgggccgcccgcgccgcgc4320 ggcgatctcctccttcggcatcagc:ggcaccaacgcccacatcgtggtcgaacaggcccc4380 ggaagccggggaggcggcggtcacc:accaccgccccggaagcaggggaagccggggaagc4440 ggcggacaccaccgccaccacgacc~ccggccgcggtcggcgtccccgaacccgtacgcgc4500 ccccgtcgtggtctccgcgcgggac:gccgccgccctgcgcgcccaggccgttcggctgcg4560 20gaccttcctcgacggccgaccggac:gtcaccgtcgccgacctcggacgctcgctggccgc4620 ccgtaccgccttcgagcacaaggcc:gccctcaccaccgccaccagggacgagctgctcgc4680 cgggctcgacgccctcggccgcgg<~gagcaagccacgggcctggtcaccggcgaaccggc4740 cagggccggacgcacggccttcctc~ttcaccggccagggagcgcagcgcgtcgccatggg4800 cgaggaactgcgcgccgcgcacccc:gtgttcgccgccgccctcgacaccgtgtacgcggc4860 25cctcgaccgtcacctcgaccggccc~ctgcgggagatcgtcgccgccggggaggagctgga4920 cctcaccgcgtacacccagcccgcc:ctcttcgccttcgaggtggcgctgttccgcctcct4980 cgaacaccacggcctcgtccccgac:ctgctcaccggccactccgtcggcgagatcgccgc5040 cgcgcacgtcgccggtgtcctctcc:ctcgacgacgccgcacgtctcgtcaccgcccgcgg5100 ccggctcatgcagtcggcccgcgac~ggcggcgcgatgatcgccgtgcaggcgggcgaggc5160 30cgaggtcgtcgagtccctgaagggcaacgagggcagggtcgccgtcgccgccgtcaacgg5220 acccaccgccgtggtcgtctccggc:gacgcggacgccgccgaggagatccgcgccgtatg5280 ggcgggacgcggccggcgcacccgc:aggctgcgcgtcagccacgccttccactccccgca5340 
catggacgacgtcctcgacgagttc:ctccgggtcgccgagggcctgaccttcgaggagcc5400 gcggatccccgtcgtctccacggtc:accggcgcgctcgtcacgtccggcgagctcacctc5460 35gcccgcgtactgggtcgaccagatc:cggcggcccgtgcgcttcctggacgccgtccgcac5520 cctggccgcccaggacgcgaccgtc:ctcgtcgagatcggccccgacgccgtcctcacggc5580 actcgccgaggaggctctcgcgccc:ggcacggacgccccggacgcccgggacgtcacggt5640 cgtcccgctgctgcgcgcggggcgc:cccgagcccgagaccctcgccgccggtctcgcgac5700 cgcccatgtccacggcgcacccttggaccgggcgtcgttcttcccggacgggcgccgcac5760 40ggacctgcccacgtacgccttccggcgcgagcactactggctgacgcccgaggcccgtac5820 ggacgcccgcgcactcggcttcgacccggcgcggcacccgctgctgacgaccacggtcga 5880 ggtcgccggcggcgacggcgtcctgctgaccggccgtctctccctgaccgaccagccctg 5940 gctggccgaccacatggtcaacggcgccgtcctgttgccggccaccgccttcctggagct 6000 cgccctcgcggcgggcgaccacgtcggggcggtccgggtggaggaactcaccctcgaagc 6060 $gccgctcgtcctgcccgagcggggcgccgtccgcatccaggtcggcgtgagcggcgacgg 6120 cgagtcgccggccgggcgcaccttcggtgtgtacagcacccccgactccggcgacaccgg 6180 tgacgacgcgccccgggagtggacccgccatgtctccggcgtactcggcgaaggggaccc 6240 ggccacggagtcggaccaccccggcaccgacggggacggttcagcggcctggccgcctgc 6300 ggcggcgaccgccacacccctcgacggcgtctacgaccggctcgcggagctcggctacgg 6360 l0atacggtccggccttccagggcctgacggg.gctgtggcgcgacggcgccgacacgctcgc 6420 cgagatccggctgcccgcggcgcagcacgagagcgcggggctcttcggcgtacacccggc 6480 gctgctcgacgcggcgctccacccgatcgtcctggagggcaactcagctgccggtgcctg 6540 tgacgccgataccgacgcgaccgaccggatccggctgccgttcgcgtgggcgggggtgac 6600 cctccacgccgaaggggccaccgcgctccgcgtacggatcacacccaccggcccggacac 6660 1$ggtcacgctccgcctcaccgacaccaccggtgcgcccgtggccaccgtggagtccctgac 6720 cctgcgcgcggtggcgaaggaccggctgggcaccaccgccgggcgcgtcgacgacgccct 6780 gttcacggtcgtgtggacggagaccggcacaccggaacccgcagggcgcggagccgtgga 6840 ggtcgaggaactcgtcgacctcgccggcctcggcgacctcgtggagctcggcgccgcgga 6900 cgtcgtcctccgggccgaccgctggacgctcgacggggacccgtccgccgccgcgcgcac 6960 20agccgtccggcgcaccctcgccatcgtccaggagttcctgtccgagccgcgcttcgacgg 7020 ctcgcgactggtgtgcgtcaccaggggcgcggtcgccgcactccccggcgaggacgtcac 7080 ctccctcgccaccggccccctctggggcctcgtccgctccgcccagtccgagaacccggg 7140 
acgcctgttcctcctggacctgggtgaaggcgaaggcgagcgcgacggagccgaggagct 7200 gatccgcgcggccacggccggggacgagccgcagctcgcggcacgggacggccgactgct 7260 2$cgcgccgaggctggcccgtaccgccgccctttcgagtgaggacaccgccggcggcgccga 7320 ccgtttcggccccgacggcaccgtcctcgtcaccgggggcaccggaggcctcggagcgct 7380 cctcgcccgccacctcgtggagcgtcacggggtgcgccggctgctgctggtgagccgccg 7440 cggggccgacgcccccggcgcggccgacctgggcgaggacctcgcgggcctcggcgcgga 7500 ggtggcgttcgccgccgccgacgccgccgaccgcgagagcctggcgcgggcgatcgccac 7560 30cgtgcccgccgagcatccgctgacggccgtcgtgcacacggcgggagtcgtcgacgacgc 7620 gacggtggaggcgctcacaccggaacggctggacgcggtactgcgcccgaaggtcgacgc 7680 cgcgtggaacctgcacgagctcaccaaggacctgcggctcgacgccttcgtcctcttctc 7740 ctccgtctccggcatcgtcggcaccgccggccaggccaactacgcggcggccaacacggg 7800 cctcgacgccctcgccgcccaccgcgccgccacgggcctggccgccacgtcgctggcctg 7860 3$gggcctctgggacggcacgcacggcatgggcggcacgctcggcgccgccgacctcgcccg 7920 ctggagccgggccggaatcaccccgctcaccccgctgcagggcctcgcgctcttcgacgc 7980 cgcggtcgccagggacgacgccctcctcgtacccgccgggctccgtcccaccgcccaccg 8040 gggcacggacggacagcctcctgcgctgtggcgcggcctcgtccgggcgcgcccgcgccg 8100 tgccgcgcggacggccgccgaggcg~gcggacacgaccggcggctggctgagcgggctcgc 8160 40cgcacagtcccccgaggagcggcgcagcacagccgtcacgctcgtgacgggtgtcgtcgc 8220 ggacgtcctcgggcacgccgactccgccgcggtcggggcggagcggtccttcaaggacct8280 cggcttcgactccctggccggggtggagctccgcaaccggctgaacgccgccaccggcct8340 gcggctccccgcgaccacggtcttcgaccatccctcgccggccgcgctcgcgtcccatct8400 cctcgcccaggtgcccgggttgaaggaggggacggcggcgaccgcgaccgtcgtggccga8460 Sgcggggcgcttccttcggtgaccgtc~cgaccgacgacgatccgatcgcgatcgtgggcat8520 ggcatgccgctatccgggtggtgtgt:cgtcgccggaggacctgtggcggctggtggccga8580 ggggacggacgcgatcagcgagttcc:ccgtcaaccgcggctgggacctggagagcctcta8640 cgacccggatcccgagtcgaagggcaccacgtactgccgggagggcgggttcctggaagg8700 cgccggtgacttcgacgccgccttctacggcatctcgccgcgcgaggccctggtgatgga8760 lOcccgcagcagcggctgctgctggagc~tgtcctgggaggcgctggaacgcgcgggcatcga8820 cccgtcctcgctgcgcggcagccgcc~gtggtgtctacgtgggcgccgcgcacggctcgta8880 cgcctccgatccccggctggtgcr,cc~agggctcggagggctatctgctgaccggcagcgc8940 
cgacgcggtgatgtccggccgcatct:cctacgcgctcggtctcgaaggaccgtccatgac9000 ggtggagacggcctgctcctcctcgcaggtggcgctgcatctggcggtacgggcgctgcg9060 1$gcacggcgagtgcgggctcgcgctgc~cgggcggggtggcggtgatggccgatccggcggc9120 gttcgtggagttct:cccggcagaagc~ggctggccgccgacggccgctgcaaggcgttctc9180 ggccgccgccgacggcaccggctggc~ccgagggcgtcggcgtgctcgtcctggagcggct9240 gtcggacgcgcgccgcgcggggcacacggtcctcggcctggtcaccggcaccgcggtcaa9300 ccaggacggtgcctccaacgggctgaccgcgcccaacggcccagcccagcaacgcgtcat9360 20cgccgaggcgctcgccgacgccgggcagtccccggaggacgtggacgcggtcgaggcgca9420 cggcaccggcacccggctcggcgacc;ccatcgaggccggggcgctgctcgccgcctccgg9480 acggaaccgttccggcgaccacccgcagtggctcggctcgctgaagtccaacatcgggca9540 tgcccaggccgccgccggtgtcggcc~gcgtcatcaagatgctccaggcgctgcggcacgg9600 cttgctgccccgcaccctccacgccc~acgagccgaccccgcatgccgactggagctccgg9660 25ccgggtacggctgctcacctccgagc~tgccgtggcagcggaccggccggccccggcggac9720 cggggtgtcc,gcct.tcggcgtcggcc~gcaccaatgcccatgtcgtcctcgaagaggcacc9780 cgccccgcccgcgccggaaccggccc~gggaggcccccggcggctcccgcgccgcagaagg9840 ggcggaagggcccctggcctgggtgc~tctccggacgcgacgagccggccctgcggtccca9900 ggcccggcggctccgcgaccacct:ct:cccgcacccccggggcccgcccgcgtgacatcgc9960 30cttctccctcgccgccacgcgcgcac~cctttgaccaccgcgccgtgctgatcggctcgga10020 cggggccgaactcgccgccgccctgc~acgcgttggccgaaggacgcgacggtccggcggt10080 ggtgcgcggagtccgcgaccgggacc~gcaggatggccttcctcttcaccgggcagggcag10140 ccagcgcgccgggatggcccacgaccagcatgccgcccataccttcttcgcgtccgccct10200 cgacgaggtgacggaccgtctcgacc:cgctgctcggccggccgctcggcgcgctgctgga10260 35cgcccgacccggct:cgcccgaagcgc~cactcctggaccggaccgagtacacccagccggc10320 gctcttcgccgtcgaggtggcgctcc:accggctgctggagcactgggggatgcgccccga10380 cctgctgctggggcactcggtgggcc~aactggcggccgcccacgtcgcgggtgtgctcga10440 tctcgacgacgcctgcgcgctggtgc~ccgcccgcggcaggctgatgcagcgcctgccgcc10500 cggcggcgcgatggtctccgtgcggc~ccggcgaggacgaggtccgcgcactgctggccgg10560 40ccgcgaggacgccgtctgcgtcgccc~cggtgaacggcccccggtcggtggtgatctccgg10620 cgcggaggaa gcggtggccgaggcggcggcgcagctcgccggacgaggccgccgcaccag10680 gcggctccgc 
gtcgcgcacgccttcc:actcacccctgatggacggcatgctcgccggatt10740 ccgggaggtc gccgccggcctgcgct:accgggaaccggagctgacggtcgtctccacggt10800 cacggggcgg cccgcccgccccggtc~aactcaccggccccgactactgggtggcccaggt10860 Sccgtgagcccgtgcgcttcgcggacc~cggtccgcacggcacaccgcctcggagcccgcac10920 cttcctggag accggcccggacggcc~tgctgtgcggcatggcagaggagtgcctggagga10980 , cgacaccgtg gcccagctgccggcgatccacaagcccggcaccgcgccgcacggtccggc11040 ggctcccggc gcgctgcgggcggcc<~ccgccgcgtacggccggggcgcccgggtggactg11100 ggccgggatg cacgccgacggccccc~aggggccggcccgccgcgtcgaactgcccgtcca11160 lOcgccttccggcaccgccgctactggctcgccccgggccgcgcggcggacaccgacgactg11220 gatgtaccgg atcggctgggaccggcagccggctgtgaccggcggggcccggaccgccgg11280 ccgctggctg gtgatccaccccgacagcccgcgctgccgggagctgtccggccacgccga11340 acgcgcgctg cgcgccgcgggcgcgagccccgtaccgctgcccgtggacgctccggccgc11400 cgaccgggcg tcctacgcggcactg<agcgctccgccaccggacctgacacacgaggtga11460 l5cacagccgcgcccgtggccggtgtgcagtcgctgctgtccgaggaggatcggccccatcg11520 ccagcacgcc ccggtacccgccggggtcctggcgacgctgtccctgatgcaggctatgga11580 ggaggaggcg gtggaggctcgcgtgt:ggtgcgtctcccgcgccgcggtcgccgccgccga11640 ccgggaacgg cccgtcggcgcgggcgccgccctgtgggggctggggcgggtggccgccct11700 ggaacgcccc acccggtggggcggtctcgtggacctgcccgcctcgcccggtgcggcgca11760 20ctgggcggccgccgtggaacggct:cgccggtcccgaggaccagatcgccgtgcgcgcgtc11820 cggcagttgg ggccggcgcctcaccaggctgccgcgcgacggcggcggccggacggccgc11880 acccgcgtac cggccgcgcggcacggtgctcgtcaccggtggcaccggcgcgctcggcgg11940 gcatctcgcc cgctggctcgccgcgcacgggcgccgaacacctggcgctcaccagccgccg12000 gggcccggac gcgc:ccggcgccgccggactcgaggccgaactcctcctcctgggcgccaa12060 25ggtgacgttcgccgcctgcgacaccgccgaccgcgacggcctcgcccgggtcctgcgggc12120 gataccggag gacaccccgctcaccc3cggtgttccacgccgcgggcgtaccgcaggtcac12180 gccgctgtcc cgtacctcgcccgagcacttcgccgacgtgtacgcgggcaaggcggcggg12240 cgccgcgcac ctggacgaactgacccgcgaactcggcgccggactcgacgcgttcgtcct12300 ctactcctcc ggcgccggcgtctgg<3gcagcgccggccagggtgcctacgccgccgccaa12360 30cgccgccctggacgcgctcgcccggcgccgtgcggcggacggactccccgccacctccat12420 cgcctggggc 
gtgtggggcggcggcggtatgggggccgacgaggcgggcgcggagtatct12480 gggccggcgc ggtatgcgccccatggcaccggtctccgcgctccgggcgatggccaccgc12540 catcgcctcc ggggaaccctgccccaccgtcacccacaccgactgggagcgcttcggcga12600 gggcttcacc gccttccggcccagccctctgatcgcggggctcggcacgccgggcggcgg12660 35ccgggcggcggagacccccgaggaggggaacgccaccgctgcggcggacctcaccgccct12720 gccgcccgcc gaactccgcaccgcgctgcgcgagctggtgcgagcccggaccgccgcggc12780 gctcggcctc gacgacccggccgag~gtcgccgagggcgaacggttccccgccatgggctt12840 cgactccctg gccaccgtacggctgcgccgcggactcgcctcggccacgggcctcgacct12900 gccccccgat ctgctcttcgaccgggacaccccggccgcgctcgccgcccacctggccga12960 40actgctcgccaccgcacgggaccacggacccggcggccccgggaccggtgccgcgccggc13020 cgatgccgga agcggcctgc cggccctcta ccgggaggcc gtccgcaccg gccgggccgc 13080 ggaaatggcc gaactgctcg ccgccgcttc ccggttccgc cccgccttcg ggacggcgga 13140 ccggcagccg gtggccctcg tgccgctggc cgacggcgcg gaggacaccg ggctcccgct 13200 gctcgtgggc tgcgccggga cggcggtggc ctccggcccg gtggagttca ccgccttcgc 13260 Scggagcgctg gcggacctcc cggcggcggc cccgatggcc gcgctgccgc agcccggctt 13320 tctgccggga gaacgagtcc cggccacccc ggaggcattg ttcgaggccc aggcggaagc 13380 , gctgctgcgc tacgcggccg gccggccctt cgtgctgctg gggcactccg ccggcgccaa 13440 catggcccac gccctgaccc gtcatctgga ggcgaacggt ggcggccccg cagggctggt 13500 gctcatggac atctacaccc ccgccgaccc cggcgcgatg ggcgtctggc ggaacgacat 13560 l0gttccagtgg gtctggcggc gctcggacat.ccccccggac gaccaccgcc tcacggccat 13620 gggcgcctac caccggctgc ttctcgactg gtcgcccacc cccgtccgcg cccccgtact 13680 gcatctgcgc gccgcggaac ccatgggcga ctggccaccc ggggacaccg gctggcagtc 13740 ccactgggac ggcgcgcaca ccaccgccgg catccccgga aaccacttca cgatgatgac 13800 cgaacacgcc tccgccgccg cccgg~ctcgt gcacggctgg ctcgcggaac ggaccccgtc 13860 lScgggcagggc gggtcaccgt cccgcgcggc ggggagagag gagaggccgt gaacacggca 13920 gccggcccga ccggcaccgc cgccg~gcggc accaccgccc cggcggcggc acacgacctg 13980 tcccgcgccg gacgcaggct ccaactcacc cgggccgcac agtggttcgc cggcaaccag 14040 ggagacccct acgggatgat cctgc:gcgcc ggcaccgccg acccggcacc gtacgaggaa 14100 gagatccccg ggtaccgagc 
tcgaa.ttctt aattaaggag gtcgtagatg agtaacaaga 14160 20acaacgatga gctgcagcgg caggcctcgg aaaacaccct ggggctgaac ccggtcatcg 14220 gtatccgccg caaagacctg ttgagctcgg cacgcaccgt gctgcgccag gccgtgcgcc 14280 aaccgctgca cagcgccaag catgt.ggccc actttggcct ggagctgaag aacgtgctgc 14340 tgggcaagtc cagccttgcc ccgga~aagcg acgaccgtcg cttcaatgac ccggcatgga 14400 gcaacaaccc act.ttaccgc cgcta~cctgc aaacctatct ggcctggcgc aaggagctgc 14460 2Saggactggat cggcaacagc gacct:gtcgc cccaggacat cagccgcggc cagttcgtca 14520 tcaacctgat gaccgaagcc atggca ccga ccaacaccct gtccaacccg gcagcagtca 14580 aacgcttctt cgaaaccggc ggcaagagcc tgctcgatgg cctgtccaac ctggccaagg 14640 acctggtcaa caacggtggc atgcc:cagcc aggtgaacat ggacgccttc gaggtgggca 14700 agaacctggg caccagtgaa ggcgc:cgtgg tgtaccgcaa cgatgtgctg gagctgatcc 14760 30agtacaagcc catcaccgag caggt:gcatg cccgcccgct gctggtggtg ccgccgcaga 14820 tcaacaagtt ctacgtattc gacct:gagcc cggaaaagag cctggcacgc tactgcctgc 14880 gctcgcagca gcagaccttc atcat:cagct ggcgcaaccc gaccaaagcc cagcgcgaat 14940 ggggcctgtc caectacatc gacgc:gctca aggaggcggt cgacgcggtg ctggcgatta 15000 ccggcagcaa ggacctgaac atgct:cggtg cctgctccgg cggcatcacc tgcacggcat 15060 3Stggtcggcca ctatgccgcc ctcggcgaaa acaaggtcaa tgccctgacc ctgctggtca 15120 gcgtgctgga caccaccatg gacaaccagg tcgccctgtt cgtcgacgag cagactttgg 15180 aggccgccaa gcgccactcc taccaggccg gtgtgctcga aggcagcgag atggccaagg 15240 tgttcgcctg gatgcgcccc aacgacctga tctggaacta ctgggtcaac aactacctgc 15300 tcggcaacga gccgccggtg ttcg;acatcc tgttctggaa caacgacacc acgcgcctgc 15360 40cggccgcctt ccacggcgac ctgatcgaaa tgttcaagag caacccgctg acccgcccgg 15420 acgccctggaggtttgcggcactcegatcgacctgaaacaggtcaaatgcgacatctaca15480 gccttgccggcaccaacgaccacatcaccccgtggcagtcatgctaccgctcggcgcacc15540 tgttcggcggcaagatcgagttcgt:gctgtccaacagcggccacatccagagcatcctca15600 acccgccaggcaaccccaaggcgcgcttcatgaccggtgccgatcgcccgggtgacccgg15660 $tggcctggcaggaaaacgccaccaagcatgccgactcctggtggctgcactggcaaagct15720 ggctgggcgagcgtgccggcgagct:ggaaaaggcgccgacccgcctgggcaaccgtgcct15780 , 
atgccgctgg cgaggcatcc ccgggcacct acgttcacga gcgttgagct gcagcgccgt 15840
ggccacctgc gggacgccac ggtgttgaat tc 15872

<210> 2
<211> 5215
<212> PRT
<213> Streptomyces venezuelae <400> 2 Met Asn Glu Ala Ile Ala Val Val Gly Met Ser Cys Arg Leu Pro Lys 20A1a Ser Asn Pro Ala Ala Phe Trp Glu Leu Leu Arg Asn Gly Glu Ser Ala Val Thr Asp Val Pro Ser Gly Arg Trp Thr Ser Val Leu Gly Gly Ala Asp Ala Glu Glu Pro Ala Glu Ser Gly Val Arg Arg Gly Gly Phe Leu Asp Ser Leu Asp Leu Phe Asp Ala Ala Phe Phe Gly Ile Ser Pro Arg Glu Ala Ala Ala Met Asp Pro Gln Gln Arg Leu Val Leu Glu Leu 3UAla Trp Glu Ala Leu Glu Asp Ala Gly Ile Val Pro Gly Thr Leu Ala Gly Ser Arg Thr Ala Val Phe Val Gly Thr Leu Arg Asp Asp Tyr Thr Ser Leu Leu Tyr Gln His Gly Glu Gln Ala Ile Thr Gln His Thr Met 3$ 130 135 140 Ala Gly Val Asn Arg Gly Val Ile Ala Asn Arg Val Ser Tyr His Leu Gly Leu Gln Gly Pro Ser Leu Thr Val Asp Ala Ala Gln Ser Ser Ser 40Leu Val Ala Val His Leu Ala Cys Glu Ser Leu Arg Ala Gly Glu Ser Thr Thr Ala Leu Val Ala Gly Val Asn Leu Asn Ile Leu Ala Glu Ser Ala Val Thr Glu Glu Arg Phe Gly Gly Leu Ser Pro Asp Gly Thr Ala $ 210 21!i 220 Tyr Thr Phe Asp Ala Arg A1<~ Asn Gly Phe Val Arg Gly Glu Gly Gly Gly Val Val Val Leu Lys Pro Leu Ser Arg Ala Leu Ala Asp Gly Asp I~Arg Val His Gly Val Ile Arch Ala Ser Ala Val Asn Asn Asp Gly Ala Thr Pro Gly Leu Thr Val Pro Ser Arg Ala Ala Gln Glu Lys Val Leu Arg Glu Ala Tyr Arg Lys Ala Ala Leu Asp Pro Ser Ala Val Gln Tyr 1$ 290 295 300 Val Glu Leu His Gly Thr Gly Thr Pro Val Gly Asp Pro Ile Glu Ala Ala Ala Leu Gly Ala Val Leu Gly Ser Ala Arg Pro Ala Asp Glu Pro ZOLeu Leu Val Gly Ser Ala Lys Thr Asn Val Gly His Leu Glu Gly Ala Ala Gly Ile Val Gly Leu Ile Lys Thr Leu Leu Ala Leu Gly Arg Arg Arg Ile Pro Ala Ser Leu Asn. 
Phe Arg Thr Pro His Pro Asp Ile Pro 2$ 370 375 380 Leu Asp Thr Leu Gly Leu Asp Val Pro Asp Gly Leu Arg Glu Trp Pro His Pro Asp Arg Glu Leu Leu Ala Gly Val Ser Ser Phe Gly Met Gly 3~Gly Thr Asn Ala His Val Val Leu Ser Glu Gly Pro Ala Gln Gly Gly Glu Gln Pro Gly Ile Asp Glu Glu Thr Pro Val Asp Ser Gly Ala Ala Leu Pro Phe Val Val Thr Gly Arg Gly Gly Glu Ala Leu Arg Ala Gln 3$ 450 455 460 Ala Arg Arg Leu His Glu Ala Val Glu Ala Asp Pro Glu Leu Ala Pro Ala Ala Leu Ala Arg Ser Leu Val Thr Thr Arg Thr Val Phe Thr His 4~Arg Ser Val Val Leu Ala Pro Asp Arg Ala Arg Leu Leu Asp Gly Leu Gly Ala Leu Ala Ala Gly Thr Pro Ala Pro Gly Val Val Thr Gly Thr Pro Ala Pro Gly Arg Leu Ala Val Leu Phe Ser Gly Gln Gly Ala Gln $ 530 535 540 Arg Thr Gly Met Gly Met Glu Leu Tyr Ala Ala His Pro Ala Phe Ala Thr Ala Phe Asp Ala Val Ala Ala Glu Leu Asp Pro Leu Leu Asp Arg lOPro Leu Ala Glu Leu Val Ala Ala Gly Asp Thr Leu Asp Arg Thr Val His Thr Gln Pro Ala Leu Phe Ala Val Glu Val Ala Leu His Arg Leu Val Glu Ser Trp Gly Val Thr Pro Asp Leu Leu Ala Gly His Ser Val 1$ 6i0 615 620 Gly Glu Ile Ser Ala Ala His Val Ala Gly Val Leu Ser Leu Arg Asp Ala Ala Arg Leu Val Ala Ala Arg Gly Arg Leu Met Gln Ala Leu Pro 20G1u Gly Gly Ala Met Val Ala Val Glu Ala Ser Glu Glu Glu Val Leu Pro His Leu Ala Gly Arg Glu Arg Glu Leu Ser Leu Ala Ala Val Asn Gly Pro Arg Ala Val Val Leu Ala Gly Ala Glu Arg Ala Val Leu Asp 2$ 690 695 700 Val Ala Glu Leu Leu Arg Glu Gln Gly Arg Arg Thr Lys Arg Leu Ser Val Ser His Ala Phe His Ser Pro Leu Met Glu Pro Met Leu Asp Asp 30Phe Arg Arg Val Val Glu Giu Leu Asp Phe Gln Glu Pro Arg Val Asp Val Val Ser Thr Val Thr Gly Leu Pro Val Thr Ala Gly Gln Trp Thr Asp Pro Glu Tyr Trp Val Asp Gln Val Arg Arg Pro Val Arg Phe Leu 3$ 770 775 780 Asp Ala Val Arg Thr Leu Glu Glu Ser Gly Ala Asp Thr Phe Leu Glu Leu Gly Pro Asp Gly Val Cys Ser Ala Met Ala Ala Asp Ser Val Arg 40Asp Gln Glu Ala Ala Thr Ala Val Ser Ala Leu Arg Lys Gly Arg Pro 1 ~1 Glu Pro Gln Ser Leu Leu Ales Ala Leu Thr Thr Val Phe Val Arg Gly 
His Asp Val Asp Trp Thr Ala Ala His Gly Ser Thr Gly Thr Val Arg $ 850 855 860 Val Pro Leu Pro Thr Tyr Ala Phe Gln Arg Glu Arg His Trp Phe Asp , Gly Ala Ala Arg Thr Ala Alai Pro Leu Thr Ala Gly Arg Ser Gly Thr lUGly Ala Gly Thr Gly Pro Ala Ala Gly Val Thr Ser Gly Glu Gly Glu Gly Glu Gly Glu Gly Ala Gly Ala Gly G1y Gly Asp Arg Pro Ala Arg His Glu Thr Thr Glu Arg Val Arg Ala His Val Ala Ala Val Leu Glu Tyr Asp Asp Pro Thr Arg Val Glu Leu Gly Leu Thr Phe Lys Glu Leu Gly Phe Asp Ser Leu Met Ser Val Glu Leu Arg Asn A1a Leu Val Asp Z~Asp Thr Gly Leu Arg Leu Pro Ser Gly Leu Leu Phe Asp His Pro Thr Pro Arg Ala Leu Ala Ala His Leu Gly Asp Leu Leu Thr Gly Gly Ser Gly Glu Thr Gly Ser Ala Asp Gly Ile Pro Pro Ala Thr Pro Ala Asp 2$ 1010 1015 1020 Thr Thr Ala Glu Pro Ile Ala Ile Ile Gly Met Ala Cys Arg Tyr Pro Gly Gly Val Thr Ser Pro Glu Asp Leu Trp Arg Leu Val Ala Glu Gly 3~Arg Asp Ala Val Ser Gly Leu Pro Thr Asp Arg Gly Trp Asp Glu Asp Leu Phe Asp Ala Asp Pro Asp Arg Ser Gly Lys Ser Ser Val Arg Glu Gly Gly Phe Leu His Asp Ala Ala Leu Phe Asp Ala Gly Phe Phe Gly 3$ 1090 1095 1100 Ile Ser Pro Arg Glu Ala Leu Gly Met Asp Pro Gln Gln Arg Leu Leu Leu Glu Thr Ala Trp Glu Al.a Val Glu Arg Ala Gly Leu Asp Pro Glu 4UGly Leu Lys Gly Ser Arg Thr Ala Val Phe Val Gly Ala Thr Ala Leu Asp Tyr Gly Pro Arg Met Hi~c Asp Gly Ala Glu Gly Val Glu Gly His Leu Leu Thr Gly Thr Thr Pro Ser Val Met Ser Gly Arg Ile Ala Tyr $ 1170 1175 1180 Gln Leu Gly Leu Thr Gly Pro Ala Val Thr Val Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu Ala Val Arg Ser Leu Arg Gln Gly lOGlu Ser Ser Leu Ala Leu Ala Gly Gly Ala Thr Val Met Ser Thr Pro Gly Met Phe Val Glu Phe Ser Arg Gln Arg Gly Leu Ala Ala Asp Gly Arg Ser Lys Ala Phe Ser Asp Ser Ala Asp Gly Thr Ser Trp Ala Glu Gly Val Gly Leu Leu Val Val Glu Arg Leu Ser Asp Ala Glu Arg Asn Gly His Pro Val Leu Ala Val Ile Arg Gly Ser Ala Val Asn Gln Asp 2~Gly Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro Ser Gln Gln Arg Val Ile Arg Gln Ala Leu Ala Asp Ala Gly Leu Thr Pro Ala Asp 
Val Asp Ala Val Glu Ala His Gly Thr Gly Thr Arg Leu Gly Asp Pro Ile Glu Ala Glu Ala Ile Leu Gly Thr Tyr Gly Arg Asp Arg GIy Glu Gly Ala Pro Leu Gln Leu Gly Ser Leu Lys Ser Asn Ile Gly His Ala Gln 30A1a Ala Ala Gly Val Gly Gly Leu Ile Lys Met Val Leu Ala Met Arg His Gly Val Leu Pro Arg Thr Leu His Val Asp Arg Pro Thr Thr Arg Val Asp Trp Glu Ala Gly Gly Val Glu Leu Leu Thr Glu Glu Arg Glu 35 1410 141°.i 1420 Trp Pro Glu Thr Gly Arg Pro Arg Arg Ala Ala Ile Ser Ser Phe Gly Ile Ser Gly Thr Asn Ala His Ile Val Val Glu Gln Ala Pro Glu Ala 4UGly Glu Ala Ala Val Thr Thr Thr Ala Pro Glu Ala Gly Glu Ala Gly Glu Ala Ala Asp Thr Thr Ala Thr Thr Thr Pro Ala Ala Val Gly Val Pro Glu Pro Val Arg Ala Pro Val Val Val Ser Ala Arg Asp Ala Ala Ala Leu Arg Ala Gln Ala Val Arg Leu Arg Thr Phe Leu Asp Gly Arg Pro Asp Val Thr Val Ala Asp Leu Gly Arg Ser Leu Ala Ala Arg Thr l~Ala Phe Glu His Lys Ala Ala Leu Thr Thr Ala Thr Arg Asp Glu Leu Leu Ala Gly Leu Asp Ala Leu Gly Arg Gly Glu Gln Ala Thr Gly Leu Val Thr Gly Glu Pro Ala Arg Ala Gly Arg Thr Ala Phe Leu Phe Thr Gly Gln Gly Ala Gln Arg Val Ala Met Gly Glu Glu Leu Arg Ala Ala His Pro Val Phe Ala Ala Ala Leu Asp Thr Val Tyr Ala Ala Leu Asp 2~Arg His Leu Asp Arg Pro Leu Arg Glu Ile Val Ala Ala Gly Glu Glu Leu Asp Leu Thr Ala Tyr Thr Gln Pro Ala Leu Phe Ala Phe Glu Val Ala Leu Phe Arg Leu Leu Glu His His Gly Leu Val Pro Asp Leu Leu 25 1650 165!i 1660 Thr Gly His Ser Val Gly Glu Ile Ala Ala Ala His Val Ala Gly Val Leu Ser Leu Asp Asp Ala Ala Arg Leu Val Thr Ala Arg Gly Arg Leu 3~Met Gln Ser Ala Arg Glu Gly Gly Ala Met Ile Ala Val Gln Ala Gly Glu Ala Glu Val Val Glu Ser Leu Lys Gly Tyr Glu Gly Arg Val Ala Val Ala Ala Val Asn Gly Pro Thr Ala Val Val Val Ser Gly Asp Ala 3$ 1730 173_°~ 1740 Asp Ala Ala Glu Glu Ile Arg Ala Val Trp Ala Gly Arg Gly Arg Arg Thr Arg Arg Leu Arg Val Ser His Ala Phe His Ser Pro His Met Asp 4~Asp Val Leu Asp Glu Phe Leu Arg Val Ala Glu Gly Leu Thr Phe Glu WO 00/OOb20 PGT/US99/14398 Glu Pro Arg Ile Pro Val VaT. 
Ser Thr Val Thr Gly Ala Leu Val Thr Ser Gly Glu Leu Thr Ser Pro Ala Tyr Trp Val Asp Gln Ile Arg Arg $ 1810 187.5 1820 Pro Val Arg Phe Leu Asp Ala Val Arg Thr Leu Ala Ala Gln Asp Ala , Thr Val Leu Val Glu Ile Gly Pro Asp Ala Val Leu Thr Ala Leu Ala lOGlu Glu Ala Leu Ala Pro Gly Thr Asp Ala Pro Asp Ala Arg Asp Val Thr Val Val Pro Leu Leu Arg Ala Gly Arg Pro Glu Pro Glu Thr Leu Ala Ala Gly Leu Ala Thr Ala. His Val His Gly Ala Pro Leu Asp Arg 1$ 1890 1895 1900 Ala Ser Phe Phe Pro Asp Gly Arg Arg Thr Asp Leu Pro Thr Tyr Ala Phe Arg Arg Glu His Tyr Trp Leu Thr Pro Glu Ala Arg Thr Asp Ala Z~Arg Ala Leu Gly Phe Asp Pro Ala Arg His Pro Leu Leu Thr Thr Thr Val Glu Val Ala Gly Gly Asp Gly Val Leu Leu Thr Gly Arg Leu Ser Leu Thr Asp Gln Pro Trp Leu Ala Asp His Met Val Asn Gly Ala Val 2$ 1970 1975 1980 Leu Leu Pro Ala Thr Ala Phe Leu Glu Leu Ala Leu Ala Ala Gly Asp His Val Gly Ala Val Arg Val Glu Glu Leu Thr Leu Glu Ala Pro Leu 3~Va1 Leu Pro Glu Arg Gly Ala Val Arg Ile Gln Val Gly Val Ser Gly Asp Gly Glu Ser Pro Ala Gly Arg Thr Phe Gly Val Tyr Ser Thr Pro Asp Ser Gly Asp Thr Gly Asp Asp Ala Pro Arg Glu Trp Thr Arg His 3$ 2050 205.5 2060 Val Ser Gly Val Leu Gly Glu Gly Asp Pro Ala Thr Glu Ser Asp His Pro Gly Thr Asp Gly Asp Gly Ser Ala Ala Trp Pro Pro Ala Ala Ala 4~Thr Ala Thr Pro Leu Asp Gly Val Tyr Asp Arg Leu Ala Glu Leu Gly Tyr Gly Tyr Gly Pro Ala Phe Gln Gly Leu Thr Gly Leu Trp Arg Asp Gly Ala Asp Thr Leu Ala Glu Ile Arg Leu Pro Ala Ala Gln His Glu $ 2130 2135 2140 Ser Ala Gly Leu Phe Gly Val His Pro Ala Leu Leu Asp Ala Ala Leu His Pro Ile Val Leu Glu Gly Asn Ser Ala Ala Gly Ala Cys Asp Ala l~Asp Thr Asp Ala Thr Asp Arg Ile Arg Leu Pro Phe Ala Trp Ala Gly Val Thr Leu His Ala Glu Gly Ala Thr Ala Leu Arg Val Arg Ile Thr Pro Thr Gly Pro Asp Thr Val Thr Leu Arg Leu Thr Asp Thr Thr Gly 15 2210 221_-'i 2220 Ala Pro Val Ala Thr Val Glu Ser Leu Thr Leu Arg Ala Val Ala Lys Asp Arg Leu Gly Thr Thr Ala Gly Arg Val Asp Asp Ala Leu Phe Thr 2~Va1 Val Trp Thr Glu Thr Gly Thr Pro Glu Pro 
Ala Gly Arg Gly Ala Val Glu Val Glu Glu Leu Val Asp Leu Ala Gly Leu Gly Asp Leu Val Glu Leu Gly Ala Ala Asp Val Val Leu Arg Ala Asp Arg Trp Thr Leu 2290 2295. 2300 Asp Gly Asp Pro Ser Ala Ala Ala Arg Thr Ala Val Arg Arg Thr Leu Ala Ile Val Gln Glu Phe Leu Ser Glu Pro Arg Phe Asp Gly Ser Arg 3ULeu Val Cys Val Thr Arg Gly Ala Val Ala Ala Leu Pro Gly Glu Asp Val Thr Ser Leu Ala Thr Gly Pro Leu Trp Gly Leu Val Arg Ser Ala Gln Ser Glu Asn Pro Gly Arg Leu Phe Leu Leu Asp Leu Gly Glu Gly Glu Gly Glu Arg Asp Gly Ala Glu Glu Leu Ile Arg Ala Ala Thr Ala Gly Asp Glu Pro Gln Leu Ala Ala Arg Asp Gly Arg Leu Leu Ala Pro 4~Arg Leu Ala Arg Thr Ala Ala Leu Ser Ser Glu Asp Thr Ala Gly Gly Ala Asp Arg Phe Gly Pro Asp Gly Thr Val Leu Val Thr Gly Gly Thr Gly GIy Leu Gly Ala Leu Leu Ala Arg His Leu Val Glu Arg His Gly $ 2450 2455 2460 Val Arg Arg Leu Leu Leu Val Ser Arg Arg Gly Ala Asp Ala Pro Gly , Ala Ala Asp Leu Gly Glu Asp Leu Ala Gly Leu Gly Ala Glu Val Ala l~Phe Ala Ala Ala Asp Ala Ala Asp Arg Glu Ser Leu Ala Arg Ala Ile Ala Thr Val Pro Ala Glu His Pro Leu Thr Ala Val Val His Thr Ala Gly Val Val Asp Asp Ala Thr Val Glu Ala Leu Thr Pro Glu Arg Leu 1$ 2530 253!i 2540 Asp Ala Val Leu Arg Pro Lys Val Asp Ala Ala Trp Asn Leu His Glu Leu Thr Lys Asp Leu Arg Leu Asp Ala Phe Val Leu Phe Ser Ser Val 20Ser Gly Ile Val Gly Thr Ala Gly Gln Ala Asn Tyr Ala Ala Ala Asn Thr Gly Leu Asp Ala Leu Ala Ala His Arg Ala A1a Thr Gly Leu Ala Ala Thr Ser Leu Ala Trp Gly Leu Trp Asp Gly Thr His Gly Met Gly 2$ 2610 261_°. 2620 Gly Thr Leu Gly Ala Ala Asp Leu Ala Arg Trp Ser Arg Ala Gly Ile Thr Pro Leu Thr Pro Leu Gln Gly Leu Ala Leu Phe Asp Ala Ala Val 3~Ala Arg Asp Asp Ala Leu Leu Val Pro Ala Gly Leu Arg Pro Thr Ala His Arg Gly Thr Asp Gly Gln Pro Pro Ala Leu Trp Arg Gly Leu Val Arg Ala Arg Pro Arg Arg Ala Ala Arg Thr Ala Ala Glu Ala Ala Asp 3$ 2690 2695. 
2700 Thr Thr Gly Gly Trp Leu Ser Gly Leu Ala Ala Gln Ser Pro Glu Glu Arg Arg Ser Thr Ala Val Thr Leu Val Thr Gly Val Val Ala Asp Val 4~Leu Gly His Ala Asp Ser Ala Ala Val Gly Ala Glu Arg Ser Phe Lys Asp Leu Gly Phe Asp Ser Leu Ala Gly Val Glu Leu Arg Asn Arg Leu Asn Ala Ala Thr Gly Leu Arg Leu Pro Ala Thr Thr Val Phe Asp His $ 2770 2775 2780 Pro Ser Pro Ala Ala Leu Ala Ser His Leu Leu Ala Gln Val Pro Gly Leu Lys Glu Gly Thr Ala Ala Thr Ala Thr Val Val Ala Glu Arg Gly lOAla Ser Phe Gly Asp Arg Ala Thr Asp Asp Asp Pro Ile Ala Ile Val Gly Met Ala Cys Arg Tyr Pro Gly Gly Val Ser Ser Pro Glu Asp Leu Trp Arg Leu Val Ala Glu Gly Thr Asp Ala Ile Ser Glu Phe Pro Val 1$ 2850 2855 2860 Asn Arg Gly Trp Asp Leu Glu Ser Leu Tyr Asp Pro Asp Pro Glu Ser Lys Gly Thr Thr Tyr Cys Arg Glu Gly Gly Phe Leu Glu Gly Ala Gly 20Asp Phe Asp Ala Ala Phe Phe Gly Ile Ser Pro Arg Glu Ala Leu Val Met Asp Pro Gln Gln Arg Leu Leu Leu Glu Val Ser Trp Glu Ala Leu Glu Arg Ala Gly Ile Asp Pro Ser Ser Leu Arg Gly Ser Arg Gly Gly Val Tyr Val Gly Ala Ala His Gly Ser Tyr Ala Ser Asp Pro Arg Leu Val Pro Glu Gly Ser Glu Gly Tyr Leu Leu Thr Gly Ser Ala Asp Ala 30Va1 Met Ser Gly Arg Ile Ser Tyr Ala Leu Gly Leu Glu Gly Pro Ser Met Thr Val Glu Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu Ala Val Arg Ala Leu Arg His Gly Glu Cys Gly Leu Ala Leu Ala Gly 3$ 3010 3015, 3020 Gly Val Ala Val Met Ala Asp Pro Ala Ala Phe Val Glu Phe Ser Arg Gln Lys Gly Leu Ala Ala Asp Gly Arg Cys Lys A1a Phe Ser Ala Ala 40A1a Asp Gly Thr Gly Trp Ala Glu Gly Val Gly Val Leu Val Leu Glu Arg Leu Ser Asp Ala Arg Arg Ala Gly His Thr Val Leu Gly Leu Val Thr Gly Thr Ala Val Asn Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro Ala Gln Gln Arg Val Ile Ala Glu Ala Leu Ala Asp , Ala Gly Leu Ser Pro Glu Asp Val Asp Ala Val Glu Ala His Gly Thr l~Gly Thr Arg Leu Gly Asp Pro Ile Glu Ala Gly Ala Leu Leu Ala Ala Ser Gly Arg Asn Arg Ser Gly Asp His Pro Leu Trp Leu Gly Ser Leu Lys Ser Asn Ile Gly His Ala Gln Ala Ala Ala Gly Val Gly Gly Val Ile 
Lys Met Leu Gln Ala Leu Arg His Gly Leu Leu Pro Arg Thr Leu His Ala Asp Glu Pro Thr Pro His Ala Asp Trp Ser Ser Gly Arg Val Z~Arg Leu Leu Thr Ser Glu Val Pro Trp Gln Arg Thr Gly Arg Pro Arg Arg Thr Gly Val Ser Ala Phe Gly Val Gly Gly Thr Asn Ala His Val Val Leu Glu Glu Ala Pro Ala Pro Pro Aia Pro Glu Pro Ala Gly Glu 2$ 3250 3255 3260 Ala Pro Gly Gly Ser Arg Ala .Ala Glu Gly Ala Glu Gly Pro Leu Ala Trp Val Val Ser Gly Arg Asp Glu Pro Ala Leu Arg Ser Gln Ala Arg 3~Arg Leu Arg Asp His Leu Ser iArg Thr Pro Gly Ala Arg Pro Arg Asp Ile Ala Phe Ser Leu Ala Ala 'L'hr Arg Ala Ala Phe Asp His Arg Ala 3315 :3320 3325 Val Leu Ile Gly Ser Asp Gly Ala Glu Leu Ala Ala Ala Leu Asp Ala Leu Ala Glu Gly Arg Asp Gly 1?ro Ala Val Val Arg Gly Val Arg Asp Arg Asp Gly Arg Met Ala Phe Leu Phe Thr Gly Gln Gly Ser Gln Arg 4~Ala Gly Met Ala His Asp Leu Fiis Ala Ala His Thr Phe Phe Ala Ser Ala Leu Asp Glu Val Thr Asp Arg Leu Asp Pro Leu Leu Gly Arg Pro Leu Gly Ala Leu Leu Asp Ala Arg Pro Gly Ser Pro Glu Ala Ala Leu Leu Asp Arg Thr Glu Tyr Thr Gln Pro Ala Leu Phe Ala Val Glu Val Ala Leu His Arg Leu Leu Glu His Trp Gly Met Arg Pro Asp Leu Leu lOLeu Gly His Ser Val Gly Glu Leu Ala Ala Ala His Val Ala Gly Val Leu Asp Leu Asp Asp Ala Cys Ala Leu Val Ala Ala Arg Gly Arg Leu Met Gln Arg Leu Pro Pro Gly Gly Ala Met Val Ser Val Arg Ala Gly 1$ 3490 3495 3500 Glu Asp Glu Val Arg Ala Leu Leu Ala Gly Arg Glu Asp Ala Val Cys Val Ala Ala Val Asn Gly Pro Arg Ser Val Val Ile Ser Gly Ala Glu 2~Glu Ala Val Ala Glu Ala Ala Ala Gln Leu Ala Gly Arg Gly Arg Arg Thr Arg Arg Leu Arg Val Ala His Ala Phe His Ser Pro Leu Met Asp Gly Met Leu Ala Gly Phe Arg Glu Val Ala Ala Gly Leu Arg Tyr Arg 25 3570 357!i 3580 Glu Pro Glu Leu Thr Val Val Ser Thr Val Thr Gly Arg Pro Ala Arg Pro Gly Glu Leu Thr Gly Pro Asp Tyr Trp Val Ala Gln Val Arg Glu 3~Pro Val Arg Phe Ala Asp Ala Val Arg Thr Ala His Arg Leu Gly Ala Arg Thr Phe Leu Glu Thr Gly Pro Asp Gly Val Leu Cys Gly Met Ala Glu Glu Cys Leu Glu Asp Asp Thr Val Ala Leu Leu Pro Ala 
Ile His Lys Pro Gly Thr Ala Pro His Gly Pro Ala Ala Pro Gly Ala Leu Arg Ala Ala Ala Ala Ala Tyr Gly Arg Gly Ala Arg Val Asp Trp Ala Gly 4UMet His Ala Asp Gly Pro Glu Gly Pro Ala Arg Arg Val Glu Leu Pro Val His Ala Phe Arg His Arg Arg Tyr Trp Leu Ala Pro Gly Arg Ala Ala Asp Thr Asp Asp Trp Met Tyr Arg Ile Gly Trp Asp Arg Leu Pro $ 3730 3735 3740 Ala Val Thr Gly Gly Ala Arg Thr Ala Gly Arg Trp Leu Val Ile His Pro Asp Ser Pro Arg Cys Arg Glu Leu Ser Gly His Ala Glu Arg Ala lOLeu Arg Ala Ala Gly Ala Ser Pro Val Pro Leu Pro Val Asp Ala Pro Ala Ala Asp Arg Ala Ser Phe Ala Ala Leu Leu Arg Ser Ala Thr Gly Pro Asp Thr Arg Gly Asp Thr Ala Ala Pro Val Ala Gly Val Leu Ser 1$ 3810 3815 3820 Leu Leu Ser Glu Glu Asp Arg Pro His Arg Gln His Ala Pro Val Pro Ala Gly Val Leu Ala Thr Leu Ser Leu Met Gln Ala Met Glu Glu Glu 20A1a Val Glu Ala Arg Val Trp Cys Val Ser Arg Ala Ala Val Ala Ala Ala Asp Arg Glu Arg Pro Val Gly Ala Gly Ala Ala Leu Trp Gly Leu Gly Arg Val Ala Ala Leu Glu Arg Pro Thr Arg Trp Gly Gly Leu Val 2$ 3890 389!i 3900 Asp Leu Pro Ala Ser Pro Gly Ala Ala His Trp Ala Ala Ala Val Glu Arg Leu Ala Gly Pro Glu Asp Gln Ile Ala Val Arg Ala Ser Gly Ser 30Trp Gly Arg Arg Leu Thr Arg Leu Pro Arg Asp G1y Gly Gly Arg Thr Ala Ala Pro Ala Tyr Arg Pro Arg Gly Thr Val Leu Val Thr Gly Gly Thr Gly Ala Leu Gly Gly His Leu Ala Arg Trp Leu Ala Ala Ala Gly 3$ 3970 397__°i 3980 Ala Glu His Leu Ala Leu Thr Ser Arg Arg Gly Pro Asp Ala Pro Gly Ala Ala Gly Leu Glu Ala Glu Leu Leu Leu Leu Gly Ala Lys Val Thr 40Phe Ala Ala Cys Asp Thr Ala Asp Arg Asp Gly Leu Ala Arg Val Leu WO 00/00620 PCT/US99l14398 Arg Ala Ile Pro Glu Asp Thr Pro Leu Thr Ala Val Phe His Ala Ala Gly Val Pro Gln Val Thr Pro Leu Ser Arg Thr Ser Pro Glu His Phe $ 4050 4055 4060 Ala Asp Val Tyr Ala Gly Lys Ala Ala Gly Ala Ala His Leu Asp Glu Leu Thr Arg Glu Leu Gly Ala Gly Leu Asp Ala Phe Val Leu Tyr Ser lOSer Gly Ala Gly Val Trp Gly Ser Ala Gly Gln Gly Ala Tyr Ala Ala Ala Asn Ala Ala Leu Asp Ala Leu Ala Arg Arg Arg Ala Ala Asp Gly Leu 
Pro Ala Thr Ser Ile Ala Trp Gly Val Trp Gly Gly Gly Gly Met 1$ 4130 4135 4140 Gly Ala Asp Glu Ala Gly Ala Glu Tyr Leu Gly Arg Arg Gly Met Arg Pro Met Ala Pro Val Ser Ala Leu Arg Ala Met Ala Thr Ala Ile Ala Z~Ser Gly Glu Pro Cys Pro Thr Val Thr His Thr Asp Trp Glu Arg Phe Gly Glu Gly Phe Thr Ala Phe Arg Pro Ser Pro Leu Ile Ala Gly Leu Gly Thr Pro Gly Gly Gly Arg Ala Ala Glu Thr Pro Glu Glu Gly Asn 25 4210 421!i 4220 Ala Thr Ala Ala Ala Asp Leu Thr Ala Leu Pro Pro Ala Glu Leu Arg Thr Ala Leu Arg Glu Leu Val Arg Ala Arg Thr Ala Ala Ala Leu Gly 3~Leu Asp Asp Pro Ala Glu Val Ala Glu Gly Glu Arg Phe Pro Ala Met Gly Phe Asp Ser Leu Ala Thr Val Arg Leu Arg Arg Gly Leu Ala Ser Ala Thr Gly Leu Asp Leu Pro Pro Asp Leu Leu Phe Asp Arg Asp Thr 3$ 4290 4295. 4300 Pro Ala Ala Leu Ala Ala His Leu Ala Glu Leu Leu Ala Thr Ala Arg Asp His Gly Pro Gly Gly Pro Gly Thr Gly Ala Ala Pro Ala Asp Ala 4~Gly Ser Gly Leu Pro Ala Leu Tyr Arg Glu Ala Val Arg Thr Gly Arg Ala Ala Glu Met Ala Glu Leu Leu Ala Ala Ala Ser Arg Phe Arg Pro Ala Phe Gly Thr Ala Asp Arg Gln Pro Val Ala Leu Val Pro Leu Ala $ 4370 4375 4380 Asp Gly Ala Glu Asp Thr Gly Leu Pro Leu Leu Val Gly Cys Ala Gly , Thr Ala Val Ala Ser Gly Pro Val Glu Phe Thr Ala Phe Ala Gly Ala I~Leu Ala Asp Leu Pro Ala Ala Ala Pro Met Ala Ala Leu Pro Gln Pro Gly Phe Leu Pro Gly Glu Arg Val Pro Ala Thr Pro Glu Ala Leu Phe Glu Ala Gln Ala Glu Ala Leu Leu Arg Tyr Ala Ala Gly Arg Pro Phe 1$ 4450 445!5 4460 Val Leu Leu Gly His Ser Ala Gly Ala Asn Met Ala His Ala Leu Thr Arg His Leu Glu Ala Asn Gly Gly Gly Pro Ala Gly Leu Val Leu Met 2~Asp Ile Tyr Thr Pro Ala Asp Pro Gly Ala Met Gly Val Trp Arg Asn Asp Met Phe Gln Trp Val Trp Arg Arg Ser Asp Ile Pro Pro Asp Asp His Arg Leu Thr Ala Met Gly Ala Tyr His Arg Leu Leu Leu Asp Trp 25 4530 453..°i 4540 Ser Pro Thr Pro Val Arg Ala Pro Val Leu His Leu Arg Ala Ala Glu Pro Met Gly Asp Trp Pro Pro Gly Asp Thr Gly Trp Gln Ser His Trp 3~Asp Gly Ala His Thr Thr Ala Gly Ile Pro Gly Asn His Phe Thr Met Met Thr Glu His Ala 
Ser Ala Ala Ala Arg Leu Val His Gly Trp Leu Ala Glu Arg Thr Pro Ser Gly Gln Gly Gly Ser Pro Ser Arg Ala Ala 35 4610 461°_i 4620 Gly Arg Glu Glu Arg Pro Met Ile Leu Arg Ala Gly Thr Ala Asp Pro Ala Pro Tyr Glu Glu Glu Ile Pro Gly Tyr Arg Ala Arg Ile Leu Asn 4~Met Ser Asn Lys Asn Asn Asp Glu Leu Gln Arg Gln Ala Ser Glu Asn Thr Leu Gly Leu Asn Pro Val Ile Gly Ile Arg Arg Lys Asp Leu Leu Ser Ser Ala Arg Thr Val Leu Arg Gln Ala VaI Arg Gln Pro Leu His $ 4690 4695 4700 Ser Ala Lys His Val Ala His Phe Gly Leu Glu Leu Lys Asn Val Leu Leu Gly Lys Ser Ser Leu Ala Pro Glu Ser Asp Asp Arg Arg Phe Asn IDAsp Pro Ala Trp Ser Asn Asn Pro Leu Tyr Arg Arg Tyr Leu Gln Thr Tyr Leu Ala Trp Arg Lys Glu Leu Gln Asp Trp Ile Gly Asn Ser Asp Leu Ser Pro Gln Asp Ile Ser Arg Gly Gln Phe Val Ile Asn Leu Met 1$ 4770 4775 4780 Thr Glu Ala Met Ala Pro Thr Asn Thr Leu Ser Asn Pro Ala Ala Val Lys Arg Phe Phe Glu Thr Gly Gly Lys Ser Leu Leu Asp Gly Leu Ser 2~Asn Leu Ala Lys Asp Leu Val Asn Asn Gly Gly Met Pro Ser Gln Val Asn Met Asp Ala Phe Glu Val Gly Lys Asn Leu Gly Thr Ser Glu Gly Ala Val Val Tyr Arg Asn Asp Val Leu Glu Leu Ile Gln Tyr Lys Pro 2$ 4850 4855 4860 Ile Thr Glu Gln Val His Ala Arg Pro Leu Leu Val Val Pro Pro Gln Ile Asn Lys Phe Tyr Val Phe Asp Leu Ser Pro Glu Lys Ser Leu Ala 30Arg Tyr Cys Leu Arg Ser Gln Gln Gln Thr Phe Ile Ile Ser Trp Arg Asn Pro Thr Lys Ala Gln Arg Glu Trp Gly Leu Ser Thr Tyr Ile Asp Ala Leu Lys Glu Ala Val Asp Ala Val Leu Ala Ile Thr Gly Ser Lys 3$ 4930 4935 4940 Asp Leu Asn Met Leu Gly Ala Cys Ser Gly Gly Ile Thr Cys Thr Ala Leu Val Gly His Tyr Ala Ala Leu Gly Glu Asn Lys Val Asn Ala Leu 4~Thr Leu Leu Val Ser Val Leu Asp Thr Thr Met Asp Asn Gln Val Ala Leu Phe Val Asp Glu Gln Thr Leu Glu Ala Ala Lys Arg His Ser Tyr Gln Ala Gly Val Leu Glu G1y Ser Glu Met Ala Lys Val Phe Ala Trp $ 5010 5015 5020 Met Arg Pro Asn Asp Leu Ile Trp Asn Tyr Trp Val Asn Asn Tyr Leu °' Leu Gly Asn Glu Pro Pro Val Phe Asp Ile Leu Phe Trp Asn Asn Asp lOThr Thr Arg Leu Pro Ala Ala. 
Phe His Gly Asp Leu Ile Glu Met Phe Lys Ser Asn Pro Leu Thr Arg Pro Asp Ala Leu Glu Val Cys Gly Thr Pro Ile Asp Leu Lys Gln Val Lys Cys Asp Ile Tyr Ser Leu Ala Gly 5090 5095 5100 Thr Asn Asp His Ile Thr Pro Trp Gln Ser Cys Tyr Arg Ser Ala His Leu Phe Gly Gly Lys Ile Glu Phe Val Leu Ser Asn Ser Gly His Ile Gln Ser Ile Leu Asn Pro Pro Gly Asn Pro Lys Ala Arg Phe Met Thr Gly Ala Asp Arg Pro Gly Asp Pro Val Ala Trp Gln Glu Asn Ala Thr Lys His Ala Asp Ser Trp Trp Leu His Trp Gln Ser Trp Leu Gly Glu 5170 5175 5180 Arg Ala Gly Glu Leu Glu Lys Ala Pro Thr Arg Leu Gly Asn Arg Ala Tyr Ala Ala Gly Glu Ala Ser Pro Gly Thr Tyr Val His Glu Arg
<210> 3
<211> 12441
<212> DNA
<213> Streptomyces v~enezuelae <400> 3 40agatctgcaa cgacatctcc gaccacctgc tcgtcacccg cggcgcgccc gatgcccgcg 60 tcgtgcagcccccgaccagccttatcgaaggagcggcgaagagatggcagaacccacggt120 gaccgacgacctgacgggggccctcacgcagcccccgctgggccgcaccgtccgcgcggt180 ggccgaccgtgaactcggcacccacctcctggagacccgcggcatccactggatccacgc240 cgcgaacggcgacccgtacgccaccgtgctgcgcggccaggcggacgacccgtatcccgc300 $gtacgagcgggtgcgtgcccgcggcgcgctctccttcagcccgacgggcagctgggtcac360 cgccgatcacgccctggcggcgagcatcctctgctcgacggacttcggggtctccggcgc420 cgacggcgtcccggtgccgcagcaggtcctctcgtacggggagggctgtccgctggagcg480 cgagcaggtgctgccggcggccggtgacgtgccggagggcgggcagcgtgccgtggtcga540 ggggatccaccgggagacgctggagggtctcgcgccggacccgtcggcgtcgtacgcctt600 lOcgagctgctgggcggtttcgtccgcccggcggtgacggccgctgccgccgccgtgctggg660 tgttcccgcggaccggcgcgcggacttcgcggatctgctggagcggctccggccgctgtc720 cgacagcctgctggccccgcagtccctgcggacggtacgggcggcggacggcgcgctggc780 cgagctcacggcgctgctcgccgattcggacgactcccccggggccctgctgtcggcgct840 cggggtcaccgcagccgtccagctc,accgggaacgcggtgctcgcgctcctcgcgcatcc900 l5cgagcagtggcgggagctgtgcgaccggcccgggctcgcggcggccgcggtggaggagac960 cctccgctacgacccgccggtgcagctcgacgcccgggtggtccgcggggagacggagct1020 ggcgggccggcggctgccggccggggcgcatgtcgtcgtcctgaccgccgcgaccggccg1080 ggacccggaggtcttcacggacccggagcgcttcgacctcgcgcgccccgacgccgccgc1140 gcacctcgcgctgcaccccgccggtccgtacggcccggtggcgtccctggtccggcttca1200 20ggcggaggtcgcgctgcggaccctggccgggcgtttccccgggctgcggcaggcggggga1260 cgtgctccgcccccgccgcgcgcctc3tcggccgcgggccgctgagcgtcccggtcagcag1320 ctcctgagacaccggggccccggtccgcccggccccccttcggacggaccggacggctcg1380 gaccacggggacggctcagaccgtcc:cgtgtgtccccgtccggctcccgtccgccccatc1440 ccgcccctccaccggcaaggaaggac:acgacgccatgcgcgtcctgctgacctcgttcgc1500 25acatcacacgcactactacggcctgc~tgcccctggcctgggcgctgctcgccgccgggca1560 cgaggtgcgggtcgccagccagcccc~cgctcacggacaccatcaccgggtccgggctcgc1620 cgcggtgccggtcggcaccgaccaccacatccacgagtaccgggtgcggatggcgggcga1680 gccgcgcccgaaccatccggcgatcgccttcgacgaggcccgtcccgagccgctggactg1740 ggaccacgccctcggcatcgaggcgatcctcgccccgtacttccatctgctcgccaacaa1800 
30cgactcgatggtcgacgacctcgtcc~acttcgcccggtcctggcagccggacctggtgct1860 gtgggagccgacgacctacgcgggcc~ccgtcgccgcccaggtcaccggtgccgcgcacgc1920 ccgggtcctgtgggggcccgacgtgatgggcagcgcccgccgcaagttcgtcgcgctgcg1980 ggaccggcagccgcccgagcaccgcgaggaccccaccgcggagtggctgacgtggacgct2040 cgaccggtacggcgcctccttcgaac~aggagctgctcaccggccagttcacgatcgaccc2100 35gaccccgccgagcctgcgcctcgacacgggcctgccgaccgtcgggatgcgttatgttcc2160 gtacaacggcacgtcggtcgtgccgc~actggctgagtgagccgcccgcgcggccccgggt2220 ctgcctgaccctcggcgtctccgcgc:gtgaggtcctcggcggcgacggcgtctcgcaggg2280 cgacatcctggaggcgctcgccgaccacgacatcgagctcgtcgccacgctcgacgcgag2340 tcagcgcgccgagatccgcaactacc:cgaagcacacccggttcacggacttcgtgccgat2400 40gcacgcgctcctgccgagctgctcgc~cgatcatccaccacggcggggcgggcacctacgc2460 gaccgccgtg atcaacgcggtgccgcaggtcatgctcgccgagctgtgggacgcgccggt 2520 caaggcgcgg gccgtcgccgagcac~ggggcggggttcttcctgccgccggccgagctcac 2580 gccgcaggcc gtgcgggacgccgtc:gtccgcatcctcgacgacccctcggtcgccaccgc 2640 cgcgcaccgg ctgcgcgaggagacc;ttcggcgaccccaccccggccgggatcgtccccga 2700 Sgctggagcggctcgccgcgcagcac;cgccgcccgccggccgacgcccggcactgagccgc 2760 acccctcgcc ccaggcctcacccct:gtatctgcgccgggggacgcccccggcccaccctc 2820 cgaaagaccg aaagcaggagcaccc~tgtacgaagtcgaccacgccgacgtctacgacctc 2880 ttctacctgg gtcgcggcaaggact.acgccgccgaggcctccgacatcgccgacctggtg 2940 cgctcccgta cccccgaggcctcct.cgctcctggacgtggcctgcggtacgggcacgcat 3000 lOctggagcacttcaccaaggagttcggcgacaccgccggcctggagctgtccgaggacatg 3060 ctcacccacg cccgcaagcggctgcccgacgccacgctccaccagggcgacatgcgggac 3120 ttccggctcg gccggaagttctccgccgtggtcagcatgttcagctccgtcggctacctg 3180 aagacgaccg aggaactcggcgcggccgtcgcctcgttcgcggagcacctggagcccggt 3240 ggcgtcgtcg tcgtcgagccgtggtggttcccggagaccttcgccgacggctgggtcagc 3300 lSgccgacgtcgtccgccgtgacgggcgcaccgtggcccgtgtctcgcactcggtgcgggag 3360 gggaacgcga cgcgcatggaggtccacttcaccgtggccgacccgggcaagggcgtgcgg 3420 cacttctccg acgtccatctcatcaccctgttccaccaggccgagtacgaggccgcgttc 3480 acggccgccg ggctgcgcgtcgagtacctggagggcggcccgtcgggccgtggcctcttc 3540 gtcggcgtcc 
ccgcctgagcaccgcccaagaccccccggggcgggacgtcccgggtgcac 3600 20caagcaaagagagagaaacgaaccgtgacaggtaagacccgaataccgcgtgtccgccgc 3660 ggccgcacca cgcccagggccttcaccctggccgtcgtcggcaccctgctggcgggcacc 3720 accgtggcgg ccgccgctcccggcgccgccgacacggccaatgttcagtacacgagccgg 3780 gcggcggagc tcgtcgcccagatgacgctcgacgagaagatcagcttcgtccactgggcg 3840 ctggaccccg accggcagaacgtcggctaccttcccggcgtgccgcgtctgggcatcccg 3900 25gagctgcgtgccgccgacggcccgaacggcatccgcctggtggggcagaccgccaccgcg 3960 ctgcccgcgc cggtcgccctggcca~gcaccttcgacgacaccatggccgacagctacggc 4020 aaggtcatgg gccgcgacggtcgcgcgctcaaccaggacatggtcctgggcccgatgatg 4080 aacaacatcc gggtgccgcacggcggccggaactacgagaccttcagcgaggaccccctg 4140 gtctcctcgc gcaccgcggtcgccc;agatcaagggcatccagggtgcgggtctgatgacc 4200 30acggccaagcacttcgcggccaacaaccaggagaacaaccgcttctccgtgaacgccaat 4260 gtcgacgagc agacgctccgcgagatcgagttcccggcgttcgaggcgtcctccaaggcc 4320 ggcgcggcct ccttcatgtgtgcct;acaacggcctcaacgggaagccgtcctgcggcaac 4380 gacgagctcc tcaacaacgtgctgcgcacgcagtggggcttccagggctgggtgatgtcc 4440 gactggctcg ccaccccgggcaccgacgccatcaccaagggcctcgaccaggagatgggc 4500 35gtcgagctccccggcgacgtcccgaagggcgagccctcgccgccggccaagttcttcggc 4560 gaggcgctga agacggccgtcctgaacggcacggtccccgaggcggccgtgacgcggtcg 4620 gcggagcgga tcgtcggccagatggagaagttcggtctgctcctcgccactccggcgccg 4680 cggcccgagc gcgacaaggcgggtgcccaggcggtgtcccgcaaggtcgccgagaacggc 4740 gcggtgctcc tgcgcaacgagggccaggccctgccgctcgccggtgacgccggcaagagc 4800 40atcgcggtcatcggcccgacggccgtcgaccccaaggtcaccggcctgggcagcgcccac 4860 gtcgtcccgg actcggcggcggcgc;cactcgacaccatcaaggcccgcgcgggtgcgggt4920 gcgacggtga cgtacgagacgggtqaggagaccttcgggacgcagatcccggcggggaac4980 ctcagcccgg cgttcaaccagggcc:accagctcgagccgggcaaggcgggggcgctgtac5040 gacggcacgc tgaccgtgcccgccc~acggcgagtaccgcatcgcggtccgtgccaccggt5100 Sggttacgccacggtgcagctcggcagccacaccatcgaggccggtcaggtctacggcaag5160 gtgagcagcc cgctcctcaagctga.ccaagggcacgcacaagctcacgatctcgggcttc5220 gcgatgagtg ccaccccgctctcccaggagctgggctgggtgacgccggcggcggccgac5280 gcgacgatcg 
cgaaggccgtggagtcggcgcggaaggcccgtacggcggtcgtcttcgcc5340 tacgacgacg gcaccgagggcgtcgaccgtccgaacctgtcgctgccgggtacgcaggac5400 l0aagctgatctcggctgtcgcggacgccaacccgaacacgatcgtggtcctcaacaccggt5460 tcgtcggtgc tgatgccgtggctgtccaagacccgcgcggtcctggacatgtggtacccg5520 ggccaggcgg gcgccgaggccaccgccgcgctgctctacggtgacgtcaacccgagcggc5580 aagctcacgc agagcttcccggccgccgagaaccagcacgcggtcgccggcgacccgaca5640 agctacccgg gcgtcgacaaccagcagacgtaccgcgagggcatccacgtcgggtaccgc5700 l5tggttcgacaaggagaacgtcaagccgctgttcccgttcgggcacggcctgtcgtacacc5760 tcgttcacgc agagcgccccgaccgtcgtgcgtacgtccacgggtggtctgaaggtcacg5820 gtcacggtcc gcaacagcgggaagcgcgccggccaggaggtcgtccaggcgtacctcggt5880 gccagcccga acgtgacggctccgcaggcgaagaagaagctcgtgggctacacgaaggtc5940 tcgctcgccg cgggcgaggcgaagacggtgacggtgaacgtcgaccgccgtcagctgcag6000 20accggttcgtcctccgccgacctgc~ggggcagcgccacggtcaacgtctggtgacgtgac6060 gccgtgaaag cggcggtgcccgccacccgggagggtggcgggcaccgctttttcggcctg6120 ctgggtctac cggaccacctgactaggcctggtcgacccgctcggcccattcgcgcacgg6180 cgtcgatcac ccgcagcgcctgcgggcgctccaggtgcgggccgatcggcaggctgagga6240 cctgccgcgc gaagctctcggcccgcgggagcgagccttccggcggtgcctcgcccgcgt6300 25aggcgggcgagaggtgcacgggtaccgggtagtgcgtgagggtgtcgatgccgcgggcgt6360 cgaggtggct gcgcagctcgtcgcggcgctcggtgcgcacggtgaagaggtgccagaccg6420 ggtcggtgtc gggcgcggtcaccggcaggccgatgccgggcagtccggcgagcccggaga6480 ggtactccgc ggccagcgccgacctgcggccgttccagctgtccaggtgggcgagccgga6540 tccgcagcac ggcggcctgcatctcgtccaggcgggagttggtgcccttcgtctcgtggc6600 30tgtacttctgccgcgagccgtagtt<3cggagcatccggagccgttcggcgagctcggggt6660 cgccggtgac gacggcgccgccgtcgccgaagcagccgaggttcttgcccgggtagaagc6720 tgaacgcggc caccgacgacccggcc~ccgatccgccggccccggtagcgggcgccgtggg6780 cctgcgcggc gtcctcgacgatgtgc:aggccgtgccggtccgcgagctcgcggagggcgt6840 ccatgtcggc ggggtgcccgtagag<~tggacggggaggagcgcccgggtgcggggggtga6900 35tcgccttctcgacgagcagcgggtccagggtggggtggtcctcgtgcggctcgacgggca6960 cgggggtcgc gccggtggcggacaccgcgagccagctggcgatgtacgtgtgcgagggga7020 cgatcacctc gtccccgggtccgat<~ccgaggccgcggagggcgagctggagggcgtcca7080 tcccgctgtt 
cacgccgacggcgtg<~tccgtctcgcagtacgcggcgaactccgcctcga7140 atccttcgag ttcgggtccgaggag<~tagcgccccgagtcgaggacgcgggcgatcgcgg7200 40cgtcggtctccgcgcggagctcctcgtaggcggccttgaggtcgaggaaggggacgcggg7260 gggtctcggc gcggctgctcacgcggacacctccacggcggtggcgggcagctgcggggc7320 ggtcgccttg agcggctcccaccagccgcggttctcccggtaccagcggacggtccgcgc7380 gaggccgtcc gcgaaggagacctgcgggcggtagccgagctcgcgctcgatctcgccgcc7440 gtcgagggag tagcgcaggtcgtggcccttgcggtcggcgaccttccggaccgaggacca7500 Sgtcggcgccgagcgagtccaggagg~atgccggtgagttcgcggttggtcagctccaggcc7560 gccgccgatg tggtagatctcgccggcccggccgcccgcgaggacgagcgcgatgccccg7620 "

gcagtggtcg tcggtgtgcacccactcgcggacgttcgcgccgtcgccgtacagcgggag7680 cgtcccgccg tcgaggaggttcgtcacgaagagggggatgagcttctcggggtgctggta7740 cggcccgtag ttgttgcagcagcgggtgatccgtacgtcgaggccgtacgtccggtggta7800 lOggcgcgggcaacgaggtcggagccggccttggacgccgcgtagggcgagttgggctccag7860 cgggctgctc tcggtccaggagccggagtcgatcgacccgtacacctcgtcggtggagac7920 gtgcacgacc cggccgacgccggcgtcgacggcgcactggagcagcgtctgcgtgccctg7980 cacgttggtc tcggtgaacacggacgcgcccgcgatggagcggtccacgtggctctcggc8040 cgcgaagtgg acgatggcgtccacgccgcgcagttcccgggcgaggaggccggcgtcgcg8100 l5gatgtcgccgtggacgaagcgcagtcgcgggtccgcgtccaccggggcgaggttggcgcg8160 gttgcccgcg taggtgaggctgtcc~aggacgatcacctcatcggcgggcacgtcggggta8220 cgccccggcg aggagctgccgcacgaagtgcgagccgatgaagcccgcacctccggtcac8280 cagaagccgc actgccgtcttcctt!tcggtcgcgctgtaggtcgcggtgtgggtcgcact8340 gtcggtggcg gtgcgggtcgcggtgtgggtcgcactgtcggtggcgctgtcggtcgtggg8400 20aacgcgtcggccgcgaggtgccctcacggggctccctcgcggccggcgatctccatcaga8460 tagctgccgt actcggtgcgggagaggccttctcccaggccgtgacaggcctcggcgtcg8520 , atgaagccca tgcggaaggcgatct<:ctcaaggcccgcgatccagacgccctgccgctcc8580 tccaggacct ggacgtactgggcgg<:ccgcaggagcgagtcgtgggtgccggtgtccagc8640 caggcgaagc cgcggcccaggttgac:gagttcggcccggccccgctccaggtagacgcgg8700 25ttgacgtcggtgatctccagctcgcc:gcgcggcgagggccggatgttcttggcgatgtcg8760 acgacgtcgt tgtcgtagaggtagaggccggtgacggcgaggttggagcgcggcttgacg8820 ggcttctcga cgaggtcggtcagccc~gcccgtcgcgtccacctcggcgacccgtaccgc 8880 g tcggggtcct tgaccgggtagccgaagagcacgcagccgtcgaggcgcgcgatgctgtcc894.0 cgcaggagcg tgtagaggccgggccc;gtggaagatgttgtcgcccaggatcagggcgcag9000 30gtgtcgtcgccgatgtgctcggctcc:gacgagaagtgcgtccgcgattcctgcgggctct9060 ttctggaccg catagtcgagttctata aggtgcctgccgtttccgagaagcgactgg9120 ccc aagagttcga tgtgctggggggtcgagatgatttgaatctcgcgaataccgccgagcatg9180 agaaccgaca gcggatagtagatcat:cggtttgttgtagaccggaagaatctgcttcgaa9240 atgaccgagg tcgccggatgcagccc~agttccgctcccgccggccaggacattcccttc 9300 t 35attctcggaaactagcagcagggcgccggtgataacggtcggcgtggcgagttagggggg9360 cgctaggggc 
tgcgcagggggagtgt.caccacccctttggggggtgggaaaacaccgagg9420 gcccggccgg acggccgggccctcaggtggggggatcgtgggggggggatcggggggatc9480 ggggcgggtg cgggtcagcgcaggaa.gccgcgggcctcctcccagccgtccgcggcgtcg9540 cgctccagct ggttcaggcgggcggt.gacgacctgatcgaagccgtccatgaagtactcg9600 40tcgccgtcgacggccgccacctcgccgccgcgctcgacgaagtccctgacgacctcggtg9660 agggaggtgt cgggggtcac gcggc:ccgcg atgtagcggg tcgcgccgtc caggtcgggg 9720 aagccggcct cgcggtacag gtacacgtcg ccgaggagat cgacctgcac cgcgacctgc 9780 gggtgcgcgg tgggccgcat ggtgc~cgggc ttgatccgca gcagttcggc gtcggccccg 9840 gtgcgcaggc tgttcagggc gtagccgtag tcgatgtgga gtccgggggt gcgctcgcgg 9900 Sacccgctcct cgaaggcgtt gagggcctcc tggagctcgg cccgctcctc ctgcggcagc 9960 ttgccgtcgt cacggccgct gtagt.cctcg cgaatgttga cgaagtcgat cgtcctgccc 10020 tgcccggcgt cgttgaggtc ggcga.tgaag tcgaccaggt cgagcaggcg ggaggcacgg 10080 ~I
cccgggagca cgatgtaggc gaagccgagg ttgatcggcg actcgcgctc ggcgcgcagc 10140 tgctggaagc ggcgcaggtt ctcgcggacg cggcggaagg cggccttctt gccggtggtc 10200 l0tgctcgtact cctcgtcgtt gaggccgtag agcgaggtgc ggatggcgtg caggccccag 10260 aggccgggct ggcgctccag ggtgcgctcg gtgagcgcga aggagttcgt gtagacggtg 10320 ggccgcaggc cgtggtcggt ggcgtgcgcg gccaggctcc cgaggccggg gttggtgagc 10380 ggctccaggc cgccggagaa gtacatcgcc gaggggttgc ccgcgggtat ctcgtcgatg 10440 accgaccgga acatggcgtt gccggcgtcg agggcggacg ggtcgtagcg ggcgccggtc 10500 l5acacggacgc agaagtggca gcggaacatg caggtcgggc cggggtagag gccgacgctg 10560 tacgggaaga cgggcttcct ggcgagcgcc gcgtcgaaga cgccgcgctg ttcgagcggg 10620 agcagggtgt tcttccagta cgccccggcg gggccggtct cgaccgcggt gcggagctcc 10680 gggacctgcc cgaacagggc gaggaggcgc cggaaggcgt cccggtcgac gcccaggtcg 10740 tggcgggcct cctccagcgg ggtga,agggg ctgttgccgt agcgcacggc gagccggacg 10800 20aggtggcggg cggtcgttcc ggcctcgtcg ggcggcacga ggccgccggc ggcgagggtc 10860 tggccgacgg cgtggaccgc cgcccccaga tcggctccgg ggtgcgcgca gcgttcggcc 10920 ggggcggtgg cggaaagggc gggggcggtc atcgggagcg tccaatcgtg ggcgtggatg 10980 tctggggggc cgcgagcggg gcgggggccg tgtcgcggtg gcgcgcggtc agttcgcggc 11040 cgcgggtcgc gcagagacgc agcaggtcgg cgacccggcg gatgtcgtcg tcgccgatgg 11100 25cggtgccggt cggcagggac agcacc3cgcg cggcgaggcg ttcggtgtgc ggcagcgggg 11160 cgtgcggctg cccgcggtac ggctccagct cgtggcagcc cggcgagaag taggcgcggg 11220 tgtgcacgcc ttcggccttc aggacctcca tgacgaggtc gcggtggatg ccggtggtgg 11280 cctcgtcgat ctcgacgatc acgtactggt ggttgttgag gccgtggcgg tcgtggtcgg 1134_0 cgacgaggac gccggggagg tccgc<~aggt gctcgcggta ggcggcgtgg ttgcgccggt 11400 30tccggtcgat gacctcggga aacgcc~tcga gggaggtgag gcccatggcg gcggcggcct 11460 cgctcatctt ggcgttggtc ccgccggcgg ggctgccgcc gggcaggtcg aagccgaagt 11520 tgtggagggc gcggatccgg gcggcc~aggt cggcgtcgtc ggtgacgacg gcgccgccct 11580 cgaaggcgtt gacggccttg gtggcc~tgga agctgaagac ctcggcgtcg ccgaggctgc 11640 cggcgggccg gccgtcgacc gcgcagccga gggcgtgcgc ggcgtcgaag tacagccgca 11700 35ggccgtgctc gtcggcgacc ttccgc:agct 
ggtcggcggc gcaggggcgg ccccagaggt 11760 ggacgccgac gacggccgag gtgcgc~ggtg tgaccgcggc ggccacctgg tccgggtcga 11820 ggttgccggt gtccgggtcg atgtcc~gcga agaccggggt gaggccgatc cagcgcagtg 11880 cgtgcggggt ggcggcgaac gtcatc:gacg gcatgatcac ttcgccggtg aggccggcgg 11940 cgtgcgcgag gagctggagc ccggcc:gtgg cgttgcaggt ggccacggca tgccggaccc 12000 40cggcgagccc ggcgacgcgc tcctcc~aact cgcggacgag cgggccgccg ttggacagcc 12060 actggctgtcgagggcccggtcgagccgctcgtacagcctggcgcggtcgatgcggttgg12120 gccgccccacgaggagcggctggtcgaaagcggcggggccgccgaagaatgcgaggtcgg12180 ataaggcgcttttcacggatgttccctccgggccaccgtcacgaaatgattcgccgatcc12240 gggaatcccgaacgaggtcgccgcgctccaccgtgacgtacgacgagatggtcgattgtg12300 $gtggtcgatttcggggggactctaatccgcgcggaacgggaccgacaagagcacgctatg12360 cgctctcgatgtgcttcggatcacatccgcctccggggtattccatcggcggcccgaatg12420 tgatgatccttgacaggatcc <210> 4 <211> 3782 <212> PRT
<213> Streptomyces venezuelae
<400> 4 Met Thr Asp Asp Leu Thr Gl;r Ala Leu Thr Gln Pro Pro Leu Gly Arg Thr Val Arg Ala Val Ala Asp Arg Glu Leu Gly Thr His Leu Leu Glu Thr Arg Gly Ile His Trp Ilea His Ala Ala Asn Gly Asp Pro Tyr Ala Thr Val Leu Arg Gly Gln Ala Asp Asp Pro Tyr Pro Ala Tyr Glu Arg 2$ Val Arg Ala Arg Gly Ala Leu Ser Phe Ser Pro Thr Gly Ser Trp Val Thr Ala Asp His Ala Leu Ala Ala Ser Ile Leu Cys Ser Thr Asp Phe Gly Val Ser Gly Ala Aep Gly Val Pro Val Pro Gln Gln Val Leu Ser loo l05 llo Tyr Gly Glu Gly Cys Pro Leu Glu Arg Glu Gln Val Leu Pro Ala Ala Gly Asp Val Pro Glu Gly Gly Gln Arg Ala Val Val Glu Gly Ile His 130 135. 140 3$ Arg Glu Thr Leu Glu Gly Leu Ala Pro Asp Pro Ser Ala Ser Tyr Ala Phe Glu Leu Leu Gly Gly Phe: Val Arg Pro Ala Val Thr Ala Ala Ala Ala Ala Val Leu Gly Val Pro Ala Asp Arg Arg Ala Asp Phe Ala Asp Leu Leu Glu Arg Leu Arg Pro Leu Ser Asp Ser Leu Leu Ala Pro Gln Ser Leu Arg Thr Val Arg Ala Ala Asp Gly Ala Leu Ala Glu Leu Thr $ Ala Leu Leu Ala Asp Ser Asp Asp Ser Pro Gly Ala Leu Leu Ser Ala Leu Gly Val Thr Ala Ala Val Gln Leu Thr Gly Asn Ala Val Leu Ala Leu Leu Ala His Pro Glu Gln Trp Arg Glu Leu Cys Asp Arg Pro Gly 1~ 260 265 270 Leu Ala Ala Ala Ala Val Glu Glu Thr Leu Arg Tyr Asp Pro Pro Val Gln Leu Asp Ala Arg Val Val Arg Gly Glu Thr Glu Leu Ala Gly Arg 1$ Arg Leu Pro Ala Gly Ala His Val Val Val Leu Thr AIa Ala Thr Gly Arg Asp Pro Glu Val Phe Thr Asp Pro Glu Arg Phe Asp Leu Ala Arg Pro Asp Ala Ala Ala His Leu Ala Leu His Pro Ala Gly Pro Tyr Gly Pro Val Ala Ser Leu Val Arg Leu Gln Ala Glu Val Ala Leu Arg Thr Leu Ala Gly Arg Phe Pro Gl~y Leu Arg Gln Ala Gly Asp Val Leu Arg 370 37!5 380 2$ Pro Arg Arg Ala Pro Val Gl~y Arg Gly Pro Leu Ser Val Pro Val Ser Ser Ser Met Arg Val Leu Leu Thr Ser Phe Ala His His Thr His Tyr Tyr Gly Leu Val Pro Leu Ala Trp Ala Leu Leu Ala Ala Gly His Glu 3~ 420 425 430 Val Arg Val Ala Ser Gln Pro Ala Leu Thr Asp Thr Ile Thr Gly Ser Gly Leu Ala Ala Val Pro Va:l Gly Thr Asp His Leu Ile His Glu Tyr 450 45!i 460 3$ Arg Val Arg Met Ala Gly Glu Pro 
Arg Pro Asn His Pro Ala Ile Ala Phe Asp Glu Ala Arg Pro Glu Pro Leu Asp Trp Asp His Ala Leu Gly Ile Glu .Ala Ile Leu Ala Pro Tyr Phe His Leu Leu Ala Asn Asn Asp Ser Met Val Asp Asp Leu Val Asp Phe Ala Arg Ser Trp Gln Pro Asp Leu Val Leu Trp Glu Pro Thr Thr Tyr Ala Gly Ala Val Ala Ala Gln $ Val Thr Gly Ala Ala His Al,a Arg Val Leu Trp Gly Pro Asp Val Met Gly Ser Ala Arg Arg Lys Phe Val Ala Leu Arg Asp Arg Gln Pro Pro Glu His Arg Glu Asp Pro Th:r Ala Glu Trp Leu Thr Trp Thr Leu Asp Arg Tyr Gly Ala Ser Phe Glu Glu Glu Leu Leu Thr Gly Gln Phe Thr Ile Asp Pro Thr Pro Pro Ser Leu Arg Leu Asp Thr Gly Leu Pro Thr 610 61!i 620 1$ Val Gly Met Arg Tyr Val Pro Tyr Asn Gly Thr Ser Val Val Pro Asp Trp Leu Ser Glu Pro Pro Ala Arg Pro Arg Val Cys Leu Thr Leu Gly Val Ser Ala Arg Glu Val Leu Gly Gly Asp Gly Val Ser Gln Gly Asp 2~ 660 665 670 Ile Leu Glu Ala Leu Ala Asp Leu Asp Ile Glu Leu Val Ala Thr Leu Asp Ala Ser Gln Arg Ala G1u Ile Arg Asn Tyr Pro Lys His Thr Arg 2$ Phe Thr Asp Phe Val Pro Met His Ala Leu Leu Pro Ser Cys Ser Ala Ile Ile His His Gly Gly Ala Gly Thr Tyr Ala Thr Ala Val Ile Asn Ala Val Pro Gln Val Met Leu Ala Glu Leu Trp Asp Ala Pro Val Lys 3~ 740 745 750 Ala Arg Ala Val Ala Glu Gln Gly Ala Gly Phe Phe Leu Pro Pro Ala Glu Leu Thr Pro Gln Ala Val Arg Asp Ala Val Val Arg Ile Leu Asp 3$ Asp Pro Ser Val Ala Thr Ala Ala His Arg Leu Arg Glu Glu Thr Phe Gly Asp Pro Thr Pro Ala Gly Ile Val Pro Glu Leu Glu Arg Leu Ala Ala Gln His Arg Arg Pro Pro Ala Asp Ala Arg His Met Tyr Glu Val Asp His Ala Asp Val Tyr Asp Leu Phe Tyr Leu Gly Arg Gly Lys Asp Tyr Ala Ala Glu Ala Ser Asp Ile Ala Asp Leu Val Arg Ser Arg Thr $ Pro Glu Ala Ser Ser Leu Leu Asp Val Ala Cys Gly Thr Gly Thr His Leu Glu His Phe Thr Lys Glu Phe Gly Asp Thr Ala Gly Leu Glu Leu Ser Glu Asp Met Leu Thr His Ala Arg Lys Arg Leu Pro Asp Ala Thr Leu His Gln Gly Asp Met Arg Asp Phe Arg Leu Gly Arg Lys Phe Ser Ala Val Val Ser Met Phe Ser Ser Val Gly Tyr Leu Lys Thr Thr Glu 1$ Glu Leu Gly Ala Ala Val Ala Ser Phe Ala Glu His 
Leu Glu Pro Gly Gly Val Val Val Val Glu Pro Trp Trp Phe Pro Glu Thr Phe Ala Asp Gly Trp Val Ser Ala Asp Va:l Val Arg Arg Asp Gly Arg Thr Val Ala Arg Val Ser His Ser Val Arg Glu Gly Asn Ala Thr Arg Met Glu Val His Phe Thr Val Ala Asp Pro Gly Lys Gly Val Arg His Phe Ser Asp 1010 10:L5 1020 2$ Val His Leu Ile Thr Leu Phe~ His Gln Ala Glu Tyr Glu Ala Ala Phe Thr Ala Ala Gly Leu Arg Va7L Glu Tyr Leu Glu Gly Gly Pro Ser Gly Arg Gly Leu Phe Val Gly Val Pro Ala Met Thr Gly Lys Thr Arg Ile 3~ 1060 1065 1070 Pro Arg Val Arg Arg Gly ArcL Thr Thr Pro Arg Ala Phe Thr Leu Ala Val Val Gly Thr Leu Leu Ala Gly Thr Thr Val Ala Ala Ala Ala Pro 3$ Gly Ala Ala Asp Thr Ala Asn Val Gln Tyr Thr Ser Arg Ala Ala Glu Leu Val Ala Gln Met Thr Leu Asp Glu Lys Ile Ser Phe Val His Trp Ala Leu Asp Pro Asp Arg Gln Asn Val Gly Tyr Leu Pro Gly Val Pro Arg Leu Gly Ile Pro Glu Leu Arg Ala Ala Asp Gly Pro Asn Gly Ile Arg Leu Val Gly Gln Thr Ala Thr Ala Leu Pro Ala Pro Val Ala Leu S Ala Ser Thr Phe Asp Asp Thr Met Ala Asp Ser Tyr Gly Lys Val Met Gly Arg Asp Gly Arg Ala Leu Asn Gln Asp Met Val Leu Gly Pro Met Met Asn Asn Ile Arg Val Pro His Gly Gly Arg Asn Tyr Glu Thr Phe Ser Glu Asp Pro Leu Val Ser Ser Arg Thr Ala Val Ala Gln Ile Lys Gly Ile Gln Gly Ala Gly Leu Met Thr Thr Ala Lys His Phe Ala Ala 1$ Asn Asn Gln Glu Asn Asn Ar~g Phe Ser Val Asn Ala Asn Val Asp Glu Gln Thr Leu Arg Glu Ile Glu Phe Pro Ala Phe Glu Ala Ser Ser Lys Ala Gly Ala Ala Ser Phe Met Cys Ala Tyr Asn Gly Leu Asn Gly Lys Pro Ser Cys Gly Asn Asp Glu Leu Leu Asn Asn Val Leu Arg Thr Gln Trp Gly Phe Gln Gly Trp Va:l Met Ser Asp Trp Leu Ala Thr Pro Gly 2$ Thr Asp Ala Ile Thr Lys Gly Leu Asp Gln Glu Met Gly Val Glu Leu Pro Gly Asp Val Pro Lys G1;/ Glu Pro Ser Pro Pro Ala Lys Phe Phe Gly Glu Ala Leu Lys Thr Ala Val Leu Asn Gly Thr Val Pro Glu Ala 3~ 1380 1385 1390 Ala Val Thr Arg Ser Ala Glu Arg Ile Val Gly Gln Met Glu Lys Phe Gly Leu Leu Leu Ala Thr Pro Ala Pro Arg Pro Glu Arg Asp Lys Ala 1410 14:L5 1420 35 Gly Ala Gln Ala Val Ser Arg Lys Val 
Ala Glu Asn Gly Ala Val Leu Leu Arg Asn Glu Gly Gln Ala Leu Pro Leu Ala Gly Asp Ala Gly Lys Ser Ile Ala Val Ile Gly Pro Thr Ala Val Asp Pro Lys Val Thr Gly 4~ 1460 1465 1470 Leu Gly Ser Ala His Val Val Pro Asp Ser Ala Ala Ala Pro Leu Asp Thr Ile Lys Ala Arg Ala Gly Ala Gly Ala Thr Val Thr Tyr Glu Thr S Gly Glu Glu Thr Phe Gly Thr Gln Ile Pro Ala Gly Asn Leu Ser Pro Ala Phe Asn Gln Gly His Gl:n Leu Glu Pro Gly Lys Ala Gly Ala Leu Tyr Asp Gly Thr Leu Thr Val Pro Ala Asp Gly Glu Tyr Arg Ile Ala Val Arg Ala Thr Gly Gly Ty:r Ala Thr Val Gln Leu Gly Ser His Thr Ile Glu Ala Gly Gln Val Ty:r Gly Lys Val Ser Ser Pro Leu Leu Lys 1570 15'75 1580 15 Leu Thr Lys Gly Thr His Ly:a Leu Thr Ile Ser Gly Phe Ala Met Ser Ala Thr Pro Leu Ser Leu Glu Leu Gly Trp Val Thr Pro Ala Ala Ala Asp Ala Thr Ile Ala Lys Ala Val Glu Ser Ala Arg Lys Ala Arg Thr Ala Val Val Phe Ala Tyr Asp Asp Gly Thr Glu Gly Val Asp Arg Pro Asn Leu Ser Leu Pro Gly Thr Gln Asp Lys Leu Ile Ser Ala Val Ala 1650 16-'i5 1660 2$ Asp Ala Asn Pro Asn Thr Ile>. 
Val Val Leu Asn Thr Gly Ser Ser Val Leu Met Pro Trp Leu Ser Ly:; Thr Arg Ala Val Leu Asp Met Trp Tyr Pro Gly Gln Ala GIy Ala Glu Ala Thr Ala Ala Leu Leu Tyr Gly Asp Val Asn Pro Ser Gly Lys Leu Thr Gln Ser Phe Pro Ala Ala Glu Asn Gln His Ala Val Ala Gly Asp Pro Thr Ser Tyr Pro Gly Val Asp Asn 1730 17?'~5 1740 35 Gln Gln Thr Tyr Arg Glu Gly Ile His Val Gly Tyr Arg Trp Phe Asp Lys Glu Asn Val Lys Pro Leu Phe Pro Phe Gly His Gly Leu Ser Tyr Thr Ser Phe Thr Gln Ser Ala: Pro Thr Val Val Arg Thr Ser Thr Gly Gly Leu Lys Val Thr Val Thr Val Arg Asn Ser Gly Lys Arg Ala Gly Gln Glu Val Val Gln Ala Tyr Leu Gly Ala Ser Pro Asn Val Thr Ala $ Pro Gln Ala Lys Lys Lys Leu Val Gly Tyr Thr Lys Val Ser Leu Ala Ala Gly Glu Ala Lys Thr Val Thr Val Asn Val Asp Arg Arg Gln Leu Gln Thr Gly Ser Ser Ser Al~a Asp Leu Arg Gly Ser Ala Thr Val Asn 1~ 1860 1865 1870 Val Trp Met Ser Ser Arg Ala Glu Thr Pro Arg Val Pro Phe Leu Asp Leu Lys Ala Ala Tyr Glu Glu Leu Arg Ala Glu Thr Asp Ala Ala Ile 1890 18.'95 1900 1$ Ala Arg Val Leu Asp Ser Gly Arg Tyr Leu Leu Gly Pro Glu Leu Glu Gly Phe Glu Ala Glu Phe Ala Ala Tyr Cys Glu Thr Asp His Ala Val Gly Val Asn Ser Gly Met Asp Ala Leu Gln Leu Ala Leu Arg Gly Leu Gly Ile Gly Pro Gly Asp Glu Val Ile Val Pro Ser His Thr Tyr Ile Ala Ser Trp Leu Ala Val Ser Ala Thr Gly Ala Thr Pro Val Pro Val 1970 19',5 1980 2$ Glu Pro His Glu Asp His Pro Thr Leu Asp Pro Leu Leu Val Glu Lys Ala Ile Thr Pro Arg Thr Arg Ala Leu Leu Pro Val His Leu Tyr Gly His Pro Ala Asp Met Asp Ala Leu Arg Glu Leu Ala Asp Arg His Gly 3~ 2020 2025 2030 Leu His Ile Val Glu Asp Ala Ala Gln Ala His Gly Ala Arg Tyr Arg Gly Arg Arg Ile Gly Ala Gly Ser Ser Val Ala Ala Phe Ser Phe Tyr 3$ Pro Gly Lys Asn Leu Gly Cys Phe Gly Asp Gly Gly Ala Val Val Thr Gly Asp Pro Glu Leu Ala Glu Arg Leu Arg Met Leu Arg Asn Tyr Gly Ser Arg Gln Lys Tyr Ser His Glu Thr Lys Gly Thr Asn Ser Arg Leu Asp Glu Met Gln Ala Ala Va1 Leu Arg Ile Arg Leu Ala His Leu Asp Ser Trp Asn Gly Arg Arg Se:r Ala Leu Ala Ala Glu Tyr Leu Ser Gly 
2130 2T~35 2140 $ Leu Ala Gly Leu Pro Gly I7.e Gly Leu Pro Val Thr Ala Pro Asp Thr Asp Pro Val Trp His Leu Phe Thr Val Arg Thr Glu Arg Arg Asp Glu Leu Arg Ser His Leu Asp Al.a Arg Gly Ile Asp Thr Leu Thr His Tyr Pro Val Pro Val His Leu Se:r Pro Ala Tyr Ala Gly Glu Ala Pro Pro Glu Gly Ser Leu Pro Arg Ala Glu Ser Phe Ala Arg Gln Val Leu Ser 1$ Leu Pro Ile Gly Pro His Leu Glu Arg Pro Gln Ala Leu Arg Val Ile Asp Ala Val Arg Glu Trp Ala Glu Arg Val Asp Gln Ala Met Arg Leu Leu Val Thr Gly Gly Ala Gly Phe Ile Gly Ser His Phe Val Arg Gln Leu Leu Ala Gly Ala Tyr Pro Asp Val Pro Ala Asp Glu Val Ile Val Leu Asp Ser Leu Thr Tyr Ala Gly Asn Arg Ala Asn Leu Ala Pro Val 2290 22:95 2300 2$ Asp Ala Asp Pro Arg Leu Arg Phe Val His Gly Asp Ile Arg Asp Ala Gly Leu Leu Ala Arg Glu Leu Arg Gly Val Asp Ala Ile Val His Phe Ala Ala Glu Ser His Val Asp Arg Ser Ile Ala Gly Ala Ser Val Phe 3~ 2340 2345 2350 Thr Glu Thr Asn Val Gln Gl;r Thr Gln Thr Leu Leu Gln Cys Ala Val Asp Ala Gly Val Gly Arg Va'.l Val His Val Ser Thr Asp Glu Val Tyr 2370 23'5 2380 3$ Gly Ser Ile Asp Ser Gly Ser Trp Thr Glu Ser Ser Pro Leu Glu Pro Asn Ser Pro Tyr Ala Ala Ser Lys Ala Gly Ser Asp Leu Val Ala Arg Ala Tyr His Arg Thr Tyr Gly Leu Asp Val Arg Ile Thr Arg Cys Cys Asn Asn Tyr Gly Pro Tyr Gl.n His Pro Glu Lys Leu Ile Pro Leu Phe Val Thr Asn Leu Leu Asp Gly Gly Thr Leu Pro Leu Tyr Gly Asp Gly S Ala Asn Val Arg Glu Trp Va.l His Thr Asp Asp His Cys Arg Gly Ile Ala Leu Val Leu Ala Gly Gly Arg Ala Gly Glu Ile Tyr His Ile Gly Gly Gly Leu Glu Leu Thr Asn Arg Glu Leu Thr Gly Ile Leu Leu Asp 1~ 2500 2505 2510 Ser Leu Gly Ala Asp Trp Ser Ser Val Arg Lys Val Ala Asp Arg Lys Gly His Asp Leu Arg Tyr Ser Leu Asp Gly Gly Glu Ile Glu Arg Glu IS Leu Gly Tyr Arg Pro Gln Val Ser Phe Ala Asp Gly Leu Ala Arg Thr Val Arg Trp Tyr Arg Glu Asn Arg Gly Trp Trp Glu Pro Leu Lys Ala Thr Ala Pro Gln Leu Pro Ala Thr Ala Val Glu Val Ser Ala Met Lys Gly Ile Val Leu Ala Gly Gly Ser Gly Thr Arg Leu His Pro Ala Thr Ser Val Ile Ser Lys Gln Il~e Leu Pro 
Val Tyr Asn Lys Pro Met Ile 25 Tyr Tyr Pro Leu Ser Val Leu Met Leu Gly Gly Ile Arg Glu Ile Gln Ile Ile Ser Thr Pro Gln His Ile Glu Leu Phe Gln Ser Leu Leu Gly Asn Gly Arg His Leu Gly Il~e Glu Leu Asp Tyr Ala Val Gln Lys Glu Pro Ala Gly Ile Ala Asp Ala Leu Leu Val Gly Ala Glu His Ile Gly Asp Asp Thr Cys Ala Leu Ile Leu Gly Asp Asn Ile Phe His Gly Pro 35 Gly Leu Tyr Thr Leu Leu Arg Asp Ser Ile Ala Arg Leu Asp Gly Cys Val Leu Phe Gly Tyr Pro Va:l Lys Asp Pro Glu Arg Tyr Gly Val Ala 2?25 2730 2735 Glu Val Asp Ala Thr Gly Arg Leu Thr Asp Leu Val Glu Lys Pro Val Lys Pro Arg Ser Asn Leu Ala Val Thr Gly Leu Tyr Leu Tyr Asp Asn Asp Val Val Asp Ile Ala Lys Asn Ile Arg Pro Ser Pro Arg Gly Glu 2770 2',75 2780 Leu Glu Ile Thr Asp Val Ae>n Arg Val Tyr Leu Glu Arg Gly Arg Ala Glu Leu Val Asn Leu Gly Arg Gly Phe Ala Trp Leu Asp Thr Gly Thr His Asp Ser Leu Leu Arg Al.a Ala Gln Tyr Val Gln Val Leu Glu Glu ' 1~ 2820 2825 2830 Arg Gln Gly Val Trp Ile Al.a Gly Leu Glu Glu Ile Ala Phe Arg Met Gly Phe Ile Asp Ala Glu Ala Cys His Gly Leu Gly Glu Gly Leu Ser 1$ Arg Thr Glu Tyr Gly Ser Tyr Leu Met Glu Ile Ala Gly Arg Glu Gly Ala Pro Met Thr Ala Pro Ala Leu Ser Ala Thr Ala Pro Ala Glu Arg Cys Ala His Pro Gly Ala Asp Leu Gly Ala Ala Val His Ala Val Gly Gln Thr Leu Ala Ala Gly Gly Leu Val Pro Pro Asp Glu Ala Gly Thr Thr Ala Arg His Leu Val Arg Leu Ala Val Arg Tyr Gly Asn Ser Pro 25 Phe Thr Pro Leu Glu Glu Ala Arg His Asp Leu Gly Val Asp Arg Asp Ala Phe Arg Arg Leu Leu Al~a Leu Phe Gly Gln Val Pro Glu Leu Arg Thr Ala Val Glu Thr Gly Pro Ala Gly Ala Tyr Trp Lys Asn Thr Leu 3~ 2980 2985 2990 Leu Pro Leu Glu Gln Arg Gly Val Phe Asp Ala Ala Leu Ala Arg Lys Pro Val Phe Pro Tyr Ser Va:l Gly Leu Tyr Pro Gly Pro Thr Cys Met 3010 30:15 3020 3$ Phe Arg Cys His Phe Cys Va:l Arg Val Thr Gly Ala Arg Tyr Asp Pro Ser Ala Leu Asp Ala Gly Asn Ala Met Phe Arg Ser Val Ile Asp Glu Ile Pro Ala Gly Asn Pro Ser Ala Met Tyr Phe Ser Gly Gly Leu Glu Pro Leu Thr Asn Pro Gly Leu Gly Ser Leu Ala Ala His Ala Thr Asp His Gly 
Leu Arg Pro Thr Va.l Tyr Thr Asn Ser Phe Ala Leu Thr Glu $ Arg Thr Leu Glu Arg Gln Pro Gly Leu Trp Gly Leu His Ala Ile Arg Thr Ser Leu Tyr Gly Leu Asn Asp Glu Glu Tyr Glu Gln Thr Thr Gly Lys Lys Ala Ala Phe Arg Arg Val Arg Glu Asn Leu Arg Arg Phe Gln Gln Leu Arg Ala Glu Arg Glu Ser Pro Ile Asn Leu Gly Phe Ala Tyr Ile Val Leu Pro Gly Arg Ala Ser Arg Leu Leu Asp Leu Val Asp Phe 1$ Ile Ala Asp Leu Asn Asp Ala Gly Gln Gly Arg Thr Ile Asp Phe Val Asn Ile Arg Glu Asp Tyr Ser Giy Arg Asp Asp Gly Lys Leu Pro Gln Glu Glu Arg Ala Glu Leu Gln Glu Ala Leu Asn Ala Phe Glu Glu Arg Val Arg Glu Arg Thr Pro Gly Leu His Ile Asp Tyr Gly Tyr Ala Leu Asn Ser Leu Arg Thr Gly Ala Asp Ala Glu Leu Leu Arg Ile Lys Pro 3250 32!i5 3260 2$ Ala Thr Met Arg Pro Thr Ala His Pro Gln Val Ala Val Gln Val Asp Leu Leu Gly Asp Val Tyr Leu Tyr Arg Glu Ala Gly Phe Pro Asp Leu Asp Gly Ala Thr Arg Tyr Ile: Ala Gly Arg Val Thr Pro Asp Thr Ser Leu Thr Glu Val Val Arg Asp Phe Val Glu Arg Gly Gly Glu Val Ala Ala Val Asp Gly Asp Glu Tyr Phe Met Asp Gly Phe Asp Gln Val Val 3330 33?t5 3340 3$ Thr Ala Arg Leu Asn Gln Leu Glu Arg Asp Ala Ala Asp Gly Trp Glu Glu Ala Arg Gly Phe Leu Arch Met Lys Ser Ala Leu Ser Asp Leu Ala Phe Phe Gly Gly Pro Ala Ala Phe Asp Gln Pro Leu Leu Val Gly Arg Pro Asn Arg Ile Asp Arg Ala Arg Leu Tyr Glu Arg Leu Asp Arg Ala Leu Asp Ser Gln Trp Leu Se:r Asn Gly Gly Pro Leu Val Arg Glu Phe $ Glu Glu Arg Val Ala Gly Le:u Ala Gly Val Arg His Ala Val Ala Thr Cys Asn Ala Thr Ala Gly Leu Gln Leu Leu Ala His Ala Ala Gly Leu Thr Gly Glu Val Ile Met Pro Ser Met Thr Phe Ala Ala Thr Pro His 1~ 3460 3465 3470 Ala Leu Arg Trp Ile Gly Leu Thr Pro Val Phe Ala Asp Ile Asp Pro Asp Thr Gly Asn Leu Asp Pro Asp Gln Val Ala Ala Ala Val Thr Pro 1$ Arg Thr Ser Ala Val Val Gly Val His Leu Trp Gly Arg Pro Cys Ala Ala Asp Gln Leu Arg Lys Val Ala Asp Glu His Gly Leu Arg Leu Tyr Phe Asp Ala Ala His Ala Leu Gly Cys Ala Val Asp Gly Arg Pro Ala Gly Ser Leu Gly Asp Ala Glu Val Phe Ser Phe His Ala Thr Lys Ala Val Asn Ala Phe 
Glu Gly Gly Ala Val Val Thr Asp Asp Ala Asp Leu Ala Ala Arg Ile Arg Ala Leu His Asn Phe Gly Phe Asp Leu Pro Gly Gly Ser Pro Ala Gly Gly Thr Asn Ala Lys Met Ser Glu Ala Ala Ala Ala Met Gly Leu Thr Ser Leu Asp Ala Phe Pro Glu Val Ile Asp Arg Asn Arg Arg Asn His Ala Ala Tyr Arg Glu His Leu Ala Asp Leu Pro Gly Val Leu Val Ala Asp His Asp Arg His Gly Leu Asn Asn His Gln 3650 3655 3660 Tyr Val Ile Val Glu Ile Asp Glu Ala Thr Thr Gly Ile His Arg Asp Leu Val Met Glu Val Leu Lys Ala Glu Gly Val His Thr Arg Ala Tyr Phe Ser Pro Gly Cys His Glu Leu Glu Pro Tyr Arg Gly Gln Pro His Ala Pro Leu Pro His Thr Glu Arg Leu Ala Ala Arg Val Leu Ser Leu Pro Thr Gly Thr Ala Ile Gly Asp Asp Asp Ile Arg Arg Val Ala Asp Leu Leu Arg Leu Cys Ala Thr Arg Gly Arg Glu Leu Thr Ala Arg His Arg Asp Thr Ala Pro Ala Pro Leu Ala Ala Pro Gln Thr Ser Thr Pro Thr Ile Gly Arg Ser Arg
<210> 5
<211> 37948
<212> DNA
<213> Streptomyces venezuelae <400> 5 gggcccctcctcacgcgtctcgatcctcgcgcgtccgcgccttcccgcgccggcactcgc60 gctctcgcgttcgttcaggccctcc<3cttccgggtcccgccggtgcggctcttcgtgctg120 20ctccggctccgaacggtttcgcggac~cagactcatggcattttccccgcagggcggccga180 cacgagctcggtcagaacttcctcgt:cgaccggtcagtgatcgacgagatcgacggcctg240 gtggccaggaccaagggtccgatact:ggagatcggtccgggtgacggcgccctgaccctg300 ccgctgagcaggcacggcaggccgat:caccgccgtcgagctcgacggccggcgcgcgcag360 cgcctcggtgcccgcacccccggtcatgtgaccgtggtgcaccacgacttcctgcagtac420 2Sccgctgccgcgcaacccgcatgtggi:cgtcggcaacgtccccttccatctgacgacggcg480 atcatgcggcggctgctcgacgcccagcactggcacaccgccgtcctcctcgtccagtgg540 gaggtcgcccggcgccgggccggcgtcggcgggtcgacgctgctgacggccggctgggcg600 ccctggtacgagttcgacctgcact<:ccgggtccccgcgcgggccttccgtccgatgccg660 ggcgtggacggaggagtactggccat:ccggcggcggtccgcgccgctcgtgggccaggtg720 30aagacgtaccaggacttcgtacgccaggtgttcaccggcaaggggaacgggctgaaggag780 atcctgcggcggaccgggcggatctc:gcagcgggacctggcgacctggctgcggaggaac840 gagatctcgccgcacgcgctgcccaaggacctgaagcccgggcagtgggcgtcgctgtgg900 gagctgaccggcggcacggccgacg<~atccttcgacggtacggcgggcggtggcgcggcc960 ggatcgcacggggcggctcgggtcgc~ggccggtcacccgggcggccgggtgtccgcgagc1020 3Scggcggggcgtgccgcaggcgcggcc~cggccgggggcatgcggtacggagctccacgggg1080 accgagccgaggtggggcagggggcc~ggcggagagcgcgtgagccgttctcgagcctgct1140 gccgagccgctgctgagccggtgctc~agccggatccgaccgtgggtgtgaatctccgggt1200 gctcgcctcgtcctgccccgttacct:gtccgcctcccgctccagaccagcgggaggcgga1260 caggggcatgcccgccgggcggctaacggcccgtgcggcgtccgtacgacgagcctcgcg1320 40cgccctggcggcccttggtctgccgc~acctgtgcgcggggtgcgcagggttcgccgccgc1380 WO 00/00620 PCT/(1S99/14398 gcgtggggcc gtatctgcggctcccgggcacggcggccctgctcgtctccgagtcatagt1440 ccctgccgcc ggcgccaccgccctggcccggcatgcgcgtgccgggcgcccccggcgcgt1500 aactcggctg ggaggcctggaaaagggcgatccattgggtgagcgtgaggtccttcggca1560 gtccgccgtc cggaattccgtggcggtcggcgagggaacggtaggtccgcttggggatgt1620 Sggcgccggaggatctccgcgaggccccgtccggggccggtgaagacggcttcggcgaagt1680 tctggaaggc gcggctcgcgctctcgggcagcaggggctgggggcgtcgcctgatcgtca1740 ggacgccgcc 
gtcgacgcggggcatcggacggaacgacgaggcgcggacgcggtcgtgga1800 ccgcgaactc gtaccagggggccca~ggaggtcgtgaggagcgatccgccgctgcgaccgg1860 cgcgtttgcg ggcgacctcccactgcactatcagggccgccgactgccagttcgtcgatt1920 l0ccaggagactccggagaatctgggtcgtgatgccgaagggaacgtttccgacgacggtgt1980 cgatatcgcg cggaatgcggaagtcgaggaaatcaccctggaatacggtgaccctctccc2040 cttcgaattt ccgccgcacatgcgcggcccagtgcgggtccatctccacgaccgtcacgg2100 tgtcgaagga gcgcaccaactcctcggttatcgcgccctttccggggccgatttcgagaa2160 cgttcctacc gtccccctcgacatgcgtgacgagattgcgcacggctctgtcgtcctgaa2220 ~l5ggaagttctggcctaattcgcggcgaagggtgtcgcggtccgctcgcctcggtatggagt2280 cgcgcattgc catgaacgatcccctccctggatgccgtggtcaatggacttggcacggac2340 catacctcac ggtccgtcggacgaccggagaagaagttcacgcacgggcgttccggagta2400 cgggagttgt gaacggccgcgacgaagtcggtcgcggctcggcgggcggtgacgagcgag2460 gtccggagga acgcgacgaagcagccgaaccccaagtgaggtgcgacggagtgacattgg2520 20gggcatacggagggttgtcgtacggagcgcactcaacgaggctccaggagggaggggttg2580 aacccgccgc cgactggccttcgccc~cccgcgcggccggagtatgtcatgtcgggggtga2640 aatcaagcca ttcccccgggatcggcagttacccatccctttacctggcgtggatttccc2700 aacccttggt atagagcgggagacgacgcgacaccatggagaccacgcacaccacgagcg2760 ccaccccccg gccatcccgacaaggc~gggtccggctcgcctcccgacacccatggcctgg2820 25ggtacacgccaggtatagggggaacc~tagggggagcatagggggggtgccctggggttgg2880 gtgaaagcgc ggcttccggagacggagccggatgtcttcagccggaattaccaggaccgg2940 tgcgagaaca ccggtgacagggcgtggggcggcagcgtgggacacgggggaagtgcgggt3000 ccgacggggg ttgccccctgccggcc:ccgatcatgcggagcactccttctctcgtgctcc3060 taccggtgat gtgcgcgccgaattgattcgtggagagatgtcgacagtgtccaagagtga3120 30gtccgaggaattcgtgtccgtgtcgaacgacgccggttccgcgcacggcacagcggaacc3180 cgtcgccgtc gtcggcatctcctgcc:gggtgcccggcgcccgggacccgagagagttctg3240 ggaactcctg gcggcaggcggccaggccgtcaccgacgtccccgcggaccgctggaacgc3300 cggcgacttc tacgacccggaccgct:ccgcccccggccgctcgaacagccggtggggcgg3360 gttcatcgag gacgtcgaccggttcc~acgccgccttcttcggcatctcgccccgcgaggc3420 35cgcggagatggacccgcagcagcggcacgccctggagctgggctgggaggccctggagcg3480 cgccgggatc gacccgtcctcgctca~ccggcacccgcaccggcgtcttcgccggcgccat3540 ctgggacgac 
tacgccaccctgaagc:accgccagggcggcgccgcgatcaccccgcacac3600 cgtcaccggc ctccaccgcggcatcatcgcgaaccgactctcgtacacgctcgggctccg3660 cggccccagc atggtcgtcgactccc~gccagtcctcgtcgctcgtcgccgtccacctcgc3720 40gtgcgagagcctgcggcgcggcgagt:ccgagctcgccctcgccggcggcgtctcgctcaa3780 f cctggtgccg gacagcatcatcggggcgagcaagttcggcggcctctccccgacggccg 3840 c cgcctacacc ttcgacgcgcgcgcc;aacggctacgtacgcggcgagggcggcggtttcgt 3900 cgtcctgaag cgcctctcccgggcc;gtcgccgacggcgacccggtgctcgccgtgatccg 3960 gggcagcgcc gtcaacaacggcggc;gccgcccagggcatgacgacccccgacgcgcaggc 4020 Sgcaggaggccgtgctccgcgaggcc;cacgagcgggccgggaccgcgccggccgacgtgcg 4080 gtacgtcgag ctgcacggcaccggc:acccccgtgggcgacccgatcgaggccgctgcgct 4140 ~I

cggcgccgcc ctcggcaccggccgc:ccggccggacagccgctcctggtcggctcggtcaa 4200 gacgaacatc ggccacctggagggc:gcggccggcatcgccggcctcatcaaggccgtcct 4260 ggcggtccgc ggtcgcgcgctgccc:gccagcctgaactacgagaccccgaacccggcgat 4320 l~cccgttcgaggaactgaacctccgggtgaa.cacggagtacctgccgtgggagccggagca 4380 cgacgggcag cggatggtcgtcggcgtgtcctcgttcggcatgggcggcacgaacgcgca 4440 tgtcgtgctc gaagaggcccccggg~ggttgtcgaggtgcttcggtcgtggagtcgacggt 4500 cggcgggtcg gcggtcggcggcggtgtggtgccgtgggtggtgtcggcgaagtccgctgc 4560 cgcgctggac gcgcagatcgagcggcttgccgcgttcgcctcgcgggatcgtacggatgg 4620 l5tgtcgacgcgggcgctgtcgatgcgggtgctgtcgatgcgggtgctgtcgctcgcgtact 4680 ggccggcggg cgtgctcagttcgagcaccgggccgtcgtcgtcggcagcgggccggacga 4740 tctggcggca gcgctggccgcgcctgagggtctggtccggggcgtggcttccggtgtcgg 4800 gcgagtggcg ttcgtgttccccgggcagggcacgcagtgggccggcatgggtgccgaact 4860 gctggactct tccgcggtgttcgcggcggccatggccgaatgcgaggccgcactctcccc 4920 20gtacgtcgactggtcgctggaggccgtcgtacggcaggcccccggtgcgcccacgctgga 4980 gcgggtcgat gtcgtgcagcctgtgacgttcgccgtcatggtctcgctggctcgcgtgtg 5040 gcagcaccac ggggtgacgccccaggcggtcgtcggccactcgcagggcgagatcgccgc 5100 cgcgtacgtc gccggtgccctgagcctggacgacgccgctcgtgtcgtgaccctgcgcag 5160 caagtccatc gccgcccacctcgcc~ggcaagggcggcatgctgtccctcgcgctgagcga 5220 25ggacgccgtcctggagcgactggcc~gggttcgacgggctgtccgtcgccgctgtgaacgg 5280 gcccaccgcc accgtggtctccggtgaccccgtacagatcgaagagcttgctcgggcgtg 5340 tgaggccgat ggggtccgtgcgcgggtcattcccgtcgactacgcgtcccacagccggca 5400 ggtcgagatc atcgagagcgagctcgccgaggtcctcgccgggctcagcccgcaggctcc 5460 gcgcgtgccg ttcttctcgacactcgaaggcgcctggatcaccgagcccgtgctcgacgg 5520 3Ucggctactggtaccgcaacctgcgccatcgtgtgggcttcgccccggccgtcgagaccct 5580 ggccaccgac gagggcttcacccacttcgtcgaggtcagcgcccaccccgtcctcaccat 5640 ggccctcccc gggaccgtcaccggtctggcgaccctgcgtcgcgacaacggcggtcagga 5700 ccgcctagtc gcctccctcgccgaagcatgggccaacggactcgcggtcgactggagccc 5760 gctcctcccc tccgcgaccggccaccactccgacctccccacctacgcgttccagaccga 5820 35gcgccactggctgggcgagatcgaggcgctcgccccggcgggcgagccggggtgcagcc 5880 c cgccgtcctc 
cgcacggaggcggccgagccggcggagctcgaccgggacggcagctgcg 5940 a cgtgatcctg gacaaggtccgggcgcagacggcccaggtgctggggtacgcgacaggcgg 6000 gcagatcgag gtcgaccggaccttccgtgaggccggttgcacctccctgaccggcgtgga 6060 cctgcgcaac cggatcaacgccgccttcggcgtacggatggcgccgtccatgatcttcga 6120 40cttccccacccccgaggctctcgcg<~agcagctgctcctcgtcgtgcacggggaggcggc 6180 ggcgaacccg gccggtgcggagccggctccggtggcggcggccggtgccgtcgacgagcc6240 ggtggcgatc gtcggcatggcctgccgcctgcccggtggggtcgcctcgccggaggacct6300 gtggcggctg gtggccggcggcggggacgcgatctcggagttcccgcaggaccgcggctg6360 ggacgtggag gggctgtaccacccggatccggagcaccccggcacgtcgtacgtccgcca6420 Sgggcggtttcatcgagaacgtcgccggcttcgacgcggccttcttcgggatctcgccgcg6480 cgaggccctc gccatggacccgcagcagcggctcctcctcgaaacctcctgggaggccgt6540 cgaggacgcc gggatcgacccgacctccctgcggggacggcaggtcggcgtcttcactgg6600 ggcgatgacc cacgagtacgggccgagcctgcgggacggcggggaaggcctcgacggcta6660 cctgctgacc ggcaacacggccagc~gtgatgtcgggccgcgtctcgtacacactcggcct6720 l0tgagggccccgccctgacggtggac~acggcctgctcgtcgtcgctggtcgccctgcacct6780 cgccgtgcag gccctgcgcaagggcgaggtcgacatggcgctcgccggcggcgtggccgt6840 gatgcccacg cccgggatgttcgtcgagttcagccggcagcgcgggctggccggggacgg6900 ccggtcgaag gcgttcgccgcgtcggcggacggcaccagctggtccgagggcgtcggcgt6960 cctcctcgtc gagcgcctgtcggacc~cccgccgcaacggacaccaggtcctcgcggtcgt7020 l5ccgcggcagcgccttgaaccaggacggcgcgagcaacggcctcacggctccgaacgggcc7080 ctcgcagcag cgcgtcatccggcgcgcgctggcggacgcccggctgacgacctccgacgt7140 ggacgtcgtc gaggcacacggcacgggcacgcgactcggcgacccgatcgaggcgcaggc7200 cctgatcgcc acctacggccagggcc:gtgacgacgaacagccgctgcgcctcgggtcgtt7260 gaagtccaac atcgggcacacccaggccgcggccggcgtctccggtgtcatcaagatggt7320 ZOccaggcgatgcgccacggactgctgc:cgaagacgctgcacgtcgacgagccctcggacca7380 gatcgactgg tcggctggcgccgtgc~aactcctcaccgaggccgtcgactggccggagaa7440 gcaggacggc gggctgcgccgggccc~ccgtctcctccttcgggatcagcggcaccaatgc7500 gcatgtggtg ctcgaagaggccccgc~tggttgtcgagggtgcttcggtcgtcgagccgtc7560 ggttggcggg tcggcggtcggcggcc~gtgtgacgccttgggtggtgtcggcgaagtccgc7620 25tgccgcgctcgacgcgcagatcgagc:ggcttgccgcattcgcctcgcgggatcgtacgga7680 
tgacgccgac gccggtgctgtcgacc~cgggcgctgtcgctcacgtactggctgacgggcg7740 tgctcagttc gagcaccgggccgtcc~cgctcggcgccggggcggacgacctcgtacaggc7800 gctggccgat ccggacgggctgatac:gcggaacggcttccggtgtcgggcgagtggcgtt7860 cgtgttcccc ggtcagggcacgcagt:gggctggcatgggtgccgaactgctggactcttc7920 30cgcggtgttcgcggcggccatggccc~agtgtgaggccgcgctgtccccgtacgtcgactg7980 gtcgctggag gccgtcgtacggcaggcccccggtgcgcccacgctggagcgggtcgatgt8040 cgtgcagcct gtgacgttcgccgtca~tggtctcgctggctcgcgtgtggcagcaccacgg8100 tgtgacgccc caggcggtcgtcggcc:actcgcagggcgagatcgccgccgcgtacgtcgc8160 cggagccctg cccctggacgacgccc~cccgcgtcgtcaccctgcgcagcaagtccatcgc8220 35cgcccacctcgccggcaagggcggcatgctgtccctcgcgctgaacgaggacgccgtcct8280 ggagcgactg agtgacttcgacgggcagtccgtcgccgccgtcaacgggcccaccgccac8340 tgtcgtgtcg ggtgaccccgtacaga~tcgaagagcttgctcaggcgtgcaaggcggacgg8400 attccgcgcg cggatcattcccgtcafactacgcgtcccacagccggcaggtcgagatcat8460 cgagagcgag ctcgcccaggtcctcgccggtctcagcccgcaggccccgcgcgtgccgtt8520 40cttctcgacgctcgaaggcacctgga.tcaccgagcccgtcctcgacggcacctactggta8580 ccgcaacctc cgtcaccgcgtcggca ccccgccatcgagaccctggccgtcgacga8640 tcgc gggcttcacg cacttcgtcgaggtc:agcgcccaccccgtcctcaccatgaccctccccga8700 gaccgtcacc ggcctcggcaccctc:cgtcgcgaacagggaggccaagagcgtctggtcac8760 ctcgctcgcc gaggcgtgggtcaac:gggcttcccgtggcatggacttcgctcctgcccgc8820 Scacggcctcccgccccggtctgcccacctacgccttccaggccgagcgctactggctcga8880 gaacactccc gccgccctggccaccggcgacgactggcgctaccgcatcgactggaagcg8940 "

cctcccggcc gccgaggggtccgag~cgcaccggcctgtccggccgctggctcgccgtcac9000 gccggaggac cactccgcgcaggccgccgccgtgctcaccgcgctggtcgacgccggggc9060 gaaggtcgag gtgctgacggccggggcggacgacgaccgtgaggccctcgccgcccggct9120 l0caccgcactgacgaccggtgacggcttcaccggcgtggtctcgctcctcgacggactcgt9180 accgcaggtc gcctgggtccaggcgctcggcgacgccggaatcaaggcgcccctgtggtc9240 cgtcacccag ggcgcggtctccgtcggacgtctcgacacccccgccgaccccgaccgggc9300 catgctctgg ggcctcggccgcgtcgtcgcccttgagcaccccgaacgctgggccggcct9360 cgtcgacctc cccgcccagcccgatgccgccgccctcgcccacctcgtcaccgcactctc9420 l5cggcgccaccggcgaggaccagatcgccatccgcaccaccggactccacgcccgccgcct9480 cgcccgcgca cccctccacggacgtcggcccacccgcgactggcagccccacggcaccgt9540 cctcatcacc ggcggcaccggagccctcggcagccacgccgcacgctggatggcccacca9600 cggagccgaa cacctcctcctcgtcagccgcagcggcgaacaagcccccggagccaccca9660 actcaccgcc gaactcaccgcatcg~ggcgcccgcgtcaccatcgccgcctgcgacgtcgc9720 20cgacccccacgccatgcgcaccctcctcgacgccatccccgccgagacgcccctcaccgc9780 cgtcgtccac accgccggcgcgctcgacgacggcatcgtggacacgctgaccgccgagca9840 ggtccggcgg gcccaccgtgcgaaggccgtcggcgcctcggtgctcgacgagctgacccg9900 ggacctcgac ctcgacgcgttcgtgctcttctcgtccgtgtcgagcactctgggcatccc9960 cggtcagggc aactacgccccgcacaacgcctacctcgacgccctcgcggctcgccgccg10020 25ggccaccggccggtccgccgtctcggtggcctggggaccgtgggacggtggcggcatggc10080 cgccggtgac ggcgtggccgagcggctgcgcaaccacggcgtgcccggcatggacccgga10140 actcgccctg gccgcactggagtcc<3cgctcggccgggacgagaccgcgatcaccgtcgc10200 ggacatcgac tgggaccgcttctacctcgcgtactcctccggtcgcccgcagcccctcgt10260 cgaggagctg cccgaggtgcggcgcatcatcgacgcacgggacagcgccacgtccggaca10320 30gggcgggagctccgcccagggcgccaaccccctggccgagcggctggccgccgcggctcc10380 cggcgagcgt acggagatcctcctcc3gtctcgtacgggcgcaggccgccgccgtgctccg10440 gatgcgttcg ccggaggacgtcgccgccgaccgcgccttcaaggacatcggcttcgactc10500 gctcgccggt gtcgagctgcgcaacaggctgacccgggcgaccgggctccagctgcccgc10560 gacgctcgtc ttcgaccacccgacgc:cgctggccctcgtgtcgctgctccgcagcgagtt10620 3Scctcggtgacgaggagacggcggacc~cccggcggtccgcggcgctgcccgcgactgtcgg10680 tgccggtgcc 
ggcgccggcgccggcaccgatgccgacgacgatccgatcgcgatcgtcgc10740 gatgagctgc cgctaccccggtgacatccgcagcccggaggacctgtggcggatgctgtc10800 cgagggcggc gagggcatcacgccgt;tccccaccgaccgcggctgggacctcgacggcct10860 gtacgacgcc gacccggacgcgctcc~gcagggcgtacgtccgcgagggcgggttcctgca10920 40cgacgcggccgagttcgacgcggagta cggcgtctcgccgcgcgaggcgctggccat10980 ctt WO 00/00620 PC"f/US99/14398 ggacccgcagcagcggatgctcctg;acgacgtcctgggaggccttcgagcgggccggcat11040 cgagccggcatcgctgcgcggcagcagcaccggtgtcttcatcggcctctcctaccagga11100 ctacgcggcccgcgtcccgaacgccccgcgtggcgtggagggttacctgctgaccggcag11160 cacgccgagcgtcgcgtcgggccgtatcgcgtacaccttcggtctcgaagggcccgcgac11220 $gaccgtcgacaccgcctgctcgtcgtcgctgaccgccctgcacctggcggtgcgggcgct11280 gcgcagcggcgagtgcacgatggcgctcgccggtggcgtggcgatgatggcgaccccgca11340 , catgttcgtggagttcagccgtcagcgggcgctcgccccggacggccgcagcaaggcctt11400 ctcggcggacgccgacgggttcggcgccgcggagggcgtcggcctgctgctcgtggagcg11460 gctctcggacgcgcggcgcaacggtcacccggtgctcgccgtggtccgcggtaccgccgt11520 l~caaccaggacggcgccagcaacgggctgaccgcgcccaacggaccctcgcagcagcgggt11580 gatccggcaggcgctcgccgacgcccggctggcacccggcgacatcgacgccgtcgagac11640 gcacggcacgggaacctcgctgggcgaccccatcgaggcccagggcctccaggccacgta11700 cggcaaggagcggcccgcggaacgg<:cgctcgccatcggctccgtgaagtccaacatcgg11760 acacacccaggccgcggccggtgcggcgggcatcatcaagatggtcctcgcgatgcgcca11820 1$cggcaccctgccgaagaccctccacc~ccgacgagccgagcccgcacgtcgactgggcgaa11880 cagcggcctggccctcgtcaccgagc:cgatcgactggccggccggcaccggtccgcgccg11940 cgccgccgtctcctccttcggcatcagcgggacgaacgcgcacgtcgtgctggagcaggc12000 gccggatgctgctggtgaggtgcttc~gggccgatgaggtgcctgaggtgtctgagacggt12060 agcgatggctgggacggctgggacct:ccgaggtcgctgagggctctgaggcctccgaggc12120 2Uccccgcggcccccggcagccgtgaggcgtccctccccgggcacctgccctgggtgctgtc12180 cgccaaggacgagcagtcgctgcgcqgccaggccgccgccctgcacgcgtggctgtccga12240 gcccgccgccgacctgtcggacgcgc~acggaccggcccgcctgcgggacgtcgggtacac12300 gctcgccacgagccgtaccgccttcc~cgcaccgcgccgccgtgaccgccgccgaccggga12360 cgggttcctggacgggctggccacgc;tggcccagggcggcacctcggcccacgtccacct12420 
2$ggacaccgcccgggacggcaccaccc~cgttcctcttcaccggccagggcagtcagcgccc12480 cggcgccggccgtgagctgtacgacc:ggcaccccgtcttcgcccgggcgctcgacgagat12540 ctgcgcccacctcgacggtcacctcgaactgcccctgctcgacgtgatgttcgcggccga12600 gggcagcgcggaggccgcgctgctcgacgagacgcggtacacgcagtgcgcgctgttcgc12660 cctggaggtcgcgctcttccggctcgtcgagagctggggcatgcggccggccgcactgct12720 3~cggtcactcggtcggcgagatcgccgccgcgcacgtcgccggtgtgttctcgctcgccga12780 cgccgcccgcctggtcgccgcgcgcggccggctcatgcaggagctgcccgccggtggcgc12840 gatgctcgccgtccaggccgcggagg~acgagatccgcgtgtggctggagacggaggagcg12900 gtacgcgggacgtctggacgtcgccg~ccgtcaacggccccgaggccgccgtcctgtccgg12960 cgacgcggacgcggcgcgggaggcgg~aggcgtactggtccgggctcggccgcaggacccg13020 3$cgcgctgcgggtcagccacgccttccactccgcgcacatggacggcatgctcgacgggtt13080 ccgcgccgtcctggagacggtggagttccggcgcccctccctgaccgtggtctcgaacgt13140 caccggcctggccgccggcccggacgacctgtgcgaccccgagtactgggtccggcacgt13200 ccgcggcaccgtccgcttcctcgacg~gcgtccgtgtcctgcgcgacctcggcgtgcggac13260 ctgcctggagctgggccccgacgggg~tcctcaccgccatggcggccgacggcctcgcgga13320 4~cacccccgcggattccgctgccggctcccccgtcggctctcccgccggctctcccgccga13380 4$
ctccgccgcc ggcgcgctcc ggccccggcc gctgctcgtg gcgctgctgc gccgcaagcg 13440 gtcggagacc gagaccgtcg cggacgccct cggcagggcg cacgcccacg gcaccggacc 13500 cgactggcac gcctggttcg ccggctccgg ggcgcaccgc gtggacctgc ccacgtactc 13560 cttccggcgc gaccgctact ggctggacgc cccggcggcc gacaccgcgg tggacaccgc 13620 cggcctcggt ctcggcaccg ccgaccaccc gctgctcggc gccgtggtca gccttccgga 13680 ccgggacggc ctgctgctca ccggccgcct ctccctgcgc acccacccgt ggctcgcgga 13740 ccacgccgtc ctggggagcg tcctgctccc cggcgccgcg atggtcgaac tcgccgcgca 13800
cgctgcggag tccgccggtc tgcgtgacgt gcgggagctg accctccttg aaccgctggt 13860 actgcccgag cacggtggcg tcgagctgcg cgtgacggtc ggggcgccgg ccggagagcc 13920 lOcggtggcgag tcggccgggg acggcgcacg ,gcccgtctcc ctccactcgc ggctcgccga 13980 cgcgcccgcc ggtaccgcct ggtcctgcca cgcgaccggt ctgctggcca ccgaccggcc 14040 cgagcttccc gtcgcgcccg accgtgcggc catgtggccg ccgcagggcg ccgaggaggt 14100 gccgctcgac ggtctctacg agcggctcga cgggaacggc ctcgccttcg gtccgctgtt 14160 ccaggggctg aacgcggtgt ggcggtacga gggtgaggtc ttcgccgaca tcgcgctccc 14220 l5cgccaccacg aatgcgaccg cgccc~gcgac cgcgaacggc ggcgggagtg cggcggcggc 14280 cccctacggc atccaccccg ccctgctcga cgcttcgctg cacgccatcg cggtcggcgg 14340 tctcgtcgac gagcccgagc tcgtccgcgt ccccttccac tggagcggtg tcaccgtgca 14400 cgcggccggt gccgcggcgg cccgggtccg tctcgcctcc gcggggacgg acgccgtctc 14460 gctgtccctg acggacggcg agggacgccc gctggtctcc gtggaacggc tcacgctgcg 14520 20cccggtcacc gccgatcagg cggcggcgag ccgcgtcggc gggctgatgc accgggtggc 14580 ctggcgtccg tacgccctcg cctcgtccgg cgaacaggac ccgcacgcca cttcgtacgg 14640 gccgaccgcc gtcctcggca aggacgagct gaaggtcgcc gccgccctgg agtccgcggg 14700 cgtcgaagtc gggctctacc ccgacctggc cgcgctgtcc caggacgtgg cggccggcgc 14760 cccggcgccc cgtaccgtcc ttgcgccgct gcccgcgggt cccgccgacg gcggcgcgga 14820 25gggtgtacgg ggcacggtgg cccgg<icgct ggagctgctc caggcctggc tggccgacga 14880 gcacctcgcg ggcacccgcc tgctccaggt cacccgcggt gcggtgcggg accccgaggg 14940 gtccggcgcc gacgatggcg gcgagc~acct gtcgcacgcg gccgcctggg gtctcgtacg 15000 gaccgcgcag accgagaacc ccggcc:gctt cggccttctc gacctggccg acgacgcctc 15060 gtcgtaccgg accctgccgt cggtgcactc cgacgcgggc ctgcgcgacg aaccgcagct 15120 30cgccctgcac gacggcacca tcaggcaggc ccgcctggcc tccgtccggc ccgagaccgg 15180 caccgccgca ccggcgctcg ccccggaggg cacggtcctg ctgaccggcg gcaccggcgg 15240 cctgggcgga ctggtcgccc ggcacc~tggt gggcgagtgg ggcgtacgac gcctgctgct 15300 ggtgagccgg cggggcacgg acgccc:cggg cgccgacgag ctcgtgcacg agctggaggc 15360 cctgggagcc gacgtctcgg tggccc~cgtg cgacgtcgcc gaccgcgaag ccctcaccgc 15420 35cgtactcgac gccatccccg ccgaac:accc 
gctcaccgcg gtcgtccaca cggcaggcgt 15480 cctctccgac ggcaccctcc cgtccatgac gacggaggac gtggaacacg tactgcggcc 15540 caaggtcgac gccgcgttcc tcctcgacga actcacctcg acgcccgcat acgacctggc 15600 agcgttcgtc atgttctcct ccgccgccgc cgtcttcggt ggcgcggggc agggcgccta 15660 cgccgccgcc aacgccaccc tcgacc~ccct cgcctggcgc cgccgggcag ccggactccc 15720 40cgccctctcc ctcggctggg gcctct:gggc cgagaccagc ggcatgaccg gcgagctcgg 15780 ccaggcggac ctgcgccgga tgagccgcgc gggcatcggc gggatcagcg acgccgaggg 15840 catcgcgctc ctcgacgccg ccctccgcga cgaccgccac ccggtcctgc tgcccctgcg 15900 gctcgacgcc gccgggctgc gggacgcggc cgggaacgac ccggccggaa tcccggcgct 15960 cttccgggac gtcgtcggcgccaggaccgtccgggcccggccgtccgcggcctccgcctc16020 Sgacgacagccgggacggccggcacgccggggacggcggacggcgcggcggaaacggcggc16080 ggtcacgctc gccgaccgggccgccaccgtggacgggcccgcacggcagcgcctgctgct16140 cgagttcgtc gtcggcgaggtcgccgaagtactcggccacgcccgcggtcaccggatcga16200 cgccgaacgg ggcttcctcgacctcggcttcgactccctgaccgccgtcgaactccgcaa16260 ccggctcaac tccgccggtggcctcgccctcccggcgaccctggtcttcgaccacccaag16320 lOcccggcggcactcgcctcccacctggacgccgagctgccgcgcggcgcctcggaccagga16380 cggagccggg aaccggaacgggaacgagaacgggacgacggcgtcccggagcaccgccga16440 gacggacgcg ctgctggcacaactgacccgcctggaaggcgccttggtgctgacgggcct16500 ctcggacgcc cccgggagcgaagaa~gtcctggagcacctgcggtccctgcgctcgatggt16560 cacgggcgag accgggaccgggacc~gcgtccggagccccggacggcgccgggtccggcgc16620 l5cgaggaccggccctgggcggccggggacggagccgggggcgggagtgaggacggcgcggg16680 agtgccggac ttcatgaacgcctcggccgaggaactcttcggcctcctcgaccaggaccc16740 cagcacggac tgatccctgccgcacggtcgcctcccgccccggaccccgtcccgggcacc16800 tcgactcgaa tcacttcatgcgcgcctcgggcgcctccaggaactcaaggggacagcgtg16860 tccacggtga acgaagagaagtacctcgactacctgcgtcgtgccacggcggacctccac16920 20gaggcccgtggccgcctccgcgagctggaggcgaaggcgggcgagccggtggcgatcgtc16980 ggcatggcct gccgcctgcccggcggcgtcgcctcgcccgaggacctgtggcggctggtg17040 gccggcggcg aggacgcgatctcggagttcccccaggaccgcggctgggacgtggagggc17100 ctgtacgacc cgaacccggaggccacgggcaagagttacgcccgcgaggccggattcctg17160 tacgaggcgg 
gcgagttcgacgccgacttcttcgggatctcgccgcgcgaggccctcgcc17220 25atggacccgcagcagcgtctcctcctggaggcctcctgggaggcgttcgagcacgccggg17280 atcccggcgg ccaccgcgcgcggcacctcggtcggcgtcttcaccggcgtgatgtaccac17340 gactacgcca cccgtctcaccgatgtcccggagggcatcgagggctacctgggcaccggc17400 aactccggca gtgtcgcctcgggccc~cgtcgcgtacacgcttggcctggaggggccggcc17460 gtcacggtcg acaccgcctgctcgtcctcgctggtcgccctgcacctcgccgtgcaggcc17520 30ctgcgcaagggcgaggtcgacatggcgctcgccggcggcgtgacggtcatgtcgacgccc17580 agcaccttcg tcgagttcagccgtcagcgcgggctggcgccggacggccggtcgaagtcc17640 ttctcgtcga cggccgacggcaccagctggtccgagggcgtcggcgtcctcctcgtcgag17700 cgcctgtccg acgcgcgtcgcaaggc~ccatcggatcctcgccgtggtccggggcaccgcc17760 gtcaaccagg acggcgccagcagcggcctcacggctccgaacgggccgtcgcagcagcgc17820 35gtcatccgacgtgccctggcggacg<:ccggctcacgacctccgacgtggacgtcgtcgag17880 gcccacggca cgggtacgcgactcg<~cgacccgatcgaggcgcaggccgtcatcgccacg17940 tacgggcagg gccgtgacggcgaacagccgctgcgcctcgggtcgttgaagtccaacatc18000 ggacacaccc aggccgccgccggtgtctccggcgtgatcaagatggtccaggcgatgcgc18060 cacggcgtcc tgccgaagacgctccaacgtggagaagccgacggaccaggtggactggtcc18120 40gcgggcgcggtcgagctgctcaccgaggccatggactggccggacaagggcgacggcgga18180 ctgcgcaggg ccgcggtctc ctcctacggc gtcagcggga cgaacgcgca cgtcgtgctc 18240 gaagaggccc cggcggccgaggagacccctgcctccgaggcgaccccggccgtcgagccg18300 tcggtcggcg ccggcctggtgccgt;ggctggtgtcggcgaagactccggccgcgctggac18360 gcccagatcg gacgcctcgccgcgtacgcctcgcagggccgtacggacgccgccgatccg18420 Sggcgcggtcgctcgcgtactggccggcgggcgcgccgagttcgagcaccgggccgtcgtg18480 ctcggcaccg gacaggacgatttcc~cgcaggcgctgaccgctccggaaggactgatacgc18540 ~~

ggcacgccct cggacgtgggccggc~tggcgttcgtgttccccggtcagggcacgcagtgg18600 gccgggatgg gcgccgaactcctcgacgtgtcgaaggagttcgcggcggccatggccgag18660 tgcgagagcg cgctctcccgctatgtcgactggtcgctggaggccgtcgtccggcaggcg18720 lOccgggcgcgcccacgctggagcgggtcgac.gtcgtccagcccgtgaccttcgctgtcatg18780 gtttcgctgg cgaaggtctggcagcaccacggcgtgacgccgcaggccgtcgtcggccac18840 tcgcagggcg agatcgccgccgcgtacgtcgccggtgccctcaccctcgacgacgccgcc18900 cgcgtcgtca ccctgcgcagcaagtccatcgccgcccacctcgccggcaagggcggcatg18960 atctccctcg ccctcagcgaggaagccacccggcagcgcatcgagaacctccacggactg19020 IStcgatcgccgccgtcaacggccccaccgccaccgtggtttcgggcgaccccacccagatc19080 caagagctcg ctcaggcgtgtgaggccgacggggtccgcgcacggatcatccccgtcgac19140 tacgcctccc acagcgcccacgtcgagaccatcgagagcgaactcgccgaggtcctcgcc19200 gggctcagcc cgcggacacctgaggtgccgttcttctcgacactcgaaggcgcctggatc19260 accgagccgg tgctcgacggcacctactggtaccgcaacctccgccaccgcgtcggcttc19320 20gcccccgccgtcgagaccctcgccaccgacgaaggcttcacccacttcatcgaggtcagc19380 gcccaccccg tcctcaccatgaccctccccgagaccgtcaccggcctcggcaccctccgc19440 cgcgaacagg gaggccaggagcgtctggtcacctcactcgccgaagcctggaccaacggc19500 ctcaccatcg actgggcgcccgtcctccccaccgcaaccggccaccaccccgagctcccc19560 acctacgcct tccagcgccgtcactactggctccacgactcccccgccgtccagggctcc19620 2Sgtgcaggactcctggcgctaccgcatcgactggaagcgcctcgcggtcgccgacgcgtcc19680 gagcgcgccg ggctgtccgggcgctggctcgtcgtcgtccccgaggaccgttccgccgag19740 gccgccccgg tgctcgccgcgctgtccggcgccggcgccgaccccgtacagctggacgtg19800 tccccgctgg gcgaccggcagcggctcgccgcgacgctgggcgaggccctggcggcggcc19860 ggtggagccg tcgacggcgtcctctcgctgctcgcgtgggacgagagcgcgcaccccggc19920 30caccccgcccccttcacccggggcaccggcgccaccctcaccctggtgcaggcgctggag19980 gacgccggcg tcgccgccccgctgt~ggtgcgtgacccacggcgcggtgtccgtcggccgg20040 gccgaccacg tcacctcccccgcccaggccatggtgtggggcatgggccgggtcgccgcc20100 ctggagcacc ccgagcggtggggcg~gcctgatcgacctgccctcggacgccgaccgggcg20160 gccctggacc gcatgaccacggtcctcgccggcggtacgggtgaggaccaggtcgcggta20220 35cgcgcctccgggctgctcgcccgccgcctcgtccgcgcctccctcccggcgcacggcacg20280 gcttcgccgt 
ggtggcaggccgacggcacggtgctcgtcaccggtgccgaggagcctgcg20340 gccgccgagg ccgcacgccggctggcccgcgacggcgccggacacctcctcctccacacc20400 accccctccg gcagcgaaggcgccg,aaggcacctccggtgccgccgaggactccggcctc20460 gccgggctcg tcgccgaactcgcgg,acctgggcgcgacggccaccgtcgtgacctgcgac20520 40ctcacggacgcggaggcggccgccc~ggctgctcgccggcgtctccgacgcgcacccgctc20580 agcgccgtcc tccacctgcc gcccaccgtc gactccgagc cgctcgccgc gaccgacgcg 20640 gacgcgctcg cccgtgtcgt gaccc~cgaag gccaccgccg cgctccacct ggaccgcctc 20700 ctgcgggagg ccgcggctgc cggaggccgt ccgcccgtcc tggtcctctt ctcctcggtc 20760 gccgcgatct ggggcggcgc cggtc:agggc gcgtacgccg ccggtacggc cttcctcgac 20820 5gccctcgccg gtcagcaccg ggccc~acggc cccaccgtga cctcggtggc ctggagcccc 20880 tgggagggca gccgcgtcac cgagc~gtgcg accggggagc ggctgcgccg cctcggcctg 20940 , cgccccctcg cccccgcgac ggcgca cacc gccctggaca ccgcgctcgg ccacggcgac 21000 accgccgtca cgatcgccga cgtcgactgg tcgagcttcg cccccggctt caccacggcc 21060 cggccgggca ccctcctcgc cgatctgccc gaggcgcgcc gcgcgctcga cgagcagcag 21120 l0tcgacgacgg ccgccgacga caccgtcctg agccgcgagc tcggtgcgct caccggcgcc 21180 gaacagcagc gccgtatgca ggagttggtc cgcgagcacc tcgccgtggt cctcaaccac 21240 ccctcccccg aggccgtcga cacggggcgg gccttccgtg acctcggatt cgactcgctg 21300 acggcggtcg agctccgcaa ccgcctcaag aacgccaccg gcctggccct cccggccact 21360 ctggtcttcg actacccgac cccccggacg ctggcggagt tcctcctcgc ggagatcctg 21420 l5ggcgagcagg ccggtgccgg cgagcagctt ccggtggacg gcggggtcga cgacgagccc 21480 gtcgcgatcg tcggcatggc gtgccgcctg ccgggcggtg tcgcctcgcc ggaggacctg 21540 tggcggctgg tggccggcgg cgaggacgcg atctccggct tcccgcagga ccgcggctgg 21600 gacgtggagg ggctgtacga cccggacccg gacgcgtccg ggcggacgta ctgccgtgcc 21660 ggtggcttcc tcgacgaggc gggcgagttc gacgccgact tcttcgggat ctcgccgcgc 21720 20gaggccctcg ccatggaccc gcagcagcgg ctcctcctgg agacctcctg ggaggccgtc 21780 gaggacgccg ggatcgaccc gacctccctt caggggcagc aggtcggcgt gttcgcgggc 21840 accaacggcc cccactacga gccgctgctc cgcaacaccg ccgaggatct tgagggttac 21900 gtcgggacgg gcaacgccgc cagcatcatg tcgggccgtg tctcgtacac cctcggcctg 21960 
gagggcccgg ccgtcacggt cgacaccgcc tgctcctcct cgctggtcgc cctgcacctc 22020 2.5gccgtgcagg ccctgcgcaa gggcgaatgc ggactggcgc tcgcgggcgg tgtgacggtc 22080 atgtcgacgc ccacgacgtt cgtggagttc agccggcagc gcgggctcgc ggaggacggc 22140 cggtcgaagg cgttcgccgc gtcggcggac ggcttcggcc cggcggaggg cgtcggcatg 22200 ctcctcgtcg agcgcctgtc ggacgcccgc cgcaacggac accgtgtgct ggcggtcgtg 22260 cgcggcagcg cggtcaacca ggacg~gcgcg agcaacggcc tgaccgcccc gaacgggccc 22320 30tcgcagcagc gcgtcatccg gcgcgcgctc gcggacgccc gactgacgac cgccgacgtg 22380 gacgtcgtcg aggcccacgg cacgg~gcacg cgactcggcg acccgatcga ggcacaggcc 22440 ctcatcgcca cctacggcca ggggcgcgac accgaacagc cgctgcgcct ggggtcgttg 22500 aagtccaaca tcggacacac ccaggccgcc gccggtgtct ccggcatcat caagatggtc 22560 caggcgatgc gccacggcgt cctgccgaag acgctccacg tggaccggcc gtcggaccag 22620 35atcgactggt cggcgggcac ggtcg~agctg ctcaccgagg ccatggactg gccgaggaag 22680 caggagggcg ggctgcgccg cgcggccgtc tcctccttcg gcatcagcgg cacgaacgcg 22740 cacatcgtgc tcgaagaagc cccggtcgac gaggacgccc cggcggacga gccgtcggtc 22800 ggcggtgtgg tgccgtggct cgtgtccgcg aagactccgg ccgcgctgga cgcccagatc 22860 ggacgcctcg ccgcgttcgc ctcgcagggc cgtacggacg ccgccgatcc gggcgcggtc 22920 40gctcgcgtac tggccggcgg gcgtgcgcag ttcgagcacc gggccgtcgc gctcggcacc 22980 ggacaggacgacctggcggccgcact:ggccgcgcctgagggtctggtccggggtgtggcc23040 tccggtgtgggtcgagtggcgttcgt:gttcccgggacagggcacgcagtgggccgggatg23100 ggtgccgaactcctcgacgtgtcgaaggagttcgcggcggccatggccgagtgcgaggcc23160 gcgctcgctccgtacgtggactggtc:gctggaggccgtcgtccgacaggcccccggcgcg23220 Scccacgctggagcgggtcgatgtcgt:ccagcccgtgacgttcgccgtcatggtctcgctg23280 gcgaaggtctggcagcaccacggggt:gaccccgcaagccgtcgtcggccactcgcagggc23340 , gagatcgccgccgcgtacgtcgccgc~tgccctgagcctggacgacgccgctcgtgtcgtg23400 accctgcgcagcaagtccatcggcgc;ccacctcgcgggccagggcggcatgctgtccctc23460 gcgctgagcgaggcggccgttgtggagcgactggccgggttcgacgggctgtccgtcgcc23520 lOgccgtcaacgggcctaccgccaccgt:ggtttcgggcgacccgacccagatccaagagctc23580 gctcaggcgtgtgaggccgacggggt:ccgcgcacggatcatccccgtcgactacgcctcc23640 
cacagcgcccacgtcgagaccatcgagagcgaactcgccgacgtcctggcggggttgtcc23700 ccccagacaccccaggtccccttctt:ctccaccctcgaaggcgcctggatcaccgaaccc23760 gccctcgacggcggctactggtaccc~caacctccgccatcgtgtgggcttcgccccggcc23820 l5gtcgaaaccctggccaccgacgaagc~cttcacccacttcgtcgaggtcagcgcccacccc23880 gtcctcaccatggcgctgcccgagac:cgtcaccggactcggcaccctccgccgtgacaac23940 ggcggacagcaccgcctcaccacctc:cctcgccgaggcctgggccaacggcctcaccgtc24000 gactgggcctctctcctccccaccac;gaccacccaccccgatctgcccacctacgccttc24060 cagaccgagcgctactggccgcagcc:cgacctctccgccgccggtgacatcacctccgcc24120 20ggtctcggggcggccgagcacccgct:gctcggcgcggccgtggcgctcgcggactccgac24180 ggctgcctgctcacggggagcctctccctccgtacgcacccctggctggcggaccacgcg24240 gtggccggcaccgtgctgctgccggg~aacggcgttcgtggagctggcgttccgagccggg24300 gaccaggtcggttgcgatctggtcgaggagctcaccctcgacgcgccgctcgtgctgccc24360 cgtcgtggcgcggtccgtgtgcagct.gtccgtcggcgcgagcgacgagtccgggcgtcgt24420 25accttcgggctctacgcgcacccgga.ggacgcgccgggcgaggcggagtggacgcggcac24480 gccaccggtgtgctggccgcccgtgc:ggaccgcaccgcccccgtcgccgacccggaggcc24540 tggccgccgccgggcgccgagccggt.ggacgtggacggtctgtacgagcgcttcgcggcg24600 aacggctacggctacggccccctctt.ccagggcgtccgtggtgtctggcggcgtggcgac:24660 gaggtgttcgccgacgtggccctgccggccgaggtcgccggtgccgagggcgcgcggttc24720 30ggccttcacccggcgctgctcgacgccgccgtgcaggcggccggtgcgggccggggcgtt24780 cggcgcgggcacgcggctgccgttcgcctggagcgggatctcctgtacgcggtcggcgcc24840 accgccctccgcgtgcggctggcccccgccggcccggacacggtgtccgtgagcgccgcc24900 gactcctccgggcagccggtgttcgccgcggactccctcacggtgctgcccgtcgacccc:24960 gcgcagctggcggccttcagcgacccgactctggacgcgctgcacctgctggagtggacc25020 35gcctgggacggtgccgcgcaggccct.gcccggcgcggtcgtgctgggcggcgacgccgac25080 ggtctcgccgcggcgctgcgcgccggtggcaccgaggtcctgtccttcccggaccttacg25140 gacctggtggaggccgtcgaccgggg~cgagaccccggccccggcgaccgtcctggtggcc25200 tgccccgccgccggccccgatgggcc:ggagcatgtccgcgaggccctgcacgggtcgctc:25260 gcgctgatgcaggcctggctggccga.cgagcggttcaccgatgggcgcctggtgctcgtg25320 40acccgcgacgcggtcgccgcccgttc:cggcgacggcctgcggtccacgggacaggccgcc25380 WO 00/00620 PC'T/US99/14398 gtctggggcc tcggccggtc 
cgcgcagacg gagagcccgg gccggttcgt cctgctcgac 25440 ctcgccgggg aagcccggac ggccggggac gccaccgccg gggacggcct gacgaccggg 25500 gacgccaccg tcggcggcac ctctggagac gccgccctcg gcagcgccct cgcgaccgcc 25560 ctcggctcgg gcgagccgca gctcgccctc cgggacgggg cgctcctcgt accccgcctg 25620 gcgcgggccg ccgcgcccgc cgcggccgac ggcctcgccg cggccgacgg cctcgccgct 25680 ctgccgctgc ccgccgctcc ggccctctgg cgtctggagc ccggtacgga cggcagcctg 25740
gagagcctca cggcggcgcc cggcgacgcc gagaccctcg ccccggagcc gctcggcccg 25800 ggacaggtcc gcatcgcgat ccgggccacc ggtctcaact tccgcgacgt cctgatcgcc 25860 ctcggcatgt accccgatcc ggcgctgatg ggcaccgagg gagccggcgt ggtcaccgcg 25920 accggccccg gcgtcacgca cctcgccccc ggcgaccggg tcatgggcct gctctccggc 25980 gcgtacgccc cggtcgtcgt ggcggacgcg cggaccgtcg cgcggatgcc cgaggggtgg 26040 acgttcgccc agggcgcctc cgtgccggtg gtgttcctga cggccgtcta cgccctgcgc 26100 gacctggcgg acgtcaagcc cggcgagcgc ctcctggtcc actccgccgc cggtggcgtg 26160 ggcatggccg ccgtgcagct cgcccggcac tggggcgtgg aggtccacgg cacggcgagt 26220 cacgggaagt gggacgccct gcgcgcgctc ggcctggacg acgcgcacat cgcctcctcc 26280 cgcaccctgg acttcgagtc cgcgttccgt gccgcttccg gcggggcggg catggacgtc 26340 gtactgaact cgctcgcccg cgagttcgtc gacgcctcgc tgcgcctgct cgggccgggc 26400 ggccggttcg tggagatggg gaagaccgac gtccgcgacg cggagcgggt cgccgccgac 26460 caccccggtg tcggctaccg cgccttcgac ctgggcgagg ccgggccgga gcggatcggc 26520 gagatgctcg ccgaggtcat cgccctcttc gaggacgggg tgctccggca cctgcccgtc 26580 acgacctggg acgtgcgccg ggcccgcgac gccttccggc acgtcagcca ggcccgccac 26640 acgggcaagg tcgtcctcac gatgccgtcg ggcctcgacc cggagggtac ggtcctgctg 26700 accggcggca ccggtgcgct ggggggcatc gtggcccggc acgtggtggg cgagtggggc 26760 gtacgacgcc tgctgctcgt gagccggcgg ggcacggacg ccccgggcgc cggcgagctc 26820 gtgcacgagc tggaggccct gggagccgac gtctcggtgg ccgcgtgcga cgtcgccgac 26880 cgcgaagccc tcaccgccgt actcgactcg atccccgccg aacacccgct caccgcggtc 26940 gtccacacgg caggcgtcct ctccgacggc accctcccct cgatgacagc ggaggatgtg 27000 gaacacgtac tgcgtcccaa ggtcgacgcc gcgttcctcc tcgacgaact cacctcgacg 27060 cccggctacg acctggcagc gttcgtcatg ttctcctccg ccgccgccgt cttcggtggc 27120 gcggggcagg gcgcctacgc cgccgccaac gccaccctcg acgccctcgc ctggcgccgc 27180 cggacagccg gactccccgc cctctccctc ggctggggcc tctgggccga gaccagcggc 27240 atgaccggcg gactcagcga caccgaccgc tcgcggctgg cccgttccgg ggcgacgccc 27300 atggacagcg agctgaccct gtccctcctg gacgcggcca tgcgccgcga cgacccggcg 27360 ctcgtcccga tcgccctgga cgtcgccgcg 
ctccgcgccc agcagcgcga cggcatgctg 27420 35gcgccgctgc tcagcgggct cacccgcgga tcgcgggtcg gcggcgcgcc ggtcaaccag 27480 cgcagggcag ccgccggagg cgcgggcgag gcggacacgg acctcggcgg gcggctcgcc 27540 gcgatgacac cggacgaccg ggtcgcgcac ctgcgggacc tcgtccgtac gcacgtggcg 2?600 accgtcctgg gacacggcac cccgagccgg gtggacctgg agcgggcctt ccgcgacacc 27660 ggtttcgact cgctcaccgc cgtcgaactc cgcaaccgtc tcaacgccgc gaccgggctg 27720 4~cggctgccgg ccacgctggt cttcg;accac cccaccccgg gggagctcgc cgggcacctg 27780 ctcgacgaac tcgccacggccgcgggcgggtcctgggcggaaggcaccgggtccggagac27840 acggcctcgg cgaccgatcggcagaccacggcggccctcgccgaactcgaccggctggaa27900 ggcgtgctcg cctccctcgcgcccgccgccggcggccgtccggagctcgccgcccggctc27960 agggcgctgg ccgcggccctgggggacgacggcgacgacgccaccgacctggacgaggcg28020 $tccgacgacgacctcttctccttcatcgacaaggagctgggcgactccgacttctgacct28080 gcccgacacc accggcaccaccggcaccaccagcccccctcacacacggaacacggaacg28140 , gacaggcgag aacgggagccatggcgaacaacgaagacaagctccgcgactacctcaagc28200 gcgtcaccgc cgagctgcagcagaacaccaggcgtctgcgcgagatcgagggacgcacgc28260 acgagccggt ggcgatcgtgggcatggcctgccgcctgccgggcggtgtcgcctcgcccg28320 l0aggacctgtggcagctggtggccggggacggggacgcgatctcggagttcccgcaggacc28380 gcggctggga cgtggaggggctgtacgaccccgacccggacgcgtccggcaggacgtact28440 gccggtccgg cggattcctgcacgacgccggcgagttcgacgccgacttcttcgggatct28500 cgccgcgcga ggccctcgccatggacccgcagcagcgactgtccctcaccaccgcgtggg28560 aggcgatcga gagcgcgggcatcgacccgacggccctgaagggcagcggcctcggcgtct28620 l5tcgtcggcggctggcacaccggctacacctcggggcagaccaccgccgtgcagtcgcccg28680 agctggaggg ccacctggtcagcggcgcggcgctgggcttcctgtccggccgtatcgcgt28740 acgtcctcgg tacggacggaccggccctgaccgtggacacggcctgctcgtcctcgctgg28800 tcgccctgca cctcgccgtgcaggccctccgcaagggcgagtgcgacatggccctcgccg28860 gtggtgtcac ggtcatgcccaacgc~ggacctgttcgtgcagttcagccggcagcgcgggc28920 20tggccgcggacggccggtcgaaggc~gttcgccacctcggcggacggcttcggccccgcgg28980 agggcgccgg agtcctgctggtgga~gcgcctgtcggacgcccgccgcaacggacaccgga29040 tcctcgcggt cgtccgcggcagcgc~ggtcaaccaggacggcgccagcaacggcctcacgg29100 ctccgcacgg 
gccctcccagcagcgcgtcatccgacgggccctggcggacgcccggctcg29160 cgccgggtga cgtggacgtcgtcga~ggcgcacggcacgggcacgcggctcggcgacccga29220 25tcgaggcgcaggccctcatcgccacctacggccaggagaagagcagcgaacagccgctga29280 ggctgggcgc gttgaagtcgaacatcgggcacacgcaggccgcggccggtgtcgcaggtg29340 tcatcaagat ggtccaggcgatgcgccacggactgctgccgaagacgctgcacgtcgacg29400 agccctcgga ccagatcgactggtc!ggcgggcacggtggaactcctcaccgaggccgtcg29460 actggccgga gaagcaggacggcgggctgcgccgcgcggctgtctcctccttcggcatca29520 30gcgggacgaacgcgcacgtcgtcct~,ggaggaggccccggcggtcgaggactccccggccg29580 tcgagccgcc ggccggtggcggtgtggtgccgtggccggtgtccgcgaagactccggccg29640 cgctggacgc ccagatcgggcagctcgccgcgtacgcggacggtcgtacggacgtggatc29700 cggcggtggc cgcccgcgccctggtcgacagccgtacggcgatggagcaccgcgcggtcg29760 cggtcggcga cagccgggaggcactgcgggacgccctgcggatgccggaaggactggtac29820 35gcggcacgtcctcggacgtgggccg~ggtggcgttcgtcttccccggccagggcacgcagt29880 gggccggcat gggcgccgaactccttgacagctcaccggagttcgctgcctcgatggccg29940 aatgcgagac cgcgctctcccgctacgtcgactggtctcttgaagccgtcgtccgacagg30000 aacccggcgc acccacgctcgaccgcgtcgacgtcgtccagcccgtgaccttcgctgtca30060 tggtctcgct ggcgaaggtctggcagcaccacggcatcaccccccaggccgtcgtcggcc30120 40actcgcagggcgagatcgccgccgcgtacgtcgccggtgcactcaccctcgacgacgccg30180 SS
cccgcgtcgt caccctgcgc agcaagtcca tcgccgccca cctcgccggc aagggcggca 30240
tgatctccct cgccctcgac gaggcggccg tcctgaagcg actgagcgac ttcgacggac 30300
tctccgtcgc cgccgtcaac ggccccaccg ccaccgtcgt ctccggcgac ccgacccaga 30360
tcgaggaact cgcccgcacc tgcgaggccg acggcgtccg tgcgcggatc atcccggtcg 30420
actacgcctc ccacagccgg caggtcgaga tcatcgagaa ggagctggcc gaggtcctcg 30480
ccggactcgc cccgcaggct ccgcacgtgc cgttcttctc caccctcgaa ggcacctgga 30540
tcaccgagcc ggtgctcgacggcacctactggtaccgcaacctgcgccatcgcgtgggct30600 tcgcccccgc cgtggagaccttggcggttgacggcttcacccacttcatcgaggtcagcg30660 cccaccccgt cctcaccatgaccctccccgagaccgtcaccggcctcggcaccctccgcc30720 l0gcgaacagggaggccaggagcgtctggtcacctcactcgccgaagcctgggccaacggcc30780 tcaccatcga ctgggcgcccatcctccccaccgcaaccggccaccaccccgagctcccca30840 cctacgcctt ccagaccgagcgcttctggctgcagagctccgcgcccaccagcgccgccg30900 acgactggcg ttaccgcgtcgagtggaagccgctgacggcctccggccaggcggacctgt30960 ccgggcggtg gatcgtcgccgtcgggagcgagccagaagccgagctgctgggcgcgctga31020 lSaggccgcgggagcggaggtcgacgtactggaagccggggcggacgacgaccgtgaggccc31080 tcgccgcccg gctcaccgcactgacgaccggcgacggcttcaccggcgtggtctcgctcc31140 tcgacgacct cgtgccacaggtcgcctgggtgcaggcactcggcgacgccggaatcaagg31200 cgcccctgtg gtccgtcacccagggcgcggtctccgtcggacgtctcgacacccccgccg31260 accccgaccg ggccatgctctggggcctcggccgcgtcgtcgcccttgagcaccccgaac31320 20gctgggccggcctcgtcgacctccccgcccagcccgatgccgccgccctcgcccacctcg31380 tcaccgcact ctccggcgccaccggcgaggaccagatcgccatccgcaccaccggactcc31440 acgcccgccg cctcgcccgcgcacccctccacggacgtcggcccacccgcgactggcagc31500 cccacggcac cgtcctcatcaccggcggcaccggagccctcggcagccacgccgcacgct31560 ggatggccca ccacggagccgaacacctcctcctcgtcagccgcagcggcgaacaagccc31620 2Sccggagccacccaactcaccgccga~actcaccgcatcgggcgcccgcgtcaccatcgccg31680 cctgcgacgt cgccgacccccacgccatgcgcaccctcctcgacgccatccccgccgaga31740 cgcccctcac cgccgtcgtccacaccgccggcgcaccgggcggcgatccgctggacgtca31800 ccggcccgga ggacatcgcccgcatcctgggcgcgaagacgagcggcgccgaggtcctcg31860 acgacctgct ccgcggcactccgctggacgccttcgtcctctactcctcgaacgccgggg31920 30tctggggcagcggcagccagggcgtctacgcggcggccaacgcccacctcgacgcgctcg31980 ccgcccggcg ccgcgcccggggcgagacggcgacctcggtcgcctggggcctctgggccg32040 gcgacggcat gggccggggcgccgacgacgcgtactggcagcgtcgcggcatccgtccga32100 tgagccccga ccgcgccctggacga~actggccaaggccctgagccacgacgagaccttcg32160 tcgccgtggc cgatgtcgactgggagcggttcgcgcccgcgttcacggtgtcccgtccca32220 3Sgccttctgctcgacggcgtcccggaggcccggcaggcgctcgccgcacccgtcggtgccc32280 cggctcccgg 
cgacgccgccgtggcgccgaccgggcagtcgtcggcgctggccgcgatca32340 ccgcgctccc cgagcccgagcgccggccggcgctcctcaccctcgtccgtacccacgcgg32400 cggccgtact cggccattcctcccccgaccgggtggcccccggccgtgccttcaccgagc32460 tcggcttcga ctcgctgacggccgtgcagctccgcaaccagctctccacggtggtcggca32520 40acaggctccccgccaccacggtcttcgaccacccgacgcccgccgcactcgccgcgcacc32580 tccacgaggc gtacctcgcaccggccgagccggccccgacggactgggaggggcgggtgc32640 gccgggccct ggccgaactgcccctcgaccggctgcgggacgcgggggtcctcgacaccg32700 tcctgcgcct caccggcatcgagcccgagccgggttccggcggttcggacggcggcgccg32760 ccgaccctgg tgcggagccggaggcgtcgatcgacgacctggacgccgaggccctgatcc32820 Sggatggctctcggcccccgtaacacctgacccgaccgcggtcctgccccacgcgccgcac32880 cccgcgcatc ccgcgcaccacccgcccccacacgcccacaaccccatccacgagcggaag32940 , accacaccca gatgacgagttccaacgaacagttggtggacgctctgcgcgcctctctca33000 aggagaacga agaactccggaaagagagccgtcgccgggccgaccgtcggcaggagccca33060 tggcgatcgt cggcatgagctgccggttcgcgggcggaatccggtcccccgaggacctct33120 l0gggacgccgtcgccgcgggcaaggacctggtctccgaggtaccggaggagcgcggctggg33180 acatcgactc cctctacgacccggtgcccgggcgcaagggcacgacgtacgtccgcaacg33240 ccgcgttcct cgacgacgccgccggattcgacgcggccttcttcgggatctcgccgcgcg33300 aggccctcgc catggacccgcagcagcggcagctcctcgaagcctcctgggaggtcttcg33360 agcgggccgg catcgaccccgcgtcggtccgcggcaccgacgtcggcgtgtacgtgggct33420 l5gtggctaccaggactacgcgccggacatccgggtcgcccccgaaggcaccggcggttacg33480 tcgtcaccgg caactcctccgccgt~ggcctccgggcgcatcgcgtactccctcggcctgg33540 agggacccgc cgtgaccgtggacac~ggcgtgctcctcttcgctcgtcgccctgcacctcg33600 ccctgaaggg cctgcggaacggcgactgctcgacggcactcgtgggcggcgtggccgtcc33660 tcgcgacgcc gggcgcgttcatcgagttcagcagccagcaggccatggccgccgacggcc33720 20ggaccaagggcttcgcctcggcggcggacggcctcgcctggggcgagggcgtcgccgtac33780 tcctcctcga acggctctccgacgcgcggcgcaagggccaccgggtcctggccgtcgtgc33840 gcggcagcgc catcaaccaggacggcgcgagcaacggcctcacggctccgcacgggccct33900 cccagcagca cctgatccgccaggccctggccgacgcgcggctcacgtcgagcgacgtgg33960 acgtcgtgga gggccacggcacggggacccgtctcggcgacccgatcgaggcgcaggcgc34020 
25tgctcgccacgtacgggcaggggcgcgccccggggcagccgctgcggctggggacgctga34080 agtcgaacat cgggcacacgcaggccgcttcgggtgtcgccggtgtcatcaagatggtgc34140 aggcgctgcg ccacggggtgctgccgaagaccctgcacgtggacgagccgacggaccagg34200 tcgactggtc ggccggttcggtcgagctgctcaccgaggccgtggactggccggagcggc34260 cgggccggct ccgccgggcgggcgtctccgcgttcggcgtgggcgggacgaacgcgcacg34320 30tcgtcctggaggaggccccggcggtcgaggagtcccctgccgtcgagccgccggccggtg34380 gcggcgtggt gccgtggccggtgtccgcgaagacctcggccgcactggacgcccagatcg34440 ggcagctcgc cgcatacgcggaagaccgcacggacgtggatccggcggtggccgcccgcg34500 ccctggtcga cagccgtacggcgatggagcaccgcgcggtcgcggtcggcgacagccggg34560 aggcactgcg ggacgccctgcggatc3ccggaaggactggtacggggcacggtcaccgatc34620 35cgggccgggtggcgttcgtcttccccggccagggcacgcagtgggccggcatgggcgccg34680 aactcctcga cagctcacccgaattcgccgccgccatggccgaatgcgagaccgcactct34740 ccccgtacgt cgactggtctctcgaagccgtcgtccgacaggctcccagcgcaccgacac34800 tcgaccgcgt cgacgtcgtccagcccgtcaccttcgccgtcatggtctccctcgccaagg34860 tctggcagca ccacggcatcacccccgaggccgtcatcggccactcccagggcgagatcg34920 40ccgccgcgtacgtcgccggtgccctcaccctcgacgacgccgctcgtgtcgtgaccctcc34980 gcagcaagtc catcgccgcc cacct.cgccg gcaagggcgg catgatctcc ctcgccctca 35040 gcgaggaagc cacccggcag cgcat.cgaga acctccacgg actgtcgatc gccgccgtca 35100 acgggcctac cgccaccgtg gtttcgggcg accccaccca gatccaagaa cttgctcagg 35160 cgtgtgaggc cgacggcatc cgcgcacgga tcatccccgt cgactacgcc tcccacagcg 35220 $cccacgtcga gaccatcgag aacga.actcg ccgacgtcct ggcggggttg tccccccaga 35280 caccccaggt ccccttcttc tccaccctcg aaggcacctg gatcaccgaa cccgccctcg 35340 , acggcggcta ctggtaccgc aacctccgcc atcgtgtggg cttcgccccg gccgtcgaga 35400 ccctcgccac cgacgaaggc ttcacccact tcatcgaggt cagcgcccac cccgtcctca 35460 ccatgaccct ccccgacaag gtcaccggcc tggccaccct ccgacgcgag gacggcggac 35520 l0agcaccgcct caccacctcc cttgccgagg cctgggccaa cggcctcgcc ctcgactggg 35580 cctccctcct gcccgccacg ggcgccctca gccccgccgt ccccgacctc ccgacgtacg 35640 ccttccagca ccgctcgtac tggatcagcc ccgcgggtcc cggcgaggcg cccgcgcaca 35700 ccgcttccgg gcgcgaggcc gtcgccgaga cggggctcgc 
gtggggcccg ggtgccgagg 35760 acctcgacga ggagggccgg cgcagcgccg tactcgcgat ggtgatgcgg caggcggcct 35820 l5ccgtgctccg gtgcgactcg cccgaagagg tccccgtcga ccgcccgctg cgggagatcg 35880 gcttcgactc gctgaccgcc gtcgacttcc gcaaccgcgt caaccggctg accggtctcc 35940 agctgccgcc caccgtcgtg ttccagcacc cgacgcccgt cgcgctcgcc gagcgcatca 36000 gcgacgagct ggccgagcgg aactgggccg tcgccgagcc gtcggatcac gagcaggcgg 36060 aggaggagaa ggccgccgct ccggc~ggggg cccgctccgg ggccgacacc ggcgccggcg 36120 20ccgggatgtt ccgcgccctg ttccg~gcagg ccgtggagga cgaccggtac ggcgagttcc 36180 tcgacgtcct cgccgaagcc tccgc~gttcc gcccgcagtt cgcctcgccc gaggcctgct 36240 cggagcggct cgacccggtg ctgctcgccg gcggtccgac ggaccgggcg gaaggccgtg 36300 ccgttctcgt cggctgcacc ggcaccgcgg cgaacggcgg cccgcacgag ttcctgcggc 36360 tcagcacctc cttccaggag gagcg~ggact tcctcgccgt acctctcccc ggctacggca 36420 25cgggtacggg caccggcacg gccctcctcc cggccgatct cgacaccgcg ctcgacgccc 36480 aggcccgggc gatcctccgg gccgccgggg acgccccggt cgtcctgctc gggcactccg 36540 gcggcgccct gctcgcgcac gagctggcct tccgcctgga gcgggcgcac ggcgcgccgc 36600 cggccgggat cgtcctggtc gacccctatc cgccgggcca tcaggagccc atcgaggtgt 36660 ggagcaggca gctgggcgag ggcctgttcg cgggcgagct ggagccgatg tccgatgcgc 36720 30ggctgctggc catgggccgg tacgc!3cggt tcctcgccgg cccgcggccg ggccgcagca 36780 gcgcgcccgt gcttctggtc cgtgcctccg aaccgctggg cgactggcag gaggagcggg 36840 gcgactggcg tgcccactgg gaccttccgc acaccgtcgc ggacgtgccg.ggcgaccact 36900 tcacgatgat gcgggaccac gcgccggccg tcgccgaggc cgtcctctcc tggctcgacg 36960 ccatcgaggg catcgagggg gcgggcaagt gaccgacaga cctctgaacg tggacagcgg 37020 35actgtggatc cggcgcttcc accccgcgcc gaacagcgcg gtgcggctgg tctgcctgcc 37080 gcacgccggc ggctccgcca gctacttctt ccgcttctcg gaggagctgc acccctccgt 37140 cgaggccctg tcggtgcagt atccgggccg ccaggaccgg cgtgccgagc cgtgtctgga 37200 gagcgtcgag gagctcgccg agcat!3tggt cgcggccacc gaaccctggt ggcaggaggg 37260 ccggctggcc ttcttcgggc acagcctcgg cgcctccgtc gccttcgaga cggcccgcat 37320 40cctggaacag cggcacgggg tacggcccga gggcctgtac gtctccggtc ggcgcgcccc 37380 
gtcgctggcg ccggaccggc tcgtccacca gctggacgac cgggcgttcc tggccgagat 37440
ccggcggctc agcggcaccg acgagcggtt cctccaggac gacgagctgc tgcggctggt 37500
gctgcccgcg ctgcgcagcg actacaaggc ggcggagacg tacctgcacc ggccgtccgc 37560
caagctcacc tgcccggtga tggccctggc cggcgaccgt gacccgaagg cgccgctgaa 37620
cgaggtggcc gagtggcgtc ggcac<~ccag cgggccgttc tgcctccggg cgtactccgg 37680
cggccacttc tacctcaacg accagtggca cgagatctgc aacgacatct ccgaccacct 37740
gctcgtcacc cgcggcgcgc ccgatgcccg cgtcgtgcag cccccgacca gccttatcga 37800
aggagcggcg aagagatggc agaacccacg gtgaccgacg acctgacggg ggccctcacg 37860
cagcccccgc tgggccgcac cgtccgcgcg gtggccgacc gtgaactcgg cacccacctc 37920
ctggagaccc gcggcatcca ctggatcc 37948
<210> 6
<211> 12199
<212> PRT
<213> Streptomyces venezuelae
<400> 6
Met Ala Phe Ser Pro Gln Gly Gly Arg His Glu Leu Gly Gln Asn Phe
Leu Val Asp Arg Ser Val Ile Asp Glu Ile Asp Gly Leu Val Ala Arg
Thr Lys Gly Pro Ile Leu Glu Ile Gly Pro Gly Asp Gly Ala Leu Thr
        35                  40                  45
Leu Pro Leu Ser Arg His Gly Arg Pro Ile Thr Ala Val Glu Leu Asp
Gly Arg Arg Ala Gln Arg Leu Gly Ala Arg Thr Pro Gly His Val Thr
Val Val His His Asp Phe Leu Gln Tyr Pro Leu Pro Arg Asn Pro His
Val Val Val Gly Asn Val Pro Phe His Leu Thr Thr Ala Ile Met Arg
Arg Leu Leu Asp Ala Gln His Trp His Thr Ala Val Leu Leu Val Gln
Trp Glu Val Ala Arg Arg Arg Ala Gly Val Gly Gly Ser Thr Leu Leu
Thr Ala Gly Trp Ala Pro Trp Tyr Glu Phe Asp Leu His Ser Arg Val
Pro Ala Arg Ala Phe Arg Pro Met Pro Gly Val Asp Gly Gly Val Leu
Ala Ile Arg Arg Arg Ser Ala Pro Leu Val Gly Gln Val Lys Thr Tyr
Gln Asp Phe Val Arg Gln Val Phe Thr Gly Lys Gly Asn Gly Leu Lys
        195                 200                 205
Glu Ile Leu Arg Arg Thr Gly Arg Ile Ser Gln Arg Asp Leu Ala Thr
    210                 215                 220
Trp Leu Arg Arg Asn Glu Ile Ser Pro His Ala Leu Pro Lys Asp Leu
Lys Pro Gly Gln Trp Ala Ser Leu Trp Glu Leu Thr Gly Gly Thr Ala
Asp Gly Ser Phe Asp Gly Thr Ala Gly Gly Gly Ala Ala Gly Ser His
Gly Ala Ala Arg Val Gly Ala Gly His Pro Gly Gly Arg Val Ser Ala
        275                 280                 285
Ser Arg Arg Gly Val Pro Gln
Ala Arg Arg Gly Arg Gly His Ala Val Arg Ser Ser Thr Gly Thr Glu Pro Arg Trp Gly Arg Gly Arg Ala Glu 20Ser Ala Met Ala Met Arg Asp Ser Ile Pro Arg Arg Ala Asp Arg Asp Thr Leu Arg Arg Glu Leu Gly Gln Asn Phe Leu Gln Asp Asp Arg Ala Val Arg Asn Leu Val Thr His Val Glu Gly Asp Gly Arg Asn Val Leu Glu Ile Gly Pro Gly Lys Gly Ala Ile Thr Glu Glu Leu Val Arg Ser Phe Asp Thr Val Thr Val Val Glu Met Asp Pro His Trp Ala Ala His 30Va1 Arg Arg Lys Phe Glu Gly Glu Arg Val Thr Val Phe Gln Gly Asp Phe Leu Asp Phe Arg Ile Pro Arg Asp Ile Asp Thr Val Val Gly Asn Val Pro Phe Gly Ile Thr Thr Gln Ile Leu Arg Ser Leu Leu Glu Ser Thr Asn Trp Gln Ser Ala Ala Leu Ile Val Gln Trp Glu Val Ala Arg Lys Arg Ala Gly Arg Ser Gly Gly Ser Leu Leu Thr Thr Ser Trp Ala 40Pro Trp Tyr Glu Phe Ala Val His Asp Arg Val Arg Ala Ser Ser Phe Arg Pro Met Pro Arg Val Asp Gly Gly Val Leu Thr Ile Arg Arg Arg Pro Gln Pro Leu Leu Pro Glu Ser Ala Ser Arg Ala Phe Gln Asn Phe $ 515 520 525 Ala Glu Ala Val Phe Thr Gly Pro Gly Arg Gly Leu Ala Glu Ile Leu Arg Arg His Ile Pro Lys Arg Thr Tyr Arg Ser Leu Ala Asp Arg His 545 , 550 555 560 lOGly Ile Pro Asp Gly Gly Leu Pro Lys Asp Leu Thr Leu Thr Gln Trp Ile Ala Leu Phe Gln Ala Ser Gln Pro Ser Tyr Ala Pro Gly Ala Pro Gly Thr Arg Met Pro Gly Gln Gly Gly Gly Ala Gly Gly Arg Asp Tyr Asp Ser Glu Thr Ser Arg Ala Ala Val Pro Gly Ser Arg Arg Tyr Gly Pro Thr Arg Gly Gly Glu Pro Cys Ala Pro Arg Ala Gln Val Arg Gln 20Thr Lys Gly Arg Gln Gly Ala Arg Gly Ser Ser Tyr Gly Arg Arg Thr Gly Arg Met Ser Ser Ala Gly Ile Thr Arg Thr Gly Ala Arg Thr Pro Val Thr Gly Arg Gly Ala Ala Ala Trp Asp Thr Gly Glu Val Arg Val 2$ 675 680 685 Arg Arg Gly Leu Pro Pro Ala Gly Pro Asp His Ala Glu His Ser Phe Ser Arg Ala Pro Thr Gly Asp Val Arg Ala Glu Leu Ile Arg Gly Glu 30Met Ser Thr Val Ser Lys Ser Glu Ser Glu Glu Phe Val Ser Val Ser Asn Asp Ala Gly Ser Ala His Gly Thr Ala Glu Pro Val Ala Val Val Gly Ile Ser Cys Arg Val Pro Gly Ala Arg Asp Pro Arg Glu Phe Trp 3$ 755 760 765 Glu Leu Leu Ala Ala Gly Gly Gln 
Ala Val Thr Asp Val Pro Ala Asp Arg Trp Asn Ala Gly Asp Phe Tyr Asp Pro Asp Arg Ser Ala Pro Gly 40Arg Ser Asn Ser Arg Trp Gly Gly Phe Ile Glu Asp Val Asp Arg Phe Asp Ala Ala Phe Phe Gly Ile Ser Pro Arg Glu Ala Ala Glu Met Asp Pro Gln Gln Arg Leu Ala Leu Glu Leu Gly Trp Glu Ala Leu Glu Arg $ 835 840 845 Ala Gly Ile Asp Pro Ser Ser Leu Thr Gly Thr Arg Thr Gly Val Phe Ala Gly Ala Ile Trp Asp Asp Tyr Ala Thr Leu Lys His Arg Gln Gly lUGly Ala Ala Ile Thr Pro His Thr Val Thr Gly Leu His Arg Gly Ile Ile Ala Asn Arg Leu Ser Tyr Thr Leu Gly Leu Arg Gly Pro Ser Met Val Val Asp Ser Gly Gln Ser Ser Ser Leu Val Ala Val His Leu Ala 1$ 915 920 925 Cys Glu Ser Leu Arg Arg Gly Glu Ser Glu Leu Ala Leu Ala Gly Gly Val Ser Leu Asn Leu Val Pro Asp Ser Ile Ile Gly Ala Ser Lys Phe 2~Gly Gly Leu Ser Pro Asp Gly .Arg Ala Tyr Thr Phe Asp Ala Arg Ala Asn Gly Tyr Val Arg Gly Glu Gly Gly Gly Phe Val Val Leu Lys Arg Leu Ser Arg Ala Val Ala Asp Gly Asp Pro Val Leu Ala Val Ile Arg 2$ 995 1000 1005 Gly Ser Ala Val Asn Asn Gly Gly Ala Ala Gln Gly Met Thr Thr Pro Asp Ala Gln Ala Gln Glu Ala 'Val Leu Arg Glu Ala His Glu Arg Ala 30G1y Thr Ala Pro Ala Asp Val ;erg Tyr Val Glu Leu His Gly Thr Gly Thr Pro Val Gly Asp Pro Ile Glu Ala Ala Ala Leu Gly Ala Ala Leu Gly Thr Gly Arg Pro Ala Gly Gln Pro Leu Leu Val Gly Ser Val Lys 35 1075 :1080 1085 Thr Asn Ile Gly His Leu Glu Gly Ala Ala Gly Ile Ala Gly Leu Ile Lys Ala Val Leu Ala Val Arg Gly Arg Ala Leu Pro Ala Ser Leu Asn 40Tyr Glu Thr Pro Asn Pro Ala :Ile Pro Phe Glu Glu Leu Asn Leu Arg Val Asn Thr Glu Tyr Leu Pro Trp Glu Pro Glu His Asp Gly Gln Arg Met Val Val Gly Val Ser Ser Phe Gly Met Gly Gly Thr Asn Ala His $ 1155 1160 1165 Val Val Leu Glu Glu Ala Pro Gly Gly Cys Arg Gly Ala Ser Val Val Glu Ser Thr Val Gly Gly Ser Ala Val Gly Gly Gly Val Val Pro Trp I~Val Val Ser Ala Lys Ser Ala Ala Ala Leu Asp Ala Gln Ile Glu Arg Leu Ala Ala Phe Ala Ser Arg Asp Arg Thr Asp Gly Val Asp Ala Gly Ala Val Asp Ala Gly Ala Val Asp Ala Gly Ala Val Ala Arg Val Leu Ala Gly Gly 
Arg Ala Gln Phe Glu His Arg Ala Val Val Val Gly Ser Gly Pro Asp Asp Leu Ala Ala Ala Leu Ala Ala Pro Glu Gly Leu Val 20Arg Gly Val Ala Ser Gly Val Gly Arg Val Ala Phe Val Phe Pro Gly Gln Gly Thr Gln Trp Ala Gly Met Gly Ala Glu Leu Leu Asp Ser Ser Ala Val Phe Ala Ala Ala Met Ala Glu Cys Glu Ala Ala Leu Ser Pro 2$ 1315 1320 1325 Tyr Val Asp Trp Ser Leu Glu Ala Val Val Arg Gln Ala Pro Gly Ala 1330 133!5 1340 Pro Thr Leu Glu Arg Val Asp Val Val Gln Pro Val Thr Phe Ala Val 3~Met Val Ser Leu Ala Arg Val Trp Gln His His Gly Val Thr Pro Gln Ala Val Val Gly His Ser Gln Gly Glu Ile Ala Ala Ala Tyr Val Ala Gly Ala Leu Ser Leu Asp Asp Ala Ala Arg Val Val Thr Leu Arg Ser 3$ 1395 1400 1405 Lys Ser Ile Ala Ala His Leu Ala Gly Lys Gly Gly Met Leu Ser Leu 1410 141!i 1420 Ala Leu Ser Glu Asp Ala Val Leu Glu Arg Leu Ala Gly Phe Asp Gly 4~Leu Ser Val Ala Ala Val Asn Gly Pro Thr Ala Thr Val Val Ser Gly Asp Pro Val Gln Ile Glu Glu Leu Ala Arg Ala Cys Glu Ala Asp Gly Val Arg Ala Arg Val Ile Pro Val Asp Tyr Ala Ser His Ser Arg Gln Val Glu Ile Ile Glu Ser Glu Leu Ala Glu Val Leu Ala Gly Leu Ser Pro Gln Ala Pro Arg Val Pro Phe Phe Ser Thr Leu Glu Gly Ala Trp lOIle Thr Glu Pro Val Leu Asp Gly Gly Tyr Trp Tyr Arg Asn Leu Arg His Arg Val Gly Phe Ala Pro Ala Val Glu Thr Leu Ala Thr Asp Glu Gly Phe Thr His Phe Val Glu Val Ser Ala His Pro Val Leu Thr Met Ala Leu Pro Gly Thr Val Thr Gly Leu Ala Thr Leu Arg Arg Asp Asn Gly Gly Gln Asp Arg Leu Val Ala Ser Leu Ala Glu Ala Trp Ala Asn 20G1y Leu Ala Val Asp Trp Ser Pro Leu Leu Pro Ser Ala Thr Gly His His Ser Asp Leu Pro Thr Tyr Ala Phe Gln Thr Glu Arg His Trp Leu Gly Glu Ile Glu Ala Leu Ala Pro Ala Gly Glu Pro Ala Val Gln Pro 2$ 1635 1640 1645 Ala Val Leu Arg Thr Glu Ala Ala Glu Pro Ala Glu Leu Asp Arg Asp Glu Gln Leu Arg Val Ile Leu Asp Lys Val Arg Ala Gln Thr Ala Gln .

30Va1 Leu Gly Tyr Ala Thr Gly Gly Gln Ile Glu Val Asp Arg Thr Phe Arg Glu Ala Gly Cys Thr Ser Leu Thr Gly Val Asp Leu Arg Asn Arg Ile Asn Ala Ala Phe Gly Val Arg Met Ala Pro Ser Met Ile Phe Asp 3$ 1715 1720 1725 Phe Pro Thr Pro Glu Ala Leu Ala Glu Gln Leu Leu Leu Val Val His Gly Glu Ala Ala Ala Asn Pro Ala Gly Ala Glu Pro Ala Pro Val Ala 40A1a Ala Gly Ala Val Asp Glu Pro Val Ala Ile Val Gly Met Ala Cys Arg Leu Pro Gly Gly Val Ala Ser Pro Glu Asp Leu Trp Arg Leu Val Ala Gly Gly Gly Asp Ala Ile Ser Glu Phe Pro Gln Asp Arg Gly Trp $ 1795 1800 1805 Asp Val Glu Gly Leu Tyr His Pro Asp Pro Glu His Pro Gly Thr Ser , Tyr Val Arg Gln Gly Gly Phe Ile Glu Asn Val Ala Gly Phe Asp Ala I~Ala Phe Phe Gly Ile Ser Pro Arg Glu Ala Leu A1a Met Asp Pro Gln Gln Arg Leu Leu Leu Glu Thr Ser Trp Glu Ala Val Glu Asp Ala Gly Ile Asp Pro Thr Ser Leu Arg Gly Arg Gln Val Gly Val Phe Thr Gly 1$ 1875 1880 1885 Ala Met Thr His Glu Tyr Gly Pro Ser Leu Arg Asp Gly Gly Glu Gly Leu Asp Gly Tyr Leu Leu Thr Gly Asn Thr Ala Ser Val Met Ser Gly 20Arg Val Ser Tyr Thr Leu Gly Leu Glu Gly Pro Ala Leu Thr Val Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu Ala Val Gln Ala Leu Arg Lys Gly Glu Val Asp Met Ala Leu Ala Gly Gly Val Ala Val 2$ 1955 1960 1965 Met Pro Thr Pro Gly Met Phe Val Glu Phe Ser Arg Gln Arg Gly Leu Ala Gly Asp Gly Arg Ser Lys Ala Phe Ala Ala Ser Ala Asp Gly Thr 3~Ser Trp Ser Glu Gly Val Gly Val Leu Leu Val Glu Arg Leu Ser Aep Ala Arg Arg Asn Gly His Gln Val Leu Ala Val Val Arg Gly Ser Ala Leu Asn Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro 3$ 2035 2040 2045 Ser Gln Gln Arg Val Ile Arg Arg Ala Leu Ala Asp Ala Arg Leu Thr Thr Ser Asp Val Asp Val Val Glu Ala His Gly Thr Gly Thr Arg Leu 4UGly Asp Pro Ile Glu Ala Gln Ala Leu Ile Ala Thr Tyr Gly Gln Gly Arg Asp Asp Glu Gln Pro Leu Arg Leu Gly Ser Leu Lys Ser Asn Ile Gly His Thr Gln Ala Ala Ala Gly Val Ser Gly Val Ile Lys Met Val $ 2115 2120 2125 Gln Ala Met Arg His Gly Leu Leu Pro Lys Thr Leu His Val Asp Glu 2130 213!i 2140 Pro Ser 
Asp Gln Ile Asp Trp Ser Ala Gly Ala Val Glu Leu Leu Thr I~Glu Ala Val Asp Trp Pro Glu Lys Gln Asp Gly Gly Leu Arg Arg Ala Ala Val Ser Ser Phe Gly Ile Ser Gly Thr Asn Ala His Val Val Leu Glu Glu Ala Pro Val Val Val Glu Gly Ala Ser Val Val Glu Pro Ser I$ 2195 2200 2205 Val Gly Gly Ser Ala Val Gly Gly Gly Val Thr Pro Trp Val Val Ser 2210 221°_i 2220 Ala Lys Ser Ala Ala Ala Leu Asp Ala Gln Ile Glu Arg Leu Ala Ala 2UPhe Ala Ser Arg Asp Arg Thr Asp Asp Ala Asp Ala Gly Ala Val Asp Ala Gly Ala Val Ala His Val Leu Ala Asp Gly Arg Ala Gln Phe Glu His Arg Ala Val Ala Leu Gly Ala Gly Ala Asp Asp Leu Val Gln Ala Leu Ala Asp Pro Asp Gly Leu Ile Arg Gly Thr Ala Ser Gly Val Gly Arg Val Ala Phe Val Phe Pro Gly Gln Gly Thr Gln Trp Ala Gly Met 3~Gly Ala Glu Leu Leu Asp Ser Ser Ala Val Phe Ala Ala Ala Met Ala Glu Cys Glu Ala Ala Leu Ser Pro Tyr Val Asp Trp Ser Leu Glu Ala Val Val Arg Gln Ala Pro Gly Ala Pro Thr Leu Glu Arg Val Asp Val 3$ 2355 2360 2365 Val Gin Pro Val Thr Phe Ala Val Met Val Ser Leu Ala Arg Val Trp 2370 2375. 
2380 Gln His His Gly Val Thr Pro Gln Ala Val Val Gly His Ser Gln Gly 40G1u Ile Ala Ala Ala Tyr Val Ala Gly Ala Leu Pro Leu Asp Asp Ala Ala Arg Val Val Thr Leu Arg Ser Lys Ser Ile Ala Ala His Leu Ala Gly Lys Gly Gly Met Leu Ser Leu Ala Leu Asn Glu Asp Ala Val Leu Glu Arg Leu Ser Asp Phe Asp Gly Leu Ser Val Ala Ala Val Asn Gly Pro Thr Ala Thr Val Val Ser Gly Asp Pro Val Gln Ile Glu Glu Leu lOAla Gln Ala Cys Lys Ala Asp Gly Phe Arg Ala Arg Ile Ile Pro Val Asp Tyr Ala Ser His Ser Arg Gln Val Glu Ile Ile Glu Ser Glu Leu Ala Gln Val Leu Ala Gly Leu Ser Pro Gln Ala Pro Arg Val Pro Phe Phe Ser Thr Leu Glu Gly Thr Trp Ile Thr Glu Pro Val Leu Asp Gly Thr Tyr Trp Tyr Arg Asn Leu Arg His Arg Val Gly Phe Ala Pro Ala Z~Ile Glu Thr Leu Ala Val Asp Glu Gly Phe Thr His Phe Val Glu Val Ser Ala His Pro Val Leu Thr Met Thr Leu Pro Glu Thr Val Thr Gly Leu Gly Thr Leu Arg Arg Glu Gln Gly Gly Gln Glu Arg Leu Val Thr 2$ 2595 2600 2605 Ser Leu Ala Glu Ala Trp Val Asn Gly Leu Pro Val Ala Trp Thr Ser Leu Leu Pro Ala Thr Ala Ser Arg Pro Gly Leu Pro Thr Tyr Ala Phe 3~Gln Ala Glu Arg Tyr Trp Leu Glu Asn Thr Pro Ala Ala Leu Ala Thr Gly Asp Asp Trp Arg Tyr Arg Ile Asp Trp Lys Arg Leu Pro Ala Ala Glu Gly Ser Glu Arg Thr Gly Leu Ser Gly Arg Trp Leu Ala Val Thr 3$ 2675 2680 2685 Pro Glu Asp His Ser Ala Gln .Ala Ala Ala Val Leu Thr Ala Leu Val Asp Ala Gly Ala Lys Val Glu 'Val Leu Thr Ala Gly Ala Asp Asp Asp 4~Arg Glu Ala Leu Ala Ala Arg :Leu Thr Ala Leu Thr Thr Gly Asp Gly Phe Thr Gly Val Val Ser Leu Leu Asp Gly Leu Val Pro Gln Val Ala Trp Val Gln Ala Leu Gly Asp Ala Gly Ile Lys Ala Pro Leu Trp Ser J' 2755 2760 2765 Val Thr Gln Gly Ala Val Ser Val Gly Arg Leu Asp Thr Pro Ala Asp Pro Asp Arg Ala Met Leu Trp Gly Leu Gly Arg Val Val Ala Leu Glu I~His Pro Glu Arg Trp Ala Gly Leu Val Asp Leu Pro Ala Gln Pro Asp Ala Ala Ala Leu Ala His Leu Val Thr Ala Leu Ser Gly Ala Thr Gly Glu Asp Gln Ile Ala Ile Arg Thr Thr Gly Leu His Ala Arg Arg Leu Ala Arg Ala Pro Leu His Gly Arg Arg Pro Thr Arg Asp Trp Gln Pro His 
Gly Thr Val Leu Ile Thr Gly Gly Thr Gly Ala Leu Gly Ser His Z~Ala Ala Arg Trp Met Ala His His Gly Ala Glu His Leu Leu Leu Val Ser Arg Ser Gly Glu Gln Ala Pro Gly Ala Thr Gln Leu Thr Ala Glu Leu Thr Ala Ser Gly Ala Arg Val Thr Ile Ala Ala Cys Asp Val Ala 2$ 2915 2920 2925 Asp Pro His Ala Met Arg Thr Leu Leu Asp Ala Ile Pro Ala Glu Thr Pro Leu Thr Ala Val Val His Thr Ala Gly Ala Leu Asp Asp Gly Ile 3~Va1 Asp Thr Leu Thr Ala Glu Gln Val Arg Arg Ala His Arg Ala Lys Ala Val Gly Ala Ser Val Leu Asp Glu Leu Thr Arg Asp Leu Asp Leu Asp Ala Phe Val Leu Phe Ser Ser Val Ser Ser Thr Leu Gly Ile Pro 3$ 2995 3000 3005 Gly Gln Gly Asn Tyr Ala Pro His Asn Ala Tyr Leu Asp Ala Leu Ala Ala Arg Arg Arg Ala Thr Gly Arg Ser Ala Val Ser Val Ala Trp Gly 4~Pro Trp Asp Gly Gly Gly Met Ala Ala Gly Asp Gly Val Ala Glu Arg Leu Arg Asn His Gly Val Pro Gly Met Asp Pro Glu Leu Ala Leu Ala Ala Leu Glu Ser Ala Leu Gly Arg Asp Glu Thr Ala Ile Thr Val Ala $ 3075 3080 3085 Asp Ile Asp Trp Asp Arg Phe Tyr Leu Ala Tyr Ser Ser Gly Arg Pro Gln Pro Leu Val Glu Glu Leu Pro Glu Val Arg Arg Ile Ile Asp Ala I~Arg Asp Ser Ala Thr Ser Gly Gln Gly Gly Ser Ser Ala Gln Gly Ala Asn Pro Leu Ala Glu Arg Leu Ala Ala Ala Ala Pro Gly Glu Arg Thr Glu Ile Leu Leu Gly Leu Val Arg Ala Gln Ala Ala Ala Val Leu Arg Met Arg Ser Pro Glu Asp Val Ala Ala Asp Arg Ala Phe Lys Asp Ile Gly Phe Asp Ser Leu Ala Gly Val Glu Leu Arg Asn Arg Leu Thr Arg Z~Ala Thr Gly Leu Gln Leu Pro Ala Thr Leu Val Phe Asp His Pro Thr Pro Leu Ala Leu Val Ser Leu Leu Arg Ser Glu Phe Leu Gly Asp Glu Glu Thr Ala Asp Ala Arg Arg Ser Ala Ala Leu Pro Ala Thr Val Gly Ala Gly Ala Gly Ala Gly Ala Gly Thr Asp Ala Asp Asp Asp Pro Ile 3250 325!5 3260 Ala Ile Val Ala Met Ser Cys Arg Tyr Pro Gly Asp Ile Arg Ser Pro 3UGlu Asp Leu Trp Arg Met Leu Ser Glu Gly Gly Glu Gly Ile Thr Pro Phe Pro Thr Asp Arg Gly Trp Asp Leu Asp Gly Leu Tyr Asp Ala Asp Pro Asp Ala Leu Gly Arg Ala Tyr Val Arg Glu G1y Gly Phe Leu His Asp Ala Ala Glu Phe Asp Ala Glu Phe Phe Gly Val Ser Pro Arg Glu 
3330 333__°i 3340 Ala Leu Ala Met Asp Pro Gln Gln Arg Met Leu Leu Thr Thr Ser Trp 4~Glu Ala Phe Glu Arg Ala Gly Tle Glu Pro Ala Ser Leu Arg Gly Ser Ser Thr Gly Val Phe Ile Gly Leu Ser Tyr Gln Asp Tyr Ala Ala Arg Val Pro Asn Ala Pro Arg Gly Val Glu Gly Tyr Leu Leu Thr Gly Ser Thr Pro Ser Val Ala Ser G1y Arg Ile Ala Tyr Thr Phe Gly Leu Glu Gly Pro Ala Thr Thr Val Asp Thr Ala Cys Ser Ser Ser Leu Thr Ala l~Leu His Leu Ala Val Arg Ala Leu Arg Ser Gly Glu Cys Thr Met Ala Leu Ala Gly Gly Val Ala Met Met Ala Thr Pro His Met Phe Val Glu Phe Ser Arg Gln Arg Ala Leu Ala Pro Asp Gly Arg Ser Lys Ala Phe Ser Ala Asp Ala Asp Gly Phe Gly Ala Ala Glu Gly Val Gly Leu Leu Leu Val Glu Arg Leu Ser Asp Ala Arg Arg Asn Gly His Pro Val Leu 2~Ala Val Val Arg Gly Thr Ala Val Asn Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro Ser Gln Gln Arg Val Ile Arg Gln Ala Leu Ala Asp Ala Arg Leu Ala Pro Gly Asp Ile Asp Ala Val Glu Thr His Gly Thr Gly Thr Ser Leu Gly Asp Pro Ile Glu Ala Gln Gly Leu 3570 357!i 3580 Gln Ala Thr Tyr Gly Lys Glu Arg Pro Ala Glu Arg Pro Leu Ala Ile 3~Gly Ser Val Lys Ser Asn Ile Gly His Thr Gln Ala Ala Ala Gly Ala Ala Gly Ile Ile Lys Met Val Leu Ala Met Arg His Gly Thr Leu Pro Lys Thr Leu His Ala Asp Glu Pro Ser Pro His Val Asp Trp Ala Asn 3$ 3635 3640 3645 Ser Gly Leu Ala Leu Val Thr Glu Pro Ile Asp Trp Pro Ala Gly Thr Gly Pro Arg Arg Ala Ala Val Ser Ser Phe Gly I:le Ser Gly Thr Asn 4~Ala His Val Val Leu Glu Gln Ala Pro Asp Ala Ala Gly Glu Val Leu Gly Ala Asp Glu Val Pro Glu Val Ser Glu Thr Val Ala Met Ala Gly Thr Ala Gly Thr Ser Glu Val Ala Glu Gly Ser Glu Ala Ser Glu Ala $ 3715 3720 3725 Pro Ala Ala Pro Gly Ser Arg Glu Ala Ser Leu Pro Gly His Leu Pro 3730 373!i 3740 Trp Val Leu Ser Ala Lys Asp Glu Gln Ser Leu Arg Gly Gln Ala Ala IOAla Leu His Ala Trp Leu Ser Glu Pro Ala Ala Asp Leu Ser Asp Ala Asp Gly Pro Ala Arg Leu Arg Asp Val Gly Tyr Thr Leu Ala Thr Ser Arg Thr Ala Phe Ala His Arg Ala Ala Val Thr Ala Ala Asp Arg Asp 1$ 3795 3800 3805 Gly Phe Leu Asp Gly Leu Ala 
Thr Leu Ala Gln Gly Gly Thr Ser Ala 3810 3815. 3820 His Val His Leu Asp Thr Ala Arg Asp Gly Thr Thr Ala Phe Leu Phe 20Thr Gly Gln Gly Ser Gln Arg Pro Gly Ala Gly Arg Glu Leu Tyr Asp Arg His Pro Val Phe Ala Arg Ala Leu Asp Glu Ile Cys Ala His Leu Asp Gly His Leu Glu Leu Pro Leu Leu Asp Val Met Phe Ala Ala Glu 2$ 3875 3880 3885 Gly Ser Ala Glu Ala Ala Leu Leu Asp Glu Thr Arg Tyr Thr Gln Cys Ala Leu Phe Ala Leu Glu Va1 Ala Leu Phe Arg Leu Val Glu Ser Trp 30G1y Met Arg Pro Ala Ala Leu Leu Gly His Ser Val G1y Glu Ile Ala Ala Ala His Val Ala Gly Val Phe Ser Leu Ala Asp Ala Ala Arg Leu Val Ala Ala Arg Gly Arg Leu Met Gln Glu Leu Pro Ala Gly Gly Ala 3$ 3955 3960 3965 Met Leu Ala Val Gln Ala Ala Glu Asp Glu Ile Arg Val Trp Leu Glu Thr Glu Glu Arg Tyr Ala Gly ;4rg Leu Asp Val Ala Ala Val Asn Gly 40Pro Glu Ala Ala Val Leu Ser Gly Asp Ala Asp Ala Ala Arg Glu Ala Glu Ala Tyr Trp Ser Gly Leu Gly Arg Arg Thr Arg Ala Leu Arg Val Ser His Ala Phe His Ser Ala His Met Asp Gly Met Leu Asp Gly Phe $ 4035 4040 4045 Arg Ala Val Leu Glu Thr Val Glu Phe Arg Arg Pro Ser Leu Thr Val Val Ser Asn Val Thr Gly Leu Ala Ala Gly Pro Asp Asp Leu Cys Aep lOPro Glu Tyr Trp Val Arg His Val Arg Gly Thr Val Arg Phe Leu Asp Gly Val Arg Val Leu Arg Asp Leu Gly Val Arg Thr Cys Leu Glu Leu Gly Pro Asp Gly Val Leu Thr Ala Met Ala Ala Asp Gly Leu Ala Asp 1$ 4115 4120 4125 Thr Pro Ala Asp Ser Ala Ala Gly Ser Pro Val Gly Ser Pro Ala Gly Ser Pro Ala Asp Ser Ala Ala Gly Ala Leu Arg Pro Arg Pro Leu Leu 20Va1 Ala Leu Leu Arg Arg Lys Arg Ser Glu Thr Glu Thr Val Ala Asp Ala Leu Gly Arg Ala His Ala His Gly Thr Gly Pro Asp Trp His Ala Trp Phe Ala Gly Ser Gly Ala His Arg Val Asp Leu Pro Thr Tyr Ser 2$ 4195 4200 4205 Phe Arg Arg Asp Arg Tyr Trp Leu Asp Ala Pro Ala Ala Asp Thr Ala Val Asp Thr Ala Gly Leu Gly Leu Gly Thr Ala Asp His Pro Leu Leu 30G1y Ala Val Val Ser Leu Pro Asp Arg Asp Gly Leu Leu Leu Thr Gly Arg Leu Ser Leu Arg Thr His Pro Trp Leu Ala Asp His Ala Val Leu Gly Ser Val Leu Leu Pro Gly Ala Ala Met Val Glu Leu 
Ala Ala His 3$ 4275 4280 4285 Ala Ala Glu Ser Ala Gly Leu Arg Asp Val Arg Glu Leu Thr Leu Leu Glu Pro Leu Val Leu Pro Glu His Gly Gly Val Glu Leu Arg Val Thr 40Va1 Gly Ala Pro Ala Gly Glu Pro Gly Gly Glu Ser Ala Gly Asp Gly Ala Arg Pro Val Ser Leu His Ser Arg Leu Ala Asp Ala Pro Ala Gly Thr Ala Trp Ser Cys His Ala. Thr Gly Leu Leu Ala Thr Asp Arg Pro Glu Leu Pro Val Ala Pro Asp Arg Ala Ala Met Trp Pro Pro Gln Gly Ala Glu Glu Val Pro Leu Asp Gly Leu Tyr Glu Arg Leu Asp Gly Asn l~Gly Leu Ala Phe Gly Pro Leu, Phe Gln Gly Leu Asn Ala Val Trp Arg Tyr Glu Gly Glu Val Phe Ala, Asp Ile Ala Leu Pro Ala Thr Thr Asn Ala Thr Ala Pro Ala Thr Alai Asn Gly Gly Gly Ser Ala Ala Ala Ala Pro Tyr Gly Ile His Pro Alai Leu Leu Asp Ala Ser Leu His Ala Ile 4450 445.5 4460 Ala Val Gly Gly Leu Val Asp Glu Pro Glu Leu Val Arg Val Pro Phe 20His Trp Ser Gly Val Thr Va7. His Ala Ala Gly Ala Ala Ala Ala Arg 4485 4490 4495 _ Val Arg Leu Ala Ser Ala Gly Thr Asp Ala Val Ser Leu Ser Leu Thr Asp Gly Glu Gly Arg Pro Leu Val Ser Val Glu Arg Leu Thr Leu Arg Pro Val Thr Ala Asp Gln Ala Ala Ala Ser Arg Val Gly Gly Leu Met 4530 45?t5 4540 His Arg Val Ala Trp Arg Pro Tyr Ala Leu Ala Ser Ser Gly Glu Gln 3~Aep Pro His Ala Thr Ser Tyr Giy Pro Thr Ala Val Leu Gly Lys Asp Glu Leu Lys Val Ala Ala Ala Leu Glu Ser Ala Gly Val Glu Val Gly Leu Tyr Pro Asp Leu Ala Ala Leu Ser Gln Asp Val Ala Ala Gly Ala 3$ 4595 4600 4605 Pro Ala Pro Arg Thr Val Leu Ala Pro Leu Pro Ala Gly Pro Ala Asp 4610 46:15 4620 Gly Gly Ala Glu Gly Val Arg Gly Thr Val Ala Arg Thr Leu Glu Leu 4~Leu Gln Ala Trp Leu Ala Asp Glu His Leu Ala Gly Thr Arg Leu Leu Leu Val Thr Arg Gly Ala Val Arg Asp Pro Glu Gly Ser Gly Ala Asp Asp Gly Gly Glu Asp Leu Ser His Ala Ala Ala Trp Gly Leu Val Arg $ 4675 4680 4685 Thr Ala Gln Thr Glu Asn Pro Gly Arg Phe Gly Leu Leu Asp Leu Ala 4690 469!i 4700 Asp Asp Ala Ser Ser Tyr Arg Thr Leu Pro Ser Val Leu Ser Asp Ala I~Gly Leu Arg Asp Glu Pro Gln Leu Ala Leu His Asp Gly Thr Ile Arg Leu Ala Arg Leu Ala Ser Val Arg Pro Glu Thr 
Gly Thr Ala Ala Pro Ala Leu Ala Pro Glu Gly Thr Val Leu Leu Thr G1y Gly Thr Gly Gly 1$ 4755 4760 4765 Leu Gly Gly Leu Val Ala Arg His Val Val Gly Glu Trp Gly Val Arg 4770 477__°i 4780 Arg Leu Leu Leu Val Ser Arg Arg Gly Thr Asp Ala Pro Gly Ala Asp Z~Glu Leu Val His Glu Leu Glu Ala Leu Gly Ala Asp Val Ser Val Ala Ala Cys Asp Val Ala Asp Arg Glu Ala Leu Thr Ala Val Leu Asp Ala Ile Pro Ala Glu His Pro Leu Thr Ala Val Val His Thr Ala Gly Val Leu Ser Asp Gly Thr Leu Pro Ser Met Thr Thr Glu Asp Val Glu His 4850 4855. 4860 Val Leu Arg Pro Lys Val Asp Ala Ala Phe Leu Leu Asp Glu Leu Thr 3~Ser Thr Pro Ala Tyr Asp Leu Ala Ala Phe Val Met Phe Ser Ser Ala Ala Ala Val Phe Gly Gly Ala Gly Gln Gly Ala Tyr Ala Ala Ala Asn Ala Thr Leu Asp~Ala Leu Ala Trp Arg Arg Arg Ala Ala Gly Leu Pro 3$ 4915 4920 4925 Ala Leu Ser Leu Gly Trp Gly Leu Trp Ala Glu Thr Ser Gly Met Thr Gly Glu Leu Gly Gln Ala Asp Leu Arg Arg Met Ser Arg Ala Gly Ile 4~Gly Gly Ile Ser Asp Ala Glu Gly Ile Ala Leu Leu Asp Ala Ala Leu Arg Asp Asp Arg His Pro Va7. 
Leu Leu Pro Leu Arg Leu Asp Ala Ala Gly Leu Arg Asp Ala Ala Gly Asn Asp Pro Ala Gly Ile Pro Ala Leu $ 4995 5000 5005 Phe Arg Asp Val Val Gly Ala Arg Thr Val Arg Ala Arg Pro Ser Ala °' Ala Ser Ala Ser Thr Thr Ala Gly Thr Ala Gly Thr Pro Gly Thr Ala lOAsp Gly Ala Ala Glu Thr Ala Ala Val Thr Leu Ala Asp Arg Ala Ala Thr Val Asp Gly Pro Ala Arch Gln Arg Leu Leu Leu Glu Phe Val Val Gly Glu Val Ala Glu Val Leu Gly His Ala Arg Gly His Arg Ile Asp Ala Glu Arg Gly Phe Leu Asp Leu Gly Phe Asp Ser Leu Thr Ala Val Glu Leu Arg Asn Arg Leu Asn Ser Ala Gly Gly Leu Ala Leu Pro Ala Z~Thr Leu Val Phe Asp His Pro Ser Pro Ala Ala Leu Ala Ser His Leu Asp Ala Glu Leu Pro Arg Gly Ala Ser Asp Gln Asp Gly Ala Gly Asn Arg Asn Gly Asn Glu Asn Gly Thr Thr Ala Ser Arg Ser Thr Ala Glu Thr Asp Ala Leu Leu Ala Gln Leu Thr Arg Leu Glu Gly Ala Leu Val 5170 51'75 51$0 Leu Thr Gly Leu Ser Asp Ala Pro Gly Ser Glu Glu Val Leu Glu His 3OLeu Arg Ser Leu Arg Ser Met Val Thr Gly Glu Thr Gly Thr Gly Thr Ala Ser Gly Ala Pro Asp G1~~r Ala Gly Ser Gly Ala Glu Asp Arg Pro Trp Ala Ala Gly Asp Gly Ala Gly Gly Gly Ser Glu Asp Gly Ala Gly 3$ 5235 5240 5245 Val Pro Asp Phe Met Asn Al;a Ser Ala Glu Glu Leu Phe Gly Leu Leu Asp Gln ASp Pro Ser Thr As:p Met Ser Thr Val Asn Glu Glu Lys Tyr 4~Leu Asp Tyr Leu Arg Arg Ala Thr Ala Asp Leu His Glu Ala Arg Gly Arg Leu Arg Glu Leu Glu Ala Lys Ala Gly Glu Pro Val Ala Ile Val Gly Met Ala Cys Arg Leu Pro Gly Gly Val Ala Ser Pro Glu Asp Leu Trp Arg Leu Val Ala Gly Gly Glu Asp Ala Ile Ser Glu Phe Pro Gln Asp Arg Gly Trp Asp Val Glu Gly Leu Tyr Asp Pro Asn Pro Glu Ala lOThr Gly Lys Ser Tyr Ala Arg Glu Ala Gly Phe Leu Tyr Glu Ala Gly Glu Phe Asp Ala Asp Phe Phe Gly Ile Ser Pro Arg Glu Ala Leu Ala Met Asp Pro Gln Gln Arg Leu Leu Leu Glu Ala Ser Trp Glu Ala Phe Glu His Ala Gly Ile Pro Ala Ala Thr Ala Arg Gly Thr Ser Val Gly Val Phe Thr Gly Val Met Tyr His Asp Tyr Ala Thr Arg Leu Thr Asp ZOVal Pro Glu Gly Ile Glu Gly Tyr Leu Gly Thr Gly Asn Ser Gly Ser Val Ala Ser Gly Arg Val Ala Tyr Thr 
Leu Gly Leu Glu Gly Pro Ala Val Thr Val Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu Ala Val Gln Ala Leu Arg Lys Gly Glu Val Asp Met Ala Leu Ala Gly Gly Val Thr Val Met Ser Thr Pro Ser Thr Phe Val Glu Phe Ser Arg 3~Gln Arg Gly Leu Ala Pro Asp Gly Arg Ser Lys Ser Phe Ser Ser Thr Ala Asp Gly Thr Ser Trp Ser Glu Gly Val Gly Val Leu Leu Val Glu Arg Leu Ser Asp Ala Arg Arg Lys Gly His Arg Ile Leu Ala Val Val Arg Gly Thr Ala Val Asn Gln Asp Gly Ala Ser Ser Gly Leu Thr Ala Pro Asn Gly Pro Ser Gln Gln Arg Val Ile Arg Arg Ala Leu Ala Asp 4~Ala Arg Leu Thr Thr Ser Asp Val Asp Val Val Glu Ala His Gly Thr Gly Thr Arg Leu Gly Asp Pro Ile Glu Ala Gln Ala Val Ile Ala Thr Tyr Gly Gln Gly Arg Asp Gly Glu Gln Pro Leu Arg Leu Gly Ser Leu Lys Ser Asn Ile Gly His Thr Gln Ala Ala Ala Gly Val Ser Gly Val Ile Lys Met Val Gln Ala Met Arg His Gly Val Leu Pro Lys Thr Leu lOHis Val Glu Lys Pro Thr Asp Gln Val Asp Trp Ser Ala Gly Ala Val Glu Leu Leu Thr Glu Ala Met Asp Trp Pro Asp Lys Gly Asp Gly Gly Leu Arg Arg Ala Ala Val Ser Ser Phe Gly Val Ser Gly Thr Asn Ala 1$ 5715 5720 5725 His Val Val Leu Glu Glu Ala Pro Ala Ala Glu Glu Thr Pro Ala Ser Glu Ala Thr Pro Ala Val Glu Pro Ser Val Gly A1a Gly Leu Val Pro Z~Trp Leu Val Ser Ala Lys Thr Pro Ala Ala Leu Asp Ala Gln Ile Gly Arg Leu Ala Ala Phe Ala Ser Gln Gly Arg Thr Asp Ala Ala Asp Pro Gly Ala Val Ala Arg Val Leu Ala Gly Gly Arg Ala Glu Phe Glu His 2$ 5795 5800 5805 Arg Ala Val Val Leu Gly Thr Gly Gln Asp Asp Phe Ala Gln Ala Leu 5810 581_°. 
5820 Thr Ala Pro Glu Gly Leu Ile Arg Gly Thr Pro Ser Asp Val Gly Arg 30Va1 Ala Phe Val Phe Pro Gly Gln Gly Thr Gln Trp Ala Gly Met Gly Ala Glu Leu Leu Asp Val Ser Lys Glu Phe Ala Ala Ala Met Ala Glu Cys Glu Ser Ala Leu Ser Arg Tyr Val Asp Trp Ser Leu Glu Ala Val 3$ 5875 5880 5885 Val Arg Gln Ala Pro Gly Ala Pro Thr Leu Glu Arg Val Asp Val Val Gln Pro Val Thr Phe Ala Val Met Val Ser Leu Ala Lys Val Trp Gln 4flHis His Gly Val Thr Pro Gln Ala Val Val Gly His Ser Gln Gly Glu Ile Ala Ala Ala Tyr Val Ala Gly Ala Leu Thr Leu Asp Asp Ala Ala Arg Val Val Thr Leu Arg Ser Lys Ser Ile Ala Ala His Leu Ala Gly Lys Gly Gly Met Ile Ser Leu Ala Leu Ser Giu Glu Ala Thr Arg Gln 5970 5975. 5980 Arg Ile Glu Asn Leu His Gly Leu Ser Ile Ala Ala Val Asn Gly Pro I~Thr Ala Thr Val Val Ser Gly Asp Pro Thr Gln Ile Gln Glu Leu Ala Gln Ala Cys Glu Ala Asp Gly Val Arg Ala Arg Ile Ile Pro Val Asp Tyr Ala Ser His Ser Ala His Val Glu Thr Ile Glu Ser Glu Leu Ala I$ 6035 6040 6045 Glu Val Leu Ala Gly Leu Ser Pro Arg Thr Pro Glu Val Pro Phe Phe Ser Thr Leu Glu Gly Ala Trp Ile Thr Glu Pro Val Leu Asp Gly Thr 20Tyr Trp Tyr Arg Asn Leu Arg His Arg Val Gly Phe Ala Pro Ala Val Glu Thr Leu Ala Thr Asp Glu Gly Phe Thr His Phe Ile Glu Val Ser Ala His Pro Val Leu Thr Met Thr Leu Pro Glu Thr Val Thr Gly Leu 2$ 6115 6120 6125 Gly Thr Leu Arg Arg Glu Gin Gly Gly Gln Glu Arg Leu Val Thr Ser Leu Ala Glu Ala Trp Thr Asn Gly Leu Thr Ile Asp Trp Ala Pro Val 3~Leu Pro Thr Ala Thr Gly His His Pro Glu Leu Pro Thr Tyr Ala Phe Gln Arg Arg His Tyr Trp Leu His Asp Ser Pro Ala Val Gln Gly Ser Val Gln Asp Ser Trp Arg Tyr .Arg Ile Asp Trp Lys Arg Leu Ala Val Ala Asp Ala Ser Glu Arg Ala Gly Leu Ser Gly Arg Trp Leu Val Val Val Pro Glu Asp Arg Ser Ala Glu Ala Ala Pro Val Leu Ala Ala Leu 4~Ser Gly Ala Gly Ala Asp Pro Val Gln Leu Asp Val Ser Pro Leu Gly Asp Arg Gln Arg Leu Ala Ala Thr Leu Gly Glu Ala Leu Ala Ala Ala Gly Gly Ala Val Asp Gly Val Leu Ser Leu Leu Ala Trp Asp Glu Ser Ala His Pro Gly His Pro Ala Pro Phe Thr Arg Gly Thr 
Gly Ala Thr Leu Thr Leu Val Gln Ala Leu Glu Asp Ala Gly Val Ala Ala Pro Leu lOTrp Cys Val Thr His Gly Ala Val Ser Val Gly Arg Ala Asp His Val Thr Ser Pro Ala Gln Ala Met Val Trp Gly Met Gly Arg Val Ala Ala Leu Glu His Pro Glu Arg Trp Gly Gly Leu Ile Asp Leu Pro Ser Asp 1$ 6355 6360 6365 Ala Asp Arg Ala Ala Leu Asp Arg Met Thr Thr Val Leu Ala Gly Gly 6370 637!i 6380 Thr Gly Glu Asp Gln Val Ala Val Arg Ala Ser Gly Leu Leu Ala Arg Z~Arg Leu Val Arg Ala Ser Leu Pro Ala His Gly Thr Ala Ser Pro Trp Trp Gln Ala Asp Gly Thr Val Leu Val Thr Gly Ala Glu Glu Pro Ala Ala Ala Glu Ala Ala Arg Arg Leu Ala Arg Asp Gly Ala Gly His Leu Leu Leu His Thr Thr Pro Ser Gly Ser Glu Gly Ala Glu Gly Thr Ser 6450 645°.i 6460 Gly Ala Ala Glu Asp Ser Gly Leu Ala Gly Leu Val Ala Glu Leu Ala 3~Asp Leu Gly Ala Thr Ala Thr Val Val Thr Cys Asp Leu Thr Asp Ala Glu Ala Ala Ala Arg Leu Leu Ala Gly Val Ser Asp Ala His Pro Leu Ser Ala Val Leu His Leu Pro Pro Thr Val Asp Ser Glu Pro Leu Ala Ala Thr Asp Ala Asp Ala Leu Ala Arg Val Val Thr Ala Lys Ala Thr Ala Ala Leu His Leu Asp Arg Leu Leu Arg Glu Ala Ala Ala Ala Gly 4~Gly Arg Pro Pro Val Leu Val Leu Phe Ser Ser Val Ala Ala Ile Trp Gly Gly Ala Gly Gln Gly Al<i Tyr Ala Ala Gly Thr Ala Phe Leu Asp Ala Leu Ala Gly Gln His Arch Ala Asp Gly Pro Thr Val Thr Ser Val $ 6595 6600 6605 Ala Trp Ser Pro Trp Glu Gly Ser Arg Val Thr Glu Gly Ala Thr Gly 6610 661.5 6620 Glu Arg Leu Arg Arg Leu Gly Leu Arg Pro Leu Ala Pro Ala Thr Ala l~Leu Thr Ala Leu Asp Thr Alai Leu Gly His Gly Asp Thr Ala Val Thr Ile Ala Asp Val Asp Trp Ser Ser Phe Ala Pro Gly Phe Thr Thr Ala Arg Pro Gly Thr Leu Leu Ala. 
Asp Leu Pro Glu Ala Arg Arg Ala Leu Asp Glu Gln Gln Ser Thr Thr Ala Ala Asp Asp Thr Val Leu Ser Arg Glu Leu Gly Ala Leu Thr Gly Ala Glu Gln Gln Arg Arg Met Gln Glu Z~Leu Val Arg Glu His Leu Ala Val Val Leu Asn His Pro Ser Pro Glu Ala Val Asp Thr Gly Arg Ala Phe Arg Asp Leu Gly Phe Asp Ser Leu Thr Ala Val Glu Leu Arg Asn Arg Leu Lys Asn Ala Thr Gly Leu Ala Leu Pro Ala Thr Leu Val Phe Asp Tyr Pro Thr Pro Arg Thr Leu Ala Glu Phe Leu Leu Ala Glu Ile Leu Gly Glu Gln Ala Gly Ala Gly Glu 3~Gln Leu Pro Val Asp Gly Gly Val Asp Asp Glu Pro Val Ala Ile Val Gly Met Ala Cys Arg Leu Pro Gly Gly Val Ala Ser Pro Glu Asp Leu Trp Arg Leu Val Ala Gly Gly Glu Asp Ala Ile Ser Gly Phe Pro Gln 35 6835 x'840 6845 Asp Arg Gly Trp Asp Val Glu Gly Leu Tyr Asp Pro Asp Pro Asp Ala Ser Gly Arg Thr Tyr Cys Arg Ala Gly Gly Phe Leu Asp Glu Ala Gly 4~Glu Phe Asp Ala Asp Phe Phe Gly Ile Ser Pro Arg Glu Ala Leu Ala Met Asp Pro Gln Gln Arg Leu Leu Leu Glu Thr Ser Trp Glu Ala Val Glu Asp Ala Gly Ile Asp Pro Thr Ser Leu Gln Gly Gln Gln Val Gly $ 6915 6920 6925 Val Phe Ala Gly Thr Asn Gly Pro His Tyr Glu Pro Leu Leu Arg Asn 6930 69:35 6940 Thr Ala Glu Asp Leu Glu Gly Tyr Val Gly Thr Gly Asn Ala Ala Ser lOIle Met Ser Gly Arg Val Ser_ Tyr Thr Leu Gly Leu Glu Gly Pro Ala Val Thr Val Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu Ala Val Gln Ala Leu Arg Lys~ Gly Glu Cys Gly Leu Ala Leu Ala GIy Gly Val Thr Val Met Ser Thr Pro Thr Thr Phe Val Glu Phe Ser Arg 7010 707.5 7020 Gln Arg Gly Leu Ala Glu Asp Gly Arg Ser Lys Ala Phe Ala Ala Ser 20A1a Asp Gly Phe Gly Pro Ala Glu Gly Val Gly Met Leu Leu Val Glu Arg Leu Ser Asp Ala Arg Arch Asn Gly His Arg Val Leu Ala Val Val Arg Gly Ser Ala Val Asn Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala 25 7075 7oao 7oas Pro Asn Gly Pro Ser Gln Gln Arg Val Ile Arg Arg Ala Leu Ala Asp Ala Arg Leu Thr Thr Ala Asp. 
Val Asp Val Val Glu Ala His Gly Thr 30G1y Thr Arg Leu Gly Asp Pro Ile Glu Ala Gln Ala Leu Ile Ala Thr Tyr Gly Gln Gly Arg Asp Thr Glu Gln Pro Leu Arg Leu Gly Ser Leu Lys Ser Asn Ile Gly His Thr Gln Ala Ala Ala Gly Val Ser Gly Ile Ile Lys Met Val Gln Ala Met Arg His Gly Val Leu Pro Lys Thr Leu His Val Asp Arg Pro Ser Asp Gln Ile Asp Trp Ser Ala Gly Thr Val 40G1u Leu Leu Thr Glu Ala Met Asp Trp Pro Arg Lys Gln Glu Gly Gly Leu Arg Arg Ala Ala Val Se:r Ser Phe Gly Ile Ser Gly Thr Asn Ala His Ile Val Leu Glu Glu Ala Pro Val Asp Glu Asp Ala Pro Ala Asp $ 7235 7240 7245 Glu Pro Ser Val Gly Gly Va:L Val Pro Trp Leu Val Ser Ala Lys Thr 7250 72!i5 7260 Pro Ala Ala Leu Asp Ala Gln Ile Gly Arg Leu Ala Ala Phe Ala Ser lOGln Gly Arg Thr Asp Ala Ales Asp Pro Gly Ala Val Ala Arg Val Leu Ala Gly Gly Arg Ala Gln Phe Glu His Arg Ala Val Ala Leu Gly Thr Gly Gln Asp Asp Leu Ala Ala Ala Leu Ala Ala Pro Glu Gly Leu Val 1$ 7315 7320 7325 Arg Gly Val Ala Ser Gly Val. Gly Arg Val Ala Phe Val Phe Pro Gly Gln Gly Thr Gln Trp Ala Gly Met Gly Ala Glu Leu Leu Asp Val Ser Z~Lys Glu Phe Ala Ala Ala Met Ala Glu Cys Glu Ala Ala Leu Ala Pro Tyr Val Asp Trp Ser Leu Glu Ala Val Val Arg Gln Ala Pro Gly Ala Pro Thr Leu Glu Arg Val Asp Val Val Gln Pro Val Thr Phe Ala Val 2$ 7395 7400 7405 Met Val Ser Leu Ala Lys Val Trp Gln His His Gly Val Thr Pro Gln Ala Val Val Gly His Ser Gln Gly Glu Ile Ala Ala Ala Tyr Val Ala 3~Gly Ala Leu Ser Leu Asp Asp Ala Ala Arg Val Val Thr Leu Arg Ser Lys Ser Ile Gly Ala His Leu Ala Gly Gln Gly Gly Met Leu Ser Leu Ala Leu Ser Glu Ala Ala Val Val Glu Arg Leu Ala Gly Phe Asp Gly 3$ 7475 7480 7485 Leu Ser Val Ala Ala Val Asn Gly Pro Thr Ala Thr Val Val Ser Gly Asp Pro Thr Gln Ile Gln Glu Leu Ala Gln Ala Cys Glu Ala Asp Gly 4~Va1 Arg Ala Arg Ile Ile Pro Val Asp Tyr Ala Ser His Ser Ala His Val Glu Thr Ile Glu Ser Glu Leu Ala Asp Val Leu Ala Gly Leu Ser Pro Gln Thr Pro Gln Val Pro Phe Phe Ser Thr Leu Glu Gly Ala Trp $ 7555 7560 7565 Ile Thr Glu Pro Ala Leu Asp Gly Gly Tyr Trp Tyr Arg 
Asn Leu Arg His Arg Val Gly Phe Ala Pro Ala Val Glu Thr Leu Ala Thr Asp Glu lOGly Phe Thr His Phe Val Glu Val Ser Ala His Pro Val Leu Thr Met Ala Leu Pro Glu Thr Val Thr Gly Leu Gly Thr Leu Arg Arg Asp Asn Gly Gly Gln His Arg Leu Thr Thr Ser Leu Ala Glu Ala Trp Ala Asn 1$ 7635 7640 7645 Gly Leu Thr Val Asp Trp Ala Ser Leu Leu Pro Thr Thr Thr Thr His Pro Asp Leu Pro Thr Tyr Ala Phe Gln Thr Glu Arg Tyr Trp Pro Gln 20Pro Asp Leu Ser Ala Ala Gly Asp Ile Thr Ser Ala Gly Leu Gly Ala Ala Glu His Pro Leu Leu Gly Ala Ala Val Ala Leu Ala Asp Ser Asp Gly Cys Leu Leu Thr Gly Ser Leu Ser Leu Arg Thr His Pro Trp Leu Ala Asp His Ala Val Ala Gly Thr Val Leu Leu Pro Gly Thr Ala Phe Val Glu Leu Ala Phe Arg Ala Gly Asp Gln Val Gly Cys Asp Leu Val 30G1u Glu Leu Thr Leu Asp Ala Pro Leu Val Leu Pro Arg Arg Gly Ala Val Arg Val Gln Leu Ser Val Gly Ala Ser Asp Glu Ser Gly Arg Arg Thr Phe Gly Leu Tyr Ala His Pro Glu Asp Ala Pro Gly Glu Ala Glu 3$ 7795 7800 7805 Trp Thr Arg His Ala Thr Gly Val Leu Ala Ala Arg Ala Asp Arg Thr Ala Pro Val Ala Asp Pro Glu Ala Trp Pro Pro Pro Gly Ala Glu Pro 40Va1 Asp Val Asp Gly Leu Tyr Glu Arg Phe Ala Ala Asn Gly Tyr Gly Tyr Gly Pro Leu Phe Gln Gly Val Arg Gly Val Trp Arg Arg Gly Asp Glu Val Phe Ala Asp Val Ala Leu Pro Ala Glu Val Ala Gly Ala Glu $ 7875 7880 7885 Gly Ala Arg Phe Gly Leu His Pro Ala Leu Leu Asp Ala Ala Val Gln Ala Ala Gly Ala Gly Arg Gly Val Arg Arg Gly His Ala Ala Ala Val I~Arg Leu Glu Arg Asp Leu Leu Tyr Ala Val Gly Ala Thr Ala Leu Arg Val Arg Leu Ala Pro Ala Gly Pro Asp Thr Val Ser Val Ser Ala Ala Asp Ser Ser Gly Gln Pro Val Phe Ala Ala Asp Ser Leu Thr Val Leu i$ 7955 7960 7965 Pro Val Asp Pro Ala Gln Leu Ala Ala Phe Ser Asp Pro Thr Leu Asp Ala Leu His Leu Leu Glu Trp Thr Ala Trp Asp Gly Ala Ala Gln Ala 2ULeu Pro Gly Ala Val Val Leu Gly Gly Asp Ala Asp Gly Leu Ala Ala Ala Leu Arg Ala Gly Gly Thr Glu Val Leu Ser Phe Pro Asp Leu Thr Asp Leu Val Glu Ala Val Asp Arg Gly Glu Thr Pro Ala Pro Ala Thr Val Leu Val Ala Cys Pro Ala Ala Gly Pro Asp Gly 
Pro Glu His Val 8050 805°.i 8060 Arg Glu Ala Leu His Gly Ser Leu Ala Leu Met Gln Ala Trp Leu Ala 30Asp Glu Arg Phe Thr Asp Gly Arg Leu Val Leu Val Thr Arg Asp Ala Val Ala Ala Arg Ser Gly Asp Gly Leu Arg Ser Thr Gly Gln Ala Ala Val Trp Gly Leu Gly Arg Ser Ala Gln Thr Glu Ser Pro Gly Arg Phe 3$ 8115 8120 8125 Val Leu Leu Asp Leu Ala Gly Glu Ala Arg Thr A1a Gly Asp Ala Thr 8130 813Ei 8140 Ala Gly Asp Gly Leu Thr Thr Gly Asp Ala Thr Val Gly Gly Thr Ser 4~Gly Asp Ala Ala Leu Gly Ser Ala Leu Ala Thr A1a Leu Gly Ser Gly Glu Pro Gln Leu Ala Leu Arg Asp Gly Ala Leu Leu Val Pro Arg Leu Ala Arg Ala Ala Ala Pro Ala Ala Ala Asp Gly Leu Ala Ala Ala Asp $ 8195 8200 8205 Gly Leu Ala Ala Leu Pro Leu Pro Ala Ala Pro Ala Leu Trp Arg Leu Glu Pro Gly Thr Asp Gly Ser Leu Glu Ser Leu Thr Ala Ala Pro Gly I~Asp Ala Glu Thr Leu Ala Pro Glu Pro Leu Gly Pro Gly Gln Val Arg Ile Ala Ile Arg Ala Thr Gly Leu Asn Phe Arg Asp Val Leu Ile Ala Leu Gly Met Tyr Pro Asp Pro Ala Leu Met Gly Thr Glu Gly Ala Gly 1$ 8275 8280 8285 Val Val Thr Ala Thr Gly Pro Gly Val Thr His Leu Ala Pro Gly Asp 8290 829_°. 8300 Arg Val Met Gly Leu Leu Ser Gly Ala Tyr Ala Pro Val Val Val Ala 20Asp Ala Arg Thr Val Ala Arg Met Pro Glu Gly Trp Thr Phe Ala Gln Gly Ala Ser Val Pro Val Val Phe Leu Thr Ala Val Tyr Ala Leu Arg Asp Leu Ala Asp Val Lys Pro Gly Glu Arg Leu Leu Val His Ser Ala 2$ 8355 8360 8365 Ala Gly Gly Val Gly Met Ala Ala Val Gln Leu Ala Arg His Trp Gly 8370 837_°. 8380 Val Glu Val His Gly Thr Ala Ser His Gly Lys Trp Asp Ala Leu Arg 3~Ala Leu Gly Leu Asp Asp Ala His Ile Ala Ser Ser Arg Thr Leu Asp Phe Glu Ser Ala Phe Arg Ala Ala Ser Gly Gly Ala Gly Met Asp Val Val Leu Asn Ser Leu Ala Arg Glu Phe Val Asp Ala Ser Leu Arg Leu Leu Gly Pro Gly Gly Arg Phe Val Glu Met Gly Lys Thr Asp Val Arg Asp Ala Glu Arg Val Ala Ala Asp His Pro Gly Val Gly Tyr Arg Ala 4~Phe Asp Leu Gly Glu Ala Gly Pro Glu Arg Ile Gly Glu Met Leu Ala 8$

Glu Val Ile Ala Leu Phe Glu Asp Gly Val Leu Arg His Leu Pro Val Thr Thr Trp Asp Val Arg Arg Ala Arg Asp Ala Phe Arg His Val Ser $ 8515 8520 8525 Gln Ala Arg His Thr Gly Lys Val Val Leu Thr Met Pro Ser Gly Leu Asp Pro Glu Gly Thr Val Leu Leu Thr Gly Gly Thr Gly Ala Leu Gly lOGly Ile Val Ala Arg His Val Val Gly Glu Trp Gly Val Arg Arg Leu Leu Leu Val Ser Arg Arg Gly Thr Asp Ala Pro Gly Ala Gly Glu Leu Val His Glu Leu Glu Ala Leu Gly Ala Asp Val Ser Val Ala Ala Cys 1$ 8595 8600 8605 Asp Val Ala Asp Arg Glu Ala Leu Thr Ala Val Leu Asp Ser Ile Pro Ala Glu His Pro Leu Thr Ala Val Val His Thr Ala Gly Val Leu Ser Z~Asp Gly Thr Leu Pro Ser Met Thr Ala Glu Asp Val Glu His Val Leu Arg Pro Lys Val Asp Ala Ala Phe Leu Leu Asp Glu Leu Thr Ser Thr Pro Gly Tyr Asp Leu Ala Ala Phe Val Met Phe Ser Ser Ala Ala Ala 2$ 8675 8680 8685 Val Phe Gly Gly Ala Gly Gln Gly Ala Tyr Ala Ala Ala Asn Ala Thr Leu Asp Ala Leu Ala Trp Arg Arg Arg Thr Ala Gly Leu Pro Ala Leu 3USer Leu Gly Trp Gly Leu Trp Ala Glu Thr Ser Gly Met Thr Gly Gly Leu Ser Asp Thr Asp Arg Ser Arg Leu Ala Arg Ser Gly Ala Thr Pro Met Asp Ser Glu Leu Thr Leu Ser Leu Leu Asp Ala Ala Met Arg Arg 3$ 8755 8760 8765 Asp Asp Pro Ala Leu Val Pro Ile Ala Leu Asp Val Ala Ala Leu Arg Ala Gln Gln Arg Asp Gly Met Leu Ala Pro Leu Leu Ser Gly Leu Thr 4~Arg Gly Ser Arg Val Gly Gly Ala Pro Val Asn Gln Arg Arg Ala Ala Ala Gly Gly Ala Gly Glu Ala Asp Thr Asp Leu Gly Gly Arg Leu Ala Ala Met Thr Pro Asp Asp Arg Val Ala His Leu Arg Asp Leu Val Arg $ 8835 8840 8845 Thr His Val Ala Thr Val Leu Gly His Gly Thr Pro Ser Arg Val Asp Leu Glu Arg Ala Phe Arg Asp Thr Gly Phe Asp Ser Leu Thr Ala Val I~Glu Leu Arg Asn Arg Leu Asn Ala Ala Thr Gly Leu Arg Leu Pro Ala Thr Leu Val Phe Asp His Pro Thr Pro Gly Glu Leu Ala Gly His Leu Leu Asp Glu Leu Ala Thr Ala Ala Gly Gly Ser Trp Ala Glu Gly Thr 1$ 8915 8920 8925 Gly Ser Gly Asp Thr Ala Ser Ala Thr Asp Arg Gln Thr Thr Ala Ala Leu Ala Glu Leu Asp Arg Leu Glu Gly Val Leu Ala Ser Leu Ala Pro 2~Ala Ala Gly Gly Arg Pro 
Glu Leu Ala Ala Arg Leu Arg Ala Leu Ala Ala Ala Leu Gly Asp Asp Gly Asp Asp Ala Thr Asp Leu Asp Glu Ala Ser Asp Asp Asp Leu Phe Ser Phe Ile Asp Lys Glu Leu Gly Asp Ser 2$ 8995 9000 9005 Asp Phe Met Ala Asn Asn Glu Asp Lys Leu Arg Asp Tyr Leu Lys Arg 9010 901':5 9020 Val Thr Ala Glu Leu Gln Gln Asn Thr Arg Arg Leu Arg Glu Ile Glu 30G1y Arg Thr Hie Glu Pro Val Ala Ile Val Gly Met Ala Cys Arg Leu Pro Gly Gly Val Ala Ser Pro Glu Asp Leu Trp Gln Leu Val Ala Gly Asp Gly Asp Ala Ile Ser Glu Phe Pro Gln Asp Arg Gly Trp Asp Val 3$ 9075 9080 9085 Glu Gly Leu Tyr Asp Pro Asp Pro Asp Ala Ser Gly Arg Thr Tyr Cys 9090 909'.5 9100 Arg Ser Gly Gly Phe Leu His Asp Ala Gly Glu Phe Asp Ala Asp Phe 4~Phe Gly Ile Ser Pro Arg Glu Ala Leu Ala Met Asp Pro Gln Gln Arg Leu Ser Leu Thr Thr Ala Trp Glu Ala Ile Glu Ser Ala Gly Ile Asp Pro Thr Ala Leu Lys Gly Ser Gly Leu Gly Val Phe Val Gly Gly Trp $ 9155 9160 9165 His Thr Gly Tyr Thr Ser Gly Gln Thr Thr Ala Val Gln Ser Pro Glu 9170 91 i'5 9180 Leu Glu Gly His Leu Val Ser Gly Ala Ala Leu Gly Phe Leu Ser Gly I~Arg Ile Ala Tyr Val Leu Gly Thr Asp Gly Pro Ala Leu Thr Val Asp Thr Ala Cys Ser Ser Ser Leu~. Val Ala Leu His Leu Ala Val Gln Ala Leu Arg Lys Gly Glu Cys Asp Met Ala Leu Ala Gly Gly Val Thr Val 1$ 9235 9240 9245 Met Pro Asn Ala Asp Leu Phe. 
Val Gln Phe Ser Arg Gln Arg Gly Leu Ala Ala Asp Gly Arg Ser Lys Ala Phe Ala Thr Ser Ala Asp Gly Phe ZOGly Pro Ala Glu Gly Ala Gly Val Leu Leu Val Glu Arg Leu Ser Asp Ala Arg Arg Asn Gly His Arg Ile Leu Ala Val Val Arg Gly Ser Ala Val Asn Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala Pro His Gly Pro 2$ 9315 9320 9325 Ser Gln Gln Arg Val Ile Arg Arg Ala Leu Ala Asp Ala Arg Leu Ala Pro Gly Asp Val Asp Val Val Glu Ala His Gly Thr Gly Thr Arg Leu 30G1y Asp Pro Ile Glu Ala Gln Ala Leu Ile Ala Thr Tyr Gly Gln Glu Lys Ser Ser Glu Gln Pro Leu Arg Leu Gly Ala Leu Lys Ser Asn Ile Gly His Thr Gln Ala Ala Ala Gly Val Ala Gly Val Ile Lys Met Val 3$ 9395 9400 9405 Gln Ala Met Arg His Gly Leu Leu Pro Lys Thr Leu His Val Asp Glu Pro Ser Asp Gln Ile Asp Trp Ser Ala Gly Thr Val Glu Leu Leu Thr 40G1u Ala Val Asp Trp Pro Glu Lys Gln Asp Gly Gly Leu Arg Arg Ala Ala Val Ser Ser Phe Gly Ilea Ser Gly Thr Asn Ala His Val Val Leu Glu Glu Ala Pro Ala Val Glu. Asp Ser Pro Ala Val Glu Pro Pro Ala $ 9475 9480 9485 Gly Gly Gly Val Val Pro Trp~ Pro Val Ser Ala Lys Thr Pro Ala Ala , Leu Asp Ala Gln Ile Gly Gln. Leu Ala Ala Tyr Ala Asp Gly Arg Thr I~Asp Val Asp Pro Ala Val Ala Ala Arg Ala Leu Val Asp Ser Arg Thr Ala Met Glu His Arg Ala Val Ala Val Gly Asp Ser Arg Glu Ala Leu Arg Asp Ala Leu Arg Met Pro Glu Gly Leu Val Arg Gly Thr Ser Ser 1$ 9555 9560 9565 Asp Val Gly Arg Val Ala Phe Val Phe Pro Gly Gln Gly Thr Gln Trp Ala Gly Met Gly Ala Glu Leu Leu Asp Ser Ser Pro Glu Phe Ala Ala Z~Ser Met Ala Glu Cys Glu Thr Ala Leu Ser Arg Tyr Val Asp Trp Ser Leu Glu Ala Val Val Arg Gln Glu Pro Gly Ala Pro Thr Leu Asp Arg Val Asp Val Val Gln Pro Val Thr Phe Ala Val Met Val Ser Leu Ala Lys Val Trp Gln His His Gly Ile Thr Pro Gln Ala Val Val Gly His Ser Gln Gly Glu Ile Ala Ala Ala Tyr Val Ala Gly Ala Leu Thr Leu .

30Asp Asp Ala Ala Arg Val Val Thr Leu Arg Ser Lys Ser Ile Ala Ala His Leu Ala Gly Lys Gly Gly Met Ile Ser Leu Ala Leu Asp Glu Ala Ala Val Leu Lys Arg Leu Ser Asp Phe Asp Gly Leu Ser Val Ala Ala Val Asn Gly Pro Thr Ala Thr Val Val Ser Gly Asp Pro Thr Gln Ile Glu Glu Leu Ala Arg Thr Cys Glu Ala Asp Gly Val Arg Ala Arg Ile 4~Ile Pro Val Asp Tyr Ala Ser His Ser Arg Gln Val Glu Ile Ile Glu Lys Glu Leu Ala Glu Val Leu Ala Gly Leu Ala Pro Gln Ala Pro His Val Pro Phe Phe Ser Thr Leu Glu Gly Thr Trp Ile Thr Glu Pro Val $ 9795 9800 9805 Leu Asp Gly Thr Tyr Trp Tyr Arg Asn Leu Arg His Arg Val Gly Phe 9810 981!5 9820 Ala Pro Ala Val Glu Thr Leu Ala Val Asp Gly Phe Thr His Phe Ile l~Glu Val Ser Ala His Pro Val Leu Thr Met Thr Leu Pro Glu Thr Val Thr Gly Leu Gly Thr Leu Arg Arg Glu Gln Gly Gly Gln Glu Arg Leu Val Thr Ser Leu Ala Glu Ala Trp Ala Asn Gly Leu Thr Ile Asp Trp 1$ 9875 9880 9885 Ala Pro Ile Leu Pro Thr Ala Thr Gly His His Pro Glu Leu Pro Thr 9890 989!i 9900 Tyr Ala Phe Gln Thr Glu Arg Phe Trp Leu Gln Ser Ser Ala Pro Thr Z~Ser Ala Ala Asp Asp Trp Arg Tyr Arg Val Glu Trp Lys Pro Leu Thr Ala Ser Gly Gln Ala Asp Leu Ser Gly Arg Trp Ile Val Ala Val Gly Ser Glu Pro Glu Ala Glu Leu Leu Gly Ala Leu Lys Ala Ala Gly Ala 2$ 9955 9960 9965 Glu Val Asp Val Leu Glu Ala Gly Ala Asp Asp Asp Arg Glu Ala Leu Ala Ala Arg Leu Thr Ala Leu Thr Thr Gly Asp Gly Phe Thr Gly Val 3~Va1 Ser Leu Leu Asp Asp Leu Val Pro Gln Val A1a Trp Val Gln Ala Leu Gly Asp Ala Gly Ile Lys Ala Pro Leu Trp Ser Val Thr Gln Gly Ala Val Ser Val Gly Arg Leu Asp Thr Pro Ala Asp Pro Asp Arg Ala 3$ 10035 :L0040 10045 Met Leu Trp Gly Leu Gly Arg Val Val Ala Leu Glu His Pro Glu Arg 10050 1005!i 10060 1 Trp Ala Gly Leu Val Asp Leu Pro Ala Gln Pro Asp Ala Ala Ala Leu 4~Ala His Leu Val Thr Ala Leu Ser Gly Ala Thr Gly Glu Asp Gln Ile Ala Ile Arg Thr Thr Gly Leu His Ala Arg Arg Leu Ala Arg Ala Pro Leu His Gly Arg Arg Pro Thr Arg Asp Trp Gln Pro His Gly Thr Val $ 10115 '10120 10125 Leu Ile Thr Gly Gly Thr Gly Ala Leu Gly Ser 
His Ala Ala Arg Trp Met Ala His His Gly Ala Glu His Leu Leu Leu Val Ser Arg Ser Gly lOGlu Gln Ala Pro Gly Ala Thr Gln Leu Thr Ala Glu Leu Thr Ala Ser Gly Ala Arg Val Thr Ile Ala Ala Cys Asp Val Ala Asp Pro His Ala Met Arg Thr Leu Leu Asp Ala Ile Pro Ala Glu Thr Pro Leu Thr Ala 1$ 10195 :L0200 10205 Val Val His Thr Ala Gly Ala Pro Gly Gly Asp Pro Leu Asp Val Thr 10210 1021!i 10220 1 Gly Pro Glu Asp Ile Ala Arg Ile Leu Gly Ala Lys Thr Ser Gly Ala 20G1u Val Leu Asp Asp Leu Leu Arg Gly Thr Pro Leu Asp Ala Phe Val Leu Tyr Ser Ser Asn Ala Gly Val Trp Gly Ser Gly Ser Gln Gly Val Tyr Ala Ala Ala Asn Ala His Leu Asp Ala Leu Ala Ala Arg Arg Arg 2$ 10275 10280 10285 Ala Arg Gly Glu Thr Ala Thr Ser Val Ala Trp G1y Leu Trp Ala Gly 10290 1029°.i 10300 1 Asp Gly Met Gly Arg Gly Ala Asp Asp Ala Tyr Trp Gln Arg Arg Gly 30I1e Arg Pro Met Ser Pro Asp Arg Ala Leu Asp Glu Leu Ala Lys Ala Leu Ser Hie Asp Glu Thr Phe Val Ala Val Ala Asp Val Asp Trp Glu Arg Phe Ala Pro Ala Phe Thr Val Ser Arg Pro Ser Leu Leu Leu Asp 3$ 10355 10360 10365 Gly Val Pro Glu Ala Arg Gln Ala Leu Ala Ala Pro Val Gly Ala Pro Ala Pro Gly Asp Ala Ala Val Ala Pro Thr Gly G:Ln Ser Ser Ala Leu 4~Ala Ala Ile Thr Ala Leu Pro Glu Pro Glu Arg Arg Pro Ala Leu Leu Thr Leu Val Arg Thr His Ala. 
Ala Ala Val Leu Gly His Ser Ser Pro Asp Arg Val Ala Pro Gly Arg Ala Phe Thr Glu Leu Gly Phe Asp Ser $ 10435 10440 10445 Leu Thr Ala Val Gln Leu Arg Asn Gln Leu Ser Thr Val Val Gly Asn Arg Leu Pro Ala Thr Thr Val Phe Asp His Pro Thr Pro Ala Ala Leu l~Ala Ala His Leu His Glu Ala Tyr Leu Ala Pro Ala Glu Pro Ala Pro Thr Asp Trp Glu Gly Arg VaI Arg Arg Ala Leu Ala Glu Leu Pro Leu Asp Arg Leu Arg Asp Ala G1y Val Leu Asp Thr Val Leu Arg Leu Thr 1$ 10515 10520 10525 Gly Ile Glu Pro Glu Pro Gly Ser Gly Gly Ser Asp Gly Gly Ala Ala Asp Pro Gly Ala Glu Pro G1u Ala Ser Ile Asp Asp Leu Asp Ala Glu Z~Ala Leu Ile Arg Met Ala Leu Gly Pro Arg Asn Thr Met Thr Ser Ser Asn Glu Gln Leu Val Asp Ala Leu Arg Ala Ser Leu Lys Glu Asn Glu Glu Leu Arg Lys Glu Ser Arg Arg Arg Ala Asp Arg Arg Gln Glu Pro 2$ 10595 10600 10605 Met Ala Ile Val Gly Met Ser Cys Arg Phe Ala Gly Gly Ile Arg Ser 10610 1061.5 10620 1 Pro Glu Asp Leu Trp Asp Ala Val Ala Ala Gly Lys Asp Leu Val Ser 3~Glu Val Pro Glu Glu Arg Gly Trp Asp Ile Asp Ser Leu Tyr Asp Pro Val Pro Gly Arg Lys Gly Thr Thr Tyr Val Arg Asn Ala Ala Phe Leu Asp Asp Ala Ala Gly Phe Asp Ala Ala Phe Phe Gly Ile Ser Pro Arg 3$ 10675 10680 10685 Glu Ala Leu Ala Met Asp Pro Gln Gln Arg Gln Leu Leu Glu Ala Ser 10690 1069!5 10700 1 Trp Glu Val Phe Glu Arg Ala Giy Ile Asp Pro Ala Ser Val Arg Gly 40Thr Asp Val Gly Val Tyr Val Gly Cys Gly Tyr Gln Asp Tyr Ala Pro Asp Ile Arg Val Ala Pro Glu. 
Gly Thr Gly Gly Tyr Val Val Thr Gly Asn Ser Ser Ala Val Ala Ser Gly Arg Ile Ala Tyr Ser Leu Gly Leu $ 10755 10760 10765 Glu Gly Pro Ala Val Thr Val Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu Ala Leu Lys Giy Leu Arg Asn Gly Asp Cys Ser Thr lOAla Leu Val Gly Gly Val Ala Val Leu Ala Thr Pro Gly Ala Phe Ile Glu Phe Ser Ser Gln Gln Ala Met Ala Ala Asp Gly Arg Thr Lys Gly Phe Ala Ser Ala Ala Asp G1y Leu Ala Trp Gly Glu Gly Val Ala Val Leu Leu Leu Glu Arg Leu Ser Asp Ala Arg Arg Lys Gly His Arg Val Leu Ala Val Val Arg Gly Ser Ala Ile Asn Gln Asp Gly Ala Ser Asn 20G1y Leu Thr Ala Pro His Gly Pro Ser Gln Gln His Leu Ile Arg Gln Ala Leu Ala Asp Ala Arg Leu Thr Ser Ser Asp Val Asp Val Val Glu Gly His Gly Thr Gly Thr Arg Leu Gly Asp Pro Ile Glu Ala Gln Ala 2$ 10915 10920 10925 Leu Leu Ala Thr Tyr Gly Gln Gly Arg Ala Pro Gly Gln Pro Leu Arg Leu Gly Thr Leu Lye Ser Asn Ile Gly His Thr Gln Ala Ala Ser Gly 3~Va1 Ala Gly Val Ile Lys Met Val Gln Ala Leu Arg His Gly Val Leu Pro Lys Thr Leu His Val Asp Glu Pro Thr Asp Gln Val Asp Trp Ser Ala Gly Ser Val Glu Leu Leu Thr Glu Ala Val Asp Trp Pro Glu Arg 3$ 10995 11000 11005 Pro Gly Arg Leu Arg Arg Ala Gly Val Ser Ala Phe Gly Val Gly Gly Thr Asn Ala His Val Val Leu Glu Glu Ala Pro Ala Val Glu Glu Ser 4~Pro Ala Val Glu Pro Pro Ala Gly Gly Gly Val Val Pro Trp Pro Val Ser Ala Lys Thr Ser Ala Ala Leu Asp Ala Gln Ile Gly Gln Leu Ala Ala Tyr Ala Glu Asp Arg Thr Asp Val Asp Pro Ala Val Ala Ala Arg $ 11075 11080 11085 Ala Leu Val Asp Ser Arg Thr Ala Met Glu His Arg Ala Val Ala Val Gly Asp Ser Arg Glu Ala Leu Arg Asp Ala Leu Arg Met Pro Glu Gly lOLeu Val Arg Gly Thr Val Thr Asp Pro Gly Arg Val Ala Phe Val Phe Pro Gly Gln Gly Thr Gln Trp Ala Gly Met Gly Ala Glu Leu Leu Asp Ser Ser Pro Glu Phe Ala Ala Ala Met Ala Glu Cys Glu Thr Ala Leu 1$ 11155 11160 11165 Ser Pro Tyr Val Asp Trp Ser Leu Glu Ala Val Val Arg Gln Ala Pro Ser Ala Pro Thr Leu Asp Arg Val Asp Val Val Gln Pro Val Thr Phe Z~Ala Val Met Val Ser Leu Ala Lys Val Trp Gln His His Gly 
Ile Thr Pro Glu Ala Val Ile Gly His Ser Gln Gly Glu Ile Ala Ala Ala Tyr Val Ala Gly Ala Leu Thr Leu Asp Asp Ala Ala Arg Val Val Thr Leu 2$ 11235 11240 11245 Arg Ser Lys Ser Ile Ala Ala His Leu Ala Gly Lys Gly Gly Met Ile Ser Leu Ala Leu Ser Glu Glu Ala Thr Arg Gln Arg Ile Glu Asn Leu 30His Gly Leu Ser Ile Ala Ala Val Asn Gly Pro Thr Ala Thr Val Val Ser Gly Asp Pro Thr Gln Ile Gln Glu Leu Ala Gln Ala Cys Glu Ala Asp Gly Ile Arg Ala Arg Ile Ile Pro Val Asp Tyr Ala Ser His Ser Ala His Val Glu Thr Ile Glu Asn Glu Leu Ala Asp Val Leu Ala Gly Leu Ser Pro Gln Thr Pro Gln Val Pro Phe Phe Ser Thr Leu Glu Gly 4~Thr Trp Ile Thr Glu Pro Ala Leu Asp Gly Gly Tyr Trp Tyr Arg Asn Leu Arg His Arg Val Gly Phe Ala Pro Ala Val Glu Thr Leu Ala Thr Asp Glu Gly Phe Thr His Phe Ile Glu Val Ser Ala His Pro Val Leu $ 11395 :L1400 11405 Thr Met Thr Leu Pro Asp Lys Val Thr Gly Leu Ala Thr Leu Arg Arg , 11410 1141!i 114 2 0 1 Glu Asp Gly Gly Gln His Arg Leu Thr Thr Ser Leu Ala Glu Ala Trp I~Ala Asn Gly Leu Ala Leu Asp Trp Ala Ser Leu Leu Pro Ala Thr Gly Ala Leu Ser Pro Ala Val Pro Asp Leu Pro Thr Tyr Ala Phe Gln His Arg Ser Tyr Trp Ile Ser Pro Ala Gly Pro Gly Glu Ala Pro Ala His 1$ 11475 11480 11485 Thr Ala Ser Gly Arg Glu Ala Val Ala Glu Thr Gly Leu Ala Trp Gly Pro Gly Ala Glu Asp Leu Asp Glu Glu Gly Arg Arg Ser Ala Val Leu 2~Ala Met Val Met Arg Gln Ala Ala Ser Val Leu Arg Cys Asp Ser Pro Glu Glu Val Pro Val Asp Arg Pro Leu Arg Glu Ile Gly Phe Asp Ser Leu Thr Ala Val Asp Phe Arg Asn Arg Val Asn Arg Leu Thr Gly Leu 2$ 11555 7.1560 11565 Gln Leu Pro Pro Thr Val Val Phe Gln His Pro Thr Pro Val Ala Leu Ala Glu Arg Ile Ser Asp Glu Leu Ala Glu Arg Asn Trp Ala Val Ala 3UGlu Pro Ser Asp His Glu Gln Ala Glu Glu Glu Lys Ala Ala Ala Pro Ala Gly Ala Arg Ser Gly Ala Asp Thr Gly Ala Gly Ala Gly Met Phe Arg Ala Leu Phe Arg Gln Ala Val Glu Asp Asp Arg Tyr Gly Glu Phe 3$ 11635 7.1640 11645 Leu Asp Val Leu Ala Glu Ala Ser Ala Phe Arg Pro Gln Phe Ala Ser 11650 1165°_. 
11660 1 Pro Glu Ala Cys Ser Glu Arg Leu Asp Pro Val Leu Leu Ala Gly Gly 4dPro Thr Asp Arg Ala Glu Gly Arg Ala Val Leu Val Gly Cys Thr Gly Thr Ala Ala Asn Gly Gly Pro H:is Glu Phe Leu Arg Leu Ser Thr Ser Phe Gln Glu Glu Arg Asp Phe Leu Ala Val Pro Leu Pro Gly Tyr Gly 11715 1.1'720 11725 Thr Gly Thr Gly Thr Gly Thr A:La Leu Leu Pro Ala Asp Leu Asp Thr , Ala Leu Asp Ala Gln Ala Arg A:La Ile Leu Arg Ala Ala Gly Asp Ala ~~~Pro Val Val Leu Leu Gly His Se:r Gly Gly Ala Leu Leu Ala His Glu Leu Ala Phe Arg Leu Glu Arg A:La His Gly Ala Pro Pro Ala Gly Ile Val Leu Val Asp Pro Tyr Pro Pro Gly His Gln Glu Pro Ile Glu Val Trp Ser Arg Gln Leu Gly Glu Gly Leu Phe Ala Gly Glu Leu Glu Pro Met Ser Asp Ala Arg Leu Leu Al.a Met Gly Arg Tyr Ala Arg Phe Leu ~',OAla Gly Pro Arg Pro Gly Arg Seer Ser Ala Pro Val Leu Leu Val Arg Ala Ser Glu Pro Leu Gly Asp Trp Gln Glu Glu Arg Gly Asp Trp Arg Ala His Trp Asp Leu Pro His Thr Val Ala Asp Val Pro Gly Asp His s S 11875 11ft80 11885 Phe Thr Met Met Arg Asp His A7.a Pro Ala Val Ala Glu Ala Val Leu Ser Trp Leu Asp Ala Ile Glu Gl.y Ile Glu Gly Ala Gly Lys Met Thr ~i~Asp Arg Pro Leu Asn Val Asp Se:r Gly Leu Trp Ile Arg Arg Phe His Pro Ala Pro Asn Ser Ala Val Arg Leu Val Cys Leu Pro His Ala Gly Gly Ser Ala Ser Tyr Phe Phe Arg Phe Ser Glu Glu Leu His Pro Ser .C$ 11955 11960 11965 Val Glu Ala Leu Ser Val Gln Tyr Pro Gly Arg Gln Asp Arg Arg Ala Glu Pro Cys Leu Glu Ser Val G7.u Glu Leu Ala Glu His Val Val Ala ~~~Ala Thr Glu Pro Trp Trp Gln Gl.u Gly Arg Leu Ala Phe Phe Gly His Ser Leu Gly Ala Ser Val Ala Phe Glu Thr Ala Arg Ile Leu Glu Gln Arg His Gly Val Arg Pro Glu Gl.y Leu Tyr Val Ser Gly Arg Arg Ala $ 12035 12040 12045 Pro Ser Leu Ala Pro Asp Arg Le:u Val His Gln Leu Asp Asp Arg Ala Phe Leu Ala Glu Ile Arg Arg Le~u Ser Gly Thr Asp Glu Arg Phe Leu I~Gln Asp Asp Glu Leu Leu Arg Leu Val Leu Pro Ala Leu Arg Ser Asp Tyr Lys Ala Ala Glu Thr Tyr Leu His Arg Pro Ser Ala Lys Leu Thr Cys Pro Val Met Ala Leu Ala Gly Asp Arg Asp Pro Lys Ala Pro Leu 1$ 12115 12120 
12125
Asn Glu Val Ala Glu Trp Arg Arg His Thr Ser Gly Pro Phe Cys Leu
Arg Ala Tyr Ser Gly Gly His Phe Tyr Leu Asn Asp Gln Trp His Glu
Ile Cys Asn Asp Ile Ser Asp His Leu Leu Val Thr Arg Gly Ala Pro
Asp Ala Arg Val Val Gln Pro Pro Thr Ser Leu Ile Glu Gly Ala Ala
Lys Arg Trp Gln Asn Pro Arg
12195
<210> 7
<211> 1248
<212> DNA
<213> Streptomyces venezuelae
<400> 7
gtgaaaagcg ccttatccga cctcgcattc ttcggcggcc ccgccgcttt cgaccagccg 60
ctcctcgtgg ggcggcccaa ccgcatcgac cgcgccaggc tgtacgagcg gctcgaccgg 120
gccctcgaca gccagtggct gtccaacggc ggcccgctcg tccgcgagtt cgaggagcgc 180
gtcgccgggc tcgccggggt ccggcatgcc gtggccacct gcaacgccac ggccgggctc 240
cagctcctcg cgcacgccgc cggcctcacc ggcgaagtga tcatgccgtc gatgacgttc 300
gccgccaccc cgcacgcact gcgctggatc ggcctcaccc cggtcttcgc cgacatcgac 360
ccggacaccg gcaacctcga cccggaccag gtggccgccg cggtcacacc ccgcacctcg 420
gccgtcgtcg gcgtccacct ctggggccgc ccctgcgccg ccgaccagct gcggaaggtc 480
gccgacgagc acggcctgcg gctgtacttc gacgccgcgc acgccctcgg ctgcgcggtc 540
gacggccggc ccgccggcag cctcggcgac gccgaggtct tcagcttcca cgccaccaag 600
gccgtcaacg ccttcgaggg cggcgccgtc gtcaccgacg acgccgacct cgccgcccgg 660
atccgcgccc tccacaactt cggcttcgac ctgcccggcg gcagccccgc cggcgggacc 720
aacgccaaga tgagcgaggc cgccgccgcc atgggcctca cctccctcga cgcgtttccc 780
gaggtcatcg accggaaccg gcgcaaccac gccgcctacc gcgagcacct cgcggacctc 840
cccggcgtcc tcgtcgccga ccacgaccgc cacggcctca acaaccacca gtacgtgatc 900
gtcgagatcg acgaggccac caccggr_atc caccgcgacc tcgtcatgga ggtcctgaag 960
gccgaaggcg tgcacacccg cgcctacttc tcgccgggct gccacgagct ggagccgtac 1020
cgcgggcagc cgcacgcccc gctgccgcac accgaacgcc tcgccgcgcg cgtgctgtcc 1080
ctgccgaccg gcaccgccat cggcgacgac gacatccgcc gggtcgccga cctgctgcgt 1140
ctctgcgcga cccgcggccg cgaactgacc gcgcgccacc gcgacacggc ccccgccccg 1200
ctcgcggccc cccagacatc cacgcccacg attggacgct cccgatga 1248
<210> 8
<211> 415
<212> PRT
<213> Streptomyces venezuelae
<400> 8
Met Lys Ser Ala Leu Ser Asp Leu Ala Phe Phe Gly Gly Pro Ala Ala
1 5 10 15
Phe Asp Gln Pro Leu Leu Val Gly Arg Pro Asn Arg Ile Asp Arg Ala
Arg Leu Tyr Glu Arg Leu Asp Arg Ala Leu Asp Ser Gln Trp Leu Ser
Asn Gly Gly Pro Leu Val Arg Glu Phe Glu Glu Arg Val Ala Gly Leu
Ala Gly Val Arg His Ala Val Ala Thr Cys Asn Ala Thr Ala Gly Leu
Gln Leu Leu Ala His Ala Ala Gly Leu Thr Gly Glu Val Ile Met Pro
85 90 95
Ser Met Thr Phe Ala Ala Thr Pro His Ala Leu Arg Trp Ile Gly Leu
Thr Pro Val Phe Ala Asp Ile Asp Pro Asp Thr Gly Asn Leu Asp Pro
Asp Gln Val Ala Ala Ala Val Thr Pro Arg Thr Ser Ala Val Val Gly
Val His Leu Trp Gly Arg Pro Cys Ala Ala Asp Gln Leu Arg Lys Val
Ala Asp Glu His Gly Leu Arg Leu Tyr Phe Asp Ala Ala His Ala Leu
Gly Cys Ala Val Asp Gly Arg Pro Ala Gly Ser Leu Gly Asp Ala Glu
Val Phe Ser Phe His Ala Thr Lys Ala Val Asn Ala Phe Glu Gly Gly
Ala Val Val Thr Asp Asp Ala Asp Leu Ala Ala Arg Ile Arg Ala Leu
His Asn Phe Gly Phe Asp Leu Pro Gly Gly Ser Pro Ala Gly Gly Thr
Asn Ala Lys Met Ser Glu Ala Ala Ala Ala Met Gly Leu Thr Ser Leu
245 250 255
Asp Ala Phe Pro Glu Val Ile Asp Arg Asn Arg Arg Asn His Ala Ala
Tyr Arg Glu His Leu Ala Asp Leu Pro Gly Val Leu Val Ala Asp His
Asp Arg His Gly Leu Asn Asn His Gln Tyr Val Ile Val Glu Ile Asp
Glu Ala Thr Thr Gly Ile His Arg Asp Leu Val Met Glu Val Leu Lys
Ala Glu Gly Val His Thr Arg Ala Tyr Phe Ser Pro Gly Cys His Glu
325 330 335
Leu Glu Pro Tyr Arg Gly Gln Pro His Ala Pro Leu Pro His Thr Glu
Arg Leu Ala Ala Arg Val Leu Ser Leu Pro Thr Gly Thr Ala Ile Gly
355 360 365
Asp Asp Asp Ile Arg Arg Val Ala Asp Leu Leu Arg Leu Cys Ala Thr
Arg Gly Arg Glu Leu Thr Ala Arg His Arg Asp Thr Ala Pro Ala Pro
Leu Ala Ala Pro Gln Thr Ser Thr Pro Thr Ile Gly Arg Ser Arg
<210> 9
<211> 1458
<212> DNA
4(I <213> Streptomyces vene2,uelae <400> 9 atgaccgcccccgccctttccgccaccgccccggccgaacgctgcgcgcaccccggagcc60 gatctgggggcggcggtccacgccgtcggccagaccctcgccgccggcggcctcgtgccg120 cccgacgaggccggaacgaccgcccgccacctcgtccggctcgccgtgcgctacggcaac180 agccccttcaccccgctggaggaggcccgccacgacctgggcgtcgaccgggacgccttc240 cggcgcctcctcgccctgttcgggcac3gtcccggagctccgcaccgcggtgagaccggc 300 c cccgccggggcgtactggaagaacaccctgctcccgctcgaacagcgcggcgtcttcgac360 gcggcgctcgccaggaagcccgtcttcccgtacagcgtcggcctctaccccggcccgacc420 tgcatgttccgctgccacttctgcgtc:cgtgtgaccggcgcccgctacgacccgtccgcc480 10ctcgacgccggcaacgccatgttccgc~tcggtcatcgacgagatacccgcggcaacccc 540 g tcggcgatgtacttctccggcggcctc~gagccgctcaccaaccccggcctgggagcctg 600 c gccgcgcacgccaccgaccacggcctqcggcccaccgtctacacgaactcttcgcgctc 660 c accgagcgcaccctggagcgccagccc:ggcctctggggcctgcacgccatccgcacctcg720 ctctacggcctcaacgacgaggagtac:gagcagaccaccggcaagaaggccgccttccgc780 15cgcgtccgcgagaacctgcgccgcttc:cagcagctgcgcgccgagcgcgagtcgccgatc840 aacctcggcttcgcctacatcgtgctcccgggccgtgcctcccgcctgctcgacctggtc900 gacttcatcgccgacctcaacgacgccgggcagggcaggacgatcgacttcgtcaacatt960 cgcgaggactacagcggccgtgacgacggcaagctgccgcaggaggagcgggccgagctc1020 caggaggccctcaacgccttcgaggagcgggtccgcgagcgcacccccggactccacatcx080 20gactacggctacgccctgaacagcctgcgcaccggggccgacgccgaactgctgcggatc1140 aagcccgccaccatgcggcccaccgcgcacccgcaggtcgcggtgcaggtcgatctcctc1200 ggcgacgtgtacctgtaccgcgaggccggcttccccgacctggacggcgcgacccgctac1260 atcgcgggccgcgtgacccccgacacctccctcaccgaggtcgtcagggacttcgtcgag1320 cgcggcggcgaggtggcggccgtcgac~ggcgacgagtacttcatggacggcttcgatcag7.380 2:5gtcgtcaccgcccgcctgaaccagctg~gagcgcgacgccgcggacggctgggaggaggcc1.440 cgcggcttcctgcgctga 1.458 <210> 10 <211> 485 30 <212> PRT
<213> Streptomyces venezuelae
<400> 10
Met Thr Ala Pro Ala Leu Ser Ala Thr Ala Pro Ala Glu Arg Cys Ala 1 5 10 15 His Pro Gly Ala Asp Leu Gly Ala Ala Val His Ala Val Gly Gln Thr Leu Ala Ala Gly Gly Leu Val Pro Pro Asp Glu Ala Gly Thr Thr Ala Arg His Leu Val Arg Leu Ala Val Arg Tyr Gly Asn Ser Pro Phe Thr Pro Leu Glu Glu Ala Arg His Asp Leu Gly Val Asp Arg Asp Ala Phe Arg Arg Leu Leu Ala Leu Phe Gly Gln Val Pro Glu Leu Arg Thr Ala 85 90 95 Val Glu Thr Gly Pro Ala Gly Ala Tyr Trp Lys Asn Thr Leu Leu Pro Leu Glu Gln Arg Gly Val Phe Asp Ala Ala Leu Ala Arg Lys Pro Val Phe Pro Tyr Ser Val Gly Leu Tyr Pro Gly Pro Thr Cys Met Phe Arg Cys His Phe Cys Val Arg Val Thr Gly Ala Arg Tyr Asp Pro Ser Ala Leu Asp Ala Gly Asn Ala Met Phe Arg Ser Val Ile Asp Glu Ile Pro 165 170 175 Ala Gly Asn Pro Ser Ala Met Tyr Phe Ser Gly Gly Leu Glu Pro Leu Thr Asn Pro Gly Leu Gly Ser Leu Ala Ala His Ala Thr Asp His Gly Leu Arg Pro Thr Val Tyr Thr Asn Ser Phe Ala Leu Thr Glu Arg Thr Leu Glu Arg Gln Pro Gly Leu Trp Gly Leu His Ala Ile Arg Thr Ser Leu Tyr Gly Leu Asn Asp Glu Glu Tyr Glu Gln Thr Thr Gly Lys Lys 245 250 255 Ala Ala Phe Arg Arg Val Arg Glu Asn Leu Arg Arg Phe Gln Gln Leu Arg Ala Glu Arg Glu Ser Pro Ile Asn Leu Gly Phe Ala Tyr Ile Val Leu Pro Gly Arg Ala Ser Arg Leu Leu Asp Leu Val Asp Phe Ile Ala Asp Leu Asn Asp Ala Gly Gln Gly Arg Thr Ile Asp Phe Val Asn Ile Arg Glu Asp Tyr Ser Gly Arg Asp Asp Gly Lys Leu Pro Gln Glu Glu 325 330 335 Arg Ala Glu Leu Gln Glu Ala Leu Asn Ala Phe Glu Glu Arg Val Arg Glu Arg Thr Pro Gly Leu His Ile Asp Tyr Gly Tyr Ala Leu Asn Ser Leu Arg Thr Gly Ala Asp Ala Glu Leu Leu Arg Ile Lys Pro Ala Thr Met Arg Pro Thr Ala His Pro Gln Val Ala Val Gln Val Asp Leu Leu Gly Asp Val Tyr Leu Tyr Arg Glu Ala Gly Phe Pro Asp Leu Asp Gly 405 410 415 Ala Thr Arg Tyr Ile Ala Gly Arg Val Thr Pro Asp Thr Ser Leu Thr Glu Val Val Arg Asp Phe Val Glu Arg Gly Gly Glu Val Ala Ala Val 435 440 445 Asp Gly Asp Glu Tyr Phe Met Asp Gly Phe
Asp Gln Val Val Thr Ala Arg Leu Asn Gln Leu Glu Arg Asp Ala Ala Asp Gly Trp Glu Glu Ala Arg Gly Phe Leu Arg 485
<210> 11
<211> 879
<212> DNA
20' <213> Streptomyces venez;uelae <400> 11 atgaagggaa tagtcctggccggcgggagcggaactcggctgcatccggcgacctcggtc60 atttcgaagc agattcttccggtctacaacaaaccgatgatctactatccgctgtcggtt120 25 ctcatgctcggcggtattcgcgagattcaaatcatctcgaccccccagcacatcgaactc:180 ttccagtcgc ttctcggaaacggcaggcacctgggaatagaactcgactatgcggtccag240 aaagagcccg caggaatcgcggacgcacttctcgtcggagccgagcacatcggcgacgac300 acctgcgccc tgatcctgggcgacaacatcttccacgggcccggcctctacacgctcctg360 cgggacagca tcgcgcgcctcgacggctgcgtgctcttcggctacccggtcaaggacccc420 30 gagcggtacggcgtcgccgaggtggacgcgacgggccggctgaccgacctcgtcgagaag480 cccgtcaagc cgcgctccaacctcgccgtcaccggcctctacctctacgacaacgacgtc540 gtcgacatcg ccaagaacatccggccctcgccgcgcggcgagctggagatcaccgacgtc600 aaccgcgtct acctggagcggggccgggccgaactcgtcaacctgggccgcggcttcgccE~60 tggctggaca ccggcacccacgactcgctcctgcgggccgcccagtacgtccaggtcctg720 35 gaggagcggcagggcgtctggatcgcgggccttgaggagatcgccttccgcatgggcttc780 atcgacgccg aggcctgtcacggcctgggagaaggcctctcccgcaccgagtacggcagc840 tatctgatgg agatcgccggccgcgagggagccccgtga 879 <210> 12 <211> 292 <212> PRT
<213> Streptomyces venezuelae
<400> 12
Met Lys Gly Ile Val Leu Ala Gly Gly Ser Gly Thr Arg Leu His Pro Ala Thr Ser Val Ile Ser Lys Gln Ile Leu Pro Val Tyr Asn Lys Pro Met Ile Tyr Tyr Pro Leu Ser Val Leu Met Leu Gly Gly Ile Arg Glu Ile Gln Ile Ile Ser Thr Pro Gln His Ile Glu Leu Phe Gln Ser Leu Leu Gly Asn Gly Arg His Leu Gly Ile Glu Leu Asp Tyr Ala Val Gln Lys Glu Pro Ala Gly Ile Ala Asp Ala Leu Leu Val Gly Ala Glu His Ile Gly Asp Asp Thr Cys Ala Leu Ile Leu Gly Asp Asn Ile Phe His Gly Pro Gly Leu Tyr Thr Leu Leu Arg Asp Ser Ile Ala Arg Leu Asp 115 120 125 Gly Cys Val Leu Phe Gly Tyr Pro Val Lys Asp Pro Glu Arg Tyr Gly Val Ala Glu Val Asp Ala Thr Gly Arg Leu Thr Asp Leu Val Glu Lys Pro Val Lys Pro Arg Ser Asn Leu Ala Val Thr Gly Leu Tyr Leu Tyr Asp Asn Asp Val Val Asp Ile Ala Lys Asn Ile Arg Pro Ser Pro Arg Gly Glu Leu Glu Ile Thr Asp Val Asn Arg Val Tyr Leu Glu Arg Gly Arg Ala Glu Leu Val Asn Leu Gly Arg Gly Phe Ala Trp Leu Asp Thr Gly Thr His Asp Ser Leu Leu Arg Ala Ala Gln Tyr Val Gln Val Leu Glu Glu Arg Gln Gly Val Trp Ile Ala Gly Leu Glu Glu Ile Ala Phe Arg Met Gly Phe Ile Asp Ala Glu Ala Cys His Gly Leu Gly Glu Gly Leu Ser Arg Thr Glu Tyr Gly Ser Tyr Leu Met Glu Ile Ala Gly Arg Glu Gly Ala Pro
<210>

<211> 1014
<212> DNA

<213 > Streptomyces venezuelae <400 > 13 10gtgcggcttctggtgaccggaggtgcgggcttcatcggctcgcacttcgtgcggcagctc60 ctcgccggggcgtaccccgacgtgcccgccgatgaggtgatcgtcctggaagcctcacc 120 c tacgcgggcaaccgcgccaacctcgccccggtggacgcggacccgcgactgcgcttcgtc180 cacggcgacatccgcgacgccggcctcctcgcccgggaactgcgcggcgtggacgccatc240 gtccacttcgcggccgagagccacgt<~gaccgctccatcgcgggcgcgtcgtgttcacc 300 c )',5gagaccaacgtgcagggcacgcagacc3ctgctccagtgcgccgtcgacgccggcgtcggc360 cgggtcgtgcacgtctccaccgacgac~gtgtacgggtcgatcgactccggtcctggacc 420 c gagagcagcccgctggagcccaactcc~ccctacgcggcgtccaaggccggtccgacctc 480 c gttgcccgcgcctaccaccggacgtac:ggcctcgacgtacggatcacccgctgctgcaac540 aactacgggccgtaccagcaccccgac~aagctcatccccctcttcgtgacgaacctcctc600 20gacggcgggacgctcccgctgtacggc:gacggcgcgaacgtccgcgagtgggtgcacacc660 gacgaccactgccggggcatcgcgctc:gtcctcgcgggcggccgggccggcgagatctac720 cacatcggcggcggcctggagctgaccaaccgcgaactcaccggcatcctcctggactcg780 ctcggcgccgactggtcctcggtccgc~aaggtcgccgaccgcaagggccacgacctgcgc840 tactccctcgacggcggcgagatcgac~cgcgagctcggctaccgcccgcagtctccttc 900 g 2.5gcggacggcctcgcgcggaccgtccgctggtaccgggagaaccgcggctggtgggagccg960 ctcaaggcgaccgccccgcagctgcccgccaccgccgtggaggtgtccgcgtga 1014 <210> 14 <211> 337 30 <212> PRT
<213> Streptomyces venezuelae
<400> 14
Met Arg Leu Leu Val Thr Gly Gly Ala Gly Phe Ile Gly Ser His Phe Val Arg Gln Leu Leu Ala Gly Ala Tyr Pro Asp Val Pro Ala Asp Glu Val Ile Val Leu Asp Ser Leu Thr Tyr Ala Gly Asn Arg Ala Asn Leu Ala Pro Val Asp Ala Asp Pro Arg Leu Arg Phe Val His Gly Asp Ile Arg Asp Ala Gly Leu Leu Ala Arg Glu Leu Arg Gly Val Asp Ala Ile Val His Phe Ala Ala Glu Ser His Val Asp Arg Ser Ile Ala Gly Ala 85 90 95 Ser Val Phe Thr Glu Thr Asn Val Gln Gly Thr Gln Thr Leu Leu Gln Cys Ala Val Asp Ala Gly Val Gly Arg Val Val His Val Ser Thr Asp Glu Val Tyr Gly Ser Ile Asp Ser Gly Ser Trp Thr Glu Ser Ser Pro Leu Glu Pro Asn Ser Pro Tyr Ala Ala Ser Lys Ala Gly Ser Asp Leu Val Ala Arg Ala Tyr His Arg Thr Tyr Gly Leu Asp Val Arg Ile Thr 165 170 175 Arg Cys Cys Asn Asn Tyr Gly Pro Tyr Gln His Pro Glu Lys Leu Ile Pro Leu Phe Val Thr Asn Leu Leu Asp Gly Gly Thr Leu Pro Leu Tyr Gly Asp Gly Ala Asn Val Arg Glu Trp Val His Thr Asp Asp His Cys Arg Gly Ile Ala Leu Val Leu Ala Gly Gly Arg Ala Gly Glu Ile Tyr His Ile Gly Gly Gly Leu Glu Leu Thr Asn Arg Glu Leu Thr Gly Ile 245 250 255 Leu Leu Asp Ser Leu Gly Ala Asp Trp Ser Ser Val Arg Lys Val Ala Asp Arg Lys Gly His Asp Leu Arg Tyr Ser Leu Asp Gly Gly Glu Ile Glu Arg Glu Leu Gly Tyr Arg Pro Gln Val Ser Phe Ala Asp Gly Leu Ala Arg Thr Val Arg Trp Tyr Arg Glu Asn Arg Gly Trp Trp Glu Pro Leu Lys Ala Thr Ala Pro Gln Leu Pro Ala Thr Ala Val Glu Val Ser 325 330 335 Ala
<210> 15
<211> 1140
<212> DNA
<213> Streptomyces venezuelae <400> 15 $ gtgagcagccgcgccgagaccccccg~cgtccccttcctcgacctcaaggccgcctacgag60 gagctccgcgcggagaccgacgccgcgatcgcccgcgtcctcgactcggggcgctacctc120 L, ctcggacccgaactcgaaggattcgaggcggagttcgccgcgtactgcgagacggaccac180 gccgtcggcgtgaacagcgggatggacgccctccagctcgccctccgcggcctcggcatc240 ggacccggggacgaggtgatcgtcccctcgcacacgtacatcgccagctggctcgcggtg300 10tccgccaccggcgcgacccccgtgcccgtcgagccgcacgaggaccaccccaccctggac360 ccgctgctcgtcgagaaggcgatcaccccccgcacccgggcgctcctccccgtccacctc420 tacgggcaccccgccgacatggacgccctccgcgagctcgcggaccggcacggcctgcac480 atcgtcgaggacgccgcgcaggcccacggcgcccgctaccggggccggcggatcggcgcc540 gggtcgtcggtggccgcgttcagcttctacccgggcaagaacctcggctgcttcggcgac600 1$ggcggcgccgtcgtcaccggcgaccccgagctcgccgaacggctccggatgctccgcaac660 tacggctcgcggcagaagtacagccacgagacgaagggcaccaactcccgcctggacgag720 atgcaggccgccgtgctgcggatccggctcgcccacctggacagctggaacggccgcagg780 .

tcggcgctggccgcggagtacctctcc:gggctcgccggactgcccggcatcggcctgccg840 gtgaccgcgcccgacaccgacccggtcaggcacctcttcaccgtgcgcaccgagcgccgc900 :!0gacgagctgcgcagccacctcgacgcc:cgcggcatcgacaccctcacgcactacccggta960 cccgtgcacctctcgcccgcctacgcc~ggcgaggcaccgccggaaggctcctcccgcgg 1020 g gccgagagcttcgcgcggcaggtcctc;agcctgccgatcggcccgcacctggagcgcccg1080 caggcgctgcgggtgatcgacgccgtqcgcgaatgggccgagcgggtcgacaggcctag 1140 c L;$ <210> 16 <211> 379 <212> PRT
<213> Streptomyces venezuelae
<400> 16
Met Ser Ser Arg Ala Glu Thr Pro Arg Val Pro Phe Leu Asp Leu Lys Ala Ala Tyr Glu Glu Leu Arg Ala Glu Thr Asp Ala Ala Ile Ala Arg Val Leu Asp Ser Gly Arg Tyr Leu Leu Gly Pro Glu Leu Glu Gly Phe Glu Ala Glu Phe Ala Ala Tyr Cys Glu Thr Asp His Ala Val Gly Val Asn Ser Gly Met Asp Ala Leu Gln Leu Ala Leu Arg Gly Leu Gly Ile 65 70 75 80 Gly Pro Gly Asp Glu Val Ile Val Pro Ser His Thr Tyr Ile Ala Ser Trp Leu Ala Val Ser Ala Thr Gly Ala Thr Pro Val Pro Val Glu Pro His Glu Asp His Pro Thr Leu Asp Pro Leu Leu Val Glu Lys Ala Ile Thr Pro Arg Thr Arg Ala Leu Leu Pro Val His Leu Tyr Gly His Pro Ala Asp Met Asp Ala Leu Arg Glu Leu Ala Asp Arg His Gly Leu His 145 150 155 160 Ile Val Glu Asp Ala Ala Gln Ala His Gly Ala Arg Tyr Arg Gly Arg Arg Ile Gly Ala Gly Ser Ser Val Ala Ala Phe Ser Phe Tyr Pro Gly Lys Asn Leu Gly Cys Phe Gly Asp Gly Gly Ala Val Val Thr Gly Asp Pro Glu Leu Ala Glu Arg Leu Arg Met Leu Arg Asn Tyr Gly Ser Arg Gln Lys Tyr Ser His Glu Thr Lys Gly Thr Asn Ser Arg Leu Asp Glu 225 230 235 240 Met Gln Ala Ala Val Leu Arg Ile Arg Leu Ala His Leu Asp Ser Trp Asn Gly Arg Arg Ser Ala Leu Ala Ala Glu Tyr Leu Ser Gly Leu Ala Gly Leu Pro Gly Ile Gly Leu Pro Val Thr Ala Pro Asp Thr Asp Pro 275 280 285 Val Trp His Leu Phe Thr Val Arg Thr Glu Arg Arg Asp Glu Leu Arg Ser His Leu Asp Ala Arg Gly Ile Asp Thr Leu Thr His Tyr Pro Val 305 310 315 320 Pro Val His Leu Ser Pro Ala Tyr Ala Gly Glu Ala Pro Pro Glu Gly Ser Leu Pro Arg Ala Glu Ser Phe Ala Arg Gln Val Leu Ser Leu Pro Ile Gly Pro His Leu Glu Arg Pro Gln Ala Leu Arg Val Ile Asp Ala 355 360 365 Val Arg Glu Trp Ala Glu Arg Val Asp Gln Ala
<210> 17
<211> 714
<212> DNA
<213> Streptomyces ven.ezuelae S <400> 17 gtgtacgaag tcgaccacgccgacgtctacgacctcttctacctgggtcgcggcaaggac 60 G' tacgccgccg aggcctccgacatcgccgacctggtgcgctcccgtacccccgaggcctcc 120 tcgctcctgg acgtggcctgcggtacgggcacgcatctggagcacttcaccaaggagttc 180 ggcgacaccg ccggcctggagctgtccgaggacatgctcacccacgcccgcaagcggctg 240 l0 cccgacgccacgctccaccagggcgacatgcgggacttccggctcggccggaagttctcc 300 gccgtggtca gcatgttcagctccgtcggctacctgaagacgaccgaggaactcggcgcg 360 gccgtcgcct cgttcgcggagcacct<3gagcccggtggcgcgtcgtcgtcgagccgtgg 420 t tggttcccgg agaccttcgccgacggctgggtcagcgccgacgtcgtccgccgtgacggg 480 cgcaccgtgg cccgtgtctcgcactcggtgcgggaggggaacgcgacgcgatggaggtc 540 c :l5 cacttcaccgtggccgacccgggcaagggcgtgcggcacttctccgacgtccatctcatc 600 .

accctgttcc accaggccgagtacgac3gccgcgttcacggcgccgggctgcgcgtcgag 660 c tacctggagg gcggcccgtcgggccgt:ggcctcttcgtcggcgtccccgcctga 714 <210> 18 t;0 <211> 237 <212> PRT
<213> Streptomyces venezuelae
<400> 18
Met Tyr Glu Val Asp His Ala Asp Val Tyr Asp Leu Phe Tyr Leu Gly Arg Gly Lys Asp Tyr Ala Ala Glu Ala Ser Asp Ile Ala Asp Leu Val Arg Ser Arg Thr Pro Glu Ala Ser Ser Leu Leu Asp Val Ala Cys Gly Thr Gly Thr His Leu Glu His Phe Thr Lys Glu Phe Gly Asp Thr Ala Gly Leu Glu Leu Ser Glu Asp Met Leu Thr His Ala Arg Lys Arg Leu Pro Asp Ala Thr Leu His Gln Gly Asp Met Arg Asp Phe Arg Leu Gly Arg Lys Phe Ser Ala Val Val Ser Met Phe Ser Ser Val Gly Tyr Leu Lys Thr Thr Glu Glu Leu Gly Ala Ala Val Ala Ser Phe Ala Glu His Leu Glu Pro Gly Gly Val Val Val Val Glu Pro Trp Trp Phe Pro Glu Thr Phe Ala Asp Gly Trp Val Ser Ala Asp Val Val Arg Arg Asp Gly Arg Thr Val Ala Arg Val Ser His Ser Val Arg Glu Gly Asn Ala Thr Arg Met Glu Val His Phe Thr Val Ala Asp Pro Gly Lys Gly Val Arg His Phe Ser Asp Val His Leu Ile Thr Leu Phe His Gln Ala Glu Tyr Glu Ala Ala Phe Thr Ala Ala Gly Leu Arg Val Glu Tyr Leu Glu Gly Gly Pro Ser Gly Arg Gly Leu Phe Val Gly Val Pro Ala
<210> 19
<211> 1281
<212> DNA
<213> Streptomyces vene:zuelae :!0 <400> 19 atgcgcgtcctgctgacctcgttcgcacatcacacgcactactacggcctggtgcccctg60 gcctgggcgctgctcgccgccgggcac:gaggtgcgggtcgccagccagcccgcgctcacg120 gacaccatcaccgggtccgggctcgcc:gcggtgccggtcggcaccgaccacctcatccac180 t;5gagtaccgggtgcggatggcgggcgag~ccgcgcccgaaccatccggcgatcgccttcgac240 gaggcccgtcccgagccgctggactggrgaccacgccctcggcatcgaggcgatcctcgcc300 ccgtacttccatctgctcgccaacaacgactcgatggtcgacgacctcgtcgacttcgcc360 cggtcctggcagccggacctggtgctgtgggagccgacgacctacgcgggcgccgtcgcc420-gcccaggtcaccggtgccgcgcacgcccgggtcctgtgggggcccgacgtgatgggcagc480 30gcccgccgcaagttcgtcgcgctgcgggaccggcagccgcccgagcaccgcgaggacccc540 accgcggagtggctgacgtggacgctcgaccggtacggcgcctccttcgaagaggagctg600 ctcaccggccagttcacgatcgacccgaccccgccgagcctgcgcctcgacacgggcctg660 ccgaccgtcgggatgcgttatgttccgtacaacggcacgtcggtcgtgccggactggctg720 agtgagccgcccgcgcggccccgggtctgcctgaccctcggcgtctccgcgcgtgaggtc780 3$ctcggcggcgacggcgtctcgcagggcgacatcctggaggcgctcgccgacctcgacatc840 gagctcgtcgccacgctcgacgcgagtcagcgcgccgagatccgcaactacccgaagcac900 acccggttcacggacttcgtgccgatgcacgcgctcctgccgagctgctcggcgatcatc960 caccacggcggggcgggcacctacgcgaccgccgtgatcaacgcggtgccgcaggtcatg1020 ctcgccgagctgtgggacgcgccggtcaaggcgcgggccgtcgccgagcagggggcgggg1.080 40ttcttcctgccgccggccgagctcacgccgcaggccgtgcgggacgccgtcgtccgcatc1140 ctcgacgacc cctcggtcgc caccgccgcg caccggctgc gcgaggagac cttcggcgac 1200 cccaccccgg ccgggatcgt ccccgagctg gagcggctcg ccgcgcagca ccgccgcccg 1260 ccggccgacg cccggcactg a 1281 <210> 20 <211> 426 <212> PRT
<213> Streptomyces venezuelae
<400> 20
Met Arg Val Leu Leu Thr Ser Phe Ala His His Thr His Tyr Tyr Gly Leu Val Pro Leu Ala Trp Ala Leu Leu Ala Ala Gly His Glu Val Arg Val Ala Ser Gln Pro Ala Leu Thr Asp Thr Ile Thr Gly Ser Gly Leu Ala Ala Val Pro Val Gly Thr Asp His Leu Ile His Glu Tyr Arg Val Arg Met Ala Gly Glu Pro Arg Pro Asn His Pro Ala Ile Ala Phe Asp Glu Ala Arg Pro Glu Pro Leu Asp Trp Asp His Ala Leu Gly Ile Glu Ala Ile Leu Ala Pro Tyr Phe His Leu Leu Ala Asn Asn Asp Ser Met Val Asp Asp Leu Val Asp Phe Ala Arg Ser Trp Gln Pro Asp Leu Val Leu Trp Glu Pro Thr Thr Tyr Ala Gly Ala Val Ala Ala Gln Val Thr Gly Ala Ala His Ala Arg Val Leu Trp Gly Pro Asp Val Met Gly Ser 145 150 155 160 Ala Arg Arg Lys Phe Val Ala Leu Arg Asp Arg Gln Pro Pro Glu His Arg Glu Asp Pro Thr Ala Glu Trp Leu Thr Trp Thr Leu Asp Arg Tyr Gly Ala Ser Phe Glu Glu Glu Leu Leu Thr Gly Gln Phe Thr Ile Asp Pro Thr Pro Pro Ser Leu Arg Leu Asp Thr Gly Leu Pro Thr Val Gly Met Arg Tyr Val Pro Tyr Asn Gly Thr Ser Val Val Pro Asp Trp Leu 225 230 235 240 Ser Glu Pro Pro Ala Arg Pro Arg Val Cys Leu Thr Leu Gly Val Ser Ala Arg Glu Val Leu Gly Gly Asp Gly Val Ser Gln Gly Asp Ile Leu Glu Ala Leu Ala Asp Leu Asp Ile Glu Leu Val Ala Thr Leu Asp Ala Ser Gln Arg Ala Glu Ile Arg Asn Tyr Pro Lys His Thr Arg Phe Thr Asp Phe Val Pro Met His Ala Leu Leu Pro Ser Cys Ser Ala Ile Ile 305 310 315 320 His His Gly Gly Ala Gly Thr Tyr Ala Thr Ala Val Ile Asn Ala Val Pro Gln Val Met Leu Ala Glu Leu Trp Asp Ala Pro Val Lys Ala Arg Ala Val Ala Glu Gln Gly Ala Gly Phe Phe Leu Pro Pro Ala Glu Leu 355 360 365 Thr Pro Gln Ala Val Arg Asp Ala Val Val Arg Ile Leu Asp Asp Pro Ser Val Ala Thr Ala Ala His Arg Leu Arg Glu Glu Thr Phe Gly Asp 385 390 395 400 Pro Thr Pro Ala Gly Ile Val Pro Glu Leu Glu Arg Leu Ala Ala Gln His Arg Arg Pro Pro Ala Asp Ala Arg His
<210> 21
<211> 1209
<212> DNA
<213> Streptomyces venezuelae <400> 21 gtgaccgacg acctgacgggggccctcacgcagcccccgctgggccgcaccgtccgcgcg 60 gtggccgacc gtgaactcggcacccacctcctggagacccgcggcatccactggatccac 120 gccgcgaacg gcgacccgtacgccaccc~tgctgcgcggccaggcggacgaccgtatccc 180 c 3:5 gcgtacgagcgggtgcgtgcccgcggcgcgctctccttcagcccgacgggcagctgggtc 240 accgccgatc acgccctggcggcgagcatcctctgctcgacggacttcggggtctccggc 300 gccgacggcg tcccggtgccgcagcagc~tcctctcgtacggggagggctgccgctggag 360 t cgcgagcagg tgctgccggcggccggt<~acgtgccggagggcgggcagcggccgtggtc 420 t gaggggatcc accgggagacgctggagc~gtctcgcgccggacccgtcggcgtcgtacgcc 480 .

ttcgagctgctgggcggtttcgtccgcc:cggcggtgacggccgctgccgccgccgtgctg 540 ggtgttcccg cggaccggcgcgcggacttcgcggatctgctggagcggctccggccgctg 600 tccgacagcc tgctggccccgcagtccctgcggacggtacgggcggcggacggcgcgctg 660 gccgagctca cggcgctgctcgccgattcggacgactcccccggggccctgctgtcggcg 720 ctcggggtca ccgcagccgtccagctcaccgggaacgcggtgctcgcgctcctcgcgcat 780 $ cccgagcagtggcgggagctgtgcgaccggcccgggctcgcggcggccgcggtggaggag 840 accctccgct acgacccgccggtgcagctcgacgcccgggtggtccgcggggagacggag 900 , ctggcgggcc ggcggctgccggccgg~ggcgcatgtcgtcgtcctgaccgccgcgaccggc 960 cgggacccgg aggtcttcacggacccggagcgcttcgacctcgcgcgccccgacgccgcc 1020 gcgcacctcg cgctgcaccccgccggtccgtacggcccggtggcgtccctggtccggctt 1080 caggcggagg tcgcgctgcggaccctggccgggcgtttccccgggctgcggcaggcgggg 1140 gacgtgctcc gcccccgccgcgcgcc~tgtcggccgcgggccgctgagcgtcccggtcagc 1200 agctcctga 1209 <210> 22 ~.$ <211> 402 <212> PRT
<213> Streptomyces venezuelae
<400> 22
Met Thr Asp Asp Leu Thr Gly Ala Leu Thr Gln Pro Pro Leu Gly Arg Thr Val Arg Ala Val Ala Asp Arg Glu Leu Gly Thr His Leu Leu Glu Thr Arg Gly Ile His Trp Ile His Ala Ala Asn Gly Asp Pro Tyr Ala Thr Val Leu Arg Gly Gln Ala Asp Asp Pro Tyr Pro Ala Tyr Glu Arg Val Arg Ala Arg Gly Ala Leu Ser Phe Ser Pro Thr Gly Ser Trp Val Thr Ala Asp His Ala Leu Ala Ala Ser Ile Leu Cys Ser Thr Asp Phe Gly Val Ser Gly Ala Asp Gly Val Pro Val Pro Gln Gln Val Leu Ser Tyr Gly Glu Gly Cys Pro Leu Glu Arg Glu Gln Val Leu Pro Ala Ala 115 120 125 Gly Asp Val Pro Glu Gly Gly Gln Arg Ala Val Val Glu Gly Ile His Arg Glu Thr Leu Glu Gly Leu Ala Pro Asp Pro Ser Ala Ser Tyr Ala Phe Glu Leu Leu Gly Gly Phe Val Arg Pro Ala Val Thr Ala Ala Ala Ala Ala Val Leu Gly Val Pro Ala Asp Arg Arg Ala Asp Phe Ala Asp Leu Leu Glu Arg Leu Arg Pro Leu Ser Asp Ser Leu Leu Ala Pro Gln Ser Leu Arg Thr Val Arg Ala Ala Asp Gly Ala Leu Ala Glu Leu Thr Ala Leu Leu Ala Asp Ser Asp Asp Ser Pro Gly Ala Leu Leu Ser Ala Leu Gly Val Thr Ala Ala Val Gln Leu Thr Gly Asn Ala Val Leu Ala Leu Leu Ala His Pro Glu Gln Trp Arg Glu Leu Cys Asp Arg Pro Gly 260 265 270 Leu Ala Ala Ala Ala Val Glu Glu Thr Leu Arg Tyr Asp Pro Pro Val 275 280 285 Gln Leu Asp Ala Arg Val Val Arg Gly Glu Thr Glu Leu Ala Gly Arg Arg Leu Pro Ala Gly Ala His Val Val Val Leu Thr Ala Ala Thr Gly Arg Asp Pro Glu Val Phe Thr Asp Pro Glu Arg Phe Asp Leu Ala Arg Pro Asp Ala Ala Ala His Leu Ala Leu His Pro Ala Gly Pro Tyr Gly Pro Val Ala Ser Leu Val Arg Leu Gln Ala Glu Val Ala Leu Arg Thr 355 360 365 Leu Ala Gly Arg Phe Pro Gly Leu Arg Gln Ala Gly Asp Val Leu Arg Pro Arg Arg Ala Pro Val Gly Arg Gly Pro Leu Ser Val Pro Val Ser Ser Ser
<210> 23
<211> 2430
<212> DNA
<213> Streptomyces venezuelae <400> 23 gtgacaggta agacccgaat accgcgtgtc cgccgcggcc gcaccacgcc cagggccttc 60 40 accctggccg tcgtcggcac cctgctggcg ggcaccaccg tggcggccgc cgctcccggc 120 gccgccgaca 180 cggccaatgt tcagtacacg agccgggcgg cggagctcgt cgcccagatg acgctcgacg 240 agaagatcag cttcgtccac tgggcgctgg accccgaccg gcagaacgtc ggctaccttc atcccggagctgcgtgccgccgacggcccg300 ccggcgtgcc gcgtctgggc aacggcatccgcctggtggg accgcgctgcccgcgccggtcgccctggcc360 gcagaccgcc agcaccttcg ggccgacagctacggcaaggtcatgggccgcgacggtcgc420 acgacaccat gcgctcaaccaggacatggtcctgggcccgatgatgaacaacatccgggtgccgcacggc480 ggccggaactacgagaccttcagcga~,~gaccccctggtctcctcgcgcaccgcggtcgcc540 cagatcaagggcatccagggtgcgggtctgatgaccacggccaagcacttcgcggccaac600 aaccaggagaacaaccgcttctccgtgaacgccaatgtcgacgagcagacgctccgcgag660 l0atcgagttcccggcgttcgaggcgtcctccaaggccggcgcggcctccttcatgtgtgcc720 tacaacggcctcaacgggaagccgtcctgcggcaacgacgagctcctcaacaacgtgctg780 cgcacgcagtggggcttccagggctgc~gtgatgtccgactggctcgccaccccgggcacc840 .

gacgccatcaccaagggcctcgaccaggagatgggcgtcgagctccccgggacgtcccg 900 c aagggcgagccctcgccgccggccaac~ttcttcggcgaggcgctgaagacgccgtcctg 960 g l5aacggcacggtccccgaggcggccgtc~acgcggtcggcggagcggatcgtggccagatg 1020 c gagaagttcggtctgctcctcgccact:ccggcgccgcggcccgagcgcgacaaggcgggt1080 gcccaggcggtgtcccgcaaggtcgcc;gagaacggcgcggtgctcctgcgcaacgagggc1140 caggccctgccgctcgccggtgacgccggcaagagcatcgcggtcatcggcccgacggcc1200 gtcgaccccaaggtcaccggcctgggc:agcgcccacgtcgtcccggactcggcggcggcg1260 2.0ccactcgacaccatcaaggcccgcgcg~ggtgcgggtgcgacggtgacgtacgagacgggt1320 gaggagaccttcgggacgcagatcccggcggggaacctcagcccggcgttcaaccagggc1380 caccagctcgagccgggcaaggcgggggcgctgtacgacggcacgctgaccgtgcccgcc:1440 gacggcgagtaccgcatcgcggtccgtgccaccggtggttacgccacggtgcagctcggc:1500 agccacaccatcgaggccggtcaggtctacggcaaggtgagcagcccgctcctcaagctg:1560 25accaagggcacgcacaagctcacgatctcgggcttcgcgatgagtgccaccccgctctcc1620 ctggagctgggctgggtgacgccggcggcggccgacgcgacgatcgcgaaggccgtggag:1680 tcggcgcggaaggcccgtacggcggtcgtcttcgcctacgacgacggcaccgagggcgtc1740 gaccgtccgaacctgtcgctgccgggtacgcaggacaagctgatctcggctgtcgcggac1800.

gccaacccgaacacgatcgtggtcctcaacaccggttcgtcggtgctgatgccgtggctg1860 3~tccaagacccgcgcggtcctggacatgtggtacccgggccaggcgggcgccgaggccacc1920 gccgcgctgctctacggtgacgtcaacccgagcggcaagctcacgcagagcttcccggcc7.980 gccgagaaccagcacgcggtcgccggcgacccgacaagctacccgggcgtcgacaaccag2040 cagacgtaccgcgagggcatccacgtcgggtaccgctggttcgacaaggagaacgtcaag2.100 ccgctgttcccgttcgggcacggcctgtcgtacacctcgttcacgcagagcgccccgacc2160 3:Sgtcgtgcgtacgtccacgggtggtctgaaggtcacggtcacggtccgcaacagcgggaag2220 cgcgccggccaggaggtcgtccaggcgtacctcggtgcca gacggctccg2280 gcccgaacgt caggcgaagaagaagctcgtgggctacacgaaggtctcgctcgccgcgggcgaggcgaag2340 acggtgacggtgaacgtcgaccgccgtcagctgcagaccg cgccgacctg2400 gttcgtcctc cggggcagcgccacggtcaacgtctggtga 2430 41) <210> 24 <211> 809 <212> PRT
<213> Streptomyces venezuelae
<400> 24
Met Thr Gly Lys Thr Arg Ile Pro Arg Val Arg Arg Gly Arg Thr Thr Pro Arg Ala Phe Thr Leu Ala Val Val Gly Thr Leu Leu Ala Gly Thr 20 25 30 Thr Val Ala Ala Ala Ala Pro Gly Ala Ala Asp Thr Ala Asn Val Gln Tyr Thr Ser Arg Ala Ala Glu Leu Val Ala Gln Met Thr Leu Asp Glu Lys Ile Ser Phe Val His Trp Ala Leu Asp Pro Asp Arg Gln Asn Val Gly Tyr Leu Pro Gly Val Pro Arg Leu Gly Ile Pro Glu Leu Arg Ala Ala Asp Gly Pro Asn Gly Ile Arg Leu Val Gly Gln Thr Ala Thr Ala 100 105 110 Leu Pro Ala Pro Val Ala Leu Ala Ser Thr Phe Asp Asp Thr Met Ala Asp Ser Tyr Gly Lys Val Met Gly Arg Asp Gly Arg Ala Leu Asn Gln Asp Met Val Leu Gly Pro Met Met Asn Asn Ile Arg Val Pro His Gly Gly Arg Asn Tyr Glu Thr Phe Ser Glu Asp Pro Leu Val Ser Ser Arg Thr Ala Val Ala Gln Ile Lys Gly Ile Gln Gly Ala Gly Leu Met Thr 180 185 190 Thr Ala Lys His Phe Ala Ala Asn Asn Gln Glu Asn Asn Arg Phe Ser Val Asn Ala Asn Val Asp Glu Gln Thr Leu Arg Glu Ile Glu Phe Pro Ala Phe Glu Ala Ser Ser Lys Ala Gly Ala Ala Ser Phe Met Cys Ala Tyr Asn Gly Leu Asn Gly Lys Pro Ser Cys Gly Asn Asp Glu Leu Leu Asn Asn Val Leu Arg Thr Gln Trp Gly Phe Gln Gly Trp Val Met Ser 260 265 270
Asp Trp Leu Ala Thr Pro Gly Thr Asp Ala Ile Thr Lys Gly Leu Asp Gln Glu Met Gly Val Glu Leu Pro Gly Asp Val Pro Lys Gly Glu Pro Ser Pro Pro Ala Lys Phe Phe Gly Glu Ala Leu Lys Thr Ala Val Leu Asn Gly Thr Val Pro Glu Ala Ala Val Thr Arg Ser Ala Glu Arg Ile Val Gly Gln Met Glu Lys Phe Gly Leu Leu Leu Ala Thr Pro Ala Pro 340 345 350 Arg Pro Glu Arg Asp Lys Ala Gly Ala Gln Ala Val Ser Arg Lys Val Ala Glu Asn Gly Ala Val Leu Leu Arg Asn Glu Gly Gln Ala Leu Pro Leu Ala Gly Asp Ala Gly Lys Ser Ile Ala Val Ile Gly Pro Thr Ala Val Asp Pro Lys Val Thr Gly Leu Gly Ser Ala His Val Val Pro Asp Ser Ala Ala Ala Pro Leu Asp Thr Ile Lys Ala Arg Ala Gly Ala Gly Ala Thr Val Thr Tyr Glu Thr Gly Glu Glu Thr Phe Gly Thr Gln Ile Pro Ala Gly Asn Leu Ser Pro Ala Phe Asn Gln Gly His Gln Leu Glu Pro Gly Lys Ala Gly Ala Leu Tyr Asp Gly Thr Leu Thr Val Pro Ala Asp Gly Glu Tyr Arg Ile Ala Val Arg Ala Thr Gly Gly Tyr Ala Thr Val Gln Leu Gly Ser His Thr Ile Glu Ala Gly Gln Val Tyr Gly Lys Val Ser Ser Pro Leu Leu Lys Leu Thr Lys Gly Thr His Lys Leu Thr Ile Ser Gly Phe Ala Met Ser Ala Thr Pro Leu Ser Leu Glu Leu Gly Trp Val Thr Pro Ala Ala Ala Asp Ala Thr Ile Ala Lys Ala Val Glu Ser Ala Arg Lys Ala Arg Thr Ala Val Val Phe Ala Tyr Asp Asp Gly Thr Glu Gly Val Asp Arg Pro Asn Leu Ser Leu Pro Gly Thr Gln Asp Lys Leu Ile Ser Ala Val Ala Asp Ala Asn Pro Asn Thr Ile Val Val Leu Asn Thr Gly Ser Ser Val Leu Met Pro Trp Leu Ser Lys Thr Arg Ala Val Leu Asp Met Trp Tyr Pro Gly Gln Ala Gly Ala Glu Ala Thr Ala Ala Leu Leu Tyr Gly Asp Val Asn Pro Ser Gly Lys Leu Thr Gln Ser Phe Pro Ala Ala Glu Asn Gln His Ala Val Ala Gly Asp Pro Thr Ser Tyr Pro Gly Val Asp Asn Gln Gln Thr Tyr Arg Glu Gly Ile His Val Gly Tyr Arg Trp Phe Asp Lys Glu Asn Val Lys Pro Leu Phe Pro Phe Gly His Gly Leu Ser Tyr Thr Ser Phe Thr Gln Ser Ala Pro Thr Val Val Arg Thr Ser Thr Gly Gly Leu Lys Val Thr Val Thr Val Arg Asn Ser Gly Lys Arg Ala Gly Gln Glu Val Val Gln Ala Tyr Leu Gly Ala Ser Pro Asn Val Thr Ala Pro
Gln Ala Lys Lys Lys Leu Val Gly 755 760 765 Tyr Thr Lys Val Ser Leu Ala Ala Gly Glu Ala Lys Thr Val Thr Val Asn Val Asp Arg Arg Gln Leu Gln Thr Gly Ser Ser Ser Ala Asp Leu Arg Gly Ser Ala Thr Val Asn Val Trp
<210> 25
<211> 9
<212> PRT
<213> Artificial Sequence
<220>
<223> A consensus sequence.
<221> VARIANT
<222> (4)...(4)
<223> Residue 4 is either V or I.
<400> 25
Leu Leu Asp Val Ala Cys Gly Thr Gly
<210> 26
<211> 1011
<212> DNA
<213> Streptomyces vene:zuelae 1~ <400> 26 atggcaatgc gcgactccataccgagg~cgagcggaccgcgacacccttcgccgcgaatta 60 ggccagaact tccttcaggacgacaga.gccgtgcgcaatctcgtcacgcatgtcgagggg 120 gacggtagga acgttctcgaaatcggccccggaaagggcgcgataaccgaggagttggtg 180 cgctccttcg acaccgtgacggtcgtggagatggacccgcactgggccgcgcatgtgcgg 240 15 cggaaattcgaaggggagagggtcaccgtattccagggtgatttcctcgacttccgcatt 300 ccgcgcgata tcgacaccgtcgtcggaaacgttcccttcggcatcacgacccagattctc 360 cggagtctcc tggaatcgacgaactggcagtcggcggccctgatagtgcagtgggaggtc 420 gcccgcaaac gcgccggtcgcagcggcggatcgctcctcacgacctcctgggccccctgg 480 tacgagttcg cggtccacgaccgcgtccgcgcctcgtcgttccgtccgatgccccgcgtc 540 2~~ gacggcggcgtcctgacgatcaggcgacgcccccagcccctgctgcccgagagcgcgagc 600 cgcgccttcc agaacttcgccgaagcc~gtcttcaccggccccggacggggcctcgcggag 660 atcctccggc gccacatccccaagcgg,acctaccgttccctcgccgaccgccacggaatt 720 ccggacggcg gactgccgaaggacctc,acgctcacccaatggatcgcccttttccaggcc 780 tcccagccga gttacgcgccgggggcgcccggcacgcgcatgccgggccagggcggtggc 840 2:> gccggcggcagggactatgactcggagacgagcagggccgccgtgcccgggagccgcaga 900 tacggcccca cgcgcggcggcgaaccctgcgcaccccgcgcacaggtccggcagaccaag 960 ggccgccagg gcgcgcgaggctcgtcgtacggacgccgcacgggccgttag 1011 <210> 27 <211> 336 <212> PRT
<213> Streptomyces venezuelae
<400> 27
Met Ala Met Arg Asp Ser Ile Pro Arg Arg Ala Asp Arg Asp Thr Leu Arg Arg Glu Leu Gly Gln Asn Phe Leu Gln Asp Asp Arg Ala Val Arg Asn Leu Val Thr His Val Glu Gly Asp Gly Arg Asn Val Leu Glu Ile Gly Pro Gly Lys Gly Ala Ile Thr Glu Glu Leu Val Arg Ser Phe Asp Thr Val Thr Val Val Glu Met Asp Pro His Trp Ala Ala His Val Arg Arg Lys Phe Glu Gly Glu Arg Val Thr Val Phe Gln Gly Asp Phe Leu Asp Phe Arg Ile Pro Arg Asp Ile Asp Thr Val Val Gly Asn Val Pro Phe Gly Ile Thr Thr Gln Ile Leu Arg Ser Leu Leu Glu Ser Thr Asn 115 120 125 Trp Gln Ser Ala Ala Leu Ile Val Gln Trp Glu Val Ala Arg Lys Arg Ala Gly Arg Ser Gly Gly Ser Leu Leu Thr Thr Ser Trp Ala Pro Trp Tyr Glu Phe Ala Val His Asp Arg Val Arg Ala Ser Ser Phe Arg Pro Met Pro Arg Val Asp Gly Gly Val Leu Thr Ile Arg Arg Arg Pro Gln Pro Leu Leu Pro Glu Ser Ala Ser Arg Ala Phe Gln Asn Phe Ala Glu Ala Val Phe Thr Gly Pro Gly Arg Gly Leu Ala Glu Ile Leu Arg Arg His Ile Pro Lys Arg Thr Tyr Arg Ser Leu Ala Asp Arg His Gly Ile Pro Asp Gly Gly Leu Pro Lys Asp Leu Thr Leu Thr Gln Trp Ile Ala Leu Phe Gln Ala Ser Gln Pro Ser Tyr Ala Pro Gly Ala Pro Gly Thr Arg Met Pro Gly Gln Gly Gly Gly Ala Gly Gly Arg Asp Tyr Asp Ser 275 280 285 Glu Thr Ser Arg Ala Ala Val Pro Gly Ser Arg Arg Tyr Gly Pro Thr Arg Gly Gly Glu Pro Cys Ala Pro Arg Ala Gln Val Arg Gln Thr Lys Gly Arg Gln Gly Ala Arg Gly Ser Ser Tyr Gly Arg Arg Thr Gly Arg
<210> 28
<211> 969
<212> DNA

<213> Streptomyces vene:zuelae <400> 28 atggcattttccccgcagggcggccga.cacgagctcggtcagaacttcctcgtcgaccgg 60 tcagtgatcgacgagatcgacggcctgrgtggccaggaccaagggtccgatactggagatc 120 ggtccgggtgacggcgccctgaccctg~ccgctgagcaggcacggcaggccgatcaccgcc 180 gtcgagctcgacggccggcgcgcgcagcgcctcggtgcccgcacccccggtcatgtgacc 240 gtggtgcaccacgacttcctgcagtacccgctgccgcgcaacccgcatgtggtcgtcggc 300 aacgtccccttccatctgacgacggcgatcatgcggcggctgctcgacgcccagcactgg 360 10cacaccgccgtcctcctcgtccagtgggaggtcgcccggcgccgggccggcgtcggcggg 420 tcgacgctgctgacggccggctgggcgccctggtacgagttcgacctgcactcccgggtc 480 cccgcgcgggccttccgtccgatgccgggcgtggacggaggagtactggccatccggcgg 540 cggtccgcgccgctcgtgggccaggtgaagacgtaccaggacttcgtacgccaggtgttc 600 accggcaaggggaacgggctgaaggagatcctgcggcggaccgggcggatctcgcagcgg 660 15gacctggcgacctggctgcggaggaacgagatctcgccgcacgcgctgcccaaggacctg 720 aagcccgggcagtgggcgtcgctgtgggagctgaccggcggcacggccgacggatccttc 780 gacggtacggcgggcggtggcgcggccggatcgcacggggcggctcgggtcggggccggt 840 cacccgggcggccgggtgtccgcgagccggcggggcgtgccgcaggcgcggcgcggccgg 900 gggcatgcggtacggagctccacggggaccgagccgaggtggggcagggggcgggcggag 960 2~~agcgcgtga g6g <210> 29 <211> 322 <212> PRT

<213> Streptomyces venezuelae
<400> 29
Met Ala Phe Ser Pro Gln Gly Gly Arg His Glu Leu Gly Gln Asn Phe Leu Val Asp Arg Ser Val Ile Asp Glu Ile Asp Gly Leu Val Ala Arg Thr Lys Gly Pro Ile Leu Glu Ile Gly Pro Gly Asp Gly Ala Leu Thr 35 40 45 Leu Pro Leu Ser Arg His Gly Arg Pro Ile Thr Ala Val Glu Leu Asp 50 55 60 Gly Arg Arg Ala Gln Arg Leu Gly Ala Arg Thr Pro Gly His Val Thr Val Val His His Asp Phe Leu Gln Tyr Pro Leu Pro Arg Asn Pro His Val Val Val Gly Asn Val Pro Phe His Leu Thr Thr Ala Ile Met Arg 100 105 110 Arg Leu Leu Asp Ala Gln His Trp His Thr Ala Val Leu Leu Val Gln 115 120 125 Trp Glu Val Ala Arg Arg Arg Ala Gly Val Gly Gly Ser Thr Leu Leu 130 135 140 Thr Ala Gly Trp Ala Pro Trp Tyr Glu Phe Asp Leu His Ser Arg Val Pro Ala Arg Ala Phe Arg Pro Met Pro Gly Val Asp Gly Gly Val Leu Ala Ile Arg Arg Arg Ser Ala Pro Leu Val Gly Gln Val Lys Thr Tyr Gln Asp Phe Val Arg Gln Val Phe Thr Gly Lys Gly Asn Gly Leu Lys Glu Ile Leu Arg Arg Thr Gly Arg Ile Ser Gln Arg Asp Leu Ala Thr Trp Leu Arg Arg Asn Glu Ile Ser Pro His Ala Leu Pro Lys Asp Leu Lys Pro Gly Gln Trp Ala Ser Leu Trp Glu Leu Thr Gly Gly Thr Ala Asp Gly Ser Phe Asp Gly Thr Ala Gly Gly Gly Ala Ala Gly Ser His Gly Ala Ala Arg Val Gly Ala Gly His Pro Gly Gly Arg Val Ser Ala 275 280 285 Ser Arg Arg Gly Val Pro Gln Ala Arg Arg Gly Arg Gly His Ala Val 290 295 300 Arg Ser Ser Thr Gly Thr Glu Pro Arg Trp Gly Arg Gly Arg Ala Glu Ser Ala
<210> 30
<211> 13842
<212> DNA
<213> Streptomyces vene::uelae 3 '.i <400> 30 atgtcttcag ccggaattac caggaccgc~t gcgagaacac cggtgacagg gcgtggggcg 60 gcagcgtggg acacggggga agtgcgggt:c cgacgggggt tgccccctgc cggccccgat 120 catgcggagc actccttctc tcgtgctcca: accggtgatg tgcgcgccga attgattcgt 180 40ggagagatgt cgacagtgtc caagagtgag tccgaggaat tcgtgtccgt gtcgaacgac 240 gccggttccgcgcacggcacagcggaac:ccgtcgccgtcgtcggcatctcctgccgggtg300 cccggcgcccgggacccgagagagttct:gggaactcctggcggcaggcggccaggccgtc360 accgacgtccccgcggaccgctggaacc3ccggcgacttctacgacccggaccgctccgcc420 cccggccgctcgaacagccggtggggcc3ggttcatcgaggacgtcgaccggttcgacgcc480 Sgccttcttcggcatctcgccccgcgagctccgcggagatggacccgcagcagcggctcgcc540 ctggagctgggctgggaggccctggagc;gcgccgggatcgacccgtcctcgctcaccggc600 acccgcaccggcgtcttcgccggcgccatctgggacgactacgccaccctgaagcaccgc660 cagggcggcgccgcgatcaccccgcacaccgtcaccggcctccaccgcggcatcatcgcg720 aaccgactctcgtacacgctcgggctcc:gcggccccagcatggtcgtcgactccggccag780 lOtcctcgtcgctcgtcgccgtccacctcc~cgtgcgagagcctgcggcgcggcgagtccgag840 ctcgccctcgccggcggcgtctcgctca~acctggtgccggacagcatcatcggggcgagc900 aagttcggcggcctctcccccgacggccgcgcctacaccttcgacgcgcgcgccaacggc960 tacgtacgcggcgagggcggcggtttcg~tcgtcctgaagcgcctctcccgggccgtcgcc1020 gacggcgacccggtgctcgccgtgatccggggcagcgccgtcaacaacggcggcgccgcc1080 l5cagggcatgacgacccccgacgcgcagg~cgcaggaggccgtgctccgcgaggcccacgag1:140 cgggccgggaccgcgccggccgacgtgcggtacgtcgagctgcacggcaccggcaccccc1200 gtgggcgacccgatcgaggccgctgcgctcggcgccgccctcggcaccggccgcccggcc1260 ggacagccgctcctggtcggctcggtcaagacgaacatcggccacctggagggcgcggcc1320 ggcatcgccggcctcatcaaggccgtcctggcggtccgcggtcgcgcgctgcccgccagc1380 20ctgaactacgagaccccgaacccggcgatcccgttcgaggaactgaacctccgggtgaac1440 acggagtacctgccgtgggagccggagcacgacgggcagcggatggtcgtcggcgtgtcc1500 tcgttcggcatgggcggcacgaacgcgcatgtcgtgctcgaagaggcccccgggggttgt1560 cgaggtgcttcggtcgtggagtcgacggtcggcgggtcggcggtcggcggcggtgtggtg1620 ccgtgggtggtgtcggcgaagtccgctgccgcgctggacgcgcagatcgagcggcttgcc1680 25gcgttcgcctcgcgggatcgtacggatggtgtcgacgcgggcgctgtcgatgcgggtgct1740 
gtcgatgcgggtgctgtcgctcgcgtactggccggcgggcgtgctcagttcgagcaccgg1800 gccgtcgtcgtcggcagcgggccggacgatctggcggcagcgctggccgcgcctgagggt1860 ctggtccggggcgtggcttccggtgtcgggcgagtggcgttcgtgttccccgggcagggc1920 acgcagtgggccggcatgggtgccgaactgctggactcttccgcggtgttcgcggcggcc1980 3~Datggccgaatgcgaggccgcactctccccgtacgtcgactggtcgctggaggccgtcgta2040 cggcaggcccccggtgcgcccacgctggagcgggtcgatgtcgtgcagcctgtgacgttc2100 gccgtcatggtctcgctggctcgcgtgtggcagcaccacggggtgacgccccaggcggtc2160 gtcggccactcgcagggcgagatcgccgccgcgtacgtcgccggtgccctgagcctggac2220 gacgccgctcgtgtcgtgaccctgcgcagcaagtccatcgccgcccacctcgccggcaag2280 3:Sggcggcatgctgtccctcgcgctgagcg,aggacgccgtcctggagcgactggccgggttc2340 gacgggctgtccgtcgccgctgtgaacgggcccaccgccaccgtggtctccggtgacccc2400 gtacagatcgaagagcttgctcgggcgtgtgaggccgatggggtccgtgcgcgggtcatt2460 cccgtcgactacgcgtcccacagccggc;aggtcgagatcatcgagagcgagctcgccgag2520 gtcctcgccgggctcagcccgcaggctc~~gcgcgtgccgttcttctcgacactcgaaggc2580 40gcctggatcaccgagcccgtgctcgacggcggctactggtaccgcaacctgcgccatcgt2fi40 WO 00/00620 PC'T/US99/14398 gtgggcttcgccccggccgtcgagacc:ctggccaccgacgagggcttcacccacttcgtc 2700 gaggtcagcgcccaccccgtcctcaccatggccctccccgggaccgtcaccggtctggcg 2760 accctgcgtcgcgacaacggcggtcaglgaccgcctagtcgcctccctcgccgaagcatgg 2820 gccaacggactcgcggtcgactggagcccgctcctcccctccgcgaccggccaccactcc 2880 $gacctccccacctacgcgttccagaccgagcgccactggctgggcgagatcgaggcgctc 2940 gccccggcgggcgagccggcggtgcagcccgccgtcctccgcacggaggcggccgagccg 3000 , gcggagctcgaccgggacgagcagctgcgcgtgatcctggacaaggtccgggcgcagacg 3060 gcccaggtgctggggtacgcgacaggcgggcagatcgaggtcgaccggaccttccgtgag '.3120 gccggttgcacctccctgaccggcgtggacctgcgcaaccggatcaacgccgccttcggc 3180 lOgtacggatggcgccgtccatgatcttcgacttccccacccccgaggctctcgcggagcag 3240 ctgctcctcgtcgtgcacggggaggcggcggcgaacccggccggtgcggagccggctccg 3300 gtggcggcggccggtgccgtcgacgagccggtggcgatcgtcggcatggcctgccgcctg 3360 cccggtggggtcgcctcgccggaggacctgtggcggctggtggccggcggcggggacgcg 3420 atctcggagttcccgcaggaccgcggctgggacgtggaggggctgtaccacccggatccg 3480 
1$gagcaccccggcacgtcgtacgtccgccagggcggtttcatcgagaacgtcgccggcttc 3540 gacgcggccttcttcgggatctcgccgcgcgaggccctcgccatggacccgcagcagcgg 3600 ctcctcctcgaaacctcctgggaggccgtcgaggacgccgggatcgacccgacctccctg 3660 cggggacggcaggtcggcgtcttcactggggcgatgacccacgagtacgggccgagcctg 3720 cgggacggcggggaaggcctcgacggctacctgctgaccggcaacacggccagcgtgatg 3780 ;ZOtcgggccgcgtctcgtacacactcggccttgagggccccgccctgacggtggacacggcc 3840 tgctcgtcgtcgctggtcgccctgcacctcgccgtgcaggccctgcgcaagggcgaggtc 3900 gacatggcgctcgccggcggcgtggcc~gtgatgcccacgcccgggatgttcgtcgagttc 3960 agccggcagcgcgggctggccggggacggccggtcgaaggcgttcgccgcgtcggcggac 4020 ggcaccagctggtccgagggcgtcggcgtcctcctcgtcgagcgcctgtcggacgcccgc 4080 :~$cgcaacggacaccaggtcctcgcggtc~3tccgcggcagcgccttgaaccaggacggcgcg 4140 agcaacggcctcacggctccgaacgggccctcgcagcagcgcgtcatccggcgcgcgctg 4200 gcggacgcccggctgacgacctccgac~3tggacgtcgtcgaggcacacggcacgggcacg 4260 cgactcggcgacccgatcgaggcgcaggccctgatcgccacctacggccagggccgtgac 4320 gacgaacagccgctgcgcctcgggtcgttgaagtccaacatcgggcacacccaggccgcg 4380 :30gccggcgtctccggtgtcatcaagatggtccaggcgatgcgccacggactgctgccgaag 4440 acgctgcacgtcgacgagccctcggaccagatcgactggtcggctggcgccgtggaactc 4500 ctcaccgaggccgtcgactggccggagaagcaggacggcgggctgcgccgggccgccgtc 4560 tcctccttcgggatcagcggcaccaat<3cgcatgtggtgctcgaagaggccccggtggtt 4620 gtcgagggtgcttcggtcgtcgagccgtcggttggcgggtcggcggtcggcggcggtgtg 4680 :3$acgccttgggtggtgtcggcgaagtccgctgccgcgctcgacgcgcagatcgagcggctt 4740 gccgcattcgcctcgcgggatcgtacggatgacgccgacgccggtgctgtcgacgcgggc 4800 gctgtcgctcacgtactggctgacgggcgtgctcagttcgagcaccgggccgtcgcgctc 4860 ggcgccggggcggacgacctcgtacagc3cgctggccgatccggacgggctgatacgcgga 4920 acggcttccggtgtcgggcgagtggcgttcgtgttccccggtcagggcacgcagtgggct 4980 ~~Oggcatgggtgccgaactgctggactcttccgcggtgttcgcggcggccatggccgagtgt 5040 WO 00/00620 PC1'/US99/14398 gaggccgcgctgtccccgtacgtcgacaggtcgctggaggccgtcgtacggcaggccccc5100 ggtgcgcccacgctggagcgggtcgat:gtcgtgcagcctgtgacgttcgccgtcatggtc5160 tcgctggctcgcgtgtggcagcaccac:ggtgtgacgccccaggcggtcgtcggccactcg5220 
cagggcgagatcgccgccgcgtacgtc:gccggagccctgcccctggacgacgccgcccgc5280 Sgtcgtcaccctgcgcagcaagtccatc:gccgcccacctcgccggcaagggcggcatgctg5340 tccctcgcgctgaacgaggacgccgtc:ctggagcgactgagtgacttcgacgggctgtcc5400 gtcgccgccgtcaacgggcccaccgcc:actgtcgtgtcgggtgaccccgtacagatcgaa5460 gagcttgctcaggcgtgcaaggcggac;ggattccgcgcgcggatcattcccgtcgactac5520 gcgtcccacagccggcaggtcgagatc:atcgagagcgagctcgcccaggtcctcgccggt5580 lOctcagcccgcaggccccgcgcgtgccc~ttcttctcgacgctcgaaggcacctggatcacc5640 gagcccgtcctcgacggcacctactga~taccgcaacctccgtcaccgcgtcggcttcgcc5700 cccgccatcgagaccctggccgtcgacgagggcttcacgcacttcgtcgaggtcagcgcc5760 caccccgtcctcaccatgaccctccccgagaccgtcaccggcctcggcaccctccgtcgc5820 gaacagggaggccaagagcgtctggtcacctcgctcgccgaggcgtgggtcaacgggctt5880 lScccgtggcatggacttcgctcctgcccgccacggcctcccgccccggtctgcccacctac5940 gccttccaggccgagcgctactggctcgagaacactcccgccgccctggccaccggcgac6000 gactggcgctaccgcatcgactggaagcgcctcccggccgccgaggggtccgagcgcacc6060 ggcctgtccggccgctggctcgccgtcacgccggaggaccactccgcgcaggccgccgcc6120 gtgctcaccgcgctggtcgacgccggggcgaaggtcgaggtgctgacggccggggcggac6180 20gacgaccgtgaggccctcgccgcccggctcaccgcactgacgaccggtgacggcttcacc6240 ggcgtggtctcgctcctcgacggactcgtaccgcaggtcgcctgggtccaggcgctcggc6300 gacgccggaatcaaggcgcccctgtggtccgtcacccagggcgcggtctccgtcggacgt6360 ctcgacacccccgccgaccccgaccgggccatgctctggggcctcggccgcgtcgtcgcc6420 cttgagcaccccgaacgctgggccggcctcgtcgacctccccgcccagcccgatgccgcc6480 2Sgccctcgcccacctcgtcaccgcactctccggcgccaccggcgaggaccagatcgccatc6540 cgcaccaccggactccacgcccgccgcctcgcccgcgcacccctccacggacgtcggccc6600 acccgcgactggcagccccacggcacc~gtcctcatcaccggcggcaccggagccctcggcfa660 agccacgccgcacgctggatggcccaccacggagccgaacacctcctcctcgtcagccgc6720 agcggcgaacaagcccccggagccacccaactcaccgccgaactcaccgcatcgggcgcc6780 30cgcgtcaccatcgccgcctgcgacgtcgccgacccccacgccatgcgcaccctcctcgac6840 gccatccccgccgagacgcccctcaccgccgtcgtccacaccgccggcgcgctcgacgac6900 ggcatcgtggacacgctgaccgccgagcaggtccggcgggcccaccgtgcgaaggccgtc6'960 ggcgcctcggtgctcgacgagctgacccgggacctcgacctcgacgcgttcgtgctcttc7020 
tcgtccgtgtcgagcactctgggcatccccggtcagggcaactacgccccgcacaacgcc7080 :35tacctcgacgccctcgcggctcgccgccgggccaccggccggtccgccgtctcggtggcc7140 tggggaccgtgggacggtggcggcatgc3ccgccggtgacggcgtggccgagcggctgcgc7200 aaccacggcgtgcccggcatggacccggaactcgccctggccgcactggagtccgcgctc7260 ggccgggacgagaccgcgatcaccgtcgcggacatcgactgggaccgcttctacctcgcg7320 tactcctccggtcgcccgcagcccctc<~tcgaggagctgcccgaggtgcggcgcatcatc7380 -40gacgcacgggacagcgccacgtccggacagggcgggagctccgcccagggcgccaacccc7440 ctggccgagc ggctggccgccgcggctcccggcgagcgtacggagatcctcctcggtctc7500 gtacgggcgc aggccgccgccgtgctccggatgcgttcgccggaggacgtcgccgccgac7560 cgcgccttca aggacatcggcttcgacl:cgctcgccggtgtcgagctgcgcaacaggctg7620 acccgggcga ccgggctccagctgcccgcgacgctcgtcttcgaccacccgacgccgctg7680 $gccctcgtgtcgctgctccgcagcgaglacctcggtgacgaggagacggcggacgcccgg7740 cggtccgcgg cgctgcccgcgactgtcc~gtgccggtgccggcgccggcgccggcaccgat7800 , gccgacgacg atccgatcgcgatcgtcgcgatgagctgccgctaccccggtgacatccgc7860 agcccggagg acctgtggcggatgctgt:ccgagggcggcgagggcatcacgccgttcccc7920 accgaccgcg gctgggacctcgacggc<agtacgacgccgacccggacgcgctcggcagg7980 l0gcgtacgtccgcgagggcgggttcctgc:acgacgcggccgagttcgacgcggagttcttc8040 ggcgtctcgc cgcgcgaggcgctggccatggacccgcagcagcggatgctcctgacgacg8100 tcctgggagg ccttcgagcgggccggcatcgagccggcatcgctgcgcggcagcagcacc8160 ggtgtcttca tcggcctctcctaccagc~actacgcggcccgcgtcccgaacgccccgcgt8220 ggcgtggagg gttacctgctgaccggcagcacgccgagcgtcgcgtcgggccgtatcgcg8280 l5tacaccttcggtctcgaagggcccgcgacgaccgtcgacaccgcctgctcgtcgtcgctg8340 accgccctgc acctggcggtgcgggcgc;tgcgcagcggcgagtgcacgatggcgctcgcc8400 ggtggcgtgg cgatgatggcgaccccgc:acatgttcgtggagttcagccgtcagcgggcg8460 ctcgccccgg acggccgcagcaaggcctactcggcggacgccgacgggttcggcgccgcg8520 gagggcgtcg gcctgctgctcgtggagc;ggctctcggacgcgcggcgcaacggtcacccg8580 2',Dgtgctcgccgtggtccgcggtaccgccc~tcaaccaggacggcgccagcaacgggctgacc8640 gcgcccaacg gaccctcgcagcagcggc~tgatccggcaggcgctcgccgacgcccggctg8700 gcacccggcg acatcgacgccgtcgagacgcacggcacgggaacctcgctgggcgacccc8760 atcgaggccc agggcctccaggccacgt:acggcaaggagcggcccgcggaacggccgctc8820 
gccatcggct ccgtgaagtccaacatcc~gacacacccaggccgcggccggtgcggcgggc8880 2,Satcatcaagatggtcctcgcgatgcgcc:acggcaccctgccgaagaccctccacgccgac8940 gagccgagcc cgcacgtcgactgggcga~acagcggcctggccctcgtcaccgagccgatc9000 gactggccgg ccggcaccggtccgcgccgcgccgccgtctcctccttcggcatcagcggg9060 acgaacgcgc acgtcgtgctggagcaggcgccggatgctgctggtgaggtgcttggggcc9120 gatgaggtgc ctgaggtgtctgagacggtagcgatggctgggacggctgggacctccgag9180 30gtcgctgagggctctgaggcctccgaggcccccgcggcccccggcagccgtgaggcgtcc9240 ctccccgggc acctgccctgggtgctgt.ccgccaaggacgagcagtcgctgcgcggccag9300 gccgccgccc tgcacgcgtggctgtcco~agcccgccgccgacctgtcggacgcggacgga9360 ccggcccgcc tgcgggacgtcgggtaca.cgctcgccacgagccgtaccgccttcgcgcac9420 cgcgccgccg tgaccgccgccgaccggg~acgggttcctggacgggctggccacgctggcc9480 35cagggcggcacctcggcccacgtccacctggacaccgcccgggacggcaccaccgcgttc9540 ctcttcaccg gccagggcagtcagcgccccggcgccggccgtgagctgtacgaccggcac9600 cccgtcttcg cccgggcgctcgacgaga.tctgcgcccacctcgacggtcacctcgaactg9660 cccctgctcg acgtgatgttcgcggccgiagggcagcgcggaggccgcgctgctcgacgag9'720 acgcggtaca cgcagtgcgcgctgttcg~ccctggaggtcgcgctcttccggctcgtcgag9780 4~~agctggggcatgcggccggccgcactgctcggtcactcggtcggcgagatcgccgccgcg9840 cacgtcgccggtgtgttctcgctcgccgacgccgcccgcctggtcgccgcgcgcggccgg9900 ctcatgcaggagctgcccgccggtggcgcgatgctcgccgtccaggccgcggaggacgag9960 atccgcgtgtggctggagacggaggagcggtacgcgggacgtctggacgtcgccgccgtc10020 aacggccccgaggccgccgtcctgtccggcgacgcggacgcggcgcgggaggcggaggcg10080 Stactggtccgggctcggccgcaggacccgcgcgctgcgggtcagccacgccttccactcc10140 gcgcacatggacggcatgctcgacgggttccgcgccgtcctggagacggtggagttccgg10200 , cgcccctccctgaccgtggtctcgaacgtcaccggcctggccgccggcccggacgacctg10260 tgcgaccccgagtactgggtccggcacgtccgcggcaccgtccgcttcctcgacggcgtc10320 cgtgtcctgcgcgacctcggcgtgcggacctgcctggagctgggccccgacggggtcctc10380 .l0accgccatggcggccgacggcctcgcggacacccccgcggattccgctgccggctccccc10440 gtcggctctcccgccggctctcccgccgactccgccgccggcgcgctccggccccggccg10500 ctgctcgtggcgctgctgcgccgcaagcggtcggagaccgagaccgtcgcggacgccctc10560 
ggcagggcgcacgcccacggcaccggacccgactggcacgcctggttcgccggctccggg10620 gcgcaccgcgtggacctgcccacgtactccttccggcgcgaccgctactggctggacgcc10680 .~Sccggcggccgacaccgcggtggacaccc~ccggcctcggtctcggcaccgccgaccacccg10740 ctgctcggcgccgtggtcagccttccg<3accgggacggcctgctgctcaccggccgcctc10800 tccctgcgcacccacccgtggctcgcggaccacgccgtcctggggagcgtcctgctcccc10860 ggcgccgcgatggtcgaactcgccgcg<:acgctgcggagtccgccggtctgcgtgacgtg10920 cgggagctgaccctccttgaaccgctggtactgcccgagcacggtggcgtcgagctgcgc10980 :!Ogtgacggtcggggcgccggccggagag<:ccggtggcgagtcggccggggacggcgcacgg11040 cccgtctccctccactcgcggctcgccc~acgcgcccgccggtaccgcctggtcctgccac11100 gcgaccggtctgctggccaccgaccggc:ccgagcttcccgtcgcgcccgaccgtgcggcc11160 atgtggccgccgcagggcgccgaggagc~tgccgctcgacggtctctacgagcggctcgac11220 gggaacggcctcgccttcggtccgctgtaccaggggctgaacgcggtgtggcggtacgag11280 :!Sggtgaggtcttcgccgacatcgcgctcc:ccgccaccacgaatgcgaccgcgcccgcgacc11340 gcgaacggcggcgggagtgcggcggcgc~ccccctacggcatccaccccgccctgctcgac11400 gcttcgctgcacgccatcgcggtcggcc~gtctcgtcgacgagcccgagctcgtccgcgtc11460 cccttccactggagcggtgtcaccgtgc:acgcggccggtgccgcggcggcccgggtccgt11520 ctcgcctccgcggggacggacgccgtct:cgctgtccctgacggacggcgagggacgcccg11580 ;S~ctggtctccgtggaacggctcacgctgc:gcccggtcaccgccgatcaggcggcggcgagc11640 cgcgtcggcgggctgatgcaccgggtgc~cctggcgtccgtacgccctcgcctcgtccggc11700 gaacaggacccgcacgccacttcgtacc~ggccgaccgccgtcctcggcaaggacgagctg11760 aaggtcgccgccgccctggagtccgcgc~gcgtcgaagtcgggctctaccccgacctggcc11820 gcgctgtcccaggacgtggcggccggcc~ccccggcgccccgtaccgtccttgcgccgctg11880 :SScccgcgggtcccgccgacggcggcgcgc~agggtgtacggggcacggtggcccggacgctg11940 gagctgctccaggcctggctggccgacc~agcacctcgcgggcacccgcctgctcctggtc12000 acccgcggtgcggtgcgggaccccgagc~ggtccggcgccgacgatggcggcgaggacctg12060 tcgcacgcggccgcctggggtctcgtac:ggaccgcgcagaccgagaaccccggccgcttc12120 ggccttctcgacctggccgacgacgcct:cgtcgtaccggaccctgccgtcggtgctctcc12180 40gacgcgggcctgcgcgacgaaccgcagcacgccctgcacgacggcaccatcaggctggcc12240 cgcctggcct ccgtccggcc cgagaccggc accgccgcac cggcgctcgc cccggagggc 12300 acggtcctgc tgaccggcgg caccggcggc 
ctgggcggac tggtcgcccg gcacgtggtg 12360 ggcgagtggg gcgtacgacg cctgctgctg gtgagccggc ggggcacgga cgccccgggc 12420 gccgacgagc tcgtgcacga gctggaggcc ctgggagccg acgtctcggt ggccgcgtgc 12480 Sgacgtcgccg accgcgaagc cctcaccgcc gtactcgacg ccatccccgc cgaacacccg 1:3540 ctcaccgcgg tcgtccacac ggcaggcgtc ctctccgacg gcaccctccc gtccatgacg 12600 , acggaggacg tggaacacgt actgcggccc aaggtcgacg ccgcgttcct cctcgacgaa 12660 ctcacctcga cgcccgcata cgacctg~gca gcgttcgtca tgttctcctc cgccgccgcc 12720 gtcttcggtg gcgcggggca gggcgcctac gccgccgcca acgccaccct cgacgccctc 12780 lOgcctggcgcc gccgggcagc cggactcccc gccctctccc tcggctgggg cctctgggcc 12840 gagaccagcg gcatgaccgg cgagctcggc caggcggacc tgcgccggat gagccgcgcg 12900 ggcatcggcg ggatcagcga cgccgagggc atcgcgctcc tcgacgccgc cctccgcgac 12960 gaccgccacc cggtcctgct gcccctgcgg ctcgacgccg ccgggctgcg ggacgcggcc 13020 gggaacgacc cggccggaat cccggcgctc ttccgggacg tcgtcggcgc caggaccgtc 13080 lScgggcccggc cgtccgcggc ctccgcctcg acgacagccg ggacggccgg cacgccgggg 13140 acggcggacg gcgcggcgga aacggcggcg gtcacgctcg ccgaccgggc cgccaccgtg 13200 gacgggcccg cacggcagcg cctgctgctc gagttcgtcg tcggcgaggt cgccgaagta 13260 ctcggccacg cccgcggtca ccggatcgac gccgaacggg gcttcctcga cctcggcttc 13320 gactccctga ccgccgtcga actccgcaac cggctcaact ccgccggtgg cctcgccctc 13380 20ccggcgaccc tggtcttcga ccacccaagc ccggcggcac tcgcctccca cctggacgcc 13440 gagctgccgc gcggcgcctc ggaccagc~ac ggagccggga accggaacgg gaacgagaac 13500 gggacgacgg cgtcccggag caccgccc~ag acggacgcgc tgctggcaca actgacccgc 13560 ctggaaggcg ccttggtgct gacgggccac tcggacgccc ccgggagcga agaagtcctg 13620 gagcacctgc ggtccctgcg ctcgatgc~tc acgggcgaga ccgggaccgg gaccgcgtcc 13680 s'.Sggagccccgg acggcgccgg gtccggcc~cc gaggaccggc cctgggcggc cggggacgga 13740 gccgggggcg ggagtgagga cggcgcgc~ga gtgccggact tcatgaacgc ctcggccgag 13800 gaactcttcg gcctcctcga ccaggacc:cc agcacggact ga 13842 3~0 <210> 31 <211> 4613 <212> PRT
<213> Streptomyces venezuelae
<400> 31
Met Ser Ser Ala Gly Ile Thr Arg Thr Gly Ala Arg Thr Pro Val Thr  16
Gly Arg Gly Ala Ala Ala Trp Asp Thr Gly Glu Val Arg Val Arg Arg  32
Gly Leu Pro Pro Ala Gly Pro Asp His Ala Glu His Ser Phe Ser Arg  48
Ala Pro Thr Gly Asp Val Arg Ala Glu Leu Ile Arg Gly Glu Met Ser  64
Thr Val Ser Lys Ser Glu Ser Glu Glu Phe Val Ser Val Ser Asn Asp  80
Ala Gly Ser Ala His Gly Thr Ala Glu Pro Val Ala Val Val Gly Ile  96
Ser Cys Arg Val Pro Gly Ala Arg Asp Pro Arg Glu Phe Trp Glu Leu  112
Leu Ala Ala Gly Gly Gln Ala Val Thr Asp Val Pro Ala Asp Arg Trp  128
Asn Ala Gly Asp Phe Tyr Asp Pro Asp Arg Ser Ala Pro Gly Arg Ser  144
Asn Ser Arg Trp Gly Gly Phe Ile Glu Asp Val Asp Arg Phe Asp Ala  160
Ala Phe Phe Gly Ile Ser Pro Arg Glu Ala Ala Glu Met Asp Pro Gln  176
Gln Arg Leu Ala Leu Glu Leu Gly Trp Glu Ala Leu Glu Arg Ala Gly  192
Ile Asp Pro Ser Ser Leu Thr Gly Thr Arg Thr Gly Val Phe Ala Gly  208
Ala Ile Trp Asp Asp Tyr Ala Thr Leu Lys His Arg Gln Gly Gly Ala  224
Ala Ile Thr Pro His Thr Val Thr Gly Leu His Arg Gly Ile Ile Ala  240
Asn Arg Leu Ser Tyr Thr Leu Gly Leu Arg Gly Pro Ser Met Val Val  256
Asp Ser Gly Gln Ser Ser Ser Leu Val Ala Val His Leu Ala Cys Glu  272
Ser Leu Arg Arg Gly Glu Ser Glu Leu Ala Leu Ala Gly Gly Val Ser  288
Leu Asn Leu Val Pro Asp Ser Ile Ile Gly Ala Ser Lys Phe Gly Gly  304
Leu Ser Pro Asp Gly Arg Ala Tyr Thr Phe Asp Ala Arg Ala Asn Gly  320
Tyr Val Arg Gly Glu Gly Gly Gly Phe Val Val Leu Lys Arg Leu Ser  336
Arg Ala Val Ala Asp Gly Asp Pro Val Leu Ala Val Ile Arg Gly Ser  352
Ala Val Asn Asn Gly Gly Ala Ala Gln Gly Met Thr Thr Pro Asp Ala  368
Gln Ala Gln Glu Ala Val Leu Arg Glu Ala His Glu Arg Ala Gly Thr  384
Ala Pro Ala Asp Val Arg Tyr Val Glu Leu His Gly Thr Gly Thr Pro  400
Val Gly Asp Pro Ile Glu Ala Ala Ala Leu Gly Ala Ala Leu Gly Thr  416
Gly Arg Pro Ala Gly Gln Pro Leu Leu Val Gly Ser Val Lys Thr Asn  432
Ile Gly His Leu Glu Gly Ala Ala Gly Ile Ala Gly Leu Ile Lys Ala  448
Val Leu Ala Val Arg Gly Arg Ala Leu Pro Ala Ser Leu Asn Tyr Glu  464
Thr Pro Asn Pro Ala Ile Pro Phe Glu Glu Leu Asn Leu Arg Val Asn  480
Thr Glu Tyr Leu Pro Trp Glu Pro Glu His Asp Gly Gln Arg Met Val  496
Val Gly Val Ser Ser Phe Gly Met Gly Gly Thr Asn Ala His Val Val  512
Leu Glu Glu Ala Pro Gly Gly Cys Arg Gly Ala Ser Val Val Glu Ser  528
Thr Val Gly Gly Ser Ala Val Gly Gly Gly Val Val Pro Trp Val Val  544
Ser Ala Lys Ser Ala Ala Ala Leu Asp Ala Gln Ile Glu Arg Leu Ala  560
Ala Phe Ala Ser Arg Asp Arg Thr Asp Gly Val Asp Ala Gly Ala Val  576
Asp Ala Gly Ala Val Asp Ala Gly Ala Val Ala Arg Val Leu Ala Gly  592
Gly Arg Ala Gln Phe Glu His Arg Ala Val Val Val Gly Ser Gly Pro  608
Asp Asp Leu Ala Ala Ala Leu Ala Ala Pro Glu Gly Leu Val Arg Gly  624
Val Ala Ser Gly Val Gly Arg Val Ala Phe Val Phe Pro Gly Gln Gly  640
Thr Gln Trp Ala Gly Met Gly Ala Glu Leu Leu Asp Ser Ser Ala Val  656
Phe Ala Ala Ala Met Ala Glu Cys Glu Ala Ala Leu Ser Pro Tyr Val  672
Asp Trp Ser Leu Glu Ala Val Val Arg Gln Ala Pro Gly Ala Pro Thr  688
Leu Glu Arg Val Asp Val Val Gln Pro Val Thr Phe Ala Val Met Val  704
Ser Leu Ala Arg Val Trp Gln His His Gly Val Thr Pro Gln Ala Val  720
Val Gly His Ser Gln Gly Glu Ile Ala Ala Ala Tyr Val Ala Gly Ala  736
Leu Ser Leu Asp Asp Ala Ala Arg Val Val Thr Leu Arg Ser Lys Ser  752
Ile Ala Ala His Leu Ala Gly Lys Gly Gly Met Leu Ser Leu Ala Leu  768
Ser Glu Asp Ala Val Leu Glu Arg Leu Ala Gly Phe Asp Gly Leu Ser  784
Val Ala Ala Val Asn Gly Pro Thr Ala Thr Val Val Ser Gly Asp Pro  800
Val Gln Ile Glu Glu Leu Ala Arg Ala Cys Glu Ala Asp Gly Val Arg  816
Ala Arg Val Ile Pro Val Asp Tyr Ala Ser His Ser Arg Gln Val Glu  832
Ile Ile Glu Ser Glu Leu Ala Glu Val Leu Ala Gly Leu Ser Pro Gln  848
Ala Pro Arg Val Pro Phe Phe Ser Thr Leu Glu Gly Ala Trp Ile Thr  864
Glu Pro Val Leu Asp Gly Gly Tyr Trp Tyr Arg Asn Leu Arg His Arg  880
Val Gly Phe Ala Pro Ala Val Glu Thr Leu Ala Thr Asp Glu Gly Phe  896
Thr His Phe Val Glu Val Ser Ala His Pro Val Leu Thr Met Ala Leu  912
Pro Gly Thr Val Thr Gly Leu Ala Thr Leu Arg Arg Asp Asn Gly Gly  928
Gln Asp Arg Leu Val Ala Ser Leu Ala Glu Ala Trp Ala Asn Gly Leu  944
Ala
Val Asp Trp Ser Pro Leu Leu Pro Ser Ala Thr Gly His His Ser  960
Asp Leu Pro Thr Tyr Ala Phe Gln Thr Glu Arg His Trp Leu Gly Glu  976
Ile Glu Ala Leu Ala Pro Ala Gly Glu Pro Ala Val Gln Pro Ala Val  992
Leu Arg Thr Glu Ala Ala Glu Pro Ala Glu Leu Asp Arg Asp Glu Gln  1008
Leu Arg Val Ile Leu Asp Lys Val Arg Ala Gln Thr Ala Gln Val Leu  1024
Gly Tyr Ala Thr Gly Gly Gln Ile Glu Val Asp Arg Thr Phe Arg Glu  1040
Ala Gly Cys Thr Ser Leu Thr Gly Val Asp Leu Arg Asn Arg Ile Asn  1056
Ala Ala Phe Gly Val Arg Met Ala Pro Ser Met Ile Phe Asp Phe Pro  1072
Thr Pro Glu Ala Leu Ala Glu Gln Leu Leu Leu Val Val His Gly Glu  1088
Ala Ala Ala Asn Pro Ala Gly Ala Glu Pro Ala Pro Val Ala Ala Ala  1104
Gly Ala Val Asp Glu Pro Val Ala Ile Val Gly Met Ala Cys Arg Leu  1120
Pro Gly Gly Val Ala Ser Pro Glu Asp Leu Trp Arg Leu Val Ala Gly  1136
Gly Gly Asp Ala Ile Ser Glu Phe Pro Gln Asp Arg Gly Trp Asp Val  1152
Glu Gly Leu Tyr His Pro Asp Pro Glu His Pro Gly Thr Ser Tyr Val  1168
Arg Gln Gly Gly Phe Ile Glu Asn Val Ala Gly Phe Asp Ala Ala Phe  1184
Phe Gly Ile Ser Pro Arg Glu Ala Leu Ala Met Asp Pro Gln Gln Arg  1200
Leu Leu Leu Glu Thr Ser Trp Glu Ala Val Glu Asp Ala Gly Ile Asp  1216
Pro Thr Ser Leu Arg Gly Arg Gln Val Gly Val Phe Thr Gly Ala Met  1232
Thr His Glu Tyr Gly Pro Ser Leu Arg Asp Gly Gly Glu Gly Leu Asp  1248
Gly Tyr Leu Leu Thr Gly Asn Thr Ala Ser Val Met Ser Gly Arg Val  1264
Ser Tyr Thr Leu Gly Leu Glu Gly Pro Ala Leu Thr Val Asp Thr Ala  1280
Cys Ser Ser Ser Leu Val Ala Leu His Leu Ala Val Gln Ala Leu Arg  1296
Lys Gly Glu Val Asp Met Ala Leu Ala Gly Gly Val Ala Val Met Pro  1312
Thr Pro Gly Met Phe Val Glu Phe Ser Arg Gln Arg Gly Leu Ala Gly  1328
Asp Gly Arg Ser Lys Ala Phe Ala Ala Ser Ala Asp Gly Thr Ser Trp  1344
Ser Glu Gly Val Gly Val Leu Leu Val Glu Arg Leu Ser Asp Ala Arg  1360
Arg Asn Gly His Gln Val Leu Ala Val Val Arg Gly Ser Ala Leu Asn  1376
Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro Ser Gln  1392
Gln Arg Val Ile Arg Arg Ala Leu Ala Asp Ala Arg Leu Thr Thr Ser  1408
Asp Val Asp Val Val Glu Ala His Gly Thr Gly Thr Arg Leu Gly Asp  1424
Pro Ile Glu Ala
Gln Ala Leu Ile Ala Thr Tyr Gly Gln Gly Arg Asp  1440
Asp Glu Gln Pro Leu Arg Leu Gly Ser Leu Lys Ser Asn Ile Gly His  1456
Thr Gln Ala Ala Ala Gly Val Ser Gly Val Ile Lys Met Val Gln Ala  1472
Met Arg His Gly Leu Leu Pro Lys Thr Leu His Val Asp Glu Pro Ser  1488
Asp Gln Ile Asp Trp Ser Ala Gly Ala Val Glu Leu Leu Thr Glu Ala  1504
Val Asp Trp Pro Glu Lys Gln Asp Gly Gly Leu Arg Arg Ala Ala Val  1520
Ser Ser Phe Gly Ile Ser Gly Thr Asn Ala His Val Val Leu Glu Glu  1536
Ala Pro Val Val Val Glu Gly Ala Ser Val Val Glu Pro Ser Val Gly  1552
Gly Ser Ala Val Gly Gly Gly Val Thr Pro Trp Val Val Ser Ala Lys  1568
Ser Ala Ala Ala Leu Asp Ala Gln Ile Glu Arg Leu Ala Ala Phe Ala  1584
Ser Arg Asp Arg Thr Asp Asp Ala Asp Ala Gly Ala Val Asp Ala Gly  1600
Ala Val Ala His Val Leu Ala Asp Gly Arg Ala Gln Phe Glu His Arg  1616
Ala Val Ala Leu Gly Ala Gly Ala Asp Asp Leu Val Gln Ala Leu Ala  1632
Asp Pro Asp Gly Leu Ile Arg Gly Thr Ala Ser Gly Val Gly Arg Val  1648
Ala Phe Val Phe Pro Gly Gln Gly Thr Gln Trp Ala Gly Met Gly Ala  1664
Glu Leu Leu Asp Ser Ser Ala Val Phe Ala Ala Ala Met Ala Glu Cys  1680
Glu Ala Ala Leu Ser Pro Tyr Val Asp Trp Ser Leu Glu Ala Val Val  1696
Arg Gln Ala Pro Gly Ala Pro Thr Leu Glu Arg Val Asp Val Val Gln  1712
Pro Val Thr Phe Ala Val Met Val Ser Leu Ala Arg Val Trp Gln His  1728
His Gly Val Thr Pro Gln Ala Val Val Gly His Ser Gln Gly Glu Ile  1744
Ala Ala Ala Tyr Val Ala Gly Ala Leu Pro Leu Asp Asp Ala Ala Arg  1760
Val Val Thr Leu Arg Ser Lys Ser Ile Ala Ala His Leu Ala Gly Lys  1776
Gly Gly Met Leu Ser Leu Ala Leu Asn Glu Asp Ala Val Leu Glu Arg  1792
Leu Ser Asp Phe Asp Gly Leu Ser Val Ala Ala Val Asn Gly Pro Thr  1808
Ala Thr Val Val Ser Gly Asp Pro Val Gln Ile Glu Glu Leu Ala Gln  1824
Ala Cys Lys Ala Asp Gly Phe Arg Ala Arg Ile Ile Pro Val Asp Tyr  1840
Ala Ser His Ser Arg Gln Val Glu Ile Ile Glu Ser Glu Leu Ala Gln  1856
Val Leu Ala Gly Leu Ser Pro Gln Ala Pro Arg Val Pro Phe Phe Ser  1872
Thr Leu Glu Gly Thr Trp Ile Thr Glu Pro Val Leu Asp Gly Thr Tyr  1888
Trp Tyr Arg Asn Leu Arg His
Arg Val Gly Phe Ala Pro Ala Ile Glu  1904
Thr Leu Ala Val Asp Glu Gly Phe Thr His Phe Val Glu Val Ser Ala  1920
His Pro Val Leu Thr Met Thr Leu Pro Glu Thr Val Thr Gly Leu Gly  1936
Thr Leu Arg Arg Glu Gln Gly Gly Gln Glu Arg Leu Val Thr Ser Leu  1952
Ala Glu Ala Trp Val Asn Gly Leu Pro Val Ala Trp Thr Ser Leu Leu  1968
Pro Ala Thr Ala Ser Arg Pro Gly Leu Pro Thr Tyr Ala Phe Gln Ala  1984
Glu Arg Tyr Trp Leu Glu Asn Thr Pro Ala Ala Leu Ala Thr Gly Asp  2000
Asp Trp Arg Tyr Arg Ile Asp Trp Lys Arg Leu Pro Ala Ala Glu Gly  2016
Ser Glu Arg Thr Gly Leu Ser Gly Arg Trp Leu Ala Val Thr Pro Glu  2032
Asp His Ser Ala Gln Ala Ala Ala Val Leu Thr Ala Leu Val Asp Ala  2048
Gly Ala Lys Val Glu Val Leu Thr Ala Gly Ala Asp Asp Asp Arg Glu  2064
Ala Leu Ala Ala Arg Leu Thr Ala Leu Thr Thr Gly Asp Gly Phe Thr  2080
Gly Val Val Ser Leu Leu Asp Gly Leu Val Pro Gln Val Ala Trp Val  2096
Gln Ala Leu Gly Asp Ala Gly Ile Lys Ala Pro Leu Trp Ser Val Thr  2112
Gln Gly Ala Val Ser Val Gly Arg Leu Asp Thr Pro Ala Asp Pro Asp  2128
Arg Ala Met Leu Trp Gly Leu Gly Arg Val Val Ala Leu Glu His Pro  2144
Glu Arg Trp Ala Gly Leu Val Asp Leu Pro Ala Gln Pro Asp Ala Ala  2160
Ala Leu Ala His Leu Val Thr Ala Leu Ser Gly Ala Thr Gly Glu Asp  2176
Gln Ile Ala Ile Arg Thr Thr Gly Leu His Ala Arg Arg Leu Ala Arg  2192
Ala Pro Leu His Gly Arg Arg Pro Thr Arg Asp Trp Gln Pro His Gly  2208
Thr Val Leu Ile Thr Gly Gly Thr Gly Ala Leu Gly Ser His Ala Ala  2224
Arg Trp Met Ala His His Gly Ala Glu His Leu Leu Leu Val Ser Arg  2240
Ser Gly Glu Gln Ala Pro Gly Ala Thr Gln Leu Thr Ala Glu Leu Thr  2256
Ala Ser Gly Ala Arg Val Thr Ile Ala Ala Cys Asp Val Ala Asp Pro  2272
His Ala Met Arg Thr Leu Leu Asp Ala Ile Pro Ala Glu Thr Pro Leu  2288
Thr Ala Val Val His Thr Ala Gly Ala Leu Asp Asp Gly Ile Val Asp  2304
Thr Leu Thr Ala Glu Gln Val Arg Arg Ala His Arg Ala Lys Ala Val  2320
Gly Ala Ser Val Leu Asp Glu Leu Thr Arg Asp Leu Asp Leu Asp Ala  2336
Phe Val Leu Phe Ser Ser Val Ser Ser Thr Leu Gly Ile Pro Gly Gln  2352
Gly Asn Tyr Ala Pro His Asn Ala Tyr Leu Asp Ala Leu Ala Ala Arg  2368
Arg Arg Ala Thr Gly Arg Ser Ala Val Ser Val Ala Trp Gly Pro Trp  2384
Asp Gly Gly Gly Met Ala Ala Gly Asp Gly Val Ala Glu Arg Leu Arg  2400
Asn His Gly Val Pro Gly Met Asp Pro Glu Leu Ala Leu Ala Ala Leu  2416
Glu Ser Ala Leu Gly Arg Asp Glu Thr Ala Ile Thr Val Ala Asp Ile  2432
Asp Trp Asp Arg Phe Tyr Leu Ala Tyr Ser Ser Gly Arg Pro Gln Pro  2448
Leu Val Glu Glu Leu Pro Glu Val Arg Arg Ile Ile Asp Ala Arg Asp  2464
Ser Ala Thr Ser Gly Gln Gly Gly Ser Ser Ala Gln Gly Ala Asn Pro  2480
Leu Ala Glu Arg Leu Ala Ala Ala Ala Pro Gly Glu Arg Thr Glu Ile  2496
Leu Leu Gly Leu Val Arg Ala Gln Ala Ala Ala Val Leu Arg Met Arg  2512
Ser Pro Glu Asp Val Ala Ala Asp Arg Ala Phe Lys Asp Ile Gly Phe  2528
Asp Ser Leu Ala Gly Val Glu Leu Arg Asn Arg Leu Thr Arg Ala Thr  2544
Gly Leu Gln Leu Pro Ala Thr Leu Val Phe Asp His Pro Thr Pro Leu  2560
Ala Leu Val Ser Leu Leu Arg Ser Glu Phe Leu Gly Asp Glu Glu Thr  2576
Ala Asp Ala Arg Arg Ser Ala Ala Leu Pro Ala Thr Val Gly Ala Gly  2592
Ala Gly Ala Gly Ala Gly Thr Asp Ala Asp Asp Asp Pro Ile Ala Ile  2608
Val Ala Met Ser Cys Arg Tyr Pro Gly Asp Ile Arg Ser Pro Glu Asp  2624
Leu Trp Arg Met Leu Ser Glu Gly Gly Glu Gly Ile Thr Pro Phe Pro  2640
Thr Asp Arg Gly Trp Asp Leu Asp Gly Leu Tyr Asp Ala Asp Pro Asp  2656
Ala Leu Gly Arg Ala Tyr Val Arg Glu Gly Gly Phe Leu His Asp Ala  2672
Ala Glu Phe Asp Ala Glu Phe Phe Gly Val Ser Pro Arg Glu Ala Leu  2688
Ala Met Asp Pro Gln Gln Arg Met Leu Leu Thr Thr Ser Trp Glu Ala  2704
Phe Glu Arg Ala Gly Ile Glu Pro Ala Ser Leu Arg Gly Ser Ser Thr  2720
Gly Val Phe Ile Gly Leu Ser Tyr Gln Asp Tyr Ala Ala Arg Val Pro  2736
Asn Ala Pro Arg Gly Val Glu Gly Tyr Leu Leu Thr Gly Ser Thr Pro  2752
Ser Val Ala Ser Gly Arg Ile Ala Tyr Thr Phe Gly Leu Glu Gly Pro  2768
Ala Thr Thr Val Asp Thr Ala Cys Ser Ser Ser Leu Thr Ala Leu His  2784
Leu Ala Val Arg Ala Leu Arg Ser Gly Glu Cys Thr Met Ala Leu Ala  2800
Gly Gly Val Ala Met Met Ala Thr Pro His Met Phe Val Glu Phe Ser  2816
Arg Gln Arg Ala Leu Ala Pro Asp Gly Arg Ser Lys Ala Phe Ser Ala  2832
Asp Ala Asp Gly Phe Gly Ala Ala Glu Gly Val Gly Leu Leu Leu Val  2848
Glu Arg Leu Ser Asp Ala Arg Arg Asn Gly His Pro Val Leu Ala Val  2864
Val Arg Gly Thr Ala Val Asn Gln Asp Gly Ala Ser Asn Gly Leu Thr  2880
Ala Pro Asn Gly Pro Ser Gln Gln Arg Val Ile Arg Gln Ala Leu Ala  2896
Asp Ala Arg Leu Ala Pro Gly Asp Ile Asp Ala Val Glu Thr His Gly  2912
Thr Gly Thr Ser Leu Gly Asp Pro Ile Glu Ala Gln Gly Leu Gln Ala  2928
Thr Tyr Gly Lys Glu Arg Pro Ala Glu Arg Pro Leu Ala Ile Gly Ser  2944
Val Lys Ser Asn Ile Gly His Thr Gln Ala Ala Ala Gly Ala Ala Gly  2960
Ile Ile Lys Met Val Leu Ala Met Arg His Gly Thr Leu Pro Lys Thr  2976
Leu His Ala Asp Glu Pro Ser Pro His Val Asp Trp Ala Asn Ser Gly  2992
Leu Ala Leu Val Thr Glu Pro Ile Asp Trp Pro Ala Gly Thr Gly Pro  3008
Arg Arg Ala Ala Val Ser Ser Phe Gly Ile Ser Gly Thr Asn Ala His  3024
Val Val Leu Glu Gln Ala Pro Asp Ala Ala Gly Glu Val Leu Gly Ala  3040
Asp Glu Val Pro Glu Val Ser Glu Thr Val Ala Met Ala Gly Thr Ala  3056
Gly Thr Ser Glu Val Ala Glu Gly Ser Glu Ala Ser Glu Ala Pro Ala  3072
Ala Pro Gly Ser Arg Glu Ala Ser Leu Pro Gly His Leu Pro Trp Val  3088
Leu Ser Ala Lys Asp Glu Gln Ser Leu Arg Gly Gln Ala Ala Ala Leu  3104
His Ala Trp Leu Ser Glu Pro Ala Ala Asp Leu Ser Asp Ala Asp Gly  3120
Pro Ala Arg Leu Arg Asp Val Gly Tyr Thr Leu Ala Thr Ser Arg Thr  3136
Ala Phe Ala His Arg Ala Ala Val Thr Ala Ala Asp Arg Asp Gly Phe  3152
Leu Asp Gly Leu Ala Thr Leu Ala Gln Gly Gly Thr Ser Ala His Val  3168
His Leu Asp Thr Ala Arg Asp Gly Thr Thr Ala Phe Leu Phe Thr Gly  3184
Gln Gly Ser Gln Arg Pro Gly Ala Gly Arg Glu Leu Tyr Asp Arg His  3200
Pro Val Phe Ala Arg Ala Leu Asp Glu Ile Cys Ala His Leu Asp Gly  3216
His Leu Glu Leu Pro Leu Leu Asp Val Met Phe Ala Ala Glu Gly Ser  3232
Ala Glu Ala Ala Leu Leu Asp Glu Thr Arg Tyr Thr Gln Cys Ala Leu  3248
Phe Ala Leu Glu Val Ala Leu Phe Arg Leu Val Glu Ser Trp Gly Met  3264
Arg Pro Ala Ala Leu Leu Gly His Ser Val Gly Glu Ile Ala Ala Ala  3280
His Val Ala Gly Val Phe Ser Leu Ala Asp Ala Ala Arg Leu Val Ala  3296
Ala Arg Gly Arg Leu Met Gln Glu Leu Pro Ala Gly Gly Ala Met Leu  3312
Ala Val Gln Ala Ala Glu Asp Glu Ile Arg Val Trp Leu Glu Thr Glu  3328
Glu Arg Tyr Ala Gly Arg Leu Asp Val Ala Ala Val Asn Gly Pro Glu  3344
Ala Ala Val Leu Ser Gly Asp Ala Asp Ala Ala Arg Glu Ala Glu Ala  3360
Tyr Trp Ser Gly Leu Gly Arg Arg Thr Arg Ala Leu Arg Val Ser His  3376
Ala Phe His Ser Ala His Met Asp Gly Met Leu Asp Gly Phe Arg Ala  3392
Val Leu Glu Thr Val Glu Phe Arg Arg Pro Ser Leu Thr Val Val Ser  3408
Asn Val Thr Gly Leu Ala Ala Gly Pro Asp Asp Leu Cys Asp Pro Glu  3424
Tyr Trp Val Arg
His Val Arg Gly Thr Val Arg Phe Leu Asp Gly Val  3440
Arg Val Leu Arg Asp Leu Gly Val Arg Thr Cys Leu Glu Leu Gly Pro  3456
Asp Gly Val Leu Thr Ala Met Ala Ala Asp Gly Leu Ala Asp Thr Pro  3472
Ala Asp Ser Ala Ala Gly Ser Pro Val Gly Ser Pro Ala Gly Ser Pro  3488
Ala Asp Ser Ala Ala Gly Ala Leu Arg Pro Arg Pro Leu Leu Val Ala  3504
Leu Leu Arg Arg Lys Arg Ser Glu Thr Glu Thr Val Ala Asp Ala Leu  3520
Gly Arg Ala His Ala His Gly Thr Gly Pro Asp Trp His Ala Trp Phe  3536
Ala Gly Ser Gly Ala His Arg Val Asp Leu Pro Thr Tyr Ser Phe Arg  3552
Arg Asp Arg Tyr Trp Leu Asp Ala Pro Ala Ala Asp Thr Ala Val Asp  3568
Thr Ala Gly Leu Gly Leu Gly Thr Ala Asp His Pro Leu Leu Gly Ala  3584
Val Val Ser Leu Pro Asp Arg Asp Gly Leu Leu Leu Thr Gly Arg Leu  3600
Ser Leu Arg Thr His Pro Trp Leu Ala Asp His Ala Val Leu Gly Ser  3616
Val Leu Leu Pro Gly Ala Ala Met Val Glu Leu Ala Ala His Ala Ala  3632
Glu Ser Ala Gly Leu Arg Asp Val Arg Glu Leu Thr Leu Leu Glu Pro  3648
Leu Val Leu Pro Glu His Gly Gly Val Glu Leu Arg Val Thr Val Gly  3664
Ala Pro Ala Gly Glu Pro Gly Gly Glu Ser Ala Gly Asp Gly Ala Arg  3680
Pro Val Ser Leu His Ser Arg Leu Ala Asp Ala Pro Ala Gly Thr Ala  3696
Trp Ser Cys His Ala Thr Gly Leu Leu Ala Thr Asp Arg Pro Glu Leu  3712
Pro Val Ala Pro Asp Arg Ala Ala Met Trp Pro Pro Gln Gly Ala Glu  3728
Glu Val Pro Leu Asp Gly Leu Tyr Glu Arg Leu Asp Gly Asn Gly Leu  3744
Ala Phe Gly Pro Leu Phe Gln Gly Leu Asn Ala Val Trp Arg Tyr Glu  3760
Gly Glu Val Phe Ala Asp Ile Ala Leu Pro Ala Thr Thr Asn Ala Thr  3776
Ala Pro Ala Thr Ala Asn Gly Gly Gly Ser Ala Ala Ala Ala Pro Tyr  3792
Gly Ile His Pro Ala Leu Leu Asp Ala Ser Leu His Ala Ile Ala Val  3808
Gly Gly Leu Val Asp Glu Pro Glu Leu Val Arg Val Pro Phe His Trp  3824
Ser Gly Val Thr Val His Ala Ala Gly Ala Ala Ala Ala Arg Val Arg  3840
Leu Ala Ser Ala Gly Thr Asp Ala Val Ser Leu Ser Leu Thr Asp Gly  3856
Glu Gly Arg Pro Leu Val Ser Val Glu Arg Leu Thr Leu Arg Pro Val  3872
Thr Ala Asp Gln Ala Ala Ala Ser Arg Val Gly Gly Leu Met His Arg  3888
Val Ala Trp Arg Pro Tyr Ala Leu Ala Ser Ser Gly Glu Gln Asp Pro  3904
His Ala Thr Ser Tyr Gly Pro Thr Ala Val Leu Gly Lys Asp Glu Leu Lys Val AIa Ala Ala Leu Glu Ser Ala Gly Val Glu Val Gly Leu Tyr Pro Asp Leu Ala Ala Leu Ser Gln Asp Val Ala Ala Gly Ala Pro Ala 1~ 3940 3945 3950 Pro Arg Thr Val Leu Ala Pro Leu Pro Ala Gly Pro Ala Asp Gly Gly Ala Glu Gly Val Arg Gly Thr Val Ala Arg Thr Leu Glu Leu Leu Gln 1$ Ala Trp Leu Ala Asp Glu His :Leu Ala Gly Thr Arg Leu Leu Leu Val Thr Arg Gly Ala Val Arg Asp :Pro Glu Gly Ser Gly Ala Asp Asp Gly Gly Glu Asp Leu Ser His Ala ~Ala Ala Trp Gly Leu Val Arg Thr Ala Gln Thr Glu Asn Pro Gly Arg ~Phe Gly Leu Leu Asp Leu Ala Asp Asp Ala Ser Ser Tyr Arg Thr Leu 1?ro Ser Val Leu Ser Asp Ala Gly Leu :!$ Arg Asp Glu Pro Gln Leu Ala Leu His Aep Gly Thr Ile Arg Leu Ala Arg Leu Ala Ser Val Arg Pro GIu Thr Gly Thr Ala Ala Pro Ala Leu Ala Pro Glu Gly Thr Val Leu Leu Thr Gly Gly Thr Gly Gly Leu Gly :« 4100 4105 4110 Gly Leu Val Ala Arg His Val Val Gly Glu Trp Gly Val Arg Arg Leu Leu Leu Val Ser Arg Arg Gly Thr Asp Ala Pro Gly Ala Asp Glu Leu :~$ Val His Glu Leu Glu Ala Leu Gly Ala Asp Val Ser Val Ala Ala Cys Asp Val Ala Asp Arg Glu Ala Leu Thr Ala Val Leu Asp Ala Ile Pro Ala Glu His Pro Leu Thr Ala Val Val His Thr Ala Gly Val Leu Ser WO 00/00620 PC'T/US99/14398 Asp Gly Thr Leu Pro Ser Met Thr Thr Glu Asp Val Glu His Val Leu Arg Pro Lys Val Asp Ala Ala Phe Leu Leu Asp Glu Leu Thr Ser Thr $ Pro Ala Tyr Asp Leu Ala Ala Phe Val Met Phe Ser Ser Ala Ala Ala Val Phe Gly Gly Ala Gly Gln Gly Ala Tyr Ala Ala Ala Asn Ala Thr Leu Asp Ala Leu Ala Trp Arg .Arg Arg Ala Ala GIy Leu Pro Ala Leu Ser Leu Gly Trp Gly Leu Trp ;Ala Glu Thr Ser Gly Met Thr Gly Glu Leu Gly Gln Ala Asp Leu Arg ;Arg Met Ser Arg Ala Gly Ile Gly Gly ~$ Ile Ser Asp Ala Glu Gly Ile i~lla Leu Leu Asp Ala Ala Leu Arg Asp Asp Arg His Pro Val Leu Leu 1?ro Leu Arg Leu Asp Ala Ala Gly Leu Arg Asp Ala Ala Gly Asn Asp 1?ro Ala Gly Ile Pro Ala Leu Phe Arg Asp Val Val Gly Ala Arg Thr Val Arg Ala Arg Pro Ser Ala Ala Ser Ala Ser Thr Thr Ala Gly Thr Ala Gly Thr Pro Gly Thr 
Ala Asp Gly Ala Ala Glu Thr Ala Ala Val Thr Leu Ala Asp Arg Ala Ala Thr Val Asp Gly Pro Ala Arg Gln Arg Leu Leu Leu Glu Phe Val Val Gly Glu Val Ala Glu Val Leu Gly His Ala Arg Gly His Arg Ile Asp Ala Glu Arg Gly Phe Leu Asp Leu Gly Phe Asp Ser Leu Thr Ala Val Glu Leu Arg Asn Arg Leu Asn Ser Ala Gly Gly Leu Ala Leu Pro Ala Thr Leu Val Phe Asp His Pro Ser Pro Ala Ala Leu Ala Ser His Leu Asp Ala Glu Leu Pro Arg Gly Ala Ser Asp Gln Asp Gly Ala Gly Asn Arg Asn Gly Asn Glu Asn Gly Thr Thr Ala Ser Arg Ser Thr Ala Glu Thr Asp Ala Leu Leu Ala Gln Leu Thr Arg Leu Glu Gly Ala Leu Val Leu Thr Gly Leu Ser Asp Ala Pro Gly Ser Glu Glu Val Leu Glu His Leu Arg Ser Leu Arg Ser Met Val Thr Gly Glu Thr Gly Thr Gly Thr Ala Ser Gly Ala Pro Asp Gly Ala Gly Ser Gly Ala Glu Asp Arg Pro Trp Ala Ala Gly Asp Gly Ala Gly Gly Gly Ser Glu Asp Gly Ala Gly Val Pro 4580 4585 4590 Asp Phe Met Asn Ala Ser Ala Glu Glu Leu Phe Gly Leu Leu Asp Gln Asp Pro Ser Thr Asp <210> 32 <211> 11220 <212> DNA
<213> Streptomyces venezuelae <400> 32 gtgtccacggtgaacgaagagaagtacctcgactacctgcgtcgtgccacggcggacctc 60 cacgaggcccgtggccgcctccgcgagctggaggcgaaggcgggcgagccggtggcgatc 120 gtcggcatggcctgccgcctgcccggcggcgtcgcctcgcccgaggacctgtggcggctg 180 gtggccggcggcgaggacgcgatctcggagttcccccaggaccgcggctgggacgtggag 240 ggcctgtacgacccgaacccggaggccacgggcaagagttacgcccgcgaggccggattc 300 ctgtacgaggcgggcgagttcgacgccgacttcttcgggatctcgccgcgcgaggccctc 360 gccatggacccgcagcagcgtctcctcctggaggcctcctgggaggcgttcgagcacgcc 420

gggatcccggcggccaccgcgcgcggcacctcggtcggcgtcttcaccggcgtgatgtac 480 cacgactacgccacccgtctcaccgatgtcccggagggcatcgagggctacctgggcacc 540 ggcaactccggcagtgtcgcctcgggccgcgtcgcgtacacgcttggcctggaggggccg 600 gccgtcacggtcgacaccgcctgctcgtcctcgctggtcgccctgcacctcgccgtgcag 660 gccctgcgcaagggcgaggtcgacatggcgctcgccggcggcgtgacggtcatgtcgacg 720 cccagcaccttcgtcgagttcagccgtcagcgcgggctggcgccggacggccggtcgaag 780 tccttctcgtcgacggccgacggcaccagctggtccgagggcgtcggcgtcctcctcgtc 840 gagcgcctgtccgacgcgcgtcgcaagggccatcggatcctcgccgtggtccggggcacc 900 gccgtcaaccaggacggcgccagcagcggcctcacggctccgaacgggccgtcgcagcag 960 cgcgtcatccgacgtgccctggcggacgcccggctcacgacctccgacgtggacgtcgtc 1020 gaggcccacggcacgggtacgcgactcggcgacccgatcgaggcgcaggccgtcatcgcc 1080 acgtacgggcagggccgtgacggcgaacagccgctgcgcctcgggtcgttgaagtccaac 1140 atcggacacacccaggccgccgccggtgtctccggcgtgatcaagatggtccaggcgatg 1200 cgccacggcgtcctgccgaagacgctccacgtggagaagccgacggaccaggtggactgg 1260 tccgcgggcgcggtcgagctgctcaccgaggccatggactggccggacaagggcgacggc 1320 ggactgcgcagggccgcggtctcctccttcggcgtcagcgggacgaacgcgcacgtcgtg 1380 ctcgaagaggccccggcggccgaggagacccctgcctccgaggcgaccccggccgtcgag 1440 ccgtcggtcggcgccggcctggtgccgtggctggtgtcggcgaagactccggccgcgctg 1500 gacgcccagatcggacgcctcgccgcgttcgcctcgcagggccgtacggacgccgccgat 1560 ccgggcgcggtcgctcgcgtactggccggcgggcgcgccgagttcgagcaccgggccgtc 1620 gtgctcggcaccggacaggacgatttcgcgcaggcgctgaccgctccggaaggactgata 1680 cgcggcacgccctcggacgtgggccgggtggcgttcgtgttccccggtcagggcacgcag 1740 tgggccgggatgggcgccgaactcctcgacgtgtcgaaggagttcgcggcggccatggcc 1800 gagtgcgagagcgcgctctcccgctatgtcgactggtcgctggaggccgtcgtccggcag 1860 gcgccgggcgcgcccacgctggagcgggtcgacgtcgtccagcccgtgaccttcgctgtc 1920 atggtttcgctggcgaaggtctggcagcaccacggcgtgacgccgcaggccgtcgtcggc 1980 cactcgcagggcgagatcgccgccgcgtacgtcgccggtgccctcaccctcgacgacgcc 2040 gcccgcgtcgtcaccctgcgcagcaagtccatcgccgcccacctcgccggcaagggcggc 2100 atgatctccctcgccctcagcgaggaagccacccggcagcgcatcgagaacctccacgga 2160 ctgtcgatcg
ccgccgtcaacggccccaccgccaccgtggtttcgggcgaccccacccag 2220 atccaagagc tcgctcaggcgtgtgagc3ccgacggggtccgcgcacggatcatccccgtc 2280 :L~gactacgcctcccacagcgcccacgtcgagaccatcgagagcgaactcgccgaggtcctc 2340 gccgggctca gcccgcggacacctgagc3tgccgttcttctcgacactcgaaggcgcctgg 2400 atcaccgagc cggtgctcgacggcacctactggtaccgcaacctccgccaccgcgtcggc 2460 ttcgcccccg ccgtcgagaccctcgccaccgacgaaggcttcacccacttcatcgaggtc 2520 agcgcccacc ccgtcctcaccatgacccaccccgagaccgtcaccggcctcggcaccctc 2580 Scgccgcgaacagggaggccaggagcgtctggtcacctcactcgccgaagcctggaccaac 2640 ggcctcacca tcgactgggcgcccgtc<accccaccgcaaccggccaccaccccgagctc 2700 cccacctacg ccttccagcgccgtcact:actggctccacgactcccccgccgtccagggc 2760 tccgtgcagg actcctggcgctaccgcatcgactggaagcgcctcgcggtcgccgacgcg 2820 tccgagcgcg ccgggctgtccgggcgct:ggctcgtcgtcgtccccgaggaccgttccgcc 2880 30gaggccgccccggtgctcgccgcgctgt:ccggcgccggcgccgaccccgtacagctggac 2940 gtgtccccgc tgggcgaccggcagcggc;tcgccgcgacgctgggcgaggccctggcggcg 3000 gccggtggag ccgtcgacggcgtcctct:cgctgctcgcgtgggacgagagcgcgcacccc 3060 ggccaccccg cccccttcacccggggcaccggcgccaccctcaccctggtgcaggcgctg 3120 gaggacgccg gcgtcgccgccccgctgt:ggtgcgtgacccacggcgcggtgtccgtcggc 3180 ~~Scgggccgaccacgtcacctcccccgccc;aggccatggtgtggggcatgggccgggtcgcc 3240 gccctggagc accccgagcggtggggcc~gcctgatcgacctgccctcggacgccgaccgg 3300 gcggccctgg accgcatgaccacggtcc;tcgccggcggtacgggtgaggaccaggtcgcg 3360 gtacgcgcct ccgggctgctcgcccgcc;gcctcgtccgcgcctccctcccggcgcacggc 3420 acggcttcgc cgtggtggcaggccgacc~gcacggtgctcgtcaccggtgccgaggagcct 3480 <<Ogcggccgccgaggccgcacgccggctgqcccgcgacggcgccggacacctcctcctccac 3540 accaccccctccggcagcgaaggcgcc:gaaggcacctccggtgccgccgaggactccggc3600 ctcgccgggctcgtcgccgaactcgcc~gacctgggcgcgacggccaccgtcgtgacctgc3660 gacctcacggacgcggaggcggccgcc:cggctgctcgccggcgtctccgacgcgcacccg3720 ctcagcgccgtcctccacctgccgccc:accgtcgactccgagccgctcgccgcgaccgac3780 Sgcggacgcgctcgcccgtgtcgtgacc:gcgaaggccaccgccgcgctccacctggaccgc3840 ctcctgcgggaggccgcggctgccgga.ggccgtccgcccgtcctggtcctcttctcctcg3900 , 
gtcgccgcgatctggggcggcgccggtcagggcgcgtacgccgccggtacggccttcctc3960 gacgccctcgccggtcagcaccgggccgacggccccaccgtgacctcggtggcctggagc4020 ccctgggagggcagccgcgtcaccgagggtgcgaccggggagcggctgcgccgcctcggc4080 l0ctgcgccccctcgcccccgcgacggcgctcaccgccctggacaccgcgctcggccacggc4140 gacaccgccgtcacgatcgccgacgtcgactggtcgagcttcgcccccggcttcaccacg4200 gcccggccgggcaccctcctcgccgatctgcccgaggcgcgccgcgcgctcgacgagcag,260 cagtcgacgacggccgccgacgacaccgtcctgagccgcgagctcggtgcgctcaccggc4320 gccgaacagcagcgccgtatgcaggagttggtccgcgagcacctcgccgtggtcctcaac4380 lScacccctcccccgaggccgtcgacacggggcgggccttccgtgacctcggattcgactcg4440 ctgacggcggtcgagctccgcaaccgcctcaagaacgccaccggcctggccctcccggcc4500 actctggtcttcgactacccgaccccccggacgctggcggagttcctcctcgcggagatc4560 ctgggcgagcaggccggtgccggcgagcagcttccggtggacggcggggtcgacgacgag4620 cccgtcgcgatcgtcggcatggcgtgccgcctgccgggcggtgtcgcctcgccggaggac4680 20ctgtggcggctggtggccggcggcgag~gacgcgatctccggcttcccgcaggaccgcggc4740 tgggacgtggaggggctgtacgacccg~gacccggacgcgtccgggcggacgtactgccgt4800 gccggtggcttcctcgacgaggcgggcgagttcgacgccgacttcttcgggatctcgccg4860 cgcgaggccctcgccatggacccgcagcagcggctcctcctggagacctcctgggaggcc4920 gtcgaggacgccgggatcgacccgacctcccttcaggggcagcaggtcggcgtgttcgcg4980 2Sggcaccaacggcccccactacgagccgctgctccgcaacaccgccgaggatcttgagggt5040 tacgtcgggacgggcaacgccgccagcatcatgtcgggccgtgtctcgtacaccctcggc5100 ctggagggcccggccgtcacggtcgacaccgcctgctcctcctcgctggtcgccctgcac5160 ctcgccgtgcaggccctgcgcaagggcgaatgcggactggcgctcgcgggcggtgtgacg5220.

gtcatgtcgacgcccacgacgttcgtggagttcagccggcagcgcgggctcgcggaggac5280 :30ggccggtcgaaggcgttcgccgcgtcggcggacggcttcggcccggcggagggcgtcggc5340 atgctcctcgtcgagcgcctgtcggacgcccgccgcaacggacaccgtgtgctggcggtc5400 gtgcgcggcagcgcggtcaaccaggacggcgcgagcaacggcctgaccgccccgaacggg5460 ccctcgcagcagcgcgtcatccggcgcc3cgctcgcggacgcccgactgacgaccgccgac5520 gtggacgtcgtcgaggcccacggcacgggcacgcgactcggcgacccgatcgaggcacag5580 :3Sgccctcatcgccacctacggccaggggcgcgacaccgaacagccgctgcgcctggggtcg5640 ttgaagtccaacatcggacacacccaggccgccgccggtgtctccggcatcatcaagatg5700 gtccaggcgatgcgccacggcgtcctgccgaagacgctccacgtggaccggccgtcggac5760 cagatcgactggtcggcgggcacggtcgagctgctcaccgaggccatggactggccgagg5820 aagcaggagggcgggctgcgccgcgcggccgtctcctccttcggcatcagcggcacgaac5880 40gcgcacatcgtgctcgaagaagccccggtcgacgaggacgccccggcggacgagccgtcg5940 WO 00/OOb20 PCT/US99/14398 gtcggcggtg tggtgccgtggctcgtgtccgcgaagactccggccgcgctggacgcccag 6000 atcggacgcc tcgccgcgttcgcctcgcagggccgtacggacgccgccgatccgggcgcg 6060 gtcgctcgcg tactggccggcgggcgtgcgcagttcgagcaccgggccgtcgcgctcggc X120 accggacagg acgacctggcggccgcactggccgcgcctgagggtctggtccggggtgtg 6'180 Sgcctccggtgtgggtcgagtggcgttcgtgttcccgggacagggcacgcagtgggccggg 6240 atgggtgccg aactcctcgacgtgtcg~aaggagttcgcggcggccatggccgagtgcgag 6300 gccgcgctcg ctccgtacgtggactggtcgctggaggccgtcgtccgacaggcccccggc 6360 gcgcccacgc tggagcgggtcgatgtcgtccagcccgtgacgttcgccgtcatggtctcg 6420 ctggcgaagg tctggcagcaccacggg<~tgaccccgcaagccgtcgtcggccactcgcag 6480 .~Oggcgagatcgccgccgcgtacgtcgccggtgcectgagcctggacgacgccgctcgtgtc 6540 gtgaccctgc gcagcaagtccatcggcgcccacctcgcgggccagggcggcatgctgtcc 6600 ctcgcgctga gcgaggcggccgttgtgc~agcgactggccgggttcgacgggctgtccgtc 6660 gccgccgtca acgggcctaccgccaccgtggtttcgggcgacccgacccagatccaagag 6720 ctcgctcagg cgtgtgaggccgacgggc3tccgcgcacggatcatccccgtcgactacgcc 6780 l~5tcccacagcgcccacgtcgagaccatcgagagcgaactcgccgacgtcctggcggggttg 6840 tccccccaga caccccaggtccccttcttctccaccctcgaaggcgcctggatcaccgaa 6900 cccgccctcg acggcggctactggtacc:gcaacctccgccatcgtgtgggcttcgccccg 6960 gccgtcgaaa 
ccctggccaccgacgaaggcttcacccacttcgtcgaggtcagcgcccac 7020 cccgtcctca ccatggcgctgcccgagaccgtcaccggactcggcaccctccgccgtgac 7080 :!Oaacggcggacagcaccgcctcaccacct:ccctcgccgaggcctgggccaacggcctcacc 7140 gtcgactggg cctctctcctccccaccacgaccacccaccccgatctgcccacctacgcc 7200 ttccagaccg agcgctactggccgcagc:ccgacctctccgccgccggtgacatcacctcc 7260 gccggtctcg gggcggccgagcacccg<agctcggcgcggccgtggcgctcgcggactcc 7320 gacggctgcc tgctcacggggagcctct:ccctccgtacgcacccctggctggcggaccac 7380 i!Sgcggtggccggcaccgtgctgctgccgc~gaacggcgttcgtggagctggcgttccgagcc 7440 ggggaccagg tcggttgcgatctggtcc~aggagctcaccctcgacgcgccgctcgtgctg 7500 ccccgtcgtg gcgcggtccgtgtgcag<agtccgtcggcgcgagcgacgagtccgggcgt 7560 cgtaccttcg ggctctacgcgcacccgc~aggacgcgccgggcgaggcggagtggacgcgg 7620 cacgccaccg gtgtgctggccgcccgt<~cggaccgcaccgcccccgtcgccgacccggag 7680 :~Ogcctggccgccgccgggcgccgagccgc~tggacgtggacggtctgtacgagcgcttcgcg 7740 gcgaacggct acggctacggccccctctacagggcgtccgtggtgtctggcggcgtggc 7800 c gacgaggtgt tcgccgacgtggccctgc:cggccgaggtcgccggtgccgagggcgcgcgg 7860 ttcggccttc acccggcgctgctcgacc~ccgccgtgcaggcggccggtgcgggccggggc 7920 gttcggcgcg ggcacgcggctgccgttc:gcctggagcgggatctcctgtacgcggtcggc 7980 ~~Sgccaccgccctccgcgtgcggctggccc:ccgccggcccggacacggtgtccgtgagcgcc 8040 gccgactcct ccgggcagccggtgttcc~ccgcggactccctcacggtgctgcccgtcgac 8100 cccgcgcagc tggcggccttcagcgacc:cgactctggacgcgctgcacctgctggagtgg 8160 accgcctggg acggtgccgcgcaggcccagcccggcgcggtcgtgctgggcggcgacgcc 8220 gacggtctcg ccgcggcgctgcgcgccc~gtggcaccgaggtcctgtccttcccggacctt 8280 40acggacctggtggaggccgtcgaccggqgcgagaccccggccccggcgaccgtcctggtg 8340 gcctgccccgccgccggccccgatgggccggagcatgtccgcgaggccctgcacgggtcg8400 ctcgcgctgatgcaggcctggctggccgacgagcggttcaccgatgggcgcctggtgctc8460 gtgacccgcgacgcggtcgccgcccgttccggcgacggcctgcggtccacgggacaggcc8520 gccgtctggggcctcggccggtccgcgcagacggagagcccgggccggttcgtcctgctc8580 Sgacctcgccggggaagcccggacggccggggacgccaccgccggggacggcctgacgacc8640 ggggacgccaccgtcggcggcacctctggagacgccgccctcggcagcgccctcgcgacc8700 
gccctcggctcgggcgagccgcagctcgccctccgggacggggcgctcctcgtaccccgc8760 ctggcgcgggccgccgcgcccgccgcggccgacggcctcgccgcggccgacggcctcgcc8820 gctctgccgctgcccgccgctccggccctctggcgtctggagcccggtacggacggcagc8880 lOctggagagcctcacggcggcgcccggcgacgccgagaccctcgccccggagccgctcggc8940 ccgggacaggtccgcatcgcgatccgggccaccggtctcaacttccgcgacgtcctgatcSr000 gccctcggcatgtaccccgatccggcgctgatgggcaccgagggagccggcgtggtcacc9060 gcgaccggccccggcgtcacgcacctcgcccccggcgaccgggtcatgggcctgctctcc9120 ggcgcgtacgccccggtcgtcgtggcg~gacgcgcggaccgtcgcgcggatgcccgagggg9180 lStggacgttcgcccagggcgcctccgtgccggtggtgttcctgacggccgtctacgccctg9240 cgcgacctggcggacgtcaagcccggc~gagcgcctcctggtccactccgccgccggtggc9300 gtgggcatggccgccgtgcagctcgcccggcactggggcgtggaggtccacggcacggcg9360 agtcacgggaagtgggacgccctgcgc!~cgctcggcctggacgacgcgcacatcgcctcc9420 tcccgcaccctggacttcgagtccgcgttccgtgccgcttccggcggggcgggcatggac9480 :ZOgtcgtactgaactcgctcgcccgcgagttcgtcgacgcctcgctgcgcctgctcgggccg9540 ggcggccggttcgtggagatggggaag;accgacgtccgcgacgcggagcgggtcgccgcc9600 gaccaccccggtgtcggctaccgcgccttcgacctgggcgaggccgggccggagcggatc9660 ggcgagatgctcgccgaggtcatcgccctcttcgaggacggggtgctccggcacctgccc9720 gtcacgacctgggacgtgcgccgggcccgcgacgccttccggcacgtcagccaggcccgc9780 :~Scacacgggcaaggtcgtcctcacgatgccgtcgggcctcgacccggagggtacggtcctg9840 ctgaccggcggcaccggtgcgctggggggcatcgtggcccggcacgtggtgggcgagtgg9900 ggcgtacgacgcctgctgctcgtgagccggcggggcacggacgccccgggcgccggcgag9960 ctcgtgcacgagctggaggccctgggagccgacgtctcggtggccgcgtgcgacgtcgcc10020 .

gaccgcgaagccctcaccgccgtactcc3actcgatccccgccgaacacccgctcaccgcg10080 :i0gtcgtccacacggcaggcgtcctctccgacggcaccctcccctcgatgacagcggaggat10140 gtggaacacgtactgcgtcccaaggtc<lacgccgcgttcctcctcgacgaactcacctcg10200 acgcccggctacgacctggcagcgttcgtcatgttctcctccgccgccgccgtcttcggt10260 ggcgcggggcagggcgcctacgccgccgccaacgccaccctcgacgccctcgcctggcgc10320 cgccggacagccggactccccgccctct:ccctcggctggggcctctgggccgagaccagc10380 :35ggcatgaccggcggactcagcgacaccdaccgctcgcggctggcccgttccggggcgacg10440 cccatggacagcgagctgaccctgtcccacctggacgcggccatgcgccgcgacgacccg10500 gcgctcgtcccgatcgccctggacgtcc3ccgcgctccgcgcccagcagcgcgacggcatg10560 ctggcgccgctgctcagcgggctcaccc:gcggatcgcgggtcggcggcgcgccggtcaac10620 cagcgcagggcagccgccggaggcgcg<3gcgaggcggacacggacctcggcgggcggctc10680 ~30gccgcgatgacaccggacgaccgggtcc3cgcacctgcgggacctcgtccgtacgcacgtg10740 gcgaccgtcctgggacacggcaccccc~agccgggtggaccggagcgggccttccgcgac10800 t accggtttcgactcgctcaccgccgtc:gaactccgcaaccgtctcaacgccgcgaccggg10860 ctgcggctgccggccacgctggtcttc:gaccaccccaccccgggggagctcgccgggcac10920 ctgctcgacgaactcgccacggccgcgggcgggtcctgggggaaggcaccgggtccgga10980 c Sgacacggcctcggcgaccgatcggcag~accacggcggccctcgccgaactcgaccggctg11040 gaaggcgtgctcgcctccctcgcgcccgccgccggcggccgtccggagctcgccgcccgg1:1100 ctcagggcgctggccgcggccctgggggacgacggcgacgacgccaccgacctggacgag1:1160 gcgtccgacgacgacctcttctccttcatcgacaaggagctgggcgactccgacttctga1:1220 <210> 33 <211> 3739 LS <212> PRT
<213> Streptomyces venezuelae <400> 33 Met Ser Thr Val Asn Glu Glu Lys Tyr Leu Asp Tyr Leu Arg Arg Ala Thr Ala Asp Leu His Glu Ala Arg Gly Arg Leu Arg Glu Leu Glu Ala Lys Ala Gly Glu Pro Val Ala Ile Val Gly Met Ala Cys Arg Leu Pro Gly Gly Val Ala Ser Pro Glu Asp Leu Trp Arg Leu Val Ala Gly Gly Glu Asp Ala Ile Ser Glu Phe Pro Gln Asp Arg Gly Trp Asp Val Glu Gly Leu Tyr Asp Pro Asn Pro Glu Ala Thr Gly Lys Ser Tyr Ala Arg Glu Ala Gly Phe Leu Tyr Glu Ala Gly Glu Phe Asp Ala Asp Phe Phe Gly Ile Ser Pro Arg Glu Ala Leu Ala Met Asp Pro Gln Gln Arg Leu Leu Leu Glu Ala Ser Trp Glu Ala Phe Glu His Ala Gly Ile Pro Ala Ala Thr Ala Arg Gly Thr Ser Val Gly Val Phe Thr Gly Val Met Tyr His Asp Tyr Ala Thr Arg Leu Thr Asp Val Pro Glu Gly Ile Glu Gly Tyr Leu Gly Thr Gly Asn Ser Gly Ser Val Ala Ser Gly Arg Val Ala Tyr Thr Leu Gly Leu Glu Gly Pro Ala Val Thr Val Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu Ala Val Gln Ala Leu Arg Lys Gly Glu Val Asp Met Ala Leu Ala Gly Gly Val Thr Val Met Ser Thr Pro Ser Thr Phe Val Glu Phe Ser Arg Gln Arg Gly Leu Ala Pro Asp Gly Arg Ser Lys Ser Phe Ser Ser Thr Ala Asp Gly Thr Ser Trp Ser Glu Gly Val Gly Val Leu Leu Val Glu Arg Leu Ser Asp Ala Arg Arg Lys Gly His Arg Ile Leu Ala Val Val Arg Gly Thr Ala Val Asn Gln Asp Gly Ala Ser Ser Gly Leu Thr Ala Pro Asn Gly Pro Ser Gln Gln Arg Val Ile Arg Arg Ala Leu Ala Asp Ala Arg Leu Thr Thr Ser Asp Val Asp Val Val Glu Ala His Gly Thr Gly Thr Arg Leu Gly Asp Pro Ile Glu Ala Gln Ala Val Ile Ala Thr Tyr Gly Gln Gly Arg Asp Gly Glu Gln Pro Leu Arg Leu Gly Ser Leu Lys Ser Asn Ile Gly His Thr Gln Ala Ala Ala Gly Val Ser Gly Val Ile Lys Met Val Gln Ala Met Arg His Gly Val Leu Pro Lys Thr Leu His Val Glu Lys Pro Thr Asp Gln Val Asp Trp Ser Ala Gly Ala Val Glu Leu Leu Thr Glu Ala Met Asp Trp Pro Asp Lys Gly Asp Gly Gly Leu Arg Arg Ala Ala Val Ser Ser Phe Gly Val Ser Gly Thr Asn Ala His Val Val Leu Glu Glu Ala Pro Ala Ala Glu Glu Thr Pro Ala Ser Glu Ala Thr Pro Ala Val Glu Pro Ser Val Gly
Ala Gly Leu Val Pro Trp Leu Val Ser Ala Lys Thr Pro Ala Ala Leu Asp Ala Gln Ile Gly Arg Leu Ala Ala Phe Ala Ser Gln Gly Arg Thr Asp Ala Ala Asp Pro Gly Ala Val Ala Arg Val Leu Ala Gly Gly Arg Ala Glu Phe Glu His Arg Ala Val Val Leu Gly Thr Gly Gln Asp Asp Phe Ala Gln Ala Leu Thr Ala Pro Glu Gly Leu Ile Arg Gly Thr Pro Ser Asp Val Gly Arg Val Ala Phe Val Phe Pro Gly Gln Gly Thr Gln Trp Ala Gly Met Gly Ala Glu Leu Leu Asp Val Ser Lys Glu Phe Ala Ala Ala Met Ala Glu Cys Glu Ser Ala Leu Ser Arg Tyr Val Asp Trp Ser Leu Glu Ala Val Val Arg Gln Ala Pro Gly Ala Pro Thr Leu Glu Arg Val Asp Val Val Gln Pro Val Thr Phe Ala Val Met Val Ser Leu Ala Lys Val Trp Gln His His Gly Val Thr Pro Gln Ala Val Val Gly His Ser Gln Gly Glu Ile Ala Ala Ala Tyr Val Ala Gly Ala Leu Thr Leu Asp Asp Ala Ala Arg Val Val Thr Leu Arg Ser 675 680 685 Lys Ser Ile Ala Ala His Leu Ala Gly Lys Gly Gly Met Ile Ser Leu Ala Leu Ser Glu Glu Ala Thr Arg Gln Arg Ile Glu Asn Leu His Gly Leu Ser Ile Ala Ala Val Asn Gly Pro Thr Ala Thr Val Val Ser Gly Asp Pro Thr Gln Ile Gln Glu Leu Ala Gln Ala Cys Glu Ala Asp Gly Val Arg Ala Arg Ile Ile Pro Val Asp Tyr Ala Ser His Ser Ala His Val Glu Thr Ile Glu Ser Glu Leu Ala Glu Val Leu Ala Gly Leu Ser Pro Arg Thr Pro Glu Val Pro Phe Phe Ser Thr Leu Glu Gly Ala Trp Ile Thr Glu Pro Val Leu Asp Gly Thr Tyr Trp Tyr Arg Asn Leu Arg 805 810 815 His Arg Val Gly Phe Ala Pro Ala Val Glu Thr Leu Ala Thr Asp Glu Gly Phe Thr His Phe Ile Glu Val Ser Ala His Pro Val Leu Thr Met Thr Leu Pro Glu Thr Val Thr Gly Leu Gly Thr Leu Arg Arg Glu Gln Gly Gly Gln Glu Arg Leu Val Thr Ser Leu Ala Glu Ala Trp Thr Asn Gly Leu Thr Ile Asp Trp Ala Pro Val Leu Pro Thr Ala Thr Gly His His Pro Glu Leu Pro Thr Tyr Ala Phe Gln Arg Arg His Tyr Trp Leu His Asp Ser Pro Ala Val Gln Gly Ser Val Gln Asp Ser Trp Arg Tyr 915 920 925 Arg Ile Asp Trp Lys Arg Leu Ala Val Ala Asp Ala Ser Glu Arg Ala Gly Leu Ser Gly Arg Trp Leu Val Val Val Pro Glu Asp Arg Ser Ala Glu Ala Ala Pro Val Leu
Ala Ala Leu Ser Gly Ala Gly Ala Asp Pro Val Gln Leu Asp Val Ser Pro Leu Gly Asp Arg Gln Arg Leu Ala Ala Thr Leu Gly Glu Ala Leu Ala Ala Ala Gly Gly Ala Val Asp Gly Val 2;$ Leu Ser Leu Leu Ala Trp Asp Glu Ser Ala His Pro Giy His Pro Ala Pro Phe Thr Arg Gly Thr Gly h,la Thr Leu Thr Leu Val Gln Ala Leu Glu Asp Ala Gly Val Ala Ala Pro Leu Trp Cys Val Thr His Gly Ala Val Ser Val Gly Arg Ala Asp His Val Thr Ser Pro Ala Gln Ala Met Val Trp Gly Met Gly Arg Val Ala Ala Leu Glu His Pro Glu Arg Trp 35 Gly Gly Leu Ile Asp Leu Pro Ser Asp Ala Asp Arg Ala Ala Leu Asp Arg Met Thr Thr Val Leu Ala Gly Gly Thr Gly Glu Asp Gln Val Ala Val Arg Ala Ser Gly Leu Leu Ala Arg Arg Leu Val Arg Aia Ser Leu Pro Ala His Gly Thr Ala Se~_- Pro Trp Trp Gln Ala Asp Gly Thr Val Leu Val Thr Gly Ala Glu Glu Pro Ala Ala Ala Glu Ala Ala Arg Arg $ Leu Ala Arg Asp Gly Ala Gly His Leu Leu Leu His Thr Thr Pro Ser Gly Ser Glu Gly Ala Glu Gly Thr Ser Gly Ala Ala Glu Asp Ser Gly Leu Ala Gly Leu Val Ala Glu Leu Ala Asp Leu Gly Ala Thr Ala Thr Val Val Thr Cys Asp Leu Thr Asp Ala Glu Ala Ala Ala Arg Leu Leu Ala Gly Val Ser Asp Ala His Pro Leu Ser Ala Val Leu His Leu Pro 1S Pro Thr Val Asp Ser Glu Pro Leu Ala Ala Thr Asp Ala Asp Ala Leu 1250 125!5 1260 Ala Arg Val Val Thr Ala Lys Ala Thr Ala Ala Leu His Leu Asp Arg Leu Leu Arg Glu Ala Ala Al.a Ala Gly Gly Arg Pro Pro Val Leu Val 2~ 1285 1290 Leu Phe Ser Ser Val Ala Ala Ile Trp Gly Gly Ala Gly Gln Gly Ala Tyr Ala Ala Gly Thr Ala Phe Leu Asp Ala Leu Ala Gly Gln His Arg 25 Ala Asp Gly Pro Thr Val Thr Ser Val Ala Trp Ser Pro Trp Glu Gly Ser Arg Val Thr Glu Gly Ala Thr Gly Glu Arg Leu Arg Arg Leu Gly 1355 1360 .
Leu Arg Pro Leu Ala Pro Ala Thr Ala Leu Thr Ala Leu Asp Thr Ala Leu Gly His Gly Asp Thr Ala Val Thr Ile Ala Asp Val Aep Trp Ser Ser Phe Ala Pro Gly Phe Thr Thr Ala Arg Pro Gly Thr Leu Leu Ala .3$ Asp Leu Pro Glu Ala Arg Arg ,Ala Leu Asp Glu Gln Gln Ser Thr Thr Ala Ala Asp Asp Thr Val Leu ;Ser Arg Glu Leu Gly Ala Leu Thr Gly Ala Glu Gln Gln Arg Arg Met Gln Glu Leu Val Arg Glu His Leu Ala Val Val Leu Asn His Pro Ser Pro Glu Ala Val Asp Thr Gly Arg Ala Phe Arg Asp Leu Gly Phe Asp Ser Leu Thr Ala Val Glu Leu Arg Asn $ Arg Leu Lys Asn Ala Thr Gly Leu Ala Leu Pro Ala Thr Leu VaI Phe 2490 149!5 1500 Asp Tyr Pro Thr Pro Arg Thr Leu Ala Glu Phe Leu Leu Ala Glu Ile Leu Gly Glu Gln Ala Gly Ala Gly Glu Gln Leu Pro Val Asp Gly Gly Val Asp Asp Glu Pro Val Ala Ile Val Gly Met Ala Cys Arg Leu Pro Gly Gly Val Ala Ser Pro Glu Asp Leu Trp Arg Leu val Ala Gly Gly 1$ Glu Asp Ala Ile Ser Gly Phe Pro Gln Asp Arg Gly Trp Asp Val Glu Gly Leu Tyr Asp Pro Asp Pro Asp Ala Ser Gly Arg Thr Tyr Cys Arg Ala Gly Gly Phe Leu Asp Glu Ala Gly Glu Phe Asp Ala Asp Phe Phe Gly Ile Ser Pro Arg Glu Ala Leu Ala Met Asp Pro Gln Gln Arg Leu Leu Leu Glu Thr Ser Trp Glu .Ala Val Glu Asp Ala Gly Ile Asp Pro :25 Thr Ser Leu Gln Gly Gln Gln 'Val Gly Val Phe Ala Gly Thr Asn Gly Pro His Tyr Glu Pro Leu Leu Arg Asn Thr Ala Glu Asp Leu Glu Gly Tyr Val Gly Thr Gly Asn Ala i~la Ser Ile Met Ser Gly Arg VaI Ser Tyr Thr Leu Gly Leu Glu Gly 1?ro Ala Val Thr Val Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu Ala Val Gln Ala Leu Arg Lys _~S Gly Glu Cys Gly Leu Ala Leu Ala Gly Gly Val Thr Val Met Ser Thr Pro Thr Thr Phe Val Glu Phe Ser Arg Gln Arg Gly Leu Ala Glu Asp Gly Arg Ser Lys Ala Phe Ala Ala Ser Ala Asp Gly Phe Gly Pro Ala 1$2 Glu Gly Val Gly Met Leu Leu. Val Glu Arg Leu Ser Asp Ala Arg Arg ,.

Asn Gly His Arg Val Leu Ala Val Val Arg Gly Ser Ala Val Asn Gln $ Asp Gly Ala Ser Asn Gly Leu Thr Ala Pro Asn Gly Pro Ser Gln Gln Arg Val Ile Arg Arg Ala Leu Ala Asp Ala Arg Leu Thr Thr Ala Asp Val Asp Val Val Glu Ala His Gly Thr Gly Thr Arg Leu Gly Asp Pro Ile Glu Ala Gln Ala Leu Ile Ala Thr Tyr Gly Gln Gly Arg Asp Thr Glu Gln Pro Leu Arg Leu Gly Ser Leu Lys Ser Asn Ile Gly His Thr I$ Gln Ala Ala Ala Gly Val Ser Gly Ile Ile Lys Met Val Gln Ala Met 1890 189__°i 1900 Arg His Gly Val Leu Pro Lys Thr Leu His Val Asp Arg Pro Ser Asp Gln Ile Asp Trp Ser Ala Gly Thr Val Glu Leu Leu Thr Glu Ala Met 2~ 1925 1930 Asp Trp Pro Arg Lys Gln Glu Gly Gly Leu Arg Arg Ala Ala Val Ser Ser Phe Gly Ile Ser Gly Thr Asn Ala His Ile Val Leu Glu Glu Ala 2$ Pro Val Asp Glu Asp Ala Pro Ala Asp Glu Pro Ser Val Gly Gly Val Val Pro Trp Leu Val Ser Ala Lys Thr Pro Ala Ala Leu Asp Ala Gln Ile Gly Arg Leu Ala Ala Phe .Ala Ser Gln Gly Arg Thr Asp Ala Ala Asp Pro Gly Ala Val Ala Arg 'Val Leu Ala Gly Gly Arg Ala Gln Phe Glu His Arg Ala Val Ala Leu Gly Thr Gly Gln Asp Asp Leu Ala Ala 3$ Ala Leu Ala Ala Pro Glu Gly Leu Val Arg Gly Va1 Ala Ser Gly Val Gly Arg Val Ala Phe Val Phe 1?ro Gly Gln Gly Thr Gln Trp Ala Gly Met Gly A1a Glu Leu Leu Asp Val Ser Lys Glu Phe Ala A1a Ala Met 1$3 Ala Glu Cys Glu Ala Ala Leu. 
Ala Pro Tyr Val Asp Trp Ser Leu Glu Ala Val Val Arg Gln Ala Pro Gly Ala Pro Thr Leu Glu Arg Vai Asp $ Val Val Gln Pro Val Thr Phe Ala VaI Met Val Ser Leu Ala Lys Val Trp Gln His His Gly Val Thr Pro Gln Ala Val Val Gly His Ser Gln Gly Glu Ile Ala Ala Ala Tyr Val Ala Gly Ala Leu Ser Leu Asp Asp 1~ 2165 2170 Ala Ala Arg Val Val Thr Leu Arg Ser Lys Ser Ile Gly Ala His Leu Ala Gly Gln Gly Gly Met Leu Ser Leu Ala Leu Ser Glu Ala Ala Val 15 Val Glu Arg Leu Ala Gly Phe Asp Gly Leu Ser Val Ala Ala Val Asn Gly Pro Thr Ala Thr Val Val Ser Gly Asp Pro Thr Gln Ile Gln Glu Leu Ala Gln Ala Cys Glu Ala Asp Gly Val Arg Ala Arg Ile Ile Pro Val Asp Tyr Ala Ser His Ser Ala His Val Glu Thr Ile Glu Ser Glu Leu Ala Asp Val Leu Ala Gly Leu Ser Pro Gln Thr Pro Gln Val Pro 2$ Phe Phe Ser Thr Leu Glu Gly .Ala Trp Ile Thr Glu Pro Ala Leu Asp Gly Gly Tyr Trp Tyr Arg Asn :Leu Arg His Arg Val Gly Phe Ala Pro Ala Val Glu Thr Leu Ala Thr iAsp Glu Gly Phe Thr His Phe Val Glu Val Ser Ala His Pro Val Leu 'Chr Met Ala Leu Pro Glu Thr Val Thr Gly Leu Gly Thr Leu Arg Arg Asp Asn Gly Gly Gln His Arg Leu Thr 2355 :?360 2365 :~$ Thr Ser Leu Ala Glu Ala Trp Ala Asn Gly Leu Thr Val Asp Trp Ala Ser Leu Leu Pro Thr Thr Thr Thr His Pro Asp Leu Pro Thr Tyr Ala Phe Gln Thr Glu Arg Tyr Trp Pro Gln Pro Asp Leu Ser Ala Ala Gly 1$4 Asp Ile Thr Ser Ala Gly Leu Gly Ala Ala Glu His Pro Leu Leu Gly Ala Ala Val Ala Leu Ala Asp Ser Asp Gly Cys Leu Leu Thr Gly Ser $ Leu Ser Leu Arg Thr His Pro Trp Leu Ala Asp His Ala Val Ala Gly 2450 245!i 2460 Thr Val Leu Leu Pro Gly Thr Ala Phe Val Glu Leu Ala Phe Arg Ala Gly Asp Gln Val Gly Cys Asp Leu Val Glu Glu Leu Thr Leu Asp Ala 1~ 2485 2490 2495 Pro Leu Val Leu Pro Arg Arg Gly Ala Val Arg Val Gln Leu Ser Val Gly Ala Ser Asp Glu Ser Gly Arg Arg Thr Phe Gly Leu Tyr Ala His 1$ Pro Glu Asp Ala Pro Gly Glu Ala Glu Trp Thr Arg His Ala Thr Gly 2530 2535. 
2540 Val Leu Ala Ala Arg Ala Asp Arg Thr Ala Pro Val Ala Asp Pro Glu Ala Trp Pro Pro Pro Gly Ala Glu Pro Val Asp Val Asp Gly Leu Tyr Glu Arg Phe Ala Ala Asn Gly Tyr Gly Tyr Gly Pro Leu Phe Gln Gly Val Arg Gly Val Trp Arg Arg Gly Asp Glu Val Phe Ala Asp Val Ala 2$ Leu Pro Ala Glu Val Ala Gly Ala Glu Gly Ala Arg Phe Gly Leu His 2610 2615 26'20 Pro Ala Leu Leu Asp Ala Ala Val Gln Ala Ala Gly Ala Gly Arg Gly Val Arg Arg Gly His Ala Ala .Ala Val Arg Leu Glu Arg Asp Leu Leu Tyr Ala Val Gly Ala Thr Ala Leu Arg Val Arg Leu Ala Pro Ala Gly Pro Asp Thr Val Ser Val Ser .Ala Ala Asp Ser Ser Gly Gln Pro Val 3$ Phe Ala Ala Asp Ser Leu Thr 'Val Leu Pro Val Asp Pro Ala Gln Leu Ala Ala Phe Ser Asp Pro Thr :Leu Asp Ala Leu His Leu Leu Glu Trp Thr Ala Trp Asp Gly Ala Ala Gln Ala Leu Pro Gly Ala Val Val Leu Gly Gly Asp Ala Asp Gly Leu Ala Ala Ala Leu Arg Ala Gly Gly Thr Glu Val Leu Ser Phe Pro Asp Leu Thr Asp Leu Val Glu Ala Val Asp Arg Gly Glu Thr Pro Ala Pro Ala Thr Val Leu Val Ala Cys Pro Ala 2770 277°.i 2780 Ala Gly Pro Asp Gly Pro Glu His Val Arg Glu Ala Leu His Gly Ser Leu Ala Leu Met Gln Ala Trp Leu Ala Asp Glu Arg Phe Thr Asp Gly Arg Leu Val Leu Val Thr Arg Asp Ala Val Ala Ala Arg Ser Gly Asp Gly Leu Arg Ser Thr Gly Gln Ala Ala Val Trp Gly Leu Gly Arg Ser Ala Gln Thr Glu Ser Pro Gly Arg Phe Val Leu Leu Asp Leu Ala Gly Glu Ala Arg Thr Ala Gly Asp Ala Thr Ala Gly Asp Gly Leu Thr Thr Gly Asp Ala Thr Val Gly Gly T:hr Ser Gly Asp Ala Ala Leu Gly Ser Ala Leu Ala Thr Ala Leu Gly Ser Gly Glu Pro Gln Leu Ala Leu Arg Asp Gly Ala Leu Leu Val Pro ;Arg Leu Ala Arg Ala Ala Ala Pro Ala 2915 :2920 2925 .~5 Ala Ala Asp Gly Leu Ala Ala ~41a Asp Gly Leu Ala Ala Leu Pro Leu Pro Ala Ala Pro Ala Leu Trp i~rg Leu Glu Pro Gly Thr Asp Gly Ser Leu Glu Ser Leu Thr Ala Ala 1?ro Gly Asp Ala Glu Thr Leu Ala Pro Glu Pro Leu Gly Pro Gly Gln Val Arg Ile Ala Ile Arg Ala Thr Gly Leu Asn Phe Arg Asp Val Leu Ile Ala Leu Gly Met Tyr Pro Asp Pro .i5 Ala Leu Met Gly Thr Glu Gly Ala Gly Val Val Thr Ala Thr Gly Pro 
Gly Val Thr His Leu Ala Pro C3ly Asp Arg Val Met Gly Leu Leu Ser Gly Ala Tyr Ala Pro Val Val Val Ala Asp Ala Arg Thr Val Ala Arg 1$6 Met Pro Glu Gly Trp Thr Phe~ Ala Gln Gly Ala Ser Val Pro Val Val Phe Leu Thr Ala Val Tyr Ala Leu Arg Asp Leu Ala Asp Val Lys Pro $ Gly Glu Arg Leu Leu Val His Ser Ala Ala Gly Gly Val Gly Met Ala Ala Val Gln Leu Ala Arg His Trp Gly Val Glu Val His Gly Thr Ala Ser His Gly Lys Trp Asp Ala Leu Arg Ala Leu Gly Leu Asp Asp Ala 1~ 3125 3130 His Ile Ala Ser Ser Arg Thr Leu Asp Phe Glu Ser Ala Phe Arg Ala Ala Ser Gly Gly Ala Gly Met Asp Val Val Leu Asn Ser Leu Ala Arg 3155 .3160 1$ Glu Phe Val Asp Ala Ser Leu Arg Leu Leu Gly Pro Gly Gly Arg Phe Val Glu Met Gly Lys Thr Asp Val Arg Asp Ala Glu Arg VaI Ala Ala Asp His Pro Gly Val Gly Tyr Arg Ala Phe Asp Leu Gly Glu Ala Gly Pro Glu Arg Ile Gly Glu Met Leu Ala Glu Val Ile Ala Leu Phe Glu Asp Gly Val Leu Arg His Leu Pro Val Thr Thr Trp Asp Val Arg Arg 2$ Ala Arg Asp Ala Phe Arg His Val Ser Gln Ala Arg His Thr Gly Lys Val Val Leu Thr Met Pro Ser Gly Leu Asp Pro Glu Gly Thr Val Leu Leu Thr Gly Gly Thr Gly Ala Leu Gly Gly Ile Val Ala Arg His Val Val Gly Glu Trp Gly Val Arg Arg Leu Leu Leu Val Ser Arg Arg Gly Thr Asp Ala Pro Gly Ala Gly Glu Leu Val His Glu Leu Glu Ala Leu 3$ Gly Ala Asp Val Ser Val Ala ,Ala Cys Asp Val Ala Asp Arg Glu Ala Leu Thr Ala Val Leu Asp Ser Ile Pro Ala Glu His Pro Leu Thr Ala Val Val His Thr Ala Gly Val :Leu Ser Asp Gly Thr Leu Pro Ser Met Thr Ala Glu Asp Val Glu His Val Leu Arg Pro Lys Val Asp Ala Ala Phe Leu Leu Asp Glu Leu Th:r Ser Thr Pro Gly Tyr Asp Leu Ala Ala $ Phe Val Met Phe Ser Ser Alai Ala Ala Val Phe Gly Gly Ala Gly Gln 3410 34:L5 Gly Ala Tyr Ala Ala Ala Asn Ala Thr Leu Asp Ala Leu Ala Trp Arg Arg Arg Thr Ala Gly Leu Pro Ala Leu Ser Leu Gly Trp Gly Leu Trp 1~ 3445 3450 3455 Ala Glu Thr Ser Gly Met Thr Gly Gly Leu Ser Asp Thr Asp Arg Ser Arg Leu Ala Arg Ser Gly Ala Thr Pro Met Asp Ser Glu Leu Thr Leu 1$ Ser Leu Leu Asp Ala Ala Met Arg Arg Asp Asp Pro Ala Leu Val Pro 
Ile Ala Leu Asp Val Ala Ala Leu Arg Ala Gln Gln Arg Asp Gly Met Leu Ala Pro Leu Leu Ser Gly Leu Thr Arg Gly Ser Arg Val Gly Gly Ala Pro Val Asn Gln Arg Arg Ala Ala Ala Gly Gly Ala Gly Glu Ala Asp Thr Asp Leu Gly Gly Arg Leu Ala Ala Met Thr Pro Asp Asp Arg Val Ala His Leu Arg Asp Leu Val Arg Thr His Val Ala Thr Val Leu Gly His Gly Thr Pro Ser Arg Val Asp Leu Glu Arg Ala Phe Arg Asp Thr Gly Phe Asp Ser Leu Thr Ala Val Glu Leu Arg Asn Arg Leu Asn Ala Ala Thr Gly Leu Arg Leu Pro Ala Thr Leu Val Phe Asp His Pro Thr Pro Gly Glu Leu Ala Gly His Leu Leu Asp Glu Leu Ala Thr Ala Ala Gly Gly Ser Trp Ala Glu Gly Thr Gly Ser Gly Asp Thr Ala Ser Ala Thr Asp Arg Gln Thr Thr Ala Ala Leu Ala Glu Leu Asp Arg Leu Glu Gly Val Leu Ala Ser Leu Ala Pro Ala Ala Gly Gly Arg Pro Glu Leu Ala Ala Arg Leu Arg Ala Leu Ala Ala Ala Leu Gly Asp Asp Gly Asp Asp Ala Thr Asp Leu Asp Glu Ala Ser Asp Asp Asp Leu Phe Ser Phe Ile Asp Lys Glu Leu Gly Asp Ser Asp Phe 3730 3735
<210> 34 <211> 4689 <212> DNA
<213> Streptomyces venezuelae <400> 34 atggcgaacaacgaagacaagctccgcgactacctcaagcgcgtcaccgccgagctgcag 60 cagaacaccaggcgtctgcgcgagatcgagggacgcacgcacgagccggtggcgatcgtg 120 ggcatggcctgccgcctgccgggcggtgtcgcctcgcccgaggacctgtggcagctggtg 180 gccggggacggggacgcgatctcggagttcccgcaggaccgcggctgggacgtggagggg 240 ctgtacgaccccgacccggacgcgtccggcaggacgtactgccggtccggcggattcctg 300 cacgacgccggcgagttcgacgccgacttcttcgggatctcgccgcgcgaggccctcgcc 360 atggacccgcagcagcgactgtccctcaccaccgcgtgggaggcgatcgagagcgcgggc 420 atcgacccgacggccctgaagggcagcggcctcggcgtcttcgtcggcggctggcacacc 480 ggctacacctcggggcagaccaccgccgtgcagtcgcccgagctggagggccacctggtc 540 agcggcgcggcgctgggcttcctgtccggccgtatcgcgtacgtcctcggtacggacgga 600 ccggccctgaccgtggacacggcctgcacgtcctcgctggtcgccctgcacctcgccgtg 660 caggccctccgcaagggcgagtgcgacatggccctcgccggtggtgtcacggtcatgccc 720 aacgcggacctgttcgtgcagttcagccggcagcgcgggctggccgcggacggccggtcg 780 aaggcgttcgccacctcggcggacggcatcggccccgcggagggcgccggagtcctgctg 840 gtggagcgcctgtcggacgcccgccgcaacggacaccggatcctcgcggtcgtccgcggc 900

agcgcggtcaaccaggacggcgccagcaacggcctcacggctccgcacgggccctcccag960 ~~0cagcgcgtcatccgacgggccctggcggacgcccggctcgcgccgggtgacgtggacgtc1020 gtcgaggcgcacggcacgggcacgcgg~ctcggcgacccgatcgaggcgcaggccctcatc1080 gccacctacggccaggagaagagcagcgaacagccgctgaggctgggcgcgttgaagtcg1140 aacatcgggcacacgcaggccgcggccggtgtcgcaggtgtcatcaagatggtccaggcg1200 atgcgccacggactgctgccgaagacgctgcacgtcgacgagccctcggaccagatcgac1260 35 tggtcggcgggcacggtggaactcctcaccgaggccgtcgactggccggagaagcaggac1320 ggcgggctgcgccgcgcggctgtctcctccttcggcatcagcgggacgaacgcgcacgtc1380 gtcctggaggaggccccggcggtcgaggactccccggccgtcgagccgccggccggtggc1440 ggtgtggtgccgtggccggtgtccgcgaagactccggccgcgctggacgcccagatcggg7.500 cagctcgccgcgtacgcggacggtcgtacggacgtggatccggcggtggccgcccgcgcc7.560 40 ctggtcgacagccgtacggcgatggagcaccgcgcggtcgcggtcggcgacagccgggag1.620 WO 00/00620 PCT/US99/1439$

gcactgcggg 1680 acgccctgcg gatgccggaa ggactggtac gcggcacgtc ctcggacgtg ggccgggtgg 1740 cgttcgtctt ccccggccag ggcacgcagt.
yggccggcat gggcgccgaa ctccttgaca 1800 gctcaccgga gttcgctgcc tcgatggccg aatgcgagac cgcgctctcc cgctacgtcg acccacgctc 1860 actggtctct tgaag<:cgtc gtccgacagg aacccggcgc S gaccgcgtcg ggcgaaggtc 1920 acgtcgtcca gcccgt:gacc ttcgctgtca tggtctcgct tggcagcacc cgagatcgcc 1980 acggcatcac , cccccaggcc gtcgtcggcc actcgcaggg gccgcgtacg actcac:cctc caccctgcgc 2040 tcgccggtgc gacgacgccg cccgcgtcgt agcaagtccatcgccgcccacctcgc:cggc tgatctccctcgccctcgac 2100 aagggcggca gaggcggccgtcctgaagcgactgagcgacttcgacggactctccgtcgcgccgtcaac 2160 c ggccccaccgccaccgtcgtctccggcgacccgacccagatcgaggaactgcccgcacc 2220 c tgcgaggccgacggcgtccgtgcgcg~gatcatcccggtcgactacgcctcccacagccgg 2280 caggtcgagatcatcgagaaggagctggccgaggtcctcgccggactcgccccgcaggct 2340 ccgcacgtgccgttcttctccaccctcgaaggcacctggatcaccgagccggtgctcgac 2400 ggcacctactggtaccgcaacctgcgccatcgcgtgggcttcgcccccgccgtggagacc 2460 IS ttggcggttgacggcttcacccacttcatcgaggtcagcgcccaccccgtcctcaccatg 2520 accctccccgagaccgtcaccggcctcggcaccctccgccgcgaacagggaggccaggag 2580 cgtctggtcacctcactcgccgaagcctgggccaacggcctcaccatcgactgggcgccc 2640 atcctccccaccgcaaccggccaccaccccgagctccccacctacgccttccagaccgag 2700 cgcttctggctgcagagctccgcgcccaccagcgccgccgacgactggcgttaccgcgtc 2760 ZO gagtggaagccgctgacggcctccggccaggcggacctgtccgggcggtggatcgtcgcc 2820 gtcgggagcgagccagaagccgagctgctgggcgcgctgaaggccgcgggagcggaggtc 2880 gacgtactggaagccggggcggacgacgaccgtgaggccctcgccgcccggctcaccgca 2940 ctgacgaccggcgacggcttcaccggcgtggtctcgctcctcgacgacctcgtgccacag 3000 gtcgcctgggtgcaggcactcggcgacgccggaatcaaggcgcccctgtggtccgtcacc 3060 :~Scagggcgcggtctccgtcggacgtctc:gacacccccgccgaccccgaccgggccatgctc 3120 tggggcctcggccgcgtcgtcgcccttgagcaccccgaacgctgggccggcctcgtcgac 3180 ctccccgcccagcccgatgccgccgcc:ctcgcccacctcgtcaccgcactctccggcgcc 3240 accggcgaggaccagatcgccatccg<:accaccggactccacgcccgccgcctcgcccgc 3300.

gcacccctccacggacgtcggcccacc:cgcgactggcagccccacggcaccgtcctcatc 3360 :~0accggcggcaccggagccctcggcagc:cacgccgcacgctggatggcccaccacggagcc 3420 gaacacctcctcctcgtcagccgcagc:ggcgaacaagcccccggagccacccaactcacc 3480 gccgaactcaccgcatcgggcgcccgc:gtcaccatcgccgcctgcgacgtcgccgacccc 3540 cacgccatgcgcaccctcctcgacgcc:atccccgccgagacgcccctcaccgccgtcgtc 3600 cacaccgccggcgcaccgggcggcgat:ccgctggacgtcaccggcccggaggacatcgcc 3660 '5 cgcatcctgggcgcgaagacgagcggc:gccgaggtcctcgacgacctgctccgcggcact 3720 ccgctggacgccttcgtcctctactccacgaacgccggggtctggggcagcggcagccag 3780 ggcgtctacgcggcggccaacgcccac:ctcgacgcgctcgccgcccggcgccgcgcccgg :3840 ggcgagacggcgacctcggtcgcctgg~ggcctctgggccggcgacggcat :3900 gggccggggc gccgacgacgcgtactggcagcgtcgcggcatccgtccgatgagccccgaccgcgccctg 3950 4~0gacgaactggccaaggccctgagccacgacgagaccttcgtcgccgtggccgatgtcgac 4020 tgggagcggt tcgcgcccgcgttcacggtgtcccgtcccagccttctgctcgacggcgtc 4080 ccggaggccc ggcaggcgctcgccc~cacccgtcggtgcccggctcccggcgacgccgcc 4140 c gtggcgccga ccgggcagtcgtcgc~cgctggccgcgatcacgcgctccccgagcccgag 4200 c cgccggccgg cgctcctcaccctcg~tccgtacccacgcggcggccgtactcggccattcc 4260 $ tcccccgaccgggtggcccccggccgtgccttcaccgagctcggcttcgactcgctgacg 4320 gccgtgcagc tccgcaaccagctctccacggtggtcggcaacaggctccccgccaccacg 4380 gtcttcgac acccgacgcccgccgcactcc tccacgaggcgtacctcgca 4440 gccgcgcacc ccggccgagc cggccccgacggactgggaggggcgggtgcgccgggccctggccgaactg 4500 cccctcgacc ggctgcgggacgcgggggtcctcgacaccgtcctgcgcctcaccggcatc 4560 gagcccgagccgggttccggcggttcggacggcggcgccgccgaccctggtgcggagccg 4620 gaggcgtcga tcgacgacctggacgccgaggccctgatccggatggctctcggcccccgt 4680 aacacctga 4689 <210> 35 1$ <211> 1562 <212> PRT
<213> Streptomyces venezuelae <400> 35 Met Ala Asn Asn Glu Asp Lys Leu Arg Asp Tyr Leu Lys Arg Val Thr Ala Glu Leu Gln Gln Asn Thr Arg Arg Leu Arg Glu Ile Glu Gly Arg Thr His Glu Pro Val Ala Ile Val Gly Met Ala Cys Arg Leu Pro Gly 2$ 35 40 45 Gly Val Ala Ser Pro Glu Asp Leu Trp Gln Leu Val Ala Gly Asp Gly Asp Ala Ile Ser Glu Phe Pro Gln Asp Arg Gly Trp Asp Val Glu Gly Leu Tyr Asp Pro Asp Pro Asp Ala Ser Gly Arg Thr Tyr Cys Arg Ser Gly Gly Phe Leu His Asp Ala Gly Glu Phe Asp Ala Asp Phe Phe Gly Ile Ser Pro Arg Glu Ala Leu .Ala Met Asp Pro Gln Gln Arg Leu Ser 3$ 115 120 125 Leu Thr Thr Ala Trp Glu Ala Ile Glu Ser Ala Gly Ile Asp Pro Thr Ala Leu Lys Gly Ser Gly Leu Gly Val Phe Val Gly Gly Trp His Thr ~40 Gly Tyr Thr Ser Gly Gln Thr 'Thr Ala Val Gln Ser Pro Glu Leu Glu WO 00/00620 PCTlUS99/14398 Gly His Leu Val Ser Gly Ala Ala Leu Gly Phe Leu Ser Gly Arg Ile Ala Tyr Val Leu Gly Thr Asp Gly Pro Ala Leu Thr Val Asp Thr Ala Cys Ser Ser Ser Leu Val Ala Leu His Leu Ala Val Gln Ala Leu Arg , Lys Gly Glu Cys Asp Met Ala Leu Ala Gly Gly Val Thr Val Met Pro 1~ Asn Ala Asp Leu Phe Val Gln Phe Ser Arg Gln Arg Gly Leu Ala Ala Asp Gly Arg Ser Lys Ala Phe .Ala Thr Ser Ala Asp Gly Phe Gly Pro Ala Glu Gly Ala Gly Val Leu :Leu Val Glu Arg Leu Ser Asp Ala Arg 1$ 275 280 285 Arg Asn Gly His Arg Ile Leu ~Ala Val Val Arg Gly Ser Ala Val Asn Gln Asp Gly Ala Ser Asn Gly Leu Thr Ala Pro His Gly Pro Ser Gln .~~ Gln Arg Val Ile Arg Arg Ala I~eu Ala Asp Ala Arg Leu Ala Pro Gly Asp Val Asp Val Val Glu Ala His Gly Thr Gly Thr Arg Leu Gly Asp Pro Ile Glu Ala Gln Ala Leu 7:1.e Ala Thr Tyr Gly Gln Glu Lys Ser s $ 355 3.60 365 Ser Glu Gln Pro Leu Arg Leu Giy Ala Leu Lys Ser Asn Ile Gly His Thr Gln Ala Ala Ala Gly Val A.la Gly Val Ile Lys Met Val Gln Ala 30 Met Arg His Gly Leu Leu Pro Lys Thr Leu His Val Asp Glu Pro Ser Asp Gln Ile Asp Trp Ser Ala Gly Thr Val Glu Leu Leu Thr Glu Ala Val Asp Trp Pro Glu Lys Gln As,p Gly Gly Leu Arg Arg Ala Ala Val 3$ 435 440 445 Ser Ser Phe Gly Ile Ser Gly Thr Asn Ala 
His Val Val Leu Glu Glu Ala Pro Ala Val Glu Asp Ser Pro Ala Val Glu Pro Pro Ala Gly Gly Gly Val Val Pro Trp Pro Val Ser Ala Lys Thr Pro Ala Ala Leu Asp Ala Gln Ile Gly Gln Leu Ala Ala Tyr Ala Asp Gly Arg Thr Asp Val Asp Pro Ala Val Ala Ala Arg Ala Leu Val Asp Ser Arg Thr Ala Met Glu His Arg Ala Val Ala Val Gly Asp Ser Arg Glu Ala Leu Arg Asp

Ala Leu Arg Met Pro Glu Gly Leu Val Arg Gly Thr Ser Ser Asp Val Gly Arg Val Ala Phe Val Phe Pro Gly Gln Gly Thr Gln Trp Ala Gly Met Gly Ala Glu Leu Leu Asp Ser Ser Pro Glu Phe Ala Ala Ser Met Ala Glu Cys Glu Thr Ala Leu Ser Arg Tyr Val Asp Trp Ser Leu Glu 595 600 605 Ala Val Val Arg Gln Glu Pro Gly Ala Pro Thr Leu Asp Arg Val Asp Val Val Gln Pro Val Thr Phe Ala Val Met Val Ser Leu Ala Lys Val Trp Gln His His Gly Ile Thr Pro Gln Ala Val Val Gly His Ser Gln Gly Glu Ile Ala Ala Ala Tyr Val Ala Gly Ala Leu Thr Leu Asp Asp Ala Ala Arg Val Val Thr Leu Arg Ser Lys Ser Ile Ala Ala His Leu 675 680 685
Ala Gly Lys Gly Gly Met Ile Ser Leu Ala Leu Asp Glu Ala Ala Val Leu Lys Arg Leu Ser Asp Phe F~sp Gly Leu Ser Val Ala Ala Val Asn 3~ Gly Pro Thr Ala Thr Val Val S~er Gly Asp Pro Thr Gln Ile Glu Glu Leu Ala Arg Thr Cys Glu Ala A.sp Gly Val Arg Ala Arg Ile Ile Pro Val Asp Tyr Ala Ser His Ser Arg Gln Val Glu Ile Ile Glu Lys Glu 3$ 755 760 765 Leu Ala Glu Val Leu Ala Gly Leu Ala Pro Gln Ala Pro His Val Pro Phe Phe Ser Thr Leu Glu Gly Thr Trp Ile Thr Glu Pro Val Leu Asp 4~ Gly Thr Tyr Trp Tyr Arg Asn Leu Arg His Arg Val Gly Phe Ala Pro Ala Val Glu Thr Leu Ala Val Asp Gly Phe Thr His Phe Ile Glu Val Ser Ala His Pro Val Leu Thr Met Thr Leu Pro Glu Thr Val Thr Gly $ 835 840 845 Leu Gly Thr Leu Arg Arg G1u Gln Gly Gly Gln Glu Arg Leu Val Thr Ser Leu Ala Glu Ala Trp Ala Asn Gly Leu Thr Ile Asp Trp Ala Pro 1~ Ile Leu Pro Thr Ala Thr Gly His His Pro Glu Leu Pro Thr Tyr Ala Phe Gln Thr Glu Arg Phe Trp Leu Gln Ser Ser Ala Pro Thr Ser Ala Ala Asp Asp Trp Arg Tyr Arg Val Glu Trp Lys Pro Leu Thr Ala Ser Gly Gln Ala Asp Leu Ser Gly Arg Trp Ile Val Ala Val Gly Ser Glu Pro Glu Ala Glu Leu Leu Gly Ala Leu Lys Ala Ala Gly Ala Glu Val 2~ Asp Val Leu Glu Ala Gly Ala Asp Asp Asp Arg Glu Ala Leu Ala Ala Arg Leu Thr Ala Leu Thr Thr Gly Asp Gly Phe Thr Gly Val Val Ser Leu Leu Asp Asp Leu Val Pro Gln Val Ala Trp Val Gln Ala Leu Gly Asp Ala Gly Ile Lys Ala Pro Leu Trp Ser Val Thr Gln Gly Ala Val Ser Val Gly Arg Leu Asp Thr Pro Ala Asp Pro Asp Arg Ala Met Leu 30 Trp Gly Leu Gly Arg Val Val .Ala Leu Glu His Pro Glu Arg Trp Ala Gly Leu Val Asp Leu Pro Ala Gln Pro Asp Ala Ala Ala Leu Ala His Leu Val Thr Ala Leu Ser Gly i~.la Thr Gly Glu Asp Gln Ile Ala Ile 35 1075 :1080 1085 Arg Thr Thr Gly Leu His Ala Arg Arg Leu Ala Arg Ala Pro Leu His Gly Arg Arg Pro Thr Arg Asp '.Crp Gln Pro His Gly Thr Val Leu Ile ~~~ Thr Gly Gly Thr Gly Ala Leu Gly Ser His Ala Ala Arg Trp Met Ala DEMANDES OU Bk~EVETS VOLUMINEUX

JUMBO APPLICATIONS/PATENTS
THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE VOLUME.
THIS IS VOLUME _ OF _
NOTE: For additional volumes, please contact the Canadian Patent Office.
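
The sequence listing above pairs SEQ ID NO:34 (a 4689-nucleotide DNA) with SEQ ID NO:35 (a 1562-residue protein). As a minimal sketch, assuming the standard genetic code applies to this Streptomyces venezuelae sequence, the following Python snippet translates the first 60 nucleotides of SEQ ID NO:34 as printed in the listing so that the result can be compared with the opening residues of SEQ ID NO:35. The names `CODON_TABLE`, `THREE_LETTER`, `FIRST_60_NT` and `translate` are illustrative only and do not appear in the application.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# the standard table is appropriate for this organism).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue codes, as used in the sequence listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

# First 60 nucleotides of SEQ ID NO:34, copied from the listing.
FIRST_60_NT = "atggcgaacaacgaagacaagctccgcgactacctcaagcgcgtcaccgccgagctgcag"


def translate(dna):
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    dna = dna.upper()
    residues = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends the open reading frame
            break
        residues.append(THREE_LETTER[aa])
    return residues


print(" ".join(translate(FIRST_60_NT)))
# Met Ala Asn Asn Glu Asp Lys Leu Arg Asp Tyr Leu Lys Arg Val Thr Ala Glu Leu Gln

# Length check: 4689 nt = 1563 codons = 1562 residues plus one stop codon.
print(4689 // 3 - 1)  # 1562
```

The translated residues agree with the first twenty residues listed for SEQ ID NO:35, and the stated lengths (4689 nucleotides, 1562 residues) are consistent with a single open reading frame ending in one stop codon.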

Claims (60)

WHAT IS CLAIMED IS:
1. An isolated and purified nucleic acid segment comprising a nucleic acid sequence comprising a desosamine biosynthetic gene cluster, a fragment or a biologically active variant thereof, wherein the nucleic acid sequence is not derived from the eryC gene cluster of Saccharopolyspora erythraea or Streptomyces antibioticus.
2. The isolated and purified nucleic acid segment of claim 1 comprising SEQ ID NO:3.
3. The isolated and purified nucleic acid segment of claim 1 which encodes DesI, DesII, DesIII, DesIV, DesV, DesVI, DesVII, DesVIII or DesR.
4. The isolated and purified nucleic acid segment of any one of claims 1 to 3 which is from Streptomyces venezuelae.
5. An expression cassette comprising the nucleic acid segment of any one of claims 1 to 4 operably linked to a promoter functional in a host cell.
6. A recombinant bacterial host cell in which at least a portion of a nucleic acid sequence encoding desosamine is disrupted so as to alter desosamine synthesis, wherein the nucleic acid sequence which is disrupted is not derived from the eryC gene cluster of Saccharopolyspora erythraea.
7. The host cell of claim 6 wherein the nucleic acid sequence which is disrupted encodes DesI, DesII, DesIII, DesIV, DesV, DesVI, DesVII, DesVIII or DesR.
8. A host cell, the genome of which is augmented with the expression cassette of claim 5.
9. A product produced by the host cell of any one of claims 6 to 8 which is not produced by the corresponding non-recombinant or non-augmented host cell.
10. The product of claim 9 which comprises a macrolide.
11. The product of claim 9 or 10 which is biologically active.
12. An isolated and purified nucleic acid segment comprising a nucleic acid sequence comprising a macrolide biosynthetic gene cluster encoding polypeptides which synthesize methymycin, pikromycin, neomethymycin, narbomycin, or a combination thereof, or a biologically active variant or fragment thereof.
13. The isolated and purified nucleic acid segment of claim 12 comprising SEQ ID NO:5.
14. The isolated and purified nucleic acid segment of claim 12 comprising a biologically active variant or fragment of SEQ ID NO:5.
15. The isolated and purified nucleic acid segment of claim 12 which encodes PikR1, PikR2, PikAI, PikAII, PikAIII, PikAIV, PikAV, PikC or PikD.
16. The isolated and purified nucleic acid segment of any one of claims 12 to 15 which is from Streptomyces venezuelae.
17. A host cell, the genome of which is augmented with the nucleic acid segment of any one of claims 12 to 16.
18. An isolated and purified nucleic acid sequence comprising SEQ ID NO:3, SEQ ID NO:5, a fragment thereof, the complement thereto, or which hybridizes thereto.
19. An isolated polypeptide encoded by the nucleic acid segment of any one of claims 1 to 4 or 12 to 16.
20. A recombinant host cell in which a pikAI gene, a pikAII gene, a pikAIII gene, a pikAIV gene, a pikB gene cluster, a pikAV gene cluster, a pikC gene, a pikR1 gene, a pikR2 gene, or a combination thereof, is disrupted so as to alter production of methymycin, neomethymycin, pikromycin, narbomycin, or a combination thereof.
21. A macrolide or polyketide product produced by the host cell of claim 17 or 20 which is not produced by the corresponding non-recombinant or non-augmented host cell.
22. The macrolide or polyketide of claim 21 which is biologically active.
23. An isolated and purified DNA molecule comprising a first DNA segment encoding a first module and a second DNA segment encoding a second module, wherein the DNA segments together encode a recombinant polyhydroxyalkanoate monomer synthase, and wherein at least one DNA segment is derived from the pikA gene cluster of Streptomyces venezuelae.
24. A method of providing a polyhydroxyalkanoate monomer, comprising:
(a) introducing into a host cell a DNA molecule comprising a DNA segment encoding a recombinant polyhydroxyalkanoate monomer synthase operably linked to a promoter functional in the host cell, wherein the recombinant polyhydroxyalkanoate monomer synthase comprises a first module and a second module, and wherein at least one DNA segment is derived from the pikA gene cluster of Streptomyces venezuelae; and
(b) expressing the DNA encoding the recombinant polyhydroxyalkanoate monomer synthase in the host cell so as to generate a polyhydroxyalkanoate monomer.
25. A recombinant vector comprising one or more modules of a polyketide synthase wherein at least one module is from Streptomyces venezuelae.
26. The method of claim 24 wherein the first module encodes a fatty acid synthase.
27. A method of providing a polyhydroxyalkanoate monomer, comprising:
(a) introducing into a host cell a DNA molecule encoding a fusion polypeptide, wherein the DNA molecule comprises a first DNA segment operably linked to a promoter functional in the host cell and a second DNA segment, wherein at least one DNA segment is derived from the pikA gene cluster of Streptomyces venezuelae; and
(b) expressing the DNA in the host cell so as to generate the fusion polypeptide.
28. The host cell of claim 8 or 17 the native genome of which does not comprise an intact macrolide biosynthetic gene cluster encoding polypeptides which synthesize methymycin, pikromycin, neomethymycin, or narbomycin.
29. A recombinant bacterial host cell comprising a deletion of the thioesterase domain of pikAIV gene.
30. The recombinant host cell of claim 29 further comprising a deletion in the pikAV gene.
31. An isolated and purified DNA molecule comprising a DNA segment comprising a pikA promoter.
32. An expression cassette comprising a pikA promoter operably linked to a DNA molecule comprising a DNA segment comprising an open reading frame or a portion thereof.
33. The expression cassette of claim 32 wherein the DNA segment encodes the thioesterase domain of pikAIV.
34. The expression cassette of claim 32 wherein the DNA segment encodes the thioesterase II domain of pikAV.
35. The expression cassette of claim 33 further comprising an acyl carrier protein domain.
36. The expression cassette of claim 35 further comprising a thioesterase II domain.
37. The expression cassette of claim 35 further comprising an acyl transferase domain.
38. The expression cassette of claim 37 further comprising a β-ketoacyl-acyl carrier protein synthase domain.
39. The expression cassette of any one of claims 32 to 38 wherein the DNA molecule comprises a second DNA segment comprising the leader sequence of pikAI operably linked to the first DNA segment.
40. A host cell transformed with a plasmid comprising the expression cassette of any one of claims 32 to 39.
41. The host cell of claim 40 which lacks the thioesterase domain of pikAIV gene cluster and the thioesterase II domain of pikAV gene.
42. A method to alter polyketide chain length, comprising: expressing in a host cell an expression cassette comprising at least a portion of a DNA segment that encodes a module that catalyzes the final condensation of a polyketide so as to yield a polyketide product which is of a different length relative to a polyketide produced by a host cell which does not express the module, wherein the DNA segment that encodes an intact module encodes two different polypeptides, one of which has a lower molecular weight than the other polypeptide.
43. The method of claim 42 wherein the intact module is pikA module 6.
44. The method of claim 42 wherein the expression cassette is present on a plasmid.
45. The method of claim 42 wherein the host cell is a polyketide-producing host cell.
46. A product produced by the method of claim 42 which is not produced by a host cell which does not express the module.
47. A method to prepare a polyketide product, comprising: expressing in a host cell an expression cassette comprising a promoter operably linked to a DNA segment comprising a portion of a first polyketide synthase gene so as to yield the product, wherein the expression cassette is present on a plasmid, wherein the chromosome of the host cell comprises at least a portion of a second polyketide synthase gene, and wherein both portions are operably linked to the native polyketide promoter of one of the polyketide genes.
48. The method of claim 47 wherein the portions are from the same polyketide gene and wherein the portion on the host cell chromosome is different than the portion that is on the plasmid.
49. The method of claim 48 wherein the portions together comprise the entire gene.
50. The method of claim 49 wherein the gene is the pikA gene cluster.
51. A host cell, the genome of which comprises at least a portion of a first polyketide synthase gene, comprising: a plasmid comprising a promoter operably linked to a DNA molecule comprising a DNA segment encoding a portion of a second polyketide synthase gene, wherein both portions are operably linked to the native promoter of one of the genes, and wherein the expression of both portions yields a polyketide.
52. The host cell of claim 51 wherein the portions are from the same polyketide gene and wherein the portion on the host cell genome is different than the portion that is on the plasmid.
53. The host cell of claim 52 wherein the portions together comprise the entire gene.
54. The host cell of claim 53 wherein the gene is the pikA gene cluster.
55. A polyketide produced by the host cell of claim 51.
56. Use of a product of claim 9, 21 or 46 for the manufacture of a medicament for the treatment of a pathological condition or symptom in a mammal.
57. The host cell of claim 7 wherein the nucleic acid sequence encoding DesR is disrupted.
58. A product produced by the host cell of claim 57.
59. The host cell of claim 7 wherein the nucleic acid sequence encoding DesI is disrupted.
60. A product produced by the host cell of claim 59.
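
Claim 18 refers to "the complement thereto" of a recited sequence. As a hedged illustration only, the sketch below computes the complementary strand of a short DNA fragment, written in the conventional 5' to 3' direction (the reverse complement). The input is the first twelve nucleotides of SEQ ID NO:34 from the sequence listing, used purely as sample data because SEQ ID NO:3 and SEQ ID NO:5 are not reproduced in this section; the function name `reverse_complement` is an assumption of this sketch and not a term from the application.

```python
# Watson-Crick pairing for unambiguous bases, in both cases.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of seq, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


fragment = "atggcgaacaac"  # first 12 nt of SEQ ID NO:34 (sample input only)
print(reverse_complement(fragment))  # gttgttcgccat

# Taking the reverse complement twice recovers the original strand.
print(reverse_complement(reverse_complement(fragment)) == fragment)  # True
```

This handles only the four unambiguous bases; ambiguity codes would pass through unchanged and would need an extended mapping.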
CA002332129A 1998-06-26 1999-06-25 DNA encoding methymycin and pikromycin Abandoned CA2332129A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US09/105,537 1998-06-26
US09/105,537 US6265202B1 (en) 1998-06-26 1998-06-26 DNA encoding methymycin and pikromycin
PCT/US1999/014398 WO2000000620A2 (en) 1998-06-26 1999-06-25 DNA encoding methymycin and pikromycin

Publications (1)

Publication Number Publication Date
CA2332129A1 (en) 2000-01-06

Family

ID=22306384

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002332129A Abandoned CA2332129A1 (en) 1998-06-26 1999-06-25 DNA encoding methymycin and pikromycin

Country Status (6)

Country Link
US (4) US6265202B1 (en)
EP (1) EP1090125A2 (en)
JP (1) JP2002536959A (en)
AU (1) AU4719999A (en)
CA (1) CA2332129A1 (en)
WO (1) WO2000000620A2 (en)

Families Citing this family (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6902913B2 (en) * 1997-04-30 2005-06-07 Kosan Biosciences, Inc. Recombinant narbonolide polyketide synthase
US6117659A (en) 1997-04-30 2000-09-12 Kosan Biosciences, Inc. Recombinant narbonolide polyketide synthase
US6503741B1 (en) * 1998-05-28 2003-01-07 Kosan Biosciences, Inc. Polyketide synthase genes from Streptomyces venezuelae
ATE300618T1 (en) * 1999-03-15 2005-08-15 Univ Laval METHOD FOR PRODUCING POLYHYDROXYALKANOATES IN RECOMBINANT ORGANISMS
US7033818B2 (en) * 1999-10-08 2006-04-25 Kosan Biosciences, Inc. Recombinant polyketide synthase genes
CA2424567A1 (en) * 2000-10-05 2002-04-11 Regents Of The University Of Minnesota Method to alter sugar moieties
AR034703A1 (en) * 2001-05-16 2004-03-17 Syngenta Participations Ag METHODS AND COMPOSITIONS TO PREPARE EMAMECTINE.
US20040161839A1 (en) * 2001-10-05 2004-08-19 Hung-Wen Liu Method to alter sugar moieties
AU2003291619A1 (en) * 2002-07-31 2004-04-30 Kosan Biosciences, Inc. Production of glycosylated macrolides in e. coli
WO2005044979A2 (en) * 2003-08-04 2005-05-19 Diversa Corporation Glycosylation enzymes and systems and methods of making and using them
US7288396B2 (en) * 2003-09-11 2007-10-30 Kosan Biosciences Incorporated Biosynthetic gene cluster for leptomycins
KR100649394B1 (en) 2004-04-27 2006-11-24 주식회사 진켐 Novel Olivosyl Methymycin Derivatives and Method for Preparing the Same
KR100636653B1 (en) 2004-04-27 2006-10-19 주식회사 진켐 Novel Olivosyl Pikromycin Derivatives and Method for Preparing the Same
EP2342335B1 (en) 2008-09-24 2015-09-16 Shanghai Institute Of Organic Chemistry, Chinese Academy Of Sciences Novel gene cluster
US8292863B2 (en) 2009-10-21 2012-10-23 Donoho Christopher D Disposable diaper with pouches
KR20140012072A (en) 2011-01-28 2014-01-29 아미리스 인코퍼레이티드 Gel-encapsulated microcolony screening
CN103518136A (en) 2011-05-13 2014-01-15 阿迈瑞斯公司 Methods and compositions for detecting microbial production of water-immiscible compounds
CA2879178C (en) 2012-08-07 2020-11-24 Amyris, Inc. Methods for stabilizing production of acetyl-coenzyme a derived compounds
CA2903053C (en) 2013-03-15 2023-01-17 Amyris, Inc. Use of phosphoketolase and phosphotransacetylase for production of acetyl-coenzyme a derived compounds
EP3663392A1 (en) 2013-08-07 2020-06-10 Amyris, Inc. Methods for stabilizing production of acetyl-coenzyme a derived compounds
US10988513B2 (en) 2015-06-25 2021-04-27 Amyris, Inc. Maltose dependent degrons, maltose-responsive promoters, stabilization constructs, and their use in production of non-catabolic compounds
CN106916834B (en) * 2015-12-24 2022-08-05 武汉合生科技有限公司 Biosynthetic gene cluster of compounds and application thereof

Family Cites Families (53)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS61205484A (en) 1985-03-09 1986-09-11 Sanraku Inc Novel plasmid
US4935340A (en) 1985-06-07 1990-06-19 Eli Lilly And Company Method of isolating antibiotic biosynthetic genes
JPS6229595A (en) 1985-07-31 1987-02-07 Toyo Jozo Co Ltd 5-o-mycaminosyl-narbonolide derivative and production thereof
JPS6261765A (en) 1985-09-09 1987-03-18 Kawasaki Steel Corp Roller apron for continuous casting
AU598516B2 (en) 1985-12-17 1990-06-28 Lubrizol Genetics Inc. Isolation of genes for biosynthesis of polyketide antibiotics
CA1340599C (en) 1986-03-21 1999-06-22 Karen Leigh Cox Antibiotic-producing microorganisms
US5672497A (en) 1986-03-21 1997-09-30 Eli Lilly And Company Method for increasing the antibiotic-producing ability of antibiotic-producing microorganisms
US4874748A (en) 1986-03-24 1989-10-17 Abbott Laboratories Cloning vectors for streptomyces and use thereof in macrolide antibiotic production
US5149639A (en) 1986-03-24 1992-09-22 Abbott Laboratories Biologically pure cultures of streptomyces and use thereof in macrolide antibiotic production
US4952502A (en) 1987-02-24 1990-08-28 Eli Lilly And Company Carbomycin biosynthetic gene, designated carG, for use in streptomyces and other organisms
US5229279A (en) 1987-06-29 1993-07-20 Massachusetts Institute Of Technology Method for producing novel polyester biopolymers
US5245023A (en) 1987-06-29 1993-09-14 Massachusetts Institute Of Technology Method for producing novel polyester biopolymers
US5480794A (en) 1987-06-29 1996-01-02 Massachusetts Institute Of Technology And Metabolix, Inc. Overproduction and purification of soluble PHA synthase
US5250430A (en) 1987-06-29 1993-10-05 Massachusetts Institute Of Technology Polyhydroxyalkanoate polymerase
US5168052A (en) 1988-03-28 1992-12-01 Eli Lilly And Company Method for producing 20-deoxotylosin
US5063155A (en) 1988-03-28 1991-11-05 Eli Lilly And Company Method for producing 2"'-o-demethyltylosin
GB8811036D0 (en) 1988-05-10 1988-06-15 Glaxo Group Ltd Chemical compounds
US5068189A (en) 1988-05-13 1991-11-26 Eli Lilly And Company Recombinant dna vectors encoding a 4"-o-isovaleryl acylase derived from a carbomycin biosynthetic gene, designated care, for use in streptomyces and other organisms
US5098837A (en) 1988-06-07 1992-03-24 Eli Lilly And Company Macrolide biosynthetic genes for use in streptomyces and other organisms
US5057425A (en) 1988-07-29 1991-10-15 Eli Lilly And Company Picromycin resistance-conferring gene, designated pica, for use in streptomyces and other organisms
EP0361905A3 (en) 1988-09-29 1991-07-10 Eli Lilly And Company Carbomycin biosynthetic genes, designated carl and carm, for use in streptomyces and other organisms
US5149638A (en) 1988-09-29 1992-09-22 Eli Lilly And Company Tylosin biosynthetic genes tylA, tylB and tylI
US5141926A (en) 1990-04-18 1992-08-25 Abbott Laboratories Erythromycin derivatives
DK0468217T3 (en) 1990-07-26 1998-10-07 American Cyanamid Co A bifunctional cosmid useful for cloning afactinomycete
US5824513A (en) 1991-01-17 1998-10-20 Abbott Laboratories Recombinant DNA method for producing erythromycin analogs
US6060234A (en) 1991-01-17 2000-05-09 Abbott Laboratories Polyketide derivatives and recombinant methods for making same
WO1993013663A1 (en) 1992-01-17 1993-07-22 Abbott Laboratories Method of directing biosynthesis of specific polyketides
FI93860C (en) 1991-03-25 1995-06-12 Leiras Oy Method for producing antibiotics, useful DNA sequences therein, hybrid DNA constructs and those containing microbe strains
US5610041A (en) 1991-07-19 1997-03-11 Board Of Trustees Operating Michigan State University Processes for producing polyhydroxybutyrate and related polyhydroxyalkanoates in the plastids of higher plants
US5514544A (en) 1991-07-26 1996-05-07 Eli Lilly And Company Activator gene for macrolide biosynthesis
US5672491A (en) 1993-09-20 1997-09-30 The Leland Stanford Junior University Recombinant production of novel polyketides
US5712146A (en) 1993-09-20 1998-01-27 The Leland Stanford Junior University Recombinant combinatorial genetic library for the production of novel polyketides
PT725778E (en) 1993-09-20 2002-03-28 Univ Leland Stanford Junior RECOMBINANT PRODUCTION OF NEW POLYETTIDES
WO1995012661A1 (en) 1993-11-02 1995-05-11 Merck & Co., Inc. Dna encoding triol polyketide synthase
US5716849A (en) 1994-06-08 1998-02-10 Novartis Finance Corporation Genes for the biosynthesis of soraphen
US5545553A (en) 1994-09-26 1996-08-13 The Rockefeller University Glycosyltransferases for biosynthesis of oligosaccharides, and genes encoding them
WO1997002358A1 (en) 1995-07-06 1997-01-23 The Leland Stanford Junior University Cell-free synthesis of polyketides
US5702717A (en) 1995-10-25 1997-12-30 Macromed, Inc. Thermosensitive biodegradable polymers based on poly(ether-ester)block copolymers
US6600029B1 (en) 1995-12-19 2003-07-29 Regents Of The University Of Minnesota Metabolic engineering of polyhydroxyalkanoate monomer synthases
US5998194A (en) 1995-12-21 1999-12-07 Abbott Laboratories Polyketide-associated sugar biosynthesis genes
CA2197160C (en) 1996-02-22 2007-05-01 Stanley Gene Burgett Platenolide synthase gene
CA2197524A1 (en) 1996-02-22 1997-08-22 Bradley Stuart Dehoff Polyketide synthase genes
US5958745A (en) 1996-03-13 1999-09-28 Monsanto Company Methods of optimizing substrate pools and biosynthesis of poly-β-hydroxybutyrate-co-poly-β-hydroxyvalerate in bacteria and plants
AU731654B2 (en) 1996-07-05 2001-04-05 Biotica Technology Limited Polyketides and their synthesis
US5811272A (en) 1996-07-26 1998-09-22 Massachusetts Institute Of Technology Method for controlling molecular weight of polyhydroxyalkanoates
WO1998007868A1 (en) 1996-08-20 1998-02-26 Novartis Ag Rifamycin biosynthesis gene cluster
DK1291352T3 (en) 1996-09-04 2005-09-19 Abbott Lab 6-O-substituted ketolides with antibacterial activity
WO1998011230A1 (en) 1996-09-13 1998-03-19 Bristol-Myers Squibb Company Polyketide synthases for pradimicin biosynthesis and dna sequences encoding same
US6033883A (en) 1996-12-18 2000-03-07 Kosan Biosciences, Inc. Production of polyketides in bacteria and yeast
EP0977865A1 (en) 1997-02-13 2000-02-09 James Madison University Methods of making polyhydroxyalkanoates comprising 4-hydroxybutyrate monomer units
US6117659A (en) * 1997-04-30 2000-09-12 Kosan Biosciences, Inc. Recombinant narbonolide polyketide synthase
US6090601A (en) 1998-01-23 2000-07-18 Kosan Bioscience Sorangium polyketide synthase
JP2002516090A (en) 1998-05-28 2002-06-04 コーサン バイオサイエンシーズ, インコーポレイテッド Recombinant narbonolide polyketide synthase

Also Published As

Publication number Publication date
US20030087405A1 (en) 2003-05-08
US20020110897A1 (en) 2002-08-15
EP1090125A2 (en) 2001-04-11
US6265202B1 (en) 2001-07-24
US20020164742A1 (en) 2002-11-07
JP2002536959A (en) 2002-11-05
WO2000000620A3 (en) 2000-04-13
WO2000000620A2 (en) 2000-01-06
AU4719999A (en) 2000-01-17


Legal Events

Date Code Title Description
EEER Examination request
FZDE Discontinued