WO1994025567A1 - CLONING AND EXPRESSION OF THE CHONDROITINASE I AND II GENES FROM P. VULGARIS - Google Patents

Publication number
WO1994025567A1
Authority
WO
WIPO (PCT)
Prior art keywords
leu
enzyme
ser
chondroitinase
ala
Prior art date
Application number
PCT/US1994/004495
Other languages
French (fr)
Inventor
Michael Joseph Ryan
Kiran Manohar Khandke
Bruce Clifford Tilley
Jason Arnold Lotvin
Original Assignee
American Cyanamid Company
Priority date
Filing date
Publication date
Application filed by American Cyanamid Company filed Critical American Cyanamid Company
Priority to AU68183/94A priority Critical patent/AU697156B2/en
Priority to EP94916561A priority patent/EP0702715A4/en
Priority to JP6524437A priority patent/JPH09500011A/en
Publication of WO1994025567A1 publication Critical patent/WO1994025567A1/en

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/88: Lyases (4.)

Definitions

  • This invention relates to the DNA sequence encoding the major protein component of chondroitinase ABC, which is referred to as "chondroitinase I", from Proteus vulgaris (P. vulgaris).
  • This invention further relates to the DNA sequence encoding a second protein component of chondroitinase ABC, which is referred to as "chondroitinase II", from P. vulgaris.
  • This invention also relates to the cloning and expression of the genes containing these DNA sequences and to the amino acid sequences of the recombinant chondroitinase I and II enzymes encoded by these DNA sequences.
  • This invention additionally relates to methods for the isolation and purification of the recombinantly expressed major protein component of chondroitinase ABC, which is referred to as "chondroitinase I", from Proteus vulgaris (P. vulgaris).
  • This invention further relates to methods for the isolation and purification of the recombinantly expressed second protein component of chondroitinase ABC, which is referred to as "chondroitinase II", from P. vulgaris.
  • These methods provide significantly higher yields and purity than those obtained by adapting for the recombinant enzymes the method previously used for isolating and purifying the native chondroitinase I enzyme from P. vulgaris.
  • Background of the Invention
  • Chondroitinases are enzymes of bacterial origin which have been described as having value in dissolving the cartilage of herniated discs without disturbing the stabilizing collagen components of those discs.
  • Examples of chondroitinase enzymes include chondroitinase ABC, which is produced by the bacterium P. vulgaris, and chondroitinase AC, which is produced by A. aurescens.
  • The chondroitinases function by degrading polysaccharide side chains in protein-polysaccharide complexes, without degrading the protein core.
  • Yamagata et al. describe the purification of the enzyme chondroitinase ABC from extracts of P. vulgaris (Bibliography entry 1).
  • the enzyme selectively degrades the glycosaminoglycans chondroitin-4-sulfate, dermatan sulfate and chondroitin-6-sulfate (also referred to respectively as chondroitin sulfates A, B and C) at pH 8 at higher rates than chondroitin or hyaluronic acid.
  • the enzyme did not attack keratosulfate, heparin or heparitin sulfate. Kikuchi et al.
  • glycosaminoglycan degrading enzymes such as chondroitinase ABC
  • chondroitinase ABC glycosaminoglycan degrading enzymes
  • Brown describes a method for treating intervertebral disc displacement in mammals, including humans, by injecting into the intervertebral disc space effective amounts of a solution containing chondroitinase ABC (3) .
  • the chondroitinase ABC was isolated and purified from extracts of P. vulgaris. This native enzyme material functioned to dissolve cartilage, such as herniated spinal discs. Specifically, the enzyme causes the selective chemonucleolysis of the nucleus pulposus which contains proteoglycans and randomly dispersed collagen fibers.
  • Hageman describes an ophthalmic vitrectomy method for selectively and completely disinserting the ocular vitreous body, epiretinal membranes or fibrocellular membranes from the neural retina, ciliary epithelium and posterior lens surface of the mammalian eye as an adjunct to vitrectomy, by administering to the eye an effective amount of an enzyme which disrupts or degrades chondroitin sulfate proteoglycan localized specifically to sites of vitreoretinal adhesion and thereby permits complete disinsertion of said vitreous body and/or epiretinal membranes (4).
  • the enzyme can be a protease-free glycosaminoglycanase, such as chondroitinase ABC.
  • chondroitinase ABC obtained from Seikagaku Kogyo Co., Ltd., Tokyo, Japan.
  • isolating and purifying the chondroitinase ABC enzyme from the Seikagaku Kogyo material it was noted that there was a correlation between effective preparations of the chondroitinase in vitrectomy procedures and the presence of a second protein having an apparent molecular weight (by SDS-PAGE) slightly greater than that of the major protein component of chondroitinase ABC.
  • the second protein is now designated "chondroitinase II", while the major protein component of chondroitinase ABC is referred to as "chondroitinase I."
  • the chondroitinase I and II proteins are basic proteins at neutral pH, with similar isoelectric points of 8.30-8.45. Separate purification of the chondroitinase I and II forms of the native enzyme revealed that it was the combination of the two proteins that was active in the surgical vitrectomy rather than either of the proteins individually.
  • chondroitinase I and II forms of the native enzyme have been limited by the small amounts of enzymes obtained from native sources.
  • the production and purification of the native forms of the enzyme has been carried out using fermentations of P. vulgaris in which its substrate has been used as the inducer to initiate production of these forms of the enzyme.
  • a combination of factors, including low levels of synthesis, the cost and availability of the inducer (chondroitin sulfate) , and the opportunistically pathogenic nature of P. vulgaris, has resulted in the requirement for a more efficient method of production.
  • the native forms of the enzyme produced by conventional techniques are subject to degradation by proteases present in the bacterial extract.
  • chondroitinase I and chondroitinase II in quantities not readily achievable using present non-recombinant bacterial fermentation and extraction techniques. It is a further object of this invention to produce chondroitinase I and chondroitinase II, each in a form substantially free of proteases which would otherwise degrade the enzyme and cause a loss of its activity.
  • this invention is directed to the cloning of the P. vulgaris gene for chondroitinase I and the high level expression of that enzyme in E. coli, as well as the cloning of the P. vulgaris gene for chondroitinase II and the high level expression of that enzyme in E. coli.
  • This invention provides a purified isolated DNA fragment of P. vulgaris which comprises a sequence encoding for chondroitinase I.
  • This invention further provides a purified isolated DNA fragment of P. vulgaris which hybridizes with a nucleic acid sequence encoding for amino acids as follows: (a) the chondroitinase I enzyme with its signal peptide (SEQ ID NO:2, amino acids 1-1021) or a biological equivalent thereof (encoded for example by: (1) nucleotides numbered 119-3181 of SEQ ID NO:1, and (2) nucleotides numbered 119-3181 of SEQ ID NO:3, where the three nucleotides immediately upstream of the initiation codon are changed (SEQ ID NO:3, nucleotides 116-118)); (b) the mature chondroitinase I enzyme (SEQ ID NO:2, amino acids 25-1021) or a biological equivalent thereof (encoded for example by: (1) nucleotides numbered 191-3181 of SEQ ID
  • nucleotides numbered 191-3181 of SEQ ID NO:3, where the three nucleotides immediately upstream of the initiation codon are changed (SEQ ID NO:3, nucleotides 116-118)
  • the recombinant chondroitinase I is produced by transforming a host cell with a plasmid containing a purified isolated DNA fragment of P. vulgaris which contains one of the above-described sequences, and culturing the host cell under conditions which permit expression of the enzyme by the host cell.
  • This invention also provides a purified isolated DNA fragment of P. vulgaris which comprises a sequence encoding for chondroitinase II.
  • This invention further provides a purified isolated DNA fragment from P. vulgaris which hybridizes with a nucleic acid sequence encoding for amino acids as follows:
  • the recombinant chondroitinase II is produced by transforming a host cell with a plasmid containing a purified isolated DNA fragment of P. vulgaris which contains one of the above-described sequences, and culturing the host cell under conditions which permit expression of the enzyme by the host cell.
  • the first method comprises the steps of:
  • step (d) loading the eluate from step (c) to a cation exchange resin-containing column so that the enzyme in the eluate binds to the cation exchange column; and (e) eluting the enzyme bound to the cation exchange column with a solvent capable of releasing the enzyme from the column.
  • step (d) loading the eluate from step (c) to a cation exchange resin-containing column so that the enzyme in the eluate binds to the cation exchange column;
  • step (f) loading the eluate from step (e) to an anion exchange resin-containing column and eluting the enzyme with a solvent such that the chondroitin sulfate binds to the column;
  • step (g) concentrating the eluate from step (f) and crystallizing out the enzyme from the supernatant which contains an approximately 37 kD contaminant.
  • step (b) of the first method just described the following two steps are performed:
  • Figure 1 depicts a preliminary restriction map for the subcloned approximately 10 kilobase Nsi fragment in pIBI24.
  • the Nsi fragment contains the complete gene encoding chondroitinase I and a portion of the gene encoding chondroitinase II.
  • the restriction sites are shown in their approximate positions. The restriction sites are useful in the constructions described below; other restriction sites present are not shown in this Figure; some are set forth in Example 13 below.
  • Figure 2 depicts the elution of the recombinant chondroitinase I enzyme from a cation exchange chromatography column using a sodium chloride gradient. The method used to purify the native enzyme is used here to attempt to purify the recombinant enzyme.
  • the initial fractions at the left do not bind to the column. They contain the majority of the chondroitinase I enzyme activity. The fractions at right containing the enzyme are marked "eluted activity". The gradient is from 0.0 to 250 mM NaCl.
  • Figure 3 depicts the elution of the recombinant chondroitinase I enzyme from a cation exchange column, after first passing the supernatant through an anion exchange column, in accordance with a method of this invention.
  • the initial fractions at the left do not bind to the column, and contain only traces of chondroitinase I activity.
  • the fractions at right containing the enzyme are marked "eluted activity".
  • the gradient is from 0.0 to 250 mM NaCl.
  • Figure 4 depicts sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) of the recombinant chondroitinase I enzyme before and after the purification methods of this invention are used. In the SDS-PAGE gel photograph:
  • Lane 1 is the enzyme purified using the method of the first embodiment of the invention
  • Lane 2 is the enzyme purified using the method of the second embodiment of the invention
  • Lane 3 represents the supernatant from the host cell prior to purification -- many other proteins are present
  • Lane 4 represents the following molecular weight standards: 14.4 kD - lysozyme; 21.5 kD - trypsin inhibitor; 31 kD - carbonic anhydrase; 42.7 kD - ovalbumin; 66.2 kD - bovine serum albumin; 97.4 kD - phosphorylase B; 116 kD - beta-galactosidase; 200 kD - myosin. A single sharp band is seen in Lanes 1 and 2.
  • Figure 5 depicts SDS-PAGE chromatography of the recombinant chondroitinase II enzyme during various stages of purification using a method of this invention.
  • Lane 1 is the crude supernatant after diafiltration
  • Lane 2 is the eluate after passage of the supernatant through an anion exchange resin-containing column
  • Lane 3 is the enzyme after elution through a cation exchange resin-containing column
  • Lane 4 is the enzyme after elution through a second anion exchange resin-containing column
  • Lane 5 represents the same molecular weight standards as described for Figure 4, plus 6.5 kD - aprotinin
  • Lane 6 is the same as Lane 4, except it is overloaded to show the approximately 37 kD contaminant
  • Lane 7 is the 37 kD contaminant in the supernatant after crystallization of the chondroitinase II enzyme
  • Lane 8 is the first wash of the crystals
  • Lane 9 is the second wash of the crystals
  • Lane 10 is the enzyme in the washe
  • oligonucleotides designed to bracket part of the chondroitinase I gene
  • DNA synthesis is carried out in vitro.
  • This cycle of denaturation, annealing and DNA synthesis using the oligonucleotides as primers is repeated many times (e.g., 30), with the yield of the desired product (the DNA fragment that lies between the two oligonucleotides) increasing exponentially with each cycle.
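The exponential growth described above can be illustrated with a minimal sketch. It assumes an idealized model in which a fixed fraction of templates is copied in every cycle; the function name and the `efficiency` parameter are illustrative and do not come from the patent, and real reactions fall short of this ideal in later cycles.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized number of product copies after a given number of PCR cycles.

    efficiency is the assumed fraction of templates copied per cycle
    (1.0 means perfect doubling). This is a textbook model, not a
    measurement from the patent.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # 30 cycles is the example cycle count given in the text.
    for n in (10, 20, 30):
        print(f"{n} cycles: {pcr_copies(1, n):.3g} copies per starting template")
```

With perfect doubling, 30 cycles correspond to 2^30 copies per starting template, which is why even a small amount of genomic DNA can yield a visible band on a gel.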
  • a putative nucleotide sequence of the appropriate oligonucleotides is constructed from available amino acid sequence information derived from the protein purified from P. vulgaris bacteria.
  • the DNA fragment produced by PCR is cloned and its DNA sequence determined to verify that it is part of the chondroitinase I gene. It is then labeled and used as a probe to indicate which members of the gene bank actually contain the chondroitinase I gene. Subsequent restriction mapping and Southern hybridization narrows the location to a piece of DNA of approximately four thousand base-pairs (bp). This is then sequenced using the Sanger dideoxy chain termination method (6) to reveal the exact position of the gene and guide the subsequent manipulations used to place the gene into a high-level expression system in E. coli.
  • A fermentation at a 10 liter scale carried out with this E. coli strain containing a recombinant plasmid expressing the P. vulgaris chondroitinase I gene yields a maximum chondroitinase I titer of approximately 600 units/ml (which is the same as 1.2 mg/ml). This yield far exceeds that of the native P. vulgaris fermentation process which had not achieved a titer of more than 2 units/ml.
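The titer figures quoted above imply a specific activity and a fold improvement that can be checked with simple arithmetic. The sketch below uses only the three numbers stated in the text; the function and constant names are illustrative.

```python
# Figures stated in the text.
RECOMBINANT_TITER_UNITS_PER_ML = 600.0  # approximate maximum recombinant titer
RECOMBINANT_PROTEIN_MG_PER_ML = 1.2     # stated as equivalent to 600 units/ml
NATIVE_MAX_TITER_UNITS_PER_ML = 2.0     # native process: no more than 2 units/ml


def specific_activity(units_per_ml: float, mg_per_ml: float) -> float:
    """Enzyme activity per milligram of protein (units/mg)."""
    return units_per_ml / mg_per_ml


def fold_improvement(new_titer: float, old_titer: float) -> float:
    """Ratio of two titers expressed in the same units."""
    return new_titer / old_titer


if __name__ == "__main__":
    activity = specific_activity(RECOMBINANT_TITER_UNITS_PER_ML,
                                 RECOMBINANT_PROTEIN_MG_PER_ML)
    fold = fold_improvement(RECOMBINANT_TITER_UNITS_PER_ML,
                            NATIVE_MAX_TITER_UNITS_PER_ML)
    print(f"implied specific activity: {activity:.0f} units/mg")
    print(f"titer improvement over native process: at least {fold:.0f}-fold")
```

The ratio of 600 to 2 units/ml is consistent with the roughly 300-fold increase this document attributes to the deposited host cell.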
  • genomic DNA is obtained. DNA is separated from protein and other material contained in a P. vulgaris fermentation. Study of the genomic DNA is facilitated by the insertion of fragments of the DNA into cosmid vectors.
  • the genomic DNA is digested with an appropriate restriction endonuclease, such as Sau3A, and then ligated into a cosmid vector.
  • the packaged recombinant cosmids containing the P. vulgaris DNA fragments are introduced into an appropriate bacterial host strain, such as an E. coli strain, and the resulting culture is grown to allow gene expression.
  • the gene banks are engineered to contain a marker, such as ampicillin or kanamycin resistance, to assist in the screening of the gene banks for the presence of the chondroitinase I gene.
  • Applicants have conducted some amino acid sequencing of the native chondroitinase I enzyme. Samples of the enzyme are generated by fermentation of P. vulgaris. Samples may also be obtained from Seikagaku Kogyo Co., Ltd., Tokyo, Japan. The amino acid sequence information is used to design oligonucleotides for use in screening for the chondroitinase I gene.
  • oligonucleotides are designed for use in PCR.
  • a first set of oligonucleotides is designed so as to encode a heptapeptide that has minimal degeneracy of its genetic code. Seven amino acids near the amino terminus of the chondroitinase I enzyme (SEQ ID NO:2, amino acids 19-25) are potentially encoded by 512 different nucleotide sequences (SEQ ID NO:6; see Example 2) . The number of potential sequences is reduced to 32 by selecting specific nucleotides at the 5' end, because of the observation that mismatched nucleotides in PCR primers are of less consequence at the 5' end than at the 3' end of the primer (7) . The sequences of the pool of 32 primers are set out at SEQ ID NOS:7-14.
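The degeneracy counts above (512 candidate sequences reduced to a pool of 32) follow from multiplying the number of codons available for each residue. The sketch below shows that calculation with the standard genetic code. The heptapeptide used is hypothetical, chosen only because it happens to give 512; it is not the sequence of SEQ ID NO:2, amino acids 19-25. Fixing whole codons is also a simplification of the patent's approach of selecting specific nucleotides at the 5' end.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}

# Hypothetical heptapeptide, NOT taken from the patent.
HYPOTHETICAL_PEPTIDE = "AADEKNQ"


def degeneracy(peptide: str) -> int:
    """Number of distinct nucleotide sequences that can encode the peptide."""
    return prod(CODONS_PER_RESIDUE[res] for res in peptide.upper())


def pool_size(peptide: str, fixed_positions: set[int]) -> int:
    """Degeneracy remaining after one codon is chosen at each fixed position.

    fixed_positions holds 0-based residue indices whose codon is fixed,
    for example residues at the 5' end of the primer.
    """
    return prod(
        CODONS_PER_RESIDUE[res]
        for index, res in enumerate(peptide.upper())
        if index not in fixed_positions
    )


if __name__ == "__main__":
    print(degeneracy(HYPOTHETICAL_PEPTIDE))         # all codon combinations
    print(pool_size(HYPOTHETICAL_PEPTIDE, {0, 1}))  # first two codons fixed
```

Fixing positions at the 5' end rather than the 3' end reflects the observation cited in the text that mismatches matter less at the 5' end of a primer.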
  • the approximately 110 kD chondroitinase I enzyme is cleaved proteolytically into an 18,000 MW ("18 kD") fragment and an approximately 90,000 MW ("90 kD") fragment. Furthermore, the 18 kD fragment is further fragmented by treatment with cyanogen bromide and trypsin. The various fragments are then used to design additional sets of oligonucleotide primers for PCR.
  • the complementary strand has the same number of potential sequences (SEQ ID NOS:27 and 28; see Example 2) .
  • the number of potential sequences is reduced to the sequences set out at SEQ ID NOS:29-36.
  • PCR amplifications are conducted using these 24 mixtures of oligonucleotides. The most effective amplifications are observed as discrete bands on electrophoretic gels. Products approximately 500 and 350 base pairs (bp) in size are obtained. The approximately 350 bp product is a subfragment of the approximately 500 bp product. The approximately 500 bp product is isolated and, following successive cloning procedures described in Example 2, is isolated as a 455 bp PCR product.
  • This 455 bp fragment is sequenced and translated into an amino acid sequence which is in virtual agreement with the sequence available from the native chondroitinase I enzyme.
  • the sequences differ by one amino acid; subsequent experiments reveal that the nucleotide and amino acid sequences of the 455 bp fragment are correct, while the native amino acid sequence identification is in error.
  • the PCR amplification fragment is used as a probe to identify the cosmid gene banks prepared in the first stage which contain the chondroitinase I gene.
  • the PCR fragment is denatured and labelled with, for example, digoxigenin-labelled dUTP (Boehringer-Mannheim, Indianapolis, IN).
  • the cosmid gene banks are then used to infect a bacterial strain.
  • the resulting colonies are lysed and their DNA subjected to colony hybridization with the labelled probe, followed by exposure to an alkaline phosphatase-conjugated antibody to the digoxigenin-labelled material. Positive clones are visualized and then picked to be grown in selective media.
  • Southern hybridization (8) and restriction mapping are used to localize the position of the chondroitinase I gene within individual clones.
  • the PCR-generated fragment described above is used as a Southern hybridization probe against P. vulgaris genomic DNA that is first digested by restriction enzymes and fractionated.
  • several of the oligonucleotides described above are used as primers.
  • the results indicate that the portion of the chondroitinase I gene that hybridizes to the probe is carried on several large DNA fragments. These large DNA fragments are digested to yield individual fragments which are isolated, tested for the presence of chondroitinase I sequences by Southern hybridization, and then subcloned into appropriate vectors.
  • Example 3 details the cloning strategy used. Restriction maps are generated to assist in the identification of the portions of the fragments carrying the desired sequences.
  • In vitro chondroitinase I assays, in which the activity of the enzyme is based on measuring the release of unsaturated disaccharide from chondroitin sulfate C at 232 nm, are conducted on several samples to assist in the placement and orientation of the chondroitinase I gene. The results of these procedures suggest that a 4.2 kb EcoRV-EcoRI fragment of a larger 10 kb NsiI fragment could contain the entire chondroitinase I gene.
  • the above-mentioned 4.2 kb fragment is subjected to DNA sequence analysis.
  • the resulting DNA sequence is 3980 nucleotides in length (SEQ ID NO:1).
  • Translation of the DNA sequence into the putative amino acid sequence reveals a continuous open reading frame (SEQ ID NO:1, nucleotides 119-3181) encoding 1021 amino acids (SEQ ID NO:2).
  • analysis of the amino acid sequence reveals a 24 residue signal sequence (SEQ ID NO:2, amino acids 1-24), followed by a 997 residue mature (processed) chondroitinase I enzyme (SEQ ID NO:2, amino acids 25-1021).
  • Signal sequences are required for a complex series of post-translational processing steps which result in secretion of a protein from a host cell.
  • the signal sequence constitutes the amino-terminal end of the protein to be secreted. In most cases, the signal sequence is cleaved off by a specific protease, called a signal peptidase.
  • the "18 kD" and "90 kD" fragments are found to be adjacent to each other, with the "18 kD" fragment constituting the first 157 amino acids of the mature protein (SEQ ID NO:2, amino acids 25-181), and the "90 kD" fragment constituting the remaining 840 amino acids of the mature protein (SEQ ID NO:2, amino acids 182-1021).
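The nucleotide and residue coordinates quoted above can be cross-checked with a short sketch. It uses only numbers stated in the text (open reading frame at nucleotides 119-3181, 1021 residues, a 24-residue signal sequence, fragments at residues 25-181 and 182-1021); the helper names are illustrative.

```python
ORF_START, ORF_END = 119, 3181  # nucleotide range of the open reading frame
TOTAL_RESIDUES = 1021           # residues encoded by the open reading frame
SIGNAL_LENGTH = 24              # residues 1-24 form the signal sequence


def codon_count(start: int, end: int) -> int:
    """Number of whole codons in an inclusive nucleotide range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("range is not a whole number of codons")
    return length // 3


def residue_to_nucleotides(residue: int, orf_start: int = ORF_START) -> tuple[int, int]:
    """Inclusive nucleotide range of the codon for a 1-based residue number."""
    first = orf_start + 3 * (residue - 1)
    return first, first + 2


def span_length(first_residue: int, last_residue: int) -> int:
    """Number of residues in an inclusive residue range."""
    return last_residue - first_residue + 1


if __name__ == "__main__":
    print(codon_count(ORF_START, ORF_END))            # codons in the reading frame
    print(residue_to_nucleotides(SIGNAL_LENGTH + 1))  # first codon of mature enzyme
    print(span_length(25, 181), span_length(182, 1021))
```

The range 119-3181 holds exactly 1021 codons, which suggests that the stated range excludes the stop codon. The first codon of the mature enzyme falls at nucleotide 191, matching the 191-3181 range given for the mature enzyme elsewhere in this document, and the two fragments (157 and 840 residues) add up to the 997 residues of the mature protein.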
  • the chondroitinase I enzyme of this invention is expressed using established recombinant DNA methods.
  • Suitable host organisms include bacteria, viruses, yeast, insect or mammalian cell lines, as well as other conventional organisms.
  • the host cell is transformed with a plasmid containing a purified isolated DNA fragment encoding for chondroitinase I enzyme.
  • the host cell is then cultured under conditions which permit expression of the enzyme by the host cell.
  • the gene is subjected to site-directed mutagenesis to introduce unique restriction sites. These permit the gene to be moved, in the correct reading frame, into an expression system which results in expression of chondroitinase I enzyme at high levels.
  • site-directed mutagenesis to introduce unique restriction sites.
  • In Example 6, two different constructs are prepared.
  • the three nucleotides immediately upstream of the initiation codon are changed (SEQ ID NO:3, nucleotides 116-118) through the use of a mutagenic oligonucleotide (SEQ ID NO:37) .
  • the coding region and amino acid sequence encoded by the resulting construct are not changed, and the signal sequence is preserved (SEQ ID NO:3, nucleotides 119-3181; SEQ ID NO:2) .
  • the second construct is used.
  • the site-directed mutagenesis is carried out at the junction of the signal sequence and the start of the mature protein.
  • a mutagenic oligonucleotide (SEQ ID NO:38) which differs at six nucleotides from those of the native sequence (SEQ ID NO:1, nucleotides 185-190).
  • the sequence differences result in (a) the deletion of the signal sequence, and (b) the addition of a methionine residue at the amino-terminus, resulting in a 998 amino acid protein (SEQ ID NO:4, nucleotides 188-3181; SEQ ID NO:5) .
  • In the absence of a signal sequence, the enzyme is not secreted. Fortunately, it is not retained within the cell in the form of insoluble inclusion bodies. Instead, at least some of the enzyme is produced intracellularly as a soluble active enzyme. The enzyme is extracted by homogenization, which serves to lyse the cells and thereby release the enzyme into the supernatant. Even with the signal sequence present, much of the enzyme is not secreted; it is thought that this expression system produces so much enzyme that it exceeds the capacity of the host cell to secrete it.
  • the gene lacking the signal sequence is inserted into an appropriate expression vector.
  • One such vector is pET-9A (9; Novagen, Madison, WI) , which is derived from elements of the E. coli bacteriophage T7.
  • the resulting recombinant plasmid is designated pTM49-6.
  • the plasmid is then used to transform an appropriate expression host cell, such as the E. coli B strain BL21(DE3)/pLysS (10; Novagen).
  • Samples of this E. coli B strain BL21(DE3)/pLysS carrying the recombinant plasmid pTM49-6 were deposited by Applicants on February 4, 1993, with the American Type Culture Collection, 12301 Parklawn Drive, Rockville, Maryland 20852, U.S.A., and have been assigned ATCC accession number 69234.
  • chondroitinase I enzyme expressed using the deposited host cell yields approximately 300 times the amount of enzyme that was possible using a fermentation vessel of the same size with native (non-recombinant) P. vulgaris.
  • the supernatant from the host cells is treated to isolate and purify the enzyme.
  • Initial attempts to isolate and purify the recombinant chondroitinase I enzyme do not result in high yields of purified protein.
  • the previous method for isolating and purifying native chondroitinase I from fermentation cultures of P. vulgaris is found to be inappropriate for the recombinant material.
  • the native enzyme is produced by fermentation of a culture of P. vulgaris.
  • the bacterial cells are first recovered from the medium and resuspended in buffer.
  • the cell suspension is then homogenized to lyse the bacterial cells.
  • a charged particulate such as Bioacryl (Toso Haas,
  • the solution is then filtered and the retentate is washed to recover most of the enzyme.
  • the filtrate is concentrated and subjected to diafiltration with a phosphate to remove the salt.
  • the filtrate containing the chondroitinase I is subjected to cation exchange chromatography using a cellulose sulfate column. At pH 7.2, 20 mM sodium phosphate, more than 98% of the chondroitinase I binds to the column.
  • the native chondroitinase I is then eluted from the column using a sodium chloride gradient.
  • chondroitinase I is obtained at a purity of 90-97%.
  • the level of purity is measured by first performing SDS-PAGE. The proteins are stained using Coomassie blue, destained, and the lane on the gel is scanned using a laser beam of wavelength 600 nm. The purity is expressed as the percentage of the total absorbance accounted for by that band.
  • the yield of the native protein is only 25-35%.
  • the yield is measured as the remaining activity in the final purified product, expressed as a percentage of the activity at the start (which is taken as 100%) .
  • the activity of the enzyme is based on measuring the release of unsaturated disaccharide from chondroitin sulfate C at 232 nm.
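The purity and yield definitions above reduce to two percentages, sketched below. The function names and the example numbers are hypothetical; only the definitions (purity as the target band's share of total lane absorbance, yield as remaining activity relative to the starting activity taken as 100%) come from the text.

```python
def purity_percent(target_band_absorbance: float, lane_band_absorbances: list[float]) -> float:
    """Share of the total scanned absorbance in a lane accounted for by the target band."""
    total = sum(lane_band_absorbances)
    if total <= 0:
        raise ValueError("total absorbance must be positive")
    return 100.0 * target_band_absorbance / total


def yield_percent(final_activity_units: float, starting_activity_units: float) -> float:
    """Activity remaining in the purified product as a percentage of the starting activity."""
    if starting_activity_units <= 0:
        raise ValueError("starting activity must be positive")
    return 100.0 * final_activity_units / starting_activity_units


if __name__ == "__main__":
    # Hypothetical densitometry readings for one lane; the middle band is the enzyme.
    bands = [5.0, 190.0, 5.0]
    print(f"purity: {purity_percent(bands[1], bands):.1f}%")
    # Hypothetical activity totals before and after purification.
    print(f"yield: {yield_percent(30.0, 100.0):.1f}%")
```

The two measures are independent: a preparation can be highly pure on the gel while most of the starting activity has been lost, which is the situation the text reports for the native enzyme.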
  • This purification method also results in the extensive cleavage of the approximately 110,000 dalton (110 kD) chondroitinase I protein into a 90 kD and an 18 kD fragment. Nonetheless, the two fragments remain non-covalently bound and exhibit chondroitinase I activity.
  • the host cell contains or produces small, negatively charged molecules. These negatively charged molecules bind to the enzyme, thereby reducing the number of positive charges on the enzyme. If these negatively charged molecules bind with high enough affinity to copurify with the enzyme, they can cause an alteration of the behavior of the enzyme on the ion exchange column.
  • cation exchange resins bind to proteins better at lower pH's than higher pH's.
  • a protein which is not very basic, and hence does not bind at a high pH can be made to bind to the cation exchanger by carrying out the operation at a lower pH.
  • the native enzyme binds completely to a cation exchange resin.
  • the recombinant-derived enzyme due to the lowered basicity as a result of binding of the negatively charged molecules, does not bind very well (less than 10%) .
  • This enzyme can be made to bind up to 70% by using a pH of 6.8 and a lower phosphate concentration (5 mM rather than 20 mM) , but heterogeneity and low yield remain great problems. Indeed, only one fermentation results in a 70% binding level; typically, it is much less (less than 10%) even at pH 6.8. This level of binding varies dramatically between different fermentation batches. This hypothesis and a possible solution to the problem are then tested. If negatively charged molecules are attaching non-covalently to chondroitinase I, thus decreasing its basicity, it should be possible to remove these undesired molecules by using a strong, high capacity anion exchange resin. Removal of the negatively charged molecules should then restore the basicity of the enzyme.
  • chondroitinase I is recombinantly expressed in two forms.
  • the enzyme is expressed with a signal peptide, which is then cleaved to produce the mature enzyme.
  • the enzyme is also expressed without a signal peptide, to produce directly the mature enzyme.
  • the two embodiments of this invention which will now be discussed are suitable for use in purifying either of these forms of the enzyme.
  • the host cells which express the recombinant chondroitinase I enzyme are lysed by homogenization to release the enzyme into the supernatant.
  • the supernatant is then subjected to diafiltration to remove salts and other small molecules.
  • this step only removes the free, but not the bound form of the negatively charged molecules.
  • the bound form of these charged species is next removed by passing the supernatant through a strong, high capacity anion exchange resin-containing column.
  • An example of a strong, high capacity anion exchange resin-containing column is the Macro-Prep™ High Q resin (Bio-Rad, Melville, N.Y.).
  • Other strong, high capacity anion exchange columns are also suitable.
  • Weak anion exchangers containing a diethylaminoethyl (DEAE) ligand also are suitable, although they are not as effective.
  • low capacity resins are also suitable, although they too are not as effective.
  • the negatively charged molecules bind to the column, while the enzyme passes through the column. It is also found that some unrelated, undesirable proteins also bind to the column.
  • the eluate from the anion exchange column is directly loaded to a cation exchange resin-containing column.
  • Examples of cation exchange resin-containing columns include the S-Sepharose™ (Pharmacia, Piscataway, N.J.) and the Macro-Prep™ High S (Bio-Rad).
  • the enzyme binds to the column and is then eluted with a solvent capable of releasing the enzyme from the column.
  • Any salt which increases the conductivity of the solution is suitable for elution.
  • salts include sodium salts, as well as potassium salts and ammonium salts.
  • An aqueous sodium chloride solution of appropriate concentration is suitable.
  • a gradient, such as 0 to 250 mM sodium chloride is acceptable, as is a step elution using 200 mM sodium chloride.
  • the purity of the protein is measured by scanning the bands in SDS-PAGE gels. A 4-20% gradient of acrylamide is used in the development of the gels. The band(s) in each lane of the gel is scanned using the procedure described above.
  • In a second embodiment of this aspect of the invention, two additional steps are inserted in the method before the diafiltration step of the first embodiment.
  • the supernatant is treated with an acidic solution to precipitate out the desired enzyme.
  • the pellet is recovered and then dissolved in an alkali solution to again place the enzyme in a basic environment.
  • the solution is then subjected to the diafiltration and subsequent steps of the first embodiment of this invention.
  • Acid precipitation removes proteins that remain soluble; however, these proteins are removed anyway by the cation and anion exchange steps that follow (although smaller columns may be used) .
  • An advantage of the acid precipitation step is that the sample volume is decreased to about 20% of the original volume after dissolution, and hence can be handled more easily on a large scale.
  • the additional acid precipitation and alkali dissolution steps of the second embodiment mean that the second embodiment is more time consuming than the first embodiment.
  • the marginal improvements in purity and yield provided by the second embodiment may be outweighed by the simpler procedure of the first embodiment, which still provides highly pure chondroitinase I enzyme at high yields.
  • An additional benefit of the two embodiments of the invention is that cleavage of the enzyme into 90 kD and 18 kD fragments is avoided.
  • the material deposited with the ATCC can also be used in conjunction with the sequences disclosed herein to regenerate the native chondroitinase I gene sequence (SEQ ID NO:1) or the modified chondroitinase I gene sequence which includes the signal sequence (SEQ ID NO:3) using conventional genetic engineering technology.
  • the present invention further comprises DNA sequences which, by virtue of the redundancy of the genetic code, are biologically equivalent to the sequences which encode for the enzyme, that is, these other DNA sequences are characterized by nucleotide sequences which differ from those set forth herein, but which encode an enzyme having the same amino acid sequences as those encoded by the DNA sequences set forth herein.
  • the invention contemplates those DNA sequences which are sufficiently duplicative of the sequences of SEQ ID NOS:1, 3 or 4 so as to permit hybridization therewith under standard high stringency Southern hybridization conditions, such as those described in Sambrook et al. (11), as well as the biologically active enzymes produced thereby.
  • This invention also comprises DNA sequences which encode amino acid sequences which differ from those of the chondroitinase I enzyme, but which are the biological equivalent to those described for the enzyme (SEQ ID NOS:2 and 5) .
  • Such amino acid sequences may be said to be biologically equivalent to those of the enzyme if their sequences differ only by minor deletions from or conservative substitutions to the enzyme sequence, such that the tertiary configurations of the sequences are essentially unchanged from those of the enzyme.
  • a codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon encoding another less hydrophobic residue, such as glycine, or a more hydrophobic residue, such as valine, leucine, or isoleucine.
  • changes which result in substitution of one negatively charged residue for another, such as aspartic acid for glutamic acid, or one positively charged residue for another, such as lysine for arginine, as well as changes based on similarities of residues in their hydropathic index, can also be expected to produce a biologically equivalent product.
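The substitution rules described above can be sketched in a few lines. This is only an illustration, not part of the disclosed method: the hydropathy values are those of the Kyte-Doolittle scale, and the function name `is_conservative` and the cutoff `max_delta = 3.0` are assumptions chosen so that the examples named in the text pass.

```python
# Kyte-Doolittle hydropathy values for the residues named in the text
# (a higher value means a more hydrophobic side chain).
KYTE_DOOLITTLE = {
    "Gly": -0.4, "Ala": 1.8, "Val": 4.2, "Leu": 3.8, "Ile": 4.5,
    "Asp": -3.5, "Glu": -3.5, "Lys": -3.9, "Arg": -4.5,
}

# Nominal side-chain charge at neutral pH; residues not listed are treated as 0.
CHARGE = {"Asp": -1, "Glu": -1, "Lys": +1, "Arg": +1}


def is_conservative(a: str, b: str, max_delta: float = 3.0) -> bool:
    """Hypothetical rule: same charge class and similar hydropathy.

    max_delta is an arbitrary illustrative cutoff, not a value from the text.
    """
    if CHARGE.get(a, 0) != CHARGE.get(b, 0):
        return False
    return abs(KYTE_DOOLITTLE[a] - KYTE_DOOLITTLE[b]) <= max_delta


for pair in [("Ala", "Gly"), ("Ala", "Ile"), ("Asp", "Glu"), ("Lys", "Arg"), ("Asp", "Lys")]:
    print(pair, is_conservative(*pair))
```

Under these assumptions the alanine, aspartic acid/glutamic acid and lysine/arginine exchanges named in the text are accepted, while a charge reversal such as aspartic acid to lysine is rejected.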
  • the nucleotide sequence determined above for the region encoding the chondroitinase I gene includes an additional approximately 800 base pairs beyond the translation termination codon (SEQ ID NOS:1 and 39, nucleotides 3185-3980).
  • An inspection of this region reveals that the sequence between nucleotides 3307 and 3372 (SEQ ID NOS:1 and 39) encodes the identical 22 amino acids in the same order as the first 22 amino acids of native chondroitinase II.
  • an ATG initiation codon (SEQ ID NOS:1 and 39, nucleotides 3238-3240) is found upstream of this region and in-frame, indicating that this gene is expressed with a 23 amino acid signal peptide sequence for the export of chondroitinase II (SEQ ID NO:40, amino acids 1-23).
  • although a Shine-Dalgarno sequence (AGGA; SEQ ID NOS:1 and 39, nucleotides 3225-3228) is found upstream of the initiation codon, there is no apparent promoter sequence, suggesting that both the 110 kD and 112 kD forms of the P. vulgaris chondroitinase enzyme are expressed as part of a single messenger RNA.
  • the coding sequence that starts with this ATG was originally not found to be continuous in SEQ ID NO:1, since a termination codon (TAA) was thought to be present in-frame at base-pairs identified as 3607-3609. Re-examination of the sequencing data, however, revealed that a residue was overlooked and that a T should be inserted between nucleotides originally identified as 3593 and 3594. This change restores the open reading frame, which then extends through the end of SEQ ID NO:39 (SEQ ID NOS:1 and 39 include the inserted T as nucleotide 3594). (Thus, the three bases TAA at base-pairs 3608-3610, properly numbered, do not constitute a termination codon.)
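The effect of the overlooked residue can be illustrated with a toy sequence. The sketch below is an assumption-laden illustration: the 21-base string is invented and is not taken from SEQ ID NO:1 or 39, and `first_in_frame_stop` is a hypothetical helper. It shows only the mechanism, namely that restoring one missing base upstream moves a TAA triplet out of the reading frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(seq: str):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


# Invented toy sequence that starts at an ATG; one base is missing after index 4.
misread = "ATGGCACCGTAAGCAGAACTG"
# Re-insert the overlooked T, as described for nucleotide 3594 in the text.
corrected = misread[:5] + "T" + misread[5:]

print(first_in_frame_stop(misread))    # an in-frame TAA is reported
print(first_in_frame_stop(corrected))  # the same TAA is now out of frame
```

In the toy case the TAA triplet shifts by one position after the insertion, which parallels the renumbering from 3607-3609 to 3608-3610 described above.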
  • the cloning and expression of the P. vulgaris chondroitinase II gene is performed in three stages.
  • in the first stage, because the N-terminal sequences are known, a site-specific mutagenesis is carried out. This is necessary in order for this gene to be placed, eventually, directly into the desired T7-based expression vector pET9A that is used (as described above) for the chondroitinase I gene.
  • the mutagenized bases are upstream of the coding region (an AT sequence (SEQ ID NOS:1 and 39, base pairs 3235 and 3236) is replaced by a CA sequence).
  • the second stage, which can be carried out in parallel with the first, involves the identification, isolation and DNA sequencing of an appropriate DNA fragment which will include the C-terminal coding region of the chondroitinase II gene.
  • the available DNA sequence information is adequate to account for approximately 220 amino acids of an estimated 1000 for the entire chondroitinase II protein. The missing coding sequences, therefore, would extend for another 2400 base pairs beyond the end of SEQ ID NO: 1.
  • the third stage involves the assembly of an intact gene for chondroitinase II that has been modified to include the initiation codon as part of an NdeI site and to be followed by a BamHI site downstream of the coding region. This allows a directed insertion of this gene into the pET9A expression vector (Novagen, Madison, WI) without further modification.
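How a two-base change upstream of the ATG can place the initiation codon inside an NdeI site (recognition sequence CATATG) is sketched below. The flanking bases in `native` are hypothetical, and the T assumed immediately before the ATG is inferred from the NdeI recognition sequence rather than stated in the text; `mutate` is an illustrative helper.

```python
NDEI_SITE = "CATATG"  # NdeI recognition sequence; its last three bases are ATG


def mutate(seq: str, start: int, new_bases: str) -> str:
    """Replace the bases beginning at 0-based index `start` with `new_bases`."""
    return seq[:start] + new_bases + seq[start + len(new_bases):]


# Hypothetical context: "AT", then an assumed T, then the ATG initiation codon.
native = "GGATTATGCTG"
atg = native.index("ATG")

# Replace the AT located three and two bases upstream of the ATG with CA.
mutated = mutate(native, atg - 3, "CA")

print(native, NDEI_SITE in native)
print(mutated, NDEI_SITE in mutated)
```

The initiation codon itself is untouched by the change, which is consistent with the statement that the mutagenized bases lie upstream of the coding region.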
  • Sequencing of the entire assembled gene confirms the presence of the initiation codon at nucleotides 3238-3240, where this codon represents the start of the region coding for the signal peptides at nucleotides 3238-3306, the region coding for the mature protein at nucleotides 3307-6276, and a termination codon at nucleotides 6277-6279 (SEQ ID NO:39) .
  • the translation of this sequence results in 1013 amino acids, of which the first 23 amino acids are the signal peptide and 990 amino acids constitute the mature chondroitinase II protein at residues numbered 24-1013 (SEQ ID NO:40) .
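The nucleotide coordinates and amino acid counts given above can be cross-checked with simple arithmetic. The helper name `codons` is an assumption introduced for this sketch; the coordinates are those quoted in the text for SEQ ID NO:39.

```python
def codons(first_nt: int, last_nt: int) -> int:
    """Number of whole codons in an inclusive, 1-based nucleotide range."""
    length = last_nt - first_nt + 1
    if length % 3:
        raise ValueError("range is not a whole number of codons")
    return length // 3


signal = codons(3238, 3306)   # signal peptide coding region
mature = codons(3307, 6276)   # mature protein coding region

print(signal, mature, signal + mature)  # prints: 23 990 1013
```

The termination codon at nucleotides 6277-6279 follows directly after the last sense codon, so the three ranges are contiguous.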
  • the signal peptide is retained, such that the expressed gene is processed and secreted to yield the mature native enzyme structure that has a leucine residue at the N-terminus.
  • the gene encoding the chondroitinase II protein is inserted into pET9A and the resulting recombinant plasmid is designated LP 2 1359.
  • the plasmid is then used to transform an appropriate expression host cell, such as the E. coli B strain BL21(DE3)/pLysS (which is also used for the expression of the chondroitinase I gene).
  • Samples of the E. coli B strain designated TD112, which is BL21(DE3)/pLysS carrying the recombinant plasmid LP 2 1359, were deposited by Applicants on April 6, 1994, with the American Type Culture Collection, 12301 Parklawn Drive, Rockville, Maryland 20852, U.S.A., and have been assigned ATCC accession number 69598.
  • the supernatant from the host cells is treated to isolate and purify the enzyme. Because of the virtually identical isoelectric points and similar molecular weights for the two proteins, the first method described above for isolating and purifying the recombinant chondroitinase I protein is adapted for isolating and purifying the recombinant chondroitinase II protein, and then modified as will now be described.
  • the need for the modification of the method is based on the fact that the recombinant chondroitinase II protein is expressed at levels several-fold lower than the recombinant chondroitinase I protein; therefore, a more powerful and selective solution is necessary in order to obtain a final chondroitinase II product of a purity equivalent to that obtained for the chondroitinase I protein.
  • the first several steps of the method for the chondroitinase II protein are the same as those used to isolate and purify the chondroitinase I protein. Initially, the host cells which express the recombinant chondroitinase II enzyme are lysed by homogenization to release the enzyme into the supernatant. The supernatant is then subjected to diafiltration to remove salts and other small molecules. However, this step only removes the free, but not the bound form of the negatively charged molecules. The bound form of these charged species is next removed by passing the supernatant through a strong, high capacity anion exchange resin-containing column.
  • Such a resin is the Macro-Prep™ High Q resin (Bio-Rad, Melville, N.Y.).
  • Other strong, high capacity anion exchange columns are also suitable.
  • Weak anion exchangers containing a diethylaminoethyl (DEAE) ligand also are suitable, although they are not as effective.
  • low capacity resins are also suitable, although they too are not as effective.
  • the negatively charged molecules bind to the column, while the enzyme passes through the column. It is also found that some unrelated, undesirable proteins also bind to the column.
  • examples of a cation exchange resin-containing column include the S-Sepharose™ (Pharmacia, Piscataway, N.J.) and the Macro-Prep™ High S (Bio-Rad).
  • Each of these two resin-containing columns has SO₃⁻ ligands bound thereto in order to facilitate the exchange of cations.
  • Other cation exchange columns are also suitable.
  • the enzyme binds to the column, while a significant portion of contaminating proteins elute unbound.
  • the method diverges from that used for the chondroitinase I protein.
  • a specific elution using a solution containing chondroitin sulfate is used.
  • This procedure utilizes the affinity the positively charged chondroitinase II protein has for the negatively charged chondroitin sulfate.
  • the affinity is larger than that accounted for by a simple positive and negative interaction alone. It is an enzyme-substrate interaction, which is similar to other specific biological interactions of high affinity, such as antigen-antibody, ligand-receptor, co-factor-protein and inhibitor/activator-protein.
  • the chondroitin sulfate is able to elute the enzyme from the negatively charged resin.
  • the resin-enzyme interaction is a simple positive and negative interaction.
  • although affinity elution chromatography is as easy to practice as ion-exchange chromatography, the elution is specific, unlike salt elution. Thus, it has the advantages of both affinity chromatography (specificity) and ion-exchange chromatography (low cost, ease of operation, reusability).
  • Another advantage is the low conductivity of the eluent (approximately 5% of that of the salt eluent), which allows for further ion-exchange chromatography without a diafiltration/dialysis step, which is required when a salt is used. Note that this is not a consideration in the method for the chondroitinase I protein, because no further ion-exchange chromatography is needed in order to obtain the purified chondroitinase I protein. There is another reason for not using the method for purifying recombinant chondroitinase I.
  • Chondroitinase II obtained using the chondroitinase I salt elution purification method has poor stability; there is extensive degradation at 4°C within one week.
  • chondroitinase II obtained by affinity elution is stable. The reason for this difference in stability is not known. It is to be noted that chondroitinase I obtained by salt elution is stable.
  • the cation exchange column is next washed with a phosphate buffer to elute unbound proteins, followed by washing with borate buffer to elute loosely bound contaminating proteins and to increase the pH of the resin to that required for the optimal elution of the chondroitinase II protein using the substrate, chondroitin sulfate.
  • chondroitin sulfate in water, adjusted to pH 9.0, is used to elute the chondroitinase II protein as a sharp peak (recovery 65%) and at a high purity of approximately 95%.
  • a 1% concentration of chondroitin sulfate is used.
  • a gradient of this solvent is also acceptable.
  • because the chondroitin sulfate has an affinity for the chondroitinase II protein which is stronger than its affinity for the resin of the column, the chondroitin sulfate co-elutes with the protein.
  • This ensures that only protein which recognizes chondroitin sulfate is eluted, which is desirable, but also means that an additional process step is necessary to separate the chondroitin sulfate from the chondroitinase II protein.
  • the eluate is adjusted to a neutral pH and is loaded as is onto an anion exchange resin-containing column, such as the Macro-Prep™ High Q resin.
  • the column is washed with a phosphate buffer.
  • the chondroitin sulfate binds to the column, while the chondroitinase II protein flows through in the unbound pool with greater than 95% recovery.
  • the protein is pure, except for the presence of a single minor contaminant of approximately 37 kD.
  • the contaminant may be a breakdown product of the chondroitinase II protein.
  • This contaminant is effectively removed by a crystallization step.
  • the eluate from the anion exchange column is concentrated and the solution is maintained at a reduced temperature, such as 4°C, for several days to crystallize out the pure chondroitinase II protein.
  • the supernatant contains the 37 kD contaminant.
  • Centrifugation causes the crystals to form a pellet, while the supernatant with the 37 kD contaminant is removed by pipetting.
  • the crystals are then washed with water.
  • the washed crystals are composed of the chondroitinase II protein at a purity of greater than 99%.
  • in a second embodiment for the chondroitinase II protein, two additional steps are inserted in the method before the diafiltration step of the first embodiment.
  • the supernatant is treated with an acidic solution to precipitate out the desired enzyme.
  • the pellet is recovered and then dissolved in an alkali solution to again place the enzyme in a basic environment.
  • the solution is then subjected to the diafiltration and subsequent steps of the first embodiment of this invention.
  • Acid precipitation removes proteins that remain soluble; however, these proteins are removed anyway by the cation and anion exchange steps that follow (although smaller columns may be used) .
  • An advantage of the acid precipitation step is that the sample volume is decreased compared to the original volume after dissolution, and hence can be handled more easily on a large scale.
  • the additional acid precipitation and alkali dissolution steps of the second embodiment mean that the second embodiment is more time consuming than the first embodiment.
  • the marginal improvements in purity and yield provided by the second embodiment may be outweighed by the simpler procedure of the first embodiment, which still provides highly pure chondroitinase II enzyme at high yields.
  • the present invention further comprises DNA sequences which, by virtue of the redundancy of the genetic code, are biologically equivalent to the sequences which encode for the enzyme, that is, these other DNA sequences are characterized by nucleotide sequences which differ from those set forth herein, but which encode an enzyme having the same amino acid sequences as those encoded by the DNA sequences set forth herein.
  • the invention contemplates those DNA sequences which are sufficiently duplicative of the sequence of SEQ ID NO:39 so as to permit hybridization therewith under standard high stringency Southern hybridization conditions, such as those described in Sambrook et al. (11), as well as the biologically active enzymes produced thereby.
  • This invention also comprises DNA sequences which encode amino acid sequences which differ from those of the chondroitinase II enzyme, but which are the biological equivalent to those described for the enzyme (SEQ ID NO:40).
  • amino acid sequences may be said to be biologically equivalent to those of the enzyme if their sequences differ only by minor deletions from or conservative substitutions to the enzyme sequence, such that the tertiary configurations of the sequences are essentially unchanged from those of the enzyme.
  • a codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon encoding another less hydrophobic residue, such as glycine, or a more hydrophobic residue, such as valine, leucine, or isoleucine.
  • changes which result in substitution of one negatively charged residue for another, such as aspartic acid for glutamic acid, or one positively charged residue for another, such as lysine for arginine, as well as changes based on similarities of residues in their hydropathic index, can also be expected to produce a biologically equivalent product.
  • one of ordinary skill in the art can ligate together the two pieces of DNA from the two deposits, for example, at the HindIII site at nucleotide 3326, so as to express both the chondroitinase I and chondroitinase II proteins under the control of the T7 promoter upstream of the coding sequence for chondroitinase I.
  • sample "B" is centrifuged and the cell pellet taken up with 7 ml of 0.05M glucose-0.025M Tris-HCl-0.01M EDTA (pH 8) containing 40 μg/ml of DNAase-free RNAase and then 7 ml of 1% SDS-0.16M EDTA-0.02M NaCl (pH 8) are added to this resuspended material. Finally, proteinase K (Boehringer Mannheim, Indianapolis, IN) is added to both samples to a final concentration of 100 μg/ml and incubation is continued overnight at 37°C.
  • the samples are extracted once with an equal volume (14 ml) of equilibrated phenol followed by two further extractions in which the samples are extracted with 7 ml of phenol followed by the addition of 7 ml of chloroform, continued shaking and finally, centrifugation to separate the two phases.
  • the pelleted DNA is rinsed once with 70% (v/v) ethanol, dried under vacuum and then resuspended with 1 ml of TE (0.01M Tris-HCl-0.001M EDTA, pH 7.4).
  • Fragmentation of the genomic DNA to yield pieces of a size suitable for insertion into cosmid vectors is accomplished by partial digestion with the restriction endonuclease Sau3A.
  • Duplicate 0.2 ml reactions are set up (one with DNA from preparation "A" and the other with DNA from preparation "B"), each containing 100 μg of the P. vulgaris genomic DNA, 0.1M NaCl, 0.01M MgCl₂, 0.01M Tris-HCl (pH 7.5) and 80 units of the enzyme Sau3A.
  • the individual samples are heated to 70°C and then 10 μl are removed for a size-distribution analysis on an agarose gel.
  • the sample obtained after five minutes of Sau3A digestion of preparation "A" and that obtained after 6 minutes with preparation "B" are chosen for further use.
  • an aliquot (4 μl, which is approximately equal to 2 μg) of the chosen partial digest is ligated to the appropriate "left" and "right" arms of the cosmid vector DNA using approximately 1 μg and 2 μg of each, respectively, in 10 μl reactions containing 0.066M Tris-HCl (pH 7.4), 0.01M MgCl₂, 0.001M ATP, and 400 units (as defined by the manufacturer (New England Biolabs, Beverly, MA)) of T4 DNA ligase. Incubation is carried out at 11°C overnight.
  • the "left" and "right" arms of the cosmids are DNA fragments which, when ligated to an appropriately sized piece of P. vulgaris DNA, comprise a recombinant molecule of approximately 35-50 kb. Both arms contain "cos" sites which are recognized by the packaging enzymes in the next step. In addition, these arms carry the origin of replication and ampicillin-resistance functions of pIBI24 (International Biochemical Inc., New Haven, CT).
  • PDB: 0.1M NaCl-0.01M Tris-HCl (pH 7.9)-0.01M MgSO₄
  • Each tube of packaged DNA is, therefore, a gene bank of the P. vulgaris genome.
  • this method of construction creates a pool of infectious particles (i.e., λ phage heads filled with the cosmid vector joined to approximately 25 to 35 kb of P. vulgaris DNA).
  • the number of potential clones is quantitated by adsorbing an aliquot of the packaged material to an appropriate, sensitive E. coli host strain, and then after outgrowth, plating the mixture on selective media.
  • an overnight culture of the E. coli strain ER1562 (New England Biolabs, Beverly, MA) grown in 20-10-5 medium is diluted 1:20 into fresh media (20-10-5 supplemented with 1% maltose) and grown for three hours at 37°C.
  • the cells (1 ml) are then centrifuged, resuspended with PDB (0.2 ml) and 0.02 ml of the appropriate gene bank added. After adsorption for twenty minutes at 37°C, the samples are diluted to 2 ml with 20-10-5 medium and grown at 37°C for 30 minutes. The culture is then spread on 20-10-5 plates containing 100 μg/ml of ampicillin and colonies scored after overnight incubation at 37°C. The results indicate that there are approximately 68,000 and 95,000 infectious particles (potential cosmid clones) present in the two samples, designated PV1-GB and PV2-GB, corresponding to the "A" and "B" preparation of P. vulgaris genomic DNA, respectively.
  • P. vulgaris gene banks are prepared, as above, using two different cosmid vectors. These two cosmids differ from the above-mentioned vectors in that a kanamycin resistance determinant is used in one case rather than the ampicillin resistance, while in the other, the replication functions of pBR322 (New England Biolabs, Beverly, MA) are used instead of those of pIBI24.
  • Example 2
  • Polymerase Chain Reaction (PCR)
  • the oligonucleotides used must have sequences that are as close as possible to those of the target sequence -- the P. vulgaris chondroitinase I gene.
  • An approximation of that sequence can be derived from the limited available amino acid sequence data.
  • the first approximation involves choosing an amino acid sequence that has the least degeneracy.
  • This amino acid sequence could be encoded by any one of 512 different nucleotide sequences, represented as 5'-CAY-TTY-GCN-CAR-AAY-AAY-CCN-3' (SEQ ID NO:6), where R stands for purine (A or G), Y for pyrimidine (C or T), and N indicates that any one of the four nucleotides (A, T, G, or C) at this position will constitute a nucleotide sequence that could encode the indicated amino acid sequence.
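The figure of 512 follows from multiplying the number of bases allowed at each position of the degenerate sequence. A minimal sketch, assuming only the three ambiguity symbols defined in the text (R, Y and N); the function name `degeneracy` is introduced here for illustration.

```python
from math import prod

# Bases allowed by each symbol, as defined in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any nucleotide
}


def degeneracy(oligo: str) -> int:
    """Number of distinct sequences represented by a degenerate oligonucleotide."""
    return prod(len(IUPAC[base]) for base in oligo.replace("-", ""))


print(degeneracy("CAY-TTY-GCN-CAR-AAY-AAY-CCN"))  # prints: 512
```

The five two-fold positions and two four-fold positions give 2⁵ × 4² = 512, in agreement with the text.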
  • One possible approach would be to synthesize an
  • One of these pools is perfectly matched for the first eleven nucleotides (counting from the 3'-end), and, furthermore, within this pool of four oligonucleotides, one is a perfect match for the first fourteen nucleotides.
  • This is important because it permits stringent annealing conditions to be used that discriminate against imperfect matches that give rise to PCR products that are unrelated to the chondroitinase I gene.
  • a further aid in the design of oligonucleotides to be used in these PCR experiments is derived from the observation that the P. vulgaris 110 kD chondroitinase enzyme appears to have a structure that leaves one particular region hypersensitive to proteolytic cleavage.
  • the result of this hydrolysis is that the normally approximately 110 kD protein is split into two predominant species of 18 kD and approximately 90 kD.
  • the amino-terminal sequences of the "110 kD" protein and the "18 kD" fragment are the same, while that for the "90 kD" has been found to be different.
  • the "18 kD" peptide is further fragmented by treatment with cyanogen bromide and trypsin and the resulting oligopeptides sequenced, affording still more information with which to design oligonucleotides for PCR.
  • This information from the "18 kD" and "90 kD" regions is also valuable because the locations of these amino acid sequences relative to each other and the N-terminal sequences of the intact protein are well defined.
  • the nucleotide distance between the regions encoding the N-termini of the "110 kD" and "90 kD” entities can be predicted to be approximately 400-500 bp.
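The predicted distance can be reproduced with a rough conversion from protein mass to coding length. The sketch assumes an average residue mass of about 110 Da, which is a common rule of thumb and not a value given in the text; `coding_length_bp` is a hypothetical helper.

```python
AVG_RESIDUE_DA = 110.0  # assumed average mass of one amino acid residue, in daltons


def coding_length_bp(mass_kd: float) -> int:
    """Approximate number of base pairs needed to encode a protein of the given mass."""
    residues = mass_kd * 1000.0 / AVG_RESIDUE_DA
    return round(residues * 3)  # three nucleotides per codon


# The "18 kD" fragment separates the two N-termini, so its coding length
# approximates the distance between the regions encoding them.
print(coding_length_bp(18))
```

With this assumption the 18 kD fragment corresponds to a little under 500 bp, inside the 400-500 bp range stated above.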
  • Two further sets of oligonucleotide pools are then designed with one further consideration:
  • the first eight oligonucleotides hybridize to one strand of the DNA and, during the in vitro DNA synthesis, they are extended toward the "90 kD" N-terminal coding sequences. Consequently, the oligonucleotides corresponding to amino acid sequences from within the "18 kD" peptide and at the N-terminus of the "90 kD" peptide must be designed so that they anneal to the complementary DNA strand of the P. vulgaris genome, so that they extend, in vitro, toward the region encoding the N-terminus of the intact protein.
  • the oligonucleotides effectively "bracket" the region of the P. vulgaris chromosome that encodes the N-terminal region of the chondroitinase I gene. It is worth noting that the PCR methodology offers an extremely large potential amplification of this bracketed region. Thirty PCR cycles, in theory, increase the number of copies of this DNA segment by a factor of one billion. This allows the use of very small quantities of P. vulgaris genomic DNA as a template which will yield, potentially, microgram amounts of synthesized product which can be readily visualized, isolated and cloned.
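The factor of one billion corresponds to a doubling of the bracketed segment in every cycle. A minimal sketch, in which the `efficiency` parameter (the fraction of templates copied per cycle) is an assumption added for illustration and is not discussed in the text:

```python
def theoretical_copies(cycles: int, efficiency: float = 1.0) -> float:
    """Fold amplification after `cycles` rounds; efficiency 1.0 means perfect doubling."""
    return (1.0 + efficiency) ** cycles


print(f"{theoretical_copies(30):.3g}")  # prints: 1.07e+09
```

Real reactions fall short of perfect doubling, which is why the text qualifies the figure with "in theory".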
  • oligonucleotide mixtures are designed based on the following amino acid sequence that is found within the "18 kD" peptide: Glu-Ala-Gln-Ala-Gly-Phe-Lys (SEQ ID NO:2, amino acids 138-144) .
  • This heptapeptide is encoded by the following nucleotide sequences:
  • the complementary strand therefore, has the following sequences:
  • a further set of eight oligonucleotides (each made up of 16 unique sequences) is designed, where the individual sets of oligonucleotides have the following sequences:
  • one pool has a perfect match for the first eight nucleotides at the 3'-end, while 50% of this same pool has an eleven-nucleotide perfect match with the genomic DNA of P. vulgaris encoding chondroitinase I.
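The design logic for the primers that anneal to the complementary strand can be sketched as a back-translation followed by a reverse complement. This is an illustration using the standard genetic code and the heptapeptide Glu-Ala-Gln-Ala-Gly-Phe-Lys quoted above; it does not reproduce the actual oligonucleotide pools of the patent, which are shorter and are split into eight pools. The names `back_translate` and `reverse_complement` are assumptions.

```python
# Degenerate codons (standard genetic code) for the residues of the heptapeptide.
DEGENERATE_CODON = {
    "Glu": "GAR", "Ala": "GCN", "Gln": "CAR",
    "Gly": "GGN", "Phe": "TTY", "Lys": "AAR",
}

# Complements, including the ambiguity symbols R (A/G), Y (C/T) and N (any).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "R": "Y", "Y": "R", "N": "N"}


def back_translate(peptide: str) -> str:
    """Degenerate coding-strand sequence (5' to 3') for a hyphen-separated peptide."""
    return "".join(DEGENERATE_CODON[aa] for aa in peptide.split("-"))


def reverse_complement(seq: str) -> str:
    """Complementary-strand sequence, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


coding = back_translate("Glu-Ala-Gln-Ala-Gly-Phe-Lys")
antisense = reverse_complement(coding)
print(coding)     # GARGCNCARGCNGGNTTYAAR
print(antisense)  # YTTRAANCCNGCYTGNGCYTC
```

A primer built from the second sequence is extended back toward the region encoding the N-terminus of the intact protein, which is the orientation the text requires for the "18 kD" and "90 kD" primer sets.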
  • For a third set of oligonucleotide mixtures, the following amino acid sequence, obtained as part of the N-terminal amino acid sequence of the "90 kD" peptide, is used: Gly-Ala-Lys-Val-Asp-Ser (SEQ ID NO:2, amino acids 189-194).
  • This hexapeptide can be encoded by the following nucleotide sequences:
  • one base is deleted from the 5' end of oligonucleotides 17-24 in order to reduce the number of ⁇ equence permutations.
  • one oligonucleotide mixture has half of its members perfectly matched for the first eight nucleotides at the 3'-end, and one quarter of the oligonucleotides in the pool are perfectly matched for eleven nucleotides at the 3'-end.
  • oligonucleotide mixtures are purchased from Biosynthesis, Inc. (Denton, TX), and are provided as fully deprotected, purified and lyophilized samples. In each case (except oligonucleotide #20), 5 O.D. units of synthetic DNA are obtained. This is resuspended in 0.5 ml of water to yield a solution that contains approximately 50-60 pmoles of oligonucleotide per microliter. The remaining sample (oligonucleotide #20) contains 15
  • a typical 50 μl PCR reaction contains approximately 20 ng of P. vulgaris genomic DNA as template; 200 μM each of dATP, dGTP, dCTP, dTTP; 50 mM KCl; 10 mM Tris-HCl (pH 8.4); 1.5 mM MgCl₂; 0.01% gelatin; 2.5 units of Ampli-Taq™ DNA polymerase (Perkin-Elmer/Cetus, Norwalk, CT); and 50 pmoles of each oligonucleotide pool to be tested.
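The stock concentration quoted for the resuspended oligonucleotides (approximately 50-60 pmoles per microliter from 5 O.D. units in 0.5 ml) can be reproduced with common rules of thumb. Everything numeric in this sketch other than the 5 O.D. units, the 0.5 ml volume and the 50 pmoles per 50 μl reaction is an assumption: roughly 33 μg of single-stranded DNA per O.D. unit, roughly 330 g/mol per nucleotide, and oligonucleotide lengths of 17 to 20 bases.

```python
UG_PER_OD260_SSDNA = 33.0  # assumed: micrograms of single-stranded DNA per O.D. unit
AVG_NT_MASS = 330.0        # assumed: average mass of one nucleotide, g/mol


def pmol_per_ul(od_units: float, length_nt: int, volume_ml: float) -> float:
    """Approximate oligonucleotide concentration in pmol per microliter."""
    micrograms = od_units * UG_PER_OD260_SSDNA
    micromoles = micrograms / (length_nt * AVG_NT_MASS)  # ug / (g/mol) = umol
    picomoles = micromoles * 1e6
    return picomoles / (volume_ml * 1000.0)


for length in (17, 20):  # hypothetical oligonucleotide lengths
    print(length, round(pmol_per_ul(5, length, 0.5), 1))

# 50 pmol of a pool in a 50 ul reaction is 1 pmol/ul, i.e. 1 uM final concentration.
print(50 / 50, "uM")
```

Under these assumptions a 20-mer gives about 50 pmol/μl and a 17-mer just under 60 pmol/μl, consistent with the range stated in the text.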
  • the reactions are overlaid with mineral oil (Plough) and incubated in a Perkin-Elmer/Cetus Thermalcycler™.
  • the instrument is programmed to denature the template DNA at 94°C for 1.25 minutes, anneal the oligonucleotide primers to the denatured template at 60°C or 62°C for one minute, and to extend these primers via DNA synthesis at 72°C for 2.25 minutes. Thirty such cycles are carried out in an experimental amplification.
  • the products are analyzed by running an aliquot on a 4% NuSieve™ (FMC Biochemicals, Rockland, ME) GTG gel containing approximately 0.5 μg/ml ethidium bromide using either Tris-borate or Tris-acetate buffers at either full or half strength. These gels are usually run overnight at approximately 1 V/cm and photographed on a long wavelength UV transilluminator using a red filter and Polaroid Type 57 film.
  • PCR experiments are run testing the pairwise combinations between oligonucleotide pools #1-8 (derived from the "110 kD" amino-terminal sequence of chondroitinase I), pools #9-16 (derived from a peptide sequence contained within the "18 kD" fragment), and pools #17-24 (derived from the amino-terminal sequence of the "90 kD" fragment).
  • the most effective amplifications observed are between oligonucleotide pools #4 and #18, and pools #4 and #9, 10, 11, or 12.
  • oligonucleotide pools #4 and #18 yield a product of approximately 500 bp as estimated relative to size standards (pBR322 digested with MspI (New England Biolabs, Beverly, MA), ranging from 30 to 700 bp) on NuSieve™ agarose gels.
  • the product from the use of oligonucleotide pool #4 combined with pools #9, 10, 11, or 12 is approximately 350 bp in length.
  • the larger product could be isolated from an agarose gel, diluted a thousand-fold, and then used as the template in a second PCR reaction employing oligonucleotide pools #4 and #9 as primers, which yield a product of approximately 350 bp. That is, the smaller PCR product is synthesized from the larger one in agreement with what would be expected if these sequences were all derived from the P. vulgaris chondroitinase I gene. This indicates that the desired region of the genome is amplified.
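The two product sizes can be checked against the amino acid positions of the peptides from which the downstream primers were designed (amino acids 138-144 for pools #9-16 and 189-194 for pools #17-24 of SEQ ID NO:2). The sketch is approximate by construction: gel-based sizes are estimates, and the primers need not end exactly on a codon boundary. The helper `bp_between` is hypothetical.

```python
def bp_between(last_aa_outer: int, last_aa_inner: int) -> int:
    """Base pairs separating the ends of two primer-encoded peptide segments."""
    return (last_aa_outer - last_aa_inner) * 3


observed_difference = 500 - 350                # bp, from the gel estimates in the text
predicted_difference = bp_between(194, 144)    # "90 kD" primer site vs "18 kD" primer site

print(observed_difference, predicted_difference)  # prints: 150 150
```

The agreement between the two numbers supports the conclusion in the text that both products derive from the same region of the chondroitinase I gene.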
  • the larger PCR product is isolated from an agarose gel using a Qiaex™ extraction procedure according to the manufacturer's instructions (Qiagen, Chatsworth, CA).
  • the isolated DNA is then subjected to a "fill-in" reaction (11) to remove the extra, protruding adenine residue that Taq DNA polymerase tends to add to the 3'-end of DNA in a template-independent reaction (12).
  • the isolated DNA is then treated with T4 polynucleotide kinase to add a phosphate moiety to the 5'-ends of the PCR products to allow them to be joined to the vector DNA.
  • the PCR product is ligated to pIBI24, a high copy vector containing a polylinker (IBI, New Haven, CT), that is first sequentially digested with PstI, "filled-in" and then treated with calf intestinal alkaline phosphatase (Boehringer-Mannheim).
  • An eight residue oligopeptide derived from the DNA sequence (SEQ ID NO:2, amino acids 71-78) also matches a previously sequenced oligopeptide derived by a combination of trypsin digestion and cyanogen bromide treatment of the native protein.
  • The problems include: (1) the assumption that the protein being sequenced has not been processed at either end (not likely to be true, for example, with a secreted protein), (2) the occasional lack of fidelity exhibited by Taq DNA polymerase during PCR reactions, and (3) the rather large size of the bracketed region of the DNA that is to be amplified, which was expected to be approximately 3000 bp (deduced from the apparent molecular weight of approximately 110 kD). Consequently, the approach of constructing a gene bank is selected.
  • A total of approximately 260 µg of plasmid DNA is digested with SalI and the products separated by electrophoresis on a NuSieve™ GTG agarose gel.
  • The desired approximately 450 bp fragment is isolated using a Qiaex™ extraction protocol.
  • The fragment is then denatured by heating at 95-100°C for 5-15 minutes, followed by rapid cooling.
  • The denatured fragment is then labelled with digoxigenin-labelled dUTP (Boehringer-Mannheim, Indianapolis, IN) in two 200 µl reactions.
  • Aliquots of the six P. vulgaris cosmid gene banks described in Example 1 above are used to infect the E. coli strain ER1562 described above and a total of approximately 10,000 colonies are obtained on the appropriate selective plates. These colonies (on a total of 50 plates) are replica plated onto two nylon membranes on selective agar as well as to a third selective plate. After overnight incubation, the colonies on the filters are lysed by sequentially treating with 10% sodium dodecyl sulfate (SDS) and 0.5 M NaOH for 5-30 minutes each. The cells from the lysed colonies are neutralized by being placed on sheets saturated with 1 M Tris-HCl (pH 7.4) (twice) and then on paper saturated with 2X standard saline citrate (SSC) prior to vacuum drying at 80°C. The DNA from the lysed colonies is then fixed to the membranes.
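The size estimate in point (3) can be reproduced with a common rule of thumb. The following is a minimal sketch, assuming an average amino acid residue mass of about 110 Da; that constant and the function name are illustrative assumptions, not values taken from this document.

```python
# Rough estimate of coding-region length from an apparent protein mass.
# AVG_RESIDUE_DA is a rule-of-thumb assumption, not a figure from the text.
AVG_RESIDUE_DA = 110.0


def coding_length_bp(mass_kd: float, avg_residue_da: float = AVG_RESIDUE_DA) -> int:
    """Approximate number of base pairs needed to encode a protein of this mass."""
    residues = mass_kd * 1000.0 / avg_residue_da
    return round(residues * 3)  # three nucleotides per codon


if __name__ == "__main__":
    print(coding_length_bp(110))  # intact chondroitinase I ("110 kD")
    print(coding_length_bp(18))   # the "18 kD" fragment
```

Under the same assumption, the "18 kD" fragment corresponds to roughly 490 bp of coding sequence, which is in line with the approximately 500 bp product obtained with pools #4 and #18.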
  • The filters are then washed by incubation at 42°C with agitation for 1-3 hours, using at least 10 ml/filter of 0.05 M Tris-HCl, 0.5-1 M NaCl, 0.001 M EDTA, pH 8, 0.1% SDS and 0.05 mg/ml proteinase K.
  • The filters are then rinsed with 2X SSC and pre-hybridized by incubation with a hybridization buffer at 65°C for 1-3 hours.
  • The filters are then hybridized overnight at 65-68°C using the digoxigenin-labeled probe described above (0.5-50 ng/ml in a hybridization solution).
  • The hybridized filters are washed with SSC and SDS, re-blocked with a blocking reagent (Component #11 of DNA Labelling and Detection Kit, Nonradioactive, Boehringer Mannheim, Indianapolis, IN) and exposed to polyclonal sheep anti-digoxigenin Fab fragments conjugated to alkaline phosphatase.
  • The positive clones are visualized by incubation of the antibody-labeled filters in the presence of BCIP (bromo-chloro-indolyl-phosphate) and NBT (nitro-blue tetrazolium).
  • The presence of the desired DNA fragment within a colony will result in a dark brownish-purple spot on the filter after this hybridization procedure.
  • The developed filters are used as templates to guide the selection of a total of 117 clones which are then picked to selective media.
  • A small-volume (10 ml) culture ("Miniprep") of each of these clones is grown in selective media and plasmid DNA is then isolated using materials and protocols supplied by Qiagen.
  • A number of approaches are used to guide the selection of particular cosmid clones for further study.
  • One is to carry out Southern hybridization (8) using the same PCR-generated fragment as a probe against P. vulgaris genomic DNA that had been digested by a number of restriction enzymes and then fractionated on an agarose gel prior to transfer to a nylon membrane.
  • The probe is labeled with digoxigenin-dUTP by including this nucleotide analogue in a PCR amplification.
  • In this reaction, the gel-purified product of a previous PCR amplification (that using P. vulgaris genomic DNA as template) is diluted 10,000-fold and serves as the template in a second PCR amplification.
  • This latter reaction is made up as a 0.5 ml mixture, which is then divided into ten individual tubes and amplified as described above for 25 cycles using oligonucleotide pools #2 and #10 (see above) as the primers.
  • The normal complement of deoxyribonucleoside triphosphates is replaced with a digoxigenin-dUTP labeling mixture from the manufacturer (Boehringer-Mannheim, Indianapolis, IN), which yields a final concentration of 100 µM each of dATP, dCTP and dGTP, 65 µM dTTP and 35 µM digoxigenin-dUTP.
  • The reactions are pooled and precipitated according to the manufacturer's recommendations. An aliquot of the resuspended product is examined by gel electrophoresis and exhibits a single band between approximately 300 and approximately 400 bp in length as expected for the "smaller" PCR product described above.
  • The DNA (approximately 5 µl) is diluted into large (0.35 ml) volumes for digestion with the various restriction enzymes.
  • The DNA is then concentrated by ethanol precipitation prior to fractionation on agarose gels and transfer to nylon membranes.
  • The data obtained in these experiments indicate that the chondroitinase I gene (at least that portion that hybridizes to the N-terminal coding region represented by the probe described above) is carried on a BstYI fragment of approximately 2800 bp, an EcoRV fragment of 5400 bp, and on large (equal to or greater than approximately 10 kb) DNA fragments generated by NsiI, BglII, HindIII, and StyI.
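The composition of the labeling mixture can be checked with a few lines of arithmetic. This is a small sketch, reading the digoxigenin-dUTP figure as a final concentration in µM (an assumption about the intended unit); the dictionary and function names are illustrative only.

```python
# Final nucleotide concentrations (µM) in the labeling PCR, as listed in the text.
# Treating the digoxigenin-dUTP value as 35 µM is an assumption about the unit.
LABELING_MIX_UM = {
    "dATP": 100.0,
    "dCTP": 100.0,
    "dGTP": 100.0,
    "dTTP": 65.0,
    "DIG-dUTP": 35.0,
}


def thymidine_pool_um(mix: dict) -> float:
    """Combined concentration of dTTP and its labeled analogue."""
    return mix["dTTP"] + mix["DIG-dUTP"]


def labeled_fraction(mix: dict) -> float:
    """Fraction of the dTTP/dUTP pool that carries the digoxigenin label."""
    return mix["DIG-dUTP"] / thymidine_pool_um(mix)


if __name__ == "__main__":
    print(thymidine_pool_um(LABELING_MIX_UM))
    print(labeled_fraction(LABELING_MIX_UM))
```

With these values the labeled and unlabeled thymidine analogues together match the 100 µM used for each of the other three nucleotides.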
  • Two of these fragments are of special interest.
  • The first, a BstYI fragment of approximately 2800 bp, is observed in a number of cosmid clones, including those designated #2 and #45.
  • The DNA isolated from these two cosmid clones is designated LP 2 751 and LP 2 760.
  • With LP 2 760, the approximately 2800 bp BstYI fragment is well separated from the other BstYI fragments and is therefore more readily subcloned into another vector designated pT660-3.
  • The plasmid designated pT660-3 is a derivative of pBR322 in which the DNA from a point immediately downstream of the promoter for tetracycline resistance (approximately bp 80) as far as the PvuII site (approximately bp 2070) is deleted and replaced with a BamHI linker.
  • The approximately 10 kb NsiI fragment (which hybridizes with the chondroitinase probe described above) is readily isolated from a digest performed on LP 2 751. These two fragments are referred to as the "2800 bp BstYI" fragment and the "10 kb NsiI" fragment.
  • The 2800 bp BstYI fragment is small enough to permit a second restriction enzyme digestion on this piece of DNA in order to obtain a fragment suitable for DNA sequence analysis. This is important because the hybridization experiments serve to identify the N-terminal coding region of the chondroitinase I gene, due to how the probe is derived. This procedure does not, however, indicate to which side the rest of the gene is located. Given the relative size of the probe (less than 500 bp) compared to the predicted size of the intact gene (greater than 3000 bp), this is not a trivial consideration.
  • The nucleotide sequence clearly indicates in which direction the gene would be "read" and therefore, which restriction fragments should be cloned in order to obtain the entire gene.
  • The subcloned 2800 bp BstYI fragment contains two internal EcoRV sites, which suggests that the resulting fragments might be small enough for DNA sequencing.
  • The EcoRV sites are symmetrically placed within the 2800 bp BstYI fragment; each EcoRV site is approximately 1200 bp from one end, with the space between them equal to approximately 400 bp.
  • The subcloned fragment is digested asymmetrically by taking advantage of unique restriction sites present within the vector.
  • The "halves" of the 2800 bp BstYI fragment are distinguished physically and, by Southern hybridization, the "end" that contains the chondroitinase I N-terminal coding region is ascertained.
  • The appropriate piece, which is a HindIII-EcoRV fragment of approximately 1200 bp, is subcloned into both M13mp18 and M13mp19 vectors which are first digested with both HindIII and SmaI and subsequently treated with calf intestinal alkaline phosphatase.
  • The DNA sequence derived from these subclones reveals a number of features that clearly establish the location of the chondroitinase I gene, as well as the direction in which it is read.
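The geometry described for the EcoRV sites can be written down as a simple map. This is a minimal sketch: the site coordinates are approximate positions implied by the stated distances, and the helper name is an assumption made for illustration.

```python
def digest_fragments(length_bp: int, cut_positions: list) -> list:
    """Fragment sizes (bp) for a linear molecule cut at the given positions."""
    bounds = [0] + sorted(cut_positions) + [length_bp]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Approximate values taken from the description of the BstYI fragment.
BSTYI_FRAGMENT_BP = 2800
ECORV_SITES = [1200, 1600]  # each about 1200 bp from one end, about 400 bp apart

if __name__ == "__main__":
    print(digest_fragments(BSTYI_FRAGMENT_BP, ECORV_SITES))
```

Because the two outer fragments come out the same size, an EcoRV digest alone cannot tell the two ends apart, which is why a site outside the insert is needed to distinguish them.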
  • Beginning at nucleotide #183 in this sequence (SEQ ID NO:1, nucleotide 191), a coding region is observed which matches the first thirty previously-identified amino acids of the P. vulgaris chondroitinase I enzyme.
  • In this sequence it is possible to discern a number of other features by their analogy to corresponding sequence motifs from previously analyzed E. coli genes.
  • nucleotides 32-37 (SEQ ID NO:1, nucleotides 40-45)
  • nucleotides 98-103 (SEQ ID NO:1, nucleotides 106-111)
  • There is an in-frame ATG initiation codon at nucleotides 111-113 (SEQ ID NO:1, nucleotides 119-121), which indicates that the P. vulgaris chondroitinase I enzyme is synthesized with a 24 amino acid signal sequence which is, presumably, removed as the protein is transported across the inner membrane.
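The two numbering schemes and the signal sequence length quoted above are mutually consistent, as the following sketch shows. The offset of 8 is inferred from the paired coordinates in the text, and the function names are hypothetical.

```python
# Offset between the subclone numbering and SEQ ID NO:1, inferred from the
# coordinate pairs quoted in the text (an inference, not a stated value).
SEQ_ID_OFFSET = 8

# (subclone coordinate, SEQ ID NO:1 coordinate) pairs as quoted.
PAIRED_COORDINATES = [(183, 191), (32, 40), (37, 45), (98, 106),
                      (103, 111), (111, 119), (113, 121)]


def to_seq_id(local_position: int) -> int:
    """Convert a subclone coordinate to SEQ ID NO:1 numbering."""
    return local_position + SEQ_ID_OFFSET


def signal_peptide_codons(atg_start: int, mature_start: int) -> int:
    """Number of codons between the initiation codon and the mature N-terminus."""
    span = mature_start - atg_start
    if span % 3 != 0:
        raise ValueError("initiation codon is not in frame with the mature coding region")
    return span // 3


if __name__ == "__main__":
    print(all(to_seq_id(a) == b for a, b in PAIRED_COORDINATES))
    print(signal_peptide_codons(111, 183))
```

The codon count includes the initiating methionine, so an ATG at position 111 and a mature coding region starting at position 183 correspond to a 24 residue signal sequence.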
  • The second fragment that is subcloned (into a pIBI24 derivative that is first modified to include an NsiI restriction site in place of the PstI site normally present in the polylinker of this vector) is the approximately 10 kb NsiI fragment.
  • A double digestion with EcoRV and HindIII releases fragments of approximately 4.1 kb, 2.3 kb, 2.1 kb, 2.0 kb, 1.3 kb, 1.1 kb and 0.4 kb.
  • Three of these fragments (2.3 kb, 2.1 kb, and 0.4 kb) are apparently EcoRV fragments that have not been cut by HindIII. Again, the only fragment larger than the vector (4.1 kb) indicates that this fragment includes pIBI24 (2.9 kb).
  • The approximately 2.0 kb fragment hybridizes with the chondroitinase probe, thereby serving to place one of the HindIII sites. Since there is a HindIII site in the polylinker, it too can be placed, leaving the last HindIII site to be placed by deduction.
  • Double digestion of the cloned approximately 10 kb NsiI fragment with EcoRV and EcoRI yields six fragments (of approximately 4.2 kb, 3.5 kb, 2.3 kb, 2.1 kb, 1 kb, and 0.4 kb), indicating the presence of two EcoRI sites: one in the polylinker and one in the cloned P. vulgaris DNA.
  • Southern hybridization reveals that the approximately 4.2 kb band in this double digest contains the chondroitinase I N-terminal coding sequence. Adding this information to the above data yields a preliminary restriction map for the subcloned approximately 10 kb NsiI fragment in pIBI24 (Figure 1).
  • This culture is then inoculated into fresh selective media either with or without isopropyl-beta-D-thiogalactopyranoside (IPTG), which is expected to increase the level of transcription from the lac promoter present in pIBI24.
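A quick consistency check on the two double digests is to compare the summed fragment sizes with the expected size of the recombinant plasmid. The sketch below uses only the approximate sizes quoted above; the 10% tolerance and the helper names are assumptions chosen for illustration.

```python
INSERT_KB = 10.0  # approximate size of the cloned NsiI fragment
VECTOR_KB = 2.9   # pIBI24

DIGESTS_KB = {
    "EcoRV + HindIII": [4.1, 2.3, 2.1, 2.0, 1.3, 1.1, 0.4],
    "EcoRV + EcoRI": [4.2, 3.5, 2.3, 2.1, 1.0, 0.4],
}


def total_kb(fragments: list) -> float:
    """Sum of fragment sizes, rounded to the precision of the gel estimates."""
    return round(sum(fragments), 1)


def is_consistent(fragments: list, expected_kb: float, tolerance: float = 0.10) -> bool:
    """True when the fragments add up to the expected size within the tolerance."""
    return abs(sum(fragments) - expected_kb) <= tolerance * expected_kb


def shared_fragments(a: list, b: list) -> list:
    """Fragment sizes that appear in both digests."""
    return sorted(set(a) & set(b))


if __name__ == "__main__":
    expected = INSERT_KB + VECTOR_KB
    for name, fragments in DIGESTS_KB.items():
        print(name, total_kb(fragments), is_consistent(fragments, expected))
    print(shared_fragments(*DIGESTS_KB.values()))
```

The three sizes common to both digests are the same ones identified in the text as EcoRV fragments left uncut by the second enzyme.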
  • The approximately 10 kb NsiI fragment, cloned into pIBI24, is digested with EcoRV (as described above) and ligated together in the presence of EcoRI linkers.
  • The net result of this construction is the deletion of approximately 5 kb of P. vulgaris DNA from this subcloned piece of DNA and the simultaneous introduction of another EcoRI site into the molecule.
  • One hundred micrograms of this "EcoRV deletion" construction (LP 2 786) is digested with EcoRI and fractionated on an agarose gel.
  • The desired approximately 4.2 kb fragment is eluted from the gel, precipitated and resuspended in 150 µl of the TE described above.
  • One-third of this material is then ligated to itself (polymerized) and, after destruction of the DNA ligase by heating, the DNA is sonicated to generate random, small pieces suited to DNA sequence analysis.
  • coli strain MV1190 and 500 of the phage plaques obtained are picked into SM buffer (NaCl, 100 mM; MgSO4, 8 mM; Tris-HCl, pH 7.4, 50 mM; and 0.01% gelatin) to serve as stocks for the infection of small (less than or equal to 10 ml) cultures that are then used for the isolation of single stranded template DNA.
  • DNA sequencing is carried out at elevated temperatures using Taq DNA polymerase and fluorescently-labeled oligonucleotide primers.
  • The data are collected using a Model 370A DNA sequencing system (Applied Biosystems, Foster City, CA).
  • Sequence editing, overlap determinations and derivation of a consensus sequence are performed using a collection of computer programs obtained from the
  • The resulting DNA sequence of this EcoRI fragment is 3980 nucleotides in length (SEQ ID NO:1). It is to be noted that the EcoRI site near the N-terminal coding sequence is derived from the linker ligated into this site; it is not present in the P. vulgaris chromosome. This position actually is an EcoRV site in the cloned cosmid DNA.
  • The site-specific mutagenesis method employed is based on that of Kunkel (15), using materials purchased from Bio-Rad, Melville, N.Y.
  • This oligonucleotide serves as a primer for T7 DNA polymerase which copies the entire recombinant molecule.
  • T4 DNA ligase is then used to seal the nick between the first residue of the mutagenic oligonucleotide and the last residue added in vitro.
  • The newly synthesized DNA (containing the desired base changes) therefore does not contain uracil, while the template DNA does. Transformation of a non-mutant (with respect to the dut and ung alleles) male E. coli strain yields phage progeny that are primarily derived from the mutagenized strand synthesized in vitro as a result of the inactivation of the uracil-containing template strand.
  • Resuspended plaques (aliquots of which had been used for DNA sequencing which established the N-terminal coding region of the chondroitinase I gene and included another 110 bp "upstream" of the presumed translation initiation site (see above)) are used to infect the male host strain CJ236 (dut ung).
  • Individual plaques are picked to 0.5 ml of phage dilution buffer (PDB) .
  • One picked plaque from each transformation is adsorbed to log phase CJ236 and the infected culture grown for 6.5 hours. The cells are pelleted by centrifugation, and the supernatant heated to 55°C for 30 minutes and then stored at 4°C.
  • Single stranded DNA is isolated from 100 ml of each supernatant and resuspended in a total volume of 0.1 ml of TE.
  • The goal of the site-specific mutagenesis is to modify the "ends" of this gene to allow it to be moved, precisely, into an appropriate high-level E. coli expression system.
  • The target vector chosen (pET9-A; see above) is one derived from genetic regulatory elements present in the bacteriophage T7. In this system, there is a unique NdeI site (CATATG) that includes the translation initiation codon as well as a downstream BamHI site that, together, allow the direct, unidirectional insertion of a gene encoding the protein that is to be expressed.
  • The native sequence, including the predicted initiation codon, is presented on line 1 below while the mutagenic oligonucleotide #25 (which differs in the three nucleotides immediately upstream of the initiation codon) is presented on line 2: 1) 5'-GCCAGCGTTTCTAAGGAGAAAAATAATGCCGATATTTCGTTTTACTGC-3' (SEQ ID NO:1, nucleotides 94-141)
  • The site-specific mutagenesis is carried out at the junction of the signal sequence and the start of the mature protein (line 3) using the mutagenic oligonucleotide #26 (line 4) (which differs by six nucleotides, including the location of the initiation codon):
  • The underlined GCC in line 3 corresponds to the codon for alanine which is the N-terminal amino acid for the mature, processed form of the P. vulgaris chondroitinase I.
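The first of these changes can be illustrated on the native sequence quoted above (SEQ ID NO:1, nucleotides 94-141). The sketch below is only an illustration of the stated rule, namely that the three nucleotides immediately upstream of the initiation codon are altered so that the ATG becomes part of an NdeI site; the sequence of oligonucleotide #25 itself is not reproduced here, and the function names are hypothetical.

```python
NDEI_SITE = "CATATG"

# Native sequence as given in the text, SEQ ID NO:1 nucleotides 94-141.
NATIVE = "GCCAGCGTTTCTAAGGAGAAAAATAATGCCGATATTTCGTTTTACTGC"
NATIVE_START = 94  # SEQ ID NO:1 coordinate of the first base of NATIVE
ATG_START = 119    # SEQ ID NO:1 coordinate of the predicted initiation codon


def embed_atg_in_ndei(seq: str, seq_start: int, atg_start: int) -> str:
    """Replace the three bases upstream of the ATG with CAT, giving CATATG."""
    i = atg_start - seq_start  # 0-based index of the A of the ATG
    if seq[i:i + 3] != "ATG":
        raise ValueError("no ATG at the stated coordinate")
    return seq[:i - 3] + "CAT" + seq[i:]


def mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    mutated = embed_atg_in_ndei(NATIVE, NATIVE_START, ATG_START)
    print(mutated)
    print(NDEI_SITE in NATIVE, NDEI_SITE in mutated, mismatches(NATIVE, mutated))
```

The substitution leaves the coding sequence untouched, since all changed positions lie upstream of the initiation codon.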
  • Oligonucleotide #25 (5 O.D. units) is resuspended in 0.5 ml of TE and oligonucleotide #26 (also 5 O.D. units) is resuspended in 0.65 ml of TE to yield stocks that are approximately 20 µM, i.e., 20 pmole/µl.
  • Template DNA (5 µl of the preparation described above) and phosphorylated mutagenic primer (approximately 2 pmole) are annealed in a 20 µl volume containing 20 mM Tris-HCl (pH 7.4), 2 mM MgCl2, and 50 mM NaCl.
  • The sample is heated at 70°C for 45 minutes in a Perkin-Elmer/Cetus Thermalcycler™.
  • The sample is then gradually cooled from 70°C to 25°C over a 45 minute period.
  • The annealed mixture is placed on ice and the following components added: 2 µl of 10X synthesis buffer (Bio-Rad; 5 mM each of dATP, dGTP, dCTP, dTTP; 10 mM ATP; 100 mM Tris-HCl (pH 7.4); 50 mM MgCl2; 20 mM dithiothreitol), 2 µl of T4 DNA ligase (6 units) and 1 µl of T7 DNA polymerase (1 unit). These reactions are incubated for 5 minutes each at 0°C (on ice), 11°C, 25°C, and finally for 30 minutes at 37°C.
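The stock concentration can be estimated from the number of O.D. units. The following is a rough sketch using two widely used rules of thumb (about 33 µg of single-stranded DNA per O.D. unit at 260 nm and about 330 Da per nucleotide); both constants, the function name, and the assumption that oligonucleotide #25 is 48 nucleotides long (the length of the native window shown for it) are not taken from this document.

```python
UG_PER_OD260 = 33.0        # µg single-stranded DNA per O.D. unit (rule of thumb, assumed)
DA_PER_NUCLEOTIDE = 330.0  # average mass of one nucleotide in Da (rule of thumb, assumed)


def oligo_stock_pmol_per_ul(od_units: float, length_nt: int, volume_ml: float) -> float:
    """Approximate oligonucleotide concentration in pmol/µl (numerically equal to µM)."""
    mass_ug = od_units * UG_PER_OD260
    nmol = mass_ug * 1000.0 / (length_nt * DA_PER_NUCLEOTIDE)
    return nmol / volume_ml  # nmol per ml is the same as pmol per µl


if __name__ == "__main__":
    # 5 O.D. units of an assumed 48-mer dissolved in 0.5 ml
    print(round(oligo_stock_pmol_per_ul(5, 48, 0.5), 1))
```

Since 1 pmol/µl is the same as 1 µM, a stock of about 20 pmol/µl is a micromolar, not nanomolar, solution.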
  • Example 6 described the site-specific mutageneses that created an NdeI site immediately preceding the signal sequence, as well as a second construction which placed the NdeI site adjacent to the triplet which codes for the N-terminal alanine found on the mature, processed P. vulgaris chondroitinase I gene.
  • The ATG sequence of the NdeI recognition site can function as the translation initiation codon for the protein (either with or without the signal sequence).
  • The isolated replicative form is digested with KpnI and ClaI.
  • The KpnI site is part of the M13mp19 polylinker, while the ClaI site is found approximately 490 bp from the end of the cloned fragment of the chondroitinase I gene.
  • The restriction digestion products obtained are fractionated on a 4% NuSieve™ GTG agarose gel run in 1/2X Tris-acetate buffer (TAE). The appropriate approximately 500 bp band is extracted from the gel using Qiaex™.
  • Plasmid DNA (LP 2 786) carrying the chondroitinase I gene is also digested with KpnI and ClaI and then fractionated on a 0.8% agarose gel run in 1/2X TAE.
  • The KpnI site is part of the polylinker of pIBI24, while the ClaI site corresponds to the one described above.
  • The approximately 7 kb fragment containing the pIBI24 vector and the large fragment of the chondroitinase I gene is isolated from the agarose gel by electroelution (11), followed by ethanol precipitation. This 7 kb fragment is then treated with calf intestinal alkaline phosphatase, extracted first with phenol-chloroform, then with chloroform, and then precipitated twice with ethanol and finally resuspended in 0.1 ml TE.
  • The two isolated N-terminal encoding fragments are each ligated to the approximately 7 kb fragment encompassing the remainder of the chondroitinase I gene and the pIBI24 vector.
  • The ligase reaction is then used to transform the E. coli strain 294 and ampicillin resistant derivatives are obtained.
  • DNA is isolated from small (10 ml) cultures and digested with NdeI to verify the presence of this restriction site within the reconstructed DNA. In order to remove the (apparent) P.
  • The modified chondroitinase I genes are isolated as approximately 4.5 kb NdeI-NsiI fragments and subcloned into a pBR322 variant in which the EcoRI site is first filled-in, then dephosphorylated, and finally a phosphorylated NsiI linker (New England Biolabs) inserted.
  • The sequence of the linker used (TGCATGCATGCA) to place the NsiI site (ATGCAT) into pBR322 also includes an SphI site (GCATGC).
  • plasmids (representing two clones each with the signal sequence retained [LP 2 861 and LP 2 863] and two with the signal sequence deleted [LP 2 865 and LP 2 867]) containing the approximately 4500 bp NdeI-NsiI segments including the chondroitinase I gene are first digested with SphI, the ends "filled-in" with the "Klenow" fragment (11) of the E.
  • NdeI-BamHI fragment which contains the chondroitinase I gene. Seventeen clones (eight with and nine without the signal sequence) yield the desired fragment which is extracted from the agarose gel with Qiaex™. These approximately 3.4 kb NdeI-BamHI chondroitinase I gene-containing fragments (both with and without the signal sequence) are then used to construct a high-level expression system.
  • chondroitinase I gene fragments (both with and without the signal sequence) is ligated to the expression vector fragment.
  • The resulting recombinant DNA mixture is used to transform the E. coli K-12 host, HMS174 (Novagen).
  • Kanamycin-resistant colonies obtained are grown in small scale (10 ml) and plasmid DNA is isolated and examined to confirm the predicted structure.
  • Samples of these constructions are then used to transform the expression host BL21(DE3)/pLysS (10).
  • T7 lysozyme is expressed at a relatively low level in this construction and serves as an inhibitor of the T7 RNA polymerase (16), thereby minimizing the basal-level expression of the gene to be overexpressed.
  • coli B strain BL21(DE3)/pLysS carrying the plasmid pTM49-6 constitute the deposited strain ATCC 69234.
  • An overnight culture of this deposited strain is grown at 30°C in the presence of 40 µg/ml of kanamycin and 25 µg/ml of chloramphenicol.
  • A 0.5 ml aliquot of this culture is used to inoculate 100 ml of a rich "expression" medium containing M9 salts (17) supplemented with 20 g/l tryptone, 10 g/l yeast extract, and 10 g/l dextrose in addition to the same level of kanamycin and chloramphenicol.
  • The culture is grown at 30°C to an appropriate density (a value of 1 at A600) and then chondroitinase I expression is induced by the addition of IPTG to a final concentration of 1 mM. After three hours, samples are taken, centrifuged, and the cell pellets frozen on dry ice prior to assay. The frozen pellets are thawed, resuspended in buffer and sonicated. A value of 56 units/ml is obtained
  • The bacterial cells are first recovered from the medium and resuspended in buffer.
  • The cell suspension is then homogenized to lyse the bacterial cells.
  • A charged particulate, such as 50 ppm Bioacryl (Toso Haas, Philadelphia, PA), is added to remove DNA, aggregates and debris from the homogenization step.
  • The solution is brought to 40% saturation of ammonium sulfate to precipitate out undesired proteins.
  • The chondroitinase I remains in solution.
  • The solution is then filtered using a 0.22 micron SP240 filter (Amicon, Beverly, MA), and the retentate is washed using nine volumes of 40% ammonium sulfate solution to recover most of the enzyme.
  • The filtrate is concentrated and subjected to diafiltration with a sodium phosphate buffer using a 30 kD filter to remove salts and small molecules.
  • The filtrate containing chondroitinase I is subjected to cation exchange chromatography using a Cellufine™ cellulose sulfate column (Chisso Corporation, distributed by Amicon).
  • At pH 7.2 in 20 mM sodium phosphate, more than 98% of the chondroitinase I binds to the column.
  • The native chondroitinase I is then eluted from the column using a 0 to 250 mM sodium chloride gradient, in 20 mM sodium phosphate buffer.
  • Chondroitinase I is obtained at a purity of 90-97% as measured by SDS-PAGE scanning (see above).
  • The yield of the native protein is only 25-35%, determined as described above. This method also results in the cleavage of the approximately 110 kD chondroitinase I protein into a 90 kD and an 18 kD fragment. Nonetheless, the two fragments remain non-ionically bound and exhibit chondroitinase I activity.
  • The host cells which express the recombinant chondroitinase I enzyme are homogenized to lyse the cells. This releases the enzyme into the supernatant.
  • The supernatant is first subjected to diafiltration to remove salts and other small molecules.
  • A suitable filter is a spiral wound 30 kD filter made by Amicon (Beverly, MA).
  • This step only removes the free, but not the bound, form of the negatively charged molecules.
  • The bound form of these charged species is removed by passing the supernatant through a strong, high capacity anion exchange resin-containing column.
  • An example of such a resin is the Macro-Prep™ High Q resin (Bio-Rad, Melville, N.Y.).
  • Other strong, high capacity anion exchange columns are also suitable.
  • The negatively charged molecules bind to the column, while the enzyme passes through the column. It is also found that some unrelated, undesirable proteins bind to the column.
  • The eluate from the anion exchange column is directly loaded onto a cation exchange resin-containing column.
  • Examples include the S-Sepharose™ (Pharmacia, Piscataway, N.J.) and the Macro-Prep™ High S (Bio-Rad).
  • Each of these two resin-containing columns has SO3⁻ ligands bound thereto in order to facilitate the exchange of cations.
  • Other cation exchange columns are also suitable.
  • The enzyme binds to the column and is then eluted with a solvent capable of releasing the enzyme from the column.
  • Any salt which increases the conductivity of the solution is suitable for elution.
  • Examples of such salts include sodium salts, as well as potassium salts and ammonium salts.
  • An aqueous sodium chloride solution of appropriate concentration is suitable.
  • A gradient, such as 0 to 250 mM sodium chloride, is acceptable, as is a step elution using 200 mM sodium chloride.
  • The purity of the protein is measured by scanning the bands in SDS-PAGE gels. A 4-20% gradient of acrylamide is used in the development of the gels. The band(s) in each lane of the gel is scanned using the procedure described above.
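The relationship between the gradient and the step elution can be made concrete with a short calculation. This sketch assumes the gradient is linear in delivered volume, which the text does not state; the function names and default values are illustrative.

```python
def gradient_concentration_mm(fraction: float, start_mm: float = 0.0,
                              end_mm: float = 250.0) -> float:
    """Salt concentration after a given fraction (0 to 1) of a linear gradient."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return start_mm + fraction * (end_mm - start_mm)


def fraction_reaching(target_mm: float, start_mm: float = 0.0,
                      end_mm: float = 250.0) -> float:
    """Fraction of the gradient volume at which the target concentration is reached."""
    return (target_mm - start_mm) / (end_mm - start_mm)


if __name__ == "__main__":
    print(gradient_concentration_mm(0.5))  # midpoint of a 0 to 250 mM gradient
    print(fraction_reaching(200.0))        # where the gradient passes the 200 mM step value
```

Under that assumption, the 200 mM concentration used for the step elution is reached four fifths of the way through a 0 to 250 mM gradient.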
  • Two additional steps are inserted in the method before the diafiltration step of the first embodiment.
  • The supernatant is treated with an acidic solution, such as 1 M acetic acid, bringing the supernatant to a final pH of 4.5, to precipitate out the desired enzyme.
  • The pellet is obtained by centrifugation at 5,000 x g for 20 minutes.
  • The pellet is then dissolved in an alkali solution, such as 20-30 mM NaOH, bringing it to a final pH of 9.8.
  • The solution is then subjected to the diafiltration and subsequent steps of the first embodiment of this invention.
  • Acid precipitation removes proteins that remain soluble; however, these proteins are removed anyway by the cation and anion exchange steps that follow (although smaller columns may be used).
  • An advantage of the acid precipitation step is that the sample volume is decreased to about 20% of the original volume after dissolution, and hence can be handled more easily on a large scale.
  • The additional acid precipitation and alkali dissolution steps of the second embodiment mean that the second embodiment is more time consuming than the first embodiment.
  • The marginal improvements in purity and yield provided by the second embodiment may be outweighed by the simpler procedure of the first embodiment, which still provides highly pure enzyme at high yields.
  • Lane 1 is the enzyme using the method of the first embodiment
  • Lane 2 is the enzyme using the method of the second embodiment
  • Lane 3 represents the supernatant from the host cell prior to purification; many other proteins are present
  • Lane 4 represents molecular weight standards.
  • The approach taken in the case of the chondroitinase II gene is to modify the naturally-occurring ATG initiation codon to embed it within an NdeI site. This results in a construction in which the signal peptide is retained, such that the expressed gene is processed and secreted to yield the mature native enzyme structure that has a leucine residue at the N-terminus.
  • The mutagenized bases are upstream of the coding region.
  • The method used for this site-specific alteration is that described above for the expression of the chondroitinase I gene and is based on the work of Kunkel (15) using the Muta-Gene™ In Vitro Mutagenesis Kit Version 2 (Bio-Rad, Melville, N.Y.).
  • The target DNA to be mutagenized is first cloned into a suitable M13-derived vector to generate single-stranded DNA.
  • This recombinant phage is replicated in the E. coli host strain CJ236 (Bio-Rad), a male strain that carries the dut and ung alleles.
  • dut (dUTPase)
  • ung (uracil-N-glycosylase)
  • T7 DNA polymerase which copies the entire recombinant molecule.
  • T4 DNA ligase is then used to seal the nick between the first residue of the mutagenic oligonucleotide and the last residue added in vitro.
  • The newly synthesized DNA (containing the desired base changes) therefore does not contain uracil while the template DNA (with the native sequences) does. Transformation of a non-mutant (with respect to the ung and dut alleles) male E. coli strain yields phage progeny that are primarily derived from the mutagenized strand synthesized in vitro as a result of the inactivation of the uracil-containing template strand.
  • The fragment to be cloned for the mutagenesis is a MunI-EcoRI fragment that spans the region between nucleotides 2943 to 3980 (SEQ ID NOS:1 and 39).
  • The DNA digested to obtain this fragment is designated LP 2 783.
  • This plasmid is constructed in the same way as LP 2 786 (described in Example 4), except that a HindIII linker is inserted into the EcoRV deletion of LP 2 776 rather than the EcoRI linker.
  • The four base overhang produced by MunI digestion can be ligated to an EcoRI site, but the resulting recombinant sequence cannot be digested by either enzyme.
  • The EcoRI digested LP 2 941 is also dephosphorylated with calf intestinal alkaline phosphatase (Boehringer Mannheim, Indianapolis, IN) prior to gel purification and use.
  • The ligated DNA mixture is used to infect the male E. coli strain MV1190 and the plaques obtained are picked to 0.5 ml of SM buffer and the phage allowed to elute by diffusion. These are then used to infect 10 ml cultures of MV1190 and grown overnight. The cultures are centrifuged and the pellets used for the isolation of the double-stranded replicative forms of the recombinant virus. The supernatants, which contain the corresponding phage particles, are stored under refrigeration until needed. The orientation of the cloned fragment is determined by digestion of the replicative form DNA with HindIII, because there is one site within the polylinker and a second, asymmetrically placed site (SEQ ID NOS:1 and 39, nucleotides 3326-3331) within the above MunI-EcoRI fragment.
  • The corresponding phage-containing supernatant is serially diluted, used to infect the E. coli strain CJ236, and then plated to obtain single plaques which are picked and eluted as above.
  • One of these is then used to infect CJ236 and another 10 ml culture grown and the single-stranded DNA is isolated from the phage-containing supernatant using Qiaex™ columns and materials and methods recommended by the manufacturer (Qiagen, Chatsworth, CA) and finally resuspended in a volume of 0.01 ml.
  • The recombinant phage are grown on CJ236 (dut⁻ ung⁻) for two rounds in order to maximize the accumulation of uracil residues in the template strand prior to the actual site-specific mutagenesis.
  • The mutagenic oligonucleotide used is obtained from Bio-Synthesis (Denton, TX) and has the following sequence:
  • This sequence differs from the corresponding region of SEQ ID NOS:1 and 39 in that an AT sequence (base pairs 3235 and 3236) is replaced by a CA sequence which creates the desired NdeI sequence (CATATG) at the start of the presumed leader sequence for the chondroitinase II gene.
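The compatibility of the two ends, and the loss of both sites after ligation, follows directly from the recognition sequences. The sketch below uses the standard recognition sites for MunI (C^AATTG) and EcoRI (G^AATTC); the helper names are hypothetical.

```python
MUNI_SITE = "CAATTG"   # MunI cuts after the first C, leaving a 5' AATT overhang
ECORI_SITE = "GAATTC"  # EcoRI cuts after the first G, leaving a 5' AATT overhang


def overhang(site: str) -> str:
    """Four-base 5' overhang left when a six-base site is cut after its first base."""
    return site[1:5]


def hybrid_junction(left_site: str, right_site: str) -> str:
    """Top-strand sequence formed by joining the left half of one cut site to the right half of another."""
    return left_site[:5] + right_site[5:]


if __name__ == "__main__":
    print(overhang(MUNI_SITE) == overhang(ECORI_SITE))
    for left, right in ((MUNI_SITE, ECORI_SITE), (ECORI_SITE, MUNI_SITE)):
        junction = hybrid_junction(left, right)
        print(junction, junction in (MUNI_SITE, ECORI_SITE))
```

Both possible junctions differ from each recognition site at one of the outer positions, so neither enzyme can cut the joined sequence.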
  • One optical density unit of this oligonucleotide is dissolved in 0.46 ml of TE 7.4 (0.01 M TrisHCl, pH 7.8 - 0.001 M EDTA, pH 8.0), yielding an oligonucleotide concentration of approximately 6 pmol/μl.
  • Three hundred picomoles of this oligonucleotide are phosphorylated in a 0.1 ml reaction containing 0.05 M TrisHCl, pH 7.8, 0.01 M MgCl2, 0.02 M dithiothreitol, 0.001 M ATP, 25 μg/ml bovine serum albumin and 100 units of T4 polynucleotide kinase (New England Biolabs) at 37°C for 30 minutes, followed by incubation at 75°C for 20 minutes to inactivate the enzyme.
  • The phosphorylated oligonucleotide is then stored frozen at -20°C at a concentration of approximately 3 pmol/μl.
  • 1 μl (3 pmol) of the mutagenic oligonucleotide is mixed with 6 μl of the single-stranded DNA prepared above in a 10 μl volume of 0.02 M TrisHCl, pH 7.4, 0.002 M MgCl2, 0.05 M NaCl.
  • The oligonucleotide is annealed to this template by first incubating the sample at 70°C for 5 minutes and then cooling this sample to 25°C over a 45 minute period in a DNA Thermal Cycler™ (Perkin-Elmer Cetus, Norwalk, CT).
  • The sample is maintained at 25°C for another 5 minutes before being cooled to 20°C and finally transferred to an ice bath.
  • The annealed primer is then extended after the addition of 1 μl of 10X synthesis buffer (Bio-Rad; containing 0.005 M of each of the dNTPs, 0.01 M ATP, 0.1 M TrisHCl, pH 7.4, 0.05 M MgCl2, 0.02 M DTT).
  • T4 DNA ligase (3 units/μl; Bio-Rad)
  • T7 DNA polymerase (0.5 units/μl; Bio-Rad)
  • The in vitro DNA synthesis is carried out on ice for 5 minutes, at 11°C for ten minutes, and at 37°C for 30 minutes prior to transfer to ice. This sample is used directly to transform the male E. coli host MV1190 (dut⁺ ung⁺), and the resulting plaques, containing the site-specifically mutagenized phage, are obtained, picked and eluted as described above. Aliquots of these phage stocks are used to infect 10 ml cultures of MV1190 and allowed to grow overnight. The cultures are centrifuged and the replicative forms of the recombinant phage are isolated using Qiaex™ columns and methods recommended by the manufacturer (Qiagen, Chatsworth, CA). The DNA isolated is resuspended in 0.1 ml of TE 7.4.
  • The four samples are then combined and the DNA extracted from the gel using a Qiaex™ resin and buffers according to the manufacturer's recommendations (Qiagen, Chatsworth, CA) and resuspended in 0.05 ml of TE, pH 7.4.
  • This isolated, site-specifically mutagenized N-terminal coding region of the cloned P. vulgaris chondroitinase II gene is then subcloned into the plasmid pNEB193 (New England Biolabs, Beverly, MA) between the (dephosphorylated) unique NdeI and EcoRI sites present in this plasmid.
  • The DNA sample from one of the positive clones is designated m#15-5712. This sample represents the modified N-terminal region that is to be joined to the C-terminal coding region for the chondroitinase II gene, which is described in Example 12.
  • The DNA sequence contained in SEQ ID NO:39 indicates that chondroitinase II is encoded by a region that is downstream of that for chondroitinase I. This information is derived from a portion of a 10 kilobase NsiI fragment of P. vulgaris that is subcloned originally from a cosmid clone designated LP 2 751.
  • The combination of the DNA sequencing and the restriction map in Figure 1 reveals that the chondroitinase II coding region initiates to the "left" of the EcoRI site that lies within the P. vulgaris derived DNA and proceeds toward the NsiI site at the "right" end of the fragment depicted in Figure 1. Therefore, this restriction map should be expanded to the "right" to find a suitable fragment that will include the C-terminal coding region for the chondroitinase II gene.
  • Digestions are carried out using the restriction enzymes AflIII, ClaI, EcoRV, and HindIII, each of which has been noted by Applicants to yield eight to ten fragments upon digestion of the original cosmid clone designated LP 2 751.
  • The recombinant molecule carrying the subcloned approximately 10 kb NsiI fragment (LP 2 770) and the individually gel-purified approximately 20 kb EcoRI and approximately 10 kb EcoRI fragments are digested with each of these enzymes to yield patterns of fragments that are compared.
  • These digestions reveal that the approximately 20 kb EcoRI and the LP 2 770 patterns have a number of fragments in common.
  • The second deletion removes the region between the HindIII site at the other end of the polylinker and the (now unique) PvuII site, maintaining the HindIII site, while removing the PvuII site.
  • The recombinant DNA molecule carrying the subcloned approximately 10 kb EcoRI fragment in the vector lacpo ⁇ pNEB193 is designated LP 2 1263.
  • The orientation of the 112 kD C-terminal coding region within LP 2 1263 is determined by restriction enzyme mapping.
  • The results indicate that this region is positioned so as to proceed from the EcoRI site (defined as the "left" end) toward the HindIII site at the other end of the polylinker.
  • This construction also "places" a BamHI site (present in the polylinker) downstream of the coding region for the chondroitinase II gene.
  • This recombinant DNA molecule, which carries the chondroitinase II gene from the EcoRI site to (and presumably just beyond) the termination codon for this gene, has been designated m#25-5712.
  • DNA sequence analysis is initiated on the approximately 10 kb EcoRI fragment derived from LP 2 1263 and is completed after the assembly of the intact gene for chondroitinase II.
  • The materials and methods for the DNA sequencing of this fragment are essentially the same as those used for the approximately 4 kb fragment containing the gene for chondroitinase I.
  • Random fragments are derived from this approximately 10 kb EcoRI fragment by self-ligating the DNA and then fragmenting the polymerized DNA by sonication as well as by partial digestion with the restriction enzymes Sau3A or MseI.
  • These pieces are then eventually cloned into M13-derived vectors and the single-stranded recombinant molecules sequenced using the standard protocols described above. Finally, with the two sets of sequence data available, an approximately 300 base-pair BclI fragment is identified that is predicted to contain the EcoRI site that is the junction between the two P. vulgaris fragments of approximately 20 kb and approximately 10 kb obtained by digestion with EcoRI. This small fragment is sequenced in both directions to verify the nucleotide sequence through this junction point used in the constructions described below.
  • The molecule designated m#25-5712 is digested with EcoRI and BamHI. This releases a DNA fragment of approximately 2.6 kb.
  • The construction designated m#15-5712 is digested with EcoRI and BamHI and then dephosphorylated prior to purification by gel electrophoresis.
  • The latter molecule therefore carries the N-terminal coding region of the chondroitinase II gene from the ATG initiation codon (now present as part of an NdeI site from the site-specific mutagenesis) to the EcoRI site.
  • The coding region of the chondroitinase II gene includes nucleotides 3238-6276 of SEQ ID NO:39, which encode 1013 amino acids (SEQ ID NO:40).
  • Nucleotides 3238-3306 encode the 23 amino acid signal peptide (SEQ ID NO:40, amino acids 1-23).
  • Nucleotides 3307-6276 encode the mature 990 amino acid chondroitinase II protein (SEQ ID NO:40, amino acids 24-1013).
  • Restriction analysis with Sau3AI reveals a multiplicity of sites, including those at SEQ ID NO:39, nucleotides 212, 602, 890, 1042, 1181, 1241, 1442, 1505, 1746, 2330, 2363, 2701, 2705, 2920, 3697, 3708, 3745, 3868, 4087, 4800, 4872, 5565, 5635, 5860, 6058 and 6467.
  • One of the recombinant molecules (the chondroitinase II gene inserted into pET9A) obtained in this experiment is grown in large scale (0.5 liter) and the expression system containing the chondroitinase II gene is isolated and designated LP 2 1359.
  • The resulting strain is designated TD112 and is used for large-scale fermentation and isolation of the chondroitinase II enzyme. A fermentation at a 10 liter scale carried out with this E. coli strain containing the plasmid expressing the chondroitinase II protein provides a maximum chondroitinase II titer of approximately 0.3 mg/ml, which is approximately 25 times that of the approximately 0.012 mg/ml obtained from the native P. vulgaris fermentation process for chondroitinase II.
  • The initial part of this method is the same as that used for the recombinant chondroitinase I enzyme.
  • The host cells which express the recombinant chondroitinase II enzyme are homogenized to lyse the cells. This releases the enzyme into the supernatant.
  • The supernatant is first subjected to diafiltration to remove salts and other small molecules.
  • A suitable filter is a spiral wound 30 kD filter made by Amicon (Beverly, MA).
  • This step removes only the free form of the negatively charged molecules, not the bound form.
  • The bound form of these charged species is removed by passing the supernatant (see the SDS-PAGE gel depicted in Figure 5, lane 1) through a strong, high capacity anion exchange resin-containing column.
  • An example of such a resin is the Macro-Prep™ High Q resin (Bio-Rad, Melville, NY).
  • Other strong, high capacity anion exchange columns are also suitable.
  • The negatively charged molecules bind to the column, while the enzyme passes through the column with approximately 90% recovery of the enzyme. It is also found that some unrelated, undesirable proteins also bind to the column.
  • The method diverges from that used for the chondroitinase I protein.
  • A specific elution using a solution containing chondroitin sulfate is used.
  • A 1% concentration of chondroitin sulfate is used; however, a gradient of this solvent is also acceptable.
  • The specific chondroitin sulfate solution is preferred to the non-specific salt solution because the recombinant chondroitinase II protein is expressed at levels several-fold lower than the recombinant chondroitinase I protein; therefore, a more powerful and selective solution is necessary in order to obtain a final chondroitinase II product of a purity equivalent to that obtained for the chondroitinase I protein.
  • The cation exchange column is next washed with a phosphate buffer, pH 7.0, to elute unbound proteins, followed by washing with borate buffer, pH 8.5, to elute loosely bound contaminating proteins and to increase the pH of the resin to that required for the optimal elution of the chondroitinase II protein using the substrate, chondroitin sulfate.
  • A 1% solution of chondroitin sulfate in water, adjusted to pH 9.0, is used to elute the chondroitinase II protein as a sharp peak (recovery 65%) and at a high purity of approximately 95% (Figure 5, lane 3).
  • The chondroitin sulfate has an affinity for the chondroitinase II protein which is stronger than its affinity for the resin of the column, and therefore the chondroitin sulfate co-elutes with the protein.
  • This ensures that only protein which recognizes chondroitin sulfate is eluted, which is desirable, but also means that an additional process step is necessary to separate the chondroitin sulfate from the chondroitinase II protein.
  • The eluate is adjusted to pH 7.0 and is loaded as is onto an anion exchange resin-containing column, such as the Macro-Prep™ High Q resin.
  • The column is washed with a 20 mM phosphate buffer, pH 6.8.
  • The chondroitin sulfate binds to the column, while the chondroitinase II protein flows through in the unbound pool with greater than 95% recovery.
  • The protein is pure, except for the presence of a single minor contaminant of approximately 37 kD (Figure 5, lanes 4 and 6).
  • The contaminant may be a breakdown product of the chondroitinase II protein.
  • This contaminant is effectively removed by a crystallization step.
  • The eluate from the anion exchange column is concentrated to 15 mg/ml protein using an Amicon stirred cell with a 30 kD cutoff.
  • The solution is maintained at 4°C for several days to crystallize out the pure chondroitinase II protein.
  • The supernatant contains the 37 kD contaminant (Figure 5, lane 7).
  • Centrifugation causes the crystals to form a pellet, while the supernatant with the 37 kD contaminant is removed by pipetting, and the crystals are washed twice with water.
  • Two additional steps are inserted in the method for purifying the chondroitinase II enzyme before the diafiltration step of the first embodiment.
  • The supernatant is treated with an acidic solution, such as 1 M acetic acid, bringing the supernatant to a final pH of 4.5, to precipitate out the desired enzyme.
  • The pellet is obtained by centrifugation at 5,000 x g for 20 minutes.
  • the pellet is then dissolved in an alkali solution, such as 20-30 mM
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE NO
  • CAC AAC GTA AAG CCA CAA CTA CCT GTA ACA CCT GAA AAT TTA GCG GCC 886 His Asn Val Lys Pro Gln Leu Pro Val Thr Pro Glu Asn Leu Ala Ala 245 250 255
  • GGC AGA CAT CTG ATC ACT GAT AAA CAA ATC ATT ATT TAT CAA CCA GAG 1078 Gly Arg His Leu Ile Thr Asp Lys Gln Ile Ile Ile Tyr Gln Pro Glu 305 310 315 320
  • GGT AGC AAT ATA AAT AGT AGT GAT AAA AAT AAA AAT GTT GAA ACG ACC 2470 Gly Ser Asn Ile Asn Ser Ser Asp Lys Asn Lys Asn Val Glu Thr Thr 770 775 780
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE NO
  • GATCGATCAC AGCACTCGCC CCAAAGATGC CAGTTATGAG TATATGGTCT TTTTAGATGC 2760
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE NO
  • GGAATTCCAT CACTCAATCA TTAAATTTAG GCACAACGAT GGGCTATCAG CGTTATGACA 60
  • AATTTAATGA AGGACGCATT GGTTTCACTG TTAGCCAGCG TTTCTAAGGA GAAAAATAAT 120
  • GAT AAA CAA CTA TTT GAT AAT TAT GTT ATT TTA GGT AAT TAC ACG ACA 1141 Asp Lys Gln Leu Phe Asp Asn Tyr Val Ile Leu Gly Asn Tyr Thr Thr 305 310 315
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE NO
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE NO
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE NO
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE NO
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE NO
  • SEQUENCE DESCRIPTION SEQ ID NO:10: CACTTCGCNC AAAATAACCC 20
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE NO
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE NO
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE NO
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE NO
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE NO
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE YES
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE YES
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE YES
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE YES
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE YES
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE YES
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE YES
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE YES
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE YES
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE NO
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE YES
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE YES
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE YES
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE YES
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE YES
  • MOLECULE TYPE DNA (genomic)
  • HYPOTHETICAL NO
  • ANTI-SENSE YES
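
The codon rows quoted in the sequence-listing excerpts above (for example, the row ending at nucleotide 886) print three-letter residues beneath each codon. As a minimal sketch, assuming the standard genetic code applies to these P. vulgaris coding regions, such a row can be re-translated to check the printed residues; the standard abbreviations for glutamine and isoleucine are Gln and Ile. The function and constant names below are illustrative only and are not taken from the patent.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One-letter to three-letter residue names; "Ter" marks a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(codons: str) -> str:
    """Translate whitespace-separated DNA codons into three-letter residues."""
    return " ".join(THREE_LETTER[CODON_TABLE[c]] for c in codons.upper().split())


# Codon row copied from the sequence-listing excerpt ending at nucleotide 886.
ROW_886 = "CAC AAC GTA AAG CCA CAA CTA CCT GTA ACA CCT GAA AAT TTA GCG GCC"
print(translate(ROW_886))
```

The same function can be applied to any other codon row in the listing; a mismatch between the printed residue line and the translation points to a transcription error in one of the two.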

Abstract

This invention relates to the DNA sequence encoding the major protein component of chondroitinase ABC, which is referred to as 'chondroitinase I', from Proteus vulgaris (P. vulgaris), which is contained in the Nsi fragment shown in the figure. This invention further relates to the DNA sequence encoding a second protein component of chondroitinase ABC, which is referred to as 'chondroitinase II', from P. vulgaris, to the cloning and expression of the genes containing these DNA sequences, to the amino acid sequences of the recombinant chondroitinase I and II, and to methods for the isolation and purification of recombinant chondroitinase I or II. These methods provide significantly higher yields and purity than those obtained by adapting for the recombinant enzymes the method previously used for isolating and purifying native chondroitinase I enzyme from P. vulgaris.

Description

CLONING AND EXPRESSION OF THE CHONDROITINASE I AND II GENES FROM P. VULGARIS
Field of the Invention
This invention relates to the DNA sequence encoding the major protein component of chondroitinase ABC, which is referred to as "chondroitinase I", from Proteus vulgaris (P. vulgaris). This invention further relates to the DNA sequence encoding a second protein component of chondroitinase ABC, which is referred to as "chondroitinase II", from P. vulgaris. This invention also relates to the cloning and expression of the genes containing these DNA sequences and to the amino acid sequences of the recombinant chondroitinase I and II enzymes encoded by these DNA sequences.
This invention additionally relates to methods for the isolation and purification of the recombinantly expressed major protein component of chondroitinase ABC, which is referred to as "chondroitinase I", from Proteus vulgaris (P. vulgaris). This invention further relates to methods for the isolation and purification of the recombinantly expressed second protein component of chondroitinase ABC, which is referred to as "chondroitinase II", from P. vulgaris. These methods provide significantly higher yields and purity than those obtained by adapting for the recombinant enzymes the method previously used for isolating and purifying the native chondroitinase I enzyme from P. vulgaris.
Background of the Invention
Chondroitinases are enzymes of bacterial origin which have been described as having value in dissolving the cartilage of herniated discs without disturbing the stabilizing collagen components of those discs.
Examples of chondroitinase enzymes are chondroitinase ABC, which is produced by the bacterium P. vulgaris, and chondroitinase AC, which is produced by A. aurescens. The chondroitinases function by degrading polysaccharide side chains in protein-polysaccharide complexes, without degrading the protein core. Yamagata et al. describe the purification of the enzyme chondroitinase ABC from extracts of P. vulgaris (Bibliography entry 1). The enzyme selectively degrades the glycosaminoglycans chondroitin-4-sulfate, dermatan sulfate and chondroitin-6-sulfate (also referred to respectively as chondroitin sulfates A, B and C) at pH 8 at higher rates than chondroitin or hyaluronic acid. However, the enzyme did not attack keratosulfate, heparin or heparitin sulfate. Kikuchi et al. describe the purification of glycosaminoglycan degrading enzymes, such as chondroitinase ABC, by fractionating the enzymes by adsorbing a solution containing the enzymes onto an insoluble sulfated polysaccharide carrier and then desorbing the individual enzymes from the carrier (2).
Brown describes a method for treating intervertebral disc displacement in mammals, including humans, by injecting into the intervertebral disc space effective amounts of a solution containing chondroitinase ABC (3). The chondroitinase ABC was isolated and purified from extracts of P. vulgaris. This native enzyme material functioned to dissolve cartilage, such as herniated spinal discs. Specifically, the enzyme causes the selective chemonucleolysis of the nucleus pulposus which contains proteoglycans and randomly dispersed collagen fibers.
Hageman describes an ophthalmic vitrectomy method for selectively and completely disinserting the ocular vitreous body, epiretinal membranes or fibrocellular membranes from the neural retina, ciliary epithelium and posterior lens surface of the mammalian eye as an adjunct to vitrectomy, by administering to the eye an effective amount of an enzyme which disrupts or degrades chondroitin sulfate proteoglycan localized specifically to sites of vitreoretinal adhesion and thereby permits complete disinsertion of said vitreous body and/or epiretinal membranes (4). The enzyme can be a protease-free glycosaminoglycanase, such as chondroitinase ABC. Hageman utilized chondroitinase ABC obtained from Seikagaku Kogyo Co., Ltd., Tokyo, Japan. In isolating and purifying the chondroitinase ABC enzyme from the Seikagaku Kogyo material, it was noted that there was a correlation between effective preparations of the chondroitinase in vitrectomy procedures and the presence of a second protein having an apparent molecular weight (by SDS-PAGE) slightly greater than that of the major protein component of chondroitinase ABC. The second protein is now designated "chondroitinase II", while the major protein component of chondroitinase ABC is referred to as "chondroitinase I." The chondroitinase I and II proteins are basic proteins at neutral pH, with similar isoelectric points of 8.30-8.45. Separate purification of the chondroitinase I and II forms of the native enzyme revealed that it was the combination of the two proteins that was active in the surgical vitrectomy rather than either of the proteins individually.
Use of the chondroitinase I and II forms of the native enzyme to date has been limited by the small amounts of enzymes obtained from native sources. The production and purification of the native forms of the enzyme has been carried out using fermentations of P. vulgaris in which its substrate has been used as the inducer to initiate production of these forms of the enzyme. A combination of factors, including low levels of synthesis, the cost and availability of the inducer (chondroitin sulfate), and the opportunistically pathogenic nature of P. vulgaris, has resulted in the requirement for a more efficient method of production. In addition, the native forms of the enzyme produced by conventional techniques are subject to degradation by proteases present in the bacterial extract. Therefore, there is a need for a reliable supply of pure material free of contaminants in order for the medical applications of the two forms of this enzyme to be evaluated properly and exploited. There is also a need for methods to isolate and purify a reliable supply of the chondroitinase I and II enzymes free of contaminants.
Summary of the Invention
Accordingly, it is an object of this invention to produce chondroitinase I and chondroitinase II in quantities not readily achievable using present non-recombinant bacterial fermentation and extraction techniques. It is a further object of this invention to produce chondroitinase I and chondroitinase II, each in a form substantially free of proteases which would otherwise degrade the enzyme and cause a loss of its activity.
These objects are achieved through the use of an alternative approach to the problems presented by large scale bacterial fermentation of these two forms of the enzyme. Separately for chondroitinase I and chondroitinase II, the gene that encodes the enzyme is cloned and the enzyme is expressed at high levels in a heterologous host. In a preferred embodiment, this invention is directed to the cloning of the P. vulgaris gene for chondroitinase I and the high level expression of that enzyme in E. coli, as well as the cloning of the P. vulgaris gene for chondroitinase II and the high level expression of that enzyme in E. coli.
This invention provides a purified isolated DNA fragment of P. vulgaris which comprises a sequence encoding for chondroitinase I. This invention further provides a purified isolated DNA fragment of P. vulgaris which hybridizes with a nucleic acid sequence encoding for amino acids as follows:
(a) the chondroitinase I enzyme with its signal peptide (SEQ ID NO:2, amino acids 1-1021) or a biological equivalent thereof (encoded for example by: (1) nucleotides numbered 119-3181 of SEQ ID NO:1, and (2) nucleotides numbered 119-3181 of SEQ ID NO:3, where the three nucleotides immediately upstream of the initiation codon are changed (SEQ ID NO:3, nucleotides 116-118));
(b) the mature chondroitinase I enzyme (SEQ ID NO:2, amino acids 25-1021) or a biological equivalent thereof (encoded for example by: (1) nucleotides numbered 191-3181 of SEQ ID NO:1, and (2) nucleotides numbered 191-3181 of SEQ ID NO:3, where the three nucleotides immediately upstream of the initiation codon are changed (SEQ ID NO:3, nucleotides 116-118)); and
(c) the mature chondroitinase I enzyme where the sequence encoding the signal peptide has been replaced with a sequence which adds a methionine residue to the amino terminus of the enzyme (SEQ ID NO:5, amino acids 24-1021) or a biological equivalent thereof (encoded for example by nucleotides numbered 188-3181 of SEQ ID NO:4).
The recombinant chondroitinase I is produced by transforming a host cell with a plasmid containing a purified isolated DNA fragment of P. vulgaris which contains one of the above-described sequences, and culturing the host cell under conditions which permit expression of the enzyme by the host cell.
This invention also provides a purified isolated DNA fragment of P. vulgaris which comprises a sequence encoding for chondroitinase II. This invention further provides a purified isolated DNA fragment from P. vulgaris which hybridizes with a nucleic acid sequence encoding for amino acids as follows:
(a) the chondroitinase II enzyme with its signal peptide (SEQ ID NO:40, amino acids 1-1013) or a biological equivalent thereof (encoded for example by nucleotides numbered 3238-6276 of SEQ ID NO:39); and
(b) the mature chondroitinase II enzyme (SEQ ID NO:40, amino acids 24-1013) or a biological equivalent thereof (encoded for example by nucleotides numbered 3307-6276 of SEQ ID NO:39).
The recombinant chondroitinase II is produced by transforming a host cell with a plasmid containing a purified isolated DNA fragment of P. vulgaris which contains one of the above-described sequences, and culturing the host cell under conditions which permit expression of the enzyme by the host cell.
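
The nucleotide and amino acid ranges quoted above for the two enzymes can be cross-checked with simple arithmetic: an inclusive nucleotide range should contain a whole number of codons equal to the number of residues it is said to encode. The sketch below is illustrative only; the dictionary labels and function names are chosen here and do not come from the patent, and the ranges are treated as inclusive.

```python
# Inclusive nucleotide ranges and residue ranges quoted in the text above.
# The dictionary keys are descriptive labels chosen for this sketch.
REGIONS = {
    "chondroitinase I with signal peptide": ((119, 3181), (1, 1021)),
    "mature chondroitinase I": ((191, 3181), (25, 1021)),
    "chondroitinase II with signal peptide": ((3238, 6276), (1, 1013)),
    "mature chondroitinase II": ((3307, 6276), (24, 1013)),
}


def span(first, last):
    """Length of the inclusive range first..last."""
    return last - first + 1


def check_region(nt_range, aa_range):
    """Return (nucleotides, codons, residues, consistent) for one region."""
    nt_len = span(*nt_range)
    codons, remainder = divmod(nt_len, 3)
    residues = span(*aa_range)
    return nt_len, codons, residues, remainder == 0 and codons == residues


for name, (nt_range, aa_range) in REGIONS.items():
    nt_len, codons, residues, ok = check_region(nt_range, aa_range)
    print(f"{name}: {nt_len} nt = {codons} codons, "
          f"{residues} residues, consistent={ok}")
```

Under these assumptions all four ranges are consistent, which also implies that the quoted nucleotide ranges end at the last sense codon and do not include a termination codon.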
It is an additional object of this invention to provide methods for the isolation and purification of the recombinantly expressed chondroitinase I enzyme of P. vulgaris.
It is a particular object of this invention to provide methods which result in significantly higher yields and purity of the recombinant chondroitinase I enzyme than those obtained by adapting for the recombinant enzyme the method previously used for isolating and purifying the native chondroitinase I enzyme from P. vulgaris.
These objects are achieved through either of two methods described and claimed herein for the chondroitinase I enzyme. The first method comprises the steps of:
(a) lysing by homogenization the host cells which express the recombinant chondroitinase I enzyme to release the enzyme into the supernatant;
(b) subjecting the supernatant to diafiltration to remove salts and other small molecules;
(c) passing the supernatant through an anion exchange resin-containing column;
(d) loading the eluate from step (c) to a cation exchange resin-containing column so that the enzyme in the eluate binds to the cation exchange column; and
(e) eluting the enzyme bound to the cation exchange column with a solvent capable of releasing the enzyme from the column.
In the second method, prior to step (b) of the first method just described, the following two steps are performed:
(1) treating the supernatant with an acidic solution to precipitate out the enzyme; and
(2) recovering the pellet and then dissolving it in an alkali solution to again place the enzyme in a basic environment.
It is a further object of this invention to provide methods for the isolation and purification of the recombinantly expressed chondroitinase II enzyme of P. vulgaris.
It is an additional object of this invention to provide methods which result in significantly higher yields and purity of the recombinant chondroitinase II enzyme than those obtained by adapting for the recombinant enzyme the method previously used for isolating and purifying the native chondroitinase I enzyme from P. vulgaris.
These objects are achieved through either of two methods described and claimed herein for the chondroitinase II enzyme. The first method comprises the steps of:
(a) lysing by homogenization the host cells which express the recombinant chondroitinase II enzyme to release the enzyme into the supernatant;
(b) subjecting the supernatant to diafiltration to remove salts and other small molecules;
(c) passing the supernatant through an anion exchange resin-containing column;
(d) loading the eluate from step (c) to a cation exchange resin-containing column so that the enzyme in the eluate binds to the cation exchange column;
(e) obtaining by affinity elution the enzyme bound to the cation exchange column with a solution of chondroitin sulfate, such that the enzyme is co-eluted with the chondroitin sulfate;
(f) loading the eluate from step (e) to an anion exchange resin-containing column and eluting the enzyme with a solvent such that the chondroitin sulfate binds to the column; and
(g) concentrating the eluate from step (f) and crystallizing out the enzyme from the supernatant which contains an approximately 37 kD contaminant.
In the second method, prior to step (b) of the first method just described, the following two steps are performed:
(1) treating the supernatant with an acidic solution to precipitate out the enzyme; and
(2) recovering the pellet and then dissolving it in an alkali solution to again place the enzyme in a basic environment.
Use of the methods of this invention results in significantly higher yields and purity of each recombinant enzyme than those obtained by adapting for each recombinant enzyme the method previously used for isolating and purifying the native chondroitinase I enzyme from P. vulgaris.
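
As a rough illustration of how the individual step recoveries reported elsewhere in this document for the chondroitinase II purification combine, the sketch below multiplies them into an overall figure. It is a back-of-the-envelope estimate under stated assumptions: only the three chromatography steps with a quoted recovery are included (approximately 90% for the first anion exchange flow-through, 65% for the chondroitin sulfate elution from the cation exchange column, and greater than 95% for the second anion exchange flow-through, entered here as 0.95); losses in homogenization, diafiltration and crystallization are not quoted and are therefore ignored. The names used are illustrative.

```python
from math import prod

# Step recoveries quoted in the document for the chondroitinase II purification.
# "Greater than 95%" is entered as 0.95, i.e. as a lower bound (an assumption).
STEP_RECOVERY = [
    ("first anion exchange (flow-through)", 0.90),
    ("cation exchange, chondroitin sulfate elution", 0.65),
    ("second anion exchange (flow-through)", 0.95),
]


def cumulative_recovery(steps):
    """Multiply the fractional recoveries of consecutive steps."""
    return prod(fraction for _, fraction in steps)


overall = cumulative_recovery(STEP_RECOVERY)
print(f"Estimated recovery over the three quoted steps: {overall:.1%}")
```

Because recoveries multiply, the 65% affinity elution step dominates the estimate; the product of the three quoted values is a little over half of the starting enzyme.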
Brief Description of the Figures
Figure 1 depicts a preliminary restriction map for the subcloned approximately 10 kilobase Nsi fragment in pIBI24. The Nsi fragment contains the complete gene encoding chondroitinase I and a portion of the gene encoding chondroitinase II. The restriction sites are shown in their approximate positions. The restriction sites shown are useful in the constructions described below; other restriction sites present are not shown in this Figure; some are set forth in Example 13 below.
Figure 2 depicts the elution of the recombinant chondroitinase I enzyme from a cation exchange chromatography column using a sodium chloride gradient. The method used to purify the native enzyme is used here to attempt to purify the recombinant enzyme. The initial fractions at the left do not bind to the column. They contain the majority of the chondroitinase I enzyme activity. The fractions at right containing the enzyme are marked "eluted activity". The gradient is from 0.0 to 250 mM NaCl.
Figure 3 depicts the elution of the recombinant chondroitinase I enzyme from a cation exchange column, after first passing the supernatant through an anion exchange column, in accordance with a method of this invention. The initial fractions at the left do not bind to the column, and contain only traces of chondroitinase I activity. The fractions at right containing the enzyme are marked "eluted activity". The gradient is from 0.0 to 250 mM NaCl.
Figure 4 depicts sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) of the recombinant chondroitinase I enzyme before and after the purification methods of this invention are used. In the SDS-PAGE gel photograph, Lane 1 is the enzyme purified using the method of the first embodiment of the invention; Lane 2 is the enzyme purified using the method of the second embodiment of the invention; Lane 3 represents the supernatant from the host cell prior to purification (many other proteins are present); Lane 4 represents the following molecular weight standards: 14.4 kD - lysozyme; 21.5 kD - trypsin inhibitor; 31 kD - carbonic anhydrase; 42.7 kD - ovalbumin; 66.2 kD - bovine serum albumin; 97.4 kD - phosphorylase B; 116 kD - beta-galactosidase; 200 kD - myosin. A single sharp band is seen in Lanes 1 and 2.
Figure 5 depicts SDS-PAGE of the recombinant chondroitinase II enzyme during various stages of purification using a method of this invention. In the SDS-PAGE gel photograph, Lane 1 is the crude supernatant after diafiltration; Lane 2 is the eluate after passage of the supernatant through an anion exchange resin-containing column; Lane 3 is the enzyme after elution through a cation exchange resin-containing column; Lane 4 is the enzyme after elution through a second anion exchange resin-containing column; Lane 5 represents the same molecular weight standards as described for Figure 4, plus 6.5 kD - aprotinin; Lane 6 is the same as Lane 4, except it is overloaded to show the approximately 37 kD contaminant; Lane 7 is the 37 kD contaminant in the supernatant after crystallization of the chondroitinase II enzyme; Lane 8 is the first wash of the crystals; Lane 9 is the second wash of the crystals; Lane 10 is the enzyme in the washed crystals after redissolving in water.
Detailed Description of the Invention
Preliminary experiments indicated that E. coli could not use the hydrolysis products yielded by chondroitinase I as a sole carbon source, suggesting that this gene could not be cloned by selecting for its expression in E. coli. Another approach, followed in this application, is to use a physical method to identify DNA fragments that encode the chondroitinase I enzyme. This is accomplished using an appropriately labeled probe for hybridization with individual clones that, together, make up a gene bank comprising the complete genome of P. vulgaris. The probe itself is generated using Polymerase Chain Reaction (PCR) (5). In this procedure, the genomic DNA of P. vulgaris is denatured and oligonucleotides (designed to bracket part of the chondroitinase I gene) are annealed and DNA synthesis is carried out in vitro. This cycle of denaturation, annealing and DNA synthesis using the oligonucleotides as primers is repeated many times (e.g., 30), with the yield of the desired product (the DNA fragment that lies between the two oligonucleotides) increasing exponentially with each cycle. A putative nucleotide sequence of the appropriate oligonucleotides is constructed from available amino acid sequence information derived from the protein purified from P. vulgaris bacteria. Once this is done, the DNA fragment produced by PCR is cloned and its DNA sequence determined to verify that it is part of the chondroitinase I gene. It is then labeled and used as a probe to indicate which members of the gene bank actually contain the chondroitinase I gene. Subsequent restriction mapping and Southern hybridization narrow the location to a piece of DNA of approximately four thousand base-pairs (bp). This is then sequenced using the Sanger dideoxy chain termination method (6) to reveal the exact position of the gene and guide the subsequent manipulations used to place the gene into a high-level expression system in E. coli. A fermentation at a 10 liter scale carried out with this E. coli strain containing a recombinant plasmid expressing the P. vulgaris chondroitinase I gene yields a maximum chondroitinase I titer of approximately 600 units/ml (which is the same as 1.2 mg/ml). This yield far exceeds that of the native P. vulgaris fermentation process which had not achieved a titer of more than 2 units/ml.
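The exponential increase in PCR product described in the paragraph above can be illustrated with a short calculation. The following Python sketch assumes an idealized reaction in which a fixed fraction of templates (here called `efficiency`, a name introduced only for illustration) is copied in every cycle; real reactions reach a plateau and rarely achieve perfect doubling, so the numbers are upper bounds rather than measured yields.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Expected copy number after `cycles` rounds of idealized PCR.

    `efficiency` is the assumed fraction of templates copied per cycle
    (1.0 means perfect doubling). Plateau effects are ignored.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Illustrative efficiencies only; none of these values comes from the text.
    for eff in (1.0, 0.9, 0.8):
        copies = pcr_copies(1, 30, eff)
        print(f"efficiency {eff:.1f}: {copies:.3g} copies after 30 cycles")
```

With perfect doubling, 30 cycles turn one template into 2^30 (about one billion) copies, which is why a fragment present once per genome becomes a discrete band on a gel.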
The process of cloning and expression of the chondroitinase I gene is summarized by the following series of stages:
1) The isolation of P. vulgaris genomic DNA and the construction of a cosmid gene bank.
2) PCR experimentation designed to yield an authentic piece of the chondroitinase I gene for use as a hybridization probe.
3) Colony hybridization studies to identify at least a portion of the chondroitinase I gene.
4) Restriction mapping, Southern hybridization, DNA sequencing, and chondroitinase I enzyme assays that, collectively, serve to place the location of the chondroitinase I gene more precisely within the cloned DNA.
5) DNA sequence analysis to reveal the exact coding region and location of the chondroitinase I gene.
6) Site-specific mutagenesis, related manipulations, and genetic engineering leading to the regulated, high-level expression of the P. vulgaris gene in E. coli.
These six stages are described in specific detail in Examples 1-7 below. The rationale for the stages is as follows. In the first stage, genomic DNA is obtained. DNA is separated from protein and other material contained in a P. vulgaris fermentation. Study of the genomic DNA is facilitated by the insertion of fragments of the DNA into cosmid vectors. The genomic DNA is digested with an appropriate restriction endonuclease, such as Sau3A, and then ligated into a cosmid vector. The packaged recombinant cosmids containing the P. vulgaris DNA fragments are introduced into an appropriate bacterial host strain, such as an E. coli strain, and the resulting culture is grown to allow gene expression. The gene banks are engineered to contain a marker, such as ampicillin or kanamycin resistance, to assist in the screening of the gene banks for the presence of the chondroitinase I gene. Applicants have conducted some amino acid sequencing of the native chondroitinase I enzyme. Samples of the enzyme are generated by fermentation of P. vulgaris. Samples may also be obtained from Seikagaku Kogyo Co., Ltd., Tokyo, Japan. The amino acid sequence information is used to design oligonucleotides for use in screening for the chondroitinase I gene.
In the second stage, oligonucleotides are designed for use in PCR. A first set of oligonucleotides is designed so as to encode a heptapeptide that has minimal degeneracy of its genetic code. Seven amino acids near the amino terminus of the chondroitinase I enzyme (SEQ ID NO:2, amino acids 19-25) are potentially encoded by 512 different nucleotide sequences (SEQ ID NO:6; see Example 2) . The number of potential sequences is reduced to 32 by selecting specific nucleotides at the 5' end, because of the observation that mismatched nucleotides in PCR primers are of less consequence at the 5' end than at the 3' end of the primer (7) . The sequences of the pool of 32 primers are set out at SEQ ID NOS:7-14.
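The degeneracy counts discussed above follow from multiplying the number of codons available for each residue. The Python sketch below shows that calculation using the standard genetic code. The peptide `AEDKNQA` is hypothetical: it is not residues 19-25 of SEQ ID NO:2 (which are not reproduced in this section) and was chosen only because its degeneracy happens to equal 512. The sketch counts complete codons; a primer that omits the final wobble base would have a lower count.

```python
from math import prod

# Number of codons for each amino acid in the standard genetic code
# (one-letter amino acid codes).
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def degeneracy(peptide):
    """Number of distinct nucleotide sequences that can encode `peptide`."""
    return prod(CODON_COUNT[aa] for aa in peptide)


def least_degenerate_window(protein, width=7):
    """Return (start index, peptide, degeneracy) of the first window of
    `width` residues with the lowest degeneracy."""
    starts = range(len(protein) - width + 1)
    best = min(starts, key=lambda i: degeneracy(protein[i:i + width]))
    window = protein[best:best + width]
    return best, window, degeneracy(window)


if __name__ == "__main__":
    hypothetical_peptide = "AEDKNQA"  # illustrative only, not from SEQ ID NO:2
    print(degeneracy(hypothetical_peptide))
```

Scanning a partial protein sequence with `least_degenerate_window` is one way to pick a stretch rich in Met, Trp and two-codon residues, which is the design goal stated above of minimal degeneracy.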
Applicants have discovered that the approximately 110 kD chondroitinase I enzyme is cleaved proteolytically into an 18,000 MW ("18 kD") fragment and an approximately 90,000 MW ("90 kD") fragment. Furthermore, the 18 kD fragment is further fragmented by treatment with cyanogen bromide and trypsin. The various fragments are then used to design additional sets of oligonucleotide primers for PCR.
Seven amino acids within the 18 kD fragment (SEQ ID NO:2, amino acids 114-120) are potentially encoded by 512 different nucleotide sequences (SEQ ID NO:15; see Example 2). The complementary strand has the same number of potential sequences (SEQ ID NO:16; see Example 2). Using the criteria described above for the first set of oligonucleotides, the number of potential sequences is reduced to 128, whose sequences are set out at SEQ ID NOS:17-24. Six amino acids located near the amino-terminus of the "90 kD" fragment (SEQ ID NO:2, amino acids 165-170) are potentially encoded by a large number of different nucleotide sequences (SEQ ID NOS:25 and 26; see Example 2). The complementary strand has the same number of potential sequences (SEQ ID NOS:27 and 28; see Example 2). Using the criteria described above for the first set of oligonucleotides, the number of potential sequences is reduced to the sequences set out at SEQ ID NOS:29-36.
PCR amplifications are conducted using these 24 mixtures of oligonucleotides. The most effective amplifications are observed as discrete bands on electrophoretic gels. Products approximately 500 and 350 base pairs (bp) in size are obtained. The approximately 350 bp product is a subfragment of the approximately 500 bp product. The approximately 500 bp product is isolated and, following successive cloning procedures described in Example 2, is isolated as a 455 bp PCR product.
This 455 bp fragment is sequenced and translated into an amino acid sequence which is in virtual agreement with the sequence available from the native chondroitinase I enzyme. The sequences differ by one amino acid; subsequent experiments reveal that the nucleotide and amino acid sequences of the 455 bp fragment are correct, while the native amino acid sequence identification is in error.
In the third stage, the PCR amplification fragment is used as a probe to identify the cosmid gene banks prepared in the first stage which contain the chondroitinase I gene. The PCR fragment is denatured and labelled with, for example, digoxigenin-labelled dUTP (Boehringer-Mannheim, Indianapolis, IN). The cosmid gene banks are then used to infect a bacterial strain. The resulting colonies are lysed and their DNA subjected to colony hybridization with the labelled probe, followed by exposure to an alkaline phosphatase-conjugated antibody to the digoxigenin-labelled material. Positive clones are visualized and then picked to be grown in selective media.
In the fourth stage, Southern hybridization (8) and restriction mapping are used to localize the position of the chondroitinase I gene within individual clones. The PCR-generated fragment described above is used as a Southern hybridization probe against P. vulgaris genomic DNA that is first digested by restriction enzymes and fractionated. In a second PCR amplification, several of the oligonucleotides described above are used as primers. The results indicate that the portion of the chondroitinase I gene that hybridizes to the probe is carried on several large DNA fragments. These large DNA fragments are digested to yield individual fragments which are isolated, tested for the presence of chondroitinase I sequences by Southern hybridization, and then subcloned into appropriate vectors. Example 3 details the cloning strategy used. Restriction maps are generated to assist in the identification of the portions of the fragments carrying the desired sequences. In addition, in vitro chondroitinase I assays, in which the activity of the enzyme is determined by measuring the release of unsaturated disaccharide from chondroitin sulfate C at 232 nm, are conducted on several samples to assist in the placement and orientation of the chondroitinase I gene. The results of these procedures suggest that a 4.2 kb EcoRV-EcoRI fragment of a larger 10 kb NsiI fragment could contain the entire chondroitinase I gene.
In the fifth stage, the above-mentioned 4.2 kb fragment is subjected to DNA sequence analysis. The resulting DNA sequence is 3980 nucleotides in length (SEQ ID NO:1). Translation of the DNA sequence into the putative amino acid sequence reveals a continuous open reading frame (SEQ ID NO:1, nucleotides 119-3181) encoding 1021 amino acids (SEQ ID NO:2). In turn, analysis of the amino acid sequence reveals a 24 residue signal sequence (SEQ ID NO:2, amino acids 1-24), followed by a 997 residue mature (processed) chondroitinase I enzyme (SEQ ID NO:2, amino acids 25-1021). Signal sequences are required for a complex series of post-translational processing steps which result in secretion of a protein from a host cell. The signal sequence constitutes the amino-terminal end of the protein to be secreted. In most cases, the signal sequence is cleaved off by a specific protease, called a signal peptidase.
The "18 kD" and "90 kD" fragments are found to be adjacent to each other, with the "18 kD" fragment constituting the first 157 amino acids of the mature protein (SEQ ID NO:2, amino acids 25-181), and the "90 kD" fragment constituting the remaining 840 amino acids of the mature protein (SEQ ID NO:2, amino acids 182-1021) .
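The coordinates given above for the open reading frame, the signal sequence and the two proteolytic fragments are mutually consistent, as the following Python sketch confirms. The helper names are illustrative. The mass estimate assumes an average residue mass of about 110 daltons, a common rule of thumb that is not taken from this specification, so it gives only a rough comparison with the "18 kD", "90 kD" and 110 kD designations.

```python
def span_length(first, last):
    """Number of positions in an inclusive coordinate range."""
    return last - first + 1


def codons_in_span(first_nt, last_nt):
    """Number of whole codons in an inclusive nucleotide range."""
    length = span_length(first_nt, last_nt)
    if length % 3 != 0:
        raise ValueError("span is not a whole number of codons")
    return length // 3


AVERAGE_RESIDUE_DALTONS = 110  # rule-of-thumb assumption, not from the text


def approx_mass_kd(residues):
    """Rough protein mass in kD from residue count."""
    return residues * AVERAGE_RESIDUE_DALTONS / 1000.0


if __name__ == "__main__":
    orf = codons_in_span(119, 3181)   # open reading frame, SEQ ID NO:1
    signal = span_length(1, 24)       # signal sequence
    mature = span_length(25, 1021)    # mature enzyme
    frag_18 = span_length(25, 181)    # "18 kD" fragment
    frag_90 = span_length(182, 1021)  # "90 kD" fragment
    print(orf, signal, mature, frag_18, frag_90)
    print(approx_mass_kd(mature), approx_mass_kd(frag_18), approx_mass_kd(frag_90))
```

The residue counts add up (24 + 997 = 1021 and 157 + 840 = 997), and the rough masses fall close to the nominal fragment sizes.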
The chondroitinase I enzyme of this invention is expressed using established recombinant DNA methods. Suitable host organisms include bacteria, viruses, yeast, insect or mammalian cell lines, as well as other conventional organisms. The host cell is transformed with a plasmid containing a purified isolated DNA fragment encoding for chondroitinase I enzyme. The host cell is then cultured under conditions which permit expression of the enzyme by the host cell.
In the sixth stage, the gene is subjected to site-directed mutagenesis to introduce unique restriction sites. These permit the gene to be moved, in the correct reading frame, into an expression system which results in expression of chondroitinase I enzyme at high levels. Such an appropriate host cell is the bacterium E. coli.
As detailed in Example 6 below, two different constructs are prepared. In the first, the three nucleotides immediately upstream of the initiation codon are changed (SEQ ID NO:3, nucleotides 116-118) through the use of a mutagenic oligonucleotide (SEQ ID NO:37). The coding region and amino acid sequence encoded by the resulting construct are not changed, and the signal sequence is preserved (SEQ ID NO:3, nucleotides 119-3181; SEQ ID NO:2). In a preferred embodiment of this invention, the second construct is used. In the second construct, the site-directed mutagenesis is carried out at the junction of the signal sequence and the start of the mature protein. A mutagenic oligonucleotide (SEQ ID NO:38) is used which differs at six nucleotides from those of the native sequence (SEQ ID NO:1, nucleotides 185-190). The sequence differences result in (a) the deletion of the signal sequence, and (b) the addition of a methionine residue at the amino-terminus, resulting in a 998 amino acid protein (SEQ ID NO:4, nucleotides 188-3181; SEQ ID NO:5).
In the absence of a signal sequence, the enzyme is not secreted. Fortunately, it is not retained within the cell in the form of insoluble inclusion bodies. Instead, at least some of the enzyme is produced intracellularly as a soluble active enzyme. The enzyme is extracted by homogenization, which serves to lyse the cells and thereby release the enzyme into the supernatant. Even with the signal sequence present, much of the enzyme is not secreted, because it is thought that this expression system provides such high yields of enzyme that it exceeds the capacity of the host cell to secrete that much enzyme.
As described in Example 7 below, the gene lacking the signal sequence is inserted into an appropriate expression vector. One such vector is pET-9A (9; Novagen, Madison, WI), which is derived from elements of the E. coli bacteriophage T7. The resulting recombinant plasmid is designated pTM49-6. The plasmid is then used to transform an appropriate expression host cell, such as the E. coli B strain BL21(DE3)/pLysS (10; Novagen).
Samples of this E. coli B strain BL21(DE3)/pLysS carrying the recombinant plasmid pTM49-6 were deposited by Applicants on February 4, 1993, with the American Type Culture Collection, 12301 Parklawn Drive, Rockville, Maryland 20852, U.S.A., and have been assigned ATCC accession number 69234.
Expression of the chondroitinase I enzyme using the deposited host cell yields approximately 300 times as much enzyme as is possible using a fermentation vessel of the same size with native (non-recombinant) P. vulgaris.
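The approximately 300-fold figure is consistent with the titers given earlier in this description: about 600 units/ml (stated to equal 1.2 mg/ml) for the recombinant E. coli fermentation, against no more than 2 units/ml for native P. vulgaris. The Python sketch below reproduces that arithmetic and the specific activity the two recombinant figures imply. The function names are illustrative and not part of the disclosure.

```python
def specific_activity(units_per_ml, mg_per_ml):
    """Enzyme activity per unit mass (units/mg) implied by a titer pair."""
    if mg_per_ml <= 0:
        raise ValueError("protein concentration must be positive")
    return units_per_ml / mg_per_ml


def fold_improvement(recombinant_titer, native_titer):
    """Ratio of recombinant to native titer, both in units/ml."""
    if native_titer <= 0:
        raise ValueError("native titer must be positive")
    return recombinant_titer / native_titer


if __name__ == "__main__":
    recombinant_units_per_ml = 600.0  # maximum titer quoted in the text
    recombinant_mg_per_ml = 1.2       # stated equivalent of 600 units/ml
    native_units_per_ml = 2.0         # upper bound quoted for P. vulgaris

    print(specific_activity(recombinant_units_per_ml, recombinant_mg_per_ml))
    print(fold_improvement(recombinant_units_per_ml, native_units_per_ml))
```

The implied specific activity is about 500 units/mg. Because 2 units/ml is an upper bound on the native titer, 300-fold is a lower bound on the improvement.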
After expression of the chondroitinase I enzyme, the supernatant from the host cells is treated to isolate and purify the enzyme. Initial attempts to isolate and purify the recombinant chondroitinase I enzyme do not result in high yields of purified protein. The previous method for isolating and purifying native chondroitinase I from fermentation cultures of P. vulgaris is found to be inappropriate for the recombinant material.
The native enzyme is produced by fermentation of a culture of P. vulgaris. The bacterial cells are first recovered from the medium and resuspended in buffer. The cell suspension is then homogenized to lyse the bacterial cells. Then a charged particulate such as Bioacryl (Toso Haas, Philadelphia, PA) is added to remove DNA, aggregates and debris from the homogenization step. Next, the solution is brought to 40% saturation of ammonium sulfate to precipitate out undesired proteins. The chondroitinase I remains in solution.
The solution is then filtered and the retentate is washed to recover most of the enzyme. The filtrate is concentrated and subjected to diafiltration with a phosphate to remove the salt. The filtrate containing the chondroitinase I is subjected to cation exchange chromatography using a cellulose sulfate column. At pH 7.2, 20 mM sodium phosphate, more than 98% of the chondroitinase I binds to the column. The native chondroitinase I is then eluted from the column using a sodium chloride gradient.
The eluted enzyme is then subjected to additional chromatography steps, such as anion exchange and hydrophobic interaction column chromatography. As a result of all of these procedures, chondroitinase I is obtained at a purity of 90-97%. The level of purity is measured by first performing SDS-PAGE. The proteins are stained using Coomassie blue, destained, and the lane on the gel is scanned using a laser beam of wavelength 600 nm. The purity is expressed as the percentage of the total absorbance accounted for by that band.
However, the yield of the native protein is only 25-35%. The yield is measured as the remaining activity in the final purified product, expressed as a percentage of the activity at the start (which is taken as 100%) . In turn, the activity of the enzyme is based on measuring the release of unsaturated disaccharide from chondroitin sulfate C at 232 nm. This purification method also results in the extensive cleavage of the approximately 110,000 dalton (110 kD) chondroitinase I protein into a 90 kD and an 18 kD fragment. Nonetheless, the two fragments remain non-covalently bound and exhibit chondroitinase I activity.
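The purity and yield definitions above reduce to two ratios. The Python sketch below restates them as functions. The band absorbances and activity values in the example are hypothetical numbers chosen only to show the calculation and are not data from this specification.

```python
def purity_percent(band_absorbances, target_index):
    """Purity as the share of total lane absorbance found in one band.

    `band_absorbances` holds the integrated densitometer signal for every
    band in the lane; `target_index` selects the enzyme band.
    """
    total = sum(band_absorbances)
    if total <= 0:
        raise ValueError("total absorbance must be positive")
    return 100.0 * band_absorbances[target_index] / total


def yield_percent(final_activity, starting_activity):
    """Yield as remaining activity relative to the starting activity (100%)."""
    if starting_activity <= 0:
        raise ValueError("starting activity must be positive")
    return 100.0 * final_activity / starting_activity


if __name__ == "__main__":
    lane = [2.0, 95.0, 3.0]  # hypothetical band absorbances; enzyme is index 1
    print(purity_percent(lane, 1))
    print(yield_percent(300.0, 1000.0))  # hypothetical activity units
```

Both measures are relative: purity depends on every stained band in the lane being included in the scan, and yield depends on assaying the starting material and the final product under the same conditions.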
When this procedure is repeated with homogenate from lysed host cells carrying a recombinant plasmid encoding chondroitinase I, significantly poorer results are obtained. Less than 10% of the chondroitinase I binds to the cation exchange column at standard stringent conditions of pH 7.2, 20 mM sodium phosphate.
Under less stringent binding conditions of pH 6.8 and 5 mM phosphate, an improvement of binding with one batch of material to 60-90% is observed.
However, elution of the recombinant protein with the NaCl gradient gives a broad activity peak, rather than a sharp peak (see Figure 2) . This indicates the product is heterogeneous. Furthermore, in subsequent fermentation batches, the recombinant enzyme binds poorly (1-40%) , even using the less stringent binding conditions. Most of these batches are not processed to the end, as there is poor binding. Therefore, their overall recovery is not quantified. Based on these results, it is concluded that the recombinant chondroitinase I enzyme has a reduced basicity compared to the native enzyme, and that the basicity also varies between batches, as well as within the same batch. It is evident that the method used to isolate and purify the native enzyme is not appropriate for the recombinant enzyme. The method produces low yields of protein at high cost. Furthermore, for large batches, large amounts of solvent waste are produced containing large amounts of a nitrogen-containing compound (ammonium sulfate) . This is undesirable from an environmental point of view.
A hypothesis is then developed to explain these poor results and to provide a basis for developing improved isolation and purification methods. It is known that the native chondroitinase I enzyme is basic at neutral pH. It is therefore assumed that the surface of the enzyme has a net excess of positive charges.
Without being bound by this hypothesis, it is believed that, in recombinant expression of the enzyme, the host cell contains or produces small, negatively charged molecules. These negatively charged molecules bind to the enzyme, thereby reducing the number of positive charges on the enzyme. If these negatively charged molecules bind with high enough affinity to copurify with the enzyme, they can cause an alteration of the behavior of the enzyme on the ion exchange column.
Support for this hypothesis is provided by the data described below. In general, cation exchange resins bind to proteins better at lower pH values than at higher pH values. Thus, a protein which is not very basic, and hence does not bind at a high pH, can be made to bind to the cation exchanger by carrying out the operation at a lower pH. At pH 7.2, the native enzyme binds completely to a cation exchange resin. However, the recombinant-derived enzyme, due to the lowered basicity as a result of binding of the negatively charged molecules, does not bind very well (less than 10%). This enzyme can be made to bind up to 70% by using a pH of 6.8 and a lower phosphate concentration (5 mM rather than 20 mM), but heterogeneity and low yield remain great problems. Indeed, only one fermentation results in a 70% binding level; typically, it is much less (less than 10%) even at pH 6.8. This level of binding varies dramatically between different fermentation batches.
This hypothesis and a possible solution to the problem are then tested. If negatively charged molecules are attaching non-covalently to chondroitinase I, thus decreasing its basicity, it should be possible to remove these undesired molecules by using a strong, high capacity anion exchange resin. Removal of the negatively charged molecules should then restore the basicity of the enzyme. The enzyme could then be bound to a cation exchange resin and eluted therefrom in pure form at higher yields. Experiments demonstrate that this approach indeed provides a solution to the problem encountered with the isolation and purification of the recombinantly expressed chondroitinase I enzyme.
As is discussed below, chondroitinase I is recombinantly expressed in two forms. The enzyme is expressed with a signal peptide, which is then cleaved to produce the mature enzyme. The enzyme is also expressed without a signal peptide, to produce directly the mature enzyme. The two embodiments of this invention which will now be discussed are suitable for use in purifying either of these forms of the enzyme.
In the first embodiment of this aspect of the invention, the host cells which express the recombinant chondroitinase I enzyme are lysed by homogenization to release the enzyme into the supernatant. The supernatant is then subjected to diafiltration to remove salts and other small molecules. However, this step only removes the free, but not the bound form of the negatively charged molecules. The bound form of these charged species is next removed by passing the supernatant through a strong, high capacity anion exchange resin-containing column. An example of such a resin is the Macro-Prep™ High Q resin (Bio-Rad, Melville, N.Y.). Other strong, high capacity anion exchange columns are also suitable. Weak anion exchangers containing a diethylaminoethyl (DEAE) ligand also are suitable, although they are not as effective. Similarly, low capacity resins are also suitable, although they too are not as effective. The negatively charged molecules bind to the column, while the enzyme passes through the column. It is also found that some unrelated, undesirable proteins also bind to the column.
Next, the eluate from the anion exchange column is directly loaded to a cation exchange resin-containing column. Examples of such resins are the S-Sepharose™ (Pharmacia, Piscataway, N.J.) and the Macro-Prep™ High S (Bio-Rad). Each of these two resin-containing columns has SO3⁻ ligands bound thereto in order to facilitate the exchange of cations. Other cation exchange columns are also suitable. The enzyme binds to the column and is then eluted with a solvent capable of releasing the enzyme from the column. Any salt which increases the conductivity of the solution is suitable for elution. Examples of such salts include sodium salts, as well as potassium salts and ammonium salts. An aqueous sodium chloride solution of appropriate concentration is suitable. A gradient, such as 0 to 250 mM sodium chloride is acceptable, as is a step elution using 200 mM sodium chloride.
A sharp peak is seen in the sodium chloride gradient elution (Figure 3) . The improvement in enzyme yield over the prior method is striking. The recombinant chondroitinase I enzyme is recovered at a purity of 99% at a yield of 80-90%.
The purity of the protein is measured by scanning the bands in SDS-PAGE gels. A 4-20% gradient of acrylamide is used in the development of the gels. The band(s) in each lane of the gel is scanned using the procedure described above.
These improvements are related directly to the increase in binding of the enzyme to the cation exchange column which results from first using the anion exchange column. In comparative experiments, when only the cation exchange column is used, only 1% of the enzyme binds to the column. However, when the anion exchange column is used first, over 95% of the enzyme binds to the column.
The high purity and yield obtained with the first embodiment of this invention make it more feasible to manufacture the recombinant chondroitinase I enzyme on a large scale.
In a second embodiment of this aspect of the invention, two additional steps are inserted in the method before the diafiltration step of the first embodiment. The supernatant is treated with an acidic solution to precipitate out the desired enzyme. The pellet is recovered and then dissolved in an alkali solution to again place the enzyme in a basic environment. The solution is then subjected to the diafiltration and subsequent steps of the first embodiment of this invention.
In comparative experiments with the second embodiment of this invention, when only the cation exchange column is used, only 5% of the enzyme binds to the column. However, when the anion exchange column is used first, essentially 100% of the enzyme binds to the column. The second embodiment provides comparable enzyme purity and yield to the first embodiment of the invention.
Acid precipitation removes proteins that remain soluble; however, these proteins are removed anyway by the cation and anion exchange steps that follow (although smaller columns may be used) . An advantage of the acid precipitation step is that the sample volume is decreased to about 20% of the original volume after dissolution, and hence can be handled more easily on a large scale. However, the additional acid precipitation and alkali dissolution steps of the second embodiment mean that the second embodiment is more time consuming than the first embodiment. On a manufacturing scale, the marginal improvements in purity and yield provided by the second embodiment may be outweighed by the simpler procedure of the first embodiment, which still provides highly pure chondroitinase I enzyme at high yields. An additional benefit of the two embodiments of the invention is that cleavage of the enzyme into 90 kD and 18 kD fragments is avoided.
The high purity of the enzyme produced by the two embodiments of this invention is depicted in Figure 4. A single sharp band is seen in the SDS-PAGE gel photograph: Lane 1 is the enzyme using the method of the first embodiment; Lane 2 is the enzyme using the method of the second embodiment (Lane 3 represents the supernatant from the host cell prior to purification -- many other proteins are present; Lane 4 represents molecular weight standards).
The material deposited with the ATCC can also be used in conjunction with the sequences disclosed herein to regenerate the native chondroitinase I gene sequence (SEQ ID NO:1) or the modified chondroitinase I gene sequence which includes the signal sequence (SEQ ID NO:3) using conventional genetic engineering technology.
Production of native chondroitinase I enzyme in P. vulgaris after induction with chondroitin sulfate does not provide a high yield of enzyme; the enzyme represents approximately 0.1% of total protein present. When the recombinant construct with the signal sequence deleted is used in E. coli, approximately 15% of the total protein is the chondroitinase I enzyme.
In addition to the three DNA sequences just described for the chondroitinase I gene (SEQ ID NOS:1, 3 and 4), the present invention further comprises DNA sequences which, by virtue of the redundancy of the genetic code, are biologically equivalent to the sequences which encode for the enzyme, that is, these other DNA sequences are characterized by nucleotide sequences which differ from those set forth herein, but which encode an enzyme having the same amino acid sequences as those encoded by the DNA sequences set forth herein.
In particular, the invention contemplates those DNA sequences which are sufficiently duplicative of the sequences of SEQ ID NOS:1, 3 or 4 so as to permit hybridization therewith under standard high stringency Southern hybridization conditions, such as those described in Sambrook et al. (11), as well as the biologically active enzymes produced thereby. This invention also comprises DNA sequences which encode amino acid sequences which differ from those of the chondroitinase I enzyme, but which are the biological equivalent to those described for the enzyme (SEQ ID NOS:2 and 5). Such amino acid sequences may be said to be biologically equivalent to those of the enzyme if their sequences differ only by minor deletions from or conservative substitutions to the enzyme sequence, such that the tertiary configurations of the sequences are essentially unchanged from those of the enzyme.
For example, a codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon encoding another less hydrophobic residue, such as glycine, or a more hydrophobic residue, such as valine, leucine, or isoleucine. Similarly, changes which result in substitution of one negatively charged residue for another, such as aspartic acid for glutamic acid, or one positively charged residue for another, such as lysine for arginine, as well as changes based on similarities of residues in their hydropathic index, can also be expected to produce a biologically equivalent product. Nucleotide changes which result in alteration of the N-terminal or C-terminal portions of the protein molecule would also not be expected to alter the activity of the protein. Each of the proposed modifications is well within the routine skill in the art, as is determination of retention of biological activity of the encoded products. Therefore, where the terms "chondroitinase I gene" or "chondroitinase I enzyme" are used in either the specification or the claims, each will be understood to encompass all such modifications and variations which result in the production of a biologically equivalent protein.
The starting point for the cloning and expression of the chondroitinase II gene is partial amino acid sequencing of the mature native chondroitinase II protein obtained from P. vulgaris. The N-terminal sequence of the mature native chondroitinase II protein is found to include the following 22 amino acids:
Leu-Pro-Thr-Leu-Ser-His-Glu-Ala-Phe-Gly-Asp-Ile-Tyr- Leu-Phe-Glu-Gly-Glu-Leu-Pro-Asn-Thr (SEQ ID NO: 40, amino acids 1-22)
The nucleotide sequence determined above for the region encoding the chondroitinase I gene includes an additional approximately 800 base pairs beyond the translation termination codon (SEQ ID NOS:l and 39, nucleotides 3185-3980) . An inspection of this region reveals that the sequence between nucleotides 3307 and 3372 (SEQ ID NOS:l and 39) encodes the identical 22 amino acids in the same order as the first 22 amino acids of native chondroitinase II.
Furthermore, an ATG initiation codon (SEQ ID NOS:1 and 39, nucleotides 3238-3240) is found upstream of this region and in-frame, indicating that this gene is expressed with a 23 amino acid signal peptide sequence for the export of chondroitinase II (SEQ ID NO:40, amino acids 1-23). Although a Shine-Dalgarno sequence (AGGA; SEQ ID NOS:1 and 39, nucleotides 3225-3228) is found upstream of the initiation codon, there is no apparent promoter sequence, suggesting that both the 110 kD and 112 kD forms of the P. vulgaris chondroitinase enzyme are expressed as part of a single messenger RNA.
The coding sequence that starts with this ATG was originally not found to be continuous in SEQ ID NO:1, since a termination codon (TAA) was thought to be present in-frame at base pairs identified as 3607-3609. Re-examination of the sequencing data, however, revealed that a residue was overlooked and that a T should be inserted between nucleotides originally identified as 3593 and 3594. This change restores the open reading frame, which then extends through the end of SEQ ID NO:39 (SEQ ID NOS:1 and 39 include the inserted T as nucleotide 3594). (Thus, the three bases TAA at base pairs 3608-3610, properly numbered, do not constitute a termination codon.)
With this information available, the cloning and expression of the P. vulgaris chondroitinase II gene is performed in three stages. In the first stage, because the N-terminal sequences are known, a site-specific mutagenesis is carried out. This is necessary in order for this gene to be placed, eventually, directly into the desired T7-based expression vector pET9A that is used (as described above) for the chondroitinase I gene. The mutagenized bases are upstream of the coding region (an AT sequence (SEQ ID NOS:1 and 39, base pairs 3235 and 3236) is replaced by a CA sequence).
The second stage, which can be carried out in parallel with the first, involves the identification, isolation and DNA sequencing of an appropriate DNA fragment which will include the C-terminal coding region of the chondroitinase II gene. The available DNA sequence information is adequate to account for approximately 220 amino acids of an estimated 1000 for the entire chondroitinase II protein. The missing coding sequences, therefore, would extend for another 2400 base pairs beyond the end of SEQ ID NO:1.
The third stage involves the assembly of an intact gene for chondroitinase II that has been modified to include the initiation codon as part of an NdeI site and to be followed by a BamHI site downstream of the coding region. This allows a directed insertion of this gene into the pET9A expression vector (Novagen, Madison, WI) without further modification.
Sequencing of the entire assembled gene confirms the presence of the initiation codon at nucleotides 3238-3240, where this codon represents the start of the region coding for the signal peptides at nucleotides 3238-3306, the region coding for the mature protein at nucleotides 3307-6276, and a termination codon at nucleotides 6277-6279 (SEQ ID NO:39) . The translation of this sequence results in 1013 amino acids, of which the first 23 amino acids are the signal peptide and 990 amino acids constitute the mature chondroitinase II protein at residues numbered 24-1013 (SEQ ID NO:40) . In this construction, the signal peptide is retained, such that the expressed gene is processed and secreted to yield the mature native enzyme structure that has a leucine residue at the N-terminus.
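The coordinates quoted above can be checked against the stated amino acid counts with a few lines of arithmetic. The following is only an illustrative sketch: the region names and helper functions are hypothetical, and the coordinates are simply those given in the text for SEQ ID NO:39 (1-based, inclusive).

```python
# Nucleotide coordinates (1-based, inclusive) as quoted in the text for SEQ ID NO:39.
# The variable and function names are illustrative only.
SIGNAL_PEPTIDE = (3238, 3306)   # ATG initiation codon through end of signal peptide
MATURE_PROTEIN = (3307, 6276)   # region coding for mature chondroitinase II
N_TERMINAL_22 = (3307, 3372)    # region matching the 22 sequenced N-terminal residues
STOP_CODON = (6277, 6279)       # termination codon


def span_length(region):
    """Number of nucleotides in an inclusive (start, end) region."""
    start, end = region
    return end - start + 1


def codon_count(region):
    """Number of whole codons in a region; fails if the region is out of frame."""
    length = span_length(region)
    if length % 3 != 0:
        raise ValueError(f"region {region} is not a whole number of codons")
    return length // 3


signal_aa = codon_count(SIGNAL_PEPTIDE)
mature_aa = codon_count(MATURE_PROTEIN)
print("signal peptide residues:", signal_aa)
print("mature protein residues:", mature_aa)
print("total translated residues:", signal_aa + mature_aa)
print("sequenced N-terminal residues:", codon_count(N_TERMINAL_22))
```

Under these coordinates the signal peptide, mature protein and total lengths agree with the 23, 990 and 1013 residues stated in the text.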
As described in Example 13 below, the gene encoding the chondroitinase II protein is inserted into pET9A and the resulting recombinant plasmid is designated LP21359. The plasmid is then used to transform an appropriate expression host cell, such as the E. coli B strain BL21(DE3)/pLysS (which is also used for the expression of the chondroitinase I gene). Samples of this E. coli B strain designated TD112, which is BL21(DE3)/pLysS carrying the recombinant plasmid LP21359, were deposited by Applicants on April 6, 1994, with the American Type Culture Collection, 12301 Parklawn Drive, Rockville, Maryland 20852, U.S.A., and have been assigned ATCC accession number 69598.
Expression of the chondroitinase II enzyme using the deposited host cell yields approximately 25 times the amount of enzyme obtainable from a fermentation vessel of the same size with native (non-recombinant) P. vulgaris.
After expression of the enzyme, the supernatant from the host cells is treated to isolate and purify the enzyme. Because of the virtually identical isoelectric points and similar molecular weights for the two proteins, the first method described above for isolating and purifying the recombinant chondroitinase I protein is adapted for isolating and purifying the recombinant chondroitinase II protein, and then modified as will now be described.
The need for the modification of the method is based on the fact that the recombinant chondroitinase II protein is expressed at levels several-fold lower than the recombinant chondroitinase I protein; therefore, a more powerful and selective solution is necessary in order to obtain a final chondroitinase II product of a purity equivalent to that obtained for the chondroitinase I protein.
The first several steps of the method for the chondroitinase II protein are the same as those used to isolate and purify the chondroitinase I protein. Initially, the host cells which express the recombinant chondroitinase II enzyme are lysed by homogenization to release the enzyme into the supernatant. The supernatant is then subjected to diafiltration to remove salts and other small molecules. However, this step only removes the free, but not the bound form of the negatively charged molecules. The bound form of these charged species is next removed by passing the supernatant through a strong, high capacity anion exchange resin-containing column. An example of such a resin is the Macro-Prep™ High Q resin (Bio-Rad, Melville, N.Y.). Other strong, high capacity anion exchange columns are also suitable. Weak anion exchangers containing a diethylaminoethyl (DEAE) ligand also are suitable, although they are not as effective. Similarly, low capacity resins are also suitable, although they too are not as effective. The negatively charged molecules bind to the column, while the enzyme passes through the column. It is also found that some unrelated, undesirable proteins also bind to the column.
Next, the eluate from the anion exchange column is directly loaded to a cation exchange resin-containing column. Examples of such resins are the S-Sepharose™ (Pharmacia, Piscataway, N.J.) and the Macro-Prep™ High S (Bio-Rad). Each of these two resin-containing columns has SO3⁻ ligands bound thereto in order to facilitate the exchange of cations. Other cation exchange columns are also suitable. The enzyme binds to the column, while a significant portion of contaminating proteins elute unbound.
At this point, the method diverges from that used for the chondroitinase I protein. Instead of eluting the protein with a non-specific salt solution capable of releasing the enzyme from the cation exchange column, a specific elution using a solution containing chondroitin sulfate is used.
This procedure utilizes the affinity the positively charged chondroitinase II protein has for the negatively charged chondroitin sulfate. The affinity is larger than that accounted for by a simple positive and negative interaction alone. It is an enzyme-substrate interaction, which is similar to other specific biological interactions of high affinity, such as antigen-antibody, ligand-receptor, co-factor-protein and inhibitor/activator-protein. Hence, the chondroitin sulfate is able to elute the enzyme from the negatively charged resin. In contrast, the resin-enzyme interaction is a simple positive and negative interaction. Although affinity elution chromatography is as easy to practice as ion-exchange chromatography, the elution is specific, unlike salt elution. Thus, it has the advantages of both affinity chromatography (specificity), as well as ion-exchange chromatography (low cost, ease of operation, reusability).
Another advantage is the low conductivity of the eluent (approximately 5% of that of the salt eluent), which allows for further ion-exchange chromatography without a diafiltration/dialysis step, which is required when a salt is used. Note that this is not a consideration in the method for the chondroitinase I protein, because no further ion-exchange chromatography is needed in order to obtain the purified chondroitinase I protein. There is another reason for not using the method for purifying recombinant chondroitinase I. Chondroitinase II obtained using the chondroitinase I salt elution purification method has poor stability; there is extensive degradation at 4°C within one week. In contrast, chondroitinase II obtained by affinity elution is stable. The reason for this difference in stability is not known. It is to be noted that chondroitinase I obtained by salt elution is stable. The cation exchange column is next washed with a phosphate buffer to elute unbound proteins, followed by washing with borate buffer to elute loosely bound contaminating proteins and to increase the pH of the resin to that required for the optimal elution of the chondroitinase II protein using the substrate, chondroitin sulfate.
Next, a solution of chondroitin sulfate in water, adjusted to pH 9.0, is used to elute the chondroitinase II protein, as a sharp peak (recovery 65%) and at a high purity of approximately 95%. A 1% concentration of chondroitin sulfate is used. A gradient of this solvent is also acceptable.
Because the chondroitin sulfate has an affinity for the chondroitinase II protein which is stronger than its affinity for the resin of the column, the chondroitin sulfate co-elutes with the protein. This ensures that only protein which recognizes chondroitin sulfate is eluted, which is desirable, but also means that an additional process step is necessary to separate the chondroitin sulfate from the chondroitinase II protein.
In this separation step, the eluate is adjusted to a neutral pH and is loaded as is onto an anion exchange resin-containing column, such as the Macro-Prep™ High Q resin. The column is washed with a phosphate buffer. The chondroitin sulfate binds to the column, while the chondroitinase II protein flows through in the unbound pool with greater than 95% recovery. At this point, the protein is pure, except for the presence of a single minor contaminant of approximately 37 kD. The contaminant may be a breakdown product of the chondroitinase II protein.
This contaminant is effectively removed by a crystallization step. The eluate from the anion exchange column is concentrated and the solution is maintained at a reduced temperature, such as 4°C, for several days to crystallize out the pure chondroitinase II protein. The supernatant contains the 37 kD contaminant. Centrifugation causes the crystals to form a pellet, while the supernatant with the 37 kD contaminant is removed by pipetting. The crystals are then washed with water. The washed crystals are composed of the chondroitinase II protein at a purity of greater than 99%.
In a second embodiment of this aspect of the invention for the chondroitinase II protein, two additional steps are inserted in the method before the diafiltration step of the first embodiment. The supernatant is treated with an acidic solution to precipitate out the desired enzyme. The pellet is recovered and then dissolved in an alkali solution to again place the enzyme in a basic environment. The solution is then subjected to the diafiltration and subsequent steps of the first embodiment of this invention. Acid precipitation removes proteins that remain soluble; however, these proteins are removed anyway by the cation and anion exchange steps that follow (although smaller columns may be used). An advantage of the acid precipitation step is that, after dissolution, the sample volume is smaller than the original volume and hence can be handled more easily on a large scale. However, the additional acid precipitation and alkali dissolution steps mean that the second embodiment is more time consuming than the first embodiment. On a manufacturing scale, the marginal improvements in purity and yield provided by the second embodiment may be outweighed by the simpler procedure of the first embodiment, which still provides highly pure chondroitinase II enzyme at high yields.
Production of native chondroitinase II enzyme in P. vulgaris after induction with chondroitin sulfate does not provide a high yield of enzyme; the enzyme represents approximately 0.1% of total protein present. When the recombinant construct is used in E. coli, approximately 2.5% of the total protein is the chondroitinase II enzyme.
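The relative enrichment implied by these two percentages can be worked out directly. The sketch below is illustrative only; the variable names are hypothetical, and it compares fractions of total protein, which is not necessarily the same measure as enzyme yield per fermentation vessel.

```python
# Approximate fractions of total protein quoted in the text, in percent.
# Names are illustrative; the ratio compares protein fractions only.
native_percent = 0.1        # chondroitinase II in induced P. vulgaris
recombinant_percent = 2.5   # chondroitinase II in recombinant E. coli

fold_enrichment = recombinant_percent / native_percent
print(f"fold enrichment in fraction of total protein: {fold_enrichment:.0f}")
```

On these figures the recombinant host carries roughly 25-fold more chondroitinase II per unit of total protein.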
In addition to the DNA sequence just described for the chondroitinase II gene (SEQ ID NO:39), the present invention further comprises DNA sequences which, by virtue of the redundancy of the genetic code, are biologically equivalent to the sequences which encode for the enzyme, that is, these other DNA sequences are characterized by nucleotide sequences which differ from those set forth herein, but which encode an enzyme having the same amino acid sequences as those encoded by the DNA sequences set forth herein.
In particular, the invention contemplates those DNA sequences which are sufficiently duplicative of the sequence of SEQ ID NO:39 so as to permit hybridization therewith under standard high stringency Southern hybridization conditions, such as those described in Sambrook et al. (11), as well as the biologically active enzymes produced thereby.
This invention also comprises DNA sequences which encode amino acid sequences which differ from those of the chondroitinase II enzyme, but which are the biological equivalent to those described for the enzyme (SEQ ID NO:40). Such amino acid sequences may be said to be biologically equivalent to those of the enzyme if their sequences differ only by minor deletions from or conservative substitutions to the enzyme sequence, such that the tertiary configurations of the sequences are essentially unchanged from those of the enzyme.
For example, a codon for the amino acid alanine, a hydrophobic amino acid, may be substituted by a codon encoding another less hydrophobic residue, such as glycine, or a more hydrophobic residue, such as valine, leucine, or isoleucine. Similarly, changes which result in substitution of one negatively charged residue for another, such as aspartic acid for glutamic acid, or one positively charged residue for another, such as lysine for arginine, as well as changes based on similarities of residues in their hydropathic index, can also be expected to produce a biologically equivalent product. Nucleotide changes which result in alteration of the N-terminal or C-terminal portions of the protein molecule would also not be expected to alter the activity of the protein. Each of the proposed modifications is well within the routine skill in the art, as is determination of retention of biological activity of the encoded products. Therefore, where the terms "chondroitinase II gene" or "chondroitinase II enzyme" are used in either the specification or the claims, each will be understood to encompass all such modifications and variations which result in the production of a biologically equivalent protein.
If desired, one of ordinary skill in the art can ligate together the two pieces of DNA from the two deposits, for example, at the HindIII site at nucleotide 3326, so as to express both the chondroitinase I and chondroitinase II proteins under the control of the T7 promoter upstream of the coding sequence for chondroitinase I.
In order that this invention may be better understood, the following examples are set forth. The examples are for the purpose of illustration only and are not to be construed as limiting the scope of the invention.
Examples
Standard molecular biology techniques are utilized according to the protocols described in Sambrook et al. (11).
Example 1
Isolation Of P. vulgaris Genomic DNA And Construction Of A Cosmid Bank In E. coli
Two 35 ml aliquots (designated A and B) of a P. vulgaris large-scale (1000 liter) fermentation are obtained and centrifuged. Both pellets are resuspended with 7 ml of 0.05M glucose-0.025M Tris-HCl-0.01M EDTA (pH 8) containing 4 mg/ml of egg-white lysozyme. After 30 minutes of incubation at 37°C, 7 ml of 1% SDS-0.16M EDTA-0.02M NaCl (pH 8) are added to sample "A" and incubation is continued at 37°C for another hour.
After the initial lysozyme treatment, sample "B" is centrifuged and the cell pellet taken up with 7 ml of 0.05M glucose-0.025M Tris-HCl-0.01M EDTA (pH 8) containing 40 μg/ml of DNase-free RNase and then 7 ml of 1% SDS-0.16M EDTA-0.02M NaCl (pH 8) are added to this resuspended material. Finally, proteinase K (Boehringer Mannheim, Indianapolis, IN) is added to both samples to a final concentration of 100 μg/ml and incubation is continued overnight at 37°C.
The next day, the samples are extracted once with an equal volume (14 ml) of equilibrated phenol followed by two further extractions in which the samples are extracted with 7 ml of phenol followed by the addition of 7 ml of chloroform, continued shaking and finally, centrifugation to separate the two phases. The DNA is precipitated by adding one-quarter volume of 5M ammonium acetate and 0.6 volumes of isopropanol followed by centrifugation. The pelleted DNA is rinsed once with 70% (v/v) ethanol, dried under vacuum and then resuspended with 1 ml of TE (0.01M Tris-HCl-0.001M EDTA, pH 7.4). The nucleic acid concentration for sample "A" is 1.2 mg/ml while that for sample "B" is 1.3 mg/ml, as determined by their ultraviolet absorption at 260 nm.
Fragmentation of the genomic DNA to yield pieces of a size suitable for insertion into cosmid vectors (approximately 25-35 kilobases (kb)) is accomplished by partial digestion with the restriction endonuclease Sau3A. Duplicate 0.2 ml reactions are set up (one with preparation "A" and the other with DNA from preparation "B"), each containing 100 μg of the P. vulgaris genomic DNA, 0.1M NaCl, 0.01M MgCl2, 0.01M Tris-HCl (pH 7.5) and 80 units of the enzyme Sau3A.
Incubation is carried out at 37°C and 25 μl aliquots are removed at appropriate time points (5, 6, 7, 8, 9, 10, 11 and 20 minutes) and added to 25 μl of 0.2M EDTA (pH 8). The individual samples are heated to 70°C and then 10 μl are removed for a size-distribution analysis on an agarose gel. The sample obtained after five minutes of Sau3A digestion of preparation "A" and that obtained after 6 minutes with preparation "B" are chosen for further use. In each case, an aliquot (4 μl, which is approximately equal to 2 μg) of the chosen partial digest is ligated to the appropriate "left" and "right" arms of the cosmid vector DNA using approximately 1 μg and 2 μg of each, respectively, in 10 μl reactions containing 0.066M Tris-HCl (pH 7.4), 0.01M MgCl2, 0.001M ATP, and 400 units (as defined by the manufacturer (New England Biolabs, Beverly, MA)) of T4 DNA ligase. Incubation is carried out at 11°C overnight. The "left" and "right" arms of the cosmids are DNA fragments which, when ligated to an appropriately sized piece of P. vulgaris DNA, comprise a recombinant molecule of approximately 35-50 kb. Both arms contain "cos" sites which are recognized by the packaging enzymes in the next step. In addition, these arms carry the origin of replication and ampicillin-resistance functions of pIBI24 (International Biochemical Inc., New Haven, CT).
Each of the above ligase reactions is added to one tube of a λ packaging extract (Packagene™, a trademark of Promega Corp., Madison, WI) and the reaction is allowed to proceed at room temperature for two hours, at which point 0.5 ml of PDB (0.1M NaCl-0.01M Tris-HCl (pH 7.9)-0.01M MgSO4) is added followed by approximately 0.05 ml of chloroform. Each tube of packaged DNA is, therefore, a gene bank of the P. vulgaris genome.
Because this method of construction creates a pool of infectious particles (i.e., λ phage heads filled with the cosmid vector joined to approximately 25 to 35 kb of P. vulgaris DNA), the number of potential clones is quantitated by adsorbing an aliquot of the packaged material to an appropriate, sensitive E. coli host strain, and then after outgrowth, plating the mixture on selective media. For example, an overnight culture of the E. coli strain ER1562 (New England Biolabs, Beverly, MA) grown in 20-10-5 medium is diluted 1:20 into fresh media (20-10-5 supplemented with 1% maltose) and grown for three hours at 37°C. The cells (1 ml) are then centrifuged, resuspended with PDB (0.2 ml) and 0.02 ml of the appropriate gene bank added. After adsorption for twenty minutes at 37°C, the samples are diluted to 2 ml with 20-10-5 medium and grown at 37°C for 30 minutes. The culture is then spread on 20-10-5 plates containing 100 μg/ml of ampicillin and colonies scored after overnight incubation at 37°C. The results indicate that there are approximately 68,000 and 95,000 infectious particles (potential cosmid clones) present in the two samples, designated PV1-GB and PV2-GB, corresponding to the "A" and "B" preparation of P. vulgaris genomic DNA, respectively.
In addition, four other P. vulgaris gene banks are prepared, as above, using two different cosmid vectors. These two cosmids differ from the above-mentioned vectors in that a kanamycin resistance determinant is used in one case rather than the ampicillin resistance, while in the other, the replication functions of pBR322 (New England Biolabs, Beverly, MA) are used instead of those of pIBI24. These four "libraries," designated L1974, L1975, L1976, and L1977, contain, respectively, approximately 18,000 (ampr), 34,000 (ampr), 13,000 (kanr) and 15,000 (kanr) members. Aliquots of each of these six gene banks are screened for the presence of the P. vulgaris chondroitinase I gene (see below).
Example 2
PCR Experimentation Designed To Yield An Authentic Piece Of The Chondroitinase I Gene For Use As A Hybridization Probe
The Polymerase Chain Reaction (PCR) (5) allows the geometric amplification of a DNA sequence that lies between oligonucleotide primers that can be extended by a DNA polymerase in vitro. The enzyme used in these experiments is the Taq DNA polymerase (isolated originally from Thermus aquaticus), which is preferred because of its thermotolerance, which allows it to survive the repeated DNA denaturation steps that are carried out at 94°C. In order for this method to be employed successfully, the oligonucleotides used must have sequences that are as close as possible to those of the target sequence, namely the P. vulgaris chondroitinase I gene. An approximation of that sequence can be derived from the limited available amino acid sequence data. To minimize uncertainty in the sequence presented by the degeneracy of the genetic code (a given amino acid can be encoded by up to six codons), the first approximation involves choosing an amino acid sequence that has the least degeneracy. For example, in the amino-terminal sequence of the P. vulgaris chondroitinase I gene, there are the following consecutive amino acids: His-Phe-Ala-Gln-Asn-Asn-Pro (SEQ ID NO:2, amino acids 43-49). This amino acid sequence could be encoded by any one of 512 different nucleotide sequences, represented as 5'-CAY-TTY-GCN-CAR-AAY-AAY-CCN-3' (SEQ ID NO:6), where R stands for purine (A or G), Y for pyrimidine (C or T), and N indicates that any one of the four nucleotides (A, T, G, or C) at this position will constitute a nucleotide sequence that could encode the indicated amino acid sequence. One possible approach would be to synthesize an oligonucleotide mixture containing a total of 512 different oligonucleotides, represented as:
5'-CA(TC)-TT(TC)-GC(GATC)-CA(GA)-AA(TC)-AA(TC)-CC(GATC)-3' (SEQ ID NO:6).
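The figure of 512 follows from multiplying the number of alternatives at each degenerate position. A minimal sketch of that count is given below; the function names are hypothetical, and only the three ambiguity codes used in the text (R, Y and N) are handled.

```python
from itertools import product

# Ambiguity codes used in the text: R = purine, Y = pyrimidine, N = any base.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "N": "ACGT",
}


def degeneracy(seq):
    """Number of distinct sequences represented by a degenerate sequence."""
    count = 1
    for base in seq:
        count *= len(IUPAC[base])
    return count


def expand(seq):
    """List every concrete sequence represented by a degenerate sequence."""
    return ["".join(p) for p in product(*(IUPAC[base] for base in seq))]


# 5'-CAY-TTY-GCN-CAR-AAY-AAY-CCN-3' with the hyphens removed.
PROBE = "CAYTTYGCNCARAAYAAYCCN"
print(degeneracy(PROBE))        # 512
print(len(set(expand(PROBE))))  # 512
```

Dropping the final N, as is done in the pools that follow, reduces the count fourfold without shortening the region that encodes His-Phe-Ala-Gln-Asn-Asn.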
Although use of such mixtures in PCR has been successful, another approach is to use a number of oligonucleotide mixtures, each of which is made up of a relatively smaller set of nucleotide sequences. In order to simplify this further, advantage is taken of the observation (7) that mismatched nucleotides in PCR primers are of less consequence at the 5'-end of the primer than they are at the 3'-end. Using these criteria, a set of eight oligonucleotides (each made up of four unique sequences) is designed, where the individual sets of oligonucleotides have the following sequences:
1. 5'-CAC-TTC-GC(GATC)-CAA-AAT-AAT-CC-3' (SEQ ID NO:7)
2. 5'-CAC-TTC-GC(GATC)-CAA-AAC-AAC-CC-3' (SEQ ID NO:8)
3. 5'-CAC-TTC-GC(GATC)-CAA-AAC-AAT-CC-3' (SEQ ID NO:9)
4. 5'-CAC-TTC-GC(GATC)-CAA-AAT-AAC-CC-3' (SEQ ID NO:10)
5. 5'-CAC-TTC-GC(GATC)-CAG-AAT-AAT-CC-3' (SEQ ID NO:11)
6. 5'-CAC-TTC-GC(GATC)-CAG-AAC-AAC-CC-3' (SEQ ID NO:12)
7. 5'-CAC-TTC-GC(GATC)-CAG-AAC-AAT-CC-3' (SEQ ID NO:13)
8. 5'-CAC-TTC-GC(GATC)-CAG-AAT-AAC-CC-3' (SEQ ID NO:14)
One of these pools is perfectly matched for the first eleven nucleotides (counting from the 3'-end), and, furthermore, within this pool of four oligonucleotides, one is a perfect match for the first fourteen nucleotides. This is important because it permits stringent annealing conditions to be used that discriminate against imperfect matches that give rise to PCR products that are unrelated to the chondroitinase I gene.
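The structure of pools 1-8 can be made explicit by expanding each pool into its member sequences. The sketch below is an illustration only: the parser and its names are hypothetical, and it assumes the parenthesis notation used above, in which bases inside parentheses are mixed at a single position.

```python
import re
from itertools import product

# Pools 1-8 as listed in the text; parentheses enclose the bases mixed at one position.
POOLS = {
    1: "CAC-TTC-GC(GATC)-CAA-AAT-AAT-CC",
    2: "CAC-TTC-GC(GATC)-CAA-AAC-AAC-CC",
    3: "CAC-TTC-GC(GATC)-CAA-AAC-AAT-CC",
    4: "CAC-TTC-GC(GATC)-CAA-AAT-AAC-CC",
    5: "CAC-TTC-GC(GATC)-CAG-AAT-AAT-CC",
    6: "CAC-TTC-GC(GATC)-CAG-AAC-AAC-CC",
    7: "CAC-TTC-GC(GATC)-CAG-AAC-AAT-CC",
    8: "CAC-TTC-GC(GATC)-CAG-AAT-AAC-CC",
}


def expand_pool(spec):
    """Expand a pool written in parenthesis notation into its member sequences."""
    tokens = re.findall(r"\(([ACGT]+)\)|([ACGT])", spec.replace("-", ""))
    choices = [group or base for group, base in tokens]
    return sorted("".join(p) for p in product(*choices))


for number, spec in POOLS.items():
    members = expand_pool(spec)
    last_11 = {m[-11:] for m in members}   # identical within a pool
    last_14 = {m[-14:] for m in members}   # differs at the mixed position
    print(number, len(members), len(last_11), len(last_14))
```

Each pool shares a single 3'-terminal 11-mer, and the eight pools carry eight different ones, so at most one pool can match the gene over those eleven positions. The four members of a pool differ only at the mixed position, which is the twelfth nucleotide from the 3'-end, so exactly one member can extend the match to fourteen nucleotides.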
A further aid in the design of oligonucleotides to be used in these PCR experiments is derived from the observation that the P. vulgaris 110 kD chondroitinase enzyme appears to have a structure that leaves one particular region hypersensitive to proteolytic cleavage. The result of this hydrolysis is that the normally approximately 110 kD protein is split into two predominant species of 18 kD and approximately 90 kD. The amino-terminal sequences of the "110 kD" protein and the "18 kD" fragment are the same, while that for the "90 kD" has been found to be different.
The "18 kD" peptide is further fragmented by treatment with cyanogen bromide and trypsin and the resulting oligopeptides sequenced, affording still more information with which to design oligonucleotides for PCR. This information from the "18 kD" and "90 kD" regions is also valuable because the locations of these amino acid sequences relative to each other and the N-terminal sequences of the intact protein are well defined. In fact, the nucleotide distance between the regions encoding the N-termini of the "110 kD" and "90 kD" entities can be predicted to be approximately 400-500 bp. Two further sets of oligonucleotide pools are then designed with one further consideration: The first eight oligonucleotides hybridize to one strand of the DNA and, during the in vitro DNA synthesis, they are extended toward the "90 kD" N-terminal coding sequences. Consequently, the oligonucleotides corresponding to amino acid sequences from within the "18 kD" peptide and at the N-terminus of the "90 kD" peptide must be designed so that they anneal to the complementary DNA strand of the P. vulgaris genome, so that they extend, in vitro, toward the region encoding the N-terminus of the intact protein.
In this way, the oligonucleotides effectively "bracket" the region of the P. vulgaris chromosome that encodes the N-terminal region of the chondroitinase I gene. It is worth noting that the PCR methodology offers an extremely large potential amplification of this bracketed region. Thirty PCR cycles, in theory, increase the number of copies of this DNA segment by a factor of one billion. This allows the use of very small quantities of P. vulgaris genomic DNA as a template which will yield, potentially, microgram amounts of synthesized product which can be readily visualized, isolated and cloned. Using the above logic, oligonucleotide mixtures are designed based on the following amino acid sequence that is found within the "18 kD" peptide: Glu-Ala-Gln-Ala-Gly-Phe-Lys (SEQ ID NO:2, amino acids 138-144). This heptapeptide is encoded by the following nucleotide sequences:
5'-GAR-GCN-CAR-GCN-GGN-TTY-AAR-3' (SEQ ID NO:15).
The complementary strand, therefore, has the following sequences:
5'-YTT-RAA-NCC-NGC-YTG-NGC-YTC-3', which is the same as 5'-(CT)TT-(AG)AA-(GATC)CC-(GATC)GC-(CT)TG-(GATC)GC-(CT)TC-3' (SEQ ID NO:16).
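The complementary-strand sequence can be derived mechanically by reverse-complementing the degenerate coding sequence, remembering that the complement of a purine code (R) is a pyrimidine code (Y) and vice versa. The following is a small sketch with hypothetical names, covering only the codes A, C, G, T, R, Y and N.

```python
# Complement table for the bases and ambiguity codes used in the text:
# A<->T, C<->G, R (A or G) <-> Y (T or C), and N stays N.
COMPLEMENT = str.maketrans("ACGTRYN", "TGCAYRN")


def reverse_complement(seq):
    """Return the reverse complement, written 5' to 3', of a degenerate sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def as_triplets(seq):
    """Insert hyphens every three bases, matching the notation of the text."""
    return "-".join(seq[i:i + 3] for i in range(0, len(seq), 3))


# 5'-GAR-GCN-CAR-GCN-GGN-TTY-AAR-3' (Glu-Ala-Gln-Ala-Gly-Phe-Lys), hyphens removed.
coding = "GARGCNCARGCNGGNTTYAAR"
print(as_triplets(reverse_complement(coding)))  # YTT-RAA-NCC-NGC-YTG-NGC-YTC
```

Primers written this way anneal to the coding strand and are extended back toward the region encoding the N-terminus of the intact protein.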
Using the same criteria as described above for the first set of eight oligonucleotides, a further set of eight oligonucleotides (each made up of 16 unique sequences) is designed, where the individual sets of oligonucleotides have the following sequences:
9. 5'-TT-GAA-(AG)CC-(GATC)GC-(CT)TG-GGC-TTC-3' (SEQ ID NO:17)
10. 5'-TT-GAA-(AG)CC-(GATC)GC-(CT)TG-AGC-TTC-3' (SEQ ID NO:18)
11. 5'-TT-GAA-(AG)CC-(GATC)GC-(CT)TG-TGC-TTC-3' (SEQ ID NO:19)
12. 5'-TT-GAA-(AG)CC-(GATC)GC-(CT)TG-CGC-TTC-3' (SEQ ID NO:20)
13. 5'-TT-GAA-(AG)CC-(GATC)GC-(CT)TG-GGC-CTC-3' (SEQ ID NO:21)
14. 5'-TT-GAA-(AG)CC-(GATC)GC-(CT)TG-AGC-CTC-3' (SEQ ID NO:22)
15. 5'-TT-GAA-(AG)CC-(GATC)GC-(CT)TG-TGC-CTC-3' (SEQ ID NO:23)
16. 5'-TT-GAA-(AG)CC-(GATC)GC-(CT)TG-CGC-CTC-3' (SEQ ID NO:24)
Unlike oligonucleotides 1-8 above, one base is deleted from the 5' end of oligonucleotides 9-16 in order to reduce the number of sequence permutations.
In this case, one pool has a perfect match for the first eight nucleotides at the 3'-end, while 50% of this same pool has an eleven-nucleotide perfect match with the genomic DNA of P. vulgaris encoding chondroitinase I.
For a third set of oligonucleotide mixtures, the following amino acid sequence, obtained as part of the N-terminal amino acid sequence of the "90 kD" peptide, is used: Gly-Ala-Lys-Val-Asp-Ser (SEQ ID NO:2, amino acids 189-194). This hexapeptide can be encoded by the following nucleotide sequences:
5'-GGN-GCN-AAR-GTN-GAY-TCN-3' (SEQ ID NO:25) or 5'-GGN-GCN-AAR-GTN-GAY-AGY-3' (SEQ ID NO:26)
The complement of this sequence is:
5'-NGA-RTC-NAC-YTT-NGC-NCC-3' (SEQ ID NO:27) or
5'-RCT-RTC-NAC-YTT-NGC-NCC-3' (SEQ ID NO:28)
These possible sequences are represented using the following oligonucleotide mixtures:
17. 5'-GA-GTC-(GATC)AC-(TC)TT-(AG)GC-GCC-3' (SEQ ID NO:29)
18. 5'-GA-GTC-(GATC)AC-(TC)TT-(AG)GC-ACC-3' (SEQ ID NO:30)
19. 5'-GA-GTC-(GATC)AC-(TC)TT-(AG)GC-TCC-3' (SEQ ID NO:31)
20. 5'-GA-GTC-(GATC)AC-(TC)TT-(AG)GC-CCC-3' (SEQ ID NO:32)
21. 5'-GA-GTC-(GATC)AC-(TC)TT-(TC)GC-GCC-3' (SEQ ID NO:33)
22. 5'-GA-GTC-(GATC)AC-(TC)TT-(TC)GC-ACC-3' (SEQ ID NO:34)
23. 5'-GA-GTC-(GATC)AC-(TC)TT-(TC)GC-TCC-3' (SEQ ID NO:35)
24. 5'-GA-GTC-(GATC)AC-(TC)TT-(TC)GC-CCC-3' (SEQ ID NO:36)
Unlike oligonucleotides 1-8 above, one base is deleted from the 5' end of oligonucleotides 17-24 in order to reduce the number of sequence permutations.
In this case, one oligonucleotide mixture has half of its members perfectly matched for the first eight nucleotides at the 3'-end, and one quarter of the oligonucleotides in the pool are perfectly matched for eleven nucleotides at the 3'-end.
These twenty-four oligonucleotide mixtures are purchased from Biosynthesis, Inc. (Denton, TX), and are provided as fully deprotected, purified and lyophilized samples. In each case (except oligonucleotide #20), 5 O.D. units of synthetic DNA are obtained. This is resuspended in 0.5 ml of water to yield a solution that contains approximately 50-60 pmoles of oligonucleotide per microliter. The remaining sample (oligonucleotide #20) contains 15 O.D. and is resuspended with one ml of water to give a solution with approximately 90 pmole/μl.
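The quoted concentrations can be approximated from the O.D. values with two common rules of thumb. Both are assumptions introduced here for illustration and are not stated in the text: one A260 unit of single-stranded DNA is taken as roughly 33 μg, and the average mass of a nucleotide as roughly 330 daltons. The function name and the oligonucleotide lengths (20 nucleotides for pools 1-16, 17 for pools 17-24, counted from the listed sequences) are likewise part of the sketch.

```python
# Rule-of-thumb conversion factors (assumptions, not taken from the text).
UG_PER_OD_UNIT = 33.0          # approx. micrograms of single-stranded DNA per A260 unit
DALTONS_PER_NUCLEOTIDE = 330.0 # approx. average mass of one nucleotide


def pmol_per_microliter(od_units, volume_ml, length_nt):
    """Estimate oligonucleotide concentration in pmol/microliter."""
    ug_per_ml = od_units * UG_PER_OD_UNIT / volume_ml
    # micrograms divided by (grams per mole) gives micromoles
    umol_per_ml = ug_per_ml / (length_nt * DALTONS_PER_NUCLEOTIDE)
    # umol/ml equals nmol/microliter; multiply by 1000 for pmol/microliter
    return umol_per_ml * 1000.0


print(round(pmol_per_microliter(5, 0.5, 20)))   # 20-mers, pools 1-16
print(round(pmol_per_microliter(5, 0.5, 17)))   # 17-mers, pools 17-24
print(round(pmol_per_microliter(15, 1.0, 17)))  # pool 20
```

Under these assumptions the estimates fall close to the "approximately 50-60" and "approximately 90" pmol/μl figures given above.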
A typical 50 μl PCR reaction contains approximately 20 ng of P. vulgaris genomic DNA as template; 200 μM each of dATP, dGTP, dCTP, dTTP; 50 mM KCl; 10 mM Tris-HCl (pH 8.4); 1.5 mM MgCl2; 0.01% gelatin; 2.5 units of Ampli-Taq™ DNA polymerase (Perkin-Elmer/Cetus, Norwalk, CT); and 50 pmoles of each oligonucleotide pool to be tested. The reactions are overlaid with mineral oil (Plough) and incubated in a Perkin-Elmer/Cetus Thermalcycler™.
For each cycle, the instrument is programmed to denature the template DNA at 94°C for 1.25 minutes, anneal the oligonucleotide primers to the denatured template at 60°C or 62°C for one minute, and extend these primers via DNA synthesis at 72°C for 2.25 minutes. Thirty such cycles are carried out in an experimental amplification. The products are analyzed by running an aliquot on a 4% NuSieve™ GTG agarose gel (FMC Biochemicals, Rockland, ME) containing approximately 0.5 μg/ml ethidium bromide, using either Tris-borate or Tris-acetate buffers at either full or half strength. These gels are usually run overnight at approximately 1 V/cm and photographed on a long wavelength UV transilluminator using a red filter and Polaroid Type 57 film.
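The cycling program above can be written down as a small data structure, which makes the total programmed hold time easy to check. This is an illustrative sketch only: the `Step` class and `hold_minutes` function are hypothetical names, the 60°C annealing option is chosen arbitrarily from the two temperatures given, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    minutes: float

# One cycle as described in the text (the 62 C annealing variant is analogous).
CYCLE = (
    Step("denature", 94.0, 1.25),
    Step("anneal", 60.0, 1.0),
    Step("extend", 72.0, 2.25),
)
N_CYCLES = 30

def hold_minutes(cycle: tuple[Step, ...], n_cycles: int) -> float:
    """Total programmed hold time, excluding temperature ramps."""
    return n_cycles * sum(step.minutes for step in cycle)

total = hold_minutes(CYCLE, N_CYCLES)
print(f"{total} minutes of programmed holds ({total / 60:.2f} hours)")
```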
PCR experiments are run testing the pairwise combinations between oligonucleotide pools #1-8 (derived from the "110 kD" amino-terminal sequence of chondroitinase I), pools #9-16 (derived from a peptide sequence contained within the "18 kD" fragment), and pools #17-24 (derived from the amino-terminal sequence of the "90 kD" fragment). The most effective amplifications observed (based on the visual yield of a discrete DNA band detected on gel electrophoretic analysis of the reaction products) are between oligonucleotide pools #4 and #18, and between pool #4 and pools #9, 10, 11, or 12. In general, the other pools, which differ by one nucleotide from these pools, also yield some amplification. A difference of two nucleotides results in essentially no observed product. It is important to note, however, that the annealing temperatures are deliberately kept at 60-62°C to enhance such discrimination. PCR amplifications using oligonucleotide pools #4 and #18 yield a product of approximately 500 bp, as estimated relative to size standards (pBR322 digested with MspI (New England Biolabs, Beverly, MA), ranging from 30 to 700 bp) on NuSieve™ agarose gels. The product from the use of oligonucleotide pool #4 combined with pools #9, 10, 11, or 12 is approximately 350 bp in length. Furthermore, the larger product can be isolated from an agarose gel, diluted a thousand-fold, and then used as the template in a second PCR reaction employing oligonucleotide pools #4 and #9 as primers, which yields a product of approximately 350 bp. That is, the smaller PCR product is synthesized from the larger one, in agreement with what would be expected if these sequences were all derived from the P. vulgaris chondroitinase I gene. This indicates that the desired region of the genome is amplified.
The larger PCR product is isolated from an agarose gel using a Qiaex™ extraction procedure according to the manufacturer's instructions (Qiagen, Chatsworth, CA). The isolated DNA is then subjected to a "fill-in" reaction (11) to remove the extra, protruding adenine residue that Taq DNA polymerase tends to add to the 3'-end of DNA in a template-independent reaction (12). The isolated DNA is then treated with T4 polynucleotide kinase to add a phosphate moiety to the 5'-ends of the PCR products to allow them to be joined to the vector DNA. After these treatments, the PCR product is ligated to pIBI24, a high copy vector containing a polylinker (IBI, New Haven, CT), that is first sequentially digested with PstI, "filled-in" and then treated with calf intestinal alkaline phosphatase (Boehringer-Mannheim). Once the PCR product is cloned into pIBI24, it is removed as an EcoRI-HindIII fragment by virtue of the restriction sites within the polylinker carried by the plasmid. This fragment is then cloned into both M13mp18 and M13mp19 (13; New England Biolabs, Beverly, MA) after the vectors are cleaved with both EcoRI and HindIII and then phosphatased. Single stranded DNA corresponding to these constructions is then isolated and subjected to DNA sequence analysis using an Applied Biosystems (Foster City, CA) instrument and Taq sequencing kit. The results indicate that the larger PCR product is 455 bp in length. As expected, the ends of the fragment are derived from the oligonucleotide pools used as primers.
The DNA sequence is translated into an uninterrupted amino acid sequence that is in agreement (with one exception described below) with the available data obtained by amino acid sequence analysis on the native chondroitinase I protein itself, including, for example, a twelve residue oligopeptide (SEQ ID NO:2, amino acids 133-144). An eight residue oligopeptide derived from the DNA sequence (SEQ ID NO:2, amino acids 71-78) also matches a previously sequenced oligopeptide derived by a combination of trypsin digestion and cyanogen bromide treatment of the native protein. The only discrepancy between the two sequences is at amino acid residue #162 of the mature protein (SEQ ID NO:2, amino acid 186), where the DNA sequence codes for an arginine, while the native protein sequence indicates a leucine. Since a single nucleotide alteration would change a leucine codon (CTT) to an arginine codon (CGT), an initial interpretation suggests that this may be caused by a lack of perfect incorporation fidelity by the Taq DNA polymerase during the in vitro amplification process. However (see below), later results indicate that the DNA sequence is correct, rather than the amino acid sequence obtained by analyzing the native enzyme. These results also indicate that the "18 kD" and the "90 kD" fragments are, in fact, contiguous pieces of the chondroitinase I protein that has been cleaved (presumably by a contaminating protease), predominantly between residues #157 (Gln) and #158 (Asp) of the mature protein (SEQ ID NO:2, between amino acids 181 and 182). All of the above information supports the interpretation that the cloned DNA (at least that portion that is bracketed by the oligonucleotide primers) generated by PCR amplification represents part of the authentic chondroitinase I gene of P. vulgaris and, therefore, can be used as a probe to identify cosmid clones that carry the intact gene.
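Two pieces of bookkeeping in the paragraph above lend themselves to a quick check: the single-base difference between the two codons named, and the constant offset between mature-protein residue numbers and SEQ ID NO:2 numbering. The sketch below is illustrative; the function names are hypothetical, and the offset of 24 is inferred here from the paired numbers quoted in the text (162 and 186).

```python
# Only the two codons named in the text are listed; this is not a full table.
CODON_TO_AA = {"CTT": "Leu", "CGT": "Arg"}

# Offset between mature-protein numbering and SEQ ID NO:2 numbering,
# inferred from the pair (162, 186) quoted in the text.
NUMBERING_OFFSET = 186 - 162

def differing_positions(codon_a: str, codon_b: str) -> list[int]:
    """Zero-based positions at which two codons differ."""
    return [i for i, (a, b) in enumerate(zip(codon_a, codon_b)) if a != b]

def mature_to_seq_id(residue: int) -> int:
    """Convert a mature-protein residue number to SEQ ID NO:2 numbering."""
    return residue + NUMBERING_OFFSET

print(CODON_TO_AA["CTT"], "->", CODON_TO_AA["CGT"],
      "differs at codon position(s)", differing_positions("CTT", "CGT"))
print("cleavage between SEQ ID NO:2 residues",
      mature_to_seq_id(157), "and", mature_to_seq_id(158))
```

The same offset reproduces the cleavage position quoted for residues #157 and #158.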
Although it is possible to isolate the entire gene coding for a protein of interest using PCR amplification (thereby avoiding construction of a gene bank and many of the other steps described below) by employing oligonucleotide primers derived from the amino-terminus of the protein coupled with primers derived from the carboxyl-terminal amino acid sequence, there are several potential problems in this approach. In the case of the P. vulgaris chondroitinase I, the problems include: (1) the assumption that the protein being sequenced has not been processed at either end (not likely to be true, for example, with a secreted protein), (2) the occasional lack of fidelity exhibited by Taq DNA polymerase during PCR reactions, and (3) the rather large size of the bracketed region of the DNA that is to be amplified, which was expected to be approximately 3000 bp (deduced from the apparent molecular weight of approximately 110 kD). Consequently, the approach of constructing a gene bank is selected.
Example 3
Generation Of A Labeled Probe, Colony Hybridization And Identification Of Positive Cosmid Clones From The P. vulgaris Gene Bank
The cloned PCR product corresponding to the 455 bp near the amino-terminal coding portion of the P. vulgaris chondroitinase I gene is released from the plasmid DNA into which it had been cloned by digestion with the restriction endonuclease SalI. This is a consequence of the presence of one SalI site within the polylinker sequence and a second SalI site within the cloned PCR amplification product (this is fortuitous in that the latter SalI site is derived from the nucleotide sequence of oligonucleotide pool #18 near its 5'-end; in fact, there is no recognition site for SalI within the P. vulgaris chondroitinase I gene itself). A total of approximately 260 μg of plasmid DNA is digested with SalI and the products separated by electrophoresis on a NuSieve™ GTG agarose gel. The desired approximately 450 bp fragment is isolated using a Qiaex™ extraction protocol. The fragment is then denatured by heating at 95-100°C for 5-15 minutes, followed by rapid cooling. The denatured fragment is then labelled with digoxigenin-labelled dUTP (Boehringer-Mannheim, Indianapolis, IN) in two 200 μl reactions. Aliquots of the six P. vulgaris cosmid gene banks described in Example 1 above are used to infect the E. coli strain ER1562 described above and a total of approximately 10,000 colonies are obtained on the appropriate selective plates. These colonies (on a total of 50 plates) are replica plated onto two nylon membranes on selective agar as well as to a third selective plate. After overnight incubation, the colonies on the filters are lysed by sequentially treating with 10% sodium dodecyl sulfate (SDS) and 0.5 M NaOH for 5-30 minutes each. The cells from the lysed colonies are neutralized by being placed on sheets saturated with 1 M Tris-HCl (pH 7.4) (twice) and then on paper saturated with 2X standard saline citrate (SSC) prior to vacuum drying at 80°C. The DNA from the lysed colonies is then fixed to the membranes.
The filters are then washed by incubation at 42°C with agitation for 1-3 hours, using at least 10 ml/filter of 0.05 M Tris-HCl, 0.5-1 M NaCl, 0.001 M EDTA (pH 8), 0.1% SDS and 0.05 mg/ml proteinase K. The filters are then rinsed with 2X SSC and pre-hybridized by incubation with a hybridization buffer at 65°C for 1-3 hours. The filters are then hybridized overnight at 65-68°C using the digoxigenin-labeled probe described above (0.5-50 ng/ml in a hybridization solution). The hybridized filters are washed with SSC and SDS, re-blocked with a blocking reagent (Component #11 of DNA Labelling and Detection Kit, Nonradioactive, Boehringer Mannheim, Indianapolis, IN) and exposed to polyclonal sheep anti-digoxigenin Fab fragments conjugated to alkaline phosphatase.
The positive clones are visualized by incubation of the antibody-labeled filters in the presence of BCIP (bromo-chloro-indolyl-phosphate) and NBT (nitro-blue tetrazolium). The presence of the desired DNA fragment within a colony will result in a dark brownish-purple spot on the filter after this hybridization procedure. After approximately four hours, the developed filters are used as templates to guide the selection of a total of 117 clones, which are then picked to selective media. A small-volume (10 ml) culture ("Miniprep") of each of these clones is grown in selective media and plasmid DNA is then isolated using materials and protocols supplied by Qiagen.
Example 4
Restriction Mapping And Southern Hybridization Used To Localize The Position Of The Chondroitinase I Gene Within Individual Clones
A number of approaches are used to guide the selection of particular cosmid clones for further study. One is to carry out Southern hybridization (8) using the same PCR-generated fragment as a probe against P. vulgaris genomic DNA that had been digested by a number of restriction enzymes and then fractionated on an agarose gel prior to transfer to a nylon membrane. The probe is labeled with digoxigenin-dUTP by including this nucleotide analogue in a PCR amplification. In this reaction, the gel-purified product of a previous PCR amplification (that using P. vulgaris genomic DNA as template) is diluted 10,000-fold and serves as the template in a second PCR amplification. This latter reaction is made up as a 0.5 ml mixture, which is then divided into ten individual tubes and amplified as described above for 25 cycles using oligonucleotide pools #2 and #10 (see above) as the primers. The normal complement of deoxyribonucleoside triphosphates is replaced with a digoxigenin-dUTP labeling mixture from the manufacturer (Boehringer-Mannheim, Indianapolis, IN), which yields a final concentration of 100 μM each of dATP, dCTP and dGTP, 65 μM dTTP and 35 μM digoxigenin-dUTP. The reactions are pooled and precipitated according to the manufacturer's recommendations. An aliquot of the resuspended product is examined by gel electrophoresis and exhibits a single band between approximately 300 and approximately 400 bp in length, as expected for the "smaller" PCR product described above.
To avoid problems encountered with the highly viscous P. vulgaris genomic DNA preparation, the DNA (approximately 5 μl) is diluted into large (0.35 ml) volumes for digestion with the various restriction enzymes. The DNA is then concentrated by ethanol precipitation prior to fractionation on agarose gels and transfer to nylon membranes. The data obtained in these experiments indicate that the chondroitinase I gene (at least that portion that hybridizes to the N-terminal coding region represented by the probe described above) is carried on a BstYI fragment of approximately 2800 bp, an EcoRV fragment of 5400 bp, and on large (equal to or greater than approximately 10 kb) DNA fragments generated by NsiI, BglII, HindIII, and StyI.
Large scale cultures (500 ml) of a number of hybridizing cosmid clones are grown and plasmid DNA is isolated from these cultures for use in mapping the location of the chondroitinase I gene. The DNA of the gene is expected to represent only approximately 10 percent of the P. vulgaris DNA carried within each cosmid. A number of these clones are digested with BstYI and NsiI and the products are fractionated on an agarose gel. Individual fragments are then isolated, a portion tested for the presence of chondroitinase I sequences by Southern hybridization, and then subcloned into appropriate vectors.
Two of these fragments are of special interest. The first, a BstYI fragment of approximately 2800 bp, is observed in a number of cosmid clones, including those designated #2 and #45. The DNA isolated from these two cosmid clones is designated LP2 751 and LP2 760. With LP2 760, the approximately 2800 bp BstYI fragment is well separated from the other BstYI fragments and is therefore more readily subcloned into another vector designated pT660-3. The plasmid designated pT660-3 is a derivative of pBR322 in which the DNA from a point immediately downstream of the promoter for tetracycline resistance (approximately bp 80) as far as the PvuII site (approximately bp 2070) is deleted and replaced with a BamHI linker. Similarly, the approximately 10 kb NsiI fragment (which hybridizes with the chondroitinase probe described above) is readily isolated from a digest performed on LP2 751. These two fragments are referred to as the "2800 bp BstYI" fragment and the "10 kb NsiI" fragment.
The 2800 bp BstYI fragment is small enough to permit a second restriction enzyme digestion on this piece of DNA in order to obtain a fragment suitable for DNA sequence analysis. This is important because the hybridization experiments serve to identify the N-terminal coding region of the chondroitinase I gene, due to how the probe is derived. This procedure does not, however, indicate to which side the rest of the gene is located. Given the relative size of the probe (less than 500 bp) compared to the predicted size of the intact gene (greater than 3000 bp), this is not a trivial consideration. The nucleotide sequence, however, clearly indicates in which direction the gene would be "read" and, therefore, which restriction fragments should be cloned in order to obtain the entire gene. The subcloned 2800 bp BstYI fragment contains two internal EcoRV sites, which suggests that the resulting fragments might be small enough for DNA sequencing. However, the EcoRV sites are symmetrically placed within the 2800 bp BstYI fragment; each EcoRV site is approximately 1200 bp from one end, with the space between them equal to approximately 400 bp. The subcloned fragment is digested asymmetrically by taking advantage of unique restriction sites present within the vector. In this manner, the "halves" of the 2800 bp BstYI fragment are distinguished physically and, by Southern hybridization, the "end" that contains the chondroitinase I N-terminal coding region is ascertained. Once this is done, the appropriate piece, which is a HindIII-EcoRV fragment of approximately 1200 bp, is subcloned into both M13mp18 and M13mp19 vectors which are first digested with both HindIII and SmaI and subsequently treated with calf intestinal alkaline phosphatase. The DNA sequence derived from these subclones reveals a number of features that clearly establish the location of the chondroitinase I gene, as well as the direction in which it is read.
Starting with nucleotide #183 in this sequence (SEQ ID NO:1, nucleotide 191), a coding region is observed which matches the first thirty previously-identified amino acids of the P. vulgaris chondroitinase I enzyme. Preceding this sequence, it is possible to discern a number of other features by their analogy to corresponding sequence motifs from previously analyzed E. coli genes. These features include: (1) nucleotides 32-37 (SEQ ID NO:1, nucleotides 40-45), which match in three of six positions with the consensus "-35" region of a promoter and, after a 17 nucleotide space, a "-10" region of a promoter (matching in six of seven positions with the consensus "-10" region); (2) a putative "Shine-Dalgarno" sequence between nucleotides 98-103 (SEQ ID NO:1, nucleotides 106-111); and (3) an in-frame ATG initiation codon at nucleotides 111-113 (SEQ ID NO:1, nucleotides 119-121), which indicates that the P. vulgaris chondroitinase I enzyme is synthesized with a 24 amino acid signal sequence which is, presumably, removed as the protein is transported across the inner membrane.
The second fragment that is subcloned (into a pIBI24 derivative that is first modified to include an NsiI restriction site in place of the PstI site normally present in the polylinker of this vector) is the approximately 10 kb NsiI fragment. Digestion of this approximately 14 kb recombinant molecule (the approximately 10 kb NsiI fragment in pIBI24) with EcoRV yields four fragments of approximately 9 kb, 2.3 kb, 2.1 kb, and 0.4 kb. Southern hybridization analysis using the probe derived from the N-terminal amino acid sequence indicates that the related chondroitinase gene sequences are contained within the largest fragment (the approximately 9 kb EcoRV fragment).
Since there is no other fragment larger than 2.9 kb (the size of pIBI24, which has no internal EcoRV recognition sites), this approximately 9 kb EcoRV fragment must contain the vector as well as P. vulgaris DNA. A double digestion of this recombinant molecule with NsiI and EcoRV releases the pIBI24 vector as a 2.9 kb fragment; it also yields fragments of approximately 4.5 kb, 2.3 kb, 2.1 kb, 1.0 kb and 0.4 kb. Taken together (along with the information presented above on the 2.8 kb BstYI fragment, which has two internal EcoRV sites separated by approximately 0.4 kb), an initial restriction map is constructed.
A double digestion with EcoRV and HindIII releases fragments of approximately 4.1 kb, 2.3 kb, 2.1 kb, 2.0 kb, 1.3 kb, 1.1 kb and 0.4 kb. Three of these fragments (2.3 kb, 2.1 kb, and 0.4 kb) are apparently EcoRV fragments that have not been cut by HindIII. Again, the only fragment larger than the vector (4.1 kb) indicates that this fragment includes pIBI24 (2.9 kb). The approximately 2.0 kb fragment hybridizes with the chondroitinase probe, thereby serving to place one of the HindIII sites. Since there is a HindIII site in the polylinker, it too can be placed, leaving the last HindIII site to be placed by deduction.
Double digestion of the cloned approximately 10 kb NsiI fragment with EcoRV and EcoRI yields six fragments (of approximately 4.2 kb, 3.5 kb, 2.3 kb, 2.1 kb, 1 kb, and 0.4 kb), indicating the presence of two EcoRI sites: one in the polylinker and one in the cloned P. vulgaris DNA. Southern hybridization reveals that the approximately 4.2 kb band in this double digest contains the chondroitinase I N-terminal coding sequence. Adding this information to the above data yields a preliminary restriction map for the subcloned approximately 10 kb NsiI fragment in pIBI24 (Figure 1).
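One simple consistency check on mapping data of this kind is that the fragments from every digest of the same circular construct should sum to roughly the same total. The sketch below tabulates the four digests reported above; the dictionary layout and function name are illustrative choices, and the fragment sizes are the approximate values quoted in the text.

```python
# Fragment sizes in kb, as quoted in the text for the ~10 kb NsiI insert in pIBI24.
DIGESTS_KB = {
    "EcoRV": [9.0, 2.3, 2.1, 0.4],
    "NsiI + EcoRV": [2.9, 4.5, 2.3, 2.1, 1.0, 0.4],
    "EcoRV + HindIII": [4.1, 2.3, 2.1, 2.0, 1.3, 1.1, 0.4],
    "EcoRV + EcoRI": [4.2, 3.5, 2.3, 2.1, 1.0, 0.4],
}

def total_kb(fragments: list[float]) -> float:
    """Sum of fragment sizes, rounded to the precision of the gel estimates."""
    return round(sum(fragments), 1)

totals = {name: total_kb(frags) for name, frags in DIGESTS_KB.items()}

# Fragments present in every digest are internal EcoRV fragments that none of
# the second enzymes cut.
common = set.intersection(*(set(frags) for frags in DIGESTS_KB.values()))

for name, total in totals.items():
    print(f"{name:16s} {total:5.1f} kb")
print("fragments common to all digests:", sorted(common))
```

All four totals fall between 13 and 14 kb, consistent with an approximately 10 kb insert in a 2.9 kb vector, and the 2.3, 2.1 and 0.4 kb fragments recur in every digest.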
It should be noted that, in further support of the placement and orientation of the chondroitinase I gene, in vitro chondroitinase I assays, in which the activity of the enzyme is determined by measuring at 232 nm the release of unsaturated disaccharide from chondroitin sulfate C, are carried out on a small number of samples. In one case, an aliquot of an overnight culture used to prepare LP2 751 (ER1562 carrying cosmid DNA selected from the colony hybridizations) is found to express 0.12 units/ml of chondroitinase. In addition, one of the EcoRV-deletion constructions (to be described below) is grown overnight in the presence of ampicillin. This culture is then inoculated into fresh selective media either with or without isopropyl-beta-D-thiogalactopyranoside (IPTG), which is expected to increase the level of transcription from the lac promoter present in pIBI24. The assay results of 0.29 units/ml of chondroitinase without and 0.36 units/ml with IPTG induction indicate that, even after the EcoRV deletion, the gene is still intact and possibly oriented in the same direction as that of the lac promoter.
Although the sizes of the fragments in the above discussion are approximate (especially the approximately 1 kb region between the EcoRI/NsiI sites in the polylinker and the nearest EcoRV site; in addition, there also might be another small EcoRV fragment that is still unmapped), overall they suggest that the approximately 4.2 kb EcoRV-EcoRI fragment contains the entire chondroitinase I gene. In order to facilitate the restriction mapping, an EcoRV deletion is constructed using the approximately 10 kb NsiI fragment cloned into pIBI24 (LP 776). This DNA is digested with EcoRV, treated with calf intestinal alkaline phosphatase, and fractionated on an agarose gel. The largest (approximately 10 kb) fragment is extracted from the gel and ligated together in the presence of a phosphorylated EcoRI linker. The resulting construction (LP2 786) is next digested with EcoRI to yield three fragments. Although it is not completely separated from the pIBI24-containing, somewhat smaller fragment, an approximately 95% homogeneous, approximately 4.2 kb EcoRI fragment is obtained after extraction from the gel. This EcoRI fragment is then used for DNA sequence analysis.
Example 5
DNA Sequence Analysis Of The Approximately 4.2 kb EcoRI Fragment
The approximately 10 kb NsiI fragment, cloned into pIBI24, is digested with EcoRV (as described above) and ligated together in the presence of EcoRI linkers. The net result of this construction is the deletion of approximately 5 kb of P. vulgaris DNA from this subcloned piece of DNA and the simultaneous introduction of another EcoRI site into the molecule. One hundred micrograms of this "EcoRV deletion" construction (LP2 786) is digested with EcoRI and fractionated on an agarose gel. The desired approximately 4.2 kb fragment is eluted from the gel, precipitated and resuspended in 150 μl of TE (described above). One-third of this material is then ligated to itself (polymerized) and, after destruction of the DNA ligase by heating, the DNA is sonicated to generate random, small pieces suited to DNA sequence analysis.
The ends are rendered flush in a "fill-in" reaction mediated by the "Klenow fragment" (10; New England Biolabs, Beverly, MA) of the DNA polymerase I of E. coli, and then ligated into SmaI-cut and phosphatased M13mp19. This recombinant DNA is used to transform the male E. coli strain MV1190, and 500 of the phage plaques obtained are picked into SM buffer (100 mM NaCl, 8 mM MgSO4, 50 mM Tris-HCl (pH 7.4) and 0.01% gelatin) to serve as stocks for the infection of small (less than or equal to 10 ml) cultures that are then used for the isolation of single stranded template DNA.
DNA sequencing is carried out at elevated temperatures using Taq DNA polymerase and fluorescently-labeled oligonucleotide primers. The data are collected using a Model 370A DNA sequencing system (Applied Biosystems, Foster City, CA). Sequence editing, overlap determinations and derivation of a consensus sequence are performed using a collection of computer programs obtained from the Genetics Computer Group at the University of Wisconsin (14). The resulting DNA sequence of this EcoRI fragment is 3980 nucleotides in length (SEQ ID NO:1). It is to be noted that the EcoRI site near the N-terminal coding sequence is derived from the linker ligated into this site; it is not present in the P. vulgaris chromosome. This position actually is an EcoRV site in the cloned cosmid DNA.
Translation of the DNA sequence into the putative amino acid sequence reveals a continuous open reading frame encoding 1021 amino acids (SEQ ID NO:2), with a 24 residue signal sequence (SEQ ID NO:2, amino acids 1-24), followed by a 997 residue coding sequence for the mature (processed) chondroitinase I protein (SEQ ID NO:2, amino acids 25-1021). Computer analysis of this sequence, using the programs described above, predicts a molecular weight of 115,090.94 for the unprocessed protein, a molecular weight of 112,507.82 for the mature "110 kD" (transported) protein, 17,503.43 for the first 157 amino acids (the "18 kD" fragment) (SEQ ID NO:2, amino acids 25-181), 95,022.40 for the remaining 840 amino acids (the "90 kD" fragment) (SEQ ID NO:2, amino acids 182-1021), and a molecular weight of 2601.14 for the 24-residue signal sequence. One notable feature of the amino acid composition is the absence of cysteine, which could be important if the protein has to be re-folded at any point.
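The predicted molecular weights above are mutually consistent: cleaving one peptide bond adds one water molecule, so the masses of two pieces should exceed the mass of the intact chain by about 18 Da. The sketch below checks this for both cleavages described; the water mass of 18.015 is an assumed standard value, not a figure from the source, and `joined_mass` is a hypothetical helper name.

```python
WATER = 18.015  # assumption: average molecular mass of H2O

def joined_mass(*piece_masses: float) -> float:
    """Mass of the chain formed by joining pieces through peptide bonds."""
    return sum(piece_masses) - WATER * (len(piece_masses) - 1)

# Values quoted in the text.
UNPROCESSED, MATURE = 115_090.94, 112_507.82
SIGNAL, FRAG_18K, FRAG_90K = 2_601.14, 17_503.43, 95_022.40

mature_from_fragments = joined_mass(FRAG_18K, FRAG_90K)
unprocessed_from_parts = joined_mass(SIGNAL, MATURE)

print(f"18 kD + 90 kD  -> {mature_from_fragments:,.2f} (reported {MATURE:,.2f})")
print(f"signal + mature -> {unprocessed_from_parts:,.2f} (reported {UNPROCESSED:,.2f})")

# Residue counts quoted in the text partition the same way.
residues_total = 24 + 997
residues_mature = 157 + 840
print(residues_total, residues_mature)
```

Both reconstructed masses agree with the reported values to within rounding, and the residue counts partition as 24 + 997 = 1021 and 157 + 840 = 997.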
In the nucleotide sequence, it was noted above that there is a unique SphI restriction site located approximately 230 bp beyond the end of the gene (SEQ ID NO:1, nucleotides 3414-3419), which presents a unique target site that can be manipulated to allow the facile movement of the gene to achieve the overall goal of expressing chondroitinase at high levels in E. coli. Although there are two recognition sites for ClaI (ATCGAT), one of them (SEQ ID NO:1, nucleotides 2702-2707) overlaps the E. coli dam recognition sequence (GATC) (SEQ ID NO:1, nucleotides 2701-2704). The resulting adenine methylation by the dam-encoded enzyme blocks cleavage of this site by ClaI; therefore, there is, in effect, a "unique" ClaI site (SEQ ID NO:1, nucleotides 497-502) which is used, as described below, to reconstruct the chondroitinase I gene after the appropriate site-specific mutageneses are carried out.
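The distinction drawn above between a cleavable ClaI site and one masked by dam methylation can be illustrated with a short scan for overlapping recognition sequences. This is a sketch on a made-up toy sequence (it is not taken from SEQ ID NO:1), and the function names are hypothetical; it treats a ClaI site as blocked whenever a GATC overlaps it, that is, in the contexts GATCGAT or ATCGATC.

```python
def find_sites(seq: str, site: str) -> list[int]:
    """Zero-based start positions of every (possibly overlapping) occurrence."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions

def clai_sites(seq: str) -> list[tuple[int, bool]]:
    """Return (1-based position, blocked_by_dam) for each ATCGAT in seq."""
    dam_starts = set(find_sites(seq, "GATC"))
    result = []
    for pos in find_sites(seq, "ATCGAT"):
        # GATC overlaps the ClaI site if it starts one base upstream (GATCGAT)
        # or at the fourth base of the site (ATCGATC).
        blocked = (pos - 1) in dam_starts or (pos + 3) in dam_starts
        result.append((pos + 1, blocked))
    return result

# Toy sequence, invented for illustration: one free site, one dam-overlapped site.
TOY = "TTATCGATAACCGATCGATTT"
sites = clai_sites(TOY)
for position, blocked in sites:
    print(position, "blocked by dam methylation" if blocked else "cleavable")
```

In the toy sequence the second site is preceded by a G, mirroring the arrangement described for nucleotides 2701-2707, where GATC begins one base before ATCGAT.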
Example 6
Site-specific Mutagenesis Of The Cloned P. vulgaris Chondroitinase I Gene
The site-specific mutagenesis method employed is based on that of Kunkel (15), using materials purchased from Bio-Rad, Melville, N.Y. (Muta-Gene™ In Vitro Mutagenesis Kit). In this procedure, the target DNA to be mutagenized is first cloned into an appropriate M13-derived vector. In this case, the recombinant molecule used (M13mp19 into which is cloned the approximately 1200 bp EcoRV-HindIII fragment as described above) encompasses the N-terminal coding region of the chondroitinase I gene. This recombinant phage is replicated in the E. coli host strain CJ236 (Bio-Rad), a male strain that carries the dut and ung alleles. The combination of these two mutations, dut (dUTPase) and ung (uracil-N-glycosylase), results in the incorporation of some uracil, rather than thymine, residues into the DNA synthesized in this organism. Single stranded template is then isolated after propagation on CJ236 and an appropriate, mutagenic, synthetic oligonucleotide is annealed to this DNA.
This oligonucleotide serves as a primer for T7 DNA polymerase, which copies the entire recombinant molecule. T4 DNA ligase is then used to seal the nick between the first residue of the mutagenic oligonucleotide and the last residue added in vitro. The newly synthesized DNA (containing the desired base changes) therefore does not contain uracil, while the template DNA does. Transformation of a non-mutant (with respect to the dut and ung alleles) male E. coli strain yields phage progeny that are primarily derived from the mutagenized strand synthesized in vitro, as a result of the inactivation of the uracil-containing template strand.
In this specific case, four resuspended plaques (aliquots of which had been used for DNA sequencing which established the N-terminal coding region of the chondroitinase I gene and included another 110 bp "upstream" of the presumed translation initiation site (see above)) are used to infect the male host strain CJ236 (dut ung). Individual plaques are picked to 0.5 ml of phage dilution buffer (PDB). One picked plaque from each transformation is adsorbed to log phase CJ236 and the infected culture grown for 6.5 hours. The cells are pelleted by centrifugation, and the supernatant heated to 55°C for 30 minutes and then stored at 4°C. Single stranded DNA is isolated from 100 ml of each supernatant and resuspended in a total volume of 0.1 ml of TE.
The goal of the site-specific mutagenesis is to modify the "ends" of this gene to allow it to be moved, precisely, into an appropriate high-level E. coli expression system. The target vector chosen (pET9-A; see above) is one derived from genetic regulatory elements present in the bacteriophage T7. In this system, there is a unique NdeI site (CATATG) that includes the translation initiation codon, as well as a downstream BamHI site; together, these allow the direct, unidirectional insertion of a gene encoding the protein that is to be expressed.
These two sites are preceded by a T7-specific promoter sequence and trailed by a transcription terminator that functions with the T7 RNA polymerase. Accordingly, these two restriction sites (NdeI and BamHI) are introduced into the cloned gene for P. vulgaris chondroitinase I.
In order to introduce the NdeI site (containing the ATG initiation codon) both before the signal sequence as well as, in a second construction, before the coding sequence for the mature protein (thereby deleting the signal sequence), two synthetic oligonucleotides are designed and synthesized (purchased from Biosynthesis, Inc., Denton, TX). The first, designated oligonucleotide #25 (SEQ ID NO:37), retains the signal sequence, while the second, oligonucleotide #26 (SEQ ID NO:38), deletes the signal sequence and allows the direct expression of the mature chondroitinase I protein (which can have an additional methionine residue at the N-terminus (SEQ ID NO:5, amino acid number 1)). The native sequence, including the predicted initiation codon, is presented on line 1 below, while the mutagenic oligonucleotide #25 (which differs in the three nucleotides immediately upstream of the initiation codon) is presented on line 2:
1) 5'-GCCAGCGTTTCTAAGGAGAAAAATAATGCCGATATTTCGTTTTACTGC-3' (SEQ ID NO:1, nucleotides 94-141)
2) 5'-GCCAGCGTTTCTAAGGAGAAAACATATGCCGATATTTCGTTTTACTGC-3' (SEQ ID NO:37)
For the construction in which the signal sequence is deleted, the site-specific mutagenesis is carried out at the junction of the signal sequence and the start of the mature protein (line 3) using the mutagenic oligonucleotide #26 (line 4) (which differs by six nucleotides, including the location of the initiation codon):
3) 5'-GCGCCTTATAACGCGATGGCAGCCACCAGCAATCCTG-3' (SEQ ID NO:1, nucleotides 170-206)
4) 5'-GCGCCTTATAACGCGCATATGGCCACCAGCAATCCTG-3' (SEQ ID NO:38)
The GCC triplet in line 3 that immediately precedes ACC (underlined in the original; SEQ ID NO:1, nucleotides 191-193) corresponds to the codon for alanine, which is the N-terminal amino acid of the mature, processed form of the P. vulgaris chondroitinase I.
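The relationship between each native sequence and its mutagenic oligonucleotide can be checked by aligning the pairs base by base. The sketch below uses the four sequences exactly as printed in lines 1-4 (with the line-break hyphen in lines 1 and 2 treated as a continuation); the function names are hypothetical, and the coordinate conversion simply adds the zero-based offset to the first SEQ ID NO:1 nucleotide quoted for each native sequence.

```python
NDEI = "CATATG"

# Lines 1-4 as printed; adjacent string literals mirror the original line break.
NATIVE_25 = "GCCAGCGTTTCTAAGGAGAAAAATAATGCCGATATT" "TCGTTTTACTGC"  # nt 94-141
OLIGO_25 = "GCCAGCGTTTCTAAGGAGAAAACATATGCCGATATT" "TCGTTTTACTGC"
NATIVE_26 = "GCGCCTTATAACGCGATGGCAGCCACCAGCAATCCTG"                # nt 170-206
OLIGO_26 = "GCGCCTTATAACGCGCATATGGCCACCAGCAATCCTG"

def mismatches(native: str, oligo: str, first_nt: int) -> list[int]:
    """SEQ ID NO:1 coordinates at which two equal-length sequences differ."""
    assert len(native) == len(oligo)
    return [first_nt + i for i, (a, b) in enumerate(zip(native, oligo)) if a != b]

def ndei_atg_coordinate(oligo: str, first_nt: int) -> int:
    """SEQ ID NO:1 coordinate of the A of the ATG inside the NdeI site."""
    return first_nt + oligo.index(NDEI) + 3

changes_25 = mismatches(NATIVE_25, OLIGO_25, 94)
changes_26 = mismatches(NATIVE_26, OLIGO_26, 170)
atg_25 = ndei_atg_coordinate(OLIGO_25, 94)
atg_26 = ndei_atg_coordinate(OLIGO_26, 170)

print("oligo #25 changes:", changes_25, "ATG at", atg_25)
print("oligo #26 changes:", changes_26, "ATG at", atg_26)
```

Run on these sequences, oligonucleotide #25 differs at three positions immediately upstream of the ATG at nucleotide 119, and oligonucleotide #26 differs at six positions, placing a new ATG directly in front of the GCC codon at nucleotide 191; both results match the descriptions in the text.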
In order for these oligonucleotides to be used, their 5'-ends need to be phosphorylated. Therefore, oligonucleotide #25 (5 O.D. units) is resuspended in 0.5 ml of TE, while oligonucleotide #26 (also 5 O.D. units) is resuspended in 0.65 ml of TE to yield stocks that are approximately 20 μM, i.e., 20 pmole/μl. Three nanomoles (150 μl of stock solution) of each oligonucleotide are kinased in separate (0.35 ml) reactions containing 35 μl of 10X ligase salts (New England Biolabs, Beverly, MA; 0.5 M Tris-HCl (pH 7.8), 0.1 M MgCl2, 0.2 M dithiothreitol, 10 mM ATP, 0.5 mg/ml bovine serum albumin), 35 μl of 0.1 M dithiothreitol, 10 μl (100 units) of T4 polynucleotide kinase (New England Biolabs) and made up to volume with 120 μl of TE. The reactions are incubated at 37°C for 40 minutes and the enzyme inactivated at 70°C for 20 minutes.
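The volumes and amounts in the kinase reaction can be tallied directly. The short sketch below uses only figures quoted in the paragraph above (component volumes and the 20 pmole/μl stock concentration); the variable names are illustrative.

```python
STOCK_PMOL_PER_UL = 20.0  # stock concentration quoted in the text

# Component volumes in microliters, as listed for one kinase reaction.
COMPONENTS_UL = {
    "oligonucleotide stock": 150.0,
    "10X ligase salts": 35.0,
    "0.1 M dithiothreitol": 35.0,
    "T4 polynucleotide kinase": 10.0,
    "TE": 120.0,
}

total_ul = sum(COMPONENTS_UL.values())
oligo_nmol = COMPONENTS_UL["oligonucleotide stock"] * STOCK_PMOL_PER_UL / 1000.0
# 1 pmol/ul is 1 micromolar, so the oligonucleotide ends up at this concentration:
final_micromolar = oligo_nmol * 1000.0 / total_ul

print(f"total volume: {total_ul:.0f} ul")
print(f"oligonucleotide: {oligo_nmol:.1f} nmol, {final_micromolar:.1f} uM final")
```

The components sum to the stated 0.35 ml, 150 μl of stock delivers the stated three nanomoles, and the 35 μl of 10X salts in 350 μl gives a 1X final concentration.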
Template DNA (5 μl of the preparation described above) and phosphorylated mutagenic primer (approximately 2 pmole) are annealed in a 20 μl volume containing 20 mM Tris-HCl (pH 7.4), 2 mM MgCl2, and 50 mM NaCl. The sample is heated at 70°C for 45 minutes in a Perkin-Elmer/Cetus Thermalcycler™. The sample is then gradually cooled from 70°C to 25°C over a 45 minute period. The annealed mixture is placed on ice and the following components added: 2 μl of 10X synthesis buffer (Bio-Rad; 5 mM each of dATP, dGTP, dCTP, dTTP; 10 mM ATP; 100 mM Tris-HCl (pH 7.4); 50 mM MgCl2; 20 mM dithiothreitol), 2 μl of T4 DNA ligase (6 units) and 1 μl of T7 DNA polymerase (1 unit). These reactions are incubated for 5 minutes each at 0°C (on ice), 11°C, and 25°C, and finally for 30 minutes at 37°C. The reactions are stopped by the addition of 75 μl of 10 mM Tris-HCl-10 mM EDTA (pH 8.0) and placed at -20°C. After the mutagenized DNA is thawed, it is used to transform the male E. coli strain MV1190 (dut+ ung+). Individual plaques obtained are picked and single-stranded DNA is isolated and sequenced. For those cases in which the desired sequence changes are introduced, another aliquot of the resuspended plaque is used to infect strain MV1190, but in this case the intracellular, double-stranded replicative form of the recombinant DNA is isolated from the infected cell pellets using the Mini-Prep procedure referenced above.
Example 7
Reconstruction Of The Site-Specifically
Mutagenized Chondroitinase I Gene And Its
High-Level Expression In E. coli
Example 6 described the site-specific mutageneses that created an NdeI site immediately preceding the signal sequence, as well as a second construction which placed the NdeI site adjacent to the triplet which codes for the N-terminal alanine found on the mature, processed P. vulgaris chondroitinase I protein. In each case, the ATG sequence of the NdeI recognition site (CATATG) can function as the translation initiation codon for the protein (either with or without the signal sequence).
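To illustrate how the ATG within the NdeI site serves as the initiation codon in the signal-less construction, the following sketch translates oligonucleotide #26 (SEQ ID NO:38) starting from that ATG. The codon table is deliberately partial (only the codons that occur in this short stretch, taken from the standard genetic code), and the function name is an illustrative assumption.

```python
# Oligonucleotide #26 as given in Example 6 (written 5' to 3').
OLIGO_26 = "GCGCCTTATAACGCGCATATGGCCACCAGCAATCCTG"
NDEI_SITE = "CATATG"

# Partial standard genetic code: only the codons needed for this stretch.
CODONS = {"ATG": "M", "GCC": "A", "ACC": "T", "AGC": "S", "AAT": "N", "CCT": "P"}


def translate_from_ndei(seq):
    """Translate complete codons starting at the ATG embedded in the NdeI site."""
    start = seq.index(NDEI_SITE) + 3  # the ATG is the second half of CATATG
    peptide = []
    for i in range(start, len(seq) - 2, 3):
        peptide.append(CODONS[seq[i:i + 3]])
    return "".join(peptide)


print(translate_from_ndei(OLIGO_26))
```

The translation begins with methionine followed directly by alanine, the N-terminal residue of the mature enzyme, which is why the expressed protein can carry one additional methionine at its N-terminus.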
In order to transfer these alterations from the M13 vector in which they were constructed to the full chondroitinase I gene, the isolated replicative form is digested with KpnI and ClaI. The KpnI site is part of the M13mp19 polylinker, while the ClaI site is found approximately 490 bp from the end of the cloned fragment of the chondroitinase I gene. The restriction digestion products obtained are fractionated on a 4% NuSieve™ GTG agarose gel run in 1/2 X Tris-Acetate buffer (TAE). The appropriate approximately 500 bp band is extracted from the gel using Qiaex™. Similarly, plasmid DNA (LP2 786) carrying the chondroitinase I gene is also digested with KpnI and ClaI and then fractionated on a 0.8% agarose gel run in 1/2 X TAE. In this case, the KpnI site is part of the polylinker of pIBI24, while the ClaI site corresponds to the one described above. (As stated above, there is a second ClaI site in the chondroitinase I gene, but it is not cleaved by ClaI because this site is apparently blocked by dam methylation. The site-specific mutagenesis and reconstruction of the chondroitinase I gene were carried out before the entire nucleotide sequence was ascertained.) The approximately 7 kb fragment containing the pIBI24 vector and the large fragment of the chondroitinase I gene is isolated from the agarose gel by electroelution (11), followed by ethanol precipitation. This 7 kb fragment is then treated with calf intestinal alkaline phosphatase, extracted first with phenol-chloroform, then with chloroform, and then precipitated twice with ethanol and finally resuspended in 0.1 ml TE. The two isolated N-terminal encoding fragments (the two approximately 500 bp KpnI-ClaI pieces containing the two site-specifically mutagenized sequences, one with and one without the signal sequence) are each ligated to the approximately 7 kb fragment encompassing the remainder of the chondroitinase I gene and the pIBI24 vector. The ligase reaction is then used to transform the E. coli strain 294 and ampicillin-resistant derivatives are obtained. DNA is isolated from small (10 ml) cultures and digested with NdeI to verify the presence of this restriction site within the reconstructed DNA.
In order to remove the (apparent) P. vulgaris promoter and ribosome binding site, the modified chondroitinase I genes are isolated as approximately 4.5 kb NdeI-NsiI fragments and subcloned into a pBR322 variant in which the EcoRI site is first filled-in, then dephosphorylated, and finally a phosphorylated NsiI linker (New England Biolabs) inserted. The sequence of the linker used (TGCATGCATGCA) to place the NsiI site (ATGCAT) into pBR322 also includes an SphI site (GCATGC). In order to trim extra, non-coding DNA from the subcloned NdeI-NsiI fragments, as well as to introduce a unique restriction site to be used later, plasmids (representing two clones each with the signal sequence retained [LP2 861 and LP2 863] and two with the signal sequence deleted [LP2 865 and LP2 867]) containing the approximately 4500 bp NdeI-NsiI segments including the chondroitinase I gene are first digested with SphI, the ends "filled-in" with the "Klenow" fragment (11) of the E. coli DNA polymerase I and the resulting DNA fragments fractionated on an agarose gel (0.8% in 1/2 X TAE). The appropriate bands (approximately 5200 bp) are eluted from the gel using Qiaex™ and then treated with calf alkaline phosphatase. After the removal of this enzyme by phenol-chloroform and chloroform extractions, the DNA is precipitated twice and finally resuspended in 0.1 ml TE.
This DNA is then ligated in the presence of a phosphorylated BamHI linker and the mixture used to transform the E. coli strain 294. Six representative, ampicillin-resistant colonies from each of the four constructions are grown in small (10 ml) cultures and plasmid DNA is isolated. Digestion of the DNA from the 24 clones examined with the enzymes NdeI and BamHI indicates which contain the BamHI site and, simultaneously, releases the approximately 3400 bp NdeI-BamHI fragment which contains the chondroitinase I gene. Seventeen clones (eight with and nine without the signal sequence) yield the desired fragment, which is extracted from the agarose gel with Qiaex™. These approximately 3.4 kb NdeI-BamHI chondroitinase I gene-containing fragments (both with and without the signal sequence) are then used to construct a high-level expression system.
The expression vector used, pET-9A (9; Novagen), is derived from elements of the E. coli bacteriophage T7. It contains an origin of replication derived from the Col El plasmid, a kanamycin resistance determinant, and the transcription and translation initiation determinants of the T7 gene 10. The naturally-occurring translation initiation codon for this gene is part of an NdeI site. This region is followed by a unique BamHI site and a T7 transcription terminator. A sample of this expression vector is digested with the restriction enzymes NdeI and BamHI, dephosphorylated with calf intestinal alkaline phosphatase, and purified by agarose gel electrophoresis. Each of the chondroitinase I gene fragments (both with and without the signal sequence) is ligated to the expression vector fragment. The resulting recombinant DNA mixture is used to transform the E. coli K-12 host, HMS174 (Novagen). Kanamycin-resistant colonies obtained are grown in small scale (10 ml) and plasmid DNA is isolated and examined to confirm the predicted structure. Samples of these constructions are then used to transform the expression host BL21(DE3)/pLysS (10). This E. coli B strain carries the T7 RNA polymerase gene under lac control (and is therefore inducible by either lactose or IPTG) on a lambda phage integrated within the E. coli chromosome, as well as the Col El-compatible plasmid pLysS. This latter replicon specifies resistance to chloramphenicol and contains the T7 lysozyme gene inserted into the tetracycline-resistance determinant of pACYC184 (ATCC 37033, American Type Culture Collection, Rockville, MD) in the "silent" orientation (read in the opposite direction relative to the tetracycline resistance gene). The T7 lysozyme is expressed at a relatively low level in this construction and serves as an inhibitor of the T7 RNA polymerase (16), thereby minimizing the basal-level expression of the gene to be overexpressed.
Derivatives of BL21(DE3)/pLysS carrying the chondroitinase I gene (with the signal sequence retained and which have been subjected to the site-directed mutagenesis described in Example 6 (SEQ ID NO:3)) in pET9-A are designated LL2084, LL2085, LL2086 and LL2087. They are not tested for expression of the chondroitinase I enzyme. The native chondroitinase I gene (with the signal sequence retained) (SEQ ID NO:1), which has not been subjected to site-directed mutagenesis, is inserted into a different expression host. Expression of the chondroitinase I enzyme is achieved. One of the derivatives of BL21(DE3)/pLysS carrying the signal-less chondroitinase I gene which has been subjected to the site-directed mutagenesis described in Example 6 (SEQ ID NO:4), inserted into pET9-A, is designated LL2088, tested and used to establish a master cell bank. The insertion of the gene into pET9-A yields the plasmid designated pTM49-6. Samples of the E. coli B strain BL21(DE3)/pLysS carrying the plasmid pTM49-6 constitute the deposited strain ATCC 69234. An overnight culture of this deposited strain is grown at 30°C in the presence of 40 μg/ml of kanamycin and 25 μg/ml of chloramphenicol. A 0.5 ml aliquot of this culture is used to inoculate 100 ml of a rich "expression" medium containing M9 salts (17) supplemented with 20 g/l tryptone, 10 g/l yeast extract, and 10 g/l dextrose in addition to the same level of kanamycin and chloramphenicol.
The culture is grown at 30°C to an appropriate density (a value of 1 at A600) and then chondroitinase I expression is induced by the addition of IPTG to a final concentration of 1 mM. After three hours, samples are taken, centrifuged, and the cell pellets frozen on dry ice prior to assay. The frozen pellets are thawed, resuspended in buffer and sonicated. A value of 56 units/ml is obtained (relative to the original culture volume), which indicates that this expression system is functional. A subsequent 10 liter fermentation under controlled conditions at a higher cell density yields a maximum value of approximately 600 units/ml of chondroitinase I. This represents a substantial improvement over fermentation of the original native P. vulgaris, which had not expressed chondroitinase I at a level above 2 units/ml.
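The size of the improvement can be expressed as a ratio of the activities quoted above. In this brief sketch the variable names are illustrative; because the native value of 2 units/ml is stated as an upper bound, the computed ratios should be read as lower bounds on the improvement.

```python
# Activities quoted in the text, in units per ml of culture.
NATIVE_MAX_UNITS_PER_ML = 2.0        # native P. vulgaris: not above this level
SHAKE_FLASK_UNITS_PER_ML = 56.0      # 100 ml induced culture
FERMENTOR_UNITS_PER_ML = 600.0       # 10 liter fermentation (approximate maximum)

fold_shake_flask = SHAKE_FLASK_UNITS_PER_ML / NATIVE_MAX_UNITS_PER_ML
fold_fermentor = FERMENTOR_UNITS_PER_ML / NATIVE_MAX_UNITS_PER_ML

print(f"shake flask: at least {fold_shake_flask:.0f}-fold over native")
print(f"fermentor:   at least {fold_fermentor:.0f}-fold over native")
```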
Example 8
Method For The Isolation And Purification Of
The Native Chondroitinase I Enzyme
As Adapted To The Recombinant Enzyme
The native enzyme is produced by fermentation of a culture of P. vulgaris. The bacterial cells are first recovered from the medium and resuspended in buffer. The cell suspension is then homogenized to lyse the bacterial cells. Then a charged particulate, such as 50 ppm Bioacryl (Toso Haas, Philadelphia, PA), is added to remove DNA, aggregates and debris from the homogenization step. Next, the solution is brought to 40% saturation of ammonium sulfate to precipitate out undesired proteins. The chondroitinase I remains in solution. The solution is then filtered using a 0.22 micron SP240 filter (Amicon, Beverly, MA), and the retentate is washed using nine volumes of 40% ammonium sulfate solution to recover most of the enzyme. The filtrate is concentrated and subjected to diafiltration with a sodium phosphate buffer using a 30 kD filter to remove salts and small molecules.
The filtrate containing chondroitinase I is subjected to cation exchange chromatography using a Cellufine™ cellulose sulfate column (Chisso Corporation, distributed by Amicon). At pH 7.2, 20 mM sodium phosphate, more than 98% of the chondroitinase I binds to the column. The native chondroitinase I is then eluted from the column using a 0 to 250 mM sodium chloride gradient in 20 mM sodium phosphate buffer.
The eluted enzyme is then subjected to additional chromatography steps, such as anion exchange and hydrophobic interaction column chromatography. As a result of all of these procedures, chondroitinase I is obtained at a purity of 90-97% as measured by SDS-PAGE scanning (see above). However, the yield of the native protein is only 25-35%, determined as described above. This method also results in the cleavage of the approximately 110 kD chondroitinase I protein into a 90 kD and an 18 kD fragment. Nonetheless, the two fragments remain non-ionically bound and exhibit chondroitinase I activity. When this procedure is repeated with lysed host cells carrying a recombinant plasmid encoding chondroitinase I, significantly poorer results are obtained. Less than 10% of the chondroitinase I binds to the cation exchange column at standard stringent conditions of pH 7.2, 20 mM sodium phosphate.
Under less stringent binding conditions of pH 6.8 and 5 mM phosphate, an improvement of binding with one batch of material to 60-90% is observed. However, elution of the recombinant protein with the NaCl gradient gives a broad activity peak, rather than a sharp peak (see Figure 2) . This indicates the product is heterogeneous. Furthermore, in subsequent fermentation batches, the recombinant enzyme binds poorly (1-40%) , even using the less stringent binding conditions. Batches that bind poorly are not completely processed, so their overall recovery is not quantified.
Example 9
First Method For The Isolation And
Purification Of Recombinant Chondroitinase I
According To This Invention
As a first step, the host cells which express the recombinant chondroitinase I enzyme are homogenized to lyse the cells. This releases the enzyme into the supernatant.
In one embodiment of this invention, the supernatant is first subjected to diafiltration to remove salts and other small molecules. An example of a suitable filter is a spiral wound 30 kD filter made by Amicon (Beverly, MA). However, this step removes only the free form, not the bound form, of the negatively charged molecules. The bound form of these charged species is removed by passing the supernatant through a strong, high capacity anion exchange resin-containing column. An example of such a resin is the Macro-Prep™ High Q resin (Bio-Rad, Melville, N.Y.). Other strong, high capacity anion exchange columns are also suitable. The negatively charged molecules bind to the column, while the enzyme passes through the column. It is also found that some unrelated, undesirable proteins bind to the column.
Next, the eluate from the anion exchange column is directly loaded onto a cation exchange resin-containing column. Examples of such resins are S-Sepharose™ (Pharmacia, Piscataway, N.J.) and Macro-Prep™ High S (Bio-Rad). Each of these two resin-containing columns has SO3- ligands bound thereto in order to facilitate the exchange of cations. Other cation exchange columns are also suitable. The enzyme binds to the column and is then eluted with a solvent capable of releasing the enzyme from the column.
Any salt which increases the conductivity of the solution is suitable for elution. Examples of such salts include sodium salts, as well as potassium salts and ammonium salts. An aqueous sodium chloride solution of appropriate concentration is suitable. A gradient, such as 0 to 250 mM sodium chloride, is acceptable, as is a step elution using 200 mM sodium chloride.
A sharp peak is seen in the sodium chloride gradient elution (Figure 3) . The improvement in enzyme yield over the prior method is striking. The recombinant chondroitinase I enzyme is recovered at a purity of 99% at a yield of 80-90%.
The purity of the protein is measured by scanning the bands in SDS-PAGE gels. A 4-20% gradient of acrylamide is used in the development of the gels. The band(s) in each lane of the gel are scanned using the procedure described above.
These improvements are related directly to the increase in binding of the enzyme to the cation exchange column which results from first using the anion exchange column. In comparative experiments, when only the cation exchange column is used, only 1% of the enzyme binds to the column. However, when the anion exchange column is used first, over 95% of the enzyme binds to the column.
Example 10
Second Method For The Isolation And
Purification Of Recombinant Chondroitinase I
According To This Invention
In the second embodiment of this aspect of the invention, two additional steps are inserted in the method before the diafiltration step of the first embodiment. The supernatant is treated with an acidic solution, such as 1 M acetic acid, bringing the supernatant to a final pH of 4.5, to precipitate out the desired enzyme. The pellet is obtained by centrifugation at 5,000 x g for 20 minutes. The pellet is then dissolved in an alkali solution, such as 20-30 mM NaOH, bringing it to a final pH of 9.8. The solution is then subjected to the diafiltration and subsequent steps of the first embodiment of this invention.
In comparative experiments with the second embodiment of this invention, when only the cation exchange column is used, only 5% of the enzyme binds to the column. However, when the anion exchange column is used first, essentially 100% of the enzyme binds to the column. The second embodiment provides comparable enzyme purity and yield to the first embodiment of the invention.
Acid precipitation removes proteins that remain soluble; however, these proteins are removed anyway by the cation and anion exchange steps that follow (although smaller columns may be used). An advantage of the acid precipitation step is that the sample volume is decreased to about 20% of the original volume after dissolution, and hence can be handled more easily on a large scale. However, the additional acid precipitation and alkali dissolution steps of the second embodiment mean that the second embodiment is more time consuming than the first embodiment. On a manufacturing scale, the marginal improvements in purity and yield provided by the second embodiment may be outweighed by the simpler procedure of the first embodiment, which still provides highly pure enzyme at high yields.
The high purity of the recombinant enzyme obtained by the two embodiments of this invention is depicted in Figure 4. A single sharp band is seen in the SDS-PAGE gel photograph: Lane 1 is the enzyme using the method of the first embodiment; Lane 2 is the enzyme using the method of the second embodiment; Lane 3 represents the supernatant from the host cell prior to purification (many other proteins are present); and Lane 4 represents molecular weight standards.
Example 11
Site-Specific Mutagenesis Of A Fragment Encoding
The N-Terminal Region Of Chondroitinase II
The approach taken in the case of the chondroitinase II gene is to modify the naturally-occurring ATG initiation codon to embed it within an NdeI site. This results in a construction in which the signal peptide is retained, such that the expressed gene is processed and secreted to yield the mature native enzyme structure that has a leucine residue at the N-terminus. The mutagenized bases are upstream of the coding region.
The method used for this site-specific alteration is that described above for the expression of the chondroitinase I gene and is based on the work of Kunkel (15) using the Muta-Gene™ In Vitro Mutagenesis Kit Version 2 (Bio-Rad, Melville, N.Y.). In this procedure, the target DNA to be mutagenized is first cloned into a suitable M13-derived vector to generate single-stranded DNA. This recombinant phage is replicated in the E. coli host strain CJ236 (Bio-Rad), a male strain that carries the dut and ung alleles. The combination of these two mutations, dut (dUTPase) and ung (uracil-N-glycosylase), results in the incorporation of some uracil, rather than thymine, residues into the DNA synthesized in this organism. Single-stranded template is then isolated after propagation on CJ236 and the appropriate mutagenic, synthetic oligonucleotide (SEQ ID NO:41) is annealed to this DNA. This oligonucleotide serves as a primer for T7 DNA polymerase, which copies the entire recombinant molecule. T4 DNA ligase is then used to seal the nick between the first residue of the mutagenic oligonucleotide and the last residue added in vitro. The newly synthesized DNA (containing the desired base changes) therefore does not contain uracil, while the template DNA (with the native sequences) does. Transformation of a non-mutant (with respect to the ung and dut alleles) male E. coli strain yields phage progeny that are primarily derived from the mutagenized strand synthesized in vitro as a result of the inactivation of the uracil-containing template strand.
In this specific case, the fragment to be cloned for the mutagenesis is a MunI-EcoRI fragment that spans the region between nucleotides 2943 and 3980 (SEQ ID NOS:1 and 39). The DNA digested to obtain this fragment is designated LP2783. This plasmid is constructed in the same way as LP2786 (described in Example 4), except that a HindIII linker is inserted into the EcoRV deletion of LP2776 rather than the EcoRI linker. This MunI-EcoRI fragment is ligated into the unique EcoRI site of LP2941, an M13mp19 derivative in which the normal polylinker is replaced with that found in the plasmid vector pNEB193 (New England Biolabs, Beverly MA). The four base overhang produced by MunI digestion can be ligated to an EcoRI site, but the resulting recombinant sequence cannot be digested by either enzyme. The EcoRI digested LP2941 is also dephosphorylated with calf intestinal alkaline phosphatase (Boehringer Mannheim, Indianapolis IN) prior to gel purification and use.
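The compatibility of MunI and EcoRI ends, and the loss of both sites at the hybrid junction, can be illustrated with a short sketch. It assumes the published recognition sequences C^AATTG for MunI and G^AATTC for EcoRI (each enzyme cutting after the first base to leave a 5'-AATT overhang); the function name is illustrative.

```python
# Recognition sequences; both enzymes cut after the first base (X^AATTY),
# leaving the same four-base 5'-AATT overhang.
ECORI = "GAATTC"
MUNI = "CAATTG"


def junction(left_site, right_site):
    """Sequence formed when the left-hand half of one cut site is ligated
    to the right-hand half (overhang included) of another cut site."""
    return left_site[:1] + right_site[1:]


hybrid_a = junction(ECORI, MUNI)      # EcoRI-cut vector end joined to MunI-cut insert end
hybrid_b = junction(MUNI, ECORI)      # the opposite arrangement
regenerated = junction(ECORI, ECORI)  # EcoRI end joined to EcoRI end

for name, seq in (("EcoRI/MunI", hybrid_a), ("MunI/EcoRI", hybrid_b),
                  ("EcoRI/EcoRI", regenerated)):
    print(f"{name}: {seq}  cut by EcoRI: {seq == ECORI}  cut by MunI: {seq == MUNI}")
```

Only the EcoRI/EcoRI junction regenerates a cleavable site, which is why the MunI end of the insert is fixed in place once it has been ligated into the EcoRI-digested vector.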
The ligated DNA mixture is used to infect the male E. coli strain MV1190, and the plaques obtained are picked into 0.5 ml of SM buffer and the phage allowed to elute by diffusion. These are then used to infect 10 ml cultures of MV1190, which are grown overnight. The cultures are centrifuged and the pellets used for the isolation of the double-stranded replicative forms of the recombinant virus. The supernatants, which contain the corresponding phage particles, are stored under refrigeration until needed. The orientation of the cloned fragment is determined by digestion of the replicative form DNA with HindIII, because there is one site within the polylinker and a second, asymmetrically placed site (SEQ ID NOS:1 and 39, nucleotides 3326-3331) within the above MunI-EcoRI fragment.
Once the desired orientation is identified, the corresponding phage-containing supernatant is serially diluted, used to infect the E. coli strain CJ236, and then plated to obtain single plaques which are picked and eluted as above. One of these is then used to infect CJ236 and another 10 ml culture is grown. The single-stranded DNA is isolated from the phage-containing supernatant using Qiaex™ columns and materials and methods recommended by the manufacturer (Qiagen, Chatsworth, CA) and finally resuspended in a volume of 0.01 ml. The recombinant phage are grown on CJ236 (dut- ung-) for two rounds in order to maximize the accumulation of uracil residues in the template strand prior to the actual site-specific mutagenesis.
The mutagenic oligonucleotide used is obtained from Bio-Synthesis (Denton, TX) and has the following sequence:
5' -ATT-TGC-AGG-AAA-TCT-GCA-TAT-GCT-AAT-AAA-AAA-CCC-3' (SEQ ID NO:41)
This sequence differs from the corresponding region of SEQ ID NOS:1 and 39 in that an AT sequence (base pairs 3235 and 3236) is replaced by a CA sequence, which creates the desired NdeI sequence (CATATG) at the start of the presumed leader sequence for the chondroitinase II gene. One optical density unit of this oligonucleotide is dissolved in 0.46 ml of TE 7.4 (0.01 M TrisHCl, pH 7.8-0.001 M EDTA, pH 8.0), yielding an oligonucleotide concentration of approximately 6 pmol/μl. Three hundred picomoles of this oligonucleotide are phosphorylated in a 0.1 ml reaction containing 0.05 M TrisHCl, pH 7.8, 0.01 M MgCl2, 0.02 M dithiothreitol, 0.001 M ATP, 25 μg/ml bovine serum albumin and 100 units of T4 polynucleotide kinase (New England Biolabs) at 37°C for 30 minutes, followed by incubation at 75°C for 20 minutes to inactivate the enzyme. The phosphorylated oligonucleotide is then stored frozen at -20°C at a concentration of approximately 3 pmoles/μl. For the site-specific mutagenesis, 1 μl (3 pmole) of the mutagenic oligonucleotide is mixed with 6 μl of the single-stranded DNA prepared above in a 10 μl volume of 0.02 M TrisHCl, pH 7.4, 0.002 M MgCl2, 0.05 M NaCl. The oligonucleotide is annealed to this template by first incubating the sample at 70°C for 5 minutes and then cooling this sample to 25°C over a 45 minute period in a DNA Thermal Cycler™ (Perkin-Elmer Cetus, Norwalk, CT). The sample is maintained at 25°C for another 5 minutes before being cooled to 20°C and finally transferred to an ice bath.
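The stated stock concentration of approximately 6 pmol/μl can be reproduced with a rough estimate. The sketch below relies on two rule-of-thumb values that are assumptions and do not come from the text: one optical density (A260) unit of single-stranded DNA is taken as about 33 μg, and the average mass of a nucleotide residue as about 330 g/mol. The function name is illustrative.

```python
# Mutagenic oligonucleotide (SEQ ID NO:41), joined from the hyphenated triplets.
OLIGO_41 = "".join("ATT-TGC-AGG-AAA-TCT-GCA-TAT-GCT-AAT-AAA-AAA-CCC".split("-"))

# Rule-of-thumb values (assumptions, not taken from the text).
UG_PER_OD_UNIT = 33.0          # micrograms of single-stranded DNA per A260 unit
G_PER_MOL_PER_NT = 330.0       # average mass of one nucleotide residue


def stock_pmol_per_ul(od_units, length_nt, volume_ul):
    """Approximate oligonucleotide concentration in pmol per microliter."""
    grams = od_units * UG_PER_OD_UNIT * 1e-6
    moles = grams / (length_nt * G_PER_MOL_PER_NT)
    return moles * 1e12 / volume_ul


stock = stock_pmol_per_ul(od_units=1.0, length_nt=len(OLIGO_41), volume_ul=460.0)
kinased = 300.0 / 100.0  # 300 pmol in a 0.1 ml (100 ul) reaction

print(f"length: {len(OLIGO_41)} nt, contains CATATG: {'CATATG' in OLIGO_41}")
print(f"estimated stock: {stock:.2f} pmol/ul")
print(f"kinased oligonucleotide: {kinased:.1f} pmol/ul")
```

Under these assumptions the estimate agrees with both the approximately 6 pmol/μl stock and the approximately 3 pmol/μl phosphorylated oligonucleotide described above.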
The annealed primer is then extended after the addition of 1 μl of 10X synthesis buffer (Bio-Rad; containing 0.005 M of each of the dNTPs, 0.01 M ATP, 0.1 M TrisHCl, pH 7.4, 0.05 M MgCl2, 0.02 M DTT), 1 μl of T4 DNA ligase (3 units/μl; Bio-Rad) and 1 μl of T7 DNA polymerase (0.5 units/μl; Bio-Rad). The in vitro DNA synthesis is carried out on ice for 5 minutes, at 11°C for ten minutes, and at 37°C for 30 minutes prior to transfer to ice. This sample is used directly to transform the male E. coli host MV1190 (dut+ ung+) and the resulting plaques, containing the site-specifically mutagenized phage, are obtained, picked and eluted as described above. Aliquots of these phage stocks are used to infect 10 ml cultures of MV1190 and allowed to grow overnight. The cultures are centrifuged and the replicative forms of the recombinant phage are isolated using Qiaex™ columns and methods recommended by the manufacturer (Qiagen, Chatsworth CA). The DNA isolated is resuspended in 0.1 ml of TE 7.4. Initial digestions of a portion of each of these DNA samples with NdeI reveal that at least four appear to have acquired a new NdeI site, indicating that the site-specific mutagenesis is successful. Consequently, larger samples of these four clones (0.04 ml each) are digested with NdeI and EcoRI and fractionated on a 1.4% agarose gel run in a Tris-acetate-EDTA buffer system.
The desired approximately 740 base pair fragment is observed in each case and this band is excised from each pattern. The four samples are then combined and the DNA extracted from the gel using a Qiaex™ resin and buffers according to the manufacturer's recommendations (Qiagen, Chatsworth CA) and resuspended in 0.05 ml of TE, pH 7.4. This isolated, site-specifically mutagenized N-terminal coding region of the cloned P. vulgaris gene for chondroitinase II is then subcloned into the plasmid pNEB193 (New England Biolabs, Beverly MA) between the (dephosphorylated) unique NdeI and EcoRI sites present in this plasmid. After transformation of the E. coli host strain 294, 10 ml cultures derived from the individual transformants are grown and the recombinant plasmid DNA isolated as above. The DNA sample from one of the positive clones is designated m#15-5712. This sample represents the modified N-terminal region that is to be joined to the C-terminal coding region for the chondroitinase II gene, which is described in Example 12.
Example 12
Isolation, Characterization And DNA Sequence Analysis
Of A Fragment Encoding The C-terminal Region Of
Chondroitinase II
The DNA sequence contained in SEQ ID NO:39 indicates that chondroitinase II is encoded by a region that is downstream of that for chondroitinase I. This information is derived from a portion of a 10 kilobase NsiI fragment of P. vulgaris that is subcloned originally from a cosmid clone designated LP2751. The combination of the DNA sequencing and the restriction map in Figure 1 reveals that the chondroitinase II coding region initiates to the "left" of the EcoRI site that lies within the P. vulgaris derived DNA and proceeds toward the NsiI site at the "right" end of the fragment depicted in Figure 1. Therefore, this restriction map should be expanded to the "right" to find a suitable fragment that will include the C-terminal coding region for the chondroitinase II gene.
Digestion of LP2751 reveals three EcoRI fragments of approximately 20 kb, 13 kb, and 10 kb, and indicates that there are three EcoRI sites within LP2751. Because there are two EcoRI sites that bracket the cloning site, the conclusion is that there is one EcoRI site within the cloned P. vulgaris DNA in this clone. Furthermore, since the approximately 13 kb fragment corresponds to the size of the cosmid vector per se, this unique EcoRI site lies between the approximately 20 kb and the approximately 10 kb fragments noted above. Since it is known that the entire coding region for chondroitinase I, as well as the N-terminal coding region for chondroitinase II, are both contained within the approximately 10 kb NsiI fragment, restriction digestions that compare the patterns obtained among the cloned 10 kb NsiI fragment (present in the recombinant plasmid designated LP2770) and gel-purified samples of the above approximately 20 kb EcoRI and approximately 10 kb EcoRI fragments indicate which of these EcoRI fragments contains the chondroitinase I coding sequence and, therefore by deduction, which will carry the C-terminal coding region for chondroitinase II. Consequently, digestions are carried out using the restriction enzymes AflIII, ClaI, EcoRV, and HindIII, each of which has been noted by Applicants to yield eight to ten fragments upon digestion of the original cosmid clone designated LP2751. The recombinant molecule carrying the subcloned approximately 10 kb NsiI fragment (LP2770) and the individually gel-purified approximately 20 kb EcoRI and approximately 10 kb EcoRI fragments are digested with each of these enzymes to yield patterns of fragments that are compared. These digestions reveal that the approximately 20 kb EcoRI and the LP2770 patterns have a number of fragments in common. This indicates that the chondroitinase I gene and the N-terminal coding region of the chondroitinase II gene are contained within the larger EcoRI fragment and, therefore, the C-terminal coding region for the chondroitinase II gene is on the approximately 10 kb EcoRI fragment.
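The inference that three fragments imply three EcoRI sites follows from the fact that a circular molecule cut at n sites yields n fragments. A small sketch of such a digest is given below; the site coordinates are hypothetical, chosen only to reproduce the approximate 20 kb, 13 kb, and 10 kb fragment sizes, and do not come from the text.

```python
def circular_digest(length_kb, sites_kb):
    """Fragment sizes (largest first) from cutting a circular molecule
    of the given length at the given site positions."""
    sites = sorted(sites_kb)
    fragments = [b - a for a, b in zip(sites, sites[1:])]
    # The last fragment wraps around the origin of the circle.
    fragments.append(length_kb - sites[-1] + sites[0])
    return sorted(fragments, reverse=True)


# Hypothetical map of LP2751 (about 43 kb in total): two EcoRI sites
# bracketing the approximately 13 kb vector and one site inside the insert.
COSMID_KB = 43
ECORI_SITES_KB = [0, 13, 33]  # illustrative coordinates only

fragments = circular_digest(COSMID_KB, ECORI_SITES_KB)
print(f"{len(ECORI_SITES_KB)} sites -> {len(fragments)} fragments: {fragments} kb")
```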
The approximately 10 kb EcoRI fragment is cloned into the unique EcoRI site of the derivative of pNEB193 (New England Biolabs, Beverly MA) that is designated lacpoΔ pNEB193. This vector carries two deletions relative to the parental molecule pNEB193. The first removes the sequences between the unique NdeI and EcoRI sites, retaining the EcoRI site but removing the NdeI site (and one of the two PvuII sites). The second deletion removes the region between the HindIII site at the other end of the polylinker and the (now unique) PvuII site, maintaining the HindIII site while removing the PvuII site. The recombinant DNA molecule carrying the subcloned approximately 10 kb EcoRI fragment in the vector lacpoΔ pNEB193 is designated LP21263. The orientation of the 112 kD C-terminal coding region within LP21263 is determined by restriction enzyme mapping. The results indicate that this region is positioned so as to proceed from the EcoRI site (defined as the "left" end) toward the HindIII site at the other end of the polylinker. Similarly, unique restriction sites for SmaI, XhoI, NcoI and NdeI are found approximately 2.6, 4.6, 5.8 and 8.5 kb from the "left" end of the approximately 10 kb EcoRI fragment. Digestion of LP21263 with SmaI, therefore, deletes a downstream region of approximately 7.4 kb from the site within the cloned P. vulgaris DNA to the second site within the polylinker region, leaving approximately 2.6 kb, which should be enough to encode the missing region of the chondroitinase II gene. This construction also "places" a BamHI site (present in the polylinker) downstream of the coding region for the chondroitinase II gene. This recombinant DNA molecule, which carries the chondroitinase II gene from the EcoRI site to (and presumably just beyond) the termination codon for this gene, has been designated m#25-5712.
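The sizes involved in the SmaI deletion can be checked against sequence coordinates quoted elsewhere in this description. The sketch below takes nucleotide 3980 (the end of the MunI-EcoRI fragment of Example 11) as the approximate position of the EcoRI site and nucleotide 6518 (Example 13) as the end of the chondroitinase II coding region in SEQ ID NO:39; treating these as exact boundaries is an assumption made for illustration.

```python
# Approximate sizes from the restriction map, in base pairs.
ECORI_FRAGMENT_BP = 10_000   # approximately 10 kb EcoRI fragment
SMAI_FROM_LEFT_BP = 2_600    # SmaI site, approximately 2.6 kb from the "left" end

# Coordinates in SEQ ID NO:39 (used here as approximate boundaries).
ECORI_SITE_NT = 3980         # end of the MunI-EcoRI fragment
CODING_END_NT = 6518         # end of the chondroitinase II coding region

deleted_bp = ECORI_FRAGMENT_BP - SMAI_FROM_LEFT_BP
retained_bp = SMAI_FROM_LEFT_BP
needed_bp = CODING_END_NT - ECORI_SITE_NT

print(f"deleted by SmaI digestion: about {deleted_bp / 1000:.1f} kb")
print(f"retained: about {retained_bp / 1000:.1f} kb; "
      f"needed for the C-terminal coding region: about {needed_bp} bp")
print(f"retained region is sufficient: {retained_bp >= needed_bp}")
```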
DNA sequence analysis is initiated on the approximately 10 kb EcoRI fragment derived from LP21263 and is completed after the assembly of the intact gene for chondroitinase II. The materials and methods for the DNA sequencing of this fragment are essentially the same as those used for the approximately 4 kb fragment containing the gene for chondroitinase I. Random fragments are derived from this approximately 10 kb EcoRI fragment by self-ligating the DNA and then fragmenting the polymerized DNA by sonication, as well as by partial digestion with the restriction enzymes Sau3A or MseI. These pieces are then cloned into M13-derived vectors and the single-stranded recombinant molecules are sequenced using the standard protocols described above. Finally, with the two sets of sequence data available, an approximately 300 base-pair BclI fragment is identified that is predicted to contain the EcoRI site that is the junction between the two P. vulgaris fragments of approximately 20 kb and approximately 10 kb obtained by digestion with EcoRI. This small fragment is sequenced in both directions to verify the nucleotide sequence through this junction point used in the constructions described below.
Example 13
Assembly Of The Entire Site-Specifically Mutagenized
Gene For Chondroitinase II
During the DNA sequencing, the molecule designated m#25-5712 is digested with EcoRI and BamHI. This releases a DNA fragment of approximately 2.6 kb. Similarly, the construction designated m#15-5712 is digested with EcoRI and BamHI and then dephosphorylated prior to purification by gel electrophoresis. The latter molecule therefore carries the N-terminal coding region of the chondroitinase II gene from the ATG initiation codon (now present as part of an NdeI site from the site-specific mutagenesis) to the EcoRI site.
These two fragments are ligated and the mixture is then used to transform the E. coli strain 294. Plasmid DNA is isolated from the transformants and positive clones are identified. Restriction digestion with NdeI and BamHI releases the desired fragment encoding the chondroitinase II gene (SEQ ID NO:39, nucleotides 3235-6518, followed by 14 nucleotides derived from the polylinker, which includes a BamHI site). This fragment is then ligated to the expression vector pET9A (Novagen, Madison, WI) described in the expression of the chondroitinase I gene.
The coding region of the chondroitinase II gene includes nucleotides 3238-6276 of SEQ ID NO:39, which encodes 1013 amino acids (SEQ ID NO:40). Of this region, nucleotides 3238-3306 encode the 23 amino acid signal peptide (SEQ ID NO:40, amino acids 1-23), while nucleotides 3307-6276 encode the mature 990 amino acid chondroitinase II protein (SEQ ID NO:40, amino acids 24-1013).
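As a consistency check on the coordinates just given, the following sketch converts each inclusive nucleotide range into a codon count. The function name is hypothetical, and the sketch assumes the quoted ranges exclude the termination codon, which is what the stated amino acid counts imply.

```python
def codon_count(first_nt: int, last_nt: int) -> int:
    """Number of whole codons in the inclusive nucleotide range first_nt..last_nt."""
    length = last_nt - first_nt + 1
    if length % 3 != 0:
        raise ValueError(f"range {first_nt}-{last_nt} is not a multiple of 3")
    return length // 3


full = codon_count(3238, 6276)    # entire coding region
signal = codon_count(3238, 3306)  # signal peptide
mature = codon_count(3307, 6276)  # mature protein

print(full, signal, mature)  # 1013 23 990
assert signal + mature == full
```

The three ranges are mutually consistent: 23 signal-peptide residues plus 990 mature residues account for all 1013 encoded amino acids.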
Restriction analysis with four enzymes of the region spanning both chondroitinase genes and flanking sequences thereof reveals the following restriction sites:
Enzyme    Nucleotide    Enzyme    Nucleotide
EcoRI     2             MunI      4510
HindIII   2046          HindIII   4530
MunI      2904          MunI      5176
MunI      2943          HindIII   5427
HindIII   3326          SmaI      6515
EcoRI     3974
In addition, restriction analysis with Sau3AI reveals a multiplicity of sites, including those at SEQ ID NO:39, nucleotides 212, 602, 890, 1042, 1181, 1241, 1442, 1505, 1746, 2330, 2363, 2701, 2705, 2920, 3697, 3708, 3745, 3868, 4087, 4800, 4872, 5565, 5635, 5860, 6058 and 6467.
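A site list of this kind can be reproduced from a sequence by scanning for each enzyme's recognition sequence. The sketch below is illustrative only: the function name is hypothetical, positions are reported as the 1-based coordinate of the first nucleotide of the recognition sequence (which appears to be the convention of the listing above), and the test input is the first 60 nucleotides printed for SEQ ID NO:1, not SEQ ID NO:39 itself. The recognition sequences used are the standard ones for EcoRI (GAATTC) and Sau3AI (GATC).

```python
def find_sites(seq: str, recognition: str) -> list[int]:
    """Return the 1-based start position of every occurrence of a recognition sequence."""
    seq = seq.upper().replace(" ", "")
    recognition = recognition.upper()
    positions = []
    start = seq.find(recognition)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based coordinate
        start = seq.find(recognition, start + 1)
    return positions


# First 60 nucleotides of SEQ ID NO:1 as printed in the sequence listing.
FIRST_60 = "GGAATTCCAT CACTCAATCA TTAAATTTAG GCACAACGAT GGGCTATCAG CGTTATGACA"

print(find_sites(FIRST_60, "GAATTC"))  # EcoRI
print(find_sites(FIRST_60, "GATC"))    # Sau3AI
```

On this stretch the scan finds a single EcoRI site at nucleotide 2 and no Sau3AI site, which is consistent with the EcoRI entry at nucleotide 2 and with the first listed Sau3AI site lying further downstream.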
One of the recombinant molecules (the chondroitinase II gene inserted into pET9A) obtained in this experiment is grown in large scale (0.5 liter) and the expression system containing the chondroitinase II gene is isolated and designated LP21359. An aliquot of this DNA is used to transform the expression host BL21(DE3)/pLysS described in the expression of the chondroitinase I gene. The resulting strain is designated TD112 and is used for large-scale fermentation and isolation of the chondroitinase II enzyme. A fermentation at a 10 liter scale carried out with this E. coli strain containing the plasmid expressing the chondroitinase II protein provides a maximum chondroitinase II titer of approximately 0.3 mg/ml, which is approximately 25 times the approximately 0.012 mg/ml obtained from the native P. vulgaris fermentation process for chondroitinase II.
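The fold improvement quoted above follows directly from the two titers. The sketch below recomputes it and, as an illustration only, estimates the enzyme mass in a 10 liter batch under the assumption that the peak titer applies to the whole volume; the variable names are not from the source.

```python
import math

recombinant_mg_per_ml = 0.3   # approximate peak titer, recombinant E. coli
native_mg_per_ml = 0.012      # approximate titer, native P. vulgaris process

fold = recombinant_mg_per_ml / native_mg_per_ml
print(f"fold increase: ~{fold:.0f}x")

# Assumption: the peak titer holds across the full 10 liter fermentation.
batch_ml = 10 * 1000
total_g = recombinant_mg_per_ml * batch_ml / 1000  # mg -> g
print(f"enzyme per 10 liter batch: ~{total_g:.0f} g")
```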
Example 14
First Method For The Isolation And Purification Of Recombinant Chondroitinase II According To This Invention
The initial part of this method is the same as that used for the recombinant chondroitinase I enzyme. As a first step, the host cells which express the recombinant chondroitinase II enzyme are homogenized to lyse the cells. This releases the enzyme into the supernatant.
In one embodiment of this invention, the supernatant is first subjected to diafiltration to remove salts and other small molecules. An example of a suitable filter is a spiral wound 30 kD filter made by Amicon (Beverly, MA). However, this step removes only the free, but not the bound, form of the negatively charged molecules. The bound form of these charged species is removed by passing the supernatant (see the SDS-PAGE gel depicted in Figure 5, lane 1) through a strong, high capacity anion exchange resin-containing column. An example of such a resin is the Macro-Prep™ High Q resin (Bio-Rad, Melville, N.Y.). Other strong, high capacity anion exchange columns are also suitable. The negatively charged molecules bind to the column, while the enzyme passes through the column with approximately 90% recovery of the enzyme. It is also found that some unrelated, undesirable proteins also bind to the column.
Next, the eluate from the anion exchange column (Figure 5, lane 2) is directly loaded onto a cation exchange resin-containing column. Examples of such resins are the S-Sepharose™ (Pharmacia, Piscataway, N.J.) and the Macro-Prep™ High S (Bio-Rad). Each of these two resin-containing columns has SO3⁻ ligands bound thereto in order to facilitate the exchange of cations. Other cation exchange columns are also suitable. The enzyme binds to the column, while a significant portion of contaminating proteins elute unbound.
At this point, the method diverges from that used for the chondroitinase I protein. Instead of eluting the protein with a non-specific salt solution capable of releasing the enzyme from the cation exchange column, a specific elution using a solution containing chondroitin sulfate is used. A 1% concentration of chondroitin sulfate is used; however, a gradient of this solvent is also acceptable. The specific chondroitin sulfate solution is preferred to the non-specific salt solution because the recombinant chondroitinase II protein is expressed at levels several-fold lower than the recombinant chondroitinase I protein; therefore, a more powerful and selective solution is necessary in order to obtain a final chondroitinase II product of a purity equivalent to that obtained for the chondroitinase I protein. The cation exchange column is next washed with a phosphate buffer, pH 7.0, to elute unbound proteins, followed by washing with borate buffer, pH 8.5, to elute loosely bound contaminating proteins and to increase the pH of the resin to that required for the optimal elution of the chondroitinase II protein using the substrate, chondroitin sulfate.
Next, a 1% solution of chondroitin sulfate in water, adjusted to pH 9.0, is used to elute the chondroitinase II protein as a sharp peak (recovery 65%) and at a high purity of approximately 95% (Figure 5, lane 3). However, the chondroitin sulfate has an affinity for the chondroitinase II protein which is stronger than its affinity for the resin of the column, and therefore the chondroitin sulfate co-elutes with the protein. This ensures that only protein which recognizes chondroitin sulfate is eluted, which is desirable, but also means that an additional process step is necessary to separate the chondroitin sulfate from the chondroitinase II protein.
In this separation step, the eluate is adjusted to pH 7.0 and is loaded as is onto an anion exchange resin-containing column, such as the Macro-Prep™ High Q resin. The column is washed with a 20 mM phosphate buffer, pH 6.8. The chondroitin sulfate binds to the column, while the chondroitinase II protein flows through in the unbound pool with greater than 95% recovery. At this point, the protein is pure, except for the presence of a single minor contaminant of approximately 37 kD (Figure 5, lanes 4 and 6). The contaminant may be a breakdown product of the chondroitinase II protein.
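The step recoveries reported so far can be combined into a rough overall estimate. The sketch below assumes the recoveries multiply, takes "greater than 95%" at its lower bound of 0.95, and leaves out the diafiltration and crystallization steps because no recovery is stated for them; the names are illustrative.

```python
from math import prod

# (step description, fractional recovery) as reported in the text.
STEP_RECOVERIES = [
    ("anion exchange flow-through", 0.90),
    ("cation exchange, chondroitin sulfate elution", 0.65),
    ("second anion exchange flow-through", 0.95),  # lower bound of ">95%"
]


def cumulative_recovery(steps: list[tuple[str, float]]) -> float:
    """Multiply the per-step fractional recoveries into one overall fraction."""
    return prod(fraction for _, fraction in steps)


overall = cumulative_recovery(STEP_RECOVERIES)
print(f"estimated overall recovery: {overall:.1%}")
```

Under these assumptions the three chromatographic steps together retain a little over half of the starting enzyme, with the chondroitin sulfate elution being the main loss.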
This contaminant is effectively removed by a crystallization step. The eluate from the anion exchange column is concentrated to 15 mg/ml protein using an Amicon stirred cell with a 30 kD cutoff. The solution is maintained at 4°C for several days to crystallize out the pure chondroitinase II protein. The supernatant contains the 37 kD contaminant (Figure 5, lane 7). Centrifugation causes the crystals to form a pellet, the supernatant with the 37 kD contaminant is removed by pipetting, and the crystals are washed twice with water. After the first wash, some of the contaminant remains (Figure 5, lane 8), but after the second wash, only the chondroitinase II protein is visible (Figure 5, lane 9). The washed crystals are redissolved in water and exhibit a single protein band on SDS-PAGE, with a purity of greater than 99% (Figure 5, lane 10).
Example 15
Second Method For The Isolation And Purification Of Recombinant Chondroitinase II According To This Invention
In the second embodiment of this aspect of the invention, two additional steps are inserted in the method for purifying the chondroitinase II enzyme before the diafiltration step of the first embodiment. The supernatant is treated with an acidic solution, such as 1 M acetic acid, bringing the supernatant to a final pH of 4.5, to precipitate out the desired enzyme. The pellet is obtained by centrifugation at 5,000 x g for 20 minutes. The pellet is then dissolved in an alkali solution, such as 20-30 mM NaOH, bringing it to a final pH of 9.8. The solution is then subjected to the diafiltration and subsequent steps of the first embodiment of this aspect of the invention.
Bibliography
1. Yamagata, T., et al., J. Biol. Chem., 243, 1523-1535 (1968).
2. Kikuchi, H., et al., U.S. Patent Number 5,198,355.
3. Brown, M. D., U.S. Patent Number 4,696,816.
4. Hageman, G. S., U.S. Patent Number 5,292,509.
5. Malitschek, B., and Schartl, M., Biotechniques, 7, 177-178 (1991).
6. Sanger, F., et al., Proc. Natl. Acad. Sci. USA, 74, 5463-5467 (1977).
7. Innis, M. A., and Gelfand, D. H., "Optimization of PCRs", pages 3-12 in PCR Protocols: A Guide to Methods and Applications, Academic Press, New York, N.Y. (1990).
8. Southern, E., J. Mol. Biol., 98, 503-517 (1975).
9. Studier, F. W., et al., Methods in Enzymology, 185, 60-89 (1990).
10. Studier, F. W., and Moffatt, B. A., J. Mol. Biol., 189, 113-130 (1986); Moffatt, B. A., and Studier, F. W., Cell, 41, 221-227 (1987).
11. Sambrook, J., et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989).
12. Clark, Nucleic Acids Research, 16, 9677 (1988).
13. Yanisch-Perron, C., et al., Gene, 33, 103-119 (1985).
14. Devereaux, et al., Nucleic Acids Research, 12, 387-395 (1984).
15. Kunkel, T. A., Proc. Natl. Acad. Sci. USA, 82, 488-492 (1985).
16. Studier, F. W., J. Mol. Biol., 219, 37-44 (1991).
17. Miller, J. H., Experiments in Molecular Genetics, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1972).
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT: American Cyanamid Company
(ii) TITLE OF INVENTION: Cloning And Expression Of The Chondroitinase I and II Genes From P. Vulgaris
(iii) NUMBER OF SEQUENCES: 41
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: American Cyanamid Company
(B) STREET: One Cyanamid Plaza
(C) CITY: Wayne
(D) STATE: New Jersey
(E) COUNTRY: U.S.A.
(F) ZIP: 07470-8426
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: PCT/US94/
(B) FILING DATE:
(C) CLASSIFICATION:
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Gordon, Alan M.
(B) REGISTRATION NUMBER: 30,637
(C) REFERENCE/DOCKET NUMBER: 31,726-00/PCT
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 201-831-3244
(B) TELEFAX: 201-831-3305
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3980 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 119..3181
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GGAATTCCAT CACTCAATCA TTAAATTTAG GCACAACGAT GGGCTATCAG CGTTATGACA 60
AATTTAATGA AGGACGCATT GGTTTCACTG TTAGCCAGCG TTTCTAAGGA GAAAAATA 118
ATG CCG ATA TTT CGT TTT ACT GCA CTT GCA ATG ACA TTG GGG CTA TTA 166 Met Pro Ile Phe Arg Phe Thr Ala Leu Ala Met Thr Leu Gly Leu Leu 1 5 10 15
TCA GCG CCT TAT AAC GCG ATG GCA GCC ACC AGC AAT CCT GCA TTT GAT 214 Ser Ala Pro Tyr Asn Ala Met Ala Ala Thr Ser Asn Pro Ala Phe Asp 20 25 30
CCT AAA AAT CTG ATG CAG TCA GAA ATT TAC CAT TTT GCA CAA AAT AAC 262 Pro Lys Asn Leu Met Gin Ser Glu He Tyr His Phe Ala Gin Asn Asn 35 40 45
CCA TTA GCA GAC TTC TCA TCA GAT AAA AAC TCA ATA CTA ACG TTA TCT 310 Pro Leu Ala Asp Phe Ser Ser Asp Lys Asn Ser He Leu Thr Leu Ser 50 55 60
GAT AAA CGT AGC ATT ATG GGA AAC CAA TCT CTT TTA TGG AAA TGG AAA 358 Asp Lys Arg Ser He Met Gly Asn Gin Ser Leu Leu Trp Lys Trp Lys 65 70 75 80
GGT GGT AGT AGC TTT ACT TTA CAT AAA AAA CTG ATT GTC CCC ACC GAT 406 Gly Gly Ser Ser Phe Thr Leu His Lys Lys Leu He Val Pro Thr Asp 85 90 95
AAA GAA GCA TCT AAA GCA TGG GGA CGC TCA TCT ACC CCC GTT TTC TCA 454 Lys Glu Ala Ser Lys Ala Trp Gly Arg Ser Ser Thr Pro Val Phe Ser 100 105 110
TTT TGG CTT TAC AAT GAA AAA CCG ATT GAT GGT TAT CTT ACT ATC GAT 502 Phe Trp Leu Tyr Asn Glu Lys Pro He Asp Gly Tyr Leu Thr He Asp 115 120 125
TTC GGA GAA AAA CTC ATT TCA ACC AGT GAG GCT CAG GCA GGC TTT AAA 550 Phe Gly Glu Lys Leu He Ser Thr Ser Glu Ala Gin Ala Gly Phe Lys 130 135 140
GTA AAA TTA GAT TTC ACT GGC TGG CGT GCT GTG GGA GTC TCT TTA AAT 598 Val Lys Leu Asp Phe Thr Gly Trp Arg Ala Val Gly Val Ser Leu Asn 145 150 155 160
AAC GAT CTT GAA AAT CGA GAG ATG ACC TTA AAT GCA ACC AAT ACC TCC 646 Asn Asp Leu Glu Asn Arg Glu Met Thr Leu Asn Ala Thr Asn Thr Ser 165 170 175
TCT GAT GGT ACT CAA GAC AGC ATT GGG CGT TCT TTA GGT GCT AAA GTC 694 Ser Asp Gly Thr Gin Asp Ser He Gly Arg Ser Leu Gly Ala Lys Val 180 185 190
GAT AGT ATT CGT TTT AAA GCG CCT TCT AAT GTG AGT CAG GGT GAA ATC 742 Asp Ser He Arg Phe Lys Ala Pro Ser Asn Val Ser Gin Gly Glu He 195 200 205
TAT ATC GAC CGT ATT ATG TTT TCT GTC GAT GAT GCT CGC TAC CAA TGG 790 Tyr He Asp Arg He Met Phe Ser Val Asp Asp Ala Arg Tyr Gin Trp 210 215 220
TCT GAT TAT CAA GTA AAA ACT CGC TTA TCA GAA CCT GAA ATT CAA TTT 838 Ser Asp Tyr Gin Val Lys Thr Arg Leu Ser Glu Pro Glu He Gin Phe 225 230 235 240
CAC AAC GTA AAG CCA CAA CTA CCT GTA ACA CCT GAA AAT TTA GCG GCC 886 His Asn Val Lys Pro Gin Leu Pro Val Thr Pro Glu Asn Leu Ala Ala 245 250 255
ATT GAT CTT ATT CGC CAA CGT CTA ATT AAT GAA TTT GTC GGA GGT GAA 934 Ile Asp Leu Ile Arg Gln Arg Leu Ile Asn Glu Phe Val Gly Gly Glu 260 265 270
AAA GAG ACA AAC CTC GCA TTA GAA GAG AAT ATC AGC AAA TTA AAA AGT 982 Lys Glu Thr Asn Leu Ala Leu Glu Glu Asn He Ser Lys Leu Lys Ser 275 280 285
GAT TTC GAT GCT CTT AAT ATT CAC ACT TTA GCA AAT GGT GGA ACG CAA 1030 Asp Phe Asp Ala Leu Asn He His Thr Leu Ala Asn Gly Gly Thr Gin 290 295 300
GGC AGA CAT CTG ATC ACT GAT AAA CAA ATC ATT ATT TAT CAA CCA GAG 1078 Gly Arg His Leu He Thr Asp Lys Gin He He He Tyr Gin Pro Glu 305 310 315 320
AAT CTT AAC TCC CAA GAT AAA CAA CTA TTT GAT AAT TAT GTT ATT TTA 1126 Asn Leu Asn Ser Gin Asp Lys Gin Leu Phe Asp Asn Tyr Val He Leu 325 330 335
GGT AAT TAC ACG ACA TTA ATG TTT AAT ATT AGC CGT GCT TAT GTG CTG 1174 Gly Asn Tyr Thr Thr Leu Met Phe Asn He Ser Arg Ala Tyr Val Leu 340 345 350
GAA AAA GAT CCC ACA CAA AAG GCG CAA CTA AAG CAG ATG TAC TTA TTA 1222 Glu Lys Asp Pro Thr Gin Lys Ala Gin Leu Lys Gin Met Tyr Leu Leu 355 360 365
ATG ACA AAG CAT TTA TTA GAT CAA GGC TTT GTT AAA GGG AGT GCT TTA 1270 Met Thr Lys His Leu Leu Asp Gin Gly Phe Val Lys Gly Ser Ala Leu 370 375 380
GTG ACA ACC CAT CAC TGG GGA TAC AGT TCT CGT TGG TGG TAT ATT TCC 1318 Val Thr Thr His His Trp Gly Tyr Ser Ser Arg Trp Trp Tyr He Ser 385 390 395 400
ACG TTA TTA ATG TCT GAT GCA CTA AAA GAA GCG AAC CTA CAA ACT CAA 1366 Thr Leu Leu Met Ser Asp Ala Leu Lys Glu Ala Asn Leu Gin Thr Gin 405 410 415
GTT TAT GAT TCA TTA CTG TGG TAT TCA CGT GAG TTT AAA AGT AGT TTT 1414 Val Tyr Asp Ser Leu Leu Trp Tyr Ser Arg Glu Phe Lys Ser Ser Phe 420 425 430
GAT ATG AAA GTA AGT GCT GAT AGC TCT GAT CTA GAT TAT TTC AAT ACC 1462 Asp Met Lys Val Ser Ala Asp Ser Ser Asp Leu Asp Tyr Phe Asn Thr 435 440 445
TTA TCT CGC CAA CAT TTA GCC TTA TTA TTA CTA GAG CCT GAT GAT CAA 1510 Leu Ser Arg Gln His Leu Ala Leu Leu Leu Leu Glu Pro Asp Asp Gln 450 455 460
AAG CGT ATC AAC TTA GTT AAT ACT TTC AGC CAT TAT ATC ACT GGC GCA 1558 Lys Arg Ile Asn Leu Val Asn Thr Phe Ser His Tyr Ile Thr Gly Ala 465 470 475 480
TTA ACG CAA GTG CCA CCG GGT GGT AAA GAT GGT TTA CGC CCT GAT GGT 1606 Leu Thr Gln Val Pro Pro Gly Gly Lys Asp Gly Leu Arg Pro Asp Gly 485 490 495
ACA GCA TGG CGA CAT GAA GGC AAC TAT CCG GGC TAC TCT TTC CCA GCC 1654 Thr Ala Trp Arg His Glu Gly Asn Tyr Pro Gly Tyr Ser Phe Pro Ala 500 505 510
TTT AAA AAT GCC TCT CAG CTT ATT TAT TTA TTA CGC GAT ACA CCA TTT 1702 Phe Lys Asn Ala Ser Gin Leu He Tyr Leu Leu Arg Asp Thr Pro Phe 515 520 525
TCA GTG GGT GAA AGT GGT TGG AAT AAC CTG AAA AAA GCG ATG GTT TCA 1750 Ser Val Gly Glu Ser Gly Trp Asn Asn Leu Lys Lys Ala Met Val Ser 530 535 540
GCG TGG ATC TAC AGT AAT CCA GAA GTT GGA TTA CCG CTT GCA GGA AGA 1798 Ala Trp He Tyr Ser Asn Pro Glu Val Gly Leu Pro Leu Ala Gly Arg 545 550 555 560
CAC CCT TTT AAC TCA CCT TCG TTA AAA TCA GTC GCT CAA GGC TAT TAC 1846 His Pro Phe Asn Ser Pro Ser Leu Lys Ser Val Ala Gin Gly Tyr Tyr 565 570 575
TGG CTT GCC ATG TCT GCA AAA TCA TCG CCT GAT AAA ACA CTT GCA TCT 1894 Trp Leu Ala Met Ser Ala Lys Ser Ser Pro Asp Lys Thr Leu Ala Ser 580 585 590
ATT TAT CTT GCG ATT AGT GAT AAA ACA CAA AAT GAA TCA ACT GCT ATT 1942 He Tyr Leu Ala He Ser Asp Lys Thr Gin Asn Glu Ser Thr Ala He 595 600 605
TTT GGA GAA ACT ATT ACA CCA GCG TCT TTA CCT CAA GGT TTC TAT GCC 1990 Phe Gly Glu Thr He Thr Pro Ala Ser Leu Pro Gin Gly Phe Tyr Ala 610 615 620
TTT AAT GGC GGT GCT TTT GGT ATT CAT CGT TGG CAA GAT AAA ATG GTG 2038 Phe Asn Gly Gly Ala Phe Gly He His Arg Trp Gin Asp Lys Met Val 625 630 635 640
ACA CTG AAA GCT TAT AAC ACC AAT GTT TGG TCA TCT GAA ATT TAT AAC 2086 Thr Leu Lys Ala Tyr Asn Thr Asn Val Trp Ser Ser Glu He Tyr Asn 645 650 655
AAA GAT AAC CGT TAT GGC CGT TAC CAA AGT CAT GGT GTC GCT CAA ATA 2134 Lys Asp Asn Arg Tyr Gly Arg Tyr Gin Ser His Gly Val Ala Gin He 660 665 670
GTG AGT AAT GGC TCG CAG CTT TCA CAG GGC TAT CAG CAA GAA GGT TGG 2182 Val Ser Asn Gly Ser Gin Leu Ser Gin Gly Tyr Gin Gin Glu Gly Trp 675 680 685
GAT TGG AAT AGA ATG CAA GGG GCA ACC ACT ATT CAC CTT CCT CTT AAA 2230 Asp Trp Asn Arg Met Gin Gly Ala Thr Thr He His Leu Pro Leu Lys 690 695 700
GAC TTA GAC AGT CCT AAA CCT CAT ACC TTA ATG CAA CGT GGA GAG CGT 2278 Asp Leu Asp Ser Pro Lys Pro His Thr Leu Met Gin Arg Gly Glu Arg 705 710 715 720
GGA TTT AGC GGA ACA TCA TCC CTT GAA GGT CAA TAT GGC ATG ATG GCA 2326 Gly Phe Ser Gly Thr Ser Ser Leu Glu Gly Gin Tyr Gly Met Met Ala 725 730 735
TTC GAT CTT ATT TAT CCC GCC AAT CTT GAG CGT TTT GAT CCT AAT TTC 2374 Phe Asp Leu He Tyr Pro Ala Asn Leu Glu Arg Phe Asp Pro Asn Phe 740 745 750
ACT GCG AAA AAG AGT GTA TTA GCC GCT GAT AAT CAC TTA ATT TTT ATT 2422 Thr Ala Lys Lys Ser Val Leu Ala Ala Asp Asn His Leu He Phe He 755 760 765
GGT AGC AAT ATA AAT AGT AGT GAT AAA AAT AAA AAT GTT GAA ACG ACC 2470 Gly Ser Asn He Asn Ser Ser Asp Lys Asn Lys Asn Val Glu Thr Thr 770 775 780
TTA TTC CAA CAT GCC ATT ACT CCA ACA TTA AAT ACC CTT TGG ATT AAT 2518 Leu Phe Gin His Ala He Thr Pro Thr Leu Asn Thr Leu Trp He Asn 785 790 795 800
GGA CAA AAG ATA GAA AAC ATG CCT TAT CAA ACA ACA CTT CAA CAA GGT 2566 Gly Gin Lys He Glu Asn Met Pro Tyr Gin Thr Thr Leu Gin Gin Gly 805 810 815
GAT TGG TTA ATT GAT AGC AAT GGC AAT GGT TAC TTA ATT ACT CAA GCA 2614 Asp Trp Leu He Asp Ser Asn Gly Asn Gly Tyr Leu He Thr Gin Ala 820 825 830
GAA AAA GTA AAT GTA AGT CGC CAA CAT CAG GTT TCA GCG GAA AAT AAA 2662 Glu Lys Val Asn Val Ser Arg Gin His Gin Val Ser Ala Glu Asn Lys 835 840 845
AAT CGC CAA CCG ACA GAA GGA AAC TTT AGC TCG GCA TGG ATC GAT CAC 2710 Asn Arg Gin Pro Thr Glu Gly Asn Phe Ser Ser Ala Trp He Asp His 850 855 860
AGC ACT CGC CCC AAA GAT GCC AGT TAT GAG TAT ATG GTC TTT TTA GAT 2758 Ser Thr Arg Pro Lys Asp Ala Ser Tyr Glu Tyr Met Val Phe Leu Asp 865 870 875 880
GCG ACA CCT GAA AAA ATG GGA GAG ATG GCA CAA AAA TTC CGT GAA AAT 2806 Ala Thr Pro Glu Lys Met Gly Glu Met Ala Gin Lys Phe Arg Glu Asn 885 890 895
AAT GGG TTA TAT CAG GTT CTT CGT AAG GAT AAA GAC GTT CAT ATT ATT 2854 Asn Gly Leu Tyr Gin Val Leu Arg Lys Asp Lys Asp Val His He He 900 905 910
CTC GAT AAA CTC AGC AAT GTA ACG GGA TAT GCC TTT TAT CAG CCA GCA 2902 Leu Asp Lys Leu Ser Asn Val Thr Gly Tyr Ala Phe Tyr Gin Pro Ala 915 920 925
TCA ATT GAA GAC AAA TGG ATC AAA AAG GTT AAT AAA CCT GCA ATT GTG 2950 Ser He Glu Asp Lys Trp He Lys Lys Val Asn Lys Pro Ala He Val 930 935 940
ATG ACT CAT CGA CAA AAA GAC ACT CTT ATT GTC AGT GCA GTT ACA CCT 2998 Met Thr His Arg Gin Lys Asp Thr Leu He Val Ser Ala Val Thr Pro 945 950 955 960
GAT TTA AAT ATG ACT CGC CAA AAA GCA GCA ACT CCT GTC ACC ATC AAT 3046 Asp Leu Asn Met Thr Arg Gln Lys Ala Ala Thr Pro Val Thr Ile Asn 965 970 975
GTC ACG ATT AAT GGC AAA TGG CAA TCT GCT GAT AAA AAT AGT GAA GTG 3094 Val Thr Ile Asn Gly Lys Trp Gln Ser Ala Asp Lys Asn Ser Glu Val 980 985 990
AAA TAT CAG GTT TCT GGT GAT AAC ACT GAA CTG ACG TTT ACG AGT TAC 3142 Lys Tyr Gin Val Ser Gly Asp Asn Thr Glu Leu Thr Phe Thr Ser Tyr 995 1000 1005
TTT GGT ATT CCA CAA GAA ATC AAA CTC TCG CCA CTC CCT TGATTTAATC 3191 Phe Gly He Pro Gin Glu He Lys Leu Ser Pro Leu Pro 1010 1015 1020
AAAAGAACGC TCTTGCGTTC CTTTTTTATT TGCAGGAAAT CTGATTATGC TAATAAAAAA 3251
CCCTTTAGCC CACGCGGTTA CATTAAGCCT CTGTTTATCA TTACCCGCAC AAGCATTACC 3311
CACTCTGTCT CATGAAGCTT TCGGCGATAT TTATCTTTTT GAAGGTGAAT TACCCAATAC 3371
CCTTACCACT TCAAATAATA ATCAATTATC GCTAAGCAAA CAGCATGCTA AAGATGGTGA 3431
ACAATCACTC AAATGGCAAT ATCAACCACA AGCAACATTA ACACTAAATA ATATTGTTAA 3491
TTACCAAGAT GATAAAAATA CAGCCACACC ACTCACTTTT ATGATGTGGA TTTATAATGA 3551
AAAACCTCAA TCTTCCCCAT TAACGTTAGC ATTTAAACAA AATAATAAAA TTGCACTAAG 3611
TTTTAATGCT GAACTTAATT TTACGGGGTG GCGAGGTATT GCTGTTCCTT TTCGTGATAT 3671
GCAAGGCTCT GCGACAGGTC AACTTGATCA ATTAGTGATC ACCGCTCCAA ACCAAGCCGG 3731
AACACTCTTT TTTGATCAAA TCATCATGAG TGTACCGTTA GACAATCGTT GGGCAGTACC 3791
TGACTATCAA ACACCTTACG TAAATAACGC AGTAAACACG ATGGTTAGTA AAAACTGGAG 3851
TGCATTATTG ATGTACGATC AGATGTTTCA AGCCCATTAC CCTACTTTAA ACTTCGATAC 3911
TGAATTTCGC GATGACCAAA CAGAAATGGC TTCGATTTAT CAGCGCTTTG AATATTATCA 3971
AGGAATTCC 3980
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1021 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
Met Pro He Phe Arg Phe Thr Ala Leu Ala Met Thr Leu Gly Leu Leu 1 5 10 15
Ser Ala Pro Tyr Asn Ala Met Ala Ala Thr Ser Asn Pro Ala Phe Asp 20 25 30
Pro Lys Asn Leu Met Gin Ser Glu He Tyr His Phe Ala Gin Asn Asn 35 40 45
Pro Leu Ala Asp Phe Ser Ser Asp Lys Asn Ser He Leu Thr Leu Ser 50 55 60
Asp Lys Arg Ser He Met Gly Asn Gin Ser Leu Leu Trp Lys Trp Lys 65 70 75 80
Gly Gly Ser Ser Phe Thr Leu His Lys Lys Leu He Val Pro Thr Asp 85 90 95
Lys Glu Ala Ser Lys Ala Trp Gly Arg Ser Ser Thr Pro Val Phe Ser 100 105 110
Phe Trp Leu Tyr Asn Glu Lys Pro He Asp Gly Tyr Leu Thr He Asp 115 120 125
Phe Gly Glu Lys Leu He Ser Thr Ser Glu Ala Gin Ala Gly Phe Lys 130 135 140
Val Lys Leu Asp Phe Thr Gly Trp Arg Ala Val Gly Val Ser Leu Asn 145 150 155 160
Asn Asp Leu Glu Asn Arg Glu Met Thr Leu Asn Ala Thr Asn Thr Ser 165 170 175
Ser Asp Gly Thr Gin Asp Ser He Gly Arg Ser Leu Gly Ala Lys Val 180 185 190
Asp Ser He Arg Phe Lys Ala Pro Ser Asn Val Ser Gin Gly Glu He 195 200 205
Tyr He Asp Arg He Met Phe Ser Val Asp Asp Ala Arg Tyr Gin Trp 210 215 220
Ser Asp Tyr Gin Val Lys Thr Arg Leu Ser Glu Pro Glu He Gin Phe 225 230 235 240
His Asn Val Lys Pro Gin Leu Pro Val Thr Pro Glu Asn Leu Ala Ala 245 250 255
He Asp Leu He Arg Gin Arg Leu He Asn Glu Phe Val Gly Gly Glu 260 265 270
Lys Glu Thr Asn Leu Ala Leu Glu Glu Asn He Ser Lys Leu Lys Ser 275 280 285
Asp Phe Asp Ala Leu Asn He His Thr Leu Ala Asn Gly Gly Thr Gin 290 295 300
Gly Arg His Leu He Thr Asp Lys Gin He He He Tyr Gin Pro Glu 305 310 315 320
Asn Leu Asn Ser Gin Asp Lys Gin Leu Phe Asp Asn Tyr Val He Leu 325 330 335
Gly Asn Tyr Thr Thr Leu Met Phe Asn He Ser Arg Ala Tyr Val Leu 340 345 350
Glu Lys Asp Pro Thr Gin Lys Ala Gin Leu Lys Gin Met Tyr Leu Leu 355 360 365
Met Thr Lys His Leu Leu Asp Gln Gly Phe Val Lys Gly Ser Ala Leu 370 375 380
Val Thr Thr His His Trp Gly Tyr Ser Ser Arg Trp Trp Tyr Ile Ser 385 390 395 400
Thr Leu Leu Met Ser Asp Ala Leu Lys Glu Ala Asn Leu Gin Thr Gin 405 410 415
Val Tyr Asp Ser Leu Leu Trp Tyr Ser Arg Glu Phe Lys Ser Ser Phe 420 425 430
Asp Met Lys Val Ser Ala Asp Ser Ser Asp Leu Asp Tyr Phe Asn Thr 435 440 445
Leu Ser Arg Gin His Leu Ala Leu Leu Leu Leu Glu Pro Asp Asp Gin 450 455 460
Lys Arg He Asn Leu Val Asn Thr Phe Ser His Tyr He Thr Gly Ala 465 470 475 480
Leu Thr Gin Val Pro Pro Gly Gly Lys Asp Gly Leu Arg Pro Asp Gly 485 490 495
Thr Ala Trp Arg His Glu Gly Asn Tyr Pro Gly Tyr Ser Phe Pro Ala 500 505 510
Phe Lys Asn Ala Ser Gln Leu Ile Tyr Leu Leu Arg Asp Thr Pro Phe 515 520 525
Ser Val Gly Glu Ser Gly Trp Asn Asn Leu Lys Lys Ala Met Val Ser 530 535 540
Ala Trp He Tyr Ser Asn Pro Glu Val Gly Leu Pro Leu Ala Gly Arg 545 550 555 560
His Pro Phe Asn Ser Pro Ser Leu Lys Ser Val Ala Gin Gly Tyr Tyr 565 570 575
Trp Leu Ala Met Ser Ala Lys Ser Ser Pro Asp Lys Thr Leu Ala Ser 580 585 590
He Tyr Leu Ala He Ser Asp Lys Thr Gin Asn Glu Ser Thr Ala He 595 600 605
Phe Gly Glu Thr He Thr Pro Ala Ser Leu Pro Gin Gly Phe Tyr Ala 610 615 620
Phe Asn Gly Gly Ala Phe Gly He His Arg Trp Gin Asp Lys Met Val 625 630 635 640
Thr Leu Lys Ala Tyr Asn Thr Asn Val Trp Ser Ser Glu He Tyr Asn 645 650 655
Lys Asp Asn Arg Tyr Gly Arg Tyr Gin Ser His Gly Val Ala Gin He 660 665 670
Val Ser Asn Gly Ser Gin Leu Ser Gin Gly Tyr Gin Gin Glu Gly Trp 675 680 685
Asp Trp Asn Arg Met Gin Gly Ala Thr Thr He His Leu Pro Leu Lys 690 695 700
Asp Leu Asp Ser Pro Lys Pro His Thr Leu Met Gln Arg Gly Glu Arg 705 710 715 720
Gly Phe Ser Gly Thr Ser Ser Leu Glu Gly Gln Tyr Gly Met Met Ala 725 730 735
Phe Asp Leu He Tyr Pro Ala Asn Leu Glu Arg Phe Asp Pro Asn Phe 740 745 750
Thr Ala Lys Lys Ser Val Leu Ala Ala Asp Asn His Leu He Phe He 755 760 765
Gly Ser Asn He Asn Ser Ser Asp Lys Asn Lys Asn Val Glu Thr Thr 770 775 780
Leu Phe Gin His Ala He Thr Pro Thr Leu Asn Thr Leu Trp He Asn 785 790 795 800
Gly Gin Lys He Glu Asn Met Pro Tyr Gin Thr Thr Leu Gin Gin Gly 805 810 815
Asp Trp Leu He Asp Ser Asn Gly Asn Gly Tyr Leu He Thr Gin Ala 820 825 830
Glu Lys Val Asn Val Ser Arg Gin His Gin Val Ser Ala Glu Asn Lys 835 840 845
Asn Arg Gin Pro Thr Glu Gly Asn Phe Ser Ser Ala Trp He Asp His 850 855 860
Ser Thr Arg Pro Lys Asp Ala Ser Tyr Glu Tyr Met Val Phe Leu Asp 865 870 875 880
Ala Thr Pro Glu Lys Met Gly Glu Met Ala Gin Lys Phe Arg Glu Asn 885 890 895
Asn Gly Leu Tyr Gin Val Leu Arg Lys Asp Lys Asp Val His He He 900 905 910
Leu Asp Lys Leu Ser Asn Val Thr Gly Tyr Ala Phe Tyr Gin Pro Ala 915 920 925
Ser Ile Glu Asp Lys Trp Ile Lys Lys Val Asn Lys Pro Ala Ile Val 930 935 940
Met Thr His Arg Gin Lys Asp Thr Leu He Val Ser Ala Val Thr Pro 945 950 955 960
Asp Leu Asn Met Thr Arg Gin Lys Ala Ala Thr Pro Val Thr He Asn 965 970 975
Val Thr He Asn Gly Lys Trp Gin Ser Ala Asp Lys Asn Ser Glu Val 980 985 990
Lys Tyr Gin Val Ser Gly Asp Asn Thr Glu Leu Thr Phe Thr Ser Tyr 995 1000 1005
Phe Gly He Pro Gin Glu He Lys Leu Ser Pro Leu Pro 1010 1015 1020
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3980 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
GGAATTCCAT CACTCAATCA TTAAATTTAG GCACAACGAT GGGCTATCAG CGTTATGACA 60
AATTTAATGA AGGACGCATT GGTTTCACTG TTAGCCAGCG TTTCTAAGGA GAAAACATAT 120
GCCGATATTT CGTTTTACTG CACTTGCAAT GACATTGGGG CTATTATCAG CGCCTTATAA 180
CGCGATGGCA GCCACCAGCA ATCCTGCATT TGATCCTAAA AATCTGATGC AGTCAGAAAT 240
TTACCATTTT GCACAAAATA ACCCATTAGC AGACTTCTCA TCAGATAAAA ACTCAATACT 300
AACGTTATCT GATAAACGTA GCATTATGGG AAACCAATCT CTTTTATGGA AATGGAAAGG 360
TGGTAGTAGC TTTACTTTAC ATAAAAAACT GATTGTCCCC ACCGATAAAG AAGCATCTAA 420
AGCATGGGGA CGCTCATCTA CCCCCGTTTT CTCATTTTGG CTTTACAATG AAAAACCGAT 480
TGATGGTTAT CTTACTATCG ATTTCGGAGA AAAACTCATT TCAACCAGTG AGGCTCAGGC 540
AGGCTTTAAA GTAAAATTAG ATTTCACTGG CTGGCGTGCT GTGGGAGTCT CTTTAAATAA 600
CGATCTTGAA AATCGAGAGA TGACCTTAAA TGCAACCAAT ACCTCCTCTG ATGGTACTCA 660
AGACAGCATT GGGCGTTCTT TAGGTGCTAA AGTCGATAGT ATTCGTTTTA AAGCGCCTTC 720
TAATGTGAGT CAGGGTGAAA TCTATATCGA CCGTATTATG TTTTCTGTCG ATGATGCTCG 780
CTACCAATGG TCTGATTATC AAGTAAAAAC TCGCTTATCA GAACCTGAAA TTCAATTTCA 840
CAACGTAAAG CCACAACTAC CTGTAACACC TGAAAATTTA GCGGCCATTG ATCTTATTCG 900
CCAACGTCTA ATTAATGAAT TTGTCGGAGG TGAAAAAGAG ACAAACCTCG CATTAGAAGA 960
GAATATCAGC AAATTAAAAA GTGATTTCGA TGCTCTTAAT ATTCACACTT TAGCAAATGG 1020
TGGAACGCAA GGCAGACATC TGATCACTGA TAAACAAATC ATTATTTATC AACCAGAGAA 1080
TCTTAACTCC CAAGATAAAC AACTATTTGA TAATTATGTT ATTTTAGGTA ATTACACGAC 1140
ATTAATGTTT AATATTAGCC GTGCTTATGT GCTGGAAAAA GATCCCACAC AAAAGGCGCA 1200
ACTAAAGCAG ATGTACTTAT TAATGACAAA GCATTTATTA GATCAAGGCT TTGTTAAAGG 1260
GAGTGCTTTA GTGACAACCC ATCACTGGGG ATACAGTTCT CGTTGGTGGT ATATTTCCAC 1320
GTTATTAATG TCTGATGCAC TAAAAGAAGC GAACCTACAA ACTCAAGTTT ATGATTCATT 1380
ACTGTGGTAT TCACGTGAGT TTAAAAGTAG TTTTGATATG AAAGTAAGTG CTGATAGCTC 1440
TGATCTAGAT TATTTCAATA CCTTATCTCG CCAACATTTA GCCTTATTAT TACTAGAGCC 1500
TGATGATCAA AAGCGTATCA ACTTAGTTAA TACTTTCAGC CATTATATCA CTGGCGCATT 1560
AACGCAAGTG CCACCGGGTG GTAAAGATGG TTTACGCCCT GATGGTACAG CATGGCGACA 1620
TGAAGGCAAC TATCCGGGCT ACTCTTTCCC AGCCTTTAAA AATGCCTCTC AGCTTATTTA 1680
TTTATTACGC GATACACCAT TTTCAGTGGG TGAAAGTGGT TGGAATAACC TGAAAAAAGC 1740
GATGGTTTCA GCGTGGATCT ACAGTAATCC AGAAGTTGGA TTACCGCTTG CAGGAAGACA 1800
CCCTTTTAAC TCACCTTCGT TAAAATCAGT CGCTCAAGGC TATTACTGGC TTGCCATGTC 1860
TGCAAAATCA TCGCCTGATA AAACACTTGC ATCTATTTAT CTTGCGATTA GTGATAAAAC 1920
ACAAAATGAA TCAACTGCTA TTTTTGGAGA AACTATTACA CCAGCGTCTT TACCTCAAGG 1980
TTTCTATGCC TTTAATGGCG GTGCTTTTGG TATTCATCGT TGGCAAGATA AAATGGTGAC 2040
ACTGAAAGCT TATAACACCA ATGTTTGGTC ATCTGAAATT TATAACAAAG ATAACCGTTA 2100
TGGCCGTTAC CAAAGTCATG GTGTCGCTCA AATAGTGAGT AATGGCTCGC AGCTTTCACA 2160
GGGCTATCAG CAAGAAGGTT GGGATTGGAA TAGAATGCAA GGGGCAACCA CTATTCACCT 2220
TCCTCTTAAA GACTTAGACA GTCCTAAACC TCATACCTTA ATGCAACGTG GAGAGCGTGG 2280
ATTTAGCGGA ACATCATCCC TTGAAGGTCA ATATGGCATG ATGGCATTCG ATCTTATTTA 2340
TCCCGCCAAT CTTGAGCGTT TTGATCCTAA TTTCACTGCG AAAAAGAGTG TATTAGCCGC 2400
TGATAATCAC TTAATTTTTA TTGGTAGCAA TATAAATAGT AGTGATAAAA ATAAAAATGT 2460
TGAAACGACC TTATTCCAAC ATGCCATTAC TCCAACATTA AATACCCTTT GGATTAATGG 2520
ACAAAAGATA GAAAACATGC CTTATCAAAC AACACTTCAA CAAGGTGATT GGTTAATTGA 2580
TAGCAATGGC AATGGTTACT TAATTACTCA AGCAGAAAAA GTAAATGTAA GTCGCCAACA 2640
TCAGGTTTCA GCGGAAAATA AAAATCGCCA ACCGACAGAA GGAAACTTTA GCTCGGCATG 2700
GATCGATCAC AGCACTCGCC CCAAAGATGC CAGTTATGAG TATATGGTCT TTTTAGATGC 2760
GACACCTGAA AAAATGGGAG AGATGGCACA AAAATTCCGT GAAAATAATG GGTTATATCA 2820
GGTTCTTCGT AAGGATAAAG ACGTTCATAT TATTCTCGAT AAACTCAGCA ATGTAACGGG 2880
ATATGCCTTT TATCAGCCAG CATCAATTGA AGACAAATGG ATCAAAAAGG TTAATAAACC 2940
TGCAATTGTG ATGACTCATC GACAAAAAGA CACTCTTATT GTCAGTGCAG TTACACCTGA 3000
TTTAAATATG ACTCGCCAAA AAGCAGCAAC TCCTGTCACC ATCAATGTCA CGATTAATGG 3060
CAAATGGCAA TCTGCTGATA AAAATAGTGA AGTGAAATAT CAGGTTTCTG GTGATAACAC 3120
TGAACTGACG TTTACGAGTT ACTTTGGTAT TCCACAAGAA ATCAAACTCT CGCCACTCCC 3180
TTGATTTAAT CAAAAGAACG CTCTTGCGTT CCTTTTTTAT TTGCAGGAAA TCTGATTATG 3240
CTAATAAAAA ACCCTTTAGC CCACGCGGTT ACATTAAGCC TCTGTTTATC ATTACCCGCA 3300
CAAGCATTAC CCACTCTGTC TCATGAAGCT TTCGGCGATA TTTATCTTTT TGAAGGTGAA 3360
TTACCCAATA CCCTTACCAC TTCAAATAAT AATCAATTAT CGCTAAGCAA ACAGCATGCT 3420
AAAGATGGTG AACAATCACT CAAATGGCAA TATCAACCAC AAGCAACATT AACACTAAAT 3480
AATATTGTTA ATTACCAAGA TGATAAAAAT ACAGCCACAC CACTCACTTT TATGATGTGG 3540
ATTTATAATG AAAAACCTCA ATCTTCCCCA TTAACGTTAG CATTTAAACA AAATAATAAA 3600
ATTGCACTAA GTTTTAATGC TGAACTTAAT TTTACGGGGT GGCGAGGTAT TGCTGTTCCT 3660
TTTCGTGATA TGCAAGGCTC TGCGACAGGT CAACTTGATC AATTAGTGAT CACCGCTCCA 3720
AACCAAGCCG GAACACTCTT TTTTGATCAA ATCATCATGA GTGTACCGTT AGACAATCGT 3780
TGGGCAGTAC CTGACTATCA AACACCTTAC GTAAATAACG CAGTAAACAC GATGGTTAGT 3840
AAAAACTGGA GTGCATTATT GATGTACGAT CAGATGTTTC AAGCCCATTA CCCTACTTTA 3900
AACTTCGATA CTGAATTTCG CGATGACCAA ACAGAAATGG CTTCGATTTA TCAGCGCTTT 3960
GAATATTATC AAGGAATTCC 3980
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3980 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 188..3181
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
GGAATTCCAT CACTCAATCA TTAAATTTAG GCACAACGAT GGGCTATCAG CGTTATGACA 60
AATTTAATGA AGGACGCATT GGTTTCACTG TTAGCCAGCG TTTCTAAGGA GAAAAATAAT 120
GCCGATATTT CGTTTTACTG CACTTGCAAT GACATTGGGG CTATTATCAG CGCCTTATAA 180
CGCGGAT ATG GCC ACC AGC AAT CCT GCA TTT GAT CCT AAA AAT CTG ATG 229 Met Ala Thr Ser Asn Pro Ala Phe Asp Pro Lys Asn Leu Met 1 5 10
CAG TCA GAA ATT TAC CAT TTT GCA CAA AAT AAC CCA TTA GCA GAC TTC 277 Gln Ser Glu Ile Tyr His Phe Ala Gln Asn Asn Pro Leu Ala Asp Phe 15 20 25 30
TCA TCA GAT AAA AAC TCA ATA CTA ACG TTA TCT GAT AAA CGT AGC ATT 325 Ser Ser Asp Lys Asn Ser Ile Leu Thr Leu Ser Asp Lys Arg Ser Ile 35 40 45
ATG GGA AAC CAA TCT CTT TTA TGG AAA TGG AAA GGT GGT AGT AGC TTT 373 Met Gly Asn Gln Ser Leu Leu Trp Lys Trp Lys Gly Gly Ser Ser Phe 50 55 60
ACT TTA CAT AAA AAA CTG ATT GTC CCC ACC GAT AAA GAA GCA TCT AAA 421 Thr Leu His Lys Lys Leu Ile Val Pro Thr Asp Lys Glu Ala Ser Lys 65 70 75
GCA TGG GGA CGC TCA TCT ACC CCC GTT TTC TCA TTT TGG CTT TAC AAT 469 Ala Trp Gly Arg Ser Ser Thr Pro Val Phe Ser Phe Trp Leu Tyr Asn 80 85 90
GAA AAA CCG ATT GAT GGT TAT CTT ACT ATC GAT TTC GGA GAA AAA CTC 517 Glu Lys Pro Ile Asp Gly Tyr Leu Thr Ile Asp Phe Gly Glu Lys Leu 95 100 105 110
ATT TCA ACC AGT GAG GCT CAG GCA GGC TTT AAA GTA AAA TTA GAT TTC 565 Ile Ser Thr Ser Glu Ala Gln Ala Gly Phe Lys Val Lys Leu Asp Phe 115 120 125
ACT GGC TGG CGT GCT GTG GGA GTC TCT TTA AAT AAC GAT CTT GAA AAT 613 Thr Gly Trp Arg Ala Val Gly Val Ser Leu Asn Asn Asp Leu Glu Asn 130 135 140
CGA GAG ATG ACC TTA AAT GCA ACC AAT ACC TCC TCT GAT GGT ACT CAA 661 Arg Glu Met Thr Leu Asn Ala Thr Asn Thr Ser Ser Asp Gly Thr Gln 145 150 155
GAC AGC ATT GGG CGT TCT TTA GGT GCT AAA GTC GAT AGT ATT CGT TTT 709 Asp Ser Ile Gly Arg Ser Leu Gly Ala Lys Val Asp Ser Ile Arg Phe 160 165 170
AAA GCG CCT TCT AAT GTG AGT CAG GGT GAA ATC TAT ATC GAC CGT ATT 757 Lys Ala Pro Ser Asn Val Ser Gln Gly Glu Ile Tyr Ile Asp Arg Ile 175 180 185 190
ATG TTT TCT GTC GAT GAT GCT CGC TAC CAA TGG TCT GAT TAT CAA GTA 805 Met Phe Ser Val Asp Asp Ala Arg Tyr Gln Trp Ser Asp Tyr Gln Val 195 200 205
AAA ACT CGC TTA TCA GAA CCT GAA ATT CAA TTT CAC AAC GTA AAG CCA 853 Lys Thr Arg Leu Ser Glu Pro Glu Ile Gln Phe His Asn Val Lys Pro 210 215 220
CAA CTA CCT GTA ACA CCT GAA AAT TTA GCG GCC ATT GAT CTT ATT CGC 901 Gln Leu Pro Val Thr Pro Glu Asn Leu Ala Ala Ile Asp Leu Ile Arg 225 230 235
CAA CGT CTA ATT AAT GAA TTT GTC GGA GGT GAA AAA GAG ACA AAC CTC 949 Gln Arg Leu Ile Asn Glu Phe Val Gly Gly Glu Lys Glu Thr Asn Leu 240 245 250
GCA TTA GAA GAG AAT ATC AGC AAA TTA AAA AGT GAT TTC GAT GCT CTT 997 Ala Leu Glu Glu Asn Ile Ser Lys Leu Lys Ser Asp Phe Asp Ala Leu 255 260 265 270
AAT ATT CAC ACT TTA GCA AAT GGT GGA ACG CAA GGC AGA CAT CTG ATC 1045 Asn Ile His Thr Leu Ala Asn Gly Gly Thr Gln Gly Arg His Leu Ile 275 280 285
ACT GAT AAA CAA ATC ATT ATT TAT CAA CCA GAG AAT CTT AAC TCC CAA 1093 Thr Asp Lys Gln Ile Ile Ile Tyr Gln Pro Glu Asn Leu Asn Ser Gln 290 295 300
GAT AAA CAA CTA TTT GAT AAT TAT GTT ATT TTA GGT AAT TAC ACG ACA 1141 Asp Lys Gln Leu Phe Asp Asn Tyr Val Ile Leu Gly Asn Tyr Thr Thr 305 310 315
TTA ATG TTT AAT ATT AGC CGT GCT TAT GTG CTG GAA AAA GAT CCC ACA 1189 Leu Met Phe Asn Ile Ser Arg Ala Tyr Val Leu Glu Lys Asp Pro Thr 320 325 330
CAA AAG GCG CAA CTA AAG CAG ATG TAC TTA TTA ATG ACA AAG CAT TTA 1237 Gln Lys Ala Gln Leu Lys Gln Met Tyr Leu Leu Met Thr Lys His Leu 335 340 345 350
TTA GAT CAA GGC TTT GTT AAA GGG AGT GCT TTA GTG ACA ACC CAT CAC 1285 Leu Asp Gln Gly Phe Val Lys Gly Ser Ala Leu Val Thr Thr His His 355 360 365
TGG GGA TAC AGT TCT CGT TGG TGG TAT ATT TCC ACG TTA TTA ATG TCT 1333 Trp Gly Tyr Ser Ser Arg Trp Trp Tyr Ile Ser Thr Leu Leu Met Ser 370 375 380
GAT GCA CTA AAA GAA GCG AAC CTA CAA ACT CAA GTT TAT GAT TCA TTA 1381 Asp Ala Leu Lys Glu Ala Asn Leu Gln Thr Gln Val Tyr Asp Ser Leu 385 390 395
CTG TGG TAT TCA CGT GAG TTT AAA AGT AGT TTT GAT ATG AAA GTA AGT 1429 Leu Trp Tyr Ser Arg Glu Phe Lys Ser Ser Phe Asp Met Lys Val Ser 400 405 410
GCT GAT AGC TCT GAT CTA GAT TAT TTC AAT ACC TTA TCT CGC CAA CAT 1477 Ala Asp Ser Ser Asp Leu Asp Tyr Phe Asn Thr Leu Ser Arg Gln His 415 420 425 430
TTA GCC TTA TTA TTA CTA GAG CCT GAT GAT CAA AAG CGT ATC AAC TTA 1525 Leu Ala Leu Leu Leu Leu Glu Pro Asp Asp Gln Lys Arg Ile Asn Leu 435 440 445
GTT AAT ACT TTC AGC CAT TAT ATC ACT GGC GCA TTA ACG CAA GTG CCA 1573 Val Asn Thr Phe Ser His Tyr Ile Thr Gly Ala Leu Thr Gln Val Pro 450 455 460
CCG GGT GGT AAA GAT GGT TTA CGC CCT GAT GGT ACA GCA TGG CGA CAT 1621 Pro Gly Gly Lys Asp Gly Leu Arg Pro Asp Gly Thr Ala Trp Arg His 465 470 475
GAA GGC AAC TAT CCG GGC TAC TCT TTC CCA GCC TTT AAA AAT GCC TCT 1669 Glu Gly Asn Tyr Pro Gly Tyr Ser Phe Pro Ala Phe Lys Asn Ala Ser 480 485 490
CAG CTT ATT TAT TTA TTA CGC GAT ACA CCA TTT TCA GTG GGT GAA AGT 1717 Gln Leu Ile Tyr Leu Leu Arg Asp Thr Pro Phe Ser Val Gly Glu Ser 495 500 505 510
GGT TGG AAT AAC CTG AAA AAA GCG ATG GTT TCA GCG TGG ATC TAC AGT 1765 Gly Trp Asn Asn Leu Lys Lys Ala Met Val Ser Ala Trp Ile Tyr Ser 515 520 525
AAT CCA GAA GTT GGA TTA CCG CTT GCA GGA AGA CAC CCT TTT AAC TCA 1813 Asn Pro Glu Val Gly Leu Pro Leu Ala Gly Arg His Pro Phe Asn Ser 530 535 540
CCT TCG TTA AAA TCA GTC GCT CAA GGC TAT TAC TGG CTT GCC ATG TCT 1861 Pro Ser Leu Lys Ser Val Ala Gln Gly Tyr Tyr Trp Leu Ala Met Ser 545 550 555
GCA AAA TCA TCG CCT GAT AAA ACA CTT GCA TCT ATT TAT CTT GCG ATT 1909 Ala Lys Ser Ser Pro Asp Lys Thr Leu Ala Ser Ile Tyr Leu Ala Ile 560 565 570
AGT GAT AAA ACA CAA AAT GAA TCA ACT GCT ATT TTT GGA GAA ACT ATT 1957 Ser Asp Lys Thr Gln Asn Glu Ser Thr Ala Ile Phe Gly Glu Thr Ile 575 580 585 590
ACA CCA GCG TCT TTA CCT CAA GGT TTC TAT GCC TTT AAT GGC GGT GCT 2005 Thr Pro Ala Ser Leu Pro Gln Gly Phe Tyr Ala Phe Asn Gly Gly Ala 595 600 605
TTT GGT ATT CAT CGT TGG CAA GAT AAA ATG GTG ACA CTG AAA GCT TAT 2053 Phe Gly Ile His Arg Trp Gln Asp Lys Met Val Thr Leu Lys Ala Tyr 610 615 620
AAC ACC AAT GTT TGG TCA TCT GAA ATT TAT AAC AAA GAT AAC CGT TAT 2101 Asn Thr Asn Val Trp Ser Ser Glu Ile Tyr Asn Lys Asp Asn Arg Tyr 625 630 635
GGC CGT TAC CAA AGT CAT GGT GTC GCT CAA ATA GTG AGT AAT GGC TCG 2149 Gly Arg Tyr Gln Ser His Gly Val Ala Gln Ile Val Ser Asn Gly Ser 640 645 650
CAG CTT TCA CAG GGC TAT CAG CAA GAA GGT TGG GAT TGG AAT AGA ATG 2197 Gln Leu Ser Gln Gly Tyr Gln Gln Glu Gly Trp Asp Trp Asn Arg Met 655 660 665 670
CAA GGG GCA ACC ACT ATT CAC CTT CCT CTT AAA GAC TTA GAC AGT CCT 2245 Gln Gly Ala Thr Thr Ile His Leu Pro Leu Lys Asp Leu Asp Ser Pro 675 680 685
AAA CCT CAT ACC TTA ATG CAA CGT GGA GAG CGT GGA TTT AGC GGA ACA 2293 Lys Pro His Thr Leu Met Gln Arg Gly Glu Arg Gly Phe Ser Gly Thr 690 695 700
TCA TCC CTT GAA GGT CAA TAT GGC ATG ATG GCA TTC GAT CTT ATT TAT 2341 Ser Ser Leu Glu Gly Gln Tyr Gly Met Met Ala Phe Asp Leu Ile Tyr 705 710 715
CCC GCC AAT CTT GAG CGT TTT GAT CCT AAT TTC ACT GCG AAA AAG AGT 2389 Pro Ala Asn Leu Glu Arg Phe Asp Pro Asn Phe Thr Ala Lys Lys Ser 720 725 730
GTA TTA GCC GCT GAT AAT CAC TTA ATT TTT ATT GGT AGC AAT ATA AAT 2437 Val Leu Ala Ala Asp Asn His Leu Ile Phe Ile Gly Ser Asn Ile Asn 735 740 745 750
AGT AGT GAT AAA AAT AAA AAT GTT GAA ACG ACC TTA TTC CAA CAT GCC 2485 Ser Ser Asp Lys Asn Lys Asn Val Glu Thr Thr Leu Phe Gln His Ala 755 760 765
ATT ACT CCA ACA TTA AAT ACC CTT TGG ATT AAT GGA CAA AAG ATA GAA 2533 Ile Thr Pro Thr Leu Asn Thr Leu Trp Ile Asn Gly Gln Lys Ile Glu 770 775 780
AAC ATG CCT TAT CAA ACA ACA CTT CAA CAA GGT GAT TGG TTA ATT GAT 2581 Asn Met Pro Tyr Gln Thr Thr Leu Gln Gln Gly Asp Trp Leu Ile Asp 785 790 795
AGC AAT GGC AAT GGT TAC TTA ATT ACT CAA GCA GAA AAA GTA AAT GTA 2629 Ser Asn Gly Asn Gly Tyr Leu Ile Thr Gln Ala Glu Lys Val Asn Val 800 805 810
AGT CGC CAA CAT CAG GTT TCA GCG GAA AAT AAA AAT CGC CAA CCG ACA 2677 Ser Arg Gln His Gln Val Ser Ala Glu Asn Lys Asn Arg Gln Pro Thr 815 820 825 830
GAA GGA AAC TTT AGC TCG GCA TGG ATC GAT CAC AGC ACT CGC CCC AAA 2725 Glu Gly Asn Phe Ser Ser Ala Trp Ile Asp His Ser Thr Arg Pro Lys 835 840 845
GAT GCC AGT TAT GAG TAT ATG GTC TTT TTA GAT GCG ACA CCT GAA AAA 2773 Asp Ala Ser Tyr Glu Tyr Met Val Phe Leu Asp Ala Thr Pro Glu Lys 850 855 860
ATG GGA GAG ATG GCA CAA AAA TTC CGT GAA AAT AAT GGG TTA TAT CAG 2821 Met Gly Glu Met Ala Gln Lys Phe Arg Glu Asn Asn Gly Leu Tyr Gln 865 870 875
GTT CTT CGT AAG GAT AAA GAC GTT CAT ATT ATT CTC GAT AAA CTC AGC 2869 Val Leu Arg Lys Asp Lys Asp Val His Ile Ile Leu Asp Lys Leu Ser 880 885 890
AAT GTA ACG GGA TAT GCC TTT TAT CAG CCA GCA TCA ATT GAA GAC AAA 2917 Asn Val Thr Gly Tyr Ala Phe Tyr Gln Pro Ala Ser Ile Glu Asp Lys 895 900 905 910
TGG ATC AAA AAG GTT AAT AAA CCT GCA ATT GTG ATG ACT CAT CGA CAA 2965 Trp Ile Lys Lys Val Asn Lys Pro Ala Ile Val Met Thr His Arg Gln 915 920 925
AAA GAC ACT CTT ATT GTC AGT GCA GTT ACA CCT GAT TTA AAT ATG ACT 3013 Lys Asp Thr Leu Ile Val Ser Ala Val Thr Pro Asp Leu Asn Met Thr 930 935 940
CGC CAA AAA GCA GCA ACT CCT GTC ACC ATC AAT GTC ACG ATT AAT GGC 3061 Arg Gln Lys Ala Ala Thr Pro Val Thr Ile Asn Val Thr Ile Asn Gly 945 950 955
AAA TGG CAA TCT GCT GAT AAA AAT AGT GAA GTG AAA TAT CAG GTT TCT 3109 Lys Trp Gln Ser Ala Asp Lys Asn Ser Glu Val Lys Tyr Gln Val Ser 960 965 970
GGT GAT AAC ACT GAA CTG ACG TTT ACG AGT TAC TTT GGT ATT CCA CAA 3157 Gly Asp Asn Thr Glu Leu Thr Phe Thr Ser Tyr Phe Gly Ile Pro Gln 975 980 985 990
GAA ATC AAA CTC TCG CCA CTC CCT TGATTTAATC AAAAGAACGC TCTTGCGTTC 3211 Glu Ile Lys Leu Ser Pro Leu Pro 995
CTTTTTTATT TGCAGGAAAT CTGATTATGC TAATAAAAAA CCCTTTAGCC CACGCGGTTA 3271
CATTAAGCCT CTGTTTATCA TTACCCGCAC AAGCATTACC CACTCTGTCT CATGAAGCTT 3331
TCGGCGATAT TTATCTTTTT GAAGGTGAAT TACCCAATAC CCTTACCACT TCAAATAATA 3391
ATCAATTATC GCTAAGCAAA CAGCATGCTA AAGATGGTGA ACAATCACTC AAATGGCAAT 3451
ATCAACCACA AGCAACATTA ACACTAAATA ATATTGTTAA TTACCAAGAT GATAAAAATA 3511
CAGCCACACC ACTCACTTTT ATGATGTGGA TTTATAATGA AAAACCTCAA TCTTCCCCAT 3571
TAACGTTAGC ATTTAAACAA AATAATAAAA TTGCACTAAG TTTTAATGCT GAACTTAATT 3631
TTACGGGGTG GCGAGGTATT GCTGTTCCTT TTCGTGATAT GCAAGGCTCT GCGACAGGTC 3691
AACTTGATCA ATTAGTGATC ACCGCTCCAA ACCAAGCCGG AACACTCTTT TTTGATCAAA 3751
TCATCATGAG TGTACCGTTA GACAATCGTT GGGCAGTACC TGACTATCAA ACACCTTACG 3811
TAAATAACGC AGTAAACACG ATGGTTAGTA AAAACTGGAG TGCATTATTG ATGTACGATC 3871
AGATGTTTCA AGCCCATTAC CCTACTTTAA ACTTCGATAC TGAATTTCGC GATGACCAAA 3931
CAGAAATGGC TTCGATTTAT CAGCGCTTTG AATATTATCA AGGAATTCC 3980
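The CDS feature given for SEQ ID NO:4 (LOCATION: 188..3181) can be cross-checked against the 998-residue translation with simple arithmetic. The following is a minimal sketch, not part of the original listing; the helper name `cds_codon_count` is hypothetical, and it assumes the location string uses 1-based inclusive coordinates, as the running position counters in the listing suggest.

```python
def cds_codon_count(location: str) -> int:
    """Return the number of whole codons in a 'start..end' CDS location.

    Assumes 1-based, inclusive coordinates (an assumption inferred from
    the running counters printed at the end of each sequence row).
    """
    start_text, end_text = location.split("..")
    start, end = int(start_text), int(end_text)
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    return length // 3


# SEQ ID NO:4 annotates the coding region as 188..3181.
codons = cds_codon_count("188..3181")
print(codons)
```

Under these assumptions the span covers 2994 nucleotides, so the annotated region holds exactly as many codons as the translation has residues, and the TGA that follows position 3181 lies outside the annotated span.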
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 998 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
Met Ala Thr Ser Asn Pro Ala Phe Asp Pro Lys Asn Leu Met Gln Ser 1 5 10 15
Glu Ile Tyr His Phe Ala Gln Asn Asn Pro Leu Ala Asp Phe Ser Ser 20 25 30
Asp Lys Asn Ser Ile Leu Thr Leu Ser Asp Lys Arg Ser Ile Met Gly 35 40 45
Asn Gln Ser Leu Leu Trp Lys Trp Lys Gly Gly Ser Ser Phe Thr Leu 50 55 60
His Lys Lys Leu Ile Val Pro Thr Asp Lys Glu Ala Ser Lys Ala Trp 65 70 75 80
Gly Arg Ser Ser Thr Pro Val Phe Ser Phe Trp Leu Tyr Asn Glu Lys 85 90 95
Pro Ile Asp Gly Tyr Leu Thr Ile Asp Phe Gly Glu Lys Leu Ile Ser 100 105 110
Thr Ser Glu Ala Gln Ala Gly Phe Lys Val Lys Leu Asp Phe Thr Gly 115 120 125
Trp Arg Ala Val Gly Val Ser Leu Asn Asn Asp Leu Glu Asn Arg Glu 130 135 140
Met Thr Leu Asn Ala Thr Asn Thr Ser Ser Asp Gly Thr Gln Asp Ser 145 150 155 160
Ile Gly Arg Ser Leu Gly Ala Lys Val Asp Ser Ile Arg Phe Lys Ala 165 170 175
Pro Ser Asn Val Ser Gln Gly Glu Ile Tyr Ile Asp Arg Ile Met Phe 180 185 190
Ser Val Asp Asp Ala Arg Tyr Gln Trp Ser Asp Tyr Gln Val Lys Thr 195 200 205
Arg Leu Ser Glu Pro Glu Ile Gln Phe His Asn Val Lys Pro Gln Leu 210 215 220
Pro Val Thr Pro Glu Asn Leu Ala Ala Ile Asp Leu Ile Arg Gln Arg 225 230 235 240
Leu Ile Asn Glu Phe Val Gly Gly Glu Lys Glu Thr Asn Leu Ala Leu 245 250 255
Glu Glu Asn Ile Ser Lys Leu Lys Ser Asp Phe Asp Ala Leu Asn Ile 260 265 270
His Thr Leu Ala Asn Gly Gly Thr Gln Gly Arg His Leu Ile Thr Asp 275 280 285
Lys Gln Ile Ile Ile Tyr Gln Pro Glu Asn Leu Asn Ser Gln Asp Lys 290 295 300
Gln Leu Phe Asp Asn Tyr Val Ile Leu Gly Asn Tyr Thr Thr Leu Met 305 310 315 320
Phe Asn Ile Ser Arg Ala Tyr Val Leu Glu Lys Asp Pro Thr Gln Lys 325 330 335
Ala Gln Leu Lys Gln Met Tyr Leu Leu Met Thr Lys His Leu Leu Asp 340 345 350
Gln Gly Phe Val Lys Gly Ser Ala Leu Val Thr Thr His His Trp Gly 355 360 365
Tyr Ser Ser Arg Trp Trp Tyr Ile Ser Thr Leu Leu Met Ser Asp Ala 370 375 380
Leu Lys Glu Ala Asn Leu Gln Thr Gln Val Tyr Asp Ser Leu Leu Trp 385 390 395 400
Tyr Ser Arg Glu Phe Lys Ser Ser Phe Asp Met Lys Val Ser Ala Asp 405 410 415
Ser Ser Asp Leu Asp Tyr Phe Asn Thr Leu Ser Arg Gln His Leu Ala 420 425 430
Leu Leu Leu Leu Glu Pro Asp Asp Gln Lys Arg Ile Asn Leu Val Asn 435 440 445
Thr Phe Ser His Tyr Ile Thr Gly Ala Leu Thr Gln Val Pro Pro Gly 450 455 460
Gly Lys Asp Gly Leu Arg Pro Asp Gly Thr Ala Trp Arg His Glu Gly 465 470 475 480
Asn Tyr Pro Gly Tyr Ser Phe Pro Ala Phe Lys Asn Ala Ser Gln Leu 485 490 495
Ile Tyr Leu Leu Arg Asp Thr Pro Phe Ser Val Gly Glu Ser Gly Trp 500 505 510
Asn Asn Leu Lys Lys Ala Met Val Ser Ala Trp Ile Tyr Ser Asn Pro 515 520 525
Glu Val Gly Leu Pro Leu Ala Gly Arg His Pro Phe Asn Ser Pro Ser 530 535 540
Leu Lys Ser Val Ala Gln Gly Tyr Tyr Trp Leu Ala Met Ser Ala Lys 545 550 555 560
Ser Ser Pro Asp Lys Thr Leu Ala Ser Ile Tyr Leu Ala Ile Ser Asp 565 570 575
Lys Thr Gln Asn Glu Ser Thr Ala Ile Phe Gly Glu Thr Ile Thr Pro 580 585 590
Ala Ser Leu Pro Gln Gly Phe Tyr Ala Phe Asn Gly Gly Ala Phe Gly 595 600 605
Ile His Arg Trp Gln Asp Lys Met Val Thr Leu Lys Ala Tyr Asn Thr 610 615 620
Asn Val Trp Ser Ser Glu Ile Tyr Asn Lys Asp Asn Arg Tyr Gly Arg 625 630 635 640
Tyr Gln Ser His Gly Val Ala Gln Ile Val Ser Asn Gly Ser Gln Leu 645 650 655
Ser Gln Gly Tyr Gln Gln Glu Gly Trp Asp Trp Asn Arg Met Gln Gly 660 665 670
Ala Thr Thr Ile His Leu Pro Leu Lys Asp Leu Asp Ser Pro Lys Pro 675 680 685
His Thr Leu Met Gln Arg Gly Glu Arg Gly Phe Ser Gly Thr Ser Ser 690 695 700
Leu Glu Gly Gln Tyr Gly Met Met Ala Phe Asp Leu Ile Tyr Pro Ala 705 710 715 720
Asn Leu Glu Arg Phe Asp Pro Asn Phe Thr Ala Lys Lys Ser Val Leu 725 730 735
Ala Ala Asp Asn His Leu Ile Phe Ile Gly Ser Asn Ile Asn Ser Ser 740 745 750
Asp Lys Asn Lys Asn Val Glu Thr Thr Leu Phe Gln His Ala Ile Thr 755 760 765
Pro Thr Leu Asn Thr Leu Trp Ile Asn Gly Gln Lys Ile Glu Asn Met 770 775 780
Pro Tyr Gln Thr Thr Leu Gln Gln Gly Asp Trp Leu Ile Asp Ser Asn 785 790 795 800
Gly Asn Gly Tyr Leu Ile Thr Gln Ala Glu Lys Val Asn Val Ser Arg 805 810 815
Gln His Gln Val Ser Ala Glu Asn Lys Asn Arg Gln Pro Thr Glu Gly 820 825 830
Asn Phe Ser Ser Ala Trp Ile Asp His Ser Thr Arg Pro Lys Asp Ala 835 840 845
Ser Tyr Glu Tyr Met Val Phe Leu Asp Ala Thr Pro Glu Lys Met Gly 850 855 860
Glu Met Ala Gln Lys Phe Arg Glu Asn Asn Gly Leu Tyr Gln Val Leu 865 870 875 880
Arg Lys Asp Lys Asp Val His Ile Ile Leu Asp Lys Leu Ser Asn Val 885 890 895
Thr Gly Tyr Ala Phe Tyr Gln Pro Ala Ser Ile Glu Asp Lys Trp Ile 900 905 910
Lys Lys Val Asn Lys Pro Ala Ile Val Met Thr His Arg Gln Lys Asp 915 920 925
Thr Leu Ile Val Ser Ala Val Thr Pro Asp Leu Asn Met Thr Arg Gln 930 935 940
Lys Ala Ala Thr Pro Val Thr Ile Asn Val Thr Ile Asn Gly Lys Trp 945 950 955 960
Gln Ser Ala Asp Lys Asn Ser Glu Val Lys Tyr Gln Val Ser Gly Asp 965 970 975
Asn Thr Glu Leu Thr Phe Thr Ser Tyr Phe Gly Ile Pro Gln Glu Ile 980 985 990
Lys Leu Ser Pro Leu Pro 995
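The protein listing above can be reproduced from the nucleotide listing by translating codons with the standard genetic code. The following is a minimal sketch, not part of the original listing; the function names are hypothetical and the use of the standard code is an assumption. The test input is the first 14 codons of the coding region as printed in SEQ ID NO:4.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}

# One-letter to three-letter names, as used in the sequence listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1 to one-letter residues; '*' marks a stop.

    Whitespace is ignored and a trailing partial codon is dropped.
    """
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def three_letter(protein: str) -> str:
    """Render one-letter residues in the listing's three-letter style."""
    return " ".join(THREE_LETTER[aa] for aa in protein)


# First 14 codons of the CDS in SEQ ID NO:4 (positions 188 to 229).
first_codons = "ATG GCC ACC AGC AAT CCT GCA TTT GAT CCT AAA AAT CTG ATG"
print(three_letter(translate(first_codons)))
```

Running the same function over the whole coding region is a quick way to spot residue names in the listing that disagree with their codons.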
(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: CAYTTYGCNC ARAAYAAYCC N 21
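SEQ ID NO:6 is written with IUPAC ambiguity codes (Y = C or T, R = A or G, N = any base), so it stands for a pool of oligonucleotides rather than one sequence. The following is a minimal sketch, not part of the original listing, that counts the pool size and checks whether a concrete sequence belongs to it. The helper names are hypothetical. The 21-mer used for the check is taken from the codons printed for His Phe Ala Gln Asn Asn Pro (residues 20 to 26) in SEQ ID NO:4.

```python
import re

# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def primer_regex(primer: str) -> "re.Pattern[str]":
    """Compile a regular expression matching every member of the pool."""
    return re.compile("".join(f"[{IUPAC[base]}]" for base in primer))


SEQ_ID_6 = "CAYTTYGCNCARAAYAAYCCN"
# Codons CAT TTT GCA CAA AAT AAC CCA as printed in SEQ ID NO:4.
GENOMIC_21MER = "CATTTTGCACAAAATAACCCA"

print(degeneracy(SEQ_ID_6))
print(bool(primer_regex(SEQ_ID_6).fullmatch(GENOMIC_21MER)))
```

The same two functions apply unchanged to the shorter, less degenerate 20-mers that follow (SEQ ID NOS:7 to 14), each of which fixes some of the ambiguous positions.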
(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: CACTTCGCNC AAAATAATCC 20
(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: CACTTCGCNC AAAACAACCC 20
(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: CACTTCGCNC AAAACAATCC 20
(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10: CACTTCGCNC AAAATAACCC 20
(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11: CACTTCGCNC AGAATAATCC 20
(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12: CACTTCGCNC AGAACAACCC 20
(2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: CACTTCGCNC AGAACAATCC 20
(2) INFORMATION FOR SEQ ID NO:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: CACTTCGCNC AGAATAACCC 20
(2) INFORMATION FOR SEQ ID NO:15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15: GARGCNCARG CNGGNTTYAA R 21
(2) INFORMATION FOR SEQ ID NO:16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: YTTRAANCCN GCYTGNGCYT C 21
(2) INFORMATION FOR SEQ ID NO:17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17: TTGAARCCNG CYTGGGCTTC 20
(2) INFORMATION FOR SEQ ID NO:18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: TTGAARCCNG CYTGAGCTTC 20
(2) INFORMATION FOR SEQ ID NO:19:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: TTGAARCCNG CYTGTGCTTC 20
(2) INFORMATION FOR SEQ ID NO:20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: TTGAARCCNG CYTGCGCTTC 20
(2) INFORMATION FOR SEQ ID NO:21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21: TTGAARCCNG CYTGGGCCTC 20
(2) INFORMATION FOR SEQ ID NO:22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
TTGAARCCNG CYTGAGCCTC 20
(2) INFORMATION FOR SEQ ID NO:23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: TTGAARCCNG CYTGTGCCTC 20
(2) INFORMATION FOR SEQ ID NO:24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: TTGAARCCNG CYTGCGCCTC 20
(2) INFORMATION FOR SEQ ID NO:25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: GGNGCNAARG TNGAYTCN 18
(2) INFORMATION FOR SEQ ID NO:26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: GGNGCNAARG TNGAYAGY 18
(2) INFORMATION FOR SEQ ID NO:27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27: NGARTCNACY TTNGCNCC 18
(2) INFORMATION FOR SEQ ID NO:28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: RCTRTCNACY TTNGCNCC 18
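The anti-sense entries in this listing appear to be reverse complements of the corresponding sense oligonucleotides: SEQ ID NO:16 of SEQ ID NO:15, SEQ ID NO:27 of SEQ ID NO:25, and SEQ ID NO:28 of SEQ ID NO:26. The following is a minimal sketch, not part of the original listing, that checks this pairing. The function name is hypothetical, and the complement table follows the usual IUPAC convention in which ambiguity codes are complemented as sets (R with Y, K with M, B with V, D with H, while S, W and N are unchanged).

```python
# Complement table for IUPAC nucleotide codes, including ambiguity codes.
_COMPLEMENT = str.maketrans(
    "ACGTRYSWKMBDHVN",
    "TGCAYRSWMKVHDBN",
)


def reverse_complement(seq: str) -> str:
    """Reverse-complement a DNA sequence that may contain IUPAC codes.

    Spaces (as used for the 10-base groups in the listing) are removed.
    """
    cleaned = seq.replace(" ", "").upper()
    return cleaned.translate(_COMPLEMENT)[::-1]


# Sense / anti-sense pairs as printed in the listing.
PAIRS = {
    "GARGCNCARG CNGGNTTYAA R": "YTTRAANCCN GCYTGNGCYT C",  # NO:15 / NO:16
    "GGNGCNAARG TNGAYTCN": "NGARTCNACY TTNGCNCC",          # NO:25 / NO:27
    "GGNGCNAARG TNGAYAGY": "RCTRTCNACY TTNGCNCC",          # NO:26 / NO:28
}

for sense, antisense in PAIRS.items():
    print(reverse_complement(sense) == antisense.replace(" ", ""))
```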
(2) INFORMATION FOR SEQ ID NO:29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29: GAGTCNACYT TRGCGCC 17
(2) INFORMATION FOR SEQ ID NO:30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30: GAGTCNACYT TRGCACC 17
(2) INFORMATION FOR SEQ ID NO:31:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31: GAGTCNACYT TRGCTCC 17
(2) INFORMATION FOR SEQ ID NO:32:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32: GAGTCNACYT TRGCCCC 17
(2) INFORMATION FOR SEQ ID NO:33:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: GAGTCNACYT TYGCGCC 17
(2) INFORMATION FOR SEQ ID NO:34:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34: GAGTCNACYT TYGCACC 17
(2) INFORMATION FOR SEQ ID NO:35:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35: GAGTCNACYT TYGCTCC 17
(2) INFORMATION FOR SEQ ID NO:36:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 17 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36: GAGTCNACYT TYGCCCC 17
(2) INFORMATION FOR SEQ ID NO:37:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 48 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: GCCAGCGTTT CTAAGGAGAA AACATATGCC GATATTTCGT TTTACTGC 48
(2) INFORMATION FOR SEQ ID NO:38:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38: GCGCCTTATA ACGCGCATAT GGCCACCAGC AATCCTG 37
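SEQ ID NOS:37 and 38 both contain the hexamer CATATG, which is the recognition sequence of the restriction enzyme NdeI and overlaps an ATG codon. The following is a minimal sketch, not part of the original listing, that compares SEQ ID NO:37 with the matching 48-base window of the genomic sequence and reports the positions that differ. The window string and its 1-based offset of 94 were read off the rows of SEQ ID NO:39 as printed and are assumptions. The function name is hypothetical. The interpretation that the oligonucleotide was designed to introduce an NdeI site is an inference that this listing does not state.

```python
NDEI_SITE = "CATATG"  # NdeI recognition sequence

# SEQ ID NO:37 as printed, with the grouping spaces removed.
OLIGO_37 = "GCCAGCGTTTCTAAGGAGAAAACATATGCCGATATTTCGTTTTACTGC"

# Assumed wild-type window: positions 94..141 of SEQ ID NO:39 as printed.
WINDOW_START = 94
WILD_TYPE = "GCCAGCGTTTCTAAGGAGAAAAATAATGCCGATATTTCGTTTTACTGC"


def mismatches(wild_type: str, oligo: str, start: int) -> list[tuple[int, str, str]]:
    """List (position, wild-type base, oligo base) for each difference.

    Positions are 1-based coordinates in the full sequence, given that
    the window begins at position `start`. Both strings must have the
    same length; no gaps are modelled.
    """
    if len(wild_type) != len(oligo):
        raise ValueError("window and oligonucleotide must be the same length")
    return [
        (start + i, w, o)
        for i, (w, o) in enumerate(zip(wild_type, oligo))
        if w != o
    ]


changes = mismatches(WILD_TYPE, OLIGO_37, WINDOW_START)
print(changes)
print(NDEI_SITE in OLIGO_37, NDEI_SITE in WILD_TYPE)
```

Under these assumptions the oligonucleotide differs from the genomic window at three adjacent positions just upstream of an ATG, and those changes create the CATATG hexamer.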
(2) INFORMATION FOR SEQ ID NO:39:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 6519 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic) (iii) HYPOTHETICAL: NO (iv) ANTI-SENSE: NO
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 3238..6276
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:
GGAATTCCAT CACTCAATCA TTAAATTTAG GCACAACGAT GGGCTATCAG CGTTATGACA 60
AATTTAATGA AGGACGCATT GGTTTCACTG TTAGCCAGCG TTTCTAAGGA GAAAAATAAT 120
GCCGATATTT CGTTTTACTG CACTTGCAAT GACATTGGGG CTATTATCAG CGCCTTATAA 180
CGCGATGGCA GCCACCAGCA ATCCTGCATT TGATCCTAAA AATCTGATGC AGTCAGAAAT 240
TTACCATTTT GCACAAAATA ACCCATTAGC AGACTTCTCA TCAGATAAAA ACTCAATACT 300
AACGTTATCT GATAAACGTA GCATTATGGG AAACCAATCT CTTTTATGGA AATGGAAAGG 360
TGGTAGTAGC TTTACTTTAC ATAAAAAACT GATTGTCCCC ACCGATAAAG AAGCATCTAA 420
AGCATGGGGA CGCTCATCTA CCCCCGTTTT CTCATTTTGG CTTTACAATG AAAAACCGAT 480
TGATGGTTAT CTTACTATCG ATTTCGGAGA AAAACTCATT TCAACCAGTG AGGCTCAGGC 540
AGGCTTTAAA GTAAAATTAG ATTTCACTGG CTGGCGTGCT GTGGGAGTCT CTTTAAATAA 600
CGATCTTGAA AATCGAGAGA TGACCTTAAA TGCAACCAAT ACCTCCTCTG ATGGTACTCA 660
AGACAGCATT GGGCGTTCTT TAGGTGCTAA AGTCGATAGT ATTCGTTTTA AAGCGCCTTC 720
TAATGTGAGT CAGGGTGAAA TCTATATCGA CCGTATTATG TTTTCTGTCG ATGATGCTCG 780
CTACCAATGG TCTGATTATC AAGTAAAAAC TCGCTTATCA GAACCTGAAA TTCAATTTCA 840
CAACGTAAAG CCACAACTAC CTGTAACACC TGAAAATTTA GCGGCCATTG ATCTTATTCG 900
CCAACGTCTA ATTAATGAAT TTGTCGGAGG TGAAAAAGAG ACAAACCTCG CATTAGAAGA 960
GAATATCAGC AAATTAAAAA GTGATTTCGA TGCTCTTAAT ATTCACACTT TAGCAAATGG 1020
TGGAACGCAA GGCAGACATC TGATCACTGA TAAACAAATC ATTATTTATC AACCAGAGAA 1080
TCTTAACTCC CAAGATAAAC AACTATTTGA TAATTATGTT ATTTTAGGTA ATTACACGAC 1140
ATTAATGTTT AATATTAGCC GTGCTTATGT GCTGGAAAAA GATCCCACAC AAAAGGCGCA 1200
ACTAAAGCAG ATGTACTTAT TAATGACAAA GCATTTATTA GATCAAGGCT TTGTTAAAGG 1260
GAGTGCTTTA GTGACAACCC ATCACTGGGG ATACAGTTCT CGTTGGTGGT ATATTTCCAC 1320
GTTATTAATG TCTGATGCAC TAAAAGAAGC GAACCTACAA ACTCAAGTTT ATGATTCATT 1380
ACTGTGGTAT TCACGTGAGT TTAAAAGTAG TTTTGATATG AAAGTAAGTG CTGATAGCTC 1440
TGATCTAGAT TATTTCAATA CCTTATCTCG CCAACATTTA GCCTTATTAT TACTAGAGCC 1500
TGATGATCAA AAGCGTATCA ACTTAGTTAA TACTTTCAGC CATTATATCA CTGGCGCATT 1560
AACGCAAGTG CCACCGGGTG GTAAAGATGG TTTACGCCCT GATGGTACAG CATGGCGACA 1620
TGAAGGCAAC TATCCGGGCT ACTCTTTCCC AGCCTTTAAA AATGCCTCTC AGCTTATTTA 1680
TTTATTACGC GATACACCAT TTTCAGTGGG TGAAAGTGGT TGGAATAACC TGAAAAAAGC 1740
GATGGTTTCA GCGTGGATCT ACAGTAATCC AGAAGTTGGA TTACCGCTTG CAGGAAGACA 1800
CCCTTTTAAC TCACCTTCGT TAAAATCAGT CGCTCAAGGC TATTACTGGC TTGCCATGTC 1860
TGCAAAATCA TCGCCTGATA AAACACTTGC ATCTATTTAT CTTGCGATTA GTGATAAAAC 1920
ACAAAATGAA TCAACTGCTA TTTTTGGAGA AACTATTACA CCAGCGTCTT TACCTCAAGG 1980
TTTCTATGCC TTTAATGGCG GTGCTTTTGG TATTCATCGT TGGCAAGATA AAATGGTGAC 2040
ACTGAAAGCT TATAACACCA ATGTTTGGTC ATCTGAAATT TATAACAAAG ATAACCGTTA 2100
TGGCCGTTAC CAAAGTCATG GTGTCGCTCA AATAGTGAGT AATGGCTCGC AGCTTTCACA 2160
GGGCTATCAG CAAGAAGGTT GGGATTGGAA TAGAATGCAA GGGGCAACCA CTATTCACCT 2220
TCCTCTTAAA GACTTAGACA GTCCTAAACC TCATACCTTA ATGCAACGTG GAGAGCGTGG 2280
ATTTAGCGGA ACATCATCCC TTGAAGGTCA ATATGGCATG ATGGCATTCG ATCTTATTTA 2340
TCCCGCCAAT CTTGAGCGTT TTGATCCTAA TTTCACTGCG AAAAAGAGTG TATTAGCCGC 2400
TGATAATCAC TTAATTTTTA TTGGTAGCAA TATAAATAGT AGTGATAAAA ATAAAAATGT 2460
TGAAACGACC TTATTCCAAC ATGCCATTAC TCCAACATTA AATACCCTTT GGATTAATGG 2520
ACAAAAGATA GAAAACATGC CTTATCAAAC AACACTTCAA CAAGGTGATT GGTTAATTGA 2580
TAGCAATGGC AATGGTTACT TAATTACTCA AGCAGAAAAA GTAAATGTAA GTCGCCAACA 2640
TCAGGTTTCA GCGGAAAATA AAAATCGCCA ACCGACAGAA GGAAACTTTA GCTCGGCATG 2700
GATCGATCAC AGCACTCGCC CCAAAGATGC CAGTTATGAG TATATGGTCT TTTTAGATGC 2760
GACACCTGAA AAAATGGGAG AGATGGCACA AAAATTCCGT GAAAATAATG GGTTATATCA 2820
GGTTCTTCGT AAGGATAAAG ACGTTCATAT TATTCTCGAT AAACTCAGCA ATGTAACGGG 2880
ATATGCCTTT TATCAGCCAG CATCAATTGA AGACAAATGG ATCAAAAAGG TTAATAAACC 2940
TGCAATTGTG ATGACTCATC GACAAAAAGA CACTCTTATT GTCAGTGCAG TTACACCTGA 3000
TTTAAATATG ACTCGCCAAA AAGCAGCAAC TCCTGTCACC ATCAATGTCA CGATTAATGG 3060
CAAATGGCAA TCTGCTGATA AAAATAGTGA AGTGAAATAT CAGGTTTCTG GTGATAACAC 3120
TGAACTGACG TTTACGAGTT ACTTTGGTAT TCCACAAGAA ATCAAACTCT CGCCACTCCC 3180
TTGATTTAAT CAAAAGAACG CTCTTGCGTT CCTTTTTTAT TTGCAGGAAA TCTGATT 3237
ATG CTA ATA AAA AAC CCT TTA GCC CAC GCG GTT ACA TTA AGC CTC TGT 3285 Met Leu Ile Lys Asn Pro Leu Ala His Ala Val Thr Leu Ser Leu Cys 1 5 10 15
TTA TCA TTA CCC GCA CAA GCA TTA CCC ACT CTG TCT CAT GAA GCT TTC 3333 Leu Ser Leu Pro Ala Gln Ala Leu Pro Thr Leu Ser His Glu Ala Phe 20 25 30
GGC GAT ATT TAT CTT TTT GAA GGT GAA TTA CCC AAT ACC CTT ACC ACT 3381 Gly Asp Ile Tyr Leu Phe Glu Gly Glu Leu Pro Asn Thr Leu Thr Thr 35 40 45
TCA AAT AAT AAT CAA TTA TCG CTA AGC AAA CAG CAT GCT AAA GAT GGT 3429 Ser Asn Asn Asn Gln Leu Ser Leu Ser Lys Gln His Ala Lys Asp Gly 50 55 60
GAA CAA TCA CTC AAA TGG CAA TAT CAA CCA CAA GCA ACA TTA ACA CTA 3477 Glu Gln Ser Leu Lys Trp Gln Tyr Gln Pro Gln Ala Thr Leu Thr Leu 65 70 75 80
AAT AAT ATT GTT AAT TAC CAA GAT GAT AAA AAT ACA GCC ACA CCA CTC 3525 Asn Asn Ile Val Asn Tyr Gln Asp Asp Lys Asn Thr Ala Thr Pro Leu 85 90 95
ACT TTT ATG ATG TGG ATT TAT AAT GAA AAA CCT CAA TCT TCC CCA TTA 3573 Thr Phe Met Met Trp Ile Tyr Asn Glu Lys Pro Gln Ser Ser Pro Leu 100 105 110
ACG TTA GCA TTT AAA CAA AAT AAT AAA ATT GCA CTA AGT TTT AAT GCT 3621 Thr Leu Ala Phe Lys Gln Asn Asn Lys Ile Ala Leu Ser Phe Asn Ala 115 120 125
GAA CTT AAT TTT ACG GGG TGG CGA GGT ATT GCT GTT CCT TTT CGT GAT 3669 Glu Leu Asn Phe Thr Gly Trp Arg Gly Ile Ala Val Pro Phe Arg Asp 130 135 140
ATG CAA GGC TCT GCG ACA GGT CAA CTT GAT CAA TTA GTG ATC ACC GCT 3717 Met Gln Gly Ser Ala Thr Gly Gln Leu Asp Gln Leu Val Ile Thr Ala 145 150 155 160
CCA AAC CAA GCC GGA ACA CTC TTT TTT GAT CAA ATC ATC ATG AGT GTA 3765 Pro Asn Gln Ala Gly Thr Leu Phe Phe Asp Gln Ile Ile Met Ser Val 165 170 175
CCG TTA GAC AAT CGT TGG GCA GTA CCT GAC TAT CAA ACA CCT TAC GTA 3813 Pro Leu Asp Asn Arg Trp Ala Val Pro Asp Tyr Gln Thr Pro Tyr Val 180 185 190
AAT AAC GCA GTA AAC ACG ATG GTT AGT AAA AAC TGG AGT GCA TTA TTG 3861 Asn Asn Ala Val Asn Thr Met Val Ser Lys Asn Trp Ser Ala Leu Leu 195 200 205
ATG TAC GAT CAG ATG TTT CAA GCC CAT TAC CCT ACT TTA AAC TTC GAT 3909 Met Tyr Asp Gln Met Phe Gln Ala His Tyr Pro Thr Leu Asn Phe Asp 210 215 220
ACT GAA TTT CGC GAT GAC CAA ACA GAA ATG GCT TCG ATT TAT CAG CGC 3957 Thr Glu Phe Arg Asp Asp Gln Thr Glu Met Ala Ser Ile Tyr Gln Arg 225 230 235 240
TTT GAA TAT TAT CAA GGA ATT CGT AGT GAT AAA AAA ATT ACT CCA GAT 4005 Phe Glu Tyr Tyr Gln Gly Ile Arg Ser Asp Lys Lys Ile Thr Pro Asp 245 250 255
ATG CTA GAT AAA CAT TTA GCA TTA TGG GAA AAA TTG GTG TTA ACA CAA 4053 Met Leu Asp Lys His Leu Ala Leu Trp Glu Lys Leu Val Leu Thr Gln 260 265 270
CAC GCT GAT GGC TCA ATC ACA GGA AAA GCC CTT GAT CAC CCT AAC CGG 4101 His Ala Asp Gly Ser Ile Thr Gly Lys Ala Leu Asp His Pro Asn Arg 275 280 285
CAA CAT TTT ATG AAA GTC GAA GGT GTA TTT AGT GAG GGG ACT CAA AAA 4149 Gln His Phe Met Lys Val Glu Gly Val Phe Ser Glu Gly Thr Gln Lys 290 295 300
GCA TTA CTT GAT GCC AAT ATG CTA AGA GAT GTG GGC AAA ACG CTT CTT 4197 Ala Leu Leu Asp Ala Asn Met Leu Arg Asp Val Gly Lys Thr Leu Leu 305 310 315 320
CAA ACT GCT ATT TAC TTG CGT AGC GAT TCA TTA TCA GCA ACT GAT AGA 4245 Gln Thr Ala Ile Tyr Leu Arg Ser Asp Ser Leu Ser Ala Thr Asp Arg 325 330 335
AAA AAA TTA GAA GAG CGC TAT TTA TTA GGT ACT CGT TAT GTC CTT GAA 4293 Lys Lys Leu Glu Glu Arg Tyr Leu Leu Gly Thr Arg Tyr Val Leu Glu 340 345 350
CAA GGT TTT ACA CGA GGA AGT GGT TAT CAA ATT ATT ACT CAT GTT GGT 4341 Gln Gly Phe Thr Arg Gly Ser Gly Tyr Gln Ile Ile Thr His Val Gly 355 360 365
TAC CAA ACC AGA GAA CTT TTT GAT GCA TGG TTT ATT GGC CGT CAT GTT 4389 Tyr Gln Thr Arg Glu Leu Phe Asp Ala Trp Phe Ile Gly Arg His Val 370 375 380
CTT GCA AAA AAT AAC CTT TTA GCC CCC ACT CAA CAA GCT ATG ATG TGG 4437 Leu Ala Lys Asn Asn Leu Leu Ala Pro Thr Gln Gln Ala Met Met Trp 385 390 395 400
TAC AAC GCC ACA GGA CGT ATT TTT GAA AAA AAT AAT GAA ATT GTT GAT 4485 Tyr Asn Ala Thr Gly Arg Ile Phe Glu Lys Asn Asn Glu Ile Val Asp 405 410 415
GCA AAT GTC GAT ATT CTC AAT ACT CAA TTG CAA TGG ATG ATA AAA AGC 4533 Ala Asn Val Asp Ile Leu Asn Thr Gln Leu Gln Trp Met Ile Lys Ser 420 425 430
TTA TTG ATG CTA CCG GAT TAT CAA CAA CGT CAA CAA GCC TTA GCG CAA 4581 Leu Leu Met Leu Pro Asp Tyr Gln Gln Arg Gln Gln Ala Leu Ala Gln 435 440 445
CTG CAA AGT TGG CTA AAT AAA ACC ATT CTA AGC TCA AAA GGT GTT GCT 4629 Leu Gln Ser Trp Leu Asn Lys Thr Ile Leu Ser Ser Lys Gly Val Ala 450 455 460
GGC GGT TTC AAA TCT GAT GGT TCT ATT TTT CAC CAT TCA CAA CAT TAC 4677 Gly Gly Phe Lys Ser Asp Gly Ser Ile Phe His His Ser Gln His Tyr 465 470 475 480
CCC GCT TAT GCT AAA GAT GCA TTT GGT GGT TTA GCA CCC AGT GTT TAT 4725 Pro Ala Tyr Ala Lys Asp Ala Phe Gly Gly Leu Ala Pro Ser Val Tyr 485 490 495
GCA TTA AGT GAT TCA CCT TTT CGC TTA TCT ACT TCA GCA CAT GAG CGT 4773 Ala Leu Ser Asp Ser Pro Phe Arg Leu Ser Thr Ser Ala His Glu Arg 500 505 510
TTA AAA GAT GTT TTG TTA AAA ATG CGG ATC TAC ACC AAA GAG ACA CAA 4821 Leu Lys Asp Val Leu Leu Lys Met Arg Ile Tyr Thr Lys Glu Thr Gln 515 520 525
ATT CCT GTG GTA TTA AGT GGT CGT CAT CCA ACT GGG TTG CAT AAA ATA 4869 Ile Pro Val Val Leu Ser Gly Arg His Pro Thr Gly Leu His Lys Ile 530 535 540
GGG ATC GCG CCA TTT AAA TGG ATG GCA TTA GCA GGA ACC CCA GAT GGC 4917 Gly Ile Ala Pro Phe Lys Trp Met Ala Leu Ala Gly Thr Pro Asp Gly 545 550 555 560
AAA CAA AAG TTA GAT ACC ACA TTA TCC GCC GCT TAT GCA AAA TTA GAC 4965 Lys Gln Lys Leu Asp Thr Thr Leu Ser Ala Ala Tyr Ala Lys Leu Asp 565 570 575
AAC AAA ACG CAT TTT GAA GGC ATT AAC GCT GAA AGT GAG CCA GTC GGC 5013 Asn Lys Thr His Phe Glu Gly Ile Asn Ala Glu Ser Glu Pro Val Gly 580 585 590
GCA TGG GCA ATG AAT TAT GCA TCA ATG GCA ATA CAA CGA AGA GCA TCG 5061 Ala Trp Ala Met Asn Tyr Ala Ser Met Ala Ile Gln Arg Arg Ala Ser 595 600 605
ACC CAA TCA CCA CAA CAA AGC TGG CTC GCC ATA GCG CGC GGT TTT AGC 5109 Thr Gln Ser Pro Gln Gln Ser Trp Leu Ala Ile Ala Arg Gly Phe Ser 610 615 620
CGT TAT CTT GTT GGT AAT GAA AGC TAT GAA AAT AAC AAC CGT TAT GGT 5157 Arg Tyr Leu Val Gly Asn Glu Ser Tyr Glu Asn Asn Asn Arg Tyr Gly 625 630 635 640
CGT TAT TTA CAA TAT GGA CAA TTG GAA ATT ATT CCA GCT GAT TTA ACT 5205 Arg Tyr Leu Gln Tyr Gly Gln Leu Glu Ile Ile Pro Ala Asp Leu Thr 645 650 655
CAA TCA GGG TTT AGC CAT GCT GGA TGG GAT TGG AAT AGA TAT CCA GGT 5253 Gln Ser Gly Phe Ser His Ala Gly Trp Asp Trp Asn Arg Tyr Pro Gly 660 665 670
ACA ACA ACT ATT CAT CTT CCC TAT AAC GAA CTT GAA GCA AAA CTT AAT 5301 Thr Thr Thr Ile His Leu Pro Tyr Asn Glu Leu Glu Ala Lys Leu Asn 675 680 685
CAA TTA CCT GCT GCA GGT ATT GAA GAA ATG TTG CTT TCA ACA GAA AGT 5349 Gln Leu Pro Ala Ala Gly Ile Glu Glu Met Leu Leu Ser Thr Glu Ser 690 695 700
TAC TCT GGT GCA AAT ACC CTT AAT AAT AAC AGT ATG TTT GCC ATG AAA 5397 Tyr Ser Gly Ala Asn Thr Leu Asn Asn Asn Ser Met Phe Ala Met Lys 705 710 715 720
TTA CAC GGT CAC AGT AAA TAT CAA CAA CAA AGC TTA AGG GCA AAT AAA 5445 Leu His Gly His Ser Lys Tyr Gln Gln Gln Ser Leu Arg Ala Asn Lys 725 730 735
TCC TAT TTC TTA TTT GAT AAT AGA GTT ATT GCT TTA GGC TCA GGT ATT 5493 Ser Tyr Phe Leu Phe Asp Asn Arg Val Ile Ala Leu Gly Ser Gly Ile 740 745 750
GAA AAT GAT GAT AAA CAA CAT ACG ACC GAA ACA ACA CTA TTC CAG TTT 5541 Glu Asn Asp Asp Lys Gln His Thr Thr Glu Thr Thr Leu Phe Gln Phe 755 760 765
GCC GTC CCT AAA TTA CAG TCA GTG ATC ATT AAT GGC AAA AAG GTA AAT 5589 Ala Val Pro Lys Leu Gln Ser Val Ile Ile Asn Gly Lys Lys Val Asn 770 775 780
CAA TTA GAT ACT CAA TTA ACT TTA AAT AAT GCA GAT ACA TTA ATT GAT 5637 Gln Leu Asp Thr Gln Leu Thr Leu Asn Asn Ala Asp Thr Leu Ile Asp 785 790 795 800
CCT GCC GGC AAT TTA TAT AAG CTC ACT AAA GGA CAA ACT GTA AAA TTT 5685 Pro Ala Gly Asn Leu Tyr Lys Leu Thr Lys Gly Gln Thr Val Lys Phe 805 810 815
AGT TAT CAA AAA CAA CAT TCA CTT GAT GAT AGA AAT TCA AAA CCA ACA 5733 Ser Tyr Gln Lys Gln His Ser Leu Asp Asp Arg Asn Ser Lys Pro Thr 820 825 830
GAA CAA TTA TTT GCA ACA GCT GTT ATT TCT CAT GGT AAG GCA CCG AGT 5781 Glu Gln Leu Phe Ala Thr Ala Val Ile Ser His Gly Lys Ala Pro Ser 835 840 845
AAT GAA AAT TAT GAA TAT GCA ATA GCT ATC GAA GCA CAA AAT AAT AAA 5829 Asn Glu Asn Tyr Glu Tyr Ala Ile Ala Ile Glu Ala Gln Asn Asn Lys 850 855 860
GCT CCC GAA TAC ACA GTA TTA CAA CAT AAT GAT CAG CTC CAT GCG GTA 5877 Ala Pro Glu Tyr Thr Val Leu Gln His Asn Asp Gln Leu His Ala Val 865 870 875 880
AAA GAT AAA ATA ACC CAA GAA GAG GGA TAT GCT TTT TTT GAA GCC ACT 5925 Lys Asp Lys Ile Thr Gln Glu Glu Gly Tyr Ala Phe Phe Glu Ala Thr 885 890 895
AAG TTA AAA TCA GCG GAT GCA ACA TTA TTA TCC AGT GAT GCG CCG GTT 5973 Lys Leu Lys Ser Ala Asp Ala Thr Leu Leu Ser Ser Asp Ala Pro Val 900 905 910
ATG GTC ATG GCT AAA ATA CAA AAT CAG CAA TTA ACA TTA AGT ATT GTT 6021 Met Val Met Ala Lys Ile Gln Asn Gln Gln Leu Thr Leu Ser Ile Val 915 920 925
AAT CCT GAT TTA AAT TTA TAT CAA GGT AGA GAA AAA GAT CAA TTT GAT 6069 Asn Pro Asp Leu Asn Leu Tyr Gln Gly Arg Glu Lys Asp Gln Phe Asp 930 935 940
GAT AAA GGT AAT CAA ATC GAA GTT AGT GTT TAT TCT CGT CAT TGG CTT 6117 Asp Lys Gly Asn Gln Ile Glu Val Ser Val Tyr Ser Arg His Trp Leu 945 950 955 960
ACA GCA GAA TCG CAA TCA ACA AAT AGT ACT ATT ACC GTA AAA GGA ATA 6165 Thr Ala Glu Ser Gln Ser Thr Asn Ser Thr Ile Thr Val Lys Gly Ile 965 970 975
TGG AAA TTA ACG ACA CCT CAA CCC GGT GTT ATT ATT AAG CAC CAC AAT 6213 Trp Lys Leu Thr Thr Pro Gln Pro Gly Val Ile Ile Lys His His Asn 980 985 990
AAC AAC ACT CTT ATT ACG ACA ACA ACC ATA CAG GCA ACA CCT ACT GTT 6261 Asn Asn Thr Leu Ile Thr Thr Thr Thr Ile Gln Ala Thr Pro Thr Val 995 1000 1005
ATT AAT TTA GTT AAG TAAATTTCGT AACTTTTAAA CTAAAGAGTC TCGACATAAA 6316 Ile Asn Leu Val Lys 1010
AATATCGAGA CTCTTTTTAT TAAAAAATTA AAAACAAGTT AACGAATGAA TTAATTATTT 6376
GAAAAATAAA AAATAAATCG ATAGCTTTAT TATTGATAAT AAATGTGTTG TGCTCAATGG 6436
TTATTTTGTT ATTCTCTGCG CGGATGCTTG GATCAATCTG GTTCAAGCAT ATCGCAAGCA 6496
CCAGAACGAA AAAAGCCCCG GGT 6519
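As a cross-check on the translation shown in the listing above, individual coding rows can be translated with the standard genetic code. The following is a minimal sketch in Python, not part of the patent: the names `CODON_TABLE`, `THREE_LETTER` and `translate` are illustrative, and it is an assumption that the standard codon table applies to this coding region. The two input rows are the first coding row (ending at nucleotide 3285) and the final five codons plus the TAA that follows them.

```python
# Illustrative sketch, not part of the patent text. Assumes the standard
# genetic code applies to the coding region shown in SEQ ID NO:39.
from itertools import product

BASES = "TCAG"
# One-letter amino acids in the conventional TCAG codon order; '*' is a stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}


def translate(dna):
    """Translate a DNA string (spaces allowed) codon by codon."""
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return [THREE_LETTER[CODON_TABLE[seq[i:i + 3]]] for i in range(0, usable, 3)]


first_row = "ATG CTA ATA AAA AAC CCT TTA GCC CAC GCG GTT ACA TTA AGC CTC TGT"
last_row = "ATT AAT TTA GTT AAG TAA"

print(" ".join(translate(first_row)))
# Met Leu Ile Lys Asn Pro Leu Ala His Ala Val Thr Leu Ser Leu Cys
print(" ".join(translate(last_row)))
# Ile Asn Leu Val Lys Stop
```

Three-letter output makes it easy to compare a translated row directly with the residue names printed beneath the codons.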
(2) INFORMATION FOR SEQ ID NO:40:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1013 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:
Met Leu Ile Lys Asn Pro Leu Ala His Ala Val Thr Leu Ser Leu Cys 1 5 10 15
Leu Ser Leu Pro Ala Gln Ala Leu Pro Thr Leu Ser His Glu Ala Phe 20 25 30
Gly Asp Ile Tyr Leu Phe Glu Gly Glu Leu Pro Asn Thr Leu Thr Thr 35 40 45
Ser Asn Asn Asn Gln Leu Ser Leu Ser Lys Gln His Ala Lys Asp Gly 50 55 60
Glu Gln Ser Leu Lys Trp Gln Tyr Gln Pro Gln Ala Thr Leu Thr Leu 65 70 75 80
Asn Asn Ile Val Asn Tyr Gln Asp Asp Lys Asn Thr Ala Thr Pro Leu 85 90 95
Thr Phe Met Met Trp Ile Tyr Asn Glu Lys Pro Gln Ser Ser Pro Leu 100 105 110
Thr Leu Ala Phe Lys Gln Asn Asn Lys Ile Ala Leu Ser Phe Asn Ala 115 120 125
Glu Leu Asn Phe Thr Gly Trp Arg Gly Ile Ala Val Pro Phe Arg Asp 130 135 140
Met Gln Gly Ser Ala Thr Gly Gln Leu Asp Gln Leu Val Ile Thr Ala 145 150 155 160
Pro Asn Gln Ala Gly Thr Leu Phe Phe Asp Gln Ile Ile Met Ser Val 165 170 175
Pro Leu Asp Asn Arg Trp Ala Val Pro Asp Tyr Gln Thr Pro Tyr Val 180 185 190
Asn Asn Ala Val Asn Thr Met Val Ser Lys Asn Trp Ser Ala Leu Leu 195 200 205
Met Tyr Asp Gln Met Phe Gln Ala His Tyr Pro Thr Leu Asn Phe Asp 210 215 220
Thr Glu Phe Arg Asp Asp Gln Thr Glu Met Ala Ser Ile Tyr Gln Arg 225 230 235 240
Phe Glu Tyr Tyr Gln Gly Ile Arg Ser Asp Lys Lys Ile Thr Pro Asp 245 250 255
Met Leu Asp Lys His Leu Ala Leu Trp Glu Lys Leu Val Leu Thr Gln 260 265 270
His Ala Asp Gly Ser Ile Thr Gly Lys Ala Leu Asp His Pro Asn Arg 275 280 285
Gln His Phe Met Lys Val Glu Gly Val Phe Ser Glu Gly Thr Gln Lys 290 295 300
Ala Leu Leu Asp Ala Asn Met Leu Arg Asp Val Gly Lys Thr Leu Leu 305 310 315 320
Gln Thr Ala Ile Tyr Leu Arg Ser Asp Ser Leu Ser Ala Thr Asp Arg 325 330 335
Lys Lys Leu Glu Glu Arg Tyr Leu Leu Gly Thr Arg Tyr Val Leu Glu 340 345 350
Gln Gly Phe Thr Arg Gly Ser Gly Tyr Gln Ile Ile Thr His Val Gly 355 360 365
Tyr Gln Thr Arg Glu Leu Phe Asp Ala Trp Phe Ile Gly Arg His Val 370 375 380
Leu Ala Lys Asn Asn Leu Leu Ala Pro Thr Gln Gln Ala Met Met Trp 385 390 395 400
Tyr Asn Ala Thr Gly Arg Ile Phe Glu Lys Asn Asn Glu Ile Val Asp 405 410 415
Ala Asn Val Asp Ile Leu Asn Thr Gln Leu Gln Trp Met Ile Lys Ser 420 425 430
Leu Leu Met Leu Pro Asp Tyr Gln Gln Arg Gln Gln Ala Leu Ala Gln 435 440 445
Leu Gln Ser Trp Leu Asn Lys Thr Ile Leu Ser Ser Lys Gly Val Ala 450 455 460
Gly Gly Phe Lys Ser Asp Gly Ser Ile Phe His His Ser Gln His Tyr 465 470 475 480
Pro Ala Tyr Ala Lys Asp Ala Phe Gly Gly Leu Ala Pro Ser Val Tyr 485 490 495
Ala Leu Ser Asp Ser Pro Phe Arg Leu Ser Thr Ser Ala His Glu Arg 500 505 510
Leu Lys Asp Val Leu Leu Lys Met Arg Ile Tyr Thr Lys Glu Thr Gln 515 520 525
Ile Pro Val Val Leu Ser Gly Arg His Pro Thr Gly Leu His Lys Ile 530 535 540
Gly Ile Ala Pro Phe Lys Trp Met Ala Leu Ala Gly Thr Pro Asp Gly 545 550 555 560
Lys Gln Lys Leu Asp Thr Thr Leu Ser Ala Ala Tyr Ala Lys Leu Asp 565 570 575
Asn Lys Thr His Phe Glu Gly Ile Asn Ala Glu Ser Glu Pro Val Gly 580 585 590
Ala Trp Ala Met Asn Tyr Ala Ser Met Ala Ile Gln Arg Arg Ala Ser 595 600 605
Thr Gln Ser Pro Gln Gln Ser Trp Leu Ala Ile Ala Arg Gly Phe Ser 610 615 620
Arg Tyr Leu Val Gly Asn Glu Ser Tyr Glu Asn Asn Asn Arg Tyr Gly 625 630 635 640
Arg Tyr Leu Gln Tyr Gly Gln Leu Glu Ile Ile Pro Ala Asp Leu Thr 645 650 655
Gln Ser Gly Phe Ser His Ala Gly Trp Asp Trp Asn Arg Tyr Pro Gly 660 665 670
Thr Thr Thr Ile His Leu Pro Tyr Asn Glu Leu Glu Ala Lys Leu Asn 675 680 685
Gln Leu Pro Ala Ala Gly Ile Glu Glu Met Leu Leu Ser Thr Glu Ser 690 695 700
Tyr Ser Gly Ala Asn Thr Leu Asn Asn Asn Ser Met Phe Ala Met Lys 705 710 715 720
Leu His Gly His Ser Lys Tyr Gln Gln Gln Ser Leu Arg Ala Asn Lys 725 730 735
Ser Tyr Phe Leu Phe Asp Asn Arg Val Ile Ala Leu Gly Ser Gly Ile 740 745 750
Glu Asn Asp Asp Lys Gln His Thr Thr Glu Thr Thr Leu Phe Gln Phe 755 760 765
Ala Val Pro Lys Leu Gln Ser Val Ile Ile Asn Gly Lys Lys Val Asn 770 775 780
Gln Leu Asp Thr Gln Leu Thr Leu Asn Asn Ala Asp Thr Leu Ile Asp 785 790 795 800
Pro Ala Gly Asn Leu Tyr Lys Leu Thr Lys Gly Gln Thr Val Lys Phe 805 810 815
Ser Tyr Gln Lys Gln His Ser Leu Asp Asp Arg Asn Ser Lys Pro Thr 820 825 830
Glu Gln Leu Phe Ala Thr Ala Val Ile Ser His Gly Lys Ala Pro Ser 835 840 845
Asn Glu Asn Tyr Glu Tyr Ala Ile Ala Ile Glu Ala Gln Asn Asn Lys 850 855 860
Ala Pro Glu Tyr Thr Val Leu Gln His Asn Asp Gln Leu His Ala Val 865 870 875 880
Lys Asp Lys Ile Thr Gln Glu Glu Gly Tyr Ala Phe Phe Glu Ala Thr 885 890 895
Lys Leu Lys Ser Ala Asp Ala Thr Leu Leu Ser Ser Asp Ala Pro Val 900 905 910
Met Val Met Ala Lys Ile Gln Asn Gln Gln Leu Thr Leu Ser Ile Val 915 920 925
Asn Pro Asp Leu Asn Leu Tyr Gln Gly Arg Glu Lys Asp Gln Phe Asp 930 935 940
Asp Lys Gly Asn Gln Ile Glu Val Ser Val Tyr Ser Arg His Trp Leu 945 950 955 960
Thr Ala Glu Ser Gln Ser Thr Asn Ser Thr Ile Thr Val Lys Gly Ile 965 970 975
Trp Lys Leu Thr Thr Pro Gln Pro Gly Val Ile Ile Lys His His Asn 980 985 990
Asn Asn Thr Leu Ile Thr Thr Thr Thr Ile Gln Ala Thr Pro Thr Val 995 1000 1005
Ile Asn Leu Val Lys 1010
(2) INFORMATION FOR SEQ ID NO:41:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:
ATTTGCAGGA AATCTGCATA TGCTAATAAA AAACCC 36
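The 36-base oligonucleotide of SEQ ID NO:41 can be compared with the corresponding stretch of SEQ ID NO:39 around the chondroitinase II start codon. The sketch below is illustrative only and not part of the patent; the names `PRIMER`, `GENOMIC` and `mismatches` are assumptions introduced here. It reports where the two sequences differ and where the hexamer CATATG occurs. CATATG is the recognition sequence of the restriction enzyme NdeI; whether the oligonucleotide was designed to introduce that site is not stated in this section.

```python
# Illustrative sketch, not part of the patent text.
# SEQ ID NO:41 as listed (spaces removed).
PRIMER = "ATTTGCAGGAAATCTGCATATGCTAATAAAAAACCC"

# Matching stretch of SEQ ID NO:39: the 19 bases ending at position 3237,
# followed by the first 17 bases of the coding region (ATG begins at 3238).
GENOMIC = "ATTTGCAGGAAATCTGATT" + "ATGCTAATAAAAAACCC"


def mismatches(reference, query):
    """Return (index, reference_base, query_base) for each differing position."""
    if len(reference) != len(query):
        raise ValueError("sequences must have equal length")
    return [(i, r, q) for i, (r, q) in enumerate(zip(reference, query)) if r != q]


print(len(PRIMER))                  # 36
print(mismatches(GENOMIC, PRIMER))  # [(16, 'A', 'C'), (17, 'T', 'A')]
print(PRIMER.find("CATATG"))        # 16
```

Under these inputs the oligonucleotide differs from the genomic sequence at two positions immediately upstream of the ATG, and the resulting CATATG hexamer overlaps the start codon.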

Claims

What is claimed is:
1. A purified isolated DNA fragment of Proteus vulgaris (P. vulgaris) comprising a sequence encoding for the chondroitinase I enzyme.
2. A purified isolated DNA fragment of P. vulgaris, wherein the fragment comprises a sequence which hybridizes with a nucleic acid sequence encoding for the amino acids numbered 1-1021 of SEQ ID NO:2 or a biological equivalent thereof.
3. The purified isolated DNA fragment of Claim 2, wherein the fragment has the sequence of (a) the nucleotides numbered 119-3181 of SEQ ID NO:1, or (b) the nucleotides numbered 119-3181 of SEQ ID NO:3.
4. A purified isolated DNA fragment of P. vulgaris, wherein the fragment comprises a sequence which hybridizes with a nucleic acid sequence encoding for the amino acids numbered 25-1021 of SEQ ID NO:2 or a biological equivalent thereof.
5. The purified isolated DNA fragment of Claim 4, wherein the fragment has the sequence of (a) the nucleotides numbered 191-3181 of SEQ ID NO:1, or (b) the nucleotides numbered 191-3181 of SEQ ID NO:3.
6. A purified isolated DNA fragment of P. vulgaris, wherein the fragment comprises a sequence which hybridizes with a nucleic acid sequence encoding for the amino acids numbered 24-1021 of SEQ ID NO:5 or a biological equivalent thereof.
7. The purified isolated DNA fragment of Claim 6, wherein the fragment has the sequence of nucleotides numbered 188-3181 of SEQ ID NO:4.
8. A purified isolated DNA fragment of Proteus vulgaris (P. vulgaris) comprising a sequence encoding for chondroitinase II enzyme.
9. A purified isolated DNA fragment of P. vulgaris, wherein the fragment comprises a sequence which hybridizes with a nucleic acid sequence encoding for (a) the amino acids numbered 1-1013 of SEQ ID NO:40 or a biological equivalent thereof, or (b) the amino acids numbered 24-1013 of SEQ ID NO:40 or a biological equivalent thereof.
10. The purified isolated DNA fragment of Claim 9, wherein the fragment has the sequence of nucleotides (a) numbered 3238-6276 of SEQ ID NO:39, or (b) numbered 3307-6276 of SEQ ID NO:39.
11. A plasmid containing a purified isolated DNA fragment of P. vulgaris comprising the sequence of (a) Claim 1 or (b) Claim 8.
12. The plasmid of Claim 11 wherein the plasmid is that designated pTM49-6 or that designated LP21359.
13. A host cell transformed with the plasmid of Claim 11.
14. The host cell of Claim 13 wherein the plasmid is that designated pTM49-6 (ATCC 69234) or that designated LP21359 (ATCC 69598).
15. A purified isolated recombinant chondroitinase I enzyme.
16. The chondroitinase I enzyme of Claim 15, whose amino acid sequence is depicted for (a) the amino acids numbered 1-1021 of SEQ ID NO:2 or a biological equivalent thereof, (b) the amino acids numbered 25-1021 of SEQ ID NO:2 or a biological equivalent thereof, or (c) the amino acids numbered 24-1021 of SEQ ID NO:5 or a biological equivalent thereof.
17. A purified isolated recombinant chondroitinase II enzyme.
18. The chondroitinase II enzyme of Claim 17, whose amino acid sequence is depicted for (a) the amino acids numbered 1-1013 of SEQ ID NO:40 or a biological equivalent thereof, or (b) the amino acids numbered 24-1013 of SEQ ID NO:40 or a biological equivalent thereof.
19. A method of producing chondroitinase I enzyme which comprises transforming a host cell with the plasmid of Claim 11(a) and culturing the host cell under conditions which permit expression of said enzyme by the host cell.
20. A method of producing the chondroitinase II enzyme which comprises transforming a host cell with the plasmid of Claim 11(b) and culturing the host cell under conditions which permit expression of said enzyme by the host cell.
21. A method for the isolation and purification of the recombinant chondroitinase I enzyme of Proteus vulgaris from host cells, said method comprising the steps of:
(a) lysing by homogenization the host cells to release the enzyme into the supernatant;
(b) subjecting the supernatant to diafiltration to remove salts and other small molecules;
(c) passing the supernatant through an anion exchange resin-containing column;
(d) loading the eluate from step (c) to a cation exchange resin-containing column so that the enzyme in the eluate binds to the cation exchange column; and
(e) eluting the enzyme bound to the cation exchange column with a solvent capable of releasing the enzyme from the column.
22. The method of Claim 21, wherein prior to step (b), the following two steps are performed:
(1) treating the supernatant with an acidic solution to precipitate out the enzyme; and
(2) recovering the pellet and then dissolving it in an alkali solution to again place the enzyme in a basic environment.
23. A recombinant chondroitinase I enzyme isolated and purified by the method of Claim 21 or by the method of Claim 22.
24. A method for the isolation and purification of the recombinant chondroitinase II enzyme of Proteus vulgaris from host cells, said method comprising the steps of:
(a) lysing by homogenization the host cells to release the enzyme into the supernatant;
(b) subjecting the supernatant to diafiltration to remove salts and other small molecules;
(c) passing the supernatant through an anion exchange resin-containing column;
(d) loading the eluate from step (c) to a cation exchange resin-containing column so that the enzyme in the eluate binds to the cation exchange column; and
(e) obtaining by affinity elution the enzyme bound to the cation exchange column with a solution of chondroitin sulfate, such that the enzyme is co-eluted with the chondroitin sulfate;
(f) loading the eluate from step (e) to an anion exchange resin-containing column and eluting the enzyme with a solvent such that the chondroitin sulfate binds to the column; and
(g) concentrating the eluate from step (f) and crystallizing out the enzyme from the supernatant which contains an approximately 37 kD contaminant.
25. The method of Claim 24, wherein prior to step (b), the following two steps are performed:
(1) treating the supernatant with an acidic solution to precipitate out the enzyme; and
(2) recovering the pellet and then dissolving it in an alkali solution to again place the enzyme in a basic environment.
26. A recombinant chondroitinase II enzyme isolated and purified by the method of Claim 24 or by the method of Claim 25.
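The amino-acid ranges and nucleotide ranges recited in Claims 2 to 5, 9 and 10 are related by simple codon arithmetic. The sketch below, which is illustrative and not part of the claims, maps an amino-acid range to the nucleotide range that encodes it. The function name `nt_range` is an assumption introduced here; the coding-region start positions (119 for SEQ ID NO:1 and 3238 for SEQ ID NO:39) are taken from Claims 3 and 10, and the stop codon is assumed to lie outside the recited ranges.

```python
# Illustrative sketch, not part of the patent text.
def nt_range(aa_first, aa_last, cds_start):
    """Nucleotide span encoding residues aa_first..aa_last (1-based, inclusive).

    cds_start is the position of the first base of codon 1.
    """
    first = cds_start + 3 * (aa_first - 1)
    last = cds_start + 3 * aa_last - 1
    return first, last


# Chondroitinase I, SEQ ID NO:1 (Claims 3 and 5)
print(nt_range(1, 1021, 119))    # (119, 3181)
print(nt_range(25, 1021, 119))   # (191, 3181)

# Chondroitinase II, SEQ ID NO:39 (Claim 10)
print(nt_range(1, 1013, 3238))   # (3238, 6276)
print(nt_range(24, 1013, 3238))  # (3307, 6276)
```

Each computed span agrees with the nucleotide numbering recited in the corresponding claim.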

Priority Applications (3)

Application Number Priority Date Filing Date Title
AU68183/94A AU697156B2 (en) 1993-04-23 1994-04-22 Cloning and expression of the chondroitinase I and II genes from (p. vulgaris)
EP94916561A EP0702715A4 (en) 1993-04-23 1994-04-22 Cloning and expression of the chondroitinase I and II genes from P. vulgaris
JP6524437A JPH09500011A (en) 1993-04-23 1994-04-22 Cloning and expression of chondroitinase I and II genes from P. vulgaris

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US5220693A 1993-04-23 1993-04-23
US5361593A 1993-04-23 1993-04-23
US08/052,206 1993-04-23
US08/053,615 1993-04-23

Publications (1)

Publication Number Publication Date
WO1994025567A1 (en) 1994-11-10

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1994/004495 WO1994025567A1 (en) 1993-04-23 1994-04-22 Cloning and expression of the chondroitinase I and II genes from P. vulgaris

Country Status (5)

Country Link
EP (1) EP0702715A4 (en)
JP (1) JPH09500011A (en)
AU (1) AU697156B2 (en)
CA (1) CA2161125A1 (en)
WO (1) WO1994025567A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1646353A4 (en) * 2003-05-16 2008-06-04 Acorda Therapeutics Inc Fusion proteins for the treatment of cns

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5049501A (en) * 1988-12-19 1991-09-17 Toyo Boseki Kabushiki Kaisha Production method for PvuI restriction endonuclease
US5198355A (en) * 1988-08-24 1993-03-30 Seikagaku Kogyo Co., Ltd. Purification of glycosaminoglycan degrading enzymes with a sulfated polysaccharide

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB1067253A (en) * 1965-02-15 1967-05-03 Biorex Laboratories Ltd Process for the preparation of enzymes
US5496718A (en) * 1992-06-26 1996-03-05 Seikagaku Kogyo Kabushiki Kaisha (Seikagaku Corporation) Chondroitinase ABC isolated from proteus vulgaris ATCC 6896
JPH0698769A (en) * 1992-09-22 1994-04-12 Maruha Corp Chondroitinase and its gene
JP3419811B2 (en) * 1993-02-24 2003-06-23 マルハ株式会社 Chondroitinase gene

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
AGRICULTURAL AND BIOLOGICAL CHEMISTRY, Volume 50, Number 4, issued 1986, SATO et al., "Subunit Structure of Chondroitinase ABC from Proteus vulgaris", pages 1057-1059, see page 1057, paragraph bridging columns 1-2. *
D. M. GLOVER, "DNA CLONING, Volume 1, A PRACTICAL APPROACH", published 1985 by IRL Press (Oxford), pages 49-77, see page 49, paragraph 1. *
See also references of EP0702715A4 *
THE JOURNAL OF BIOLOGICAL CHEMISTRY, Volume 243, Number 7, issued 10 April 1968, YAMAGATA et al., "Purification and Properties of Bacterial Chondroitinases and Chondrosulfatases", pages 1523-1535, see Purification section, pages 1526-1527. *

Cited By (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5716617A (en) * 1994-04-22 1998-02-10 American Cyanamid Company Compositions of proteus vulgaris chondroitinase I and chondroitinase II
US5855883A (en) * 1994-04-22 1999-01-05 American Cyanamid Company Method of disinsertion of vitreous body from neural retina of the eye with Proteus vulgaris chondroitinases I and II
EP0756636A1 (en) * 1994-04-22 1997-02-05 American Cyanamid Company Chondroitinases i and ii, methods of preparation, and use thereof
US5741692A (en) * 1994-04-22 1998-04-21 American Cyanamid Company Protein vulgaris chondroitinase II
EP0756636A4 (en) * 1994-04-22 1997-12-17 American Cyanamid Co Chondroitinases i and ii, methods of preparation, and use thereof
WO1996040897A3 (en) * 1995-06-07 1997-04-24 American Cyanamid Co Chondroitinase i and chondroitinase ii producing mutants of p. vulgaris
WO1996040938A1 (en) * 1995-06-07 1996-12-19 American Cyanamid Company Chondroitinase production in recombinant proteus vulgaris strains
US5888798A (en) * 1995-06-07 1999-03-30 American Cyanamid Company Chondroitinase I and chondroitinase II producing mutants of P. vulgaris
WO1996040897A2 (en) * 1995-06-07 1996-12-19 American Cyanamid Company Chondroitinase i and chondroitinase ii producing mutants of p. vulgaris
US8183350B2 (en) 2002-05-04 2012-05-22 Acorda Therapeutics, Inc. Compositions and methods for promoting neuronal outgrowth
US9956273B2 (en) 2002-05-04 2018-05-01 Acorda Therapeutics, Inc. Compositions and methods for promoting neuronal outgrowth
US9468671B2 (en) 2002-05-04 2016-10-18 Acorda Therapeutics, Inc. Compositions and methods for promoting neuronal outgrowth
US8785606B2 (en) 2002-05-04 2014-07-22 Acorda Therapeutics, Inc. Compositions and methods for promoting neuronal outgrowth
AU2013201097B2 (en) * 2003-05-16 2016-03-31 Acorda Therapeutics, Inc. Proteoglycan degrading mutants for treatment of cns
AU2009251124B2 (en) * 2003-05-16 2012-12-06 Acorda Therapeutics, Inc. Proteoglycan degrading mutants for treatment of CNS
AU2004247026B2 (en) * 2003-05-16 2009-09-24 Acorda Therapeutics, Inc. Proteoglycan degrading mutants for treatment of CNS
US9528102B2 (en) 2003-05-16 2016-12-27 Acorda Therapeutics, Inc. Proteoglycan degrading mutants for treatment of CNS
US7959914B2 (en) 2003-05-16 2011-06-14 Acorda Therapeutics, Inc. Methods of reducing extravasation of inflammatory cells
US7968089B2 (en) 2003-05-16 2011-06-28 Acorda Therapeutics, Inc. Proteoglycan degrading mutants for the treatment of CNS
US9839679B2 (en) 2003-05-16 2017-12-12 Acorda Therapeutics, Inc. Methods of reducing extravasation of inflammatory cells
JP2014236729A (en) * 2003-05-16 2014-12-18 アコーダ セラピューティクス、インク. Proteoglycan degrading mutants for treatment of cns
US11141467B2 (en) 2003-05-16 2021-10-12 Acorda Therapeutics, Inc. Methods of reducing extravasation of inflammatory cells
US7429375B2 (en) * 2003-05-16 2008-09-30 Acorda Therapeutics, Inc. Proteoglycan degrading mutants for the treatment of CNS
US8906363B2 (en) * 2003-05-16 2014-12-09 Acorda Therapeutics, Inc. Fusion proteins for the treatment of CNS
JP2007532094A (en) * 2003-05-16 2007-11-15 アコーダ セラピューティクス、インク. Proteoglycan-degrading mutant for CNS treatment
US8679481B2 (en) 2003-05-16 2014-03-25 Acorda Therapeutics, Inc. Methods of reducing extravasation of inflammatory cells
WO2004110360A3 (en) * 2003-05-16 2009-04-09 Acorda Therapeutics Inc Proteoglycan degrading mutants for treatment of cns
US8338119B2 (en) 2004-03-10 2012-12-25 Massachusetts Institute Of Technology Chondroitinase ABC I and methods of degrading therewith
US7592152B2 (en) 2004-03-10 2009-09-22 Massachusetts Institute Of Technology Chondroitinase ABC I and methods of analyzing therewith
US7553950B2 (en) 2004-03-10 2009-06-30 Massachusetts Institute Of Technology Chondroitinase ABC I polynucleotides
US7507570B2 (en) 2004-03-10 2009-03-24 Massachusetts Institute Of Technology Recombinant chondroitinase ABC I and uses thereof
US7662604B2 (en) 2004-03-10 2010-02-16 Massachusetts Institute Of Technology Chondroitinase ABC I and methods of production
US8226941B2 (en) 2004-05-18 2012-07-24 Acorda Therapeutics, Inc. Methods of purifying chondroitinase and stable formulations thereof
US9402886B2 (en) 2005-09-26 2016-08-02 Acorda Therapeutics, Inc. Compositions and methods of using chondroitinase ABCI mutants
US9834764B2 (en) 2005-09-26 2017-12-05 Acorda Therapeutics, Inc. Compositions and methods of using chondroitinase ABCI mutants
US10323240B2 (en) 2005-09-26 2019-06-18 Acorda Therapeutics, Inc. Compositions and methods of using chondroitinase ABCI mutants
US8236302B2 (en) 2005-09-26 2012-08-07 Acorda Therapeutics, Inc. Compositions and methods of using chondroitinase ABCI mutants
US9410141B2 (en) 2006-10-10 2016-08-09 Acorda Therapeutics, Inc. Compositions and methods of using chondroitinase ABCI mutants
US9102930B2 (en) 2006-10-10 2015-08-11 Acorda Therapeutics, Inc. Compositions and methods of using chondroitinase ABCI mutants
US8404232B2 (en) 2006-10-10 2013-03-26 Acorda Therapeutics, Inc. Compositions and methods of using chondroitinase ABCI mutants
US9987340B2 (en) 2006-10-10 2018-06-05 Acorda Therapeutics, Inc. Compositions and methods of using chondroitinase ABCI mutants

Also Published As

Publication number Publication date
AU6818394A (en) 1994-11-21
EP0702715A1 (en) 1996-03-27
AU697156B2 (en) 1998-10-01
EP0702715A4 (en) 2000-01-19
JPH09500011A (en) 1997-01-07
CA2161125A1 (en) 1994-11-10

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AU CA JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB GR IE IT LU MC NL PT SE

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 2161125

Country of ref document: CA

WWE Wipo information: entry into national phase

Ref document number: 1994916561

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1994916561

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 1994916561

Country of ref document: EP