WO2004067765A2 - Organism fingerprinting using nicking agents - Google Patents


Publication number
WO2004067765A2
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
nicking
strand
oligonucleotide
nucleic acid
Prior art date
Application number
PCT/US2004/002720
Other languages
French (fr)
Other versions
WO2004067765A3 (en)
Inventor
Jeffrey Van Ness
David J. Galas
Lori K. Van Ness
Original Assignee
Keck Graduate Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Keck Graduate Institute
Publication of WO2004067765A2
Publication of WO2004067765A3

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6888: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/156: Polymorphic or mutational markers
    • C12Q2600/16: Primer sets for multiplex assays

Definitions

  • the present invention is generally directed to compositions and methods for identifying any type of organism or individual, where the invention is based on creating and analyzing nucleotide sequence information characteristic of nucleic acids present in the organism or individual.
  • Nosocomial (hospital-based) infections have become one of the most serious problems in infectious disease. Staphylococcus aureus is exceeded only by Escherichia coli as a leading cause of nosocomial infections. See, for example, Brumfitt, W. et al., Drugs Exptl. Clin. Res. 76:205-214 (1990).
  • S. aureus methicillin-resistant S. aureus
  • Patients in the intensive care unit are very susceptible to bacterial infections, due to interventions such as respiratory tubes and indwelling catheters. E. coli and S. aureus, if introduced into surgical wounds, the blood stream or the urinary tract, cause serious, sometimes life-threatening infections.
  • a partial solution in most "nosocomial outbreaks" is simply identifying the source of the infection. That is, is the infectious agent coming from a common source (e.g., an infected nurse or doctor, or an instrument such as a respirator) or is there some other reason for the sudden emergence of a single type of bacterial infection.
  • Interspersed repetitive DNA sequence elements have been characterized extensively in eucaryotes although their function still remains largely unknown. The conserved nature and interspersed distribution of these repetitive sequences have been exploited to amplify unique sequences between repetitive sequences by the polymerase chain reaction. Additionally, species-specific repetitive DNA elements have been used to differentiate between closely related murine species. Prokaryotic genomes are much smaller than the genomes of mammalian species (approximately 10^6 versus 10^9 base pairs of DNA, respectively). Since these smaller prokaryotic genomes are maintained through selective pressures for rapid DNA replication and cell reproduction, the non-coding repetitive DNA should be kept to a minimum unless maintained by other selective forces. For the most part prokaryotes have a high density of transcribed sequences. Nevertheless, families of short intergenic repeated sequences occur in bacteria.
  • repetitive sequences have been demonstrated in many different bacterial species. Reports of novel repeated sequences in the eubacterial genera, Escherichia, Salmonella, Deinococcus, Calothrix, and Neisseria, and the fungi, Candida albicans and Pneumocystis carinii, illustrate the presence of dispersed extragenic repetitive sequences in many organisms.
  • One such family of repetitive DNA sequences in eubacteria is the Repetitive Extragenic Palindromic (REP) elements.
  • the consensus REP sequence for this family includes a 38-mer sequence containing six totally degenerate positions, including a 5 bp variable loop between each side of the conserved stem of the palindrome.
  • the present invention fulfills this and other related needs.
  • the present invention is generally directed to compositions and methods for identifying any type of organism or individual using nucleic acid- based fingerprinting.
  • the method relies on the creation of a family of nucleic acid or oligonucleotide fragments formed by action of nicking agents on a nucleic acid sample.
  • the nicking reaction is performed in the presence of a polymerase and one or more (preferably all four of the natural) deoxyribonucleoside triphosphates.
  • the nucleic acid or oligonucleotide fragments or portions thereof are created in higher concentration and are therefore more amenable to characterization.
  • a family, also referred to herein as a pattern, of nucleic acid or oligonucleotide fragments of known characteristics (e.g., mass/charge ratios) is produced, which unambiguously identifies an organism or individual.
  • the readout of the fingerprinting assay is preferably matrix-assisted laser desorption ionization (MALDI) or liquid chromatography time-of-flight (LC-TOF) mass spectrometry; however, other characterization methods may be used as well.
  • MALDI matrix-assisted-laser-desorption ionization
  • LC-TOF liquid chromatography time-of-flight
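  • As a rough illustration of the mass/charge readout mentioned above, the following sketch estimates the average mass of a short single-stranded DNA fragment. The residue masses are commonly quoted approximate averages and are not taken from this document; the function names, the choice of a 5' phosphate on the fragment, and the singly charged negative-ion default are all assumptions made for the example.

```python
# Approximate average masses (Da) of DNA nucleotide residues within a strand.
# Commonly quoted textbook values; NOT taken from the patent text.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
WATER = 18.02      # terminal H and OH added to the chain of residues
PHOSPHATE = 79.98  # HPO3, subtracted when the 5' end carries no phosphate
PROTON = 1.00728


def oligo_mass(seq: str, five_prime_phosphate: bool = True) -> float:
    """Approximate average mass of a single-stranded DNA oligonucleotide."""
    mass = sum(RESIDUE_MASS[base] for base in seq.upper()) + WATER
    if not five_prime_phosphate:
        mass -= PHOSPHATE
    return mass


def mass_to_charge(mass: float, charge: int = 1, negative: bool = True) -> float:
    """m/z for an ion that has lost (negative) or gained (positive) protons."""
    delta = -charge * PROTON if negative else charge * PROTON
    return (mass + delta) / charge


if __name__ == "__main__":
    fragment = "TGAGTCACGT"  # hypothetical 10-mer, for illustration only
    m = oligo_mass(fragment)
    print(f"{fragment}: mass ~{m:.2f} Da, m/z ~{mass_to_charge(m):.2f}")
```

  • Because each base contributes a different residue mass, fragments of equal length but different composition generally give distinguishable peaks, which is what makes a mass spectrum usable as a pattern.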
  • a method has been devised according to the present invention in which a set of oligonucleotides that can be used to initiate their own exponential amplification is linearly amplified from template structures pre-existing in genomic DNA.
  • short oligonucleotides are linearly amplified in the presence of a nicking agent that recognizes the nicking agent recognition sequences.
  • the products from the linear amplification, referred to herein as "initiating oligonucleotides" or "initiators," can then be coupled to a method for exponentially amplifying the initiating oligonucleotides in true chain reactions.
  • the linear and the exponential amplification reactions can be made into a homogeneous assay in which 10^8- to 10^9-fold amplification can be achieved in as little as 3 minutes.
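  • To put the stated 10^8- to 10^9-fold amplification in 3 minutes into perspective, the sketch below converts a fold-amplification into an equivalent number of doublings and an average time per doubling. It assumes idealized pure doubling throughout the reaction, which the text does not claim; the function names are hypothetical.

```python
import math


def doublings_for_fold(fold: float) -> float:
    """Number of ideal doublings needed to reach a given fold-amplification."""
    return math.log2(fold)


def seconds_per_doubling(fold: float, total_seconds: float) -> float:
    """Average time per doubling if `fold` is reached in `total_seconds`."""
    return total_seconds / doublings_for_fold(fold)


if __name__ == "__main__":
    for fold in (1e8, 1e9):
        n = doublings_for_fold(fold)
        t = seconds_per_doubling(fold, 180)
        print(f"{fold:.0e}-fold: {n:.1f} doublings, {t:.1f} s per doubling")
```

  • Under this idealization, the stated range corresponds to roughly 27 to 30 doublings, or a doubling every 6 to 7 seconds.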
  • the linear amplification reaction, the exponential amplification, or both may be performed under isothermal conditions (e.g., at 60°C).
  • the exponential or string reaction is composed of two reaction components: a first amplification reaction that replicates the initiating oligonucleotide and a second amplification reaction that replicates the complement of the initiating oligonucleotide.
  • two template oligonucleotides are used, a first template that comprises a sequence complementary to the initiating oligonucleotide, and a second template that comprises a sequence complementary to the complement of the initiating oligonucleotide.
  • the first template may anneal to the complement of the initiating oligonucleotide and be used as a template for amplifying the initiating oligonucleotide
  • the second template may anneal to the initiating oligonucleotide and be used as a template for amplifying the complement of the initiating oligonucleotide
  • a useful example is taken from E. coli K12 in which 55 unique oligonucleotides can be generated from genomic DNA without the use of pre-synthesized probes or primers.
  • the read-out is ideally done by mass spectrometry (LC-TOF or MALDI) but can also be accomplished by other means, e.g., using real-time fluorimetry or "self-amplifying arrays". Foreknowledge of the sequence of the individual or organism is not necessary as it is possible to generate the fragments de novo from genomic DNA.
  • the methods described here permit the creation of an assay panel of diagnostic oligonucleotides that can identify any organism or individual.
  • the present invention is advantageous over previous methods for identifying bacterial species.
  • the present invention provides a novel approach to using nicking agent recognition sites within genomic DNA to directly fingerprint bacterial genomes (as well as viral and fungal genomes and, in fact, all prokaryotic and eukaryotic genomes).
  • the unique patterns of oligonucleotides generated by a nicking agent recognition sequence identify different bacterial species and strains.
  • the present invention may produce polymorphic oligonucleotide fragments that contain genetic variations (e.g., single nucleotide polymorphisms, deletions, insertions, variable repeats) from eukaryotic genomes. The characterization of these polymorphic oligonucleotide fragments enables the identification of the individual organism from which a nucleic acid sample is obtained.
  • the present invention provides a method comprising: a) providing a nucleic acid sample; b) treating the nucleic acid sample with components under nicking conditions, where the components comprise: i) a nicking agent; and the conditions cause the nicking agent to nick the nucleic acid sample to thereby produce a family of initiating oligonucleotide fragments; c) subjecting one or more members of the family of initiating oligonucleotide fragments to a characterization process to thereby provide results; and d) identifying a source for the nucleic acid sample based on the results of the characterization process.
  • the components of the above method further comprise: ii) a polymerase; and iii) a deoxyribonucleoside triphosphate.
  • the components of the above methods may further comprise: iv) a template oligonucleotide comprising from 3' to 5':
  • (A) a first nucleotide sequence that is at least substantially complementary to a nucleotide sequence present in one or more members of the family of initiating oligonucleotide fragments;
  • (C) a second nucleotide sequence.
  • the template is immobilized.
  • sequence iv)(B) is a sequence of an antisense strand of the nicking agent recognition sequence
  • sequence iv)(A) is at least substantially identical to sequence iv)(C).
  • the components of the above methods may further comprise v) a second template oligonucleotide, comprising, from 3' to 5':
  • (A) a first nucleotide sequence that is at least substantially identical to a nucleotide sequence present in the one or more members of the family of initiating oligonucleotide fragments;
  • sequence iv)(B) is a sequence of a sense strand of the nicking agent recognition sequence.
  • sequence iv)(A) is exactly identical to sequence iv)(C).
  • template iv), template v), or both may be immobilized.
  • the components may further comprise a restriction endonuclease.
  • the one or more members of the family of initiating oligonucleotide fragments are 6-16 nucleotides in length.
  • the nicking agent is a nicking endonuclease, such as N.BstNB I or N.Alw I.
  • the polymerase is exo⁻ Vent polymerase.
  • the characterization process is performed at least partially by a technique selected from the group consisting of luminescence spectroscopy or spectrometry, fluorescence spectroscopy or spectrometry, mass spectrometry (such as MALDI or LC-TOF), liquid chromatography, fluorescence polarization, and electrophoresis.
  • the present invention provides a template oligonucleotide for amplifying a portion of a target nucleic acid, wherein
  • the portion of the target is 6-16 nucleotides in length and flanked by
  • (C) a second nucleotide sequence that is at least substantially identical to the first nucleotide sequence.
  • the first and third nicking enzyme recognition sequences are identical to each other.
  • the first, second and third nicking enzyme recognition sequences are identical to each other.
  • some or all of the nicking enzyme recognition sequences are recognizable by N.BstNB I.
  • the 3' terminus of the template oligonucleotide is blocked.
  • the 3' terminus of the template oligonucleotide is immobilized.
  • the portion of the target is selected from the products listed in Tables 1, 2, 4, 5, and 9.
  • the portion of the target comprises a genetic variation (e.g., single nucleotide polymorphism).
  • the present invention provides a composition for amplifying a portion of a target nucleic acid, comprising a first template oligonucleotide and a second template oligonucleotide, wherein
  • the portion of the target is 6-16 nucleotides in length and flanked by
  • the first template oligonucleotide comprises from 5' to 3': (A) a first nucleotide sequence,
  • nicking enzyme recognition sequence (B) a sequence of a sense strand of a fourth nicking agent recognition sequence, and (C) a second nucleotide sequence that is at least substantially identical to the portion of the target.
  • the first, third and fourth nicking enzyme recognition sequences are identical to each other.
  • the first, second, third and fourth nicking enzyme recognition sequences are identical to each other. In certain embodiments, some or all of the nicking enzyme recognition sequences are recognizable by N.BstNB I.
  • the 3' termini of the first and second templates are blocked.
  • the 3' terminus of the first, the 3' terminus of the second, or both termini are immobilized.
  • the portion of the target is selected from the products listed in Tables 1, 2, 4, 5 and 9.
  • the portion of the target comprises a genetic variation (e.g., single nucleotide polymorphism).
  • the present invention provides a composition comprising:
  • the nicking agent is N.BstNB I or N.Alw I.
  • the composition may further comprise a restriction endonuclease selected from Bcl I, BsaB I, Bsm I, Bsr I and BsrD I, wherein the at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 50 fragments are selected from the products listed in Table 9.
  • the present invention provides a kit for identifying the source of a nucleic acid sample, comprising one or two template oligonucleotides as described above for amplifying a portion of a genomic DNA of an organism suspected to be the source of the nucleic acid sample, wherein the portion of the genomic DNA is 6-16 nucleotides in length and flanked by
  • (A) a sequence of one strand of a first nicking enzyme recognition sequence
  • (B) a sequence of one strand of a second nicking enzyme recognition sequence, or a sequence of one strand of a restriction enzyme recognition sequence.
  • the kit may further comprise a nicking enzyme that recognizes one or more nicking enzyme recognition sequences in the template(s).
  • the kit may further comprise a DNA polymerase and/or one or more deoxyribonucleoside triphosphates.
  • the portion of the genomic DNA is selected from the products listed in Tables 1, 2, 4, 5 and 9.
  • the portion of the genomic DNA comprises a single nucleotide polymorphism.
  • the present invention provides an array, comprising
  • Figure 1 schematically shows the cycle of the synthesis and release of an amplified short oligonucleotide.
  • the recognition site for the enzyme N.BstNB I (5'-GAGTC-3') and the specific nicking site four bases downstream on this strand.
  • the oligonucleotide produced is indicated in blue, the primer in green and the template in red.
  • the lengths of the exemplary template and amplified oligonucleotides are shown in the upper left drawing.
  • Figure 2 is a diagram of the reaction scheme for the exponential amplification of oligonucleotides.
  • the segments in red represent the sequence complement of the oligonucleotide sequence to be amplified, the signal sequence (shown in blue).
  • the amplification template, ⁇ consists of two copies of the signal complement flanking the nicking enzyme recognition site shown as a light blue box, and a spacer sequence, shown as a green segment.
  • the signal oligonucleotide (labeled ⁇ ) is produced in the linear amplification cycle for each amplification template created.
  • Figure 3 is a schematic representation of a template oligonucleotide used in a replicator type of amplification reactions.
  • Figures 4a and 4b show the time of flight spectra of the multiplexed amplification reaction in the presence of short oligonucleotides generated from the genomic DNA of the E. coli strains K12 and O157, respectively.
  • Figures 5a and 5b show the MALDI spectra of the multiplexed amplification reaction in the presence of short oligonucleotides generated from the genomic DNA of the E. coli strains K12 and O157, respectively.
  • Figure 6 shows real time fluorescence detection of the oligonucleotide amplification by an MJ Opticon I.
  • the time of amplification is plotted on the X axis versus accumulated fluorescence on the Y axis.
  • Each curve from left to right represents a serial dilution of 3-fold.
  • the starting concentration of the trigger was 0.01 picomoles/microliter and the last dilution (far right curve (bottom curve on figure)) was 1.9 x 10^-7 picomoles/microliter. This represents a dilution range of about 20,000-fold (3^9).
  • Figure 7 is a schematic diagram showing the ping-pong amplification reaction cycle.
  • Figure 8 is a schematic diagram showing the application of the ping-pong amplification reaction cycle in discriminating genetic variations.
  • the present invention provides methods for identifying any type of organism or individual using polynucleotide-based fingerprinting.
  • the method relies on the creation of a family of polynucleotides formed by action of nicking agents on a nucleic acid sample.
  • a nucleic acid sample may be nicked by a nicking agent to produce various nicked nucleic acid fragments.
  • the resulting nicked fragments are then characterized to determine the identity of the organism from which the nucleic acid sample was obtained or derived.
  • a nucleic acid sample may be nicked by a nicking agent in the presence of a DNA polymerase.
  • the presence of the DNA polymerase allows linear amplification of the nicked fragments or portions thereof, and thus facilitates the characterization of such fragments.
  • These fragments may be further amplified by coupling the linear amplification reaction with an exponential amplification reaction.
  • Such an exponential amplification greatly increases the speed and the sensitivity of the identification methods.
  • the nicked fragments may be characterized to determine the presence or absence of a particular fragment unique, or characteristic, to a specific species, subspecies, or strain. The conclusion made from the presence or absence of that particular fragment may be further verified by determining the presence or absence of one or more other fragments also unique, or characteristic, to the specific species, subspecies, or strain.
  • the presence or absence of particular fragments in a nicking reaction mixture of a nucleic acid sample need not be determined. Rather, the pattern formed by the resulting nicked fragments (e.g., a mass spectrum of the nicked fragments) is characterized and compared with a standard pattern known for a particular species, subspecies, or strain.
  • the standard pattern may be generated by performing the nicking reaction with a nucleic acid sample from the particular species, subspecies or strain under conditions identical to that of the nucleic acid sample.
  • the standard pattern may be generated based on the known nucleic acid sequence of the particular species, subspecies, or strain.
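  • The pattern-based comparison described above can be sketched as a simple peak-matching score. This is only an illustration under stated assumptions: the scoring rule (fraction of reference peaks found in the sample), the 1.0 Da tolerance, the function names and the example peak lists are all hypothetical and do not come from the source.

```python
def match_fraction(sample_peaks, reference_peaks, tolerance=1.0):
    """Fraction of reference peaks that have a sample peak within `tolerance`."""
    if not reference_peaks:
        return 0.0
    matched = sum(
        1
        for ref in reference_peaks
        if any(abs(ref - obs) <= tolerance for obs in sample_peaks)
    )
    return matched / len(reference_peaks)


def best_match(sample_peaks, references, tolerance=1.0):
    """Name of the reference pattern with the highest match fraction."""
    return max(
        references,
        key=lambda name: match_fraction(sample_peaks, references[name], tolerance),
    )


if __name__ == "__main__":
    # Made-up peak masses (Da), for illustration only.
    references = {
        "strain_A": [2000.0, 2500.0, 3100.0],
        "strain_B": [2000.0, 2750.0, 3300.0],
    }
    sample = [2000.3, 2499.6, 3100.4]
    print(best_match(sample, references))
```

  • A standard pattern for such a comparison could come from either route described above: a reference reaction run under identical conditions, or a prediction from a known genome sequence.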
  • the present invention uses the presence of closely located nicking agent recognition sequences.
  • short oligonucleotide fragments e.g., 6-16 nucleotides long
  • Short oligonucleotide fragments may also be generated or amplified by the use of a nicking agent in combination with a restriction enzyme if one or more recognition sequences of the nicking agent are located closely to the recognition sequence(s) of the restriction enzyme.
  • the resulting short oligonucleotide fragments may be easily separated from other larger fragments. Such separation simplifies the pattern generated from the characterization of the nucleic acid fragments in a nicking reaction mixture.
  • short oligonucleotide fragments may easily be exponentially amplified to increase the speed and sensitivity of the present methods. Short oligonucleotide fragments are also more suitable to certain characterization technologies, such as LC-TOF and MALDI. In some embodiments, the nicked fragments may also contain genetic variations useful to distinguish among individual organisms.
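  • As a sketch of how a standard pattern might be predicted from a known sequence, the code below scans both strands for closely spaced N.BstNB I sites and reports the segments lying between consecutive nicks on the same strand that are 6-16 nucleotides long. The recognition sequence (5'-GAGTC-3') and the nick four bases downstream are as stated in this document; treating the released fragment as exactly the segment between two consecutive same-strand nicks is a simplifying assumption, and all function names are hypothetical.

```python
import re

SITE = "GAGTC"    # sense strand of the N.BstNB I recognition sequence
NICK_OFFSET = 4   # nick lies four bases downstream of the site
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def nick_positions(strand: str) -> list:
    """Indices at which the strand is cut (cut falls before the index)."""
    cuts = []
    for match in re.finditer(f"(?={SITE})", strand):
        cut = match.start() + len(SITE) + NICK_OFFSET
        if cut <= len(strand):
            cuts.append(cut)
    return cuts


def short_fragments(seq: str, min_len: int = 6, max_len: int = 16) -> list:
    """Segments between consecutive same-strand nicks within the length window."""
    seq = seq.upper()
    found = []
    for strand in (seq, reverse_complement(seq)):
        cuts = nick_positions(strand)
        for start, end in zip(cuts, cuts[1:]):
            if min_len <= end - start <= max_len:
                found.append(strand[start:end])
    return found


if __name__ == "__main__":
    # Toy sequence with two sites ten bases apart on the top strand.
    toy = "AAA" + "GAGTC" + "ACGT" + "T" + "GAGTC" + "ACGTA" + "CCC"
    print(short_fragments(toy))
```

  • Running the same scan over a full genome sequence would give a list of candidate short fragments whose masses could then be compared with an observed spectrum.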
  • Fingerprinting refers to the identification of a source of nucleic acid based on analysis of the nucleic acid according to the methods described herein. For instance, fingerprinting may be applied to the identification of a bacterial strain from its characteristic pattern of oligonucleotides produced by action of a nicking agent (e.g., N.BstNB I). This characteristic pattern is the strain's genomic "fingerprint", which is determined by the sequence of the strain's genomic DNA.
  • when a location in a nucleic acid is "5' to" or "5' of" a reference nucleotide or a reference nucleotide sequence, this means that it is between the 5' terminus of the reference nucleotide or the reference nucleotide sequence and the 5' phosphate of that strand of the nucleic acid. Further, when a nucleotide sequence is "directly 3' to" or "directly 3' of" a reference nucleotide or a reference nucleotide sequence, this means that the nucleotide sequence is immediately next to the 3' terminus of the reference nucleotide or the reference nucleotide sequence.
  • nicking refers to the cleavage of only one strand of a fully double-stranded nucleic acid molecule or a double-stranded portion of a partially double-stranded nucleic acid molecule at a specific position relative to a nucleotide sequence that is recognized by the enzyme that performs the nicking.
  • the specific position where the nucleic acid is nicked is referred to as the "nicking site" (NS).
  • NA nicking agent
  • Nicking agents include, but are not limited to, a nicking endonuclease (e.g., N.BstNB I) and a restriction endonuclease (e.g., Hinc II) when a completely or partially double-stranded nucleic acid molecule contains a hemimodified recognition/cleavage sequence in which one strand contains at least one derivatized nucleotide(s) that prevents cleavage of that strand (i.e., the strand that contains the derivatized nucleotide(s)) by the restriction endonuclease.
  • NE nicking endonuclease
  • Unlike a restriction endonuclease (RE), which requires its recognition sequence to be modified by containing at least one derivatized nucleotide to prevent cleavage of the derivatized nucleotide-containing strand of a fully or partially double-stranded nucleic acid molecule, a NE typically recognizes a nucleotide sequence composed of only native nucleotides and cleaves only one strand of a fully or partially double-stranded nucleic acid molecule that contains the nucleotide sequence.
  • RE restriction endonuclease
  • a "native nucleotide" refers to adenylic acid, guanylic acid, cytidylic acid, thymidylic acid or uridylic acid.
  • a "derivatized nucleotide” is a nucleotide other than a native nucleotide.
  • NARS nicking agent recognition sequence
  • RERS Restriction endonuclease recognition sequence
  • a "hemimodified RERS," as used herein, refers to a double-stranded RERS in which one strand of the recognition sequence contains at least one derivatized nucleotide (e.g., α-thio deoxynucleotide) that prevents cleavage of that strand (i.e., the strand that contains the derivatized nucleotide within the recognition sequence) by a RE that recognizes the RERS.
  • a NARS is a double-stranded nucleotide sequence where each nucleotide in one strand of the sequence is complementary to the nucleotide at its corresponding position in the other strand.
  • the nucleotide sequence of a NARS in the strand containing a NS nickable by a NA that recognizes the NARS is referred to as a "sequence of the sense strand of the NARS" or a "sequence of the sense strand of the double-stranded NARS," while the nucleotide sequence of the NARS in the strand that does not contain the NS is referred to as a "sequence of the antisense strand of the NARS" or a "sequence of the antisense strand of the double-stranded NARS."
  • a NERS is a double-stranded nucleotide sequence of which one strand is exactly complementary to the other strand
  • the nucleotide sequence of a NERS located in the strand containing a NS nickable by a NE that recognizes the NERS is referred to as a "sequence of a sense strand of the NERS" or a "sequence of the sense strand of the double-stranded NERS"
  • the nucleotide sequence of the NERS located in the strand that does not contain the NS is referred to as a "sequence of the antisense strand of the NERS" or a "sequence of the antisense strand of the double-stranded NERS."
  • the recognition sequence and the nicking site of an exemplary nicking endonuclease, N.BstNB I are shown below with V to indicate the cleavage site and N to indicate any nucleotide:
  • the sequence of the sense strand of the N.BstNB I recognition sequence is 5'-GAGTC-3', whereas that of the antisense strand is 5'-GACTC-3'.
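  • The relationship between the sense and antisense sequences, and the position of the nick, can be checked with a few lines of code. This is a minimal sketch: the recognition sequence and the "four bases downstream" nick position are taken from this document, while the function names and the "^" marker are illustrative choices.

```python
SENSE = "GAGTC"   # sense strand of the N.BstNB I recognition sequence
NICK_OFFSET = 4   # the nick lies four bases downstream on the same strand
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def mark_nick(strand: str) -> str:
    """Insert '^' at the nick site of the first recognition site, if any."""
    start = strand.find(SENSE)
    if start == -1:
        return strand
    cut = start + len(SENSE) + NICK_OFFSET
    if cut > len(strand):
        return strand  # site too close to the 3' end to be nicked
    return strand[:cut] + "^" + strand[cut:]


if __name__ == "__main__":
    print(reverse_complement(SENSE))   # antisense strand, 5' to 3'
    print(mark_nick("TTGAGTCAAAATT"))  # toy strand with one site
```

  • The first call returns GACTC, matching the antisense sequence given above; the second marks the nick as TTGAGTCAAAA^TT in the toy strand.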
  • the sequence of a hemimodified RERS in the strand containing a NS nickable by a RE that recognizes the hemimodified RERS is referred to as "the sequence of the sense strand of the hemimodified RERS" and is located in "the sense strand of the hemimodified RERS" of a hemimodified RERS-containing nucleic acid
  • the sequence of the hemimodified RERS in the strand that does not contain the NS i.e., the strand that contains derivatized nucleotide(s)
  • the sequence of the antisense strand of the hemimodified RERS is located in "the antisense strand of the hemimodified RERS" of a hemimodified RERS-containing nucleic acid.
  • a NARS is an at most partially double-stranded nucleotide sequence that has one or more nucleotide mismatches, but contains an intact sense strand of a double-stranded NARS as described above.
  • the hybridized product includes a NARS, and there is at least one mismatched base pair within the NARS of the hybridized product, then this NARS is considered to be only partially double-stranded.
  • NARSs may be recognized by certain nicking agents (e.g., N.BstNB I) that require only one strand of double-stranded recognition sequences for their nicking activities.
  • N.BstNB I may contain, in certain embodiments, an intact sense strand, as follows,
  • N indicates any nucleotide
  • N at one position may or may not be identical to N at another position; however, there is at least one mismatched base pair within this recognition sequence.
  • the NARS will be characterized as having at least one mismatched nucleotide.
  • a NARS is a partially or completely single-stranded nucleotide sequence that has one or more unmatched nucleotides, but contains an intact sense strand of a double-stranded NARS as described above.
  • the hybridized product includes a nucleotide sequence in the first strand that is recognized by a NA, i.e., the hybridized product contains a NARS, and at least one nucleotide in the sequence recognized by the NA does not correspond to, i.e., is not across from, a nucleotide in the second strand when the hybridized product is formed, then there is at least one unmatched nucleotide within the NARS of the hybridized product, and this NARS is considered to be partially or completely single-stranded.
  • NARSs may be recognized by certain nicking agents (e.g., N.BstNB I) that require only one strand of double-stranded recognition sequences for their nicking activities.
  • N.BstNB I may contain, in certain embodiments, an intact sense strand, as follows,
  • where N indicates any nucleotide, 0-4 indicates the number of the nucleotides "N," and an "N" at one position may or may not be identical to an "N" at another position; this contains the nucleotide sequence of the sense strand of the double-stranded recognition sequence of N.BstNB I.
  • at least one of G, A, G, T or C is unmatched, in that there is no corresponding nucleotide in the complementary strand. This situation arises, e.g., when there is a "loop" in the hybridized product, and particularly when the sense sequence is present, completely or in part, within a loop.
  • a "nucleic acid amplification reaction" refers to the process of making more than one copy of a nucleic acid molecule (A) using a nucleic acid molecule (T) that comprises a sequence complementary to the nucleotide sequence of nucleic acid molecule A as a template.
  • a first nucleic acid sequence is "at least substantially identical" to a second nucleic acid sequence when the complement of the first sequence is able to anneal to the second sequence to form at least a transient duplex under certain reaction conditions (e.g., conditions for amplifying nucleic acids).
  • the first sequence is exactly identical to the second sequence, that is, the nucleotide of the first sequence at each position is identical to the nucleotide of the second sequence at the same position, and the first sequence is of the same length as the second sequence.
  • a first nucleic acid sequence is "at least substantially complementary" to a second nucleic acid sequence when the first sequence is able to anneal to the second sequence to form at least a transient duplex under certain reaction conditions (e.g., conditions for amplifying nucleic acids).
  • the first sequence is exactly or completely complementary to the second sequence, that is, each nucleotide of the first sequence is complementary to the nucleotide of the second sequence at its corresponding position, and the first sequence is of the same length as the second sequence.
  • a transient duplex between a first nucleic acid sequence and a second nucleic acid sequence is formed when under given reaction conditions, the 3' terminal group of the first nucleic acid sequence (if unblocked) may be extended by a DNA polymerase using the second nucleic acid sequence as a template; or the 3' terminal group of the second nucleic acid sequence (if unblocked) may be extended by a DNA polymerase using the first nucleic acid sequence as a template.
  • at least 80% of the nucleotides of the first nucleic acid in a region of at least 8 nucleotides are complementary to the nucleotides of the second nucleic acid at their corresponding positions.
  • At least 85%, 90%, 95%, 97%, 98%, or 99% of the nucleotides of the first nucleic acid in a region of at least 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, or 18 nucleotides are complementary to the nucleotides of the second nucleic acid at their corresponding positions.
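The percentage criterion above can be illustrated with a minimal sketch. It assumes the two regions are already aligned position by position without gaps, with the first written 5'->3' and the second written 3'->5'; the function names and the default thresholds (80% over at least 8 nucleotides) are illustrative choices taken from the ranges quoted above, not a method stated in the source.

```python
# Watson-Crick partners for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementary(first: str, second: str) -> float:
    """Percentage of aligned positions at which `first` pairs with `second`.

    `first` is written 5'->3' and `second` 3'->5', so index i in one string
    sits across from index i in the other (an assumption of this sketch).
    """
    if len(first) != len(second):
        raise ValueError("regions must be the same length")
    if not first:
        raise ValueError("regions must not be empty")
    matches = sum(1 for a, b in zip(first, second) if COMPLEMENT.get(a) == b)
    return 100.0 * matches / len(first)


def meets_threshold(first: str, second: str,
                    min_percent: float = 80.0, min_region: int = 8) -> bool:
    """True if the region is long enough and sufficiently complementary."""
    return (len(first) >= min_region
            and percent_complementary(first, second) >= min_percent)
```

For example, an 8-nucleotide region with a single mismatch is 87.5% complementary and so satisfies the 80% criterion but not the 90% one.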
  • a nucleotide in one strand (referred to as the "first strand") of a double-stranded nucleic acid located at a position "corresponding to" another position (e.g., a defined position) in the other strand (referred to as the "second strand") of a double-stranded nucleic acid refers to the nucleotide in the first strand that is complementary to the nucleotide at the corresponding position in the second strand.
  • a position in one strand (referred to as the "first strand") of a double-stranded nucleic acid corresponding to a nicking site within the other strand (referred to as the "second strand") of a double-stranded nucleic acid refers to the position between the two nucleotides in the first strand complementary to those in the second strand between which nicking occurs.
  • isothermal conditions refers to a set of reaction conditions where the temperature of the reaction is kept essentially constant (i.e., at the same temperature or within the same narrow temperature range wherein the difference between an upper temperature and a lower temperature is no more than about 20°C) during the course of the amplification.
  • a reaction is carried out under conditions where the difference between an upper temperature and a lower temperature is no more than 15°C, 10°C, 5°C, 3°C, 2°C or 1°C.
  • Exemplary temperatures for isothermal amplification include, but are not limited to, any temperature between 50°C and 70°C, or a temperature range of 50°C to 70°C, 55°C to 70°C, 60°C to 70°C, 65°C to 70°C, 50°C to 55°C, 50°C to 60°C, or 50°C to 65°C.
  • the terms "polymorphism" and "genetic variation," as used herein, refer to the occurrence of two or more genetically determined alternative sequences or alleles in a small region (i.e., one to several (e.g., 2, 3, 4, 5, 6, 7, or 8) nucleotides in length) in a population. The allelic form occurring most frequently in a selected population is referred to as the wild type form. Other allelic forms are designated as variant forms. Diploid organisms may be homozygous or heterozygous for allelic forms.
  • the genetic variation is a "single-nucleotide polymorphism" (SNP), which refers to any single nucleotide sequence variation, preferably one that is common in a population of organisms and is inherited in a Mendelian fashion.
  • the SNP is either of two possible bases and there is no possibility of finding a third or fourth nucleotide identity at an SNP site.
  • Sample sources Biological samples of the present invention include any sample that originates from an organism and that may contain a nucleic acid of interest (i.e., target nucleic acid). They may be provided by obtaining a blood sample, biopsy specimen, tissue explant, organ culture or any other tissue or cell preparation from a subject or a biological source.
  • the subject or biological source may be a human or non-human animal, a plant, a fungus, a bacterium, or a virus.
  • the subject or biological source may be suspected of having, or being at risk for having, a genetic disease or a pathogen infection.
  • the subject or biological source may be a patient that has a genetic disease or a pathogen infection.
  • the subject or biological source may be a control subject that does not have a genetic disease or a pathogen infection.
  • a bacterial sample can be utilized as starting material, provided it contains or is suspected of containing a bacterial genome of interest.
  • a sample may be obtained from any source that may potentially be contaminated by bacteria.
  • the sample to be tested can be selected or extracted from any bodily sample such as blood, urine, spinal fluid, tissue, vaginal swab, stool, amniotic fluid or buccal mouthwash.
  • the sample can come from a variety of other sources.
  • the sample can be from a plant, fertilizer, soil, liquid or other horticultural or agricultural product.
  • the sample can be from fresh food or processed food (for example infant formula, seafood, fresh produce and packaged food).
  • the sample can be from liquid, soil, sewage treatment, sludge and any other sample in the environment considered or suspected of being contaminated by bacteria.
  • the sample is a mixture of materials (for example, blood, soil and sludge)
  • it can be treated with an appropriate reagent effective to open the cells and expose or separate the strands of nucleic acids.
  • this lysing and nucleic acid denaturing step will allow amplification to occur more readily.
  • the bacteria can be cultured prior to analysis and thus a pure sample obtained.
  • fingerprinting genomic DNA may also be used to characterize other DNA molecules (e.g., cDNA).
  • the methods according to the present invention may also be applicable to characterize cDNA expression patterns.
  • the nucleic acids isolated from a biological source may be directly used in a nicking reaction. Alternatively, they may be amplified via known methods (such as PCR) prior to being subjected to action of a nicking agent.
  • a nicking agent may be a nicking endonuclease (used interchangeably with "nicking enzyme") or a restriction endonuclease (used interchangeably with "restriction enzyme").
  • a nicking endonuclease (NE) useful in the present invention may or may not have a nicking site that overlaps with its recognition sequence.
  • An exemplary NE that nicks outside its recognition sequence is N.BstNB I, which recognizes a unique nucleic acid sequence composed of 5'-GAGTC-3', but nicks four nucleotides beyond the 3' terminus of the recognition sequence.
  • the recognition sequence and the nicking site of N.BstNB I are shown below with V to indicate the cleavage site where the letter N denotes any nucleotide:
  • N.BstNB I may be prepared and isolated as described in U.S. Pat. No. 6,191,267, incorporated herein by reference in its entirety. Buffers and conditions for using this nicking endonuclease are also described in the '267 patent.
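As a minimal sketch of the geometry described above (recognition of 5'-GAGTC-3' and nicking four nucleotides beyond its 3' terminus), the following locates the expected nick positions on a single strand. The function name and the 0-based indexing convention are assumptions made for illustration; only the strand supplied is searched, so sites whose sense strand lies on the opposite strand would require searching the reverse complement as well.

```python
import re


def nbstnbi_nick_positions(seq: str, site: str = "GAGTC", offset: int = 4):
    """Return 0-based indices i such that the nick falls between seq[i-1] and seq[i].

    The nick is placed `offset` nucleotides 3' of the last base of `site`.
    A lookahead is used so that overlapping occurrences are not skipped.
    """
    seq = seq.upper()
    positions = []
    for match in re.finditer(f"(?={site})", seq):
        nick = match.start() + len(site) + offset
        if nick < len(seq):  # require at least one nucleotide 3' of the nick
            positions.append(nick)
    return positions
```

With this convention, `seq[:i]` ends in the new 3' terminus and `seq[i:]` begins with the new 5' terminus.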
  • An additional exemplary NE that nicks outside its recognition sequence is N.Alw I, which recognizes the following double-stranded recognition sequence:
  • The nicking site of N.Alw I is also indicated by the symbol "V". Both NEs are available from New England Biolabs (NEB). N.Alw I may also be prepared by mutating a type IIs RE Alw I as described in Xu et al. (Proc. Natl. Acad. Sci. USA 98:12990-5, 2001).
  • NEs that nick within their NERSs include N.BbvCI-a and N.BbvCI-b.
  • the recognition sequences for the two NEs and the NSs are shown as follows:
  • Both NEs are available from NEB.
  • nicking endonucleases include, without limitation, N.BstSE I (Abdurashitov et al., Mol. Biol. (Mosk) 30: 1261-7, 1996), an engineered EcoR V (Stahl et al., Proc. Natl. Acad. Sci. USA 93: 6175-80, 1996), an engineered Fok I (Kim et al., Gene 203: 43-49, 1997), endonuclease V from Thermotoga maritima (Huang et al., Biochem.
  • Additional NEs may be obtained by engineering other restriction endonucleases, especially type IIs restriction endonucleases, using methods similar to those for engineering EcoR V, Alw I, Fok I and/or Mly I.
  • a restriction endonuclease useful as a nicking agent can be any restriction endonuclease (RE) that nicks a double-stranded nucleic acid at its hemimodified recognition sequences.
  • Exemplary REs that nick their double-stranded hemimodified recognition sequences include, but are not limited to, Ava I, Bsl I, BsmA I, BsoB I, Bsr I, BstN I, BstO I, Fnu4H I, Hinc II, Hind II and Nci I. Additional REs that nick a hemimodified recognition sequence may be screened by the strand protection assays described in U.S. Pat. No. 5,631,147.
  • a nicking agent may recognize a nucleotide sequence in a DNA-RNA duplex and nick one strand of the duplex. In certain other embodiments, a nicking agent may recognize a nucleotide sequence in a double-stranded RNA and nick one strand of the RNA.
  • nicking agents require only the presence of the sense strand of a double-stranded recognition sequence in an at least partially double-stranded substrate nucleic acid for their nicking activities.
  • N.BstNB I is active in nicking a substrate nucleic acid that comprises, in one strand, the sequence of the sense strand of its recognition sequence "5'-GAGTC-3'" of which one or more nucleotides do not form conventional base pairs (e.g., G:C, A:T, or A:U) with nucleotides in the other strand of the substrate nucleic acid.
  • The nicking activity of N.BstNB I decreases as the number of nucleotides in the sense strand of its recognition sequence that do not form conventional base pairs with any nucleotides in the other strand of the substrate nucleic acid increases. However, even if none of the nucleotides of "5'-GAGTC-3'" form conventional base pairs with the nucleotides in the other strand, N.BstNB I may still retain 10-20% of its optimum activity. Several factors may be considered when choosing a particular nicking agent for a fingerprinting assay according to the present invention.
  • nicking agent a nicking enzyme that would produce short unique oligonucleotides may be desirable.
  • a nicking enzyme with an optimum temperature similar to that of the DNA polymerase may be desirable.
  • the nicking reaction may be simply performed by incubating the nucleic acid sample with a nicking agent under appropriate conditions. Identifying such appropriate conditions is within the ordinary skill in the art. For instance, the nicking reaction may be performed at the optimum temperature of the nicking agent and in a buffer suitable for the nicking agent.
  • the nicking reaction mixture that contains nicked nucleic acid fragments may be directly characterized. The characterization may be performed by any known applicable methods, including but not limited to, liquid chromatography, electrophoresis, hybridization and mass spectrometry. The use of such methods may indicate the presence or absence of one or more particular fragments unique to a species, subspecies, strain, or individual organism from which the nucleic acid sample is suspected to be derived. Alternatively, the use of such methods produces a pattern of nicked fragments, which may be compared with the pattern generated from an organism from which the nucleic acid sample is suspected to be derived.
  • the nicking reaction is performed in the presence of a DNA polymerase so that nicked fragments or portions thereof may be linearly amplified.
  • the amplification produces a larger amount of single-stranded nucleic acid or oligonucleotides, which increases the sensitivity of the fingerprinting assays.
  • the 3' terminus at the nicking site is extended by a DNA polymerase, preferably being 5'->3' exonuclease deficient and having a strand displacement activity and/or in the presence of a strand displacement facilitator, displacing the strand that contains the 5' terminus produced by the nicking reaction.
  • the resulting extension product having a recreated NARS for the NA is nicked ("re-nicked") by the NA.
  • the 3' terminus produced at the NS by the re-nicking is then extended in the presence of the DNA polymerase, also displacing the strand that contains the 5' terminus produced by the nicking reaction.
  • the nicking-extension cycle is repeated, preferably multiple times, to accumulate/amplify the displaced strand that contains the 5' terminus produced by the nicking reaction.
  • DNA polymerases useful in the present invention may be any DNA polymerase that is 5'->3' exonuclease deficient but has a strand displacement activity.
  • DNA polymerases include, but are not limited to, exo⁻ Deep Vent, exo⁻ Bst, exo⁻ Pfu, and exo⁻ Bca.
  • Additional DNA polymerases useful in the present invention may be screened for or created by the methods described in U.S. Pat. No. 5,631,147, incorporated herein by reference in its entirety.
  • the strand displacement activity may be further enhanced by the presence of a strand displacement facilitator as described below.
  • a DNA polymerase that does not have a strand displacement activity may be used.
  • DNA polymerases include, but are not limited to, exo⁻ Vent, Taq, the Klenow fragment of DNA polymerase I, T5 DNA polymerase, and Phi29 DNA polymerase.
  • the use of these DNA polymerases requires the presence of a strand displacement facilitator.
  • a "strand displacement facilitator" is any compound or composition that facilitates strand displacement during nucleic acid extensions from a 3' terminus at a nicking site catalyzed by a DNA polymerase.
  • Exemplary strand displacement facilitators useful in the present invention include, but are not limited to, BMRF1 polymerase accessory subunit (Tsurumi et al., J. Virology 67: 7648-53, 1993), adenovirus DNA-binding protein (Zijderveld and van der Vliet, J. Virology 68: 1158-64, 1994), herpes simplex viral protein ICP8 (Boehmer and Lehman, J. Virology 67: 711-5, 1993; Skaliter and Lehman, Proc. Natl. Acad. Sci. USA 91: 10665-9, 1994), single-stranded DNA binding protein (Rigler and Romano, J. Biol. Chem.
  • trehalose is present in the amplification reaction mixture.
  • Additional exemplary DNA polymerases useful in the present invention include, but are not limited to, phage M2 DNA polymerase (Matsumoto et al., Gene 84: 247, 1989), phage PhiPRDI DNA polymerase (Jung et al., Proc. Natl. Acad. Sci.
  • a DNA polymerase that has a 5'->3' exonuclease activity may be used.
  • such a DNA polymerase may be useful for amplifying short nucleic acid fragments that automatically dissociate from the template nucleic acid after nicking.
  • a RNA-dependent DNA polymerase may be used.
  • a DNA-dependent DNA polymerase that extends from a DNA primer such as Avian Myeloblastosis virus reverse transcriptase (Promega) may be used.
  • a target mRNA need not be reverse transcribed into cDNA and may be directly mixed with a template nucleic acid molecule that is at least substantially complementary to the target mRNA.
  • oligonucleotide fragments e.g., 6-20 nucleotides in length.
  • short oligonucleotides are more suitable to certain characterization technologies, such as LC-TOF and MALDI.
  • short fragments may be easily and more efficiently amplified to increase the speed and sensitivity of the present methods.
  • Such fragments also allow the use of a DNA polymerase that does not have a strand displacement activity, or is not 5' to 3' exonuclease deficient.
  • nicking agent recognition sequences or one nicking agent recognition sequence and one restriction enzyme recognition sequence
  • the proximity is about 12 to 24 nucleotides.
  • N can be any nucleotide
  • the number of Ns in the upper strand that is between the sequence of the sense strand of the N.BstNB I recognition sequence (i.e., 5'-GAGTC-3') and the sequence of the antisense strand of the recognition sequence (i.e., 5'-GACTC-3') may be between 12 and 24.
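The paired-site arrangement described above (the sense-strand sequence 5'-GAGTC-3' followed, 12 to 24 nucleotides later, by the antisense-strand sequence 5'-GACTC-3' on the same strand) can be located with a short sketch such as the following. The function name is hypothetical, and the choice to report at most one hit per start position is a simplification of this sketch.

```python
import re

# Lookahead so that overlapping candidates starting at different positions
# are all examined; group 2 captures the intervening nucleotides.
PAIR = re.compile(r"(?=(GAGTC([ACGT]{12,24})GACTC))")


def find_site_pairs(seq: str):
    """Return (start, spacer) tuples for GAGTC-N(12..24)-GACTC on one strand.

    `start` is the 0-based index of the G of GAGTC; `spacer` is the sequence
    between the two recognition sequences.
    """
    return [(m.start(), m.group(2)) for m in PAIR.finditer(seq.upper())]
```

Because the quantifier is greedy, the longest qualifying spacer is reported when more than one downstream GACTC falls within range of the same GAGTC.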
  • the recessed 3'-hydroxyl of each amplification template is filled in by the polymerase, the nicking enzyme then again cleaves the newly extended strand, the resulting short single-stranded oligonucleotide immediately dissociates, and the cycle of nicking and filling is repeated multiple times, resulting in a linear amplification of the short single-stranded oligonucleotides.
  • the above reaction cycle for synthesizing short oligonucleotides depends on the idea that, at the reaction temperature, oligonucleotides above a certain length form stable duplexes, while those below this length form unstable duplexes that dissociate readily.
  • the short oligonucleotide generated in the nicking reaction is below the threshold of stability, and is thereby released from the duplex.
  • the release of the short oligonucleotide from the duplex regenerates a 5' overhang, which may be again used as a template for synthesizing the short oligonucleotide.
  • a schematic representation of a linear amplification of short oligonucleotides using an N.BstNB I as the nicking agent is shown in Figure 1.
  • the identification of one or more pairs of nicking agent recognition sequence (or one nicking agent recognition sequence and one restriction enzyme recognition sequence) in close proximity in a genomic DNA with a known sequence may be performed by the use of applicable computer programs.
  • Such identification may facilitate the selection of a nicking agent for fingerprinting a nucleic acid sample suspected to be derived from the genomic DNA, as well as the identification of oligonucleotide fragments or patterns thereof that are unique to the genomic DNA.
  • These unique oligonucleotide fragments or patterns thereof expected to be present in a fingerprinting assay according to the present invention may be used as a standard for those generated from the nucleic acid sample. The comparison of the expected unique oligonucleotide fragments or patterns thereof with those generated from the nucleic acid sample would indicate whether the nucleic acid sample is derived from the genomic DNA.
  • Genome Identifier through a Nicking-enzyme generated Unique Mass Spectrum (GINUMS).
  • GINUMS further predicts the amplified short oligonucleotides of a genomic sequence produced in the linear amplification reactions described herein.
  • the search begins with the acquisition of the genomic sequence.
  • the analysis is simplified with "finished" genomic sequence, which is represented by one long sequence, rather than "draft" quality sequence, which is represented by a series of sequences, each a fragment of the overall sequence separated by gaps, although the analysis will work with any sequence or set of sequences.
  • sequence is stored in FASTA (or Pearson) format, but the program will work with any sequence format.
  • the sequence is read from either a file or a database, and stored into memory so that it may be searched later.
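Reading a FASTA-formatted sequence into memory, as described above, might look like the following minimal sketch. It is not the GINUMS code; the function name is hypothetical, and the sketch assumes plain FASTA text in which each record starts with a ">" header line.

```python
def read_fasta(lines):
    """Parse FASTA text (an iterable of lines) into {record_name: sequence}.

    The record name is the first whitespace-delimited token after '>'.
    Sequence lines are upper-cased and concatenated. A "draft" genome with
    several contigs simply yields several records.
    """
    records = {}
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            parts = line[1:].split()
            name = parts[0] if parts else ""
            chunks = []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        records[name] = "".join(chunks)
    return records
```

A file can be passed directly, for example `with open(path) as handle: genome = read_fasta(handle)`, where `path` is whatever file holds the sequence.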
  • GINUMS searches the genomic sequence for three different patterns, each represented by a regular expression.
  • Regular expressions are simply patterns represented by a string of characters that are used to search text, in this case one long string of A's, C's, G's and T's representing one strand of the genomic sequence.
  • the three regular expressions are:
  • GINUMS searches the genomic sequence with each of the regular expressions one at a time, and then stores all of the matching segments of sequence (termed "hits" for the rest of this document) for each search into separate lists. Thus, all of the hits are composed of the recognition sequences separated by any 14 to 24 base pairs, which are determined by the genomic sequence.
  • the program has three lists of hits in memory, each list the product of searching the genomic sequence with one of the regular expressions. For each hit in all of the lists, GINUMS will determine the "product(s)" of the reaction (the sequence that would be amplified), the mass of the product(s) and the starting and ending position of that product in the genomic sequence.
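The mass assigned to each product could be computed along the following lines. This is a sketch under stated assumptions, not the calculation used by GINUMS: the residue masses are the commonly quoted average values for DNA nucleotide residues, the terminal correction of -61.96 Da corresponds to an oligonucleotide with 5'-hydroxyl and 3'-hydroxyl ends, and whether a given product carries a 5' phosphate (about 79.98 Da) is left as a parameter to be set according to the chemistry of the nicking reaction.

```python
# Commonly quoted average masses (Da) of nucleotide residues within a DNA chain.
# These values are an assumption of this sketch, not taken from the source.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
TERMINAL_CORRECTION = -61.96  # adjusts the sum to 5'-OH / 3'-OH termini
PHOSPHATE = 79.98             # added when a 5' phosphate is present


def oligo_mass(seq: str, five_prime_phosphate: bool = False) -> float:
    """Approximate average molecular mass (Da) of a single-stranded DNA oligo."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    mass = sum(RESIDUE_MASS[base] for base in seq) + TERMINAL_CORRECTION
    if five_prime_phosphate:
        mass += PHOSPHATE
    return round(mass, 2)
```

Note that this mass depends only on base composition, so two products with the same composition but different sequences are indistinguishable by mass alone.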
  • the program searches genomic sequence for target sequences (that is, any sequence beginning with GGATC and ending with GACTC, separated by any 14 to 24 nucleotides).
  • the '.' character represents any one character, in this case any one A, T, C, or G. So, one way to search for each of these occurrences would be to construct 11 regular expressions and then search the sequence for each, such as:
  • GGATC..............GACTC (14 .'s)
  • GGATC...............GACTC (15 .'s)
  • the regular expression '.{2,4}' represents any string of any 2-4 characters.
  • the regular expression 'GGATC.{$min,$max}GACTC' represents any string that begins with GGATC and ends with GACTC, separated by at least $min characters and at most $max characters, where $min and $max are integers and $min ≤ $max. Because the brackets are preceded by a '.', these can be any characters. Building the profiles for more than one organism is as simple as repeating the process for as many genomic sequences as necessary, noting which masses were derived from which organism.
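A single bounded-repetition pattern replaces the eleven separate expressions. The sketch below builds such a pattern in Python; the function name and parameter names are illustrative stand-ins for the $min and $max variables mentioned above, and the flanking sequences default to the GGATC and GACTC example.

```python
import re


def build_pattern(left: str = "GGATC", right: str = "GACTC",
                  min_gap: int = 14, max_gap: int = 24):
    """Compile `left`, then min_gap..max_gap arbitrary characters, then `right`."""
    if min_gap < 0 or min_gap > max_gap:
        raise ValueError("require 0 <= min_gap <= max_gap")
    # Doubled braces are literal braces inside an f-string.
    return re.compile(f"{left}.{{{min_gap},{max_gap}}}{right}")
```

`build_pattern().finditer(genome)` then yields the hits. Since `finditer` reports non-overlapping matches, hits that overlap an earlier hit are skipped; wrapping the pattern in a lookahead is one way to recover them if that matters.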
  • GINUMS is capable of inputting the genomic sequences of an unlimited number of organisms, each contained within one file, all contained in the same directory, and then returning two outputs.
  • the first output is a set of masses unique to each organism. This can be accomplished by searching for the existence of each mass in one organism in all of the masses for the other organisms. If that mass is found only in one organism, then it is unique for that organism.
  • GINUMS does not do this precisely (it uses the properties of a data structure called a hash), but the process is analogous.
  • the second output is the complete list of masses for each organism, and would allow the user to determine if each organism has a distinct "profile" (hence, the name), or set, of masses. These could be written either to file or into a database for permanent storage, so that they may be searched for in experimental data at a later time.
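The two outputs described above can be sketched with a dictionary (Python's hash-based mapping) keyed by mass. This is only an analogy to what the text describes, with a hypothetical function name and input layout: `profiles` maps each organism name to the collection of product masses predicted for it.

```python
from collections import defaultdict


def unique_masses(profiles):
    """Return {organism: set of masses found in that organism only}.

    `profiles` is {organism: iterable of masses}. Masses are compared for
    exact equality, so they should be rounded consistently beforehand.
    """
    owners = defaultdict(set)          # mass -> organisms that produce it
    for organism, masses in profiles.items():
        for mass in masses:
            owners[mass].add(organism)

    unique = {organism: set() for organism in profiles}
    for mass, organisms in owners.items():
        if len(organisms) == 1:
            (only,) = organisms
            unique[only].add(mass)
    return unique
```

Each mass is visited once, so the cost grows with the total number of masses rather than with the number of organism pairs. Matching against measured spectra would need a mass tolerance instead of exact equality, which this sketch does not attempt.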
  • GINUMS uses the following model to determine the products (and their corresponding masses) of the amplification process on genomic sequence.
  • N represents any nucleotide
  • P represents the nucleotides in the product
  • the products of the reaction are S and S'.
  • short oligonucleotide fragments may also be generated or amplified by the use of a different nicking endonuclease alone or in combination with another nicking endonuclease or restriction endonuclease (e.g., a type IIs restriction endonuclease).
  • a linear amplification reaction may be first performed under the lower optimal temperature and then under the higher optimal temperature.
  • the amplified short fragments may also contain genetic variations. For instance, SNPs in human genomic DNA that are flanked by two N.BstNB I recognition sequences or an N.BstNB I recognition sequence and another restriction enzyme recognition sequence are shown in Table 9. The characterization of such short fragments would identify the SNPs within these fragments and facilitate the identification of the individual from which a nucleic acid sample is obtained.
  • the oligonucleotides linearly amplified as described above may be further exponentially amplified. These oligonucleotides are referred to as “initiating oligonucleotides” or “initiators.” Such exponential amplification greatly increases the production rate and amount of the initiating oligonucleotides.
  • the key idea for exponential amplification is to arrange it so that the oligonucleotide product of the linear reaction serves to create a new primer that in turn anneals to a target template and creates a new primer-template, which in turn produces more of the same oligonucleotide product, creating a chain reaction.
  • a simple scheme for exponentially amplifying a short oligonucleotide (also referred to as "direct EXPAR") using N.BstNB I as the nicking agent is depicted in Figure 2.
  • the scheme is based on our observation that even though the product oligonucleotide is unstable as a duplex it will form a transient duplex molecule with its complement and this transient duplex can act as a primer for extension by the DNA polymerase. Once extension of the oligonucleotide has occurred the duplex is stabilized by the additional complementary duplex section and will not readily dissociate. Extending the primer thus creates a stable primer-template that will produce oligonucleotide products in a linear fashion.
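The difference between the linear and the exponential regimes can be seen in a deliberately simplified counting model. Everything here is an assumption made for illustration: time is divided into discrete nick-and-extend rounds, every primed template releases exactly one product per round, and in the exponential case every released product primes one free template before the next round. Real kinetics are continuous and depend on enzyme rates and concentrations.

```python
def simulate(rounds: int, initial_primed: int = 1,
             exponential: bool = False, template_pool: int = 10**9) -> int:
    """Return the total number of oligonucleotide products released.

    `template_pool` is the total number of template molecules available,
    primed or not; it caps how many templates can become primed.
    """
    primed = initial_primed
    total_released = 0
    for _ in range(rounds):
        released = primed            # one product per primed template per round
        total_released += released
        if exponential:
            free_templates = template_pool - primed
            primed += min(released, free_templates)
    return total_released
```

With one primed template, ten rounds release 10 products in the linear case and 1023 in the exponential case, until the supply of free templates becomes limiting.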
  • oligonucleotides that we call amplification templates.
  • the key feature of these single-stranded oligonucleotides is that they contain two copies in tandem of the complement of the oligonucleotide product to be amplified, separated by the sequence of the antisense strand of the recognition sequence of N.BstNB I (i.e., 3'-CTCAG-5') and a four base spacer (on the 5' side).
  • This primed template will then continue to produce oligonucleotide product via the linear amplification cycle as described above (nicking after the four base spacer, dissociating the oligonucleotide and re-elongating the primer) as long as the enzymes remain active and dNTPs are available.
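The layout of such an amplification template can be checked with a small sketch. Given a product sequence, it builds the fully extended upper strand (product, 5'-GAGTC-3', four-base spacer, product) and takes its reverse complement as the template, then confirms that nicking four nucleotides 3' of GAGTC releases the product. The function names and the spacer sequence "ACAA" are hypothetical, and the sketch assumes the product itself contains no GAGTC.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_template(product: str, spacer: str = "ACAA",
                    site: str = "GAGTC") -> str:
    """Return the single-stranded amplification template, written 5'->3'.

    Its 3' end is complementary to `product`, so the product can prime it.
    """
    extended_upper = product + site + spacer + product
    return revcomp(extended_upper)


def released_product(template: str, site: str = "GAGTC",
                     offset: int = 4) -> str:
    """Sequence released when the extended upper strand is nicked."""
    upper = revcomp(template)                     # fully extended upper strand
    nick = upper.index(site) + len(site) + offset
    return upper[nick:]
```

Read 3'->5', the resulting template shows the two complement copies separated by 3'-CTCAG-5' and the spacer, matching the arrangement described above.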
  • Another scheme for exponentially amplifying short oligonucleotides referred to as "the replicator type" or "ping-pong amplification scheme" uses two templates.
  • an initiating oligonucleotide of sequence S primes a first template oligonucleotide T1 (S is about 8 to 16 nucleotides in length) to form the following partially double-stranded nucleic acid molecule.
  • T1 is about 8 to 16 nucleotides in length
  • the 3' terminus of T1 may be blocked by a phosphate group.
  • the upper strand of the above partially double-stranded nucleic acid molecule is elongated to form the following fully double-stranded nucleic acid molecule:
  • a nicking enzyme e.g., N.BstNB I
  • the lower strand is nicked to generate a 3' hydroxyl group and release an oligonucleotide blocked at the 3' terminus (un-productive oligonucleotide).
  • a nicking enzyme e.g., N.BstNB I
  • the resulting nicked structure is shown below:
  • the DNA polymerase uses sequence S as a template and fills in the recessed 3' hydroxyl group of the lower strand to produce the following double-stranded nucleic acid.
  • the nicking enzyme cleaves the lower strand, releasing an oligonucleotide having the sequence S' (which is completely complementary to the sequence S) as shown below:
  • the lower strand of the above partially double-stranded nucleic acid may be extended again to produce a fully double-stranded nucleic acid molecule.
  • the lower strand of the fully double-stranded nucleic acid may be nicked again to release another oligonucleotide having the sequence S'.
  • the above extension and nicking cycle may be repeated multiple times, resulting in the amplification of the oligonucleotide having the sequence S'.
  • This oligonucleotide (S') is capable of priming the oligonucleotide template T2 to form a partially double-stranded nucleic acid molecule as shown below:
  • the upper strand of the above partially double-stranded nucleic acid molecule is elongated to form the following fully double-stranded nucleic acid molecule:
  • a nicking enzyme e.g., N.BstNB I
  • the lower strand is nicked to generate a 3' hydroxyl group and release an oligonucleotide blocked at the 3' terminus (un-productive oligonucleotide).
  • a nicking enzyme e.g., N.BstNB I
  • the resulting nicked structure is shown below:
  • the nicking enzyme cleaves the lower strand, releasing an oligonucleotide having the sequence S (which is completely complementary to the sequence S') as shown below:
  • the lower strand of the above partially double-stranded nucleic acid may be extended again to produce a fully double-stranded nucleic acid molecule.
  • the lower strand of the fully double-stranded nucleic acid may be nicked again to release another oligonucleotide having the sequence S.
  • the above extension and nicking cycle may be repeated multiple times, resulting in the amplification of the oligonucleotide having the sequence S.
  • the two oligonucleotides having the sequences of S and S' are now capable of priming T1 and T2, respectively, and the exponential amplification is started.
  • S and S' are sufficiently short (e.g., 8-16 nucleotides in length) which prevents the triggers from forming a stable duplex in a reaction mixture under conditions for exponential amplification (e.g., 60°C).
  • This variation of exponential amplification has a substantial advantage of requiring a very high level of stringency of an oligonucleotide priming its template.
  • oligonucleotide e.g., an oligonucleotide having the sequence S or S'
  • the oligonucleotide has to be nearly perfectly base paired with its template for an exponential amplification reaction to start. In many cases, even a single mismatch in the oligonucleotide will inhibit the reaction.
  • The first template oligonucleotide (T1).
  • T1 may be 24 to 60 nucleotides (including all the integer values therebetween), preferably 32-36 nucleotides in length.
  • the 3'-end of T1 may be blocked with, for example, a phosphate, an amine, a biotin, a dideoxy group or a fluorophore (that is, there is no free 3'-hydroxyl in T1) to prevent extension by a polymerase.
  • the region from the 3' terminus of T1 to the fifth nucleotide directly 3' to the 3' terminus of the sense strand of a nicking endonuclease recognition sequence (e.g., GAGTC) (i.e., Region I in Figure 6) may be 8, 9, 10, 11, 12, 13, 14, 15 or 16 nucleotides in length and is completely (or at least substantially) complementary to the sequence S.
  • a nicking endonuclease e.g., N.BstNB I
  • sequence 3'-CTGAG-5' which is the sense strand of the recognition sequence for a nicking enzyme (e.g., N.BstNB I). Further in the 5' direction is about 10 to 20 nucleotides of any sequence (Region III in Figure 6). The sequence at the 5' end should not be complementary to any of the sequence at the 3'-end.
  • concentration of T1 is 0.001 to 1 micromolar if in solution. T1 can also be tethered to a solid support or covalently attached to any type of solid support.
  • The second template oligonucleotide (T2). Similar to T1, T2 may be 24 to 60 nucleotides (including all the integer values therebetween), preferably 32-36 nucleotides, in length.
  • the 3'-end of T2 may be blocked with, for example, a phosphate, an amine, a biotin, a dideoxy group or a fluorophore (that is, there is no free 3'-hydroxyl in T2) to prevent extension by a polymerase.
  • the region from the 3' terminus of T2 to the fifth nucleotide directly 3' to the 3' terminus of the sense strand of a nicking endonuclease recognition sequence (e.g., GAGTC) may be 8, 9, 10, 11, 12, 13, 14, 15 or 16 nucleotides in length and is at least substantially complementary to the sequence S'.
  • a nicking endonuclease e.g., N.BstNB I
  • CGAG the sense strand of the recognition sequence for a nicking enzyme
  • The concentration of T2 is 0.001 to 1 micromolar if in solution. T2 can also be tethered to a solid support or covalently attached to any type of solid support.
  • a DNA polymerase such as exo⁻ Vent, 9°Nm™, Taq, or Bst at a concentration of 0.002 to 20 units per microliter.
  • concentration of the polymerase is 0.02 to 0.5 units per microliter.
  • the enzyme is typically available commercially in 100 mM KCl, 0.1 mM EDTA, 10 mM Tris-HCl (pH 7.4), 1 mM DTT, and 50% glycerol.
  • a nicking enzyme such as N.BstNB I (from New England Biolabs, (NEB), MA) at a concentration of 0.002 to 20 units per microliter.
  • concentration of the nicking enzyme is 0.02 to 0.5 units per microliter.
  • the enzyme is supplied in 50 mM KCl, 10 mM Tris-HCl (pH 7.5), 0.1 mM EDTA, 1 mM DTT, 200 µg/ml BSA and
  • a salt (e.g., MgCl₂ or MgSO₄) at 0.5 to 10 mM in concentration. Preferably the concentration is 2 to 6 mM.
  • a salt e.g., (NH₄)₂SO₄
  • a salt e.g., KCl
  • a buffer e.g., Tris-HCl
  • pH 7-8 preferably 7.5 in the 10-50 mM range of concentrations, preferably 10 mM.
  • a reducing agent e.g., dithiothreitol (DTT)
  • a detergent e.g., Triton X-100
  • V/V 0.01 % to 1 % range
  • V/V 0.01 % to 1 % range
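The concentration ranges listed above can be collected into a small lookup table and used to sanity-check a planned reaction mixture. The following is only a minimal sketch: the dictionary keys, the function name `check_mix`, and the example mixture are hypothetical and chosen for illustration; only the numeric ranges are taken from the list above.

```python
# Allowed ranges quoted in the component list above (low, high).
# Key names are illustrative assumptions, not terms from the source.
RANGES = {
    "template_uM": (0.001, 1.0),
    "polymerase_U_per_uL": (0.002, 20.0),
    "nicking_enzyme_U_per_uL": (0.002, 20.0),
    "magnesium_mM": (0.5, 10.0),
    "buffer_mM": (10.0, 50.0),
    "pH": (7.0, 8.0),
}

# Preferred sub-ranges quoted in the same list.
PREFERRED = {
    "polymerase_U_per_uL": (0.02, 0.5),
    "nicking_enzyme_U_per_uL": (0.02, 0.5),
    "magnesium_mM": (2.0, 6.0),
}


def check_mix(mix):
    """Label each component as 'preferred', 'allowed' or 'out of range'."""
    report = {}
    for name, value in mix.items():
        low, high = RANGES[name]
        if not low <= value <= high:
            report[name] = "out of range"
            continue
        pref = PREFERRED.get(name)
        if pref and pref[0] <= value <= pref[1]:
            report[name] = "preferred"
        else:
            report[name] = "allowed"
    return report


# Hypothetical mixture used only to exercise the check.
example = {
    "template_uM": 0.1,
    "polymerase_U_per_uL": 0.05,
    "nicking_enzyme_U_per_uL": 1.0,
    "magnesium_mM": 12.0,
    "pH": 7.5,
}
report = check_mix(example)
print(report)
```

Components without a stated preferred range are reported as "allowed" whenever they fall inside the broad range.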
  • T1 and T2 may be blocked so that no free 3'-hydroxyl groups are available for extension.
  • T1 and T2 molecules may be immobilized in different regions of a solid substrate or different solid substrates (e.g., microbeads).
  • the ping-pong amplification reaction cycle is also schematically described in Figure 7.
  • the use of the ping-pong amplification reaction cycle in discriminating genetic variations is shown in Figure 8.
  • the fingerprinting assays described above may be carried out in various formats. For instance, the reactions may be performed in a mixture where all the components are soluble.
  • one or all of the template(s) can be covalently attached at the 3' end or the 5' end to a solid phase with the use of cross-linkers or spacers.
  • the solid phase includes (without limitation) nylon tip beads, fluted tips, microbeads, microplate wells, membranes, slides, arrays, and the materials of which the solid phase is made include glass, nylon 6/6, silica, plastics like polystyrene, polymers like poly(ethyleneimine), etc.
  • a replicator type of exponential amplification reaction may be performed using immobilized templates. More specifically, the first template (T1) molecules may be linked to beads, while the second template (T2) molecules are linked to different beads.
  • the beads linked with T1 molecules may be mixed with the beads linked with T2 molecules in a reaction mixture to amplify two oligonucleotides (i.e., S and S' as described above in the context of the replicator type of amplification reaction).
  • such a reaction may be carried out to amplify multiple oligonucleotide sequences.
  • Beads linked with template molecules other than T1 and T2 molecules may be included in the reaction mixture so that oligonucleotides other than S and S' may also be amplified. It is also possible and advantageous to perform amplification reactions (e.g., direct EXPAR as described above) on arrays of immobilized oligonucleotides.
  • the arrays can be composed of elements separated spatially on a 2-dimensional solid support. Suitable solid supports include, but are not limited to, glass slides, wafers, beads, microbeads, rods, ribbons, nylon6/6, nylon parts, polymer-coated solid supports, wells, etc.
  • the arrays can be further assembled on a 3-dimensional solid support.
  • the amplification template (e.g., ⁇ in Figure 2) is immobilized to a solid support at its 5' end or its 3' end, preferably at its 3'- end. There may or may not be any spacer between the template oligonucleotide and the solid support.
  • the immobilized template when annealing to a trigger oligonucleotide, may be used as a template to amplify an oligonucleotide having a sequence identical to the trigger oligonucleotide.
  • the newly synthesized oligonucleotide then primes an adjacent template oligonucleotide in the element on (or in) the array and an exponential amplification reaction takes place. Oligonucleotide amplification is detected by employing a DNA binding dye that preferentially binds to double-stranded DNA (e.g., SYBR® Green).
  • nucleic acids or oligonucleotides of the present invention are immobilized to a substrate to form an array.
  • an "array" refers to a collection of nucleic acids or oligonucleotides that are placed on a solid support in distinct areas. Each area is separated by some distance in which no nucleic acid or oligonucleotide is bound or deposited. In some embodiments, area sizes are 20 to 500 microns and the center to center distances of neighboring areas range from 50 to 1500 microns.
  • the array of the present invention may contain 2-9, 10-100, 101-400, 401-1,000, or more than 1,000 distinct areas.
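The geometric limits given above (area sizes of 20 to 500 microns, center-to-center distances of 50 to 1500 microns) can be checked with simple arithmetic. The sketch below is illustrative only: the function names and the 20 mm x 60 mm usable region are assumptions, not values from the source.

```python
def layout_is_valid(area_um, pitch_um):
    """Check an area size and center-to-center pitch against the stated ranges.

    The pitch must also exceed the area size so that neighboring areas
    are separated by an empty gap.
    """
    size_ok = 20 <= area_um <= 500
    pitch_ok = 50 <= pitch_um <= 1500
    return size_ok and pitch_ok and pitch_um > area_um


def areas_on_support(width_mm, height_mm, pitch_um):
    """Number of areas on a rectangular grid with the given pitch."""
    cols = int(width_mm * 1000 // pitch_um)
    rows = int(height_mm * 1000 // pitch_um)
    return cols * rows


# Hypothetical usable region of 20 mm x 60 mm with 200-micron areas
# on a 500-micron pitch.
print(layout_is_valid(200, 500))
print(areas_on_support(20, 60, 500))
```

With these assumed dimensions the grid holds 40 x 120 areas, which falls in the "more than 1,000 distinct areas" category.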
  • the nucleic acid or oligonucleotide may be immobilized to a substrate in the following two ways: (1) synthesizing the nucleic acids or the oligonucleotides directly on the substrate (often termed "in situ synthesis"), or (2) synthesizing or otherwise preparing the nucleic acids or the oligonucleotides separately and then positioning and binding them to the substrate (sometimes termed "post-synthetic attachment").
  • for in situ synthesis, the primary technology is photolithography. Briefly, the technology involves modifying the surface of a solid support with photolabile groups that protect, for example, oxygen atoms bound to the substrate through linking elements.
  • This array of protected hydroxyl groups is illuminated through a photolithographic mask, producing reactive hydroxyl groups in the illuminated areas.
  • a 3'-O-phosphoramidite-activated deoxynucleoside protected at the 5'-hydroxyl with the same photolabile group is then presented to the surface and coupling occurs through the hydroxyl group at illuminated areas.
  • the substrate is rinsed and its surface is illuminated through a second mask to expose additional hydroxyl groups for coupling.
  • a second 5'-protected, 3'-O-phosphoramidite-activated deoxynucleoside is presented to the surface. The selective photo-de-protection and coupling cycles are repeated until the desired set of products is obtained.
  • carbodiimides are commonly used in three different approaches to couple DNA to solid supports.
  • the support is coated with hydrazide groups that are then treated with carbodiimide and carboxy-modified oligonucleotide.
  • a substrate with multiple carboxylic acid groups may be treated with an amino-modified oligonucleotide and carbodiimide.
  • Epoxide-based chemistries are also used with amine modified oligonucleotides.
  • the primary post-synthetic attachment technologies include ink jetting and mechanical spotting.
  • Ink jetting involves the dispensing of nucleic acids or oligonucleotides using a dispenser derived from the ink-jet printing industry.
  • the nucleic acids or oligonucleotides are withdrawn from the source plate up into the print head and then moved to a location above the substrate.
  • the nucleic acids or oligonucleotides are then forced through a small orifice, causing the ejection of a droplet from the print head onto the surface of the substrate.
  • Mechanical spotting involves the use of rigid pins.
  • the pins are dipped into a nucleic acid or oligonucleotide solution, thereby transferring a small volume of the solution onto the tip of the pins. Touching the pin tips onto the substrate leaves spots, the diameters of which are determined by the surface energies of the pins, the nucleic acid or oligonucleotide solution, and the substrate.
  • Mechanical spotting may be used to spot multiple arrays with a single nucleic acid or oligonucleotide loading. Detailed description of using mechanical spotting in array fabrication may be found in the following patents or published patent applications: U.S. Patent Nos.
  • the substrate to which the nucleic acids or oligonucleotides of the present invention are immobilized to form an array is prepared from a suitable material.
  • the substrate is preferably rigid and has a surface that is substantially flat. In some embodiments, the surface may have raised portions to delineate areas. Such delineation separates the amplification reaction mixtures at distinct areas from each other and allows for the amplification products at distinct areas to be analyzed or characterized individually.
  • the suitable material includes, but is not limited to, silicon, glass, paper, ceramic, metal, metalloid, and plastics. Typical substrates are silicon wafers and borosilicate slides (e.g., microscope glass slides).
  • a particularly useful solid support is a silicon wafer that is usually used in the electronic industry in the construction of semiconductors.
  • the wafers are highly polished and reflective on one side and can be easily coated with various linkers, such as poly(ethyleneimine) using silane chemistry.
  • Wafers are commercially available from companies such as WaferNet, San Jose, CA.
  • one of ordinary skill in the art may vary the composition of immobilized molecules of the present array.
  • the T1 or T2 molecules of the present invention may or may not be immobilized to every distinct area of the array.
  • the nucleic acids or oligonucleotides in a distinct area of an array are homogeneous.
  • nucleic acids or oligonucleotides in every distinct area of an array to which the nucleic acids or oligonucleotides are immobilized are homogeneous.
  • homogeneous indicates that each nucleic acid or oligonucleotide molecule in a distinct area has the same sequence as another nucleic acid or oligonucleotide molecule in the same area.
  • nucleic acids or oligonucleotides in at least one of the distinct areas of an array are heterogeneous.
  • heterogeneous indicates that at least one nucleic acid or oligonucleotide molecule in a distinct area has a different sequence from another nucleic acid or oligonucleotide molecule in the area.
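The two definitions above reduce to a simple test on the set of sequences present in an area. A minimal sketch, assuming each area is represented as a list of sequence strings (the function name and the example sequences are hypothetical):

```python
def classify_area(sequences):
    """Return 'homogeneous' if every molecule in the area has the same
    sequence, otherwise 'heterogeneous'."""
    distinct = {seq.upper() for seq in sequences}
    if not distinct:
        raise ValueError("an area must contain at least one molecule")
    return "homogeneous" if len(distinct) == 1 else "heterogeneous"


# Illustrative sequences only; they are not taken from the source.
print(classify_area(["GAGTCAACT", "GAGTCAACT", "gagtcaact"]))
print(classify_area(["GAGTCAACT", "GAGTCAAGT"]))
```

Sequences are compared case-insensitively here, which is a design choice of the sketch rather than part of the definitions.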
  • molecules other than the nucleic acids or oligonucleotides described above may also be present in some or all of distinct areas of an array.
  • a molecule useful as an internal control for the quality of an array may be attached to some or all of distinct areas of an array.
  • Another example for such a molecule may be a nucleic acid useful as an indicator of hybridization stringency.
  • composition of nucleic acids or oligonucleotides in every distinct area of an array is the same.
  • Such an array may be useful in determining genetic variations in a particular gene in a selected population of organisms or in parallel diagnosis of a disease or a disorder associated with mutations in a particular gene.
  • the immobilized nucleic acids or oligonucleotides of the present invention may contain oligonucleotide sequences that are at least substantially complementary or identical to various target nucleic acids.
  • target nucleic acids include, but are not limited to, genes associated with hereditary diseases in animals, oncogenes, genes related to disease predisposition, genomic DNAs useful for forensics and/or paternity determination, genes associated with or rendering desirable features in plants or animals, and genomic or episomic DNA of infectious organisms.
  • An array of the present invention may contain nucleic acids or oligonucleotides that are at least substantially complementary or identical to a particular type of target nucleic acids in distinct areas.
  • an array may have a nucleic acid or an oligonucleotide that is at least substantially complementary or identical to a first gene related to disease predisposition in a first distinct area, another nucleic acid or an oligonucleotide that is at least substantially complementary or identical to a second gene also related to disease predisposition in a second distinct area, yet another nucleic acid or an oligonucleotide that is at least substantially complementary or identical to a third gene also related to disease predisposition in a third distinct area, etc.
  • an array is useful to determine disease predisposition of an individual animal (including a human) or a plant.
  • an array may have nucleic acids or oligonucleotides that are at least substantially complementary or identical to multiple types of target nucleic acids categorized by the functions of the targets.
  • an array may contain nucleic acids or oligonucleotides that are at least substantially complementary or identical to a portion of a target nucleic acid that contains various potential genetic variations.
  • a first area of the array may contain immobilized nucleic acids or oligonucleotides that are at least substantially complementary or identical to a portion of a target gene that contains a genetic variation of one allele of the target.
  • a second area of the array may contain immobilized nucleic acids or oligonucleotides that are at least substantially complementary or identical to a portion of target gene that contains a genetic variation of another allele of the target.
  • the array may have additional areas that contain immobilized nucleic acids or oligonucleotides that are at least substantially complementary or identical to portions of the target gene that contains genetic variations of additional alleles of the target.
  • the immobilized nucleic acids or oligonucleotides must be stable and not dissociate during various treatments, such as hybridization, washing or incubation at the temperature at which an amplification reaction is performed.
  • the density of the immobilized nucleic acids or oligonucleotides must be sufficient for the subsequent analysis.
  • typically 1000 to 10¹², preferably 1000 to 10⁶, 10⁶ to 10⁹, or 10⁹ to 10¹² ODNP molecules are immobilized in at least one distinct area.
  • the immobilization process should not interfere with the ability of the immobilized nucleic acids or oligonucleotides to participate in exponential nucleic acid amplification.
  • the linker (also referred to as a "linking element") comprises a chemical chain that serves to distance the nucleic acids or oligonucleotides from the substrate.
  • the linker may be cleavable.
  • the substrate is coated with a polymeric layer that provides linking elements with many reactive ends/sites.
  • a common example is glass slides coated with polylysine, which are commercially available.
  • Another example is substrates coated with poly(ethyleneimine) as described in
  • the array of the present invention enables the high throughput of various analyses to which the present nucleic acid amplification is applicable.
  • an array of T2 molecules may be used to amplify multiple target nucleic acids.
  • the reaction mixture or the products of an amplification reaction performed in the presence of a target nucleic acid may be pooled together and applied to the array of T2 molecules.
  • the reaction mixtures or the amplification products of different amplification reactions may be applied to distinct areas of the array.
  • Another round ("second round") of amplification reactions may then be performed on the array in the presence of a nicking agent that recognizes the nicking agent recognition sequence of which the antisense strand is present in the T2 molecules.
  • the amplification products of the second round of reactions performed on the array may be pooled together and analyzed. If the array (e.g., a microwell array) has distinct areas that are delineated by certain physical barriers, the amplification products of the second round of reactions in distinct areas may be analyzed individually.
  • nucleic acid molecules of the present invention may be immobilized via the methods described above that are useful in preparing an array.
  • any methods known in the art may be used.
  • a target nucleic acid of the present invention may be immobilized by the use of a fixative or tissue printing. It may also be first isolated or purified and then transferred to a substrate that binds to nucleic acids or oligonucleotides, such as nitrocellulose or nylon membranes.
  • the products of a nicking reaction, a linear amplification reaction or an exponential amplification reaction according to the present invention may be characterized by any applicable known methods. These methods include, but are not limited to mass spectrometry, fluorescence spectrometry, electrophoresis, liquid chromatography, hybridization and radiography. Certain exemplary methods are described in more detail below.
  • not all the amplified nucleic acids are characterized. In other words, in these embodiments, only certain amplified nucleic acids that meet a given criterion need be characterized. For instance, the amplified nucleic acid molecules may first be separated by liquid chromatography and only the fractions that contain short nucleic acid fragments are further characterized by, for example, mass spectrometry.
  • the fingerprinting assays of the present invention can be read out in a number of ways, but mass spectrometry is the most suitable, since a series of well-defined and characterized oligonucleotides are generated that have known mass/charge ratios (m/z).
  • Exemplary mass spectrometric analysis includes Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry (MALDI) and Time-of-Flight (TOF).
  • Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry is becoming an ever more popular technique for studying biomolecules (Hillenkamp et al., Anal. Chem. 63, 1193A-1203A, 1991). This technique ionizes high molecular weight biopolymers with minimal concomitant fragmentation of the sample material. This is typically accomplished via the incorporation of the sample to be analyzed into a matrix that absorbs radiation from an incident UV or IR laser. This energy is then transferred from the matrix to the sample resulting in desorption of the sample into the gas phase with subsequent ionization and minimal fragmentation.
  • One of the advantages of MALDI-MS over ESI-MS is the simplicity of the spectra obtained: MALDI spectra are generally dominated by singly charged species. Typically, the gaseous ions generated by MALDI techniques are detected and analyzed by determining the time-of-flight (TOF) of these ions. While MALDI-TOF MS is not a high resolution technique, resolution can be improved by making modifications to such systems, e.g., by the use of tandem MS techniques, or by the use of other types of analyzers, such as Fourier transform (FT) and quadrupole ion traps.
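The remark about spectral simplicity can be illustrated with the standard relation between neutral mass, charge and m/z. The sketch below assumes ionization by gain or loss of protons only and uses a hypothetical 6000 Da oligonucleotide; the negative-ion convention is an assumption made here for illustration.

```python
PROTON_MASS = 1.007276  # Da


def mass_to_charge(neutral_mass, charge):
    """m/z of an ion formed by adding (charge > 0) or removing
    (charge < 0) |charge| protons from a neutral molecule."""
    if charge == 0:
        raise ValueError("charge must be non-zero")
    return (neutral_mass + charge * PROTON_MASS) / abs(charge)


oligo_mass = 6000.0  # hypothetical neutral mass in Da

# A spectrum dominated by singly charged ions shows one peak per species.
print(round(mass_to_charge(oligo_mass, -1), 3))

# Multiple charging spreads the same species over a series of peaks.
for z in (-2, -3, -4):
    print(z, round(mass_to_charge(oligo_mass, z), 3))
```

A spectrum with one peak per oligonucleotide is easier to interpret when many amplified fragments are present at once.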
  • MALDI techniques have found application for the rapid and straightforward determination of the molecular weight of certain biomolecules (Feng and Konishi, Anal. Chem. 64, 2090-2095, 1992; Nelson, Dogruel and Williams, Rapid Commun. Mass Spectrom. 8, 627-631, 1994). These techniques have been used to confirm the identity and integrity of certain biomolecules such as peptides, proteins, oligonucleotides, nucleic acids, glycoproteins, oligosaccharides and carbohydrates. Further, these MS techniques have found biochemical applications in the detection and identification of post-translational modifications on proteins.
  • Verification of DNA and RNA sequences that are less than 100 bases in length has also been accomplished using ESI with FTMS to measure the molecular weight of the nucleic acids (Little et al, Proc. Natl. Acad. Sci. USA 92, 2318-2322, 1995).
  • the matrix is an important feature of MALDI-MS.
  • analysis of nucleic acids by MALDI can be divided into two steps.
  • the first step involves preparing the sample by mixing the sample to be analyzed with a molar excess of a chemical commonly referred to as the "matrix.” See, e.g., Wu et al. Rapid Commun. Mass Spectrom. 7:142-146 (1993).
  • the primary purpose of the matrix is to promote ionization of the nucleic acid. Without the matrix, the nucleic acid molecule tends to fragment upon exposure to the laser energy, so that the mass and identity of the nucleic acid is difficult or impossible to determine.
  • matrix refers to a substance which absorbs radiation at a wavelength substantially corresponding to the pulse of laser energy used in the MALDI method, and where the matrix facilitates desorption and ionization of molecules.
  • a matrix may be any one of several small, light-absorbing chemicals that may be mixed in solution with a nucleic acid in such a manner that, upon drying on a solid support (e.g., a sample plate or a probe element), the crystalline matrix-embedded analyte molecules are successfully desorbed by laser irradiation and ionized from the solid phase crystals into the vapor phase and accelerated as intact molecular ions.
  • the second step of the MALDI process involves desorption of the bulk portions of the solid sample by a short pulse of laser light.
  • the analyte-containing sample is added to (e.g., spotted onto) a coating of cationic polyelectrolyte, allowing the analyte (nucleic acid) to bind to the cationic polyelectrolyte.
  • This spot is then washed in order to purify the nucleic acid.
  • the spot is then treated with matrix (when the matrix is a liquid) or a solution of matrix (when the matrix is a solid).
  • When the matrix is a solid, the spot should be allowed to dry in order to remove the solvent that was formerly used to dissolve the matrix in solution. Thereafter, this spot of nucleic acid and matrix can be subjected to MALDI-MS to provide a very strong signal due to the nucleic acid.
  • the present invention provides a solid support having a surface, where that surface is at least partially coated with a coating comprising cationic polyelectrolyte, where at least some of the cationic polyelectrolyte is in contact with nucleic acid and the nucleic acid is in contact with matrix.
  • the solid support is a plate, e.g., a stainless steel plate, and the cationic polyelectrolyte either forms a continuous coating across all or a significant portion of the surface, or is spotted onto the surface in distinct regions. Nucleic acid and matrix are then located in distinct regions on the surface, so as to provide an array-type appearance.
  • the surface may be a 96-well plate, with cationic polyelectrolyte, nucleic acid and matrix located in one, and preferably more than one, of the wells.
  • This array is then subjected to MALDI-MS, where the various regions are sequentially subjected to laser light, and the mass spectrum of the nucleic acid present in the spots is sequentially obtained.
  • the matrix should meet one or more of the following criteria, and preferably meets many or all of these criteria.
  • the matrix should be able to embed and isolate nucleic acid (e.g., by co-crystallization), it should be soluble in solvents compatible with nucleic acids, it should be stable under the vacuum used in MALDI, it should assist co-desorption of the nucleic acid upon laser irradiation, and it should promote ionization of the nucleic acid.
  • the matrix should comprise a chromophore that strongly absorbs in the wavelength of light being emitted by the laser. For instance, if the laser is an ultraviolet laser, then the matrix should have a chromophore that absorbs in the ultraviolet region.
  • The following chemicals have been identified as suitable matrices for nucleic acids, where these as well as other suitable matrix chemicals known in the art may be used in the methods and compositions of the present invention: 6-aza-2-thiothymine (ATT), glycerol, 2,4,6-trihydroxyacetophenone (THAP), picolinic acid (PA), 3-hydroxypicolinic acid (HPA), 2,5-dihydroxybenzoic acid, anthranilic acid, nicotinic acid, and salicylamide. Mixtures of these chemicals are also suitable.
  • the matrix is a solid at room temperature.
  • the matrix may be a liquid chemical, where suitable liquid matrices are substituted or unsubstituted: (1) alcohols, including: glycerol, 1,2- or 1,3-propane diol, 1,2-, 1,3- or 1,4-butane diol, triethanolamine; (2) carboxylic acids including: formic acid, lactic acid, acetic acid, propionic acid, butanoic acid, pentanoic acid, hexanoic acid and esters thereof; (3) primary or secondary amides including acetamide, propanamide, butanamide, pentanamide and hexanamide, whether branched or unbranched; (4) primary or secondary amines, including propylamine, butylamine, pentylamine, hexylamine, heptylamine, diethylamine and dipropylamine; (5) nitriles, hydrazine and hydrazide.
  • liquid matrices are particularly useful when the MALDI laser emits light in the infrared spectrum. It is reported that THAP works best for samples below 10kDa while HPA and PA are more appropriate for oligonucleotides above 10kDa. Acidic matrices, e.g., HPA, are preferred for single-stranded nucleic acids, while neutral matrices, e.g., glycerol and ATT, are preferred for double-stranded nucleic acids.
  • MS is particularly advantageous in those applications in which it is desirable to eliminate a size separation step prior to molecular weight determination. MS sensitivities of at least 1 amu may be achieved. The smallest mass difference among nucleic acid bases is that between adenine and thymine, which is 9 Daltons. Particularly preferred methodologies according to the present invention employ Liquid Chromatography-Time-of-Flight Mass Spectrometry (LC-TOF-MS).
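The 9 Dalton figure can be reproduced from the molecular formulas of the four DNA bases. The sketch below uses standard average atomic weights and the usual formulas for adenine, cytosine, guanine and thymine; the variable names are illustrative. Because the sugar-phosphate backbone is common to all four nucleotides, the differences between bases equal the differences between the corresponding nucleotides.

```python
from itertools import combinations

# Standard average atomic weights (Da).
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

# Molecular formulas of the free bases.
BASES = {
    "A": {"C": 5, "H": 5, "N": 5},          # adenine
    "C": {"C": 4, "H": 5, "N": 3, "O": 1},  # cytosine
    "G": {"C": 5, "H": 5, "N": 5, "O": 1},  # guanine
    "T": {"C": 5, "H": 6, "N": 2, "O": 2},  # thymine
}


def formula_mass(formula):
    """Average molecular mass of a formula given as {element: count}."""
    return sum(ATOMIC_WEIGHT[el] * n for el, n in formula.items())


# Absolute mass difference for every pair of bases.
differences = {
    pair: abs(formula_mass(BASES[pair[0]]) - formula_mass(BASES[pair[1]]))
    for pair in combinations(sorted(BASES), 2)
}

closest_pair = min(differences, key=differences.get)
for pair, diff in sorted(differences.items(), key=lambda item: item[1]):
    print(pair, round(diff, 3))
print("smallest difference:", closest_pair)
```

The adenine/thymine pair gives the smallest difference, about 9.0 Da, so an instrument must resolve at least this difference to distinguish single-base substitutions by mass.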
  • LC-TOF-MS is composed of an orthogonal acceleration Time-of-Flight (TOF) MS detector for atmospheric pressure ionization (API) analysis using electrospray (ES) or atmospheric pressure chemical ionization (APCI).
  • LC-TOF-MS provides high mass resolution (5000 FWHM), high mass measurement accuracy (to within 5ppm) and very good sensitivity (ability to detect femtomolar amount of DNA polymer).
  • TOF instruments are generally more sensitive than quadrupoles, but are correspondingly more expensive.
  • LC-TOF-MS has a more efficient duty cycle than quadrupole instruments, since the latter sequentially analyze one mass at a time while rejecting all others (this is referred to as single ion monitoring (SIM)).
  • LC-TOF-MS samples all of the ions passing into the TOF analyzer at the same time. This results in higher sensitivity, improved by between 10- and 100-fold, and provides quantitative data. Enhanced resolution (5000 FWHM) and mass measurement accuracy of better than 5 ppm imply that differences between nucleosides as small as 9 amu (Daltons) can be accurately measured.
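The resolution and accuracy figures quoted above can be translated into absolute mass units to see why a 9 Da difference is comfortably measurable. This is a back-of-the-envelope sketch: the function names, the 6000 Da oligonucleotide and the m/z value of 1000 are assumptions for illustration.

```python
def ppm_to_daltons(mass_da, ppm):
    """Absolute mass error corresponding to a relative accuracy in ppm."""
    return mass_da * ppm * 1e-6


def peak_width_fwhm(mz, resolution):
    """Peak width (full width at half maximum) implied by a resolving
    power defined as m / delta-m."""
    return mz / resolution


oligo_mass = 6000.0  # hypothetical oligonucleotide mass in Da

error = ppm_to_daltons(oligo_mass, 5)
width = peak_width_fwhm(1000.0, 5000)

print(f"5 ppm at {oligo_mass:.0f} Da = {error:.3f} Da")
print(f"peak width at m/z 1000 with 5000 FWHM = {width:.2f}")
```

Under these assumptions the mass error is a few hundredths of a Dalton, far smaller than the 9 Da separating adenine-containing from thymine-containing fragments.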
  • the TOF mass analyzer performs very high frequency sampling (10 spectra/sec) of all ions simultaneously across the full mass range of interest.
  • the duty cycle of the LC-TOF-MS allows high sensitivity spectra to be recorded in quick succession, making the instrument compatible with more efficient separation techniques such as narrow bore LC, capillary electrophoresis (CE) and capillary electrochromatography (CEC).
  • the ES or APCI aerosol spray is directed perpendicularly past the sampling cone, which is displaced from the central axis of the instrument. Ions are extracted orthogonally from the spray into the sampling cone aperture leaving large droplets, involatile materials, particulates and other unwanted components to collect in the vent port that is protected with an exchangeable liner.
  • the second orthogonal step enables the volume of gas (and ions) sampled from atmosphere to be increased compared with conventional API sources. Gas at atmospheric pressure sampled through an aperture into a partial vacuum forms a freely expanding jet, which represents a region of high pressure compared to the surrounding vacuum. When this jet is directed into the second aperture of a conventional API interface it increases the flow of gas through the second aperture.
  • Maintaining a suitable vacuum in the MS-TOF therefore places a restriction on the maximum diameter of the apertures in such an LC interface. Ions in the partial vacuum of the ion block are extracted electrostatically into the hexapole ion bridge that efficiently transports ions to the analyzer.
  • the coupling of the TOF mass analyzers with MUX-technology allows the connection of up to 8 HPLC columns in parallel to a single LC-TOF-MS (Micromass, Manchester, UK).
  • a multiplexed electrospray (ESI) interface is used for on-line LC-MS utilizing an indexed stepper motor to sequentially sample from up to 8 HPLC columns or liquid inlets operated in parallel.
  • LC-TOF-MS is sometimes preferred over use of MALDI-TOF because LC-TOF-MS is a quantitative method for analysis of the molecular weight of polymers. LC-TOF-MS does not fragment the polymers and it employs a very gentle ionization process compared to matrix-assisted laser desorption/ionization (MALDI). Because every MALDI blast is different, the ionization is not quantitative. LC-TOF-MS does, however, produce different m/z values for polymers.
  • HPLC High-Performance Liquid Chromatography
  • HPLC instruments consist of a reservoir of mobile phase, a pump, an injector, a separation column, and a detector. Compounds are separated by injecting an aliquot of the sample mixture onto the column. The different components in the mixture pass through the column at different rates due to differences in their partitioning behavior between the mobile liquid phase and the stationary phase.
  • the pumps provide a steady high pressure with no pulsating, and can be programmed to vary the composition of the solvent during the course of the separation.
  • Exemplary detectors useful within the methods of present invention include UV-VIS absorption, or fluorescence after excitation with a suitable wavelength, mass spectrometers and IR spectrometers.
  • IP-RP-HPLC on non-porous PS/DVB particles with chemically bonded alkyl chains has been shown to be a rapid alternative to capillary electrophoresis in the analysis of both single- and double-stranded nucleic acids, providing similar degrees of resolution.
  • In contrast to ion-exchange chromatography, which does not always retain double-stranded DNA as a function of strand length (since AT base pairs interact with the positively charged stationary phase more strongly than GC base pairs), IP-RP-HPLC enables a strictly size-dependent separation.
  • Denaturing HPLC is an ion-pair reversed-phase high performance liquid chromatography methodology (IP-RP-HPLC) that uses a non-porous C-18 column as the stationary phase.
  • the column is comprised of a polystyrene-divinylbenzene copolymer.
  • the mobile phase is comprised of an ion-pairing agent of triethylammonium acetate (TEAA), which mediates binding of DNA to the stationary phase, and acetonitrile (ACN) as an organic agent to achieve subsequent separation of the DNA from the column.
  • a linear gradient of acetonitrile allows separation of fragments based on size and/or presence of heteroduplexes (this is the traditional use of the DHPLC technology). DHPLC identifies mutations and polymorphisms based on detection of heteroduplex formation between mismatched nucleotides in double-stranded PCR-amplified DNA. Sequence variation creates a mixed population of heteroduplexes and homoduplexes during reannealing of wild-type and mutant DNA. When this mixed population is analyzed by HPLC under partially denaturing temperatures, the heteroduplexes elute from the column earlier than the homoduplexes because of their reduced melting temperature. Analysis can be performed on individual samples to determine heterozygosity, or on mixed samples to identify sequence variation between individuals.
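The mixed population formed on reannealing can be enumerated directly: each top strand may pair with the bottom strand of either allele. The following sketch is a simplified illustration in which each allele is represented by its top-strand sequence, and a mismatch is counted wherever the two alleles differ; the function name and example sequences are hypothetical.

```python
from itertools import product


def reanneal(wild_type, mutant):
    """List the duplexes formed when denatured wild-type and mutant DNA
    reanneal at random.

    Each duplex is (top allele, bottom allele, kind, mismatch count).
    The bottom strand of an allele is the complement of its own top
    strand, so mismatches occur where the paired alleles differ.
    """
    alleles = {"wt": wild_type, "mut": mutant}
    duplexes = []
    for top, bottom in product(alleles, repeat=2):
        kind = "homoduplex" if top == bottom else "heteroduplex"
        mismatches = sum(a != b for a, b in zip(alleles[top], alleles[bottom]))
        duplexes.append((top, bottom, kind, mismatches))
    return duplexes


# Illustrative alleles differing at a single position.
duplexes = reanneal("ACGTAC", "ACGCAC")
for duplex in duplexes:
    print(duplex)
```

Two of the four possible duplexes are heteroduplexes carrying the mismatch, and these are the species that elute earlier under partially denaturing conditions.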
  • the various nicking and amplification reactions described above may also be read out by detectors that measure real-time fluorescence, such as the MJ Opticon from MJ Research (Boston, MA), the ABI Prism 7000 instrument (Foster City, CA), and endpoint plate readers, such as the Ultramark from Biorad (Hercules, CA).
  • Real time monitoring is a very useful method as it enables parameters such as initial rates to be determined with accuracy and ease.
  • the use of double-strand specific fluorescent dyes such as SYBR® Green from Molecular Probes (Eugene, OR) is especially useful when used during the amplification reactions described above. Dyes that bind to single-stranded nucleic acids can also be used, perhaps at times with slightly less efficacy than double-strand specific dyes.
  • intercalating dyes such as SYBR ®
  • dual labeled probes FRET (fluorescent energy transfer) probes
  • Molecular Beacons exemplary fluorescent intercalating agents include, without limitation, those disclosed in U.S. Pat. Nos.
  • Fluorescence produced by fluorescent intercalating agents may be detected by various detectors, including PMTs, CCD cameras, fluorescent-based microscopes, fluorescent-based scanners, fluorescent-based microplate readers, and fluorescent-based capillary readers.
  • the nicking agent in the reaction mixture may be inactivated (e.g., by heat) and a fresh DNA polymerase be added.
  • the presence of the active DNA polymerase, but not the nicking agent, allows any partial duplexes to be extended to completely double-stranded nucleic acid fragments.
  • the generation of such completely double-stranded nucleic acid fragments allows the binding of a greater number of fluorescent intercalating agents, which in turn increases the signal that may be detected.
  • the present invention provides a composition or kit comprising a polynucleotide, a nicking agent and a polymerase.
  • inventive compositions have unique properties that render them particularly useful in identifying the source organism of a nucleic acid sample.
  • a source organism may be a bacterium, fungus, virus, plant, non-human animal or human.
  • Such a composition or kit generally comprises the template oligonucleotide(s) useful for amplifying initiating oligonucleotides described above. It may also further comprise at least one, two, several, or each of the following components: (1) a nicking agent (e.g., a NE or a RE) that recognizes the nicking agent recognition sequence of which one strand is present in the template oligonucleotide(s); (2) a suitable buffer for nicking agent (1); (3) a RE that functions in combination with a nicking agent (which may be identical to or different from nicking agent (1)); (4) a suitable buffer for RE (3); (5) a DNA polymerase; (6) a suitable buffer for the DNA polymerase (5); (7) dNTPs; (8) a modified dNTP; (9) a strand displacement facilitator (e.g., 1 M trehalose); and (10) a fluorescent intercalating agent.
  • the present invention alleviates and overcomes many drawbacks of the present state of the art through the discovery of novel methods and kits for rapidly fingerprinting DNA to identify prokaryotic and eukaryotic species, subspecies, and especially strains or individuals of the subspecies.
  • the present invention is especially suited for identifying different bacterial strains involved in, for example, nosocomial infections, since the methods and kits are to be sensitive enough to detect differences between, for example, bacterial isolates of the same species.
  • the present invention contemplates identifying, for instance, species, subspecies, and the differences between the individuals of the subspecies, such as pedigrees.
  • the method can be used for: (1) diagnosis of bacterial disease in plants, animals and humans; (2) monitoring for bacterial content and/or contamination in the environment; (3) monitoring food for bacterial contamination; (4) monitoring manufacturing processes for bacterial contamination; (5) monitoring quality assurance/quality control of laboratory tests involving microbiological assays; (6) tracing bacterial contamination and/or outbreaks of bacterial infections; (7) genome mapping; (8) monitoring bioremediation sites; and (9) monitoring agricultural sites for test crops, bacteria and recombinant molecules.
  • a further aspect of the present invention is a machine for automating the identification of bacterial strains, particularly by mass spectrometry.
  • the present invention affords the medical community with a means to not only identify the infectious agents, but also to rapidly characterize the strain or strains involved so that effective measures may be timely employed.
  • Another application of this method is in the manufacturing process.
  • a number of manufacturing processes, for instance those for drugs, microorganism-aided synthesis, food manufacturing, chemical manufacturing and fermentation, all rely on either the presence or the absence of bacteria.
  • the method of the present invention can be used. It can monitor bacterial contamination or test that strain purity is being maintained.
  • This method can also be used to test stored blood for bacterial contamination. This would be important in blood banking where bacteria such as Yersinia enterocolitica can cause serious infection and death if it is in transfused blood.
  • the procedure can also be used for quality assurance and quality control in monitoring bacterial contamination in laboratory tests.
  • the Guthrie bacterial inhibition assay uses a specific strain of bacteria to measure phenylalanine in newborn screening. If this strain changes it could affect test results and thus affect the accuracy of the newborn screening program.
  • This method of the present invention can be used to monitor the strain's purity. Any other laboratory test that uses or relies on bacteria in the assay can be monitored. The laboratory or test environment can also be monitored for bacterial contamination by sampling the lab and testing for specific strains of bacteria. This procedure will also be useful in hospitals for tracing the origin and distribution of bacterial infections. It can show whether or not the infection of the patient is a hospital-specific strain.
  • the type of treatment and specific anti-bacterial agent can depend on the source and nature of the bacteria. There are a variety of applications for the fingerprinting technology described here.
  • the fingerprinting technology described herein may be useful to detect polymorphisms in the human genome, in view of the large number of fragments that can be generated.
  • the methods and compositions of the present invention may be used to interrogate a sample for the presence of fragments that uniquely identify all pathogens, and fragments obtained from the human genome that can be used to uniquely identify individuals.
  • oligonucleotides uniquely identify K12, and 15 oligonucleotides uniquely identify O157 (Table 4). These oligonucleotides unique for distinguishing the two E. coli strains are not found in the oligonucleotides that would be generated in the presence of N.BstNB I from chromosome 21 of the human genome (see Table 5).
  • Table 6 lists the relative probability of obtaining an overlapping mass (composition) as a function of the length of the trigger in a background of DNA that is as complex as the human genome (4 billion bases).
  • Table 7 lists the number of oligonucleotides having 6-16 nucleotides that would be generated in the presence of N.BstNB I from 35 bacterial species. The average number of oligonucleotides that would be generated per organism is 19. Greater than 99% of these fragments have a unique mass and sequence, thus virtually every one of the fingerprinting oligonucleotides is unique to the organism from which it is generated.
  • Table 1 E. coli K12 oligonucleotide fragments.
  • Table 2 E. coli O157 oligonucleotide fragments.
  • oligonucleotide fragments from human chromosome 21 Sig. (signature) indicates the number of As-Cs-Gs-Ts (A-C-T-G).
  • the "sequence" includes the sequence of the sense or antisense strand of the recognition sequence(s) of N.BstNB I and/or N.Alw I
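The composition signature ("Sig.") mentioned in the items above can be illustrated with a short Python sketch. It only counts bases; the function names and the A-C-G-T ordering of the counts are assumptions made for illustration, since the text gives the order both as "As-Cs-Gs-Ts" and as "A-C-T-G". Two oligonucleotides with the same signature have the same base composition and therefore the same nominal mass, which is why composition overlap matters for a mass-based readout.

```python
from collections import Counter


def composition_signature(oligo: str) -> str:
    """Return the base counts of an oligonucleotide as 'A-C-G-T' (assumed order)."""
    counts = Counter(oligo.upper())
    unknown = set(counts) - set("ACGT")
    if unknown:
        raise ValueError(f"unexpected characters in sequence: {sorted(unknown)}")
    return "-".join(str(counts[base]) for base in "ACGT")


def same_composition(first: str, second: str) -> bool:
    """True when two oligonucleotides cannot be told apart by composition alone."""
    return composition_signature(first) == composition_signature(second)
```

For example, `same_composition("ACGT", "TGCA")` is true even though the two sequences differ, so a mass measurement alone would not separate them.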

Abstract

A sample of nucleic acid that is characteristic of a particular source, e.g., a bacterium, can be analyzed to determine whether that particular sample is from a particular source. The analysis protocol subjects the sample to a nicking reaction to produce a family of nicked fragments. In certain embodiments, the nicking reaction is performed in the presence of a polymerase and nucleoside triphosphates to amplify nicked fragments or portions thereof. The nicked fragments or portions thereof may be analyzed to determine the source of the nucleic acid sample.

Description

ORGANISM FINGERPRINTING USING NICKING AGENTS
BACKGROUND OF THE INVENTION
Field of the Invention
The present invention is generally directed to compositions and methods for identifying any type of organism or individual, where the invention is based on creating and analyzing nucleotide sequence information characteristic of nucleic acids present in the organism or individual.
Description of the Related Art
Nosocomial (hospital-based) infections have become one of the most serious problems in infectious disease. Staphylococcus aureus is exceeded only by Escherichia coli as a leading cause of nosocomial infections. See, for example, Brumfitt, W. et al., Drugs Exptl. Clin. Res. 76:205-214 (1990). One type of S. aureus, methicillin-resistant S. aureus, is of particular interest because it is resistant to all penicillin-based antibiotics. Patients in the intensive care unit are very susceptible to bacterial infections, due to interventions such as respiratory tubes and indwelling catheters. E. coli and S. aureus, if introduced into surgical wounds, the blood stream or the urinary tract, cause serious, sometimes life-threatening infections. There is no easy solution to the problem, but a partial solution in most "nosocomial outbreaks" is simply identifying the source of the infection. That is, is the infectious agent coming from a common source (e.g., an infected nurse or doctor, or an instrument such as a respirator), or is there some other reason for the sudden emergence of a single type of bacterial infection?
Hospital laboratories can quickly identify the infectious agent (e.g., S. aureus), but they do not have the ability to determine whether a single strain of the organism is causing the outbreak (and therefore a possible common source) or if several different strains are responsible for the outbreak. At the present time most outbreaks can only be characterized in retrospect, since the outbreak is over before the bacterial isolates can be identified. For example, isolates can be identified in a matter of 1 or 2 days, but the strain identification usually takes weeks or even months, because the strains are still analyzed by slow, labor-intensive culture-based methods. Current methods of strain typing bacteria include phage typing, plasmid analysis, and antibiotic susceptibility (biotyping). See, for example, Zuccarelli, A. et al., J. Clin. Microbiol. 28:97-102, 1990; Tokue, Y. et al., Tohoku J. Exp. Med. 763:31-37, 1991; Coia, J. et al., J. Med. Microbiol. 37:125-132, 1990; Pennington, T. et al., J. Clin. Microbiol. 29:390-392, 1991; Tveten, Y. et al., J. Clin. Microbiol. 29:110-1105, 1991; Fluit, A. et al., Eur. J. Clin. Microbiol. Infect. Dis. 9:605-608, 1990; Thomson-Carter, F. et al., J. Gen. Microbiol. 735:2093-2097, 1989; and Preheim, L. et al., Eur. J. Clin. Microbiol. Infect. Dis. 70:428-436, 1991. These methods, currently used by the Centers for Disease Control, are laborious, time consuming and often yield inconclusive results.
A comprehensive understanding of biological systems requires knowledge of the chemical composition of cellular systems, and this understanding is becoming increasingly important in elucidating disease mechanisms (Ban, E., Anal. Chem. 70:308A, 1998). Elucidating disease mechanisms requires diagnostic tools for direct DNA sequencing, DNA separation and isolation, and mRNA profiling as well as numerous protein analysis techniques. Analytical methods in this research field must therefore be rapid, accurate, sensitive and robust.
Interspersed repetitive DNA sequence elements have been characterized extensively in eucaryotes although their function still remains largely unknown. The conserved nature and interspersed distribution of these repetitive sequences have been exploited to amplify unique sequences between repetitive sequences by the polymerase chain reaction. Additionally, species-specific repetitive DNA elements have been used to differentiate between closely related murine species. Prokaryotic genomes are much smaller than the genomes of mammalian species (approximately 10^6 versus 10^9 base pairs of DNA, respectively). Since these smaller prokaryotic genomes are maintained through selective pressures for rapid DNA replication and cell reproduction, the non-coding repetitive DNA should be kept to a minimum unless maintained by other selective forces. For the most part prokaryotes have a high density of transcribed sequences. Nevertheless, families of short intergenic repeated sequences occur in bacteria.
The presence of repetitive sequences has been demonstrated in many different bacterial species. Reports of novel repeated sequences in the eubacterial genera, Escherichia, Salmonella, Deinococcus, Calothrix, and Neisseria, and the fungi, Candida albicans and Pneumocystis carinii, illustrate the presence of dispersed extragenic repetitive sequences in many organisms. One such family of repetitive DNA sequences in eubacteria is the Repetitive Extragenic Palindromic (REP) elements. The consensus REP sequence for this family includes a 38-mer sequence containing six totally degenerate positions, including a 5 bp variable loop between each side of the conserved stem of the palindrome.
Previous studies have used repeated rRNA genes as probes in Southern blots to detect restriction fragment length polymorphisms (RFLPs) between strains. Repeated tRNA genes have been used as consensus primer binding sites to directly amplify DNA fragments of different sizes by PCR amplification of different strains. Limitations of both techniques include the use of radioisotope and time-intensive methods such as Southern blotting and polyacrylamide gel electrophoresis to clearly distinguish subtle differences in the sizes of the DNA fragments generated. The latter technique could only distinguish organisms at the species and genus level. The tDNA-PCR fingerprints are generally invariant between strains of a given species and between related species. Other previous studies include the use of species-specific repetitive DNA elements as primer-binding sites for PCR-based bacterial species identification. Though such methods allow species identification by PCR with picogram amounts of DNA, only single PCR products are generated, which precludes the generation of strain-specific genomic fingerprints.
Accordingly, there exists a long-felt need for the development of rapid bacterial strain typing technologies. The present invention fulfills this and other related needs.
BRIEF SUMMARY OF THE INVENTION
The present invention is generally directed to compositions and methods for identifying any type of organism or individual using nucleic acid-based fingerprinting. The method relies on the creation of a family of nucleic acid or oligonucleotide fragments formed by action of nicking agents on a nucleic acid sample. In a preferred embodiment, the nicking reaction is performed in the presence of a polymerase and one or more (preferably all four of the natural) deoxyribonucleoside triphosphates. Under this preferred embodiment, the nucleic acid or oligonucleotide fragments or portions thereof are created in higher concentration and are therefore more amenable to characterization. According to this method, a family, also referred to herein as a pattern, of nucleic acid or oligonucleotide fragments of known characteristics (e.g., mass/charge ratios) is produced, which unambiguously identifies an organism or individual. The readout of the fingerprinting assay is preferably matrix-assisted laser desorption ionization (MALDI) or liquid chromatography time-of-flight (LC-TOF) mass spectrometry; however, other characterization methods may be used as well.
In a preferred aspect, a method has been devised according to the present invention in which a set of oligonucleotides is linearly amplified from template structures pre-existing in genomic DNA that can be used to initiate their own exponential amplification. By utilizing adjacent, close, nicking agent recognition sequences that occur infrequently in genomic DNA, short oligonucleotides are linearly amplified in the presence of a nicking agent that recognizes the nicking agent recognition sequences. The products from the linear amplification, referred to herein as "initiating oligonucleotides" or "initiators," can then be coupled to a method for exponentially amplifying the initiating oligonucleotides in true chain reactions. The linear and the exponential amplification reactions can be made into a homogenous assay in which 10^8- to 10^9-fold amplification can be achieved in as little as 3 minutes. In certain embodiments, the linear amplification reaction, the exponential amplification, or both may be performed under isothermal conditions (e.g., at 60°C).
In certain embodiments, the exponential or string reaction is composed of two reaction components: a first amplification reaction that replicates the initiating oligonucleotide and a second amplification reaction that replicates the complement of the initiating oligonucleotide. In the exponential reaction, two template oligonucleotides are used, a first template that comprises a sequence complementary to the initiating oligonucleotide, and a second template that comprises a sequence complementary to the complement of the initiating oligonucleotide. Since the first template may anneal to the complement of the initiating oligonucleotide and be used as a template for amplifying the initiating oligonucleotide, and the second template may anneal to the initiating oligonucleotide and be used as a template for amplifying the complement of the initiating oligonucleotide, once the reaction is initiated, a chain reaction ensues. The reaction has the peculiar property that a very high level of stringency is required in order to trigger the chain amplification.
A useful example is taken from E. coli K12 in which 55 unique oligonucleotides can be generated from genomic DNA without the use of pre-synthesized probes or primers. The read-out is ideally done by mass spectrometry (LC-TOF or MALDI) but can also be accomplished by other means, e.g., using real-time fluorimetry or "self-amplifying arrays". Foreknowledge of the sequence of the individual or organism is not necessary as it is possible to generate the fragments de novo from genomic DNA. The methods described here permit the creation of an assay panel of diagnostic oligonucleotides that can identify any organism or individual. The present invention is advantageous over previous methods for identifying bacterial species. Although previous studies have demonstrated that species-specific repetitive DNA elements can be used as primer-binding sites for PCR-based bacterial species identification, these methods only generated single PCR products in a single species. The present invention provides a novel approach to using nicking agent recognition sites within a genomic DNA to directly fingerprint bacterial (as well as viral, fungal, and in fact all prokaryotic and eukaryotic) genomes. The unique patterns of oligonucleotides generated by a nicking agent recognition sequence identify different bacterial species and strains. In addition, the present invention may produce polymorphic oligonucleotide fragments that contain genetic variations (e.g., single nucleotide polymorphisms, deletions, insertions, variable repeats) from eukaryotic genomes. The characterization of these polymorphic oligonucleotide fragments enables the identification of the individual organism from which a nucleic acid sample is obtained.
In one aspect, the present invention provides a method comprising: a) providing a nucleic acid sample; b) treating the nucleic acid sample with components under nicking conditions, where the components comprise: i) a nicking agent; and the conditions cause the nicking agent to nick the nucleic acid sample to thereby produce a family of initiating oligonucleotide fragments; c) subjecting one or more members of the family of initiating oligonucleotide fragments to a characterization process to thereby provide results; and d) identifying a source for the nucleic acid sample based on the results of the characterization process.
Optionally, the components of the above method further comprise: ii) a polymerase; and iii) a deoxyribonucleoside triphosphate. Optionally, the components of the above methods may further comprise: iv) a template oligonucleotide comprising from 3' to 5':
(A) a first nucleotide sequence that is at least substantially complementary to a nucleotide sequence present in one or more members of the family of initiating oligonucleotide fragments;
(B) a sequence of one strand of a nicking agent recognition sequence; and
(C) a second nucleotide sequence. Optionally, the template is immobilized.
In certain embodiments, sequence iv)(B) is a sequence of an antisense strand of the nicking agent recognition sequence, and sequence iv)(A) is at least substantially identical to sequence iv)(C).
Optionally, the components of the above methods may further comprise v) a second template oligonucleotide, comprising, from 3' to 5':
(A) a first nucleotide sequence that is at least substantially identical to a nucleotide sequence present in the one or more members of the family of initiating oligonucleotide fragments;
(B) a sequence of a sense strand of the nicking agent recognition sequence; and
(C) a second nucleotide sequence, wherein sequence iv)(B) is a sequence of a sense strand of the nicking agent recognition sequence.
Optionally, the 3' terminus of template iv), the 3' terminus of template v), or both are blocked. Optionally, sequence iv)(A) is exactly identical to sequence iv)(C). Also optionally, template iv), template v), or both may be immobilized. Optionally, the components may further comprise a restriction endonuclease. Optionally, the one or more members of the family of initiating oligonucleotide fragments are 6-16 nucleotides in length.
In certain embodiments, the nicking agent is a nicking endonuclease, such as N.BstNB I or N.Alw I. In certain embodiments, the polymerase is exo- Vent polymerase, Bst polymerase, or 9°Nm™ polymerase.
In certain embodiments, the characterization process is performed at least partially by a technique selected from the group consisting of luminescence spectroscopy or spectrometry, fluorescence spectroscopy or spectrometry, mass spectrometry (such as MALDI or LC-TOF), liquid chromatography, fluorescence polarization, and electrophoresis.
In another aspect, the present invention provides a template oligonucleotide for amplifying a portion of a target nucleic acid, wherein
(i) the portion of the target is 6-16 nucleotides in length and flanked by
(A) a sequence of one strand of a first nicking enzyme recognition sequence, and
(B) a sequence of one strand of a second nicking enzyme recognition sequence, or a sequence of one strand of a restriction enzyme recognition sequence, and
(ii) the template oligonucleotide comprises from 3' to 5':
(A) a first nucleotide sequence that is at least substantially complementary to the portion of the target,
(B) a sequence of an antisense strand of a third nicking enzyme recognition sequence, and
(C) a second nucleotide sequence that is at least substantially identical to the first nucleotide sequence.
Optionally, the first and third nicking enzyme recognition sequences are identical to each other. Optionally, the first, second and third nicking enzyme recognition sequences are identical to each other. In certain embodiments, some or all of the nicking enzyme recognition sequences are recognizable by N.BstNB I.
Optionally, the 3' terminus of the template oligonucleotide is blocked. Optionally, the 3' terminus of the template oligonucleotide is immobilized.
In certain embodiments, the portion of the target is selected from the products listed in Tables 1, 2, 4, 5, and 9. Optionally, the portion of the target comprises a genetic variation (e.g., a single nucleotide polymorphism).
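The layout of the template oligonucleotide described in this aspect can be illustrated with a minimal Python sketch. It assumes the N.BstNB I recognition sequence 5'-GAGTC-3' with a nick four bases downstream, as stated for Figure 1. The four-base spacer, its position between the antisense recognition sequence and the copy of the complement nearer the template's 5' end, and all function names are assumptions made for illustration; the text above does not specify them. The second helper models extension of an annealed initiator along the template and returns whatever lies downstream of the nick on the newly made strand.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

SENSE_SITE = "GAGTC"   # N.BstNB I recognition sequence, sense strand
NICK_OFFSET = 4        # nick falls four bases downstream of the site


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def build_template(target: str, spacer: str = "AAAA") -> str:
    """Return a template, written 5'->3', for amplifying `target`.

    Read 3'->5' the template carries (A) the complement of the target,
    (B) the antisense strand of the recognition sequence, an assumed
    spacer, and (C) a second copy of the complement of the target.
    """
    if len(spacer) != NICK_OFFSET:
        raise ValueError("spacer must match the nick offset")
    complement = reverse_complement(target)
    antisense_site = reverse_complement(SENSE_SITE)  # 5'-GACTC-3'
    return complement + spacer + antisense_site + complement


def released_product(template: str, target_length: int) -> str:
    """Model full extension on the template and return the nicked-off product."""
    new_strand = reverse_complement(template)
    site_start = target_length  # the site follows the annealed initiator
    if new_strand[site_start:site_start + len(SENSE_SITE)] != SENSE_SITE:
        raise ValueError("recognition sequence not found where expected")
    cut = site_start + len(SENSE_SITE) + NICK_OFFSET
    return new_strand[cut:]
```

With this layout, the product released by nicking has the same sequence as the initiator that primed the reaction, which is the property needed for the product to prime further templates.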
In another aspect, the present invention provides a composition for amplifying a portion of a target nucleic acid, comprising a first template oligonucleotide and a second template oligonucleotide, wherein
(i) the portion of the target is 6-16 nucleotides in length and flanked by
(A) a sequence of one strand of a first nicking enzyme recognition sequence, and
(B) a sequence of one strand of a second nicking enzyme recognition sequence, or a sequence of one strand of a restriction enzyme recognition sequence,
(ii) the first template oligonucleotide comprises from 5' to 3':
(A) a first nucleotide sequence,
(B) a sequence of a sense strand of a third nicking agent recognition sequence, and
(C) a second nucleotide sequence that is at least substantially complementary to the portion of the target, and
(iii) the second template oligonucleotide comprises from 5' to 3':
(A) a first nucleotide sequence,
(B) a sequence of a sense strand of a fourth nicking agent recognition sequence, and
(C) a second nucleotide sequence that is at least substantially identical to the portion of the target.
Optionally, the first, third and fourth nicking enzyme recognition sequences are identical to each other. Optionally, the first, second, third and fourth nicking enzyme recognition sequences are identical to each other. In certain embodiments, some or all of the nicking enzyme recognition sequences are recognizable by N.BstNB I.
Optionally, the 3' termini of the first and second templates are blocked. Optionally, the 3' terminus of the first, the 3' terminus of the second, or both termini are immobilized.
Optionally, the portion of the target is selected from the products listed in Tables 1, 2, 4, 5 and 9. Optionally, the portion of the target comprises a genetic variation (e.g., single nucleotide polymorphism).
In another aspect, the present invention provides a composition comprising:
(1) a nicking agent, and
(2) at least one fragment selected from the products listed in Tables 1, 2, 4, 5 and 9, wherein the at least one fragment is produced at least partially by action of the nicking agent.
Optionally, the nicking agent is N.BstNB I or N.Alw I. Optionally, the composition may further comprise a restriction endonuclease selected from Bcl I, BsaB I, Bsm I, Bsr I and BsrD I, wherein at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, or 50 fragments are selected from the products listed in Table 9.
In another aspect, the present invention provides a kit for identifying the source of a nucleic acid sample, comprising one or two template oligonucleotides as described above for amplifying a portion of a genomic DNA of an organism suspected to be the source of the nucleic acid sample, wherein the portion of the genomic DNA is 6-16 nucleotides in length and flanked by
(A) a sequence of one strand of a first nicking enzyme recognition sequence, and (B) a sequence of one strand of a second nicking enzyme recognition sequence, or a sequence of one strand of a restriction enzyme recognition sequence.
Optionally, the kit may further comprise a nicking enzyme that recognizes one or more nicking enzyme recognition sequences in the template(s).
In certain embodiments, the above kits may further comprise a DNA polymerase and/or one or more deoxyribonucleoside triphosphates.
Optionally, the portion of the genomic DNA is selected from the products listed in Tables 1, 2, 4, 5 and 9. Optionally, the portion of the genomic DNA comprises a single nucleotide polymorphism.
In another aspect, the present invention provides an array, comprising
(a) a substrate having a plurality of distinct areas; and (b) the template oligonucleotide of claim A1 or the composition of claim B1 immobilized to at least one of the plurality of distinct areas.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 schematically shows the cycle of the synthesis and release of an amplified short oligonucleotide. On the upper strand is indicated the recognition site for the enzyme N.BstNB I (5'-GAGTC-3') and the specific nicking site four bases downstream on this strand. The oligonucleotide produced is indicated in blue, the primer in green and the template in red. The lengths of the exemplary template and amplified oligonucleotides are shown in the upper left drawing.
Figure 2 is a diagram of the reaction scheme for the exponential amplification of oligonucleotides. The segments in red represent the sequence complement of the oligonucleotide sequence to be amplified, the signal sequence (shown in blue). The amplification template, τ, consists of two copies of the signal complement flanking the nicking enzyme recognition site shown as a light blue box, and a spacer sequence, shown as a green segment. The signal oligonucleotide (labeled σ) is produced in the linear amplification cycle for each amplification template created.
Figure 3 is a schematic representation of a template oligonucleotide used in a replicator type of amplification reaction.
Figures 4a and 4b show the time of flight spectra of the multiplexed amplification reaction in the presence of short oligonucleotides generated from the genomic DNA of the E. coli strains K12 and O157, respectively.
Figures 5a and 5b show the MALDI spectra of the multiplexed amplification reaction in the presence of short oligonucleotides generated from the genomic DNA of the E. coli strains K12 and O157, respectively.
Figure 6 shows real time fluorescence detection of the oligonucleotide amplification by an MJ Opticon I. The time of amplification is plotted on the X axis versus accumulated fluorescence on the Y axis. Each curve from left to right represents a serial dilution of 3-fold. The starting concentration of the trigger was 0.01 picomoles/microliter and the last dilution (far right curve (bottom curve on figure)) was 1.9 x 10^-7 picomoles/microliter. This represents a dilution range of about 20,000-fold (3^9).
Figure 7 is a schematic diagram showing the ping-pong amplification reaction cycle.
Figure 8 is a schematic diagram showing the application of the ping-pong amplification reaction cycle in discriminating genetic variations.
DETAILED DESCRIPTION OF THE INVENTION
A. Overview The present invention provides methods for identifying any type of organism or individual using polynucleotide-based fingerprinting. The method relies on the creation of a family of polynucleotides formed by action of nicking agents on a nucleic acid sample. For instance, a nucleic acid sample may be nicked by a nicking agent to produce various nicked nucleic acid fragments. The resulting nicked fragments are then characterized to determine the identity of the organism from which the nucleic acid sample was obtained or derived. Alternatively, a nucleic acid sample may be nicked by a nicking agent in the presence of a DNA polymerase. The presence of the DNA polymerase allows linear amplification of the nicked fragments or portions thereof, thus facilitates the characterization of such fragments. These fragments may be further amplified by coupling the linear amplification reaction with an exponential amplification reaction. Such an exponential amplification greatly increases the speed and the sensitivity of the identification methods. The nicked fragments may be characterized to determine the presence or absence of a particular fragment unique, or characteristic, to a specific species, subspecies, or strain. The conclusion made from the presence or absence of that particular fragment may be further verified by determining the presence or absence of one or more other fragments also unique, or characteristic, to the specific species, subspecies, or strain. In certain embodiments, the presence or absence of particular fragments in a nicking reaction mixture of a nucleic acid sample need not be determined. Rather, the pattern formed by the resulting nicked fragments (e.g., a mass spectrum of the nicked fragments) is characterized and compared with a standard pattern known for a particular species, subspecies, or strain. 
The standard pattern may be generated by performing the nicking reaction with a nucleic acid sample from the particular species, subspecies or strain under conditions identical to those used for the nucleic acid sample being tested. Alternatively, the standard pattern may be generated based on the known nucleic acid sequence of the particular species, subspecies, or strain.
In certain embodiments, the present invention uses the presence of closely located nicking agent recognition sequences. In the presence of a nicking agent that recognizes such sequences, short oligonucleotide fragments (e.g., 6-16 nucleotides long) may be generated or amplified. Short oligonucleotide fragments may also be generated or amplified by the use of a nicking agent in combination with a restriction enzyme if one or more recognition sequences of the nicking agent are located closely to the recognition sequence(s) of the restriction enzyme. The resulting short oligonucleotide fragments may be easily separated from other larger fragments. Such separation simplifies the pattern generated from the characterization of the nucleic acid fragments in a nicking reaction mixture. In addition, as will be described in detail below, short oligonucleotide fragments may easily be exponentially amplified to increase the speed and sensitivity of the present methods. Short oligonucleotide fragments are also more suitable to certain characterization technologies, such as LC-TOF and MALDI. In some embodiments, the nicked fragments may also contain genetic variations useful to distinguish among individual organisms.
B. Definitions
The following conventions and definition of terms as used herein may be helpful to an understanding of the detailed description of the invention. Additional definitions may be found throughout the description of the present invention.
"Fingerprinting" refers to the identification of a source of nucleic acid based on analysis of the nucleic acid according to the methods described herein. For instance, fingerprinting may be applied to the identification of a bacterial strain from its characteristic pattern of oligonucleotides produced by action of a nicking agent (e.g., N.BstNB I). This characteristic pattern is the strain's genomic "fingerprint", which is determined by the sequence of the strain's genomic DNA.
The terms "3'" and "5'" are used herein to describe the location of a particular site within a single strand of nucleic acid. When a location in a nucleic acid is "3' to" or "3' of" a reference nucleotide or a reference nucleotide sequence, this means that the location is between the 3' terminus of the reference nucleotide or the reference nucleotide sequence and the 3' hydroxyl of that strand of the nucleic acid. Likewise, when a location in a nucleic acid is "5' to" or "5' of" a reference nucleotide or a reference nucleotide sequence, this means that it is between the 5' terminus of the reference nucleotide or the reference nucleotide sequence and the 5' phosphate of that strand of the nucleic acid. Further, when a nucleotide sequence is "directly 3' to" or "directly 3' of" a reference nucleotide or a reference nucleotide sequence, this means that the nucleotide sequence is immediately next to the 3' terminus of the reference nucleotide or the reference nucleotide sequence. Similarly, when a nucleotide sequence is "directly 5' to" or "directly 5' of" a reference nucleotide or a reference nucleotide sequence, this means that the nucleotide sequence is immediately next to the 5' terminus of the reference nucleotide or the reference nucleotide sequence.
As used herein, "nicking" refers to the cleavage of only one strand of a fully double-stranded nucleic acid molecule or a double-stranded portion of a partially double-stranded nucleic acid molecule at a specific position relative to a nucleotide sequence that is recognized by the enzyme that performs the nicking. The specific position where the nucleic acid is nicked is referred to as the "nicking site" (NS).
A "nicking agent" (NA) is an enzyme that recognizes a particular nucleotide sequence of a completely or partially double-stranded nucleic acid molecule and cleaves only one strand of the nucleic acid molecule at a specific position relative to the recognition sequence. Nicking agents include, but are not limited to, a nicking endonuclease (e.g., N.BstNB I) and a restriction endonuclease (e.g., Hinc II) when a completely or partially double-stranded nucleic acid molecule contains a hemimodified recognition/cleavage sequence in which one strand contains at least one derivatized nucleotide(s) that prevents cleavage of that strand (i.e., the strand that contains the derivatized nucleotide(s)) by the restriction endonuclease.
A "nicking endonuclease" (NE), as used herein, refers to an endonuclease that recognizes a nucleotide sequence of a completely or partially double-stranded nucleic acid molecule and cleaves only one strand of the nucleic acid molecule at a specific location relative to the recognition sequence. Unlike a restriction endonuclease (RE), which requires its recognition sequence to be modified by containing at least one derivatized nucleotide to prevent cleavage of the derivatized nucleotide-containing strand of a fully or partially double-stranded nucleic acid molecule, a NE typically recognizes a nucleotide sequence composed of only native nucleotides and cleaves only one strand of a fully or partially double-stranded nucleic acid molecule that contains the nucleotide sequence.
As used herein, "native nucleotide" refers to adenylic acid, guanylic acid, cytidylic acid, thymidylic acid or uridylic acid. A "derivatized nucleotide" is a nucleotide other than a native nucleotide.
The nucleotide sequence of a completely or partially double-stranded nucleic acid molecule that a NA recognizes is referred to as the "nicking agent recognition sequence" (NARS). Likewise, the nucleotide sequence of a completely or partially double-stranded nucleic acid molecule that a NE recognizes is referred to as the "nicking endonuclease recognition sequence" (NERS). The specific sequence that a RE recognizes is referred to as the "restriction endonuclease recognition sequence" (RERS). A "hemimodified RERS," as used herein, refers to a double-stranded RERS in which one strand of the recognition sequence contains at least one derivatized nucleotide (e.g., α-thio deoxynucleotide) that prevents cleavage of that strand (i.e., the strand that contains the derivatized nucleotide within the recognition sequence) by a RE that recognizes the RERS.
In certain embodiments, a NARS is a double-stranded nucleotide sequence where each nucleotide in one strand of the sequence is complementary to the nucleotide at its corresponding position in the other strand. In such embodiments, the sequence of a NARS in the strand containing a NS nickable by a NA that recognizes the NARS is referred to as a "sequence of the sense strand of the NARS" or a "sequence of the sense strand of the double-stranded NARS," while the sequence of the NARS in the strand that does not contain the NS is referred to as a "sequence of the antisense strand of the NARS" or a "sequence of the antisense strand of the double-stranded NARS."
Likewise, in the embodiments where a NERS is a double-stranded nucleotide sequence of which one strand is exactly complementary to the other strand, the sequence of a NERS located in the strand containing a NS nickable by a NE that recognizes the NERS is referred to as a "sequence of a sense strand of the NERS" or a "sequence of the sense strand of the double-stranded NERS," while the sequence of the NERS located in the strand that does not contain the NS is referred to as a "sequence of the antisense strand of the NERS" or a "sequence of the antisense strand of the double-stranded NERS." For example, the recognition sequence and the nicking site of an exemplary nicking endonuclease, N.BstNB I, are shown below with "▼" to indicate the cleavage site and N to indicate any nucleotide:
5'-GAGTCNNNN▼N-3'
3'-CTCAGNNNNN-5'
The sequence of the sense strand of the N.BstNB I recognition sequence is 5'-GAGTC-3', whereas that of the antisense strand is 5'-GACTC-3'.
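The relationship between the sense and antisense sequences stated above can be checked with a minimal Python sketch. The helper name `reverse_complement` is an illustrative assumption and does not appear in this description; the sketch only assumes that both sequences are written 5' to 3'.

```python
# Illustrative helper (hypothetical name): the antisense-strand sequence of a
# fully double-stranded recognition sequence is the reverse complement of the
# sense-strand sequence when both are written 5'->3'.
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

sense = "GAGTC"                        # sense strand of the N.BstNB I recognition sequence
antisense = reverse_complement(sense)  # sequence of the antisense strand
print(antisense)                       # GACTC
```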
Similarly, the sequence of a hemimodified RERS in the strand containing a NS nickable by a RE that recognizes the hemimodified RERS (i.e., the strand that does not contain any derivatized nucleotides) is referred to as "the sequence of the sense strand of the hemimodified RERS" and is located in "the sense strand of the hemimodified RERS" of a hemimodified RERS-containing nucleic acid, while the sequence of the hemimodified RERS in the strand that does not contain the NS (i.e., the strand that contains derivatized nucleotide(s)) is referred to as "the sequence of the antisense strand of the hemimodified RERS" and is located in "the antisense strand of the hemimodified RERS" of a hemimodified RERS-containing nucleic acid.
In certain other embodiments, a NARS is an at most partially double-stranded nucleotide sequence that has one or more nucleotide mismatches, but contains an intact sense strand of a double-stranded NARS as described above. According to the convention used herein, in the context of describing a NARS, when two nucleic acid molecules anneal to one another so as to form a hybridized product, and the hybridized product includes a NARS, and there is at least one mismatched base pair within the NARS of the hybridized product, then this NARS is considered to be only partially double-stranded. Such NARSs may be recognized by certain nicking agents (e.g., N.BstNB I) that require only one strand of double-stranded recognition sequences for their nicking activities. For instance, the NARS of N.BstNB I may contain, in certain embodiments, an intact sense strand, as follows,
5'-GAGTC-3'
3'-NNNNN-5'
where N indicates any nucleotide, and N at one position may or may not be identical to N at another position; however, there is at least one mismatched base pair within this recognition sequence. In this situation, the NARS will be characterized as having at least one mismatched nucleotide.
In certain other embodiments, a NARS is a partially or completely single-stranded nucleotide sequence that has one or more unmatched nucleotides, but contains an intact sense strand of a double-stranded NARS as described above. According to the convention used herein, in the context of describing a NARS, when two nucleic acid molecules (i.e., a first and a second strand) anneal to one another so as to form a hybridized product, and the hybridized product includes a nucleotide sequence in the first strand that is recognized by a NA, i.e., the hybridized product contains a NARS, and at least one nucleotide in the sequence recognized by the NA does not correspond to, i.e., is not across from, a nucleotide in the second strand when the hybridized product is formed, then there is at least one unmatched nucleotide within the NARS of the hybridized product, and this NARS is considered to be partially or completely single-stranded. Such NARSs may be recognized by certain nicking agents (e.g., N.BstNB I) that require only one strand of double-stranded recognition sequences for their nicking activities. For instance, the NARS of N.BstNB I may contain, in certain embodiments, an intact sense strand, as follows,
5'-GAGTC-3'
3'-N(0-4)-5'
(where "N" indicates any nucleotide, 0-4 indicates the number of the nucleotides "N," and a "N" at one position may or may not be identical to a "N" at another position), which contains the sequence of the sense strand of the double-stranded recognition sequence of N.BstNB I. In this instance, at least one of G, A, G, T or C is unmatched, in that there is no corresponding nucleotide in the complementary strand. This situation arises, e.g., when there is a "loop" in the hybridized product, and particularly when the sense sequence is present, completely or in part, within a loop.
As used herein, the phrase "amplifying a nucleic acid molecule" or "amplification of a nucleic acid molecule" refers to the making of two or more copies of the particular nucleic acid molecule. The term "nucleic acid amplification reaction" refers to the process of making more than one copy of a nucleic acid molecule (A) using a nucleic acid molecule (T) that comprises a sequence complementary to the nucleotide sequence of nucleic acid molecule A as a template.
A first nucleic acid sequence is "at least substantially identical" to a second nucleic acid sequence when the complement of the first sequence is able to anneal to the second sequence to form at least a transient duplex under certain reaction conditions (e.g., conditions for amplifying nucleic acids). In certain preferred embodiments, the first sequence is exactly identical to the second sequence, that is, the nucleotide of the first sequence at each position is identical to the nucleotide of the second sequence at the same position, and the first sequence is of the same length as the second sequence.
A first nucleic acid sequence is "at least substantially complementary" to a second nucleic acid sequence when the first sequence is able to anneal to the second sequence to form at least a transient duplex under certain reaction conditions (e.g., conditions for amplifying nucleic acids). In certain preferred embodiments, the first sequence is exactly or completely complementary to the second sequence, that is, each nucleotide of the first sequence is complementary to the nucleotide of the second sequence at its corresponding position, and the first sequence is of the same length as the second sequence.
A transient duplex between a first nucleic acid sequence and a second nucleic acid sequence is formed when under given reaction conditions, the 3' terminal group of the first nucleic acid sequence (if unblocked) may be extended by a DNA polymerase using the second nucleic acid sequence as a template; or the 3' terminal group of the second nucleic acid sequence (if unblocked) may be extended by a DNA polymerase using the first nucleic acid sequence as a template. In certain embodiments, at least 80% of the nucleotides of the first nucleic acid in a region of at least 8 nucleotides are complementary to the nucleotides of the second nucleic acid at their corresponding positions. In other embodiments, at least 85%, 90%, 95%, 97%, 98%, or 99% of the nucleotides of the first nucleic acid in a region of at least 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, or 18 nucleotides are complementary to the nucleotides of the second nucleic acid at their corresponding positions.
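The percentage criterion above can be illustrated with a short Python sketch. It rests on simplifying assumptions that are not part of this description: the two strands are supplied already aligned (the first written 5' to 3', the second written 3' to 5', so that index i in one faces index i in the other), the whole input is treated as the region, and only A:T and G:C pairs count as complementary. The function names are hypothetical.

```python
# Hypothetical helpers illustrating the "at least 80% complementary over a
# region of at least 8 nucleotides" criterion for pre-aligned strands.
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}

def fraction_complementary(first_5to3: str, second_3to5: str) -> float:
    """Fraction of positions where the two aligned strands form a base pair."""
    if len(first_5to3) != len(second_3to5):
        raise ValueError("aligned regions must have the same length")
    pairs = zip(first_5to3.upper(), second_3to5.upper())
    return sum(p in PAIRS for p in pairs) / len(first_5to3)

def meets_criterion(first_5to3: str, second_3to5: str,
                    min_len: int = 8, min_fraction: float = 0.80) -> bool:
    """True if the region is long enough and sufficiently complementary."""
    return (len(first_5to3) >= min_len
            and fraction_complementary(first_5to3, second_3to5) >= min_fraction)

# Ten aligned positions with a single T:T mismatch at the last position.
print(fraction_complementary("GAGTCAAGCT", "CTCAGTTCGT"))  # 0.9
print(meets_criterion("GAGTCAAGCT", "CTCAGTTCGT"))         # True
```

Whether a duplex of a given complementarity is actually extendable by a polymerase depends on the reaction conditions; the sketch covers only the numerical threshold.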
As used herein, a nucleotide in one strand (referred to as the "first strand") of a double-stranded nucleic acid located at a position "corresponding to" another position (e.g., a defined position) in the other strand (referred to as the "second strand") of a double-stranded nucleic acid refers to the nucleotide in the first strand that is complementary to the nucleotide at the corresponding position in the second strand. Likewise, a position in one strand (referred to as the "first strand") of a double-stranded nucleic acid corresponding to a nicking site within the other strand (referred to as the "second strand") of a double-stranded nucleic acid refers to the position between the two nucleotides in the first strand complementary to those in the second strand between which nicking occurs.
The term "isothermal conditions" refers to a set of reaction conditions where the temperature of the reaction is kept essentially constant (i.e., at the same temperature or within the same narrow temperature range wherein the difference between an upper temperature and a lower temperature is no more than about 20°C) during the course of the amplification. In certain embodiments, a reaction is carried out under conditions where the difference between an upper temperature and a lower temperature is no more than 15°C, 10°C, 5°C, 3°C, 2°C or 1°C. Exemplary temperatures for isothermal amplification include, but are not limited to, any temperature between 50°C and 70°C, or a temperature range of 50°C to 70°C, 55°C to 70°C, 60°C to 70°C, 65°C to 70°C, 50°C to 55°C, 50°C to 60°C, or 50°C to 65°C.
The terms "polymorphism" and "genetic variation," as used herein, refer to the occurrence of two or more genetically determined alternative sequences or alleles in a small region (i.e., one to several (e.g., 2, 3, 4, 5, 6, 7, or 8) nucleotides in length) in a population. The allelic form occurring most frequently in a selected population is referred to as the wild type form. Other allelic forms are designated as variant forms. Diploid organisms may be homozygous or heterozygous for allelic forms.
In one aspect of the invention, the genetic variation is a "single-nucleotide polymorphism" (SNP), which refers to any single nucleotide sequence variation, preferably one that is common in a population of organisms and is inherited in a Mendelian fashion. Generally, the SNP is either of two possible bases and there is no possibility of finding a third or fourth nucleotide identity at an SNP site.
C. Sample sources
Biological samples of the present invention include any sample that originates from an organism and that may contain a nucleic acid of interest (i.e., target nucleic acid). They may be provided by obtaining a blood sample, biopsy specimen, tissue explant, organ culture or any other tissue or cell preparation from a subject or a biological source. The subject or biological source may be a human or non-human animal, a plant, a fungus, a bacterium, or a virus. In certain preferred embodiments of the invention, the subject or biological source may be suspected of having, or being at risk for having, a genetic disease or a pathogen infection. In other preferred embodiments, the subject or biological source may be a patient that has a genetic disease or a pathogen infection. In certain other embodiments, the subject or biological source may be a control subject that does not have a genetic disease or a pathogen infection.
In certain embodiments, a bacterial sample can be utilized as starting material, provided it contains or is suspected of containing a bacterial genome of interest. Such a sample may be obtained from any source that may potentially be contaminated by bacteria. When looking for bacterial infection or in distinguishing bacteria from human or animal subjects, the sample to be tested can be selected or extracted from any bodily sample such as blood, urine, spinal fluid, tissue, vaginal swab, stool, amniotic fluid or buccal mouthwash. In other applications, the sample can come from a variety of other sources. For example, in horticulture and agricultural testing, the sample can be from a plant, fertilizer, soil, liquid or other horticultural or agricultural product. In food testing, the sample can be from fresh food or processed food (for example infant formula, seafood, fresh produce and packaged food). In environmental testing, the sample can be from liquid, soil, sewage treatment, sludge and any other sample in the environment considered or suspected of being contaminated by bacteria.
When the sample is a mixture of materials (for example, blood, soil or sludge), it can be treated with an appropriate reagent effective to open the cells and expose or separate the strands of nucleic acids. Although not necessary, this lysing and nucleic acid denaturing step will allow amplification to occur more readily. Further, if desired, the bacteria can be cultured prior to analysis and thus a pure sample obtained.
Although fingerprinting genomic DNA according to the present invention is primarily described herein, the present invention may also be used to characterize other DNA molecules (e.g., cDNA). Thus, the methods according to the present invention may also be applicable to characterize cDNA expression patterns.
D. Nicking Reactions
The nucleic acids isolated from a biological source may be directly used in a nicking reaction. Alternatively, they may be amplified via known methods (such as PCR) prior to being subjected to action of a nicking agent.
As described above, a nicking agent may be a nicking endonuclease (used interchangeably with "nicking enzyme") or a restriction endonuclease (used interchangeably with "restriction enzyme"). A nicking endonuclease (NE) useful in the present invention may or may not have a nicking site that overlaps with its recognition sequence. An exemplary NE that nicks outside its recognition sequence is N.BstNB I, which recognizes a unique nucleic acid sequence composed of 5'-GAGTC-3', but nicks four nucleotides beyond the 3' terminus of the recognition sequence. The recognition sequence and the nicking site of N.BstNB I are shown below, with "▼" indicating the cleavage site and the letter N denoting any nucleotide:
5'-GAGTCNNNN▼N-3'
3'-CTCAGNNNNN-5'
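The nicking rule for N.BstNB I described above (recognition of 5'-GAGTC-3' and cleavage four nucleotides beyond its 3' terminus) can be sketched in Python for a single strand written 5' to 3'. The function names and the example sequence are illustrative assumptions; the bottom strand of a duplex would be handled by passing its own 5' to 3' sequence to the same function.

```python
# Minimal sketch (hypothetical names): locate nicks made in one strand by an
# enzyme that cuts `offset` nucleotides 3' of its recognition sequence.
def nick_positions(strand: str, site: str = "GAGTC", offset: int = 4) -> list[int]:
    """Return indices i such that the strand is cut between i-1 and i."""
    cuts = []
    start = strand.find(site)
    while start != -1:
        cut = start + len(site) + offset
        if cut < len(strand):          # ignore sites too close to the 3' end
            cuts.append(cut)
        start = strand.find(site, start + 1)
    return cuts

def nicked_fragments(strand: str, site: str = "GAGTC", offset: int = 4) -> list[str]:
    """Split the strand at every predicted nicking site."""
    bounds = [0] + nick_positions(strand, site, offset) + [len(strand)]
    return [strand[a:b] for a, b in zip(bounds, bounds[1:])]

print(nicked_fragments("AAGAGTCACGTTTT"))  # ['AAGAGTCACGT', 'TTT']
```

The default `site` and `offset` values encode only the N.BstNB I rule stated in the text; other nicking agents would need their own parameters.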
N.BstNB I may be prepared and isolated as described in U.S. Pat. No. 6,191,267, incorporated herein by reference in its entirety. Buffers and conditions for using this nicking endonuclease are also described in the '267 patent. An additional exemplary NE that nicks outside its recognition sequence is N.Alw I, which recognizes the following double-stranded recognition sequence:
5'-GGATCNNNN▼N-3'
3'-CCTAGNNNNN-5'
The nicking site of N.Alw I is also indicated by the symbol "▼". Both NEs are available from New England Biolabs (NEB). N.Alw I may also be prepared by mutating a type IIs RE Alw I as described in Xu et al. (Proc. Natl. Acad. Sci. USA 98:12990-5, 2001).
Exemplary NEs that nick within their NERSs include N.BbvCI-a and N.BbvCI-b. The recognition sequences for the two NEs and the NSs (indicated by the symbol "▼") are shown as follows:
N.BbvCI-a
5'-CC▼TCAGC-3'
3'-GGAGTCG-5'
N.BbvCI-b
5'-GC▼TGAGG-3'
3'-CGACTCC-5'
Both NEs are available from NEB.
Additional exemplary nicking endonucleases include, without limitation, N.BstSE I (Abdurashitov et al., Mol. Biol. (Mosk) 30: 1261-7, 1996), an engineered EcoR V (Stahl et al., Proc. Natl. Acad. Sci. USA 93: 6175-80, 1996), an engineered Fok I (Kim et al., Gene 203: 43-49, 1997), endonuclease V from Thermotoga maritima (Huang et al., Biochem. 40: 8738-48, 2001), Cvi Nickases (e.g., CviNY2A, CviNYSI, Megabase Research Products, Lincoln, Nebraska) (Zhang et al., Virology 240: 366-75, 1998; Nelson et al., Biol. Chem. 379: 423-8, 1998; Xia et al., Nucleic Acids Res. 16: 9477-87, 1988), and an engineered Mly I (i.e., N.Mly I) (Besnier and Kong, EMBO Reports 2: 782-6, 2001). Additional NEs may be obtained by engineering other restriction endonucleases, especially type IIs restriction endonucleases, using methods similar to those for engineering EcoR V, Alw I, Fok I and/or Mly I.
A restriction endonuclease useful as a nicking agent can be any restriction endonuclease (RE) that nicks a double-stranded nucleic acid at its hemimodified recognition sequences. Exemplary REs that nick their double-stranded hemimodified recognition sequences include, but are not limited to, Ava I, Bsl I, BsmA I, BsoB I, Bsr I, BstN I, BstO I, Fnu4H I, Hinc II, Hind II and Nci I. Additional REs that nick a hemimodified recognition sequence may be screened by the strand protection assays described in U.S. Pat. No. 5,631,147.
In certain embodiments, a nicking agent may recognize a nucleotide sequence in a DNA-RNA duplex and nick one strand of the duplex. In certain other embodiments, a nicking agent may recognize a nucleotide sequence in a double-stranded RNA and nick one strand of the RNA.
Certain nicking agents require only the presence of the sense strand of a double-stranded recognition sequence in an at least partially double-stranded substrate nucleic acid for their nicking activities. For instance, N.BstNB I is active in nicking a substrate nucleic acid that comprises, in one strand, the sequence of the sense strand of its recognition sequence "5'-GAGTC-3'" of which one or more nucleotides do not form conventional base pairs (e.g., G:C, A:T, or A:U) with nucleotides in the other strand of the substrate nucleic acid. The nicking activity of N.BstNB I decreases as the number of nucleotides in the sense strand of its recognition sequence that do not form conventional base pairs with any nucleotides in the other strand of the substrate nucleic acid increases. However, even if none of the nucleotides of "5'-GAGTC-3'" form conventional base pairs with the nucleotides in the other strand, N.BstNB I may still retain 10-20% of its optimum activity.
Several factors may be considered when choosing a particular nicking agent for a fingerprinting assay according to the present invention. These factors include, but are not limited to, (1) whether the particular nicking agent would produce nicked fragment(s), or a pattern of nicked fragments, unique to the organism from which the nucleic acid sample is suspected to be derived; (2) the lengths of nicked fragments unique to the organism from which the nucleic acid sample is suspected to be derived and the characterization technologies; and (3) the optimum temperature of the nicking agent. For instance, if mass spectrometry is chosen as the characterization method, a nicking enzyme that would produce short unique oligonucleotides may be desirable. In the embodiments where the nicking reaction is performed in the presence of a DNA polymerase, a nicking enzyme with an optimum temperature similar to that of the DNA polymerase may be desirable.
The nicking reaction may be performed simply by incubating the nucleic acid sample with a nicking agent under appropriate conditions. Identifying such appropriate conditions is within the ordinary skill in the art. For instance, the nicking reaction may be performed at the optimum temperature of the nicking agent and in a buffer suitable for the nicking agent. The nicking reaction mixture that contains nicked nucleic acid fragments may be directly characterized. The characterization may be performed by any known applicable methods, including but not limited to, liquid chromatography, electrophoresis, hybridization and mass spectrometry. The use of such methods may indicate the presence or absence of one or more particular fragments unique to a species, subspecies, strain, or individual organism from which the nucleic acid sample is suspected to be derived. Alternatively, the use of such methods produces a pattern of nicked fragments, which may be compared with the pattern generated from an organism from which the nucleic acid sample is suspected to be derived.
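One simple way to compare an observed pattern with a standard pattern is sketched below in Python. This is not a method prescribed by this description: the scoring rule (fraction of standard peaks matched within a mass tolerance), the function name, the tolerance, and all peak values are assumptions chosen purely for illustration.

```python
# Hypothetical scoring of an observed fragment pattern against a standard one.
def pattern_match_score(observed, reference, tolerance=1.0):
    """Fraction of reference peaks with an observed peak within `tolerance`."""
    if not reference:
        raise ValueError("reference pattern must not be empty")
    matched = sum(
        any(abs(obs - ref) <= tolerance for obs in observed)
        for ref in reference
    )
    return matched / len(reference)

# Made-up peak masses (Da), for illustration only; not measured data.
reference_pattern = [1820.2, 2450.6, 3105.0]
observed_pattern = [1820.5, 3104.4, 4000.0]
score = pattern_match_score(observed_pattern, reference_pattern)
print(round(score, 2))  # 0.67
```

A practical comparison would also need to account for peak intensities and for observed peaks that are absent from the standard pattern.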
E. Linear Amplification
In certain embodiments, the nicking reaction is performed in the presence of a DNA polymerase so that nicked fragments or portions thereof may be linearly amplified. The amplification produces a larger amount of single-stranded nucleic acid or oligonucleotides, which increases the sensitivity of the fingerprinting assays. In such a nicking reaction, the 3' terminus at the nicking site is extended by a DNA polymerase, preferably being 5'->3' exonuclease deficient and having a strand displacement activity and/or in the presence of a strand displacement facilitator, displacing the strand that contains the 5' terminus produced by the nicking reaction. The resulting extension product having a recreated NARS for the NA is nicked ("re-nicked") by the NA. The 3' terminus produced at the NS by the re-nicking is then extended in the presence of the DNA polymerase, also displacing the strand that contains the 5' terminus produced by the nicking reaction. The nicking-extension cycle is repeated, preferably multiple times, to accumulate/amplify the displaced strand that contains the 5' terminus produced by the nicking reaction.
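The nicking-extension cycle described above yields product that grows linearly with the number of cycles, because the number of templates stays fixed. The Python sketch below is an idealized bookkeeping model under the assumption that every template completes one nick, extension and displacement per cycle; the function name and numbers are hypothetical, and real reaction kinetics are not modeled.

```python
# Idealized model (assumption: each template releases exactly one displaced
# strand per nick-extend-displace cycle; the template count never changes).
def simulate_linear_amplification(templates: int, cycles: int) -> list[int]:
    """Return the cumulative number of displaced strands after each cycle."""
    displaced = 0
    history = []
    for _ in range(cycles):
        displaced += templates   # one displaced strand per template per cycle
        history.append(displaced)
    return history

print(simulate_linear_amplification(templates=2, cycles=4))  # [2, 4, 6, 8]
```

The constant increment per cycle is what distinguishes this from an exponential scheme, in which the products themselves would serve as new templates.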
Various types of DNA polymerase may be used in the present application. For instance, DNA polymerases useful in the present invention may be any DNA polymerase that is 5'->3' exonuclease deficient but has a strand displacement activity. Such DNA polymerases include, but are not limited to, exo⁻ Deep Vent, exo⁻ Bst, exo⁻ Pfu, and exo⁻ Bca. Additional DNA polymerases useful in the present invention may be screened for or created by the methods described in U.S. Pat. No. 5,631,147, incorporated herein by reference in its entirety. The strand displacement activity may be further enhanced by the presence of a strand displacement facilitator as described below.
Alternatively, in certain embodiments, a DNA polymerase that does not have a strand displacement activity may be used. Such DNA polymerases include, but are not limited to, exo⁻ Vent, Taq, the Klenow fragment of DNA polymerase I, T5 DNA polymerase, and Phi29 DNA polymerase. In certain embodiments, the use of these DNA polymerases requires the presence of a strand displacement facilitator. A "strand displacement facilitator" is any compound or composition that facilitates strand displacement during nucleic acid extensions from a 3' terminus at a nicking site catalyzed by a DNA polymerase. Exemplary strand displacement facilitators useful in the present invention include, but are not limited to, BMRF1 polymerase accessory subunit (Tsurumi et al., J. Virology 67: 7648-53, 1993), adenovirus DNA-binding protein (Zijderveld and van der Vliet, J. Virology 68: 1158-64, 1994), herpes simplex viral protein ICP8 (Boehmer and Lehman, J. Virology 67: 711-5, 1993; Skaliter and Lehman, Proc. Natl. Acad. Sci. USA 91: 10665-9, 1994), single-stranded DNA binding protein (Rigler and Romano, J. Biol. Chem. 270: 8910-9, 1995), phage T4 gene 32 protein (Villemain and Giedroc, Biochemistry 35: 14395-4404, 1996), calf thymus helicase (Siegel et al., J. Biol. Chem. 267: 13629-35, 1992) and trehalose. In one embodiment, trehalose is present in the amplification reaction mixture.
Additional exemplary DNA polymerases useful in the present invention include, but are not limited to, phage M2 DNA polymerase (Matsumoto et al., Gene 84: 247, 1989), phage PhiPRDI DNA polymerase (Jung et al., Proc. Natl. Acad. Sci. USA 84: 8287, 1987), T5 DNA polymerase (Chatterjee et al., Gene 97: 13-19, 1991), Sequenase (U.S. Biochemicals), PRD1 DNA polymerase (Zhu and Ito, Biochim. Biophys. Acta. 1219: 267-76, 1994), 9°Nm™ DNA polymerase (New England Biolabs) (Southworth et al., Proc. Natl. Acad. Sci. 93: 5281-5, 1996; Rodriquez et al., J. Mol. Biol. 302: 447-62, 2000), and T4 DNA polymerase holoenzyme (Kaboord and Benkovic, Curr. Biol. 5: 149-57, 1995).
Alternatively, a DNA polymerase that has a 5'->3' exonuclease activity may be used. For instance, such a DNA polymerase may be useful for amplifying short nucleic acid fragments that automatically dissociate from the template nucleic acid after nicking.
In certain embodiments where a nicking agent nicks in the DNA strand of a RNA-DNA duplex, a RNA-dependent DNA polymerase may be used. In other embodiments where a nicking agent nicks in the RNA strand of a RNA-DNA duplex, a DNA-dependent DNA polymerase that extends from a DNA primer, such as Avian Myeloblastosis virus reverse transcriptase (Promega) may be used. In both instances, a target mRNA need not be reverse transcribed into cDNA and may be directly mixed with a template nucleic acid molecule that is at least substantially complementary to the target mRNA.
In certain embodiments, it may be desirable to amplify short oligonucleotide fragments (e.g., 6-20 nucleotides in length). For instance, short oligonucleotides are more suitable to certain characterization technologies, such as LC-TOF and MALDI. In addition, short fragments may be easily and more efficiently amplified to increase the speed and sensitivity of the present methods. Such fragments also allow the use of a DNA polymerase that does not have a strand displacement activity, or is not 5' to 3' exonuclease deficient. Generation and amplification of short oligonucleotides is made possible by the presence of one or more pairs of nicking agent recognition sequences (or one nicking agent recognition sequence and one restriction enzyme recognition sequence) in close proximity in a nucleic acid sample. In certain embodiments, the proximity is about 12 to 24 nucleotides. Using N.BstNB I as an exemplary nicking agent, the structure in the nucleic acid can be visualized as follows:
5'-GAGTCNNNNNNNNNNNNNNGACTC-3'
3'-CTCAGNNNNNNNNNNNNNNCTGAG-5'
where N can be any nucleotide, and the number of Ns in the upper strand that is between the sequence of the sense strand of the N.BstNB I recognition sequence (i.e., 5'-GAGTC-3') and the sequence of the antisense strand of the recognition sequence (i.e., 5'-GACTC-3') may be between 12 and 24. After the nicking step that takes place at 55 to 65°C, the duplex dissociates and forms two amplification templates.
5'-GAGTCNNNN-3'             5'-NNNNNNNNNNGACTC-3'
3'-CTCAGNNNNNNNNNN-5'             3'-NNNNCTGAG-5'
In the presence of a polymerase, the recessed 3'-hydroxyl of each amplification template is filled in by the polymerase, the nicking enzyme then again cleaves the newly extended strand, the resulting short single-stranded oligonucleotide immediately dissociates, and the cycle of nicking and filling is repeated multiple times, resulting in a linear amplification of the short single-stranded oligonucleotides. This reaction cycle relies on the fact that, at the reaction temperature, oligonucleotides above a certain length form stable duplexes, while those below this length form unstable duplexes that dissociate readily. The short oligonucleotide generated in the nicking reaction is below the threshold of stability, and is thereby released from the duplex. The release of the short oligonucleotide from the duplex regenerates a 5' overhang, which may be again used as a template for synthesizing the short oligonucleotide. A schematic representation of a linear amplification of short oligonucleotides using N.BstNB I as the nicking agent is shown in Figure 1.
The identification of one or more pairs of nicking agent recognition sequences (or one nicking agent recognition sequence and one restriction enzyme recognition sequence) in close proximity in a genomic DNA with a known sequence may be performed by the use of applicable computer programs. Such identification may facilitate the selection of a nicking agent for fingerprinting a nucleic acid sample suspected to be derived from the genomic DNA, as well as the identification of oligonucleotide fragments or patterns thereof that are unique to the genomic DNA. These unique oligonucleotide fragments or patterns thereof expected to be present in a fingerprinting assay according to the present invention may be used as a standard for those generated from the nucleic acid sample. 
The comparison of the expected unique oligonucleotide fragments or patterns thereof with those generated from the nucleic acid sample would indicate whether the nucleic acid sample is derived from the genomic DNA.
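The fill-in and nicking cycle described above can be illustrated with a minimal Python sketch. It assumes that N.BstNB I nicks the strand carrying 5'-GAGTC-3' four nucleotides 3' of the recognition sequence, as drawn in the diagrams above; the function names and the example sequence are hypothetical and not part of the original disclosure.

```python
# Minimal sketch of one fill-in/nick cycle on a single amplification template.
# Assumption: the nick falls 4 nt 3' of GAGTC on the strand carrying GAGTC.

COMPLEMENT = str.maketrans("ACGT", "TGCA")
SITE = "GAGTC"      # sense strand of the N.BstNB I recognition sequence
NICK_OFFSET = 4     # assumed distance between the site and the nick


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def linear_cycle(lower_strand_5to3: str) -> tuple[str, str]:
    """Simulate fill-in followed by nicking.

    lower_strand_5to3 is the longer strand of the amplification template,
    written 5'->3', ending in the antisense site GACTC.
    Returns (regenerated primer strand, released oligonucleotide).
    """
    # Fill-in: the polymerase extends the recessed 3' end to a full complement.
    full_upper = reverse_complement(lower_strand_5to3)
    if not full_upper.startswith(SITE):
        raise ValueError("template does not carry the recognition sequence")
    # Nicking: the extended strand is cut NICK_OFFSET nt 3' of the site.
    cut = len(SITE) + NICK_OFFSET
    return full_upper[:cut], full_upper[cut:]


primer, released = linear_cycle("ACGTTGCATG" + "GACTC")
print(primer, released)
```

Calling `linear_cycle` again on the same template returns the same short oligonucleotide, which mirrors the linear (not exponential) character of this reaction.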
An exemplary computer program useful in identifying closely located nicking agent recognition sequences, or nicking agent recognition sequences located near restriction enzyme recognition sequences, is called Genome Identifier through a Nicking-enzyme generated Unique Mass Spectrum (GINUMS). GINUMS further predicts the amplified short oligonucleotides of a genomic sequence produced in the linear amplification reactions described herein. When more than one genomic sequence needs to be analyzed, the following process is repeated for each sequence. The search begins with the acquisition of the genomic sequence. The analysis is simplified with "finished" genomic sequence, which is represented by one long sequence, rather than "draft" quality sequence, which is represented by a series of sequences, each a fragment of the overall sequence separated by gaps, although the analysis will work with any sequence or set of sequences. Usually the sequence is stored in FASTA (or Pearson) format, but the program will work with any sequence format. In any case, the sequence is read from either a file or a database, and stored into memory so that it may be searched later. GINUMS then searches the genomic sequence for three different patterns, each represented by a regular expression. Regular expressions are simply patterns represented by a string of characters that are used to search text, in this case one long string of A's, C's, G's and T's representing one strand of the genomic sequence. Using both N.Alw I and N.BstNB I nicking endonucleases, the three regular expressions (in PERL syntax) are:
1. GAGTC.{$min,$max}GACTC
2. GGATC.{$min,$max}GACTC
3. GAGTC.{$min,$max}GATCC
where $min represents the smallest gap between the two recognition sequences, and $max the largest. The default values of these variables are 14 and 24, respectively. GINUMS searches the genomic sequence with each of the regular expressions one at a time, and then stores all of the matching segments of sequence (termed "hits" for the rest of this document) for each search into separate lists. Thus, all of the hits are composed of the recognition sequences separated by any 14 to 24 base pairs, which are determined by the genomic sequence. The program then has three lists of hits in memory, each list the product of searching the genomic sequence with one of the regular expressions. For each hit in all of the lists, GINUMS will determine the "product(s)" of the reaction (the sequence that would be amplified), the mass of the product(s) and the starting and ending position of that product in the genomic sequence.
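As an illustration only, the search step described above might be sketched in Python as follows. This is not the GINUMS source, which is not reproduced in this document. The function names are hypothetical, and wrapping each pattern in a lookahead so that overlapping hits are all reported is an assumption of this sketch (at most one hit is reported per start position).

```python
import re

# Recognition-sequence pairs taken from the three regular expressions above.
PATTERN_ENDS = [
    ("GAGTC", "GACTC"),  # N.BstNB I ... N.BstNB I
    ("GGATC", "GACTC"),  # N.Alw I   ... N.BstNB I
    ("GAGTC", "GATCC"),  # N.BstNB I ... N.Alw I
]


def build_patterns(min_gap=14, max_gap=24):
    """Compile one regular expression per recognition-sequence pair."""
    patterns = []
    for left, right in PATTERN_ENDS:
        # The (?=(...)) lookahead is zero-width, so finditer can report
        # hits that overlap one another (an assumption of this sketch).
        patterns.append(
            re.compile(rf"(?=({left}.{{{min_gap},{max_gap}}}{right}))")
        )
    return patterns


def find_hits(genome, min_gap=14, max_gap=24):
    """Return three lists of (0-based start, hit sequence), one per pattern.

    The genome is expected as a single string without line breaks.
    """
    genome = genome.upper()
    return [
        [(m.start(), m.group(1)) for m in pattern.finditer(genome)]
        for pattern in build_patterns(min_gap, max_gap)
    ]


example = "TT" + "GAGTC" + "A" * 14 + "GACTC" + "TT"
for hits in find_hits(example):
    print(hits)
```

The three separate result lists correspond to the three lists of hits held in memory, one per regular expression.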
The lists of masses, the products they represent, and the location of those products in the genomic sequence are then written to file or database for permanent storage. Learning to read the regular expressions used in GINUMS is trivial, and will be described here. The second regular expression listed in the section above, GGATC.{$min,$max}GACTC, where $min and $max are integers, will be used as an example for the rest of this section.
The program searches genomic sequence for target sequences (that is, any sequence beginning with GGATC and ending with GACTC, separated by any 14 to 24 nucleotides). In regular expressions, the '.' character represents any one character, in this case any one A, T, C, or G. So, one way to search for each of these occurrences would be to construct 11 regular expressions and then search the sequence for each, such as:
1) GGATC..............GACTC (14 .'s)
2) GGATC...............GACTC (15 .'s)
...
11) GGATC........................GACTC (24 .'s)
However, to do this for all three search patterns would be cumbersome, and regular expressions provide us with a more elegant solution to the problem. When any character in a regular expression is followed by a number enclosed in braces (e.g., G{6}), the regular expression represents precisely that number of consecutive characters (so G{6} would represent 6 consecutive G's, or GGGGGG). When there are two numbers (x and y) separated by a ',' (e.g., G{x,y}), the regular expression represents any string of at least x characters and at most y characters. So G{2,4} would represent GG, GGG, and GGGG. When the character preceding the braces is a '.' (e.g., .{2,4}), the regular expression represents any string of any 2-4 characters. Returning to GGATC.{$min,$max}GACTC, it can now be seen that this regular expression represents any string that begins with GGATC and ends with GACTC, separated by at least $min characters and at most $max characters, where $min and $max are integers, and $min < $max. Because the braces are preceded by a '.', these can be any characters.
Building the profiles for more than one organism is as simple as repeating the process for as many genomic sequences as necessary, noting which masses were derived from which organism. GINUMS is capable of inputting the genomic sequences of an unlimited number of organisms, each contained within one file, all contained in the same directory, and then returning two outputs. The first output is a set of masses unique to each organism. This can be accomplished by searching for the existence of each mass in one organism in all of the masses for the other organisms. If that mass is found only in one organism, then it is unique for that organism. GINUMS does not do this precisely (it uses the properties of a data structure called a hash), but the process is analogous. 
The second output is the complete list of masses for each organism, and would allow the user to determine if each organism has a distinct "profile" (hence, the name), or set, of masses. These could be written either to file or into a database for permanent storage, so that they may be searched for in experimental data at a later time.
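The hash-based determination of organism-specific masses mentioned above could look roughly like the following Python sketch, in which a dictionary plays the role of the hash. The function name, the organism labels and the mass values are hypothetical, and the sketch assumes that masses have already been rounded to a fixed precision so that equal products compare as equal.

```python
from collections import defaultdict


def unique_masses(profiles):
    """Return, for each organism, the masses found in no other organism.

    profiles maps an organism name to the set of its predicted product
    masses (assumed to be rounded to a common precision beforehand).
    """
    # Hash from mass to the set of organisms in which that mass occurs.
    owners = defaultdict(set)
    for organism, masses in profiles.items():
        for mass in masses:
            owners[mass].add(organism)
    # A mass is unique when exactly one organism owns it.
    return {
        organism: {mass for mass in masses if len(owners[mass]) == 1}
        for organism, masses in profiles.items()
    }


# Hypothetical profiles, for illustration only.
profiles = {
    "organism_A": {1776.23, 1807.25},
    "organism_B": {1807.25, 2000.00},
}
print(unique_masses(profiles))
```

Building the mass-to-organism hash once avoids comparing every organism against every other organism, which is presumably why a hash is used instead of the pairwise search described in the text.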
GINUMS uses the following model to determine the products (and their corresponding masses) of the amplification process on genomic sequence.
When two N.BstNB I recognition sequences are present on opposite strands separated by 14 to 24 nucleotides (in the example below, 14 will be used), and oriented as follows:
              S
5' GAGTCNNNNPPPPPPNNNNGACTC 3'
3' CTCAGNNNNPPPPPPNNNNCTGAG 5'
              S'
where N represents any nucleotide, and P represents the nucleotides in the product, the products of the reaction are S and S'. When an N.Alw I recognition site is followed by an N.BstNB I recognition site on the opposite strand, separated by the correct number of nucleotides and oriented as follows:
              S
5' GGATCNNNNPPPPPPNNNNNNGACTC 3'
   ||||||||||||||||||||||||||
3' CCTAGNNNNPPPPPPNNNNNNCTGAG 5'
              S'
the products are represented by the sequence of P's and are summarized by S and S'. When an N.BstNB I recognition sequence is followed by an N.Alw I recognition sequence on the opposite strand, separated by the correct number of nucleotides and oriented as follows:
              S
5' GAGTCNNNNPPPPPPNNNNGATCC 3'
   ||||||||||||||||||||||||
3' CTCAGNNNNPPPPPPNNNNCTAGG 5'
              S'
the product sequences are represented by P's, and summarized as S and S'. After finding all of the products predicted by this model in the genomic sequence, the mass of each product is calculated according to the sequence of the product. The results from the use of GINUMS on E. coli strains K12 and O157 and human chromosome 21 are shown in Tables 1, 2, and 5, respectively.
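The product model above can be sketched in Python as follows. Several points are assumptions of this sketch, not statements from the original: both enzymes are taken to nick four nucleotides 3' of their recognition sequence (as in the first diagram), the mass is an average mass computed with the residue values commonly tabulated by oligonucleotide suppliers for a strand with hydroxyl groups at both ends, and a 5' phosphate left by nicking (about 79.98 Da) is not included. The mass table actually used by GINUMS is not given in the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Average residue masses (Da) for DNA; assumed values, see lead-in.
AVERAGE_RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
TERMINAL_CORRECTION = -61.96  # correction for 5'-OH and 3'-OH ends


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def products(hit, site_len=5, nick_offset=4):
    """Return (S, S') for one hit, both written 5'->3'.

    S is the stretch of the upper strand between the two nick positions;
    S' is its reverse complement on the lower strand.
    """
    flank = site_len + nick_offset
    s = hit[flank:len(hit) - flank]
    return s, reverse_complement(s)


def oligo_mass(seq):
    """Approximate average mass (Da) of an unphosphorylated DNA oligo."""
    total = sum(AVERAGE_RESIDUE_MASS[base] for base in seq)
    return round(total + TERMINAL_CORRECTION, 2)


hit = "GAGTC" + "AAAA" + "ACGTAC" + "TTTT" + "GACTC"
s, s_prime = products(hit)
print(s, oligo_mass(s), s_prime, oligo_mass(s_prime))
```

With a gap of 14 nucleotides the predicted product is 6 nucleotides long, matching the six P's in the first diagram.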
As one of ordinary skill in the art would appreciate, in addition to the use of N.BstNB I alone or in combination with N.Alw I, short oligonucleotide fragments may also be generated or amplified by the use of a different nicking endonuclease alone or in combination with another nicking endonuclease or restriction endonuclease (e.g., a type IIs restriction endonuclease). Where two enzymes with different optimal temperatures are used, a linear amplification reaction may be performed first at the lower optimal temperature and then at the higher optimal temperature.
In certain embodiments, the amplified short fragments may also contain genetic variations. For instance, SNPs in human genomic DNA that are flanked by two N.BstNB I recognition sequences or an N.BstNB I recognition sequence and another restriction enzyme recognition sequence are shown in Table 9. The characterization of such short fragments would identify the SNPs within these fragments and facilitate the identification of the individual from which a nucleic acid sample is obtained.
F. Exponential Amplification
In certain embodiments, the oligonucleotides linearly amplified as described above may be further exponentially amplified. These oligonucleotides are referred to as "initiating oligonucleotides" or "initiators." Such exponential amplification greatly increases the production rate and amount of the initiating oligonucleotides.
The key idea for exponential amplification is to arrange it so that the oligonucleotide product of the linear reaction serves to create a new primer that in turn anneals to a target template and creates a new primer-template, which in turn produces more of the same oligonucleotide product, creating a chain reaction. A simple scheme for exponentially amplifying a short oligonucleotide (also referred to as "direct EXPAR") using N.BstNB I as the nicking agent is depicted in Figure 2. The scheme is based on our observation that even though the product oligonucleotide is unstable as a duplex, it will form a transient duplex molecule with its complement, and this transient duplex can act as a primer for extension by the DNA polymerase. Once extension of the oligonucleotide has occurred, the duplex is stabilized by the additional complementary duplex section and will not readily dissociate. Extending the primer thus creates a stable primer-template that will produce oligonucleotide products in a linear fashion.
To create these new duplexes we need only provide a ready supply of complementary oligonucleotides that we call amplification templates. The key feature of these single-stranded oligonucleotides is that they contain two copies in tandem of the complement of the oligonucleotide product to be amplified, separated by the sequence of the antisense strand of the recognition sequence of N.BstNB I (i.e., 3'-CTCAG-5') and a four base spacer (on the 5' side). When the transient duplex is extended a stable new primer-template is created. This primed template will then continue to produce oligonucleotide product via the linear amplification cycle as described above (nicking after the four base spacer, dissociating the oligonucleotide and re-elongating the primer) as long as the enzymes remain active and dNTPs are available. Another scheme for exponentially amplifying short oligonucleotides (referred to as "the replicator type" or "ping-pong amplification scheme") uses two templates. In this scheme, an initiating oligonucleotide of sequence S primes a first template oligonucleotide T1 (S is about 8 to 16 nucleotides in length) to form the following partially double-stranded nucleic acid molecule. The 3' terminus of T1 may be blocked by a phosphate group.
5' S  3'
3' S' NNNNCTGAG 5' (T1)
In the presence of a DNA polymerase, the upper strand of the above partially double-stranded nucleic acid molecule is elongated to form the following fully double-stranded nucleic acid molecule:
5' S  NNNNGACTC 3'
3' S' NNNNCTGAG 5' (T1)
In the presence of a nicking enzyme (e.g., N.BstNB I), the lower strand is nicked to generate a 3' hydroxyl group and release an oligonucleotide blocked at the 3' terminus (un-productive oligonucleotide). The resulting nicked structure is shown below:
5' S  NNNNGACTC 3'
   3'-NNNNCTGAG 5' (T1)
The DNA polymerase uses sequence S as a template and fills in the recessed 3' hydroxyl group of the lower strand to produce the following double-stranded nucleic acid.
5' S  NNNNGACTC 3'
3' S' NNNNCTGAG 5' (T1)
The nicking enzyme cleaves the lower strand, releasing an oligonucleotide having the sequence S' (which is completely complementary to the sequence S) as shown below:
5' S  NNNNGACTC 3'
   3'-NNNNCTGAG 5' (T1)
+
5' S' 3'
The lower strand of the above partially double-stranded nucleic acid may be extended again to produce a fully double-stranded nucleic acid molecule. The lower strand of the fully double-stranded nucleic acid may be nicked again to release another oligonucleotide having the sequence S'. The above extension and nicking cycle may be repeated multiple times, resulting in the amplification of the oligonucleotide having the sequence S'. This oligonucleotide (S') is capable of priming the oligonucleotide template T2 to form a partially double-stranded nucleic acid molecule as shown below:
5' S' 3'
3' S  NNNNCTGAG 5' (T2)
In the presence of a DNA polymerase, the upper strand of the above partially double-stranded nucleic acid molecule is elongated to form the following fully double-stranded nucleic acid molecule:
5' S' NNNNGACTC 3'
3' S  NNNNCTGAG 5' (T2)
In the presence of a nicking enzyme (e.g., N.BstNB I), the lower strand is nicked to generate a 3' hydroxyl group and release an oligonucleotide blocked at the 3' terminus (un-productive oligonucleotide). The resulting nicked structure is shown below:
5' S' NNNNGACTC 3'
   3'-NNNNCTGAG 5' (T2)
The DNA polymerase uses S' as a template and fills in the recessed 3' hydroxyl group of the lower strand to produce the following double-stranded nucleic acid:
5' S' NNNNGACTC 3'
3' S  NNNNCTGAG 5' (T2)
The nicking enzyme cleaves the lower strand, releasing an oligonucleotide having the sequence S (which is completely complementary to the sequence S') as shown below:
5' S' NNNNGACTC 3'
   3'-NNNNCTGAG 5' (T2)
+
5' S  3'
The lower strand of the above partially double-stranded nucleic acid may be extended again to produce a fully double-stranded nucleic acid molecule. The lower strand of the fully double-stranded nucleic acid may be nicked again to release another oligonucleotide having the sequence S. The above extension and nicking cycle may be repeated multiple times, resulting in the amplification of the oligonucleotide having the sequence S.
The two oligonucleotides having the sequences of S and S' are now capable of priming T1 and T2, respectively, and the exponential amplification is started. It should be noted that S and S' are sufficiently short (e.g., 8-16 nucleotides in length) that the triggers do not form a stable duplex with each other in a reaction mixture under conditions for exponential amplification (e.g., 60°C). This variation of exponential amplification has a substantial advantage of requiring a very high level of stringency of an oligonucleotide priming its template. We have discovered that the oligonucleotide (e.g., an oligonucleotide having the sequence S or S') has to be nearly perfectly base-paired with its template for an exponential amplification reaction to start. In many cases, even a single mismatch in the oligonucleotide will inhibit the reaction.
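To make the chain-reaction character of this scheme concrete, the following toy model in Python counts primed templates over discrete rounds. It is a deliberately crude sketch under stated assumptions and not a kinetic model from the original: every primed template releases exactly one product per round, every free product primes a free template in the next round, and enzymes and dNTPs never become limiting. The function name and all parameter values are hypothetical.

```python
def ping_pong(initial_s, t1_total, t2_total, rounds):
    """Toy discrete model of the two-template (ping-pong) scheme.

    Returns a list with the number of primed (T1, T2) templates after
    each round. S primes T1, which then releases S'; S' primes T2,
    which then releases S.
    """
    free_s, free_s_prime = initial_s, 0
    t1_primed = t2_primed = 0
    history = []
    for _ in range(rounds):
        # Priming: each free trigger is consumed by one free template.
        newly_primed = min(free_s, t1_total - t1_primed)
        t1_primed += newly_primed
        free_s -= newly_primed
        newly_primed = min(free_s_prime, t2_total - t2_primed)
        t2_primed += newly_primed
        free_s_prime -= newly_primed
        # Production: each primed template releases one product.
        free_s_prime += t1_primed
        free_s += t2_primed
        history.append((t1_primed, t2_primed))
    return history


print(ping_pong(initial_s=1, t1_total=1000, t2_total=1000, rounds=6))
```

Under these assumptions the number of primed templates doubles each round until the template supply is exhausted, after which production continues only linearly.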
To carry out the exponential reaction, some or all of the following components may be used in certain embodiments.
1) The first template oligonucleotide (T1). A schematic representation of T1 is shown in Figure 3. T1 may be 24 to 60 nucleotides (including all the integer values therebetween), preferably 32-36 nucleotides in length. The 3'-end of T1 may be blocked with, for example, a phosphate, an amine, a biotin, a dideoxy group or a fluorophore (that is, there is no free 3'-hydroxyl in T1) to prevent extension by a polymerase. The region from the 3' terminus of T1 to the fifth nucleotide directly 3' to the 3' terminus of the sense strand of a nicking endonuclease recognition sequence (e.g., GAGTC) (i.e., Region I in Figure 6) may be 8, 9, 10, 11, 12, 13, 14, 15 or 16 nucleotides in length and is completely (or at least substantially) complementary to the sequence S. To the 5' side of the above region are 4 nucleotides of any sequence (Region II in Figure 6) over which a nicking endonuclease (e.g., N.BstNB I) reaches to nick. Next in the 5' direction is the sequence 3'-CTGAG-5', which is the sense strand of the recognition sequence for a nicking enzyme (e.g., N.BstNB I). Further in the 5' direction is about 10 to 20 nucleotides of any sequence (Region III in Figure 6). The sequence at the 5' end should not be complementary to any of the sequence at the 3'-end. The concentration of T1 is 0.001 to 1 micromolar if in solution. T1 can also be tethered to a solid support or covalently attached to any type of solid support.
2) The second template oligonucleotide (T2). Similar to T1, T2 may be 24 to 60 nucleotides (including all the integer values therebetween), preferably 32-36 nucleotides, in length. The 3'-end of T2 may be blocked with, for example, a phosphate, an amine, a biotin, a dideoxy group or a fluorophore (that is, there is no free 3'-hydroxyl in T2) to prevent extension by a polymerase. The region from the 3' terminus of T2 to the fifth nucleotide directly 3' to the 3' terminus of the sense strand of a nicking endonuclease recognition sequence (e.g., GAGTC) may be 8, 9, 10, 11, 12, 13, 14, 15 or 16 nucleotides in length and is at least substantially complementary to the sequence S'. To the 5' side of the above region are 4 nucleotides of any sequence over which a nicking endonuclease (e.g., N.BstNB I) reaches to nick. Next in the 5' direction is the sequence 3'-CTGAG-5', which is the sense strand of the recognition sequence for a nicking enzyme (e.g., N.BstNB I). Further in the 5' direction is about 10 to 20 nucleotides of any sequence. The sequence at the 5' end should not be complementary to any of the sequence at the 3'-end. The concentration of T2 is 0.001 to 1 micromolar if in solution. T2 can also be tethered to a solid support or covalently attached to any type of solid support.
3) A DNA polymerase such as exo⁻ Vent, 9°Nm™, Taq, or Bst at a concentration of 0.002 to 20 units per microliter. Preferably the concentration of the polymerase is 0.02 to 0.5 units per microliter. The enzyme is typically available commercially in 100 mM KCl, 0.1 mM EDTA, 10 mM Tris-HCl (pH 7.4), 1 mM DTT, and 50% glycerol.
4) A nicking enzyme such as N.BstNB I (from New England Biolabs (NEB), MA) at a concentration of 0.002 to 20 units per microliter. Preferably the concentration of the nicking enzyme is 0.02 to 0.5 units per microliter. The enzyme is supplied in 50 mM KCl, 10 mM Tris-HCl (pH 7.5), 0.1 mM EDTA, 1 mM DTT, 200 µg/ml BSA and 50% glycerol.
5) A salt (e.g., MgCl2 or MgSO4) at 0.5 to 10 mM in concentration. Preferably the concentration is 2 to 6 mM.
6) A salt (e.g., (NH4)2SO4) in the 5 to 50 mM range, preferably 10 mM.
7) A salt (e.g., KCl) at 20 to 200 mM.
8) A buffer (e.g., Tris-HCl), pH 7-8, preferably 7.5, in the 10-50 mM range of concentrations, preferably 10 mM.
9) A reducing agent (e.g., dithiothreitol (DTT)) in the 0.5 to 5 mM range, preferably 1 mM.
10) A detergent (e.g., Triton X-100), in the 0.01% to 1% range (V/V), preferably 0.01% final concentration (V/V).
Because T1 and T2 contain the sequences S' and S, respectively, and these two sequences are complementary to each other, T1 and T2 may anneal to each other to form the following partially double-stranded nucleic acid:
(T1) 5' GAGTCNNNN S' 3'
               3' S  NNNNCTGAG 5' (T2)
To prevent the extension of the above partially double-stranded nucleic acid from the 3' terminus of the upper and lower strands, the 3'-termini of T1 and T2 may be blocked so that no free 3'-hydroxyl groups are available for extension. Alternatively, T1 and T2 molecules may be immobilized in different regions of a solid substrate or different solid substrates (e.g., microbeads). The ping-pong amplification reaction cycle is also schematically described in Figure 7. The use of the ping-pong amplification reaction cycle in discriminating genetic variations is shown in Figure 8.
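The layout of the template oligonucleotides T1 and T2 described in this section can be summarized with a small Python sketch that assembles a template, written 5' to 3', from a 5' tail, the sense strand of the N.BstNB I recognition sequence, a four-nucleotide spacer, and the region complementary to the priming oligonucleotide. The function name, the example sequences and the length checks are hypothetical illustrations. The 3' blocking group is a chemical modification and is not represented in the returned string, and the sketch does not check that the 5' tail lacks complementarity to the 3' end.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
SENSE_SITE = "GAGTC"  # sense strand of the N.BstNB I recognition sequence


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def design_template(trigger, spacer, tail):
    """Assemble a template (5'->3') that is primed by `trigger`.

    Layout: tail (10-20 nt) + GAGTC + spacer (4 nt) + region
    complementary to the trigger (8-16 nt).
    """
    if not 8 <= len(trigger) <= 16:
        raise ValueError("trigger should be 8 to 16 nucleotides long")
    if len(spacer) != 4:
        raise ValueError("spacer should be 4 nucleotides long")
    if not 10 <= len(tail) <= 20:
        raise ValueError("5' tail should be about 10 to 20 nucleotides long")
    return tail + SENSE_SITE + spacer + reverse_complement(trigger)


# Hypothetical sequences, for illustration only.
s = "ACGTACGTAC"
t1 = design_template(s, spacer="AAAA", tail="TTTTTTTTTT")                      # primed by S
t2 = design_template(reverse_complement(s), spacer="AAAA", tail="TTTTTTTTTT")  # primed by S'
print(t1)
print(t2)
```

Because T2 ends in the sequence S and T1 ends in its complement, the two 3' regions can pair with each other, which is the T1/T2 annealing that the 3' blocking or separate immobilization is meant to render harmless.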
G. Implementation formats
The fingerprinting assays described above (which may include a nicking, linear amplification, or exponential amplification reaction) may be carried out in various formats. For instance, the reactions may be performed in a mixture where all the components are soluble. Alternatively, one or all of the template(s) can be covalently attached at the 3' end or the 5' end to a solid phase with the use of cross-linkers or spacers. The solid phase includes (without limitation) nylon tip beads, fluted tips, microbeads, microplate wells, membranes, slides, arrays, and the materials of which the solid phase is made include glass, nylon 6/6, silica, plastics like polystyrene, polymers like poly(ethyleneimine), etc.
For example, a replicator type of exponential amplification reaction may be performed using immobilized templates. More specifically, the first template (T1) molecules may be linked to beads, while the second template (T2) molecules are linked to different beads. The beads linked with T1 molecules may be mixed with the beads linked with T2 molecules in a reaction mixture to amplify two oligonucleotides (i.e., S and S' as described above in the context of the replicator type of amplification reaction). In addition, such a reaction may be carried out to amplify multiple oligonucleotide sequences. Beads linked with template molecules other than T1 and T2 molecules may be included in the reaction mixture so that oligonucleotides other than S and S' may also be amplified. It is also possible and advantageous to perform amplification reactions (e.g., direct EXPAR as described above) on arrays of immobilized oligonucleotides. The arrays can be composed of elements separated spatially on a 2-dimensional solid support. Suitable solid supports include, but are not limited to, glass slides, wafers, beads, microbeads, rods, ribbons, nylon6/6, nylon parts, polymer-coated solid supports, wells, etc. The arrays can be further assembled on a 3-dimensional solid support.
In such a reaction, the amplification template (e.g., τ in Figure 2) is immobilized to a solid support at its 5' end or its 3' end, preferably at its 3' end. There may or may not be any spacer between the template oligonucleotide and the solid support. The immobilized template, when annealing to a trigger oligonucleotide, may be used as a template to amplify an oligonucleotide having a sequence identical to the trigger oligonucleotide. The newly synthesized oligonucleotide then primes an adjacent template oligonucleotide in the element on (or in) the array and an exponential amplification reaction takes place. Oligonucleotide amplification is detected by employing a DNA binding dye that preferentially binds to double-stranded DNA (e.g., SYBR® green).
The methods for immobilizing a nucleic acid or an oligonucleotide are known in the art. In certain embodiments, nucleic acids or oligonucleotides of the present invention are immobilized to a substrate to form an array. As used herein, an "array" refers to a collection of nucleic acids or oligonucleotides that are placed on a solid support in distinct areas. Each area is separated by some distance in which no nucleic acid or oligonucleotide is bound or deposited. In some embodiments, area sizes are 20 to 500 microns and the center to center distances of neighboring areas range from 50 to 1500 microns. The array of the present invention may contain 2-9, 10-100, 101-400, 401-1,000, or more than 1,000 distinct areas.
Generally, the nucleic acid or oligonucleotide may be immobilized to a substrate in the following two ways: (1) synthesizing the nucleic acids or the oligonucleotides directly on the substrate (often termed "in situ synthesis"), or (2) synthesizing or otherwise preparing the nucleic acids or the oligonucleotides separately and then positioning and binding them to the substrate (sometimes termed "post-synthetic attachment"). For in situ synthesis, the primary technology is photolithography. Briefly, the technology involves modifying the surface of a solid support with photolabile groups that protect, for example, oxygen atoms bound to the substrate through linking elements. This array of protected hydroxyl groups is illuminated through a photolithographic mask, producing reactive hydroxyl groups in the illuminated areas. A 3'-O-phosphoramidite-activated deoxynucleoside protected at the 5'-hydroxyl with the same photolabile group is then presented to the surface and coupling occurs through the hydroxyl group at illuminated areas. Following further chemical reactions, the substrate is rinsed and its surface is illuminated through a second mask to expose additional hydroxyl groups for coupling. A second 5'-protected, 3'-O-phosphoramidite-activated deoxynucleoside is presented to the surface. The selective photo-de-protection and coupling cycles are repeated until the desired set of products is obtained. Detailed description of using photolithography in array fabrication may be found in the following patents or published patent applications: U.S. Patent Nos. 5,143,854; 5,424,186; 5,856,101; 5,593,839; 5,908,926; 5,737,257; and Published PCT Patent Application Nos. WO99/40105; WO99/60156; WO00/35931.
The post-synthetic attachment approach requires a methodology for attaching pre-existing oligonucleotides to a substrate. One method uses the biotin-streptavidin interaction. 
Briefly, it is well known that biotin and streptavidin form a non-covalent, but very strong, interaction that may be considered equivalent in strength to a covalent bond. Alternatively, one may covalently bind pre-synthesized or pre-prepared nucleic acids or oligonucleotides to a substrate. For example, carbodiimides are commonly used in three different approaches to couple DNA to solid supports. In one approach, the support is coated with hydrazide groups that are then treated with carbodiimide and carboxy-modified oligonucleotide. Alternatively, a substrate with multiple carboxylic acid groups may be treated with an amino-modified oligonucleotide and carbodiimide. Epoxide-based chemistries are also used with amine-modified oligonucleotides. Detailed descriptions of methods for attaching pre-existing oligonucleotides to a substrate may be found in the following references: U.S. Patent Nos. 6,030,782; 5,760,130; 5,919,626; published PCT Patent Application No. WO00/40593; Stimpson et al. Proc. Natl. Acad. Sci. 92:6379-6383 (1995); Beattie et al. Clin. Chem. 41:700-706 (1995); Lamture et al. Nucleic Acids Res. 22:2121-2125 (1994); Chrisey et al. Nucleic Acids Res. 24:3031-3039 (1996); and Holmstrom et al., Anal. Biochem. 209:278-283 (1993).
The primary post-synthetic attachment technologies include ink jetting and mechanical spotting. Ink jetting involves the dispensing of nucleic acids or oligonucleotides using a dispenser derived from the ink-jet printing industry. The nucleic acids or oligonucleotides are withdrawn from the source plate up into the print head and then moved to a location above the substrate. The nucleic acids or oligonucleotides are then forced through a small orifice, causing the ejection of a droplet from the print head onto the surface of the substrate. Detailed description of using ink jetting in array fabrication may be found in the following patents: U.S. Patent Nos. 5,700,637; 6,054,270; 5,658,802; 5,958,342; 6,136,962 and 6,001,309.
Mechanical spotting involves the use of rigid pins. The pins are dipped into a nucleic acid or oligonucleotide solution, thereby transferring a small volume of the solution onto the tip of the pins. Touching the pin tips onto the substrate leaves spots, the diameters of which are determined by the surface energies of the pins, the nucleic acid or oligonucleotide solution, and the substrate. Mechanical spotting may be used to spot multiple arrays with a single nucleic acid or oligonucleotide loading. Detailed description of using mechanical spotting in array fabrication may be found in the following patents or published patent applications: U.S. Patent Nos. 6,054,270; 6,040,193; 5,429,807; 5,807,522; 6,110,426; 6,063,339; 6,101,946; and published PCT Patent Application Nos. WO99/36760; 99/05308; 00/01859; 00/01798.
One of ordinary skill in the art would appreciate that besides the techniques described above, other methods may also be used in immobilizing nucleic acids or oligonucleotides to a substrate. Descriptions of such methods can be found in, but are not limited to, the following patents or published patent applications: U.S. Patent Nos. 5,677,195; 6,030,782; 5,760,130; and 5,919,626; and published PCT Patent Application Nos. WO98/01221; WO99/41007; WO99/42813; WO99/43688; WO99/63385; WO00/40593; WO99/19341; and WO00/07022.
The substrate to which the nucleic acids or oligonucleotides of the present invention are immobilized to form an array is prepared from a suitable material. The substrate is preferably rigid and has a surface that is substantially flat. In some embodiments, the surface may have raised portions to delineate areas. Such delineation separates the amplification reaction mixtures at distinct areas from each other and allows for the amplification products at distinct areas to be analyzed or characterized individually. The suitable material includes, but is not limited to, silicon, glass, paper, ceramic, metal, metalloid, and plastics. Typical substrates are silicon wafers and borosilicate slides (e.g., microscope glass slides). An example of a particularly useful solid support is a silicon wafer that is usually used in the electronic industry in the construction of semiconductors. The wafers are highly polished and reflective on one side and can be easily coated with various linkers, such as poly(ethyleneimine) using silane chemistry. Wafers are commercially available from companies such as WaferNet, San Jose, CA.
Depending on the contemplated application, one of ordinary skill in the art may vary the composition of immobilized molecules of the present array. For instance, the T1 or T2 molecules of the present invention may or may not be immobilized to every distinct area of the array. Preferably, the nucleic acids or oligonucleotides in a distinct area of an array are homogeneous. More preferably, the nucleic acids or oligonucleotides in every distinct area of an array to which the nucleic acids or oligonucleotides are immobilized are homogeneous. The term "homogeneous," as used herein, indicates that each nucleic acid or oligonucleotide molecule in a distinct area has the same sequence as another nucleic acid or oligonucleotide molecule in the same area. 
Alternatively, the nucleic acid or oligonucleotide in at least one of the distinct areas of an array are heterogeneous. The term "heterogeneous," as used herein, indicates that at least one nucleic acid or oligonucleotide molecule in a distinct area has a different sequence from another nucleic acid or oligonucleotide molecule in the area. In some embodiments, molecules other than the nucleic acids or oligonucleotides described above may also be present in some or all of distinct areas of an array. For instance, a molecule useful as an internal control for the quality of an array may be attached to some or all of distinct areas of an array. Another example for such a molecule may be a nucleic acid useful as an indicator of hybridization stringency. In other embodiments, the composition of nucleic acids or oligonucleotides in every distinct area of an array is the same. Such an array may be useful in determining genetic variations in a particular gene in a selected population of organisms or in parallel diagnosis of a disease or a disorder associated with mutations in a particular gene.
Depending on the envisioned application, the immobilized nucleic acids or oligonucleotides of the present invention (e.g., the T1 or T2 molecules) may contain oligonucleotide sequences that are at least substantially complementary or identical to various target nucleic acids. Such target nucleic acids include, but are not limited to, genes associated with hereditary diseases in animals, oncogenes, genes related to disease predisposition, genomic DNAs useful for forensics and/or paternity determination, genes associated with or rendering desirable features in plants or animals, and genomic or episomic DNA of infectious organisms. An array of the present invention may contain nucleic acids or oligonucleotides that are at least substantially complementary or identical to a particular type of target nucleic acids in distinct areas. For example, an array may have a nucleic acid or an oligonucleotide that is at least substantially complementary or identical to a first gene related to disease predisposition in a first distinct area, another nucleic acid or an oligonucleotide that is at least substantially complementary or identical to a second gene also related to disease predisposition in a second distinct area, yet another nucleic acid or an oligonucleotide that is at least substantially complementary or identical to a third gene also related to disease predisposition in a third distinct area, etc. Such an array is useful to determine disease predisposition of an individual animal (including a human) or a plant. Alternatively, an array may have nucleic acids or oligonucleotides that are at least substantially complementary or identical to multiple types of target nucleic acids categorized by the functions of the targets.
In addition, an array may contain nucleic acids or oligonucleotides that are at least substantially complementary or identical to a portion of a target nucleic acid that contains various potential genetic variations. For instance, a first area of the array may contain immobilized nucleic acids or oligonucleotides that are at least substantially complementary or identical to a portion of a target gene that contains a genetic variation of one allele of the target. A second area of the array may contain immobilized nucleic acids or oligonucleotides that are at least substantially complementary or identical to a portion of the target gene that contains a genetic variation of another allele of the target. The array may have additional areas that contain immobilized nucleic acids or oligonucleotides that are at least substantially complementary or identical to portions of the target gene that contain genetic variations of additional alleles of the target. In general, for successful performance in an array environment, the immobilized nucleic acids or oligonucleotides must be stable and must not dissociate during various treatments, such as hybridization, washing or incubation at the temperature at which an amplification reaction is performed. The density of the immobilized nucleic acids or oligonucleotides must be sufficient for the subsequent analysis. For an array suitable for the present methods, typically 1000 to 10^12, preferably 1000 to 10^6, 10^6 to 10^9, or 10^9 to 10^12 ODNP molecules are immobilized in at least one distinct area. However, there must be minimal non-specific binding of other nucleic acids to the substrate. The immobilization process should not interfere with the ability of the immobilized nucleic acids or oligonucleotides to function in exponential nucleic acid amplification.
In certain embodiments, it may be desirable to have the nucleic acids or oligonucleotides of the present invention indirectly bound to the substrate via a linker. The linker (also referred to as a "linking element") comprises a chemical chain that serves to distance the nucleic acids or oligonucleotides from the substrate. In certain embodiments, the linker may be cleavable. There are a number of ways to position a linking element. In one common approach, the substrate is coated with a polymeric layer that provides linking elements with many reactive ends/sites. A common example is glass slides coated with polylysine, which are commercially available. Another example is substrates coated with poly(ethyleneimine) as described in Published PCT Application No. WO99/04896 and U.S. Patent No. 6,150,103.
The array of the present invention enables high-throughput performance of the various analyses to which the present nucleic acid amplification is applicable. For instance, an array of T2 molecules may be used to amplify multiple target nucleic acids. The reaction mixtures or the products of amplification reactions performed in the presence of a target nucleic acid may be pooled together and applied to the array of T2 molecules. Alternatively, the reaction mixtures or the amplification products of different amplification reactions may be applied to distinct areas of the array. Another round ("second round") of amplification reactions may then be performed on the array in the presence of a nicking agent that recognizes the nicking agent recognition sequence of which the antisense strand is present in the T2 molecules. The amplification products of the second round of reactions performed on the array may be pooled together and analyzed. If the array (e.g., a microwell array) has distinct areas that are delineated by certain physical barriers, the amplification products of the second round of reactions in distinct areas may be analyzed individually.
Nucleic acid molecules of the present invention that do not form an array may be immobilized via the methods described above that are useful in preparing an array. In addition, any methods known in the art may be used. For instance, a target nucleic acid of the present invention may be immobilized by the use of a fixative or tissue printing. It may also be first isolated or purified and then transferred to a substrate that binds to nucleic acids or oligonucleotides, such as nitrocellulose or nylon membranes.
H. Product Identification
The products of a nicking reaction, a linear amplification reaction or an exponential amplification reaction according to the present invention may be characterized by any applicable known methods. These methods include, but are not limited to, mass spectrometry, fluorescence spectrometry, electrophoresis, liquid chromatography, hybridization and radiography. Certain exemplary methods are described in more detail below. In certain embodiments, not all the amplified nucleic acids are characterized. In other words, in these embodiments, only certain amplified nucleic acids that meet a given criterion need be characterized. For instance, the amplified nucleic acid molecules may first be separated by liquid chromatography and only the fractions that contain short nucleic acid fragments are further characterized by, for example, mass spectrometry.
Mass Spectrometry
The fingerprinting assays of the present invention can be read out in a number of ways, but the preferred way is mass spectrometry, since a series of well-defined and characterized oligonucleotides is generated that have known mass/charge ratios (m/z). Exemplary mass spectrometric analyses include Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry (MALDI) and Time-of-Flight (TOF).
Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry (MALDI-MS) is becoming an ever more popular technique for studying biomolecules (Hillenkamp et al., Anal. Chem. 63, 1193A-1203A, 1991). This technique ionizes high molecular weight biopolymers with minimal concomitant fragmentation of the sample material. This is typically accomplished via the incorporation of the sample to be analyzed into a matrix that absorbs radiation from an incident UV or IR laser. This energy is then transferred from the matrix to the sample, resulting in desorption of the sample into the gas phase with subsequent ionization and minimal fragmentation. One of the advantages of MALDI-MS over ESI-MS is the simplicity of the spectra obtained: MALDI spectra are generally dominated by singly charged species. Typically, the gaseous ions generated by MALDI techniques are detected and analyzed by determining the time-of-flight (TOF) of these ions. While MALDI-TOF MS is not a high resolution technique, resolution can be improved by making modifications to such systems, e.g., by the use of tandem MS techniques, or by the use of other types of analyzers, such as Fourier transform (FT) and quadrupole ion traps.
MALDI techniques have found application for the rapid and straightforward determination of the molecular weight of certain biomolecules (Feng and Konishi, Anal. Chem. 64, 2090-2095, 1992; Nelson, Dogruel and Williams, Rapid Commun. Mass Spectrom. 8, 627-631, 1994). These techniques have been used to confirm the identity and integrity of certain biomolecules such as peptides, proteins, oligonucleotides, nucleic acids, glycoproteins, oligosaccharides and carbohydrates. Further, these MS techniques have found biochemical applications in the detection and identification of post-translational modifications on proteins. Verification of DNA and RNA sequences that are less than 100 bases in length has also been accomplished using ESI with FTMS to measure the molecular weight of the nucleic acids (Little et al., Proc. Natl. Acad. Sci. USA 92, 2318-2322, 1995).
The matrix is an important feature of MALDI-MS. Typically, analysis of nucleic acids by MALDI can be divided into two steps. The first step involves preparing the sample by mixing the sample to be analyzed with a molar excess of a chemical commonly referred to as the "matrix." See, e.g., Wu et al. Rapid Commun. Mass Spectrom. 7:142-146 (1993). The primary purpose of the matrix is to promote ionization of the nucleic acid. Without the matrix, the nucleic acid molecule tends to fragment upon exposure to the laser energy, so that the mass and identity of the nucleic acid are difficult or impossible to determine. In general, the term "matrix" refers to a substance which absorbs radiation at a wavelength substantially corresponding to the pulse of laser energy used in the MALDI method, and where the matrix facilitates desorption and ionization of molecules. A matrix may be any one of several small, light-absorbing chemicals that may be mixed in solution with a nucleic acid in such a manner that, upon drying on a solid support (e.g., a sample plate or a probe element), the crystalline matrix-embedded analyte molecules are successfully desorbed by laser irradiation, ionized from the solid phase crystals into the vapor phase, and accelerated as intact molecular ions. The second step of the MALDI process involves desorption of the bulk portions of the solid sample by a short pulse of laser light.
While this is probably the most commonly employed approach to combining matrix and analyte prior to MALDI-MS analysis, other methods may be used. For instance, according to one aspect of the invention, the analyte-containing sample is added to (e.g., spotted onto) a coating of cationic polyelectrolyte, allowing the analyte (nucleic acid) to bind to the cationic polyelectrolyte. This spot is then washed in order to purify the nucleic acid. The spot is then treated with matrix (when the matrix is a liquid) or a solution of matrix (when the matrix is a solid). When the matrix is a solid, the spot should be allowed to dry in order to remove the solvent that was used to dissolve the matrix. Thereafter, this spot of nucleic acid and matrix can be subjected to MALDI-MS to provide a very strong signal due to the nucleic acid.
Accordingly, in one aspect, the present invention provides a solid support having a surface, where that surface is at least partially coated with a coating comprising cationic polyelectrolyte, where at least some of the cationic polyelectrolyte is in contact with nucleic acid and the nucleic acid is in contact with matrix. In a preferred embodiment, the solid support is a plate, e.g., a stainless steel plate, and the cationic polyelectrolyte either forms a continuous coating across all or a significant portion of the surface, or is spotted onto the surface in distinct regions. Nucleic acid and matrix are then located in distinct regions on the surface, so as to provide an array-type appearance. For example, the surface may be a 96-well plate, with cationic polyelectrolyte, nucleic acid and matrix located in one, and preferably more than one, of the wells. This array is then subjected to MALDI-MS, where the various regions are sequentially subjected to laser light, and the mass spectrum of the nucleic acid present in the spots is sequentially obtained.
To achieve its purpose, the matrix should meet one or more of the following criteria, and preferably meets many or all of these criteria. The matrix should be able to embed and isolate nucleic acid (e.g., by co-crystallization), it should be soluble in solvents compatible with nucleic acids, it should be stable under the vacuum used in MALDI, it should assist co-desorption of the nucleic acid upon laser irradiation, and it should promote ionization of the nucleic acid. In order to meet these criteria, the matrix should comprise a chromophore that strongly absorbs in the wavelength of light being emitted by the laser. For instance, if the laser is an ultraviolet laser, then the matrix should have a chromophore that absorbs in the ultraviolet region.
The following chemicals have been identified as suitable matrices for nucleic acids, where these as well as other suitable matrix chemicals known in the art may be used in the methods and compositions of the present invention: 6-aza-2-thiothymine (ATT), glycerol, 2,4,6-trihydroxyacetophenone (THAP), picolinic acid (PA), 3-hydroxy picolinic acid (HPA), 2,5-dihydroxybenzoic acid, anthranilic acid, nicotinic acid, and salicylamide. Mixtures of these chemicals are also suitable. Typically, the matrix is a solid at room temperature. However, the matrix may be a liquid chemical, where suitable liquid matrices are substituted or unsubstituted: (1) alcohols, including: glycerol, 1,2- or 1,3-propane diol, 1,2-, 1,3- or 1,4-butane diol, triethanolamine; (2) carboxylic acids including: formic acid, lactic acid, acetic acid, propionic acid, butanoic acid, pentanoic acid, hexanoic acid and esters thereof; (3) primary or secondary amides including acetamide, propanamide, butanamide, pentanamide and hexanamide, whether branched or unbranched; (4) primary or secondary amines, including propylamine, butylamine, pentylamine, hexylamine, heptylamine, diethylamine and dipropylamine; (5) nitriles, hydrazine and hydrazide. These liquid matrices are particularly useful when the MALDI laser emits light in the infrared spectrum. It is reported that THAP works best for samples below 10 kDa while HPA and PA are more appropriate for oligonucleotides above 10 kDa. Acidic matrices, e.g., HPA, are preferred for single-stranded nucleic acids, while neutral matrices, e.g., glycerol and ATT, are preferred for double-stranded nucleic acids.
In certain preferred embodiments of the present invention, a type of mass spectrometry other than MALDI may be employed. MS is particularly advantageous in those applications in which it is desirable to eliminate a size separation step prior to molecular weight determination. MS may discriminate mass differences of 1 amu or less. The smallest mass difference between nucleic acid bases is that between adenine and thymine, which is 9 Daltons. Particularly preferred methodologies according to the present invention employ Liquid Chromatography-Time-of-Flight Mass Spectrometry (LC-TOF-MS). LC-TOF-MS is composed of an orthogonal acceleration Time-of-Flight (TOF) MS detector for atmospheric pressure ionization (API) analysis using electrospray (ES) or atmospheric pressure chemical ionization (APCI). LC-TOF-MS provides high mass resolution (5000 FWHM), high mass measurement accuracy (to within 5 ppm) and very good sensitivity (the ability to detect femtomolar amounts of DNA polymer). TOF instruments are generally more sensitive than quadrupoles, but are correspondingly more expensive. LC-TOF-MS has a more efficient duty cycle than instruments that sequentially analyze one mass at a time while rejecting all others (this is referred to as single ion monitoring (SIM)), because LC-TOF-MS samples all of the ions passing into the TOF analyzer at the same time. This provides quantitative data and improves the sensitivity between 10 and 100 fold. Enhanced resolution (5000 FWHM) and mass measurement accuracy of better than 5 ppm imply that differences between nucleosides as small as 9 amu (Daltons) can be accurately measured. The TOF mass analyzer performs very high frequency sampling (10 spectra/sec) of all ions simultaneously across the full mass range of interest.
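By way of illustration only, the resolution argument above can be checked with a short calculation. The following Python sketch is not part of the described methods; the average residue masses and the terminal correction are commonly used reference values rather than values taken from this specification, the function names are hypothetical, and the calculation works on neutral average masses rather than on measured m/z values.

```python
# Average masses (Da) of deoxynucleotide monophosphate residues within a DNA chain.
# Commonly used reference values; an assumption, not taken from this specification.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
# Correction for an oligonucleotide with 5'-OH and 3'-OH ends (no terminal phosphate).
TERMINAL_CORRECTION = -61.96


def oligo_mass(seq):
    """Approximate average mass (Da) of a single-stranded DNA oligonucleotide."""
    return sum(RESIDUE_MASS[base] for base in seq.upper()) + TERMINAL_CORRECTION


def fwhm_peak_width(mass, resolution=5000):
    """Peak width (Da) at half maximum for a resolving power m / delta-m."""
    return mass / resolution


def ppm_window(mass, ppm=5):
    """Absolute mass error (Da) corresponding to a relative accuracy in ppm."""
    return mass * ppm / 1_000_000


def resolvable(seq_a, seq_b, resolution=5000):
    """True if the two masses differ by more than the FWHM width of the heavier one."""
    m_a, m_b = oligo_mass(seq_a), oligo_mass(seq_b)
    return abs(m_a - m_b) > fwhm_peak_width(max(m_a, m_b), resolution)


if __name__ == "__main__":
    # Two hypothetical 10-mers that differ by a single A/T substitution.
    a, b = "GAGTCAAAAA", "GAGTCAAAAT"
    print(oligo_mass(a) - oligo_mass(b), fwhm_peak_width(oligo_mass(a)), resolvable(a, b))
```

Under these assumptions a single A-to-T substitution shifts the mass by about 9 Da, which is well above the FWHM peak width of a short oligonucleotide at a resolving power of 5000.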
The duty cycle of the LC-TOF-MS allows high sensitivity spectra to be recorded in quick succession, making the instrument compatible with more efficient separation techniques such as narrow bore LC, capillary electrophoresis (CE) and capillary electrochromatography (CEC). The ions are pulsed into the analyzer, effectively taking a 'snapshot' of the ions present at any time.
In the first stage the ES or APCI aerosol spray is directed perpendicularly past the sampling cone, which is displaced from the central axis of the instrument. Ions are extracted orthogonally from the spray into the sampling cone aperture, leaving large droplets, involatile materials, particulates and other unwanted components to collect in the vent port that is protected with an exchangeable liner. The second orthogonal step enables the volume of gas (and ions) sampled from atmosphere to be increased compared with conventional API sources. Gas at atmospheric pressure sampled through an aperture into a partial vacuum forms a freely expanding jet, which represents a region of high pressure compared to the surrounding vacuum. When this jet is directed into the second aperture of a conventional API interface it increases the flow of gas through the second aperture. Maintaining a suitable vacuum in the MS-TOF therefore places a restriction on the maximum diameter of the apertures in such an LC interface. Ions in the partial vacuum of the ion block are extracted electrostatically into the hexapole ion bridge that efficiently transports ions to the analyzer.
The coupling of the TOF mass analyzers with MUX-technology allows the connection of up to 8 HPLC columns in parallel to a single LC-TOF-MS (Micromass, Manchester, UK). A multiplexed electrospray (ESI) interface is used for on-line LC-MS utilizing an indexed stepper motor to sequentially sample from up to 8 HPLC columns or liquid inlets operated in parallel.
Use of LC-TOF-MS is sometimes preferred over use of MALDI-TOF because LC-TOF-MS is a quantitative method for analysis of the molecular weight of polymers. LC-TOF-MS does not fragment the polymers and it employs a very gentle ionization process compared to matrix-assisted laser desorption/ionization (MALDI). Because every MALDI blast is different, the ionization is not quantitative. LC-TOF-MS does, however, produce different m/z values for polymers.
High Performance Liquid Chromatography (HPLC)
High-Performance Liquid Chromatography (HPLC) is a chromatographic technique for separation of compounds dissolved in solution. HPLC instruments consist of a reservoir of mobile phase, a pump, an injector, a separation column, and a detector. Compounds are separated by injecting an aliquot of the sample mixture onto the column. The different components in the mixture pass through the column at different rates due to differences in their partitioning behavior between the mobile liquid phase and the stationary phase. The pumps provide a steady high pressure with no pulsation, and can be programmed to vary the composition of the solvent during the course of the separation.
Exemplary detectors useful within the methods of the present invention include UV-VIS absorption detectors, fluorescence detectors (after excitation with a suitable wavelength), mass spectrometers and IR spectrometers. Recently, IP-RP-HPLC on non-porous PS/DVB particles with chemically bonded alkyl chains has been shown to be a rapid alternative to capillary electrophoresis in the analysis of both single- and double-strand nucleic acids, providing similar degrees of resolution (Huber et al., Anal. Biochem. 212:351 (1993); Huber et al., Nuc. Acids Res. 21:1061 (1993); Huber et al., Biotechniques 16:898 (1993)). In contrast to ion-exchange chromatography, which does not always retain double-strand DNA as a function of strand length (since AT base pairs interact with the positively charged stationary phase more strongly than GC base pairs), IP-RP-HPLC enables a strictly size-dependent separation.
Using 100 mM triethylammonium acetate as the ion-pairing reagent, phosphodiester oligonucleotides have been successfully separated on alkylated non-porous 2.3 μm poly(styrene-divinylbenzene) particles by means of high performance liquid chromatography (Oefner et al., Anal. Biochem. 223:39 (1994)). The technique described allows the separation of PCR products differing by only 4 to 8 base pairs in length within a size range of 50 to 200 nucleotides. Denaturing HPLC (DHPLC) is an ion-pair reversed-phase high performance liquid chromatography methodology (IP-RP-HPLC) that uses a non-porous C-18 column as the stationary phase. The column is comprised of a polystyrene-divinylbenzene copolymer. The mobile phase is comprised of an ion-pairing agent, triethylammonium acetate (TEAA), which mediates binding of DNA to the stationary phase, and acetonitrile (ACN) as an organic agent to achieve subsequent separation of the DNA from the column. A linear gradient of acetonitrile allows separation of fragments based on size and/or presence of heteroduplexes (this is the traditional use of the DHPLC technology). DHPLC identifies mutations and polymorphisms based on detection of heteroduplex formation between mismatched nucleotides in double stranded PCR amplified DNA. Sequence variation creates a mixed population of heteroduplexes and homoduplexes during reannealing of wild type and mutant DNA. When this mixed population is analyzed by HPLC under partially denaturing temperatures, the heteroduplexes elute from the column earlier than the homoduplexes because of their reduced melting temperature. Analysis can be performed on individual samples to determine heterozygosity, or on mixed samples to identify sequence variation between individuals.
Real-time Fluorescence
The various nicking and amplification reactions described above may also be read out by detectors that measure real-time fluorescence, such as the MJ Opticon from MJ Research (Boston, MA), the ABI Prism 7000 instrument (Foster City, CA), and endpoint plate readers, such as the Ultramark from Biorad (Hercules, CA). Real time monitoring is a very useful method as it enables parameters such as initial rates to be determined with accuracy and ease. The use of double-strand specific fluorescent dyes such as SYBR® green from Molecular Probes (Eugene, OR) is especially useful during the amplification reactions described above. Dyes that bind to single strand nucleic acids can also be used, perhaps at times with slightly less efficacy than double-strand specific dyes. Currently, there are four types of detection chemistries that can be used for real time fluorescence readout: intercalating dyes such as SYBR®, dual labeled probes, FRET (fluorescent energy transfer) probes and Molecular Beacons. Exemplary fluorescent intercalating agents include, without limitation, those disclosed in U.S. Pat. Nos. 4,119,521; 5,599,932; 5,658,735; 5,734,058; 5,763,162; 5,808,077; 6,015,902; 6,255,048 and 6,280,933, those discussed in Glazer and Rye, Nature 359:859-61, 1992, PicoGreen dye, and SYBR® dyes such as SYBR® Gold, SYBR® Green I and SYBR® Green II (Molecular Probes, Eugene, OR). Fluorescence produced by fluorescent intercalating agents may be detected by various detectors, including PMTs, CCD cameras, fluorescent-based microscopes, fluorescent-based scanners, fluorescent-based microplate readers, and fluorescent-based capillary readers.
To increase the signal produced by fluorescent intercalating agents, after the amplification reaction, the nicking agent in the reaction mixture may be inactivated (e.g., by heat) and a fresh DNA polymerase may be added. The presence of the active DNA polymerase, but not the nicking agent, allows any partial duplexes to be extended to completely double-stranded nucleic acid fragments. The generation of such completely double-stranded nucleic acid fragments allows the binding of a greater number of fluorescent intercalating agents, which in turn increases the signal that may be detected.
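As an illustration of how an initial rate might be derived from a real-time fluorescence trace of the kind described above, the following Python sketch fits a straight line to the first few readings. It is a minimal sketch under stated assumptions: the function name, the choice of five points and the example readings are hypothetical, and a linear fit to early time points is only one of several ways to estimate an initial rate.

```python
def initial_rate(times, signal, n_points=5):
    """Least-squares slope of the first n_points readings.

    times and signal are equal-length sequences; the result is expressed in
    signal units per time unit. n_points is an illustrative choice only.
    """
    t = list(times[:n_points])
    y = list(signal[:n_points])
    if len(t) < 2 or len(t) != len(y):
        raise ValueError("need at least two paired readings")
    t_mean = sum(t) / len(t)
    y_mean = sum(y) / len(y)
    numerator = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    denominator = sum((ti - t_mean) ** 2 for ti in t)
    if denominator == 0:
        raise ValueError("time points must not all be identical")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical readings: time in seconds, fluorescence in arbitrary units.
    seconds = [0, 10, 20, 30, 40, 50]
    fluorescence = [100, 120, 140, 160, 180, 185]
    print(initial_rate(seconds, fluorescence))
```

Restricting the fit to the earliest readings keeps the estimate away from the plateau that appears once the signal begins to saturate.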
I. Compositions/Kits
In one aspect, the present invention provides a composition or kit comprising a polynucleotide, a nicking agent and a polymerase. The inventive compositions have unique properties that render them particularly useful in identifying the source organism of a nucleic acid sample. Such a source organism may be a bacterium, fungus, virus, plant, non-human animal or human.
Such a composition or kit generally comprises the template oligonucleotide(s) useful for amplifying initiating oligonucleotides described above. It may also further comprise at least one, two, several, or each of the following components: (1) a nicking agent (e.g., a NE or a RE) that recognizes the nicking agent recognition sequence of which one strand is present in the template oligonucleotide(s); (2) a suitable buffer for nicking agent (1); (3) a RE that functions in combination with a nicking agent (which may be identical to or different from nicking agent (1)); (4) a suitable buffer for RE (3); (5) a DNA polymerase; (6) a suitable buffer for the DNA polymerase (5); (7) dNTPs; (8) a modified dNTP; (9) a strand displacement facilitator (e.g., 1 M trehalose); and (10) a fluorescent intercalating agent. Detailed descriptions of many of the above components have been provided above.
J. Applications
The present invention alleviates and overcomes many drawbacks of the present state of the art through the discovery of novel methods and kits for rapidly fingerprinting DNA to identify prokaryotic and eukaryotic species, subspecies, and especially strains or individuals of the subspecies. With respect to prokaryotic organisms, the present invention is especially suited for identifying different bacterial strains involved in, for example, nosocomial infections, since the methods and kits are sensitive enough to detect differences between, for example, bacterial isolates of the same species. With respect to eukaryotes, the present invention contemplates identifying, for instance, species, subspecies, and the differences between the individuals of the subspecies, such as pedigrees.
In various aspects of the present invention the method can be used for: (1) diagnosis of bacterial disease in plants, animals and humans; (2) monitoring for bacterial content and/or contamination in the environment; (3) monitoring food for bacterial contamination; (4) monitoring manufacturing processes for bacterial contamination; (5) monitoring quality assurance/quality control of laboratory tests involving microbiological assays; (6) tracing bacterial contamination and/or outbreaks of bacterial infections; (7) genome mapping; (8) monitoring bioremediation sites; and (9) monitoring agricultural sites for test crops, bacteria and recombinant molecules.
The method is useful on pure or isolated cultures as well as actual samples from the test site. Because of the simplicity of the test, it can also be automated for rapid assay of samples. A further aspect of the present invention is a machine for automating the identification of bacterial strains, particularly by mass spectrometry.
The present invention affords the medical community with a means to not only identify the infectious agents, but also to rapidly characterize the strain or strains involved so that effective measures may be timely employed. Another application of this method is in the manufacturing process. A number of manufacturing processes for instance drugs, microorganism-aided synthesis, food manufacturing, chemical manufacturing and fermentation process all rely either on the presence or absence of bacteria. In either case the method of the present invention can be used. It can monitor bacterial contamination or test that strain purity is being maintained. This method can also be used to test stored blood for bacterial contamination. This would be important in blood banking where bacteria such as Yersinia enterocolitica can cause serious infection and death if it is in transfused blood. The procedure can also be used for quality assurance and quality control in monitoring bacterial contamination in laboratory tests. For example the Guthrie bacterial inhibition assay uses a specific strain of bacteria to measure phenylalanine in newborn screening. If this strain changes it could affect test results and thus affect the accuracy of the newborn screening program. This method of the present invention can be used to monitor the strain's purity. Any other laboratory test that uses or relies on bacteria in the assay can be monitored. The laboratory or test environment can also be monitored for bacterial contamination by sampling the lab and testing for specific strains of bacteria. This procedure will also be useful in hospitals for tracing the origin and distribution of bacterial infections. It can show whether or not the infection of the patient is a hospital-specific strain. The type of treatment and specific anti-bacterial agent can depend on the source and nature of the bacteria. There are a variety of applications for the fingerprinting technology described here. 
An immediate need is in rapid pathogen detection and bio-threat. Since short oligonucleotides can be generated and amplified (108 - 109 fold) from a nucleic acid sample in a 3-5 minutes time frame and the amplification of can be achieved in 3 minutes, it is conceivable that 10-15 minutes assay can be constructed. It is certain that clinical utility can be gained from the viral quantification as a tool for disease monitoring and cancer typing. Since the assay is highly quantitative using real-time fluorescence readout, we see rapid real-time molecular diagnostics to aid in the appropriate use of antimicrobials. This type of fingerprinting technology will also speed the development and application of nucleic acid-based technologies for microbial content analyses in foods stock. The fingerprinting technology described herein may be useful to detect polymorphisms in the human genome, in view of the large number of fragments that can be generated. The methods and compositions of the present invention may be used to interrogate a sample for the presence of fragments that uniquely identify all pathogens, and fragments obtained from the human genome that can be used to uniquely identify individuals.
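The amplification figures quoted above can be put in perspective with simple arithmetic. The following Python sketch assumes an idealized model in which the product doubles at a constant interval; that model, and the function names, are illustrative assumptions and are not a description of the actual reaction kinetics.

```python
from math import log2


def doublings_for_fold(fold):
    """Number of ideal doublings needed to reach a given fold amplification."""
    if fold < 1:
        raise ValueError("fold amplification must be at least 1")
    return log2(fold)


def seconds_per_doubling(fold, minutes):
    """Average time per doubling if the fold amplification is reached in the given minutes."""
    return minutes * 60 / doublings_for_fold(fold)


if __name__ == "__main__":
    for fold, minutes in ((1e8, 3), (1e9, 5)):
        print(fold, minutes, doublings_for_fold(fold), seconds_per_doubling(fold, minutes))
```

Under this idealized model, a 10^8 to 10^9 fold amplification corresponds to roughly 27 to 30 doublings, so a 3 to 5 minute reaction implies an average doubling interval on the order of several seconds.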
The following examples are provided for purposes of illustration, and the invention is not limited thereby.
EXAMPLES
EXAMPLE 1
COMPUTATIONAL RESULTS FROM GENOMIC DNAS USING N.BSTNB I NICKING ENZYME AND N.ALW I NICKING ENZYME
Computer-assisted identification of adjacent nicking enzyme recognition sites of N.BstNB I and/or N.Alw I in E. coli was performed. Oligonucleotide fragments 6-16 nucleotides in length that would be generated from the genomic DNA of E. coli K12 and O157 are shown in Tables 1 and 2, respectively. A total of 115 oligonucleotides that are 6-16 nucleotides long could be generated from these two strains. Of the combined 115 oligonucleotides, 70 oligonucleotides have unique masses, whereas each of the remaining fragments has a mass identical to that of another fragment (see Table 3). Twelve oligonucleotides uniquely identify K12, and 15 oligonucleotides uniquely identify O157 (Table 4). These oligonucleotides, which distinguish the two E. coli strains, are not found among the oligonucleotides that would be generated in the presence of N.BstNB I from chromosome 21 of the human genome (see Table 5).
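The site search described above can be illustrated with a minimal sketch. It assumes, based on the flanking sequences listed in Table 4, that a fragment lies between a recognition sequence (GAGTC for N.BstNB I or GGATC for N.Alw I) and the reverse complement of a recognition sequence (GACTC or GATCC), with four nucleotides separating each site from the fragment. The function names, the `SPACER` constant and the regular-expression approach are illustrative assumptions, not the program used to produce the tables.

```python
import re

# Recognition sequences as read on the top strand, and their reverse complements.
SENSE_SITES = ("GAGTC", "GGATC")       # N.BstNB I, N.Alw I
ANTISENSE_SITES = ("GACTC", "GATCC")   # reverse complements of the above
SPACER = 4  # assumption: nucleotides between each site and the fragment (inferred from Table 4)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_fragments(dna, min_len=6, max_len=16):
    """Return (start, fragment, end) tuples with 1-based inclusive coordinates.

    A zero-width lookahead is used so that overlapping windows are all tested;
    the lazy quantifier reports the shortest fragment for each starting site.
    """
    pattern = re.compile(
        "(?=(?:%s)[ACGT]{%d}([ACGT]{%d,%d}?)[ACGT]{%d}(?:%s))"
        % ("|".join(SENSE_SITES), SPACER, min_len, max_len,
           SPACER, "|".join(ANTISENSE_SITES))
    )
    hits = []
    for match in pattern.finditer(dna.upper()):
        hits.append((match.start(1) + 1, match.group(1), match.end(1)))
    return hits


if __name__ == "__main__":
    # Flanking sequence of the first O157 entry in Table 4.
    for start, fragment, end in find_fragments("GAGTCATTGTCAATGTAAAAGTTGATCC"):
        print(start, fragment, end, revcomp(fragment))
```

The sketch reports the top-strand fragment together with its reverse complement; which strand is released, and whether a further recognition site interrupts the window, is not modelled here.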
Table 6 lists the relative probability of obtaining an overlapping mass (composition) as a function of the length of the trigger in a background of DNA that is as complex as the human genome (4 billion bases).
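Because mass depends only on base composition, two fragments of the same length and composition cannot be separated by mass. The sketch below is a simple counting argument and not the calculation behind Table 6: it counts how many distinct A/C/G/T compositions exist for a fragment of a given length and how many sequences share each composition on average. The function names are hypothetical.

```python
from math import comb


def composition_count(length):
    """Number of distinct (A, C, G, T) count tuples that sum to `length`."""
    # Stars-and-bars: ways to distribute `length` bases over 4 letters.
    return comb(length + 3, 3)


def sequences_per_composition(length):
    """Average number of distinct sequences sharing one composition."""
    return 4 ** length / composition_count(length)


if __name__ == "__main__":
    for n in (6, 10, 16):
        print(n, composition_count(n), round(sequences_per_composition(n), 1))
```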
Table 7 lists the number of oligonucleotides having 6-16 nucleotides that would be generated in the presence of N.BstNB I from 35 bacterial species. The average number of oligonucleotides that would be generated per organism is 19. Greater than 99% of these fragments have a unique mass and sequence, thus virtually every one of the fingerprinting oligonucleotides is unique to the organism from which it is generated.
Table 1: E. coli K12 oligonucleotide fragments.
Start = position of the first nucleotide of the oligonucleotide in genomic DNA; End = position of the last nucleotide of the oligonucleotide in genomic DNA; Mass = base mass; Difference = mass difference between adjacent fragments.
Start Product 5'->3' End Mass m/z=2 m/z=3 m/z=4 Difference
4064758 CACAAC 4064763 1825.2 911.6 607.4 455.3 38.0
2111739 GTCTGC 2111744 1863.2 930.6 620.1 464.8 8.0
765268 GATTCA 765273 1871.2 934.6 622.7 466.8 18.0
4571546 AACAGA 4571551 1889.2 943.6 628.7 471.3 22.0
4549455 GATAGT 4549460 1911.2 954.6 636.1 476.8 7.0
4567093 TGTTGG 4567098 1918.2 958.1 638.4 478.6 169.1
49704 CTCTCTC 49710 2087.3 1042.7 694.8 520.8 9.0
1139368 ACCCTTC 1139374 2096.4 1047.2 697.8 523.1 49.0
1403700 GACATCC 1403706 2145.4 1071.7 714.1 535.3 12.0
3715545 TTCGTTT 3715551 2157.4 1077.7 718.1 538.3 4.0
1274893 GATCCCG 1274899 2161.4 1079.7 719.5 539.3 5.0
2201638 TGTTACT 2201644 2166.4 1082.2 721.1 540.6 3.0
3528528 CCATAGA 3528534 2169.4 1083.7 722.1 541.4 22.0
3528528 TCTATGG 3528534 2191.4 1094.7 729.5 546.9 2.0
2201638 AGTAACA 2201644 2193.4 1095.7 730.1 547.4 7.0
4397983 GATCAGT 4397989 2200.4 1099.2 732.5 549.1 16.0
1403700 GGATGTC 1403706 2216.4 1107.2 737.8 553.1 40.0
3901602 GATGGTG 3901608 2256.5 1127.2 751.2 563.1 18.0
49704 GAGAGAG 49710 2274.5 1136.2 757.2 567.6 144.1
3143034 AACATCCC 3143041 2418.6 1208.3 805.2 603.6 47.0
2638367 TGTCGCCA 2638374 2465.6 1231.8 820.9 615.4 273.2
1156599 TCCACCAGT 1156607 2738.8 1368.4 911.9 683.7 15.0
1325420 TGTACCACT 1325428 2753.8 1375.9 916.9 687.4 33.0
2413336 AAAATCTCG 2413344 2786.8 1392.4 927.9 695.7 32.0
272636 AACGTGCGT 272644 2818.8 1408.4 938.6 703.7 17.0
2074077 TGAAAAACG 2074085 2835.9 1416.9 944.3 708.0 287.2
1651695 AACGTGTTGC 1651704 3123.0 1560.5 1040.0 779.8 1.0
3436566 CGCGAGTTGC 3436575 3124.0 1561.0 1040.3 780.0 17.0
1093455 TAAAGCGGCA 1093464 3141.0 1569.5 1046.0 784.3 246.1
1302101 CACTACTTTGG 1302111 3387.2 1692.6 1128.1 845.8 3.0
386112 CGCAGCATCAA 386122 3390.2 1694.1 1129.1 846.6 55.0
1302101 CCAAAGTAGTG 1302111 3445.2 1721.6 1147.4 860.3 258.2
2974219 TGCAGACCACAA 2974230 3703.4 1850.7 1233.5 924.9 40.0
427004 CCAGTGAGCAAA 427015 3743.4 1870.7 1246.8 934.9 13.0
1701613 CGGTATATCGTG 1701624 3756.4 1877.2 1251.1 938.1 42.1
1413750 TGAAGCTAAGGA 1413761 3798.5 1898.2 1265.2 948.6 233.1
2937922 AGTGCGCCGCTGC 2937934 4031.6 2014.8 1342.9 1006.9 29.0
105759 TGCTTGGACAGTT 105771 4060.6 2029.3 1352.5 1014.2 80.0
312061 GGATTGAGTGTTG 312073 4140.7 2069.3 1379.2 1034.2 19.0
2275435 GTTGAGCGAGGAG 2275447 4159.7 2078.8 1385.6 1038.9 473.3
2063725 ATAAGCGCCTCTGCG 2063739 4633.0 2315.5 1543.3 1157.2 17.0
3963947 ACCGTAAACAGGCAT 3963961 4650.0 2324.0 1549.0 1161.5 9.0
2273723 TCAGGCCAAACAGAA 2273737 4659.0 2328.5 1552.0 1163.8 14.0
2063725 CGCAGAGGCGCTTAT 2063739 4673.0 2335.5 1556.7 1167.3 12.0
412051 CGTGTTTTATGGCTG 412065 4685.0 2341.5 1560.7 1170.3 20.1
198629 TTGAAACGGGCATAA 198643 4705.1 2351.5 1567.4 1175.3 60.0
3408973 AGGGGTTTTTGGGTT 3408987 4765.1 2381.5 1587.4 1190.3 89.1
1097197 TCTCCGCTTTTCCATG 1097212 4854.1 2426.1 1617.0 1212.5 83.1
3826599 CAACCGGTTGCGCATT 3826614 4937.2 2467.6 1644.7 1233.3 4.0
2524967 TGAAACTTTTTCGTAT 2524982 4941.2 2469.6 1646.1 1234.3 45.0
3826599 AATGCGCAACCGGTTG 3826614 4986.2 2492.1 1661.1 1245.6 20.0
558515 TTCTGGATGAATGTTA 558530 5006.2 2502.1 1667.7 1250.6 4.0
1595241 TACCGTGATGACAGAG 1595256 5010.3 2504.1 1669.1 1251.6 58.1
1097197 CATGGAAAAGCGGAGA 1097212 5068.3 2533.2 1688.4 1266.1
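The mass and m/z columns of Table 1 can be approximated with the following sketch. It rests on assumptions not stated in the text: standard average masses for the four deoxynucleotide residues, one added water (18.02 Da, as for a fragment carrying a terminal phosphate and a terminal hydroxyl), and an m/z value for charge state z computed as (M - z)/z, which is what the tabulated numbers appear to follow. The names `RESIDUE_MASS`, `oligo_mass` and `mz` are illustrative.

```python
# Assumed average residue masses (Da) of deoxynucleotide monophosphates in a chain.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
WATER = 18.02    # assumed end-group correction (terminal phosphate plus hydroxyl)
PROTON = 1.0     # assumed mass removed per negative charge; the table appears to use about 1 Da


def oligo_mass(seq):
    """Approximate average mass of a single-stranded DNA fragment."""
    return sum(RESIDUE_MASS[base] for base in seq.upper()) + WATER


def mz(mass, charge):
    """Mass-to-charge ratio for an ion that has lost `charge` protons."""
    return (mass - charge * PROTON) / charge


if __name__ == "__main__":
    m = oligo_mass("CACAAC")
    print(round(m, 1), [round(mz(m, z), 1) for z in (2, 3, 4)])
```

For the first row of Table 1 (CACAAC) this gives values that agree with the tabulated mass and m/z entries to within rounding.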
Table 2: E. coli O157 oligonucleotide fragments.
Start Product 5'->3' End Mass m/z=2 m/z=3 m/z=4 difference
1095650 CCTCCC 1095655 1768.1 883.1 588.4 441.0 57.1
4862976 CACAAC 4862981 1825.2 911.6 607.4 455.3 38.0
2787876 GTCTGC 2787881 1863.2 930.6 620.1 464.8 48.1
5406708 GATAGT 5406713 1911.2 954.6 636.1 476.8 7.0
5432628 TGTTGG 5432633 1918.2 958.1 638.4 478.6 169.1
54118 CTCTCTC 54124 2087.3 1042.7 694.8 520.8 9.0
1499252 ACCCTTC 1499258 2096.4 1047.2 697.8 523.1 49.0
1915148 GACATCC 1915154 2145.4 1071.7 714.1 535.3 12.0
4465490 TTCGTTT 4465496 2157.4 1077.7 718.1 538.3 9.0
2882357 TGTTACT 2882363 2166.4 1082.2 721.1 540.6 27.0
2882357 AGTAACA 2882363 2193.4 1095.7 730.1 547.4 7.0
5254130 GATCAGT 5254136 2200.4 1099.2 732.5 549.1 16.0
1915148 GGATGTC 1915154 2216.4 1107.2 737.8 553.1 15.0
2914783 GATGGTT 2914789 2231.4 1114.7 742.8 556.9 43.0
54118 GAGAGAG 54124 2274.5 1136.2 757.2 567.6 144.1
3885373 AACATCCC 3885380 2418.6 1208.3 805.2 603.6 28.0
1764961 CTGCTTTT 1764968 2446.6 1222.3 814.5 610.6 9.0
2148971 ATTTCCTG 2148978 2455.6 1226.8 817.5 612.9 10.0
3356488 TGTCGCCA 3356495 2465.6 1231.8 820.9 615.4 0.0
5347073 CTTCGGCA 5347080 2465.6 1231.8 820.9 615.4 49.0
5347073 TGCCGAAG 5347080 2514.6 1256.3 837.2 627.7 8.0
2148971 CAGGAAAT 2148978 2522.7 1260.3 839.9 629.7 216.1
1516471 TCCACCAGT 1516479 2738.8 1368.4 911.9 683.7 15.0
1828106 TGTACCACT 1828114 2753.8 1375.9 916.9 687.4 33.0
3141487 AAAATCTCG 3141495 2786.8 1392.4 927.9 695.7 62.0
368468 TGGAATGTT 368476 2848.8 1423.4 948.6 711.2 266.2
154799 TCAATGTAAA 154808 3115.0 1556.5 1037.3 777.8 9.0
4173941 CGCGAGTTGC 4173950 3124.0 1561.0 1040.3 780.0 263.2
1746921 CACTACTTTGG 1746931 3387.2 1692.6 1128.1 845.8 3.0
447890 CGCAGCATCAA 447900 3390.2 1694.1 1129.1 846.6 6.0
392522 TGGCAATATCC 392532 3396.2 1697.1 1131.1 848.1 49.0
1746921 CCAAAGTAGTG 1746931 3445.2 1721.6 1147.4 860.3 21.0
1660620 GATATTATTGG 1660630 3466.3 1732.1 1154.4 865.6 237.2
3698597 TGCAGACCACAA 3698608 3703.4 1850.7 1233.5 924.9 40.0
491409 CCAGTGAGCAAA 491420 3743.4 1870.7 1246.8 934.9 13.0
2303220 CGGTATATCGTG 2303231 3756.4 1877.2 1251.1 938.1 42.1
1925197 TGAAGCTAAGGA 1925208 3798.5 1898.2 1265.2 948.6 233.1
3662300 AGTGCGCCGCTGC 3662312 4031.6 2014.8 1342.9 1006.9 29.0
110364 TGCTTGGACAGTT 110376 4060.6 2029.3 1352.5 1014.2 99.1
3012345 GTTGAGCGAGGAG 3012357 4159.7 2078.8 1385.6 1038.9 134.1
2268011 TTTAACGCCAGTTC 2268024 4293.8 2145.9 1430.3 1072.4 11.0
1856922 CCGCGCGTAATGCC 1856935 4304.8 2151.4 1433.9 1075.2 80.0
1856922 GGCATTACGCGCGG 1856935 4384.8 2191.4 1460.6 1095.2 7.0
2268011 GAACTGGCGTTAAA 2268024 4391.9 2194.9 1463.0 1097.0 207.1
2701876 CGGTATTTCATCCCG 2701890 4599.0 2298.5 1532.0 1148.7 34.0
2735173 ATAAGCGCCTCTGCG 2735187 4633.0 2315.5 1543.3 1157.2 17.0
4759738 ACCGTAAACAGGCAT 4759752 4650.0 2324.0 1549.0 1161.5 9.0
3010633 TCAGGCCAAACAGAA 3010647 4659.0 2328.5 1552.0 1163.8 14.0
2735173 CGCAGAGGCGCTTAT 2735187 4673.0 2335.5 1556.7 1167.3 12.0
474811 CGTGTTTTATGGCTG 474825 4685.0 2341.5 1560.7 1170.3 21.0
2701876 CGGGATGAAATACCG 2701890 4706.1 2352.0 1567.7 1175.5 59.0
4146296 AGGGGTTTTTGGGTT 4146310 4765.1 2381.5 1587.4 1190.3 89.1
1457114 TCTCCGCTTTTCCATG 1457129 4854.1 2426.1 1617.0 1212.5 83.1
4573519 CAACCGGTTGCGCATT 4573534 4937.2 2467.6 1644.7 1233.3 4.0
3251326 TGAAACTTTTTCGTAT 3251341 4941.2 2469.6 1646.1 1234.3 38.0
1785538 ACGGTGCCTGCCCGGG 1785553 4979.2 2488.6 1658.7 1243.8 0.0
1952836 ACGGTGCCTGCCCGGG 1952851 4979.2 2488.6 1658.7 1243.8 0.0
2228257 ACGGTGCCTGCCCGGG 2228272 4979.2 2488.6 1658.7 1243.8 7.0
4573519 AATGCGCAACCGGTTG 4573534 4986.2 2492.1 1661.1 1245.6 24.0
2121139 TACCGTGATGACAGAG 2121154 5010.3 2504.1 1669.1 1251.6 58.1
1457114 CATGGAAAAGCGGAGA 1457129 5068.3 2533.2 1688.4 1266.1
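The "difference" column of Table 2 is the gap between the mass of a fragment and that of the next heavier fragment, so a value of 0.0 marks fragments that cannot be resolved by mass. A minimal sketch of that bookkeeping follows, using five rows copied from Table 2; the function names and the `tolerance` parameter are hypothetical.

```python
def adjacent_differences(rows):
    """Return (product, difference to next heavier fragment) for each row but the last."""
    ordered = sorted(rows, key=lambda row: row[1])  # stable sort by mass
    result = []
    for (product, mass), (_, next_mass) in zip(ordered, ordered[1:]):
        result.append((product, round(next_mass - mass, 1)))
    return result


def unresolved_pairs(rows, tolerance=0.05):
    """Return neighbouring products whose masses differ by no more than `tolerance`."""
    ordered = sorted(rows, key=lambda row: row[1])
    return [
        (a[0], b[0])
        for a, b in zip(ordered, ordered[1:])
        if abs(b[1] - a[1]) <= tolerance
    ]


if __name__ == "__main__":
    # (product, mass) pairs taken from Table 2.
    table2_rows = [
        ("CTGCTTTT", 2446.6),
        ("ATTTCCTG", 2455.6),
        ("TGTCGCCA", 2465.6),
        ("CTTCGGCA", 2465.6),
        ("TGCCGAAG", 2514.6),
    ]
    print(adjacent_differences(table2_rows))
    print(unresolved_pairs(table2_rows))
```

TGTCGCCA and CTTCGGCA have the same base composition, which is why their masses coincide even though their sequences differ.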
Table 3. Comparison of the shared and unique fingerprinting fragments between the two E. coli strains K12 and O157
Figure imgf000070_0001
Table 4. Unique fingerprinting oligonucleotides for E. coli strains O157H7 and K12
Sig. Start Organism Product End Sequence
5-1-1-3 154799 escherichia_coli_0157H7.fna TCAATGTAAA 154808 GAGTCATTGTCAATGTAAAAGTTGATCC
2-0-3-4 368468 escherichia_coli_0157H7.fna TGGAATGTT 368476 GGATCCCGTAACATTCCAGACAGACTC
3-3-2-3 392522 escherichia_coli_0157H7.fna TGGCAATATCC 392532 GGATCGGGTGGATATTGCCACCATGACTC
1-0-5-0 1095650 escherichia_coli_0157H7.fna GGAGGG 1095655 GGATCACCGGGGAGGGCTGGACTC
3-0-3-5 1660620 escherichia_coli_0157H7.fna GATATTATTGG 1660630 GGATCAAATCCAATAATATCCCATGACTC
0-2-1-5 1764961 escherichia_coli_0157H7.fna CTGCTTTT 1764968 GAGTCCGGGCTGCTTTTTGCGGATCC
1-6-7-2 1785538 escherichia_coli_0157H7.fna ACGGTGCCTGCCCGGG 1785553 GAGTCGGTGACGGTGCCTGCCCGGGAGGTGATCC
2-4-6-2 1856922 escherichia_coli_0157H7.fna GGCATTACGCGCGG 1856935 GAGTCGTGCGGCATTACGCGCGGCGCAGACTC
1-6-7-2 1952836 escherichia_coli_0157H7.fna ACGGTGCCTGCCCGGG 1952851 GAGTCGGTGACGGTGCCTGCCCGGGAGGTGATCC
4-1-2-1 2148971 escherichia_coli_0157H7.fna CAGGAAAT 2148978 GAGTCTCTTCAGGAAATGCGCGACTC
1-6-7-2 2228257 escherichia_coli_0157H7.fna ACGGTGCCTGCCCGGG 2228272 GGATCACCTCCCGGGCAGGCACCGTCACCGACTC
3-4-2-5 2268011 escherichia_coli_0157H7.fna TTTAACGCCAGTTC 2268024 GAGTCATCTTTTAACGCCAGTTCCTGCGACTC
5-3-5-2 2701876 escherichia_coli_0157H7.fna CGGGATGAAATACCG 2701890 GAGTCCGAACGGTATTTCATCCCGTGATGACTC
1-0-3-3 2914783 escherichia_coli_0157H7.fna GATGGTT 2914789 GGATCAGGTAACCATCCGTGGACTC
2-2-3-1 5347073 escherichia_coli_0157H7.fna TGCCGAAG 5347080 GAGTCCGGCCTTCGGCACCAAGACTC
6-2-4-3 198629 escherichia_coli_K12.fna TTGAAACGGGCATAA 198643 GGATCGCGGTTATGCCCGTTTCAACATCGACTC
2-2-3-2 272636 escherichia_coli_K12.fna AACGTGCGT 272644 GGATCAGTGACGCACGTTTCATGACTC
2-0-6-5 312061 escherichia_coli_K12.fna GGATTGAGTGTTG 312073 GGATCATTTCAACACTCAATCCCTGTGACTC
4-1-4-7 558515 escherichia_coli_K12.fna TTCTGGATGAATGTTA 558530 GAGTCTCTGTTCTGGATGAATGTTAAGACGATCC
2-1-1-2 765268 escherichia_coli_K12.fna GATTCA 765273 GGATCATGATGAATCTGACGACTC
4-2-3-1 1093455 escherichia_coli_K12.fna TAAAGCGGCA 1093464 GAGTCACCATAAAGCGGCATTTTGATCC
1-3-2-1 1274893 escherichia_coli_K12.fna GATCCCG 1274899 GGATCAGATCGGGATCAAGAGACTC
2-2-3-3 1651695 escherichia_coli_K12.fna AACGTGTTGC 1651704 GGATCGACAGCAACACGTTCACCGACTC
5-1-2-1 2074077 escherichia_coli_K12.fna TGAAAAACG 2074085 GAGTCCGTCTGAAAAACGCACGGATCC
3-2-1-1 3528528 escherichia_coli_K12.fna CCATAGA 3528534 GAGTCAAAGCCATAGACAATGACTC
1-0-4-2 3901602 escherichia_coli_K12.fna GATGGTG 3901608 GGATCTCCACACCATCATCTGACTC
4-1-1-0 4571546 escherichia_coli_K12.fna AACAGA 4571551 GGATCGATGTCTGTTCAGGGACTC
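The Sig. column in Table 4 is the base composition of the product, written as the counts of A, C, G and T in that order, and the strain-specific entries are the products found for one strain but not the other. Both steps can be sketched as follows; the function names are hypothetical, and the example products are taken from Tables 1, 2 and 4.

```python
def signature(product):
    """Return the composition signature as 'A-C-G-T' counts, e.g. '5-1-1-3'."""
    product = product.upper()
    return "-".join(str(product.count(base)) for base in "ACGT")


def strain_specific(products, other_products):
    """Return the products present in one strain's set but absent from the other's."""
    return set(products) - set(other_products)


if __name__ == "__main__":
    print(signature("TCAATGTAAA"))   # first O157 row of Table 4

    # Small subsets of Tables 1 and 2, for illustration only.
    k12 = {"CACAAC", "GATGGTG"}
    o157 = {"CACAAC", "GATGGTT"}
    print(strain_specific(k12, o157), strain_specific(o157, k12))
```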
Table 5: Oligonucleotide fragments from human chromosome 21. Sig. (signature) indicates the number of As-Cs-Gs-Ts (A-C-G-T). The "Sequence" column includes the sequence of the sense or antisense strand of the recognition sequence(s) of N.BstNB I and/or N.Alw I.
Sig. Start Product End Sequence
0-0-2-4 21342426 TTTTGG 21342431 GAGTCCCGATTTTGGGCAAGACTC
0-0-2-4 33597518 TTGTGT 33597523 GAGTCAGCCACACAAGTTAGACTC
0-0-2-4 33598110 TTGTGT 33598115 GAGTCAGCCACACAAGTTAGACTC
0-0-2-6 3869614 TTGTTGTT 3869621 GAGTCAAGGTTGTTGTTACCAGATCC
0-0-5-2 25071211 GGGTGTG 25071217 GAGTCCCCCCACACCCTCTGGACTC
0-1-0-13 31168454 TCTTTTTTTTTTTT 31168467 GAGTCTCTCTCTTTTTTTTTTTTTTGAGATCC
0-1-0-5 24179267 TTTTCT 24179272 GGATCCAGCAGAAAAGAATGACTC
0-1-1-4 3885471 TCTTTG 3885476 GAGTCCAGCCAAAGATTATGACTC
0-1-1-4 8035824 CTGTTT 8035829 GAGTCCCAACTGTTTAGAGGACTC
0-1-2-3 33597444 TTGCGT 33597449 GAGTCAGCCACGCAAGTTAGACTC
0-1-2-3 33597666 TTGCGT 33597671 GAGTCAGCCACGCAAGTTAGACTC
0-1-2-3 33597814 TTGCGT 33597819 GAGTCAGCCACGCAAGTTAGACTC
0-1-2-3 33598036 TTGCGT 33598041 GAGTCAGCCACGCAAGTTAGACTC
0-1-2-3 33598332 TTGCGT 33598337 GAGTCAGCCACGCAAGTTAGACTC
0-1-2-5 15651283 CTGTTTGT 15651290 GAGTCCTGAACAAACAGACATGACTC
0-1-3-2 8623105 TGGGTC 8623110 GAGTCTGATTGGGTCTGCAGACTC
0-1-3-3 31857088 GGTGTCT 31857094 GAGTCATGTGGTGTCTGTGGGACTC
0-1-3-3 31935633 GGTGTCT 31935639 GAGTCCCACAGACACCACGTGACTC
0-1-4-2 27397521 GTTGGGC 27397527 GGATCTCAGGCCCAACCAGTGACTC
0-1-4-2 32790478 GGGTCTG 32790484 GAGTCTGCTCAGACCCTCTGGACTC
0-1-4-4 28501162 GCTGTGTGT 28501170 GAGTCCCTTGCTGTGTGTCATGGATCC
0-2-0-4 12632883 TCTTCT 12632888 GAGTCAGAGTCTTCTTTCTGACTC
0-2-0-4 14277246 TCTTTC 14277251 GAGTCAACTGAAAGAGATCGACTC
0-2-0-5 12773697 TCTCTTT 12773703 GAGTCATACAAAGAGACTTAGACTC
0-2-2-2 29443352 GGCTCT 29443357 GGATCCCAGAGAGCCCCACGACTC
0-2-2-5 21080027 GTCCTTTTG 21080035 GAGTCTTGAGTCCTTTTGATGTGATCC
0-2-2-6 9023054 TCTGGTTTTC 9023063 GAGTCTGCCTCTGGTTTTCGGAAGATCC
0-2-3-2 18327068 GCTCTGG 18327074 GAGTCAGAGGCTCTGGACTTGACTC
0-2-3-4 24560854 TTTCCGGGT 24560862 GGATCTCCCACCCGGAAAAGCAGACTC
0-2-4-0 15389884 CGGGCG 15389889 GAGTCACCGCGCCCGGCCGGACTC
0-2-4-0 32334583 GGGGCC 32334588 GAGTCCACAGGGGCCACAAGATCC
0-2-4-2 19442336 TGCCTGGG 19442343 GAGTCAAACTGCCTGGGTTTGGATCC
0-2-4-5 205051 TTGGCCTTGTG 205061 GAGTCAGGCCACAAGGCCAAGTCTGACTC
0-2-8-4 262944 GCGTTGGGGCGGTT 262957 GAGTCGTCAGCGTTGGGGCGGTTGTGGGATCC
0-3-1-2 30154426 CTCTGC 30154431 GAGTCTACCGCAGAGTAATGACTC
0-3-2-2 13784707 CCGTGTC 13784713 GAGTCACACGACACGGTCAAGACTC
0-3-2-2 33520625 GGTCTCC 33520631 GAGTCAGTGGGAGACCCCAAGACTC
0-3-2-3 19126274 CGTCTTGC 19126281 GAGTCTGCAGCAAGACGATCTGACTC
0-3-2-4 11217484 CTGCCTGTT 11217492 GAGTCCGCTCTGCCTGTTATTTGATCC
0-3-3-0 1106893 GGGCCC 1106898 GAGTCCTCAGGGCCCAGGTGACTC
0-3-3-5 9929406 TGGCTTTGCCT 9929416 GGATCCACTAGGCAAAGCCAGGGGGACTC
0-3-4-2 31617757 TGTCCCGGG 31617765 GAGTCACGCCCCGGGACACACGGACTC
0-3-5-4 32203026 TGCCTGCGGTGT 32203037 GAGTCCTCCTGCCTGCGGTGTGAGTGATCC
0-4-0-2 32391382 CCCTTC 32391387 GAGTCCTGTCCCTTCTTTTGACTC
0-4-0-4 2022002 TCTTCTCC 2022009 GAGTCTAATTCTTCTCCTGTTGACTC
0-4-1-3 29987299 CCTCGTTC 29987306 GAGTCTCCTCCTCGTTCACTTGACTC
0-4-2-0 15389884 CGCCCG 15389889 GAGTCACCGCGCCCGGCCGGACTC
0-4-2-3 29473088 TGCCTGTCC 29473096 GAGTCGCTGTGCCTGTCCTGCAGACTC
0-4-3-2 27248245 TGCCTGGCC 27248253 GAGTCACCATGCCTGGCCCTGAGACTC
0-4-3-3 18383621 TGTGCCCTCG 18383630 GAGTCCCACTGTGCCCTCGGGATGATCC
0-4-3-4 31292411 TGCCTTGCCTG 31292421 GAGTCTCTCTGCCTTGCCTGTCCCGACTC
0-4-4-1 22220483 CCTGGCCGG 22220491 GAGTCGCAGCCGGCCAGGTGGGGACTC
0-4-4-3 27383276 GGCTGTGCCTC 27383286 GAGTCCCAGGGCTGTGCCTCCACCGATCC
0-5-0-5 10650004 CCTTCTTCTC 10650013 GGATCTTTGGAGAAGAAGGGACGGACTC
0-5-4-6 21631191 TCTGTTTGGGCCCCT 21631205 GAGTCATTCAGGGGCCCAAACAGATGGAGACTC
0-6-3-2 18915782 TGCCCGGCCTC 18915792 GAGTCAGAAGAGGCCGGGCACAGTGACTC
0-7-0-1 5132291 CCCCCTCC 5132298 GGATCAGGAGGAGGGGGAGAGGACTC
0-7-3-1 29842128 CCCCCTCGGGC 29842138 GAGTCCCAGCCCCCTCGGGCCTCTGACTC
1-0-2-3 21124548 TTGGAT 21124553 GAGTCTTCTTTGGATTGTGGACTC
1-0-3-4 8622993 GAGTTTTG 8623000 GAGTCAAGGCAAAACTCCTGAGACTC
1-0-4-1 29789692 GTGGAG 29789697 GAGTCACGGCTCCACTCTGGACTC
1-0-4-2 31022101 GGTGTGA 31022107 GAGTCCAGTGGTGTGATCTCGACTC
1-0-6-6 33647155 GGGTTTTGTAGGT 33647167 GAGTCTCCAGGGTTTTGTAGGTAAACGATCC
1-0-7-2 3807096 GAGTGGGTGG 3807105 GGATCTTCTCCACCCACTCCACCGACTC
1-1-1-3 8943855 CTGATT 8943860 GAGTCAGAGCTGATTCATAGATCC
1-1-1-3 9986275 TTTGCA 9986280 GAGTCTCTTTGCAAATTGTGACTC
1-1-1-3 13756486 TCGATT 13756491 GAGTCAAGGAATCGAGGATGACTC
1-1-1-4 29197174 CTGATTT 29197180 GGATCTCCGAAATCAGAGGCGACTC
1-1-2-2 11437659 GGATTC 11437664 GAGTCTTGTGAATCCCTTTGACTC
1-1-2-2 21573587 ATCTGG 21573592 GAGTCTGAAATCTGGGATGGATCC
1-1-2-3 8996410 CTTGGTA 8996416 GAGTCCCTTCTTGGTATCAAGACTC
1-1-2-3 11364756 TCAGTTG 11364762 GAGTCATGCCAACTGAAGCAGACTC
1-1-2-4 729516 TGCTGTAT 729523 GGATCTAATATACAGCATGAGGACTC
1-1-2-4 15353782 TGCTGTAT 15353789 GGATCTAATATACAGCATGATGACTC
1-1-2-4 19277839 ATTCTTGG 19277846 GAGTCTTGGATTCTTGGATAGGACTC
1-1-2-4 30112644 TGATGCTT 30112651 GGATCCACCAAGCATCATGCAGACTC
1-1-2-4 30578468 TGTTGACT 30578475 GGATCAGAAAGTCAACAGCAGGACTC
1-1-3-2 2054154 CTGGAGT 2054160 GGATCCAGAACTCCAGCCTAGACTC
1-1-3-2 25479138 TGCTGGA 25479144 GAGTCTAGGTGCTGGATCCAGACTC
1-1-3-4 13758263 GGCTTTAGT 13758271 GGATCAAGGACTAAAGCCTGGAGACTC
1-1-4-4 30620951 ATTGTGGCTG 30620960 GAGTCCCACATTGTGGCTGGGGAGACTC
1-1-4-6 7653890 GGTATTTCTGGT 7653901 GAGTCAAATGGTATTTCTGGTTCTAGATCC
1-1-5-6 33893597 GAGTTTTCTGGGT 33893609 GGATCGTTTACCCAGAAAACTCTAAAGACTC
1-1-6-3 14180710 TGGGGGCTTAG 14180720 GAGTCAAACTGGGGGCTTAGAAATGACTC
1-1-6-3 16393787 GTTGCGGTGAG 16393797 GGATCTCGGCTCACCGCAACCTCTGACTC
1-1-7-4 28887813 GGTAGGTGGGCTT 28887825 GAGTCTCTAGGTAGGTGGGCTTTAGAGATCC
1-1-8-2 30805916 TGGGAGCGTGGG 30805927 GAGTCCTGTTGGGAGCGTGGGGTGTGATCC
1-2-0-3 13942792 ACTTCT 13942797 GAGTCCCAGACTTCTCAGAGATCC
1-2-1-2 6583673 GTACCT 6583678 GAGTCCCATGTACCTCTAAGATCC
1-2-1-2 19177346 CTGACT 19177351 GAGTCTGATCTGACTTTTGGACTC
1-2-1-2 32832241 CAGCTT 32832246 GAGTCACTGAAGCTGAACAGACTC
1-2-1-4 20284976 CTTCTTAG 20284983 GAGTCCTCCCTTCTTAGCTTGGATCC
1-2-2-2 28989805 CTAGGCT 28989811 GAGTCCGTCAGCCTAGGTTAGACTC
1-2-2-2 29601185 GCCTTGA 29601191 GAGTCAGGTGCCTTGACTTGGATCC
1-2-2-2 30045931 TCTGGAC 30045937 GAGTCTTTTTCTGGACTCCAGATCC
1-2-2-9 17293703 ATGTTTCCTTGTTT 17293716 GGATCTAGAAAACAAGGAAACATTCTTGACTC
1-2-3-1 32584915 CCGGAGT 32584921 GAGTCACTGACTCCGGGGCTGACTC
1-2-3-4 32905734 TTGCTCTGGA 32905743 GAGTCCGTGTCCAGAGCAAGGGTGACTC
1-2-4-1 20619692 GAGGTGCC 20619699 GAGTCCCAGGAGGTGCCCAGTGACTC
1-2-4-2 25643851 CGCTGGATG 25643859 GAGTCCAGCCGCTGGATGGCCAGACTC
1-2-5-1 32059013 GGATGGGCC 32059021 GGATCAATCGGCCCATCCTGTGGACTC
1-2-5-3 29196098 CTGGTGAGCGT 29196108 GGATCTCTCACGCTCACCAGGAAAGACTC
1-2-5-6 33752323 GCTTTTAGGGTCTG 33752336 GAGTCAGCACAGACCCTAAAAGCTAAGGACTC
1-2-6-2 30361545 TGGGTGCCAGG 30361555 GAGTCAAGGTGGGTGCCAGGGGCTGACTC
1-2-6-4 33917713 TCTGGCTGGAGTG 33917725 GAGTCTCGTTCTGGCTGGAGTGCAGTGACTC
1-2-7-2 10589062 GTTGCAGGGCGG 10589073 GAGTCTGATGTTGCAGGGCGGGAAGGATCC
1-3-0-3 27334605 ACTCTCT 27334611 GAGTCCAGCACTCTCTGGGTGATCC
1-3-2-0 30024200 GGACCC 30024205 GAGTCCCATGGACCCCAGAGATCC
1-3-2-1 32584915 ACTCCGG 32584921 GAGTCACTGACTCCGGGGCTGACTC
1-3-2-2 29863514 ATCCTGGC 29863521 GGATCTGGAGCCAGGATTAATGACTC
1-3-2-3 29736871 GTCTCTGAC 29736879 GAGTCCTTTGTCTCTGACCCAGGACTC
1-3-2-9 5408125 TTTCTGTCTTTGACT 5408139 GAGTCTCTGTTTCTGTCTTTGACTCTCTGACTC
1-3-3-2 23936390 CCATCTGGG 23936398 GAGTCCCCACCATCTGGGCCCAGATCC
1-3-3-3 25598534 CCTATGGGCT 25598543 GAGTCCCCACCTATGGGCTGAGGGATCC
1-3-3-6 18506656 CTGAGTGTTCTCT 18506668 GAGTCTTACAGAGAACACTCAGCATAGACTC
1-3-3-7 28059253 CTTTTCTATGCTGG 28059266 GAGTCTCTGCTTTTCTATGCTGGGTAAGATCC
1-3-4-2 30518999 GAGTGGCTCC 30519008 GAGTCCTGGGAGTGGCTCCCCTAGACTC
1-3-5-4 21002027 TCTTGGCTGGGCA 21002039 GAGTCGATGTCTTGGCTGGGCACAGTGACTC
1-3-5-5 19923865 TGCCTGCTGTGGAT 19923878 GGATCTCCCATCCACAGCAGGCACACAGACTC
1-3-6-2 31227003 GGGGGTCTCCAG 31227014 GAGTCCGCGGGGGGTCTCCAGACAGGATCC
1-3-7-0 29842128 GCCCGAGGGGG 29842138 GAGTCCCAGCCCCCTCGGGCCTCTGACTC
1-3-8-4 19067255 GGGTCCCTTGGGGATG 19067270 GGATCCCTCCATCCCCAAGGGACCCACCTGACTC
1-4-0-1 29789692 CTCCAC 29789697 GAGTCACGGCTCCACTCTGGACTC
1-4-0-4 28253771 CCATCCTTT 28253779 GAGTCTCACCCATCCTTTTGTAGATCC
1-4-0-7 8044034 TTTTCTCCTACT 8044045 GAGTCAAGTTTTTCTCCTACTCTTAGATCC
1-4-1-4 3000479 TTTCCCAGTC 3000488 GAGTCAGTGGACTGGGAAAGGCAGACTC
1-4-1-4 6605602 TTTCCCAGTC 6605611 GAGTCTGCCTTTCCCAGTCCACTGACTC
1-4-1-4 16808362 TCTCTCAGTC 16808371 GAGTCTGTCTCTCTCAGTCCACTGACTC
1-4-2-1 20619692 GGCACCTC 20619699 GAGTCCCAGGAGGTGCCCAGTGACTC
1-4-2-2 21527811 CCTGAGCTC 21527819 GGATCACTTGAGCTCAGGAGTGGACTC
1-4-2-4 2636234 TGGCTTCACCT 2636244 GAGTCCAGCTGGCTTCACCTAGTGGATCC
1-4-3-2 31957583 CCCATCTGGG 31957592 GAGTCATCCCCCAGATGGGGACAGACTC
1-4-3-6 28721600 TCTAGTGTCTCCTG 28721613 GAGTCTGTTCAGGAGACACTAGAGAGGGACTC
1-4-4-0 22220483 CCGGCCAGG 22220491 GAGTCGCAGCCGGCCAGGTGGGGACTC
1-4-4-3 12281311 CCTGCTGAGCTG 12281322 GAGTCTGCACAGCTCAGCAGGTAGTGACTC
1-4-4-7 20634452 TGGCTTGTACTTTGCC 20634467 GAGTCAGCCGGCAAAGTACAAGCCATGCTGACTC
1-5-0-0 1077004 CCACCC 1077009 GAGTCCATACCACCCAAGTGATCC
1-5-1-2 30821736 CCCCTGATC 30821744 GAGTCCAACCCCCTGATCTGTGGATCC
1-5-1-3 899804 TCTCCTGACC 899813 GAGTCTTGATCTCCTGACCTCGTGATCC
1-5-1-3 5585769 TCTCCCAGTC 5585778 GAGTCAGTGGACTGGGAGAGGCAGACTC
1-5-2-6 14980846 ATCGTTCCTTCCTG 14980859 GAGTCCATTCAGGAAGGAACGATATGTGACTC
1-5-3-1 15145160 CACGCGGCTC 15145169 GAGTCAGTCCACGCGGCTCTGCAGATCC
1-5-3-1 19767972 GGCCTCCAGC 19767981 GGATCACTAGCTGGAGGCCTTGGGACTC
1-5-4-3 6529088 CCCAGGTCTGGCT 6529100 GAGTCCTGGCCCAGGTCTGGCTAGTGGATCC
1-5-6-3 32610836 GGGTGGCCCTGTCCA 32610850 GAGTCACTGGGGTGGCCCTGTCCAGTCTGACTC
1-6-0-1 32926206 CCTCCCCA 32926213 GGATCAATGTGGGGAGGTTCAGACTC
1-6-1-3 2129131 GACTCTCCTCC 2129141 GAGTCAAATGGAGGAGAGTCCCATGACTC
1-6-4-3 12170409 TGGTGCCCCTGCAC 12170422 GAGTCGAAATGGTGCCCCTGCACTCCAGACTC
1-6-4-3 26181246 TCCGGAGCCTCTCG 26181259 GAGTCTGATTCCGGAGCCTCTCGGGTGGATCC
1-8-2-1 24027206 CCACCTCCGCGC 24027217 GGATCCTGGGCGCGGAGGTGGGTCCGACTC
1-9-1-3 33439792 CCCCAGTCTCCCCT 33439805 GAGTCCCCTCCCCAGTCTCCCCTCCCAGACTC
2-0-0-4 9427183 ATTTAT 9427188 GAGTCAAGTATAAATCCAGGACTC
2-0-1-3 11180295 ATGTAT 11180300 GAGTCCTATATACATTCTAGACTC
2-0-2-4 9363911 GGTATTTA 9363918 GAGTCTTAAGGTATTTATCAGGATCC
2-0-3-1 28169212 AGTGGA 28169217 GAGTCAAAAAGTGGAGAACGATCC
2-0-3-1 31441095 ATGGGA 31441100 GAGTCTACGATGGGACCGAGATCC
2-0-3-3 28401060 GTTTAGGA 28401067 GGATCCTTGTCCTAAACATTCGACTC
2-0-3-4 9409920 TGTGATATG 9409928 GGATCTTTCCATATCACAGTCAGACTC
2-0-4-0 32391382 GAAGGG 32391387 GAGTCCTGTCCCTTCTTTTGACTC
2-0-6-2 9349292 GGTTGGGGAA 9349301 GGATCTTTCTTCCCCAACCCAGTGACTC
2-0-6-6 1506217 GTGGTGTATTGAGT 1506230 GAGTCAACAGTGGTGTATTGAGTGTATGACTC
2-0-7-1 10808696 GGGTGGGGAA 10808705 GAGTCAGTGGGGTGGGGAAGGCAGATCC
2-1-0-5 619226 TCATTTAT 619233 GAGTCAGATATAAATGAATCAGACTC
2-1-0-5 15224385 TCATTTAT 15224392 GAGTCAGATATAAATGAGTCAGACTC
2-1-0-7 22190209 ATTTTATTCT 22190218 GGATCAGTCAGAATAAAATTGATGACTC
2-1-1-2 211515 GAACTT 211520 GGATCTCCCAAGTTCTGATGACTC
2-1-1-2 26774780 TAAGCT 26774785 GAGTCTTCCAGCTTATGGGGACTC
2-1-1-2 27292190 AATTCG 27292195 GGATCTCCGCGAATTTATAGACTC
2-1-1-2 28430669 TAACTG 28430674 GGATCACAGCAGTTACAGAGACTC
2-1-1-3 1437969 AGTCTAT 1437975 GAGTCCCATAGTCTATTTCTGACTC
2-1-1-3 5262945 AGACTTT 5262951 GAGTCCTCTAGACTTTGATTGACTC
2-1-1-4 7980226 TTTTGAAC 7980233 GAGTCAAGGTTTTGAACTCAAGACTC
2-1-1-6 12544191 GTCATTTTAT 12544200 GAGTCCCAAGTCATTTTATATTTGATCC
2-1-1-7 11947722 TTGTCTATTAT 11947732 GAGTCTGGAATAATAGACAATGGAGACTC
2-1-2-1 19177346 AGTCAG 19177351 GAGTCTGATCTGACTTTTGGACTC
2-1-2-1 32832241 AAGCTG 32832246 GAGTCACTGAAGCTGAACAGACTC
2-1-2-3 12598485 TGTGTCAA 12598492 GAGTCTAAATTGACACACAAGGACTC
2-1-2-3 18502098 ATGTGACT 18502105 GGATCACAGAGTCACATGAGTGACTC
2-1-2-3 21268368 TGAATCTG 21268375 GAGTCTCTCTGAATCTGCCGTGATCC
2-1-2-4 9867805 GTTTAGATC 9867813 GAGTCAAAGGATCTAAACCTATGACTC
2-1-3-0 11386861 GGAAGC 11386866 GGATCAATGGCTTCCCACTGACTC
2-1-3-0 30154426 GCAGAG 30154431 GAGTCTACCGCAGAGTAATGACTC
2-1-3-1 11314754 AAGTGGC 11314760 GGATCTCCTGCCACTTCTCTGACTC
2-1-3-1 30743951 GATGGCA 30743957 GAGTCCTCAGATGGCATGGTGATCC
2-1-3-3 7200489 TGCTAAGTG 7200497 GAGTCTGGCTGCTAAGTGGCCTGATCC
2-1-3-3 14978682 GAGGATTCT 14978690 GAGTCTGTGAGAATCCTCTATTGACTC
2-1-3-6 4549922 GATCTTTTGTAG 4549933 GGATCCTAGCTACAAAAGATCCCATGACTC
2-1-3-6 12106378 GGTATTTCTAGT 12106389 GAGTCAAATGGTATTTCTAGTGCTAGATCC
2-1-3-6 12365955 GGTATTTCTAGT 12365966 GAGTCAAATGGTATTTCTAGTGCTAGATCC
2-1-4-2 28136865 AGATCTGGG 28136873 GAGTCAGAGAGATCTGGGTTTGGATCC
2-1-4-3 14222830 GACTTGGTGA 14222839 GAGTCAGTGGACTTGGTGAGGCAGATCC
2-1-4-5 5975272 GGTATATCTGGT 5975283 GGATCTAGAACCAGATATACCATTTGACTC
2-1-5-4 22052058 GTGAGCTGTGTA 22052069 GAGTCCACAGTGAGCTGTGTAGATGGACTC
2-1-6-1 10781164 GGCTGGGGAA 10781173 GAGTCCGCAGGCTGGGGAAGGCAGATCC
2-1-6-1 14383318 GGCTGGGGAA 14383327 GAGTCAGTGGGCTGGGGAAGGCAGATCC
2-1-6-1 15549307 GGCTGGGGAA 15549316 GGATCTGCCTTCCCCAGCCCACTGACTC
2-1-6-1 15551718 GGCTGGGGAA 15551727 GGATCTGCCTTCCCCAGCCCACTGACTC
2-1-6-1 24002408 GGCTGGGGAA 24002417 GAGTCAGTGGGCTGGGGAAGGTAGATCC
2-1-6-2 32080603 GCAGAGTGTGG 32080613 GGATCTTCTCCACACTCTGCACAGGACTC
2-1-8-3 20750485 GTGGGAGTGTGGCA 20750498 GAGTCCCACGTGGGAGTGTGGCATTGGGACTC
2-2-0-3 12533497 TTTCCAA 12533503 GAGTCAGGATTTCCAAAGAGGACTC
2-2-0-4 30068854 TTAACTCT 30068861 GGATCAGCGAGAGTTAATGCTGACTC
2-2-0-6 18038803 TTTCTTCATA 18038812 GAGTCAGAGTATGAAGAAAGATTGACTC
2-2-0-7 13894251 ATTTTTTCATC 13894261 GAGTCCATCATTTTTTCATCTTTGGATCC
2-2-1-1 11437659 GAATCC 11437664 GAGTCTTGTGAATCCCTTTGACTC
2-2-1-1 22626704 AGTCAC 22626709 GAGTCCCCCAGTCACTGAAGATCC
2-2-1-3 22498075 ACCTTGAT 22498082 GAGTCCAATATCAAGGTATTGGACTC
2-2-1-4 22683181 TATTGCCTA 22683189 GAGTCACTGTATTGCCTAAGGAGACTC
2-2-2-1 28989805 AGCCTAG 28989811 GAGTCCGTCAGCCTAGGTTAGACTC
2-2-2-2 7173254 TCTGCGAA 7173261 GAGTCACCATCTGCGAAAATTGATCC
2-2-2-2 22855526 ACACTGGT 22855533 GAGTCTGTGACCAGTGTTGGAGACTC
2-2-2-3 12738618 TCCTTAAGG 12738626 GAGTCGCACTCCTTAAGGCCTAGACTC
2-2-2-3 15436214 CTATGATGC 15436222 GAGTCCTTGGCATCATAGGAGGGACTC
2-2-2-3 18410822 CATGGCTTA 18410830 GGATCCAGCTAAGCCATGCCTGGACTC
2-2-2-3 31131563 TTGTGACAC 31131571 GGATCAACAGTGTCACAAGTCAGACTC
2-2-2-4 22164257 CTGCATTGTA 22164266 GGATCTGGTTACAATGCAGATCTGACTC
2-2-3-0 13784707 GACACGG 13784713 GAGTCACACGACACGGTCAAGACTC
2-2-3-0 33520625 GGAGACC 33520631 GAGTCAGTGGGAGACCCCAAGACTC
2-2-3-4 28815015 TTAACCTTGGG 28815025 GAGTCCCTTCCCAAGGTTAAGGCTGACTC
2-2-3-4 32414641 TTCTTGGGACA 32414651 GAGTCAACGTTCTTGGGACAATTGGACTC
2-2-3-6 5932904 GTTTAGTTTAGCC 5932916 GAGTCTGTGGTTTAGTTTAGCCTTTGGATCC
2-2-3-6 17378613 TGGTTTCAATTGC 17378625 GAGTCAGAGGCAATTGAAACCAGAGTGACTC
2-2-4-1 24532350 AGCGAGTGC 24532358 GAGTCAGTCAGCGAGTGCTTGGGATCC
2-2-4-1 28237615 GCAGGCTGA 28237623 GGATCCCTGTCAGCCTGCTCAAGACTC
2-2-4-2 24146793 GTGAGCACTG 24146802 GAGTCAGCTGTGAGCACTGCTCAGATCC
2-2-5-1 11221991 CGCTGGGGAA 11222000 GAGTCAGTGCGCTGGGGAAGGCGGATCC
2-2-5-4 33655118 GTACTGTGGGCTA 33655130 GAGTCTTGGGTACTGTGGGCTAGTAAGACTC
2-2-5-7 33039205 TCAGTGTCGTGTATGT 33039220 GAGTCTTTATCAGTGTCGTGTATGTTCCAGACTC
2-2-6-1 12966530 AGGCGGAGGCT 12966540 GAGTCATTCAGGCGGAGGCTGTAAGATCC
2-3-1-0 8623105 GACCCA 8623110 GAGTCTGATTGGGTCTGCAGACTC
2-3-1-0 32976276 CCAGCA 32976281 GAGTCCCCACCAGCAAGAAGATCC
2-3-1-1 25479138 TCCAGCA 25479144 GAGTCTAGGTGCTGGATCCAGACTC
2-3-1-3 23935558 TCCATGCAT 23935566 GAGTCAGGTTCCATGCATTCCTGATCC
2-3-1-8 4324439 TTCATTCTTCTGTA 4324452 GGATCCATATACAGAAGAATGAAACTAGACTC
2-3-2-0 12748618 CAGAGCC 12748624 GAGTCTGGTCAGAGCCCAGTGATCC
2-3-2-0 18327068 CCAGAGC 18327074 GAGTCAGAGGCTCTGGACTTGACTC
2-3-2-3 6083987 ACCCGAGTTT 6083996 GAGTCTATGAAACTCGGGTCAATGACTC
2-3-2-6 8878952 TGCATTCTTTGAC 8878964 GAGTCTGCCTGCATTCTTTGACTTATGATCC
2-3-3-1 23435111 CCGCAGATG 23435119 GGATCCTCACATCTGCGGCGATGACTC
2-3-3-1 31671742 CCAGGCATG 31671750 GAGTCTGGGCCAGGCATGGCGGGATCC
2-3-3-4 30387315 CTGGATTCCATG 30387326 GAGTCTTTGCATGGAATCCAGCCTGGACTC
2-3-3-4 31097332 TATGTTGCCCAG 31097343 GAGTCCAGCCTGGGCAACATAGCAAGACTC
2-3-3-8 18319692 CTAATGTCCTTTTTGG 18319707 GAGTCTCAACTAATGTCCTTTTTGGATTTGACTC
2-3-4-0 27248245 GGCCAGGCA 27248253 GAGTCACCATGCCTGGCCCTGAGACTC
2-3-4-1 25770313 AAGCCTGCGG 25770322 GAGTCAGCAAAGCCTGCGGATGGGATCC
2-3-4-1 31957583 CCCAGATGGG 31957592 GAGTCATCCCCCAGATGGGGACAGACTC
2-3-4-3 24894356 CAGAGGTTTCGC 24894367 GGATCTGCAGCGAAACCTCTGTGGTGACTC
2-3-4-5 11647570 GTGTAGAGCCTTCT 11647583 GGATCCCCGAGAAGGCTCTACACCCATGACTC
2-3-4-5 23934318 CTCTGAACTGGGTT 23934331 GAGTCTCAGAACCCAGTTCAGAGTCTGGACTC
2-3-6-0 18915782 GAGGCCGGGCA 18915792 GAGTCAGAAGAGGCCGGGCACAGTGACTC
2-3-6-2 21787129 CAGCGAGCTGGTG 21787141 GAGTCAGGCCAGCGAGCTGGTGAGGAGACTC
2-3-6-4 2368241 GCCTGGTATTCGAGG 2368255 GGATCTGGACCTCGAATACCAGGCTTTTGACTC
2-4-0-0 1207352 CCAACC 1207357 GAGTCCATACCAACCAAGTGATCC
2-4-0-1 19713320 CCTCACA 19713326 GAGTCCTGTCCTCACAGACAGATCC
2-4-0-1 31022101 TCACACC 31022107 GAGTCCAGTGGTGTGATCTCGACTC
2-4-0-4 18201500 ATCTCCTATC 18201509 GGATCCTGAGATAGGAGATGGCAGACTC
2-4-1-0 32790478 CAGACCC 32790484 GAGTCTGCTCAGACCCTCTGGACTC
2-4-1-6 32574843 ATCTGTCCTTACT 32574855 GGATCCTCCAGTAAGGACAGATACGTGACTC
2-4-2-1 25643851 CATCCAGCG 25643859 GAGTCCAGCCGCTGGATGGCCAGACTC
2-4-2-2 3443735 AGATGTCCCC 3443744 GAGTCAGTGAGATGTCCCCACAGGATCC
2-4-3-0 31617757 CCCGGGACA 31617765 GAGTCACGCCCCGGGACACACGGACTC
2-4-3-1 30518999 GGAGCCACTC 30519008 GAGTCCTGGGAGTGGCTCCCCTAGACTC
2-4-3-3 19951433 CCTATGCAGGCT 19951444 GAGTCATCTCCTATGCAGGCTGCCTGACTC
2-4-3-3 32722468 GCCTTCGGAATC 32722479 GAGTCCTCGGCCTTCGGAATCTCCTGACTC
2-4-7-3 33231602 CGAGGTGCAGTGGCCT 33231617 GAGTCCAAACGAGGTGCAGTGGCCTGACTGATCC
2-5-0-0 25071211 CACACCC 25071217 GAGTCCCCCCACACCCTCTGGACTC
2-5-0-5 26992119 CCATTTCCTACT 26992130 GAGTCAGTAAGTAGGAAATGGGTGCGACTC
2-5-1-1 30264305 GACTCCCCA 30264313 GGATCCAGTTGGGGAGTCCTCGGACTC
2-5-1-3 24033358 ATCCCTGCCAT 24033368 GAGTCTTGAATGGCAGGGATGGAGGACTC
2-5-2-3 8407948 ACTCTGGACCTC 8407959 GGATCACTTGAGGTCCAGAGTTGGAGACTC
2-5-2-4 23901817 CAAGCCCGTTTCT 23901829 GAGTCATTGCAAGCCCGTTTCTTAAGGACTC
2-5-4-5 29781742 CCCTTGGTGCCTTAGA 29781757 GAGTCACATTCTAAGGCACCAAGGGTGAGGACTC
2-6-1-3 3162555 ACTCCTGACCTC 3162566 GAGTCTTGAACTCCTGACCTCAGGTGATCC
2-6-2-1 30361545 CCTGGCACCCA 30361555 GAGTCAAGGTGGGTGCCAGGGGCTGACTC
2-6-3-2 21787129 CACCAGCTCGCTG 21787141 GAGTCAGGCCAGCGAGCTGGTGAGGAGACTC
2-6-3-3 23945172 AGGCTGCCCCATTC 23945185 GAGTCTAGAAGGCTGCCCCATTCTCACGACTC
2-6-3-5 12507913 GCTTTCAGCCCCTATG 12507928 GGATCACAGCATAGGGGCTGAAAGCCCTTGACTC
2-6-5-3 30415056 CTCCCTGCAGCAGGTG 30415071 GAGTCCCGCCTCCCTGCAGCAGGTGTGCTGACTC
2-7-1-5 21169634 AGTCCCATCTCTTCC 21169648 GAGTCTGCAAGTCCCATCTCTTCCCCAGGATCC
2-7-2-4 2783952 CTGTCACCTGCTCCA 2783966 GAGTCAGGATGGAGCAGGTGACAGGGGTGACTC
2-8-2-4 33028162 CTGTCCCCCTCCTGAA 33028177 GAGTCAAGGCTGTCCCCCTCCTGAAGTCAGATCC
3-0-0-3 3350206 ATATTA 3350211 GGATCTATATAATATATGAGACTC
3-0-0-5 1575134 TTTTAATA 1575141 GAGTCTCTTTTTTAATAGAATGACTC
3-0-1-2 21517708 GAAATT 21517713 GAGTCTAAGGAAATTATGAGATCC
3-0-1-4 23801449 TTTGAATA 23801456 GAGTCCAGATTTGAATAAGGTGACTC
3-0-2-1 2654630 GAGAAT 2654635 GGATCATTAATTCTCTCAAGACTC
3-0-2-2 12533497 TTGGAAA 12533503 GAGTCAGGATTTCCAAAGAGGACTC
3-0-2-2 14744550 TGAGATA 14744556 GAGTCCATTTGAGATAATGTGATCC
3-0-2-3 19462765 AGATTTAG 19462772 GAGTCATGCCTAAATCTTTCTGACTC
3-0-2-4 3082986 TTATAAGGT 3082994 GGATCCGTGACCTTATAAAAGGGACTC
3-0-3-1 26717503 ATGAGAG 26717509 GAGTCACAGATGAGAGCGGAGATCC
3-0-3-3 10042155 TGGGAATAT 10042163 GAGTCCTTGTGGGAATATGAAAGATCC
3-0-4-0 20808976 AAGGGAG 20808982 GAGTCTGAGAAGGGAGATGGGATCC
3-0-4-1 19720180 GGTGAAAG 19720187 GGATCCACACTTTCACCCATTGACTC
3-0-5-1 18704652 AGGGAGGAT 18704660 GGATCGGATATCCTCCCTCCCTGACTC
3-1-0-2 11180295 ATACAT 11180300 GAGTCCTATATACATTCTAGACTC
3-1-0-2 21654351 TCAAAT 21654356 GAGTCCTATTCAAATTTTTGATCC
3-1-1-1 9986275 TGCAAA 9986280 GAGTCTCTTTGCAAATTGTGACTC
3-1-1-1 13756486 AATCGA 13756491 GAGTCAAGGAATCGAGGATGACTC
3-1-1-1 25929627 GATACA 25929632 GAGTCTACAGATACAAGGAGATCC
3-1-1-2 1437969 ATAGACT 1437975 GAGTCCCATAGTCTATTTCTGACTC
3-1-1-2 5262945 AAAGTCT 5262951 GAGTCCTCTAGACTTTGATTGACTC
3-1-1-4 9912776 CTGAATTAT 9912784 GAGTCCATACTGAATTATTTCTGACTC
3-1-2-1 28155980 GTCAAGA 28155986 GAGTCAGCTGTCAAGAGATAGATCC
3-1-2-2 22498075 ATCAAGGT 22498082 GAGTCCAATATCAAGGTATTGGACTC
3-1-2-5 5331028 GTATATAGTTC 5331038 GAGTCATTGGTATATAGTTCTCAAGATCC
3-1-3-1 23421470 TGGCAAAG 23421477 GGATCCTCTCTTTGCCAAGGGGACTC
3-1-3-2 28691310 GCTTGGAAA 28691318 GGATCTGCCTTTCCAAGCCACTGACTC
3-1-4-0 29987299 GAACGAGG 29987306 GAGTCTCCTCCTCGTTCACTTGACTC
3-1-4-1 28241661 GACTGAGGA 28241669 GAGTCAGTGGACTGAGGAGGGAGATCC
3-1-4-2 9753627 GGCTGTAGAA 9753636 GGATCTGCCTTCTACAGCCCACTGACTC
3-1-4-3 12666387 GTTGCAGTAAG 12666397 GGATCTCGGCTTACTGCAACCTCCGACTC
3-1-4-4 25350996 TCAAGGTTGTGA 25351007 GAGTCAGGCTCAAGGTTGTGAATGTGACTC
3-1-4-6 14333110 GGGATTTTAGTTCA 14333123 GAGTCAGAAGGGATTTTAGTTCATGTAGACTC
3-1-4-7 18897789 TTTTTGTAGATGAGC 18897803 GGATCCCTGGCTCATCTACAAAAAGCCAGACTC
3-1-5-1 2994368 GGCTGGGAAA 2994377 GAGTCAGTGGGCTGGGAAAGGCAGATCC
3-1-5-1 5585769 GACTGGGAGA 5585778 GAGTCAGTGGACTGGGAGAGGCAGACTC
3-1-5-1 14389987 GGCTGGGAAA 14389996 GAGTCAGTGGGCTGGGAAAGGCAGATCC
3-1-5-1 16194477 GGCTGGGAAA 16194486 GAGTCAGTGGGCTGGGAAAGGCAGATCC
3-1-5-2 24033358 ATGGCAGGGAT 24033368 GAGTCTTGAATGGCAGGGATGGAGGACTC
3-1-6-1 2129131 GGAGGAGAGTC 2129141 GAGTCAAATGGAGGAGAGTCCCATGACTC
3-1-9-1 33439792 AGGGGAGACTGGGG 33439805 GAGTCCCCTCCCCAGTCTCCCCTCCCAGACTC
3-2-0-1 21124548 ATCCAA 21124553 GAGTCTTCTTTGGATTGTGGACTC
3-2-0-3 19462765 CTAAATCT 19462772 GAGTCATGCCTAAATCTTTCTGACTC
3-2-1-0 15857752 CCAAAG 15857757 GAGTCCTTACCAAAGTCATGATCC
3-2-1-0 33597444 ACGCAA 33597449 GAGTCAGCCACGCAAGTTAGACTC
3-2-1-0 33597666 ACGCAA 33597671 GAGTCAGCCACGCAAGTTAGACTC
3-2-1-0 33597814 ACGCAA 33597819 GAGTCAGCCACGCAAGTTAGACTC
3-2-1-0 33598036 ACGCAA 33598041 GAGTCAGCCACGCAAGTTAGACTC
3-2-1-0 33598332 ACGCAA 33598337 GAGTCAGCCACGCAAGTTAGACTC
3-2-1-1 8996410 TACCAAG 8996416 GAGTCCCTTCTTGGTATCAAGACTC
3-2-1-1 11364756 CAACTGA 11364762 GAGTCATGCCAACTGAAGCAGACTC
3-2-1-1 22714224 TCCAAGA 22714230 GAGTCAGACTCCAAGAGCTGGATCC
3-2-1-2 12598485 TTGACACA 12598492 GAGTCTAAATTGACACACAAGGACTC
3-2-1-4 25288155 CAAATGCTTT 25288164 GAGTCGTGGCAAATGCTTTTCCAGATCC
3-2-1-5 10431179 ACATTTCTAGT 10431189 GGATCAAAAACTAGAAATGTTAAGGACTC
3-2-2-1 19159639 AAATGGCC 19159646 GGATCTGAGGGCCATTTTTCTGACTC
3-2-2-2 12738618 CCTTAAGGA 12738626 GAGTCGCACTCCTTAAGGCCTAGACTC
3-2-2-2 15436214 GCATCATAG 15436222 GAGTCCTTGGCATCATAGGAGGGACTC
3-2-2-2 19716680 AATGACTCG 19716688 GAGTCTGGCAATGACTCGTCTGGATCC
3-2-2-2 23157997 GTAAGCCAT 23158005 GGATCTCCCATGGCTTACAATGGACTC
3-2-2-3 19474034 ATTCCTGGAA 19474043 GGATCCTGGTTCCAGGAATCCAAGACTC
3-2-2-4 20638088 ACGGATTTACT 20638098 GGATCTTTCAGTAAATCCGTGGCTGACTC
3-2-2-7 33817389 TCTTATGAGTATCT 33817402 GAGTCATTTTCTTATGAGTATCTAACTGACTC
3-2-3-0 19126274 GCAAGACG 19126281 GAGTCTGCAGCAAGACGATCTGACTC
3-2-3-1 29736871 GTCAGAGAC 29736879 GAGTCCTTTGTCTCTGACCCAGGACTC
3-2-3-2 6083987 AAACTCGGGT 6083996 GAGTCTATGAAACTCGGGTCAATGACTC
3-2-3-2 12664695 AACCTGGAGT 12664704 GAGTCACATAACCTGGAGTTAAGGATCC
3-2-3-6 1209487 TGTCTGTGTTCAAA 1209500 GAGTCAGAATGTCTGTGTTCAAATCCTGACTC
3-2-3-6 33773747 TGAGAACTTTGTCT 33773760 GAGTCCTGATGAGAACTTTGTCTCAAGGACTC
3-2-4-0 29473088 GGACAGGCA 29473096 GAGTCGCTGTGCCTGTCCTGCAGACTC
3-2-4-1 1795301 GACTGGCAGA 1795310 GAGTCAGTGGACTGGCAGAGGCAGATCC
3-2-4-4 4718510 GAATGGTTCACTG 4718522 GAGTCAGCAGAATGGTTCACTGTTCTGATCC
3-2-5-5 23758787 GTGTGCTGTGAATAC 23758801 GAGTCGAGTGTATTCACAGCACACAGGGGACTC
3-2-6-1 19789414 GACTGAGGCAGG 19789425 GGATCACCTCCTGCCTCAGTCAACAGACTC
3-2-7-4 32820688 GTTCTGTGGAGGGACA 32820703 GAGTCATTCGTTCTGTGGAGGGACAAGTGGACTC
3-3-1-0 9674013 CCCAAAG 9674019 GAGTCTTAACCCAAAGCACAGATCC
3-3-1-0 31857088 AGACACC 31857094 GAGTCATGTGGTGTCTGTGGGACTC
3-3-1-0 31935633 AGACACC 31935639 GAGTCCCACAGACACCACGTGACTC
3-3-1-2 14978682 AGAATCCTC 14978690 GAGTCTGTGAGAATCCTCTATTGACTC
3-3-1-7 13563376 TAATAGTCCTTTTC 13563389 GAGTCTGTAGAAAAGGACTATTAGAAAGACTC
3-3-2-1 33424071 AATGCAGCC 33424079 GAGTCTGGGAATGCAGCCCTGAGATCC
3-3-2-2 13372656 CATCAGTAGC 13372665 GAGTCCATGCATCAGTAGCTTAGGATCC
3-3-4-2 10808838 ATTCCAGGGCAG 10808849 GGATCCTTCCTGCCCTGGAATATTGGACTC
3-3-4-2 19951433 AGCCTGCATAGG 19951444 GAGTCATCTCCTATGCAGGCTGCCTGACTC
3-3-4-2 22284404 GGGATATCCCAG 22284415 GGATCCCTTCTGGGATATCCCTTCTGACTC
3-3-4-2 32722468 GATTCCGAAGGC 32722479 GAGTCCTCGGCCTTCGGAATCTCCTGACTC
3-3-5-3 2625867 CGTTCCAGGGAATG 2625880 GGATCTTCACATTCCCTGGAACGTGCAGACTC
3-3-5-4 5024130 GGCCTGGATTTGAAC 5024144 GGATCTAGGGTTCAAATCCAGGCCACTAGACTC
3-3-6-2 23945172 GAATGGGGCAGCCT 23945185 GAGTCTAGAAGGCTGCCCCATTCTCACGACTC
3-3-6-3 24022606 GCGCGTGGATTCAGA 24022620 GAGTCTCCAGCGCGTGGATTCAGATCAGGACTC
3-3-7-1 14651608 GCCTGAGGGCAGGA 14651621 GAGTCTGATGCCTGAGGGCAGGAAGCTGATCC
3-4-1-1 16409217 CCAAATGCC 16409225 GGATCACAAGGCATTTGGCCAAGACTC
3-4-1-3 8109948 TCAACATGCCT 8109958 GGATCTGGGAGGCATGTTGACCAAGACTC
3-4-1-5 20735471 TTTGAACCCATCT 20735483 GAGTCAGACAGATGGGTTCAAATCCTGACTC
3-4-1-6 5084130 ATTCTCATGTCACT 5084143 GAGTCTCAAAGTGACATGAGAATTTGTGACTC
3-4-2-0 22138172 AGCACCAGC 22138180 GAGTCAGTCAGCACCAGCTCTGGATCC
3-4-2-3 18763327 AAGCCTTCCTAG 18763338 GAGTCACCCAAGCCTTCCTAGATTTGATCC
3-4-3-1 30276848 AAGGCCTGACC 30276858 GAGTCAGTCAAGGCCTGACCCACAGATCC
3-4-4-1 12281311 CAGCTCAGCAGG 12281322 GAGTCTGCACAGCTCAGCAGGTAGTGACTC
3-4-4-5 2970865 CTTGCCAAGGTCTTGA 2970880 GGATCTCTCTCAAGACCTTGGCAAGGCCAGACTC
3-4-5-4 27256725 CTGGTTATCAGGGCCA 27256740 GGATCACTTTGGCCCTGATAACCAGCGAGGACTC
3-4-6-1 12170409 GTGCAGGGGCACCA 12170422 GAGTCGAAATGGTGCCCCTGCACTCCAGACTC
3-5-1-1 14190105 GCCACTAACC 14190114 GAGTCAGAAGCCACTAACCAGATGATCC
3-5-3-4 25320211 AGCCATTGTGCACTC 25320225 GAGTCTGCTAGCCATTGTGCACTCCCCTGACTC
3-5-6-2 30415056 CACCTGCTGCAGGGAG 30415071 GAGTCCCGCCTCCCTGCAGCAGGTGTGCTGACTC
3-6-1-1 14180710 CTAAGCCCCCA 14180720 GAGTCAAACTGGGGGCTTAGAAATGACTC
3-6-2-2 32111018 GGACTCCACTACC 32111030 GGATCAAGTGGTAGTGGAGTCCCAGGGACTC
3-6-2-4 13538285 GTCACCTCCCAGATT 13538299 GAGTCCAAAAATCTGGGAGGTGACCATCGACTC
3-6-3-3 24022606 TCTGAATCCACGCGC 24022620 GAGTCTCCAGCGCGTGGATTCAGATCAGGACTC
3-6-3-4 7083234 CTTTCCCCCATGGAGA 7083249 GAGTCTTTGTCTCCATGGGGGAAAGGAGAGACTC
3-6-3-4 19756052 GTCCCTAGACAGCTCT 19756067 GAGTCTGGAGTCCCTAGACAGCTCTGAAAGACTC
3-6-4-3 31232261 TCCCTGCTCAGGCAAG 31232276 GAGTCCTCCTCCCTGCTCAGGCAAGAGAGGATCC
3-6-4-3 33866318 GGCCATCTCAGGATCC 33866333 GGATCTGGGGGATCCTGAGATGGCCCGAGGACTC
3-6-5-1 32610836 TGGACAGGGCCACCC 32610850 GAGTCACTGGGGTGGCCCTGTCCAGTCTGACTC
3-7-0-5 29788190 TCATTCTCCACCCAT 29788204 GAGTCTTTGTCATTCTCCACCCATGTCTGACTC
3-7-2-4 21390645 TCCTGAATCTGCCACC 21390660 GAGTCAAGGTCCTGAATCTGCCACCAGTAGACTC
3-8-1-2 20750485 TGCCACACTCCCAC 20750498 GAGTCCCACGTGGGAGTGTGGCATTGGGACTC
4-0-0-2 9427183 ATAAAT 9427188 GAGTCAAGTATAAATCCAGGACTC
4-0-0-4 31759464 TAAATTTA 31759471 GAGTCTCAATAAATTTAAAAGGATCC
4-0-1-8 27892202 TTTATTTTGTAAA 27892214 GAGTCCCAGTTTATTTTGTAAAATTGGACTC
4-0-2-0 12632883 AGAAGA 12632888 GAGTCAGAGTCTTCTTTCTGACTC
4-0-2-0 14277246 GAAAGA 14277251 GAGTCAACTGAAAGAGATCGACTC
4-0-2-1 29472882 ATAAAGG 29472888 GGATCTTTTCCTTTATCAACGACTC
4-0-3-4 28480979 TTAAAATGTGG 28480989 GAGTCATGATTAAAATGTGGGTGAGATCC
4-0-4-0 2022002 GGAGAAGA 2022009 GAGTCTAATTCTTCTCCTGTTGACTC
4-0-4-1 8297236 GGAATAGGA 8297244 GAGTCATCGGGAATAGGAATAGGATCC
4-1-0-3 19478130 AACTTTAA 19478137 GAGTCAGTAAACTTTAAGATAGATCC
4-1-0-3 23801449 TATTCAAA 23801456 GAGTCCAGATTTGAATAAGGTGACTC
4-1-1-0 3885471 CAAAGA 3885476 GAGTCCAGCCAAAGATTATGACTC
4-1-1-0 8035824 AAACAG 8035829 GAGTCCCAACTGTTTAGAGGACTC
4-1-1-2 7980226 GTTCAAAA 7980233 GAGTCAAGGTTTTGAACTCAAGACTC
4-1-1-3 9912776 ATAATTCAG 9912784 GAGTCCATACTGAATTATTTCTGACTC
4-1-2-1 7507460 AGAACTAG 7507467 GGATCTTATCTAGTTCTGTCTGACTC
4-1-2-1 21371940 AGCAAAGT 21371947 GGATCAAAGACTTTGCTGTAAGACTC
4-1-2-2 22683181 TAGGCAATA 22683189 GAGTCACTGTATTGCCTAAGGAGACTC
4-1-3-0 25895856 AACGAGGA 25895863 GAGTCCTACAACGAGGATAATGATCC
4-1-3-5 16036924 AAGTTGGACATTT 16036936 GAGTCAGAAAAATGTCCAACTTCAAAGACTC
4-1-3-6 30918934 AGATACTGTTTATG 30918947 GGATCTGATCATAAACAGTATCTGAAAGACTC
4-1-4-1 3000479 GACTGGGAAA 3000488 GAGTCAGTGGACTGGGAAAGGCAGACTC
4-1-4-1 6605602 GACTGGGAAA 6605611 GAGTCTGCCTTTCCCAGTCCACTGACTC
4-1-4-1 11453935 AACTGGGAGA 11453944 GAGTCAGTGAACTGGGAGAGGCAGATCC
4-1-4-1 16808362 GACTGAGAGA 16808371 GAGTCTGTCTCTCTCAGTCCACTGACTC
4-1-4-1 33643779 GACTGGGAAA 33643788 GGATCTGCCTTTCCCAGTCCATTGACTC
4-1-4-2 16149438 GACTTGGGAAA 16149448 GAGTCAGTAGACTTGGGAAAGGGAGATCC
4-1-4-3 23042578 TGATAAGACTGG 23042589 GAGTCGTTCTGATAAGACTGGGGAAGATCC
4-2-0-0 21342426 CCAAAA 21342431 GAGTCCCGATTTTGGGCAAGACTC
4-2-0-0 33597518 ACACAA 33597523 GAGTCAGCCACACAAGTTAGACTC
4-2-0-0 33598110 ACACAA 33598115 GAGTCAGCCACACAAGTTAGACTC
4-2-1-1 19277839 CCAAGAAT 19277846 GAGTCTTGGATTCTTGGATAGGACTC
4-2-1-2 9867805 GATCTAAAC 9867813 GAGTCAAAGGATCTAAACCTATGACTC
4-2-1-5 25037009 GATTCTCATTAA 25037020 GAGTCCAGTGATTCTCATTAATTCAGATCC
4-2-1-6 25734365 TTTAGCTAATACT 25734377 GAGTCACATTTTAGCTAATACTTTCAGATCC
4-2-2-3 17132832 AGCATACTAGT 17132842 GGATCTTCCACTAGTATGCTATATGACTC
4-2-2-6 17159417 AATGTGTTATACTC 17159430 GAGTCAGAGAATGTGTTATACTCCGTGGATCC
4-2-2-8 17813774 GTTTTAGACTATACTT 17813789 GAGTCTGCAAAGTATAGTCTAAAACCTGGGACTC
4-2-3-4 27582047 ATCTGAAGTGTCA 27582059 GAGTCAGAGATCTGAAGTGTCAGAAGGACTC
4-2-3-5 7254161 TCACTGTGTTGAAA 7254174 GGATCCTGTTTTCAACACAGTGACTGGGACTC
4-2-4-5 15860691 AGCCTTAGAAGGTTT 15860705 GGATCTCTCAAACCTTCTAAGGCTGAGAGACTC
4-2-5-2 23901817 AGAAACGGGCTTG 23901829 GAGTCATTGCAAGCCCGTTTCTTAAGGACTC
4-2-6-3 13538285 AATCTGGGAGGTGAC 13538299 GAGTCCAAAAATCTGGGAGGTGACCATCGACTC
4-2-7-2 2783952 TGGAGCAGGTGACAG 2783966 GAGTCAGGATGGAGCAGGTGACAGGGGTGACTC
4-2-7-3 21390645 GGTGGCAGATTCAGGA 21390660 GAGTCAAGGTCCTGAATCTGCCACCAGTAGACTC
4-3-0-1 8622993 CAAAACTC 8623000 GAGTCAAGGCAAAACTCCTGAGACTC
4-3-0-4 22096783 TAAATATCCCT 22096793 GAGTCCCTCTAAATATCCCTTATAGATCC
4-3-1-5 3038558 ATCATCTTAACTG 3038570 GAGTCAAAAATCATCTTAACTGCTAAGACTC
4-3-1-8 7031187 AAACGTCTTTTCTTTA 7031202 GAGTCAATTAAACGTCTTTTCTTTATTATGACTC
4-3-2-1 32905734 TCCAGAGCAA 32905743 GAGTCCGTGTCCAGAGCAAGGGTGACTC
4-3-2-2 28815015 CCCAAGGTTAA 28815025 GAGTCCCTTCCCAAGGTTAAGGCTGACTC
4-3-2-2 32414641 TGTCCCAAGAA 32414651 GAGTCAACGTTCTTGGGACAATTGGACTC
4-3-2-3 6194815 CACAGATGTATC 6194826 GGATCTGTTGATACATCTGTGCTTTGACTC
4-3-2-3 19520077 TCTACAAAGGTC 19520088 GAGTCTTGGTCTACAAAGGTCTAAGGATCC
4-3-2-4 27582047 TGACACTTCAGAT 27582059 GAGTCAGAGATCTGAAGTGTCAGAAGGACTC
4-3-2-6 25660391 CGATTTCTAATTCAG 25660405 GAGTCCCATCTGAATTAGAAATCGGGGAGACTC
4-3-3-2 23926326 TCCAGAAGGATC 23926337 GAGTCTTCATCCAGAAGGATCAGGTGATCC
4-3-3-2 30387315 CATGGAATCCAG 30387326 GAGTCTTTGCATGGAATCCAGCCTGGACTC
4-3-3-2 31097332 CTGGGCAACATA 31097343 GAGTCCAGCCTGGGCAACATAGCAAGACTC
4-3-4-0 31292411 CAGGCAAGGCA 31292421 GAGTCTCTCTGCCTTGCCTGTCCCGACTC
4-3-4-4 33917401 ACTGACGGGCATTTA 33917415 GAGTCTACCACTGACGGGCATTTAGGTTGACTC
4-3-5-3 25320211 GAGTGCACAATGGCT 25320225 GAGTCTGCTAGCCATTGTGCACTCCCCTGACTC
4-3-6-3 7083234 TCTCCATGGGGGAAAG 7083249 GAGTCTTTGTCTCCATGGGGGAAAGGAGAGACTC
4-3-6-3 19756052 AGAGCTGTCTAGGGAC 19756067 GAGTCTGGAGTCCCTAGACAGCTCTGAAAGACTC
4-4-0-7 25875523 TTAACTCTTTCCATA 25875537 GAGTCATCTTTAACTCTTTCCATACCCAGACTC
4-4-1-1 27068556 GACACTAACC 27068565 GGATCATCTGGTTAGTGTCTTCTGACTC
4-4-1-1 30620951 CAGCCACAAT 30620960 GAGTCCCACATTGTGGCTGGGGAGACTC
4-4-1-2 33287291 ACCTTCAGAAC 33287301 GGATCCCACGTTCTGAAGGTCCCCGACTC
4-4-1-3 25350996 TCACAACCTTGA 25351007 GAGTCAGGCTCAAGGTTGTGAATGTGACTC
4-4-3-3 15850177 GACTTAGCCAGATC 15850190 GAGTCATTGGACTTAGCCAGATCTACTGATCC
4-4-3-4 33917401 TAAATGCCCGTCAGT 33917415 GAGTCTACCACTGACGGGCATTTAGGTTGACTC
4-4-3-5 273798 ACAGTTGACTGTCTCA 273813 GAGTCATACACAGTTGACTGTCTCAGTGTGACTC
4-4-5-3 29836317 AGAACTGTGGGTACCC 29836332 GAGTCAGGTAGAACTGTGGGTACCCTCCGGATCC
4-5-1-2 22052058 TACACAGCTCAC 22052069 GAGTCCACAGTGAGCTGTGTAGATGGACTC
4-5-1-5 7028188 AAACTTTGCCATCTC 7028202 GAGTCATTGGAGATGGCAAAGTTTTGATGACTC
4-5-2-2 33655118 TAGCCCACAGTAC 33655130 GAGTCTTGGGTACTGTGGGCTAGTAAGACTC
4-5-3-1 10176900 ACAGAGAGCCCCT 10176912 GGATCCTCCAGGGGCTCTCTGTAGAAGACTC
4-5-3-1 21002027 TGCCCAGCCAAGA 21002039 GAGTCGATGTCTTGGCTGGGCACAGTGACTC
4-5-3-3 21926110 CTCCACAGATGTCAG 21926124 GGATCAGTGCTGACATCTGTGGAGAGTGGACTC
4-6-1-5 18855810 CTATTAACCCCTGTAC 18855825 GAGTCCATGCTATTAACCCCTGTACCAGGGACTC
4-6-2-1 33917713 CACTCCAGCCAGA 33917725 GAGTCTCGTTCTGGCTGGAGTGCAGTGACTC
4-6-2-3 23841885 ACAGTCAGCACTTCC 23841899 GGATCTCCAGGAAGTGCTGACTGTTAATGACTC
4-7-2-3 32820688 TGTCCCTCCACAGAAC 32820703 GAGTCATTCGTTCTGTGGAGGGACAAGTGGACTC
4-8-1-2 31630487 CCAACCCACTCTGAC 31630501 GGATCTGGGGTCAGAGTGGGTTGGAGGAGACTC
4-8-1-3 18471550 CTCATCAAGCCCATCC 18471565 GGATCCTCTGGATGGGCTTGATGAGGTACGACTC
5-0-0-3 1575134 TATTAAAA 1575141 GAGTCTCTTTTTTAATAGAATGACTC
5-0-1-2 619226 ATAAATGA 619233 GAGTCAGATATAAATGAATCAGACTC
5-0-1-2 15224385 ATAAATGA 15224392 GAGTCAGATATAAATGAGTCAGACTC
5-0-2-0 12773697 AAAGAGA 12773703 GAGTCATACAAAGAGACTTAGACTC
5-0-5-2 26992119 AGTAGGAAATGG 26992130 GAGTCAGTAAGTAGGAAATGGGTGCGACTC
5-0-7-3 29788190 ATGGGTGGAGAATGA 29788204 GAGTCTTTGTCATTCTCCACCCATGTCTGACTC
5-1-1-5 24998727 TATAATTTGCAA 24998738 GGATCCTTTTTGCAAATTATATACTGACTC
5-1-1-6 23902556 ACTTTGATTTAAA 23902568 GAGTCTGAAACTTTGATTTAAACCTTGACTC
5-1-3-2 25725309 TCAGAATGAGA 25725319 GGATCAGCTTCTCATTCTGAATGAGACTC
5-1-3-4 3038558 CAGTTAAGATGAT 3038570 GAGTCAAAAATCATCTTAACTGCTAAGACTC
5-1-4-3 20735471 AGATGGGTTCAAA 20735483 GAGTCAGACAGATGGGTTCAAATCCTGACTC
5-1-5-4 7028188 GAGATGGCAAAGTTT 7028202 GAGTCATTGGAGATGGCAAAGTTTTGATGACTC
5-1-6-4 18855810 GTACAGGGGTTAATAG 18855825 GAGTCCATGCTATTAACCCCTGTACCAGGGACTC
5-1-7-2 27407459 GTGCTGAAAGGAGGA 27407473 GAGTCCCAAGTGCTGAAAGGAGGATGCAGATCC
5-2-1-0 15651283 ACAAACAG 15651290 GAGTCCTGAACAAACAGACATGACTC
5-2-1-0 28972460 AGCAAACA 28972467 GAGTCCTTGAGCAAACATCTCGATCC
5-2-2-2 27918310 CTAAGGTCAAA 27918320 GGATCTTACTTTGACCTTAGCCAGGACTC
5-2-3-6 14431981 TCTTATCTTGAGAAGA 14431996 GAGTCTGTATCTTATCTTGAGAAGAGTGGGATCC
5-2-3-6 19178433 TAATATTGCATGAGCT 19178448 GGATCCAGGAGCTCATGCAATATTACCAGGACTC
5-2-5-4 2970818 AAAGCATGTTGGTGCA 2970833 GAGTCATCCAAAGCATGTTGGTGCAACAGGATCC
5-2-5-4 22527477 AATCTGAGAATGGCTG 22527492 GAGTCATTGAATCTGAGAATGGCTGTGGGGATCC
5-3-1-3 15850371 CATAGAATTACC 15850382 GAGTCTCCACATAGAATTACCACATGATCC
5-3-1-4 16036924 AAATGTCCAACTT 16036936 GAGTCAGAAAAATGTCCAACTTCAAAGACTC
5-3-2-2 23315455 CAGTTAGACACA 23315466 GAGTCCCAGCAGTTAGACACATATTGATCC
5-3-2-4 24952095 TTCCATAAAGCATG 24952108 GGATCCCACCATGCTTTATGGAAAAATGACTC
5-3-2-6 18216247 AAGATGTATCATCCTT 18216262 GAGTCTCCCAAGATGTATCATCCTTCCCTGATCC
5-3-3-1 19850611 AAAGAAGTGCCC 19850622 GAGTCTTGGAAAGAAGTGCCCACAAGATCC
5-3-3-2 30711102 TTGAAAGCCAAGC 30711114 GAGTCCCCTTTGAAAGCCAAGCCTTTGATCC
5-3-4-4 273798 TGAGACAGTCAACTGT 273813 GAGTCATACACAGTTGACTGTCTCAGTGTGACTC
5-3-5-3 11255290 TCCAAGATGGGAACTG 11255305 GGATCCTGACAGTTCCCATCTTGGATGCTGACTC
5-4-2-0 205051 CACAAGGCCAA 205061 GAGTCAGGCCACAAGGCCAAGTCTGACTC
5-4-3-2 23934318 AACCCAGTTCAGAG 23934331 GAGTCTCAGAACCCAGTTCAGAGTCTGGACTC
5-4-5-2 29781742 TCTAAGGCACCAAGGG 29781757 GAGTCACATTCTAAGGCACCAAGGGTGAGGACTC
5-5-2-3 23758787 GTATTCACAGCACAC 23758801 GAGTCGAGTGTATTCACAGCACACAGGGGACTC
5-6-2-1 245423 CCACAATCGGCACA 245436 GGATCAGAGTGTGCCGATTGTGGATGAGACTC
6-0-2-2 18038803 TATGAAGAAA 18038812 GAGTCAGAGTATGAAGAAAGATTGACTC
6-1-1-5 23902556 TTTAAATCAAAGT 23902568 GAGTCTGAAACTTTGATTTAAACCTTGACTC
6-1-2-3 3952582 TAAAGTCAAGTA 3952593 GGATCTTAATACTTGACTTTAGCCAGACTC
6-1-4-3 5084130 AGTGACATGAGAAT 5084143 GAGTCTCAAAGTGACATGAGAATTTGTGACTC
6-1-5-2 30028652 GAGTAAGAAGCAGT 30028665 GAGTCCCAGGAGTAAGAAGCAGTCAAGGATCC
6-1-8-1 13897829 GGGGAAGAGAGGCTAA 13897844 GGATCTAAATTAGCCTCTCTTCCCCATTAGACTC
6-2-1-3 9172896 TTGTACAACAAA 9172907 GGATCACGTTTTGTTGTACAAGTCAGACTC
6-2-3-4 25660391 CTGAATTAGAAATCG 25660405 GAGTCCCATCTGAATTAGAAATCGGGGAGACTC
6-2-3-4 29780020 GATCAAAGACTTTGA 29780034 GAGTCGTGGGATCAAAGACTTTGAAAACGATCC
6-2-5-1 14980846 CAGGAAGGAACGAT 14980859 GAGTCCATTCAGGAAGGAACGATATGTGACTC
6-3-1-5 1932974 AGTAAACTCACTTAT 1932988 GAGTCCATAAGTAAACTCACTTATTTATGATCC
6-3-2-2 17378613 GCAATTGAAACCA 17378625 GAGTCAGAGGCAATTGAAACCAGAGTGACTC
6-3-2-3 1209487 TTTGAACACAGACA 1209500 GAGTCAGAATGTCTGTGTTCAAATCCTGACTC
6-3-2-3 33773747 AGACAAAGTTCTCA 33773760 GAGTCCTGATGAGAACTTTGTCTCAAGGACTC
6-3-3-1 18506656 AGAGAACACTCAG 18506668 GAGTCTTACAGAGAACACTCAGCATAGACTC
6-3-3-2 33877374 GTAGACAGCATACA 33877387 GGATCTAACTGTATGCTGTCTACAAGAGACTC
6-3-4-0 2713672 AAGGAGCACGCAA 2713684 GAGTCTCAAAAGGAGCACGCAACCTAGATCC
6-3-4-1 28721600 CAGGAGACACTAGA 28721613 GAGTCTGTTCAGGAGACACTAGAGAGGGACTC
6-4-1-3 14333110 TGAACTAAAATCCC 14333123 GAGTCAGAAGGGATTTTAGTTCATGTAGACTC
6-4-5-0 21631191 AGGGGCCCAAACAGA 21631205 GAGTCATTCAGGGGCCCAAACAGATGGAGACTC
6-5-2-1 33752323 CAGACCCTAAAAGC 33752336 GAGTCAGCACAGACCCTAAAAGCTAAGGACTC
6-6-0-2 1506217 ACTCAATACACCAC 1506230 GAGTCAACAGTGGTGTATTGAGTGTATGACTC
7-0-2-4 925974 TAAAAAGATATTG 925986 GAGTCAACATAAAAAGATATTGAAGAGATCC
7-0-4-4 25875523 TATGGAAAGAGTTAA 25875537 GAGTCATCTTTAACTCTTTCCATACCCAGACTC
7-1-0-3 21712574 AAAAATATTCA 21712584 GAGTCTCAAAAAAATATTCATATAGATCC
7-1-1-2 11947722 ATAATAGACAA 11947732 GAGTCTGGAATAATAGACAATGGAGACTC
7-1-3-3 13563376 GAAAAGGACTATTA 13563389 GAGTCTGTAGAAAAGGACTATTAGAAAGACTC
7-1-3-4 1572344 AAAATAGTCTGTGAA 1572358 GAGTCTGATAAAATAGTCTGTGAAGCAAGATCC
7-1-4-4 22768695 GCTTAAAGAGTTGAAA 22768710 GAGTCATTGGCTTAAAGAGTTGAAAAATAGATCC
7-1-5-1 27319856 GAAGGAACAAGGAT 27319869 GAGTCTGAGGAAGGAACAAGGATGCGGGATCC
7-2-2-3 33817389 AGATACTCATAAGA 33817402 GAGTCATTTTCTTATGAGTATCTAACTGACTC
7-3-1-1 10734587 ATAAACAGCACA 10734598 GGATCAGAATGTGCTGTTTATAATGGACTC
7-4-1-3 2663787 AAATTACACTCAGCA 2663801 GGATCCCTGTGCTGAGTGTAATTTCTATGACTC
7-4-4-1 20634452 GGCAAAGTACAAGCCA 20634467 GAGTCAGCCGGCAAAGTACAAGCCATGCTGACTC
7-5-2-2 33039205 ACATACACGACACTGA 33039220 GAGTCTTTATCAGTGTCGTGTATGTTCCAGACTC
8-1-0-4 27892202 TTTACAAAATAAA 27892214 GAGTCCCAGTTTATTTTGTAAAATTGGACTC
8-1-3-2 24503746 GACAAGAAGAATT 24503759 GAGTCTAAGGACAAGAAGAATTAGATAGATCC
8-1-3-4 7031187 TAAAGAAAAGACGTTT 7031202 GAGTCAATTAAACGTCTTTTCTTTATTATGACTC
8-1-4-0 13199929 AAAGAGAGAACGA 13199941 GAGTCTTATAAAGAGAGAACGAATTAGATCC
8-1-5-2 22394971 ATGGAGAAAACAAGGT 22394986 GGATCATTTACCTTGTTTTCTCCATGCGGGACTC
8-2-2-4 17813774 AAGTATAGTCTAAAAC 17813789 GAGTCTGCAAAGTATAGTCTAAAACCTGGGACTC
8-3-3-2 18319692 CCAAAAAGGACATTAG 18319707 GAGTCTCAACTAATGTCCTTTTTGGATTTGACTC
9-2-3-1 5408125 AGTCAAAGACAGAAA 5408139 GAGTCTCTGTTTCTGTCTTTGACTCTCTGACTC
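Reading the rows above, the signature appears to be the base composition of the product, given as counts of A, C, G and T in that order, and the product appears to be the stretch of the listed sequence lying nine nucleotides in from each end (a five-nucleotide recognition site plus a four-nucleotide spacer), read on either strand. The following is a minimal sketch of that reading, not a definitive implementation; the function names and the nine-nucleotide flank are assumptions inferred from the rows themselves.

```python
# Sketch: re-derive the "Sig." column and check a table row for consistency.
# Assumption (inferred from the rows, not stated here): signature = A-C-G-T counts,
# and the product is the listed sequence minus 9 nt on each side, on either strand.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def signature(product: str) -> str:
    """Base composition as 'A-C-G-T' counts, e.g. 'CAGACCC' -> '2-4-1-0'."""
    return "-".join(str(product.count(base)) for base in "ACGT")


def core(sequence: str, flank: int = 9) -> str:
    """Strip the assumed 5-nt site plus 4-nt spacer from both ends."""
    return sequence[flank:-flank]


def row_is_consistent(sig: str, start: int, product: str, end: int, sequence: str) -> bool:
    """Check signature, length (End - Start + 1) and placement of the product."""
    middle = core(sequence)
    return (
        signature(product) == sig
        and end - start + 1 == len(product)
        and product in (middle, reverse_complement(middle))
    )


if __name__ == "__main__":
    print(row_is_consistent("2-4-1-0", 32790478, "CAGACCC", 32790484,
                            "GAGTCTGCTCAGACCCTCTGGACTC"))
```

Because oligonucleotides with the same base composition have the same mass, a check of this kind groups products the way a mass measurement would.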
Table 6. Probability of overlap of any given fragment of specified length with an N.BstNB I-derived oligonucleotide created from a genome the size of the human genome
Table 7. Number of N.BstNB I-derived oligonucleotides per organism
EXAMPLE 2
GENERATION OF OLIGONUCLEOTIDE-BASED FINGERPRINTS FROM WHOLE GENOMIC DNA
This example shows oligonucleotide-based fingerprints generated from whole genomic DNA and characterization thereof.
Templates Designed for Detecting E. coli O157 and K12
The following templates were designed to detect E. coli O157 and K12. The first oligonucleotide in each set is the one that would be generated/amplified from the whole genome of E. coli O157 or K12 in the presence of nicking enzymes N.BstNB I and N.Alw I and a DNA polymerase. The next two oligonucleotides are two template oligonucleotides useful in further amplification of the oligonucleotide generated from the whole genome. The sequence of the sense strand of the recognition sequence of N.BstNB I (GAGTC) is underlined.
O157-392522 5'-TGGCAATATCC
O157-1S: 5'-TGAAAACCGAAAGAGTCAATTGGATATTGCCA
O157-1SP: 5'-TGAAAACCGAAAGAGTCAATTTGGCAATATCC
O157-1660620 5'-GATATTATTGG
O157-2S: 5'-TGAAAACCGAAAGAGTCAATTCCAATAATATC
O157-2SP: 5'-TGAAAACCGAAAGAGTCAATTGATATTATTGG
O157-1856922 5'-GGCATTACGCGCGG
O157-3S: 5'-TGAAAACCGAAAGAGTCAATTCCGCCGCGCGTAAT
O157-3SP: 5'-TGAAAACCGAAAGAGTCAATTGGCATTACGCGCGG
O157-2268011 5'-TTTAACGCCAGTTC
O157-4S: 5'-TGAAAACCGAAAGAGTCAATTGAACTGGCGTTAAA
O157-4SP: 5'-TGAAAACCGAAAGAGTCAATTTTTAACGCCAGTTC
K12-198629 5'-TTGAAACGGGCATAA
K12-1S: 5'-TGAAAACCGAAAGAGTCAATTTTATGCCCGTTCAA
K12-1SP: 5'-TGAAAACCGAAAGAGTCAATTTTGAACGGGCATAA
K12-312061 5'-GGATTGAGTGTTG
K12-2S: 5'-TGAAAACCGAAAGAGTCAATTCAACACTCAATCC
K12-2SP: 5'-TGAAAACCGAAAGAGTCAATTGGATTGAGTGTTG
K12-558515 5'-TTCTGGATGAATGTTA
K12-3S: 5'-TGAAAACCGAAAGAGTCAATTTAACATTCATC
K12-3SP: 5'-TGAAAACCGAAAGAGTCAATTGATGAATGTTA
K12-1093455 5'-TAAAGCGGCA
K12-4S: 5'-TGAAAACCGAAAGAGTCAATTTGCCGCTTTA
K12-4SP: 5'-TGAAAACCGAAAGAGTCAATTTAAAGCGGCA
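In the O157-1 and O157-2 sets as listed, the S template reads as a shared 21-nucleotide leader (which contains GAGTC) followed by the reverse complement of the genome-derived oligonucleotide, and the SP template as the same leader followed by the oligonucleotide itself. The sketch below assembles a template pair under that assumption; the pattern is inferred from those two sets rather than stated in the text, and the names `LEADER` and `template_pair` are illustrative only.

```python
# Sketch: build an S / SP template pair from a genome-derived oligonucleotide.
# Assumption: S = leader + reverse complement of target, SP = leader + target,
# as read off the O157-1 and O157-2 sets listed above.

LEADER = "TGAAAACCGAAAGAGTCAATT"  # shared 5' portion of the listed templates
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def template_pair(target: str) -> tuple[str, str]:
    """Return (S, SP) templates for a target oligonucleotide."""
    return LEADER + reverse_complement(target), LEADER + target


if __name__ == "__main__":
    s_template, sp_template = template_pair("TGGCAATATCC")  # O157-392522
    print("S: ", s_template)
    print("SP:", sp_template)
```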
Materials
Oligonucleotides were synthesized by Midland Certified Reagent Company, Inc. (Midland, TX), or MWG Biotech, Inc. (High Point, NC). The oligonucleotides were routinely checked by time-of-flight mass spectrometry (using LCT from Micromass, Manchester, UK).
All enzymes were purchased from New England Biolabs. The DNA polymerase used was exo- Vent. The nicking enzyme (N.BstNB I) has a specific activity of approximately 10^6 units/mg. All HPLC components (water and acetonitrile) were purchased from Fisher Scientific (Pittsburgh, PA). The dimethyl-butylamine was purchased from Sigma-Aldrich Corp. (St. Louis, MO) and a salt was made by the addition of acetic acid (Sigma-Aldrich) to pH 8.4. The 2 molar stock solution was filtered using a 0.2 micron nylon filter.
K12 and O157 DNAs were obtained from the Coriell Institute. Purified E. coli DNA was obtained from Sigma and human liver mRNA was obtained from Ambion (Austin, TX). Nicked E. coli DNA was prepared by placing 25 micrograms of DNA in 200 microliters of 1x N.BstNB I buffer (New England Biolabs) and 10 ul of N.BstNB I nicking enzyme (20 units). The solution was incubated at 55°C for 60 minutes, and then heated to 95°C for 5 minutes to inactivate the nicking enzyme and denature the E. coli DNA.
DNA isolation from prokaryotic sources.
Isolation and quantitation of genomic DNA from gram-negative and spirochete bacteria were also achieved as follows. Cells were pelleted and washed twice in 1 ml of 1 M NaCl by centrifugation in a fixed angle microfuge at 15,000 rpm for 5 min. Cells were washed twice and resuspended in TE (10 mM Tris, 25 mM EDTA, pH 8.0) and incubated in 0.2 mg/ml lysozyme and 0.3 mg/ml RNase A for 20 min at 37°C. If lysis by lysozyme was not visible with refractory pathogenic strains, 0.6% SDS was added. To these suspensions, 1% Sarkosyl and 0.6 mg/ml proteinase K were added, and the cells were incubated for 1 hr at 37°C. Cell lysates were extracted twice with phenol and twice with chloroform. The aqueous phase was precipitated with 0.33M NH4 acetate and 2.5 volumes of ethanol. Precipitated threads of DNA were removed with a sterile Pasteur pipette tip, and dissolved in TE (10 mM Tris, 1 mM EDTA, pH 8.0).
Isolation of genomic DNA from Gram-positive bacteria was also performed as follows. Concentrated cell pellets were washed twice in 1 M NaCl and twice in TE (50 mM Tris, 50 mM EDTA, pH 7.8) and spun in a fixed-angle microfuge for 5 min. Cell pellets were resuspended in TE and incubated with 250 U/ml mutanolysin and 0.3 mg/ml RNase A for 30 min at 37°C. To this reaction, 0.6% SDS and 0.6 mg/ml proteinase K were added, and the mixture was incubated for 1 hr at 37°C, followed by 65°C for 45 min. Lysates were extracted twice with phenol and twice with chloroform. The aqueous phase was precipitated with 0.33M NH4 acetate and 2.5 volumes of ethanol. Precipitated threads of DNA were removed with a sterile Pasteur pipette tip, and dissolved in TE (10 mM Tris, 1 mM EDTA, pH 8.0). In both instances the genomic DNA was quantitated by spectrofluorimetry at excitation and emission wavelengths of 365 nm and 460 nm respectively using the DNA-specific dye, Hoechst 33258 and a fluorometer.
Generating Oligonucleotides from Genomic DNA
The reaction conditions used to generate oligonucleotides from genomic DNA are: 85 mM KCl, 25 mM Tris-HCl (pH 8.8 at 25°C), 2.0 mM MgSO4, 5 mM MgCl2, 10 mM (NH4)2SO4, 0.1% Triton X-100, 0.5 mM DTT, 0.8 units N.BstNB I nicking enzyme (NEB), 0.05 units 9°Nm™ polymerase (NEB), 200 micromolar dNTPs, 10 micrograms/ml BSA (NEB), and 0.001 to 500 ng of genomic DNA in ultra-pure water that is nuclease free (Ambion, Austin, TX). Incubation is at 60°C for the time as specified.
Exponential Amplification of Oligonucleotide Generated from Genomic DNA
The reaction conditions for exponentially amplifying the oligonucleotides generated from genomic DNA are: 85 mM KCl, 25 mM Tris-HCl (pH 8.8 at 25°C), 2.0 mM MgSO4, 5 mM MgCl2, 10 mM (NH4)2SO4, 0.1% Triton X-100, 0.5 mM DTT, 0.4 units N.BstNB I nicking enzyme (NEB), 0.05 units 9°Nm™ polymerase (New England Biolabs (NEB), MA), 0.2 M trehalose, 200 micromolar dNTPs, 10 micrograms/ml BSA (NEB), and 0.1 micromolar of each template oligonucleotide in ultra-pure water that is nuclease free (Ambion). In the case of fluorescence monitoring, SYBR® green (Molecular Probes, Eugene, OR) was added to 5x concentration (SYBR® green is supplied by the manufacturer at a concentration of 10,000X). Incubation is for the times indicated in the text at 55°C to 65°C (as specified).
To determine the lower limit of detection, the following reaction mixture was assembled in a 5 ml polypropylene tube on ice (4°C): 100 ul 10x Thermopol Reaction buffer, 50 ul 10x N.BstNB I buffer, 1 ul of the first template oligonucleotide (T1) at 100 uM stock, 1 ul of the second template oligonucleotide (T2) at 100 uM stock, 24 ul 25 mM dNTPs, 1 ul 10 mg/ml BSA, 40 ul N.BstNB I nicking enzyme at 10 units/ul, 24 ul 9°Nm™ polymerase, and 760 ul ultra-pure water. The reaction was mixed to homogeneity with a 1 ml pipettor. In a set of 12 microtubes, 120 ul was added to the first tube, then 80 ul was aliquoted into each of the remaining 11 tubes. The oligonucleotide(s) generated from genomic DNA or otherwise to be exponentially amplified was diluted 1:1000 from the stock concentration of 100 uM, to a final concentration of 0.1 uM. One microliter of the 0.1 uM solution was added to the first tube, then eleven 3-fold dilutions were made by transferring 40 ul from each tube to the next and mixing. The serial dilutions were made on ice. The tubes were capped and then incubated at 60°C for the times indicated. The reaction was stopped by placing the tube at 4°C or on ice.
In the case of real time fluorescence monitoring, an MJ Opticon was programmed as follows: (1) incubate 10 seconds at 60°C, (2) read plate, then repeat the incubation and reading steps 29 more times.
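The arithmetic behind the limit-of-detection mixture and the dilution series can be sketched as follows. This is a minimal illustration under stated assumptions: the listed volumes are taken at face value, the 1 ul of 0.1 uM oligonucleotide is assumed to add to the 120 ul already in the first tube, and each 40 ul transfer into 80 ul is treated as an exact 3-fold dilution. The helper names are hypothetical.

```python
# Sketch: final concentrations in the master mix and the 12-tube dilution series.

# Volumes in ul, as listed in the text.
MASTER_MIX_UL = {
    "10x Thermopol Reaction buffer": 100,
    "10x N.BstNB I buffer": 50,
    "template T1 (100 uM stock)": 1,
    "template T2 (100 uM stock)": 1,
    "25 mM dNTPs": 24,
    "10 mg/ml BSA": 1,
    "N.BstNB I (10 units/ul)": 40,
    "9 deg Nm polymerase": 24,
    "ultra-pure water": 760,
}


def total_volume_ul() -> float:
    """Total volume of the assembled mixture."""
    return float(sum(MASTER_MIX_UL.values()))


def final_concentration(stock: float, volume_ul: float) -> float:
    """Dilute a stock (any unit) by volume_ul / total mixture volume."""
    return stock * volume_ul / total_volume_ul()


def dilution_series(first_tube: float, n_tubes: int = 12,
                    transfer_ul: float = 40.0, diluent_ul: float = 80.0) -> list[float]:
    """Concentration in each tube after serial transfers of transfer_ul into diluent_ul."""
    factor = (transfer_ul + diluent_ul) / transfer_ul  # 3-fold for 40 ul into 80 ul
    return [first_tube / factor ** i for i in range(n_tubes)]


if __name__ == "__main__":
    print("total volume (ul):", total_volume_ul())
    print("each template (uM):", round(final_concentration(100.0, 1), 4))
    print("N.BstNB I (units/ul):", round(final_concentration(10.0, 40), 3))
    # 1 ul of 0.1 uM (100 nM) oligonucleotide added to 120 ul in the first tube.
    first_tube_nM = 100.0 * 1.0 / (120.0 + 1.0)
    for tube, conc in enumerate(dilution_series(first_tube_nM), start=1):
        print(f"tube {tube:2d}: {conc:.3e} nM")
```

Under these assumptions the template concentration comes out close to the 0.1 micromolar stated for the reaction conditions, and the twelve tubes span a little over five orders of magnitude.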
Chromatography, Time-of-Flight (TOF) Mass Spectrometry
The chromatography system was an Agilent 1100 Series HPLC composed of a binary pump, degasser, a column oven, a diode array detector, and thermostatted microwell plate autoinjector (Agilent Technologies, Palo Alto, CA). The column is a Waters Xterra MS C18, incorporating C18 packing with 3.5 um particle size, with 125 Angstrom pore size, 2.1 mm x 20 mm (Waters Inc. Milford, MA). The column was run at 30°C with a gradient of acetonitrile in 5mM dimethyl-butylamine acetate (DMBAA). As a check on the complete release of the signal oligo during the chromatography and injection we ran the column at 50°C after incubating the sample briefly at 95°C. We saw no increase in the oligo yield over our standard conditions. Buffer A is 5mM DMBAA, buffer B is 5mM DMBAA and 50% (V/V) acetonitrile. The gradient begins at 10%B and ramps to 15%B over 0.3 minute, to 30%B over 2 minutes, to 90%B over 0.5 minute, to 10%B over 0.25 minute, then holds at 10%B for 1.25 minutes. The column temperature was held constant at 30°C. The flow rate was 0.25 ml/minute. The injection volume was 10 μl. Flow rate into the mass spectrometer was also 0.25 ml/min. The mass spectrometer is a Micromass LCT Time-of-Flight with an electrospray inlet (Micromass Inc. Manchester UK). The samples were run in electrospray negative mode with a range from 800 to 2000 amu using a 1 second scan time. Source parameters: Desolvation gas 450 L/hr, Capillary 2225V, Sample cone 30V, RF lens 400V, extraction cone 7V, desolvation temperature 275°C, Source temperature 120°C. Analysis of the LC-mass spectrometry data made use of the software supplied by the manufacturer.
Oligonucleotides are known to exhibit different ionization efficiencies, which in our measurements would be translated into sequence-specific differences in measured oligo concentration. A survey of a range of more than 80 different 12mers indicated that the variation between sequences attributable to this difference is less than 30%. Almost all relevant quantitative comparisons are with the same oligo sequence. It is necessary, however, to calibrate for ionization efficiencies for quantitative comparisons between different sequences.
MALDI Mass Spectroscopy
MALDI Mass Spectroscopy reagents are 1% PADMAC from a stock solution of 10% w/v of poly(diallyldimethylammonium chloride) from Polysciences in 0.3x SSPE, diluted to 1% w/v in HPLC water. 3-HPA Matrix is prepared fresh daily: 25 mg of 3-hydroxypicolinic acid is washed once with 1000 ul acetonitrile, decanted, and dissolved in 500 ul acetonitrile and 500 ul 0.1% TFA. Ammonium citrate is prepared by dissolving in HPLC water to 10 mg/ml.
The procedure for manual spotting is as follows: on a clean MALDI card (stainless steel, Micromass), 0.5ul of the 1% PADMAC solution was placed onto the center of the well using a P2 pipetman. The plate was heated on a thermocycling block to 50°C and the drop was allowed to dry completely (3 to 5 minutes). The card was allowed to cool. 0.5ul of sample (oligo standard or assay sample) was placed onto the well in the same spot as the PADMAC. The card was placed on the heating block until the spots were dry. This step was repeated if the sample was dilute. The card was removed, cooled and washed with 2ul of 10mg/ml ammonium citrate on the wells, with a 1 minute incubation followed by aspiration of the liquid using vacuum suction. The wash was repeated, followed by two washes with 2 ul HPLC grade water as above. The card was heated to dry (2 to 5 minutes). A small drop (about Λ ul) of matrix was applied to each spot. The matrix was dried, and the application was repeated. The card was dried on the heated block for 1 minute.
Acquisition Details: when a MALDI instrument in positive mode with the neutral density filter removed (Micromass Manchester UK) was used, the following parameters were employed: Source voltage of 15000, Pulse voltage of 1313, MCP detector voltage of 1800, Fine Laser Energy % 75, Matrix suppression (amu) 493.2, TLF delay of 500 ns. ADC settings were: Sample period of 0.5 ns and signal sensitivity of 100. Mass range of acquisition was from 1500 to 15000 amu. The laser firing rate was set to 5 Hz and shots per spectrum was set to 5. Calibration: The daily calibration was performed using a mixture of oligonucleotides. The sequences and calculated average masses were as follows:
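The gradient described above is piecewise linear, so the composition at any time can be tabulated directly. The sketch below is an illustration only: it assumes linear ramps between the stated set points and converts %B to percent acetonitrile using the stated composition of buffer B (50% V/V acetonitrile). The names `SEGMENTS`, `percent_b` and `percent_acetonitrile` are hypothetical.

```python
# Sketch: evaluate the HPLC gradient (%B and % acetonitrile) at a given time.
from itertools import accumulate

START_B = 10.0  # %B at time zero
# (duration in minutes, %B reached at the end of the segment), in order.
SEGMENTS = [(0.3, 15.0), (2.0, 30.0), (0.5, 90.0), (0.25, 10.0), (1.25, 10.0)]


def breakpoints() -> list[tuple[float, float]]:
    """Return (time, %B) set points, with times accumulated from segment durations."""
    times = [0.0] + list(accumulate(duration for duration, _ in SEGMENTS))
    levels = [START_B] + [level for _, level in SEGMENTS]
    return list(zip(times, levels))


def percent_b(t_min: float) -> float:
    """Linearly interpolate %B at time t_min (minutes from injection)."""
    points = breakpoints()
    if t_min <= points[0][0]:
        return points[0][1]
    for (t0, b0), (t1, b1) in zip(points, points[1:]):
        if t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    return points[-1][1]


def percent_acetonitrile(t_min: float) -> float:
    """Buffer B is 50% (V/V) acetonitrile and buffer A has none."""
    return 0.5 * percent_b(t_min)


if __name__ == "__main__":
    for t, b in breakpoints():
        print(f"{t:5.2f} min  {b:5.1f} %B  {0.5 * b:5.1f} % acetonitrile")
```

Summing the segment durations gives a run time of about 4.3 minutes per injection, with acetonitrile peaking at 45% at the top of the gradient.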
Name            Sequence                             Average mass
8mer-lv         ATGATGCA                             2433.6
62100 trigger   TTTATGAACTAT                         3634.4
ITAtop          CCGATCTAGTGAGTCGCTC                  5779.8
48180p          GAAAATCCCCTGAGATGAGTCAGTG            7715.1
63180           GGTACTAGCCTTATGCGACTCGGTACTAGCCTT    10095
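The calculated average masses listed for the calibration oligonucleotides can be reproduced with a simple additive calculation. The following is a minimal sketch, assuming unmodified DNA with 5'-OH and 3'-OH termini and using commonly quoted approximate residue masses; the constants and the function name `average_mass` are illustrative assumptions and are not taken from this specification.

```python
# Approximate average-mass increments (Da) per nucleotide residue in a DNA chain.
# Assumption: widely used textbook constants, not values from the specification.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}

# Assumed correction for an unmodified oligonucleotide with 5'-OH and 3'-OH ends.
END_CORRECTION = -61.96


def average_mass(sequence: str) -> float:
    """Return the approximate average mass (Da) of an unmodified DNA oligo."""
    seq = sequence.replace(" ", "").upper()
    unknown = set(seq) - set(RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return sum(RESIDUE_MASS[base] for base in seq) + END_CORRECTION


# Calibration sequences and listed masses as given in the text above.
CALIBRANTS = {
    "ATGATGCA": 2433.6,
    "TTTATGAACTAT": 3634.4,
    "CCGATCTAGTGAGTCGCTC": 5779.8,
    "GAAAATCCCCTGAGATGAGTCAGTG": 7715.1,
    "GGTACTAGCCTTATGCGACTCGGTACTAGCCTT": 10095.0,
}

for seq, listed in CALIBRANTS.items():
    print(f"{len(seq):2d}-mer  listed {listed:8.1f}  computed {average_mass(seq):8.1f}")
```

Under these assumptions the computed values agree with the listed masses to within about 1 Da, which suggests the listed values refer to the neutral, unmodified oligonucleotides.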
Real-time Fluorescence Measurement
All fluorescence measurements reported here were made on an MJ Opticon™ instrument (MJ Research Ltd., Waltham, MA) using software supplied by the manufacturer. The real-time measurements on this instrument were made using an isothermal protocol using a 10-30 second interval read beginning 10 seconds after the lid and chamber reached 60°C.
Results
Eight short oligonucleotides (6-16 nucleotides in length) unique to E. coli strain K12 or O157, respectively, were generated by incubating 10^5 cells of each strain in the amplification buffer with the nicking enzyme N.BstNB I and 9°Nm™ polymerase for 10 minutes at 60°C.
A small aliquot of the above-described reaction mixture was then added to a replicator type amplification reaction in which a set of two templates (i.e., O157-1S and O157-1SP, O157-2S and O157-2SP, O157-3S and O157-3SP, O157-4S and O157-4SP, K12-1S and K12-1SP, K12-2S and K12-2SP, K12-3S and K12-3SP, or K12-4S and K12-4SP) designed to further amplify one of the eight short oligonucleotides described above was present. The resulting reaction mixture was incubated for 3 minutes at 60°C, cooled to 4°C and then subjected to both MALDI and LC-TOF mass spectrometry analysis (see Table 8). All four oligonucleotides from each of the two strains were amplified, while no oligonucleotides were amplified from the negative controls.
Table 8. Analysis of the fingerprinting reactions for E. coli O157 and K12 strains by LC-TOF and mass spectrometry
Figure imgf000102_0001
False positives can be greatly reduced when one strand of genomic DNA generates an oligonucleotide that is exactly complementary to another oligonucleotide generated from the other strand of the genomic DNA. Under such circumstances, the possibility of an accidental production of not just one, but two, oligonucleotides with masses identical to those expected from the two complementary oligonucleotides generated from the genomic DNA is much smaller than the possibility of an accidental production of one oligonucleotide with a mass identical to that of an oligonucleotide generated from the genomic DNA. Figures 4a and 4b show the TOF spectra of the multiplexed amplification reaction in the presence of short oligonucleotides generated from the genomic DNA of the E. coli strain K12 and O157, respectively.
Figures 5a and 5b show the MALDI spectra of the multiplexed amplification reaction in the presence of short oligonucleotides generated from the genomic DNA of the E. coli strain K12 and O157, respectively. The figures show that all eight amplification products and the oligonucleotide generated by the positive control, but no additional fragments, were amplified.
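The false-positive argument made above for complementary oligonucleotide pairs can be illustrated numerically. The sketch below is an assumption-laden illustration only: it derives the complementary strand of a fragment, computes the two expected masses with commonly quoted approximate residue masses, and treats accidental mass matches as independent events with a hypothetical per-fragment probability. None of the names or constants are taken from this specification.

```python
# Assumed approximate residue masses (Da) and end correction for unmodified DNA.
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
END_CORRECTION = -61.96

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the exactly complementary strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def average_mass(seq: str) -> float:
    """Approximate average mass (Da) of an unmodified DNA oligo."""
    return sum(RESIDUE_MASS[b] for b in seq.upper()) + END_CORRECTION


def joint_coincidence(p_single: float, n_fragments: int = 2) -> float:
    """Chance that n independent accidental mass matches all occur together."""
    return p_single ** n_fragments


fragment = "ATGATGCA"          # example sequence only
partner = reverse_complement(fragment)
print(fragment, round(average_mass(fragment), 2))
print(partner, round(average_mass(partner), 2))

# Hypothetical figure: a 1% chance of one accidental match becomes 0.01% for two.
print(joint_coincidence(0.01))
```

Because the two complementary strands generally differ in base composition, they give two distinct expected masses, so requiring both peaks is a stricter test than requiring one.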
Both MALDI and LC-TOF mass spectrometry methods are highly amenable to multiplexing with respect to the simultaneous measurement of mass/charge ratios. Using current off-the-shelf instrumentation, the level of achievable multiplexing should reach 200-500 fragments for LC-TOF and 20-50 different fragments when using MALDI mass spectrometry.
The readout can also be accomplished by real-time fluorescence, although the capability to perform intra-well multiplexing is lost. The lower limit of detection for triggering oligonucleotides from E. coli K12 was determined by serially diluting the oligonucleotide generated from the genomic DNA and measuring the duplex formation and intercalation of SYBR® green during the exponential amplification reaction using real-time fluorescence monitoring. The trigger was diluted from a starting concentration of 1 x 10^-4 micromolar to 1 x 10^-12 micromolar in 3-fold intervals per well. The reaction was monitored every 10 seconds using SYBR® green on an MJ Opticon™ 1. As shown in Figure 6, the rate of accumulation of double-strand material is proportional to the concentration of the oligonucleotide generated from the genomic DNA. At the highest concentration of this oligonucleotide, the reaction was complete in 50 seconds, and there was an approximate 20 second delay in accumulation of fluorescence for each 4-fold dilution of the trigger. The eleventh dilution of trigger was clearly above the values of the no-trigger control, indicating a lower limit of detection of about 200 molecules. The level of amplification was about 08-fold. There appears to be no fundamental limitation on the lower limit of detection. By scaling down the volume of amplification via miniaturization or by using solid phase technologies, a lower limit of detection of a few tens of molecules should be achievable.
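Relating a trigger concentration to a number of molecules per well, as in the detection-limit estimate above, is a matter of unit conversion. The sketch below shows that arithmetic; the reaction volume is not stated in this passage, so `volume_microliters` is a hypothetical input, and the dilution factor is left as a parameter rather than fixed.

```python
from typing import List

AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_in_well(conc_micromolar: float, volume_microliters: float) -> float:
    """Convert a concentration (micromolar) and volume (microliters) to molecules."""
    moles = (conc_micromolar * 1e-6) * (volume_microliters * 1e-6)
    return moles * AVOGADRO


def dilution_series(start_micromolar: float, fold: float, steps: int) -> List[float]:
    """Concentrations for a serial dilution, starting with the undiluted value."""
    return [start_micromolar / fold ** i for i in range(steps)]


# Hypothetical example: a 3-fold series from 1e-4 micromolar in a 20 ul well.
# The 20 ul volume is an assumption chosen only for illustration.
for step, conc in enumerate(dilution_series(1e-4, 3.0, 5)):
    print(step, f"{conc:.3e} uM", f"{molecules_in_well(conc, 20.0):.3e} molecules")
```

Keeping the volume explicit makes it easy to see how reducing the reaction volume lowers the absolute number of molecules present at a given concentration.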
Another tractable form of multiplexing is to covalently immobilize the template amplification-oligonucleotides in an array format where multiplexing is achieved by spatial separation of elements on the array. Each element contains two templates (T1 and T2) specific for a particular oligonucleotide generated from a genomic DNA. When the template oligonucleotides are covalently immobilized on a surface coated with 70,000 MW poly(ethyleneimine) (PEI), the nicking enzymes and polymerases are not hindered and amplification is nearly as efficient as in solution. Covalent attachment of the amplification-templates has the added advantage that the annealing between the T1 and T2 templates is prevented.
The 8 sets of amplification templates (for E. coli strains K12 and O157) as described above were immobilized on PEI glass slides using cyanuric chloride as the cross-linker (see above). Trigger oligonucleotides were added to the amplification reaction at 1 x 10^-3 femtomolar and the reaction was incubated at 60°C for 10 (under a coverslip). All K12 elements were positive with K12 triggers and the O157 elements were positive with O157 triggers; the positive and negative controls also gave the expected results. During the amplification, the single-stranded amplification templates were converted to double strand and, in the presence of SYBR® green, the amplifying elements became fluorescent while the negative elements remained unstained.
EXAMPLE 3 COMPUTER-ASSISTED IDENTIFICATION OF SHORT OLIGONUCLEOTIDES CONTAINING SINGLE NUCLEOTIDE POLYMORPHISM
Certain short oligonucleotide fragments generated from the genomic DNA of an organism may contain single nucleotide polymorphisms. In other words, a short oligonucleotide fragment generated from the genomic DNA from one individual may be different from the corresponding fragment from another individual. Such oligonucleotide fragments may be used to characterize genetic variations and to identify a particular individual. Table 9 shows the short oligonucleotides that would be generated from human genomic DNA in the presence of the nicking enzyme N.BstNB I alone or the nicking enzyme N.BstNB I in combination with another restriction endonuclease selected from Bcl I, BsaB I, Bsm I, Bsr I and BsrD I. In the enzyme column, "NBstNBI" indicates that the fragment is flanked by two N.BstNB I sites. "BclI_NBstNBI" indicates that a fragment is flanked by a Bcl I site and an N.BstNB I site. This table also shows portions of genomic DNA that contain two closely located sequences of one strand of the N.BstNB I recognition sequence, or the sequence of one strand of the N.BstNB I recognition sequence and that of one strand of the Bcl I recognition sequence.
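A computer-assisted search of the kind described above can be sketched as a scan for closely spaced recognition sequences. The following is a simplified illustration, not the procedure used to build Table 9: it assumes the N.BstNB I recognition sequence is 5'-GAGTC-3' (with 5'-GACTC-3' as the other strand), looks only at one strand of the input, ignores the offset between recognition site and nick position, and reports the sequence lying between adjacent sites when it is 6-16 nucleotides long. All function names are hypothetical.

```python
import re
from typing import List, Tuple

# Assumed N.BstNB I recognition sequence (sense strand) and its reverse complement.
SITE_STRANDS = ("GAGTC", "GACTC")


def find_sites(seq: str) -> List[Tuple[int, str]]:
    """Return (position, motif) for every site strand found, sorted by position."""
    seq = seq.upper()
    hits = []
    for motif in SITE_STRANDS:
        # A lookahead is used so that overlapping occurrences are not skipped.
        for match in re.finditer(f"(?={motif})", seq):
            hits.append((match.start(), motif))
    return sorted(hits)


def flanked_fragments(
    seq: str, min_len: int = 6, max_len: int = 16
) -> List[Tuple[str, str, str]]:
    """Return (fragment, left_site, right_site) for adjacent sites 6-16 nt apart."""
    seq = seq.upper()
    sites = find_sites(seq)
    results = []
    for (pos1, motif1), (pos2, motif2) in zip(sites, sites[1:]):
        start = pos1 + len(motif1)
        if min_len <= pos2 - start <= max_len:
            results.append((seq[start:pos2], motif1, motif2))
    return results


# Made-up test sequence with one pair of sites separated by 8 nucleotides.
example = "AAGAGTC" + "ACGTACGT" + "GACTCTT"
print(flanked_fragments(example))
```

A fuller treatment would also model the nick position relative to each site and the second enzyme's recognition sequence; those details are deliberately left out of this sketch.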
[Table 9 appears in the original publication as page images imgf000106_0001 through imgf000220_0001.]
All of the above U.S. patents, U.S. patent application publications, U.S. patent applications, foreign patents, foreign patent applications and non-patent publications referred to in this specification and/or listed in the Application Data Sheet, are incorporated herein by reference, in their entirety.
From the foregoing it will be appreciated that, although specific embodiments of the invention have been described herein for purposes of illustration, various modifications may be made without deviating from the spirit and scope of the invention. Accordingly, the invention is not limited except as by the appended claims.

Claims

1. A method comprising: a) providing a nucleic acid sample; b) treating the nucleic acid sample with components under nicking conditions, where the components comprise: i) a nicking agent; and the conditions cause the nicking agent to nick the nucleic acid sample to thereby produce a family of initiating oligonucleotide fragments; c) subjecting one or more members of the family of initiating oligonucleotide fragments to a characterization process to thereby provide results; and d) identifying a source for the nucleic acid sample based on the results of the characterization process.
2. The method of claim 1 wherein the components further comprise: ii) a polymerase; and iii) a deoxyribonucleoside triphosphate.
3. The method of claim 2 wherein the components further comprise: iv) a template oligonucleotide comprising from 3' to 5':
(A) a first nucleotide sequence that is at least substantially complementary to a nucleotide sequence present in one or more members of the family of initiating oligonucleotide fragments;
(B) a sequence of one strand of a nicking agent recognition sequence; and
(C) a second nucleotide sequence.
4. The method of claim 3 wherein sequence iv)(B) is a sequence of an antisense strand of the nicking agent recognition sequence, and sequence iv)(A) is at least substantially identical to sequence iv)(C).
5. The method of claim 3 wherein the components further comprise v) a second template oligonucleotide, comprising, from 3' to 5':
(A) a first nucleotide sequence that is at least substantially identical to a nucleotide sequence present in the one or more members of the family of initiating oligonucleotide fragments;
(B) a sequence of a sense strand of the nicking agent recognition sequence; and
(C) a second nucleotide sequence, wherein sequence iv)(B) is a sequence of a sense strand of the nicking agent recognition sequence.
6. The method of claim 1 wherein the components further comprise a restriction endonuclease.
7. The method of claim 1 wherein the one or more members of the family of initiating oligonucleotide fragments are 6-16 nucleotides in length.
8. The method of claim 1 wherein the nicking agent is a nicking endonuclease.
9. The method of claim 1 wherein the nicking agent is N.BstNB I.
10. The method of claim 2 wherein the polymerase is selected from exo⁻ Vent polymerase, Bst polymerase, and 9°Nm™ polymerase.
11. The method of claim 3 wherein the 3' terminus of template iv) is blocked.
12. The method of claim 4 wherein sequence iv)(A) is exactly identical to sequence iv)(C).
13. The method of claim 5 wherein the 3' termini of template iv) and v) are blocked.
14. The method of claim 3 wherein the template is immobilized.
15. The method of claim 5 wherein templates iv) and v) are immobilized.
16. The method of claim 1 wherein the characterization process is performed at least partially by a technique selected from the group consisting of luminescence spectroscopy or spectrometry, fluorescence spectroscopy or spectrometry, mass spectrometry, liquid chromatography, fluorescence polarization, and electrophoresis.
17. The method of claim 16 wherein the characterization process is performed at least partially by mass spectrometry.
18. The method of claim 17 wherein the mass spectrometry is MALDI or LC-TOF.
19. A template oligonucleotide for amplifying a portion of a target nucleic acid, wherein
(i) the portion of the target is 6-16 nucleotides in length and flanked by
(A) a sequence of one strand of a first nicking enzyme recognition sequence, and
(B) a sequence of one strand of a second nicking enzyme recognition sequence, or a sequence of one strand of a restriction enzyme recognition sequence, and
(ii) the template oligonucleotide comprises, from 3' to 5':
(A) a first nucleotide sequence that is at least substantially complementary to the portion of the target,
(B) a sequence of an antisense strand of a third nicking enzyme recognition sequence, and
(C) a second nucleotide sequence that is at least substantially identical to the first nucleotide sequence.
20. The template oligonucleotide of claim 19 wherein the first and third nicking enzyme recognition sequences are identical to each other.
21. The template oligonucleotide of claim 19 wherein the first, second and third nicking enzyme recognition sequences are identical to each other.
22. The template oligonucleotide of claim 21 wherein the first, second and third nicking enzyme recognition sequences are recognizable by N.BstNB I.
23. The template oligonucleotide of claim 19 wherein the 3' terminus of the template oligonucleotide is blocked.
24. The template oligonucleotide of claim 19 wherein the 3' terminus of the template oligonucleotide is immobilized.
25. The template oligonucleotide of claim 19, wherein the portion of the target is selected from the products listed in Tables 1, 2, 4, 5, and 9.
26. The template oligonucleotide of claim 19 wherein the portion of the target comprises a genetic variation.
27. The template oligonucleotide of claim 26 wherein the genetic variation is a single nucleotide polymorphism.
28. A composition for amplifying a portion of a target nucleic acid, comprising a first template oligonucleotide and a second template oligonucleotide, wherein
(i) the portion of the target is 6-16 nucleotides in length and flanked by
(A) a sequence of one strand of a first nicking enzyme recognition sequence, and
(B) a sequence of one strand of a second nicking enzyme recognition sequence, or a sequence of one strand of a restriction enzyme recognition sequence,
(ii) the first template oligonucleotide comprises from 5' to 3':
(A) a first nucleotide sequence,
(B) a sequence of a sense strand of a third nicking agent recognition sequence, and
(C) a second nucleotide sequence that is at least substantially complementary to the portion of the target, and
(iii) the second template oligonucleotide comprises from 5' to 3':
(A) a first nucleotide sequence,
(B) a sequence of a sense strand of a fourth nicking agent recognition sequence, and
(C) a second nucleotide sequence that is at least substantially identical to the portion of the target.
29. The composition of claim 28 wherein the first, third and fourth nicking enzyme recognition sequences are identical to each other.
30. The composition of claim 28 wherein the first, second, third and fourth nicking enzyme recognition sequences are identical to each other.
31. The composition of claim 29 wherein the first, third and fourth nicking enzyme recognition sequences are recognizable by N.BstNB I.
32. The composition of claim 28 wherein the 3' termini of the first and second templates are blocked.
33. The composition of claim 28 wherein the 3' termini of the first and second templates are immobilized.
34. The composition of claim 28 wherein the portion of the target is selected from the products listed in Tables 1, 2, 4, 5 and 9.
35. The composition of claim 28 wherein the portion of the target comprises a genetic variation.
36. The composition of claim 35 wherein the genetic variation is a single nucleotide polymorphism.
37. A composition comprising:
(1) a nicking agent, and
(2) at least one fragment selected from the products listed in Tables 1, 2, 4, 5 and 9, wherein the at least one fragment is produced at least partially by action of the nicking agent.
38. The composition of claim 37, wherein the nicking agent is N.BstNB I.
39. The composition of claim 38, wherein the nicking agent is N.Alw I.
40. The composition of claim 37 further comprising a restriction endonuclease selected from Bcl I, BsaB I, Bsm I, Bsr I and BsrD I, wherein the at least one fragment is selected from the products listed in Table 9.
41. The composition of claim 37 comprising at least two fragments selected from the products listed in Tables 1, 2, 4, 5 and 9.
42. The composition of claim 37 comprising at least three fragments selected from the products listed in Tables 1, 2, 4, 5 and 9.
43. The composition of claim 37 comprising at least five fragments selected from the products listed in Tables 1, 2, 4, 5 and 9.
44. A kit for identifying the source of a nucleic acid sample, comprising one or two template oligonucleotides for amplifying a portion of a genomic DNA of an organism suspected to be the source of the nucleic acid sample, wherein the portion of the genomic DNA is 6-16 nucleotides in length and flanked by
(A) a sequence of one strand of a first nicking enzyme recognition sequence, and
(B) a sequence of one strand of a second nicking enzyme recognition sequence, or a sequence of one strand of a restriction enzyme recognition sequence.
45. The kit of claim 44 wherein the one or two template oligonucleotides is the template oligonucleotide of claim 19.
46. The kit of claim 44 wherein the one or two template oligonucleotides is the template oligonucleotide of claim 20.
47. The kit of claim 46 further comprising a nicking enzyme that recognizes the first and third nicking enzyme recognition sequences.
48. The kit of claim 44 wherein the one or two template oligonucleotides is the first and second template oligonucleotides of claim 28.
49. The kit of claim 44 wherein the one or two template oligonucleotides is the first and second template oligonucleotides of claim 29.
50. The kit of claim 49 further comprising a nicking enzyme that recognizes the first, third and fourth nicking enzyme recognition sequences.
51. The kit of claim 44 further comprising a DNA polymerase.
52. The kit of claim 51 further comprising one or more deoxyribonucleoside triphosphate.
53. The kit of claim 44 wherein the portion of the genomic DNA is selected from the products listed in Tables 1, 2, 4, 5 and 9.
54. The kit of claim 44 wherein the portion of the genomic DNA comprises a single nucleotide polymorphism.
55. An array, comprising
(a) a substrate having a plurality of distinct areas; and
(b) the template oligonucleotide of claim A1 or the composition of claim B1 immobilized to at least one of the plurality of distinct areas.
PCT/US2004/002720 2003-01-29 2004-01-29 Organism fingerprinting using nicking agents WO2004067765A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US44381103P 2003-01-29 2003-01-29
US60/443,811 2003-01-29

Publications (2)

Publication Number Publication Date
WO2004067765A2 true WO2004067765A2 (en) 2004-08-12
WO2004067765A3 WO2004067765A3 (en) 2006-09-28

Family

ID=32825374

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2004/002720 WO2004067765A2 (en) 2003-01-29 2004-01-29 Organism fingerprinting using nicking agents

Country Status (1)

Country Link
WO (1) WO2004067765A2 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5500363A (en) * 1990-04-26 1996-03-19 New England Biolabs, Inc. Recombinant thermostable DNA polymerase from archaebacteria
US5614386A (en) * 1995-06-23 1997-03-25 Baylor College Of Medicine Alternative dye-labeled primers for automated DNA sequencing
US5641658A (en) * 1994-08-03 1997-06-24 Mosaic Technologies, Inc. Method for performing amplification of nucleic acid with two primers bound to a single solid support
US5874215A (en) * 1995-01-16 1999-02-23 Keygene N.V. Amplification of simple sequence repeats

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
COLLINS ET AL.: 'DNA Fingerprinting of Mycobacterium bovis Strains by Restriction Fragment Analysis and Hybridization with Insertion Elements IS1081 and IS6110' J. OF CLINICAL MICROBIOLOGY vol. 31, no. 5, May 1993, pages 1143 - 1147, XP003001112 *
SIEGERT ET AL.: 'Matrix-Assisted Laser Desorption/Ionization Time-of-Flight Mass Spectrometry for the Detection of Polymerase Chain Reaction Products containing 7-Deazapurine Moieties' ANALYTICAL BIOCHEMISTRY vol. 243, 1996, pages 55 - 65, XP002072977 *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7632667B2 (en) 2005-10-24 2009-12-15 Nishikawa Rubber Co., Ltd. Mutan endonuclease with substrate-specific cleavage activity
KR101136758B1 (en) 2005-10-24 2012-04-19 니시카와고무고교가부시키가이샤 Mutant endonuclease
US20140017692A1 (en) * 2010-12-10 2014-01-16 Abbott Laboratories Method and kit for detecting target nucleic acid
US9845495B2 (en) 2010-12-10 2017-12-19 Abbott Laboratories Method and kit for detecting target nucleic acid
US11208663B2 (en) 2012-03-28 2021-12-28 Somalogic, Inc. Post-selex modification methods
US9410156B2 (en) 2012-03-28 2016-08-09 Somalogic, Inc. Aptamers to PDGF and VEGF and their use in treating PDGF and VEGF mediated conditions
US10221421B2 (en) 2012-03-28 2019-03-05 Somalogic, Inc. Post-selec modification methods
US9701967B2 (en) 2012-03-28 2017-07-11 Somalogic, Inc. Aptamers to PDGF and VEGF and their use in treating PDGF and VEGF mediated conditions
US9994857B2 (en) 2013-09-09 2018-06-12 Somalogic, Inc. PDGF and VEGF aptamers having improved stability and their use in treating PDGF and VEGF mediated diseases and disorders
US9695424B2 (en) 2013-09-09 2017-07-04 Somalogic, Inc. PDGF and VEGF aptamers having improved stability and their use in treating PDGF and VEGF mediated diseases and disorders
US10544419B2 (en) 2013-09-09 2020-01-28 Somalogic Inc. PDGF and VEGF aptamers having improved stability and their use in treating PDGF and VEGF mediated diseases and disorders
WO2015035305A1 (en) * 2013-09-09 2015-03-12 Somalogic, Inc. Pdgf and vegf aptamers having improved stability and their use in treating pdgf and vegf mediated diseases and disorders
US10036077B2 (en) 2014-01-15 2018-07-31 Abbott Laboratories Covered sequence conversion DNA and detection methods
US10208333B2 (en) 2014-10-14 2019-02-19 Abbott Laboratories Sequence conversion and signal amplifier DNA having locked nucleic acids and detection methods using same
US10316353B2 (en) 2014-10-14 2019-06-11 Abbott Laboratories Sequence conversion and signal amplifier DNA having poly DNA spacer sequences and detection methods using same
US10604790B2 (en) 2014-12-24 2020-03-31 Abbott Laboratories Sequence conversion and signal amplifier DNA cascade reactions and detection methods using same
US11492658B2 (en) 2014-12-24 2022-11-08 Abbott Laboratories Sequence conversion and signal amplifier DNA cascade reactions and detection methods using same

Also Published As

Publication number Publication date
WO2004067765A3 (en) 2006-09-28

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase