WO1997002488A1 - Method and system for dna sequence determination and mutation detection - Google Patents
Method and system for DNA sequence determination and mutation detection
- Publication number
- WO1997002488A1 (PCT/US1996/011130)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- fragment
- nucleic acid
- pattern
- normalization coefficients
- standard
- Prior art date
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N27/00—Investigating or analysing materials by the use of electric, electrochemical, or magnetic means
- G01N27/26—Investigating or analysing materials by the use of electric, electrochemical, or magnetic means by investigating electrochemical variables; by using electrolysis or electrophoresis
- G01N27/416—Systems
- G01N27/447—Systems using electrophoresis
- G01N27/44704—Details; Accessories
- G01N27/44717—Arrangements for investigating the separated zones, e.g. localising zones
- G01N27/44721—Arrangements for investigating the separated zones, e.g. localising zones by optical means
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N27/00—Investigating or analysing materials by the use of electric, electrochemical, or magnetic means
- G01N27/26—Investigating or analysing materials by the use of electric, electrochemical, or magnetic means by investigating electrochemical variables; by using electrolysis or electrophoresis
- G01N27/416—Systems
- G01N27/447—Systems using electrophoresis
- G01N27/44704—Details; Accessories
- G01N27/44717—Arrangements for investigating the separated zones, e.g. localising zones
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
- G16B30/10—Sequence alignment; Homology search
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y10—TECHNICAL SUBJECTS COVERED BY FORMER USPC
- Y10T—TECHNICAL SUBJECTS COVERED BY FORMER US CLASSIFICATION
- Y10T436/00—Chemistry: analytical and immunological testing
- Y10T436/14—Heterocyclic carbon compound [i.e., O, S, N, Se, Te, as only ring hetero atom]
- Y10T436/142222—Hetero-O [e.g., ascorbic acid, etc.]
- Y10T436/143333—Saccharide [e.g., DNA, etc.]
Definitions
- This invention relates to a method and system of nucleotide sequence determination and mutation detection in a subject nucleic acid molecule for use with automated electrophoresis detection apparatus.
- fragment pattern One of the steps in nucleotide sequence determination of a subject nucleic acid polymer is interpretation of the pattern of oligonucleotide fragments which results from electrophoretic separation of fragments of the subject nucleic acid polymer (the "fragment pattern").
- base-calling results in determination of the order of four nucleotide bases, A (adenine), C (cytosine), G (guanine) and T (thymine) for DNA or U (uracil) for RNA in the subject nucleic acid polymer.
- nucleic acid polymer is labeled with a radioactive isotope and either Maxam and Gilbert chemical sequencing (Proc. Natl. Acad. Sci. USA, 74: 560-564 (1977)) or Sanger et al. chain termination sequencing (Proc. Natl. Acad. Sci. USA 74: 5463-5467 (1977)) is performed.
- the resulting four samples of nucleic acid fragments (terminating in A, C, G, or T(U) respectively in the Sanger et al. method) are loaded into separate loading sites at the top end of an electrophoresis gel. An electric field is applied across the gel, and the fragments migrate through the gel.
- the gel acts as a separation matrix.
- the fragments, which in each sample are of an extended series of discrete sizes, separate into bands of discrete species in a channel along the length of the gel. Shorter fragments generally move more quickly than larger fragments.
- the electrophoresis is stopped.
- the gel may now be exposed to radiation sensitive film for the generation of an autoradiograph.
- the pattern of radiation detected on the autoradiograph is a fixed representation of the fragment pattern. A researcher then manually base-calls the order of fragments from the fragment pattern by identifying the stepwise sequence of the order of bands across the four channels.
- Automated DNA sequencers are basically electrophoresis apparatuses with detection systems which detect the presence of a detectable molecule as it passes through a detection zone.
- Each of these apparatuses is capable of real time detection of migrating bands of oligonucleotide fragments; the fragment patterns consist of a time-based record of fluorescence emissions or other detectable signals from each individual electrophoresis channel. They do not require the cumbersome autoradiography methods of the earliest technologies to generate a fragment pattern.
- each fragment pattern includes a series of sharp peaks and low, flat plains; the peaks representing the passage of a band of oligonucleotide fragments; the plains representing the absence of such bands.
- the A.L.F. system executes at least four discrete functions: 1) it smooths the raw data with a band-pass frequency filter; 2) it identifies successive maxima in each data stream; 3) it aligns the smoothed data from each of the four channels into an aligned data stream; and 4) it determines the order of the successive maxima with respect to the aligned data stream.
- the alignment process used in the apparatus depends on the existence of very little variability between the lanes of the gel.
- the fragment patterns from each lane can be superimposed by alignment to a presumed starting point in each pattern to provide a record of a continuous, non-overlapping series of sharp peaks, each peak representing a one nucleotide step in the subject nucleic acid.
- the computer identifies the presence of ambiguities and fails to identify a sequence.
- Tibbetts and Bowling disclose a method and system which relies on the second derivative of the peak slopes to smooth the data. The second derivative is used to provide an informative variable and an intensity variable to determine the nucleic acid sequence corresponding to the subject nucleic acid polymer.
- Dam et al. disclose a method of combining peak shapes from two signal spectrums derived from the same electrophoresis channel to determine the order of nucleotides in the subject nucleic acid polymer.
- the first is the inability to align shifted lanes of data. If the signal from the related data streams does not begin at approximately the same time, it is difficult, if not impossible, for these techniques to determine the correct alignment.
- it is a challenge to resolve "compressions" in the fragment pattern: those anomalies wherein the signals from two or more nucleotides in a row are not distinguishably separated as compared to other nucleotides in the general vicinity. Compressions result most often from short hairpin loops at the end of a fragment which cause altered gel mobility features.
- the third problem is the inability to identify nucleotide sequences beyond the limits of single nucleotide resolution.
- DNA sequence-based diagnosis is the routine sequencing of patient DNA to identify genotype and/or specific gene sequences of the patient, wherein the DNA sequence is reported back to the physician and patient in order to assist in diagnosis and treatment of patient conditions.
- DNA sequence-based diagnosis is that the DNA sequence being examined is largely known.
- the instant invention it is possible to use the known fragment pattern for each DNA sequence to assist in the interpretation of the fragment pattern obtained from a patient sample to obtain improved read-length and accuracy. It can also be used to increase the speed of sample analysis.
- the clean fragment pattern is then evaluated to determine one or more "normalization coefficients."
- These normalization coefficients reflect the displacement, stretching or shrinking, and rate of stretching or shrinking of the clean fragment pattern, or segments thereof, which are necessary to obtain a suitably high degree of correlation between the clean fragment pattern and a standard fragment pattern which represents the positions of the selected nucleic acid base within a standard polymer actually having the known sequence as a function of migration time or distance.
- the normalization coefficients are then applied to the clean fragment pattern to produce a normalized fragment pattern which is used for base-calling in a conventional manner.
- the method of the invention is advantageously implemented in an apparatus comprising a computer processor programmed to determine normalization coefficients for an experimental fragment pattern.
- This computer may be separate from the electrophoresis apparatus, or part of an integrated unit.
- Fig. 1 illustrates the effect of background subtraction and band-pass frequency filtration on the appearance of data.
- Figs. 2A, 2B, and 2C illustrate the correlation method of the instant invention.
- Figs. 3A, 3B and 3C illustrate the effect of increasing the number of segments into which the sample data is divided.
- Fig. 4 is a plot of preferred correlation shift against data point number.
- Figs. 5A and 5B illustrate alignment of data windows.
- Fig. 6 shows the process of "reproduction" using Genetic Algorithms.
- Fig. 7 shows a binary genotype useful for finding values for the coefficients of a second-order polynomial using Genetic Algorithms.
- Fig. 8 illustrates the exercise of base-calling of aligned data, as obtained from a Pharmacia A.L.F.(tm) and processed using HELIOS(tm) software.
- Figs. 9A and 9B illustrate a sequencing compression.
- Fig. 10 illustrates a cross correlogram which plots maximum correlation against data point number of shifted Origin across the entire length of sample and standard fragment patterns.
- Fig. 11 shows an apparatus in accordance with the invention.
- the instant invention is designed to work with DNA sequence-based diagnosis or any other sequencing environment involving nucleotide sequence determination and/or mutation detection for the same region of DNA in a plurality of individual DNA-containing samples (human or otherwise).
- This "diagnostic environment" is unlike the vast majority of DNA sequence determination now occurring, in which researchers are attempting to make an initial determination of the nucleotide sequence of unknown regions of DNA.
- DNA sequence-based diagnosis in which the DNA sequence of a patient gene is determined is one example of a technique performed within a diagnostic environment to which the present invention is applicable. Other examples include identification of pathogenic bacteria or viruses, DNA fingerprinting, plant and animal identification, etc.
- the present invention provides a method for normalization of experimental fragment patterns for nucleic acid polymers with putatively known sequences which enhances the ability to interpret the information found in the fragment patterns.
- this method at least one raw fragment pattern is obtained for the experimental sample.
- raw fragment pattern refers to a data set representing the positions of one selected nucleic acid base within the experimental polymer as a function of migration time or distance.
- Preferred raw fragment patterns which may be processed using the present invention include raw data collected using the fluorescence detection apparatus of automated DNA sequencers.
- the present invention is applicable to any data set which reflects the separation of oligonucleotide fragments in space or time, including real time fragment patterns using any type of detector, for example a polarization detector as described in US Patent Application No. 08/387,272 filed February 13, 1995 and incorporated herein by reference; densitometer traces of autoradiographs or stained gels; traces from laser-scanned gels containing fluorescently-tagged oligonucleotides; and fragment patterns from samples separated by mass spectrometry.
- This raw fragment pattern is conditioned, for example using conventional baseline correction and noise reduction techniques, to yield a "clean fragment pattern."
- three methods of signal processing commonly used are background subtraction, low frequency filtration and high frequency filtration.
- Background subtraction eliminates the minimum constant noise recorded by the detector.
- the background is calculated as a measure of the minimum signal obtained over a selected number of data points. This measure differs from low frequency filtration which eliminates low period variations in signal that may result from variable laser intensity, etc.
- High frequency filtration eliminates the small variations in signal intensity that occur over highly localized areas of signal.
- the result after base-line subtraction is a band-pass filter applied to the frequency domain:
- Fig. 1 illustrates the effect of background subtraction, low and high frequency filtration on the appearance of data from a Visible Genetics MicroGene Blaster(tm), resulting in a clean fragment pattern useful in the invention.
- a "clean fragment pattern" may be obtained by the application of these signal-processing techniques singly or in any combination.
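The conditioning steps described above can be sketched as follows. This is a minimal illustration, not the procedure used in any particular instrument: it assumes the raw fragment pattern is a one-dimensional NumPy array, takes the background as the minimum over fixed-size blocks of data points, and implements the low and high frequency filtration together as a band-pass mask in the frequency domain. The function names, the block size and the two cut-off frequencies are hypothetical choices made for the example.

```python
import numpy as np

def subtract_background(raw, window=200):
    """Subtract the minimum signal found in each block of `window` points."""
    raw = np.asarray(raw, dtype=float)
    out = np.empty_like(raw)
    for start in range(0, len(raw), window):
        seg = raw[start:start + window]
        out[start:start + window] = seg - seg.min()
    return out

def band_pass(signal, low_cut=0.002, high_cut=0.2):
    """Zero every frequency component outside [low_cut, high_cut].

    Frequencies are in cycles per data point (0 to 0.5). The cut-offs are
    illustrative assumptions and would be tuned empirically.
    """
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal))
    keep = (freqs >= low_cut) & (freqs <= high_cut)
    return np.fft.irfft(spectrum * keep, n=len(signal))

def clean_fragment_pattern(raw):
    """Background subtraction followed by band-pass filtration."""
    return band_pass(subtract_background(raw))
```

Either step can be used on its own, which matches the statement that the techniques may be applied singly or in any combination.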
- other signal processing techniques may be employed to obtain comparable clean fragment patterns without departing from the present invention.
- One note of caution concerning this conditioning step is the finding that signal conditioning or pre-processing may delete features of consequence in the preparation of the clean fragment pattern.
- the next step in the method of the present invention is the comparison of the clean fragment pattern with a standard fragment pattern to determine one or more "normalization coefficients."
- a "standard fragment pattern" takes advantage of the fact that in a diagnostic environment, there is a known fragment pattern that is expected from each test sample.
- the term "standard fragment pattern" refers to a typical fragment pattern which results from sequencing a particular known region of DNA using the same technique as the experimental technique being employed.
- a standard fragment pattern may be a time-based fluorescence emission record as obtained from an automated DNA sequencer, or it may be another representation of the separated fragment pattern.
- a standard fragment pattern used in the present invention includes all the less-than-ideal characteristics of nucleotide separation that may be associated with sequencing of any particular region of DNA.
- a standard fragment pattern may also tend to be idiosyncratic with the electrophoresis apparatus employed, the reaction conditions employed in sequencing and other factors.
- Fig. 2A illustrates a standard fragment pattern for the T lane of the first 260 nucleotides from the universal primer of pUC18 prepared using Sequenase 2.0 (United States Biochemical, Cleveland) and detected on a Visible Genetics MicroGene Blaster(tm).
- Four standard fragment patterns, one for each nucleotide, make up the standard fragment pattern set for a particular nucleic acid polymer.
- a standard fragment pattern or fragment pattern set for a particular nucleic acid polymer may be generated by various methods.
- One such method is to obtain several to several hundred actual fragment patterns for the DNA sequence in question from samples wherein the DNA sequence is already known. From these trial runs, a human operator may select the trial run that is found to be the most typical fragment pattern. Because of slight gel or sample anomalies, and other anomalies, different fragment patterns may have slightly different separation characteristics, slightly different peak amplitudes, etc.
- the selected pattern generally should not show discrete peaks in an area where compressions and overlaps are regularly found. Similarly, the selected pattern generally must not show discrete separation of bases beyond the average single nucleotide resolution limit of the electrophoresis instrument used.
- An alternative method to select a standard fragment pattern is to generate a mathematically averaged result from a combination of the trial runs.
- the main use of the standard fragment pattern is as a basis for modifying and normalizing an experimental fragment pattern to enhance the reliability of the interpretation of the experimental data.
- the standard fragment pattern is not used as a comparator for identifying deviations from the expected or "normal" sequence, and in fact is used in a manner which assumes that the experimental sequence will conform to the expected sequence.
- a feature of the standard fragment pattern which is important for some uses is that it results in a minimum of (and preferably no) ambiguities in base-calling when combined with the standard fragment patterns from the three other sequencing channels.
- the human operator may prefer to empirically determine which fragment patterns from which lanes work best together in order to determine the standard fragment patterns for each sequencing lane.
- a standard fragment pattern may be used in different ways to provide improved read-length, accuracy and speed of sample analysis. These improvements rely on comparison of an experimental sample fragment pattern with the standard fragment pattern to determine one or more "normalization coefficients" for the particular experimental fragment pattern.
- the normalization coefficients reflect the displacement, stretching or shrinking, and rate of stretching or shrinking of the clean fragment pattern, or segments thereof, which are necessary to obtain a suitably high degree of correlation between the clean fragment pattern and a standard fragment pattern which represents the positions of the selected nucleic acid base within a standard polymer actually having the known sequence as a function of migration time or distance.
- the normalization coefficients are then applied to the clean fragment pattern to produce a normalized fragment pattern which is used for base-calling in a conventional manner.
- the process of comparing the clean fragment pattern and the standard fragment pattern to arrive at normalization coefficients can be carried out in any number of ways.
- suitable processes involve consideration of a number of trial normalizations, and selection of the trial normalization which achieves the best fit in the model being employed.
- useful comparison procedures are set forth below. The procedures result in the development of normalization coefficients which, when applied to an experimental fragment pattern, shift, stretch or shrink the experimental fragment pattern to achieve a high degree of overlap with the standard fragment pattern.
- the term "high degree of normalization" refers to the maximization of the normalization which is achievable within practical constraints.
- a point-for-point correlation coefficient calculated for normalized fragment patterns and the corresponding standard fragment pattern of at least 0.8 is desirable, while a correlation coefficient of at least 0.95 is preferred.
- Fig. 2 illustrates one correlation method of the instant invention.
- Fig. 2A illustrates a clean fragment pattern obtained using a Visible Genetics MicroGene Blaster(tm).
- the signal records the T lane of a pUC18 sequencing run over the first 260 nucleotides (nt) of the subject nucleic acid molecule.
- the Y axis is an arbitrary representation of signal intensity; the X axis represents a time of 0 to 5 minutes. In the sequencing run shown, the peaks are cleanly separated.
- Fig. 2B represents the standard fragment pattern for the T lane of the first 260 nucleotides from the universal primer of pUC18 prepared using Sequenase 2.0 (United States Biochemical, Cleveland) and detected on a Visible Genetics MicroGene Blaster(tm). The standard sequence was selected by a human operator as the most typical fragment pattern from 25 trial runs.
- the experimental fragment pattern of Fig. 2A may be compared with the standard fragment pattern of Fig. 2B according to the equation: [equation not recoverable from this extract; only the upper summation limit, M-1, survives]
- Fig. 2C shows the correlation values of the entire window of Lane A against the entire window of Lane B as lane A is translated relative to lane B. (As the window is shifted, it effectively wraps around, such that the End and Origin points appear to be side by side). The result shows maximum correlation at point P, which corresponds to a preferred correlation shift of +40 data points.
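A whole-window comparison with wrap-around, as described for Fig. 2C, might be sketched as below. This is an illustration under stated assumptions: the two patterns are NumPy arrays of equal length, the correlation at each trial shift is taken as a plain sum of products, and the sign convention (a positive shift moves the sample toward later data points) is chosen for the example. The function name is hypothetical.

```python
import numpy as np

def best_circular_shift(sample, standard):
    """Return (shift, scores) for the wrap-around shift with highest correlation.

    scores[s] is the sum of products between the standard and the sample
    moved s data points later, with points leaving the End re-entering at
    the Origin. Shifts past half the window are reported as negative.
    """
    sample = np.asarray(sample, dtype=float)
    standard = np.asarray(standard, dtype=float)
    n = len(sample)
    scores = np.array([np.dot(np.roll(sample, s), standard) for s in range(n)])
    best = int(np.argmax(scores))
    if best > n // 2:
        best -= n
    return best, scores
```

Plotting `scores` against the trial shift gives a correlogram of the kind described for Fig. 2C, with its maximum at the preferred correlation shift.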
- Fig. 2 illustrates comparison of a complete experimental fragment pattern and a complete standard fragment pattern.
- the only normalization coefficient determined is the shift which results in the highest level of correlation.
- This simple model lacks the robustness which is needed for general applicability. Thus for most purposes, a more complex analysis is required to obtain good normalization.
- One way to take into account the experimental variability in migration rate caused by inconsistency of sample preparation chemistry, sample loading, gel material, gel thickness, electric field density, clamping/securing of gel in instrument, detection rate and other aspects of the electrophoresis process is to assign the data points of the clean fragment pattern to one or more segments or "windows."
- Each window includes an empirically determined number of data points, generally in the range of 100 to 10000 data points. Windows may be of variable size within a given data series, if desired.
- the starting data point of each window is designated Origin; the final data point in a window is designated End.
- Each window of the experimental fragment pattern is then compared with a comparable number of data points making up the standard fragment pattern using the same procedure described above.
- Figs. 3A and 3B illustrate the effect of increasing the number of segments or windows into which the experimental data is divided.
- the experimental fragment pattern from Fig. 2A was divided into three windows, and each was evaluated individually. Instead of the single offset of +40 data points found using a single window, the use of three windows results in an increasing degree of shift throughout the run, i.e., +24, +34 and +50 in the successive windows reading from right to left.
- Fig. 3B shows the use of five windows on the same experimental fragment, and results in even clearer resolution, with successive shifts of +16, +23, +35, +48, and +51 for the windows. Simply put, the consequence of too few windows is a lack of precision in shifting information. This may cause problems in base-calling aligned data. It is therefore desirable to use more than one window in the correlation process.
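The per-window evaluation can be sketched as follows. This is only one possible arrangement: it assumes equal-sized windows, a sum-of-products score, and a search limited to a hypothetical `max_shift` on either side of each window's nominal position. Trial shifts that would move a window off the end of the standard are skipped, so the standard should extend at least as far as the sample.

```python
import numpy as np

def window_shifts(sample, standard, n_windows=3, max_shift=80):
    """Preferred shift (in data points) for each window of the sample."""
    sample = np.asarray(sample, dtype=float)
    standard = np.asarray(standard, dtype=float)
    # Origin and End of each window, as indices into the sample.
    edges = np.linspace(0, len(sample), n_windows + 1).astype(int)
    shifts = []
    for origin, end in zip(edges[:-1], edges[1:]):
        window = sample[origin:end]
        best_shift, best_score = 0, -np.inf
        for s in range(-max_shift, max_shift + 1):
            lo, hi = origin + s, end + s
            if lo < 0 or hi > len(standard):
                continue  # shifted window would fall outside the standard
            score = float(np.dot(window, standard[lo:hi]))
            if score > best_score:
                best_shift, best_score = s, score
        shifts.append(best_shift)
    return shifts
```

Calling the function with `n_windows=3` and then `n_windows=5` on the same data reproduces the kind of comparison described for Figs. 3A and 3B, where more windows give a finer picture of how the shift changes along the run.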
- shifted(i) = sample(i) + ((sample(i) * m) + b)
- Fig. 5 when the peaks identified in the sample window (Lane B) do not align with the standard data (Lane A) (Fig. 5A), they may be aligned for analysis purposes by padding the elastically shifted data with zeros when the formula produces values outside of the sample data's range (Fig. 5B). While the use of multiple windows increases the accuracy of the alignment, a potential problem arises when too many windows are used.
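One way to read the elastic shift is sketched below, with the interpretation stated as an assumption: a straight line with slope m and intercept b is fitted to the preferred shift of each window against data point number (as plotted in Fig. 4), and each data point i is then moved to position i + (m*i + b). Positions that would draw on data outside the sample's range are padded with zeros, as described for Fig. 5B. The function names are hypothetical, and linear interpolation is used for the resampling, which the source does not specify.

```python
import numpy as np

def fit_shift_line(window_centers, window_shifts):
    """Least-squares line through (data point number, preferred shift)."""
    m, b = np.polyfit(window_centers, window_shifts, 1)
    return float(m), float(b)

def elastic_shift(sample, m, b):
    """Move data point i to i + (m*i + b), zero-padding outside the data.

    Assumes m > -1 so that the mapping keeps the data points in order.
    """
    sample = np.asarray(sample, dtype=float)
    n = len(sample)
    target = np.arange(n, dtype=float)
    # Invert i -> i*(1 + m) + b to find which input position feeds each output.
    source = (target - b) / (1.0 + m)
    return np.interp(source, np.arange(n), sample, left=0.0, right=0.0)
```

With m = 0 the operation reduces to a rigid shift of b data points; a non-zero m stretches or shrinks the pattern progressively along the run.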
- Fig. 4C when windows include too few features, the correlation between the data in the window and the standard fragment pattern becomes meaningless. In Fig. 4C, window size has dropped below 1000 data points.
- One window which includes a single peak is found to have highest correlation with a peak distantly removed from the location where it would otherwise be expected to correlate. This situation demonstrates that the human operator must be sensitive to the unique circumstances of each standard fragment pattern to determine the optimum number of data points per window.
- One method wherein windows with fewer data points can be employed is to limit the amount of the standard fragment pattern against which the window is correlated. Again, such a limitation would be empirically determined as in the other data filters employed. It is found experimentally that correlation of a sample window with that region of the standard fragment pattern that falls approximately at the same number of data points from the start of signal, and includes twice as many data points as the sample window, is sufficient to obtain correlations which are not often spurious.
- GAs The conceptual basis of GAs is Darwinian "survival of the fittest." In nature, individuals compete for resources (e.g., food, shelter, mates, etc.). Those individuals which are most highly adapted for their environment tend to produce more offspring. GAs attempt to mimic this process by "evolving" solutions to problems.
- GAs operate on a "population" of individuals, each of which is a possible solution to a given problem.
- Each individual in the starting population is assigned a unique binary string which can be considered to represent that individual's "genotype."
- the decimal equivalent of this binary genotype is referred to as the "phenotype."
- a fitness function operating on the phenotype reflects how well a particular individual solves the problem.
- a suitable approach to this optimization uses a binary string as the genotype for each individual, which is divided into three sections representing the three coefficients as shown in Fig. 7. The size of each section is dependent on the range of possible values of each coefficient and the resolution desired. The phenotype of the individual is determined by decoding each section to the corresponding decimal value.
- a binary string for use in solving the problem presented by this invention may contain 32 bits of which 8 bits specify the offset coefficient c, 13 bits specify the relative velocity b, and 11 bits specify the relative acceleration a.
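Decoding such a 32-bit genotype into its three coefficients might look like the sketch below. The bit widths (8, 13 and 11) come from the text; the order of the sections within the string and the numeric range assigned to each coefficient are assumptions made for illustration, since neither is given here. The function name and the `ranges` argument are hypothetical.

```python
def decode_genotype(bits, ranges):
    """Decode a 32-character string of '0'/'1' into coefficients c, b and a.

    `ranges` maps each coefficient name to an assumed (low, high) interval;
    each section's integer value is scaled linearly onto that interval.
    """
    fields = (("c", 8), ("b", 13), ("a", 11))  # assumed section order
    if len(bits) != sum(width for _, width in fields):
        raise ValueError("genotype must contain exactly 32 bits")
    phenotype, pos = {}, 0
    for name, width in fields:
        raw = int(bits[pos:pos + width], 2)   # decimal value of the section
        low, high = ranges[name]
        phenotype[name] = low + (high - low) * raw / (2 ** width - 1)
        pos += width
    return phenotype
```

The width of each section sets the resolution: 13 bits for b give 8192 distinct values across its range, while 8 bits for c give 256.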
- the objective function used to measure the fitness of an individual is the intersection of the standard fragment pattern and an experimental fragment pattern produced by applying the second-order polynomial to the experimental fragment pattern. The intersection is defined by the equation
- x is the experimental fragment pattern
- y is the standard fragment pattern
- n is the number of data points. The intersection will be greatest when the two sequences are perfectly aligned.
- Calculating the fitness of each individual is a three step process. First, the individual's genotype is decoded, producing the values (phenotypes) for the three coefficients. Second, the coefficients are plugged into the second-order polynomial and the polynomial is used to modify the clean fragment pattern. Third, the intersection of the modified fragment pattern and the standard pattern is calculated. The intersection value is then assigned to the individual as its fitness value. About 20 generations are needed to align the two sequences using a population of 50 individuals with a mutation probability of 0.001 (i.e., 1 out of every 1000 bits mutated after crossover). Using conventional computer equipment this can be accomplished in approximately 8 seconds. This time period is sufficiently short that all calculations can be run for a standardized period of time, rather than to a selected degree of convergence. This substantially simplifies experimental design.
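The three-step fitness calculation and the surrounding search can be sketched as a small Genetic Algorithm. Several parts are assumptions, because the text does not fix them: the intersection is taken as the sum of point-wise minima of the two patterns (which presumes non-negative signals), the polynomial a*j^2 + b*j + c is read as the input position that feeds output data point j, the coefficient ranges are invented for the example, selection is fitness-proportional with single-point crossover, and the best individual is carried over unchanged each generation. Population size, generation count and mutation probability follow the text.

```python
import numpy as np

# (name, bits, assumed value range); bit widths follow the 8/13/11 split.
FIELDS = (("c", 8, (-64.0, 64.0)),
          ("b", 13, (0.8, 1.2)),
          ("a", 11, (-1e-4, 1e-4)))
N_BITS = sum(width for _, width, _ in FIELDS)  # 32

def decode(bits):
    """Step 1: genotype (array of 0/1) to phenotype (a, b, c)."""
    values, pos = {}, 0
    for name, width, (low, high) in FIELDS:
        raw = int("".join(str(int(v)) for v in bits[pos:pos + width]), 2)
        values[name] = low + (high - low) * raw / (2 ** width - 1)
        pos += width
    return values["a"], values["b"], values["c"]

def warp(pattern, a, b, c):
    """Step 2: output point j reads input position a*j^2 + b*j + c."""
    j = np.arange(len(pattern), dtype=float)
    return np.interp(a * j * j + b * j + c, j, pattern, left=0.0, right=0.0)

def fitness(bits, sample, standard):
    """Step 3: intersection, assumed here to be the sum of point-wise minima."""
    return float(np.minimum(warp(sample, *decode(bits)), standard).sum())

def evolve(sample, standard, pop_size=50, generations=20, p_mut=0.001, seed=0):
    """Return ((a, b, c), best fitness per generation)."""
    rng = np.random.default_rng(seed)
    pop = rng.integers(0, 2, size=(pop_size, N_BITS))
    history = []
    for _ in range(generations):
        fit = np.array([fitness(ind, sample, standard) for ind in pop])
        best = pop[np.argmax(fit)].copy()
        history.append(float(fit.max()))
        # Reproduction: fitter individuals are more likely to be parents.
        weights = fit - fit.min() + 1e-12
        pairs = rng.choice(pop_size, size=(pop_size, 2), p=weights / weights.sum())
        parents = pop[pairs]
        # Single-point crossover, then bit-flip mutation.
        cuts = rng.integers(1, N_BITS, size=pop_size)
        from_first = np.arange(N_BITS) < cuts[:, None]
        children = np.where(from_first, parents[:, 0], parents[:, 1])
        flips = rng.random(children.shape) < p_mut
        pop = np.where(flips, 1 - children, children)
        pop[0] = best  # elitism: an added safeguard, not stated in the text
    fit = np.array([fitness(ind, sample, standard) for ind in pop])
    history.append(float(fit.max()))
    return decode(pop[np.argmax(fit)]), history
```

Because the loop runs for a fixed number of generations rather than to a convergence criterion, its running time is predictable, which is the simplification the text points to.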
- the second-order polynomial will be unable to normalize the two sequences. This is due to variations in the velocity of the experimental fragment pattern which are greater than second-order. This is easily handled by using a higher order polynomial, for example a third- or fourth-order polynomial, and a larger binary genotype to include the extra coefficients; or by simply dividing the experimental fragment pattern into segments or windows such that each segment's variations are at most second-order.
- normalized fragment patterns may be used in various ways including base-calling and mutation detection. For purposes of determining the complete sequence of all four bases in the sample polymer, this will generally involve the superposition of the normalized fragment patterns for each of the four bases. This can be done by designating a starting point or other "alignment point" in each fragment, and aligning those points to position the aligned fragment patterns. Alternatively, the fragments can be aligned using a reference peak as disclosed in US Patent Application Serial No. 08/452,719 filed May 30, 1995, which is incorporated herein by reference.
- Fig. 8 illustrates the exercise of base-calling of aligned data, as obtained from a Pharmacia A.L.F. Sequencer and processed using HELIOS (tm) software.
- base-calling may be by any method known in the prior art, using aligned fragment patterns for each of the four bases to provide a complete sequence.
- the minimum value used in peak detection varies with each sequence and must be set on a per-run basis.
- the well-known Fast Fourier Transform version of correlation is used to speed its calculation.
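The speed-up comes from the correlation theorem: the wrap-around correlation at every shift can be obtained at once from the product of one transform with the complex conjugate of the other. A minimal sketch, assuming two real-valued NumPy arrays of equal length and a hypothetical function name:

```python
import numpy as np

def circular_correlation_fft(sample, standard):
    """Wrap-around correlation at every shift, computed via the FFT.

    Element s of the result equals the sum of products between the standard
    and the sample moved s data points later (with wrap-around), i.e.
    np.dot(np.roll(sample, s), standard).
    """
    sample = np.asarray(sample, dtype=float)
    standard = np.asarray(standard, dtype=float)
    spectrum = np.fft.rfft(standard) * np.conj(np.fft.rfft(sample))
    return np.fft.irfft(spectrum, n=len(sample))
```

For n data points this takes on the order of n log n operations instead of the n squared needed when each shift is evaluated separately.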
- nucleotide sequence record can be utilized to detect specific mutations. This can be accomplished in a variety of ways, including amino acid translation, identification of untranslated signal sequences such as start codons, stop codons or splice site junctions. A preferred method involves determining correlations of the normalized fragment patterns against a standard to obtain specific diagnostic information about the presence of mutations.
- a region around each identified peak in the standard fragment pattern is correlated with the corresponding region in the normalized fragment pattern.
- the correlation will be low in locations where the two sequences differ, i.e., where there is a nucleotide variation, because of the high degree of alignment which normalization makes possible.
- correlation of a region extending approximately 20 data points on either side of a peak is desirable to compensate for small discrepancies which may remain.
- the correlation process is then repeated for each peak in the normalized fragment pattern. Instances of low correlation for any peak are indicative of a mutation.
- the correlation of the peaks of the normalized fragment pattern with the standard fragment pattern can be performed in several ways.
- One approach is to determine a standard correlation, using the equation for correlation shown above.
- This number ranges in value from zero to some arbitrarily large number, the value of which depends upon the two functions being correlated but which is not predictable a priori. This can create a problem in setting threshold levels defining high versus low correlation. It is therefore preferable to use a measure of correlation which has defined limits to the range of possible values.
- f_std is the standard deviation of function f
- g_std is the standard deviation of function g.
- the output is normalized to a value of between -1 and 1, inclusively.
- a value of 1 indicates total correlation
- a value of -1 indicates complete non-correlation.
- a gradient of correlation is supplied, and values which are above a pre-defined threshold, i.e., 0.8, could be flagged as suspect.
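The bounded correlation measure and the per-peak window test can be sketched as below. This is a hedged illustration, not the source's code: the function names, the treatment of flat windows, and the decision to flag windows whose coefficient falls below the threshold are assumptions of this sketch. The last choice follows the statement that low correlation for a peak indicates a mutation. The window half-width of 20 data points and the 0.8 threshold are taken from the text.

```python
import math


def correlation_coefficient(f, g):
    """Coefficient of correlation of two equal-length windows, in [-1, 1]."""
    n = len(f)
    mean_f = sum(f) / n
    mean_g = sum(g) / n
    covariance = sum((a - mean_f) * (b - mean_g) for a, b in zip(f, g)) / n
    f_std = math.sqrt(sum((a - mean_f) ** 2 for a in f) / n)
    g_std = math.sqrt(sum((b - mean_g) ** 2 for b in g) / n)
    if f_std == 0 or g_std == 0:
        return 0.0  # assumption: a flat window is treated as uncorrelated
    return covariance / (f_std * g_std)


def flag_low_correlation_peaks(standard, sample, peak_indices,
                               half_width=20, threshold=0.8):
    """Correlate a window around each standard peak; return suspect peaks.

    Returns a list of (peak_index, coefficient) for windows whose
    coefficient is below `threshold`.
    """
    flagged = []
    for peak in peak_indices:
        start = max(0, peak - half_width)
        stop = min(len(standard), len(sample), peak + half_width + 1)
        r = correlation_coefficient(standard[start:stop], sample[start:stop])
        if r < threshold:
            flagged.append((peak, r))
    return flagged
```

Because the coefficient is divided by both standard deviations, a single threshold can be applied to every peak regardless of signal amplitude, which is the advantage the text attributes to a measure with a defined range.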
- An alternative to determining the coefficient of correlation is to use the function
- the standard fragment pattern allows the resolution of nucleotide sequence where ambiguities occur, such as compressions and loss of single nucleotide resolution.
- the present invention permits automated analysis of many of the ambiguities which are simply rejected as uninterpretable by known sequencing techniques and equipment.
- Compressions are thought to result from short hairpin hybridizations at one end of the nucleic acid molecule which tend to cause a molecule to travel faster through an electrophoresis gel than would be expected on the basis of size.
- the resulting appearance in the fragment pattern is illustrated in Fig. 9.
- These compressions may consist of overlapping peaks within one lane that give one large peak, or, they may be peaks from different lanes that overlap when combined together in the alignment process.
- a base-calling method is not able to determine the number or order of the bases in the compression, because it is unable to distinguish the correct ordering of bands. Examination reveals a peak (Peak A) which is clearly wider than a singleton peak (Peaks B and C) but is otherwise indefinable.
- the method and system of the instant invention assigns the correct order and the correct number of nucleotides based on what is known about the standard fragment pattern.
- the standard fragment pattern includes regions of compressions that are typical of a given nucleotide sequence.
- a compression can be characterized by the following features (Fig. 9): Peak Height (Ph); Peak Width at half Ph (Pw); Peak Area (Pa, not shown); Centering of Ph on Pw (Cnt, not shown).
- a compression is characterized in the trial runs by these features, and the ratios between the features. An average and standard deviation are calculated for each ratio. The more precise and controlled the trial runs have been, the lower the standard deviation will be. The inclusiveness of the standard deviation must be broad enough to encompass the degree of accuracy sought in base-calling. A standard deviation which includes only 90% of samples will permit miscalling in 10% of samples, a number which may or may not be too high to be usefully employed. Once ascertained, the compression statistics are recorded in association with the ambiguous peak. These statistics are associated with each compression and herein called a "standard compression."
- Each standard compression can be assigned a nucleotide base sequence upon careful investigation.
- researchers resolve compressions by numerous techniques, which though more cumbersome or less useful, serve to reveal the actual underlying nucleotide sequence. These techniques include: sequencing from primers nearer to the compression, sequencing the opposite strand of DNA, electrophoresis in more highly denaturing conditions, etc. Once the actual base sequence is determined, it can be assigned as a group to the compression, thus relieving the researcher from further time consuming exercises to resolve it.
- Regions of the normalized fragment patterns which do not show discrete peaks for base-calling are tested for the existence of known compressions. If no compression is known for the region, the area is flagged for the human operator to examine as a possible new mutation.
- the ratios of the peak are determined as above. If the peak falls within the standard deviation of all the ratios determined from the trial runs, it is then assigned the sequence of the standard compression. Figs. 9A and 9B identify the actual nucleotides assigned to a standard compression. Where, however, the ratios determined for the compression fall outside of the standard deviation, there lies the possibility of mutation. In this case, the ratios of the compression are compared to all known and previously observed mutations in the standard compression. If the compression falls within any of the previously identified mutations in the region, it may be identified as corresponding to such a mutation. If the ratios fall outside of any known standard, the area is flagged for examination by the human operator as an example of a possible new and hitherto unobserved mutation.
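The compression test described above (feature ratios, their mean and standard deviation over trial runs, then a three-way decision) can be sketched as follows. Everything named here is hypothetical: the dictionary layout, the `n_sigma` tolerance that stands in for "within the standard deviation", and the example base sequences are assumptions for illustration only. Feature values are assumed to be positive numbers.

```python
import statistics
from itertools import combinations

FEATURES = ("Ph", "Pw", "Pa", "Cnt")  # height, width at half height, area, centering


def feature_ratios(peak):
    """All pairwise ratios between the four peak features."""
    return {f"{a}/{b}": peak[a] / peak[b] for a, b in combinations(FEATURES, 2)}


def build_standard(trial_peaks, sequence):
    """Mean and sample standard deviation of each ratio over the trial runs."""
    ratios = [feature_ratios(p) for p in trial_peaks]
    stats = {}
    for name in ratios[0]:
        values = [r[name] for r in ratios]
        stats[name] = (statistics.mean(values), statistics.stdev(values))
    return {"sequence": sequence, "stats": stats}


def matches(peak, standard, n_sigma=2.0):
    """True if every ratio lies within n_sigma standard deviations of its mean."""
    ratios = feature_ratios(peak)
    return all(abs(ratios[name] - mean) <= n_sigma * sd
               for name, (mean, sd) in standard["stats"].items())


def classify_compression(peak, standard, known_mutations, n_sigma=2.0):
    """Assign the standard sequence, a known mutation, or flag for review."""
    if matches(peak, standard, n_sigma):
        return ("standard", standard["sequence"])
    for mutation in known_mutations:
        if matches(peak, mutation, n_sigma):
            return ("known mutation", mutation["sequence"])
    return ("flag for operator", None)
```

Widening `n_sigma` trades sensitivity for fewer false flags, which mirrors the text's remark that the inclusiveness of the standard deviation must suit the accuracy sought in base-calling.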
- a further application of the present invention is for base-calling beyond the limits of single nucleotide resolution.
- the standard fragment pattern will define a region where single nucleotide resolution is not observed.
- resolution may fail around 200 nts.
- some apparatus are known to produce read-lengths of over 700 nts.
- the instant invention relies on normalization using a standard fragment pattern to resolve the ambiguous wave forms beyond the limit of single nucleotide resolution. The method is essentially the same as a series of compression analyses as described hereinabove.
- Normalized fragment patterns prepared as described hereinabove may be sequentially analyzed for consistency with the expected ratios of each peak-like feature. Any wave form which does not fall within the parameters of the standard peaks is classified as anomalous and flagged for further investigation.
- with the instant invention it is possible to compare any base specific experimental fragment pattern, for example the T lane of the patient sample, to the base specific standard fragment pattern for that T lane.
- the features of the standard fragment pattern can be used to identify differences within the test lane of the sample and thus provide information about the sample.
- This aspect of the invention follows the normalization step described hereinabove. The degree of correlation of a window at the preferred normalization is plotted against the shifted origin data point of the window, effectively describing a cross-correlogram.
- FIG. 10 shows the maximum and minimum correlation values obtained across the entire length of the standard fragment pattern, as determined from a plurality of trial runs. A standard deviation can be determined after a sufficient number of trial runs. Data from a test sample is also plotted. As illustrated, one window is found to deviate substantially from its expected degree of correlation. The failure to correlate as expected suggests that the window contains a mutation or other difference from the standard. The system of the invention would cause such a window to be flagged for closer examination by the human operator. Alternatively, the window could be reported directly to the patient file for use in diagnosis. In a further alternative, the window would not be reported to the human operator until base-calling had further confirmed that there was a mutation present in the area represented by the window.
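The comparison of a test sample's cross-correlogram against the range seen in trial runs can be sketched as below. This is an illustrative reading rather than the source's procedure: the function names and the `margin` allowance are assumptions, and each correlogram is assumed to be a list holding one correlation value per window, in the same window order for every run.

```python
def correlation_envelope(trial_correlograms):
    """Per-window (minimum, maximum) correlation over all trial runs."""
    return [(min(column), max(column)) for column in zip(*trial_correlograms)]


def deviating_windows(sample_correlogram, envelope, margin=0.05):
    """Indices of windows whose correlation lies outside the trial envelope.

    `margin` widens the envelope slightly so that small run-to-run
    variation is not reported; its value here is an arbitrary assumption.
    """
    flagged = []
    for index, (value, (low, high)) in enumerate(zip(sample_correlogram, envelope)):
        if value < low - margin or value > high + margin:
            flagged.append(index)
    return flagged


# Usage: three trial runs and one test sample, four windows each.
trials = [
    [0.95, 0.93, 0.97, 0.96],
    [0.94, 0.95, 0.96, 0.97],
    [0.96, 0.94, 0.98, 0.95],
]
sample = [0.95, 0.94, 0.40, 0.96]
suspect = deviating_windows(sample, correlation_envelope(trials))
```

With enough trial runs, the minimum and maximum could be replaced by a mean and standard deviation per window, as the text suggests.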
- the cross-correlogram mutation detection is a method of "single lane base-calling" wherein the signal from a single nucleotide run is used to identify the presence or absence of differences from the standard fragment pattern.
- a useful embodiment of this aspect of the invention is for identification of infectious diseases in patient samples. Many groups of diagnostically-significant bacteria, viruses, fungi and the like all contain regions of DNA which are unique to an individual species, but which are nevertheless amplifiable using a single set of amplification primers due to commonality of genetic code within related species. Diagnostic tests for such organisms may not quickly distinguish between species within such groups.
- a method for classifying a sample of a nucleic acid as a particular species within a group of commonly-amplifiable nucleic acid polymers utilizes at least one sample fragment pattern representing the positions of a selected type of nucleic acid base within the sample nucleic acid polymer.
- a set of one or more normalization coefficients is determined for the sample fragment pattern. These sets of normalization coefficients are then applied to the sample fragment pattern to obtain a plurality of trial fragment patterns, which are correlated with the corresponding standard fragment patterns.
- the sample is classified as belonging to the species for which the trial fragment pattern has the highest correlation with its corresponding standard fragment pattern, provided that the correlation is over a pre-defined threshold.
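The final selection step of the species classification can be sketched as follows. This is a simplified illustration: the names are hypothetical, the trial fragment patterns are assumed to have been produced already by applying each species' normalization coefficients to the sample, and the coefficient of correlation is used as the correlation measure.

```python
import math


def pearson(f, g):
    """Coefficient of correlation of two equal-length patterns."""
    n = len(f)
    mean_f = sum(f) / n
    mean_g = sum(g) / n
    numerator = sum((a - mean_f) * (b - mean_g) for a, b in zip(f, g))
    denominator = math.sqrt(sum((a - mean_f) ** 2 for a in f)
                            * sum((b - mean_g) ** 2 for b in g))
    return numerator / denominator if denominator else 0.0


def classify_species(trial_patterns, standard_patterns, threshold=0.8):
    """Pick the species whose standard best matches its trial pattern.

    trial_patterns[name] is the sample after normalization against the
    standard for species `name`. Returns (species or None, all scores).
    """
    scores = {name: pearson(trial_patterns[name], standard_patterns[name])
              for name in standard_patterns}
    best = max(scores, key=scores.get)
    if scores[best] < threshold:
        return None, scores  # no species correlates well enough
    return best, scores
```

Returning `None` when no score clears the threshold keeps a poorly matching sample from being forced into the nearest species.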
- This aspect of the invention is useful in identifying which allele of a group of alleles is present in a gene.
- the method is also useful in identifying individual species from among a group of genetic variants of a disease-causing microorganism, and in particular genetic variants of human immunodeficiency virus.
- a further variation of the invention which may be useful in certain conditions is the reduction of the experimental and standard fragment patterns into square wave data.
- Square wave data is useful when the signal obtained is highly reproducible from run to run.
- the main advantage of a square wave data format is that it includes a maximum of information content and a minimum of noise.
- the standard fragment pattern may be reduced to a square wave by a number of means.
- the transition from zero to one occurs at the inflection point on each slope of a peak.
- the inflection points are found by using the zero crossings of a function that is the convolution of the data function with a function that is the second derivative of a gaussian pulse that is about one half the width of a single base pair pulse in the original data sequence. This derives inflection points with relatively little addition of noise due to the differentiation process. Any data point value greater than the inflection point on that slope of the peak is assigned 1. Any value below the inflection point is assigned 0.
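The square-wave reduction can be sketched as below. This is one possible reading of the passage, not the source's code. The data are convolved with the second derivative of a Gaussian, and points where the result is negative are set to 1, since these lie between the two inflection points of a peak. The function names, the `kernel_sigma` parameter, the edge handling, and the small `tolerance` are assumptions. How the "one half the width" rule maps onto `kernel_sigma` is left to the caller.

```python
import math


def gaussian_second_derivative_kernel(sigma):
    """Sampled second derivative of exp(-x^2 / (2 sigma^2)), truncated at 4 sigma."""
    radius = int(math.ceil(4 * sigma))
    kernel = [(x * x / sigma ** 4 - 1 / sigma ** 2)
              * math.exp(-x * x / (2 * sigma * sigma))
              for x in range(-radius, radius + 1)]
    # Remove the small offset left by truncation so a flat baseline maps to zero.
    offset = sum(kernel) / len(kernel)
    return [k - offset for k in kernel]


def convolve_same(data, kernel):
    """Convolution with output the same length as data (edges replicated)."""
    radius = len(kernel) // 2
    out = []
    for i in range(len(data)):
        total = 0.0
        for j, k in enumerate(kernel):
            index = min(max(i + radius - j, 0), len(data) - 1)
            total += k * data[index]
        out.append(total)
    return out


def to_square_wave(data, kernel_sigma, tolerance=1e-6):
    """1 between the inflection points of each peak, 0 elsewhere."""
    curvature = convolve_same(data, gaussian_second_derivative_kernel(kernel_sigma))
    # Zero crossings of `curvature` are the inflection points; the smoothed
    # signal is concave (negative curvature) between them.
    return [1 if c < -tolerance else 0 for c in curvature]
```

Differentiating and smoothing in a single convolution is what limits the noise that a plain finite-difference second derivative would add.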
- the peaks on the square wave are identified and assigned nucleotide sequences. Peaks may be assigned one or more nucleotides as determined by the human operator on the basis of the standard fragment pattern. Peaks are then given identifying characteristics such as a sequential peak number, a standard peak width, a standard gap width on either side of the peak and standard deviations with these characteristics.
- When a sample fragment pattern is obtained, it is reduced to a square wave format, again on the basis of the inflection point data as described above. Peak numbers are assigned.
- the sample square wave may then be used in different ways to identify mutations. In one method, it may be used to align the four different nucleotide data streams as in the method of the invention described hereinabove. Alternatively, analysis may be purely statistical.
- the peak width and gap width of the sample can be directly compared to the standard square wave. If the sample characteristics fall within the standard deviation of the standard, taking into account permissible elasticity of the peaks, then the sample is concluded to be the same as the standard. If the peaks of the sample cannot be fit within the terms of the standard, then the presence of a mutation is concluded and reported.
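The statistical comparison of peak and gap widths can be sketched as follows. The names, the run-length representation, and the `n_sigma` tolerance used to model "permissible elasticity" are assumptions of this sketch. Leading and trailing gaps are compared along with the interior ones.

```python
def run_lengths(square):
    """Split a 0/1 sequence into (value, length) runs: gaps are 0, peaks are 1."""
    runs = []
    for value in square:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [(value, length) for value, length in runs]


def same_as_standard(sample_square, standard_runs, n_sigma=2.0):
    """Compare sample peak and gap widths with the standard square wave.

    standard_runs is a list of (value, mean_width, width_std) in order.
    Returns True if every run matches; False indicates a possible mutation.
    """
    sample_runs = run_lengths(sample_square)
    if len(sample_runs) != len(standard_runs):
        return False  # a peak is missing or an extra peak is present
    for (value, width), (std_value, mean, sd) in zip(sample_runs, standard_runs):
        if value != std_value or abs(width - mean) > n_sigma * sd:
            return False
    return True
```

A `False` result would be reported as a mutation in the scheme the text describes. In practice the operator would probably also want to know which peak number failed.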
- the present invention is advantageously implemented using any multipurpose computer, including those generally referred to as personal computers and mini-computers, programmed to determine normalization coefficients by comparison of an experimental and a standard fragment pattern.
- a computer will include at least one central processor 110, for example an Intel 80386, 80486 or Pentium® processor or Motorola
- a storage device such as a hard disk 111, for storing standard fragment patterns, and means for receiving raw or clean experimental fragment patterns such as wire 112 shown connected to the output of an electrophoresis apparatus 113.
- the processor 110 is programmed to perform the comparison of the experimental fragment pattern and the standard fragment pattern and to determine normalization coefficients based on the comparison.
- This programming may be permanent, as in the case where the processor is a dedicated EEPROM, or it may be transient, in which case the programming instructions are loaded from the storage device or from a floppy diskette or other transportable media.
- the normalization coefficients may be output from the computer in print form using printer 114; on a video display 115; or via a communications link 116 to another processor 117. Alternatively or additionally, the normalization coefficients may be utilized by the processor 110 to normalize the experimental fragment pattern for use in base-calling or other diagnostic evaluation.
- the apparatus may also include programming for applying the normalization coefficients to the experimental fragment pattern to obtain a normalized fragment pattern, and for aligning the normalized fragment patterns and evaluating the nucleic acid sequence of the sample therefrom.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE69601720T DE69601720T2 (en) | 1995-06-30 | 1996-06-28 | METHOD AND DEVICE FOR DETERMINING DNA SEQUENCE AND DETECTING MUTATIONS |
EP96923560A EP0835442B1 (en) | 1995-06-30 | 1996-06-28 | Method and system for dna sequence determination and mutation detection |
JP9505254A JPH11509622A (en) | 1995-06-30 | 1996-06-28 | Methods and systems for DNA sequencing and mutation detection |
AU64039/96A AU700410B2 (en) | 1995-06-30 | 1996-06-28 | Method and system for DNA sequence determination and mutation detection |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US08/497,202 | 1995-06-30 | ||
US08/497,202 US5853979A (en) | 1995-06-30 | 1995-06-30 | Method and system for DNA sequence determination and mutation detection with reference to a standard |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1997002488A1 true WO1997002488A1 (en) | 1997-01-23 |
Family
ID=23975872
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US1996/011130 WO1997002488A1 (en) | 1995-06-30 | 1996-06-28 | Method and system for dna sequence determination and mutation detection |
Country Status (7)
Country | Link |
---|---|
US (2) | US5853979A (en) |
EP (1) | EP0835442B1 (en) |
JP (1) | JPH11509622A (en) |
AU (1) | AU700410B2 (en) |
CA (1) | CA2225385A1 (en) |
DE (1) | DE69601720T2 (en) |
WO (1) | WO1997002488A1 (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1998054669A1 (en) * | 1997-05-28 | 1998-12-03 | Amersham Pharmacia Biotech Ab | A method and a system for nucleic acid sequence analysis |
WO2001020040A2 (en) * | 1999-09-16 | 2001-03-22 | Mj Research, Inc. | Method and compositions for evaluating resolution of nucleic acid separation systems |
US6671625B1 (en) | 1999-02-22 | 2003-12-30 | Vialogy Corp. | Method and system for signal detection in arrayed instrumentation based on quantum resonance interferometry |
US6780589B1 (en) | 1999-02-22 | 2004-08-24 | Vialogy Corp. | Method and system using active signal processing for repeatable signal amplification in dynamic noise backgrounds |
EP1910556A1 (en) * | 2004-07-20 | 2008-04-16 | Conexio 4 Pty Ltd | Method and apparatus for analysing nucleic acid sequence |
US7371276B2 (en) | 2002-08-07 | 2008-05-13 | Ishihara Sangyo Kaisha, Ltd. | Titanium dioxide pigment and method for producing the same and resin composition using the same |
US7501245B2 (en) | 1999-06-28 | 2009-03-10 | Helicos Biosciences Corp. | Methods and apparatuses for analyzing polynucleotide sequences |
US7645596B2 (en) | 1998-05-01 | 2010-01-12 | Arizona Board Of Regents | Method of determining the nucleotide sequence of oligonucleotides and DNA molecules |
US7666593B2 (en) | 2005-08-26 | 2010-02-23 | Helicos Biosciences Corporation | Single molecule sequencing of captured nucleic acids |
US7981604B2 (en) | 2004-02-19 | 2011-07-19 | California Institute Of Technology | Methods and kits for analyzing polynucleotide sequences |
US8484000B2 (en) | 2004-09-02 | 2013-07-09 | Vialogy Llc | Detecting events of interest using quantum resonance interferometry |
US9012144B2 (en) | 2003-11-12 | 2015-04-21 | Fluidigm Corporation | Short cycle methods for sequencing polynucleotides |
US9096898B2 (en) | 1998-05-01 | 2015-08-04 | Life Technologies Corporation | Method of determining the nucleotide sequence of oligonucleotides and DNA molecules |
Families Citing this family (37)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5605798A (en) | 1993-01-07 | 1997-02-25 | Sequenom, Inc. | DNA diagnostic based on mass spectrometry |
US6466923B1 (en) | 1997-05-12 | 2002-10-15 | Chroma Graphics, Inc. | Method and apparatus for biomathematical pattern recognition |
US6990221B2 (en) * | 1998-02-07 | 2006-01-24 | Biodiscovery, Inc. | Automated DNA array image segmentation and analysis |
US6349144B1 (en) | 1998-02-07 | 2002-02-19 | Biodiscovery, Inc. | Automated DNA array segmentation and analysis |
JP3091176B2 (en) * | 1998-02-18 | 2000-09-25 | 株式会社ヘレナ研究所 | Separation analysis test data processing device |
WO2000000637A2 (en) | 1998-06-26 | 2000-01-06 | Visible Genetics Inc. | Method for sequencing nucleic acids with reduced errors |
US6821402B1 (en) | 1998-09-16 | 2004-11-23 | Applera Corporation | Spectral calibration of fluorescent polynucleotide separation apparatus |
US20020009394A1 (en) | 1999-04-02 | 2002-01-24 | Hubert Koster | Automated process line |
US6334099B1 (en) * | 1999-05-25 | 2001-12-25 | Digital Gene Technologies, Inc. | Methods for normalization of experimental data |
JP2003500662A (en) * | 1999-05-25 | 2003-01-07 | ディジタル・ジーン・テクノロジーズ・インコーポレーテッド | Method and system for amplitude normalization and data peak selection |
US7099502B2 (en) * | 1999-10-12 | 2006-08-29 | Biodiscovery, Inc. | System and method for automatically processing microarrays |
US7917301B1 (en) * | 2000-09-19 | 2011-03-29 | Sequenom, Inc. | Method and device for identifying a biological sample |
US20030207297A1 (en) * | 1999-10-13 | 2003-11-06 | Hubert Koster | Methods for generating databases and databases for identifying polymorphic genetic markers |
US20030190644A1 (en) | 1999-10-13 | 2003-10-09 | Andreas Braun | Methods for generating databases and databases for identifying polymorphic genetic markers |
US6760668B1 (en) | 2000-03-24 | 2004-07-06 | Bayer Healthcare Llc | Method for alignment of DNA sequences with enhanced accuracy and read length |
JP3628232B2 (en) * | 2000-03-31 | 2005-03-09 | 三洋電機株式会社 | Microorganism identification method, microorganism identification device, method for creating a database for microorganism identification, and recording medium on which a microorganism identification program is recorded |
US6436641B1 (en) * | 2000-04-17 | 2002-08-20 | Visible Genetics Inc. | Method and apparatus for DNA sequencing |
US20020116135A1 (en) * | 2000-07-21 | 2002-08-22 | Pasika Hugh J. | Methods, systems, and articles of manufacture for evaluating biological data |
EP1423816A2 (en) * | 2000-08-14 | 2004-06-02 | Incyte Genomics, Inc. | Basecalling system and protocol |
US6681186B1 (en) | 2000-09-08 | 2004-01-20 | Paracel, Inc. | System and method for improving the accuracy of DNA sequencing and error probability estimation through application of a mathematical model to the analysis of electropherograms |
US7222059B2 (en) * | 2001-11-15 | 2007-05-22 | Siemens Medical Solutions Diagnostics | Electrophoretic trace simulator |
AU2003228809A1 (en) * | 2002-05-03 | 2003-11-17 | Sequenom, Inc. | Kinase anchor protein muteins, peptides thereof, and related methods |
US7512496B2 (en) * | 2002-09-25 | 2009-03-31 | Soheil Shams | Apparatus, method, and computer program product for determining confidence measures and combined confidence measures for assessing the quality of microarrays |
WO2004029298A2 (en) * | 2002-09-26 | 2004-04-08 | Applera Corporation | Mitochondrial dna autoscoring system |
WO2004050839A2 (en) | 2002-11-27 | 2004-06-17 | Sequenom, Inc. | Fragmentation-based methods and systems for sequence variation detection and discovery |
US20040215401A1 (en) * | 2003-04-25 | 2004-10-28 | Krane Dan Edward | Computerized analysis of forensic DNA evidence |
US9394565B2 (en) | 2003-09-05 | 2016-07-19 | Agena Bioscience, Inc. | Allele-specific sequence variation analysis |
US9249456B2 (en) | 2004-03-26 | 2016-02-02 | Agena Bioscience, Inc. | Base specific cleavage of methylation-specific amplification products in combination with mass analysis |
US7608394B2 (en) | 2004-03-26 | 2009-10-27 | Sequenom, Inc. | Methods and compositions for phenotype identification based on nucleic acid methylation |
US20070099227A1 (en) * | 2004-10-12 | 2007-05-03 | Curry Bo U | Significance analysis using data smoothing with shaped response functions |
US20060160102A1 (en) | 2005-01-18 | 2006-07-20 | Hossein Fakhrai-Rad | Identification of rare alleles by enzymatic enrichment of mismatched heteroduplexes |
US9388462B1 (en) * | 2006-05-12 | 2016-07-12 | The Board Of Trustees Of The Leland Stanford Junior University | DNA sequencing and approaches therefor |
US8126235B2 (en) * | 2008-04-04 | 2012-02-28 | Massachusetts Institute Of Technology | Methods and apparatus for automated base-calling on multiple DNA strands |
WO2017223515A1 (en) | 2016-06-23 | 2017-12-28 | F. Hoffman-La Roche Ag | Formation and calibration of nanopore sequencing cells |
US11124827B2 (en) | 2016-06-23 | 2021-09-21 | Roche Sequencing Solutions, Inc. | Period-to-period analysis of AC signals from nanopore sequencing |
WO2019129555A1 (en) | 2017-12-28 | 2019-07-04 | F. Hoffmann-La Roche Ag | Measuring and removing noise in stochastic signals from a nanopore dna sequencing system driven by an alternating signal |
US11435283B2 (en) | 2018-04-17 | 2022-09-06 | Sharif University Of Technology | Optically detecting mutations in a sequence of DNA |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2225139A (en) * | 1988-11-16 | 1990-05-23 | Atomic Energy Authority Uk | Method for spectrum matching |
DE4405251A1 (en) * | 1993-02-19 | 1994-08-25 | Olympus Optical Co | Method for processing electrophoretic data |
US5365455A (en) * | 1991-09-20 | 1994-11-15 | Vanderbilt University | Method and apparatus for automatic nucleic acid sequence determination |
US5419825A (en) * | 1991-07-29 | 1995-05-30 | Shimadzu Corporation | Base sequencing apparatus |
Family Cites Families (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE2056998A1 (en) * | 1970-11-20 | 1972-05-25 | Bodenseewerk Perkin Elmer Co | Peak integrator |
US4720786A (en) * | 1985-04-19 | 1988-01-19 | Fuji Photo Film Co., Ltd. | Method of compensating for offset distortion in rows of electrophoretic patterns |
US4941092A (en) * | 1985-05-23 | 1990-07-10 | Fuji Photo Film Co., Ltd. | Signal processing method for determining base sequence of nucleic acid |
EP0240729A3 (en) * | 1986-03-05 | 1988-08-24 | Fuji Photo Film Co., Ltd. | Method of analyzing autoradiograph for determining base sequence of nucleic acid |
US4811218A (en) * | 1986-06-02 | 1989-03-07 | Applied Biosystems, Inc. | Real time scanning electrophoresis apparatus for DNA sequencing |
US5246866A (en) * | 1987-12-23 | 1993-09-21 | Hitachi Software Engineering Co., Ltd. | Method for transcription of a DNA sequence |
US4960999A (en) * | 1989-02-13 | 1990-10-02 | Kms Fusion, Inc. | Scanning and storage of electrophoretic records |
US5108179A (en) * | 1989-08-09 | 1992-04-28 | Myers Stephen A | System and method for determining changes in fluorescence of stained nucleic acid in electrophoretically separated bands |
US5119316A (en) * | 1990-06-29 | 1992-06-02 | E. I. Du Pont De Nemours And Company | Method for determining dna sequences |
JP2814409B2 (en) * | 1990-11-30 | 1998-10-22 | 日立ソフトウェアエンジニアリング 株式会社 | Multicolor electrophoresis pattern reader |
JP2873884B2 (en) * | 1991-03-22 | 1999-03-24 | 日立ソフトウェアエンジニアリング 株式会社 | Multicolor electrophoresis pattern reader |
US5502773A (en) * | 1991-09-20 | 1996-03-26 | Vanderbilt University | Method and apparatus for automated processing of DNA sequence data |
US5308751A (en) | 1992-03-23 | 1994-05-03 | General Atomics | Method for sequencing double-stranded DNA |
DE592060T1 (en) * | 1992-10-09 | 1994-12-08 | Univ Nebraska | Digital DNS typing. |
US5273632A (en) * | 1992-11-19 | 1993-12-28 | University Of Utah Research Foundation | Methods and apparatus for analysis of chromatographic migration patterns |
US6017434A (en) * | 1995-05-09 | 2000-01-25 | Curagen Corporation | Apparatus and method for the generation, separation, detection, and recognition of biopolymer fragments |
US5916747A (en) | 1995-06-30 | 1999-06-29 | Visible Genetics Inc. | Method and apparatus for alignment of signals for use in DNA based-calling |
EP0914468B1 (en) | 1996-05-01 | 2002-08-28 | Visible Genetics Inc. | Method for sequencing of nucleic acid polymers |
EP0944739A4 (en) | 1996-09-16 | 2000-01-05 | Univ Utah Res Found | Method and apparatus for analysis of chromatographic migration patterns |
-
1995
- 1995-06-30 US US08/497,202 patent/US5853979A/en not_active Expired - Lifetime
-
1996
- 1996-06-28 EP EP96923560A patent/EP0835442B1/en not_active Expired - Lifetime
- 1996-06-28 CA CA002225385A patent/CA2225385A1/en not_active Abandoned
- 1996-06-28 WO PCT/US1996/011130 patent/WO1997002488A1/en active IP Right Grant
- 1996-06-28 DE DE69601720T patent/DE69601720T2/en not_active Expired - Lifetime
- 1996-06-28 AU AU64039/96A patent/AU700410B2/en not_active Ceased
- 1996-06-28 JP JP9505254A patent/JPH11509622A/en active Pending
-
1998
- 1998-11-12 US US09/190,756 patent/US6303303B1/en not_active Expired - Fee Related
Non-Patent Citations (1)
Title |
---|
L. B. KOUNTY: "AUTOMATED IMAGE ANALYSIS FOR DISTORTION COMPENSATION IN SEQUENCING GEL ELECTROPHORESIS", APPLIED SPECTROSCOPY, vol. 46, no. 1, January 1992 (1992-01-01), FREDERICK, MD, US, pages 136 - 141, XP000247314 * |
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6260034B1 (en) | 1997-05-28 | 2001-07-10 | Amersham Pharmacia Biotech Ab | Method and a system for nucleic acid sequence analysis |
WO1998054669A1 (en) * | 1997-05-28 | 1998-12-03 | Amersham Pharmacia Biotech Ab | A method and a system for nucleic acid sequence analysis |
US9957561B2 (en) | 1998-05-01 | 2018-05-01 | Life Technologies Corporation | Method of determining the nucleotide sequence of oligonucleotides and DNA molecules |
US9540689B2 (en) | 1998-05-01 | 2017-01-10 | Life Technologies Corporation | Method of determining the nucleotide sequence of oligonucleotides and DNA molecules |
US9458500B2 (en) | 1998-05-01 | 2016-10-04 | Life Technologies Corporation | Method of determining the nucleotide sequence of oligonucleotides and DNA molecules |
US9212393B2 (en) | 1998-05-01 | 2015-12-15 | Life Technologies Corporation | Method of determining the nucleotide sequence of oligonucleotides and DNA molecules |
US9096898B2 (en) | 1998-05-01 | 2015-08-04 | Life Technologies Corporation | Method of determining the nucleotide sequence of oligonucleotides and DNA molecules |
US10214774B2 (en) | 1998-05-01 | 2019-02-26 | Life Technologies Corporation | Method of determining the nucleotide sequence of oligonucleotides and DNA molecules |
US9725764B2 (en) | 1998-05-01 | 2017-08-08 | Life Technologies Corporation | Method of determining the nucleotide sequence of oligonucleotides and DNA molecules |
US7645596B2 (en) | 1998-05-01 | 2010-01-12 | Arizona Board Of Regents | Method of determining the nucleotide sequence of oligonucleotides and DNA molecules |
US10208341B2 (en) | 1998-05-01 | 2019-02-19 | Life Technologies Corporation | Method of determining the nucleotide sequence of oligonucleotides and DNA molecules |
US6671625B1 (en) | 1999-02-22 | 2003-12-30 | Vialogy Corp. | Method and system for signal detection in arrayed instrumentation based on quantum resonance interferometry |
US6780589B1 (en) | 1999-02-22 | 2004-08-24 | Vialogy Corp. | Method and system using active signal processing for repeatable signal amplification in dynamic noise backgrounds |
US7501245B2 (en) | 1999-06-28 | 2009-03-10 | Helicos Biosciences Corp. | Methods and apparatuses for analyzing polynucleotide sequences |
WO2001020040A3 (en) * | 1999-09-16 | 2002-04-25 | Daniel E Sullivan | Method and compositions for evaluating resolution of nucleic acid separation systems |
WO2001020040A2 (en) * | 1999-09-16 | 2001-03-22 | Mj Research, Inc. | Method and compositions for evaluating resolution of nucleic acid separation systems |
US7371276B2 (en) | 2002-08-07 | 2008-05-13 | Ishihara Sangyo Kaisha, Ltd. | Titanium dioxide pigment and method for producing the same and resin composition using the same |
US9012144B2 (en) | 2003-11-12 | 2015-04-21 | Fluidigm Corporation | Short cycle methods for sequencing polynucleotides |
US9657344B2 (en) | 2003-11-12 | 2017-05-23 | Fluidigm Corporation | Short cycle methods for sequencing polynucleotides |
US7981604B2 (en) | 2004-02-19 | 2011-07-19 | California Institute Of Technology | Methods and kits for analyzing polynucleotide sequences |
EP1910556A4 (en) * | 2004-07-20 | 2010-01-20 | Conexio 4 Pty Ltd | Method and apparatus for analysing nucleic acid sequence |
EP1910556A1 (en) * | 2004-07-20 | 2008-04-16 | Conexio 4 Pty Ltd | Method and apparatus for analysing nucleic acid sequence |
US8484000B2 (en) | 2004-09-02 | 2013-07-09 | Vialogy Llc | Detecting events of interest using quantum resonance interferometry |
US9868978B2 (en) | 2005-08-26 | 2018-01-16 | Fluidigm Corporation | Single molecule sequencing of captured nucleic acids |
US7666593B2 (en) | 2005-08-26 | 2010-02-23 | Helicos Biosciences Corporation | Single molecule sequencing of captured nucleic acids |
Also Published As
Publication number | Publication date |
---|---|
EP0835442A1 (en) | 1998-04-15 |
US5853979A (en) | 1998-12-29 |
AU6403996A (en) | 1997-02-05 |
AU700410B2 (en) | 1999-01-07 |
EP0835442B1 (en) | 1999-03-10 |
DE69601720D1 (en) | 1999-04-15 |
US6303303B1 (en) | 2001-10-16 |
JPH11509622A (en) | 1999-08-24 |
CA2225385A1 (en) | 1997-01-23 |
DE69601720T2 (en) | 1999-07-22 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US5853979A (en) | Method and system for DNA sequence determination and mutation detection with reference to a standard | |
US6554987B1 (en) | Method and apparatus for alignment of signals for use in DNA base-calling | |
US6681186B1 (en) | System and method for improving the accuracy of DNA sequencing and error probability estimation through application of a mathematical model to the analysis of electropherograms | |
JP2719561B2 (en) | Spectrum identification method | |
Richterich | Estimation of errors in “raw” DNA sequences: a validation study | |
Ossadnik et al. | Correlation approach to identify coding regions in DNA sequences | |
Benham | Energetics of the strand separation transition in superhelical DNA | |
US5365455A (en) | Method and apparatus for automatic nucleic acid sequence determination | |
Myers | Whole-genome DNA sequencing | |
US7406385B2 (en) | System and method for consensus-calling with per-base quality values for sample assemblies | |
Berno | A graph theoretic approach to the analysis of DNA sequencing data. | |
US6260034B1 (en) | Method and a system for nucleic acid sequence analysis | |
CN113744807A (en) | Macrogenomics-based pathogenic microorganism detection method and device | |
Lawrence et al. | Assignment of position-specific error probability to primary DNA sequence data | |
US20040142347A1 (en) | Mitochondrial DNA autoscoring system | |
CN109920480B (en) | Method and device for correcting high-throughput sequencing data | |
US20020147548A1 (en) | Basecalling system and protocol | |
CN114005489B (en) | Analysis method and device for detecting point mutation based on third-generation sequencing data | |
KR20220080682A (en) | Method of diagnosing microsatellite instability using coefficient of variation of sequence length at microsatellite locus | |
CN113971986B (en) | Method for checking cross contamination of sequencing sample through sequence similarity | |
EP4204582A1 (en) | Linked dual barcode insertion constructs | |
Pellegrini et al. | TRStalker: an Efficient Heuristic for Finding NP-Complete Tandem Repeats | |
Tibbetts et al. | Parsing of genomic graffiti | |
Sankoff et al. | Allele and locus classification in electrophoretic population studies | |
Klinovská et al. | Detection of structural variants in the genome from low-coverage data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states | Kind code of ref document: A1 Designated state(s): AL AM AT AU AZ BB BG BR BY CA CH CN CZ DE DK EE ES FI GB GE HU IL IS JP KE KG KP KR KZ LK LR LS LT LU LV MD MG MK MN MW MX NO NZ PL PT RO RU SD SE SG SI SK TJ TM TR TT UA UG US UZ VN AM AZ BY KG KZ MD RU TJ TM |
AL | Designated countries for regional patents | Kind code of ref document: A1 Designated state(s): KE LS MW SD SZ UG AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA |
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
ENP | Entry into the national phase | Ref document number: 2225385 Country of ref document: CA Ref country code: CA Ref document number: 2225385 Kind code of ref document: A Format of ref document f/p: F |
ENP | Entry into the national phase | Ref country code: JP Ref document number: 1997 505254 Kind code of ref document: A Format of ref document f/p: F |
WWE | Wipo information: entry into national phase | Ref document number: 1996923560 Country of ref document: EP |
WWP | Wipo information: published in national office | Ref document number: 1996923560 Country of ref document: EP |
REG | Reference to national code | Ref country code: DE Ref legal event code: 8642 |
WWG | Wipo information: grant in national office | Ref document number: 1996923560 Country of ref document: EP |