WO2013079215A1 - Method for classifying tumour cells - Google Patents

Method for classifying tumour cells

Publication number
WO2013079215A1
Authority
WO
WIPO (PCT)
Prior art keywords
nsclc
gene
expression
genes
pemetrexed
Prior art date
Application number
PCT/EP2012/004957
Other languages
French (fr)
Inventor
Jun Hou
Joan Geertrudis Jacobus Victor AERTS
Franklin Gerardus Grosveld
Original Assignee
Erasmus University Medical Center Rotterdam
Priority date
Filing date
Publication date
Application filed by Erasmus University Medical Center Rotterdam
Publication of WO2013079215A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • G: PHYSICS
    • G01: MEASURING; TESTING
    • G01N: INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00: Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48: Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50: Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53: Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/574: Immunoassay; Biospecific binding assay; Materials therefor for cancer
    • G01N33/57407: Specifically defined cancers
    • G01N33/57423: Specifically defined cancers of lung
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/106: Pharmacogenomics, i.e. genetic variability in individual responses to drugs and drug metabolism
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/112: Disease subtyping, staging or classification
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/156: Polymorphic or mutational markers

Definitions

  • NSCLC is classified based on microscopic analysis of specific histological features, resulting in morphological subtyping and grading. This histopathological classification correlates poorly with patient prognosis and clinical outcome.
  • therapeutic regimens should be tailored for individual patients, in order to obtain maximal anti-tumor effects.
  • EGFR-TKI treatment in NSCLC patients harboring EGFR mutations improved the response rate to approximately 68% (27), illustrating the importance of better defining the target group by molecular analysis.
  • tailored therapy for NSCLC remains largely elusive. Most NSCLC of similar histology and grade receive the same therapy, and differences in molecular characteristics are not taken into account routinely.
  • NSCLCs may be classified beyond classical histo-pathological criteria and the resultant subgroups might better indicate the intrinsic divergence of tumor progression, recurrence, and response to therapy (1-4).
  • Gene expression profiling can be used to reveal tumor features that are relevant to clinical outcome. For example, clustering of ADC or SCC cases based on gene expression profiles identified subgroups presenting favorable overall survival (2-3, 28-30). Microarray-derived gene signatures have also demonstrated the ability to define the risk of NSCLC recurrence (31). Ultimately molecular profiling would be expected to predict the response to specific therapies.
  • Pemetrexed is one of the most effective drugs for the treatment of NSCLC.
  • Pemetrexed is a folate anti-metabolite and targets multiple enzymes essential for nucleotide biosynthesis (5). It was established that it has possibly superior activity compared to commonly used agents for treatment of adenocarcinoma (ADC) and large cell carcinoma (LCC), but is thought to be less effective for the treatment of squamous cell carcinoma (SCC).
  • ADC: adenocarcinoma
  • LCC: large cell carcinoma
  • SCC: squamous cell carcinoma
  • RT-PCR: real time - polymerase chain reaction
  • TYMS: Thymidylate Synthase
  • NSCLCs can be partitioned into six sub-groups based on global gene expression profiles.
  • a subset of ADC and LCC is clustered in a novel subgroup.
  • the potential clinical relevance of these novel groups is explored by linking this refined phenotyping to the predicted sensitivity to Pemetrexed. Analysis of the expression levels of relevant genes predicts that tumors in this novel subgroup are highly likely to be resistant to Pemetrexed therapy.
  • a subset of SCCs are putative responders to Pemetrexed treatment.
  • the identification of these distinct subgroups of NSCLC suggests that biological characteristics assessed by gene expression profiling can aid in reliably stratifying patients with respect to the choice of therapeutic agents.
  • a method for preparing an optimised gene signature for assigning a NSCLC sample to one or more NSCLC classes comprising subjecting a gene signature set forth in Table 12 to nearest shrunken centroid analysis to identify one or more subgroups of gene classifiers corresponding to one or more of classes 1 to 6 identified in Table 12, and validating the performance of the selected classifiers by K-fold or leave-one-out cross-validation.
  • the metabolic pathways are as follows.
  • Group 1: adherens junction and focal adhesion.
  • Group 2: adhesion molecules on lymphocytes and neutrophils, and complement and coagulation cascades.
  • Group 3: drug metabolism, including cytochrome P450 and ABC transporters, and p53 signalling pathway.
  • Group 4: tyrosine metabolism and complement and coagulation cascades.
  • Group 5: long-term potentiation, and neuroactive ligand-receptor interaction.
  • Group 6: histidine metabolism and GnRH signalling pathway.
  • the full signature set forth in Table 12 was subjected to analysis to identify subgroups of gene classifiers that optimally maintain the capacity of the full signature in distinguishing different phenotypes.
  • the algorithm used in this method is the nearest shrunken centroid classifier (Tibshirani, R., et al., Proc Natl Acad Sci U S A, 2002. 99(10): p. 6567-72).
  • NR: non-responder
  • R: responder
  • the standardized centroid of each phenotype is calculated, that is, the average gene expression (in log intensities) for each gene in that phenotype divided by the within-phenotype standard deviation for that gene.
  • the centroids of each phenotype are then shrunken toward each other by shrinking the phenotype means of each gene toward an overall mean for all phenotypes.
  • the amount of shrinking is determined by a user-defined parameter. By changing the parameter, the number of genes which have different shrunken means between NR and R is changed, so the classifiers included in the phenotype predictor are changed accordingly.
  • a test sample is predicted to belong to NR or R, whichever has the nearest shrunken centroid.
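  • The shrinkage and nearest-centroid steps described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the implementation used in the application: it omits the class priors and the positive offset added to the standard deviation in the published nearest shrunken centroid method, and the function names (`fit_shrunken_centroids`, `predict`, `active_genes`) and the toy data are hypothetical.

```python
import math

def soft_threshold(d, delta):
    """Shrink d toward zero by delta; values within delta of zero become 0."""
    magnitude = abs(d) - delta
    return math.copysign(magnitude, d) if magnitude > 0 else 0.0

def fit_shrunken_centroids(X, y, delta):
    """X: list of samples (lists of log-expression values); y: phenotype labels."""
    n, p = len(X), len(X[0])
    classes = sorted(set(y))
    overall = [sum(row[i] for row in X) / n for i in range(p)]
    members = {k: [row for row, lab in zip(X, y) if lab == k] for k in classes}
    means = {k: [sum(r[i] for r in rows) / len(rows) for i in range(p)]
             for k, rows in members.items()}
    # Pooled within-phenotype standard deviation per gene (1.0 if it is zero).
    s = []
    for i in range(p):
        ss = sum((row[i] - means[lab][i]) ** 2 for row, lab in zip(X, y))
        s.append(math.sqrt(ss / (n - len(classes))) or 1.0)
    centroids = {}
    for k in classes:
        m_k = math.sqrt(1.0 / len(members[k]) + 1.0 / n)
        centroids[k] = [
            overall[i] + m_k * s[i] * soft_threshold(
                (means[k][i] - overall[i]) / (m_k * s[i]), delta)
            for i in range(p)
        ]
    return centroids, s

def predict(sample, centroids, s):
    """Return the phenotype whose shrunken centroid is nearest to the sample."""
    def distance(centroid):
        return sum(((x - c) / sd) ** 2 for x, c, sd in zip(sample, centroid, s))
    return min(centroids, key=lambda k: distance(centroids[k]))

def active_genes(centroids):
    """Indices of genes whose shrunken centroids still differ between phenotypes."""
    cs = list(centroids.values())
    return [i for i in range(len(cs[0]))
            if len({round(c[i], 12) for c in cs}) > 1]
```

  • Raising `delta` removes genes from `active_genes`, which mirrors the statement that the user-defined parameter controls how many classifiers remain in the predictor.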
  • a method for classifying NSCLC comprising the steps of:
  • assigning the NSCLC to any one of Groups 1-6 as set forth in the gene signature.
  • Classical subtyping of NSCLC makes use of histological features to subdivide tumours into ADC, LCC and SCC. As set forth above, we have abandoned this subtyping methodology, on the grounds that the results it provides are misleading and lead to mischaracterization of a number of NSCLC subtypes.
  • a new, alternative method for NSCLC characterisation is provided, which is based exclusively on analysis of the gene expression profile of the NSCLC.
  • the gene expression profile includes at least 85% of the genes identified in Table 4.
  • the gene expression profile includes at least 90% of the genes identified in Table 4.
  • the gene expression profile can include 95%, 96%, 97%, 98%, 99% or 100% of the genes set forth in Table 4.
  • the gene expression profile(s) obtained can be compared with an external standard, for example the expression profiles of NSCLC archived in a database, or can be compared internally amongst the sampled NSCLC. For example, the levels of gene expression can be compared to
  • the internal reference genes are the genes set forth in Table 11.
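  • The text does not spell out how the internal reference genes are used. One common convention, shown here purely as an assumption, is to subtract the mean log expression of the reference genes from every gene in the sample so that samples become comparable. The function name and the reference gene identifiers `REF_A` and `REF_B` are hypothetical placeholders, not entries from Table 11.

```python
def normalise_to_reference(log_expr, reference_genes):
    """Subtract the mean log expression of the reference genes from every gene.

    log_expr: dict mapping gene name -> log-scale expression for one sample.
    reference_genes: names of the internal reference genes.
    """
    present = [g for g in reference_genes if g in log_expr]
    if not present:
        raise ValueError("no reference genes were measured in this sample")
    baseline = sum(log_expr[g] for g in present) / len(present)
    return {gene: value - baseline for gene, value in log_expr.items()}


# Example with placeholder reference genes.
sample = {"TYMS": 9.0, "REF_A": 7.0, "REF_B": 8.0}
normalised = normalise_to_reference(sample, ["REF_A", "REF_B"])
```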
  • a method for classifying a test tissue sample of a malignant non-small cell lung carcinoma (NSCLC) by analysis of gene expression comprising the steps of: (a) performing unsupervised hierarchical clustering of the gene expression data, to identify clusters as defined by over- or under-expression of genes;
  • gene expression is analysed by two-dimensional hierarchical clustering, which provides a graphical representation of comparative gene expression and facilitates classification of NSCLC samples into the relevant gene expression-defined subtypes.
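  • As a rough illustration of unsupervised hierarchical clustering of samples, the sketch below merges the two most similar clusters repeatedly until a requested number remains. The distance (1 minus Pearson correlation) and the average-linkage rule are assumptions chosen for the example; the text does not state which were used. Two-dimensional clustering would apply the same routine a second time to the transposed matrix (genes instead of samples).

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def average_linkage(profiles, n_clusters):
    """Agglomerative clustering; profiles maps sample name -> expression list."""
    names = list(profiles)
    dist = {(a, b): 1.0 - pearson(profiles[a], profiles[b])
            for a in names for b in names if a != b}
    clusters = [[name] for name in names]

    def linkage(c1, c2):
        # Mean pairwise distance between the members of two clusters.
        return sum(dist[(a, b)] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]
```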
  • the 80% or more of the genes in Table 4 is substantially all of the genes in Table 4.
  • the methods according to the invention are in vitro methods.
  • NSCLC which are categorized in Group 4 show reduced expression of one or more of FLOR1, ASCL1, DDC or MAST4 compared with other neuroendocrine NSCLC; or increased expression of one or more of ABCCs, MCM6 and CDCA7 compared with other neuroendocrine NSCLC.
  • classification of NSCLC by gene expression analysis allows susceptibility or resistance to drugs to be predicted, according to the classification of the NSCLC.
  • NSCLC in Group 4 are predicted to be resistant to the drug Pemetrexed.
  • the invention moreover provides a method for preparing an optimised gene signature for predicting resistance to Pemetrexed in a NSCLC, comprising subjecting a gene signature set forth in Table 13 to nearest shrunken centroid analysis to identify subgroups of gene classifiers corresponding to responders and non-responders to Pemetrexed therapy, and validating the performance of the selected classifiers by K-fold or leave-one-out cross-validation.
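  • The validation step can be illustrated with a small cross-validation helper. This is a sketch only: the fold assignment is a simple interleaving (a real analysis would normally shuffle or stratify), and `fit_nearest_mean` and `predict_nearest_mean` are hypothetical one-gene stand-ins for whatever classifier is being validated. Setting `k` equal to the number of samples gives leave-one-out cross-validation.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists; k == n_samples gives leave-one-out."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for test in folds:
        held_out = set(test)
        train = [i for i in range(n_samples) if i not in held_out]
        yield train, test

def cross_validated_accuracy(X, y, fit, predict, k):
    """Fraction of samples predicted correctly when held out of training."""
    correct = 0
    for train, test in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in test)
    return correct / len(X)

def fit_nearest_mean(X, y):
    """Toy stand-in classifier: per-phenotype mean of one expression value."""
    return {lab: sum(x for x, l in zip(X, y) if l == lab) / y.count(lab)
            for lab in set(y)}

def predict_nearest_mean(model, x):
    """Assign the phenotype whose mean is closest to x."""
    return min(model, key=lambda lab: abs(x - model[lab]))
```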
  • the method comprises profiling the expression of at least 90% of the genes set forth in Table 13, and more preferably 95%, 96%, 97%, 98%, 99% or 100% of the genes set forth in Table 13.
  • a method for predicting resistance to Pemetrexed in a NSCLC comprising the steps of: (i) profiling the expression of at least 80% of the genes set forth in Table 6; (ii) comparing the expression of the genes profiled in (i) with the signature set forth in Table 6; and predicting the NSCLC to be responsive or non-responsive to Pemetrexed according to the Table 6 signature.
  • the method comprises profiling the expression of at least 90% of the genes set forth in Table 6 or all of the genes set forth in Table 6.
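  • The percentage thresholds above (80%, 90%, or all of the signature genes) amount to a simple coverage check before a sample is scored. A minimal sketch follows; the function names and the placeholder gene identifiers are hypothetical, and the actual gene lists are those of the Tables.

```python
def signature_coverage(profiled_genes, signature_genes):
    """Fraction of the signature genes that were actually profiled."""
    signature = set(signature_genes)
    if not signature:
        raise ValueError("signature is empty")
    return len(signature & set(profiled_genes)) / len(signature)

def meets_threshold(profiled_genes, signature_genes, minimum=0.80):
    """True when at least `minimum` of the signature genes were profiled."""
    return signature_coverage(profiled_genes, signature_genes) >= minimum


# Example: a 25-gene signature of placeholder names, 20 of which were profiled.
signature = ["gene_%02d" % i for i in range(25)]
profiled = signature[:20] + ["unrelated_gene"]
ok_at_80 = meets_threshold(profiled, signature, 0.80)
ok_at_90 = meets_threshold(profiled, signature, 0.90)
```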
  • kits for determining the subtype grouping of NSCLC, and/or predicting the susceptibility or resistance of an NSCLC to one or more drugs comprise reagents for measuring the presence of mRNA or polypeptides encoded by the genes identified herein.
  • kits may contain instructions as to use.
  • the kits may contain instructions as to the selection of genes to be screened in the diagnosis of NSCLC as set forth herein.
  • the genes are 80% or more of the genes set forth in Table 4. More preferably, the genes are substantially all of the genes set forth in Table 6.
  • the kit may contain instructions for the detection of the gene products expressed from said mRNA species.
  • any method for recognising the levels of expression of a gene may be used in the context of the present invention.
  • the genes identified in each gene signature, and the changes in expression levels associated therewith, are identified in the Tables set out herein; analysis can be made manually, or using automated means, to compare the expression levels observed in a test sample to those observed in a reference sample.
  • kits in accordance with the invention may comprise any reagents suitable for measuring gene expression levels.
  • Such reagents comprise reagents for measuring levels of mRNA, or cDNA derived from mRNA, and/or reagents suitable for measuring levels of polypeptide gene products.
  • a kit may comprise nucleic acid probes which hybridise specifically to mRNA or cDNA specific for the appropriate gene signature, under appropriate conditions.
  • the probes may be immobilised onto a solid surface, such as glass slides, membranes of various types, columns or beads, and may be in the form of an addressable array. If the probes are on an array, the identity of each probe is advantageously known as a result of the spatial arrangement on the array itself.
  • Probes may be used in solution, to probe nucleic acids derived from the sample.
  • labelling means may be provided, to label either the probes or the sample nucleic acids.
  • Primers may also be provided, to prime extension reactions for amplification and/or labelling of sample nucleic acids.
  • the primers are specific for mRNA transcribed from the genes identified in the gene signatures set forth herein, or corresponding cDNA.
  • the kits may alternatively, or in addition, comprise reagents such as immunoglobulins, RNA or peptide aptamers and the like which are capable of specifically detecting the polypeptide gene products of the target genes.
  • the present invention provides a diagnostic kit for use in characterising NSCLC tumours, comprising a set of reagents for specifically measuring the abundance of the mRNA species transcribed from at least 80% of the genes set forth in Table 4 or Table 6 herein.
  • the reagents comprise a set of oligonucleotide primers or probes which hybridise specifically to said genes, which may advantageously be attached to a solid phase in the form of an array.
  • the array consists of a library of oligonucleotides affixed to a solid phase, and said library of oligonucleotides consists substantially of oligonucleotides which are specific for at least 80% of the genes set forth in Table 4 or Table 6 herein.
  • the reagents are selected from immunoglobulin molecules, RNA aptamers and peptide aptamers.
  • the kit is for use in predicting the response of NSCLC to Pemetrexed, and consists substantially of a set of nucleic acid probes or primers which recognise the transcripts of the genes set forth in Table 6.
  • the kit may include a microarray which consists substantially of probes specific for the 25 genes listed in Table 6.
  • kits may further include labelling means, hybridisation reagents, detection reagents, and the like.
  • the kits may contain reagents for detection of one or more of TP53, TTF1, SYP, NCAM1 and CHGA by immunohistochemical staining.
  • immunoglobulins, RNA or peptide aptamers may be substituted for, or may supplement, the nucleic acid reagents in kits according to the invention.
  • Fig. 1 Identification of 6 subgroups in the Erasmus MC NSCLC cohort.
  • Six subgroups are indicated by G1 to G6.
  • (A) Correlation view of gene expression in the 88 Erasmus MC NSCLC samples, excluding 3 samples which were classified as 'healthy'. Pairwise correlations between any two samples are displayed. The colors of the cells represent Pearson's correlation coefficient values between any two samples, with deeper red indicating higher positive and deeper blue lower negative correlations. The red diagonal line displays the self-to-self comparison of each sample.
  • (B) Relative expression levels of TYMS, cell proliferation genes, and neuroendocrine genes are shown for each of the six identified NSCLC subgroups. Boxes show the distribution of gene expression in each subgroup, with dots representing outliers. The dashed line shows the median expression of that gene across all NSCLC samples.
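  • The correlation view in panel A is a sample-by-sample matrix of Pearson's correlation coefficients. The sketch below shows how such a matrix could be computed from expression profiles; the function names and the toy profiles are illustrative assumptions, and the colour mapping of the figure is not reproduced.

```python
import math

def pearson(a, b):
    """Pearson's correlation coefficient of two equal-length profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def correlation_matrix(profiles):
    """Return (sample names, matrix) of pairwise correlations.

    profiles: dict mapping sample name -> list of expression values.
    The diagonal holds the self-to-self comparison of each sample.
    """
    names = list(profiles)
    matrix = [[pearson(profiles[r], profiles[c]) for c in names] for r in names]
    return names, matrix
```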
  • Fig. 2 Deregulated pathways identified by a global functional comparative analysis of predicted Pemetrexed-resistant versus predicted Pemetrexed-sensitive NSCLC cases.
  • the values on the x-axis are calculated enrichment scores, that is, the degree of over-representation of genes from a specific functional category in NR compared to R.
  • TP53: TP53 signaling pathway; Pu: purine metabolism pathway; Py: pyrimidine metabolism pathway; EGFR: EGFR signaling pathway; Pem: Pemetrexed metabolism pathway;
  • Target expression of 14 probe sets representing three Pemetrexed targets (TYMS
  • Fig. 3 Relative expression of TYMS in relation to classical NSCLC histology.
  • Relative expression levels of TYMS in histology signature-assigned NSCLC groups (15).
  • Fig. 4 Predicted Pemetrexed sensitivity in NSCLC subgroups and NSCLC cell lines.
  • NSCLC cases predicted to be resistant (black) or sensitive to Pemetrexed (grey) are correlated to expression profiling-based sub groups (G1 to G6).
  • (B) Validation of Pemetrexed resistance signature using NSCLC cell lines. Predicted sensitivity to Pemetrexed for NSCLC cell lines is compared to experimentally established sensitivity (16).
  • (C) Performance of the Pemetrexed-resistance prediction signature on the Duke NSCLC cohort. The ninety-six primary NSCLC cases were classified into six subgroups using the group signature gene set. Six subgroups are indicated by G1 to G6. The response to Pemetrexed predicted by the 25-probe set resistance signature, and its correlation to the six subgroups, are displayed.
  • Fig. 5 Deregulated pyrimidine metabolism pathway in G4 NSCLC.
  • Pyrimidine metabolism is more activated and histidine metabolism pathway less activated in G4 NSCLCs compared to G6 neuroendocrine tumors.
  • the expression status of the genes is indicated by the different colors, darker grey: lower expression; lighter grey: higher expression.
  • Fig. 8 Differential expression of genes associated with Pemetrexed metabolism in G3 NSCLC cases.
  • ABCC1 and FOLR2 are differentially expressed in predicted Pemetrexed-resistant G3 NSCLC cases versus predicted Pemetrexed-sensitive G3 NSCLC cases. Outliers are indicated by crosses.
  • TYMS protein staining in TMA was quantified and graded from 0 to 2 (Staining Score, SS); mRNA expression measured on microarrays was represented as the mean of two probe sets for TYMS. Staining for TYMS protein was performed at two different titres, 1:10 (A) and 1:50 (B). The samples were grouped according to the predicted response to Pemetrexed, non-responder (NR) and responder (R). Outliers are indicated by crosses.
  • Fig. 10 Utility of routine IHC markers to identify putative NR and R to Pemetrexed therapy.
  • NSCLC cases are annotated with predicted Pemetrexed responsiveness, expression profile-assigned subgroup (G1 to G6), histology (ADC, LCC, SCC), and pathological stages (I to IV) (Table 8).
  • Fig. 11 Proposal for evaluation of putative responsiveness to Pemetrexed therapy.
  • the flowchart shows the proposed procedure to identify sensitive and resistant NSCLC to Pemetrexed using routine histopathological markers.
  • Staining for TP53 and EGFR stratifies G3 NSCLCs with respect to predicted Pemetrexed sensitivity. Negative staining for TP53 and EGFR predicts good response to Pemetrexed. In contrast, positive staining for both TP53 and EGFR predicts Pemetrexed resistance in G3 NSCLC cases.
  • Resistant NSCLC cases in G4 might be predicted by strong staining for TP53 and/or EGFR, or neuroendocrine markers. In contrast, high expression of TP53 or EGFR and other neuroendocrine markers do not predict poor response to Pemetrexed for the NSCLC cases in G1 or G6.
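  • The staining rules stated above for G3 and G4 can be written down as a small decision function. This is an illustrative encoding only, not the flowchart of Fig. 11: the boolean inputs (with "positive" standing for strong staining in G4), the function name, and the "undetermined" outcome for cases the text does not cover (mixed G3 staining, unstained G4 cases, and all other subgroups) are assumptions.

```python
def predict_from_ihc(group, tp53_positive, egfr_positive,
                     neuroendocrine_positive=False):
    """Return 'R' (responder), 'NR' (non-responder) or 'undetermined'."""
    if group == "G3":
        if not tp53_positive and not egfr_positive:
            return "R"    # negative for both markers: good response predicted
        if tp53_positive and egfr_positive:
            return "NR"   # positive for both markers: resistance predicted
        return "undetermined"  # mixed staining is not covered by the text
    if group == "G4":
        if tp53_positive or egfr_positive or neuroendocrine_positive:
            return "NR"   # strong TP53 and/or EGFR, or neuroendocrine markers
        return "undetermined"
    # For G1 and G6 these markers are stated not to predict poor response;
    # the remaining subgroups are not addressed by these rules.
    return "undetermined"
```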
  • NSCLC is "classified" by gene expression profiling.
  • the grouping is set forth in Table 4.
  • genes are assayed in accordance with the present invention by measuring the levels of either nucleic acids or proteins encoded by the gene which are present in a sample.
  • Expression levels are considered herein to be the amounts of mRNA or polypeptide which are present in a sample; they may be influenced, therefore, by for instance modulations in levels of transcription, translation, mRNA or protein turnover.
  • Genes whose expression levels are described herein as being useful for identifying, classifying or measuring the severity of NSCLC are referred to as "target" genes; groups of target genes form gene signatures, which can be used to identify, classify or measure the severity of NSCLC.
  • Nucleic acids are as commonly understood in the art, and include DNA, RNA and artificial nucleic acids. In the context of the present invention, the levels of naturally-occurring nucleic acids will generally be measured using techniques known to those skilled in the art. Probes, primers and other nucleic acid molecules used in the present invention may comprise synthetic nucleotides or other modifications, as is known in the art. "Reagents" for measuring gene expression levels include nucleic acids and ligands, such as antibodies, which are capable of detecting the RNA or polypeptide products of the target genes described herein.
  • Reagents may be selective, in that they bind to or detect only the RNA or polypeptide products of the target genes, or non-selective, capable of binding to or detecting a wider population of genes, with the selectivity being introduced in a later stage of the assay.
  • assays can be conducted on arrays that comprise many genes in addition to the target genes, and the detection of changes in the expression levels of the target genes will be achieved by selective analysis of the arrays.
  • the Affymetrix Gene chip analyser is capable of identifying binding to probes on gene chip arrays, thereby measuring the degree of hybridisation to the probe sets representing genes on the array as well as the identity of the probes hybridised to at the same time.
  • primers may be used to selectively detect the RNA gene products of target genes.
  • a "primer" is an oligonucleotide, whether produced naturally as in a purified restriction digest or produced synthetically, that is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product that is complementary to a nucleic acid strand is induced (i.e., in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH).
  • the primer is preferably single stranded for maximum efficiency in the initiation of the reaction, but may alternatively be double stranded.
  • the primer is first treated to separate its strands before being used to prepare extension products.
  • the primer is an oligodeoxyribonucleotide.
  • the primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method.
  • The term "probe" refers to an oligonucleotide (i.e., a sequence of nucleotides), whether occurring naturally as in a purified restriction digest or produced synthetically, recombinantly or by PCR amplification, that is capable of hybridizing to at least a portion of another oligonucleotide of interest.
  • a probe may be single-stranded or double-stranded. Probes are useful in the detection, identification and isolation of particular gene sequences. It is contemplated that any probe used in the present invention can be labelled with a reporter molecule so that it is detectable in any detection system, including, but not limited to, enzyme (e.g., ELISA, as well as enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems. It is not intended that the present invention be limited to any particular detection system or label.
  • The term "sample" is used to denote biological samples which may be obtained from animals (including humans) and encompass fluids, solids, tissues, and gases.
  • Biological samples include sputum and blood products, such as plasma, serum and the like.
  • a sample is ordinarily a tissue sample obtained from a NSCLC, or from normal tissue for comparison purposes.
  • Comparing includes comparison of expression levels of target genes directly with a control, as well as comparison with profiles, as described further herein. In comparisons according to the present invention, a match is sought between a pattern of gene expression seen in a control or in a predefined profile.
  • The term "isolated", when used in relation to a nucleic acid, refers to a nucleic acid sequence that is identified and separated from at least one component or contaminant with which it is ordinarily associated in its natural source. Isolated nucleic acid is present in a form or setting that is different from that in which it is found in nature. Similarly, isolated polypeptides are polypeptides or proteins separated from at least one component or contaminant with which they are ordinarily associated in their natural source.
  • NSCLC Therapy
  • Gene expression profiling can be used to reveal tumor features that are relevant to clinical outcome. For example, clustering of ADC or SCC cases based on gene expression profiles identified subgroups presenting favorable overall survival (2-3, 28-30). Microarray-derived gene signatures have also demonstrated the ability to define the risk of NSCLC recurrence (31). Ultimately molecular profiling would be expected to predict the response to specific therapies. In breast cancer and cell lines, gene signatures were identified that reflect the activation status of oncogenic pathways. Based on these signatures, the coordinated active status of pathways was obtained that not only defined prognosis in specific patient subgroups but also predicted the sensitivity to therapeutic agents targeting key components of these pathways (32).
  • NSCLCs independent of classical histopathology, using microarray-based molecular profiles.
  • Tumors clustered in the same subgroups present similar patterns of gene expression and pathway deregulation despite often variable histopathology.
  • tumors clustered in G4 are histologically different - ADC or LCC. But they are molecularly similar, with deregulation of the tyrosine metabolism pathway.
  • G6 tumors are also histologically ADC or LCC, but molecularly characterized by altered histidine metabolism.
  • the group signature defined five subgroups of similar size and composition in the independent Duke NSCLC cohort (Table 7).
  • Positive staining of TTF1 predicts a good response to Pemetrexed in tumors from G1 or G6. Unfortunately, TTF1 staining often fails, limiting the practical utility of this marker.
  • TP53 and EGFR expression with Pemetrexed sensitivity in G3 potentially provides an instant and practical manner to stratify SCC patients for Pemetrexed treatment (Fig. 11A).
  • High expression of either TP53 or neuroendocrine markers predicts Pemetrexed resistance in G4 NSCLCs (Fig. 11B), although a few exceptions and staining failures were observed in this group.
  • For the other NSCLC subgroups, a more specific and sensitive predictor other than the cooperative use of currently available routine markers is needed.
  • Gene expression profiles may guide the choice of chemotherapy regimens
  • the sample used for analysis comprises tissue sample, which includes tumour tissue, and in particular human lung cancer tumour tissue.
  • tissue is, but is not limited to, epithelial tissue and connective tissue; other tissue types may be used as and if they occur in a lung tumour.
  • NSCLC are comprised of epithelial tissue.
  • Samples are obtained from surgically resected lungs, or may be obtained from patients by standard biopsy techniques.
  • microdissection is used to ensure that the cell types subjected to analysis are the intended cell type.
  • Normal samples can be obtained from the same patient, adjacent to the tumour, or from patients not suffering from cancer. Typically, normal samples will be of the same tissue type (i.e. epithelial tissue, connective tissue) as the tumour sample.
  • an analysis model for example using two-dimensional hierarchical clustering, it is only necessary to analyse a tumor sample from a patient rather than both a tumor sample and a normal sample from the same or different patients.
  • the levels of mRNAs present in a sample which are encoded by the genes identified in the Tables set forth herein may be measured directly. Analysis is conveniently carried out by labelling the RNA in cells from the sample and assaying the abundance of the desired mRNA species. To prepare RNA from tumour and/or normal samples, total or poly(A)+ RNA is processed according to any suitable technique, for example as set forth below, to produce cDNA and subsequently cRNA, which is conveniently used in microarray analysis.
  • Copies of the cRNA or cDNA may be amplified, for example by RT-PCR. Fluorescent tags or digoxigenin-dUTP can then be enzymatically incorporated into the newly synthesized cDNA/cRNA or can be chemically attached to the new strands of DNA or RNA.
  • the assessment of expression is performed by gene expression profiling using oligonucleotide-based arrays or cDNA-based arrays of any type; RT-PCR (reverse transcription-Polymerase Chain Reaction), real-time PCR, in-situ hybridisation, Northern blotting, serial analysis of gene expression (SAGE) for example as described by Velculescu et al Science 270 (5235): 484-487, or differential display. Details of these and other methods can be found for example in Sambrook et al, 1989, Molecular Cloning: A Laboratory Manual.
  • the assessment uses a microarray assay.
  • Microarrays can be constructed by a number of available technologies. Array technology and the various techniques and applications associated with it are described generally in numerous textbooks and documents. Gene array technology is particularly suited to the practice of the present invention. Methods for preparing microarrays are well known in the art. These include Lemieux et al., (1998), Molecular Breeding 4,277- 289, Schena and Davis. Parallel Analysis with Biological Chips, in PCR Methods Manual (eds. M. Innis, D. Gelfand, J. Sninsky), Schena and Davis, (1999), Genes, Genomes and Chips. In DNA Microarrays : A Practical Approach (ed. M.
  • Major applications for array technology include the identification of sequence (nucleotide sequence/nucleotide sequence mutation) and the determination of expression level (abundance) of nucleotide sequences.
  • Gene expression profiling may make use of array technology, optionally in combination with proteomics techniques (Celis et al, 2000, FEBS Lett, 480 (1): 2-16; Lockhart and Winzeler, 2000, Nature 405 (6788): 827-836; Khan et al., 1999, 20 (2): 223-9).
  • any library may be arranged in an orderly manner into an array, by spatially separating the members of the library.
  • libraries for arraying include nucleic acid libraries (including DNA, RNA, oligonucleotide and other nucleic acid libraries), peptide, polypeptide and protein libraries, as well as libraries comprising other types of molecules, such as ligand libraries.
  • the members of a library are generally fixed or immobilised onto a solid phase, preferably a solid substrate, to limit diffusion and admixing of the samples.
  • the libraries may be immobilised to a substantially planar solid phase, including membranes and non-porous substrates such as plastic and glass.
  • the samples are preferably arranged in such a way that indexing (i. e. reference or access to a particular sample) is facilitated.
  • the samples are applied as spots in a grid formation.
  • Common assay systems may be adapted for this purpose.
  • an array may be immobilised on the surface of a microplate, either with multiple samples in a well, or with a single sample in each well.
  • the solid substrate may be a membrane, such as a nitrocellulose or nylon membrane (for example, membranes used in blotting experiments).
  • Alternative substrates include glass, or silica based substrates.
  • the samples are immobilised by any suitable method known in the art, for example, by charge interactions, or by chemical coupling to the walls or bottom of the wells, or the surface of the membrane.
  • Other means of arranging and fixing may be used, for example, pipetting, drop-touch, piezoelectric means, ink-jet and bubblejet technology, electrostatic application, etc.
  • photolithography may be utilised to arrange and fix the samples on the chip.
  • the samples may be arranged by being "spotted" onto the solid substrate; this may be done by hand or by making use of robotics to deposit the sample.
  • arrays may be described as macroarrays or microarrays, the difference being the size of the sample spots.
  • Macroarrays typically contain sample spot sizes of about 300 microns or larger and may be easily imaged by existing gel and blot scanners.
  • the sample spot sizes in microarrays are typically less than 200 microns in diameter and these arrays usually contain thousands of spots.
  • microarrays may require specialised robotics and imaging equipment, which may need to be custom made. Instrumentation is described generally in a review by Cortese, 2000, The Engineer 14 [11]: 26.
  • targets and probes may be labelled with any readily detectable reporter such as a fluorescent, bioluminescent, phosphorescent, radioactive reporter.
  • the materials for use in the methods of the present invention are ideally suited for preparation of kits.
  • a set of instructions will typically be included.
  • microarrays according to the invention may consist of a solid phase and, immobilised thereto, a library of nucleic acid oligonucleotides or probes which consists substantially of one or more of the gene signatures identified herein, and listed in the Tables, especially Tables 4, 6, 10, 11, 12 and 13.
  • the arrays according to the invention may comprise a library of oligonucleotides which is larger than, though still comprising, one or more of the gene signatures described herein, but still smaller than the set consisting of all known genes.
  • such arrays may comprise gene signatures which are useful for detecting other forms of cancer, or other types of NSCLC, or which may provide different insights into the prognosis for NSCLC patients, or the like.
  • Nucleic acid signatures in accordance with the invention may be detected by nucleic acid analysis which relies on amplification and/or sequencing of sample nucleic acids. Since the invention aims to measure gene expression, the methods used must quantitatively measure transcribed nucleic acid levels. The measured nucleic acids must therefore be mRNA, or nucleic acids derived quantitatively from mRNA such as cDNA.
  • Nucleic acid analysis of this kind generally requires nucleic acid amplification.
  • Many amplification methods rely on an enzymatic chain reaction (such as a polymerase chain reaction, a ligase chain reaction, or a self-sustained sequence replication), a linear amplification procedure, or on the replication of all or part of the vector into which the desired sequence has been cloned.
  • the amplification according to the invention is an exponential amplification, as exhibited by for example the polymerase chain reaction.
  • PCR polymerase chain reaction
  • LAR ligase amplification reaction
  • TAS transcription-based amplification system
  • GAWTS genomic amplification with transcript sequencing
  • NASBA nucleic acid sequence-based amplification
  • PCR is a nucleic acid amplification method described inter alia in U.S. Pat. Nos. 4,683,195 and 4,683,202.
  • PCR consists of repeated cycles of DNA polymerase generated primer extension reactions.
  • the target DNA is heat denatured and two oligonucleotides, which bracket the target sequence on opposite strands of the DNA to be amplified, are hybridised. These oligonucleotides become primers for use with DNA polymerase.
  • the DNA is copied by primer extension to make a second copy of both strands. By repeating the cycle of heat denaturation, primer hybridisation and extension, the target DNA can be amplified a million fold or more in about two to four hours.
  • PCR is a molecular biology tool, which must be used in conjunction with a detection technique to determine the results of amplification.
  • An advantage of PCR is that it increases sensitivity by amplifying the amount of target DNA by 1 million to 1 billion fold in approximately 4 hours.
  • PCR can be used to amplify any known nucleic acid in a diagnostic context (Mok et al, (1994), Gynaecologic Oncology, 52: 247-252).
  • Self-sustained sequence replication is a variation of TAS, which involves the isothermal amplification of a nucleic acid template via sequential rounds of reverse transcriptase (RT), polymerase and nuclease activities that are mediated by an enzyme cocktail and appropriate oligonucleotide primers (Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87: 1874).
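  • As an illustrative sketch only (not part of the described method), the amplification figures quoted above can be related to cycle number. The sketch assumes idealised doubling per cycle; the function names and the `efficiency` parameter are hypothetical and introduced here for illustration.

```python
import math


def fold_amplification(cycles: int, efficiency: float = 1.0) -> float:
    """Fold increase in target copies after `cycles` PCR cycles.

    efficiency = 1.0 is the idealised case in which every strand is
    copied in every cycle (assumption; real reactions are less efficient).
    """
    return (1.0 + efficiency) ** cycles


def cycles_needed(target_fold: float, efficiency: float = 1.0) -> int:
    """Smallest whole number of cycles giving at least `target_fold`."""
    return math.ceil(math.log(target_fold) / math.log(1.0 + efficiency))


if __name__ == "__main__":
    for fold in (1e6, 1e9):
        print(f"{fold:.0e}-fold needs {cycles_needed(fold)} ideal cycles")
```

  Under this idealisation a million-fold amplification corresponds to about 20 cycles and a billion-fold amplification to about 30 cycles; lower per-cycle efficiency increases the number of cycles required.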
  • Enzymatic degradation of the RNA of the RNA/DNA heteroduplex is used instead of heat denaturation.
  • RNase H and all other enzymes are added to the reaction and all steps occur at the same temperature and without further reagent additions. Following this process, amplifications of 10^10 have been achieved in one hour at 42°C.
  • Ligation Amplification LAR/LAS
  • Ligation amplification reaction or ligation amplification system uses DNA ligase and four oligonucleotides, two per target strand. This technique is described by Wu, D. Y. and Wallace, R. B. (1989) Genomics 4:560. The oligonucleotides hybridise to adjacent sequences on the target DNA and are joined by the ligase. The reaction is heat denatured and the cycle repeated.
  • RNA replicase for the bacteriophage Qβ, which replicates single-stranded RNA, is used to amplify the target DNA, as described by Lizardi et al. (1988) Bio/Technology 6: 1197.
  • the target DNA is hybridised to a primer including a T7 promoter and a Qβ 5' sequence region.
  • reverse transcriptase generates a cDNA connecting the primer to its 5' end in the process.
  • the resulting heteroduplex is heat denatured.
  • a second primer containing a Qβ 3' sequence region is used to initiate a second round of cDNA synthesis.
  • T7 RNA polymerase then transcribes the double-stranded DNA into new RNA, which mimics the Qβ. After extensive washing to remove any unhybridised probe, the new RNA is eluted from the target and replicated by Qβ replicase. The latter reaction creates 10-fold amplification in approximately 20 minutes.
  • rolling circle amplification (Lizardi et al, (1998) Nat Genet 19:225) is an amplification technology available commercially (RCAT(TM)) which is driven by DNA polymerase and can replicate circular oligonucleotide probes with either linear or geometric kinetics under isothermal conditions.
  • a geometric amplification occurs via DNA strand displacement and hyperbranching to generate 10^12 or more copies of each circle in 1 hour.
  • RCAT generates, in a few minutes, a linear chain of thousands of tandemly linked DNA copies of a target covalently linked to that target.
  • SDA strand displacement amplification
  • SDA comprises both a target generation phase and an exponential amplification phase.
  • double-stranded DNA is heat denatured creating two single-stranded copies.
  • a series of specially manufactured primers combine with DNA polymerase (amplification primers for copying the base sequence and bumper primers for displacing the newly created strands) to form altered targets capable of exponential amplification.
  • the exponential amplification process begins with altered targets (single-stranded partial DNA strands with restriction enzyme recognition sites) from the target generation phase.
  • An amplification primer is bound to each strand at its complementary DNA sequence.
  • DNA polymerase then uses the primer to identify a location to extend the primer from its 3' end, using the altered target as a template for adding individual nucleotides.
  • the extended primer thus forms a double-stranded DNA segment containing a complete restriction enzyme recognition site at each end.
  • a restriction enzyme is then bound to the double stranded DNA segment at its recognition site.
  • the restriction enzyme dissociates from the recognition site after having cleaved only one strand of the double-stranded segment, forming a nick.
  • DNA polymerase recognises the nick and extends the strand from the site, displacing the previously created strand.
  • the recognition site is thus repeatedly nicked and restored by the restriction enzyme and DNA polymerase with continuous displacement of DNA strands containing the target segment.
  • Each displaced strand is then available to anneal with amplification primers as above. The process continues with repeated nicking, extension and displacement of new DNA strands, resulting in exponential amplification of the original DNA target.
  • Identification of nucleic acid sequences can for example be performed by primer extension or sequencing techniques. Such techniques may involve the parallel and/or serial processing of a large number of different template nucleic acid molecules.
  • a library of probes on an array may be employed.
  • a high sensitivity analytical technique may be used to characterize individually nucleic acid molecules which become immobilised on the array, by hybridisation to the probes.
  • primer extension reactions may be used to incorporate labeled nucleotide(s) that can be individually detected in order to sequence individual molecules and/or determine the identity of at least one nucleotide position on individual nucleic acid molecules.
  • Detection may involve labeling one or more of the primers and/or extension nucleotides with a detectable label (e.g., using fluorescent label(s), FRET label(s), enzymatic label(s), radio-label(s), etc.). Detection may involve imaging, for example using a high sensitivity camera and/or microscope (e.g., a super-cooled camera and/or microscope).
  • Suitable techniques may be selected by one of ordinary skill in the art.
  • high-throughput sequencing approaches are listed in K.Y. Chan, Mutation Research 573 (2005) 13-40 and include, but are not limited to, near-term sequencing approaches such as cycle-extension approaches, polymerase reading approaches and exonuclease sequencing, revolutionary sequencing approaches such as DNA scanning and nanopore sequencing and direct linear analysis.
  • Examples of current high-throughput sequencing methods are 454 (pyro)sequencing, Solexa Genome Analysis System, Agencourt SOLiD sequencing method (Applied Biosystems), MS-PET sequencing (Ng et al., 2006, http://nar.oxfordjournals.org/cgi/content/full/34/12/e84).
  • a digital analysis (e.g., a digital amplification and subsequent analysis) may be performed to obtain a statistically significant quantitative result.
  • Certain digital techniques are known in the art, see for example, US Patent No. 6,440,706 and US Patent No. 6,753,147, incorporated herein by reference.
  • an emulsion-based method for amplifying and/or sequencing individual nucleic acid molecules may be used (e.g., BEAMing technology; International Published Application Nos. WO2005/010145, WO00/40712, WO02/22869, WO03/044187, WO99/02671, herein incorporated by reference).
  • a sequencing method that can sequence single molecules in a biological sample may be used. Sequencing methods are known and being developed for high throughput (e.g., parallel) sequencing of complex genomes by sequencing a large number of single molecules (often having overlapping sequences) and compiling the information to obtain the sequence of an entire genome or a significant portion thereof. Suitable sequencing techniques may involve high speed parallel molecular nucleic acid sequencing as described in PCT Application No. WO 01/16375, US Application No. 60/151 ,580 and U.S. Published Application No. 20050014175, the entire contents of which are incorporated herein by reference. Other sequencing techniques are described in PCT Application No. WO 05/73410, PCT Application No.
  • Sequencing techniques for use in connection with the invention may involve exposing a nucleic acid molecule to an oligonucleotide primer and a polymerase in the presence of a mixture of nucleotides.
  • Changes in the fluorescence of individual nucleic acid molecules in response to polymerase activity may be detected and recorded.
  • the specific labels attached to each nucleic acid and/or nucleotide may provide an emission spectrum allowing for the detection of sequence information for individual template nucleic acid molecules.
  • a label is attached to the primer/template and a different label is attached to each type of nucleotide (e.g., A, T/U, C, or G). Each label emits a distinct signal which is distinguished from the other labels.
  • Useful sequencing methods include high throughput sequencing using the 454 Life Sciences Instrument System (International Published Application No. WO2004/069849, filed January 28, 2004). Briefly, a sample of single stranded DNA is prepared and added to an excess of DNA capture beads which are then emulsified. Clonal amplification is performed to produce a sample of enriched DNA on the capture beads (the beads are enriched with millions of copies of a single clonal fragment). The DNA enriched beads are then transferred into PicoTiterPlate (TM) and enzyme beads and sequencing reagents are added. The samples are then analyzed and the sequence data recorded. Pyrophosphate and luciferin are examples of the labels that can be used to generate the signal.
  • a label includes but is not limited to a fluorophore, for example green fluorescent protein (GFP), a luminescent molecule, for example aequorin or europium chelates, fluorescein, rhodamine green, Oregon green, Texas red, naphthofluorescein, or derivatives thereof.
  • the polynucleotide is linked to a substrate.
  • a substrate includes but is not limited to, streptavidin-biotin, histidine-Ni, S-tag-S-protein, or glutathione-S-transferase (GST).
  • a substrate is pretreated to facilitate attachment of a polynucleotide to a surface
  • the substrate can be glass which is coated with a polyelectrolyte multilayer (PEM), or the polynucleotide is biotinylated and the PEM-coated surface is further coated with streptavidin.
  • single molecule sequencing technology available from US Genomics, Mass., may be used.
  • technology described, at least in part, in one or more of US patents 6,790,671; 6,772,070; 6,762,059; 6,696,022; 6,403,311; 6,355,420; 6,263,286; and 6,210,896 may be used.
  • sequencing methods may be used to analyze DNA and/or RNA according to methods of the invention. It should be appreciated that a sequencing method does not have to be a single molecule sequencing method, since generally nucleic acid material from a substantial sample or biopsy will be available for analysis.
  • Measurement of polypeptide expression
  • the levels of polypeptides encoded by the genes identified in Tables 2, 4, 6, 10, 11, 12 and 13 can be measured directly, without measuring mRNA levels.
  • polypeptides can be detected by differential mobility on protein gels, or by other size analysis techniques such as mass spectrometry.
  • Peptides derived from the gene signatures identified herein can be differentiated by size analysis.
  • the detection means is sequence-specific, such that a particular gene product can accurately be identified as the product of a member of any given gene signature.
  • polypeptide or RNA molecules can be developed which specifically recognise the desired gene products in vivo or in vitro.
  • immunoglobulin molecules may be used to specifically bind to the target polypeptides, for instance in a western blot or ELISA.
  • the immunoglobulins or the target polypeptides may be labelled, to provide a means of identification and measurement. Ideally, such measurements are carried out on an array of immunoglobulin molecules.
  • An "immunoglobulin" is one of a family of polypeptides which retain the immunoglobulin fold characteristic of immunoglobulin (antibody) molecules, which contains two β sheets and, usually, a conserved disulphide bond.
  • Members of the immunoglobulin superfamily are involved in many aspects of cellular and non-cellular interactions in vivo, including widespread roles in the immune system (for example, antibodies, T-cell receptor molecules and the like), involvement in cell adhesion (for example the ICAM molecules) and intracellular signalling (for example, receptor molecules, such as the PDGF receptor).
  • Preferred immunoglobulins are antibodies, which are capable of binding to target antigens with high specificity.
  • Antibodies can be whole antibodies, or antigen-binding fragments thereof.
  • the invention includes fragments such as Fv and Fab, as well as Fab' and F(ab')2, and antibody variants such as scFv, single domain antibodies, Dab antibodies and other antigen-binding antibody-based molecules.
  • polypeptides encoded by the genes set forth in Tables 2, 4, 6, 10, 11, 12 and 13, or peptides derived therefrom, can be used to generate antibodies for use in the present invention.
  • the peptides used preferably comprise an epitope which is specific for a polypeptide encoded by a gene in accordance with the invention.
  • Polypeptide fragments which function as epitopes can be produced by any conventional means (see, for example, U.S. Pat. No. 4,631,211).
  • antigenic epitopes preferably contain a sequence of at least 4, at least 5, at least 6, at least 7, more preferably at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 20, at least 25, at least 30, at least 40, at least 50, and, most preferably, between about 15 to about 30 amino acids.
  • Preferred polypeptides comprising immunogenic or antigenic epitopes are at least 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95 or 100 amino acid residues in length.
  • Antibodies can be generated using antigenic epitopes of polypeptides according to the invention by immunising animals, such as rabbits or mice, with either free or carrier-coupled peptides, for instance, by intraperitoneal and/or intradermal injection of emulsions containing about 100 µg of peptide or carrier protein and Freund's adjuvant or any other adjuvant known for stimulating an immune response.
  • Antibodies for use in the present invention can be fused to marker sequences, such as a peptide which facilitates purification of the fused polypeptide.
  • the marker amino acid sequence is a hexa-histidine peptide, such as the tag provided in a pQE vector (QIAGEN, Inc., 9259 Eton Avenue, Chatsworth, Calif., 91311), among others, many of which are commercially available.
  • hexa-histidine provides for convenient purification of the fusion protein.
  • HA hemagglutinin protein
  • Antibodies as described herein can be altered antibodies comprising an effector protein such as a label.
  • labels which allow the imaging of the distribution of the antibody in vivo.
  • labels can be radioactive labels or radiopaque labels, such as metal particles, which are readily visualisable within the body of a patient. This can allow an assessment to be made without the need for tissue biopsies.
  • they can be fluorescent labels or other labels which are visualisable on tissue.
  • the antibody is preferably provided together with means for detecting the antibody, which can be enzymatic, fluorescent, radioisotopic or other means.
  • the antibody and the detection means can be provided for simultaneous, separate or sequential use, in a diagnostic kit intended for diagnosis.
  • the antibodies for use in the invention can be assayed for immunospecific binding by any method known in the art.
  • the immunoassays which can be used include but are not limited to competitive and noncompetitive assay systems using techniques such as western blots, radioimmunoassays, ELISA, sandwich immunoassays, immunoprecipitation assays, precipitin reactions, gel diffusion precipitin reactions, immunodiffusion assays, agglutination assays, complement-fixation assays, immunoradiometric assays, fluorescent immunoassays and protein A immunoassays.
  • Such assays are routine in the art (see, for example, Ausubel et al, eds, 1994, Current Protocols in Molecular Biology, Vol.
  • Immunoprecipitation protocols generally comprise lysing a population of cells in a lysis buffer such as RIPA buffer (1% NP-40 or Triton X-100, 1% sodium deoxycholate, 0.1% SDS, 0.15 M NaCl, 0.01 M sodium phosphate at pH 7.2, 1% Trasylol) supplemented with protein phosphatase and/or protease inhibitors (e.g., EDTA, PMSF, aprotinin, sodium vanadate), adding the antibody of interest to the cell lysate, incubating for a period of time (e.g., 1-4 hours) at 4°C, adding protein A and/or protein G sepharose beads to the cell lysate, incubating for about an hour or more at 4°C, washing the beads in lysis buffer and res
  • Western blot analysis generally comprises preparing protein samples, electrophoresis of the protein samples in a polyacrylamide gel (e.g., 8%-20% SDS-PAGE depending on the molecular weight of the antigen), transferring the protein sample from the polyacrylamide gel to a membrane such as nitrocellulose, PVDF or nylon, blocking the membrane in blocking solution (e.g., PBS with 3% BSA or non-fat milk), washing the membrane in washing buffer (e.g., PBS-Tween 20), exposing the membrane to a primary antibody (the antibody of interest) diluted in blocking buffer, washing the membrane in washing buffer, exposing the membrane to a secondary antibody (which recognises the primary antibody, e.g., an antihuman antibody) conjugated to an enzymatic substrate (e.g., horseradish peroxidase or alkaline phosphatase) or radioactive molecule (
  • ELISAs comprise preparing antigen, coating the well of a microtitre plate with the antigen, adding the antibody of interest conjugated to a detectable compound such as an enzymatic substrate (e.g., horseradish peroxidase or alkaline phosphatase) to the well and incubating for a period of time, and detecting the presence of the antigen.
  • a second antibody conjugated to a detectable compound can be added following the addition of the antigen of interest to the coated well.
  • the binding affinity of an antibody to an antigen and the off-rate of an antibody-antigen interaction can be determined by competitive binding assays.
  • a competitive binding assay is a radioimmunoassay comprising the incubation of labelled antigen (e.g., 3H or 125I) with the antibody of interest in the presence of increasing amounts of unlabeled antigen, and the detection of the antibody bound to the labelled antigen.
  • the affinity of the antibody of interest for a particular antigen and the binding off-rates can be determined from the data by Scatchard plot analysis.
  • Competition with a second antibody can also be determined using radioimmunoassays.
  • the antigen is incubated with antibody of interest conjugated to a labelled compound (e.g., 3H or 125I) in the presence of increasing amounts of an unlabeled second antibody.
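  • The Scatchard plot analysis mentioned above can be sketched as follows. This is a minimal illustration, assuming a single class of non-interacting binding sites at equilibrium, for which bound/free plotted against bound is a straight line with slope -1/Kd and x-intercept Bmax. The function name, the synthetic data and the arbitrary concentration units are assumptions for illustration; the sketch estimates the equilibrium affinity only and does not address off-rates.

```python
def scatchard_fit(bound, free):
    """Least-squares line through the points (B, B/F).

    Returns (Kd, Bmax) under a one-site binding model (assumption).
    """
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope          # slope of a Scatchard plot is -1/Kd
    bmax = -intercept / slope  # x-intercept is Bmax
    return kd, bmax


# Synthetic, noise-free data generated from a one-site model
# (hypothetical values, arbitrary concentration units).
true_kd, true_bmax = 2.0, 10.0
free_conc = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_conc = [true_bmax * f / (true_kd + f) for f in free_conc]

kd_est, bmax_est = scatchard_fit(bound_conc, free_conc)
print(round(kd_est, 6), round(bmax_est, 6))
```

  With real radioimmunoassay data the points scatter around the line, and curvature in the plot suggests that the one-site assumption does not hold.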
  • Polypeptide levels may be measured using alternative peptide-specific reagents.
  • Such reagents include peptide or RNA aptamers, which can specifically detect a defined polypeptide sequence. Proteins can be detected by protein gel assay, antibody binding assay, or other detection methods known in the art.
  • RNA aptamers can be produced by SELEX. SELEX is a method for the in vitro evolution of nucleic acid molecules with highly specific binding to target molecules. It is described, for example, in U.S. patents 5654151, 5503978, 5567588 and 5270163, as well as PCT publication WO 96/38579, each of which is specifically incorporated herein by reference.
  • the SELEX method involves selection of nucleic acid aptamers, single-stranded nucleic acids capable of binding to a desired target, from a library of oligonucleotides.
  • the SELEX method includes steps of contacting the library with the target under conditions favourable for binding, partitioning unbound nucleic acids from those nucleic acids which have bound specifically to target molecules, dissociating the nucleic acid-target complexes, amplifying the nucleic acids dissociated from the nucleic acid-target complexes to yield a ligand-enriched library of nucleic acids, then reiterating the steps of binding, partitioning, dissociating and amplifying through as many cycles as desired to yield highly specific, high affinity nucleic acid ligands to the target molecule.
  • SELEX is based on the principle that within a nucleic acid library containing a large number of possible sequences and structures there is a wide range of binding affinities for a given target.
  • a nucleic acid library comprising, for example a 20 nucleotide randomised segment can have 4^20 structural possibilities. Those which have the higher affinity constants for the target are considered to be most likely to bind.
  • the process of partitioning, dissociation and amplification generates a second nucleic acid library, enriched for the higher binding affinity candidates. Additional rounds of selection progressively favour the best ligands until the resulting library is predominantly composed of only one or a few sequences. These can then be cloned, sequenced and individually tested for binding affinity as pure ligands.
  • the iterative selection/amplification method is sensitive enough to allow isolation of a single sequence in a library containing at least 10^14 sequences.
  • the nucleic acids of the library preferably include a randomised sequence portion as well as conserved sequences necessary for efficient amplification.
  • Nucleic acid sequence variants can be produced in a number of ways including synthesis of randomised nucleic acid sequences and size selection from randomly cleaved cellular nucleic acids.
  • the variable sequence portion can contain fully or partially random sequence; it can also contain subportions of conserved sequence incorporated with randomised sequence. Sequence variation in test nucleic acids can be introduced or increased by mutagenesis before or during the selection/amplification iterations and by specific modification of cloned aptamers.
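  • The enrichment behaviour of SELEX described above can be illustrated with a small deterministic model. This is a sketch under simplifying assumptions: each sequence species is retained in the partitioning step in proportion to a fixed binding probability, and amplification restores the pool to unit size without bias. The function names, the three species and their binding probabilities are hypothetical.

```python
def library_size(random_positions: int) -> int:
    """Number of distinct sequences in a fully randomised segment (4 bases)."""
    return 4 ** random_positions


def selex_round(fractions, binding_probs):
    """One round: partition by binding probability, then re-amplify to a unit pool."""
    retained = [f * p for f, p in zip(fractions, binding_probs)]
    total = sum(retained)
    return [r / total for r in retained]


def enrich(fractions, binding_probs, rounds):
    """Apply `rounds` successive selection/amplification rounds."""
    for _ in range(rounds):
        fractions = selex_round(fractions, binding_probs)
    return fractions


print(library_size(20))  # size of a 20-nucleotide randomised library

# Hypothetical pool: a rare strong binder, a moderate binder, a weak majority.
start = [1e-6, 0.099999, 0.9]
probs = [0.9, 0.1, 0.01]
final = enrich(start, probs, rounds=8)
print([round(f, 4) for f in final])
```

  In this toy model the strong binder grows from one part per million to more than 99% of the pool after eight rounds, which mirrors the statement that additional rounds progressively favour the best ligands.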
  • Hierarchical Cluster Analysis is defined as grouping or segmenting a collection of objects into subsets or "clusters".
  • the objects to be clustered can be either the genes or the samples: genes can be clustered by comparing their expression profiles across the set of samples, or the samples can be clustered by comparing their expression profiles across the set of genes. In such a way, the genes (or samples) within each cluster are more closely related to one another than genes (or samples) grouped within different clusters. In a hierarchical clustering analysis, the genes (or samples) are not partitioned into a particular cluster in a single step.
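  • A stepwise (agglomerative) hierarchical clustering of samples can be sketched as follows. The sketch assumes Euclidean distance between expression profiles and average linkage between clusters; neither choice is specified in the text, and the function names and the four toy samples are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(profiles, c1, c2):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(profiles[i], profiles[j]) for i in c1 for j in c2]
    return sum(dists) / len(dists)


def hierarchical_cluster(profiles):
    """Merge the two closest clusters repeatedly.

    Returns the merge history as (members_a, members_b, distance) tuples.
    """
    clusters = [(i,) for i in range(len(profiles))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(profiles, clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


# Four toy samples measured on three genes (hypothetical values).
samples = [
    [1.0, 1.1, 5.0],
    [1.1, 1.0, 5.2],
    [6.0, 6.2, 0.5],
    [6.1, 5.9, 0.4],
]
merges = hierarchical_cluster(samples)
for left, right, dist in merges:
    print(left, right, round(dist, 3))
```

  Clustering genes instead of samples only requires transposing the matrix so that each row is the profile of one gene across the samples.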
  • Microarray data are available at the Gene Expression Omnibus (GEO) of the NCBI (GSE19188).
  • NSCLC microarray datasets
  • Two additional NSCLC microarray datasets were employed in this study to verify the identified gene predictors.
  • One data set contained eighteen NSCLC cell line microarrays from the NCI-60 drug screen panel (16) and the other contained 96 primary NSCLC cases (17).
  • the validation cell lines and tumors were transcriptionally profiled with Affymetrix U133 Plus 2.0 arrays and the complete microarray data sets were accessible in the Gene Expression Omnibus (GEO) database of the NCBI (GSE8332 and GSE3593).
  • the sensitivity of the NSCLC cell lines to Pemetrexed was tested in vitro using a standard MTT colorimetric assay quantifying the amount of viable cells (16).
  • RNA from frozen tumor tissues was isolated and processed according to the standard protocol for Affymetrix U133 Plus 2.0 arrays. The details of microarray data processing and normalization are as described previously (15). Microarray data are available in the GEO database (GSE19188). Microarray data were normalized by the RMA algorithm. RMA (Robust Multi-Array average) is an integrated algorithm comprising background adjustment, quantile normalization, and expression summarization by median polish (18). The intensities of mismatch probes were ignored due to their spurious estimation of non-specific binding. The intensities were background-corrected in such a way that all corrected values must be positive.
  • the RMA algorithm utilized quantile normalization in which the signal value of individual probes was substituted by the average of all probes with the same rank of intensity on each chip/array. Finally, Tukey's median polish algorithm was used to obtain the estimates of expression for normalized probe intensities. Intensities of probe sets lower than 30 were reset to 30. Probe sets were involved in further analysis only if their expression levels deviated from the overall mean in at least one array by a minimum factor of 2.5, because the remaining data were unlikely to be informative. The result was that 43,160 probe sets were eliminated, and 11,515 probe sets remained for further analysis.
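  • The quantile normalization, flooring and filtering steps described above can be sketched as follows. This is an illustration only and not the RMA implementation used in the study: background adjustment and median polish are omitted, ties are broken arbitrarily, and the reading of "overall mean" as a single reference value passed in by the caller is an assumption. All function names are hypothetical.

```python
def quantile_normalize(arrays):
    """Replace each intensity by the mean of all intensities of the same rank.

    arrays: list of equal-length intensity lists, one list per chip.
    """
    n_probes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [
        sum(s[r] for s in sorted_arrays) / len(arrays) for r in range(n_probes)
    ]
    normalized = []
    for a in arrays:
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


def floor_intensities(values, floor=30.0):
    """Reset intensities below `floor` to `floor`."""
    return [max(v, floor) for v in values]


def is_informative(probe_set_values, overall_mean, factor=2.5):
    """True if at least one array deviates from the mean by `factor` or more,
    in either direction (fold-change interpretation, an assumption)."""
    return any(
        v >= overall_mean * factor or v <= overall_mean / factor
        for v in probe_set_values
    )


chips = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
print(quantile_normalize(chips))
print(floor_intensities([12.0, 30.0, 250.0]))
print(is_informative([100.0, 110.0, 260.0], overall_mean=100.0))
```

  After normalization every chip contains the same set of values, differing only in which probe carries which value, which is the defining property of quantile normalization.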
  • Unsupervised clustering and novel grouping of NSCLC
  • Omniviz software was used to measure the similarities in expression profiles among samples (15).
  • the samples were ordered so that those sharing strong similarities were arranged together into clusters.
  • the clusters and the individual samples within the clusters were sorted in such a manner that the more similar subjects were more closely positioned in the visualization matrix.
  • Six distinct NSCLC clusters were identified by gene expression profiles, as described in (15).
  • NSCLC cell lines within the NCI-60 drug screen panel were transcriptionally profiled by Affymetrix U133 Plus 2.0 array (GSE8332). The sensitivity of these NSCLC cell lines to Pemetrexed was tested in vitro using a standard MTT colorimetric assay via quantifying the amount of viable cells (16).
  • a set of 96 primary NSCLCs were profiled by Affymetrix U133 Plus 2.0 array (GSE3593), and the complete microarray data was downloaded from http://data.genome.duke.edu/LungPotti.php (17). The expression of relevant probe sets/genes of interest was retrieved using a script written in MATLAB.
  • the expression of genes encoding Pemetrexed targets measured by microarray was employed to classify NSCLC to different response groups.
  • the schemes predicted tumor response first using the expression of TYMS, the major target of Pemetrexed, alone, and then using the expression of all 3 targets, TYMS, DHFR, and GART.
  • the methodology was adjusted to be individually determinant: the expression level of the TYMS gene was scaled relative to a set of reference genes from the same microarray, the internal reference probe sets/genes (IRG).
  • Non-Responder: at least 2 out of 3 probe sets/genes presented an expression higher than 60% of the studied population; or 6 out of 14 in cases where all 3 targets were counted.
  • SAM: Significance Analysis of Microarray (19). SAM discovered differentially expressed genes between two classes (19), e.g. predicted non-responders and responders. The obtained signatures were analysed to identify subgroups of genes that optimally maintain the capacity of the complete signatures in distinguishing different groups (20). The performance of minimized signatures was validated by "leave-one-out" cross validation (21).
  • Probe set identifiers or gene symbols were used to retrieve functional annotation in terms of biological process (BP) and molecular function (MF) from Gene Ontology (GO) for the identified signature genes. Genes/probe sets which were not annotated in the GO knowledge database were excluded from further analyses. Mapped BPs and MFs were subjected to enrichment analysis to determine functional categories significantly overrepresented (DAVID) (22). The reference background used was the human genome.
  • the occurrence of gene members belonging to a certain GO category from the input gene list was compared to that from the gene population. For instance, 10% of input genes may belong to a GO category, while in the human genome, the enrichment of that GO category is 0.17% (50 out of 30,000 genes).
  • the enrichment score was calculated based on the ratio of two enrichments, and the significance, enrichment p-value, was calculated using Fisher's exact test. Multiple test correction was controlled using false discovery rate (FDR) from the Benjamini-Hochberg method.
  • a rarely reported problem with GO term-driven analyses is the inheritance of genes in an ancestry classification system. For instance, genes are repetitively assigned to categories, from ancestor to descendants.
  • a methodology is proposed to address this problem.
  • First, all possible relationships between any two GO categories are identified and recorded in a matrix (MATLAB).
  • Second, existing ancestor and subordinate categories are tagged. Then the relationship of all enriched GO categories is visualized in a diagram.
  • the selected biological processes are condensed into classes by clustering related GO terms on the basis of interrelationship among processes in a network context.
  • categories with a common ancestry are linked in a hierarchical tree.
  • all subordinate categories are combined to the highest level ancestor category to avoid the redundant counting of enrichment genes. Analysis using other pathway knowledge databases
  • TMA Tissue Microarray Analysis
  • the TMA comprised 70 of the 91 tumor tissues, in three replicates, from the Erasmus MC patient cohort used for the expression microarray analyses. TMA blocks were cut into 6 µm slices and antigen retrieval was performed by a 20 min incubation at 95°C using a Tris-EDTA buffer (Klinipath). Slides were cooled down to room temperature and stained with primary antibodies detecting TYMS, EGFR, TP53, TP63, TTF1, SYP, NCAM1, CHGA, or KRT5. The source of the antibodies and dilutions used are listed in Table 2.
  • the second step was incubation for 30 min with rabbit anti-mouse 1:50 (Z0259 Dako) followed by 1:50 diluted Alkaline Phosphatase Anti-Alkaline Phosphatase (APAAP method; D0651 Dako). Staining was visualized using 20 min development with New Fuchsine substrate.
  • TMA evaluation and protein staining quantification was performed double-blinded by a lung pathologist.
  • the intensity of protein staining was classified using a four-grade scale: 0 indicating fewer than 10% of positive cells, 1 for 10% to 25%, 2 for 25% to 50%, and 3 for greater than 50%.
  • Example 1
  • ADC accounted for a major part of each of these groups, and most LCC / large cell neuroendocrine carcinoma (LCNEC) cases were mingled with ADC in Group4 (G4) and Group6 (G6).
  • ADC in G4 and G6 displayed gene expression patterns suggestive of neuroendocrine features. Regardless of histological consistency between G1 and G2, the NSCLCs in these two groups were distinguished by a low degree of cell differentiation in G1, and the expression of a large number of immune-related genes in G2.
  • G4, G5, and G6 comprised neuroendocrine NSCLC, including LCNEC, CAR, and NSCLC with neuroendocrine features, mainly of the ADC histological subtype. All CARs were clustered into an independent group (G5). Although the expression profiles of CAR showed some similarity to G4 and G6, the observation that CAR displayed a unique transcriptome profile suggested that CAR is a group of NSCLC with distinct behavior with respect to tumor cell aggressiveness, tumor response to therapy, and prognosis.
  • G1 and G2: Gene expression profiling delineated that G1 and G2 were exclusively composed of classical ADC.
  • the G1 signature was enriched by cell cycle genes and proliferative genes, while G2 was characterized by expression of genes associated with complement system, immune response and cytokine secretion. This suggested that these two groups might display a different natural history of disease.
  • G3 was composed of classical SCC.
  • the four non-SCCs in G3 were indistinguishable by expression profiles from the other SCCs in G3, with well known SCC markers, including TP63, KRTs and SERPINB, uniformly highly expressed. Additional pathological analysis revealed that two of them presented either positive staining for TP63 or apparent squamous cell elements.
  • histopathological heterogeneity of cancer cells is a common feature of a large fraction of NSCLCs.
  • Molecular phenotyping may be more sensitive than histopathological morphology in grouping NSCLC with respect to tumor behavior.
  • Individual novel groups are characterized by unique gene markers
  • SFTP surfactant proteins
  • G5 CAR
  • G1 and G2 were associated with focal adhesion and cell adhesion processes respectively, confirming that groups with similar histological composition differed functionally in molecular processes (Table 3). This indicates that different oncogenic mechanisms may be operational in NSCLC, and that these are unrelated to histology as such.
  • TYMS was compared to the average expression of the IRG. Each patient was designated as resistant (NR), sensitive (R), or medium sensitive (M) to Pemetrexed therapy. According to the IRG scheme, out of 91 NSCLC patients 34.1% were predicted as non-responders, 52.7% were predicted to be responders, and 13.2% were predicted to have medium sensitivity to Pemetrexed (Table 8).
  • the relative expression of TYMS in predicted NRs is 177.1 (95% confidence interval: 143.2-210.9), 8.2-fold higher than in normal lungs, while predicted Rs displayed a 2.2-fold increase in relative expression of TYMS compared with normal lungs. When expression of DHFR or GART was included in the predictive scheme, a similar output was observed.
  • The differential activity of relevant pathways in NR and R determined by a global analysis is shown in Fig. 2.
  • Surfactant genes, SOX7, SLC16A4 and SLC46A3 were down-regulated in predicted NRs. DNA damage repair-associated genes contributing to multi-drug resistance were found over-expressed in predicted NRs, including TOP2A, PRIM1, and ATP-binding cassette (ABC) genes.
  • among the signature genes are a large number of cell cycle regulatory genes such as cyclins and CDCs; cell division related genes like E2Fs, GTSE1, KIFs, MCMs, and IGFBPL1; cell growth and invasion related genes including MMP19; and oncogenes and suppressor genes such as MYB, NBL1 and RAS.
  • a subset of this signature, represented by 25 probe sets, performed optimally in predicting Pemetrexed response (Table 6) (20-21).
  • the histological subtype (ADC, SCC or LCC) was assigned using the histology signature identified previously (15).
  • LCC contained the highest expression of TYMS (192.0; 95% CI: 125.6-258.4), followed by SCC (86.6; 95% CI: 73.0-100.1) and ADC (76.1; 95% CI: 58.5-93.7) (Fig. 3).
  • a significant difference in TYMS expression was observed between each subtype of NSCLC and non-cancerous tissues, with 8.85-, 4-, and 3.5-fold increase in LCC, SCC, and ADC, respectively.
  • the difference in TYMS expression was statistically significant between LCC and the other two subtypes, ADC (p-value ⁇ 0.002) and SCC (p-value ⁇ 0.004), but not between ADC and SCC
  • a novel NSCLC group is associated with predicted Pemetrexed resistance
  • The expression profile of G4 was distinguishable from other neuroendocrine tumors.
  • a differential over-expression of neuroendocrine markers, including ASCL1, DDC, and MAST4, was observed among neuroendocrine groups, with a 2- to 4-fold difference between G4 and G6.
  • MCM6 and CDCA7 showed a relatively higher expression in G4 compared with G6.
  • Pemetrexed is transported in and out of cells by membrane proteins such as FOLR1, SLC19A1, and ATP-binding cassette (ABC) family members.
  • Pemetrexed is metabolized by folylpolyglutamate synthetase (FPGS). The aberrant expression of such molecules may also contribute to Pemetrexed resistance.
  • FPGS folylpolyglutamate synthetase
  • Bioinformatics analysis revealed that pyrimidine metabolism and EGF signaling pathways were more activated in G4 compared to other NSCLCs (Figs. 5, 6). Moreover, compared with the other neuroendocrine NSCLCs in G6, pyrimidine metabolism was more up-regulated and the histidine pathway more down-regulated in G4 (Fig. 7).
  • Distinct molecular characteristics of G3 NRs: Gene expression stratified SCCs in G3 into two putative groups differing in Pemetrexed responsiveness.
  • the putative SCC NRs presented higher expression of TP53 and higher activity of TP53-associated signaling pathway than putative Rs in this group.
  • predicted SCC NRs differed in the expression of ABCC1 and FOLR2 from predicted SCC Rs, with a 1.47- to 1.79-fold differential expression in putative NRs (Fig. 8).
  • the expression of resistance associated genes identified with primary NSCLCs was validated in two independent sample cohorts, in transcriptionally profiled NSCLC cell lines (GSE8332) and primary NSCLCs (GSE3141/Duke Cohort) (16-17).
  • the performance of our signature was evaluated by comparing the predicted Pemetrexed sensitivity of the cell lines to the measured response in drug sensitivity assays (16) (Fig. 4B).
  • the sensitivity to Pemetrexed was predicted for primary NSCLCs in the Duke Cohort (Fig. 4C).
  • the 25-probe set signature correctly predicted response to Pemetrexed in 94% (17 out of 18) of the cell lines. Resistant cell lines were all correctly predicted; the sensitivity of predicting resistance was 100% and specificity 91.7%.
  • 19. Tusher VG, Tibshirani R, Chu G. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A 2001 Apr 24;98(9):5116-21.
  • 20. Tibshirani R, Hastie T, Narasimhan B, Chu G. Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc Natl Acad Sci U S A 2002 May 14;99(10):6567-72.
  • NR predicted non-responder
  • G1-G6 predicted novel NSCLC subgro
  • ProbelD G2 OG G3 : OG G4 : OG G5 : OG G6 : OG symbol Ratio Ratio Ratio Ratio Ratio Ratio Ratio Ratio

Abstract

The invention relates to a method for preparing an optimised gene signature for assigning a NSCLC sample to one or more NSCLC classes, comprising subjecting a gene signature set forth in Table 12 herein to nearest shrunken centroid analysis to identify one or more subgroups of gene classifiers corresponding to one or more of classes 1 to 6 identified in Table 12, and validating the performance of the selected classifiers by K-fold or leave-one-out cross-validation.

Description

Method for classifying tumour cells
Currently, NSCLC is classified based on microscopic analysis of specific histological features, resulting in morphological subtyping and grading. This histopathological classification correlates poorly with patient prognosis and clinical outcome. Ideally, therapeutic regimens should be tailored for individual patients, in order to obtain maximal anti-tumor effects. For example, EGFR-TKI treatment in NSCLC patients harboring EGFR mutations improved the response rate to ~68% (27), illustrating the importance of better defining the target group by molecular analysis. Despite this promising example, tailored therapy for NSCLC remains largely elusive. Most NSCLC of similar histology and grade receive the same therapy, and differences in molecular characteristics are not taken into account routinely. It is therefore important to develop algorithms for molecular identification of patients that would be sensitive or resistant to a specific therapy. Genome-wide expression studies have revealed that NSCLCs may be classified beyond classical histo-pathological criteria and the resultant subgroups might better indicate the intrinsic divergence of tumor progression, recurrence, and response to therapy (1-4). Gene expression profiling can be used to reveal tumor features that are relevant to clinical outcome. For example, clustering of ADC or SCC cases based on gene expression profiles identified subgroups presenting favorable overall survival (2-3, 28-30). Microarray-derived gene signatures have also demonstrated the ability to define the risk of NSCLC recurrence (31). Ultimately molecular profiling would be expected to predict the response to specific therapies. In breast cancer and cell lines, gene signatures were identified that reflect the activation status of oncogenic pathways.
Based on these signatures, the coordinated active status of pathways was obtained that not only defined prognosis in specific patient subgroups but also predicted the sensitivity to therapeutic agents targeting key components of these pathways (32).
Pemetrexed is one of the most effective drugs for the treatment of NSCLC. Pemetrexed is a folate anti-metabolite and targets multiple enzymes essential for nucleotide biosynthesis (5). It has been reported to have possibly superior activity compared to commonly used agents for treatment of adenocarcinoma (ADC) and large cell carcinoma (LCC), but is thought to be less effective for the treatment of squamous cell carcinoma (SCC). By using real time - polymerase chain reaction (RT-PCR), a study identified that the efficacy of Pemetrexed treatment was related to mRNA expression level of Thymidylate Synthase (TYMS), a key molecule in the thymidine synthesis pathway (6). It was also demonstrated that high expression of TYMS is associated with resistance to Pemetrexed in NSCLC (p = 0.006). Furthermore a higher expression of TYMS is more often seen in SCC than in ADC and LCC (7-9). Based on those observations, Pemetrexed was approved as first-line treatment in combination with cisplatin for advanced non-SCC NSCLC patients (8).
Among ADC and LCC patients, the response rate to Pemetrexed varies between 28 and 61% (10-11). Intriguingly, a significant number of ADC and LCC cases with high level of TYMS expression were observed with immunohistochemistry (12), suggesting that Pemetrexed treatment would not be effective in those cases. It also indicates that in order to reach a higher response rate for Pemetrexed treatment, it is vital to develop objective criteria for patient selection. Determination of TYMS expression levels by immunohistochemistry has been found to be too unreliable for this purpose (13-14). The results are to a great extent prone to variation due to sample processing, antibody specificity, and interpretation of the staining results.
We show herein that early stage NSCLCs can be partitioned into six sub-groups based on global gene expression profiles. In particular, a subset of ADC and LCC is clustered in a novel subgroup. The potential clinical relevance of these novel groups is explored by linking this refined phenotyping to the predicted sensitivity to Pemetrexed. Analysis of the expression levels of relevant genes predicts that tumors in this novel subgroup are highly likely to be resistant to Pemetrexed therapy. Conversely, a subset of SCCs are putative responders to Pemetrexed treatment. The identification of these distinct subgroups of NSCLC suggests that biological characteristics assessed by gene expression profiling can aid in reliably stratifying patients with respect to the choice of therapeutic agents.
Summary of the Invention
In a first aspect, there is provided a method for preparing an optimised gene signature for assigning a NSCLC sample to one or more NSCLC classes, comprising subjecting a gene signature set forth in Table 12 to nearest shrunken centroid analysis to identify one or more subgroups of gene classifiers corresponding to one or more of classes 1 to 6 identified in Table 12, and validating the performance of the selected classifiers by K-fold or leave-one-out cross-validation.
We have found that differences in gene expression profiles, which correspond with activation or inactivation of metabolic pathways, provide a more reliable classification of NSCLC than traditional histological approaches. The genes used in the gene expression analyses, therefore, should correspond with metabolic pathways. In one embodiment, the metabolic pathways are as follows. Group 1: adherens junction and focal adhesion. Group 2: adhesion molecules on lymphocytes and neutrophils, and complement and coagulation cascades. Group 3: drug metabolism, including cytochrome P450 and ABC transporters, and p53 signalling pathway. Group 4: tyrosine metabolism and complement and coagulation cascades. Group 5: long-term potentiation, and neuroactive ligand-receptor interaction. Group 6: histidine metabolism and GnRH signalling pathway.
The full signature set forth in Table 12 was subjected to analysis to identify subgroups of gene classifiers that optimally maintain the capacity of the full signature in distinguishing different phenotypes.
The algorithm used in this method is the nearest shrunken centroid classifier (Tibshirani, R., et al., Proc Natl Acad Sci U S A, 2002. 99(10): p. 6567-72).
In the training set, two phenotypes are defined as non-responder (NR) and responder (R). The standardized centroid of each phenotype is calculated. That is, the average gene expression (in log intensities) for each gene divided by the within-phenotype standard deviation for that gene. The centroids of each phenotype then are shrunken toward each other by shrinking the phenotype means of each gene toward an overall mean for all phenotypes. The amount of shrinking is determined by a user-defined parameter. By changing the parameter, the number of genes which have different shrunken means between NR and R is changed, so the classifiers included in the phenotype predictor are changed accordingly.
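The centroid shrinkage described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the function names, the offset `s0` (taken here as the median of the per-gene standard deviations) and the scaling constant `m_k` are illustrative assumptions, and the input is assumed to be log-intensity vectors with one phenotype label per sample.

```python
import math
from statistics import mean, median

def soft_threshold(d, delta):
    """Shrink d toward zero by delta; values within delta of zero become 0."""
    return math.copysign(max(abs(d) - delta, 0.0), d)

def shrunken_centroids(samples, labels, delta):
    """Return (centroids, pooled_sd, active_genes) for a shrinkage amount delta.

    samples: list of per-sample expression vectors (log intensities)
    labels:  phenotype label per sample, e.g. "NR" or "R"
    """
    n, n_genes = len(samples), len(samples[0])
    by_class = {k: [s for s, l in zip(samples, labels) if l == k]
                for k in sorted(set(labels))}
    overall = [mean(s[g] for s in samples) for g in range(n_genes)]
    class_mean = {k: [mean(s[g] for s in rows) for g in range(n_genes)]
                  for k, rows in by_class.items()}
    # Pooled within-phenotype standard deviation for each gene.
    pooled_sd = []
    for g in range(n_genes):
        ss = sum((s[g] - class_mean[k][g]) ** 2
                 for k, rows in by_class.items() for s in rows)
        pooled_sd.append(math.sqrt(ss / (n - len(by_class))))
    s0 = median(pooled_sd)  # assumed offset guarding against tiny variances
    centroids, active = {}, set()
    for k, rows in by_class.items():
        m_k = math.sqrt(1.0 / len(rows) + 1.0 / n)  # assumed scaling constant
        shrunk = []
        for g in range(n_genes):
            scale = m_k * (pooled_sd[g] + s0)
            d = soft_threshold((class_mean[k][g] - overall[g]) / scale, delta)
            if d != 0.0:
                active.add(g)  # gene still separates this phenotype
            shrunk.append(overall[g] + scale * d)
        centroids[k] = shrunk
    return centroids, pooled_sd, sorted(active)
```

Raising `delta` pulls more genes back to the overall mean, so the list of active genes (the classifiers) shrinks as the parameter grows.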
The performance of the classifiers selected based on the parameter is then validated by K-fold or "leave-one-out" cross validation (Golub, T., et al., Science, 1999. 286: p. 531-536). In the case of leave-one-out cross validation, for all training samples (n=K) in turn, the classifiers are built on the other K-1 samples and then tested on the remaining sample. The above procedure is repeated for a range of parameter values, and the cross-validated misclassification errors are reported for each parameter value. The parameter value giving the minimum cross-validated misclassification error and the corresponding classifiers are chosen. When a test sample is analysed, the distance of the expression profile for the new sample to both NR and R centroids is calculated and compared. The test sample is then predicted to belong to NR or R, whichever corresponds to the nearest centroid.
In a second aspect, there is provided a gene signature for assigning a NSCLC sample to one or more NSCLC classes as identified in Table 12, wherein at least 80% of the genes comprised in said gene signature are set forth in Table 12.
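The leave-one-out cross-validation and nearest-centroid prediction described for the first aspect can be sketched as follows. This is a simplified illustration under stated assumptions: the `train` function soft-thresholds raw (unstandardized) differences from the overall mean, distances are squared Euclidean, and all function names are hypothetical.

```python
from statistics import mean

def train(samples, labels, delta):
    """Class centroids whose offsets from the overall mean are soft-thresholded by delta."""
    n_genes = len(samples[0])
    overall = [mean(s[g] for s in samples) for g in range(n_genes)]
    centroids = {}
    for k in set(labels):
        rows = [s for s, l in zip(samples, labels) if l == k]
        cent = []
        for g in range(n_genes):
            diff = mean(r[g] for r in rows) - overall[g]
            shrunk = max(abs(diff) - delta, 0.0) * (1 if diff >= 0 else -1)
            cent.append(overall[g] + shrunk)
        centroids[k] = cent
    return centroids

def predict(centroids, sample):
    """Assign the sample to the phenotype with the nearest centroid."""
    def dist(c):
        return sum((x - y) ** 2 for x, y in zip(sample, c))
    # Ties are broken alphabetically by phenotype label.
    return min(sorted(centroids), key=lambda k: dist(centroids[k]))

def loocv_errors(samples, labels, delta):
    """Leave each sample out in turn, train on the rest, count misclassifications."""
    errors = 0
    for i in range(len(samples)):
        rest = samples[:i] + samples[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        if predict(train(rest, rest_labels, delta), samples[i]) != labels[i]:
            errors += 1
    return errors

def choose_delta(samples, labels, deltas):
    """Return the first delta (in the given order) with the fewest cross-validated errors."""
    return min(deltas, key=lambda d: loocv_errors(samples, labels, d))
```

A K-fold variant would hold out a block of samples per iteration instead of a single sample; the selection of the parameter value by minimum cross-validated error is unchanged.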
In a third aspect, there is provided a method for classifying NSCLC comprising the steps of:
(i) analyzing a gene expression profile of one or more NSCLC, said gene expression profile comprising the expression levels of at least 80% of the genes set forth in a gene signature comprising NSCLC classes 1-6 as identified in Table 12, said gene signature being:
(a) the gene signature set forth in Table 12;
(b) a gene signature according to the second aspect of the invention;
(c) a gene signature wherein the genes are identified as classifiers by the method of the first aspect of the invention; or
(d) the gene signature set forth in Table 4;
(ii) comparing the gene expression levels in the NSCLC, and detecting differences in gene expression which are characteristic of any one or more of Groups 1-6 as set forth in the gene signature; and
(iii) assigning the NSCLC into any one of Groups 1-6 as set forth in the gene signature.
Classical subtyping of NSCLC makes use of histological features to subdivide tumours into ADC, LCC and SCC. As set forth above, we have abandoned this subtyping methodology, on the grounds that the results it provides are misleading and lead to mischaracterization of a number of NSCLC subtypes. A new, alternative method for NSCLC characterisation is provided, which is based exclusively on analysis of the gene expression profile of the NSCLC.
In one embodiment, the gene expression profile includes at least 85% of the genes identified in Table 4. Preferably, the gene expression profile includes at least 90% of the genes identified in Table 4. For example, the gene expression profile can include 95%, 96%, 97%, 98%, 99% or 100% of the genes set forth in Table 4. The gene expression profile(s) obtained can be compared with an external standard, for example the expression profiles of NSCLC archived in a database, or can be compared internally amongst the sampled NSCLC. For example, the levels of gene expression can be compared to
(a) an external standard of NSCLC gene expression levels; or
(b) the expression levels of at least 10 of the internal reference genes identified in Table 10.
In one embodiment, the internal reference genes are the genes set forth in Table 11.
In a further embodiment, there is provided a method for classifying a test tissue sample of a malignant non-small cell lung carcinoma (NSCLC) by analysis of gene expression, comprising the steps of:
(a) performing unsupervised hierarchical clustering of the gene expression data, to identify clusters as defined by over- or under-expression of genes;
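As an illustration of comparison against internal reference genes, the sketch below scales one target gene to the mean expression of reference genes measured on the same array. It assumes linear (not log) intensities and a simple ratio; the function name, the gene identifiers in the usage line and the exact scaling are hypothetical, and only the minimum of 10 reference genes is taken from the text above.

```python
from statistics import mean

def relative_expression(profile, target, reference_genes, min_refs=10):
    """Scale a target gene to the mean of internal reference genes on the same array.

    profile:         dict mapping gene or probe set identifier to linear intensity
    reference_genes: identifiers of the internal reference genes
    min_refs:        minimum number of reference genes that must be measured
    """
    refs = [profile[g] for g in reference_genes if g in profile]
    if len(refs) < min_refs:
        raise ValueError(f"need at least {min_refs} reference genes, found {len(refs)}")
    return profile[target] / mean(refs)
```

For example, `relative_expression(profile, "TYMS", reference_ids)` would return the TYMS intensity as a multiple of the average reference intensity for that sample, which makes the value comparable between arrays without an external standard.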
(b) assaying the expression levels of 80% or more of the genes set forth in Table 4; and
(c) assigning the NSCLC sample to one of the groups identified in Table 4.
Advantageously, gene expression is analysed by two-dimensional hierarchical clustering, which provides a graphical representation of comparative gene expression and facilitates classification of NSCLC samples into the relevant gene expression-defined subtypes.
Preferably, the 80% or more of the genes in Table 4 is substantially all of the genes in Table 4.
Preferably, the methods according to the invention are in vitro methods.
In a further embodiment, NSCLC which are categorized in Group 4 show reduced expression of one or more of FOLR1, ASCL1, DDC or MAST4 compared with other neuroendocrine NSCLC; or increased expression of one or more of ABCCs, MCM6 and CDCA7 compared with other neuroendocrine NSCLC.
In one aspect of the present invention, it has been shown that classification of NSCLC by gene expression analysis allows susceptibility or resistance to drugs to be predicted, according to the classification of the NSCLC. For example, as shown herein, NSCLC in Group 4 are predicted to be resistant to the drug Pemetrexed.
The invention moreover provides a method for preparing an optimised gene signature for predicting resistance to Pemetrexed in a NSCLC, comprising subjecting a gene signature set forth in Table 13 to nearest shrunken centroid analysis to identify subgroups of gene classifiers corresponding to responders and non-responders to Pemetrexed therapy, and validating the performance of the selected classifiers by K-fold or leave-one-out cross-validation.
Using the gene expression signatures which we have developed for classification of NSCLC, it has been possible to develop a Pemetrexed response signature, which is set forth in Table 13. A minimal signature for Pemetrexed response is set forth in Table 6.
Accordingly, there is provided a method for predicting resistance to Pemetrexed in a NSCLC comprising the steps of:
(i) analyzing a gene expression profile of one or more NSCLC, said gene expression profile comprising the expression levels of at least 80% of the genes set forth in a gene signature, said gene signature being:
(a) the gene signature set forth in Table 13;
(b) a gene signature wherein 80% of the genes in said signature are set forth in Table 13;
(c) a gene signature wherein the genes are identified as classifiers by subjecting a gene signature set forth in Table 13 to nearest shrunken centroid analysis to identify subgroups of gene classifiers corresponding to responders and non-responders to Pemetrexed therapy, and validating the performance of the selected classifiers by K-fold or leave-one-out cross-validation; or
(d) the gene signature set forth in Table 6;
(ii) comparing the gene expression levels in the NSCLC, and detecting differences in gene expression which are characteristic of response or non-response to Pemetrexed; and
(iii) assigning the NSCLC to a responder or non-responder group.
Preferably, the method comprises profiling the expression of at least 90% of the genes set forth in Table 13, and more preferably 95%, 96%, 97%, 98%, 99% or 100% of the genes set forth in Table 13.
In one embodiment, there is provided a method for predicting resistance to Pemetrexed in a NSCLC, comprising the steps of: (i) profiling the expression of at least 80% of the genes set forth in Table 6; (ii) comparing the expression of the genes profiled in (i) with the signature set forth in Table 6; and (iii) predicting the NSCLC to be responsive or non-responsive to Pemetrexed according to the Table 6 signature. Preferably, the method comprises profiling the expression of at least 90% of the genes set forth in Table 6, or all of the genes set forth in Table 6.
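The coverage thresholds used throughout (at least 80%, preferably 90% or more, of the genes in a signature) can be checked with a short sketch such as the one below. The function names and the probe set identifiers in the usage note are hypothetical placeholders, not entries from Table 6.

```python
def signature_coverage(profiled_genes, signature_genes):
    """Fraction of the signature genes that are present in the profiled set."""
    signature = set(signature_genes)
    return len(signature & set(profiled_genes)) / len(signature)

def meets_threshold(profiled_genes, signature_genes, threshold=0.80):
    """True when the profile covers at least `threshold` of the signature."""
    return signature_coverage(profiled_genes, signature_genes) >= threshold
```

For a 25-member signature, a profile containing 20 of its members gives a coverage of 0.80 and passes the default threshold, while a profile containing 19 members (0.76) does not.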
The invention moreover provides diagnostic kits for determining the subtype grouping of NSCLC, and/or predicting the susceptibility or resistance of an NSCLC to one or more drugs. Such kits comprise reagents for measuring the presence of mRNA or polypeptides encoded by the genes identified herein.
In one embodiment, such kits may contain instructions as to use. In particular, the kits may contain instructions as to the selection of genes to be screened in the diagnosis of NSCLC as set forth herein. Preferably, the genes are 80% or more of the genes set forth in Table 4. More preferably, the genes are substantially all of the genes set forth in Table 6. Moreover, the kit may contain instructions for the detection of the gene products expressed from said mRNA species.
In general, it will be appreciated that any method for recognising the levels of expression of a gene may be used in the context of the present invention. The genes identified in each gene signature, and the changes in expression levels associated therewith, are identified in the Tables set out herein; analysis can be made manually, or using automated means, to compare the expression levels observed in a test sample to those observed in a reference sample.
Accordingly, kits in accordance with the invention may comprise any reagents suitable for measuring gene expression levels. Such reagents comprise reagents for measuring levels of mRNA, or cDNA derived from mRNA, and/or reagents suitable for measuring levels of polypeptide gene products. For example, therefore, a kit may comprise nucleic acid probes which hybridise specifically to mRNA or cDNA specific for the appropriate gene signature, under appropriate conditions. The probes may be immobilised onto a solid surface, such as glass slides, membranes of various types, columns or beads, and may be in the form of an addressable array. If the probes are on an array, the identity of each probe is advantageously known as a result of the spatial arrangement on the array itself.
Probes may be used in solution, to probe nucleic acids derived from the sample. Moreover, labelling means may be provided, to label either the probes or the sample nucleic acids.
Primers may also be provided, to prime extension reactions for amplification and/or labelling of sample nucleic acids. The primers are specific for mRNA transcribed from the genes identified in the gene signatures set forth herein, or corresponding cDNA. The kits may alternatively, or in addition, comprise reagents such as immunoglobulins, RNA or peptide aptamers and the like which are capable of specifically detecting the polypeptide gene products of the target genes.
In particular, the present invention provides a diagnostic kit for use in characterising NSCLC tumours, comprising a set of reagents for specifically measuring the abundance of the mRNA species transcribed from at least 80% of the genes set forth in Table 4 or Table 6 herein.
Preferably, the reagents comprise a set of oligonucleotide primers or probes which hybridise specifically to said genes, which may advantageously be attached to a solid phase in the form of an array.
Preferably, the array consists of a library of oligonucleotides affixed to a solid phase, and said library of oligonucleotides consists substantially of oligonucleotides which are specific for at least 80% of the genes set forth in Table 4 or Table 6 herein.
Alternatively, or in addition, the reagents are selected from immunoglobulin molecules, RNA aptamers and peptide aptamers.
Preferably, the kit is for use in predicting the response of NSCLC to Pemetrexed, and consists substantially of a set of nucleic acid probes or primers which recognise the transcripts of the genes set forth in Table 6. For instance, the kit may include a microarray which consists substantially of probes specific for the 25 genes listed in Table 6.
The kits may further include labelling means, hybridisation reagents, detection reagents, and the like. For example, the kits may contain reagents for detection of one or more of TP53, TTF1, SYP, NCAM1 and CHGA by immunohistochemical staining.
It will be understood that immunoglobulins, RNA or peptide aptamers may be substituted for, or may supplement, the nucleic acid reagents in kits according to the invention.
Brief Description of the Figures
Fig. 1. Identification of 6 subgroups in the Erasmus MC NSCLC cohort.
Six subgroups are indicated by G1 to G6. (A) Correlation view of gene expression in the 88 Erasmus MC NSCLC samples, excluding 3 samples which were classified as 'healthy'. Pairwise correlations between any two samples are displayed. The colors of the cells represent Pearson's correlation coefficient values between any two samples, with deeper red indicating higher positive and deeper blue lower negative correlations. The red diagonal line displays the self-to-self comparison of each sample. (B) Relative expression levels of TYMS, cell proliferation genes, and neuroendocrine genes are shown for each of the six identified NSCLC subgroups. Boxes show the distribution of gene expression in each subgroup, with dots representing outliers. The dashed line shows the median expression of that gene across all NSCLC samples.
Fig. 2 Deregulated pathways identified by a global functional comparative analysis of predicted Pemetrexed-resistant versus predicted Pemetrexed-sensitive NSCLC cases.
A global functional analysis revealed that relevant pathways (TP53 signaling, pyrimidine metabolism, and EGFR signaling) are similarly deregulated in predicted Pemetrexed-resistant NSCLC cases of the Erasmus MC cohort (A) and Duke cohort (B). The values on the x-axis are calculated enrichment scores, i.e. the degree of over-representation of genes from a specific functional category in NR compared to R. TP53: TP53 signaling pathway; Pu: purine metabolism pathway; Py: pyrimidine metabolism pathway; EGFR: EGFR signaling pathway; Pem: Pemetrexed metabolism pathway; Target: expression of 14 probe sets representing three Pemetrexed targets (TYMS, DHFR, and GART).
Fig. 3. Relative expression of TYMS in relation to classical NSCLC histology.
Relative expression levels of TYMS in histology signature-assigned NSCLC groups (15). (A) Erasmus MC cohort and (B) Duke cohort. Boxes show the distribution of TYMS expression in each subgroup, with crosses representing outliers. The band in the box shows the median expression of TYMS in that group.
Fig. 4. Predicted Pemetrexed sensitivity in NSCLC subgroups and NSCLC cell lines.
(A) NSCLC cases predicted to be resistant (black) or sensitive to Pemetrexed (grey) are correlated to expression profiling-based subgroups (G1 to G6). (B) Validation of Pemetrexed resistance signature using NSCLC cell lines. Predicted sensitivity to Pemetrexed for NSCLC cell lines is compared to experimentally established sensitivity (16). (C) Performance of the Pemetrexed-resistance prediction signature on the Duke NSCLC cohort. The ninety-six primary NSCLC cases were classified into six subgroups using the group signature gene set. Six subgroups are indicated by G1 to G6. The response to Pemetrexed predicted by the 25-probe set resistance signature, and its correlation to the six subgroups, are displayed.
Fig. 5 Deregulated pyrimidine metabolism pathway in G4 NSCLC.
Deregulated pyrimidine metabolism pathway in G4 NSCLC cases compared to other Groups. Differential expression is indicated by grey shading: over-expression (60% and 80% grey) and under-expression (20% grey), in the Erasmus MC (A) and Duke (B) NSCLC cohorts.
Fig. 6 EGF pathway and its correlation with putative Pemetrexed response in NSCLC.
The relative expression of genes in the EGF signalling pathway in G4 NSCLC cases from the Erasmus MC (A) and Duke (B) cohorts. Upregulation, in comparison between G4 and non-G4, of gene expression is shown in dark grey (60% and 80% grey) and down regulation is shown in light grey (20% and 40% grey).
Fig. 7 Pyrimidine and histidine metabolism pathways are differentially activated between G4 and G6.
Pyrimidine metabolism is more activated and histidine metabolism pathway less activated in G4 NSCLCs compared to G6 neuroendocrine tumors. The expression status of the genes is indicated by the different colors, darker grey: lower expression; lighter grey: higher expression.
Fig. 8 Differential expression of genes associated with Pemetrexed metabolism in G3 NSCLC cases.
ABCC1 and FOLR2 are differentially expressed in predicted Pemetrexed-resistant G3 NSCLC cases versus predicted Pemetrexed-sensitive G3 NSCLC cases. Outliers are indicated by crosses.
Fig. 9 Correlation of TYMS mRNA and protein expression.
TYMS protein staining in TMA was quantified and graded from 0 to 2 (Staining Score, SS); mRNA expression measured on microarrays was represented as the mean of two probe sets for TYMS. Staining for TYMS protein was performed at two different titres, 1:10 (A) and 1:50 (B). The samples were grouped according to the predicted response to Pemetrexed, non-responder (NR) and responder (R). Outliers are indicated by crosses.
Fig. 10. Utility of routine IHC markers to identify putative NR and R to Pemetrexed therapy.
The expression of eight IHC markers in 70 out of the 91 NSCLC cases was detected by TMA and used for cluster analysis. High staining is shown in dark grey and low staining is shown in light grey. Failed IHC staining is shown as blanks. NSCLC cases are annotated with predicted Pemetrexed responsiveness, expression profile-assigned subgroup (G1 to G6), histology (ADC, LCC, SCC), and pathological stages (I to IV) (Table 8).
Fig. 11 Proposal for evaluation of putative responsiveness to Pemetrexed therapy.
The flowchart shows the proposed procedure to identify sensitive and resistant NSCLC to Pemetrexed using routine histopathological markers. (A) Staining for TP53 and EGFR stratifies G3 NSCLCs with respect to predicted Pemetrexed sensitivity. Negative staining for TP53 and EGFR predicts good response to Pemetrexed. In contrast, positive staining for both TP53 and EGFR predicts Pemetrexed resistance in G3 NSCLC cases. (B) Resistant NSCLC cases in G4 might be predicted by strong staining for TP53 and/or EGFR, or neuroendocrine markers. In contrast, high expression of TP53 or EGFR and other neuroendocrine markers do not predict poor response to Pemetrexed for the NSCLC cases in G1 or G6.
Detailed Description of the Invention
Standard techniques are used for molecular, genetic and biochemical methods. See, generally, Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d ed. (1989) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N. Y. and Ausubel et al., Short Protocols in Molecular Biology (1999) 4th Ed, John Wiley & Sons, Inc.; as well as Guthrie et al., Guide to Yeast Genetics and Molecular Biology, Methods in Enzymology, Vol. 194, Academic Press, Inc., (1991), PCR Protocols: A Guide to Methods and Applications (Innis, et al. 1990. Academic Press, San Diego, Calif.), McPherson et al., PCR Volume 1, Oxford University Press, (1991), Culture of Animal Cells: A Manual of Basic Technique, 2nd Ed. (R. I. Freshney, 1987; Liss, Inc. New York, N. Y.), and Gene Transfer and Expression Protocols (pp. 109-128, ed. E. J. Murray, The Humana Press Inc., Clifton, N. J.). These documents are incorporated herein by reference.
In the context of the present invention as described herein, NSCLC is "classified" by gene expression profiling. The grouping is set forth in Table 4.
It is not useful to assign probe sets to groups because the entire optimized signature is needed for classification. This column has been removed from Tables 4 and 12 (containing the full signature that did have specific probe sets for each group).
The expression levels of genes are assayed in accordance with the present invention by measuring the levels of either nucleic acids or proteins encoded by the gene which are present in a sample. Expression levels are considered herein to be the amounts of mRNA or polypeptide which are present in a sample; they may be influenced, therefore, by for instance modulations in levels of transcription, translation, mRNA or protein turnover.
As used herein, "measuring" and "assaying", as well as "polypeptide" and "protein", are intended to be interchangeable and equivalent in meaning. Genes whose expression levels are described herein as being useful for identifying, classifying or measuring the severity of NSCLC are referred to as "target" genes; groups of target genes form gene signatures, which can be used to identify, classify or measure the severity of NSCLC.
"Nucleic acids" are nucleic acids as is commonly understood in the art, and include DNA, RNA and artificial nucleic acids. In the context of the present invention, the levels of naturally-occurring nucleic acids will generally be measured using techniques known to those skilled in the art. Probes, primers and other nucleic acid molecules used in the present invention may comprise synthetic nucleotides or other modifications, as is known in the art. "Reagents" for measuring gene expression levels include nucleic acids and ligands, such as antibodies, which are capable of detecting the RNA or polypeptide products of the target genes described herein. Reagents may be selective, in that they bind to or detect only the RNA or polypeptide products of the target genes, or non-selective, capable of binding to or detecting a wider population of genes, with the selectivity being introduced in a later stage of the assay.
In general, assays can be conducted on arrays that comprise many genes in addition to the target genes, and the detection of changes in the expression levels of the target genes will be achieved by selective analysis of the arrays. For example, the Affymetrix Gene chip analyser is capable of identifying binding to probes on gene chip arrays, thereby measuring the degree of hybridisation to the probe sets representing genes on the array as well as the identity of the probes hybridised to at the same time.
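Selective analysis of a general-purpose array can be pictured as a filtering step applied after measurement. The sketch below is illustrative only: the function name, the probe set identifiers and the intensity values are hypothetical and do not correspond to any real array layout.

```python
def select_target_measurements(array_measurements, target_probe_sets):
    """Keep only the measurements for target probe sets.

    array_measurements: dict mapping probe set ID -> measured signal
    target_probe_sets:  iterable of probe set IDs forming the signature
    Returns (selected, missing), where `missing` lists target probe sets
    that were not found on the array.
    """
    targets = set(target_probe_sets)
    selected = {probe: value
                for probe, value in array_measurements.items()
                if probe in targets}
    missing = sorted(targets - set(selected))
    return selected, missing


# Hypothetical whole-array read-out containing many non-target probe sets.
array_readout = {"probe_001": 7.2, "probe_002": 5.1, "probe_003": 9.8, "probe_004": 4.4}
signature_probes = ["probe_001", "probe_003", "probe_999"]

selected, missing = select_target_measurements(array_readout, signature_probes)
print(selected)
print(missing)
```

Reporting the missing probe sets separately makes it possible to decide whether enough of the signature was measured before any classification is attempted.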
Alternatively, specific primers may be used to selectively detect the RNA gene products of target genes. As used herein, a "primer" is an oligonucleotide, whether produced naturally as in a purified restriction digest or produced synthetically, that is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product that is complementary to a nucleic acid strand is induced, (i.e., in the presence of nucleotides and an inducing agent such as DNA polymerase and at a suitable temperature and pH). The primer is preferably single stranded for maximum efficiency in the initiation of the reaction, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products. Preferably, the primer is an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis of extension products in the presence of the inducing agent. The exact lengths of the primers will depend on many factors, including temperature, source of primer and the use of the method.
As used herein, the term "probe" refers to an oligonucleotide (i.e., a sequence of nucleotides), whether occurring naturally as in a purified restriction digest or produced synthetically, recombinantly or by PCR amplification, that is capable of hybridizing to at least a portion of another oligonucleotide of interest. A probe may be single-stranded or double-stranded. Probes are useful in the detection, identification and isolation of particular gene sequences. It is contemplated that any probe used in the present invention can be labelled with a reporter molecule so that it is detectable in any detection system, including, but not limited to, enzyme (e.g., ELISA, as well as enzyme-based histochemical assays), fluorescent, radioactive, and luminescent systems. It is not intended that the present invention be limited to any particular detection system or label.
As used herein, the term "sample" is used to denote biological samples which may be obtained from animals (including humans) and encompass fluids, solids, tissues, and gases. Biological samples include sputum and blood products, such as plasma, serum and the like. In the context of the present invention, a sample is ordinarily a tissue sample obtained from a NSCLC, or from normal tissue for comparison purposes.
"Comparing", as used herein, includes comparison of expression levels of target genes directly with a control, as well as comparison with profiles, as described further herein. In comparisons according to the present invention, a match is sought between a pattern of gene expression seen in a control or in a predefined profile.
The term "isolated" when used in relation to a nucleic acid, refers to a nucleic acid sequence that is identified and separated from at least one component or contaminant with which it is ordinarily associated in its natural source. Isolated nucleic acid is present in a form or setting that is different from that in which it is found in nature. Similarly, isolated polypeptides are polypeptides or proteins separated from at least one component or contaminant with which they are ordinarily associated in their natural source.
NSCLC Therapy
Currently, the administration of chemotherapy in NSCLC patients is based on histology, and the response rates to treatment are ~16% for single agents and ~40% for combined regimens (26). The fact that cancer patients with similar histopathological features respond dramatically differently to the same therapeutic agent indicates that histology alone is insufficient to predict the response to therapeutics. Ideally, therapeutic regimens should be tailored for individual patients, in order to obtain maximal anti-tumor effects. For example, EGFR-TKI treatment in NSCLC patients harboring EGFR mutations improved the response rate to ~68% (27), illustrating the importance of better defining the target group by molecular analysis. Despite this promising example, tailored therapy for NSCLC remains largely elusive. Most NSCLC of similar histology and grade receive the same therapy, and differences in molecular characteristics are not taken into account routinely. It is therefore important to develop algorithms for molecular identification of patients that would be sensitive or resistant to a specific therapy. The folate anti-metabolite Pemetrexed is a promising agent for the treatment of NSCLC. Currently, the administration of Pemetrexed is directed by classical histological criteria (10). Here, we present an approach to implement tailored Pemetrexed therapy in NSCLC. It is based on the use of gene expression profiles to systematically assess the predicted response to Pemetrexed.
Subgrouping of NSCLC implies common molecular characteristics of NSCLC subtypes
Gene expression profiling can be used to reveal tumor features that are relevant to clinical outcome. For example, clustering of ADC or SCC cases based on gene expression profiles identified subgroups presenting favourable overall survival (2-3, 28-30). Microarray-derived gene signatures have also demonstrated the ability to define the risk of NSCLC recurrence (31). Ultimately molecular profiling would be expected to predict the response to specific therapies. In breast cancer and cell lines, gene signatures were identified that reflect the activation status of oncogenic pathways. Based on these signatures, the coordinated active status of pathways was obtained that not only defined prognosis in specific patient subgroups but also predicted the sensitivity to therapeutic agents targeting key components of these pathways (32). In this study, we classified NSCLCs into six groups, independent of classical histopathology, using microarray-based molecular profiles. Tumors clustered in the same subgroups present similar patterns of gene expression and pathway deregulation despite often variable histopathology. For example, tumors clustered in G4 are histologically different (ADC or LCC), but they are molecularly similar, with deregulation of the tyrosine metabolism pathway. Moreover, G6 tumors are also histologically ADC or LCC, but molecularly characterized by altered histidine metabolism. Importantly, the group signature defined five subgroups of similar size and composition in the independent Duke NSCLC cohort (Table 7). G5/CAR was absent since this tumor type was not represented in the Duke cohort. This strongly supports the notion that at least five common subtypes of NSCLC exist and that these can be classified by oncogenomics. This molecular classification might outperform conventional histopathology in depicting tumor biology and revealing the underlying oncogenic phenotypes.
The identification of these distinct NSCLC groups provides an opportunity to explore molecular classification as a tool to tailor the therapeutic regimens for individual NSCLC patients. We have applied this to predict Pemetrexed response in NSCLC cases.
Molecular characteristics of NSCLC associated with putative sensitivity to Pemetrexed treatment
We did not observe a significant difference in patient characteristics, such as gender, age or smoking habit, nor post-operation survival among the six subgroups. This is supported by the fact that the group signature does not show overlap with our previously determined survival signature (15). However, distinct patterns of oncogenic pathway deregulation in each group suggest that the response to a particular pharmacological treatment might differ between the NSCLC subgroups. We found that the TYMS expression-based predictions of Pemetrexed response did not correlate well with classical histology, challenging the current guideline to limit its use to ADC and LCC cases. We identified a group of NSCLC (G4), composed mainly of ADC and LCC cases, in which a large proportion of tumors are predicted to be resistant to Pemetrexed. In contrast, ADC and LCC cases classified in G1 and G6 were identified as candidates for Pemetrexed therapy as they were predicted to respond favourably (Fig. 4A).
According to previous clinical trials, LCC patients showed the best response to Pemetrexed (11). This is contradictory to our predictions. We propose that this paradox results from the difficulty in distinguishing LCC from other types of NSCLC. The classification of LCC by routine IHC is prone to considerable inter-observer variation; for instance, 62.5% of LCC cases in our previous study were differentially classified between two pathologists (15). In this study, NSCLC were histologically classified with the aid of a 75-gene histology signature identified in our previous study, resulting in all molecularly defined NSCLC subtypes sharing a distinct gene expression profile (15). A refined classification of LCC is possible with the use of additional IHC markers which are not routinely used, as exemplified by the TMA results (Table 8). In this study, NSCLC were histologically classified with the aid of a 75-gene histology signature identified in our previous study, resulting in molecularly defined NSCLC subtypes sharing a distinct gene expression profile (21). Additional IHC showed that at least one of three LCC markers was expressed in 59% of the gene signature-assigned LCC samples, compared to only 7.5% of the remaining samples. All gene signature-assigned SCC (n = 23) expressed KRT or TP63 on TMA and were clustered in G3. In addition, gene signature-assigned ADC (n = 2) and LCC (n = 1) were assigned to G3. All three samples expressed both KRT and TP63 on TMA (Table 9).
Remarkably, around 68% of SCC in our cohort and 47% of SCC in the Duke cohort were predicted to be sensitive to Pemetrexed therapy. The molecular differences between SCC NR and R were similar to those between non-SCC NR and R. The expression of FOLR1 is lower in both SCC NR and non-SCC NR, while ABCC1 and TP53 are over-expressed in the same patients. Pyrimidine and purine metabolism were activated at a higher level in NRs compared to Rs. This indicates that Pemetrexed may be an effective therapeutic agent for a subset of SCC patients identified by this approach. In addition, our analysis suggests that CAR tumors (G5) might benefit from Pemetrexed therapy.
Validation of the predicted Pemetrexed non-responder gene signature
The role of TYMS expression in the efficacy of Pemetrexed therapy has been established by several studies (6, 9, 16, 24). We stratified NSCLC patients into different groups by TYMS expression extracted from microarray data, and extended this single-gene stratification to a multi-gene signature. This gene signature accurately predicted Pemetrexed response of the NSCLC cell lines whose sensitivity was previously determined (16) (Fig. 4B).
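The single-gene stratification step can be sketched as follows. This is a minimal illustration under stated assumptions: the cohort median is used as the cut-off only because no threshold is stated in this passage, and the function name and sample values are hypothetical.

```python
from statistics import median


def stratify_by_expression(expression_by_sample, cutoff=None):
    """Label each sample 'high' or 'low' for a single gene (e.g. TYMS).

    expression_by_sample: dict mapping sample ID -> expression value
    cutoff: threshold to use; defaults to the cohort median (an assumption)
    """
    if not expression_by_sample:
        raise ValueError("at least one sample is required")
    if cutoff is None:
        cutoff = median(expression_by_sample.values())
    return {sample: ("high" if value > cutoff else "low")
            for sample, value in expression_by_sample.items()}


# Hypothetical log-scale expression values for three samples.
tyms_expression = {"sample_1": 1.0, "sample_2": 2.0, "sample_3": 3.0}
print(stratify_by_expression(tyms_expression))
```

Extending this to a multi-gene signature would replace the single value per sample with a score computed over all signature probe sets.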
Evaluation of TYMS expression levels by TMA
TYMS staining was performed at two different antibody titers, 1:10 and 1:50. Strongly staining NSCLCs were predominantly associated with predicted NRs. However, among NSCLCs scored as grade 0 and grade 1 for TYMS staining, predicted NRs and Rs were both present (Fig. 9). This can result from non-specific binding of the TYMS antibody and/or the lack of correlation between mRNA and protein expression (13-14). The sensitivity decreases even further when a lower antibody titer (1:50) is used for TMA staining. At this dilution, the antibody failed to detect TYMS expression in ~80% of the cases that were deemed positive by microarray analysis (Fig. 9). Consistent with previous reports, these results illustrate the technical limitations of routine immunohistochemistry to assess TYMS expression (13-14).
Potential to use surrogate markers
Since the expression of TYMS detected by immunohistochemistry appears to be of limited value, we propose a strategy for identifying NSCLCs resistant to Pemetrexed that uses knowledge of the novel grouping. Negative TP53 staining assigned the NSCLC in G3 as potentially sensitive to Pemetrexed therapy. The likelihood of resistance is predicted by high expression of TP53, and reinforced by concurrent high EGFR expression. Similarly, tumors in G4 with strong staining for TP53 might be resistant to Pemetrexed (Fig. 10). In contrast, high expression of TP53, EGFR, or both does not predict resistance for the NSCLCs in G1 or G6. Positive staining of TTF1 predicts a good response to Pemetrexed in tumors from G1 or G6. Unfortunately, TTF1 staining often fails, limiting the practical utility of this marker. The expression of neuroendocrine markers, SYP, NCAM1, and CHGA, in non-G4 is predictive of Pemetrexed sensitivity.
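The marker rules stated in the preceding paragraph can be written down as a small decision function. The sketch below only illustrates the logic of the text; it is not a validated classifier. The function name, the output labels, the boolean encoding of staining results and the order in which the rules are applied are all assumptions.

```python
SENSITIVE = "potentially sensitive"
RESISTANT = "potentially resistant"
UNDETERMINED = "undetermined"


def predict_pemetrexed_response(group, tp53_high=False, egfr_high=False,
                                ttf1_positive=False, neuroendocrine_positive=False):
    """Return (prediction, reason) for an expression-profile subgroup G1 to G6."""
    if group == "G3":
        # G3: negative TP53 -> potentially sensitive; high TP53 -> resistance,
        # with concurrent high EGFR reinforcing the call.
        if not tp53_high:
            return SENSITIVE, "TP53 negative"
        reason = "TP53 high, reinforced by EGFR high" if egfr_high else "TP53 high"
        return RESISTANT, reason
    if group == "G4":
        # G4: strong TP53 staining might indicate resistance.
        if tp53_high:
            return RESISTANT, "strong TP53 staining"
        return UNDETERMINED, "no rule applies"
    # Remaining groups: TP53 and EGFR are deliberately ignored, because high
    # expression does not predict resistance in G1 or G6.
    if group in ("G1", "G6") and ttf1_positive:
        return SENSITIVE, "TTF1 positive"
    if neuroendocrine_positive:
        return SENSITIVE, "neuroendocrine marker positive"
    return UNDETERMINED, "no rule applies"


print(predict_pemetrexed_response("G3", tp53_high=True, egfr_high=True))
```

Failed stainings are not modelled here; in practice they would need a third state alongside positive and negative.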
The novel relationship between TP53 and EGFR expression with Pemetrexed sensitivity in G3 potentially provides an instant and practical manner to stratify SCC patients for Pemetrexed treatment (Fig. 11A). High expression of either TP53 or neuroendocrine markers predicts Pemetrexed resistance in G4 NSCLCs (Fig. 11B), although a few exceptions and staining failures were observed in this group. For the other NSCLC subgroups, a more specific and sensitive predictor other than the cooperative use of currently available routine markers is needed.
Gene expression profiles may guide the choice of chemotherapy regimens
Although the conventional immunohistochemistry markers showed some potential for practical application to distinguish Pemetrexed Rs from NRs in specific subgroups of NSCLC, these results are prone to inter-observer variation. This restricts the application of immunohistochemistry-based stratification of NSCLC as a gold standard in clinical practice (25). Conversely, gene expression profiles hold promise for classification of patients with respect to Pemetrexed sensitivity. The identification of the novel G4 classification, and the 25-probe set resistance signature, suggests that NSCLC NRs to Pemetrexed can be predicted with high specificity and sensitivity, providing valuable new information compared to the three classical histological types. These observations also indicate that, rather than by a single gene, Pemetrexed efficacy is determined by a gene interaction network. This is confirmed by the analysis of resistant and sensitive NSCLC cell lines (Fig. 4B) (16).
In conclusion, we suggest a refined classification of NSCLC subtypes based on gene expression profiles. This new molecular classification may aid tailored Pemetrexed therapy for individual NSCLC patients. Our observations indicate that NSCLC resistant to Pemetrexed can be identified by molecular means. Furthermore, the approach we have followed here should be generally applicable to other therapeutic agents and other types of cancer.
Tumour Samples
The sample used for analysis comprises a tissue sample, which includes tumour tissue, and in particular human lung cancer tumour tissue. Typically, such tissue is, but is not limited to, epithelial tissue and connective tissue; other tissue types may be used as and if they occur in a lung tumour. Typically, however, NSCLC are comprised of epithelial tissue.
Samples are obtained from surgically resected lungs, or may be obtained from patients by standard biopsy techniques. Advantageously, microdissection is used to ensure that the cell types subjected to analysis are the intended cell type.
Normal samples can be obtained from the same patient, adjacent the tumour, or from patients not suffering from cancer. Typically, normal samples will be of the same tissue type (i.e. epithelial tissue, connective tissue) as the tumour sample.
In the present invention, where an analysis model is defined, for example using two-dimensional hierarchical clustering, it is only necessary to analyse a tumor sample from a patient rather than both a tumor sample and a normal sample from the same or different patients.
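As a rough illustration of classifying a single tumour sample against a previously defined model, the sketch below assigns a sample to whichever predefined group profile it correlates with best. The use of Pearson correlation as the similarity measure, the function names and the toy profiles are assumptions; this is not presented as the classifier used to define the groups.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def best_matching_profile(sample, profiles):
    """Return (best_group, scores) for a sample against predefined profiles.

    sample:   expression values over the signature probe sets
    profiles: dict mapping group name -> reference values in the same order
    """
    scores = {name: pearson(sample, ref) for name, ref in profiles.items()}
    best_group = max(scores, key=scores.get)
    return best_group, scores


# Hypothetical reference profiles over four probe sets.
reference_profiles = {"group_A": [2.0, 4.0, 6.0, 8.0],
                      "group_B": [4.0, 3.0, 2.0, 1.0]}
tumour_sample = [1.0, 2.0, 3.0, 4.0]
print(best_matching_profile(tumour_sample, reference_profiles)[0])
```

Because the reference profiles are fixed in advance, no matched normal sample is needed at classification time, which is the point made in the paragraph above.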
Nucleic acid measurement
In the method according to the present invention, the levels of mRNAs present in a sample which are encoded by the genes identified in the Tables set forth herein may be measured directly. Analysis is conveniently carried out by labelling the RNA in cells from the sample and assaying the abundance of the desired mRNA species. To prepare RNA from tumour and/or normal samples, total or poly(A)+ RNA is processed according to any suitable technique, for example as set forth below, to produce cDNA and subsequently cRNA, which is conveniently used in microarray analysis.
Copies of the cRNA or cDNA may be amplified, for example by RT-PCR. Fluorescent tags or digoxigenin-dUTP can then be enzymatically incorporated into the newly synthesized cDNA/cRNA or can be chemically attached to the new strands of DNA or RNA. Preferably the assessment of expression is performed by gene expression profiling using oligonucleotide-based arrays or cDNA-based arrays of any type; RT-PCR (reverse transcription-Polymerase Chain Reaction), real-time PCR, in-situ hybridisation, Northern blotting, serial analysis of gene expression (SAGE) for example as described by Velculescu et al Science 270 (5235): 484-487, or differential display. Details of these and other methods can be found for example in Sambrook et al, 1989, Molecular Cloning: A Laboratory Manual. Preferably the assessment uses a microarray assay.
Arrays
Microarrays (or arrays) can be constructed by a number of available technologies. Array technology and the various techniques and applications associated with it are described generally in numerous textbooks and documents. Gene array technology is particularly suited to the practice of the present invention. Methods for preparing microarrays are well known in the art. These include Lemieux et al., (1998), Molecular Breeding 4, 277-289, Schena and Davis, Parallel Analysis with Biological Chips, in PCR Methods Manual (eds. M. Innis, D. Gelfand, J. Sninsky), Schena and Davis, (1999), Genes, Genomes and Chips, in DNA Microarrays: A Practical Approach (ed. M. Schena), Oxford University Press, Oxford, UK, 1999, The Chipping Forecast (Nature Genetics special issue; January 1999 Supplement), Mark Schena (Ed.), Microarray Biochip Technology, (Eaton Publishing Company), Cortese, 2000, The Scientist 14 [17]: 25, Gwynne and Page, Microarray analysis: the next revolution in molecular biology, Science, 1999 August 6; and Eakins and Chu, 1999, Trends in Biotechnology, 17, 217-218.
The technology is described in PCT/US01/10063 and US 2002 090979 and references therein. Commercial suppliers include Affymetrix (California) and Clontech Laboratories (California). Alternatives to solid phase arrays include addressable microbead technologies such as VeraBead from Illumina (California).
Major applications for array technology include the identification of sequence (nucleotide sequence/nucleotide sequence mutation) and the determination of expression level (abundance) of nucleotide sequences. Gene expression profiling may make use of array technology, optionally in combination with proteomics techniques (Celis et al., 2000, FEBS Lett, 480 (1): 2-16; Lockhart and Winzeler, 2000, Nature 405 (6788): 827-836; Khan et al., 1999, 20 (2): 223-9). In general, any library may be arranged in an orderly manner into an array, by spatially separating the members of the library. Examples of suitable libraries for arraying include nucleic acid libraries (including DNA, RNA, oligonucleotide and other nucleic acid libraries), peptide, polypeptide and protein libraries, as well as libraries comprising other types of molecules, such as ligand libraries.
Accordingly, where reference is made to a "library" such reference includes reference to a library in the form of an array.
The members of a library are generally fixed or immobilised onto a solid phase, preferably a solid substrate, to limit diffusion and admixing of the samples. In particular, the libraries may be immobilised to a substantially planar solid phase, including membranes and non-porous substrates such as plastic and glass.
Furthermore, the samples are preferably arranged in such a way that indexing (i. e. reference or access to a particular sample) is facilitated. Typically the samples are applied as spots in a grid formation. Common assay systems may be adapted for this purpose. For example, an array may be immobilised on the surface of a microplate, either with multiple samples in a well, or with a single sample in each well.
Furthermore, the solid substrate may be a membrane, such as a nitrocellulose or nylon membrane (for example, membranes used in blotting experiments). Alternative substrates include glass, or silica based substrates. Thus, the samples are immobilised by any suitable method known in the art, for example, by charge interactions, or by chemical coupling to the walls or bottom of the wells, or the surface of the membrane. Other means of arranging and fixing may be used, for example, pipetting, drop-touch, piezoelectric means, ink-jet and bubblejet technology, electrostatic application, etc. In the case of silicon-based chips, photolithography may be utilised to arrange and fix the samples on the chip.
The samples may be arranged by being "spotted" onto the solid substrate; this may be done by hand or by making use of robotics to deposit the sample. In general, arrays may be described as macroarrays or microarrays, the difference being the size of the sample spots. Macroarrays typically contain sample spot sizes of about 300 microns or larger and may be easily imaged by existing gel and blot scanners. The sample spot sizes in microarrays are typically less than 200 microns in diameter and these arrays usually contain thousands of spots. Thus, microarrays may require specialised robotics and imaging equipment, which may need to be custom made. Instrumentation is described generally in a review by Cortese, 2000, The Scientist 14 [1 1 ]: 26.
Techniques for producing immobilised libraries of DNA molecules have been described in the art. Generally, most prior art methods describe how to prepare single-stranded nucleic acid molecule libraries, using for example masking techniques to build up various permutations of sequences at the various discrete positions on the solid substrate. US 5,837, 832 describes an improved method for producing DNA arrays immobilised to silicon substrates based on very large scale integration technology. In particular, US 5,837, 832 describes a strategy called "tiling" to prepare specific sets of probes at spatially-defined locations on a substrate which may be used to produced the immobilised DNA libraries of the present invention. US 5,837, 832 also provides references for earlier techniques that may also be used.
To aid detection, targets and probes may be labelled with any readily detectable reporter, such as a fluorescent, bioluminescent, phosphorescent or radioactive reporter.
Labelling of probes and targets is disclosed in Shalon et al., 1996, Genome Res 6 (7): 639-45.
The materials for use in the methods of the present invention are ideally suited for preparation of kits. A set of instructions will typically be included.
The invention provides kits which comprise microarrays which are specific for a desired set of genes. For example, microarrays according to the invention may consist of a solid phase and, immobilised thereto, a library of nucleic acid oligonucleotides or probes which consists substantially of one or more of the gene signatures identified herein, and listed in the Tables, especially Tables 4, 6, 10, 11, 12 and 13. Such specialised microarrays are less expensive to produce than general purpose microarrays, and less difficult and expensive to analyse. In a further embodiment, the arrays according to the invention may comprise a library of oligonucleotides which is larger than, though still comprising, one or more of the gene signatures described herein, but still smaller than the set consisting of all known genes. For instance, such arrays may comprise gene signatures which are useful for detecting other forms of cancer, or other types of NSCLC, or which may provide different insights into the prognosis for NSCLC patients, or the like.
Amplification and sequencing
Nucleic acid signatures in accordance with the invention may be detected by nucleic acid analysis which relies on amplification and/or sequencing of sample nucleic acids. Since the invention aims to measure gene expression, the methods used must quantitatively measure transcribed nucleic acid levels. The measured nucleic acids must therefore be mRNA, or nucleic acids derived quantitatively from mRNA such as cDNA.
Generation of nucleic acids for analysis from samples generally, but not universally, requires nucleic acid amplification. Many amplification methods rely on an enzymatic chain reaction (such as a polymerase chain reaction, a ligase chain reaction, or a self-sustained sequence replication), a linear amplification procedure, or on the replication of all or part of the vector into which the desired sequence has been cloned. Preferably, the amplification according to the invention is an exponential amplification, as exhibited by for example the polymerase chain reaction.
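The difference between exponential and linear amplification can be illustrated with simple arithmetic. The following is a minimal sketch, assuming an idealised reaction in which every cycle multiplies the number of templates by (1 + efficiency); the function names and the efficiency parameter are illustrative assumptions and do not form part of the methods described herein.

```python
import math


def exponential_copies(start_copies, cycles, efficiency=1.0):
    """Copies after `cycles` rounds if each round multiplies the pool by (1 + efficiency).

    efficiency=1.0 is the idealised case of perfect doubling per cycle (an assumption).
    """
    return start_copies * (1.0 + efficiency) ** cycles


def linear_copies(start_copies, cycles, copies_per_cycle=1):
    """Copies after `cycles` rounds if only the original templates are copied each round."""
    return start_copies * (1 + copies_per_cycle * cycles)


def cycles_for_fold(fold, efficiency=1.0):
    """Smallest whole number of cycles that reaches at least `fold` amplification."""
    return math.ceil(math.log(fold) / math.log(1.0 + efficiency))


if __name__ == "__main__":
    for target in (1e6, 1e9):
        print(f"{target:.0e}-fold needs {cycles_for_fold(target)} ideal doubling cycles")
    print("exponential, 20 cycles:", exponential_copies(1, 20))
    print("linear, 20 cycles:     ", linear_copies(1, 20))
```

Under these idealised assumptions, a million-fold amplification corresponds to about 20 doubling cycles and a billion-fold amplification to about 30, whereas a linear procedure over the same number of cycles yields only a few tens of copies per template. Real reactions have efficiencies below 1.0 and eventually plateau.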
Many target and signal amplification methods have been described in the literature. See, for example, general reviews of these methods in Landegren, U., et al., Science 242:229-237 (1988) and Lewis, R., Genetic Engineering News 10: 1, 54-55 (1990). These amplification methods can be used in the methods of the present invention, and include polymerase chain reaction (PCR), PCR in situ, ligase amplification reaction (LAR), ligase hybridisation, Qbeta bacteriophage replicase, transcription-based amplification system (TAS), genomic amplification with transcript sequencing (GAWTS), nucleic acid sequence-based amplification (NASBA) and in situ hybridisation. Primers suitable for use in various amplification techniques can be prepared according to methods known in the art.
Polymerase Chain Reaction (PCR)
PCR is a nucleic acid amplification method described inter alia in U.S. Pat. Nos. 4,683,195 and 4,683,202. PCR consists of repeated cycles of DNA polymerase generated primer extension reactions. The target DNA is heat denatured and two oligonucleotides, which bracket the target sequence on opposite strands of the DNA to be amplified, are hybridised. These oligonucleotides become primers for use with DNA polymerase. The DNA is copied by primer extension to make a second copy of both strands. By repeating the cycle of heat denaturation, primer hybridisation and extension, the target DNA can be amplified a million fold or more in about two to four hours. PCR is a molecular biology tool, which must be used in conjunction with a detection technique to determine the results of amplification. An advantage of PCR is that it increases sensitivity by amplifying the amount of target DNA by 1 million to 1 billion fold in approximately 4 hours. PCR can be used to amplify any known nucleic acid in a diagnostic context ( ok et al, (1994), Gynaecologic Oncology, 52: 247-252).
Self-Sustained Sequence Replication (3SR)
Self-sustained sequence replication (3SR) is a variation of TAS, which involves the isothermal amplification of a nucleic acid template via sequential rounds of reverse transcriptase (RT), polymerase and nuclease activities that are mediated by an enzyme cocktail and appropriate oligonucleotide primers (Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87: 1874). Enzymatic degradation of the RNA of the RNA/DNA heteroduplex is used instead of heat denaturation. RNase H and all other enzymes are added to the reaction and all steps occur at the same temperature and without further reagent additions. Following this process, amplifications of 10^10 have been achieved in one hour at 42°C.
Ligation Amplification (LAR/LAS)
Ligation amplification reaction or ligation amplification system uses DNA ligase and four oligonucleotides, two per target strand. This technique is described by Wu, D. Y. and Wallace, R. B. (1989) Genomics 4:560. The oligonucleotides hybridise to adjacent sequences on the target DNA and are joined by the ligase. The reaction is heat denatured and the cycle repeated.
Qβ Replicase
In this technique, RNA replicase for the bacteriophage Qβ, which replicates single-stranded RNA, is used to amplify the target DNA, as described by Lizardi et al. (1988) Bio/Technology 6: 1197. First, the target DNA is hybridised to a primer including a T7 promoter and a Qβ 5' sequence region. Using this primer, reverse transcriptase generates a cDNA connecting the primer to its 5' end in the process. These two steps are similar to the TAS protocol. The resulting heteroduplex is heat denatured. Next, a second primer containing a Qβ 3' sequence region is used to initiate a second round of cDNA synthesis. This results in a double stranded DNA containing both 5' and 3' ends of the Qβ bacteriophage as well as an active T7 RNA polymerase binding site. T7 RNA polymerase then transcribes the double-stranded DNA into new RNA, which mimics the Qβ. After extensive washing to remove any unhybridised probe, the new RNA is eluted from the target and replicated by Qβ replicase. The latter reaction creates 10-fold amplification in approximately 20 minutes.
Alternative amplification technology can be exploited in the present invention. For example, rolling circle amplification (Lizardi et al., (1998) Nat Genet 19:225) is an amplification technology available commercially (RCAT(TM)) which is driven by DNA polymerase and can replicate circular oligonucleotide probes with either linear or geometric kinetics under isothermal conditions.
In the presence of two suitably designed primers, a geometric amplification occurs via DNA strand displacement and hyperbranching to generate 10^12 or more copies of each circle in 1 hour.
If a single primer is used, RCAT generates, in a few minutes, a linear chain of thousands of tandemly linked DNA copies of a target covalently linked to that target.
A further technique, strand displacement amplification (SDA; Walker et al., (1992) PNAS (USA) 80:392) begins with a specifically defined sequence unique to a specific target. But unlike other techniques which rely on thermal cycling, SDA is an isothermal process that utilises a series of primers, DNA polymerase and a restriction enzyme to exponentially amplify the unique nucleic acid sequence.
SDA comprises both a target generation phase and an exponential amplification phase.
In target generation, double-stranded DNA is heat denatured creating two single-stranded copies. A series of specially manufactured primers combine with DNA polymerase (amplification primers for copying the base sequence and bumper primers for displacing the newly created strands) to form altered targets capable of exponential amplification. The exponential amplification process begins with altered targets (single-stranded partial DNA strands with restriction enzyme recognition sites) from the target generation phase. An amplification primer is bound to each strand at its complementary DNA sequence. DNA polymerase then uses the primer to identify a location to extend the primer from its 3' end, using the altered target as a template for adding individual nucleotides. The extended primer thus forms a double-stranded DNA segment containing a complete restriction enzyme recognition site at each end.
A restriction enzyme is then bound to the double-stranded DNA segment at its recognition site. The restriction enzyme dissociates from the recognition site after having cleaved only one strand of the double-stranded segment, forming a nick. DNA polymerase recognises the nick and extends the strand from the site, displacing the previously created strand. The recognition site is thus repeatedly nicked and restored by the restriction enzyme and DNA polymerase with continuous displacement of DNA strands containing the target segment.
Each displaced strand is then available to anneal with amplification primers as above. The process continues with repeated nicking, extension and displacement of new DNA strands, resulting in exponential amplification of the original DNA target.
Identification of nucleic acid sequences, for example after amplification, can be performed by primer extension or sequencing techniques. Such techniques may involve the parallel and/or serial processing of a large number of different template nucleic acid molecules. In one aspect, a library of probes on an array may be employed. A high sensitivity analytical technique may be used to characterize individually nucleic acid molecules which become immobilised on the array, by hybridisation to the probes. For example, primer extension reactions may be used to incorporate labeled nucleotide(s) that can be individually detected in order to sequence individual molecules and/or determine the identity of at least one nucleotide position on individual nucleic acid molecules. Detection may involve labeling one or more of the primers and/or extension nucleotides with a detectable label (e.g., using fluorescent label(s), FRET label(s), enzymatic label(s), radio-label(s), etc.). Detection may involve imaging, for example using a high sensitivity camera and/or microscope (e.g., a super-cooled camera and/or microscope).
Suitable techniques may be selected by one of ordinary skill in the art. Examples of high-throughput sequencing approaches are listed in K.Y. Chan, Mutation Research 573 (2005) 13-40 and include, but are not limited to, near-term sequencing approaches such as cycle-extension approaches, polymerase reading approaches and exonuclease sequencing, revolutionary sequencing approaches such as DNA scanning and nanopore sequencing and direct linear analysis. Examples of current high-throughput sequencing methods are 454 (pyro)sequencing, Solexa Genome Analysis System, Agencourt SOLiD sequencing method (Applied Biosystems), MS-PET sequencing (Ng et al., 2006, http://nar.oxfordjournals.org/cgi/content/full/34/12/e84). In one embodiment, a digital analysis (e.g., a digital amplification and subsequent analysis) may be performed to obtain a statistically significant quantitative result. Certain digital techniques are known in the art, see for example, US Patent No. 6,440,706 and US Patent No. 6,753,147, incorporated herein by reference. Similarly, an emulsion-based method for amplifying and/or sequencing individual nucleic acid molecules may be used (e.g., BEAMing technology; International Published Application Nos. WO2005/010145, WO00/40712, WO02/22869, WO03/044187, WO99/02671, herein incorporated by reference).
In one embodiment, a sequencing method that can sequence single molecules in a biological sample may be used. Sequencing methods are known and being developed for high throughput (e.g., parallel) sequencing of complex genomes by sequencing a large number of single molecules (often having overlapping sequences) and compiling the information to obtain the sequence of an entire genome or a significant portion thereof. Suitable sequencing techniques may involve high speed parallel molecular nucleic acid sequencing as described in PCT Application No. WO 01/16375, US Application No. 60/151,580 and U.S. Published Application No. 20050014175, the entire contents of which are incorporated herein by reference. Other sequencing techniques are described in PCT Application No. WO 05/73410, PCT Application No. WO 05/54431, PCT Application No. WO 05/39389, PCT Application No. WO 05/03375, PCT Application No. WO 05/010145, PCT Application No. WO 04/069849, PCT Application No. WO 04/70005, PCT Application No. WO 04/69849, PCT Application No. WO 04/70007, and US Published Application No. 20050100932, the entire contents of which are incorporated herein by reference. Sequencing techniques for use in connection with the invention may involve exposing a nucleic acid molecule to an oligonucleotide primer and a polymerase in the presence of a mixture of nucleotides. Changes in the fluorescence of individual nucleic acid molecules in response to polymerase activity may be detected and recorded. The specific labels attached to each nucleic acid and/or nucleotide may provide an emission spectrum allowing for the detection of sequence information for individual template nucleic acid molecules. In certain embodiments, a label is attached to the primer/template and a different label is attached to each type of nucleotide (e.g., A, T/U, C, or G). Each label emits a distinct signal which is distinguished from the other labels.
Useful sequencing methods include high throughput sequencing using the 454 Life Sciences Instrument System (International Published Application No. WO2004/069849, filed January 28, 2004). Briefly, a sample of single stranded DNA is prepared and added to an excess of DNA capture beads which are then emulsified. Clonal amplification is performed to produce a sample of enriched DNA on the capture beads (the beads are enriched with millions of copies of a single clonal fragment). The DNA enriched beads are then transferred into PicoTiterPlate (TM) and enzyme beads and sequencing reagents are added. The samples are then analyzed and the sequence data recorded. Pyrophosphate and luciferin are examples of the labels that can be used to generate the signal. A label includes but is not limited to a fluorophore, for example green fluorescent protein (GFP), a luminescent molecule, for example aequorin or europium chelates, fluorescein, rhodamine green, Oregon green, Texas red, naphthofluorescein, or derivatives thereof. In some embodiments, the polynucleotide is linked to a substrate. A substrate includes but is not limited to, streptavidin-biotin, histidine-Ni, S-tag-S-protein, or glutathione-S-transferase (GST). In some embodiments, a substrate is pretreated to facilitate attachment of a polynucleotide to a surface, for example the substrate can be glass which is coated with a polyelectrolyte multilayer (PEM), or the polynucleotide is biotinylated and the PEM-coated surface is further coated with streptavidin.
In other embodiments, single molecule sequencing technology available from US Genomics, Mass., may be used. For example, technology described, at least in part, in one or more of US patents 6,790,671; 6,772,070; 6,762,059; 6,696,022; 6,403,311; 6,355,420; 6,263,286; and 6,210,896 may be used.
Other sequencing methods may be used to analyze DNA and/or RNA according to methods of the invention. It should be appreciated that a sequencing method does not have to be a single molecule sequencing method, since generally nucleic acid material from a substantial sample or biopsy will be available for analysis.
Measurement of polypeptide expression
In an alternative embodiment, the levels of polypeptides encoded by the genes identified in Tables 2, 4, 6, 10, 11, 12 and 13 can be measured directly, without measuring mRNA levels. For example, polypeptides can be detected by differential mobility on protein gels, or by other size analysis techniques such as mass spectrometry. Peptides derived from the gene signatures identified herein can be differentiated by size analysis. Advantageously, the detection means is sequence-specific, such that a particular gene product can accurately be identified as the product of a member of any given gene signature. For example, polypeptide or RNA molecules can be developed which specifically recognise the desired gene products in vivo or in vitro. For example, immunoglobulin molecules may be used to specifically bind to the target polypeptides, for instance in a western blot or ELISA. The immunoglobulins or the target polypeptides may be labelled, to provide a means of identification and measurement. Ideally, such measurements are carried out on an array of immunoglobulin molecules. An "immunoglobulin" is one of a family of polypeptides which retain the immunoglobulin fold characteristic of immunoglobulin (antibody) molecules, which contains two β sheets and, usually, a conserved disulphide bond. Members of the immunoglobulin superfamily are involved in many aspects of cellular and non-cellular interactions in vivo, including widespread roles in the immune system (for example, antibodies, T-cell receptor molecules and the like), involvement in cell adhesion (for example the ICAM molecules) and intracellular signalling (for example, receptor molecules, such as the PDGF receptor).
Preferred immunoglobulins are antibodies, which are capable of binding to target antigens with high specificity. "Antibodies" can be whole antibodies, or antigen-binding fragments thereof. For example, the invention includes fragments such as Fv and Fab, as well as Fab' and F(ab')2, and antibody variants such as scFv, single domain antibodies, Dab antibodies and other antigen-binding antibody-based molecules.
The polypeptides encoded by the genes set forth in Tables 2, 4, 6, 10, 11, 12 and 13, or peptides derived therefrom, can be used to generate antibodies for use in the present invention. The peptides used preferably comprise an epitope which is specific for a polypeptide encoded by a gene in accordance with the invention. Polypeptide fragments which function as epitopes can be produced by any conventional means (see, for example, U.S. Pat. No. 4,631,211). In the present invention, antigenic epitopes preferably contain a sequence of at least 4, at least 5, at least 6, at least 7, more preferably at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 20, at least 25, at least 30, at least 40, at least 50, and, most preferably, between about 15 to about 30 amino acids. Preferred polypeptides comprising immunogenic or antigenic epitopes are at least 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95 or 100 amino acid residues in length. Antibodies can be generated using antigenic epitopes of polypeptides according to the invention by immunising animals, such as rabbits or mice, with either free or carrier-coupled peptides, for instance, by intraperitoneal and/or intradermal injection of emulsions containing about 100 μg of peptide or carrier protein and Freund's adjuvant or any other adjuvant known for stimulating an immune response.
Antibodies for use in the present invention can be fused to marker sequences, such as a peptide which facilitates purification of the fused polypeptide. In preferred embodiments, the marker amino acid sequence is a hexa-histidine peptide, such as the tag provided in a pQE vector (QIAGEN, Inc., 9259 Eton Avenue, Chatsworth, Calif., 91311), among others, many of which are commercially available. As described in Gentz et al., Proc. Natl. Acad. Sci. USA 86: 821-824 (1989), for instance, hexa-histidine provides for convenient purification of the fusion protein. Another peptide tag useful for purification, the "HA" tag, corresponds to an epitope derived from the influenza hemagglutinin protein (Wilson et al., (1984) Cell 37: 767). Antibodies as described herein can be altered antibodies comprising an effector protein such as a label. Especially preferred are labels which allow the imaging of the distribution of the antibody in vivo. Such labels can be radioactive labels or radiopaque labels, such as metal particles, which are readily visualisable within the body of a patient. This can allow an assessment to be made without the need for tissue biopsies. Moreover, they can be fluorescent labels or other labels which are visualisable on tissue.
The antibody is preferably provided together with means for detecting the antibody, which can be enzymatic, fluorescent, radioisotopic or other means. The antibody and the detection means can be provided for simultaneous, simultaneous separate or sequential use, in a diagnostic kit intended for diagnosis. The antibodies for use in the invention can be assayed for immunospecific binding by any method known in the art. The immunoassays which can be used include but are not limited to competitive and noncompetitive assay systems using techniques such as western blots, radioimmunoassays, ELISA, sandwich immunoassays, immunoprecipitation assays, precipitin reactions, gel diffusion precipitin reactions, immunodiffusion assays, agglutination assays, complement-fixation assays, immunoradiometric assays, fluorescent immunoassays and protein A immunoassays. Such assays are routine in the art (see, for example, Ausubel et al, eds, 1994, Current Protocols in Molecular Biology, Vol. 1, John Wiley & Sons, Inc., New York, which is incorporated by reference herein in its entirety). Exemplary immunoassays are described briefly below. Immunoprecipitation protocols generally comprise lysing a population of cells in a lysis buffer such as RIPA buffer (1% NP-40 or Triton X-100, 1% sodium deoxycholate, 0.1% SDS, 0.15 M NaCl, 0.01 M sodium phosphate at pH 7.2, 1% Trasylol) supplemented with protein phosphatase and/or protease inhibitors (e.g., EDTA, PMSF, aprotinin, sodium vanadate), adding the antibody of interest to the cell lysate, incubating for a period of time (e.g., 1-4 hours) at 4°C, adding protein A and/or protein G sepharose beads to the cell lysate, incubating for about an hour or more at 4°C, washing the beads in lysis buffer and resuspending the beads in SDS/sample buffer. The ability of the antibody of interest to immunoprecipitate a particular antigen can be assessed by, e.g., western blot analysis.
Western blot analysis generally comprises preparing protein samples, electrophoresis of the protein samples in a polyacrylamide gel (e.g., 8%-20% SDS-PAGE depending on the molecular weight of the antigen), transferring the protein sample from the polyacrylamide gel to a membrane such as nitrocellulose, PVDF or nylon, blocking the membrane in blocking solution (e.g., PBS with 3% BSA or non-fat milk), washing the membrane in washing buffer (e.g., PBS-Tween 20), exposing the membrane to a primary antibody (the antibody of interest) diluted in blocking buffer, washing the membrane in washing buffer, exposing the membrane to a secondary antibody (which recognises the primary antibody, e.g., an anti-human antibody) conjugated to an enzymatic substrate (e.g., horseradish peroxidase or alkaline phosphatase) or radioactive molecule (e.g., 32P or 125I) diluted in blocking buffer, washing the membrane in wash buffer, and detecting the presence of the antigen.
ELISAs comprise preparing antigen, coating the well of a microtitre plate with the antigen, adding the antibody of interest conjugated to a detectable compound such as an enzymatic substrate (e.g., horseradish peroxidase or alkaline phosphatase) to the well and incubating for a period of time, and detecting the presence of the antigen. In ELISAs the antibody of interest does not have to be conjugated to a detectable compound; instead, a second antibody (which recognises the antibody of interest) conjugated to a detectable compound can be added to the well. Further, instead of coating the well with the antigen, the antibody can be coated to the well. In this case, a second antibody conjugated to a detectable compound can be added following the addition of the antigen of interest to the coated well. The binding affinity of an antibody to an antigen and the off-rate of an antibody-antigen interaction can be determined by competitive binding assays. One example of a competitive binding assay is a radioimmunoassay comprising the incubation of labelled antigen (e.g., 3H or 125I) with the antibody of interest in the presence of increasing amounts of unlabeled antigen, and the detection of the antibody bound to the labelled antigen. The affinity of the antibody of interest for a particular antigen and the binding off-rates can be determined from the data by Scatchard plot analysis. Competition with a second antibody can also be determined using radioimmunoassays. In this case, the antigen is incubated with antibody of interest conjugated to a labelled compound (e.g., 3H or 125I) in the presence of increasing amounts of an unlabeled second antibody.
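By way of illustration, a Scatchard analysis can be sketched as a straight-line fit. The example below assumes a single class of non-interacting binding sites, for which bound/free = (Bmax - bound)/Kd, and assumes that bound and free antigen concentrations have already been derived from the assay counts. The function name, the synthetic data and the units are hypothetical; the sketch recovers only the equilibrium dissociation constant and the binding capacity.

```python
def scatchard_fit(bound, free):
    """Fit a least-squares line to (bound, bound/free) and return (Kd, Bmax).

    Assumes a single-site binding model, so that the Scatchard plot is linear
    with slope -1/Kd and x-intercept Bmax.
    """
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope            # slope of the Scatchard line is -1/Kd
    bmax = -intercept / slope    # x-intercept of the line is Bmax
    return kd, bmax


if __name__ == "__main__":
    # Synthetic, noise-free data generated from assumed values Kd = 2.0, Bmax = 10.0.
    true_kd, true_bmax = 2.0, 10.0
    free = [0.5, 1.0, 2.0, 4.0, 8.0]
    bound = [true_bmax * f / (true_kd + f) for f in free]
    kd, bmax = scatchard_fit(bound, free)
    print(f"Kd = {kd:.3f}, Bmax = {bmax:.3f}")
```

With real, noisy data the Scatchard transformation distorts the error structure, so a direct non-linear fit of the binding curve is often preferred; the linear form is shown here only because it is the analysis named above.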
Polypeptide levels may be measured using alternative peptide-specific reagents. Such reagents include peptide or RNA aptamers, which can specifically detect a defined polypeptide sequence. Proteins can be detected by protein gel assay, antibody binding assay, or other detection methods known in the art. For example, RNA aptamers can be produced by SELEX. SELEX is a method for the in vitro evolution of nucleic acid molecules with highly specific binding to target molecules. It is described, for example, in U.S. patents 5654151, 5503978, 5567588 and 5270163, as well as PCT publication WO 96/38579, each of which is specifically incorporated herein by reference.
The SELEX method involves selection of nucleic acid aptamers, single-stranded nucleic acids capable of binding to a desired target, from a library of oligonucleotides. Starting from a library of nucleic acids, preferably comprising a segment of randomised sequence, the SELEX method includes steps of contacting the library with the target under conditions favourable for binding, partitioning unbound nucleic acids from those nucleic acids which have bound specifically to target molecules, dissociating the nucleic acid-target complexes, amplifying the nucleic acids dissociated from the nucleic acid-target complexes to yield a ligand-enriched library of nucleic acids, then reiterating the steps of binding, partitioning, dissociating and amplifying through as many cycles as desired to yield highly specific, high affinity nucleic acid ligands to the target molecule.
SELEX is based on the principle that within a nucleic acid library containing a large number of possible sequences and structures there is a wide range of binding affinities for a given target. A nucleic acid library comprising, for example a 20 nucleotide randomised segment can have 4^20 structural possibilities. Those which have the higher affinity constants for the target are considered to be most likely to bind. The process of partitioning, dissociation and amplification generates a second nucleic acid library, enriched for the higher binding affinity candidates. Additional rounds of selection progressively favour the best ligands until the resulting library is predominantly composed of only one or a few sequences. These can then be cloned, sequenced and individually tested for binding affinity as pure ligands. Cycles of selection and amplification are repeated until a desired goal is achieved. In the most general case, selection/amplification is continued until no significant improvement in binding strength is achieved on repetition of the cycle. The iterative selection/amplification method is sensitive enough to allow isolation of a single sequence in a library containing at least 10^14 sequences. The nucleic acids of the library preferably include a randomised sequence portion as well as conserved sequences necessary for efficient amplification. Nucleic acid sequence variants can be produced in a number of ways including synthesis of randomised nucleic acid sequences and size selection from randomly cleaved cellular nucleic acids. The variable sequence portion can contain fully or partially random sequence; it can also contain subportions of conserved sequence incorporated with randomised sequence. Sequence variation in test nucleic acids can be introduced or increased by mutagenesis before or during the selection/amplification iterations and by specific modification of cloned aptamers.
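The arithmetic behind iterative selection can be illustrated with a toy model. The sketch below computes the sequence space of a randomised segment and follows the fraction of a single high-affinity sequence through successive rounds, assuming only two classes of molecule with fixed per-round retention probabilities. The retention values (0.9 and 0.009) and the function names are hypothetical and are chosen only to show the principle.

```python
def sequence_space(random_length, alphabet_size=4):
    """Number of distinct sequences in a fully randomised segment."""
    return alphabet_size ** random_length


def enrich(fraction_high, retain_high, retain_low, rounds):
    """Track the fraction of high-affinity molecules over selection rounds.

    Toy two-class model (an assumption): each round retains high-affinity
    molecules with probability `retain_high` and all others with probability
    `retain_low`; amplification then restores the pool without bias.
    """
    history = [fraction_high]
    for _ in range(rounds):
        kept_high = fraction_high * retain_high
        kept_low = (1.0 - fraction_high) * retain_low
        fraction_high = kept_high / (kept_high + kept_low)
        history.append(fraction_high)
    return history


if __name__ == "__main__":
    print("20-nt randomised segment:", sequence_space(20), "possible sequences")
    # One binder among 10^14 sequences, 100-fold preferential retention per round.
    for round_no, frac in enumerate(enrich(1e-14, 0.9, 0.009, 8)):
        print(f"round {round_no}: fraction of binder = {frac:.3e}")
```

In this idealised model a 100-fold enrichment per round brings a single sequence from one part in 10^14 to the majority of the pool in about eight rounds, which is consistent with the statement that the library eventually becomes dominated by one or a few sequences.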
Gene expression profiles
The results of the analysis of gene expression, whether by measurement of nucleic acids or polypeptides, require interpretation. Whilst smaller groups of genes can be analysed manually, larger sets of genes will almost certainly require the assistance of computational methods in order properly to interpret any results.
A number of mathematical techniques are available in the art for analysis of gene expression results, including hierarchical clustering techniques used herein. See The analysis of gene expression data: methods and software, edited by Giovanni Parmigiani, Elizabeth S Garrett, Rafael A Irizarry, Scott L Zeger, 2003, Springer, NY, incorporated herein by reference.
In analyzing the genes identified in the subject application, a hierarchical clustering analysis can be applied to construct gene profiles for the identification of tumor tissue. Hierarchical Cluster Analysis is defined as grouping or segmenting a collection of objects into subsets or "clusters". The objects to be clustered can be either the genes or the samples: genes can be clustered by comparing their expression profiles across the set of samples, or the samples can be clustered by comparing their expression profiles across the set of genes. In such a way, the genes (or samples) within each cluster are more closely related to one another than genes (or samples) grouped within different clusters. In a hierarchical clustering analysis, the genes (or samples) are not partitioned into a particular cluster in a single step. Instead, a sequential merging of the genes (or samples), from low level to high level, takes place depending on the measurements of pair-wise similarity between expression profiles. At the highest level, there may be a single cluster containing all genes (or samples), while at the lowest level the clusters each consist of singleton genes (or samples).
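The sequential merging described above can be sketched as follows. This is a minimal illustration that assumes average linkage and a distance of one minus the Pearson correlation between expression profiles; the specification does not fix either choice here, and the sample names and expression values are invented for the example.

```python
import math


def pearson_distance(a, b):
    """1 - Pearson correlation between two profiles (profiles must not be constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)


def average_linkage(cluster_a, cluster_b, profiles):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(pearson_distance(profiles[a], profiles[b])
                for a in cluster_a for b in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def agglomerate(profiles):
    """Merge the closest pair of clusters until one cluster remains.

    Starts from singleton clusters (lowest level) and returns the list of
    merges as (members_a, members_b, distance), ending with the merge that
    produces the single all-inclusive cluster (highest level).
    """
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j], profiles)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


if __name__ == "__main__":
    # Hypothetical samples, each profiled across the same four genes.
    samples = {
        "s1": [1.0, 2.0, 3.0, 4.0],
        "s2": [2.0, 4.0, 6.0, 8.5],
        "s3": [4.0, 3.0, 2.0, 1.0],
        "s4": [8.0, 6.0, 4.0, 2.5],
    }
    for left, right, dist in agglomerate(samples):
        print(left, "+", right, f"at distance {dist:.4f}")
```

The same functions cluster genes instead of samples if each key is a gene and each list holds that gene's expression across the samples. The exhaustive pairwise search is adequate for a small example only; established statistical software would be used for full data sets.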
In differentiating between two tumor types, a similar procedure as described above is performed after the samples are obtained and the selected gene sets are expressed either as individual genes or as sets. However, the focus will be on whether the gene sets represent Groups 1-6 of NSCLC subtypes, as identified herein. The expression patterns associated with each type of NSCLC can be differentiated either in the presence of controls, from other, known NSCLC types, or without controls once the expression patterns of the genes set forth herein have been established. The prognosis of patients with a tumour uses the same procedure as described above for obtaining the samples and expressing the gene sets either as individual genes or as sets.
Examples
Materials and Methods
NSCLC tumor samples and validation microarray data sets
Ninety-one resected tumor samples from NSCLC patients were collected at the Erasmus MC between 1992 and 2004. Tissues were collected and studied under an anonymous tissue protocol approved by the medical ethical committee of Erasmus University Medical Center Rotterdam (15).
Histopathological analysis
All tumor samples were independently reviewed by two pathologists. The dominant molecular characteristics of tumors were also verified by histology gene signatures established in the previous study (15). According to the molecular level classification, the cohort included 45 ADC, 27 SCC, and 19 LCC, including 3 CAR classified as LCC and 1 as ADC. Patient and tumor characteristics have been described in the previous study (15) and a summary is listed in Table 1.
Microarray data are available at the Gene Expression Omnibus (GEO) of the NCBI (GSE19188).
Validation microarray data sets
Two additional NSCLC microarray datasets were employed in this study to verify the identified gene predictors. One data set contained eighteen NSCLC cell line microarrays from the NCI-60 drug screen panel (16) and the other contained 96 primary NSCLC cases (17). The validation cell lines and tumors were transcriptionally profiled with Affymetrix U133 Plus 2.0 arrays and the complete microarray data sets were accessible in the Gene Expression Omnibus (GEO) database of the NCBI (GSE8332 and GSE3593). The sensitivity of the NSCLC cell lines to Pemetrexed was tested in vitro using a standard MTT colorimetric assay quantifying the amount of viable cells (16).
Microarray preparation and data analysis
RNA from frozen tumor tissues was isolated and processed according to the standard protocol for Affymetrix U133 Plus 2.0 arrays. The details of microarray data processing and normalization are as described previously (15). Microarray data are available in the GEO database (GSE19188). Microarray data were normalized with the RMA algorithm. RMA (Robust Multi-array Average) is an integrated algorithm comprising background adjustment, quantile normalization, and expression summarization by median polish (18). The intensities of mismatch probes were ignored because they give spurious estimates of non-specific binding. The intensities were background-corrected in such a way that all corrected values were positive. In the quantile normalization step, the signal value of each probe was replaced by the average of all probes with the same intensity rank across the chips/arrays. Finally, Tukey's median polish algorithm was used to obtain expression estimates from the normalized probe intensities. Intensities of probe sets lower than 30 were reset to 30. Probe sets were included in further analysis only if their expression levels deviated from the overall mean in at least one array by a minimum factor of 2.5, because the remaining data were unlikely to be informative. As a result, 43,160 probe sets were eliminated and 11,515 probe sets remained for further analysis.
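For illustration, the quantile normalization step and the subsequent flooring and filtering of probe sets may be sketched as follows. This is a simplified sketch and not the RMA implementation used in the study: ties are broken by position, and the "overall mean" is assumed here to be the arithmetic mean of a probe set across arrays, which the text does not specify. All function names and example values are hypothetical.

```python
def quantile_normalize(arrays):
    """Replace each value by the mean of the values sharing its rank.

    arrays: list of equal-length lists, one list of intensities per chip.
    Ties are broken by position (a simplification).
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [
        sum(s[r] for s in sorted_arrays) / len(arrays) for r in range(n)
    ]
    normalized_arrays = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])
        normalized = [0.0] * n
        for rank, idx in enumerate(order):
            normalized[idx] = rank_means[rank]
        normalized_arrays.append(normalized)
    return normalized_arrays

def floor_and_filter(probe_values, floor=30.0, factor=2.5):
    """Reset low intensities to the floor, then test for informativeness.

    probe_values: expression of one probe set across all arrays.
    A probe set is kept if at least one array deviates from the mean by
    the given factor in either direction (mean definition is assumed).
    """
    floored = [max(v, floor) for v in probe_values]
    mean = sum(floored) / len(floored)
    keep = any(v >= mean * factor or v <= mean / factor for v in floored)
    return floored, keep

# Hypothetical values: two chips with three probes each.
print(quantile_normalize([[5, 2, 3], [4, 1, 6]]))
# Hypothetical probe set measured on three arrays.
print(floor_and_filter([10, 40, 400]))
```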
Unsupervised clustering and novel grouping of NSCLC
Omniviz software (Omniviz, Maynard, MA) was used to measure the similarities in expression profiles among samples (15). The samples were ordered so that those sharing strong similarities were arranged together into clusters. The clusters and the individual samples within the clusters were sorted in such a manner that the more similar subjects were more closely positioned in the visualization matrix. Six distinct NSCLC clusters were identified by gene expression profiles, as described in (15).
Validation microarray data sets from public resources
NSCLC cell lines within the NCI-60 drug screen panel were transcriptionally profiled by Affymetrix U133 Plus 2.0 array (GSE8332). The sensitivity of these NSCLC cell lines to Pemetrexed was tested in vitro using a standard MTT colorimetric assay via quantifying the amount of viable cells (16).
A set of 96 primary NSCLCs were profiled by Affymetrix U133 Plus 2.0 array (GSE3593), and the complete microarray data was downloaded from http://data.genome.duke.edu/LungPotti.php (17). The expression of relevant probe sets/genes of interest was retrieved using a script written in MATLAB.
Scoring formula using Internal Reference Genes
The expression of genes encoding Pemetrexed targets, measured by microarray, was employed to classify NSCLC into different response groups. The schemes predicted tumor response first using the expression of TYMS, the major target of Pemetrexed, alone, and then using the expression of all 3 targets: TYMS, DHFR, and GART.
1. Internal Reference Genes (IRG)
To make the methodology less prone to cohort-inherent and technical variability, it was adjusted so that each sample is assessed individually: the expression level of the TYMS gene was scaled relative to a set of reference genes from the same microarray, the internal reference probe sets/genes (IRG).
To define IRG, we first selected the top 100 probe sets which showed a constant expression across all samples (Table 10). The constant expression of those probe sets was confirmed by an independent data set (Duke cohort, (17)), which contained a similar number of NSCLC samples (n = 96). To be applicable in the future for different platforms or different generations of the same platform, the presence of these probe sets on the U133 set of Affymetrix chips was verified as well. The average expression of 11 probe sets was used as the final set of internal reference genes to determine the relative expression of genes of interest (Table 11).
2. Percentile rank-based definition
- Responder: none of the probe sets/genes showed an expression above the 60th percentile of that population; in case all 3 target genes (14 probe sets) were used, fewer than 3 probe sets had an expression intensity above the 60th percentile of the population studied.
- Non-Responder: at least 2 out of 3 probe sets/genes showed an expression above the 60th percentile of the studied population; or 6 out of 14 in cases where all 3 targets were counted.
- Medium sensitive patients: failed to fall into either the Responder or the Non-Responder category.
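The scoring rules above can be sketched as follows, for illustration only. The sketch assumes that relative expression is the ratio of a probe set intensity to the mean IRG intensity, that the 60th percentile is taken by the nearest-rank method, and that "6 out of 14" means at least 6; none of these details is stated explicitly in the text, and the function names are hypothetical. The original calculations were performed in MATLAB; Python is used here only for the sketch.

```python
def relative_expression(intensity, irg_intensities):
    """Scale one probe set by the mean of the internal reference genes.

    The ratio form is an assumption; the text only states that expression
    was scaled relative to the IRG average.
    """
    irg_mean = sum(irg_intensities) / len(irg_intensities)
    return intensity / irg_mean

def percentile_cutoff(values, pct=60):
    """Nearest-rank percentile of a population (assumed definition)."""
    ordered = sorted(values)
    k = max(1, -(-pct * len(ordered) // 100))  # integer ceiling
    return ordered[k - 1]

def classify(sample_values, cutoffs, all_targets=False):
    """Return 'R', 'NR' or 'M' for one sample.

    sample_values: relative expression per probe set for the sample.
    cutoffs: the 60th percentile of the population for each probe set.
    all_targets: False for the TYMS probe sets alone, True when the
    14 probe sets of TYMS, DHFR and GART are used together.
    """
    n_above = sum(v > c for v, c in zip(sample_values, cutoffs))
    if all_targets:
        if n_above < 3:
            return "R"
        if n_above >= 6:  # "6 out of 14", read here as at least 6
            return "NR"
    else:
        if n_above == 0:
            return "R"
        if n_above >= 2:
            return "NR"
    return "M"

# Hypothetical example with three probe sets and identical cutoffs.
cutoffs = [6.0, 6.0, 6.0]
print(classify([5.0, 5.0, 5.0], cutoffs))
print(classify([7.0, 7.0, 5.0], cutoffs))
print(classify([7.0, 5.0, 5.0], cutoffs))
```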
All calculations were performed in MATLAB.
Supervised analysis to identify Pemetrexed resistance associated genes
Gene profiling with respect to predicted sensitivity to Pemetrexed was performed using Significance Analysis of Microarrays (SAM) (19). SAM identifies genes differentially expressed between two classes (19), e.g. predicted non-responders and responders. The obtained signatures were then analyzed to identify subsets of genes that retain the capacity of the complete signatures to distinguish the groups optimally (20). The performance of the minimized signatures was validated by "leave-one-out" cross-validation (21).
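The leave-one-out procedure can be illustrated with the following sketch. A plain nearest-centroid classifier is used here as a stand-in; it is an assumption made for brevity and is simpler than the shrunken-centroid approach of reference (20). The toy data and function names are hypothetical.

```python
def centroid(rows):
    """Per-gene mean over a list of expression profiles."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def nearest_centroid_predict(train_x, train_y, x):
    """Assign x to the class whose centroid is closest (squared Euclidean)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        members = [r for r, y in zip(train_x, train_y) if y == label]
        c = centroid(members)
        d = sum((a - b) ** 2 for a, b in zip(x, c))
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, and score the hold-out."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)

# Hypothetical toy data: six samples, two signature probe sets each.
xs = [[1.0, 1.2], [0.9, 1.0], [1.1, 0.8], [3.0, 3.1], [3.2, 2.9], [2.8, 3.0]]
ys = ["R", "R", "R", "NR", "NR", "NR"]
print(leave_one_out_accuracy(xs, ys))
```

Because every sample is predicted by a model that never saw it, the resulting accuracy is a less optimistic estimate than one computed on the training samples themselves.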
GO term based enrichment analysis
Probe set identifiers or gene symbols were used to retrieve functional annotation in terms of biological process (BP) and molecular function (MF) from Gene Ontology (GO) for the identified signature genes. Genes/probe sets which were not annotated in the GO knowledge database were excluded from further analyses. Mapped BPs and MFs were subjected to enrichment analysis to determine functional categories significantly overrepresented (DAVID) (22). The reference background used was the human genome.
The occurrence of gene members belonging to a certain GO category from the input gene list was compared to that from the gene population. For instance, 10% of input genes may belong to a GO category, while in the human genome, the enrichment of that GO category is 0.17% (50 out of 30,000 genes). The enrichment score was calculated based on the ratio of two enrichments, and the significance, enrichment p-value, was calculated using Fisher's exact test. Multiple test correction was controlled using false discovery rate (FDR) from the Benjamini-Hochberg method.
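The enrichment calculation can be illustrated as follows. The sketch uses a plain one-sided Fisher's exact test (the upper tail of the hypergeometric distribution) and the Benjamini-Hochberg adjustment; it does not reproduce any tool-specific modification of the test. The input list size of 100 genes is a hypothetical value chosen so that 10 genes correspond to the 10% mentioned above.

```python
from math import comb

def fold_enrichment(k, n, K, N):
    """Ratio of the category fraction in the input list to that in the background.

    k: input genes in the category, n: input genes,
    K: background genes in the category, N: background genes.
    """
    return (k / n) / (K / N)

def fisher_upper_tail(k, n, K, N):
    """One-sided Fisher's exact p-value: P(X >= k) under the hypergeometric law."""
    total = comb(N, n)
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# 10 of 100 input genes (hypothetical list size) versus 50 of 30,000 genes.
print(fold_enrichment(10, 100, 50, 30000))
print(fisher_upper_tail(10, 100, 50, 30000))
# Hypothetical p-values for four GO categories.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

With these numbers the category is about 60 times more frequent in the input list than in the background, which is the ratio of the two enrichments described above.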
A rarely reported problem with GO term-driven analyses is the inheritance of genes in an ancestry classification system. For instance, genes are repetitively assigned to categories, from ancestor to descendants. Here, a methodology is proposed to address this problem. First, all possible relationships between any two GO categories are identified and recorded in a matrix (MATLAB). Second, existing ancestor and subordinate categories are tagged. Then the relationship of all enriched GO categories is visualized in a diagram. The selected biological processes are condensed into classes by clustering related GO terms on the basis of interrelationship among processes in a network context. Then categories with a common ancestry are linked in a hierarchical tree. In subsequent analyses, all subordinate categories are combined to the highest level ancestor category to avoid the redundant counting of enrichment genes.
Analysis using other pathway knowledge databases
All identified differentially expressed genes were also mapped to other knowledge databases, including Ingenuity knowledge database (IPAD) which contains well-characterized metabolic and cell signaling pathways collected from journal articles, KEGG, BioCarta, and PANTHER databases (http://www.ingenuity.com). All mapped genes were used to generate probable molecular networks based on their connectivities illustrated in the knowledge database (23). Any achieved biological networks and associated genes were subjected to Fisher's exact test to determine the probability of random assignment in terms of p-value. The association between the genes and the pathways was also demonstrated by a ratio of the number of mapped genes divided by the total number of genes present in the corresponding pathway. The determination of significantly associated functional networks was made according to the rank of calculated p-values, and the top five networks were chosen.
Tissue Microarray Analysis (TMA)/Immunohistochemistry
Routine immunohistochemistry was performed as previously described (15). Tissue microarrays containing 0.6 mm cores of formalin-fixed paraffin-embedded tumors were used. The TMA comprised 70 of the 91 tumor tissues, in three replicates, from the Erasmus MC patient cohort used for the expression microarray analyses. TMA blocks were cut into 6 μm slices and antigen retrieval was performed by a 20 min incubation at 95°C using a Tris-EDTA buffer (Klinipath). Slides were cooled down to room temperature and stained with primary antibodies detecting TYMS, EGFR, TP53, TP63, TTF1, SYG, NCAM1, CHGA, or KRT5. The source of the antibodies and dilutions used are listed in Table 2.
The second step was incubation for 30 min with rabbit anti-mouse 1:50 (Z0259 Dako) followed by 1:50 diluted Alkaline Phosphatase Anti-Alkaline Phosphatase (APAAP method; D0651 Dako). Staining was visualized using 20 min development with New Fuchsine substrate.
TMA evaluation and protein staining quantification were performed double-blinded by a lung pathologist. The intensity of protein staining was classified using a four-grade scale: 0 for fewer than 10% positive cells, 1 for 10% to 25%, 2 for 25% to 50%, and 3 for greater than 50%.
Example 1
Classification of NSCLC in six subgroups
Unsupervised clustering of expression profiles revealed six subclasses within the 91 Erasmus MC NSCLC cases. The clustering was based on the similarity in global gene expression between the NSCLC samples and the six distinct subgroups were recognized using 4791 informative probe sets (Fig. 1). The majority of tumor samples (88 out of 91) were clustered into 6 groups. Two of these groups correlated well with classical histopathology: Group3 (G3) displayed a dominant SCC contribution, while the carcinoid (CAR) samples (n = 4) were exclusively assigned to Group5 (G5). By contrast, other groups showed a weak association with classical histology. They were to varying degrees composed of mixed histopathological NSCLC. ADC accounted for a major part of each of these groups, and most LCC / large cell neuroendocrine carcinoma (LCNEC) cases were mingled with ADC in Group4 (G4) and Group6 (G6). Compared to Group1 (G1) and Group2 (G2), ADC in G4 and G6 displayed gene expression patterns suggestive of neuroendocrine features. Regardless of histological consistency between G1 and G2, the NSCLCs in these two groups were distinguished by a low degree of cell differentiation in G1, and the expression of a large number of immune-related genes in G2.
Three tumor samples (G0) clustered with the healthy lung tissues and were totally separated from the other tumors. These samples were classified as 'healthy' by a 5-gene NSCLC signature in a previous study (15). Two non-cancerous samples were assigned to G2, together with three immunocyte-enriched tumors. The histological classification of these samples was ambiguous since their sections were difficult to interpret. We therefore assumed that these samples contained tumor cells similar to those of the other G2 tumors. Gene signatures that reflect patterns of gene and pathway deregulation, distinguishing these six subclasses, were bioinformatically identified (Tables 4 and 12).
Neuroendocrine tumors
We found that G4, G5, and G6 comprised neuroendocrine NSCLC, including LCNEC, CAR, and NSCLC with neuroendocrine features, mainly of the ADC histological subtype. All CARs were clustered into an independent group (G5). Although the expression profiles of CAR showed some similarity to those of G4 and G6, the observation that CAR displayed a unique transcriptome profile suggested that CAR is a group of NSCLC with distinct behavior with respect to tumor cell aggressiveness, tumor response to therapy, and prognosis.
Classical NSCLC can be subclassified by expression profiles
Gene expression profiling showed that G1 and G2 were exclusively composed of classical ADC. The profiles of G1 and G2 were distinct from those of all other groups; nevertheless, the two groups were clearly separated from each other in unsupervised clustering. The G1 signature was enriched in cell cycle genes and proliferative genes, while G2 was characterized by expression of genes associated with the complement system, immune response and cytokine secretion. This suggested that these two groups might display a different natural history of disease.
G3 was composed of classical SCC. The four non-SCCs in G3 were indistinguishable by expression profiles from the other SCCs in G3, with well known SCC markers, including TP63, KRTs and SERPINB, uniformly highly expressed. Additional pathological analysis revealed that two of them presented either positive staining for TP63 or apparent squamous cell elements.
These observations indicated that histopathological heterogeneity of cancer cells is a common feature of a large fraction of NSCLCs. Molecular phenotyping may be more sensitive than histopathological morphology in grouping NSCLC with respect to tumor behavior.
Individual novel groups characterized by unique gene markers
A total of 964 probe sets characterizing each of the six subgroups were identified by supervised analyses (Table 12). Genes assigned to multiple groups were excluded, and an optimized signature of 126 probe sets was derived (Table 4). As a result the percentage of correct classification decreased from 98% to 91%. Among these genes, surfactant proteins (SFTP), which are type II pneumocyte-specific markers, were highly expressed by tumors in G4, G5, and G6, while other groups were SFTP-negative.
Pathway profiles of novel groups
Pathway analysis revealed that distinct biological processes were characteristic for each group. G5 (CAR) was characterized by neuronal signaling pathways. G1 and G2 were associated with focal adhesion and cell adhesion processes respectively, confirming that groups with similar histological composition differed functionally in molecular processes (Table 3). This indicates that different oncogenic mechanisms may be operational in NSCLC, and that these are unrelated to histology as such.
We conclude that the newly identified subgroups of NSCLC have distinct molecular characteristics that go beyond the traditional histopathological classification.
Example 2
Gene expression-based prediction of response to Pemetrexed
The sensitivity of tumors to Pemetrexed treatment is thought to be negatively correlated with the expression levels of the enzymes in the nucleotide metabolic pathways (9, 24). Expression of the relevant genes was extracted from the microarray data, and these were subsequently utilized to develop predictive schemes for Pemetrexed responsiveness.
The expression of TYMS was compared to the average expression of the IRG. Each patient was designated as resistant (NR), sensitive (R), or medium sensitive (M) to Pemetrexed therapy. According to the IRG scheme, out of 91 NSCLC patients 34.1% were predicted to be non-responders, 52.7% were predicted to be responders, and 13.2% were predicted to have medium sensitivity to Pemetrexed (Table 8). The relative expression of TYMS in predicted NRs was 177.1 (95% confidence interval: 143.2-210.9), 8.2-fold higher than that in normal lungs, while predicted Rs displayed a 2.2-fold increase in relative expression of TYMS compared with normal lungs. When expression of DHFR or GART was included in the predictive scheme, a similar output was observed.
The differential activity of relevant pathways in NR and R determined by a global analysis is shown in Fig. 2. A set of 426 probe sets (346 genes) characterizing putative NRs was identified using supervised analysis (Table 13). The bioinformatics analysis revealed that signature genes were functionally enriched in gene sets related to cell cycle, cell proliferation, pyrimidine and purine metabolism, and folate biosynthesis (Table 5). Surfactant genes, SOX7, SLC16A4 and SLC46A3 were down-regulated in predicted NRs. DNA damage repair-associated genes contributing to multi-drug resistance were found to be over-expressed in predicted NRs, including TOP2A, PRIM1, and ATP-binding cassette (ABC) genes. Among the signature genes are a large number of cell cycle regulatory genes such as cyclins and CDCs; cell division related genes like E2Fs, GTSE1, KIFs, MCMs, and IGFBPL1; cell growth and invasion related genes including MMP19; and oncogenes and suppressor genes such as MYB, NBL1 and RAS. A subset of this signature, represented by 25 probe sets, performed optimally in predicting Pemetrexed response (Table 6) (20-21).
Correlation of putative NSCLC responsiveness to other tumor characteristics
Next, the predicted Pemetrexed resistance profile was studied in relation to NSCLC histology. The histological subtype (ADC, SCC or LCC) was assigned using the histology signature identified previously (15). Within the three major subtypes, LCC contained the highest expression of TYMS (192.0; 95% CI: 125.6-258.4), followed by SCC (86.6; 95% CI: 73.0-100.1) and ADC (76.1; 95% CI: 58.5-93.7) (Fig. 3). A significant difference in TYMS expression was observed between each subtype of NSCLC and non-cancerous tissues, with 8.85-, 4-, and 3.5-fold increase in LCC, SCC, and ADC, respectively. The difference in TYMS expression was statistically significant between LCC and the other two subtypes, ADC (p-value<0.002) and SCC (p-value<0.004), but not between ADC and SCC.
Predicted resistance to Pemetrexed was found in 27% of ADC, 33% of SCC, and 53% of LCC. In contrast, all healthy lung tissue samples (n = 64) except for one were stratified as sensitive to Pemetrexed.
Example 3
A novel NSCLC group is associated with predicted Pemetrexed resistance
Since predicted Pemetrexed-resistant cases were not significantly overrepresented in any of the histological subtypes, we correlated predicted Pemetrexed responsiveness to the six molecularly defined subgroups. In four of the six groups, no more than 25% of the cases were defined as NRs, with percentages of 20%, 0%, 0% and 25% in G1, G2, G5 and G6. About 32% of cases from G3, which was characterized by SCC, were predicted as NRs. In contrast, a remarkable proportion (94%, 17 out of 18) of NRs predicted by the 25 probe-set signature was observed in G4 (Fig. 4). We conclude that tumors in G3 (32% of SCC) and in particular G4 have the highest probability to be classified as NR.
Distinct molecular characteristics of NRs in G4
The expression profile of G4 was distinguishable from other neuroendocrine tumors. A differential over-expression of neuroendocrine markers, including ASCL1, DDC, and MAST4, was observed among neuroendocrine groups, with a 2- to 4-fold difference between G4 and G6. In contrast, MCM6 and CDCA7 showed a relatively higher expression in G4 compared with G6. Pemetrexed is transported in and out of cells by membrane proteins such as FOLR1, SLC19A1, and ATP-binding cassette (ABC) family members. Moreover, Pemetrexed is metabolized by folylpolyglutamate synthetase (FPGS). The aberrant expression of such molecules may also contribute to Pemetrexed resistance. We observed a lower expression of FOLR1 and a higher expression of ABCC1 in G4 compared with either other groups or other neuroendocrine tumors (p values < 0.015 and =0.038).
Bioinformatics analysis revealed that pyrimidine metabolism and EGF signaling pathways were more activated in G4 compared to other NSCLCs (Figs. 5, 6). Moreover, in comparison to the neuroendocrine NSCLCs in G6, pyrimidine metabolism was more up-regulated and the histidine pathway was more down-regulated in G4 (Fig. 7).
We conclude that the newly identified G4 of NSCLC has distinct molecular characteristics associated with predicted Pemetrexed sensitivity.
Distinct molecular characteristics of G3 NRs
Gene expression stratified SCCs in G3 into two putative groups differing in Pemetrexed responsiveness. The putative SCC NRs presented higher expression of TP53 and higher activity of the TP53-associated signaling pathway than putative Rs in this group. In addition to the TP53 signaling pathway, predicted SCC NRs differed from predicted SCC Rs in the expression of ABCC1 and FOLR2, with a 1.47- to 1.79-fold differential expression in putative NRs (Fig. 8).
Validation of TYMS expression in NSCLC by tissue microarrays
TYMS protein staining was performed using tissue microarrays of the NSCLC samples (n = 70) that were used for the microarrays. Staining of the tissue microarrays (TMA) was graded on a scale from 0 to 3 (Table 8). The mRNA expression level for each staining category is shown in Fig. 9. Since only one sample was graded 3, it was combined with the grade 2 NSCLCs. The mRNA expression of TYMS of grade 2 was 3.73-fold and 2.52-fold higher than that of grade 0 and grade 1 (p-value = 1.11E-7). The correlation between staining intensity and predicted Pemetrexed resistance is shown in Fig. 9. Over 87.5% of grade 2 NSCLC were predicted NR to Pemetrexed. Conversely, 33.5% of grade 1 NSCLC and 19.4% of grade 0 were predicted NR. The expression of TYMS in NR between grade 0 and grade 1 was not significantly different (p-value = 0.993). When the TYMS antibody was more diluted (1:50), over 85% (6 out of 7) of grade 2 samples were predicted NR, similar to the results of TMA at titer 1:10. Eight samples had moderate staining (grade 1), of which 2 samples were predicted NR (25%). TYMS expression was not detected with TMA at titer 1:50 in the rest of the samples (n = 61). We conclude that TYMS protein expression detected by immunohistochemistry correlates poorly with TYMS mRNA expression, in agreement with previous work (25). This precludes the use of TYMS staining for reliable prediction of Pemetrexed response.
Example 4
Validation of the putative Pemetrexed resistance signature
The expression of resistance associated genes identified with primary NSCLCs was validated in two independent sample cohorts, in transcriptionally profiled NSCLC cell lines (GSE8332) and primary NSCLCs (GSE3141/Duke Cohort) (16-17). The performance of our signature was evaluated by comparing the predicted Pemetrexed sensitivity of the cell lines to the measured response in drug sensitivity assays (16) (Fig. 4B). In addition, the sensitivity to Pemetrexed was predicted for primary NSCLCs in the Duke Cohort (Fig. 4C). The 25-probe set signature correctly predicted response to Pemetrexed in 94% (17 out of 18) of the cell lines. Resistant cell lines were all correctly predicted; the sensitivity of predicting resistance was 100% and specificity 91.7%. Similar to the EMC cohort, 39% of NSCLC cases (n = 37) in the Duke cohort were predicted NR to Pemetrexed, including 25 cases in G3 (resembling SCC histology) and 11 cases in G4 (Fig. 4C, Table 7). At the same time, 47% (18 out of 38) of SCC in the Duke cohort had putative sensitivity to Pemetrexed.
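As a worked check of the cell line figures, the reported percentages can be reproduced from a confusion matrix. The per-class counts below are not stated in the text; they are inferred here from the reported values (18 cell lines, 17 correct, all resistant lines correctly predicted, specificity 91.7%), which together imply 6 resistant and 12 sensitive lines with one sensitive line misclassified. The function name is hypothetical.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy, with resistance as the positive class."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

# Inferred counts (assumption): 6 resistant lines, all predicted resistant;
# 12 sensitive lines, 11 predicted sensitive and 1 predicted resistant.
metrics = confusion_metrics(tp=6, fn=0, tn=11, fp=1)
for name, value in metrics.items():
    print(name, round(100 * value, 1))
```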
Example 5
Potential for practical application
Next, we investigated whether (a combination of) routine histopathological markers could be used to stratify the putative Pemetrexed Rs / NRs. We used eight well known histopathological markers, including KRT5, EGFR, TTF1, TP53 and neuroendocrine markers (Table 2). The expression of any of these markers alone failed to stratify patients into the predicted groups with differential sensitivity. Furthermore, stratification by any of these routine histopathological markers also failed when it was performed with prior knowledge of the classical histology (Fig. 10). However, when the stratification was interpreted with the knowledge of the novel grouping and the expression of all routine pathological proteins was used cooperatively, putative Pemetrexed sensitive and resistant cases could be distinguished, with the best performance in G3 and still some exceptions in G4 and G6 (Fig. 10). Of the 18 Rs in predicted G3, 17 can be identified by a combination of staining for TP53 and EGFR. The expression of neuroendocrine markers was observed in both NRs and Rs from G1 and G6. There are four cases in G4 for which TMA staining was inconclusive since it failed for one or all markers. We conclude that the cooperative use of routine histological markers and molecular classification can be used to predict the sensitivity of a subset of NSCLC to Pemetrexed.
References
1. Travis WD, Gal AA, Colby TV, Klimstra DS, Falk R, Koss MN. Reproducibility of neuroendocrine lung tumor classification. Hum Pathol 1998 Mar;29(3):272-9.
2. Takeuchi T, Tomida S, Yatabe Y, et al. Expression profile-defined classification of lung adenocarcinoma shows close relationship with underlying major genetic changes and clinicopathologic behaviors. J Clin Oncol 2006 Apr 10;24(11):1679-88.
3. Bhattacharjee A, Richards WG, Staunton J, et al. Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses. Proc Natl Acad Sci U S A 2001 Nov 20;98(24):13790-5.
4. Anbazhagan R, Tihan T, Bornman DM, et al. Classification of small cell lung cancer and pulmonary carcinoid by gene expression profiles. Cancer Res 1999 Oct 15;59(20):5119-22.
5. Hanauske AR, Chen V, Paoletti P, Niyikiza C. Pemetrexed disodium: a novel antifolate clinically active against multiple solid tumors. Oncologist 2001;6(4):363-73.
6. Hanauske AR, Eismann U, Oberschmidt O, et al. In vitro chemosensitivity of freshly explanted tumor cells to pemetrexed is correlated with target gene expression. Invest New Drugs 2007 Oct;25(5):417-23.
7. Esteban E, Casillas M, Cassinello A. Pemetrexed in first-line treatment of non-small cell lung cancer. Cancer Treat Rev 2009 Jun;35(4):364-73.
8. Longo-Sorbello GS, Chen B, Budak-Alpdogan T, Bertino JR. Role of pemetrexed in non-small cell lung cancer. Cancer Invest 2007 Feb;25(1):59-66.
9. Bepler G, Sommers KE, Cantor A, et al. Clinical efficacy and predictive molecular markers of neoadjuvant gemcitabine and pemetrexed in resectable non-small cell lung cancer. J Thorac Oncol 2008 Oct;3(10):1112-8.
10. Ciuleanu T, Brodowicz T, Zielinski C, et al. Maintenance pemetrexed plus best supportive care versus placebo plus best supportive care for non-small-cell lung cancer: a randomised, double-blind, phase 3 study. Lancet 2009 Oct 24;374(9699):1432-40.
11. Scagliotti G, Hanna N, Fossella F, et al. The differential efficacy of pemetrexed according to NSCLC histology: a review of two Phase III studies. Oncologist 2009 Mar;14(3):253-63.
12. Ceppi P, Volante M, Saviozzi S, et al. Squamous cell carcinoma of the lung compared with other histotypes shows higher messenger RNA and protein levels for thymidylate synthase. Cancer 2006 Oct 1;107(7):1589-96.
13. Ceppi P, Volante M, Ferrero A, et al. Thymidylate synthase expression in gastroenteropancreatic and pulmonary neuroendocrine tumors. Clin Cancer Res 2008 Feb 15;14(4):1059-64.
14. Monica V, Scagliotti GV, Ceppi P, et al. Differential Thymidylate Synthase Expression in Different Variants of Large-Cell Carcinoma of the Lung. Clin Cancer Res 2009 Dec 15;15(24):7547-52.
15. Hou J, Aerts J, den Hamer B, et al. Gene expression-based classification of non-small cell lung carcinomas and survival prediction. PLoS One;5(4):e10312.
16. Hsu DS, Balakumaran BS, Acharya CR, et al. Pharmacogenomic strategies provide a rational approach to the treatment of cisplatin-resistant patients with advanced cancer. J Clin Oncol 2007 Oct 1;25(28):4350-7.
17. Potti A, Mukherjee S, Petersen R, et al. A genomic strategy to refine prognosis in early-stage non-small-cell lung cancer. N Engl J Med 2006 Aug 10;355(6):570-80.
18. Irizarry RA, Hobbs B, Collin F, et al. Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 2003 Apr;4(2):249-64.
19. Tusher VG, Tibshirani R, Chu G. Significance analysis of microarrays applied to the ionizing radiation response. Proc Natl Acad Sci U S A 2001 Apr 24;98(9):5116-21.
20. Tibshirani R, Hastie T, Narasimhan B, Chu G. Diagnosis of multiple cancer types by shrunken centroids of gene expression. Proc Natl Acad Sci U S A 2002 May 14;99(10):6567-72.
21. Golub T, Slonim D, Tamayo P, et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 1999;286:531-6.
22. Huang da W, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc 2009;4(1):44-57.
23. Calvano SE, Xiao W, Richards DR, et al. A network-based analysis of systemic inflammation in humans. Nature 2005 Oct 13;437(7061):1032-7.
24. Giovannetti E, Mey V, Nannizzi S, et al. Cellular and pharmacogenetics foundation of synergistic interaction of pemetrexed and gemcitabine in human non-small-cell lung cancer cells. Mol Pharmacol 2005 Jul;68(1):110-8.
25. Tubbs RR, Pettay JD, Roche PC, Stoler MH, Jenkins RB, Grogan TM. Discrepancies in clinical laboratory testing of eligibility for trastuzumab therapy: apparent immunohistochemical false-positives do not get the message. J Clin Oncol 2001 May 15;19(10):2714-21.
26. Rosell R, Felip E, Garcia-Campelo R, Balana C. The biology of non-small-cell lung cancer: identifying new targets for rational therapy. Lung Cancer 2004 Nov;46(2):135-48.
27. Jackman DM, Miller VA, Cioffredi LA, et al. Impact of epidermal growth factor receptor and KRAS mutations on clinical outcomes in previously untreated non-small cell lung cancer patients: results of an online tumor registry of clinical trials. Clin Cancer Res 2009 Aug 15;15(16):5267-73.
28. Raponi M, Zhang Y, Yu J, et al. Gene expression signatures for predicting prognosis of squamous cell and adenocarcinomas of the lung. Cancer Res 2006 Aug 1;66(15):7466-72.
29. Inamura K, Fujiwara T, Hoshida Y, et al. Two subclasses of lung squamous cell carcinoma with different gene expression profiles and prognosis identified by hierarchical clustering and non-negative matrix factorization. Oncogene 2005 Oct 27;24(47):7105-13.
30. Garber ME, Troyanskaya OG, Schluens K, et al. Diversity of gene expression in adenocarcinoma of the lung. Proc Natl Acad Sci U S A 2001 Nov 20;98(24):13784-9.
31. Wigle DA, Jurisica I, Radulovich N, et al. Molecular profiling of non-small cell lung cancer and correlation with disease-free survival. Cancer Res 2002 Jun 1;62(11):3005-8.
32. Bild AH, Yao G, Chang JT, et al. Oncogenic pathway signatures in human cancers as a guide to targeted therapies. Nature 2006 Jan 19;439(7074):353-7.
cohort (n=91)
Figure imgf000049_0001
Table 2. Antibodies used for tissue microarray analysis
Figure imgf000050_0001
Table 3. Distinct de-regulated metabolic pathways in the 6 NSCLC subgroups
Figure imgf000051_0001
Figure imgf000052_0001
Figure imgf000053_0001
227439_at ANKS1B 4.45 4.13 5.60 3.09 0.02 4.34
227440_at ANKS1B 2.88 2.27 3.00 2.38 0.03 2.74
227441_s_at ANKS1B 5.70 5.15 6.31 4.99 0.01 5.19
228494_at PPP1R9A 0.69 2.63 5.46 0.56 0.37 0.71
228509_at SKIP 6.37 5.88 8.58 6.68 0.01 6.66
229372_at GOLT1A 0.91 2.78 4.90 1.99 0.32 0.22
229414_at PITPNC1 1.84 2.12 3.27 1.58 0.13 0.35
229550_at KIAA1409 1.77 1.76 1.85 1.57 0.08 1.18
230464_at EDG8 2.57 2.02 0.24 2.00 2.51 2.43
230769_at DENND2C 2.07 1.66 0.25 2.53 2.14 2.55
231395_at ATP8A2 2.29 1.93 2.56 1.60 0.05 1.89
231488_at OTP 3.82 3.15 4.18 2.28 0.02 3.58
231626_at — 19.07 17.99 24.81 15.48 0.00 10.46
231771_at GJB6 11.30 33.88 0.03 9.19 39.79 38.47
232202_at — 3.95 8.94 0.11 2.90 8.00 8.39
232406_at — 2.89 1.91 0.31 1.83 0.48 3.34
233286_at — 1.37 1.18 1.18 1.18 0.21 1.16
233586_s_at KLK12 8.79 6.91 7.58 1.68 8.35 0.05
234261_at — 1.66 1.51 1.79 1.39 0.09 1.52
234316_x_at KLK12 3.86 3.56 3.87 1.25 4.03 0.09
235075_at DSG3 16.50 29.12 0.02 34.50 28.69 34.05
235077_at MEG3 5.47 4.42 5.86 0.85 0.03 5.40
236538_at GRIA2 2.82 2.45 3.12 2.42 0.03 2.59
237193_s_at — 1.39 1.32 1.57 1.44 0.12 1.14
237220_at — 2.08 1.67 2.06 2.11 0.10 0.53
237906_at — 3.04 2.66 3.12 2.22 0.03 2.73
238454_at ZNF540 1.44 1.16 1.65 1.58 0.10 1.48
238649_at PITPNC1 2.05 2.71 3.52 1.79 0.12 0.33
239594_at LOC145837 7.60 7.26 9.74 3.04 0.26 0.07
239765_at — 1.56 1.44 1.77 1.35 0.11 1.13
240292_x_at ANKS1B 2.14 1.99 2.37 1.93 0.05 1.95
240838_s_at LOC145837 4.83 4.24 5.85 3.43 0.15 0.13
242098_at LOC202451 1.31 1.40 1.54 1.31 0.14 1.10
242736_at — 1.33 1.07 2.00 1.45 0.10 1.39
244107_at — 4.39 3.69 0.12 4.03 4.12 3.99
Figure imgf000054_0001
Table 5. De-regulated pathways in predicted NR (based on 426 probe-sets / 346 genes)
Figure imgf000055_0001
Table 6. Minimized signature for prediction of Pemetrexed response
(25 probe sets)
Figure imgf000056_0001
NR: predicted non-responder to Pemetrexed
R: predicted responder to Pemetrexed
Table 7. Molecular classification of Duke cohort NSCLCs (n=96) and Pemetrexed sensitivity prediction
Figure imgf000057_0001
NR: predicted non-responder
R: predicted responder
G1-G6: predicted novel NSCLC subgroups
Table 8. Immunohistochemistry results for eight histology markers in Erasmus MC NSCLC samples
Figure imgf000058_0001
Figure imgf000059_0001
G0: normal lung group
G1-G6: predicted novel NSCLC subgroups
NR: predicted non-responder to Pemetrexed
R: predicted responder to Pemetrexed
NA: not available
Table 9. Summary of immunohistochemistry analysis of histology markers in Erasmus MC NSCLC samples
Figure imgf000060_0001
Table 10. Full signature of internal reference genes (IRG) (100 probe sets)
Figure imgf000061_0001
Figure imgf000062_0001
Figure imgf000063_0001
Table 11. Minimized signature for internal reference genes (IRG) (11 probe sets)
Figure imgf000064_0001
Table 12. Full signature for assignment to NSCLC subgroups (964 probe sets)
ProbeID | Gene symbol | G1:OG Ratio | G2:OG Ratio | G3:OG Ratio | G4:OG Ratio | G5:OG Ratio | G6:OG Ratio
1553413_at — 2.35 7.84 1.14 1.90 4.40 0.19
1555993_at — 1.60 1.51 2.59 1.41 0.28 0.28
1556147_at — 1.28 1.25 1.18 0.98 0.26 1.23
1556185_a_at — 0.27 0.65 3.03 4.23 2.74 0.72
1556205_at — 1.37 1.05 1.44 1.25 0.16 1.15
1556314_a_at — 0.81 0.60 1.18 2.03 0.49 0.99
1556331_a_at — 1.30 1.57 2.46 1.72 1.51 0.19
1556763_at — 1.30 1.50 1.85 1.66 1.64 0.23
1556764_s_at — 1.42 1.82 2.39 1.90 2.00 0.16
1556773_at — 9.55 18.20 0.04 10.74 19.69 7.28
1556856_at — 1.14 1.08 1.32 1.18 0.91 0.48
1556989_at — 1.20 0.12 1.83 2.26 1.68 1.55
1560652_at — 14.66 12.84 4.53 2.14 0.01 13.55
1561817_at — 2.03 1.21 3.31 2.44 0.83 0.12
1561956_at — 2.03 1.91 1.84 1.13 1.15 0.22
1562275_at — 1.44 0.16 1.72 1.89 1.07 0.98
1562783_at — 1.18 1.69 2.02 0.82 1.92 0.34
1562821_a_at — 1.33 5.96 7.69 4.07 5.51 0.07
1563462_at — 1.37 1.44 1.56 1.41 0.12 1.30
1566115_at — 1.41 1.65 1.20 1.37 0.91 0.37
1566764_at — 0.32 2.84 1.89 2.34 1.75 0.72
1566766_a_at — 0.33 2.11 2.07 2.34 1.74 0.68
1568932_at — 5.79 4.40 0.09 5.19 5.24 5.18
1569208_a_at — 0.80 1.67 2.00 1.49 0.78 0.37
1569344_a_at — 0.19 1.75 3.72 1.57 5.30 1.63
209353_s_at — 2.14 1.92 2.25 2.10 0.05 1.73
210108_at — 2.70 2.36 5.90 3.05 0.21 0.13
212444_at — 0.39 0.88 2.89 2.21 3.59 0.51
213106_at — 0.73 0.56 2.54 2.63 0.88 0.37
213841_at — 3.26 2.31 3.55 2.91 0.03 2.77
213904_at — 2.49 2.18 1.97 1.48 0.05 2.30
214078_at — 2.90 2.99 5.70 3.13 0.23 0.12
215126_at — 1.29 1.12 1.28 1.18 0.20 1.19
215653_at — 2.27 2.12 2.90 2.53 0.04 2.29
216573_at — 1.16 0.11 2.00 2.04 2.40 1.07
217085_at — 2.88 2.58 3.19 2.63 2.56 0.08
217521_at — 1.15 1.37 1.96 0.72 1.82 0.42
222303_at — 1.22 0.65 0.75 2.62 0.85 0.74
222379_at — 2.60 0.79 2.22 1.30 0.44 0.25
224996_at — 0.45 1.24 0.99 1.85 2.86 1.78
225008_at — 0.34 1.29 1.12 2.41 2.67 2.01
225611_at — 0.78 0.84 0.79 2.82 0.87 1.01
226034_at — 0.76 2.52 7.92 1.80 1.41 0.16
226621_at — 0.43 0.89 1.22 2.10 5.72 1.10
226848_at — 0.64 1.51 2.59 1.18 1.28 0.40
226885_at — 2.30 1.55 0.35 1.35 2.20 1.95
227041_at — 0.98 1.35 1.79 0.98 0.96 0.45
227235_at — 0.95 0.50 0.91 2.24 0.69 1.12
227499_at — 2.15 2.34 2.41 0.50 0.94 0.35
227524_at — 2.04 2.08 2.60 0.64 0.72 0.29
227826_s_at — 1.19 0.37 2.13 1.54 0.73 0.42
227827_at — 1.30 0.40 2.12 1.45 0.70 0.39
227955_s_at — 0.40 1.63 1.79 1.52 1.35 0.89
228101_at — 1.41 1.54 1.54 1.69 0.09 1.78
228504_at — 0.71 0.35 2.04 4.97 0.37 0.66
228750_at — 1.43 0.13 1.82 5.20 0.36 1.14
228962_at — 0.92 0.90 2.54 1.95 1.95 0.25
229281_at — 3.08 2.57 3.18 0.86 2.41 0.15
230175_s_at — 0.38 2.56 1.16 1.64 2.92 1.57
230208_at — 1.48 1.20 1.53 1.03 0.16 1.13
230428_at — 0.89 1.44 2.00 1.16 1.96 0.34
230499_at — 0.45 0.54 1.80 1.23 2.34 1.44
230585_at — 0.65 0.55 1.31 4.88 0.66 0.94
230831_at — 0.40 2.43 2.08 1.11 0.81 1.08
230860_at — 1.84 1.80 0.37 1.31 1.90 2.23
231181_at — 0.77 0.30 1.51 1.23 1.78 1.38
231315_at — 0.54 1.70 4.52 1.58 1.07 0.31
231331_at — 7.95 9.93 0.05 9.31 12.70 14.08
231384_at — 3.10 2.49 3.26 2.96 0.03 2.80
231582_at — 1.69 1.40 1.66 1.68 0.13 0.68
231601_at — 1.61 1.31 1.53 1.23 1.55 0.28
231626_at — 19.07 17.99 24.81 15.48 0.00 10.46
232151_at — 0.28 2.06 2.93 3.29 1.74 0.56
232202_at — 3.95 8.94 0.11 2.90 8.00 8.39
232406_at — 2.89 1.91 0.31 1.83 0.48 3.34
233185_at — 1.59 1.55 1.58 1.42 0.10 1.47
233223_at — 0.70 0.86 1.24 2.69 1.06 0.54
233286_at — 1.37 1.18 1.18 1.18 0.21 1.16
233814_at — 0.43 2.02 2.68 1.32 0.53 0.73
234261_at — 1.66 1.51 1.79 1.39 0.09 1.52
234350_at — 0.99 0.17 1.32 1.99 1.73 1.40
234415_x_at — 0.71 0.29 1.37 1.51 1.79 1.35
234624_at — 0.23 1.74 2.07 2.05 1.83 1.92
234638_at — 1.35 1.01 1.16 1.08 0.27 1.10
235299_at — 0.44 1.08 2.22 1.54 2.67 0.63
235676_at — 3.53 2.56 4.83 1.50 0.11 0.26
236277_at — 3.12 2.71 6.16 3.14 0.25 0.11
236338_at — 1.35 1.27 1.56 1.27 1.01 0.32
236383_at — 1.22 0.17 1.49 2.24 1.16 1.32
236489_at — 0.13 2.59 2.77 4.03 3.90 3.51
236610_at — 1.13 0.95 1.86 1.28 1.47 0.32
236666_s_at — 1.34 1.17 1.77 1.37 0.95 0.29
237101_at — 18.72 15.67 21.30 17.32 0.00 15.39
237193_s_at — 1.39 1.32 1.57 1.44 0.12 1.14
237220_at — 2.08 1.67 2.06 2.11 0.10 0.53
237509_at — 1.57 1.49 1.61 1.46 0.10 1.45
237573_at — 1.02 1.34 1.46 1.38 0.16 1.40
237698_at — 1.36 1.23 1.29 1.02 0.23 1.05
237906_at — 3.04 2.66 3.12 2.22 0.03 2.73
238103_at — 0.71 3.30 3.09 1.26 7.52 0.24
238309_x_at — 1.20 1.14 1.12 1.09 0.31 1.08
238674_at — 0.81 0.24 1.15 1.81 1.73 1.62
238827_at — 1.27 1.48 0.40 1.84 2.58 2.00
238879_at — 1.49 1.93 0.40 1.35 2.60 2.06
238898_at — 1.74 2.44 1.70 0.64 1.76 0.36
238967_at — 2.91 5.89 0.12 3.77 10.56 5.47
239082_at — 2.32 2.07 2.37 0.60 0.69 0.31
239127_at — 1.82 1.70 0.37 1.55 2.02 1.67
239611_at — 1.94 1.72 2.01 1.09 1.50 0.21
239765_at — 1.56 1.44 1.77 1.35 0.11 1.13
240143_at — 5.88 4.50 0.12 2.71 3.61 4.39
240218_at — 1.21 1.07 1.12 1.12 0.30 1.09
240236_at — 3.26 2.58 3.22 1.48 0.04 1.34
240713_s_at — 1.29 1.20 1.30 1.20 0.18 1.22
240869_at — 1.52 1.28 1.68 1.12 0.12 1.45
241870_at — 1.53 1.67 1.93 1.58 1.59 0.20
241897_at — 0.67 0.85 1.57 2.66 0.44 0.61
241898_at — 0.43 1.39 1.55 1.13 1.81 1.38
242286_at — 2.55 2.24 2.64 2.48 0.04 2.24
242736_at — 1.33 1.07 2.00 1.45 0.10 1.39
242868_at — 0.76 0.56 1.47 2.46 0.51 0.67
242909_at — 3.86 4.50 0.23 1.25 4.34 2.78
242931_at — 0.39 0.75 3.93 2.13 2.69 0.47
243248_at — 2.04 2.45 1.23 2.39 2.37 0.19
243334_at — 2.35 2.09 3.18 1.86 0.30 0.17
243586_at — 1.20 1.44 1.96 1.07 1.24 0.31
243729_at — 0.96 1.46 2.68 1.14 1.92 0.27
243780_at — 1.12 0.20 1.13 1.37 3.94 1.52
244107_at — 4.39 3.69 0.12 4.03 4.12 3.99
244170_at — 1.24 1.17 1.32 1.11 0.23 1.03
244247_at — 0.33 1.70 2.03 0.98 2.17 1.79
244384_at — 1.21 2.38 3.27 0.66 1.59 0.29
244724_at — 1.46 1.34 1.43 1.26 0.13 1.36
1553605_a_at ABCA13 3.92 2.96 0.13 5.12 4.91 2.53
208161_s_at ABCC3 0.22 1.84 2.46 2.81 3.93 0.94
209641_s_at ABCC3 0.24 1.78 2.36 2.22 2.66 1.11
209380_s_at ABCC5 3.01 3.30 0.21 1.96 2.06 3.02
226363_at ABCC5 2.89 3.33 0.25 1.87 1.00 2.58
210246_s_at ABCC8 5.02 3.70 5.78 2.47 0.02 1.95
223395_at ABI3BP 0.68 0.23 1.29 3.74 0.99 1.60
218322_s_at ACSL5 0.38 0.92 3.67 1.85 5.09 0.46
222592_s_at ACSL5 0.34 0.99 3.25 1.59 6.55 0.62
206046_at ADAM23 5.28 3.10 0.17 2.17 2.67 2.61
244463_at ADAM23 5.59 3.57 0.16 2.00 2.46 3.24
222162_s_at ADAMTS1 1.41 0.25 0.88 3.06 0.79 1.43
220287_at ADAMTS9 1.24 0.20 1.49 1.45 1.32 1.18
209612_s_at ADH1B /// AC 0.84 0.19 2.52 18.36 0.18 1.36
209613_s_at ADH1B /// AC 0.88 0.18 2.20 18.09 0.22 1.18
209614_at ADH1C 1.04 0.24 1.41 2.46 0.45 1.13
210505_at ADH7 8.18 6.52 0.08 3.74 7.10 6.74
202912_at ADM 2.04 0.78 0.37 1.29 3.17 3.09
202834_at AGT 2.57 2.59 8.99 1.10 1.38 0.10
221008_s_at AGXT2L1 1.62 1.82 2.58 1.39 2.13 0.17
202820_at AHR 0.53 0.94 0.94 2.27 2.01 1.35
206469_x_at AKR7A3 1.32 1.34 2.00 1.53 0.26 0.39
216381_x_at AKR7A3 1.28 1.31 2.07 1.43 0.29 0.38
205623_at ALDH3A1 8.63 7.36 0.07 7.90 13.76 3.84
205640_at ALDH3B1 0.46 0.89 1.93 1.52 2.20 0.73
214366_s_at ALOX5 0.46 0.63 1.48 1.80 1.82 1.40
214846_s_at ALPK3 0.83 1.64 3.08 1.08 0.30 0.48
228342_s_at ALPK3 0.77 2.57 4.15 1.07 0.24 0.47
205477_s_at AMBP 2.81 2.62 4.00 1.11 2.93 0.11
206121_at AMPD1 1.28 0.28 1.39 1.14 1.62 0.80
208498_s_at AMY1A /// AN* 0.27 0.64 1.96 7.39 2.25 1.18
224339_s_at ANGPTL1 1.13 0.48 1.10 1.26 0.89 1.16
231773_at ANGPTL1 1.11 0.41 1.19 1.34 0.95 1.22
213715_s_at ANKRD47 0.99 0.39 1.13 1.62 0.70 1.17
227439_at ANKS1B 4.45 4.13 5.60 3.09 0.02 4.34
227440_at ANKS1B 2.88 2.27 3.00 2.38 0.03 2.74
227441_s_at ANKS1B 5.70 5.15 6.31 4.99 0.01 5.19
240292_x_at ANKS1B 2.14 1.99 2.37 1.93 0.05 1.95
203074_at ANXA8 /// AN 2.31 4.46 0.11 8.97 11.01 9.82
204894_s_at AOC3 0.70 0.34 1.34 2.99 0.67 1.12
207542_s_at AQP1 0.18 0.79 3.36 6.61 0.97 1.80
39248_at AQP3 0.56 0.92 1.25 5.28 3.01 0.47
210066_s_at AQP4 0.47 0.88 2.08 2.24 0.85 1.10
210068_s_at AQP4 0.37 0.60 2.37 7.36 1.09 1.08
210906_x_at AQP4 0.39 0.61 2.11 5.63 1.25 1.04
205239_at AREG /// LOC 0.18 0.72 2.87 2.33 8.70 3.13
228368_at ARHGAP20 0.34 1.04 1.93 1.53 0.60 2.33
203264_s_at ARHGEF9 1.46 1.57 2.46 1.78 0.06 1.74
220658_s_at ARNTL2 1.51 3.51 0.27 1.60 5.78 5.27
224204_x_at ARNTL2 1.45 2.97 0.31 1.49 5.05 3.87
205894_at ARSE 0.95 1.76 4.98 2.19 1.41 0.15
210237_at ARTN 2.29 1.56 0.39 1.17 1.92 1.76
208481_at ASB4 2.75 2.17 2.83 2.40 0.04 2.23
217228_s_at ASB4 3.04 2.64 3.43 2.83 0.03 1.98
237721_s_at ASB4 4.84 3.99 5.84 4.52 0.02 1.07
209985_s_at ASCL1 2.56 1.97 2.69 1.14 0.69 0.17
209986_at ASCL1 1.55 1.20 1.48 1.12 1.20 0.33
209987_s_at ASCL1 32.23 27.95 36.55 0.60 0.57 0.10
209988_s_at ASCL1 73.58 77.90 60.32 0.94 0.56 0.06
213768_s_at ASCL1 6.42 5.36 6.85 0.96 0.55 0.10
209135_at ASPH 0.42 1.58 1.28 1.76 1.78 1.15
210896_s_at ASPH 0.27 2.41 1.78 2.41 1.40 1.28
242037_at ASPH 0.28 1.73 1.57 2.28 2.40 1.43
228890_at ATOH8 0.82 1.00 2.33 2.13 1.29 0.27
1554556_a_at ATP11B 2.03 1.73 0.35 1.36 2.37 1.97
1554557_at ATP11B 2.91 2.79 0.24 1.45 4.26 3.16
1564063_a_at ATP11B 1.70 1.65 0.43 1.23 1.82 1.76
1564064_a_at ATP11B 2.01 2.26 0.29 1.47 3.43 2.80
212536_at ATP11B 1.52 1.52 0.41 1.42 1.79 2.10
1557136_at ATP13A4 0.33 1.19 1.37 2.49 1.36 1.95
208836_at ATP1B3 1.53 1.15 0.40 1.58 3.63 1.99
219659_at ATP8A2 6.68 4.94 7.56 2.44 0.01 4.56
219660_s_at ATP8A2 2.79 1.55 3.45 1.41 0.05 1.42
231395_at ATP8A2 2.29 1.93 2.56 1.60 0.05 1.89
218899_s_at BAALC 3.26 2.55 1.41 1.38 1.30 0.20
222780_s_at BAALC 2.77 2.42 1.43 1.32 0.86 0.22
217911_s_at BAG3 0.77 1.23 0.83 2.25 1.56 0.78
210347_s_at BCL11A 3.56 1.99 0.51 0.62 1.16 2.09
219497_s_at BCL11A 5.37 2.58 0.42 0.64 1.42 2.30
222891_s_at BCL11A 5.33 2.38 0.43 0.61 1.55 2.35
214068_at BEAN 0.18 2.36 3.16 2.45 2.76 1.22
209203_s_at BICD2 1.96 2.11 0.30 2.06 1.52 1.89
212702_s_at BICD2 2.06 1.95 0.28 2.01 2.33 2.11
213154_s_at BICD2 2.05 1.77 0.28 2.16 2.81 2.09
206176_at BMP6 2.34 0.93 2.37 1.28 2.28 0.17
209591_s_at BMP7 6.66 5.51 0.26 0.75 4.70 7.13
1552487_a_at BNC1 3.98 5.46 0.09 5.65 6.37 6.58
206581_at BNC1 1.45 1.39 0.47 1.42 1.46 1.63
207369_at BRS3 1.40 1.23 1.43 1.26 0.14 1.30
201235_s_at BTG2 1.12 0.29 1.21 1.39 0.60 1.27
203571_s_at C10orf116 0.90 0.56 0.70 3.55 0.97 1.20
227736_at C10orf99 5.89 4.91 0.08 5.89 4.91 5.37
211835_at C12orf32 /// : 0.76 0.37 1.31 1.33 1.60 1.12
240353_s_at C12orf54 3.79 3.22 0.13 3.84 3.37 3.63
219471_at C13orf18 0.92 0.39 0.95 1.31 2.01 1.59
44790_s_at C13orf18 0.90 0.33 0.99 1.41 2.54 1.66
227058_at C13orf33 1.10 0.19 1.21 2.37 2.59 1.24
220293_at C14orf161 1.09 1.57 1.89 0.85 1.89 0.38
1552307_a_at C18orf17 1.85 1.94 1.78 1.81 0.67 0.21
1570552_at C18orf50 1.48 1.87 1.67 1.33 0.77 0.28
238480_at C18orf50 1.28 1.54 1.76 1.38 1.13 0.27
223913_s_at C19orf30 6.76 5.53 7.61 3.15 0.01 2.78
223631_s_at C19orf33 0.40 1.99 1.06 3.65 6.37 0.83
238625_at C1orf168 1.53 1.45 1.55 1.05 1.08 0.33
236979_at C1orf178 0.40 0.94 1.83 1.46 2.05 0.99
242013_at C1orf178 0.39 1.13 1.76 1.54 1.68 1.01
213925_at C1orf95 1.37 1.37 1.99 1.01 1.56 0.28
1553311_at C20orf197 1.66 1.57 1.67 1.44 0.95 0.25
218796_at C20orf42 1.54 3.99 0.33 2.43 13.32 1.07
229545_at C20orf42 1.61 2.28 0.43 1.57 2.62 1.10
60474_at C20orf42 1.56 4.70 0.33 2.38 12.52 1.09
1553169_at C20orf75 0.38 1.14 1.76 1.83 2.11 0.99
1553171_x_at C20orf75 0.29 1.47 2.20 2.30 2.31 1.07
226464_at C3orf58 2.04 1.44 0.36 1.74 1.18 1.77
205500_at C5 1.05 0.69 2.91 2.60 0.65 0.23
223194_s_at C6orf85 1.94 2.02 0.90 1.62 1.57 0.34
202992_at C7 0.96 0.16 1.41 5.91 0.62 1.28
235979_at C7 0.98 0.35 1.12 1.63 1.00 1.04
226614_s_at C8orf13 2.15 2.11 3.67 1.57 0.10 0.42
218541_s_at C8orf4 0.74 0.74 1.97 2.25 2.63 0.33
229964_at C9orf152 0.86 1.80 4.23 1.25 2.11 0.21
1558414_at C9orf4 1.45 1.39 1.25 1.17 0.16 1.42
219147_s_at C9orf95 0.92 0.71 0.61 2.11 1.42 1.68
204865_at CA3 1.09 0.24 1.52 1.28 0.63 1.27
243173_at CABP7 2.52 2.52 2.91 1.82 0.05 1.16
207998_s_at CACNA1D 1.37 1.14 1.52 1.20 0.40 0.47
219572_at CADPS2 0.80 0.82 2.19 1.38 0.84 0.42
210727_at CALCA 45.60 46.90 72.30 33.63 43.47 0.00
210728_s_at CALCA 51.58 42.93 67.29 42.76 47.30 0.00
217495_x_at CALCA 10.04 7.60 11.55 8.32 5.17 0.02
217561_at CALCA 52.59 37.75 68.33 19.46 39.08 0.00
214636_at CALCB 3.56 2.85 3.35 1.87 0.18 0.19
206331_at CALCRL 0.81 0.43 1.33 2.36 0.43 0.98
210020_x_at CALML3 18.71 12.17 0.07 3.54 15.38 5.29
229030_at CAPN8 0.48 0.85 4.03 3.17 1.66 0.29
208063_s_at CAPN9 1.29 1.69 2.35 1.89 1.07 0.20
210641_at CAPN9 1.41 1.72 2.03 1.70 1.12 0.21
223832_s_at CAPNS2 7.08 7.63 0.06 7.01 6.59 7.52
212586_at CAST 0.64 0.79 1.24 2.13 0.85 0.80
203065_s_at CAV1 0.81 0.75 0.73 3.56 0.58 2.09
212097_at CAV1 0.72 0.57 0.82 3.47 0.77 1.78
203323_at CAV2 0.58 0.94 0.76 3.90 0.89 2.07
203324_s_at CAV2 0.58 1.24 0.61 3.61 1.59 2.79
205379_at CBR3 2.29 1.96 0.26 1.63 5.24 2.83
220180_at CCDC68 0.45 1.49 2.49 2.04 0.34 0.83
243864_at CCDC80 1.14 0.26 0.89 2.28 2.07 1.30
205392_s_at CCL14 /// cq 1.00 0.13 1.73 3.39 1.07 1.46
223710_at CCL26 1.92 1.86 0.30 1.80 2.55 2.11
229900_at CD109 1.11 2.50 0.35 2.06 3.52 2.82
206398_s_at CD19 0.82 0.24 1.13 1.71 2.95 1.57
202877_s_at CD93 0.87 0.45 1.18 2.35 0.52 1.07
202878_s_at CD93 0.78 0.35 1.26 2.97 0.68 1.11
205627_at CDA 0.34 1.06 1.59 1.40 1.86 2.14
230060_at CDCA7 1.47 2.18 1.30 0.38 1.96 1.24
236331_at CDKL2 0.34 0.81 2.33 2.29 1.30 0.94
201884_at CEACAM5 0.60 2.93 3.15 1.47 37.73 0.25
203757_s_at CEACAM6 0.26 2.61 5.72 1.35 21.58 0.59
203973_s_at CEBPD 0.76 0.72 1.09 2.64 1.87 0.57
213006_at CEBPD 0.64 0.48 1.52 5.06 2.29 0.47
204203_at CEBPG 1.70 2.35 0.39 1.08 4.91 2.18
209172_s_at CENPF 1.74 3.31 1.28 0.31 3.64 1.41
223753_s_at CFC1 /// LOCI 2.42 1.92 2.61 1.86 0.04 1.92
223232_s_at CGN 0.40 2.46 3.95 0.94 1.72 0.59
223233_s_at CGN 0.31 1.86 3.73 1.36 1.77 0.70
204697_s_at CHGA 20.85 21.59 26.16 6.61 0.01 4.21
213060_s_at CHI3L2 1.04 0.20 1.89 0.92 3.54 1.19
220630_s_at CHIA 0.60 0.51 2.76 2.93 2.06 0.34
223987_at CHRDL2 1.24 0.25 1.25 1.91 1.99 0.66
206533_at CHRNA5 3.92 4.71 3.53 1.90 4.55 0.08
206756_at CHST7 2.42 1.13 0.32 1.91 2.13 1.79
212801_at CIT 1.60 4.18 2.65 1.11 3.84 0.17
242872_at CIT 1.82 2.01 1.99 0.95 2.33 0.23
209357_at CITED2 0.45 0.65 2.00 1.82 0.93 0.88
206164_at CLCA2 9.00 8.53 0.05 8.02 10.50 8.01
206165_s_at CLCA2 22.51 24.28 0.02 13.54 34.06 22.30
206166_s_at CLCA2 12.46 12.35 0.04 8.67 14.17 12.18
217528_at CLCA2 21.99 21.93 0.02 18.97 30.73 21.63
218182_s_at CLDN1 2.85 4.82 0.13 3.90 7.43 5.99
222549_at CLDN1 2.05 3.98 0.16 3.75 22.81 4.10
203953_s_at CLDN3 0.44 4.85 10.21 0.51 0.75 0.77
203954_x_at CLDN3 0.58 2.14 5.00 0.50 1.05 0.71
226244_at CLEC14A 0.82 0.54 1.24 2.09 0.57 1.01
205200_at CLEC3B 0.92 0.52 1.56 4.14 0.37 0.55
219890_at CLEC5A 0.43 2.08 1.25 1.58 5.72 1.46
219529_at CLIC3 0.44 0.73 1.41 2.48 0.73 1.39
213317_at CLIC5 0.40 0.71 2.12 2.75 1.31 0.96
208791_at CLU 0.97 0.33 1.42 2.20 0.27 1.35
208792_s_at CLU 1.00 0.33 1.25 2.24 0.31 1.37
207261_at CNGA3 0.53 1.69 2.08 1.78 1.68 0.42
212865_s_at COL14A1 1.24 0.14 1.35 4.03 0.92 1.24
214641_at COL4A3 0.31 0.93 2.34 1.75 0.82 1.47
222073_at COL4A3 0.23 0.84 3.11 2.28 1.09 1.72
229779_at COL4A4 0.33 0.84 2.78 1.54 0.73 1.33
213110_s_at COL4A5 4.15 2.08 0.32 1.49 0.38 3.72
213992_at COL4A6 5.03 4.08 0.14 3.10 1.03 5.14
204136_at COL7A1 2.07 2.49 0.18 3.85 4.26 3.85
217312_s_at COL7A1 1.56 1.74 0.29 2.47 2.05 2.50
205624_at CPA3 0.83 0.25 0.87 6.63 1.90 1.33
201116_s_at CPE 1.51 2.98 5.16 1.67 0.10 0.36
201117_s_at CPE 1.32 2.05 3.53 1.46 0.17 0.34
206100_at CPM 0.42 1.24 2.29 3.39 0.59 0.61
204920_at CPS1 0.98 4.73 28.29 1.77 7.95 0.09
217564_s_at CPS1 0.93 6.80 24.55 2.88 12.94 0.08
217552_x_at CR1 1.01 0.28 1.27 1.26 1.59 1.15
205544_s_at CR2 1.36 0.14 1.26 1.75 3.35 1.30
213059_at CREB3L1 1.04 0.65 1.47 2.14 2.09 0.35
206315_at CRLF1 0.39 4.01 8.01 5.50 2.12 0.20
206595_at CST6 0.12 2.43 3.09 3.75 6.45 2.50
204971_at CSTA 3.71 1.93 0.10 5.83 45.14 8.93
209617_s_at CTNND2 3.75 2.86 4.29 0.73 0.83 0.16
209618_at CTNND2 2.29 1.95 2.52 0.78 0.78 0.24
205927_s_at CTSE 0.33 0.70 2.21 8.87 6.91 0.67
205653_at CTSG 1.13 0.27 1.09 1.51 1.31 1.23
212977_at CXCR7 2.89 1.07 0.29 2.09 4.58 1.58
220230_s_at CYB5R2 2.29 2.05 0.28 1.50 3.45 2.66
223385_at CYP2S1 2.05 3.46 0.16 4.67 4.04 4.45
244407_at CYP39A1 1.85 1.75 1.83 0.27 1.02 1.60
239738_at DACH2 1.46 1.36 1.46 0.81 0.23 1.06
229290_at DAPL1 34.58 26.85 0.06 5.21 1.05 30.00
208335_s_at DARC 0.92 0.14 1.52 4.56 3.40 1.16
224911_s_at DCBLD2 0.37 1.77 1.13 1.77 2.97 1.85
209335_at DCN 1.05 0.36 0.77 3.12 1.59 1.09
222679_s_at DCUN1D1 1.93 2.55 0.38 1.14 1.62 2.13
205311_at DDC 16.21 21.26 27.85 1.42 0.11 0.14
214347_s_at DDC 4.88 4.00 5.56 2.26 0.11 0.19
230769_at DENND2C 2.07 1.66 0.25 2.53 2.14 2.55
204687_at DKFZP564O0! 0.86 1.00 2.62 1.76 0.55 0.32
230229_at DLG1 1.94 2.69 0.42 0.87 1.91 2.56
208250_s_at DMBT1 0.60 0.23 1.59 5.04 8.41 0.86
229588_at DNAJC10 1.06 1.10 1.97 0.83 1.56 0.43
218976_at DNAJC12 1.48 4.67 15.02 2.34 0.13 0.20
223721_s_at DNAJC12 1.71 5.22 12.05 3.02 0.10 0.21
223722_at DNAJC12 1.20 2.84 4.48 2.04 0.20 0.24
220668_s_at DNMT3B 1.98 2.86 1.05 0.36 1.72 1.60
224825_at DNTTIP1 0.49 1.39 1.59 0.97 1.76 1.21
234942_s_at DNTTIP1 0.40 1.83 1.83 0.95 1.67 1.44
230263_s_at DOCK5 0.80 1.54 2.63 1.81 0.56 0.31
203717_at DPP4 0.32 1.33 3.50 5.98 0.47 0.57
211478_s_at DPP4 0.24 2.07 5.16 8.10 0.37 0.66
1560916_a_at DPY19L1 0.39 2.56 1.63 1.29 3.17 0.96
212792_at DPY19L1 0.37 1.33 1.48 1.63 3.29 1.20
206032_at DSC3 18.45 15.26 0.03 11.81 17.63 17.87
206033_s_at DSC3 15.12 13.17 0.03 10.92 13.87 15.00
237268_at DSCAM 1.30 1.22 1.22 1.16 0.21 1.16
205595_at DSG3 10.45 11.34 0.03 15.65 11.04 15.34
235075_at DSG3 16.50 29.12 0.02 34.50 28.69 34.05
204455_at DST 7.59 41.80 0.05 5.82 28.71 26.60
216918_s_at DST 3.64 4.39 0.12 3.30 4.77 5.12
201041_s_at DUSP1 0.65 0.42 1.48 2.50 0.73 0.85
204014_at DUSP4 0.87 2.74 8.19 1.80 1.92 0.14
204015_s_at DUSP4 0.70 1.29 4.69 1.62 1.62 0.22
204947_at E2F1 1.36 1.41 1.20 0.50 1.55 1.12
206101_at ECM2 1.01 0.49 0.89 2.67 0.64 0.94
204642_at EDG1 0.90 0.37 1.44 2.11 0.40 1.01
204036_at EDG2 1.13 0.63 0.64 3.03 1.45 0.89
230464_at EDG8 2.57 2.02 0.24 2.00 2.51 2.43
218995_s_at EDN1 0.70 0.83 0.69 2.96 1.10 1.49
204464_s_at EDNRA 0.90 0.75 0.84 2.35 0.59 1.14
201842_s_at EFEMP1 0.82 0.41 0.71 3.68 2.36 1.62
201843_s_at EFEMP1 0.82 0.42 0.69 4.40 2.37 1.52
219454_at EGFL6 1.16 0.61 0.56 2.99 0.94 1.62
201983_s_at EGFR 0.87 3.67 0.45 3.61 0.46 3.26
228260_at ELAVL2 3.72 5.31 5.47 0.13 1.27 0.89
226982_at ELL2 0.86 0.47 1.34 2.66 0.55 0.65
219134_at ELTD1 0.93 0.34 1.12 2.33 1.38 1.10
222885_at EMCN 0.87 0.34 1.77 3.07 0.35 0.91
204483_at ENO3 1.85 2.74 2.39 1.48 2.50 0.15
223253_at EPDR1 0.37 1.43 2.28 1.59 0.66 1.03
203499_at EPHA2 0.49 1.01 0.87 2.03 2.55 1.82
216836_s_at ERBB2 0.49 1.52 1.75 1.08 2.12 0.82
205767_at EREG 0.24 1.09 3.95 0.74 7.07 6.56
220012_at ERO1LB 2.09 1.39 1.35 1.69 1.37 0.23
231944_at ERO1LB 2.44 1.82 1.98 1.56 0.96 0.17
224657_at ERRFI1 0.34 1.21 2.70 1.78 1.63 0.67
225369_at ESAM 0.63 0.54 1.47 2.45 0.52 1.10
205225_at ESR1 0.27 1.07 1.80 2.10 4.35 1.85
207981_s_at ESRRG 3.92 3.62 3.32 0.17 0.50 1.30
201328_at ETS2 1.13 0.68 0.80 2.42 0.61 0.87
231029_at F5 2.63 3.26 4.83 4.67 0.02 3.35
219694_at FAM105A 0.86 1.30 2.66 1.21 0.91 0.32
220377_at FAM30A 0.87 0.41 1.03 1.38 1.78 1.20
229764_at FAM79B 1.79 3.88 0.13 7.84 10.34 9.51
238460_at FAM83A 0.34 2.35 1.78 1.49 9.29 0.85
238741_at FAM83A 0.44 1.75 1.44 1.50 4.50 0.80
239586_at FAM83A 0.41 1.89 1.53 1.55 6.14 0.78
1556793_a_at FAM83C 4.35 3.91 0.12 3.86 3.85 3.92
208153_s_at FAT2 5.50 4.27 0.09 4.82 4.86 4.87
209696_at FBP1 0.43 0.89 1.77 2.09 3.48 0.86
218875_s_at FBXO5 1.33 2.39 1.34 0.38 2.62 1.30
205237_at FCN1 0.76 0.20 1.45 2.03 3.04 1.67
201798_s_at FER1L3 0.48 1.09 0.96 1.96 2.59 1.71
211864_s_at FER1L3 0.43 1.37 0.99 2.07 2.47 1.87
205649_s_at FGA 0.99 1.62 20.58 1.47 12.95 0.11
205650_s_at FGA 0.86 1.17 32.06 1.42 15.16 0.13
204988_at FGB 0.61 35.22 60.12 2.71 47.39 0.11
216238_s_at FGB 0.53 29.19 42.57 5.10 31.83 0.11
221310_at FGF14 8.70 6.44 9.14 7.71 0.01 5.29
230231_at FGF14 4.41 3.17 4.38 3.29 0.02 1.72
230288_at FGF14 6.68 4.55 6.25 4.95 0.02 2.10
1555103_s_at FGF7 1.10 0.37 1.02 1.23 1.25 1.21
205014_at FGFBP1 3.94 3.42 0.08 9.96 12.38 10.43
219612_s_at FGG 0.48 0.65 8.46 7.75 4.95 0.19
205305_at FGL1 1.02 9.12 18.56 9.27 12.95 0.06
220276_at FLJ22655 2.59 1.66 3.30 3.09 1.96 0.09
237131_at FLJ36032 1.36 1.05 1.42 1.32 0.39 0.48
1556641_at FLJ37228 3.19 2.76 3.67 0.62 0.06 3.01
231186_at FLJ43390 2.88 2.46 3.30 2.70 0.03 2.69
236902_at FLJ43390 2.88 2.36 3.12 2.84 0.03 2.77
204437_s_at FOLR1 0.38 0.62 4.74 1.57 1.22 0.67
40284_at FOXA2 1.64 2.37 4.19 0.75 0.30 0.32
228463_at FOXA3 1.19 0.86 3.04 1.36 0.54 0.27
206912_at FOXE1 8.36 10.33 0.05 8.12 11.16 12.71
225464_at FRMD6 1.94 1.18 0.24 3.03 3.90 2.92
225481_at FRMD6 1.88 1.14 0.25 3.10 4.70 2.75
206774_at FRMPD1 2.05 1.70 2.21 1.22 0.07 1.73
204948_s_at FST 5.26 1.61 0.29 1.94 2.08 0.98
207345_at FST 4.29 1.63 0.26 1.60 2.53 1.62
226847_at FST 8.19 1.93 0.25 1.44 1.34 1.78
219683_at FZD3 1.93 2.30 1.99 0.69 0.71 0.33
206525_at GABRR1 3.73 3.14 2.19 3.15 0.03 3.23
204121_at GADD45G 1.25 1.38 1.80 0.84 0.79 0.43
243779_at GALNT13 1.37 1.13 1.35 0.99 0.22 1.10
213280_at GARNL4 0.93 1.98 2.37 1.36 0.34 0.39
202177_at GAS6 0.83 0.50 0.82 2.16 1.29 1.37
1559606_at GBP6 4.67 4.34 0.10 4.96 4.24 4.72
1559607_s_at GBP6 6.53 8.51 0.05 10.32 9.69 9.97
211416_x_at GGTLA4 0.46 0.83 1.45 1.87 1.99 0.93
214524_at GHRH 29.48 23.01 32.16 3.80 0.00 25.52
221360_s_at GHSR 1.97 1.44 2.03 2.03 0.06 1.85
224554_at GHSR 1.76 1.42 1.88 1.72 0.07 1.76
223278_at GJB2 2.95 4.84 0.13 2.96 57.42 7.22
206156_at GJB5 1.94 2.71 0.22 2.87 2.88 2.78
231771_at GJB6 11.30 33.88 0.03 9.19 39.79 38.47
209276_s_at GLRX 0.36 0.66 1.97 2.15 1.52 1.13
220393_at GLULD1 0.44 2.49 11.18 1.17 0.57 0.37
204115_at GNG11 0.49 0.48 1.74 3.52 1.12 0.80
218692_at GOLSYN 0.89 1.67 3.48 1.82 0.95 0.21
229372_at GOLT1A 0.91 2.78 4.90 1.99 0.32 0.22
202756_s_at GPC1 2.38 2.06 0.23 2.40 2.80 2.24
207174_at GPC5 4.94 5.62 7.55 6.20 0.01 5.95
1554018_at GPNMB 1.99 1.86 0.23 2.61 18.03 2.89
235988_at GPR110 0.26 1.61 1.79 1.93 2.17 1.92
238689_at GPR110 0.25 2.26 1.65 1.62 3.63 2.91
232267_at GPR133 0.81 0.70 3.51 2.67 0.81 0.26
228949_at GPR177 0.96 0.66 0.70 3.04 0.95 1.13
207183_at GPR19 1.20 1.67 1.70 0.34 1.16 1.79
219936_s_at GPR87 1.81 17.11 0.12 6.01 20.19 21.04
204793_at GPRASP1 1.85 0.94 2.06 1.39 0.08 1.92
203108_at GPRC5A 0.39 0.94 2.48 1.52 4.57 0.72
224839_s_at GPT2 1.35 3.09 1.53 0.98 0.80 0.36
232116_at GRHL3 2.00 2.94 0.23 2.34 2.74 2.87
205358_at GRIA2 7.08 7.06 6.40 2.18 0.01 6.99
236538_at GRIA2 2.82 2.45 3.12 2.42 0.03 2.59
1569290_s_at GRIA3 1.80 1.69 1.82 1.60 0.07 1.72
230144_at GRIA3 1.44 1.30 1.37 1.43 0.13 1.33
204235_s_at GULP1 0.91 1.41 1.27 2.71 1.52 0.34
204237_at GULP1 0.92 1.34 1.30 2.10 1.39 0.36
215913_s_at GULP1 0.80 1.48 1.79 1.88 0.88 0.35
223837_at GULP1 0.93 1.78 1.33 1.40 1.19 0.43
202947_s_at GYPC 0.89 0.36 1.15 1.34 1.35 1.40
206010_at HABP2 0.65 1.13 2.08 1.96 2.35 0.34
206643_at HAL 1.33 1.99 6.39 0.48 5.24 0.26
223541_at HAS3 3.04 4.58 0.17 3.00 1.10 4.94
213069_at HEG1 0.86 0.57 0.83 2.20 1.12 1.13
226446_at HES6 4.11 7.79 5.73 0.18 3.29 0.45
205221_at HGD 1.29 1.43 3.93 1.38 0.28 0.26
214307_at HGD 1.35 1.83 2.83 1.15 0.32 0.30
220812_s_at HHLA2 0.08 3.67 4.84 4.27 3.55 4.13
216548_x_at HMG4L 1.20 3.95 1.87 0.87 3.48 0.29
203744_at HMGB3 1.08 5.44 1.82 0.79 3.91 0.34
225601_at HMGB3 0.93 5.32 4.27 0.98 6.91 0.19
203914_x_at HPGD 0.26 0.67 3.48 6.52 0.54 0.93
209581_at HRASLS3 0.47 0.72 1.72 1.30 2.32 1.07
226304_at HSPB6 1.17 0.38 1.30 1.17 0.60 1.04
221667_s_at HSPB8 0.84 0.37 1.02 2.94 1.67 0.86
210547_x_at ICA1 0.69 1.66 2.85 0.99 0.56 0.49
202637_s_at ICAM1 0.30 0.80 1.75 2.07 5.59 1.52
202638_s_at ICAM1 0.27 0.77 1.78 2.36 6.96 1.66
215485_s_at ICAM1 0.38 0.92 1.57 1.69 2.05 1.35
204683_at ICAM2 0.72 0.32 1.49 1.64 1.19 1.19
213620_s_at ICAM2 0.83 0.32 1.32 1.59 1.35 1.13
207194_s_at ICAM4 0.27 1.34 1.91 2.72 0.76 1.67
218100_s_at IFT57 0.40 1.12 1.36 1.63 1.55 1.69
209540_at IGF1 1.37 0.12 1.23 3.46 1.80 1.63
209541_at IGF1 1.30 0.14 1.20 3.69 1.83 1.48
209542_x_at IGF1 1.22 0.20 1.08 2.35 1.45 1.24
211577_s_at IGF1 1.28 0.20 1.10 2.12 1.19 1.30
215118_s_at IGHA1 1.20 0.17 1.25 2.03 3.21 0.99
222285_at IGHD 1.11 0.42 0.91 1.60 1.57 1.12
202948_at IL1R1 0.80 0.48 0.96 2.81 1.08 0.95
228575_at IL20RB 1.76 5.88 0.14 6.33 8.17 7.26
203126_at IMPA2 0.98 2.74 1.26 1.14 3.68 0.42
1552477_a_at IRF6 2.45 8.60 0.18 2.99 1.45 3.18
202597_at IRF6 1.83 5.37 0.27 2.25 1.70 1.94
201474_s_at ITGA3 0.34 2.04 1.40 2.03 1.93 1.22
208083_s_at ITGB6 0.30 3.04 1.50 1.81 4.07 1.54
208084_at ITGB6 0.35 2.14 1.42 1.73 2.40 1.30
219064_at ITIH5 1.05 0.30 1.41 1.18 0.55 1.69
205874_at ITPKA 0.25 1.98 2.08 1.58 1.76 1.75
216268_s_at JAG1 2.68 1.66 0.40 1.98 0.26 3.69
231183_s_at JAG1 3.19 1.89 0.27 1.96 0.59 3.95
207248_at KCNA4 1.31 1.19 1.36 1.22 0.17 1.24
1552507_at KCNE4 1.97 1.08 1.72 1.52 0.25 0.37
210119_at KCNJ15 0.99 0.74 0.53 6.40 1.09 1.43
205303_at KCNJ8 0.96 0.35 1.16 1.72 1.18 1.06
228414_at KCNMA1 1.62 1.38 1.89 1.50 0.08 1.56
204401_at KCNN4 0.25 0.87 2.40 1.49 4.85 2.00
222664_at KCTD15 1.64 2.27 0.40 1.36 1.11 2.17
222668_at KCTD15 1.95 2.35 0.32 1.50 1.79 2.67
206478_at KIAA0125 0.79 0.27 1.10 1.63 2.75 1.67
228325_at KIAA0146 0.67 0.40 1.64 4.74 2.62 0.45
221874_at KIAA1324 2.38 2.48 3.25 0.78 0.23 0.32
226248_s_at KIAA1324 2.80 2.42 3.37 0.83 0.30 0.24
229550_at KIAA1409 1.77 1.76 1.85 1.57 0.08 1.18
213316_at KIAA1462 0.82 0.62 1.03 2.72 0.33 1.55
230220_at KIAA1843 2.51 2.09 2.51 1.77 0.05 1.47
235708_at KLB 2.15 1.99 2.78 2.23 1.42 0.12
244276_at KLB 2.62 2.28 3.50 3.10 1.35 0.09
227875_at KLHL13 4.67 3.78 0.19 2.89 4.07 1.31
205470_s_at KLK11 1.81 5.38 1.97 1.55 4.67 0.15
220782_x_at KLK12 5.53 4.69 4.84 1.66 5.00 0.06
233586_s_at KLK12 8.79 6.91 7.58 1.68 8.35 0.05
234316_x_at KLK12 3.86 3.56 3.87 1.25 4.03 0.09
205783_at KLK13 1.93 1.80 1.00 1.70 1.69 0.29
207935_s_at KRT13 20.79 40.37 0.01 41.90 46.64 47.71
209351_at KRT14 21.92 35.96 0.01 48.68 58.30 65.02
204734_at KRT15 8.14 20.52 0.05 6.05 23.17 16.39
209800_at KRT16 5.31 7.34 0.06 9.49 13.40 11.36
205157_s_at KRT17 2.38 5.46 0.09 13.47 36.66 22.88
212236_x_at KRT17 2.11 3.31 0.11 9.16 30.91 16.01
201820_at KRT5 30.70 49.84 0.01 21.97 70.38 58.97
214580_x_at KRT6A /// KR 4.72 10.32 0.05 15.96 26.23 20.46
209125_at KRT6A /// KR 3.49 22.40 0.05 32.59 166.61 66.23
209126_x_at KRT6B 3.76 10.26 0.07 9.41 14.94 10.85
213680_at KRT6B 3.57 25.52 0.05 13.96 70.74 61.23
209016_s_at KRT7 0.34 1.43 3.94 1.04 15.96 0.70
218701_at LACTB2 0.42 2.06 2.06 1.40 3.76 0.62
207509_s_at LAIR2 1.14 1.14 1.34 1.65 2.30 0.33
227048_at LAMA1 4.86 3.54 0.18 3.85 5.35 1.15
205569_at LAMP3 0.63 0.41 1.07 4.48 1.79 1.10
204414_at LARGE 1.16 1.09 1.14 1.15 0.28 1.15
215543_s_at LARGE 1.75 1.04 1.51 1.19 0.14 1.14
1554252_a_at LASS3 2.04 1.70 0.31 1.89 1.88 1.84
218717_s_at LEPREL1 1.16 1.39 0.34 4.11 2.22 2.13
219699_at LGI2 1.24 0.75 1.28 1.26 1.07 0.47
212325_at LIMCH1 0.61 1.47 2.70 1.84 0.71 0.37
212327_at LIMCH1 0.62 0.80 2.96 2.38 0.69 0.36
235871_at LIPH 0.42 1.95 1.42 1.04 2.17 1.73
230865_at LIX1 1.81 1.57 1.87 1.57 0.07 1.68
204424_s_at LMO3 0.26 0.99 4.53 4.29 5.04 0.49
242722_at LMO7 0.41 1.42 2.19 1.55 1.24 0.75
239594_at LOC145837 7.60 7.26 9.74 3.04 0.26 0.07
240838_s_at LOC145837 4.83 4.24 5.85 3.43 0.15 0.13
242098_at LOC202451 1.31 1.40 1.54 1.31 0.14 1.10
242601_at LOC253012 9.97 8.17 11.29 0.41 6.84 0.13
1556638_at LOC284530 6.39 5.55 5.51 4.85 5.85 0.04
241418_at LOC344887 9.04 25.53 0.04 8.11 26.51 10.73
227862_at LOC388610 0.53 1.61 3.30 2.14 0.90 0.32
237450_at LOC389332 1.23 1.05 1.23 1.18 0.22 1.22
229599_at LOC440335 0.75 1.43 2.43 1.11 0.80 0.41
237745_at LOC641467 1.15 1.14 1.13 1.08 0.33 1.09
1555942_a_at LOC642587 1.85 1.74 0.33 1.80 1.71 1.92
226755_at LOC642587 4.27 4.20 0.11 4.01 4.48 4.71
226789_at LOC647121 0.36 0.51 1.81 2.14 8.33 1.22
215812_s_at LOC653562 // 4.03 4.64 0.16 1.88 7.09 4.39
1557285_at LOC727738 0.25 1.15 2.01 2.05 2.13 1.70
230869_at LOC728215 1.02 1.64 2.38 0.82 0.13 1.82
217520_x_at LOC731884 2.51 2.67 2.20 0.59 0.39 0.39
210102_at LOH11CR2A 1.05 0.76 1.38 2.27 0.93 0.38
229584_at LRRK2 0.43 0.36 2.11 6.17 2.35 0.81
204682_at LTBP2 0.78 0.52 0.84 2.29 2.14 1.10
204952_at LYPD3 3.32 3.52 0.13 2.90 6.86 7.09
219059_s_at LYVE1 1.25 0.12 1.72 3.76 0.72 1.42
220037_s_at LYVE1 1.15 0.16 1.67 3.23 0.56 1.34
226770_at MAGI3 0.45 1.01 1.83 1.40 0.98 1.10
204388_s_at MAOA 0.90 1.49 0.72 3.51 1.48 0.63
204389_at MAOA 0.93 1.34 0.70 3.28 1.30 0.69
212741_at MAOA 0.76 0.96 1.20 3.81 0.98 0.46
203841_x_at MAPRE3 1.23 1.25 1.28 1.35 0.17 1.23
214270_s_at MAPRE3 1.28 1.45 1.31 1.31 0.16 1.30
230112_at MARCH4 1.83 1.68 1.93 1.37 0.07 1.84
221047_s_at MARK1 3.89 3.92 0.40 1.12 0.54 1.47
225613_at MAST4 0.88 0.94 0.71 2.90 0.86 1.02
40016_g_at MAST4 0.98 0.88 0.68 2.30 0.98 1.04
212935_at MCF2L 1.57 1.34 1.86 1.05 0.33 0.42
35147_at MCF2L 1.65 1.34 1.93 1.02 0.32 0.41
201930_at MCM6 1.36 2.07 1.22 0.38 4.68 1.49
209200_at MEF2C 0.73 0.37 1.29 1.92 0.89 1.19
210794_s_at MEG3 2.43 1.52 2.38 0.95 0.07 2.52
212732_at MEG3 5.65 4.45 5.89 0.90 0.03 5.47
226211_at MEG3 4.42 2.47 2.92 0.59 0.06 4.00
227390_at MEG3 1.71 1.50 1.79 1.46 0.08 1.57
235077_at MEG3 5.47 4.42 5.86 0.85 0.03 5.40
203510_at MET 0.28 2.05 1.64 1.35 2.95 2.51
211599_x_at MET 0.36 1.68 1.98 1.03 1.11 1.66
213807_x_at MET 0.39 1.62 1.93 0.93 1.15 1.80
213816_s_at MET 0.32 1.97 1.70 1.22 1.60 1.99
1554667_s_at METTL8 1.67 1.60 0.40 1.20 2.21 2.38
226346_at MEX3A 1.50 1.97 1.74 0.35 1.22 1.00
223779_at MGC10981 0.29 2.44 5.34 1.01 5.76 0.65
214696_at MGC14376 0.77 0.83 1.02 2.91 0.59 0.87
211026_s_at MGLL 0.42 0.76 1.71 1.86 2.65 1.07
225102_at MGLL 0.40 0.76 1.65 1.95 3.50 1.07
55081_at MICALL1 1.67 1.48 0.41 1.19 3.43 2.24
239273_s_at MMP28 0.31 1.28 1.28 2.70 2.29 2.26
219091_s_at MMRN2 0.91 0.34 1.22 2.12 0.59 1.13
228599_at MS4A1 0.81 0.14 1.44 2.85 3.11 1.80
207496_at MS4A2 0.98 0.35 0.98 1.77 1.55 1.08
224355_s_at MS4A8B 3.94 4.41 6.45 2.09 0.02 2.70
207430_s_at MSMB 8.04 1.31 2.56 6.06 30.97 0.06
210297_s_at MSMB 7.33 1.39 2.59 5.78 25.57 0.06
243804_at MTMR7 5.41 4.73 6.38 4.73 1.18 0.04
212095_s_at MTUS1 0.91 1.34 1.48 2.17 0.80 0.36
239576_at MTUS1 1.27 1.11 1.07 2.16 1.37 0.36
207847_s_at MUC1 0.36 0.77 3.57 1.90 3.42 0.57
211695_x_at MUC1 0.38 0.75 2.96 1.77 2.62 0.65
213693_s_at MUC1 0.45 0.86 2.84 1.71 3.21 0.50
218687_s_at MUC13 1.16 1.01 2.11 1.51 1.48 0.25
222712_s_at MUC13 0.97 1.22 2.41 1.37 1.17 0.27
220196_at MUC16 0.23 1.92 2.59 1.11 4.59 2.75
226856_at MUSTN1 1.28 0.20 1.50 1.68 0.61 1.06
201497_x_at MYH11 1.32 0.48 0.64 3.85 0.57 1.12
1554579_a_at MYO18B 2.18 1.80 2.29 1.83 0.05 1.89
210341_at MYT1 2.83 2.21 3.09 0.78 0.30 0.27
210016_at MYT1L 2.62 2.19 2.80 2.04 0.04 2.38
242918_at NASP 1.75 1.82 1.58 0.39 0.44 1.46
225344_at NCOA7 0.47 0.91 1.65 1.22 2.32 1.12
1556057_s_at NEUROD1 8.05 7.66 9.53 2.34 4.97 0.04
206282_at NEUROD1 5.52 4.75 6.71 1.78 4.10 0.06
207437_at NOVA1 1.54 1.48 1.55 1.08 0.13 1.28
220316_at NPAS3 1.78 1.66 1.82 1.00 1.13 0.27
230412_at NPAS3 1.80 1.48 1.84 0.79 1.21 0.31
220106_at NPC1L1 0.75 1.49 1.92 1.38 0.61 0.46
212298_at NRP1 0.49 0.82 1.35 2.25 0.79 1.14
203939_at NT5E 0.31 1.12 1.30 2.43 2.08 2.17
223315_at NTN4 0.43 0.53 1.44 3.30 1.46 1.42
221795_at NTRK2 16.88 6.50 0.06 4.00 8.22 18.24
221796_at NTRK2 10.36 5.02 0.07 4.03 7.89 14.45
205728_at ODZ1 1.68 4.96 5.12 4.77 5.53 0.07
231867_at ODZ2 9.81 7.47 0.08 4.48 3.03 5.33
218730_s_at OGN 2.34 0.17 2.59 4.38 0.17 0.85
205729_at OSMR 0.37 1.60 1.39 1.83 2.74 1.14
228399_at OSR1 1.43 0.25 1.58 1.48 0.26 1.50
223835_x_at OTP 8.67 7.27 9.81 3.01 0.01 7.64
231488_at OTP 3.82 3.15 4.18 2.28 0.02 3.58
206859_s_at PAEP 0.34 11.93 17.08 11.35 11.97 0.16
227354_at PAG1 0.47 0.78 1.56 1.48 1.57 1.21
205719_s_at PAH 1.61 2.75 3.45 0.53 1.07 0.30
218736_s_at PALMD 1.01 0.34 1.00 2.30 0.84 1.20
210335_at PAMCI 3.22 3.83 0.24 3.50 3.08 0.99
219543_at PBLD 0.82 0.99 2.17 1.33 0.79 0.40
203860_at PCCA 3.17 2.87 3.96 0.44 2.07 0.23
227289_at PCDH17 0.97 0.38 0.94 2.67 0.88 1.33
228863_at PCDH17 0.86 0.39 1.01 2.29 1.00 1.23
232054_at PCDH20 1.01 1.58 2.32 1.48 1.19 0.27
228640_at PCDH7 1.56 0.74 0.33 5.29 1.12 2.00
211876_x_at PCDHGA10 // 1.18 1.15 1.16 1.01 0.29 1.24
216352_x_at PCDHGA3 1.14 1.12 1.19 1.03 0.29 1.22
205825_at PCSK1 9.19 14.12 38.37 6.15 0.07 0.14
1554717_a_at PDE4D 1.09 1.51 2.13 1.19 1.56 0.28
204491_at PDE4D 0.82 0.82 1.96 1.68 1.61 0.35
1554789_a_at PDE8B 2.05 2.13 2.48 0.62 0.10 2.08
1553589_a_at PDZK1IP1 0.33 0.94 1.85 1.57 7.50 1.30
219630_at PDZK1IP1 0.31 0.99 1.89 1.70 9.96 1.21
227848_at PEBP4 0.43 0.47 1.61 5.69 1.65 0.95
217744_s_at PERP 1.62 7.78 0.33 1.29 2.11 2.69
222392_x_at PERP 1.44 4.11 0.39 1.25 2.59 2.09
1555236_a_at PGC 2.52 2.38 13.12 13.80 7.77 0.04
205261_at PGC 1.93 1.69 12.78 17.02 8.72 0.04
206726_at PGDS 0.55 0.36 1.23 3.56 1.76 1.20
227949_at PHACTR3 0.79 5.25 9.71 1.32 0.30 0.24
209803_s_at PHLDA2 0.32 1.87 1.60 1.79 6.67 1.07
1557578_at PHLDB2 1.40 1.37 0.42 1.94 1.06 1.84
219093_at PID1 0.94 0.77 1.75 3.47 0.12 1.41
215129_at PIK3C2G 1.26 1.48 2.02 1.94 1.75 0.20
229 14_at PITPNC1 1.84 2.12 3.27 1.58 0.13 0.35
238649_at PITPNC1 2.05 2.71 3.52 1.79 0.12 0.33
207558_s_at PITX2 2.28 7.65 6.47 0.94 8.48 0.10
230673_at PKHD1L1 0.65 0.33 1.41 1.39 2.24 1.46
205724_at PKP1 2.51 2.24 0.25 1.98 2.34 2.31
221854_at PKP1 10.22 17.06 0.05 5.35 22.45 8.41
207222_at PLA2G10 1.00 1.40 3.54 2.56 3.46 0.15
203649_s_at PLA2G2A 2.33 0.13 2.24 4.92 6.30 0.36
210145_at PLA2G4A 0.74 2.25 3.72 1.61 9.75 0.20
211924_s_at PLAUR 0.48 0.71 1.19 1.48 7.71 1.79
1557126_a_at PLD1 1.38 1.61 0.43 1.65 1.60 1.68
215723_s_at PLD1 1.52 2.83 0.28 2.07 2.90 3.01
226636_at PLD1 1.20 1.68 0.37 2.14 3.12 2.23
232530_at PLD1 1.87 2.68 0.22 2.58 6.49 2.93
210139_s_at PMP22 0.89 0.44 0.88 2.34 1.39 1.10
210830_s_at PON2 0.39 1.52 2.24 1.00 0.95 1.38
209529_at PPAP2C 1.03 3.89 1.12 1.15 6.24 0.41
219195_at PPARGC1A 2.16 1.97 2.98 1.45 0.45 0.17
228494_at PPP1R9A 0.69 2.63 5.46 0.56 0.37 0.71
210670_at PPY 2.66 2.10 2.83 2.61 0.03 2.53
207808_s_at PROS1 0.59 0.61 1.02 2.52 2.05 1.20
1552455_at PRUNE2 1.91 1.69 1.95 0.77 1.91 0.27
212805_at PRUNE2 2.02 1.19 2.46 1.30 0.29 0.28
201896_s_at PSRC1 1.94 2.41 1.31 0.36 1.56 1.06
211663_x_at PTGDS 0.83 0.20 1.20 2.18 3.10 1.54
224950_at PTGFRN 1.62 1.44 0.39 1.67 3.71 1.49
206300_s_at PTHLH 9.44 16.39 0.05 6.64 21.89 8.16
210355_at PTHLH 5.03 5.85 0.11 3.65 6.80 3.91
211756_at PTHLH 10.45 31.88 0.05 7.40 44.24 7.71
230250_at PTPRB 0.66 0.48 1.49 3.96 0.55 0.78
211534_x_at PTPRN2 1.45 1.14 1.76 1.46 0.11 1.10
204469_at PTPRZ1 22.98 16.49 0.08 2.92 7.39 4.05
208789_at PTRF 0.96 0.60 0.70 2.04 1.00 1.59
208790_s_at PTRF 1.09 0.64 0.61 2.29 0.94 1.71
205174_s_at QPCT 1.04 2.28 4.73 0.96 0.47 0.27
228708_at RAB27B 0.32 1.31 1.46 2.54 4.98 1.42
204146_at RAD51AP1 1.78 4.11 1.03 0.34 5.04 1.75
219440_at RAI2 1.14 0.41 1.19 2.67 0.24 1.20
203911_at RAP1GAP 0.70 1.35 2.23 1.81 0.61 0.39
218657_at RAPGEFL1 1.32 2.43 0.36 1.84 1.82 2.38
223467_at RASD1 1.02 1.36 4.40 1.54 0.56 0.22
207836_s_at RBPMS 0.66 0.59 1.35 2.48 0.79 0.76
227467_at RDH10 1.21 1.64 1.74 0.71 5.10 0.40
242998_at RDH12 8.70 7.45 10.03 8.63 0.01 7.16
238017_at RDHE2 0.36 0.95 3.12 1.77 5.90 0.58
204364_s_at REEP1 1.66 1.64 2.21 1.24 0.07 2.12
205879_x_at RET 1.67 1.42 1.74 1.13 1.15 0.27
211421_s_at RET 4.27 3.21 4.79 1.84 1.59 0.07
204127_at RFC3 1.34 2.58 1.50 0.37 2.02 1.16
210258_at RGS13 1.03 0.18 1.63 1.45 1.89 1.23
202388_at RGS2 0.82 0.38 1.26 2.38 0.73 0.96
206290_s_at RGS7 2.60 2.29 2.97 1.02 0.14 0.38
237719_x_at RGS7BP 1.28 1.11 1.32 1.05 0.22 1.17
226028_at ROBO4 0.84 0.63 1.24 2.44 0.37 1.11
228806_at RORC 0.48 1.64 3.73 1.65 1.43 0.35
221614_s_at RPH3AL 1.08 1.74 3.43 2.45 0.39 0.21
212912_at RPS6KA2 0.48 0.61 1.99 2.31 1.66 0.66
228186_s_at RSPO3 0.65 0.23 1.66 1.68 3.35 1.59
203724_s_at RUFY3 2.01 1.82 2.00 1.23 0.07 1.74
238909_at S100A10 0.40 1.20 1.12 2.38 3.42 1.51
204268_at S100A2 4.42 24.11 0.05 12.41 92.11 20.21
206893_at SALL1 1.59 1.53 1.43 0.34 1.60 1.46
229273_at SALL1 2.06 1.84 1.59 0.25 1.92 1.82
1569433_at SAMD5 1.32 1.20 1.34 1.16 0.21 1.01
229839_at SCARA5 1.37 0.16 2.20 0.96 1.18 1.57
235849_at SCARA5 1.17 0.24 1.66 1.10 0.89 1.29
206884_s_at SCEL 0.34 3.46 1.98 1.74 13.50 0.68
232056_at SCEL 0.35 3.28 2.49 1.75 8.04 0.54
228782_at SCGB3A2 1.21 0.42 1.73 13.61 0.39 0.36
205697_at SCGN 12.12 10.96 15.36 2.47 0.01 3.51
1552365_at SCIN 1.19 3.65 1.52 2.04 0.96 0.24
1552367_a_at SCIN 1.17 2.45 1.85 1.77 0.67 0.26
210432_s_at SCN3A 11.55 5.62 11.57 1.19 0.19 0.12
204723_at SCN3B 1.91 1.28 1.94 0.68 0.14 1.48
241436_at SCNN1G 2.05 2.02 2.51 1.48 1.35 0.15
205475_at SCRG1 1.29 0.42 1.21 1.13 1.15 1.11
222717_at SDPR 0.53 0.50 1.36 3.25 0.69 1.22
223299_at SEC11C 2.04 0.89 3.19 1.13 0.31 0.27
201582_at SEC23B 1.14 1.22 1.61 0.98 0.96 0.45
204563_at SELL 0.73 0.22 1.20 2.14 6.80 1.58
57703_at SENP5 1.89 2.34 0.44 0.90 2.17 2.25
212268_at SERPINB1 0.48 0.91 1.14 2.28 4.39 0.97
211361_s_at SERPINB13 12.66 10.08 0.04 11.84 11.21 11.34
217272_s_at SERPINB13 31.01 26.53 0.02 27.38 26.34 24.18
209719_x_at SERPINB3 4.01 3.45 0.09 6.23 28.87 5.75
209720_s_at SERPINB3 3.72 3.76 0.09 6.83 22.64 5.38
204855_at SERPINB5 2.26 11.40 0.14 3.22 48.77 7.58
202283_at SERPINF1 1.23 0.39 0.88 2.34 1.03 0.77
235683_at SESN3 3.90 6.18 0.14 3.11 4.74 3.21
207873_x_at SEZ6L 2.66 1.86 2.90 1.56 0.04 2.53
211894_x_at SEZ6L 2.79 2.36 3.39 1.82 0.03 2.61
213609_s_at SEZ6L 3.98 3.26 4.56 1.31 0.03 3.79
216047_x_at SEZ6L 2.35 1.91 2.59 1.61 0.05 2.15
231650_s_at SEZ6L 5.47 4.92 6.25 1.40 0.02 4.54
240709_at SEZ6L 2.22 1.92 2.38 1.58 0.05 2.04
209260_at SFN 1.44 3.41 0.23 3.03 21.84 3.34
223121_s_at SFRP2 1.44 0.26 0.50 6.01 5.66 1.70
223122_s_at SFRP2 1.19 0.32 0.55 5.70 4.60 1.34
218835_at SFTPA2 /// SF 0.49 0.48 1.60 8.13 2.24 0.60
213936_x_at SFTPB 0.52 0.56 2.17 7.36 2.70 0.39
214354_x_at SFTPB 0.58 0.38 2.49 7.98 2.57 0.36
214199_at SFTPD 0.46 0.34 1.42 7.87 2.89 1.21
244056_at SFTPG 0.31 0.85 5.66 3.12 7.69 0.41
227038_at SGMS2 0.42 0.78 2.66 2.70 0.46 0.74
230287_at SGSM1 2.05 1.76 2.82 1.62 0.06 1.33
243582_at SH3RF2 1.17 1.60 0.28 3.41 4.28 3.75
224391_s_at SIAE 0.61 0.66 1.57 3.87 0.45 0.69
228509_at SKIP 6.37 5.88 8.58 6.68 0.01 6.66
202855_s_at SLC16A3 0.46 1.62 1.77 0.77 4.10 1.35
213664_at SLC1A1 0.23 1.04 2.41 2.31 3.60 1.57
218675_at SLC22A17 1.85 1.15 1.61 1.40 0.09 1.66
201249_at SLC2A1 2.09 2.27 0.24 2.26 2.65 2.81
201250_s_at SLC2A1 1.54 2.89 0.28 1.47 8.77 4.69
240532_at SLC32A1 3.74 3.12 4.24 3.67 0.02 3.62
204124_at SLC34A2 0.29 0.42 2.98 4.09 2.63 0.90
231341_at SLC35D3 12.48 18.10 24.66 5.57 0.01 3.81
219215_s_at SLC39A4 0.88 2.06 1.23 1.29 2.42 0.46
219869_s_at SLC39A8 0.44 0.61 2.07 2.14 2.01 0.84
219795_at SLC6A14 0.31 1.80 3.77 1.81 10.87 0.50
202219_at SLC6A8 3.55 4.46 0.16 2.03 9.18 4.29
210854_x_at SLC6A8 4.30 6.35 0.14 2.03 9.88 5.09
213843_x_at SLC6A8 4.49 7.07 0.13 2.08 9.62 5.37
225516_at SLC7A2 1.60 2.22 5.56 3.17 2.61 0.08
205374_at SLN 1.20 0.27 1.36 1.22 1.57 1.29
205309_at SMPDL3B 0.34 1.41 3.52 1.45 2.06 0.59
218788_s_at SMYD3 1.59 1.27 1.37 1.44 1.11 0.30
213139_at SNAI2 2.66 1.03 0.26 3.52 0.71 2.67
202507_s_at SNAP25 3.01 2.14 3.07 1.61 0.05 1.07
204953_at SNAP91 3.22 2.77 3.17 2.07 0.03 2.58
232355_at SNORD114-3 2.83 1.95 2.76 0.66 0.07 2.59
216850_at SNRPN 1.26 1.53 1.47 1.10 0.16 1.24
225728_at SORBS2 1.76 1.18 4.69 1.01 0.17 0.37
206122_at SOX15 2.05 1.92 0.30 1.69 1.97 2.25
219993_at SOX17 1.07 0.31 1.48 1.41 0.50 1.20
230943_at SOX17 1.15 0.33 1.46 1.29 0.60 1.19
228038_at SOX2 15.49 24.69 0.24 0.83 3.00 2.61
228698_at SOX7 1.10 0.28 0.85 3.97 0.51 1.94
212667_at SPARC 1.09 0.54 0.77 2.65 0.72 1.02
200795_at SPARCL1 0.93 0.38 1.01 3.91 0.52 1.14
206239_s_at SPINK1 1.15 0.83 40.51 7.19 45.17 0.07
207214_at SPINK4 1.83 2.18 3.29 2.73 2.50 0.10
209436_at SPON1 1.03 0.30 0.99 4.02 0.57 1.06
213993_at SPON1 1.06 0.29 0.84 4.88 0.80 1.24
213994_s_at SPON1 0.96 0.34 0.81 5.11 0.78 1.17
213796_at SPRR1A 18.08 49.76 0.01 39.33 60.02 67.70
214549_x_at SPRR1A 2.74 5.57 0.10 7.04 7.81 8.16
205064_at SPRR1A /// S 1.42 32.69 0.11 28.14 47.64 49.03
208539_x_at SPRR2B 5.62 9.36 0.07 6.15 9.38 10.06
218990_s_at SPRR3 8.16 40.62 0.02 34.01 50.97 56.52
232082_x_at SPRR3 9.74 28.05 0.03 12.94 26.03 31.98
209794_at SRGAP3 1.34 1.18 0.97 1.21 0.26 1.30
204955_at SRPX 1.52 0.28 0.60 4.36 0.99 1.78
213921_at SST 27.97 20.51 29.66 4.82 4.98 0.01
203217_s_at ST3GAL5 0.39 1.05 2.32 1.86 1.81 0.69
204542_at ST6GALNAC2 2.02 3.19 0.20 2.81 4.25 3.85
208065_at ST8SIA3 2.42 1.99 2.67 0.79 0.09 1.01
204150_at STAB1 0.79 0.32 0.98 2.18 2.08 1.45
38487_at STAB1 0.86 0.32 0.97 1.87 2.06 1.48
223103_at STARD10 0.93 1.36 2.09 1.21 1.44 0.32
232322_x_at STARD10 1.02 1.26 1.72 1.34 1.73 0.32
204226_at STAU2 1.01 1.75 1.72 0.99 0.88 0.43
220187_at STEAP4 0.29 0.71 2.73 2.77 2.00 0.84
227461_at STON2 2.50 2.06 0.30 2.28 0.80 1.75
235852_at STON2 2.35 2.22 0.35 1.57 0.77 1.81
229818_at SVOP 1.76 1.48 1.76 1.44 0.08 1.59
209447_at SYNE1 0.65 0.48 1.46 2.36 0.57 0.97
228072_at SYT12 0.26 1.38 1.95 1.67 2.05 1.80
221859_at SYT13 1.82 1.65 2.15 1.61 0.07 1.50
226167_at SYT7 1.23 1.71 1.78 1.30 0.29 0.45
202289_s_at TACC2 0.92 2.07 2.13 3.25 0.31 0.30
211382_s_at TACC2 0.91 2.21 2.05 3.29 0.28 0.33
222634_s_at TBL1XR1 1.83 2.54 0.36 1.22 2.76 2.16
230438_at TBX15 0.74 1.63 1.90 1.76 2.02 0.32
39318_at TCL1A 0.96 0.21 1.18 1.68 1.48 1.63
206702_at TEK 0.80 0.36 1.36 2.94 0.52 1.00
218872_at TESC 0.67 0.83 7.38 1.41 2.19 0.23
233987_at TFAP2D 1.16 1.53 1.52 1.29 1.49 0.33
219735_s_at TFCP2L1 1.28 1.69 1.80 0.94 1.64 0.32
227642_at TFCP2L1 1.52 2.34 3.47 0.58 2.37 0.26
229341_at TFCP2L1 1.50 1.65 2.07 0.60 1.89 0.37
205009_at TFF1 0.70 0.13 17.69 5.04 15.53 0.32
204623_at TFF3 4.27 0.50 6.24 0.84 1.55 0.15
213258_at TFPI 0.58 0.49 1.48 2.44 1.38 0.81
209278_s_at TFPI2 0.13 11.56 3.55 6.08 5.15 1.06
201042_at TGM2 0.44 0.42 1.79 1.85 2.61 1.18
211003_x_at TGM2 0.40 0.94 1.96 1.13 1.22 1.40
211573_x_at TGM2 0.42 0.90 1.75 1.30 1.20 1.29
1552280_at TIMD4 1.47 0.15 1.52 1.46 1.38 1.30
209387_s_at TM4SF1 0.41 1.40 1.36 1.90 2.79 1.00
215033_at TM4SF1 0.42 1.93 1.44 1.72 2.40 0.82
215034_s_at TM4SF1 0.39 1.39 1.29 1.98 3.38 1.08
236430_at TMED6 3.85 4.23 6.75 5.13 0.13 0.13
223594_at TMEM117 2.44 1.64 0.32 1.45 2.24 1.95
227753_at TMEM139 0.27 1.34 2.76 1.20 2.01 1.72
224916_at TMEM173 0.62 0.66 1.07 2.23 1.54 0.98
224929_at TMEM173 0.72 0.59 1.01 2.07 1.81 0.91
222892_s_at TMEM40 1.30 2.25 0.32 2.34 2.29 2.39
230822_at TMEM61 0.94 1.54 2.21 0.89 1.05 0.39
207602_at TMPRSS11D 3.96 4.06 0.11 4.40 4.70 4.64
207382_at TP63 3.35 2.91 0.16 2.69 3.48 3.35
209863_s_at TP63 12.19 22.54 0.03 14.87 38.44 33.54
211194_s_at TP63 4.97 4.60 0.09 5.28 5.56 5.42
231978_at TPCN2 2.43 2.04 2.04 2.05 2.29 0.13
1553859_at TPH1 10.04 8.23 11.66 9.43 0.01 8.07
214601_at TPH1 9.03 8.95 11.94 1.66 0.01 4.06
210084_x_at TPSAB1 0.80 0.28 0.86 4.21 1.27 1.58
215382_x_at TPSAB1 0.82 0.28 0.87 3.63 1.14 1.70
205683_x_at TPSAB1 /// Tf 0.82 0.25 0.87 4.41 1.53 1.67
207134_x_at TPSAB1 /// Tf 0.85 0.24 0.87 4.20 1.63 1.60
207741_x_at TPSAB1 /// Tf 0.89 0.25 0.89 3.27 1.61 1.54
216474_x_at TPSAB1 /// Tf 0.80 0.24 0.89 4.44 1.65 1.60
202504_at TRIM29 2.59 7.41 0.10 6.28 20.01 6.97
211002_s_at TRIM29 1.63 2.55 0.26 2.48 3.50 2.66
242056_at TRIM45 1.48 2.47 1.37 0.36 1.07 1.55
223694_at TRIM7 1.93 1.94 0.28 2.16 1.28 2.75
1554803_s_at TRIM72 1.39 1.18 1.35 1.05 1.28 0.40
227610_at TSPAN11 1.87 1.14 1.27 1.28 1.53 0.30
230626_at TSPAN12 1.43 2.34 2.70 0.87 0.48 0.30
203824_at TSPAN8 0.59 0.89 5.86 1.51 6.60 0.25
218012_at TSPYL2 1.37 1.06 1.51 1.21 0.16 1.13
1554696_s_at TYMS 1.17 2.80 1.54 0.33 2.05 1.76
202589_at TYMS 1.31 2.51 1.26 0.35 2.75 1.90
206094_x_at UGT1A1 /// U 1.37 7.56 0.24 3.71 10.75 1.95
215125_s_at UGT1A1 /// U 1.42 22.87 0.21 4.73 32.65 2.17
207126_x_at UGT1A4 1.49 10.36 0.23 3.94 17.28 1.82
206505_at UGT2B4 2.28 2.06 2.65 2.25 2.29 0.11
235904_at UGT3A1 7.28 7.01 9.03 4.78 7.25 0.03
238542_at ULBP2 2.78 2.95 0.18 2.14 4.02 6.03
1553183_at UMODL1 2.38 2.52 3.17 1.84 2.70 0.10
202893_at UNC13B 0.88 1.35 2.26 1.39 0.40 0.43
231008_at UNC5CL 0.31 1.75 2.48 1.53 1.94 0.95
1569532_a_at UNQ2541 2.80 2.21 2.84 2.28 2.36 0.10
202412_s_at USP1 1.27 2.13 1.16 0.46 1.65 1.32
205506_at VIL1 1.78 3.69 5.11 0.85 0.22 0.27
228912_at VIL1 1.72 3.49 4.29 1.05 0.33 0.21
205922_at VNN2 0.79 0.23 1.28 1.81 2.82 1.41
228024_at VPS37A 1.14 1.41 1.68 1.56 0.58 0.35
243764_at VSIG1 0.11 2.12 6.89 6.49 6.96 2.01
203797_at VSNL1 5.72 5.75 0.19 1.13 3.79 6.89
203798_s_at VSNL1 6.34 6.23 0.21 0.97 3.76 7.44
228295_at WDR59 1.20 1.22 1.23 1.05 0.25 1.18
1555007_s_at WDR66 2.65 2.62 0.21 2.99 1.92 2.18
227174_at WDR72 2.14 9.59 0.40 0.98 23.49 1.20
236741_at WDR72 2.51 5.47 0.37 1.32 6.12 0.95
205990_s_at WNT5A 2.54 3.11 0.30 1.79 0.58 3.06
213425_at WNT5A 2.29 3.25 0.29 1.78 0.66 3.58
220057_at XAGE1 /// XA 0.32 7.11 4.05 1.05 8.71 0.57
206484_s_at XPNPEP2 1.19 0.30 1.14 1.18 1.22 1.16
222581_at XPR1 0.41 2.58 1.42 1.61 2.26 0.88
242020_s_at ZBP1 0.69 0.36 1.22 1.28 2.27 1.64
201531_at ZFP36 0.75 0.43 0.98 2.53 1.43 1.10
235366_at ZNF10 1.27 1.28 1.53 0.99 0.21 1.00
220055_at ZNF287 1.27 1.12 1.24 0.99 0.27 1.09
238454_at ZNF540 1.44 1.16 1.65 1.58 0.10 1.48
239700_at ZNF710 1.26 1.50 1.51 1.68 1.66 0.27
39891_at ZNF710 1.00 1.56 1.69 1.35 1.20 0.34
G1 - G6: Six novel groups of NSCLC
OG: Other five groups
Table 13. Full signature for prediction of Pemetrexed response (426 probe sets)
ProbeID Gene symbol NR:R Ratio NR Mean R Mean
223362_s_at SEPT1 2.03 -0.21 0.44
1557051_s_at ... 2.73 -0.36 0.41
1558397_at ... 0.49 0.42 -0.52
212444_at — 0.42 0.51 -0.78
219987_at — 4.78 -0.47 0.55
223366_at — 2.25 -0.18 0.49
227061_at — 0.48 0.53 -0.57
227545_at ... 2.49 -0.52 0.57
227921_at ... 2.77 -0.38 0.51
228273_at — 2.33 -0.68 0.75
228528_at — 0.46 0.28 -0.36
228642_at — 2.08 -0.31 0.33
229128_s_at — 2.57 -0.70 0.75
229490_s_at — 2.08 -0.52 0.58
229554_at — 0.49 0.40 -0.48
235079_at — 2.46 -0.24 0.58
235363_at — 2.23 -0.34 0.50
235609_at — 2.25 -0.48 0.60
235666_at — 0.43 0.54 -0.62
235919_at — 2.25 -0.37 0.56
242881_x_at — 2.56 -0.54 0.77
242890_at — 2.11 -0.37 0.49
243929_at — 0.46 0.44 -0.52
244387_at — 0.48 0.29 -0.43
244548_at — 0.50 0.32 -0.49
223395_at ABI3BP 0.48 0.53 -0.57
218868_at ACTR3B 2.58 -0.25 0.39
1555487_a_at ACTR3B /// LOC10C 2.63 -0.29 0.47
226950_at ACVRL1 0.50 0.41 -0.43
235649_at ADAMTS8 0.43 0.29 -0.36
209612_s_at ADH1B 0.39 0.80 -1.00
209613_s_at ADH1B 0.37 0.75 -0.91
228969_at AGR2 0.35 0.35 -0.72
228241_at AGR3 0.34 0.67 -1.12
204446_s_at ALOX5 0.46 0.50 -0.72
204174_at ALOX5AP 0.50 0.35 -0.56
222108_at AMIGO2 0.45 0.53 -0.71
1552619_a_at ANLN 2.28 -0.58 0.53
222608_s_at ANLN 2.07 -0.62 0.60
208103_s_at ANP32E 2.15 -0.52 0.69
204894_s_at AOC3 0.43 0.52 -0.68
207542_s_at AQP1 0.39 0.41 -0.60
209047_at AQP1 0.38 0.66 -0.88
39248_at AQP3 0.40 0.78 -1.34
226228_at AQP4 0.46 0.66 -0.91
226576_at ARHGAP26 0.47 0.29 -0.35
229215_at ASCL2 2.81 -0.37 0.49
218115_at ASF1B 2.18 -0.50 0.64
219918_s_at ASPM 2.67 -0.75 0.93
219087_at ASPN 0.42 0.49 -0.71
218782_s_at ATAD2 2.90 -0.65 0.75
222740_at ATAD2 2.65 -0.62 0.70
228401_at ATAD2 2.04 -0.38 0.52
235266_at ATAD2 2.03 -0.45 0.62
204092_s_at AURKA 2.00 -0.55 0.67
208079_s_at AURKA 2.09 -0.60 0.71
209464_at AURKB 2.02 -0.44 0.54
205345_at BARD1 2.01 -0.43 0.45
219498_s_at BCL11A 2.11 -0.30 0.52
202095_s_at BIRC5 2.06 -0.64 0.73
209642_at BUB1 2.80 -0.63 0.73
203755_at BUB1B 2.50 -0.66 0.79
209183_s_at C10orf10 0.48 0.46 -0.61
203571_s_at C10orf116 0.46 0.41 -0.76
204521_at C12orf24 2.35 -0.28 0.52
218723_s_at C13orf15 0.38 0.58 -0.69
229177_at C16orf89 0.49 0.49 -0.54
214696_at C17orf91 0.45 0.49 -0.75
223631_s_at C19orf33 0.34 0.64 -1.10
235568_at C19orf59 0.27 0.51 -0.67
219476_at C1orf116 0.41 0.55 -0.85
205654_at C4BPA 0.36 0.79 -1.01
202992_at C7 0.45 0.61 -0.75
204811_s_at CACNA2D2 0.41 0.51 -0.49
220414_at CALML5 3.90 -0.34 0.42
218309_at CAMK2N1 0.45 0.44 -0.57
229030_at CAPN8 0.44 0.31 -0.49
212097_at CAV1 0.49 0.52 -0.63
203323_at CAV2 0.45 0.50 -0.74
226473_at CBX2 2.15 -0.47 0.70
226287_at CCDC34 2.21 -0.44 0.66
205392_s_at CCL14 /// CCL15 0.46 0.43 -0.53
203418_at CCNA2 2.20 -0.49 0.58
213226_at CCNA2 2.06 -0.51 0.65
214710_s_at CCNB1 2.01 -0.59 0.71
202705_at CCNB2 2.31 -0.71 0.80
213523_at CCNE1 2.17 -0.41 0.58
205034_at CCNE2 2.43 -0.57 0.62
203213_at CDC2 2.32 -0.67 0.78
203214_x_at CDC2 2.14 -0.62 0.68
210559_s_at CDC2 2.19 -0.66 0.72
202870_s_at CDC20 2.26 -0.67 0.74
227850_x_at CDC42EP5 0.39 0.49 -0.56
204510_at CDC7 2.64 -0.57 0.77
221436_s_at CDCA3 2.17 -0.45 0.60
223307_at CDCA3 2.40 -0.60 0.75
224428_s_at CDCA7 2.09 -0.64 0.80
204677_at CDH5 0.48 0.48 -0.58
214877_at CDKAL1 2.41 -0.30 0.51
207039_at CDKN2A 2.61 -0.41 0.62
209644_x_at CDKN2A 2.27 -0.44 0.66
204159_at CDKN2C 2.13 -0.28 0.42
1555758_a_at CDKN3 2.41 -0.63 0.69
209714_s_at CDKN3 2.24 -0.59 0.63
204962_s_at CENPA 2.54 -0.68 0.88
207828_s_at CENPF 2.79 -0.80 0.97
209172_s_at CENPF 2.19 -0.33 0.54
219555_s_at CENPN 2.13 -0.49 0.58
226610_at CENPV 2.87 -0.48 0.97
226611_s_at CENPV 3.03 -0.52 1.07
239413_at CEP152 2.33 -0.45 0.53
205382_s_at CFD 0.49 0.46 -0.54
205394_at CHEK1 2.11 -0.51 0.51
209763_at CHRDL1 0.47 0.44 -0.47
229610_at CKAP2L 2.52 -0.49 0.57
214135_at CLDN18 0.37 0.62 -0.70
205200_at CLEC3B 0.33 0.65 -0.74
213317_at CLIC5 0.31 0.39 -0.56
219866_at CLIC5 0.41 0.35 -0.53
222043_at CLU 0.48 0.40 -0.56
205518_s_at CMAH 0.47 0.34 -0.61
215145_s_at CNTNAP2 2.68 -0.27 0.59
205229_s_at COCH 2.38 -0.45 0.62
212865_s_at COL14A1 0.47 0.41 -0.51
213492_at COL2A1 2.47 -0.21 0.36
222073_at COL4A3 0.45 0.46 -0.67
204570_at COX7A1 0.48 0.45 -0.56
205624_at CPA3 0.35 0.75 -1.18
205350_at CRABP1 2.23 -0.18 0.36
221204_s_at CRTAC1 0.49 0.38 -0.47
206595_at CST6 0.39 0.31 -0.52
210546_x_at CTAG1A /// CTAG1 3.78 -0.27 0.44
211674_x_at CTAG1A /// CTAG1 4.93 -0.32 0.51
205927_s_at CTSE 0.45 0.65 -0.93
205898_at CX3CR1 0.41 0.28 -0.41
209774_x_at CXCL2 0.42 0.52 -0.59
214974_x_at CXCL5 0.37 0.46 -0.62
1555497_a_at CYP4B1 0.41 0.39 -0.54
210096_at CYP4B1 0.31 0.66 -0.82
208335_s_at DARC 0.47 0.45 -0.61
209335_at DCN 0.46 0.52 -0.62
208151_x_at DDX17 2.08 -0.20 0.39
208719_s_at DDX17 2.37 -0.25 0.44
222958_s_at DEPDC1 2.73 -0.67 0.80
232278_s_at DEPDC1 2.15 -0.38 0.49
235545_at DEPDC1 2.22 -0.38 0.62
203764_at DLGAP5 2.44 -0.67 0.81
242138_at DLX1 3.01 -0.19 0.38
208250_s_at DMBT1 0.40 0.55 -0.95
220668_s_at DNMT3B 2.12 -0.37 0.51
203716_s_at DPP4 0.49 0.33 -0.59
203717_at DPP4 0.44 0.45 -0.74
219000_s_at DSCC1 2.40 -0.56 0.67
218585_s_at DTL 2.36 -0.63 0.82
222680_s_at DTL 2.48 -0.67 0.83
219990_at E2F8 2.24 -0.40 0.58
209365_s_at ECM1 0.38 0.39 -0.57
227780_s_at ECSCR 0.50 0.41 -0.53
219787_s_at ECT2 2.17 -0.63 0.70
204273_at EDNRB 0.46 0.33 -0.38
209343_at EFHD1 5.89 -0.30 0.55
228260_at ELAVL2 3.40 -0.28 0.46
226982_at ELL2 0.45 0.53 -0.79
222885_at EMCN 0.45 0.47 -0.56
227874_at EMCN 0.45 0.32 -0.38
204603_at EXO1 2.34 -0.54 0.66
203358_s_at EZH2 2.65 -0.70 0.87
204363_at F3 0.44 0.51 -0.87
203980_at FABP4 0.20 0.69 -0.92
209074_s_at FAM107A 0.30 0.58 -0.65
225834_at FAM72A /// FAM7; 2.64 -0.65 0.84
225687_at FAM83D 2.47 -0.64 0.84
213007_at FANCI 2.23 -0.58 0.75
234863_x_at FBXO5 2.19 -0.51 0.69
211734_s_at FCER1A 0.41 0.24 -0.49
205866_at FCN3 0.24 0.47 -0.64
204768_s_at FEN1 2.20 -0.50 0.66
204988_at FGB 0.05 0.59 -0.88
219612_s_at FGG 0.24 0.94 -1.32
226769_at FIBIN 0.41 0.41 -0.46
242546_at FLJ39632 2.01 -0.35 0.55
231882_at FLJ39632 /// LOC1C 2.07 -0.38 0.59
222853_at FLRT3 0.48 0.55 -0.63
211726_s_at FMO2 0.34 0.72 -0.83
228268_at FMO2 0.26 0.66 -0.74
202768_at FOSB 0.30 0.45 -0.57
206307_s_at FOXD1 2.11 -0.30 0.49
206018_at FOXG1 6.77 -0.47 0.64
202580_x_at FOXM1 2.25 -0.58 0.69
214088_s_at FUT3 0.45 0.28 -0.54
207086_x_at GAGE1 /// GAGE12 4.89 -0.43 0.94
207739_s_at GAGE1 /// GAGE12 6.20 -0.56 1.20
208155_x_at GAGE1 /// GAGE12 5.14 -0.44 0.95
206640_x_at GAGE12C /// GAGE 6.18 -0.50 1.08
208235_x_at GAGE12F /// GAGE 5.80 -0.51 1.12
207663_x_at GAGE3 4.60 -0.39 0.86
203560_at GGH 2.02 -0.44 0.48
206102_at GINS1 2.13 -0.58 0.78
226701_at GJA5 0.43 0.48 -0.61
238222_at GKN2 0.32 0.29 -0.41
227070_at GLT8D2 0.50 0.39 -0.43
238062_at GPIHBP1 0.47 0.25 -0.31
212950_at GPR116 0.47 0.46 -0.75
212951_at GPR116 0.49 0.48 -0.71
203108_at GPRC5A 0.45 0.56 -0.88
203632_s_at GPRC5B 2.27 -0.30 0.46
201348_at GPX3 0.42 0.53 -0.68
214091_s_at GPX3 0.44 0.46 -0.59
204318_s_at GTSE1 2.20 -0.44 0.67
215942_s_at GTSE1 2.01 -0.33 0.52
222685_at HAUS6 2.33 -0.42 0.60
220085_at HELLS 2.24 -0.45 0.56
227350_at HELLS 2.22 -0.44 0.56
203903_s_at HEPH 0.40 0.46 -0.58
226446_at HES6 4.61 -0.50 0.69
208808_s_at HMGB2 2.00 -0.43 0.58
211597_s_at HOPX 0.49 0.77 -1.04
214651_s_at HOXA9 2.74 -0.24 0.49
219697_at HS3ST2 0.40 0.47 -0.56
221667_s_at HSPB8 0.48 0.42 -0.58
201185_at HTRA1 0.48 0.40 -0.53
210619_s_at HYAL1 0.32 0.48 -0.57
227760_at IGFBPL1 2.92 -0.38 0.57
227997_at IL17RD 3.02 -0.38 0.54
205403_at IL1R2 0.48 0.40 -0.55
220054_at IL23A 2.02 -0.26 0.31
224061_at INMT 0.35 0.59 -0.67
226535_at ITGB6 0.49 0.59 -0.94
229139_at JPH1 2.17 -0.39 0.45
202503_s_at KIAA0101 2.01 -0.64 0.66
227230_s_at KIAA1211 2.02 -0.22 0.36
204444_at KIF11 2.35 -0.57 0.69
219306_at KIF15 2.53 -0.55 0.73
218755_at KIF20A 2.70 -0.68 0.79
204709_s_at KIF23 2.41 -0.58 0.70
209408_at KIF2C 2.43 -0.66 0.78
211519_s_at KIF2C 2.39 -0.51 0.64
218355_at KIF4A 2.41 -0.61 0.77
219371_s_at KLF2 0.47 0.42 -0.45
241682_at KLHL23 2.21 -0.41 0.60
209792_s_at KLK10 0.22 0.43 -0.61
205470_s_at KLK11 0.40 0.38 -0.69
228557_at L3MBTL4 2.18 -0.26 0.38
203726_s_at LAMA3 0.48 0.44 -0.64
234985_at LDLRAD3 2.33 -0.31 0.49
212327_at LIMCH1 0.47 0.46 -0.50
203766_s_at LMOD1 0.50 0.39 -0.46
228245_s_at LOC728715 /// 0V< 4.28 -0.66 0.95
202779_s_at LOC731049 /// UBI 2.24 -0.44 0.68
242931_at LONRF3 0.48 0.38 -0.52
203548_s_at LPL 0.28 0.60 -0.81
203549_s_at LPL 0.34 0.63 -0.78
203835_at LRRC32 0.48 0.40 -0.46
229584_at LRRK2 0.47 0.58 -0.83
235278_at MACROD2 0.49 0.50 -0.54
1554768_a_at MAD2L1 2.62 -0.56 0.72
203362_s_at MAD2L1 2.57 -0.63 0.79
1553830_s_at MAGEA2 /// MAGE 2.07 -0.29 0.38
228885_at MAMDC2 0.40 0.31 -0.39
205819_at MARCO 0.37 0.50 -0.67
220651_s_at MCM10 2.00 -0.38 0.47
202107_s_at MCM2 2.52 -0.65 0.90
222036_s_at MCM4 2.06 -0.56 0.64
201930_at MCM6 2.34 -0.54 0.68
210983_s_at MCM7 2.40 -0.55 0.71
224320_s_at MCM8 2.05 -0.48 0.72
209035_at MDK 2.25 -0.47 0.77
204825_at MELK 2.22 -0.66 0.76
226346_at MEX3A 2.69 -0.44 0.65
227512_at MEX3A 2.51 -0.41 0.62
218883_s_at MLF1IP 2.26 -0.59 0.65
218211_s_at MLPH 0.47 0.48 -0.70
204575_s_at MMP19 0.50 0.32 -0.43
204438_at MRC1 /// MRC1L1 0.46 0.50 -0.66
209421_at MSH2 2.15 -0.52 0.69
211450_s_at MSH6 2.13 -0.48 0.69
204798_at MYB 6.22 -0.57 0.72
201710_at MYBL2 2.13 -0.45 0.59
223806_s_at NAPSA 0.37 0.83 -1.22
228055_at NAPSB 0.49 0.47 -0.65
201621_at NBL1 0.49 0.45 -0.58
37005_at NBL1 0.46 0.49 -0.64
201774_s_at NCAPD2 2.50 -0.51 0.67
218662_s_at NCAPG 2.07 -0.54 0.63
218663_at NCAPG 2.32 -0.53 0.68
219588_s_at NCAPG2 2.19 -0.55 0.62
204162_at NDC80 2.84 -0.73 0.87
227249_at NDE1 2.10 -0.48 0.58
204641_at NEK2 2.34 -0.61 0.77
206023_at NMU 2.32 -0.65 0.75
202238_s_at NNMT 0.48 0.57 -0.72
226992_at NOSTRIN 0.38 0.42 -0.55
218086_at NPDC1 0.50 0.36 -0.36
213479_at NPTX2 4.65 -0.42 0.76
216248_s_at NR4A2 0.49 0.29 -0.55
209959_at NR4A3 0.43 0.39 -0.63
204081_at NRGN 0.49 0.33 -0.55
221606_s_at NSBP1 2.08 -0.20 0.39
223381_at NUF2 3.04 -0.74 0.93
218039_at NUSAP1 2.39 -0.68 0.79
219978_s_at NUSAP1 2.20 -0.57 0.75
213599_at OIP5 2.11 -0.46 0.57
201013_s_at PAICS 2.09 -0.41 0.56
201014_s_at PAICS 2.47 -0.51 0.68
219148_at PBK 3.06 -0.68 0.78
219543_at PBLD 0.47 0.30 -0.39
1553589_a_at PDZK1IP1 0.49 0.41 -0.62
219630_at PDZK1IP1 0.47 0.41 -0.62
227848_at PEBP4 0.47 0.42 -0.54
219093_at PID1 0.38 0.46 -0.63
206311_s_at PLA2G1B 0.49 0.24 -0.26
202240_at PLK1 2.07 -0.41 0.56
204939_s_at PLN 0.46 0.44 -0.65
225421_at PM20D2 2.06 -0.30 0.53
209433_s_at PPAT 2.65 -0.53 0.67
209434_s_at PPAT 2.30 -0.38 0.59
236302_at PPM1E 4.00 -0.35 0.58
204086_at PRAME 2.99 -0.45 0.83
218009_s_at PRC1 2.36 -0.67 0.83
205053_at PRIM1 2.50 -0.55 0.74
223062_s_at PSAT1 2.23 -0.58 0.85
201896_s_at PSRC1 2.46 -0.45 0.69
230250_at PTPRB 0.37 0.63 -0.78
228708_at RAB27B 0.48 0.38 -0.79
204146_at RAD51AP1 2.74 -0.64 0.86
219494_at RAD54B 2.16 -0.43 0.51
219440_at RAI2 0.44 0.44 -0.50
219142_at RASL11B 2.44 -0.22 0.46
228455_at RBM15 2.10 -0.34 0.52
203498_at RCAN2 0.39 0.51 -0.62
204127_at RFC3 2.42 -0.47 0.63
204023_at RFC4 2.27 -0.65 0.85
218353_at RGS5 0.43 0.45 -0.56
213397_x_at RNASE4 0.50 0.50 -0.63
203022_at RNASEH2A 2.40 -0.54 0.70
226028_at ROBO4 0.49 0.47 -0.57
214097_at RPS21 2.16 -0.36 0.60
204802_at RRAD 0.39 0.33 -0.53
204803_s_at RRAD 0.35 0.46 -0.71
201890_at RRM2 2.03 -0.65 0.73
209773_s_at RRM2 2.02 -0.64 0.71
204351_at S100P 0.44 0.71 -1.09
213262_at SACS 2.07 -0.22 0.41
226548_at SBK1 2.51 -0.37 0.54
226549_at SBK1 2.21 -0.33 0.44
230378_at SCGB3A1 0.33 0.74 -1.06
228782_at SCGB3A2 0.44 0.86 -1.20
228504_at SCN7A 0.49 0.39 -0.48
222717_at SDPR 0.44 0.33 -0.59
202833_s_at SERPINA1 0.46 0.81 -1.08
211429_s_at SERPINA1 0.45 0.79 -1.06
223121_s_at SFRP2 0.42 0.43 -0.68
223122_s_at SFRP2 0.45 0.40 -0.69
243818_at SFTA1P 0.42 0.34 -0.49
244056_at SFTA2 0.39 0.86 -1.13
223678_s_at SFTPA1 /// SFTPA1 0.38 0.70 -1.01
218835_at SFTPA2 /// SFTPA2 0.41 1.08 -1.57
209810_at SFTPB 0.48 0.81 -1.14
213936_x_at SFTPB 0.43 0.69 -0.89
214354_x_at SFTPB 0.40 0.68 -0.96
37004_at SFTPB 0.50 0.81 -1.13
205982_x_at SFTPC 0.36 0.98 -1.31
211735_x_at SFTPC 0.34 1.00 -1.33
214387_x_at SFTPC 0.34 1.00 -1.36
215454_x_at SFTPC 0.41 0.47 -0.70
38691_s_at SFTPC 0.41 0.97 -1.22
214199_at SFTPD 0.46 0.84 -0.99
227038_at SGMS2 0.43 0.48 -0.68
224391_s_at SIAE 0.44 0.44 -0.84
203625_x_at SKP2 2.25 -0.47 0.71
205234_at SLC16A4 0.37 0.37 -0.50
209267_s_at SLC39A8 0.47 0.43 -0.58
214719_at SLC46A3 0.46 0.45 -0.62
1558217_at SLFN13 2.22 -0.23 0.44
204240_s_at SMC2 2.17 -0.48 0.64
205236_x_at SOD3 0.23 0.35 -0.42
228038_at SOX2 2.02 -0.60 0.87
213668_s_at SOX4 2.13 -0.36 0.61
228698_at SOX7 0.49 0.35 -0.51
228654_at SPIN4 2.64 -0.29 0.58
206239_s_at SPINK1 0.24 0.82 -1.13
209436_at SPON1 0.49 0.48 -0.56
213993_at SPON1 0.47 0.37 -0.51
213994_s_at SPON1 0.42 0.53 -0.69
205499_at SRPX2 0.46 0.30 -0.48
205339_at STIL 2.07 -0.50 0.57
200783_s_at STMN1 2.28 -0.46 0.69
227480_at SUSD2 0.29 0.48 -0.60
202289_s_at TACC2 0.50 0.44 -0.64
228906_at TET1 2.10 -0.24 0.49
203887_s_at THBD 0.46 0.44 -0.67
203046_s_at TIMELESS 2.05 -0.46 0.62
205122_at TMEFF1 2.96 -0.37 0.60
1552626_a_at TMEM163 0.43 0.37 -0.49
223503_at TMEM163 0.36 0.48 -0.67
212621_at TMEM194A 2.00 -0.45 0.57
203432_at TMPO 2.12 -0.49 0.59
205347_s_at TMSB15A 10.00 -0.65 1.11
214051_at TMSB15B 2.27 -0.34 0.51
201291_s_at TOP2A 2.63 -0.85 1.05
201292_at TOP2A 3.05 -0.88 1.04
205683_x_at TPSAB1 0.46 0.60 -0.92
216474_x_at TPSAB1 /// TPSB2 0.46 0.59 -0.93
207134_x_at TPSB2 0.46 0.60 -0.93
210052_s_at TPX2 2.77 -0.79 1.04
242056_at TRIM45 2.07 -0.29 0.46
209114_at TSPAN1 0.41 0.37 -0.71
204822_at TTK 2.55 -0.72 0.88
203702_s_at TTLL4 2.09 -0.30 0.33
228882_at TUB 2.10 -0.23 0.38
214023_x_at TUBB2B 3.36 -0.56 0.70
1554696_s_at TYMS 3.79 -0.77 1.11
202589_at TYMS 3.70 -0.83 1.15
202954_at UBE2C 3.19 -0.79 1.04
223229_at UBE2T 2.24 -0.62 0.71
202412_s_at USP1 2.16 -0.42 0.61
205019_s_at VIPR1 0.38 0.39 -0.40
202112_at VWF 0.44 0.51 -0.61
219478_at WFDC1 0.45 0.43 -0.50
213996_at YPEL1 2.15 -0.26 0.47
206373_at ZIC1 4.22 -0.40 0.66
223642_at ZIC2 2.56 -0.27 0.55
228144_at ZNF300 2.11 -0.22 0.39
229551_x_at ZNF367 2.42 -0.50 0.68
228988_at ZNF711 3.74 -0.51 0.94
229700_at ZNF738 2.13 -0.34 0.49
NR: predicted non-responder to Pemetrexed
R: predicted responder to Pemetrexed

Claims

1. A method for preparing an optimised gene signature for assigning a NSCLC sample to one or more NSCLC classes, comprising subjecting a gene signature set forth in Table 12 to nearest shrunken centroid analysis to identify one or more subgroups of gene classifiers corresponding to one or more of classes 1 to 6 identified in Table 12, and validating the performance of the selected classifiers by K-fold or leave-one-out cross-validation.
2. A gene signature for assigning a NSCLC sample to one or more NSCLC classes as identified in Table 12, wherein at least 80% of the genes comprised in said gene signature are set forth in Table 12.
3. A method for classifying NSCLC comprising the steps of:
(i) analyzing a gene expression profile of one or more NSCLC, said gene expression profile comprising the expression levels of at least 80% of the genes set forth in a gene signature comprising NSCLC classes 1-6 as identified in Table 12, said gene signature being:
(a) the gene signature set forth in Table 12;
(b) a gene signature according to claim 2;
(c) a gene signature wherein the genes are identified as classifiers by the method of claim 1; or
(d) the gene signature set forth in Table 4;
(ii) comparing the gene expression levels in the NSCLC, and detecting differences in gene expression which are characteristic of any one or more of Groups 1-6 as set forth in the gene signature; and
(iii) assigning the NSCLC into any one of Groups 1-6 as set forth in the gene signature.
4. A method according to claim 3, wherein the gene expression profile includes at least 85% of the genes identified in Table 4.
5. A method according to claim 4, wherein the gene expression profile includes at least 90% of the genes identified in Table 4.
6. A method according to any one of claims 3 to 5, wherein the levels of gene expression are compared to
(a) an external standard of NSCLC gene expression levels; or
(b) the expression levels of at least 10 of the internal reference genes identified in Table
7. A method according to claim 6, wherein the internal reference genes are the genes set forth in Table 11.
8. A method according to any one of claims 3 to 7, wherein the gene expression profiles of a plurality of NSCLC are analysed; unsupervised hierarchical clustering of the gene expression data is performed to identify clusters as defined by over- or under-expression of genes; and the NSCLC samples are assigned to one of the groups identified in Table 4.
9. A method according to any one of claims 3 to 8, wherein NSCLC categorized in Group 4 show one or more of (a) reduced expression of FLOR1, ASCL1, DDC or MAST4 compared with other neuroendocrine NSCLC; or (b) increased expression of ABCC1, MCM6 and CDCA7 compared with other neuroendocrine NSCLC.
10. A method according to any one of claims 3 to 9, further comprising assigning a likelihood of drug resistance to the NSCLC according to its classification.
11. A method according to claim 10, wherein NSCLC in Group 4 are predicted to be resistant to Pemetrexed treatment.
12. A method according to claim 10, wherein NSCLC in Group 3 with low expression of the marker TP53 are predicted to be responsive to Pemetrexed therapy.
13. A method according to claim 10, wherein NSCLC in Group 3 which show high expression of the marker TP53 and, optionally, high expression of the marker EGFR, are predicted to be non-responsive to Pemetrexed therapy.
14. A method according to claim 10, wherein NSCLC in Group 1 or Group 6 which show high expression of the marker TTF1 are predicted to be responsive to Pemetrexed therapy.
15. A method according to claim 10, wherein NSCLC in any one of Groups 1 to 4, 5 and 6 which show high expression of neuroendocrine markers SYP, NCAM1 and CHGA are predicted to be responsive to Pemetrexed therapy.
16. A method according to any one of claims 11 to 15, wherein the expression levels of markers are assessed by immunohistochemical staining.
17. A method for preparing an optimised gene signature for predicting resistance to Pemetrexed in a NSCLC, comprising subjecting a gene signature set forth in Table 13 to nearest shrunken centroid analysis to identify subgroups of gene classifiers corresponding to responders and non-responders to Pemetrexed therapy, and validating the performance of the selected classifiers by K-fold or leave-one-out cross-validation.
18. A gene signature for predicting resistance to Pemetrexed in a NSCLC, wherein at least 80% of the genes comprised in said gene signature are set forth in Table 13.
19. A method for predicting resistance to Pemetrexed in a NSCLC comprising the steps of:
(i) analyzing a gene expression profile of one or more NSCLC, said gene expression profile comprising the expression levels of at least 80% of the genes set forth in a gene signature, said gene signature being:
(a) the gene signature set forth in Table 13;
(b) a gene signature according to claim 18;
(c) a gene signature wherein the genes are identified as classifiers by the method of claim 17; or
(d) the gene signature set forth in Table 6;
(ii) comparing the gene expression levels in the NSCLC, and detecting differences in gene expression which are characteristic of response or non-response to Pemetrexed; and
(iii) assigning the NSCLC to a responder or non-responder group.
20. A method for predicting resistance to Pemetrexed in a NSCLC, comprising the steps of: (i) profiling the expression of at least 80% of the genes set forth in Table 6; (ii) comparing the expression of the genes profiled in (i) with the signature set forth in Table 6; and predicting the NSCLC to be responsive or non-responsive to Pemetrexed according to the Table 6 signature.
21. A method according to claim 20, comprising profiling the expression of at least 90% of the genes set forth in Table 6.
22. A method according to claim 21, comprising profiling the expression of the genes set forth in Table 6.
23. A method according to any one of claims 18 to 22, wherein the levels of gene expression are compared to
(a) an external standard of NSCLC gene expression levels; or
(b) the expression levels of at least 10 of the internal reference genes identified in Table 10.
24. A method according to claim 23, wherein the internal reference genes are the genes set forth in Table 11.
25. A method according to any one of claims 20 to 24, wherein the gene expression profiles of a plurality of NSCLC are analysed, and compared internally using hierarchical clustering analysis.
26. A kit for analysis of NSCLC, comprising nucleic acid probes for the expression profiling of at least 80% of the genes set forth in Table 4, and reagents for detection of one or more of TP53, TTF1, SYP, NCAM1 and CHGA by immunohistochemical staining.
27. A kit for analysis of NSCLC, consisting substantially of nucleic acid probes and reagents for the expression profiling of the 25 genes set forth in Table 6.
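
Claims 1 and 17 recite nearest shrunken centroid analysis followed by K-fold or leave-one-out cross-validation. The following is a minimal sketch of that general technique on toy data, not the applicants' implementation. It soft-thresholds the standardized class-centroid differences, uses the scaling factor sqrt(1/n_k - 1/n) found in common implementations, and estimates accuracy by leave-one-out cross-validation. All function names, the threshold value and the expression values are hypothetical.

```python
import math
from statistics import median


def fit_shrunken_centroids(X, y, delta):
    """Fit class centroids shrunken toward the overall centroid by threshold delta."""
    n, p = len(X), len(X[0])
    classes = sorted(set(y))
    members = {k: [row for row, lab in zip(X, y) if lab == k] for k in classes}
    overall = [sum(row[i] for row in X) / n for i in range(p)]
    centroids = {k: [sum(r[i] for r in rows) / len(rows) for i in range(p)]
                 for k, rows in members.items()}
    # Pooled within-class standard deviation of each gene.
    s = [math.sqrt(sum((row[i] - centroids[lab][i]) ** 2 for row, lab in zip(X, y))
                   / (n - len(classes))) for i in range(p)]
    s0 = median(s)  # offset that guards against genes with very small variance
    shrunken = {}
    for k in classes:
        m_k = math.sqrt(1.0 / len(members[k]) - 1.0 / n)
        shrunken[k] = []
        for i in range(p):
            scale = m_k * (s[i] + s0)
            d = (centroids[k][i] - overall[i]) / scale
            d_shrunk = math.copysign(max(abs(d) - delta, 0.0), d)  # soft threshold
            shrunken[k].append(overall[i] + scale * d_shrunk)
    return {"centroids": shrunken, "overall": overall,
            "scale": [si + s0 for si in s],
            "priors": {k: len(members[k]) / n for k in classes}}


def predict(model, x):
    """Assign x to the class with the smallest standardized distance score."""
    best, best_score = None, float("inf")
    for k, c in model["centroids"].items():
        score = sum(((xi - ci) / sc) ** 2 for xi, ci, sc in zip(x, c, model["scale"]))
        score -= 2.0 * math.log(model["priors"][k])
        if score < best_score:
            best, best_score = k, score
    return best


def selected_genes(model):
    """Indices of genes whose centroid survives shrinkage in at least one class."""
    return [i for i, o in enumerate(model["overall"])
            if any(c[i] != o for c in model["centroids"].values())]


def loo_accuracy(X, y, delta):
    """Leave-one-out cross-validation: refit without each sample, then predict it."""
    hits = 0
    for j in range(len(X)):
        model = fit_shrunken_centroids(X[:j] + X[j + 1:], y[:j] + y[j + 1:], delta)
        hits += predict(model, X[j]) == y[j]
    return hits / len(X)


# Toy data (hypothetical): 8 samples x 3 genes; gene 2 carries no class signal.
X = [[5.0, 1.0, 3.0], [5.2, 1.1, 3.1], [4.8, 0.9, 2.9], [5.1, 1.2, 3.0],
     [1.0, 4.0, 3.1], [1.2, 4.1, 2.9], [0.9, 3.9, 3.0], [1.1, 4.2, 3.0]]
y = ["R"] * 4 + ["NR"] * 4

model = fit_shrunken_centroids(X, y, delta=1.0)
print("selected genes:", selected_genes(model))
print("leave-one-out accuracy:", loo_accuracy(X, y, delta=1.0))
```

In practice the threshold would be chosen by repeating the cross-validation over a range of values and keeping the smallest gene set with acceptable error; the genes that survive shrinkage form the optimised signature.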
PCT/EP2012/004957 2011-12-01 2012-11-30 Method for classifying tumour cells WO2013079215A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB201120711A GB201120711D0 (en) 2011-12-01 2011-12-01 Method for classifying tumour cells
GB1120711.5 2011-12-01

Publications (1)

Publication Number Publication Date
WO2013079215A1 (en) 2013-06-06

Family

ID=45509031

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2012/004957 WO2013079215A1 (en) 2011-12-01 2012-11-30 Method for classifying tumour cells

Country Status (2)

Country Link
GB (1) GB201120711D0 (en)
WO (1) WO2013079215A1 (en)

Patent Citations (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4631211A (en) 1985-03-25 1986-12-23 Scripps Clinic & Research Foundation Means for sequential solid phase organic synthesis and methods using the same
US4683195A (en) 1986-01-30 1987-07-28 Cetus Corporation Process for amplifying, detecting, and/or-cloning nucleic acid sequences
US4683195B1 (en) 1986-01-30 1990-11-27 Cetus Corp
US5270163A (en) 1990-06-11 1993-12-14 University Research Corporation Methods for identifying nucleic acid ligands
US5503978A (en) 1990-06-11 1996-04-02 University Research Corporation Method for identification of high affinity DNA ligands of HIV-1 reverse transcriptase
US5567588A (en) 1990-06-11 1996-10-22 University Research Corporation Systematic evolution of ligands by exponential enrichment: Solution SELEX
US5654151A (en) 1990-06-11 1997-08-05 Nexstar Pharmaceuticals, Inc. High affinity HIV Nucleocapsid nucleic acid ligands
US5837832A (en) 1993-06-25 1998-11-17 Affymetrix, Inc. Arrays of nucleic acid probes on biological chips
WO1996038579A1 (en) 1995-06-02 1996-12-05 Nexstar Pharmaceuticals, Inc. High-affinity oligonucleotide ligands to growth factors
US6355420B1 (en) 1997-02-12 2002-03-12 Us Genomics Methods and products for analyzing polymers
US6403311B1 (en) 1997-02-12 2002-06-11 Us Genomics Methods of analyzing polymers using ordered label strategies
WO1999002671A1 (en) 1997-07-07 1999-01-21 Medical Research Council In vitro sorting method
US6790671B1 (en) 1998-08-13 2004-09-14 Princeton University Optically characterizing polymers
US6210896B1 (en) 1998-08-13 2001-04-03 Us Genomics Molecular motors
US6263286B1 (en) 1998-08-13 2001-07-17 U.S. Genomics, Inc. Methods of analyzing polymers using a spatial network of fluorophores and fluorescence resonance energy transfer
US6772070B2 (en) 1998-08-13 2004-08-03 U.S. Genomics, Inc. Methods of analyzing polymers using a spatial network of fluorophores and fluorescence resonance energy transfer
WO2000040712A1 (en) 1999-01-07 2000-07-13 Medical Research Council Optical sorting method
US20050014175A1 (en) 1999-06-28 2005-01-20 California Institute Of Technology Methods and apparatuses for analyzing polynucleotide sequences
US6440706B1 (en) 1999-08-02 2002-08-27 Johns Hopkins University Digital amplification
US6753147B2 (en) 1999-08-02 2004-06-22 The Johns Hopkins University Digital amplification
US6762059B2 (en) 1999-08-13 2004-07-13 U.S. Genomics, Inc. Methods and apparatuses for characterization of single polymers
US6696022B1 (en) 1999-08-13 2004-02-24 U.S. Genomics, Inc. Methods and apparatuses for stretching polymers
WO2001016375A2 (en) 1999-08-30 2001-03-08 The Government Of The United States Of America, As Represented By The Secretary, Department Of Health And Human Services High speed parallel molecular nucleic acid sequencing
WO2002022869A2 (en) 2000-09-13 2002-03-21 Medical Research Council Directed evolution method
US20020090979A1 (en) 2000-10-30 2002-07-11 Sydor John T. Method and wireless communication hub for data communications
WO2003044187A2 (en) 2001-11-16 2003-05-30 Medical Research Council Emulsion compositions
WO2004070005A2 (en) 2003-01-29 2004-08-19 454 Corporation Double ended sequencing
WO2004070007A2 (en) 2003-01-29 2004-08-19 454 Corporation Method for preparing single-stranded dna libraries
WO2005003375A2 (en) 2003-01-29 2005-01-13 454 Corporation Methods of amplifying and sequencing nucleic acids
WO2004069849A2 (en) 2003-01-29 2004-08-19 454 Corporation Bead emulsion nucleic acid amplification
WO2005010145A2 (en) 2003-07-05 2005-02-03 The Johns Hopkins University Method and compositions for detection and enumeration of genetic variations
WO2005039389A2 (en) 2003-10-22 2005-05-06 454 Corporation Sequence-based karyotyping
US20050100932A1 (en) 2003-11-12 2005-05-12 Helicos Biosciences Corporation Short cycle methods for sequencing polynucleotides
WO2005054431A2 (en) 2003-12-01 2005-06-16 454 Corporation Method for isolation of independent, parallel chemical micro-reactions using a porous filter
WO2005073410A2 (en) 2004-01-28 2005-08-11 454 Corporation Nucleic acid amplification with continuous flow emulsion
WO2005101014A2 (en) 2004-04-16 2005-10-27 Evotec Neurosciences Gmbh Diagnostic and therapeutic use of kcnc1 for neurodegenerative diseases
WO2009043022A2 (en) * 2007-09-28 2009-04-02 Duke University Individualized cancer treatments

Non-Patent Citations (70)

* Cited by examiner, † Cited by third party
Title
"Current Protocols in Molecular Biology", vol. 1, 1994, JOHN WILEY & SONS, INC.
"Gene Transfer and Expression Protocols", THE HUMANA PRESS INC., pages: 109 - 128
"Methods in Enzymology", vol. 194, 1991, ACADEMIC PRESS, INC., article "Guide to Yeast Genetics and Molecular Biology"
"Microarray Biochip Technology", 1999, EATON PUBLISHING COMPANY, article "The Chipping Forecast"
AKIRA MOGI ET AL: "TP53 Mutations in Nonsmall Cell Lung Cancer", JOURNAL OF BIOMEDICINE AND BIOTECHNOLOGY, vol. 9, no. 3, 1 January 2011 (2011-01-01), pages 93 - 9, XP055056426, ISSN: 1110-7243, DOI: 10.1038/nrd2656 *
ANBAZHAGAN R; TIHAN T; BORNMAN DM ET AL.: "Classification of small cell lung cancer and pulmonary carcinoid by gene expression profiles", CANCER RES, vol. 59, no. 20, 15 October 1999 (1999-10-15), pages 5119 - 22, XP002901773
AUSUBEL ET AL.: "Short Protocols in Molecular Biology", 1999, JOHN WILEY & SONS, INC.
BEPLER G; SOMMERS KE; CANTOR A ET AL.: "Clinical efficacy and predictive molecular markers of neoadjuvant gemcitabine and pemetrexed in resectable non-small cell lung cancer", J THORAC ONCOL, vol. 3, no. 10, October 2008 (2008-10-01), pages 1112 - 8
BHATTACHARJEE A; RICHARDS WG; STAUNTON J ET AL.: "Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses", PROC NATL ACAD SCI USA, vol. 98, no. 24, 20 November 2001 (2001-11-20), pages 13790 - 5, XP002513549, DOI: doi:10.1073/pnas.191502998
BILD AH.; YAO G; CHANG JT ET AL.: "Oncogenic pathway signatures in human cancers as a guide to targeted therapies", NATURE, vol. 439, no. 7074, 19 January 2006 (2006-01-19), pages 353 - 7, XP002460134, DOI: doi:10.1038/nature04296
CALVANO SE; XIAO W; RICHARDS DR ET AL.: "A network-based analysis of systemic inflammation in humans", NATURE, vol. 437, no. 7061, 13 October 2005 (2005-10-13), pages 1032 - 7, XP055127208, DOI: doi:10.1038/nature03985
CELIS ET AL., FEBS LETT, vol. 480, no. 1, 2000, pages 2 - 16
CEPPI P.; VOLANTE M; SAVIOZZI S ET AL.: "Squamous cell carcinoma of the lung compared with other histotypes shows higher messenger RNA and protein levels for thymidylate synthase", CANCER, vol. 107, no. 7, 1 October 2006 (2006-10-01), pages 1589 - 96
CEPPI P; VOLANTE M; FERRERO A ET AL.: "Thymidylate synthase expression in gastroenteropancreatic and pulmonary neuroendocrine tumors", CLIN CANCER RES, vol. 14, no. 4, 15 February 2008 (2008-02-15), pages 1059 - 64
CIULEANU T; BRODOWICZ T; ZIELINSKI C ET AL.: "Maintenance pemetrexed plus best supportive care versus placebo plus best supportive care for non-small-cell lung cancer: a randomised, double-blind, phase 3 study", LANCET, vol. 374, no. 9699, 24 October 2009 (2009-10-24), pages 1432 - 40, XP026721788, DOI: doi:10.1016/S0140-6736(09)61497-5
CORTES, THE SCIENTIST, vol. 14, no. 17, 2000, pages 25
CORTESE, THE SCIENTIST, vol. 14, no. 11, 2000, pages 26
EAKINS; CHU, TRENDS IN BIOTECHNOLOGY, vol. 17, 1999, pages 217 - 218
ESTEBAN E.; CASILLAS M; CASSINELLO A.: "Pemetrexed in first-line treatment of non-small cell lung cancer", CANCER TREAT REV, vol. 35, no. 4, June 2009 (2009-06-01), pages 364 - 73, XP026123605, DOI: doi:10.1016/j.ctrv.2009.02.002
GARBER ME; TROYANSKAYA OG; SCHLUENS K ET AL.: "Diversity of gene expression in adenocarcinoma of the lung", PROC NATL ACAD SCI USA, vol. 98, no. 24, 20 November 2001 (2001-11-20), pages 13784 - 9, XP008059849, DOI: doi:10.1073/pnas.241500798
GENTZ ET AL., PROC. NATL. ACAD. SCI. USA, vol. 86, 1989, pages 821 - 824
GIOVANNETTI E; MEY V; NANNIZZI S ET AL.: "Cellular and pharmacogenetics foundation of synergistic interaction of pemetrexed and gemcitabine in human non-small-cell lung cancer cells", MOL PHARMACOL, vol. 68, no. 1, July 2005 (2005-07-01), pages 110 - 8, XP002513966, DOI: doi:10.1124//MOL.104.009373
GIOVANNI PARMIGIANI; ELIZABETH S GARRETT; RAFAEL A IRIZARRY; SCOTT L ZEGER: "The analysis of gene expression data: methods and software", 2003, SPRINGER
GOLUB T; SLONIM D; TAMAYO P ET AL.: "Molecular classification of cancer: class discovery and class prediction by gene expression monitoring", SCIENCE, vol. 286, 1999, pages 531 - 6
GOLUB, T. ET AL., SCIENCE, vol. 286, 1999, pages 531 - 536
GUATELLI ET AL., PROC. NATL. ACAD. SCI. USA, vol. 87, 1990, pages 1874
GWYNNE; PAGE: "Microarray analysis : the next revolution in molecular biology", SCIENCE, 6 August 1999 (1999-08-06)
HANAUSKE AR; CHEN V; PAOLETTI P; NIYIKIZA C.: "Pemetrexed disodium: a novel antifolate clinically active against multiple solid tumors", ONCOLOGIST, vol. 6, no. 4, 2001, pages 363 - 73, XP008005751, DOI: doi:10.1634/theoncologist.6-4-363
HANAUSKE AR; EISMANN U; OBERSCHMIDT O ET AL.: "In vitro chemosensitivity of freshly explanted tumor cells to pemetrexed is correlated with target gene expression", INVEST NEW DRUGS, vol. 25, no. 5, October 2007 (2007-10-01), pages 417 - 23, XP019526148, DOI: doi:10.1007/s10637-007-9060-9
HITOSHI KITAMURA ET AL: "Small Cell Lung Cancer: Significance of RB Alterations and TTF-1 Expression in its Carcinogenesis, Phenotype, and Biology", ENDOCRINE PATHOLOGY, vol. 20, no. 2, 24 April 2009 (2009-04-24), pages 101 - 107, XP055056429, ISSN: 1046-3976, DOI: 10.1007/s12022-009-9072-4 *
HOU J; AERTS J; DEN HAMER B ET AL.: "Gene expression-based classification of non-small cell lung carcinomas and survival prediction", PLOS ONE, vol. 5, no. 4, pages E10312, XP002581317, DOI: doi:10.1371/journal.pone.0010312
HSU DS; BALAKUMARAN BS; ACHARYA CR ET AL.: "Pharmacogenomic strategies provide a rational approach to the treatment of cisplatin-resistant patients with advanced cancer", J CLIN ONCOL, vol. 25, no. 28, 1 October 2007 (2007-10-01), pages 4350 - 7, XP002513951, DOI: doi:10.1200/JCO.2007.11.0593
HUANG DA W; SHERMAN BT; LEMPICKI RA: "Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources", NAT PROTOC, vol. 4, no. 1, 2009, pages 44 - 57, XP009153774, DOI: doi:10.1038/nprot.2008.211
HUMAM KADARA ET AL: "Identification of Gene Signatures and Molecular Markers for Human Lung Cancer Prognosis using an In vitro Lung Carcinogenesis System", vol. 2, no. 8, 1 August 2009 (2009-08-01), pages 702 - 711, XP002679416, Retrieved from the Internet <URL:http://cancerpreventionresearch.aacrjournals.org/content/2/8/702> [retrieved on 20090728], DOI: 10.1158/1940-6207.CAPR-09-0084 *
INAMURA K; FUJIWARA T; HOSHIDA Y ET AL.: "Two subclasses of lung squamous cell carcinoma with different gene expression profiles and prognosis identified by hierarchical clustering and non-negative matrix factorization", ONCOGENE, vol. 24, no. 47, 27 October 2005 (2005-10-27), pages 7105 - 13, XP002572602, DOI: doi:10.1038/SJ.ONC.1208858
INNIS ET AL.: "PCR Protocols: A Guide to Methods and Applications", 1990, ACADEMIC PRESS
IRIZARRY RA; HOBBS B; COLLIN F ET AL.: "Exploration, normalization, and summaries of high density oligonucleotide array probe level data", BIOSTATISTICS, vol. 4, no. 2, April 2003 (2003-04-01), pages 249 - 64, XP002466228, DOI: doi:10.1093/biostatistics/4.2.249
JACKMAN DM; MILLER VA; CIOFFREDI LA ET AL.: "Impact of epidermal growth factor receptor and KRAS mutations on clinical outcomes in previously untreated non-small cell lung cancer patients: results of an online tumor registry of clinical trials", CLIN CANCER RES, vol. 15, no. 16, 15 August 2009 (2009-08-15), pages 5267 - 73
KY. CHAN, MUTATION RESEARCH, vol. 573, 2005, pages 13 - 40
LANDEGREN, U. ET AL., SCIENCE, vol. 242, 1988, pages 229 - 237
LEMIEUX ET AL., MOLECULAR BREEDING, vol. 4, 1998, pages 277 - 289
LEWIS, R., GENETIC ENGINEERING NEWS, vol. 10, no. 1, 1990, pages 54 - 55
LIZARDI ET AL., BIO/TECHNOLOGY, vol. 6, 1988, pages 1197
LIZARDI ET AL., NAT GENET, vol. 19, 1998, pages 225
LOCKHART; WINZELER, NATURE, vol. 405, no. 6788, 2000, pages 827 - 836
LONGO-SORBELLO GS; CHEN B; BUDAK-ALPDOGAN T; BERTINO JR: "Role of pemetrexed in non-small cell lung cancer", CANCER INVEST, vol. 25, no. 1, February 2007 (2007-02-01), pages 59 - 66
MCPHERSON ET AL.: "PCR", vol. 1, 1991, OXFORD UNIVERSITY PRESS
MOK ET AL., GYNAECOLOGIC ONCOLOGY, vol. 52, 1994, pages 247 - 252
MONICA V; SCAGLIOTTI GV; CEPPI P ET AL.: "Differential Thymidylate Synthase Expression in Different Variants of Large-Cell Carcinoma of the Lung", CLIN CANCER RES, vol. 15, no. 24, 15 December 2009 (2009-12-15), pages 7547 - 52
MOUNTZIOS G ET AL: "Histopathologic and genetic alterations as predictors of response to treatment and survival in lung cancer: A review of published data", CRITICAL REVIEWS IN ONCOLOGY / HEMATOLOGY, ELSEVIER SCIENCE IRELAND LTD., LIMERICK, IE, vol. 75, no. 2, 1 August 2010 (2010-08-01), pages 94 - 109, XP027169656, ISSN: 1040-8428, [retrieved on 20091113] *
POTTI A; MUKHERJEE S; PETERSEN R ET AL.: "A genomic strategy to refine prognosis in early-stage non-small-cell lung cancer", N ENGL J MED, vol. 355, no. 6, 10 August 2006 (2006-08-10), pages 570 - 80, XP009096340, DOI: doi:10.1056/NEJMoa060467
R. I. FRESHNEY: "Culture of Animal Cells: A Manual of Basic Technique", 1987, LISS, INC.
RAPONI M; ZHANG Y; YU J: "Gene expression signatures for predicting prognosis of squamous cell and adenocarcinomas of the lung", CANCER RES, vol. 66, no. 15, 1 August 2006 (2006-08-01), pages 7466 - 72
ROSELL R; FELIP E; GARCIA-CAMPELO R; BALANA C: "The biology of non-small-cell lung cancer identifying new targets for rational therapy", LUNG CANCER, vol. 46, no. 2, November 2004 (2004-11-01), pages 135 - 48, XP004594546, DOI: doi:10.1016/j.lungcan.2004.04.031
SAMBROOK ET AL.: "Molecular Cloning: A Laboratory Manual", 1989, COLD SPRING HARBOR LABORATORY PRESS
SCAGLIOTTI G; HANNA N; FOSSELLA F ET AL.: "The differential efficacy of pemetrexed according to NSCLC histology: a review of two Phase III studies", ONCOLOGIST, vol. 14, no. 3, March 2009 (2009-03-01), pages 253 - 63
SCHENA; DAVIS: "DNA Microarrays : A Practical Approach", 1999, OXFORD UNIVERSITY PRESS, article "Genes, Genomes and Chips"
SCHENA; DAVIS: "PCR Methods Manual", article "Parallel Analysis with Biological Chips"
SHALON ET AL., GENOME RES, vol. 6, no. 7, 1996, pages 639 - 45
TAKEUCHI T; TOMIDA S; YATABE Y ET AL.: "Expression profile-defined classification of lung adenocarcinoma shows close relationship with underlying major genetic changes and clinicopathologic behaviors", J CLIN ONCOL, vol. 24, no. 11, 10 April 2006 (2006-04-10), pages 1679 - 88
TIBSHIRANI R; HASTIE T; NARASIMHAN B; CHU G.: "Diagnosis of multiple cancer types by shrunken centroids of gene expression", PROC NATL ACAD SCI USA, vol. 99, no. 10, 14 May 2002 (2002-05-14), pages 6567 - 72, XP002988576, DOI: doi:10.1073/pnas.082099299
TIBSHIRANI, R. ET AL., PROC NATL ACAD SCI USA, vol. 99, no. 10, 2002, pages 6567 - 72
TRAVIS WD; GAL AA; COLBY TV; KLIMSTRA DS; FALK R; KOSS MN: "Reproducibility of neuroendocrine lung tumor classification", HUM PATHOL, vol. 29, no. 3, March 1998 (1998-03-01), pages 272 - 9
TUBBS RR; PETTAY JD; ROCHE PC; STOLER MH; JENKINS RB; GROGAN TM: "Discrepancies in clinical laboratory testing of eligibility for trastuzumab therapy: apparent immunohistochemical false-positives do not get the message", J CLIN ONCOL, vol. 19, no. 10, 15 May 2001 (2001-05-15), pages 2714 - 21, XP002600920
TUSHER VG; TIBSHIRANI R; CHU G.: "Significance analysis of microarrays applied to the ionizing radiation response", PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, vol. 98, no. 9, 24 April 2001 (2001-04-24), pages 5116 - 21, XP002967440, DOI: doi:10.1073/pnas.091062498
VELCULESCU ET AL., SCIENCE, vol. 270, no. 5235, pages 484 - 487
WALKER ET AL., PNAS (USA), vol. 80, 1992, pages 392
WIGLE DA; JURISICA I; RADULOVICH N: "Molecular profiling of non-small cell lung cancer and correlation with disease-free survival", CANCER RES, vol. 62, no. 11, 1 June 2002 (2002-06-01), pages 3005 - 8, XP002271233
WILSON ET AL., CELL, vol. 37, 1984, pages 767
WU, D. Y.; WALLACE, R. B., GENOMICS, vol. 4, 1989, pages 560

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2867368A1 (en) * 2012-07-06 2015-05-06 Institut Gustave Roussy Simultaneous detection of cannibalism and senescence as prognostic marker for cancer
US10969380B2 (en) 2012-07-06 2021-04-06 Institut Gustave Roussy Simultaneous detection of cannibalism and senescence as prognostic marker for cancer
EP2867368B1 (en) * 2012-07-06 2022-01-12 Institut Gustave Roussy Simultaneous detection of cannibalism and senescence as prognostic marker for cancer
EP2877210A4 (en) * 2012-07-24 2016-06-29 Cedars Sinai Medical Center A novel method to detect resistance to chemotherapy in patients with lung cancer
EP3630293A4 (en) * 2017-05-22 2021-06-02 The National Institute for Biotechnology in the Negev Ltd. Biomarkers for diagnosis of lung cancer
US11408887B2 (en) 2017-05-22 2022-08-09 The National Institute for Biotechnology in the Negev Ltd. Biomarkers for diagnosis of lung cancer
CN112159661A (en) * 2020-08-31 2021-01-01 南昌航空大学 Preparation method and application of rare earth up-conversion fluorescent probe for detecting DNA damage marker
CN112159661B (en) * 2020-08-31 2022-09-27 南昌航空大学 Preparation method and application of rare earth up-conversion fluorescent probe for detecting DNA damage marker
CN113096730A (en) * 2021-04-02 2021-07-09 中山大学 Prediction system for nasopharyngeal carcinoma molecular typing

Also Published As

Publication number Publication date
GB201120711D0 (en) 2012-01-11

Similar Documents

Publication Publication Date Title
JP5745848B2 (en) Signs of growth and prognosis in gastrointestinal cancer
AU2010242792B2 (en) Gene expression profile algorithm and test for likelihood of recurrence of colorectal cancer and response to chemotherapy
WO2017062505A1 (en) Method of classifying and diagnosing cancer
AU2017341084B2 (en) Classification and prognosis of cancer
US20100292933A1 (en) Methods, systems, and compositions for classification, prognosis, and diagnosis of cancers
JP2017104116A (en) Predicting methods for gastroenteropancreatic neuroendocrine neoplasms (gep-nens)
WO2010108638A9 (en) Tumour gene profile
US20090221522A1 (en) Methods to correct gene set expression profiles to drug sensitivity
JP2017506506A (en) Molecular diagnostic tests for response to anti-angiogenic drugs and prediction of cancer prognosis
US20210233611A1 (en) Classification and prognosis of prostate cancer
JP7043404B2 (en) Gene signature of residual risk after endocrine treatment in early-stage breast cancer
WO2013079215A1 (en) Method for classifying tumour cells
Huang et al. Molecular gene signature and prognosis of non-small cell lung cancer
WO2019213478A1 (en) Gene expression assay for measurement of dna mismatch repair deficiency
WO2015127101A1 (en) Method and composition for diagnosis of aggressive prostate cancer
Duchnowska et al. Brain metastasis prediction by transcriptomic profiling in triple-negative breast cancer
Dumur et al. Genes involved in radiation therapy response in head and neck cancers
US9195796B2 (en) Malignancy-risk signature from histologically normal breast tissue
JP5688497B2 (en) Methods and compositions for predicting postoperative prognosis in patients with lung adenocarcinoma
US20230348990A1 (en) Prognostic and treatment response predictive method
US20210381058A1 (en) Evidence based selection of patients for clinical trials using histopathology
WO2023081190A1 (en) Epithelial-mesenchymal transition-based gene expression signature for kidney cancer
Byers Molecular Profiling
US20190338368A1 (en) Her2 as a predictor of response to dual her2 blockade in the absence of cytotoxic therapy

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 12805582

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 12805582

Country of ref document: EP

Kind code of ref document: A1