US20130095107A1

US20130095107A1 - Biomarker specific to brain/nerve or specific to neuronal differentiation

Info

Publication number: US20130095107A1
Application number: US13/324,610
Authority: US
Inventors: Ai Wakamatsu; Junichi Yamamoto; Takao Isogai
Original assignee: Reverse Proteomics Research Institute Co Ltd
Current assignee: Reverse Proteomics Research Institute Co Ltd
Priority date: 2007-03-15
Filing date: 2011-12-13
Publication date: 2013-04-18
Also published as: WO2008111634A1; JPWO2008111634A1; US8153764B2; JP5378202B2; US20100098695A1

Abstract

The invention provides a novel polypeptide and a specific partial peptide thereof, as well as a novel polynucleotide and a specific partial nucleotide thereof, that can be used as a biomarker specific for the brain/nerves or specific for nerve differentiation; an expression vector for such a polynucleotide and a specific partial peptide thereof; a transformant incorporating such an expression vector; an antisense molecule, RNAi-inducing nucleic acid (e.g., siRNA), aptamer, or antibody for such a biomarker, and a composition comprising the same; a mammalian cell or non-human mammal wherein the expression or a function of such a biomarker is regulated; a measuring means (e.g., primer set, nucleic acid probe, antibody, aptamer) for such a biomarker, and a reagent comprising the same and the like.

Description

TECHNICAL FIELD

The present invention provides a polypeptide and a partial peptide thereof, as well as a polynucleotide and a partial nucleotide thereof, that can be used as biomarkers specific for the brain/nerves or specific for nerve differentiation; an expression vector; a transformant; an antisense molecule, an RNAi-inducing nucleic acid (e.g., siRNA), an aptamer, an antibody, and a composition comprising them; a mammalian cell or a non-human mammal; a measuring means for a biomarker specific for the brain/nerves or specific for nerve differentiation (e.g., primer set, nucleic acid probe, antibody, aptamer), a measuring method and the like.

BACKGROUND ART

Although there have been remarkable advances in the analysis of human chromosome sequences thanks to the progress in human genome research, this does not mean that all the human genetic functions have been clarified. In humans, gene diversity is significantly associated with changes in gene functions. In fact, it is known that in humans, a plurality of mRNAs are transcribed from a particular region of a chromosome to produce different variants.
For the series of genes that have been discovered by the present inventors, and that can be used as biomarkers specific for the brain/nerves or specific for nerve differentiation (abbreviated as “brain/nerve-specific genes” or “brain/nerve-specific genes 1 to 10” as required), known variants have been reported. Examples of such known variants include known variants of brain/nerve-specific gene 1 (GenBank accession number: NM_133460.1; non-patent documents 1 and 2), brain/nerve-specific gene 2 (GenBank accession number: NM_005163.1; non-patent documents 3 and 4), brain/nerve-specific gene 3 (GenBank accession number: NM_181784.1; non-patent documents 5 and 6), brain/nerve-specific gene 4 (GenBank accession number: NM_003930.3; non-patent documents 7 and 8), brain/nerve-specific gene 5 (GenBank accession number: NM_000898.3; non-patent documents 9 and 10), brain/nerve-specific gene 6 (GenBank accession number: NM_005079.1; non-patent documents 11 and 12), brain/nerve-specific gene 7 (GenBank accession number: NM_001679.2; non-patent documents 13 and 14), brain/nerve-specific gene 8 (GenBank accession number: NM_000431.1; non-patent documents 15 and 16), brain/nerve-specific gene 9 (GenBank accession number: NM_153449.2; non-patent document 17), and brain/nerve-specific gene 10 (GenBank accession number: NM_015009.1; non-patent documents 18 and 19).
However, it is not known that the brain/nerve-specific genes 1 to 10 can be useful as biomarkers specific for the brain/nerves or specific for nerve cell differentiation, and that the particular variants discovered by the present inventors exist in the brain/nerve-specific genes 1 to 10.

[Non-patent document 1] Ota, T. et al., Nat. Genet. 36 (1), 40-45 (2004)
[Non-patent document 2] Strausberg, R. L. et al., Proc. Natl. Acad. Sci. U.S.A. 99 (26), 16899-16903 (2002)
[Non-patent document 3] Staal, S. P., Proc. Natl. Acad. Sci. U.S.A. 84 (14), 5034-5037 (1987)
[Non-patent document 4] Staal, S. P. et al., Genomics 2 (1), 96-98 (1988)
[Non-patent document 5] Wakioka, T. et al., Nature 412 (6847), 647-651 (2001)
[Non-patent document 6] Kato, R. et al., Biochem. Biophys. Res. Commun. 302 (4), 767-772 (2003)
[Non-patent document 7] Marie-Cardine, A. et al., FEBS Lett. 435 (1), 55-60 (1998)
[Non-patent document 8] Kouroku, Y. et al., Biochem. Biophys. Res. Commun. 252 (3), 738-742 (1998)
[Non-patent document 9] Kochersperger, L. M. et al., J. Neurosci. Res. 16 (4), 601-616 (1986)
[Non-patent document 10] Bach, A. W. et al., Proc. Natl. Acad. Sci. U.S.A. 85 (13), 4934-4938 (1988)
[Non-patent document 11] Chen, S. L. et al., Oncogene 12 (4), 741-751 (1996)
[Non-patent document 12] Byrne, J. A. et al., Genomics 35 (3), 523-532 (1996)
[Non-patent document 13] Lingrel, J. B. et al., Prog. Nucleic Acid Res. Mol. Biol. 38, 37-89 (1990)
[Non-patent document 14] Malik, N. et al., J. Biol. Chem. 271 (37), 22754-22758 (1996)
[Non-patent document 15] Kopito, R. R. et al., Proc. Natl. Acad. Sci. U.S.A. 77 (10), 5738-5740 (1980)
[Non-patent document 16] Schafer, B. L. et al., J. Biol. Chem. 267 (19), 13229-13238 (1992)
[Non-patent document 17] Wu, X. et al., Genomics 80 (6), 553-557 (2002)
[Non-patent document 18] Bach, I. et al., Nat. Genet. 22 (4), 394-399 (1999)
[Non-patent document 19] Katoh, M. et al., Int. J. Mol. Med. 13 (4), 607-613 (2004)

DISCLOSURE OF THE INVENTION

Problems to be Solved by the Invention

Analyzing a biomarker specific for the brain/nerve cells or specific for nerve cell differentiation leads to the development of, for example, a reagent for nerve cell identification or nerve cell differentiation state determination, a diagnostic reagent for a disease based on a nerve cell disorder, a pharmaceutical for a disease based on a nerve cell disorder, having a new mechanism of action, and the like. Based on the findings obtained by expression profile analysis of specified genes, the present invention is directed to providing such reagents, pharmaceuticals and the like, and providing a means useful in developing such reagents, pharmaceuticals and the like.

Means of Solving the Problems

The present inventors conducted extensive investigations and discovered brain/nerve-specific genes 1 to 10 as biomarkers specific for the brain/nerves or specific for nerve cell differentiation. The present inventors also discovered novel variants of the brain/nerve-specific genes 1 to 10 that can be used as biomarkers specific for the brain/nerves or specific for nerve cell differentiation. Therefore, it is thought that by utilizing the brain/nerve-specific genes 1 to 10 and/or novel variants thereof, it will become possible to identify nerve cells, to determine nerve cell differentiation states, to diagnose a disease based on a nerve cell disorder, and the like. In particular, because the brain/nerve-specific genes 1 to 10 and/or novel variants thereof are expressed specifically in particular differentiation stages of nerve cells, the accuracy of the determination of nerve cells in the particular differentiation stages can be increased. It is also thought that by utilizing the brain/nerve-specific genes 1 to 10 and/or novel variants thereof, it will become possible to develop a novel pharmaceutical for a specified disease such as a disease based on a nerve cell disorder, and the like.
Based on the findings shown above, the present inventors developed the present invention.
Accordingly, the present invention relates to the following aspects and the like.
[1] A polypeptide of any one of 1) to 10) below or a specific partial peptide thereof:
1) a polypeptide having an amino acid sequence shown by SEQ ID NO:18 or SEQ ID NO:10 or substantially the same amino acid sequence thereas;
2) a polypeptide having the amino acid sequence shown by SEQ ID NO:43 or substantially the same amino acid sequence thereas;
3) a polypeptide having the amino acid sequence shown by SEQ ID NO:58 or substantially the same amino acid sequence thereas;
4) a polypeptide having the amino acid sequence shown by SEQ ID NO:74 or substantially the same amino acid sequence thereas;
5) a polypeptide having an amino acid sequence shown by SEQ ID NO:89 or SEQ ID NO:99 or substantially the same amino acid sequence thereas;
6) a polypeptide having the amino acid sequence shown by SEQ ID NO:118 or substantially the same amino acid sequence thereas;
7) a polypeptide having the amino acid sequence shown by SEQ ID NO:133 or substantially the same amino acid sequence thereas;
8) a polypeptide having an amino acid sequence shown by SEQ ID NO:152 or SEQ ID NO:159 or substantially the same amino acid sequence thereas;
9) a polypeptide having an amino acid sequence shown by SEQ ID NO:184 or SEQ ID NO:190 or substantially the same amino acid sequence thereas; and
10) a polypeptide having an amino acid sequence shown by SEQ ID NO:207, SEQ ID NO:213, SEQ ID NO:219, SEQ ID NO:225, SEQ ID NO:231 or SEQ ID NO:236 or substantially the same amino acid sequence thereas.
[2] The polypeptide or specific partial peptide thereof according to [1] above, wherein the polypeptide is any of the polypeptides 1) to 10) below:
1) a polypeptide consisting of an amino acid sequence shown by SEQ ID NO:18 or SEQ ID NO:10;
2) a polypeptide consisting of the amino acid sequence shown by SEQ ID NO:43;
3) a polypeptide consisting of the amino acid sequence shown by SEQ ID NO:58;
4) a polypeptide consisting of the amino acid sequence shown by SEQ ID NO:74;
5) a polypeptide consisting of an amino acid sequence shown by SEQ ID NO:89 or SEQ ID NO:99;
6) a polypeptide consisting of the amino acid sequence shown by SEQ ID NO:118;
7) a polypeptide consisting of the amino acid sequence shown by SEQ ID NO:133;
8) a polypeptide consisting of an amino acid sequence shown by SEQ ID NO:152 or SEQ ID NO:159;
9) a polypeptide consisting of an amino acid sequence shown by SEQ ID NO:184 or SEQ ID NO:190; and
10) a polypeptide consisting of an amino acid sequence shown by SEQ ID NO:207, SEQ ID NO:213, SEQ ID NO:219, SEQ ID NO:225, SEQ ID NO:231 or SEQ ID NO:236.
[3] The polypeptide or specific partial peptide thereof according to [1] or [2] above, which is fused with a polypeptide consisting of a heterologous amino acid sequence.
[4] A partial peptide specific for a polypeptide encoded by one of the brain/nerve-specific genes 1 to 10, being any one of the partial peptides 1) to 10) below:
1) a partial peptide consisting of an amino acid sequence shown by SEQ ID NO:12, SEQ ID NO:15, SEQ ID NO:20 or SEQ ID NO:22 or a partial amino acid sequence thereof;
2) a partial peptide consisting of the amino acid sequence shown by SEQ ID NO:264 or a partial amino acid sequence thereof;
3) a partial peptide having the amino acid sequence shown by SEQ ID NO:60;
4) a partial peptide consisting of the amino acid sequence shown by SEQ ID NO:265 or a partial amino acid sequence thereof;
5) a partial peptide consisting of an amino acid sequence shown by SEQ ID NO:93, SEQ ID NO:94, SEQ ID NO:96, or SEQ ID NO:266 or a partial amino acid sequence thereof;
6) a partial peptide consisting of the amino acid sequence shown by SEQ ID NO:120 or a partial amino acid sequence thereof;
7) a partial peptide consisting of an amino acid sequence shown by SEQ ID NO:135, SEQ ID NO:138 or SEQ ID NO:139 or a partial amino acid sequence thereof;
8) a partial peptide consisting of an amino acid sequence shown by SEQ ID NO:156, SEQ ID NO:161 or SEQ ID NO:163 or a partial amino acid sequence thereof;
9) a partial peptide consisting of an amino acid sequence shown by SEQ ID NO:186 or SEQ ID NO:192 or a partial amino acid sequence thereof; and
10) a partial peptide consisting of an amino acid sequence shown by SEQ ID NO:209, SEQ ID NO:215, SEQ ID NO:221 or SEQ ID NO:227 or a partial amino acid sequence thereof, or a partial peptide having the amino acid sequence shown by SEQ ID NO:238.
[5] A polynucleotide that encodes any one of the polypeptides [1] to [3] above, or any one of the specific partial peptides [1] to [4] above.
[6] A polynucleotide of any one of 1) to 10) below or a specific partial nucleotide thereof:
1) a polynucleotide having a nucleic acid sequence shown by SEQ ID NO:16 or SEQ ID NO:8, or a nucleic acid sequence corresponding to the ORF thereof, or substantially the same nucleic acid sequence thereas;
2) a polynucleotide having the nucleic acid sequence shown by SEQ ID NO:41, or a nucleic acid sequence corresponding to the ORF thereof, or substantially the same nucleic acid sequence thereas;
3) a polynucleotide having the nucleic acid sequence shown by SEQ ID NO:56, or a nucleic acid sequence corresponding to the ORF thereof, or substantially the same nucleic acid sequence thereas;
4) a polynucleotide having the nucleic acid sequence shown by SEQ ID NO:72, or a nucleic acid sequence corresponding to the ORF thereof, or substantially the same nucleic acid sequence thereas;
5) a polynucleotide having a nucleic acid sequence shown by SEQ ID NO:87 or SEQ ID NO:97, or a nucleic acid sequence corresponding to the ORF thereof, or substantially the same nucleic acid sequence thereas;
6) a polynucleotide having the nucleic acid sequence shown by SEQ ID NO:116, or a nucleic acid sequence corresponding to the ORF thereof, or substantially the same nucleic acid sequence thereas;
7) a polynucleotide having the nucleic acid sequence shown by SEQ ID NO:131, or a nucleic acid sequence corresponding to the ORF thereof, or substantially the same nucleic acid sequence thereas;
8) a polynucleotide having a nucleic acid sequence shown by SEQ ID NO:150 or SEQ ID NO:157, or a nucleic acid sequence corresponding to the ORF thereof, or substantially the same nucleic acid sequence thereas;
9) a polynucleotide having a nucleic acid sequence shown by SEQ ID NO:182 or SEQ ID NO:188, or a nucleic acid sequence corresponding to the ORF thereof, or substantially the same nucleic acid sequence thereas; and
10) a polynucleotide having a nucleic acid sequence shown by SEQ ID NO:205, SEQ ID NO:211, SEQ ID NO:217, SEQ ID NO:223, SEQ ID NO:229 or SEQ ID NO:234, or a nucleic acid sequence corresponding to the ORF thereof, or substantially the same nucleic acid sequence thereas.
[7] The polynucleotide or specific partial nucleotide thereof according to [6] above, wherein the polynucleotide is any one of the polynucleotides 1) to 10) below:
1) a polynucleotide consisting of a nucleic acid sequence shown by SEQ ID NO:16 or SEQ ID NO:8 or a nucleic acid sequence corresponding to the ORF thereof;
2) a polynucleotide consisting of the nucleic acid sequence shown by SEQ ID NO:41 or a nucleic acid sequence corresponding to the ORF thereof;
3) a polynucleotide consisting of the nucleic acid sequence shown by SEQ ID NO:56 or a nucleic acid sequence corresponding to the ORF thereof;
4) a polynucleotide consisting of the nucleic acid sequence shown by SEQ ID NO:72 or a nucleic acid sequence corresponding to the ORF thereof;
5) a polynucleotide consisting of a nucleic acid sequence shown by SEQ ID NO:87 or SEQ ID NO:97 or a nucleic acid sequence corresponding to the ORF thereof;
6) a polynucleotide consisting of the nucleic acid sequence shown by SEQ ID NO:116 or a nucleic acid sequence corresponding to the ORF thereof;
7) a polynucleotide consisting of the nucleic acid sequence shown by SEQ ID NO:131 or a nucleic acid sequence corresponding to the ORF thereof;
8) a polynucleotide consisting of a nucleic acid sequence shown by SEQ ID NO:150 or SEQ ID NO:157 or a nucleic acid sequence corresponding to the ORF thereof;
9) a polynucleotide consisting of a nucleic acid sequence shown by SEQ ID NO:182 or SEQ ID NO:188 or a nucleic acid sequence corresponding to the ORF thereof; and
10) a polynucleotide consisting of a nucleic acid sequence shown by SEQ ID NO:205, SEQ ID NO:211, SEQ ID NO:217, SEQ ID NO:223, SEQ ID NO:229 or SEQ ID NO:234 or a nucleic acid sequence corresponding to the ORF thereof.
[8] A partial nucleotide specific for any one of the polynucleotides encoded by the brain/nerve-specific genes 1 to 10, being any one of the partial nucleotides 1) to 10) below:
1) a partial nucleotide consisting of a nucleic acid sequence shown by SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:25, SEQ ID NO:29, SEQ ID NO:33, SEQ ID NO:39 or SEQ ID NO:40 or a partial nucleic acid sequence thereof;
2) a partial nucleotide consisting of a nucleic acid sequence shown by SEQ ID NO:44, SEQ ID NO:45, SEQ ID NO:48, SEQ ID NO:51 or SEQ ID NO:55 or a partial nucleic acid sequence thereof;
3) a partial nucleotide consisting of a nucleic acid sequence shown by SEQ ID NO:59, SEQ ID NO:61, SEQ ID NO:64, SEQ ID NO:67 or SEQ ID NO:71 or a partial nucleic acid sequence thereof;
4) a partial nucleotide consisting of a nucleic acid sequence shown by SEQ ID NO:75, SEQ ID NO:76, SEQ ID NO:79, SEQ ID NO:82 or SEQ ID NO:86 or a partial nucleic acid sequence thereof;
5) a partial nucleotide consisting of a nucleic acid sequence shown by SEQ ID NO:90, SEQ ID NO:91, SEQ ID NO:92, SEQ ID NO:95, SEQ ID NO:100, SEQ ID NO:101, SEQ ID NO:104, SEQ ID NO:107, SEQ ID NO:110, SEQ ID NO:114 or SEQ ID NO:115 or a partial nucleic acid sequence thereof;
6) a partial nucleotide consisting of a nucleic acid sequence shown by SEQ ID NO:119, SEQ ID NO:123, SEQ ID NO:126 or SEQ ID NO:130 or a partial nucleic acid sequence thereof;
7) a partial nucleotide consisting of a nucleic acid sequence shown by SEQ ID NO:134, SEQ ID NO:136, SEQ ID NO:137, SEQ ID NO:142, SEQ ID NO:145 or SEQ ID NO:149 or a partial nucleic acid sequence thereof;
8) a partial nucleotide consisting of a nucleic acid sequence shown by SEQ ID NO:153, SEQ ID NO:154, SEQ ID NO:155, SEQ ID NO:160, SEQ ID NO:162, SEQ ID NO:166, SEQ ID NO:170, SEQ ID NO:174, SEQ ID NO:180 or SEQ ID NO:181 or a partial nucleic acid sequence thereof;
9) a partial nucleotide consisting of a nucleic acid sequence shown by SEQ ID NO:185, SEQ ID NO:187, SEQ ID NO:191, SEQ ID NO:193, SEQ ID NO:199, SEQ ID NO:203 or SEQ ID NO:204 or a partial nucleic acid sequence thereof; and
10) a partial nucleotide consisting of a nucleic acid sequence shown by SEQ ID NO:208, SEQ ID NO:210, SEQ ID NO:214, SEQ ID NO:216, SEQ ID NO:220, SEQ ID NO:222, SEQ ID NO:226, SEQ ID NO:228, SEQ ID NO:232, SEQ ID NO:233, SEQ ID NO:237, SEQ ID NO:239, SEQ ID NO:242, SEQ ID NO:245, SEQ ID NO:248, SEQ ID NO:251, SEQ ID NO:254, SEQ ID NO:258, SEQ ID NO:259, SEQ ID NO:260, SEQ ID NO:261, SEQ ID NO:262 or SEQ ID NO:263 or a partial nucleic acid sequence thereof.
[9] An expression vector for the polypeptide according to any one of [1] to [3] above or the specific partial peptide according to any one of [1] to [4] above, comprising the polynucleotide according to any one of [5] to [7] above or the specific partial nucleotide according to any one of [6] to [8] above, and a promoter operably linked thereto.
[10] A transformant incorporating the expression vector according to [9] above.
[11] An antisense molecule comprising a nucleic acid sequence complementary to the nucleic acid sequence of the specific partial nucleotide according to [7] or [8] above, and capable of suppressing the expression of any one of the polypeptides encoded by the brain/nerve-specific genes 1 to 10.
[12] An RNAi-inducing nucleic acid capable of suppressing the expression of any one of the polypeptides encoded by the brain/nerve-specific genes 1 to 10, that is configured by a sense strand consisting of the nucleic acid sequence of the specific partial nucleotide according to [7] or [8] above, and an antisense strand consisting of a nucleic acid sequence complementary thereto, and that may have an overhang at the 5′ terminus and/or 3′ terminus of one or both of the sense strand and the antisense strand.
[13] The RNAi-inducing nucleic acid according to [12] above, wherein the RNAi-inducing nucleic acid is an siRNA.
[14] An aptamer capable of binding to any one of the polypeptides encoded by the brain/nerve-specific genes 1 to 10 via a region corresponding to the specific partial peptide according to any one of [2] to [4] above.
[15] An antibody capable of binding to any one of the polypeptides encoded by the brain/nerve-specific genes 1 to 10 via a region corresponding to the specific partial peptide according to any one of [2] to [4] above.
[16] The antibody according to [15] above, wherein the antibody is any one of the i) to iii) below:
i) a polyclonal antibody;
ii) a monoclonal antibody or a portion thereof;
iii) a chimeric antibody, a humanized antibody or a human antibody.
[17] A cell that produces the antibody according to [15] or [16] above.
[18] The cell according to [17] above, wherein the cell is a hybridoma.
[19] A composition comprising the polypeptide according to any one of [1] to [3] above, the antisense molecule according to [11] above, the RNAi-inducing nucleic acid according to [12] or [13] above, the aptamer according to [14] above, the antibody according to [15] or [16] above, or an expression vector therefor, and a pharmaceutically acceptable carrier.
[20] A mammalian cell or non-human mammal wherein the expression or a function of the polypeptide according to any one of [1] to [3] above is regulated.
[21] A primer set specific for any one of the polynucleotides encoded by the brain/nerve-specific genes 1 to 10 or a specific partial nucleotide thereof, comprising the following (a) or (b):
(a) a sense primer corresponding to a first nucleic acid sequence of the polynucleotide according to [7] above or the specific partial nucleotide according to [7] or [8] above; and
(b) an antisense primer corresponding to a nucleic acid sequence complementary to a second nucleic acid sequence of the polynucleotide according to [7] above or the specific partial nucleotide according to [7] or [8] above.
[22] A nucleic acid probe specific for any one of the polynucleotides encoded by the brain/nerve-specific genes 1 to 10 or a specific partial nucleotide thereof, being any one of the following (a) or (b):
(a) a single-stranded polynucleotide comprising a nucleic acid sequence complementary to the nucleic acid sequence of the specific partial nucleotide according to [7] or [8] above; or
(b) a double-stranded polynucleotide configured by a sense strand comprising the nucleic acid sequence of the specific partial nucleotide according to [7] or [8] above, and an antisense strand comprising a nucleic acid sequence complementary thereto.
[23] A reagent or kit for detection or quantification of any one of the polypeptides or polynucleotides encoded by the brain/nerve-specific genes 1 to 10, comprising one or more substances or sets selected from among the aptamer according to [14] above, the antibody according to [15] or [16] above, the primer set according to [21] above and the nucleic acid probe according to [22] above.
[24] The reagent or kit according to [23] above, being a reagent or kit for determination of nerve cell differentiation.
[25] A method of detecting or quantifying any one of the polypeptides or polynucleotides encoded by the brain/nerve-specific genes 1 to 10, comprising measuring the expression of the polypeptide or polynucleotide in a biological sample or cell or tissue culture obtained from a mammal, wherein the biological sample or the culture contains a nerve cell or a tissue in the brain.
[26] A method of detecting or quantifying the polypeptide according to [2] or [3] above or the polynucleotide according to [7] above, comprising measuring the expression of the polypeptide or the polynucleotide in a biological sample or cell or tissue culture obtained from a mammal.
[27] The method of detection or quantification according to [26] above, wherein the biological sample or the culture contains a nerve cell or a tissue in the brain.
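As a minimal sketch of the strand relationship recited in [11], [12] and [22] above, the following Python fragment derives an antisense strand as the reverse complement of a sense strand and optionally appends an overhang at the 3′ terminus of each strand. The function names, the example sequence GATTACA and the UU overhang are hypothetical illustrations; they are not sequences or parameters disclosed herein, and a 5′ overhang is not modeled.

```python
# Illustrative only: builds a sense/antisense RNA pair from a DNA target.
# All names and the example input are assumptions, not part of the disclosure.

_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def to_rna(dna: str) -> str:
    """Convert a DNA sequence (5'->3') to the equivalent RNA sequence."""
    seq = dna.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, T")
    return seq.replace("T", "U")


def reverse_complement_rna(rna: str) -> str:
    """Return the strand complementary to `rna`, written 5'->3'."""
    return rna.translate(_RNA_COMPLEMENT)[::-1]


def sirna_duplex(target_dna: str, overhang: str = "") -> tuple:
    """Return (sense, antisense), both 5'->3'.

    `overhang` is appended to the 3' end of each strand (hypothetical choice).
    """
    sense = to_rna(target_dna)
    antisense = reverse_complement_rna(sense)
    return sense + overhang, antisense + overhang


if __name__ == "__main__":
    print(sirna_duplex("GATTACA", overhang="UU"))
```

Because both strands are written 5′ to 3′, appending to the end of each string corresponds to a 3′ overhang on that strand.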

Effect of the Invention

A polypeptide of the present invention and a partial peptide of the present invention can be useful, for example, as a biomarker specific for the brain/nerves or specific for nerve cell differentiation, and in developing a substance capable of specifically recognizing a polypeptide of the present invention or a known polypeptide, or a substance capable of comprehensively recognizing both a polypeptide of the present invention and a known polypeptide, and a substance capable of specifically regulating a function of a polypeptide of the present invention or a known polypeptide, or a substance capable of comprehensively regulating functions of both a polypeptide of the present invention and a known polypeptide.
A polynucleotide of the present invention and a partial nucleotide of the present invention can be useful, for example, as a biomarker specific for the brain/nerves or specific for nerve cell differentiation, and in developing a substance capable of specifically recognizing a polynucleotide of the present invention or a known polynucleotide, or a substance capable of comprehensively recognizing both a polynucleotide of the present invention and a known polynucleotide, and a substance capable of specifically regulating the expression of a polypeptide of the present invention or a known polypeptide, or a substance capable of comprehensively regulating the expression of both a polypeptide of the present invention and a known polypeptide.
Related substances of the present invention (e.g., antisense molecules, RNAi-inducing nucleic acids such as siRNAs, aptamers and antibodies, and expression vectors therefor) can be useful as, for example, pharmaceuticals or reagents.
A cell of the present invention can be useful in, for example, producing a polypeptide of the present invention and a partial peptide of the present invention, and an antibody of the present invention. A cell of the present invention can also be useful in developing a pharmaceutical (e.g., a prophylactic or therapeutic drug for a disease based on a nerve cell disorder), identifying a further marker gene specific for the brain/nerves or specific for nerve cell differentiation, and analyzing a mechanism associated with nerve cell differentiation.
An animal of the present invention can be useful in, for example, developing a pharmaceutical, identifying a further marker gene specific for the brain/nerves or specific for nerve cell differentiation, and analyzing a mechanism associated with nerve cell differentiation.
Measuring means (e.g., primer set, nucleic acid probe, antibody, aptamer) and measuring methods of the present invention can be useful in, for example, specific detection and quantitation of a polynucleotide of the present invention or a known polynucleotide, or a polypeptide of the present invention or a known polypeptide, or comprehensive detection and quantitation of both a polynucleotide of the present invention and a known polynucleotide, or both a polypeptide of the present invention and a known polypeptide. These means and methods can also be utilized for determining nerve cell differentiation states and screening for pharmaceuticals, reagents or foods.

BEST MODE FOR CARRYING OUT THE INVENTION

1. Brain/Nerve-Specific Genes

A gene of the present invention can be a gene derived from an optionally chosen mammal. As examples of the mammal, primates and rodents, as well as laboratory animals, domestic animals, working animals, companion animals and the like can be mentioned. In detail, as examples of the mammal, humans, monkeys, rats, mice, rabbits, horses, cattle, goats, sheep, dogs, cats and the like can be mentioned. Preferably, the mammal is a human.
A gene of the present invention is capable or incapable of being expressed specifically in a tissue in the brain. A gene of the present invention is also capable of being expressed at a higher or lower level in a tissue in the brain, compared with a known polynucleotide and/or a known polypeptide. As examples of such tissues in the brain, the cerebrum, cerebral cortex, cerebellum, caudate nucleus, corpus callosum, hippocampus, substantia nigra, thalamus, hypothalamus, subthalamic nucleus, hypophysis, amygdala and the like can be mentioned.
A gene of the present invention is capable or incapable of being expressed specifically in nerve cells. A gene of the present invention is also capable of being expressed at a higher or lower level in nerve cells, compared with a known polynucleotide and/or a known polypeptide. As examples of such nerve cells, nerve cells in the aforementioned tissues can be mentioned.
Hereinafter, the polypeptides and partial peptides thereof, and polynucleotides and partial nucleotides thereof, provided by the present invention, are described.

1.1. Polypeptides and Partial Peptides Thereof

The present invention provides a polypeptide having an amino acid sequence shown by SEQ ID NO:X or substantially the same amino acid sequence thereas (abbreviated as “amino acid sequence shown by SEQ ID NO:X and the like” as required).
“SEQ ID NO:X” denotes the SEQ ID NO of an optionally chosen amino acid sequence disclosed herein. A polypeptide “having” an amino acid sequence shown by SEQ ID NO:X and the like means a polypeptide “consisting of” an amino acid sequence shown by SEQ ID NO:X and the like, and a polypeptide “comprising” the amino acid sequence and the like.
In one embodiment, substantially the same amino acid sequence as an amino acid sequence shown by SEQ ID NO:X can be an amino acid sequence having a specified amino acid sequence identity to the amino acid sequence shown by SEQ ID NO:X. The degree of amino acid sequence identity can be about 90% or more, preferably about 92% or more, more preferably about 95% or more, still more preferably about 96% or more, and most preferably about 97% or more, about 98% or more or about 99% or more. Amino acid sequence identity can be determined by a method known per se. Unless otherwise specified, amino acid sequence identity (%) is calculated by, for example, executing the commands for the maximum matching method, using the DNASIS sequence analytical software (Hitachi Software Engineering). The parameters for the calculation should be used in default settings. Amino acid sequence identity (%) can also be determined, without following the above procedures, using a program in common use in the art (for example, BLAST, FASTA and the like) in the default settings thereof. In another aspect, the identity (%) can be determined using an optionally chosen algorithm publicly known in the art, for example, the algorithms of Needleman et al. (1970) (J. Mol. Biol. 48: 444-453) and Myers and Miller (CABIOS, 1988, 4: 11-17) and the like. The algorithm of Needleman et al. is incorporated in the GAP program in the GCG software package, and the identity (%) can be determined by, for example, using BLOSUM 62 matrix or PAM250 matrix, with a gap weight of 16, 14, 12, 10, 8, 6 or 4, and a length weight of 1, 2, 3, 4, 5 or 6. The algorithm of Myers and Miller is incorporated in the ALIGN program, which is a portion of the GCG sequence alignment software package. When the ALIGN program is utilized to compare amino acid sequences, for example, PAM120 weight residue table, gap length penalty 12, gap penalty 4, can be used. 
For calculating amino acid sequence identity, the method that produces the least value among the above-mentioned methods may be employed.
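As a rough illustration of how an identity (%) value can be obtained from a global alignment of the Needleman et al. type, the following Python sketch aligns two sequences and counts identical aligned positions. It is not the DNASIS, GAP or ALIGN implementation: the unit match/mismatch scores and the linear gap penalty are simplifying assumptions standing in for the BLOSUM 62 or PAM matrices and the gap and length weights mentioned above, and dividing by the alignment length is only one of several conventions for the denominator. The example peptides are hypothetical.

```python
# Simplified global alignment (Needleman-Wunsch style) and percent identity.
# Scoring values are assumptions for illustration, not the parameters above.

def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Return the two gapped strings of one optimal global alignment."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover the alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions divided by alignment length, times 100."""
    if not a or not b:
        raise ValueError("both sequences must be non-empty")
    al_a, al_b = global_align(a, b)
    same = sum(1 for x, y in zip(al_a, al_b) if x == y and x != "-")
    return 100.0 * same / len(al_a)


if __name__ == "__main__":
    print(percent_identity("MKTAYIAKQR", "MKTAYIAKQ"))
```

Under these assumed settings, a sequence pair meets the "about 90% or more" threshold when `percent_identity` returns at least 90; different scoring parameters or a different denominator can shift the value.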
In another embodiment, substantially the same amino acid sequence as an amino acid sequence shown by SEQ ID NO:X can be an amino acid sequence shown by SEQ ID NO:X wherein one or more amino acids have one or more modifications selected from among substitutions, additions, deletions and insertions. The number of amino acids modified is not particularly limited, as far as it is one or more; the number can be, for example, 1 to about 50, preferably 1 to about 30, more preferably 1 to about 20, still more preferably 1 to about 10, and most preferably 1 to about 5 (e.g., 1 or 2).
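One possible way to count such modifications, offered only as a sketch, is the edit (Levenshtein) distance, which gives the minimum number of single-residue substitutions, insertions and deletions that convert one sequence into another. Treating additions as insertions at a terminus, the function names, the default limit of 5 and the example peptides are all assumptions made for illustration.

```python
# Edit distance as a hypothetical measure of the number of modified residues.

def edit_distance(ref: str, var: str) -> int:
    """Minimum number of substitutions, insertions and deletions."""
    prev = list(range(len(var) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, v in enumerate(var, start=1):
            cost = 0 if r == v else 1
            curr.append(min(prev[j] + 1,          # deletion from ref
                            curr[j - 1] + 1,      # insertion into ref
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def within_modification_limit(ref: str, var: str, limit: int = 5) -> bool:
    """True if var differs from ref by 1 to `limit` modifications."""
    return 1 <= edit_distance(ref, var) <= limit


if __name__ == "__main__":
    print(edit_distance("MKTAYIAK", "MKTGYIAK"))
```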
Substantially the same amino acid sequence as an amino acid sequence shown by SEQ ID NO:X may completely retain a characteristic portion thereof (e.g., a portion corresponding to a specific partial polypeptide described below), and may have another portion (e.g., a portion present in a known polypeptide) being substantially the same as the corresponding portion of the amino acid sequence shown by SEQ ID NO:X. Alternatively, substantially the same amino acid sequence as an amino acid sequence shown by SEQ ID NO:X may have a non-characteristic portion thereof being identical to the corresponding portion of the amino acid sequence shown by SEQ ID NO:X, and a characteristic portion thereof being substantially identical to the corresponding portion of the amino acid sequence shown by SEQ ID NO:X.
A polypeptide of the present invention can have a function that is homogeneous or heterogeneous to that of a known polypeptide (e.g., known variant). A polypeptide of the present invention can also have an enhanced or reduced function compared with a known polypeptide (e.g., known variant).
In detail, the novel polypeptides of the brain/nerve-specific genes 1 to 10 are as follows.

1) Brain/Nerve-Specific Gene 1

D-BRACE3000012.1 (SEQ ID NO:18)
D-UTERU2026184.1 (SEQ ID NO:10)
As a known variant of the brain/nerve-specific gene 1, for example, a variant disclosed in an Example (human zinc finger protein 418 (ZNF418); total number of nucleotides in the ORF nucleic acid sequence: 2031; total number of amino acids in the protein: 676; see GenBank accession number: NM_133460.1) has been reported. A known variant of the brain/nerve-specific gene 1 can have a specified function (e.g., transcription regulatory capacity) (see, e.g., Ota, T. et al., Nat. Genet. 36 (1), 40-45 (2004)). Generally, it is known that a plurality of variants resulting from a single locus (splicing variants) have similar functions, although the degree can vary. Therefore, novel variants of the brain/nerve-specific gene 1 can also have these functions.

2) Brain/Nerve-Specific Gene 2

D-NT2RP8004156.1 (SEQ ID NO:43)
As a known variant of the brain/nerve-specific gene 2, for example, a variant disclosed in an Example (human v-akt murine thymoma viral oncogene homolog 1 (AKT1); total number of nucleotides in the ORF nucleic acid sequence: 1443; total number of amino acids in the protein: 480; see GenBank accession number: NM_005163.1) has been reported. It has been reported that known variants of the brain/nerve-specific gene 2 have a specified function (e.g., kinase activity, anti-apoptotic activity, or cell cycle regulatory capacity) (see, e.g., Mirza, A. M. et al., Mol. Cell. Biol. 24 (24), 10868-10881 (2004); Koga, M. et al., Biochem. Biophys. Res. Commun. 324 (1), 321-325 (2004)). Generally, it is known that a plurality of variants resulting from a single locus (splicing variants) have similar functions, although the degree can vary. Therefore, novel variants of the brain/nerve-specific gene 2 can also have these functions.

3) Brain/Nerve-Specific Gene 3

D-NT2RI3005525.1 (SEQ ID NO:58)
As a known variant of the brain/nerve-specific gene 3, for example, a variant disclosed in an Example (human sprouty-related, EVH1 domain-containing 2 (SPRED2); total number of nucleotides in the ORF nucleic acid sequence: 1257; total number of amino acids in the protein: 418; see GenBank accession number: NM_181784.1) has been reported. It has been reported that known variants of the brain/nerve-specific gene 3 have a specified function (e.g., MAP kinase activation inhibitory capacity, tyrosine kinase-mediated Erk activation inhibitory capacity) (see, e.g., Nobuhisa, I. et al., J. Exp. Med. 199 (5), 737-742 (2004); Kato, R. et al., Biochem. Biophys. Res. Commun. 302 (4), 767-772 (2003)). Generally, it is known that a plurality of variants resulting from a single locus (splicing variants) have similar functions, although the degree can vary. Therefore, novel variants of the brain/nerve-specific gene 3 can also have these functions.

4) Brain/Nerve-Specific Gene 4

D-NT2RP8004592.1 (SEQ ID NO:74)
As a known variant of the brain/nerve-specific gene 4, for example, a variant disclosed in an Example (human src kinase associated phosphoprotein 2 (SKAP2); total number of nucleotides in the ORF nucleic acid sequence: 1080; total number of amino acids in the protein: 359; see GenBank accession number: NM_003930.3) has been reported. It has been reported that known variants of the brain/nerve-specific gene 4 have a specified function (e.g., α-synuclein phosphorylation inhibitory capacity) (see, e.g., Takahashi, T. et al., J. Biol. Chem. 278 (43), 42225-42233 (2003)). Generally, it is known that a plurality of variants resulting from a single locus (splicing variants) have similar functions, although the degree can vary. Therefore, novel variants of the brain/nerve-specific gene 4 can also have these functions.

5) Brain/Nerve-Specific Gene 5

D-NT2RI2014164.1 (SEQ ID NO:89)
D-BRAMY2029564.1 (SEQ ID NO:99)
As a known variant of the brain/nerve-specific gene 5, for example, a variant disclosed in an Example (human monoamine oxidase B (MAOB); total number of nucleotides in the ORF nucleic acid sequence: 1563; total number of amino acids in the protein: 520; see GenBank accession number: NM_000898.3) has been reported. It has been reported that known variants of the brain/nerve-specific gene 5 have a specified function (e.g., monoamine oxidase activity) (see, e.g., Bach, A. W. et al., Proc. Natl. Acad. Sci. U.S.A. 85 (13), 4934-4938 (1988)). Generally, it is known that a plurality of variants resulting from a single locus (splicing variants) have similar functions, although the degree can vary. Therefore, novel variants of the brain/nerve-specific gene 5 can also have these functions.

6) Brain/Nerve-Specific Gene 6

D-BRHIP2003515.1 (SEQ ID NO:118)
As a known variant of the brain/nerve-specific gene 6, for example, a variant disclosed in an Example (human tumor protein D52 (TPD52); total number of nucleotides in the ORF nucleic acid sequence: 555; total number of amino acids in the protein: 184; see GenBank accession number: NM_005079.1) has been reported. It has been reported that known variants of the brain/nerve-specific gene 6 have a specified function (e.g., capability of Ca²⁺-dependent interaction with annexin VI) (see, e.g., Tiacci, E. et al., Blood 105 (7), 2812-2820 (2005)). Generally, it is known that a plurality of variants resulting from a single locus (splicing variants) have similar functions, although the degree can vary. Therefore, novel variants of the brain/nerve-specific gene 6 can also have these functions.

7) Brain/Nerve-Specific Gene 7

D-BRACE2044661.1 (SEQ ID NO:133)
As a known variant of the brain/nerve-specific gene 7, for example, a variant disclosed in an Example (human ATPase, Na⁺/K⁺ transporting, β3 polypeptide (ATP1B3); total number of nucleotides in the ORF nucleic acid sequence: 840; total number of amino acids in the protein: 279; see GenBank accession number: NM_001679.2) has been reported. It has been reported that known variants of the brain/nerve-specific gene 7 have a specified function (e.g., ATP hydrolysis activity in the presence of an ion such as Na⁺ or K⁺) (see, e.g., Malik, N. et al., J. Biol. Chem. 271 (37), 22754-22758 (1996)). Generally, it is known that a plurality of variants resulting from a single locus (splicing variants) have similar functions, although the degree can vary. Therefore, novel variants of the brain/nerve-specific gene 7 can also have these functions.

8) Brain/Nerve-Specific Gene 8

D-3NB692002462.1 (SEQ ID NO:152)
D-BRCAN2027778.1 (SEQ ID NO:159)
As a known variant of the brain/nerve-specific gene 8, for example, a variant disclosed in an Example (human mevalonate kinase (MVK); total number of nucleotides in the ORF nucleic acid sequence: 1191; total number of amino acids in the protein: 396; see GenBank accession number: NM_000431.1) has been reported. It has been reported that known variants of the brain/nerve-specific gene 8 have a specified function (e.g., mevalonate kinase activity) (see, e.g., Hogenboom, S. et al., J. Cell. Sci. 117 (PT 4), 631-639 (2004)). Generally, it is known that a plurality of variants resulting from a single locus (splicing variants) have similar functions, although the degree can vary. Therefore, novel variants of the brain/nerve-specific gene 8 can also have these functions.

9) Brain/Nerve-Specific Gene 9

D-NT2RI3001005.1 (SEQ ID NO:184)
D-NT2RI3005261.1 (SEQ ID NO:190)
As a known variant of the brain/nerve-specific gene 9, for example, a variant disclosed in an Example (human solute carrier family 2 (facilitated glucose transporter), member 14 (SLC2A14); total number of nucleotides in the ORF nucleic acid sequence: 1563; total number of amino acids in the protein: 520; see GenBank accession number: NM_153449.2) has been reported. Known variants of the brain/nerve-specific gene 9 can have a specified function (e.g., glucose transportation capacity) (see, e.g., Wu, X. et al., Genomics 80 (6), 553-557 (2002)). Generally, it is known that a plurality of variants resulting from a single locus (splicing variants) have similar functions, although the degree can vary. Therefore, novel variants of the brain/nerve-specific gene 9 can also have these functions.

10) Brain/Nerve-Specific Gene 10

D-OCBBF2010718.1 (SEQ ID NO:207)
D-OCBBF3004194.1 (SEQ ID NO:213)
D-NT2RP8000826.1 (SEQ ID NO:219)
D-NT2RP7007268.1 (SEQ ID NO:225)
D-BRAWH3008172.1 (SEQ ID NO:231)
D-BRAWH3011965.1 (SEQ ID NO:236)
As a known variant of the brain/nerve-specific gene 10, for example, a variant disclosed in an Example (human PDZ domain-containing RING finger 3 (PDZRN3); total number of nucleotides in the ORF nucleic acid sequence: 3201; total number of amino acids in the protein: 1066; see GenBank accession number: NM_015009.1) has been reported. Known variants of the brain/nerve-specific gene 10 can have a specified function (e.g., capability of binding to a cell surface protein such as neuroligin via the PDZ domain thereof) (see, e.g., Meyer, G. et al., Neuropharmacology 47 (5), 724-733 (2004)). Generally, it is known that a plurality of variants resulting from a single locus (splicing variants) have similar functions, although the degree can vary. Therefore, novel variants of the brain/nerve-specific gene 10 can also have these functions.
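The ORF and protein lengths quoted above for the known variants of the brain/nerve-specific genes 1 to 10 can be cross-checked with simple arithmetic. The sketch below assumes that each quoted ORF length counts the stop codon, so that the number of nucleotides equals three times the number of amino acids plus three; this relation is inferred from the quoted figures and is not stated in the text. The dictionary and function names are hypothetical.

```python
# (ORF nucleotides, protein amino acids) as quoted for each known variant.
KNOWN_VARIANTS = {
    "ZNF418": (2031, 676),
    "AKT1": (1443, 480),
    "SPRED2": (1257, 418),
    "SKAP2": (1080, 359),
    "MAOB": (1563, 520),
    "TPD52": (555, 184),
    "ATP1B3": (840, 279),
    "MVK": (1191, 396),
    "SLC2A14": (1563, 520),
    "PDZRN3": (3201, 1066),
}


def orf_matches_protein(orf_nt, protein_aa):
    """True when the ORF length equals 3 nt per residue plus one 3-nt stop codon."""
    return orf_nt == 3 * (protein_aa + 1)


def inconsistent_entries(table):
    """Gene symbols whose quoted lengths do not satisfy the assumed relation."""
    return sorted(gene for gene, (nt, aa) in table.items()
                  if not orf_matches_protein(nt, aa))
```

With the figures quoted above, every entry satisfies the assumed relation, so the list of inconsistent entries is empty.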
A polypeptide of the present invention can be useful in, for example, developing a substance capable of specifically recognizing a polypeptide of the present invention, a substance incapable of specifically recognizing a polypeptide of the present invention, or a substance capable of comprehensively recognizing both a polypeptide of the present invention and a known polypeptide, and in developing a substance capable of specifically regulating a function of a polypeptide of the present invention, a substance incapable of specifically regulating a function of a polypeptide of the present invention, or a substance capable of comprehensively regulating functions of both a polypeptide of the present invention and a known polypeptide.
The present invention also provides a partial peptide.
“A partial peptide” consists of at least 6, preferably at least 8, more preferably at least 10, still more preferably at least 12, and most preferably at least 15, consecutive amino acid residues selected from among subject polypeptides, that can have a specified utility (e.g., use as an immunogenic or antigenic peptide, a functional peptide having a particular domain and the like).
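As a minimal sketch of this definition, the candidate partial peptides of a given length can be enumerated as consecutive windows over a subject polypeptide. The function name and the example sequence are hypothetical, and the sketch enforces only the broadest recited lower limit of 6 residues; it does not assess whether a window has the specified utility.

```python
MIN_PARTIAL_PEPTIDE_LENGTH = 6  # broadest lower limit recited above


def partial_peptides(polypeptide, length=MIN_PARTIAL_PEPTIDE_LENGTH):
    """Return every run of `length` consecutive residues, in N- to C-terminal order."""
    if length < MIN_PARTIAL_PEPTIDE_LENGTH:
        raise ValueError("a partial peptide consists of at least 6 residues")
    return [polypeptide[i:i + length]
            for i in range(len(polypeptide) - length + 1)]
```

A 10-residue sequence yields five 6-residue windows and a single 10-residue window; a sequence shorter than the requested length yields none.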
“An insert amino acid sequence of a polypeptide of the present invention” refers to an amino acid sequence that is incorporated in a polypeptide of the present invention (e.g., novel variant), but absent from a known polypeptide (e.g., known variant). Meanwhile, “an insert amino acid sequence of a known polypeptide” refers to an amino acid sequence that is incorporated in a known polypeptide (e.g., known variant), but absent from a polypeptide of the present invention (e.g., novel variant). These insert amino acid sequences are obvious from the disclosure herein.
“A deleted amino acid sequence of a polypeptide of the present invention” refers to an amino acid sequence that is absent from a polypeptide of the present invention (e.g., novel variant), but incorporated in a known polypeptide (e.g., known variant). Meanwhile, “a deleted amino acid sequence of a known polypeptide” refers to an amino acid sequence that is absent from a known polypeptide (e.g., known variant), but incorporated in a polypeptide of the present invention (e.g., novel variant). These deleted amino acid sequences are obvious from the disclosure herein. “A deleted amino acid sequence of a polypeptide of the present invention” can have the same definition as that for “an insert amino acid sequence of a known polypeptide”; “a deleted amino acid sequence of a known polypeptide” can have the same definition as that for “an insert amino acid sequence of a polypeptide of the present invention”.
A partial peptide of the present invention can be a) a specific partial peptide of a polypeptide of the present invention, capable of distinguishing a polypeptide of the present invention from a known polypeptide (abbreviated as “specific partial peptide A” as required), b) a specific partial peptide of a known polypeptide, capable of distinguishing a known polypeptide from a polypeptide of the present invention (abbreviated as “specific partial peptide B” as required), or c) a partial peptide common to both a polypeptide of the present invention and a known polypeptide (abbreviated as “shared partial peptide” as required). For these particular partial peptides, there appears a motivation for preparing them or utilizing them as markers on the basis of the present inventors' findings; however, without these findings, there is no motivation for preparing them or utilizing them as markers. Being partial peptides specific for the polypeptides encoded by the brain/nerve-specific genes 1 to 10, the specific partial peptides A and B are abbreviated as “specific partial peptides of the present invention” or “specific partial peptides” as required.
The specific partial peptide A of the present invention is a partial peptide that is present only in a polypeptide having an amino acid sequence shown by SEQ ID NO:X and the like, and that is not present in any known polypeptide. As examples of the specific partial peptide A, i) a partial peptide consisting of an insert amino acid sequence of a polypeptide of the present invention or a partial amino acid sequence thereof, ii) a partial peptide consisting of an insert amino acid sequence of a polypeptide of the present invention or a terminal partial amino acid sequence thereof and an adjacent amino acid sequence thereof, and iii) a partial peptide consisting of an amino acid sequence wherein both amino acid sequences present on the N-terminal side and C-terminal side relative to an insert amino acid sequence of a known polypeptide are linked together, formed as a result of exon deletion, can be mentioned.
The specific partial peptide A of i) above consists of an insert amino acid sequence of a polypeptide of the present invention or a partial amino acid sequence thereof. Such partial amino acid sequences are obvious from the disclosure herein.
The specific partial peptide A of ii) above consists of an insert amino acid sequence of a polypeptide of the present invention or a terminal partial amino acid sequence thereof and an adjacent amino acid sequence thereof. As such terminal partial amino acid sequences, an amino acid sequence corresponding to an N-terminal portion of an insert amino acid sequence of a polypeptide of the present invention (abbreviated as “N-terminal partial amino acid sequence A” as required), and an amino acid sequence corresponding to a C-terminal portion of an insert amino acid sequence of a polypeptide of the present invention (abbreviated as “C-terminal partial amino acid sequence A” as required) can be mentioned. As such adjacent amino acid sequences, an amino acid sequence present on the N-terminal side relative to an insert amino acid sequence of a polypeptide of the present invention (abbreviated as “N-terminal adjacent amino acid sequence A” as required), and an amino acid sequence present on the C-terminal side relative to an insert amino acid sequence of a polypeptide of the present invention (abbreviated as “C-terminal adjacent amino acid sequence A” as required) can be mentioned. 
Therefore, the specific partial peptide A of ii) above can be a partial peptide consisting of an amino acid sequence spanning from a specified position of the N-terminal adjacent amino acid sequence A to a specified position of an insert amino acid sequence of a polypeptide of the present invention, a partial peptide consisting of an amino acid sequence spanning from a specified position of an insert amino acid sequence of a polypeptide of the present invention to a specified position of the C-terminal adjacent amino acid sequence A, or a partial peptide consisting of an amino acid sequence comprising the whole insert amino acid sequence of a polypeptide of the present invention, spanning from a specified position of the N-terminal adjacent amino acid sequence A to a specified position of the C-terminal adjacent amino acid sequence A. The number of amino acid residues in the insert amino acid sequence (or N-terminal or C-terminal partial amino acid sequence A) or adjacent amino acid sequence (or N-terminal or C-terminal adjacent amino acid sequence A), contained in the specific partial peptide A of ii) above, is not particularly limited, as far as it is a number that ensures the specificity of the specific partial peptide A of ii) above; the number can be, for example, at least 3, preferably at least 4, more preferably at least 5, still more preferably at least 6, and most preferably at least 7, 8, 9 or 10. Such terminal partial amino acid sequences and such adjacent amino acid sequences are obvious from the disclosure herein.
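The boundary-spanning peptides described above can be sketched as windows that cross a junction while keeping a minimum number of residues on each side. The sketch assumes 0-based indexing, with `boundary` being the index of the first residue on the C-terminal side of the junction; the function name, the parameter names and the example sequence are hypothetical. The same windowing can be applied to a junction formed by exon deletion.

```python
def boundary_peptides(polypeptide, boundary, length, min_each_side=3):
    """Peptides of `length` residues that span `boundary`, retaining at least
    `min_each_side` residues on both the N-terminal and C-terminal side."""
    if length < 2 * min_each_side:
        raise ValueError("length is too short to keep the minimum on both sides")
    peptides = []
    first = boundary - (length - min_each_side)  # most residues on the N-terminal side
    last = boundary - min_each_side              # fewest residues on the N-terminal side
    for start in range(first, last + 1):
        end = start + length
        if start >= 0 and end <= len(polypeptide):
            peptides.append(polypeptide[start:end])
    return peptides


# Hypothetical example: adjacent sequence + insert + adjacent sequence.
n_adjacent, insert, c_adjacent = "MKTAYI", "WHWHWH", "QRQISF"
example = n_adjacent + insert + c_adjacent
n_junction = len(n_adjacent)  # index of the first insert residue
```

In this example, `boundary_peptides(example, n_junction, 8)` yields three 8-residue peptides, each containing at least 3 residues of the N-terminal adjacent sequence and at least 3 residues of the insert.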
The specific partial peptide A of iii) above is a partial peptide not present in a known polypeptide, consisting of an amino acid sequence wherein both amino acid sequences present on the N-terminal side and C-terminal side relative to an insert amino acid sequence of a known polypeptide are linked together (in a polypeptide of the present invention, these amino acid sequences are linked together as a result of exon deletion). The number of amino acid residues in each amino acid sequence present on the N-terminal side and C-terminal side relative to an insert amino acid sequence of a known polypeptide, contained in the specific partial peptide A of iii) above, is not particularly limited, as far as it is a number that ensures the specificity of the specific partial peptide A of iii) above; the number can be, for example, at least 3, preferably at least 4, more preferably at least 5, still more preferably at least 6, and most preferably at least 7, 8, 9 or 10.
The specific partial peptide A of the present invention can be useful as, for example, a target for specifically detecting a polypeptide of the present invention, and as a marker specific for the brain/nerves or specific for nerve differentiation. The specific partial peptide A of the present invention can also be useful in developing a substance capable of specifically recognizing a polypeptide of the present invention, or a substance incapable of specifically recognizing a polypeptide of the present invention, or developing a substance capable of specifically regulating a function of a polypeptide of the present invention, or a substance incapable of specifically regulating a function of a polypeptide of the present invention.
The specific partial peptide B of the present invention is a partial peptide that is present only in a known polypeptide, and that is not present in a polypeptide having an amino acid sequence shown by SEQ ID NO:X and the like. As examples of the specific partial peptide B, i) a partial peptide consisting of an insert amino acid sequence of a known polypeptide or a partial amino acid sequence thereof, ii) a partial peptide consisting of an insert amino acid sequence of a known polypeptide or a terminal partial amino acid sequence thereof and an adjacent amino acid sequence thereof, and iii) a partial peptide consisting of an amino acid sequence wherein both amino acid sequences present on the N-terminal side and C-terminal side relative to an insert amino acid sequence of a polypeptide of the present invention are linked together, formed as a result of exon deletion, can be mentioned.
The specific partial peptide B of i) above consists of an insert amino acid sequence of a known polypeptide or a partial amino acid sequence thereof. Such partial amino acid sequences are obvious from the disclosure herein.
The specific partial peptide B of ii) above consists of an insert amino acid sequence of a known polypeptide or a terminal partial amino acid sequence thereof and an adjacent amino acid sequence thereof. As such terminal partial amino acid sequences, an amino acid sequence corresponding to an N-terminal portion of an insert amino acid sequence of a known polypeptide (abbreviated as “N-terminal partial amino acid sequence B” as required), and an amino acid sequence corresponding to a C-terminal portion of an insert amino acid sequence of a known polypeptide (abbreviated as “C-terminal partial amino acid sequence B” as required) can be mentioned. As such adjacent amino acid sequences, an amino acid sequence present on the N-terminal side relative to an insert amino acid sequence of a known polypeptide (abbreviated as “N-terminal adjacent amino acid sequence B” as required), and an amino acid sequence present on the C-terminal side relative to an insert amino acid sequence of a known polypeptide (abbreviated as “C-terminal adjacent amino acid sequence B” as required) can be mentioned. Therefore, the specific partial peptide B of ii) above can be a partial peptide consisting of an amino acid sequence spanning from a specified position of the N-terminal adjacent amino acid sequence B to a specified position of an insert amino acid sequence of a known polypeptide, a partial peptide consisting of an amino acid sequence spanning from a specified position of an insert amino acid sequence of a known polypeptide to a specified position of the C-terminal adjacent amino acid sequence B, or a partial peptide consisting of an amino acid sequence comprising the whole insert amino acid sequence of a known polypeptide, spanning from a specified position of the N-terminal adjacent amino acid sequence B to a specified position of the C-terminal adjacent amino acid sequence B. 
The number of amino acid residues in the insert amino acid sequence (or N-terminal or C-terminal partial amino acid sequence B) or adjacent amino acid sequence (or N-terminal or C-terminal adjacent amino acid sequence B), contained in the specific partial peptide B of ii) above, is not particularly limited, as far as it is a number that ensures the specificity of the specific partial peptide B of ii) above; the number can be, for example, at least 3, preferably at least 4, more preferably at least 5, still more preferably at least 6, and most preferably at least 7, 8, 9 or 10. Such terminal partial amino acid sequences and such adjacent amino acid sequences are obvious from the disclosure herein.
The specific partial peptide B of iii) above is a partial peptide that is not present in a polypeptide of the present invention, consisting of an amino acid sequence wherein both amino acid sequences present on the N-terminal side and C-terminal side relative to an insert amino acid sequence of a polypeptide of the present invention are linked together (in a known polypeptide, these amino acid sequences are linked together as a result of exon deletion). The number of amino acid residues in each amino acid sequence present on the N-terminal side and C-terminal side relative to the insert amino acid sequence of a polypeptide of the present invention, contained in the specific partial peptide B of iii) above, is not particularly limited, as far as it is a number that ensures the specificity of the specific partial peptide B of iii) above; the number can be, for example, at least 3, preferably at least 4, more preferably at least 5, still more preferably at least 6, and most preferably at least 7, 8, 9 or 10, respectively.
The specific partial peptide B of the present invention can be useful as, for example, a target for specifically detecting a known polypeptide, and as a marker specific for the brain/nerves or specific for nerve differentiation, or as a marker not specific therefor. The specific partial peptide B of the present invention can also be useful in developing a substance capable of specifically recognizing a known polypeptide, or a substance incapable of specifically recognizing a known polypeptide, or developing a substance capable of specifically regulating a function of a known polypeptide, or a substance incapable of specifically regulating a function of a known polypeptide.
A shared partial peptide of the present invention can be a non-specific partial peptide that is present in both a polypeptide of the present invention and a known polypeptide. Such partial peptides are obvious from the disclosure herein. A shared partial peptide of the present invention can be useful as, for example, a target for comprehensively detecting both a polypeptide of the present invention and a known polypeptide, and as a marker specific for the brain/nerves or specific for nerve differentiation, or as a marker not specific therefor. A shared partial peptide of the present invention can also be useful in developing a substance capable of comprehensively recognizing both a polypeptide of the present invention and a known polypeptide, or a substance capable of comprehensively regulating functions of both a polypeptide of the present invention and a known polypeptide.
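The three classes of partial peptide described above (specific partial peptide A, specific partial peptide B and shared partial peptide) can be illustrated by comparing fixed-length windows of a novel polypeptide and a known polypeptide. This is a simplified sketch: it assumes a single known polypeptide, exact string matching and a fixed window length, and the sequences and names are hypothetical rather than taken from the Sequence Listing.

```python
def windows(polypeptide, length):
    """Set of all runs of `length` consecutive residues."""
    return {polypeptide[i:i + length]
            for i in range(len(polypeptide) - length + 1)}


def classify_partial_peptides(novel, known, length=6):
    """Split windows into those found only in the novel polypeptide (A),
    only in the known polypeptide (B), or in both (shared)."""
    in_novel, in_known = windows(novel, length), windows(known, length)
    return {
        "specific_A": in_novel - in_known,
        "specific_B": in_known - in_novel,
        "shared": in_novel & in_known,
    }


# Hypothetical pair: the novel variant carries the insert "WHW" after "MKTAYI".
known_polypeptide = "MKTAYIAKQRQISF"
novel_polypeptide = "MKTAYIWHWAKQRQISF"
classes = classify_partial_peptides(novel_polypeptide, known_polypeptide)
```

Here a window overlapping the insert, such as "AYIWHW", falls in class A; the window "AYIAKQ", which spans the point where the insert is missing from the known polypeptide, falls in class B; and "MKTAYI" is shared.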
A polypeptide of the present invention or a specific partial peptide thereof may be fused with a polypeptide consisting of a heterologous amino acid sequence. As such a polypeptide, a polypeptide that facilitates purification or solubilization can be mentioned. In detail, as such polypeptides, histidine tag, maltose-binding protein (MBP), glutathione-S-transferase (GST), calmodulin-binding peptide (CBP), FLAG, and the Fc region of IgG molecule can be mentioned.
A polypeptide of the present invention and a partial peptide thereof may be provided in the form of a salt. As examples of the salt, salts with inorganic bases (e.g., alkali metals such as sodium and potassium; alkaline earth metals such as calcium and magnesium; aluminum, ammonium), salts with organic bases (e.g., trimethylamine, triethylamine, pyridine, picoline, ethanolamine, diethanolamine, triethanolamine, dicyclohexylamine, N,N-dibenzylethylenediamine), salts with inorganic acids (e.g., hydrochloric acid, hydrobromic acid, nitric acid, sulfuric acid, phosphoric acid), salts with organic acids (e.g., formic acid, acetic acid, trifluoroacetic acid, fumaric acid, oxalic acid, tartaric acid, maleic acid, citric acid, succinic acid, malic acid, methanesulfonic acid, benzenesulfonic acid, p-toluenesulfonic acid), salts with basic amino acids (e.g., arginine, lysine, ornithine) or salts with acidic amino acids (e.g., aspartic acid, glutamic acid) and the like can be mentioned.
A polypeptide of the present invention and a partial peptide thereof can be prepared by a method known per se. For example, a polypeptide of the present invention and a partial peptide thereof 1) may be recovered from an expression site, 2) may be recovered from a transformant described below, which expresses a polypeptide of the present invention and a partial peptide thereof, or a culture supernatant thereof, 3) may be synthesized using a cell-free system based on a rabbit reticulocyte lysate, wheat germ lysate, Escherichia coli lysate and the like, or 4) may be synthesized organochemically (e.g., solid phase synthesis). A polypeptide of the present invention and a partial peptide thereof are purified as appropriate by methods based on differences in solubility, such as salting-out and solvent precipitation; methods based mainly on differences in molecular weight, such as dialysis, ultrafiltration, gel filtration, and SDS-polyacrylamide gel electrophoresis; methods based on differences in electric charge, such as ion exchange chromatography; methods based on specific affinity, such as affinity chromatography and use of antibody; methods based on differences in hydrophobicity, such as reverse phase high performance liquid chromatography; methods based on differences in isoelectric point, such as isoelectric focusing; combinations thereof; and the like.

1.2. Polynucleotides and Partial Nucleotides Thereof

The present invention provides a polynucleotide having a nucleic acid sequence shown by SEQ ID NO:Y, or the nucleic acid sequence Y1 or the nucleic acid sequence Y2, or substantially the same nucleic acid sequence thereas (abbreviated as “nucleic acid sequence shown by SEQ ID NO:Y and the like” as required).
“SEQ ID NO:Y” denotes the SEQ ID NO of an optionally chosen nucleic acid sequence disclosed herein. A polynucleotide “having” SEQ ID NO:Y and the like means a polynucleotide “consisting of” SEQ ID NO:Y and the like, or a polynucleotide “comprising” the nucleic acid sequence and the like.
“The nucleic acid sequence Y1” denotes a nucleic acid sequence corresponding to the coding portion (that is, the entire open reading frame (ORF) or a portion thereof) in a nucleic acid sequence shown by SEQ ID NO:Y. In other words, “the nucleic acid sequence Y1” denotes a nucleic acid sequence shown by SEQ ID NO:Y when the nucleic acid sequence shown by SEQ ID NO:Y consists of a nucleic acid sequence corresponding to the coding portion only, and it denotes a nucleic acid sequence corresponding to the coding portion only when the nucleic acid sequence shown by SEQ ID NO:Y comprises nucleic acid sequences corresponding to both the coding portion and the non-coding portion.
“The nucleic acid sequence Y2” denotes a nucleic acid sequence corresponding to a non-coding portion (e.g., 5′ or 3′ noncoding region) in a nucleic acid sequence shown by SEQ ID NO:Y. In other words, “the nucleic acid sequence Y2” denotes a nucleic acid sequence shown by SEQ ID NO:Y when the nucleic acid sequence shown by SEQ ID NO:Y consists of a nucleic acid sequence corresponding to the non-coding portion only, and it denotes a nucleic acid sequence corresponding to the non-coding portion only when the nucleic acid sequence shown by SEQ ID NO:Y comprises nucleic acid sequences corresponding to both the non-coding portion and the coding portion.
Therefore, a nucleic acid sequence denoted by “SEQ ID NO:Y” can be denoted by any one of i) the nucleic acid sequence Y1 (when the nucleic acid sequence shown by SEQ ID NO:Y as a whole is a nucleic acid sequence corresponding to the coding portion), ii) the nucleic acid sequence Y2 (when the nucleic acid sequence shown by SEQ ID NO:Y as a whole is a nucleic acid sequence corresponding to the non-coding portion), or iii) a nucleic acid sequence comprising the nucleic acid sequence Y1 and the nucleic acid sequence Y2 (when the nucleic acid sequence shown by SEQ ID NO:Y comprises nucleic acid sequences corresponding to the coding portion and the non-coding portion).
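The relationship between SEQ ID NO:Y, the nucleic acid sequence Y1 and the nucleic acid sequence Y2 can be sketched as a split of a sequence at the coordinates of its coding portion. The sketch assumes 0-based, half-open coordinates and a single contiguous coding portion; the function name, the returned keys and the short example sequence are hypothetical.

```python
def split_coding(sequence, coding=None):
    """Split a nucleic acid sequence into Y1 (coding portion) and Y2 (non-coding
    portions), and report which of cases i) to iii) applies.

    `coding` is a (start, end) pair, or None when the whole sequence is non-coding.
    """
    if coding is None:
        return {"Y1": "", "Y2": [sequence], "case": "ii"}
    start, end = coding
    if not 0 <= start < end <= len(sequence):
        raise ValueError("coding coordinates fall outside the sequence")
    # Keep only non-empty flanks: the 5' and/or 3' non-coding regions.
    non_coding = [part for part in (sequence[:start], sequence[end:]) if part]
    return {"Y1": sequence[start:end],
            "Y2": non_coding,
            "case": "iii" if non_coding else "i"}
```

For the hypothetical sequence "GGCATGAAATAGCC" with a coding portion at positions 3 to 12, Y1 is "ATGAAATAG" and Y2 consists of the flanking "GGC" and "CC", which corresponds to case iii).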
In one embodiment, substantially the same nucleic acid sequence as a nucleic acid sequence shown by SEQ ID NO:Y, or the nucleic acid sequence Y1 or the nucleic acid sequence Y2 can be a nucleic acid sequence having a specified sequence identity to the nucleic acid sequence shown by SEQ ID NO:Y, or the nucleic acid sequence Y1 or the nucleic acid sequence Y2. The degree of nucleic acid sequence identity can be about 90% or more, preferably about 92% or more, more preferably about 95% or more, still more preferably about 96% or more, and most preferably about 97% or more, about 98% or more or about 99% or more. Nucleic acid sequence identity can be determined by a method known per se. For example, nucleic acid sequence identity (%) can be determined by the same method as that described above for amino acid sequence identity (%).
In another embodiment, substantially the same nucleic acid sequence as a nucleic acid sequence shown by SEQ ID NO:Y or the nucleic acid sequence Y1 or the nucleic acid sequence Y2 can be the nucleic acid sequence shown by SEQ ID NO:Y or the nucleic acid sequence Y1 or the nucleic acid sequence Y2, wherein one or more nucleotides have one or more modifications selected from among substitutions, additions, deletions and insertions. The number of nucleotides modified is not particularly limited, as far as it is one or more, and the number can be, for example, 1 to about 100, preferably 1 to about 70, more preferably 1 to about 50, still more preferably 1 to about 30, and most preferably 1 to about 20, 1 to about 10 or 1 to about 5 (e.g., 1 or 2).
In still another embodiment, substantially the same nucleic acid sequence as a nucleic acid sequence shown by SEQ ID NO:Y, or the nucleic acid sequence Y1 or the nucleic acid sequence Y2 can be a polynucleotide that can be hybridized to a nucleic acid sequence complementary to the nucleic acid sequence shown by SEQ ID NO:Y, or the nucleic acid sequence Y1 or the nucleic acid sequence Y2 under highly stringent conditions. Highly stringent hybridization conditions can be set with reference to reported conditions (see, e.g., Current Protocols in Molecular Biology, John Wiley & Sons, 6.3.1-6.3.6 (1999)). For example, as highly stringent hybridization conditions, hybridization in 6×SSC (sodium chloride/sodium citrate) at 45° C., followed by one or more washes in 0.2×SSC/0.1% SDS at 50 to 65° C., can be mentioned.
Substantially the same nucleic acid sequence as a nucleic acid sequence shown by SEQ ID NO:Y, or the nucleic acid sequence Y1 or the nucleic acid sequence Y2 may completely retain a characteristic portion thereof (e.g., a portion corresponding to a specific partial nucleotide described below), and may have another portion (e.g., a portion present in a known polynucleotide) being substantially the same as the corresponding portion of the nucleic acid sequence shown by SEQ ID NO:Y, or the nucleic acid sequence Y1 or the nucleic acid sequence Y2. Alternatively, substantially the same nucleic acid sequence as a nucleic acid sequence shown by SEQ ID NO:Y, or the nucleic acid sequence Y1 or the nucleic acid sequence Y2 may have a non-characteristic portion thereof being the same as the corresponding portion of the nucleic acid sequence shown by SEQ ID NO:Y, or the nucleic acid sequence Y1 or the nucleic acid sequence Y2, and a characteristic portion thereof being substantially the same as the corresponding portion of the nucleic acid sequence shown by SEQ ID NO:Y, or the nucleic acid sequence Y1 or the nucleic acid sequence Y2.
A polynucleotide of the present invention is capable of encoding a polypeptide of the present invention. Therefore, a polynucleotide of the present invention can be a polynucleotide such that the polypeptide encoded thereby is capable of being functionally equivalent to a polypeptide of the present invention.
In detail, for the brain/nerve-specific genes 1 to 10, the polynucleotides, the SEQ ID NO:Y thereof, and the ORF-corresponding portions thereof (Ya-th to Yb-th nucleotide residues in the nucleic acid sequence shown by SEQ ID NO:Y) are as follows.

1) Brain/Nerve-Specific Gene 1

D-BRACE3000012.1 (SEQ ID NO:16 or SEQ ID NO:17, and 465th to 2558th)
D-UTERU2026184.1 (SEQ ID NO:8 or SEQ ID NO:9, and 191st to 2119th)

2) Brain/Nerve-Specific Gene 2

D-NT2RP8004156.1 (SEQ ID NO:41 or SEQ ID NO:42, and 131st to 1387th)

3) Brain/Nerve-Specific Gene 3

D-NT2RI3005525.1 (SEQ ID NO:56 or SEQ ID NO:57, and 45th to 1292nd)

4) Brain/Nerve-Specific Gene 4

D-NT2RP8004592.1 (SEQ ID NO:72 or SEQ ID NO:73, and 620th to 1183rd)

5) Brain/Nerve-Specific Gene 5

D-NT2RI2014164.1 (SEQ ID NO:87 or SEQ ID NO:88, and 162nd to 1397th)
D-BRAMY2029564.1 (SEQ ID NO:97 or SEQ ID NO:98, and 143rd to 1657th)

6) Brain/Nerve-Specific Gene 6

D-BRHIP2003515.1 (SEQ ID NO:116 or SEQ ID NO:117, and 84th to 707th)

7) Brain/Nerve-Specific Gene 7

D-BRACE2044661.1 (SEQ ID NO:131 or SEQ ID NO:132, and 297th to 878th)

8) Brain/Nerve-Specific Gene 8

D-3NB692002462.1 (SEQ ID NO:150 or SEQ ID NO:151, and 343rd to 951st)
D-BRCAN2027778.1 (SEQ ID NO:157 or SEQ ID NO:158, and 52nd to 1086th)

9) Brain/Nerve-Specific Gene 9

D-NT2RI3001005.1 (SEQ ID NO:182 or SEQ ID NO:183, and 22nd to 1629th)
D-NT2RI3005261.1 (SEQ ID NO:188 or SEQ ID NO:189, and 22nd to 1629th)

10) Brain/Nerve-Specific Gene 10

D-OCBBF2010718.1 (SEQ ID NO:205 or SEQ ID NO:206, and 144th to 2495th)
D-OCBBF3004194.1 (SEQ ID NO:211 or SEQ ID NO:212, and 129th to 2480th)
D-NT2RP8000826.1 (SEQ ID NO:217 or SEQ ID NO:218, and 95th to 2461st)
D-NT2RP7007268.1 (SEQ ID NO:223 or SEQ ID NO:224, and 95th to 2461st)
D-BRAWH3008172.1 (SEQ ID NO:229 or SEQ ID NO:230, and 281st to 2452nd)
D-BRAWH3011965.1 (SEQ ID NO:234 or SEQ ID NO:235, and 300th to 1574th)
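The ORF positions listed above can be handled as 1-based, inclusive coordinates. The following is a minimal sketch under that assumption; the helper names are hypothetical, only a few of the listed clones are copied in as examples, and whether a listed span includes the stop codon is not stated here, so the codon count below simply divides the span length by three.

```python
# (clone ID, first ORF residue, last ORF residue), copied from the list above.
# Coordinates are assumed to be 1-based and inclusive.
ORF_COORDINATES = [
    ("D-BRACE3000012.1", 465, 2558),
    ("D-UTERU2026184.1", 191, 2119),
    ("D-NT2RP8004156.1", 131, 1387),
    ("D-BRHIP2003515.1", 84, 707),
]


def orf_length(start: int, end: int) -> int:
    """Number of nucleotide residues from the start-th to the end-th residue."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def codon_count(start: int, end: int) -> int:
    """Number of complete codons in the span; the length must be a multiple of 3."""
    length = orf_length(start, end)
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length // 3


def extract_orf(sequence: str, start: int, end: int) -> str:
    """Return the start-th to end-th residues of a sequence (1-based, inclusive)."""
    if end > len(sequence):
        raise ValueError("end position lies beyond the sequence")
    orf_length(start, end)  # validates the coordinates
    return sequence[start - 1:end]


for clone_id, start, end in ORF_COORDINATES:
    print(clone_id, orf_length(start, end), codon_count(start, end))
```

Under this reading, each span copied above has a length divisible by three, as expected for an ORF-corresponding portion.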
A polynucleotide of the present invention can be useful in, for example, developing a substance capable of specifically recognizing a polynucleotide of the present invention, a substance incapable of specifically recognizing a polynucleotide of the present invention, or a substance capable of comprehensively recognizing both a polynucleotide of the present invention and a known polynucleotide, and developing a substance capable of specifically regulating the expression of a polypeptide of the present invention, a substance incapable of specifically regulating the expression of a polypeptide of the present invention, or a substance capable of comprehensively regulating the expression of both a polypeptide of the present invention and a known polypeptide.
The present invention also provides a partial nucleotide.
“A partial nucleotide” consists of at least 15, preferably at least 16, more preferably at least 18, still more preferably at least 20, and most preferably at least 22, 23, 24 or 25, consecutive nucleotide residues selected from among subject polynucleotides, that can have a specified utility (e.g., use as a probe, a primer, a polynucleotide that encodes an immunogenic or antigenic peptide, a polynucleotide that encodes a functional peptide having a particular domain and the like).
“An insert nucleic acid sequence of a polynucleotide of the present invention” refers to a nucleic acid sequence that is incorporated in a polynucleotide of the present invention (e.g., novel variant), but absent from a known polynucleotide (e.g., known variant). Meanwhile, “an insert nucleic acid sequence of a known polynucleotide” refers to a nucleic acid sequence that is incorporated in a known polynucleotide (e.g., known variant), but absent from a polynucleotide of the present invention (e.g., novel variant). These insert nucleic acid sequences are obvious from the disclosure herein.
“A deletion nucleic acid sequence of a polynucleotide of the present invention” refers to a nucleic acid sequence that is absent from a polynucleotide of the present invention (e.g., novel variant), but present in a known polynucleotide (e.g., known variant). Meanwhile, “a deletion nucleic acid sequence of a known polynucleotide” refers to a nucleic acid sequence that is absent from a known polynucleotide (e.g., known variant), but present in a polynucleotide of the present invention (e.g., novel variant). These deletion nucleic acid sequences are obvious from the disclosure herein. “A deletion nucleic acid sequence of a polynucleotide of the present invention” can have the same definition as that for “an insert nucleic acid sequence of a known polynucleotide”; “a deletion nucleic acid sequence of a known polynucleotide” can have the same definition as that for “an insert nucleic acid sequence of a polynucleotide of the present invention”.
A partial nucleotide of the present invention can be a) a specific partial nucleotide of a polynucleotide of the present invention, capable of distinguishing a polynucleotide of the present invention from a known polynucleotide (abbreviated as “specific partial nucleotide A” as required), b) a specific partial nucleotide of a known polynucleotide, capable of distinguishing a known polynucleotide from a polynucleotide of the present invention (abbreviated as “specific partial nucleotide B” as required), or c) a partial nucleotide common to both a polynucleotide of the present invention and a known polynucleotide (abbreviated as “shared partial nucleotide” as required). The motivation for preparing these particular partial nucleotides or utilizing them as markers arises from the present inventors' findings; without these findings, there is no motivation for preparing them or utilizing them as markers. Being partial nucleotides specific for polynucleotides encoded by brain/nerve-specific genes 1 to 10, the specific partial nucleotides A and B are abbreviated as “specific partial nucleotides of the present invention” or “specific partial nucleotides” as required.
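The three categories above can be illustrated by comparing the consecutive-residue fragments of two sequences. The following is a minimal sketch, assuming a single novel sequence and a single known sequence given as plain strings; the function names and the toy sequences are hypothetical, and a real assessment of the specific partial nucleotide A would have to compare against all known polynucleotides rather than one.

```python
def fragments(sequence: str, k: int) -> set:
    """All fragments of k consecutive residues found in the sequence."""
    seq = sequence.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify_partial_nucleotides(novel: str, known: str, k: int = 15) -> dict:
    """Split k-residue fragments into categories A, B and shared.

    A:      present only in the novel sequence
    B:      present only in the known sequence
    shared: present in both sequences
    """
    novel_fragments = fragments(novel, k)
    known_fragments = fragments(known, k)
    return {
        "A": novel_fragments - known_fragments,
        "B": known_fragments - novel_fragments,
        "shared": novel_fragments & known_fragments,
    }


# Toy example: the novel sequence carries an insert (GGG) absent from the known one.
result = classify_partial_nucleotides("AAACCCGGGTTT", "AAACCCTTT", k=4)
print(sorted(result["A"]))
print(sorted(result["B"]))
print(sorted(result["shared"]))
```

In this toy example, category B consists of the fragments spanning the junction that exists only where the insert is missing, while category A consists of fragments that overlap the insert.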
The specific partial nucleotide A of the present invention is a partial nucleotide that is present only in a polynucleotide having a nucleic acid sequence shown by SEQ ID NO:Y and the like, and that is not present in any known polynucleotide. As examples of the specific partial nucleotide A, i) a partial nucleotide consisting of an insert nucleic acid sequence of a polynucleotide of the present invention or a partial nucleic acid sequence thereof, ii) a partial nucleotide consisting of an insert nucleic acid sequence of a polynucleotide of the present invention or a terminal partial nucleic acid sequence thereof and an adjacent nucleic acid sequence thereof, and iii) a partial nucleotide consisting of a nucleic acid sequence wherein both nucleic acid sequences present on the 5′ and 3′ sides relative to an insert nucleic acid sequence of a known polynucleotide are linked together, formed as a result of exon deletion, can be mentioned.
The specific partial nucleotide A of i) above consists of an insert nucleic acid sequence of a polynucleotide of the present invention or a partial nucleic acid sequence thereof. Such partial nucleic acid sequences are obvious from the disclosure herein.
The specific partial nucleotide A of ii) above consists of an insert nucleic acid sequence of a polynucleotide of the present invention or a terminal partial nucleic acid sequence thereof and an adjacent nucleic acid sequence thereof. As such terminal partial nucleic acid sequences, a nucleic acid sequence corresponding to a 5′-terminal portion in an insert nucleic acid sequence of a polynucleotide of the present invention (abbreviated as “5′-terminal partial nucleic acid sequence A” as required), and a nucleic acid sequence corresponding to a 3′-terminal portion in an insert nucleic acid sequence of a polynucleotide of the present invention (abbreviated as “3′-terminal partial nucleic acid sequence A” as required) can be mentioned. As such adjacent nucleic acid sequences, a nucleic acid sequence present on the 5′ side relative to an insert nucleic acid sequence of a polynucleotide of the present invention (abbreviated as “5′ adjacent nucleic acid sequence A” as required), and a nucleic acid sequence present on the 3′ side relative to an insert nucleic acid sequence of a polynucleotide of the present invention (abbreviated as “3′ adjacent nucleic acid sequence A” as required) can be mentioned.
Therefore, the specific partial nucleotide A of ii) above can be a partial nucleotide consisting of a nucleic acid sequence spanning from a specified position of the 5′ adjacent nucleic acid sequence A to a specified position of an insert nucleic acid sequence of a polynucleotide of the present invention, a partial nucleotide consisting of a nucleic acid sequence spanning from a specified position of an insert nucleic acid sequence of a polynucleotide of the present invention to a specified position of the 3′ adjacent nucleic acid sequence A, or a partial nucleotide consisting of a nucleic acid sequence comprising the whole insert nucleic acid sequence of a polynucleotide of the present invention, spanning from a specified position of the 5′ adjacent nucleic acid sequence A to a specified position of the 3′ adjacent nucleic acid sequence A. The number of nucleotide residues in the insert nucleic acid sequence (or 5′-terminal or 3′-terminal partial nucleic acid sequence A) or adjacent nucleic acid sequence (or 5′-terminal or 3′-terminal adjacent nucleic acid sequence A), contained in the specific partial nucleotide A of ii) above, is not particularly limited, as far as it is a number that ensures the specificity of the specific partial nucleotide A of ii) above; the number can be, for example, at least 3, preferably at least 4, more preferably at least 5, still more preferably at least 6, and most preferably at least 7, 8, 9 or 10. Such terminal partial nucleic acid sequences and such adjacent nucleic acid sequences are obvious from the disclosure herein.
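A fragment spanning from the 5′ adjacent nucleic acid sequence into the insert nucleic acid sequence can be enumerated as sketched below. This is an illustration only: the function name, the toy sequences and the default of at least 3 residues on each side of the boundary are assumptions chosen to mirror the numbers mentioned above, and the sketch does not itself verify that a fragment is specific.

```python
def boundary_spanning_fragments(five_prime_adjacent: str, insert: str,
                                length: int, min_each_side: int = 3) -> list:
    """Fragments of `length` residues spanning the adjacent/insert boundary.

    Each returned fragment contains at least `min_each_side` residues of the
    5' adjacent sequence and at least `min_each_side` residues of the insert.
    """
    joined = five_prime_adjacent + insert
    boundary = len(five_prime_adjacent)  # index of the first insert residue
    spanning = []
    for start in range(len(joined) - length + 1):
        residues_left = boundary - start            # residues from the adjacent side
        residues_right = start + length - boundary  # residues from the insert side
        if residues_left >= min_each_side and residues_right >= min_each_side:
            spanning.append(joined[start:start + length])
    return spanning


print(boundary_spanning_fragments("AAAAAA", "CCCCCC", length=8))
```

The same function can be applied to the 3′ boundary by passing the insert as the first argument and the 3′ adjacent nucleic acid sequence as the second.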
The specific partial nucleotide A of iii) above is a partial nucleotide not present in a known polynucleotide, consisting of a nucleic acid sequence wherein both nucleic acid sequences present on the 5′ and 3′ sides relative to an insert nucleic acid sequence of a known polynucleotide are linked together (in a polynucleotide of the present invention, these nucleic acid sequences are linked together as a result of exon deletion). The number of nucleotide residues in each nucleic acid sequence present on the 5′ and 3′ sides relative to an insert nucleic acid sequence of a known polynucleotide, contained in the specific partial nucleotide A of iii) above, is not particularly limited, as far as it is a number that ensures the specificity of the specific partial nucleotide A of iii) above; the number can be, for example, at least 3, preferably at least 4, more preferably at least 5, still more preferably at least 6, and most preferably at least 7, 8, 9 or 10, respectively.
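The junction formed by exon deletion can be illustrated as follows. In this minimal sketch, the known polynucleotide is modeled as 5′ side + insert + 3′ side and the polynucleotide of the present invention as 5′ side + 3′ side; the function names and toy sequences are hypothetical, and the specificity check is a plain substring test against one known sequence, which is an assumption made for illustration.

```python
def deletion_junction(five_prime_side: str, three_prime_side: str,
                      residues_each_side: int = 10) -> str:
    """Link the last residues of the 5' side to the first residues of the 3' side."""
    n = residues_each_side
    if n < 1:
        raise ValueError("at least one residue is required on each side")
    if len(five_prime_side) < n or len(three_prime_side) < n:
        raise ValueError("flanking sequences are shorter than requested")
    return five_prime_side[-n:] + three_prime_side[:n]


def is_absent_from(fragment: str, other_sequence: str) -> bool:
    """True when the fragment does not occur in the other sequence."""
    return fragment.upper() not in other_sequence.upper()


five_side, insert, three_side = "AAACCC", "GGGGGG", "TTTAAA"
known = five_side + insert + three_side  # known variant retains the insert
novel = five_side + three_side           # novel variant lacks the insert

junction = deletion_junction(five_side, three_side, residues_each_side=3)
print(junction, is_absent_from(junction, known), is_absent_from(junction, novel))
```

With very short flanks, the junction fragment may occur elsewhere by chance, which is why the number of residues on each side has to be large enough to ensure specificity.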
The specific partial nucleotide A of the present invention can be useful as, for example, a target for specifically detecting a polynucleotide of the present invention, and as a biomarker specific for the brain/nerves or specific for nerve differentiation. The specific partial nucleotide A of the present invention can also be useful in developing a substance capable of specifically recognizing a polynucleotide of the present invention, or a substance incapable of specifically recognizing a polynucleotide of the present invention, or developing a substance capable of specifically regulating the expression of a polypeptide of the present invention, or a substance incapable of specifically regulating the expression of a polypeptide of the present invention.
The specific partial nucleotide B of the present invention is a partial nucleotide that is present only in a known polynucleotide, and not present in a polynucleotide having a nucleic acid sequence shown by SEQ ID NO:Y and the like. As examples of the specific partial nucleotide B, i) a partial nucleotide consisting of an insert nucleic acid sequence of a known polynucleotide or a partial nucleic acid sequence thereof, ii) a partial nucleotide consisting of an insert nucleic acid sequence of a known polynucleotide or a terminal partial nucleic acid sequence thereof and an adjacent nucleic acid sequence thereof, and iii) a partial nucleotide consisting of a nucleic acid sequence wherein both nucleic acid sequences present on the 5′ and 3′ sides relative to an insert nucleic acid sequence of a polynucleotide of the present invention are linked together, formed as a result of exon deletion, can be mentioned.
The specific partial nucleotide B of i) above consists of an insert nucleic acid sequence of a known polynucleotide or a partial nucleic acid sequence thereof. Such partial nucleic acid sequences are obvious from the disclosure herein.
The specific partial nucleotide B of ii) above consists of an insert nucleic acid sequence of a known polynucleotide or a terminal partial nucleic acid sequence thereof and an adjacent nucleic acid sequence thereof. As such terminal partial nucleic acid sequences, a nucleic acid sequence corresponding to a 5′-terminal portion in an insert nucleic acid sequence of a known polynucleotide (abbreviated as “5′-terminal partial nucleic acid sequence B” as required), and a nucleic acid sequence corresponding to a 3′-terminal portion in an insert nucleic acid sequence of a known polynucleotide (abbreviated as “3′-terminal partial nucleic acid sequence B” as required) can be mentioned. As such adjacent nucleic acid sequences, a nucleic acid sequence present on the 5′ side relative to an insert nucleic acid sequence of a known polynucleotide (abbreviated as “5′ adjacent nucleic acid sequence B” as required), and a nucleic acid sequence present on the 3′ side relative to an insert nucleic acid sequence of a known polynucleotide (abbreviated as “3′ adjacent nucleic acid sequence B” as required) can be mentioned. Therefore, the specific partial nucleotide B of ii) above can be a partial nucleotide consisting of a nucleic acid sequence spanning from a specified position of the 5′ adjacent nucleic acid sequence B to a specified position of an insert nucleic acid sequence of a known polynucleotide, a partial nucleotide consisting of a nucleic acid sequence spanning from a specified position of an insert nucleic acid sequence of a known polynucleotide to a specified position of the 3′ adjacent nucleic acid sequence B, or a partial nucleotide consisting of a nucleic acid sequence comprising the whole insert nucleic acid sequence of a known polynucleotide, spanning from a specified position of the 5′ adjacent nucleic acid sequence B to a specified position of the 3′ adjacent nucleic acid sequence B. 
The number of nucleotide residues in the insert nucleic acid sequence (or 5′-terminal or 3′-terminal partial nucleic acid sequence B) or adjacent nucleic acid sequence (or 5′-terminal or 3′-terminal adjacent nucleic acid sequence B), contained in the specific partial nucleotide B of ii) above, is not particularly limited, as far as it is a number that ensures the specificity of the specific partial nucleotide B of ii) above; the number can be, for example, at least 3, preferably at least 4, more preferably at least 5, still more preferably at least 6, and most preferably at least 7, 8, 9 or 10. Such terminal partial nucleic acid sequences and such adjacent nucleic acid sequences are obvious from the disclosure herein.
The specific partial nucleotide B of iii) above is a partial nucleotide not present in a polynucleotide of the present invention, consisting of a nucleic acid sequence wherein both nucleic acid sequences present on the 5′ and 3′ sides relative to an insert nucleic acid sequence of a polynucleotide of the present invention are linked together (in a known polynucleotide, these nucleic acid sequences are linked together as a result of exon deletion). The number of nucleotide residues in each nucleic acid sequence present on the 5′ and 3′ sides relative to an insert nucleic acid sequence of a polynucleotide of the present invention, contained in the specific partial nucleotide B of iii) above, is not particularly limited, as far as it is a number that ensures the specificity of the specific partial nucleotide B of iii) above, and the number can be, for example, at least 3, preferably at least 4, more preferably at least 5, still more preferably at least 6, and most preferably at least 7, 8, 9 or 10, respectively.
The specific partial nucleotide B of the present invention can be useful as, for example, a target for specifically detecting a known polynucleotide, and as a biomarker specific for the brain/nerves or specific for nerve differentiation, or as a marker not specific therefor. The specific partial nucleotide B of the present invention can also be useful in developing a substance capable of specifically recognizing a known polynucleotide, or a substance incapable of specifically recognizing a known polynucleotide, or developing a substance capable of specifically regulating the expression of a known polypeptide, or a substance incapable of specifically regulating the expression of a known polypeptide.
A shared partial nucleotide of the present invention can be a nonspecific partial nucleotide that is present in both a polynucleotide of the present invention and a known polynucleotide. Such partial nucleotides are obvious from the disclosure herein. A shared partial nucleotide of the present invention can be useful as, for example, a target for comprehensively detecting both a polynucleotide of the present invention and a known polynucleotide, and as a biomarker specific for the brain/nerves or specific for nerve differentiation, or as a marker not specific therefor. A shared partial nucleotide of the present invention can also be useful in developing a substance capable of comprehensively recognizing both a polynucleotide of the present invention and a known polynucleotide, or a substance capable of comprehensively regulating the expression of both a polypeptide of the present invention and a known polypeptide.
A polynucleotide of the present invention and a partial nucleotide thereof are capable of encoding a polypeptide of the present invention or a partial peptide of the present invention. A polynucleotide of the present invention or a partial nucleotide of the present invention may be fused with a polynucleotide consisting of a heterologous nucleic acid sequence. As such heterologous nucleic acid sequences, those that encode the above-described heterologous amino acid sequences can be mentioned.
A polynucleotide of the present invention and a partial nucleotide thereof may be provided in the form of a salt. As the salt, those described above can be mentioned.
A polynucleotide of the present invention and a partial nucleotide thereof can be prepared by a method known per se. For example, the same nucleic acid sequence as a nucleic acid sequence shown by SEQ ID NO:Y, or the nucleic acid sequence Y1 or the nucleic acid sequence Y2 can be cloned using a specified tissue or cell. Moreover, substantially the same nucleic acid sequence as a nucleic acid sequence shown by SEQ ID NO:Y or the nucleic acid sequence Y1 or the nucleic acid sequence Y2 can be prepared by introducing a mutation into a polynucleotide cloned as described above. As examples of the method of mutagenesis, methods such as the synthetic oligonucleotide site-directed mutagenesis method, the gapped duplex method, a method of randomly introducing point mutations (for example, treatment with nitrous acid or sulfurous acid), the cassette mutation method, the linker scanning method, and the mismatch primer method can be mentioned.

2. Related Substances

The present invention provides a series of related substances that can be developed on the basis of a polypeptide of the present invention and a partial peptide of the present invention, and a polynucleotide of the present invention and a partial nucleotide of the present invention. The related substances of the present invention described below can be useful as, for example, pharmaceuticals. When a related substance of the present invention is a pharmaceutical, the target disease can be, for example, a disease based on a nerve cell disorder. In detail, as such diseases, Parkinson's disease, Huntington's chorea, Alzheimer's disease, ischemic cerebral diseases (e.g., cerebral stroke), epilepsy, brain trauma, motor nerve disease, multiple sclerosis, amyotrophic lateral sclerosis, diseases caused by nerve toxic disorders and the like can be mentioned.

2.1. Antisense Molecules

The present invention provides antisense molecules.
The type of the antisense molecule may be a DNA or an RNA, or may be a DNA/RNA chimera. The antisense molecule may be one having a phosphodiester bond of the natural type, or a modified nucleotide of the thiophosphate type (P═O in phosphate bond replaced with P═S), 2′-O-methyl type or the like, which are stable to degrading enzymes. Other important factors in designing the antisense molecule include increases in water-solubility and cell membrane permeability and the like; these can also be addressed by choosing appropriate dosage forms such as those using liposomes or microspheres. The length of the antisense molecule is not particularly limited, as far as the molecule is capable of specifically hybridizing to the transcription product; the antisense molecule may be of a sequence of about 15 nucleotides for the shortest, or of a sequence complementary to the entire sequence of the transcription product for the longest. Considering the ease of synthesis, the antigenicity issue and the like, for example, oligonucleotides consisting of about 15 nucleotides or more, preferably about 15 to about 100 nucleotides, and more preferably about 18 to about 50 nucleotides, can be mentioned. Furthermore, the antisense molecule may be one capable of not only inhibiting the translation of the transcription product by hybridizing thereto, but also binding to a double-stranded DNA to form a triple strand (triplex) to inhibit the transcription into mRNA.
An antisense molecule of the present invention can comprise a nucleic acid sequence complementary to a nucleic acid sequence corresponding to a partial nucleotide of the present invention (e.g., specific partial nucleotides A and B of the present invention, a shared partial nucleotide of the present invention). Therefore, an antisense molecule of the present invention can be an antisense molecule specific for a polynucleotide of the present invention, an antisense molecule specific for a known polynucleotide, or an antisense molecule common to both a polynucleotide of the present invention and a known polynucleotide. An antisense molecule of the present invention can be useful in specifically suppressing the expression of a polypeptide of the present invention or a known polypeptide, or comprehensively suppressing the expression of both a polypeptide of the present invention and a known polypeptide.
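A nucleic acid sequence complementary to a chosen target region can be derived as sketched below. This is a minimal illustration, assuming an unmodified DNA antisense oligonucleotide and a 1-based start position on the target; the function names, the default length of 20 nucleotides and the hard lower bound of 15 nucleotides are assumptions taken loosely from the lengths mentioned above, and the sketch ignores chemical modifications, secondary structure and off-target matches.

```python
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def antisense_dna(target: str) -> str:
    """Reverse complement of a DNA or RNA target, written in DNA letters."""
    target = target.upper()
    unexpected = set(target) - set("ACGTU")
    if unexpected:
        raise ValueError(f"unexpected residues: {sorted(unexpected)}")
    return target.translate(_COMPLEMENT)[::-1]


def design_antisense(transcript: str, start: int, length: int = 20) -> str:
    """Antisense oligonucleotide against `length` residues from the start-th residue."""
    if length < 15:
        raise ValueError("about 15 nucleotides is taken here as the shortest length")
    segment = transcript[start - 1:start - 1 + length]
    if start < 1 or len(segment) != length:
        raise ValueError("target region lies outside the transcript")
    return antisense_dna(segment)


print(antisense_dna("ATGGCC"))
```

To obtain an antisense molecule specific for one variant, the target region would be chosen within a specific partial nucleotide of that variant; a region within a shared partial nucleotide would give an antisense molecule common to both.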

2.2. RNAi-Inducing Nucleic Acids

The present invention provides RNAi-inducing nucleic acids.
An RNAi-inducing nucleic acid refers to a polynucleotide, preferably an RNA, capable of inducing the RNA interference (RNAi) effect when transferred into cells. The RNAi effect refers to the phenomenon in which a double-stranded RNA comprising the same nucleic acid sequence as that of mRNA, or a partial sequence thereof, suppresses the expression of the mRNA. To obtain the RNAi effect, it is preferable to use, for example, a double-stranded RNA having the same nucleic acid sequence as that of a target mRNA comprising at least 20 continuous bases (or a partial sequence thereof). The double-stranded structure may be configured by different strands, or may be a double strand conferred by a stem loop structure of a single RNA. As examples of the RNAi-inducing nucleic acid, siRNA, miRNA and the like can be mentioned, and siRNA is preferable. The siRNA is not particularly limited, as far as it can induce RNAi, and the siRNA can be, for example, 21 to 27 bases long, preferably 21 to 25 bases long.
An RNAi-inducing nucleic acid of the present invention can be a double-stranded polynucleotide configured by a sense strand consisting of a nucleic acid sequence corresponding to a partial nucleotide of the present invention (e.g., specific partial nucleotides A and B of the present invention, a shared partial nucleotide of the present invention), and an antisense strand consisting of a nucleic acid sequence complementary thereto. An RNAi-inducing nucleic acid of the present invention may also have an overhang at the 5′ terminus and/or 3′ terminus of one or both of the sense strand and the antisense strand. The overhang can be one formed as a result of the addition of one to several (e.g., 1, 2 or 3) bases at the 5′ terminus and/or 3′ terminus of the sense strand and/or antisense strand. An RNAi-inducing nucleic acid of the present invention can be an RNAi-inducing nucleic acid specific for a polynucleotide of the present invention, an RNAi-inducing nucleic acid specific for a known polynucleotide, or an RNAi-inducing nucleic acid common to both a polynucleotide of the present invention and a known polynucleotide. An RNAi-inducing nucleic acid of the present invention can be useful in specifically suppressing the expression of a polypeptide of the present invention or a known polypeptide, or comprehensively suppressing the expression of both a polypeptide of the present invention and a known polypeptide.
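The sense strand, antisense strand and overhang described above can be assembled as sketched below. This is a minimal illustration only: the function names, the 21-base core, and the two-base "UU" overhang added to the 3′ terminus of both strands are assumptions chosen so that each strand falls within the 21 to 27 base range mentioned above; the sketch performs no efficacy or off-target evaluation.

```python
_RNA_COMPLEMENT = str.maketrans("AUGC", "UACG")


def to_rna(sequence: str) -> str:
    """Write a DNA or RNA sequence in upper-case RNA letters."""
    return sequence.upper().replace("T", "U")


def rna_reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence."""
    return rna.translate(_RNA_COMPLEMENT)[::-1]


def make_sirna(target: str, start: int, core_length: int = 21,
               overhang: str = "UU") -> tuple:
    """Return (sense, antisense) strands, both written 5' to 3'.

    `start` is the 1-based position of the first targeted residue; the
    overhang is appended to the 3' terminus of each strand (an assumption).
    """
    if not 1 <= len(overhang) <= 3:
        raise ValueError("overhang is taken here as 1 to 3 bases")
    sense_core = to_rna(target[start - 1:start - 1 + core_length])
    if start < 1 or len(sense_core) != core_length:
        raise ValueError("target region lies outside the sequence")
    antisense_core = rna_reverse_complement(sense_core)
    return sense_core + overhang, antisense_core + overhang


sense, antisense = make_sirna("ATGGCCATTGTAATGGGCCGCTGAAAGGG", start=1)
print(sense)
print(antisense)
```

Choosing the targeted region within a specific partial nucleotide A, a specific partial nucleotide B or a shared partial nucleotide determines which of the three kinds of RNAi-inducing nucleic acid results.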

2.3. Aptamers

The present invention provides aptamers.
An aptamer refers to a polynucleotide having a binding activity (or inhibitory activity) on a specified target molecule. An aptamer of the present invention can be an RNA, a DNA, a modified nucleotide or a mixture thereof. An aptamer of the present invention can also be in a linear or circular form. The length of the aptamer is not particularly limited, and can normally be about 16 to about 200 nucleotides, and can be, for example, about 100 nucleotides or less, preferably about 50 nucleotides or less, and more preferably about 40 nucleotides or less. The length of an aptamer of the present invention may be, for example, about 18, about 20, about 25 or about 30 nucleotides or more. To increase the bindability, stability, drug delivery properties and the like, the aptamer may be one wherein a sugar residue (e.g., ribose) of each nucleotide is modified. As examples of a portion of the sugar residue modified, ones wherein the oxygen atom at the 2′-position, 3′-position and/or 4′-position of the sugar residue is replaced with another atom and the like can be mentioned. As examples of types of modifications, fluorination, O-alkylation, O-allylation, S-alkylation, S-allylation and amination can be mentioned (see, e.g., Sproat et al., (1991) Nucleic Acids Res. 19, 733-738; Cotton et al., (1991) Nucleic Acids Res. 19, 2629-2635). The aptamer may be one wherein a purine or pyrimidine is altered. As examples of such alterations, alteration of the 5-position pyrimidine, alteration of the 8-position purine, alteration by an exocyclic amine, substitution by 4-thiouridine, and substitution by 5-bromo or 5-iodo-uracil can be mentioned. The phosphate group contained in an aptamer of the present invention may be altered to make it resistant to nuclease and hydrolysis. For example, the phosphate group may be substituted by a thioate, a dithioate or an amidate.
An aptamer can be prepared according to available reports (for example, Ellington et al., (1990) Nature, 346, 818-822; Tuerk et al., (1990) Science, 249, 505-510).
An aptamer of the present invention is capable of binding specifically to a polypeptide of the present invention or a known polypeptide, or both a polypeptide of the present invention and a known polypeptide, via a region corresponding to a partial peptide of the present invention. Therefore, an aptamer of the present invention can be an aptamer specific for a polypeptide of the present invention, an aptamer specific for a known polypeptide, or an aptamer common to both a polypeptide of the present invention and a known polypeptide. Such a specific aptamer can be prepared by, for example, selecting (a) a polynucleotide that binds to a polypeptide of the present invention or a specific partial peptide thereof, and that does not bind to a known polypeptide, (b) a polynucleotide that binds to a known polypeptide or a specific partial peptide thereof, and that does not bind to a polypeptide of the present invention, or (c) a polynucleotide that binds to both a polypeptide of the present invention and a known polypeptide or to a shared partial peptide of the present invention, by the SELEX method.

2.4. Antibodies

The present invention provides antibodies.
An antibody of the present invention may be a polyclonal antibody (antiserum) or a monoclonal antibody, and can be prepared by a commonly known immunological technique. Although the monoclonal antibody may be of any isotype, IgG, IgM, IgA, IgD, IgE, or the like, IgG or IgM is preferable.
For example, the polyclonal antibody can be acquired by administering the above-described antigen (as required, it may be prepared as a complex crosslinked to a carrier protein such as bovine serum albumin or KLH (keyhole limpet hemocyanin)), along with a commercially available adjuvant (for example, Freund's complete or incomplete adjuvant), to an animal subcutaneously or intraperitoneally about 2 to 4 times at intervals of 2 to 3 weeks (the antibody titer of partially drawn serum is determined by a known antigen-antibody reaction and its elevation is confirmed in advance), collecting whole blood about 3 to about 10 days after final immunization, and purifying the antiserum. As the animal to receive the antigen, mammals such as rats, mice, rabbits, goats, guinea pigs, and hamsters can be mentioned.
The monoclonal antibody can also be prepared by a cell fusion method. For example, the above-described antigen, along with a commercially available adjuvant, is subcutaneously or intraperitoneally administered to a mouse 2 to 4 times, and 3 days after final administration, the spleen or lymph nodes are collected, and leukocytes are collected. These leukocytes and myeloma cells (for example, NS-1, P3X63Ag8 and the like) are cell-fused to obtain a hybridoma that produces a monoclonal antibody against the factor. This cell fusion may be performed by the PEG method or the voltage pulse method. A hybridoma that produces the desired monoclonal antibody can be selected by detecting an antibody that binds specifically to the antigen, in the culture supernatant, using a widely known EIA or RIA method and the like. Cultivation of the hybridoma that produces the monoclonal antibody can be performed in vitro, or in vivo such as in ascitic fluid of a mouse or rat, preferably a mouse, and the antibody can be acquired from the culture supernatant of the hybridoma and the ascitic fluid of the animal.
An antibody of the present invention may also be a chimeric antibody, a humanized antibody or a human antibody.
A chimeric antibody means a monoclonal antibody derived from immunoglobulins of animal species having mutually different variable regions and constant regions. For example, a chimeric antibody can be a mouse/human chimeric monoclonal antibody whose variable region is a variable region derived from a mouse immunoglobulin, and whose constant region is a constant region derived from a human immunoglobulin. The constant region derived from a human immunoglobulin has an amino acid sequence unique depending on the isotype, such as IgG, IgM, IgA, IgD, and IgE, and the constant region of a recombinant chimeric monoclonal antibody in the present invention may be the constant region of a human immunoglobulin belonging to any isotype. The constant region of human IgG is preferable.
A chimeric antibody can be prepared by a method known per se. For example, a mouse/human chimeric monoclonal antibody can be prepared according to available reports (e.g., Jikken Igaku (extra issue), Vol. 6, No. 10, 1988 and JP-B-HEI-3-73280). In detail, a chimeric antibody can be prepared by inserting the C_H gene acquired from the DNA that encodes a human immunoglobulin (C gene that encodes H chain constant region) downstream of the active V_H gene acquired from the DNA that encodes a mouse monoclonal antibody isolated from a hybridoma that produces the mouse monoclonal antibody (rearranged VDJ gene that encodes H chain variable region), and inserting the C_L gene acquired from the DNA that encodes a human immunoglobulin (C gene that encodes L chain constant region) downstream of the active V_L gene acquired from the DNA that encodes a mouse monoclonal antibody isolated from the hybridoma (rearranged VJ gene that encodes L chain variable region), in a way that allows the expression of each gene, into one or separate expression vectors, transforming a host cell with the expression vector, and culturing the transformant cell.
A humanized antibody means a monoclonal antibody prepared by a gene engineering technique, for example, a human type monoclonal antibody wherein a portion or all of the complementarity-determining regions of the hypervariable region thereof is derived from a mouse monoclonal antibody, and the framework regions of the variable region thereof and the constant region thereof are derived from a human immunoglobulin. The complementarity-determining regions of the hypervariable region are three regions that are present in the hypervariable region in the variable region of the antibody, and that bind directly and complementarily to the antigen (Complementarity-determining regions; CDR1, CDR2, CDR3), and the framework regions of the variable region are four relatively highly conserved regions located before and after the three complementarity-determining regions (Framework; FR1, FR2, FR3, FR4). In other words, a humanized antibody means, for example, a monoclonal antibody wherein all regions other than a portion or all of the complementarity-determining regions of the hypervariable region of a mouse monoclonal antibody are replaced with the corresponding regions of a human immunoglobulin.
A humanized antibody can be prepared by a method known per se. For example, a recombinant humanized antibody derived from a mouse monoclonal antibody can be prepared according to available reports (e.g., Japanese Patent Application Kohyo Publication No. HEI-4-506458 and JP-A-SHO-62-296890). In detail, from a hybridoma that produces a mouse monoclonal antibody, at least one mouse H chain CDR gene and at least one mouse L chain CDR gene corresponding to the mouse H chain CDR gene are isolated, and from a human immunoglobulin gene, the human H chain gene that encodes all regions other than the human H chain CDR corresponding to the mouse H chain CDR and the human L chain gene that encodes all regions other than the human L chain CDR corresponding to the mouse L chain CDR are isolated. The isolated mouse H chain CDR gene and human H chain gene are introduced into an appropriate expression vector in an expressible manner; likewise, the mouse L chain CDR gene and the human L chain gene are introduced into another appropriate expression vector in an expressible manner. Alternatively, the mouse H chain CDR gene/human H chain gene and the mouse L chain CDR gene/human L chain gene can be introduced into the same expression vector in an expressible manner. By transforming a host cell with the expression vector thus prepared to obtain a cell that produces a humanized antibody, and culturing the cell, a desired humanized antibody can be obtained from the culture supernatant.
A human antibody means an antibody wherein all regions comprising the variable regions and constant regions of the H chain and L chain constituting an immunoglobulin are derived from the gene that encodes a human immunoglobulin.
A human antibody can be prepared by a method known per se. For example, a human antibody can be produced by immunologically sensitizing with an antigen a transgenic animal prepared by incorporating at least a human immunoglobulin gene into a gene locus of a non-human mammal such as a mouse, in the same way as the above-described method of preparing a polyclonal antibody or a monoclonal antibody. For example, a transgenic mouse that produces a human antibody can be prepared according to available reports (Nature Genetics, Vol. 15, p. 146-156, 1997; Nature Genetics, Vol. 7, p. 13-21, 1994; Japanese Patent Application Kohyo Publication No. HEI-4-504365; International Patent Application Publication WO94/25585; Nature, Vol. 368, p. 856-859, 1994; and Japanese Patent Application Kohyo Publication No. HEI-6-500233).
An antibody of the present invention can also be a portion of an antibody of the present invention described above (e.g., monoclonal antibody). As examples of such antibodies, F(ab′)₂, Fab′, Fab, and Fv fragments, and single-chain antibodies can be mentioned.
An antibody of the present invention is capable of binding specifically to a polypeptide of the present invention or a known polypeptide, or both a polypeptide of the present invention and a known polypeptide, via a region corresponding to a partial peptide of the present invention. Therefore, an antibody of the present invention can be an antibody specific for a polypeptide of the present invention, an antibody specific for a known polypeptide, or an antibody common to both a polypeptide of the present invention and a known polypeptide. Such a specific antibody can be prepared by, for example, using a specific partial peptide of a polypeptide of the present invention, a specific partial peptide of a known polypeptide, or a shared partial peptide of the present invention as an antigen.

2.5. Expression Vectors

The present invention provides expression vectors for the above-described substances.
An expression vector of the present invention can comprise a polynucleotide that encodes a desired polypeptide to be expressed or a desired polynucleotide to be expressed, and a promoter operably linked to the polynucleotide. “A promoter is operably linked to a polynucleotide” means that the promoter is bound to a polynucleotide that encodes the gene in a way such that allows the expression of the polynucleotide under the control thereof, or the expression of the polypeptide encoded by the polynucleotide.
The backbone for an expression vector of the present invention is not particularly limited, as far as it allows production of a desired substance in a specified cell; for example, plasmid vectors and viral vectors can be mentioned. When an expression vector is used as a pharmaceutical, as vectors suitable for administration to mammals, viral vectors such as adenovirus, retrovirus, adeno-associated virus, herpesvirus, vaccinia virus, poxvirus, poliovirus, Sindbis virus, and Sendai virus can be mentioned.
When a prokaryotic cell is used as the host cell, an expression vector allowing the prokaryotic cell to be utilized as the host cell can be used. Such an expression vector can comprise, for example, elements such as a promoter-operator region, an initiation codon, a polynucleotide that encodes a polypeptide of the present invention or a partial peptide thereof, a stop codon, a terminator region and a replication origin. A promoter-operator region for expressing a polypeptide of the present invention in a bacterium comprises a promoter, an operator and a Shine-Dalgarno (SD) sequence. These elements may be ones known per se.
When a eukaryotic cell is used as the host cell, an expression vector allowing the eukaryotic cell to be utilized as the host cell can be used. In this case, the promoter used is not particularly limited, as far as it is capable of functioning in eukaryotic organisms such as mammals. When the expression of a polypeptide is desired, as examples of such promoters, viral promoters such as SV40-derived early promoter, cytomegalovirus LTR, Rous sarcoma virus LTR, MoMuLV-derived LTR, and adenovirus-derived early promoter, and mammalian constituent protein gene promoters such as β-actin gene promoter, PGK gene promoter, and transferrin gene promoter, and the like can be mentioned. When the expression of a polynucleotide is desired, the promoter can be a pol III promoter (e.g., tRNA promoter, U6 promoter, H1 promoter).
An expression vector of the present invention can further comprise sites for transcription initiation and transcription termination, and a ribosome-binding site required for translation in the transcription region, a replication origin and a selection marker gene (e.g., ampicillin, tetracycline, kanamycin, spectinomycin, erythromycin, chloramphenicol) and the like. An expression vector of the present invention can be prepared by a method known per se (see, e.g., Molecular Cloning, 2nd edition, Sambrook et al., Cold Spring Harbor Lab. Press (1989)).

3. Compositions

The present invention provides compositions comprising the above-described substances.
A composition of the present invention can comprise, in addition to the above-described substances, an optionally chosen carrier, for example, a pharmaceutically acceptable carrier. As examples of the pharmaceutically acceptable carrier, excipients such as sucrose, starch, mannitol, sorbitol, lactose, glucose, cellulose, talc, calcium phosphate, and calcium carbonate, binders such as cellulose, methylcellulose, hydroxypropylcellulose, polypropylpyrrolidone, gelatin, gum arabic, polyethylene glycol, sucrose, and starch, disintegrants such as starch, carboxymethylcellulose, hydroxypropylstarch, sodium-glycol-starch, sodium hydrogen carbonate, calcium phosphate, and calcium citrate, lubricants such as magnesium stearate, Aerosil, talc, and sodium lauryl sulfate, flavoring agents such as citric acid, menthol, glycyrrhizin ammonium salt, glycine, and orange flour, preservatives such as sodium benzoate, sodium hydrogen sulfite, methylparaben, and propylparaben, stabilizing agents such as citric acid, sodium citrate, and acetic acid, suspending agents such as methyl cellulose, polyvinylpyrrolidone, and aluminum stearate, dispersing agents such as surfactants, diluents such as water, physiological saline, and orange juice, and base waxes such as cacao butter, polyethylene glycol, and kerosene, and the like can be mentioned, which, however, are not to be construed as limiting.
Preparations suitable for oral administration are liquids prepared by dissolving an effective amount of a substance in a diluent such as water, physiological saline or orange juice, capsules, sachets or tablets containing an effective amount of a substance in the form of solids or granules, suspensions prepared by suspending an effective amount of a substance in an appropriate dispersant, emulsions prepared by dispersing and emulsifying, in an appropriate dispersant, a solution in which an effective amount of a substance is dissolved, and the like.
Preparations suitable for parenteral administration (for example, intravenous injection, subcutaneous injection, intramuscular injection, topical injection, intraperitoneal administration and the like) are aqueous and non-aqueous isotonic sterile injectable liquids, which may contain an antioxidant, a buffer solution, a bacteriostatic agent, an isotonizing agent and the like. Aqueous and non-aqueous sterile suspensions can also be mentioned, which may contain a suspending agent, a solubilizer, a thickening agent, a stabilizer, an antiseptic and the like. These preparations can be enclosed in containers such as ampoules and vials for unit dosage or a plurality of dosages. It is also possible to freeze-dry the active ingredient and a pharmaceutically acceptable carrier, and store the preparation in a state that may be dissolved or suspended in an appropriate sterile vehicle just before use.
Although the dosage of a composition of the present invention varies depending on the activity and kind of active ingredient, seriousness of illness, recipient animal species, the recipient's drug tolerance, body weight, age, and the like, it is normally about 0.001 to about 500 mg/kg as the amount of active ingredient per day for an adult.
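As a worked illustration of the per-kilogram range stated above, the daily amount of active ingredient scales linearly with body weight. The sketch below only multiplies the stated bounds (about 0.001 to about 500 mg/kg per day) by a body weight; the function name and the 60 kg example weight are assumptions chosen for the example, and the result is arithmetic, not dosing guidance.

```python
def daily_dose_range_mg(body_weight_kg: float,
                        low_mg_per_kg: float = 0.001,
                        high_mg_per_kg: float = 500.0) -> tuple[float, float]:
    """Return (lower, upper) daily amount of active ingredient in mg.

    The default bounds are the approximate per-kg range given in the text.
    """
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return (low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg)


# Hypothetical 60 kg adult: roughly 0.06 mg to 30,000 mg per day.
lower_mg, upper_mg = daily_dose_range_mg(60.0)
```

The width of the range reflects the factors listed in the text (activity and kind of active ingredient, severity of illness, species, tolerance, body weight, age), which would narrow it for any particular case.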
A composition of the present invention enables a regulation (e.g., promotion or suppression) of the expression or a function of a polypeptide of the present invention. A composition of the present invention can be useful as, for example, a pharmaceutical (e.g., a prophylactic or therapeutic drug for a disease as described above), reagent or food.

4. Cells

The present invention provides transformants that produce a polypeptide of the present invention or a partial peptide of the present invention, cells that produce an antibody of the present invention, and cells wherein the expression or a function of a polynucleotide or polypeptide of the present invention is regulated.

4.1. Transformants

A transformant of the present invention can be a cell transformed with an expression vector of the present invention, that expresses a polypeptide of the present invention or a partial peptide of the present invention. The host cell used to prepare the transformant is not particularly limited, as far as it is compatible with the expression vector, and capable of expressing the desired polynucleotide or polypeptide and the like; for example, primary culture cells or cell lines can be mentioned. In detail, as examples of such host cells, cells of prokaryotic organisms such as Escherichia coli, bacteria of the genus Bacillus (e.g., Bacillus subtilis), and actinomyces, and cells of eukaryotic organisms, such as yeast, insect cells, bird cells, and mammalian cells (e.g., cells derived from the above-described mammals: e.g., CHO cells) can be mentioned. A transformant of the present invention can be prepared by a method known per se (see, e.g., Molecular Cloning, 2nd edition, Sambrook et al., Cold Spring Harbor Lab. Press (1989)).
Cultivation of the transformant can be performed in a nutrient medium such as a liquid medium by a method known per se. The medium preferably contains a carbon source, a nitrogen source, an inorganic substance and the like necessary for the growth of the transformant. Here, as examples of the carbon source, glucose, dextrin, soluble starch, sucrose and the like can be mentioned; as examples of the nitrogen source, inorganic or organic substances such as an ammonium salt, a nitrate salt, corn steep liquor, peptone, casein, meat extract, soybean cake, potato extract and the like can be mentioned; as examples of the inorganic substance, calcium chloride, sodium dihydrogen phosphate, magnesium chloride and the like can be mentioned. In addition, the medium may be supplemented with yeast extract, vitamins and the like. Culturing conditions, for example, temperature, medium pH and culturing time, are chosen as appropriate to allow a polypeptide of the present invention to be produced in a large amount. Culturing temperature is, for example, 30 to 37° C.

4.2. Antibody Producing Cells

An antibody-producing cell of the present invention can be an optionally chosen cell that produces an antibody of the present invention. As antibody-producing cells of the present invention, the above-described hybridomas, and a transformant cell incorporating an expression vector for one of the above-described antibodies can be mentioned. When an antibody-producing cell of the present invention is a transformant cell, details of the expression vector, host cell, cell culture and the like used to prepare the transformant cell can be the same as those described above.

4.3. Cells Wherein the Expression or a Function of a Polypeptide of the Present Invention is Regulated

The present invention provides cells wherein the expression or a function of a polypeptide of the present invention is regulated.
A cell of the present invention can be an isolated and/or purified one. A cell of the present invention can be a cell derived from one of the above-described tissues, or a cell of one of the above-described kinds. A cell of the present invention can be derived from one of the above-described mammals. A cell of the present invention can be a primary culture cell or cell line, or a normal cell, or a cell derived from a mammal with one of the above-described diseases. A cell of the present invention can be a cell wherein the expression or a function of a polypeptide of the present invention is regulated specifically. A cell of the present invention can be one in which a nerve cell-related action or nerve cell-related phenotype varies as a result of a regulation (e.g., promotion, suppression) of the expression or a function of a polypeptide of the present invention. A cell of the present invention can be a cell wherein the expression of a polypeptide of the present invention is regulated transiently, or a cell wherein the expression is regulated permanently (e.g., homozygously or heterozygously deficient cells). A cell of the present invention can also be a transformant or a non-transformant.
A cell of the present invention can be prepared by, for example, treating a cell with one of the above-described substances capable of regulating the expression or a function of a polynucleotide of the present invention or a polypeptide of the present invention (e.g., polypeptides of the present invention, antisense molecules, RNAi-inducing nucleic acids, antibodies, or expression vectors therefor). A cell of the present invention can also be prepared by isolating and/or purifying a cell from a transgenic animal or gene-deficient (so-called knockout) animal described below.
A cell wherein the expression or a function of a polypeptide of the present invention is regulated can be useful in, for example, developing a pharmaceutical (e.g., a prophylactic or therapeutic drug as described above), reagent or food, identifying a further marker gene specific for the brain/nerves or specific for nerve cell differentiation, and analyzing mechanisms associated with nerve cell differentiation. These can be performed by, for example, an expression profile analysis comprising measuring the expression profile in a cell of the present invention using a microarray, protein chip (e.g., antibody chip, or non-antibody chip such as chip manufactured by Ciphergen) and the like, and comparing the profile with the expression profile of a control cell. A cell of the present invention can also be useful as a cell model of a disease as described above.
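The comparison of an expression profile against that of a control cell can be sketched as a per-gene log2 fold change. This is a minimal illustration, not the analysis used with any particular microarray or protein chip: the function names, the pseudocount of 1.0, the threshold of 1.0 (a two-fold change) and the toy gene names are all assumptions made for the example.

```python
import math


def log2_fold_changes(test: dict[str, float],
                      control: dict[str, float],
                      pseudocount: float = 1.0) -> dict[str, float]:
    """log2((test + pseudocount) / (control + pseudocount)) for genes in both profiles."""
    shared = sorted(test.keys() & control.keys())
    return {g: math.log2((test[g] + pseudocount) / (control[g] + pseudocount))
            for g in shared}


def changed_genes(test: dict[str, float],
                  control: dict[str, float],
                  threshold: float = 1.0) -> list[str]:
    """Genes whose absolute log2 fold change meets the (assumed) threshold."""
    return [g for g, lfc in log2_fold_changes(test, control).items()
            if abs(lfc) >= threshold]


# Toy, made-up signal intensities for a regulated cell and a control cell.
regulated_cell = {"geneA": 7.0, "geneB": 3.0, "geneC": 0.0}
control_cell = {"geneA": 1.0, "geneB": 3.0, "geneC": 3.0}
candidates = changed_genes(regulated_cell, control_cell)
```

Genes flagged this way would only be candidates for further marker genes; real data would also need normalization and replicate-based statistics.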

5. Animals

The present invention provides animals wherein the expression or a function of a polypeptide of the present invention is regulated.
An animal of the present invention can be an animal with or without a genome alteration. The species of an animal of the present invention can be, for example, the same as one of the above-described non-human mammals.
In one embodiment, an animal of the present invention can be a transgenic animal with a genome alteration. A transgenic animal of the present invention is capable of expressing a polypeptide of the present invention. A transgenic animal of the present invention is also capable of expressing a polypeptide of the present invention specifically in one of the above-described cells or tissues.
A transgenic animal of the present invention can be prepared by a method known per se. In more detail, a transgenic animal of the present invention can be prepared by, for example, introducing a polynucleotide of the present invention linked operably to a specified promoter (e.g., a promoter that is non-specific or specific for one of the above-described cells or tissues) (e.g., may be in the form of an expression vector of the present invention) into a fertilized egg of an animal or another cell (e.g., unfertilized egg, spermatozoon or a progenitor cell thereof) in the initial stage of development. As examples of the method of gene introduction, the electroporation method, lipofection method, aggregation method, calcium phosphate coprecipitation method, and microinjection method can be mentioned. A transgenic animal of the present invention may be an animal prepared by mating a thus-prepared animal and another animal of the same species (e.g., animal model of a disease as described above).
In another embodiment, an animal of the present invention can be a gene-deficient animal with a genome alteration. A gene-deficient animal of the present invention is incapable of expressing a polypeptide of the present invention. A gene-deficient animal of the present invention is also incapable of expressing a polypeptide of the present invention specifically in one of the above-described cells or tissues.
A gene-deficient animal of the present invention can be prepared by a method known per se. In more detail, a gene-deficient animal of the present invention can be prepared using an embryonic stem cell (ES cell) specifically lacking a brain/nerve-specific gene. Such an ES cell can be prepared by, for example, introducing a specified targeting vector into ES cells, and selecting an ES cell showing homologous recombination from among the ES cells incorporating the targeting vector.
As a targeting vector, a targeting vector capable of inducing homologous recombination that causes specific expressional failure of a polynucleotide or polypeptide of the present invention can be used. Such a targeting vector comprises a first polynucleotide and second polynucleotide that are homologous or specifically homologous to a brain/nerve-specific gene (of the polynucleotides, at least one comprises a splicing donor signal for the brain/nerve-specific gene, and comprises a mutation that nullifies the splicing that produces at least one isoform in the signal), and, as required, a selection marker. A splicing donor signal for the brain/nerve-specific gene, and a mutation that nullifies the splicing that produces at least one isoform in the signal can be easily determined by a person skilled in the art. The first and second polynucleotides are polynucleotides having a sequence identity and length that are sufficient to produce homologous recombination in the genomic DNA associated with the brain/nerve-specific gene. The first and second polynucleotides are chosen in a way such that specific deficiency of a particular isoform is produced. As selection markers, positive selection markers (e.g., neomycin resistance gene, hygromycin B phosphotransferase (BPH) gene, blasticidin S deaminase gene, puromycin resistance gene), negative selection markers (e.g., herpes simplex virus (HSV) thymidine kinase (tk) gene, diphtheria toxin A fragment (DTA) gene) and the like can be mentioned. The targeting vector can comprise either a positive selection marker or a negative selection marker or both. The targeting vector may comprise two or more recombinase target sequences (e.g., loxP sequence, which is used in the Cre/loxP system derived from bacteriophage P1, FRT sequence, which is used in yeast-derived FLP/FRT system). The present invention also provides such a targeting vector.
As the method for introducing a targeting vector into an ES cell, a method known per se can be used. As examples of such methods, the calcium phosphate method, lipofection method/liposome method, electroporation method and the like can be mentioned. When a targeting vector is introduced into a cell, homologous recombination of the genomic DNA associated with the brain/nerve-specific gene occurs in the cell. Although an ES cell may be established by culturing an inner cell mass separated from a blastocyst of an optionally chosen animal on feeder cells, an existing ES cell may be utilized.
To select an ES cell showing homologous recombination, the cells are screened after introduction of the targeting vector. For example, after selection is performed by positive selection, negative selection and the like, screening based on genotype (for example, PCR method, Southern blot hybridization method) is performed. It is also preferable to further perform karyotype analysis on the ES cell obtained. In the karyotype analysis, the absence of chromosome aberrations in the selected ES cell is checked. Karyotype analysis can be performed by a method known per se. It is preferable that the karyotype of the ES cell be confirmed in advance before introducing the targeting vector.
A gene-deficient animal of the present invention can be prepared by transplanting to an animal a chimeric embryo obtained by introducing an ES cell obtained as described above into an embryo, and then mating the chimeric animal obtained. As examples of the embryo, blastocysts, 8-cell stage embryos and the like can be mentioned. The embryo can be obtained by, for example, mating with a male animal a female animal that has undergone a superovulation treatment with a hormone preparation (for example, PMSG, which has FSH-like action, and hCG, which has LH action, are used). As methods of introducing an ES cell into an embryo, the micromanipulation method, aggregation method and the like can be mentioned.
The animal receiving a chimeric embryo transplanted is preferably a pseudo-pregnant animal. A pseudo-pregnant animal can be obtained by mating a female animal in the normal sexual cycle with a male animal emasculated by vasoligation and the like. The animal incorporating the chimeric embryo becomes pregnant and delivers a chimeric animal. Next, it is determined whether or not the animal born is a chimeric animal. Whether or not the animal born is a chimeric animal can be determined by a method known per se, for example, by the body color or coat color. For the determination, a DNA may be extracted from a portion of the body and subjected to Southern blot analysis or PCR assay. The mating can be performed preferably between a wild-type animal and a chimeric animal, or between chimeric animals. Whether or not the deficiency of the brain/nerve-specific gene has been introduced into the germ cell line of the chimeric animal and heterozygous offspring lacking the brain/nerve-specific gene has been obtained can be determined by a method known per se with various characters as indexes; for example, this can be determined by the body color or coat color of the offspring animal. For the determination, a DNA may be extracted from a portion of the body and subjected to Southern blot analysis or PCR assay. Furthermore, by mating thus-obtained heterozygotes, a homozygote can be prepared. A gene-deficient animal of the present invention may also be an animal prepared by mating an animal thus prepared and another animal of the same species (e.g., animal model of disease based on nerve cell disorder, transgenic animal).
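The step of mating heterozygotes to obtain a homozygote can be illustrated with a small Punnett-square calculation. The sketch assumes simple Mendelian segregation of a single locus with equal viability of all genotypes, which the text does not state and which may not hold for a given gene; the allele symbols "+" (wild-type) and "-" (deficient) and the function name are likewise illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_genotypes(parent1: str, parent2: str) -> dict[str, Fraction]:
    """Expected genotype fractions from crossing two diploid parents.

    Each parent is a two-character string of alleles, e.g. "+-" for a heterozygote.
    Assumes Mendelian segregation and equal viability (an assumption of this sketch).
    """
    counts = Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


# Heterozygote x heterozygote: 1/4 wild-type, 1/2 heterozygous, 1/4 homozygous-deficient.
het_cross = offspring_genotypes("+-", "+-")
```

Under these assumptions about one quarter of the offspring of a heterozygote cross would be homozygous for the deficiency; genotyping by Southern blot analysis or PCR assay, as described above, would still be needed to identify them.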
In still another embodiment, an animal of the present invention can be an animal without a genome alteration. Such an animal can be prepared by treating an animal with one of the above-described substances capable of regulating the expression or a function of a polynucleotide of the present invention or a polypeptide of the present invention (e.g., polypeptides of the present invention, antisense molecules, RNAi-inducing nucleic acids, antibodies, or expression vectors therefor). Such an animal can also be an animal capable or incapable of expressing a polypeptide of the present invention specifically in one of the above-described tissues by topical treatment. The animal treatment can be performed using a method mentioned with respect to a composition of the present invention.
An animal of the present invention can be useful in, for example, developing a pharmaceutical (e.g., a prophylactic or therapeutic drug as described above), reagent or food, identifying a further marker gene specific for the brain/nerves or specific for nerve cell differentiation, and analyzing mechanisms associated with nerve cell differentiation. These can be performed by, for example, an expression profile analysis comprising measuring an expression profile (particularly expression profile of a nerve cell or a tissue in the brain) using a microarray, protein chip (e.g., antibody chip, or non-antibody chip such as a chip manufactured by Ciphergen) and the like in an animal of the present invention, and comparing the profile with the expression profile of a control animal. An animal of the present invention can also be useful as an animal model of a disease as described above.

6. Measuring Means and Measuring Method

The present invention provides measuring means (e.g., primer set, nucleic acid probe, antibody, aptamer) and measuring methods for target polynucleotides and polypeptides.

6.1. Primer Set and Method of Use Thereof

A primer set of the present invention can be used for specific detection and quantitation of a polynucleotide of the present invention or a known polynucleotide, or comprehensive detection and quantitation of both a polynucleotide of the present invention and a known polynucleotide. For example, such detection and quantitation can be achieved, after preparing total RNA from a biological sample, by utilizing a method of gene amplification such as a PCR (e.g., RT-PCR, real-time PCR, quantitative PCR), LAMP (Loop-mediated isothermal amplification) (see, e.g., WO00/28082), or ICAN (Isothermal and Chimeric primer-initiated Amplification of Nucleic acids) (see, e.g., WO00/56877). Because the number of primers required differs depending on the kind of the method of gene amplification, the number of primers is not particularly limited; for example, a primer set of the present invention can comprise two or more primers constituted by a sense and antisense primer. The two or more primers may be mixed in advance or not. Each of the sense and antisense primers is not particularly limited, as far as it is of a size enabling specific amplification of the target region; each primer consists of at least about 12 (for example, at least about 15, preferably at least about 18, more preferably at least about 20 and the like) consecutive nucleotide residues. When the size of the amplified polynucleotide is to be detected visually, the sense and antisense primers can be designed so that the amplified polynucleotide is of a visually detectable size. The visually detectable size is not particularly limited, and can be, for example, at least about 50, preferably at least about 70, more preferably at least about 100, still more preferably at least about 150, and most preferably at least about 200, about 300, about 400, about 500 or more nucleotide residues long.
The sense and antisense primer do not require that the polynucleotide amplified thereby be visually detected, and may be detected by a fluorescence signal and the like, as is commonly used in real-time PCR.
A primer set of the present invention can be a) a primer set specific for a polynucleotide of the present invention, capable of distinguishing a polynucleotide of the present invention from a known polynucleotide (abbreviated as “specific primer set A” as required), b) a primer set specific for a known polynucleotide, capable of distinguishing a known polynucleotide from a polynucleotide of the present invention (abbreviated as “specific primer set B” as required), or c) a primer set common to both a polynucleotide of the present invention and a known polynucleotide, which does not distinguish between a polynucleotide of the present invention and a known polynucleotide (abbreviated as “shared primer set” as required).
The specific primer set A of the present invention can comprise i) a sense and antisense primer designed to make it possible to distinguish the size of the polynucleotide of the present invention or partial nucleotide thereof to be amplified from the size of the known polynucleotide or partial nucleotide thereof to be amplified, or ii) a sense and antisense primer designed to allow a polynucleotide of the present invention or a partial nucleotide thereof alone to be amplified, and not to allow a known polynucleotide to be amplified.
The sense and antisense primers of i) above are preferably, for example, a) a sense primer corresponding to a nucleic acid sequence present on the 5′ side relative to the nucleic acid sequence of the above-described specific partial nucleotide A (particularly an insert nucleic acid sequence of a polynucleotide of the present invention), and an antisense primer corresponding to a nucleic acid sequence complementary to a nucleic acid sequence present on the 3′ side relative to the nucleic acid sequence, or b) a sense primer corresponding to a nucleic acid sequence present on the 5′ side relative to the nucleic acid sequence of the above-described specific partial nucleotide B (particularly an insert nucleic acid sequence of a known polynucleotide), and an antisense primer corresponding to a nucleic acid sequence complementary to a nucleic acid sequence present on the 3′ side relative to the nucleic acid sequence.
The sense and antisense primers of ii) above are preferably, for example, a) a sense primer corresponding to the nucleic acid sequence of the above-described specific partial nucleotide A (particularly an insert nucleic acid sequence of a polynucleotide of the present invention), and a specified antisense primer, b) a specified sense primer, and an antisense primer corresponding to a nucleic acid sequence complementary to the nucleic acid sequence of the above-described specific partial nucleotide A (particularly an insert nucleic acid sequence of a polynucleotide of the present invention), or c) a sense and antisense primer corresponding to the nucleic acid sequence of the above-described specific partial nucleotide A (particularly an insert nucleic acid sequence of a polynucleotide of the present invention).
The specific primer set B of the present invention can comprise i) a sense and antisense primer designed to make it possible to distinguish the size of the known polynucleotide or partial nucleotide thereof to be amplified from the size of the polynucleotide of the present invention or partial nucleotide thereof to be amplified, or ii) a sense and antisense primer designed to allow a known polynucleotide or a partial nucleotide thereof alone to be amplified, and not to allow a polynucleotide of the present invention to be amplified.
The sense and antisense primers of i) above are preferably, for example, a) a sense primer corresponding to a nucleic acid sequence present on the 5′ side relative to the nucleic acid sequence of the above-described specific partial nucleotide B (particularly an insert nucleic acid sequence of a known polynucleotide), and an antisense primer corresponding to a nucleic acid sequence complementary to a nucleic acid sequence present on the 3′ side relative to the nucleic acid sequence, or b) a sense primer corresponding to a nucleic acid sequence present on the 5′ side relative to the nucleic acid sequence of the above-described specific partial nucleotide A (particularly an insert nucleic acid sequence of a polynucleotide of the present invention), and an antisense primer corresponding to a nucleic acid sequence complementary to a nucleic acid sequence present on the 3′ side relative to the nucleic acid sequence.
The sense and antisense primers of ii) above are preferably, for example, a) a sense primer corresponding to the nucleic acid sequence of the above-described specific partial nucleotide B (particularly an insert nucleic acid sequence of a known polynucleotide), and a specified antisense primer, b) a specified sense primer, and an antisense primer corresponding to a nucleic acid sequence complementary to the nucleic acid sequence of the above-described specific partial nucleotide B (particularly an insert nucleic acid sequence of a known polynucleotide), or c) a sense and antisense primer corresponding to the nucleic acid sequence of the above-described specific partial nucleotide B (particularly an insert nucleic acid sequence of a known polynucleotide).
A shared primer set of the present invention can comprise a sense and antisense primer designed to equalize the size of the known polynucleotide or partial nucleotide thereof to be amplified to the size of the polynucleotide of the present invention or partial nucleotide thereof to be amplified. Such a sense and antisense primer are preferably, for example, a sense and antisense primer designed not to allow the polynucleotide of the present invention or partial nucleotide thereof to be amplified, and the known polynucleotide or partial nucleotide thereof to be amplified, to comprise the nucleic acid sequences of the above-described specific partial nucleotides A and B.
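The two design strategies described above for a specific primer set (distinguishing by amplicon size, or amplifying only one of the two polynucleotides) can be illustrated with a minimal Python sketch. The sequences, the helper names `reverse_complement` and `amplicon_size`, and the exact-match annealing model are all assumptions made for illustration; they are not taken from the sequence listing or from any primer-design software.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_size(template: str, sense: str, antisense: str):
    """Return the amplicon length in nucleotides, or None if a primer finds no site.

    Simplifying assumption: a primer anneals only where it matches exactly.
    The sense primer is searched on the template as written; the antisense
    primer is searched through its reverse complement, downstream of the
    sense primer.
    """
    start = template.find(sense)
    if start < 0:
        return None
    site = reverse_complement(antisense)
    end = template.find(site, start + len(sense))
    if end < 0:
        return None
    return end + len(site) - start


# Hypothetical toy sequences (illustration only).
UPSTREAM = "ATGGCTAGCTAGGATCCGTA"
INSERT = "GATTACAGGCTTCAAGTCCGATGA"
DOWNSTREAM = "CTGAAGCTTCCATGGTCAGT"
with_insert = UPSTREAM + INSERT + DOWNSTREAM   # stands in for a polynucleotide with an insert sequence
without_insert = UPSTREAM + DOWNSTREAM         # stands in for the counterpart lacking the insert

# Strategy i): primers flanking the insert give amplicons of different sizes.
flank_sense = UPSTREAM[:12]
flank_antisense = reverse_complement(DOWNSTREAM[-12:])
size_with = amplicon_size(with_insert, flank_sense, flank_antisense)
size_without = amplicon_size(without_insert, flank_sense, flank_antisense)

# Strategy ii): a sense primer inside the insert amplifies only one template.
insert_sense = INSERT[:12]
specific_with = amplicon_size(with_insert, insert_sense, flank_antisense)
specific_without = amplicon_size(without_insert, insert_sense, flank_antisense)

print(size_with, size_without, specific_with, specific_without)
```

With these toy sequences the flanking pair yields products of 64 and 40 nucleotides, whereas the insert-directed sense primer yields a 44-nucleotide product from the insert-containing template and no product from the other.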

6.2. Nucleic Acid Probe and Method of Use Thereof

A nucleic acid probe of the present invention can be used for specific detection and quantitation of a polynucleotide of the present invention or a known polynucleotide, or comprehensive detection and quantitation of both a polynucleotide of the present invention and a known polynucleotide. For example, such a detection and quantitation can be achieved, after preparing total RNA from a biological sample, by utilizing Northern blotting, a nucleic acid array wherein a nucleic acid probe of the present invention is immobilized, and the like. Although the nucleic acid probe can be a DNA, an RNA, a modified nucleic acid or a chimeric molecule thereof and the like, a DNA is preferable in consideration of safety, convenience and the like. The nucleic acid probe may also be any one of a single-stranded or a double-stranded polynucleotide. The size of the nucleic acid probe is not particularly limited, as far as it is capable of specifically hybridizing to the transcription product of the target gene; the size is, for example, at least about 15 or 16, preferably about 15 to about 1000, more preferably about 20 to about 500, and still more preferably about 25 to about 300. When a nucleic acid probe of the present invention is a single-stranded polynucleotide, the nucleic acid probe of the present invention can be the same as an antisense molecule of the present invention. When a nucleic acid probe of the present invention is a double-stranded polynucleotide, the nucleic acid probe of the present invention can be configured by an antisense molecule of the present invention and a polynucleotide molecule complementary thereto.
A nucleic acid probe of the present invention can be a) a nucleic acid probe specific for a polynucleotide of the present invention, capable of distinguishing a polynucleotide of the present invention from a known polynucleotide (abbreviated as “specific nucleic acid probe A” as required), b) a nucleic acid probe specific for a known polynucleotide, capable of distinguishing a known polynucleotide from a polynucleotide of the present invention (abbreviated as “specific nucleic acid probe B” as required), or c) a nucleic acid probe common to both a polynucleotide of the present invention and a known polynucleotide, which does not distinguish between a polynucleotide of the present invention and a known polynucleotide (abbreviated as “shared nucleic acid probe” as required).
The specific nucleic acid probe A of the present invention can be a polynucleotide having a nucleic acid sequence complementary to the nucleic acid sequence of the above-described specific partial nucleotide A (particularly an insert nucleic acid sequence of a polynucleotide of the present invention) (a single-stranded polynucleotide), or a polynucleotide having the nucleic acid sequence of the above-described specific partial nucleotide A (particularly an insert nucleic acid sequence of a polynucleotide of the present invention) and a nucleic acid sequence complementary to the nucleic acid sequence (a double-stranded polynucleotide).
The specific nucleic acid probe B of the present invention can be a polynucleotide having a nucleic acid sequence complementary to the nucleic acid sequence of the above-described specific partial nucleotide B (particularly an insert nucleic acid sequence of a known polynucleotide) (a single-stranded polynucleotide), or a polynucleotide having the nucleic acid sequence of the above-described specific partial nucleotide B (particularly an insert nucleic acid sequence of a known polynucleotide) and a nucleic acid sequence complementary to the nucleic acid sequence (a double-stranded polynucleotide).
A shared nucleic acid probe of the present invention can be a polynucleotide having a nucleic acid sequence complementary to the nucleic acid sequence of the above-described shared partial nucleotide (a single-stranded polynucleotide), or a polynucleotide having a nucleic acid sequence complementary to the nucleic acid sequence of the above-described shared partial nucleotide and the nucleic acid sequence (a double-stranded polynucleotide).
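As a minimal sketch of the relationship between a partial nucleotide and its single-stranded probe, the following Python fragment derives a probe as the reverse complement of a target region and checks, under an exact-match model, whether it would pair with a given transcript. The sequences and the function names `design_probe` and `can_hybridize` are hypothetical; the default minimum length of 15 follows the lower bound given above, and sequences are written in the DNA alphabet of the cDNA for simplicity.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def design_probe(partial_nucleotide: str, min_len: int = 15) -> str:
    """Return a single-stranded probe complementary to the given partial nucleotide."""
    if len(partial_nucleotide) < min_len:
        raise ValueError("target region is shorter than the minimum probe size")
    return reverse_complement(partial_nucleotide)


def can_hybridize(probe: str, transcript: str) -> bool:
    """Exact-match model (an assumption): the probe pairs with the transcript only
    if the transcript contains the probe's perfect complement."""
    return reverse_complement(probe) in transcript


# Hypothetical toy sequences (illustration only).
INSERT_SEQ = "GATTACAGGCTTCAAGTCCGATGA"
transcript_with_insert = "ATGGCTAGCTAGGATCCGTA" + INSERT_SEQ + "CTGAAGCTTCCATGGTCAGT"
transcript_without_insert = "ATGGCTAGCTAGGATCCGTA" + "CTGAAGCTTCCATGGTCAGT"

probe_a = design_probe(INSERT_SEQ)
print(probe_a)
print(can_hybridize(probe_a, transcript_with_insert))
print(can_hybridize(probe_a, transcript_without_insert))
```

A double-stranded probe in this model would simply be the pair (`INSERT_SEQ`, `probe_a`). Real hybridization tolerates mismatches and depends on stringency conditions, which this sketch does not attempt to model.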
A nucleic acid probe of the present invention may be provided in a state immobilized on a support (i.e., as an array). The support for such a nucleic acid array is not particularly limited, as far as it is a support in common use in the art; for example, membranes (e.g., nylon membranes), glass, plastics, metals, plates and the like can be mentioned. A nucleic acid array in the present invention can assume a form known per se; for example, an array wherein a nucleic acid is directly synthesized on a support (so-called Affymetrix type), an array wherein a nucleic acid is immobilized on a support (so-called Stanford type), fiber-type array, and electrochemical array (ECA) can be mentioned.

6.3. Antibodies and Aptamers and Method of Use Thereof

An antibody and aptamer of the present invention can be used for specific detection and quantitation of a polypeptide of the present invention, a known polypeptide, or both a polypeptide of the present invention and a known polypeptide. For example, such a detection and quantitation can be achieved, after preparing an extract from a biological sample, or using a biological sample, by an immunological technique or an affinity-based method. As examples of such immunological techniques, enzyme immunoassay (EIA) (e.g., direct competitive ELISA, indirect competitive ELISA, sandwich ELISA), radioimmunoassay (RIA), fluorescent immunoassay (FIA), immunochromatography, luminescence immunoassay, spin immunoassay, Western blotting, and immunohistochemical staining can be mentioned. An affinity-based method can be performed in accordance with one of the above-described immunological techniques. The antibody and aptamer used for a measurement of a polypeptide of the present invention, a known polypeptide, or both a polypeptide of the present invention and a known polypeptide can be the same as the above-described antibody and aptamer of the present invention.
An antibody and aptamer of the present invention can be a) an antibody and aptamer specific for a polypeptide of the present invention, that make it possible to distinguish a polypeptide of the present invention from a known polypeptide (abbreviated as “specific antibody and aptamer A” as required), b) an antibody and aptamer specific for a known polypeptide, that make it possible to distinguish a known polypeptide from a polypeptide of the present invention (abbreviated as “specific antibody and aptamer B” as required), or c) an antibody and an aptamer common to both a polypeptide of the present invention and a known polypeptide, that do not distinguish between a polypeptide of the present invention and a known polypeptide (abbreviated as “shared antibody and aptamer” as required). The specific antibody and aptamer A of the present invention are capable of binding to the above-described specific partial peptide A (particularly a partial peptide consisting of an insert amino acid sequence of a polypeptide of the present invention). The specific antibody and aptamer B of the present invention are capable of binding to the above-described specific partial peptide B (particularly a partial peptide consisting of an insert amino acid sequence of a known polypeptide). A shared antibody and aptamer of the present invention are capable of binding to the above-described shared partial peptide.
An antibody and aptamer of the present invention may be provided in a form immobilized on a support (i.e., as an array). The support for such an array is not particularly limited, as far as it is a support in common use in the art; for example, membranes (e.g., nitrocellulose membranes), glass, plastics, metals, and plates (e.g., multiwell plates) can be mentioned.

6.4. Supplementary Matters Concerning Measuring Means of the Present Invention

A measuring means of the present invention can be provided in a form labeled with a labeling substance as required. As examples of the labeling substance, fluorescent substances such as FITC and FAM, luminescent substances such as luminol, luciferin and lucigenin, radioisotopes such as ³H, ¹⁴C, ³²P, ³⁵S, and ¹²³I, affinity substances such as biotin and streptavidin, and the like can be mentioned.
A measuring means of the present invention may be provided in the form of a kit comprising an additional constituent, in addition to the measuring means. In this case, the various constituents contained in the kit can be provided in mutually isolated forms, for example, in forms housed in different containers. For example, when the measuring means is not labeled with a labeling substance, the kit can further comprise a labeling substance. A kit of the present invention can comprise two or more measuring means for two or more target genes (e.g., a combination of a brain/nerve-specific gene and a known gene, a combination of two or more brain/nerve-specific genes). When the measuring means of the present invention is provided in the form of an array, the array of the present invention can be one wherein two or more measuring means for two or more target genes are immobilized. A kit and array of the present invention can also comprise a measuring means as described above with respect to a housekeeping gene (e.g., GAPDH, β-actin).

6.5. Measuring Methods of the Present Invention

The present invention also provides a method of detecting or quantifying a target polypeptide or polynucleotide using a measuring means of the present invention.
A measurement of a target polynucleotide and polypeptide can be properly performed according to the kind of the measuring means by the above-described method.
In a method of the present invention, the expression level of a target polynucleotide or polypeptide in a biological sample obtained from one of the above-described mammals (e.g., human) or a culture (e.g., cell or tissue culture) can be measured. The biological sample is not particularly limited, as far as it is, for example, a sample containing a cell or tissue expressing the target polynucleotide or polypeptide, or, if the target polynucleotide or polypeptide is secreted or oozed or the like, an animal-derived sample (e.g., blood, plasma, serum, saliva, cerebrospinal fluid, tear, urine) containing the polynucleotide or polypeptide secreted or oozed or the like. The biological sample can be one containing one of the above-described cells or tissues (e.g., nerve cell or a tissue in the brain). The biological sample used in the present invention, unless otherwise specified, can be a biological sample collected from a mammal in advance; in a particular aspect, a method of the present invention can comprise collecting a biological sample from a mammal.
In one embodiment, a method of the present invention can be utilized to identify a nerve cell, to determine a nerve cell differentiation state, or to diagnose a disease based on a nerve cell disorder (e.g., determination of onset or likelihood of onset). This method can comprise measuring the expression level of a target polynucleotide or polypeptide in a biological sample collected from an animal, and evaluating the onset or likelihood of onset of a target disease on the basis of the measured expression level or relative expression rate. For example, the measured expression level or relative expression rate is compared with the expression level in a mammal not suffering the target disease (e.g., normal animal). The expression level or expression rate in a mammal not suffering the target disease can be determined by a method known per se. By such a comparison, it is determined whether or not the animal possibly has the target disease, or whether or not the animal is likely to suffer the disease. It is known that in a mammal having a particular disease manifested, an expressional change in the gene associated with the disease is often observed. It is also known that before the onset of a particular disease, an expressional change in a particular gene is often observed. Therefore, by such an analysis, it is possible to determine the onset or likelihood of onset of a target disease. Such a method can be useful in, for example, conveniently determining and early detecting a target disease. Of course, a measuring means of the present invention and a reagent or kit of the present invention can also be utilized for such a determination.
In detail, the changes in the expression profiles of the brain/nerve-specific genes 1 to 10 in nerve cells or tissues in the brain are as described in Examples. Therefore, using a measuring means of the present invention that enables a specific measurement of a polynucleotide of the present invention and a partial nucleotide of the present invention (e.g., specific partial nucleotide A of the present invention, specific partial nucleotide B of the present invention, shared partial nucleotide of the present invention), and a polypeptide of the present invention and a partial peptide of the present invention (e.g., specific partial peptide A of the present invention, specific partial peptide B of the present invention, shared partial peptide of the present invention), by evaluating the degree of the expression of the brain/nerve-specific genes 1 to 10 and/or relative expression ratios thereof, it is possible to identify a nerve cell, to determine a differentiation state of a nerve cell, or to diagnose a disease based on a nerve cell disorder.
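The comparison of a measured expression level with that of a control can be made concrete with a short Python sketch. The specification does not state how a relative expression level is to be computed; the fragment below assumes real-time PCR threshold cycle (Ct) values normalized to a housekeeping gene such as GAPDH and uses the widely used 2^(-ΔΔCt) calculation, which in turn assumes roughly 100% amplification efficiency for both genes. The function name and the Ct values are hypothetical.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of the target gene in the sample relative to the control.

    Each target Ct is first normalized to the housekeeping (reference) gene
    measured in the same specimen, then the sample is compared with the
    control: fold = 2 ** -(dCt_sample - dCt_control).
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values (illustration only).
fold = relative_expression(ct_target_sample=24.0, ct_ref_sample=18.0,
                           ct_target_control=26.0, ct_ref_control=18.0)
print(fold)  # 4.0: the target is about four times more abundant in the sample
```

A fold value near 1 indicates no difference from the control; how large a deviation is meaningful for a given gene and disease is not fixed by this calculation and would have to be established separately.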
In another embodiment, a method of the present invention can be utilized for screening for a pharmaceutical, reagent or food and the like. For example, in one methodology, the screening method can comprise determining whether or not a test material is capable of regulating (e.g., increasing or decreasing) the number of nerve cells. Because the number of nerve cells and the expression level of a brain/nerve-specific gene can correlate with each other, such a screening can be performed by measuring the expression level of the brain/nerve-specific gene. In another methodology, the screening method can comprise determining whether or not a test material is capable of regulating the expression or a function of a target polynucleotide or polypeptide. Such a screening method can be utilized as, for example, a screening method for a pharmaceutical effective for a specified disease (e.g., disease based on a nerve cell disorder) and the like, comprising selecting a test substance capable of regulating the expression or a function of a target, and a screening method for a pharmaceutical with a decreased specified action (e.g., adverse reactions such as nerve cell differentiation regulatory action) and the like, comprising selecting a test substance incapable of regulating the expression or a function of a target. The test material subjected to the screening method can be a commonly known compound or a novel compound or a composition; as examples, nucleic acids, glucides, lipids, proteins, peptides, organic low molecular compounds, compound libraries prepared using combinatorial chemistry technology, random peptide libraries prepared by solid phase synthesis or the phage display method, or naturally occurring ingredients derived from microorganisms, animals, plants, marine organisms and the like, existing pharmaceuticals, reagents or foods and the like can be mentioned. 
In the screening method, mammals, cells and tissues (e.g., nerve cell and a tissue in the brain), or reconstitution systems (non-cell systems) as described above can be used. Pharmaceuticals and the like obtained by the screening method are also provided by the present invention.
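The selection step of such a screening can be sketched as follows. This Python fragment assumes that the expression level of the target has been measured once without any test material (the baseline) and once in the presence of each test material, and that a two-fold change is used as the selection criterion; the threshold, the function name `select_regulators`, and the data are all hypothetical.

```python
def select_regulators(expression_by_material: dict, baseline: float,
                      fold_threshold: float = 2.0):
    """Split test materials into those that increase and those that decrease expression.

    A material is selected as an increaser if expression is at least
    fold_threshold times the baseline, and as a decreaser if it is at most
    1 / fold_threshold times the baseline; all others are treated as
    not regulating the target.
    """
    if baseline <= 0:
        raise ValueError("baseline expression must be positive")
    increasers, decreasers = [], []
    for name, level in expression_by_material.items():
        ratio = level / baseline
        if ratio >= fold_threshold:
            increasers.append(name)
        elif ratio <= 1.0 / fold_threshold:
            decreasers.append(name)
    return increasers, decreasers


# Hypothetical measurements in arbitrary units (illustration only).
measured = {"compound_1": 250.0, "compound_2": 100.0, "compound_3": 40.0}
increasers, decreasers = select_regulators(measured, baseline=100.0)
print(increasers, decreasers)
```

Materials that fall into neither list correspond to test substances incapable of regulating the expression of the target, which is the selection made when screening for a pharmaceutical with a decreased specified action.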
The disclosures in all publications mentioned herein, including patents and patent application specifications, are incorporated herein by reference to the extent that all of them have been given expressly.

EXAMPLES

The present invention is hereinafter described in further detail with reference to Examples; however, the present invention is not limited to the Examples and the like by any means.

Example 1

Preparation and Sequence Analysis of Human cDNA Libraries

(1) Preparation and Sequence Analysis of cDNA Libraries by the Improved Oligocap method
1) Extraction and Purchase of mRNAs
From human tissues (shown below), by a method described in a literature document (J. Sambrook, E. F. Fritsch & T. Maniatis, Molecular Cloning Second edition, Cold Spring Harbor Laboratory Press, 1989), mRNAs were extracted as total RNAs. After cultivation of cultured human cells or primary culture human cells (shown below) by the methods described in the catalogues thereof, mRNAs were extracted as total RNAs by a method described in a literature document (J. Sambrook, E. F. Fritsch & T. Maniatis, Molecular Cloning Second edition, Cold Spring Harbor Laboratory Press, 1989).
Hereinafter, the relationships between the names of libraries and the derivations thereof are shown in the order of “name of library: derivation”. If a library was generated by subtraction, how to generate the subtraction library is also shown.
<Extraction of mRNAs from Human Tissues>
NTONG: Tongue; CTONG: Tongue, Tumor; FCBBF: Brain, Fetal; OCBBF: Brain, Fetal; PLACE: Placenta; SYNOV: Synovial membrane tissue from rheumatoid arthritis; CORDB: Cord blood.
<Extraction of mRNAs from Cultured Cells>
BNGH4: H4 cell (ATCC #HTB-148); IMR32: IMR32 cell (ATCC #CCL-127); SKNMC: SK-N-MC cell (ATCC #HTB-10); 3NB69: NB69 cell (RCB #RCB0480); BGGI1: GI1 cell (RCB #RCB0763); NB9N4: NB9 cell (RCB #RCB0477); SKNSH: SK-N-SH cell (RCB #RCB0426); AHMSC: HMSC cell (Human mesenchymal cell); CHONS: Chondrocyte; ERLTF: TF-1 cell (erythroleukemia); HELAC: HeLa cell; JCMLC: leukemia cell (Leukemia, myelogenous); MESTC: Mesenchyme stem cell; N1ESE: Mesenchymal stem cell; NCRRM: Embryonal carcinoma; NCRRP: Embryonal carcinoma treated with retinoic acid (RA) to induce differentiation; T1ESE: Mesenchymal stem cell treated with trichostatin and 5-azacytidine to induce differentiation; NT2RM: NT2 cell (STRATAGENE #204101); NT2RP: NT2 cell treated with retinoic acid (RA) to induce differentiation for 5 weeks; NT2RI: NT2 cell treated with RA to induce differentiation for 5 weeks, and thereafter treated with a growth inhibitor for 2 weeks; NT2NE: NT2 cell treated with RA and treated with a growth inhibitor to induce nerve differentiation, followed by nerve concentration and recovery (NT2 Neuron); NTISM: a library generated by subtracting cDNAs that overlap with the mRNA of undifferentiated NT2 cells from a cDNA library prepared from an mRNA of NT2 cell (STRATAGENE #204101) treated with RA to induce differentiation for 5 weeks, and thereafter treated with a growth inhibitor for 2 weeks, using Subtract Kit (Invitrogen #K4320-01) (NT2RI-NT2RM). RCB indicates that the cell line was supplied by the RIKEN Gene Bank—Cell Development Bank, and ATCC indicates that the cell line was supplied by the American Type Culture Collection.
<Extraction of mRNAs from Primary Culture Cells>
ASTRO: Normal Human Astrocyte NHA5732, Takara Shuzo #CC2565; DFNES: Normal Human Dermal Fibroblasts (Neonatal Skin; NHDF-Neo) NHDF2564, Takara Shuzo #CC2509; MESAN: Normal human mesangial cells NHMC56046-2, Takara Shuzo #CC2559; NHNPC: Normal human neural progenitor cells NHNP5958, Takara Shuzo #CC2599; PEBLM: Human peripheral blood mononuclear cells HPBMC5939, Takara Shuzo #CC2702; HSYRA: HS-RA (Human synoviocytes from rheumatoid arthritis), Toyobo #T404K-05; PUAEN: Human pulmonary artery endothelial cells, Toyobo #T302K-05; UMVEN: Human umbilical vein endothelial cells HUVEC, Toyobo #T200K-05; HCASM: HCASMC (Human coronary artery smooth muscle cells), Toyobo #T305K-05; HCHON: HC (Human Chondrocytes), Toyobo #T402K-05; HHDPC: HDPC (Human dermal papilla cells), Toyobo #THPCK-001; CD34C: CD34+ cell (AllCells, LLC #CB14435M); D3OST: CD34+ cells treated with osteoclast differentiation factor (ODF) to induce differentiation for 3 days; D6OST: CD34+ cells treated with an ODF to induce differentiation for 6 days; D9OST: CD34+ cells treated with ODF to induce differentiation for 9 days; ACTVT: activated T-cell; LYMPB: Lymphoblast, EB virus transferred B cell; NETRP: Neutrophil.
Next, mRNAs extracted as total RNAs from the human tissues shown below were purchased. Hereinafter, the relationships between the names of libraries and the derivations thereof are shown in the order of “name of library: derivation”. If a library was generated by subtraction, how to generate the subtraction library is also shown.
<mRNAs from Human Tissues Purchased as Total RNAs>
ADRGL: Adrenal gland, CLONTECH #64016-1; BRACE: Brain, cerebellum, CLONTECH #64035-1; BRAWH: Brain, whole, CLONTECH #64020-1; FEBRA: Brain, Fetal, CLONTECH #64019-1; FELIV: Liver, Fetal, CLONTECH #64018-1; HEART: Heart, CLONTECH #64025-1; HLUNG: Lung, CLONTECH #64023-1; KIDNE: Kidney, CLONTECH #64030-1; LIVER: Liver, CLONTECH #64022-1; MAMGL: Mammary Gland, CLONTECH #64037-1; PANCR: Pancreas, CLONTECH #64031-1; PROST: Prostate, CLONTECH #64038-1; SALGL: Salivary Gland, CLONTECH #64026-1; SKMUS: Skeletal Muscle, CLONTECH #64033-1; SMINT: Small Intestine, CLONTECH #64039-1; SPLEN: Spleen, CLONTECH #64034-1; STOMA: Stomach, CLONTECH #64090-1; TBAES: Breast, Tumor, CLONTECH #64015-1; TCERX: Cervix, Tumor, CLONTECH #64010-1; TCOLN: Colon, Tumor, CLONTECH #64014-1; TESTI: Testis, CLONTECH #64027-1; THYMU: Thymus, CLONTECH #64028-1; TLUNG: Lung, Tumor, CLONTECH #64013-1; TOVAR: Ovary, Tumor, CLONTECH #64011-1; TRACH: Trachea, CLONTECH #64091-1; TUTER: Uterus, Tumor, CLONTECH #64008-1; UTERU: Uterus, CLONTECH #64029-1; ADIPS: Adipose, Invitrogen #D6005-01; BLADE: Bladder, Invitrogen #D6020-01; BRALZ: Brain, cortex, Alzheimer, Invitrogen #D6830-01; CERVX: Cervix, Invitrogen #D6047-01; COLON: Colon, Invitrogen #D6050-0; NESOP: Esophagus, Invitrogen #D6060-01; PERIC: Pericardium, Invitrogen #D6105-01; RECTM: Rectum, Invitrogen #D6110-01; TESOP: Esophageal, Tumor, Invitrogen #D6860-01; TKIDN: Kidney, Tumor, Invitrogen #D6870-01; TLIVE: Liver, Tumor, Invitrogen #D6880-01; TSTOM: Stomach, Tumor, Invitrogen #D6920-01; BEAST: Adult Breast, STRATAGENE #735044; FEHRT: Heart, Fetal, STRATAGENE #738012; FEKID: Kidney, Fetal, STRATAGENE #738014; FELNG: Lung, Fetal, STRATAGENE #738020; NOVAR: Adult Ovary, STRATAGENE #735260; BRASW: a library generated by subtracting cDNAs that overlap with the mRNA of BRAWH (Brain, whole, CLONTECH #64020-1) from a cDNA library prepared from the mRNA of BRALZ (Brain, cortex, Alzheimer, Invitrogen #D6830-01), using Subtract Kit (Invitrogen #K4320-01) (BRALZ-BRAWH).
Furthermore, mRNAs extracted and purified as polyA(+) RNAs from the human tissues shown below were purchased. From an RNA prepared by mixing polyA(+) RNA derived from each tissue with polyA(−) RNA, a cDNA library was prepared. The polyA(−) RNA was prepared by removing the polyA(+) RNA from the total RNA of Brain, whole, CLONTECH #64020-1 by means of oligo dT cellulose. Hereinafter, the relationships between the names of libraries and the derivations thereof are shown in the order of “name of library: derivation”.
<mRNAs from Human Tissues Purchased as PolyA(+) RNAs>
BRAMY: Brain, amygdala, CLONTECH #6574-1; BRCAN: Brain, caudate nucleus, CLONTECH #6575-1; BRCOC: Brain, corpus callosum, CLONTECH #6577-1; BRHIP: Brain, hippocampus, CLONTECH #6578-1; BRSSN: Brain, substantia nigra, CLONTECH #6580-1; BRSTN: Brain, subthalamic nucleus, CLONTECH #6581-1; BRTHA: Brain, thalamus, CLONTECH #6582-1.
2) Preparation of cDNA Libraries by the Improved Oligocap Method
From each RNA, by a method (WO 01/04286) developed by improving the oligocap method [M. Maruyama and S. Sugano, Gene, 138: 171-174 (1994)], a cDNA library was prepared. Using an Oligo-cap linker (SEQ ID NO:1) and an Oligo dT primer (SEQ ID NO:2), as described in WO 01/04286, BAP (Bacterial Alkaline Phosphatase) treatment, TAP (Tobacco Acid Pyrophosphatase) treatment, RNA ligation, synthesis of first strand cDNA and removal of RNA were performed. Next, using 5′ (SEQ ID NO:3) and 3′ (SEQ ID NO:4) PCR primers, by PCR (polymerase chain reaction), the first strand cDNA was converted to a double-stranded cDNA, and cleaved with SfiI. Next, the cDNA fragment, usually fractionated into 2 kb or more (3 kb or more as the case may be), was cloned into the vector pME18SFL3 (GenBank AB009864, Expression vector), previously cleaved with DraIII, in a determined orientation of the cDNA, whereby a cDNA library was prepared.
The relationships between the names of the cDNA libraries used for 5′-terminal sequence analysis of the cDNAs and the derivations thereof are shown in Tables 1-1 to 1-6. The number of 5′-terminal sequences of the cDNAs in each cDNA library after mapping onto the human genome is also shown in Table 1.
3) 5′-terminal sequence analysis of cDNAs from cDNA Libraries Prepared by the Improved Oligocap Method
The 5′-terminal nucleic acid sequences of cDNAs acquired from each cDNA library, after a sequencing reaction using a DNA sequencing reagent (BigDye Terminator Cycle Sequencing FS Ready Reaction Kit, manufactured by PE Biosystems) according to the manual, were analyzed using a DNA sequencer (ABI PRISM 3700, manufactured by PE Biosystems). For the data obtained, a database was constructed. The 5′-terminus full-length rate of each cDNA library prepared by the improved oligocap method was 90% on average, being a high full-length rate (calculated with the protein coding region of a known mRNA as an index).
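The 5′-terminus full-length rate can be illustrated with a minimal Python sketch. The specification states only that the rate was calculated with the protein coding region of a known mRNA as an index; the fragment below assumes that a clone matching a known mRNA is counted as full-length at its 5′ terminus when its first nucleotide lies at or upstream of the start of that mRNA's protein coding region. The function name and the positions are hypothetical.

```python
def five_prime_full_length_rate(clones) -> float:
    """Fraction of clones whose 5' end reaches the coding-region start of the known mRNA.

    `clones` is a sequence of (clone_start, cds_start) pairs, both given as
    1-based positions on the matching known mRNA. A clone is counted as
    full-length when clone_start <= cds_start (an assumed criterion).
    """
    clones = list(clones)
    if not clones:
        raise ValueError("no clones to evaluate")
    full_length = sum(1 for clone_start, cds_start in clones if clone_start <= cds_start)
    return full_length / len(clones)


# Hypothetical positions for ten clones (illustration only).
example_clones = [
    (1, 120), (5, 88), (12, 240), (3, 61), (40, 150),
    (2, 95), (1, 310), (27, 77), (9, 133),
    (210, 102),  # this clone starts inside the coding region and is counted as truncated
]
rate = five_prime_full_length_rate(example_clones)
print(f"{rate:.0%}")
```

Nine of the ten toy clones begin upstream of the coding region, so the sketch reports a rate of 90%; the figure is chosen only to mirror the average stated above and carries no experimental meaning.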
4) Full-Length cDNA Nucleic Acid Analysis
For cDNAs selected for full-length cDNA nucleic acid analysis, the nucleic acid sequence of each full-length cDNA was determined. The nucleic acid sequences were determined mainly by a primer walking method based on the dideoxy terminator method using a custom-synthesized DNA primer. Specifically, a sequencing reaction was performed using a custom-synthesized DNA primer with a DNA sequencing reagent manufactured by PE Biosystems as directed in the manual, after which the DNA nucleic acid sequence was analyzed using a sequencer manufactured by the same company. The full-length nucleic acid sequence was finally established by completely overlapping the partial nucleic acid sequences determined by the above-described method. Next, the region of translation into protein was predicted from the determined full-length cDNA nucleic acid sequence, and the amino acid sequence was determined.
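The specification does not say how the region of translation was predicted. One simple heuristic, shown here only as an assumed stand-in, is to take the longest open reading frame that runs from an ATG to an in-frame stop codon on the sense strand (the cDNAs were cloned in a determined orientation) and to translate it with the standard genetic code. The function name `longest_orf` and the toy sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def longest_orf(cdna: str):
    """Return (start, end, protein) for the longest ATG-to-stop ORF, or None.

    Positions are 0-based; `end` is exclusive and includes the stop codon.
    Only the three forward reading frames are scanned, and within a frame
    the first ATG after the previous stop is taken as the start.
    """
    cdna = cdna.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(cdna) - 2, 3):
            codon = cdna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and CODON_TABLE.get(codon) == "*":
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None
    if best is None:
        return None
    start, end = best
    # Translate up to, but not including, the stop codon; unknown codons become X.
    protein = "".join(CODON_TABLE.get(cdna[j:j + 3], "X")
                      for j in range(start, end - 3, 3))
    return start, end, protein


# Hypothetical toy cDNA: 5' UTR, a four-codon ORF, a stop codon, 3' UTR.
TOY_CDNA = "GGCACC" + "ATGGCTAAAGGTTGA" + "CCTT"
print(longest_orf(TOY_CDNA))  # (6, 21, 'MAKG')
```

Actual coding-region prediction usually weighs additional evidence, such as similarity to known proteins and the sequence context of the start codon, which this sketch leaves out.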
(2) Preparation of cDNA Libraries by the Oligocap Method and Sequence Analysis
1) Preparation of cDNA libraries by the oligocap method
NT-2 neuronal precursor cells (purchased from Stratagene), which are teratocarcinoma cells derived from human fetal testis and can be differentiated into nerve cells by retinoic acid treatment, were treated per the attached manual and used as follows.

- NT-2 cells cultured without differentiation induction with retinoic acid (NT2RM)
- NT-2 cells cultured, followed by differentiation induction by the addition of retinoic acid, then cultured for 2 days and 2 weeks (NT2RP)

Cultured human cell SK-N-MC (ATCC HTB-10) (SKNMC), cultured human cell Y79 (ATCC HTB-18) (Y79AA), cultured human cell GI1 (RCB RCB0763) (BGGI1), cultured human cell H4 (ATCC HTB-148) (BNGH4), cultured human cell IMR32 (ATCC CCL-127) (IMR32), and cultured human cell NB9 (RCB #RCB0477) (NB9N4) were cultured by the methods described in the catalogues thereof. RCB indicates that the cell line was supplied by the RIKEN Gene Bank—Cell Development Bank, and ATCC indicates that the cell line was supplied by the American Type Culture Collection.
The cultured cells of each line were collected, and mRNAs were extracted by a method described in the literature (J. Sambrook, E. F. Fritsch & T. Maniatis, Molecular Cloning Second edition, Cold Spring Harbor Laboratory Press, 1989). Furthermore, poly(A)+ RNAs were purified by means of oligo dT cellulose. Likewise, mRNAs were extracted by the same method from human placenta tissue (PLACE), human ovarian cancer tissue (OVARC), tissue rich in head portion from 10-week-gestational fetal human (HEMBA), tissue rich in trunk portion from 10-week-gestational fetal human (HEMBB), human mammary gland tissue (MAMMA), human thyroid tissue (THYRO), and human vascular endothelial tissue primary culture cell (VESEN). Furthermore, poly(A)+ RNAs were purified by means of oligo dT cellulose.
From all these poly(A)+ RNAs, by the oligocap method [M. Maruyama and S. Sugano, Gene, 138: 171-174 (1994)], respective cDNA libraries were prepared. Using an Oligo-cap linker (SEQ ID NO:1) and an Oligo dT primer (SEQ ID NO:2), as directed in the literature [Suzuki and Sugano, Protein, Nucleic Acid and Enzyme, 41: 197-201 (1996), Y. Suzuki et al., Gene, 200: 149-156 (1997)], BAP (Bacterial Alkaline Phosphatase) treatment, TAP (Tobacco Acid Pyrophosphatase) treatment, RNA ligation, synthesis of first strand cDNA and removal of RNA were performed. Next, using 5′ (SEQ ID NO:3) and 3′ (SEQ ID NO:4) PCR primers, the first strand cDNA was converted to a double-stranded cDNA by PCR (polymerase chain reaction), and cleaved with SfiI. Next, the cDNA was cloned into the vector pUC19FL3 (for some cases of NT2RM and NT2RP) or pME18SFL3 (GenBank AB009864, Expression vector), previously cleaved with DraIII, in a determined orientation of the cDNA, whereby a cDNA library was prepared.
The relationships between the names of the cDNA libraries used for 5′-terminal sequence analysis of the cDNAs and the derivations thereof are shown in Tables 1-1 to 1-6. The numbers of 5′-terminal sequences of the cDNAs in each cDNA library after mapping onto the human genome are also shown in Tables 1-1 to 1-6.

	TABLES 1-1 to 1-6

Library name	Derivation	Source of mRNAs	Number of 5′-terminal sequences (only those which permitted mapping onto human genome)

Improved oligocap method

CORDB	Cord blood	Extraction of mRNAs from	708
		human tissues
CTONG	Tongue, Cancer	Extraction of mRNAs from	31,371
		human tissues
FCBBF	Brain, Fetal	Extraction of mRNAs from	31,986
		human tissues
NTONG	Tongue	Extraction of mRNAs from	7,125
		human tissues
OCBBF	Brain, Fetal	Extraction of mRNAs from	47,574
		human tissues
PLACE	Placenta	Extraction of mRNAs from	33,231
		human tissues
SYNOV	Synovial membrane tissue from	Extraction of mRNAs from	27,489
	rheumatoid arthritis	human tissues
BRAMY	Brain, amygdala, CLONTECH #6574-1	mRNAs from human tissues	58,640
		purchased as polyA(+)
		RNAs
BRCAN	Brain, caudate nucleus, CLONTECH	mRNAs from human tissues	25,786
	#6575-1	purchased as polyA(+)
		RNAs
BRCOC	Brain, corpus callosum, CLONTECH	mRNAs from human tissues	16,718
	#6577-1	purchased as polyA(+)
		RNAs
BRHIP	Brain, hippocampus, CLONTECH #6578-1	mRNAs from human tissues	57,918
		purchased as polyA(+)
		RNAs
BRSSN	Brain, substantia nigra, CLONTECH	mRNAs from human tissues	15,897
	#6580-1	purchased as polyA(+)
		RNAs
BRSTN	Brain, subthalamic nucleus, CLONTECH	mRNAs from human tissues	16,308
	#6581-1	purchased as polyA(+)
		RNAs
BRTHA	Brain, thalamus, CLONTECH #6582-1	mRNAs from human tissues	53,267
		purchased as polyA(+)
		RNAs
ADIPS	Adipose, Invitrogen #D6005-01	mRNAs from human tissues	608
		purchased as total RNAs
ADRGL	Adrenal gland, CLONTECH #64016-1	mRNAs from human tissues	10,223
		purchased as total RNAs
BEAST	Adult Breast, STRATAGENE #735044	mRNAs from human tissues	2,731
		purchased as total RNAs
BLADE	Bladder, Invitrogen #D6020-01	mRNAs from human tissues	8,431
		purchased as total RNAs
BRACE	Brain, cerebellum, CLONTECH #64035-1	mRNAs from human tissues	82,880
		purchased as total RNAs
BRALZ	Brain, cortex, Alzheimer, Invitrogen	mRNAs from human tissues	16,360
	#D6830-01	purchased as total RNAs
BRASW	A library generated by subtracting cDNAs	mRNAs from human tissues	157
	that overlap with the mRNA of BRAWH	purchased as total RNAs
	(Brain, whole, CLONTECH #64020-1) from
	a cDNA library prepared from the mRNA of
	BRALZ (Brain, cortex, Alzheimer,
	Invitrogen #D6830-01), using Subtract Kit
	(Invitrogen #K4320-01) (BRALZ-BRAWH)
BRAWH	Brain, whole, CLONTECH #64020-1	mRNAs from human tissues	59,069
		purchased as total RNAs
CERVX	Cervix, Invitrogen #D6047-01	mRNAs from human tissues	2,836
		purchased as total RNAs
COLON	Colon, Invitrogen #D6050-0	mRNAs from human tissues	8,398
		purchased as total RNAs
FEBRA	Brain, Fetal, CLONTECH #64019-1	mRNAs from human tissues	23,578
		purchased as total RNAs
FEHRT	Heart, Fetal, STRATAGENE #738012	mRNAs from human tissues	2,859
		purchased as total RNAs
FEKID	Kidney, Fetal, STRATAGENE #738014	mRNAs from human tissues	2,747
		purchased as total RNAs
FELIV	Liver, Fetal, CLONTECH #64018-1	mRNAs from human tissues	186
		purchased as total RNAs
FELNG	Lung, Fetal, STRATAGENE #738020	mRNAs from human tissues	2,764
		purchased as total RNAs
HEART	Heart, CLONTECH #64025-1	mRNAs from human tissues	8,889
		purchased as total RNAs
HLUNG	Lung, CLONTECH #64023-1	mRNAs from human tissues	16,146
		purchased as total RNAs
KIDNE	Kidney, CLONTECH #64030-1	mRNAs from human tissues	17,008
		purchased as total RNAs
LIVER	Liver, CLONTECH #64022-1	mRNAs from human tissues	6,843
		purchased as total RNAs
MAMGL	Mammary Gland, CLONTECH #64037-1	mRNAs from human tissues	182
		purchased as total RNAs
NESOP	Esophagus, Invitrogen #D6060-01	mRNAs from human tissues	2,690
		purchased as total RNAs
NOVAR	Adult Ovary, STRATAGENE #735260	mRNAs from human tissues	2,486
		purchased as total RNAs
PANCR	Pancreas, CLONTECH #64031-1	mRNAs from human tissues	179
		purchased as total RNAs
PERIC	Pericardium, Invitrogen #D6105-01	mRNAs from human tissues	8,781
		purchased as total RNAs
PROST	Prostate, CLONTECH #64038-1	mRNAs from human tissues	16,671
		purchased as total RNAs
RECTM	Rectum, Invitrogen #D6110-01	mRNAs from human tissues	2,723
		purchased as total RNAs
SALGL	Salivary Gland, CLONTECH #64026-1	mRNAs from human tissues	183
		purchased as total RNAs
SKMUS	Skeletal Muscle, CLONTECH #64033-1	mRNAs from human tissues	8,424
		purchased as total RNAs
SMINT	Small Intestine, CLONTECH #64039-1	mRNAs from human tissues	16,767
		purchased as total RNAs
SPLEN	Spleen, CLONTECH #64034-1	mRNAs from human tissues	33,950
		purchased as total RNAs
STOMA	Stomach, CLONTECH #64090-1	mRNAs from human tissues	8,685
		purchased as total RNAs
TBAES	Breast, Tumor, CLONTECH #64015-1	mRNAs from human tissues	8,416
		purchased as total RNAs
TCERX	Cervix, Tumor, CLONTECH #64010-1	mRNAs from human tissues	2,797
		purchased as total RNAs
TCOLN	Colon, Tumor, CLONTECH #64014-1	mRNAs from human tissues	2,798
		purchased as total RNAs
TESOP	Esophageal, Tumor, Invitrogen #D6860-01	mRNAs from human tissues	8,500
		purchased as total RNAs
TESTI	Testis, CLONTECH #64027-1	mRNAs from human tissues	90,188
		purchased as total RNAs
THYMU	Thymus, CLONTECH #64028-1	mRNAs from human tissues	70,578
		purchased as total RNAs
TKIDN	Kidney, Tumor, Invitrogen #D6870-01	mRNAs from human tissues	15,970
		purchased as total RNAs
TLIVE	Liver, Tumor, Invitrogen #D6880-01	mRNAs from human tissues	8,627
		purchased as total RNAs
TLUNG	Lung, Tumor, CLONTECH #64013-1	mRNAs from human tissues	2,844
		purchased as total RNAs
TOVAR	Ovary, Tumor, CLONTECH #64011-1	mRNAs from human tissues	2,722
		purchased as total RNAs
TRACH	Trachea, CLONTECH #64091-1	mRNAs from human tissues	52,352
		purchased as total RNAs
TSTOM	Stomach, Tumor, Invitrogen #D6920-01	mRNAs from human tissues	2,757
		purchased as total RNAs
TUTER	Uterus, Tumor, CLONTECH #64008-1	mRNAs from human tissues	2,668
		purchased as total RNAs
UTERU	Uterus, CLONTECH #64029-1	mRNAs from human tissues	49,561
		purchased as total RNAs
ACTVT	Activated T-cell	Extraction of mRNAs from	679
		primary culture human cells
ASTRO	Normal Human Astrocyte NHA5732,	Extraction of mRNAs from	17,162
	Takara Shuzo #CC2565	primary culture human cells
CD34C	CD34+ cell (AllCells, LLC #CB14435M)	Extraction of mRNAs from	1,420
		primary culture human cells
D3OST	CD34+ cells treated with osteoclast	Extraction of mRNAs from	5,092
	differentiation factor (ODF) to induce	primary culture human cells
	differentiation for 3 days
D6OST	CD34+ cells treated with osteoclast	Extraction of mRNAs from	888
	differentiation factor (ODF) to induce	primary culture human cells
	differentiation for 6 days
D9OST	CD34+ cells treated with osteoclast	Extraction of mRNAs from	4,407
	differentiation factor (ODF) to induce	primary culture human cells
	differentiation for 9 days
DFNES	Normal Human Dermal Fibroblasts	Extraction of mRNAs from	10,103
	(Neonatal Skin; NHDF-Neo) NHDF2564,	primary culture human cells
	Takara Shuzo #CC2509
HCASM	HCASMC (Human coronary artery smooth	Extraction of mRNAs from	8,949
	muscle cells), Toyobo #T305K-05	primary culture human cells
HCHON	HC (Human Chondrocytes), Toyobo	Extraction of mRNAs from	9,397
	#T402K-05	primary culture human cells
HHDPC	HDPC (Human dermal papilla cells),	Extraction of mRNAs from	8,453
	Toyobo #THPCK-001	primary culture human cells
HSYRA	HS-RA (Human synoviocytes from	Extraction of mRNAs from	7,955
	rheumatoid arthritis), Toyobo #T404K-05	primary culture human cells
LYMPB	Lymphoblast, EB virus transferred B cell	Extraction of mRNAs from	2,617
		primary culture human cells
MESAN	Normal human mesangial cells	Extraction of mRNAs from	16,053
	NHMC56046-2, Takara Shuzo	primary culture human cells
NETRP	Neutrophil	Extraction of mRNAs from	9,170
		primary culture human cells
NHNPC	Normal human neural progenitor cells	Extraction of mRNAs from	2,377
	NHNP5958, Takara Shuzo	primary culture human cells
PEBLM	Human peripheral blood mononuclear cells	Extraction of mRNAs from	7,900
	HPBMC5939, Takara Shuzo #CC2702	primary culture human cells
PUAEN	Human pulmonary artery endothelial cells,	Extraction of mRNAs from	10,544
	Toyobo #T302K-05	primary culture human cells
UMVEN	Human umbilical vein endothelial cells	Extraction of mRNAs from	631
	HUVEC, Toyobo	primary culture human cells
3NB69	NB69 cell (RCB #RCB0480)	Extraction of mRNAs from	8,153
		cultured human cells
AHMSC	HMSC cell (Human mesenchymal cell)	Extraction of mRNAs from	668
		cultured human cells
BGGI1	GI1 cell (Glioma separated from	Extraction of mRNAs from	1,899
	gliosarcoma; RCB #RCB0763)	cultured human cells
BNGH4	H4 cell (Neuroglioma; ATCC #HTB-148)	Extraction of mRNAs from	7,699
		cultured human cells
CHONS	Chondrocyte; Cell Applications, Inc.	Extraction of mRNAs from	2,687
	#1205F	cultured human cells
ERLTF	TF-1 cell (erythroleukemia)	Extraction of mRNAs from	2,169
		cultured human cells
HELAC	HeLa cell (from cervical cancer)	Extraction of mRNAs from	676
		cultured human cells
IMR32	IMR32 cell (Neuroblastoma; ATCC #CCL-	Extraction of mRNAs from	16,867
	127)	cultured human cells
JCMLC	Leukemia, myelogenous	Extraction of mRNAs from	2,156
		cultured human cells
MESTC	Mesenchyme stem cell	Extraction of mRNAs from	687
		cultured human cells
N1ESE	Mesenchymal stem cell	Extraction of mRNAs from	2,624
		cultured human cells
NB9N4	NB9 cell (Neuroblastoma; RCB #RCB0477)	Extraction of mRNAs from	1,759
		cultured human cells
NCRRM	Embryonal carcinoma	Extraction of mRNAs from	698
		cultured human cells
NCRRP	Embryonal carcinoma treated with retinoic	Extraction of mRNAs from	691
	acid (RA) to induce differentiation	cultured human cells
NT2NE	NT2 cell treated with RA and treated with a	Extraction of mRNAs from	16,337
	growth inhibitor to induce nerve	cultured human cells
	differentiation, followed by nerve
	concentration and recovery (NT2 Neuron)
NT2RI	NT2 cell treated with RA to induce	Extraction of mRNAs from	32,662
	differentiation for 5 weeks, and thereafter	cultured human cells
	treated with a growth inhibitor for 2 weeks
NT2RM	NT2 cell (STRATAGENE #204101)	Extraction of mRNAs from	2,026
		cultured human cells
NT2RP	NT2 cell treated with retinoic acid (RA) to	Extraction of mRNAs from	24,634
	induce differentiation for 5 weeks	cultured human cells
NTISM	a library generated by subtracting cDNAs	Extraction of mRNAs from	180
	that overlap with the mRNA of	cultured human cells
	undifferentiated NT2 cells from a cDNA
	library prepared from an mRNA of NT2 cell
	(STRATAGENE #204101) treated with
	RA to induce differentiation for 5 weeks,
	and thereafter treated with a growth
	inhibitor for 2 weeks, using Subtract Kit
	(Invitrogen #K4320-01) (NT2RI-NT2RM)
SKNMC	SK-N-MC cell (Neuroepithelioma; ATCC	Extraction of mRNAs from	7,607
	#HTB-10)	cultured human cells
SKNSH	SK-N-SH cell (Neuroblastoma; RCB	Extraction of mRNAs from	8,662
	#RCB0426)	cultured human cells
T1ESE	Mesenchymal stem cell treated with	Extraction of mRNAs from	2,685
	trichostatin and 5-azacytidine to induce	cultured human cells
	differentiation

Oligocap method

HEMBA	tissue rich in head portion from 10-week-	mRNAs from human tissues	7,033
	gestational fetal human (whole embryo,
	mainly head)
HEMBB	tissue rich in trunk portion from 10-week-	mRNAs from human tissues	2,581
	gestational fetal human (whole embryo,
	mainly body)
MAMMA	Mammary Gland	mRNAs from human tissues	2,987
OVARC	Ovary, Tumor	mRNAs from human tissues	2,058
PLACE	Placenta	mRNAs from human tissues	12,859
THYRO	Thyroid gland	mRNAs from human tissues	1,863
VESEN	Human umbilical vein endothelial cells	Extraction of mRNAs from	1,309
		primary culture human cells
NB9N3	NB9 cell (Neuroblastoma; RCB #RCB0477)	Extraction of mRNAs from	96
		cultured human cells
NT2RM	NT2 cell (STRATAGENE #204101)	Extraction of mRNAs from	5,375
		cultured human cells
NT2RP	NT2 cell treated with retinoic acid (RA) to	Extraction of mRNAs from	14,608
	induce differentiation for 2 days and 2	cultured human cells
	weeks
Y79AA	Y79 cell (Retinoblastoma; ATCC HTB-18)	Extraction of mRNAs from	2,377
		cultured human cells
BGGI1	GI1 cell (Glioma separated from	Extraction of mRNAs from	62
	gliosarcoma; RCB #RCB0763)	cultured human cells
BNGH4	H4 cell (Neuroglioma; ATCC #HTB-148)	Extraction of mRNAs from	89
		cultured human cells
IMR32	IMR32 cell (Neuroblastoma; ATCC #CCL-	Extraction of mRNAs from	94
	127)	cultured human cells
SKNMC	SK-N-MC cell (Neuroepithelioma; ATCC	Extraction of mRNAs from	92
	#HTB-10)	cultured human cells

Either oligocap method or improved oligocap method (not distinguished)

BGGI1	GI1 cell (Glioma separated from	Extraction of mRNAs from	1
	gliosarcoma; RCB #RCB0763)	cultured human cells
BNGH4	H4 cell (Neuroglioma; ATCC #HTB-148)	Extraction of mRNAs from	3
		cultured human cells
IMR32	IMR32 cell (Neuroblastoma; ATCC #CCL-	Extraction of mRNAs from	1
	127)	cultured human cells
SKNMC	SK-N-MC cell (Neuroepithelioma; ATCC	Extraction of mRNAs from	1
	#HTB-10)	cultured human cells
NT2RM	NT2 cell (STRATAGENE #204101)	Extraction of mRNAs from	48
		cultured human cells

Total		1,440,790

2) 5′-terminal sequence analysis of cDNAs from cDNA Libraries Prepared by the Oligocap Method

The 5′-terminal or 3′-terminal nucleic acid sequences of cDNAs acquired from each cDNA library, after a sequencing reaction using a DNA sequencing reagent (Dye Terminator Cycle Sequencing FS Ready Reaction Kit, dRhodamine Terminator Cycle Sequencing FS Ready Reaction Kit or BigDye Terminator Cycle Sequencing FS Ready Reaction Kit, manufactured by PE Biosystems) according to the manual, were analyzed for DNA nucleic acid sequences using a DNA sequencer (ABI PRISM 377, manufactured by PE Biosystems). For the data obtained, a database was constructed. The 5′-terminus full-length rate of each cDNA library prepared by the oligocap method was 60% on average (calculated with the protein coding region of a known mRNA as an index).
3) Full-Length cDNA Nucleic Acid Analysis
For cDNAs selected for full-length cDNA nucleic acid analysis, the nucleic acid sequence of each full-length cDNA was determined. The nucleic acid sequences were determined mainly by a primer walking method based on the dideoxy terminator method using a custom-synthesized DNA primer. Specifically, a sequencing reaction was performed using a custom-synthesized DNA primer with a DNA sequencing reagent manufactured by PE Biosystems as directed in the manual, after which the DNA nucleic acid sequence was analyzed using a sequencer manufactured by the same company. For some clones, a DNA sequencer manufactured by Licor was also utilized. For some cDNAs, no custom primer was used; instead, the shotgun method, in which cDNA-containing plasmids are randomly cleaved, was used with a DNA sequencer to determine the DNA nucleic acid sequence. The full-length nucleic acid sequence was finally established by completely overlapping the partial nucleic acid sequences determined by the above-described method. Next, the region of translation into protein was predicted from the determined full-length nucleic acid sequence, and the amino acid sequence was determined.

Example 2

Genome Mapping and Clustering

(1) Sequence Data Set

The following sequences were used as a data set.
Human genome sequence: UCSC hg 17 (NCBI Build 35) (http://www.genome.ucsc.edu/)
Human full-length cDNAs, 19,265 sequences, newly acquired and subjected to full-length cDNA sequence analysis by us
Out of human full-length cDNA sequences acquired and subjected to full-length cDNA sequence analysis by us, and registered with an existing public database (DDBJ/GenBank/EMBL) (accession numbers: AB038269, AB045981, AB056476, AB056477, AK000001 to AK002212, AK021413 to AK027260, AK027263 to AK027902, AK054561 to AK058202, AK074029 to AK074481, AK074483 to AK075325, AK075326 to AK075566, AK090395 to AK098842, AK122580 to AK129030, AK129488 to AK131107, AK131190 to AK131575, AK160364 to AK160386, AK172724 to AK172740, AK172741 to AK172866), 30,754 sequences that can be used for genome mapping
2039 sequences that had been registered with the database HUGE of Kazusa DNA Research Institute by Feb. 3, 2005 (http://www.kazusa.or.jp/huge/)
Human full-length cDNAs, 20,878 sequences, that had been listed on the Full Length Clone List on the website of Mammalian Gene Collection (http://mgc.nci.nih.gov/) and included in GenBank gbpri (ftp://ftp.ncbi.nih.gov/genbank/) by Jan. 30, 2005
Human full-length cDNAs, 9,280 sequences, that had been registered as Deutsches Krebsforschungszentrum (DKFZ) in GenBank gbpri before Jan. 30, 2005
Human full-length cDNAs, 13,984 sequences, being constituent sequences of the human RefSeq sequences of the Jan. 31, 2005 version (http://www.ncbi.nlm.nih.gov/RefSeq/), registered as mRNAs, and included in GenBank gbpri
Human RefSeq sequences of the Jan. 31, 2005 version (http://www.ncbi.nlm.nih.gov/RefSeq/), 28,931 sequences
Out of the human genome assembly sequences in the Feb. 10, 2005 release of Ensembl (http://www.ensembl.org/) (NCBI35.nov_26.35), 33,666 sequences of NCBI35.nov_26.35 that had been mapped to the hg 17 human genome in UCSC (University of California, Santa Cruz, http://www.genome.ucsc.edu/)
Human cDNA 5′-terminal sequence, 1,456,213 sequences, and 3′-terminal sequence, 109,283 sequences, subjected to sequence analysis in our project (including published sequences with accession numbers: AU116788 to AU160826, AU279383 to AU280837, DA000001 to DA999999, DB000001 to DB384947)

(2) Genome Mapping

The above-described data set was subjected to genome mapping using BLASTN (ftp://ftp.ncbi.nih.gov/blast/), under the conditions of Identity of 95% or more and consensus length of 50 base pairs (bp) or more. About 99% of the sequences in the data set used for the mapping permitted genome mapping.
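The mapping thresholds stated above can be illustrated with a short sketch. It assumes that the BLASTN hits are available in the common 12-column tabular layout (query, subject, percent identity, alignment length, and so on) and that "consensus length" corresponds to the alignment length column; neither assumption is stated in the original text, and the example hit lines are invented for illustration only.

```python
from typing import Iterable, Iterator, NamedTuple

MIN_IDENTITY = 95.0  # percent identity threshold given in the text
MIN_LENGTH = 50      # minimum aligned length (bp) given in the text


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float
    length: int
    s_start: int
    s_end: int


def parse_tabular(lines: Iterable[str]) -> Iterator[Hit]:
    """Parse 12-column tabular BLAST lines, skipping blanks and comments."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        yield Hit(f[0], f[1], float(f[2]), int(f[3]), int(f[8]), int(f[9]))


def passes(hit: Hit) -> bool:
    """Apply the identity and length thresholds to one hit."""
    return hit.identity >= MIN_IDENTITY and hit.length >= MIN_LENGTH


# Hypothetical hits: only the first satisfies both thresholds.
example = [
    "cdna1\tchr1\t99.2\t480\t3\t0\t1\t480\t1000\t1479\t0.0\t850",
    "cdna2\tchr2\t93.0\t300\t21\t0\t1\t300\t500\t799\t1e-90\t400",
    "cdna3\tchr3\t100.0\t40\t0\t0\t1\t40\t10\t49\t1e-10\t75",
]
kept = [h for h in parse_tabular(example) if passes(h)]
mapped_fraction = len({h.query for h in kept}) / 3
```

A query sequence counts as mapped here when at least one of its hits passes both thresholds; how multiple passing hits for one sequence were resolved is not described in the text and is left out of the sketch.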

(3) Clustering

After the genome mapping, a sequence group contained in a genome region, as a single assembly, was allowed to form a cluster. Hence, each cluster was chosen in a way such that the outer sides of both ends of each genome region in the sequence group would not overlap the sequences mapped on each genome region. As a result, a total of 87,173 clusters existed. Therefrom, 17,535 clusters configured solely with human cDNA 3′-terminal sequences that were acquired and subjected to sequence analysis in our project were excluded, leaving 69,638 clusters. Of these clusters, 36,782 clusters were excluded since they were configured solely with human cDNA 5′-terminal sequences that were acquired and subjected to sequence analysis in our project (those having none of full-length cDNA, RefSeq, and Ensembl sequences were excluded). As a result, 32,856 clusters were found to comprise at least one of full-length cDNAs, RefSeq, and Ensembl sequences. By selecting clusters comprising one or more of full-length cDNAs, RefSeq, and Ensembl sequences, which are expected to have an ORF (open reading frame, coding region) with a reliability above a given level, 21,703 clusters were acquired. For these 21,703 clusters, expression specificity was determined.
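The clustering step and the cluster counts above can be sketched as follows. The sketch assumes that sequences mapped to overlapping genome regions on the same chromosome and strand are merged into one cluster; the handling of strand and the function name `cluster_intervals` are illustrative assumptions, not details taken from the text. The final lines only re-check the arithmetic of the cluster counts reported above.

```python
from collections import defaultdict


def cluster_intervals(mappings):
    """Group mapped sequences whose genome regions overlap.

    mappings: iterable of (seq_id, chrom, strand, start, end), inclusive.
    Returns a list of (chrom, strand, start, end, [seq_ids]).
    """
    by_locus = defaultdict(list)
    for seq_id, chrom, strand, start, end in mappings:
        by_locus[(chrom, strand)].append((min(start, end), max(start, end), seq_id))

    clusters = []
    for (chrom, strand), items in sorted(by_locus.items()):
        items.sort()
        cur_start, cur_end, members = items[0][0], items[0][1], [items[0][2]]
        for start, end, seq_id in items[1:]:
            if start <= cur_end:
                # Overlaps the growing region: extend the current cluster.
                cur_end = max(cur_end, end)
                members.append(seq_id)
            else:
                # A gap: close the current cluster and start a new one.
                clusters.append((chrom, strand, cur_start, cur_end, members))
                cur_start, cur_end, members = start, end, [seq_id]
        clusters.append((chrom, strand, cur_start, cur_end, members))
    return clusters


# Hypothetical mappings for illustration.
demo = cluster_intervals([
    ("a", "chr1", "+", 100, 500),
    ("b", "chr1", "+", 450, 900),
    ("c", "chr1", "+", 2000, 2500),
    ("d", "chr1", "-", 480, 700),
])

# Cluster counts reported in the text.
total = 87_173
after_3prime_only_removed = total - 17_535
after_5prime_only_removed = after_3prime_only_removed - 36_782
```

With the reported figures, the two subtractions give 69,638 and 32,856 clusters, matching the text; the further reduction to 21,703 clusters depends on the ORF reliability criterion, which is not specified in enough detail to reproduce here.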

Example 3

Experimental Procedures for Real-Time PCR

(1) Synthesis of Template cDNAs
1) Human mRNA (Human Total RNA) Used as Template
A reaction was carried out with 50 μg of Human Total RNA per 150 μl of the system.
To 50 μg of Total RNA dissolved in 87 μl of H₂O, 10 μl of a random primer (concentration 65 ng/μl) and 7.5 μl of dNTP Mix (concentration 10 mM each dNTP) were added. This was followed by incubation at 65° C. for 5 minutes and on ice for 1 minute. Then 30 μl of 5× reaction buffer solution (attached to the Invitrogen SuperScript III RT kit), 7.5 μl of 0.1M DTT, 3 μl of RNase Inhibitor (STRATAGENE) and 5 μl of SuperScript III RT (Invitrogen) were added. This was followed by incubation at 25° C. for 5 minutes, incubation at 50° C. for 60 minutes, and incubation at 70° C. for 15 minutes. After the reaction, phenol-chloroform extraction was performed to deactivate the enzyme. By adding 3 μl of EDTA (0.5M) and 22.5 μl of 0.1N NaOH, alkali treatment was performed to degrade the RNA. After 30 μl of Tris (1M pH 7.8) was added to neutralize the reaction liquid, ethanol precipitation was performed, and the precipitate was dissolved in 100 μl of TE buffer solution.
Human mRNAs from the mRNA sources (Human Total RNAs) were acquired by the method described in Example 1.
A list of the human mRNAs used in the experiments is shown in Table 2.
2) Human mRNA (Human PolyA(+) RNA) Used as Template
A reaction was carried out with 5 μg of human PolyA(+) RNA per 100 μl of the system.
To 5 μg of PolyA(+) RNA dissolved in 58 μl of H₂O, 5 μl of a random primer (concentration 65 ng/μl) and 5 μl of dNTP Mix (concentration 10 mM each dNTP) were added. This was followed by incubation at 65° C. for 5 minutes and incubation on ice for 1 minute. Then 20 μl of 5× reaction buffer solution (attached to the Invitrogen SuperScript III RT kit), 5 μl of 0.1M DTT, 2 μl of RNase Inhibitor (STRATAGENE) and 5 μl of SuperScript III RT (Invitrogen) were added. This was followed by incubation at 25° C. for 5 minutes, incubation at 50° C. for 60 minutes, and incubation at 70° C. for 15 minutes. After the reaction, phenol-chloroform extraction was performed to deactivate the enzyme. By adding 2 μl of EDTA (0.5M) and 15 μl of 0.1N NaOH, alkali treatment was performed to degrade the RNA. After 20 μl of Tris (1M pH 7.8) was added to neutralize the reaction liquid, ethanol precipitation was performed, and the precipitate was dissolved in 50 μl of TE buffer solution.
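As a consistency check on the two reverse transcription mixes described above, the following sketch sums the component volumes and derives the final concentrations of buffer, dNTPs and DTT. The volumes are taken from the text; the dictionary keys and the helper `final_concentration` are illustrative names only.

```python
# Component volumes in microliters, as listed for each reaction.
TOTAL_RNA_MIX = {
    "RNA in H2O": 87.0,
    "random primer": 10.0,
    "dNTP mix": 7.5,
    "5x reaction buffer": 30.0,
    "0.1 M DTT": 7.5,
    "RNase inhibitor": 3.0,
    "SuperScript III RT": 5.0,
}
POLYA_RNA_MIX = {
    "RNA in H2O": 58.0,
    "random primer": 5.0,
    "dNTP mix": 5.0,
    "5x reaction buffer": 20.0,
    "0.1 M DTT": 5.0,
    "RNase inhibitor": 2.0,
    "SuperScript III RT": 5.0,
}


def final_concentration(stock, volume, total_volume):
    """Dilute a stock concentration into the final reaction volume."""
    return stock * volume / total_volume


def summarize(mix):
    total = sum(mix.values())
    return {
        "total_ul": total,
        "buffer_x": final_concentration(5.0, mix["5x reaction buffer"], total),
        "dNTP_mM_each": final_concentration(10.0, mix["dNTP mix"], total),
        "DTT_mM": final_concentration(100.0, mix["0.1 M DTT"], total),
    }


total_rna_summary = summarize(TOTAL_RNA_MIX)
polya_summary = summarize(POLYA_RNA_MIX)
```

Under these figures both reactions reach their stated volumes (150 μl and 100 μl) and the same final composition: 1× buffer, 0.5 mM of each dNTP and 5 mM DTT.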
A list of the human mRNAs used in the experiments is shown in Table 2.

TABLE 2

No.	Sample	Product name	Manufacturer	Catalog number

Human total RNA purchased

1	Bone Marrow	Human Bone Marrow Total	Clontech	636548
		RNA
2	Brain, whole	Human Brain Total RNA	Clontech	636530
3	Fetal Brain	Human Fetal Brain Total	Clontech	636526
		RNA
4	Heart	Human Heart Total RNA	Clontech	636532
5	Kidney	Human Kidney Total RNA	Clontech	636529/636514
6	Liver	Human Liver Total RNA	Clontech	636531
7	Lung	Human Lung Total RNA	Clontech	636524
8	Thymus	Human Thymus Total RNA	Clontech	636549
9	Uterus	Human Uterus Total RNA	Clontech	636551/636513
10	Spinal Cord	Human Spinal Cord Total	Clontech	636554
		RNA
11	Colon	Human Colon Total RNA	Clontech	636521
12	Colon Tumor	Human Colon Tumor Total	Clontech	636634
		RNA
13	Kidney Tumor	Human Kidney Tumor Total	Clontech	636632
		RNA
14	Liver Tumor	Human Liver Total RNA	CHEMICON	RNA569
15	Lung Tumor	Human Lung Tumor Total	Clontech	636633
		RNA
16	Ovary	Human Ovary Total RNA	Clontech	636555
17	Ovary Tumor	Human Ovary Tumor Total	Clontech	636631
		RNA
18	Spleen	Human Spleen Total RNA	Clontech	636525
19	Stomach	Human Stomach Total RNA	Clontech	636522
20	Stomach Tumor	Human Stomach Tumor	Clontech	636629
		Total RNA
21	Uterus Tumor	Human Uterus Tumor Total	Clontech	636628
		RNA
22	ALZ Visual Cortex Occipital	Human Visual Cortex	Ambion	B6336
		Occipital ALZ Total RNA

Human polyA(+) RNA purchased

1	Brain, whole	Human Brain, whole	Clontech	636102
		PolyARNA
2	Brain cerebellum	Brain, cerebellum	Clontech	636122
3	Brain, amygdala	Brain, amygdala	Clontech	6574-1
4	Brain, caudate nucleus	Brain, caudate nucleus	Clontech	6575-1
5	Brain, corpus callosum	Brain, corpus callosum	Clontech	636133
6	Brain, hippocampus	Brain, hippocampus	Clontech	636134
7	Brain, substantia nigra	Brain, substantia nigra	Clontech	6580-1
8	Brain, thalamus	Brain, thalamus	Clontech	636135
9	Brain, subthalamic nucleus	Brain, subthalamic nucleus	Clontech	636167

Extraction of human total RNA	Explanation of the derivation
from an RNA source	of mRNA

1	Tongue (normal)	Normal tongue tissue
2	Tongue Tumor	Tongue tumor tissue
3	NT2 cell (STRATAGENE	Before treatment with NT2
	#204101)	retinoic acid (RA(−))
4	NT2 cell treated with	NT2 cell treated with retinoic
	retinoic acid (RA) to induce	acid (RA) to induce
	differentiation	differentiation for 5 weeks
5	NT2 cell treated with RA to	NT2 cell treated with RA to
	induce differentiation	induce differentiation for 5
	followed by treatment with	weeks, and thereafter
	a growth inhibitor (Inh)	treated with a growth
		inhibitor for 2 weeks
6	NT2 cell treated with RA to	NT2 cell treated with retinoic
	induce differentiation	acid (RA) to induce
		differentiation for 1 day
7	NT2 cell treated with RA to	NT2 cell treated with retinoic
	induce differentiation	acid (RA) to induce
		differentiation for 2 days
8	NT2 cell treated with RA to	NT2 cell treated with retinoic
	induce differentiation	acid (RA) to induce
		differentiation for 1 week
9	NT2 cell treated with RA	NT2 cell treated with RA and
	and treated with a-Inh to	treated with a growth
	induce nerve differentiation	inhibitor to induce nerve
		differentiation, followed by
		nerve cell concentration and
		recovery

(2) Design of Primers and Probes

Primers and probes were designed using Primer Express software 3.0, the primer design software attached to the Applied Biosystems real-time PCR 7500 Fast, under the conditions recommended by the software. They were designed on the sequences of the portions that serve as the borders of the changing region, so as to allow the individual detection of cDNAs having other splice patterns transcribed from the same chromosome region as the cDNA to be comparatively examined. Real-time PCR was performed using the designed primers, and they were confirmed to produce a single band and to be capable of specifically detecting only one kind of cDNA.

(3) Expressional Analysis Using Real-Time PCR

1) mRNAs Used
All mRNAs used were of human derivation.
The experiments on the four clusters chr14-45, chr7-2007, chr12-1875, and chr3-1507, out of the 10 experimental systems, were performed using SYBR GREEN as a real-time PCR reaction system, with 16 kinds of samples as template cDNAs: NT2 cells [NT2 RA(−)], NT2 cells treated with retinoic acid (RA) to induce differentiation for 24 hours [NT2 RA(+) 24 hr], NT2 cells treated with retinoic acid (RA) to induce differentiation for 48 hours [NT2 RA(+) 48 hr], NT2 cells treated with retinoic acid (RA) to induce differentiation for 1 week [NT2 RA(+) 1 week], NT2 cells treated with retinoic acid (RA) to induce differentiation for 5 weeks [NT2 RA(+)], NT2 cells treated with RA to induce differentiation for 5 weeks, and thereafter treated with a growth inhibitor for 2 weeks [NT2 RA(+) Inh(+)], NT2 cells treated with RA and treated with a growth inhibitor to induce nerve differentiation, followed by nerve concentration and recovery (NT2 Neuron), Brain, Fetal, Brain, whole, Alzheimer patient cerebral cortex (ALZ Visual Cortex Occipital), Mix, viscus tissues [Heart, Kidney, Liver, Lung, Colon, Stomach], Mix, blood cells and related tissues [Bone Marrow, Thymus, Spinal Cord, Spleen], Mix, tumor tissues [Colon Tumor, Kidney Tumor, Liver Tumor, Lung Tumor, Ovary Tumor, Stomach Tumor, Uterus Tumor, Tongue Tumor], Mix, normal tissues [Colon, Kidney, Liver, Lung, Ovary, Stomach, Uterus, Tongue], whole brain polyA(+)RNA [Brain, whole PolyA(+) RNA], and Brain, hippocampus.
For the cluster chr12-1875, experiments were also performed with, in addition to the foregoing 16 kinds, additional samples: Colon, Kidney, Liver, Lung, Ovary, Stomach, Uterus, Tongue, Colon Tumor, Kidney Tumor, Liver Tumor, Lung Tumor, Ovary Tumor, Stomach Tumor, Uterus Tumor, and Tongue Tumor.
For the cluster chr3-1507, experiments were also performed with, in addition to the foregoing 16 kinds, additional samples: Brain cerebellum, Brain, amygdala, Brain, caudate nucleus, Brain, corpus callosum, Brain, substantia nigra, Brain, thalamus, and Brain, subthalamic nucleus.
The experiments on the 2 clusters chr19-32 and chr12+1658, out of the 10 experimental systems, were performed using TaqMan manufactured by Applied Biosystems as a real-time PCR reaction system, with a total of 16 kinds of samples as template cDNAs: NT2 cells [NT2 RA(−)], NT2 cells treated with retinoic acid (RA) to induce differentiation for 24 hours [NT2 RA(+) 24 hr], NT2 cells treated with retinoic acid (RA) to induce differentiation for 48 hours [NT2 RA(+) 48 hr], NT2 cells treated with retinoic acid (RA) to induce differentiation for 1 week [NT2 RA(+) 1 week], NT2 cells treated with retinoic acid (RA) to induce differentiation for 5 weeks [NT2 RA(+)], NT2 cells treated with RA to induce differentiation for 5 weeks, and thereafter treated with a growth inhibitor for 2 weeks [NT2 RA(+) Inh(+)], NT2 cells treated with RA and treated with a growth inhibitor to induce nerve differentiation, followed by nerve concentration and recovery (NT2 Neuron), Brain, Fetal, Brain, whole, Alzheimer patient cerebral cortex (ALZ Visual Cortex Occipital), Mix, viscus tissues [Heart, Kidney, Liver, Lung, Colon, Stomach], Mix, blood cells and related tissues [Bone Marrow, Thymus, Spinal Cord, Spleen], Mix, tumor tissues [Colon Tumor, Kidney Tumor, Liver Tumor, Lung Tumor, Ovary Tumor, Stomach Tumor, Uterus Tumor, Tongue Tumor], Mix, normal tissues [Colon, Kidney, Liver, Lung, Ovary, Stomach, Uterus, Tongue], Brain, whole PolyA(+) RNA, and Brain, hippocampus.
The experiments on the 4 clusters chr2-2324, chrX-900, chr8-916, and chr3+2014, out of the 10 experimental systems, were performed using SYBR GREEN as a real-time PCR reaction system, with a total of 23 kinds of samples as template cDNAs: NT2 cells [NT2 RA(−)], NT2 cells treated with retinoic acid (RA) to induce differentiation for 24 hours [NT2 RA(+) 24 hr], NT2 cells treated with retinoic acid (RA) to induce differentiation for 48 hours [NT2 RA(+) 48 hr], NT2 cells treated with retinoic acid (RA) to induce differentiation for 1 week [NT2 RA(+) 1 week], NT2 cells treated with retinoic acid (RA) to induce differentiation for 5 weeks [NT2 RA(+)], NT2 cells treated with RA to induce differentiation for 5 weeks, and thereafter treated with a growth inhibitor for 2 weeks [NT2 RA(+) Inh(+)], NT2 cells treated with RA and treated with a growth inhibitor to induce nerve differentiation, followed by nerve concentration and recovery (NT2 Neuron), Brain, Fetal, Brain, whole, Alzheimer patient cerebral cortex (ALZ Visual Cortex Occipital), Mix, viscus tissues [Heart, Kidney, Liver, Lung, Colon, Stomach], Mix, blood cells and related tissues [Bone Marrow, Thymus, Spinal Cord, Spleen], Mix, tumor tissues [Colon Tumor, Kidney Tumor, Liver Tumor, Lung Tumor, Ovary Tumor, Stomach Tumor, Uterus Tumor, Tongue Tumor], Mix, normal tissues [Colon, Kidney, Liver, Lung, Ovary, Stomach, Uterus, Tongue], Brain, whole PolyA(+) RNA, Brain, hippocampus, Brain cerebellum, Brain, amygdala, Brain, caudate nucleus, Brain, corpus callosum, Brain, substantia nigra, Brain, thalamus, and Brain, subthalamic nucleus.

2) Reaction System Using SYBR GREEN

The SYBR GREEN I Dye assay chemistry is an experimental system based on the property of SYBR GREEN of emitting strong fluorescence upon binding to double-stranded DNA. When the DNA is denatured into single strands during the PCR reaction, SYBR GREEN dissociates from the DNA and the fluorescence decreases rapidly; with the subsequent annealing/extension reaction, however, it binds to the double-stranded DNA and emits fluorescence again. In the SYBR GREEN I Dye assay chemistry, the fluorescence intensity, which increases with every PCR cycle, is detected.
To 0.2 μl (equivalent to 100 ng of Total RNA) of a cDNA derived from each tissue as the template, Forward Primer (final concentration 250 nM), Reverse Primer (final concentration 250 nM), and SYBR Green PCR Master Mix (ABI 4309155) were added, to make a total volume of 20 μl. For endogenous control, GAPDH (Accession No. NM_002046.2) always served as a reaction control for all templates.
A PCR was performed under the following conditions, which represent the standard protocol for Applied Biosystems real-time PCR 7500 Fast. After an initial step at 50° C. for 2 minutes and at 95° C. for 10 minutes, denaturation at 95° C. for 15 seconds and annealing/elongation at 60° C. for 1 minute were repeated for 40 cycles.
GAPDH-F (SEQ ID NO:5): Forward Primer for endogenous control GAPDH
GAPDH-R (SEQ ID NO:6): Reverse Primer for endogenous control GAPDH

3) Reaction System Using TaqMan

The TaqMan assay chemistry is an experimental system employing the TaqMan probe, a probe phosphorylated at the 3′ terminus and labeled with a Fluorescein-series fluorescent dye (reporter) at the 5′ terminus, and a Rhodamine-series fluorescent dye (quencher) at the 3′ terminus. While the TaqMan probe remains intact, the fluorescence energy of the reporter is consumed as excitation energy for the quencher, because the fluorescence wavelength of the reporter is close to that of the quencher; the fluorescence of the reporter is therefore suppressed even when reporter excitation light is applied. However, when the TaqMan probe is degraded by the 5′-3′ exonuclease activity of DNA polymerase during the elongation from the primer in the PCR reaction, the fluorescent dye of the reporter is released from the 5′ terminus of the TaqMan probe, and the distance from the fluorescent dye of the quencher increases, resulting in the emission of fluorescence. In the TaqMan assay chemistry, the fluorescence intensity from the reporter, which increases with every PCR cycle, is detected.
To 0.2 μl (equivalent to 100 ng as converted to Total RNA) of a cDNA derived from each tissue as a template, Forward Primer (final concentration 900 nM), Reverse Primer (final concentration 900 nM), TaqMan Probe (final concentration 250 nM), and TaqMan Fast Universal PCR Master Mix (ABI 466073) were added, to make a total volume of 20 μl. For endogenous control, GAPDH always served as a reaction control for all templates.
A PCR was performed under the following conditions, which represent the Fast protocol for Applied Biosystems real-time PCR 7500 Fast. After enzyme activation at 95° C. for 20 seconds, denaturation at 95° C. for 3 seconds and annealing/elongation at 60° C. for 30 seconds were repeated for 40 cycles.
GAPDH-Probe (SEQ ID NO:7): TaqMan Probe for endogenous control GAPDH
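As an illustration only, the two cycling programs described for the SYBR GREEN and TaqMan reaction systems can be written down as data and their programmed hold times compared. The class and variable names below are hypothetical, and the totals cover only the programmed holds; temperature ramping and plate reading on the instrument are not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # hold temperature in degrees Celsius
    seconds: int    # hold duration in seconds


@dataclass(frozen=True)
class Protocol:
    name: str
    holds: tuple    # steps run once before cycling
    cycle: tuple    # steps repeated in every cycle
    n_cycles: int

    def programmed_seconds(self) -> int:
        """Sum of all programmed hold times (ramp time is not modeled)."""
        once = sum(s.seconds for s in self.holds)
        per_cycle = sum(s.seconds for s in self.cycle)
        return once + self.n_cycles * per_cycle


# Standard protocol used with SYBR GREEN: 50 C 2 min, 95 C 10 min,
# then 40 cycles of 95 C 15 s and 60 C 1 min.
SYBR_STANDARD = Protocol(
    "SYBR GREEN, standard protocol",
    (Step(50, 120), Step(95, 600)),
    (Step(95, 15), Step(60, 60)),
    40,
)

# Fast protocol used with TaqMan: 95 C 20 s,
# then 40 cycles of 95 C 3 s and 60 C 30 s.
TAQMAN_FAST = Protocol(
    "TaqMan, Fast protocol",
    (Step(95, 20),),
    (Step(95, 3), Step(60, 30)),
    40,
)

for p in (SYBR_STANDARD, TAQMAN_FAST):
    print(f"{p.name}: {p.programmed_seconds() / 60:.1f} min of programmed holds")
```

Under these figures the standard protocol amounts to 3,720 seconds of programmed holds and the Fast protocol to 1,340 seconds.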

(4) Method of Statistical Analysis of Data

The results were analyzed using a relative quantitation method.
Using the RQ study software for Applied Biosystems real-time PCR 7500 Fast, a threshold was set in the exponential amplification region of the amplification curve. The number of cycles at that time was used as the Ct (threshold cycle). To make a correction for initial RNA content, the Ct of the endogenous control GAPDH was subtracted from the Ct obtained, and this value was used as the dCt [dCt=Target Ct−ENDOGENOUS Ct (GAPDH)]. The dCt of the sample serving as the reference standard (control) was further subtracted from the dCt obtained, and this value was used as the ddCt [ddCt=Target dCt−Control dCt]. On the basis of this value, a relative value was calculated, and this was used as the RQ [RQ=2^−ddCt]. On the basis of this result, a logarithmic graph was generated, and the amounts amplified and hence the expression levels with each primer and probe were compared.
In each Example, analytical results for RQ and Log₁₀RQ are shown. RQ scores are shown to the first decimal place, and Log₁₀RQ scores to the second decimal place. For samples not allowing detection by real-time PCR, “Undet.” was written in the fields for the RQ score and the Log₁₀RQ score. However, for the mixed sample of control normal visceral tissues (Mix, viscus tissues) (RQ value “1.0”), “0.0” was written in the field for the Log₁₀RQ score.
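The relative quantitation described above can be sketched in a few lines. This is a minimal illustration of the dCt, ddCt, and RQ arithmetic, not the RQ study software itself; the function names and the Ct values in the usage lines are hypothetical, and an undetected target is represented here by `None`.

```python
import math


def relative_quantity(target_ct, gapdh_ct, control_target_ct, control_gapdh_ct):
    """Return (RQ, log10 RQ) by the ddCt method, or None if the target was undetected."""
    if target_ct is None:
        return None
    d_ct = target_ct - gapdh_ct                        # dCt = Target Ct - GAPDH Ct
    control_d_ct = control_target_ct - control_gapdh_ct
    dd_ct = d_ct - control_d_ct                        # ddCt = Target dCt - Control dCt
    rq = 2.0 ** (-dd_ct)                               # RQ = 2^-ddCt
    return rq, math.log10(rq)


def format_scores(result):
    """Format as in the tables: RQ to one decimal place, log10 RQ to two."""
    if result is None:
        return "Undet.", "Undet."
    rq, log_rq = result
    return f"{rq:.1f}", f"{log_rq:.2f}"


# Hypothetical Ct values: the sample has a dCt of 7 and the control a dCt of 10,
# so ddCt is -3 and the target is 2^3 = 8 times more abundant than in the control.
print(format_scores(relative_quantity(26.0, 19.0, 28.0, 18.0)))
print(format_scores(relative_quantity(None, 19.0, 28.0, 18.0)))
```

For the control sample itself the ddCt is zero, so the RQ is 1.0 by construction; this sketch would print its logarithm as "0.00", whereas the tables write "0.0" for that row.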

Example 4

Cluster chr19-32 (Data Set: 103)

(1) Cluster Analysis

1) Cluster Characteristics

An analysis was performed on 8 sequences of full-length cDNAs subjected to genome mapping onto the cluster chr19-32 (Human genome UCSC hg18 (NCBI Build34) chromosome 19, 63,124,000 bp to 63,140,000 bp) [D-UTERU2026184.1, D-BRACE3000012.1, AB075836.1, AY695825.1, C-NT2RI2001083, ENST00000358502, ENST00000361044, NM_133460.1]. They were classifiable according to expression pattern difference into the following 3 kinds.

[1] D-UTERU2026184.1

[2] D-BRACE3000012.1

[3] AB075836.1, AY695825.1, C-NT2RI2001083 (AK056113.1), ENST00000358502, ENST00000361044, NM_133460.1

[1] and [2] are cDNAs that were newly acquired and subjected to full-length cDNA sequence analysis by us, having an ORF different from that of [3], which had been registered in an existing public DB (DDBJ/Genbank/EMBL).
[1], compared with the known [3], had a different ORF region because of the deletion of portions corresponding to the second and third exons of [3] in the ORF region.
[2], compared with the known [3], had an altered translation initiation point and a different ORF region because of the insertion of an exon different from the other patterns into the ORF region.
It was found that the ORF regions present in the 3 kinds of cDNA patterns [1] to [3] undergo splicing in different patterns, such as exon deletions and insertions, from the same chromosome region, resulting in alterations of the amino acid sequences to produce diverse proteins and mRNAs.
2) Characteristics of D-UTERU2026184.1 ([1]), which was Newly Acquired and Subjected to Full-Length cDNA Sequence Analysis by Us
103_[1]_1-N0 (SEQ ID NO:8): The entire nucleic acid sequence region of D-UTERU2026184.1
103_[1]_1-NA0 (SEQ ID NO:9): Both the entire nucleic acid sequence region and amino acid sequence of D-UTERU2026184.1
103_[1]_1-A0 (SEQ ID NO:10): The entire amino acid sequence region of D-UTERU2026184.1
The 213-base exon present at the 213th to 425th bases of NM_133460.1 (SEQ ID NO:13), which is registered with an existing public DB and serves as a control, is deleted and not present in the region at the 223rd to 224th bases of D-UTERU2026184.1. The 2 bases present at the 520th to 521st bases of NM_133460.1 (SEQ ID NO:14) are also deleted and not present in the region at the 317th to 318th bases of D-UTERU2026184.1 (SEQ ID NO:11). Although the translation initiation point of NM_133460.1 is present on the 128-base insertion exon, that of D-UTERU2026184.1 is present on the first exon, which is shared by NM_133460.1; therefore, compared with NM_133460.1, the N-terminal amino acids differed by 43 residues.
103_[1]_1-N1 (SEQ ID NO:11): Deletion nucleic acid sequence region of D-UTERU2026184.1
103_[1]_1-A1 (SEQ ID NO:12): Amino acid region altered as a result of deletion of D-UTERU2026184.1
103_[1]_1-N2 (identical to SEQ ID NO:11): ORF nucleic acid region in the deletion nucleic acid region of D-UTERU2026184.1
103_[1]_1-A2 (identical to SEQ ID NO:12): ORF amino acid region related to the deletion nucleic acid region of D-UTERU2026184.1
103_[1]_C-N1 (SEQ ID NO:13): 213-base insert nucleic acid sequence present at the 213th to 425th bases of NM_133460.1 inserted into the region at the 223rd to 224th bases of D-UTERU2026184.1
103_[1]_C-N2 (SEQ ID NO:14): 2-base insert nucleic acid sequence present at the 520th to 521st bases of NM_133460.1 inserted into the region at the 317th to 318th bases of D-UTERU2026184.1
103_[1]_C-A1 (SEQ ID NO:15): Amino acid region related to the insert nucleic acid sequences at the 213th to 425th bases and the 520th to 521st bases of NM_133460.1, inserted into the region at the 223rd to 224th bases and the region at the 317th to 318th bases of D-UTERU2026184.1.
With this change, the Pfam motif “KRAB box”, which is present at the 5th to 45th amino acids of NM_133460.1, which serves as a control, disappeared in D-UTERU2026184.1 (http://pfam.janelia.org/).
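The base positions quoted for the two deletions can be cross-checked with simple arithmetic. The following sketch uses only the coordinates stated above and hypothetical helper names; it checks the lengths of the deleted segments and that the stretch retained between the two junctions has the same length in both coordinate systems. It does not align any sequence.

```python
def inclusive_length(start, end):
    """Number of bases from start to end, counting both ends (1-based positions)."""
    return end - start + 1


# Each entry pairs a segment of NM_133460.1 that is absent from D-UTERU2026184.1
# with the two D-UTERU2026184.1 bases that flank the resulting junction.
DELETIONS = [
    {"nm": (213, 425), "junction": (223, 224)},   # 213-base exon
    {"nm": (520, 521), "junction": (317, 318)},   # 2 bases
]


def retained_between(deletions):
    """Length of the stretch between the two deletions, in each coordinate system."""
    first, second = deletions
    # On NM_133460.1: bases after the first deleted segment, before the second.
    kept_nm = second["nm"][0] - first["nm"][1] - 1
    # On D-UTERU2026184.1: from the base right after the first junction
    # through the base right before the second junction.
    kept_variant = inclusive_length(first["junction"][1], second["junction"][0])
    return kept_nm, kept_variant


for d in DELETIONS:
    print(d["nm"], "deleted length:", inclusive_length(*d["nm"]))
print("retained between deletions (NM, variant):", retained_between(DELETIONS))
```

Both counts come to 94 bases (positions 426 to 519 of NM_133460.1 and positions 224 to 317 of D-UTERU2026184.1), so the stated junction positions are mutually consistent.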
3) Characteristics of D-BRACE3000012.1 ([2]), which was Newly Acquired and Subjected to Full-Length cDNA Sequence Analysis by Us
103_[2]_1-N0 (SEQ ID NO:16): The entire nucleic acid sequence region of D-BRACE3000012.1
103_[2]_1-NA0 (SEQ ID NO:17): Both the entire nucleic acid sequence region and amino acid sequence of D-BRACE3000012.1
103_[2]_1-A0 (SEQ ID NO:18): The entire amino acid sequence region of D-BRACE3000012.1
The sequence at the 314th to 533rd bases of D-BRACE3000012.1 (SEQ ID NO:19) is a variant with insertion of an exon not present in NM_133460.1, which is registered with an existing public DB and serves as a control; because the translation initiation point is present on the inserted exon, the N-terminal amino acids differed by 23 residues (SEQ ID NO:20) compared with NM_133460.1.
103_[2]_1-N1 (SEQ ID NO:19): 220-base insert nucleic acid sequence region of D-BRACE3000012.1
103_[2]_1-A1 (SEQ ID NO:20): 23-residue insert amino acid sequence region of D-BRACE3000012.1
103_[2]_1-N2 (SEQ ID NO:21): ORF nucleic acid sequence region in 220-base insert region of D-BRACE3000012.1
103_[2]_1-A2 (SEQ ID NO:22): ORF amino acid region related to 220-base insert region of D-BRACE3000012.1

4) Expression Specificity Analysis and Design of Primers and TaqMan Probes for Real-Time PCR

To clearly distinguish between the characteristic regions shown above, and examine the respective expression levels thereof, the following regions were used as detection regions. It seemed possible to compare the expression levels of the individual characteristic regions by comparing the expression levels of the detection regions.
103_01—A region specifically extracted by means of the sequence information at the border of a region lacking an exon in the cDNA pattern [1]: an ORF-altering region with exon deletion in the cDNA pattern [1], which was newly subjected to full-length cDNA sequence analysis by us
→Fragment 103_01 (SEQ ID NO:25) amplified by Primer103_01F (SEQ ID NO:23) and Primer103_01R (SEQ ID NO:24)
TaqMan probe used 103_01TP: (SEQ ID NO:26)
103_02—A region specifically extracted by means of the sequence information on a region with exon insertion in the cDNA pattern [2]: an ORF-altering region with exon insertion in the cDNA pattern [2], which was newly subjected to full-length cDNA sequence analysis by us
→Fragment 103_02 (SEQ ID NO:29) amplified by Primer103_02F (SEQ ID NO:27) and Primer103_02R (SEQ ID NO:28)
TaqMan probe used 103_02TP: (SEQ ID NO:30)
103_03—A specific region that is distinguishable from both the deletion region of [1] and the insert region of [2] in the cDNA pattern [3] registered with an existing public DB, serving as a control for comparing [1] and [2]
→Fragment 103_03 (SEQ ID NO:33) amplified by Primer103_03F (SEQ ID NO:31) and Primer103_03R (SEQ ID NO:32)
TaqMan probe used 103_03TP: (SEQ ID NO:34)
103_04—A common region shared by all of [1] to [3]: a region common to all patterns, serving as a control to compare the overall expression levels of the cDNA patterns [1] and [2], which were newly subjected to full-length cDNA sequence analysis by us, and the cDNA pattern [3], registered with an existing public DB
→Fragment 103_04 (SEQ ID NO:37) amplified by Primer103_04F (SEQ ID NO:35) and Primer103_04R (SEQ ID NO:36)
TaqMan probe used 103_04TP: (SEQ ID NO:38)
By mapping the 5′-terminal sequences of about 1.44 million sequences acquired using the oligocap method onto the human genome sequence, and comparatively analyzing them, the exon regions specific for the cDNA patterns [1], [2], and [3] shown above, respectively, were found to be expressed at the following frequencies.
In the cDNA pattern [1], which was newly acquired and analyzed by us, one 5′-terminal sequence was present, the derivation thereof being Uterus for 1 sequence (analytical parameter 49,561).
In the cDNA pattern [2], which was newly acquired and analyzed by us, two 5′-terminal sequences were present, the derivations thereof being Brain, cerebellum for 1 sequence (analytical parameter 82,880), and NT2 cells treated with retinoic acid (RA) to induce differentiation (NT2RP) for 1 sequence (analytical parameter 39,242).
In the cDNA pattern [3], which is registered with an existing public DB, fourteen 5′-terminal sequences were present, the derivations thereof being NT2 cells treated with retinoic acid (RA) to induce differentiation for 5 weeks (NT2RP) for 4 sequences (analytical parameter 39,242), NT2 cells treated with RA to induce differentiation for 5 weeks, and thereafter treated with a growth inhibitor for 2 weeks (NT2RI) for 2 sequences (analytical parameter 32,662), Brain, cerebellum for 1 sequence (analytical parameter 82,880), Brain, amygdala for 1 sequence (analytical parameter 58,640), Brain, hippocampus for 1 sequence (analytical parameter 57,918), Brain, substantia nigra for 1 sequence (analytical parameter 15,897), Normal Human Dermal Fibroblasts for 1 sequence (analytical parameter 10,103), Brain, Fetal for 2 sequences (analytical parameter 79,560), and Uterus for 1 sequence (analytical parameter 49,561).
From this result, it was found that the exon-deletion pattern [1] was expressed in Uterus, and that the exon-insertion pattern [2] was expressed in Brain, cerebellum and NT2 cells treated with retinoic acid to induce differentiation (NT2RP). It was found that the known sequence [3] was abundantly expressed in NT2 cells treated with retinoic acid to induce differentiation (NT2RP) and in brain tissues.
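Because the libraries differ in size, the raw counts of 5′-terminal sequences are easier to compare after normalization. The sketch below does this for the cDNA pattern [3], assuming that the "analytical parameter" is the number of 5′-terminal sequences analyzed from each library; that reading, and the names used in the code, are assumptions. With only one to four sequences per library, the normalized values are rough indications rather than precise frequencies.

```python
# (number of 5'-terminal sequences, analytical parameter) per library,
# as listed above for the cDNA pattern [3].
PATTERN_3 = {
    "NT2RP": (4, 39_242),
    "NT2RI": (2, 32_662),
    "Brain, cerebellum": (1, 82_880),
    "Brain, amygdala": (1, 58_640),
    "Brain, hippocampus": (1, 57_918),
    "Brain, substantia nigra": (1, 15_897),
    "Normal Human Dermal Fibroblasts": (1, 10_103),
    "Brain, Fetal": (2, 79_560),
    "Uterus": (1, 49_561),
}


def per_100k(count, library_size):
    """Sequences observed per 100,000 sequences analyzed in the library."""
    return count / library_size * 100_000


ranked = sorted(
    ((name, per_100k(count, size)) for name, (count, size) in PATTERN_3.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, freq in ranked:
    print(f"{name:32s} {freq:5.1f} per 100,000")
```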

(2) Analysis of Expression Specificity by Real-Time PCR

To detect protein expression diversity changes due to exon selectivity, details of expression levels were analyzed by real-time PCR. The results are shown in Table 3.

	TABLE 3

	RQ Score	Log₁₀RQ Score

	103_01	103_02	103_03	103_04	103_01	103_02	103_03	103_04

01 NT2RA(−)	0.1	6.3	0.9	0.4	−0.86	0.80	−0.04	−0.45
02 NT2RA(+) 24 hr	0.3	2.7	0.6	0.3	−0.54	0.42	−0.19	−0.59
03 NT2RA(+) 48 hr	0.2	2.0	0.7	0.3	−0.68	0.29	−0.14	−0.52
04 NT2RA(+) 1 week	1.9	1.5	0.9	0.8	0.27	0.17	−0.03	−0.11
05 NT2RA(+) 5 weeks	6.7	8.6	2.9	1.1	0.83	0.93	0.46	0.05
06 NT2RA(+) 5 weeks, Inh(+)	2.1	1.5	1.1	0.5	0.32	0.19	0.05	−0.30
07 NT2 Neuron	0.2	0.2	0.5	1.3	−0.77	−0.74	−0.30	0.10
08 Brain, Fetal	4.0	5.0	17.3	8.4	0.60	0.70	1.24	0.92
09 Brain, whole	8.6	3.2	6.1	5.3	0.93	0.51	0.78	0.73
10 ALZ Visual Cortex Occipital	1.2	0.7	2.2	3.0	0.08	−0.18	0.34	0.47
11 Mix, viscus tissues	1.0	1.0	1.0	1.0	0.0	0.0	0.0	0.0
12 Mix, blood cells and related tissues	12.7	6.0	2.7	2.6	1.11	0.78	0.43	0.42
13 Mix, tumor tissues	2.3	0.9	0.5	0.7	0.36	−0.02	−0.28	−0.17
14 Mix, normal tissues	2.4	1.9	1.2	1.8	0.37	0.29	0.09	0.24
15 Brain, whole PolyA(+) RNA	4.8	2.2	5.9	3.1	0.68	0.34	0.77	0.49
16 Brain, hippocampus	2.4	1.9	4.1	2.6	0.39	0.28	0.61	0.42

Expression levels were compared using the 16 samples shown in Example 3, including Brain, hippocampus, Brain, whole, Brain, Fetal, Alzheimer patient cerebral cortex (ALZ Visual Cortex Occipital), and NT2 cells in 7 differentiation stages. For experimental control, comparisons were made using the sample prepared by mixing normal visceral tissues in Example 3 (Mix, viscus tissues).
The ratio of ORF alteration due to exon insertion/deletion selectivity as compared between 103_01 (SEQ ID NO:25) and 103_02 (SEQ ID NO:29) changed greatly among the following differentiation stages of the brain and NT2 cells.
The expression of the exon-deletion pattern shown by 103_01 (SEQ ID NO:25) was low in undifferentiated NT2 cells NT2RA(−) and NT2RA(+) 48 hr, which represents the initial stage in which retinoic acid was added to induce differentiation; the expression was high in NT2RA(+) 1 week to NT2RA(+) 5 weeks, Inh(+), which represent the late stage of differentiation induction, and was low in NT2 Neuron. The expression in Brain, Fetal was also low (Table 3).
The expression of the exon insertion pattern shown by 103_02 (SEQ ID NO:29) was abundant in undifferentiated NT2 cells NT2RA(−) and the initial stage in which retinoic acid was added to induce differentiation, to NT2RA(+) 5 weeks, Inh(+); the expression was low in NT2 Neuron (Table 3). Not only in Fetal Brain, but also in the whole brain, the expression was low (Table 3).
These results demonstrated that by comparing the expression of selective exon regions 103_[1]_1-N1 (SEQ ID NO:11) and 103_[2]_1-N1 (SEQ ID NO:19) of newly acquired cDNAs shown by the detection regions 103_01 (SEQ ID NO:25) and 103_02 (SEQ ID NO:29), it is possible to use these regions as differentiation markers for detecting stages of nerve cell differentiation or regeneration. It also seems possible to develop a new drug by means of a compound, antibody, siRNA or the like that targets a region that exhibits specificity.
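One way to look at the comparison between 103_01 and 103_02 is to divide the two RQ scores from Table 3 for each NT2 differentiation stage. The sketch below does this with hypothetical variable names. Each RQ score is already relative to the Mix, viscus tissues control for its own detection region, so the quotient compares fold changes rather than absolute transcript amounts, and because the scores are rounded to one decimal place the quotients for low scores are coarse.

```python
# RQ scores from Table 3 for the seven NT2 samples: (103_01, 103_02).
TABLE_3_RQ = {
    "NT2RA(-)": (0.1, 6.3),
    "NT2RA(+) 24 hr": (0.3, 2.7),
    "NT2RA(+) 48 hr": (0.2, 2.0),
    "NT2RA(+) 1 week": (1.9, 1.5),
    "NT2RA(+) 5 weeks": (6.7, 8.6),
    "NT2RA(+) 5 weeks, Inh(+)": (2.1, 1.5),
    "NT2 Neuron": (0.2, 0.2),
}

# Quotient of the exon-deletion score (103_01) over the exon-insertion score (103_02).
ratios = {stage: rq_01 / rq_02 for stage, (rq_01, rq_02) in TABLE_3_RQ.items()}

for stage, ratio in ratios.items():
    print(f"{stage:26s} 103_01 / 103_02 = {ratio:.2f}")
```

On these figures the quotient is smallest in undifferentiated NT2RA(−) cells and rises to about 1 or above from NT2RA(+) 1 week onward.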
The following regions also seem to be useful as differentiation markers for detecting nerve cell differentiation or regeneration stages.
Upstream sequence 062_[1]_1-N3 (SEQ ID NO:39), which comprises the 285th to 306th bases undergoing priming by Primer103_01R (SEQ ID NO:24) in D-UTERU2026184.1 of the cDNA pattern [1].
Upstream sequence 062_[1]_1-N3 (SEQ ID NO:40), which comprises the 521st to 541st bases undergoing priming by Primer103_02R (SEQ ID NO:28) in D-BRACE3000012.1 of the cDNA pattern [2].
Region 103_01 (SEQ ID NO:25) amplified by Primer103_01F (SEQ ID NO:23) and Primer103_01R (SEQ ID NO:24) in the cDNA pattern [1]
Region 103_02 (SEQ ID NO:29) amplified by Primer103_02F (SEQ ID NO:27) and Primer103_02R (SEQ ID NO:28) in the cDNA pattern [2]

Example 5

Cluster chr14-45 (Data Set: 019)

(1) Cluster Analysis

1) Cluster Characteristics

An analysis was performed on 13 sequences of full-length cDNAs subjected to genome mapping onto the cluster chr14-45 (Human genome UCSC hg18 (NCBI Build34) chromosome 14, 104,305,000 bp to 104,335,000 bp) [D-NT2RP8004156.1, BC000479.2, BC084538.1, BX647722.1, BX648205.1, C-BRACE2006105, C-BRHIP2019884, C-PLACE7003657, C-TEST14021482, ENST00000310523, ENST00000349310, M63167.1, NM_005163.1]. They were classified according to expression pattern difference into 7 kinds, which mainly included the following 2 kinds.

[1] D-NT2RP8004156.1

[2] BC000479.2, BC084538.1, BX648205.1, C-PLACE7003657 (AK122894.1), ENST00000310523, M63167.1, NM_005163.1

[1] is a cDNA which was newly acquired and subjected to full-length cDNA sequence analysis by us, having an ORF different from that of [2] registered in an existing public DB.
[1] had a different ORF region because of its expression from a chromosome region located downstream of the known [2].
It was found that the ORF regions present in the 2 kinds of cDNA patterns [1] and [2] are expressed from different transcription initiation points in the same chromosome region, resulting in alterations of the amino acid sequences to produce diverse proteins and mRNAs.
2) Characteristics of D-NT2RP8004156.1 ([1]), Which was Newly Acquired and Subjected to Full-Length cDNA Sequence Analysis by Us
019_[1]_1-N0 (SEQ ID NO:41): The entire nucleic acid sequence region of D-NT2RP8004156.1
019_[1]_1-NA0 (SEQ ID NO:42): Both the entire nucleic acid sequence region and amino acid sequence of D-NT2RP8004156.1
019_[1]_1-A0 (SEQ ID NO:43): The entire amino acid sequence region of D-NT2RP8004156.1
The sequence at the 1st to 119th bases of D-NT2RP8004156.1 (SEQ ID NO:44) is an exon that is not present in NM_005163.1, which is registered in an existing public DB and serves as a control, and it lacks homology to NM_005163.1.
With this change, the translation initiation point of D-NT2RP8004156.1 shifts toward the 3′ side relative to NM_005163.1, and the 131st base of D-NT2RP8004156.1 becomes the translation initiation point. For this reason, the N-terminal amino acid sequence was shortened by 62 residues compared with NM_005163.1 (SEQ ID NO:264).
019_[1]_1-N1 (SEQ ID NO:44): A 119-base insert nucleic acid sequence region of D-NT2RP8004156.1
019_[1]_1-N2 (SEQ ID NO:45): A 130-base 5′UTR region of an ORF whose translation initiation point is the 131st base of D-NT2RP8004156.1
019_[1]_C-A1 (SEQ ID NO:264): Amino acid sequence region lacking 62 residues of D-NT2RP8004156.1 present in NM_005163.1
With this change, the Pfam motif “PH domain” present at the 6th to 108th amino acids of NM_005163.1 disappeared in D-NT2RP8004156.1.

3) Expression Specificity Analysis and Design of Primers for Real-Time PCR

To clearly distinguish between the characteristic regions shown above, and examine the respective expression levels thereof, the following regions were used as detection regions. It seemed possible to compare the expression levels of the individual characteristic regions by comparing the expression levels of the detection regions.
019_01—A specific region present on the N-terminal side of the cDNA pattern [1]: a translation initiation region of the cDNA pattern [1], which was newly subjected to full-length cDNA sequence analysis by us, being a novel region not registered with an existing public DB
→Fragment 019_01 (SEQ ID NO:48) amplified by Primer019_01F (SEQ ID NO:46) and Primer019_01R (SEQ ID NO:47)
019_02—A transcription initiation point region of [2], which is registered with an existing public DB, serving as a control for comparing [1]
→Fragment 019_02 (SEQ ID NO:51) amplified by Primer019_02F (SEQ ID NO:49) and Primer019_02R (SEQ ID NO:50)
019_03—A common region shared by [1] and [2]: a region common to all patterns, serving as a control to compare the overall expression levels of the cDNA pattern [1], which was newly subjected to full-length cDNA sequence analysis by us, and the cDNA pattern [2], which is registered with an existing public DB
→Fragment 019_03 (SEQ ID NO:54) amplified by Primer019_03F (SEQ ID NO:52) and Primer019_03R (SEQ ID NO:53)
By mapping the 5′-terminal sequences of about 1.44 million sequences acquired using the oligocap method onto the human genome sequence, and comparatively analyzing them, the exon regions specific for the cDNA patterns [1] and [2] shown above, respectively, were found to be expressed at the following frequencies.
In the cDNA pattern [1], which was newly acquired and analyzed by us, four 5′-terminal sequences were present, the derivation thereof being NT2 cells treated with retinoic acid (RA) to induce differentiation (NT2RP) for all sequences.
It was found that in the cDNA pattern [2], which is registered with an existing public DB, eleven 5′-terminal sequences were present: 4 sequences derived from brain tissues and 7 sequences from a plurality of other organs and the like were expressed.
From this result, it was found that the transcription initiation point of [1] was expressed specifically in NT2 cells after differentiation. From the transcription initiation point of [2], expression in a variety of organs was observed. Hence, it was thought that the mechanism of transcription in this chromosome region might be unique to the nerve cell differentiation stage of NT2 cells after differentiation, with a different transcription initiation point being used.

(2) Analysis of Expression Specificity by Real-Time PCR

To determine the states in which the transcription initiation point used for expression changes, details of expression levels were analyzed by real-time PCR. The results are shown in Table 4 and Table 5.

	TABLE 4

	RQ Score	Log₁₀RQ Score

	019_01	019_02	019_03	019_01	019_02	019_03

01 NT2RA(−)	0.2	0.1	0.1	−0.73	−1.02	−1.03
02 NT2RA(+) 24 hr	0.5	0.1	0.1	−0.29	−1.16	−1.15
03 NT2RA(+) 48 hr	0.2	0.1	0.1	−0.71	−1.05	−1.10
04 NT2RA(+) 1 week	1.4	0.1	0.2	0.16	−0.84	−0.75
05 NT2RA(+) 5 weeks	94.2	0.4	0.5	1.97	−0.35	−0.34
06 NT2RA(+) 5 weeks, Inh(+)	4.7	0.4	0.5	0.67	−0.37	−0.32
07 NT2 Neuron	0.0	0.1	0.0	−1.40	−1.03	−1.80
08 Brain, Fetal	1.1	1.5	1.4	0.03	0.17	0.16
09 Brain, whole	0.1	0.6	0.6	−1.06	−0.24	−0.25
10 ALZ Visual Cortex Occipital	0.1	0.2	0.2	−1.00	−0.74	−0.72
11 Mix, viscus tissues	1.0	1.0	1.0	0.0	0.0	0.0
12 Mix, blood cells and related tissues	1.7	0.7	0.4	0.23	−0.15	−0.38
13 Mix, tumor tissues	2.2	0.7	0.8	0.35	−0.18	−0.09
14 Mix, normal tissues	1.3	0.9	0.9	0.10	−0.02	−0.04
15 Brain, whole PolyA(+) RNA	0.2	0.4	0.3	−0.75	−0.35	−0.46
16 Brain, hippocampus	0.2	0.4	0.3	−0.81	−0.43	−0.51

	TABLE 5

	RQ Score	Log₁₀RQ Score

	019_01	019_02	019_03	019_01	019_02	019_03

01 NT2RA(−)	0.1	0.1	0.1	−1.01	−1.01	−1.01
02 NT2RA(+) 24 hr	0.2	0.1	0.1	−0.66	−1.23	−1.11
03 NT2RA(+) 48 hr	0.0	0.1	0.1	−1.66	−1.02	−1.08
04 NT2RA(+) 1 week	0.6	0.2	0.2	−0.22	−0.80	−0.72
05 NT2RA(+) 5 weeks	40.2	0.5	0.5	1.60	−0.32	−0.29
06 NT2RA(+) 5 weeks, Inh(+)	2.0	0.5	0.6	0.30	−0.29	−0.25
07 NT2 Neuron	0.0	0.1	0.0	−1.52	−1.04	−1.80
08 Brain, Fetal	0.4	1.5	1.5	−0.36	0.19	0.17
09 Brain, whole	0.2	0.6	0.6	−0.73	−0.21	−0.22
10 ALZ Visual Cortex Occipital	0.0	0.2	0.2	−1.40	−0.72	−0.72
11 Mix, viscus tissues	1.0	1.0	1.0	0.0	0.0	0.0
12 Mix, blood cells and related tissues	1.1	0.9	0.7	0.03	−0.07	−0.13
13 Mix, tumor tissues	0.7	0.6	0.6	−0.17	−0.25	−0.19
14 Mix, normal tissues	0.3	0.9	0.9	−0.59	−0.03	−0.04
15 Brain, whole PolyA(+) RNA	0.1	0.6	0.4	−1.11	−0.25	−0.36
16 Brain, hippocampus	0.1	0.5	0.4	−1.08	−0.29	−0.40

Expression levels were compared using the 16 samples shown in Example 3, including Brain, hippocampus, Brain, whole, Brain, Fetal, ALZ Visual Cortex Occipital, and NT2 cells at 7 different differentiation stages. The comparison was made using the mixed sample of normal visceral tissues shown in Example 3 (Mix, viscus tissues) as an experimental control.
The ratio of ORF alteration due to transcription initiation point selectivity as compared between 019_01 (SEQ ID NO:48) and 019_02 (SEQ ID NO:51) changed greatly depending on NT2 cell differentiation stage. When compared in detail with respect to NT2 cell differentiation, no major difference was observed between the 2 kinds of transcription initiation points shown by 019_01 (SEQ ID NO:48) and 019_02 (SEQ ID NO:51) in undifferentiated NT2 cells NT2RA(−) and NT2RA(+) 48 hr, which represents the initial stage in which retinoic acid was added to induce differentiation (Table 4 and Table 5). However, in NT2RA(+) 1 week, which represents an advanced stage of differentiation, the difference widened; in NT2RA(+) 5 weeks, the ratio of transcription from the downstream transcription initiation point shown by 019_01 (SEQ ID NO:48) increased considerably (Table 4 and Table 5). Thereafter, in NT2RA(+) 5 weeks, Inh(+), the difference decreased; in NT2 Neuron, on the contrary, the ratio of transcription from the known transcription initiation point shown by 019_02 (SEQ ID NO:51) increased (Table 4 and Table 5). In other tissues, no major difference was observed.
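The stage-by-stage comparison of the two transcription initiation points can be made explicit by subtracting the Log₁₀RQ score of 019_02 from that of 019_01 for each NT2 sample, separately for Table 4 and Table 5. This is only an illustrative sketch with hypothetical names; a positive difference means that 019_01 rose more, relative to the Mix, viscus tissues control, than 019_02 did.

```python
STAGES = [
    "NT2RA(-)", "NT2RA(+) 24 hr", "NT2RA(+) 48 hr", "NT2RA(+) 1 week",
    "NT2RA(+) 5 weeks", "NT2RA(+) 5 weeks, Inh(+)", "NT2 Neuron",
]

# Log10 RQ scores for the seven NT2 samples, copied from Table 4 and Table 5.
TABLE_4 = {
    "019_01": [-0.73, -0.29, -0.71, 0.16, 1.97, 0.67, -1.40],
    "019_02": [-1.02, -1.16, -1.05, -0.84, -0.35, -0.37, -1.03],
}
TABLE_5 = {
    "019_01": [-1.01, -0.66, -1.66, -0.22, 1.60, 0.30, -1.52],
    "019_02": [-1.01, -1.23, -1.02, -0.80, -0.32, -0.29, -1.04],
}


def log_difference(table):
    """Log10 RQ of 019_01 minus log10 RQ of 019_02 for each NT2 stage."""
    return {
        stage: round(a - b, 2)
        for stage, a, b in zip(STAGES, table["019_01"], table["019_02"])
    }


def peak_stage(table):
    """Stage at which 019_01 exceeds 019_02 by the largest margin."""
    diffs = log_difference(table)
    return max(diffs, key=diffs.get)


for label, table in (("Table 4", TABLE_4), ("Table 5", TABLE_5)):
    print(label, "peak:", peak_stage(table), log_difference(table))
```

In both tables the difference is largest at NT2RA(+) 5 weeks and becomes negative in NT2 Neuron, in line with the description above.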
These results demonstrated that, by comparing the expression of the 5′-terminal region 019_[1]_1-N1 (SEQ ID NO:44) of the newly acquired cDNA (a region close to the transcription initiation point), as shown by the detection region 019_01 (SEQ ID NO:48), it is possible to use the 5′-terminal region as a differentiation marker for detecting cells in nerve cell differentiation or regeneration stages, particularly in the late stage of nerve differentiation or regeneration. It also seems possible to develop a new drug by means of a compound, antibody, siRNA or the like that targets a region that exhibits specificity.
The following regions also seem to be useful as differentiation markers for detecting cells in the late stage of nerve differentiation or regeneration.
Upstream sequence 019_[1]_1-N3 (SEQ ID NO:55), which comprises the 195th to 213th bases undergoing priming by Primer019_01R (SEQ ID NO:47) in D-NT2RP8004156.1 of the cDNA pattern [1].
Region 019_01 (SEQ ID NO:48) amplified by Primer019_01F (SEQ ID NO:46) and Primer019_01R (SEQ ID NO:47) in the cDNA pattern [1]

Example 6

Cluster chr2-2324 (Data Set: 031)

(1) Cluster Analysis

1) Cluster Characteristics

An analysis was performed on 7 sequences of full-length cDNAs subjected to genome mapping onto the cluster chr2-2324 (Human genome UCSC hg18 (NCBI Build34) chromosome 2, 65,440,000 bp to 65,580,000 bp) [D-NT2RI3005525.1, D-TRACH3029063.1, AY299090.1, C-HEP03447, C-NT2RP7004925, ENST00000356388, NM_181784.1]. They were classified according to expression pattern difference into 5 kinds, which mainly included the following 2 kinds.

[1] D-NT2RI3005525.1

[2] AY299090.1, C-NT2RP7004925 (AK056479.1), NM_181784.1

[1] is a cDNA which was newly acquired and subjected to full-length cDNA sequence analysis by us, and had a different ORF from [2] registered with an existing public DB.
[1] had a different ORF region because of its expression from a chromosome region located downstream of the known [2], and also because of the presence of the translation initiation point on a new exon lacking identity to [2].
It was found that the ORF regions present in the 2 kinds of cDNA patterns [1] and [2] are expressed from different transcription initiation points in the same chromosome region, resulting in alterations of the amino acid sequences to produce diverse proteins and mRNAs.
2) Characteristics of D-NT2RI3005525.1 ([1]), which was Newly Acquired and Subjected to Full-Length cDNA Sequence Analysis by Us
031_[1]_1-N0 (SEQ ID NO:56): The entire nucleic acid sequence region of D-NT2RI3005525.1
031_[1]_1-NA0 (SEQ ID NO:57): Both the entire nucleic acid sequence region and amino acid sequence of D-NT2RI3005525.1
031_[1]_1-A0 (SEQ ID NO:58): The entire amino acid sequence region of D-NT2RI3005525.1
The sequence at the 1st to 61st bases of D-NT2RI3005525.1 (SEQ ID NO:59) is a variant incorporating an exon that is not present in NM_181784.1, which is registered with an existing public DB and serves as a control; because the translation initiation point is present on the inserted exon, the N-terminal amino acids differed by 6 residues (SEQ ID NO:60) compared with NM_181784.1.
031_[1]_1-N1 (SEQ ID NO:59): 61-base insert nucleic acid sequence region of D-NT2RI3005525.1
031_[1]_1-A1 (SEQ ID NO:60): 6-residue insert amino acid sequence region of D-NT2RI3005525.1
031_[1]_1-N2 (SEQ ID NO:61): ORF nucleic acid sequence region in 61-base insert region of D-NT2RI3005525.1
031_[1]_1-A2 (identical to SEQ ID NO:60): ORF amino acid region related to 61-base insert region of D-NT2RI3005525.1

3) Expression Specificity Analysis and Design of Primer for Real-Time PCR

To clearly distinguish between the characteristic regions shown above, and examine the respective expression levels thereof, the following regions were used as detection regions. It seemed possible to compare the expression levels of the individual characteristic regions by comparing the expression levels of the detection regions.
031_01: A specific region present on the N-terminal side of the cDNA pattern [1]; a translation initiation region of the cDNA pattern [1], which was newly subjected to full-length cDNA sequence analysis by us, being a novel region not registered with an existing public DB
→Fragment 031_01 (SEQ ID NO:64) amplified by Primer031_01F (SEQ ID NO:62) and Primer031_01R (SEQ ID NO:63)
031_02: A transcription initiation point region of [2], registered with an existing public DB, serving as a control for comparing [1]
→Fragment 031_02 (SEQ ID NO:67) amplified by Primer031_02F (SEQ ID NO:65) and Primer031_02R (SEQ ID NO:66)
031_03: A common region shared by [1] and [2]; a region common to all patterns, serving as a control to compare the overall expression levels of the cDNA pattern [1], which was newly subjected to full-length cDNA sequence analysis by us, and the cDNA pattern [2], which is registered with an existing public DB
→Fragment 031_03 (SEQ ID NO:70) amplified by Primer031_03F (SEQ ID NO:68) and Primer031_03R (SEQ ID NO:69)
By mapping the 5′-terminal sequences of about 1.44 million sequences acquired using the oligocap method onto the human genome sequence, and comparatively analyzing them, the exon regions specific for the cDNA patterns [1] and [2] shown above, respectively, were found to be expressed at the following frequencies.
In the cDNA pattern [1], which was newly acquired and analyzed by us, twenty-eight 5′-terminal sequences were present, the derivations thereof being Brain, whole for 13 sequences (analytical parameter 59,069), Brain, hippocampus for 8 sequences (analytical parameter 57,918), Brain, amygdala for 5 sequences (analytical parameter 58,640), HDPC (Human dermal papilla cells) for 1 sequence (analytical parameter 8,453), and NT2 cells treated with retinoic acid (RA) to induce differentiation for 5 weeks, and thereafter treated with a growth inhibitor for 2 weeks (NT2RI) for 1 sequence (analytical parameter 32,662).
In the cDNA pattern [2], which is registered with an existing public DB, thirty-five 5′-terminal sequences were present, the derivations thereof being Brain, whole for 10 sequences (analytical parameter 59,069), Brain, cerebellum for 5 sequences (analytical parameter 82,880), Brain, Fetal for 5 sequences (analytical parameter 47,574), Brain, hippocampus for 3 sequences (analytical parameter 57,918), Trachea for 3 sequences (analytical parameter 52,352), Brain, thalamus for 2 sequences (analytical parameter 53,267), NT2 cells treated with retinoic acid (RA) to induce differentiation (NT2RP) for 2 sequences (analytical parameter 39,242), Thymus for 2 sequences (analytical parameter 70,578), NT2 cells treated with retinoic acid (RA) to induce differentiation for 5 weeks, and thereafter treated with a growth inhibitor for 2 weeks (NT2RI) for 1 sequence (analytical parameter 32,662), Testis for 1 sequence (analytical parameter 90,188), and Uterus for 1 sequence (analytical parameter 49,561).
From this result, it was found that the transcription initiation point of [1] was expressed abundantly in the brain, particularly in Brain, hippocampus and Brain, amygdala. It was found that the transcription initiation point of [2] was also abundantly expressed in the brain, but expressed in a wider variety of tissues compared with the transcription initiation point of [1]. From this result, it was thought that the mechanism of transcription in this chromosome region might be unique to particular portions of the brain, with a different transcription initiation point being used.
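The raw counts above come from libraries of different sizes, so a per-library rate can make them easier to compare. The following is a minimal sketch, assuming that each "analytical parameter" is the total number of 5′-terminal sequences analyzed for that library; the function name `per_100k` and the dictionary layout are illustrative and are not part of the original analysis.

```python
# Counts of 5'-terminal sequences and "analytical parameter" values quoted in the text.
# Assumption: the analytical parameter is the number of sequences analyzed per library.
PATTERN_1 = {
    "Brain, whole": (13, 59069),
    "Brain, hippocampus": (8, 57918),
    "Brain, amygdala": (5, 58640),
    "HDPC": (1, 8453),
    "NT2RI": (1, 32662),
}
PATTERN_2 = {
    "Brain, whole": (10, 59069),
    "Brain, cerebellum": (5, 82880),
    "Brain, Fetal": (5, 47574),
    "Brain, hippocampus": (3, 57918),
    "Trachea": (3, 52352),
    "Brain, thalamus": (2, 53267),
    "NT2RP": (2, 39242),
    "Thymus": (2, 70578),
    "NT2RI": (1, 32662),
    "Testis": (1, 90188),
    "Uterus": (1, 49561),
}


def per_100k(pattern):
    """Return hits per 100,000 analyzed sequences for each library."""
    return {lib: 1e5 * hits / total for lib, (hits, total) in pattern.items()}


if __name__ == "__main__":
    for name, pattern in (("[1]", PATTERN_1), ("[2]", PATTERN_2)):
        total_hits = sum(hits for hits, _ in pattern.values())
        print(f"cDNA pattern {name}: {total_hits} sequences")
        # List libraries from the highest to the lowest normalized rate.
        for lib, rate in sorted(per_100k(pattern).items(), key=lambda kv: -kv[1]):
            print(f"  {lib}: {rate:.1f} per 100,000")
```

Rates based on a single hit (for example HDPC in pattern [1]) rest on very small counts and should be read with caution.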

(2) Analysis of Expression Specificity by Real-Time PCR

To determine what are the portions and states in which the transcription initiation point used for the expression changes, details of expression levels were analyzed by real-time PCR. The results are shown in Table 6 and Table 7.

	TABLE 6

	RQ Score	Log₁₀RQ Score

	031_01	031_02	031_03	031_01	031_02	031_03

01 NT2RA(−)	0.0	0.1	0.2	−3.12	−0.85	−0.82
02 NT2RA(+) 24 hr	0.0	0.5	0.8	−2.48	−0.34	−0.09
03 NT2RA(+) 48 hr	0.0	0.4	0.9	−2.48	−0.41	−0.03
04 NT2RA(+) 1 week	0.0	0.2	0.4	−2.32	−0.81	−0.43
05 NT2RA(+) 5 weeks	0.9	0.4	0.4	−0.03	−0.45	−0.39
06 NT2RA(+) 5 weeks, Inh(+)	2.3	0.4	0.5	0.36	−0.37	−0.29
07 NT2 Neuron	0.1	0.0	0.1	−1.00	−1.51	−0.83
08 Brain, Fetal	0.5	1.7	2.1	−0.33	0.22	0.32
09 Brain, whole	15.4	1.4	2.1	1.19	0.16	0.31
10 ALZ Visual Cortex Occipital	8.1	0.4	0.6	0.91	−0.44	−0.20
11 Mix, viscus tissues	1.0	1.0	1.0	0.0	0.0	0.0
12 Mix, blood cells and related tissues	0.6	0.7	0.9	−0.21	−0.17	−0.06
13 Mix, tumor tissues	0.5	0.4	0.5	−0.31	−0.35	−0.29
14 Mix, normal tissues	0.9	0.9	1.2	−0.04	−0.04	0.08
15 Brain, whole PolyA(+) RNA	4.2	0.2	0.3	0.63	−0.71	−0.59
16 Brain, hippocampus	2.8	0.1	0.2	0.44	−0.87	−0.74
17 Brain, cerebellum	0.0	0.2	0.3	−1.61	−0.65	−0.55
18 Brain, amygdala	3.1	0.1	0.2	0.49	−0.95	−0.75
19 Brain, caudate nucleus	0.2	0.1	0.1	−0.78	−1.00	−0.88
20 Brain, corpus callosum	0.2	0.1	0.1	−0.61	−1.10	−1.02
21 Brain, substantia nigra	0.2	0.1	0.2	−0.72	−0.85	−0.78
22 Brain, thalamus	0.2	0.1	0.1	−0.75	−1.16	−1.05
23 Brain, subthalamic nucleus	0.1	0.1	0.1	−1.16	−1.24	−0.96

	TABLE 7

	RQ Score	Log₁₀RQ Score

	031_01	031_02	031_03	031_01	031_02	031_03

01 Brain, Fetal	0.3	1.9	1.8	−0.46	0.28	0.27
02 Brain, whole	10.2	1.3	1.8	1.01	0.10	0.26
03 ALZ Visual Cortex Occipital	5.6	0.4	0.6	0.75	−0.44	−0.21
04 Mix, viscus tissues	1.0	1.0	1.0	0.0	0.0	0.0
05 Mix, blood cells and related tissues	0.5	0.8	0.9	−0.31	−0.11	−0.03
06 Mix, tumor tissues	0.8	0.7	0.8	−0.11	−0.17	−0.08
07 Mix, normal tissues	0.8	1.1	1.3	−0.11	0.05	0.13
08 Brain, whole PolyA(+) RNA	3.0	0.1	0.3	0.48	−0.82	−0.57
09 Brain, hippocampus	2.1	0.1	0.2	0.32	−0.88	−0.72
10 Brain, cerebellum	0.0	0.1	0.2	−1.96	−0.87	−0.80
11 Brain, amygdala	2.3	0.1	0.2	0.37	−0.97	−0.75
12 Brain, caudate nucleus	0.1	0.1	0.1	−0.96	−1.06	−0.96
13 Brain, corpus callosum	0.2	0.1	0.1	−0.82	−1.16	−1.09
14 Brain, substantia nigra	0.1	0.1	0.1	−0.99	−1.01	−0.95
15 Brain, thalamus	0.1	0.0	0.1	−1.05	−1.34	−1.23
16 Brain, subthalamic nucleus	0.0	0.1	0.1	−1.37	−1.28	−1.03

Expression levels were compared using the 23 kinds of samples shown in Example 3, including 11 kinds of brain tissues and NT2 cells at 7 different differentiation stages. The comparison was made using the mixed sample of normal visceral tissues shown in Example 3 (Mix, viscus tissues) as an experimental control.
The ratio of ORF alteration due to transcription initiation point selectivity as compared between 031_01 (SEQ ID NO:64) and 031_02 (SEQ ID NO:67) changed greatly among the following brain portions and NT2 cell differentiation stages.
In the brain, particularly in Brain, hippocampus and Brain, amygdala, the transcription from the downstream transcription initiation point shown by 031_01 (SEQ ID NO:64) was abundant (Table 6 and Table 7). No major difference was observed among the other portions of the brain.
Furthermore, when compared in detail with respect to NT2 cell differentiation, the expression of the mRNA transcribed from the transcription initiation point shown by 031_02 (SEQ ID NO:67), registered with an existing public DB, was abundant in undifferentiated NT2 cells NT2RA (−) and NT2RA (+) 1 week, which represents the initial stage in which retinoic acid was added to induce differentiation; however, in NT2RA (+) 5 weeks, predicted to be rich in nerve cells after differentiation, the expression level reversed; in the subsequent stages of NT2RA (+) 5 weeks, Inh (+), and NT2 Neuron, the expression of the mRNA transcribed from the downstream transcription initiation point shown by 031_01 (SEQ ID NO:64) was abundant (Table 6 and Table 7).
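The comparison between 031_01 and 031_02 described above can be read directly off the Log₁₀RQ columns: subtracting the two log scores gives the log of their ratio. The sketch below uses a subset of the Table 6 values; the helper name `fold_ratio` is hypothetical. Because both RQ scores are expressed relative to the same control sample (Mix, viscus tissues), the result is a ratio relative to that control, not an absolute ratio of transcript copy numbers.

```python
# Log10 RQ scores (031_01, 031_02) for selected samples, copied from Table 6.
LOG10_RQ = {
    "NT2RA(-)": (-3.12, -0.85),
    "NT2RA(+) 1 week": (-2.32, -0.81),
    "NT2RA(+) 5 weeks": (-0.03, -0.45),
    "NT2RA(+) 5 weeks, Inh(+)": (0.36, -0.37),
    "NT2 Neuron": (-1.00, -1.51),
    "Brain, hippocampus": (0.44, -0.87),
    "Brain, amygdala": (0.49, -0.95),
    "Brain, cerebellum": (-1.61, -0.65),
}


def fold_ratio(log_a, log_b):
    """Return RQ_a / RQ_b given the two log10 RQ scores."""
    return 10 ** (log_a - log_b)


if __name__ == "__main__":
    for sample, (log_01, log_02) in LOG10_RQ.items():
        ratio = fold_ratio(log_01, log_02)
        # A ratio above 1 means 031_01 is enriched relative to 031_02,
        # compared with the control sample.
        leader = "031_01" if ratio > 1 else "031_02"
        print(f"{sample}: 031_01/031_02 = {ratio:.2f} ({leader} higher)")
```

Working from the log scores rather than the rounded RQ scores avoids dividing by values that the table shows as 0.0.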
These results demonstrated that by comparing the expression of the 5′-terminal region 031_[1]_1-N1 (SEQ ID NO:59) of a newly acquired cDNA shown by the detection region 031_01 (SEQ ID NO:64) (a region close to the transcription initiation point), it is possible to use the 5′-terminal region as a marker specific for the brain, particularly for nerve-rich portions such as Brain, hippocampus (nerve differentiation, nerve regeneration marker and the like), and as a differentiation marker for detecting cells in nerve cell differentiation or regeneration stages, particularly those that have differentiated into a nerve. It also seems possible to develop a new drug by means of a compound, antibody, siRNA or the like that targets a region that exhibits specificity.
The following regions also seem to be useful as markers specific to the brain, particularly to the nerve-rich portions such as Brain, hippocampus (nerve differentiation, nerve regeneration marker and the like), and as differentiation markers for detecting nerve cells in differentiation or regeneration stages, particularly those that have differentiated into nerves.
Upstream sequence 031_[1]_1-N3 (SEQ ID NO:71), which comprises the 80th to 101st bases undergoing priming by Primer031_01R (SEQ ID NO:63) in D-NT2RI3005525.1 of the cDNA pattern [1].
Region 031_01 (SEQ ID NO:64) amplified by Primer031_01F (SEQ ID NO:62) and Primer031_01R (SEQ ID NO:63) in the cDNA pattern [1].

Example 7

Cluster chr7-2007 (Data Set: 067)

(1) Cluster Analysis

1) Cluster Characteristics

An analysis was performed on 10 sequences of full-length cDNAs subjected to genome mapping onto the cluster chr7-2007 (Human genome UCSC hg18 (NCBI Build34) chromosome 7, 26,400,000 bp to 26,850,000 bp) [D-NT2RP8004592.1, D-NT2RP7010844.1, Z-NT2RP7020087-01, BC002893.2, BC036044.1, ENST00000338865, ENST00000345317, NM_003930.3, XM_498174.1, XM_499404.1]. They were classified according to expression pattern difference into 5 kinds, which mainly included the following 2 kinds.

[1] D-NT2RP8004592.1

[2] BC002893.2, BC036044.1, NM_003930.3

[1] is a cDNA which was newly acquired and subjected to full-length cDNA sequence analysis by us, having a different ORF from [2], which is registered with an existing public DB.
[1] had a different ORF region from [2] because of its expression from a chromosome region located downstream of the known [2], and hence a shift of the translation initiation point toward the C-terminal side.
It was found that the ORF regions present in the 2 kinds of cDNA patterns [1] and [2] cause expression starting at different transcription initiation points, from the same chromosome region, resulting in alterations of the amino acid sequences to produce diverse proteins and mRNAs.
2) Characteristics of D-NT2RP8004592.1 ([1]), which was Newly Acquired and Subjected to Full-Length cDNA Sequence Analysis by Us
067_[1]_1-N0 (SEQ ID NO:72): The entire nucleic acid sequence region of D-NT2RP8004592.1
067_[1]_1-NA0 (SEQ ID NO:73): Both the entire nucleic acid sequence region and amino acid sequence of D-NT2RP8004592.1
067_[1]_1-A0 (SEQ ID NO:74): The entire amino acid sequence region of D-NT2RP8004592.1
The exon at the 1st to 169th bases of D-NT2RP8004592.1 (SEQ ID NO:75) (1st exon) is an exon that is not present in NM_003930.3, which is registered with an existing public DB and serves as a control, lacking homology thereto. The exon at the 1st to 359th bases of NM_003930.3 (1st exon) is an exon that is not present in D-NT2RP8004592.1, lacking homology thereto. The second exon and beyond are present commonly in both cDNAs. The translation termination point of the ORF of NM_003930.3 is the same as that of D-NT2RP8004592.1; however, because the translation initiation point is present on the 1st exon, which is not present in D-NT2RP8004592.1, the N-terminus of the ORF differed. Because the translation initiation point of D-NT2RP8004592.1 is present on the 6th exon, which is shared by NM_003930.3, the amino acid sequence on the N-terminal side shortened by 172 residues, compared with NM_003930.3 (SEQ ID NO:265).
067_[1]_1-N1 (SEQ ID NO:75): A 169-base insert nucleic acid sequence region of D-NT2RP8004592.1
067_[1]_1-N2 (SEQ ID NO:76): A 619-base 5′UTR region of an ORF whose translation initiation point is the 620th base of D-NT2RP8004592.1
067_[1]_C-A1 (SEQ ID NO:265): A 172-residue deletion amino acid sequence region of D-NT2RP8004592.1 present in NM_003930.3

3) Expression Specificity Analysis and Design of Primers for Real-Time PCR

To clearly distinguish between the characteristic regions shown above, and examine the respective expression levels thereof, the following regions were used as detection regions. It seemed possible to compare the expression levels of the individual characteristic regions by comparing the expression levels of the detection regions.
067_01: A specific region present on the N-terminal side of the cDNA pattern [1]; a translation initiation region of the cDNA pattern [1], which was newly subjected to full-length cDNA sequence analysis by us, being a novel region not registered with an existing public DB
→Fragment 067_01 (SEQ ID NO:79) amplified by Primer067_01F (SEQ ID NO:77) and Primer067_01R (SEQ ID NO:78)
067_03: Transcription initiation point region of [2], which is registered with an existing public DB, serving as a control for comparing [1]
→Fragment 067_03 (SEQ ID NO:82) amplified by Primer067_03F (SEQ ID NO:80) and Primer067_03R (SEQ ID NO:81)
067_04: A common region shared by [1] and [2]; a region common to all patterns, serving as a control to compare the overall expression levels of the cDNA pattern [1], which was newly subjected to full-length cDNA sequence analysis by us, and the cDNA pattern [2], which is registered with an existing public DB
→Fragment 067_04 (SEQ ID NO:85) amplified by Primer067_04F (SEQ ID NO:83) and Primer067_04R (SEQ ID NO:84)
By mapping the 5′-terminal sequences of about 1.44 million sequences acquired using the oligocap method onto the human genome sequence, and comparatively analyzing them, the exon regions specific for the cDNA patterns [1] and [2] shown above, respectively, were found to be expressed at the following frequencies.
In the cDNA pattern [1], which was newly acquired and analyzed by us, eighteen 5′-terminal sequences were present, the derivations thereof being NT2 cells treated with retinoic acid (RA) to induce differentiation (NT2RP) for 16 sequences (analytical parameter 39,242), and NT2 cells treated with retinoic acid (RA) to induce differentiation for 5 weeks, and thereafter treated with a growth inhibitor for 2 weeks (NT2RI) for 2 sequences (analytical parameter 32,662); all were derived from NT2 cells after differentiation.
In the cDNA pattern [2], which is registered with an existing public DB, one hundred twenty-two (122) 5′-terminal sequences were present, the derivations thereof being NT2 cells for 45 sequences, brain tissues for 25 sequences, and others for 47 sequences.
From this result, it was found that the transcription initiation point of [1] was expressed specifically in NT2 cells after differentiation. From the transcription initiation point of [2], expression was observed in NT2 cells, brain tissues and various other tissues. Hence, it was suggested that in this chromosome region, the mechanism of transcription may differ, and may result in different transcription initiation points being used only at the nerve cell differentiation states of NT2 cells after differentiation.

(2) Analysis of Expression Specificity by Real-Time PCR

To determine what are the states in which the transcription initiation point used for the expression changes, details of expression levels were analyzed by real-time PCR. The results are shown in Table 8 and Table 9.

	TABLE 8

	RQ Score	Log₁₀RQ Score

	067_01	067_03	067_04	067_01	067_03	067_04

01 NT2RA(−)	0.0	0.0	0.0	−1.58	−2.04	−1.79
02 NT2RA(+) 24 hr	3.1	0.5	0.6	0.49	−0.29	−0.19
03 NT2RA(+) 48 hr	8.6	1.6	1.5	0.93	0.21	0.19
04 NT2RA(+) 1 week	21.6	1.5	1.9	1.33	0.16	0.27
05 NT2RA(+) 5 weeks	103.8	3.3	11.3	2.02	0.52	1.05
06 NT2RA(+) 5 weeks, Inh(+)	3.2	1.2	2.6	0.51	0.07	0.41
07 NT2 Neuron	30.3	0.7	0.4	1.48	−0.16	−0.37
08 Brain, Fetal	0.1	0.2	0.3	−0.95	−0.66	−0.51
09 Brain, whole	0.9	0.6	1.0	−0.05	−0.19	0.01
10 ALZ Visual Cortex Occipital	0.4	0.3	0.7	−0.36	−0.51	−0.17
11 Mix, viscus tissues	1.0	1.0	1.0	0.0	0.0	0.0
12 Mix, blood cells and related tissues	2.3	1.9	2.5	0.36	0.27	0.40
13 Mix, tumor tissues	0.7	0.3	0.3	−0.14	−0.46	−0.49
14 Mix, normal tissues	2.3	1.0	1.2	0.37	−0.01	0.07
15 Brain, whole PolyA(+) RNA	0.2	0.2	0.5	−0.71	−0.66	−0.29
16 Brain, hippocampus	0.1	0.2	0.4	−0.94	−0.75	−0.35

	TABLE 9

	RQ Score	Log₁₀RQ Score

	067_01	067_03	067_04	067_01	067_03	067_04

01 NT2RA(−)	0.0	0.0	0.0	−1.53	−2.04	−1.78
02 NT2RA(+) 24 hr	3.2	0.6	0.8	0.50	−0.23	−0.11
03 NT2RA(+) 48 hr	10.6	1.6	1.7	1.03	0.21	0.22
04 NT2RA(+) 1 week	25.0	1.5	1.9	1.40	0.18	0.28
05 NT2RA(+) 5 weeks	125.3	3.7	13.6	2.10	0.57	1.13
06 NT2RA(+) 5 weeks, Inh(+)	4.4	1.3	3.4	0.64	0.11	0.53
07 NT2 Neuron	25.5	0.6	0.4	1.41	−0.19	−0.37
08 Brain, Fetal	0.2	0.2	0.3	−0.63	−0.64	−0.48
09 Brain, whole	1.0	0.7	1.2	−0.01	−0.16	0.10
10 ALZ Visual Cortex Occipital	0.4	0.3	0.7	−0.35	−0.47	−0.15
11 Mix, viscus tissues	1.0	1.0	1.0	0.0	0.0	0.0
12 Mix, blood cells and related tissues	2.1	2.1	3.2	0.32	0.32	0.50
13 Mix, tumor tissues	0.4	0.3	0.4	−0.45	−0.48	−0.42
14 Mix, normal tissues	2.7	0.9	1.2	0.44	−0.02	0.08
15 Brain, whole PolyA(+) RNA	0.2	0.3	0.7	−0.60	−0.54	−0.16
16 Brain, hippocampus	0.1	0.2	0.6	−0.88	−0.62	−0.22

Expression levels were compared using the 16 samples shown in Example 3, including Brain, hippocampus, Brain, whole, Brain, Fetal, Alzheimer patient cerebral cortex (ALZ Visual Cortex Occipital), and NT2 cells at 7 different differentiation stages and the like. The comparison was made using the mixed sample of normal visceral tissues shown in Example 3 (Mix, viscus tissues) as an experimental control.
The ratio of ORF alteration due to transcription initiation point selectivity as compared between 067_01 (SEQ ID NO:79) and 067_03 (SEQ ID NO:82) changed greatly depending on NT2 cell differentiation stage. When compared in detail with respect to NT2 cell differentiation, the ratio of the transcription from the transcription initiation point shown by 067_01 (SEQ ID NO:79) was higher than that from the transcription initiation point shown by 067_03 (SEQ ID NO:82) in NT2RA (+) 1 week to NT2RA (+) 5 weeks, advanced stages of differentiation, compared with undifferentiated NT2 cells NT2RA (−) and NT2RA (+) 48 hr, which represents the initial stage in which retinoic acid was added to induce differentiation (Table 8 and Table 9). Subsequently, in NT2RA (+) 5 weeks, Inh (+), the difference narrowed, but in NT2 Neuron, the ratio of transcription represented by 067_01 (SEQ ID NO:79) increased again (Table 8 and Table 9).
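The time course described above can be traced by taking, for each NT2 stage, the difference between the Log₁₀RQ scores of 067_01 and 067_03. The sketch below uses the Table 8 values; the function name `log_differences` is illustrative only, and the differences are relative to the control sample (Mix, viscus tissues), as in the tables.

```python
# (stage, Log10 RQ of 067_01, Log10 RQ of 067_03), copied from Table 8.
STAGES = [
    ("NT2RA(-)", -1.58, -2.04),
    ("NT2RA(+) 24 hr", 0.49, -0.29),
    ("NT2RA(+) 48 hr", 0.93, 0.21),
    ("NT2RA(+) 1 week", 1.33, 0.16),
    ("NT2RA(+) 5 weeks", 2.02, 0.52),
    ("NT2RA(+) 5 weeks, Inh(+)", 0.51, 0.07),
    ("NT2 Neuron", 1.48, -0.16),
]


def log_differences(rows):
    """Return (stage, log10(RQ_067_01 / RQ_067_03)) in the order given."""
    return [(stage, round(log_01 - log_03, 2)) for stage, log_01, log_03 in rows]


if __name__ == "__main__":
    for stage, diff in log_differences(STAGES):
        # 10 ** diff converts the log difference back to a fold ratio.
        print(f"{stage}: log10 ratio = {diff:+.2f}, fold = {10 ** diff:.1f}")
```

On these values the log difference rises through 1 week and 5 weeks, drops at 5 weeks, Inh (+), and is largest in NT2 Neuron, which matches the description in the text.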
These results demonstrated that by comparing the expression of the 5′-terminal region (a region close to the transcription initiation point) 067_[1]_1-N1 (SEQ ID NO:75) of a newly acquired cDNA shown by the detection region 067_01 (SEQ ID NO:79), it is possible to use the 5′-terminal region as a differentiation marker for detecting cells in nerve cell differentiation or regeneration stages, particularly those that have differentiated into a nerve. It also seems possible to develop a new drug by means of a compound, antibody, siRNA or the like that targets a region that exhibits specificity.
The following regions also seem to be useful as differentiation markers for detecting nerve cells in differentiation or regeneration stages, particularly those that have differentiated into nerves.
Upstream sequence 067_[1]_1-N3 (SEQ ID NO:86), which comprises the 65th to 84th bases undergoing priming by Primer067_01R (SEQ ID NO:78) in D-NT2RP8004592.1 of the cDNA pattern [1].
Region 067_01 (SEQ ID NO:79) amplified by Primer067_01F (SEQ ID NO:77) and Primer067_01R (SEQ ID NO:78) in the cDNA pattern [1].

Example 8

Cluster chrX-900 (Data Set: 122)

(1) Cluster Analysis

1) Cluster Characteristics

An analysis was performed on 7 full-length cDNAs subjected to genome mapping onto the cluster chrX-900 (Human genome UCSC hg18 (NCBI Build34) chromosome X, 43,380,000 bp to 43,500,000 bp) [D-NT2RI2014164.1, D-BRAMY2029564.1, D-BRAMY2029564.1, BC022494.1, ENST00000265833, M69177.1, NM_000898.3]. They were classified according to expression pattern difference into 4 kinds, which mainly included the following 3 kinds.

[1] D-NT2RI2014164.1

[2] D-BRAMY2029564.1

[3] BC022494.1, ENST00000265833, M69177.1, NM_000898.3

[1] is a cDNA which was newly acquired and subjected to full-length cDNA sequence analysis by us, and had a different ORF from that of [3], which is registered with an existing public DB, because of the expression thereof from a chromosome region located downstream of the known [3].
[2] is a cDNA which was newly acquired and subjected to full-length cDNA sequence analysis by us, having a different ORF from that of the known [3] because of the insertion of an exon different from the other patterns in the ORF region [3].
It was found that the ORF regions present in the 3 kinds of cDNA patterns [1] to [3] cause expression starting at different transcription initiation points, from the same chromosome region, and have different splice patterns, such as exon insertions, resulting in alterations of the amino acid sequences to produce diverse proteins and mRNAs.
2) Characteristics of D-NT2RI2014164.1 ([1]), which was Newly Acquired and Subjected to Full-Length cDNA Sequence Analysis by Us
122_[1]_1-N0 (SEQ ID NO:87): The entire nucleic acid sequence region of D-NT2RI2014164.1
122_[1]_1-NA0 (SEQ ID NO:88): Both the entire nucleic acid sequence region and amino acid sequence of D-NT2RI2014164.1
122_[1]_1-A0 (SEQ ID NO:89): The entire amino acid sequence region of D-NT2RI2014164.1
The sequence at the 1st to 156th bases of D-NT2RI2014164.1 (SEQ ID NO:90) is an exon that is not present in NM_000898.3, which is registered with an existing public DB and serves as a control, lacking homology to NM_000898.3. With this change, the translation initiation point of D-NT2RI2014164.1 shifts toward the 3′ side relative to NM_000898.3, and the 162nd base of D-NT2RI2014164.1 becomes the translation initiation point. For this reason, the amino acid sequence shortened by 16 residues, compared with NM_000898.3 (SEQ ID NO:266).
The 98-base exon present at the 1,274th to 1,371st bases of NM_000898.3 (SEQ ID NO:95) is absent from D-NT2RI2014164.1; the corresponding junction lies between the 1,250th and 1,251st bases of D-NT2RI2014164.1 (SEQ ID NO:92).
With this change, because of a translation frame change to cause the termination of the ORF at a stop codon different from that of NM_000898.3, the C-terminal amino acids differed by 48 residues, compared with NM_000898.3 (SEQ ID NO:93).
122_[1]_1-N1 (SEQ ID NO:90): A 156-base insert nucleic acid sequence region of D-NT2RI2014164.1
122_[1]_1-N2 (SEQ ID NO:91): A 161-base 5′UTR region of an ORF whose translation initiation point is the 162nd base of D-NT2RI2014164.1
122_[1]_1-N3 (SEQ ID NO:92): A deletion nucleic acid sequence region of D-NT2RI2014164.1
122_[1]_1-A1 (SEQ ID NO:93): Amino acid sequence region altered as a result of deletion of D-NT2RI2014164.1
122_[1]_1-N4 (identical to SEQ ID NO:92): An ORF nucleic acid region in the deletion region of D-NT2RI2014164.1
122_[1]_1-A2 (SEQ ID NO:94): An ORF amino acid sequence region related to the deletion region of D-NT2RI2014164.1
122_[1]_C-N1 (SEQ ID NO:95): A 98-base exon nucleic acid sequence present at the 1,274th to 1,371st bases of NM_000898.3 inserted into the region at the 1,250th to 1,251st bases of D-NT2RI2014164.1
122_[1]_C-A1 (SEQ ID NO:96): A 33-residue amino acid sequence related to the 98-base exon nucleic acid sequence present at the 1,274th to 1,371st bases of NM_000898.3 inserted into the region at the 1,250th to 1,251st bases of D-NT2RI2014164.1
122_[1]_C-A2 (SEQ ID NO:266): A 16-residue deletion amino acid sequence region of D-NT2RI2014164.1 present in NM_000898.3
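The coordinate arithmetic behind the features listed above can be checked in a few lines. The sketch below is illustrative only, with hypothetical helper names: it confirms that bases 1,274 to 1,371 span 98 bases, that 98 is not a multiple of three (98 = 3 × 32 + 2), so that the absence of this exon would be expected to shift the reading frame downstream, and that a 161-base 5′UTR places the translation initiation point at the 162nd base.

```python
def exon_length(first_base, last_base):
    """Length of a region given 1-based inclusive coordinates."""
    return last_base - first_base + 1


def shifts_reading_frame(length):
    """True if removing or inserting this many bases changes the codon frame."""
    return length % 3 != 0


def start_position(utr_length):
    """1-based position of the first ORF base following a 5'UTR of the given length."""
    return utr_length + 1


if __name__ == "__main__":
    length = exon_length(1274, 1371)  # exon present in NM_000898.3 only
    print(f"exon length: {length} bases")
    print(f"remainder modulo 3: {length % 3}")
    print(f"frame shift expected: {shifts_reading_frame(length)}")
    print(f"translation initiation base after a 161-base 5'UTR: {start_position(161)}")
```

A frame shift of this kind is consistent with the altered C-terminal sequence and the different stop codon described in the text.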
3) Characteristics of D-BRAMY2029564.1 ([2]), which was Newly Acquired and Subjected to Full-Length cDNA Sequence Analysis by Us
122_[2]_1-N0 (SEQ ID NO:97): The entire nucleic acid sequence region of D-BRAMY2029564.1
122_[2]_1-NA0 (SEQ ID NO:98): Both the entire nucleic acid sequence region and amino acid sequence of D-BRAMY2029564.1
122_[2]_1-A0 (SEQ ID NO:99): The entire amino acid sequence region of D-BRAMY2029564.1
The sequence at the 90th to 140th bases of D-BRAMY2029564.1 (SEQ ID NO:100) is an exon that is not present in NM_000898.3, which is registered with an existing public DB and serves as a control, lacking homology to NM_000898.3. With this change, the translation initiation point of D-BRAMY2029564.1 shifts toward the 3′ side, compared with NM_000898.3, and the 143rd base of D-BRAMY2029564.1 becomes a translation initiation point. For this reason, the amino acid sequence shortened by 16 residues, compared with NM_000898.3 (identical to SEQ ID NO:266).
122_[2]_1-N1 (SEQ ID NO:100): A 43-base insert nucleic acid sequence region of D-BRAMY2029564.1
122_[2]_1-N2 (SEQ ID NO:101): A 142-base 5′UTR region of an ORF whose translation initiation point is the 143rd base of D-BRAMY2029564.1
122_[2]_C-A1 (identical to SEQ ID NO:266): A 16-residue deletion amino acid sequence region of D-BRAMY2029564.1 present in NM_000898.3

4) Expression Specificity Analysis and Design of Primers for Real-Time PCR

To clearly distinguish between the characteristic regions shown above, and examine the respective expression levels thereof, the following regions were used as detection regions. It seemed possible to compare the expression levels of the individual characteristic regions by comparing the expression levels of the detection regions.
122_01: A specific region present on the N-terminal side of the cDNA pattern [1]; a translation initiation region of the cDNA pattern [1], which was newly subjected to full-length cDNA sequence analysis by us, being a novel region not registered with an existing public DB
→Fragment 122_01 (SEQ ID NO:104) amplified by Primer122_01F (SEQ ID NO:102) and Primer122_01R (SEQ ID NO:103)
122_02: A region specifically extracted by means of the sequence information on regions of the exon insertion of cDNA pattern [2]; an ORF-altering exon insert region in the cDNA pattern [2], which was newly subjected to full-length cDNA sequence analysis by us
→Fragment 122_02 (SEQ ID NO:107) amplified by Primer122_02F (SEQ ID NO:105) and Primer122_02R (SEQ ID NO:106)
122_03: A transcription initiation point region of [3], which is registered with an existing public DB, serving as a control for comparing [1] and [2]
→Fragment 122_03 (SEQ ID NO:110) amplified by Primer122_03F (SEQ ID NO:108) and Primer122_03R (SEQ ID NO:109)
122_04: A common region shared by all of [1] to [3]; a region common to all patterns, serving as a control to compare the overall expression levels of the cDNA patterns [1] and [2], which were newly subjected to full-length cDNA sequence analysis by us, and the cDNA pattern [3], which is registered with an existing public DB
→Fragment 122_04 (SEQ ID NO:113) amplified by Primer122_04F (SEQ ID NO:111) and Primer122_04R (SEQ ID NO:112)
By mapping the 5′-terminal sequences of about 1.44 million sequences acquired using the oligocap method onto the human genome sequence, and comparatively analyzing them, the exon regions specific to the cDNA patterns [1] to [3] shown above, respectively, were found to be expressed at the following frequencies.
In the cDNA pattern [1], which was newly acquired and analyzed by us, four 5′-terminal sequences were present, the derivations thereof being NT2 cells treated with retinoic acid (RA) and treated with a growth inhibitor to induce nerve differentiation, followed by nerve concentration and recovery (NT2NE) for 2 sequences (analytical parameter 16,337), and NT2 cells treated with retinoic acid (RA) to induce differentiation for 5 weeks, and thereafter treated with a growth inhibitor for 2 weeks (NT2RI) for 2 sequences (analytical parameter 32,662).
In the cDNA pattern [2], which was newly acquired and analyzed by us, two 5′-terminal sequences were present, the derivation thereof being Brain, amygdala for the 2 sequences (analytical parameter 58,640).
In the cDNA pattern [3], which is registered with an existing public DB, fifty-nine 5′-terminal sequences were present, the derivations thereof being Uterus for 11 sequences (analytical parameter 49,561), brain tissues for 19 sequences, and a variety of other tissues for the other sequences.
From this result, it was found that the transcription initiation point of [1] was abundantly expressed in differentiated NT2 cells. It was also found that the exon insertion pattern [2] was abundantly expressed in the brain. The transcription initiation point of [3] was expressed in various tissues. Hence, it was thought that the mechanism of transcription or splice pattern in this chromosome region might be unique to particular tissues such as the brain and nerve cells after differentiation, to alter amino acids, with a selection mechanism arising for mRNA pattern changes resulting in the expression of different proteins.

(2) Analysis of Expression Specificity by Real-Time PCR

To detect protein expression diversity changes due to transcription initiation point or exon selectivity among different tissues, details of expression levels were analyzed by real-time PCR. The results are shown in Table 10 and Table 11.

	TABLE 10

	RQ Score	Log₁₀RQ Score

	122_01	122_02	122_03	122_04	122_01	122_02	122_03	122_04

01 NT2RA(−)	0.0	0.0	0.0	0.0	−1.89	−1.37	−1.82	−1.89
02 NT2RA(+) 24 hr	0.3	0.1	0.0	0.0	−0.59	−0.99	−2.55	−2.85
03 NT2RA(+) 48 hr	1.3	0.6	0.0	0.0	0.11	−0.25	−2.28	−2.58
04 NT2RA(+) 1 week	3.4	1.3	0.1	0.0	0.54	0.10	−1.16	−1.65
05 NT2RA(+) 5 weeks	0.3	0.2	0.1	0.1	−0.51	−0.72	−1.00	−1.09
06 NT2RA(+) 5 weeks, Inh(+)	0.5	0.3	0.2	0.2	−0.28	−0.59	−0.61	−0.69
07 NT2 Neuron	5.1	0.6	0.0	0.0	0.71	−0.19	−1.34	−2.21
08 Brain, Fetal	0.6	1.7	0.4	0.2	−0.19	0.22	−0.43	−0.73
09 Brain, whole	2.3	13.4	1.1	0.5	0.36	1.13	0.02	−0.26
10 ALZ Visual Cortex Occipital	1.0	6.5	0.9	0.4	−0.01	0.82	−0.07	−0.42
11 Mix, viscus tissues	1.0	1.0	1.0	1.0	0.0	0.0	0.0	0.0
12 Mix, blood cells and related tissues	1.1	2.1	1.1	0.7	0.06	0.33	0.03	−0.13
13 Mix, tumor tissues	0.5	0.2	0.2	0.3	−0.33	−0.61	−0.67	−0.53
14 Mix, normal tissues	3.7	4.6	1.7	1.3	0.57	0.67	0.23	0.12
15 Brain, whole PolyA(+) RNA	0.3	7.2	0.7	0.3	−0.48	0.86	−0.17	−0.48
16 Brain, hippocampus	0.4	4.6	0.9	0.5	−0.44	0.66	−0.05	−0.35
17 Brain, cerebellum	0.2	2.9	0.5	0.2	−0.81	0.47	−0.31	−0.68
18 Brain, amygdala	0.5	4.2	1.0	0.5	−0.32	0.62	0.00	−0.29
19 Brain, caudate nucleus	0.4	4.5	0.9	0.7	−0.35	0.66	−0.06	−0.17
20 Brain, corpus callosum	0.2	0.6	1.1	0.6	−0.64	−0.20	0.05	−0.19
21 Brain, substantia nigra	0.4	2.2	1.1	0.6	−0.44	0.35	0.02	−0.23
22 Brain, thalamus	0.2	4.0	0.6	0.3	−0.76	0.60	−0.23	−0.48
23 Brain, subthalamic nucleus	0.1	0.8	0.8	0.4	−1.12	−0.09	−0.07	−0.43

	TABLE 11

	RQ Score	Log₁₀RQ Score

	122_01	122_02	122_03	122_04	122_01	122_02	122_03	122_04

01 NT2RA(−)	0.0	0.0	0.0	0.0	−1.68	−1.42	−1.85	−1.83
02 NT2RA(+) 24 hr	0.8	0.2	0.0	0.0	−0.08	−0.82	−2.32	−2.79
03 NT2RA(+) 48 hr	3.4	0.7	0.0	0.0	0.54	−0.15	−2.32	−2.70
04 NT2RA(+) 1 week	8.5	2.3	0.1	0.0	0.93	0.36	−1.23	−1.65
05 NT2RA(+) 5 weeks	0.8	0.3	0.1	0.1	−0.11	−0.52	−0.98	−1.15
06 NT2RA(+) 5 weeks, Inh(+)	1.4	0.5	0.3	0.2	0.16	−0.32	−0.55	−0.62
07 NT2 Neuron	14.1	0.7	0.0	0.0	1.15	−0.18	−1.36	−2.21
08 Brain, Fetal	1.6	2.7	0.4	0.2	0.21	0.43	−0.45	−0.75
09 Brain, whole	7.5	19.4	1.4	0.6	0.87	1.29	0.13	−0.21
10 ALZ Visual Cortex Occipital	3.0	10.9	1.0	0.4	0.48	1.04	−0.02	−0.40
11 Mix, viscus tissues	1.0	1.0	1.0	1.0	0.0	0.0	0.0	0.0
12 Mix, blood cells and related tissues	1.3	4.0	1.0	0.4	0.11	0.60	0.01	−0.42
13 Mix, tumor tissues	1.2	0.5	0.2	0.2	0.08	−0.34	−0.64	−0.74
14 Mix, normal tissues	5.0	11.8	2.0	1.8	0.70	1.07	0.30	0.25
15 Brain, whole PolyA(+) RNA	1.1	12.0	0.7	0.4	0.04	1.08	−0.13	−0.45
16 Brain, hippocampus	1.2	8.0	1.0	0.5	0.06	0.90	−0.01	−0.30
17 Brain, cerebellum	0.4	4.0	0.5	0.2	−0.43	0.60	−0.30	−0.69
18 Brain, amygdala	0.9	5.6	0.9	0.5	−0.03	0.75	−0.03	−0.31
19 Brain, caudate nucleus	1.1	6.6	1.1	0.5	0.03	0.82	0.05	−0.27
20 Brain, corpus callosum	0.5	1.1	1.1	0.6	−0.26	0.04	0.04	−0.21
21 Brain, substantia nigra	0.8	2.6	0.7	0.4	−0.10	0.41	−0.19	−0.39
22 Brain, thalamus	0.4	5.2	0.5	0.3	−0.35	0.72	−0.32	−0.58
23 Brain, subthalamic nucleus	0.2	0.9	0.6	0.3	−0.77	−0.06	−0.25	−0.57

Expression levels were compared using the 23 kinds of samples shown in Example 3, including 11 kinds of brain tissues and NT2 cells at 7 different differentiation stages. The comparison was made using the mixed sample of normal visceral tissues shown in Example 3 (Mix, viscus tissues) as an experimental control.
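The relation between the two column groups of Table 10 and Table 11 can be sketched as follows. This is a minimal illustration, assuming that the Log₁₀RQ score is simply the base-10 logarithm of the RQ score and that the control sample (Mix, viscus tissues) is fixed at RQ = 1.0; the function name `log10_rq` is hypothetical and is not taken from the specification. Because the tables print RQ rounded to one decimal place, a value recomputed from a printed RQ can differ slightly from the printed Log₁₀RQ.

```python
import math


def log10_rq(rq: float) -> float:
    """Return log10 of a relative quantity (RQ); the control sample has RQ = 1.0."""
    if rq <= 0:
        raise ValueError("RQ must be positive to take a logarithm")
    return math.log10(rq)


# The control sample (Mix, viscus tissues) is the reference, so its score is 0.
control = log10_rq(1.0)

# Brain, whole, detection region 122_02 in Table 10 (printed RQ 13.4).
brain_whole = round(log10_rq(13.4), 2)
```

A positive score therefore means higher expression than in the control sample, and a negative score means lower expression.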
The ratio of ORF alteration due to transcription initiation point selectivity and exon selectivity, as compared among 122_01 (SEQ ID NO:104), 122_02 (SEQ ID NO:107) and 122_03 (SEQ ID NO:110), changed greatly among the following brain portions and NT2 cell differentiation stages.
In all portions of the brain, the exon insertion pattern shown by 122_02 (SEQ ID NO:107) was expressed more abundantly than the pattern shown by 122_03 (SEQ ID NO:110) (Table 10 and Table 11).
For the downstream transcription initiation point shown by 122_01 (SEQ ID NO:104), the expression level varied greatly among the differentiation stages of NT2 cells. A detailed comparison across NT2 cell differentiation showed that, in undifferentiated NT2 cells (NT2RA (−)), the expression level of the pattern with an insertion was the same as that of the pattern without an insertion; however, in NT2RA (+) 24 hr, NT2RA (+) 48 hr, and NT2RA (+) 1 week, which represent the initial stage of differentiation induced by the addition of retinoic acid, the ratio of selection of the downstream transcription initiation point increased greatly, and the difference became smaller in the late stage of differentiation (Table 10 and Table 11).
These results demonstrated that by comparing the expression of the 5′-terminal region of a newly acquired cDNA shown by the detection region 122_01 (SEQ ID NO:104) (a region close to the transcription initiation point), 122_[1]-N1 (SEQ ID NO:90), or the expression of a newly acquired cDNA region 122_[2]-N1 (SEQ ID NO:100), shown by the detection region 122_02 (SEQ ID NO:107), it is possible to use these regions as differentiation markers for detecting cells in nerve cell differentiation or regeneration stages, particularly those in an early stage of differentiation into nerve cells. It also seems possible to develop a new drug by means of a compound, antibody, siRNA or the like that targets a region that exhibits specificity.
The following regions also seem to be useful as differentiation markers for detecting nerve cell differentiation or regeneration stages, particularly initial stages of differentiation into nerve cells.
Upstream sequence 031_[1]_1-N3 (SEQ ID NO:114), which comprises the 138th to 162nd bases undergoing priming by Primer122_01R (SEQ ID NO:103) in D-NT2RI2014164.1 of the cDNA pattern [1].
Upstream sequence 031_[1]_1-N3 (SEQ ID NO:115), which comprises the 177th to 198th bases undergoing priming by Primer122_02R (SEQ ID NO:106) in D-BRAMY2029564.1 of the cDNA pattern [2].
Region 122_01 (SEQ ID NO:104) amplified by Primer122_01F (SEQ ID NO:102) and Primer122_01R (SEQ ID NO:103) in the cDNA pattern [1].
Region 122_02 (SEQ ID NO:107) amplified by Primer122_02F (SEQ ID NO:105) and Primer122_02R (SEQ ID NO:106) in the cDNA pattern [2].

Example 9

Cluster chr8-916 (Data Set: 124)

(1) Cluster Analysis

1) Cluster Characteristics

An analysis was performed on 10 sequences of full-length cDNAs subjected to genome mapping onto the cluster chr8-916 (Human genome UCSC hg18 (NCBI Build34) chromosome 8, 81,100,000 bp to 81,325,000 bp) [D-BRHIP2003515.1, D-COLON2003937.1, Z-BRCOC2013886-01, BC018117.1, BX640835.1, C-SMINT1000078, ENST00000263850, NM_005079.1, U18914.1, XM_374275.1]. They were classified according to expression pattern difference into 4 kinds, which mainly included the following 2 kinds.

[1] D-BRHIP2003515.1

[2] BC018117.1, NM_005079.1, U18914.1

[1] is a cDNA that was newly acquired and subjected to full-length cDNA sequence analysis by us, and has a different ORF from [2], which had been registered with an existing public DB.
[1], compared with the known [2], had a different ORF region because the insertion of an exon not found in the other patterns altered the amino acid sequence in the ORF region.
It was found that the ORF regions present in the 2 kinds of cDNA patterns [1] and [2] have different splice patterns, although they derive from the same chromosome region, resulting in alterations of the amino acid sequences to produce diverse proteins and mRNAs.
2) Characteristics of D-BRHIP2003515.1 ([1]), which was Newly Acquired and Subjected to Full-Length cDNA Sequence Analysis by Us
124_[1]_1-N0 (SEQ ID NO:116): The entire nucleic acid sequence region of D-BRHIP2003515.1
124_[1]_1-NA0 (SEQ ID NO:117): Both the entire nucleic acid sequence region and amino acid sequence of D-BRHIP2003515.1
124_[1]_1-A0 (SEQ ID NO:118): The entire amino acid sequence region of D-BRHIP2003515.1
The sequence at the 471st to 539th bases of D-BRHIP2003515.1 (SEQ ID NO:119) corresponds to an exon that is not present in NM_005079.1, which is registered with an existing public DB and serves as a control. The translation initiation point and translation termination point of D-BRHIP2003515.1 are the same as those of NM_005079.1; however, because of the insertion of the 69-base exon into D-BRHIP2003515.1, the amino acid length is increased by 23 residues (SEQ ID NO:120), compared with NM_005079.1.
124_[1]_1-N1 (SEQ ID NO:119): A 69-base insert nucleic acid sequence region of D-BRHIP2003515.1
124_[1]_1-A1 (SEQ ID NO:120): A 23-residue insert amino acid sequence region of D-BRHIP2003515.1
124_[1]_1-N2 (identical to SEQ ID NO:119): An ORF nucleic acid sequence region in the 69-base insert region of D-BRHIP2003515.1
124_[1]_1-A2 (identical to SEQ ID NO:120): An ORF amino acid region related to the 69-base insert region of D-BRHIP2003515.1
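The arithmetic linking the base positions, the 69-base insert, and the 23 added residues can be checked with a short sketch. It assumes 1-based, inclusive base numbering (as the "471st to 539th bases" wording suggests); the helper names are hypothetical.

```python
from typing import Optional


def inclusive_length(first_base: int, last_base: int) -> int:
    """Length of a region given 1-based, inclusive base positions."""
    return last_base - first_base + 1


def residues_if_in_frame(n_bases: int) -> Optional[int]:
    """Number of whole codons, or None if the length is not a multiple of 3."""
    return n_bases // 3 if n_bases % 3 == 0 else None


insert_len = inclusive_length(471, 539)            # 471st to 539th bases
added_residues = residues_if_in_frame(insert_len)  # whole codons added in frame
```

Because the insert length is a multiple of three, the reading frame downstream of the insert is unchanged, which is consistent with the translation termination point being the same as that of the control sequence.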

3) Expression Specificity Analysis and Design of Primers for Real-Time PCR

To clearly distinguish between the characteristic regions shown above, and examine the respective expression levels thereof, the following regions were used as detection regions. It seemed possible to compare the expression levels of the individual characteristic regions by comparing the expression levels of the detection regions.
124_04: A region specifically extracted by means of the sequence information at the border of a region having an exon inserted therein in the cDNA pattern [1], i.e., an insert region of an ORF-altering exon in the cDNA pattern [1], which was newly subjected to full-length cDNA sequence analysis by us
→Fragment 124_04 (SEQ ID NO:123) amplified by Primer124_04F (SEQ ID NO:121) and Primer124_04R (SEQ ID NO:122)
124_05: A specific region of the cDNA pattern [2], which is registered with an existing public DB, corresponding to the deletion of the region inserted in [1], serving as a control for comparison with [1]
→Fragment 124_05 (SEQ ID NO:126) amplified by Primer124_05F (SEQ ID NO:124) and Primer124_05R (SEQ ID NO:125)
124_06: A common region shared by [1] and [2], i.e., a region common to all patterns, serving as a control to compare the overall expression levels of the cDNA pattern [1], which was newly subjected to full-length cDNA sequence analysis by us, and the cDNA pattern [2], which is registered with an existing public DB
→Fragment 124_06 (SEQ ID NO:129) amplified by Primer124_06F (SEQ ID NO:127) and Primer124_06R (SEQ ID NO:128)
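The idea of a detection region defined by the sequence at an exon border can be illustrated with a toy in-silico amplification. This is only a sketch under stated assumptions: all sequences below are hypothetical and are not the sequences of SEQ ID NO:121 to 129, primers are required to match exactly, and the helper names are invented for illustration.

```python
from typing import Optional


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def amplicon(template: str, fwd: str, rev: str) -> Optional[str]:
    """Return the region bounded by both primers, or None if either fails to match."""
    start = template.find(fwd)
    if start == -1:
        return None
    rev_site = reverse_complement(rev)  # reverse primer site on the sense strand
    end = template.find(rev_site, start + len(fwd))
    if end == -1:
        return None
    return template[start:end + len(rev_site)]


# Hypothetical toy exons, for illustration only.
upstream_exon = "ATGGCCTTAGGC"
inserted_exon = "GATTACAGATTACA"
downstream_exon = "CCGTTAGGCTAA"
with_insert = upstream_exon + inserted_exon + downstream_exon
without_insert = upstream_exon + downstream_exon

fwd = "TTAGGCGATTAC"                    # spans the upstream/inserted exon border
rev = reverse_complement("CCGTTAGGC")   # matches within the downstream exon

product_with_insert = amplicon(with_insert, fwd, rev)
product_without_insert = amplicon(without_insert, fwd, rev)
```

In this toy case the border-spanning forward primer yields a product only from the template that carries the inserted exon, which is the property that lets a detection region distinguish one cDNA pattern from another.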
By mapping the 5′-terminal sequences of about 1.44 million sequences acquired using the oligocap method onto the human genome sequence, and comparatively analyzing them, the exon regions specific to the cDNA patterns [1] to [2] shown above, respectively, were found to be expressed at the following frequencies.
In the cDNA pattern [1], which was newly acquired and analyzed by us, twenty-one 5′-terminal sequences were present, the derivations thereof being brain tissues such as Brain, amygdala, Brain, cerebellum, and Brain, hippocampus for 18 sequences and Kidney, Tumor for 3 sequences.
In the cDNA pattern [2], which is registered with an existing public DB, fifty-one 5′-terminal sequences were present, the derivations thereof being brain tissues such as Brain, substantia nigra, Brain, hippocampus, Brain, amygdala, and Brain, corpus callosum for 17 sequences, tumor tissues such as Tongue, Tumor, and Kidney, Tumor for 9 sequences, and other normal tissues such as Lung, Small Intestine, and Trachea for the remaining sequences.
From this result, it was found that the exon insertion pattern [1] was abundantly expressed in the brain. It was also found that the exon deletion pattern [2] was expressed not only in the brain, but also in other various tissues. Hence, it was thought that the mechanism for amino acid alteration due to exon insertion in this chromosome region to cause the expression of different proteins, as with the pattern [1], might be unique to particular tissues.
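The comparison behind this conclusion can be made explicit by computing the share of 5′-terminal sequences derived from brain tissues for each pattern. The counts are those stated above; the function name is hypothetical, and the sketch does not correct for differences in library size.

```python
def brain_fraction(brain_count: int, total_count: int) -> float:
    """Fraction of 5'-terminal sequences that were derived from brain tissues."""
    return brain_count / total_count


pattern_1 = round(brain_fraction(18, 21), 2)  # exon insertion pattern [1]
pattern_2 = round(brain_fraction(17, 51), 2)  # exon deletion pattern [2]
```

Under these counts, most of the sequences for pattern [1] come from brain tissues, whereas only about one third of those for pattern [2] do.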

(2) Analysis of Expression Specificity by Real-Time PCR

To detect protein expression diversity changes due to exon selectivity among different tissues, details of expression levels were analyzed by real-time PCR. The results are shown in Table 12 and Table 13.

	TABLE 12

	RQ Score	Log₁₀RQ Score

	124_04	124_05	124_06	124_04	124_05	124_06

01 NT2RA(−)	0.1	0.1	0.1	−1.28	−1.07	−1.06
02 NT2RA(+) 24 hr	0.2	0.1	0.1	−0.74	−1.21	−1.22
03 NT2RA(+) 48 hr	0.0	0.1	0.1	−1.47	−1.25	−1.27
04 NT2RA(+) 1 week	0.1	0.0	0.0	−0.83	−1.58	−1.60
05 NT2RA(+) 5 weeks	8.6	0.0	0.1	0.93	−1.32	−1.07
06 NT2RA(+) 5 weeks, Inh(+)	4.7	0.1	0.1	0.67	−1.15	−1.03
07 NT2 Neuron	1.1	0.0	0.0	0.04	−2.08	−1.79
08 Brain, Fetal	148.6	0.0	0.4	2.17	−1.59	−0.38
09 Brain, whole	465.6	0.3	2.1	2.67	−0.47	0.32
10 ALZ Visual Cortex Occipital	286.6	0.3	1.1	2.46	−0.49	0.05
11 Mix, viscus tissues	1.0	1.0	1.0	0.0	0.0	0.0
12 Mix, blood cells and related tissues	14.6	0.5	0.6	1.16	−0.34	−0.26
13 Mix, tumor tissues	0.4	0.9	0.8	−0.40	−0.04	−0.09
14 Mix, normal tissues	1.0	0.9	0.8	−0.01	−0.06	−0.10
15 Brain, whole PolyA(+) RNA	190.4	0.3	1.3	2.28	−0.54	0.12
16 Brain, hippocampus	189.2	0.3	1.1	2.28	−0.50	0.06
17 Brain, cerebellum	247.5	0.1	1.6	2.39	−0.84	0.21
18 Brain, amygdala	191.9	0.2	0.9	2.28	−0.74	−0.03
19 Brain, caudate nucleus	134.8	0.4	0.9	2.13	−0.45	−0.05
20 Brain, corpus callosum	25.0	1.1	1.2	1.40	0.03	0.09
21 Brain, substantia nigra	70.3	0.5	0.9	1.85	−0.29	−0.06
22 Brain, thalamus	194.7	0.3	1.0	2.29	−0.57	0.01
23 Brain, subthalamic nucleus	22.2	0.6	0.6	1.35	−0.25	−0.25

	TABLE 13

	RQ Score	Log₁₀RQ Score

	124_04	124_05	124_06	124_04	124_05	124_06

01 NT2RA(−)	0.2	0.1	0.1	−0.72	−1.19	−0.99
02 NT2RA(+) 24 hr	1.0	0.1	0.1	−0.02	−1.19	−1.14
03 NT2RA(+) 48 hr	0.3	0.1	0.1	−0.48	−1.24	−1.14
04 NT2RA(+) 1 week	0.7	0.0	0.0	−0.17	−1.56	−1.47
05 NT2RA(+) 5 weeks	41.4	0.0	0.1	1.62	−1.33	−0.97
06 NT2RA(+) 5 weeks, Inh(+)	30.7	0.1	0.1	1.49	−1.09	−0.89
07 NT2 Neuron	6.2	0.0	0.0	0.79	−1.95	−1.59
08 Brain, Fetal	839.9	0.0	0.5	2.92	−1.62	−0.27
09 Brain, whole	3655.9	0.3	2.6	3.56	−0.50	0.41
10 ALZ Visual Cortex Occipital	1899.0	0.3	1.6	3.28	−0.46	0.19
11 Mix, viscus tissues	1.0	1.0	1.0	0.0	0.0	0.0
12 Mix, blood cells and related tissues	100.9	0.5	0.7	2.00	−0.28	−0.17
13 Mix, tumor tissues	1.9	0.8	0.9	0.28	−0.11	−0.05
14 Mix, normal tissues	8.9	0.7	1.2	0.95	−0.16	0.08
15 Brain, whole PolyA(+) RNA	1539.2	0.3	1.7	3.19	−0.55	0.22
16 Brain, hippocampus	1524.4	0.3	1.5	3.18	−0.48	0.16
17 Brain, cerebellum	2130.5	0.2	2.3	3.33	−0.80	0.36
18 Brain, amygdala	1379.6	0.2	1.2	3.14	−0.75	0.09
19 Brain, caudate nucleus	804.0	0.4	1.1	2.91	−0.45	0.04
20 Brain, corpus callosum	163.7	1.1	1.4	2.21	0.04	0.16
21 Brain, substantia nigra	386.9	0.5	1.1	2.59	−0.32	0.04
22 Brain, thalamus	1285.4	0.3	1.3	3.11	−0.59	0.10
23 Brain, subthalamic nucleus	181.6	0.6	0.8	2.26	−0.26	−0.11

Expression levels were compared using the 23 kinds of samples shown in Example 3, including 11 kinds of brain tissues and NT2 cells at 7 different differentiation stages. The comparison was made using the mixed sample of normal visceral tissues shown in Example 3 (Mix, viscus tissues) as an experimental control.
The ratio of ORF alteration due to exon insertion/deletion selectivity, as compared between 124_04 (SEQ ID NO:123) and 124_05 (SEQ ID NO:126), changed greatly among the following tissues and NT2 cell differentiation stages.
In all portions of the brain, the exon insertion pattern shown by 124_04 (SEQ ID NO:123) was abundantly expressed (Table 12 and Table 13).
It was found that in NT2 cells, exon selectivity changed greatly depending on the stage of differentiation. A detailed comparison across NT2 cell differentiation showed almost no difference between the two patterns 124_04 (SEQ ID NO:123) and 124_05 (SEQ ID NO:126) in undifferentiated NT2 cells (NT2RA (−)) and in NT2RA (+) 1 week, which represents the initial stage of differentiation induced by adding retinoic acid to NT2 cells; however, in NT2RA (+) 5 weeks to NT2 Neuron, the exon insertion pattern shown by 124_04 (SEQ ID NO:123) was expressed considerably more abundantly (Table 12 and Table 13).
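One way to put a number on this comparison is to subtract the Log₁₀RQ score of the deletion pattern from that of the insertion pattern for each sample. The sketch below uses four rows of Table 12; the dictionary and function names are hypothetical. It assumes, as the tables indicate, that each detection region is scaled separately to the control sample, so the difference expresses the insertion/deletion ratio relative to that ratio in the control, not an absolute ratio.

```python
# (Log10 RQ of 124_04 [insertion], Log10 RQ of 124_05 [deletion]), from Table 12.
TABLE_12 = {
    "NT2RA(-)": (-1.28, -1.07),
    "NT2RA(+) 5 weeks": (0.93, -1.32),
    "NT2 Neuron": (0.04, -2.08),
    "Brain, whole": (2.67, -0.47),
}


def log_ratio(log_insertion: float, log_deletion: float) -> float:
    """Difference of log10 scores, i.e. log10 of the insertion/deletion ratio."""
    return round(log_insertion - log_deletion, 2)


ratios = {sample: log_ratio(*scores) for sample, scores in TABLE_12.items()}
```

A value near zero indicates that the two patterns behave alike relative to the control, while a value of 2 or more indicates a shift of at least two orders of magnitude toward the insertion pattern.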
These results demonstrated that by comparing the expression of the selective exon region 124_[1]_1-N1 (SEQ ID NO:119) of a newly acquired cDNA shown by the detection region 124_04 (SEQ ID NO:123), it is possible to use the exon region as a brain-specific marker, and as a differentiation marker for detecting cells in nerve cell differentiation or regeneration stages, particularly those after nerve differentiation or nerve regeneration. It also seems possible to develop a new drug by means of a compound, antibody, siRNA or the like that targets a region that exhibits specificity.
The following regions also seem to be useful as markers specific for the brain, and as differentiation markers for detecting nerve cells in differentiation or regeneration stages, particularly those after nerve differentiation or after nerve regeneration.
Upstream sequence 124_[1]_1-N3 (SEQ ID NO:130), which comprises the 472nd to 491st bases undergoing priming by Primer124_04R (SEQ ID NO:122) in D-BRHIP2003515.1 of the cDNA pattern [1].
Region 124_04 (SEQ ID NO:123) amplified by Primer124_04F (SEQ ID NO:121) and Primer124_04R (SEQ ID NO:122) in the cDNA pattern [1].

Example 10

Cluster chr3+2014 (Data Set: 112)

(1) Cluster Analysis

1) Cluster Characteristics

An analysis was performed on 7 full-length cDNAs subjected to genome mapping onto the cluster chr3+2014 (Human genome UCSC hg18 (NCBI Build34) chromosome 3, 143,070,000 bp to 143,130,000 bp) [D-BRACE2044661.1, BC011835.2, C-BRAMY2022929, C-PRS09188, ENST00000286371, NM_001679.2, U51478.1]. They were classified according to expression pattern difference into 4 kinds, which mainly included the following 2 kinds.

[1] D-BRACE2044661.1

[2] BC011835.2, ENST00000286371, NM_001679.2, U51478.1

[1] is a cDNA that was newly acquired and subjected to full-length cDNA sequence analysis by us, and has a different ORF from [2], which had been registered with an existing public DB.
[1], compared with the known [2], had a different ORF because the insertion of an exon not found in the other patterns altered the translation initiation point.
It was found that the ORF regions present in the 2 kinds of cDNA patterns [1] and [2] have different splice patterns, although they derive from the same chromosome region, resulting in alterations of the amino acid sequences to produce diverse proteins and mRNAs.
2) Characteristics of D-BRACE2044661.1 ([1]), which was Newly Acquired and Subjected to Full-Length cDNA Sequence Analysis by Us
112_[1]_1-N0 (SEQ ID NO:131): The entire nucleic acid sequence region of D-BRACE2044661.1
112_[1]_1-NA0 (SEQ ID NO:132): Both the entire nucleic acid sequence region and amino acid sequence of D-BRACE2044661.1
112_[1]_1-A0 (SEQ ID NO:133): The entire amino acid sequence region of D-BRACE2044661.1
The sequence at the 272nd to 363rd bases of D-BRACE2044661.1 (SEQ ID NO:134) is an exon that is not present in, and lacks homology to, NM_001679.2, which is registered with an existing public DB and serves as a control. Because a translation initiation point is present on this exon, the amino acids on the N-terminal side changed by 23 residues (SEQ ID NO:135).
112_[1]_1-N1 (SEQ ID NO:134): A 92-base insert nucleic acid sequence region of D-BRACE2044661.1
112_[1]_1-A1 (SEQ ID NO:135): A 23-residue insert amino acid sequence region of D-BRACE2044661.1
112_[1]_1-N2 (SEQ ID NO:136): An ORF nucleic acid sequence region in the 92-base insert region of D-BRACE2044661.1
112_[1]_1-A2 (identical to SEQ ID NO:135): An ORF amino acid sequence region in the 92-base insert region of D-BRACE2044661.1
The sequence at the 837th to 856th bases of D-BRACE2044661.1 (SEQ ID NO:137) is an exon that is not present in, and lacks homology to, NM_001679.2, which is registered with an existing public DB and serves as a control. Because this insert sequence changes the translation frame, the amino acids on the C-terminal side changed by 13 residues (SEQ ID NO:138).
112_[1]_1-N3 (SEQ ID NO:137): A 20-base insert nucleic acid sequence region of D-BRACE2044661.1
112_[1]_1-A3 (SEQ ID NO:138): A 13-residue insert amino acid sequence region of D-BRACE2044661.1
112_[1]_1-N4 (identical to SEQ ID NO:137): An ORF nucleic acid sequence region in the 20-base insert region of D-BRACE2044661.1
112_[1]_1-A4 (SEQ ID NO:139): An ORF amino acid sequence region in the 20-base insert region of D-BRACE2044661.1
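Why the 20-base insert changes the translation frame can be seen from its length modulo 3. The following sketch assumes 1-based, inclusive base positions; the helper names are hypothetical.

```python
def inclusive_length(first_base: int, last_base: int) -> int:
    """Length of a region given 1-based, inclusive base positions."""
    return last_base - first_base + 1


def frame_offset(n_bases: int) -> int:
    """Remainder modulo 3; a nonzero value shifts the downstream reading frame."""
    return n_bases % 3


n_terminal_insert = inclusive_length(272, 363)  # 272nd to 363rd bases
c_terminal_insert = inclusive_length(837, 856)  # 837th to 856th bases
shifts_frame = frame_offset(c_terminal_insert) != 0
```

An insert whose length is not a multiple of three causes the codons downstream of it to be read in a different frame, which fits the stated change in the C-terminal amino acids.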

3) Expression Specificity Analysis and Design of Primers for Real-Time PCR

To clearly distinguish between the characteristic regions shown above, and examine the respective expression levels thereof, the following regions were used as detection regions. It seemed possible to compare the expression levels of the individual characteristic regions by comparing the expression levels of the detection regions.
112_01: A region incorporating an exon of the cDNA pattern [1], specifically extracted by means of the sequence information at the border, i.e., an ORF-altering exon insert region in the cDNA pattern [1], which was newly subjected to full-length cDNA sequence analysis by us
→Fragment 112_01 (SEQ ID NO:142) amplified by Primer112_01F (SEQ ID NO:140) and Primer112_01R (SEQ ID NO:141)
112_02: A specific region of the cDNA pattern [2], which is registered with an existing public DB, corresponding to the deletion of the region inserted in [1], serving as a control for comparison with [1]
→Fragment 112_02 (SEQ ID NO:145) amplified by Primer112_02F (SEQ ID NO:143) and Primer112_02R (SEQ ID NO:144)
112_03: A common region shared by [1] and [2], i.e., a region common to all patterns, serving as a control to compare the overall expression levels of the cDNA pattern [1], which was newly subjected to full-length cDNA sequence analysis by us, and the cDNA pattern [2], which is registered with an existing public DB
→Fragment 112_03 (SEQ ID NO:148) amplified by Primer112_03F (SEQ ID NO:146) and Primer112_03R (SEQ ID NO:147)
By mapping the 5′-terminal sequences of about 1.44 million sequences acquired using the oligocap method onto the human genome sequence, and comparatively analyzing them, the regions specific for the 2 kinds of cDNA patterns [1] to [2] shown above, respectively, were found to be expressed at the following frequencies.
In the cDNA pattern [1], which was newly acquired and analyzed by us, six 5′-terminal sequences were present, the derivations thereof being Brain, cerebellum for 3 sequences (analytical parameter 82,880), Brain, cortex, Alzheimer for 1 sequence (analytical parameter 16,360), Brain, amygdala for 1 sequence (analytical parameter 58,640), and tissues rich in head portion from a 10-week-gestational human fetus (whole embryo, mainly head) for 1 sequence (analytical parameter 7,033).
In the cDNA pattern [2], which is registered with an existing public DB, twenty-four 5′-terminal sequences were present, the derivations thereof being Placenta for 4 sequences (analytical parameter 46,090), NT2 cells treated with retinoic acid (RA) to induce differentiation (NT2RP) for 3 sequences (analytical parameter 39,242), Tongue, Tumor for 2 sequences (analytical parameter 31,371), IMR32 cells (Neuroblastoma) for 2 sequences (analytical parameter 16,964), NT2 cells treated with retinoic acid and a growth inhibitor to induce nerve differentiation, followed by nerve concentration and recovery (NT2NE) for 2 sequences (analytical parameter 16,337) and the like; this pattern was expressed in various tissues.
From this result, it was found that the exon insertion pattern [1] was abundantly expressed in the brain. It was also found that the exon deletion pattern [2] was expressed not only in the brain, but also in other various tissues. Hence, it was thought that the selection mechanism for mRNA pattern change in this chromosome region, which alters N-terminal amino acids and results in the expression of different proteins because of exon insertion as with the pattern [1], might be unique to particular tissues.
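Raw sequence counts from libraries of different sizes are easier to compare after normalization. The sketch below assumes that the "analytical parameter" is the number of 5′-terminal sequences analyzed for that library; this reading is an assumption made for illustration, and the names `per_100k` and `PATTERN_1` are hypothetical.

```python
def per_100k(count: int, library_size: int) -> float:
    """Occurrences per 100,000 analyzed 5'-terminal sequences."""
    return count / library_size * 100_000


# (number of sequences, analytical parameter) for the cDNA pattern [1].
PATTERN_1 = {
    "Brain, cerebellum": (3, 82_880),
    "Brain, cortex, Alzheimer": (1, 16_360),
    "Brain, amygdala": (1, 58_640),
    "Fetal whole embryo, mainly head": (1, 7_033),
}

normalized = {
    tissue: round(per_100k(count, size), 2)
    for tissue, (count, size) in PATTERN_1.items()
}
```

With so few sequences per library, these normalized values are only a rough guide; the real-time PCR analysis that follows provides the quantitative comparison.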

(2) Analysis of Expression Specificity by Real-Time PCR

To detect protein expression diversity changes due to exon selectivity among different tissues, details of expression levels were analyzed by real-time PCR. The results are shown in Table 14.

	TABLE 14

	RQ Score	Log₁₀RQ Score

	112_01	112_02	112_03	112_01	112_02	112_03

01 NT2RA(−)	0.4	0.5	1.2	−0.35	−0.26	0.09
02 NT2RA(+) 24 hr	0.5	0.3	0.5	−0.33	−0.48	−0.33
03 NT2RA(+) 48 hr	0.4	0.5	0.6	−0.41	−0.32	−0.22
04 NT2RA(+) 1 week	0.2	0.5	0.6	−0.74	−0.32	−0.21
05 NT2RA(+) 5 weeks	2.0	0.9	2.0	0.29	−0.03	0.31
06 NT2RA(+) 5 weeks, Inh(+)	4.1	0.8	1.5	0.62	−0.12	0.18
07 NT2 Neuron	7.6	0.9	1.3	0.88	−0.03	0.11
08 Brain, Fetal	23.3	1.2	2.3	1.37	0.08	0.36
09 Brain, whole	158.2	0.6	1.8	2.20	−0.21	0.26
10 ALZ Visual Cortex Occipital	109.3	0.3	1.2	2.04	−0.55	0.08
11 Mix, viscus tissues	1.0	1.0	1.0	0.0	0.0	0.0
12 Mix, blood cells and related tissues	10.1	0.8	1.2	1.00	−0.12	0.07
13 Mix, tumor tissues	0.9	1.6	1.2	−0.03	0.22	0.06
14 Mix, normal tissues	2.4	1.3	1.6	0.37	0.10	0.20
15 Brain, whole PolyA(+) RNA	114.0	0.3	0.9	2.06	−0.56	−0.03
16 Brain, hippocampus	56.4	0.3	0.7	1.75	−0.55	−0.15
17 Brain, cerebellum	149.9	0.6	1.6	2.18	−0.20	0.21
18 Brain, amygdala	47.5	0.3	0.9	1.68	−0.47	−0.06
19 Brain, caudate nucleus	47.9	0.3	0.8	1.68	−0.59	−0.11
20 Brain, corpus callosum	8.7	0.3	0.6	0.94	−0.53	−0.19
21 Brain, substantia nigra	56.7	0.4	1.0	1.75	−0.37	0.01
22 Brain, thalamus	124.0	0.3	1.2	2.09	−0.60	0.07
23 Brain, subthalamic nucleus	26.1	0.4	0.7	1.42	−0.37	−0.16

Expression levels were compared using the 23 kinds of samples shown in Example 3, including 11 kinds of brain tissues and NT2 cells at 7 different differentiation stages. The comparison was made using the mixed sample of normal visceral tissues shown in Example 3 (Mix, viscus tissues) as an experimental control.
The ratio of ORF alteration due to exon insertion/deletion selectivity, as compared between 112_01 (SEQ ID NO:142) and 112_02 (SEQ ID NO:145), changed greatly among the following brain portions and NT2 cell differentiation stages.
In the brain, particularly in Brain, cerebellum, Brain, hippocampus, Brain, amygdala, Brain, caudate nucleus, Brain, substantia nigra, and Brain, thalamus, the exon insertion pattern shown by 112_01 (SEQ ID NO:142) was abundantly observed (Table 14).
It was also found that in NT2 cells, exon selectivity varied greatly depending on the stage of differentiation. A detailed comparison across NT2 cell differentiation showed that the exon deletion pattern shown by 112_02 (SEQ ID NO:145), which is registered with an existing public DB, was expressed more abundantly in undifferentiated NT2 cells (NT2RA (−)) and in NT2RA (+) 48 hr and NT2RA (+) 1 week, which represent the initial stage of differentiation induced by the addition of retinoic acid; however, in NT2RA (+) 5 weeks, which is predicted to be rich in nerve cells after differentiation, the expression levels reversed, and in NT2RA (+) 5 weeks, Inh (+) and NT2 Neuron as well, the exon insertion pattern shown by 112_01 (SEQ ID NO:142) was abundantly observed (Table 14).
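The reversal described above can be traced directly in the NT2 rows of Table 14. The sketch below lists the sample numbers in which the RQ score of the deletion pattern (112_02) exceeds that of the insertion pattern (112_01); the variable and function names are hypothetical, and the comparison uses the printed, rounded RQ scores.

```python
# (sample number, sample, RQ of 112_01 [insertion], RQ of 112_02 [deletion]),
# taken from the NT2 rows of Table 14.
NT2_STAGES = [
    ("01", "NT2RA(-)", 0.4, 0.5),
    ("02", "NT2RA(+) 24 hr", 0.5, 0.3),
    ("03", "NT2RA(+) 48 hr", 0.4, 0.5),
    ("04", "NT2RA(+) 1 week", 0.2, 0.5),
    ("05", "NT2RA(+) 5 weeks", 2.0, 0.9),
    ("06", "NT2RA(+) 5 weeks, Inh(+)", 4.1, 0.8),
    ("07", "NT2 Neuron", 7.6, 0.9),
]


def deletion_dominant(stages):
    """Sample numbers where the deletion pattern has the higher RQ score."""
    return [num for num, _sample, rq_ins, rq_del in stages if rq_del > rq_ins]


dominant = deletion_dominant(NT2_STAGES)
```

The samples selected this way are the undifferentiated and early-stage samples, and every sample from NT2RA(+) 5 weeks onward falls on the insertion side.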
These results demonstrated that by comparing the expression of the selective exon region 112_[1]-N1 (SEQ ID NO:134) of a newly acquired cDNA shown by the detection region 112_01 (SEQ ID NO:142), it is possible to use the exon region as a marker specific for the brain, particularly for portions such as Brain, cerebellum, Brain, hippocampus, Brain, amygdala, Brain, caudate nucleus, Brain, substantia nigra, and Brain, thalamus, and as a differentiation marker for detecting cells in nerve cell differentiation or regeneration stages, particularly those that have differentiated or regenerated into a nerve. It also seems possible to develop a new drug by means of a compound, antibody, siRNA or the like that targets a region that exhibits specificity.
The following regions also seem to be useful as differentiation markers.
Upstream sequence 112_[1]_1-N5 (SEQ ID NO:149), which comprises the 363rd to 390th bases undergoing priming by Primer112_01R (SEQ ID NO:141) in D-BRACE2044661.1 of the cDNA pattern [1].
Region 112_01 (SEQ ID NO:142) amplified by Primer112_01F (SEQ ID NO:140) and Primer112_01R (SEQ ID NO:141) in the cDNA pattern [1].

Example 11

Cluster chr12+1658 (Data Set: 095)

(1) Cluster Analysis

1) Cluster Characteristics

An analysis was performed on 7 sequences of full-length cDNAs subjected to genome mapping onto the cluster chr12+1658 (Human genome UCSC hg18 (NCBI Build34) chromosome 12, 108,470,000 bp to 108,500,000 bp) [D-BRCAN2027778.1, D-3NB692002462.1, BC016140.1, C-NT2RP3000875, ENST00000228510, M88468.1, NM_000431.1]. They were classifiable according to expression pattern difference mainly into the following 3 kinds.

[1] D-3NB692002462.1

[2] D-BRCAN2027778.1

[3] BC016140.1, ENST00000228510, M88468.1, NM_000431.1

[1] and [2] are cDNAs which were newly acquired and subjected to full-length cDNA sequence analysis by us, and had a different ORF from [3], which had been registered with an existing public DB.
[1], compared with the known [3], had a different ORF region because of the deletion of portions corresponding to the third and fourth exons of [3] in the ORF region.
[2], compared with the known [3], had a different ORF region because of the deletion of a portion corresponding to the fourth exon of [3] in the ORF region.
It was found that the ORF regions present in the 3 kinds of cDNA patterns [1] to [3] have different splice patterns, such as exon deletions, from the same chromosome region, resulting in alterations of the amino acid sequences to produce diverse proteins and mRNAs.
2) Characteristics of D-3NB692002462.1 ([1]), which was Newly Acquired and Subjected to Full-Length cDNA Sequence Analysis by Us
095_[1]_1-N0 (SEQ ID NO:150): The entire nucleic acid sequence region of D-3NB692002462.1
095_[1]_1-NA0 (SEQ ID NO:151): Both the entire nucleic acid sequence region and amino acid sequence of D-3NB692002462.1
095_[1]_1-A0 (SEQ ID NO:152): The entire amino acid sequence region of D-3NB692002462.1
The 301-base exon present at the 303rd to 603rd bases of NM_000431.1, which is registered with an existing public DB and serves as a control (SEQ ID NO:155), is absent from D-3NB692002462.1; the deletion lies between the 287th and 288th bases of D-3NB692002462.1 (SEQ ID NO:153). The translation initiation point of NM_000431.1 is present on the first exon, which is shared by D-3NB692002462.1; however, in D-3NB692002462.1, because the deletion of the 301 bases alters the reading frame, the translation initiation point shifts toward the 3′ side, compared with NM_000431.1, and the 343rd base of D-3NB692002462.1 becomes the translation initiation point. For this reason, the N-terminal amino acid sequence is shortened by 194 residues, compared with NM_000431.1.
095_[1]_1-N1 (SEQ ID NO:153): A deletion nucleic acid sequence region of D-3NB692002462.1
095_[1]_1-N2 (SEQ ID NO:154): A 342-base 5′UTR region of an ORF whose translation initiation point is the 343rd base of D-3NB692002462.1
095_[1]_C-N1 (SEQ ID NO:155): A 301-base exon nucleic acid sequence present in the region at the 303rd to 603rd bases of NM_000431.1 inserted into the region at the 287th to 288th bases of D-3NB692002462.1
095_[1]_C-A1 (SEQ ID NO:156): A 101-residue amino acid sequence related to the 301-base exon nucleic acid sequence present in the region at the 303rd to 603rd bases of NM_000431.1 inserted into the region at the 1,250th to 1,251st bases of D-3NB692002462.1
With this change, “GHMP kinase putative ATP-binding protein”, the Pfam motif present at the 128th to 346th amino acids of NM_000431.1, disappeared in D-3NB692002462.1.
3) Characteristics of D-BRCAN2027778.1 ([2]), which was Newly Acquired and Subjected to Full-Length cDNA Sequence Analysis by Us
095_[2]_1-N0 (SEQ ID NO:157): The entire nucleic acid sequence region of D-BRCAN2027778.1
095_[2]_1-NA0 (SEQ ID NO:158): Both the entire nucleic acid sequence region and amino acid sequence of D-BRCAN2027778.1
095_[2]_1-A0 (SEQ ID NO:159): The entire amino acid sequence region of D-BRCAN2027778.1
The 156-base exon present at the 448th to 603rd bases of NM_000431.1, which is registered with an existing public DB and serves as a control (SEQ ID NO:162), is absent from D-BRCAN2027778.1; the deletion lies between the 422nd and 423rd bases of D-BRCAN2027778.1 (SEQ ID NO:160).
095_[2]_1-N1 (SEQ ID NO:160): A deletion nucleic acid sequence region of D-BRCAN2027778.1
095_[2]_1-A1 (SEQ ID NO:161): An altered amino acid sequence region of D-BRCAN2027778.1
095_[2]_1-N2 (identical to SEQ ID NO:160): An ORF nucleic acid sequence region in the deletion region of D-BRCAN2027778.1
095_[2]_1-A2 (identical to SEQ ID NO:161): An ORF amino acid region related to the deletion region of D-BRCAN2027778.1
095_[2]_C-N1 (SEQ ID NO:162): A 156-base exon nucleic acid sequence present in the region at the 448th to 603rd bases of NM_000431.1 inserted into the region at the 422nd to 423rd bases of D-BRCAN2027778.1
095_[2]_C-A1 (SEQ ID NO:163): A 101-residue amino acid sequence related to the 156-base exon nucleic acid sequence present in the region at the 448th to 603rd bases of NM_000431.1 inserted into the region at the 423rd to 424th bases of D-BRCAN2027778.1

4) Expression Specificity Analysis and Design of Primers for Real-Time PCR and TaqMan Probe

To clearly distinguish between the characteristic regions shown above, and examine the respective expression levels thereof, the following regions were used as detection regions. It seemed possible to compare the expression levels of the individual characteristic regions by comparing the expression levels of the detection regions.
095_—01—A region specifically extracted by means of the sequence information at the border of regions of the exon deletion of cDNA pattern [1]: an ORF-altering exon deletion region in the cDNA pattern [1], which was newly subjected to full-length cDNA sequence analysis by us
→Fragment 095_—01 (SEQ ID NO:166) amplified by Primer095_—01F (SEQ ID NO:164) and Primer095_—01R (SEQ ID NO:165); TaqMan probe used: 095_—01TP (SEQ ID NO:167)
095_—02—A region specifically extracted by means of the sequence information at the border of regions of the exon deletion of cDNA pattern [2]: an ORF-altering exon deletion in the cDNA pattern [2], which was newly subjected to full-length cDNA sequence analysis by us
→Fragment 095_—02 (SEQ ID NO:170) amplified by Primer095_—02F (SEQ ID NO:168) and Primer095_—02R (SEQ ID NO:169); TaqMan probe used: 095_—02TP (SEQ ID NO:171)
095_—03—A specific region of the cDNA pattern [3], which is registered with an existing public DB, that can be distinguished from both the deletion regions of [1] and [2], serving as a control for comparing [1] and [2]
→Fragment 095_—03 (SEQ ID NO:174) amplified by Primer095_—03F (SEQ ID NO:172) and Primer095_—03R (SEQ ID NO:173); TaqMan probe used: 095_—03TP (SEQ ID NO:175)
095_—04—A common region shared by all of [1] to [3]: a region common to all patterns, serving for control to compare the overall expression levels of the cDNA patterns [1] and [2], which were newly subjected to full-length cDNA sequence analysis by us, and the cDNA pattern [3], which is registered with an existing public DB
→Fragment 095_—04 (SEQ ID NO:178) amplified by Primer095_—04F (SEQ ID NO:176) and Primer095_—04R (SEQ ID NO:177); TaqMan probe used: 095_—04TP (SEQ ID NO:179)
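The relationship between a primer pair and the fragment it amplifies can be illustrated with a minimal sketch. The template and primer sequences below are hypothetical toy sequences, not the sequences of the SEQ ID NOs listed above, and the function names are assumptions introduced only for illustration. The sketch assumes exact primer matches and a forward primer on the sense strand.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def amplicon(template: str, forward_primer: str, reverse_primer: str):
    """Return the fragment bounded by an exactly matching primer pair.

    The forward primer is searched on the sense strand; the reverse primer
    anneals to the sense strand, so its reverse complement is searched
    downstream of the forward primer. Returns None if either site is missing.
    """
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end < 0:
        return None
    return template[start:end + len(reverse_site)]


# Hypothetical toy template: flank + forward site + insert + reverse site + flank.
TOY_TEMPLATE = "GGATCC" + "ATGGCTAGC" + "TTGACCGT" + "AAGCTTCCGG" + "AATTC"
TOY_FORWARD = "ATGGCTAGC"
TOY_REVERSE = reverse_complement("AAGCTTCCGG")  # written 5' to 3' on the antisense strand

fragment = amplicon(TOY_TEMPLATE, TOY_FORWARD, TOY_REVERSE)
print(fragment, len(fragment))
```

A primer or probe placed across the border of an exon deletion would, under the same assumptions, find its site only in a template that lacks the exon, which is one way a detection region can be made specific to a deletion pattern.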
By mapping the 5′-terminal sequences of about 1.44 million sequences acquired using the oligocap method onto the human genome sequence, and comparatively analyzing them, the exon regions specific for the cDNA patterns [1] to [3] shown above, respectively, were found to be expressed at the following frequencies.
In the cDNA pattern [1], which was newly acquired and analyzed by us, three 5′-terminal sequences were present, the derivations thereof being NB69 cells for 1 sequence (analytical parameter 8,153), NT2 cells treated with retinoic acid (RA) to induce differentiation (NT2RP) for 1 sequence (analytical parameter 39,242), and SK-N-MC cells (Neuroepithelioma) for 1 sequence (analytical parameter 7,700).
In the cDNA pattern [2], which was newly acquired and analyzed by us, three 5′-terminal sequences were present, the derivations thereof being: a subtraction library (BRALZ-BRAWH), generated by removing cDNAs that overlap with the mRNA of BRAWH (Brain, whole) from a cDNA library prepared from the mRNA of BRALZ [Alzheimer patient cerebral cortex (Brain, cortex, Alzheimer)], for 1 sequence (analytical parameter 157); Brain, caudate nucleus for 1 sequence (analytical parameter 25,786); and NT2 cells treated with retinoic acid and a growth inhibitor to induce nerve differentiation, followed by nerve concentration and recovery (NT2NE), for 1 sequence (analytical parameter 16,337).
In the cDNA pattern [3], which is registered with an existing public DB, thirty-four 5′-terminal sequences were present, and expression was observed in various tissues, the derivations thereof being Brain, cerebellum for 4 sequences (analytical parameter 82,880), Testis for 4 sequences (analytical parameter 90,188), NT2 cells treated with RA and treated with a growth inhibitor to induce nerve differentiation, followed by nerve concentration and recovery (NT2NE) for 3 sequences (analytical parameter 16,337), Brain, whole for 2 sequences (analytical parameter 59,069), Brain, subthalamic nucleus for 2 sequences (analytical parameter 16,308), Kidney for 2 sequences (analytical parameter 17,008), and Thymus for 2 sequences (analytical parameter 70,578).
From this result, it was found that the exon deletion pattern [1] was expressed in differentiated NT2 cells and the like, and that the exon deletion pattern [2] was abundantly expressed in the brain. The known sequence [3] was expressed in a wider variety of organs than the patterns [1] and [2]. Hence, it was thought that the mechanism that selects exons in this chromosome region, thereby altering the amino acid sequence and producing different proteins as in the patterns [1] and [2], might be unique to particular tissues.

(2) Analysis of Expression Specificity by Real-Time PCR

To detect protein expression diversity changes due to exon selectivity among different tissues, details of expression levels were analyzed by real-time PCR. The results are shown in Table 15.

	TABLE 15

Sample	RQ Score 095_01	RQ Score 095_02	RQ Score 095_03	RQ Score 095_04	Log₁₀RQ Score 095_01	Log₁₀RQ Score 095_02	Log₁₀RQ Score 095_03	Log₁₀RQ Score 095_04
01 NT2RA(−)	0.5	0.3	0.2	0.2	−0.34	−0.49	−0.78	−0.66
02 NT2RA(+) 24 hr	0.8	0.4	0.2	0.3	−0.11	−0.43	−0.78	−0.52
03 NT2RA(+) 48 hr	0.3	0.2	0.2	0.4	−0.49	−0.66	−0.70	−0.37
04 NT2RA(+) 1 week	0.9	0.4	0.7	1.2	−0.03	−0.36	−0.18	0.07
05 NT2RA(+) 5 weeks	0.2	0.2	0.3	0.3	−0.71	−0.80	−0.58	−0.48
06 NT2RA(+) 5 weeks, Inh(+)	0.1	0.2	0.3	0.2	−0.88	−0.64	−0.58	−0.63
07 NT2 Neuron	0.2	0.0	0.1	1.7	−0.72	−1.36	−1.16	0.23
08 Brain, Fetal	2.3	0.7	0.5	1.2	0.36	−0.16	−0.26	0.09
09 Brain, whole	1.0	0.3	0.4	0.7	−0.01	−0.52	−0.41	−0.16
10 ALZ Visual Cortex Occipital	0.5	0.2	0.2	0.3	−0.30	−0.73	−0.80	−0.60
11 Mix, viscus tissues	1.0	1.0	1.0	1.0	0.0	0.0	0.0	0.0
12 Mix, blood cells and related tissues	2.1	1.0	0.9	1.0	0.33	0.01	−0.05	0.01
13 Mix, tumor tissues	0.4	0.3	0.5	0.5	−0.45	−0.53	−0.30	−0.31
14 Mix, normal tissues	1.3	0.8	1.2	1.0	0.11	−0.11	0.07	0.00
15 Brain, whole PolyA(+) RNA	3.6	0.9	1.2	1.3	0.55	−0.05	0.09	0.10
16 Brain, hippocampus	1.9	0.4	0.7	0.7	0.27	−0.36	−0.13	−0.17

Expression levels were compared using the 16 samples shown in Example 3, including Brain, hippocampus, Brain, whole, Brain, Fetal, Alzheimer patient cerebral cortex (ALZ Visual Cortex Occipital) and NT2 cells at 7 different differentiation stages. The comparison was made using the mixed sample of normal visceral tissues shown in Example 3 (Mix, viscus tissues) as an experimental control.
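The two column groups of Table 15 are related by a base-10 logarithm, with the control sample (Mix, viscus tissues) at RQ 1.0 and Log₁₀RQ 0.0. The following minimal sketch recomputes the logarithm for a few 095_01 entries of Table 15. The function name and the dictionary layout are assumptions for illustration. Because the table lists RQ rounded to one decimal place, the recomputed values agree with the listed Log₁₀RQ values only approximately, and entries listed as 0.0 cannot be recomputed from the rounded value.

```python
import math


def log10_rq(rq: float):
    """Return log10 of a relative quantity; None when RQ is not positive."""
    if rq <= 0:
        return None
    return math.log10(rq)


# (RQ, listed Log10 RQ) pairs for detection region 095_01, taken from Table 15.
table15_095_01 = {
    "Brain, Fetal": (2.3, 0.36),
    "Mix, viscus tissues": (1.0, 0.0),
    "Brain, whole PolyA(+) RNA": (3.6, 0.55),
    "Brain, hippocampus": (1.9, 0.27),
}

for sample, (rq, listed) in table15_095_01.items():
    recomputed = log10_rq(rq)
    print(f"{sample}: recomputed {recomputed:.3f}, listed {listed:.2f}")
```

A positive Log₁₀RQ indicates higher expression than in the control sample, and a negative value indicates lower expression.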
The ratio of ORF alteration due to exon deletion selectivity as compared between 095_—01 (SEQ ID NO:166) and 095_—02 (SEQ ID NO:170) changed greatly among the following differentiation stages of the brain and NT2 cells.
The exon deletion pattern shown by 095_—01 (SEQ ID NO:166) was abundantly expressed in undifferentiated NT2 cells (NT2RA(−)) and in NT2RA(+) 1 week, which represents the initial stage after retinoic acid was added to induce differentiation. Although the expression decreased in NT2RA(+) 5 weeks and NT2RA(+) 5 weeks, Inh(+), which represent the late stage of differentiation induction, this pattern was again abundantly expressed in NT2 Neuron (Table 15).
The exon deletion pattern shown by 095_—02 (SEQ ID NO:170) was abundantly expressed in undifferentiated NT2 cells (NT2RA(−)) and in NT2RA(+) 24 hr, which represents the initial stage after retinoic acid was added to induce differentiation. In NT2RA(+) 5 weeks and NT2RA(+) 5 weeks, Inh(+), which represent the late stage of differentiation, and in NT2 Neuron, the expression level decreased (Table 15).
These results demonstrated that by comparing the expression of the selective exon regions 095_—[1]_—1-N1 (SEQ ID NO:153) and 095_—[2]_—1-N1 (SEQ ID NO:160) of newly acquired cDNAs shown by the detection regions 095_—01 (SEQ ID NO:166) and 095_—02 (SEQ ID NO:170), it is possible to use the exon regions as differentiation markers for detecting nerve cell differentiation or regeneration stages, particularly initial stages of differentiation into nerve cells.
Furthermore, it was demonstrated that the selective exon region 095_—[1]_—1-N1 (SEQ ID NO:153) of a newly acquired cDNA shown by the detection region 095_—01 (SEQ ID NO:166), as a brain-specific marker, can be used as one of the differentiation markers for detecting cells in nerve cell differentiation or regeneration stages, particularly those after nerve differentiation or nerve regeneration. It also seems possible to develop a new drug by means of a compound, antibody, siRNA or the like that targets a region that exhibits specificity.
The following regions also seem to be useful as differentiation markers for detecting nerve cell differentiation or regeneration.
Upstream sequence 095_—[1]_—1-N3 (SEQ ID NO:180), which comprises the 304th to 326th bases undergoing priming by Primer095_—01R (SEQ ID NO:165) in D-3NB692002462.1 of the cDNA pattern [1].
Upstream sequence 095_—[2]_—1-N3 (SEQ ID NO:181), which comprises the 444th to 466th bases undergoing priming by Primer095_—02R (SEQ ID NO:169) in D-BRCAN2027778.1 of the cDNA pattern [2].
Region 095_—01 (SEQ ID NO:166) amplified by Primer095_—01F (SEQ ID NO:164) and Primer095_—01R (SEQ ID NO:165) in the cDNA pattern [1].
Region 095_—02 (SEQ ID NO:170) amplified by Primer095_—02F (SEQ ID NO:168) and Primer095_—02R (SEQ ID NO:169) in the cDNA pattern [2].

Example 12

Cluster chr12-1875 (Data Set: 017)

(1) Cluster Analysis

1) Cluster Characteristics

Analysis was performed on 10 sequences of full-length cDNAs genome-mapped to the cluster chr12-1875 (Human genome UCSC hg18 (NCBI Build34) chromosome 12, 7,840,000 bp to 7,960,000 bp) [D-NT2RI3001005.1, D-NT2RI3005261.1, AF481879.1, AL110298.1, AL832448.1, BC060766.1, C-TESTI1000257, C-TESTI4028880, ENST00000340749, NM_—153449.2]. They were classified according to expression pattern difference into 4 kinds, which mainly included the following 2 kinds.

[1] D-NT2RI3001005.1, D-NT2RI3005261.1

[2] AF481879.1, C-TESTI4028880 (AK126026.1), NM_—153449.2

[1] comprises cDNAs which were newly acquired and subjected to full-length cDNA sequence analysis by us, and which had a different ORF region because they are expressed from a chromosome region upstream of the known [2], and also because the translation initiation point is present on a new exon lacking identity to [2].
It was found that the ORF regions present in the 2 kinds of cDNA patterns [1] and [2] cause expression starting at different initiation points, from the same chromosome region, resulting in alterations of the amino acid sequences to produce diverse proteins and mRNAs.
2) Characteristics of D-NT2RI3001005.1 ([1]), which was Newly Acquired and Subjected to Full-Length cDNA Sequence Analysis by Us
017_—[1]_—1-N0 (SEQ ID NO:182): The entire nucleic acid sequence region of D-NT2RI3001005.1
017_—[1]_—1-NA0 (SEQ ID NO:183): Both the entire nucleic acid sequence region and amino acid sequence of D-NT2RI3001005.1
017_—[1]_—1-A0 (SEQ ID NO:184): The entire amino acid sequence region of D-NT2RI3001005.1
The sequence at the 1st to 153rd bases of D-NT2RI3001005.1 (SEQ ID NO:185) is an exon that is not present in NM_—153449.2, which is registered with an existing public DB, and serves for control, lacking homology to NM_—153449.2. Because the translation initiation point is present on this exon, the amino acids on the N-terminal side changed by 44 residues (SEQ ID NO:186).
017_—[1]_—1-N1 (SEQ ID NO:185): A 153-base insert nucleic acid sequence region of D-NT2RI3001005.1
017_—[1]_—1-A1 (SEQ ID NO:186): A 44-residue insert amino acid sequence region of D-NT2RI3001005.1
017_—[1]_—1-N2 (SEQ ID NO:187): An ORF nucleic acid sequence region in the 153-base insert region of D-NT2RI3001005.1
017_—[1]_—1-A2 (identical to SEQ ID NO:186): An ORF amino acid sequence region in the 153-base insert region of D-NT2RI3001005.1
3) Characteristics of D-NT2RI3005261.1 ([1]), which was Newly Acquired and Subjected to Full-Length cDNA Sequence Analysis by Us
017_—[1]_—2-N0 (SEQ ID NO:188): The entire nucleic acid sequence region of D-NT2RI3005261.1
017_—[1]_—2-NA0 (SEQ ID NO:189): Both the entire nucleic acid sequence region and amino acid sequence of D-NT2RI3005261.1
017_—[1]_—2-A0 (SEQ ID NO:190): The entire amino acid sequence region of D-NT2RI3005261.1
The sequence at the 1st to 153rd bases of D-NT2RI3005261.1 (SEQ ID NO:191) is an exon that is not present in NM_—153449.2, which is registered with an existing public DB, and serves for control, lacking homology to NM_—153449.2. Because the translation initiation point is present on this exon, the amino acids on the N-terminal side changed by 44 residues (SEQ ID NO:192).
017_—[1]_—2-N1 (SEQ ID NO:191): A 153-base insert nucleic acid sequence region of D-NT2RI3005261.1
017_—[1]_—2-A1 (SEQ ID NO:192): A 44-residue insert amino acid sequence region of D-NT2RI3005261.1
017_—[1]_—2-N2 (SEQ ID NO:193): An ORF nucleic acid sequence region in the 153-base insert region of D-NT2RI3005261.1
017_—[1]_—2-A2 (identical to SEQ ID NO:192): An ORF amino acid sequence region in the 153-base insert region of D-NT2RI3005261.1
4) Characteristics of C-TESTI4028880 (AK126026.1) ([2]), which was Acquired and Subjected to Full-Length cDNA Sequence Analysis by Us, and is Already Registered with a Public DB
017_—[2]_—1-N0 (SEQ ID NO:194): The entire nucleic acid sequence region of C-TESTI4028880
017_—[2]_—1-NA0 (SEQ ID NO:195): Both the entire nucleic acid sequence region and amino acid sequence of C-TESTI4028880
017_—[2]_—1-A0 (SEQ ID NO:196): The entire amino acid sequence region of C-TESTI4028880

5) Expression Specificity Analysis and Design of Primers for Real-Time PCR

To clearly distinguish between the characteristic regions shown above, and examine the respective expression levels thereof, the following regions were used as detection regions. It seemed possible to compare the expression levels of the individual characteristic regions by comparing the expression levels of the detection regions.
017_—01—A specific region present on the N-terminal side of the cDNA pattern [1]: a translation initiation region of the cDNA pattern [1], which was newly subjected to full-length cDNA sequence analysis by us, being a novel region not registered with an existing public DB
→Fragment 017_—01 (SEQ ID NO:199) amplified by Primer017_—01F (SEQ ID NO:197) and Primer017_—01R (SEQ ID NO:198)
017_—03—A common region shared by all of [1] to [2]: a region common to all patterns, serving for control to compare the overall expression levels of the cDNA pattern [1], which was newly subjected to full-length cDNA sequence analysis by us, and the cDNA pattern [2], which is registered with an existing public DB
→Fragment 017_—03 (SEQ ID NO:202) amplified by Primer017_—03F (SEQ ID NO:200) and Primer017_—03R (SEQ ID NO:201)
By mapping the 5′-terminal sequences of about 1.44 million sequences acquired using the oligocap method onto the human genome sequence, and comparatively analyzing them, the regions specific for the 2 kinds of cDNA patterns [1] to [2] shown above, respectively, were found to be expressed at the following frequencies.
In the cDNA pattern [1], which was newly acquired and analyzed by us, fourteen 5′-terminal sequences were present, the derivations thereof being NT2 cells treated with retinoic acid (RA) to induce differentiation for 5 weeks, and thereafter treated with a growth inhibitor for 2 weeks (NT2RI) for 13 sequences (analytical parameter 32,662), and NT2 cells treated with retinoic acid (RA) to induce differentiation (NT2RP) for 1 sequence (analytical parameter 39,242).
In the cDNA pattern [2], which is registered with an existing public DB, eighty-six 5′-terminal sequences were present, the derivations thereof being Testis for 85 sequences (analytical parameter 90,188), and NT2 cells treated with RA to induce differentiation for 5 weeks, and thereafter treated with a growth inhibitor for 2 weeks (NT2RI) for 1 sequence (analytical parameter 32,662).
From this result, it was found that the transcription initiation point of [1] was expressed specifically in NT2 cells after differentiation. From the transcription initiation point of [2], the expression in Testis was very abundant. Hence, it was thought that the mechanism of transcription in this chromosome region might be different only in the situation of nerve cell differentiation of NT2 cells, with a different transcription initiation point being used.
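Because the libraries differ in size, the raw counts of 5′-terminal sequences can be easier to compare after normalization. The following minimal sketch divides each count reported above by its analytical parameter and scales the result to 10,000 sequences. It assumes that the analytical parameter is the number of 5′-terminal sequences analyzed for that library; the function name, the scaling factor, and the data layout are likewise assumptions introduced for illustration.

```python
def per_10k(count: int, library_size: int) -> float:
    """Return the number of hits per 10,000 analyzed 5'-terminal sequences."""
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    return 10_000 * count / library_size


# (count, analytical parameter) per library, as reported for cluster chr12-1875.
pattern_1 = {"NT2RI": (13, 32_662), "NT2RP": (1, 39_242)}
pattern_2 = {"Testis": (85, 90_188), "NT2RI": (1, 32_662)}

for name, pattern in (("[1]", pattern_1), ("[2]", pattern_2)):
    for library, (count, size) in pattern.items():
        print(f"pattern {name} {library}: {per_10k(count, size):.2f} per 10,000")
```

Under this assumption, the pattern [1] is found in NT2RI at roughly 4 per 10,000 sequences and the pattern [2] in Testis at roughly 9 per 10,000, whereas the remaining entries are below 1 per 10,000.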

(2) Analysis of Expression Specificity by Real-Time PCR

To determine the states in which the transcription initiation point used for expression changes, details of expression levels were analyzed by real-time PCR. The results are shown in Table 16 and Table 17.

	TABLE 16

Sample	RQ Score 017_01	RQ Score 017_03	Log₁₀RQ Score 017_01	Log₁₀RQ Score 017_03
01 NT2RA(−)	81.1	4.7	1.91	0.67
02 NT2RA(+) 24 hr	29.4	1.8	1.47	0.25
03 NT2RA(+) 48 hr	34.8	1.6	1.54	0.21
04 NT2RA(+) 1 week	177.5	2.6	2.25	0.41
05 NT2RA(+) 5 weeks	39.2	0.8	1.59	−0.07
06 NT2RA(+) 5 weeks, Inh(+)	1250.2	7.0	3.10	0.85
07 NT2 Neuron	319.1	0.6	2.50	−0.19
08 Brain, Fetal	1.2	1.5	0.07	0.18
09 Brain, whole	0.6	2.3	−0.25	0.35
10 ALZ Visual Cortex Occipital	0.6	0.6	−0.23	−0.21
11 Mix, viscus tissues	1.0	1.0	0.0	0.0
12 Mix, blood cells and related tissues	1.4	2.1	0.15	0.32
13 Mix, tumor tissues	0.6	0.3	−0.24	−0.60
14 Mix, normal tissues	32.4	1.1	1.51	0.06
15 Brain, whole PolyA(+) RNA	0.1	0.5	−0.88	−0.27
16 Brain, hippocampus	0.7	0.5	−0.15	−0.31

	TABLE 17

Sample	RQ Score 017_01	RQ Score 017_03	Log₁₀RQ Score 017_01	Log₁₀RQ Score 017_03
01 NT2RA(−)	30.9	5.2	1.49	0.72
02 NT2RA(+) 24 hr	11.3	1.7	1.05	0.22
03 NT2RA(+) 48 hr	15.5	1.6	1.19	0.22
04 NT2RA(+) 1 week	77.1	2.9	1.89	0.46
05 NT2RA(+) 5 weeks	17.5	1.0	1.24	−0.02
06 NT2RA(+) 5 weeks, Inh(+)	497.7	7.6	2.70	0.88
07 NT2 Neuron	145.3	0.6	2.16	−0.20
08 Brain, Fetal	1.0	1.8	−0.02	0.24
09 Brain, whole	0.3	2.6	−0.57	0.41
10 ALZ Visual Cortex Occipital	0.3	0.7	−0.46	−0.14
11 Mix, viscus tissues	1.0	1.0	0.0	0.0
12 Mix, blood cells and related tissues	0.9	2.7	−0.02	0.43
13 Mix, tumor tissues	1.7	0.3	0.24	−0.57
14 Mix, normal tissues	19.8	1.2	1.30	0.07
15 Brain, whole PolyA(+) RNA	0.2	0.7	−0.79	−0.16
16 Brain, hippocampus	0.5	0.7	−0.29	−0.16
17 Colon	0.8	0.1	−0.12	−0.92
18 Colon Tumor	Undet.	0.0	Undet.	−1.65
19 Kidney	0.7	0.3	−0.15	−0.50
20 Kidney Tumor	0.0	0.2	−1.60	−0.61
21 Liver	2.2	0.1	0.34	−0.94
22 Liver Tumor	14.8	0.1	1.17	−0.94
23 Lung	0.1	2.0	−0.91	0.30
24 Lung Tumor	0.3	0.6	−0.60	−0.25
25 Ovary	93.4	2.0	1.97	0.29
26 Ovary Tumor	6.7	0.2	0.83	−0.70
27 Stomach	1.1	0.7	0.04	−0.17
28 Stomach Tumor	Undet.	0.1	Undet.	−1.25
29 Uterus	2.5	1.6	0.40	0.21
30 Uterus Tumor	0.6	0.3	−0.21	−0.53
31 Tongue	33.7	0.2	1.53	−0.65
32 Tumor Tongue	15.6	0.1	1.19	−0.91

Expression levels were compared using the 32 samples shown in Example 3, including Brain, hippocampus, Brain, whole, Brain, Fetal, Alzheimer patient cerebral cortex (ALZ Visual Cortex Occipital), NT2 cells at 7 different differentiation stages, 8 kinds of normal tissues, and 8 kinds of tumor tissues and the like. The comparison was made using the mixed sample of normal visceral tissues shown in Example 3 (Mix, viscus tissues) as an experimental control.
The transcription initiation point shown by 017_—01 (SEQ ID NO:199) is used selectively in NT2 cells. Hence, in NT2 cells at all stages, whether undifferentiated or differentiated, the ratio of transcription from the upstream transcription initiation point was considerably high (Table 16 and Table 17).
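The comparison between the specific region 017_01 and the common region 017_03 can be sketched as a simple ratio of the RQ scores in Table 16. The function name and the data layout below are assumptions for illustration. Both RQ scores are expressed relative to the same control sample (Mix, viscus tissues), so the ratio compares fold changes against that control and is not an absolute molar ratio of transcripts.

```python
def specific_to_common_ratio(rq_specific: float, rq_common: float) -> float:
    """Return RQ(specific region) / RQ(common region) for one sample."""
    if rq_common <= 0:
        raise ValueError("common-region RQ must be positive")
    return rq_specific / rq_common


# (RQ 017_01, RQ 017_03) per sample, taken from Table 16.
nt2_samples = {
    "NT2RA(-)": (81.1, 4.7),
    "NT2RA(+) 24 hr": (29.4, 1.8),
    "NT2RA(+) 48 hr": (34.8, 1.6),
    "NT2RA(+) 1 week": (177.5, 2.6),
    "NT2RA(+) 5 weeks": (39.2, 0.8),
    "NT2RA(+) 5 weeks, Inh(+)": (1250.2, 7.0),
    "NT2 Neuron": (319.1, 0.6),
}
brain_samples = {
    "Brain, Fetal": (1.2, 1.5),
    "Brain, whole": (0.6, 2.3),
    "Brain, hippocampus": (0.7, 0.5),
}

nt2_ratios = {s: specific_to_common_ratio(*v) for s, v in nt2_samples.items()}
brain_ratios = {s: specific_to_common_ratio(*v) for s, v in brain_samples.items()}

for sample, ratio in {**nt2_ratios, **brain_ratios}.items():
    print(f"{sample}: {ratio:.1f}")
```

With the values of Table 16, the ratio exceeds 10 in every NT2 sample and stays below 2 in the brain samples shown, which is consistent with the selective use of the upstream transcription initiation point in NT2 cells.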
These results demonstrated that by detecting the expression of the 5′-terminal regions (regions close to the transcription initiation point) 017_—[1]_—1-N1 (SEQ ID NO:185) and 017_—[1]_—2-N1 (SEQ ID NO:191) of a newly acquired cDNA region shown by the detection region 017_—01 (SEQ ID NO:199), it is possible to use the 5′-terminal regions as nerve cell markers. It also seems possible to develop a new drug by means of a compound, antibody, siRNA or the like that targets a region that exhibits specificity.
The following regions also seem to be useful as nerve cell markers.
Upstream sequence 017_—[1]_—1-N3 (SEQ ID NO:203), which comprises the 143rd to 159th bases undergoing priming by Primer017_—01R (SEQ ID NO:198) in D-NT2RI3001005.1 of the cDNA pattern [1].
Upstream sequence 017_—[1]_—2-N3 (SEQ ID NO:204), which comprises the 143rd to 159th bases undergoing priming by Primer017_—01R (SEQ ID NO:198) in D-NT2RI3005261.1 of the cDNA pattern [1].
Region 017_—01 (SEQ ID NO:199) amplified by Primer017_—01F (SEQ ID NO:197) and Primer017_—01R (SEQ ID NO:198) in the cDNA pattern [1].

Example 13

Cluster chr3-1507 (Data Set: 023)

(1) Cluster Analysis

1) Cluster Characteristics

An analysis was performed on 15 sequences of full-length cDNAs subjected to genome mapping onto the cluster chr3-1507 (Human genome UCSC hg18 (NCBI Build34) chromosome 3, 73,500,000 bp to 73,800,000 bp) [D-OCBBF2010718.1, D-OCBBF3004194.1, D-NT2RP8000826.1, D-NT2RP7007268.1, D-BRAWH3008172.1, D-BRAWH3011965.1, AB029018.1, AL049958.1, AL157498.1, BC014432.1, C-HEMBA1005489, ENST00000263666, ENST00000308537, ENST00000319719, NM_—015009.1]. They were classified according to expression pattern difference into 8 kinds, which mainly included the following 5 kinds.

[1] D-OCBBF2010718.1, D-OCBBF3004194.1

[2] D-NT2RP8000826.1, D-NT2RP7007268.1

[3] D-BRAWH3008172.1

[4] D-BRAWH3011965.1

[5] AB029018.1, ENST00000263666, NM_—015009.1

[1], [2], [3], and [4] are cDNAs which were newly acquired and subjected to full-length cDNA sequence analysis by us, and had a different ORF from [5], which had been registered with an existing public DB.
[1], [2], [3], and [4] had a different ORF region because of the expression thereof from a chromosome region located downstream of the known [5], and also because of the presence of a translation initiation point different from [5].
It was found that the ORF regions present in the 5 kinds of cDNA patterns [1] to [5] cause expression starting at different transcription initiation points, from the same chromosome region, resulting in alterations of the amino acid sequences to produce diverse proteins and mRNAs.
2) Characteristics of D-OCBBF2010718.1 ([1]), which was Newly Acquired and Subjected to Full-Length cDNA Sequence Analysis by Us
023_—[1]_—1-N0 (SEQ ID NO:205): The entire nucleic acid sequence region of D-OCBBF2010718.1
023_—[1]_—1-NA0 (SEQ ID NO:206): Both the entire nucleic acid sequence region and amino acid sequence of D-OCBBF2010718.1
023_—[1]_—1-A0 (SEQ ID NO:207): The entire amino acid sequence region of D-OCBBF2010718.1
The sequence at the 1st to 212th bases of D-OCBBF2010718.1 (SEQ ID NO:208) is an exon that is not present in NM_—015009.1, which is registered with an existing public DB, and serves for control, lacking homology to NM_—015009.1. Because the translation initiation point is present on this exon, the amino acids on the N-terminal side changed by 23 residues (SEQ ID NO:209).
023_—[1]_—1-N1 (SEQ ID NO:208): A 212-base insert nucleic acid sequence region of D-OCBBF2010718.1
023_—[1]_—1-A1 (SEQ ID NO:209): A 23-residue insert amino acid sequence region of D-OCBBF2010718.1
023_—[1]_—1-N2 (SEQ ID NO:210): An ORF nucleic acid sequence region in the 212-base insert region of D-OCBBF2010718.1
023_—[1]_—1-A2 (identical to SEQ ID NO:209): An ORF amino acid sequence region in the 212-base insert region of D-OCBBF2010718.1
3) Characteristics of D-OCBBF3004194.1 ([1]), which was Newly Acquired and Subjected to Full-Length cDNA Sequence Analysis by Us
023_—[1]_—2-N0 (SEQ ID NO:211): The entire nucleic acid sequence region of D-OCBBF3004194.1
023_—[1]_—2-NA0 (SEQ ID NO:212): Both the entire nucleic acid sequence region and amino acid sequence of D-OCBBF3004194.1
023_—[1]_—2-A0 (SEQ ID NO:213): The entire amino acid sequence region of D-OCBBF3004194.1
The sequence at the 1st to 197th bases of D-OCBBF3004194.1 (SEQ ID NO:214) is an exon that is not present in NM_—015009.1, which is registered with an existing public DB, and serves for control, lacking homology to NM_—015009.1. Because the translation initiation point is present on this exon, the amino acids on the N-terminal side changed by 23 residues (SEQ ID NO:215).
023_—[1]_—2-N1 (SEQ ID NO:214): A 197-base insert nucleic acid sequence region of D-OCBBF3004194.1
023_—[1]_—2-A1 (SEQ ID NO:215): A 23-residue insert amino acid sequence region of D-OCBBF3004194.1
023_—[1]_—2-N2 (SEQ ID NO:216): An ORF nucleic acid sequence region in the 197-base insert region of D-OCBBF3004194.1
023_—[1]_—2-A2 (identical to SEQ ID NO:215): An ORF amino acid sequence region in the 197-base insert region of D-OCBBF3004194.1
4) Characteristics of D-NT2RP8000826.1 ([2]), which was Newly Acquired and Subjected to Full-Length cDNA Sequence Analysis by Us
023_—[2]_—1-N0 (SEQ ID NO:217): The entire nucleic acid sequence region of D-NT2RP8000826.1
023_—[2]_—1-NA0 (SEQ ID NO:218): Both the entire nucleic acid sequence region and amino acid sequence of D-NT2RP8000826.1
023_—[2]_—1-A0 (SEQ ID NO:219): The entire amino acid sequence region of D-NT2RP8000826.1
The sequence at the 1st to 178th bases of D-NT2RP8000826.1 (SEQ ID NO:220) is an exon that is not present in NM_—015009.1, which is registered with an existing public DB, and serves for control, lacking homology to NM_—015009.1. Because the translation initiation point is present on this exon, the amino acids on the N-terminal side changed by 28 residues (SEQ ID NO:221).
023_—[2]_—1-N1 (SEQ ID NO:220): A 178-base insert nucleic acid sequence region of D-NT2RP8000826.1
023_—[2]_—1-A1 (SEQ ID NO:221): A 28-residue insert amino acid sequence region of D-NT2RP8000826.1
023_—[2]_—1-N2 (SEQ ID NO:222): An ORF nucleic acid sequence region in the 178-base insert region of D-NT2RP8000826.1
023_—[2]_—1-A2 (identical to SEQ ID NO:221): An ORF amino acid sequence region in the 178-base insert region of D-NT2RP8000826.1
5) Characteristics of D-NT2RP7007268.1 ([2]), which was Newly Acquired and Subjected to Full-Length cDNA Sequence Analysis by Us
023_—[2]_—2-N0 (SEQ ID NO:223): The entire nucleic acid sequence region of D-NT2RP7007268.1
023_—[2]_—2-NA0 (SEQ ID NO:224): Both the entire nucleic acid sequence region and amino acid sequence of D-NT2RP7007268.1
023_—[2]_—2-A0 (SEQ ID NO:225): The entire amino acid sequence region of D-NT2RP7007268.1
The sequence at the 1st to 178th bases of D-NT2RP7007268.1 (SEQ ID NO:226) is an exon that is not present in NM_—015009.1, which is registered with an existing public DB, and serves for control, lacking homology to NM_—015009.1. Because the translation initiation point is present on this exon, the amino acids on the N-terminal side changed by 28 residues (SEQ ID NO:227).
023_—[2]_—2-N1 (SEQ ID NO:226): A 178-base insert nucleic acid sequence region of D-NT2RP7007268.1
023_—[2]_—2-A1 (SEQ ID NO:227): A 28-residue insert amino acid sequence region of D-NT2RP7007268.1
023_—[2]_—2-N2 (SEQ ID NO:228): An ORF nucleic acid sequence region in the 178-base insert region of D-NT2RP7007268.1
023_—[2]_—2-A2 (identical to SEQ ID NO:227): An ORF amino acid sequence region in the 178-base insert region of D-NT2RP7007268.1
6) Characteristics of D-BRAWH3008172.1 ([3]), which was Newly Acquired and Subjected to Full-Length cDNA Sequence Analysis by Us
023_—[3]_—1-N0 (SEQ ID NO:229): The entire nucleic acid sequence region of D-BRAWH3008172.1
023_—[3]_—1-NA0 (SEQ ID NO:230): Both the entire nucleic acid sequence region and amino acid sequence of D-BRAWH3008172.1
023_—[3]_—1-A0 (SEQ ID NO:231): The entire amino acid sequence region of D-BRAWH3008172.1
The sequence at the 1st to 169th bases of D-BRAWH3008172.1 (SEQ ID NO:232) is an exon that is not present in NM_—015009.1, which is registered with an existing public DB, and serves for control, lacking homology to NM_—015009.1. With this change, the translation initiation point of D-BRAWH3008172.1 shifts toward the 3′ side relative to NM_—015009.1, and the 281st base of D-BRAWH3008172.1 becomes the translation initiation point. For this reason, the amino acid sequence shortened by 343 residues compared with NM_—015009.1.
023_—[3]_—1-N1 (SEQ ID NO:232): A 169-base insert nucleic acid sequence region of D-BRAWH3008172.1
023_—[3]_—1-N2 (SEQ ID NO:233): A 280-base 5′UTR region of an ORF whose translation initiation point is the 281st base of D-BRAWH3008172.1
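The base positions used in these descriptions are 1-based and inclusive, so the length of a region is the last position minus the first position plus one, and the 5′UTR ends one base before the translation initiation point. The following minimal sketch checks this arithmetic against figures stated in the text; the function names are assumptions introduced for illustration.

```python
def span_length(first: int, last: int) -> int:
    """Length of a region given 1-based, inclusive first and last positions."""
    if first < 1 or last < first:
        raise ValueError("expected 1 <= first <= last")
    return last - first + 1


def utr5_length(translation_start: int) -> int:
    """Length of the 5'UTR when the ORF starts at the given 1-based position."""
    if translation_start < 1:
        raise ValueError("translation_start must be >= 1")
    return translation_start - 1


print(span_length(1, 169))      # insert region of D-BRAWH3008172.1
print(utr5_length(281))         # 5'UTR preceding the 281st base
print(span_length(448, 603))    # exon at the 448th to 603rd bases of the control
```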
7) Characteristics of D-BRAWH3011965.1 ([4]), which was Newly Acquired and Subjected to Full-Length cDNA Sequence Analysis by Us
023_—[4]_—1-N0 (SEQ ID NO:234): The entire nucleic acid sequence region of D-BRAWH3011965.1
023_—[4]_—1-NA0 (SEQ ID NO:235): Both the entire nucleic acid sequence region and amino acid sequence of D-BRAWH3011965.1
023_—[4]_—1-A0 (SEQ ID NO:236): The entire amino acid sequence region of D-BRAWH3011965.1
The sequence at the 1st to 311th bases of D-BRAWH3011965.1 (SEQ ID NO:237) is an exon that is not present in NM_—015009.1, which is registered in an existing public DB and serves as a control, lacking homology to NM_—015009.1. Because the translation initiation point is present on this exon, the amino acids on the N-terminal side changed by 4 residues (SEQ ID NO:238).
023_—[4]_—1-N1 (SEQ ID NO:237): A 311-base insert nucleic acid sequence region of D-BRAWH3011965.1
023_—[4]_—1-A1 (SEQ ID NO:238): A 4-residue insert amino acid sequence region of D-BRAWH3011965.1
023_—[4]_—1-N2 (SEQ ID NO:239): An ORF nucleic acid sequence region in the 311-base insert region of D-BRAWH3011965.1
023_—[4]_—1-A2 (identical to SEQ ID NO:238): An ORF amino acid sequence region in the 311-base insert region of D-BRAWH3011965.1

8) Expression Specificity Analysis and Design of Primers for Real-Time PCR

To clearly distinguish between the characteristic regions shown above, and examine the respective expression levels thereof, the following regions were used as detection regions. It seemed possible to compare the expression levels of the individual characteristic regions by comparing the expression levels of the detection regions.
023_—01—A specific region present on the N-terminal side of the cDNA pattern [1]: a translation initiation region of the cDNA pattern [1], which was newly subjected to full-length cDNA sequence analysis by us, being a novel region not registered with an existing public DB
→Fragment 023_—01 (SEQ ID NO:242) amplified by Primer023_—01F (SEQ ID NO:240) and Primer023_—01R (SEQ ID NO:241)
023_—02—A specific region present on the N-terminal side of the cDNA pattern [2]: a translation initiation region of the cDNA pattern [2], which was newly subjected to full-length cDNA sequence analysis by us, being a novel region not registered with an existing public DB
→Fragment 023_—02 (SEQ ID NO:245) amplified by Primer023_—02F (SEQ ID NO:243) and Primer023_—02R (SEQ ID NO:244)
023_—03—A specific region present on the N-terminal side of the cDNA pattern [3]: a translation initiation region of the cDNA pattern [3], which was newly subjected to full-length cDNA sequence analysis by us, being a novel region not registered with an existing public DB
→Fragment 023_—03 (SEQ ID NO:248) amplified by Primer023_—03F (SEQ ID NO:246) and Primer023_—03R (SEQ ID NO:247)
023_—04—A specific region present on the N-terminal side of the cDNA pattern [4]: a translation initiation region of the cDNA pattern [4], which was newly subjected to full-length cDNA sequence analysis by us, being a novel region not registered with an existing public DB
→Fragment 023_—04 (SEQ ID NO:251) amplified by Primer023_—04F (SEQ ID NO:249) and Primer023_—04R (SEQ ID NO:250)
023_—05—A specific region of the cDNA pattern [5], which is registered with an existing public DB, that can be distinguished from all of [1], [2], [3], and [4], serving as a control for comparing [1], [2], [3], and [4]
→Fragment 023_—05 (SEQ ID NO:254) amplified by Primer023_—05F (SEQ ID NO:252) and Primer023_—05R (SEQ ID NO:253)
023_—06—A common region shared by all of [1] to [5]: a region common to all patterns, serving for control to compare the overall expression levels of the cDNA patterns [1], [2], [3], and [4], which were newly subjected to full-length cDNA sequence analysis by us, and the cDNA pattern [5] registered with an existing public DB
→Fragment 023_—06 (SEQ ID NO:257) amplified by Primer023_—06F (SEQ ID NO:255) and Primer023_—06R (SEQ ID NO:256)
By mapping the 5′-terminal sequences of about 1.44 million sequences acquired using the oligocap method onto the human genome sequence, and comparatively analyzing them, the regions specific for the 5 kinds of cDNA patterns [1] to [5] shown above, respectively, were found to be expressed at the following frequencies.
In the cDNA pattern [1], which was newly acquired and analyzed by us, thirty-two 5′-terminal sequences were present, the derivations thereof being NT2 cells treated with retinoic acid (RA) to induce differentiation (NT2RP) for 21 sequences (analytical parameter 39,242), Brain, Fetal for 8 sequences (analytical parameter 103,138), NT2 cells treated with retinoic acid (RA) to induce differentiation for 5 weeks, and thereafter treated with a growth inhibitor for 2 weeks (NT2RI) for 1 sequence (analytical parameter 32,662), Brain, hippocampus for 1 sequence (analytical parameter 57,918), and Brain, amygdala for 1 sequence (analytical parameter 58,640).
In the cDNA pattern [2], which was newly acquired and analyzed by us, twenty 5′-terminal sequences were present, the derivation thereof being NT2 cells treated with retinoic acid (RA) to induce differentiation (NT2RP) for the 20 sequences (analytical parameter 39,242).
In the cDNA pattern [3], which was newly acquired and analyzed by us, sixteen 5′-terminal sequences were present, the derivations thereof being Brain, whole for 8 sequences (analytical parameter 59,069), Brain, amygdala for 5 sequences (analytical parameter 58,640), Kidney, Tumor for 1 sequence (analytical parameter 15,970), Brain, thalamus for 1 sequence (analytical parameter 53,267), and Testis for 1 sequence (analytical parameter 90,188).
In the cDNA pattern [4], which was newly acquired and analyzed by us, five 5′-terminal sequences were present, the derivations thereof being Brain, whole for 3 sequences (analytical parameter 59,069), Brain, hippocampus for 1 sequence (analytical parameter 57,918), and Brain, thalamus for 1 sequence (analytical parameter 53,267).
In the cDNA pattern [5], which is registered with an existing public DB, two 5′-terminal sequences were present, the derivations thereof being Stomach, Tumor for 1 sequence (analytical parameter 2,757), and Prostate for 1 sequence (analytical parameter 16,671).
From this result, it was found that the transcription initiation point of [1] was abundantly expressed in differentiated NT2 cells and the fetal brain. It was found that the transcription initiation point of [2] was abundantly expressed in differentiated NT2 cells. It was found that the transcription initiation points of [3] and [4] were abundantly expressed in the brain. The known sequence [5] was expressed in gastric cancer and the prostate. Hence, it was thought that the mechanism of transcription in this chromosome region might differ among various organs and cell conditions, with different transcription initiation points being used.
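The raw counts above come from libraries of very different sizes, so a per-library normalization can make the comparison easier to read. The following is a minimal sketch for the cDNA pattern [1]; it assumes that the "analytical parameter" is the total number of 5′-terminal sequences analyzed for that library, which the text does not state explicitly, and the function names are illustrative only.

```python
# Counts of 5'-terminal sequences for cDNA pattern [1], paired with the
# "analytical parameter" quoted in the text for each library.
# Assumption: the analytical parameter is the library's total sequence count.
PATTERN_1 = {
    "NT2RP": (21, 39_242),
    "Brain, Fetal": (8, 103_138),
    "NT2RI": (1, 32_662),
    "Brain, hippocampus": (1, 57_918),
    "Brain, amygdala": (1, 58_640),
}


def per_million(count, library_size):
    """Normalize a raw hit count to hits per million library sequences."""
    return count / library_size * 1_000_000


def normalized_frequencies(pattern):
    """Return {library: hits per million}, ordered from highest to lowest."""
    freqs = {lib: per_million(c, n) for lib, (c, n) in pattern.items()}
    return dict(sorted(freqs.items(), key=lambda kv: kv[1], reverse=True))


if __name__ == "__main__":
    for lib, freq in normalized_frequencies(PATTERN_1).items():
        print(f"{lib}: {freq:.1f} per million")
```

Under this assumption, NT2RP remains the library with the highest normalized frequency for pattern [1], in line with the conclusion drawn from the raw counts.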

(2) Analysis of Expression Specificity by Real-Time PCR

To determine the conditions under which the transcription initiation point used for expression changes, expression levels were analyzed in detail by real-time PCR. The results are shown in Tables 18-1 and 18-2 and Tables 19-1 and 19-2.

	TABLES 18-1, 18-2

	RQ Score						Log₁₀RQ Score
Sample	023_01	023_02	023_03	023_04	023_05	023_06	023_01	023_02	023_03	023_04	023_05	023_06
01 NT2RA(−)	0.0	0.0	Undet.	0.0	0.0	0.0	−2.32	−2.82	Undet.	−2.07	−1.63	−1.54
02 NT2RA(+) 24 hr	0.1	0.1	0.0	0.2	0.0	0.0	−1.11	−1.22	−3.03	−0.71	−1.62	−1.40
03 NT2RA(+) 48 hr	1.3	0.4	0.0	0.8	0.1	0.2	0.12	−0.38	−2.12	−0.11	−1.13	−0.81
04 NT2RA(+) 1 week	19.1	1.8	0.1	5.6	0.3	1.8	1.28	0.25	−1.24	0.75	−0.55	0.25
05 NT2RA(+) 5 weeks	39.7	1.2	0.0	0.2	0.4	1.9	1.60	0.08	−1.58	−0.63	−0.43	0.27
06 NT2RA(+) 5 weeks, Inh(+)	2.0	0.0	0.0	0.1	0.1	0.3	0.30	−1.64	−1.65	−1.29	−0.84	−0.55
07 NT2 Neuron	2.9	0.7	0.0	1.4	0.0	0.2	0.46	−0.17	−1.95	0.16	−1.77	−0.66
08 Brain, Fetal	53.3	3.5	34.6	11.3	1.7	3.5	1.73	0.54	1.54	1.05	0.23	0.54
09 Brain, whole	0.8	1.0	58.9	2.7	0.4	1.0	−0.12	−0.01	1.77	0.42	−0.46	0.01
10 ALZ Visual Cortex Occipital	0.5	0.6	27.6	1.2	0.2	0.6	−0.26	−0.24	1.44	0.07	−0.71	−0.20
11 Mix, viscus tissues	1.0	1.0	1.0	1.0	1.0	1.0	0.0	0.0	0.0	0.0	0.0	0.0
12 Mix, blood cells and related tissues	1.0	0.3	0.3	0.6	0.3	0.4	−0.02	−0.54	−0.51	−0.20	−0.46	−0.40
13 Mix, tumor tissues	1.0	0.1	0.0	0.0	0.1	0.3	−0.01	−1.22	−1.36	−1.41	−0.87	−0.50
14 Mix, normal tissues	2.1	2.9	3.0	4.2	2.3	2.4	0.33	0.47	0.48	0.63	0.36	0.37
15 Brain, whole PolyA(+) RNA	0.2	0.1	19.0	0.8	0.1	0.4	−0.71	−0.99	1.28	−0.12	−1.01	−0.38
16 Brain, hippocampus	0.3	0.0	9.7	0.4	0.1	0.3	−0.55	−1.39	0.98	−0.40	−0.96	−0.48

	TABLES 19-1, 19-2

	RQ Score						Log₁₀RQ Score
Sample	023_01	023_02	023_03	023_04	023_05	023_06	023_01	023_02	023_03	023_04	023_05	023_06
01 NT2RA(−)	0.0	0.0	Undet.	0.0	0.0	0.0	−2.22	−2.59	Undet.	−1.70	−1.59	−1.53
02 NT2RA(+) 24 hr	0.1	0.1	0.0	0.5	0.0	0.0	−1.02	−0.91	−2.71	−0.31	−1.54	−1.34
03 NT2RA(+) 48 hr	1.7	0.6	0.0	1.2	0.1	0.2	0.23	−0.20	−2.18	0.08	−0.96	−0.65
04 NT2RA(+) 1 week	25.8	3.2	0.1	10.6	0.4	2.6	1.41	0.51	−0.96	1.03	−0.39	0.42
05 NT2RA(+) 5 weeks	48.9	1.8	0.0	0.2	0.5	3.0	1.69	0.26	−1.37	−0.62	−0.29	0.48
06 NT2RA(+) 5 weeks, Inh(+)	2.8	0.0	0.0	0.1	0.2	0.5	0.45	−1.36	−1.66	−1.02	−0.66	−0.32
07 NT2 Neuron	3.1	0.9	0.0	2.3	0.0	0.3	0.49	−0.02	−1.45	0.35	−1.85	−0.58
08 Brain, Fetal	67.9	6.0	54.2	25.8	1.9	4.3	1.83	0.78	1.73	1.41	0.29	0.63
09 Brain, whole	0.8	1.5	90.6	5.4	0.4	1.0	−0.10	0.17	1.96	0.74	−0.44	0.02
10 ALZ Visual Cortex Occipital	0.6	0.9	40.4	2.5	0.2	0.6	−0.21	−0.07	1.61	0.40	−0.65	−0.26
11 Mix, viscus tissues	1.0	1.0	1.0	1.0	1.0	1.0	0.0	0.0	0.0	0.0	0.0	0.0
12 Mix, blood cells and related tissues	1.2	0.4	0.7	1.2	0.3	0.4	0.06	−0.35	−0.16	0.07	−0.50	−0.42
13 Mix, tumor tissues	1.0	0.1	0.0	0.0	0.1	0.3	−0.02	−1.29	−1.53	−1.41	−0.92	−0.58
14 Mix, normal tissues	2.2	4.0	4.3	6.6	2.2	2.3	0.34	0.60	0.63	0.82	0.35	0.37
15 Brain, whole PolyA(+) RNA	0.3	0.2	31.4	1.3	0.1	0.5	−0.54	−0.75	1.50	0.12	−0.88	−0.28
16 Brain, hippocampus	0.4	0.1	16.0	0.7	0.1	0.5	−0.40	−1.13	1.20	−0.15	−0.86	−0.30
17 Brain, cerebellum	0.1	0.0	0.1	0.0	0.1	0.2	−0.86	−1.81	−0.86	−1.80	−1.09	−0.79
18 Brain, amygdala	1.4	0.1	11.9	0.5	0.1	0.5	0.14	−1.11	1.08	−0.33	−1.02	−0.31
19 Brain, caudate nucleus	0.1	0.0	1.3	0.1	0.1	0.2	−1.15	−1.56	0.12	−0.84	−1.13	−0.70
20 Brain, corpus callosum	0.1	0.0	1.1	0.2	0.1	0.2	−1.14	−1.61	0.05	−0.64	−1.05	−0.64
21 Brain, substantia nigra	0.1	0.1	2.4	0.3	0.1	0.2	−0.83	−1.13	0.39	−0.56	−1.04	−0.73
22 Brain, thalamus	0.2	0.0	4.4	0.2	0.0	0.2	−0.78	−1.57	0.65	−0.77	−1.41	−0.82
23 Brain, subthalamic nucleus	0.0	0.0	0.3	0.0	0.1	0.2	−1.70	−2.49	−0.55	−1.82	−1.24	−0.77

Expression levels were compared using the 23 samples shown in Example 3, including Brain, hippocampus, Brain, whole, Brain, Fetal, Alzheimer patient cerebral cortex (ALZ Visual Cortex Occipital), NT2 cells at 7 different differentiation stages, and 7 kinds of brain tissues. The comparison was made using the mixed sample of normal visceral tissues shown in Example 3 (Mix, viscus tissues) as an experimental control.
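The relationship between the two halves of each table can be checked with a short script. The sketch below takes a few RQ scores for 023_01 from Table 18-1 and confirms that their base-10 logarithms agree with the Log₁₀RQ scores in Table 18-2. The tolerance is an assumption that allows for the printed RQ values being rounded to one decimal place; for small RQ values (for example 0.1) the rounding is too coarse for such a check.

```python
import math

# (RQ score, printed Log10RQ score) for detection region 023_01,
# copied from Tables 18-1 and 18-2.
RQ_023_01 = {
    "NT2RA(+) 1 week": (19.1, 1.28),
    "NT2RA(+) 5 weeks": (39.7, 1.60),
    "Brain, Fetal": (53.3, 1.73),
    "Mix, viscus tissues": (1.0, 0.0),  # experimental control, RQ = 1
}


def log10_rq(rq):
    """Return log10 of a positive RQ score, or None when it cannot be taken."""
    if rq is None or rq <= 0:
        return None
    return math.log10(rq)


def mismatches(rows, tolerance=0.01):
    """List samples whose recomputed log10(RQ) differs from the printed value."""
    bad = []
    for sample, (rq, printed) in rows.items():
        value = log10_rq(rq)
        if value is None or abs(value - printed) > tolerance:
            bad.append(sample)
    return bad


if __name__ == "__main__":
    for sample, (rq, printed) in RQ_023_01.items():
        print(f"{sample}: log10({rq}) = {log10_rq(rq):.2f} (table: {printed})")
```

Because the control sample has an RQ of 1.0 for every detection region, its Log₁₀RQ score is 0, and positive or negative scores indicate expression above or below the control.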
The transcription initiation points shown by 023_01 (SEQ ID NO:242), 023_02 (SEQ ID NO:245), and 023_04 (SEQ ID NO:251) were abundantly expressed in NT2 cells after differentiation, particularly in NT2RA (+) 1 week, which represents an advanced stage of differentiation, whereas 023_01 (SEQ ID NO:242) was most abundantly expressed in NT2RA (+) 5 weeks (Tables 18-1 and 18-2 and Tables 19-1 and 19-2). In the brain tissues, the expression from the transcription initiation points shown by 023_01 (SEQ ID NO:242), 023_03 (SEQ ID NO:248), and 023_04 (SEQ ID NO:251) was abundant, with the expression in Brain, Fetal being particularly abundant (Tables 18-1 and 18-2 and Tables 19-1 and 19-2).
These results demonstrated that by comparing the expression of transcription initiation point regions 023_[1]_1-N1 (SEQ ID NO:208), 023_[1]_2-N1 (SEQ ID NO:214), 023_[2]_1-N1 (SEQ ID NO:220), 023_[2]_2-N1 (SEQ ID NO:226), 023_[3]_1-N1 (SEQ ID NO:232), and 023_[4]_1-N1 (SEQ ID NO:237) of newly acquired cDNAs shown by the detection regions 023_01 (SEQ ID NO:242), 023_02 (SEQ ID NO:245), 023_03 (SEQ ID NO:248), and 023_04 (SEQ ID NO:251), it is possible to use these regions as differentiation markers for detecting nerve cell differentiation or regeneration stages, or as brain-specific markers. It also seems possible to develop a new drug by means of a compound, antibody, siRNA or the like that targets a region that exhibits specificity.
The following regions also seem to be useful as differentiation markers for detecting stages of nerve cell differentiation or regeneration and brain-specific markers.
Upstream sequence 023_[1]_1-N3 (SEQ ID NO:258), which comprises the 191st to 219th bases undergoing priming by Primer023_01R (SEQ ID NO:241) in D-OCBBF2010718.1 of the cDNA pattern [1].
Upstream sequence 023_[1]_2-N3 (SEQ ID NO:259), which comprises the 181st to 204th bases undergoing priming by Primer023_01R (SEQ ID NO:241) in D-OCBBF3004194.1 of the cDNA pattern [1].
Upstream sequence 023_[2]_1-N3 (SEQ ID NO:260), which comprises the 158th to 179th bases undergoing priming by Primer023_02R (SEQ ID NO:244) in D-NT2RP8000826.1 of the cDNA pattern [2].
Upstream sequence 023_[2]_2-N3 (SEQ ID NO:261), which comprises the 161st to 180th bases undergoing priming by Primer023_02R (SEQ ID NO:244) in D-NT2RP7007268.1 of the cDNA pattern [2].
Upstream sequence 023_[3]_1-N3 (SEQ ID NO:262), which comprises the 293rd to 316th bases undergoing priming by Primer023_03R (SEQ ID NO:247) in D-BRAWH3008172.1 of the cDNA pattern [3].
Upstream sequence 023_[4]_1-N3 (SEQ ID NO:263), which comprises the 65th to 84th bases undergoing priming by Primer023_04R (SEQ ID NO:250) in D-BRAWH3011965.1 of the cDNA pattern [4].
Region 023_01 (SEQ ID NO:242) amplified by Primer023_01F (SEQ ID NO:240) and Primer023_01R (SEQ ID NO:241) in the cDNA pattern [1].
Region 023_02 (SEQ ID NO:245) amplified by Primer023_02F (SEQ ID NO:243) and Primer023_02R (SEQ ID NO:244) in the cDNA pattern [2].
Region 023_03 (SEQ ID NO:248) amplified by Primer023_03F (SEQ ID NO:246) and Primer023_03R (SEQ ID NO:247) in the cDNA pattern [3].
Region 023_04 (SEQ ID NO:251) amplified by Primer023_04F (SEQ ID NO:249) and Primer023_04R (SEQ ID NO:250) in the cDNA pattern [4].

Example 14

ORF Information on Full-Length cDNA Sequences, Results of Homology Analysis, and Results of Analysis of Motifs and the Like

To determine the functions of 19 sequences of full-length cDNAs that were newly acquired and subjected to full-length cDNA sequence analysis by us, ORF prediction and annotation analysis were performed. Results of the annotation analysis can be updated when the database or analytical software for comparison is upgraded. Thereby, it is sometimes possible to newly add an annotation to sequences with no annotation given under the same conditions.
1) Prediction of ORFs of cDNAs undergoing Full-Length cDNA Sequence Analysis
Using ORF prediction/evaluation systems such as ATGpr (A. Salamov et al. (1998) Bioinformatics 14: 384-390) and TRins (K. Kimura et al. (2003) Genome Informatics 14: 456-457), ORFs were predicted from full-length cDNA sequences. The ORF region information predicted from the full-length cDNA sequences is shown below.
The ORF regions were denoted in compliance with the rules of “DDBJ/EMBL/GenBank Feature Table Definition” (http://www.ncbi.nlm.nih.gov/collab/FT/index.html). The ORF start position is the first character of the methionine-encoding base “ATG”, and the stop position represents the third character of the stop codon. These are indicated by a partition “..”. However, for the ORFs that do not have a stop codon, the stop position is indicated with the use of “>” in compliance with the denotation rules.


	Name of cDNA sequence	ORF region

	D-UTERU2026184.1	191..2119
	D-BRACE3000012.1	465..2558
	D-NT2RP8004156.1	131..1387
	D-NT2RI3005525.1	45..1292
	D-NT2RP8004592.1	620..1183
	D-NT2RI2014164.1	162..1397
	D-BRAMY2029564.1	143..1657
	D-BRHIP2003515.1	84..707
	D-BRACE2044661.1	297..878
	D-3NB692002462.1	343..951
	D-BRCAN2027778.1	52..1086
	D-NT2RI3001005.1	22..1629
	D-NT2RI3005261.1	22..1629
	D-OCBBF2010718.1	144..2495
	D-OCBBF3004194.1	129..2480
	D-NT2RP8000826.1	95..2461
	D-NT2RP7007268.1	95..2461
	D-BRAWH3008172.1	281..2452
	D-BRAWH3011965.1	300..>1574
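The ORF region notation described above can be read mechanically. The following is a minimal sketch of a parser; the names `OrfRegion`, `parse_orf`, and `orf_length` are illustrative and are not taken from ATGpr, TRins, or any other tool named in this example. It accepts both the compact form ("191..2119") and the spaced form ("191 . . . 2119"), and treats a ">" before the stop position as an ORF without a stop codon.

```python
import re
from typing import NamedTuple


class OrfRegion(NamedTuple):
    start: int            # first base of the ATG start codon (1-based)
    stop: int             # last base of the region (third base of the stop codon, if any)
    has_stop_codon: bool  # False when the stop position is written with ">"


# Two or three dots, optionally spaced, then an optional ">" marker.
_ORF_RE = re.compile(r"(\d+)\s*(?:\.\s*){2,3}(>?)\s*(\d+)")


def parse_orf(text):
    """Parse an ORF region such as '191..2119' or '300..>1574'."""
    match = _ORF_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized ORF region: {text!r}")
    return OrfRegion(int(match[1]), int(match[3]), match[2] == "")


def orf_length(region):
    """Length of the region in bases, counting both end positions."""
    return region.stop - region.start + 1


if __name__ == "__main__":
    for raw in ("191..2119", "300..>1574"):
        region = parse_orf(raw)
        print(raw, region, orf_length(region))
```

For a complete ORF the length in bases is a multiple of three and includes the stop codon, so the number of encoded amino acids is one less than the number of codons.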

2) Results of Homology Analysis Using BLASTP (SwissProt)

Homology analysis was performed on the 19 ORF sequences shown in Example 14-1), using BLASTP (blastall 2.2.6; ftp://ftp.ncbi.nih.gov/blast/), for SwissProt of the Aug. 22, 2006 version (ftp://us.expasy.org/databases/swiss-prot/). Based on the results of the homology analysis, the sequences showing the highest homology with an E-value of 1E-10 or less are shown below. In the following cases, however, the applicable candidate is not selected, but the next candidate is shown.
Having a definition beginning with “ALU SUBFAMILY”
Having a definition beginning with “Alu subfamily”
Having a definition beginning with “!!!! ALU SUBFAMILY”
Having a definition beginning with “B-CELL GROWTH FACTOR PRECURSOR”
Having a definition including “NRK2”
Having a definition beginning with “PROLINE-RICH”
Having a definition beginning with “GLYCINE-RICH”
Having a definition beginning with “EXTENSIN PRECURSOR”
Having a definition beginning with “COLLAGEN”
Having a definition beginning with “100 KD”
Having a definition beginning with “RETROVIRUS-RELATED POL POLYPROTEIN”
Having a definition beginning with “CUTICLE COLLAGEN”
Having a definition beginning with “HYPOTHETICAL”
Having a definition beginning with “Hypothetical”
Having a definition beginning with “SALIVARY PROLINE-RICH PROTEIN”
Having a definition beginning with “IMMEDIATE-EARLY PROTEIN”
Having the accession No “P49646”
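The selection rule above (take the best hit with an E-value of 1E-10 or less, skipping excluded definitions) can be sketched as follows. This is an illustration only, not the procedure actually used: the hit list is assumed to be already ordered from best to worst, the tuple layout is hypothetical, and the accession "X00001" in the usage example is a made-up placeholder.

```python
E_VALUE_CUTOFF = 1e-10

# Exclusion rules for the SwissProt search, copied from the list above.
EXCLUDED_PREFIXES = (
    "ALU SUBFAMILY", "Alu subfamily", "!!!! ALU SUBFAMILY",
    "B-CELL GROWTH FACTOR PRECURSOR", "PROLINE-RICH", "GLYCINE-RICH",
    "EXTENSIN PRECURSOR", "COLLAGEN", "100 KD",
    "RETROVIRUS-RELATED POL POLYPROTEIN", "CUTICLE COLLAGEN",
    "HYPOTHETICAL", "Hypothetical", "SALIVARY PROLINE-RICH PROTEIN",
    "IMMEDIATE-EARLY PROTEIN",
)
EXCLUDED_SUBSTRINGS = ("NRK2",)
EXCLUDED_ACCESSIONS = ("P49646",)


def is_excluded(accession, definition):
    """True when a hit matches any of the exclusion rules."""
    return (
        accession in EXCLUDED_ACCESSIONS
        or definition.startswith(EXCLUDED_PREFIXES)
        or any(s in definition for s in EXCLUDED_SUBSTRINGS)
    )


def select_hit(hits):
    """Return the first (accession, definition, e_value) hit that passes.

    `hits` is assumed to be sorted best-first, as in BLAST output.
    """
    for accession, definition, e_value in hits:
        if e_value <= E_VALUE_CUTOFF and not is_excluded(accession, definition):
            return accession, definition, e_value
    return None


if __name__ == "__main__":
    hits = [
        ("X00001", "HYPOTHETICAL protein", 0.0),   # placeholder hit, skipped
        ("Q8TF45", "Zinc finger protein 418", 0.0),
    ]
    print(select_hit(hits))
```

The RefSeq search in section 3) follows the same pattern with its own list of excluded definitions.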
Individual data are shown with the name of cDNA sequence, ORF region, hit data accession number, hit data definition, hit data keyword, E-value, consensus length (amino acid length), and identity, separated by “//” in this order.
D-UTERU2026184.1// 191..2119// Q8TF45// Zinc finger protein 418// DNA-binding; Metal-binding; Nuclear protein; Repeat; Repressor; Transcription; Transcription regulation; Zinc; Zinc-finger.// 0// 601// 100
D-BRACE3000012.1// 465..2558// Q8TF45// Zinc finger protein 418// DNA-binding; Metal-binding; Nuclear protein; Repeat; Repressor; Transcription; Transcription regulation; Zinc; Zinc-finger.// 0// 674// 99
D-NT2RP8004156.1// 131..1387// P31749// RAC-alpha serine/threonine-protein kinase (EC 2.7.11.1) (RAC-PK-alpha) (Protein kinase B) (PKB) (C-AKT)// 3D-structure; Apoptosis; ATP-binding; Carbohydrate metabolism; Glucose metabolism; Glycogen biosynthesis; Glycogen metabolism; Kinase; Nuclear protein; Nucleotide-binding; Phosphorylation; Serine/threonine-protein kinase; Sugar transport; Transferase; Translation regulation; Transport.// 0// 418// 100
D-NT2RI3005525.1// 45..1292// Q7Z698// Sprouty-related, EVH1-domain-containing protein 2 (Spred-2)// Membrane; Phosphorylation.// 0// 409// 99
D-NT2RI2014164.1// 162..1397// P27338// Amine oxidase [flavin-containing] B (EC 1.4.3.4) (Monoamine oxidase type B) (MAO-B)// 3D-structure; Acetylation; Direct protein sequencing; FAD; Flavoprotein; Membrane; Mitochondrion; Oxidoreductase; Transmembrane.// 0// 367// 93
D-BRAMY2029564.1// 143..1657// P27338// Amine oxidase [flavin-containing] B (EC 1.4.3.4) (Monoamine oxidase type B) (MAO-B)// 3D-structure; Acetylation; Direct protein sequencing; FAD; Flavoprotein; Membrane; Mitochondrion; Oxidoreductase; Transmembrane.// 0// 504// 100
D-BRHIP2003515.1// 84..707// P55327// Tumor protein D52 (N8 protein)// Coiled coil.// 7E-93// 184// 88
D-BRACE2044661.1// 297..878// P54709// Sodium/potassium-transporting ATPase subunit beta-3 (Sodium/potassium-dependent ATPase beta-3 subunit) (ATPB-3) (CD298 antigen)// Glycoprotein; Ion transport; Membrane; Potassium; Potassium transport; Signal-anchor; Sodium; Sodium transport; Sodium/potassium transport; Transmembrane; Transport.// 1E-90// 158// 97
D-3NB692002462.1// 343..951// Q03426// Mevalonate kinase (EC 2.7.1.36) (MK)// ATP-binding; Cataract; Cholesterol biosynthesis; Disease mutation; Kinase; Lipid synthesis; Nucleotide-binding; Peroxisome; Polymorphism; Steroid biosynthesis; Sterol biosynthesis; Transferase.// 1E-112// 202// 100
D-BRCAN2027778.1// 52..1086// Q03426// Mevalonate kinase (EC 2.7.1.36) (MK)// ATP-binding; Cataract; Cholesterol biosynthesis; Disease mutation; Kinase; Lipid synthesis; Nucleotide-binding; Peroxisome; Polymorphism; Steroid biosynthesis; Sterol biosynthesis; Transferase.// 0// 343// 86
D-NT2RI3001005.1// 22..1629// Q8TDB8// Solute carrier family 2, facilitated glucose transporter member 14 (Glucose transporter type 14)// Alternative splicing; Developmental protein; Differentiation; Glycoprotein; Membrane; Spermatogenesis; Sugar transport; Transmembrane; Transport.// 0// 490// 99
D-NT2RI3005261.1// 22..1629// Q8TDB8// Solute carrier family 2, facilitated glucose transporter member 14 (Glucose transporter type 14)// Alternative splicing; Developmental protein; Differentiation; Glycoprotein; Membrane; Spermatogenesis; Sugar transport; Transmembrane; Transport.// 0// 491// 100
D-OCBBF2010718.1// 144..2495// Q9UPQ7// PDZ domain-containing RING finger protein 3 (Ligand of Numb-protein X 3) (Semaphorin cytoplasmic domain-associated protein 3) (SEMACAP3 protein)// 3D-structure; Alternative splicing; Coiled coil; Metal-binding; Polymorphism; Repeat; Zinc; Zinc-finger.// 0// 758// 99
D-OCBBF3004194.1// 129..2480// Q9UPQ7// PDZ domain-containing RING finger protein 3 (Ligand of Numb-protein X 3) (Semaphorin cytoplasmic domain-associated protein 3) (SEMACAP3 protein)// 3D-structure; Alternative splicing; Coiled coil; Metal-binding; Polymorphism; Repeat; Zinc; Zinc-finger.// 0// 760// 99
D-NT2RP8000826.1// 95..2461// Q9UPQ7// PDZ domain-containing RING finger protein 3 (Ligand of Numb-protein X 3) (Semaphorin cytoplasmic domain-associated protein 3) (SEMACAP3 protein)// 3D-structure; Alternative splicing; Coiled coil; Metal-binding; Polymorphism; Repeat; Zinc; Zinc-finger.// 0// 759// 99
D-NT2RP7007268.1// 95..2461// Q9UPQ7// PDZ domain-containing RING finger protein 3 (Ligand of Numb-protein X 3) (Semaphorin cytoplasmic domain-associated protein 3) (SEMACAP3 protein)// 3D-structure; Alternative splicing; Coiled coil; Metal-binding; Polymorphism; Repeat; Zinc; Zinc-finger.// 0// 759// 99
D-BRAWH3008172.1// 281..2452// Q9UPQ7// PDZ domain-containing RING finger protein 3 (Ligand of Numb-protein X 3) (Semaphorin cytoplasmic domain-associated protein 3) (SEMACAP3 protein)// 3D-structure; Alternative splicing; Coiled coil; Metal-binding; Polymorphism; Repeat; Zinc; Zinc-finger.// 0// 722// 99
D-BRAWH3011965.1// 300..>1574// Q9UPQ7// PDZ domain-containing RING finger protein 3 (Ligand of Numb-protein X 3) (Semaphorin cytoplasmic domain-associated protein 3) (SEMACAP3 protein)// 3D-structure; Alternative splicing; Coiled coil; Metal-binding; Polymorphism; Repeat; Zinc; Zinc-finger.// 0// 421// 99
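Because each record above is a flat list of eight fields separated by "//", it can be loaded into a typed structure for further processing. The sketch below is one possible reading of that layout; the class and function names are illustrative, and it assumes that no field itself contains "//", which holds for the records listed here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SwissProtHit:
    cdna_name: str
    orf_region: str
    accession: str
    definition: str
    keywords: List[str]
    e_value: float
    consensus_length: int  # aligned length in amino acids
    identity: int          # percent identity


def parse_swissprot_record(line):
    """Split one '//'-delimited record into its eight named fields."""
    fields = [f.strip() for f in line.split("//")]
    if len(fields) != 8:
        raise ValueError(f"expected 8 fields, got {len(fields)}")
    name, orf, accession, definition, keywords, e_value, length, identity = fields
    # Keywords are ';'-separated and the list ends with a full stop.
    keyword_list = [k.strip() for k in keywords.rstrip(".").split(";") if k.strip()]
    return SwissProtHit(name, orf, accession, definition, keyword_list,
                        float(e_value), int(length), int(identity))


if __name__ == "__main__":
    record = ("D-BRHIP2003515.1// 84..707// P55327// Tumor protein D52 "
              "(N8 protein)// Coiled coil.// 7E-93// 184// 88")
    print(parse_swissprot_record(record))
```

The RefSeq records in section 3) have the same layout without the keyword field, so the same approach applies with seven fields.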

3) Results of Homology Analysis Using BLASTP (RefSeq)

Homology analysis was performed on the 19 ORF sequences shown in Example 14-1), using BLASTP (blastall 2.2.6; ftp://ftp.ncbi.nih.gov/blast/), for RefSeq of the Jul. 15, 2006 version (human, mouse, rat; ftp://ftp.ncbi.nih.gov/refseq/). Based on the results of the homology analysis, the sequences showing the highest homology with an E-value of 1E-10 or less are shown below. In the following cases, however, the applicable candidate is not selected, but the next candidate is shown.
Having a definition beginning with “hypothetical protein FLJ”
Having a definition beginning with “KIAA”
Having a definition beginning with “hypothetical protein DKFZ”
Having a definition beginning with “DKFZ”
Having a definition beginning with “RIKEN cDNA”
Having a definition beginning with “hypothetical protein MGC”
Having a definition of “hypothetical protein”
Having a definition beginning with “hypothetical protein PP”
Having the definition as “neuronal thread protein”
Having a definition beginning with “clone FLB”
Having a definition beginning with “hypothetical protein PRO”
Having the definition as “PRO0483 protein”
Having a definition including “MNC”
Having a definition including “MOST-1”
Having a definition beginning with “similar to”
Having a definition including “TPR gene on Y”
Having a definition beginning with “HSPC”
Having a definition beginning with “CGI-”
Individual data are shown with the name of cDNA sequence, ORF region, hit data accession number, hit data definition, E-value, consensus length (amino acid length), and identity separated by “//” in this order.
D-UTERU2026184.1// 191..2119// NP_597717.1// zinc finger protein 418 [Homo sapiens]// 0// 601// 100
D-BRACE3000012.1// 465..2558// NP_597717.1// zinc finger protein 418 [Homo sapiens]// 0// 674// 99
D-NT2RP8004156.1// 131..1387// NP_005154.2// v-akt murine thymoma viral oncogene homolog 1 [Homo sapiens]// 0// 418// 100
D-NT2RI3005525.1// 45..1292// NP_861449.1// sprouty-related protein with EVH-1 domain 2 [Homo sapiens]// 0// 408// 99
D-NT2RP8004592.1// 620..1183// NP_003921.2// src family associated phosphoprotein 2 [Homo sapiens]// 1E-110// 187// 100
D-NT2RI2014164.1// 162..1397// NP_000889.3// amine oxidase (flavin-containing) [Homo sapiens]// 0// 367// 93
D-BRAMY2029564.1// 143..1657// NP_000889.3// amine oxidase (flavin-containing) [Homo sapiens]// 0// 504// 100
D-BRHIP2003515.1// 84..707// NP_001020424.1// tumor protein D52 isoform 2 [Homo sapiens]// 1E-110// 207// 100
D-BRACE2044661.1// 297..878// NP_001670.1// Na+/K+-ATPase beta 3 subunit [Homo sapiens]// 5E-91// 158// 97
D-3NB692002462.1// 343..951// NP_000422.1// mevalonate kinase [Homo sapiens]// 1E-112// 202// 100
D-BRCAN2027778.1// 52..1086// NP_000422.1// mevalonate kinase [Homo sapiens]// 0// 343// 86
D-NT2RI3001005.1// 22..1629// NP_703150.1// glucose transporter 14 [Homo sapiens]// 0// 490// 99
D-NT2RI3005261.1// 22..1629// NP_703150.1// glucose transporter 14 [Homo sapiens]// 0// 491// 100
D-OCBBF2010718.1// 144..2495// NP_055824.1// PDZ domain containing RING finger 3 [Homo sapiens]// 0// 758// 99
D-OCBBF3004194.1// 129..2480// NP_055824.1// PDZ domain containing RING finger 3 [Homo sapiens]// 0// 760// 99
D-NT2RP8000826.1// 95..2461// NP_055824.1// PDZ domain containing RING finger 3 [Homo sapiens]// 0// 759// 99
D-NT2RP7007268.1// 95..2461// NP_055824.1// PDZ domain containing RING finger 3 [Homo sapiens]// 0// 759// 99
D-BRAWH3008172.1// 281..2452// NP_055824.1// PDZ domain containing RING finger 3 [Homo sapiens]// 0// 722// 99
D-BRAWH3011965.1// 300..>1574// NP_055824.1// PDZ domain containing RING finger 3 [Homo sapiens]// 0// 421// 99

4) Results of Motif Homology Analysis Using Pfam

Motif homology analysis was performed on the 19 ORF sequences shown in Example 14-1), using Pfam (ftp://ftp.sanger.ac.uk/pub/databases/Pfam/). The analytical program used was hmmpfam v2.3.2, and the analysis was performed for the November 2005 version of Pfam19.0. Based on the results of the homology analysis, the sequences showing the highest homology with an E-value of 1E-10 or less are shown below.
Individual data are shown with the name of cDNA sequence and ORF region, followed by the hit data, that is, the accession number, name, description, E-value, and InterPro ID, separated by “¥” in this order. When a sequence has more than one hit, the hit data are repeated, with each hit separated by “//”.
D-BRACE3000012.1// 465..2558// PF01352.15¥KRAB¥KRAB box¥2.1e-20¥IPR001909
D-NT2RP8004156.1// 131..1387// PF00069.14¥Pkinase¥Protein kinase domain¥1.6e-113¥IPR000719//PF07714.5¥Pkinase_Tyr¥Protein tyrosine kinase¥1.3e-18¥// PF00433.12¥Pkinase_C¥Protein kinase C terminal domain¥1.4e-11¥IPR000961
D-NT2RI3005525.1// 45..1292// PF05210.2¥Sprouty¥Sprouty protein (Spry)¥2.7e-11¥IPR007875
D-NT2RI2014164.1// 162..1397// PF01593.12¥Amino_oxidase¥Flavin containing amine oxidoreductase¥9.2e-57¥IPR002937
D-BRAMY2029564.1// 143..1657// PF01593.12¥Amino_oxidase¥Flavin containing amine oxidoreductase¥5.8e-103¥IPR002937
D-BRHIP2003515.1// 84..707// PF04201.4¥TPD52¥Tumour protein D52 family¥1.5e-119¥IPR007327
D-BRACE2044661.1// 297..878// PF00287.7¥Na_K-ATPase¥Sodium/potassium ATPase beta chain¥3.1e-32¥IPR000402
D-NT2RI3001005.1// 22..1629// PF00083.13¥Sugar_tr¥Sugar (and other) transporter¥6.3e-200¥IPR005828// PF07690.5¥MFS_1¥Major Facilitator Superfamily¥1.1e-14¥IPR011701
D-NT2RI3005261.1// 22..1629// PF00083.13¥Sugar_tr¥Sugar (and other) transporter¥5.5e-200¥IPR005828// PF07690.5¥MFS_1¥Major Facilitator Superfamily¥1.1e-14¥IPR011701
D-OCBBF2010718.1// 144..2495// PF00595.12¥PDZ¥PDZ domain (Also known as DHR or GLGF)¥2e-14¥IPR001478
D-OCBBF3004194.1// 129..2480// PF00595.12¥PDZ¥PDZ domain (Also known as DHR or GLGF)¥7.1e-16¥IPR001478
D-NT2RP8000826.1// 95..2461// PF00595.12¥PDZ¥PDZ domain (Also known as DHR or GLGF)¥7.1e-16¥IPR001478
D-NT2RP7007268.1// 95..2461// PF00595.12¥PDZ¥PDZ domain (Also known as DHR or GLGF)¥7.1e-16¥IPR001478
D-BRAWH3008172.1// 281..2452// PF00595.12¥PDZ¥PDZ domain (Also known as DHR or GLGF)¥7.1e-16¥IPR001478
D-BRAWH3011965.1// 300..>1574// PF00595.12¥PDZ¥PDZ domain (Also known as DHR or GLGF)¥7.1e-16¥IPR001478
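The Pfam records above are nested: "//" separates the sequence name, the ORF region, and each hit, while "¥" separates the five fields inside a hit. A minimal sketch of a parser for this layout follows. The function name and dictionary keys are illustrative, the separator is assumed to be the yen sign U+00A5 as printed here, and an empty InterPro ID field is returned as None.

```python
SEP = "\u00a5"  # the yen sign used as the field separator within each hit


def parse_pfam_record(line):
    """Parse one Pfam record into a dict with a list of hit dicts."""
    parts = [p.strip() for p in line.split("//")]
    if len(parts) < 3:
        raise ValueError("record needs a name, an ORF region, and at least one hit")
    name, orf_region, raw_hits = parts[0], parts[1], parts[2:]
    hits = []
    for raw in raw_hits:
        fields = raw.split(SEP)
        if len(fields) != 5:
            raise ValueError(f"expected 5 fields in hit, got {len(fields)}")
        accession, hit_name, description, e_value, interpro = fields
        hits.append({
            "accession": accession,
            "name": hit_name,
            "description": description,
            "e_value": float(e_value),
            "interpro_id": interpro or None,  # some hits carry no InterPro ID
        })
    return {"cdna_name": name, "orf_region": orf_region, "hits": hits}


if __name__ == "__main__":
    record = (
        "D-NT2RP8004156.1// 131..1387// "
        + SEP.join(["PF00069.14", "Pkinase", "Protein kinase domain", "1.6e-113", "IPR000719"])
        + "//"
        + SEP.join(["PF07714.5", "Pkinase_Tyr", "Protein tyrosine kinase", "1.3e-18", ""])
    )
    for hit in parse_pfam_record(record)["hits"]:
        print(hit)
```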

5) Transmembrane Domain Prediction Analysis Using SOSUI

Transmembrane domain prediction analysis was performed on the 19 ORF sequences shown in Example 14-1), using SOSUI (http://bp.nuap.nagoya-u.ac.jp/sosui/). For the analysis, SOSUI version 1.5 was used. The sequences that permitted prediction of the transmembrane domain in the SOSUI analysis are shown below.
Individual data are shown with the name of cDNA sequence, ORF region, and number of transmembrane domains, separated by “//”.

D-NT2RI3005525.1// 45..1292// 1

D-BRACE2044661.1// 297..878// 2

D-NT2RI3001005.1// 22..1629// 11

D-NT2RI3005261.1// 22..1629// 11

6) N-Terminal Secretion Signal Sequence Prediction Analysis Using PSORT

N-terminal secretion signal sequence prediction was performed on the 19 ORF sequences shown in Example 14-1), using PSORT (http://psort.nibb.ac.jp/). PSORT II was used for the analysis.
In the PSORT analysis, no sequences permitted prediction of the N-terminal secretion signal sequence.

7) N-Terminal Secretion Signal Sequence Prediction Analysis Using SignalP ver. 3.0

N-terminal secretion signal sequence prediction was performed on the 19 ORF sequences shown in Example 14-1), using SignalP (http://www.cbs.dtu.dk/services/SignalP/). SignalP version 3.0 was used for the analysis. Sequences that permitted prediction of the N-terminal secretion signal sequence in the SignalP analysis are shown below.
Individual data are shown with the name of cDNA sequence and ORF region separated by “//”.

D-BRACE2044661.1// 297..878

Summary of Examples 1 to 14

Although there have been remarkable advances in the analysis of human chromosome sequences thanks to the progress in human genome research, this does not mean that all the human genetic functions have been clarified. We analyzed human genes with a focus on the diversity thereof, and showed that the diversity is largely associated with gene functional changes.
By comparing human genome sequence information and data on human cDNAs, which are products of transcription therefrom, it was found that a plurality of mRNAs are transcribed from certain regions of chromosome. They occur in two cases: a case wherein there are different ORF regions predicted to encode and produce different proteins, and another case wherein there are different 5′UTR regions or 3′UTR regions, which are noncoding regions, and the same protein is produced. With an emphasis on cDNAs predicted to encode proteins different from those of known cDNAs that have already been analyzed, in particular, we performed search and sequence analysis of such cDNAs. Hence, it was found that the cause of the diversity resides mainly in transcription initiation point selectivity and exon selectivity. Regarding transcription initiation point selectivity, a change of the transcription factor used in a certain chromosome region produced a different position for transcription initiation, resulting in the cDNA diversity. As for exon selectivity, an increase or decrease in the exon used, despite transcription from the same chromosome region, at the time of transcription and splicing, resulted in the cDNA diversity.
How the genetic diversity is associated with gene functions was analyzed on the basis of our own information on the expression frequencies of mRNAs by the 5′-terminal sequences of about 1.50 million human cDNAs (5′-onepass sequences). Hence, a large number of cases were found wherein gene functions seemed to be significantly influenced by diversity features, including variation of transcription initiation region selective in a certain organ, and deletion of exon in a certain condition. We discovered genes whose diversity varies depending on the brain tissue portion and nerve cell differentiation stage, and conducted extensive analyses.
Regarding the analytical method, the expression levels were compared using real-time PCR (polymerase chain reaction). For example, assuming an exon predicted to be inserted selectively only after differentiation into nerve cells, a primer that specifically detects the exon region (01) is designed, a primer that specifically detects the pattern in which the exon is not inserted (02) is designed, and a primer that detects a region having both patterns in common (03) is designed. With the use of these 3 kinds of primers, the amounts amplified at the various stages of nerve cell differentiation are compared. The specific region detection results for 01 and 02 are compared with the amount amplified for the shared region 03 as the control at various stages of nerve cell differentiation, whereby it is possible to know how the exon selectivity was changed by nerve cell differentiation. Hence, the correlation between exon selectivity and tissue specific expression can be assessed.
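The comparison of a specific region against the shared region can be sketched numerically. The example below is an illustration rather than the analysis actually performed: it reuses Log₁₀RQ scores from Table 18-2 for the specific region 023_01 and the common region 023_06 of the gene in Example 13, and subtracts one from the other for each NT2 differentiation stage. Working with log scores is a design choice made here to avoid dividing by RQ values that are printed as 0.0.

```python
# Log10RQ scores copied from Table 18-2: (specific region 023_01,
# common region 023_06) at successive NT2 differentiation stages.
LOG10_RQ = {
    "NT2RA(-)": (-2.32, -1.54),
    "NT2RA(+) 24 hr": (-1.11, -1.40),
    "NT2RA(+) 48 hr": (0.12, -0.81),
    "NT2RA(+) 1 week": (1.28, 0.25),
    "NT2RA(+) 5 weeks": (1.60, 0.27),
}


def specific_vs_common(scores):
    """Return {sample: log10(specific RQ / common RQ)} for each sample.

    A difference of log scores equals the log of the ratio, so a rising
    value means the specific region gains share of the overall expression.
    """
    return {
        sample: round(specific - common, 2)
        for sample, (specific, common) in scores.items()
    }


if __name__ == "__main__":
    for sample, delta in specific_vs_common(LOG10_RQ).items():
        print(f"{sample}: {delta:+.2f}")
```

With these values the difference rises at every stage listed, which is the kind of stage-dependent shift in region usage that the three-primer design is meant to reveal.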
By this method, we discovered many genes whose diversity is associated with tissue-specific expression. The fact that expression is specific to a particular tissue suggests that the diversity may significantly influence the function of the gene. Hence, by using a specific region with diversity as a gene marker, it seems possible to elucidate the function of a particular portion of the brain, and to detect nerve cell differentiation or regeneration stages in detail. Furthermore, for example, by targeting a protein having a specific region expressed only at a certain stage of nerve cell differentiation or regeneration, it seems possible to develop a pharmaceutical that is more effective and causes fewer adverse reactions.

The mRNAs related to nerve cell differentiation (mRNAs that induce differentiation of nervous system cells and exhibit an expressional change) are thought to be useful as therapeutic/diagnostic markers for nerve disease. By searching for an mRNA that exhibits an expressional change during the process of inducing differentiation of cultured human cells NT2 into nerve cells (retinoic acid (RA) stimulation or RA stimulation followed by treatment with growth inhibitor), such an mRNA can be discovered. These mRNAs are also thought to be associated with nerve regeneration.

1) Hippocampus

Among the brain tissues, the hippocampus is a very important portion that controls memory: it fixes memory by determining whether or not the information obtained is necessary, and allows other brain portions to store the memory. Clinical findings show that if the hippocampus is damaged or, in the worst case, absent, one is only able to remember new things for a short time. Some patients with dementia are thought to have an abnormality in the hippocampus. When the whole brain tissue and the hippocampus are compared, the mRNAs exhibiting expressional variation are mRNAs involved in memory or associated with dementia, and are thought to be useful in elucidating the mechanism for memory and as therapeutic/diagnostic markers.

2) Caudate Nucleus

The hippocampal system is a portion that is important to memory associated with spatial cognition. Spatial cognition is also said to be memory of remembering places. By contrast, the caudate nucleus is said to be a portion that is important to memory acquired through habits (habitual memory).

3) Amygdala

The amygdala is the emotional center of the brain. Information that has passed through the amygdala causes emotional reactions, for example, panic and fear reactions. If a strong fear is produced upon affect assessment of a stimulus by the amygdala, the amygdala transmits warning signals to various portions of the brain. As a result, reactions such as palm sweating, palpitation, blood pressure elevation, and rapid secretion of adrenaline occur. The amygdala can also be said to be a tissue that controls a kind of defensive instinct, in which warning signals are transmitted to the body to put the body in a state of alert. When comparing the whole brain tissue and the amygdala, the mRNAs exhibiting expressional variation are mRNAs involved in emotional reactions, and are thought to be useful in elucidating the molecular mechanisms for emotional reactions, fear reactions, panic and the like.

4) Cerebellum

The cerebellum is the center of equilibrium, muscle movement, and motor learning. This region is thought to be involved in motor regulation; because the cerebellum acts, one can make smooth motions without conscious effort. There is also increasing evidence for the involvement of the cerebellum not only in physical movement, but also in the habituation of higher movements such as reading and writing. When comparing the whole brain tissue and the cerebellum, the mRNAs exhibiting expressional variation are mRNAs involved in equilibrium and motor functions, and are thought to be useful in elucidating the molecular mechanisms for the motor functions under the control of the brain.

5) Thalamus

The thalamus is a portion where nerve cells that are highly associated with the cerebrum gather; it transfers sensory information from the spinal cord and the like to the relevant portions of the cerebrum, and regulates the motor commands of the cerebrum. For example, in visual sensation, images are separated into size, shape, and color, and in auditory sensation, sounds are separated into volume and pleasantness, and the results are sent to the sensory area of the cerebral cortex. When comparing the whole brain tissue and the thalamus, the mRNAs exhibiting expressional variation are mRNAs involved in signal transduction from sensory organs, and are thought to be useful in elucidating the molecular mechanism for the signal transduction under the control of the brain.

6) Substantia Nigra

The substantia nigra is a nerve nucleus that occupies a portion of the midbrain. The substantia nigra is roughly divided into two portions: the pars compacta and the pars reticulata (and lateral portion), both of which are central constituents of the basal ganglia. The basal ganglia, along with the cerebellum, are known as a higher center that plays important roles in the initiation and control of voluntary movement. The basal ganglia roughly consist of four nerve nuclei, i.e., the striate body, pallidum, substantia nigra, and subthalamic nucleus, the striate body being divided into the caudate nucleus and the putamen, the pallidum into the lateral segment and the medial segment, and the substantia nigra into the pars compacta and the pars reticulata. When these nerve nuclei are re-classified from the viewpoint of the signal transduction modes of “input”, “output”, and “mutual communication”, the striate body corresponds to the input portion of the basal ganglia, and the pallidal medial segment and the substantia nigra pars reticulata correspond to the output portion thereof. Because they connect the input portion and the output portion indirectly, the pallidal lateral segment and the subthalamic nucleus are thought to be an interface of the basal ganglia; because it modifies the nervous activity of the striate body by dopamine, the substantia nigra pars compacta is thought to be a modifying portion of the basal ganglia.

Parkinson's disease is an illness of the cerebro-nervous system characterized by insufficient production of dopamine, a neurotransmitter produced in the substantia nigra in the brain, resulting in motor disorders such as hand tremor and muscle stiffness that make physical movement sluggish. Brain nerve cells usually decrease little by little with aging; in Parkinson's disease, nerve cells of the substantia nigra decrease at a remarkably higher rate than usual.

When comparing the whole brain tissue and the substantia nigra, the mRNAs exhibiting expressional variation are thought to be mRNAs involved in the above events.

7) Alzheimer Patient's Cerebral Cortex

Alzheimer's disease is an illness of the cerebro-nervous system characterized by memory loss that hampers daily activities and, in advanced cases, necessitates nursing care, eventually leading to atrophy of the brain. Although the causes of its onset are said to be associated with environmental factors such as stress, as well as vascular factors such as hypertension and hypercholesterolemia, they have not been fully investigated. Therefore, when comparing normal brain tissue and Alzheimer pathologic tissues, the mRNAs exhibiting expressional variation are mRNAs associated with Alzheimer's disease, and are thought to be useful in elucidating the mechanisms for the onset of pathologic conditions, and as therapeutic/diagnostic markers.

This application is based on patent application No. 2007-066430 filed in Japan (filing date: Mar. 15, 2007), the contents of which are incorporated in full herein by this reference.
[Sequence Listing]

Claims

1.-27. (canceled)

28. An isolated peptide comprising the amino acid sequence of SEQ ID NO: 60.

29. A polynucleotide that encodes the peptide of claim 28.

30. An expression vector comprising the polynucleotide of claim 29 and a promoter operably linked thereto.

31. A transformant incorporating the expression vector of claim 30.

32. An aptamer that binds the peptide of claim 28.

33. An antibody that binds the peptide of claim 28.

34. The antibody of claim 33, wherein the antibody is any one of the (i) to (iii) below:

(i) a polyclonal antibody;

(ii) a monoclonal antibody or a portion thereof;

(iii) a chimeric antibody, a humanized antibody or a human antibody.

35. A cell that produces the antibody of claim 33.

36. The cell of claim 35, wherein the cell is a hybridoma.

37. A composition comprising (a) the antibody of claim 33, or an expression vector therefor, and (b) a pharmaceutically acceptable carrier.

38. A reagent or kit for detection or quantification of any one of the polypeptides encoded by the brain/nerve-specific genes 1 to 10, which reagent or kit comprises one or more antibodies of claim 33.

39. The peptide of claim 28 consisting of the amino acid sequence of SEQ ID NO: 60.

40. A polynucleotide that encodes the peptide of claim 39.

41. An expression vector comprising the polynucleotide of claim 40 and a promoter operably linked thereto.

42. A transformant incorporating the expression vector of claim 41.

43. An aptamer that binds the peptide of claim 39.

44. An antibody that binds the peptide of claim 39.

45. The antibody of claim 44, wherein the antibody is any one of the (i) to (iii) below:

(i) a polyclonal antibody;

(ii) a monoclonal antibody or a portion thereof;

(iii) a chimeric antibody, a humanized antibody or a human antibody.

46. A cell that produces the antibody of claim 44.

47. The cell of claim 46, wherein the cell is a hybridoma.

48. A composition comprising (a) the antibody of claim 44, or an expression vector therefor, and (b) a pharmaceutically acceptable carrier.

49. A reagent or kit for detection or quantification of any one of the polypeptides encoded by the brain/nerve-specific genes 1 to 10, which reagent or kit comprises one or more antibodies of claim 44.