CA2033411C - Document revising system for use with document reading and translating system - Google Patents
Document revising system for use with document reading and translating system
- Publication number
- CA2033411C
- Authority
- CA
- Canada
- Prior art keywords
- document
- image
- character
- correspondence
- translated
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/232—Orthographic correction, e.g. spell checking or vowelisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/55—Rule-based translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/98—Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
- G06V10/987—Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns with the intervention of an operator
Abstract
An image-to-character-position-correspondence-table producing unit produces an image-to-character-position-correspondence-table composed of a set comprising an-image-of-a-document, a character-recognized document and a translated document. A candidate character producing unit produces candidate characters for supporting the revision of misrecognized characters. A Japanese-document-to-translated-document correspondence table stores a correspondence relationship between an original Japanese document and a translated document in the form of a table. When misrecognized characters are being revised, the image-to-character-position-correspondence-table is displayed on a display unit. A revising unit prompts a user to specify a misrecognized portion in the translated document of the image-to-character-position-correspondence-table. Next, the revising unit refers to the Japanese-document-to-translated-document correspondence table to extract a portion of each of the-image-of-the-document and the recognized document that corresponds to the specified portion and causes the image-to-character-position-correspondence-table producing unit to display the corresponding portions. Subsequently, the revising unit refers to the candidate character producing unit to extract candidate characters as requested by the user and causes the image-to-character-position-correspondence-table producing unit to display these candidate characters. Candidate characters are selected by the user. The misrecognized portion in the recognized document is replaced with the selected candidate characters, the new character-recognized document is translated and the newly translated document is displayed. In this way even foreigners who have little knowledge of Japanese can carry out revision work on misrecognized characters with ease.
Description
Document Revising System for Use with Document Reading and Translating System
Background of the Invention
Field of the Invention
The present invention relates to a document-revising apparatus for use with a document reading and translating system and, more particularly, to a revised-document display apparatus for use with a Japanese-document reading and translating system. That system combines a Japanese document reader, which enters a Japanese document as an image and performs character recognition on it, with an automatic translator. The apparatus permits even foreigners who understand little Japanese to revise misread characters with ease.
Description of the Related Art
With recent internationalization, it has become increasingly necessary for Japanese documents to be read in various countries. Thus, a combined system comprising a Japanese document reader which serves as Japanese entry means and an automatic translator which translates Japanese into a foreign language has been developed.
Figure 1 is a block diagram of a conventional Japanese document reader. This prior art consists of an image entry unit 1, an-image-of-a-document storage unit (image memory) 2, a character segmentation unit 3, a character recognition unit 4, a Japanese document storage unit 5, a revising or correcting unit 6 and a display unit 7.
A Japanese document is read as an-image-of-a-document by an OCR (optical character reader) of the image entry unit 1 and the-image-of-the-document is then stored in the-image-of-the-document storage unit 2.
Next, the character segmentation unit 3 reads the image of the document from the-image-of-the-document storage unit 2 and segregates characters from the image of the document in sequence. The character recognition unit 4 performs a character recognition process on each of the character segmentations. Data on each of the recognized characters is stored in the Japanese document storage unit 5. The display unit 7 displays the Japanese document subjected to the recognition process which has been stored in the Japanese document storage unit 5.
The character recognition rate of the character recognition unit 4 cannot be 100%. Therefore, it is necessary to revise a document that has been partly misrecognized. The user compares the character-recognized document displayed by the display unit 7 with the original document (namely, the document written or printed on a sheet of paper) to search for misrecognized characters. If he finds any, he revises them by using the revising unit 6. For example, the revising work may be performed by deleting a misrecognized character, entering the Japanese rendering or reading (kana: Japanese syllabary) of an image character corresponding to the misrecognized character, performing kana-to-kanji (Chinese character) conversion on the kana to obtain a correct character, and again storing the obtained character data in the Japanese document storage unit 5.
Figure 2 is a block diagram of a conventional automatic translator comprising a data entry unit 8, a translating unit 9, a translated document storage unit 10 and a display unit 7'.
Japanese-document data entered via the data entry unit 8 is read into the translating unit 9 for translation into a language (for example, English) other than Japanese. The translated document is stored in the translated document storage unit 10 and displayed on the display unit 7' as needed.
The Japanese-document reader of Figure 1 and the automatic translator of Figure 2 constitute separate systems. Since such separate systems have poor operability, it has been proposed to integrate them.
Figure 3 is a block diagram of a conventional integrated Japanese-document reading and translating system. In Figure 3, like reference characters are used to designate blocks corresponding to those in Figures 1 and 2.
In the system of Figure 3, first, a Japanese document is stored in the Japanese document storage unit 5 via the image entry unit 1, the-image-of-the-document storage unit 2, the character segmentation unit 3 and the character recognition unit 4, and is revised by the revision unit 6 while it is being displayed on the display unit 7.
Next, the correct Japanese document stored in the Japanese document storage unit 5 is entered directly into the translator 9 for translation into a foreign language, as in the translator of Figure 2. The obtained foreign language document is then stored in the translation document storage unit 10 and displayed by the display unit 7 as needed. That is, the display unit 7 also serves as the display unit 7' of Figure 2.
In this way the Japanese-document reading and translating system of Figure 3 can perform a combined process of reading a Japanese document written or printed on a sheet of paper and translating it to a foreign language.
However, the conventional system of Figure 3 has the following problems.
First, the user has to compare a displayed document by eye with the original Japanese document as it was prior to image entry (a document written or printed on a sheet of paper) in order to search for and revise misrecognized characters. Thus, it is very difficult for a foreigner (a non-Japanese person) whose knowledge of Japanese is poor to be sure of correctly revising the results of recognition.
Second, since it is difficult to be sure that the recognition results have been correctly revised, subsequent translation work may not be executed correctly.
As described above, heretofore, a system combining a Japanese-document reader and an automatic translator which is easy for foreigners to operate has not yet been constructed.
Summary of the Invention
It is therefore an object of the present invention to provide a system combining a Japanese-document reader and an automatic translator which permits even persons whose knowledge of Japanese is poor to search for and revise misrecognized characters in a short time and without any difficulty.
The present invention provides a document revising apparatus for use with a document reading and translating system for performing character recognition of an-image-of-a-document to make a recognized document and translating the recognized document, comprising: character recognition means for entering a document written in a first language as an image of a document, segregating characters from said image of the document and performing character recognition on each character segmentation to produce a recognized document; translating means for translating said document in said first language to a second language to make a translated document; image-to-character-position-correspondence-table producing and displaying means for producing and displaying an image-to-character-position-correspondence-table in which a correspondence is established between said image document, said recognized document and said translated document; original-document-to-translated-document correspondence relationship storing means for storing a correspondence relationship between an original document and a translated document; candidate character producing means for producing candidate characters used for revising misrecognized characters; and document revising means for carrying out the following processes: a first process allowing a user to specify a misrecognized portion in said translated document displayed by said image-to-character-position-correspondence-table producing and displaying means; a second process referring to said original-document-to-translated-document correspondence relationship storing means to extract portions of said image document and said recognized document which correspond to said misrecognized portion specified and causing said image-to-character-position-correspondence-table producing and displaying means to display said portions extracted explicitly; a third process referring to said candidate character producing means to extract candidate characters for said misrecognized portion in said recognized document and causing said image-to-character-position-correspondence-table producing and displaying means to display said candidate characters as requested by the user; a fourth process enabling the user to select arbitrary characters from said candidate characters displayed and replacing said misrecognized portion in said recognized document with selected candidate characters; a fifth process causing said translating means to retranslate a new document in which said misrecognized portion is replaced with said selected candidate characters to thereby produce a new translated document and causing said image-to-character-position-correspondence-table producing and displaying means to display said new translated document; and a control process for repeating said first through said fifth processes.
According to the configuration of the present invention, the user can search for misrecognized characters on the basis of the translation result, not on the basis of the character recognition result of the document reader. Thus, even foreigners who have little knowledge of the original language can carry out the revising work without any difficulty.
The work of revising the recognized document can be carried out not by kana-to-kanji conversion using keyboard entry, but by selecting a correct character from displayed candidate characters on the basis of visual matching with the-image-of-the-document. Thus, even persons who have little knowledge of the original language can carry out the revising work with ease.
Brief Description of the Drawings
Further objects and advantages of the present invention will be apparent from the following description of a preferred embodiment with reference to the accompanying drawings, in which:
Figure 1 is a block diagram of a conventional Japanese document reader;
Figure 2 is a block diagram of a conventional automatic translator;
Figure 3 is a block diagram of a conventional combined Japanese document reading and translating system;
Figure 4 is a basic block diagram of a Japanese document reading and translating system embodying the present invention;
Figure 5 is a detailed block diagram of the system of Figure 4;
Figure 6 is a flowchart for explaining the operation of the present invention; and
Figure 7 is a diagram illustrating an example of an image-to-character-position-correspondence-table of the present invention.
Detailed Description of the Preferred Embodiment
Explanation of the principle of the invention
Figure 4 is a basic block diagram of a Japanese document reading and translating system embodying the present invention. In Figure 4, like reference numerals are used to designate blocks corresponding to those in Figure 3.
The embodiment of the present invention includes a character recognition unit 14 for segregating characters from an entered image document and performing character recognition on character segmentations, a translator unit 15 for translating a document which has been subjected to the recognition process to a foreign language, a display unit 7 for displaying a document and a revising unit 6 for revising misrecognized characters in the recognized Japanese document. These units have the same functions as those in Figure 3.
In addition to the above configuration, the present embodiment contains the following distinctive units. First, an image-to-character-position-correspondence-table producing unit 11 is provided. This unit produces a set of documents comprising an-image-of-a-document, a recognized document and a translated document. Second, a candidate character making unit 12 is provided which makes candidate characters for supporting the revision of misrecognized characters. Third, a Japanese-document-to-translated-document correspondence table 13 is provided, which stores the correspondence between a Japanese document before translation and a translated document in the form of a table.
In the basic configuration described above, the image-to-character-position-correspondence-table produced by the image-to-character-position-correspondence-table producing unit 11 is displayed by the display unit 7 so that the misrecognized characters can be revised.
When the user specifies a misrecognized portion on the translated document of the image-to-character-position-correspondence-table displayed on the display unit 7 by using the revising unit 6, the revising unit 6 refers to the Japanese-document-to-translated-document correspondence table 13 to extract from the image of the document and the recognized document portions corresponding to the specified portion and informs the image-to-character-position-correspondence-table producing unit 11 of information about the corresponding portions. The portion of the translated document specified by the user and the corresponding portions of the-image-of-the-document and the recognized document are thereby displayed explicitly on the display unit 7. That is, these portions are, for example, blinked or reversed on the display unit.
Subsequently, when prompted by the user, the revision unit 6 refers to the candidate character producing unit 12 to extract candidate characters for the misrecognized portion in the recognized document and informs the image-to-character-position-correspondence-table producing unit 11 of the candidate characters. The candidate characters are thereby displayed on the display unit 7.
When the user selects arbitrary characters from the candidate characters displayed by the display unit 7 by using the function of the revision unit 6, the misrecognized portion in the recognized document displayed by the display unit 7 is replaced with the selected candidate characters and the document revision information is transmitted to the character recognition unit 14. The character recognition unit 14 replaces the misrecognized portion in the recognized document with the selected candidate characters to prepare a new recognized document which is stored again, and sends the new document to the translating unit 15. The translating unit 15 retranslates the portions corresponding to the selected candidate characters and sends a new translation document to the image-to-character-position-correspondence-table making unit 11 for display on the display unit 7.
Specific Embodiment
Figure 5 illustrates more specifically the configuration of the system of Figure 4. Figure 6 is a flowchart of the operation of the system of Figure 5 and Figure 7 illustrates an example of a displayed document. In Figure 5, like reference numerals are used to designate blocks corresponding to those in Figures 3 and 4.
In the configuration of Figure 5, which is based on the configuration of Figure 4, the character recognition unit 14 consists of an image entry unit 1, an-image-of-a-document storage unit 2, a character segmentation unit 3, a character recognition unit 4 and a Japanese-document storage unit 5. The translating unit 15 is comprised of a translation unit 9 and a document translation storage unit 10.
The Japanese-document-to-translated-document correspondence table 13 stores a set comprising a recognized document (in Japanese) and a corresponding translated document (translated to English, for example) in the form of a table in character-recognition units (for example, on a clause-by-clause basis).
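A clause-by-clause correspondence table of this kind could be modelled as follows. This Python sketch is illustrative only: the names ClauseEntry, build_offsets and find_by_translated_offset, the image_box field, and the choice to address the translated document by character offset are assumptions made for the example and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ClauseEntry:
    recognized: str                        # clause of the recognized (first-language) document
    translated: str                        # corresponding clause of the translated document
    image_box: Tuple[int, int, int, int]   # assumed: clause position in the image (x, y, w, h)


def build_offsets(table: List[ClauseEntry], sep: str = " ") -> List[Tuple[int, int]]:
    """Return (start, end) offsets of each translated clause within the
    translated document formed by joining the clauses with `sep`."""
    offsets, pos = [], 0
    for entry in table:
        offsets.append((pos, pos + len(entry.translated)))
        pos += len(entry.translated) + len(sep)
    return offsets


def find_by_translated_offset(table: List[ClauseEntry], offset: int,
                              sep: str = " ") -> Optional[int]:
    """Map a position the user pointed at in the translated document
    to the index of the clause that contains it."""
    for i, (start, end) in enumerate(build_offsets(table, sep)):
        if start <= offset < end:
            return i
    return None  # the offset fell on a separator or outside the document
```

Given the index returned by find_by_translated_offset, the recognized clause and its image position can be read from the same table entry, which is the lookup the revising unit performs when the user points at a suspicious portion of the translation.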
The candidate character producing unit 12 extracts characters from the character recognition unit 4 to prepare a table of candidate characters for misrecognized characters.
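One way to picture such a candidate table is sketched below in Python. The specification does not say how candidates are obtained from the character recognition unit 4; the sketch assumes, purely for illustration, that the recognizer yields a score for several alternative characters at each segmented position. The names rank_candidates and candidate_table and the parameter k are hypothetical.

```python
from typing import Dict, List


def rank_candidates(scores: Dict[str, float], k: int = 3) -> List[str]:
    """Return up to k characters ordered from best to worst score.
    Ties are broken by character order so the result is deterministic."""
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [char for char, _ in ordered[:k]]


def candidate_table(per_char_scores: List[Dict[str, float]], k: int = 3) -> List[List[str]]:
    """Build one ranked candidate list per segmented character position.
    Under the stated assumption, the first entry of each list is the
    character the recognizer would have output."""
    return [rank_candidates(scores, k) for scores in per_char_scores]
```

With this layout, the entries after the first in each list are the alternatives that would be offered to the user for a position suspected of being misrecognized.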
The operation of the system of Figure 5 will be described specifically with reference to Figures 6 and 7. In the following description, steps 1 through 15 correspond to steps 1 through 15 of the flowchart of Figure 6.
First, a Japanese document, such as a technological treatise, written or printed on a sheet of paper is read as an-image-of-a-document by the image entry unit 1 and the-image-of-the-document is stored in the-image-of-the-document storage unit 2 (step 1). Next, the character segmentation unit 3 segregates characters from the-image-of-the-document read from the-image-of-the-document storage unit 2 (step 2). The character recognition unit 4 performs character recognition on each of the segregated characters and stores the recognized Japanese document in the Japanese-document storage unit 5 (step 3). Subsequently, the character-recognized document is read into the translating unit 9 for translation into a foreign language (non-Japanese) and the resulting translated document (a translation from Japanese) is stored in the translated document storage unit 10 (step 4).
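Steps 1 through 4 can be pictured as a simple pipeline. The following Python sketch is illustrative only: the toy image representation (a list of glyph boxes), the names Glyph, segment, recognize and read_and_translate, and the lookup-table classifier are all assumptions made for this example and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Glyph:
    pattern: str                     # stand-in for the pixel data of one character
    box: Tuple[int, int, int, int]   # (x, y, width, height) in the image


def segment(image: List[Glyph]) -> List[Glyph]:
    """Step 2: segregate characters in reading order
    (top to bottom, then left to right, in this toy model)."""
    return sorted(image, key=lambda g: (g.box[1], g.box[0]))


def recognize(glyphs: List[Glyph], classifier: Dict[str, str]) -> str:
    """Step 3: recognize each segmented character; unknown patterns become '?'."""
    return "".join(classifier.get(g.pattern, "?") for g in glyphs)


def read_and_translate(image: List[Glyph],
                       classifier: Dict[str, str],
                       translate: Callable[[str], str]):
    """Steps 1-4: store the image, segment, recognize, translate."""
    stored_image = list(image)                  # step 1: image memory
    glyphs = segment(stored_image)              # step 2: keeps position information
    recognized = recognize(glyphs, classifier)  # step 3: recognized document
    translated = translate(recognized)          # step 4: translated document
    return stored_image, glyphs, recognized, translated
```

The segmented glyphs are returned together with the documents because the position information from segmentation is what later ties each recognized character back to its place in the image.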
When the misrecognized characters are being revised, the image-to-character-position-correspondence-table preparing unit 11 prepares an image-to-character-position-correspondence-table containing a set comprising the-image-of-the-document, the recognized document and the translated document.
The image-to-character-position-correspondence-table preparing unit 11 then reads the-image-of-the-document from the-image-of-the-document storage unit 2, the recognized document from the Japanese-document storage unit 5 and the translated document from the translated document storage unit 10 on the basis of position information from the character segmentation unit 3, thereby producing the image-to-character-position-correspondence-table (step 5). The image-to-character-position-correspondence-table prepared in this way is displayed on the display unit 7 (step 5).
Figure 7A illustrates one example of a displayed image (an-image-to-character-position-correspondence-table) on the screen of the display unit 7. In this example, the first line indicates an-image-of-a-document, the second line indicates a recognized document and the third line indicates a translated document. The image-of-the-document and the character-recognized document are each separated into, for example, clauses and the clauses of both documents are displayed in one-to-one correspondence.
The user carries out revising work while watching the display screen. In this case, the user searches the translated document for portions that do not seem to make sense and specifies those portions by using a device (for example, a mouse input device not shown) attached to the revision unit 6 (step 7). In Figure 7A, "branch art" is specified.
The revision unit 6 refers to the Japanese-document-to-translated-document correspondence table 13 to extract the characters from the recognized document that correspond to the specified portion. As a result, the characters translated as "branch art" are extracted as the corresponding characters in the recognized document. Then, the revision unit 6 extracts the corresponding characters (meaning "technological" in English) in the-image-of-the-document using the above position information. The revision unit 6 informs the image-to-character-position-correspondence-table making unit 11 of information about these extracted portions (step 8).
As a result, as illustrated in Figure 7B, the specified portion that seems to have been misrecognized is displayed explicitly by the display unit 7. The explicit display is performed by blinking or reversing (i.e., reversing white and black) corresponding portions in the documents (step 9).
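The three-row display with an explicitly marked portion might be sketched in Python as follows. This is an illustration under stated assumptions, not the display logic of the embodiment: the function name render_rows is hypothetical, the image row is represented by text labels rather than pixels, the translated row is assumed to be split into the same clauses as the other two rows, and double square brackets stand in for blinking or reverse video.

```python
from typing import List, Optional, Sequence


def render_rows(image_labels: Sequence[str],
                recognized: Sequence[str],
                translated: Sequence[str],
                highlight: Optional[int] = None) -> List[str]:
    """Return three text rows (image, recognized, translated) whose clauses
    are shown in one-to-one correspondence. The clause at index `highlight`
    is wrapped in [[...]] in every row."""
    if not (len(image_labels) == len(recognized) == len(translated)):
        raise ValueError("all three documents must have the same number of clauses")

    def mark(cells: Sequence[str]) -> str:
        # Join the clauses of one row, marking the highlighted clause.
        return " | ".join(f"[[{c}]]" if i == highlight else c
                          for i, c in enumerate(cells))

    return [mark(image_labels), mark(recognized), mark(translated)]
```

Marking the same clause index in all three rows is what lets a user who cannot read the recognized text compare its shape against the image directly above it.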
Subsequently, the user makes a comparison between the-image-of-the-document and the recognized document to confirm that the recognition result is wrong.
Then, the user enters a predetermined command (for example, through a click of the mouse) in step 10. As a result, the revision unit 6 refers to the candidate character making unit 12 to extract candidate characters for the misrecognized portion in the recognized document, and informs the document producing unit 11 of these candidate characters (step 11). The candidate characters for the misrecognized portion are thereby displayed on the display unit 7 (step 12).
When, in step 13, the user selects arbitrary characters from among the candidate characters displayed on the display unit 7 through clicks of the mouse as illustrated in Figure 7C, the misrecognized portion of the recognized document is replaced with the selected candidate characters and the revision unit 6 replaces the corresponding portion in the Japanese document storage unit 5 with the candidate characters (step 14).
The contents of the recognized document which has been subjected to replacement in that way are sent to the translating unit 9. The translating unit 9 retranslates the portion corresponding to the selected candidate characters and sends the newly translated document to the image-to-character-position-correspondence-table producing unit 11 via the translated document storage unit 10 (step 15).
The image-to-character-position-correspondence-table producing unit 11 produces a new image-to-character-position-correspondence-table on the basis of the correct translated document sent from the translating unit and displays it as illustrated in Figure 7D (step 15 → step 5). Finally, the user terminates the revision work through a click of the mouse (step 6).
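The repeated cycle of steps 5 through 15 can be summarized by the following Python sketch. It is a simplified illustration, not the embodiment itself: the function name revise, the callback parameters that stand in for user input, the per-clause candidate dictionary and the convention that a negative index ends the session are all assumptions introduced for this example.

```python
from typing import Callable, Dict, List


def revise(recognized: List[str],
           candidates: Dict[int, List[str]],
           translate: Callable[[str], str],
           choose_clause: Callable[[List[str]], int],
           choose_candidate: Callable[[List[str]], str]) -> List[str]:
    """Run the revise-and-retranslate cycle until the user ends it.

    recognized       -- recognized clauses (modified in place)
    candidates       -- candidate replacements per clause index
    translate        -- translates one clause
    choose_clause    -- user picks a suspicious translated clause, or -1 to finish
    choose_candidate -- user picks one of the displayed candidates
    """
    translated = [translate(clause) for clause in recognized]  # step 4
    while True:
        index = choose_clause(translated)                  # steps 5 and 7: display, specify
        if index < 0:                                      # step 6: user ends the revision
            return translated
        options = candidates.get(index, [])                # steps 10-12: fetch and display
        if not options:
            continue                                       # nothing to offer for this clause
        recognized[index] = choose_candidate(options)      # steps 13-14: select and replace
        translated[index] = translate(recognized[index])   # step 15: retranslate that portion
```

Only the revised clause is retranslated in this sketch, which mirrors the description of the translating unit 9 retranslating the portion corresponding to the selected candidate characters.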
As described above, the user can find misrecognized characters by searching the translated document rather than the recognized document produced by the Japanese document reader. Thus, even foreigners with little knowledge of Japanese can carry out revision work without difficulty.
In addition, the work of revising a recognized document can be carried out not by kana-to-kanji conversion using keyboard entry but by selecting a correct character from displayed candidate characters through visual matching with an-image-of-a-document.
Thus, even persons with little knowledge of Japanese can carry out the revision work with ease.
Document Revising System for Use with Document Reading and Translating System Background of the Invention Field of the Invention The present invention relates to a document-revising apparatus for use with a document reading and translating system and, more particularly, to a revised document display apparatus for use with a Japanese-document reading and translating system which is used with a combined system comprising a Japanese document reader adapted for entering a Japanese document as an image and character recognition thereof and an automatic translator, permitting even foreigners who understand little Japanese to revise misread characters with ease.
Description of the Related Art With recent internationalization, it has become increasingly necessary for Japanese documents to be read in various countries. Thus, a combined system comprising a Japanese document reader which serves as Japanese entry means and an automatic translator which translates Japanese into a foreign language has been developed.
Figure 1 is a block diagram of a conventional Japanese document reader. This prior art consists of .~
3~
an image entry unit 1, an-image-of-a-document storage unit (image memory) 2, a character segmentation unit 3, a character recognition unit 4, a Japanese document storage unit 5, a revising or correcting unit 6 and a display unit 7.
A Japanese document is read as an-image-of-a-document by an OCR (optical character reader) of the image entry unit 1 and the-image-of-the-document is then stored in the-image-of-the-document storage unit 2.
Next, the character segmentation unit 3 reads the image of the document from the-image-of-the-document storage unit 2 and segregates characters from the image of the document in sequence. The character recognition unit 4 performs a character recognition process on each of the character segmentations. Data on each of recognized characters is stored in the Japanese document storage unit 5. The display unit 7 displays the Japanese document subjected to the recognition process which has been stored in the Japanese document storage unit 5.
The character recognition rate of the character recognition unit 4 cannot be 100%. Therefore, it is necessary to revise a document that has been partly misrecognized. The user compares the character-~33~
recognized document displayed by the display unit 7,with the original document (namely, the document written or printed on a sheet of paper) to search for misrecognized characters. If he finds any, he revises them by using the revising unit 6. For example, the revising work may be performed by deleting a misrecognized character, entering the Japanese rendering or reading (kana: Japanese syllabry) of an image character corresponding to the misrecognized character, performing kana-to-kanji ( Chinese character) conversion on the kana to obtain a correct character, and again storing the obtained character data in the Japanese document storage unit 5.
Figure 2 is a block diagram of a conventional automatic translator comprising a data entry unit 8, a translating unit 9, a translated document storage unit 10 and a display unit 7'.
Japanese-document data entered via the data entry unit 8 is read into the translating unit 9 for translation into a language (for example, English) other than Japanese. The translated document is stored in the translated document storage unit 10 and displayed on the display unit 7' as needed.
The Japanese-document reader of Figure 1 and the automatic translator of Figure 2 constitute separate systems. Since such separate systems have poor operability, it has been proposed to integrate them.
Figure 3 is a block diagram of a conventional integrated Japanese-document reading and translating system. In Figure 3, like reference characters are used to designate blocks corresponding to those in Figures 1 and 2.
In the system of Figure 3, first, a Japanese document is stored in the Japanese document storage unit 5 via the image entry unit 1, the-image-of-the-document storage unit 2, the character segmentation unit 3 and the character recognition unit 4, and is revised by the revision unit 6 while it is being displayed on the display unit 7.
Next, the correct Japanese document stored in the Japanese document storage unit 5 is entered directly into the translator 9 for translation into a foreign language, as in the translator of Figure 2. The obtained foreign language document is then stored in the translation document storage unit 10 and displayed by the display unit 7 as needed. That is, the display unit 7 also serves as the display unit 7' of Figure 2.
In this way the Japanese-document reading and translating system of Figure 3 can perform a combined process of reading a Japanese document written or printed on a sheet of paper and translating it to a foreign language.
However, the conventional system of Figure 3 has the following problems.
First, the user has to visually compare a displayed document with the original Japanese document prior to image entry (a document written or printed on a sheet of paper) in order to search for and revise misrecognized characters. Thus, it is very difficult for a foreigner (a non-Japanese) whose knowledge of Japanese is poor to be sure of correctly revising the results of recognition.
Second, since it is difficult to be sure that the recognition results have been correctly revised, subsequent translation work may not be executed correctly.
As described above, heretofore, a system combining a Japanese-document reader and an automatic translator which is easy for foreigners to operate has not yet been constructed.
Summary of the Invention
It is therefore an object of the present invention to provide a system combining a Japanese-document reader and an automatic translator which permits even persons whose knowledge of Japanese is poor to search for and revise misrecognized characters in a short time and without any difficulty.
The present invention provides a document revising apparatus for use with a document reading and translating system for performing character recognition of an-image-of-a-document to make a recognized document and translating the recognized document, comprising:
character recognition means for entering a document written in a first language as an image of a document, segregating characters from said image of the document and performing character recognition on each character segmentation to produce a recognized document;
translating means for translating said document in said first language to a second language to make a translated document;
image-to-character-position-correspondence-table producing and displaying means for producing and displaying an image-to-character-position-correspondence-table in which a correspondence is established between said image document, said recognized document and said translated document;
original-document-to-translated-document correspondence relationship storing means for storing a correspondence relationship between an original document and a translated document;
candidate character producing means for producing candidate characters used for revising misrecognized characters; and
document revising means for carrying out the following processes:
a first process allowing a user to specify a misrecognized portion in said translated document displayed by said image-to-character-position-correspondence-table producing and displaying means;
a second process referring to said original-document-to-translated-document correspondence relationship storing means to extract portions of said image document and said recognized document which correspond to said misrecognized portion specified and causing said image-to-character-position-correspondence-table producing and displaying means to display said portions extracted explicitly;
a third process referring to said candidate character producing means to extract candidate characters for said misrecognized portion in said recognized document and causing said image-to-character-position-correspondence-table producing and displaying means to display said candidate characters as requested by the user;
a fourth process enabling the user to select arbitrary characters from said candidate characters displayed and replacing said misrecognized portion in said recognized document with selected candidate characters;
a fifth process causing said translating means to retranslate a new document in which said misrecognized portion is replaced with said selected candidate characters to thereby produce a new translated document and causing said image-to-character-position-correspondence-table producing and displaying means to display said new translated document; and
a control process for repeating said first through said fifth processes.
According to the configuration of the present invention, the user can search for misrecognized characters on the basis of the translation result, not on the basis of the character recognition result of the document reader. Thus, even foreigners who have little knowledge of the original language can carry out the revising work without any difficulty.
The work of revising the recognized document can be carried out not by kana-to-kanji conversion using keyboard entry, but by selecting a correct character from displayed candidate characters on the basis of visual matching with the-image-of-the-document. Thus, even persons who have little knowledge of the original language can carry out the revising work with ease.
Brief Description of the Drawings
Further objects and advantages of the present invention will be apparent from the following description of a preferred embodiment with reference to the accompanying drawings, in which:
Figure 1 is a block diagram of a conventional Japanese document reader;
Figure 2 is a block diagram of a conventional automatic translator;
Figure 3 is a block diagram of a conventional combined Japanese document reading and translating system;
Figure 4 is a basic block diagram of a Japanese document reading and translating system embodying the present invention;
Figure 5 is a detailed block diagram of the system of Figure 4;
Figure 6 is a flowchart for explaining the operation of the present invention; and
Figure 7 is a diagram illustrating an example of an image-to-character-position-correspondence-table of the present invention.
Detailed Description of the Preferred Embodiment
Explanation of the principle of the invention
Figure 4 is a basic block diagram of a Japanese document reading and translating system embodying the present invention. In Figure 4, like reference numerals are used to designate blocks corresponding to those in Figure 3.
The embodiment of the present invention includes a character recognition unit 14 for segregating characters from an entered image document and performing character recognition on character segmentations, a translator unit 15 for translating a document which has been subjected to the recognition process to a foreign language, a display unit 7 for displaying a document and a revising unit 6 for revising misrecognized characters in the recognized Japanese document. These units have the same functions as those in Figure 3.
In addition to the above configuration, the present embodiment contains the following distinctive units. First, an image-to-character-position-correspondence-table producing unit 11 is provided. This unit produces a set of documents comprising an-image-of-a-document, a recognized document and a translated document. Second, a candidate character making unit 12 is provided which makes candidate characters for backing up the revision of misrecognized characters. Third, a Japanese-document-to-translated-document correspondence table 13 is provided, which stores the correspondence between a Japanese document before translation and a translated document in the form of a table. In the basic configuration described above, the image-to-character-position-correspondence-table produced by the image-to-character-position-correspondence-table producing unit 11 is displayed by the display unit 7 so that the misrecognized characters can be revised.
When the user specifies a misrecognized portion on the translated document of the image-to-character-position-correspondence-table displayed on the display unit 7 by using the revising unit 6, the revising unit 6 refers to the Japanese-document-to-translated-document correspondence table 13 to extract from the image of the document and the recognized document portions corresponding to the specified portion and informs the image-to-character-position-correspondence-table producing unit 11 of information about the corresponding portions. The portion of the translated document specified by the user and the corresponding portions of the-image-of-the-document and the recognized document are thereby displayed explicitly on the display unit 7. That is, these portions are, for example, blinked or reversed on the display unit.
Subsequently, when prompted by the user, the revision unit 6 refers to the candidate character producing unit 12 to extract candidate characters for the misrecognized portion in the recognized document and informs the image-to-character-position-correspondence-table producing unit 11 of the candidate characters. The candidate characters are thereby displayed on the display unit 7.
When the user selects arbitrary characters from the candidate characters displayed by the display unit 7 by using the function of the revision unit 6, the misrecognized portion in the recognized document displayed by the display unit 7 is replaced with the selected candidate characters and the document revision information is transmitted to the character recognition unit 14. The character recognition unit 14 replaces the misrecognized portion in the recognized document with the selected candidate characters to prepare a new recognized document which is stored again, and sends the new document to the translating unit 15. The translating unit 15 retranslates the portions corresponding to the selected candidate characters and sends a new translation document to the image-to-character-position-correspondence-table making unit 11 for display on the display unit 7.
Specific Embodiment
Figure 5 illustrates more specifically the configuration of the system of Figure 4. Figure 6 is a flowchart of the operation of the system of Figure 5 and Figure 7 illustrates an example of a displayed document. In Figure 5, like reference numerals are used to designate blocks corresponding to those in Figures 3 and 4.
In the configuration of Figure 5, which is based on the configuration of Figure 4, the character recognition unit 14 consists of an image entry unit 1, an-image-of-a-document storage unit 2, a character segmentation unit 3, a character recognition unit 4 and a Japanese-document storage unit 5. The translating unit 15 is comprised of a translation unit 9 and a document translation storage unit 10.
The Japanese-document-to-translated-document correspondence table 13 stores a set comprising a recognized document (in Japanese) and a corresponding translated document (translated to English, for example) in the form of a table in character-recognition units (for example, on a clause-by-clause basis).
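One way to picture such a table is as a list of clause pairs that can be searched from the translated side. The sketch below is only an illustration: the `ClausePair` type, the `find_clauses` helper, the sample clauses and the `clause_boxes` positions are all hypothetical and are not taken from the patent figures.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ClausePair:
    recognized: str   # clause as produced by character recognition
    translated: str   # translation of that clause


def find_clauses(table: List[ClausePair], selection: str) -> List[int]:
    """Return indices of table rows whose translation contains the user's selection."""
    return [i for i, pair in enumerate(table) if selection in pair.translated]


# Hypothetical sample data, not taken from the patent figures.
table = [
    ClausePair("本論文は", "This paper"),
    ClausePair("枝術の", "of branch art"),
    ClausePair("動向を述べる", "describes trends"),
]
# Hypothetical clause positions (x, y, width, height) reported by segmentation.
clause_boxes: List[Tuple[int, int, int, int]] = [
    (0, 0, 128, 32),
    (128, 0, 96, 32),
    (224, 0, 192, 32),
]

hits = find_clauses(table, "branch art")
for i in hits:
    print(table[i].recognized, clause_boxes[i])
```

Because the rows are aligned clause by clause, the same index locates the recognized clause and, through the position information, the matching region of the image.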
The candidate character producing unit 12 extracts characters from the character recognition unit 4 to prepare a table of candidate characters for misrecognized characters.
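The text does not say how the candidates are ranked. As a minimal sketch, assuming the recognizer exposes a score for each character hypothesis (an assumption, as are the function names and the sample scores), a candidate table could be built by keeping the top few hypotheses per character position:

```python
import heapq


def top_candidates(scores, k=3):
    """Return the k highest-scoring characters, best first."""
    best = heapq.nlargest(k, scores.items(), key=lambda item: item[1])
    return [ch for ch, _ in best]


def build_candidate_table(per_character_scores, k=3):
    """Map each character position to its ranked candidate list."""
    return {
        pos: top_candidates(scores, k)
        for pos, scores in enumerate(per_character_scores)
    }


# Hypothetical recognizer scores for two character positions.
per_character_scores = [
    {"枝": 0.41, "技": 0.39, "抜": 0.12, "杖": 0.08},
    {"術": 0.93, "衛": 0.04, "街": 0.03},
]
candidate_table = build_candidate_table(per_character_scores)
print(candidate_table)
```

The first entry of each list is the character the recognizer originally chose; the remaining entries are the alternatives a user could pick from.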
The operation of the system of Figure 5 will be described specifically with reference to Figures 6 and 7. In the following description, steps 1 through 15 correspond to steps 1 through 15 of the flowchart of Figure 6.
First, a Japanese document, such as a technological treatise, written or printed on a sheet of paper is read as an-image-of-a-document by the image entry unit 1 and the-image-of-the-document is stored in the-image-of-the-document storage unit 2 (step 1). Next, the character segmentation unit 3 segregates characters from the-image-of-the-document read from the-image-of-the-document storage unit 2 (step 2). The character recognition unit 4 performs character recognition on each of the segregated characters and stores the recognized Japanese document in the Japanese-document storage unit 5 (step 3). Subsequently, the character-recognized document is read into the translating unit 9 for translation into a foreign language (non-Japanese) and the resulting translated document (a translation from Japanese) is stored in the translated document storage unit 10 (step 4).
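Steps 2 and 3 can be pictured with a toy example. The sketch below assumes a tiny binary image and splits it at blank pixel columns, then matches each cut-out glyph against exact templates. Both techniques are stand-ins chosen for illustration; the source does not specify the segmentation or recognition method, and all names and sample data here are hypothetical.

```python
def segment_columns(image):
    """Step 2 (sketch): split a binary image into character boxes at blank columns.

    Returns (start, end) column ranges, end exclusive.
    """
    width = len(image[0])
    has_ink = [any(row[x] == "#" for row in image) for x in range(width)]
    boxes, start = [], None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a character begins
        elif not ink and start is not None:
            boxes.append((start, x))       # a character ends at a blank column
            start = None
    if start is not None:
        boxes.append((start, width))       # character touching the right edge
    return boxes


def crop(image, box):
    """Cut the pixel columns of one character box out of every row."""
    start, end = box
    return tuple(row[start:end] for row in image)


def recognize(glyph, templates):
    """Step 3 (sketch): exact template match; '?' marks an unknown glyph."""
    return templates.get(glyph, "?")


# Hypothetical 3-row image containing two glyphs separated by blank columns.
image = [
    "##..#.#",
    "#...###",
    "##..#.#",
]
templates = {
    ("##", "#.", "##"): "C",
    ("#.#", "###", "#.#"): "H",
}

boxes = segment_columns(image)
text = "".join(recognize(crop(image, box), templates) for box in boxes)
print(boxes, text)
```

The boxes returned by segmentation are the kind of position information that later lets a recognized character be traced back to its place in the image.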
When the misrecognized characters are being revised, the image-to-character-position-correspondence-table preparing unit 11 prepares an image-to-character-position-correspondence-table containing a set comprising the-image-of-the-document, the recognized document and the translated document.
The image-to-character-position-correspondence-table preparing unit 11 then reads the-image-of-the-document from the-image-of-the-document storage unit 2, the recognized document from the Japanese-document storage unit 5 and the translated document from the translated document storage unit 10 on the basis of position information from the character segmentation unit 3, thereby producing the image-to-character-position-correspondence-table (step 5). The image-to-character-position-correspondence-table prepared in this way is displayed on the display unit 7 (step 5).
Figure 7A illustrates one example of a displayed image (an-image-to-character-position-correspondence-table) on the screen of the display unit 7. In this example, the first line indicates an-image-of-a-document, the second line indicates a recognized document and the third line indicates a translated document. The image-of-the-document and the character-recognized document are each separated into, for example, clauses and the clauses of both documents are displayed in one-to-one correspondence.
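The three-line layout can be approximated in plain text. The sketch below is an assumption-laden illustration: it stands in for image clauses with placeholder labels, uses hypothetical sample clauses, and pads each column by display width so that full-width Japanese characters line up with the other rows.

```python
import unicodedata


def display_width(text):
    """Count wide and full-width characters as two terminal columns."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


def pad(text, width):
    """Right-pad text with spaces up to the given display width."""
    return text + " " * (width - display_width(text))


def render_rows(clauses):
    """clauses: list of (image_label, recognized, translated) triples.

    Returns three strings: the image row, the recognized row and the
    translated row, with each clause in its own aligned column.
    """
    widths = [max(display_width(cell) for cell in clause) for clause in clauses]
    rows = []
    for row_index in range(3):
        cells = [pad(clause[row_index], w) for clause, w in zip(clauses, widths)]
        rows.append(" | ".join(cells).rstrip())
    return rows


# Hypothetical clauses; "[img N]" stands in for a cropped image region.
clauses = [
    ("[img 0]", "枝術の", "of branch art"),
    ("[img 1]", "論文", "treatise"),
]
rows = render_rows(clauses)
for row in rows:
    print(row)
```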
The user carries out revising work while watching the display screen. In this case, the user searches the translated document for portions that do not seem to make sense and specifies those portions by using a device (for example, a mouse input device not shown) attached to the revision unit 6 (step 7). In Figure 7A, "branch art" is specified.
The revision unit 6 refers to the Japanese-document-to-translated-document correspondence table 13 to extract a character from the recognized document that corresponds to the specified portion. As a result, "枝術" ("branch art" in English) is extracted as the corresponding character in the recognized document. Then, the revision unit 6 extracts the corresponding character "技術" ("technological" in English) in the-image-of-the-document using the above position information. The revision unit 6 informs the image-to-character-position-correspondence-table making unit 11 of information about these extracted portions (step 8).
As a result, as illustrated in Figure 7B, the specified portion that seems to have been misrecognized is displayed explicitly by the display unit 7. The explicit display is performed by blinking or reversing (i.e., reversing white and black) corresponding portions in the documents (step 9).
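On a text terminal, reversing can be imitated with the ANSI "reverse video" escape sequence. This is only an illustrative sketch: the row names, the sample clauses and the `highlight_rows` helper are hypothetical, and the display unit described here is not assumed to be a terminal.

```python
REVERSE = "\x1b[7m"  # ANSI SGR 7: swap foreground and background
RESET = "\x1b[0m"    # ANSI SGR 0: back to normal rendering


def highlight(clauses, index):
    """Wrap the clause at `index` in reverse video; leave the others untouched."""
    return [
        f"{REVERSE}{clause}{RESET}" if i == index else clause
        for i, clause in enumerate(clauses)
    ]


def highlight_rows(rows, index):
    """Apply the same highlight to the corresponding clause in every row."""
    return {name: highlight(clauses, index) for name, clauses in rows.items()}


# Hypothetical rows of the displayed table, aligned clause by clause.
rows = {
    "image": ["[img 0]", "[img 1]"],
    "recognized": ["本論文は", "枝術の"],
    "translated": ["This paper", "of branch art"],
}
shown = highlight_rows(rows, 1)
for name, clauses in shown.items():
    print(name, " ".join(clauses))
```

Highlighting by clause index keeps the image, recognized and translated portions visually tied together.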
Subsequently, the user makes a comparison between the-image-of-the-document and the recognized document to confirm that the recognition result is wrong.
Then, the user enters a predetermined command (for example, through a click of the mouse) in step 10. As a result, the revision unit 6 refers to the candidate character making unit 12 to extract candidate characters for the misrecognized portion in the recognized document, and informs the image-to-character-position-correspondence-table producing unit 11 of these candidate characters (step 11). The candidate characters for the misrecognized portion are thereby displayed on the display unit 7 (step 12).
When, in step 13, the user selects arbitrary characters from among the candidate characters displayed on the display unit 7 through clicks of the mouse as illustrated in Figure 7C, the misrecognized portion of the recognized document is replaced with the selected candidate characters and the revision unit 6 replaces the corresponding portion in the Japanese document storage unit 5 with the candidate characters (step 14).
The contents of the recognized document which has been subjected to replacement in that way are sent to the translating unit 9. The translating unit 9 retranslates the portion corresponding to the selected candidate characters and sends the newly translated document to the image-to-character-position-correspondence-table producing unit 11 via the translated document storage unit 10 (step 15).
The image-to-character-position-correspondence-table producing unit 11 produces a new image-to-character-position-correspondence-table on the basis of the correct translated document sent from the translating unit and displays it as illustrated in Figure 7D (step 15 → step 5). Finally, the user terminates the revision work through a click of the mouse (step 6).
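The cycle of steps 5 through 15 can be sketched as a simple loop driven by scripted user actions. Everything here is a hypothetical stand-in: the glossary lookup replaces the translating unit, the candidate dictionary replaces the candidate character making unit, and the action tuples replace mouse input.

```python
def run_revision_loop(recognized, translate, candidates_for, user_actions):
    """Repeat the display/revise/retranslate cycle until the actions run out.

    user_actions yields (clause_index, char_index, pick) tuples, where pick
    is an index into the candidate list shown for that character.
    """
    recognized = list(recognized)                    # work on a copy
    translated = [translate(c) for c in recognized]  # initial translation (step 4)
    history = [list(translated)]                     # what was displayed (step 5)
    for clause_index, char_index, pick in user_actions:       # steps 7 to 10
        clause = recognized[clause_index]
        options = candidates_for(clause[char_index])          # steps 11 and 12
        chosen = options[pick]                                 # step 13
        revised = clause[:char_index] + chosen + clause[char_index + 1:]
        recognized[clause_index] = revised                     # step 14
        translated[clause_index] = translate(revised)          # step 15
        history.append(list(translated))                       # back to step 5
    return recognized, translated, history                     # step 6: user stops


# Hypothetical glossary and candidates, for illustration only.
glossary = {"枝術の": "of branch art", "技術の": "of technology", "論文": "treatise"}
candidates = {"枝": ["枝", "技", "抜"]}

recognized, translated, history = run_revision_loop(
    ["枝術の", "論文"],
    translate=lambda clause: glossary.get(clause, "<?>"),
    candidates_for=lambda ch: candidates.get(ch, [ch]),
    user_actions=[(0, 0, 1)],
)
print(translated)
```

Only the clause that was revised is retranslated on each pass, which mirrors the retranslation of "the portion corresponding to the selected candidate characters".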
As described above, the user can find misrecognized characters by searching the translated document rather than the recognized document produced by the Japanese document reader. Thus, even foreigners with little knowledge of Japanese can carry out revision work without difficulty.
In addition, the work of revising a recognized document can be carried out not by kana-to-kanji conversion using keyboard entry but by selecting a correct character from displayed candidate characters through visual matching with an-image-of-a-document. Thus, even persons with little knowledge of Japanese can carry out the revision work with ease.
Claims (8)
1. A document revising apparatus for use with a document reading and translating system for performing character recognition of an-image-of-a-document to produce a recognized document and translating the character-recognized document, comprising:
character recognition means for entering a document written in a first language as an-image-of-a-document, segregating characters from said image document and performing character recognition on each cut character to produce a recognized document;
translating process means for translating said document in said first language to a second language to make a translated document;
image-to-character-position-correspondence-table producing and displaying means for producing and displaying an image-to-character-position-correspondence-table in which a correspondence is established between said image document, said recognized document and said translated document;
original-document-to-translated-document correspondence relationship storing means for storing a correspondence relationship between an original document and a translated document;
candidate character producing means for producing candidate characters used for revising misrecognized characters; and
document revising means for carrying out the following processes:
a first process allowing a user to specify a misrecognized portion in said image-to-character-position-correspondence-table displayed by said image-to-character-position-correspondence-table producing and displaying means;
a second process referring to said original-document-to-translated-document correspondence relationship storing means to extract portions of said image-of-the-document and said recognized document which correspond to said misrecognized portion specified and causing said image-to-character-position-correspondence-table producing and displaying means to display said portions extracted explicitly;
a third process referring to said candidate character producing means to extract candidate characters for said misrecognized portion in said recognized document as requested by the user and causing said image-to-character-position-correspondence-table producing and displaying means to display said candidate characters;
a fourth process causing the user to select arbitrary characters from said candidate characters displayed and replacing said misrecognized portion in said recognized document with selected candidate characters;
a fifth process causing said translating means to retranslate a new document in which said misrecognized portion is replaced with said selected candidate characters to thereby produce a new translated document and causing said image-to-character-position-correspondence-table producing and displaying means to display said new translated document; and
a control process for repeating said first through said fifth processes.
2. The document revising apparatus according to claim 1, in which said character recognition means comprises image entry means for entering said document in said first language as an-image-of-a-document, image document storage means for storing said image document, character segmentation means for segregating each character from said stored image document, character recognition means for performing character recognition on each of said characters cut from said image document, and document storage means for storing each of said recognized characters.
3. The document revising apparatus according to claim 2, in which said translation means comprises translating means and a translated document storage means.
4. The document revising apparatus according to claim 3, in which said image-to-character-position-correspondence-table producing and displaying means reads said image-of-the-document from said image-of-the-document storage means, said recognized document from said document storage means and said translated document from said translated document storage means on the basis of position information read from said character segmentation means, thereby producing said displayed document.
5. The document revising apparatus according to claim 2, in which said candidate character producing means refers to said character recognition means to produce said candidate characters.
6. The document revising apparatus according to claim 2, in which said image entry means comprises an optical character reader for entering a document written or printed on a sheet of paper as an-image-of-a-document.
7. The document revising apparatus according to claim 1, in which said original-document-to-translated-document correspondence relationship storage means stores a correspondence relationship between said original document and said translated document in the form of a table.
8. The document revising apparatus according to claim 1, in which said first language is Japanese.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP1342465A JP2758952B2 (en) | 1989-12-28 | 1989-12-28 | Display Method for Japanese Document Reading and Translation System at Correction |
JP1-342465 | 1989-12-28 |
Publications (2)
Publication Number | Publication Date |
---|---|
CA2033411A1 CA2033411A1 (en) | 1991-06-29 |
CA2033411C true CA2033411C (en) | 1996-04-30 |
Family
ID=18353951
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA002033411A Expired - Fee Related CA2033411C (en) | 1989-12-28 | 1990-12-28 | Document revising system for use with document reading and translating system |
Country Status (8)
Country | Link |
---|---|
US (1) | US5222160A (en) |
EP (1) | EP0435349B1 (en) |
JP (1) | JP2758952B2 (en) |
KR (1) | KR940000028B1 (en) |
AU (1) | AU642945B2 (en) |
CA (1) | CA2033411C (en) |
DE (1) | DE69029251T2 (en) |
ES (1) | ES2099082T3 (en) |
Families Citing this family (95)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0493888A (en) * | 1990-08-03 | 1992-03-26 | Canon Inc | Pattern processing method |
JP2818052B2 (en) * | 1991-05-21 | 1998-10-30 | シャープ株式会社 | Optical character reader |
US5434971A (en) * | 1991-06-28 | 1995-07-18 | Digital Equipment Corp. | System for constructing a table data structure based on an associated configuration data structure and loading it with chemical sample physical data |
US5446575A (en) * | 1991-06-28 | 1995-08-29 | Digital Equipment Corp. | System for constructing and loading a table data structure based on an associated configuration data |
US5926565A (en) * | 1991-10-28 | 1999-07-20 | Froessl; Horst | Computer method for processing records with images and multiple fonts |
CA2078423C (en) * | 1991-11-19 | 1997-01-14 | Per-Kristian Halvorsen | Method and apparatus for supplementing significant portions of a document selected without document image decoding with retrieved information |
JPH05314175A (en) * | 1992-05-13 | 1993-11-26 | Ricoh Co Ltd | Parallel translation image forming device |
US5608622A (en) * | 1992-09-11 | 1997-03-04 | Lucent Technologies Inc. | System for analyzing translations |
US5987170A (en) * | 1992-09-28 | 1999-11-16 | Matsushita Electric Industrial Co., Ltd. | Character recognition machine utilizing language processing |
US6041141A (en) * | 1992-09-28 | 2000-03-21 | Matsushita Electric Industrial Co., Ltd. | Character recognition machine utilizing language processing |
JPH07114558A (en) * | 1993-10-19 | 1995-05-02 | Fujitsu Ltd | Chinese character conversion correcting process system |
TW250558B (en) * | 1993-10-20 | 1995-07-01 | Yamaha Corp | Sheet music recognition device |
US6339767B1 (en) | 1997-06-02 | 2002-01-15 | Aurigin Systems, Inc. | Using hyperbolic trees to visualize data generated by patent-centric and group-oriented data processing |
US5696963A (en) * | 1993-11-19 | 1997-12-09 | Waverley Holdings, Inc. | System, method and computer program product for searching through an individual document and a group of documents |
US6963920B1 (en) | 1993-11-19 | 2005-11-08 | Rose Blush Software Llc | Intellectual asset protocol for defining data exchange rules and formats for universal intellectual asset documents, and systems, methods, and computer program products related to same |
US5991751A (en) * | 1997-06-02 | 1999-11-23 | Smartpatents, Inc. | System, method, and computer program product for patent-centric and group-oriented data processing |
US5623681A (en) * | 1993-11-19 | 1997-04-22 | Waverley Holdings, Inc. | Method and apparatus for synchronizing, displaying and manipulating text and image documents |
US5623679A (en) * | 1993-11-19 | 1997-04-22 | Waverley Holdings, Inc. | System and method for creating and manipulating notes each containing multiple sub-notes, and linking the sub-notes to portions of data objects |
US6877137B1 (en) * | 1998-04-09 | 2005-04-05 | Rose Blush Software Llc | System, method and computer program product for mediating notes and note sub-notes linked or otherwise associated with stored or networked web pages |
US5806079A (en) * | 1993-11-19 | 1998-09-08 | Smartpatents, Inc. | System, method, and computer program product for using intelligent notes to organize, link, and manipulate disparate data objects |
US5799325A (en) * | 1993-11-19 | 1998-08-25 | Smartpatents, Inc. | System, method, and computer program product for generating equivalent text files |
JP3453422B2 (en) * | 1994-02-10 | 2003-10-06 | キヤノン株式会社 | Registration method of character pattern in user dictionary and character recognition device having the user dictionary |
US5822720A (en) | 1994-02-16 | 1998-10-13 | Sentius Corporation | System amd method for linking streams of multimedia data for reference material for display |
CA2138830A1 (en) * | 1994-03-03 | 1995-09-04 | Jamie Joanne Marschner | Real-time administration-translation arrangement |
US5812818A (en) * | 1994-11-17 | 1998-09-22 | Transfax Inc. | Apparatus and method for translating facsimile text transmission |
JPH08185393A (en) * | 1994-12-28 | 1996-07-16 | Canon Inc | Reexecution system and its method |
JPH0981566A (en) * | 1995-09-08 | 1997-03-28 | Toshiba Corp | Method and device for translation |
US6115482A (en) * | 1996-02-13 | 2000-09-05 | Ascent Technology, Inc. | Voice-output reading system with gesture-based navigation |
EP0867815A3 (en) * | 1997-03-26 | 2000-05-31 | Kabushiki Kaisha Toshiba | Translation service providing method and translation service system |
GB2328770A (en) * | 1997-08-30 | 1999-03-03 | E Lead Electronic Co Ltd | Camera type electronic language translation system |
US6092074A (en) | 1998-02-10 | 2000-07-18 | Connect Innovations, Inc. | Dynamic insertion and updating of hypertext links for internet servers |
WO2000013102A1 (en) * | 1998-08-31 | 2000-03-09 | Sony Corporation | Natural language processing device and method |
US6275978B1 (en) * | 1998-11-04 | 2001-08-14 | Agilent Technologies, Inc. | System and method for term localization differentiation using a resource bundle generator |
US7716060B2 (en) | 1999-03-02 | 2010-05-11 | Germeraad Paul B | Patent-related tools and methodology for use in the merger and acquisition process |
US7966328B2 (en) | 1999-03-02 | 2011-06-21 | Rose Blush Software Llc | Patent-related tools and methodology for use in research and development projects |
JP2001117828A (en) * | 1999-10-14 | 2001-04-27 | Fujitsu Ltd | Electronic device and storage medium |
US6883168B1 (en) | 2000-06-21 | 2005-04-19 | Microsoft Corporation | Methods, systems, architectures and data structures for delivering software via a network |
US7000230B1 (en) | 2000-06-21 | 2006-02-14 | Microsoft Corporation | Network-based software extensions |
US6874143B1 (en) | 2000-06-21 | 2005-03-29 | Microsoft Corporation | Architectures for and methods of providing network-based software extensions |
US7346848B1 (en) | 2000-06-21 | 2008-03-18 | Microsoft Corporation | Single window navigation methods and systems |
US6948135B1 (en) | 2000-06-21 | 2005-09-20 | Microsoft Corporation | Method and systems of providing information to computer users |
US7155667B1 (en) | 2000-06-21 | 2006-12-26 | Microsoft Corporation | User interface for integrated spreadsheets and word processing tables |
US7191394B1 (en) | 2000-06-21 | 2007-03-13 | Microsoft Corporation | Authoring arbitrary XML documents using DHTML and XSLT |
AU2001264895A1 (en) * | 2000-06-21 | 2002-01-02 | Microsoft Corporation | System and method for integrating spreadsheets and word processing tables |
US7624356B1 (en) | 2000-06-21 | 2009-11-24 | Microsoft Corporation | Task-sensitive methods and systems for displaying command sets |
GB0031596D0 (en) * | 2000-12-22 | 2001-02-07 | Barbara Justin S | A system and method for improving accuracy of signal interpretation |
US7050979B2 (en) * | 2001-01-24 | 2006-05-23 | Matsushita Electric Industrial Co., Ltd. | Apparatus and method for converting a spoken language to a second language |
US7130861B2 (en) | 2001-08-16 | 2006-10-31 | Sentius International Corporation | Automated creation and delivery of database content |
US7007303B2 (en) * | 2001-10-10 | 2006-02-28 | Xerox Corporation | Systems and methods for authenticating documents |
US20030101044A1 (en) * | 2001-11-28 | 2003-05-29 | Mark Krasnov | Word, expression, and sentence translation management tool |
US7415672B1 (en) | 2003-03-24 | 2008-08-19 | Microsoft Corporation | System and method for designing electronic forms |
US7370066B1 (en) | 2003-03-24 | 2008-05-06 | Microsoft Corporation | System and method for offline editing of data files |
US7275216B2 (en) | 2003-03-24 | 2007-09-25 | Microsoft Corporation | System and method for designing electronic forms and hierarchical schemas |
US7913159B2 (en) | 2003-03-28 | 2011-03-22 | Microsoft Corporation | System and method for real-time validation of structured data files |
US7296017B2 (en) | 2003-03-28 | 2007-11-13 | Microsoft Corporation | Validation of XML data files |
US7516145B2 (en) | 2003-03-31 | 2009-04-07 | Microsoft Corporation | System and method for incrementally transforming and rendering hierarchical data files |
US7451392B1 (en) | 2003-06-30 | 2008-11-11 | Microsoft Corporation | Rendering an HTML electronic form by applying XSLT to XML using a solution |
US7406660B1 (en) | 2003-08-01 | 2008-07-29 | Microsoft Corporation | Mapping between structured data and a visual surface |
US7581177B1 (en) | 2003-08-01 | 2009-08-25 | Microsoft Corporation | Conversion of structured documents |
US7334187B1 (en) | 2003-08-06 | 2008-02-19 | Microsoft Corporation | Electronic form aggregation |
US8819072B1 (en) | 2004-02-02 | 2014-08-26 | Microsoft Corporation | Promoting data from structured data files |
US7430711B2 (en) | 2004-02-17 | 2008-09-30 | Microsoft Corporation | Systems and methods for editing XML documents |
US7496837B1 (en) | 2004-04-29 | 2009-02-24 | Microsoft Corporation | Structural editing with schema awareness |
US7568101B1 (en) | 2004-05-13 | 2009-07-28 | Microsoft Corporation | Digital signatures with an embedded view |
US7774620B1 (en) | 2004-05-27 | 2010-08-10 | Microsoft Corporation | Executing applications at appropriate trust levels |
US7516399B2 (en) | 2004-09-30 | 2009-04-07 | Microsoft Corporation | Structured-document path-language expression methods and systems |
US7712022B2 (en) | 2004-11-15 | 2010-05-04 | Microsoft Corporation | Mutually exclusive options in electronic forms |
US7584417B2 (en) | 2004-11-15 | 2009-09-01 | Microsoft Corporation | Role-dependent action for an electronic form |
US7721190B2 (en) | 2004-11-16 | 2010-05-18 | Microsoft Corporation | Methods and systems for server side form processing |
US7509353B2 (en) * | 2004-11-16 | 2009-03-24 | Microsoft Corporation | Methods and systems for exchanging and rendering forms |
US7904801B2 (en) | 2004-12-15 | 2011-03-08 | Microsoft Corporation | Recursive sections in electronic forms |
US7437376B2 (en) | 2004-12-20 | 2008-10-14 | Microsoft Corporation | Scalable object model |
US7937651B2 (en) | 2005-01-14 | 2011-05-03 | Microsoft Corporation | Structural editing operations for network forms |
US7725834B2 (en) | 2005-03-04 | 2010-05-25 | Microsoft Corporation | Designer-created aspect for an electronic form template |
JP2006251902A (en) * | 2005-03-08 | 2006-09-21 | Fuji Xerox Co Ltd | Device, program, and method for generating translation document image |
JP4428266B2 (en) * | 2005-03-22 | 2010-03-10 | 富士ゼロックス株式会社 | Translation apparatus and program |
JP2006276918A (en) * | 2005-03-25 | 2006-10-12 | Fuji Xerox Co Ltd | Translating device, translating method and program |
JP2006276915A (en) * | 2005-03-25 | 2006-10-12 | Fuji Xerox Co Ltd | Translating processing method, document translating device and program |
JP2006276911A (en) * | 2005-03-25 | 2006-10-12 | Fuji Xerox Co Ltd | Electronic equipment and program |
US7543228B2 (en) | 2005-06-27 | 2009-06-02 | Microsoft Corporation | Template for rendering an electronic form |
US8200975B2 (en) | 2005-06-29 | 2012-06-12 | Microsoft Corporation | Digital signatures for network forms |
US7613996B2 (en) | 2005-08-15 | 2009-11-03 | Microsoft Corporation | Enabling selection of an inferred schema part |
US8001459B2 (en) | 2005-12-05 | 2011-08-16 | Microsoft Corporation | Enabling electronic documents for limited-capability computing devices |
JP4539613B2 (en) * | 2006-06-28 | 2010-09-08 | 富士ゼロックス株式会社 | Image forming apparatus, image generation method, and program |
US9015029B2 (en) * | 2007-06-04 | 2015-04-21 | Sony Corporation | Camera dictionary based on object recognition |
JP2010055235A (en) * | 2008-08-27 | 2010-03-11 | Fujitsu Ltd | Translation support program and system thereof |
US8903709B2 (en) | 2012-05-17 | 2014-12-02 | Dell Products, Lp | Revising translated documents in a document storage system |
WO2014155742A1 (en) * | 2013-03-29 | 2014-10-02 | 楽天株式会社 | Information processing system, control method for information processing system, information processing device, control method for information processing device, information storage medium, and program |
RU2631168C2 (en) * | 2013-06-18 | 2017-09-19 | Общество с ограниченной ответственностью "Аби Девелопмент" | Methods and devices that convert images of documents to electronic documents using trie-data structures containing unparameterized symbols for definition of word and morphemes on document image |
WO2014204336A1 (en) * | 2013-06-18 | 2014-12-24 | Abbyy Development Llc | Methods and systems that build a hierarchically organized data structure containing standard feature symbols for conversion of document images to electronic documents |
JP6005285B2 (en) | 2013-07-08 | 2016-10-12 | 旭化成株式会社 | Modified resin and resin composition |
US9633048B1 (en) * | 2015-11-16 | 2017-04-25 | Adobe Systems Incorporated | Converting a text sentence to a series of images |
RU172882U1 (en) * | 2016-07-20 | 2017-07-28 | Общество с ограниченной ответственностью "Технологии управления переводом" | DEVICE FOR AUTOMATIC TEXT TRANSLATION |
US10235361B2 (en) * | 2017-02-15 | 2019-03-19 | International Business Machines Corporation | Context-aware translation memory to facilitate more accurate translation |
US20190266248A1 (en) * | 2018-02-26 | 2019-08-29 | Loveland Co., Ltd. | Webpage translation system, webpage translation apparatus, webpage providing apparatus, and webpage translation method |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS58101365A (en) * | 1981-12-14 | 1983-06-16 | Hitachi Ltd | Text display calibration system in machine translation system |
JPS59206985A (en) * | 1983-05-11 | 1984-11-22 | Hitachi Ltd | Mechanical translating system |
JPS6089275A (en) * | 1983-10-21 | 1985-05-20 | Hitachi Ltd | Translation system |
JPS63106866A (en) * | 1986-10-24 | 1988-05-11 | Toshiba Corp | Machine translation device |
US4916614A (en) * | 1986-11-25 | 1990-04-10 | Hitachi, Ltd. | Sentence translator using a thesaurus and a concept-organized co-occurrence dictionary to select from a plurality of equivalent target words |
JPS63143684A (en) * | 1986-12-05 | 1988-06-15 | Sharp Corp | Method for correcting recognized result in character recognizing device |
US4890230A (en) * | 1986-12-19 | 1989-12-26 | Electric Industry Co., Ltd. | Electronic dictionary |
US5022081A (en) * | 1987-10-01 | 1991-06-04 | Sharp Kabushiki Kaisha | Information recognition system |
JPH02121055A (en) * | 1988-10-31 | 1990-05-08 | Nec Corp | Braille word processor |
JPH02211580A (en) * | 1989-02-10 | 1990-08-22 | Seiko Epson Corp | Electronic translating machine |
1989
- 1989-12-28: JP application JP1342465A, published as JP2758952B2 (Expired - Fee Related)

1990
- 1990-12-28: DE application DE69029251T, published as DE69029251T2 (Expired - Fee Related)
- 1990-12-28: ES application ES90125772T, published as ES2099082T3 (Expired - Lifetime)
- 1990-12-28: EP application EP90125772A, published as EP0435349B1 (Expired - Lifetime)
- 1990-12-28: US application US07/635,077, published as US5222160A (Expired - Lifetime)
- 1990-12-28: AU application AU68559/90A, published as AU642945B2 (Ceased)
- 1990-12-28: KR application KR9022104A, published as KR940000028B1 (IP Right Cessation)
- 1990-12-28: CA application CA002033411A, published as CA2033411C (Expired - Fee Related)
Also Published As
Publication number | Publication date |
---|---|
ES2099082T3 (en) | 1997-05-16 |
CA2033411A1 (en) | 1991-06-29 |
KR910012986A (en) | 1991-08-08 |
JP2758952B2 (en) | 1998-05-28 |
AU642945B2 (en) | 1993-11-04 |
EP0435349A3 (en) | 1991-10-23 |
JPH03201166A (en) | 1991-09-03 |
EP0435349A2 (en) | 1991-07-03 |
DE69029251D1 (en) | 1997-01-09 |
AU6855990A (en) | 1991-07-04 |
KR940000028B1 (en) | 1994-01-05 |
EP0435349B1 (en) | 1996-11-27 |
DE69029251T2 (en) | 1997-03-27 |
US5222160A (en) | 1993-06-22 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CA2033411C (en) | Document revising system for use with document reading and translating system | |
US5687383A (en) | Translation rule learning scheme for machine translation | |
US5349368A (en) | Machine translation method and apparatus | |
US6002798A (en) | Method and apparatus for creating, indexing and viewing abstracted documents | |
US5978754A (en) | Translation display apparatus and method having designated windows on the display | |
US20030200505A1 (en) | Method and apparatus for overlaying a source text on an output text | |
US5321801A (en) | Document processor with character string conversion function | |
GB2332544A (en) | Automatic adaptive document help system | |
WO1994019755A1 (en) | Method and system for translating documents using translation handles | |
EP0687991A2 (en) | Information processing method and apparatus | |
US6535652B2 (en) | Image retrieval apparatus and method, and computer-readable memory therefor | |
CN109445900A (en) | The interpretation method and device shown for picture | |
JPH0696288A (en) | Character recognizing device and machine translation device | |
GB2259386A (en) | Text processing | |
JP3083171B2 (en) | Character recognition apparatus and method | |
JPH1063813A (en) | Method for managing image document and device therefor | |
JPH08202859A (en) | Electronic filing device and its method | |
JPH0388086A (en) | Document reader | |
JPH103516A (en) | Method and device for processing information | |
JPS5814249A (en) | Display control system for ruled line | |
JP2874815B2 (en) | Japanese character reader | |
JPH0636067A (en) | Character reader | |
JPH04293185A (en) | Filing device | |
JPH05151195A (en) | Kanji input device | |
JPH05290209A (en) | Character recognition method and its device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| EEER | Examination request | |
| MKLA | Lapsed | |