WO1980000105A1 - System for selecting graphic characters phonetically - Google Patents

System for selecting graphic characters phonetically

Info

Publication number
WO1980000105A1
Authority
WO
WIPO (PCT)
Prior art keywords
signals
pronunciation
characters
addresses
kanji
Prior art date
Application number
PCT/US1979/000418
Other languages
French (fr)
Inventor
S Nori
Original Assignee
Logan Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Logan Corp
Publication of WO1980000105A1

Classifications

    • B PERFORMING OPERATIONS; TRANSPORTING
    • B41 PRINTING; LINING MACHINES; TYPEWRITERS; STAMPS
    • B41J TYPEWRITERS; SELECTIVE PRINTING MECHANISMS, i.e. MECHANISMS PRINTING OTHERWISE THAN FROM A FORME; CORRECTION OF TYPOGRAPHICAL ERRORS
    • B41J3/00 Typewriters or selective printing or marking mechanisms characterised by the purpose for which they are constructed
    • B41J3/01 Typewriters or selective printing or marking mechanisms characterised by the purpose for which they are constructed for special character, e.g. for Chinese characters or barcodes
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B41 PRINTING; LINING MACHINES; TYPEWRITERS; STAMPS
    • B41B MACHINES OR ACCESSORIES FOR MAKING, SETTING, OR DISTRIBUTING TYPE; TYPE; PHOTOGRAPHIC OR PHOTOELECTRIC COMPOSING DEVICES
    • B41B27/00 Control, indicating, or safety devices or systems for composing machines of various kinds or types
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/22 Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of characters or indicia using display control signals derived from coded signals representing the characters or indicia, e.g. with a character-code memory
    • G09G5/24 Generation of individual character patterns
    • G09G5/246 Generation of individual character patterns of ideographic or arabic-like characters

Definitions

  • This invention relates to a system and apparatus for selecting graphical symbols and particularly to a system and apparatus for selecting desired ideograms from among a large number of ideograms by a system using alphabetic or phonetic symbols capable of being accommodated on a keyboard arranged to place substantially every key within a space that can be spanned by the two hands of a typical operator.
  • This invention is primarily concerned with the problem of selecting a desired ideogram by means of a keyboard that has a relatively small number of keys compared to the number of ideograms in the set from which the selection is to be made.
  • a typical electric typewriter for typing English and other languages that use the Roman alphabet includes 44 printing keys of which 26 are letter keys, ten are numeral keys from 0 to 9, and eight are keys for printing punctuation and other symbols.
  • the typewriter also has seven control keys, including a space bar. Three of the control keys operate the shifting mechanism to allow each printing key to control the printing of two different symbols, such as upper case and lower case letters and additional non-letter symbols.
  • All of the keys are within a distance capable of being spanned by the two hands of a typical operator and yet there is sufficient space between the keys to permit easy actuation of any desired key without inadvertent actuation of an adjacent key.
  • a professional typist can operate such a keyboard entirely by touch and without visual reference, primarily because it is possible for each of the typist's hands to remain very nearly in a fixed location. Key selection can be achieved almost entirely by finger movement.
  • the Roman alphabet is not the only one that can be accommodated by a typewriter; the Arabic, Cyrillic, Russian, Greek, and other symbolic alphabets and phonetic alphabets, and syllabaries, such as the Korean alphabet and the Japanese kana can all be incorporated in suitably adapted typewriters.
  • Korean language which also has the advantage of including an alphabet system consisting of 24 letters generally capable of being sounded as either an initial or final letter.
  • the pronunciation of a word made up of two or even more kanji is not always sufficient to identify the kanji.
  • a machine or an electronic system will be unable to select a specific kanji based only on phonetic input.
  • the ambiguity may be resolved as to one or more kanji of a compound word that comprises more than one kanji, but will not be resolved as to other kanji in the same word.
  • One object of this invention is to provide a selection system capable of selecting a proper symbol from a group of such symbols.
  • Another object of the invention is to provide a selection system using phonetic input means to obtain the initial information on which the selection is based.
  • Another object of the invention is to provide an automatic correlation system between a simplified keyboard and a memory in which a large number of symbols is stored, the number of such symbols being far greater than twice the number of keys on the keyboard, and each symbol being selectable with not more than one shift operation per symbol.
  • Still another object is to provide a printing and display system capable of printing and displaying ideograms rapidly and accurately.
  • a further object of the invention is to provide an accurate language translation system and method.
  • a further object is to provide a graphic symbol selection system adapted to be incorporated in a communication system.
  • a still further object of this invention is to select ideograms and print them with a dot-matrix printer.
  • Yet another object is to select ideograms and print them in a facsimile machine.
  • a further object is to select ideograms on the basis of plural styles of pronunciation thereof.
  • a still further object is to select ideograms on the basis of two linguistic characteristics thereof.
  • information specific to each of a number of graphic symbols is stored in a memory.
  • the memory could be a printing device and the information could be printing elements formed according to the graphic symbols, themselves.
  • Phototypesetters have images of the symbols they are capable of printing and each symbol is located at a specific address on a master sheet.
  • the information could be typing elements capable of printing by impact.
  • the information could be in the form of electrical or magnetic conditions at known addresses in an electrical or magnetic memory.
  • the information could be capable of controlling a dot-matrix or any other known form of printing device or graphic display device, such as a cathode ray tube (CRT) or a facsimile machine.
  • the electrically or magnetically stored information is not limited to the type of information that is capable of directly creating a graphic symbol but may simply be in the form of address information to actuate a phototypesetter, for example, and cause a symbol at that address on the master sheet to be printed.
  • the information is stored according to linguistic characteristics and can be retrieved by linguistic data.
  • a first linguistic characteristic is the pronunciation or phonetic representation of a character, which is the way people normally handle a spoken language.
  • pronunciation is not a sufficient basis for selection, but it still forms a useful first basis for dividing ideograms into groups.
  • the invention further contemplates the application of a second linguistic characteristic to at least the relatively small group of symbols obtained by the first selection process.
  • the second linguistic characteristic must be applicable in a unique manner to each of the members of the small group so that any specific symbol in that group can be chosen.
  • a second linguistic characteristic that can be used in the selection of Japanese ideograms is a different pronunciation of each of the ideograms. This is due to the fact that almost all Japanese ideograms have two styles of pronunciation, known as the on-yomi and the kun-yomi. These are frequently referred to simply as the "on" and "kun" styles of pronunciation. I have found that although there are many homonyms in the "on" style and many homonyms in the "kun" style, any ideogram recorded in an electrical memory can be uniquely selected by applying data consisting of the "on" style of pronunciation and the "kun" style of pronunciation of that ideogram.
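  • The two-reading selection described above can be pictured as the intersection of two homonym groups. The following Python sketch is only an illustration under stated assumptions: the four sample characters, their romanized readings, and the names `SAMPLE`, `build_index`, and `select` are hypothetical and do not come from the patent, which describes hardware rather than software.

```python
from collections import defaultdict

# Hypothetical sample data: (on reading, kun reading, character).
# Romanized readings are used purely for illustration.
SAMPLE = [
    ("sho", "hajime", "初"),
    ("sho", "kaku", "書"),
    ("kou", "hikari", "光"),
    ("kou", "takai", "高"),
]


def build_index(entries):
    """Group characters by "on" reading and, separately, by "kun" reading."""
    by_on = defaultdict(set)
    by_kun = defaultdict(set)
    for on, kun, char in entries:
        by_on[on].add(char)
        by_kun[kun].add(char)
    return by_on, by_kun


def select(by_on, by_kun, on, kun):
    """Return the characters that match both readings (set intersection)."""
    return by_on.get(on, set()) & by_kun.get(kun, set())


BY_ON, BY_KUN = build_index(SAMPLE)
```

  • In this toy table the "on" reading "sho" alone is shared by two characters, while `select(BY_ON, BY_KUN, "sho", "hajime")` narrows the choice to a single character.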
  • the word "Tōkyō" previously mentioned is an example of such usage of kanji in a compound word.
  • the second linguistic principle, or characteristic, is the usage of the two kanji together.
  • the proper "to" kanji can be selected and, at the same time, the proper "kyō" kanji can also be selected.
  • the compound word "tenki" consisting of a first kanji pronounced "ten" and a second kanji pronounced "ki" can mean either "a turning point" or "weather".
  • the four kanji involved in these two words are entirely different.
  • the selection may be made by reference to kana associated with the kanji meaning "weather". In this instance, it is common to place the kana pronounced "o" immediately in front of the kanji pronounced "tenki" and meaning "weather", but the kana "o" is never placed in front of the kanji pronounced "tenki" and meaning "a turning point".
  • the appropriate kanji can be selected.
  • a related second linguistic principle is to utilize kana that follow the kanji.
  • the kanji pronounced "kiso" meaning "the foundation" are entirely different from the kanji also pronounced "kiso" but meaning "prosecute".
  • "kiso" is a verb and is followed by the kana "suru", which do not follow the kanji pronounced "kiso" and used as a noun meaning "the foundation".
  • These preceding and following kana that are closely associated with the kanji in Japanese words are called "okurigana".
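  • The use of neighbouring kana to separate homonymous compounds can be sketched as a lookup keyed on the reading plus the attached kana. This is a minimal illustration, assuming a hypothetical table `CANDIDATES` and function `select_by_okurigana`; the kanji shown are the usual spellings of the four words discussed above and are supplied here for illustration, not quoted from the patent.

```python
# Hypothetical table: reading -> list of (preceding kana, following kana, word).
# None means that no such kana is attached to the word.
CANDIDATES = {
    "tenki": [("o", None, "天気"),      # "weather", commonly preceded by "o"
              (None, None, "転機")],    # "a turning point"
    "kiso": [(None, "suru", "起訴"),    # "prosecute", followed by "suru"
             (None, None, "基礎")],     # "the foundation"
}


def select_by_okurigana(reading, before=None, after=None):
    """Return the words whose attached kana match the typed context."""
    return [
        word
        for pre, post, word in CANDIDATES.get(reading, [])
        if pre == before and post == after
    ]
```

  • For example, `select_by_okurigana("tenki", before="o")` picks the "weather" compound, while the same reading with no attached kana picks the "turning point" compound.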
  • an operator of the system of this invention can rely on the "on-kun" method of selection, but to do so requires the input of additional data not normally included in a message. Normally, one uses only the "on" style or the "kun" style at any given point but not both styles.
  • Still another linguistically-related principle that can be applied to select kanji that cannot be selected by reference to associated kanji or to okurigana is the graphic depiction of all of the kanji having a similar pronunciation and the associated graphic depiction of identifying information, such as the address of each of these kanji, to allow a person familiar with the language to choose the proper kanji and refer its address back to the apparatus.
  • the address may be in the form of a number having four decimal digits.
  • the presentation of the ambiguous kanji and their addresses can be accomplished automatically by the system without requiring any additional input from the operator and it is thus less burdensome than the selection based on multiple styles of pronunciation.
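  • A rough sketch of this fallback follows: every character sharing the typed pronunciation is listed with its four-digit address, and the operator keys the address of the one wanted. The memory contents, the particular addresses, and the names `MEMORY`, `candidates_for`, and `fetch` are assumptions made for illustration only.

```python
# Hypothetical memory: four-digit decimal address -> (reading, character).
MEMORY = {
    "0412": ("kou", "光"),
    "0413": ("kou", "高"),
    "0414": ("kou", "校"),
    "1207": ("sho", "初"),
}


def candidates_for(reading):
    """List (address, character) pairs for every character with this reading."""
    return sorted(
        (address, char)
        for address, (stored_reading, char) in MEMORY.items()
        if stored_reading == reading
    )


def fetch(address):
    """Return the character stored at the address the operator keyed back in."""
    return MEMORY[address][1]
```

  • The operator would see the output of `candidates_for("kou")` on the display and answer with an address such as "0413".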
  • in the selection system of this invention, a keyboard of the standard size in Roman letter typewriters, such as are used to type English and most of the European languages, can be used to select specific ideograms from a group of ideograms much larger in number than the number of keys on the keyboard.
  • the Japanese phonetic syllabaries, hiragana and katakana, have only about 50 symbols, and the keys of an electric typewriter keyboard originally set up for the Roman alphabet can easily be modified to accommodate the kana.
  • a keyboard suitable for use in a system according to the present invention can be small enough to be spanned by the outstretched fingers of two hands of average size and yet the information entered through this keyboard utilizing the apparatus of the present invention can produce a graphic display of more than 2,000 graphic symbols. Furthermore, selection of the graphic symbols in accordance with the linguistic characteristics as just described makes it unnecessary to use each key to obtain more than two symbols in each mode of operation.
  • the present invention requires, at most, shifting between two symbols, or symbol styles, for each key. This corresponds to writing English in which the operation is shifted between one level for lower case letters and another level for upper case letters.
  • the apparatus of this invention includes a memory in which the graphic symbols or information defining such symbols is stored in specific address locations. These addresses may be reached by signals processed by a suitably programmed computer or by a sequence of storage, comparison, and switching circuits. The full scope of operation of a computer is unnecessary because there is a finite, specific number of symbols or addresses to be retrieved by a finite set of data.
  • Figure 1 is a block diagram illustrating one embodiment of this invention;
  • Figure 2 is a plan view of one embodiment of a keyboard for use in the present invention;
  • Figure 3 is a schematic diagram illustrating the procedure followed in encoding a typical Japanese expression by use of the keyboard shown in Figure 2;
  • Figure 4 is a detailed schematic diagram of a component of the system shown in Figure 1;
  • FIG. 5 is a schematic diagram of another embodiment of the invention.
  • Figures 6A-6L are illustrative examples of the use of ideograms in Japanese writing.
  • FIG. 7 is a block diagram of a modified system incorporating the invention.
  • Figure 8 is a plan view of another embodiment of a keyboard for use in the present invention.
  • Figure 9 is a block diagram of a terminal suitable for use in the circuit in Figure 7;
  • Figure 10 is a simplified illustration of the screen of a cathode ray tube displaying information according to the present invention.
  • Figure 11 is a simplified drawing of one sheet of computer fanfold paper arranged to be used in accordance with this invention.
  • Figures 12A and 12B illustrate two types of graphic symbol display in accordance with the present invention.
  • the present invention is particularly well suited for use with the Japanese language.
  • written Japanese usually consists of "kanji" or Chinese characters, mixed with "kana" or Japanese phonetic characters.
  • Kana characters are relatively easy to select, type, or print because there are only a relatively small number of them.
  • the Chinese characters cause all of the problems described above.
  • each Chinese character is represented by a signal composed of two separate signals, one representing the "kun" style and the other the "on" style of pronunciation of the character.
  • FIG. 1 of the drawings shows a system for encoding, decoding, and graphically displaying Japanese phonetic and Chinese characters to form written matter in the Japanese language.
  • This system includes a keyboard unit 10 with character keys 12.
  • the unit 10 sends coded electrical signals to a conventional tape punch unit 14 which produces a punched paper tape 16 bearing binary-coded arrays of holes each representing a character key which was depressed.
  • the matter being typed is typed in accordance with the above-described "on-kun" code in which each Chinese character is represented by Japanese phonetic (kana) or Roman characters which represent the "kun" and the "on" styles of pronunciation. Words which are to be printed or otherwise displayed in kana form can be typed directly, without use of the "on-kun" code.
  • Each set of signals using the "on-kun" code is segregated from the other signals by appropriate start and stop signals.
  • the punched tape 16 is delivered to a conventional tape reader 18 which produces coded electrical signals corresponding to the punch-coded signals on the tape 16. These electrical signals are conducted to a conventional code decoder 20 which detects the start and stop signals surrounding each "on-kun" code sequence and delivers a gating signal on a lead 22 to a switching device 26.
  • Switching device 26, which can be a conventional bi-stable circuit such as a flip-flop, directs the coded signals it receives over lead 24 to one of two leads 28 or 30, depending upon which of its bi-stable conditions it is switched to by the gating signal received on the lead 22.
  • Upon the receipt of a "start" signal signifying the start of a sequence of "on-kun" coded signals, the device 26 switches to one mode in which the coded signals are delivered over output lead 28 to a code converter device 32.
  • Code converter 32 stores coded signals representing Chinese characters and delivers one of those signals over an output lead 34 to a utilization device 36 in response to the receipt of the "on-kun" code designation of a selected character.
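  • The gating performed by decoder 20 and switching device 26 behaves like a two-state toggle: a marker flips the state, and each coded signal is routed according to the current state. The sketch below models that behaviour in Python under stated assumptions: the token names (`"II"` for the shared start/stop signal, `"SEP"`, `"k55"`, `"k57"`) and the function `route` are hypothetical labels, not part of the patent.

```python
MARKER = "II"  # assumed token for the start/stop signal of an "on-kun" sequence


def route(tokens):
    """Split a token stream into "on-kun" sequences and directly typed tokens.

    Mirrors a flip-flop: each MARKER toggles the state; tokens seen while the
    state is set go to the converter path (lead 28), the rest go to the
    direct path (lead 30). An unterminated sequence is simply not emitted.
    """
    to_converter, direct = [], []
    in_code = False
    current = []
    for token in tokens:
        if token == MARKER:
            if in_code:
                to_converter.append(current)  # stop signal closes the sequence
                current = []
            in_code = not in_code
        elif in_code:
            current.append(token)
        else:
            direct.append(token)
    return to_converter, direct
```

  • A stream such as `["II", "ha", "ji", "SEP", "si", "yo", "II", "k55", "k57"]` yields one converter sequence and two directly routed kana tokens.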
  • the utilization device can be any type desired, such as a photocomposing machine, cathode ray tube display, or teleprinter, each of which prints or otherwise graphically displays the Chinese characters.
  • the punched tape is but one example of a register for storing the encoded character signals from the keyboard device 10.
  • Other permanent, semi-permanent or temporary registers such as magnetic tape, punched cards, etc., can be used instead of punched tape, or the use of a register can be dispensed with entirely. In the latter case, the output of the keyboard device 10 would be connected directly to the decoding device 20.
  • the keyboard device could be connected directly to the converter 32, thus eliminating elements 20 and 26 from the circuit.
  • FIG. 2 is a schematic plan view of the keyboard device 10.
  • the keyboard includes character keys 12 each of which is marked with a Roman letter 38 and a Japanese phonetic (kana) character 40.
  • Number keys 42 are marked with an Arabic numeral, and, since the Arabic numerals are themselves ideograms with directly corresponding Chinese ideograms, one key can be used to represent the same number in each language if desired.
  • Certain other keys 44 termed herein as "quadrated" keys, are marked with three or four different symbols, some being kana characters, and others being English ideograms.
  • Key V is for upper case Roman characters and Arabic numerals
  • Key IV is for lower case Roman characters and Arabic numerals
  • Key III is for the "katakana" form of kana characters
  • Key I is for the "hiragana" form of kana characters
  • Key II is the "on-kun" code signal key which records the stop and start signals indicating that the code is for Chinese characters
  • the "M" key designates the switch to turn the keyboard on and off.
  • the one of the characters on quadrated keys 44 which is selected depends upon which of the four keys I, III, IV or V is actuated.
  • a space bar 46 is also provided to space the characters from one another.
  • An example of how the keyboard device 10 can be used is illustrated in Figure 3.
  • the expression "at first light exists", indicated by reference numeral 49 at the bottom of Figure 3, is properly written in Japanese as shown by the expression at the top of Figure 3 which is indicated by numeral 47.
  • the symbols indicated by reference numeral 51 comprise a single Chinese or "kanji" character which means "first".
  • Characters 55 and 57 are Japanese phonetic characters (kana characters) which create a meaning, together with character 51, of "at first".
  • Character 59 is another Chinese character which means "light".
  • Characters 61 and 63 are two further Japanese phonetic characters which together mean "exists".
  • the encoding of the Japanese expression 47 will now be explained as an example.
  • the "M" key is depressed to turn the keyboard device 10 on.
  • the II key is depressed to indicate that a Chinese character will be encoded.
  • the keyboard device 10 is one of several devices which are commercially available for converting keystrokes into appropriately coded electrical signals.
  • the electrical signals are used to operate the tape punch 14, or to otherwise operate in the system shown in Figure 1. Any particular binary code can be used as desired. For example, either six- or seven-level "Teletypesetter" (TTS) code can be used.
  • Table 65 in Figure 3 shows the Japanese phonetic and the Roman alphabet components of both the "kun" and the "on" pronunciations for the Chinese character 51.
  • the "kun" pronunciation of character 51 is "ha-ji-me"
  • the "on" pronunciation is "si-yo".
  • Chinese character 51 is encoded by first pressing the key 48 for the Japanese phonetic symbol for "ha". The next phonetic symbol is formed by successively depressing keys 50 and 52. Since the first two syllables in the "kun" pronunciation are sufficient to uniquely identify the character 51, it is not necessary to depress a third key to represent "me", although the third key can be depressed if the operator desires. Instead, a key 53 may be depressed which encodes a signal on the tape 16 which indicates the end of the "kun" pronunciation and the beginning of the "on" pronunciation. Next, the key 50 for "si" and the key 54 for "yo" are depressed. This completes the coding except for ending the Chinese character. This is done by depressing key II again, thus placing a coded signal which is the same as the start signal on the tape.
  • the Japanese phonetic characters 55 and 57 do not need to be specially encoded.
  • the next step in encoding the expression 47 is to depress the Japanese phonetic key I to condition the keys 12 to encode Japanese phonetic characters, and then keys 56 and 58 corresponding, respectively, to phonetic characters 55 and 57 are depressed.
  • the Chinese character 59 is encoded by first again pressing the Chinese character key II, and then depressing, in succession, keys 60, 62, and 64 for the "kun" pronunciation.
  • key 53 is depressed, to separate the different pronunciations, and the "on" pronunciation of the Chinese character 59 is encoded by depressing successively keys 66 and 68, in accordance with table 67.
  • This Chinese character is ended by again depressing key II.
  • the final two phonetic characters 61 and 63 are encoded by once again depressing key I, and then keys 70 and 64.
  • the depression of the "M" key turns the keyboard off.
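  • The keystroke sequence walked through above can be summarized as a small encoder. The following Python sketch is an illustration only: the token names `"M"`, `"II"`, `"I"`, and `"SEP"` (standing for key 53), the placeholder kana tokens `"k55"` and `"k57"`, and the function names are assumptions. Each syllable is treated as one token even where the text forms it from two keys (keys 50 and 52).

```python
def encode_kanji(kun, on):
    """Wrap the "kun" and "on" syllables between the shared start/stop marker."""
    return ["II", *kun, "SEP", *on, "II"]


def encode_kana(kana):
    """Switch to the hiragana mode (key I) and type the kana directly."""
    return ["I", *kana]


def encode_message(segments):
    """Encode a list of ("kanji", (kun, on)) and ("kana", [..]) segments."""
    out = ["M"]  # turn the keyboard on
    for kind, payload in segments:
        if kind == "kanji":
            out += encode_kanji(*payload)
        else:
            out += encode_kana(payload)
    out.append("M")  # turn the keyboard off
    return out
```

  • For character 51 followed by two kana, `encode_message([("kanji", (["ha", "ji"], ["si", "yo"])), ("kana", ["k55", "k57"])])` reproduces the order of operations described in the text: on, start marker, "kun" syllables, separator, "on" syllables, stop marker, kana mode, kana, off.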
  • the keyboard device 10 is vastly simpler than previous keyboard devices for encoding the Japanese language characters.
  • the keyboard can be simplified even further if only Japanese phonetic characters are desired to be used in the typewriter, or if only Roman characters are to be used. In such a case, some of the function keys and their associated circuitry can be eliminated.
  • Figure 4 shows an example of circuitry which can be used to convert "on-kun" coded signals into Chinese characters.
  • the converter 32 includes a shift register 100 connected to the input lead 28, and a register 102 connected to the shift register 100. These registers are conventional and are adapted to store two coded signals apiece and then read out their signals. When the register 100 has received two "kun" coded words, it shifts those words into the second register 102. When both registers 100 and 102 are full the signals stored in those registers are transferred to conventional decoders 106 and 104, respectively.
  • When the "kun" code signals are delivered first, they are decoded in the decoder 104 whereas the "on" signals are decoded in the decoder 106.
  • If the "on" and "kun" codes are reversed in order, the opposite signals are decoded by each decoder.
  • Each decoder 104 or 106 produces an output signal on only one of its output leads for a given combination of input signals.
  • the decoder 104 might deliver an output signal over lead 108
  • the decoder 106 might deliver the signal over the lead 112.
  • the lead 108 is connected by means of another lead 110 to a plurality of different storage units 113.
  • Each of the storage units 113 stores a coded electrical signal representative of one Chinese character.
  • Each storage unit 113 includes "AND" circuit means so that its stored signal will be read out only when a signal is received on each of two input leads.
  • Each output lead of the decoder 106 also is connected to every storage unit 113 which has the "on" pronunciation represented by the lead 112.
  • the lead 112 is connected to the second input lead of the same storage device 115 as the one to which the lead 108 of decoder 104 is connected.
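  • The converter logic of Figure 4 can be modelled as two one-of-many decoders feeding storage units that each act as an AND gate on one "kun" line and one "on" line. The Python sketch below is a software analogy, not the circuit itself: the line labels (`"K0"`, `"N0"`, and so on), the second syllable pairs, the placeholder codes, and the names `StorageUnit` and `convert` are all hypothetical. Only the pairs "ha"/"ji" and "si"/"yo" for the character meaning "first" come from the worked example in the text.

```python
from dataclasses import dataclass

# Assumed decoder tables: a pair of coded syllables activates exactly one line.
KUN_DECODER = {("ha", "ji"): "K0", ("hi", "ka"): "K1"}
ON_DECODER = {("si", "yo"): "N0", ("ko", "u"): "N1"}


@dataclass(frozen=True)
class StorageUnit:
    kun_line: str
    on_line: str
    code: str

    def read(self, kun_line, on_line):
        """AND condition: release the stored code only if both lines are active."""
        if kun_line == self.kun_line and on_line == self.on_line:
            return self.code
        return None


UNITS = [
    StorageUnit("K0", "N0", "初"),
    StorageUnit("K0", "N1", "placeholder-A"),
    StorageUnit("K1", "N1", "placeholder-B"),
]


def convert(kun_pair, on_pair):
    """Decode both pairs to lines and collect the codes of units that fire."""
    kun_line = KUN_DECODER.get(kun_pair)
    on_line = ON_DECODER.get(on_pair)
    outputs = (unit.read(kun_line, on_line) for unit in UNITS)
    return [code for code in outputs if code is not None]
```

  • Two units share the line `"K0"` and two share `"N1"`, yet any given pair of active lines fires at most one unit, which is the property the AND arrangement is meant to provide.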
  • FIG. 5 illustrates another embodiment of the present invention.
  • the keyboard unit 10 is connected to a visual display device 150 of a well-known type.
  • One such device is part of the machine known as the "Chicoder" machine which is sold by the Itek Company, Lexington, Massachusetts. It has a display screen 156 on which Chinese characters can be displayed, one within each one of the squares on the screen.
  • a horizontal array of switch buttons 152 and a vertical array of switch buttons 154 are provided along the edges of the screen 156.
  • a particular Chinese character displayed on the screen can be selected by sight and transmitted from the device 150 to the utilization device 36 by pressing a button corresponding to the proper row and column of the desired Chinese character.
  • the keyboard device 10 shown in Figure 5 is operated so as to develop coded signals corresponding only to either the "on" or the "kun" style of pronunciation of a particular Chinese character desired to be selected.
  • This signal is delivered to the display unit 150 which then will display each of the Chinese characters having that particular "on" or "kun" pronunciation. Then the desired Chinese character is selected visually and read out of the device 150 in the manner described above. This provides an extremely simple solution to the character selection problem.
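  • The row-and-column selection of Figure 5 amounts to laying the homonyms out on a grid and indexing into it. A minimal sketch follows, assuming a hypothetical three-column screen and a hypothetical list of characters sharing the "on" reading "kou"; the names `layout` and `pick` are illustrative and zero-based indices stand in for the switch buttons 154 and 152.

```python
def layout(chars, columns):
    """Arrange the candidate characters row by row on a grid of given width."""
    return [chars[i:i + columns] for i in range(0, len(chars), columns)]


def pick(grid, row, column):
    """Return the character at the pressed row button and column button."""
    return grid[row][column]


# Hypothetical group of characters with the same "on" reading ("kou").
HOMONYMS = ["光", "高", "校", "工", "公", "口"]
GRID = layout(HOMONYMS, columns=3)
```

  • Pressing the second row button and the first column button corresponds to `pick(GRID, 1, 0)`.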
  • characters can be selected by means of a pre-programmed general-purpose digital computer, such as a Honeywell Level 6-36, instead of the permanently-wired circuitry represented by elements 20, 26 and 32 of Figure 1.
  • a general-purpose computer may be of any well-known type and will not be described in detail herein.
  • the detailed steps to be performed by actually programming the computer are well within the skill of those knowledgeable in the computer programming art.
  • the computer would be programmed to store at each address in its memory a separate signal representative of a distinct Chinese character. A signal would be read out of each address only when signals representing both the "on" and "kun" meaning for the character were received at that address in the storage.
  • the input data to such a computer could be in the form of a punched paper tape 16, magnetic tape, punched cards or other well-known digital computer input media.
  • Although each Chinese character is represented by only two different pronunciations, it should be understood that, especially when encoding Chinese characters in the Chinese language, more than two different pronunciations can be used to encode each character. Thus, the character encoding capacity of the system will be increased.
  • the system and method of the present invention is very easy to use by relatively unskilled personnel familiar with the Japanese language. Even those with only the most rudimentary education in Japanese learn to speak and write both the "on" and "kun" styles of pronunciation for Chinese characters.
  • the operation of the keyboard of the present invention makes advantageous use of the basic knowledge of most people with a fundamental education in the Japanese language.
  • the vast reduction in the number of different keys to be depressed makes the keyboard device of the present invention vastly more simple to use, faster in operation, smaller in size and weight, and of less complicated and expensive mechanical construction than prior art keyboards.
  • Since a single data processor can be used to process the input data from a plurality of different keyboards, the savings due to reduction of complexity in the keyboard equipment can be multiplied by the number of different keyboard units which can be used with a single data processor.
  • Figure 7 is a simplified block diagram of major components of a computer arranged to operate as a communication system incorporating the present invention.
  • the circuit shown in Figure 7 is basically a Honeywell Level 6-36 computer, the components of which are connected to a bus 161 referred to as a megabus.
  • the components connected to the megabus 161 include a CPU 162 having a 64KW memory 163 connected to it.
  • the megabus 161 also has a mass storage controller 164 connected to it and two discpacks 167 and 168 connected thereto. Each of the discpacks has a 5 megabyte storage capacity.
  • the megabus 161 also has a multi-line communication processor 169 connected to it with one or more communication packs 171 connected to the MLCP 169.
  • each of the CP's has four asynchronous ports 172-175 connected to it.
  • the port 172 is connected to a subscriber's telex channel operating at a 50 baud rate.
  • the port 173 is connected by means of an acoustic coupler connected to a telephone line and operating at a 300 baud rate.
  • the port 174 is connected to a data communication channel operating at up to 19.2 kilobauds.
  • the port 175 operates through a suitable connector such as an RS232C connected to a direct line to permit wide band signals to pass therethrough.
  • a second MLCP 176 may be connected to the megabus 161.
  • a multi-device controller 177 may be connected between the megabus 161 and a keyboard device pack 178. The latter is connected to a keyboard device 179.
  • the system shown in Figure 7 may be operated as a time-sharing system so that a number of terminals may be connected to feed signals into it and to receive signals from it.
  • Figure 8 shows a typical keyboard 179.
  • This keyboard may be identical with the Honeywell VIP 7200 keyboard.
  • the main part of the keyboard includes all of the letter and numeral keys found on a standard typewriter operating on the Roman alphabet.
  • the keys marked with Roman alphabetic symbols are identified in general by reference numeral 181.
  • the keys marked with Arabic numerals in the next to the top line of keys in the keyboard are indicated by reference numeral 182.
  • At the right hand side of the keyboard in Figure 8 is a numerical pad generally indicated by reference numeral 183. Each of these keys is connected to a correspondingly-numbered key in the numeric keys 182 in the main part of the keyboard.
  • the top row of keys in the keyboard in Figure 8 contains a number of function and mode keys. These are the keys that control most of the functions found in a standard electric typewriter.
  • the keyboard in Figure 8 may be connected to a terminal shown in block form in Figure 9.
  • This is a typical, configuration for a terminal and includes the keyboard 179, a printer 184, a cathode ray terminal 186 that includes a cathode ray tube, and a control system 187 including a micro-processor such as a Motorola 6800 along with a random access memory and a character memory in the form of an eraseable programable read-only memory capable of storing information allowing the printout of 2500 characters including about 2250 kanji along with a full set of katakana and hiragana symbols and all of the Roman letters, both upper and lower case, and the Arabic numerals from 0 through 9, together with various punctuation and other symbols.
  • the terminal in Figure 9 further includes a switch 188 that connects the terminal to any one of four lines which are the same four lines as identified in connection with the asynchronous ports 172-175 in Figure 7.
  • the terminal in Figure 9 can be connected directly to the computer system in Figure 7 by any suitable one of the four lines and can also be connected to another terminal by any one of the four lines. This makes the system very flexible in its modes of communication.
  • the function keys in the top row indicate the form of input and output information.
  • the keys in the keyboard 179 are marked with Roman letters and with katakana symbols along with other non-phonetic symbols.
  • the system may be so arranged that, when it is placed in operation, the information typed in must be typed according to either the katakana symbols on the keys or according to the romaji, or Roman letter, markings on the keys. The difference is simply that, if the information is to be typed in using the katakana designations, the phonetic katakana symbol pronounced "ka" must be typed in by actuating the key that is also marked with the Roman letter "F". If the information is to be presented (for the Japanese language) by using Roman letters, the same pronunciation "ka" could be obtained by typing the letters "k" and "a" in succession.
  • the output of the information typed in at the keyboard 179 may be displayed on the cathode ray terminal 186 in any of several ways. By depressing the key marked "hiragana", the information may be displayed in hiragana. If the input is in katakana, there will be one character generated on the face of the CRT in the hiragana form for each katakana key struck on the keyboard 179. The significance of this is that an operator can have an immediate feedback of information that a key has been actuated. On the other hand, if the information is typed in Roman letters, the direct 1:1 correspondence between the information entered by way of the keys and the display on the cathode ray tube of the terminal 186 is not present.
  • the system is arranged so that, as each Roman letter key is actuated, the corresponding Roman letter will appear on the screen of the cathode ray tube in the terminal 186.
  • the Roman letters of the immediately-preceding word will automatically be changed into hiragana.
  • the "echo" effect is obtained but the information is still presented in the form of hiragana.
  • the information can be presented in katakana by actuating the key marked "katakana".
  • the kana shift key controls the shifting operation similar to the upper and lower case operation of a standard typewriter, but indicated on each of the symbol keys by the two symbols on the right hand side of each of these keys.
  • each of the kanji stored in the memory of the system has an address which is preferably in the form of a four-digit code number.
  • the four-digit number may be presented on the screen of the cathode ray terminal 186 and this number can then be entered by way of the numerical pad 183 to cause the printout of the appropriate kanji.
  • the kanji shift keys 192 and 193 are used.
  • the input may be in either Roman letters or in katakana.
  • the kanji shift keys 192 or 193 must be depressed. This has the effect of shifting the operation of the system to obtain processing of the Roman letters or kana symbols entered immediately thereafter into appropriate kanji.
  • the appropriate kanji may be determined without ambiguity simply on the basis of the phonetic information and the additional linguistic information afforded by the preceding and following kanji or the preceding and following kana.
  • some of the kana may remain displayed in negative form on the screen of the cathode ray terminal 186.
  • the negative display of kana will be shown until the code signal for each of the kanji in the sentence is returned from the central memory in the system shown in Figure 7. If it turns out that the kanji shift key has been depressed to call for transformation into a kanji and there is no kanji that can be identified in the memory, the system in Figure 7 will send back information that no such kanji exists and in that case the negative display of those particular kana will be transformed into positive display.
  • the system will cause the display on the bottom half of the cathode ray screen in the terminal 186 of all of the kanji having the same pronunciation.
  • These kanji will be displayed along with their four-digit code numbers and the operator, who must be familiar with the Japanese language, can then select the proper kanji. This selection can then be entered back into the keyboard by typing in the four-digit code number by means of the numerical pad 183 of the keyboard in Figure 8.
  • a sentence having several groups of negatively-displayed kana indicated by shaded blocks 196-198 will cause a plurality of groups of kanji to be depicted in the lower part of the screen.
  • the kanji 201 has its four-digit code presented in a corresponding 18 x 18 array 203 immediately below the kanji. Because of the fact that the four numbers that make up the code for the kanji 201 can each be displayed in a 9 x 9 array, all four can be presented in the same space as the kanji 201, itself.
  • the four-digit code 204 for the kanji 202 is displayed immediately below it and in the same manner the kanji 206 and 207 that correspond to the kana 197 are displayed with their four-digit codes 208 and 209, respectively.
  • the kana 198 three kanji 211-213 could be appropriate. These three kanji are presented with their respective four-digit codes 214-216. The operator has only to look at the kanji presented at the lower part of the screen to determine which ones to use to replace the negatively-displayed kana 196-198 and then must enter the corresponding four-digit number by way of the numerical pad 183.
  • Still another way to determine the appropriate kanji is to utilize the key that has the arrow that points to the left. This key also has the katakana symbols pronounced "a" and
  • the first key to be utilized for this purpose is the edit key, and when it is depressed, a cursor can be moved back and forth along a line of symbols on the face of the cathode ray tube in the terminal 186 by actuation of the arrow keys pointing left and right. Preceding lines and sentences can be moved into view on the cathode ray tube screen by actuation of the arrow pointing upward and then these preceding sentences can be run off the screen by actuation again of the arrow pointing downward. When the proper line is in position and the cursor is on the proper word, the symbols in that word may be changed symbol by symbol to make any necessary corrections. When the information in the message is correct, and when all of the negative kana have been replaced by positive kana or by kanji, the message may be recorded by actuation of the print key and then may be later printed out on the printer 184.
  • Figure 11 shows one sheet of computer paper arranged to accommodate the invention.
  • On the left hand side of the paper is an area 218 which is shown as being a letterhead sheet. It is not possible to go from negative to positive depiction of kana in a printer such as the printer 184, but the information can be printed out initially in the letterhead area 218, and for each line of print, any ambiguous kanji may be caused to print the possible insertions on the same line. This is illustrated by the shaded block 219 that represents kana which should be transformed into kanji but which cannot be transformed unambiguously. On the same line but on the right hand side 221 of the sheet of.
  • Figure 12A are printed by printing first the upper half of all four symbols by moving the printer means along the line 236 from left to right and then printing the lower half by moving the printing elements along a continuation of the line 236 from right to left.
  • the areas 231-234 may be printed by a raster sweep of each of the 18 lines from top to bottom with each line going from the far left to the far right.
  • each 18 x 18 area is separated from its neighboring 18 x 18 area by a space represented by six dots.
  • Figure 12B shows a modified form of presentation in which only the relatively complex kanji are presented with an 18 x 18 matrix format and the simpler symbols, such as the kana and any Roman letters and numerals, are presented in 9 x 9 matrices. It takes less time to retrieve 81 bits of information required for generation of a symbol in a 9 x 9 matrix than it does to retrieve 324 bits of information to generate a symbol in an 18 x 18 matrix area. This speeds up the printing process. The printing process may be further speeded up by the technique of moving the printing or scanning element faster in areas in which it is known that nothing is to be printed as is true in the upper part of areas over the small squares 237 and 238 in Figure 12B.
  • one of the linguistic characteristics that can be utilized to select kanji for printing the Japanese language is to group the kanji in accordance with the number of such kanji in a word.
  • there are words that contain more than four kanji but it has been determined that it is unnecessary to include such words in the memory of the system since any word can be spelled out phonetically by using kana.
  • the number of kanji used by a person is indicative of the educational attainments of that person, but this is not always the case. It is thus perfectly satisfactory if some of the kana that should be translated into kanji do not get translated.
  • IF KTX1 "M502" GO TO TXT-R. TXl-M. WRITE TXTFIL INVALID KEY GO TO TXT-E1.
  • IF R3 "Q” GO TO HIRAGANA.
  • IF R3 "V” GO TO KATAKANA.
  • IF R3 "L” GO TO ALPHABET.
  • IF R3 - "'” OR R3 " , " GO TO SRCH1. SRCHO.
  • IF R3 "#" GO TO END-P.
  • R3 SPACE GO TO SYMBOL. SRCH1.
  • IF TYP 0 GO TO PROCESS-C1.
  • IF TYP NOT 0 GO TO SC2.
  • IF C3 NOT 0 GO TO SC1.
  • IF TYP NOT 6 GO TO FOUND.
  • IF TYP 0 MOVE 0 TO KC.
  • PROCESS-H SET III TO 11. GO TO TR1.
  • PROCESS-A PROCESS-A.
  • IF FGW1 1 GO TO EXX12 EXXI1.
  • IF OS-REGISTER NOT SPACES GO TO EXX31.
  • EXX3 3. MOVE 01-REGISTER TO OSR(0S). GO TO EXX30. EXX3 4. DISPLAY "9999".
  • KJU2. READ K--7 ⁇ BLE-S INVALID KEY GO TO KJAB CD. MOVE 5 TO FLG.

Abstract

Graphic symbols, such as Chinese ideograms (51) or their informational equivalents, are stored at coded addresses in a memory (113) and are identified and retrieved by signals (108, 112) representing first and second linguistic characteristics. One of the linguistic characteristics is preferably one style of pronunciation of the desired symbol. The other linguistic characteristic may be a second style of pronunciation, especially in the retrieval of ideograms in printing the Japanese language in which the two styles are "kun" and "on" styles of Japanese pronunciation (65). The second characteristic can be a related ideogram or phonetic linguistic characteristic. A graphic symbol reproducer (150), such as a typewriter, printer, facsimile machine, cathode ray terminal, phototypesetter, teletype machine or other encoding device which is simple and rapid to operate is provided by the invention.

Description

SYSTEM FOR SELECTING GRAPHIC CHARACTERS PHONETICALLY
FIELD AND BACKGROUND OF THE INVENTION
This invention relates to a system and apparatus for selecting graphical symbols and particularly to a system and apparatus for selecting desired ideograms from among a large number of ideograms by a system using alphabetic or phonetic symbols capable of being accommodated on a keyboard arranged to place substantially every key within a space that can be spanned by the two hands of a typical operator.
DESCRIPTION OF THE PRIOR ART
This invention is primarily concerned with the problem of selecting a desired ideogram by means of a keyboard that has a relatively small number of keys compared to the number of ideograms in the set from which the selection is to be made. For example, a typical electric typewriter for typing English and other languages that use the Roman alphabet includes 44 printing keys of which 26 are letter keys, ten are numeral keys from 0 to 9, and eight are keys for printing punctuation and other symbols. The typewriter also has seven control keys, including a space bar. Three of the control keys operate the shifting mechanism to allow each printing key to control the printing of two different symbols, such as upper case and lower case letters and additional non-letter symbols. All of the keys are within a distance capable of being spanned by the two hands of a typical operator and yet there is sufficient space between the keys to permit easy actuation of any desired key without inadvertent actuation of an adjacent key. A professional typist can operate such a keyboard entirely by touch and without visual reference, primarily because it is possible for each of the typist's hands to remain very nearly in a fixed location. Key selection can be achieved almost entirely by finger movement.
The Roman alphabet is not the only one that can be accommodated by a typewriter; the Arabic, Cyrillic, Russian, Greek, and other symbolic alphabets and phonetic alphabets, and syllabaries, such as the Korean alphabet and the Japanese kana, can all be incorporated in suitably adapted typewriters.
This is to be contrasted with the keyboards presently required to select even a minimum number of ideograms for a satisfactory printing of information in a language like Japanese. One of the most efficient ideogram selection machines presently available in Japan has 264 keys to be operated by one hand of the operator and 12 keys to be operated by the other hand. Selection of a single ideogram requires that one of the 264 keys and one of the 12 keys be operated. The selection of one ideogram per second is considered a good speed and can be reached by an operator only after many months of training. The operator of an English-language typewriter should, after only a few days or weeks of training, be able to select, accurately by touch at the rate of one key per second, any key on the keyboard previously described. The selection of one ideogram per second is not entirely comparable with the selection of one letter key per second because a single ideogram is usually equal to several letters in terms of the amount of information it contains, but even so, it is far slower to type a given amount of information in Japanese than in English and it takes far longer to develop the manual dexterity required to do so.
It is desirable to reduce the number of keys required to select any one of a large number of ideograms, but the reduction in the number of keys is not, in itself, satisfactory. For example, a single telegraph key can select any letter in any alphabet, any numeral from 0 to 9, and any of the necessary punctuation marks; but the operator must know all of the code symbols, which requires an additional mental transformation. Ideograms can be given code numbers to facilitate their selection and their transmission by radio or wire, but this also represents a further transformation. Telegraphic code for English and numerals only requires learning about 50 code symbols, but the Japanese Ministry of Education has listed about 1850 most essential ideograms, and Japanese newspapers normally use about 3,600. Memorizing 3,600 numerical code numbers would be a formidable task. The task is somewhat simpler in Japan than in China because much of the written Japanese textual material is in one of the two phonetic syllabaries, generically called "kana" but separately identified as "hiragana" and "katakana." In the Chinese language, all written symbols are ideograms, although there has been an attempt in the past to promulgate a set of about 39 or 40 national phonetic letters for the Chinese language. There has also been an attempt to adopt a system of national Romanization as an alternative to the phonetic letters, but neither of these attempts at simplification has been completely successful.
Large numbers of ideograms are also used in writing the Korean language, which also has the advantage of including an alphabet system consisting of 24 letters generally capable of being sounded as either an initial or final letter. There are still other languages that rely heavily on ideograms or symbols that can be selected according to the principles of this invention. To make the following discussion clearer, the Japanese language will be used as the basis of description, but it should be understood that the principles of the invention are not limited just to selection of graphic symbols in that language.
One of the difficulties of providing a machine or electronic system to select ideograms on a phonetic basis is that there are many ideograms that have the same pronunciation. For example, just in the group of 1,850 basic characters in the Japanese language, there are 38 entirely different ideograms, or kanji, pronounced "tō". Sixteen of these ideograms pronounced "tō" are included in the first 881 essential characters prescribed by the Ministry of Education as the minimum requirement for the six years of Japanese elementary school. To pronounce the syllable "tō" by itself would not allow a listener familiar with the Japanese language to determine whether the speaker was referring to, for example, the word meaning "winter" or the word meaning "east" or any of the other meanings of similarly pronounced ideograms. However, to pronounce the word "Tōkyō", which includes the ideogram pronounced "tō" along with the ideogram pronounced "kyō", would immediately indicate to a Japanese-speaking person both of the specific ideograms in that word.
In the Japanese language, the pronunciation of a word made up of two or even more kanji is not always sufficient to identify the kanji. In such cases, a machine or an electronic system will be unable to select a specific kanji based only on phonetic input. In some cases, the ambiguity may be resolved as to one or more kanji of a compound word that comprises more than one kanji, but will not be resolved as to other kanji in the same word. In addition to selecting the proper ideogram by means of a machine operated by a person familiar with the language, there is a somewhat related problem of translating from one language to another. For example, in translation between Japanese and English, there are usually many different English words corresponding to a Japanese word or many different Japanese words corresponding to an English word. In view of this problem, there has not yet been proposed a fully successful translation technique capable of selecting an exactly appropriate foreign language word for each word of a given language. Accordingly, machine translation of languages usually is of poor quality.
SUMMARY OF THE INVENTION
One object of this invention is to provide a selection system capable of selecting a proper symbol from a group of such symbols.
Another object of the invention is to provide a selection system using phonetic input means to obtain the initial information on which the selection is based.
Another object of the invention is to provide an automatic correlation system between a simplified keyboard and a memory in which a large number of symbols is stored, the number of such symbols being far greater than twice the number of keys on the keyboard, and each symbol being selectable with not more than one shift operation per symbol.
Still another object is to provide a printing and display system capable of printing and displaying ideograms rapidly and accurately.
A further object of the invention is to provide an accurate language translation system and method. A further object is to provide a graphic symbol selection system adapted to be incorporated in a communication system.
A still further object of this invention is to select ideograms and print them with a dot-matrix printer.
Yet another object is to select ideograms and print them in a facsimile machine.
A further object is to select ideograms on the basis of plural styles of pronunciation thereof.
A still further object is to select ideograms on the basis of two linguistic characteristics thereof.
Other objects will become apparent from studying the following specification together with the accompanying drawings.
In accordance with this invention, information specific to each of a number of graphic symbols is stored in a memory. For example, the memory could be a printing device and the information could be printing elements formed according to the graphic symbols, themselves. Phototypesetters have images of the symbols they are capable of printing and each symbol is located at a specific address on a master sheet.
Alternatively, the information could be typing elements capable of printing by impact.
Another alternative is that the information could be in the form of electrical or magnetic conditions at known addresses in an electrical or magnetic memory. The information could be capable of controlling a dot-matrix or any other known form of printing device or graphic display device, such as a cathode ray tube (CRT) or a facsimile machine.
The electrically or magnetically stored information is not limited to the type of information that is capable of directly creating a graphic symbol but may simply be in the form of address information to actuate a phototypesetter, for example, and cause a symbol at that address on the master sheet to be printed.
Further in accordance with this invention, the information is stored according to linguistic characteristics and can be retrieved by linguistic data. A first linguistic characteristic is the pronunciation or phonetic representation of a character, which is the way people normally handle a spoken language. In spite of the many homonyms of ideograms, it is still possible to select a number of them on the basis of pronunciation alone. Because of the homonyms, pronunciation is not a sufficient basis for selection, but it still forms a useful first basis for dividing ideograms into groups.
In order to resolve ambiguities that remain after application of the first linguistic characteristic, the invention further contemplates the application of a second linguistic characteristic to at least the relatively small group of symbols obtained by the first selection process. The second linguistic characteristic must be applicable in a unique manner to each of the members of the small group so that any specific symbol in that group can be chosen.
A second linguistic characteristic that can be used in the selection of Japanese ideograms is a different pronunciation of each of the ideograms. This is due to the fact that almost all Japanese ideograms have two styles of pronunciation, known as the on-yomi and the kun-yomi. These are frequently referred to simply as the "on" and "kun" styles of pronunciation. I have found that although there are many homonyms in the "on" style and many homonyms in the "kun" style, any ideogram recorded in an electrical memory can be uniquely selected by applying data consisting of the "on" style of pronunciation and the "kun" style of pronunciation of that ideogram. By presenting this information in a known order, such as the "on" information first and then the "kun" information, it is also possible to select those ideograms that have only an "on" pronunciation or only a "kun" pronunciation by using a special symbol such as a blank or a hyphen to indicate the absence of one or the other styles of pronunciation.
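As an illustration only, the "on-kun" selection just described can be sketched as a table lookup keyed by the pair of pronunciations. The table name, the function names, the handful of sample entries, and the use of a hyphen as the marker for an absent style are assumptions made for this sketch; the specification does not prescribe a particular data layout.

```python
# Hypothetical sample of an "on-kun" table. Keys are (on, kun) pairs given
# in a fixed order; a hyphen marks a style of pronunciation that is absent.
ABSENT = "-"

ON_KUN_TABLE = {
    ("tō", "higashi"): "東",   # east
    ("tō", "fuyu"): "冬",      # winter
    ("tō", "shima"): "島",     # island
    ("tō", "katana"): "刀",    # sword
    ("niku", ABSENT): "肉",    # meat: "on" style only
    (ABSENT, "hatake"): "畑",  # cultivated field: "kun" style only
}


def select_by_on_kun(on=ABSENT, kun=ABSENT):
    """Return the single kanji stored under the (on, kun) pair, or None."""
    return ON_KUN_TABLE.get((on, kun))


def candidates_by_on(on):
    """Return every kanji sharing one "on" pronunciation (the homonym group)."""
    return [kanji for (o, _), kanji in ON_KUN_TABLE.items() if o == on]


print(candidates_by_on("tō"))          # several homonyms remain
print(select_by_on_kun("tō", "fuyu"))  # the pair picks out one kanji
print(select_by_on_kun("niku"))        # "on" style only, "kun" left absent
```

The "on" pronunciation alone narrows the choice to a small group, and the added "kun" pronunciation singles out one member of that group, which is the two-step selection the text describes.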
Another linguistic characteristic that is easier to use in that it imposes fewer constraints on the operator of apparatus used in the selection process is based on the fact that certain ideograms, hereinafter referred to as kanji, the term by which they are designated in the Japanese language, are used only with certain other kanji but are not used in compound words with still further kanji. The word "Tōkyō" previously mentioned is an example of such usage of kanji in a compound word. In this compound word, the second linguistic principle, or characteristic, is the usage of the two kanji together. By holding the information of the pronunciation of the first kanji, "tō", in a buffer memory, and then noting what the following information contains (in this instance, the pronunciation of the immediately following kanji), the proper "tō" kanji can be selected and, at the same time, the proper "kyō" kanji can also be selected.
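The buffering step described above might be sketched as follows. This is a minimal illustration, not the circuit of the invention: the compound table, its three sample entries, and the function name are hypothetical, and only two-kanji words are handled.

```python
# Hypothetical table of two-kanji compounds keyed by the pair of
# pronunciations. Only a few sample words are listed.
COMPOUND_TABLE = {
    ("tō", "kyō"): "東京",   # Tōkyō
    ("kyō", "to"): "京都",   # Kyōto
    ("tō", "hoku"): "東北",  # Tōhoku
}


def select_compounds(readings):
    """Hold one pronunciation in a buffer and resolve it against the next.

    Pronunciations that form no known compound are passed through
    unchanged, standing for text left in phonetic form.
    """
    output = []
    buffer = None
    for reading in readings:
        if buffer is None:
            buffer = reading  # hold the first pronunciation
            continue
        word = COMPOUND_TABLE.get((buffer, reading))
        if word is not None:
            output.append(word)  # both kanji are selected at once
            buffer = None
        else:
            output.append(buffer)  # no compound: leave it phonetic
            buffer = reading
    if buffer is not None:
        output.append(buffer)
    return output


print(select_compounds(["tō", "kyō"]))
print(select_compounds(["kyō", "to"]))
```

Neither "tō" nor "kyō" is looked up alone; the pair is the key, so both kanji are fixed by the same lookup.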
Not all Japanese words can be so easily distinguished. For example, the compound word "tenki" consisting of a first kanji pronounced "ten" and a second kanji pronounced "ki" can mean either "a turning point" or "weather". The four kanji involved in these two words are entirely different. However, the selection may be made by reference to kana associated with the kanji meaning "weather". In this instance, it is common to place the kana pronounced "o" immediately in front of the kanji pronounced "tenki" and meaning "weather", but the kana "o" is never placed in front of the kanji pronounced "tenki" and meaning "a turning point". By holding the immediately preceding kana in a buffer memory and comparing that information with information as to the following kanji, the appropriate kanji can be selected.
A related second linguistic principle is to utilize kana that follow the kanji. For example, the kanji pronounced "kiso" meaning "the foundation" are entirely different from the kanji also pronounced "kiso" but meaning "prosecute". In the latter instance, "kiso" is a verb and is followed by the kana "suru", which do not follow the kanji pronounced "kiso" and used as a noun meaning "the foundation". These preceding and following kana that are closely associated with the kanji in Japanese words are called "okurigana".
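The two okurigana examples above ("o" before "tenki", "suru" after "kiso") can be sketched as a lookup that consults the neighbouring kana. The table layout, field names, and function name are assumptions for this sketch only.

```python
# Hypothetical homonym table. Each candidate lists the kana that may
# precede it ("before") or follow it ("after") and thereby identify it.
HOMONYMS = {
    "tenki": [
        {"kanji": "天気", "before": {"o"}, "after": set()},  # weather
        {"kanji": "転機", "before": set(), "after": set()},  # a turning point
    ],
    "kiso": [
        {"kanji": "基礎", "before": set(), "after": set()},     # the foundation
        {"kanji": "起訴", "before": set(), "after": {"suru"}},  # prosecute
    ],
}


def select_with_okurigana(reading, before=None, after=None):
    """Return one kanji word if the adjacent kana settle the choice.

    Otherwise return the full list of candidates, which remains ambiguous.
    """
    entries = HOMONYMS.get(reading, [])
    hits = [
        e["kanji"]
        for e in entries
        if (before is not None and before in e["before"])
        or (after is not None and after in e["after"])
    ]
    if len(hits) == 1:
        return hits[0]
    return [e["kanji"] for e in entries]


print(select_with_okurigana("tenki", before="o"))
print(select_with_okurigana("kiso", after="suru"))
print(select_with_okurigana("tenki"))  # no kana given: still ambiguous
```

When no identifying kana is present the function returns every candidate, so some other basis of selection is still needed in that case.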
Selection of kanji on the basis of association with other kanji or on the basis of okurigana has the important advantage that this is the way the language is normally used, but there are still instances in which kanji cannot be uniquely identified by reference to either okurigana or to other associated kanji. For example, in one of the examples just given, the kana pronounced "o" does not necessarily have to precede the kanji pronounced "tenki". The kana pronounced "o" makes the usage polite, but one can speak about the weather to other people without using the polite form of the word. In that case, the second linguistic principle to distinguish the word meaning "a turning point" from the word meaning "weather" can be the sense of the statement or the syntax in which the particular kanji are used. Alternatively, an operator of the system of this invention can rely on the "on-kun" method of selection, but to do so requires the input of additional data not normally included in a message. Normally, one uses only the "on" style or the "kun" style at any given point but not both styles.
Still another linguistically-related principle that can be applied to select kanji that cannot be selected by reference to associated kanji or to okurigana is the graphic depiction of all of the kanji having a similar pronunciation and the associated graphic depiction of identifying information, such as the address of each of these kanji, to allow a person familiar with the language to choose the proper kanji and refer its address back to the apparatus. For selection of a group of symbols totaling less than 10,000 but more than 1,000, the address may be in the form of a number having four decimal digits. The presentation of the ambiguous kanji and their addresses can be accomplished automatically by the system without requiring any additional input from the operator and it is thus less burdensome than the selection based on multiple styles of pronunciation. However, it does require that the person making the selection on the basis of the graphically-presented kanji be capable of reading those kanji, whereas at least in principle, it is possible for information utilizing the previously-mentioned forms of linguistic analysis to be fed into the system by a keyboard using the Roman alphabet and thus operable, on a letter by letter basis, by an operator who knows nothing of the Japanese language but simply follows a text written in the Roman alphabet.
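The presentation of homonyms together with four-digit addresses might be sketched as below. The address values, table names, and function names are invented for illustration; the specification states only that the address may be a number of four decimal digits.

```python
# Hypothetical address memory: four decimal digits address up to 10,000
# symbols. The address values themselves are invented for this sketch.
KANJI_AT = {
    "2101": "天気",  # "tenki", weather
    "2102": "転機",  # "tenki", a turning point
    "3050": "基礎",  # "kiso", the foundation
    "3051": "起訴",  # "kiso", prosecute
}

ADDRESSES_FOR = {
    "tenki": ["2101", "2102"],
    "kiso": ["3050", "3051"],
}


def present_candidates(reading):
    """List each candidate with its address, as it would be displayed."""
    return [(KANJI_AT[a], a) for a in ADDRESSES_FOR.get(reading, [])]


def choose(address):
    """Return the kanji at a four-digit address entered by the operator."""
    if len(address) != 4 or not (address.isascii() and address.isdigit()):
        raise ValueError("address must be exactly four decimal digits")
    return KANJI_AT.get(address)  # None if nothing is stored there


for kanji, address in present_candidates("tenki"):
    print(address, kanji)
print(choose("2102"))
```

The system does the listing without further input; the operator's only contribution is the four digits of the chosen address.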
This is a point of major significance in the present invention. It means that the selection system of this invention can utilize a keyboard of the standard size in Roman letter typewriters, such as are used to type English and most of the European languages, to select specific ideograms from a group of ideograms much larger in number than the number of keys on the keyboard. Furthermore, the Japanese phonetic syllabaries, hiragana and katakana, have only about 50 symbols, and the keys of an electric typewriter keyboard originally set up for the Roman alphabet can easily be modified to accommodate the kana.
There is no specific size limitation to be set on the keyboard. It need not have exactly the same letters or position of keys as a standard typewriter capable of writing English. What is important is to note that, as a data input device, a keyboard suitable for use in a system according to the present invention can be small enough to be spanned by the outstretched fingers of two hands of average size and yet the information entered through this keyboard utilizing the apparatus of the present invention can produce a graphic display of more than 2,000 graphic symbols. Furthermore, selection of the graphic symbols in accordance with the linguistic characteristics as just described makes it unnecessary to use each key to obtain more than two symbols in each mode of operation. That is, for a given mode of operation to obtain the proper kanji where desired in writing Japanese text material, the present invention requires, at most, shifting between two symbols, or symbol styles, for each key. This corresponds to writing English in which the operation is shifted between one level for lower case letters and another level for upper case letters.
The apparatus of this invention includes a memory in which the graphic symbols or information defining such symbols is stored in specific address locations. These addresses may be reached by signals processed by a suitably programmed computer or by a sequence of storage, comparison, and switching circuits. The full scope of operation of a computer is unnecessary because there is a finite, specific number of symbols or addresses to be retrieved by a finite set of data.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a block diagram illustrating one embodiment of this invention;
Figure 2 is a plan view of one embodiment of a keyboard for use in the present invention;
Figure 3 is a schematic diagram illustrating the procedure followed in encoding a typical Japanese expression by use of the keyboard shown in Figure 2;
Figure 4 is a detailed schematic diagram of a component of the system shown in Figure 1;
Figure 5 is a schematic diagram of another embodiment of the invention;
Figures 6A-6L are illustrative examples of the use of ideograms in Japanese writing;
Figure 7 is a block diagram of a modified system incorporating the invention;
Figure 8 is a plan view of another embodiment of a keyboard for use in the present invention;
Figure 9 is a block diagram of a terminal suitable for use in the circuit in Figure 7;
Figure 10 is a simplified illustration of the screen of a cathode ray tube displaying information according to the present invention;
Figure 11 is a simplified drawing of one sheet of computer fanfold paper arranged to be used in accordance with this invention; and
Figures 12A and 12B illustrate two types of graphic symbol display in accordance with the present invention.
DETAILED DESCRIPTION OF THE INVENTION
The present invention is particularly well suited for use with the Japanese language. As it has been pointed out above, written Japanese usually consists of "kanji" or Chinese characters, mixed with "kana" or Japanese phonetic characters. Kana characters are relatively easy to select, type, or print because there are only a relatively small number of them. However, the Chinese characters cause all of the problems described above.
In the Japanese language, as in the Chinese language, almost all of the Chinese characters have more than one style of pronunciation. In Chinese, often there are four or more pronunciation styles. In Japanese, there are two styles. One such Japanese style is called the "kun" style, and the other is called the "on" style. In the preferred embodiment of this invention, each Chinese character is represented by a signal composed of two separate signals, one representing the "kun" style and the other the "on" style of pronunciation of the character. By using Japanese phonetic (kana) characters or their Roman alphabet equivalents to represent the "kun" and "on" styles for each Chinese character, a keyboard and system using fewer than fifty different character keys can be used to operate a teleprinter, phototypesetter, or graphic display to graphically reproduce typed or printed matter including the precise Chinese character desired.
Figure 1 of the drawings shows a system for encoding, decoding, and graphically displaying Japanese phonetic and Chinese characters to form written matter in the Japanese language. This system includes a keyboard unit 10 with character keys 12. When the keys 12 are depressed by an operator, the unit 10 sends coded electrical signals to a conventional tape punch unit 14 which produces a punched paper tape 16 bearing binary-coded arrays of holes each representing a character key which was depressed. The matter being typed is typed in accordance with the above-described "on-kun" code in which each Chinese character is represented by Japanese phonetic (kana) or Roman characters which represent the "kun" and the "on" styles of pronunciation. Words which are to be printed or otherwise displayed in kana form can be typed directly, without use of the "on-kun" code. Each set of signals using the "on-kun" code is segregated from the other signals by appropriate start and stop signals.
The punched tape 16 is delivered to a conventional tape reader 18 which produces coded electrical signals corresponding to the punch-coded signals on the tape 16. These electrical signals are conducted to a conventional code decoder 20 which detects the start and stop signals surrounding each "on-kun" code sequence and delivers a gating signal on a lead 22 to a switching device 26. Switching device 26, which can be a conventional bi-stable circuit such as a flip-flop, directs the coded signals it receives over lead 24 to one of two leads 28 or 30, depending upon which of its bi-stable conditions it is switched to by the gating signal received on the lead 22. Upon the receipt of a "start" signal signifying the start of a sequence of "on-kun" coded signals, the device 26 switches to one mode in which the coded signals are delivered over output lead 28 to a code converter device 32. Code converter 32 stores coded signals representing Chinese characters and delivers one of those signals over an output lead 34 to a utilization device 36 in response to the receipt of the "on-kun" code designation of a selected character.
The utilization device can be any type desired, such as a photocomposing machine, cathode ray tube display, or teleprinter, each of which prints or otherwise graphically displays the Chinese characters.
When a "stop" signal is received by the decoder 20, another gating signal is sent over line 22 to the switching device 26 to switch it into a second condition in which the coded signals are delivered directly to the utilization device 36 over lead 30. Thus, signals will be received in the device first over one lead 34 and then the other lead 30, depending upon whether the characters to be printed are Chinese or other characters.
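The gating behavior of decoder 20 and switching device 26 can be sketched in software. The following is a minimal illustration, not the circuit itself: the marker name `MARK`, the function name `route`, and the choice to drop the marker from the output are assumptions made for the example. It relies on the fact, stated later in the description, that key II produces the same coded signal for start and stop.

```python
# Hypothetical token standing in for the "on-kun" start/stop signal (key II).
MARK = "II"

def route(signals):
    """Return (lead, signal) pairs for a stream of coded signals.

    Lead 28 feeds the code converter 32; lead 30 goes directly to the
    utilization device 36. Each marker toggles the state, in the manner of
    the bi-stable switching device 26. The marker itself is not forwarded
    (an assumption of this sketch).
    """
    to_converter = False              # begin in the direct (kana/Roman) condition
    routed = []
    for sig in signals:
        if sig == MARK:
            to_converter = not to_converter   # gating signal on lead 22
            continue
        routed.append((28 if to_converter else 30, sig))
    return routed
```

Signals typed between two markers are thus sent to the converter, and everything else passes straight through.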
The punched tape is but one example of a register for storing the encoded character signals from the keyboard device 10. Other permanent, semi-permanent or temporary registers such as magnetic tape, punched cards, etc., can be used instead of punched tape, or the use of a register can be dispensed with entirely. In the latter case, the output of the keyboard device 10 would be connected directly to the decoding device 20.
If all of the characters being selected are ideograms, as would be the case if the Chinese language were being encoded and printed, the keyboard device could be connected directly to the converter 32, thus eliminating elements 20 and 26 from the circuit.
KEYBOARD DEVICE
Figure 2 is a schematic plan view of the keyboard device 10. The keyboard includes character keys 12 each of which is marked with a Roman letter 38 and a Japanese phonetic (kana) character 40. Number keys 42 are marked with an Arabic numeral, and, since the Arabic numerals are themselves ideograms with directly corresponding Chinese ideograms, one key can be used to represent the same number in each language if desired. Certain other keys 44, termed herein as "quadrated" keys, are marked with three or four different symbols, some being kana characters, and others being English ideograms.
Five different function keys are provided. Key V is for upper case Roman characters and Arabic numerals; Key IV is for lower case Roman characters and Arabic numerals; Key III is for the "katakana" form of kana characters; Key I is for the "hiragana" form of kana characters; Key II is the "on-kun" code signal key which records the stop and start signals indicating that the code is for Chinese (kanji) characters. The "M" key designates the switch to turn the keyboard on and off. The one of the characters on quadrated keys 44 which is selected depends upon which of the four keys I, III, IV or V is actuated. A space bar 46 is also provided to space the characters from one another.
An example of how the keyboard device 10 can be used is illustrated in Figure 3. The expression "at first light exists", indicated by reference numeral 49 at the bottom of Figure 3, is properly written in Japanese as shown by the expression at the top of Figure 3 which is indicated by numeral 47. The symbols indicated by reference numeral 51 comprise a single Chinese or "kanji" character which means "first". Characters 55 and 57 are Japanese phonetic characters (kana characters) which create a meaning, together with character 51, of "at first". Character 59 is another Chinese character which means "light". Characters 61 and 63 are two further Japanese phonetic characters which together mean "exists". The encoding of the Japanese expression 47 will now be explained as an example.
First, the "M" key is depressed to turn the keyboard device 10 on. Then the II key is depressed to indicate that a Chinese character will be encoded. The keyboard device 10 is one of several devices which are commercially available for converting keystrokes into appropriately coded electrical signals. The electrical signals are used to operate the tape punch 14, or to otherwise operate in the system shown in Figure 1. Any particular binary code can be used as desired. For example, either six- or seven-level "Teletypesetter" (TTS) code can be used. Thus, when key II is depressed, a signal is encoded which identifies the beginning of a Chinese character code train.
Table 65 in Figure 3 shows the Japanese phonetic and the Roman alphabet components of both the "kun" and the "on" pronunciations for the Chinese character 51. Thus, the "kun" pronunciation of character 51 is "ha-ji-me", and the "on" pronunciation is "si-yo".
In accordance with the present invention, Chinese character 51 is encoded by first pressing the key 48 for the Japanese phonetic symbol for "ha". The next phonetic symbol is formed by successively depressing keys 50 and 52. Since the first two syllables in the "kun" pronunciation are sufficient to uniquely identify the character 51, it is not necessary to depress a third key to represent "me", although the third key can be depressed if the operator desires. Instead, a key 53 may be depressed which encodes a signal on the tape 16 which indicates the end of the "kun" pronunciation and the beginning of the "on" pronunciation. Next, the key 50 for "si" and the key 54 for "yo" are depressed. This completes the coding except for ending the Chinese character. This is done by depressing key II again, thus placing a coded signal which is the same as the start signal on the tape.
The Japanese phonetic characters 55 and 57 do not need to be specially encoded. Thus, the next step in encoding the expression 47 is to depress the Japanese phonetic key I to condition the keys 12 to encode Japanese phonetic characters, and then keys 56 and 58 corresponding, respectively, to phonetic characters 55 and 57 are depressed.
Next, the Chinese character 59 is encoded by first again pressing the Chinese character key II, and then depressing, in succession, keys 60, 62, and 64 for the "kun" pronunciation. Next, key 53 is depressed, to separate the different pronunciations, and the "on" pronunciation of the Chinese character 59 is encoded by depressing successively keys 66 and 68, in accordance with table 67. This Chinese character is ended by again depressing key II.
The final two phonetic characters 61 and 63 are encoded by once again depressing key I, and then keys 70 and 64. The depression of the "M" key turns the keyboard off.
All three phonetic symbols for the "kun" pronunciation of the Chinese character 59 were used as an example to show how possible ambiguity with another character having the "kun" pronunciation "hi-ka" and the "on" pronunciation "ko-u" is eliminated.
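The keying order described above for one Chinese character (key II, the "kun" syllables, key 53, the "on" syllables, key II again) can be summarized in a short sketch. The names `START_STOP`, `SEPARATOR`, and `encode_kanji` are hypothetical, and the sketch works at the level of syllables rather than individual keys, so a syllable such as "ji" that takes two keystrokes (keys 50 and 52) appears as a single item.

```python
START_STOP = "II"   # key II: the same signal marks the start and the end
SEPARATOR = "53"    # key 53: ends the "kun" part and begins the "on" part

def encode_kanji(kun, on):
    """Return the signal sequence for one kanji from its kun and on syllables."""
    return [START_STOP, *kun, SEPARATOR, *on, START_STOP]

# Character 51: the first two "kun" syllables suffice, followed by the
# "on" reading "si-yo" given in table 65.
char_51 = encode_kanji(["ha", "ji"], ["si", "yo"])
print(char_51)
```

Supplying a third "kun" syllable, as was done for character 59, only lengthens the list between key II and key 53.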
It can be seen that the keyboard device 10 is vastly simpler than previous keyboard devices for encoding the Japanese language characters. In the specific example shown in Figure 2, there are only 49 character keys, five function keys, and one space bar. This is in sharp contrast to the hundreds of keys which appear on most prior art Japanese keyboards, which are physically much larger than the standard Roman alphabet keyboard device 10. Furthermore, the keyboard can be simplified even further if only Japanese phonetic characters are desired to be used in the typewriter, or if only Roman characters are to be used. In such a case, some of the function keys and their associated circuitry can be eliminated.
Examples of words and sentences printed in Japanese and Chinese characters follow in order to illustrate further the encoding arrangement of the present invention. The symbol "<" is used to represent a "start" symbol, and the symbol ">" is used as a "stop" symbol for a Chinese character. Thus, a typical short Japanese sentence is shown in Figure 6A. With start and stop symbols, the sentence reads as shown in Figure 6B. After encoding the Chinese characters in accordance with the "on-kun" code described above, the sentence reads as shown in Figure 6C.
I have found that almost all of the approximately 1,970 Chinese characters considered to be essential to Japanese language printing can be represented uniquely by a combination of the "on" and "kun" pronunciation styles.
However, in the case of a few other Chinese characters, some either have no "kun" style at all, or the "kun" style is rarely used. Examples of the former case are Chinese characters such as are shown in Figure 6D. Examples of the latter case are Chinese characters such as are shown in Figure 6E. In these cases, the "on" style is repeated in order to uniquely represent each character. Thus, the Chinese characters shown in Figure 6F are represented, respectively, by the combinations shown in Figure 6G. Some Chinese characters of the type shown in Figure 6H, each having an "on" reading of only one character, are represented by the respective combinations, each of which is constructed by adding the corresponding "kun" reading just after the only character of the "on" reading, as shown by the combinations in Figure 6I. Moreover, some Chinese characters, such as those shown in Figure 6J, which can be designated only by either the "on" style or the "kun" style, are represented by only one of the styles, together with signals indicating two blank spaces as shown in Figure 6K, where the symbols ". ." represent blank spaces.
The "on-kun" code for the Chinese characters appearing in the short sentences set forth in Figure 6A is illustrated in Table 1, as shown in Figure 6L.
THE CODE CONVERTER
Figure 4 shows an example of circuitry which can be used to convert "on-kun" coded signals into Chinese characters. The converter 32 includes a shift register 100 connected to the input lead 28, and a register 102 connected to the shift register 100. These registers are conventional and are adapted to store two coded signals apiece and then read out their signals. When the register 100 has received two "kun" coded words, it shifts those words into the second register 102. When both registers 100 and 102 are full, the signals stored in those registers are transferred to conventional decoders 106 and 104, respectively.
If, as in some of the foregoing examples, the "kun" code signals are delivered first, they are decoded in the decoder 104 whereas the "on" signals are decoded in the decoder 106. Of course, if the "on" and "kun" codes are reversed in order, the opposite signals are decoded by each decoder.
Each decoder 104 or 106 produces an output signal on only one of its output leads for a given combination of input signals. Thus, for example, the decoder 104 might deliver an output signal over lead 108, and the decoder 106 might deliver the signal over the lead 112. The lead 108 is connected by means of another lead 110 to a plurality of different storage units 113. Each of the storage units 113 stores a coded electrical signal representative of one Chinese character. Each storage unit 113 includes "AND" circuit means so that its stored signal will be read out only when a signal is received on each of two input leads.
Each output lead of the decoder 106 also is connected to every storage unit 113 which has the "on" pronunciation represented by the lead 112. Thus, by means of a lead 114, the lead 112 is connected to the second input lead of the same storage device 115 as the one to which the lead 108 of decoder 104 is connected. When the storage unit 115 thus receives two input signals simultaneously, indicating that storage unit 115 contains a code corresponding to a Chinese character which has both the "on" and the "kun" pronunciation input to the device 32, the stored signal is delivered over an output lead 116 to an "OR" gate 118 which delivers the signal over lead 34 to the utilization device 36. There is, of course, one input lead to the "OR" gate 118 from each of the storage devices 113.
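The selection logic of the converter 32 amounts to a coincidence test followed by a merge: a storage unit reads out only when both its "kun" lead and its "on" lead are active, and the "OR" gate passes along whichever unit responded. The sketch below models this under stated assumptions. The table `STORAGE_UNITS`, the function `convert`, and the character codes are hypothetical; the first entry uses the readings given for character 51, and the second is invented purely to show that a matching "kun" alone is not enough.

```python
# Hypothetical storage units 113. Each holds one character code and is
# wired to one "kun" decoder lead and one "on" decoder lead.
STORAGE_UNITS = [
    {"kun": ("ha", "ji"), "on": ("si", "yo"), "code": "CHAR_51"},
    {"kun": ("ha", "ji"), "on": ("ko", "u"), "code": "CHAR_HYPOTHETICAL"},
]

def convert(kun, on):
    """Return the stored code whose unit sees both of its inputs active."""
    kun, on = tuple(kun), tuple(on)
    # "AND" circuit in each storage unit: both leads must carry a signal.
    outputs = [u["code"] for u in STORAGE_UNITS
               if u["kun"] == kun and u["on"] == on]
    # "OR" gate 118: forward the output of whichever unit responded.
    return outputs[0] if outputs else None
```

Each reading is held as two syllables here because the registers 100 and 102 are described as storing two coded signals apiece.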
There are many "read only" memories available on the market which can be used to store the character signals to be delivered to the device 36.
Figure 5 illustrates another embodiment of the present invention. The keyboard unit 10 is connected to a visual display device 150 of a well-known type. One such device is part of the machine known as the "Chicoder" machine which is sold by the Itek Company, Lexington, Massachusetts. It has a display screen 156 on which Chinese characters can be displayed, one within each one of the squares on the screen. A horizontal array of switch buttons 152 and a vertical array of switch buttons 154 are provided along the edges of the screen 156. A particular Chinese character displayed on the screen can be selected by sight and transmitted from the device 150 to the utilization device 36 by pressing a button corresponding to the proper row and column of the desired Chinese character.
In accordance with another aspect of the present invention, the keyboard device 10 shown in Figure 5 is operated so as to develop coded signals corresponding only to either the "on" or the "kun" style of pronunciation of a particular Chinese character desired to be selected. This signal is delivered to the display unit 150 which then will display each of the Chinese characters having that particular "on" or "kun" pronunciation. Then the desired Chinese character is selected visually and read out of the device 150 in the manner described above. This provides an extremely simple solution to the character selection problem.
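The display-and-pick procedure of Figure 5 can be illustrated as three steps: collect every character having the entered reading, arrange the candidates in the squares of the screen, and read out the one at the chosen row and column. This is only a sketch. The function names, the grid width, and the three-entry `TABLE` are assumptions for illustration, with readings written in the hyphenated romanization used in this description.

```python
# Illustrative table of (character, readings); not taken from the figures.
TABLE = [
    ("光", {"ko-u", "hi-ka-ri"}),
    ("高", {"ko-u", "ta-ka"}),
    ("火", {"ka", "hi"}),
]

def candidates_for(reading, table):
    """All characters that have the given "on" or "kun" reading."""
    return [char for char, readings in table if reading in readings]

def layout(chars, columns):
    """Arrange candidates row by row, one per square of the screen."""
    return [chars[i:i + columns] for i in range(0, len(chars), columns)]

def select(grid, row, col):
    """Read out the character at the pressed row and column buttons."""
    return grid[row][col]

grid = layout(candidates_for("ko-u", TABLE), columns=2)
```

With this table, the reading "ko-u" yields one row of two candidates, and the operator's row and column choice picks between them.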
Also it should be understood that characters can be selected by means of a pre-programed general-purpose digital computer, such as a Honeywell Level 6-36, instead of the permanently-wired circuitry represented by elements 20, 26 and 32 of Figure 1. Such a general-purpose computer may be of any well-known type and will not be described in detail herein. The detailed steps to be performed by actually programming the computer are well within the skill of those knowledgeable in the computer programming art. However, to review, the computer would be programmed to store at each address in its memory a separate signal representative of a distinct Chinese character. A signal would be read out of each address only when signals representing both the "on" and "kun" meaning for the character were received at that address in the storage. The input data to such a computer could be in the form of a punched paper tape 16, magnetic tape, punched cards or other well-known digital computer input media.
Although in the preferred embodiment each Chinese character is represented by only two different pronunciations, it should be understood that, especially when encoding Chinese characters in the Chinese language, more than two different pronunciations can be used to encode each character. Thus, the character encoding capacity of the system will be increased.
The system and method of the present invention is very easy to use by relatively unskilled personnel familiar with the Japanese language. Even those with only the most rudimentary education in Japanese learn to speak and write both the "on" and "kun" styles of pronunciation for Chinese characters. Thus, the operation of the keyboard of the present invention makes advantageous use of the basic knowledge of most people with a fundamental education in the Japanese language. The vast reduction in the number of different keys to be depressed makes the keyboard device of the present invention vastly more simple to use, faster in operation, smaller in size and weight, and of less complicated and expensive mechanical construction than prior art keyboards. Furthermore, since a single data processor can be used to process the input data from a plurality of different keyboards, the savings due to reduction of complexity in the keyboard equipment can be multiplied by the number of different keyboard units which can be used with a single data processor.
Figure 7 is a simplified block diagram of major components of a computer arranged to operate as a communication system incorporating the present invention. The circuit shown in Figure 7 is basically a Honeywell Level 6-36 computer, the components of which are connected to a bus 161 referred to as a megabus. The components connected to the megabus 161 include a CPU 162 having a 64KW memory 163 connected to it.
The megabus 161 also has a mass storage controller 164 connected to it and two discpacks 167 and 168 connected thereto. Each of the discpacks has a 5 megabyte storage capacity.
The megabus 161 also has a multi-line communication processor (MLCP) 169 connected to it with one or more communication packs (CP's) 171 connected to the MLCP 169. In the embodiment shown in Figure 7, each of the CP's has four asynchronous ports 172-175 connected to it. The port 172 is connected to a subscriber's telex channel operating at a 50 baud rate. The port 173 is connected by means of an acoustic coupler to a telephone line and operates at a 300 baud rate. The port 174 is connected to a data communication channel operating at up to 19.2 kilobauds. The port 175 operates through a suitable connector such as an RS232C connected to a direct line to permit wide band signals to pass therethrough.
As in the case of most computer equipment, the apparatus shown in Figure 7 is capable of being expanded to accommodate additional components. For example, a second MLCP 176 may be connected to the megabus 161. A multi-device controller 177 may be connected between the megabus 161 and a keyboard device pack 178. The latter is connected to a keyboard device 179.
The computer system shown in Figure 7 may be operated as a time-sharing system so that a number of terminals may be connected to feed signals into it and to receive signals from it. Figure 8 shows a typical keyboard 179. This keyboard may be identical with the Honeywell VIP 7200 keyboard. The main part of the keyboard includes all of the letter and numeral keys found on a standard typewriter operating on the Roman alphabet. The keys marked with Roman alphabetic symbols are identified in general by reference numeral 181. The keys marked with Arabic numerals in the next to the top line of keys in the keyboard are indicated by reference numeral 182. At the right hand side of the keyboard in Figure 8 is a numerical pad generally indicated by reference numeral 183. Each of these keys is connected to a correspondingly-numbered key in the numeric keys 182 in the main part of the keyboard.
The top row of keys in the keyboard in Figure 8 contains a number of function and mode keys. These are the keys that control most of the functions found in a standard electric typewriter.
The keyboard in Figure 8 may be connected to a terminal shown in block form in Figure 9. This is a typical configuration for a terminal and includes the keyboard 179, a printer 184, a cathode ray terminal 186 that includes a cathode ray tube, and a control system 187 including a micro-processor such as a Motorola 6800 along with a random access memory and a character memory in the form of an erasable programmable read-only memory capable of storing information allowing the printout of 2500 characters including about 2250 kanji along with a full set of katakana and hiragana symbols and all of the Roman letters, both upper and lower case, and the Arabic numerals from 0 through 9, together with various punctuation and other symbols. The terminal in Figure 9 further includes a switch 188 that connects the terminal to any one of four lines which are the same four lines as identified in connection with the asynchronous ports 172-175 in Figure 7. Thus, the terminal in Figure 9 can be connected directly to the computer system in Figure 7 by any suitable one of the four lines and can also be connected to another terminal by any one of the four lines. This makes the system very flexible in its modes of communication.
When the keyboard in Figure 8 is operated in the terminal shown in Figure 9, the function keys in the top row indicate the form of input and output information. The keys in the keyboard 179 are marked with Roman letters and with katakana symbols along with other non-phonetic symbols. The system may be so arranged that, when it is placed in operation, the information typed in must be typed according to either the katakana symbols on the keys or according to the romaji, or Roman letter, markings on the keys. The difference is simply that, if the information is to be typed in using the katakana designations, the phonetic katakana symbol pronounced "ka" must be typed in by actuating the key that is also marked with the Roman letter "F". If the information is to be presented (for the Japanese language) by using Roman letters, the same pronunciation "ka" could be obtained by typing the letters "k" and "a" in succession.
The output of the information typed in at the keyboard 179 may be displayed on the cathode ray terminal 186 in any of several ways. By depressing the key marked "hiragana", the information may be displayed in hiragana. If the input is in katakana, there will be one character generated on the face of the CRT in the hiragana form for each katakana key struck on the keyboard 179. The significance of this is that an operator can have an immediate feedback of information that a key has been actuated. On the other hand, if the information is typed in Roman letters, the direct 1:1 correspondence between the information entered by way of the keys and the display on the cathode ray tube of the terminal 186 is not present. As a result, if it is desired to present the display in the form of hiragana but to have the input in Roman letters, the system is arranged so that, as each Roman letter key is actuated, the corresponding Roman letter will appear on the screen of the cathode ray tube in the terminal 186. At the end of each word, after a space signal has been applied by means of the space bar 191 and the first letter of the following word has been typed, the Roman letters of the immediately-preceding word will automatically be changed into hiragana. Thus, the "echo" effect is obtained but the information is still presented in the form of hiragana.
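The deferred conversion just described, in which Roman letters are echoed as typed and the preceding word is changed to hiragana only once the next word has begun, can be sketched as follows. The syllable table `KANA` is a small illustrative subset, the function names are hypothetical, and keeping the space on the display is an assumption of this sketch rather than something the description specifies.

```python
# Small illustrative romaji-to-hiragana table; a real table would be complete.
KANA = {"a": "あ", "i": "い", "u": "う", "ka": "か", "ko": "こ",
        "si": "し", "ha": "は", "hi": "ひ", "me": "め", "yo": "よ", "ri": "り"}

def to_hiragana(word):
    """Convert a romaji word by greedy matching of 2-letter, then 1-letter units."""
    out, i = [], 0
    while i < len(word):
        for size in (2, 1):
            chunk = word[i:i + size]
            if chunk in KANA:
                out.append(KANA[chunk])
                i += size
                break
        else:                       # no table entry: echo the letter unchanged
            out.append(word[i])
            i += 1
    return "".join(out)

def screen_after(keys):
    """Return the displayed text after the given keystrokes."""
    shown = ""       # words already converted to hiragana
    word = ""        # Roman letters of the word being typed
    pending = None   # finished word still displayed in Roman letters
    for key in keys:
        if key == " ":
            if pending is not None:          # consecutive spaces: flush first
                shown += to_hiragana(pending) + " "
            pending, word = word, ""
        else:
            if pending is not None:          # first letter of the next word
                shown += to_hiragana(pending) + " "
                pending = None
            word += key
    tail = pending + " " if pending is not None else word
    return shown + tail
```

Typing "hikari", a space, and then "a" therefore leaves the first word in hiragana with the letter "a" echoed after it.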
Alternatively, the information can be presented in katakana by actuating the key marked "katakana".
The kana shift key controls the shifting operation similar to the upper and lower case operation of a standard typewriter, but indicated on each of the symbol keys by the two symbols on the right hand side of each of these keys.
The "on-kun" key controls the operation described in greater detail in connection with Figures 1-4. All that needs to be added at this time is that each of the kanji stored in the memory of the system has an address which is preferably in the form of a four-digit code number. Once the specific kanji has been selected by the "on-kun" technique, the four-digit number may be presented on the screen of the cathode ray terminal 186 and this number can then be entered by way of the numerical pad 183 to cause the printout of the appropriate kanji.
When the system is to be operated in such a way as to rely on the preceding and the following kana or the associated kanji in order to determine which kanji to display either on the printer 184 or the cathode ray tube in the terminal 186, the kanji shift keys 192 and 193 are used. The input may be in either Roman letters or in katakana. Before typing in the information for one kanji or a group of kanji, one of the kanji shift keys 192 or 193 must be depressed. This has the effect of shifting the operation of the system to obtain processing of the Roman letters or kana symbols entered immediately thereafter into appropriate kanji. Such transformation continues until the kanji shift key 192 or the kanji shift key 193 is actuated a second time to unshift the system. In the unshifted condition, the information entered will not be processed to obtain kanji but will simply be passed along in kana form.
The processing into kanji does not take place immediately as soon as the symbols are typed in. Instead, the symbols are retained in hiragana form, but those symbols that are entered when the keyboard is in its shifted condition will be displayed on the cathode ray tube screen of the terminal 186 in negative form. This means that instead of having the symbols appear to be white on a dark background of the cathode ray screen, they will appear to be dark on a small bright area that extends only far enough to serve as a background for a single such symbol. When the end of the sentence is reached, and a period is typed in and then a return key 194 is actuated, the system will process the negative kana into kanji, to the extent that it can do so. However, it is not always possible for the appropriate kanji to be determined without ambiguity simply on the basis of the phonetic information and the additional linguistic information afforded by the preceding and following kanji or the preceding and following kana. As a result, some of the kana may remain displayed in negative form on the screen of the cathode ray terminal 186.
If the kanji code key is depressed before a sentence is typed in, the negative display of kana will be shown until the code signal for each of the kanji in the sentence is returned from the central memory in the system shown in Figure 7. If it turns out that the kanji shift key has been depressed to call for transformation into a kanji and there is no kanji that can be identified in the memory, the system in Figure 7 will send back information that no such kanji exists and in that case the negative display of those particular kana will be transformed into positive display. However, if there is more than one kanji but the system is incapable of determining which of the two or more kanji should be substituted for the negatively-displayed kana, the system will cause the display on the bottom half of the cathode ray screen in the terminal 186 of all of the kanji having the same pronunciation. These kanji will be displayed along with their four-digit code numbers and the operator, who must be familiar with the Japanese language, can then select the proper kanji. This selection can then be entered back into the keyboard by typing in the four-digit code number by means of the numerical pad 183 of the keyboard in Figure 8.
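The three outcomes described above (no kanji found, exactly one kanji, or several candidates requiring a choice by four-digit code) can be laid out as a small decision routine. This is a sketch under assumptions: the function names, the return tags, and the `DICTIONARY` entries with their code numbers are all hypothetical and serve only to illustrate the flow.

```python
# Hypothetical dictionary: reading -> list of (four-digit code, kanji).
DICTIONARY = {
    "ko-u": [("0001", "光"), ("0002", "高")],
    "hi-ka-ri": [("0001", "光")],
}

def resolve(reading, dictionary):
    """Decide what to show for kana that were entered in the shifted condition."""
    matches = dictionary.get(reading, [])
    if not matches:
        return ("kana", reading)         # no such kanji: switch to positive display
    if len(matches) == 1:
        return ("kanji", matches[0][1])  # unambiguous: substitute the kanji
    return ("choose", matches)           # list candidates with their codes

def pick(matches, code):
    """Return the kanji whose four-digit code was typed on the numerical pad."""
    for candidate_code, kanji in matches:
        if candidate_code == code:
            return kanji
    raise ValueError("no candidate has code " + code)
```

A "choose" result corresponds to the candidates shown on the bottom half of the screen, and `pick` corresponds to the operator's entry on the numerical pad 183.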
In displaying the kanji on the screen of the cathode ray terminal 186, it has been determined that they may be shown with sufficient clarity if they are presented as a matrix array of dot elements consisting of 18 vertical elements and 18 horizontal elements for each kanji. Of course, not all of the 324 possible dot elements will be used in the depiction of a single kanji. However, simple symbols, such as Arabic numbers and Roman letters, do not require such an extensive array of dots to be presented quite clearly. They may be presented by an array of 9 horizontal and 9 vertical rows of dots. Thus, as indicated in Figure 10, a sentence having several groups of negatively-displayed kana indicated by shaded blocks 196-198 will cause a plurality of groups of kanji to be depicted in the lower part of the screen. The kanji 201 has its four-digit code presented in a corresponding 18 x 18 array 203 immediately below the kanji. Because of the fact that the four numbers that make up the code for the kanji 201 can each be displayed in a 9 x 9 array, all four can be presented in the same space as the kanji 201, itself. In the same way, the four-digit code 204 for the kanji 202 is displayed immediately below it and in the same manner the kanji 206 and 207 that correspond to the kana 197 are displayed with their four-digit codes 208 and 209, respectively. In the case of the kana 198, three kanji 211-213 could be appropriate. These three kanji are presented with their respective four-digit codes 214-216. The operator has only to look at the kanji presented at the lower part of the screen to determine which ones to use to replace the negatively-displayed kana 196-198 and then must enter the corresponding four-digit number by way of the numerical pad 183.
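The arithmetic behind fitting a four-digit code into one kanji cell is that an 18 x 18 cell holds 324 dots, exactly four times the 81 dots of a 9 x 9 digit. The sketch below packs four 9 x 9 glyphs into one 18 x 18 cell. The two-by-two arrangement (digits 1 and 2 on top, 3 and 4 below) is an assumption made for this example, as is the function name `pack_code`; the description states only that all four digits fit in the space of one kanji.

```python
KANJI_SIDE = 18   # dots per side of a kanji cell
DIGIT_SIDE = 9    # dots per side of a digit or Roman letter

def pack_code(glyphs):
    """Combine four 9 x 9 dot matrices into one 18 x 18 matrix.

    Each glyph is a list of 9 rows, each row a list of 9 dot values.
    Assumed layout: glyphs 0 and 1 across the top, 2 and 3 across the bottom.
    """
    if len(glyphs) != 4:
        raise ValueError("a four-digit code needs exactly four glyphs")
    top = [glyphs[0][r] + glyphs[1][r] for r in range(DIGIT_SIDE)]
    bottom = [glyphs[2][r] + glyphs[3][r] for r in range(DIGIT_SIDE)]
    return top + bottom

# Placeholder glyphs: glyph k is filled with the value k so positions are visible.
glyphs = [[[k] * DIGIT_SIDE for _ in range(DIGIT_SIDE)] for k in range(4)]
cell = pack_code(glyphs)
```

The resulting cell has the same dimensions as a kanji, so a code can occupy the position directly below its kanji without disturbing the column spacing.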
Still another way to determine the appropriate kanji is to utilize the key that has the arrow that points to the left. This key also has the katakana symbols pronounced "a" and "wa". These are the first and last symbols in the kana syllabaries, and in that sense correspond to the letters "A" and "Z" in the Roman alphabet. When that key is depressed, and information is entered for at least the first character of a kanji, all of the kanji that start with that character will be presented on the cathode ray screen along with their four-digit numbers, and the appropriate one can be selected by the operator.
If information that has been entered by way of the keyboard is to be edited, the four keys marked with the arrows can be utilized. The first key to be utilized for this purpose is the edit key, and when it is depressed, a cursor can be moved back and forth along a line of symbols on the face of the cathode ray tube in the terminal 186 by actuation of the arrow keys pointing left and right. Preceding lines and sentences can be moved into view on the cathode ray tube screen by actuation of the arrow pointing upward and then these preceding sentences can be run off the screen by actuation again of the arrow pointing downward. When the proper line is in position and the cursor is on the proper word, the symbols in that word may be changed symbol by symbol to make any necessary corrections. When the information in the message is correct, and when all of the negative kana have been replaced by positive kana or by kanji, the message may be recorded by actuation of the print key and then may be later printed out on the printer 184.
It is not necessary that there be a cathode ray terminal in order to make use of the invention. Figure 11 shows one sheet of computer paper arranged to accommodate the invention. On the left hand side of the paper is an area 218 which is shown as being a letterhead sheet. It is not possible to go from negative to positive depiction of kana in a printer such as the printer 184, but the information can be printed out initially in the letterhead area 218, and for each line of print, the possible insertions for any ambiguous kanji may be printed on the same line. This is illustrated by the shaded block 219 that represents kana which should be transformed into kanji but which cannot be transformed unambiguously. On the same line but on the right hand side 221 of the sheet of paper are two kanji 222 and 223 with their four-digit numbers 224 and 225, respectively. The sheet 218 is used as a draft sheet, and the operator enters the appropriate four-digit numbers into the system to take the place of each ambiguous kanji. Thereafter, the following left hand sheet 227 is used as the final draft. The message printed thereon has the appropriate kanji because they were entered by way of the numerical pad 183 after having been printed out on the right hand side of the preceding sheet.
One of the ways of speeding up the printing of information in accordance with the present system is illustrated in Figures 12A and 12B. Texts printed by matrix printers frequently appear only in upper case letters and numerals, and all of the symbols are within the same size format. Sometimes texts are presented having both upper and lower case letters, with the upper case letters occupying a larger area than the lower case letters. Upper case letters are usually found only at the beginning of a sentence.
Japanese language texts present a different situation. Many kanji are very complex and require a dot matrix having a relatively large number of dot positions for adequate resolution of the kanji. This is the reason for having an 18 x 18 matrix. In a given sentence, there are likely to be several kanji interspersed with several relatively simple kana symbols that do not require a full 18 x 18 matrix for adequate resolution. If such simple symbols are presented in 18 x 18 matrix format, the space for each symbol looks relatively unoccupied.
The four symbols depicted by the square areas 231-234 in Figure 12A are printed by printing first the upper half of all four symbols by moving the printer means along the line 236 from left to right and then printing the lower half by moving the printing elements along a continuation of the line 236 from right to left. Alternatively, in the case of a cathode ray tube or facsimile device, the areas 231-234 may be printed by a raster sweep of each of the 18 lines from top to bottom with each line going from the far left to the far right. In order to give adequate separation between adjacent areas in which symbols can be printed, each 18 x 18 area is separated from its neighboring 18 x 18 area by a space represented by six dots. It takes a certain length of time to scan the entire area of all four of the square regions 231-234, no matter whether the scanning takes place on a line by line basis, as it does in the case of the cathode ray tube presentation and facsimile presentation, or whether it takes place by scanning half the areas at a time, as it does in the case of a printer.
Figure 12B shows a modified form of presentation in which only the relatively complex kanji are presented with an 18 x 18 matrix format and the simpler symbols, such as the kana and any Roman letters and numerals, are presented in 9 x 9 matrices. It takes less time to retrieve 81 bits of information required for generation of a symbol in a 9 x 9 matrix than it does to retrieve 324 bits of information to generate a symbol in an 18 x 18 matrix area. This speeds up the printing process. The printing process may be further speeded up by the technique of moving the printing or scanning element faster in areas in which it is known that nothing is to be printed as is true in the upper part of areas over the small squares 237 and 238 in Figure 12B. Of course, the large squares 239-241 require the full amount of printing time. A still further reduction in printing time afforded by presenting the simpler symbols in smaller format is that the smaller symbols can be spaced more closely and thus further reduce the time required to scan the empty space between such symbols. As indicated in Figure 12B, there is a spacing of four dots between an 18 x 18 matrix area 239 and the adjacent 9 x 9 matrix area 237 and a spacing of only three dots between the two small 9 x 9 matrix areas 237 and 238.
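The saving described above can be estimated by counting the bits retrieved and the dot columns swept for one line of symbols. The Python sketch below does this for a list of matrix sizes. The gaps of 6, 4 and 3 dots are those given in the text; the use of a 6-dot gap between two large areas inside a mixed line, the function names and the example line are assumptions made for illustration.

```python
# Gap in dots between neighbouring matrix areas, keyed by the two sizes.
# 6, 4 and 3 are the spacings given in the text; applying 6 between two
# large areas inside a mixed line is an assumption.
GAPS = {(18, 18): 6, (18, 9): 4, (9, 18): 4, (9, 9): 3}


def line_width_dots(sizes):
    """Total dot columns swept for one line, including inter-symbol gaps."""
    width = sum(sizes)
    for left, right in zip(sizes, sizes[1:]):
        width += GAPS[(left, right)]
    return width


def line_bits(sizes):
    """Total bits fetched from memory to generate the symbols of one line."""
    return sum(size * size for size in sizes)


uniform = [18, 18, 18, 18]   # Figure 12A style: every symbol in 18 x 18
mixed = [18, 9, 9, 18]       # hypothetical line: kanji, two kana, kanji
```

For this hypothetical line, the mixed format needs 810 bits and 65 dot columns, against 1296 bits and 90 dot columns when every symbol is given a full 18 x 18 area.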
In accordance with this invention, one of the linguistic characteristics that can be utilized to select kanji for printing the Japanese language is the number of kanji in a word, the kanji being grouped accordingly. There are words that contain only a single kanji and words that contain two, three and even four kanji. In fact, there are words that contain more than four kanji, but it has been determined that it is unnecessary to include such words in the memory of the system since any word can be spelled out phonetically by using kana. Indeed, it is one of the advantages of the Japanese language that a given sentence may be written with different numbers of kanji and still be correct. In a general sense, the number of kanji used by a person is indicative of the educational attainments of that person, but this is not always the case. It is thus perfectly satisfactory if some of the kana that should be translated into kanji do not get translated.
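The grouping by word length can be pictured as one lookup table per kanji count, with a fallback that leaves the kana untranslated when no table holds the word. The Python sketch below is an illustration under stated assumptions only: the dictionary layout, the romanized readings, the four-digit codes and the function name are all hypothetical.

```python
# One table per word length in kanji (1 to 4), mirroring the grouping
# described above. Readings and codes are hypothetical placeholders.
TABLES = {
    1: {"yama": ["3011"]},
    2: {"nihon": ["2430", "3620"]},
    3: {},
    4: {},
}


def lookup_word(reading, tables=TABLES):
    """Return (kanji_count, codes) for a reading, or None to keep the kana."""
    for kanji_count in sorted(tables):
        codes = tables[kanji_count].get(reading)
        if codes:
            return kanji_count, codes
    return None   # words absent from every table simply stay in kana
```

Returning None for an unknown word reflects the point made above that it is acceptable for some kana to remain untranslated.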
The following program has been written in COBOL to implement the system of the present invention by providing separate tables A-D for words having 1-4 kanji, along with a table S and a PF table for preceding and following kana.
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SOURCE-COMPUTER. LEVEL-6.
OBJECT-COMPUTER. LEVEL-6.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT K-TABLE-A ASSIGN TO OA-MSD
        ORGANIZATION INDEXED
        ACCESS MODE RANDOM
        RECORD KEY PH1.
    SELECT K-TABLE-B ASSIGN TO OB-MSD
        ORGANIZATION INDEXED
        ACCESS MODE RANDOM
        RECORD KEY PH2.
    SELECT K-TABLE-C ASSIGN TO OC-MSD
        ORGANIZATION INDEXED
        ACCESS MODE RANDOM
        RECORD KEY PH3.
    SELECT K-TABLE-D ASSIGN TO OD-MSD
        ORGANIZATION INDEXED
        ACCESS MODE RANDOM
        RECORD KEY PH4.
    SELECT K-TABLE-S ASSIGN TO OG-MSD
        ORGANIZATION INDEXED
        ACCESS MODE RANDOM
        RECORD KEY PH5.
    SELECT PF-TABLE ASSIGN TO OE-MSD
        ORGANIZATION RELATIVE
        ACCESS MODE DYNAMIC
        RELATIVE KEY KP.
    SELECT PF-TABLE1 ASSIGN TO 1E-MSD
        ORGANIZATION RELATIVE
        ACCESS MODE DYNAMIC
        RELATIVE KEY KP1.
    SELECT K-TABLE-SS ASSIGN TO OF-MSD
        ORGANIZATION RELATIVE
        ACCESS MODE DYNAMIC
        RELATIVE KEY SKP.
    SELECT TEXT-FIL ASSIGN TO OH-MSD
        ORGANIZATION RELATIVE
        ACCESS MODE DYNAMIC
        RELATIVE KEY KTT.
DATA DIVISION.
FILE SECTION.
FD  K-TABLE-A LABEL RECORD OMITTED.
01  KANJI1.
    02 BASE1.
       03 PH1.
          04 PHA OCCURS 15 TIMES INDEXED BY P1 PIC X.
       03 FG1 PIC 9.
       03 ADDR1.
          04 FILLER PIC 9.
          04 AD11 PIC 9(4).
FD  K-TABLE-B LABEL RECORD OMITTED.
01  KANJI2.
    02 BASE2.
       03 PH2.
          04 PHB OCCURS 15 TIMES INDEXED BY P2 PIC X.
       03 FG2 PIC 9.
       03 ADDR2.
          04 FILLER PIC 9.
          04 AD21 PIC 9(4).
    02 AD22 PIC 9(4).
FD  K-TABLE-C LABEL RECORD OMITTED.
01  KANJI3.
    02 BASE3.
       03 PH3.
          04 PHC OCCURS 15 TIMES INDEXED BY P3 PIC X.
       03 FG3 PIC 9.
       03 ADDR3.
          04 FILLER PIC 9.
          04 AD31 PIC 9(4).
    02 AD32 PIC 9(4).
    02 AD33 PIC 9(4).
FD  K-TABLE-D LABEL RECORD OMITTED.
01  KANJI4.
    02 BASE4.
       03 PH4.
          04 PHD OCCURS 15 TIMES INDEXED BY P4 PIC X.
       03 FG4 PIC 9.
       03 ADDR4.
          04 FILLER PIC 9.
          04 AD41 PIC 9(4).
    02 AD42 PIC 9(4).
    02 AD43 PIC 9(4).
    02 AD44 PIC 9(4).
FD  K-TABLE-S LABEL RECORD OMITTED.
01  KANJIS.
    02 PH5.
       03 PHS OCCURS 36 TIMES INDEXED BY P5 PIC X.
    02 ADDRS.
       03 ADS OCCURS 36 TIMES INDEXED BY AD PIC X.
    02 CDC PIC 9.
FD  K-TABLE-SS LABEL RECORD OMITTED.
01  KANJISS.
    02 PH6.
       03 PHSS OCCURS 36 TIMES INDEXED BY P6 PIC X.
    02 ADDRSS.
       03 KJNR OCCURS 36 TIMES INDEXED BY K1 PIC X.
FD  PF-TABLE LABEL RECORD OMITTED.
01  PF-ENTRY.
    02 PHN PIC X(15).
    02 PFW PIC 9.
    02 PFF PIC 9.
    02 PFA PIC 9(4).
    02 PFD PIC 9.
    02 PFC PIC 99.
    02 PFK PIC X(14).
FD  PF-TABLE1 LABEL RECORD OMITTED.
01  PF-ENTRY1 PIC X(38).
FD  TEXT-FIL LABEL RECORD OMITTED.
01  TXTFIL.
    02 KTX.
       03 KTX1 PIC X(4).
       03 KTX2 PIC 9(4).
    02 TXX.
       03 TXT OCCURS 48 TIMES INDEXED BY TX PIC X(4).
Figure imgf000038_0001
WORKING-STORAGE SECTION.
01  BASE.
    02 PHO PIC X(15).
    02 FG PIC 9.
    02 ADDR PIC 9(5).
77  KTT PIC 9(4).
77  KP PIC 9(5).
77  KP1 PIC 9(5).
77  SKP PIC 9(5).
77  SSKP PIC 9(5).
77  FFF PIC 9.
77  FAD PIC 999.
77  PAD PIC 999.
77  PPAD PIC 999.
77  III PIC 999 VALUE 1.
01  SEG-INDEX.
    02 STATS PIC 9.
    02 P-TYP PIC 9.
    02 TYP PIC 9.
    02 F-TYP PIC 9.
    02 KANJI-CNT PIC 9.
    02 START-ADDR PIC 999.
    02 LENG PIC 99.
01  PGE.
    02 CHR OCCURS 200 TIMES INDEXED BY C1 PIC X.
01  PG.
    02 CWS OCCURS 200 TIMES INDEXED BY I1 PIC X.
01  KANA-BUFF1.
    02 P-KANA1 PIC X(14).
    02 P-KC1 PIC 99.
    02 F-KANA1 PIC X(14).
    02 F-KC1 PIC 99.
    02 FGW1 PIC 9.
01  01-REGISTER.
    02 ORG1 PIC X(4).
    02 ORG2 PIC X(4).
    02 ORG3 PIC X(4).
    02 ORG4 PIC X(4).
01  KANA-BUFF2.
    02 P-KANA2.
       03 P-K OCCURS 14 TIMES INDEXED BY I4 PIC X.
    02 P-KC2 PIC 99.
    02 F-KANA2.
       03 F-K OCCURS 14 TIMES INDEXED BY I3 PIC X.
    02 F-KC2 PIC 99.
    02 FGW2 PIC 9.
01  02-REGISTER.
    02 ORG OCCURS 4 TIMES INDEXED BY O1 PIC X(4).
01  OS-REGISTER.
    02 OSR OCCURS 5 TIMES INDEXED BY OS PIC X(16).
77  SS PIC 9.
77  FLG PIC 9.
77  MULT PIC 9.
77  KC PIC 9 VALUE 0.
77  C3 PIC 99 VALUE 0.
77  R1 PIC 9 VALUE 0.
77  R3 PIC X.
77  KNC1 PIC 999.
77  KNC2 PIC 999.
77  MCH PIC 9.
77  MEM PIC 9.
PROCEDURE DIVISION.
* 4 KJ-TABLES, 2 PF-TABLES AND 2 S-TABLES *
* PROCESS-S WITH 888S *
TRANSLATION.
    OPEN INPUT K-TABLE-A.
    OPEN INPUT K-TABLE-B.
    OPEN INPUT K-TABLE-C.
    OPEN INPUT K-TABLE-D.
    OPEN INPUT K-TABLE-S.
    OPEN INPUT PF-TABLE.
    OPEN INPUT PF-TABLE1.
    OPEN OUTPUT K-TABLE-SS.
    CLOSE K-TABLE-SS.
    OPEN I-O K-TABLE-SS.
    OPEN I-O TEXT-FIL.
YOMIKOMI.
    MOVE 2 TO MULT.
    MOVE 1 TO SKP.
    MOVE 0 TO MEM.
YOMI.
    SET C1 TO 1.
YOMI1.
    MOVE "#" TO CHR (C1).
    IF C1 = 200 GO TO YOMI2.
    SET C1 UP BY 1.
    GO TO YOMI1.
YOMI2.
    DISPLAY "M000".
    ACCEPT PGE.
    MOVE 0 TO C3.
* PREPROCESS OF INPUT MESSAGE *
* < MEMORY AND RECALL FUNCTION ON *
* > MEMORY AND RECALL FUNCTION OFF *
* M5001234 WRITE TEXT, ID NUMBER = 1234 *
* M5021234 READ TEXT, ID NUMBER = 1234 *
    SET I1 C1 TX TO 1.
    MOVE PGE TO TXTFIL.
    IF KTX1 = "M500" OR KTX1 = "M502" GO TO TXT-MR.
CLR.
    IF C1 = 1 GO TO FIN.
CLR1.
    IF CHR (C1) = "#" GO TO TRANS2.
    IF CHR (C1) = "X" OR CHR (C1) = "Q" GO TO CLR2.
    IF CHR (C1) = "L" OR CHR (C1) = "V" GO TO CLR2.
    IF CHR (C1) = "<" OR CHR (C1) = ">" GO TO MEMO.
    GO TO TRANS1.
CLR2.
    MOVE CHR (C1) TO R3.
    IF C1 = C3 GO TO TRANS1.
    SET C1 DOWN BY 1.
    IF CHR (C1) = "X" OR CHR (C1) = "Q" GO TO CLR3.
    IF CHR (C1) = "L" OR CHR (C1) = "V" GO TO CLR3.
    GO TO TRANSO.
CLR3.
    IF CHR (C1) NOT = R3 GO TO SP2.
    SET C1 UP BY 1.
    IF CHR (C1) = "X" GO TO TRANS1.
SPO.
    SET I1 DOWN BY 1.
    IF CWS (I1) = "X" OR CWS (I1) = "Q" GO TO KESU.
    IF CWS (I1) = "L" OR CWS (I1) = "V" GO TO KESU.
    SET I1 UP BY 1.
SP1.
    SET C3 TO C1.
    ADD 1 TO C3.
    SET C1 UP BY 1.
    GO TO CLR1.
SP2.
    SET C1 UP BY 1.
    GO TO SPO.
KESU.
    IF I1 = 1 GO TO SP1.
    GO TO SPO.
TRANSO.
    SET C1 UP BY 1.
TRANS1.
    MOVE CHR (C1) TO CWS (I1).
    SET C1 I1 UP BY 1.
    GO TO CLR1.
FIN.
    IF CHR (C1) = "#" GO TO KANRYO.
    GO TO CLR1.
MEMO.
    IF CHR (C1) = ">" MOVE 0 TO MEM.
    IF CHR (C1) = "<" MOVE 1 TO MEM.
    SET C1 UP BY 1.
    GO TO CLR1.
TXT-MR.
    MOVE KTX2 TO KTT.
    IF KTX1 = "M500" GO TO TXT-M.
    IF KTX1 = "M502" GO TO TXT-R.
TXT-M.
    WRITE TXTFIL INVALID KEY GO TO TXT-E1.
    GO TO YOMI.
TXT-R.
    READ TEXT-FIL INVALID KEY GO TO TXT-E2.
    MOVE "M502" TO KTX1.
    DISPLAY TXTFIL.
    GO TO YOMI.
TXT-E1.
    DISPLAY "E003".
    GO TO YOMI.
TXT-E2.
    DISPLAY "E004".
    GO TO YOMI.
TRANS2.
    MOVE CHR (C1) TO CWS (I1).
* SEGMENTATION *
HAJIME.
    SET I1 TO 1.
    MOVE 0 TO P-TYP F-TYP.
BGN.
    SET I1 TO III.
    MOVE 1 TO STATS.
    IF F-TYP = 5 GO TO SEG4.
    IF P-TYP NOT = 6 GO TO SEG3.
    MOVE 1 TO KC.
SEGO.
    MOVE 0 TO TYP C3.
GET-CH.
    MOVE CWS (I1) TO R3.
    SET I1 UP BY 1.
* SEARCH CONTROL *
    IF TYP = 3 GO TO SRCHO.
    IF R3 = "X" GO TO KANJI.
    IF R3 = "Q" GO TO HIRAGANA.
    IF R3 = "V" GO TO KATAKANA.
    IF R3 = "L" GO TO ALPHABET.
    IF R3 = "=" GO TO KIOKU.
    IF R3 = "'" OR R3 = "," GO TO SRCH1.
SRCHO.
    IF R3 = "#" GO TO END-P.
    IF R3 NOT ALPHABETIC OR R3 = SPACE GO TO SYMBOL.
SRCH1.
    IF TYP = 0 GO TO PROCESS-C1.
    ADD 1 TO C3.
    GO TO GET-CH.
KANJI.
    IF TYP NOT = 0 GO TO SC2.
    MOVE 1 TO KC.
SC1.
    MOVE 6 TO R1.
    GO TO FOUND.
SC2.
    IF C3 NOT = 0 GO TO SC1.
    IF TYP NOT = 6 GO TO FOUND.
    ADD 1 TO START-ADDR.
    ADD 1 TO KC.
    GO TO GET-CH.
HIRAGANA.
    IF TYP = 0 MOVE 0 TO KC.
    MOVE 2 TO R1.
    GO TO FOUND.
KATAKANA.
    IF TYP = 0 MOVE 0 TO KC.
    MOVE 1 TO R1.
    GO TO FOUND.
ALPHABET.
    IF TYP = 0 MOVE 0 TO KC.
    MOVE 3 TO R1.
    GO TO FOUND.
END-P.
    MOVE 5 TO R1.
    GO TO FOUND.
SYMBOL.
    IF TYP = γ AND R3 NUMERIC GO TO SRCH1.
    MOVE 4 TO R1.
    GO TO FOUND.
KIOKU.
    MOVE 7 TO R1.
* END SEARCH CONTROL *
FOUND.
    IF TYP = 0 GO TO SEG1.
    MOVE KC TO KANJI-CNT.
    MOVE C3 TO LENG.
    MOVE R1 TO F-TYP.
    SET I1 DOWN BY 1.
    GO TO PROCESS-CONT.
SEG1.
    MOVE R1 TO TYP.
    IF R1 = 4 GO TO SEG2.
    SET START-ADDR TO I1.
    GO TO GET-CH.
SEG2.
    SET START-ADDR TO I1.
    SUBTRACT 1 FROM START-ADDR.
    GO TO GET-CH.
SEG3.
    MOVE 0 TO KC.
    GO TO SEGO.
SEG4.
    MOVE 5 TO TYP.
* PROCESS CONTROL *
PROCESS-CONT.
    IF TYP = 1 GO TO PROCESS-KN.
    IF TYP = 2 GO TO PROCESS-H.
    IF TYP = 3 GO TO PROCESS-A.
    IF TYP = 4 GO TO PROCESS-S.
    IF TYP = 5 GO TO PROCESS-PEND.
    IF TYP = 6 GO TO PROCESS-KJ.
    IF TYP = 7 GO TO PROCESS-M.
PROCESS-C1.
    DISPLAY "E000".
    GO TO OWARI.
PROCESS-KN.
    SET III TO I1.
    GO TO TR1.
PROCESS-H.
    SET III TO I1.
    GO TO TR1.
PROCESS-A.
    SET III TO I1.
    GO TO TR1.
PROCESS-S.
    SET III TO I1.
    ADD 1 TO START-ADDR.
    GO TO TR2.
PROCESS-M.
    SET III TO I1.
    IF MEM = 0 GO TO TR1.
    SET I1 TO START-ADDR.
    SET K1 TO 1.
PMO.
    MOVE CWS (I1) TO KJNR (K1).
    IF K1 = LENG GO TO PM1.
    SET I1 K1 UP BY 1.
    GO TO PMO.
PM1.
    WRITE KANJISS INVALID KEY GO TO OWARI.
    ADD 1 TO SKP.
    MOVE 6 TO FLG.
    GO TO EXX.
PROCESS-PEND.
    SET III TO I1.
    GO TO OWARI.
PROCESS-KJ.
    GO TO PKANJ.
EXX.
    IF FLG = 5 GO TO EXXO.
    IF FLG = 6 GO TO EXX2.
    IF 01-REGISTER = SPACES MOVE "0000" TO ORG1.
    IF FGW1 = 1 GO TO EXX12.
EXX11.
    IF OS-REGISTER NOT = SPACES GO TO EXX31.
    DISPLAY 01-REGISTER.
    GO TO TR1.
EXX12.
    MOVE SPACES TO 01-REGISTER.
    MOVE "0000" TO ORG1.
    GO TO EXX11.
EXX31.
    IF FGW1 = 5 GO TO EXX33.
EXX30.
    SET OS TO 1.
EXX32.
    DISPLAY "8888" OSR (OS).
    IF OS = SS GO TO EXX34.
    SET OS UP BY 1.
    GO TO EXX32.
EXX33.
    MOVE 01-REGISTER TO OSR (OS).
    GO TO EXX30.
EXX34.
    DISPLAY "9999".
    GO TO TR1.
EXXO.
    DISPLAY ADDRS.
    GO TO TR1.
EXX2.
    DISPLAY ADDRSS.
* END PROCESS CONTROL *
TR1.
    IF F-TYP = 5 OR TYP = 7 GO TO BGN.
    MOVE TYP TO P-TYP.
    GO TO BGN.
TR2.
    IF LENG = 0 GO TO TR3.
    MOVE P-TYP TO TYP.
    GO TO TR4 PROCESS-CONT DEPENDING ON MULT.
TR4.
    IF TYP = ό GO TO TR5.
    GO TO PROCESS-CONT.
TR3.
    IF TYP = F-TYP GO TO BGN.
    GO TO TR1.
TR5.
    DISPLAY "E002".
    MOVE 1 TO III.
    GO TO YOMI.
* PROCESS KANJI *
PKANJ.
    SET III TO I1.
    MOVE SPACES TO 01-REGISTER OS-REGISTER ADDRS.
    MOVE 0 TO FLG FGW1 FFF.
    MOVE SPACES TO KANJIS.
    SET I1 TO START-ADDR.
    SET P5 TO 1.
KJSO.
    MOVE CWS (I1) TO PHS (P5).
    IF P5 = LENG GO TO KJSS.
    SET I1 P5 UP BY 1.
    GO TO KJSO.
* MEMORY AND RECALL TBL-SS *
KJSS.
    IF MEM = 0 GO TO KJS2.
    IF F-TYP = 7 GO TO TEIGI.
    MOVE SKP TO SSKP.
    MOVE 1 TO SKP.
KJSS1.
    READ K-TABLE-SS INVALID KEY GO TO KJS1.
    IF PH5 = PH6 GO TO KJSS2.
    ADD 1 TO SKP.
    GO TO KJSS1.
KJSS2.
    MOVE SSKP TO SKP.
    MOVE 6 TO FLG.
    GO TO EXX.
TEIGI.
    MOVE SPACES TO KANJISS.
    MOVE PH5 TO PH6.
    GO TO TR1.
KJS1.
    MOVE SSKP TO SKP.
* SEARCH TBL-S *
KJS2.
    READ K-TABLE-S INVALID KEY GO TO KJABCD.
    MOVE 5 TO FLG.
    GO TO EXX.
* SEARCH TBL-A, B, C AND D *
KJABCD.
    SET P1 P2 P3 P4 O1 TO 1.
    IF MULT = 2 MOVE 1 TO KANJI-CNT.
    SET I1 TO START-ADDR.
    IF CWS (I1) = "M" OR CWS (I1) > "M" MOVE 1 TO FFF.
KJO.
    GO TO KJ1 KJ2 KJ3 KJ4 EXX DEPENDING ON KANJI-CNT.
KJ1.
    MOVE SPACES TO PH1.
    MOVE PH5 TO PH1 PHO.
    READ K-TABLE-A INVALID KEY GO TO AGAIN.
    MOVE BASE1 TO BASE.
    GO TO BRANCH.
KJ2.
    MOVE SPACES TO PH2.
    MOVE PH5 TO PH2 PHO.
    READ K-TABLE-B INVALID KEY GO TO AGAIN.
    MOVE BASE2 TO BASE.
    GO TO BRANCH.
KJ3.
    MOVE SPACES TO PH3.
    MOVE PH5 TO PH3 PHO.
    READ K-TABLE-C INVALID KEY GO TO AGAIN.
    MOVE BASE3 TO BASE.
    GO TO BRANCH.
KJ4.
    MOVE SPACES TO PH4.
    MOVE PH5 TO PH4 PHO.
    READ K-TABLE-D INVALID KEY GO TO AGAIN.
    MOVE BASE4 TO BASE.
BRANCH.
    GO TO MATCH RESOLVE DEPENDING ON FG.
MATCH.
    GO TO KJ5 KJ6 KJ7 KJ8 DEPENDING ON KANJI-CNT.
KJ5.
    MOVE AD11 TO ORG1.
    GO TO EXX.
KJ6.
    MOVE AD21 TO ORG1.
    MOVE AD22 TO ORG2.
    GO TO EXX.
KJ7.
    MOVE AD31 TO ORG1.
    MOVE AD32 TO ORG2.
    MOVE AD33 TO ORG3.
    GO TO EXX.
KJ8.
    MOVE AD41 TO ORG1.
    MOVE AD42 TO ORG2.
    MOVE AD43 TO ORG3.
    MOVE AD44 TO ORG4.
    GO TO EXX.
* RESOLUTION ANALYSIS *
RESOLVE.
    SET OS TO 1.
    MOVE ADDR TO KP1.
    MOVE ADDR TO KP.
    ADD START-ADDR LENG 1 GIVING FAD.
    SUBTRACT 1 FROM START-ADDR GIVING PAD.
    MOVE SPACES TO P-KANA1 F-KANA1.
    MOVE 0 TO P-KC1 F-KC1 FGW1 FGW2 MCH SS.
CHECK-M.
    MOVE SPACES TO P-KANA2 F-KANA2.
    MOVE 0 TO P-KC2 F-KC2.
    MOVE SPACES TO 02-REGISTER.
    SET O1 TO 1.
GET-ENTRY.
    IF FFF = 1 GO TO GET1.
    READ PF-TABLE INVALID KEY GO TO EXT.
GET2.
    IF PHO NOT = PHN GO TO EXT.
    IF PFW = 5 OR PFW = 3 ADD 1 TO SS.
    IF FGW1 = 5 OR FGW1 = 3 GO TO GET22.
GET21.
    IF PFC = 0 GO TO TEST.
    GO TO GET-P GET-F DEPENDING ON PFD.
GET1.
    READ PF-TABLE1 INVALID KEY GO TO EXT.
    MOVE PF-ENTRY1 TO PF-ENTRY.
    GO TO GET2.
GET22.
    IF OS = 5 OR OS > 5 GO TO GET21.
    MOVE 01-REGISTER TO OSR (OS).
    SET OS UP BY 1.
    GO TO GET21.
GET-F.
    IF F-TYP = 1 OR F-TYP = 2 GO TO FF1.
    GO TO PFO.
FF1.
    SET I1 TO FAD.
    SET I3 TO 1.
FFO.
    MOVE CWS (I1) TO F-K (I3).
    IF F-K (I3) NOT ALPHABETIC OR F-K (I3) = " " GO TO PFO.
    IF I3 = PFC GO TO GOT-F.
    SET I1 I3 UP BY 1.
    GO TO FFO.
GOT-F.
    SET F-KC2 TO I3.
    IF F-KANA2 = PFK GO TO TEST.
    GO TO PFO.
GET-P.
    IF P-TYP = 1 OR P-TYP = 2 GO TO PP1.
    GO TO PFO.
PP1.
    SUBTRACT PFC FROM PAD GIVING PPAD.
    SET I1 TO PPAD.
    SET I4 TO 1.
PPO.
    MOVE CWS (I1) TO P-K (I4).
    IF P-K (I4) NOT ALPHABETIC OR P-K (I4) = " " GO TO PFO.
    IF I4 = PFC GO TO GOT-P.
    SET I1 I4 UP BY 1.
    GO TO PPO.
GOT-P.
    SET P-KC2 TO I4.
    IF P-KANA2 = PFK GO TO TEST.
PFO.
    IF PFF NOT = 1 GO TO PF1.
    IF MCH = 0 GO TO PF4.
    GO TO PF2 PF3 DEPENDING ON PFD.
PF2.
    IF P-KC1 = PFC OR P-KC1 > PFC GO TO EXT.
    GO TO PF4.
PF3.
    IF F-KC1 = PFC OR F-KC1 > PFC GO TO EXT.
PF4.
    ADD 1 TO KP1.
    ADD 1 TO KP.
    GO TO CHECK-M.
PF1.
    IF FFF = 1 GO TO PF5.
    ADD 1 TO KP.
    READ PF-TABLE INVALID KEY GO TO PFERROR.
    GO TO PFO.
PF5.
    ADD 1 TO KP1.
    READ PF-TABLE1 INVALID KEY GO TO PFERROR.
    MOVE PF-ENTRY1 TO PF-ENTRY.
    GO TO PFO.
TEST.
    MOVE 1 TO MCH.
    MOVE PFW TO FGW2.
TEST1.
    GO TO TAA TBB TCC DEPENDING ON PFF.
TAA.
    MOVE PFA TO ORG (O1).
    IF P-KC1 = 0 AND F-KC1 = 0 GO TO TSTO.
    IF P-KC2 NOT = 0 AND F-KC2 NOT = 0 GO TO TST2.
    IF F-KC1 NOT = 0 AND F-KC2 = 0 GO TO TST3.
    IF P-KC1 NOT = 0 AND P-KC2 = 0 GO TO TST4.
    IF P-KC1 = 0 GO TO CMP-G.
    GO TO CMP-F.
TSTO.
    MOVE KANA-BUFF2 TO KANA-BUFF1.
    MOVE 02-REGISTER TO 01-REGISTER.
TST1.
    ADD 1 TO KP1.
    ADD 1 TO KP.
    GO TO CHECK-M.
TST2.
    IF P-KC1 NOT = 0 AND F-KC1 NOT = 0 GO TO CMP-E.
    GO TO TSTO.
TST3.
    IF P-KC1 = 0 AND P-KC2 NOT = 0 GO TO CMP-C.
    GO TO TST1.
TST4.
    IF F-KC1 = 0 AND F-KC2 NOT = 0 GO TO CMP-D.
    GO TO TST1.
CMP-C.
    IF P-KC2 > F-KC1 GO TO TSTO.
    GO TO TST1.
CMP-D.
    IF F-KC2 > P-KC1 GO TO TSTO.
    GO TO TST1.
CMP-E.
    MULTIPLY P-KC1 BY F-KC1 GIVING KNC1.
    MULTIPLY P-KC2 BY F-KC2 GIVING KNC2.
    IF KNC2 > KNC1 GO TO TSTO.
    GO TO TST1.
CMP-F.
    IF P-KC2 > P-KC1 GO TO TSTO.
    GO TO TST1.
CMP-G.
    IF F-KC2 > F-KC1 GO TO TSTO.
    GO TO TST1.
TBB.
    MOVE PFA TO ORG (O1).
    SET O1 UP BY 1.
    IF FFF = 1 GO TO TBB1.
    ADD 1 TO KP.
    READ PF-TABLE INVALID KEY GO TO PFERROR.
    GO TO TEST1.
TBB1.
    ADD 1 TO KP1.
    READ PF-TABLE1 INVALID KEY GO TO PFERROR.
    MOVE PF-ENTRY1 TO PF-ENTRY.
    GO TO TEST1.
TCC.
    ADD 1 TO KP1.
    ADD 1 TO KP.
    GO TO GET-ENTRY.
* END RESOLUTION ANALYSIS *
EXT.
    IF 01-REGISTER NOT = SPACES GO TO EXX.
AGAIN.
    GO TO EXX RETRY DEPENDING ON MULT.
RETRY.
    ADD 1 TO KANJI-CNT.
    GO TO KJO.
* END PROCESS KANJI *
OWARI.
    MOVE 1 TO III.
    GO TO YOMI.
PFERROR.
    DISPLAY "E001".
KANRYO.
    CLOSE K-TABLE-A.
    CLOSE K-TABLE-B.
    CLOSE K-TABLE-C.
    CLOSE K-TABLE-D.
    CLOSE K-TABLE-S.
    CLOSE PF-TABLE.
    CLOSE K-TABLE-SS.
    CLOSE PF-TABLE1.
    CLOSE TEXT-FIL.
    STOP RUN.
END COBOL.
RDY:

Claims

WHAT IS CLAIMED IS:
1. Apparatus operable to effect selections of graphic symbols, said apparatus comprising: a terminal including means for introducing data into and obtaining data from the apparatus; data storage means for storing information relating to graphic language symbols in addresses according to a first linguistic characteristic, at least one group of said symbols sharing a common first linguistic characteristic, the symbols of said one group thereby being incapable of unique identification relative to other symbols in the same group on the basis of the shared linguistic characteristic, the addresses of the individual symbols in said one group being further uniquely identifiable from each other by a second linguistic characteristic; a data processor to receive data including the first and second linguistic characteristics; and program means responsive to the data applied to the data processor to select the address of a specific one of said symbols by analyzing said data on the basis of both of said linguistic characteristics.
2. The apparatus of Claim 1 in which the first linguistic characteristic is a pronunciation of a specific symbol.
3. The apparatus of Claim 2 in which the second linguistic characteristic is a pronunciation of an associated second symbol in a word that includes both the specific symbol and the second symbol.
4. The apparatus of Claim 3 in which the specific symbol is an ideogram.
5. The apparatus of Claim 4 in which the second symbol is a second ideogram.
6. The apparatus of Claim 3 in which the specific symbol is a kanji in the Japanese language and the second symbol is an okurigana.
7. The apparatus of Claim 6 in which the okurigana precedes the kanji in sequence.
8. The apparatus of Claim 6 in which the okurigana follows the kanji in sequence.
9. The apparatus of Claim 2 in which the second linguistic characteristic is a graphic display of all symbols having the same initial phonetic content.
10. The apparatus of Claim 2 in which the second linguistic characteristic is a graphic display of all symbols having the same complete pronunciation as the specific symbol.
11. Apparatus for selecting graphic characters from a first set, said apparatus comprising: decoding means responsive to electrical signals each representing characters of a second set, said characters of said second set corresponding to different pronunciations of said graphic characters; a plurality of individual storage means for storing, separately, a plurality of individual character representations, each representative of one of said graphic characters; and output means, said decoding means comprising a plurality of output terminals, each of said output terminals being connected to at least one of said individual storage means, and at least some of said output terminals being connected to a plurality of said individual storage means, whereby the application of one of said signals corresponding to pronunciation of one of said graphic characters to said decoding means causes one of said output terminals to apply a signal to such of said storage means as are connected to that output terminal, and application of another of said signals corresponding to a second linguistic characteristic of said one of said graphic characters to said decoding means causes a different one of said output terminals to apply a signal to only one of such storage means as are connected to said one of said output terminals to cause said one of such storage means to supply a signal to said output means.
12. Apparatus for selecting graphic characters from a first set, said apparatus comprising: decoding means responsive to electrical signals each representing characters of a second set, said characters of said second set corresponding to different pronunciations of said graphic characters; a plurality of individual character signals, each representative of one of said graphic characters; and output means, said decoding means comprising a plurality of output terminals, each of said output terminals being connected to at least one of said individual storage means, and at least some of said output terminals being connected to a plurality of said individual storage means, whereby the application of one of said signals corresponding to one pronunciation of one of said graphic characters to said decoding means causes one of said output terminals to apply a signal to such of said storage means as are connected to that output terminal, and application of another of said signals corresponding to a different pronunciation of said one of said graphic characters to said decoding means causes a different one of said output terminals to apply a signal to only one of such storage means as are connected to said one of said output terminals to cause said one of such storage means to supply a signal to said output means.
13. Apparatus for selecting ideograms, said apparatus comprising: electric signal storage means comprising a plurality of separate storage addresses for storing a plurality of different coded ideogram signals each of which is representative of one of said ideograms and each of which is stored at an individual one of said addresses; means for delivering first electrical signals representative of a first pronunciation of a selected one of said ideograms in a first pronunciation style to each of said addresses in said storage means to select a sub-set comprising those of said addresses at which are stored signals representing an ideogram having said first pronunciation in said first pronunciation style; and means for delivering to said sub-set second electrical signals representative of a second pronunciation of said selected one of said ideograms in a second pronunciation style to select a specific one of said addresses at which are stored signals representing said selected ideogram having both said first and said second pronunciation; output means; and means responsive to said first and second electrical signals for reading out from said storage means said selected ideogram signal stored at said specific one of said addresses and supplying said selected ideogram signal to said output means.
14. Apparatus for selecting a specific character from a group of Chinese ideogram characters and Japanese phonetic characters, said apparatus comprising: storage means comprising a plurality of addresses to store electrical information for each of said characters at a corresponding one of said addresses; first signalling means connected to a selected group of said addresses to apply thereto a first signal to identify a sub-set of addresses of a group of said characters having a selected first pronunciation; second signalling means connected to said sub-set of addresses to identify the specific address of said specific character; output means; and means controlled by said first and second signalling means to connect said specific address to said output means to deliver a signal representative of said address of said specific character to said output means.
15. Apparatus according to Claim 14 in which said second signalling means selects said specific address according to a second pronunciation of said specific character.
16. Apparatus according to Claim 15 in which all of the characters in said selected group of addresses are Chinese ideograms, said first pronunciation being one of the "kun" and "on" styles of pronunciation of said ideograms, and said second pronunciation being the other of said styles of pronunciation.
17. The apparatus of Claim 16 comprising means connecting said first and second signalling means to said storage means to supply first and second electrical signals to said plurality of addresses of said storage means substantially simultaneously.
18. Character encoding apparatus for producing a coded signal representative of a specific ideogram of a set of ideograms, said apparatus comprising: memory means comprising a plurality of storage locations for storing, separately, each of said coded signals; means for generating, selectively, a plurality of different retrieval signals, each representative of at least part of one pronunciation of said specific ideogram; means for generating a selected one of said plurality of different retrieval signals and applying said selected one to said storage locations to identify a first set of said ideograms including said specific ideogram; means for generating a second of said retrieval signals corresponding to a second set of said ideograms intersecting said first set uniquely at said specific ideogram and applying said second of said retrieval signals to said storage locations; and means for generating function signals to signify the beginning and end of each such retrieval signals.
19. Apparatus as in Claim 18 including register means for storing said retrieval signals in said register means.
20. A method of selecting Chinese ideograms corresponding to signals stored at separate addresses in storage means comprising: operating an encoding machine to produce encoded signals representative of Japanese "kun" and "on" styles of pronunciation for a specific ideogram that is to be selected; producing ideogram identification signals at the beginning and end of each group of signals representing an ideogram; applying said encoded signals to said storage means, thereby to select said specific one of said ideograms; and producing Japanese phonetic character signals intermixed with said group of signals.
21. A machine control device for controlling the operation of a computer to select characters, said device comprising, in combination: memory means having a plurality of individual addresses, each of said characters being stored in one of said addresses; a source of a plurality of coded computer control signals connected to said memory means, said signals being capable of being sensed by said computer and being arranged in a pattern which will direct the computer to read out of said addresses a signal representative of a preselected character in response to the input to the computer of successive signals representative of at least two different pronunciations of said character.
22. A device as in Claim 21 in which each of said signals connected to said memory represents a Japanese phonetic character, and a group of said last-named signals represents each of said Chinese ideograms, and identification signals on said support means at the beginning and end of each such group.
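The selection by two pronunciations recited in Claims 13, 16 and 18 can be pictured as a set intersection: the first pronunciation selects a sub-set of ideograms, and the second narrows that sub-set to a single member. The Python sketch below is an illustration only and is not the claimed apparatus. The readings are standard Japanese readings in romanized form; the table layout and the function name are assumptions.

```python
# Readings below are standard Japanese readings, romanized; the table
# layout and function name are assumptions made for illustration.
IDEOGRAMS = {
    "山": {"on": "san", "kun": "yama"},
    "三": {"on": "san", "kun": "mi"},
    "川": {"on": "sen", "kun": "kawa"},
    "千": {"on": "sen", "kun": "chi"},
}


def select_ideogram(on_reading, kun_reading, table=IDEOGRAMS):
    """Return the one ideogram matching both readings, or None."""
    # First signal: every ideogram sharing the "on" pronunciation.
    first_set = {k for k, r in table.items() if r["on"] == on_reading}
    # Second signal: every ideogram sharing the "kun" pronunciation.
    second_set = {k for k, r in table.items() if r["kun"] == kun_reading}
    matches = first_set & second_set
    return next(iter(matches)) if len(matches) == 1 else None
```

Here the "on" reading "san" alone is ambiguous between two ideograms, and adding the "kun" reading "yama" leaves exactly one.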
PCT/US1979/000418 1978-06-14 1979-06-14 System for selecting graphic characters phonetically WO1980000105A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US91538878A 1978-06-14 1978-06-14
US915388 1978-06-14

Publications (1)

Publication Number Publication Date
WO1980000105A1 true WO1980000105A1 (en) 1980-01-24

Family

ID=25435656

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1979/000418 WO1980000105A1 (en) 1978-06-14 1979-06-14 System for selecting graphic characters phonetically

Country Status (2)

Country Link
EP (1) EP0016067A1 (en)
WO (1) WO1980000105A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR2482747A1 (en) * 1980-05-19 1981-11-20 Barouch Eleazar DEVICE FOR ENCODING IDEOGRAPHIC CHARACTERS
WO1982000442A1 (en) * 1980-08-01 1982-02-18 R Johnson Ideographic word selection system
US4544276A (en) * 1983-03-21 1985-10-01 Cornell Research Foundation, Inc. Method and apparatus for typing Japanese text using multiple systems
GB2163578A (en) * 1984-08-07 1986-02-26 Yuk Kwan Chan Cornelius Character encoder and decoder

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Computer Graphics and Image Processing, Volume 1, No. 1, Issued April 1972, "Recognition Of Printed Chinese Characters By Automatic Pattern Analysis", WILLIAM STALLINGS, Pages 47-63. (con't on suppl. sheet 2). *
IBM Technical Disclosure Bulletin, Volume 13, No. 11, issued April 1971, "Character Keyboard", F.F. Fang, C.N. Liu and D.T. Tang, pages 3540-3541. *
IBM Technical Disclosure Bulletin, Volume 21, No. 11, issued April 1979, "Kanji Character Generation", S. ODA, pages 4576-4578. *

Also Published As

Publication number Publication date
EP0016067A1 (en) 1980-10-01

Similar Documents

Publication Publication Date Title
US4505602A (en) Method for encoding ideographic characters
US4193119A (en) Apparatus for assisting in the transposition of foreign language text
US4543631A (en) Japanese text inputting system having interactive mnemonic mode and display choice mode
EP0028533B1 (en) Method and apparatus for producing ideographic text
US5164900A (en) Method and device for phonetically encoding Chinese textual data for data processing entry
US4951202A (en) Oriental language processing system
US5212638A (en) Alphabetic keyboard arrangement for typing Mandarin Chinese phonetic data
US4124843A (en) Multi-lingual input keyboard and display
US4327421A (en) Chinese printing system
US3820644A (en) System for the electronic data processing of chinese characters
US4603330A (en) Font display and text editing system with character overlay feature
US4644492A (en) Plural mode language translator having formatting circuitry for arranging translated words in different orders
US4868913A (en) System of encoding chinese characters according to their patterns and accompanying keyboard for electronic computer
US4500872A (en) Method for encoding Chinese characters
US4270022A (en) Ideographic character selection
US4173753A (en) Input system for sino-computer
US4602878A (en) Ideographic word processor
GB2221780A (en) System for encoding a collection of ideographic characters
US4187031A (en) Korean (hangul) electronic typewriter and communication equipment system
US4698758A (en) Method of selecting and reproducing language characters
KR860001012B1 (en) Ideographic coder
EP0087871B1 (en) Interactive chinese typewriter
US5378068A (en) Word processor for generating Chinese characters
JPS6119045B2 (en)
JPS6120004B2 (en)

Legal Events

Date Code Title Description
AK Designated states Kind code of ref document: A1; Designated state(s): JP
AL Designated countries for regional patents Kind code of ref document: A1; Designated state(s): GB