CA1092243A - Apparatus for automatically forming hyphenated words - Google Patents

Apparatus for automatically forming hyphenated words

Info

Publication number
CA1092243A
Authority
CA
Canada
Prior art keywords
word
input
hyphenation
hyphen
gate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired
Application number
CA288,062A
Other languages
French (fr)
Inventor
Walter S. Rosenbaum
Howard C. Tanner
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Application granted
Publication of CA1092243A
Legal status: Expired


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/191 Automatic line break hyphenation

Abstract

IMPROVED APPARATUS FOR AUTOMATICALLY FORMING
HYPHENATED WORDS
ABSTRACT:
Improved hyphenation apparatus is combined with word verification apparatus to automatically provide hyphenation points for input words from a keyboard or other input device. The spelling of each word input to the system is verified by the digital reference matrix section of the apparatus, which calculates a vector magnitude and angle for the word and compares them to the contents of a storage dictionary of words. Each cell of storage in the storage dictionary, in addition to containing a unique angle representation of the input word, contains a byte of data representing the valid hyphenation points for the input word.
When an input word is verified to be correctly spelled, the hyphenation byte is read out of the dictionary and used by the hyphenation section to reassemble the word in hyphenated form. The hyphenated word is then displayed to the operator for appropriate action.

Description

BACKGROUND OF THE INVENTION:
FIELD OF THE INVENTION: The invention disclosed herein relates to data processing devices and more particularly relates to post processing devices for keyboards and other data input devices.
DESCRIPTION OF THE PRIOR ART: One of the problems that adversely affects throughput in a word processing system where line end justification is required is the problem of how to hyphenate words that occur at the end of a printing line without adequate remaining space to accommodate the word. This problem generally leads to the operator having to stop the machine and manually look up the word in a dictionary.
One technique in the prior art for solving this problem required storing all hyphenated versions of commonly used words in a table and then searching this huge table each time a word is to be hyphenated. Assuming that each word in the table was correctly hyphenated when stored, this technique has the advantage of being accurate in correctly hyphenating each word found in the table. The primary disadvantages of this technique are that the storage requirements and execution time are prohibitively large unless a large scale computer system is used.
SUMMARY OF THE INVENTION:
This invention provides an automatic hyphenation system wherein each hyphenatable word in the dictionary is compactly stored in a dictionary memory as a vector. Words from an input device such as a keyboard or character recognition machine are encoded into a vector representation comprising a magnitude and angle as disclosed in U.S. patent 3,995,254, entitled "Digital Reference Matrix for Word Verification", issued November 30, 1976 to W. S. Rosenbaum, and assigned to the assignee of the present invention. The vector magnitude and angle are used as addresses for a memory containing representations of a plurality of words stored as vectors. Appended to each vector word in the memory which is hyphenatable is an encoded data byte representing placement of hyphenation points within the word. The hyphenation byte is divided into a plurality of subfields each representing the placement of a hyphen within the word. Each subfield has its beginning point marked by the occurrence of a vowel in the word. When a word to be hyphenated has been verified as being spelled correctly by the verification apparatus, the hyphenation byte for the word is accessed. The hyphenation byte is decoded and controls the insertion of hyphens into the word.
The hyphenated word is then presented to the operator on a display for appropriate action.
BRIEF DESCRIPTION OF THE DRAWING:
Figure 1 is a circuit schematic of the verification apparatus of U.S. patent 3,995,254 modified.
Figures 2 and 3 are circuit schematics of the automatic hyphenation apparatus of this invention.
Figure 4 is a system timing chart for controlling operation of the invention.
DESCRIPTION OF THE PREFERRED EMBODIMENT:
THEORY: The theory underlying the digital reference matrix (DRM) for word verification is fully disclosed in previously referenced U.S. patent 3,995,254, and will not be elaborated herein. The DRM relies on a vector magnitude and an absolutely unique vector angle representation for compactly representing a plurality of alpha words in a dictionary memory. Each letter in the alphabet and each position in a word is assigned a numerical value. The vector magnitude and absolutely unique vector angle for a given word are calculated based on the character and position assignments. The vector magnitude serves as an address in the dictionary memory at which the angle for the word is stored. While a plurality of words may have the same magnitude, no two words have the same angle and therefore a single magnitude might have a plurality of angles stored at its address. In operation, the magnitude for the word under consideration is calculated and the memory is searched at the address corresponding to that magnitude for an angle which corresponds to the angle calculated for the word. If the angle is found then the word under consideration is determined to be correctly spelled.
This invention builds on the above described concept by adding an automatic hyphenation capability. The hyphenation capability is realized by combining a hyphenation byte with each angle representation for the word stored in the dictionary memory, see Table 1.
TABLE 1
OCCUPIED MAGNITUDE    LEGAL ANGLE/HYPHEN CODE WORD
10                    21.83/H  35.04/H  42.53/H  73.81/H
17                    88/H
256                   62.41/H  89.88/H
A hyphen code word is generated for each word stored in the dictionary and prestored in the digital reference matrix. In the preferred embodiment, the hyphen code word has been defined as an eight bit byte consisting of four fields of two bits each. Each field defines the placement of a hyphen in the word. The start of a field is defined by a vowel in the word and a field is up to three letters in length. The two bits for the field then will represent binary zero through three indicating no hyphen in the field, a hyphen after the first character in the field, a hyphen after the second character in the field, or a hyphen after the third character in the field. Vowels within the field are ignored for purposes of orienting the beginning point for other subfields in the hyphen code word. If a hyphen is allowable in a field, but its displacement from the vowel is greater than three characters, that subfield must be coded (00) as if a hyphen were not allowable. For example, consider the word miscellaneousness. The fourth field in that word consists of the letters eous-n. Since the hyphen occurs more than three letters from the beginning of the field, this field must be coded (00).
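As an illustration only, the eight bit hyphen code word can be modelled in software as four two-bit values packed into one byte. The sketch below assumes that field F1 occupies the two high-order bits; the text does not state the bit order, and the function names are hypothetical.

```python
def pack_hyphen_byte(fields):
    """Pack up to four displacements (each 0..3) into one byte, F1 first.

    Assumption: F1 is stored in the two high-order bits of the byte.
    """
    if len(fields) > 4 or any(not 0 <= f <= 3 for f in fields):
        raise ValueError("need at most four fields, each in the range 0..3")
    padded = list(fields) + [0] * (4 - len(fields))  # unused fields are (00)
    byte = 0
    for f in padded:
        byte = (byte << 2) | f
    return byte


def unpack_hyphen_byte(byte):
    """Return the four two-bit fields [F1, F2, F3, F4] of a hyphen code word."""
    return [(byte >> shift) & 0b11 for shift in (6, 4, 2, 0)]
```

A field value of 0 means no hyphen in the field, and values 1 to 3 give the number of characters, counted from the vowel that opens the field, after which the hyphen falls.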
Table 2 shows a list of words with their corresponding dictionary hyphenation points and the binary hyphenation byte identification of those points. When a word has more than four hyphenation points, the additional hyphenation points will be ignored and the system will skip to the end of the word. The rationale behind this is that the occurrence of four hyphenation points within a word should be sufficient to justify most line endings. However, it is recognized that the capability could be increased by making the hyphenation byte longer to encode more fields, or making the subfields larger so that the possible displacement range is increased, or breaking the word into a number of equal length fields and encoding hyphens in each field, or other possible variations on the definition of the hyphenation byte to suit the purpose of the user. It would also be possible to code and process words from the last character in the word toward the beginning of the word if this suited the purpose of the user.
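The coding rules can be sketched as a small encoder that turns a dictionary hyphenation such as "Ab-di-cate" into the four field values of Table 2. This is a minimal sketch under stated assumptions, not the dictionary preparation procedure of the patent: the vowels are taken to be a, e, i, o and u, a field coded (00) is taken to end at its opening vowel (which matches the way the apparatus advances the field counter on a zero displacement), and the function name is hypothetical.

```python
VOWELS = set("aeiou")  # assumption: y is not treated as a vowel


def encode_hyphenation(hyphenated, max_fields=4, max_disp=3):
    """Return [F1, F2, F3, F4] for a word written like 'Ab-di-cate'."""
    fields = []
    i = 0
    n = len(hyphenated)
    while i < n and len(fields) < max_fields:
        if hyphenated[i].lower() in VOWELS:
            # Count letters from the vowel up to the next hyphen, if any,
            # looking no further than max_disp letters ahead.
            j = i
            letters = 0
            while j < n and hyphenated[j] != "-" and letters < max_disp:
                letters += 1
                j += 1
            if j < n and hyphenated[j] == "-":
                fields.append(letters)  # hyphen follows `letters` characters
                i = j + 1               # the next field is sought after the hyphen
                continue
            fields.append(0)            # no hyphen within reach of this vowel
        i += 1
    # Hyphenation points beyond the fourth field are simply not encoded.
    return fields + [0] * (max_fields - len(fields))
```

With these assumptions the sketch reproduces the Table 2 entries traced by hand, for example `encode_hyphenation("Bee-keep-er")` gives `[2, 3, 0, 0]`, that is 10 11 00 00.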

TABLE 2
                                                 HYPHENATION BYTE
WORD                 DICTIONARY HYPHENATION      F1  F2  F3  F4
Abate                A-bate                      01  00  00  00
Abdicate             Ab-di-cate                  10  01  00  00
Clip                 (none)                      00  00  00  00
Clodhopper           Clod-hop-per                10  10  00  00
Colony               Col-o-ny                    10  01  00  00
Colorimeter          Col-or-im-e-ter             10  10  10  01
Attribute            At-trib-ute                 10  10  00  00
Attributiveness      At-trib-u-tive-ness         10  10  01  11
Avenue               Av-e-nue                    10  01  00  00
Beatitude            Be-at-i-tude                01  10  01  00
Beekeeper            Bee-keep-er                 10  11  00  00
Superconductivity    Su-per-con-duc-tiv-i-ty     01  10  10  10
Superintendency      Su-per-in-ten-den-cy        01  10  10  10
Miscellaneousness    Mis-cel-la-neous-ness       10  10  01  00
DESCRIPTION OF THE APPARATUS:
Referring to Figure 1 there is shown a character source 99 whose output is connected to the input bus of the word verification apparatus disclosed in previously referenced U.S. patent 3,995,254. The character source may be either a standard typewriter keyboard, a magnetic tape or card reader, a suitable character recognition device or other input device. Characters are produced by the character source in synchronism with a clock signal on line 75 from clock generator 98. During the verification phase, the characters generated by the character source 99 are impressed along line 3 in Figure 1 through gate 8 and into conversion memory 10. The characters also trigger counter 18 which counts the position of each character in a word that is produced. Data bus 3 is also connected to the input of input shift register 300 shown in Figure 2. Each time a character is presented during the verification phase, it is shifted into the input shift register block 300 upon the occurrence of a pulse from AND gate 401 generated by the input terms NOT INSERT HYPHEN and clock.
During the word verification phase, which is fully disclosed in the aforementioned patent 3,995,254, the characters of the word are assembled in conversion memory 10 and multiplier 12, adder 14, and register 16 generate a vector magnitude for the word which is stored in magnitude register 17 and used as an address into memory 38. The output of the conversion memory 10 also feeds into multiplier 20 along with the unique character position code from character position decode 19. The sum of the products of the character codes from conversion memory 10 and the character position codes from character position decode 19 is accumulated by adder 22 in register 24. Also, the square of the character position decodes is produced in multiplier 30 and the sum of the squares is accumulated in register 34 by adder 32. These sums go into square root calculators 27 and 36, multiplier 26 and divider 28 to produce the secant of an angle for the input word in accordance with the theory disclosed in U.S. patent 3,995,254. The secant from divider 28 is passed into arcsecant calculator 29 where it is converted to a unique angle for each word.
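The data flow just described can be sketched numerically. The sketch below is only one reading of the passage and is not the method of U.S. patent 3,995,254: it assumes that the magnitude is the sum of the squared character codes, that letters are coded a=1 through z=26, and that positions are coded 1, 2, 3 and so on. None of these assignments is given in the text, and this simple coding does not by itself guarantee an absolutely unique angle for every word.

```python
import math


def word_vector(word):
    """Return (magnitude, angle in degrees) for an alphabetic word.

    Assumptions (not from the source): character codes a=1..z=26,
    position codes 1..n, magnitude = sum of squared character codes.
    """
    if not word:
        raise ValueError("empty word")
    codes = [ord(c) - ord("a") + 1 for c in word.lower()]
    positions = range(1, len(codes) + 1)
    magnitude = sum(c * c for c in codes)               # multiplier 12, adder 14
    dot = sum(c * p for c, p in zip(codes, positions))  # multiplier 20, adder 22
    pos_sq = sum(p * p for p in positions)              # multiplier 30, adder 32
    # Square roots, multiplier 26 and divider 28 yield the secant.
    secant = math.sqrt(magnitude) * math.sqrt(pos_sq) / dot
    # Arcsecant calculator 29; the clamp guards against rounding error.
    angle = math.degrees(math.acos(min(1.0, 1.0 / secant)))
    return magnitude, angle
```

Under these assumptions "ab" and "ba" share the magnitude 5 but yield different angles, which illustrates how several angles can be stored at one magnitude address.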

Memory 38 contains at each magnitude address the angle for each dictionary word defined by that magnitude together with the hyphenation byte representing the dictionary hyphenation points for the word as shown in Table 1. If the magnitude is not found in the memory 38, then gate 39 sets null magnitude register 40 so indicating, which triggers flip-flop 42 to produce the NOT VALID WORD signal on line 46.
The NOT VALID WORD signal on line 46 may then be used to signal the operator that the word just keyed is incorrectly spelled or to store the word in a special error word memory for later consideration. If the magnitude address is found in memory 38 then the corresponding angles together with their hyphenation bytes are gated through gate 47 into angle and hyphenation code word buffer 45. There the angle portion of the data block is compared in angle compare register 41 with the angle just calculated for the word under consideration by arcsecant calculator 29. An equal compare triggers gate 43 to set flip-flop 42 and place a valid word indicator signal on line 44. Also, the hyphenation byte of the corresponding angle is gated by gate 47 into hyphenation code word buffer 200. This concludes the word verification phase.
If at the conclusion of the word verification phase, the valid word indicator on line 44 is set by flip-flop 42 and a line end signal appears on line 48 from character source 99, indicating that the line is within the tolerances set for justification, the conditions are set for the beginning of the hyphenation phase.
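In software terms, the verification phase amounts to a lookup keyed by magnitude followed by an angle comparison. The following is a minimal sketch of that idea, assuming a dictionary that maps each occupied magnitude to a list of (angle, hyphenation byte) pairs. The magnitudes and angles are taken from Table 1, but the hyphenation byte values are invented placeholders, and the tolerance is a concession to floating point arithmetic, since the apparatus compares for equality.

```python
# Hypothetical contents of memory 38: magnitude -> [(angle, hyphen byte), ...].
# The byte values are placeholders; Table 1 only shows them as "H".
MEMORY_38 = {
    10: [(21.83, 0b01000000), (35.04, 0b10010000), (42.53, 0), (73.81, 0b10100000)],
    17: [(88.0, 0b11000000)],
    256: [(62.41, 0b10100111), (89.88, 0b01101010)],
}


def verify_and_fetch(memory, magnitude, angle, tolerance=1e-6):
    """Return the hyphenation byte of a verified word, or None if invalid."""
    entries = memory.get(magnitude)
    if entries is None:
        return None                      # null magnitude: word not in dictionary
    for stored_angle, hyphen_byte in entries:
        if abs(stored_angle - angle) <= tolerance:
            return hyphen_byte           # equal compare: valid word
    return None                          # magnitude occupied, but no such angle
```

A return value of None corresponds to the invalid word condition; any other value is the byte that would be loaded into hyphenation code word buffer 200.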
Referring now to Figure 2, the combination of line end signal 48 and valid word indicator on line 44 satisfy the conditions of AND gate 430 to set flip-flop 101 and output a check hyphenation signal to AND gate 407. Valid word indicator signal on line 44 also feeds into the input of AND gate 408 together with the shift input signal on line 203 from input shift register 300 and a NOT 50 SHIFTS signal 302. The capacity of input shift register 300 is 50 characters. Shift counter 600 which is connected to the output of AND gate 408 counts the number of shifts from input shift register 300 and is used to control the shifting of output shift register 400. The check hyphenation signal causes a series of clock pulses to be fed through AND gate 407 into the shift input of output shift register 400 causing the output shift register to begin shifting on each clock pulse. At the time that the check hyphenation signal comes into AND gate 407, it is also fed to OR gate 411 creating an output called WAIT. This signal is fed to line 97 on Figure 1 causing the character source 99 not to produce any more characters at this time. The clock signal is ANDed with the signal NOT INSERT HYPHEN at AND gate 401 to produce a shift input signal to the input shift register 300. Thus, the input shift register will shift on each clock signal unless it is inhibited by the signal INSERT HYPHEN. The shift signal on line 203 is also fed to AND gate 408 along with signal NOT 50 SHIFTS and valid word indicator to produce a count up pulse which gates the shift counter 600 to count each clock signal. Decode 409 is connected to the output of shift counter 600 and its output comes high when the value in the shift counter equals 50 and that signal is fed into the inverter 410 to produce the signal NOT 50 SHIFTS. The signal NOT 50 SHIFTS is fed back to AND gate 408 to inhibit the count up pulse. The signal NOT 50 SHIFTS is also fed to AND gate 407 to inhibit count pulses to the output shift register after the input shift register has been shifted 50 times. The true signal 50 SHIFTS is fed to OR gate 411 to produce the signal WAIT which is fed back along line 97 to character source 99 on Figure 1 for the purpose of holding the system. It can be seen, therefore, that the shift counter 600 is used for the purpose of counting shift pulses to the input shift register 300 and that as soon as 50 shifts have been accomplished the shifting of the output shift register 400 will stop, thereby holding the data in the output shift register. It can be seen further that the outputs of the output shift register 400 are connected to the display 500 for the purpose of operator review. After the operator has reviewed the word, systems operations may be restarted by depressing the operator reset button 1002 which provides a reset pulse to the shift counter 600 and provides an input to AND gate 412 for generating a clear signal which is applied back to the clear inputs to the input shift register 300 and the output shift register 400 and the reset line to flip-flop 101 for the purpose of establishing initial conditions for the beginning of the next cycle.
Assume now that the hyphenation process is started, the check hyphenation latch is set and shift pulses are being gated to the output shift register 400 through AND gate 407. Also, shift pulses are being gated to the input shift register through AND gate 401. In all probability, the word contained in the input shift register 300 is less than 50 characters long. With each clock pulse at 401 the word in input shift register 300 is moved one character cell to the right and no other control action occurs. Eventually, this word will appear at the right edge of the input shift register 300, the first character first. The null characters, which were in input shift register 300 to the right of the word, shift out along character bus 301 to AND gate 405 and into OR gate 404 back to the input of the output shift register 400. At this time, the signal NOT INSERT HYPHEN is true at AND gate 405 and therefore the signal INSERT HYPHEN is not true at AND gate 406. Shifting continues until eventually a vowel appears in the right hand position of input shift register 300 causing the vowel signal at the output of decode 403 to come true.
Referring now to Figure 3, the output of the vowel detector on line 211 is fed into the input of AND gates 413 and 424. AND gate 413 controls the input to field counter 700. The count in field counter 700 determines which of the four two bit data fields in the hyphenation word buffer 200 is under consideration. The counter 700 is in the reset condition having been previously reset and there having been no count pulses yet applied to it. The decode block 800 decodes the output of field counter 700 to select the proper hyphenation field in the hyphenation code word buffer 200. Since no count has yet been applied to the field counter 700, the decode block 800 provides the signal FIELD ONE in the true condition and the other fields, TWO, THREE and FOUR, in the false condition. At this instant of time, the signal FIELD ONE on line 220 is applied to AND gate 417 causing the first field of the hyphen code word buffer 200 to be gated to OR gate 423 and onto the hyphen displacement bus 212.


Connected to the hyphen displacement bus 212 is a decode circuit 901 having two outputs, one called DISPLACEMENT EQUALS ZERO on line 214 and the second called DISPLACEMENT NOT EQUAL ZERO on line 213. When the displacement does not equal zero, a hyphen may be placed within the active (first) field of the word. This signal along with the signal VOWEL on line 211 and the signal NOT SKIP TO END on line 230 is applied to AND gate 424 for the purpose of causing a set pulse to flip-flop 102. Flip-flop 102 sets providing a signal called COUNT TO HYPHEN on line 216 which is fed to the count up input of binary counter 1000 called CHARACTER DISPLACEMENT COUNTER. Counter 1000 counts on the clock pulse for the purpose of determining where the hyphen is to be inserted into the output shift register 400. The value on the hyphen displacement bus 212 is fed to compare circuit 1001 along with the output of the character displacement counter 1000.
Now shifting continues so that the vowel that was previously at the output of the input shift register 300 is gated through the character bus 301 to AND gate 405 and from there to OR gate 404 and into the input of the output shift register 400. On the next clock pulse, the character following the vowel is gated along the same path, and so on. Also, on each clock pulse the binary counter 1000 counts up because the input COUNT TO HYPHEN on line 216 is high, having been set as previously described. When the count in the binary counter 1000 becomes equal to the count in the active (first) field of the hyphen code word buffer which has been gated on hyphenation displacement bus 212, this will cause a compare output signal called HYPHEN DISPLACEMENT EQUALS CHARACTER DISPLACEMENT on line 217 at the output of compare 1001. This signal is fed to AND gate 425 along with the signal COUNT TO HYPHEN on line 216 to produce a signal called INSERT HYPHEN on line 205. The signal INSERT HYPHEN is also inverted by inverter 426 to produce the signal NOT INSERT HYPHEN on line 202.
Referring back to Figure 2, the signal INSERT HYPHEN is fed to AND gate 406 which causes a hyphen code to be fed into OR gate 404 for presentation to the input of output shift register 400 on the next clock pulse. At the same time, the inverse signal NOT INSERT HYPHEN into AND gate 401 and AND gate 405 inhibits the shifting of input shift register 300 and inhibits any input from AND gate 405 into OR gate 404. Notice that the shift counter 600 does not count at this time because the shift input on line 203 to AND gate 408 has been inhibited by the low input signal to AND gate 401. In this way, one hyphen has been shifted into the output shift register a number of characters behind the vowel as indicated by the subfield of the hyphen code word buffer 200.
Referring back to Figure 3, it can be seen that the output HYPHEN DISPLACEMENT EQUALS CHARACTER DISPLACEMENT on line 217 is fed back to binary counter 1000 so that on the occurrence of the next clock pulse, this counter is reset. The output of compare circuit 1001 is also fed back to the clocked reset side of flip-flop 102 so that after the hyphen has been inserted, this flip-flop resets. Now shifting continues with data flowing from the output of shift register 300 along line 301 through AND gate 405 and OR gate 404 to the input of output shift register 400.

At the same instant of time that the hyphen was inserted in the output shift register 400, the signal INSERT HYPHEN was also applied to the input of OR gate 414 for producing a count up pulse to the field counter 700 causing it to advance one count on the following clock pulse. Now the condition FIELD TWO high exists at the output of decode block 800 and the other signals are low. The signal FIELD TWO on line 221 causes the second subfield of the hyphen code word in buffer 200 to be gated through AND gate 418 and OR gate 423 onto the hyphen displacement bus 212.
Now shifting continues until another vowel is detected at the output of the input shift register 300 by the vowel detector 403. Now assume that the value contained in the second subfield of the hyphen code word buffer is zero. In this instance the output of the decode block 901 called DISPLACEMENT EQUALS ZERO on line 214 will be high and the other output on line 213 will be low. At this instant in time, a vowel is located in the last stage of the input shift register 300 and the signal DISPLACEMENT EQUALS ZERO on line 214 is high. The AND condition for AND gate 413 is satisfied on the next clock pulse which causes the vowel to shift from input shift register 300 into the output shift register 400. Satisfaction of the condition on AND gate 413 produces an output signal to OR gate 414 which causes field counter 700 to advance to the next field. Notice that flip-flop 102 did not set in this case because the DISPLACEMENT NOT EQUAL ZERO signal on line 213 was not true. The field counter will have advanced to FIELD THREE causing the decode 800 to generate a signal on line 226 to set AND gate 419 to gate the third subfield of the hyphen code word buffer 200 through OR gate 423 onto hyphen displacement bus 212. The operation for handling the last two subfields is the same as was described for the first two subfields.
Still referring to Figure 3, flip-flop 103 has an output called SKIP TO END (SKE) which is provided to take care of the condition wherein a word has more than four hyphens. Since, in the preferred embodiment we have defined a hyphen code word having a maximum of four fields, we have not provided means to encode hyphenations for more than four fields. Therefore, if enough fields occur in a word such that the field four condition is encountered, the field four signal is applied to AND gate 428. When another count pulse is applied to field counter 700 from source OR gate 414, this same count pulse will be applied to AND gate 428 causing flip-flop 103 to set. The output SKIP TO END goes high and the inverse signal NOT SKIP TO END goes low. The NOT SKIP TO END signal is applied to AND gate 424 to inhibit the setting of flip-flop 102. Notice that the SKIP TO END flip-flop 103 will remain set until reset by a clear signal originating at AND gate 412, and further caused by operator depression of the operator reset button 1002. Notice further that the SKIP TO END flip-flop 103 may not set unless there are more than four fields within a word. Its only function is to inhibit incorrect hyphenations on words having more fields than can be encoded in the hyphen code word which is stored in buffer 200.
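The combined behaviour of the vowel detector, field counter, character displacement counter and SKIP TO END flip-flop can be summarised by a short decoder. This is a behavioural sketch only, not a model of the gate-level timing: it assumes the vowels a, e, i, o and u, it sets the skip condition when a vowel arrives after all four fields have been used, and the function name is hypothetical.

```python
VOWELS = set("aeiou")  # assumption: the vowel decode recognises a, e, i, o, u


def insert_hyphens(word, fields):
    """Insert hyphens into `word` as directed by the field values [F1..F4]."""
    out = []
    field = 0            # field counter 700
    remaining = 0        # characters left before the hyphen; 0 means idle
    skip_to_end = False  # SKIP TO END flip-flop 103
    for ch in word:
        if remaining == 0 and not skip_to_end and ch.lower() in VOWELS:
            if field >= len(fields):
                skip_to_end = True           # more fields than the byte encodes
            elif fields[field] == 0:
                field += 1                   # no hyphen in this field
            else:
                remaining = fields[field]    # start counting to the hyphen
        out.append(ch)
        if remaining:
            remaining -= 1
            if remaining == 0:
                out.append("-")              # INSERT HYPHEN
                field += 1
    return "".join(out)
```

Under these assumptions `insert_hyphens("Superconductivity", [1, 2, 2, 2])` yields "Su-per-con-duc-tivity": the first four hyphenation points are restored and the remainder of the word is passed through unchanged.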
OPERATION:
Referring now to Figure 4 the operation of the hyphenation apparatus will be discussed using as an example the word CORNWALL. This word has a single hyphenation point between the characters N and W. The clock pulse on line 75 is free running. Assume that the first character in the word Cornwall, C, is located in position 50 of input shift register 300. On the occurrence of the next clock pulse the C is shifted from input shift register 300 over data bus 301 through AND gate 405 and OR gate 404 into position one of output shift register 400. Also on this shift the second character, O, is shifted into position 50 of the input shift register 300 and sensed by decode 431 to produce the signal VOWEL on line 211. The VOWEL signal on line 211 operates AND gate 424 to set flip-flop 102 and produce a true signal on line 216. At this point the count in the character displacement counter 1000 is zero on line 232 since the counter was reset during the previous operation. On the next clock pulse, the O is shifted from the output of input shift register 300 into the first position of output shift register 400 while the C is shifted into the second position of output shift register 400. The count up input to character displacement counter 1000 is being held high by the setting of flip-flop 102 on the previous pulse. Therefore, this clock pulse causes the counter 1000 to advance to a count of one. Also, since the character in position 50 of input shift register 300 is now a consonant the vowel output on line 211 returns low.
On the third clock pulse the letter R is shifted from position 50 of input shift register 300 into position one of output shift register 400 and the output of character displacement counter 1000 is advanced to binary 2 on line 232. The next clock cycle shifts the letter N from position 50 of input shift register 300 into position one of output shift register 400 and brings the count in character displacement counter 1000 to a binary 3. The count in character displacement counter 1000 now compares equal to the output of hyphen displacement bus 212 and a signal is raised on line 217 to AND gate 425 to produce a high signal on INSERT HYPHEN line 205 and a low signal on NOT INSERT HYPHEN line 202. The high signal on INSERT HYPHEN line 205 controls AND gate 406 to gate a hyphen code into position one of output shift register 400 while the high INSERT HYPHEN code inhibits a signal on line 203 and prevents input shift register 300 from shifting. Also, the high INSERT HYPHEN code feeds into OR gate 414 and produces a high signal on line 231 to cause field counter 700 to advance to FIELD TWO. At this point, input shift register 300 contains in positions 47 - 50 the characters LLAW while output shift register 400 contains in positions 1 - 5 the characters -NROC. The field counter 700 is set at FIELD TWO and the shifting continues.
On the next clock pulse the letter W is shifted from the output of input shift register 300 into the first position of output shift register 400 while the letter A is shifted into position 50 of input shift register 300. The appearance of the letter A in position 50 of input shift register 300 causes vowel decoder 431 to produce an output signal on line 211 to AND gates 413 and 424. At the time when field counter 700 was set to FIELD TWO decode 800 produced a signal on line 221 to AND gate 418 to gate the contents of bits 2 and 3 of hyphenation code word buffer 200 onto hyphen displacement bus 212 to decode 901. This signal was decoded as (00) indicating that no hyphen occurs in this field. Therefore, line 214 was set high and line 213 to AND gate 424 was set low. The occurrence of the VOWEL signal on line 211 to AND gate 413 causes a signal at the input of OR gate 414 whose output on line 231 causes field counter 700 to count up to FIELD THREE on the occurrence of the next clock pulse. Since the displacement not equal zero signal was low on line 213 AND gate 424 does not set flip-flop 102 and therefore the character displacement counter 1000 is not advanced.
On the next two clock pulses, the L's are shifted from the output of input shift register 300 into the input of output shift register 400 and the hyphenated word is displayed on display 500 as shown in Figure 2.
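The CORNWALL walk-through can be replayed clock by clock with a small register model. The sketch below is an approximation under stated assumptions: only the occupied cells of the two registers are modelled (not all 50 positions), the vowel set is A, E, I, O and U, the first field of the hyphen code word is taken to be 3 (11) as the count of three in the example implies, and the function name is hypothetical.

```python
VOWELS = set("AEIOU")  # assumption: vowels recognised by the vowel decode


def simulate(word, fields, clocks=None):
    """Clock the registers; return (input_cells, output_cells).

    input_cells[-1] models position 50 of input shift register 300;
    output_cells[0] models position 1 of output shift register 400.
    """
    inp = list(reversed(word))  # first character sits at the output end
    out = []
    field = 0                   # field counter 700
    count = 0                   # character displacement counter 1000
    counting = False            # flip-flop 102, COUNT TO HYPHEN
    skip_to_end = False         # flip-flop 103
    insert_hyphen = False       # INSERT HYPHEN line 205
    tick = 0
    while (inp or insert_hyphen) and (clocks is None or tick < clocks):
        tick += 1
        if insert_hyphen:
            out.insert(0, "-")  # input register is held on this clock
            insert_hyphen, counting, count = False, False, 0
            field += 1
            continue
        ch = inp[-1]
        if ch in VOWELS and not counting and not skip_to_end:
            if field >= len(fields):
                skip_to_end = True
            elif fields[field] == 0:
                field += 1
            else:
                counting = True
        inp.pop()               # shift one character out of register 300
        out.insert(0, ch)       # and into position 1 of register 400
        if counting:
            count += 1
            insert_hyphen = count == fields[field]
    return inp, out
```

Stopping the model after five clocks with `simulate("CORNWALL", [3, 0, 0, 0], clocks=5)` leaves L, L, A, W in the input cells and -, N, R, O, C in the output cells, matching the register contents quoted in the walk-through; running it to completion and reading the output cells from the far end gives CORN-WALL.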
While the invention has been particularly shown and described with reference to the preferred embodiment thereof, it would be understood by those skilled in the art that changes in form and detail may be made therein without departing from the spirit and scope of the invention.


Claims (9)

The embodiments of the invention in which an exclusive property or privilege is claimed are defined as follows:
1. Improved apparatus for automatically hyphenating data segments comprising in combination:
storage means containing representations of a dictionary of data segments including representations of the legal hyphenation points for the data segments;
means for calculating an address for each data segment input to the system;
accessing means for addressing said storage means at the calculated address; and
means for receiving the representations of the legal hyphenation points for the data segment stored at the accessed address, operable to modify each said data segment to cause legal hyphenation of the data segment.
2. The apparatus of Claim 1 wherein said means for receiving further includes means for decoding the representations of the legal hyphenation points and means responsive to said decoding means for inserting hyphens into the data segment.
3. The apparatus of Claim 2 wherein said means for receiving further includes means for displaying a visual output of the hyphenated data segment.
4. The apparatus of Claim 1 wherein said storage means contains vector representations of said data segments.

5. In combination apparatus for verifying the spelling of an input word and automatically hyphenating the word comprising:
a source of input characters;
storage means for receiving said input characters;
means for detecting the end of a word;
conversion means for converting said word into a vector having a magnitude and an absolutely unique angle;
memory means containing vector representations of a dictionary of words including representations of the hyphenation points for the words;
addressing means for accessing said memory means at the address defined by the vector magnitude of the input word;
comparator means for comparing the vector angle of the input word to vector angles stored in said memory means at said magnitude address;
means for receiving the hyphenation representation associated with the angle that compares equal to the angle of the word to be hyphenated, and operable to hyphenate the word stored in said storage means.
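The lookup structure recited in Claim 5 (address by magnitude, compare angles, retrieve the hyphenation representation) can be sketched as follows. The actual magnitude and angle computations are not given in this section, so `magnitude` and `angle` below are clearly labelled stand-ins (a sum of letter values and a position-weighted sum); the stand-in angle is not guaranteed unique, and the hyphenation bytes in the usage example are hypothetical.

```python
def letter_values(word):
    """Map A..Z to 1..26 (stand-in numeric coding, an assumption)."""
    return [ord(c) - ord("A") + 1 for c in word.upper()]


def magnitude(word):
    """Stand-in for the vector magnitude: sum of the letter values."""
    return sum(letter_values(word))


def angle(word):
    """Stand-in for the vector angle: position-weighted sum of letter values."""
    return sum((i + 1) * v for i, v in enumerate(letter_values(word)))


def build_dictionary(entries):
    """Store (angle, hyphenation byte) pairs at the address given by magnitude."""
    memory = {}
    for word, hyph_byte in entries.items():
        memory.setdefault(magnitude(word), []).append((angle(word), hyph_byte))
    return memory


def lookup(memory, word):
    """Return the hyphenation byte if the word verifies, else None."""
    target = angle(word)
    for stored_angle, hyph_byte in memory.get(magnitude(word), []):
        if stored_angle == target:   # comparator: angles compare equal
            return hyph_byte
    return None                      # no match: spelling not verified


# Hypothetical dictionary contents, for illustration only.
memory = build_dictionary({"CORNWALL": 0b10000000, "TABLE": 0b01000000})
```

With these stand-ins, a transposition such as CORNWLAL lands on the same magnitude address as CORNWALL but fails the angle comparison, so `lookup` returns `None` for it.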
6. Means for automatically hyphenating an input word comprising:
a first storage means for receiving the input word to be hyphenated;
a dictionary memory for storing representations of a plurality of words together with a data byte representing the hyphenation points of the word;

means for converting the input word into its corresponding representation;
means for searching said dictionary for a representation equal to the representation of said input word;
means for accessing the hyphenation data byte corresponding to the word;
means for decoding the hyphenation data byte into the legal hyphenation points for the word;
a second storage means;
means for serially transferring said input word from said first storage means to said second storage means;
means connected to said decoding means for interrupting the transfer of said input word from said first storage means to said second storage means;
means for inserting a hyphen into said second storage means during each transfer interruption; and
means for displaying the contents of said second storage means.
7. The means of Claim 6 wherein said hyphenation data byte contains a plurality of subfields each representing a hyphenation point for the word.
8. The apparatus of Claim 7 wherein each subfield contains a code representing the proximity of the hyphen to a vowel in the word.
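As an illustration of the subfield arrangement in Claims 7 and 8, a hyphenation data byte can be unpacked into per-field codes. This sketch assumes four 2-bit subfields with FIELD ONE in the two most significant bits; the claims only say "a plurality of subfields", and the bit ordering and the name `unpack_fields` are hypothetical.

```python
def unpack_fields(hyph_byte: int) -> list:
    """Split an 8-bit hyphenation code into four 2-bit subfields.

    Assumption: FIELD ONE occupies the two most significant bits, and a
    code of 0 means that no hyphen occurs in that field.
    """
    if not 0 <= hyph_byte <= 0xFF:
        raise ValueError("hyphenation code must fit in one byte")
    return [(hyph_byte >> shift) & 0b11 for shift in (6, 4, 2, 0)]


print(unpack_fields(0b10000000))  # [2, 0, 0, 0]
```

Under this layout each subfield can express a displacement of 1 to 3 characters relative to a vowel, with 0 reserved for "no hyphen".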

9. The method of automatically hyphenating input words comprising the steps of:
storing the input word to be hyphenated in a first memory means;
searching a predetermined dictionary memory for a data byte corresponding to the input word;
decoding the data byte into subfields representing hyphenation points for the input word;
serially transferring the characters of the input word from the first memory means to a second memory means;
interrupting the transfer of characters in accordance with said decoding;
inserting a hyphen in said second memory means each time an interrupt occurs; and
displaying said hyphenated word.

CA288,062A 1976-12-28 1977-10-04 Apparatus for automatically forming hyphenated words Expired CA1092243A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US754,938 1976-12-28
US05/754,938 US4092729A (en) 1976-12-28 1976-12-28 Apparatus for automatically forming hyphenated words

Publications (1)

Publication Number Publication Date
CA1092243A (en) 1980-12-23

Family

ID=25037026

Family Applications (1)

Application Number Title Priority Date Filing Date
CA288,062A Expired CA1092243A (en) 1976-12-28 1977-10-04 Apparatus for automatically forming hyphenated words

Country Status (9)

Country Link
US (1) US4092729A (en)
JP (1) JPS5383533A (en)
AU (1) AU512282B2 (en)
CA (1) CA1092243A (en)
DE (1) DE2755875A1 (en)
FR (1) FR2376468A1 (en)
GB (1) GB1595932A (en)
IT (1) IT1114684B (en)
NL (1) NL7712773A (en)

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4164025A (en) * 1977-12-13 1979-08-07 Bell Telephone Laboratories, Incorporated Spelled word input directory information retrieval system with input word error corrective searching
US4471459A (en) * 1981-09-30 1984-09-11 System Development Corp. Digital data processing method and means for word classification by pattern analysis
US4456969A (en) * 1981-10-09 1984-06-26 International Business Machines Corporation System for automatically hyphenating and verifying the spelling of words in a multi-lingual document
US4689768A (en) * 1982-06-30 1987-08-25 International Business Machines Corporation Spelling verification system with immediate operator alerts to non-matches between inputted words and words stored in plural dictionary memories
US4574363A (en) * 1982-07-13 1986-03-04 International Business Machines Corporation Mixed mode enhanced resolution hyphenation function for a text processing system
US5200892A (en) * 1984-01-17 1993-04-06 Sharp Kabushiki Kaisha Intelligent electronic word processor with plural print wheels and tables used to identify displayed characters supported by designated print wheels
GB2183874B (en) * 1984-01-17 1988-12-29 Sharp Kk Word processor
GB2201020B (en) * 1984-01-17 1988-12-29 Sharp Kk Word processor
US4974195A (en) * 1986-06-20 1990-11-27 Canon Kabushiki Kaisha Document processing apparatus
US4829472A (en) * 1986-10-20 1989-05-09 Microlytics, Inc. Spelling check module
JPS63130376A (en) * 1986-11-20 1988-06-02 Brother Ind Ltd Printer
US5754847A (en) * 1987-05-26 1998-05-19 Xerox Corporation Word/number and number/word mapping
JP2703907B2 (en) * 1987-10-23 1998-01-26 キヤノン株式会社 Document processing method
US5560037A (en) * 1987-12-28 1996-09-24 Xerox Corporation Compact hyphenation point data
US5224038A (en) * 1989-04-05 1993-06-29 Xerox Corporation Token editor architecture
US5625773A (en) * 1989-04-05 1997-04-29 Xerox Corporation Method of encoding and line breaking text
US5008818A (en) * 1989-04-24 1991-04-16 Alexander K. Bocast Method and apparatus for reconstructing a token from a token fragment
US5113342A (en) * 1989-04-26 1992-05-12 International Business Machines Corporation Computer method for executing transformation rules
JPH02277170A (en) * 1989-12-15 1990-11-13 Sharp Corp Electronic dictionary
US5295069A (en) * 1991-06-05 1994-03-15 International Business Machines Corporation Computer method for ranked hyphenation of multilingual text
JP3599775B2 (en) * 1993-04-21 2004-12-08 ゼロックス コーポレイション Finite state coding system for hyphenation rules
US6671856B1 (en) 1999-09-01 2003-12-30 International Business Machines Corporation Method, system, and program for determining boundaries in a string using a dictionary
AU2003201744A1 (en) * 2002-02-08 2003-09-02 Herbert Prah Reading aid
US8996994B2 (en) * 2008-01-16 2015-03-31 Microsoft Technology Licensing, Llc Multi-lingual word hyphenation using inductive machine learning on training data

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3439341A (en) * 1965-08-09 1969-04-15 Lockheed Aircraft Corp Hyphenation machine
US3537076A (en) * 1967-11-28 1970-10-27 Ibm Automatic hyphenation scheme
US3995254A (en) * 1975-07-16 1976-11-30 International Business Machines Corporation Digital reference matrix for word verification
US4028677A (en) 1975-07-16 1977-06-07 International Business Machines Corporation Digital reference hyphenation matrix apparatus for automatically forming hyphenated words

Also Published As

Publication number Publication date
IT1114684B (en) 1986-01-27
DE2755875A1 (en) 1978-06-29
FR2376468A1 (en) 1978-07-28
GB1595932A (en) 1981-08-19
FR2376468B1 (en) 1980-12-19
JPS5732382B2 (en) 1982-07-10
DE2755875C2 (en) 1988-10-20
NL7712773A (en) 1978-06-30
AU2952877A (en) 1979-04-26
US4092729A (en) 1978-05-30
JPS5383533A (en) 1978-07-24
AU512282B2 (en) 1980-10-02

Similar Documents

Publication Publication Date Title
CA1092243A (en) Apparatus for automatically forming hyphenated words
US4783761A (en) Spelling check dictionary with early error signal
US3995254A (en) Digital reference matrix for word verification
US4396992A (en) Word processor
EP0031495B1 (en) Text processing terminal with automatic text string input facility
US4503514A (en) Compact high speed hashed array for dictionary storage and lookup
US4782464A (en) Compact spelling-check dictionary
US4807181A (en) Dictionary memory with visual scanning from a selectable starting point
US4359286A (en) Character set expansion
US4028677A (en) Digital reference hyphenation matrix apparatus for automatically forming hyphenated words
US4381551A (en) Electronic translator
US3537073A (en) Number display system eliminating futile zeros
US4139898A (en) Microfilm searching reader
GB1275001A (en) Programmable electronic calculator
EP0097818A2 (en) Spelling verification method and typewriter embodying said method
EP0052757B1 (en) Method of decoding phrases and obtaining a readout of events in a text processing system
US3953846A (en) Encoding device
US3676854A (en) Keyboard to tape data input preparation unit
EP0042035B1 (en) Method and apparatus for vectorizing text words in a text processing system
JPS6371767A (en) Document producing device
JPS6213710B2 (en)
KR840000051B1 (en) Hangul printer
SU1019484A1 (en) Text data display device
JPH0677252B2 (en) Japanese data input processor
JPS5713575A (en) Display system of chinese language

Legal Events

Date Code Title Description
MKEX Expiry