(19) United States
(12) Patent Application Publication (10) Pub. No.: US 2004/0139059 A1
Conroy et al. (43) Pub. Date: Jul. 15, 2004
(54) METHOD FOR AUTOMATIC DEDUCTION OF RULES FOR MATCHING CONTENT TO CATEGORIES
(76) Inventors: William F. Conroy, Champaign, IL
(US); Desiree D. G. Gosby, Allston,
MA (US)
Correspondence Address:
Schmeiser, Olsen & Watts
Suite 201
3 Lear Jet Lane
Latham, NY 12110 (US)
(21) Appl. No.: 10/335,351
(22) Filed: Dec. 31, 2002
Publication Classification
(51) Int. Cl.7 G06F 7/00
(52) U.S. Cl. 707/3
(57) ABSTRACT
Accordingly, the invention is a method for automatic deduction of rules for matching document content to a category within a strange taxonomy, which allows the document to be automatically classified into a proper category for storage in that strange taxonomy. The method includes the steps of spidering the taxonomy to determine its structure and contents, extracting keywords from documents within the strange taxonomy, formulating rules for determining the category from the extracted keywords, and applying the rules to classify a new document whose keywords have been extracted. The taxonomy is strange because the user has no knowledge of its internal structure and needs no such knowledge. The taxonomy may be flat or may be hierarchical, the latter having rules formulated at each level for proceeding to the next level. Variations for creating new and refurbishing old document management systems are disclosed.
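The four steps named in the abstract (spidering, keyword extraction, rule formulation, classification) can be illustrated with a minimal Python sketch. It is not the applicants' implementation. Everything in it is an assumption made for illustration: the taxonomy is modeled as a nested dict whose leaves are lists of document texts, keywords are simply the most frequent non-stopword terms, and a "rule" is a per-category keyword tally scored by overlap. The names `spider`, `extract_keywords`, `formulate_rules`, and `classify` are hypothetical. For brevity the sketch scores full category paths at once rather than formulating separate rules at each level of a hierarchical taxonomy.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "and", "from", "for", "with", "that", "this"}


def extract_keywords(text, top_n=5):
    """Return up to top_n most frequent non-stopword terms (assumed heuristic)."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if len(w) > 2 and w not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]


def spider(taxonomy, path=()):
    """Walk a nested-dict taxonomy, yielding (category_path, document_text)."""
    for name, node in taxonomy.items():
        if isinstance(node, dict):      # inner level: descend
            yield from spider(node, path + (name,))
        else:                           # leaf: a list of document texts
            for doc in node:
                yield path + (name,), doc


def formulate_rules(taxonomy):
    """Tally, per category path, how many documents contributed each keyword."""
    rules = defaultdict(Counter)
    for category, doc in spider(taxonomy):
        rules[category].update(extract_keywords(doc))
    return rules


def classify(rules, text):
    """Pick the category whose keyword tally best overlaps the new document."""
    keywords = extract_keywords(text)
    scores = {
        category: sum(tally[k] for k in keywords)
        for category, tally in rules.items()
    }
    return max(scores, key=scores.get)


# Toy taxonomy, invented for this example only.
taxonomy = {
    "sports": {
        "soccer": ["goal keeper penalty goal striker"],
        "tennis": ["racket serve volley serve"],
    },
    "cooking": ["recipe oven flour bake oven"],
}

rules = formulate_rules(taxonomy)
print(classify(rules, "the striker scored a goal from a penalty"))
# prints ('sports', 'soccer')
```

The caller never inspects the taxonomy's layout directly; `spider` discovers it, which mirrors the abstract's point that the user needs no knowledge of the taxonomy's internal structure.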