WO2001024053A3 - System and method for automatic context creation for electronic documents - Google Patents

System and method for automatic context creation for electronic documents

Info

Publication number
WO2001024053A3
Authority
WO
WIPO (PCT)
Prior art keywords
contexts
electronic document
generated
information
context
Prior art date
Application number
PCT/US2000/025755
Other languages
French (fr)
Other versions
WO2001024053A2 (en)
WO2001024053A9 (en)
Inventor
Rachael Sokolwski
Philip Oxenberg
Original Assignee
Xmlexpress Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xmlexpress Inc
Priority to AU40253/01A (AU4025301A)
Publication of WO2001024053A2
Publication of WO2001024053A9
Publication of WO2001024053A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/80 Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML
    • G06F16/84 Mapping; Conversion
    • G06F16/86 Mapping to a database
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/93 Document management systems

Abstract

A system and method automatically generate a context for information contained in any type of text-based electronic document, such as a hypertext markup language (HTML) encoded web page. The contexts generated by the system describe the content or meaning of parts or sections of the electronic document. The system also automatically generates a hierarchy that shows how these contexts are organized. The generated contexts do not describe format or appearance, such as heading or paragraph; they are descriptive names that summarize the content. The contexts are used to generate descriptive markup, keywords, and indices for the electronic document. The system uses a combination of sentence and paragraph boundaries, document markup, and linguistic information to generate the context and/or keyword. The generated contexts may also be used to electronically mark the start and end boundaries of the information. In the preferred embodiment, the system creates an XML (eXtensible Markup Language) electronic document.
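The idea in the abstract can be illustrated with a small Python sketch. It is not the patented method: it assumes that HTML headings (h1 to h3) define the hierarchy, that each paragraph is one unit of information, and that the two most frequent non-stopword terms are an acceptable stand-in for the "linguistic information" the abstract mentions. Sentence-boundary handling is left out. The names BlockExtractor, context_name, build_context_xml, and the STOPWORDS list are hypothetical and chosen only for this illustration.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter
from html.parser import HTMLParser

# Hypothetical, deliberately tiny stopword list (an assumption of this sketch).
STOPWORDS = {
    "a", "an", "the", "and", "or", "of", "to", "in", "on", "for", "is", "are",
    "be", "by", "with", "within", "after", "over", "that", "this", "it", "as", "at",
}


class BlockExtractor(HTMLParser):
    """Collects (tag, text) pairs for headings and paragraphs."""

    BLOCK_TAGS = {"h1", "h2", "h3", "p"}

    def __init__(self):
        super().__init__()
        self.blocks = []
        self._tag = None
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self._tag, self._buf = tag, []

    def handle_data(self, data):
        if self._tag:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            # Collapse runs of whitespace inside the block.
            text = " ".join("".join(self._buf).split())
            if text:
                self.blocks.append((tag, text))
            self._tag = None


def context_name(text, fallback="section"):
    """Derive a descriptive element name from the most frequent content words."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    if not words:
        return fallback
    return "_".join(word for word, _ in Counter(words).most_common(2))


def build_context_xml(html):
    """Return an XML tree whose element names describe content, not layout."""
    parser = BlockExtractor()
    parser.feed(html)
    root = ET.Element("document")
    stack = [(0, root)]  # (heading level, element); the root sits at level 0
    for tag, text in parser.blocks:
        if tag.startswith("h"):
            level = int(tag[1])
            # Close contexts opened by headings at the same or a deeper level.
            while stack[-1][0] >= level:
                stack.pop()
            node = ET.SubElement(stack[-1][1], context_name(text))
            node.set("title", text)
            stack.append((level, node))
        else:
            # A paragraph becomes a child of the innermost open context.
            node = ET.SubElement(stack[-1][1], context_name(text, "paragraph"))
            node.text = text
    return root


if __name__ == "__main__":
    sample = "<h1>Shipping Policy</h1><p>Orders ship within two days.</p>"
    print(ET.tostring(build_context_xml(sample), encoding="unicode"))
```

In this sketch the element names play the role of the generated contexts, the nesting of elements plays the role of the generated hierarchy, and each element's opening and closing tags mark the start and end boundaries of the information it wraps.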

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU40253/01A AU4025301A (en) 1999-09-28 2000-09-20 System and method for automatic context creation for electronic documents

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US15692399P 1999-09-28 1999-09-28
US60/156,923 1999-09-28

Publications (3)

Publication Number Publication Date
WO2001024053A2 WO2001024053A2 (en) 2001-04-05
WO2001024053A9 WO2001024053A9 (en) 2002-10-03
WO2001024053A3 WO2001024053A3 (en) 2004-03-25

Family

ID=22561675

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2000/025755 WO2001024053A2 (en) 1999-09-28 2000-09-20 System and method for automatic context creation for electronic documents

Country Status (2)

Country Link
AU (1) AU4025301A (en)
WO (1) WO2001024053A2 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3480844B2 (en) 2001-10-26 2003-12-22 株式会社リコー Document management apparatus, document management control method, and document management control program
CN106033414A (en) * 2015-03-09 2016-10-19 北大方正集团有限公司 A hot spot information processing method and system
CN110738033B (en) * 2018-07-03 2023-09-19 百度在线网络技术(北京)有限公司 Report template generation method, device and storage medium
CN113221559B (en) * 2021-05-31 2023-11-03 浙江大学 Method and system for extracting Chinese key phrase in scientific and technological innovation field by utilizing semantic features

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5488725A (en) * 1991-10-08 1996-01-30 West Publishing Company System of document representation retrieval by successive iterated probability sampling
EP0802491A2 (en) * 1996-04-15 1997-10-22 Sun Microsystems, Inc. Structured documents on the WWW
WO1998026357A1 (en) * 1996-12-09 1998-06-18 Practical Approach Corporation Natural language meta-search system and method
WO1999023584A2 (en) * 1997-10-31 1999-05-14 Iota Industries Ltd. Information component management system

Also Published As

Publication number Publication date
AU4025301A (en) 2001-04-30
WO2001024053A2 (en) 2001-04-05
WO2001024053A9 (en) 2002-10-03

Similar Documents

Publication Publication Date Title
Takagi et al. Transcoding proxy for nonvisual web access
Maler et al. XML linking language (XLink)
WO2001050349A8 (en) Electronic document customization and transformation utilizing user feedback
EP1376392A3 (en) Method and system for associating actions with semantic labels in electronic documents
MY140805A (en) Providing contextually sensitive tools and help content in computer-generated documents
WO2004049204A3 (en) Transformation of web description documents
TWI266210B (en) Method and apparatus for extensible stylesheet designs
GB2368432A (en) System and method for language extraction and encoding
WO2001033422A3 (en) Automated processing and delivery of media to web servers
WO2005076155A3 (en) Method and system for automating creation of multiple stylesheet formats using an integrated visual design environment
US20080163077A1 (en) System and method for visually generating an xquery document
EP1452973A3 (en) Method and system for showing unannotated text nodes in a structured document
WO2003046756A3 (en) Schema, syntactic analysis method and method of generating a bit stream based on a schema
WO2001024053A3 (en) System and method for automatic context creation for electronic documents
WO2003034638A3 (en) Extensible mark-up language (xml) tracer for conversion of xml documents to hypertext markup language (html)
WO2001082121A3 (en) Pre-computing and encoding techniques for an electronic document to improve run-time processing
Cisco I/O Step Descriptions
Cisco Zones for Markup Language Documents
US7509573B1 (en) Anti-virus security information in an extensible markup language document
Salminen Metamarkup Languages: SGML and XML
Kansong et al. Graphic user interface and front-end operation on MS Windows
HASIDA Design and Application of Multimodal Common Format
Kappe et al. Hyper-G Text Format (HTF) Version 2.13
Welzenbach et al. Digital Publishing and Preservation Using the TEI
Hammond Extensible Markup Language (XML) Standard Generalized Markup Language (SGML)

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 EP: The EPO has been informed by WIPO that EP was designated in this application
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

AK Designated states

Kind code of ref document: C2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: C2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

COP Corrected version of pamphlet

Free format text: PAGES 1/10-10/10, DRAWINGS, REPLACED BY NEW PAGES 1/10-10/10; DUE TO LATE TRANSMITTAL BY THE RECEIVING OFFICE

32PN EP: Public notification in the EP bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS (R.69(1) EPC)

122 EP: PCT application non-entry in European phase
NENP Non-entry into the national phase

Ref country code: JP