A document (or multiple documents) is analyzed to identify entities of interest within that document. This is accomplished by constructing n-gram or bi-gram models that correspond to different kinds of text entities, such as chemistry-related words and generic English words. The models can be constructed...

http://www.google.fr/patents/US7493293?utm_source=gb-gplus-share
Patent US7493293 - System and method for extracting entities of interest from text using n-gram models
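
As a rough illustration of the idea in the abstract, the following Python sketch trains one character-level bi-gram model per kind of text entity and labels a token with whichever model assigns it the higher likelihood. The tiny word lists, the add-one smoothing, the character-level granularity, and every name used here (CharBigramModel, classify, CHEMISTRY_WORDS, ENGLISH_WORDS) are assumptions made for this example; the truncated abstract does not say how the patent actually builds or compares its models.

```python
import math
from collections import Counter

# Hypothetical toy training data; a real system would use large word lists.
CHEMISTRY_WORDS = ["benzene", "toluene", "ethanol", "methanol",
                   "propanol", "acetone", "methyl", "ethyl"]
ENGLISH_WORDS = ["the", "document", "is", "analyzed", "to", "identify",
                 "entities", "of", "interest", "within", "text", "words"]


class CharBigramModel:
    """Character bi-gram model with add-one (Laplace) smoothing."""

    def __init__(self, words):
        self.bigrams = Counter()   # counts of (previous char, next char)
        self.contexts = Counter()  # counts of previous char
        self.alphabet = set("^$")  # "^" marks word start, "$" marks word end
        for word in words:
            padded = "^" + word.lower() + "$"
            self.alphabet.update(padded)
            for prev, nxt in zip(padded, padded[1:]):
                self.bigrams[(prev, nxt)] += 1
                self.contexts[prev] += 1

    def log_prob(self, word, vocab_size):
        """Smoothed log-likelihood of `word` under this model."""
        padded = "^" + word.lower() + "$"
        total = 0.0
        for prev, nxt in zip(padded, padded[1:]):
            numerator = self.bigrams[(prev, nxt)] + 1
            denominator = self.contexts[prev] + vocab_size
            total += math.log(numerator / denominator)
        return total


def classify(word, models):
    """Return the label of the model that scores `word` highest."""
    # Shared vocabulary size (plus one slot for unseen characters) so that
    # scores from different models are smoothed the same way.
    alphabet = set().union(*(m.alphabet for m in models.values()))
    vocab_size = len(alphabet) + 1
    return max(models, key=lambda label: models[label].log_prob(word, vocab_size))


MODELS = {
    "chemistry": CharBigramModel(CHEMISTRY_WORDS),
    "english": CharBigramModel(ENGLISH_WORDS),
}

if __name__ == "__main__":
    for token in ["ethanol", "interest", "butanol"]:
        print(token, "->", classify(token, MODELS))
```

With these toy lists, a token such as "butanol", which appears in neither list, is still labeled as chemistry because its character sequences resemble the chemistry training words more closely than the English ones. Entities of interest in a document could then be picked out by running each token through classify and keeping those labeled "chemistry".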