KEYWORDS: Machine learning, LCDs, Mining, Personal digital assistants, Data mining, Data hiding, Analytical research, Detection and tracking algorithms, Binary data, Multimedia
Retrieving documents by subject matter is the general goal of information retrieval and other content access systems. There are aspects of textual content, however, which form equally valid selection criteria. One such aspect is that of sentiment or polarity - indicating the author's opinion or emotional relationship with some topic. Recent work in this area has treated polarity effectively as a discrete aspect of text. In this paper we present a lightweight but robust approach to combining topic and polarity, thus enabling content access systems to select content based on a certain opinion about a certain topic.
We present an overview of an information extraction application in the health insurance invoice processing domain. The system is novel in that it is not constrained by the document type - it has no explicit document model or document type classification phase. The system relies on constraints derived from a domain model, constraints derived from world state, and simple models of layout, including the use of labeled fields and the proximity of related information.
The ability to accurately detect those areas in plain text documents that consist of contiguous text is an important preprocessing step for many applications. This paper introduces a novel method that uses both spatial and linguistic knowledge in an accurate manner to provide an initial analysis of the document. This initial analysis may then be extended to provide a complete analysis of the text areas in the document.
Conference Committee Involvement (7)
Document Recognition and Retrieval XVI
21 January 2009 | San Jose, California, United States
Document Recognition and Retrieval XV
30 January 2008 | San Jose, California, United States
Document Recognition and Retrieval XIV
30 January 2007 | San Jose, California, United States
Document Recognition and Retrieval XIII
18 January 2006 | San Jose, California, United States
Document Recognition and Retrieval XII
19 January 2005 | San Jose, California, United States
Document Recognition and Retrieval XI
21 January 2004 | San Jose, California, United States