13 January 2003 Exploring a hybrid of support vector machines (SVMs) and a heuristic based system in classifying web pages
Ahmad Rahman, Yuliya Tarnikova, Hassan Alam
Proceedings Volume 5010, Document Recognition and Retrieval X; (2003) https://doi.org/10.1117/12.472836
Event: Electronic Imaging 2003, Santa Clara, CA, United States
Abstract
With the proliferation of devices used to browse the web and the shift of document access to web interfaces, classifying web pages into pre-selected types is becoming very important. Such classification often forms the pre-processing stage of web applications. However, web page classification is known to be a difficult problem: it is inherently hard to identify features of web pages that are distinctive, and therefore equally hard to devise a set of heuristics that accomplishes the task. This paper describes a solution that combines a heuristic-based system with a Support Vector Machine (SVM). The hybrid system is found to achieve very high accuracy compared with using SVMs on their own.
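The abstract does not say how the heuristic system and the SVM are combined. As a purely illustrative sketch, one possible arrangement is a cascade in which hand-written rules label the obvious pages and a linear SVM handles the rest. Everything in the example is an assumption made for illustration and is not taken from the paper: the two features (`link_ratio`, `text_ratio`), the thresholds, the labels "navigation" and "content", the toy training data, and the small sub-gradient trainer standing in for a real SVM library.

```python
# Illustrative sketch only: the features, thresholds, labels, and the cascade
# design are assumptions, not the method described in the paper.

def heuristic_label(x):
    """Hypothetical rules over x = [link_ratio, text_ratio].

    Returns a label only for obvious pages, otherwise None.
    """
    link_ratio = x[0]
    if link_ratio >= 0.95:
        return "navigation"
    if link_ratio <= 0.05:
        return "content"
    return None  # not confident; defer to the SVM


def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Linear SVM trained by sub-gradient descent on the L2-regularized
    hinge loss. Labels in y must be +1 or -1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Shrink the weights (gradient of the L2 penalty).
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Step toward the example only if it violates the margin.
            if margin < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def svm_label(model, x):
    """Map the sign of the SVM decision value to a page label."""
    w, b = model
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return "navigation" if score >= 0 else "content"


def hybrid_label(model, x):
    """Use the heuristic answer when there is one, else fall back to the SVM."""
    rule = heuristic_label(x)
    return rule if rule is not None else svm_label(model, x)


# Made-up training pages: [link_ratio, text_ratio]; +1 = navigation, -1 = content.
X_TRAIN = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.2], [0.2, 0.8]]
Y_TRAIN = [+1, -1, +1, -1]
MODEL = train_linear_svm(X_TRAIN, Y_TRAIN)

if __name__ == "__main__":
    print(hybrid_label(MODEL, [0.97, 0.03]))  # decided by the heuristic rule
    print(hybrid_label(MODEL, [0.70, 0.30]))  # decided by the SVM fallback
```

Other arrangements are equally plausible, for example feeding the heuristic outputs to the SVM as extra input features. The cascade is shown only because it is the simplest to read.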
© (2003) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Ahmad Rahman, Yuliya Tarnikova, and Hassan Alam "Exploring a hybrid of support vector machines (SVMs) and a heuristic based system in classifying web pages", Proc. SPIE 5010, Document Recognition and Retrieval X, (13 January 2003); https://doi.org/10.1117/12.472836
CITATIONS
Cited by 2 scholarly publications.
KEYWORDS
Computing systems
Classification systems
Computer programming
Data conversion
Detection and tracking algorithms
Error analysis
Image classification
