Recognition as Translating Images into Text
10 January 2003
Kobus Barnard, Pinar Duygulu, David A. Forsyth
Proceedings Volume 5018, Internet Imaging IV; (2003) https://doi.org/10.1117/12.478427
Event: Electronic Imaging 2003, 2003, Santa Clara, CA, United States
Abstract
We present an overview of a new paradigm for tackling long-standing computer vision problems. Specifically, our approach is to build statistical models that translate from visual representations (images) to semantic ones (associated text). Because providing optimal text for training is difficult at best, we propose working with whatever associated text is available in large quantities. Examples include large image collections with keywords, museum image collections with descriptive text, news photos, and images on the web. In this paper we discuss how the translation approach can give a handle on difficult questions such as: What counts as an object? Which objects are easy to recognize and which are hard? Which objects are indistinguishable using our features? How can low-level vision processes, such as feature-based segmentation, be integrated with high-level processes, such as grouping? We also summarize some of the models proposed for translating from visual information to text, and some of the methods used to evaluate their performance.
© (2003) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Kobus Barnard, Pinar Duygulu, and David A. Forsyth "Recognition as Translating Images into Text", Proc. SPIE 5018, Internet Imaging IV, (10 January 2003); https://doi.org/10.1117/12.478427
CITATIONS
Cited by 16 scholarly publications.
KEYWORDS
Data modeling
Image segmentation
Information visualization
Visualization
Image processing
Performance modeling
Computer vision technology

