World Digital Libraries- An International Journal
Year : 2008, Volume : 1, Issue : 1
First page : (47) Last page : (59)
Print ISSN : 0974-567X. Online ISSN : 0975-7597.

Recognition-free search in graphics stream of PDF

Balasubramanian A1,*, Jawahar C V2,**

1HP Labs India, Bangalore – 560 030, India

2Centre for Visual Information Technology, International Institute of Information Technology, Gachibowli, Hyderabad – 500 032, India

*bala.a@hp.com

**jawahar@iiit.ac.in

Published online on 25 November 2013.

Abstract

Digital libraries are becoming an integral part of our day-to-day life. Digitized books and manuscripts in many of these libraries are often stored as images or graphics. Very often, they cannot be searched at the content level because robust character recognizers are lacking. PDF (Portable Document Format) has emerged as one of the most popular document representation schemas in digital libraries, especially for storing scanned documents. When no textual (Unicode, ASCII) representation is available, scanned images are stored in the graphics stream of the PDF. In this paper, we describe a solution for searching, at the content level, the textual data in the graphics stream of PDF files. The proposed solution is demonstrated by enhancing an open-source PDF viewer (Xpdf). Indian language support is also provided. Users can type a word in Roman script (ITRANS), view it rendered in a font, and search the textual and graphics streams of the PDF simultaneously.
