Document Conversion
Digitization & Archiving:
The objective of digitization and archiving is to build content databases that make information resources easier to access and disseminate.

Our content-conversion department digitizes written and printed records, converting them into electronic form. The content may be text, image, audio, or a combination of these three (multimedia). The output of the digitization process is an electronic document, which is then hosted on the Internet or an intranet. The electronic document can be in a format such as “PDF” (Portable Document Format), “TIFF” (Tagged Image File Format), “JPG” (Joint Photographic Experts Group), “SGML” (Standard Generalized Markup Language), “HTML” (Hypertext Markup Language), or “XML” (Extensible Markup Language). These electronic documents are used to store information on the Web because the files are relatively small and easy to download and transfer. They also keep a consistent print and display format across many platforms.

The digitization process at PATELiinfo follows a series of steps:
1. Identification of the items for collection.
2. Scanning of the documents.
3. Organizing the scanned files.
4. Optical character recognition (OCR) of the documents.
5. XML or HTML coding as per the DTD (Document Type Definition).
6. Parsing and validation.
7. Browser view through the style sheet or XSLT.
8. Quality check.
9. Upload to the client’s database.
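
As an illustration of steps 5 and 6 only, the sketch below wraps recognized text in XML and then parses it to check its structure. It is not our production tooling. The element names (document, title, body, para) and the function names are hypothetical stand-ins for whatever a project DTD would declare. Python's standard library does not validate against a DTD, so the check here is limited to well-formedness and the presence of required elements. Full DTD validation would need a validating parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names standing in for what a project DTD would declare.
REQUIRED_CHILDREN = ("title", "body")


def encode_as_xml(title, ocr_text):
    """Step 5 (sketch): wrap recognized text in a simple XML structure."""
    root = ET.Element("document")
    ET.SubElement(root, "title").text = title
    body = ET.SubElement(root, "body")
    # Treat each blank-line-separated block of recognized text as one paragraph.
    for block in ocr_text.split("\n\n"):
        block = " ".join(block.split())  # collapse line breaks and extra spaces
        if block:
            ET.SubElement(body, "para").text = block
    # Special characters such as "&" are escaped during serialization.
    return ET.tostring(root, encoding="unicode")


def parse_and_check(xml_text):
    """Step 6 (sketch): parse the markup and list structural problems."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]
    problems = []
    if root.tag != "document":
        problems.append(f"unexpected root element <{root.tag}>")
    for name in REQUIRED_CHILDREN:
        if root.find(name) is None:
            problems.append(f"missing required element <{name}>")
    return problems


if __name__ == "__main__":
    xml_text = encode_as_xml("Annual Report", "First paragraph.\n\nSecond & last paragraph.")
    print(xml_text)
    print(parse_and_check(xml_text))
```

An empty list from parse_and_check means the file passed these basic checks and could move on to the browser view and quality check steps.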
Converted documents can be delivered in any of the following formats, depending on the project requirements:
i. XML
ii. HTML, XHTML, SGML
iii. Text Searchable PDF
iv. TEXT
v. DOC/RTF
vi. XSL/CSV
vii. VSG
viii. DAT


