Document Imaging Workstation
Document imaging is an information technology category for systems capable of replicating documents commonly used in business. Document imaging systems can take many forms, including microfilm, on-demand printers, facsimile machines, copiers, multifunction printers, document scanners, computer output microfilm (COM) and archive writers. Since the 1990s, “document imaging” has been used to describe software-based computer systems that capture, store and reprint images.
Document imaging is a form of enterprise content management. In the early days of content management technologies, the term “document imaging” was used interchangeably with “document image management” as the industry tried to separate itself from the micrographic and reprographic technologies.
Scanning the document is only one part of the process. For the scanned image to be useful, it must be transferred from the scanner to an application running on the computer. There are two basic issues: how the scanner is physically connected to the computer, and how the application retrieves the information from the scanner.
Scanners typically read red-green-blue (RGB) color data from the scanning array. This data is then processed with a proprietary algorithm to correct for different exposure conditions and sent to the computer via the device’s input/output interface (usually USB; older units used SCSI or a bidirectional parallel port). Color depth varies depending on the scanning array characteristics, but is usually at least 24 bits. High-quality models have 36 to 48 bits of color depth.
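To make the color depth figures concrete, the following minimal Python sketch works out how many levels per channel and how many distinct colors a given depth allows. It assumes the bits are split evenly across the three RGB channels; the function name `color_depth_summary` is hypothetical and chosen only for illustration.

```python
def color_depth_summary(total_bits: int, channels: int = 3) -> dict:
    """Summarize what an RGB color depth can represent.

    Assumes the bits are divided evenly between the channels.
    """
    if total_bits % channels != 0:
        raise ValueError("total_bits must divide evenly across the channels")
    bits_per_channel = total_bits // channels
    return {
        "bits_per_channel": bits_per_channel,
        "levels_per_channel": 2 ** bits_per_channel,
        "distinct_colors": 2 ** total_bits,
    }


# The depths mentioned in the text: 24 bits (typical) and 36 to 48 bits (high quality).
for depth in (24, 36, 48):
    print(depth, color_depth_summary(depth))
```

At 24 bits each channel has 256 levels, for 16,777,216 distinct colors; at 48 bits each channel has 65,536 levels.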
Another qualifying parameter for a scanner is its resolution, measured in pixels per inch (ppi), sometimes more accurately referred to as samples per inch (spi). Manufacturers often quote the interpolated resolution, which is much higher thanks to software interpolation, rather than the scanner’s true optical resolution, which is the only meaningful parameter. As of 2009, a high-end flatbed scanner can scan at up to 5,400 ppi, and drum scanners have an optical resolution of between 3,000 and 24,000 ppi.
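The gap between optical and interpolated resolution can be illustrated with a small Python sketch. It assumes simple linear interpolation along one row of samples; real scanner software may use other methods, and the function name `interpolate_row` and the sample values are hypothetical.

```python
def interpolate_row(samples: list[float], factor: int) -> list[float]:
    """Linearly interpolate one row of optical samples.

    `factor` is the advertised multiple of the optical resolution.
    Every new value is computed from its two measured neighbors.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    if len(samples) < 2:
        raise ValueError("need at least two samples to interpolate")
    out = []
    for left, right in zip(samples, samples[1:]):
        for step in range(factor):
            out.append(left + (right - left) * step / factor)
    out.append(samples[-1])
    return out


optical = [10.0, 20.0, 40.0]            # values actually measured by the array
interpolated = interpolate_row(optical, 2)
print(interpolated)                     # [10.0, 15.0, 20.0, 30.0, 40.0]
print(interpolated[::2] == optical)     # True: the measured samples are unchanged
```

The output row has more samples, but each added value is derived from measurements the scanner already had, which is why the interpolated figure says nothing about how much detail the optics can resolve.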
The scanned result is an uncompressed RGB image, which can be transferred to a computer’s memory. Some scanners compress and clean up the image using embedded firmware. Once on the computer, the image can be processed with a raster graphics program (such as Photoshop or the GIMP) and saved on a storage device (such as a hard disk).
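Because the scanned result is uncompressed, its size follows directly from the page dimensions, the resolution and the color depth. The sketch below shows the arithmetic; the US Letter page size (8.5 by 11 inches) and the 300 ppi and 600 ppi settings are assumptions chosen for illustration, and `uncompressed_scan_bytes` is a hypothetical helper name.

```python
def uncompressed_scan_bytes(width_in: float, height_in: float,
                            ppi: int, bits_per_pixel: int = 24) -> int:
    """Return the size in bytes of an uncompressed scan of the given area."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    total_bits = width_px * height_px * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


# Assumed example: a US Letter page at two common settings.
print(uncompressed_scan_bytes(8.5, 11, 300, 24))  # 25245000 bytes, about 25 MB
print(uncompressed_scan_bytes(8.5, 11, 600, 48))  # 201960000 bytes, about 202 MB
```

Doubling the resolution quadruples the pixel count, and doubling the color depth doubles the size again, which is one reason some scanners compress the image in firmware before transfer.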
Images are usually stored on a hard disk. Pictures are normally stored in image formats such as uncompressed bitmap, “non-lossy” (lossless) compressed TIFF and PNG, and “lossy” compressed JPEG. Documents are best stored in TIFF or PDF format; JPEG is particularly unsuitable for text, because its lossy compression leaves visible artifacts around the sharp edges of characters. Optical character recognition (OCR) software allows a scanned image of text to be converted into editable text with reasonable accuracy, so long as the text is cleanly printed and in a typeface and size that can be read by the software. OCR capability may be integrated into the scanning software, or the scanned image file can be processed with a separate OCR program.
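The difference between lossless and lossy storage can be sketched in a few lines of Python. The lossless path uses the standard `zlib` module, whose DEFLATE compression is the method PNG uses internally. The lossy path is only a hypothetical stand-in (coarse quantization of pixel values) and is not JPEG’s actual algorithm; it illustrates just that discarded information cannot be recovered.

```python
import zlib


def lossless_round_trip(pixels: bytes) -> bytes:
    """Compress and decompress with DEFLATE; no information is discarded."""
    return zlib.decompress(zlib.compress(pixels))


def lossy_round_trip(pixels: bytes, step: int = 64) -> bytes:
    """Stand-in for lossy coding: snap each value to the middle of a bucket.

    This is NOT how JPEG works; it only shows an irreversible step.
    """
    return bytes(min(255, (p // step) * step + step // 2) for p in pixels)


# One grayscale row with hard black/white transitions, roughly like printed text.
row = bytes(0 if (x // 4) % 2 == 0 else 255 for x in range(64))

print(lossless_round_trip(row) == row)  # True: every pixel is recovered exactly
print(lossy_round_trip(row) == row)     # False: black and white became grays
```

For scanned documents, the exact recovery offered by the lossless path is what keeps character edges crisp for later reading or OCR.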
Document Scanner – The scanning or digitization of paper documents for storage places different requirements on the scanning equipment than the scanning of pictures for reproduction does. While documents can be scanned on general
Document Management Software – Document management software is used to track and store electronic documents, images of paper documents, or both. It is usually also capable of keeping track of the different versions created by different users (history tracking). The term overlaps somewhat with the concept of content management systems.
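History tracking can be pictured as an append-only list of versions, each recording who saved it. The following is a minimal in-memory sketch under that assumption, not a description of any particular product; the class names `Version` and `TrackedDocument`, the file name and the user names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Version:
    number: int     # 1-based position in the document's history
    author: str     # user who saved this version
    content: bytes  # stored document or image data


@dataclass
class TrackedDocument:
    name: str
    versions: list[Version] = field(default_factory=list)

    def save(self, author: str, content: bytes) -> Version:
        """Append a new version; earlier versions are never overwritten."""
        version = Version(len(self.versions) + 1, author, content)
        self.versions.append(version)
        return version

    def latest(self) -> Version:
        if not self.versions:
            raise LookupError("document has no saved versions")
        return self.versions[-1]

    def history(self) -> list[tuple[int, str]]:
        """Return (version number, author) pairs, oldest first."""
        return [(v.number, v.author) for v in self.versions]


doc = TrackedDocument("invoice.tif")
doc.save("alice", b"first scan")
doc.save("bob", b"rescanned at higher resolution")
print(doc.history())         # [(1, 'alice'), (2, 'bob')]
print(doc.latest().author)   # bob
```

Keeping every version immutable makes it possible to see who changed what and to return to an earlier state, which is the essence of the history tracking described above.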