Magazine Scanning Overview

A Brief Overview of Scanning Magazines

Scanning magazines and producing searchable PDF files from those scans is a fairly simple, if time-consuming, process. The bound material scanner located at the library is basically just a pair of DSLR cameras aimed at a two-part platen on which an open book or magazine can be held in place by a glass cover. The cameras are hard-wired to a computer that contains the scanning software and stores the raw images as JPEG files. After each pair of pages is photographed, the software names and numbers the images sequentially and stores them in appropriately named files. After the scanning session is finished, the stored files are transferred to the processing computer for the next step. With diligent work, and as long as everything goes smoothly, it is possible to scan 800 or more pages in a two-hour period, depending on the type of material being scanned.

The second step in the process is done on a second, more powerful computer, using software supplied by the scanner manufacturer that takes the individual raw JPEGs and creates finished JPEGs from them. The pairs of images (left and right page photographs taken with the two cameras in sequence) are straightened, cropped, and resized, and the exposure and contrast levels are adjusted as needed to produce good-quality images of each page, ready for the next step.

PDF files are produced using Adobe Acrobat software, which takes multiple JPEG images and makes one PDF file per magazine issue. The size of the PDF depends on the number of pages in the magazine and the type of data on each page. A color magazine with lots of photography will generate a much larger PDF than a monochrome magazine that is mostly printed words. After the PDF is compiled, the OCR tool in Acrobat is used to make a searchable document. The final result is saved with a descriptive name.

Finally, the PDF files of the scanned magazines are both saved to portable hard drives and burned to DVDs.
They can be viewed on any computer that can display PDF files, and individual pages can easily be printed. The finished JPEGs are also saved to the hard drives, since they will yield better copies of photographs from the magazines and the data contained in the JPEGs might be wanted for other purposes in the future.

Processing the scanned images is much more time-consuming than the scanning itself. The final product seems to be of acceptable quality. It is certainly easy to read the magazines and find desired information using the files generated by the procedure above.
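
To make the idea of sequentially numbered left and right page images a little more concrete, here is a minimal sketch in Python of how such files could be sorted and grouped into two-page spreads. It is only an illustration and not how the scanner manufacturer's software works. The file names (scan_0001.jpg and so on), the assumption that the left page of each spread is numbered before the right page, and the function names sequence_number and pair_spreads are all hypothetical.

```python
import re
from typing import List, Tuple


def sequence_number(filename: str) -> int:
    """Return the last run of digits in the file name (extension excluded)."""
    stem = filename.rsplit(".", 1)[0]
    matches = re.findall(r"\d+", stem)
    if not matches:
        raise ValueError(f"no sequence number in {filename!r}")
    return int(matches[-1])


def pair_spreads(filenames: List[str]) -> List[Tuple[str, str]]:
    """Sort images by sequence number and group them into (left, right) pairs.

    Assumption: within each spread the left page is numbered first.
    """
    ordered = sorted(filenames, key=sequence_number)
    if len(ordered) % 2 != 0:
        raise ValueError("odd number of images; a spread is missing a page")
    return [(ordered[i], ordered[i + 1]) for i in range(0, len(ordered), 2)]


# Hypothetical file names, deliberately out of order.
raw = ["scan_0003.jpg", "scan_0001.jpg", "scan_0004.jpg", "scan_0002.jpg"]
for number, (left, right) in enumerate(pair_spreads(raw), start=1):
    print(f"spread {number}: left={left} right={right}")
```

Raising an error on an odd number of images is a deliberate choice in this sketch: a missing page is easier to fix by rescanning one spread right away than by discovering the gap after the PDF has been compiled.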