LOC Workshop on Etexts - Library of Congress
- Author: Library of Congress
******
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
DISCUSSION * Project operating frequencies *
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
During a brief discussion period, which also concluded the day’s proceedings, BROWNRIGG stated that the project was operating on four frequencies. The slow-speed link operates at 435 megahertz and would later go up to 920 megahertz. On the high-speed side, the one-megabyte radios will run at 2.4 gigahertz, and the 1.5-megabyte radios will run at 5.7 gigahertz. At 5.7 gigahertz, rain can be a factor, but it would have to be tropical rain, unlike what falls in most parts of the United States.
******
SESSION IV. IMAGE CAPTURE, TEXT CAPTURE, OVERVIEW OF TEXT AND
IMAGE STORAGE FORMATS
William HOOTON, vice president of operations, I-NET, moderated this session.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
KENNEY * Factors influencing development of CXP * Advantages of using digital technology versus photocopy and microfilm * A primary goal of CXP; publishing challenges * Characteristics of copies printed * Quality of samples achieved in image capture * Several factors to be considered in choosing scanning * Emphasis of CXP on timely and cost-effective production of black-and-white printed facsimiles * Results of producing microfilm from digital files * Advantages of creating microfilm * Details concerning production * Costs * Role of digital technology in library preservation *
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Anne KENNEY, associate director, Department of Preservation and Conservation, Cornell University, opened her talk by observing that the Cornell Xerox Project (CXP) has been guided by the assumption that the ability to produce printed facsimiles, or to replace paper with paper, would be important, at least for the present generation of users and equipment. She described three factors that influenced development of the project: 1) Because the project has emphasized the preservation of deteriorating brittle books, the quality of what was produced had to be sufficiently high to return a paper replacement to the shelf. 2) CXP was interested only in a system that was cost-effective, which meant that it had to be cost-competitive with the processes currently available, principally photocopy and microfilm. 3) CXP was interested only in using new or currently available product hardware and software.
KENNEY described the advantages that using digital technology offers over both photocopy and microfilm: 1) The potential exists to create a higher-quality reproduction of a deteriorating original than conventional light-lens technology can produce. 2) Because a digital image is an encoded representation, it can be reproduced again and again with no resulting loss of quality, as opposed to the situation with light-lens processes, in which there is a discernible difference between a second and a subsequent generation of an image. 3) A digital image can be manipulated in a number of ways to improve image capture; for example, Xerox has developed a windowing application that enables one to capture a page containing both text and illustrations in a manner that optimizes the reproduction of both. (With light-lens technology, one must choose which to optimize, the text or the illustration; in preservation microfilming, the current practice is to shoot an illustrated page twice, once to highlight the text and a second time to provide the best capture for the illustration.) 4) A digital image can also be edited: density levels can be adjusted to remove underlining and stains and to increase the legibility of faint documents. 5) On-screen inspection can take place at the time of initial setup, and adjustments can be made prior to scanning, factors that substantially reduce the number of retakes required in quality control.
A primary goal of CXP has been to evaluate the paper output printed on the Xerox DocuTech, a high-speed printer that produces 600-dpi pages from scanned images at a rate of 135 pages a minute. KENNEY recounted several publishing challenges to producing faithful and legible reproductions of the originals, challenges that the 600-dpi copy for the most part successfully met. For example, many of the deteriorating volumes in the project were heavily illustrated with fine line drawings or halftones, or came in languages such as Japanese, in which the buildup of characters composed of varying strokes is difficult to reproduce at lower resolutions; a surprising number of them came with annotations and mathematical formulas, which it was critical to be able to duplicate exactly.
KENNEY noted that 1) the copies are being printed on paper that meets the ANSI standards for performance, 2) the DocuTech printer meets the machine and toner requirements for proper adhesion of print to page, as described by the National Archives, and thus 3) the paper product is considered to be the archival equivalent of preservation photocopy.
KENNEY then discussed several samples of the quality achieved in the project that had been distributed in a handout, for example, a copy of a print-on-demand version of the 1911 Reed lecture on the steam turbine, which contains halftones, line drawings, and illustrations embedded in text; the first four loose pages in the volume compared the capture capabilities of scanning with those of photocopy for a standard test target, the IEEE Std 167A-1987 test chart. In all instances scanning proved superior to photocopy, though only slightly more so in one.
Conceding the simplistic nature of her comparison of the quality of scanning with that of photocopy, KENNEY described it as one representation of the kinds of settings that could be used with the scanning capabilities of the equipment CXP uses. KENNEY also pointed out that CXP investigated the quality achieved with binary scanning only, and noted the great promise of gray-scale and color scanning, whose advantages and disadvantages need to be examined. She argued further that the choice of scanning resolution and file format represents a complex trade-off among the time it takes to capture material, file size, fidelity to the original, on-screen display, printing, and equipment availability. All these factors must be taken into consideration.
CXP placed primary emphasis on the production in a timely and cost-effective manner of printed facsimiles that consisted largely of black-and-white text. With binary scanning, large files may be compressed efficiently and in a lossless manner (i.e., no data is lost in the process of compressing [and decompressing] an image—the exact bit-representation is maintained) using Group 4 CCITT (i.e., the French acronym for International Consultative Committee for Telegraph and Telephone) compression. CXP was getting compression ratios of about forty to one. Gray-scale compression, which primarily uses JPEG, is much less economical and can represent a lossy compression (i.e., not lossless), so that as one compresses and decompresses, the illustration is subtly changed. While binary files produce a high-quality printed version, it appears 1) that other combinations of spatial resolution with gray and/or color hold great promise as well, and 2) that gray scale can represent a tremendous advantage for on-screen viewing. The quality associated with binary and gray scale also depends on the equipment used. For instance, binary scanning produces a much better copy on a binary printer.
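The storage arithmetic behind these figures can be sketched roughly as follows. This is only an illustration: the 8.5 x 11 inch page size is an assumption (no page dimensions were given in the talk), row padding is ignored, and the function names are hypothetical. Only the 600-dpi resolution and the forty-to-one compression ratio come from the discussion.

```python
# Rough size estimate for a binary (1 bit per pixel) page image.
# ASSUMPTION: an 8.5 x 11 inch page; the talk gave no page dimensions.

def binary_page_bytes(width_in, height_in, dpi):
    """Uncompressed size in bytes of a 1-bit-per-pixel scan (no row padding)."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels // 8  # eight binary pixels fit in one byte

def compressed_bytes(raw_bytes, ratio):
    """Size after compression at the given ratio (40 means forty to one)."""
    return raw_bytes / ratio

raw = binary_page_bytes(8.5, 11, 600)   # 600 dpi, as in CXP
packed = compressed_bytes(raw, 40)      # "about forty to one"

print(f"uncompressed: {raw / 1e6:.1f} MB per page")
print(f"compressed:   {packed / 1e3:.0f} KB per page")
```

Under these assumptions a page shrinks from roughly 4.2 megabytes to roughly 105 kilobytes, which suggests why binary scanning with Group 4 compression was attractive for text-heavy volumes.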
Among CXP’s findings concerning the production of microfilm from digital files, KENNEY reported that the digital files for the same Reed lecture were used to produce sample film using an electron beam recorder. The resulting film was faithful to the image capture of the digital files, and while CXP felt that the text and image pages represented in the Reed lecture were superior to those of the light-lens film, the resolution readings for the 600-dpi film were not as high as those for standard microfilming. KENNEY argued that the standards defined for light-lens technology are not totally transferable to a digital environment. Moreover, they are based on a definition of quality for a preservation copy. Although making this case will prove to be a long, uphill struggle, CXP plans to continue to investigate the issue over the course of the next year.
KENNEY concluded this portion of her talk with a discussion of the advantages of creating film: it can serve as a primary backup and as a preservation master to the digital file; the digital file could then become the print or production master, and service copies could be paper, film, optical disks, magnetic media, or on-screen display.
Finally, KENNEY presented details concerning production:
* Development and testing of a moderately high-resolution production
scanning workstation represented a third goal of CXP; to date, 1,000
volumes have been scanned, or about 300,000 images.
* The resulting digital files are stored and used to produce
hard-copy replacements for the originals and additional prints on
demand; although the initial costs are high, scanning technology
offers an affordable means for reformatting brittle material.
* A technician in production mode can scan 300 pages per hour when
performing single-sheet scanning, which is a necessity when working
with truly brittle paper; this figure is expected to increase
significantly with subsequent iterations of the software from Xerox;
a three-month time-and-cost study of scanning found that the average
300-page book would take about an hour and forty minutes to scan
(this figure included the time for setup, which involves keying in
primary bibliographic data, going into quality control mode to
define page size, establishing front-to-back registration, and
scanning sample pages to identify a default range of settings for
the entire book—functions not dissimilar to those performed by
filmers or those preparing a book for photocopy).
* The final step in the scanning process involved rescans, which
happily were few and far between, representing well under 1 percent
of the total pages scanned.
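The throughput figures in the list above can be related to one another with a little arithmetic. The sketch below is an interpretation, not a result reported by CXP: it assumes that the difference between the hour-and-forty-minute average and the pure scanning time is taken up by setup and other non-scanning work, and that the rescan ceiling of 1 percent applies across all 300,000 images. The names are hypothetical.

```python
# Figures quoted in the production details.
PAGES_PER_HOUR = 300         # single-sheet scanning rate in production mode
BOOK_PAGES = 300             # the "average 300-page book"
BOOK_TOTAL_MINUTES = 100     # "about an hour and forty minutes"
TOTAL_IMAGES = 300_000       # images scanned to date
RESCAN_CEILING_PERCENT = 1   # rescans were "well under 1 percent"

def pure_scan_minutes(pages, pages_per_hour):
    """Minutes spent feeding pages at the quoted production rate."""
    return pages * 60 / pages_per_hour

def effective_pages_per_hour(pages, total_minutes):
    """Throughput once setup and other non-scanning time is included."""
    return pages * 60 / total_minutes

scan_time = pure_scan_minutes(BOOK_PAGES, PAGES_PER_HOUR)
# ASSUMPTION: everything beyond pure scanning is setup and related work.
non_scan_time = BOOK_TOTAL_MINUTES - scan_time
effective_rate = effective_pages_per_hour(BOOK_PAGES, BOOK_TOTAL_MINUTES)
# ASSUMPTION: the 1 percent ceiling applies to the whole 300,000-image run.
max_rescans = TOTAL_IMAGES * RESCAN_CEILING_PERCENT // 100

print(f"pure scanning: {scan_time:.0f} min, other work: {non_scan_time:.0f} min")
print(f"effective rate: {effective_rate:.0f} pages per hour")
print(f"rescans: fewer than {max_rescans} pages")
```

Read this way, about 40 of the 100 minutes per book go to work other than feeding pages, so the effective rate is nearer 180 pages per hour than 300.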
In addition to technician time, CXP costed out the equipment (amortized over four years), the cost of storing the digital files and refreshing them every four years, and the cost of printing a paper reproduction and binding it in book cloth. The total amounted to a little under $65 per single 300-page volume, with 30 percent overhead included, a figure competitive with the prices currently charged by photocopy vendors.
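For a sense of scale, the per-volume total can be broken down per page. The sketch below treats $65 as an upper bound, since the figure quoted was "a little under" that amount, and it assumes that the 30 percent overhead was applied as a markup on direct costs; the talk did not say how overhead was calculated. The names are hypothetical.

```python
TOTAL_PER_VOLUME = 65.00   # upper bound: the total was "a little under $65"
PAGES_PER_VOLUME = 300
OVERHEAD_RATE = 0.30       # "30 percent overhead included"

def cost_per_page(total, pages):
    """Average cost of one page of the finished volume."""
    return total / pages

def cost_before_overhead(total, overhead_rate):
    """Direct cost, ASSUMING overhead is a markup on direct costs."""
    return total / (1 + overhead_rate)

per_page = cost_per_page(TOTAL_PER_VOLUME, PAGES_PER_VOLUME)
direct = cost_before_overhead(TOTAL_PER_VOLUME, OVERHEAD_RATE)

print(f"at most about ${per_page:.2f} per page")
print(f"at most about ${direct:.2f} per volume before overhead")
```

On these assumptions the cost works out to at most about 22 cents per page, and to about $50 per volume before overhead.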
Of course, with scanning, in addition to the paper facsimile, one is left with a digital file from which subsequent copies of the book can be produced for a fraction of the cost of photocopy, with readers afforded choices in the form of these copies.
KENNEY concluded that digital technology offers an electronic means for a library preservation effort to pay for itself. If a brittle-book program included the means of disseminating reprints of books that are in demand by libraries and researchers alike, the initial investment in capture could be recovered and used to preserve additional but less popular books. She disclosed that an economic model for a self-sustaining program could be developed for CXP’s report to the Commission on Preservation and Access (CPA).
KENNEY stressed that the focus of CXP has been on obtaining high quality in a production environment. The use of digital technology is viewed as an affordable alternative to other reformatting options.
******
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
ANDRE * Overview and history of NATDP * Various agricultural CD-ROM products created inhouse and by service bureaus * Pilot project on Internet transmission * Additional products in progress *
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Pamela