Introduction
To date, because we deposit to DSpace, our standards have focused on the file formats that can best be represented in our depository. The access point chosen for a particular collection, such as HathiTrust or the OCA (Open Content Alliance), drives the standards that are used.
Common File Types:
Digital images are saved in a file format, the structure by which data is organized in a file. Common file formats include:
- TIFF (Tagged Image File Format)
- JPEG (Joint Photographic Experts Group)
- JPEG-2000 (Joint Photographic Experts Group - 2000)
- GIF (Graphics Interchange Format)
- BMP (Bitmap)
Management of Digitized files
Digital image: A digital image is defined for the purposes of this document as a raster-based,
two-dimensional, rectangular array of static data elements called pixels, intended for display on a
computer monitor or for transformation to another format, such as a printed page.
Digital master image files: When analog materials are converted to digital through a digital
reformatting process (such as scanning, photographing with a digital camera, etc.), the digital
master image file is the file created as the direct result of image capture. The digital master image
file should represent as accurately as possible the visual information in the original object.
However, if the original object cannot be digitized directly due to its size or other attributes, it may
be necessary to use a photographic intermediary. Care should be taken that the photographic
intermediary is well documented and represents the original object as accurately as possible.
In the case of analog materials reformatted to digital, the primary function of digital master image
files is to serve as a long-term archival record and as a source for derivative files. A digital master
image file may serve as a surrogate for the original, may completely replace originals, or may be
used as security against possible loss of originals due to disaster, theft and/or deterioration.
For "born digital" objects that were not created through a digital reformatting process, the digital
master image file comprises the original, source digital file itself.
The long-term preservation of digital master image files requires a strategy of identification,
storage, and migration to new media, as well as policies about image use and access. It is
essential that master files remain unaltered over time. Lossy compression techniques such as
JPEG should not be applied to master files, and migration procedures should include quality
control procedures to ensure that the integrity of the files is maintained throughout the entire
process.
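One common way to check that a master file is unaltered after migration is to compare checksums of the source and the migrated copy. The following is a minimal sketch under that assumption, not a description of the procedure actually used here; the function names `file_checksum` and `verify_migration` and the choice of SHA-256 are illustrative.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read fixed-size chunks until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_migration(source, destination, algorithm="sha256"):
    """Return True when the migrated copy has the same checksum as the source."""
    return file_checksum(source, algorithm) == file_checksum(destination, algorithm)
```

Recording the checksum when the master is first created would also allow later checks against the stored value, even after the original storage medium has been retired.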
The specifications for derivative files used for image presentation may change over time; digital
master image files can serve an archival purpose, and can be processed by different presentation
methods to create necessary derivative files without the expense of digitizing the original object
again. Because the process of image capture is so labor-intensive, the goal should be to create a
master that has a useful life of at least 50 years. Therefore, collection managers should anticipate
a wide variety of future uses, and capture at a quality high enough to satisfy these uses. In
general, decisions about image capture should err towards the highest quality.
Derivative files: These files are created from digital master image files for editing or
enhancement, conversion of the master to different formats, and presentation and transmission
over networks. Examples include access and thumbnail images.
Glossary of digital imaging terms:
Born Digital: objects originally created in a digital format, e.g. photos from digital cameras or Microsoft Word documents.
Bit Depth: the tonal or signal resolution; determines maximum number of shades of gray or colors in a digital file.
Color Mode: whether the image is black and white, grayscale, or color. Grayscale images consist of a single channel and can be 8-bit (256 levels) or 16-bit (65,536 levels). Color images consist of three or more grayscale channels that represent color and brightness information. Each channel may be either 8-bit or 16-bit, so a three-channel image forms a 24-bit or 48-bit file. Common color modes are RGB, CMYK, and LAB color.
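The figures in the bit depth and color mode entries follow from simple arithmetic: a channel with n bits can hold 2^n levels, and the bits per pixel are the number of channels times the bits per channel. A short sketch, with hypothetical helper names:

```python
def tonal_levels(bits_per_channel):
    """Number of distinct levels one channel can represent."""
    return 2 ** bits_per_channel


def bits_per_pixel(channels, bits_per_channel):
    """Total bit depth of a pixel across all of its channels."""
    return channels * bits_per_channel


print(tonal_levels(1))        # 2      (1-bit black and white)
print(tonal_levels(8))        # 256    (8-bit grayscale)
print(tonal_levels(16))       # 65536  (16-bit grayscale)
print(bits_per_pixel(3, 8))   # 24     (RGB, 8 bits per channel)
print(bits_per_pixel(3, 16))  # 48     (RGB, 16 bits per channel)
```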
Compression: a process that reduces file size, either by encoding redundant data more compactly (lossless) or by discarding data (lossy).
Dimensions: the size or measurement of an image’s height and width, recorded in inches or centimeters.
File Format: a structure for encoding the information in a data file.
Pixel Array (Dimensions): a measurement of the spatial resolution, or the amount of information in an image file, expressed as the number of pixels along each dimension of the image.
Resolution: a measurement of the spatial resolution, written as pixels per inch or “ppi”. The term “dpi” (dots per inch) refers to printer resolution and is often used interchangeably with ppi.
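Dimensions, resolution, and pixel array are related: the pixel count along each side is the physical length multiplied by the resolution. The sketch below illustrates this. The function names are hypothetical, and the 8.5 x 11 inch page is only an example size, not one taken from this document.

```python
def pixel_array(width_in, height_in, ppi):
    """Pixel array (width, height) for an original of the given size in inches."""
    return round(width_in * ppi), round(height_in * ppi)


def output_size_inches(width_px, height_px, ppi):
    """Physical size in inches at which a pixel array reproduces at a given ppi."""
    return width_px / ppi, height_px / ppi


# An 8.5 x 11 inch page scanned at 400 ppi.
print(pixel_array(8.5, 11, 400))  # (3400, 4400)
```

The same pixel array printed at a lower resolution covers a larger area, which is why the pixel array, rather than ppi alone, describes how much information a file holds.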
Assumptions:
- Files should use color rather than grayscale when color is an integral part of the original object, and any compression applied to the file should be lossless. Document Services uses LZW compression, which is lossless.
- Compression reduces a file’s size; lossy methods do so by discarding information. Best practices suggest saving master digital images uncompressed, or with lossless compression only, in a format such as TIFF.
The table below contains the current standards being used for digitization projects.
Type | Resolution | Bit Depth | Color Mode | Archival File Format |
---|---|---|---|---|
Multi-page text documents B&W | 400 dpi | 1 bit | Bitmap | TIFF |
Multi-page text documents Color | 400 dpi | 24 bit | RGB | TIFF |
Multi-page text documents mixed grayscale color | 400 dpi | 8 bit | Grayscale | TIFF |
Transparencies, slides | 3200 dpi | 24 bit | RGB | TIFF |
Image - photo, illustrations, artwork | 400 dpi | 24 bit | RGB | TIFF |
Maps, oversized items | 400 dpi | 24 bit | RGB | TIFF |
OCA (Internet Archive) projects - multi-page text documents | 300 dpi | 24 bit | RGB | JPEG2000 |
Microfilm, microfiche | | | | |
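To get a feel for the storage these settings imply, the uncompressed size of an image can be estimated from its pixel array and bit depth. This is a rough sketch only: the 8.5 x 11 inch page is an assumed example size, the names are hypothetical, and the estimate ignores file headers, metadata, and any LZW compression.

```python
import math

# (resolution in dpi, bits per pixel) for the three text-document rows in the table.
TEXT_SETTINGS = {
    "B&W": (400, 1),
    "Grayscale": (400, 8),
    "Color": (400, 24),
}


def uncompressed_bytes(width_in, height_in, ppi, bits_per_pixel):
    """Estimate raw image data size, padding each pixel row to a whole byte."""
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    row_bytes = math.ceil(width_px * bits_per_pixel / 8)
    return row_bytes * height_px


for label, (ppi, bpp) in TEXT_SETTINGS.items():
    size = uncompressed_bytes(8.5, 11, ppi, bpp)
    print(f"{label}: {size / 1_000_000:.1f} MB")
# B&W: 1.9 MB
# Grayscale: 15.0 MB
# Color: 44.9 MB
```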
Note regarding Digital Audio:
We currently do not have any standards of practice for digitizing audio, as the Team has yet to take on such a project. When we do begin one, we will reference one of these documents for digital audio standards in the industry: