IDeA Format Support Policy

IDeA wishes to provide support for as many file formats as possible. Over time, items stored in IDeA will be preserved as is, using a combination of time-honored techniques for data management and best practices for digital preservation. Some specific formats, however, are proprietary in nature and make it impossible to guarantee future usability. Put simply, our policy for file formats is:

  • Everything put in IDeA will be retrievable. “Retrievable” means “the file can be downloaded from IDeA”.
  • IDeA will automatically recognize as many files' formats as possible and assign the appropriate Support category.
  • IDeA will support as many known file formats as possible based on current data management and digital preservation standards and best practices.

When a file is uploaded to IDeA, it is assigned one of the following Support categories:*

  1. Supported
    • IDeA fully supports the format. “Support” means “every reasonable effort will be made to maintain future usability through whatever techniques (such as migration or emulation) are appropriate given the context of need”.
    • For supported formats, IDeA might choose to bulk-transform files from a current format version to a future one to maintain usability. “Usability” means the file can be opened, viewed and/or executed. IDeA cannot predict which services will be necessary in the future, so it will continually monitor formats and techniques to ensure that it can accommodate needs as they arise.
  2. Known
    • IDeA can recognize the format, but cannot guarantee full support as previously defined.
  3. Unsupported
    • IDeA cannot recognize a format; these will be listed as "application/octet-stream", aka Unknown.

* File formats and their Support category assignments will be reviewed annually. Adjustments may be made based on technical and resource considerations or needs.
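The three categories can be summarized as a small sketch of what each one promises. This is only an illustration of the policy text above, not IDeA's actual implementation; the names `SupportLevel` and `commitments` are hypothetical.

```python
from enum import Enum


class SupportLevel(Enum):
    """The three Support categories named in the policy."""
    SUPPORTED = "supported"
    KNOWN = "known"
    UNSUPPORTED = "unsupported"


def commitments(level: SupportLevel) -> dict:
    """Return what the policy promises for a file at the given level.

    Hypothetical helper: the keys are labels chosen for this sketch.
    """
    return {
        # Every file can be downloaded, whatever its category.
        "retrievable": True,
        # The format is identified for Supported and Known files only.
        "recognized": level is not SupportLevel.UNSUPPORTED,
        # Migration or emulation effort is promised for Supported files only.
        "usability_effort": level is SupportLevel.SUPPORTED,
    }


for level in SupportLevel:
    print(level.value, commitments(level))
```

Read this way, the only difference between Supported and Known is the commitment to future usability. Retrievability applies to every file.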

IDeA can choose to "support" a format if it can gather enough documentation to capture how the format works. The documentation and format information collected include file specifications, descriptions, and samples. IDeA lists format Support category assignments in the IDeA Format Reference Collection below.

Proprietary formats for which these materials are not publicly available (e.g. Microsoft formats) cannot be supported in IDeA. IDeA will preserve these files, maintain “retrievability”, and provide Communities with guidance on converting files into supported formats (e.g. PDF). For extremely popular proprietary formats (such as Microsoft .doc, .xls, and .ppt), IDeA will probably be able to help keep files usable in the future, simply because their prevalence makes it likely that tools will remain available. Even so, IDeA cannot guarantee this level of service, so it will still list these formats as "known", not "supported".

What to do if your format isn't recognized

IDeA understands that there are always more formats to consider, and would appreciate your help in identifying formats you care about and studying their suitability for support. If IDeA can't identify a format, it will be recorded as "unknown", aka "application/octet-stream". IDeA would nevertheless like to keep the percentage of supported-format materials within the repository as high as possible. We encourage Communities to contact us with any questions or concerns.
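A Community that wants to see how much of its material falls into each category could tally the levels of its files. The sketch below assumes the levels are already available as a list of strings. The function name `level_shares` and the sample data are made up for illustration and do not describe any IDeA reporting feature.

```python
from collections import Counter


def level_shares(levels):
    """Return the percentage of files at each support level.

    `levels` is any iterable of level names, one per file.
    An empty input yields an empty dict rather than dividing by zero.
    """
    counts = Counter(levels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {level: 100.0 * n / total for level, n in counts.items()}


# Made-up sample: four files in a hypothetical collection.
sample = ["supported", "supported", "known", "unsupported"]
print(level_shares(sample))
```

A low "supported" share in such a tally is a signal to ask about converting files into a supported format.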

IDeA Format Reference Collection

In the table below, MIME type is the Multipurpose Internet Mail Extensions (MIME) type identifier; for more information on MIME, see the MIME RFCs or the MIME FAQ. Description is what most people use as the name for the format. Extensions are typical file name extensions (the part after the dot, e.g. the extension for "index.html" is "html"). These are not case-sensitive in IDeA, so either "sample.XML" or "sample.xml" will be recognized as XML. Level is IDeA's support level for each format:

MIME type                       Description           Extensions             Level
application/marc                MARC                  marc, mrc              supported
application/pdf                 Adobe PDF             pdf                    supported
application/postscript          PostScript            ps, eps, ai            supported
audio/x-aiff                    AIFF                  aiff, aif, aifc        supported
image/gif                       GIF                   gif                    supported
image/jpeg                      JPEG                  jpeg, jpg              supported
image/png                       PNG                   png                    supported
image/tiff                      TIFF                  tiff, tif              supported
text/html                       HTML                  html, htm              supported
text/plain                      Text                  txt                    supported
text/richtext                   Rich Text Format      rtf                    supported
text/xml                        XML                   xml                    supported
application/mathematica         Mathematica           ma                     known
application/msword              Microsoft Word        doc                    known
application/sgml                SGML                  sgm, sgml              known
application/vnd.ms-excel        Microsoft Excel       xls                    known
application/vnd.ms-powerpoint   Microsoft PowerPoint  ppt                    known
application/vnd.ms-project      Microsoft Project     mpp, mpx, mpd          known
application/vnd.visio           Microsoft Visio       vsd                    known
application/wordperfect5.1      WordPerfect           wpd                    known
application/x-dvi               TeX dvi               dvi                    known
application/x-filemaker         FMP3                  fm                     known
application/x-latex             LaTeX                 latex                  known
application/x-photoshop         Photoshop             psd, pdd               known
application/x-tex               TeX                   tex                    known
audio/basic                     audio/basic           au, snd                known
audio/x-mpeg                    MPEG Audio            mpa, abs, mpeg         known
audio/x-pn-realaudio            RealAudio             ra, ram                known
audio/x-wav                     WAV                   wav                    known
image/x-ms-bmp                  BMP                   bmp                    known
image/x-photo-cd                Photo CD              pcd                    known
video/mpeg                      MPEG                  mpeg, mpg, mpe         known
video/quicktime                 Video Quicktime       mov, qt                known
application/octet-stream        Unknown               (anything not listed)  unsupported
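The lookup described above (case-insensitive extension matching, with anything unlisted falling back to "application/octet-stream") might be sketched as follows. This is an assumption about how such a lookup could work, not IDeA's actual code. `FORMATS`, `UNKNOWN`, and `identify` are hypothetical names, and the dictionary holds only a small excerpt of the table.

```python
import os

# Hypothetical excerpt of the table above: extension -> (MIME type, level).
FORMATS = {
    "pdf": ("application/pdf", "supported"),
    "xml": ("text/xml", "supported"),
    "tif": ("image/tiff", "supported"),
    "tiff": ("image/tiff", "supported"),
    "doc": ("application/msword", "known"),
    "wav": ("audio/x-wav", "known"),
}

# Fallback for anything not listed.
UNKNOWN = ("application/octet-stream", "unsupported")


def identify(filename: str) -> tuple:
    """Return (MIME type, level) based on the file name extension only."""
    ext = os.path.splitext(filename)[1]  # e.g. ".XML", or "" if there is none
    # Drop the leading dot and lowercase so the match is case-insensitive.
    return FORMATS.get(ext.lstrip(".").lower(), UNKNOWN)


print(identify("sample.XML"))
print(identify("sample.xml"))
print(identify("notes.xyz"))
```

Both spellings of the XML file name resolve to the same entry, and the unlisted extension falls through to the Unknown row. Note that the table lists the extension "mpeg" under both audio/x-mpeg and video/mpeg, so an extension-only lookup like this one would need an extra rule to choose between them. The excerpt avoids the question by leaving that extension out.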

This page last modified on 5 June 2007, bpd
Copyright © 2002-2005 The Trustees of Indiana University