Collecting, Archiving and Exhibiting Digital Design Data Section 5: Appendices
The Art Institute of Chicago
Department of Architecture

Appendix A: CITI Snapshots

Figure 5.1: Screen Captures of CITI (Collection Image Text and Index) Interface
Copyright: The Art Institute of Chicago

Appendix B: CITI Implementation of CDWA

This appendix explains the categories and subcategories of the Categories for the Description of Works of Art (CDWA) metadata schema and their implementation in CITI (Collection Image Text and Index), The Art Institute’s collection management system. A discussion of the CDWA metadata schema and CITI database is included in the Cataloging Digital Design Data chapter.

Table 5.1: CDWA Explanation and CITI Implementation
(Note: Required categories are highlighted with a light gray background; required subcategories are marked with ♦)

Category: Object/Work (Object, Architecture or Group)
Subcategories: Catalog Level; Quantity; Type♦; Components (Quantity; Type♦)
Description: The Catalog Level can indicate a group of objects, such as a suite of drawings or a set of digital design data. The Type would indicate that the documents are Architectural Documents. The Component Type would classify the group of documents: Presentation Drawings, Design/Detail Drawings (for current collection) or Schematic/Conceptual Design, Design Development (for digital collection). A level below the group of documents would be individual document records that would further define type to be images, animations, etc.
Implementation in CITI: Master/Part relates job level to component or document level. Subject Description records quantity. Main Reference Number implies quantity. Object Type and Classification fields indicate type of document. (e.g., Architectural Drawing or Architectural Working Drawing)

Category: Titles or Names
Subcategories: Text♦; Type; Date
Description: Title – Text is the name of an architectural job or project, or the name of an individual component document.
Implementation in CITI: Titles (e.g., Palmer, Potter, Apartment: Working Drawings or 11th Floor Plan)

Category: Creation
Subcategories: Creator♦ (Extent; Qualifier; Identity♦; Role♦; Statement); Date♦ (Earliest Date; Latest Date); Place/Original Location; Commission (Commissioner; Type; Date; Place; Cost); Numbers
Description: Creator – Identity would be the architect’s name, with Role = Architect. Because architects will often have many jobs within a collection, an associated “Authority” called Creator Identification should be created to link into the Creation record of many projects. (Authorities are explained later in this table.) Date represents the beginning and end date of an architectural job.
Implementation in CITI: Artist/Culture/Place Display is a text field with artist name, nationality, life dates, role, places of work and commissioning agent. (e.g., Adler, David, American, 1882–1949.) Artists Index indexes names and roles of artists mentioned in Artist/Culture/Place. (e.g., John Wyatt Gregg; Draftsman) Places field indexes places from Artist/Culture/Place Display. Date Display is a text field that records dates associated with the creation of a work. (e.g., Designed 1929) Dates field is an index for Date Display. Separate authority records are created for artists and places.

Category: Materials and Techniques
Subcategories: Description♦; Extent; Processes or Techniques (Name; Implement); Materials (Role; Name; Color; Source; Marks; Date); Actions
Description: Materials and Techniques – Description describes the medium of the work. Examples would be: Ink, graphite and watercolor on toned paper. Or, for digital data: Photomontage of site photo and rendered building.
Implementation in CITI: Medium Display records a short description of material, support, processes and techniques. (e.g., Black and colored ink on linen.) Physical Description records longer descriptions of materials and techniques used. Media, Process and Technique, and Classification fields can index content of Medium Display and Physical Description.

Category: Measurements
Subcategories: Dimensions♦ (Extent; Type; Value; Unit; Qualifier; Date); Shape; Size; Scale; Format
Description: Drawing Dimensions are noted here, as well as scale. With digital design data, Format describes the data format and can be automatically populated based on file extension (TIF, AVI or DWG). Additional format information will be stored in the Format Registry and can link into this category. Size will be the document’s file size. Resolution should be an additional subcategory for digital images.
Implementation in CITI: Dimensions field records measurements of a work and is a text field. (e.g., approx. 64.2 x 84.6 cm.)

Category: Classification
Subcategories: Term♦
Description: Classification identifies the broad category of the object, e.g., architecture, or can be a department within a museum, e.g., Department of Architecture.
Implementation in CITI: AMICO Object Type records one of 13 object classification terms determined by the AMICO project. Department records the curatorial group responsible for the work. (e.g., Architecture)

Category: Subject Matter
Subcategories: Description (Indexing Terms♦); Identification (Indexing Terms♦); Interpretation (Indexing Terms♦); Interpretive History
Description: Subject Matter Identification might describe the type of building, e.g., Commercial, Residence or Civic Center.
Implementation in CITI: Subject Description records subject matter and is a text field. Subject Index field indexes terms relating to a work.

Category: Current Location
Subcategories: Repository Name♦; Geographic Location♦; Repository Numbers♦
Description: To aid in tracking, the Repository Name and shelf Number should be included, e.g., T 036.12. In the case of digital data, the file and server location would become the Repository Number.
Implementation in CITI: Home Location records a text description of the location of documents. (e.g., Jackson/Peoria Architecture Vault; Parents: Jackson/Peoria Sixth Floor) Location History indexes the locations mentioned in Home Location and includes Move Date and Moved By.

Category: Cataloging History
Subcategories: Cataloger Name; Cataloger Institution; Date
Description: Cataloging History could be combined with Condition/Examination History because these tasks are performed by the same person in the current Dept. of Architecture workflow.
Implementation in CITI: Cataloging History records the cataloger name, date and cataloging notes in a text description. (e.g., Luigi H. Mumford 9/13/01 5:56PM. Core data elements reviewed.)

Category: Condition/Examination History
Subcategories: Description; Type; Date; Agent; Place
Description: Condition assessment is performed twice in the current archiving workflow. An initial condition report is made during preliminary cataloging to send to the Registrar. A second condition report is completed when a full cataloging is made. These Examinations can be recorded in this field. For the digital collection, the archivist would check for file corruption and would perform a checksum operation. An additional subcategory for Checksum should be added.
Implementation in CITI: Condition Description records a text description of document condition. (e.g., Soiled and water stained at right edges of all sheets.) Conditions field indexes Date, Condition Status and Staff Member.

Category: Conservation/Treatment History
Subcategories: Description; Type; Date; Agent; Place
Description: Conservation/Treatment refers to conservation or restoration efforts made to a work of art and is not highly applicable to the current paper collection. For digital data, preservation is important and the preservation strategies of the Preservation Policy Committee should be recorded here. Format migration and other activities should be recorded. Format Registry preservation info may link to this category.
Implementation in CITI: CITI will add the capability to link to conservation documents to track data preservation techniques or additional fields to track dates and types of format migration or translation.

Category: Context
Subcategories: Historical/Cultural (Event Type; Event Name; Date; Place; Agent (Identity; Role); Cost or Value); Architectural (Building/Site Name; Part; Type; Place; Placement; Date)
Description: The Context – Historical/Cultural might be a cultural event such as the Columbian Exposition, a competition, or a building complex of which an individual project is a part.
Implementation in CITI: Historical Context records any didactic commentary relating to a work in a text field.

Category: Copyright/Restrictions
Subcategories: Holder Name; Place; Date; Statement
Description: Copyright information will be important for images to be made available on the Web.
Implementation in CITI: Copyright and rights granted information is recorded in the image metadata record.

Category: Critical Responses
Subcategories: Comment; Document Type; Author; Date; Circumstance
Description: Critical Responses of art historians and critics or the general public to exhibitions can be cataloged here.
Implementation in CITI: Historical Context records critical responses. Critical commentary could also be linked documents.

Category: Descriptive Note
Subcategories: Text
Description: Additional information relevant to a specific drawing can be added as a Descriptive Note.
Implementation in CITI: Physical Description or Subject Description could record additional descriptive notes.

Category: Edition
Subcategories: Number or Name; Impression Number; Size
Description: Current paper drawings tend to be collected in the final edition. Edition could be used to track digital data as it is migrated to new versions or formats.
Implementation in CITI: Medium Display or Catalog Raisonné can record edition information, depending on medium.

Category: Exhibition/Loan History
Subcategories: Title or Name; Curator; Organizer; Sponsor; Venue (Name; Place; Type; Dates); Object Number
Description: Exhibition and Loan History would follow the current Department of Architecture format.
Implementation in CITI: Exhibition History

Category: Facture
Subcategories: Description
Description: Facture – Description describes the method in which a work was created. It could be used to describe architectural process in terms of sequence of software operations or digital tool use.
Implementation in CITI: N/A

Category: Inscriptions/Marks
Subcategories: Transcription or Description; Type; Author; Location; Typeface/Letterform; Date
Description: Notes made by the architect should be noted here. Electronic markups are a digital example.
Implementation in CITI: Inscriptions

Category: Orientation/Arrangement
Subcategories: Description; Remarks; Citations
Description: For architecture, Arrangement might apply to a bound series of sketches or the flow of a PowerPoint presentation.
Implementation in CITI: N/A

Category: Ownership/Collecting History
Subcategories: Description; Transfer Mode; Cost or Value; Legal Status; Owner; Role; Place; Dates; Owner’s Numbers; Credit Line
Description: This category tracks the provenance (or history of owners of the collection) and museum accessioning. This would include the temporary RX number given by the Registrar and the accession number once the objects are accepted into the collection after committee approval.
Implementation in CITI: Reference Numbers record the accession number and permanent receipt (RX) number. (e.g., Accession No: 1989.682.1-7; Permanent Receipt (RX) No.: RX17685/168.1-3) Provenance Text and Provenance Index record previous owner information. The acquisition information record includes Credit Line. (e.g., Gift of Bowen Blair, executor of estate of William McCormick Blair.) Committees field records meeting types and dates that approved documents to enter the collection. (e.g., Architecture, 04/27/1989; Board of Trustees, 06/12/1989) Acquisition Agents field records indexed names and roles. (e.g., Bowen Blair, Executor; William McCormick Blair Estate, Donor.)

Category: Physical Description
Subcategories: Physical Appearance; Indexing Terms
Description: Physical Appearance can be incorporated into the Materials and Techniques category as done in CITI. May not apply to digital data.
Implementation in CITI: Physical Description field records a long version of materials and techniques that is stated briefly in Medium Display. Indexed Terms record terms about physical description with fields such as Media, Process and Technique, and Classification to categorize term type. (e.g., Term: Architectural working drawing, Term Type: Classifications; Term: Graphite, Term Type: Media)

Category: Related Works
Subcategories: Relationship Type; Relationship Number; Identification (Creator (Qualifier; Identity; Role); Titles or Names; Creation Date (Earliest Date; Latest Date); Repository Name; Geographic Location; Repository Numbers; Object/Work Type)
Description: The Related Works category describes the link between the group record and the item record, e.g., is larger context for, or can describe an intellectual link to any other work.
Implementation in CITI: Master/Part field links job records to component records.

Category: Related Textual References
Subcategories: Identification; Type; Work Cited; Work Illustrated; Object/Work Number
Description: Related Textual References might be housed in the museum Library archives. This would provide a direct link to the reference or could provide a link to the relevant Finding Aid.
Implementation in CITI: Publications and Catalog Raisonné fields record related text references.

Category: Related Visual Documentation
Subcategories: Relationship Type; Image Type; Image Measurements (Value; Unit); Image Format; Image Date; Image Color; Image View (Indexing Terms); Image Ownership (Owner’s Name; Owner’s Numbers); Image Source (Name; Number); Copyright/Restrictions
Description: For the digital collection, the digital data, including visual documentation, will be cataloged in groups and can be viewed by following a link to the digital object. Thus, Related Visual Documentation will already be addressed in the cataloging strategy. These image-related fields might be integrated into the item-level catalog record for a digital object.
Implementation in CITI: Images field records and presents digital image or version of work. (e.g., (.1)1 11th Floor Plan E38400)

Category: State
Subcategories: Identification
Description: State may distinguish an early set of drawings from a later set.
Implementation in CITI: Medium Display or Catalog Raisonné field records document’s state.

Category: Styles/Periods/Groups/Movements
Subcategories: Description; Indexing Terms
Description: The architectural style or movement should be included as Indexing Terms for cross-referencing architectural jobs, e.g., Mid-American Classicism or Bauhaus.
Implementation in CITI: Subject Description field records style as a text description. Style index field records style terms.

Authorities

Category: Creator Identification
Subcategories: Name♦; Variant Names; Dates/Locations♦ (Birth Date♦; Death Date♦; Earliest Active Date; Latest Active Date; Place of Birth; Place of Death; Places of Activity); Nationality/Culture/Race♦ (Nationality/Citizenship; Culture; Race/Ethnicity); Gender; Life Roles♦; Related People (Relationship; Name)
Description: The Creator Identification Authority would be created for a particular architect or draftsman and could be linked to his numerous jobs or projects to avoid re-entering the data.
Implementation in CITI: CITI has an Agent entity with multiple-name variants and biographic information about artists and architects. The Agent record is linked to the Object record through Role (such as Architect or Draftsman). Therefore, an Agent could be linked to many projects with a different role for each project.

Category: Generic Concept Identification
Subcategories: Term♦; Variant Terms; Dates (Earliest Date; Latest Date); Related Generic Concepts (Relationship Type; Name/Term)
Description: The Generic Concept Identification Authority defines concepts related to the type of object, its material, activities associated with it, its style, other attributes, or the role of the artist or place.
Implementation in CITI: N/A

Category: Place/Location Identification
Subcategories: Place Name♦; Variant Place Names; Dates (Earliest Date; Latest Date); Coordinates; Place Types♦; Related Types♦; Related Places (Relationship Type; Name)
Description: The Place/Location Identification Authority could relate jobs that fall in the Chicago area.
Implementation in CITI: CITI has a Place entity with multiple-name variant capabilities and hierarchical structure.

Category: Subject Identification
Subcategories: Subject Name; Variant Subject Names; Dates (Earliest Date; Latest Date); Indexing Terms; Related Subjects (Relationship Type; Name)
Description: The Subject Identification Authority could be used in the same manner as Generic Concept Identification.
Implementation in CITI: N/A

Appendix C: Additional Metadata Initiatives

This appendix records additional metadata initiatives that are not discussed in the Cataloging Digital Design Data chapter in Section 2: Archiving Digital Design Data: Practices and Technology.

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) – http://www.openarchives.org/OAI/openarchivesprotocol.html

The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), which was proposed by the Open Archives Initiative (OAI), provides a way to make the resources and related metadata included in museum archives “open” and available over the Web. Here, “open” does not mean “free” or “unlimited access,” but rather possessing a common machine interface that facilitates the availability of content from a variety of digital content providers.

The OAI grew out of a meeting in 1999 in Santa Fe, New Mexico. Web-based archives (such as the physics archive run by Paul Ginsparg at Los Alamos National Laboratory) were beginning to make a significant impact on the dissemination of scholarly publications, and there was a desire to make these documents available to as wide an audience as possible. In early 2000, the 1999 meeting resulted in the “Santa Fe Convention,” which outlined the structure and basic metadata of an open digital archive. The Santa Fe Convention has since been superseded by the OAI-PMH.

The OAI develops and promotes standards to facilitate the efficient dissemination of digital content and increase the availability of scholarly communication. At present, the OAI has defined the OAI-PMH, a technical specification for a common means of exposing an archive’s metadata to external searches.
OAI-PMH provides an application-independent interoperability framework based on “metadata harvesting,” that is, the collecting of metadata from a repository. An OAI-PMH-compliant search engine requests information about a resource in a repository and receives an XML-encoded byte stream in response.

Z39.50 – http://lcweb.loc.gov/z3950

Another proposal for opening digital archives to external searches is Z39.50, an international standard (ISO 23950) that defines a protocol for computer-to-computer information retrieval. With Z39.50, a user in one system can search and retrieve information from other computer systems that have implemented Z39.50 without knowing the search syntax used by the other systems. Z39.50 was originally approved by the National Information Standards Organization (NISO) in 1988. It is possible to search the Library of Congress bibliographic file using a Z39.50 client.

Metadata Encoding and Transmission Standard (METS) – www.loc.gov/standards/mets

METS (Metadata Encoding and Transmission Standard) is an XML-based encoding schema for digital library metadata used by the Library of Congress. It does not have its own set of metadata semantics, but instead references existing metadata schemas such as Dublin Core. It encodes descriptive and administrative metadata and is distinctive for its ability to encode structural metadata. With its structural metadata section, METS can be used to represent the relationships and hierarchy between multiple images, such as high-resolution archival images, thumbnails, delivery images at lower resolutions, and images of particular details at higher magnification. METS is significant because it is becoming a standard of exchange between research institutions. Fedora, one of the open-source repositories discussed in Appendix D: Open-Source Repository Software, uses a variation of METS as its Archival Information Package (AIP) format.
In its next release, DSpace plans to use METS as its AIP format as well.

VRA Core 3.0 – www.vraweb.org/vracore3.htm

The VRA Core Categories, Version 3.0 consist of a single element set that can be applied to create records to describe works of visual culture and the images that document them. The VRA Core 3.0 follows the 1:1 principle developed by the Dublin Core community, meaning that only one resource may be described within a single metadata set. It does not provide for hierarchy and object linking, but instead relies on the local database to link metadata element sets.

Art Museum Image Consortium (AMICO) – www.amico.org

The Art Museum Image Consortium (AMICO) Library is an online collection of images, text and multimedia from the collections of institutions such as the Library of Congress, the Getty and The Art Institute of Chicago. Each work in the AMICO Library is documented by a catalog record, an image file and an image metadata record. At the left is a sample AMICO record for the Sarcophagus held by The Art Institute of Chicago. The catalog record is based on the Categories for the Description of Works of Art, discussed above. The metadata record is based on the Dublin Core and adds fields specific to digital images such as file size and compression.

Encoded Archival Description (EAD) – www.loc.gov/ead

Encoded Archival Description (EAD) is a digital finding aid that captures all the information and metadata of an analog finding aid and also provides metadata about the finding aid itself, including its author, language and publication details. This is useful for projects where the original analog finding aid has some historical significance and there is a desire to duplicate it in digital form.

US MARC (MAchine-Readable Cataloging) – www.loc.gov/marc

The Library of Congress is the official depository of United States publications and is a primary source of cataloging records for US and international publications.
To take advantage of the shift to computers for cataloging in the 1960s, the Library of Congress created the LC MARC format, which uses brief numbers, letters, and symbols within the catalog record to identify different types of information. The original LC MARC format evolved into MARC 21 and has become the standard used by most library computer programs. It is less well suited to museum collection information, though the metadata initiatives mentioned above will often have mapping “crosswalks” to and from MARC.

TEI Header – http://www.tei-c.org

The TEI Header is a component of the Text Encoding Initiative Guidelines and is used to document a digital file, whether that file is an encoded text, an image, a digital recording, or a group of any of these. It provides standard bibliographic information about the digital file and its source as well as more specialized metadata to record the details of classification schemes, encoding and sampling systems used, linguistic details, editorial methods, and administrative metadata such as the revision history of the file.

SPECTRUM

SPECTRUM, the UK Museum Documentation Standard, was developed by the Museum Documentation Association (MDA) in Great Britain. It contains procedures for documenting objects and the processes they undergo, as well as ways to document information to support the procedures.

Object ID – www.object-id.com

Object ID is an international standard for describing cultural objects developed as a collaboration of the museum community, police and customs agencies, the art trade, insurance industry, and appraisers of art and antiques. Beginning as an initiative by the J. Paul Getty Trust in 1993, the Object ID project was adopted by the Council for the Prevention of Art Theft (CoPAT), a registered charity in the UK whose mission is crime prevention in the arts fields.
Object ID includes metadata that could be useful for object identification in case of theft, such as distinguishing features or markings, size and weight.

Metadata Vocabulary and Syntax

Art and Architecture Thesaurus (AAT) – www.getty.edu/research/conducting_research/vocabularies/aat

The Getty Museum has created a structured vocabulary, called the Art and Architecture Thesaurus (AAT), that can be used to describe art, architecture, decorative arts, material culture and archival material. This vocabulary creates a set of values that can be entered into fields of metadata schemas such as the Categories for the Description of Works of Art. Terms, related concepts, parent hierarchical information and notes are linked to each concept. For facility in searching, terms can include the plural or singular form of the term, natural or inverted order, spelling variations, various forms of speech and synonyms. The concepts, such as “marble” or “Impressionism,” are further organized into facets such as “Materials” and “Styles and Periods.”

Cataloging Cultural Objects – www.vraweb.org/CCOweb

For syntax decisions for the metadata values, the Visual Resources Association has created a guide called Cataloging Cultural Objects: A Guide to Describing Cultural Works and Their Images. This guide advises an archivist on how to format words in metadata fields. There will often be a “preferred title” for a work of art as well as translations from or into other languages or a description of the piece in several words.

Metadata Harvesting and Accessibility

There is a pressing need for basic standards that will enable digital archives to be accessed at a common level by a widely dispersed audience. According to the Digital Library Federation:

Despite enormous institutional investment in the creation and description of materials of serious interest to research and education, these resources exist in isolated pockets. They are difficult to find and impossible to search across.
Meanwhile, students and faculty are tempted into over-reliance on commercial Internet search engines, despite their limitations and the uneven quality of the materials they include.[1]

Therefore, museums and other institutions cannot rely on the traditional metadata records or on the auto-indexing of HTML text used by Web search engines to make their research resources accessible to the public. In response to this problem, the Digital Library Federation (DLF) has partnered with the Mellon Foundation, and has included CIMI[2], to come up with a solution. One solution involves the Open Archives Initiative, discussed below, that allows institutions to have their metadata harvested by supporting a simple technical protocol. The harvested metadata can be built into Web-accessible information resources using portals and gateways.

[1] Digital Library Federation, A New Approach to Finding Research Materials on the Web, 2000, available from http://www.diglib.org/architectures/vision.html; Internet; accessed 24 September 2003.
[2] Consortium for the Computer Interchange of Museum Information (CIMI) – www.cimi.org. CIMI is an international consortium of cultural heritage institutions and organizations currently researching how to make the traditional “collections” approach to archiving compatible with the Web. CIMI is involved in a metadata harvesting project to this end, described below.

Appendix D: Open-Source Repository Software

There are presently a number of open-source software packages that comply with the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). As of the beginning of this year, available software includes (in alphabetical order) ARNO, CDSware, DSpace, Eprints, Fedora, i-TOR, and MyCoRe. The following discussion makes direct use of information presented in the Open Society Institute’s A Guide to Institutional Repository Software[1], as well as information from each individual software system’s Web site.
ARNO – http://www.uba.uva.nl/arno

The ARNO system (Academic Research in the Netherlands Online) was released for public use in December 2003. ARNO has different design goals from the other repository systems in that it was designed to provide a flexible tool for creating, managing, and exposing OAI-compliant archives and repositories. The system supports the centralized creation and administration of repository content, as well as end-user submission. The OAI-PMH module is not limited to presenting metadata in the standard Dublin Core format, but offers a transformation engine that, based on the internal ARNO XML structures and XSLT style sheets, is able to produce any format. ARNO does not provide a self-contained, “off-the-shelf” institutional repository system, nor was it intended to provide a full-blown end-user interface with extensive and advanced search capabilities. To offer these services, ARNO implementers need to deploy other, third-party software.

CDSware – http://cdsware.cern.ch

The CERN [Conseil Européen pour la Recherche Nucléaire/European Organisation for Nuclear Research] Document Server Software (CDSware) was developed to support the CERN Document Server. This software supports electronic preprint servers, online library catalogs, and other web-based document depository systems. CERN uses CDSware to manage over 450 collections of data, comprising over 620,000 bibliographic records and 250,000 full-text documents, including preprints, journal articles, books, and photographs. CDSware was built to handle very large repositories holding disparate types of materials, including multimedia content catalogs, museum object descriptions, and confidential and public sets of documents. CDSware complies with the Open Archives Initiative metadata harvesting protocol (OAI-PMH) and uses MARC 21 as its underlying bibliographic standard.
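As an illustration of what OAI-PMH compliance means in practice for systems such as ARNO and CDSware, the following is a minimal sketch of how a harvester might form a request and read the XML it receives. The verb and metadataPrefix parameters and the XML namespaces come from the OAI-PMH 2.0 specification; the endpoint URL, the helper names and the abbreviated sample response are assumptions made for illustration only (a real response also carries elements such as responseDate and request).

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint; each real repository publishes its own base URL.
BASE_URL = "http://repository.example.org/oai"

# Namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def build_request(verb, **params):
    """Return the URL for an OAI-PMH request such as Identify or ListRecords."""
    query = {"verb": verb}
    query.update(params)
    return BASE_URL + "?" + urlencode(query)


def extract_titles(response_xml):
    """Collect the Dublin Core titles found in a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.findall(".//dc:title", NS)]


# Abbreviated, hand-written response used here instead of a network call.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>11th Floor Plan</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

url = build_request("ListRecords", metadataPrefix="oai_dc")
titles = extract_titles(SAMPLE_RESPONSE)
```

Because every compliant repository answers the same small set of verbs, a harvester written this way would only need a different base URL to collect records from another archive.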
DSpace – http://www.dspace.org

DSpace was expressly created as a digital repository to capture the intellectual output of multidisciplinary research organizations. The system is running as a production service at MIT, and a federation comprising large research institutions is in development for adopters worldwide. DSpace integrates a user community orientation into the system’s structure. As the requirements of these communities might vary, DSpace allows the workflow and other policy-related aspects of the system to be customized to serve the content, authorization, and intellectual property issues of each. DSpace also addresses the problem of long-term preservation of deposited research material to maintain its utility for archival time frames.

DSpace, in its basic configuration, assumes that the Submission Information Package (SIP) will be delivered via a Web upload procedure. A unique persistent ID is assigned to each uploaded file. When the digital object enters archival storage, technical metadata about file format are added and quality assurance checks are performed automatically. A preservation status is added manually and the whole package is stored as the AIP.

[1] Accessible from http://www.soros.org/openaccess/software/OSI_Guide_to_Institutional_Repository_Software_v2.htm; accessed 27 February 2004.

In the release of DSpace scheduled for early 2004, METS, a structural metadata package that describes parent-child relationships between multiple files, will become the Archival Information Package (AIP) format. METS (Metadata Encoding and Transmission Standard) is based on XML (eXtensible Markup Language) and is becoming a standard for digital information exchange between research institutions. It is also being used at the Library of Congress. DSpace currently supports an export of digital content and metadata in a simple XML-encoded file format.
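The ingest steps described above for DSpace (a persistent ID for each uploaded file, technical metadata about file format, and an automatic integrity check) can be illustrated with a short sketch. This is not DSpace code: the function names, the record layout, the use of a random UUID as a stand-in identifier, the choice of SHA-256 and the small extension table are all assumptions made for illustration.

```python
import hashlib
import uuid
from pathlib import Path

# Illustrative extension-to-format table (assumed labels); a real
# format registry would hold far more detail per format.
FORMATS = {
    ".tif": "TIFF image",
    ".avi": "AVI video",
    ".dwg": "AutoCAD drawing",
}


def describe_file(path):
    """Build a minimal technical-metadata record for one submitted file."""
    path = Path(path)
    data = path.read_bytes()
    return {
        # Stand-in for a persistent identifier issued by the repository.
        "persistent_id": uuid.uuid4().hex,
        "file_name": path.name,
        # Format is looked up from the file extension, ignoring case.
        "format": FORMATS.get(path.suffix.lower(), "unknown"),
        "size_bytes": len(data),
        # Checksum stored at ingest so later corruption can be detected.
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def is_intact(path, record):
    """Re-compute the checksum and compare it with the stored value."""
    current = hashlib.sha256(Path(path).read_bytes()).hexdigest()
    return current == record["sha256"]
```

An archivist's routine check would then amount to calling is_intact for each stored file and flagging any that return False for review.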
DSpace has Web-based search and retrieval capabilities, but no explicit dissemination strategy beyond delivery of the file itself. No viewing strategies are included in dissemination. DSpace is designed as a federated model, the DSpace Federation, of many research institutions or museums and could be used to search across many archives at once. This cross-archive search capability is furthered by DSpace’s support of the Open Archives Initiative protocol for metadata harvesting.

Eprints – http://software.eprints.org

According to the Open Society Institute’s “A Guide to Institutional Repository Software,” Eprints has the largest installed base (at least 120 installations) of the seven software systems discussed here. Developed at the University of Southampton (and now supported in part by the U.S. National Science Foundation), the first version of the system was publicly released in late 2000. The number of Eprints installations that have augmented the system’s baseline capabilities (for example, by integrating advanced search, extended metadata and other features) indicates that the system can be readily modified to meet local requirements. It is possible to add new tools and scripts using the modules provided.

Eprints can store documents in any format that the archive administrator wishes to accept. Documents can be placed in a configurable, extendible subject hierarchy, which can be used to view and search the archive. Each individual document can be stored in more than one document format. The archive can also use any metadata schema (in addition to a core set required by the software); the administrator decides what metadata fields to hold about each document and how these metadata fields should be projected into the Open Archives world. Authors can also have associated metadata. Data integrity checks are performed automatically without the need for administrator intervention.
Some "stub" routines allow individual sites to add their own integrity checks if they desire.

Fedora –– http://www.fedora.info

The Fedora digital object repository management system is based on the Flexible Extensible Digital Object and Repository Architecture (Fedora). Developed by the University of Virginia Library and Cornell University, Fedora is designed to be a foundation upon which full-featured institutional repositories and other web-based digital libraries can be built. The units of content in Fedora are called “data objects” and include the digital file, metadata about the file and links to software tools and services for data delivery.

In Fedora, the SIPs are considered to be the digital files submitted by the design office before they enter the system. Descriptive and administrative metadata can be entered in a Dublin Core metadata record. To enter archival storage, digital files along with their descriptive and administrative metadata must be fed into an XML document using an XML editor or programmed routine.

The system’s interface comprises three web-based services: a management API that defines an interface for administering the repository, including operations necessary for clients to create and maintain digital objects; an access API that facilitates the discovery and dissemination of objects in the repository; and a streamlined version of the access system implemented as an HTTP-enabled web service. A Fedora Java application provides an administrator graphical user interface (GUI) to create XML documents from the SIP.

Fedora uses a variation of METS with additional requirements and enables the creation of structural parent-child relationships between a project and the documents that comprise it. For example, a rendering with a high-resolution TIFF image, a low-resolution JPG image and a thumbnail image could be represented as one XML file with a structured relationship. A persistent ID would be assigned only at the parent object level.
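The rendering example above can be sketched as a small XML document built in Python. The element and attribute names (`digitalObject`, `file`, `role`, `pid`) and the identifier `demo:1001` are hypothetical and chosen only for illustration; they are not the METS schema or Fedora's actual object format. The point is the shape: one parent carrying the persistent ID, with three child files beneath it.

```python
import xml.etree.ElementTree as ET


def build_object(pid, label, files):
    """Create one parent element with a child element per file.

    Only the parent carries a persistent ID; the children are
    identified by their role within the parent.
    """
    parent = ET.Element("digitalObject", {"pid": pid, "label": label})
    for role, href, mime_type in files:
        ET.SubElement(
            parent, "file",
            {"role": role, "href": href, "mimeType": mime_type},
        )
    return parent


rendering = build_object(
    "demo:1001",  # hypothetical persistent ID
    "Lobby rendering",
    [
        ("master", "lobby.tif", "image/tiff"),
        ("access", "lobby.jpg", "image/jpeg"),
        ("thumbnail", "lobby_thumb.jpg", "image/jpeg"),
    ],
)

xml_text = ET.tostring(rendering, encoding="unicode")
print(xml_text)
```

A programmed routine of this kind is one way to produce the XML document that archival storage requires, as an alternative to writing it by hand in an XML editor.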
The Fedora data objects can be searched through a Web interface. The DIP is well thought-out: Fedora devotes attention to the way a repository can deliver a wide range of media types in their native format. Each data object is assigned a disseminator that links out to tools and services for accessing the object. This sort of dissemination strategy could prove useful if 3D viewers were chosen to access native formats.

i-TOR –– http://www.i-tor.org/en/toon

i-Tor (Tools and technologies for Open Repositories) was developed by the Innovative Technology-Applied (ITA) section of the Netherlands Institute for Scientific Information Services (Dutch acronym: NIWI). NIWI calls i-TOR “a web technology by which various types of information can be presented through a web interface,” irrespective of where the data is stored or the format in which it is stored. It enables the creation of websites that can access information from a database, an Open Archive, or some other file.

i-Tor aims to implement a “data independent” repository, where the content and the user interface function as two independent parts of the system. Data from various existing sources can be merged with data entered by users, all of which can be searched and browsed in full. Content can be searched automatically with Internet search engines (e.g., Google). All content, including information from databases, can be retrieved. i-Tor’s design might make it an appropriate choice for an institution that wishes to impose a repository on top of an existing set of disparate digital repositories.

MyCoRe –– http://www.mycore.de/engl/index.html

MyCoRe grew out of the MILESS Project of the University of Essen. In contrast to MILESS, which provided a hard-coded Qualified Dublin Core data model, the MyCoRe data model is completely configurable.
The MyCoRe system provides a core bundle of software tools to support digital libraries and archiving solutions (or Content Repositories, thus “CoRe”). The bundle is designed to be configurable and adaptable to local requirements (hence, the “My”), without the need for local programming efforts. The core contains the functionality that would be required in a repository implementation, including distributed search over geographically dispersed repositories, OAI functionality, audio/video streaming support, file management, online metadata editors, etc. MyCoRe is not hard-coded to a specific underlying database. Rather, a persistence layer interface is provided, together with implementations for different databases.

Appendix E: Global Digital Format Registry

There is an initiative at Harvard University and MIT, with funding from the Digital Library Federation and participation from the Library of Congress and the National Archives and Records Administration, to create a Global Digital Format Registry (GDFR). The GDFR is a project to create a single, universal format registry to serve multiple repository systems.
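To make the idea of a shared format registry more concrete, the following Python sketch shows how a repository might identify a file by matching its leading bytes against signatures held in a registry. It is a minimal sketch under stated assumptions, not the GDFR design: the class names, the `example/...` identifiers and the in-memory list are hypothetical, although the byte signatures shown for PDF and TIFF are the well-known ones.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class InternalSignature:
    """Bytes expected at a fixed offset inside a file."""
    value: bytes
    offset: int = 0


@dataclass(frozen=True)
class FormatEntry:
    """One registry record; identifiers here are made up for the example."""
    identifier: str
    description: str
    signatures: Tuple[InternalSignature, ...]


REGISTRY = (
    FormatEntry(
        "example/pdf", "Portable Document Format",
        (InternalSignature(b"%PDF-"),),
    ),
    FormatEntry(
        "example/tiff", "Tagged Image File Format",
        # Little-endian and big-endian byte orders use different headers.
        (InternalSignature(b"II*\x00"), InternalSignature(b"MM\x00*")),
    ),
)


def identify(data: bytes,
             registry: Sequence[FormatEntry] = REGISTRY) -> Optional[FormatEntry]:
    """Return the first registry entry whose signature matches the data."""
    for entry in registry:
        for sig in entry.signatures:
            end = sig.offset + len(sig.value)
            if data[sig.offset:end] == sig.value:
                return entry
    return None


match = identify(b"%PDF-1.5\n...")
print(match.identifier if match else "unknown format")
```

Because every repository would consult the same records, a format identified once could be described, validated and migrated consistently across institutions.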
The GDFR has developed an extensive and comprehensive listing of information to be maintained about each data format, described in the tables below:

Table 5.2: Format Properties for a Format Registry1
(Obligation: M = mandatory; MA = mandatory if applicable; O = optional. R marks a repeatable property and is shown only where the source ties it to a specific property.)

1 Format Registry Data Model, Harvard University Library, 22 December 2003, available from http://hul.harvard.edu/gdfr/DataModel_v3.doc; Internet; accessed 1 June 2004.

  Property          Type            Obl.  Description

Access
  Type              Enumeration     M     Access type:
      Escrow: Inaccessible copy on file
      License: Access by license only
      On-site: On-site access only
      Public: Unrestricted access
      Restricted: No access
      Other: Requires informative note
  Start             Date            O     Starting date
  End               Date            O     Ending date
  Note              String          MA    Informative note
  LastModified      Date            M     Modification date/timestamp

Agent
  Name              String          M     Personal or corporate name of agent
  Type              Enumeration     M     Agent type:
      Commercial: Commercial (for-profit) entity
      Government: Governmental agency
      Education: Educational institution
      Non-profit: Non-profit entity
      Professional: Professional organization
      Standard: Accredited standards body
      Trade: Trade association
      Other: Requires informative note
  Address           String          O     Postal address
  Telephone         Telephone       O     Telephone number
  Fax               Telephone       O     Facsimile number
  Email             Email           O     Email address
  Web               URI             O     Web site
  Note              String          MA    Informative note
  LastModified      Date            M     Modification date/timestamp

Application
  Name              String          M     Application name
  Version           String          M     Version identifier
  Release           Date            M     Release date
  Vendor            Agent           O     Vendor
  Process           Process         O     Process
  HWDependency      Platform        O     Hardware dependency
  SWDependency      Application     O     Software dependency
  Note              String          O     Informative note
  LastModified      Date            M     Modification date/timestamp

Authority
  Agent             Agent           M     Authority agent
  Start             Date            MA    Starting date of effective authority
  End               Date            MA    Ending date of effective authority
  Note              String          O     Informative note
  LastModified      Date            M     Modification date/timestamp

Class
  Identifier        Cognomen        M     Class identifier
  Description       String          M     Description
  Note              String          O     Informative note
  LastModified      Date            M     Modification date/timestamp

Cognomen
  Value             String          M     Cognomen value
  Type              Enumeration     M     Cognomen type:
      AFNOR: AFNOR standard
      ANSI: ANSI standard
      ARK: CDL Archival Resource Key
      BSI: BSI standard
      CCITT: CCITT standard
      DDC: Dewey Decimal Classification
      DOI: Digital Object Identifier
      ECMA: ECMA standard
      GDFRClass: GDFR classification identifier
      GDFRFormat: GDFR format identifier
      GDFRRegistry: GDFR registry identifier
      Handle: CNRI handle
      Informal: No defined syntax or embedded semantics
      ISO: ISO standard
      ISBN: International Standard Book Number
      ISSN: International Standard Serial Number
      ITU: ITU recommendation
      JEITA: JEITA standard
      LCC: Library of Congress Classification
      LCCN: Library of Congress Control Number
      MIME: MIME media type [MIME]
      NISO: NISO standard
      PII: Publisher's Item Identification [PII]
      PURL: Persistent URL
      RFC: IETF Request for Comment
      SICI: Serial Item and Contribution Identifier [SICI]
      TOM: Typed Object Model identifier
      UUID/GUID: Universally/globally-unique Identifier [UUID]
      URI: Uniform Resource Identifier [URI]
      URL: Uniform Resource Locator
      URN: Uniform Resource Name [URN]
      Other: Requires informative note
  Note              String          MA    Informative note
  LastModified      Date            M     Modification date/timestamp

Document
  Title             String          M     Document title
  Type              Enumeration     M     Document type:
      Article
      Correspondence
      Manual
      Monograph
      Report
      Standard
      Thesis
      Other: Requires informative note
  Author            Agent           O     Author
  Edition           String          O     Edition
  Publisher         Agent           O     Publisher
  Date              Date            O     Publication date
  Accessibility     Access          M     Access regime
  Identifier        Cognomen        O     Identifier
  Note              String          MA    Informative note
  LastModified      Date            M     Modification date/timestamp

Event
  Agent             Agent           M     Agent effecting the event
  Type              Enumeration     M     Event type:
      Delete: Deletion of a format
      Initial: Initial registration of a format
      Obsolescence: Declaration of format obsolescence
      Update: Update format representation information
      Other: Requires informative note
  Scope             Enumeration     M     Scope of the event:
      Editorial: Non-substantive editorial change
      Technical: Substantive technical change
  Review            Enumeration     M     Review type:
      Full: Full technical review
      Partial: Requires informative note
      None: No review
  Date              Date            M     Date/timestamp
  Note              String          O     Informative note
  LastModified      Date            M     Modification date/timestamp

Interface
  Protocol          Enumeration     M     Interface protocol:
      HTTP
      .NET
      RMI: Remote method invocation
      SOAP: Web Service
      Other: Requires informative note
  Connection        String          MA    Protocol-specific connection parameters
  Note              String          O     Informative note
  LastModified      Date            M     Modification date/timestamp

Ontology
  Class             Class           M     Ontological class
  Note              String          O     Informative note
  LastModified      Date            M     Modification date/timestamp

Platform
  Name              String          M     Platform name
  Version           String          M     Version identifier
  Release           Date            M     Release date
  Vendor            Agent           O     Vendor
  Note              String          O     Informative note
  LastModified      Date            M     Modification date/timestamp

Process
  Type              Enumeration     M     Process type:
      Create: Create new instantiation of formatted object
      Render: Media type-specific rendering of formatted object
      TransformFrom: Requires source auxiliary format
      TransformTo: Requires target auxiliary format
      Validate: Validation of formatted object
      Other: Requires informative note
  Auxiliary         Cognomen        MA    Source or target format of transformation
  Note              String          O     Informative note
  LastModified      Date            M     Modification date/timestamp

Registry
  Identifier        Cognomen        M     Registry identifier
  Service           Service         M     Supported GDFR service
  LastHarvestedBy   Date            O     Date/timestamp of last harvest by this registry
  LastHarvest       Date            O     Date/timestamp of last harvest of this registry
  Note              String          O     Informative note
  LastModified      Date            M     Modification date/timestamp

Relation
  Identifier        Cognomen        M     Target format identifier
  Registry          Cognomen        O     Target registry identifier
  Note              String          O     Informative note
  LastModified      Date            M     Modification date/timestamp

Service
  Type              Enumeration     M     Service type:
      Approval: Technical review
      Description: Query for specific format
      Export: Bulk export of registry data
      Introspection: Information about registry instance
      Maintenance: Maintain format representation information
      Notification
      Synchronization: Distributed synchronization
  Interface         Interface       M     Service interface
  Note              String          O     Informative note
  LastModified      Date            M     Modification date/timestamp

Signature
  Value             ByteStream      M     Signature value
  Obligation        Enumeration     M     Signature obligation:
      Mandatory
      MandatoryIfApplicable: Requires informative note
      Optional
  Note              String          MA    Informative note
  LastModified      Date            M     Modification date/timestamp

Table 5.3: Derived Properties (Derived properties inherit all of the attributes of their parent.)

  Property          Type            Obl.  Description

ExternalSignature IS-A Signature
  Type              Enumeration     M     External signature type:
      Extension: File extension
      Type: Mac OS data type
      Other: Requires informative note

FormatRelation IS-A Relation
  Type              Enumeration     M     Format relation type:
      EquivalentTo: Equivalent to target
      IsPreviousVersionOf: Previous version of target
      IsSubsequentVersionOf: Subsequent version of target
      IsSubtypeOf: Subtype of target
      IsSupertypeOf: Supertype (parent) of target
      MayContain: May encapsulate target
      UsedBy: May be encapsulated by target
      Other: Requires informative note

InternalSignature IS-A Signature
  Position          Enumeration     M     Signature position:
      Fixed: Fixed position; requires offset
      Arbitrary: Arbitrary position
  Offset            NonNegative     MA    Byte offset

Person IS-A Agent
  Title             String          O     Personal title
  Affiliation       Agent           O     Organizational affiliation

Table 5.4: Registry Properties

  Property          Type            Obl.  Rep.  Description

GDFR IS-A Registry
  Version           String          M           Version identifier for registry code base and data model
  Date              Date            M           Build date for registry code base and data model
  Aegis             Authority       M     R     Responsible authority
  ExternalRegistry  Registry        O     R     Known external registry
  Ontology          Ontology        M           Ontological classification scheme
  Format            Format          O     R     Format representation information

Format Properties

  Property          Type            Obl.  Description

Format
  Identifier        Cognomen        M     Format canonical identifier
  Description       String          M     Short description of format
  Alias             Cognomen        O     Variant identifier
  Version           String          O     Format version identifier
  Author            Agent           O     Author
  Owner             Authority       M     Legal owner
  Maintainer        Authority       O     Maintainer
  Classification    Cognomen        O     Ontological classification
  Relationship      FormatRelation  O     Typed relationship with other format
  Specification     Document        M     Specification document
  Signature         Signature       O     External or internal signature
  Application       Application     O     Application system using format
  Provenance        Event           M     Provenance event
  Note              String          O     Informative note
  LastModified      Date            M     Modification date/timestamp

Appendix F: Adobe PDF Format and Settings

Throughout our
discussions, we have been referring to the capabilities of the Portable Document Format (PDF) version 1.5, Adobe Acrobat 6.0 and Adobe Reader 6.0. Since the PDF specifications are backward-compatible, we are assuming that future versions will be able to read documents created under the current specifications.

Operating System and Version Dependencies

Rich graphic and multimedia content, such as animations, can be embedded in PDF documents, meaning that providing the media as a separate file is unnecessary. The ability to embed content was added in Acrobat 6.0. There is, however, a dependency on the ability of the operating system, or the availability of player software, to access some formats of embedded content. For this reason, we have recommended that the AVI format be used for embedded animations. If other formats are used, separate players might be needed and they might not be available for all operating systems. Likewise, Adobe Reader 6.0 is not currently available for versions of Windows earlier than 98SE, Macintosh operating systems earlier than OS X 10.2.2, or for other operating systems such as Linux. Computers with these operating systems will not be able to play the embedded content.

Acrobat Settings

In order to maintain color fidelity and image resolution when creating PDF documents, it is important to select the correct settings. With Adobe Acrobat, the user can create PDF documents by using the print to PDF function from an application like PowerPoint or by assembling source documents in Acrobat itself. The different methods require different ways of setting preferences. (Please note: These instructions are based on Adobe Acrobat 6.0 Professional and Microsoft Office 2002 applications.)

To print to PDF from an application such as PowerPoint, use File → Print and select Adobe PDF as the printer Name. Click on Properties and under the Adobe PDF Settings tab there will be a pull-down menu for Default Settings.
Select the High Quality option from the pull-down menu. (See Figure 5.2.) This choice of settings will reduce images to 300 dpi and will maintain the embedded color profile in each image. These settings will meet the requirements of the Department of Architecture. (These Default Settings can be edited by clicking the Edit button and changing values under the Images and Color tabs.)

Figure 5.2: Adobe PDF Settings Snapshot

The second way to create a PDF is in Adobe Acrobat by selecting Create PDF → From File or From Multiple Files and then browsing for the desired file or files. Before importing files, it is important to select the following settings. Under the Edit → Preferences menu, select Convert to PDF from the left column. Select TIFF from the next column and click the Edit Settings button. Select lossless compression schemes from all Compression pull-down menus and Preserve embedded profiles from all Color Management pull-down menus. Repeat the process for JPG and other image files to be converted to PDF. (See Figure 5.3.)

Figure 5.3: Adobe Acrobat Preferences Snapshot

Also in the Edit → Preferences dialog box is a Digital Signatures option in the left column. Users can create their own signature, either from a scanned image of a signature or a company logo, and enter company information. To sign a document, go to Document → Digital Signatures → Sign this Document and select a “new invisible signature.” The user will need to add a Digital ID, either a self-signed ID or one provided by a third party. Once an ID is created, the user can select “I am the author of this document” and save. When the PDF document is received by the museum, the archivist will be able to open the document, verify the digital signature using the Signatures tab, and confirm that no modifications have been made to the document since it was signed.
Design firms can go an additional step and certify documents with Adobe, an option that pops up automatically when signing a document.

When printing a PDF with the above settings, it is important to select Printer/PostScript Color Management from the Printer Profile pull-down menu of the Output selection of the Advanced option of the Print dialog box. If the printer has been properly calibrated and an ICC profile has been saved for it, Acrobat will be able to properly map color values from the embedded ICC profiles in the images within the PDF document to the ICC profile of the printer.

To embed media content in a PDF file, select Tools → Advanced Editing → Movie Tool. After you drag a box for the media clip with the crosshairs cursor, the Add Movie dialog will appear. (See Figure 5.4.)

Figure 5.4: Adobe Acrobat Add Movie Snapshot

Under Content Settings, select “Acrobat 6 Compatible Media” and check “Embed content in document.” The Poster Settings selection determines what appears in the media frame before it is selected for playback. If you choose “Retrieve poster from movie,” the opening frame of the animation will be displayed.