
ECLAP 2013 Conference Report: Trust and Quality in Cultural Heritage Digital Libraries

Held at ECLAP 2013, the 2nd International Conference on Information Technologies for Performing Arts, Media Access and Entertainment


Thanks to Jana Renée Wilcoxen, who prepared this report.
Jana Renée Wilcoxen is an ECLAP WG-C member (MUZEUM, Ljubljana, SI) and co-chair of the WG-C session.


The PDF of the report is available here:

The ECLAP 2013 International Conference in Porto (PT) provided a context for the ECLAP project’s Working Group-C to hold an afternoon session on Quality and Trust in Digital Archives, in order to share experiences and recommended practices for managing performing arts digital assets and collections. To explore these issues we turned to the related fields of scientific and archaeological archives, as well as to other digital libraries and repositories. As we have seen thus far, the performing arts field can borrow many insightful ideas from other areas, which tend to have more experience and more resources available for investing in the development of new tools. The ECLAP Conference marked a rare opportunity for researchers and developers from so many different fields to gather and share firsthand the experiences from their domains.

So what did we learn?

Glancing at the keywords from the abstracts, we see that they all refer in some way to an aspect of metadata: Authenticity, Persistent Identifiers, Consistency, Accuracy, Completeness, Provenance, Integrity, Preservation, Interoperability, XML, OPM, PREMIS, Digital Library, Digital Curation, OAIS, METS, MODS, LIDO, GAMA, Contemporary Artworks Description.
The keywords from these sessions tell us that metadata is a very important component of a digital archive; we could say it is nearly as important as the archival material or content item itself. Without properly organised, maintained and accessible metadata, no user can access the wealth of an archive’s holdings. Treasures will remain out of reach. Data will take up server space, but it will not be illuminated on computer screens. Our shared cultural heritage will be reduced to a pile of indiscernible bits and bytes.

How do we know that we have quality metadata in our archive?
How can a user know that she/he can trust not only the search findings but also the metadata itself and the authenticity of what she/he is researching or retrieving?
How can we be sure that our work will withstand changes in hardware, software, working environments and knowledge bases?

We must always keep in mind that the end user is essential in the equation. Tools are an extension of our thought processes; a tool must always fulfil a purpose that is deemed useful. The more people a tool can serve, across more contexts, the better; this is precisely what interoperability enables. In short, the right tool equals success.

 

 

TALKS:


Preserving Authenticity Evidence to Assess Provenance and Integrity of Digital Resources



Our first speaker was LUIGI BRIGUGLIO, who uses his extensive information engineering background as leader of the Cultural Heritage R&D activities at Engineering Ingegneria Informatica SPA to assist in the software architecture, modelling and development activities of the SCIDIP-ES project[1]. His paper shared insights from the combined work of the SCIDIP-ES and APARSEN[2] projects, the latter of which proposes a methodology for managing the authenticity of digital resources. During their lifecycle, digital resources, notably digital representations of artistic works, can go through changes of custody, format migrations and other transformations of their representation. These changes may pose a threat to the integrity of their intellectual content and make it difficult to trace their provenance. His paper thus addressed the crucial problem of gathering and preserving the evidence that would allow later users to properly assess the authenticity, provenance and integrity of these resources. To preserve the authenticity evidence, the authors have proposed a solution using the Open Provenance Model (OPM) and defined a set of special XML-based structures, which are compliant with the PREMIS Data Dictionary[3] and hence guarantee a sound basis for interoperability among different repositories. SCIDIP-ES will thus soon launch an Authenticity Toolkit that archives can use to trace all the events in the “lives” of digital assets, so that end users can be assured of the integrity of the content they see on their screens.
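
To make the idea concrete, the following minimal sketch (it is not the SCIDIP-ES Authenticity Toolkit itself) shows how a single format-migration event could be recorded as a PREMIS-style XML structure, the kind of authenticity evidence the paper proposes to preserve. The element names follow the PREMIS 2.x event entity; all identifier and detail values are invented for illustration.

```python
# Minimal sketch: record one preservation event as a PREMIS-style XML element.
# Not the SCIDIP-ES toolkit; identifiers and detail strings are invented.
import datetime
import xml.etree.ElementTree as ET

PREMIS_NS = "info:lc/xmlns/premis-v2"
ET.register_namespace("premis", PREMIS_NS)


def q(tag):
    """Qualify a tag name with the PREMIS namespace."""
    return f"{{{PREMIS_NS}}}{tag}"


def premis_event(event_type, detail, object_identifier):
    """Build a PREMIS event element and link it to the affected digital object."""
    event = ET.Element(q("event"))

    ident = ET.SubElement(event, q("eventIdentifier"))
    ET.SubElement(ident, q("eventIdentifierType")).text = "local"
    ET.SubElement(ident, q("eventIdentifierValue")).text = "event-0001"  # invented

    ET.SubElement(event, q("eventType")).text = event_type
    ET.SubElement(event, q("eventDateTime")).text = datetime.datetime.now(
        datetime.timezone.utc
    ).isoformat()
    ET.SubElement(event, q("eventDetail")).text = detail

    link = ET.SubElement(event, q("linkingObjectIdentifier"))
    ET.SubElement(link, q("linkingObjectIdentifierType")).text = "local"
    ET.SubElement(link, q("linkingObjectIdentifierValue")).text = object_identifier

    return event


# Example: document the migration of a performance video between formats.
record = premis_event(
    event_type="migration",
    detail="Transcoded master video from MPEG-2 to MPEG-4/H.264",
    object_identifier="performance-video-042",  # invented object identifier
)
print(ET.tostring(record, encoding="unicode"))
```

A chain of such event records, kept alongside the object, is what lets a later user judge whether the content on their screen is still what the archive originally ingested.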



[1] SCIDIP-ES: Science Data Infrastructure for Preservation – Earth Science, http://www.scidip-es.eu/.
[2] APARSEN: Alliance Permanent Access to the Records of Science in Europe Network, a project of the Alliance for Permanent Access, http://www.alliancepermanentaccess.org/index.php/aparsen/.
[3] PREMIS Data Dictionary, http://www.loc.gov/standards/premis/.
 

·         Video of presentation http://www.eclap.eu/120718
·         Cross-media documentation of presentation http://www.eclap.eu/120695
·         Slides from presentation http://www.eclap.eu/120671



Applicability of digital library descriptive metadata to the contemporary artworks: the Sapienza Digital Library case study



ANGELA DI IORIO, known for her work as a metadata expert in various projects and capacities, as well as for her active contributions as the Italian PREMIS expert, joined us again at the ECLAP 2013 Conference to share her experiences in coordinating the soon-to-be-launched Sapienza Digital Library (SDL), a joint project of the University of Rome La Sapienza and the CINECA Inter-University Consortium. We must remember that La Sapienza is one of the oldest and largest universities in Europe. Her task involved implementing a metadata model in order to integrate La Sapienza’s production and the assets contained in its 59 libraries and 20 museums into an accessible, manageable, comprehensive online collection. She sought open source solutions, following the OAIS model to ingest 112,620 diverse digital objects into the collection, which comprises everything from books and academic papers to iconography and architecture, a museum of chemistry and even historical census records. In her paper she presents a case study of mapping the content from the university’s Museo Laboratorio di Arte Contemporanea (MLAC).

The model had to provide a common format for presenting the works in the digital library without flattening the materials by overlooking useful information contained in wasted or unimplemented fields. A combined metadata framework using MODS, METS and PREMIS was therefore created. This combination worked well with the majority of print and documentation content, but it was not sufficient for describing museum artefacts, so LIDO was also integrated, using FRBRoo as the joining link. To better understand the types of metadata currently used in describing contemporary artworks, SDL analysed the metadata scheme put forth by the project Digitising Contemporary Art (DCA), which uses a combination of LIDO and the GAMA (Gateway to Archives of Media Art) profile. With these other schemes and metadata descriptors in mind, SDL has built its own metadata framework and tested the mapping of items against them to make sure that there is interoperability and compliance at the European level, in both horizontal (Europeana) and vertical (GAMA) aggregators.
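
As a rough illustration of the general METS/MODS/PREMIS pattern described above (this is not the actual SDL profile; the identifiers and the sample title are invented), the following sketch wraps a minimal MODS descriptive record and a placeholder PREMIS section inside a single METS package:

```python
# Illustrative METS skeleton combining descriptive (MODS) and preservation (PREMIS)
# metadata sections. Not the SDL profile; OBJID, IDs and the title are invented.
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
MODS = "http://www.loc.gov/mods/v3"
for prefix, uri in (("mets", METS), ("mods", MODS)):
    ET.register_namespace(prefix, uri)

package = ET.Element(f"{{{METS}}}mets", OBJID="sdl-object-0001")  # invented OBJID

# Descriptive metadata section: a minimal MODS record with just a title.
dmd = ET.SubElement(package, f"{{{METS}}}dmdSec", ID="DMD1")
wrap = ET.SubElement(dmd, f"{{{METS}}}mdWrap", MDTYPE="MODS")
xml_data = ET.SubElement(wrap, f"{{{METS}}}xmlData")
mods = ET.SubElement(xml_data, f"{{{MODS}}}mods")
title_info = ET.SubElement(mods, f"{{{MODS}}}titleInfo")
ET.SubElement(title_info, f"{{{MODS}}}title").text = "Untitled installation, 1998"

# Administrative metadata section: PREMIS events/objects would be wrapped here,
# e.g. inside a digiprovMD element with MDTYPE="PREMIS".
amd = ET.SubElement(package, f"{{{METS}}}amdSec", ID="AMD1")
digiprov = ET.SubElement(amd, f"{{{METS}}}digiprovMD", ID="DP1")
ET.SubElement(digiprov, f"{{{METS}}}mdWrap", MDTYPE="PREMIS")

# A structural map ties the package together; here a single placeholder division.
struct = ET.SubElement(package, f"{{{METS}}}structMap")
ET.SubElement(struct, f"{{{METS}}}div", LABEL="object")

print(ET.tostring(package, encoding="unicode"))
```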

One important aspect of their findings is that in the DCA application profile and the GAMA format, while it may be possible to describe an Event, it is not clear how to link names with Artworks if the Event entity is not a mandatory field, since the related properties then do not exist; conversely, in the RDA/MODS mapping there is “no way to describe properly an Event”. This finding is particularly important for performing arts digital collections, which typically must describe content that references a particular Event related to a particular Artwork (for example, the opening of a play by a particular director).
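
The sketch below restates this point in a schema-neutral way: in an event-centred model, names reach an Artwork only through an Event record, so if the Event entity is optional or missing from a mapping, the link between person and artwork is lost. The class and field names are purely illustrative and are not taken from LIDO, GAMA or RDA/MODS; the sample names and dates are invented.

```python
# Conceptual illustration only: names are attached to an artwork via an event.
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Artwork:
    identifier: str
    title: str


@dataclass
class Event:
    event_type: str                 # e.g. "premiere", "performance", "exhibition"
    artwork: Artwork
    date: str
    agents: List[Tuple[str, str]] = field(default_factory=list)  # (name, role) pairs

    def names_for(self, role: str) -> List[str]:
        """Return the names linked to the artwork through this event in a given role."""
        return [name for name, r in self.agents if r == role]


# Example: the opening of a play by a particular director (all values invented).
play = Artwork(identifier="artwork-001", title="Hamlet")
premiere = Event(
    event_type="premiere",
    artwork=play,
    date="2013-04-19",
    agents=[("Jane Doe", "director"), ("John Roe", "stage designer")],
)
print(premiere.names_for("director"))   # -> ['Jane Doe']
```

If the Event record is dropped, the director's name has nowhere left to attach to the artwork, which is exactly the mapping gap noted above.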


·         Video of presentation
·         Cross-media documentation of presentation
·         Slides of presentation



Validating the Digital Documentation of Cultural Objects



ACHILLE FELICETTI from the VAST-LAB at the University of Florence in Prato introduced us to the concept of paradata in his paper dealing with the problem of validating visual digital documentation of cultural heritage. The trustworthiness of digital replicas of cultural objects relies on the presence of paradata, that is, data concerning the provenance of the digital document. Here we saw an interesting difference between the cultural heritage and archaeological sectors in the meaning attached to the term provenance: for archaeologists, provenance refers to where the physical object comes from, so when speaking about the origin of digital documents they use the term Digital Provenance. Felicetti considered several cases in his explanation, taken from the recently finished project 3D-COFORM. In the project they examined the creation of digital visual documents and asked whether the software currently used to create such documents also produces the metadata necessary to validate those documents in the future. To support the digital documentation, whether born digital through automatic data acquisition or resulting from post-processing of the captured data, he suggested that it is necessary to collect certain data about how it was produced, hence paradata: namely, the type of machine, the software used, the creation process, the people creating it, the date of creation and the purpose of the object’s creation. He also considered interpretive digital models and presented a tool for annotating digital objects. Hardware manufacturers and software developers should be made aware of these requirements so that, out of the box, their products can provide this important data in the digital documentation creation process. In the meantime, the 3D-COFORM group has extended the guidelines set out by the London Charter (2006) and the CIDOC-CRM ontology and proposed CRMdig, a solution for documenting paradata. To this end, they built a toolkit which allows for capturing this data as well as implementing annotations. Testing of these tools is currently being carried out at the Victoria and Albert Museum in London (UK).
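
As a purely illustrative sketch of the paradata fields listed above (it is not the 3D-COFORM toolkit and does not use CRMdig classes or properties; all values are invented or named only as examples), a capture step might record something like the following alongside each digital document:

```python
# Illustrative paradata record: device, software, process, people, date, purpose.
import json
from dataclasses import dataclass, asdict, field
from typing import List


@dataclass
class Paradata:
    capture_device: str          # type of machine used for acquisition
    software: str                # software (and version) that produced the file
    creation_process: str        # e.g. "laser scan", "photogrammetry", "post-processing"
    creators: List[str] = field(default_factory=list)
    creation_date: str = ""
    purpose: str = ""            # why the digital replica was made


scan = Paradata(
    capture_device="structured-light scanner (invented model for this example)",
    software="MeshLab 1.3",      # a real open-source tool, named only as an example
    creation_process="3D surface scan, post-processed mesh",
    creators=["A. Example"],     # invented name
    creation_date="2013-04-19",
    purpose="condition documentation before restoration",
)

# The paradata travels with the digital document, e.g. embedded in or alongside its metadata.
print(json.dumps(asdict(scan), indent=2))
```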

·         Video of presentation
·         Cross-media documentation of presentation
·         Slides of presentation


Metadata Quality assessment tool for Open Access Cultural Heritage institutional repositories



EMANUELE BELLINI, of the Fondazione Rinascimento Digitale in Florence, reported on his research into the Metadata Quality (MQ) of cultural heritage institutional repositories. He approached the topic of metadata not from the perspective of how to describe content or how to maintain authenticity, integrity and provenance, but from that of quality assurance. He set out to determine a method for measuring the Completeness (whether a field is filled in or empty), Accuracy (whether a field is free of typographical errors) and Consistency (whether a field is free of logical errors) of the existing content in metadata fields, with the goal of improving the performance of OAI-PMH processing and metadata harvesting. After examining other work done in the area of metadata quality, such as the Bruce & Hillmann and Gasser & Stvilia frameworks, he defined a Quality Profile divided into High-Level and Low-Level metrics using Dublin Core metadata. The DC fields were then weighted for their importance and their ability to support the functional requirements of the archive’s designated community.

Next, 1,200 Open Access institutional repositories (OA-IRs) containing over 15 million items, represented by over 100 metadata schemes, were analysed. The proposed prototype was tested in depth on three Open Access institutional repositories of Italian universities: the University of Pisa, Roma Tre University and the University of Turin. Bellini’s research reminded us of the importance of using known standards and semantic vocabularies in the descriptions of our repository items. Without complete, accurate and consistent metadata, users cannot easily and quickly find what they are searching for. Here we should also mention that Consistency is perhaps a relative concept: within a single collection or repository there may be consistency, but as we continue the trend towards aggregated collections such as Europeana, the definition of what is consistent changes; it becomes necessary to achieve consistency across a wider playing field, often across multiple languages, multiple disciplines, and so on. This is why standards are so important to follow. Given the inaccuracies and inconsistencies present in most repositories, we can see the need for an approach to metadata harvesting that provides tools for testing for completeness, checking for accuracy and standardizing for consistency. In free-text fields these issues are amplified.
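
By way of illustration, here is a much-simplified sketch of the kind of checks such a harvesting-time tool could run: a weighted completeness score over Dublin Core fields plus two trivial consistency tests. The field weights, rules and sample record are invented for this example; Bellini’s Quality Profile defines its own metrics and High-/Low-Level structure.

```python
# Simplified metadata-quality checks over a flat Dublin Core record (illustrative only).
import re

# Invented weights reflecting how strongly each DC field supports discovery.
DC_WEIGHTS = {"title": 3, "creator": 3, "date": 2, "language": 2,
              "subject": 2, "description": 1, "identifier": 3}

ISO_639_1 = {"en", "it", "fr", "de", "pt", "sl"}          # tiny sample code list


def completeness(record: dict) -> float:
    """Weighted share of DC fields that are present and non-empty."""
    total = sum(DC_WEIGHTS.values())
    filled = sum(w for f, w in DC_WEIGHTS.items() if record.get(f, "").strip())
    return filled / total


def consistency_issues(record: dict) -> list:
    """Flag simple logical errors, e.g. malformed dates or unknown language codes."""
    issues = []
    if record.get("date") and not re.fullmatch(r"\d{4}(-\d{2}(-\d{2})?)?", record["date"]):
        issues.append("date not in ISO 8601 form (YYYY[-MM[-DD]])")
    if record.get("language") and record["language"].lower() not in ISO_639_1:
        issues.append("language not a known ISO 639-1 code")
    return issues


# Invented sample record with deliberately poor values.
record = {"title": "Hamlet, dress rehearsal video", "creator": "Example Theatre",
          "date": "19-04-2013", "language": "xx"}
print(f"completeness: {completeness(record):.2f}")
print("consistency issues:", consistency_issues(record))
```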


·         Video of presentation
·         Cross-media documentation of presentation
·         Slides

Keynote speaker: DAVID GIARETTA, Alliance for Permanent Access



In addition to the four speakers who presented their latest research on topics related to metadata during the WG-C session on Trust and Quality in Digital Libraries on the first day of the conference, on Day 2 the ECLAP consortium was pleased to welcome David Giaretta as the keynote speaker for the General Track. Known as the grandfather of Digital Preservation, Mr Giaretta shared with us his work at the Alliance for Permanent Access (APA). He began his talk by going over some of the fundamentals of preservation and remarking on the incredible, exponential “rising wave of data” occurring in our time: every two years the amount of data in the world doubles. Acknowledging that this wave is both a challenge and an opportunity, he laid the groundwork for his work at the APA and explained why it is important for us to think long-term about digital preservation methods. He briefly discussed the outcomes of the CASPAR project and the importance of the project’s unique collaboration of institutions from the fields of culture, arts and science.
The project drew on the key threats to digital preservation outlined by the PARSE.Insight[1] project.

By identifying the main threats to digital preservation, the project members were able to put forth the requirements for a solution, testing simulated scenarios of hardware and software changes, environmental changes and changes to the knowledge base. He gave an example of one of the test scenarios using the University of Leeds “test bed”. The University of Leeds is doing extensive work in the development of new technologies for multimodal communication and education in the performing arts (two of its researchers also presented papers at the ECLAP conference). His second case study was on the work of CIANT and their performance GOLEM. Followers of ECLAP might remember Michal Masa’s presentation on the Digital Preservation of a New Media Performance: Web 2.0 and Performance Viewer, given in relation to this project at the ECLAP Workshop held in Brussels in June 2011.[2]
The performing arts research showed that the key issues were Preservation, Usability and also Re-performability, the latter being a requirement unique to the domain of the performing arts. Another key component is the importance of a Designated Community, and of having representatives from that community who can be asked about the quality of the preservation. Other points from his talk included:
·         Validation report
·         Commonalities across disciplines
·         Tools and Services
·         Standards-based Repository Audit and Certification
He highlighted new work being done on Certification (ISO 16363) and Evidence-based Digital Preservation Tools. One of his slides (see the slides link below) gives a good overview of the types of services and toolkits that can be implemented to track the life of a digital object.



[1] PARSE.Insight: Permanent Access to the Records of Science in Europe, http://www.parse-insight.eu/.
[2] See http://bpnet.eclap.eu/drupal/?q=en-US/home&axoid=urn%3Aaxmedis%3A00000%3Aobj%3A221fc426-95c5-4260-8d3b-dd321138fc68&section=metadata


·         Video of presentation
·         Cross-media documentation of presentation
·         Slides

 

CONCLUSIONS:

Can we perhaps start to think of the Top Ten Important Metadata Elements for Performing Arts Digital Collections or any digital collection in general?
Since our era loves acronyms, let’s call them TIME for short. Here’s a sketch of them.

1.      Make sure your metadata is in a widely accepted metadata standard.
a.      We propose Dublin Core (DC) as a starting point, but additional fields are usually necessary, depending on the domain your digital collection represents (a minimal sketch follows this list).

2.      Use accepted ISO and other standards for fields such as Country name, Date, Language.

3.      Use Semantic Web or other standardised Vocabularies to describe your content.

4.      Create – and stick to – a consistent naming convention.

5.      Implement a Persistent Identifier system to eliminate the threat of not being able to find your digital content.

6.      Implement Linked Open Data in your metadata to improve its ability to be reused and repurposed through other APIs.

7.      Utilise an Authenticity Toolkit in the ingestion process to follow the chain of evidence and maintain certainty of Provenance and Authenticity.
a.      Examples of Authenticity Toolkits can be found in the CASPAR and SCIDIP-ES projects.

8.      Compatibility and Interoperability are key components of a robust metadata scheme.

9.      Implement an Orchestration Manager.

10.   Implement a Packaging Toolkit to keep the intended digital use rights intact for the object throughout its life and in its future derivatives.
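
As promised under point 1, here is a minimal sketch of what such a record could look like in practice, touching on points 2, 3, 5 and 6 as well: a simple Dublin Core description with ISO date and language codes, a placeholder controlled-vocabulary URI for the subject and a persistent identifier. All values, including the vocabulary URI and the DOI, are invented for illustration.

```python
# Illustrative Dublin Core record: ISO codes, a vocabulary URI and a persistent identifier.
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC)


def dc_record(fields: dict) -> ET.Element:
    """Wrap simple Dublin Core field/value pairs in an XML fragment."""
    root = ET.Element("record")
    for name, value in fields.items():
        ET.SubElement(root, f"{{{DC}}}{name}").text = value
    return root


record = dc_record({
    "title": "Hamlet - premiere, video documentation",
    "creator": "Example Theatre Company",                        # invented (point 1)
    "date": "2013-04-19",                                        # ISO 8601 date (point 2)
    "language": "pt",                                            # ISO 639-1 code (point 2)
    "subject": "http://vocabulary.example.org/terms/performance",  # placeholder vocabulary URI (points 3, 6)
    "identifier": "https://doi.org/10.0000/example-0001",        # invented DOI as a persistent identifier (point 5)
})
print(ET.tostring(record, encoding="unicode"))
```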




 
