DSpace User Group – session descriptions
A Mapping Tool for a Multi-Institutional Metadata Repository
Larry Hansard, Bill Anderson (Georgia Institute of Technology, US)
The Georgia Institute of Technology has collaborated with 9 universities and colleges in Georgia to build a statewide institutional repository (IR) called the GALILEO Knowledge Repository (GKR). The contents of the GKR IR include metadata harvested from other IRs that describes their available documents. The metadata is harvested from local repositories that use DSpace and Digital Commons as their repository software. A mapping tool was created to map the collections from the local IR to collections in the GKR. The mapping tool allows staff from each institution to log in and maintain the mapping data for their institution. The GKR also offers a hosting service for four smaller GKR participating institutions using DSpace.
DSpace Semantic Search v2.0: What’s New and Current Status
Dimitrios Koutsomitropoulos (University of Patras, Greece); Georgia Solomou (University of Patras, Greece); Ricard Borillo Domenech (Universitat Jaume I, Spain)
DSpace Semantic Search v2.0 is the next version of the reasoning-based querying and navigation service for the DSpace digital repository system. Compared to its predecessor, v2.0 has been significantly refactored and comes with an improved interface, additional functionality and a modular, ‘add-on’ architecture. In this paper, we introduce this service to the community by giving an abridged account of new and upcoming features and give pointers for further information and involvement.
Updates on the DSpace Roadmap
Tim Donohue (DuraSpace), Brad McLean (DuraSpace)
This talk will provide high-level updates to the community on the DSpace software platform, highlighting key initiatives and providing an overview of upcoming releases. We will propose a technology roadmap based on these initiatives and based on the development direction(s) of the Committers Team. Topics / initiatives to be covered may include (not an exhaustive list):
* Plans for DSpace 3.0 release (Fall/Winter 2012) and new version numbering scheme
* Significant community developments / deployments
* Review of current DSpace institutional demographics
* Technology updates
- Codebase migration to GitHub (how this may benefit both our development community and individual institutions)
- Achieving a “RESTful” DSpace (REST API, etc.)
- Making DSpace more friendly for mobile devices
- Replication Task Suite (automated backup and restoration for DSpace)
- “Modernizing” the DSpace API
- Encouraging new UI development to occur (SkylightUI, etc.)
- Updates on “DSpace with Fedora Inside”
- Google Summer of Code projects
As time will be limited, we will only be able to touch briefly on each of the above initiatives. The goal of this talk is to make the community aware of these projects and to explain what each may mean for individual institutions.
An Invitation Toward Development: Demystifying Customization of the XMLUI Through Best Practices
Patrick Etienne (Georgia Institute of Technology, US)
DSpace is an example of outstanding open-source software, and a large part of what enables DSpace to thrive is its focus on the community. For software implementations to be successful, user institutions and organizations must be able to personalize their implementations to highlight their individual strengths, and this means customization. However, with software as large and complex as DSpace, and with the great diversity of technical proficiency in the user base, accessible customization becomes an increasingly important priority as the community grows. Currently, customization can be a significant hurdle for uninitiated users seeking to personalize DSpace for their institution or organization. This presentation will seek to demystify interface customizations by discussing development tools, the architecture of the most popular interface, troubleshooting best practices, and a combination of these with regard to specific popular topics.
AgriOcean DSpace: Customized version of DSpace for agricultural and aquatic networks in parallel with developments at Hasselt University
Denys Slipetskyy (Institute of Biology of the Southern Seas, Sevastopol, Ukraine), Dirk Leinders (Hasselt University, Belgium), Marc Goovaerts (Hasselt University, Belgium), Imma Subirats (FAO of the United Nations), Sarah Dister (FAO of the United Nations), Thembani Malapela (FAO of the United Nations), Johannes Keizer (FAO of the United Nations)
In 2010, the United Nations agencies of FAO and UNESCO-IOC announced a joint initiative to provide a customized version of DSpace to promote open access to scientific literature in the field of oceanography, agriculture and related sciences available in digital form; to assure good metadata quality and the use of thesauri and other forms of authority control; and to develop sustainable repositories that are more accessible and visible. The customization is branded AgriOcean DSpace (AOD), and integrates the previous developments of both UN agencies in one customized version of DSpace.
AgriOcean DSpace inherited an upgraded type-based submission module from OceanDocs, a customization of DSpace 1.4.2. Further technical developments include the specific use of authority control for ontologies, author names, journal titles and ISSN; an embargo module at the bitstream level; a batch importer; extended crosswalk functionality; and an easy installer.
AgriOcean DSpace is installed by four organizations and is under consideration by other members of the FAO and IODE communities.
Andalusian Health Repository: promoting the scientific health output among professionals and providing citizens with quality health information
Pilar Toro Sánchez-Blanco (Biblioteca Virtual del Sistema Sanitario Público de Andalucía, ES), Teresa Matamoros Casas (Biblioteca Virtual del Sistema Sanitario Público de Andalucía, ES), Verónica Juan Quilis (Biblioteca Virtual del Sistema Sanitario Público de Andalucía, ES)
The Andalusian Public Health System (Sistema Sanitario Público de Andalucía -SSPA) Repository is the open environment where all the scientific output generated by the SSPA professionals, resulting from their medical care, research and administrative activities, is comprehensively collected and managed. This repository possesses special features which determined its development: the SSPA organization and its purpose as a health institution, the specific sets of documents that it generates and the stakeholders involved in it.
The repository uses DSpace 1.6.2, to which several changes were implemented in order to achieve the SSPA initial goals and requirements. The main changes were: the addition of specific qualifiers to the Metadata Dublin Core scheme, the modification of the submission form, the integration of the MeSH Thesaurus as controlled vocabulary and the optimization of the advanced search tool. Another key point during the setting up of the repository was the initial batch ingest of the documents.
Surfacing Google Analytics in DSpace
Claire Knowles (University of Edinburgh, UK)
Google Analytics is an easy way to gather usage statistics about a repository, providing metrics for Repository Managers through its dashboard. Visibility of all statistics within the dashboard is limited to those with Google accounts who have been given permission to view the site’s usage statistics. The Digital Library at Edinburgh University has been developing a way of exposing this data to its DSpace Repository users on behalf of the Scottish Digital Library Consortium (SDLC) using jQuery and the Google Analytics API.
Putting a face to the service: the people behind making the local global
Amy L. Lana (University of Missouri, US), J. Hardy Pottinger IV (University of Missouri, US)
Once upon a time, in a land far, far from the ocean, there lived a repository called MOspace. Every item in this repository was put there by a living, breathing library staff person, and received full AACR2-equivalent cataloging. This was possible not, as some thought, because a self-aware repository gobbled up scholarship as it hunted for scientific proof of unicorns, nor because metadata fairies visited item records every night. No. The real story is so much stranger…it was people working together to gather content and make it discoverable by the world. This interactive presentation at Open Repositories 2012 DSpace Users Group will highlight how the combination of a focus on the people behind the repository and full service deposit is working to build a culture of open access scholarship, ensuring that our local content has an impact on a global scale. We will also suggest some ideas for speaking about your repository which personalizes the services and discuss some of the challenges, both technical and administrative, for staff as they offer mediated submission in a DSpace repository.
A DSpace Split: Private Repository / Public Story Collection Site
Kathy Reagan (Reagan Marketing + Design, US), Jennifer Henderson (Consultant, US), Bram De Schouwer (@mire, Belgium)
Celebrating a corporation’s centennial anniversary is a major milestone. A very small percentage of publicly held companies make it to this mark. According to research by Vicki TenHaken, Professor of Management at Hope College, Holland, MI, less than .01% of all U.S. businesses (public and private) have survived beyond 100 years. Nearly 40% of these are manufacturing organizations. Steelcase Inc., a global office furniture manufacturer headquartered in Grand Rapids, Michigan, now joins this group of centenarian manufacturers.
Major celebrations and anniversaries are often driving forces for collecting oral histories and other related memorabilia. Oral history projects can be costly to undertake, but even with a limited anniversary budget, Steelcase did not want to ignore this opportunity. As a global company, it is looking to collect oral histories from anyone around the world who has a Steelcase Story to tell.
Two years ago, as part of the preparation of organizing historic records for the anniversary needs, Steelcase invested in a private digital repository. The Archives team suggested using DSpace as the public story collection platform, and after careful evaluation, we were allowed to proceed.
If selected, our presentation will explain to DSpace users how a DSpace instance can serve multiple functions for multiple audiences.
How to make your DSpace repository OpenAIRE compliant
Pedro Principe, Eloy Rodrigues, José Carvalho (University of Minho, Portugal)
The present proposal is for a tutorial session in order to inform repository managers and administrators about the compliancy process with the OpenAIRE guidelines.
The tutorial is directed to repository managers:
- Interested in learning more about making their DSpace repository compliant with the OpenAIRE infrastructure to support and monitor the implementation of the FP7 Open Access pilot;
- Wanting to help their faculty comply with the Open Access requirements of the European Commission (EC);
- Wanting to add EC project data to their repository records and use OpenAIRE value-added functionality (post authoring tools, monitoring tools through analysis of document and usage statistics).
In order to harvest publications and connect them to the related EC FP7 grant agreements, the OpenAIRE infrastructure requires repositories to adopt the OpenAIRE Guidelines. These are low-barrier requirements for OAI-PMH compliant repositories that build on the oai_dc and DRIVER Guidelines.
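As a rough illustration of what harvesting over OAI-PMH looks like from the harvester side, the sketch below builds a ListRecords request URL and a grant-agreement identifier string. The `verb`, `metadataPrefix` and `set` parameters belong to the OAI-PMH protocol itself. The base URL is hypothetical, and the set name `ec_fundedresources` and the `info:eu-repo/grantAgreement/EC/FP7/` prefix are assumptions recalled from the OpenAIRE Guidelines that should be checked against the version in force. The helper names are invented for this example.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Selective harvesting: restrict the request to one OAI set.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def grant_relation(grant_id):
    """Format a project identifier for a dc:relation value.

    The prefix is an assumption about the OpenAIRE Guidelines, not taken
    from the abstract above.
    """
    return "info:eu-repo/grantAgreement/EC/FP7/" + str(grant_id)


# Hypothetical endpoint; a real repository publishes its own base URL.
url = list_records_url(
    "https://repository.example.org/oai/request",
    set_spec="ec_fundedresources",
)
print(url)
print(grant_relation(123456))
```

A repository manager could paste a URL of this shape into a browser to see which records their own OAI-PMH interface exposes for the set in question.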
When the repository is OpenAIRE compliant, its FP7 funded content is harvested periodically, indexed within the OpenAIRE portal and presented in the OpenAIRE search and browse section. In this way, FP7 funded research results deposited in the repository can achieve wider visibility and distribution – and be read, used and cited more widely by the global research community.
Research managers in institutions will be able to compare their institutional performance in FP7 projects with the performance of other institutions in their countries and within the European Union using the OpenAIRE FP7 publication statistics tool. This compliance will also save time for researchers.
Repositories successfully harvested by OpenAIRE are entitled to display the OpenAIRE logo on their website, to certify the quality and the global networked status of their content.
The OpenAIRE project team can help repository managers with targeted advocacy activities to ensure that high quality content is deposited into the repositories and then harvested by the OpenAIRE portal. The OpenAIRE project team reaches out to the researchers publishing FP7 funded articles and encourages them to self-archive in repositories.
• OpenAIRE and compliancy with the ERC Scientific Council Guidelines for Open Access and the European Commission Open Access Pilot in FP7.
• How to make the repository OpenAIRE compliant: OpenAIRE guidelines.
• Compliancy for DSpace: OAI Extended & OpenAIRE Authority Control.
ShareGeo – encouraging the reuse of derived geospatial data
Anne M Robertson (University of Edinburgh, UK)
Every year the UK Higher Education community downloads large volumes of geospatial data from EDINA’s geospatial data delivery services. Researchers and students are deriving new geospatial datasets from this source data. In recognition of the value these new derived datasets can play to the academic community, EDINA has set up ShareGeo as a repository to encourage the sharing and reuse of derived geospatial data. DSpace has been customised to offer a repository that eases both the deposit and discovery of geospatial data. ShareGeo interoperates with other components of the UK academic spatial data infrastructure by exposing its metadata for harvesting by GoGeo, the national academic geospatial metadata discovery service.
A tale of two data repositories: An examination of features in figshare and DSpace
Mark Hahnel (figshare, UK), Stuart Macdonald (University of Edinburgh, UK)
Figshare is a cloud based repository for individuals to make all of their research outputs available in a sharable, citable, discoverable manner. Edinburgh DataShare is a multi-disciplinary institutional data repository, based on DSpace, set up by the Data Library to allow Edinburgh University researchers to deposit, share, and license their data resources for online discovery and use by others, either openly or in a controlled way if requested.
On Figshare, researchers can publish figures, datasets, tables, videos, anything. All file formats can be published, including videos and datasets that are often demoted to the supplemental materials section in current journals. Up to 1GB of data can be stored privately for free, and users have unlimited space for publicly available research. All data is stored under a CC license to encourage frictionless sharing.
Edinburgh DataShare has been customised to offer its own look and feel, with a selection of standards-compliant metadata fields useful for discovery of datasets, through Google and other search engines. A persistent identifier and suggested citation are provided and all data is stored under ODC, PDDL licenses, or unlicensed with a valid rights statement to encourage sharing and re-use.
Both of these platforms offer a lot to researchers but rather than working in relative isolation, can the platforms trade socio-technical solutions to encourage best practice and researcher engagement in the process of making publicly funded research outputs available for re-use? Where can these projects collaborate in order to aid the research community? What are the technical and ‘political’ problems associated with such collaborations?
Preparing DSpace based Institutional Repository for the Semantic Web
Joonas Kesäniemi (University of Helsinki, Finland)
Linked Data has recently been gathering significant momentum as the method of choice for publishing and connecting open data on the web. Governmental and academic organizations are breaking out of their data silos in the hope of fostering new innovations that are built upon their data. Institutional repositories, as centralized services for collecting, managing and distributing digital content and related metadata, are prime candidates for sources of Linked Data. This connected and shared data builds a foundation for a more comprehensive vision of the Semantic Web, where the web of data can be automatically analyzed and enriched by machines to provide trusted, integrated and intelligent services, as envisioned by Tim Berners-Lee.
Out of the box, the current versions of DSpace provide very limited functionality for organizations wanting to dip their toes into the semantic web seas. The proposed paper describes a set of solutions for bringing data management, dissemination and discovery in a DSpace-based repository into the semantic web era.
Implementing a funder repository with heterogeneous material and advanced presentation capabilities
Ioanna-Ourania Stathopoulou, Nikos Houssos, Panagiotis Stathopoulos, Despina Hardouveli, Alexandra Roubani, Alexandros Soumplis, Chrysostomos Nanakos (National Documentation Centre, Greece)
This paper briefly describes the development of a funder repository for the dissemination, reuse and preservation of digital material of diverse types, produced under the auspices of large-scale (multi-billion-euro) funding programmes of the Hellenic Ministry of Education (co-financed by the European Union), while providing an enhanced user experience. This involved the handling of a wide range of content, including educational material, books, peer-reviewed scientific articles, conference proceedings, theses, videos and studies/reports. The DSpace platform was selected for the implementation of the repository and fulfilled the challenging requirements at hand, for example the heterogeneity of the material. Several DSpace extensions were also developed to assist the back-end procedures for cataloguing and digital file processing and to provide an enhanced end user experience. The project has been successfully completed and the system has been publicly available since spring 2011.
Implementation of a consortia driven repository infrastructure
Chris Helms (Georgia Institute of Technology, US)
This presentation will provide information on how the GKR infrastructure was configured to maintain and service multiple DSpace instances. Methodologies used in directory configurations, port usage, and variable assignments will be shared in a replicable fashion, giving others an opportunity to rethink and discuss their DSpace hosting and development environments at Open Repositories 2012.
Contributions for DSpace 3.0
Lieven Droogmans, Ben Bosman, Mark Diggory, Bram De Schouwer (@mire, Belgium)
As a registered DuraSpace service provider @mire is committed to support the community in as many ways as possible. @mire considers contribution to the development of DSpace not only as a means to support the community, but also as strategic company R&D. When working with clients, we regularly emphasize that contribution to the open source community provides significant long term maintenance benefits.
For DSpace 3.0, @mire has striven to significantly improve its previous contributions to DSpace: statistics for DSpace 1.6, Discovery for DSpace 1.7, and Configurable Workflow for DSpace 1.8. New features will be added to all of these contributions, and they will be further improved in terms of usability and performance. We will also discuss several additional planned contributions from our client projects (Item Versioning, Identifier Services, and Enhanced Embargo). These contributions will benefit all the community while reducing the long term maintenance costs of their solutions, reflecting the ultimate benefit of this contribution as a vehicle for long term sustainability of cost effective open source repository platforms.
Tutorial on configuration & usage of DSpace configurable workflow
Lieven Droogmans, Bram De Schouwer (@mire, Belgium)
The configurable workflow contributed by @mire to DSpace 1.8 is an addition offering a wide variety of workflow possibilities in DSpace compared to the standard DSpace workflow process.
Because this workflow is not enabled by default, most users are unaware of its configuration options. In addition, items must be migrated from the standard workflow to the new one, so use of the new workflow framework has been limited so far. The main goal of this tutorial is to show how easily the workflow framework can be installed and configured, and to explain the potential scenarios.
Creating metadata out of thin air and managing large batch imports
Linda D. Newman (University of Cincinnati, US)
The University of Cincinnati Libraries processed a large collection (over 500,000 items) of birth and death records from 1865-1912, creating dublin_core.xml submission packages from spreadsheets with minimal information and using batch methods to build a community that now holds 524,360 records.
This presentation will demonstrate the methods and scripting used to create and manage the submission package, and discuss what worked well, and what presented challenges, with the DSpace batch import process.
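To make the idea of generating dublin_core.xml files from spreadsheet rows concrete, here is a minimal sketch and not the presenters' actual scripts. It assumes the spreadsheet has been exported as CSV. The column names (`name`, `death_date`, `notes`) and the mapping to Dublin Core elements are hypothetical. The `dublin_core` and `dcvalue` element names with `element` and `qualifier` attributes follow the DSpace Simple Archive Format, in which each item also gets its own directory and a contents file listing its bitstreams.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical mapping from spreadsheet columns to (element, qualifier) pairs.
FIELD_MAP = {
    "name": ("title", "none"),
    "death_date": ("date", "created"),
    "notes": ("description", "none"),
}


def row_to_dublin_core(row, field_map=FIELD_MAP):
    """Return dublin_core.xml content (as a string) for one spreadsheet row."""
    root = ET.Element("dublin_core")
    for column, (element, qualifier) in field_map.items():
        value = (row.get(column) or "").strip()
        if not value:
            continue  # skip empty cells rather than emit empty dcvalue elements
        dcvalue = ET.SubElement(root, "dcvalue", element=element, qualifier=qualifier)
        dcvalue.text = value
    return ET.tostring(root, encoding="unicode")


# Invented sample data standing in for an exported spreadsheet.
sample_csv = io.StringIO(
    "name,death_date,notes\n"
    '"Doe, John",1890-05-02,\n'
    '"Roe, Jane",1901-11-17,Record partly illegible\n'
)

for row in csv.DictReader(sample_csv):
    print(row_to_dublin_core(row))
```

In a real batch, each returned string would be written to a dublin_core.xml file inside that item's directory before running the DSpace batch import.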
Since building that collection, we have been processing an archive of over 37,000 letters and notes of the scientist and polio researcher Dr. Albert B. Sabin. Similar methods are being used, with the additional layer that we need to review about 1/3 of the items for possible redaction of personal medical information, or classified (by the U.S. military) information. This presentation will also discuss how an IP-limited test environment was used to load, redact, export and reload the same materials in an efficient manner.
Curation Tasks for Repository Managers : Staying in the Light and have a Dark Side
Yanan Zhao, Kim Shepherd, Yin Yin Latt, Leonie S. Hayes (The University of Auckland, NZ)
The problem facing many DSpace Repository Managers is the complex task of matching content policies with complex business rules. Authorisation and access control within the DSpace software is challenging for most Repository Managers, and it dictates that policy is uniform across collections. In reality, our collections are a mixture of items, and we need the flexibility to change and amend policies. We have solved many problems by using authentication systems to create access to matching collections, but at the application end we struggle to match the required access to the capabilities of the system. There are some things that cannot be solved with authentication.
These policies can range from:
1. Freely available online
2. Available to campus users only
3. Available to administrators or discrete groups
4. A mixture within an item of some parts being available and others restricted.
5. A mixture of different access for metadata and the associated bitstreams (some files open and some restricted).
A series of DSpace curation tasks were created to meet the user requirements. Use cases are described and discussed.
1. Digital Theses: ResearchSpace@auckland.ac.nz – managing institutional deposit policies determined by University Statutes
2. Data Curation – satisfying the requirements for data elements ranging from open datasets, to derived data components.
3. Publications and Creative Works – mixed access policies
4. Image collections
A series of three curation tasks was created and added to the processing workflow. The dc.rights.accessrights field is used to indicate the overall access level of each ResearchSpace item. The three values used in the field are: “http://purl.org/eprint/accessRights/RestrictedAccess”, “http://purl.org/eprint/accessRights/ClosedAccess”, and “http://purl.org/eprint/accessRights/OpenAccess”.
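The decision a task has to make from this field can be sketched as a simple lookup. The code below is an illustration only and is not the Auckland implementation. It is written in Python for brevity and does not use the DSpace API, whereas DSpace curation tasks themselves are Java classes. The short labels ("open", "restricted", "closed"), the dictionary-based item metadata and the function name are all hypothetical.

```python
# The three URIs come from the abstract; the short labels are hypothetical.
ACCESS_LEVELS = {
    "http://purl.org/eprint/accessRights/OpenAccess": "open",
    "http://purl.org/eprint/accessRights/RestrictedAccess": "restricted",
    "http://purl.org/eprint/accessRights/ClosedAccess": "closed",
}


def access_level(metadata):
    """Return the overall access level recorded for an item.

    `metadata` is assumed to map field names to lists of values. Items with
    a missing, unrecognised or conflicting value are reported as "unknown"
    so that a person can review them.
    """
    values = metadata.get("dc.rights.accessrights", [])
    levels = {ACCESS_LEVELS[v] for v in values if v in ACCESS_LEVELS}
    if len(levels) == 1:
        return levels.pop()
    return "unknown"


item = {
    "dc.title": ["An example thesis"],
    "dc.rights.accessrights": ["http://purl.org/eprint/accessRights/RestrictedAccess"],
}
print(access_level(item))
```

A task built around such a lookup could then apply whichever resource policies the institution associates with each level.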
What is a Curation Task?
A Curation Task is a workflow step in the DSpace software that does not involve making changes to the core DSpace code base. It enables software developers to take a business rule and implement it in addition to, and on top of, the application functionality. It further expands the concept of the repository as a service, in which Repository Managers and administrative staff can design their requirements with maximum flexibility. If the policy changes, the curation task can be updated and applied to items, collections or parts of items.