Aug 20, 2012
 

It has now been a month since we gathered in Edinburgh for Open Repositories 2012 and we are delighted to report that there has been plenty of new content and reflection about the conference appearing since then.

Well over 90 blog posts and reports on the conference are now out there – you have been absolutely brilliant over the last few weeks sharing your reports, reflections and thoughts on how to take forward the fantastic ideas shared by speakers, posters, fellow delegates. We are sure there are more posts to come (since it has taken us a while to update this blog and we’re sure we’re not the only ones still thinking about talks, ideas, discussions had) so do let us know as you add any reports or write ups of your own. For now here are a few more highlights we wanted to share while everything is still fairly fresh – look at the bottom of this post for links to a more thorough collection of posts.

Firstly we have noticed lots of you sharing links to your slides on SlideShare. We will be making sure all of the programme content, slides and videos are connected up here on the website but for now we are making sure we gather these links to your shared presentations. For instance Todd Grappone and Sharon Farb at UCLA have shared their slides on the broadcast news archival work. This ambitious project is one to keep an eye out for, especially when it opens to the public in the future.



Research data has featured prominently in many of your write ups as it was a major theme of this year’s Open Repositories:

Leyla Williams blogged a summary of the conference for the Center For Digital Research and Scholarship, with particular attention paid to research data and public access to hives of content.

Meanwhile Leslie Johnston of the Library of Congress gave a talk on big data, and also wrote up a great post on the significance of data in a repository setting where publications were once the central focus.

Tyrannosaurus and Shark in National Museum

Some people say open access policy has no teeth… (‘OR2012 012’ by wr_or2012, 22-07-12)

In addition to delegates and attendees who have been sharing their experiences, some of our workshop facilitators have been sharing rich reflections on their workshops. For example Angus Whyte of the Digital Curation Centre further developed the idea of research data in repositories, and wrote up the conference workshop on the subject.

Most of you will have seen some of the Developer Challenge Show & Tell sessions and we are delighted that the DevCSI team have shared their videos of OR2012. They are a great collection of Developer Challenge presentations and short interview recordings, including a clip of Peter Sefton, chair of the judges.

We are also starting to see some really interesting posts about how OR2012 ideas and talks can be operationalised. For instance Simon Hodson of JISC has posted a whole series of excellent OR2012 write ups and reflections at the JISC Managing Research Data blog.

And we have also started to see publications based on the conference appearing. Steph Taylor has written about OR2012 for Ariadne (Issue 69) as an example to frame her advice on getting the most from a conference – it’s a super article and should prove handy for planning your trip to OR2013 on Prince Edward Island. OR2012 has also featured very prominently in the latest issue of Digital Repository Federation Monthly, which includes 10 Japanese attendees’ reports of the conference – huge thanks to @nish_ku for bringing this to our attention.

The Digital Repository Federation article is far from the only non-English write up we’ve had – so far we have spotted write ups of the conference in German, Finnish and Polish, more posts in Japanese and this fantastic series of images of the conference dinner from the Czech Klíštěcí šuplátko photo blog. We know our language skills can’t match up to the incredible diversity of languages spoken by OR2012 delegates so we would really like you to let us know if we’ve missed any of the write ups, reports, or reflections shared, particularly if they have been shared in another language.

As we have shared a number of write ups that draw on major conference themes it seems appropriate to close this post with the video of Peter Burnhill of EDINA delivering the closing session this year and wrapping everything up. It’s worth re-watching and, like all of the OR2012 videos, you can watch, share and comment on this on YouTube.

And finally….

We have several OR2012 conference bags left to give away. These are the perfect size for a laptop and papers which makes them fantastic for meetings but they are also great for looking stylish and well-travelled around the office or for transporting your craft kit to coffee shops and meet ups. We will be posting these remaining bags out with a few bonus edible Scottish treats so make sure you comment here or tweet with #or2012bags quickly to make sure you secure one of our last three remaining bags!

Where to find even more highlights…

  • Images can be found on Flickr; highlights are gathered on our Pinterest board.
  • We have several gatherings of useful links which you can find on Delicious: write ups (blog posts, reports, etc.) of OR2012, useful resources shared in presentations and via Twitter, and OR2012 presentations.
  • Videos are on YouTube.
  • We have gathered tweets with Storify for browsing and exploring (please note this archive is updated once a week).
  • If you want to analyse or browse the text of all tweets you can access the full spreadsheet containing thousands of #OR2012 tweets on Google Docs. Please ignore colour codings – these are being used to remove unwanted content (tweets intended for other hashtags) and to ensure we capture all links to useful resources shared.
Posted August 20, 2012 at 1:31 pm in Updates: Another Round of Highlights
Jul 18, 2012
 

OR2012 has wrapped up, tweets are now just slowly fluttering in, and blog posts are popping up like new database entries in springtime. We wanted to gather together a sampling of the best stuff we’ve come across since last week and put it all in plain sight. We know you guys eat broken links and buried content for breakfast, but we figured this could be your pre-meal cup of coffee. …or something. Anyway, here’s what we’ve got.

Keita Bando was active throughout the conference. Here's a shot taken at the drinks and poster session. Click through to see the rest of Keita's lovely photos

Natasha Simons was one of our volunteer bloggers, and she did a fantastic job of it. Mixing summary, analysis, and flair into each post makes each and every one a pleasure to read. Here’s one on arriving in Edinburgh and hearing about the ‘Building a National Network’ workshop, one on conference day 2 (and haggis balls), and one with a sporran full of identifiers chat.

Rob Hilliker immortalized some of the software archiving workshop whiteboard notes for us. Linked to his Twitter post, which leads to a few more pictures and his epic stream of OR2012 tweets

Nick Sheppard, another of our volunteer bloggers, wrote up his reflections of the first two days of the conference on the train ride home. He was keen to write it, and you should be keen to read it. Trust us.



Owen Stephens put together some notes and commentary on repository services, and especially on ResourceSync for folks that are into that sort of thing.

We’re also pleased that discussing the Anthologizr project inspired an Edinburgh University MSc student to focus on that work for his e-Learning dissertation.

An amazing bit of #OR2012 activity analytics by Martin Hawksey using Carrot2. Click through for full details on how it was made.

The JISC MRD folks took superb notes about the session on institutional perspectives in research data management and infrastructure.

Brian Kelly weighed in on Cameron Neylon’s opening plenary and the significance of connectedness, with particular focus on social media platforms. His site is always worth a browse, so keep tabs on it. The plenary video is available on YouTube.

The DevCSI developer challenge was quite a lively segment of the conference, no matter which side of the mic you were on. Stuart Lewis drummed up excitement about the collaboration between developers and managers that the challenge aimed for this year, and the result was more than we could hope for. The number of submissions was higher than ever. Check out the competition show and tell and read about the winners.

A mockup of Clang! It was the runner-up project in the DevCSI developer challenge. Click through for a post about the idea

That’s what we’ve gathered so far, but it isn’t enough to do you all justice. That’s why we want you to comment, write in, tweet, and photograph everything you think we missed. We need slide decks, papers, pictures, and everything else. Speakers, if you haven’t passed on slides to session chairs, don’t be shy. And everybody else, drop us a line. We’ll be sure to include whatever you’ve got.

"Coder we can believe in." Click through for Adam Field's first tweet of the image

All this work isn’t just for the website. Everything we gather up will be going into a repository of open repository conference content. What can we say, we’re pretty single-minded when it comes to keeping it all open access for you lot. Get sending, and we’ll share more soon.

Posted July 18, 2012 at 11:02 am in Updates: Highlights (so far)
Jul 13, 2012
 

As the conference draws to a close we wanted to thank all of you who came along or followed the event online, and we wanted to fill you in on what will be happening around the conference after the in-person part of Open Repositories 2012.

In the next few weeks we will be going through the over 4000 tweets and the fantastic photos, blog posts, presentations, conference materials and commentary that you have been producing throughout the conference and we’ll be summarising all that right here, linking to your blogs and reports, and highlighting where you can access all of the official conference content.

Here are eight ways to keep in touch:

  1. Fill in our survey – tell us what you liked, what we could have done better… we value all of your feedback on the event whether you were here in person or via reading our blogs, tweets, seeing videos etc: http://www.surveymonkey.com/s/OR_2012
  2. Stick with us on Twitter – we will continue sharing blog posts, updates, and conference-related news via the #or2012 tag and the @OpenRepos2012 account. And you should start following the new @ORConference Twitter account which will keep you in touch with Open Repositories throughout the year! Remember to reply, comment, retweet!
  3. Blog with us – we did our best to liveblog from the parallel strands but we would love to hear what you thought of these and other sessions – did you go to or run a fantastic workshop? Was there something incredibly useful from the user group you’d like to see shared more widely? We would love your contributions to the blog or to hear about where you’ve been writing up the event – just drop us an email or leave a comment here!
  4. Keep an eye on the OR2012 YouTube channel – you will find over 40 videos of the parallel sessions there already (excluding P1A: unfortunately our AV team have been unable to correct a corrupt file of that recording) and Pecha Kucha sessions will be appearing over the next few weeks.
  5. Share your pictures – if you haven’t already joined our Flickr group please do get in touch – we’d love to see more of your pictures of the event!
  6. Pin with us! – We have begun the process of gathering our favourite images and videos from OR2012 on Pinterest. We would love to add your highlights, your favourite parts of the event so do let us know what you’d like to see appear!
  7. Connect on CrowdVine! Now that you’ve had a chance to meet and chat it’s a great time to use the OR2012 CrowdVine to stay in touch, make further connections, and discuss your thoughts on the event. For instance there’s already a great thread on “highlights and things you’ll take home”.
  8. And finally… Look out for emails about Open Repositories 2013. If you’ve let us know your email address via the feedback form we’ll be in touch. You can also join the Open Repositories Google Group and stay in touch that way. Or you can simply drop us a note to or2012@ed.ac.uk and we’ll make sure we add you to our list for staying in touch.

We really enjoyed Open Repositories 2012 and really hope you did too!

Posted July 13, 2012 at 4:04 pm in Updates: What to expect from OR2012 over the next few weeks
Jul 12, 2012
 

Today we are liveblogging from the OR2012 conference at George Square Lecture Theatre (GSLT), George Square, part of the University of Edinburgh. Find out more by looking at the full program.

If you are following the event online please add your comment to this post or use the #or2012 hashtag.

This is a liveblog so there may be typos, spelling issues and errors. Please do let us know if you spot a correction and we will be happy to update the post.

Kevin: I am delighted to introduce my colleague Peter Burnhill, Director of EDINA and Head of the Edinburgh University Data Library, who will be giving the conference summing up.
Peter: When I was asked to do this I realised I was doing the Clifford Lynch slot here! So… I am going to show you a Wordle. Our theme for this year’s conference was Local In for Global Out… I’m not sure if we did that but here is the summing up of all of the tweets from the event. Happily we see data, open, repositories and challenge are all prominent here. But data is the big arrival. Data is now mainstream. If we look back on previous events we’ve heard about services around repositories… we got a bit obsessed with research articles, in the UK because of the REF, but data is important and it is great to see it being prominent. And we see jiscmrd here so Simon will be pleased he did come on his crutches [he has broken his leg].
I have to confess that I haven’t been part of the organising committee but my colleagues have. We had over 460 register from over 40 different nations so do all go to PEI. Edinburgh is a beautiful city but when you got here it was rather damp. It’s nicer now – go see those things. Edinburgh is a bit of a repository itself – we have David Hume, Peter Higgs and Harry Potter to boast – and that fits with local in for global out as I’m sure you’ve heard of two of them. And I’d like to thank John Howard, chair of the OR Steering Committee, and our Host Organising Committee.
Our opening keynote, Cameron Neylon, talked about repositories beyond academic walls and the idea of using them for turning good research outputs into good research outcomes. We are motivated to make sure we have secure access to content… as part of a more general rumbling, with workshops before the formal start, there was this notion of disruption. Not only the Digital Economy but also a sense of not being passive about that. We need to take command of the scholarly communication area, that is our job – that was the cry to action from Cameron and we should heed it.
And there was talk of citation… LinkedIn, Academia.edu etc. are all about linking back to research, to data. And that means having reliable identifiers. And trust is a key part of that. Publishers have trust; if repositories are to step up to that trust level you have to be sure that when you access that repository you get what it says it is. As a researcher you don’t use data without knowing what it is and where it came from. The repository world needs to think about that notion of assurance, not quality assurance exactly. And also that object may be interrogatable to say what it is and really help you reproduce that object.
Preservation and provenance are also crucial.
Disaster recovery is also important. When you fail, and you will, you need to know how you cope; really interesting to see this picked up in a number of sessions too.
I won’t summarise everything but there were some themes…
We are beginning to deal with the idea of registries and how those can be leveraged for linking resources and identifiers. I don’t think solutions were found exactly but the conversations were very valuable.
And we need to think about connectivity, as flagged by Cameron. And these places, like Twitter and Facebook… we don’t own them but we need to be in them, to make sure that citations come back to us from there.
And finally, we have been running a thing called Repository Fringe for the last four years, and then we won the big one. But we had a little trepidation as there are a lot of you! And we had an unconference strand. And I can say that UoE intends to do Repository Fringe in 2013.

We hope you enjoyed that unconference strand – an addition to complement Open Repositories, not to take away from it but to add an extra flavour. We hope that the PEI folk will keep a bit of that flavour at OR and we will be running the Fringe a wee bit later in the year, nearer the Edinburgh Fringe.

As I finish up I wanted to mention an organisation, IASSIST. Librarians used to be about the demand side of services but things have shifted over time. We would encourage those of us here to link up to groups like IASSIST (and we will suggest the same to them) so we can find ways to connect up, to commune together at PEI and to share experience. And so finally I think this is about the notion of connectivity. We have the technology, we have the opportunity to connect up more to our colleagues!

And with that I shall finish up!

Begin with an apology….

We seem to have the builders in. We have a small event coming up… The biggest festival in the world… But we didn’t realise that the builders would move in about the same week as you… What you haven’t seen yet is our 60x40ft upside down purple cow… If you are here a bit longer you may see it! We hope you enjoyed your time nonetheless.

It’s a worrying thing hosting a conference like this… Like hosting a party you worry if anyone will show up. But the feedback seems to have been good and I have many thank yous. Firstly to all of those who reviewed papers. To our sponsors. To the staff here – catering, Edinburgh First, the tech staff. But particularly to my colleagues on the local Host Organising Committee: Stuart Macdonald, William Nixon, James Toon, Andrew Bevan – most persuasive committee member getting our sponsors on board, Sally Macgregor, Nicola Osborne who has led our social media activity, and to Florance Kennedy, who has been using her experience of wrangling 1000 developers at FLoC a few years ago.

The measure of success for any event like this is about the quality of conversation, of collaboration, of idea sharing, and that seems to have worked well and we’ve really enjoyed having you here. The conference doesn’t end now of course but changes shape. And so we move onto the user groups!

Posted July 12, 2012 at 11:33 am in LiveBlog, Updates
Jul 11, 2012
 

Today we are liveblogging from the OR2012 conference at Lecture Theatre 5 (LT5), Appleton Tower, part of the University of Edinburgh. Find out more by looking at the full program.

Topic: A Repository-based Architecture for Capturing Research Projects at the Smithsonian Institution
Speaker(s): Thorny Staples

I have recently returned to the Smithsonian. I got into repositories through lots of digital research projects. I should start off by saying that I’ll show you screenshots for a system that allows researchers to deposit data from the very first moment of research, it’s in their control until it goes off to curators later.

I’m sure most of you know of the Smithsonian. We were founded to be a research institute originally – museums were a result of that. We have 19 museums, 9 scientific research centers, 8 advanced study centres, 22 libraries, 2 major archives and a zoo (Washington zoo). We focus on long-term baseline research, especially in biodiversity and environmental studies, and lots of research in cultural heritage areas. And all of this, hundreds of researchers working around the world, has had no systematic data management of digital research content (except for SAO who work under contract for NASA).

So the problem is that we need to capture research information as it’s created and make it “durable” – it’s not about preservation but about making it durable. The Smithsonian is now requiring a data management plan for ALL projects of ANY type. This is supposed to say where they will put their digital information, or at least get them thinking about it. But we are seeing very complex arrays of numerous types of data. Capturing the full structure and context of the research content is necessary. It’s a network model, it’s not a library model. We have to think network from the very beginning.

We have to depend on the researchers to do much of the work, so we have to make it easy. They only have to describe their data minimally, but they have to do something. And if we want them to do it we must provide incentives. It’s not about making them curators. They will have a workspace, not an archive. It’s about a virtual research environment, but a repository-enabled VRE. The primary goal is to enhance their research capabilities, leaving trusted data as their legacy. So to deliver that we have to care about a content creation and management environment, an analysis environment and a dissemination environment. And we have to think about this as two repositories: there is the repository for the researcher, they are data owners, they set policies, they have control – crucial buy-in and a crucial concept for them; and then we have to think about an interoperable gathering service – a place researcher content feeds into, and also cross search/access to multiple repositories back in the other direction, as these researchers work in international teams.

Key to the whole thinking is the concept of the web as the model. It’s a network of nodes that are units of content, connected by arcs that are relationships. I was attracted to Fedora because of the notion of a physical object and a way to create networks here. Increasingly content will not be sustainable as discrete packages. We will be maintaining our part of the formalized world-wide web of content. Some policies will mean we can’t share everything all the time but we have to enable that, that’s where things are going. Information objects should be ready to be linked, not copied, as policy permits. We may move things from one repository to another as data moves over to curatorial staff but we need to think of it that way.

My conceptual take here is that a data object is one unit of content – not one file. E.g. a book is one object no matter how many pages (all of which could be objects). By the way this is a prototype, this isn’t a working service, it’s a prototype to take forward. And the other idea that’s new is the “concept object”. There is an object with metadata about the project as a whole, then a series of concept objects for the components of that project. If I want to create a virtual exhibition I might build 10 concept objects for those paintings and then pull up those resources.

So if you come into a project you see a file structure idea. There’s an object at the top for the project as a whole. Your metadata overview, which you can edit, lets you define those concepts. The researcher controls every object and all definitions. The network is there, they are operating within it. You can link concepts to each other, it’s not a simple hierarchy. And you can see connections already there. You can then ingest objects – right now we have about 8 concept types (e.g. “Research site, plot or area”). When you pick that you then pick which of several forms you want to use. When you click “edit” you can see the metadata editor in a simple web form prepopulated with the existing record. And when you look at resources you can see any resources associated with that concept. You can upload resources without adding metadata but it will show in bright yellow to remind you to add metadata. And you can attach batches of resources – and these are offered depending where you are in the network.

And if I click on “exhibit” – a link on each concept – you can see a web version of the data. This takes advantage of the administrator screen but allows me to publish my work to the web. I can keep resources private if I want. I can make things public if I want. And when browsing this I can potentially download or view metadata – all those options are defined by the researcher’s setting of policies.
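To make the network idea above a little more concrete, here is a minimal Python sketch of concept objects and data objects. It is not the Smithsonian prototype (which the talk describes as Fedora-based); the class names, field names and the `unlabelled` helper are all hypothetical, chosen only to illustrate linked concepts and the "bright yellow" reminder for resources with no metadata.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A data object: one unit of content, which may span many files."""
    title: str
    files: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)

    def needs_metadata(self) -> bool:
        # Stand-in for the yellow highlight shown for undescribed uploads.
        return not self.metadata


@dataclass(eq=False)  # identity comparison, so cyclic links are safe
class Concept:
    """A concept object: an organising node defined by the researcher."""
    title: str
    concept_type: str
    resources: list = field(default_factory=list)
    links: list = field(default_factory=list)

    def link(self, other: "Concept") -> None:
        # Arcs are plain references, so this is a network, not a strict tree.
        if other not in self.links:
            self.links.append(other)


def unlabelled(root: Concept) -> list:
    """Walk the network from `root` and list resources lacking metadata."""
    seen, stack, flagged = set(), [root], []
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        flagged.extend(r.title for r in node.resources if r.needs_metadata())
        stack.extend(node.links)
    return flagged


# Hypothetical project with two concepts that point at each other.
project = Concept("Forest survey", "Project")
site = Concept("Plot A", "Research site, plot or area")
project.link(site)
site.link(project)
site.resources.append(Resource("Soil readings", ["soil.csv"]))
site.resources.append(Resource("Site photo", ["a.jpg"], {"creator": "J. Doe"}))
print(unlabelled(project))
```

Because concepts can link back to each other, the walk keeps a set of visited nodes rather than assuming a hierarchy.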

Q&A

Q1 – Paul Stanhope from University of Lincoln) Is there any notion of concepts being bigger than the institution, being available to others?

A1) We are building this as a prototype, as an idea. So I hope so. We are a good microcosm for most types of data – when the researcher picks that they pick metadata schemas behind the scenes. This thing we built is local but it could be global, we’re building it in a way that could work that way. With the URIs other institutions can link their own resources etc.

Q2) Coming from a university, do you think there’s anything different about your institution? Is there a reason this works differently?

A2) One of the things about the Smithsonian is that all of our researchers are Federal employees and HAVE to make their data public after a year. That’s a big advantage. We have other problems – funding, the government – but policy says that the researchers have to

Q3 – Joseph Green from University College Dublin) How do you convey the idea of concept objects etc. to actual users – it looks like file structures.

A3) Well yes, that’s kind of the idea. If they want to make messy structures they can (curators can fix). The only thing they need is a title for their concept structure. They do have a file system BUT they are building organising nodes here. And that web view is an incentive – it’ll look way better if they fill in their metadata. That’s the beginning… for tabular data objects for instance they will be required to do a “code book” to describe the variables. They can do this in a basic way or they can do a better, more detailed code book and it will look better on the web. We are trying to incentivise at every level. And we have to be fine with ugly file structures and live with it.

Topic: Open Access Repository Registries: unrealised infrastructure?
Speaker(s): Richard Jones, Sheridan Brown, Emma Tonkin

I’m going to be talking about an Open Access Repositories project that we have been working on, funded by JISC, looking at what Open Access repository registries are being used for and what their potential is via stakeholder interviews, via a detailed review of ROAR and OpenDOAR, and some recommendations.

So if we thought about a perfect/ideal registry as a starting point… we asked our stakeholders what they would want. They would want it to be authoritative – the right name, the right URL; they want it to be reliable; automated; broad scope; curated; up-to-date. The idea of curation and the role of human intervention would be valuable although much of this would be automated. People particularly wanted the scope to be much wider. If a data set changes there are no clear ways to expand the registry and that’s an issue. But all of those terms are really about the core things you want to do – you all want to benchmark. You want to compare yourself to others and see how you’re doing. And our sector and funders want to see all repositories: what are the trends, how are we doing with Open Access. And potentially ranking repositories or universities (like Times HE rankings) etc.

But what are they ACTUALLY being used for right now? Well, people mainly use them for documenting their own existing repositories. Basic management info. Discovery. Contact info. Lookups for services – using the registry to find OAI-PMH endpoints. So I think it looks as if we’re falling a bit short! So, a bit of background on what OA repository registries there are. We have OpenDOAR and ROAR (Registry of Open Access Repositories) – those are both very broad scope registries, well known and well used. But there is also the Registry of Biological Repositories. There is re3data.org – all research data, so it’s a content type specific repository registry. And, more esoterically, there is the Ranking Web of World Repositories. It is not clear if this is a registry or a service on a registry. And indeed that’s a good question… what services run on registries? So things like BASE search for OAI-PMH endpoints; very similar to this is Institutional Repositories Search based at Mimas in the UK. Repository 66 is a more novel idea – a mashup with Google Maps to show repositories around the world. Then there is the Open Access Repository Junction, a multi-deposit tool for discovery and use of SWORD endpoints.
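As an illustration of the "lookup for services" use mentioned above, here is a small Python sketch that turns registry records into OAI-PMH request URLs. The `verb` and `metadataPrefix` parameters come from the OAI-PMH protocol itself; the record layout (`name`, `oai_base_url`) and the example hostnames are assumptions made for this sketch, not the actual OpenDOAR or ROAR data format.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def oai_request(base_url: str, verb: str, **args: str) -> str:
    """Build an OAI-PMH request URL for `verb` against a repository base URL.

    Any query string already present on the base URL is discarded here,
    which is a simplification made for this sketch.
    """
    parts = urlsplit(base_url)
    query = urlencode({"verb": verb, **args})
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


def identify_urls(records: list) -> list:
    """Return an Identify request URL for each record that lists an endpoint."""
    return [
        oai_request(record["oai_base_url"], "Identify")
        for record in records
        if record.get("oai_base_url")
    ]


# Hypothetical registry records; field names are illustrative only.
registry = [
    {"name": "Example Repository", "oai_base_url": "http://repository.example.org/oai"},
    {"name": "No Endpoint Listed", "oai_base_url": ""},
]

print(identify_urls(registry))
print(oai_request("http://repository.example.org/oai", "ListRecords",
                  metadataPrefix="oai_dc"))
```

A harvester would then fetch each URL and parse the XML response; that step is left out here to keep the sketch free of network calls.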

Looking specifically at OpenDOAR and ROAR. OpenDOAR is run at the University of Nottingham (SHERPA) and it uses manual curation. It only lists OA and full-text repositories. It’s been running since 2005. Whereas ROAR is principally Repository Manager added records. No manual curation. And it lists both full-text and metadata-only repositories. It is based at the University of Southampton and runs EPrints 3, inc. SNEEP elements etc. Interestingly both of these have policy addition as an added value service. Looking at the data here – and these are a wee bit out of date (2011). There seems to be big growth but some flattening out in OpenDOAR in 2011 – probably approaching full coverage. ROAR has a larger number of repositories due to the difference in listing but is quite similar to OpenDOAR (and ROAR harvests this too). And if we look at where repositories are, both ROAR and OpenDOAR are highly international. Slightly more European bias in OpenDOAR perhaps. The coverage is fairly broad and even around the globe. When looking at content type OpenDOAR is good at classifying material into types, reflective of manual curation. We expect this to change over time, especially datasets. ROAR doesn’t really distinguish between content types and repository types – it would be interesting to see these separately. We also looked at what data you typically see about the repository in any record. Most have name, URL, location etc. OpenDOAR is more likely to include a description and contact details than is the case in ROAR. Interestingly the machine to machine interfaces are a different story. OpenDOAR didn’t have any RSS or SWORD endpoint information at all, ROAR had little. I know OpenDOAR are changing this soon. This field has been added on later in ROAR and no-one has come back to update their records for this new technology; that needs addressing.

A quick note about APIs. ROAR has an OAI-PMH API, no client library, and a full data dump available. OpenDOAR has a fully documented query API, no client library, and a full data dump available. When we were doing this work almost no one was using the APIs, they just downloaded all the data.

We found stakeholders, interviewees etc. noted some key limitations: content count stats are unreliable; not internationalised/multilingual- particularly problematic if a name is translated and is the same as but doesnt appear to be the same thing; limited revisions history; No clear relationships between repos, orgs, etc. And no policies/mechanisms for populating new fields (e.g. SWORD). So how can we take what we have and realise potential for registries? There is already good stuff going on… Neither of those registries automatically harvest data from repositories but that would help to make data more authoritative/reliable/up to date; automated; increased scope of data – and that makes updates so much easier for all.  And we can think about different kinds of quality control – no one was doing automated link checking or spell checking and those are pretty easy to do. And an option for human intervention was in OpenDOAR but not in ROAR, and that could be make available.
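The automated link checking mentioned above could look something like this minimal Python sketch, using only the standard library. The record fields (`name`, `url`) are hypothetical, and the checker function is passed in as a parameter so the logic can be tried out without touching the network.

```python
from typing import Callable, Optional
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status of a HEAD request, or None if unreachable."""
    try:
        request = Request(url, method="HEAD")
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # The server answered, but with an error status such as 404.
        return err.code
    except (OSError, ValueError):
        # DNS failure, refused connection, timeout, or a malformed URL.
        return None


def broken_links(records: list,
                 fetch: Callable[[str], Optional[int]] = http_status) -> list:
    """List the names of registry records whose URL fails the check."""
    broken = []
    for record in records:
        status = fetch(record["url"])
        if status is None or status >= 400:
            broken.append(record["name"])
    return broken


# Offline demonstration with a stand-in fetcher and made-up records.
fake_statuses = {
    "http://alive.example.org/": 200,
    "http://gone.example.org/": 404,
    "http://down.example.org/": None,
}
records = [
    {"name": "Alive", "url": "http://alive.example.org/"},
    {"name": "Gone", "url": "http://gone.example.org/"},
    {"name": "Down", "url": "http://down.example.org/"},
]
print(broken_links(records, fetch=fake_statuses.get))
```

A check like this only says whether a URL responds. It cannot tell whether the page that answers is still the repository the record describes, so it would complement human curation rather than replace it.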

But we could also make them more useful for more things – graphical representations of the registry; better APIs and data (with standards compliance where relevant); versioning of repositories and record counts; more focus on policy tools. And we could look to encourage overlaid services: repository content stats analysis; comparative statistics and analytics; repository and OA rankings; text analysis for identifying holdings; error detection; multiple deposits. Getting all of that, we start hitting that benchmarking objective.

Q&A

Q1 – Owen Stephens) One of the projects I’m working on is CORE project from OU and we are harvesting repositories via OpenDOAR. We are producing stats about harvesting. Others do the same. It seems you are combining two things – benchmarking and repositories. We want OpenDOAR to be comprehensive, and we share your thoughts on need to automate and check much of that. But how do we make sure we don’t build both at the same time or separate things out so we address that need and do it properly?

A1) The review didn’t focus on structures of resulting applications so much. But we said there should be a good repository registry that allows overlay of other services – like the benchmarking services. CORE is an example of something you would build over the registry. We expect the registry to provide a mechanism to connect up to these though. And I need to make an announcement: JISC, in the next few weeks, will be putting out an ITT to take forward some of this work. There will be a call out soon.

Q2 – Peter from OpenDOAR) We have been improving record quality in OpenDOAR. We’ve been removing some repositories that are no longer there – link checking doesn’t do it all. We are also starting to look at including those machine to machine interfaces. We are doing that automatically with help from Ian Stuart at EDINA. But we are very happy to have them sent in too – we’ll need that in some cases.

A2) You are right that link checkers are not perfect. More advanced checking services can be built on top of registries though.

Q3) I am also working on the CORE project. The collaboration with OpenDOAR where we reuse their data, it’s very useful. Because we are harvesting we can validate the repository and share that with OpenDOAR. The distinction between registries and harvesting is really about an ecosystem that can work very well.

Q4) Is there any way for repositories to register with schema.org to enable automatic discovery?

A4) We would envision something like that, that you could get all that data in a sitemap or similar.

A4 – Ian Stuart) If registering with Schema.org then why not register with OpenDOAR?

A4 – chair) Well with schema.org you host the file, it’s just out on the web.

Q5) How about persistent URLs for repositories?

A5) You can do this. The Handle in DSpace is not a persistent URL for the repository.

Topic: Collabratorium Digitus Humanitas: Building a Collaborative DH Repository Framework
Speaker(s): Mark Leggott, Dean Irvine, Susan Brown, Doug Reside, Julia Flanders

I have put together a panel for today but they are in North America so I’ll bring them in virtually… I will introduce and then pass over to them here.

So… we all need a cute title and Collaboratory is a great word we’ve heard before. I’m using that title to describe a desire to create a common framework and/or set of interoperable tools providing a DH Scholars Workbench. We often create great creative tools but the idea is to combine and make best use of these in combination.

This is all based on Islandora, a Drupal + Fedora framework from UPEI: a flexible UI on top of Fedora and other apps. It’s deployed in over 100 institutions and that’s growing. The ultimate goal of those efforts is to release a Digital Humanities solution pack with various tools integrated in, in a framework that would be of interest in a scholarly DH context – images, video, TEI, etc.

OK so now my colleagues…

Dean is visiting professor at Yale, and also professor at Dalhousie University in Canada and part of a group that creates new versions of important modernism in Canada prints. Dean: so this is the homepage for Modernist Commons. This is the ancillary site that goes with the Modernism in Canada project. One of our concerns is the long term preservation of digital data stored in the commons. What we have here is both the repository and a suite of editing tools. When you go into the commons you will find a number of collections – all test collections and samples from the last year or so. We have scans of a bilingual publication called Le Nigog, a magazine that was published in Canada. You can view images or mark-up, or you can view all of the different ways to organise and orchestrate the book object in a given collection. You can use an Internet Archive viewer or alternative views. The IA viewer frames things according to the second to last image in the object, so you might want to use an alternative. In this viewer you can choose whether to look at the markup, entities, structures, RDF relations or image annotations. The middle pane is a version of CWRC Writer that lets us do TEI and RDF markup. And you see the SharedCanvas tools provided with other open annotation group items. As you mark up a text you can create author authority files that can be used across collections/objects.

Next up Susan Brown; her doctorate is on Victorian feminist literature. She currently researches collaborative systems, interface design and usability. Susan: I’ll be talking more generally than Dean. The Canadian Writing Research Collaboratory is looking to do something pretty ambitious that only works in a collaborative DH environment. We have tools that can aim as big as we can. I want to focus on talking about a couple of things that define a DH Collaboratory. It needs to move beyond the institutional repository model. To invoke the perspective of librarian colleagues I want to address what makes us so weird… What’s different about us is that storing final DH materials is only part of the story; we want to find, amass and collect materials; to sort and organise them; to read, analyse and visualize. That means environments must be flexible, porous, really robust. Right now most of that work is on personal computers – we need to make these more scalable and interoperable. This will take a huge array of stakeholders buying into these projects. So a DH repository environment needs to be easy to manage, diverse and flexible. And some of these will only have a small amount of work and resources. In many projects small teams of experts will be working with very little funding. So the CWRC Writer here shows you how you edit materials. On the right you see TEI markup. You can edit this and other aspects – entities, RDF open annotation mark up etc.; notations allow you to construct triples from within the editor. One of the ways to encourage interoperability is through use of common entities – connecting your work to the world of linked data. The idea is to increase consistency across projects: TEI markup and RDF mean better metadata than the standard approach many use of working in Word and publishing in HTML. So this is a flexible tool. Embedding this in a repository does raise questions about revisioning and archiving though.
One of the challenges for repositories and DH is how we handle those ideas. Ultimately though we think this sort of tool can broaden participation in DH and collaboration in DH content. I think the converse challenge for DH is to work on more generalised environments to make sure that work can be interoperable. So we need to take something from solid and stable structure and move to the idea of shared materials – a porous silo maybe – where we can be specific to our work but share and collaborate with others.
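To make the idea of common entities and triples a little more concrete, here is a small Python sketch that reads `persName` elements from a TEI fragment and emits simple subject/predicate/object triples. This is not how CWRC Writer works internally; it is an illustration only. The sample text, the `example.org` entity URIs and the `mentions` predicate are all invented, while the TEI namespace and the `rdfs:label` URI are the standard ones.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
# Hypothetical predicate, invented for this example.
MENTIONS = "http://example.org/vocab/mentions"

# Invented TEI fragment; the ref attributes point at made-up entity URIs.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p><persName ref="http://example.org/entity/author-1">An Author</persName>
       wrote to
       <persName ref="http://example.org/entity/editor-1">An Editor</persName>.</p>
  </body></text>
</TEI>"""


def entity_triples(tei_xml, document_uri):
    """Turn each persName that carries a ref into two triples."""
    root = ET.fromstring(tei_xml)
    triples = []
    for node in root.iterfind(".//tei:persName", TEI_NS):
        ref = node.get("ref")
        if not ref:
            continue  # no shared identifier, so nothing to link to
        triples.append((document_uri, MENTIONS, ref))
        triples.append((ref, RDFS_LABEL, (node.text or "").strip()))
    return triples


for triple in entity_triples(SAMPLE_TEI, "http://example.org/doc/letter-1"):
    print(triple)
```

Because two projects that tag the same person would point at the same entity URI, their triples can be merged without any further matching, which is the interoperability gain described above.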

The final speaker is Doug, who became the first digital curator at NYPL. He’s currently editing the music of the month blog at NYPL. Doug: the main thing we are doing is completely reconfiguring our repository to allow annotation of Fedora and take in a lot of audio and video content, particularly for large born digital collections. We’ve just started working with a company called BrightCove to share some of our materials. Actually we are hiring an engineer to design the interface for that – get in touch. We are also working on improved display interfaces. Right now it’s all about the idea of the gallery – the idea was that it would self-sustain through selling prints. We are moving to a model where you can still view those collections but also archival materials. We did a week long code sprint with DH developers to extend the Internet Archive book reader. We have since decided to move from that to a New York Times backed reader – the NYT doc viewer with OCR and annotation there.

Q&A

Q1) I was interested in what you said about the CWRC Writer – you said you wanted to record every key stroke. Have you thought about SVN or Git, which do all that versioning stuff already?

A1 – Susan) They are great tools for version control and it would be fascinating to do that. But do you put your dev money into that or do you try to meet needs of greatest number of projects? But we would definitely look in that direction to look at challenges of versioning introduced in dynamic online production environments.

 

Posted July 11, 2012 at 12:27 pm in LiveBlog, Updates: P4B: Shared Repository Services and Infrastructure LiveBlog
Jul 10, 2012

There are two fantastic ways you can use your smartphone at Open Repositories this week!

Microsoft Research have put together a great little Windows Phone 7 app for frequent conference goers and throwers: My Conference. It pulls in conference information and puts it into a convenient, dare I say attractive, interface. Browse events, tunnel deeper to learn about the delegates, and read what they’ve submitted to earn a slot at the conference. You can also see their other publications via Microsoft Academic Search. And what would any conference app be without a quick game of Guess Who? Here’s a little walkthrough and showoff video using the OR2012 programme.

[Embedded YouTube video]

You’ll probably see a bunch of pixelated little squares around the conference’s paper programme. You should put those QR codes to use, linking to equivalent pages online and freeing all that information from your printout. You can also use our custom map and read abstracts this way. Watch the video to find out how.

[Embedded YouTube video]
Posted July 10, 2012 at 11:47 am in Updates: Two Fantastic Ways to use your phone at Open Repositories 2012
Jul 10, 2012

As Day Two workshops get underway we thought we’d take a look through the tweets and updates from Day One and share the highlights. This is the first set of updates; we’ll be adding to it from the tweets and comments throughout the day and after the event so do add links and comments here and come back to take a look…

After registration we realised it had taken a wee while for people to spot the 8GB memory stick hidden inside their delegate badges. Thankfully @williamjnixon was on hand with an explanation of how to use them – “The 8gb flash drive just swivels out of the badge”. Yes, it is really that easy 😉

The DSpace Committers meeting seemed to go really well – lots of interesting stuff raised according to those we’ve been chatting to this morning. If you were along and would like to share your notes/thoughts just let us know!

The Islandora workshop introduced Islandora to lots of folk who didn’t previously know much about it. And @jjtuttle was impressed to see “the #DiscGarden #Islandora video solution pack does reencoding using ffmpeg to generate access files. We want that.”

The Open Access Index workshop described establishing a way to “measure openness of research. What factors should it consider?” (via @openscience). To gather responses they have set up a survey here – do fill it in.

The DCC workshop, Institutional Repositories & Data – Roles and Responsibilities, highlighted that Research Data Management is “a relay race, pass the baton at key point in the cycle” (via @wrap_ed). @informnivore tweeted Jared Lyle’s take on data curation challenges: “formats, metadata, privacy, and training”. The ICPSR work was of lots of interest. The full results of the recent ICPSR study will be available here (via @sjDCC) and a handy tip from the workshop: “ICPSR has an anonymizer tool for social science data”. One of the more interesting questions raised here was “what the role of funding agencies in data preservation & curation?” (via @informnivore). Breakouts included researcher workflows, institutional responsibilities, and IR limitations (via @pcastromartin). Concerns over the latter included “limited qualifications of library staff to deal with domain-specific metadata”. Apparently it “took us an hour or so but ‘the’ question has come up. So ‘what is research data?'” (via @wrap_ed).

At the text mining workshop @CriticalSteph reported back so regularly she got banned by Twitter for the day! But before the ban she shared news of Argo, which has “the aim is to be a community resource of a complete framework of text mining”.

The Repositories Support Project workshop: Building a national network kicked off with an overview from Balviar Notay of JISC’s work with repositories that left the crowd wanting more and particularly interested in the JISC Elevator and UK RepositoryNet+. And @llordllama wondered “Does the JISC elevator sound like the one in Are You Being Served? That would be neat.” Jackie Wickham spoke on RSP but also on Sherpa as “Congratulations go to Bill Hubbard, now a very recent dad which has trumped attending #or2012”! (via @williamjnixon). OpenDOAR, Sherpa/Romeo and Juliet were all well recognised by the crowd, even those from overseas, and “66% of publishers listed on Romeo allow some form of repository archiving. That figure’s been stable for half a decade” (via @llordllama). And one audience member suggested that “adding ‘Article Processing Charges’ (APCs) to Romeo would be a v useful addition”. A great fact from Jackie’s talk on RSP (the website for which was launched in 2006): “UK only second to US in number of repositories” – “on OpenDOAR the 9.5% of the institutional repositories are from the UK” (via @RepoSupport). “RSP has been busy with over 1300 delegates from 200+ organisations to events and 90+ consultancy visits” and have an embeddedness self-assessment tool (via @williamjnixon and @nancypontika). Marie Cairney talked about “the evolution of Enlighten from its antecedents in JISC FAIR and DAEDALUS”. Now @uofglibrary has two separate repositories: one for published papers (Enlighten) and a second for theses. This led to discussion of deposit policy and of the Glasgow publications policy, a “mix of metadata, full text and use of address”. See also: Building a national network – Nick Sheppard’s excellent liveblog of the RSP session: http://ukcorr.org/2012/07/09/building-a-national-network/

And finally…

The DevCSI Developer Challenge is still looking for your fantastic ideas! Add them here or go and say hello in the Developer Lounge on the 1st floor of Appleton Tower (just near the lifts).

Recommended by the Tweetosphere

Get the picture?

 

Posted July 10, 2012 at 8:33 am in Updates: Highlights of OR2012 Day One!
Jul 06, 2012

The wait is nearly over! We are just a weekend away from the start of Open Repositories 2012 and a warm welcome from us all here in Edinburgh awaits. There are over 430 delegates coming to OR2012. It’s a packed programme and we know it will be a busy (and exhilarating) week.

Workshops
We have arranged 14 workshops in total on Monday and Tuesday morning, many of which are now fully booked. We would be grateful if you could check your booking and, if you are no longer able to attend, notify us or cancel the booking yourself as there are waiting lists for several workshops now.

There will also be an opportunity for the repository user group communities (DSpace, Fedora and EPrints) to share their latest developments and work. These user group sessions are open to all delegates and offer an excellent opportunity to find out more.

This year we have introduced a third “Repository Fringe” strand, based on the very successful “Repository Fringe” hosted here at the University of Edinburgh. This includes a wide range of Pecha Kucha presentations which promise to be lively. There’s also an opportunity to contribute to the Open Access Index project and to learn more from Ipsos Mori about online survey tools.

Registration
Registration will take place in Appleton Tower (see campus map). The registration desk will open on Monday 9 July at 8.30 am and will be available throughout the conference. A separate conference office will also be available to deal with any further enquiries you may have.

For the latest details please refer to the online programme.

A printed programme and delegate list will be provided upon registration.

Weather
We have organised wall-to-wall sunshine for the week starting Monday 9 July but have yet to identify a delivery mechanism. We shall work on this over the weekend!! In the meantime check the forecast here and it may be prudent to pack an umbrella, it is summer after all!

Travel and Accommodation
Details about travel and the conference accommodation can be found on the conference website – click on the Registration link in the menu above to access the relevant pages.

If you haven’t booked accommodation yet please refer to our Accommodation information.  Note: The cost of accommodation is NOT included in the registration fee.

Venues
The conference will take place on the George Square campus at the University of Edinburgh. Opening and closing sessions will take place in George Square Lecture Theatre. All other sessions will take place in Appleton Tower (see central area maps [PDF] and also our OR2012 Google Map of the venues).

For further information about the conference location please refer to: http://or2012.ed.ac.uk/location/

Speakers and Session Chairs
If you are a speaker and haven’t yet sent us your presentations please do, as it will really assist the smooth running of the conference. Further guidance about timings, set-up etc is available elsewhere on the conference site for speakers and session chairs.

Network Access and Eduroam
There is wifi throughout the George Square campus through two routes. Users of either wifi option should be aware of the University of Edinburgh Computing Regulations.

Eduroam is available and accessible throughout the buildings so Eduroam users should be able to login with their usual details. You may need to set this up at your own institution before arriving.

Free University of Edinburgh wifi guest accounts will be available for OR2012 – please ask at the registration desk for more information and your guest login details.

If you have a mobile device like a tablet or smartphone, guidance is available from the University of Edinburgh.

Social Media and Recording
We will be recording, blogging, tweeting, using Crowdvine and other exciting social media tools throughout Open Repositories this year. We hope that you’ll join in the fun so, if you are curious about any of these tools but haven’t used them before, we’d like to help you get started. We have put together a Beginners Guide to Social Media for OR2012.

Have a look and please do leave a question or comment – or email them to: Nicola.Osborne@ed.ac.uk. We are also looking for live bloggers so contact Nicola to volunteer and be part of our social media mix.

#OR2012 is the perfect time to take the plunge with Twitter and we recommend following the conference’s official Twitter account @OpenRepos2012 for all the latest news and breaking action.

Lunches and snacks
Lunches, coffee, teas and snacks will be provided to all delegates each day at the conference. There will be coffee breaks available during the workshops.

Sponsors
Open Repositories 2012 would not be possible without the sponsors, supporters, collaborators and organisers that enable us to make this both a highly useful and very enjoyable event and we would like to take the opportunity to thank them. Find out more about our sponsors here. Many of the sponsors will be exhibitors in the concourse in Appleton Tower during the week, so drop by and say hello.

Social Events
The social events are included in your registration fee.

There will be a Drinks Reception in the Playfair Library on the evening of Tuesday 10 July (6pm – 8pm). This will be opened by the Depute Lord Provost of Edinburgh, Deidre Brock, with a reply from Professor Jeff Haywood, Vice-Principal of Knowledge Management & Chief Information Officer, University of Edinburgh.

Please note that canapés will be served at the Drinks Reception and as such delegates are advised to make their own dinner arrangements. There is a wide range of restaurants to suit all tastes and budgets in the vicinity.

On Wednesday evening (11 July) there will be a conference dinner and a Ceilidh at the National Museum of Scotland. Drinks will be served at 7 pm with dinner at 8pm and dancing until just before midnight, if you can stand the pace! We are delighted that Dr John Howard, Chair of the Steering Committee has agreed to be our Master of Ceremonies and to announce the winners of the Developer’s Challenge, after their Show and Tell earlier in the evening.

An invitation for dinner will be in your registration pack. If you don’t plan on coming to the conference dinner we would appreciate it if you handed the dinner invitation back to us at registration. This will assist us with numbers.

Anything else? Need help?
Contact us by e-mail at or2012@ed.ac.uk or on Twitter using #or2012info and we will do our best to help. We look forward to seeing you in Edinburgh.

Kevin Ashley
Chair of OR2012 Programme Committee

Posted July 6, 2012 at 5:17 pm in Delegates, Updates: Just a few days to go…
Jun 15, 2012
Image of Lord Kitchener Your Country Needs You Poster.

"Bletchley Park - Block B - The Bletchley Park Story - Your Country Needs You - poster" by Flickr user Elliott Brown / ell brown

OR2012 kicks off in just three weeks’ time and as the date approaches we will be getting prepared to ensure you have a fantastic experience at the event, whether attending in person or taking an interest from further afield.

We will be live blogging all of the keynotes and many more presentations throughout the week as well as tweeting key updates and information for delegates and interested #OR2012 hashtag watchers. Although we will have some bloggers, tweeters and other dedicated social media amplifiers along for the week we also need your help!

Blogging
If you are interested in taking part in the blogging around the event – either on the OR2012 site or from the comfort of your own blog – then do get in touch. We are particularly keen to hear from those who might be able and interested to summarise workshops and user group meetings.

Tweeting
You may already be aware of our hashtag, #OR2012, and we welcome your comments, updates and discussion here. Twitter users might also like to sign up to our lanyrd page to start meeting fellow tweeting delegates. We will also be grabbing the most interesting tweets, blog posts, and other web coverage for a special Storify that we will make available after the event so do let us know your highlights!

Photographs
If you are planning on bringing your camera along to the event then we would love to see your images and have set up a Flickr Group for these. Do get in touch with your Flickr username if you would like to be added to the group.

And finally if you haven’t had a look already now is a perfect time to take a look at the OR2012 Crowdvine and start introducing yourself to your fellow delegates. It’s a great place to start getting an idea of shared interests, set up chats and meetings around the event and just get to know some friendly faces.

If you have any questions about any of the above or think we’ve missed something important out please do leave comments here or get in touch with Nicola Osborne (nicola.osborne@ed.ac.uk).

Posted June 15, 2012 at 1:11 pm in Updates