What DSpace could learn from IR+

January 27, 2010 at 1:41am

IR+ is the newly launched Open Source Repository platform, originally developed by the University of Rochester.

The university was a long-time DSpace user and had highlighted its struggle to recruit content as early as 2005. I was therefore very surprised to read in Wired Campus that launching a new platform, developed in-house, was one of the key points in its strategy for coping with those past issues.

In the short article on Wired, the claim is made that DSpace lacks two key features: a workspace and easily customizable Researcher Pages.

If you think of these arguments in the context of the sticks vs. carrots debate on promoting open access, this approach clearly works on the "carrot" side: it tries to convince people to adopt open access by offering new features. (The most common "stick" approach is a deposit mandate, or using the repository as the sole source of academic bibliographies for staff promotions.) Both stick and carrot approaches have succeeded and failed in the past. It will be very interesting to see whether Rochester has really hit the "sweet spot" of functionality that researchers are longing for.

Although I didn't actually install and run the IR+ platform, I took the time to go through its manuals to see what DSpace could really learn from IR+, feature-wise.

Here's my list. Please comment, or point out which of these you think would substantially improve DSpace:

Elaborated user account settings


  • Multiple Email Addresses

  • Publication Name Management (especially useful when you get married during the course of your academic career)

  • An overview of accepted licences

Researcher Pages


  • File & Link association

  • Pictures


The researcher pages offer nice "profile" functionality that is far better than what DSpace currently offers, but it lags well behind the state of the art in this field (Facebook, LinkedIn, ResearchGate, ...). Social features (colleague lists, comments, ...) are lacking on the Researcher Pages, as is integration of external services (blogs, Twitter feeds, ...).

User Workspace


  • File and Folder Management

  • Share files with other users

  • Version management: upload new versions for files

  • File locks & Permission mechanism


The implementation of User Workspaces gave me a very Google Docs-like feeling, but without the in-browser editors. Having to upload and download files and manage them locally on a PC could be a show stopper here. But the version management, permissions, locks and sharing system seem well implemented.

Submission


  • Multiple collection selection for one submission

  • File(s) selection first

  • Collection(s) selection last



As opposed to the DSpace submission process, IR+ starts from the file upload, and adds metadata provision afterwards. This would be useful in any approach where some metadata could be automatically generated from the uploaded file.
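To make the "file first" idea concrete, here is a minimal sketch of how a few draft metadata fields could be derived from an uploaded file before the submitter fills in the rest. It is not taken from IR+ or DSpace; the function name `guess_metadata` and the field names `title` and `format` are assumptions made up for this illustration.

```python
import mimetypes
from pathlib import Path


def guess_metadata(filename: str) -> dict:
    """Derive a few draft metadata fields from the file name alone.

    Hypothetical helper: a real platform would probably also inspect
    the file contents, which this sketch does not attempt.
    """
    path = Path(filename)
    mime_type, _ = mimetypes.guess_type(path.name)
    # Turn "open_access-survey.pdf" into "open access survey" as a draft title.
    title = path.stem.replace("_", " ").replace("-", " ").strip()
    return {
        "title": title,
        "format": mime_type or "application/octet-stream",
    }


draft = guess_metadata("open_access-survey.pdf")
print(draft)
```

The submitter would then only have to correct or complete the pre-filled fields instead of typing everything before the upload.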

Also interesting is that the choice of collection to which the item should be submitted comes all the way at the end of the submission process. It's also possible to submit to multiple collections.

The software could be used to implement both a repository and a referatory: it's possible to add a new "publication" without adding a file, adding a hyperlink to an external resource instead.

An embargo system is present, but an embargo applies to a whole item rather than to individual files. If I uploaded a summary presentation about an item alongside the full text, with only the full text meant to be under embargo, the embargo settings would still affect both.
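The difference between an item-wide and a per-file embargo can be sketched as follows. This is only an illustration of the distinction, not how IR+ implements it; the classes `File` and `Item`, the `embargo_until` fields and the `is_downloadable` function are all hypothetical names.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class File:
    name: str
    embargo_until: Optional[date] = None  # hypothetical per-file embargo


@dataclass
class Item:
    title: str
    files: List[File] = field(default_factory=list)
    embargo_until: Optional[date] = None  # item-wide embargo


def is_downloadable(item: Item, f: File, today: date) -> bool:
    """A file is available once both the item-wide embargo and its own embargo have passed."""
    for limit in (item.embargo_until, f.embargo_until):
        if limit is not None and today < limit:
            return False
    return True


slides = File("summary-slides.pdf")
full_text = File("full-text.pdf", embargo_until=date(2011, 1, 1))
item = Item("Example publication", files=[slides, full_text])
today = date(2010, 1, 27)

print(is_downloadable(item, slides, today))     # per-file embargo: slides stay open
print(is_downloadable(item, full_text, today))  # full text is held back
```

With only the item-wide field available, setting `item.embargo_until` would hold back the slides as well, which is the limitation described above.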

Authors and co-authors are defined by text details (name, first name, ...), but don't seem to have an (institutional) identifier. It's not clear whether it's possible to differentiate between co-authors from the institution and external ones.

(Co-)Author entry screen

The way versioning is handled, both in workspaces and for published content, is really interesting. At this point, it isn't really clear whether the platform prevents two researchers from submitting the same publication, or how duplicates are dealt with in general.

In my personal opinion, two main difficulties in DSpace have not been addressed and persist in IR+. First, the data model for published items looks more or less as rigid as the one in DSpace: a hierarchy of collections can contain items, and these items can contain bitstreams.

Second, there is no way to freely relate items to each other. Say some publications are part of the same virtual research project collaboration: apart from a keyword association, or being contained in the same collection, there seems to be no way to relate these items to each other.

Edit: it actually does seem possible to compose relations between items on the researcher page. There, a researcher can construct his own hierarchical structure, independent of the existing hierarchies.
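As a rough sketch of the data model point: the fixed hierarchy (collections contain items, items contain bitstreams) can sit next to a separate store of free, typed relations between items. None of this is actual DSpace or IR+ code; `RelationStore`, `relate`, `related_to` and the relation name `partOfProject` are assumptions invented for the example.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Item:
    item_id: str
    bitstreams: List[str] = field(default_factory=list)


@dataclass
class Collection:
    name: str
    items: List[Item] = field(default_factory=list)
    subcollections: List["Collection"] = field(default_factory=list)


class RelationStore:
    """Hypothetical store of typed links between items, independent of collections."""

    def __init__(self) -> None:
        self._links: Set[Tuple[str, str, str]] = set()

    def relate(self, source: Item, relation: str, target: Item) -> None:
        self._links.add((source.item_id, relation, target.item_id))

    def related_to(self, item: Item, relation: str) -> List[str]:
        """Return the ids of all items that `item` points to via `relation`."""
        return sorted(
            t for s, r, t in self._links if s == item.item_id and r == relation
        )


# Two items living in different collections of the fixed hierarchy.
paper = Item("paper-1", ["paper.pdf"])
dataset = Item("dataset-1", ["data.csv"])
project = Item("project-x")
articles = Collection("Articles", items=[paper])
datasets = Collection("Datasets", items=[dataset])

relations = RelationStore()
relations.relate(paper, "partOfProject", project)
relations.relate(dataset, "partOfProject", project)
print(relations.related_to(paper, "partOfProject"))
```

The relations cut across the collection tree, so the paper and the dataset can belong to the same project without sharing a collection or a keyword.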

There is also no mention of which statistics or reports can be generated or visualized. Apparently, this part of the documentation is still under development, but a two-page handout illustrates that both repository-wide and contributor statistics should be present.

Final verdict

Researcher pages, versioning and the idea of a collaborative workspace could be really useful additions to DSpace, as could a few general tweaks that were implemented in IR+. In the context of DSpace 2 development, I think quite a lot can be learned.

However, the most important limitations of DSpace (from a data model point of view) persist in this platform. Judging by the 2009 work on the DSpace 2 data model, those will be effectively tackled in DSpace 2.

On a very personal level, I think it's unfortunate that the obviously talented developers didn't implement their ideas as a contribution or patch on DSpace.

More about IR+:
http://code.google.com/p/irplus/