Hackathon winner!

MoRe Quality by Vangelis Banos (Future Library – Greece) is the winning prototype application of the LoCloud hackathon, which took place on 11 February 2015 at the premises of the Google Cultural Institute in Paris, France. The hackathon was organised by LoCloud and Europeana in the context of EuropeanaTech 2015.

Vangelis Banos, winner of the LoCloud Hackathon

Concept

The Metadata & Object Repository (MoRe) is an easy-to-use and powerful tool for aggregating information and harvesting metadata from multiple sources in multiple schemas. Aggregating across many sources and schemas often leads to problems with the quality of the harvested metadata.

Metadata records may pass the standard Europeana XML validity tests and still contain problematic values. For instance:

* a dc:date value could be formatted in the wrong way:

<dc:date>approximately 18th century</dc:date>

This value does not follow any established date format.

* an author name could be incomplete according to bibliographic standards.

Example: <dc:creator>Mike</dc:creator>.

* a URL may be invalid. E.g.: <ese:isShownAt>http://invalidurl.com/error-url</ese:isShownAt>

The aim of the MoRe Quality tool is to implement a validation system that catches these errors and produces useful reports for collection administrators.
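To illustrate why structural validation alone is not enough, here is a minimal sketch in Python 3 using only the standard library. It is not taken from the MoRe Quality source; the record and the `extract_values` helper are made up for illustration. The record parses without error, yet the values it yields are the problematic ones described above.

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace. The record below is a made-up example.
DC_NS = "http://purl.org/dc/elements/1.1/"

RECORD = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:date>approximately 18th century</dc:date>
  <dc:creator>Mike</dc:creator>
</record>"""


def extract_values(xml_text):
    """Return (local_name, text) pairs for every Dublin Core element."""
    # fromstring raises ParseError only when the XML is malformed;
    # it says nothing about whether the values make sense.
    root = ET.fromstring(xml_text)
    pairs = []
    for element in root.iter():
        if element.tag.startswith("{" + DC_NS + "}"):
            local_name = element.tag.split("}", 1)[1]
            pairs.append((local_name, (element.text or "").strip()))
    return pairs


values = extract_values(RECORD)
print(values)
```

Each extracted pair can then be handed to a value-level check, which is the part a schema validator does not cover.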

MoRe Quality application – The prototype functionality

MoRe Quality communicates with the LoCloud MoRe instance using a user-specific API key, retrieves the ingested metadata and evaluates it to identify common errors such as:

* Invalid date formats (ISO 8601 standard)

* Invalid hyperlinks

* Invalid language codes (ISO 639 standard)

* Invalid author names

The results are presented to the user in a simple report.
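The four checks and the report could look roughly like the sketch below. It is an illustration under stated assumptions, not the prototype's code: it is written for Python 3 with the standard library only, the date check accepts just a small subset of ISO 8601, the language list is a tiny sample rather than the full ISO 639 set, the author rule is a hypothetical heuristic, and the URL check is purely syntactic (a well-formed but dead link would need an actual HTTP request to detect). All function and field names are invented for the example.

```python
import re
from datetime import date
from urllib.parse import urlparse

# Assumption: a tiny sample of ISO 639-1 codes, not the full standard.
SAMPLE_LANGUAGE_CODES = {"el", "en", "fr", "de", "es", "it"}

# Accepts only YYYY, YYYY-MM and YYYY-MM-DD, a small subset of ISO 8601.
DATE_PATTERN = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$")


def is_valid_date(value):
    match = DATE_PATTERN.match(value.strip())
    if not match:
        return False
    year, month, day = match.groups()
    try:
        # Rejects impossible dates such as 2015-02-30.
        date(int(year), int(month or 1), int(day or 1))
    except ValueError:
        return False
    return True


def is_valid_url(value):
    """Syntactic check only: no network request is made."""
    parts = urlparse(value.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def is_valid_language(value):
    return value.strip().lower() in SAMPLE_LANGUAGE_CODES


def is_valid_author(value):
    """Hypothetical heuristic: expect at least two name parts."""
    return len(value.split()) >= 2


# Hypothetical mapping from field name to the check applied to it.
CHECKS = {
    "date": is_valid_date,
    "isShownAt": is_valid_url,
    "language": is_valid_language,
    "creator": is_valid_author,
}


def build_report(fields):
    """Return one message per (field, value) pair that fails its check."""
    problems = []
    for name, value in fields:
        check = CHECKS.get(name)
        if check is not None and not check(value):
            problems.append("%s: invalid value %r" % (name, value))
    return problems


report = build_report([
    ("date", "approximately 18th century"),
    ("date", "1821-03-25"),
    ("creator", "Mike"),
    ("language", "xx"),
    ("isShownAt", "not a url"),
])
for line in report:
    print(line)
```

Only the failing values appear in the report, so the well-formed date `1821-03-25` is silently accepted.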

The application is designed so that developers can easily add extra evaluation rules by implementing simple functions that act as plugins.
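One common way to support function-based plugins is a small registry filled by a decorator. The sketch below shows that pattern in Python 3; it is an assumption about how such a mechanism might look, not a description of the actual MoRe Quality plugin interface. The `RULES` registry, the `rule` decorator and both example rules are hypothetical.

```python
# Hypothetical registry: maps a rule name to a function that returns an
# error message for a bad record, or None when the record is acceptable.
RULES = {}


def rule(name):
    """Decorator that registers an evaluation function under a name."""
    def register(func):
        RULES[name] = func
        return func
    return register


@rule("non-empty title")
def check_title(record):
    if not record.get("title", "").strip():
        return "title is missing or empty"
    return None


@rule("rights statement present")
def check_rights(record):
    if "rights" not in record:
        return "no rights statement"
    return None


def evaluate(record):
    """Run every registered rule and collect (rule name, message) pairs."""
    results = []
    for name, func in RULES.items():
        message = func(record)
        if message is not None:
            results.append((name, message))
    return results


print(evaluate({"title": "  ", "creator": "A. Author"}))
```

With this design, adding a rule means writing one decorated function; the evaluation loop does not need to change.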

Technical information

MoRe Quality is implemented using Linux and Python 2.7.

Some common Python modules and tools are used:

* Virtual environments

* Flask

* Python Requests

* BeautifulSoup4

* pycountry

* iso8601

The prototype is not currently running on a production server, but the full source code is freely available at: https://bitbucket.org/vbanos/more-quality/

Anyone interested in MoRe Quality should feel free to contact the author for more information.
