Sunday, June 15, 2008

Generic data harvester

Work is underway on a generic harvester for biodiversity data sources (DiGIR, TAPIR, BioCASe, OAI-PMH, and others as they come up, such as LSID and TaxonOccurrence).

The goal of this project is to spare application developers from spending time on data harvesting, harvest scheduling, a console for log display, or basic processing.  The framework is extensible: each protocol resides in a very simple subproject, so new protocols and versions are easy to add, and the UI for parameter input is generated automatically (no need to write JSP for a new protocol!).  Everything is internationalised, including all logs, which come with simple JSON+AJAX based log tailing in the browser.  If we can work with the wrapper providers and offer a single generic solution to TDWG, perhaps we should aim for the stage where data providers are certified to work with the TDWG harvester?
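To make the "one small subproject per protocol, UI generated from its parameters" idea concrete, here is a rough sketch of how such a plug-in contract might look. It is not the harvester's actual code: `HarvestProtocol`, `ParameterSpec`, `DigirProtocol`, `renderForm` and the `harvester.digir.*` message keys are all names invented for this illustration, and the `harvest` method only returns a description instead of contacting a provider.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class HarvesterSketch {

    /** Hypothetical description of one input parameter a protocol needs. */
    static final class ParameterSpec {
        final String name;      // key under which the submitted value is stored
        final String labelKey;  // message key, so the label can be internationalised
        final boolean required;

        ParameterSpec(String name, String labelKey, boolean required) {
            this.name = name;
            this.labelKey = labelKey;
            this.required = required;
        }
    }

    /** Hypothetical contract that each protocol subproject would implement. */
    interface HarvestProtocol {
        String id();
        List<ParameterSpec> parameters();
        String harvest(Map<String, String> values);
    }

    /** Illustrative DiGIR plug-in: it only declares its parameters. */
    static final class DigirProtocol implements HarvestProtocol {
        public String id() {
            return "digir";
        }

        public List<ParameterSpec> parameters() {
            List<ParameterSpec> specs = new ArrayList<ParameterSpec>();
            specs.add(new ParameterSpec("endpoint", "harvester.digir.endpoint", true));
            specs.add(new ParameterSpec("resource", "harvester.digir.resource", false));
            return specs;
        }

        public String harvest(Map<String, String> values) {
            // A real implementation would issue DiGIR requests here;
            // this sketch just reports what it would do.
            return "would harvest " + values.get("endpoint");
        }
    }

    /** Builds a plain HTML form from the declared parameters, so no per-protocol page is written. */
    static String renderForm(HarvestProtocol protocol) {
        StringBuilder html = new StringBuilder();
        html.append("<form id=\"").append(protocol.id()).append("\">\n");
        for (ParameterSpec p : protocol.parameters()) {
            html.append("  <label>").append(p.labelKey).append("</label>")
                .append("<input name=\"").append(p.name).append("\"")
                .append(p.required ? " required" : "")
                .append("/>\n");
        }
        html.append("</form>");
        return html.toString();
    }

    public static void main(String[] args) {
        System.out.println(renderForm(new DigirProtocol()));
    }
}
```

In this sketch, adding a protocol means adding one class that lists its parameters; the form renderer never changes. The label is emitted as a message key rather than literal text, on the assumption that a resource bundle would resolve it per locale.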

The code is almost stable and ready to accept contributors (Java developers)...

(Built on top of AppFuse, Java, Spring, Hibernate, Struts2, Maven and MySQL, soon to be H2)
