The Registry of Research Data Repositories (re3data.org) hosts a database of over 900 data repositories that cover “all academic disciplines.” The Registry is funded by the German Research Foundation, and its partner institutions are all German institutes. However, to be included, a repository must have an English GUI on its website. Suggestions for other repositories to include are being solicited.

Their schema is published and comments can be added to that schema until Oct. 20, 2014.

The schema treats archaeology oddly as a subject. There is a value tree for Humanities/Ancient Cultures/Prehistory/ and Classical Archaeology. There is also a separate entry for Egyptology and Ancient Near Eastern Studies.

The Archaeology Data Service (ADS) and the Digital Archaeological Record (tDAR) get listed under the following:

  • Ancient Cultures
  • Classical Archaeology
  • History
  • Humanities
  • Humanities and Social Sciences

which strikes me as odd since tDAR, for example, is overwhelmingly non-Classical archaeology. Approaching this from an anthropological perspective would have one browsing through the ‘A’s for ‘archaeology’ or ‘anthropology’ and not finding either (anthropology is under E for Evolution-Anthropology). You can search for ‘archaeology’ in the search engine (you need the second ‘a’) and get 10 results, including the two listed above but also

The re3data.org site is supposed to merge with another index of repositories, Databib, and to be controlled by a third organization, DataCite, sometime next year.

What I will be reading during the long Thanksgiving weekend.

OCHRE: An Online Cultural and Historical Research Environment
by J. David Schloen and Sandra R. Schloen
November 2012
This book describes an “Online Cultural and Historical Research Environment” (OCHRE) in which scholars can record, integrate, analyze, publish, and preserve their data. OCHRE is a multiproject, multi-user database system that provides a comprehensive framework for diverse kinds of information at all stages of research. It can be used for initial data acquisition and storage; for data querying and analysis; for data presentation and publication; and for long-term archiving and curation of data. The OCHRE system was designed by the co-authors of this book, David Schloen and Sandra Schloen. The software for it was written by Sandra Schloen.

The Cultural Heritage Informatics Initiative at Michigan State University is hosting a field school for Cultural Heritage Informatics (CHI) from May 31st to July 1st, 2011.

The CHI Fieldschool is a unique experience that employs the model of an archaeological fieldschool (in which students come together for a period of 5 or 6 weeks to work on an archaeological site in order to learn how to do archaeology).  Instead of working on an archaeological site, however, students in the CHI Fieldschool will come together to collaboratively work on several cultural heritage informatics projects.  In the process they will learn a great deal about what it takes to build applications and digital user experiences that serve the domain of cultural heritage – skills such as programming, media design, project management, user centered design, digital storytelling, etc.

Most archaeologists I know see the end result of their research as a publication. More and more often we are seeing the integration (or at least the desire for integration) of electronic data into published accounts. Very few see the need to apply user-centered design to the presentation of their data. Field schools like this one help us meet in the middle. Knowing how people present their data can help create good data collection policies in the field.

tDAR mentioned a newly awarded NSF grant to principal investigators Keith Kintigh and K. Selçuk Candan for a proposal titled “One Size Does Not Fit All: Empowering the User with User-Driven Integration.”

I am excited by this development as it addresses the key problem with archiving and curating excavation material with any intent to redistribute:

Data and knowledge integration are costly processes. Consequently, most existing solutions rely on a one-size-fits-all approach, where the data are integrated upfront and then the integrated data or knowledge-bases are used as is. Such snapshot-based integration solutions, however, cannot be effectively applied when the data sources are autonomous and dynamic or when, as in most scientific and decision making applications, assumptions, beliefs, and knowledge of the domain experts are indispensable to the integration process.

There are so many differences between one field project and another that it creates huge problems for anyone interested in cross-project analysis based on digital data. I have worked on projects with the same director and essentially the same database, but the field techniques and analytical differences in processing were so broad that we could not easily compare the data between the two. This becomes even more difficult, and possibly insurmountable, when trying to create an archival data repository from the digital remains of completed projects.

What excites me about this project is its explicit relationship to tDAR:

In particular, UDI will be incorporated into the NSF-funded tDAR (the Digital Archaeological Record), which has the potential to transform archaeology’s scientific endeavors by enormously advancing the capacity for synthetic research. The investigation of fundamental information integration challenges will thus contribute substantially to a shared infrastructure of science and will enable crucial transdisciplinary research concerning complex systems.

Another reason I am excited by this project is that its process is geared towards the research needs of a professional audience that has to map its various back-end systems together, rather than towards a promise that a portal to the various data sets will yield instant results. Much more work is necessary on back-end systems like this to make the examination of raw archaeological data across multiple projects fruitful. I just hope that the results are understandable by mere mortals.

On January 18, 2011, the National Science Foundation (NSF) will change its requirements for proposals for the Archaeology awards. All proposals will have to include a “Data Management Plan” describing how the project will conform to the NSF’s data sharing policy.

Beginning January 18, 2011, proposals submitted to NSF must include a supplementary document of no more than two pages labeled “Data Management Plan” (DMP). This supplementary document should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results. Proposals that do not include a DMP will not be able to be submitted.

Some of this is not new. Proposals for NSF awards in Archaeology have required a one-page plan for data access since 2005. One of the projects I work for, SANAP (Southern Albania Neolithic Archaeological Project), run by Susan Allen of UC’s Anthropology department and Ilir Gjipali of the Institute of Archaeology in Albania, submitted such a plan for their NSF grant in 2009. But this new requirement formalizes the procedure a bit.

There is one very important difference regarding how non-compliant proposals will be handled under the new plan. Currently when an archaeology proposal lacks a data management plan, the application is accepted by NSF and the Program Director contacts the Principal Investigator and requests that he/she submit via Fastlane an updated Supplementary Documents section which contains the plan. Under the new system, a proposal which does not conform to this requirement will not be able to be submitted to the Foundation.

Looking through their FAQ page on Data Management and Sharing, it looks like there is quite a bit of room for project-specific plans. Such terms as ‘reasonable procedures’ and ‘reasonable length of time’ are left to be decided by “the community of interest through the process of peer review and program management.”

The Data Management Plan is meant to address more than just observational data. It is also meant to cover samples and physical objects. And the data doesn’t have to be digital. You can record your entire project on paper and simply plan to make that paper available for scholarly review later on. But most archaeological projects that I know use a combination of paper and electronic records, often of duplicate data. And if you have two sets of data, one analog and one digital, that doubles the complexity (and cost) of archiving your information when the project is done.