Bill Caraher, who blogs at The New Archaeology of the Mediterranean World, posted a comment to my New Math post. Since the gist of the New Math post was a quote by Richard Rothaus, not me, I wasn’t really prepared to answer his comment immediately. But I thought that his comment on the cost of tech vs. labor deserved its own post. So I am elevating it to an article and have commented on it below.

John,

That comment caused me to think a good bit as well! I keep trying to think of ways to factor in the true cost of digital archaeology.  For example, while I agree that converting paper to digital is a hassle, it is nevertheless a hassle that can be accomplished back in the US by inexpensive labor (e.g. graduate students, spouses, faculty free time). Digital data capture in the field (and you know that I am a huge fan of a digital archaeological workflow) may take additional time in the field (Richard even admits to this).  Time in country is expensive.

Moreover, technology is (compared to paper) expensive.

Person hours, from the perspective of an academic archaeology (and Richard has “gone pro”) are cheap.

So the equation is more complex (as you know).

I’m pretty excited to read your blog and learn how you came up with such clever processes!

Bill

Bill,

Thanks for the comment.

I find that expense, as it relates to digital vs. paper, is directly related to the process, not the goal.

I have seen many examples of time wasted in the field by moving the exact same notes from one paper medium to another: from, say, an informal field notebook to a ‘clean’ representation of the same data on a form, then into a database, and finally printed out for a more permanent ‘bound’ notebook. That is an extreme example, but this happens with every bit of each process: photo tagging, data entry, matrix building, final report writing. And the larger the team, the greater the number of person-hours spent redoing the same work.

I have also seen a progression over the years. At Troy everything was written down and selected bits were entered into the computer (films scanned, drawings scanned, notebooks and field forms scanned, finds quantified and entered), and all the Troy grad students did during the academic year was catch up on this work. My experience on survey in Albania was different, governed by an “everything digital before we leave the site” principle. But there were still some data entry and data check projects done by students after the field season. Now I am on the brink of having a project with complete “born digital” data, and our students get to spend their winter helping with the research and analytical tasks instead of remedial data entry and scanning.

I don’t really embrace the “technology is expensive” argument. Some tech is expensive; some is not. Some tech is expensive and doesn’t contribute much. Some is cheap and can change everything about the way you work.

So I hope to expand on this complex equation in the course of this blog.

John

 

Harris matrices are notoriously difficult to handle electronically. Several projects that I have seen have something similar to this graph on their excavation forms:

Relationship area of PARP:PS SU form

Recording the associations of SUs looks clear enough in this example, but it is usually a disaster in practice unless one is extremely particular about reviewing old forms. Suppose an SU sheet is filled out for 13040 (trench 13, SU 40) one day, and SU 13050 is opened two days later and found to be the same stratum. It is easy, while filling out the form for 50, to say that it is the same as 40 (filling in the center boxes with 13050=13040). But how often will anyone go back to the form for 40 and enter that 13040=13050? I find that this is missed very often on the forms, and the omission gets transferred to the database. This, in turn, ends up being one of the real difficulties in researching the material. The seemingly complete form masks the inaccurate data behind it.

I have tried to make the database fix this problem. In one recent database, I had it take the relationship mentioned above, turn it into an equation (13040=13050), and look for the inverse statement. If the inverse statement is not found, the database adds it (13050=13040) automatically. This also works for later and earlier SUs. If the form for 40 says that it is later than 41, the database records this as (13040 > 13041). If that is so, it has to create a record that says (13041 < 13040). But since 40=50, I also have to make records that say (13050 > 13041) and (13041 < 13050). Needless to say, this is a lot of development and it isn’t quite rock solid. It is difficult to work around typos, edits, and record deletion. I will post an example of that at some point.
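To make the logic above concrete, here is a minimal sketch in Python. It is not the database scripting used on the project, only an illustration under some assumptions: each relationship is stored as a tuple of (SU, operator, SU), where `=` means "same as", `>` means "later than", and `<` means "earlier than". The function name `complete_relationships` and the tuple layout are hypothetical. The sketch groups SUs that have been declared equal, then writes out every statement and its inverse for every member of each group. It does not try to handle typos, deleted records, or contradictory entries.

```python
from itertools import product

# Inverse of each operator: "=" is its own inverse, ">" and "<" swap.
INVERSE = {"=": "=", ">": "<", "<": ">"}


def complete_relationships(statements):
    """Return the recorded statements plus all implied inverse and
    equality-propagated statements.

    `statements` is an iterable of (su_a, op, su_b) tuples (an assumed layout).
    """
    parent = {}  # union-find structure grouping SUs declared equal

    def find(su):
        parent.setdefault(su, su)
        while parent[su] != su:
            parent[su] = parent[parent[su]]  # path compression
            su = parent[su]
        return su

    # Pass 1: register every SU and merge the ones joined by "=".
    for a, op, b in statements:
        root_a, root_b = find(a), find(b)
        if op == "=":
            parent[root_a] = root_b

    # Collect the members of each equality group.
    groups = {}
    for su in list(parent):
        groups.setdefault(find(su), set()).add(su)

    # Pass 2: restate each relationship for every pair of group members,
    # and add the inverse statement each time.
    completed = set()
    for a, op, b in statements:
        for x, y in product(groups[find(a)], groups[find(b)]):
            if x != y:
                completed.add((x, op, y))
                completed.add((y, INVERSE[op], x))
    return completed


# The example from the text: the form for 50 says 50 = 40,
# and the form for 40 says 40 is later than 41.
recorded = [(13050, "=", 13040), (13040, ">", 13041)]
missing = complete_relationships(recorded) - set(recorded)
for statement in sorted(missing):
    print(statement)
```

For the two recorded statements, the sketch adds the four that the forms left out: 13040=13050, 13041 < 13040, 13050 > 13041, and 13041 < 13050. Comparing the completed set against what was actually recorded is also a quick way to list the forms that need a second look.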

At PARP:PS, we also had the trench supervisors supply us with a trench-wide matrix at the end of the season. This would be supplied in a number of ways, from colored pencil on grid paper to Photoshop files. Notes on matrix information were kept in the trench supervisors’ paper notebooks, and the trench-wide matrix was usually done only at the end of the season. If a specialist (mostly ceramic) needed the matrix information during the season, the trench supervisor would cobble together their notes and make a hand-sketched matrix of just the specific SUs under consideration.

In 2009, we tried to standardize on using Omnigraffle for electronic matrices. Omnigraffle is rather expensive (59.95 USD for standard academic version, 119.95 USD for the Professional academic version), and we needed a license for each machine. Plus it is Mac only and the personal computers on our project were 50% Windows.

In 2010 we introduced the iPads to the trenches and added Omnigraffle for iPad to the software used. The cost suddenly got better. Although Omnigraffle for iPad is 49.99 USD, according to the App Store licensing structure we can put it on five iPads at a time. We then need only one desktop license to clean up and edit the matrices at the end of the season.

When the trench supervisor created a new SU, they simply switched to Omnigraffle and made the rectangle for the new SU. They could associate, and edit associations whenever they wanted to. Since the iPads are backed up twice every day, the matrices could be looked at by the director, and the latest versions could be delivered to the specialists at any time.

Making matrices on the iPad was one of the easiest things for the teams to pick up, and we trained them in the field using a short tutorial designed to have them draw a matrix from Dr. Harris’ own book on the subject. That tutorial is attached here in both pdf and doc format, should you choose to edit it for your own training needs.

Omnigraffle Tutorial.pdf

Omnigraffle Tutorial.doc

In order to establish a baseline for conversation, I should outline what it is that I do. Some of this might be obvious but it should be made explicit.

I work in a US university on Mediterranean archaeological projects. This means several things:
  1. For fieldwork we carry on a plane almost everything that we use. We can certainly find some stuff locally, especially the basic tools and excavation equipment. But we also carry bags, tags, and lots of specialty items that are sometimes either hard to get or just plain cheaper to buy in the US. We carry all of our tech equipment (including computing hardware, total stations, rods, and tripods) all of which needs to be hand carried. Peripherals are kept to a minimum.
  2. We often aren’t guaranteed decent Internet access. Even at Pompeii, using wireless cards, it is difficult to get a signal. Other times we are working in isolated areas without any hope of any type of net access.
  3. We have access only to portable computers. No desktop machines with large hard drives. No servers (in the physical sense, I mean), and no decent sized monitors.
  4. I work for other projects outside of UC, but I only work for academic projects. There is quite a difference between government run archaeology, CRM archaeology, and academic archaeology. Academic archaeologists tend to tailor their data gathering techniques to their own research design. They also don’t tend to write reports in the CRM fashion, but are focused solely on academic peer reviewed publishing. Academic field projects also don’t have a centralized data management scheme. There have been attempts over the years to standardize data collected in the field, but most projects still create a whole new documentation and database scheme for each project. Even several field projects in the same academic department have different storage and data sharing mechanisms for inter-project communication and research. Again, this is very unlike CRM archaeology, with which I have a passing familiarity, although I have never worked for a CRM firm.
  5. There is a constant stream of new personnel on the projects. Almost all are graduate students and almost all end up actually ‘working’ for a project for a maximum of five years. Some only work for any given project for one or two seasons. So there is a lot of training that we have to do for each project, every year. To give a sense of perspective to this: UC’s participation in the current Troia Project began in 1988. We are still publishing annual reports in the form of Studia Troica from that project, and two major monographs (and one dissertation) are still in the final stages of completion.
These are the circumstances under which I work. Given these circumstances, I have worked out the following strategies:
  1. I don’t code. Maybe a little. I write scripts for databases and computer automation. I code some web pages in PHP and Lasso (for database integration). But making an app is beyond what my time allows. So no custom software. It has to be off-the-shelf.
  2. I don’t work with enterprise-level databases for fieldwork. They require too much in the way of coding and training. I need desktop (or now tablet) apps. Open source is great, but often the projects can afford professional level software with academic pricing.
  3. The software has to be something that I can teach to a beginner in a very short amount of time. Software that a classicist can use. I can train almost anybody to use FileMaker. Seventy-five percent of the field personnel can understand what we need to do with Photoshop in less than an hour. Fifty percent will understand GIS well enough to use it. Unfortunately the number drops considerably when it comes to working with a vector drawing app like Illustrator. And almost no one but specialists can use CAD tools efficiently. I have to plan for that, and identify students who show affinity for one type of software over another.
  4. I do have the luxury of a server. I run an OSX file, web, and database server here at UC. So I can host collaborative software of my choosing without having to work through my university IT people, and non-UC people can have accounts on my server.
Almost all (academic) field projects have one person who can handle most of the software necessary. It is their job to take something complex, like a notation system being used by five team leaders with different academic backgrounds who have been taught that there is only one right way to do something, and produce a clean, consistent data set. That person serves as a developer for the project: someone who will take it upon themselves to tackle the difficult problems so that the others in the field project don’t have to. That person is my audience. In the future, I will be posting samples of our work at PARP:PS. I don’t expect them to be used by themselves, but I will post them so that they can be an example for project developers to adapt for their own work.
I should also note that I recognize that there are many ways to do something correctly. I use FileMaker and Mac-based tools when I can. I use Windows when I have to. But other teams have different tools. If I post some samples that use one method but you want to use another, I invite you to be a part of this learning process and post some samples of your own.

This blog went public before I intended, but right away it produced one of my favorite quotes. Richard Rothaus (via facebook) wrote:

1 paperless field person = 1.5 papered field person

On January 18, 2011, the National Science Foundation (NSF) will change their requirements for proposals for the Archaeology awards. All proposals will have to include a “Data Management Plan” describing how your project will conform to the NSF’s data sharing policy.

Beginning January 18, 2011, proposals submitted to NSF must include a supplementary document of no more than two pages labeled “Data Management Plan” (DMP) .  This supplementary document should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results.  Proposals that do not include a DMP will not be able to be submitted.

Some of this is not new. Proposals for NSF awards in Archaeology have had a one-page plan for data access since 2005. One of the projects I work for, SANAP (Southern Albanian Neolithic Archaeological Project) run by Susan Allen of UC’s Anthropology department and Illir Gjipali of the Institute of Archaeology in Albania, submitted such a plan for their NSF grant in 2009. But this new requirement formalizes the procedure a bit.

There is one very important difference regarding how non-compliant proposals will be handled under the new plan. Currently when an archaeology proposal lacks a data management plan, the application is accepted by NSF and the Program Director contacts the Principal Investigator and requests that he/she submit via Fastlane an updated Supplementary Documents section which contains the plan. Under the new system, a proposal which does not conform to this requirement will not be able to be submitted to the Foundation.

Looking through their FAQ page on Data Management and Sharing, it looks like there is quite a bit of room for project specific plans. Such terms as ‘reasonable procedures’ and ‘reasonable length of time’ are left to be decided by “the community of interest through the process of peer review and program management.”
The Data Management Plan is meant to address more than just observational data. It is meant to cover samples and physical objects as well. And the data doesn’t have to be digital. You can record your entire project on paper and simply plan to make that paper available for scholarly review later on. But most archaeological projects that I know use a combination of paper and electronic records, often of duplicate data. And if you have two sets of data, one analog and one digital, that doubles the complexity (and cost) of archiving your information when the project is done.