'The Curation Costs Exchange unveiled and challenged' by Alex Thirifays

The Curation Costs Exchange (CCEx) was launched at the 4C Conference. The 4C-project introduced it and two guest speakers responded to it from their own perspectives. The presentations were followed by a – very – brief sofa discussion and a crowd bursting with questions, so it goes without saying that these circumstances made it a bit difficult to exhaustively and satisfactorily answer the guests’ pertinent questions.

So, this is what I’m going to try to do here – both for the benefit of the guest speakers, but also for every other stakeholder in our community who has probably – at least to some extent – been pondering the same issues.

In the same breath, I’d like to encourage readers of this blog post to participate by posing questions of their own in the comments section below. We’d be delighted, of course, if this discussion were taken to the CCEx forum.

The guest speakers were Simon Hodson, in his capacity as Executive Director of CODATA, and Kate Wittenberg, Managing Director of Portico.

Simon Hodson started by introducing CODATA as an off-shoot of The International Council for Science (ICSU) whose mission it is, simply put, to strengthen society via the merits of science. In Simon’s words, CODATA “supports that mission by promoting improved scientific and technical data management and use”.

On granularity or ‘enough’ information about the organisations

The first question was double-edged: How does the CCEx deal with quality? And how does the CCEx deal with granularity? As Kate Wittenberg also expressed interest in quality levels, we’ll hang that issue on a cliff and keep you in suspense while addressing the question on granularity: “How can we get to a state within the Curation Costs Exchange where we are comparing like with like for different curation models?”, as Simon Hodson put it.

Simon was interested in introducing more contextual information in the CCEx, because even though it distinguishes between different types of organisations and different types of data, it does not for example provide descriptions about an organisation’s mission and objectives. So if a repository at ‘Ingest’ aims for high throughput of data with a minimum of curation action which limits itself to verification of metadata and takes around 30 minutes (DRYAD was Simon’s example here), then this curation service is cheap, because it aims to be so. However, if you have different objectives than high throughput – for example meticulous metadata enrichment at Ingest – then CCEx risks leading you to draw false conclusions when comparing these two different services.

At the moment, the CCEx enables comparison of different types of organisations (e.g. universities, memory institutions, service providers), but it does not go much further beyond that level of granularity. It is true that we do ask for a mission statement, but this piece of information is optional.

Figure 1 - The granularity level of information regarding your organisation

Figure 2 - The granularity level of information regarding your costs

Why have we chosen to keep granularity low? First of all because we want to attract users to the CCEx and achieve a critical mass of cost data. To do this we need to keep the barrier for submission low. If people have to spend too much time arguing with their financial director, getting the right figures, interpreting them correctly, uploading and normalising them into the proposed 4C-categories, then it’s probably not a good idea to ask them also to fastidiously add contextual information about their organisation and their costs.

In short, this is a trade-off between generating meaningful figures and receiving figures at all.

The second reason is that information about, for example, organisational objectives is not quantifiable. And if it isn’t quantifiable, we cannot use it for automated comparisons. However, we could – and should – make this information available, a point I will return to a bit later. At this point in the session, Kevin Ashley of the Digital Curation Centre (DCC) jumped in and said to Simon:

The CCEx “does help you, at DRYAD, validate the fact that you’re aiming to be lower cost than, let’s say something like the ADS [Archaeology Data Service] […]. The problem then is for other people who’d come along later on and potentially see those costs and not realizing what service model is behind it and then worry ‘why are we so much more expensive than the others?’”.

So it seems that lack of granularity leads to the risk of drawing false conclusions…

Yes, we agree that there is a risk that the CCEx can be misleading in some respects. So, apart from creating awareness about these risks in a more explicit manner on the website, we have tried to prevent this from happening in two ways: Firstly, we encourage stakeholders to describe themselves and their mission more thoroughly so that peers can take these statements into account. Secondly, we offer a mechanism that enables them to get in contact with an organisation whose figures they find interesting, disturbing, worrying or inspiring, for more information and clarification.

This last measure in particular is the CCEx response to the lack of granularity.

But the point is taken, and we will do two things: put caveats up front that warn about these potential false conclusions, and revisit our level of granularity. Users are likely to tell us which pieces of cost information they’d like to add in order to make their numbers more meaningful and comparable.

From costs to business models

The next thing on Simon’s mind was about diversification of business models: Will structural funding keep pace with costs? And if not, what options are available in order to identify other sources of funding? Simon told the story about the RDA Cost Recovery Group which is looking for other funding options than structural funding. In order to do so, a survey will be initiated to document which business models are in use as of today.

However, Simon added, it would also be interesting if the CCEx captured this information: “If we can break down [these sources of funding] into proportions in a similar way as the CCEx has [done regarding the cost categories], we’ll get a better insight into the potential business models – or diversified business models – for data archives.”

Even though the CCEx captures costs, it is not, currently, within its remit to capture where the funding comes from. We recognise the usefulness of such an approach, but would, in the first instance and again, refer to the possibility of contacting peers in order to acquire this type of information.

It should be stressed here that the 4C-project does actually address the need for business models specifically tailored for organisations whose core business is digital curation. The 4C working group responsible for this is named ‘From costs to business models’ and is informed by our stakeholders via surveys, qualitative interviews and focus groups. One of the tools used by this working group is the Business Model Canvas, which identifies key partners, activities & resources as well as value propositions, customer relationships and segments, communication channels, cost structures and revenue streams. This work is due by the end of January 2015 and will be integrated into the CCEx. For now you can follow any progress here.


Simon’s last question was about the fact that the CCEx presently does not accept any other currencies than GBP (£), USD ($) and EUR (€). Beyond that, he was addressing the fact that data archives are emerging in third world countries and being able to model their future costs would help them immensely in the process of establishing themselves as functioning digital curation organisations.

Firstly, the CCEx will extend the list to include most currencies in the world. Secondly, it has been a deliberate choice from the start not to build another cost model that foresees the costs of curation. The CCEx captures past costs, but does not estimate future ones. Why? It has proven so difficult for all of the initiatives which developed cost models throughout the noughties that the very first lesson learnt was not to do it again. It is also fair to say that the CCEx is not built for the founding phase of archives, but is adapted to cope with existing organisations. Digital curation organisations that are in the establishing phase may, however, find the CCEx useful in order to spot which kinds of budgets they should or could aim for.

The next intervention was by Kate Wittenberg, who introduced her organisation (Portico) as being a – very large – digital preservation service (she mentioned 920 libraries, 275 publishers, 30 million articles…), which has the responsibility of preserving content from many publishers and making it available to many libraries.

Amalgamated and indirect costs

“How hard is it to actually calculate one’s costs, particularly when one is part of [a network of] other organisations […] Sharing costs can be a huge advantage when doing preservation […], but it makes it hard sometimes to extract one’s own costs as separate from those of the larger organisation. So, for example, if one shares IT expertise, shares storage, shares staff, shares larger infrastructures – all of which makes it possible to create a cost efficient preservation organisation – it is very hard to figure out what exactly are the costs required for your activities as opposed to the activities of the larger organisation”.

The answer to this question is manifold. It touches for example upon the difficulties of capturing and calculating indirect costs, which is an old story in accountancy. The CCEx has – for now – two ways of dealing with those, and one does not exclude the other. The first is to add them to the provided ‘Overhead’ category and attach some explanation to it.

True, these explanations are not at the moment visible to others, and we’re working on how to implement this feature in a useful way so that these costs do not just disappear.

The other way is the ‘ABC’ way. ABC is an acronym for Activity-based costing in which one of the main principles is to assign indirect costs to the direct costs: Indirect costs will be spread in some pertinent proportion to the activity categories that are relevant (in our case Pre-Ingest, Ingest, Storage and Access).
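To make the ABC principle concrete, here is a minimal sketch in Python. The function name `allocate_indirect`, the figures, and the default choice of spreading overhead in proportion to each activity's direct cost are illustrative assumptions, not the CCEx's actual method; an organisation might just as well use staff time or storage volume as the allocation basis.

```python
def allocate_indirect(direct_costs, indirect_total, weights=None):
    """Spread an indirect cost over activities in proportion to a basis.

    direct_costs:   dict mapping activity name -> direct cost
    indirect_total: the overhead amount to distribute
    weights:        optional dict activity -> weight (hypothetical
                    allocation basis, e.g. staff time); defaults to
                    the direct costs themselves
    """
    basis = weights if weights is not None else direct_costs
    total_weight = sum(basis.values())
    if total_weight <= 0:
        raise ValueError("allocation basis must sum to a positive number")
    # Each activity keeps its direct cost and receives its share of overhead.
    return {
        activity: cost + indirect_total * basis[activity] / total_weight
        for activity, cost in direct_costs.items()
    }


# Hypothetical figures, for illustration only.
direct = {"Pre-Ingest": 10_000, "Ingest": 30_000, "Storage": 40_000, "Access": 20_000}
full = allocate_indirect(direct, indirect_total=20_000)
```

With these made-up numbers, Storage accounts for 40% of the direct costs and therefore absorbs 40% of the overhead. The total across the four activities equals direct costs plus overhead, so nothing disappears in the allocation.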

It should be mentioned in this respect that the CCEx does also ask for FTEs (Full Time Equivalents). This does not help organisations extract their costs from a larger context – there is no magic tool that provides this functionality, but having to submit FTEs gives the incentive of estimating approximately the amount of staff time spent on different activities, thus effectively splitting for example the IT supporter’s time.

A last thing is that merely undertaking the exercise of retrieving your costs from your financial director’s reluctant office, trying to extract them from those of others, and going through the process of mapping these figures into the pre-defined 4C categories is very useful in order to obtain some clarity on your costs. It will help you, among other things, understand what is needed to extract your costs and to calculate the indirect costs, but it will not help the community do this in a uniform way, because we think that there are as many ways of doing it as there are organisations.

This is exactly why the 4C project ‘imposes’ the normalisation of cost figures into ‘our’ cost categories – so that they become uniform and comparable.

Qualitative costing

Kate had one question in common with Simon: How do we cost qualitative differences in preservation? Does the CCEx do that for us? And one instance of that question, as formulated by Kate, is about the necessary, supporting activities that make the core preservation activity possible in umbrella organisations (which per se deal with networking): “How would one calculate the relationship costs involved in preserving content for various parts of our communities?”

The need for organisations to nuance the cost data they upload to the CCEx has been a key concern for many of our stakeholders. The CCEx therefore offers the possibility of adding cost units (e.g. ‘Costs of networking’) and accompanying each cost unit with a description which details what it is about.

This does not, however, display a networking cost at the results’ end of the CCEx, where you can analyse and compare costs, because all cost units are being aggregated into the predefined categories already mentioned. So the challenge here resembles the challenge of dealing with indirect costs, the details of which also become invisible in the CCEx after the aggregation has taken place: How do we communicate the richness of information submitted by users of the CCEx without compromising anonymity and breaching confidentiality?  
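The loss of detail described above can be illustrated with a small Python sketch. The record layout (`name`, `category`, `amount`, `description`), the function `aggregate`, and the figures are hypothetical and do not reflect the CCEx's internal data model; only the category names come from this post.

```python
# Categories named in this post; the data layout below is an assumption.
CATEGORIES = ("Pre-Ingest", "Ingest", "Storage", "Access", "Overhead")


def aggregate(cost_units):
    """Sum cost units per predefined category, dropping names and descriptions."""
    totals = {category: 0.0 for category in CATEGORIES}
    for unit in cost_units:
        if unit["category"] not in totals:
            raise ValueError(f"unknown category: {unit['category']}")
        totals[unit["category"]] += unit["amount"]
    return totals


# Hypothetical submission, for illustration only.
submitted = [
    {"name": "Costs of networking", "category": "Overhead", "amount": 5_000,
     "description": "Relationship management with publishers and libraries"},
    {"name": "Office rent", "category": "Overhead", "amount": 7_000,
     "description": "Share of premises used by the preservation team"},
    {"name": "Tape library", "category": "Storage", "amount": 12_000,
     "description": "Annual maintenance contract"},
]
totals = aggregate(submitted)
```

After aggregation, the two Overhead units are merged into a single figure of 12,000, and the networking cost and its description are no longer visible in the result. That is the presentation problem the paragraph above refers to.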

Consequently, an outstanding challenge of the 4C-project is to figure out how to display and make useful the entirety of the information that organisations have been willing to submit to us in the attempt to nuance their cost landscape. Acknowledged.

The future – contextualising costs and creating an online course

The last two questions from Kate addressed the need for understanding costs better.

First, she wanted to know how to connect the cost data that organisations have submitted with options for how to move forward.

Next, Kate asked if the “CCEx could develop into a broader site with tools that actually include educational materials or maybe even curriculum that would educate and move organisations forward once we’ve been able to capture and understand the costs better? Can it be more? Can it actually assist people by providing more extensive information about others in the community – or actually an online course in preservation?”

The CCEx team has in fact already been reflecting upon these topics. The idea of turning a part of the site into maybe not an online course but some sort of curriculum for the apprenticeship of the economics of curation was already suggested by one of our reviewers at the 1st year review at the European Commission in February 2014. It is not, and has never been, the intention of the 4C-project to create a formalised curriculum, but we do want to raise awareness about the topic and offer materials that provide valuable insights into the field. We would also like to spread good practices, nudging people for example to do more ‘Activity-based costing’, to separate their core curation costs from other business costs and to use our curation-specific tools to develop for example business models and strategies for sustainable curation.

Some of these tools are available at the ‘Understand your costs’ section of the site, and they also give a part of the answer to Kate’s first question: How to connect costs with the whole economic ecosystem of curation and learn how to move onwards from the point where you have captured and submitted your costs.

Still, we have to keep in mind that it is not a one-day job to assimilate all the knowledge assembled on the site, and just reading the material that the 4C-project has published there is not enough either. But we believe that it is the place to start if you want to achieve a solid knowledge base and learn to operate the methods and tools that are available to you, and which will make it easier to clarify the landscape of the costs of curation and to navigate therein.

As you may know, we intend to hand over the CCEx to interested parties that will ensure its further development. To this end, we will use an online tool that allows for conveying our ideas for future development as well as voting on them.

The last remarks from the CCEx team in this blog post are huge ‘Thank Yous’ to Kate Wittenberg and Simon Hodson, not only for taking the time to join us at the 4C conference but also for candidly sharing their thoughts on the CCEx and giving us insights into how it can become even more useful in the world.

If you are interested, you can see the footage from the CCEx session at the 4C conference here.

Alex Thirifays, Danish National Archive

Alex leads the 4C Project work package to assess current methods of estimating and comparing curation costs and to work out the most beneficial paths for future development of solutions and services.