'The Carrot and the Stick' by Matthew Addis

There can be a big difference in the approach to digital preservation and cost modelling when content is being kept because you want to, compared to content being kept because you have to. This is the carrot vs. the stick. The carrot is where content has value: it can be reused, monetised, shared or otherwise exploited. The investment needed to keep content alive is balanced by a positive return, which can come relatively quickly, and this makes the business case all the easier to make (easier being a relative term!). The stick, on the other hand, is where content has to be kept for regulatory or compliance reasons; if it isn't retrievable then an organisation might face litigation, fines, penalties and major business damage. Those who are responsible can even face losing their jobs or legal proceedings. Not something to be taken lightly, but also something where the business case can, perhaps paradoxically, be a lot harder to make: you need to spend money now to avoid spending even more money in the future. That future could be a long way away or never even come. That's a hard thing to 'sell'. It is much better to have a business case built on 'content value' than on the 'cost of loss'.

So what's that all got to do with costs and cost modelling? Well, the carrot and the stick are two sides of the same coin. Thinking about the stick emphasises the need for modelling the 'cost of action' and how this offsets the potentially greater 'cost of inaction'. This is all about costs, risks and contingencies: essentially an actuarial approach to cost modelling. The key thing is the need to weigh the cost of not doing something, e.g. a fine because content has been lost, against the cost of doing something, e.g. investing in the people, processes and infrastructure needed to preserve that content. But the same is true for the carrot too. Rather than risk in a negative sense, it's about opportunity and the likelihood of a positive return: the 'value of action' rather than the 'cost of inaction'. Again there's uncertainty in what will or won't happen, and there's the need to consider how this might change with time, but perhaps with more emphasis on access and reuse of content and all that entails. In both cases it's about cost in context; costs don't mean much in isolation. Organisations need to know what would happen if budgets don't cover what they ideally need to do. They need to know the 'opportunity cost' of spending the money elsewhere, or of not spending the money at all. The people I work with who are considering whether to invest in curation and archiving ask these sorts of questions on a regular basis. It will be really interesting to see how 4C can help them get the answers they need!
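To make the 'cost of action' versus 'cost of inaction' comparison concrete, here is a minimal sketch in Python. It is only an illustration of the actuarial idea, not a 4C model: the function names, the assumption that loss events are independent from year to year, and every figure in the example are hypothetical.

```python
def chance_of_loss(annual_loss_prob: float, years: int) -> float:
    """Probability of at least one loss event over the period.

    Assumes (hypothetically) that each year is independent of the others.
    """
    return 1 - (1 - annual_loss_prob) ** years


def expected_cost_of_inaction(annual_loss_prob: float, penalty: float, years: int) -> float:
    """Expected penalty (fine, litigation, damage) if nothing is spent on preservation."""
    return chance_of_loss(annual_loss_prob, years) * penalty


def expected_cost_of_action(annual_spend: float, residual_loss_prob: float,
                            penalty: float, years: int) -> float:
    """Preservation spend plus the expected penalty from the risk that remains."""
    return annual_spend * years + chance_of_loss(residual_loss_prob, years) * penalty


# Made-up figures: a 2% yearly chance of losing content, a 1,000,000 penalty,
# and a 10,000 per year preservation service that cuts the yearly risk to 0.1%.
inaction = expected_cost_of_inaction(0.02, 1_000_000, years=10)
action = expected_cost_of_action(10_000, 0.001, 1_000_000, years=10)
print(f"expected cost of inaction: {inaction:,.0f}")
print(f"expected cost of action:   {action:,.0f}")
```

With these invented numbers the stick argument wins, but shorten the horizon or lower the penalty and the comparison can flip, which is exactly why the business case is hard to sell.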

The 'cost of risk of loss' is something I looked at when I was working on wibbly wobbly jelly cost modelling for audio-visual preservation. The lower you want the risk of not being able to retrieve content in the future, the more you need to spend on its preservation, although it needs to be said that the case I was looking at was the much smaller problem of how to store data. The good guys at AVPreserve have been looking at a similar problem with the Cost of Inaction for media migration. The longer you wait before migrating a media collection, the more the migration will cost because of degradation and the more will have been lost in the meantime: you end up spending more money for less result. But perhaps my favourite example comes from the aerospace industry. On 4 November 2010 an Airbus A380 travelling from London to Sydney made an emergency landing in Singapore following an 'uncontained engine failure'. To you and me this meant that one of the turbines had disintegrated, causing debris to rip through the engine cowling, tear holes in the wings, puncture fuel lines, and cause the engine to catch fire. The cause was a small manufacturing defect in an oil pipe coupling. The result was a 10% drop in the Rolls-Royce share price, $93M AUD compensation, a $123M AUD repair bill, and aircraft across Europe being grounded. But what struck me was that, when I spoke to someone in the industry, they said it could have been much worse! The aircraft engine was very recent, so the design and manufacturing data was relatively easily accessible and understandable, and it could quickly be shown that the fault was limited to the manufacture of a specific set of engines, as opposed to a systemic design fault or not being able to determine the scope of the fault at all. Every day of the investigation cost Qantas $1M+ because the Australian Transport Safety Bureau grounded their fleet.
Now imagine if that same event happened 30 years from now, which is well within the service life of the plane. Would the data still be as easy to find and use? Would the applications and the people who can understand it still be around? How do you balance the cost of curating and preserving this sort of data with the risk of a rare future event, such as a downed aircraft, that demands immediate access to this data? That's a real cost modelling challenge!

But what about the carrot? Some good examples of calculating the 'value of action' are now emerging in the research community. Research Data Management (RDM) is gaining momentum following the growing drive to make research data accessible to the community so the underlying science is verifiable and repeatable. This is becoming enshrined in funding body requirements and policies. Whilst these superficially look like the equivalent of regulatory requirements, e.g. 'thou shalt keep thy data for 10 years' as the EPSRC and others stipulate, it's actually all about the value of the data. The successful business cases for funding RDM are those that focus on the positive benefits, e.g. income, citation, and better research. These can be strong arguments, especially where a 1% increase in research funding success rates can mean £1M+ extra research income a year for a university. But on the ground the budgets are still limited, there isn't enough money to keep everything, and questions are still asked internally about what institutions can 'get away with' throwing away and how costs can be driven down. So whilst the outward face might be trumpeting the benefits of RDM and reaping the rewards, the inward reality is still one of cost management, trade-offs, and a certain element of risk taking about the consequences of not keeping everything, or of not having enough budget. Institutions trying to decide on the right balance is what we see at Arkivum when looking at the costs and benefits of long-term RDM.

If you've stayed with me this far then I hope I've made the case that the costs of curation and preservation (in which I include access) need to be compared with the potential costs of not doing so. The challenge is deciding where to draw the line, how to strike the balance, and what the possible consequences might be. This is risk management. Cost models need to be less about providing a single answer (42 comes to mind) and more about ranges and likelihoods: 'what is the cost if I want to do more, or to do less', 'what could I do if I had more budget or less budget', 'what's the likelihood of things turning out better or worse'. This then becomes the foundation for making both carrot and stick business cases. The 4C project looks very well placed to make important and very welcome inroads into this area and I'm very much looking forward to the results.
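As a rough illustration of 'ranges and likelihoods' rather than a single answer, the sketch below runs a small Monte Carlo simulation and reports a low, middle and high estimate of total cost. Everything in it is an assumption made for the example: the function names, the idea that yearly costs drift by up to 20% either way, the single-loss rule, and all the figures.

```python
import random
import statistics


def simulate_once(annual_budget, years, annual_loss_prob, penalty, rng):
    """One possible future: yearly spend plus a penalty if content is lost."""
    total = 0.0
    for _ in range(years):
        # Assumed: real yearly costs land within +/-20% of the budget.
        total += annual_budget * rng.uniform(0.8, 1.2)
        if rng.random() < annual_loss_prob:
            total += penalty
            break  # this sketch counts at most one loss, after which spending stops
    return total


def cost_range(annual_budget, years, annual_loss_prob, penalty, runs=10_000, seed=42):
    """Return the (5th percentile, median, 95th percentile) of simulated total cost."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    totals = [simulate_once(annual_budget, years, annual_loss_prob, penalty, rng)
              for _ in range(runs)]
    cuts = statistics.quantiles(totals, n=20)  # 19 cut points at 5%, 10%, ..., 95%
    return cuts[0], cuts[9], cuts[18]


# Made-up comparison: 'do less' (cheap, riskier) against 'do more' (dearer, safer).
for label, budget, risk in [("do less", 5_000, 0.02), ("do more", 15_000, 0.002)]:
    low, mid, high = cost_range(budget, years=10, annual_loss_prob=risk, penalty=1_000_000)
    print(f"{label}: 5% {low:,.0f} | median {mid:,.0f} | 95% {high:,.0f}")
```

The point is the shape of the output rather than the numbers: a decision-maker sees how wide the spread is for each option, not just which one has the lower headline figure.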

Matthew Addis, Arkivum

Matthew is part of the 4C Project Advisory Board and represents Service Providers and Curation Experts. He has spent 15 years leading a diverse portfolio of industry-led applied research projects including archiving and digital preservation, service-oriented computing, data mining and knowledge management, to name but a few.