Total Cost of Preservation (CDL-TCP)

Property: Description
Creator and Funding: The model was developed, and is still being developed, by the California Digital Library (CDL), UC Curation Center (UC3). It is available under a Creative Commons Attribution-ShareAlike 3.0 license.
Status: The latest version of the TCP pricing model tool and white paper is rev 2.1, dated 2013-08-05.
Purpose: To model the full economic costs of preservation over time (the “total cost of preservation”, TCP) in order to sustain long-term preservation efforts through effective and affordable curation management. UC3 itself needs a TCP model in order to move many of its core service offerings to a cost-recovery operational basis.
Information assets: Any kind of digital asset; the model works at a generic, abstract level.
Activities: Ingest, Data Management, Archival Storage, Preservation Planning, Access, Administration, Management
Resources: Total cost. In the tool, total cost is broken down into subsidiary costs: capital, labour and operational costs; one-time, term or annual costs (called scope); and fixed or marginal (proportional) costs. Term costs are annualized over their lifespan and adjusted for inflation.
Time: Present and future, with a 10-year scope.
Variables: More than 100. For example, “Migration” has unit costs for refreshment, replication, repackaging and transformation; “Staff” has 12 kinds of roles, each with salaries and FTE day rates.
Type of tool: Analysis tool, implemented as a Microsoft Excel spreadsheet.
Availability of tools: The tool is available for download at:
The Total Cost of Preservation (TCP) model from the California Digital Library (CDL) is an analytical framework for modelling, assessing and accounting for the full economic costs of preservation. It relies on a number of fundamental abstractions and assumptions about preservation activities.

The model defines 10 high-level categories that cover all digital curation activities. The categories are based on the OAIS model, but some concepts and terminology are modified to broaden applicability and to make the model easier for non-specialists to understand. Each category represents a cost component in the TCP pricing model; from these components the pricing models calculate costs and express them in monetary terms.

The 10 categories are:

  • content owners
  • submission streams
  • preservation system (ingest, data management, access)
  • servers
  • storage
  • consumers
  • preservation planning
  • interventions (e.g. migrations)
  • administration
  • management

The framework provides two price models that account for two different types of funding: Pay-as-you-go (PAYG) and Paid-Up, that is, paid up front (PUP).

The Total Cost of Preservation model relies on a number of fundamental abstractions and assumptions about preservation activities. Costs associated with content creation or acquisition, reformatting, packaging, submission, and so on are excluded from the model. The cost of supporting owners in making use of the preservation system functions (sheet W3 in the spreadsheet tool) is included. Costs are nominal, based on generic instances of activities. Preservation actions are assumed to be substantially automated, so their main cost is the acquisition and deployment of software, which is assumed to be independent of the number of objects.

System, Administration and Management are considered fixed costs, that is, independent of the number of objects. Other costs are considered variable and, in general, proportional: they are computed from a unit cost and a number of units, for example the unit cost of a submission stream and the number of unique submission streams. The cost is assumed to be paid by the owners and the preservation service provider.
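The split between fixed and proportional costs can be illustrated with a minimal Python sketch. It is not taken from the CDL spreadsheet: the class and function names, the cost categories chosen and all figures are hypothetical, and the only assumption carried over from the model is that a variable cost equals unit cost times number of units.

```python
from dataclasses import dataclass


@dataclass
class VariableCost:
    """A proportional cost: unit cost multiplied by number of units."""
    name: str
    unit_cost: float  # cost per unit per year (hypothetical figure)
    units: float      # number of units, e.g. submission streams

    def total(self) -> float:
        return self.unit_cost * self.units


def total_annual_cost(fixed_costs: dict[str, float],
                      variable_costs: list[VariableCost]) -> float:
    """Sum fixed costs (independent of volume) and proportional costs."""
    fixed = sum(fixed_costs.values())
    variable = sum(cost.total() for cost in variable_costs)
    return fixed + variable


# Hypothetical example values, for illustration only.
fixed = {"system": 50_000.0, "administration": 20_000.0, "management": 10_000.0}
variable = [
    VariableCost("submission streams", unit_cost=1_000.0, units=5),
    VariableCost("storage (TB)", unit_cost=400.0, units=25),
]

print(total_annual_cost(fixed, variable))  # prints 95000.0
```

Doubling the number of submission streams in this sketch raises only the variable part of the total; the fixed part stays the same, which is the behaviour the model assumes for System, Administration and Management.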

Establishing who pays is not essential for using the model; the cost can be estimated without that information. The payer has been incorporated as a factor to make the model more useful for CDL.

In the Paid-Up case it is assumed that the archive institution (the preservation service provider) can carry surpluses forward across fiscal year boundaries and reinvest them at market rates. (Note: for many public sector institutions this is not possible, so they are forced to use PAYG.) Normal discounted cash flow (DCF) analysis is used. The model's creator is aware of the shortcomings of DCF with regard to fluctuating interest rates and the strong short-term bias inherent in the time value of money.
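As an illustration of how a discounted cash flow calculation relates an up-front payment to a series of annual costs, the following Python sketch computes a present value. It is a textbook DCF formula under stated assumptions, not the formulas used in the TCP spreadsheet: the function name is hypothetical, costs are assumed to fall at the start of each year, the discount rate is assumed constant, and the figures are invented.

```python
def present_value(annual_costs: list[float], discount_rate: float) -> float:
    """Present value of a stream of annual costs.

    Assumptions (not from the TCP tool): the cost for year t is paid at the
    start of year t (t = 0, 1, 2, ...), and the discount rate is constant.
    """
    return sum(cost / (1.0 + discount_rate) ** year
               for year, cost in enumerate(annual_costs))


# Hypothetical figures: three years of pay-as-you-go costs of 100 each,
# discounted at 10% per year.
payg_costs = [100.0, 100.0, 100.0]
paid_up_price = present_value(payg_costs, discount_rate=0.10)

print(round(paid_up_price, 2))  # prints 273.55
```

With a positive discount rate the up-front amount is lower than the undiscounted sum of annual payments (300 in this example), because the sketch assumes the provider can reinvest the unspent balance. With a rate of zero, which corresponds to an institution that cannot reinvest surpluses, the two amounts coincide.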