Acquisition
The data that form the backbone of most archives' collections are acquired through negotiation with a number of different data producers. Increasingly, deposits come as a result of funding bodies' grant conditions and institutional data policies. All CESSDA archives collect and disseminate data that are of interest for research and teaching. Individual collections vary greatly in subject coverage, time period and geographical area covered, in the dimensions and size of the data material, and in data type, so it is always advisable to contact your national archive early on for information about deposit.
Examples of Types of Data Accepted by Data Archives and Services
- Quantitative micro data. The coded numerical responses to surveys with a separate record for each individual respondent.
- Macro data. Aggregate figures, for example country-level economic indicators.
- Qualitative data. In-depth interviews, diaries, anthropological field notes and the complete answers to survey questions.
- Multimedia. A small number of datasets may include image files, such as photographs, and audio clips.
- Non-digital material. Paper media, photographs, reports, questionnaires and transcriptions, analogue audio or audio-visual recordings.
Typical quantitative data formats include SPSS, Stata and tab-delimited text; typical qualitative data formats include ASCII text, Excel, Word and RTF.
Sources of Data in Many Archives
- Official agencies - mainly central government
- International statistical agencies
- Individual academics with research grants
- Market research agencies
- Historical sources
- Other data archives worldwide
Arrangements and Licences
As a first step, the archive and data producer negotiate arrangements concerning copyright and access issues, and a licence is agreed that gives the archive rights to hold, store and disseminate the data and associated metadata products. It is important to note that the archive acquires rights to manage the data effectively, but does not claim to own the data. In this respect, it performs the role of custodian rather than owner. A key part of this licensing process is to ensure that all copyright holders in the data are identified and give their consent to the transfer of data to the archive.
At an early stage in the acquisition process in most CESSDA archives, depositors or archive staff complete a standard pre-acquisition form. This provides summary information on the potential depositor (or data producer), the content of the data collection, the associated metadata, and technical information on the format(s) of the data collection. The potential deposit may also be subject to an acquisition review, which decides whether to accept or reject the submission and assesses its readability and content from a technical point of view. Many CESSDA archives have developed collection development policies that establish the criteria for deciding whether to acquire and preserve identified data collections. The decision to accept is usually based on the scope of the collection, its content and its applicability to the user community that the archive serves, although the format and structure of the data are also a key part of the decision. Before the collection is accepted, the archive may ask the data collector to re-format or re-structure the data, or to provide additional metadata.
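The information gathered by a pre-acquisition form, and the sort of technical checks an acquisition review might apply, can be sketched in code. This is only an illustration: the class `PreAcquisitionForm`, the function `review_issues` and the `PREFERRED_FORMATS` list are hypothetical names invented here, and each archive's actual form and criteria will differ.

```python
from dataclasses import dataclass, field

# Hypothetical list of preferred formats, loosely based on the formats
# named in these notes; a real archive publishes its own list.
PREFERRED_FORMATS = {
    "spss", "stata", "tab-delimited", "ascii text", "excel", "word", "rtf",
}


@dataclass
class PreAcquisitionForm:
    """Summary information collected before a deposit is reviewed."""
    depositor: str                      # potential depositor or data producer
    content_summary: str                # what the data collection contains
    metadata_files: list[str] = field(default_factory=list)  # e.g. codebooks
    data_formats: list[str] = field(default_factory=list)    # technical formats


def review_issues(form: PreAcquisitionForm) -> list[str]:
    """Return a list of points the archive might raise with the depositor."""
    issues = []
    if not form.content_summary.strip():
        issues.append("missing content summary")
    if not form.metadata_files:
        issues.append("no associated metadata listed")
    if not form.data_formats:
        issues.append("no data format given")
    for fmt in form.data_formats:
        if fmt.lower() not in PREFERRED_FORMATS:
            issues.append(f"format may need conversion: {fmt}")
    return issues


form = PreAcquisitionForm(
    depositor="Example Research Group",
    content_summary="Household survey, one record per respondent",
    data_formats=["SPSS", "dBase"],
)
for issue in review_issues(form):
    print(issue)
```

An empty list would suggest there are no obvious technical obstacles. The decision to accept still rests on the scope and content criteria in the archive's collection development policy, which a check like this does not capture.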
Once the data are accepted, a set of depositor forms (for example, see the ESDS Data Review Form) needs to be completed, giving details of data provenance as well as certain technical information.
As stressed in other sections of these notes, it is advisable for the data producer to contact the archive at an early stage in their data collection activities. Timely and continuous dialogue can avoid last-minute problems with data formats and serious gaps in documentation and metadata, and can also help the producer with data management throughout collection and analysis.