Documentation

Preparing high-quality technical documentation, or a codebook, can be a time-consuming task. An outline of the ideal elements of clear and comprehensive technical documentation appears below. Details are available on the UK Data Archive web page Documenting your data and in the ICPSR Guide (pp. 13-16). Other data archives have published similar guides in their national languages.

Document Using Structured Formats

Many CESSDA archives now encourage users to generate documentation that is "marked up" according to the Data Documentation Initiative (DDI) metadata specification, an emerging international standard for the content, presentation, transport, and preservation of documentation about datasets in the social and behavioural sciences. This specification allows the mark-up of documentation elements and facilitates the creation of coherent and comprehensive technical documentation of a dataset. Using this system means that all the information the analyst needs is available in one document, which can be used to produce other products (such as set-up files). The files are amenable to web display and navigation, and because the documentation is prepared from the outset, deposit in the archive or data centre is possible immediately after data collection is complete.
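As a rough illustration of what marked-up documentation can look like, the sketch below builds a small XML codebook with Python's standard library. The element names loosely follow DDI Codebook conventions, but the output has not been validated against the DDI schema, and the study title, variable, and the helper `build_codebook` are hypothetical, invented for this example.

```python
import xml.etree.ElementTree as ET


def build_codebook(title, variables):
    """Return an XML string describing a study and its variables.

    Element names loosely follow DDI Codebook conventions; the output is
    illustrative only and is not validated against the DDI schema.
    """
    root = ET.Element("codeBook")

    # Study-level description: here only the title.
    stdy = ET.SubElement(root, "stdyDscr")
    citation = ET.SubElement(stdy, "citation")
    titl_stmt = ET.SubElement(citation, "titlStmt")
    ET.SubElement(titl_stmt, "titl").text = title

    # Variable-level description: name, label, question text, and codes.
    data = ET.SubElement(root, "dataDscr")
    for v in variables:
        var = ET.SubElement(data, "var", name=v["name"])
        ET.SubElement(var, "labl").text = v["label"]
        qstn = ET.SubElement(var, "qstn")
        ET.SubElement(qstn, "qstnLit").text = v["question"]
        for code, meaning in v["codes"].items():
            cat = ET.SubElement(var, "catgry")
            ET.SubElement(cat, "catValu").text = str(code)
            ET.SubElement(cat, "labl").text = meaning

    return ET.tostring(root, encoding="unicode")


# Hypothetical study and variable, invented for illustration.
xml_text = build_codebook(
    "Example Household Survey",
    [{
        "name": "Q1",
        "label": "Employment status",
        "question": "Are you currently in paid employment?",
        "codes": {1: "Yes", 2: "No", 9: "No answer"},
    }],
)
print(xml_text)
```

Because the question text, labels, and codes live in one document, other products such as set-up files could in principle be generated from the same source. In practice, a dedicated DDI tool or the archive's own templates would be the safer route to schema-valid output.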

More information on DDI and a list of tools and other XML resources are available at the DDI pages, at the UKDA metadata page and at your local or national archive.

If a project cannot produce documentation in DDI format, the best alternative is a uniform, structured format with integrated question text, as this will enable the archive to convert the files to XML format easily. In most cases, CESSDA archives produce the DDI records from information provided by the data depositor.
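One possible reading of "a uniform, structured format with integrated question text" is a flat table with one row per response code, where every row carries the variable name, label, and full question wording. The sketch below writes such a table as CSV. The column names, the sample variable, and the helpers `codebook_rows` and `write_codebook` are assumptions made for illustration, not a layout prescribed by any archive.

```python
import csv
import io

# Hypothetical column layout: one row per variable and response code.
FIELDS = ["variable", "label", "question_text", "code", "code_label"]


def codebook_rows(variables):
    """Flatten variable descriptions into one row per response code."""
    rows = []
    for v in variables:
        for code, code_label in v["codes"].items():
            rows.append({
                "variable": v["name"],
                "label": v["label"],
                "question_text": v["question"],
                "code": code,
                "code_label": code_label,
            })
    return rows


def write_codebook(variables):
    """Return the codebook as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(codebook_rows(variables))
    return buffer.getvalue()


# Hypothetical variable, invented for illustration.
sample = [{
    "name": "Q1",
    "label": "Employment status",
    "question": "Are you currently in paid employment?",
    "codes": {1: "Yes", 2: "No", 9: "No answer"},
}]
csv_text = write_codebook(sample)
print(csv_text)
```

Keeping the same columns for every variable is what makes later conversion to XML mechanical: each column maps to one element, so no manual interpretation of free text is needed.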

Minimum Level of Information

Good documentation is an essential part of any dataset, and a minimum level of information is required to make the data suitable for sharing with other researchers. Three types of information must be provided: explanatory, contextual and cataloguing information.

Explanatory Material

Explanatory material is essential for informed use of a dataset. Much of this information is likely to be available from reports, working papers and other publications.

Contextual Information

Contextual information covers the context in which the data were collected and the uses to which the data were put:

Cataloguing Information

Cataloguing information allows the archive to create a formal catalogue record, or study description, for the study. The study description serves two purposes: first, it is a bibliographic record of the dataset for proper acknowledgment and citation, and second, it is the principal instrument used for resource discovery. Most archives create a formal catalogue record for every archived study.

The catalogue record includes information such as the title of the dataset, principal investigator, sponsors, data collectors, dates of data collection, temporal and geographic coverage, methods of data collection, and sampling design and frames. The archive often gathers this information at the time of deposit using a special form prepared for this purpose. For a sample form, see the ESDS Deposit Data web page. For further discussion of the elements of information required for a structured catalogue record, or study description, see the UKDA metadata page.
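Before deposit, it can help to check a draft study description against the elements listed above. The sketch below models those elements as a Python dataclass and reports which ones are still empty. The class `StudyDescription`, its field names, and the sample values are hypothetical; an archive's actual deposit form may use different fields and wording.

```python
from dataclasses import dataclass, field, fields


@dataclass
class StudyDescription:
    """Catalogue elements named in the text; field names are illustrative."""
    title: str = ""
    principal_investigator: str = ""
    sponsors: list = field(default_factory=list)
    data_collectors: list = field(default_factory=list)
    collection_dates: str = ""
    temporal_coverage: str = ""
    geographic_coverage: str = ""
    collection_methods: str = ""
    sampling_design: str = ""


def missing_fields(record):
    """Return the names of fields that are still empty, in declaration order."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Hypothetical draft record, invented for illustration.
draft = StudyDescription(
    title="Example Household Survey",
    principal_investigator="A. Researcher",
    sponsors=["Example Funding Council"],
)
print(missing_fields(draft))
```

A check like this only confirms that each element has been filled in; whether the content is adequate for citation and resource discovery remains a judgment for the depositor and the archive.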

In addition, you should provide:

Qualitative Data Documentation

Documentation requirements for qualitative data are essentially similar to those outlined above, but particulars will vary. The essential items of information are noted below. For more information, see the ESDS Qualidata web page or contact your local archive.

Documentation of qualitative files provides a context for the data and the research investigation. If it is detailed enough, it can make the raw qualitative data more usable for secondary analysts who were not directly involved in the data collection.

Common examples of documentation include:

In most cases, depositors are asked to provide as much documentation and metadata as possible, although it is recognised that this can vary widely from data collection to data collection. The data archive or service will use this information to provide a user guide or standardised documentation to assist the secondary user.