Data Life Cycle
"Data archiving is a process, not an end state where data is simply turned over to a repository at the conclusion of a study. Rather, data archiving should begin early in a project and incorporate a schedule for depositing products over the course of a project's life cycle and the creation and preservation of accurate metadata, ensuring the usability of the research data itself. Such practices would incorporate archiving as part of the research method." (Jacobs, James A. & Humphrey, Charles (2004). "Preserving Research Data." Communications of the ACM, 47(9), 27–29.)
To ensure that the data you collect today can be used in the future, consider standards for data structure and format, for the form and content of documentation, and for metadata from an early stage. It is also important to develop a data management plan that addresses the archival considerations arising at every stage of the data life cycle.
The Seven Phases of Data Management
| Phase | Stage | Activities |
| --- | --- | --- |
| Phase One | Planning and Outline of Data Collection Exercise | Review existing datasets; determine need for new data; investigate special archiving challenges; identify potential users; describe costs related to archiving. |
| Phase Two | Start-Up and Data Management | Create data management plan; make decisions about documentation form and content; conduct pre-tests and pilots of materials and methods. |
| Phase Three | Data Collection and File Creation; Documentation | Following best practice, carry out planning, survey and measurement; data entry and digitisation; data checking and cleaning; data integrity, variable names and groups; create information for documentation, with full information on all aspects of data, including documentation of derived variables, imputation and weighting. |
| Phase Four | Data Analysis | Analysis and derived data creation; creation of final data documentation; manage master dataset; set up appropriate file structures; create multiple backups. |
| Phase Five | Preparing Data for Sharing with Others | Further checking and cleaning; derived data creation; imputation and weighting; linkage to other data; address disclosure risks; determine file formats for deposit; contact archive. |
| Phase Six | Depositing Data | Complete relevant forms; comply with relevant dissemination standards and formats. |
| Phase Seven | Post-Deposit Archival Activities | Storage, migration and metadata creation; additional confidentiality investigation; possible preparation for online access and enhancement; distribution to secondary analysts outside the originating team; linkages to other datasets; support for data users. |
(Source: the ICPSR Guide to Social Science Data Preparation and Archiving.)
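As an illustration of the documentation and integrity tasks listed under Phases Three and Four, the following Python sketch builds a minimal variable-level codebook for a small CSV file and records a checksum that can later be used to confirm the file has not changed. It is only a sketch under stated assumptions: the function names (`build_codebook`, `checksum`), the sample data and the choice of codebook fields are hypothetical and are not drawn from the ICPSR guide.

```python
import csv
import hashlib
import io
import json


def build_codebook(csv_text, labels):
    """Return variable-level metadata for a small CSV dataset.

    `labels` maps variable names to human-readable descriptions
    (a hypothetical structure chosen for this example).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    codebook = []
    for name in reader.fieldnames:
        values = [row[name] for row in rows]
        codebook.append({
            "variable": name,
            # Flag variables that nobody has documented yet.
            "label": labels.get(name, "UNDOCUMENTED"),
            "n_missing": sum(1 for v in values if v == ""),
            "n_distinct": len({v for v in values if v != ""}),
        })
    return codebook


def checksum(csv_text):
    """SHA-256 digest of the file content, for later integrity checks."""
    return hashlib.sha256(csv_text.encode("utf-8")).hexdigest()


# Invented sample data: three respondents, two missing values.
sample = "id,age,income\n1,34,52000\n2,,48000\n3,41,\n"
labels = {
    "id": "Respondent identifier",
    "age": "Age in years at interview",
}

record = {
    "sha256": checksum(sample),
    "variables": build_codebook(sample, labels),
}
print(json.dumps(record, indent=2))
```

Storing a record like this alongside each version of the master dataset gives every backup a checksum to be verified against, and shows which variables still lack a description before deposit.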