Another particularly important consideration in the early stages of a research project that collects and uses quantitative or qualitative data is respondent confidentiality, along with other aspects of ethical research practice. Because it ultimately affects the quality, reliability and usability of the data, it should be considered before and during the collection phase of any project.
It is worth noting that information provided here is rather broad and general, as considerations and procedures will differ in detail across CESSDA member countries. The vast majority of codes of ethics are national rather than international, and discipline-specific rather than cross-disciplinary in nature. Recently, international research and funding bodies have either implemented such codes of ethical practice or are actively discussing the issues.
A major concern of data collectors, arising from their responsibility to protect respondent confidentiality, is informed consent. Informed consent matters in the research process because it can determine whether and how data from fieldwork can be shared with other researchers. It is also an ethical requirement of the research process, which must be thought through at the planning and writing stage of any research proposal and tailored to the specific research questions and the sample.
Consider before Collection Phase
Special attention must be paid to this issue by researchers who deposit data with a public archive. Once data are released to the public, it is impossible to continuously monitor use to ensure that other researchers respect respondent confidentiality. So, it is common practice to alter the files so that information that could threaten the confidentiality of research subjects is removed or masked before the dataset is made public. At the same time, however, care must be taken to ensure that the alterations do not unnecessarily reduce the secondary analyst's ability to reproduce or extend the original study findings. The tension between these two concerns remains throughout the data collection and analysis phases, and should, therefore, be considered by all researchers early in their data collection activities. Failure to realise the need to gain informed consent means that the opportunities for archiving and secondary analysis may be jeopardised from the start.
The Principles of Informed Consent
As regards the principles of informed consent, there are two major concerns that govern policy and practice in this area: professional ethics and applicable regulations. These are discussed in more detail in the Research Ethics section.
The social sciences broadly defined (as well as a number of professional associations) have promulgated codes of ethics that require social scientists to ensure the confidentiality of data collected for research purposes. Both the rights of respondents and their continued willingness to voluntarily provide answers to scientific inquiries underlie this professional ethic. The ethic applies to all participants in the research process, from data collectors to archivists to secondary analysts who use such data in their research. Sets of regulations also bind all of us in the research enterprise to measures intended to protect research subjects as well as data obtained from such subjects. These regulations range from national and local statutes to rules instituted by universities and colleges. Researchers at most universities and in other organisations are subject to such regulations that cover data that they generate or collect.
Gaining Informed Consent
Ensuring that participants are engaging voluntarily, without coercion and in the knowledge they can withdraw at any time, is a fundamental ethical principle of research. Participants must also feel confident that there will be no adverse consequences of their involvement in the research including, if they wish, the preservation of their anonymity. Researchers should also assure participants that the information they give will be securely stored and used responsibly by bona fide researchers.
The provisions of European law (for example, the Data Protection Act) and the ethical guidelines of international and national statistical and academic research organisations recommend the following minimum requirements in relation to informed consent:
- Consent must be freely given with enough detail to indicate what has been agreed.
- There must be active communication between the parties about what is expected from participants and why their participation is required.
- Documentation outlining consent has to differentiate between consent to participate and consent to allow findings to be shared or published.
- Consent cannot be inferred from a non-response to a communication such as a letter or invitation to participate.
A common danger for researchers when negotiating a consent agreement is the offer to keep what is discussed between the researcher and the participant confidential. This presents two problems. First, it ignores the need to share the data with other bona fide researchers, which may be a condition of funding. Second, if the material is to be transcribed, this can only be undertaken by the principal researcher, as the material cannot be read or transcribed by anyone else.
Sharing or archiving research data need not compromise assurances of consent and confidentiality. Anonymisation is often the first approach considered by most researchers, but this should not be considered in isolation. Sensitive and confidential data may also be safeguarded effectively through access and usage restrictions employed in certain circumstances and if deposited in a formal archive.
Written consent should be gained wherever possible to ensure that information is being collected and provided in a consistent and uniform way. Not only is it necessary under current ethical and data protection requirements but it also acts as a guarantee, should any form of dispute arise. It is the researcher's responsibility to gain the participant's trust by outlining the research proposal, the aims and outcomes and benefits of secondary re-use.
Although the format of a written consent form may be tailored to the needs of a particular project, a number of basic issues must be addressed in order to show information is being provided by the researcher (or research team).
- Purpose of the research
- What is involved in participation
- Benefits and risks
- Withdrawal of consent (and mechanism of withdrawal)
- Usage of data
- Strategies to ensure appropriate confidentiality
- Storage of data
- Access to data
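The list above can double as a checklist when a consent form is drafted. The following is a minimal sketch, assuming the draft form is held as a dictionary of section texts; the topic keys, the `missing_topics` helper and the sample draft are hypothetical names introduced here for illustration and are not part of any archive's tooling.

```python
# The eight topics listed above, expressed as hypothetical dictionary keys.
REQUIRED_TOPICS = [
    "purpose", "participation", "benefits_and_risks", "withdrawal",
    "data_usage", "confidentiality", "storage", "access",
]


def missing_topics(form_sections):
    """Return the required topics that a draft consent form does not yet cover."""
    return [t for t in REQUIRED_TOPICS if not form_sections.get(t, "").strip()]


# Invented draft that covers only two of the eight topics.
draft = {
    "purpose": "Study of commuting habits.",
    "withdrawal": "Email the research team at any time.",
}
gaps = missing_topics(draft)
```

A check of this kind only confirms that each topic is mentioned; whether the wording is adequate remains a matter for the researcher and the relevant ethics body.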
It is advisable to gain full consent for all intended use of material before and during participation. If this is not possible, post-interview contact can be made, although this is often difficult and time-consuming. Offering interviewees the chance to edit or comment on transcriptions is generally not advised. In some cases, what may seem a good way of encouraging participation in the early stages of a project often turns into a major block to its later development, whilst researchers wait for transcripts to be returned for heavy editing. Also the original audio recording and edited transcript may differ quite substantially. If a researcher wishes to offer participants this option, it should be restricted by an agreed response time.
For examples, see the FSD guidelines to Informing Research Participants.
Most research will include some sensitive or confidential information. This issue must also be dealt with as part of the process of gaining informed consent. Ways of dealing with confidentiality vary widely depending upon the research project in question; often the main concern is with the disclosure of names, addresses and sometimes occupational and location details, i.e. the basic identifiers that will be collected in most research projects.
Projects based upon a quantitative methodology can usually deal with this in a straightforward manner. Anonymisation may involve removing or aggregating key variables. Researchers using qualitative methods need to approach the problem in a much more considered and reflective way.
Researchers should consider how important confidentiality is to the potential participants. There is no need to make promises regarding confidentiality if respondents are happy for their comments to be made public or see their role, perhaps as public figures, as one where disclosure is routinely assumed. In many cases where some confidentiality is required, the undertakings to provide this will need to be made clear, workable and preferably in writing.
Dealing with Confidentiality in Quantitative Data
Two kinds of variables often found in social science datasets present problems that could endanger the confidentiality of research subjects.
Direct identifiers. These are variables, which may have been collected in the process of survey administration, that point explicitly to particular individuals or units. These might include national insurance numbers, telephone numbers, licence numbers or mailing addresses. The analytical importance of such variables should be carefully weighed against the risk of disclosure they represent. If the risk is too great, these variables should be deleted or placed under special security arrangements.
Indirect identifiers. Indirect identifiers are variables which include information that could result in a breach of confidentiality when linked with other publicly available sources. This could include geographical information, workplace/organisation, education institution or occupation. There are several strategies for working with problematic variables.
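One rough way to see why indirect identifiers matter is to count how many records share each combination of their values: a combination held by a single record is a candidate for re-identification through linkage. The sketch below is an illustration only, not a method prescribed here; the records, the choice of indirect identifiers and the `risky_combinations` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical records; field names and values are invented for illustration.
records = [
    {"region": "North", "occupation": "teacher", "birth_year": 1970},
    {"region": "North", "occupation": "teacher", "birth_year": 1970},
    {"region": "South", "occupation": "surgeon", "birth_year": 1955},
]

INDIRECT_IDENTIFIERS = ("region", "occupation", "birth_year")  # assumed choice


def risky_combinations(rows, keys, min_count=2):
    """Return value combinations shared by fewer than min_count records."""
    counts = Counter(tuple(r[k] for k in keys) for r in rows)
    return [combo for combo, n in counts.items() if n < min_count]


# The single surgeon in the South is unique on these three variables.
flagged = risky_combinations(records, INDIRECT_IDENTIFIERS)
```

The threshold `min_count` is an assumption; what counts as an acceptable group size is a judgement to be made with archive staff.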
Techniques for Handling Disclosure Risk
The data collector, when considering release of the data collection as a public-use dataset, has a number of options when dealing with a variable which might act as an indirect identifier. Commonly used types of "treatment" of such variables are:
- Removal - eliminating a variable that contains direct identifiers entirely from the dataset. Remove, for example, respondents' names, addresses, postcodes and so on.
- Aggregation or reduction of the precision of a variable - reducing the detail of potentially revealing socio-demographic characteristics, such as the respondent's age and place of residence. For example, record the year of birth rather than the day, month and year.
- Bracketing - combining the categories of a coded (categorical) variable into a broader code. If using standard hierarchical codes (such as occupational codes), this process can be automated.
- Top-coding - restricting the upper range of a continuous variable (the lower range can be restricted in the same way). Salary, for example, is often top-coded to avoid identification of those with particularly high salaries.
- Collapsing and/or combining variables - merging the concepts embodied in two or more variables by creating a new summary variable. This involves generalising the meaning of a nominal string variable; for example, specific types of training or qualifications which might identify particular respondents.
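Four of the treatments listed above (removal, aggregation, bracketing and top-coding) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a recommended procedure: the records, the field names, the set of direct identifiers and the salary threshold are all invented, and bracketing is shown by truncating a hierarchical code to its first digit.

```python
# Hypothetical records; all field names and values are invented for illustration.
records = [
    {"name": "A. Smith", "postcode": "AB1 2CD", "birth_date": "1961-04-17",
     "occupation_code": "2211", "salary": 41000},
    {"name": "B. Jones", "postcode": "EF3 4GH", "birth_date": "1978-11-02",
     "occupation_code": "2314", "salary": 250000},
]

DIRECT_IDENTIFIERS = {"name", "postcode"}  # assumed list of fields to remove
SALARY_CAP = 100000                        # assumed top-coding threshold


def treat(record):
    """Return a public-use copy of one record."""
    # Removal: drop direct identifiers entirely.
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Aggregation: keep only the year of birth.
    out["birth_year"] = int(out.pop("birth_date")[:4])
    # Bracketing: truncate a hierarchical code to its first digit (broader group).
    out["occupation_code"] = out["occupation_code"][:1]
    # Top-coding: cap high salaries at the threshold.
    out["salary"] = min(out["salary"], SALARY_CAP)
    return out


public_records = [treat(r) for r in records]
```

The original records are left untouched, so an unedited version remains available for preservation or for use under restricted access.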
Other techniques, which should be carefully considered before they are implemented as they might result in some loss of analytical power for the data collection, are:
- Sampling - releasing a random sample of sufficient size to yield reasonable inferences, rather than providing all of the original data.
- "Swapping" - matching unique cases on the indirect identifier, then exchanging the values of key variables between the cases. This retains the covariate structure and analytic utility of the dataset while protecting respondent confidentiality. Swapping is a service that some archives may offer to limit disclosure risk.
- Disturbing - adding random variation or stochastic error to the variable. This retains the statistical properties between the variable and its covariates, while preventing someone from using the variable as a means for linking records.
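The three techniques above can be sketched with the Python standard library. This is an illustration only: the records are invented, the noise is assumed to be zero-mean Gaussian, and the `swap` helper exchanges values between two cases chosen by the caller, leaving the matching of cases on the indirect identifier as a separate step.

```python
import random

rng = random.Random(42)  # fixed seed only so the sketch is repeatable

# Hypothetical records; field names and values are invented for illustration.
records = [{"id": i, "region": "North" if i % 2 else "South",
            "income": 20000 + 1000 * i} for i in range(10)]


def sample(rows, n):
    """Sampling: release a simple random sample of n rows."""
    return rng.sample(rows, n)


def swap(rows, i, j, key):
    """Swapping: exchange the value of one key variable between two cases."""
    rows = [dict(r) for r in rows]  # copy so the originals are untouched
    rows[i][key], rows[j][key] = rows[j][key], rows[i][key]
    return rows


def disturb(rows, key, sd):
    """Disturbing: add zero-mean Gaussian noise to a numeric variable."""
    return [{**r, key: r[key] + rng.gauss(0, sd)} for r in rows]
```

For example, `swap(records, 0, 1, "region")` returns a copy in which the first two cases have exchanged regions, and `disturb(records, "income", 500.0)` returns a copy with perturbed incomes. How much noise is tolerable depends on the analyses the data must still support.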
Data collectors should consult with archive staff to ensure that they produce a public-use dataset that maintains the confidentiality of respondents. Archive staff will also perform an independent confidentiality review of datasets submitted to the archive and will work with the investigators to resolve any remaining problems of confidentiality.
Dealing with Confidentiality in Qualitative Data
(Based substantially on the notes produced by the UK ESDS Data Archive Qualidata Unit)
Researchers also need to ensure an appropriate level of confidentiality with qualitative material they collect. Although the first thought is often to simply anonymise the data, it might not always be the best or most effective solution. In fact, anonymisation is only one element in the larger strategy of protecting data in ways that maintain respondents' confidentiality, whether endangered by direct or indirect identifiers.
Effective editing of data, such as interview transcriptions, can involve using pseudonyms, abstract systems of coding or simply the crude removal of text. Whenever editing is done, researchers need to be aware of the potential for distorting the data. For example, deleting all identifiers from text or sound recordings is a simple but blunt tool that creates data that is confidential but also unusable.
The objective should be to achieve a reasonable level of anonymisation, which is then combined with other restrictions in order to maintain confidentiality. It is part of creating informed consent and of guarding against unrealistic or overly harsh applications of anonymisation.
With that in mind, the following basic strategies are a helpful guide for those researchers who feel some degree of anonymisation is needed:
- It is most cost-effective to apply any form of editing at the initial transcription stage.
- Whenever possible adopt a procedure of pseudonyms rather than crudely blanking out details.
- Use search and replace techniques with care as it is easy to make unintended changes.
- Retain unedited versions for use within the research team and for archival preservation.
- Agree in advance to what extent other more subtle but obvious clues to a character, place or institution will be left intact.
If the anonymisation is being carried out after transcription:
- Always ensure the system employed is consistent within the research team.
- Try to use the same pseudonyms and place names in all subsequent publications.
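The advice on pseudonyms and on careful search and replace can be illustrated with a short sketch. It assumes a team-wide mapping from real names to pseudonyms; the mapping, the sample sentence and the `pseudonymise` helper are invented for illustration. Matching whole words only, with longer names tried first, avoids the unintended changes that a plain substring replacement would make.

```python
import re

# Hypothetical mapping agreed by the research team; all names are invented.
PSEUDONYMS = {"Ann": "Mary", "Annfield": "Northtown"}


def pseudonymise(text, mapping):
    """Replace whole words only, longest names first, in a single pass."""
    names = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, names)) + r")\b")
    return pattern.sub(lambda m: mapping[m.group(1)], text)


transcript = "Ann moved to Annfield after the annual review."
edited = pseudonymise(transcript, PSEUDONYMS)

# A naive substring replacement would also alter the place name.
naive = transcript.replace("Ann", "Mary")
```

Here `edited` keeps the place name distinct from the personal name, whereas `naive` turns "Annfield" into "Maryfield". Keeping the mapping in one shared place also helps the team use the same pseudonyms consistently across transcripts and publications.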
Finally it may be noted that very often researchers presume respondents want their data destroyed or made inaccessible. In fact, informants might be quite satisfied to have their data available to additional authorised researchers if appropriate pseudonyms and other protections are provided. Researchers should not presume there is only one way to provide confidentiality.
Conditions of access to data can also be used to enhance data protection. Archives can preserve and provide access to data collections under specific conditions. Typical restrictions designed to augment anonymisation or confidentiality of research data include:
- End user licence to respect confidentiality and not to disseminate any identifying information; a standard clause affecting all users of research data. Such a written undertaking does have contractual force in law. Furthermore, the good reputation of a secondary user depends upon abiding by these undertakings.
- Restricted access to certain kinds of highly sensitive data; for example, permission from the data creator might be required to access the materials.
- Principal Investigator (PI) is responsible for all decisions.
- All actions should be consistent with ethical standards.
- Ethical issues should be considered from point of view of participant's society.
- In cases of dilemmas, consultation with professional associations or colleagues is recommended.
- Deviation from confidentiality rules implies that PI is taking greater responsibility; there is need for more outside counsel, and the need for more safeguards.
- Conduct of research should maintain integrity and protect future research.
- Selection of issues for empirical investigation should be based on best scientific judgment and must be related to important intellectual question, with humanitarian implication; there must be no other way to resolve question.
- In using human subjects, a risk-benefit analysis is advised.
- If risks exist, these, as well as potential therapeutic effects, must be justified in terms of benefit to client or patient.
- No prior reason for belief in major permanent negative effects.
- If permanent damage to participants, community or institutions within the community (e.g. indigenous social scientists) is possible, research possibly abandoned.
Conduct of Research
- Research conducted in competent fashion, as objective, scientific project.
- Research personnel qualified to use procedures employed.
- Competent personnel and adequate facilities available if drugs involved.
- No bias in design, conduct, or reporting of research - objective.
Effects on and Relationships with the Participants: Informed Consent - General
- Informed consent used in obtaining participants; investigators honour all commitments associated with agreements.
- Participants to give informed consent; otherwise given by those responsible for participants.
- Informed consent used if potential effects on participants ambiguous or potentially hazardous.
- If possible, informed consent should be obtained in writing.
- Official permission to use government data, no matter how obtained.
Provision of Information
- Purposes, procedures, and risks of research (including possible hazards) explained to participants in a way they understand.
- Participants aware of possible consequences for group or community from which selected, in advance of their decision to participate.
- The procedure used to obtain the participant's name should be described to him or her.
- Sponsorship, financial and otherwise, should be specified to the potential participants.
- Identity of those conducting research fully revealed to potential participants.
- Names and addresses of research personnel left with participants; research personnel subsequently traceable.
- Participants fully aware of all data gathering techniques, capacities of such techniques, and the extent to which participants will remain anonymous and data confidential.
- In longer projects, participants periodically informed of the progress of research.
- When recording videotapes or film, subjects should have the right to approve the material made public (by viewing it and giving specific approval to each segment) as well as the nature of the audiences.
Voluntary Consent
- Individuals know of option to refuse to participate.
- Participants able to terminate involvement at any time and know of the option.
- No coercion, explicit or implicit, used to encourage individuals to participate.
Protection of Rights and Welfare of Participants - General Issues
- Dignity, privacy and interests of participants respected and protected.
- Participants not harmed; welfare of participants a priority over all other concerns.
- Damage and suffering to participants minimised through procedural mechanisms and termination of risky studies as soon as possible; such effects justified only when problem not studiable in any other fashion.
- Potential problems anticipated, no matter how remote, to ensure that unexpected difficulties do not lead to major negative effects.
- Any harmful aftereffects should be eliminated.
- Hopes or anxieties of potential participants not raised.
- Research terminated if danger to the participants arises.
- Use of clients seeking professional assistance for research purposes is justified only to the extent that they may derive direct benefits as clients.
- Deceit of participants acceptable only if absolutely necessary and there is no other way to study a problem.
- Deception may be utilised. But, if deceit is involved, additional precautions will be needed to protect the rights and welfare of participants.
- After a study using deception, all participants must be given a thorough, complete and honest description of the study and need for deception.
- If deception is not revealed, for humane or scientific reasons, investigator has special obligation to protect the interests and welfare of participants.
- Research data confidential and all participants anonymous, unless they (or their legal guardians) have given permission for release of their identity.
- If confidentiality or anonymity not guaranteed, participants should be aware of this and its possible consequences before involvement.
- Persons in official positions (studied as part of a research project) should provide written descriptions of their official roles, duties and so forth (which need not be treated as confidential information) and be provided with a copy of the final report.
- Studies designed to provide descriptions of aggregates or collectivities should always guarantee anonymity to individual respondents.
- 'Privacy' considered from perspective of participant and the participant's culture.
- Material stored in data banks should not be used without the permission of the investigator who originally gathered the data.
- If promises of confidentiality are honoured, investigators need not withhold information on misconduct of participants or organisations.
- Specific procedures for organising data to ensure anonymity of participants.
Benefits to Participants
- Fair return for all services of participants.
- Increased self-knowledge, as a benefit to the participants, should be incorporated as a major part of the research design or procedures.
- Copies or explanations of research provided to all participants.
- Studies of aggregates or cultural subgroups should produce knowledge which will benefit them.
Effects on Aggregates or Communities
- Investigators should respect and be familiar with the host cultures, including their own.
- Investigators should cooperate with members of host society.
- Investigators should consider, in advance, the potential effects on the social structure of the host community and potential changes in influence of various groups/individuals by the study.
- Investigators should consider, in advance, the potential effects of the research and the report on the population or subgroup from which participants are drawn.
- Participants aware, in advance, of potential effects upon aggregates or cultural subgroups which they represent.
- Interests of collectivities and social systems of all kinds considered by the investigator.
Interpretations and Reporting of the Results of the Research
- All reports and public documents should be freely available to all.
- Research procedures described fully and accurately in reports, including all evidence regardless of the support it provides for the research hypotheses; conclusions objective and unbiased.
- Full and complete interpretations provided for all data and attempts made to prevent misrepresentations in writing research reports.
- Sponsorship, purpose, sources of financial support, and investigators responsible for the research should be made clear in all publications related thereto.
- If publication may jeopardise or damage population studied, and complete disguise is impossible, publication delayed.
- Cross-cultural studies published in language and journals of host society, in addition to publication in other languages and other societies.
- Appropriate credit given to all parties contributing to the research.
- Full, accurate disclosure of all published sources bearing on or contributing to the work is expected.
- Publication of research findings on cultural subgroups to include description in terms understood by participants.
- Whenever requested, raw data or other original documentation made available to qualified investigators.
- Research with scientific merit always submitted for publication and not withheld from public presentation unless quality of research or analysis inadequate.