Protocol for Qualitative Data Analysis using AI

Jewlya Lynn
Feb 7, 2024

Lately I keep hearing about researchers and evaluators testing out AI by analyzing their qualitative data using one platform or another. I fully embrace the potential of AI, and I also recognize that, as with any innovation, we must use it in the right ways.

I can’t help but cringe when I hear of researchers not asking for permission from interviewees or clients to use an online AI platform to analyze their data; not vetting the platforms for their privacy practices or analytical capabilities; having little familiarity with how to prompt in productive ways; not using any type of protocol or systematic practice to support the analysis; not disclosing use; and/or not engaging in any meaningful quality control.

Let’s make it easier to use AI in ways that match the integrity we bring to the rest of our research. Below is a step-by-step protocol for engaging AI in your analytical process. It is not the be-all and end-all, but rather a starting place for those just getting their feet wet. A few disclosures:

This protocol was written by an AI (ChatGPT-4 Turbo) on 12.2.23 in response to the following prompt: “Using what you’ve reviewed and summarized, can you please generate a step by step protocol for how a researcher can integrate AI into their qualitative analysis, beginning in the design phase of the study, including the consent process with interviewees, preparing the AI to analyze, data cleaning, data coding, data analysis, data interpretation, writing up results, and quality control steps throughout? Please consider steps that help ensure the data is fully analyzed across questions (not just themed within) and the AI is guided as best as possible to look for nuance.”

The prompt was the final step of a longer conversation with the AI to investigate currently known best practices in the use of AI in the analysis of interviews specifically. I carefully reviewed the protocol and revised it in a couple of places due to incomplete answers. These are indicated with [Revised] and [Added].

Protocol for Integrating AI into Qualitative Analysis

Study Design Phase

1. Define Research Questions and Objectives:

  • Ensure that the research questions are suitable for AI-assisted analysis.
  • Determine what types of data AI can analyze effectively within your framework.

2. Select Appropriate AI Tools:

  • Choose AI platforms based on their capabilities, data security standards, and compliance with ethical guidelines.
  • Consider tools with features for coding, theme identification, and nuanced analysis.

3. Pilot Test AI Tools:

  • Conduct a pilot study to test the chosen AI tools on a smaller dataset.
  • Practice and refine your approach to prompting.
  • If appropriate, train the AI system on example datasets similar to what you’ll be collecting.

Participant Consent Process

4. Inform Participants and Obtain Consent:

  • Update consent forms to disclose the use of AI in data analysis.
  • Explain the implications, including how data will be stored, processed, and the extent of human oversight.
  • Ensure participants have an opportunity to ask questions about the AI’s role.
  • Obtain written or recorded verbal consent that acknowledges the use of AI.

Data Preparation

5. Data Collection and Transcription:

  • Transcribe interviews, focus groups, or other qualitative data accurately.
  • Anonymize any identifiable information to protect participant privacy.
  • [Added] Familiarize yourself with the data sufficiently to be able to identify when the AI is not providing appropriate codes or findings.

6. Data Cleaning:

  • Remove irrelevant content, resolve inconsistencies, and correct errors in transcriptions, or be prepared to prompt the AI to ignore irrelevant content (e.g., transcripts with consent language at the beginning).
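
As one illustration of the anonymizing and cleaning steps above, here is a minimal Python sketch that drops a consent preamble and redacts a few common identifiers before anything is uploaded to an AI platform. The function names, the regular expressions, and the idea of marking the start of the interview with a known phrase are all assumptions made for this example; they are not part of the original protocol and will not catch every identifier.

```python
import re

# Hypothetical patterns: adjust them to the identifiers that occur in your transcripts.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def redact(text, names=()):
    """Replace emails, US-style phone numbers, and listed names with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    for name in names:
        # Whole-word, case-insensitive match on each name supplied by the researcher.
        text = re.sub(rf"\b{re.escape(name)}\b", "[NAME]", text, flags=re.IGNORECASE)
    return text


def drop_preamble(transcript, start_marker):
    """Drop every line before the first line containing start_marker."""
    lines = transcript.splitlines()
    for i, line in enumerate(lines):
        if start_marker in line:
            return "\n".join(lines[i:])
    return transcript  # marker not found: leave the transcript unchanged


raw = (
    "Interviewer: Do you consent to recording?\n"
    "Maria: Yes.\n"
    "Interviewer: Q1. Tell me about your work.\n"
    "Maria: Email me at maria@example.org or call 555-123-4567."
)
cleaned = redact(drop_preamble(raw, "Q1."), names=["Maria"])
print(cleaned)
```

Pattern-based redaction like this is only a first pass. A human read-through is still needed to catch indirect identifiers such as job titles, places, or distinctive stories.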

7. Prepare AI for Analysis:

  • Input cleaned data into the AI platform.
  • Provide the AI with initial sample codes or prompts that introduce the purpose of the research to guide its learning.

Data Coding

8. AI-Assisted Coding:

  • [Revised] Use AI tools to generate initial codes. Clarify first what you mean by codes, e.g. “Please code the data using qualitative coding techniques, assigning a code to each excerpt, which can vary in length from a phrase to a few sentences. Codes can be used multiple times.”
  • [Added] You may also want to provide clarity on the types of content you want the coding to pay attention to. For example, if you are interested in looking at causal relationships, you might instruct it to, “Assign codes that represent relationships between different parts of the data, including when the respondent is indicating that one thing led to another (cause and effect).”
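
If you reuse the same coding instructions across many transcripts, it can help to assemble the prompt in a consistent way each time. The sketch below does that in Python using the two example instructions quoted above. The function name `build_coding_prompt` and the layout of the prompt (purpose first, instructions next, transcript last) are assumptions for illustration only; the sketch builds text and does not call any AI platform.

```python
# The two instruction strings are the example prompts quoted in step 8.
CODING_INSTRUCTIONS = (
    "Please code the data using qualitative coding techniques, assigning a code "
    "to each excerpt, which can vary in length from a phrase to a few sentences. "
    "Codes can be used multiple times."
)
CAUSAL_INSTRUCTIONS = (
    "Assign codes that represent relationships between different parts of the data, "
    "including when the respondent is indicating that one thing led to another "
    "(cause and effect)."
)


def build_coding_prompt(purpose, transcript, focus_instructions=()):
    """Assemble one prompt: research purpose, coding rules, optional focus, transcript."""
    parts = [f"Research purpose: {purpose}", CODING_INSTRUCTIONS]
    parts.extend(focus_instructions)
    parts.append("Transcript:\n" + transcript)
    return "\n\n".join(parts)


prompt = build_coding_prompt(
    purpose="Understand how participants experience shifts in economic stability.",
    transcript="P1: After the plant closed, my hours were cut and we had to move.",
    focus_instructions=[CAUSAL_INSTRUCTIONS],
)
print(prompt)
```

Keeping the instructions in named constants also makes it easier to record exactly which wording was used in each run.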

9. Refinement of Codes:

  • [Added] Make sure to request the full list of codes and learn how they were applied. Some platforms are designed to do this automatically. Not all are. You can ask it to generate the list of codes first, or generate the list of codes with associated excerpts. At this point, you need to review AI-generated codes for accuracy and relevance.
  • Manually go through a subset of data to ensure codes and coded data align with the research questions.
  • Adjust AI coding settings and re-run coding if necessary.
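
Once the AI has returned its codes with associated excerpts, a small script can organize them for the review described above. This Python sketch assumes you have already copied the output into a list of (excerpt, code) pairs; that format, the function names, and the 20% default sample size are illustrative assumptions rather than anything prescribed by the protocol.

```python
import random
from collections import defaultdict


def build_codebook(coded_excerpts):
    """Group excerpts under each code. coded_excerpts is a list of (excerpt, code) pairs."""
    codebook = defaultdict(list)
    for excerpt, code in coded_excerpts:
        codebook[code].append(excerpt)
    return dict(codebook)


def review_sample(coded_excerpts, fraction=0.2, seed=0):
    """Draw a reproducible random subset of coded excerpts for manual checking."""
    if not coded_excerpts:
        return []
    k = min(len(coded_excerpts), max(1, round(len(coded_excerpts) * fraction)))
    # A fixed seed means the same subset can be re-drawn and audited later.
    return random.Random(seed).sample(coded_excerpts, k)


coded = [
    ("I lost my job after the plant closed", "job loss"),
    ("Rent went up so we moved", "housing cost"),
    ("The plant closing hit everyone", "job loss"),
    ("My hours were cut", "reduced income"),
    ("We skipped meals", "food insecurity"),
]
for code, excerpts in build_codebook(coded).items():
    print(f"{code}: {len(excerpts)} excerpt(s)")
for excerpt, code in review_sample(coded, fraction=0.4):
    print(f"CHECK: [{code}] {excerpt}")
```

A random subset avoids checking only the excerpts that happen to come first, which may not be representative of the whole dataset.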

Data Analysis

10. Initial Analysis:

  • Use AI to identify common themes, guided by prompts like “What are the prevailing sentiments about the new policy among participants?”
  • Prompt the AI to look across different questions for patterns with commands like, “Analyze responses to understand if perceptions of policy impacts are influenced by demographic factors” or [Added] “Can you please make sense of how a person experiences shifts in economic stability, looking at the relationship between different codes (not just a summary of codes) and the codes related to cause and effect?”

11. Nuance and Context:

  • Input specific queries to capture context and nuance, such as “Highlight any instances where participants express mixed feelings about the policy.”
  • [Added] Input queries that look not just for themes, but also for interesting exemplars and outliers. For example, “Highlight examples of when participants hold a contrasting view to most other participants. Explain what is contrasting.”

12. Review AI Analysis:

  • Carefully review the AI’s work, keeping an eye out for any missed nuances or misinterpretations.

Data Interpretation

13. Human Judgment:

  • Integrate human expertise to explore and finalize themes and contextualize findings within the theoretical framework of the study.

Writing Up Results

14. Drafting Results:

  • Use AI insights alongside human interpretation to write the results section, ensuring a coherent narrative.
  • Clearly document the specific contributions of AI analysis.

Quality Control Steps

15. Consistency Checks:

  • Conduct consistency checks between the AI’s analysis and manual coding performed by researchers.
  • Make any necessary adjustments or provide additional training data to the AI.
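
One possible way to run the consistency check above is to have a researcher manually code the same excerpts the AI coded and then compute an agreement statistic. The Python sketch below uses Cohen's kappa, which is a choice made for this example and is not named in the protocol. It assumes one code per excerpt and two lists in the same excerpt order; the function names are hypothetical.

```python
from collections import Counter


def cohens_kappa(human_codes, ai_codes):
    """Cohen's kappa for two coders who each assigned one code per excerpt."""
    if len(human_codes) != len(ai_codes) or not human_codes:
        raise ValueError("Both code lists must be non-empty and the same length.")
    n = len(human_codes)
    # Observed agreement: share of excerpts where both coders chose the same code.
    observed = sum(h == a for h, a in zip(human_codes, ai_codes)) / n
    # Expected agreement by chance, from each coder's own code frequencies.
    human_freq = Counter(human_codes)
    ai_freq = Counter(ai_codes)
    expected = sum(human_freq[c] * ai_freq[c] for c in human_freq) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical code throughout
    return (observed - expected) / (1 - expected)


def disagreements(excerpts, human_codes, ai_codes):
    """List (excerpt, human code, AI code) for every excerpt coded differently."""
    return [(e, h, a) for e, h, a in zip(excerpts, human_codes, ai_codes) if h != a]


excerpts = ["E1", "E2", "E3", "E4"]
human = ["job loss", "job loss", "housing cost", "housing cost"]
ai = ["job loss", "job loss", "housing cost", "job loss"]
print(round(cohens_kappa(human, ai), 2))
print(disagreements(excerpts, human, ai))
```

The list of disagreements is often more useful than the single number, since it shows exactly which excerpts to revisit when adjusting prompts or codes.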

16. Iteration and Feedback:

  • Use an iterative approach to refine AI contributions to the analysis.
  • Document all iterations of the process and the rationale behind changes made based on AI insights and researcher judgments.
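
Documenting iterations does not require special software, but a simple structured log can make the record easier to share alongside your findings. The following Python sketch is one hypothetical way to do it; the field names (`prompt`, `change`, `rationale`) and the JSON output are assumptions for illustration, and a spreadsheet or memo would serve the same purpose.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import date


@dataclass
class Iteration:
    """One round of AI-assisted analysis and why it was changed."""
    number: int
    prompt: str
    change: str
    rationale: str
    logged_on: str = field(default_factory=lambda: date.today().isoformat())


def append_iteration(log, prompt, change, rationale):
    """Add the next numbered entry to the log and return it."""
    entry = Iteration(len(log) + 1, prompt, change, rationale)
    log.append(entry)
    return entry


def to_json(log):
    """Serialize the whole log so it can be archived with the study files."""
    return json.dumps([asdict(entry) for entry in log], indent=2)


log = []
append_iteration(
    log,
    prompt="Please code the data using qualitative coding techniques...",
    change="Initial coding run.",
    rationale="Baseline.",
)
append_iteration(
    log,
    prompt="Assign codes that represent relationships between different parts of the data...",
    change="Added cause-and-effect instruction.",
    rationale="First run summarized topics but missed causal links.",
)
print(to_json(log))
```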

Concluding Thoughts and Resources

The protocol offers a starting point and some guardrails as you begin exploring the use of AI to engage in qualitative analysis.

However, I recommend researchers go much further on their learning journeys before digging deep into the use of AI in their work. Among other things, it is important to understand the limitations of AI, how to navigate security concerns, and what any given AI can do best and doesn’t do well.

Some articles I have found helpful:

Exploring the Use of Artificial Intelligence for Qualitative Data Analysis: The Case of ChatGPT (2023) by David L. Morgan

What influences what? Using AI to turn stakeholders’ stories into causal maps, rapidly and at scale (2023) by Steve Powell and Gabriele Caldas Cabral

Written by Jewlya Lynn

Jewlya Lynn is a facilitator, advisor, and researcher who works with leaders dedicated to making a difference in the world by solving complex problems.