From 5,000 Documents to Clear Insights: WCER Researchers Share Lessons From Building Their Own AI Assistant

June 17, 2026   |   By Karen Rivedal, Office of Research & Scholarship

From left, Caleb Probst, Mark Miller and Rich Halverson, the builders of an AI-powered research tool

WCER researchers recently offered a behind-the-scenes look at why and how they built a Gemini‑based AI tool to help them make sense of a vast array of complex data collected over five years as part of a national project to develop equity-centered school leadership.

In a spring presentation led by now-retired principal investigator Rich Halverson, postdoctoral researcher Caleb Probst, and developer Mark Miller, the team walked colleagues through the tool’s development and showed how it helps researchers work more efficiently with the project’s large dataset while protecting the information.

“The scope of the data is vast, and the team of researchers is large,” Probst said. “Data collection and analysis are often done by different groups of people.”

The AI-powered research tool, Probst added, “brings all the data together to support the team’s various lines of inquiry. It does not replace the intellectual work of researchers. Rather, it’s like having a helper who can read thousands of pages in a matter of seconds.”

The research project, titled Comprehensive Assessment of Leadership Learning – Equity-Centered Leadership (CALL-ECL), is in its fifth year and has involved 70 researchers (including more than 50 graduate and undergraduate students) from multiple institutions. The seven-year study, funded by The Wallace Foundation, has been documenting how eight large urban school districts are building pathways to develop principals trained to better assist historically underserved students.

The CALL-ECL study includes two major types of data: social network surveys that track collaboration across districts, and qualitative data from interviews, field notes, and artifacts. These materials total just over 5,000 documents.

Halverson explained the team needed a way to make the full dataset accessible to everyone tracking district collaborations for the national project. Data integration was also a goal, he noted — linking qualitative data with the network survey information so researchers could see how people were connected and how they participated in the work. District partners in the national project include universities, state education agencies and community organizations.

Beyond getting the technical details right, building a successful AI tool starts with assembling an interdisciplinary team whose members bring the different skills the work requires, as Halverson, Probst and Miller did by joining forces.

“Working together, you too can do this,” Probst told his colleagues in the presentation.

Key Questions to Ask

Halverson’s team said researchers who want to build an AI-powered research tool like theirs — which they titled “Chat/ECL” — should keep some core concepts in mind. They should ask themselves:

Why might you need it? Large qualitative datasets (for example, those with thousands of documents) can become unwieldy. An AI-powered tool can read across a large body of data to identify themes and patterns, or review a team’s analysis to see whether anything was missed.

Who should lead the design? It’s best to assemble a design team with complementary expertise related to project goals, research priorities, data characteristics, programming, and AI development. People with AI skills could be as close as the student research assistants on your team, depending on their training. Campus offices, including the Data Science Institute, can also provide connections to AI-trained personnel.

How do you build it? Ideally, the design team convenes regularly and continuously improves the tool until it meets the needs of the whole research team. User testing with members of the research team who know the data well is a key part of the process.

What are three things for tool designers to keep in mind?

  • Zero data retention. Requesting this from the AI platform helps ensure ethical handling of participant data because it guarantees participant comments will never be used for AI training.
  • Hallucinations/mistakes sometimes happen. To mitigate this, design the AI tool to cite the original data sources.
  • Thoroughness. AI usually works by sampling data, but this is insufficient for research. Make sure that the tool reads across the entire dataset.
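The citation and thoroughness points above can be illustrated with a minimal Python sketch. It is not the Chat/ECL implementation: the function names, the batch size, and the keyword match (a stand-in for whatever model call a real tool would make) are all assumptions chosen for illustration. The idea is simply to walk every document in batches instead of sampling, and to return the source IDs so each finding can be traced back to the original data.

```python
from typing import Iterator


def batches(doc_ids: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of doc_ids so that no document is skipped."""
    for start in range(0, len(doc_ids), size):
        yield doc_ids[start:start + size]


def scan_all(docs: dict[str, str], keyword: str, batch_size: int = 100):
    """Check every document (hypothetical helper) and return citable source IDs.

    The keyword match below is only a placeholder for a model call.
    """
    hits: list[str] = []
    seen = 0
    for batch in batches(sorted(docs), batch_size):
        for doc_id in batch:
            seen += 1
            if keyword.lower() in docs[doc_id].lower():
                hits.append(doc_id)  # keep the ID so the claim can be cited
    # Coverage check: the scan must have touched the whole dataset.
    assert seen == len(docs)
    return hits, seen
```

Returning the count of documents read alongside the matches gives users a simple way to confirm that the whole dataset, not a sample, sat behind an answer.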

Resources: UW–Madison faculty and staff looking to build AI tools should start by consulting the Generative AI Services Hub offered by UW–Madison’s Division of Information Technology (DoIT). The hub provides access to secure environments, data usage policies, and development resources such as research/collaboration options and compliance/security information. Another good overview is the DoIT resource, “Smart Choices: Why UW–Madison’s Enterprise AI Tools Should Be Your Go-To.”

Before building, it is also critical to know your data classification. You can review the UW–Madison Data Security Policy to ensure your AI tool complies with campus rules regarding public, internal, sensitive, and restricted (for example, student records or HIPAA) data.

How Halverson’s Team Built and Refined the Tool

Miller and Probst began building their Gemini-based AI tool in the summer of 2025 and released it to the full research team in the spring of 2026. Importantly, each brought different skills the project needed.

Probst is a qualitative researcher who understands how other researchers would use the data. Miller, an undergraduate research assistant who graduated this spring, is a software developer who can write code and work with large language models, a type of AI system trained on huge datasets to process and generate text.

The pair met weekly to progressively develop the tool’s design, with Halverson checking in as needed. Other members of the research team provided user testing, which Probst and Miller then used to calibrate the tool. Additional help came from four students from the Wisconsin Undergraduate Research in Data Science (WISCURDS) program, a new resource available to researchers from the Data Science Institute.

Early versions of the tool were fast but unreliable.

“It was very confident in everything that it was saying, and it totally lied,” Probst said, recalling how the system once confused district names and “fixated on really small details and then extrapolated these grand themes.”

To address this, Miller redesigned the tool’s internal process so it would slow down, search more systematically, and show its work. “I’m willing to wait 90 seconds for an answer I can trust,” Probst said.

Miller explained that the improved version of their AI tool now generates a plan, conducts multiple searches across both raw and coded documents, and only produces an answer when it has gathered enough evidence. It also indicates how confident it is and provides citations that link directly back to the original coded segments and documents. The team emphasized that the tool’s transparency helps researchers check their assumptions, surface overlooked documents, and navigate the dataset more easily.
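The workflow Miller described (plan, search repeatedly, answer only with enough evidence, report confidence, cite sources) can be sketched in a few lines of Python. This is an illustrative outline under stated assumptions, not the team’s code: every name here (`Document`, `Answer`, `make_plan`, `search`, `answer_question`, `min_evidence`) is hypothetical, and simple keyword matching stands in for the Gemini calls a real tool would make.

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    text: str


@dataclass
class Answer:
    text: str
    confidence: str
    citations: list[str]


STOPWORDS = {"how", "do", "the", "a", "in", "of", "and", "what", "is", "are"}


def make_plan(question: str) -> list[str]:
    """Planning step (placeholder): turn the question into search terms."""
    words = [w.strip("?.,").lower() for w in question.split()]
    return [w for w in words if w and w not in STOPWORDS]


def search(docs: list[Document], term: str) -> list[Document]:
    """Search step (placeholder): return documents that mention the term."""
    return [d for d in docs if term in d.text.lower()]


def answer_question(question: str, docs: list[Document], min_evidence: int = 2) -> Answer:
    """Run one search per planned term and answer only with enough evidence."""
    evidence: dict[str, Document] = {}
    for term in make_plan(question):
        for doc in search(docs, term):
            evidence[doc.doc_id] = doc  # de-duplicate by source ID
    citations = sorted(evidence)
    if len(evidence) < min_evidence:
        # Decline to answer instead of extrapolating from thin evidence.
        return Answer("Not enough evidence to answer.", "low", citations)
    confidence = "high" if len(evidence) >= 2 * min_evidence else "medium"
    return Answer(f"Found {len(evidence)} relevant documents.", confidence, citations)
```

In this sketch the evidence threshold is what trades speed for trust: the function returns an explicit "not enough evidence" result instead of a confident guess, and every answer carries the IDs of the documents behind it.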

The design team kept its research data secure by building Chat/ECL on Google Gemini under UW–Madison’s license with Google, which permits only public or internal data and bars Google from using users’ prompt data to train its AI models. Chat/ECL uses only internal data, and under the Google contract, no words from research participants are used in Google’s AI training.

The team plans to continue refining the tool and is open to sharing documentation with colleagues who may want to adapt the approach for their own projects.

About the Comprehensive Assessment of Leadership Learning – Equity-Centered Leadership (CALL-ECL)

CALL-ECL is studying urban school district efforts to prepare equity-centered school leaders. The project is documenting the development of principal development pipelines, tracing the growth of professional networks, and developing data tools to support the practices of equity-centered leaders in schools. Learn more at call-ecl.wcer.wisc.edu.

For questions about CALL-ECL’s AI-powered research tool, contact Caleb Probst. You can also learn more from the presenters’ slide deck: Developing Chat/ECL.