Developing Trustworthy AI

When we talk about AI being “trustworthy,” it is easy to assume that trustworthiness is something engineers can simply build into a system by making it accurate, consistent, or error-free. While those technical qualities matter, they don’t capture everything important about an AI model. Trustworthiness is a human concept: it is not a property of the AI itself, but a judgment made by the people who use it. That means trustworthiness is both perceptual (i.e., shaped by how a user experiences and evaluates a tool) and context-dependent, varying with who is using the AI system, what decisions are at stake, and the specific situation they find themselves in. A weather forecaster working under time pressure, for example, might judge an AI model by criteria very different from those of a policymaker reviewing long-term projections of how flood hazards will change.

This distinction matters because much of AI development focuses on the model itself, while the people who use the system and place trust in it often receive less attention. If a tool is built to high technical standards but users don’t understand it, can’t interpret it, or don’t feel confident that it will perform well in the situation they are working in, they are unlikely to consider it trustworthy or to use it.

Figure: Conceptual representation of the relationship among reviewed trust and trustworthiness literatures from Wirz et al. 2025.

Researchers at NSF NCAR are working to close the gap between systems and their users. In close collaboration with university partners, stakeholders, and domain experts, they are studying how real users perceive and evaluate AI models. Their findings about how different users establish trust in these systems inform practices for designing new tools that are driven by user needs and reflect user preferences. The research also informs the development of methods to make AI more transparent and reproducible, and directs future development toward user experience. The goal is to ensure that as AI becomes more embedded in high-stakes fields like weather and hazard prediction, the people relying on these tools have both good reason to trust the systems and the conditions they need to do so.

For more information or for partnership opportunities, please contact Julie Demuth.

Papers published by NSF NCAR researchers on the topic of trustworthiness: