PhD Thesis Defense: Aimen Gaba, Design and User Identity in Shaping Perception and Trust of AI Systems
Speaker:
Aimen Gaba
Abstract:
Data-driven systems that use machine learning are becoming increasingly embedded in high-stakes domains, including healthcare, hiring, and the criminal justice system. Yet, they routinely exhibit biases that cause real harm. They can generate stigmatizing language about marginalized groups and encode discrimination into consequential decisions. A growing body of work has focused on detecting and mitigating these biases at the algorithmic level, but a critical question remains underexplored: when does AI bias actually become visible to the humans who interact with these systems? My dissertation argues that how people perceive and interpret bias in AI outputs is shaped by how the outputs are presented and who is interpreting them.
I present three empirical studies, each isolating a different point in the human-AI interaction loop where bias legibility is shaped. First, I characterized the visualization design space for data comparison tasks with natural language interfaces. Through a design study with visualization experts and novices, followed by a crowdsourced preference study, I showed that the cardinality and concreteness of a comparison query systematically influence which visualization designs users prefer, a factor that current natural language interface designs do not account for. Second, I investigated how visual design choices affect people's trust in ML models and their perception of model fairness. Across three controlled experiments with over 1,500 participants, I found that presenting model performance as text rather than as bar charts increases the likelihood that people prioritize fairness. Further, women and men weigh accuracy-fairness trade-offs differently when given identical visual information, and explicitly labeling a model as biased has a stronger effect on trust than showing its biased history. These findings demonstrate that visualization design can shape what people attend to and how they reason. Third, I examined how gender identity shapes the perception of bias in large language model outputs. Through 25 in-depth interviews with participants across gender identities, I found that non-binary and women participants consistently identified harms, including condescension, identity erasure, and stereotyping, in ChatGPT outputs that cisgender men characterized as neutral or realistic. These differential perceptions reflect knowledge derived from lived experience with systems that have historically erased or stereotyped their identities, knowledge that technical fairness metrics and aggregate user feedback are not designed to capture.
Taken together, these studies provide empirical evidence that how people perceive and respond to bias in AI systems is shaped by forces beyond the algorithm: the language used to query systems, the visual design of outputs, and the identity of the person interpreting them. This work offers practical design implications for visualization designers, AI practitioners, and fairness researchers seeking to build AI systems that are more transparent and trustworthy for diverse populations. Making AI systems genuinely fair benefits from treating these human factors as a site of intervention rather than an afterthought.
Advisor:
Cindy Xiong