PhD Dissertation Proposal: Zhanna Kaufman, Measuring and Increasing Trust in Software Systems
Speaker:
Zhanna Kaufman
Abstract:
As software permeates nearly every aspect of our lives, knowing when and how much to trust that software is paramount. This dissertation develops methods for accurately measuring user trust in software and creates automated techniques for making software more trustworthy. Software has long performed many of society's most high-impact, computation-intensive, and mission-critical operations. With the significant recent improvement in machine-learning (ML) algorithms, software has also become a daily decision assistant, aiding users with both essential and mundane tasks. Unfortunately, software is not flawless: many instances of its failure in recent history have led to catastrophic outcomes for its users, including loss of property, loss of privacy, and even loss of life. Furthermore, software has been shown to discriminate, benefiting some users while harming others. Examples include facial recognition software with higher error rates for darker skin tones, healthcare software that underestimated Black patients' health risks, and job application software that excessively rejected older applicants. Before relying on a program for decision making or task completion, users must trust the program not to fail them; they must believe it to be accurate, beneficial, and fair. We investigate trust in software from two perspectives:
- What makes people trust software? We analyze how accuracy, benefit to the user, and bias affect user trust in ML models. We first identify the strategies people use when deciding whether to trust biased ML models of varying accuracy, both when this bias favors them and when it does not. We next investigate whether improving user comprehension of a biased ML model with the help of explainability visualizations impacts user trust, identifying causal relationships between model comprehension, trust, and perception of bias.
- How can we make software more trustworthy? We create tools that make software functionality match developer intent by ensuring that outcomes match specifications. We first propose new work to assist developers in writing specifications by incorporating natural-language comments generated by large language models. Then, we present tools that improve existing methods of automated formal verification. We present QEDCartographer, a proof-synthesis tool that uses reinforcement learning to make ML-driven proof-script synthesis via tree-style tactic search more efficient. Finally, we present ProofCoop, a tool that enables multiple ML models to collaborate during this same tree-style tactic search.
Advisor:
Yuriy Brun