Human-Centered AI course

This course explores what HCI knowledge and methods can bring to the study, design, and evaluation of AI systems, with particular emphasis on the human, social, and ethical impact of those systems. Students will read papers and engage in discussion around the three main phases of a human-centered design process as it applies to AI systems: 1) needs assessment, 2) design and development, and 3) evaluation. Across these three phases, students will learn what needs assessment might look like for AI systems, how those systems might be prototyped, and what HCI methods for real-world evaluation can teach us about evaluating AI systems in their context of use. The course will also cover challenges unique to AI systems, such as understanding and communicating technical capabilities and recognizing and recovering from errors.

Learning Goals

This course has the following learning goals:

  • Foundational knowledge:
    • The human-centered design process
    • How that process applies to AI systems
    • The capabilities, limitations, and potential for harm of AI systems
    • Methods for studying the human factors related to AI systems
  • Skills:
    • Applying the human-centered design process to new problems
    • Identifying appropriate methods to study new questions
    • Understanding a new paper’s contribution and the type of knowledge it offers
    • Analyzing the validity of a paper’s methods and appropriateness of its evaluation
  • Learning how to learn:
    • How to read a paper
    • Note-taking
    • Keeping track of the literature
    • How to find relevant papers

Syllabus

1/17/2023
As We May Think. Vannevar Bush. 1945

Optional:
Augmenting Human Intellect: A Conceptual Framework. Douglas C. Engelbart. 1962
1/24/2023: Foundations for Independent Community Rooted Research

Guest lecture: Timnit Gebru (DAIR)
The Exploited Labor Behind Artificial Intelligence. Adrienne Williams, Milagros Miceli, and Timnit Gebru. Noema 2022

Constructing a Visual Dataset to Study the Effects of Spatial Apartheid in South Africa, Raesetje Sefala, Timnit Gebru, Luzango Mfupe, Nyalleng Moorosi, Richard Klein, NeurIPS 2021  

What’s at Stake: Characterizing Risk Perceptions of Emerging Technologies. Skirpan, Michael Warren, Tom Yeh, and Casey Fiesler. CHI 2018.  

Optional:
AI Now report, 2019
On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜. Emily Bender, et al. FAccT 2021
Gender Shades: Intersectional accuracy disparities in commercial gender classification. Joy Buolamwini and Timnit Gebru. FAT* 2018 (PMLR)
1/31/2023: Needfinding in HCI
About Face, chapter 2: Understanding the Problem

Imagining new futures beyond predictive systems in child welfare: A qualitative study with impacted stakeholders. Logan Stapleton, et al. FAccT 2022.  

WeBuildAI: Participatory framework for algorithmic governance. Lee, Min Kyung, et al. CSCW 2019

Optional: 
Human-Centered Machine Learning. Jess Holbrook. Medium post
Value-sensitive algorithm design: Method, case study, and lessons. Haiyi Zhu et al. CSCW 2018
The model card authoring toolkit: Toward community-centered, deliberation-driven AI design. Hong Shen et al. FAccT 2022.
2/7/2023: Mental models

Guest lecture: Zahra Ashktorab (IBM)
Beyond accuracy: The role of mental models in human-AI team performance. Gagan Bansal et al. AAAI HCOMP 2019
 
Towards mutual theory of mind in human-AI interaction: How language reflects what students perceive about a virtual teaching assistant. Qiaosi Wang et al. CHI 2021
 
Mental models of AI agents in a cooperative game setting. Gero, Katy Ilonka, et al. CHI 2020
 
Human-AI collaboration in a cooperative game setting: Measuring social perception and outcomes. Ashktorab, Zahra, et al. CSCW 2020
 
Optional: 
Understanding user beliefs about algorithmic curation in the Facebook news feed. Emilee Rader and Rebecca Gray. CHI 2015
“I always assumed that I wasn’t really that close to [her]”: Reasoning about Invisible Algorithms in News Feeds. Eslami, Motahhare, et al. CHI 2015
Monsters, metaphors, and machine learning. Graham Dove and Anne-Laure Fayard. CHI 2020
“Like Having a Really Bad PA”: The Gulf between User Expectation and Experience of Conversational Agents. Ewa Luger and Abigail Sellen. CHI 2016
2/14/2023: AI constraints and assumptions
Jury learning: Integrating dissenting voices into machine learning models. Gordon, Mitchell L., et al. CHI 2022

Modeling assumptions clash with the real world: Transparency, equity, and community challenges for student assignment algorithms. Robertson, Samantha, Tonya Nguyen, and Niloufar Salehi. CHI 2021.  

Optional:
A human-centered review of algorithms used within the US child welfare system. Devansh Saxena et al. CHI 2020
2/21/2023: Design and development
Agency plus automation: Designing artificial intelligence into interactive systems. Jeffrey Heer. PNAS 2019

Augmenting Pathologists with NaviPath: Design and Evaluation of a Human-AI Collaborative Navigation System. Gu, Hongyan, et al. CHI 2023

Guidelines for human-AI interaction. Amershi, Saleema, et al. CHI 2019

Cognitive Artifacts, Don Norman, 1991

Optional:
Re-examining whether, why, and how human-AI interaction is uniquely difficult to design. Yang, Qian, et al. CHI 2020

People + AI Research (PAIR) guidebook
Toward General Design Principles for Generative AI Applications. Weisz, Justin D., et al. arXiv preprint arXiv:2301.05578 (2023)
The biggest bottleneck for large language model startups is UX (blog post)
Principles of Mixed-Initiative User Interfaces. Eric Horvitz. CHI 1999
Investigating how experienced UX designers effectively work with machine learning. Yang, Qian, et al. CHI 2018
UX design innovation: Challenges for working with machine learning as a design material. Dove, Graham, et al. CHI 2017
2/28/2023: Human-AI interaction

Guest lecture: Sherry Tongshuang Wu (CMU)
AI chains: Transparent and controllable human-AI interaction by chaining large language model prompts. Wu, Tongshuang, Michael Terry, and Carrie Jun Cai. CHI 2022

Power to the People: The Role of Humans in Interactive Machine Learning. Saleema Amershi, Maya Cakmak, William Bradley Knox, and Todd Kulesza. AI Magazine, 2014

Optional:
An image is worth one word: Personalizing text-to-image generation using textual inversion. Gal, Rinon, et al. 2022
Prompt-to-prompt image editing with cross attention control. Hertz, Amir, et al.  2022.
Towards algorithmic experience: Initial efforts for social media contexts. Oscar Alvarado and Annika Waern. CHI 2018.
Design Guidelines for Prompt Engineering Text-to-Image Generative Models. Liu, Vivian, and Lydia B. Chilton. CHI 2022.
Overview based example selection in end user interactive concept learning. Saleema Amershi, et al. UIST 2009
3/7/2023: Explainability/interpretability

Guest lecture: Hima Lakkaraju (Harvard)
The Disagreement Problem in Explainable Machine Learning: A Practitioner’s Perspective, Krishna, Satyapriya, et al., ICML 2022
 
The challenge of crafting intelligible intelligence. Dan Weld and Gagan Bansal. Communications of the ACM 62.6 (2019): 70-79.
 
Interpreting Interpretability: Understanding Data Scientists’ Use of Interpretability Tools for Machine Learning. Harmanpreet Kaur, Harsha Nori, Samuel Jenkins, Rich Caruana, Hanna Wallach, and Jennifer Wortman Vaughan. CHI 2020

Optional: 
The Building Blocks of Interpretability. Chris Olah, et al. Distill 2018

Explanations Can Reduce Overreliance on AI Systems During Decision-Making. Vasconcelos, Helena, et al. CHI 2022.

Seeing without knowing: Limitations of the transparency ideal and its application to algorithmic accountability. Mike Ananny and Kate Crawford. New Media & Society 20.3 (2018): 973-989

An evaluation of the human-interpretability of explanation. Lage, Isaac, et al., NeurIPS 2018
Sensible AI: Re-imagining interpretability and explainability using sensemaking theory. Kaur, Harmanpreet, et al. FAccT 2022
Does the whole exceed its parts? The effect of AI explanations on complementary team performance. Gagan Bansal et al. CHI 2021
3/14/2023: Bias and bias mitigation

Guest lecture: Marzyeh Ghassemi (MIT)
The road to explainability is paved with bias: Measuring the fairness of explanations. Balagopalan, Aparna, et al. FAccT 2022

Mitigating the impact of biased artificial intelligence in emergency decision-making. Adam, Hammaad, et al. Communications Medicine 2.1 (2022): 149

Write It Like You See It: Detectable Differences in Clinical Notes By Race Lead To Differential Model Recommendations. Adam, Hammaad, et al. arXiv preprint arXiv:2205.03931 (2022).

Optional:
Human-centered tools for coping with imperfect algorithms during medical decision-making. Cai, Carrie J., et al. CHI 2019  
3/21/2023: Designing and Prototyping AI

Guest lecture: Lauren Wilcox (Google Research), Michael Madaio (Google Research), Chelsea Wang (Georgia Tech)
Designing Responsible AI: Adaptations of UX Practice to Meet Responsible AI Challenges. Qiaosi Wang, Michael Madaio, Shivani Kapania, Shaun Kane, Michael Terry, Lauren Wilcox. CHI 2023

Will You Accept an Imperfect AI? Exploring Designs for Adjusting End-user Expectations of AI Systems. Rafal Kocielnik, Saleema Amershi, and Paul N. Bennett. CHI 2019. 

Explore Google’s What-If tool

Optional:
iCanDraw: using sketch recognition and corrective feedback to assist a user in drawing human faces. Daniel Dixon, Manoj Prasad, and Tracy Hammond.  CHI 2010
Can computers create art? Aaron Hertzmann. Arts 7.2, MDPI, 2018

Ambiguity-aware AI assistants for medical data analysis. Mike Schaekermann et al. CHI 2020

AI Design & Practices Guidelines (blog post)
3/28/2023: Spring break
4/4/2023: Understanding computers and cognition
Guest lecture: Terry Winograd (Stanford)
Understanding Computers and Cognition: A New Foundation for Design. Terry Winograd and Fernando Flores. 1986. Introduction and chapter 9

Optional: Malleable software in the age of LLMs (blog post)
4/11/2023: Fairness, Accountability, Transparency, and Ethics (FATE)

Guest lecture: Motahhare Eslami (CMU)
Procedural justice in algorithmic fairness: Leveraging transparency and outcome control for fair algorithmic mediation. Lee, Min Kyung, et al. CSCW 2019

Trends and Trajectories for Explainable, Accountable and Intelligible Systems: An HCI Research Agenda. Abdul, Ashraf, et al. CHI 2018

Moral crumple zones: Cautionary tales in human-robot interaction (pre-print). Madeleine Clare Elish. Engaging Science, Technology, and Society (2019).

‘It’s Reducing a Human Being to a Percentage’: Perceptions of Justice in Algorithmic Decisions. Binns, Reuben, et al. CHI 2018

Optional:
Human-centered artificial intelligence: Reliable, safe & trustworthy. Ben Shneiderman. International Journal of Human–Computer Interaction 36.6 (2020): 495-504.
4/18/2023: Evaluation

Guest lecture: Mina Lee (Stanford)
Proxy tasks and subjective measures can be misleading in evaluating explainable AI systems. Buçinca, Zana, et al. IUI 2020

Kaleidoscope: Semantically-grounded, context-specific ML model evaluation, Suresh et al., CHI 2023

Evaluating Human-Language Model Interaction. Lee, M., et al. arXiv preprint arXiv:2212.09746. 2023

Optional:
Holistic evaluation of language models. Liang, Percy, et al. arXiv preprint arXiv:2211.09110 (2022)
4/25/2023: Heuristic evaluation and model cards
Measurement and fairness. Jacobs, Abigail Z., and Hanna Wallach. FAccT 2021

Model cards for model reporting. Mitchell, Margaret, et al. FAccT 2019

Optional:
Extracting training data from large language models. Carlini, Nicholas, et al. 30th USENIX Security Symposium (USENIX Security 21). 2021.

Beyond accuracy: Behavioral testing of NLP models with CheckList. Ribeiro, Marco Tulio, et al.  ACL 2020