Trust but Verify: Building Production-Ready Gen AI Applications
Dr. Anupam Datta, Co-Head of Snowflake AI Research
1-2pm EDT, Thur. 10 Oct. 2024, ITE 325b, Webex TBD
Enterprise interest in generative AI applications has surged, from chatbots and summarization to agents and multi-modal apps. Yet most of these apps remain in the prototype and experimental stage. To move apps into production and keep them there, it is critical to make them efficient and to address common failure modes, including hallucinations, retrieval failures, and safety issues such as toxicity and bias, all while providing robust governance.
In this talk, we will share learnings from building and maintaining production-grade apps. Using two concrete examples – Cortex Search (which incorporates hybrid search to power question-answering over unstructured data) and Cortex Analyst (which enables business users to talk to their structured data with SOTA accuracy) – we will examine key design patterns for building efficient and trusted generative AI apps. Building on TruLens, our open-source project, we will also present a methodology for evaluating, experimenting with, and monitoring generative AI apps that enables developers to measure app quality, identify and debug failure modes, and build the trust needed to move apps into production and keep them there. Specifically, we will discuss evaluation methodologies based on the RAG Triad – context relevance, groundedness, and answer relevance – that can surface flaws in LLM apps and guide iteration to improve them. We will also show how this can be done in practice with TruLens, leveraging both LLM-as-a-judge evaluation and smaller, custom models that scale better to production monitoring workloads.
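For readers unfamiliar with the RAG Triad, the three checks can be illustrated with a minimal Python sketch. This is not the TruLens API and not the speaker's method: the function names (`tokens`, `overlap`, `rag_triad`) are hypothetical, and simple word overlap stands in for the LLM judge or custom model that a real evaluation would use. It only shows which pair of texts each metric compares.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased word tokens; a crude stand-in for a real relevance model."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(source: str, target: str) -> float:
    """Fraction of the tokens in `source` that also appear in `target`."""
    src = tokens(source)
    if not src:
        return 0.0
    return len(src & tokens(target)) / len(src)


def rag_triad(query: str, context: str, answer: str) -> dict[str, float]:
    """Score one RAG interaction on the three triad dimensions (0.0 to 1.0).

    Assumption: word overlap replaces the judge model used in practice.
    """
    return {
        # Is the retrieved context about what the user asked?
        "context_relevance": overlap(query, context),
        # Is the answer supported by the retrieved context?
        "groundedness": overlap(answer, context),
        # Does the answer address the question?
        "answer_relevance": overlap(query, answer),
    }


if __name__ == "__main__":
    query = "What is the refund window?"
    context = "The refund window is 30 days from purchase."
    grounded = "The refund window is 30 days."
    hallucinated = "Refunds are available for 90 days with a receipt."

    print(rag_triad(query, context, grounded))
    print(rag_triad(query, context, hallucinated))
```

In this toy example the hallucinated answer scores much lower on groundedness than the grounded one, because most of its words have no support in the retrieved context. That gap is the kind of signal the triad is meant to surface.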
Anupam Datta is Co-Head of Snowflake AI Research. He joined Snowflake as part of the acquisition of TruEra, where he served as Co-Founder, President, and Chief Scientist from 2019 to 2024. Datta was on the faculty at Carnegie Mellon University from 2007 to 2022, most recently as a tenured Professor of Electrical & Computer Engineering and Computer Science. Datta's current research focuses on Trustworthy AI and includes pioneering work on evaluation, explainability, fairness, and adversarial robustness of ML models and GenAI applications. His research has also had significant product impact at TruEra and Snowflake. Datta served as Chair of the National Academies Workshop on Assessing and Improving AI Trustworthiness, on the Steering Committees of the ACM Conference on Fairness, Accountability, and Transparency and the IEEE Computer Security Foundations Symposium, and as an Editor-in-Chief of Foundations and Trends in Privacy and Security. He obtained a B.Tech. from IIT Kharagpur, and Ph.D. and M.S. degrees from Stanford University, all in Computer Science.
Host: Dr. Richard Forno (rforno@umbc.edu)