
AI sauce on everything: Reflections on ASU+GSV 2025

Data, evaluation, product iteration, and public goods: reflections on the ASU+GSV Summit 2025.

Peter Bull
Co-founder

This was my first year attending the ASU+GSV Summit, and while I was excited to present our work on Automatic Speech Recognition (ASR) for children in the classroom, I wasn't quite prepared for the sheer scale of the event. Lines spilling out of session rooms, conference lunches vanishing in moments, and rooms so full of conversation that it was hard to hear yourself think. It's an environment brimming with ideas, and it's taken me some time to digest the whirlwind of interactions.

I had fascinating conversations, ranging from speaking with a pioneer who brought interactive AI systems to classrooms back in the late 1990s, to seeing Google researchers present sophisticated models assessed against comprehensive learning goals, to listening to a tense debate on squaring ChatGPT use with academic honor codes (is it a violation if I did the outline but not the writing?). Unsurprisingly, talks and conversations were liberally sprinkled with "AI" (and some folks may have been liberal with the A1 as well!).

The progress is undeniable, but amidst the excitement, I couldn't help but reflect on the critical pieces that will make this AI boom effective in the classroom.

The Data Bottleneck: Fueling AI Responsibly

AI models are only as good as the data they're trained on, and this presents a major hurdle in education. We're experiencing this firsthand in our work with ASR for children – the vast datasets scraped from the internet that train most commercial Large Language Models (LLMs) simply don't reflect the unique linguistic patterns, acoustics, and realities of a diverse classroom.

While this generation of students might be the most digitally documented, their data is often fragmented and locked away in aging Learning Management Systems, isolated research projects, or proprietary platforms. Bringing this data together ethically and effectively requires significant effort. We're undertaking this challenge for children's ASR data, but cleaning, annotating, and ensuring privacy are expensive and time-consuming. Off-the-shelf models trained on generic data will probably not work effectively or equitably in your specific educational context. Verification is crucial.
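
To make that concrete, here is a minimal sketch of what verification can look like for ASR: score a model's transcripts against human reference transcripts collected from your own classrooms using word error rate (WER). The transcript pairs below are invented for illustration, and the WER implementation is self-contained, so the sketch doesn't assume any particular model or library.

```python
# A minimal sketch of verifying an ASR model on locally collected data.
# The transcript pairs are illustrative; in practice the hypothesis column
# would come from the off-the-shelf model you are considering.

def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + insertions + deletions) / reference word count."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    # Word-level Levenshtein distance via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution
            )
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# (human reference transcript, model output) for a few classroom clips
pairs = [
    ("the cat sat on the mat", "the cat sat on a mat"),
    ("we read a story about whales", "we need a story about wells"),
]

scores = [word_error_rate(ref, hyp) for ref, hyp in pairs]
print(f"Mean WER on local data: {sum(scores) / len(scores):.1%}")
```

Even a small hand-labeled sample like this will tell you more about how a model handles your students' speech than any leaderboard number will.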

Evaluation: Moving Beyond the Checklist

On that point, evaluation was an emerging theme. Even if appropriate training data could be accessed, how do we rigorously assess if these AI tools actually work, and for whom?

Buying traditional software often involves a checklist that matches features to user needs. With AI, it's far more complex. An AI "feature" might work well for some students but fail, or even disadvantage, others with different backgrounds, learning styles, or in specific environments (e.g., a noisy classroom). AI buyers need evaluation frameworks that go deeper than surface functionality. This includes assessing performance across diverse student subgroups and understanding the potential for bias (we’ve been thinking about these kinds of biases for a long time—see our data ethics tool, Deon). We either need robust third-party auditing and evaluation, or buyers are going to have to get much more sophisticated in their evaluation approach.
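
As an illustration of what "going deeper" can mean in practice, the sketch below disaggregates a single evaluation metric by student subgroup rather than reporting only the average. The subgroups and scores are invented for the example; the point is that an aggregate number can look healthy while one group is poorly served.

```python
# A minimal sketch of disaggregated evaluation. The records are invented;
# substitute your own per-student metric (accuracy, WER, rubric score, ...).
from collections import defaultdict
from statistics import mean

# Hypothetical per-student results: (subgroup label, score in [0, 1]).
records = [
    ("quiet classroom", 0.93), ("quiet classroom", 0.90), ("quiet classroom", 0.91),
    ("noisy classroom", 0.71), ("noisy classroom", 0.64), ("noisy classroom", 0.68),
]

by_group = defaultdict(list)
for group, score in records:
    by_group[group].append(score)

overall = mean(score for _, score in records)
print(f"Overall mean score: {overall:.2f}")
for group, scores in sorted(by_group.items()):
    print(f"  {group}: {mean(scores):.2f} ({mean(scores) - overall:+.2f} vs. overall)")
```

The same pattern applies to any subgroup dimension that matters in your context: grade level, home language, accommodations, or acoustic environment.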

Iteration vs. Fatigue: Finding the Right Pace for Innovation

The standard advice for building great AI is rapid iteration based on user feedback. This works well in many commercial sectors. However, education requires a more cautious approach. There is a real risk of inducing "study fatigue," a phenomenon that we have observed in fields like international development, where communities become weary of constant experimentation without seeing clear benefits.

Students, teachers, and parents won't have infinite patience for endless cycles of testing AI prototypes in the classroom. We can't afford to treat schools like perpetual beta-testing labs. While iteration is necessary, experiments need to be structured for success from the outset, minimizing disruption and aiming quickly for demonstrable improvements in learning outcomes, rather than lingering in pilot purgatory.

The Cambrian Explosion and The Role of Public Goods

We're currently witnessing a Cambrian explosion of AI use cases in education – tools for tutoring, assessment, content generation, administrative support, and more. This rapid diversification is exciting, but it also increases the risk of the challenges mentioned above: data fragmentation, evaluation gaps, and study fatigue.

Public goods and infrastructure became a core focus of the panel I was on with Magpie Education, Learn FX, MRDC, Khan Academy, and the Bill & Melinda Gates Foundation. In an era dominated by powerful frontier models from major labs, investing in shared infrastructure for the sector—open datasets, open-source models, and open tools—is crucial. Our own work in ASR for kids aims to contribute exactly these kinds of public goods by collating and annotating data so that a broader community can benefit. Shared building blocks and infrastructure can help mitigate some of the externalities noted above as this Cambrian explosion of technologies is tested in the classroom.

I'd love to hear your reflections on the conference or these topics! What stood out to you? What challenges or opportunities do you see? Let’s work together to chart a thoughtful path forward.
