Henry Li
Generative $\{\text{Audio}, \text{Text}, \text{Image}\}$ Researcher
I am a Research Scientist at Google, focusing on generative audio research. More broadly, my research interests lie at the intersection of $\{\text{generation}, \text{understanding}, \text{guidance}\}$ tasks $\times$ $\{\text{audio}, \text{text}, \text{image}\}$ modalities, with a particular focus on diffusion and flow matching models.
Previously, I was a Ph.D. student at Yale University. I have also spent time as a Student Researcher at Google DeepMind, and as a Research Intern with the Seed Vision Team at TikTok / ByteDance, at the Bosch Center for Artificial Intelligence, and at the Flatiron Institute.
news
selected publications
- WASPAA
- ICML