About this Event
150 Western Avenue, Allston, MA 02134
https://crcs.seas.harvard.edu/event/bhramar-mukherjee-yale-university
Talk Title: The Data Struggle of the Unseen
Despite several proposed roadmaps to increase representation in scientific research, most of the world's research data are collected on selected populations. We rely on summary statistics from dominant groups and then devise clever statistical methods to transfer/transport them for cross-ancestry use. In this talk, I will first argue the obvious: to build fair algorithms, we need representative training datasets. As public health statisticians, our job is not just to predict, but to prevent. However, until we reach the dream of representative big data at a global scale, statisticians have an important role to play. In fact, we have the perfect tools to study the "unobserved" through modeling of missing data, selection bias, and the like. I will share examples from my personal journey as a statistician in which good and timely statistical work with imperfect data (such as data from observational patient care databases) quantified important scientific observations. I will conclude the talk with a call to arms for statisticians and computational scientists to not just develop new methods but also lead efforts to create, curate, and collect data and to pioneer new scientific studies, stepping outside their comfort zones.