Discussion about this post

Surag Nair

On the point of data quality over quantity: if the end goal is to make patient-level predictions (e.g., response to therapy), won’t we eventually need large-scale data (10-100k+ patients, even)? High-dimensional, multi-modal data per patient is crucial, but with few patients the analysis risks becoming more descriptive than predictive. That’s still great for hypothesis generation, but maybe not for ML. One analogy is models that predict sex from retinal images, where the signal is real and non-obvious but only becomes robust and generalizable with scale.

zdk

All the good ones are leaving NY for SF 😞

