On the point of data quality over quantity: if the end goal is to make patient-level predictions (e.g., response to therapy), won’t we eventually need large-scale data (even 10-100k+ patients)? High-dimensional, multi-modal data per patient is crucial, but with few patients the analysis risks becoming more descriptive than predictive. That’s still great for hypothesis generation, but maybe not for ML. One analogy: models that predict sex from retinal images, where the signal is real and non-obvious but only becomes robust and generalizable with scale.
i think it is an open question how much data is necessary! in the short term i am much more bullish on hypothesis generation, which is also why it is good that noetik’s collected dataset is (currently) one of a kind. i agree data throughput will need to improve regardless, but the bottleneck is much more on the machine side, and people besides us are working hard on that (the spatial transcriptomics companies).
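As a side note on the "how much data is necessary" question: one standard way to probe it empirically is a learning curve, i.e. train on progressively larger subsets of patients and watch how held-out performance scales. The sketch below is purely illustrative, using synthetic "patients" and a generic scikit-learn classifier; nothing in it refers to Noetik's actual data or models.

```python
# Illustrative only: a learning curve on synthetic "patients", asking how
# held-out performance scales with cohort size. All data here is fake.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

rng = np.random.default_rng(0)
n_patients, n_features = 2000, 500            # high-dimensional, modest cohort
X = rng.normal(size=(n_patients, n_features))
# "Response" depends weakly on a handful of features, plus a lot of noise.
signal = X[:, :10].sum(axis=1) + rng.normal(scale=3.0, size=n_patients)
y = (signal > 0).astype(int)

sizes, train_scores, val_scores = learning_curve(
    LogisticRegression(max_iter=1000, C=0.1),
    X, y,
    train_sizes=np.linspace(0.05, 1.0, 8),
    cv=5,
    scoring="roc_auc",
)

for n, tr, va in zip(sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)):
    print(f"n_train={n:5d}  train AUC={tr:.2f}  val AUC={va:.2f}")
# With few patients the train/validation gap is large (the model is more
# descriptive than predictive); the gap only closes as the cohort grows.
```

The same kind of curve, fit on real pilot data, is one way to estimate whether hundreds, thousands, or tens of thousands of patients would be needed for a given endpoint.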
All the good ones are leaving NY for SF 😞
currently plan to stay in NY! at least for the moment
I think there's an opportunity to combine quantity and quality. In endoscopy, we're finding that we can use massive quantities of unlabeled data to train a self-supervised encoder. That encoder then lets us train downstream application decoders with relatively small datasets that are well curated and labeled. The example we've shown so far: taking the 300-patient placebo arm of a Phase 3 ulcerative colitis trial, we can classify responders vs. non-responders from only their baseline colonoscopy video!
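A minimal sketch of that pretrain-then-probe pattern, in PyTorch. Everything here is a placeholder assumed for illustration (the toy `load_pretrained_encoder`, the feature dimension, the random tensors standing in for frames and labels); it is not the actual endoscopy pipeline, just the shape of "large frozen self-supervised encoder + small supervised head on a well-curated cohort".

```python
# Sketch: reuse an encoder pretrained with self-supervision on large amounts
# of unlabeled video, and fit only a small head on a small labeled cohort.
# All components below are placeholders, not the real endoscopy model.
import torch
import torch.nn as nn

def load_pretrained_encoder(feature_dim: int = 768) -> nn.Module:
    """Stand-in for a self-supervised encoder (e.g. trained on millions of
    unlabeled colonoscopy frames). Here it is just a tiny random conv net."""
    return nn.Sequential(
        nn.Conv2d(3, 32, kernel_size=7, stride=4), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(32, feature_dim),
    )

encoder = load_pretrained_encoder()
encoder.eval()
for p in encoder.parameters():
    p.requires_grad = False                    # keep pretrained weights frozen

head = nn.Linear(768, 1)                       # responder vs. non-responder logit

# Tiny labeled cohort (placeholder tensors): ~300 patients, one frame each.
frames = torch.randn(300, 3, 224, 224)
labels = torch.randint(0, 2, (300, 1)).float()

with torch.no_grad():
    feats = encoder(frames)                    # extract frozen features once

opt = torch.optim.AdamW(head.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
for _ in range(50):                            # a few passes over the small set
    loss = loss_fn(head(feats), labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The appeal of this split is that the expensive, data-hungry part (the encoder) never sees labels, so the curated labeled set only has to be large enough to fit the small head.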
Hi Dr. Owl. I spent a big chunk of my Ph.D. evaluating counterfactual predictions about genetic perturbation outcomes. I spent some time looking at the OCTO-VC demos and found them very worrisome. There is a growing graveyard of similar models that seem to do worse than the mean of their training data. Here are 8 independent evaluations that differ in many details but are all broadly consistent with poor performance of virtual cell predictions:
Ahlmann-Eltze et al.: https://www.biorxiv.org/content/10.1101/2024.09.16.613342v5
Csendes et al.: https://pmc.ncbi.nlm.nih.gov/articles/PMC12016270/
PertEval-scFM: https://icml.cc/virtual/2025/poster/43799
scEval: https://www.biorxiv.org/content/10.1101/2023.09.08.555192v7
C. Li et al.: https://www.biorxiv.org/content/10.1101/2024.12.20.629581v1.full
L. Li et al.: https://www.biorxiv.org/content/10.1101/2024.12.23.630036v1#libraryItemId=17605488
Wong et al.: https://www.biorxiv.org/content/10.1101/2025.01.06.631555v3#libraryItemId=17605840
My Ph.D. work: https://www.biorxiv.org/content/10.1101/2023.07.28.551039v2
I would be interested to hear your thoughts on this. Are you worried about it? If OCTO-VC doesn't predict counterfactuals well, how will that affect Noetik's strategy?
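For readers unfamiliar with the "worse than the mean of their training data" criterion those evaluations share: the reference point is simply predicting the average training-set expression change for every held-out perturbation, and asking whether the model beats it. Below is a hypothetical, self-contained version of that check on synthetic arrays; it is not drawn from any of the linked benchmarks, just a sketch of the comparison they make.

```python
# Hypothetical illustration of the "mean of the training data" baseline used
# in several perturbation-prediction evaluations: does the model beat simply
# predicting the average training-set expression change? Data here is synthetic.
import numpy as np

rng = np.random.default_rng(1)
n_train_perts, n_test_perts, n_genes = 200, 50, 2000

# True post-perturbation expression changes (delta from control), synthetic.
train_truth = rng.normal(size=(n_train_perts, n_genes))
test_truth = rng.normal(size=(n_test_perts, n_genes))

# Baseline: predict the same mean training delta for every held-out perturbation.
mean_baseline = train_truth.mean(axis=0)

# Placeholder for a virtual-cell model's predictions on held-out perturbations.
model_preds = rng.normal(size=(n_test_perts, n_genes))

def mse(pred, truth):
    return float(np.mean((pred - truth) ** 2))

baseline_mse = mse(np.broadcast_to(mean_baseline, test_truth.shape), test_truth)
model_mse = mse(model_preds, test_truth)

print(f"mean-baseline MSE: {baseline_mse:.3f}")
print(f"model MSE:         {model_mse:.3f}")
# The recurring finding in the linked evaluations is that model error often
# fails to beat this mean baseline on held-out perturbations.
```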
what specifically was worrying about the octo demo?