Discussion about this post

Jonas Kubilius

"if you can do reliable genome generation, you can create plants that sequester carbon at 1000x the typical rate" -- it seems I'm still missing the point of these generative models even after reading your excellent essay, as I don't understand how one could, in principle, request a certain function from these models. All they know is how to generate natural-looking sequences, and I'm failing to see how we can get from that to 1000x faster carbon sequestration.

Jacob

I take your point about the usefulness of generating complex features like antibody synthesis or whatever, but are nucleotide language models the right level for that, as opposed to a model that operates at a higher level of abstraction? Like with the glycosylation stuff: why do you need to do base-by-base generation, essentially slightly re-engineering each glycosyltransferase, as opposed to gene-by-gene, where you just paste in the appropriate gene sequence or enhancer element or whatever? It would look more like a systems biology model than a language model, or maybe something like a Future House-esque automated scientist + tons of compute for reasoning.

...though come to think of it, probably an AI scientist would still consult a language model while doing the reasoning, so it's good to have around. I slightly wonder how core it would be though.
