Discussion about this post

Stephen Rong

1) You may be interested in the “nucleotide dependency” preprint, which has some interesting ideas on how to go beyond LL for variant interpretation with DNA LMs: https://www.biorxiv.org/content/10.1101/2024.07.27.605418v1

2) That GPN-MSA, a much, much smaller model that mixes a short-context DNA LM with evolutionary conservation from an MSA as input, comes kinda close to Evo 2 for both coding and noncoding suggests that, for variant interpretation, a model as large as Evo 2 probably isn’t necessary in the long run.

3) There have been multiple updates on HARs, such as 312 reported in https://www.science.org/doi/10.1126/science.abm1696. And there are also elements such as HAQERs, which are previously neutrally evolving regions that show accelerated evolution in humans https://pubmed.ncbi.nlm.nih.gov/36423581/.

4) What are these models even learning when genomes like the human genome are nearly 50% repeats, only a minority of which are functional?

Maxx Yung

Wow, this was an amazing piece. Didn't know that I enjoy Socratic dialogue essays as well.
