Model training diary/journal for LLMs?

by nalzok on 11/15/2023, 2:42 AM with 1 comment

About half a year ago, some big tech company released an open-source LLM. What makes that model special is that they made available a model training diary/journal recording everything their engineers did to babysit the training process, e.g. "on day 143, the training loss plateaued, so we decreased the learning rate further". I think it was in a shared Google Doc.

Can you remind me of the name of the company/model?

by nalzok on 11/15/2023, 2:54 AM

Never mind, I figured it out: https://github.com/facebookresearch/metaseq/blob/main/projec...