Just like with their last release, they only released the architecture and not the weights. The code may be useful for analyzing the system if you're a competitor (though from my last dive into it, it seemed like a strict subset of fancier, industry-leading rec systems), or perhaps for getting into rec / retrieval systems as a newcomer.
However, this gives roughly zero insight into how Twitter's feed behaves.
RIP author_is_elon, we hardly knew ye.
I've always wondered: how can I, as a non-X engineer, be sure that the code on GH is actually what's deployed on their servers?
This is laudable. But the great thing about Twitter is that you don't have to use the algorithmic "For You" feed at all. You can just use the "Following" feed, which is purely chronological, and doesn't contain any recommended content. This isn't possible on Facebook, which makes it unusable for me.
I browsed through it a bit and these are some details that raised questions or which I found interesting:
There are multiple mentions of slop, for example SlopsAuthorScoreFeature in HomeTweetTypePredicates. That means everyone gets a slop score between 0 and 1, which makes me wish that it was openly visible and that people with a high slop score would get a little piggy emoji next to their name.
There's a CLIENT_TWEET_TAKE_SCREENSHOT action, which is likely used to keep track of when a (mobile, presumably) client takes a screenshot. I hadn't considered this before, but for a social media app where posts are often shared externally through screenshots, keeping track of this can give you another engagement metric.
They have two types of NSFW filters: isNsfw and isSoftNsfw, but I couldn't figure out the distinction. Other metadata types include: isGore, isViolent, isSpam, isLowQuality, isOcr.
In ContentFeatureAdapter there's a getTweetLengthType function which shows the range for each tweet type. This is used to set TWEET_LENGTH_TYPE elsewhere. I wonder if it would help your virality to switch up your tweet lengths to regularly put out tweets which hit every length category, or if it doesn't significantly affect your potential reach.
There's a hardcoded list of top-level Grok topics [0]. Just mildly interesting to see what they consider to be top-level categories. Anime has achieved a significant cultural victory by getting separated into its own major category.
The timeout values for different service request types varied a lot across the application, which makes me curious about how they settled on those numbers. This is a question I've pondered in the past but haven't gotten around to researching deeply.
[0] https://github.com/twitter/the-algorithm/blob/c54bec0d4e029f...
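To make the getTweetLengthType observation concrete, here is a rough Python sketch of what length bucketing of that kind might look like. The thresholds, the category labels, and both function names are made up for illustration; they are not the values or names from ContentFeatureAdapter.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical boundaries in characters. The real ranges live in
# getTweetLengthType and are NOT reproduced here.
BOUNDARIES = [30, 100, 200]
# One label per bucket: [0, 30), [30, 100), [100, 200), [200, inf)
LABELS = ["very_short", "short", "medium", "long"]


def tweet_length_type(text: str) -> str:
    """Map a tweet's character count to an (assumed) length category."""
    return LABELS[bisect_right(BOUNDARIES, len(text))]


def length_type_coverage(tweets: list[str]) -> dict[str, int]:
    """Count how many tweets fall into each length category."""
    counts = Counter(tweet_length_type(t) for t in tweets)
    return {label: counts.get(label, 0) for label in LABELS}
```

Running length_type_coverage over your own recent tweets would show which categories you never hit, though whether covering every category changes reach is exactly the open question.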
This is essentially useless without the training set or the weights. It's open-source theatre.
Not sure if this is the right place to ask, but why does Bluesky feel so much faster to load and interact with compared to X? On the surface, both have similar interfaces and equally rich content, yet Bluesky consistently feels snappier and more responsive, even though it’s the newer platform.
Always good to see some Scala in the wild. :)
sidenote: when do you think they're going to coax GitHub to transfer the `x` username?
I tried going through the latest diff, but there is so much boilerplate that I wasn't able to find any real insights through skimming.
Has anyone found anything useful? Interesting needle-in-a-haystack problem for LLMs to try as well.
Just ping Nikita, he'll tell you the current algo.
I want a social media where I can ssh into the servers with limited privileges, enough to see what's going on but not cause harm.
It's so disappointing to see that Twitter has only released the source code of their algorithm while all of their competitors have released both algorithms and weights.
so basically pay to play
Does it include the bit about white South Africans?
I think Elon said he would release the weights, in a video somewhere. Maybe that's what he meant: when the next major version lands, they release the previous one?
People's choices can change, and maybe the economic/geopolitical reality of the AI race has been impressed upon him, but I think that's what he said.
Previous discussions:
25-apr-2022 https://news.ycombinator.com/item?id=31160546 380 comments
31-mar-2023 https://news.ycombinator.com/item?id=35391433 1185 comments