X For You Feed Algorithm

2026-01-20 5:16 12565 github.com

Algorithm powering the For You feed on X. Repository: xai-org/x-algorithm on GitHub.

This repository contains the core recommendation system powering the "For You" feed on X. It combines in-network content (from accounts you follow) with out-of-network content (discovered through ML-based retrieval) and ranks everything using a Grok-based transformer model.

Note: The transformer implementation is ported from the Grok-1 open source release by xAI, adapted for recommendation system use cases.

The For You feed algorithm retrieves, ranks, and filters posts from two sources:

  1. In-Network (Thunder): Posts from accounts you follow
  2. Out-of-Network (Phoenix Retrieval): Posts discovered from a global corpus

Both sources are combined and ranked together using Phoenix, a Grok-based transformer model that predicts engagement probabilities for each post. The final score is a weighted combination of these predicted engagements.

We have eliminated every single hand-engineered feature and most heuristics from the system. The Grok-based transformer does all the heavy lifting by understanding your engagement history (what you liked, replied to, shared, etc.) and using that to determine what content is relevant to you.

┌─────────────────────────────────────────────────────────────────────────────────────────────┐
│                                    FOR YOU FEED REQUEST                                     │
└─────────────────────────────────────────────────────────────────────────────────────────────┘
                                               │
                                               ▼
┌─────────────────────────────────────────────────────────────────────────────────────────────┐
│                                         HOME MIXER                                          │
│                                    (Orchestration Layer)                                    │
├─────────────────────────────────────────────────────────────────────────────────────────────┤
│                                                                                             │
│   ┌─────────────────────────────────────────────────────────────────────────────────────┐   │
│   │                                   QUERY HYDRATION                                   │   │
│   │  ┌──────────────────────────┐    ┌──────────────────────────────────────────────┐   │   │
│   │  │ User Action Sequence     │    │ User Features                                │   │   │
│   │  │ (engagement history)     │    │ (following list, preferences, etc.)          │   │   │
│   │  └──────────────────────────┘    └──────────────────────────────────────────────┘   │   │
│   └─────────────────────────────────────────────────────────────────────────────────────┘   │
│                                              │                                              │
│                                              ▼                                              │
│   ┌─────────────────────────────────────────────────────────────────────────────────────┐   │
│   │                                  CANDIDATE SOURCES                                  │   │
│   │         ┌─────────────────────────────┐    ┌────────────────────────────────┐       │   │
│   │         │        THUNDER              │    │     PHOENIX RETRIEVAL          │       │   │
│   │         │    (In-Network Posts)       │    │   (Out-of-Network Posts)       │       │   │
│   │         │                             │    │                                │       │   │
│   │         │  Posts from accounts        │    │  ML-based similarity search    │       │   │
│   │         │  you follow                 │    │  across global corpus          │       │   │
│   │         └─────────────────────────────┘    └────────────────────────────────┘       │   │
│   └─────────────────────────────────────────────────────────────────────────────────────┘   │
│                                              │                                              │
│                                              ▼                                              │
│   ┌─────────────────────────────────────────────────────────────────────────────────────┐   │
│   │                                      HYDRATION                                      │   │
│   │  Fetch additional data: core post metadata, author info, media entities, etc.       │   │
│   └─────────────────────────────────────────────────────────────────────────────────────┘   │
│                                              │                                              │
│                                              ▼                                              │
│   ┌─────────────────────────────────────────────────────────────────────────────────────┐   │
│   │                                      FILTERING                                      │   │
│   │  Remove: duplicates, old posts, self-posts, blocked authors, muted keywords, etc.   │   │
│   └─────────────────────────────────────────────────────────────────────────────────────┘   │
│                                              │                                              │
│                                              ▼                                              │
│   ┌─────────────────────────────────────────────────────────────────────────────────────┐   │
│   │                                       SCORING                                       │   │
│   │  ┌──────────────────────────┐                                                       │   │
│   │  │  Phoenix Scorer          │    Grok-based Transformer predicts:                   │   │
│   │  │  (ML Predictions)        │    P(like), P(reply), P(repost), P(click)...          │   │
│   │  └──────────────────────────┘                                                       │   │
│   │               │                                                                     │   │
│   │               ▼                                                                     │   │
│   │  ┌──────────────────────────┐                                                       │   │
│   │  │  Weighted Scorer         │    Weighted Score = Σ (weight × P(action))            │   │
│   │  │  (Combine predictions)   │                                                       │   │
│   │  └──────────────────────────┘                                                       │   │
│   │               │                                                                     │   │
│   │               ▼                                                                     │   │
│   │  ┌──────────────────────────┐                                                       │   │
│   │  │  Author Diversity        │    Attenuate repeated author scores                   │   │
│   │  │  Scorer                  │    to ensure feed diversity                           │   │
│   │  └──────────────────────────┘                                                       │   │
│   └─────────────────────────────────────────────────────────────────────────────────────┘   │
│                                              │                                              │
│                                              ▼                                              │
│   ┌─────────────────────────────────────────────────────────────────────────────────────┐   │
│   │                                      SELECTION                                      │   │
│   │                    Sort by final score, select top K candidates                     │   │
│   └─────────────────────────────────────────────────────────────────────────────────────┘   │
│                                              │                                              │
│                                              ▼                                              │
│   ┌─────────────────────────────────────────────────────────────────────────────────────┐   │
│   │                              FILTERING (Post-Selection)                             │   │
│   │                 Visibility filtering (deleted/spam/violence/gore etc)               │   │
│   └─────────────────────────────────────────────────────────────────────────────────────┘   │
│                                                                                             │
└─────────────────────────────────────────────────────────────────────────────────────────────┘
                                               │
                                               ▼
┌─────────────────────────────────────────────────────────────────────────────────────────────┐
│                                     RANKED FEED RESPONSE                                    │
└─────────────────────────────────────────────────────────────────────────────────────────────┘

Home Mixer

Location: home-mixer/

The orchestration layer that assembles the For You feed. It leverages the CandidatePipeline framework with the following stages:

Stage                   Description
Query Hydrators         Fetch user context (engagement history, following list)
Sources                 Retrieve candidates from Thunder and Phoenix
Hydrators               Enrich candidates with additional data
Filters                 Remove ineligible candidates
Scorers                 Predict engagement and compute final scores
Selector                Sort by score and select top K
Post-Selection Filters  Final visibility and dedup checks
Side Effects            Cache request info for future use

The server exposes a gRPC endpoint (ScoredPostsService) that returns ranked posts for a given user.

Thunder

Location: thunder/

An in-memory post store and real-time ingestion pipeline that tracks recent posts from all users. It:

  • Consumes post create/delete events from Kafka
  • Maintains per-user stores for original posts, replies/reposts, and video posts
  • Serves "in-network" post candidates from accounts the requesting user follows
  • Automatically trims posts older than the retention period

Thunder enables sub-millisecond lookups for in-network content without hitting an external database.
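
As a rough illustration of the idea (not Thunder's actual code), the sketch below keeps recent posts in per-author queues, trims anything older than a retention period, and serves candidates for a following list. The names `Post`, `InMemoryPostStore`, and `retention_seconds` are hypothetical, and plain method calls stand in for the Kafka create/delete events.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: int
    author_id: int
    created_at: float  # seconds since epoch


class InMemoryPostStore:
    """Toy per-author store of recent posts (hypothetical, not Thunder's API)."""

    def __init__(self, retention_seconds: float) -> None:
        self.retention_seconds = retention_seconds
        self._by_author = defaultdict(deque)  # author_id -> deque of Post

    def on_create(self, post: Post) -> None:
        # Assumption: create events arrive in chronological order per author,
        # so the oldest post is always at the left end of the deque.
        self._by_author[post.author_id].append(post)

    def on_delete(self, post_id: int, author_id: int) -> None:
        posts = self._by_author.get(author_id)
        if posts is None:
            return
        self._by_author[author_id] = deque(p for p in posts if p.post_id != post_id)

    def trim(self, now: float) -> None:
        # Drop posts that have aged out of the retention window.
        cutoff = now - self.retention_seconds
        for posts in self._by_author.values():
            while posts and posts[0].created_at < cutoff:
                posts.popleft()

    def in_network_candidates(self, following: list[int], now: float) -> list[Post]:
        # Return recent posts from followed authors, newest first.
        self.trim(now)
        found = [p for author in following for p in self._by_author.get(author, ())]
        return sorted(found, key=lambda p: p.created_at, reverse=True)
```

Because every lookup is a dictionary access followed by a scan of short queues, no external database call is needed on the request path, which is the property the paragraph above describes.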

Phoenix

Location: phoenix/

The ML component with two main functions:

Retrieval (two-tower model). Finds relevant out-of-network posts:

  • User Tower: Encodes user features and engagement history into an embedding
  • Candidate Tower: Encodes all posts into embeddings
  • Similarity Search: Retrieves top-K posts via dot product similarity
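
The similarity-search step can be sketched as follows. This is a minimal brute-force version that assumes the two towers have already produced embeddings; the function names are hypothetical, and a production system would presumably use an approximate nearest-neighbor index rather than a full scan.

```python
import heapq
import math


def l2_normalize(vec: list[float]) -> list[float]:
    # Scale a vector to unit length; zero vectors are returned unchanged.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def retrieve_top_k(
    user_emb: list[float],
    post_embs: dict[int, list[float]],
    k: int,
) -> list[tuple[int, float]]:
    # Score every post by dot product with the user embedding, keep the k best.
    scored = ((post_id, dot(user_emb, emb)) for post_id, emb in post_embs.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])
```

For example, with a user embedding pointing along the first axis, a post embedded along the same axis ranks first and an orthogonal post ranks last.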

Ranking (transformer with candidate isolation). Predicts engagement probabilities for each candidate:

  • Takes user context (engagement history) and candidate posts as input
  • Uses special attention masking so candidates cannot attend to each other
  • Outputs probabilities for each action type (like, reply, repost, click, etc.)
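
One way to picture the masking is as a boolean matrix over a sequence of user-context tokens followed by candidate tokens. The sketch below is an illustration under assumptions, not the Phoenix implementation: it assumes causal attention within the user context, and the function name is hypothetical.

```python
def candidate_isolation_mask(num_context: int, num_candidates: int) -> list[list[bool]]:
    """Return mask[q][k], True when query position q may attend to key position k.

    Positions 0..num_context-1 are user-context tokens; the rest are candidates.
    """
    total = num_context + num_candidates
    mask = [[False] * total for _ in range(total)]
    for q in range(total):
        for k in range(total):
            if k < num_context:
                # Context keys: causal within the context (an assumption);
                # every candidate query sees the whole context.
                mask[q][k] = k <= q
            else:
                # Candidate keys: visible only to the candidate itself.
                mask[q][k] = q == k
    return mask
```

With two context tokens and two candidates, each candidate row attends to both context columns and its own column only, so adding or removing other candidates cannot change its inputs.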

See phoenix/README.md for detailed architecture documentation.

Candidate Pipeline

Location: candidate-pipeline/

A reusable framework for building recommendation pipelines. Defines traits for:

Trait       Purpose
Source      Fetch candidates from a data source
Hydrator    Enrich candidates with additional features
Filter      Remove candidates that shouldn't be shown
Scorer      Compute scores for ranking
Selector    Sort and select top candidates
SideEffect  Run async side effects (caching, logging)

The framework runs sources and hydrators in parallel where possible, with configurable error handling and logging.
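
The real framework is written in Rust; the Python sketch below only mirrors its shape. Sources, filters, and scorers are modeled as plain callables, sources run in parallel, and a failing source contributes no candidates. The function `run_pipeline`, the dict-based candidate records, and the "swallow the error" policy are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_pipeline(query, sources, filters, scorers, top_k):
    """Run sources in parallel, then filter, score, and select (illustrative only).

    sources: callables query -> list of candidate dicts
    filters: callables (query, candidate) -> bool, True keeps the candidate
    scorers: callables (query, candidates) -> candidates with a "score" key
    """

    def safe_fetch(source):
        # One possible error policy: a failing source yields no candidates.
        try:
            return source(query)
        except Exception:
            return []

    # Sources are independent of each other, so they can run concurrently.
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        batches = list(pool.map(safe_fetch, sources))

    candidates = [c for batch in batches for c in batch]
    candidates = [c for c in candidates if all(f(query, c) for f in filters)]

    # Scorers run sequentially because later ones may depend on earlier scores.
    for scorer in scorers:
        candidates = scorer(query, candidates)

    return sorted(candidates, key=lambda c: c["score"], reverse=True)[:top_k]
```

Keeping the stage types this narrow is what lets new sources, filters, or scorers be added without touching the execution logic.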

  1. Query Hydration: Fetch the user's recent engagement history and metadata (e.g., following list)

  2. Candidate Sourcing: Retrieve candidates from:

    • Thunder: Recent posts from followed accounts (in-network)
    • Phoenix Retrieval: ML-discovered posts from the global corpus (out-of-network)
  3. Candidate Hydration: Enrich candidates with:

    • Core post data (text, media, etc.)
    • Author information (username, verification status)
    • Video duration (for video posts)
    • Subscription status
  4. Pre-Scoring Filters: Remove posts that are:

    • Duplicates
    • Too old
    • From the viewer themselves
    • From blocked/muted accounts
    • Containing muted keywords
    • Previously seen or recently served
    • Ineligible subscription content
  5. Scoring: Apply multiple scorers sequentially:

    • Phoenix Scorer: Get ML predictions from the Phoenix transformer model
    • Weighted Scorer: Combine predictions into a final relevance score
    • Author Diversity Scorer: Attenuate repeated author scores for diversity
    • OON Scorer: Adjust scores for out-of-network content
  6. Selection: Sort by score and select the top K candidates

  7. Post-Selection Processing: Final validation of post candidates to be served
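
Steps 5 and 6 can be sketched as a chain of score adjustments followed by a sort. The repository does not publish the actual attenuation rule or out-of-network adjustment, so the geometric `decay` and the multiplicative `oon_factor` below are hypothetical choices made only to show how sequential scorers compose.

```python
from collections import defaultdict


def author_diversity_scorer(candidates, decay=0.5):
    """Attenuate repeated authors: the n-th post by an author is scaled by decay**n.

    Assumes non-negative scores; `decay` is a hypothetical parameter.
    """
    seen = defaultdict(int)
    out = []
    # Walk candidates from best to worst so an author's top post is untouched.
    for c in sorted(candidates, key=lambda c: c["score"], reverse=True):
        n = seen[c["author_id"]]
        out.append({**c, "score": c["score"] * decay ** n})
        seen[c["author_id"]] += 1
    return out


def oon_scorer(candidates, oon_factor=0.8):
    """Scale out-of-network scores by a hypothetical constant factor."""
    return [
        {**c, "score": c["score"] * (1.0 if c["in_network"] else oon_factor)}
        for c in candidates
    ]


def select_top_k(candidates, k):
    # Selection: sort by final score and keep the top K.
    return sorted(candidates, key=lambda c: c["score"], reverse=True)[:k]
```

Applied in order, a second post from the same author drops below posts from other authors that originally scored slightly lower, which is the diversity effect described in step 5.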

The Phoenix Grok-based transformer model predicts probabilities for multiple engagement types:

Predictions:
├── P(favorite)
├── P(reply)
├── P(repost)
├── P(quote)
├── P(click)
├── P(profile_click)
├── P(video_view)
├── P(photo_expand)
├── P(share)
├── P(dwell)
├── P(follow_author)
├── P(not_interested)
├── P(block_author)
├── P(mute_author)
└── P(report)

The Weighted Scorer combines these into a final score:

Final Score = Σ (weight_i × P(action_i))

Positive actions (like, repost, share) have positive weights. Negative actions (block, mute, report) have negative weights, pushing down content the user would likely dislike.
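
A minimal sketch of this weighted sum is shown below. The production weights are not published, so `ILLUSTRATIVE_WEIGHTS` contains made-up values chosen only to show the sign convention.

```python
# Hypothetical weights for illustration; the real values are not public.
ILLUSTRATIVE_WEIGHTS = {
    "favorite": 1.0,
    "repost": 2.0,
    "report": -10.0,
}


def weighted_score(predictions: dict[str, float], weights: dict[str, float]) -> float:
    # Final Score = sum over actions of weight_i * P(action_i).
    # Actions without a configured weight contribute nothing.
    return sum(weights.get(action, 0.0) * p for action, p in predictions.items())
```

With these example weights, predictions of 0.5 for favorite, 0.1 for repost, and 0.01 for report give 0.5 + 0.2 - 0.1 = 0.6, so even a small report probability noticeably lowers the score.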

Filters run at two stages:

Pre-Scoring Filters:

Filter                        Purpose
DropDuplicatesFilter          Remove duplicate post IDs
CoreDataHydrationFilter       Remove posts that failed to hydrate core metadata
AgeFilter                     Remove posts older than threshold
SelfpostFilter                Remove user's own posts
RepostDeduplicationFilter     Dedupe reposts of same content
IneligibleSubscriptionFilter  Remove paywalled content user can't access
PreviouslySeenPostsFilter     Remove posts user has already seen
PreviouslyServedPostsFilter   Remove posts already served in session
MutedKeywordFilter            Remove posts with user's muted keywords
AuthorSocialgraphFilter       Remove posts from blocked/muted authors

Post-Selection Filters:

Filter                        Purpose
VFFilter                      Remove posts that are deleted/spam/violence/gore etc.
DedupConversationFilter       Deduplicate multiple branches of the same conversation thread
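
To make the filter stage concrete, the sketch below implements simplified analogs of three of the pre-scoring filters (duplicates, age, self-posts) and chains them. The function names, the dict-based candidate records, and the `max_age_seconds` parameter are hypothetical; the real filters are Rust types with their own configuration.

```python
def drop_duplicates(candidates):
    # Keep the first occurrence of each post ID, preserving order.
    seen = set()
    kept = []
    for c in candidates:
        if c["post_id"] not in seen:
            seen.add(c["post_id"])
            kept.append(c)
    return kept


def age_filter(candidates, now, max_age_seconds):
    # Flat cutoff: anything older than the threshold is removed.
    return [c for c in candidates if now - c["created_at"] <= max_age_seconds]


def selfpost_filter(candidates, viewer_id):
    # Remove posts authored by the viewer.
    return [c for c in candidates if c["author_id"] != viewer_id]


def apply_pre_scoring_filters(candidates, viewer_id, now, max_age_seconds):
    candidates = drop_duplicates(candidates)
    candidates = age_filter(candidates, now, max_age_seconds)
    return selfpost_filter(candidates, viewer_id)
```

Running cheap filters like these before scoring means the transformer only spends inference time on posts that could actually be shown.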

The system relies entirely on the Grok-based transformer to learn relevance from user engagement sequences, with no manual feature engineering for content relevance. This significantly reduces the complexity of our data pipelines and serving infrastructure.

During transformer inference, candidates cannot attend to each other—only to the user context. This ensures the score for a post doesn't depend on which other posts are in the batch, making scores consistent and cacheable.

Both retrieval and ranking use multiple hash functions for embedding lookup.
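
The README does not spell out how the hashed lookups are combined, so the following is only one common way to do it: each ID is hashed once per table with a different seed, and the selected rows are summed. The helper names and the use of BLAKE2 as the hash are assumptions for illustration.

```python
import hashlib


def hash_bucket(item_id: int, seed: int, num_buckets: int) -> int:
    # Seeded hash: different seeds act as different hash functions.
    digest = hashlib.blake2b(f"{seed}:{item_id}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % num_buckets


def hashed_embedding(item_id: int, tables: list[list[list[float]]]) -> list[float]:
    """Sum one row per table, each chosen by its own hash function.

    tables[t] is an embedding table (a list of rows); summing is an assumed
    combination rule, not necessarily what the repository does.
    """
    dim = len(tables[0][0])
    out = [0.0] * dim
    for seed, table in enumerate(tables):
        row = table[hash_bucket(item_id, seed, len(table))]
        out = [a + b for a, b in zip(out, row)]
    return out
```

The appeal of this scheme is that table size stays fixed regardless of how many IDs exist, and two IDs that collide under one hash function are unlikely to collide under all of them.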

Rather than predicting a single "relevance" score, the model predicts probabilities for many actions.

The candidate-pipeline crate provides a flexible framework for building recommendation pipelines with:

  • Separation of pipeline execution and monitoring from business logic
  • Parallel execution of independent stages and graceful error handling
  • Easy addition of new sources, hydrations, filters, and scorers

This project is licensed under the Apache License 2.0. See LICENSE for details.


Comments

  • By swyx 2026-01-20 5:36 (2 replies)

    ooh, LLM Recsys alert! (we had an LLM Recsys track at ai.engineer last year). official announcement here: https://x.com/XEng/status/2013471689087086804

    looks like this is the "for you" feed, once again shared without weights so we only have so much visibility into the actual influence of each trait.

    "We have eliminated every single hand-engineered feature and most heuristics from the system. The Grok-based transformer does all the heavy lifting by understanding your engagement history (what you liked, replied to, shared, etc.) and using that to determine what content is relevant to you." aka it's a black box now.

    the README is actually pretty nice, would recommend reading this. it doesn't look too different from Elon's original code review tweet/picture https://x.com/elonmusk/status/1593899029531803649?lang=en

    sharing additional notes while diving through the source: https://deepwiki.com/xai-org/x-algorithm

    and a codemap of the signal generation pipeline: https://deepwiki.com/search/make-a-map-of-all-the-signals_3d...

    - Phoenix (out of network) ranker seems to have all the interesting predictive ML work. it estimates P(favorite), P(reply), P(repost), P(quote), P(click), P(video_view), P(share), P(follow_author), P(not_interested), P(block_author), P(mute_author), P(report) independently and then the `WeightedScorer` combines them using configurable weights. there's an extra DiversityScore and OONScore to add some adjustments but again don't know the weights https://deepwiki.com/xai-org/x-algorithm/4.1-phoenix-candida...

    - other scores of interest: photo_expand_score, and dwell_score and dwell_time. share via copy, share, and share via dm are all obviously "super like" buttons.

    - Two-Tower retrieval uses dot product similarity between user features/engagement (User Tower) and normalized embeddings for all items (Candidate Tower). but when you look into the code and considering that this is probably the most important model for recommendations quality.... it's maybe a little disappointing that it's a 2 layer MLP? https://deepwiki.com/search/what-models-are-used-for-user_98...

    - Grok-1 JAX transformer (https://github.com/xai-org/x-algorithm/blob/main/phoenix/REA...) uses special attention masking that prevents candidates from attending to each other during inference. Each candidate only attends to the user context (engagement history). This ensures a candidate's score is independent of which other candidates are in the batch, enabling score consistency and caching. nice image here https://github.com/xai-org/x-algorithm/blob/main/phoenix/REA...

    - kind of nice usage of Rust traits to create a type safe data pipeline. look at this beautiful flow chart https://deepwiki.com/xai-org/x-algorithm/3-candidate-pipelin... and the "Field Ownership pattern" https://deepwiki.com/xai-org/x-algorithm/3.6-scorer-trait#fi...

    - the ten pre-scoring filters are minorly interesting, nothing super surprising here apart from AgeFilter (https://deepwiki.com/xai-org/x-algorithm/4.6.1-agefilter) which I guess means beyond a certain max_age (1 day?) nothing ever shows up on For You. surprising to have a simple flat cutoff vs i guess the alternative of an exponential aging algorithm.

    - videoduration hydrator explicitly prioritizes video duration (https://deepwiki.com/xai-org/x-algorithm/4.5.6-videoduration...) but we don't know in what direction... do you recommend shorter or longer videos? and why a hydrator for what is presumably a pretty static property?

    open questions from me

    1. how large is the production reranker? default param count is here https://deepwiki.com/search/how-many-params-is-the-transfo_c... but that gives no indication. the latency felt ultra high initially last year and seems to have come down some, what budget are we working with?

    2. can we make the retrieval better? i don't have a ton of confidence in the User Tower / Candidate Tower system - is this SOTA (it's probably not - see how YouTube does codebook semantic IDs https://www.youtube.com/watch?v=LxQsQ3vZDqo&list=PLcfpQ4tk2k... )

    3. no a/b testing / rollout infrastructure?

    4. so many hydration subsystems - is this brittle?

    • By sunaookami 2026-01-20 7:48 (1 reply)

      Sad that this is the only relevant comment in this thread, thanks for the insights. DeepWiki is very nice for this. Didn't know that e.g. copying the post link via the share button influences the algorithm!

      • By jnd0 2026-01-20 12:05

        Agreed.

    • By dang 2026-01-20 5:39 (1 reply)

      Thanks, we'll put that link in the toptext too.

      • By modeless 2026-01-20 15:35

        The only relevant technical discussion in the whole thread got downvoted to the bottom, and the top comment is poorly reasoned and arguably factually incorrect but on the "correct" political side. This is typical on politically charged topics. Is there anything HN can do to reduce the impact of politically motivated voting? The discouragement of posting politics in the guidelines doesn't seem to be enough anymore.

  • By rapsey 2026-01-20 5:36 (1 reply)

    I did not expect to see Rust. They seem to have forgotten to commit Cargo.toml though.

    Oh I see it is not meant to be built really. Some code is omitted.

  • By roryirvine 2026-01-20 10:21 (1 reply)

    I wonder if this'll turn out like the last time they published their algorithm to great fanfare, and then didn't bother to ever update it: https://github.com/twitter/the-algorithm

    • By roryirvine 2026-01-20 10:31

      Though, to be fair, there were hundreds of "rewrite it in Rust" issues opened against that old one - it looks like they listened!

HackerNews