Building an LLM-RecSys Hybrid for Steerable Recommendations with Semantic IDs

The Engineer

16 Sept 2025 · 3 min read

Semantic IDs allow language models to understand product data, merging recommendation systems with AI chatbots for smarter, more conversational recommendations that bridge the gap between human queries and digital products.

When I first heard about Semantic IDs, they immediately piqued my interest. The concept is straightforward: instead of using random hash IDs for items like videos, songs, or products, we use semantically meaningful tokens that a large language model (LLM) can natively process. This opens up the possibility of training an LLM-recommender hybrid on rich behavioral data, drawing on the strengths of both.

The result? A language model that can converse in both natural language and item IDs, effectively becoming a "bilingual" model where items are part of its vocabulary. Not only can it recommend items based on historical interactions, but it can also reason about its choices, offer explanations, and even creatively name product bundles, all through natural language interactions.

How It Works

To achieve this, we need to extend the LLM's vocabulary with semantic ID tokens (e.g., <|sid_0|>, <|sid_1|>, <|sid_2|>). These tokens represent items in our catalog. Here’s a step-by-step breakdown:

Extend Vocabulary: Add semantic ID tokens to the language model’s vocabulary.
Pretraining: Continue pretraining the LLM on sequences that include these semantic IDs to teach it the relationships between IDs and the catalog.
Finetuning: Further finetune the model on sequences of user behavior to make it more effective at making recommendations.
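
The vocabulary-extension step above can be sketched in plain Python. This is a toy illustration, not the author's code: the vocabulary, the embedding table, and the helper names `extend_vocab` and `resize_embeddings` are all hypothetical stand-ins. With Hugging Face Transformers, the same two steps roughly correspond to `tokenizer.add_tokens(...)` followed by `model.resize_token_embeddings(len(tokenizer))`.

```python
import random


def extend_vocab(vocab: dict[str, int], new_tokens: list[str]) -> int:
    """Append tokens not already in the vocabulary; return how many were added."""
    added = 0
    for token in new_tokens:
        if token not in vocab:
            vocab[token] = len(vocab)  # next free integer ID
            added += 1
    return added


def resize_embeddings(embeddings: list[list[float]], new_size: int, seed: int = 0) -> None:
    """Grow the embedding table in place; new rows get small random values."""
    rng = random.Random(seed)
    dim = len(embeddings[0])
    while len(embeddings) < new_size:
        embeddings.append([rng.gauss(0.0, 0.02) for _ in range(dim)])


# Toy stand-ins for a real tokenizer vocabulary and embedding matrix (assumptions).
vocab = {"<pad>": 0, "recommend": 1, "a": 2, "movie": 3}
embeddings = [[0.0] * 4 for _ in vocab]

# Eight tokens is an arbitrary illustrative size, not a value from the article.
sid_tokens = [f"<|sid_{i}|>" for i in range(8)]
num_added = extend_vocab(vocab, sid_tokens)
resize_embeddings(embeddings, len(vocab))
```

The new embedding rows start out uninformative, which is why the continued pretraining step is needed before the model can use the semantic ID tokens meaningfully.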

Why This Matters

This approach combines the best of both recommender systems (RecSys) and language models:

Language Models: They have world knowledge and can eloquently talk about products, but they lack awareness of specific catalogs and suffer from popularity bias.
Recommender Systems: These are trained on our catalog and billions of user interactions, excelling at predicting what a user will click or buy next. However, they cannot be steered via natural language or reason about their choices.

By merging these capabilities, we get a model that can:

Recommend Items: Based on historical interactions.
Steer Recommendations: Through natural language interactions.
Reason and Explain: Provide explanations for its recommendations.
Creative Naming: Generate creative names for product bundles.
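
To make the "bilingual" interaction concrete, here is a minimal sketch of how an application might mix semantic ID tokens with a natural-language request, then map the IDs in a response back to catalog items. The prompt layout, the catalog contents, and the helper names are assumptions for illustration. It also assumes one token per item for simplicity, whereas real semantic IDs are often a short sequence of tokens per item.

```python
import re

# Matches tokens of the form <|sid_N|>, the format shown in this article.
SID_PATTERN = re.compile(r"<\|sid_\d+\|>")


def build_prompt(history: list[str], request: str) -> str:
    """Combine a user's item history (as semantic IDs) with a natural-language steer."""
    history_text = "".join(history)
    return f"History: {history_text}\nRequest: {request}\nRecommendation:"


def extract_sids(text: str) -> list[str]:
    """Pull every semantic ID token out of free-form model output, in order."""
    return SID_PATTERN.findall(text)


def resolve(sids: list[str], catalog: dict[str, str]) -> list[str]:
    """Map semantic IDs to item titles, skipping IDs that are not in the catalog."""
    return [catalog[sid] for sid in sids if sid in catalog]


# Hypothetical catalog; titles and IDs are made up for this example.
catalog = {
    "<|sid_3|>": "Trail running shoes",
    "<|sid_5|>": "Merino hiking socks",
    "<|sid_7|>": "Waterproof hiking boots",
}

prompt = build_prompt(["<|sid_3|>", "<|sid_5|>"], "something for wet weather")

# Hand-written stand-in for a model response; not real model output.
response = "Try <|sid_7|>, which pairs well with <|sid_5|>. Avoid <|sid_99|>."
titles = resolve(extract_sids(response), catalog)
```

Filtering against the catalog in `resolve` is a defensive choice: a generative model can emit an ID that does not correspond to any real item, so the application should validate IDs before showing results.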

Implementation Details

To implement this, follow these steps:

Data Preparation:
- Collect user interaction data (e.g., clicks, purchases).
- Map items to semantic IDs.
- Combine text and interaction sequences into a single dataset.
Model Training:
- Extend the vocabulary of an existing LLM with semantic ID tokens.
- Continue pretraining the model on the combined dataset so it learns how semantic IDs relate to item text and to user behavior.
- Finetune the model on more specific tasks, such as recommendation generation.
Evaluation:
- Test the model’s performance in recommending items based on historical interactions.
- Evaluate its ability to reason and explain recommendations through natural language interactions.
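
The data preparation and the first evaluation step above might look like the following sketch. It assumes a leave-last-out split and a hit-rate-at-k metric, which are common choices in recommender evaluation but are not specified in this article. The item names, the ID mapping, and the "recommended" lists are hypothetical placeholders rather than real data or model output.

```python
def to_training_text(history: list[str], item_to_sid: dict[str, str]) -> str:
    """Render a chronological item history as a string of semantic ID tokens."""
    return "".join(item_to_sid[item] for item in history if item in item_to_sid)


def leave_last_out(history: list[str]) -> tuple[list[str], str]:
    """Split a history into context (all but the last item) and the held-out target."""
    return history[:-1], history[-1]


def hit_rate_at_k(recommended: list[list[str]], targets: list[str], k: int) -> float:
    """Fraction of users whose held-out item appears in their top-k recommendations."""
    if not targets:
        return 0.0
    hits = sum(target in recs[:k] for recs, target in zip(recommended, targets))
    return hits / len(targets)


# Hypothetical item-to-semantic-ID mapping and user histories.
item_to_sid = {"shoes": "<|sid_0|>", "socks": "<|sid_1|>", "shorts": "<|sid_2|>"}
histories = [["shoes", "socks", "shorts"], ["socks", "shoes"]]

contexts, targets = [], []
for history in histories:
    context, target = leave_last_out(history)
    contexts.append(to_training_text(context, item_to_sid))
    targets.append(item_to_sid[target])

# Placeholder ranked lists standing in for model predictions (not real output).
recommended = [["<|sid_2|>", "<|sid_0|>"], ["<|sid_2|>", "<|sid_1|>"]]
score = hit_rate_at_k(recommended, targets, k=2)
```

A metric like this covers only the recommendation side. Judging the quality of explanations and steerability generally needs separate checks, such as human review of sampled conversations.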

Demo and Code

Eugene Yan has provided a demo video and code to help you get started:

Demo Video: LLM-Recommender Hybrid with Steerable Recommendations and Reasoning
Code Repository: GitHub - semantic-ids-llm

Note that this is a small model with basic finetuning, so its effectiveness can vary with how you prompt it. Because the finetuning was limited, it is also less general-purpose and less robust than most LLMs.

Conclusion

The LLM-RecSys hybrid built on Semantic IDs combines the rich behavioral data that recommender systems learn from with the natural language capabilities of LLMs. The result is a recommendation model that can be steered through conversation and can explain its choices, while still recommending items based on historical interactions.