Dec 15, 2025
RAG that feels magical: personalization without confusion
Personalization in AI products often feels like a gimmick—a few variables swapped here and there. But with Retrieval-Augmented Generation (RAG), we have the opportunity to make a product truly understand the user's specific context.
### The Challenge of Over-Retrieval

The biggest mistake in RAG implementations is pulling in too much noise. When you dump ten irrelevant documents into an LLM prompt just because they matched a vector search, you dilute the signal. The "magic" happens in the filtering.
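The filtering idea above can be sketched in a few lines. This is a hypothetical illustration, not CosmiQ's actual code: the `0.75` threshold, the `hits` shape, and `max_docs` are all assumed values you would tune per embedding model.

```python
# Minimal sketch (hypothetical data): drop weak vector-search matches
# instead of stuffing every hit into the prompt.
SCORE_THRESHOLD = 0.75  # assumed cutoff; tune per embedding model


def filter_hits(hits, threshold=SCORE_THRESHOLD, max_docs=5):
    """Keep only hits whose similarity score clears the threshold."""
    strong = [h for h in hits if h["score"] >= threshold]
    # Sort best-first and cap the count so the prompt stays focused.
    strong.sort(key=lambda h: h["score"], reverse=True)
    return strong[:max_docs]


hits = [
    {"doc": "meeting notes", "score": 0.91},
    {"doc": "old draft", "score": 0.62},
    {"doc": "project spec", "score": 0.78},
]
print([h["doc"] for h in filter_hits(hits)])  # → ['meeting notes', 'project spec']
```

The point is that the cap and the threshold work together: a threshold alone can still let ten mediocre matches through on a vague query.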
### My Approach at CosmiQ

At CosmiQ, we implemented a multi-stage retrieval process:

1. **User Identity Filtering**: Only search within the user's explicit knowledge base.
2. **Temporal Weighting**: Prioritize documents and interactions from the last 7 days.
3. **Semantic Reranking**: Use a smaller, faster model to verify the relevance of the top 20 matches before passing the top 5 to the main LLM.
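The three stages can be stitched together roughly as follows. This is a sketch under stated assumptions, not the production pipeline: the hit schema, the `1.5` recency boost, and the injected `rerank_fn` (which stands in for the smaller reranking model) are all hypothetical.

```python
from datetime import datetime, timedelta

RECENCY_WINDOW = timedelta(days=7)
RECENCY_BOOST = 1.5  # assumed multiplier for recently touched documents


def retrieve(query_hits, user_id, now, rerank_fn, top_k=5, rerank_pool=20):
    """Three-stage retrieval: identity filter -> temporal weighting -> rerank."""
    # 1. User identity filtering: only the user's own knowledge base.
    mine = [h for h in query_hits if h["owner"] == user_id]
    # 2. Temporal weighting: boost documents from the last 7 days.
    for h in mine:
        recent = now - h["updated_at"] <= RECENCY_WINDOW
        h["weighted"] = h["score"] * (RECENCY_BOOST if recent else 1.0)
    mine.sort(key=lambda h: h["weighted"], reverse=True)
    # 3. Semantic reranking: a cheaper model re-scores the top pool,
    #    and only the best top_k reach the main LLM's prompt.
    pool = mine[:rerank_pool]
    pool.sort(key=lambda h: rerank_fn(h["doc"]), reverse=True)
    return pool[:top_k]
```

One design note: the rerank stage only sees the 20 survivors of the first two stages, so the expensive relevance check stays cheap regardless of how large the user's knowledge base grows.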
The result? Users mentioned that the AI "remembered" things they'd forgotten they even wrote. That’s the outcome we’re aiming for.
Enjoyed this note?
I regularly share thoughts on building AI products and scaling engineering loops. Let's connect if you're shipping something interesting.