Scale-Aware Recommendation Engine (Walmart)
Business Impact
+10% CTR, +25% Recall@K
Scale
Billions of transactions analyzed

The Challenge
Walmart needed to optimize cross-selling across a massive catalog with billions of historical transactions. Legacy systems were rule-based, unscalable, and couldn't capture semantic product relationships.
The Architecture
Designed a two-tower recommendation architecture that uses BERT embeddings to represent both products and users. The pipeline runs in five stages: Data Lake (BigQuery) → Feature Engineering (PySpark on Dataproc) → BERT Embedding Layer → Ranking Algorithm (XGBoost + Business Rules) → Serving Infrastructure (Vertex AI Endpoints). Candidate generation runs as an offline batch job, while personalized ranking is scored in real time.
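The split between offline candidate generation and real-time ranking can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy two-dimensional vectors stand in for BERT embeddings, the `rank_score` callable stands in for the XGBoost ranker, and the out-of-stock filter and margin boost are hypothetical business rules. It is not the production code.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def generate_candidates(
    user_vec: Vector, item_vecs: Dict[str, Vector], k: int
) -> List[Tuple[str, float]]:
    """Stage 1 (batch in the case study): top-k items by embedding similarity."""
    scored = [(item_id, cosine(user_vec, vec)) for item_id, vec in item_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def rerank(
    candidates: List[Tuple[str, float]],
    rank_score: Callable[[str, float], float],
    is_allowed: Callable[[str], bool],
) -> List[str]:
    """Stage 2 (real time in the case study): filter by rules, then re-score."""
    kept = [
        (item_id, rank_score(item_id, similarity))
        for item_id, similarity in candidates
        if is_allowed(item_id)
    ]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [item_id for item_id, _ in kept]


# Toy data (hypothetical): stand-ins for item-tower and user-tower outputs.
ITEM_VECS = {
    "tent": [0.9, 0.1],
    "sleeping_bag": [0.8, 0.3],
    "blender": [0.1, 0.9],
    "lantern": [0.7, 0.2],
}
USER_VEC = [1.0, 0.2]
OUT_OF_STOCK = {"tent"}          # hypothetical business rule: hide these items
MARGIN_BOOST = {"lantern": 0.1}  # hypothetical business rule: promote these

candidates = generate_candidates(USER_VEC, ITEM_VECS, k=3)
recommendations = rerank(
    candidates,
    # Stand-in for a trained ranker: similarity plus a rule-based boost.
    rank_score=lambda item_id, sim: sim + MARGIN_BOOST.get(item_id, 0.0),
    is_allowed=lambda item_id: item_id not in OUT_OF_STOCK,
)
print(recommendations)
```

Keeping the two stages separate means the expensive similarity search over the full catalog can be precomputed, and only a short candidate list has to be scored at request time.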
System Architecture Diagram
graph LR
A[Data Lake<br/>BigQuery] --> B[Feature Engineering<br/>PySpark/Dataproc]
B --> C[BERT Embedding<br/>Layer]
C --> D[Ranking Algorithm<br/>XGBoost + Rules]
D --> E[Serving Infrastructure<br/>Vertex AI Endpoints]
E --> F[Walmart.com<br/>Personalization]
G[A/B Testing<br/>Framework] -.->|Metrics| E
H[Retraining<br/>Pipeline] -.->|Daily| C
style A fill:#0066ff,stroke:#0052cc,stroke-width:2px,color:#fff
style B fill:#4C9AFF,stroke:#0066ff,stroke-width:2px,color:#fff
style C fill:#0066ff,stroke:#0052cc,stroke-width:2px,color:#fff
style D fill:#4C9AFF,stroke:#0066ff,stroke-width:2px,color:#fff
style E fill:#0066ff,stroke:#0052cc,stroke-width:2px,color:#fff
style F fill:#4C9AFF,stroke:#0066ff,stroke-width:2px,color:#fff
style G fill:#666,stroke:#444,stroke-width:1px,color:#fff
style H fill:#666,stroke:#444,stroke-width:1px,color:#fff
The Impact
Achieved a 10% CTR lift and a 25% improvement in Recall@K. The system processes billions of product interactions daily and serves personalized recommendations to millions of customers. It is deployed on GCP with automated retraining pipelines and an A/B testing framework.
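For readers unfamiliar with the two headline metrics, the sketch below shows one common way to compute them. It assumes Recall@K is the fraction of a user's relevant items that appear in the top K recommendations, and that the lift figures are relative changes over a control group. The case study does not state either definition, and all numbers in the example are invented for illustration.

```python
from typing import List, Set


def recall_at_k(recommended: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of relevant items that appear in the top-k recommendations."""
    if not relevant:
        return 0.0
    hits = len(set(recommended[:k]) & relevant)
    return hits / len(relevant)


def relative_lift(control: float, treatment: float) -> float:
    """Relative change of the treatment metric over the control metric."""
    return (treatment - control) / control


# Invented example: 1 of the 3 relevant items is in the top 3.
example_recall = recall_at_k(["a", "b", "c", "d"], {"b", "d", "e"}, k=3)

# Invented A/B numbers: CTR is clicks divided by impressions in each arm.
control_ctr = 200 / 10_000
treatment_ctr = 220 / 10_000
example_lift = relative_lift(control_ctr, treatment_ctr)

print(f"Recall@3 = {example_recall:.3f}, CTR lift = {example_lift:.1%}")
```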