Swiggy: Company Context
Swiggy is India's largest food delivery platform, founded in 2014 by Sriharsha Majety, Nandan Reddy, and Rahul Jaimini. Operating in 600+ cities, Swiggy has transformed how Indians order food.
Key Metrics (2026)
- 1.5 million orders/day (550+ million annually)
- 350,000+ restaurant partners
- 300,000+ delivery executives
- 600+ cities covered
- 30-minute average delivery time
- ₹8,000+ crore annual revenue
Data Infrastructure
Swiggy's analytics runs on:
- Geospatial database: Real-time location tracking of 300K delivery partners
- Streaming pipeline: Apache Kafka processing 50K events/second (orders, GPS pings, restaurant updates)
- ML platform: Demand forecasting, ETA prediction, dynamic pricing models
- Time-series database: Historical order patterns, weather data, traffic conditions
- A/B testing framework: 100+ live experiments on pricing, UI, recommendations
Analytics Team Structure
- Delivery Analytics: Route optimization, ETA prediction, fleet management
- Demand Analytics: Order forecasting, restaurant capacity planning, surge pricing
- Growth Analytics: Customer acquisition, retention, LTV modeling
- Operations Analytics: Restaurant onboarding, quality control, fraud detection
- Product Analytics: App usage funnels, feature adoption, personalization
Swiggy's analytics system is like air traffic control for food delivery: tracking 300,000 delivery partners in real time, predicting where demand will spike in the next 30 minutes, and dynamically rerouting orders to minimize delivery time. Every second of delay costs money; every optimization saves lakhs.
The Business Problem
Swiggy faces three critical analytics challenges:
1. Delivery Time Optimization
Problem: Late deliveries = refunds + bad reviews + customer churn.
Challenge:
- Traffic variability: Same route takes 10 min (midnight) vs 35 min (rush hour)
- Weather impact: Rain increases delivery time by 40%+
- Restaurant delays: Food not ready when delivery partner arrives
- Distance vs speed trade-off: Assign nearest delivery partner or fastest available?
Traditional approach: Fixed 30-min delivery promise for all orders → Result: 25% late deliveries (refunds cost ₹50 crore/year)
Data-driven approach: Dynamic ETA prediction using ML → Result: 8% late deliveries (67% reduction in refund costs)
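The "nearest vs fastest" trade-off above can be sketched with a toy scorer: instead of always picking the geographically nearest partner, rank partners by estimated time-to-pickup (travel time plus time until the partner is free). All partner data, speeds, and field names here are invented for illustration; this is not Swiggy's actual assignment logic.

```python
# Toy partner-assignment sketch: nearest-by-distance vs fastest-to-pickup.
# All numbers and field names are illustrative.

def pickup_eta_minutes(partner, km_per_min=0.4):
    """Estimate minutes until this partner can collect the order."""
    travel = partner["distance_km"] / km_per_min
    # A busy partner must first finish the current drop-off.
    return travel + partner["minutes_to_free"]

partners = [
    {"id": "DP1", "distance_km": 1.0, "minutes_to_free": 12},  # nearest, but busy
    {"id": "DP2", "distance_km": 2.0, "minutes_to_free": 0},   # farther, but idle
]

nearest = min(partners, key=lambda p: p["distance_km"])
fastest = min(partners, key=pickup_eta_minutes)

print(nearest["id"])  # DP1
print(fastest["id"])  # DP2: 5 min of travel beats 2.5 min of travel + 12 min of waiting
```

Even this toy example shows why "nearest" is the wrong objective during peak hours, when most nearby partners are mid-delivery.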
2. Demand Forecasting for Peak Hours
Problem: Too few delivery partners = long wait times; too many = idle partners (wasted cost).
Challenge:
- Lunch rush: 12-2 PM sees 3× normal demand
- Dinner peak: 7-10 PM sees 5× normal demand
- Weekend spikes: Saturday dinner is 2× higher than on weekdays
- Event-driven surges: an India vs Pakistan cricket match = 10× spike in specific zones
- Weather dependency: rainy days see 60% more orders (people avoid going out)
Traditional approach: Fixed fleet size throughout the day → Result: 15-min wait times during peak + 40% idle capacity during off-peak
Data-driven approach: Hourly demand forecasting with dynamic fleet allocation → Result: 5-min average wait time + 15% idle capacity (60% cost savings)
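The fixed-vs-dynamic contrast above reduces to simple arithmetic once you have an hourly forecast. A minimal sketch, assuming one partner handles roughly two orders per hour (the demand numbers are invented):

```python
# Fleet-sizing sketch: staff each hour for forecast demand instead of
# sizing a fixed fleet for the peak. All numbers are illustrative.
import math

ORDERS_PER_PARTNER_PER_HOUR = 2  # assumed throughput per partner

def required_partners(forecast_orders):
    """Partners needed to serve a forecast order volume in one hour."""
    return math.ceil(forecast_orders / ORDERS_PER_PARTNER_PER_HOUR)

hourly_forecast = {10: 40_000, 13: 120_000, 16: 40_000, 20: 200_000}

# A fixed fleet must be sized for the worst hour ...
fixed_fleet = required_partners(max(hourly_forecast.values()))

# ... which leaves most of it idle the rest of the day.
for hour, orders in sorted(hourly_forecast.items()):
    dynamic = required_partners(orders)
    print(f"{hour:02d}:00  need {dynamic:>7,}  (fixed fleet idles {fixed_fleet - dynamic:,})")
```

The real scheduling problem also has to account for partner incentives and travel between zones, but the core idea is exactly this: match supply to the forecast, hour by hour.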
3. Restaurant Recommendations
Problem: Users spend 8-10 minutes browsing before ordering (high friction).
Typical user journey:
| Stage | Users | Drop-off |
|-------|-------|----------|
| App open | 100K | - |
| Search/Browse | 85K | 15% |
| Restaurant page | 35K | 59% |
| Menu | 28K | 20% |
| Checkout | 22K | 21% |
| Payment | 20K | 9% |
Key insights from analytics:
- Paradox of choice: Showing 100 restaurants overwhelms users (60% exit without ordering)
- Search mismatch: Users search "biryani" but see irrelevant results (Chinese, pizza)
- Price sensitivity: 70% of users filter by "under ₹300" but the default sort shows ₹500+ restaurants first
- Delivery time: 45% prefer "fastest delivery" over "best rated"
Data-driven solutions: Personalized restaurant ranking using collaborative filtering + contextual factors (time of day, past orders, weather).
Scale context: Reducing average browse time by 1 minute = 500,000 hours saved daily for users + 5% higher conversion (₹400 crore additional revenue/year).
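The collaborative-filtering idea mentioned above can be sketched in a few lines: find users with a similar order history and recommend restaurants they ordered from that you haven't. Production systems use matrix factorization or learned embeddings over millions of users; the users, restaurants, and counts below are invented.

```python
# Minimal user-based collaborative filtering over an order-count matrix.
# All users, restaurants, and counts are invented for illustration.
import math

orders = {  # user -> {restaurant: times_ordered}
    "u1": {"Biryani House": 5, "Pizza Hub": 1},
    "u2": {"Biryani House": 4, "Kebab Corner": 3},
    "u3": {"Pizza Hub": 6, "Pasta Point": 2},
}

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (dicts)."""
    dot = sum(a.get(k, 0) * b.get(k, 0) for k in set(a) | set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

def recommend(user):
    """Top unvisited restaurant, weighted by similar users' orders."""
    scores = {}
    for other, basket in orders.items():
        if other == user:
            continue
        sim = cosine(orders[user], basket)
        for rest, count in basket.items():
            if rest not in orders[user]:
                scores[rest] = scores.get(rest, 0.0) + sim * count
    return max(scores, key=scores.get)

print(recommend("u1"))  # Kebab Corner: u2's basket is most similar to u1's
```

Contextual factors (time of day, weather) would enter as re-ranking features on top of a score like this.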
Data They Used & Analytics Approach
1. Delivery Optimization: Geospatial Analytics
Data sources:
```python
# Real-time delivery partner location (GPS pings every 10 seconds)
{
    "partner_id": "DP12345",
    "lat": 12.9716,
    "lon": 77.5946,
    "timestamp": "2026-03-24 19:45:23",
    "status": "idle",  # idle | en_route_to_restaurant | picked_up | delivering
    "current_order": None
}

# Historical delivery data
{
    "order_id": "O987654",
    "restaurant_lat_lon": (12.9352, 77.6245),
    "customer_lat_lon": (12.9698, 77.6450),
    "actual_delivery_time": 28,  # minutes
    "predicted_delivery_time": 25,
    "traffic_level": "medium",
    "weather": "clear",
    "hour_of_day": 19,
    "day_of_week": "Friday"
}
```
Analytics approach: ETA Prediction Model
Swiggy uses a gradient boosting model (XGBoost) to predict delivery time:
```python
import xgboost as xgb
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

# Feature engineering
def extract_features(order_data):
    """Convert a raw orders DataFrame to ML features."""
    features = pd.DataFrame()

    # Distance features (calculate_distance: a Haversine helper, defined elsewhere)
    features['haversine_distance'] = calculate_distance(
        order_data['restaurant_lat_lon'],
        order_data['customer_lat_lon']
    )

    # Time features
    features['hour'] = order_data['timestamp'].dt.hour
    features['day_of_week'] = order_data['timestamp'].dt.dayofweek
    features['is_weekend'] = (features['day_of_week'] >= 5).astype(int)
    features['is_peak_hour'] = (
        ((features['hour'] >= 12) & (features['hour'] <= 14)) |
        ((features['hour'] >= 19) & (features['hour'] <= 21))
    ).astype(int)

    # Weather features
    features['is_raining'] = (order_data['weather'] == 'rain').astype(int)
    features['temperature'] = order_data['temperature']

    # Traffic features (from Google Maps API or a custom traffic model)
    features['traffic_level'] = order_data['traffic_level'].map(
        {'low': 1, 'medium': 2, 'high': 3}
    )

    # Historical features (restaurant average prep time)
    features['avg_restaurant_prep_time'] = order_data['restaurant_id'].map(
        historical_prep_times  # precomputed lookup table: restaurant_id -> minutes
    )

    return features

# Train model
X = extract_features(historical_orders)
y = historical_orders['actual_delivery_time']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = xgb.XGBRegressor(
    n_estimators=200,
    max_depth=6,
    learning_rate=0.1,
    objective='reg:squarederror'
)
model.fit(X_train, y_train)

# Predict ETA for held-out orders and evaluate
predicted_eta = model.predict(X_test)
print(f"Mean Absolute Error: {np.mean(np.abs(y_test - predicted_eta)):.1f} minutes")
# Output: Mean Absolute Error: 2.3 minutes (industry benchmark: 3-4 minutes)
```
Real-world impact:
- Prediction accuracy: ±2 minutes for 80% of orders
- Assignment optimization: Reduced average delivery time from 35 min → 28 min
- Customer satisfaction: Late delivery rate dropped from 25% → 8%
2. Demand Forecasting: Time-Series Analysis
SQL: Analyzing historical demand patterns
```sql
-- Hourly order volume by zone (used for forecasting)
WITH hourly_orders AS (
    SELECT
        zone_id,
        zone_name,
        DATE_TRUNC('hour', order_time) AS hour,
        DATE_PART('dow', order_time) AS day_of_week,   -- 0=Sunday, 6=Saturday
        DATE_PART('hour', order_time) AS hour_of_day,
        COUNT(*) AS order_count,
        COUNT(DISTINCT customer_id) AS unique_customers,
        AVG(order_value) AS avg_order_value
    FROM orders
    WHERE order_time >= CURRENT_DATE - INTERVAL '90 days'
    GROUP BY 1, 2, 3, 4, 5
)
SELECT
    zone_name,
    hour_of_day,
    day_of_week,
    AVG(order_count) AS avg_orders,
    STDDEV(order_count) AS stddev_orders,
    MAX(order_count) AS peak_orders,
    -- Predict required delivery partners (1 partner can handle 2 orders/hour)
    CEIL(AVG(order_count) / 2.0) AS required_partners
FROM hourly_orders
WHERE day_of_week IN (0, 6)  -- Weekends only
GROUP BY 1, 2, 3
ORDER BY zone_name, day_of_week, hour_of_day;
```
Python: Demand forecasting with Prophet
```python
from prophet import Prophet
import pandas as pd

# Load historical order data (connection: an existing database connection)
df = pd.read_sql("""
    SELECT
        DATE_TRUNC('hour', order_time) AS ds,
        COUNT(*) AS y
    FROM orders
    WHERE zone_id = 'BLR_KORAMANGALA'
      AND order_time >= CURRENT_DATE - INTERVAL '180 days'
    GROUP BY 1
    ORDER BY 1
""", connection)

# Add external regressors (weather, holidays)
df['is_raining'] = df['ds'].map(weather_data)            # 1 if raining, 0 otherwise
df['is_cricket_match'] = df['ds'].map(cricket_schedule)  # 1 if India match, 0 otherwise

# Train forecasting model
model = Prophet(
    yearly_seasonality=True,
    weekly_seasonality=True,
    daily_seasonality=True
)
model.add_regressor('is_raining')
model.add_regressor('is_cricket_match')
model.fit(df)

# Forecast next 7 days, hourly
future = model.make_future_dataframe(periods=24 * 7, freq='H')
future['is_raining'] = get_weather_forecast(future['ds'])       # from weather API
future['is_cricket_match'] = get_cricket_schedule(future['ds'])

forecast = model.predict(future)
print(forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail(24))  # next 24 hours
```
Business impact:
- Fleet optimization: Reduced idle time by 60% (saves ₹200 crore/year in partner payouts)
- Customer wait time: Reduced from 15 min → 5 min during peak hours
- Surge pricing accuracy: Predicts demand spikes 2 hours in advance (enables dynamic pricing)
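Forecast-driven surge pricing can be sketched as a step function of the demand/supply ratio: compare forecast orders against the capacity of partners currently online and step the price multiplier up as the deficit grows. The thresholds, multipliers, and per-partner throughput below are illustrative, not Swiggy's actual pricing rules.

```python
# Surge-multiplier sketch: price as a step function of the
# forecast-demand / available-capacity ratio. All numbers are illustrative.

def surge_multiplier(forecast_orders, online_partners, orders_per_partner=2):
    """Return a price multiplier based on how oversubscribed the zone is."""
    capacity = online_partners * orders_per_partner
    ratio = forecast_orders / capacity
    if ratio <= 1.0:
        return 1.0   # enough capacity: no surge
    if ratio <= 1.25:
        return 1.2   # mild deficit
    return 1.5       # heavy deficit: maximum surge

print(surge_multiplier(100, 60))  # 1.0 (ratio ~0.83)
print(surge_multiplier(140, 60))  # 1.2 (ratio ~1.17)
print(surge_multiplier(200, 60))  # 1.5 (ratio ~1.67)
```

Because the forecast runs hours ahead, the multiplier can be announced to partners early, pulling more of them online before the deficit materializes.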
Key Results & Impact
1. Delivery Time Reduction
Before analytics:
- Average delivery time: 35 minutes
- Late delivery rate: 25%
- Refund costs: ₹50 crore/year
- Customer satisfaction (CSAT): 3.2/5
After analytics:
- Average delivery time: 28 minutes (20% improvement)
- Late delivery rate: 8% (67% reduction)
- Refund costs: ₹16 crore/year (68% savings)
- Customer satisfaction (CSAT): 4.1/5 (28% improvement)
2. Fleet Optimization
Before analytics:
- Fixed fleet size: 250K delivery partners active throughout the day
- Idle time: 40% (10-11 AM, 3-6 PM)
- Peak-hour wait time: 15 minutes (insufficient partners during lunch/dinner)
- Annual partner payouts: ₹500 crore (including idle time)
After analytics:
- Dynamic fleet allocation: 180K partners during off-peak, 300K during peak
- Idle time: 15% (optimal utilization)
- Peak-hour wait time: 5 minutes (right-sized fleet)
- Annual partner payouts: ₹380 crore (24% savings while improving service)
3. Revenue Impact
Personalized restaurant recommendations:
- Browse time reduced from 8 min → 5 min (faster decision-making)
- Conversion rate improved from 20% → 25% (5 percentage points)
- Order frequency increased from 2.5 → 3.2 orders/month/user (better recommendations)
- Net revenue impact: ₹400 crore additional GMV/year
Demand-based surge pricing:
- Implemented dynamic pricing during peak hours (1.2×-1.5× normal price)
- Reduced customer wait time (incentivizes more partners to come online)
- Net revenue impact: ₹150 crore additional revenue/year
ROI of analytics team: Swiggy's 200-person analytics team costs ~₹100 crore/year in salaries + infrastructure. Combined savings + revenue impact = ₹700+ crore/year: a 7× return on investment.
What You Can Learn from Swiggy
1. Geospatial Analytics is a Superpower for Location-Based Businesses
Key insight: Swiggy's competitive advantage isn't just technology; it's their ability to model the chaotic Indian traffic/weather/restaurant ecosystem with data.
How to apply this:
- If you work in logistics/delivery: Learn geospatial SQL (PostGIS), distance calculations (Haversine formula), and map visualization (Folium, Mapbox)
- Build a sample project: Analyze Uber trip data, optimize delivery routes for an imaginary food delivery app, or predict taxi demand by zone
- Portfolio value: Geospatial skills are rare in India; showcasing a project with maps + route optimization instantly stands out
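The Haversine formula mentioned above fits in a few lines of Python; here is a sketch, applied to the sample restaurant and customer coordinates from the order record earlier in this section:

```python
# Haversine great-circle distance between two (lat, lon) points, in km.
import math

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Distance along the Earth's surface, treating the Earth as a sphere."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

# Restaurant -> customer, using the sample order coordinates (Bengaluru):
print(round(haversine_km(12.9352, 77.6245, 12.9698, 77.6450), 1))  # a few km
```

Note this is straight-line distance; real route distance over Bengaluru's road network is longer, which is one reason ETA models add traffic and road features on top.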
2. Real-Time Analytics Requires Different Tools Than Batch Analytics
Key insight: Swiggy can't wait 24 hours for a daily report to assign delivery partners; they need second-by-second decision-making.
The two types of analytics:
| Batch Analytics | Real-Time Analytics |
|-----------------|---------------------|
| SQL query runs overnight; results ready next morning | Query runs in <100 ms; results used immediately |
| Tools: SQL, Python, dbt, Airflow | Tools: Kafka, Flink, Redis, Elasticsearch |
| Use case: monthly revenue report, cohort analysis | Use case: fraud detection, dynamic pricing, ETA prediction |
| Example: "Which restaurants had the highest sales last month?" | Example: "Which delivery partner should I assign to this order right now?" |
How to learn real-time analytics:
- Start with SQL window functions (running totals, moving averages)
- Learn event-driven architecture (how Kafka works)
- Build a streaming project (real-time stock price tracker, live cricket score dashboard)
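The window-function suggestion in the list above can be practiced with zero infrastructure: SQLite, which ships with Python, supports SQL window functions (in SQLite 3.25+). A sketch computing a 3-hour moving average over toy hourly order counts:

```python
# 3-hour moving average of hourly order counts via a SQL window function,
# run in-memory with SQLite. The order counts are toy data.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hourly_orders (hour INTEGER, orders INTEGER)")
conn.executemany(
    "INSERT INTO hourly_orders VALUES (?, ?)",
    [(11, 40), (12, 120), (13, 130), (14, 110), (15, 50)],  # toy lunch peak
)

rows = conn.execute("""
    SELECT hour,
           orders,
           AVG(orders) OVER (
               ORDER BY hour
               ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
           ) AS moving_avg_3h
    FROM hourly_orders
    ORDER BY hour
""").fetchall()

for hour, orders, avg in rows:
    print(f"{hour:02d}:00  orders={orders:>4}  3h avg={avg:.1f}")
```

The same `AVG(...) OVER (ROWS BETWEEN ...)` pattern works in PostgreSQL and most warehouses, so it transfers directly to production SQL.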
3. Domain Knowledge > Fancy Algorithms
Key insight: Swiggy's ETA model isn't state-of-the-art AI; it's XGBoost (a 10-year-old algorithm). The magic is in feature engineering:
- Knowing that rain increases delivery time by 40% (domain knowledge)
- Knowing that restaurant X always takes 12 minutes to prepare food (historical data)
- Knowing that Koramangala traffic is 3× worse at 8 PM than at 3 PM (local knowledge)
How to build domain knowledge:
- When analyzing Zomato/Swiggy data, order food yourself and note patterns (delivery time, restaurant ratings, surge pricing)
- When analyzing e-commerce data, browse Flipkart/Amazon and observe their recommendation logic
- When building a portfolio project, pick an industry you understand (cricket, food, movies) rather than generic datasets
The best analysts aren't just good at SQL/Python; they understand the business deeply.