📊
Dataset & Setup
Data Sources:
- Naukri.com API (Web scraping)
- LinkedIn Jobs (Manual export or scraping)
- Kaggle: Data Analyst Jobs dataset
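Whichever source is used, the postings end up as a table. The following is a minimal sketch of loading a CSV export into pandas, assuming its columns match the sample structure in this section. The inline CSV text is a made-up stand-in for a real file; with an actual download, the file path would be passed to pd.read_csv instead.

```python
import io
import pandas as pd

# Hypothetical stand-in for a downloaded CSV export; a real file path
# would be passed to pd.read_csv in place of the StringIO buffer.
raw_csv = io.StringIO(
    "job_title,company,location,salary_min,salary_max,posted_date\n"
    "Data Analyst,Flipkart,Bangalore,500000,800000,2026-03-01\n"
    "Business Analyst,Amazon,Mumbai,600000,1000000,2026-03-05\n"
)

# parse_dates converts posted_date to a datetime column on load
jobs = pd.read_csv(raw_csv, parse_dates=["posted_date"])

# Tidy text columns so later group-bys do not split on stray whitespace
for col in ["job_title", "company", "location"]:
    jobs[col] = jobs[col].str.strip()

print(jobs.dtypes)
```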
Sample Data Structure:
code.py (Python)
import pandas as pd

# Sample job data structure
jobs_data = {
    'job_title': ['Data Analyst', 'Business Analyst', 'Data Scientist'],
    'company': ['Flipkart', 'Amazon', 'Swiggy'],
    'location': ['Bangalore', 'Mumbai', 'Pune'],
    'salary_min': [500000, 600000, 800000],
    'salary_max': [800000, 1000000, 1500000],
    'experience': ['2-4 years', '3-5 years', '4-7 years'],
    'skills': ['SQL, Python, Power BI', 'Excel, SQL, Tableau', 'Python, ML, SQL'],
    'posted_date': ['2026-03-01', '2026-03-05', '2026-03-10']
}
df = pd.DataFrame(jobs_data)
print(df.head())
🔍
Key Analyses
1. Skill Demand Analysis:
code.py (Python)
import matplotlib.pyplot as plt

# Extract and count skills
all_skills = df['skills'].str.split(', ').explode()
skill_counts = all_skills.value_counts()
print("Top 10 In-Demand Skills:")
print(skill_counts.head(10))

# Visualize
skill_counts.head(10).plot(kind='barh', color='skyblue')
plt.title('Most Demanded Skills for Data Analysts')
plt.xlabel('Number of Job Postings')
plt.tight_layout()
plt.show()
Insights:
- SQL: 87% of postings
- Excel: 75%
- Power BI: 65%
- Python: 58%
- Tableau: 45%
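The percentages above are shares of postings, whereas value_counts() returns raw counts. The following is a sketch of converting counts into the share of postings that mention each skill. It uses a small made-up frame in the same shape as the sample data, and the names skills_long, postings_per_skill, and skill_share are illustrative.

```python
import pandas as pd

# Made-up postings in the same shape as the sample data
df = pd.DataFrame({
    "job_title": ["Data Analyst", "Business Analyst", "Data Scientist"],
    "skills": ["SQL, Python, Power BI", "Excel, SQL, Tableau", "Python, ML, SQL"],
})

# One row per (posting, skill); the original index identifies the posting
skills_long = df["skills"].str.split(",").explode().str.strip()

# Count each skill at most once per posting, then divide by total postings
postings_per_skill = (
    skills_long.reset_index()
    .drop_duplicates()
    .groupby("skills")["index"]
    .count()
)
skill_share = (postings_per_skill / len(df) * 100).round(1).sort_values(ascending=False)
print(skill_share)
```

Dividing by the number of postings rather than the number of skill mentions keeps the result comparable to statements of the form "X% of postings".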
2. Salary Analysis by City:
code.py (Python)
# Midpoint of the posted salary range for each job
df['avg_salary'] = (df['salary_min'] + df['salary_max']) / 2
# Mean, min, max, and posting count of the midpoint salary per city
city_salary = df.groupby('location')['avg_salary'].agg(['mean', 'min', 'max', 'count'])
city_salary = city_salary.sort_values('mean', ascending=False)
print(city_salary)
3. Top Hiring Companies:
code.py (Python)
top_companies = df['company'].value_counts().head(15)
print("Companies Hiring Most Data Analysts:")
print(top_companies)
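Because postings come from several sources, the same opening may be collected more than once, which would inflate the company counts. The following is a sketch of dropping duplicates before counting. It assumes a posting is identified by its title, company, and location; that key is an assumption, and the rows are made up.

```python
import pandas as pd

# Made-up rows; the second Flipkart row mimics the same posting scraped twice
postings = pd.DataFrame({
    "job_title": ["Data Analyst", "data analyst ", "Business Analyst", "Data Analyst"],
    "company": ["Flipkart", "Flipkart", "Amazon", "Swiggy"],
    "location": ["Bangalore", "Bangalore", "Mumbai", "Pune"],
})

# Normalize case and whitespace so near-identical rows compare equal
key_cols = ["job_title", "company", "location"]
normalized = postings[key_cols].apply(lambda col: col.str.strip().str.lower())

# Keep the first occurrence of each normalized key
deduped = postings.loc[~normalized.duplicated()]

top_companies = deduped["company"].value_counts()
print(top_companies)
```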
💡
Actionable Insights
Key Findings (Based on Real Market Data):
- SQL is non-negotiable: 87% of data analyst jobs require SQL
- BI tools: Power BI (65% of postings) is requested more often than Tableau (45%)
- Python growing: 58% of postings now vs 35% in 2020
- Location premium: Bangalore pays 15% more than other cities
- Experience sweet spot: 2-4 years has most openings
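Findings such as the location premium and the experience sweet spot can be checked against the collected data. The following is a sketch on made-up rows, assuming the same columns as the sample structure. The numbers it prints come from the toy data, not from the market figures above.

```python
import pandas as pd

# Made-up postings with the same columns as the sample structure
df = pd.DataFrame({
    "location": ["Bangalore", "Bangalore", "Mumbai", "Pune"],
    "salary_min": [600000, 800000, 500000, 700000],
    "salary_max": [1000000, 1200000, 900000, 900000],
    "experience": ["2-4 years", "3-5 years", "2-4 years", "2-4 years"],
})
df["avg_salary"] = (df["salary_min"] + df["salary_max"]) / 2

# Location premium: mean Bangalore salary relative to all other cities
is_blr = df["location"] == "Bangalore"
blr_mean = df.loc[is_blr, "avg_salary"].mean()
other_mean = df.loc[~is_blr, "avg_salary"].mean()
premium_pct = (blr_mean / other_mean - 1) * 100
print(f"Bangalore premium: {premium_pct:.1f}%")

# Experience sweet spot: the band with the most openings
experience_counts = df["experience"].value_counts()
print(experience_counts.idxmax())
```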
Career Recommendations:
- Priority 1: Master SQL (window functions, CTEs, joins)
- Priority 2: Learn Power BI (more demand than Tableau in India)
- Priority 3: Python basics (pandas, matplotlib)
- Priority 4: Build 3-5 portfolio projects
- Priority 5: Advanced Excel (still in 75% of job descriptions)