Top 50 Data Analytics Interview Questions & Answers (2026)
A list of data analytics interview questions for freshers and experienced professionals, covering SQL, Excel, Power BI, Python, and the core concepts recruiters commonly ask about.
1. Basic / Conceptual Questions
These foundational questions are commonly asked in screening rounds and are especially important for freshers.
Q1. What is data analytics?
Data analytics is the process of examining raw data to find trends, draw conclusions, and support decision-making. It involves collecting, cleaning, transforming, and modelling data to uncover actionable insights for businesses.
Q2. What are the 4 types of data analytics?
The four types are: Descriptive (what happened — uses historical data), Diagnostic (why it happened — root-cause analysis), Predictive (what will happen — uses statistical models), and Prescriptive (what should we do — recommends actions).
Q3. What is the difference between data analytics and data science?
Data analytics focuses on analysing existing datasets to answer specific business questions, using tools like SQL, Excel, and Power BI. Data science is broader — it includes building ML models, developing algorithms, and working with unstructured data using Python and statistics.
Q4. What is a KPI in data analytics?
KPI stands for Key Performance Indicator. It is a measurable value that shows how effectively a company or individual is achieving a key business objective. Examples include monthly revenue, customer churn rate, and conversion rate.
Q5. What is a data pipeline?
A data pipeline is a series of processes that transport and transform data from one or more sources to a destination for storage, analysis, or reporting. It typically involves ingestion, transformation, validation, and loading steps.
Q6. What is ETL?
ETL stands for Extract, Transform, Load. It is the process of extracting data from multiple source systems, transforming it into a consistent format (cleaning, filtering, aggregating), and loading it into a data warehouse or target system.
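The three ETL stages can be sketched in a few lines of Python. This is a minimal illustration, not a production pipeline: the CSV content, the orders table, and the function names are hypothetical, and an in-memory SQLite database stands in for the data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a source system.
RAW_CSV = """order_id,region,amount
1, north ,100.50
2,South,
3,north,49.50
"""

def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop rows with no amount, standardise region, cast types."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue  # skip rows with a missing amount
        cleaned.append(
            (int(row["order_id"]), row["region"].strip().title(), float(row["amount"]))
        )
    return cleaned

def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # stand-in for a data warehouse
load(transform(extract(RAW_CSV)), conn)
totals = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region"
).fetchall()
print(totals)
```

Keeping each stage in its own function makes it easy to explain in an interview where validation or logging would slot in.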
Q7. What is data cleaning?
Data cleaning (or data cleansing) is the process of identifying and correcting errors, inconsistencies, and missing values in a dataset. Common tasks include removing duplicates, fixing incorrect formats, handling nulls, and standardising values.
Q8. What is a dashboard?
A dashboard is a visual display of KPIs and metrics that provides an at-a-glance view of business performance, refreshed in real time or on a schedule depending on the data source. Dashboards are built in tools like Power BI, Tableau, or Looker and help stakeholders monitor trends and make faster decisions.
Q9. What is the difference between OLAP and OLTP?
OLAP (Online Analytical Processing) is optimised for analytics and complex queries across large historical datasets — used in data warehouses. OLTP (Online Transaction Processing) is optimised for real-time, row-level transactions like order placements or bank transfers.
Q10. What is big data?
Big data refers to datasets that are too large, fast-moving, or varied to be processed efficiently by traditional tools such as a single relational database or a spreadsheet. It is characterised by the 3 Vs: Volume (massive scale), Velocity (high speed of generation), and Variety (structured, semi-structured, unstructured). Tools like Hadoop and Spark are used to process it.
2. SQL Interview Questions
SQL is one of the most frequently tested skills in data analytics interviews. Master these questions before your next interview.
Q11. What is the difference between WHERE and HAVING in SQL?
WHERE filters rows before aggregation is applied, while HAVING filters groups after aggregation. WHERE cannot be used with aggregate functions like COUNT() or SUM(), but HAVING can. Example: SELECT department, COUNT(*) FROM employees GROUP BY department HAVING COUNT(*) > 10.
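To see both filters in one query, here is a small sketch using Python's built-in sqlite3 module. The employees rows and salary figures are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Asha", "Sales", 40000),
        ("Ravi", "Sales", 65000),
        ("Meera", "Sales", 70000),
        ("John", "HR", 60000),
        ("Sara", "HR", 30000),
    ],
)

query = """
    SELECT department, COUNT(*) AS headcount
    FROM employees
    WHERE salary > 50000      -- row filter, applied before grouping
    GROUP BY department
    HAVING COUNT(*) > 1       -- group filter, applied after aggregation
"""
where_having_rows = conn.execute(query).fetchall()
print(where_having_rows)
```

WHERE first removes the two employees earning 50000 or less; HAVING then drops HR because only one HR row survived the row filter.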
Q12. What are the different types of JOINs in SQL?
The main JOIN types are: INNER JOIN (returns matching rows from both tables), LEFT JOIN (all rows from left + matching from right), RIGHT JOIN (all rows from right + matching from left), FULL OUTER JOIN (all rows from both tables), and CROSS JOIN (cartesian product of both tables).
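A quick way to internalise the JOIN types is to run them on a tiny dataset. The sketch below uses Python's sqlite3 module with hypothetical customers and orders tables; RIGHT and FULL OUTER JOIN are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "Asha"), (2, "Ravi"), (3, "Meera")]
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)", [(101, 1, 250), (102, 1, 100), (103, 2, 75)]
)

# INNER JOIN: only customers that have at least one order.
inner = conn.execute(
    "SELECT c.name, o.amount FROM customers c "
    "INNER JOIN orders o ON o.customer_id = c.id ORDER BY o.id"
).fetchall()

# LEFT JOIN: every customer; order columns are NULL when there is no match.
left = conn.execute(
    "SELECT c.name, o.amount FROM customers c "
    "LEFT JOIN orders o ON o.customer_id = c.id ORDER BY c.id, o.id"
).fetchall()

# CROSS JOIN: every customer paired with every order (3 x 3 rows).
cross_count = conn.execute(
    "SELECT COUNT(*) FROM customers CROSS JOIN orders"
).fetchone()[0]

print(inner)
print(left)
print(cross_count)
```

Meera has no orders, so she appears only in the LEFT JOIN result, with None in the amount column.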
Q13. What is a subquery in SQL?
A subquery is a query nested inside another query. It can appear in the SELECT, FROM, or WHERE clause. Subqueries can return a single value (scalar), a list of values, or a full table. Correlated subqueries reference columns from the outer query.
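The difference between a scalar subquery and a correlated subquery is easier to see side by side. This sketch uses Python's sqlite3 module; the employees table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Asha", "Sales", 40000),
        ("Ravi", "Sales", 65000),
        ("Meera", "Sales", 70000),
        ("John", "HR", 50000),
        ("Sara", "HR", 30000),
    ],
)

# Scalar subquery: evaluated once, returns the company-wide average salary.
above_company_avg = conn.execute(
    "SELECT name FROM employees "
    "WHERE salary > (SELECT AVG(salary) FROM employees) "
    "ORDER BY name"
).fetchall()

# Correlated subquery: re-evaluated per outer row, using that row's department.
above_dept_avg = conn.execute(
    "SELECT e.name FROM employees e "
    "WHERE e.salary > (SELECT AVG(salary) FROM employees d "
    "                  WHERE d.department = e.department) "
    "ORDER BY e.name"
).fetchall()

print(above_company_avg)
print(above_dept_avg)
```

John earns less than the company average but more than the HR average, so he appears only in the correlated result.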
Q14. What is the difference between RANK() and DENSE_RANK()?
RANK() assigns the same rank to tied rows but skips subsequent rank numbers (e.g., 1, 2, 2, 4). DENSE_RANK() also assigns the same rank to ties but does not skip numbers (e.g., 1, 2, 2, 3). Both are window functions used with OVER(ORDER BY ...).
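The gap behaviour can be checked directly. The sketch below runs both functions over a hypothetical scores table with one tie, using Python's sqlite3 module (window functions need SQLite 3.25 or newer).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("Asha", 90), ("Ravi", 85), ("Meera", 85), ("John", 70)],
)

ranked = conn.execute(
    """
    SELECT name, score,
           RANK()       OVER (ORDER BY score DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC) AS dense_rnk
    FROM scores
    ORDER BY score DESC, name
    """
).fetchall()

for row in ranked:
    print(row)
```

The two tied rows share rank 2 under both functions; John then gets 4 from RANK() and 3 from DENSE_RANK().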
Q15. What is a CTE (Common Table Expression)?
A CTE is a temporary named result set defined using the WITH clause, scoped to a single query. CTEs improve readability and can be referenced multiple times in the same query. Recursive CTEs are used for hierarchical data like org charts.
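As an illustration of the org-chart case, here is a minimal recursive CTE run through Python's sqlite3 module. The staff table, names, and reporting lines are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (id INTEGER, name TEXT, manager_id INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [
        (1, "Priya", None),  # top of the hierarchy, no manager
        (2, "Arjun", 1),
        (3, "Neha", 2),
        (4, "Kiran", 1),
    ],
)

org_levels = conn.execute(
    """
    WITH RECURSIVE org(id, name, depth) AS (
        -- anchor member: start from the person with no manager
        SELECT id, name, 0 FROM staff WHERE manager_id IS NULL
        UNION ALL
        -- recursive member: add direct reports of rows already in org
        SELECT s.id, s.name, org.depth + 1
        FROM staff s JOIN org ON s.manager_id = org.id
    )
    SELECT name, depth FROM org ORDER BY depth, name
    """
).fetchall()

print(org_levels)
```

The depth column shows how many levels each person sits below the top of the hierarchy.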
Q16. What is GROUP BY used for in SQL?
GROUP BY is used to aggregate data by one or more columns. It collapses multiple rows with the same value into a single row, enabling use of aggregate functions (SUM, COUNT, AVG, MIN, MAX). In standard SQL, every non-aggregated column in SELECT must appear in GROUP BY; some databases relax this rule, but relying on that makes results unpredictable.
Q17. What is the difference between UNION and UNION ALL?
UNION combines result sets from two queries and removes duplicate rows. UNION ALL keeps all rows including duplicates, making it faster. Use UNION when you need distinct results; use UNION ALL for performance when duplicates are acceptable or expected.
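A small sketch makes the duplicate handling concrete. It uses Python's sqlite3 module with two hypothetical single-column tables that share one email address.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE newsletter (email TEXT)")
conn.execute("CREATE TABLE customers (email TEXT)")
conn.executemany(
    "INSERT INTO newsletter VALUES (?)", [("a@example.com",), ("b@example.com",)]
)
conn.executemany(
    "INSERT INTO customers VALUES (?)", [("b@example.com",), ("c@example.com",)]
)

# UNION removes the duplicate b@example.com.
union_rows = conn.execute(
    "SELECT email FROM newsletter UNION SELECT email FROM customers ORDER BY email"
).fetchall()

# UNION ALL keeps every row from both queries.
union_all_rows = conn.execute(
    "SELECT email FROM newsletter UNION ALL SELECT email FROM customers ORDER BY email"
).fetchall()

print(len(union_rows), len(union_all_rows))
```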
Q18. What are window functions in SQL?
Window functions perform calculations across a set of rows related to the current row, without collapsing them into one row. Common examples: ROW_NUMBER(), RANK(), DENSE_RANK(), LAG(), LEAD(), SUM() OVER(), AVG() OVER(). They use the OVER() clause with optional PARTITION BY and ORDER BY.
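The following sketch shows LAG() and a running SUM() partitioned by region, run through Python's sqlite3 module (SQLite 3.25 or newer). The sales table and its figures are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, month TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", "2025-01", 100),
        ("North", "2025-02", 150),
        ("South", "2025-01", 80),
        ("South", "2025-02", 60),
    ],
)

windowed = conn.execute(
    """
    SELECT region, month, amount,
           LAG(amount) OVER (PARTITION BY region ORDER BY month) AS prev_amount,
           SUM(amount) OVER (PARTITION BY region ORDER BY month) AS running_total
    FROM sales
    ORDER BY region, month
    """
).fetchall()

for row in windowed:
    print(row)
```

All four input rows are still present in the output: unlike GROUP BY, the window only adds columns computed from neighbouring rows in the same partition.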
Q19. How do you find duplicate records in SQL?
Use GROUP BY on the columns you suspect are duplicated, then filter with HAVING COUNT(*) > 1. Example: SELECT email, COUNT(*) FROM users GROUP BY email HAVING COUNT(*) > 1. To delete duplicates, use a CTE with ROW_NUMBER() and delete rows where row_num > 1.
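Both steps, finding duplicates and then deleting them, can be sketched with Python's sqlite3 module. The users table is hypothetical, and the delete keeps the row with the lowest id per email, which is one reasonable choice among several. The DELETE ... WHERE id IN (...) form is used here because it runs on SQLite; some databases, such as SQL Server, also allow deleting from the CTE directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [
        (1, "a@example.com"),
        (2, "b@example.com"),
        (3, "a@example.com"),
        (4, "a@example.com"),
    ],
)

# Step 1: find values that occur more than once.
duplicates = conn.execute(
    "SELECT email, COUNT(*) FROM users GROUP BY email HAVING COUNT(*) > 1"
).fetchall()

# Step 2: number the rows within each email and delete all but the first.
conn.execute(
    """
    WITH ranked AS (
        SELECT id,
               ROW_NUMBER() OVER (PARTITION BY email ORDER BY id) AS row_num
        FROM users
    )
    DELETE FROM users WHERE id IN (SELECT id FROM ranked WHERE row_num > 1)
    """
)
remaining = conn.execute("SELECT id, email FROM users ORDER BY id").fetchall()

print(duplicates)
print(remaining)
```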
Q20. What is an index in SQL?
An index is a database object that speeds up data retrieval by creating a separate data structure (typically a B-tree) that points to rows in a table. Indexes improve SELECT performance but add overhead to INSERT/UPDATE/DELETE. Common types: clustered, non-clustered, composite, unique.
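One way to demonstrate the effect of an index is to compare the query plan before and after creating it. The sketch below does this with Python's sqlite3 module; the users table and the index name idx_users_email are hypothetical, and the exact plan wording varies between SQLite versions and other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

def plan_for_email_lookup():
    """Return SQLite's query-plan description for a lookup by email."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?",
        ("user42@example.com",),
    ).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan text

plan_before = plan_for_email_lookup()   # expected to be a full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = plan_for_email_lookup()    # expected to search via the new index

print(plan_before)
print(plan_after)
```

Without the index the engine has to scan every row; with it, the plan switches to a search on idx_users_email.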
3. Excel & Power BI Questions
Tool-specific questions for Microsoft stack roles — very common in analyst and BI developer interviews.
Q21. What is VLOOKUP and when should you use it?
VLOOKUP (Vertical Lookup) is an Excel function that searches for a value in the first column of a range and returns a value from a specified column in the same row. Use it to merge data from two tables. Syntax: =VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup]). For more flexibility, INDEX-MATCH or XLOOKUP are preferred.
Q22. What is a Pivot Table in Excel?
A Pivot Table is a powerful tool in Excel to summarise, analyse, and explore large datasets interactively. You can drag fields into Rows, Columns, Values, and Filters areas to quickly aggregate data, identify trends, and answer business questions — without writing formulas.
Q23. What is the difference between measures and dimensions in Power BI?
Dimensions are categorical fields used to group and filter data (e.g., Region, Product Name, Date). Measures are calculated values that aggregate numeric data (e.g., Total Sales, Average Revenue). In Power BI, measures are written in DAX and are evaluated in the context of visualisation filters.
Q24. What is DAX in Power BI?
DAX (Data Analysis Expressions) is the formula language used in Power BI, Power Pivot, and Analysis Services. It is used to create calculated columns, measures, and tables. Key DAX functions include CALCULATE(), SUMX(), FILTER(), RELATED(), and time intelligence functions like SAMEPERIODLASTYEAR().
Q25. How do you handle missing data in Excel?
Common approaches: use IFERROR() to replace errors with a default value, IF(ISBLANK(A2), ...) to detect empty cells (A2 being whichever cell you are checking), or Power Query's "Replace Values" / "Remove Rows with Errors" features. For statistical imputation, calculate the column mean/median and fill blanks accordingly.
4. Python for Data Analytics
Python questions focus on Pandas and data manipulation — frequently asked in product analytics and data analyst roles at tech companies.
Q26. What is Pandas used for in data analytics?
Pandas is a Python library for data manipulation and analysis. It provides two main data structures: Series (1D) and DataFrame (2D). Use Pandas to read/write CSV/Excel files, filter rows, group and aggregate data, merge datasets, handle missing values, and transform data for analysis.
Q27. What is the difference between iloc and loc in Pandas?
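loc is label-based: it selects rows and columns by their index labels or column names (e.g., df.loc[0:5, "Name"]). iloc is integer position-based: it uses numeric positions regardless of labels (e.g., df.iloc[0:5, 1]). A common follow-up is slicing behaviour: loc slices include the end label, while iloc slices exclude the end position, so on a default index df.loc[0:5] returns six rows and df.iloc[0:5] returns five. Use loc for named access and iloc for positional slicing.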
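A short sketch with a non-default index shows the two accessors diverging. The DataFrame contents are made up, and pandas is assumed to be installed.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Asha", "Ravi", "Meera", "John"], "Score": [90, 85, 85, 70]},
    index=[10, 20, 30, 40],  # labels deliberately differ from positions
)

# loc uses labels, and the end label (30) is included in the slice.
by_label = df.loc[10:30, "Name"]

# iloc uses positions, and the end position (2) is excluded from the slice.
by_position = df.iloc[0:2, 0]

print(by_label.tolist())
print(by_position.tolist())
```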
Q28. How do you handle null values in Pandas?
Key methods: isnull() / notnull() to detect nulls, dropna() to remove rows/columns with nulls, fillna(value) to replace nulls with a specific value (e.g., mean, median, 0, or "Unknown"). For forward/backward filling use ffill() or bfill().
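These methods can be tried on a tiny DataFrame. The sketch below assumes pandas is installed; the column names and values are hypothetical, and filling the numeric column with its mean is just one possible imputation choice.

```python
import pandas as pd

df = pd.DataFrame(
    {"city": ["Pune", None, "Delhi"], "sales": [100.0, None, 300.0]}
)

# Detect: count nulls per column.
null_counts = df.isnull().sum()

# Remove: drop any row that contains a null.
dropped = df.dropna()

# Replace: a per-column fill value (a label for text, the mean for numbers).
filled = df.fillna({"city": "Unknown", "sales": df["sales"].mean()})

# Forward fill: carry the last seen value down into the gap.
forward = df["sales"].ffill()

print(null_counts)
print(filled)
```

Here the mean of the non-null sales values is 200.0, so that is what fills the gap in the sales column.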
Q29. What is a DataFrame in Python?
A DataFrame is a 2D labelled data structure in Pandas, similar to a spreadsheet or SQL table. It has rows (each with an index) and columns (each with a name and dtype). DataFrames can store mixed data types and support powerful operations like groupby, merge, pivot, and apply.
Q30. What libraries are used for data visualisation in Python?
The main libraries are: Matplotlib (foundational, highly customisable static charts), Seaborn (statistical charts built on Matplotlib with better defaults), Plotly (interactive charts for dashboards and web), and Pandas built-in plot() for quick exploratory charts.
5. How to Prepare for a Data Analytics Interview
Master SQL fundamentals first
SQL is asked in almost every data analytics interview. Focus on JOINs, GROUP BY, window functions, and subqueries. Practice on platforms like LeetCode, HackerRank, or Mode Analytics.
Build a portfolio of projects
Interviewers value hands-on experience. Create 2–3 projects using real datasets — a sales dashboard in Power BI, a customer churn analysis in Python, or an Excel KPI tracker.
Understand business context, not just tools
Analytics interviews test your ability to translate business problems into analytical solutions. Practice explaining your thought process: what metric you'd choose, why, and what the insight means.
Revise statistics and probability basics
Concepts like mean vs median, standard deviation, A/B testing, confidence intervals, and correlation vs causation come up frequently — especially at product and tech companies.
Prepare a case study answer
Many interviews include an open-ended case: "How would you measure the success of feature X?" Use a structured framework: define the goal, pick the right metrics, identify data sources, outline analysis steps, and discuss limitations.