Topic 2 of 12

Types of Data & Data Structures

Not all data is created equal. Learn the types and how to work with each one effectively.

📚 Beginner
⏱️ 10 min
✅ 4 quizzes

Why Understanding Data Types Matters

Before you can analyze data, you need to understand what kind of data you're working with. The type of data determines:

  • Which tools you can use
  • What analysis methods are valid
  • How you should visualize it
  • What insights you can extract

Think of it like cooking: You wouldn't use the same techniques for rice and pasta. Similarly, you handle numerical data differently from text data.


The Big Picture: Two Main Classifications

1. Qualitative vs Quantitative Data

| Type | Definition | Examples | Can You Calculate an Average? |
|------|-----------|----------|---------------------------|
| Qualitative | Descriptive, non-numerical | Colors, names, categories, feedback | ❌ No |
| Quantitative | Numerical, measurable | Age, salary, temperature, clicks | ✅ Yes |

Example:

  • "The customer rated the product as 'Excellent'" → Qualitative
  • "The customer gave 5 stars out of 5" → Quantitative

Pro Tip: Quantitative data is easier to analyze statistically, but qualitative data often provides richer context. Ideally, you want both.


2. Structured vs Unstructured Data

| Type | Definition | Examples | Easy to Analyze? |
|------|-----------|----------|-----------------|
| Structured | Organized in rows & columns | Excel tables, SQL databases, CSV files | ✅ Very easy |
| Semi-Structured | Partially organized | JSON, XML, email headers | ⚠️ Moderate |
| Unstructured | No predefined format | Text documents, images, videos, PDFs | ❌ Hard (needs pre-processing) |

Real-World Example:

  • Structured: Customer purchase records in an e-commerce database
  • Semi-Structured: Instagram post metadata (likes, comments, hashtags)
  • Unstructured: Customer support chat transcripts

The Four Levels of Measurement

Data falls into four levels of measurement. Nominal and ordinal data are qualitative (categorical); interval and ratio data are quantitative:

1. Nominal Data (Categories)

Definition: Named categories with no inherent order.

Examples:

  • Gender: Male, Female, Non-binary
  • Product categories: Electronics, Clothing, Food
  • Payment method: Credit card, UPI, Cash

What you CAN do:

  • Count frequencies (e.g., "60% paid by UPI")
  • Find the mode (most common category)

What you CANNOT do:

  • Calculate average
  • Say one category is "greater than" another
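The two operations that are valid for nominal data, counting frequencies and finding the mode, can be sketched in Python with the standard library. The `payments` list below is made-up sample data, not taken from a real dataset.

```python
from collections import Counter

# Hypothetical payment methods for ten orders (nominal data).
payments = ["UPI", "Cash", "UPI", "Credit card", "UPI",
            "Cash", "UPI", "Credit card", "UPI", "UPI"]

counts = Counter(payments)                          # frequency of each category
mode_value, mode_count = counts.most_common(1)[0]   # most frequent category
upi_share = counts["UPI"] / len(payments) * 100     # share as a percentage

print(counts)
print(f"Mode: {mode_value}")
print(f"UPI share: {upi_share:.0f}%")  # 60%
```

Note that nothing here adds or averages the categories themselves; only their counts are treated as numbers.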

2. Ordinal Data (Ordered Categories)

Definition: Categories with a meaningful order, but gaps between values aren't equal.

Examples:

  • Customer satisfaction: Poor, Average, Good, Excellent
  • Education level: High School, Bachelor's, Master's, PhD
  • T-shirt sizes: S, M, L, XL

What you CAN do:

  • Rank items
  • Find median (middle value)
  • Calculate percentiles

What you CANNOT do:

  • Calculate a meaningful average (the gap between "Good" and "Excellent" isn't necessarily the same as the gap between "Poor" and "Average")
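One way to find the median of ordinal data is to map each label to its rank, take the median of the ranks, and map the result back to a label. The sketch below assumes the satisfaction scale from the examples above; the `responses` list is invented for illustration.

```python
from statistics import median_low

# Ordered scale, from lowest to highest category.
SCALE = ["Poor", "Average", "Good", "Excellent"]
RANK = {label: position for position, label in enumerate(SCALE)}

# Hypothetical survey responses.
responses = ["Good", "Poor", "Excellent", "Good", "Average", "Good", "Excellent"]

# median_low always returns a value that exists in the data,
# so the result maps back to a real category even for even-length lists.
median_rank = median_low(RANK[r] for r in responses)
median_label = SCALE[median_rank]

print(median_label)  # Good
```

The ranks are used only for ordering. They are not averaged, because the distance between neighbouring categories is unknown.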

3. Interval Data (Equal Intervals, No True Zero)

Definition: Numerical data with equal intervals between values, but zero doesn't mean "absence of."

Examples:

  • Temperature in Celsius (0°C doesn't mean "no temperature")
  • IQ scores
  • pH levels

What you CAN do:

  • Calculate average, standard deviation
  • Add/subtract values

What you CANNOT do:

  • Multiply or divide meaningfully (you can't say "20°C is twice as hot as 10°C")
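A short sketch of why ratios break down on an interval scale: the same two temperatures give a different ratio once they are expressed in Kelvin, which does have a true zero, while the difference stays the same. The helper `celsius_to_kelvin` is defined here only for illustration.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert Celsius to Kelvin, a scale with a true zero."""
    return celsius + 273.15

warm_c, cool_c = 20.0, 10.0

# Differences are meaningful on an interval scale.
difference_c = warm_c - cool_c
difference_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios are not: the result depends on where the scale puts its zero.
naive_ratio = warm_c / cool_c
kelvin_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(difference_c)           # 10.0
print(naive_ratio)            # 2.0
print(f"{kelvin_ratio:.3f}")  # 1.035
```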

4. Ratio Data (Equal Intervals + True Zero)

Definition: Numerical data with equal intervals AND a meaningful zero point.

Examples:

  • Age (0 = just born)
  • Salary (₹0 = unpaid)
  • Website traffic (0 visitors = no one visited)
  • Weight, height, distance

What you CAN do:

  • All mathematical operations (add, subtract, multiply, divide)
  • Say "Product A sold twice as much as Product B"

Key Insight: Most business metrics (revenue, sales, users, clicks) are ratio data, the most flexible type for analysis.


Discrete vs Continuous Data

Another important distinction within quantitative data:

| Type | Definition | Examples | Typical Storage |
|------|-----------|----------|----------------|
| Discrete | Countable, whole numbers | Number of customers, items sold, app downloads | Integer |
| Continuous | Measurable, can take any value within a range | Temperature, weight, time spent on page | Float/Decimal |

Example:

  • You can have 5 customers or 6 customers, but not 5.5 customers → Discrete
  • A page load time can be 2.341 seconds → Continuous

Common Data Types in Programming

When you work with data in Excel, SQL, or Python, you'll encounter these data types:

| Type | Description | Examples | Used For |
|------|-------------|----------|----------|
| String/Text | Alphanumeric characters | "John Doe", "Mumbai" | Names, addresses, descriptions |
| Integer | Whole numbers | 25, 1000, -5 | Counts, IDs, quantities |
| Float/Decimal | Numbers with decimals | 3.14, 99.99, -0.5 | Prices, percentages, measurements |
| Boolean | True/False | TRUE, FALSE, 1, 0 | Yes/No questions, flags |
| Date/Time | Timestamps | "2026-03-21", "14:30:00" | Event tracking, time series |
| NULL/NA | Missing value | NULL, NaN, NA | Indicates no data available |
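As a rough illustration of how these types appear in Python specifically, the sketch below builds one record and prints the type of each value. The field names and values are made up for the example; Excel and SQL use their own type names.

```python
from datetime import date

# Hypothetical record covering each type from the table above.
record = {
    "name": "John Doe",           # String/Text
    "age": 25,                    # Integer
    "price": 99.99,               # Float
    "active": True,               # Boolean
    "signup": date(2026, 3, 21),  # Date
    "referrer": None,             # Missing value
}

for field, value in record.items():
    print(f"{field:10} {type(value).__name__}")

type_names = [type(value).__name__ for value in record.values()]
```

In Python, a missing value is usually `None`; `NaN` is a special float value that libraries such as pandas use for missing numbers.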


Data Structures: How Data is Organized

1. Tables (Most Common)

  • Rows = records/observations
  • Columns = variables/fields

Example: Sales Table

| Order_ID | Customer | Product | Quantity | Price |
|----------|----------|---------|----------|-------|
| 1001 | Rahul | Laptop | 1 | ₹45000 |
| 1002 | Priya | Mouse | 2 | ₹500 |

✅ Best for: Structured data, databases, spreadsheets
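In plain Python, a small table like the sales example can be sketched as a list of dictionaries: each dictionary is a row and each key is a column. The code below assumes that Price is a per-unit price, which the table does not state.

```python
# Each dict is one row (record); each key is one column (field).
sales = [
    {"order_id": 1001, "customer": "Rahul", "product": "Laptop", "quantity": 1, "price": 45000},
    {"order_id": 1002, "customer": "Priya", "product": "Mouse", "quantity": 2, "price": 500},
]

# Reading one column across all rows.
customers = [row["customer"] for row in sales]

# Deriving a new value per row (assumes price is per unit).
order_totals = {row["order_id"]: row["quantity"] * row["price"] for row in sales}

print(customers)     # ['Rahul', 'Priya']
print(order_totals)  # {1001: 45000, 1002: 1000}
```

For larger tables, a spreadsheet, a SQL database, or a dataframe library is usually a better fit than hand-built lists.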


2. Lists/Arrays

Ordered collection of items.

Example:

```python
monthly_revenue = [120000, 135000, 142000, 150000]
```

✅ Best for: Time series, sequences, single-variable data


3. Key-Value Pairs (Dictionaries/JSON)

Data stored as name-value pairs.

Example:

```json
{
  "customer_id": 101,
  "name": "Amit Kumar",
  "location": "Delhi",
  "active": true
}
```

✅ Best for: APIs, semi-structured data, nested information
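A minimal sketch of reading such a JSON record in Python with the standard `json` module. The `email` field is hypothetical and is deliberately absent from the record, to show how a missing key can be handled.

```python
import json

raw = '{"customer_id": 101, "name": "Amit Kumar", "location": "Delhi", "active": true}'

# json.loads turns JSON text into a Python dict,
# converting JSON true/false into Python True/False.
customer = json.loads(raw)

print(customer["name"])                   # Amit Kumar
print(type(customer["active"]).__name__)  # bool

# Semi-structured records may lack fields; .get returns None instead of failing.
email = customer.get("email")
print(email)                              # None
```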


4. Time Series

Data points indexed by time.

Example:

  • 2026-01-01: 1200 visitors
  • 2026-01-02: 1350 visitors
  • 2026-01-03: 1280 visitors

✅ Best for: Trends, forecasting, stock prices, website traffic
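The visitor counts above can be sketched as a dictionary keyed by date, which makes a simple trend calculation such as day-over-day change straightforward. This is one possible representation, not the only one.

```python
from datetime import date

# Visitor counts indexed by day (values from the example above).
visitors = {
    date(2026, 1, 1): 1200,
    date(2026, 1, 2): 1350,
    date(2026, 1, 3): 1280,
}

# Sort the dates so that consecutive entries are consecutive days.
days = sorted(visitors)

# Pair each day with the previous one and compute the change.
changes = {
    today: visitors[today] - visitors[yesterday]
    for yesterday, today in zip(days, days[1:])
}

for day, change in changes.items():
    print(f"{day.isoformat()}: {change:+d}")
# 2026-01-02: +150
# 2026-01-03: -70
```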


How to Choose the Right Data Type

| If your data represents... | Use this type | Storage format |
|----------------------------|---------------|----------------|
| Names, addresses, categories | String/Text | VARCHAR, TEXT |
| Counts (users, sales, clicks) | Integer | INT, BIGINT |
| Money, percentages, measurements | Decimal/Float | DECIMAL, FLOAT |
| Yes/No, True/False | Boolean | BOOLEAN, BIT |
| Dates and times | Date/DateTime | DATE, TIMESTAMP |
| Ordered categories | Ordinal (store as integer with labels) | TINYINT + mapping |
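The table lists both decimal and float types for money and measurements. One consideration when choosing between them, sketched here in Python, is that binary floats cannot represent many decimal fractions exactly, whereas a decimal type can. The amounts are arbitrary examples.

```python
from decimal import Decimal

# Binary floating point: small representation errors can appear.
float_total = 0.1 + 0.2
print(float_total)         # 0.30000000000000004
print(float_total == 0.3)  # False

# Decimal arithmetic: exact for decimal fractions such as currency amounts.
decimal_total = Decimal("0.10") + Decimal("0.20")
print(decimal_total)                     # 0.30
print(decimal_total == Decimal("0.30"))  # True
```

For measurements where tiny rounding differences do not matter, floats are usually fine; for money, an exact decimal type is the safer default.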


Real-World Example: E-commerce Dataset

Let's classify each column:

| Column | Example Value | Data Type | Measurement Scale |
|--------|--------------|-----------|------------------|
| Order ID | "ORD-1001" | String | Nominal |
| Customer Name | "Sneha Reddy" | String | Nominal |
| Product Category | "Electronics" | String | Nominal |
| Customer Rating | "4 stars" | Integer/Ordinal | Ordinal |
| Order Date | "2026-03-15" | Date | Interval |
| Quantity | 3 | Integer | Ratio (discrete) |
| Price | ₹2499.99 | Decimal | Ratio (continuous) |
| Discount % | 15.5 | Decimal | Ratio (continuous) |
| Delivery Status | "Delivered" | String | Nominal |


Common Mistakes to Avoid

โŒ Mistake 1: Treating Ordinal Data as Ratio

Wrong: Calculating average of ratings (1-5 stars) Why it's wrong: The gap between 1 and 2 stars isn't necessarily equal to the gap between 4 and 5 stars. Better approach: Use median or mode.

โŒ Mistake 2: Storing Numbers as Text

Example: Storing phone numbers or zip codes as strings when you need to do calculations. Fix: If you need to sum, average, or compare, store as numbers.
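A small sketch of what goes wrong when numeric values are kept as text, using made-up order amounts. It also shows the opposite risk: converting an identifier-like value to a number drops its leading zero.

```python
# Hypothetical order amounts that arrived as text.
amounts_as_text = ["450", "1200", "75"]

# Text sorts character by character, not by numeric value.
text_order = sorted(amounts_as_text)
print(text_order)     # ['1200', '450', '75']

# After conversion, sorting and arithmetic behave as expected.
amounts = [int(value) for value in amounts_as_text]
numeric_order = sorted(amounts)
total = sum(amounts)
print(numeric_order)  # [75, 450, 1200]
print(total)          # 1725

# An identifier with a leading zero loses information as a number.
code_as_text = "01234"
code_as_number = int(code_as_text)
print(code_as_number)  # 1234
```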

โŒ Mistake 3: Ignoring Data Types in SQL/Python

Problem: Mixing data types causes errors. Example: "100" + 50 in Python gives an error (string + integer). Fix: Convert types explicitly: int("100") + 50 = 150
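The error and the fix can be sketched as follows. The helper `to_int` is a hypothetical convenience function written for this example, not a built-in; it returns `None` when a value cannot be converted, so that bad input does not stop the whole calculation.

```python
def to_int(value):
    """Return value as an int, or None if it cannot be converted."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None

# Mixing a string and an integer raises TypeError.
try:
    "100" + 50
    mixing_failed = False
except TypeError:
    mixing_failed = True

print(mixing_failed)   # True

# Explicit conversion makes the addition valid.
total = to_int("100") + 50
print(total)           # 150

# Values that are not valid integers come back as None.
print(to_int("abc"))   # None
print(to_int(None))    # None
```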


Summary

✅ Qualitative vs Quantitative: Descriptive categories vs measurable numbers
✅ Structured vs Unstructured: Organized tables vs free-form text/media
✅ Four measurement scales: Nominal → Ordinal → Interval → Ratio (increasing flexibility)
✅ Discrete vs Continuous: Countable whole numbers vs measurable decimals
✅ Common data types: String, Integer, Float, Boolean, Date, NULL
✅ Data structures: Tables, lists, key-value pairs, time series

Next Topic: Excel for Data Analysts (Core Functions)

Now that you understand data types, let's learn how to work with them in Excel! 🚀