Organisation of Data | Class 11 Economics

Classification of Data and Variables

Once data are collected, they arrive as a confusing heap of raw figures. The next stage of a statistical study is to bring order to this heap — this is the organisation of data. The first step is classification: arranging the data into groups or classes according to their common characteristics, so that comparison and analysis become easy.

Data can be classified on four bases:

Geographical — by place or region (e.g. population of different states).
Chronological — by time (e.g. India's GDP year by year).
Qualitative — by a quality or attribute that cannot be measured in numbers (e.g. people grouped by gender, literacy or religion).
Quantitative — by a characteristic that can be measured in numbers (e.g. height, weight, income, marks).

A characteristic that can be measured and takes different numerical values is called a variable. Variables are of two kinds:

Discrete variable — takes only whole, separate values, with jumps in between (e.g. the number of children in a family: 0, 1, 2, 3 — never 2.5).
Continuous variable — can take any value within a range, including fractions (e.g. height, weight or temperature, which can be 160.5 cm, 55.25 kg, etc.).

Whether a variable is discrete or continuous determines how its frequency table is built: discrete values can be listed one by one, while continuous values are grouped into class intervals.

Worked Example

Example 1: On what four bases can data be classified?

Solution

Data are grouped by a common characteristic.

Geographical (place), chronological (time).
Qualitative (attribute) and quantitative (measurable).

Worked Example

Example 2: Classify these variables as discrete or continuous: (a) number of cars in a street, (b) weight of students.

Solution

Can the value be a fraction?

(a) Number of cars — only whole values — discrete.
(b) Weight — can be any value like 48.6 kg — continuous.

Worked Example

Example 3: India's wheat output recorded for each year from 2010 to 2020 is classified on which basis?

Solution

It is arranged by time.

Year-by-year data are arranged chronologically.

Key Points

- Classification = arranging raw data into groups by common characteristics.
- Bases: geographical (place), chronological (time), qualitative (attribute), quantitative (measurable).
- Variable = a measurable characteristic that takes different numerical values.
- Discrete (whole values only, e.g. number of children) vs continuous (any value in a range, e.g. height).

Quick Check (2 questions)

Q1. Arranging India's yearly GDP figures is an example of ____ classification.

Explanation: Year-by-year (time-based) data are classified chronologically.

Q2. The number of children in a family is a:

Explanation: It takes only whole values, so it is a discrete variable.

Frequency Distribution and Class Intervals

When the same value occurs again and again in data, we record how many times it occurs. The number of times a value (or group of values) appears is its frequency, and a table showing values with their frequencies is a frequency distribution.
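
As a rough illustration of a frequency distribution for a discrete variable, the sketch below counts how often each value occurs using Python's `collections.Counter`. The list `children_per_family` is made-up sample data for illustration only, not taken from any survey.

```python
from collections import Counter

# Hypothetical sample: number of children in each of 10 families.
children_per_family = [2, 0, 1, 2, 3, 1, 2, 0, 2, 1]

# Counter maps each distinct value to the number of times it occurs.
frequency = Counter(children_per_family)

# Print the frequency distribution in order of value.
for value in sorted(frequency):
    print(value, frequency[value])
```

Here the value 2 occurs four times, so its frequency is 4, and the frequencies add up to the 10 families in the sample.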

For a discrete variable we can list each value and its frequency directly. But for a continuous variable, or when the data spread over a wide range, we group the values into class intervals (such as 0–10, 10–20, 20–30). Some key terms:

The two ends of a class are its class limits — the smaller is the lower limit, the larger the upper limit.
The difference between the upper and lower limit is the class size (width); for 10–20 it is 10.
The middle value of a class is its mid-point (class mark) = (lower limit + upper limit) ÷ 2; for 10–20 it is 15.
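
The two formulas above can be written as small helper functions. This is only a sketch; the function names `class_size` and `mid_point` are chosen here for illustration.

```python
def class_size(lower, upper):
    """Width of a class: upper limit minus lower limit."""
    return upper - lower


def mid_point(lower, upper):
    """Class mark: the average of the two class limits."""
    return (lower + upper) / 2


print(class_size(10, 20))  # 10
print(mid_point(10, 20))   # 15.0
```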

There are two ways to form class intervals:

Inclusive method — both limits are included in the class (e.g. 0–9, 10–19, 20–29). There is a gap between classes, so it suits discrete data.
Exclusive method — the upper limit of one class is the lower limit of the next, and the upper limit is excluded (e.g. 0–10, 10–20: a value of exactly 10 goes into 10–20). This avoids gaps and suits continuous data. An inclusive table can be converted to exclusive form to remove the gaps before drawing graphs.
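
The sketch below illustrates both ideas under stated assumptions. `exclusive_class` finds the exclusive class for a value, assuming classes of equal width that start from 0. `to_exclusive` converts inclusive classes to exclusive form using the usual adjustment of moving every limit by half the gap between neighbouring classes; the text above does not spell out this procedure, so treat it as one common convention. Both function names are hypothetical.

```python
def exclusive_class(value, width=10):
    """Return the exclusive class (lower, upper) with lower <= value < upper."""
    lower = (value // width) * width
    return (lower, lower + width)


def to_exclusive(inclusive_classes):
    """Close the gaps in inclusive classes by moving each limit half the gap."""
    # Gap between the upper limit of one class and the lower limit of the next.
    gap = inclusive_classes[1][0] - inclusive_classes[0][1]
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in inclusive_classes]


print(exclusive_class(10))  # (10, 20)
print(to_exclusive([(0, 9), (10, 19), (20, 29)]))
# [(-0.5, 9.5), (9.5, 19.5), (19.5, 29.5)]
```

After conversion the upper limit of each class equals the lower limit of the next, so no value can fall between two classes.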

Worked Example

Example 1: For the class interval 20–30, find the class size and the mid-point.

Solution

Use the formulas.

Class size = upper − lower = 30 − 20 = 10.
Mid-point = (20 + 30) ÷ 2 = 25.

Worked Example

Example 2: In the exclusive class 10–20, where does a value of exactly 10 go?

Solution

The upper limit is excluded; the lower limit is included.

In the exclusive method, 10 belongs to the class 10–20 (not to 0–10).

Worked Example

Example 3: Which method (inclusive or exclusive) is better suited to continuous data, and why?

Solution

Continuous data have no gaps.

The exclusive method, because it leaves no gap between classes.
This matches the continuous nature of the data.

Key Points

- Frequency = how many times a value occurs; frequency distribution = values with their frequencies.
- Class interval (e.g. 10–20): class size = upper − lower; mid-point = (lower + upper) ÷ 2.
- Inclusive (both limits included; gaps; suits discrete) vs exclusive (upper limit excluded; no gaps; suits continuous).

Quick Check (2 questions)

Q1. The mid-point (class mark) of the class 40–50 is:

Explanation: Mid-point = (40 + 50) ÷ 2 = 45.

Q2. In which method is the upper limit of a class excluded from it?

Explanation: The exclusive method excludes the upper limit, leaving no gaps.

Constructing a Frequency Table

Let us put it all together and build a frequency table from raw data. Suppose 20 students scored the following marks (out of 50):

12, 23, 35, 41, 9, 18, 27, 33, 45, 7, 22, 38, 16, 29, 31, 44, 11, 26, 39, 48

To organise these into a frequency distribution using the exclusive method with a class size of 10, we (1) find the range (highest − lowest = 48 − 7 = 41), (2) decide the classes (0–10, 10–20, …, 40–50), and (3) put a tally mark for each value in its class, then count the tallies to get the frequency:

Marks (class)	Tally	Frequency
0–10	||	2
10–20	||||	4
20–30	|||||	5
30–40	|||||	5
40–50	||||	4
Total		20

The total of all frequencies (2 + 4 + 5 + 5 + 4 = 20) must equal the number of observations — a quick check that nothing was missed. We have turned a messy list into a neat, readable table. The running total of frequencies down (or up) the table is called the cumulative frequency, which we use later for the median and ogive.
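
The same three steps can be sketched in Python using the 20 marks listed above. The variable names are illustrative, and the tally is printed as a plain row of strokes rather than the bundled groups of five used on paper.

```python
marks = [12, 23, 35, 41, 9, 18, 27, 33, 45, 7,
         22, 38, 16, 29, 31, 44, 11, 26, 39, 48]

# Step 1: range = highest value - lowest value.
data_range = max(marks) - min(marks)

# Step 2: exclusive classes of size 10 covering 0 to 50.
width = 10
classes = [(lower, lower + width) for lower in range(0, 50, width)]

# Step 3: count the values with lower <= mark < upper in each class.
frequencies = [sum(1 for m in marks if lower <= m < upper)
               for lower, upper in classes]

for (lower, upper), f in zip(classes, frequencies):
    print(f"{lower}-{upper}\t{'|' * f}\t{f}")

# Check: the frequencies must add up to the number of observations.
assert sum(frequencies) == len(marks)
```

The condition `lower <= m < upper` is what makes the classes exclusive: a mark equal to an upper limit is counted in the next class.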

Worked Example

Example 1: Why must the sum of all frequencies equal the number of observations?

Solution

Every observation is counted once.

Each value is tallied into exactly one class.
So adding all frequencies must give back the total number of observations (here 20).

Worked Example

Example 2: In the table above, how many students scored less than 20 marks?

Solution

Add the first two classes.

0–10 has 2 and 10–20 has 4.
2 + 4 = 6.

Worked Example

Example 3: What is cumulative frequency?

Solution

It is a running total.

Cumulative frequency is the running total of frequencies down (or up) the table.
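
As a small sketch, the running total down the marks table can be computed with `itertools.accumulate` from Python's standard library. The frequencies are those of the table above; the list names are illustrative.

```python
from itertools import accumulate

upper_limits = [10, 20, 30, 40, 50]
frequencies = [2, 4, 5, 5, 4]

# "Less than" cumulative frequency: running total down the table.
less_than = list(accumulate(frequencies))

for upper, cf in zip(upper_limits, less_than):
    print(f"less than {upper}: {cf}")
```

The second running total is 6, which agrees with Example 2 (students scoring less than 20), and the last one equals the total number of observations, 20.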

Key Points

- Build a frequency table: find the range, choose classes, put tally marks, count to get the frequency.
- Sum of frequencies must equal the number of observations (a check).
- Cumulative frequency = running total of frequencies (used for median and ogive).

Quick Check (2 questions)

Q1. In a frequency table, the sum of all frequencies must equal the:

Explanation: Every observation is tallied once, so frequencies sum to the number of observations.

Q2. The running total of frequencies down a table is the:

Explanation: The running total of frequencies is the cumulative frequency.