# Data Classification

## Learning Objectives

Once data are collected, they need to be classified and tabulated properly so that readers can understand the information easily. This chapter explains the different ways of classifying data, the construction of frequency distributions, measurement scales, and the different forms of tabular presentation.

After studying this chapter, the reader should be able to:

• Classify data according to the nature of the data.
• Have an idea about measurement and measurement scales.
• Arrange raw data in an array, and then classify data to construct a frequency table and cumulative frequency distribution table.
• Transform frequency tables into relative frequency and percentage distribution.
• Understand the main parts of a statistical table.
• Ascertain which type of table would be most appropriate for a given set of data.

## Introduction

Data collected from primary sources are in raw form and have not gone through any statistical treatment. These unwieldy, ungrouped and shapeless masses of collected data are not easy to handle and are not capable of interpretation. In order to make the data easily understandable and capable of interpretation, the first task is to condense and simplify them in such a way that irrelevant details are eliminated. The appropriate procedure for this is the classification and tabulation of data.

## Need and Meaning

The collected data, also known as raw data or ungrouped data, are always in an unorganized form and need to be organized and presented in a meaningful and readily comprehensible form in order to facilitate further statistical analysis. It is, therefore, essential for an investigator to condense a mass of data into a more comprehensible form that can be readily assimilated. The process of grouping data into different classes or subclasses according to some characteristic is known as classification, whereas tabulation is concerned with the systematic arrangement and presentation of the classified data. Thus, classification is the first step in tabulation.

For example, letters in the post office are classified according to their destinations, viz. Pokhara, Biratnagar, Surkhet, Humla, etc. Classification of statistical data is comparable to this sorting operation. As another example, consider the students who graduated (completed BBS) from Tribhuvan University, Faculty of Management, in a certain time period. If one is interested in finding out the caste-wise distribution of management graduates, one may look into each and every record and note which caste it relates to, such as Shrestha, Rai, Brahmin, Chhetry, Gurung, Tharu, etc. Finally, one obtains the distribution of graduates according to the classification, i.e. according to their castes.
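The post-office example can be sketched in a few lines of Python. This is only an illustration: the list `letters` is hypothetical data, and `Counter` from the standard library is one of several ways to group and count items.

```python
from collections import Counter

# Hypothetical raw data: the destination written on each letter.
letters = [
    "Pokhara", "Humla", "Biratnagar", "Pokhara",
    "Surkhet", "Pokhara", "Biratnagar",
]

# Classification: group the letters into classes by destination
# and count how many fall into each class.
by_destination = Counter(letters)

# Tabulation: present the classified data in a systematic arrangement.
for destination, count in sorted(by_destination.items()):
    print(f"{destination:<12}{count}")
```

The raw list is hard to read at a glance, while the classified counts show the distribution of letters by destination directly.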

## Objectives

The main objectives of classifying data are as follows:

1. It condenses the mass of data into an easily comprehensible form that can be readily assimilated.
2. It eliminates unnecessary details.
3. It facilitates comparison and highlights the significant aspects of the data.
4. It enables one to get a mental picture of the information and helps in drawing inferences.
5. It helps in the further statistical analysis of the information collected.

## Classification Procedure

There are four important types of classification:

• Geographical classification
• Chronological classification
• Qualitative classification
• Quantitative classification

### Geographical classification

In this classification the data are classified according to place, area, region etc. For example,

Population Density in Different Regions of Nepal

| Region | Population Density (per sq. km.) |
| --- | --- |
| Eastern | 188 |
| Central | 293 |
| Western | 155 |
| Mid-western | 71 |
| Far-western | 112 |
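A minimal sketch in Python of how geographically classified data might be stored and ranked. It assumes that the five density figures pair with the five regions in the order listed; the variable names are illustrative only.

```python
# Population density (persons per sq. km.) by region, using the figures
# given above and assuming they pair with the regions in the order listed.
density_by_region = {
    "Eastern": 188,
    "Central": 293,
    "Western": 155,
    "Mid-western": 71,
    "Far-western": 112,
}

# Rank the regions from most to least densely populated.
ranked = sorted(density_by_region.items(), key=lambda item: item[1], reverse=True)

for region, density in ranked:
    print(f"{region:<12}{density:>4}")
```

Here the class is the region, so each observation is identified by place rather than by time or by an attribute.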

### Chronological classification

In this type of classification, the data are classified according to time. Time series data are good examples of chronological classification. For example,

Population Growth Rate of Nepal

| Year | Growth Rate (in Percentage) |
| --- | --- |
| 2011 | 1.42 |
| 2001 | 2.24 |
| 1991 | 2.10 |
| 1981 | 2.66 |
| 1971 | 2.07 |
| 1961 | 1.65 |
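As a hedged illustration, the same time series can be arranged in chronological order with Python. The sketch assumes that the six growth rates pair with the six years in the order listed; the variable names are illustrative only.

```python
# Growth rate (in percent) by census year, using the figures given above
# and assuming they pair with the years in the order listed.
growth_rate_by_year = {
    2011: 1.42,
    2001: 2.24,
    1991: 2.10,
    1981: 2.66,
    1971: 2.07,
    1961: 1.65,
}

# Chronological classification: arrange the observations by time.
chronological = sorted(growth_rate_by_year.items())

for year, rate in chronological:
    print(f"{year}  {rate:.2f}")

# The year with the highest recorded growth rate.
peak_year = max(growth_rate_by_year, key=growth_rate_by_year.get)
```

Sorting by year makes the rise and fall of the growth rate over successive censuses easier to follow.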

### Qualitative classification

Classification is said to be qualitative when the data are classified on the basis of some attribute, quality or descriptive characteristic that cannot be described numerically. Such data are known as categorical data or qualitative data. Examples include sex, nationality, honesty, eye color, religion, etc.

A classification based on a single attribute that divides the data into two subdivisions is known as simple or ‘two fold’ classification. If more than one attribute is to be studied simultaneously, the data are divided into a number of classes, and this is known as ‘manifold’ classification.
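The difference between the two can be sketched in Python. The records below are hypothetical, and sex and employment status are used only as illustrative attributes.

```python
from collections import Counter

# Hypothetical records: (sex, employment status) for eight persons.
records = [
    ("Male", "Employed"), ("Female", "Employed"),
    ("Male", "Unemployed"), ("Female", "Employed"),
    ("Female", "Unemployed"), ("Male", "Employed"),
    ("Male", "Employed"), ("Female", "Employed"),
]

# Simple (two-fold) classification: one attribute, two classes.
simple = Counter(sex for sex, _ in records)

# Manifold classification: two attributes studied simultaneously,
# so each class is a combination of sex and employment status.
manifold = Counter(records)

print(dict(simple))
for (sex, status), count in sorted(manifold.items()):
    print(f"{sex:<8}{status:<12}{count}")
```

With one attribute there are two classes; adding the second attribute splits each of them further, giving four classes in this sketch.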

The following chart would be useful to depict simple and manifold classification.

### Quantitative classification

Classification is said to be quantitative when the data are expressed numerically. Such data are known as numerical data or quantitative data. Height, weight, age, profit, turnover, income, number of deaths, etc. are some examples of this type of data.

### Variable

Any quantitative characteristic under study is known as a variable. Basically, there are two types of variables.

i. Discrete variable: A variable is said to be discrete if it takes only countably many values (typically whole numbers). For example: number of buses, number of persons, family size, etc.

ii. Continuous variable: A variable is said to be continuous if it can take all possible real values (whole numbers as well as fractional values) within a certain range. For example: heights, weights, temperature records, marks obtained by students, etc.
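The distinction affects how quantitative data are usually classified. The following Python sketch rests on assumptions: both data sets are hypothetical, the helper `class_interval` is introduced only for illustration, and the class width of 10 cm is an arbitrary choice. Discrete values are tallied one by one, while continuous values are grouped into class intervals.

```python
from collections import Counter

# Hypothetical discrete data: family size of ten households.
family_sizes = [4, 5, 3, 4, 6, 4, 5, 3, 4, 5]

# A discrete variable takes isolated values, so each value can be its own class.
size_frequency = Counter(family_sizes)

# Hypothetical continuous data: heights (in cm) of eight students.
heights = [152.4, 160.1, 158.7, 171.3, 165.0, 149.9, 168.2, 155.5]


def class_interval(value, width=10):
    """Return the (lower, upper) limits of the class containing value.

    The lower limit is inclusive and the upper limit is exclusive.
    """
    lower = int(value // width) * width
    return (lower, lower + width)


# A continuous variable can take any value in a range,
# so the values are grouped into class intervals instead.
height_frequency = Counter(class_interval(h) for h in heights)

for size, count in sorted(size_frequency.items()):
    print(f"family size {size}: {count}")
for (lower, upper), count in sorted(height_frequency.items()):
    print(f"{lower}-{upper} cm: {count}")
```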