Definition, Importance, and Limitations of Statistics, and Data Collection, Classification, and Tabulation

Table of Contents

Section | Description
1.0 Introduction | Overview of statistics, its origins, and its relevance.
→ Latin Word “statisticum” | Origin of the word “Statistics” from the Latin word “statisticum.”
→ Italian Word “statista” | Development of the word from Italian, meaning “expert of the state.”
→ German Word “Statistik” | Use in 18th-century Germany for state data collection.
→ Modern Use of Statistics | Expanded use of statistics in science, business, and economics.
1.1 What is Data? | Defines data and explains its types: qualitative and quantitative.
→ Qualitative Data | Descriptive data related to qualities and characteristics.
→ Quantitative Data | Data expressed in numerical terms (e.g., quantity, size).
1.2 Phases of Statistical Analysis | Four main phases of statistical analysis: collection, classification, analysis, and interpretation of data.
→ Collection of Data | Gathering information from primary or secondary sources.
→ Classification and Tabulation of Data | Organizing data into categories or tables for easier analysis.
→ Analysis of Data | Studying the organized data to identify patterns or trends.
→ Interpretation of Data | Drawing conclusions based on the analyzed data.
→ Ancient Example: Ancient Egypt Census | Censuses conducted by Pharaohs in ancient Egypt.
→ Modern Example: Census of India 2011 | The large-scale census conducted in India in 2011.
1.3 Importance of Statistics | Explains the key role of statistics in various fields such as business, economics, healthcare, and more.
→ Business and Economics | Importance of statistics in business decision-making and economic predictions.
→ Medical Use | Importance of statistics in making medical decisions based on patient data.
→ Weather Forecast | Use of past weather data to predict future conditions.
→ Stock Market | Application of statistics to predict stock price trends.
→ Banks | Statistical methods to assess loan approvals and credit risk.
1.4 Functions of Statistics | Describes how statistics present facts, simplify data, enable comparisons, and aid in decision-making.
→ Presenting Facts | Using charts and graphs to make data easier to understand.
→ Simplifying Complex Data | Summarizing large sets of data into understandable averages or percentages.
→ Enabling Comparisons | Comparing different groups or categories using statistical data.
→ Studying Relationships | Identifying correlations between variables (e.g., exercise and weight loss).
→ Policy Formulation | Assisting governments and organizations in forming policies based on data.
→ Forecasting Outcomes | Predicting future outcomes using past data trends.
1.5 Limitations or Demerits of Statistics | Explains the limitations of statistics, such as exclusion of qualitative data and reliance on averages.
→ Statistics Do Not Deal with Individuals | Statistics focus on groups and general trends, not individual cases.
→ Qualitative Data is Excluded | Statistics cannot measure non-numeric data like emotions.
→ Results Based on Averages | Averages can sometimes hide extreme cases or outliers.
→ Bias in Results | Biased sampling can lead to inaccurate conclusions.
1.6 Definitions | Defines key statistical terms such as population, sample, variates, attributes, and parameters.
→ Population | The entire group being studied in a statistical analysis.
→ Population and Sample | Difference between studying an entire population versus a smaller representative sample.
→ Variates and Attributes | Variates are measurable characteristics, while attributes are qualities that can’t be numerically measured.
→ Discrete and Continuous Variables | Discrete variables are countable, while continuous variables can take any value within a range.
→ Parameter and Statistic | A parameter describes a population, while a statistic describes a sample.
→ Primary Data | Data collected firsthand by the researcher for a specific purpose.
→ Secondary Data | Data collected by someone else that is reused for research.
→ Census vs. Sample Survey | Census involves collecting data from the entire population, while a sample survey collects data from a smaller group.
→ Accuracy vs. Ease: Primary vs. Secondary Data | Primary data is more accurate but time-consuming, while secondary data is quicker to obtain but may be less specific.
1.7 Classification and Tabulation | Methods of sorting data into categories to make analysis easier.
→ Qualitative Base | Classifying data based on qualities that cannot be measured (e.g., religion, literacy level).
→ Quantitative Base | Sorting data based on measurable characteristics (e.g., age, height, marks).
→ Geographical Base | Sorting data by location (e.g., country, state, or region).
→ Chronological Base | Sorting data by time (e.g., years, months, or days).
→ Types of Classification (One-way, Two-way, Multi-way) | Classifying data based on one or more characteristics (e.g., age and gender).
1.8 Frequency Distribution | Describes how to group data into intervals and calculate frequency distribution.
→ Class Intervals | Grouping data into ranges (e.g., 10-20, 20-30).
→ Class Limits | The boundaries of class intervals (e.g., lower and upper limits).
→ Class Length | The difference between the upper and lower class limits.
→ Mid-value (Class Mark) | The midpoint of the class interval.
→ Types of Class Intervals (Inclusive, Exclusive) | Inclusive intervals include both limits, while exclusive intervals exclude the upper limit.
→ Class Boundaries | Adjusting class limits to ensure no gaps between intervals.
→ Open-end Class Interval | Intervals where the lower or upper limit is undefined.
→ Relative Frequency | Proportion of the total frequency represented by a class.
→ Percentage Frequency | Relative frequency expressed as a percentage.
→ Explanation of Frequency Density | Used to calculate how densely data is distributed across unequal class intervals.
→ Discrete Frequency Distribution | For countable data (e.g., number of children in a family).
→ Continuous Frequency Distribution | For data that falls within ranges (e.g., exam scores).
→ Cumulative Frequency Distribution | Running total: the current frequency plus all previous frequencies.

1.0 Introduction

Statistics is the process of collecting numbers or figures related to an area of interest, analyzing them, and making decisions based on the results.

In simple terms, statistics turns the data gathered in a specific field into decisions.

1. Latin Word “statisticum”:
The word “Statistics” originates from the Latin word “statisticum”, which means “related to the state” or “pertaining to state affairs.”

2. Italian Word “statista”:
The word then passed through the Italian “statista”, which means “expert of the state” or “a person knowledgeable in political affairs.”

3. German Word “Statistik”:
In 18th-century Germany, the word “Statistik” was used to mean “the collection and analysis of data for matters of the state.”

All three terms tie numbers and figures to the affairs of a state. More broadly, statistics came to mean “a group of numbers or figures that represent some information of human interest.”

4. Modern Use:
Gradually, the use of the word “Statistics” expanded beyond just state or government matters and began to be used in every field where data collection, analysis, and interpretation occur. Today, statistics are used in science, business, economics, and many other fields.

Did you know? The word ‘Statistics’ was first used in 1749 by a German scholar, Gottfried Achenwall. He defined it as the “political science of many countries”.

1.1 What is Data?

“Data” means information, and it comes in two main types – Qualitative and Quantitative.

  • Qualitative Data: This is descriptive and relates to qualities or characteristics.
    Example: When baking a cake, if you describe the cake as “sweet” or “salty”, this is qualitative data.
  • Quantitative Data: This is numerical and deals with numbers.
    Example: If you count how many cakes you baked, like saying you baked 10 cakes, that’s quantitative data.
Example: Imagine you’re selling cakes. You note that the flavor “chocolate” is the most popular (Qualitative data). You also record that you sold 50 chocolate cakes (Quantitative data).

1.2 Phases of Statistical Analysis

Statistical analysis has four main phases:

  • 1. Collection of Data: This is the first stage, where you gather information from different sources, which can be primary (original data) or secondary (data collected by others).
    Example: You record how many cakes you sold each day for the past month.
  • 2. Classification and Tabulation of Data: Once you collect the data, you organize it so it’s easy to read and analyze. You can group the data into categories or classes.
    Example: You make a table showing how many cakes of each flavor (chocolate, vanilla, strawberry) were sold on each day of the week.
  • 3. Analysis of Data: In this step, you study the organized data using formulas and methods to find patterns or trends.
    Example: You analyze the table to see which days chocolate cakes were sold the most.
  • 4. Interpretation of Data: After analyzing, you draw conclusions based on the findings.
    Example: You conclude that chocolate cakes sell best on weekends, so you decide to bake more chocolate cakes on Fridays to prepare for the weekend.
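
The four phases above can be sketched in a few lines of Python. This is only an illustration: the sales records below are invented, and the names `sales`, `table`, and `best_day` are assumptions made for the example.

```python
from collections import Counter

# Phase 1: collection. Each record is (day, flavor) for one cake sold.
# These records are invented for illustration only.
sales = [
    ("Fri", "chocolate"), ("Sat", "chocolate"), ("Sat", "chocolate"),
    ("Sat", "vanilla"), ("Sun", "chocolate"), ("Sun", "strawberry"),
    ("Mon", "vanilla"), ("Tue", "chocolate"),
]

# Phase 2: classification and tabulation. Count cakes per (day, flavor).
table = Counter(sales)

# Phase 3: analysis. Find the day on which chocolate cakes sold the most.
chocolate_by_day = {
    day: count for (day, flavor), count in table.items() if flavor == "chocolate"
}
best_day = max(chocolate_by_day, key=chocolate_by_day.get)

# Phase 4: interpretation. Turn the finding into a decision.
print(f"Chocolate sells best on {best_day}; bake extra the day before.")
```

With real records, only the `sales` list would change; the tabulation and analysis steps stay the same.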

Ancient and Modern Census Examples

Ancient Example: Ancient Egypt Census

In ancient Egypt, the Pharaohs regularly conducted censuses to keep records of the population and wealth of their kingdom. Counts of this kind are commonly dated as far back as the third millennium BCE. Their purpose was to track the number of people living in the state, their occupations, and their assets.

This information was used by the Pharaoh to collect taxes and manage labor resources, such as determining the number of workers needed for the construction of the pyramids.


Modern Example: Census of India 2011

In 2011, the Government of India conducted one of the largest and most complex censuses in the world, gathering information on more than 1.2 billion people. This census collected data such as age, gender, education level, occupation, language, and living conditions.

The data collected in the 2011 census was used by the Indian government to formulate policies, implement plans, and drive social and economic development across the nation.

1.3 Importance of Statistics

Imagine statistics as a superpower! Statistics give you the ability to “see” patterns in data, much like how a superhero uses their special powers to save the day. Let’s take a journey through some real-world scenarios:

1.3.1 Business and Economics

Statistics are essential in business and economics because they allow decision-makers to analyze data, predict outcomes, and create strategies that lead to success. Here are the key points explained:

  • 1. In Business: The decision-maker adopts suitable policies and strategies based on information about production, sales, profit, purchases, finance, etc.
  • Story: Imagine you’re the CEO of a clothing company, trying to decide how many shirts to produce next season. You analyze past production data, sales trends, and profits to find the optimal number of shirts to produce. By using statistical data, you avoid overproducing or underproducing, which helps you maximize profit.

    Example: A clothing company looks at last year’s winter sales to decide how many jackets to make this year, ensuring they don’t have leftover inventory or missed opportunities.

  • 2. Time Series Analysis: The businessman can predict the effect of a large number of variables with a fair degree of accuracy.
  • Story: Picture yourself as a retail store manager. You want to predict how holiday sales will perform based on previous years’ data. Using time series analysis, you analyze sales over the past 10 holiday seasons, adjusting for different economic conditions. This helps you estimate future sales and make better decisions on inventory and staffing.

    Example: A retail manager uses data from the past 10 years to predict holiday sales, ensuring that inventory levels and staffing are adequate for the upcoming holiday rush.

  • 3. Bayesian Decision Theory: The businessman selects the optimal decision by identifying the payoff for each alternative course of action.
  • Story: Imagine you’re a tech company deciding whether to launch a new product. You have three options: delay the launch for more testing, release it now, or cancel it altogether. Using Bayesian Decision Theory, you estimate the probability of success and the expected payoff for each option. This statistical approach helps you choose the course of action with the highest expected payoff, based on available data.

    Example: A tech company uses Bayesian Decision Theory to evaluate the likelihood of success for a new smartphone launch, considering different scenarios like releasing now or delaying it to avoid potential software issues.

  • 4. In Economics: Statistics analyze demand, cost, price, and other economic factors like elasticity of demand and consumer satisfaction.
  • Story: Picture yourself as an economist working for a car manufacturing company. You need to understand how changing the price of cars will affect consumer demand. By using data on income, price elasticity of demand, and spending patterns, you predict that a small price reduction will lead to a large increase in sales. This helps the company set the right price to maximize both revenue and consumer satisfaction.

    Example: An economist analyzes how the price of electric cars impacts consumer demand and recommends a price cut to increase market share without significantly reducing profit margins.
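
The product-launch decision from the Bayesian Decision Theory point above can be sketched as an expected-payoff comparison. All probabilities and payoffs here are hypothetical, and the sketch covers only the payoff comparison; it does not show how the probabilities would be updated as new evidence arrives.

```python
# Hypothetical probabilities and payoffs (in arbitrary money units).
# Each option maps to a list of (probability, payoff) outcomes.
options = {
    "release now": [(0.6, 100), (0.4, -50)],
    "delay for testing": [(0.8, 80), (0.2, -20)],
    "cancel": [(1.0, 0)],
}


def expected_payoff(outcomes):
    """Return the probability-weighted average payoff of one option."""
    return sum(probability * payoff for probability, payoff in outcomes)


# Compute the expected payoff of every option and pick the largest.
expected = {name: expected_payoff(outcomes) for name, outcomes in options.items()}
best_option = max(expected, key=expected.get)

for name, value in expected.items():
    print(f"{name}: expected payoff {value:.1f}")
print("Chosen option:", best_option)
```

With these made-up numbers, delaying the launch comes out ahead because its lower risk of failure outweighs its smaller best-case payoff.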

1.3.2 Medical

Story: Picture yourself as a doctor treating patients in a hospital. You’re trying to determine which painkiller works best. You gather data from past patients using different medications and compare their recovery times. With statistics, you can clearly see which medicine helps patients recover faster and make a better decision.

Example: A doctor analyzing patients’ recovery times finds that a new medicine works faster than existing ones, leading to more informed treatment choices.

1.3.3 Weather Forecast

Story: Imagine you’re a meteorologist trying to predict if it will rain during your weekend picnic. You analyze the weather from past years and notice a pattern: it often rains in the third week of September. Using this information, you predict the rain and move your picnic to the following weekend, avoiding a soggy sandwich.

Example: A forecaster uses past data to predict rainfall in September, allowing someone to avoid scheduling an outdoor event during a likely rainstorm.

1.3.4 Stock Market

Story: You are an investor, and you’re trying to decide when to buy stocks in a rising tech company. By analyzing stock price trends from the past, you notice that the company’s stock tends to rise right before new product launches. You buy the stock just before the next release, making a smart investment based on statistics.

Example: An investor buys shares in a tech company after identifying that the stock typically rises before product launches, making profits based on statistical analysis.

1.3.5 Bank

Story: As a bank manager, you need to decide who gets approved for a loan. You analyze the credit scores and income data of all applicants. By using statistics, you identify who is most likely to repay the loan on time, helping you make a fair decision.

Example: A bank manager uses credit scores and income data to approve loans, ensuring the bank takes fewer risks based on solid statistical analysis.

1.4 Functions of Statistics

  • 1. Presenting Facts: Statistics make facts easier to understand using visual aids like charts and graphs.
  • Story: You are a journalist reporting on the number of people attending a concert. Instead of writing down huge numbers in your article, you create a colorful pie chart to show how many people came from each age group. This makes your report easier to understand for your readers.

    Example: A journalist uses pie charts to show audience age groups at a concert, making the data visually appealing and easy to understand.

  • 2. Simplifying Complex Data: Statistics help simplify large data sets by summarizing them into averages or percentages.
  • Story: Imagine you’re an ecologist studying the height of trees in a forest. Instead of measuring every single tree, you take a sample and find the average height. If the sample is representative, this average gives you a simple and reasonably accurate picture of the entire forest.

    Example: An ecologist studies a sample of trees to find the average height, rather than measuring every single tree in the forest.

  • 3. Enabling Comparisons: Statistics enable comparisons between different groups or categories.
  • Story: Picture yourself as a school principal comparing the test scores of two classrooms. With statistics, you calculate the average score for each class, and you can easily see which class performed better. This helps you decide where more attention is needed.

    Example: A school principal compares average scores of two classes, identifying which class performed better and where to improve teaching methods.

  • 4. Studying Relationships: Statistics study the relationships between variables, such as exercise and weight loss.
  • Story: You’re a fitness trainer, and you want to know if there’s a connection between how many hours someone exercises per week and how much weight they lose. By analyzing data from your clients, you discover that those who exercise more tend to lose more weight, showing a clear relationship between the two.

    Example: A fitness trainer tracks client exercise hours and weight loss, finding a strong relationship between increased workout hours and higher weight loss.

  • 5. Policy Formulation: Statistics help in policy-making by analyzing population data and other relevant metrics.
  • Story: Imagine you work for the government, and you need to decide where to build a new school. By looking at population data, you find out that one part of the city has more children. Using this data, you decide to build the school there.

    Example: A government uses population data to determine the location of a new school, ensuring it is built where the need is highest.

  • 6. Forecasting Outcomes: Statistics predict future outcomes based on past data.
  • Story: You run an online store, and you’re planning your stock for the holiday season. By analyzing past sales data, you predict that you’ll sell more of a particular toy this year. You order extra stock to meet the expected demand.

    Example: An online retailer predicts demand for a toy during the holidays based on last year’s data and adjusts inventory accordingly.
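
The "Studying Relationships" function above can be made concrete with a correlation coefficient. The sketch below computes Pearson's r by hand for the fitness trainer's data; the client figures are invented, and `pearson_r` is a helper written for this example.

```python
from math import sqrt

# Hypothetical client records: weekly exercise hours and weight lost (kg).
hours = [1, 2, 3, 4, 5, 6]
kg_lost = [0.5, 1.0, 1.2, 2.1, 2.4, 3.0]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r = pearson_r(hours, kg_lost)
print(f"Correlation between exercise hours and weight loss: {r:.2f}")
```

A value of r close to +1 means the two quantities rise together. It does not by itself prove that one causes the other.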

1.5 Limitations or Demerits of Statistics

  • 1. Statistics Do Not Deal with Individuals: Statistics deal with groups and averages, not individual cases.
  • Story: You are a school counselor, and you want to know the average happiness level of the students in the school. The average happiness score might be 7 out of 10, but this doesn’t tell you that one student is very unhappy while another is extremely happy. Statistics show trends, not individual stories.

    Example: A school counselor uses an average happiness score but misses the extreme experiences of individual students, highlighting the limitation of relying on averages alone.

  • 2. Qualitative Data is Excluded: Statistics focus on numbers and exclude qualitative factors like emotions or feelings.
  • Story: Imagine you are a movie director trying to measure how much the audience loved your latest film. You could look at ticket sales (quantitative data), but this won’t tell you how they felt about the movie. You need qualitative feedback to understand their emotions.

    Example: A director measures ticket sales for their film but doesn’t capture audience emotions, revealing how statistics miss qualitative insights.

  • 3. Results Based on Average: Statistics rely on averages, which can hide extreme cases or outliers.
  • Story: You are the coach of a basketball team, and you calculate the average height of your players. The average might be 6 feet, but this doesn’t tell you that one player is 5 feet tall and another is 7 feet tall. The average hides the extreme differences.

    Example: A basketball coach uses the average team height, but that hides the fact that some players are much shorter or taller than the average.

  • 4. Bias in Results: Statistics can be biased if the sample data isn’t representative of the entire population.
  • Story: You’re a researcher conducting a survey about a new app, but you only ask people aged 18-25 for their opinion. Since older age groups aren’t represented, your results are biased, and you won’t get the full picture.

    Example: A researcher surveys only young people about a new app, leading to biased results that don’t reflect the opinions of older users.
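
The "Results Based on Average" limitation above is easy to see with numbers. In this small sketch the two teams are hypothetical, and `height_range` is a helper introduced only for the example.

```python
from statistics import mean

# Two hypothetical teams, heights in feet.
team_a = [5.0, 6.0, 6.0, 6.0, 7.0]  # one short and one very tall player
team_b = [6.0, 6.0, 6.0, 6.0, 6.0]  # everyone the same height


def height_range(heights):
    """Difference between the tallest and the shortest player."""
    return max(heights) - min(heights)


for name, heights in (("A", team_a), ("B", team_b)):
    print(f"Team {name}: mean = {mean(heights)}, range = {height_range(heights)}")
```

Both teams have the same average height, yet only the range shows that Team A contains extremes. Reporting a measure of spread next to the average is one way to soften this limitation.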

1.6 Definitions

  • Population: A population includes all members of a specific group under study.
  • Story: Picture yourself as a scientist studying the heights of sunflowers in a field. The population includes all the sunflowers in the field, not just a few of them. Your goal is to collect data from the entire population to get an accurate result.

    Example: A scientist collects data from all sunflowers in a field (the population) to ensure accurate results for the study.

1. Population and Sample – The Festival Planning Story

Definition: A population is the entire group you want to study, and a sample is a smaller, representative subset of the population that you study to make conclusions about the whole.

Story: Imagine you are in charge of organizing a Diwali celebration in a large village in India. The village has 10,000 people (that’s your population), and you need to figure out how many sweets to prepare for the festival. But asking each and every person how many sweets they’ll eat is impossible. Instead, you decide to ask just 200 people from different parts of the village. This group of 200 people is your sample. After gathering their answers, you find that each person will likely eat 5 sweets. Now, you use this sample to estimate that the entire village will need 50,000 sweets, using the sample to make decisions for the whole population!
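
The arithmetic in the festival story can be written out as a short sketch. The story does not list the 200 individual answers, so the ten responses below are invented and chosen to average 5 sweets per person, as in the story.

```python
POPULATION_SIZE = 10_000  # villagers, taken from the story

# Hypothetical answers from part of the sample (sweets per person).
sample_answers = [4, 6, 5, 5, 3, 7, 5, 6, 4, 5]

# Sample mean: average number of sweets per surveyed person.
sample_mean = sum(sample_answers) / len(sample_answers)

# Scale the sample mean up to the whole population.
estimated_total = sample_mean * POPULATION_SIZE

print(f"Average per person in the sample: {sample_mean}")
print(f"Estimated sweets for the village: {estimated_total:.0f}")
```

The estimate is only as good as the sample: if the people surveyed are not representative of the village, the total will be off.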

2. Variates and Attributes – The Indian School Story

Definition: Variates are characteristics that can be measured and expressed with numbers, while attributes are qualities that cannot be measured numerically.

Story: In a school in Mumbai, the principal is interested in knowing more about the students. Some things about the students, like their height, weight, and exam scores, can be measured. These are called variates. For example, one student might be 5 feet tall, while another might score 85 marks in mathematics. But then there are things about the students that cannot be measured with numbers—these are attributes. For example, one student may prefer eating pani puri, and another might have a favorite color, like blue. These preferences and qualities can’t be assigned numbers, but they’re just as important as the variates.

3. Discrete and Continuous Variables – The Indian Family Story

Definition: Discrete variables are countable numbers (like 1, 2, 3), while continuous variables can take any value within a range (like 3.5 or 7.8).

Story: An Indian family is planning a wedding and trying to decide how many people to invite. The number of guests is a discrete variable because you can only invite whole numbers of people, such as 50 or 100 guests. You can’t invite 50.5 guests! On the other hand, they are also shopping for sarees for the bride. When buying fabric, the length of the saree is a continuous variable because you can buy it in decimals, like 6.5 meters or 8.2 meters. While the number of guests is a whole number, the length of fabric can be measured in smaller, more precise values.

4. Parameter and Statistic – The Farmer’s Crop Story

Definition: A parameter is a value that represents the entire population, while a statistic is a value calculated from a sample to estimate the population parameter.

Story: In a village in Punjab, the farmers want to estimate how much wheat they will harvest this year. The parameter is the average wheat yield for all the fields in the village, but it’s difficult to measure every single field. So, the farmers choose a sample of 10 fields and measure the wheat produced in those fields. They calculate that these fields produce an average of 20 quintals of wheat per hectare. This statistic, the average yield from the sample, serves as their estimate of the average yield across all the village’s fields (the parameter). Just like this, statistics help us make educated guesses about unknown parameters!
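
To see the difference between a parameter and a statistic, the sketch below simulates a village where every field's yield is known, something the farmers in the story do not have. The yields are randomly generated and purely hypothetical, and the figure of 100 fields is an assumption.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the sketch gives the same numbers each run

# Hypothetical yields (quintals per hectare) for all 100 fields in a village.
all_fields = [random.uniform(15, 25) for _ in range(100)]

# Parameter: the true average over the whole population (usually unknown).
parameter = mean(all_fields)

# Statistic: the average over the 10 fields the farmers actually measure.
sample = random.sample(all_fields, 10)
statistic = mean(sample)

print(f"Parameter (all fields): {parameter:.2f}")
print(f"Statistic (10 fields):  {statistic:.2f}")
```

The two numbers will usually be close but not identical, and a different sample of 10 fields would give a slightly different statistic.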

5. Primary Data – The Delhi Market Story

Definition: Primary data is information collected directly by a researcher from the source for a specific purpose.

Story: In the busy lanes of Chandni Chowk in Delhi, a shopkeeper wants to know which new products his customers are interested in. Instead of guessing, he asks them directly. Every time a customer enters his shop, he gives them a questionnaire with questions like “What are your favorite products?” and “How often do you visit the market?” The shopkeeper collects this fresh information directly from his customers. This is primary data because he is gathering the data firsthand. It’s like when a detective goes out to collect clues directly from the crime scene, making sure the information is fresh and accurate!

6. Census vs. Sample Survey – The Indian Census Story

Definition: A census is when data is collected from the entire population, while a sample survey involves collecting data from a smaller group that represents the whole.

Story: Every 10 years, the Indian government conducts the Census, gathering information about every person living in the country. They ask about things like how many people live in each household, their ages, and their education levels. This process collects data from the entire population of India, which is a massive task. But sometimes, the government needs information more quickly or on a specific topic. For example, if they want to know how many people in rural areas use smartphones, they conduct a sample survey. Instead of asking everyone, they select a group of villages and ask the people there. Based on this smaller group, they make an estimate for the entire country. This is how a sample survey helps when a full census is too time-consuming!

7. Secondary Data – The Mumbai College Story

Definition: Secondary data is data that has already been collected by someone else for a different purpose, which can be reused for research.

Story: A professor at a college in Mumbai wants to research the literacy rate in India. Instead of collecting the data herself by traveling all over the country, she uses data from the Indian Census, which has already been gathered by the government. This secondary data helps her understand the literacy rates without needing to start from scratch. It’s like reading a news article about an event instead of going to the event yourself—you still get the information, but it was collected by someone else!

8. Distinction Between Primary and Secondary Data – The Recipe Story

Definition: Primary data is original and collected firsthand, while secondary data has already been collected by someone else and is reused.

Story: Imagine two friends, Ravi and Priya, want to cook a traditional dish for a festival. Ravi decides to visit his grandmother and ask her how she prepares the dish. He writes down every step and gathers the ingredients himself. This is like primary data because he is collecting the information firsthand, directly from the source. Priya, on the other hand, prefers to find a recipe online. She downloads one from a popular cooking website. This is like secondary data because she didn’t collect the recipe herself—it was already available, and she’s reusing it. Both approaches work, but Ravi’s method is more personal and specific, while Priya’s is faster and easier!

9. Accuracy vs. Ease – The Detective Story

Definition: Primary data is generally more accurate but takes more time and effort to collect, while secondary data is quicker to obtain but may not be as specific.

Story: Imagine you are a detective solving a mystery. You have two ways to gather clues: You can either visit the crime scene yourself and interview witnesses (this is like collecting primary data) or you can read the police reports from other detectives who have already visited the scene (this is like using secondary data). If you go to the crime scene yourself, your information will be more accurate, but it will take more time. If you use the reports, you’ll get the clues faster, but they might not be as detailed as you’d like. As a detective (or a researcher), you need to decide: Do you want accuracy, or do you need results quickly?

1.7 Classification and Tabulation

Imagine you’ve just finished a big survey where you asked students about their hobbies, favorite subjects, and heights. You now have all this information, but it’s all jumbled up, making it hard to make sense of. That’s where classification comes in!

Classification is like organizing a messy room. It’s the process of sorting things into different groups so that it becomes easier to find patterns and understand what’s happening. For example, just like sorting mail at a post office, you might group all letters meant for one neighborhood together. In data, classification means grouping similar things to simplify them.

Example: Imagine you are working at a post office, sorting letters. Some letters need to go to “Park Street,” some to “Main Street,” and others to “Baker Street.” You don’t want to send all the letters together in one big mess. Instead, you classify them based on the street name. This way, the delivery person can easily know which streets to deliver to.

Bases for Classification:

There are four main ways to classify data:

1. Qualitative Base:

This is when you sort data based on a quality or characteristic that can’t be measured, like religion, literacy level, or intelligence.

Story Example: Imagine you’re a librarian, and you need to organize books. But instead of using book titles, you classify them by genre, like “fiction,” “non-fiction,” and “mystery.” Just like that, data can be classified based on qualities like religion—Hindu, Muslim, Christian—and not just numbers.

2. Quantitative Base:

Here, the data is sorted based on measurable characteristics like age, height, or marks.

Story Example: Let’s say you’re organizing a sports competition. You want to group players based on their height: “5.0 feet to under 5.5 feet” and “5.5 feet to under 6.0 feet.” The groups meet at 5.5 feet, so no player falls into a gap between them. This is quantitative classification because you’re sorting them by measurable numbers.

3. Geographical Base:

This means sorting data by location or geography.

Story Example: You’re a travel blogger with followers from different states. You want to know where most of your followers come from, so you classify them based on their state or country—like “Maharashtra,” “Karnataka,” or “Tamil Nadu.” This is called geographical classification.

4. Chronological or Temporal Base:

This involves sorting data based on time—like years, months, or days.

Story Example: Imagine running a coffee shop and recording sales. You could classify the sales data based on the time of year: “January,” “February,” “March,” and so on. This helps you compare how well your business did in different months. Sorting by time is called temporal classification.

Example: Let’s say you’re tracking the sales of a company. You could group the data by year (2019, 2020, 2021), or you could group it by product categories (electronics, furniture, clothing). The way you group the data depends on what you want to study or understand.

Types of Classification:

There are different ways to classify data based on how many characteristics you consider:

1. One-way Classification:

This is when you classify data based on one characteristic.

Story Example: Imagine you’re a teacher, and you want to sort students based on the subject they like best—math, science, or English. This is called one-way classification because you’re only considering one thing: favorite subject.

2. Two-way Classification:

Here, you classify data based on two characteristics at the same time.

Story Example: Now imagine that you’re not just interested in their favorite subjects, but also their gender. So, you classify the students based on both their favorite subject and whether they’re boys or girls. For example, “Boys who like math” or “Girls who like science.” This is two-way classification.

3. Multi-way Classification:

This happens when you consider more than two characteristics.

Story Example: Now, let’s say you want to go even further and also classify students based on their marks in these subjects. You now group students by their favorite subject, gender, and marks—like “Girls who like math and scored above 90%.” This is multi-way classification because you’re looking at three characteristics at the same time.
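The three types of classification can be sketched in a few lines of Python. This is only an illustration: the student records and the `mark_band` helper are made up for the example, and `collections.Counter` is just one convenient way to count groups.

```python
from collections import Counter

# Hypothetical student records: (favourite subject, gender, mark in percent).
students = [
    ("math", "girl", 92),
    ("math", "boy", 75),
    ("science", "girl", 88),
    ("english", "boy", 64),
    ("math", "girl", 95),
    ("science", "boy", 91),
]


def mark_band(mark):
    """Label a mark as 'above 90' or '90 or below' (illustrative cut-off)."""
    return "above 90" if mark > 90 else "90 or below"


# One-way: a single characteristic (favourite subject).
one_way = Counter(subject for subject, _, _ in students)

# Two-way: two characteristics at once (subject and gender).
two_way = Counter((subject, gender) for subject, gender, _ in students)

# Multi-way: three characteristics (subject, gender and mark band).
multi_way = Counter(
    (subject, gender, mark_band(mark)) for subject, gender, mark in students
)

print(one_way["math"])                          # 3
print(two_way[("math", "girl")])                # 2
print(multi_way[("math", "girl", "above 90")])  # 2
```

Each extra characteristic simply adds one more field to the key that is being counted.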

1.8 Frequency Distribution

Now, let’s talk about frequency distribution. Imagine you’re a class teacher, and you’ve just graded your students’ math tests. Some students scored between 60-70, some between 70-80, and so on. To understand how many students fall into each score range, you create a frequency table.

Frequency means how often something happens. For example, if 5 students scored between 60-70, the frequency of that score range is 5.

A frequency distribution is simply a table that shows how often different values (or ranges of values) appear in the data.

Example:

You’re at a school sports day. Some students ran the 100-meter race in 10-12 seconds, some in 12-14 seconds, and so on. You make a table that shows how many students finished within each time range. This table is called a frequency distribution.

Class Intervals:

When you have a lot of different data points, it’s helpful to group them into class intervals. These are ranges that the data falls into.

1. Class Limits:

These are the boundaries of the intervals.

Story Example: If you have a class interval of 10-20, the lower class limit is 10, and the upper class limit is 20. Think of it like organizing students based on their height; one group might be for students between 150 cm and 160 cm tall.

2. Class Length:

This is the size of the interval, or the difference between the upper and lower class limits.

Story Example: For the height interval 150-160 cm, the class length would be 10 cm (160-150 = 10). This tells you how big each group is.

3. Mid-value (Class Mark):

This is the midpoint of the class interval, calculated by averaging the lower and upper class limits.

Story Example: In the height group 150-160 cm, the mid-value would be (150 + 160)/2 = 155 cm. This helps represent the “middle” value of that group.
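As a quick check of the two formulas above, here is a minimal sketch. The function names `class_length` and `class_mark` are chosen for this example only.

```python
def class_length(lower, upper):
    """Class length = upper class limit - lower class limit."""
    return upper - lower


def class_mark(lower, upper):
    """Mid-value (class mark) = average of the two class limits."""
    return (lower + upper) / 2


print(class_length(150, 160))  # 10
print(class_mark(150, 160))    # 155.0
```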

Types of Class Intervals

There are two main types of class intervals:

1. Inclusive Type:

In this type, both the lower and upper limits of the class interval are included.

Story Example: If you’re grouping students by age and one group is 10-15 years old, the ages 10, 11, 12, 13, 14, and 15 are all included in that group. So, a 15-year-old is part of this interval.

2. Exclusive Type:

In this type, the upper limit is not included.

Story Example: If the group is 10-15 years old, it covers ages from 10 up to, but not including, 15. So, a child who is exactly 15 would fall into the next group, such as 15-20.

Story Example for Class Intervals:

Let’s say you’re in charge of organizing a big birthday party for kids of different ages. You want to group them based on their ages so that the games are appropriate. You make age groups like 5-10 years, 10-15 years, and 15-20 years.

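In the inclusive type, the upper limit belongs to the group, so kids exactly 10 years old would be in the 5-10 group. To avoid a child fitting two groups, inclusive groups are normally written without shared limits, such as 5-9, 10-14, and 15-19.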

In the exclusive type, a 10-year-old would go into the next group, 10-15.
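The difference between the two types comes down to one comparison sign. The sketch below assumes a hypothetical helper `find_class` that returns the first group containing a given age; the party groups are the ones from the story.

```python
def find_class(value, classes, inclusive):
    """Return the first (lower, upper) class that contains value, or None.

    inclusive=True  -> lower <= value <= upper (upper limit belongs to the class)
    inclusive=False -> lower <= value <  upper (upper limit goes to the next class)
    """
    for lower, upper in classes:
        if inclusive and lower <= value <= upper:
            return (lower, upper)
        if not inclusive and lower <= value < upper:
            return (lower, upper)
    return None


party_groups = [(5, 10), (10, 15), (15, 20)]

print(find_class(10, party_groups, inclusive=True))   # (5, 10)
print(find_class(10, party_groups, inclusive=False))  # (10, 15)
```

Under the inclusive rule the 10-year-old is caught by the first group that matches, which is 5-10; under the exclusive rule the same child moves on to 10-15.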

Class Boundaries

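Definition: Class boundaries are the true limits of class intervals. They close the gaps between inclusive classes so that each class ends exactly where the next begins, which turns inclusive classes into exclusive ones.

Example: For the inclusive classes 5-9 and 10-14, the gap between the upper limit 9 and the next lower limit 10 is 1, so the correction is half of that, 0.5:
– New upper limit of the first class = 9 + 0.5 = 9.5
– New lower limit of the second class = 10 - 0.5 = 9.5
– Applying the same correction to the outer limits, the class boundaries are 4.5-9.5 and 9.5-14.5.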

Story/Real Implication: Imagine you’re organizing bookshelves in a library. If you leave gaps between book categories, some books might not fit well. Class boundaries are like ensuring the categories “touch” each other so no gaps are left.
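The conversion from class limits to class boundaries can be sketched as follows. The function name `class_boundaries` is an assumption for this example, and the sketch assumes at least two classes, listed in order, with the same gap between every pair of neighbours.

```python
def class_boundaries(classes):
    """Convert inclusive (lower, upper) class limits into class boundaries.

    Half of the gap between one class's upper limit and the next class's
    lower limit is subtracted from every lower limit and added to every
    upper limit.
    """
    gap = classes[1][0] - classes[0][1]
    half_gap = gap / 2
    return [(lower - half_gap, upper + half_gap) for lower, upper in classes]


print(class_boundaries([(5, 9), (10, 14)]))  # [(4.5, 9.5), (9.5, 14.5)]
```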

Open-end Class Interval

Definition: These are intervals where the lower or upper limit is not defined.

Example: Below 10, 10-20, 20-30, 30-40, Above 40.

Story/Real Implication: Think of an “open-end” class interval like going to a store where you can only categorize prices like “less than $10” or “more than $40”. You don’t know exactly how low or high the values go, but you can group them this way.

Relative Frequency

Definition: It’s the proportion of the total frequency that a specific class represents.

Formula:
Relative Frequency = (Frequency of the Class / Total Frequency)

Example: For class interval 20-30 with a frequency of 12 out of a total of 32:
Relative Frequency = 12 / 32 = 0.375

Story/Real Implication: Think of relative frequency as figuring out how many of your classmates prefer a specific flavor of ice cream compared to the whole group. If 3 out of 10 like chocolate, then the relative frequency of chocolate lovers is 0.3 or 30%.

Percentage Frequency

Definition: It’s the relative frequency expressed as a percentage.

Formula:
Percentage Frequency = (Class Frequency / Total Frequency) × 100

Example: If 12 students out of 32 scored between 20-30 in a test, their percentage frequency is:
(12 / 32) × 100 = 37.5%

Story/Real Implication: This is like a survey where you figure out how many people (in percentage) like a specific movie genre. For instance, 40% of a group might say they love action movies, which gives a quick idea of popularity.
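Relative frequency and percentage frequency differ only by a factor of 100, which a short sketch makes plain. The function names are chosen for this example.

```python
def relative_frequency(class_frequency, total_frequency):
    """Share of the total frequency that falls in one class."""
    return class_frequency / total_frequency


def percentage_frequency(class_frequency, total_frequency):
    """Relative frequency expressed as a percentage."""
    return relative_frequency(class_frequency, total_frequency) * 100


print(relative_frequency(12, 32))    # 0.375
print(percentage_frequency(12, 32))  # 37.5
```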

Frequency Density

Definition: Used for unequal class intervals to show how dense the frequency is in each class.

Formula:
Frequency Density = Class Frequency / Width of Class

Example: If a class with a width of 10 (say 20-30) has a frequency of 12:
Frequency Density = 12 / 10 = 1.2

Story/Real Implication: Picture a classroom where students are packed in rows. Frequency density is like calculating how tightly packed students are in a row with different numbers of seats. The more students per row, the denser it feels.
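Frequency density is most useful when class widths differ. In the sketch below, only the 20-30 class with frequency 12 comes from the example above; the other two classes are made-up values added to show unequal widths.

```python
def frequency_density(frequency, lower, upper):
    """Frequency density = class frequency / class width."""
    return frequency / (upper - lower)


# (lower limit, upper limit, frequency); the first and last rows are hypothetical.
classes = [(0, 20, 10), (20, 30, 12), (30, 60, 15)]

for lower, upper, frequency in classes:
    print(f"{lower}-{upper}: {frequency_density(frequency, lower, upper)}")
```

Here the 30-60 class has the highest frequency (15) but, because it is three times as wide, a lower density than the 20-30 class.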

Discrete Frequency Distribution

Definition: This shows how often certain distinct values (like the number of children in a family) occur in the data.

Example: Data of post-graduates in 10 families: 0, 1, 3, 1, 0, 2, 2, 2, 2, 4.

| Number of Post-graduates | Frequency |
|---|---|
| 0 | 2 |
| 1 | 2 |
| 2 | 4 |
| 3 | 1 |
| 4 | 1 |

Story/Real Implication: Think of a neighborhood where each house has a different number of children. The discrete frequency distribution helps us count how many houses have 0, 1, 2, 3, or 4 children. It’s like organizing and counting families based on how many kids they have.
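A discrete frequency distribution is simply a count of each distinct value. As a minimal sketch, Python's `collections.Counter` can tally the ten families from the example above.

```python
from collections import Counter

# Number of post-graduates in each of the 10 families (from the example above).
post_graduates = [0, 1, 3, 1, 0, 2, 2, 2, 2, 4]

frequency = Counter(post_graduates)

# Print the distribution in increasing order of the value.
for value in sorted(frequency):
    print(value, frequency[value])
```

The frequencies add up to 10, the number of families observed.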

Continuous Frequency Distribution

Definition: Used when data can take any value within a range (e.g., 10-20, 20-30).

Example: Marks obtained by 20 students in a test: 18, 23, 28, 29, 44, 28, 43, 44, 24, 29, 32, 39, etc.

Converted into a frequency distribution:

| Marks | Frequency |
|---|---|
| 15-20 | 1 |
| 20-25 | 3 |
| 25-30 | 6 |
| 30-35 | 2 |
| 35-40 | 2 |
| 40-45 | 4 |
| 45-50 | 2 |

Story/Real Implication: Imagine you are collecting scores from a class test. Some students score between 15 and 20, some between 25 and 30, and so on. Continuous frequency distribution helps you organize these scores into ranges so you can easily see how students performed.

Cumulative Frequency Distribution

Definition: A way to calculate the running total of frequencies up to a certain point.

Story/Real Implication: Cumulative frequency is like tracking your savings over time. Every time you add more money, you’re not just looking at the current amount but how much you’ve saved overall. It helps you understand the total build-up of data up to a certain class.

Continuous Frequency Distribution

Definition: The variable takes values that are expressed as class intervals within certain limits.

Problem 2:

Marks obtained by 20 students in an exam for 50 marks are given below. Convert this data into continuous frequency distribution form:

Data: 18, 23, 28, 29, 44, 28, 43, 44, 24, 29, 32, 39, 42, 38, 49, 47, 22, 33, 28, 29

Solution:

| Marks | Frequency |
|---|---|
| 15-20 | 1 |
| 20-25 | 3 |
| 25-30 | 6 |
| 30-35 | 2 |
| 35-40 | 2 |
| 40-45 | 4 |
| 45-50 | 2 |
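Tallying by hand is error-prone, so it can help to cross-check the counts with a short script. The sketch below assumes exclusive classes of width 5 starting at 15, and the function name `continuous_frequency_distribution` is chosen for this example.

```python
# Marks of the 20 students listed in Problem 2.
marks = [18, 23, 28, 29, 44, 28, 43, 44, 24, 29,
         32, 39, 42, 38, 49, 47, 22, 33, 28, 29]


def continuous_frequency_distribution(data, start, width, n_classes):
    """Tally data into exclusive classes [lower, upper) of equal width.

    Values that fall outside all classes are ignored.
    """
    counts = [0] * n_classes
    for x in data:
        index = (x - start) // width
        if 0 <= index < n_classes:
            counts[index] += 1
    return [
        ((start + i * width, start + (i + 1) * width), counts[i])
        for i in range(n_classes)
    ]


for (lower, upper), count in continuous_frequency_distribution(marks, 15, 5, 7):
    print(f"{lower}-{upper}: {count}")
```

For the listed marks this gives the frequencies 1, 3, 6, 2, 2, 4, 2, which add up to 20.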

Problem 3:

The following data give the number of children per family for 25 families. Prepare a frequency distribution of the number of children (say, variable X), taking the distinct values 0, 1, 2, 3, 4.

Data:
4, 3, 1, 1, 1, 2,
4, 3, 1, 1, 1, 1, 0,
2, 2, 2, 3, 2,
2, 1, 3, 4

Solution: Frequency distribution of the number of children in 25 families

| Number of Children | Number of Families |
|---|---|
| 0 | 1 |
| 1 | 7 |
| 2 | 8 |
| 3 | 6 |
| 4 | 3 |

Problem 4:

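For the following frequency distribution, prepare cumulative frequency distributions of the ‘less than’ and ‘greater than’ types.

Solution:

Here the ‘less than’ cf counts the observations up to and including X, and the ‘greater than’ cf counts the observations from X upward, so it starts at the total, 43.

| X | Frequency | Less than cf | Greater than cf |
|---|---|---|---|
| 1 | 5 | 5 | 43 |
| 2 | 7 | 12 | 38 |
| 3 | 15 | 27 | 31 |
| 4 | 10 | 37 | 16 |
| 5 | 6 | 43 | 6 |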
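Both kinds of cumulative frequency are running totals, one taken from the top of the table and one from the bottom. A minimal sketch using `itertools.accumulate` follows. The function names are chosen for this example, and the ‘greater than’ type is taken here to include the value itself, so it starts at the total.

```python
from itertools import accumulate


def less_than_cf(frequencies):
    """Running total from the first value downward (observations <= X)."""
    return list(accumulate(frequencies))


def greater_than_cf(frequencies):
    """Running total from the last value upward (observations >= X)."""
    return list(accumulate(reversed(frequencies)))[::-1]


frequencies = [5, 7, 15, 10, 6]  # frequencies for X = 1, 2, 3, 4, 5

print(less_than_cf(frequencies))     # [5, 12, 27, 37, 43]
print(greater_than_cf(frequencies))  # [43, 38, 31, 16, 6]
```

A handy check: for every row, the ‘less than’ cf plus the ‘greater than’ cf equals the total plus that row's own frequency, because the row is counted in both.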

Cumulative Frequency Distribution

Problem 5:

The following are the marks of 50 students. Prepare cumulative frequency distributions of both types. Also find the relative frequencies.

| Marks | No. of Students |
|---|---|
| 0-10 | 7 |
| 10-20 | 11 |
| 20-30 | 19 |
| 30-40 | 6 |
| 40-50 | 7 |

Solution:

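| Marks | Less than cf (below upper limit) | More than cf (above upper limit) | Relative Frequency |
|---|---|---|---|
| 0-10 | 7 | 43 | 7/50 |
| 10-20 | 18 | 32 | 11/50 |
| 20-30 | 37 | 13 | 19/50 |
| 30-40 | 43 | 7 | 6/50 |
| 40-50 | 50 | 0 | 7/50 |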

Problem 6:

For the following frequency distribution, obtain cumulative frequencies, relative frequencies, and relative cumulative frequencies.

| Class Interval | Frequency |
|---|---|
| 30-50 | 8 |
| 50-70 | 15 |
| 70-90 | 25 |
| 90-110 | 16 |
| 110-130 | 7 |
| 130-150 | 4 |
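One way to work through Problem 6 is to compute the three columns with a short script. This is a sketch rather than an official solution: the function name `summarize` is chosen for this example, and the cumulative frequencies are of the ‘less than’ type.

```python
from itertools import accumulate

classes = [(30, 50), (50, 70), (70, 90), (90, 110), (110, 130), (130, 150)]
frequencies = [8, 15, 25, 16, 7, 4]


def summarize(frequencies):
    """Return total, cumulative, relative and relative cumulative frequencies."""
    total = sum(frequencies)
    cumulative = list(accumulate(frequencies))
    relative = [f / total for f in frequencies]
    relative_cumulative = [c / total for c in cumulative]
    return total, cumulative, relative, relative_cumulative


total, cumulative, relative, relative_cumulative = summarize(frequencies)

for (lower, upper), f, c, r, rc in zip(
    classes, frequencies, cumulative, relative, relative_cumulative
):
    print(f"{lower}-{upper}: f={f}, cf={c}, rf={r:.4f}, rcf={rc:.4f}")
```

The total frequency is 75, the cumulative frequencies are 8, 23, 48, 64, 71, 75, and the last relative cumulative frequency is 1, as it always should be.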

Multiple Choice Questions (MCQs)

1. What is the primary objective of studying the unit “Advanced Bank Management”?

  • a) Understanding human behavior
  • b) Learning proper methods to collect and analyze data
  • c) Managing accounting systems
  • d) Developing new banking products

Answer: b) Learning proper methods to collect and analyze data

2. What is one of the main purposes of using data management techniques in banking?

  • a) To design marketing campaigns
  • b) To predict economic trends
  • c) To collect and present data effectively
  • d) To train employees

Answer: c) To collect and present data effectively

3. The term ‘Statistics’ is derived from which Latin word?

  • a) Statista
  • b) Statisticum
  • c) Statistici
  • d) Statibus

Answer: b) Statisticum

4. Who first used the word ‘Statistics’ in reference to subject matter as a whole?

  • a) Professor Keynes
  • b) Professor Achenwall
  • c) Professor Adam Smith
  • d) Professor Ricardo

Answer: b) Professor Achenwall

5. Which one of the following phases is the first stage in the process of data analysis?

  • a) Classification of data
  • b) Presentation of data
  • c) Collection of data
  • d) Analysis of data

Answer: c) Collection of data

6. Which of the following best describes the second phase in the statistical analysis process?

  • a) Analysis of classified data
  • b) Collection of raw data
  • c) Classification and tabulation of data
  • d) Drawing conclusions from the data

Answer: c) Classification and tabulation of data

7. In which century was the word ‘Statistics’ first used in its modern sense?

  • a) 15th century
  • b) 16th century
  • c) 18th century
  • d) 19th century

Answer: c) 18th century

8. Which phase of statistical analysis involves dividing raw data into groups or categories?

  • a) Collection of data
  • b) Interpretation of data
  • c) Classification and tabulation of data
  • d) Analysis of data

Answer: c) Classification and tabulation of data

9. Which of the following is NOT mentioned as a field where statistics is widely used?

  • a) Business
  • b) Economics
  • c) Politics
  • d) Literature

Answer: d) Literature

10. What is the final stage of the statistical analysis process?

  • a) Collection of data
  • b) Interpretation of data
  • c) Tabulation of data
  • d) Presentation of data

Answer: b) Interpretation of data

11. The word “Statistics” in German is referred to as:

  • a) Statistiko
  • b) Statistik
  • c) Statista
  • d) Statisticum

Answer: b) Statistik

12. Which term is used to describe the gathering of information from primary or secondary sources?

  • a) Classification of data
  • b) Analysis of data
  • c) Collection of data
  • d) Interpretation of data

Answer: c) Collection of data

13. What is the primary focus of the classification and tabulation phase of data management?

  • a) To analyze the data for patterns
  • b) To collect relevant data
  • c) To represent raw data in a tabular format
  • d) To make predictions based on data

Answer: c) To represent raw data in a tabular format

14. Which of the following is an example of where statistics can be applied in daily life, according to the text?

  • a) Analyzing hospital patient data
  • b) Writing a novel
  • c) Building a house
  • d) Preparing a meal

Answer: a) Analyzing hospital patient data

15. What type of data is divided into different groups or classes in the tabulation process?

  • a) Primary data
  • b) Raw data
  • c) Processed data
  • d) Qualitative data

Answer: b) Raw data

16. Which phase of data analysis uses formulas and methods to understand the patterns in data?

  • a) Collection of data
  • b) Classification of data
  • c) Analysis of data
  • d) Interpretation of data

Answer: c) Analysis of data

17. What is the purpose of the ‘Interpretation of Data’ stage in statistical analysis?

  • a) To collect raw data
  • b) To divide data into groups
  • c) To analyze data using formulas
  • d) To draw conclusions from the data

Answer: d) To draw conclusions from the data

18. Which of the following areas is mentioned as having increasing use of statistics?

  • a) Poetry
  • b) Social Sciences
  • c) Art and Design
  • d) Classical Music

Answer: b) Social Sciences

19. What is the function of tabulated data in the analysis process?

  • a) It helps in analyzing data using formulas
  • b) It helps in collecting more data
  • c) It simplifies the presentation of findings
  • d) It prepares the data for interpretation

Answer: a) It helps in analyzing data using formulas

20. What is the last step before drawing conclusions in the statistical process?

  • a) Collection of data
  • b) Analysis of data
  • c) Tabulation of data
  • d) Classification of data

Answer: b) Analysis of data

21. Which of the following statements about statistics is true according to the text?

  • a) Statistics is only useful for scientific purposes.
  • b) Statistics was originally used to collect facts about the state.
  • c) Statistics does not involve any data collection.
  • d) Statistics is mainly used for artistic analysis.

Answer: b) Statistics was originally used to collect facts about the state.

22. In what context did Professor Achenwall define statistics?

  • a) As a branch of mathematics
  • b) As a method for economic analysis
  • c) As the political science of many countries
  • d) As a study of natural sciences

Answer: c) As the political science of many countries

23. What is the importance of presenting data properly in statistics?

  • a) It helps in creating beautiful graphs
  • b) It allows for making predictions and correct decisions
  • c) It reduces the workload of analysts
  • d) It eliminates the need for further data collection

Answer: b) It allows for making predictions and correct decisions

24. In which phase is data divided into groups and represented in a table?

  • a) Data interpretation
  • b) Data collection
  • c) Data classification and tabulation
  • d) Data presentation

Answer: c) Data classification and tabulation

25. Which field is NOT mentioned in the text as an area where statistics is commonly applied?

  • a) Medicine
  • b) Engineering
  • c) Politics
  • d) Commerce

Answer: b) Engineering

26. According to the objectives of this chapter, which of the following is NOT a key goal?

  • a) Making correct decisions based on data
  • b) Learning to create new types of data
  • c) Understanding how to collect and present data
  • d) Developing data analysis techniques

Answer: b) Learning to create new types of data

27. The term “Statistics” was first used in a formal sense in which year?

  • a) 1492
  • b) 1749
  • c) 1849
  • d) 1949

Answer: b) 1749

28. In statistical analysis, which phase is essential for ensuring the data is ready for further analysis?

  • a) Collection of data
  • b) Classification and Tabulation
  • c) Presentation of data
  • d) Interpretation of data

Answer: b) Classification and Tabulation

29. Which of the following fields mentioned in the text relies heavily on the use of statistics?

  • a) Arts and Music
  • b) Literature and Poetry
  • c) Business and Economics
  • d) Philosophy and Ethics

Answer: c) Business and Economics

30. In which field is statistical analysis used to study stock prices and calculate risk?

  • a) Weather Forecast
  • b) Banking
  • c) Stock Market
  • d) Medical Research

Answer: c) Stock Market

31. Which statistical method is commonly used in weather forecasting?

  • a) Correlation analysis
  • b) Time series analysis
  • c) Bayesian decision theory
  • d) Sampling techniques

Answer: b) Time series analysis

32. What is the primary application of statistics in the medical field?

  • a) To analyze sales data
  • b) To investigate clinical treatments
  • c) To analyze business strategies
  • d) To forecast weather conditions

Answer: b) To investigate clinical treatments

33. What do credit policies in banking primarily depend on?

  • a) Stock prices
  • b) Statistical analysis of profitability and other ratios
  • c) Historical data on economic growth
  • d) Marketing trends

Answer: b) Statistical analysis of profitability and other ratios

34. In business, what are statistical techniques like time series analysis used for?

  • a) Analyze consumer preferences
  • b) Design new products
  • c) Predict the effect of large numbers of variables
  • d) Train employees

Answer: c) Predict the effect of large numbers of variables

35. What is the role of statistics in demand and supply analysis in economics?

  • a) To predict future trends in marketing
  • b) To calculate profit and loss
  • c) To determine elasticity of demand and maximum satisfaction
  • d) To train employees in the financial sector

Answer: c) To determine elasticity of demand and maximum satisfaction

36. What is one way that regression techniques are used in weather forecasting?

  • a) To analyze future stock prices
  • b) To predict future weather conditions
  • c) To train new meteorologists
  • d) To compare various types of weather patterns

Answer: b) To predict future weather conditions

37. What does “Bayesian Decision Theory” help businesses to achieve?

  • a) Improve employee retention
  • b) Optimize decisions by evaluating payoffs
  • c) Predict customer satisfaction levels
  • d) Develop new products

Answer: b) Optimize decisions by evaluating payoffs

38. What is one of the benefits of using time series analysis in business?

  • a) To collect historical data
  • b) To predict sales trends and market fluctuations
  • c) To calculate future stock prices
  • d) To analyze employee behavior

Answer: b) To predict sales trends and market fluctuations

39. What is the application of statistics in banking?

  • a) To assess loan risks based on historical data
  • b) To predict stock market behavior
  • c) To analyze customer satisfaction
  • d) To measure employee performance

Answer: a) To assess loan risks based on historical data

40. How do players in sports use statistics?

  • a) To predict the outcome of games
  • b) To identify or rectify their mistakes
  • c) To recruit new players
  • d) To evaluate the history of the sport

Answer: b) To identify or rectify their mistakes

41. Which of the following is a function of statistics?

  • a) Statistics eliminate the need for forecasting
  • b) Statistics prevent data from being biased
  • c) Statistics help in simplifying complex data
  • d) Statistics replace individual analysis with machine learning

Answer: c) Statistics help in simplifying complex data

42. Which of the following is NOT a function of statistics?

  • a) Statistics simplify complex data
  • b) Statistics help in forecasting outcomes
  • c) Statistics eliminate the need for data analysis
  • d) Statistics provide a technique of comparison

Answer: c) Statistics eliminate the need for data analysis

43. What is a limitation of statistics?

  • a) Statistics are only useful in business contexts
  • b) Statistics do not deal with individuals
  • c) Statistics always provide accurate results
  • d) Statistics exclude quantitative data

Answer: b) Statistics do not deal with individuals

44. Why can’t statistics be applied to qualitative data?

  • a) Qualitative data is subjective
  • b) Qualitative data is too complex for statistics
  • c) Qualitative data cannot be measured in terms of quantity or numbers
  • d) Qualitative data requires machine learning to analyze

Answer: c) Qualitative data cannot be measured in terms of quantity or numbers

45. What is one reason statistical methods may not always give the correct result?

  • a) They require too many samples to work
  • b) They cannot be applied to large groups
  • c) They only provide an average of the data
  • d) They are biased by default

Answer: c) They only provide an average of the data

46. When are the results of statistics considered biased?

  • a) When data is collected by inexperienced or dishonest persons
  • b) When data is collected from too many people
  • c) When data is collected using digital methods
  • d) When data is analyzed without machine learning

Answer: a) When data is collected by inexperienced or dishonest persons

47. What is the definition of a population in statistics?

  • a) A small group that represents the whole
  • b) The entire group we are interested in for drawing conclusions
  • c) A sample of a larger group of data
  • d) A collection of numerical data points only

Answer: b) The entire group we are interested in for drawing conclusions

48. In the example, what is the population if we are studying the weight of adult men in India?

  • a) A group of men from a particular city
  • b) The set of weights of all men in India
  • c) Men from rural areas only
  • d) Data collected from a survey of male athletes

Answer: b) The set of weights of all men in India

49. Which of the following is not studied under statistics?

  • a) The relationship between two or more variables
  • b) Individual qualitative observations
  • c) Group observations and averages
  • d) Complex data for comparison

Answer: b) Individual qualitative observations

50. What is the population if we are studying the grade point average of students at Mumbai University?

  • a) A group of selected students
  • b) The set of GPAs of all students of Mumbai University
  • c) Only students with high GPAs
  • d) A sample of students from a single department

Answer: b) The set of GPAs of all students of Mumbai University

51. Why is a sample often used in statistical studies?

  • a) It is impossible to study an entire population
  • b) To save time and money when the population is too large to study
  • c) Samples provide more accurate data than populations
  • d) Sampling eliminates the need for further analysis

Answer: b) To save time and money when the population is too large to study

52. What is a sample in statistics?

  • a) A small subset of the population
  • b) The entire group we are studying
  • c) A characteristic that varies among individuals
  • d) A parameter of the population

Answer: a) A small subset of the population

53. In the given example, what would be the sample for a study on infant health in India?

  • a) All infants born in India in one year
  • b) All infants born on one particular day in one year
  • c) All children below the age of 5
  • d) A group of mothers with infants

Answer: b) All infants born on one particular day in one year

54. What is a variate in statistics?

  • a) A characteristic that cannot be expressed numerically
  • b) A characteristic that varies from one individual to another and can be expressed in numerical terms
  • c) A group of individuals with the same characteristics
  • d) A subset of the population

Answer: b) A characteristic that varies from one individual to another and can be expressed in numerical terms

55. Which of the following is an example of a variate?

  • a) Religion of humans
  • b) Colour of the ball
  • c) Weight of students in a class
  • d) Gender of employees

Answer: c) Weight of students in a class

56. What is an attribute in statistics?

  • a) A numerical characteristic that can vary
  • b) A characteristic that cannot be expressed in numerical terms
  • c) A type of data collection method
  • d) A type of statistical error

Answer: b) A characteristic that cannot be expressed in numerical terms

57. Which of the following is an example of an attribute?

  • a) Number of accidents
  • b) Age of students
  • c) Colour of the ball
  • d) Number of members in a family

Answer: c) Colour of the ball

58. What type of data is referred to as discrete variate?

  • a) Data that takes any value within a range
  • b) Data that takes only a countable and usually finite number of values
  • c) Data that is difficult to analyze
  • d) Data that can only be measured in continuous ranges

Answer: b) Data that takes only a countable and usually finite number of values

59. Which of the following is an example of discrete variate?

  • a) Percentage of marks
  • b) Age in years
  • c) Height of students
  • d) Weight of students

Answer: b) Age in years

60. What is a continuous variate in statistics?

  • a) A variate that takes distinct values only
  • b) A variate that can take any value within a range
  • c) A variate that applies only to qualitative data
  • d) A variate that applies only to finite values

Answer: b) A variate that can take any value within a range

61. Which of the following is an example of continuous variate?

  • a) Number of children in a family
  • b) Percentage of marks
  • c) Number of accidents
  • d) Number of students in a class

Answer: b) Percentage of marks

62. What is a parameter in statistics?

  • a) A sample size chosen for the study
  • b) A numerical value or function of the observations of the entire population
  • c) A characteristic that is observed during the study
  • d) The number of observations in a study

Answer: b) A numerical value or function of the observations of the entire population

63. What is the role of statistics in relation to the parameter?

  • a) It eliminates the need to estimate parameters
  • b) It is used to estimate unknown parameters from a sample
  • c) It provides accurate predictions without the need for data
  • d) It collects data from the entire population

Answer: b) It is used to estimate unknown parameters from a sample

64. What is an example of a parameter in the given context?

  • a) Population mean
  • b) Age of students
  • c) Colour of the ball
  • d) Number of children in a family

Answer: a) Population mean

65. What is the process of sorting letters in a post office based on their addresses an example of?

  • a) Classification
  • b) Tabulation
  • c) Data Analysis
  • d) Sampling

Answer: a) Classification

66. What is the main purpose of classification in data analysis?

  • a) To make the data more complex
  • b) To condense data by removing unimportant details
  • c) To enlarge the dataset
  • d) To change the structure of data

Answer: b) To condense data by removing unimportant details

67. How is data classified in the “Quantitative Base”?

  • a) By geographical regions
  • b) By qualitative characteristics like religion
  • c) By numerical characteristics such as age or income
  • d) By the color of objects

Answer: c) By numerical characteristics such as age or income

68. What is an example of Geographical Base classification?

  • a) Classification by income level
  • b) Classification by color
  • c) Classification by states or countries
  • d) Classification by gender

Answer: c) Classification by states or countries

69. What is Chronological Base classification?

  • a) Classifying data by numerical characteristics
  • b) Classifying data by time periods
  • c) Classifying data by religious affiliation
  • d) Classifying data by geographical region

Answer: b) Classifying data by time periods

70. What is One-way classification?

  • a) Classifying data based on more than two characteristics
  • b) Classifying data based on one characteristic
  • c) Classifying data using only qualitative data
  • d) Classifying data using only geographical data

Answer: b) Classifying data based on one characteristic

71. In Two-way classification, how is the data classified?

  • a) By two characteristics at the same time
  • b) By one characteristic only
  • c) By geographical region only
  • d) By qualitative data only

Answer: a) By two characteristics at the same time

72. What does frequency in a frequency distribution represent?

  • a) The total amount of data collected
  • b) The number of occurrences of a value in a given set of observations
  • c) The sum of all numerical values
  • d) The variation in a dataset

Answer: b) The number of occurrences of a value in a given set of observations

73. How is frequency distribution typically represented?

  • a) In the form of graphs
  • b) In a table showing values of the variable and corresponding frequencies
  • c) In a list of descriptive statistics
  • d) Using complex algebraic equations

Answer: b) In a table showing values of the variable and corresponding frequencies

74. What is Primary Data?

  • a) Data collected from secondary sources
  • b) Data collected directly from respondents for the first time
  • c) Data from internet sources
  • d) Data that has already been processed

Answer: b) Data collected directly from respondents for the first time

75. What is the main advantage of using questionnaires for data collection?

  • a) They collect only qualitative data
  • b) They allow respondents to provide large volumes of data quickly
  • c) They eliminate the need for analysis
  • d) They are used only in small-scale studies

Answer: b) They allow respondents to provide large volumes of data quickly

76. What is the main difference between a census and a sample survey?

  • a) Census collects data from the entire population while a sample survey collects data from a subset
  • b) Sample surveys collect more accurate data than a census
  • c) Census uses questionnaires while sample surveys do not
  • d) There is no difference

Answer: a) Census collects data from the entire population while a sample survey collects data from a subset

77. What is Secondary Data?

  • a) Data collected for the first time
  • b) Data that has already been collected and processed by someone else
  • c) Data collected through direct observation
  • d) Data collected using interviews

Answer: b) Data that has already been collected and processed by someone else

78. What is the distinction between primary and secondary data?

  • a) Primary data is collected for the first time; secondary data has been collected previously
  • b) Secondary data is always more accurate than primary data
  • c) Primary data comes from books; secondary data comes from surveys
  • d) There is no distinction between primary and secondary data

Answer: a) Primary data is collected for the first time; secondary data has been collected previously

79. What is the formula for calculating the Class Length or Class Width?

  • a) Class Length = Lower Class Limit – Upper Class Limit
  • b) Class Length = Total Frequency / Number of Intervals
  • c) Class Length = Upper Class Limit – Lower Class Limit
  • d) Class Length = Frequency / Class Interval

Answer: c) Class Length = Upper Class Limit – Lower Class Limit

80. What is the formula to calculate Mid-Value or Class Mark?

  • a) Class Mark = (Lower Class Limit + Upper Class Limit) / 2
  • b) Class Mark = (Upper Class Interval – Lower Class Interval) / 2
  • c) Class Mark = Total Frequency / Number of Classes
  • d) Class Mark = (Upper Class Frequency – Lower Class Frequency) / 2

Answer: a) Class Mark = (Lower Class Limit + Upper Class Limit) / 2
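The class length and class mark formulas from questions 79 and 80 can be checked with a short Python sketch. The function names `class_length` and `class_mark` are illustrative choices and do not come from the notes.

```python
def class_length(lower_limit, upper_limit):
    """Class length (class width): upper class limit minus lower class limit."""
    return upper_limit - lower_limit


def class_mark(lower_limit, upper_limit):
    """Mid-value (class mark): average of the lower and upper class limits."""
    return (lower_limit + upper_limit) / 2


# Example for the class 10-20
print(class_length(10, 20))  # 10
print(class_mark(10, 20))    # 15.0
```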

81. Which type of class interval includes both the upper and lower class limits?

  • a) Exclusive type
  • b) Inclusive type
  • c) Open-end type
  • d) None of the above

Answer: b) Inclusive type

82. What is an example of an exclusive type of class interval?

  • a) 10-20
  • b) 0-10
  • c) 500-1000
  • d) All of the above

Answer: d) All of the above

83. In an inclusive type of class interval, if the class interval is 10-15, which of the following would be the upper class limit?

  • a) 15
  • b) 10
  • c) 12.5
  • d) 16

Answer: a) 15
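The difference between the two types can be sketched in Python, assuming the usual convention that an exclusive class contains its lower limit but not its upper limit, while an inclusive class contains both. The helper name `in_class` is hypothetical.

```python
def in_class(value, lower, upper, inclusive=False):
    """Return True if value falls in the class lower-upper.

    Exclusive type: lower <= value < upper (upper limit excluded).
    Inclusive type: lower <= value <= upper (both limits included).
    """
    if inclusive:
        return lower <= value <= upper
    return lower <= value < upper


# The value 15 belongs to the inclusive class 10-15 ...
print(in_class(15, 10, 15, inclusive=True))  # True
# ... but under the exclusive convention it is counted in the next class, 15-20.
print(in_class(15, 10, 15))                  # False
print(in_class(15, 15, 20))                  # True
```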

84. The data of some workers’ salaries is given as 2300, 2400, 2500, 2100, 2000, 2000, 2300, 2800, 3000, 3200, 2700, 2400, and 2500. If the desired number of class intervals is 5, what is the class width?

  • a) 100
  • b) 200
  • c) 300
  • d) 400

Solution:
The class width can be calculated using the formula:
Class width = (Maximum value – Minimum value) / Number of class intervals.
Maximum salary = 3200, Minimum salary = 2000
Class width = (3200 – 2000) / 5 = 1200 / 5 = 240
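So the computed class width is 240, which is not among the options; the closest listed value is b) 200. In practice the width is usually rounded up to a convenient number (for example 250), because five classes of width 200 would span only 1000 and could not cover the full range of 1200.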
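The calculation in this solution can be reproduced with a few lines of Python. The variable names are illustrative; the data and the number of classes are taken from the question.

```python
salaries = [2300, 2400, 2500, 2100, 2000, 2000, 2300,
            2800, 3000, 3200, 2700, 2400, 2500]
number_of_classes = 5

# Range of the data: maximum value minus minimum value
data_range = max(salaries) - min(salaries)

# Class width: range divided by the desired number of class intervals
class_width = data_range / number_of_classes

print(data_range)   # 1200
print(class_width)  # 240.0
```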

85. The largest and smallest values of a data set are 100 and 60 respectively. If the desired number of class intervals is 8, what is the class width?

  • a) 5
  • b) 8
  • c) 10
  • d) 40

Solution:
Class width = (Maximum value – Minimum value) / Number of class intervals.
= (100 – 60) / 8 = 40 / 8 = 5.
Therefore, the class width is a) 5.

86. For the data set: 20, 25, 30, 35, 40, calculate the mean.

  • a) 25
  • b) 30
  • c) 35
  • d) 40

Solution:
The mean is calculated as the sum of the values divided by the number of values.
Mean = (20 + 25 + 30 + 35 + 40) / 5 = 150 / 5 = 30.
Therefore, the mean is b) 30.

87. If the class intervals are 10-20, 20-30, 30-40, and the frequencies are 5, 10, 15 respectively, what is the cumulative frequency for the class interval 30-40?

  • a) 5
  • b) 10
  • c) 15
  • d) 30

Solution:
Cumulative frequency is calculated by adding the frequency of the class to the frequencies of all previous class intervals.
Cumulative frequency for 30-40 = 5 (10-20) + 10 (20-30) + 15 (30-40) = 30.
Therefore, the cumulative frequency is d) 30.
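As a small sketch, the running totals for this distribution can be built with `itertools.accumulate` from the Python standard library. The variable names are illustrative.

```python
from itertools import accumulate

class_intervals = ["10-20", "20-30", "30-40"]
frequencies = [5, 10, 15]

# Running total: each entry is the class frequency plus all earlier frequencies.
cumulative = list(accumulate(frequencies))

for interval, frequency, running_total in zip(class_intervals, frequencies, cumulative):
    print(interval, frequency, running_total)
# 10-20 5 5
# 20-30 10 15
# 30-40 15 30
```

The same running-total rule covers questions that give the previous cumulative frequency directly: the new value is the previous total plus the current class frequency.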

88. If the frequency for the class interval 40-50 is 8, what is the cumulative frequency if the cumulative frequency of the previous class interval (30-40) is 15?

  • a) 23
  • b) 15
  • c) 8
  • d) 30

Solution:
Cumulative frequency for the class interval 40-50 = Previous cumulative frequency + Current frequency.
= 15 + 8 = 23.
Therefore, the cumulative frequency is a) 23.

89. Calculate the mode of the following data: 5, 10, 10, 15, 20, 20, 20, 25, 30.

  • a) 10
  • b) 15
  • c) 20
  • d) 25

Solution:
The mode is the value that occurs most frequently in the dataset.
In this case, 20 appears three times, more than any other number.
Therefore, the mode is c) 20.

90. In a class interval 50-60 with a frequency of 18 and a total frequency of 90, what is the percentage frequency?

  • a) 5%
  • b) 10%
  • c) 20%
  • d) 15%

Solution:
Percentage frequency = (Frequency of the interval / Total frequency) * 100.
= (18 / 90) * 100 = 20%.
Therefore, the percentage frequency is c) 20%.

91. Find the median for the following data set: 12, 15, 18, 22, 24, 30.

  • a) 15
  • b) 18
  • c) 20
  • d) 22

Solution:
For an even number of values, the median is the average of the two middle values.
Median = (18 + 22) / 2 = 20.
Therefore, the median is c) 20.
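The mean, mode, and median worked out in questions 86, 89, and 91 can be cross-checked with Python's built-in `statistics` module. This is only a verification sketch; the variable names are illustrative.

```python
import statistics

mean_value = statistics.mean([20, 25, 30, 35, 40])
mode_value = statistics.mode([5, 10, 10, 15, 20, 20, 20, 25, 30])
# For an even number of values, median() averages the two middle values.
median_value = statistics.median([12, 15, 18, 22, 24, 30])

print(mean_value)    # 30
print(mode_value)    # 20
print(median_value)  # 20.0
```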

92. In a continuous frequency distribution, the lower class limit of a class is 25 and the class width is 10. What is the upper class limit of the class?

  • a) 30
  • b) 35
  • c) 40
  • d) 45

Solution:
The upper class limit is calculated by adding the class width to the lower class limit.
Upper class limit = 25 + 10 = 35.
Therefore, the upper class limit is b) 35.

93. For the class interval 1000-1500 with a frequency of 10, what is the frequency density given that the class width is 500?

  • a) 2
  • b) 0.02
  • c) 0.5
  • d) 20

Solution:
Frequency density = Frequency / Class width.
Frequency density = 10 / 500 = 0.02.
Therefore, the frequency density is b) 0.02.
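Frequency density is most useful when class widths differ. The sketch below repeats the calculation from the question and compares it with a second, narrower class. The second class (1500-1700 with frequency 10) is a made-up example, and the function name is illustrative.

```python
def frequency_density(frequency, class_width):
    """Frequency per unit of class width."""
    return frequency / class_width


# Class 1000-1500 from the question: frequency 10, width 500
density_wide = frequency_density(10, 500)

# Hypothetical class 1500-1700: same frequency, but width 200
density_narrow = frequency_density(10, 200)

print(density_wide)    # 0.02
print(density_narrow)  # 0.05
```

Both classes hold 10 observations, but the narrower class packs them more densely, which the raw frequencies alone would not show.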

94. Calculate the variance for the following data set: 10, 12, 14, 16, 18.

  • a) 6
  • b) 8
  • c) 10
  • d) 12

Solution:
First, calculate the mean: Mean = (10 + 12 + 14 + 16 + 18) / 5 = 70 / 5 = 14.
Variance formula:
Variance = [(10-14)^2 + (12-14)^2 + (14-14)^2 + (16-14)^2 + (18-14)^2] / 5
= [16 + 4 + 0 + 4 + 16] / 5 = 40 / 5 = 8.
Therefore, the variance is b) 8.
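The solution divides by the number of values (5), which gives the population variance. Dividing by one less than the number of values gives the sample variance instead. The sketch below uses Python's `statistics` module to show both; the variable names are illustrative.

```python
import statistics

data = [10, 12, 14, 16, 18]

# Population variance: sum of squared deviations divided by n
population_variance = statistics.pvariance(data)

# Sample variance: sum of squared deviations divided by n - 1
sample_variance = statistics.variance(data)

print(population_variance)  # 8
print(sample_variance)      # 10
```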

95. In a survey, 60 students participated, and the average marks were found to be 75 with a standard deviation of 10. Calculate the variance.

  • a) 100
  • b) 75
  • c) 10
  • d) 50

Solution:
Variance is the square of the standard deviation.
Variance = (Standard Deviation)^2 = 10^2 = 100.
Therefore, the variance is a) 100.

96. The mode of the data set 5, 10, 10, 15, 15, 15, 20 is:

  • a) 5
  • b) 10
  • c) 15
  • d) 20

Solution:
The mode is the number that appears most frequently.
In the given set, 15 appears 3 times, more than any other number.
Therefore, the mode is c) 15.

97. What is the cumulative frequency for the class interval 40-50 if the previous cumulative frequency is 50 and the frequency of the interval 40-50 is 10?

  • a) 50
  • b) 60
  • c) 70
  • d) 80

Solution:
Cumulative frequency for 40-50 = Previous cumulative frequency + Current frequency.
= 50 + 10 = 60.
Therefore, the cumulative frequency is b) 60.

98. In a group of data, if the mean is 60 and the median is 50, what can be inferred about the skewness of the data?

  • a) Positively skewed
  • b) Negatively skewed
  • c) Symmetrical
  • d) Cannot be determined

Solution:
In a positively skewed distribution, the mean is typically greater than the median, because a few large values pull the mean upward.
Since the mean (60) is greater than the median (50), the data is a) Positively skewed.

99. For the class interval 20-30 with a frequency of 5, and the total frequency being 50, what is the relative frequency?

  • a) 5%
  • b) 10%
  • c) 15%
  • d) 20%

Solution:
Relative frequency = Frequency of the interval / Total frequency.
= 5 / 50 = 0.10, which is 10% when expressed as a percentage.
Therefore, the relative frequency is b) 10%.
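Relative frequency and percentage frequency describe the same share of the total in two different forms. A minimal Python sketch with the numbers from this question (variable names are illustrative):

```python
frequency = 5
total_frequency = 50

# Relative frequency: proportion of the total
relative_frequency = frequency / total_frequency

# Percentage frequency: the same proportion expressed in percent
percentage_frequency = relative_frequency * 100

print(relative_frequency)    # 0.1
print(percentage_frequency)  # 10.0
```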

100. What is the mid-point of the class interval 50-60?

  • a) 50
  • b) 55
  • c) 60
  • d) 65

Solution:
Mid-point = (Lower class limit + Upper class limit) / 2.
= (50 + 60) / 2 = 110 / 2 = 55.
Therefore, the mid-point is b) 55.

101. What is the range of the data set: 100, 90, 80, 70, 60?

  • a) 30
  • b) 40
  • c) 50
  • d) 60

Solution:
Range = Maximum value – Minimum value.
Range = 100 – 60 = 40.
Therefore, the range is b) 40.

102. The class interval for a set of data is 10-20. The lower boundary is 10, and the class width is 10. What is the upper boundary?

  • a) 25
  • b) 20
  • c) 15
  • d) 30

Solution:
Upper boundary = Lower boundary + Class width.
Upper boundary = 10 + 10 = 20.
Therefore, the upper boundary is b) 20.

103. If the mean of a dataset is 40 and the standard deviation is 5, what is the coefficient of variation?

  • a) 12.5%
  • b) 25%
  • c) 50%
  • d) 75%

Solution:
Coefficient of variation = (Standard Deviation / Mean) * 100.
= (5 / 40) * 100 = 12.5%.
Therefore, the coefficient of variation is a) 12.5%.
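The same formula as a small Python function. The function name is illustrative, and the zero-mean check is an added safeguard rather than something stated in the notes.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation as a percentage of the mean."""
    if mean == 0:
        raise ValueError("coefficient of variation is undefined when the mean is 0")
    return std_dev / mean * 100


print(coefficient_of_variation(5, 40))  # 12.5
```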

1.1 Introduction

The process of collecting numbers or figures related to the area of interest, analyzing them, and making decisions based on them is called statistics.

In simple terms, statistics is the process of collecting, analyzing, and making decisions based on data in a specific field.

1. Latin Word “statisticum”:
The word “Statistics” originates from the Latin word “statisticum”, which means “related to the state” or “pertaining to state affairs.”

2. Italian Word “statista”:
The word is also linked to the Italian word “statista”, which means “expert of the state” or “a person knowledgeable in political affairs.”

3. German Word “Statistik”:
In the 18th century, the German word “Statistik” was used to mean “the collection and analysis of data for matters of the state.”

Each of these words conveys the idea of “a group of numbers or figures that represent some information of human interest.”

4. Modern Use:
Gradually, the use of the word “Statistics” expanded beyond just state or government matters and began to be used in every field where data collection, analysis, and interpretation occur. Today, statistics are used in science, business, economics, and many other fields.

Did you know? The word ‘Statistics’ was first used in 1749 by a German scholar, Gottfried Achenwall. He defined it as the “political science of many countries”.

2. What is Data?

“Data” means information, and it comes in two main types – Qualitative and Quantitative.

  • Qualitative Data: This is descriptive and relates to qualities or characteristics.
    Example: When baking a cake, if you describe the cake as “sweet” or “salty”, this is qualitative data.
  • Quantitative Data: This is numerical and deals with numbers.
    Example: If you count how many cakes you baked, like saying you baked 10 cakes, that’s quantitative data.
Example: Imagine you’re selling cakes. You note that the flavor “chocolate” is the most popular (Qualitative data). You also record that you sold 50 chocolate cakes (Quantitative data).

Phases of Statistical Analysis

Statistical analysis has four main phases:

  • 1. Collection of Data: This is the first stage, where you gather information from different sources, which can be primary (original data) or secondary (data collected by others).
    Example: You record how many cakes you sold each day for the past month.
  • 2. Classification and Tabulation of Data: Once you collect the data, you organize it so it’s easy to read and analyze. You can group the data into categories or classes.
    Example: You make a table showing how many cakes of each flavor (chocolate, vanilla, strawberry) were sold on each day of the week.
  • 3. Analysis of Data: In this step, you study the organized data using formulas and methods to find patterns or trends.
    Example: You analyze the table to see which days chocolate cakes were sold the most.
  • 4. Interpretation of Data: After analyzing, you draw conclusions based on the findings.
    Example: You conclude that chocolate cakes sell best on weekends, so you decide to bake more chocolate cakes on Fridays to prepare for the weekend.
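The classification, tabulation, and analysis phases above can be sketched in Python with `collections.Counter`. The sales records below are invented for illustration and are not data from the notes.

```python
from collections import Counter

# Collection: hypothetical raw records, one (day, flavor) entry per cake sold.
sales = [
    ("Fri", "chocolate"), ("Fri", "vanilla"),
    ("Sat", "chocolate"), ("Sat", "chocolate"), ("Sat", "strawberry"),
    ("Sun", "chocolate"), ("Sun", "chocolate"), ("Sun", "vanilla"),
]

# Classification and tabulation: count cakes in each (day, flavor) cell.
table = Counter(sales)

# Analysis: total sold per flavor, then the best-selling flavor.
per_flavor = Counter(flavor for _, flavor in sales)
best_flavor, best_count = per_flavor.most_common(1)[0]

print(table[("Sat", "chocolate")])  # 2
print(best_flavor, best_count)      # chocolate 5
```

Interpretation remains a human step: deciding what to bake based on the counts.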

Ancient and Modern Census Examples

Ancient Example: Ancient Egypt Census

In ancient Egypt, Pharaohs regularly conducted censuses to maintain accurate records of population and wealth in their empire. One early census is commonly dated to around 2,800 BCE. The purpose of such a census was to track the number of people living in the state, their occupations, and their assets.

This information was used by the Pharaoh to collect taxes and manage labor resources, such as determining the number of workers needed for the construction of the pyramids.


Modern Example: Census of India 2011

In 2011, the Government of India conducted one of the largest and most complex censuses in the world, gathering information on more than 1.2 billion people. This census collected data such as age, gender, education level, occupation, language, and living conditions.

The data collected in the 2011 census was used by the Indian government to formulate policies, implement plans, and drive social and economic development across the nation.

1.2 Importance of Statistics

Imagine statistics as a superpower! Statistics give you the ability to “see” patterns in data, much like how a superhero uses their special powers to save the day. Let’s take a journey through some real-world scenarios:

1.2.1 Business and Economics

Statistics are essential in business and economics because they allow decision-makers to analyze data, predict outcomes, and create strategies that lead to success. Here are the key points explained:

  • 1. In Business: Decision-makers adopt suitable policies and strategies based on information about production, sales, profit, purchases, finance, and so on.
  • Story: Imagine you’re the CEO of a clothing company, trying to decide how many shirts to produce next season. You analyze past production data, sales trends, and profits to find the optimal number of shirts to produce. By using statistical data, you avoid overproducing or underproducing, which helps you maximize profit.

    Example: A clothing company looks at last year’s winter sales to decide how many jackets to make this year, ensuring they don’t have leftover inventory or missed opportunities.

  • 2. Time Series Analysis: The businessman can predict the effect of a large number of variables with a fair degree of accuracy.
  • Story: Picture yourself as a retail store manager. You want to predict how holiday sales will perform based on previous years’ data. Using time series analysis, you analyze sales over the past 10 holiday seasons, adjusting for different economic conditions. This helps you estimate future sales and make better decisions on inventory and staffing.

    Example: A retail manager uses data from the past 10 years to predict holiday sales, ensuring that inventory levels and staffing are adequate for the upcoming holiday rush.

  • 3. Bayesian Decision Theory: The businessman selects the optimal decisions by identifying the payoff for each alternative course of action.
  • Story: Imagine you’re a tech company deciding whether to launch a new product. You have three options: delay the launch for more testing, release it now, or cancel it altogether. Using Bayesian Decision Theory, you calculate the probability of success and the expected payoff for each option. This statistical approach helps you choose the course of action with the highest chance of maximizing profit, based on available data.

    Example: A tech company uses Bayesian Decision Theory to evaluate the likelihood of success for a new smartphone launch, considering different scenarios like releasing now or delaying it to avoid potential software issues.

  • 4. In Economics: Statistics are used to analyze demand, cost, price, and related economic factors such as elasticity of demand and consumer satisfaction.
  • Story: Picture yourself as an economist working for a car manufacturing company. You need to understand how changing the price of cars will affect consumer demand. By using data on income, price elasticity of demand, and spending patterns, you predict that a small price reduction will lead to a large increase in sales. This helps the company set the right price to maximize both revenue and consumer satisfaction.

    Example: An economist analyzes how the price of electric cars impacts consumer demand and recommends a price cut to increase market share without significantly reducing profit margins.
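The decision rule described under Bayesian Decision Theory (point 3 above) can be sketched as a comparison of expected payoffs. All probabilities and payoffs below are invented for illustration. A full Bayesian analysis would also update the probabilities as new evidence arrives, which this sketch leaves out.

```python
# Hypothetical inputs: probability of success and payoffs in arbitrary units.
options = {
    "release now": {"p_success": 0.6, "win": 100, "loss": -80},
    "delay":       {"p_success": 0.8, "win": 70,  "loss": -40},
    "cancel":      {"p_success": 1.0, "win": 0,   "loss": 0},
}


def expected_payoff(option):
    """Probability-weighted average of the win and loss payoffs."""
    p = option["p_success"]
    return p * option["win"] + (1 - p) * option["loss"]


payoffs = {name: expected_payoff(option) for name, option in options.items()}

# Choose the course of action with the highest expected payoff.
best = max(payoffs, key=payoffs.get)
print(best)  # delay
```

With these assumed numbers, delaying has an expected payoff of about 48 against about 28 for releasing now and 0 for cancelling, so the rule picks the delay.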

1.2.2 Medical

Story: Picture yourself as a doctor treating patients in a hospital. You’re trying to determine which painkiller works best. You gather data from past patients using different medications and compare their recovery times. With statistics, you can clearly see which medicine helps patients recover faster and make a better decision.

Example: A doctor analyzing patients’ recovery times finds that a new medicine works faster than existing ones, leading to more informed treatment choices.

1.2.3 Weather Forecast

Story: Imagine you’re a meteorologist trying to predict if it will rain during your weekend picnic. You analyze the weather from past years and notice a pattern: it often rains in the third week of September. Using this information, you predict the rain and move your picnic to the following weekend, avoiding a soggy sandwich.

Example: A forecaster uses past data to predict rainfall in September, allowing someone to avoid scheduling an outdoor event during a likely rainstorm.

1.2.4 Stock Market

Story: You are an investor, and you’re trying to decide when to buy stocks in a rising tech company. By analyzing stock price trends from the past, you notice that the company’s stock tends to rise right before new product launches. You buy the stock just before the next release, making a smart investment based on statistics.

Example: An investor buys shares in a tech company after identifying that the stock typically rises before product launches, making profits based on statistical analysis.

1.2.5 Bank

Story: As a bank manager, you need to decide who gets approved for a loan. You analyze the credit scores and income data of all applicants. By using statistics, you identify who is most likely to repay the loan on time, helping you make a fair decision.

Example: A bank manager uses credit scores and income data to approve loans, ensuring the bank takes fewer risks based on solid statistical analysis.

1.3 Functions of Statistics

  • 1. Presenting Facts: Statistics make facts easier to understand using visual aids like charts and graphs.
  • Story: You are a journalist reporting on the number of people attending a concert. Instead of writing down huge numbers in your article, you create a colorful pie chart to show how many people came from each age group. This makes your report easier to understand for your readers.

    Example: A journalist uses pie charts to show audience age groups at a concert, making the data visually appealing and easy to understand.

  • 2. Simplifying Complex Data: Statistics help simplify large data sets by summarizing them into averages or percentages.
  • Story: Imagine you’re an ecologist studying the height of trees in a forest. Instead of measuring every single tree, you take a sample and find the average height. This average gives you a simple, yet accurate representation of the entire forest.

    Example: An ecologist studies a sample of trees to find the average height, rather than measuring every single tree in the forest.

  • 3. Enabling Comparisons: Statistics enable comparisons between different groups or categories.
  • Story: Picture yourself as a school principal comparing the test scores of two classrooms. With statistics, you calculate the average score for each class, and you can easily see which class performed better. This helps you decide where more attention is needed.

    Example: A school principal compares average scores of two classes, identifying which class performed better and where to improve teaching methods.

  • 4. Studying Relationships: Statistics study the relationships between variables, such as exercise and weight loss.
  • Story: You’re a fitness trainer, and you want to know if there’s a connection between how many hours someone exercises per week and how much weight they lose. By analyzing data from your clients, you discover that those who exercise more tend to lose more weight, showing a clear relationship between the two.

    Example: A fitness trainer tracks client exercise hours and weight loss, finding a strong relationship between increased workout hours and higher weight loss.

  • 5. Policy Formulation: Statistics help in policy-making by analyzing population data and other relevant metrics.
  • Story: Imagine you work for the government, and you need to decide where to build a new school. By looking at population data, you find out that one part of the city has more children. Using this data, you decide to build the school there.

    Example: A government uses population data to determine the location of a new school, ensuring it is built where the need is highest.

  • 6. Forecasting Outcomes: Statistics predict future outcomes based on past data.
  • Story: You run an online store, and you’re planning your stock for the holiday season. By analyzing past sales data, you predict that you’ll sell more of a particular toy this year. You order extra stock to meet the expected demand.

    Example: An online retailer predicts demand for a toy during the holidays based on last year’s data and adjusts inventory accordingly.

1.4 Limitations or Demerits of Statistics

  • 1. Statistics Do Not Deal with Individuals: Statistics deal with groups and averages, not individual cases.
  • Story: You are a school counselor, and you want to know the average happiness level of the students in the school. The average happiness score might be 7 out of 10, but this doesn’t tell you that one student is very unhappy while another is extremely happy. Statistics show trends, not individual stories.

    Example: A school counselor uses an average happiness score but misses the extreme experiences of individual students, highlighting the limitation of relying on averages alone.

  • 2. Qualitative Data is Excluded: Statistics focus on numbers and exclude qualitative factors like emotions or feelings.
  • Story: Imagine you are a movie director trying to measure how much the audience loved your latest film. You could look at ticket sales (quantitative data), but this won’t tell you how they felt about the movie. You need qualitative feedback to understand their emotions.

    Example: A director measures ticket sales for their film but doesn’t capture audience emotions, revealing how statistics miss qualitative insights.

  • 3. Results Based on Average: Statistics rely on averages, which can hide extreme cases or outliers.
  • Story: You are the coach of a basketball team, and you calculate the average height of your players. The average might be 6 feet, but this doesn’t tell you that one player is 5 feet tall and another is 7 feet tall. The average hides the extreme differences.

    Example: A basketball coach uses the average team height, but that hides the fact that some players are much shorter or taller than the average.

  • 4. Bias in Results: Statistics can be biased if the sample data isn’t representative of the entire population.
  • Story: You’re a researcher conducting a survey about a new app, but you only ask people aged 18-25 for their opinion. Since older age groups aren’t represented, your results are biased, and you won’t get the full picture.

    Example: A researcher surveys only young people about a new app, leading to biased results that don’t reflect the opinions of older users.

1.5 Definitions

  • Population: A population includes all members of a specific group under study.
  • Story: Picture yourself as a scientist studying the heights of sunflowers in a field. The population includes all the sunflowers in the field, not just a few of them. Your goal is to collect data from the entire population to get an accurate result.

    Example: A scientist collects data from all sunflowers in a field (the population) to ensure accurate results for the study.

1. Population and Sample – The Festival Planning Story

Definition: A population is the entire group you want to study, and a sample is a smaller, representative subset of the population that you study to make conclusions about the whole.

Story: Imagine you are in charge of organizing a Diwali celebration in a large village in India. The village has 10,000 people (that’s your population), and you need to figure out how many sweets to prepare for the festival. But asking each and every person how many sweets they’ll eat is impossible. Instead, you decide to ask just 200 people from different parts of the village. This group of 200 people is your sample. After gathering their answers, you find that each person will likely eat 5 sweets. Now, you use this sample to estimate that the entire village will need 50,000 sweets, using the sample to make decisions for the whole population!
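The estimate in this story is the sample average scaled up to the population. In the Python sketch below, the total of 1,000 sweets reported by the sample is an assumed figure, chosen so that the sample average matches the 5 sweets per person in the story.

```python
population_size = 10_000   # people in the village
sample_size = 200          # people actually asked

# Hypothetical survey result: total sweets the 200 sampled people expect to eat.
sweets_reported_by_sample = 1_000

# Sample average: sweets per person
sample_mean = sweets_reported_by_sample / sample_size

# Scale the sample average up to the whole population
estimated_total = sample_mean * population_size

print(sample_mean)      # 5.0
print(estimated_total)  # 50000.0
```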

2. Variates and Attributes – The Indian School Story

Definition: Variates are characteristics that can be measured and expressed with numbers, while attributes are qualities that cannot be measured numerically.

Story: In a school in Mumbai, the principal is interested in knowing more about the students. Some things about the students, like their height, weight, and exam scores, can be measured. These are called variates. For example, one student might be 5 feet tall, while another might score 85 marks in mathematics. But then there are things about the students that cannot be measured with numbers—these are attributes. For example, one student may prefer eating pani puri, and another might have a favorite color, like blue. These preferences and qualities can’t be assigned numbers, but they’re just as important as the variates.

3. Discrete and Continuous Variables – The Indian Family Story

Definition: Discrete variables take only separate, countable values (like 1, 2, 3), while continuous variables can take any value within a range (like 3.5 or 7.8).

Story: An Indian family is planning a wedding and trying to decide how many people to invite. The number of guests is a discrete variable because you can only invite whole numbers of people, such as 50 or 100 guests. You can’t invite 50.5 guests! On the other hand, they are also shopping for sarees for the bride. When buying fabric, the length of the saree is a continuous variable because you can buy it in decimals, like 6.5 meters or 8.2 meters. While the number of guests is a whole number, the length of fabric can be measured in smaller, more precise values.

4. Parameter and Statistic – The Farmer’s Crop Story

Definition: A parameter is a value that represents the entire population, while a statistic is a value calculated from a sample to estimate the population parameter.

Story: In a village in Punjab, the farmers want to estimate how much wheat they will harvest this year. The parameter is the average wheat yield for all the fields in the village, but it’s difficult to measure every single field. So, the farmers choose a sample of 10 fields and measure the wheat produced in those fields. They calculate that these fields produce an average of 20 quintals of wheat per hectare. This statistic (the average yield from the sample) helps them estimate the average yield for all the fields in the village (the parameter). Just like this, statistics help us make educated guesses about unknown parameters!

5. Primary Data – The Delhi Market Story

Definition: Primary data is information collected directly by a researcher from the source for a specific purpose.

Story: In the busy lanes of Chandni Chowk in Delhi, a shopkeeper wants to know which new products his customers are interested in. Instead of guessing, he asks them directly. Every time a customer enters his shop, he gives them a questionnaire with questions like “What are your favorite products?” and “How often do you visit the market?” The shopkeeper collects this fresh information directly from his customers. This is primary data because he is gathering the data firsthand. It’s like when a detective goes out to collect clues directly from the crime scene, making sure the information is fresh and accurate!

6. Census vs. Sample Survey – The Indian Census Story

Definition: A census is when data is collected from the entire population, while a sample survey involves collecting data from a smaller group that represents the whole.

Story: Every 10 years, the Indian government conducts the Census, gathering information about every person living in the country. They ask about things like how many people live in each household, their ages, and their education levels. This process collects data from the entire population of India, which is a massive task. But sometimes, the government needs information more quickly or on a specific topic. For example, if they want to know how many people in rural areas use smartphones, they conduct a sample survey. Instead of asking everyone, they select a group of villages and ask the people there. Based on this smaller group, they make an estimate for the entire country. This is how a sample survey helps when a full census is too time-consuming!

7. Secondary Data – The Mumbai College Story

Definition: Secondary data is data that has already been collected by someone else for a different purpose, which can be reused for research.

Story: A professor at a college in Mumbai wants to research the literacy rate in India. Instead of collecting the data herself by traveling all over the country, she uses data from the Indian Census, which has already been gathered by the government. This secondary data helps her understand the literacy rates without needing to start from scratch. It’s like reading a news article about an event instead of going to the event yourself—you still get the information, but it was collected by someone else!

8. Distinction Between Primary and Secondary Data – The Recipe Story

Definition: Primary data is original and collected firsthand, while secondary data has already been collected by someone else and is reused.

Story: Imagine two friends, Ravi and Priya, want to cook a traditional dish for a festival. Ravi decides to visit his grandmother and ask her how she prepares the dish. He writes down every step and gathers the ingredients himself. This is like primary data because he is collecting the information firsthand, directly from the source. Priya, on the other hand, prefers to find a recipe online. She downloads one from a popular cooking website. This is like secondary data because she didn’t collect the recipe herself—it was already available, and she’s reusing it. Both approaches work, but Ravi’s method is more personal and specific, while Priya’s is faster and easier!

9. Accuracy vs. Ease – The Detective Story

Definition: Primary data is generally more accurate but takes more time and effort to collect, while secondary data is quicker to obtain but may not be as specific.

Story: Imagine you are a detective solving a mystery. You have two ways to gather clues: You can either visit the crime scene yourself and interview witnesses (this is like collecting primary data) or you can read the police reports from other detectives who have already visited the scene (this is like using secondary data). If you go to the crime scene yourself, your information will be more accurate, but it will take more time. If you use the reports, you’ll get the clues faster, but they might not be as detailed as you’d like. As a detective (or a researcher), you need to decide: Do you want accuracy, or do you need results quickly?

1.7 Classification and Tabulation

Imagine you’ve just finished a big survey where you asked students about their hobbies, favorite subjects, and heights. You now have all this information, but it’s all jumbled up, making it hard to make sense of. That’s where classification comes in!

Classification is like organizing a messy room. It’s the process of sorting things into different groups so that it becomes easier to find patterns and understand what’s happening. For example, just like sorting mail at a post office, you might group all letters meant for one neighborhood together. In data, classification means grouping similar things to simplify them.

Example: Imagine you are working at a post office, sorting letters. Some letters need to go to “Park Street,” some to “Main Street,” and others to “Baker Street.” You don’t want to send all the letters together in one big mess. Instead, you classify them based on the street name. This way, the delivery person can easily know which streets to deliver to.

Bases for Classification:

There are four main ways to classify data:

1. Qualitative Base:

This is when you sort data based on a quality or characteristic that can’t be measured, like religion, literacy level, or intelligence.

Story Example: Imagine you’re a librarian, and you need to organize books. But instead of using book titles, you classify them by genre, like “fiction,” “non-fiction,” and “mystery.” Just like that, data can be classified based on qualities like religion—Hindu, Muslim, Christian—and not just numbers.

2. Quantitative Base:

Here, the data is sorted based on measurable characteristics like age, height, or marks.

Story Example: Let’s say you’re organizing a sports competition. You want to group players based on their height: “5 feet to 5.5 feet” and “5.6 feet to 6 feet.” This is quantitative classification because you’re sorting them by measurable numbers.

3. Geographical Base:

This means sorting data by location or geography.

Story Example: You’re a travel blogger with followers from different states. You want to know where most of your followers come from, so you classify them based on their state or country—like “Maharashtra,” “Karnataka,” or “Tamil Nadu.” This is called geographical classification.

4. Chronological or Temporal Base:

This involves sorting data based on time—like years, months, or days.

Story Example: Imagine running a coffee shop and recording sales. You could classify the sales data based on the time of year: “January,” “February,” “March,” and so on. This helps you compare how well your business did in different months. Sorting by time is called temporal classification.

Example: Let’s say you’re tracking the sales of a company. You could group the data by year (2019, 2020, 2021), or you could group it by product categories (electronics, furniture, clothing). The way you group the data depends on what you want to study or understand.

Types of Classification:

There are different ways to classify data based on how many characteristics you consider:

1. One-way Classification:

This is when you classify data based on one characteristic.

Story Example: Imagine you’re a teacher, and you want to sort students based on the subject they like best—math, science, or English. This is called one-way classification because you’re only considering one thing: favorite subject.

2. Two-way Classification:

Here, you classify data based on two characteristics at the same time.

Story Example: Now imagine that you’re not just interested in their favorite subjects, but also their gender. So, you classify the students based on both their favorite subject and whether they’re boys or girls. For example, “Boys who like math” or “Girls who like science.” This is two-way classification.

3. Multi-way Classification:

This happens when you consider more than two characteristics.

Story Example: Now, let’s say you want to go even further and also classify students based on their marks in these subjects. You now group students by their favorite subject, gender, and marks—like “Girls who like math and scored above 90%.” This is multi-way classification because you’re looking at three characteristics at the same time.
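
The three types of classification can be sketched with simple counting. The student records below are hypothetical, invented only for illustration, and `collections.Counter` is just one convenient way to count groups.

```python
from collections import Counter

# Hypothetical student records: (favorite subject, gender, marks in percent).
students = [
    ("math", "girl", 92),
    ("math", "boy", 78),
    ("science", "girl", 85),
    ("math", "girl", 95),
    ("English", "boy", 66),
]

# One-way: group by favorite subject only.
one_way = Counter(subject for subject, _, _ in students)

# Two-way: group by favorite subject and gender together.
two_way = Counter((subject, gender) for subject, gender, _ in students)

# Multi-way: subject, gender, and whether the marks are above 90%.
multi_way = Counter(
    (subject, gender, marks > 90) for subject, gender, marks in students
)

print(one_way["math"])                    # 3
print(two_way[("math", "girl")])          # 2
print(multi_way[("math", "girl", True)])  # 2
```

Each added characteristic makes the groups smaller and more specific, which is why the key grows from one value to a pair to a triple.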

1.8 Frequency Distribution

Now, let’s talk about frequency distribution. Imagine you’re a class teacher, and you’ve just graded your students’ math tests. Some students scored between 60-70, some between 70-80, and so on. To understand how many students fall into each score range, you create a frequency table.

Frequency means how often something happens. For example, if 5 students scored between 60-70, the frequency of that score range is 5.

A frequency distribution is simply a table that shows how often different values (or ranges of values) appear in the data.

Example:

You’re at a school sports day. Some students ran the 100-meter race in 10-12 seconds, some in 12-14 seconds, and so on. You make a table that shows how many students finished within each time range. This table is called a frequency distribution.

Class Intervals:

When you have a lot of different data points, it’s helpful to group them into class intervals. These are ranges that the data falls into.

1. Class Limits:

These are the boundaries of the intervals.

Story Example: If you have a class interval of 10-20, the lower class limit is 10, and the upper class limit is 20. Think of it like organizing students based on their height; one group might be for students between 150 cm and 160 cm tall.

2. Class Length:

This is the size of the interval, or the difference between the upper and lower class limits.

Story Example: For the height interval 150-160 cm, the class length would be 10 cm (160-150 = 10). This tells you how big each group is.

3. Mid-value (Class Mark):

This is the midpoint of the class interval, calculated by averaging the lower and upper class limits.

Story Example: In the height group 150-160 cm, the mid-value would be (150 + 160)/2 = 155 cm. This helps represent the “middle” value of that group.
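
The class length and mid-value calculations for the 150-160 cm group can be checked with two small helper functions. The function names are illustrative assumptions.

```python
def class_length(lower, upper):
    """Difference between the upper and lower class limits."""
    return upper - lower

def mid_value(lower, upper):
    """Average of the lower and upper class limits (the class mark)."""
    return (lower + upper) / 2

print(class_length(150, 160))  # 10
print(mid_value(150, 160))     # 155.0
```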

Types of Class Intervals

There are two main types of class intervals:

1. Inclusive Type:

In this type, both the lower and upper limits of the class interval are included.

Story Example: If you’re grouping students by age and one group is 10-15 years old, the ages 10, 11, 12, 13, 14, and 15 are all included in that group. So, a 15-year-old is part of this interval.

2. Exclusive Type:

In this type, the upper limit is not included.

Story Example: If the group is 10-15 years old, it covers ages from 10 up to, but not including, 15. So, a child who is exactly 15 would fall into the next group, like 15-20.

Story Example for Class Intervals:

Let’s say you’re in charge of organizing a big birthday party for kids of different ages. You want to group them based on their ages so that the games are appropriate. You make age groups like 5-10 years, 10-15 years, and 15-20 years.

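In the inclusive type, both limits belong to the group, so neighboring groups must not share an endpoint. The groups would be written as 5-10, 11-15, and 16-20, and a child who is exactly 10 years old would be in the 5-10 group.

In the exclusive type, the groups 5-10, 10-15, and 15-20 share endpoints and the upper limit is left out, so a 10-year-old would go into the next group, 10-15.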
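
The difference between the two types comes down to whether the upper limit is compared with "less than or equal to" or with "less than". A minimal sketch, with a hypothetical helper named `in_class`:

```python
def in_class(value, lower, upper, inclusive):
    """Return True if value belongs to the class lower-upper."""
    if inclusive:
        return lower <= value <= upper   # upper limit is included
    return lower <= value < upper        # upper limit is excluded

# Where does a 10-year-old go?
print(in_class(10, 5, 10, inclusive=True))    # True
print(in_class(10, 5, 10, inclusive=False))   # False
print(in_class(10, 10, 15, inclusive=False))  # True
```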

Class Boundaries

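Definition: Class boundaries are the real limits of class intervals. They turn inclusive classes, which leave a gap between one class and the next, into exclusive classes that touch each other.

Example: For the inclusive classes 5-9 and 10-14, the gap between the upper limit of the first class (9) and the lower limit of the next class (10) is 1, so each limit is moved by half the gap, 0.5:
– New upper class limit of the first class = 9 + 0.5 = 9.5
– New lower class limit of the second class = 10 – 0.5 = 9.5
– So, the new class boundaries are 4.5-9.5 and 9.5-14.5.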

Story/Real Implication: Imagine you’re organizing bookshelves in a library. If you leave gaps between book categories, some books might not fit well. Class boundaries are like ensuring the categories “touch” each other so no gaps are left.
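
The adjustment can be sketched as a small function. It assumes the inclusive classes are listed in order and that the gap between neighboring classes is the same throughout. The name `class_boundaries` is illustrative.

```python
def class_boundaries(classes):
    """Convert inclusive classes [(lower, upper), ...] into class boundaries.

    Assumes the classes are sorted and equally spaced, so the gap between
    the first two classes applies to all of them.
    """
    gap = classes[1][0] - classes[0][1]   # e.g. 10 - 9 = 1
    half = gap / 2                        # e.g. 0.5
    return [(lower - half, upper + half) for lower, upper in classes]

print(class_boundaries([(5, 9), (10, 14)]))  # [(4.5, 9.5), (9.5, 14.5)]
```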

Open-end Class Interval

Definition: These are intervals where the lower or upper limit is not defined.

Example: Below 10, 10-20, 20-30, 30-40, Above 40.

Story/Real Implication: Think of an “open-end” class interval like going to a store where you can only categorize prices like “less than $10” or “more than $40”. You don’t know exactly how low or high the values go, but you can group them this way.

Relative Frequency

Definition: It’s the proportion of the total frequency that a specific class represents.

Formula:
Relative Frequency = (Frequency of the Class / Total Frequency)

Example: For class interval 20-30 with a frequency of 12 out of a total of 32:
Relative Frequency = 12 / 32 = 0.375

Story/Real Implication: Think of relative frequency as figuring out how many of your classmates prefer a specific flavor of ice cream compared to the whole group. If 3 out of 10 like chocolate, then the relative frequency of chocolate lovers is 0.3 or 30%.

Percentage Frequency

Definition: It’s the relative frequency expressed as a percentage.

Formula:
Percentage Frequency = (Class Frequency / Total Frequency) × 100

Example: If 12 students out of 32 scored between 20-30 in a test, their percentage frequency is:
(12 / 32) × 100 = 37.5%

Story/Real Implication: This is like a survey where you figure out how many people (in percentage) like a specific movie genre. For instance, 40% of a group might say they love action movies, which gives a quick idea of popularity.
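
Both formulas can be checked against the worked example of 12 students out of 32. The function names below are illustrative assumptions.

```python
def relative_frequency(class_frequency, total_frequency):
    """Proportion of the total that falls in one class."""
    return class_frequency / total_frequency

def percentage_frequency(class_frequency, total_frequency):
    """Relative frequency expressed as a percentage."""
    return relative_frequency(class_frequency, total_frequency) * 100

print(relative_frequency(12, 32))    # 0.375
print(percentage_frequency(12, 32))  # 37.5
```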

Frequency Density

Definition: Frequency density is the frequency per unit of class width. It is used with unequal class intervals to show how dense the frequency is in each class.

Formula:
Frequency Density = Class Frequency / Width of Class

Example: If a class with a width of 10 (say 20-30) has a frequency of 12:
Frequency Density = 12 / 10 = 1.2

Story/Real Implication: Picture a classroom where students are packed in rows with different numbers of seats. Frequency density is like calculating how tightly packed the students are in each row. The more students per seat, the denser the row feels.
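
Frequency density matters most when class widths differ. The sketch below repeats the 20-30 example and adds a hypothetical wider class, 30-50, with the same frequency to show the contrast. The function name is an assumption.

```python
def frequency_density(frequency, lower, upper):
    """Frequency divided by the width of the class."""
    return frequency / (upper - lower)

print(frequency_density(12, 20, 30))  # 1.2  (class from the example)
print(frequency_density(12, 30, 50))  # 0.6  (hypothetical wider class)
```

The same 12 observations spread over a class twice as wide give half the density.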

Discrete Frequency Distribution

Definition: This shows how often certain distinct values (like the number of children in a family) occur in the data.

Example: Data of post-graduates in 10 families: 0, 1, 3, 1, 0, 2, 2, 2, 2, 4.

Number of Post-graduates | Frequency
0 | 2
1 | 2
2 | 4
3 | 1
4 | 1

Story/Real Implication: Think of a neighborhood where each house has a different number of children. The discrete frequency distribution helps us count how many houses have 0, 1, 2, 3, or 4 children. It’s like organizing and counting families based on how many kids they have.
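
The tally for the 10 families in the example can be reproduced by counting each distinct value. This sketch uses `collections.Counter` from the Python standard library, and the variable names are illustrative.

```python
from collections import Counter

# Number of post-graduates in each of the 10 families (from the example).
post_graduates = [0, 1, 3, 1, 0, 2, 2, 2, 2, 4]

frequency = Counter(post_graduates)
for value in sorted(frequency):
    print(value, frequency[value])
```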

Continuous Frequency Distribution

Definition: A continuous frequency distribution is used when the data can take any value within a range, so the values are grouped into class intervals (e.g., 10-20, 20-30).

Example: Marks obtained by 20 students in a test: 18, 23, 28, 29, 44, 28, 43, 44, 24, 29, 32, 39, etc.

Converted into a frequency distribution:

Marks | Frequency
15-20 | 1
20-25 | 3
25-30 | 6
30-35 | 2
35-40 | 2
40-45 | 4
45-50 | 2

Story/Real Implication: Imagine you are collecting scores from a class test. Some students score between 15 and 20, some between 25 and 30, and so on. Continuous frequency distribution helps you organize these scores into ranges so you can easily see how students performed.
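
Grouping raw values into class intervals can be sketched as follows. The helper `tabulate` and the short list of marks are hypothetical. The classes are treated as exclusive, so a value equal to an upper limit goes into the next class.

```python
def tabulate(values, start, width, n_classes):
    """Count values in exclusive classes [start, start + width), and so on.

    Values that fall outside all the classes are ignored.
    """
    counts = [0] * n_classes
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_classes:
            counts[index] += 1
    return [
        (start + i * width, start + (i + 1) * width, counts[i])
        for i in range(n_classes)
    ]

# Hypothetical marks, chosen only to illustrate the grouping.
marks = [16, 21, 24, 25, 29, 30, 34]
for lower, upper, count in tabulate(marks, start=15, width=5, n_classes=4):
    print(f"{lower}-{upper}: {count}")
```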

Cumulative Frequency Distribution

Definition: A cumulative frequency distribution shows the running total of frequencies up to each class.

Story/Real Implication: Cumulative frequency is like tracking your savings over time. Every time you add more money, you’re not just looking at the current amount but how much you’ve saved overall. It helps you understand the total build-up of data up to a certain class.
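
A running total is easy to compute step by step. The sketch below uses hypothetical class frequencies and shows both the "less than" running total and an "or more" running total that counts down from the grand total. `itertools.accumulate` is a standard-library function.

```python
from itertools import accumulate

frequencies = [2, 5, 3]   # hypothetical class frequencies

# "Less than" type: add each class to everything before it.
less_than_cf = list(accumulate(frequencies))

# "Or more" type: the total minus everything before the current class.
total = sum(frequencies)
more_than_cf = [total - cf + f for cf, f in zip(less_than_cf, frequencies)]

print(less_than_cf)  # [2, 7, 10]
print(more_than_cf)  # [10, 8, 3]
```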

Continuous Frequency Distribution

Definition: The variable takes values that are grouped into class intervals with stated limits.

Problem 2:

Marks obtained by 20 students in an exam for 50 marks are given below. Convert this data into continuous frequency distribution form:

Data: 18, 23, 28, 29, 44, 28, 43, 44, 24, 29, 32, 39, 42, 38, 49, 47, 22, 33, 28, 29

Solution:

Marks | Frequency
15-20 | 1
20-25 | 3
25-30 | 6
30-35 | 2
35-40 | 2
40-45 | 4
45-50 | 2

Problem 3:

The following data give the number of children per family for 25 families. Prepare a frequency distribution of the number of children (say variable X), taking the distinct values 0, 1, 2, 3, 4.

Data:
4, 3, 1, 1, 1, 2,
4, 3, 1, 1, 1, 1, 0,
2, 2, 2, 3, 2,
2, 1, 3, 4

Solution: Frequency distribution of the number of children in 25 families

Number of Children | Number of Families
0 | 1
1 | 7
2 | 8
3 | 6
4 | 3

Problem 4:

For the following frequency distribution, prepare the cumulative frequency distributions of the ‘less than’ and ‘greater than’ types.

Solution:

X | Frequency | Less than cf | Greater than cf
1 | 5 | 5 | 43
2 | 7 | 12 | 38
3 | 15 | 27 | 31
4 | 10 | 37 | 16
5 | 6 | 43 | 6

Cumulative Frequency Distribution

Problem 5:

The following are the marks of 50 students. Prepare cumulative frequency distributions of both types. Also find the relative frequencies.

Marks | No. of Students
0-10 | 7
10-20 | 11
20-30 | 19
30-40 | 6
40-50 | 7

Solution:

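Marks | Cumulative frequency (less than upper limit) | Cumulative frequency (more than upper limit) | Relative frequency
0-10 | 7 | 43 | 7/50
10-20 | 18 | 32 | 11/50
20-30 | 37 | 13 | 19/50
30-40 | 43 | 7 | 6/50
40-50 | 50 | 0 | 7/50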

Problem 6:

For the following frequency distribution, obtain cumulative frequencies, relative frequencies, and relative cumulative frequencies.

Class Interval | Frequency
30-50 | 8
50-70 | 15
70-90 | 25
90-110 | 16
110-130 | 7
130-150 | 4
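
One way to work through Problem 6 is to compute all three columns from the given frequencies. This is a sketch rather than an official solution: it uses the "less than" type for the cumulative frequencies, and the variable names are illustrative.

```python
from itertools import accumulate

classes = ["30-50", "50-70", "70-90", "90-110", "110-130", "130-150"]
frequencies = [8, 15, 25, 16, 7, 4]

total = sum(frequencies)                     # 75
cumulative = list(accumulate(frequencies))   # "less than" type running total
relative = [f / total for f in frequencies]
relative_cumulative = [cf / total for cf in cumulative]

# Columns: class, frequency, cumulative, relative, relative cumulative.
for row in zip(classes, frequencies, cumulative, relative, relative_cumulative):
    print("{:8} {:3} {:3} {:.3f} {:.3f}".format(*row))
```

The last relative cumulative frequency should always come out as 1, which is a quick check that no class was missed.
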

Multiple Choice Questions (MCQs)

1. What is the primary objective of studying the unit “Advanced Bank Management”?

  • a) Understanding human behavior
  • b) Learning proper methods to collect and analyze data
  • c) Managing accounting systems
  • d) Developing new banking products

Answer: b) Learning proper methods to collect and analyze data

2. What is one of the main purposes of using data management techniques in banking?

  • a) To design marketing campaigns
  • b) To predict economic trends
  • c) To collect and present data effectively
  • d) To train employees

Answer: c) To collect and present data effectively

3. The term ‘Statistics’ is derived from which Latin word?

  • a) Statista
  • b) Statisticum
  • c) Statistici
  • d) Statibus

Answer: b) Statisticum

4. Who first used the word ‘Statistics’ in reference to subject matter as a whole?

  • a) Professor Keynes
  • b) Professor Achenwall
  • c) Professor Adam Smith
  • d) Professor Ricardo

Answer: b) Professor Achenwall

5. Which one of the following phases is the first stage in the process of data analysis?

  • a) Classification of data
  • b) Presentation of data
  • c) Collection of data
  • d) Analysis of data

Answer: c) Collection of data

6. Which of the following best describes the second phase in the statistical analysis process?

  • a) Analysis of classified data
  • b) Collection of raw data
  • c) Classification and tabulation of data
  • d) Drawing conclusions from the data

Answer: c) Classification and tabulation of data

7. In which century was the word ‘Statistics’ first used in its modern sense?

  • a) 15th century
  • b) 16th century
  • c) 18th century
  • d) 19th century

Answer: c) 18th century

8. Which phase of statistical analysis involves dividing raw data into groups or categories?

  • a) Collection of data
  • b) Interpretation of data
  • c) Classification and tabulation of data
  • d) Analysis of data

Answer: c) Classification and tabulation of data

9. Which of the following is NOT mentioned as a field where statistics is widely used?

  • a) Business
  • b) Economics
  • c) Politics
  • d) Literature

Answer: d) Literature

10. What is the final stage of the statistical analysis process?

  • a) Collection of data
  • b) Interpretation of data
  • c) Tabulation of data
  • d) Presentation of data

Answer: b) Interpretation of data

11. The word “Statistics” in German is referred to as:

  • a) Statistiko
  • b) Statistik
  • c) Statista
  • d) Statisticum

Answer: b) Statistik

12. Which term is used to describe the gathering of information from primary or secondary sources?

  • a) Classification of data
  • b) Analysis of data
  • c) Collection of data
  • d) Interpretation of data

Answer: c) Collection of data

13. What is the primary focus of the classification and tabulation phase of data management?

  • a) To analyze the data for patterns
  • b) To collect relevant data
  • c) To represent raw data in a tabular format
  • d) To make predictions based on data

Answer: c) To represent raw data in a tabular format

14. Which of the following is an example of where statistics can be applied in daily life, according to the text?

  • a) Analyzing hospital patient data
  • b) Writing a novel
  • c) Building a house
  • d) Preparing a meal

Answer: a) Analyzing hospital patient data

15. What type of data is divided into different groups or classes in the tabulation process?

  • a) Primary data
  • b) Raw data
  • c) Processed data
  • d) Qualitative data

Answer: b) Raw data

16. Which phase of data analysis uses formulas and methods to understand the patterns in data?

  • a) Collection of data
  • b) Classification of data
  • c) Analysis of data
  • d) Interpretation of data

Answer: c) Analysis of data

17. What is the purpose of the ‘Interpretation of Data’ stage in statistical analysis?

  • a) To collect raw data
  • b) To divide data into groups
  • c) To analyze data using formulas
  • d) To draw conclusions from the data

Answer: d) To draw conclusions from the data

18. Which of the following areas is mentioned as having increasing use of statistics?

  • a) Poetry
  • b) Social Sciences
  • c) Art and Design
  • d) Classical Music

Answer: b) Social Sciences

19. What is the function of tabulated data in the analysis process?

  • a) It helps in analyzing data using formulas
  • b) It helps in collecting more data
  • c) It simplifies the presentation of findings
  • d) It prepares the data for interpretation

Answer: a) It helps in analyzing data using formulas

20. What is the last step before drawing conclusions in the statistical process?

  • a) Collection of data
  • b) Analysis of data
  • c) Tabulation of data
  • d) Classification of data

Answer: b) Analysis of data

21. Which of the following statements about statistics is true according to the text?

  • a) Statistics is only useful for scientific purposes.
  • b) Statistics was originally used to collect facts about the state.
  • c) Statistics does not involve any data collection.
  • d) Statistics is mainly used for artistic analysis.

Answer: b) Statistics was originally used to collect facts about the state.

22. In what context did Professor Achenwall define statistics?

  • a) As a branch of mathematics
  • b) As a method for economic analysis
  • c) As the political science of many countries
  • d) As a study of natural sciences

Answer: c) As the political science of many countries

23. What is the importance of presenting data properly in statistics?

  • a) It helps in creating beautiful graphs
  • b) It allows for making predictions and correct decisions
  • c) It reduces the workload of analysts
  • d) It eliminates the need for further data collection

Answer: b) It allows for making predictions and correct decisions

24. In which phase is data divided into groups and represented in a table?

  • a) Data interpretation
  • b) Data collection
  • c) Data classification and tabulation
  • d) Data presentation

Answer: c) Data classification and tabulation

25. Which field is NOT mentioned in the text as an area where statistics is commonly applied?

  • a) Medicine
  • b) Engineering
  • c) Politics
  • d) Commerce

Answer: b) Engineering

26. According to the objectives of this chapter, which of the following is NOT a key goal?

  • a) Making correct decisions based on data
  • b) Learning to create new types of data
  • c) Understanding how to collect and present data
  • d) Developing data analysis techniques

Answer: b) Learning to create new types of data

27. The term “Statistics” was first used in a formal sense in which year?

  • a) 1492
  • b) 1749
  • c) 1849
  • d) 1949

Answer: b) 1749

28. In statistical analysis, which phase is essential for ensuring the data is ready for further analysis?

  • a) Collection of data
  • b) Classification and Tabulation
  • c) Presentation of data
  • d) Interpretation of data

Answer: b) Classification and Tabulation

29. Which of the following fields mentioned in the text relies heavily on the use of statistics?

  • a) Arts and Music
  • b) Literature and Poetry
  • c) Business and Economics
  • d) Philosophy and Ethics

Answer: c) Business and Economics

30. In which field is statistical analysis used to study stock prices and calculate risk?

  • a) Weather Forecast
  • b) Banking
  • c) Stock Market
  • d) Medical Research

Answer: c) Stock Market

31. Which statistical method is commonly used in weather forecasting?

  • a) Correlation analysis
  • b) Time series analysis
  • c) Bayesian decision theory
  • d) Sampling techniques

Answer: b) Time series analysis

32. What is the primary application of statistics in the medical field?

  • a) To analyze sales data
  • b) To investigate clinical treatments
  • c) To analyze business strategies
  • d) To forecast weather conditions

Answer: b) To investigate clinical treatments

33. What do credit policies in banking primarily depend on?

  • a) Stock prices
  • b) Statistical analysis of profitability and other ratios
  • c) Historical data on economic growth
  • d) Marketing trends

Answer: b) Statistical analysis of profitability and other ratios

34. In business, statistical techniques like time series analysis are used to?

  • a) Analyze consumer preferences
  • b) Design new products
  • c) Predict the effect of large numbers of variables
  • d) Train employees

Answer: c) Predict the effect of large numbers of variables

35. What is the role of statistics in demand and supply analysis in economics?

  • a) To predict future trends in marketing
  • b) To calculate profit and loss
  • c) To determine elasticity of demand and maximum satisfaction
  • d) To train employees in the financial sector

Answer: c) To determine elasticity of demand and maximum satisfaction

36. What is one way that regression techniques are used in weather forecasting?

  • a) To analyze future stock prices
  • b) To predict future weather conditions
  • c) To train new meteorologists
  • d) To compare various types of weather patterns

Answer: b) To predict future weather conditions

37. What does “Bayesian Decision Theory” help businesses to achieve?

  • a) Improve employee retention
  • b) Optimize decisions by evaluating payoffs
  • c) Predict customer satisfaction levels
  • d) Develop new products

Answer: b) Optimize decisions by evaluating payoffs

38. What is one of the benefits of using time series analysis in business?

  • a) To collect historical data
  • b) To predict sales trends and market fluctuations
  • c) To calculate future stock prices
  • d) To analyze employee behavior

Answer: b) To predict sales trends and market fluctuations

39. What is the application of statistics in banking?

  • a) To assess loan risks based on historical data
  • b) To predict stock market behavior
  • c) To analyze customer satisfaction
  • d) To measure employee performance

Answer: a) To assess loan risks based on historical data

40. How do players in sports use statistics?

  • a) To predict the outcome of games
  • b) To identify or rectify their mistakes
  • c) To recruit new players
  • d) To evaluate the history of the sport

Answer: b) To identify or rectify their mistakes

41. Which of the following is a function of statistics?

  • a) Statistics eliminate the need for forecasting
  • b) Statistics prevent data from being biased
  • c) Statistics help in simplifying complex data
  • d) Statistics replace individual analysis with machine learning

Answer: c) Statistics help in simplifying complex data

42. Which of the following is NOT a function of statistics?

  • a) Statistics simplify complex data
  • b) Statistics help in forecasting outcomes
  • c) Statistics eliminate the need for data analysis
  • d) Statistics provide a technique of comparison

Answer: c) Statistics eliminate the need for data analysis

43. What is a limitation of statistics?

  • a) Statistics are only useful in business contexts
  • b) Statistics do not deal with individuals
  • c) Statistics always provide accurate results
  • d) Statistics exclude quantitative data

Answer: b) Statistics do not deal with individuals

44. Why can’t statistics be applied to qualitative data?

  • a) Qualitative data is subjective
  • b) Qualitative data is too complex for statistics
  • c) Qualitative data cannot be measured in terms of quantity or numbers
  • d) Qualitative data requires machine learning to analyze

Answer: c) Qualitative data cannot be measured in terms of quantity or numbers

45. What is one reason statistical methods may not always give the correct result?

  • a) They require too many samples to work
  • b) They cannot be applied to large groups
  • c) They only provide an average of the data
  • d) They are biased by default

Answer: c) They only provide an average of the data

46. When are the results of statistics considered biased?

  • a) When data is collected by inexperienced or dishonest persons
  • b) When data is collected from too many people
  • c) When data is collected using digital methods
  • d) When data is analyzed without machine learning

Answer: a) When data is collected by inexperienced or dishonest persons

47. What is the definition of a population in statistics?

  • a) A small group that represents the whole
  • b) The entire group we are interested in for drawing conclusions
  • c) A sample of a larger group of data
  • d) A collection of numerical data points only

Answer: b) The entire group we are interested in for drawing conclusions

48. In the example, what is the population if we are studying the weight of adult men in India?

  • a) A group of men from a particular city
  • b) The set of weights of all men in India
  • c) Men from rural areas only
  • d) Data collected from a survey of male athletes

Answer: b) The set of weights of all men in India

49. Which of the following is not studied under statistics?

  • a) The relationship between two or more variables
  • b) Individual qualitative observations
  • c) Group observations and averages
  • d) Complex data for comparison

Answer: b) Individual qualitative observations

50. What is the population if we are studying the grade point average of students at Mumbai University?

  • a) A group of selected students
  • b) The set of GPAs of all students of Mumbai University
  • c) Only students with high GPAs
  • d) A sample of students from a single department

Answer: b) The set of GPAs of all students of Mumbai University

51. Why is a sample often used in statistical studies?

  • a) It is impossible to study an entire population
  • b) To save time and money when the population is too large to study
  • c) Samples provide more accurate data than populations
  • d) Sampling eliminates the need for further analysis

Answer: b) To save time and money when the population is too large to study

52. What is a sample in statistics?

  • a) A small subset of the population
  • b) The entire group we are studying
  • c) A characteristic that varies among individuals
  • d) A parameter of the population

Answer: a) A small subset of the population

53. In the given example, what would be the sample for a study on infant health in India?

  • a) All infants born in India in one year
  • b) All infants born on one particular day in one year
  • c) All children below the age of 5
  • d) A group of mothers with infants

Answer: b) All infants born on one particular day in one year

54. What is a variate in statistics?

  • a) A characteristic that cannot be expressed numerically
  • b) A characteristic that varies from one individual to another and can be expressed in numerical terms
  • c) A group of individuals with the same characteristics
  • d) A subset of the population

Answer: b) A characteristic that varies from one individual to another and can be expressed in numerical terms

55. Which of the following is an example of a variate?

  • a) Religion of humans
  • b) Colour of the ball
  • c) Weight of students in a class
  • d) Gender of employees

Answer: c) Weight of students in a class

56. What is an attribute in statistics?

  • a) A numerical characteristic that can vary
  • b) A characteristic that cannot be expressed in numerical terms
  • c) A type of data collection method
  • d) A type of statistical error

Answer: b) A characteristic that cannot be expressed in numerical terms

57. Which of the following is an example of an attribute?

  • a) Number of accidents
  • b) Age of students
  • c) Colour of the ball
  • d) Number of members in a family

Answer: c) Colour of the ball

58. What type of data is referred to as a discrete variate?

  • a) Data that takes any value within a range
  • b) Data that takes only a countable and usually finite number of values
  • c) Data that is difficult to analyze
  • d) Data that can only be measured in continuous ranges

Answer: b) Data that takes only a countable and usually finite number of values

59. Which of the following is an example of a discrete variate?

  • a) Percentage of marks
  • b) Age in years
  • c) Height of students
  • d) Weight of students

Answer: b) Age in years

60. What is a continuous variate in statistics?

  • a) A variate that takes distinct values only
  • b) A variate that can take any value within a range
  • c) A variate that applies only to qualitative data
  • d) A variate that applies only to finite values

Answer: b) A variate that can take any value within a range

61. Which of the following is an example of a continuous variate?

  • a) Number of children in a family
  • b) Percentage of marks
  • c) Number of accidents
  • d) Number of students in a class

Answer: b) Percentage of marks

62. What is a parameter in statistics?

  • a) A sample size chosen for the study
  • b) A numerical value or function of the observations of the entire population
  • c) A characteristic that is observed during the study
  • d) The number of observations in a study

Answer: b) A numerical value or function of the observations of the entire population

63. What is the role of statistics in relation to the parameter?

  • a) It eliminates the need to estimate parameters
  • b) It is used to estimate unknown parameters from a sample
  • c) It provides accurate predictions without the need for data
  • d) It collects data from the entire population

Answer: b) It is used to estimate unknown parameters from a sample

64. What is an example of a parameter in the given context?

  • a) Population mean
  • b) Age of students
  • c) Colour of the ball
  • d) Number of children in a family

Answer: a) Population mean

65. Sorting letters in a post office based on their addresses is an example of which process?

  • a) Classification
  • b) Tabulation
  • c) Data Analysis
  • d) Sampling

Answer: a) Classification

66. What is the main purpose of classification in data analysis?

  • a) To make the data more complex
  • b) To condense data by removing unimportant details
  • c) To enlarge the dataset
  • d) To change the structure of data

Answer: b) To condense data by removing unimportant details

67. How is data classified in the “Quantitative Base”?

  • a) By geographical regions
  • b) By qualitative characteristics like religion
  • c) By numerical characteristics such as age or income
  • d) By the color of objects

Answer: c) By numerical characteristics such as age or income

68. What is an example of Geographical Base classification?

  • a) Classification by income level
  • b) Classification by color
  • c) Classification by states or countries
  • d) Classification by gender

Answer: c) Classification by states or countries

69. What is Chronological Base classification?

  • a) Classifying data by numerical characteristics
  • b) Classifying data by time periods
  • c) Classifying data by religious affiliation
  • d) Classifying data by geographical region

Answer: b) Classifying data by time periods

70. What is One-way classification?

  • a) Classifying data based on more than two characteristics
  • b) Classifying data based on one characteristic
  • c) Classifying data using only qualitative data
  • d) Classifying data using only geographical data

Answer: b) Classifying data based on one characteristic

71. In Two-way classification, how is the data classified?

  • a) By two characteristics at the same time
  • b) By one characteristic only
  • c) By geographical region only
  • d) By qualitative data only

Answer: a) By two characteristics at the same time

72. What does frequency in a frequency distribution represent?

  • a) The total amount of data collected
  • b) The number of occurrences of a value in a given set of observations
  • c) The sum of all numerical values
  • d) The variation in a dataset

Answer: b) The number of occurrences of a value in a given set of observations

73. How is frequency distribution typically represented?

  • a) In the form of graphs
  • b) In a table showing values of the variable and corresponding frequencies
  • c) In a list of descriptive statistics
  • d) Using complex algebraic equations

Answer: b) In a table showing values of the variable and corresponding frequencies
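
As a minimal sketch of the table described in questions 72 and 73, the Python snippet below counts how often each value occurs and prints the values next to their frequencies. The marks list is made-up illustration data, and frequency_table is a hypothetical helper name, not something defined in these notes.

```python
from collections import Counter


def frequency_table(observations):
    """Return (value, frequency) pairs sorted by value."""
    counts = Counter(observations)
    return sorted(counts.items())


# Hypothetical marks of ten students (illustration data only).
marks = [4, 5, 5, 6, 6, 6, 7, 7, 8, 5]

table = frequency_table(marks)
print("Value  Frequency")
for value, frequency in table:
    print(f"{value:>5}  {frequency:>9}")
```

The frequencies in such a table always add up to the total number of observations, which is a quick check that no value was missed.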

74. What is Primary Data?

  • a) Data collected from secondary sources
  • b) Data collected directly from respondents for the first time
  • c) Data from internet sources
  • d) Data that has already been processed

Answer: b) Data collected directly from respondents for the first time

75. What is the main advantage of using questionnaires for data collection?

  • a) They collect only qualitative data
  • b) They allow respondents to provide large volumes of data quickly
  • c) They eliminate the need for analysis
  • d) They are used only in small-scale studies

Answer: b) They allow respondents to provide large volumes of data quickly

76. What is the main difference between a census and a sample survey?

  • a) Census collects data from the entire population while a sample survey collects data from a subset
  • b) Sample surveys collect more accurate data than a census
  • c) Census uses questionnaires while sample surveys do not
  • d) There is no difference

Answer: a) Census collects data from the entire population while a sample survey collects data from a subset

77. What is Secondary Data?

  • a) Data collected for the first time
  • b) Data that has already been collected and processed by someone else
  • c) Data collected through direct observation
  • d) Data collected using interviews

Answer: b) Data that has already been collected and processed by someone else

78. What is the distinction between primary and secondary data?

  • a) Primary data is collected for the first time; secondary data has been collected previously
  • b) Secondary data is always more accurate than primary data
  • c) Primary data comes from books; secondary data comes from surveys
  • d) There is no distinction between primary and secondary data

Answer: a) Primary data is collected for the first time; secondary data has been collected previously

79. What is the formula for calculating the Class Length or Class Width?

  • a) Class Length = Lower Class Limit – Upper Class Limit
  • b) Class Length = Total Frequency / Number of Intervals
  • c) Class Length = Upper Class Limit – Lower Class Limit
  • d) Class Length = Frequency / Class Interval

Answer: c) Class Length = Upper Class Limit – Lower Class Limit

80. What is the formula to calculate Mid-Value or Class Mark?

  • a) Class Mark = (Lower Class Limit + Upper Class Limit) / 2
  • b) Class Mark = (Upper Class Limit – Lower Class Limit) / 2
  • c) Class Mark = Total Frequency / Number of Classes
  • d) Class Mark = (Upper Class Frequency – Lower Class Frequency) / 2

Answer: a) Class Mark = (Lower Class Limit + Upper Class Limit) / 2
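
The two formulas in questions 79 and 80 can be sketched in Python as follows. This assumes an exclusive-type class such as 10-20, and the function names class_width and class_mark are illustrative choices, not part of the notes.

```python
def class_width(lower_limit, upper_limit):
    """Class length (width) = upper class limit - lower class limit."""
    return upper_limit - lower_limit


def class_mark(lower_limit, upper_limit):
    """Mid-value (class mark) = (lower class limit + upper class limit) / 2."""
    return (lower_limit + upper_limit) / 2


width = class_width(10, 20)
mark = class_mark(10, 20)
print(width, mark)
```

For the class 10-20 this gives a width of 10 and a class mark of 15.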

81. Which type of class interval includes both the upper and lower class limits?

  • a) Exclusive type
  • b) Inclusive type
  • c) Open-end type
  • d) None of the above

Answer: b) Inclusive type

82. What is an example of an exclusive type of class interval?

  • a) 10-20
  • b) 0-10
  • c) 500-1000
  • d) All of the above

Answer: d) All of the above

83. In an inclusive type of class interval, if the class interval is 10-15, which of the following would be the upper class limit?

  • a) 15
  • b) 10
  • c) 12.5
  • d) 16

Answer: a) 15

84. The data of some workers’ salaries is given as 2300, 2400, 2500, 2100, 2000, 2000, 2300, 2800, 3000, 3200, 2700, 2400, and 2500. If the desired number of class intervals is 5, what is the class width?

  • a) 100
  • b) 200
  • c) 300
  • d) 400

Solution:
The class width can be calculated using the formula:
Class width = (Maximum value – Minimum value) / Number of class intervals.
Maximum salary = 3200, Minimum salary = 2000
Class width = (3200 – 2000) / 5 = 1200 / 5 = 240
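The computed class width is 240, which is not among the options; the closest listed option is b) 200. In practice the width is usually rounded up (for example, to 250) so that five classes starting at 2000 still cover the maximum value of 3200.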

85. The largest and smallest values of a data set are 100 and 60 respectively. If the desired number of class intervals is 8, what is the class width?

  • a) 5
  • b) 8
  • c) 10
  • d) 40

Solution:
Class width = (Maximum value – Minimum value) / Number of class intervals.
= (100 – 60) / 8 = 40 / 8 = 5.
Therefore, the class width is a) 5.
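
The calculation used in questions 84 and 85 can be checked with a short Python sketch. The helper name suggested_class_width and the choice to round up to a multiple of 50 are assumptions made for illustration; the notes themselves only give the basic formula.

```python
import math


def suggested_class_width(values, number_of_classes):
    """Raw width = (maximum - minimum) / number of class intervals."""
    return (max(values) - min(values)) / number_of_classes


# Salaries from question 84.
salaries = [2300, 2400, 2500, 2100, 2000, 2000, 2300,
            2800, 3000, 3200, 2700, 2400, 2500]

raw_width = suggested_class_width(salaries, 5)

# Assumption: round up to the next multiple of 50 so that five classes
# starting at the minimum still reach the maximum value.
rounded_width = math.ceil(raw_width / 50) * 50

# Question 85 only gives the smallest and largest values.
q85_width = suggested_class_width([60, 100], 8)

print(raw_width, rounded_width, q85_width)
```

Rounding up rather than down keeps the largest observation inside the last class.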

86. For the data set: 20, 25, 30, 35, 40, calculate the mean.

  • a) 25
  • b) 30
  • c) 35
  • d) 40

Solution:
The mean is calculated as the sum of the values divided by the number of values.
Mean = (20 + 25 + 30 + 35 + 40) / 5 = 150 / 5 = 30.
Therefore, the mean is b) 30.

87. If the class intervals are 10-20, 20-30, 30-40, and the frequencies are 5, 10, 15 respectively, what is the cumulative frequency for the class interval 30-40?

  • a) 5
  • b) 10
  • c) 15
  • d) 30

Solution:
Cumulative frequency is calculated by adding the frequency of a class interval to the frequencies of all previous class intervals.
Cumulative frequency for 30-40 = 5 (10-20) + 10 (20-30) + 15 (30-40) = 30.
Therefore, the cumulative frequency is d) 30.
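
The running total described in this solution can be sketched in Python with itertools.accumulate from the standard library. The variable names are illustrative; the class intervals and frequencies are those of question 87.

```python
from itertools import accumulate

classes = ["10-20", "20-30", "30-40"]
frequencies = [5, 10, 15]

# Each cumulative frequency is the class's own frequency plus all earlier ones.
cumulative = list(accumulate(frequencies))

for label, frequency, running_total in zip(classes, frequencies, cumulative):
    print(label, frequency, running_total)
```

The last cumulative frequency always equals the total frequency, here 30.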

88. If the frequency for the class interval 40-50 is 8, what is the cumulative frequency if the cumulative frequency of the previous class interval (30-40) is 15?

  • a) 23
  • b) 15
  • c) 8
  • d) 30

Solution:
Cumulative frequency for the class interval 40-50 = Previous cumulative frequency + Current frequency.
= 15 + 8 = 23.
Therefore, the cumulative frequency is a) 23.

89. Calculate the mode of the following data: 5, 10, 10, 15, 20, 20, 20, 25, 30.

  • a) 10
  • b) 15
  • c) 20
  • d) 25

Solution:
The mode is the value that occurs most frequently in the dataset.
In this case, 20 appears three times, more than any other number.
Therefore, the mode is c) 20.

90. In a class interval 50-60 with a frequency of 18 and a total frequency of 90, what is the percentage frequency?

  • a) 5%
  • b) 10%
  • c) 20%
  • d) 15%

Solution:
Percentage frequency = (Frequency of the interval / Total frequency) * 100.
= (18 / 90) * 100 = 20%.
Therefore, the percentage frequency is c) 20%.

91. Find the median for the following data set: 12, 15, 18, 22, 24, 30.

  • a) 15
  • b) 18
  • c) 20
  • d) 22

Solution:
For an even number of values, the median is the average of the two middle values.
Median = (18 + 22) / 2 = 20.
Therefore, the median is c) 20.
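
The mean, mode, and median worked out by hand in questions 86, 89, and 91 can be cross-checked with Python's standard statistics module. This is only a verification sketch; the variable names are illustrative.

```python
import statistics

# Data sets from questions 86, 89, and 91.
mean_q86 = statistics.mean([20, 25, 30, 35, 40])
mode_q89 = statistics.mode([5, 10, 10, 15, 20, 20, 20, 25, 30])
median_q91 = statistics.median([12, 15, 18, 22, 24, 30])

print(mean_q86, mode_q89, median_q91)
```

For an even number of values, statistics.median averages the two middle values, which matches the rule used in question 91.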

92. In a continuous frequency distribution, the lower class limit of a class is 25 and the class width is 10. What is the upper class limit of the class?

  • a) 30
  • b) 35
  • c) 40
  • d) 45

Solution:
The upper class limit is calculated by adding the class width to the lower class limit.
Upper class limit = 25 + 10 = 35.
Therefore, the upper class limit is b) 35.

93. For the class interval 1000-1500 with a frequency of 10, what is the frequency density given that the class width is 500?

  • a) 2
  • b) 0.02
  • c) 0.5
  • d) 20

Solution:
Frequency density = Frequency / Class width.
Frequency density = 10 / 500 = 0.02.
Therefore, the frequency density is b) 0.02.

94. Calculate the variance for the following data set: 10, 12, 14, 16, 18.

  • a) 6
  • b) 8
  • c) 10
  • d) 12

Solution:
First, calculate the mean: Mean = (10 + 12 + 14 + 16 + 18) / 5 = 70 / 5 = 14.
Variance formula (population variance, dividing by the number of values):
Variance = [(10-14)^2 + (12-14)^2 + (14-14)^2 + (16-14)^2 + (18-14)^2] / 5
= [16 + 4 + 0 + 4 + 16] / 5 = 40 / 5 = 8.
Therefore, the variance is b) 8.
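
The solution above divides by the number of values, which is the population variance. As a hedged sketch, the Python snippet below compares it with the sample variance, which divides by one less than the number of values. Both functions come from the standard statistics module.

```python
import statistics

data = [10, 12, 14, 16, 18]

# Population variance: sum of squared deviations divided by n.
population_variance = statistics.pvariance(data)

# Sample variance: sum of squared deviations divided by n - 1.
sample_variance = statistics.variance(data)

print(population_variance, sample_variance)
```

The two conventions give 8 and 10 for this data, so it matters which one a question intends; the solution here uses the population form.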

95. In a survey, 60 students participated, and the average marks were found to be 75 with a standard deviation of 10. Calculate the variance.

  • a) 100
  • b) 75
  • c) 10
  • d) 50

Solution:
Variance is the square of the standard deviation.
Variance = (Standard Deviation)^2 = 10^2 = 100.
Therefore, the variance is a) 100.

96. The mode of the data set 5, 10, 10, 15, 15, 15, 20 is:

  • a) 5
  • b) 10
  • c) 15
  • d) 20

Solution:
The mode is the number that appears most frequently.
In the given set, 15 appears 3 times, more than any other number.
Therefore, the mode is c) 15.

97. What is the cumulative frequency for the class interval 40-50 if the previous cumulative frequency is 50 and the frequency of the interval 40-50 is 10?

  • a) 50
  • b) 60
  • c) 70
  • d) 80

Solution:
Cumulative frequency for 40-50 = Previous cumulative frequency + Current frequency.
= 50 + 10 = 60.
Therefore, the cumulative frequency is b) 60.

98. In a group of data, if the mean is 60 and the median is 50, what can be inferred about the skewness of the data?

  • a) Positively skewed
  • b) Negatively skewed
  • c) Symmetrical
  • d) Cannot be determined

Solution:
In a positively skewed distribution, the mean is greater than the median.
Since the mean (60) is greater than the median (50), the data is a) Positively skewed.
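
The comparison of mean and median used in this solution can be sketched in Python. The function skew_direction and the incomes list are hypothetical illustrations, and the comparison is a rule of thumb rather than a formal measure of skewness.

```python
import statistics


def skew_direction(values):
    """Compare mean and median, following the rule of thumb in question 98."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "positively skewed"
    if mean < median:
        return "negatively skewed"
    return "symmetrical (by this rule)"


# Made-up incomes with one very large value pulling the mean upward.
incomes = [20, 25, 30, 30, 35, 40, 200]

direction = skew_direction(incomes)
print(direction)
```

A single large value raises the mean well above the median, which is the pattern the question describes.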

99. For the class interval 20-30 with a frequency of 5, and the total frequency being 50, what is the relative frequency?

  • a) 5%
  • b) 10%
  • c) 15%
  • d) 20%

Solution:
Relative frequency = (Frequency of the interval / Total frequency) * 100.
= (5 / 50) * 100 = 10%.
Therefore, the relative frequency is b) 10%.
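
Questions 90 and 99 use the same formula, frequency divided by total frequency, expressed as a percentage. A minimal Python sketch, with percentage_frequency as an illustrative function name:

```python
def percentage_frequency(frequency, total_frequency):
    """Share of the total frequency that falls in one class, in percent."""
    return 100 * frequency / total_frequency


q90 = percentage_frequency(18, 90)  # class 50-60 in question 90
q99 = percentage_frequency(5, 50)   # class 20-30 in question 99

print(q90, q99)
```

Summed over all classes of a distribution, these percentages add up to 100.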

100. What is the mid-point of the class interval 50-60?

  • a) 50
  • b) 55
  • c) 60
  • d) 65

Solution:
Mid-point = (Lower class limit + Upper class limit) / 2.
= (50 + 60) / 2 = 110 / 2 = 55.
Therefore, the mid-point is b) 55.

101. What is the range of the data set: 100, 90, 80, 70, 60?

  • a) 30
  • b) 40
  • c) 50
  • d) 60

Solution:
Range = Maximum value – Minimum value.
Range = 100 – 60 = 40.
Therefore, the range is b) 40.

102. The class interval for a set of data is 10-20. The lower boundary is 10, and the class width is 10. What is the upper boundary?

  • a) 25
  • b) 20
  • c) 15
  • d) 30

Solution:
Upper boundary = Lower boundary + Class width.
Upper boundary = 10 + 10 = 20.
Therefore, the upper boundary is b) 20.

103. If the mean of a dataset is 40 and the standard deviation is 5, what is the coefficient of variation?

  • a) 12.5%
  • b) 25%
  • c) 50%
  • d) 75%

Solution:
Coefficient of variation = (Standard Deviation / Mean) * 100.
= (5 / 40) * 100 = 12.5%.
Therefore, the coefficient of variation is a) 12.5%.
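
The coefficient of variation formula from this solution can be written as a small Python function. The name coefficient_of_variation is an illustrative choice.

```python
def coefficient_of_variation(standard_deviation, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return 100 * standard_deviation / mean


cv = coefficient_of_variation(5, 40)
print(cv)
```

Because it is a ratio, the coefficient of variation has no units, which makes it useful for comparing the spread of data sets measured on different scales.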

Fullscreen View Fix with Hindi Version

Definition, Importance, and Limitations of Statistics, and Data Collection, Classification, and Tabulation

1.1 Introduction

Statistics is the practice of collecting the numbers or figures related to the field that our needs and interests belong to, analyzing them, and making decisions on that basis.

In simple terms, statistics is the process by which data in a particular field is collected, analyzed, and used to make decisions.

Have you ever wondered where the word ‘statistics’ comes from?

The word ‘statistics’ originates from the Latin word ‘statisticum’, which means relating to the state.

The German scholar Gottfried Achenwall was the first to use the word ‘statistics’.
