SciTech-Mathematics-Probability+Statistics-Population Vs. Sampling: Representative Samples + How to obtain Samples
Difference: Population vs. Sample
BY ZACH BOBBITT, POSTED ON NOVEMBER 27, 2020
Often in statistics we're interested in collecting data so that we can answer some research question.
For example, we might want to answer the following questions:
- What is the median household income in Miami, Florida?
- What is the mean weight of a certain population of turtles?
- What percentage of residents in a certain county support a certain law?
In each scenario, we are interested in answering some question about a population, which represents every possible individual element that we're interested in measuring.
However, instead of collecting data on every individual in a population, we collect data on a sample, which is a portion of the population.
Population: Every possible individual element that we are interested in measuring.
Sample: A portion of the population.
Here is what the population and the sample might look like in each of the three introductory examples.
Three Examples
- What is the median household income in Miami, Florida?
The entire population might include 500,000 households, but we might only collect data on a sample of 2,000 total households.
- What is the mean weight of a certain population of turtles?
The entire population might include 800 turtles, but we might only collect data on a sample of 30 turtles.
- What percentage of residents in a certain county support a certain law?
The entire population might include 50,000 residents, but we might only collect data on a sample of 1,000 residents.
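To make the scale of these examples concrete, the short sketch below computes the sampling fraction (sample size divided by population size) for each one. The population and sample sizes are the illustrative figures from the examples; the names `examples` and `sampling_fraction` are made up for this sketch.

```python
# Population and sample sizes taken from the three examples above.
examples = {
    "Miami households": (500_000, 2_000),
    "turtles": (800, 30),
    "county residents": (50_000, 1_000),
}


def sampling_fraction(population_size, sample_size):
    """Return the share of the population that ends up in the sample."""
    return sample_size / population_size


fractions = {
    name: sampling_fraction(population_size, sample_size)
    for name, (population_size, sample_size) in examples.items()
}

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.2%} of the population is sampled")
```

In all three cases the sample covers only a small percentage of the population, which is typical in practice.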
Why Use Samples?
There are several reasons that we typically collect data on samples instead of entire populations, including:
- It is too time-consuming to collect data on an entire population. For example, if we want to know the median household income in Miami, Florida, it might take months or even years to go around and gather income for each household. By the time we collect all of this data, the population may have changed or the research question might no longer be relevant.
- It is too costly to collect data on an entire population. It is often too expensive to go around and collect data for every individual in a population, which is why we choose to collect data on a sample instead.
- It is unfeasible to collect data on an entire population. In many cases it's simply not possible to collect data for every individual in a population. For example, it may be extraordinarily difficult to track down and weigh every turtle in a certain population that we're interested in.
By collecting data on samples, we're able to gather information about a given population much more quickly and cheaply.
And if our sample is representative of the population, then we can generalize the findings from a sample to the larger population with a high level of confidence.
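As a rough illustration of generalizing from a sample, the sketch below simulates the turtle example. The weights are generated at random (the mean of 150 and standard deviation of 20 are arbitrary assumptions, not real turtle data), so it only shows the mechanics: draw 30 of 800 individuals at random and compare the sample mean with the population mean.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 800 simulated turtle weights (arbitrary units).
population = [rng.gauss(150, 20) for _ in range(800)]

# Simple random sample of 30 turtles, drawn without replacement.
sample = rng.sample(population, 30)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"Population mean: {population_mean:.1f}")
print(f"Sample mean:     {sample_mean:.1f}")
```

With a random sample, the sample mean usually lands close to the population mean even though only 30 of the 800 values were measured.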
The Importance of Representative Samples
When we collect a sample from a population, we ideally want the sample to be like a "mini version" of our population.
For example, suppose we want to understand the movie preferences of students in a certain school district that has a population of 5,000 total students. Since it would take too long to survey every individual student, we might instead take a sample of 100 students and ask them about their preferences.
If the overall student population is composed of 50% girls and 50% boys, our sample would not be representative if it included 90% boys and only 10% girls.
Or if the overall population is composed of equal parts freshmen, sophomores, juniors, and seniors, then our sample would not be representative if it only included freshmen.
A sample is representative of a population if the characteristics of the individuals in the sample closely match the characteristics of the individuals in the overall population.
When this occurs, we can generalize the findings from the sample to the overall population with confidence.
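One simple way to check this, sketched below, is to compare the share of each category in the sample with its share in the population. The helper names `proportions` and `max_proportion_gap` are invented for this illustration, and the two samples are hypothetical; the population figures come from the school district example.

```python
from collections import Counter


def proportions(values):
    """Return each category's share of the values as a dict."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def max_proportion_gap(population_values, sample_values):
    """Largest absolute difference between population and sample shares."""
    pop = proportions(population_values)
    samp = proportions(sample_values)
    categories = set(pop) | set(samp)
    return max(abs(pop.get(c, 0.0) - samp.get(c, 0.0)) for c in categories)


# School district from the example: 5,000 students, half girls and half boys.
population_gender = ["girl"] * 2500 + ["boy"] * 2500

# Two hypothetical samples of 100 students each.
skewed_sample = ["boy"] * 90 + ["girl"] * 10
balanced_sample = ["boy"] * 50 + ["girl"] * 50

skewed_gap = max_proportion_gap(population_gender, skewed_sample)
balanced_gap = max_proportion_gap(population_gender, balanced_sample)

print(f"Skewed sample gap:   {skewed_gap:.2f}")
print(f"Balanced sample gap: {balanced_gap:.2f}")
```

A large gap on a characteristic that matters for the research question, such as the 90% boys sample here, is a sign that the sample is not representative on that characteristic.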
How to Obtain Samples
There are many different methods we can use to obtain samples from populations.
To maximize the chances that we obtain a representative sample, we can use one of the three following methods:
- Simple random sampling: Randomly select individuals through the use of a random number generator or some other means of random selection.
- Stratified random sampling: Split the population into groups. Randomly select some members from each group to be in the sample.
- Systematic random sampling: Put every member of a population into some order. Choose a random starting point and select every nth member to be in the sample.
In each of these methods, every individual in the population has an equal probability of being included in the sample. This maximizes the chances that we obtain a sample that is a “mini version” of the population.
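The three methods above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not a reference implementation: the function names are made up, the student population is simulated, and the stratified version assumes the same fraction is drawn from every group, which is what keeps each individual's chance of selection equal.

```python
import random


def simple_random_sample(population, n, rng):
    """Randomly select n members, without replacement."""
    return rng.sample(population, n)


def stratified_random_sample(population, key, fraction, rng):
    """Split into groups by key, then randomly select the same
    fraction of members from each group (an assumption of this sketch)."""
    strata = {}
    for member in population:
        strata.setdefault(key(member), []).append(member)
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))
    return sample


def systematic_random_sample(population, step, rng):
    """Choose a random starting point, then take every step-th member."""
    start = rng.randrange(step)
    return population[start::step]


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Simulated district: 5,000 students as (id, grade), equal parts per grade.
grades = ["freshman", "sophomore", "junior", "senior"]
students = [(i, grades[i % 4]) for i in range(5000)]

srs = simple_random_sample(students, 100, rng)
stratified = stratified_random_sample(students, lambda s: s[1], 0.02, rng)
systematic = systematic_random_sample(students, 50, rng)

print(len(srs), len(stratified), len(systematic))
```

Each call returns 100 of the 5,000 students. The stratified sample is guaranteed to contain 25 students from each grade, while the simple random sample only matches those shares approximately.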