In which type of sampling does every element in the population?

Population vs Sample

The population includes all objects of interest whereas the sample is only a portion of the population. Parameters are associated with populations and statistics with samples. Parameters are usually denoted using Greek letters (mu, sigma) while statistics are usually denoted using Roman letters (x, s).

There are several reasons why we don't work with populations. They are usually large, and it is often impossible to get data for every object we're studying. Sampling does not usually occur without cost, and the more items surveyed, the larger the cost.

We compute statistics and use them to estimate parameters. The computation is the first part of the statistics course (Descriptive Statistics) and the estimation is the second part (Inferential Statistics).
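As a minimal sketch of that idea, the snippet below treats a made-up list of 1,000 values as the population, computes the parameter (the population mean, mu), and then estimates it with a statistic (the sample mean, x-bar) from a random sample. The data, the seed, and the sample size of 50 are illustrative assumptions, not part of the course material.

```python
import random
import statistics

# Hypothetical population: 1,000 made-up measurements (an assumption for illustration).
rng = random.Random(0)
population = [rng.gauss(170, 10) for _ in range(1000)]

# Parameter: computed from every element of the population.
mu = statistics.mean(population)

# Statistic: computed from a sample and used to estimate the parameter.
sample = rng.sample(population, 50)
x_bar = statistics.mean(sample)

print(f"population mean (parameter): {mu:.2f}")
print(f"sample mean (statistic):     {x_bar:.2f}")
```

The two numbers will usually be close but not identical, and that gap is what inferential statistics tries to quantify.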

Discrete vs Continuous

Discrete variables are usually obtained by counting. There is a finite or countable number of choices available with discrete data. You can't have 2.63 people in the room.

Continuous variables are usually obtained by measuring. Length, weight, and time are all examples of continuous variables. Since continuous variables are real numbers, we usually round them. This implies a boundary depending on the number of decimal places. For example: 64 is really anything 63.5 <= x < 64.5. Likewise, if there are two decimal places, then 64.03 is really anything 64.025 <= x < 64.035. Boundaries always have one more decimal place than the data and end in a 5.
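The boundary rule can be sketched in a few lines of Python. The helper name `boundaries` is hypothetical, and the value is passed as a string so that the number of recorded decimal places is preserved. This is one possible way to compute the boundaries, not a standard library routine.

```python
from decimal import Decimal

def boundaries(value: str):
    """Return (lower, upper) for the interval lower <= x < upper that a rounded value stands for."""
    d = Decimal(value)
    # Half of one unit in the last recorded decimal place,
    # e.g. 0.5 for "64" and 0.005 for "64.03".
    half_unit = Decimal(5).scaleb(d.as_tuple().exponent - 1)
    return d - half_unit, d + half_unit

print(boundaries("64"))     # (Decimal('63.5'), Decimal('64.5'))
print(boundaries("64.03"))  # (Decimal('64.025'), Decimal('64.035'))
```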

Levels of Measurement

There are four levels of measurement: Nominal, Ordinal, Interval, and Ratio. These go from lowest level to highest level. Data is classified according to the highest level which it fits. Each additional level adds something the previous level didn't have.

  • Nominal is the lowest level. Only names are meaningful here.
  • Ordinal adds an order to the names.
  • Interval adds meaningful differences.
  • Ratio adds a zero so that ratios are meaningful.

Types of Sampling

There are five types of sampling: Random, Systematic, Convenience, Cluster, and Stratified.

  • Random sampling is analogous to putting everyone's name into a hat and drawing out several names. Each element in the population has an equal chance of being selected. While this is the preferred way of sampling, it is often difficult to do, because it requires a complete list of every element in the population. Computer-generated lists are often used with random sampling. You can generate random numbers using the TI82 calculator.
  • Systematic sampling is easier to do than random sampling. In systematic sampling, the list of elements is "counted off". That is, every kth element is taken. This is similar to lining everyone up and numbering off "1,2,3,4; 1,2,3,4; etc". When done numbering, all people numbered 4 would be used.
  • Convenience sampling is very easy to do, but it's probably the worst technique to use. In convenience sampling, readily available data is used. That is, the sample consists of the first people the surveyor runs into.
  • Cluster sampling is accomplished by dividing the population into groups, usually geographically. These groups are called clusters or blocks. The clusters are randomly selected, and each element in the selected clusters is used.
  • Stratified sampling also divides the population into groups called strata. However, this time it is by some characteristic, not geographically. For instance, the population might be separated into males and females. A sample is taken from each of these strata using either random, systematic, or convenience sampling.



Understanding Sampling Methods (Visuals and Code)


Sampling is the process of selecting a subset (a predetermined number of observations) from a larger population. It is a common technique that lets us run experiments and draw conclusions about the population without having to study the entire population. In this blog, we will go through two types of sampling methods:

  1. Probability Sampling: here we choose a sample based on the theory of probability, so every member of the population has a known chance of being selected.
  2. Non-Probability Sampling: here we choose a sample based on non-random criteria, and not every member of the population has a chance of being included.

Random Sampling

Under random sampling, every element of the population has an equal probability of being selected. The figure below illustrates this: all the points collectively represent the entire population, and every point has an equal chance of being selected.

Random Sampling

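You can implement it in Python as shown below:

import random

population = 100
data = range(population)
# draw 5 distinct elements, each equally likely to be chosen
print(random.sample(data, 5))
# example output (varies from run to run): [4, 19, 82, 45, 41]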

Stratified Sampling

Under stratified sampling, we group the entire population into subpopulations (strata) by some common property, for example the class labels in a typical ML classification task. We then randomly sample from each group separately, so that the groups keep the same ratio in the sample as they had in the entire population. The figure below illustrates this: we have two groups, distinguished by colour, with counts in the ratio x to 4x. We randomly sample from the yellow and green sets separately, and the final set preserves that ratio.

Stratified Sampling

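You can implement it very easily using Python's sklearn library, as shown below:

from sklearn.model_selection import train_test_split

# `population` is assumed to be a pandas DataFrame with a `label` column
# keep 10% of the rows while preserving the label proportions
stratified_sample, _ = train_test_split(population, test_size=0.9, stratify=population['label'])
print(stratified_sample)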

You can also implement it without the library.
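A minimal library-free sketch might look like the following. The function name `stratified_sample`, the yellow/green data, and the 10% fraction are assumptions chosen to mirror the 1:4 example above. The rule of keeping at least one element per stratum is a design choice, not something the original post specifies.

```python
import random
from collections import defaultdict

def stratified_sample(items, labels, fraction, seed=None):
    """Draw the same fraction from every label group (stratum)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item, label in zip(items, labels):
        strata[label].append(item)
    sample = []
    for members in strata.values():
        # Keep at least one element per stratum so no group disappears.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical data: 20 "yellow" and 80 "green" items (a 1:4 ratio).
items = list(range(100))
labels = ["yellow"] * 20 + ["green"] * 80
picked = stratified_sample(items, labels, fraction=0.1, seed=42)
print(sorted(picked))
```

With these inputs the sample holds 2 yellow and 8 green items, so the 1:4 ratio of the population is preserved.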

Cluster Sampling

In cluster sampling, we divide the entire population into subgroups, each of which has characteristics similar to those of the population as a whole. Then, instead of sampling individuals, we randomly select entire subgroups. In the figure below, there are 4 clusters with similar properties (size and shape); we randomly select two clusters and treat them as the sample.

Cluster Sampling

Real-life example: a class of 120 students is divided into groups of 12 for a common class project. Clustering parameters such as designation, class, and topic are similar across the groups here as well.

You can implement it in Python as shown below:

import random
import numpy as np

clusters = 5
pop_size = 100
sample_clusters = 2
# assign cluster ids 1 to 5 sequentially, 20 consecutive elements per cluster
cluster_ids = np.repeat(range(1, clusters + 1), pop_size // clusters)
# randomly pick whole clusters (random.sample needs a sequence, not a set)
cluster_to_select = random.sample(range(1, clusters + 1), sample_clusters)
# positions of the elements that belong to the selected clusters
indexes = [i for i, x in enumerate(cluster_ids) if x in cluster_to_select]
# the population elements are numbered 1 to 100
cluster_associated_elements = [el for idx, el in enumerate(range(1, pop_size + 1)) if idx in indexes]
print(cluster_associated_elements)

Systematic Sampling

Systematic sampling is about sampling items from the population at regular, predefined intervals (that is, fixed and periodic intervals), for example every 5th element or every 21st element. It is often simpler to carry out than plain random sampling, and it works well as long as the list has no periodic pattern that lines up with the interval. The figure below illustrates this: we sample every 9th and then every 7th element in order, and then repeat this pattern.

Systematic Sampling

You can implement it in Python as shown below:

population = 100
step = 5
# take every 5th element, starting from element 1
sample = list(range(1, population, step))
print(sample)

Multistage sampling

Under multistage sampling, we stack multiple sampling methods one after the other. For example, at the first stage, cluster sampling can be used to choose clusters from the population, and then random sampling can be used to choose elements from the selected clusters to form the final set. The figure below illustrates this.

Multi-stage Sampling

You can implement it in Python as shown below:

import random
import numpy as np

clusters = 5
pop_size = 100
sample_clusters = 2
sample_size = 5
# assign cluster ids 1 to 5 sequentially, 20 consecutive elements per cluster
cluster_ids = np.repeat(range(1, clusters + 1), pop_size // clusters)
# stage 1: randomly pick whole clusters (random.sample needs a sequence, not a set)
cluster_to_select = random.sample(range(1, clusters + 1), sample_clusters)
indexes = [i for i, x in enumerate(cluster_ids) if x in cluster_to_select]
cluster_associated_elements = [el for idx, el in enumerate(range(1, pop_size + 1)) if idx in indexes]
# stage 2: randomly sample elements from the selected clusters
print(random.sample(cluster_associated_elements, sample_size))

Convenience Sampling

Under convenience sampling, the researcher includes only those individuals who are most accessible and available to participate in the study. The figure below illustrates this: the blue dot is the researcher and the orange dots are the most accessible people in the researcher's vicinity.

Convenience Sampling
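A toy sketch of why this can go wrong: assume (purely for illustration) that the population is a list of ages in the order the researcher happens to meet people, with the youngest first. Taking the first few people is convenient, but the sample mean ends up far from the population mean.

```python
import statistics

# Hypothetical population: ages in the order the researcher meets people,
# assumed here to be sorted so that the youngest come first.
population = list(range(18, 78))  # 60 people aged 18..77

# Convenience sample: simply the first 10 people encountered.
convenience_sample = population[:10]

print(statistics.mean(population))          # 47.5
print(statistics.mean(convenience_sample))  # 22.5
```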

Voluntary Sampling

Under voluntary sampling, interested people choose to take part themselves, usually by filling in some sort of survey form. A good example of this is the YouTube survey asking "Have you seen any of these ads", which has been shown a lot recently. Here, the researcher conducting the survey has no say in who takes part. The figure below illustrates this: the blue dot is the researcher, and the orange ones are those who voluntarily agreed to take part in the study.

Voluntary Sampling

Snowball Sampling

Under snowball sampling, the final set is chosen via other participants, i.e. the researcher asks known contacts to find people who would like to participate in the study. The figure below illustrates this: the blue dot is the researcher, the orange ones are the researcher's known contacts, and the yellow ones (the orange dots' contacts) are other people who agreed to participate in the study.

Snowball Sampling
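One way to picture the referral process is as a walk over a contact network, wave by wave. The network, the names, and the helper `snowball_sample` below are all hypothetical. Real snowball studies depend on who actually agrees to take part, which this sketch ignores.

```python
# Hypothetical contact network: who can refer whom (an assumption for illustration).
contacts = {
    "researcher": ["ana", "ben"],
    "ana": ["cara", "dev"],
    "ben": ["dev", "eli"],
    "cara": [],
    "dev": ["fay"],
    "eli": [],
    "fay": [],
}

def snowball_sample(contacts, start, waves):
    """Collect participants by following referrals for a fixed number of waves."""
    recruited = []
    seen = {start}
    current = [start]
    for _ in range(waves):
        next_wave = []
        for person in current:
            for referral in contacts.get(person, []):
                if referral not in seen:  # recruit each person only once
                    seen.add(referral)
                    recruited.append(referral)
                    next_wave.append(referral)
        current = next_wave
    return recruited

print(snowball_sample(contacts, "researcher", waves=2))
# ['ana', 'ben', 'cara', 'dev', 'eli']
```

The first wave corresponds to the researcher's own contacts (the orange dots) and the second wave to their contacts (the yellow dots).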


In which type of sampling does every element in the population have a known non zero probability?

Probability sampling is a technique in which every unit in the population has a chance (non-zero probability) of being selected in the sample, and this chance can be accurately determined.

Which type of sampling is where every element in the population being studied has a known chance of being selected for the study?

Simple random sampling (SRS) is the most basic form of probability sampling. In SRS, every possible sample of a given size has an equal chance of being selected. Consequently, each sampling unit of the population also has an equal chance of being included in the sample.
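For a tiny population this can be checked by brute force. The sketch below lists every possible sample of size n = 2 from a hypothetical population of N = 4 units and counts how often each unit appears; the population labels are made up for illustration.

```python
from fractions import Fraction
from itertools import combinations

population = ["A", "B", "C", "D"]  # hypothetical population, N = 4
n = 2                              # sample size

# Under SRS every possible sample of size n is equally likely.
all_samples = list(combinations(population, n))
print(len(all_samples))  # 6

# Inclusion probability of each unit = share of samples that contain it.
inclusion = {
    unit: Fraction(sum(unit in s for s in all_samples), len(all_samples))
    for unit in population
}
print(inclusion["A"])  # 1/2, which equals n / N
```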

What type of sampling is when you number every element in the population and then select every kth element from the list?

Systematic sampling is a method of sampling from a list of the population so that the sample is made up of every kth member on the list, after randomly selecting a starting point from 1 to k.
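A minimal sketch of this definition, including the random starting point, could look like the following. The function name `systematic_sample` and the numbered list of 100 elements are assumptions for illustration.

```python
import random

def systematic_sample(items, k, seed=None):
    """Take every k-th item after a random start among the first k items."""
    rng = random.Random(seed)
    start = rng.randint(1, k)    # random starting position from 1 to k (inclusive)
    return items[start - 1::k]   # convert the 1-based position to a 0-based index

population = list(range(1, 101))  # hypothetical list numbered 1..100
sample = systematic_sample(population, k=5, seed=1)
print(sample)
```

Because the list length (100) is a multiple of k (5), every choice of starting point yields a sample of exactly 20 elements.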

What is the type of sampling technique where each element of population?

Simple random sampling. Simple random sampling requires using randomly generated numbers to choose a sample. More specifically, it initially requires a sampling frame, a list or database of all members of a population.