Стандартне відхилення проти стандартної помилки: у чому різниця?

Twins from Different Universes

Фото Мартін Санчес on Unsplash

Standard Deviation and Standard Error are two statistical concepts that often cause confusion. Do they have the same interpretations or they are meant to represent something totally different? We’ll discuss more in this post.

What is Standard Deviation (SD)?

Команда стандартне відхилення вимірює мінливість (він же, поширення) of data points around the значити in a given dataset. In other words, it tells us, on average, how far each data point is away from the mean.

Стандартне відхилення населення

In the real world, we’re interested in estimating a certain characteristic in a населення. Standard deviation is an example of these characteristics.

Коли у вас є ALL the data points from a population, you can compute the ІСТИНА value of the population standard deviation using the following formula.

Зображення автора

Зразок стандартного відхилення

Oftentimes, it is difficult to collect all the data points from the population due to time, financial, or technical limitations. For example, if we would like to compute the ІСТИНА standard deviation of household income in Los Angeles, we would need to get income from all the households in Los Angeles, which is almost impossible to do.

Instead, we can collect random samples from the population and make inferences about the population standard deviation using Sample Standard Deviation. The formula for sample standard deviation is

Зображення автора

Why use N-1 for sample standard deviation?

You will notice that we are using the sample mean (x̄) instead of the population mean (μ) for the sample standard deviation because we don’t know anything about the population mean. x̄ is a reasonable estimate for μ.

Therefore, any value X in the sample dataset would be closer to x̄ than to μ. The numerator in the sample standard deviation would get artificially smaller than it is supposed to be. As a result, the sample standard deviation would be занижена.

To correct this зміщення in the sample standard deviation, we would use “N-1” instead of “N” (також, Поправка Бесселя) for sample standard deviation.

Using N-1 would make the sample standard deviation larger than otherwise using N. Therefore, we have a less biased estimate of the population standard deviation, giving us a conservative estimate of variability.

What is Standard Error (SE)?

Before we discuss the Standard Error, let’s first get familiar with the concepts of Розповсюдження зразків та Розподіл вибірки.

Sample Distribution vs Sampling Distribution

Команда sample distribution просто розповсюдження даних of the sample which is randomly taken from the population.

For example, we ask 100 random people in Los Angeles what their incomes are. The sample distribution describes the АКТУАЛЬНІ income distribution in these 100 people.

But what is Sampling Distribution?

Команда sampling distribution є distribution of the sample statistic (e.g., the sample mean, sample variance, sample standard deviation, and sample proportion) over many samples drawn from the same population (i.e., repeated sampling).

For example, we ask 100 random people in Los Angeles what their incomes are. Then compute the average income. We repeat this 1000 times, then we have 1000 different average incomes. The distribution of these 1000 average incomes is called the sampling distribution.

Таким чином, sample distribution is the distribution of the зразкові дані в той час як sampling distribution is the distribution of the sample statistic.

The concept is стандартна помилка is relevant to the sampling distribution, NOT the sample distribution.

Команда Стандартна помилка is a metric that describes the variability of a statistic в sampling distribution.

How to interpret Standard Error (SE)?

The Standard Error measures how far the вибіркова статистика (e.g., sample mean) is likely to be from the true population statistic (e.g., the population mean).

Why do we need Standard Error (SE)?

Typically you might want to construct довірчі інтервали when we try to make statistical inferences, and it is more informative to assign a probability to construct a confidence interval that contains the mean.

  • If the underlying data are normally distributed, then the sampling distribution is also normally distributed. Then we can say we are 68% confident that the population mean lies within 1 standard error or 95% will be within 2 standard errors, etc.
  • If the underlying data are NOT normally distributed, but the sample size is large enough, we can rely on Центральна гранична теорема (CLT) to say the sampling distribution is approximately normally distributed, then we can make similar statements about confidence intervals.

How to compute Standard Error (SE)?

We typically use the following formula to compute the standard error. I will discuss how to derive this formula in the next sections.

Зображення автора

What are the examples of Standard Error?

Standard Error can be applied to various types of статистики. Деякі популярні приклади

  • The Standard Error of the Sample Mean (aka, the standard error of the mean, SEM)
  • The Standard Error of the Sample Proportion (aka, the standard error of the proportion, SEP)

What is the Standard Error of the Mean (SEM)?

The standard error of the mean (or simply standard error), indicates how different the середня вибірка is likely to be from the населення середнє.

Technically, the standard error of the mean is computed as the standard deviation of the sample mean.

Зображення автора

Hypothetically, we can compute the standard error under repeated samples using the following steps:

  1. Draw a new sample from the population.
  2. Compute the sample mean of the drawn sample in Step 1
  3. Repeat Steps 1 and 2 multiple times.
  4. The standard error is obtained by computing the standard deviation of the previous steps’ sample means.

Завдяки Центральна гранична теорема (CLT), we don’t need to consider the Sampling Distribution under repeated samples. Instead, the sampling distribution of the sample means can be estimated from just ONE random sample.

The Central Limit Theorem states that the sample mean has an approximately normal distribution with a mean of μ і standard deviation (or standard error) of σ/√n.

How to derive the formula for SEM?

Зображення автора

Таким чином,

Зображення автора

In most cases, the standard deviation of the population data is unknown. We will estimate it using the standard deviation of the sample data (sample standard deviation).

Таким чином,

Зображення автора

What is the Standard Error of the Proportion(SEP)?

The standard error of the proportion indicates how different the пропорція вибірки is likely to be from the частка населення.

The standard error of the proportion is computed as the standard deviation of the sample proportions.

Зображення автора

You will notice that in each sample data, we only have data either 1 or 0. Each value follows a Bernouilli distribution. The computed sample proportions are no longer binary values. Instead, they could be any value between 0 and 1.

The Central Limit Theorem states that the sample proportion has an approximately normal distribution with a середнє значення p і standard deviation (or standard error) of √P(1-P)/√n, where P is the population proportion.

How to derive the formula for SEP?

Зображення автора

Similar to SEM,

Зображення автора
Зображення автора

We can estimate σ using the sample standard deviation √p(1-p) (i.e., the standard deviation of a Bernouilli distribution)

Зображення автора

Висновок:

Standard Deviation and Standard Error are similar concepts that both are used to measure мінливість.

Standard Deviation indicates how the sample data values are different from the mean in the sample distribution.

Стандартна помилка indicates how the sample data statistics are different from the population statistic in the sampling distribution.

Thank you for reading !!!

If you enjoy this article and would like to Buy Me a Coffee, будь ласка натисніть тут.

Ви можете підписатись на членство to unlock full access to my articles, and have unlimited access to everything on Medium. Please підписуватися if you’d like to get an email notification whenever I post a new article.

Standard Deviation vs Standard Error: What’s the Difference? Republished from Source https://towardsdatascience.com/standard-deviation-vs-standard-error-whats-the-difference-ae969f48adef?source=rss—-7f60cf5620c9—4 via https://towardsdatascience.com/feed

<!–

->

Часова мітка:

Більше від Консультанти з блокчейнів