🕷️ Crawler Inspector

URL Lookup

Direct Parameter Lookup

Raw Queries and Responses

1. Shard Calculation

Query:
Response:
Calculated Shard: 129 (from laksa102)
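Shard assignment for a URL key is typically a hash taken modulo the shard count. The sketch below is a hypothetical reconstruction only: the hash function (SHA-1), the key format, and the shard count of 512 are all assumptions, not the inspector's actual implementation.

```python
import hashlib

NUM_SHARDS = 512  # assumed cluster-wide constant, not confirmed by the inspector

def shard_for(url_key: str) -> int:
    """Map a canonical URL key to a shard id: hash, then modulo shard count."""
    digest = hashlib.sha1(url_key.encode("utf-8")).digest()
    # Interpret the first 8 bytes of the digest as an unsigned integer.
    h = int.from_bytes(digest[:8], "big")
    return h % NUM_SHARDS

print(shard_for("gov,nih!nlm,ncbi,pmc,/articles/PMC5370305/"))
```

The key property is determinism: every component that resolves this key must land on the same shard, so the hash must be stable across processes and hosts.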

2. Crawled Status Check

Query:
Response:

3. Robots.txt Check

Query:
Response:

4. Spam/Ban Check

Query:
Response:

5. Seen Status Check

ℹ️ Skipped - page is already crawled

📄
INDEXABLE
CRAWLED
3 months ago
🤖
ROBOTS ALLOWED

Page Info Filters

| Filter | Status | Condition | Details |
|---|---|---|---|
| HTTP status | PASS | download_http_code = 200 | HTTP 200 |
| Age cutoff | PASS | download_stamp > now() - 6 MONTH | 4 months ago |
| History drop | PASS | isNull(history_drop_reason) | No drop reason |
| Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0 |
| Canonical | PASS | meta_canonical IS NULL OR = '' OR = src_unparsed | Not set |
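The filter chain can be sketched in plain Python. This is a minimal, hedged reconstruction assuming a page record is a flat dict; the field names mirror the conditions in the filter table, but the real evaluation presumably happens in the index's query engine, not client-side.

```python
from datetime import datetime, timedelta

def passes_filters(page: dict, now: datetime) -> dict:
    """Return per-filter pass/fail booleans for one page record.

    Field names follow the filter table: download_http_code, download_stamp,
    history_drop_reason, fh_dont_index, ml_spam_score, meta_canonical,
    src_unparsed. 6 MONTH is approximated as 182 days.
    """
    return {
        "HTTP status": page.get("download_http_code") == 200,
        "Age cutoff": page.get("download_stamp") > now - timedelta(days=182),
        "History drop": page.get("history_drop_reason") is None,
        "Spam/ban": page.get("fh_dont_index") != 1 and page.get("ml_spam_score") == 0,
        "Canonical": page.get("meta_canonical") in (None, "", page.get("src_unparsed")),
    }

# The page shown in this report, as a record:
page = {
    "download_http_code": 200,
    "download_stamp": datetime(2025, 12, 8, 16, 14, 9),
    "history_drop_reason": None,
    "fh_dont_index": 0,
    "ml_spam_score": 0,
    "meta_canonical": None,
    "src_unparsed": "gov,nih!nlm,ncbi,pmc,/articles/PMC5370305/",
}
print(passes_filters(page, now=datetime(2026, 3, 10)))
```

A page is indexable only when every filter passes; returning per-filter booleans rather than a single flag is what lets the inspector show which condition failed.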

Page Details

| Property | Value |
|---|---|
| URL | https://pmc.ncbi.nlm.nih.gov/articles/PMC5370305/ |
| Last Crawled | 2025-12-08 16:14:09 (3 months ago) |
| First Indexed | not set |
| HTTP Status Code | 200 |
| Meta Title | Central limit theorem: the cornerstone of modern statistics - PMC |
| Meta Description | According to the central limit theorem, the means of a random sample of size, n, from a population with mean, µ, and variance, σ², distribute normally with mean, µ, and variance, σ²/n. Using the central limit theorem, a variety of parametric tests ... |
| Meta Canonical | null |
Boilerpipe Text
Abstract According to the central limit theorem, the means of a random sample of size, n, from a population with mean, µ, and variance, σ², distribute normally with mean, µ, and variance, σ²/n. Using the central limit theorem, a variety of parametric tests have been developed under assumptions about the parameters that determine the population probability distribution. Compared to non-parametric tests, which do not require any assumptions about the population probability distribution, parametric tests produce more accurate and precise estimates with higher statistical powers. However, many medical researchers use parametric tests to present their data without knowledge of the contribution of the central limit theorem to the development of such tests. Thus, this review presents the basic concepts of the central limit theorem and its role in binomial distributions and the Student's t-test, and provides an example of the sampling distributions of small populations. A proof of the central limit theorem is also described with the mathematical concepts required for its near-complete understanding. Keywords: Normal distribution, Probability, Statistical distributions, Statistics Introduction The central limit theorem is the most fundamental theory in modern statistics. Without this theorem, parametric tests based on the assumption that sample data come from a population with fixed parameters determining its probability distribution would not exist. With the central limit theorem, parametric tests have higher statistical power than non-parametric tests, which do not require probability distribution assumptions. Currently, multiple parametric tests are used to assess the statistical validity of clinical studies performed by medical researchers; however, most researchers are unaware of the value of the central limit theorem, despite their routine use of parametric tests.
Thus, clinical researchers would benefit from knowing what the central limit theorem is and how it has become the basis for parametric tests. This review aims to address these topics. The proof of the central limit theorem is described in the appendix, with the necessary mathematical concepts (e.g., moment-generating function and Taylor's formula) required for understanding the proof. However, some mathematical techniques (e.g., differential and integral calculus) were omitted due to space limitations. Basic Concepts of Central Limit Theorem In statistics, a population is the set of all items, people, or events of interest. In reality, however, collecting all such elements of the population requires considerable effort and is often not possible. For example, it is not possible to investigate the proficiency of every anesthesiologist, worldwide, in performing awake nasotracheal intubations. To make inferences regarding the population, however, a subset of the population (sample) can be used. A sample of sufficient size that is randomly selected can be used to estimate the parameters of the population using inferential statistics. A finite number of samples are attainable from the population depending on the size of the sample and the population itself. For example, all samples with a size of 1 obtained at random, with replacement, from the population {3,6,9,30} would be {3},{6},{9}, and {30}. If the sample size is 2, a total of 4 × 4 = 4² = 16 samples, which are {3,3},{3,6},{3,9},{3,30},{6,3},{6,6}...{9,30},{30,3},{30,6},{30,9}, and {30,30}, would be possible. In this way, 4^n samples with a size of n would be obtained from the population. Here, we consider the distribution of the sample means. For example, here is the asymmetric population with a size of 4 as presented above: {3, 6, 9, 30}. The population mean, µ, and variance, σ², are µ = (3 + 6 + 9 + 30)/4 = 12 and σ² = 112.5. A simple random sampling with replacement from the population produces 4 × 4 × 4 = 4³ = 64 samples with a size of 3 (Table 1).
The mean and variance of the 64 sample means are 12 (the population mean) and 37.5 = σ²/n = 112.5/3, respectively; however, the distribution of the means of samples is skewed (Fig. 1A).

Table 1. Samples with a Size of 3 and Their Means.

| Number | Sample | Sample mean |
|---|---|---|
| 1 | 3 3 3 | 3 |
| 2 | 3 3 6 | 4 |
| 3 | 3 3 9 | 5 |
| 4 | 3 3 30 | 12 |
| 5 | 3 6 3 | 4 |
| 6 | 3 6 6 | 5 |
| 7 | 3 6 9 | 6 |
| 8 | 3 6 30 | 13 |
| 9 | 3 9 3 | 5 |
| 10 | 3 9 6 | 6 |
| (truncated) | | |
| 54 | 30 6 6 | 14 |
| 55 | 30 6 9 | 15 |
| 56 | 30 6 30 | 22 |
| 57 | 30 9 3 | 14 |
| 58 | 30 9 6 | 15 |
| 59 | 30 9 9 | 16 |
| 60 | 30 9 30 | 23 |
| 61 | 30 30 3 | 21 |
| 62 | 30 30 6 | 22 |
| 63 | 30 30 9 | 23 |
| 64 | 30 30 30 | 30 |

Fig. 1. Histogram representing the means for samples of sizes of 3 (A), 6 (B), 9 (C), and 12 (D).

When a simple random sampling with replacement is performed for samples with a size of 6, 4 × 4 × 4 × 4 × 4 × 4 = 4⁶ = 4,096 samples are possible (Table 2). The mean and variance of the 4,096 sample means are 12 (the population mean) and 18.75 = σ²/n = 112.5/6, respectively. Compared to the distribution of the means of samples with a size of 3, that of the means of samples with a size of 6 is less skewed. Importantly, the sample means also gather around the population mean (Fig. 1B). Thus, the larger the sample size (n), the more closely the sample means gather symmetrically around the population mean (µ), with a corresponding reduction in the variance (σ²/n) (Figs. 1C and 1D). If Figs. 1A to 1D are converted to the probability density function by replacing the variable “frequency” with the variable “probability” on the vertical axis, their shapes remain unchanged. Table 2. Samples with a Size of 6 and Their Means.
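The enumeration in the extracted text above is easy to verify with a short script (an editor-added illustration, not part of the crawled page): listing all 4³ = 64 equally likely samples of size 3 from {3, 6, 9, 30} and computing the mean and variance of their sample means reproduces 12 and 37.5 = 112.5/3 exactly.

```python
from itertools import product

population = [3, 6, 9, 30]
mu = sum(population) / len(population)                             # 12.0
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)  # 112.5

n = 3
# All 4**3 = 64 samples of size 3, drawn with replacement.
means = [sum(s) / n for s in product(population, repeat=n)]
mean_of_means = sum(means) / len(means)
var_of_means = sum((m - mean_of_means) ** 2 for m in means) / len(means)

print(len(means), mean_of_means, var_of_means)  # 64 12.0 37.5
```

Changing `n = 3` to `n = 6` enumerates the 4,096 samples of Table 2 and yields a variance of 18.75 = 112.5/6, matching the article's figures.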
| Number | Sample | Sample mean |
|---|---|---|
| 1 | 3 3 3 3 3 3 | 3 |
| 2 | 3 3 3 3 3 6 | 3.5 |
| 3 | 3 3 3 3 3 9 | 4 |
| 4 | 3 3 3 3 3 30 | 7.5 |
| 5 | 3 3 3 3 6 3 | 3.5 |
| 6 | 3 3 3 3 6 6 | 4 |
| 7 | 3 3 3 3 6 9 | 4.5 |
| 8 | 3 3 3 3 6 30 | 8 |
| 9 | 3 3 3 3 9 3 | 4 |
| 10 | 3 3 3 3 9 6 | 4.5 |
| (truncated) | | |
| 4087 | 30 30 30 30 6 9 | 22.5 |
| 4088 | 30 30 30 30 6 30 | 26 |
| 4089 | 30 30 30 30 9 3 | 22 |
| 4090 | 30 30 30 30 9 6 | 22.5 |
| 4091 | 30 30 30 30 9 9 | 23 |
| 4092 | 30 30 30 30 9 30 | 26.5 |
| 4093 | 30 30 30 30 30 3 | 25.5 |
| 4094 | 30 30 30 30 30 6 | 26 |
| 4095 | 30 30 30 30 30 9 | 26.5 |
| 4096 | 30 30 30 30 30 30 | 30 |

In general, as the sample size from the population increases, its mean gathers more closely around the population mean with a decrease in variance. Thus, as the sample size approaches infinity, the sample means approximate the normal distribution with a mean, µ, and a variance, σ²/n. As shown above, the skewed distribution of the population does not affect the distribution of the sample means as the sample size increases. Therefore, the central limit theorem indicates that if the sample size is sufficiently large, the means of samples obtained using a random sampling with replacement are distributed normally with the mean, µ, and the variance, σ²/n, regardless of the population distribution. Refer to the appendix for a near-complete proof of the central limit theorem, as well as the basic mathematical concepts required for its proof. Central Limit Theorem in the Real World An unbiased, symmetric, 6-sided die is rolled at random n times. The probability of rolling the number 3 x times in n successive independent trials has the following probability density distribution, which is called the binomial distribution: P(X = x) = C(n, x) (1/6)^x (1 − 1/6)^(n−x), where n is the number of independent trials (rolls of the die), x is the number of times the number 3 is rolled, 1/6 is the probability of rolling the number 3 in each trial, and 1 − 1/6 is the probability of rolling a number other than 3 in each trial.
The mathematical expectation of the random variable, X (i.e., the number of times that the number 3 is rolled in each trial), which is also referred to as the mean of the distribution, is E(X) = np, and the variance is Var(X) = np(1 − p). When n = 10, the probability has a skewed distribution (Fig. 2A); however, as n increases, the distribution becomes symmetric with respect to its mean (Figs. 2B–2D). As n approaches infinity, the binomial distribution approximates the normal distribution with a mean, np, and a variance, np(1 − p), where p is the probability constant for the occurrence of the specific event during each trial. Fig. 2. The probability density function of a binomial distribution with a probability parameter of 1/6 (i.e., the probability of rolling the number 3 in each trial), based on the number of trials. Central Limit Theorem in the Student's t-test Since the central limit theorem determines the sampling distribution of the means with a sufficient size, a specific mean (X̄) can be standardized, z = (X̄ − µ)/(σ/√n), and subsequently identified against the normal distribution with mean of 0 and variance of 1. In reality, however, the lack of a known population variance (σ²) prevents a determination of the probability density distribution. Here, Xᵢ (i = 1, 2, ..., n) is a sample from the population, N is the size of the population, and µ is the mean of the population. Notably, the Student's t-distribution was developed to use a sample variance (S²) instead of a population variance (σ²), where xᵢ (i = 1, 2, ..., n) is a random sample from the population, n is the sample size, and X̄ is the mean of the samples. The specific mean (X̄) is studentized, t = (X̄ − µ)/(S/√n), and its location is evaluated on the Student's t-distribution, based on the degree of freedom (n − 1). The shape of the Student's t-distribution is dependent on the degree of freedom.
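The studentization step described in the extracted text can be shown with a few lines of Python (an editor-added illustration; the data values are invented for the example, not taken from the article): with the population variance unknown, the sample standard deviation S replaces σ and the statistic is referred to the t-distribution with n − 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

# Illustrative sample and hypothesized population mean (both assumptions).
sample = [11, 9, 14, 12, 10, 13, 8, 15]
mu0 = 10.0
n = len(sample)

xbar = mean(sample)          # sample mean
s = stdev(sample)            # sample standard deviation (n - 1 denominator)
t = (xbar - mu0) / (s / sqrt(n))  # studentized mean: t = (x̄ - µ) / (S / √n)
print(round(t, 3))           # 1.732
```

The value would then be compared against the t-distribution with n − 1 = 7 degrees of freedom; with σ known, the same formula with σ in place of S gives the z statistic instead.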
A low degree of freedom renders the peak of the Student's t-distribution lower than that of a normal distribution, although at some points, the tails have higher values than those of the normal distribution. As the degree of freedom increases, the Student's t-distribution approaches the normal distribution. At a degree of freedom of 30, the Student's t-distribution is regarded as equaling the normal distribution [1]. The underlying assumption for the Student's t-test is that samples should be obtained from a normally distributed population. However, since the distribution of the population is not known, it should be determined whether the sample is normally distributed. This is particularly true for small sample sizes. If small sample sizes are normally distributed, the studentized distribution of the sample means is equal to the Student's t-distribution with a degree of freedom corresponding to the sample size. If the small samples are not normally distributed, non-parametric tests should be performed instead of the Student's t-test, since they do not require assumptions about the distribution of the population. If the sample size is 30, the studentized sampling distribution approximates the standard normal distribution, and assumptions about the population distribution are meaningless since the sampling distribution is considered normal, according to the central limit theorem. Therefore, even if the mean of a sample of size > 30 is studentized using the sample variance, a normal distribution can be used for the probability distribution. Conclusions A comprehensive understanding of the central limit theorem will assist medical investigators when performing parametric tests to analyze their data with high statistical powers. The use of this theorem will also aid in the design of study protocols that are based on best-fit statistics.
Appendix Moment-generating function Moment-generating function of a normal distribution Mathematical expectation of multiplication between stochastically independent random variables Taylor's formula Central limit theorem References
Markdown
![Korean Journal of Anesthesiology logo](https://cdn.ncbi.nlm.nih.gov/pmc/banners/logo-kjanesth.gif) Korean J Anesthesiol. 2017 Feb 21;70(2):144–156.
doi: [10.4097/kjae.2017.70.2.144](https://doi.org/10.4097/kjae.2017.70.2.144) # Central limit theorem: the cornerstone of modern statistics [Sang Gyu Kwak](https://pubmed.ncbi.nlm.nih.gov/?term=%22Kwak%20SG%22%5BAuthor%5D) 1, [Jong Hae Kim](https://pubmed.ncbi.nlm.nih.gov/?term=%22Kim%20JH%22%5BAuthor%5D) 2,✉ 1Department of Medical Statistics, School of Medicine, Catholic University of Daegu, Daegu, Korea. 2Department of Anesthesiology and Pain Medicine, School of Medicine, Catholic University of Daegu, Daegu, Korea. ✉ Corresponding author: Jong Hae Kim, M.D. Department of Anesthesiology and Pain Medicine, School of Medicine, Catholic University of Daegu, 33, Duryugongwon-ro 17-gil, Nam-gu, Daegu 42472, Korea. Tel: 82-53-650-4979, Fax: 82-53-650-4517, usmed@cu.ac.kr Received 2016 Sep 30; Revised 2016 Nov 17; Accepted 2017 Jan 2; Issue date 2017 Apr.
Copyright © the Korean Society of Anesthesiologists, 2017 This is an open-access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (<http://creativecommons.org/licenses/by-nc/4.0/>), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. [PMC Copyright notice](https://pmc.ncbi.nlm.nih.gov/about/copyright/) PMCID: PMC5370305 PMID: [28367284](https://pubmed.ncbi.nlm.nih.gov/28367284/) ## Abstract According to the central limit theorem, the means of a random sample of size, *n*, from a population with mean, µ, and variance, σ², distribute normally with mean, µ, and variance, σ²/*n*. Using the central limit theorem, a variety of parametric tests have been developed under assumptions about the parameters that determine the population probability distribution. Compared to non-parametric tests, which do not require any assumptions about the population probability distribution, parametric tests produce more accurate and precise estimates with higher statistical powers. However, many medical researchers use parametric tests to present their data without knowledge of the contribution of the central limit theorem to the development of such tests. Thus, this review presents the basic concepts of the central limit theorem and its role in binomial distributions and the Student's t-test, and provides an example of the sampling distributions of small populations. A proof of the central limit theorem is also described with the mathematical concepts required for its near-complete understanding. **Keywords:** Normal distribution, Probability, Statistical distributions, Statistics ## Introduction The central limit theorem is the most fundamental theory in modern statistics. Without this theorem, parametric tests based on the assumption that sample data come from a population with fixed parameters determining its probability distribution would not exist.
With the central limit theorem, parametric tests have higher statistical power than non-parametric tests, which do not require probability distribution assumptions. Currently, multiple parametric tests are used to assess the statistical validity of clinical studies performed by medical researchers; however, most researchers are unaware of the value of the central limit theorem, despite their routine use of parametric tests. Thus, clinical researchers would benefit from knowing what the central limit theorem is and how it has become the basis for parametric tests. This review aims to address these topics. The proof of the central limit theorem is described in the [appendix](https://pmc.ncbi.nlm.nih.gov/articles/PMC5370305/#APP1), with the necessary mathematical concepts (e.g., moment-generating function and Taylor's formula) required for understanding the proof. However, some mathematical techniques (e.g., differential and integral calculus) were omitted due to space limitations. ## Basic Concepts of Central Limit Theorem In statistics, a population is the set of all items, people, or events of interest. In reality, however, collecting all such elements of the population requires considerable effort and is often not possible. For example, it is not possible to investigate the proficiency of every anesthesiologist, worldwide, in performing awake nasotracheal intubations. To make inferences regarding the population, however, a subset of the population (sample) can be used. A sample of sufficient size that is randomly selected can be used to estimate the parameters of the population using inferential statistics. A finite number of samples are attainable from the population depending on the size of the sample and the population itself. For example, all samples with a size of 1 obtained at random, with replacement, from the population {3,6,9,30} would be {3},{6},{9}, and {30}. 
If the sample size is 2, a total of 4 × 4 = 4² = 16 samples, which are {3,3},{3,6},{3,9},{3,30},{6,3},{6,6}...{9,30},{30,3},{30,6},{30,9}, and {30,30}, would be possible. In this way, 4^*n* samples with a size of *n* would be obtained from the population. Here, we consider the distribution of the sample means. For example, here is the asymmetric population with a size of 4 as presented above: {3, 6, 9, 30}. The population mean, µ, and variance, σ², are: µ = (3 + 6 + 9 + 30)/4 = 12
Readable Markdown: null
Shard: 129 (laksa)
Root Hash: 7295144728021232729
Unparsed URL: gov,nih!nlm,ncbi,pmc,/articles/PMC5370305/ s443
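The unparsed-URL key looks like a reversed-host canonical form. The sketch below is a guess at how it is built: host labels reversed and comma-joined, then the path, then the port prefixed with "s" for HTTPS. The "!" is assumed to separate the registrable domain from its subdomains, and the two-label registrable-domain rule ignores public-suffix edge cases (e.g., co.uk); this is a hypothetical reconstruction, not the crawler's actual code.

```python
from urllib.parse import urlsplit

def unparsed_key(url: str) -> str:
    """Build a reversed-host URL key, e.g. 'gov,nih!nlm,ncbi,pmc,/path/ s443'."""
    parts = urlsplit(url)
    # pmc.ncbi.nlm.nih.gov -> ['gov', 'nih', 'nlm', 'ncbi', 'pmc']
    labels = parts.hostname.split(".")[::-1]
    # Assumption: first two reversed labels form the registrable domain.
    host_key = ",".join(labels[:2]) + "!" + ",".join(labels[2:])
    secure = "s" if parts.scheme == "https" else ""
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return f"{host_key},{parts.path} {secure}{port}"

print(unparsed_key("https://pmc.ncbi.nlm.nih.gov/articles/PMC5370305/"))
# -> gov,nih!nlm,ncbi,pmc,/articles/PMC5370305/ s443
```

Reversed-host keys of this kind keep all pages of one site adjacent in a lexicographically sorted key space, which is convenient for sharding and per-host scans.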