ℹ️ Skipped - page is already crawled
| Filter | Status | Condition | Details |
|---|---|---|---|
| HTTP status | PASS | download_http_code = 200 | HTTP 200 |
| Age cutoff | FAIL | download_stamp > now() - 6 MONTH | 8.2 months ago |
| History drop | PASS | isNull(history_drop_reason) | No drop reason |
| Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0 |
| Canonical | PASS | meta_canonical IS NULL OR = '' OR = src_unparsed | Not set |
| Property | Value |
|---|---|
| URL | https://math.stackexchange.com/questions/12983/intuition-about-the-central-limit-theorem |
| Last Crawled | 2025-08-03 01:37:43 (8 months ago) |
| First Indexed | 2017-05-05 07:48:31 (8 years ago) |
| HTTP Status Code | 200 |
| Meta Title | statistics - Intuition about the Central Limit Theorem - Mathematics Stack Exchange |
| Meta Description | null |
| Meta Canonical | null |
| Boilerpipe Text | (Cross post from Stats Stack, where a similar question was asked.) Why the $\sqrt{n}$ instead of $n$? What's this weird version of an average? If you have a bunch of perpendicular vectors $x_1, \ldots, x_n$ of length $\ell$, then $\frac{x_1 + \cdots + x_n}{\sqrt{n}}$ is again of length $\ell$. You have to normalize by $\sqrt{n}$ to keep the sum at the same scale. There is a deep connection between independent random variables and orthogonal vectors. When random variables are independent, that basically means that they are orthogonal vectors in a vector space of functions. (The function space I refer to is the $L^2$ functions with mean zero, and the variance of a random variable $X$ is just $\|X - \mu\|_{L^2}^2$. So no wonder the variance is additive over independent random variables. Just like $\|x+y\|^2 = \|x\|^2 + \|y\|^2$ when $x \perp y$.) Why the normal distribution? One thing that really confused me for a while, and which I think lies at the heart of the matter, is the following question: Why is it that the sum $\frac{X_1 + \cdots + X_n}{\sqrt{n}}$ ($n$ large) doesn't care anything about the $X_i$ except their mean and their variance? (Moments 1 and 2.) This is similar to the law of large numbers phenomenon: $\frac{X_1 + \cdots + X_n}{n}$ ($n$ large) only cares about moment 1 (the mean). (Both of these have their hypotheses that I'm suppressing (see the footnote), but the most important thing, of course, is that the $X_i$ be independent.) A more elucidating way to express this phenomenon is: in the sum $\frac{X_1 + \cdots + X_n}{\sqrt{n}}$, I can replace any or all of the $X_i$ with some other RVs, mixing and matching between all kinds of various distributions, as long as they have the same first and second moments. And it won't matter as long as $n$ is large, relative to the moments. If we understand why that's true, then we understand the central limit theorem. Because then we may as well take the $X_i$ to be normal with the same first and second moment, and in that case we know $\frac{X_1 + \cdots + X_n}{\sqrt{n}}$ is just normal again for any $n$, including super-large $n$. Because the normal distribution has the special property ("stability") that you can add two independent normals together and get another normal. Voila. The explanation of the first-and-second-moment phenomenon is ultimately just some arithmetic. There are several lenses through which one can choose to view this arithmetic. The most common one people use is the Fourier transform (AKA characteristic function), which has the feel of "I follow the steps, but how and why would anyone ever think of that?" Another approach is to look at the cumulants of the $X_i$. There we find that the normal distribution is the unique distribution whose higher cumulants vanish, and dividing by $\sqrt{n}$ tends to kill all but the first two cumulants as $n$ gets large. I'll show here a more elementary approach. As the sum $Z_n \overset{\text{def}}{=} \frac{X_1 + \cdots + X_n}{\sqrt{n}}$ gets longer and longer, I'll show that all of the moments of $Z_n$ are functions only of the variances $\operatorname{Var}(X_i)$ and the means $E X_i$, and nothing else. Now the moments of $Z_n$ determine the distribution of $Z_n$ (that's true not just for long independent sums, but for any nice distribution, by the Carleman continuity theorem). To restate, we're claiming that as $n$ gets large, $Z_n$ depends only on the $E X_i$ and the $\operatorname{Var} X_i$. And to show that, we're going to show that $E\bigl((Z_n - E Z_n)^k\bigr)$ depends only on the $E X_i$ and the $\operatorname{Var} X_i$. That suffices, by the Carleman continuity theorem. For convenience, let's require that the $X_i$ have mean zero and variance $\sigma^2$. Assume all their moments exist and are uniformly bounded. (But nevertheless, the $X_i$ can be all different independent distributions.) Claim: Under the stated assumptions, the $k$th moment $E\bigl[\bigl(\frac{X_1 + \cdots + X_n}{\sqrt{n}}\bigr)^k\bigr]$ has a limit as $n \to \infty$, and that limit is a function only of $\sigma^2$. (It disregards all other information.) (Specifically, the values of those limits of moments are just the moments of the normal distribution $N(0, \sigma^2)$: zero for $k$ odd, and $|\sigma|^k \frac{k!}{(k/2)!\,2^{k/2}}$ when $k$ is even. This is equation (1) below.) Proof: Consider $E\bigl[\bigl(\frac{X_1 + \cdots + X_n}{\sqrt{n}}\bigr)^k\bigr]$. When you expand it, you get a factor of $n^{-k/2}$ times a big fat multinomial sum $\sum_{|\alpha| = k} \binom{k}{\alpha_1, \ldots, \alpha_n} \prod_{i=1}^n E(X_i^{\alpha_i})$. (Remember you can distribute the expectation over independent random variables: $E(X^a Y^b) = E(X^a)E(Y^b)$.) Now if ever I have as one of my factors a plain old $E(X_i)$, with exponent $\alpha_i = 1$, then that whole term is zero, because $E(X_i) = 0$ by assumption. So I need all the exponents $\alpha_i \ne 1$ in order for that term to survive. That pushes me toward using fewer of the $X_i$ in each term, because each term has $\sum_i \alpha_i = k$, and I have to have each $\alpha_i > 1$ if it is $> 0$. In fact, some simple arithmetic shows that at most $k/2$ of the $\alpha_i$ can be nonzero, and that's only when $k$ is even, and when I use only twos and zeros as my $\alpha_i$. This pattern where I use only twos and zeros turns out to be very important; in fact, any term where I don't do that will vanish as the sum grows larger. Lemma: The sum $n^{-k/2} \sum_{|\alpha| = k} \binom{k}{\alpha_1, \ldots, \alpha_n} \prod_{i=1}^n E(X_i^{\alpha_i})$ breaks up into (terms where some $\alpha_i = 1$), which are zero because $E X_i = 0$; (terms where the $\alpha_i$ are twos and zeros), which are $O(n^{k/2})$ if $k$ is even and absent otherwise; and (rest of terms), which are $o(n^{k/2})$. In other words, in the limit, all terms become irrelevant except $n^{-k/2} \sum \binom{n}{k/2} \binom{k}{2, \ldots, 2} \prod_{j=1}^{k/2} E(X_{i_j}^2)$, equation (1). Proof: The main point is to split up the sum by which (strong) composition of $k$ is represented by the multinomial $\alpha$. There are only $2^{k-1}$ possibilities for strong compositions of $k$, so the number of those can't explode as $n \to \infty$. Then there is the choice of which of the $X_1, \ldots, X_n$ will receive the positive exponents, and the number of such choices is $\binom{n}{\#\text{ positive terms in }\alpha} = O\bigl(n^{\#\text{ positive terms in }\alpha}\bigr)$. (Remember the number of positive terms in $\alpha$ can't be bigger than $k/2$ without killing the term.) That's basically it. You can find a more thorough description here on my website, or in section 2.2.3 of Tao's Topics in Random Matrix Theory, where I first read this argument. And that concludes the whole proof. We've shown that all moments of $\frac{X_1 + \cdots + X_n}{\sqrt{n}}$ forget everything but $E X_i$ and $E(X_i^2)$ as $n \to \infty$. And therefore swapping out the $X_i$ with any variables with the same first and second moments wouldn't have made any difference in the limit. And so we may as well have taken them to be $\sim N(\mu, \sigma^2)$ to begin with; it wouldn't have made any difference. (If one wants to pursue more deeply the question of why $n^{1/2}$ is the magic number here for vectors and for functions, and why the variance (squared $L^2$ norm) is the important statistic, one might read about why $L^2$ is the only $L^p$ space that can be an inner product space: because $2$ is the only number that is its own Hölder conjugate.) Another valid view is that $n^{1/2}$ is not the only denominator that can appear. There are different "basins of attraction" for random variables, and so there are infinitely many central limit theorems. There are random variables for which $\frac{X_1 + \cdots + X_n}{n} \Rightarrow X$, and for which $\frac{X_1 + \cdots + X_n}{1} \Rightarrow X$! But these random variables necessarily have infinite variance. These are called "stable laws". It's also enlightening to look at the normal distribution from a calculus of variations standpoint: the normal distribution $N(\mu, \sigma^2)$ maximizes the Shannon entropy among distributions with a given mean and variance, and which are absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}$ (or $\mathbb{R}^d$, for the multivariate case). This is proven here, for example. |
| Markdown |
# [Intuition about the Central Limit Theorem](https://math.stackexchange.com/questions/12983/intuition-about-the-central-limit-theorem)
[Ask Question](https://math.stackexchange.com/questions/ask)
Asked
14 years, 8 months ago
Modified [8 months ago](https://math.stackexchange.com/questions/12983/intuition-about-the-central-limit-theorem?lastactivity "2024-11-18 16:00:52Z")
Viewed 4k times
19
I'm studying statistics, and would like to better understand the Central Limit Theorem. The proof I found on Wikipedia requires some previous knowledge I do not currently possess.
Is there a quick intuitive explanation you can give as to why this theorem is correct?
- [statistics](https://math.stackexchange.com/questions/tagged/statistics "show questions tagged 'statistics'")
[CC BY-SA 2.5](https://creativecommons.org/licenses/by-sa/2.5/ "The current license for this post: CC BY-SA 2.5")
asked Dec 4, 2010 at 11:43
[chaoticdawn](https://math.stackexchange.com/users/4401/chaoticdawn)
191 • 1 silver badge • 3 bronze badges
4
- 1
Have you seen [this](http://stats.stackexchange.com/questions/3734) already?
– [J. M. ain't a mathematician](https://math.stackexchange.com/users/498/j-m-aint-a-mathematician "76,570 reputation")
[Commented Dec 4, 2010 at 11:50](https://math.stackexchange.com/questions/12983/intuition-about-the-central-limit-theorem#comment27895_12983)
- @J. M., thanks for the link. Although, I didn't find a quick explanation that I understood, sadly. Specifically, I'm interested in why sampling any distribution (even a non-symmetric one) will lead to a normal, symmetric distribution, for large enough samples.
– [chaoticdawn](https://math.stackexchange.com/users/4401/chaoticdawn "191 reputation")
[Commented Dec 4, 2010 at 12:01](https://math.stackexchange.com/questions/12983/intuition-about-the-central-limit-theorem#comment27897_12983)
- 2
[en.wikipedia.org/wiki/Illustration\_of\_the\_central\_limit\_theorem](http://en.wikipedia.org/wiki/Illustration_of_the_central_limit_theorem) might help provide a little intuition. Great question, by the way\!
– [Qiaochu Yuan](https://math.stackexchange.com/users/232/qiaochu-yuan "471,149 reputation")
[Commented Dec 4, 2010 at 12:39](https://math.stackexchange.com/questions/12983/intuition-about-the-central-limit-theorem#comment27900_12983)
- 2
Not to confuse you, but the CLT only applies in the finite second moment case (finite variance). Lévy-stable distributions also arise as limiting distributions of sums of random variables.
– [user4143](https://math.stackexchange.com/users/4143/user4143 "1,020 reputation")
[Commented Dec 4, 2010 at 16:46](https://math.stackexchange.com/questions/12983/intuition-about-the-central-limit-theorem#comment27930_12983)
## 5 Answers
24
I don't think you should expect any short, snappy answers because I think this is a very deep question. Here is a guess at a conceptual explanation, which I can't quite flesh out.
Our starting point is something called the [principle of maximum entropy](http://en.wikipedia.org/wiki/Principle_of_maximum_entropy), which says that in any situation where you're trying to assign a probability distribution to some events, you should choose the distribution with maximum entropy which is consistent with your knowledge. For example, if you don't know anything and there are $n$ events, then the maximum entropy distribution is the uniform one where each event occurs with probability $\frac{1}{n}$. There are lots more examples in [this expository paper](https://kconrad.math.uconn.edu/blurbs/analysis/entropypost.pdf) by Keith Conrad.
Now take a bunch of independent identically distributed random variables $X_i$ with mean $\mu$ and variance $\sigma^2$. You know exactly what the mean of $\frac{X_1 + \cdots + X_n}{n}$ is; it's $\mu$ by linearity of expectation. Variance is also linear, at least on independent variables (this is a probabilistic form of the Pythagorean theorem), hence
$$\operatorname{Var}(X_1 + \cdots + X_n) = \operatorname{Var}(X_1) + \cdots + \operatorname{Var}(X_n) = n\sigma^2$$
but since variance scales quadratically, the variance of $\frac{X_1 + \cdots + X_n}{n}$ is actually $\frac{\sigma^2}{n}$; in other words, it goes to zero! This is a simple way to convince yourself of the (weak) [law of large numbers](http://en.wikipedia.org/wiki/Law_of_large_numbers).
So we can convince ourselves that (under the assumptions of finite mean and variance) the average of a bunch of iid random variables tends to its mean. If we want to study *how* it tends to its mean, we need to instead consider $\frac{(X_1 - \mu) + \cdots + (X_n - \mu)}{\sqrt{n}}$, which has mean $0$ and variance $\sigma^2$.
Suppose we suspected, for one reason or another, that this tended to some fixed limiting distribution in terms of $\sigma^2$ alone. We might be led to this conclusion by seeing this behavior for several particular distributions, for example. Given that, it follows that we don't know anything about this limiting distribution except its mean and variance. So we should choose the distribution of maximum entropy with a fixed mean and variance. And this is **precisely** the corresponding normal distribution! Intuitively, each iid random variable is like a particle moving randomly, and adding up the contributions of all of the random particles adds "heat," or "entropy," to your system. (I think this is why the normal distribution shows up in the description of the [heat kernel](http://en.wikipedia.org/wiki/Heat_kernel), but don't quote me on this.) In information-theoretic terms, the more iid random variables you sum, the less information you have about the result.
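The maximum-entropy claim lends itself to a quick numerical check. The following is a minimal sketch (my own illustration, not part of the answer): it compares the differential entropy $-\int f \log f$ of a few mean-zero, variance-one densities, and the normal comes out on top, as the argument predicts. The comparison densities (uniform and Laplace) are arbitrary choices.

```python
import numpy as np

x = np.linspace(-20, 20, 400_001)
dx = x[1] - x[0]

def entropy(f):
    # differential entropy -∫ f log f, with 0·log 0 treated as 0
    mask = f > 0
    return -np.sum(f[mask] * np.log(f[mask])) * dx

densities = {
    "normal":  np.exp(-x**2 / 2) / np.sqrt(2 * np.pi),                        # N(0, 1)
    "uniform": np.where(np.abs(x) <= np.sqrt(3), 1 / (2 * np.sqrt(3)), 0.0),  # variance 1
    "laplace": np.exp(-np.sqrt(2) * np.abs(x)) / np.sqrt(2),                  # variance 1
}

for name, f in densities.items():
    mean = np.sum(x * f) * dx
    var = np.sum(x**2 * f) * dx - mean**2
    print(f"{name:8s} mean={mean:+.3f}  var={var:.3f}  entropy={entropy(f):.4f}")
# normal ≈ 1.419, laplace ≈ 1.347, uniform ≈ 1.242: the normal wins.
```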
[CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/ "The current license for this post: CC BY-SA 4.0")
[edited Nov 18, 2024 at 16:00](https://math.stackexchange.com/posts/12985/revisions "show all edits to this post")
[Jaspreet](https://math.stackexchange.com/users/343701/jaspreet)
813 • 7 silver badges • 20 bronze badges
answered Dec 4, 2010 at 13:03
[Qiaochu Yuan](https://math.stackexchange.com/users/232/qiaochu-yuan)
471k • 55 gold badges • 1.1k silver badges • 1.5k bronze badges
2
- (+1) Given that we want a given mean and variance, maximum entropy and the calculus of variations lead directly to the normal distribution. Using the machinery of Fourier Transforms (see [this answer](https://math.stackexchange.com/a/123987) for a start), we can derive the Gaussian distribution as a weak limit of a contraction of convolutions of any distribution with mean $0$ and variance $1$.
– [robjohn](https://math.stackexchange.com/users/13854/robjohn "354,051 reputation") ♦
[Commented Mar 26, 2020 at 16:31](https://math.stackexchange.com/questions/12983/intuition-about-the-central-limit-theorem#comment7393174_12985)
- 1
Why don't we know anything about the limiting distribution? Say we start with random variables of any distribution; a priori I would expect this knowledge to translate to the limit. Does anyone have some intuition for this?
– [alijfsnwiafebq](https://math.stackexchange.com/users/1397294/alijfsnwiafebq "185 reputation")
[Commented Dec 6, 2024 at 11:23](https://math.stackexchange.com/questions/12983/intuition-about-the-central-limit-theorem#comment10737572_12985)
8
There's an almost-formal argument using cumulants. Given a random variable $X$, define its moment generating function
$$M(X) = E\bigl[e^{tX}\bigr].$$
It's called the moment generating function since, expanding the Taylor series of the exponential, we get
$$M(X) = 1 + E[X]\,t + \tfrac{1}{2}E[X^2]\,t^2 + \cdots.$$
The moment generating function is useful because of its relation to convolution of two *independent* random variables:
$$M(X+Y) = E\bigl[e^{t(X+Y)}\bigr] = E\bigl[e^{tX}e^{tY}\bigr] = E\bigl[e^{tX}\bigr]E\bigl[e^{tY}\bigr] = M(X)M(Y).$$
One proof of the CLT takes the route of the mgf, but we would like to replace the multiplication by addition, since we only really know how to handle sums. So we define the cumulant generating function
$$K(X) = \log M(X).$$
We can calculate the first few coefficients (which are called cumulants) by substituting into the (formal) power series of $\log(1+x) = x - x^2/2 + \cdots$:
$$K(X) = \log\bigl(1 + E[X]\,t + E[X^2]\,t^2/2 + \cdots\bigr) = E[X]\,t + E[X^2]\,t^2/2 - \bigl(E[X]^2 t^2 + E[X]E[X^2]\,t^3 + E[X^2]^2 t^4/4\bigr)/2 + \cdots = E[X]\,t + V[X]\,t^2/2 + \cdots.$$
Also, if $X$ and $Y$ are independent then
$$K[X+Y] = K[X] + K[Y].$$
Now suppose $X_1, \ldots, X_n$ are iid variables distributed like $X$ with zero expectation. Then
$$K[X_1 + \cdots + X_n] = nK[X] = \tfrac{1}{2}\,n\,V[X]\,t^2 + \tfrac{1}{6}\,n\,K_3(X)\,t^3 + \cdots,$$
where $K_m(X)$ are just the (normalized) coefficients of the cgf, i.e. the cumulants (they are normalized by $1/m!$). If we scale this sum down by $\sqrt{n}$, then the second cumulant becomes $V[X]$ (i.e. the variance is the same), but the rest of the cumulants $K_m$ for $m \ge 3$ get multiplied by $n^{1-m/2} \to 0$, so in the limit they disappear, and the cumulant generating function of the limit is just
$$K\!\left[\frac{X_1 + \cdots + X_n}{\sqrt{n}}\right] = \tfrac{1}{2}\,V[X]\,t^2.$$
Therefore there is one 'domain of attraction' for such distributions, which must be the normal distribution with zero mean and variance $V[X]$; it can be calculated directly from this representation. The same idea can be used to analyze the case where the variables are independent but not identically distributed. The main step missing to make this proof formal is reasoning about the limit distribution from the limit cgf; this is the Lévy continuity theorem, which shows that the 'inverse Fourier transform' is continuous.
Had we taken the route of mgf's, we would have had to use the identity $(1 + 1/n)^n \to e$ somewhere, but otherwise the argument would be much the same.
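The cumulant scaling is easy to watch in a simulation. Here is a short sketch (my own, not from the answer; the choice of centered exponential summands is just for illustration) showing that for $Z_n = (X_1 + \cdots + X_n)/\sqrt{n}$ the second cumulant stays fixed while the third and fourth shrink like $n^{-1/2}$ and $n^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(0)

def cumulants_2_3_4(z):
    # empirical 2nd, 3rd, 4th cumulants from central moments
    c = z - z.mean()
    m2, m3, m4 = (c**2).mean(), (c**3).mean(), (c**4).mean()
    return m2, m3, m4 - 3 * m2**2

reps = 100_000
for n in (1, 10, 100):
    # X_i: centered exponentials (mean 0, variance 1, cumulants k3 = 2, k4 = 6)
    x = rng.exponential(1.0, size=(reps, n)) - 1.0
    z = x.sum(axis=1) / np.sqrt(n)
    k2, k3, k4 = cumulants_2_3_4(z)
    print(f"n={n:4d}  k2={k2:.3f}  k3={k3:+.3f}  k4={k4:+.3f}")
# expected pattern: k2 stays near 1, k3 ≈ 2/sqrt(n), k4 ≈ 6/n
```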
[CC BY-SA 2.5](https://creativecommons.org/licenses/by-sa/2.5/ "The current license for this post: CC BY-SA 2.5")
answered Dec 4, 2010 at 19:04
[Yuval Filmus](https://math.stackexchange.com/users/1277/yuval-filmus)
58k • 5 gold badges • 96 silver badges • 170 bronze badges
6
(Cross post from Stats Stack, where a similar question was asked.)
## Why the $\sqrt{n}$ instead of $n$? What's this weird version of an average?
If you have a bunch of perpendicular vectors $x_1, \ldots, x_n$ of length $\ell$, then $\frac{x_1 + \cdots + x_n}{\sqrt{n}}$ is again of length $\ell$. You have to normalize by $\sqrt{n}$ to keep the sum at the same scale.
There is a deep connection between independent random variables and orthogonal vectors. When random variables are independent, that basically means that they are orthogonal vectors in a vector space of functions.
(The function space I refer to is the $L^2$ functions with mean zero, and the variance of a random variable $X$ is just $\|X - \mu\|_{L^2}^2$. So no wonder the variance is additive over independent random variables. Just like $\|x+y\|^2 = \|x\|^2 + \|y\|^2$ when $x \perp y$.)\*\*
## Why the normal distribution?
One thing that really confused me for a while, and which I think lies at the heart of the matter, is the following question:
> Why is it that the sum $\frac{X_1 + \cdots + X_n}{\sqrt{n}}$ ($n$ large) doesn't care anything about the $X_i$ except their mean and their variance? (Moments 1 and 2.)
This is similar to the law of large numbers phenomenon:
> $\frac{X_1 + \cdots + X_n}{n}$ ($n$ large) only cares about moment 1 (the mean).
(Both of these have their hypotheses that I'm suppressing (see the footnote), but the most important thing, of course, is that the $X_i$ be **independent**.)
A more elucidating way to express this phenomenon is: in the sum $\frac{X_1 + \cdots + X_n}{\sqrt{n}}$, I can replace any or all of the $X_i$ with some other RVs, mixing and matching between all kinds of various distributions, as long as they have the same first and second moments. And it won't matter as long as $n$ is large, relative to the moments.
If we understand why that's true, then **we understand the central limit theorem**. Because then we may as well take the $X_i$ to be *normal* with the same first and second moment, and in that case we know $\frac{X_1 + \cdots + X_n}{\sqrt{n}}$ is just normal again for any $n$, including super-large $n$. Because the normal distribution has the special property ("stability") that you can add two independent normals together and get another normal. Voila.
The explanation of the first-and-second-moment phenomenon is ultimately just some arithmetic. There are several lenses through which one can choose to view this arithmetic. The most common one people use is the Fourier transform (AKA characteristic function), which has the feel of "I follow the steps, but how and why would anyone ever think of that?" Another approach is to look at the *cumulants* of the $X_i$. There we find that the normal distribution is the unique distribution whose higher cumulants vanish, and dividing by $\sqrt{n}$ tends to kill all but the first two cumulants as $n$ gets large.
I'll show here a more elementary approach. As the sum $Z_n \overset{\text{def}}{=} \frac{X_1 + \cdots + X_n}{\sqrt{n}}$ gets longer and longer, I'll show that all of the moments of $Z_n$ are functions only of the variances $\operatorname{Var}(X_i)$ and the means $E X_i$, and nothing else. Now the moments of $Z_n$ determine the distribution of $Z_n$ (that's true not just for long independent sums, but for any nice distribution, by the [Carleman continuity theorem](https://en.wikipedia.org/wiki/Carleman%27s_condition)). To restate, we're claiming that as $n$ gets large, $Z_n$ depends only on the $E X_i$ and the $\operatorname{Var} X_i$. And to show that, we're going to show that $E\bigl((Z_n - E Z_n)^k\bigr)$ depends only on the $E X_i$ and the $\operatorname{Var} X_i$. That suffices, by the Carleman continuity theorem.
For convenience, let's require that the $X_i$ have mean zero and variance $\sigma^2$. Assume all their moments exist and are uniformly bounded. (But nevertheless, the $X_i$ can be all different independent distributions.)
> **Claim:** Under the stated assumptions, the $k$th moment
> $$E\left[\left(\frac{X_1 + \cdots + X_n}{\sqrt{n}}\right)^k\right]$$
> has a limit as $n \to \infty$, and that limit is a function only of $\sigma^2$. (It disregards all other information.)
(Specifically, the values of those limits of moments are just [the moments of the normal distribution](https://en.wikipedia.org/wiki/Normal_distribution#Moments) $N(0, \sigma^2)$: zero for $k$ odd, and $|\sigma|^k \dfrac{k!}{(k/2)!\,2^{k/2}}$ when $k$ is even. This is equation (1) below.)
**Proof:** Consider $E\left[\left(\frac{X_1 + \cdots + X_n}{\sqrt{n}}\right)^k\right]$. When you expand it, you get a factor of $n^{-k/2}$ times a big fat multinomial sum.
$$n^{-k/2} \sum_{|\alpha| = k} \binom{k}{\alpha_1, \ldots, \alpha_n} \prod_{i=1}^n E\bigl(X_i^{\alpha_i}\bigr), \qquad \alpha_1 + \cdots + \alpha_n = k, \quad \alpha_i \ge 0.$$
(Remember you can distribute the expectation over independent random variables: $E(X^a Y^b) = E(X^a)\,E(Y^b)$.)
Now if ever I have as one of my factors a plain old $E(X_i)$, with exponent $\alpha_i = 1$, then that whole term is zero, because $E(X_i) = 0$ by assumption. So I need all the exponents $\alpha_i \ne 1$ in order for that term to survive. That pushes me toward using fewer of the $X_i$ in each term, because each term has $\sum_i \alpha_i = k$, and I have to have each $\alpha_i > 1$ if it is $> 0$. In fact, some simple arithmetic shows that at most $k/2$ of the $\alpha_i$ can be nonzero, and that's only when $k$ is even, and when I use only twos and zeros as my $\alpha_i$.
This pattern where I use only twos and zeros turns out to be very important...in fact, any term where I don't do that will vanish as the sum grows larger.
> **Lemma:** The sum
> $$n^{-k/2} \sum_{|\alpha| = k} \binom{k}{\alpha_1, \ldots, \alpha_n} \prod_{i=1}^n E\bigl(X_i^{\alpha_i}\bigr)$$
> breaks up like
> $$n^{-k/2}\Biggl(\underbrace{\bigl(\text{terms where some } \alpha_i = 1\bigr)}_{\text{zero because } E X_i = 0} + \underbrace{\bigl(\text{terms where the } \alpha_i \text{ are twos and zeros}\bigr)}_{O(n^{k/2}) \text{ if } k \text{ is even; otherwise no such terms}} + \underbrace{\bigl(\text{rest of terms}\bigr)}_{o(n^{k/2})}\Biggr)$$
In other words, in the limit, all terms become irrelevant except
$$n^{-k/2} \sum \binom{n}{k/2} \underbrace{\binom{k}{2, \ldots, 2}}_{k/2 \text{ twos}} \prod_{j=1}^{k/2} E\bigl(X_{i_j}^2\bigr) \tag{1}$$
**Proof:** The main point is to split up the sum by which [(strong) composition](https://en.wikipedia.org/wiki/Composition_\(combinatorics\)) of $k$ is represented by the multinomial $\alpha$. There are only $2^{k-1}$ possibilities for strong compositions of $k$, so the number of those can't explode as $n \to \infty$. Then there is the choice of which of the $X_1, \ldots, X_n$ will receive the positive exponents, and the number of such choices is $\binom{n}{\#\text{ positive terms in }\alpha} = O\bigl(n^{\#\text{ positive terms in }\alpha}\bigr)$. (Remember the number of positive terms in $\alpha$ can't be bigger than $k/2$ without killing the term.) That's basically it. You can find a more thorough description [here](http://www.ericauld.net/s/CLTproof.pdf) on my website, or in section 2.2.3 of Tao's *Topics in Random Matrix Theory*, where I first read this argument.
And that concludes the whole proof. We've shown that all moments of $\frac{X_1 + \cdots + X_n}{\sqrt{n}}$ forget everything but $E X_i$ and $E(X_i^2)$ as $n \to \infty$. And therefore swapping out the $X_i$ with any variables with the same first and second moments wouldn't have made any difference in the limit. And so we may as well have taken them to be $\sim N(\mu, \sigma^2)$ to begin with; it wouldn't have made any difference.
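Here is a small Monte Carlo illustration of this conclusion (my own sketch, not the author's; the particular mix of centered exponential and $\pm 1$ summands is an arbitrary choice): the moments of $Z_n$ land on the $N(0, \sigma^2)$ moments even when the $X_i$ are mixed and matched, as long as they all have mean $0$ and variance $\sigma^2 = 1$.

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(1)
n, reps = 200, 50_000

# "Mix and match": half the X_i are centered exponentials, half are ±1 coin flips.
# All have mean 0 and variance 1, but completely different shapes.
x_exp = rng.exponential(1.0, size=(reps, n // 2)) - 1.0
x_sign = rng.choice([-1.0, 1.0], size=(reps, n - n // 2))
z = np.concatenate([x_exp, x_sign], axis=1).sum(axis=1) / np.sqrt(n)

for k in (2, 4, 6):
    target = factorial(k) / (factorial(k // 2) * 2 ** (k // 2))  # N(0,1) moment
    print(f"k={k}:  E[Z_n^k] ≈ {np.mean(z**k):.2f}   (normal moment {target:.0f})")
# targets 1, 3, 15; the simulated moments land close to them
```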
***
\*\*(If one wants to pursue more deeply the question of why $n^{1/2}$ is the magic number here for vectors and for functions, and why the variance (squared $L^2$ norm) is the important statistic, one might read about why $L^2$ is the only $L^p$ space that can be an inner product space: because $2$ is the only number that is its own Hölder conjugate.)
Another valid view is that $n^{1/2}$ is **not** the only denominator that can appear. There are different "basins of attraction" for random variables, and so there are infinitely many central limit theorems. There are random variables for which $\frac{X_1 + \cdots + X_n}{n} \Rightarrow X$, and for which $\frac{X_1 + \cdots + X_n}{1} \Rightarrow X$! But these random variables necessarily have infinite variance. These are called "stable laws".
It's also enlightening to look at the normal distribution from a calculus of variations standpoint: the normal distribution $N(\mu, \sigma^2)$ maximizes the Shannon entropy among distributions with a given mean and variance, and which are absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}$ (or $\mathbb{R}^d$, for the multivariate case). This is proven [here](https://web.stanford.edu/class/stats311/Lectures/lec-07.pdf), for example.
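The stable-law remark can also be seen numerically. A minimal sketch (my own, using standard Cauchy variables as the illustrative infinite-variance example): the sample mean of $n$ Cauchy variables is again standard Cauchy, so its spread does not shrink with $n$ at all.

```python
import numpy as np

rng = np.random.default_rng(2)
reps = 20_000
for n in (1, 10, 100, 1000):
    m = rng.standard_cauchy(size=(reps, n)).mean(axis=1)  # (X_1 + ... + X_n)/n
    q25, q75 = np.percentile(m, [25, 75])
    print(f"n={n:5d}  IQR of the sample mean ≈ {q75 - q25:.2f}")  # stays ≈ 2
```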
[CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/ "The current license for this post: CC BY-SA 4.0")
[edited Aug 22, 2021 at 23:54](https://math.stackexchange.com/posts/3591024/revisions "show all edits to this post")
answered Mar 23, 2020 at 1:09
[Eric Auld](https://math.stackexchange.com/users/76333/eric-auld)
29k • 12 gold badges • 86 silver badges • 223 bronze badges
1
- 1
Please make substantive edits. There have been many small edits to this answer. This puts this question on the front page unnecessarily frequently.
– [robjohn](https://math.stackexchange.com/users/13854/robjohn "354,051 reputation") ♦
[Commented Mar 26, 2020 at 18:36](https://math.stackexchange.com/questions/12983/intuition-about-the-central-limit-theorem#comment7393504_3591024)
3
Working out a few simple examples might help. This would indeed show you that the theorem works in special cases, and thus it would go a long way toward convincing oneself of the validity of the central limit theorem. The central limit theorem first appeared in the work of Abraham de Moivre, who used the normal distribution to approximate the distribution of the number of heads resulting from many tosses of a fair coin. Later Laplace showed the same for the binomial distribution, approximating it with the normal distribution. I suggest that you work out these two simpler cases to get a feeling of how the approximation happens. All the necessary background for doing this yourself is available in the book of Hoel, Port and Stone.
If you find the theorem hard to understand, it might make you feel better to hear that probabilists needed a long time to properly formulate and understand the theorem. It was only done in the twentieth century by Lyapunov.
If you are oriented towards practical applications, then getting used to some topics of your choice, for instance noise analysis in communication theory, might help you in convincing yourself of the truth of the central limit theorem.
The best way to understand the central limit theorem would of course be to take a course in probability theory. An introductory course usually ends with a proof of this theorem. And if you take a course, you would see other interesting theorems such as the weak and strong laws of large numbers, and this would put the central limit theorem in better perspective. Even after all this, you might still need to contemplate a bit to really absorb the theorem. The proof I have seen is using some "characteristic functions" and a sort of "Fourier transform". I have to regretfully confess that I didn't fully understand it when I took the course. I never had to take up probability theory later; but if the occasion arises, I intend to go all properly through the proof and understand the machinery.
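As a concrete starting point for the de Moivre–Laplace exercise suggested above, here is a minimal sketch (my own illustration; the value $n = 50$ is arbitrary) comparing $\mathrm{Binomial}(n, \tfrac12)$ probabilities with the matching normal density $N(n/2,\, n/4)$:

```python
from math import comb, exp, pi, sqrt

n = 50
mu, var = n / 2, n / 4          # mean and variance of Binomial(n, 1/2)
for k in (15, 20, 25, 30, 35):
    binom = comb(n, k) * 0.5 ** n                              # exact probability
    normal = exp(-(k - mu) ** 2 / (2 * var)) / sqrt(2 * pi * var)  # normal density
    print(f"k={k}  P(X = k) = {binom:.5f}   normal approx = {normal:.5f}")
# already at n = 50 the two columns agree to a couple of decimal places
```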
[CC BY-SA 2.5](https://creativecommons.org/licenses/by-sa/2.5/ "The current license for this post: CC BY-SA 2.5")
answered Dec 4, 2010 at 20:21
user1119
2
[This answer](https://math.stackexchange.com/a/123987) gives an outline of how to use the Fourier Transform to prove that the $n$-fold convolution of any probability distribution with a finite variance, contracted by a factor of $\sqrt{n}$, converges weakly to the normal distribution.
However, in [his answer](https://math.stackexchange.com/a/12985), Qiaochu Yuan mentions that one can use the Principle of Maximum Entropy to get a normal distribution. Below, I have endeavored to do just that using the [Calculus of Variations](https://en.wikipedia.org/wiki/Calculus_of_variations).
***
**Applying the Principle of Maximum Entropy**
Suppose we want to maximize the entropy
$$-\int_{\mathbb{R}} \log(f(x))\, f(x)\,\mathrm{d}x \tag{1}$$
over all $f$ whose mean is $0$ and variance is $\sigma^2$, that is,
$$\int_{\mathbb{R}} \bigl(1, x, x^2\bigr)\, f(x)\,\mathrm{d}x = \bigl(1, 0, \sigma^2\bigr) \tag{2}$$
That is, we want the variation of $(1)$ to vanish,
$$\int_{\mathbb{R}} \bigl(1 + \log(f(x))\bigr)\,\delta f(x)\,\mathrm{d}x = 0 \tag{3}$$
for all variations of $f$, $\delta f(x)$, such that the variation of $(2)$ vanishes,
$$\int_{\mathbb{R}} \bigl(1, x, x^2\bigr)\,\delta f(x)\,\mathrm{d}x = (0, 0, 0) \tag{4}$$
$(3)$, $(4)$, and [orthogonality](https://math.stackexchange.com/a/188306) require
$$\log(f(x)) = c_0 + c_1 x + c_2 x^2 \tag{5}$$
To satisfy $(2)$, we need $c_0 = -\tfrac{1}{2}\log\bigl(2\pi\sigma^2\bigr)$, $c_1 = 0$, and $c_2 = -\tfrac{1}{2\sigma^2}$. That is,
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{x^2}{2\sigma^2}} \tag{6}$$
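As a sanity check on $(6)$ (my own sketch, not part of the answer; $\sigma = 1.7$ is an arbitrary test value), numerical integration confirms the constraints in $(2)$ and reproduces the entropy value quoted in the comment below:

```python
import numpy as np

sigma = 1.7                                   # arbitrary test value
x = np.linspace(-30, 30, 200_001)
dx = x[1] - x[0]
f = np.exp(-x**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

print("integral of f       =", round(np.sum(f) * dx, 6))        # constraint: 1
print("integral of x f     =", round(np.sum(x * f) * dx, 6))    # constraint: 0
print("integral of x^2 f   =", round(np.sum(x**2 * f) * dx, 6))  # constraint: sigma^2
print("entropy of f        =", round(-np.sum(f * np.log(f)) * dx, 6))
print("0.5*log(2*pi*e*s^2) =", round(0.5 * np.log(2 * np.pi * np.e * sigma**2), 6))
```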
[CC BY-SA 4.0](https://creativecommons.org/licenses/by-sa/4.0/ "The current license for this post: CC BY-SA 4.0")
[edited Mar 27, 2020 at 23:14](https://math.stackexchange.com/posts/3596397/revisions "show all edits to this post")
answered Mar 26, 2020 at 18:15
[robjohn](https://math.stackexchange.com/users/13854/robjohn) ♦
354k • 38 gold badges • 495 silver badges • 889 bronze badges
2
- This says that the maximum entropy for a probability distribution on $\mathbb{R}$ with variance $\sigma^2$ is $\tfrac{1}{2}\log\bigl(2\pi e\sigma^2\bigr)$.
– [robjohn](https://math.stackexchange.com/users/13854/robjohn "354,051 reputation") ♦
[Commented Mar 26, 2020 at 18:16](https://math.stackexchange.com/questions/12983/intuition-about-the-central-limit-theorem#comment7393457_3596397)
- 1
@EricAuld: I have added a bit more to help clarify that I am using the Calculus of Variations.
– [robjohn](https://math.stackexchange.com/users/13854/robjohn "354,051 reputation") ♦
[Commented Mar 27, 2020 at 23:15](https://math.stackexchange.com/questions/12983/intuition-about-the-central-limit-theorem#comment7396868_3596397)
#### Linked
[72](https://math.stackexchange.com/questions/123633/pseudo-proofs-that-are-intuitively-reasonable?lq=1 "Question score (upvotes - downvotes)")
[Pseudo Proofs that are intuitively reasonable](https://math.stackexchange.com/questions/123633/pseudo-proofs-that-are-intuitively-reasonable?noredirect=1&lq=1)
[0](https://math.stackexchange.com/questions/188026/minimize-the-sum-of-tangents-when-sum-of-angles-are-constants?lq=1 "Question score (upvotes - downvotes)")
[Minimize the sum of tangents when sum of angles are constants](https://math.stackexchange.com/questions/188026/minimize-the-sum-of-tangents-when-sum-of-angles-are-constants?noredirect=1&lq=1)
#### Related
[80](https://math.stackexchange.com/questions/265917/intuitive-explanation-of-a-definition-of-the-fisher-information?rq=1 "Question score (upvotes - downvotes)")
[Intuitive explanation of a definition of the Fisher information](https://math.stackexchange.com/questions/265917/intuitive-explanation-of-a-definition-of-the-fisher-information?rq=1)
[0](https://math.stackexchange.com/questions/363531/central-limit-theorem-vs-normal-model?rq=1 "Question score (upvotes - downvotes)")
[Central Limit Theorem VS Normal Model](https://math.stackexchange.com/questions/363531/central-limit-theorem-vs-normal-model?rq=1)
[0](https://math.stackexchange.com/questions/487375/central-limit-theorem-question-about-%E2%88%9An-and-%CF%832?rq=1 "Question score (upvotes - downvotes)")
[Central limit theorem: question about √n and σ2](https://math.stackexchange.com/questions/487375/central-limit-theorem-question-about-%E2%88%9An-and-%CF%832?rq=1)
[5](https://math.stackexchange.com/questions/1126008/central-limit-theorem-proof?rq=1 "Question score (upvotes - downvotes)")
[Central Limit Theorem proof.](https://math.stackexchange.com/questions/1126008/central-limit-theorem-proof?rq=1)
[1](https://math.stackexchange.com/questions/1368941/current-applications-of-the-central-limit-theorem-for-binomial-distributions?rq=1 "Question score (upvotes - downvotes)")
[Current applications of the central limit theorem for binomial distributions](https://math.stackexchange.com/questions/1368941/current-applications-of-the-central-limit-theorem-for-binomial-distributions?rq=1)
[5](https://math.stackexchange.com/questions/1706744/using-central-limit-theorem-to-approximate?rq=1 "Question score (upvotes - downvotes)")
[Using Central Limit Theorem to approximate.](https://math.stackexchange.com/questions/1706744/using-central-limit-theorem-to-approximate?rq=1)
[0](https://math.stackexchange.com/questions/2383338/when-and-how-to-use-the-central-limit-theorem?rq=1 "Question score (upvotes - downvotes)")
[When and how to use the Central Limit Theorem?](https://math.stackexchange.com/questions/2383338/when-and-how-to-use-the-central-limit-theorem?rq=1)
[2](https://math.stackexchange.com/questions/4003372/grasping-the-practical-usage-of-central-limit-theorem?rq=1 "Question score (upvotes - downvotes)")
[Grasping the practical usage of Central Limit Theorem](https://math.stackexchange.com/questions/4003372/grasping-the-practical-usage-of-central-limit-theorem?rq=1)
[1](https://math.stackexchange.com/questions/5020712/real-world-application-of-the-central-limit-theorem-if-it-hinges-on-independentl?rq=1 "Question score (upvotes - downvotes)")
[Real-world application of the Central Limit Theorem if it hinges on independently and identically distributed variables?](https://math.stackexchange.com/questions/5020712/real-world-application-of-the-central-limit-theorem-if-it-hinges-on-independentl?rq=1)
|
| Readable Markdown | null |
| Shard | 18 (laksa) |
| Root Hash | 8045678284012640218 |
| Unparsed URL | com,stackexchange!math,/questions/12983/intuition-about-the-central-limit-theorem s443 |