ℹ️ Skipped - page is already crawled
| Filter | Status | Condition | Details |
|---|---|---|---|
| HTTP status | PASS | download_http_code = 200 | HTTP 200 |
| Age cutoff | PASS | download_stamp > now() - 6 MONTH | 0.4 months ago |
| History drop | PASS | isNull(history_drop_reason) | No drop reason |
| Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0 |
| Canonical | PASS | meta_canonical IS NULL OR = '' OR = src_unparsed | Not set |
| Property | Value |
|---|---|
| URL | https://www.chebfun.org/examples/stats/MercerKarhunenLoeve.html |
| Last Crawled | 2026-04-03 00:07:52 (11 days ago) |
| First Indexed | 2018-01-02 21:23:34 (8 years ago) |
| HTTP Status Code | 200 |
| Meta Title | Mercer's theorem and the Karhunen-Loeve expansion » Chebfun |
| Meta Description | null |
| Meta Canonical | null |
| Boilerpipe Text | plotopt = {'linewidth',2,'markersize',12};
Mercer's theorem is a continuous analog of the singular-value or eigenvalue decomposition of a symmetric positive definite matrix. One of its main applications is to find convenient ways to express stochastic processes, via the Karhunen-Loeve expansion [1].
Mercer's theorem
Suppose K(s,t) is a symmetric (that is, K(t,s) = K(s,t)), continuous, and nonnegative definite kernel function on [a,b] × [a,b]. Mercer's theorem asserts that there is an orthonormal set of eigenfunctions ψ_j(x) and eigenvalues λ_j such that
K(s,t) = ∑_{j=1}^∞ λ_j ψ_j(s) ψ_j(t),
where the values and functions satisfy the integral eigenvalue equation
λ_j ψ_j(s) = ∫_a^b K(s,t) ψ_j(t) dt.
For example, suppose we have an exponentially decaying kernel:
K = @(s,t) exp(-abs(s-t));
We can create the integral operator and find the leading terms of its Mercer decomposition numerically.
F = chebop(@(u) fred(K, u));
[Psi,Lambda] = eigs(F,20,'lm');
Psi = chebfun(Psi);
[lambda,idx] = sort(diag(Lambda),'descend');
Psi = Psi(:,idx);
plot(Psi(:,[1 2 5 10]),plotopt{:})
title('First four Mercer eigenfunctions')
xlabel('x')
ylabel('\Psi(x)')
The eigenfunctions returned by
eigs
are orthonormal.
format short
Psi(:,1:6)'*Psi(:,1:6)
ans =
1.0000 0.0000 -0.0000 -0.0000 0.0000 0.0000
0.0000 1.0000 -0.0000 -0.0000 -0.0000 0.0000
-0.0000 -0.0000 1.0000 -0.0000 -0.0000 0.0000
-0.0000 -0.0000 -0.0000 1.0000 -0.0000 -0.0000
0.0000 -0.0000 -0.0000 -0.0000 1.0000 0.0000
0.0000 0.0000 0.0000 -0.0000 0.0000 1.0000
The truncation of the Mercer sum does lead to an underestimate of the values of the kernel K(s,t). For our example, we should get K(s,s) = 1, but we get noticeably less.
Psi(0,:)*diag(lambda)*Psi(0,:)'
Psi(0.95,:)*diag(lambda)*Psi(0.95,:)'
ans =
0.9799
ans =
0.9825
In fact, the eigenvalues decrease only like O(n⁻²), which makes the pointwise convergence in the number of terms rather slow.
loglog(lambda,'.',plotopt{:}), axis tight
xlabel('n')
ylabel('| \lambda_n |')
Karhunen-Loeve expansion
Now suppose that X(t,ω) is a stochastic process for t in some interval [a,b] and ω in some probability space. The process is often characterized by its mean, μ(t), and its covariance, K(s,t), the expected value of (X(s) − μ(s))(X(t) − μ(t)). Using Mercer's theorem on K, we can express the process by the K-L expansion
X(t,ω) = μ(t) + ∑_{j=1}^∞ √λ_j ψ_j(t) Z_j(ω),
where λ_j and ψ_j are Mercer eigenmodes for K, and the Z_j are uncorrelated and of unit variance.
K-L is a generalization of the singular value decomposition of a matrix, which can be written as a sum of outer products of vectors. The covariance K plays the role of the Gram matrix of inner products (in probability) of "columns" of the process for different values of s and t. A number of SVD results have K-L analogs, most notably that the best approximation of the process results from truncating the expansion, if the eigenvalues are arranged in nonincreasing order.
Because the Z_j in the expansion are uncorrelated, the variance of X is just the sum of the eigenvalues. This is the trace of K, which is the integral of K(s,s); in this case, the result is 2. But we can also calculate the variance in a truncation of the expansion by summing only some of the eigenvalues. For example, suppose the process X has the exponential covariance K defined above. The eigenvalues show that 95% of the variance in the process is captured by the first 10 K-L modes:
captured = sum(lambda(1:10)) / 2
captured =
0.9579
We can find realizations of X by selecting the random parameters Z_j in the expansion.
Z = randn(10,400);
L = diag( sqrt(lambda(1:10)) );
X = Psi(:,1:10)*(L*Z);
plot(X(:,1:40))
mu = sum(X,2)/400;
hold on, plot(mu,'k',plotopt{:})
title('Random realizations, and the mean')
We should get roughly the original covariance function back. (We'll discretize the computation for speed.)
points = (-1:.05:1)';
[S,T] = meshgrid(points);
C = cov( X(points,:)' ); % covariance at discrete locations
clf, mesh(S,T,C)
hold on, plot3(S,T,K(S,T),'k.',plotopt{:})
If we shorten the correlation length of the process relative to the domain (i.e., more randomness), the amount of variance captured by the first 10 modes will decrease.
K = @(s,t) exp(-4*abs(s-t)); % decrease correlation faster, then...
F = chebop(@(u) fred(K, u) );
lambdaShort = sort( eigs(F,24,'lm'), 'descend' );
clf
loglog(lambda,'b.',plotopt{:})
hold on
loglog(lambdaShort,'r.',plotopt{:}), axis tight
xlabel('n')
ylabel('| \lambda_n |')
captured = sum(lambdaShort(1:10)) / 2 % ... a smaller fraction is captured
captured =
0.6744
References
D. Xiu, Numerical Methods for Stochastic Computations, Princeton University Press, 2010. |
| Markdown |
# Mercer's theorem and the Karhunen-Loeve expansion
## Toby Driscoll, December 2011 in [stats](https://www.chebfun.org/examples/stats/) · [download](https://www.chebfun.org/examples/stats/MercerKarhunenLoeve.m) · [view on GitHub](https://github.com/chebfun/examples/blob/master/stats/MercerKarhunenLoeve.m)
```
plotopt = {'linewidth',2,'markersize',12};
```
Mercer's theorem is a continuous analog of the singular-value or eigenvalue decomposition of a symmetric positive definite matrix. One of its main applications is to find convenient ways to express stochastic processes, via the Karhunen-Loeve expansion \[1\].
### Mercer's theorem
Suppose K(s,t) is a symmetric (that is, K(t,s) = K(s,t)), continuous, and nonnegative definite kernel function on [a,b] × [a,b]. Mercer's theorem asserts that there is an orthonormal set of eigenfunctions ψ_j(x) and eigenvalues λ_j such that

K(s,t) = ∑_{j=1}^∞ λ_j ψ_j(s) ψ_j(t),

where the values and functions satisfy the integral eigenvalue equation

λ_j ψ_j(s) = ∫_a^b K(s,t) ψ_j(t) dt.
For example, suppose we have an exponentially decaying kernel:
```
K = @(s,t) exp(-abs(s-t));
```
We can create the integral operator and find the leading terms of its Mercer decomposition numerically.
```
F = chebop(@(u) fred(K, u));
[Psi,Lambda] = eigs(F,20,'lm');
Psi = chebfun(Psi);
[lambda,idx] = sort(diag(Lambda),'descend');
Psi = Psi(:,idx);
```
```
plot(Psi(:,[1 2 5 10]),plotopt{:})
title('First four Mercer eigenfunctions')
xlabel('x')
ylabel('\Psi(x)')
```

The eigenfunctions returned by `eigs` are orthonormal.
```
format short
Psi(:,1:6)'*Psi(:,1:6)
```
```
ans =
1.0000 0.0000 -0.0000 -0.0000 0.0000 0.0000
0.0000 1.0000 -0.0000 -0.0000 -0.0000 0.0000
-0.0000 -0.0000 1.0000 -0.0000 -0.0000 0.0000
-0.0000 -0.0000 -0.0000 1.0000 -0.0000 -0.0000
0.0000 -0.0000 -0.0000 -0.0000 1.0000 0.0000
0.0000 0.0000 0.0000 -0.0000 0.0000 1.0000
```
The truncation of the Mercer sum does lead to an underestimate of the values of the kernel K(s,t). For our example, we should get K(s,s) = 1, but we get noticeably less.
```
Psi(0,:)*diag(lambda)*Psi(0,:)'
Psi(0.95,:)*diag(lambda)*Psi(0.95,:)'
```
```
ans =
0.9799
ans =
0.9825
```
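Both checks above — orthonormality of the eigenfunctions and the truncation underestimate of K(s,s) — have a simple discrete analog. The following NumPy sketch (an illustrative Nyström-style discretization, not part of the Chebfun example; the grid size is an arbitrary choice) samples the kernel on a grid, symmetrizes it with trapezoid quadrature weights, and eigendecomposes.

```python
import numpy as np

# Nyström-style discrete Mercer decomposition (sketch): eigenvectors of
# W^{1/2} K W^{1/2} correspond to eigenfunctions orthonormal in the
# integral sense, where W holds the quadrature weights.
n = 400
t = np.linspace(-1, 1, n)
w = np.full(n, t[1] - t[0])
w[0] *= 0.5
w[-1] *= 0.5                                  # trapezoid weights

K = np.exp(-np.abs(t[:, None] - t[None, :]))  # K(s,t) = exp(-|s-t|)
sqw = np.sqrt(w)
A = sqw[:, None] * K * sqw[None, :]           # symmetrized kernel matrix
lam, V = np.linalg.eigh(A)
lam, V = lam[::-1], V[:, ::-1]                # nonincreasing order
Psi = V / sqw[:, None]                        # eigenfunctions on the grid

# Discrete analog of Psi'*Psi: weighted Gram matrix is the identity
G = (Psi * w[:, None]).T @ Psi
print(np.allclose(G[:6, :6], np.eye(6), atol=1e-8))

# Truncating to 20 modes underestimates K(s,s) = 1, as in the text
m = 20
Ks = (Psi[:, :m] ** 2 * lam[:m]).sum(axis=1)
print(Ks[n // 2])   # close to, but below, 1
```

Scaling by √w is what makes the matrix symmetric, so `eigh` returns orthogonal eigenvectors; dividing them by √w afterward recovers eigenfunction samples whose weighted inner products are δ_ij.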
In fact, the eigenvalues decrease only like O(n⁻²), which makes the pointwise convergence in the number of terms rather slow.
```
loglog(lambda,'.',plotopt{:}), axis tight
xlabel('n')
ylabel('| \lambda_n |')
```

### Karhunen-Loeve expansion
Now suppose that X(t,ω) is a stochastic process for t in some interval [a,b] and ω in some probability space. The process is often characterized by its mean, μ(t), and its covariance, K(s,t), the expected value of (X(s) − μ(s))(X(t) − μ(t)). Using Mercer's theorem on K, we can express the process by the K-L expansion

X(t,ω) = μ(t) + ∑_{j=1}^∞ √λ_j ψ_j(t) Z_j(ω),
where λ_j and ψ_j are Mercer eigenmodes for K, and the Z_j are uncorrelated and of unit variance.
K-L is a generalization of the singular value decomposition of a matrix, which can be written as a sum of outer products of vectors. The covariance K plays the role of the Gram matrix of inner products (in probability) of "columns" of the process for different values of s and t. A number of SVD results have K-L analogs, most notably that the best approximation of the process results from truncating the expansion, if the eigenvalues are arranged in nonincreasing order.
Because the Z_j in the expansion are uncorrelated, the variance of X is just the sum of the eigenvalues. This is the trace of K, which is the integral of K(s,s); in this case, the result is 2. But we can also calculate the variance in a truncation of the expansion by summing only some of the eigenvalues. For example, suppose the process X has the exponential covariance K defined above. The eigenvalues show that 95% of the variance in the process is captured by the first 10 K-L modes:
```
captured = sum(lambda(1:10)) / 2
```
```
captured =
0.9579
```
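The 95% figure can be reproduced in a discrete analog (a NumPy sketch, not the Chebfun code): since K(s,s) = 1, the trace integral is just the sum of the quadrature weights, and the first ten eigenvalues carry most of it.

```python
import numpy as np

# Fraction of total variance captured by the first 10 discrete K-L modes.
n = 800
t = np.linspace(-1, 1, n)
w = np.full(n, t[1] - t[0])
w[0] *= 0.5
w[-1] *= 0.5
sqw = np.sqrt(w)
K = np.exp(-np.abs(t[:, None] - t[None, :]))
lam = np.linalg.eigvalsh(sqw[:, None] * K * sqw[None, :])[::-1]

total = w.sum()                     # integral of K(s,s) = 1 over [-1,1], i.e. 2
captured = lam[:10].sum() / total
print(round(captured, 4))           # about 0.96, matching the text
```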
We can find realizations of X by selecting the random parameters Z_j in the expansion.
```
Z = randn(10,400);
L = diag( sqrt(lambda(1:10)) );
X = Psi(:,1:10)*(L*Z);
plot(X(:,1:40))
mu = sum(X,2)/400;
hold on, plot(mu,'k',plotopt{:})
title('Random realizations, and the mean')
```

We should get roughly the original covariance function back. (We'll discretize the computation for speed.)
```
points = (-1:.05:1)';
[S,T] = meshgrid(points);
C = cov( X(points,:)' ); % covariance at discrete locations
clf, mesh(S,T,C)
hold on, plot3(S,T,K(S,T),'k.',plotopt{:})
```

If we shorten the correlation length of the process relative to the domain (i.e., more randomness), the amount of variance captured by the first 10 modes will decrease.
```
K = @(s,t) exp(-4*abs(s-t)); % decrease correlation faster, then...
F = chebop(@(u) fred(K, u) );
lambdaShort = sort( eigs(F,24,'lm'), 'descend' );
```
```
clf
loglog(lambda,'b.',plotopt{:})
hold on
loglog(lambdaShort,'r.',plotopt{:}), axis tight
xlabel('n')
ylabel('| \lambda_n |')
captured = sum(lambdaShort(1:10)) / 2 % ... a smaller fraction is captured
```
```
captured =
0.6744
```

### References
1. D. Xiu, *Numerical Methods for Stochastic Computations*, Princeton University Press, 2010.
© Copyright 2025 the University of Oxford and the Chebfun Developers. |
| Shard | 99 (laksa) |
| Root Hash | 2083995644193301499 |
| Unparsed URL | org,chebfun!www,/examples/stats/MercerKarhunenLoeve.html s443 |