ℹ️ Skipped - page is already crawled
| Filter | Status | Condition | Details |
|---|---|---|---|
| HTTP status | PASS | download_http_code = 200 | HTTP 200 |
| Age cutoff | PASS | download_stamp > now() - 6 MONTH | 0.8 months ago (distributed domain, exempt) |
| History drop | PASS | isNull(history_drop_reason) | No drop reason |
| Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0 |
| Canonical | PASS | meta_canonical IS NULL OR = '' OR = src_unparsed | Not set |
| Property | Value |
|---|---|
| URL | https://en.wikipedia.org/wiki/Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem |
| Last Crawled | 2026-03-25 16:45:48 (22 days ago) |
| First Indexed | 2021-12-07 08:50:09 (4 years ago) |
| HTTP Status Code | 200 |
| Meta Title | Kosambi–Karhunen–Loève theorem - Wikipedia |
| Meta Description | null |
| Meta Canonical | null |
| Boilerpipe Text | In the theory of stochastic processes, the Karhunen–Loève theorem (named after Kari Karhunen and Michel Loève), also known as the Kosambi–Karhunen–Loève theorem,[1][2] states that a stochastic process can be represented as an infinite linear combination of orthogonal functions, analogous to a Fourier series representation of a function on a bounded interval. The transformation is also known as the Hotelling transform and the eigenvector transform, and is closely related to the principal component analysis (PCA) technique widely used in image processing and in data analysis in many fields.[3]
There exist many such expansions of a stochastic process: if the process is indexed over [a, b], any orthonormal basis of L²([a, b]) yields an expansion thereof in that form. The importance of the Karhunen–Loève theorem is that it yields the best such basis in the sense that it minimizes the total mean squared error.
In contrast to a Fourier series, where the coefficients are fixed numbers and the expansion basis consists of sinusoidal functions (that is, sine and cosine functions), the coefficients in the Karhunen–Loève theorem are random variables and the expansion basis depends on the process. In fact, the orthogonal basis functions used in this representation are determined by the covariance function of the process. One can think of the Karhunen–Loève transform as adapting to the process in order to produce the best possible basis for its expansion.
In the case of a centered stochastic process {X_t}_{t ∈ [a, b]} (centered means E[X_t] = 0 for all t ∈ [a, b]) satisfying a technical continuity condition, X admits a decomposition

$$X_t = \sum_{k=1}^{\infty} Z_k e_k(t)$$

where Z_k are pairwise uncorrelated random variables and the functions e_k are continuous real-valued functions on [a, b] that are pairwise orthogonal in L²([a, b]). It is therefore sometimes said that the expansion is bi-orthogonal, since the random coefficients Z_k are orthogonal in the probability space while the deterministic functions e_k are orthogonal in the time domain. The general case of a process X_t that is not centered can be brought back to the case of a centered process by considering X_t − E[X_t], which is a centered process.
Moreover, if the process is Gaussian, then the random variables Z_k are Gaussian and stochastically independent. This result generalizes the Karhunen–Loève transform. An important example of a centered real stochastic process on [0, 1] is the Wiener process; the Karhunen–Loève theorem can be used to provide a canonical orthogonal representation for it. In this case the expansion consists of sinusoidal functions.
The above expansion into uncorrelated random variables is also known as the Karhunen–Loève expansion or Karhunen–Loève decomposition. The empirical version (i.e., with the coefficients computed from a sample) is known as the Karhunen–Loève transform (KLT), principal component analysis, proper orthogonal decomposition (POD), empirical orthogonal functions (a term used in meteorology and geophysics), or the Hotelling transform.
Throughout this article, we will consider a random process X_t defined over a probability space (Ω, F, P) and indexed over a closed interval [a, b], which is square-integrable, has zero mean, and has covariance function K_X(s, t). In other words, we have:

$$\forall t \in [a,b] \qquad X_t \in L^2(\Omega, F, \mathbf{P}), \quad \text{i.e. } \mathbf{E}[X_t^2] < \infty,$$

$$\forall t \in [a,b] \qquad \mathbf{E}[X_t] = 0,$$

$$\forall t,s \in [a,b] \qquad K_X(s,t) = \mathbf{E}[X_s X_t].$$

The square-integrable condition E[X_t²] < ∞ is logically equivalent to K_X(s,t) being finite for all s, t ∈ [a, b].[4]

We associate to K_X a linear operator (more specifically a Hilbert–Schmidt integral operator) T_{K_X} defined in the following way:

$$T_{K_X}\colon \left\{\begin{aligned} L^2([a,b]) &\to L^2([a,b]) \\ f &\mapsto T_{K_X} f = \int_a^b K_X(s,\cdot)\,f(s)\,ds \end{aligned}\right.$$

Since T_{K_X} is a linear endomorphism, it makes sense to talk about its eigenvalues λ_k and eigenfunctions e_k, which are found by solving the homogeneous Fredholm integral equation of the second kind

$$\int_a^b K_X(s,t)\,e_k(s)\,ds = \lambda_k e_k(t).$$
Statement of the theorem

Theorem. Let X_t be a zero-mean square-integrable stochastic process defined over a probability space (Ω, F, P) and indexed over a closed and bounded interval [a, b], with continuous covariance function K_X(s, t).

Then K_X(s, t) is a Mercer kernel and, letting e_k be an orthonormal basis of L²([a, b]) formed by the eigenfunctions of T_{K_X} with respective eigenvalues λ_k, X_t admits the following representation

$$X_t = \sum_{k=1}^{\infty} Z_k e_k(t)$$

where the convergence is in L², uniform in t, and

$$Z_k = \int_a^b X_t e_k(t)\,dt.$$

Furthermore, the random variables Z_k have zero mean, are uncorrelated, and have variance λ_k:

$$\mathbf{E}[Z_k] = 0, \ \forall k \in \mathbb{N} \qquad \text{and} \qquad \mathbf{E}[Z_i Z_j] = \delta_{ij}\lambda_j, \ \forall i,j \in \mathbb{N}.$$

Note that by generalizations of Mercer's theorem we can replace the interval [a, b] with other compact spaces C, and the Lebesgue measure on [a, b] with a Borel measure whose support is C.
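The theorem can be checked numerically by discretizing the covariance kernel on a time grid, eigendecomposing the resulting matrix, and projecting simulated sample paths onto the eigenvectors: the empirical coefficients should come out nearly uncorrelated, with variances close to the eigenvalues. A minimal sketch in Python (the exponential kernel, grid size, and sample count are illustrative assumptions, not taken from the article):

```python
import numpy as np

# Discretize a covariance kernel K_X(s, t) = exp(-|s - t|) on [0, 1]
# (illustrative choice; any continuous covariance function works).
n, a, b = 200, 0.0, 1.0
t = np.linspace(a, b, n)
dt = (b - a) / (n - 1)
K = np.exp(-np.abs(t[:, None] - t[None, :]))

# Eigendecomposition of the discretized operator T_{K_X}:
# eigenvalues of K*dt approximate lambda_k, columns approximate e_k(t).
evals, evecs = np.linalg.eigh(K * dt)
evals, evecs = evals[::-1], evecs[:, ::-1]   # decreasing order
evecs /= np.sqrt(dt)                         # L2([a, b]) normalization

# Simulate centered Gaussian paths with covariance K and project:
# Z_k = integral of X_t e_k(t) dt, approximated by a Riemann sum.
rng = np.random.default_rng(0)
X = rng.multivariate_normal(np.zeros(n), K, size=5000)
Z = X @ evecs * dt

# Empirical check: E[Z_i Z_j] should be close to delta_ij * lambda_j.
C = (Z.T @ Z) / len(Z)
off = C[:5, :5] - np.diag(np.diag(C[:5, :5]))
print(np.allclose(np.diag(C)[:5], evals[:5], rtol=0.1))  # variances ~ lambda_k
print(np.max(np.abs(off)) < 0.05)                        # cross terms ~ 0
```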
Proof. The covariance function K_X satisfies the definition of a Mercer kernel. By Mercer's theorem, there consequently exists a set {λ_k, e_k(t)} of eigenvalues and eigenfunctions of T_{K_X} forming an orthonormal basis of L²([a, b]), and K_X can be expressed as

$$K_X(s,t) = \sum_{k=1}^{\infty} \lambda_k e_k(s) e_k(t).$$

The process X_t can be expanded in terms of the eigenfunctions e_k as

$$X_t = \sum_{k=1}^{\infty} Z_k e_k(t),$$

where the coefficients (random variables) Z_k are given by the projection of X_t on the respective eigenfunctions

$$Z_k = \int_a^b X_t e_k(t)\,dt.$$

We may then derive

$$\begin{aligned}\mathbf{E}[Z_k] &= \mathbf{E}\left[\int_a^b X_t e_k(t)\,dt\right] = \int_a^b \mathbf{E}[X_t]\,e_k(t)\,dt = 0 \\ \mathbf{E}[Z_i Z_j] &= \mathbf{E}\left[\int_a^b \int_a^b X_t X_s e_j(t) e_i(s)\,dt\,ds\right] = \int_a^b \int_a^b K_X(s,t)\,e_j(t)\,e_i(s)\,dt\,ds \\ &= \int_a^b e_i(s)\left(\int_a^b K_X(s,t)\,e_j(t)\,dt\right)ds = \lambda_j \int_a^b e_i(s)\,e_j(s)\,ds = \delta_{ij}\lambda_j, \end{aligned}$$

where we have used the fact that the e_k are eigenfunctions of T_{K_X} and are orthonormal.

Let us now show that the convergence is in L². Let

$$S_N = \sum_{k=1}^{N} Z_k e_k(t).$$

Then:

$$\begin{aligned}\mathbf{E}\left[\left|X_t - S_N\right|^2\right] &= \mathbf{E}[X_t^2] + \mathbf{E}[S_N^2] - 2\,\mathbf{E}[X_t S_N] \\ &= K_X(t,t) + \sum_{k=1}^{N} \lambda_k e_k(t)^2 - 2\,\mathbf{E}\left[\sum_{k=1}^{N} \int_a^b X_t X_s e_k(s) e_k(t)\,ds\right] \\ &= K_X(t,t) - \sum_{k=1}^{N} \lambda_k e_k(t)^2, \end{aligned}$$

which goes to 0 by Mercer's theorem.
Properties of the Karhunen–Loève transform

Special case: Gaussian distribution

Since the limit in the mean of jointly Gaussian random variables is jointly Gaussian, and jointly Gaussian random (centered) variables are independent if and only if they are orthogonal, we can also conclude:

Theorem. The variables Z_i have a joint Gaussian distribution and are stochastically independent if the original process {X_t}_t is Gaussian.

In the Gaussian case, since the variables Z_i are independent, we can say more:

$$\lim_{N \to \infty} \sum_{i=1}^{N} e_i(t)\,Z_i(\omega) = X_t(\omega)$$

almost surely.

The Karhunen–Loève transform decorrelates the process

This is a consequence of the independence of the Z_k.

The Karhunen–Loève expansion minimizes the total mean square error

In the introduction, we mentioned that the truncated Karhunen–Loève expansion is the best approximation of the original process, in the sense that it minimizes the total mean-square error resulting from its truncation. Because of this property, it is often said that the KL transform optimally compacts the energy.
More specifically, given any orthonormal basis {f_k} of L²([a, b]), we may decompose the process X_t as

$$X_t(\omega) = \sum_{k=1}^{\infty} A_k(\omega) f_k(t),$$

where

$$A_k(\omega) = \int_a^b X_t(\omega) f_k(t)\,dt,$$

and we may approximate X_t by the finite sum

$$\hat{X}_t(\omega) = \sum_{k=1}^{N} A_k(\omega) f_k(t)$$

for some integer N.

Claim. Of all such approximations, the KL approximation is the one that minimizes the total mean square error (provided we have arranged the eigenvalues in decreasing order).

Proof. Consider the error resulting from the truncation at the N-th term in the following orthonormal expansion:

$$\varepsilon_N(t) = \sum_{k=N+1}^{\infty} A_k(\omega) f_k(t).$$

The mean-square error ε_N²(t) can be written as:

$$\varepsilon_N^2(t) = \sum_{i=N+1}^{\infty} \sum_{j=N+1}^{\infty} f_i(t) f_j(t) \int_a^b \int_a^b K_X(s,t)\,f_i(t)\,f_j(s)\,ds\,dt.$$

We then integrate this last equality over [a, b]. The orthonormality of the f_k yields:

$$\int_a^b \varepsilon_N^2(t)\,dt = \sum_{k=N+1}^{\infty} \int_a^b \int_a^b K_X(s,t)\,f_k(t)\,f_k(s)\,ds\,dt.$$

The problem of minimizing the total mean-square error thus comes down to minimizing the right-hand side of this equality subject to the constraint that the f_k be normalized. We hence introduce β_k, the Lagrangian multipliers associated with these constraints, and aim at minimizing the following function:

$$Er[f_k(t), k \in \{N+1,\ldots\}] = \sum_{k=N+1}^{\infty} \int_a^b \int_a^b K_X(s,t)\,f_k(t)\,f_k(s)\,ds\,dt - \beta_k\left(\int_a^b f_k(t)^2\,dt - 1\right).$$

Differentiating with respect to f_i(t) (this is a functional derivative) and setting the derivative to 0 yields:

$$\int_a^b K_X(s,t)\,f_i(t)\,dt = \beta_i f_i(s),$$

which is satisfied in particular when

$$\int_a^b K_X(s,t)\,e_k(t)\,dt = \lambda_k e_k(s).$$

In other words, when the f_k are chosen to be the eigenfunctions of T_{K_X}, hence resulting in the KL expansion.

Explained variance. An important observation is that, since the random coefficients Z_k of the KL expansion are uncorrelated, the Bienaymé formula asserts that the variance of X_t is simply the sum of the variances of the individual components of the sum:

$$\operatorname{var}[X_t] = \sum_{k=1}^{\infty} e_k(t)^2 \operatorname{var}[Z_k] = \sum_{k=1}^{\infty} \lambda_k e_k(t)^2.$$

Integrating over [a, b] and using the orthonormality of the e_k, we obtain that the total variance of the process is

$$\int_a^b \operatorname{var}[X_t]\,dt = \sum_{k=1}^{\infty} \lambda_k.$$

In particular, the total variance of the N-truncated approximation is $\sum_{k=1}^{N} \lambda_k$. As a result, the N-truncated expansion explains

$$\frac{\sum_{k=1}^{N} \lambda_k}{\sum_{k=1}^{\infty} \lambda_k}$$

of the variance; and if we are content with an approximation that explains, say, 95% of the variance, then we just have to determine an N such that

$$\frac{\sum_{k=1}^{N} \lambda_k}{\sum_{k=1}^{\infty} \lambda_k} \geq 0.95.$$
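In practice this criterion is just a cumulative sum over the eigenvalues sorted in decreasing order. A small sketch (the helper name `truncation_order` and the Wiener-kernel eigenvalues used as an example are our own illustrative choices):

```python
import numpy as np

def truncation_order(eigenvalues, alpha=0.95):
    """Smallest N such that the first N eigenvalues explain a fraction
    alpha of the total variance (eigenvalues in decreasing order, as in
    the KL/PCA convention)."""
    lam = np.sort(np.asarray(eigenvalues))[::-1]
    explained = np.cumsum(lam) / lam.sum()
    return int(np.searchsorted(explained, alpha) + 1)

# Example: eigenvalues of the Wiener process, lambda_k = ((k - 1/2) pi)^-2.
k = np.arange(1, 1000)
lam = 1.0 / ((k - 0.5) * np.pi) ** 2
print(truncation_order(lam, 0.95))   # a handful of modes suffice
```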
The Karhunen–Loève expansion has the minimum representation entropy property

Given a representation $X_t = \sum_k W_k \varphi_k(t)$, for some orthonormal basis $\{\varphi_k\}$ and random $W_k$, we let $p_k = \mathbf{E}[|W_k|^2] / \mathbf{E}[\|X_t\|^2_{L^2}]$, so that $\sum_k p_k = 1$. We may then define the representation entropy to be $H(\{\varphi_k\}) = -\sum_k p_k \log p_k$. Then we have $H(\{\varphi_k\}) \geq H(\{e_k\})$, for all choices of $\{\varphi_k\}$. That is, the KL expansion has minimal representation entropy.

Proof: Denote the coefficients obtained for the basis $\{e_k\}$ as $p_k$, and for $\{\varphi_k\}$ as $q_k$. Choose $N \geq 1$. Note that since $\{e_k\}$ minimizes the mean squared error, we have

$$\mathbf{E}\left\|\sum_{k=1}^{N} Z_k e_k(t) - X_t\right\|^2 \leq \mathbf{E}\left\|\sum_{k=1}^{N} W_k \varphi_k(t) - X_t\right\|^2.$$

Expanding the right-hand side, using the orthonormality of $\{\varphi_k\}$ and expanding $X_t$ in the $\{\varphi_k\}$ basis, we get that the right-hand side is equal to

$$\mathbf{E}\|X_t\|^2 - \sum_{k=1}^{N} \mathbf{E}|W_k|^2.$$

We may perform an identical analysis for the $\{e_k\}$, and so rewrite the above inequality as

$$\mathbf{E}\|X_t\|^2 - \sum_{k=1}^{N} \mathbf{E}|Z_k|^2 \leq \mathbf{E}\|X_t\|^2 - \sum_{k=1}^{N} \mathbf{E}|W_k|^2.$$

Subtracting the common first term, and dividing by $\mathbf{E}\|X_t\|^2$, we obtain that

$$\sum_{k=1}^{N} p_k \geq \sum_{k=1}^{N} q_k.$$

This implies that

$$-\sum_{k=1}^{\infty} p_k \log p_k \leq -\sum_{k=1}^{\infty} q_k \log q_k,$$

since the partial sums of the $p_k$ dominate those of the $q_k$ for every N and entropy decreases under such majorization.
Linear KarhunenâLoève approximations
[
edit
]
Consider a whole class of signals we want to approximate over the first
M
vectors of a basis. These signals are modeled as realizations of a random vector
Y
[
n
]
of size
N
. To optimize the approximation we design a basis that minimizes the average
approximation error
. This section proves that optimal bases are KarhunenâLoeve bases that diagonalize the covariance matrix of
Y
. The random vector
Y
can be decomposed in an orthogonal basis
as follows:
where each
is a random variable. The approximation from the first
M
â¤
N
vectors of the basis is
The energy conservation in an orthogonal basis implies
This error is related to the covariance of
Y
defined by
For any vector
x
[
n
]
we denote by
K
the
covariance operator
represented by this matrix,
The error
Îľ
[
M
]
is therefore a sum of the last
N
â
M
coefficients of the covariance operator
The covariance operator
K
is Hermitian and Positive and is thus diagonalized in an orthogonal basis called a KarhunenâLoève basis. The following theorem states that a KarhunenâLoève basis is optimal for linear approximations.
Theorem (Optimality of KarhunenâLoève basis).
Let
K
be a covariance operator. For all
M
⼠1
, the approximation error
is minimum if and only if
is a KarhunenâLoeve basis ordered by decreasing eigenvalues.
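A quick numerical experiment consistent with this theorem: build a covariance matrix and compare the expected linear M-term error in its eigenbasis against the same error in the canonical basis; the Karhunen–Loève basis should never do worse. A sketch under assumed parameters (Gaussian covariance, N = 64, M = 8):

```python
import numpy as np

N, M = 64, 8

# A smooth stationary covariance matrix (illustrative choice).
n = np.arange(N)
K = np.exp(-0.5 * ((n[:, None] - n[None, :]) / 4.0) ** 2)

def linear_error(K, basis, M):
    """Expected error of keeping the M most energetic basis vectors:
    eps[M] = sum over discarded m of <K g_m, g_m>."""
    coords = np.einsum("ij,jm,im->m", K, basis, basis)  # <K g_m, g_m>
    return np.sort(coords)[::-1][M:].sum()

evals, evecs = np.linalg.eigh(K)
kl_basis = evecs[:, ::-1]     # KL basis, decreasing eigenvalues
canonical = np.eye(N)         # Dirac basis

print(linear_error(K, kl_basis, M) <= linear_error(K, canonical, M) + 1e-9)
```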
Non-Linear approximation in bases
[
edit
]
Linear approximations project the signal on
M
vectors a priori. The approximation can be made more precise by choosing the
M
orthogonal vectors depending on the signal properties. This section analyzes the general performance of these non-linear approximations. A signal
is approximated with M vectors selected adaptively in an orthonormal basis for
[
definition needed
]
Let
be the projection of f over M vectors whose indices are in
I
M
:
The approximation error is the sum of the remaining coefficients
To minimize this error, the indices in
I
M
must correspond to the M vectors having the largest inner product amplitude
These are the vectors that best correlate f. They can thus be interpreted as the main features of f. The resulting error is necessarily smaller than the error of a
linear approximation
which selects the M approximation vectors independently of f. Let us sort
in decreasing order
The best non-linear approximation is
It can also be written as inner product thresholding:
with
The non-linear error is
this error goes quickly to zero as M increases, if the sorted values of
have a fast decay as k increases. This decay is quantified by computing the
norm of the signal inner products in B:
The following theorem relates the decay of
Îľ
[
M
]
to
Theorem (decay of error).
If
with
p
< 2
then
and
Conversely, if
then
for any
q
>
p
.
Non-optimality of KarhunenâLoève bases
[
edit
]
To further illustrate the differences between linear and non-linear approximations, we study the decomposition of a simple non-Gaussian random vector in a KarhunenâLoève basis. Processes whose realizations have a random translation are stationary. The KarhunenâLoève basis is then a Fourier basis and we study its performance. To simplify the analysis, consider a random vector
Y
[
n
] of size
N
that is random shift modulo
N
of a deterministic signal
f
[
n
] of zero mean
The random shift
P
is uniformly distributed on [0,Â
N
 â 1]:
Clearly
and
Hence
Since R
Y
is N periodic, Y is a circular stationary random vector. The covariance operator is a
circular convolution
with R
Y
and is therefore diagonalized in the discrete Fourier KarhunenâLoève basis
The power spectrum is Fourier transform of
R
Y
:
Example:
Consider an extreme case where
. A theorem stated above guarantees that the Fourier KarhunenâLoève basis produces a smaller expected approximation error than a canonical basis of Diracs
. Indeed, we do not know a priori the abscissa of the non-zero coefficients of
Y
, so there is no particular Dirac that is better adapted to perform the approximation. But the Fourier vectors cover the whole support of Y and thus absorb a part of the signal energy.
Selecting higher frequency Fourier coefficients yields a better mean-square approximation than choosing a priori a few Dirac vectors to perform the approximation. The situation is totally different for non-linear approximations. If
then the discrete Fourier basis is extremely inefficient because f and hence Y have an energy that is almost uniformly spread among all Fourier vectors. In contrast, since f has only two non-zero coefficients in the Dirac basis, a non-linear approximation of Y with
M
⼠2
gives zero error.
[
5
]
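This contrast is easy to reproduce numerically: for a randomly shifted two-spike signal, a non-linear two-term approximation in the Dirac basis is exact, while the first two Fourier modes capture only a small fraction of the expected energy. A sketch (the choice f = δ[n] − δ[n−1] follows the example above; the sample count is arbitrary):

```python
import numpy as np

N = 64
f = np.zeros(N); f[0], f[1] = 1.0, -1.0   # two non-zero Dirac coefficients
rng = np.random.default_rng(2)

# Realizations of Y: random circular shifts of f (a stationary vector).
Y = np.stack([np.roll(f, rng.integers(N)) for _ in range(2000)])

def nonlinear_error(Y, M):
    """Mean energy discarded when each realization keeps only its
    M largest-magnitude coefficients in the Dirac basis."""
    idx = np.argsort(np.abs(Y), axis=1)[:, :-M]        # indices to discard
    E = np.take_along_axis(Y, idx, axis=1)
    return (E ** 2).sum(axis=1).mean()

print(nonlinear_error(Y, 2))   # 0.0: two Dirac vectors capture Y exactly

# Linear M = 2 approximation in the Fourier (KL) basis spreads the energy:
F = np.fft.fft(Y, axis=1) / np.sqrt(N)
energy = (np.abs(F) ** 2).mean(axis=0)   # expected energy per Fourier mode
print(np.sort(energy)[::-1][:2].sum() / energy.sum())  # small fraction kept
```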
Principal component analysis

We have established the Karhunen–Loève theorem and derived a few properties thereof. We also noted that one hurdle in its application is the numerical cost of determining the eigenvalues and eigenfunctions of its covariance operator through the Fredholm integral equation of the second kind

$$\int_a^b K_X(s,t)\,e_k(s)\,ds = \lambda_k e_k(t).$$

However, when applied to a discrete and finite process $(X_1, \ldots, X_N)$, the problem takes a much simpler form and standard algebra can be used to carry out the calculations. Note that a continuous process can also be sampled at N points in time in order to reduce the problem to a finite version.

We henceforth consider a random N-dimensional vector $X = (X_1\ \cdots\ X_N)^T$. As mentioned above, X could contain N samples of a signal, but it can hold many more representations depending on the field of application. For instance, it could be the answers to a survey or economic data in an econometrics analysis.

As in the continuous version, we assume that X is centered; otherwise we can let $X := X - \mu_X$ (where $\mu_X$ is the mean vector of X), which is centered.

Let us adapt the procedure to the discrete case. Recall that the main implication and difficulty of the KL transformation is computing the eigenvectors of the linear operator associated to the covariance function, which are given by the solutions to the integral equation written above.

Define Σ, the covariance matrix of X, as an N × N matrix whose elements are given by

$$\Sigma_{ij} = \mathbf{E}[X_i X_j], \qquad \forall i,j \in \{1, \ldots, N\}.$$

Rewriting the above integral equation to suit the discrete case, we observe that it turns into

$$\Sigma \varphi = \lambda \varphi,$$

where $\varphi$ is an N-dimensional vector. The integral equation thus reduces to a simple matrix eigenvalue problem, which explains why PCA has such a broad domain of applications.

Since Σ is a positive-definite symmetric matrix, it possesses a set of orthonormal eigenvectors forming a basis of $\mathbb{R}^N$, and we write $\{\lambda_i, \varphi_i\}_{i \in \{1,\ldots,N\}}$ for this set of eigenvalues and corresponding eigenvectors, listed in decreasing values of λ_i. Let also Φ be the orthonormal matrix consisting of these eigenvectors:

$$\Phi = (\varphi_1\ \varphi_2\ \cdots\ \varphi_N), \qquad \Phi^T \Phi = \mathbf{I}.$$
Principal component transform

It remains to perform the actual KL transformation, called the principal component transform in this case. Recall that the transform was found by expanding the process with respect to the basis spanned by the eigenvectors of the covariance function. In this case, we hence have

$$X = \sum_{i=1}^{N} \langle \varphi_i, X \rangle\, \varphi_i.$$

In a more compact form, the principal component transform of X is defined by

$$Y = \Phi^T X.$$

The i-th component of Y is $Y_i = \varphi_i^T X$, the projection of X on $\varphi_i$, and the inverse transform X = ΦY yields the expansion of X on the space spanned by the $\varphi_i$:

$$X = \sum_{i=1}^{N} Y_i \varphi_i.$$

As in the continuous case, we may reduce the dimensionality of the problem by truncating the sum at some $K \in \{1, \ldots, N\}$ such that

$$\frac{\sum_{i=1}^{K} \lambda_i}{\sum_{i=1}^{N} \lambda_i} \geq \alpha,$$

where α is the explained-variance threshold we wish to set. We can also reduce the dimensionality through the use of multilevel dominant eigenvector estimation (MDEE).[6]
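Expressed in code, the discrete pipeline above is short: center the data, estimate Σ, eigendecompose, and project. A minimal sketch on synthetic data (all variable names and sizes are our own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)
# 1000 observations of an N = 5 dimensional vector with correlated entries.
A = rng.normal(size=(5, 5))
X = rng.normal(size=(1000, 5)) @ A.T

Xc = X - X.mean(axis=0)                  # center: X <- X - mu_X
Sigma = (Xc.T @ Xc) / (len(Xc) - 1)      # covariance matrix estimate

evals, Phi = np.linalg.eigh(Sigma)       # Sigma = Phi diag(lambda) Phi^T
evals, Phi = evals[::-1], Phi[:, ::-1]   # decreasing eigenvalue order

Y = Xc @ Phi          # principal component transform, Y = Phi^T X per row
X_rec = Y @ Phi.T     # inverse transform X = Phi Y

print(np.allclose(X_rec, Xc))                    # lossless with all N modes
print(np.allclose(np.cov(Y.T), np.diag(evals)))  # components decorrelated
```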
Examples

The Wiener process

There are numerous equivalent characterizations of the Wiener process, which is a mathematical formalization of Brownian motion. Here we regard it as the centered standard Gaussian process W_t with covariance function

$$K_W(t,s) = \operatorname{cov}(W_t, W_s) = \min(s, t).$$

We restrict the time domain to [a, b] = [0, 1] without loss of generality.

The eigenvectors of the covariance kernel are easily determined. These are

$$e_k(t) = \sqrt{2}\,\sin\!\left(\left(k - \tfrac{1}{2}\right)\pi t\right),$$

and the corresponding eigenvalues are

$$\lambda_k = \frac{1}{\left(k - \frac{1}{2}\right)^2 \pi^2}.$$

Proof. In order to find the eigenvalues and eigenvectors, we need to solve the integral equation

$$\int_0^1 \min(s,t)\,e(s)\,ds = \lambda e(t);$$

differentiating once with respect to t yields

$$\int_t^1 e(s)\,ds = \lambda e'(t);$$

a second differentiation produces the following differential equation:

$$-e(t) = \lambda e''(t).$$

The general solution of which has the form

$$e(t) = A \sin\!\left(\frac{t}{\sqrt{\lambda}}\right) + B \cos\!\left(\frac{t}{\sqrt{\lambda}}\right),$$

where A and B are two constants to be determined from the boundary conditions. Setting t = 0 in the initial integral equation gives e(0) = 0, which implies that B = 0; similarly, setting t = 1 in the first differentiation yields e′(1) = 0, whence

$$\cos\!\left(\frac{1}{\sqrt{\lambda}}\right) = 0,$$

which in turn implies that the eigenvalues of T_{K_X} are

$$\lambda_k = \left(\frac{1}{\left(k - \frac{1}{2}\right)\pi}\right)^2.$$

The corresponding eigenfunctions are thus of the form

$$e_k(t) = A \sin\!\left(\left(k - \tfrac{1}{2}\right)\pi t\right),$$

and A is then chosen so as to normalize e_k:

$$\int_0^1 e_k^2(t)\,dt = 1 \quad \Longrightarrow \quad A = \sqrt{2}.$$

This gives the following representation of the Wiener process:

Theorem. There is a sequence {Z_i}_i of independent Gaussian random variables with mean zero and variance 1 such that

$$W_t = \sqrt{2} \sum_{k=1}^{\infty} Z_k \frac{\sin\!\left(\left(k - \frac{1}{2}\right)\pi t\right)}{\left(k - \frac{1}{2}\right)\pi}.$$

Note that this representation is only valid for t ∈ [0, 1]. On larger intervals, the increments are not independent. As stated in the theorem, convergence is in the L² norm and uniform in t.
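This representation gives a direct way to simulate Brownian paths on [0, 1]: draw i.i.d. standard normals Z_k and evaluate the truncated sine series. A sketch (the truncation level is an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.linspace(0.0, 1.0, 501)
K_terms = 1000                      # truncation level (illustrative)

k = np.arange(1, K_terms + 1)
phase = (k - 0.5) * np.pi
Z = rng.standard_normal(K_terms)    # i.i.d. N(0, 1) coefficients

# W_t = sqrt(2) * sum_k Z_k sin((k - 1/2) pi t) / ((k - 1/2) pi)
W = np.sqrt(2) * np.sin(np.outer(t, phase)) @ (Z / phase)

# Analytic sanity check: Var(W_1) = 2 * sum_k ((k - 1/2) pi)^-2 -> min(1, 1) = 1.
print(2 * (np.sin(phase) ** 2 / phase ** 2).sum())   # ~ 0.9998
```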
The Brownian bridge

Similarly, the Brownian bridge, which is a stochastic process with covariance function

$$K_B(t,s) = \min(t,s) - ts,$$

can be represented as the series

$$B_t = \sum_{k=1}^{\infty} Z_k \frac{\sqrt{2}\,\sin(k\pi t)}{k\pi}.$$
Applications

Adaptive optics systems sometimes use K–L functions to reconstruct wave-front phase information (Dai 1996, JOSA A). The Karhunen–Loève expansion is closely related to the singular value decomposition. The latter has myriad applications in image processing, radar, seismology, and the like. If one has independent vector observations from a vector-valued stochastic process, then the left singular vectors are maximum likelihood estimates of the ensemble KL expansion.
Applications in signal estimation and detection

Detection of a known continuous signal S(t)

In communication, we usually have to decide whether a signal from a noisy channel contains valuable information. The following hypothesis testing is used for detecting a continuous signal s(t) from the channel output X(t), where N(t) is the channel noise, usually assumed to be a zero-mean Gaussian process with correlation function $R_N(t,s) = E[N(t)N(s)]$:

$$H: X(t) = N(t),$$
$$K: X(t) = N(t) + s(t), \qquad t \in (0, T).$$

Signal detection in white noise

When the channel noise is white, its correlation function is

$$R_N(t) = \tfrac{1}{2} N_0 \delta(t),$$

and it has a constant power spectral density. In a physically practical channel, the noise power is finite, so:

$$S_N(f) = \begin{cases} \frac{N_0}{2}, & |f| < w, \\ 0, & |f| > w. \end{cases}$$

Then the noise correlation function is a sinc function with zeros at multiples of $\frac{1}{2w}$. Since samples taken at these spacings are uncorrelated and Gaussian, they are independent. Thus we can take samples from X(t) with time spacing $\Delta t = \frac{1}{2w}$ within (0, T). Let $X_i = X(i\,\Delta t)$. We have a total of $n = \frac{T}{\Delta t} = 2wT$ i.i.d. observations $\{X_1, \ldots, X_n\}$ to develop the likelihood-ratio test. Define the signal samples $S_i = S(i\,\Delta t)$; the problem becomes:

$$H: X_i = N_i,$$
$$K: X_i = N_i + S_i, \qquad i = 1, \ldots, n.$$

The log-likelihood ratio reduces, as Δt → 0, to the statistic

$$G = \int_0^T S(t)\,x(t)\,dt.$$

Then G is the test statistic and the Neyman–Pearson optimum detector is

$$G(x) > G_0 \Rightarrow K; \qquad G(x) < G_0 \Rightarrow H.$$

As G is Gaussian, we can characterize it by finding its mean and variance:

$$\mathbf{E}[G \mid H] = 0, \qquad \mathbf{E}[G \mid K] = \int_0^T S^2(t)\,dt \equiv E, \qquad \operatorname{var}[G] = \frac{N_0}{2} E,$$

where E is the signal energy. The false alarm error is

$$\alpha = \Pr(G > G_0 \mid H) \quad \Rightarrow \quad G_0 = \sqrt{\tfrac{N_0 E}{2}}\;\Phi^{-1}(1 - \alpha),$$

and the probability of detection:

$$\beta = \Pr(G > G_0 \mid K) = \Phi\!\left(\sqrt{\frac{2E}{N_0}} - \Phi^{-1}(1 - \alpha)\right),$$

where Φ is the cdf of the standard normal, or Gaussian, variable.
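A small simulation of this detector, under assumed values for the signal, noise level, and false-alarm rate, lets the empirical error rates be checked against the Φ-based expressions above:

```python
import numpy as np

rng = np.random.default_rng(5)
T, n = 1.0, 512
t = np.linspace(0, T, n, endpoint=False)
dt = T / n

s = np.sin(2 * np.pi * 5 * t)     # known signal (illustrative choice)
N0 = 0.1                          # white noise level, S_N(f) = N0 / 2
E = (s ** 2).sum() * dt           # signal energy, here 0.5

def noise(m):
    # Discretized white noise: variance (N0 / 2) / dt per sample.
    return rng.normal(0.0, np.sqrt(N0 / (2 * dt)), size=(m, n))

trials = 20000
G_H = noise(trials) @ s * dt            # statistic under H (noise only)
G_K = (noise(trials) + s) @ s * dt      # statistic under K (signal + noise)

# Under H, G ~ N(0, E * N0 / 2); threshold for a 5% false-alarm rate.
G0 = np.sqrt(E * N0 / 2) * 1.645
print((G_H > G0).mean())   # ~ 0.05 false alarm
print((G_K > G0).mean())   # detection rate, ~ Phi(sqrt(2E/N0) - 1.645)
```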
Signal detection in colored noise

When N(t) is colored (correlated in time) Gaussian noise with zero mean and covariance function $R_N(t,s)$, we cannot sample independent discrete observations by evenly spacing the time. Instead, we can use a K–L expansion to decorrelate the noise process and get independent Gaussian observation "samples". The K–L expansion of N(t):

$$N(t) = \sum_i N_i \Phi_i(t), \qquad 0 < t < T,$$

where $N_i = \int N(t)\,\Phi_i(t)\,dt$ and the orthonormal bases $\{\Phi_i(t)\}$ are generated by the kernel $R_N(t,s)$, i.e., the solution to

$$\int_0^T R_N(t,s)\,\Phi_i(s)\,ds = \lambda_i \Phi_i(t).$$

Do the expansion

$$X(t) = \sum_i X_i \Phi_i(t),$$

where $X_i = \int_0^T X(t)\,\Phi_i(t)\,dt$; then $X_i = N_i$ under H and $X_i = N_i + S_i$ under K, with $S_i = \int_0^T S(t)\,\Phi_i(t)\,dt$. The $N_i$ are independent Gaussian r.v.'s with variance $\lambda_i$;

under H: $\{X_i\}$ are independent Gaussian r.v.'s;
under K: $\{X_i - S_i\}$ are independent Gaussian r.v.'s.

Hence, the log-LR is given by

$$\mathcal{LR} = \sum_i \frac{2 S_i x_i - S_i^2}{2\lambda_i},$$

and the optimum detector is

$$G = \sum_i \frac{S_i x_i}{\lambda_i} > G_0 \Rightarrow K, \quad \text{otherwise } H.$$

Define

$$k(t) = \sum_i \lambda_i^{-1} S_i \Phi_i(t), \qquad 0 < t < T;$$

then

$$G = \int_0^T k(t)\,x(t)\,dt.$$

How to find k(t): since

$$\int_0^T R_N(t,s)\,k(s)\,ds = \sum_i \lambda_i^{-1} S_i \int_0^T R_N(t,s)\,\Phi_i(s)\,ds = \sum_i S_i \Phi_i(t) = S(t),$$

k(t) is the solution to

$$\int_0^T R_N(t,s)\,k(s)\,ds = S(t).$$

If N(t) is wide-sense stationary,

$$\int_0^T R_N(t - s)\,k(s)\,ds = S(t),$$

which is known as the Wiener–Hopf equation. The equation can be solved by taking the Fourier transform, but it is not practically realizable since an infinite spectrum needs spatial factorization. A special case in which k(t) is easy to calculate is white Gaussian noise:

$$\int_0^T \frac{N_0}{2}\,\delta(t - s)\,k(s)\,ds = S(t) \quad \Longrightarrow \quad k(t) = C\,S(t), \quad C = \frac{2}{N_0}.$$

The corresponding impulse response is h(t) = k(T − t) = CS(T − t). Letting C = 1, this is just the result we arrived at in the previous section for detecting a signal in white noise.
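Numerically, k(t) is obtained by discretizing the integral equation ∫ R_N(t,s) k(s) ds = S(t) into a linear system. A sketch with an assumed exponential (colored) noise covariance:

```python
import numpy as np

T, n = 1.0, 400
t = np.linspace(0, T, n)
dt = T / (n - 1)

S = np.sin(2 * np.pi * t)   # signal to detect (illustrative)
# Colored noise covariance R_N(t, s) = 0.5 * exp(-2 |t - s|) (assumption).
R_N = 0.5 * np.exp(-2.0 * np.abs(t[:, None] - t[None, :]))

# Discretize  int_0^T R_N(t, s) k(s) ds = S(t)  ->  (R_N * dt) k = S.
k = np.linalg.solve(R_N * dt, S)

# The detector statistic is then G = int_0^T k(t) x(t) dt.
print(np.allclose((R_N * dt) @ k, S))
```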
Test threshold for Neyman–Pearson detector

Since X(t) is a Gaussian process, G is a Gaussian random variable that can be characterized by its mean and variance:

$$\mathbf{E}[G \mid H] = 0, \qquad \operatorname{var}[G \mid H] = \int_0^T\!\!\int_0^T k(t)\,R_N(t,s)\,k(s)\,dt\,ds = \int_0^T k(t)\,S(t)\,dt \equiv \rho,$$
$$\mathbf{E}[G \mid K] = \rho, \qquad \operatorname{var}[G \mid K] = \rho.$$

Hence, we obtain the distributions of G under H and K:

$$H: G \sim N(0, \rho), \qquad K: G \sim N(\rho, \rho).$$

The false alarm error is

$$\alpha = \Pr(G > G_0 \mid H).$$

So the test threshold for the Neyman–Pearson optimum detector is

$$G_0 = \sqrt{\rho}\;\Phi^{-1}(1 - \alpha).$$

Its power of detection is

$$\beta = \Phi\!\left(\sqrt{\rho} - \Phi^{-1}(1 - \alpha)\right).$$

When the noise is a white Gaussian process, the signal power is

$$\rho = \int_0^T k(t)\,S(t)\,dt = \frac{2}{N_0}\int_0^T S^2(t)\,dt = \frac{2E}{N_0}.$$

Prewhitening

For some types of colored noise, a typical practice is to add a prewhitening filter before the matched filter to transform the colored noise into white noise. For example, if N(t) is a wide-sense stationary colored noise with correlation function

$$R_N(\tau) = \frac{B N_0}{4}\,e^{-B|\tau|}, \qquad S_N(f) = \frac{N_0/2}{1 + \left(\frac{2\pi f}{B}\right)^2},$$

the transfer function of the prewhitening filter is

$$H(f) = 1 + i\,\frac{2\pi f}{B}.$$
Detection of a Gaussian random signal in additive white Gaussian noise (AWGN)

When the signal we want to detect from the noisy channel is also random, for example a white Gaussian process X(t), we can still implement the K–L expansion to get an independent sequence of observations. In this case, the detection problem is described as follows:

$$H_0: Y(t) = N(t),$$
$$H_1: Y(t) = N(t) + X(t), \qquad 0 < t < T,$$

where X(t) is a random process with correlation function $K_X(t,s) = \mathbf{E}\{X(t)X(s)\}$.

The K–L expansion of X(t) is

$$X(t) = \sum_i X_i \Phi_i(t),$$

where $X_i = \int_0^T X(t)\,\Phi_i(t)\,dt$ and the $\Phi_i(t)$ are solutions to

$$\int_0^T K_X(t,s)\,\Phi_i(s)\,ds = \lambda_i \Phi_i(t).$$

So the $X_i$ are an independent sequence of r.v.'s with zero mean and variance $\lambda_i$. Expanding Y(t) and N(t) by $\Phi_i(t)$, we get

$$Y_i = \int_0^T Y(t)\,\Phi_i(t)\,dt = N_i + X_i,$$

where $N_i = \int_0^T N(t)\,\Phi_i(t)\,dt$. As N(t) is Gaussian white noise, the $N_i$ are an i.i.d. sequence of r.v.'s with zero mean and variance $\tfrac{1}{2}N_0$; the problem is then simplified as follows:

$$H_0: Y_i = N_i,$$
$$H_1: Y_i = N_i + X_i.$$

The Neyman–Pearson optimal test compares the likelihoods of $\{Y_i\}$ under the two hypotheses, so the log-likelihood ratio is

$$\mathcal{L} = \sum_i \frac{y_i^2}{2}\cdot\frac{\lambda_i}{\frac{N_0}{2}\left(\frac{N_0}{2} + \lambda_i\right)} = \frac{1}{N_0}\sum_i y_i \hat{X}_i.$$

Since

$$\hat{X}_i = \frac{\lambda_i}{\frac{N_0}{2} + \lambda_i}\,y_i$$

is just the minimum-mean-square estimate of $X_i$ given the $Y_i$'s, the K–L expansion has the following property: if

$$f(t) = \sum_i f_i \Phi_i(t), \qquad g(t) = \sum_i g_i \Phi_i(t),$$

where $f_i = \int_0^T f(t)\,\Phi_i(t)\,dt$ and $g_i = \int_0^T g(t)\,\Phi_i(t)\,dt$, then

$$\sum_i f_i g_i = \int_0^T f(t)\,g(t)\,dt.$$

So let $\hat{X}(t \mid T) = \sum_i \hat{X}_i \Phi_i(t)$; then

$$\mathcal{L} = \frac{1}{N_0}\int_0^T y(t)\,\hat{X}(t \mid T)\,dt.$$

A noncausal filter Q(t,s) can be used to get the estimate through

$$\hat{X}(t \mid T) = \int_0^T Q(t,s)\,y(s)\,ds.$$

By the orthogonality principle, Q(t,s) satisfies

$$\int_0^T Q(t,s)\,K_X(s,u)\,ds + \frac{N_0}{2}\,Q(t,u) = K_X(t,u), \qquad 0 < u < T.$$

However, for practical reasons, it is necessary to further derive the causal filter h(t,s), where h(t,s) = 0 for s > t, to get the estimate $\hat{X}(t \mid t)$. Specifically,

$$Q(t,s) = h(t,s) + h(s,t) - \int_0^T h(\tau, t)\,h(\tau, s)\,d\tau.$$
See also

- Principal component analysis
- Polynomial chaos
- Reproducing kernel Hilbert space
- Mercer's theorem
Notes

1. Sapatnekar, Sachin (2011), "Overcoming variations in nanometer-scale technologies", IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 1 (1): 5–1, Bibcode:2011IJEST...1....5S, CiteSeerX 10.1.1.300.5659, doi:10.1109/jetcas.2011.2138250, S2CID 15566585.
2. Ghoman, Satyajit; Wang, Zhicun; Chen, PC; Kapania, Rakesh (2012). "A POD-based Reduced Order Design Scheme for Shape Optimization of Air Vehicles". Proc. of 53rd AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference, AIAA-2012-1808, Honolulu, Hawaii.
3. Karhunen–Loeve transform (KLT), Computer Image Processing and Analysis (E161) lectures, Harvey Mudd College. Archived 2016-11-28 at the Wayback Machine.
4. Giambartolomei, Giordano (2016). "4 The Karhunen-Loève Theorem". The Karhunen-Loève theorem (Bachelors). University of Bologna.
5. Mallat, Stéphane. A Wavelet Tour of Signal Processing.
6. Tang, X., "Texture information in run-length matrices", IEEE Transactions on Image Processing, vol. 7, no. 11, pp. 1602–1609, Nov. 1998.
References

- Stark, Henry; Woods, John W. (1986). Probability, Random Processes, and Estimation Theory for Engineers. Prentice-Hall. ISBN 978-0-13-711706-2. OL 21138080M.
- Ghanem, Roger; Spanos, Pol (1991). Stochastic Finite Elements: A Spectral Approach. Springer-Verlag. ISBN 978-0-387-97456-9. OL 1865197M.
- Guikhman, I.; Skorokhod, A. (1977). Introduction à la Théorie des Processus Aléatoires. Éditions MIR.
- Simon, B. (1979). Functional Integration and Quantum Physics. Academic Press.
- Karhunen, Kari (1947). "Über lineare Methoden in der Wahrscheinlichkeitsrechnung". Ann. Acad. Sci. Fennicae. Ser. A I. Math.-Phys. 37: 1–79.
- Loève, M. (1978). Probability Theory Vol. II. Graduate Texts in Mathematics. Vol. 46 (4th ed.). Springer-Verlag. ISBN 978-0-387-90262-3.
- Dai, G. (1996). "Modal wave-front reconstruction with Zernike polynomials and Karhunen–Loeve functions". JOSA A. 13 (6): 1218. Bibcode:1996JOSAA..13.1218D. doi:10.1364/JOSAA.13.001218.
- Wu, B.; Zhu, J.; Najm, F. (2005). "A Non-parametric Approach for Dynamic Range Estimation of Nonlinear Systems". In Proceedings of the Design Automation Conference, 841–844.
- Wu, B.; Zhu, J.; Najm, F. (2006). "Dynamic Range Estimation". IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 25 (9): 1618–1636.
- Jorgensen, Palle E. T.; Song, Myung-Sin (2007). "Entropy Encoding, Hilbert Space and Karhunen–Loeve Transforms". Journal of Mathematical Physics. 48 (10): 103503. arXiv:math-ph/0701056. Bibcode:2007JMP....48j3503J. doi:10.1063/1.2793569. S2CID 17039075.

External links

- Mathematica KarhunenLoeveDecomposition function.
- E161: Computer Image Processing and Analysis notes by Prof. Ruye Wang at Harvey Mudd College. Archived 2011-05-16 at the Wayback Machine. |
| Markdown | # Kosambi–Karhunen–Loève theorem

Theory of stochastic processes
In the theory of [stochastic processes](https://en.wikipedia.org/wiki/Stochastic_process "Stochastic process"), the **Karhunen–Loève theorem** (named after [Kari Karhunen](https://en.wikipedia.org/wiki/Kari_Karhunen "Kari Karhunen") and [Michel Loève](https://en.wikipedia.org/wiki/Michel_Lo%C3%A8ve "Michel Loève")), also known as the **Kosambi–Karhunen–Loève theorem**,[\[1\]](https://en.wikipedia.org/wiki/Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem#cite_note-sapatnekar-1)[\[2\]](https://en.wikipedia.org/wiki/Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem#cite_note-ghoman-2) states that a [stochastic process](https://en.wikipedia.org/wiki/Stochastic_process "Stochastic process") can be represented as an infinite [linear combination](https://en.wikipedia.org/wiki/Linear_combination "Linear combination") of [orthogonal functions](https://en.wikipedia.org/wiki/Orthogonal_function "Orthogonal function"), analogous to a [Fourier series](https://en.wikipedia.org/wiki/Fourier_series "Fourier series") representation of a function on a bounded interval. The transformation is also known as the [Hotelling](https://en.wikipedia.org/wiki/Harold_Hotelling "Harold Hotelling") transform and [eigenvector](https://en.wikipedia.org/wiki/Eigenvector "Eigenvector") transform, and is closely related to the [principal component analysis](https://en.wikipedia.org/wiki/Principal_component_analysis "Principal component analysis") (PCA) technique widely used in image processing and in data analysis in many fields.[\[3\]](https://en.wikipedia.org/wiki/Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem#cite_note-3)
There exist many such expansions of a stochastic process: if the process is indexed over \[*a*, *b*\], any [orthonormal basis](https://en.wikipedia.org/wiki/Orthonormal_basis "Orthonormal basis") of *L*2(\[*a*, *b*\]) yields an expansion thereof in that form. The importance of the Karhunen–Loève theorem is that it yields the best such basis in the sense that it minimizes the total [mean squared error](https://en.wikipedia.org/wiki/Mean_squared_error "Mean squared error").
In contrast to a Fourier series, where the coefficients are fixed numbers and the expansion basis consists of [sinusoidal functions](https://en.wikipedia.org/wiki/Trigonometric_function "Trigonometric function") (that is, [sine](https://en.wikipedia.org/wiki/Sine "Sine") and [cosine](https://en.wikipedia.org/wiki/Cosine "Cosine") functions), the coefficients in the Karhunen–Loève theorem are [random variables](https://en.wikipedia.org/wiki/Random_variable "Random variable") and the expansion basis depends on the process. In fact, the orthogonal basis functions used in this representation are determined by the [covariance function](https://en.wikipedia.org/wiki/Covariance_function "Covariance function") of the process. One can think of the Karhunen–Loève transform as adapting to the process in order to produce the best possible basis for its expansion.
In the case of a *centered* stochastic process {*Xt*}*t* ∈ \[*a*, *b*\] (*centered* means **E**\[*Xt*\] = 0 for all *t* ∈ \[*a*, *b*\]) satisfying a technical continuity condition, X admits a decomposition
$$X_{t}=\sum _{k=1}^{\infty }Z_{k}e_{k}(t)$$
where Zk are pairwise [uncorrelated](https://en.wikipedia.org/wiki/Uncorrelated "Uncorrelated") random variables and the functions ek are continuous real-valued functions on \[*a*, *b*\] that are pairwise [orthogonal](https://en.wikipedia.org/wiki/Orthogonal_function "Orthogonal function") in *L*2(\[*a*, *b*\]). It is therefore sometimes said that the expansion is *bi-orthogonal* since the random coefficients Zk are orthogonal in the probability space while the deterministic functions ek are orthogonal in the time domain. The general case of a process Xt that is not centered can be brought back to the case of a centered process by considering *Xt* − **E**\[*Xt*\], which is a centered process.
Moreover, if the process is [Gaussian](https://en.wikipedia.org/wiki/Gaussian_process "Gaussian process"), then the random variables Zk are Gaussian and [stochastically independent](https://en.wikipedia.org/wiki/Stochastically_independent "Stochastically independent"). This result generalizes the *Karhunen–Loève transform*. An important example of a centered real stochastic process on \[0, 1\] is the [Wiener process](https://en.wikipedia.org/wiki/Wiener_process "Wiener process"); the Karhunen–Loève theorem can be used to provide a canonical orthogonal representation for it. In this case the expansion consists of sinusoidal functions.
The above expansion into uncorrelated random variables is also known as the *Karhunen–Loève expansion* or *Karhunen–Loève decomposition*. The [empirical](https://en.wikipedia.org/wiki/Statistic "Statistic") version (i.e., with the coefficients computed from a sample) is known as the *Karhunen–Loève transform* (KLT), *[principal component analysis](https://en.wikipedia.org/wiki/Principal_component_analysis "Principal component analysis")*, *[proper orthogonal decomposition](https://en.wikipedia.org/wiki/Proper_orthogonal_decomposition "Proper orthogonal decomposition") (POD)*, *[empirical orthogonal functions](https://en.wikipedia.org/wiki/Empirical_orthogonal_functions "Empirical orthogonal functions")* (a term used in [meteorology](https://en.wikipedia.org/wiki/Meteorology "Meteorology") and [geophysics](https://en.wikipedia.org/wiki/Geophysics "Geophysics")), or the *[Hotelling](https://en.wikipedia.org/wiki/Harold_Hotelling "Harold Hotelling") transform*.
## Formulation
- Throughout this article, we will consider a random process Xt defined over a [probability space](https://en.wikipedia.org/wiki/Probability_space "Probability space") (Ω, *F*, **P**) and indexed over a closed interval \[*a*, *b*\], which is [square-integrable](https://en.wikipedia.org/wiki/Square-integrable_function "Square-integrable function"), has zero mean, and has covariance function *KX*(*s*, *t*). In other words, we have:
$$\forall t\in [a,b]\qquad X_{t}\in L^{2}(\Omega ,F,\mathbf {P} ),\quad {\text{i.e. }}\mathbf {E} [X_{t}^{2}]<\infty ,$$

$$\forall t\in [a,b]\qquad \mathbf {E} [X_{t}]=0,$$

$$\forall t,s\in [a,b]\qquad K_{X}(s,t)=\mathbf {E} [X_{s}X_{t}].$$
The square-integrable condition $\mathbf {E} [X_{t}^{2}]<\infty$ is logically equivalent to $K_{X}(s,t)$ being finite for all $s,t\in [a,b]$.[\[4\]](https://en.wikipedia.org/wiki/Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem#cite_note-giambartolomei-4)
- We associate to *K**X* a [linear operator](https://en.wikipedia.org/wiki/Linear_operator "Linear operator") (more specifically a [Hilbert–Schmidt integral operator](https://en.wikipedia.org/wiki/Hilbert%E2%80%93Schmidt_integral_operator "Hilbert–Schmidt integral operator")) *T**K**X* defined in the following way:
$$T_{K_{X}}\colon \left\{{\begin{aligned}L^{2}([a,b])&\to L^{2}([a,b])\\f&\mapsto T_{K_{X}}f=\int _{a}^{b}K_{X}(s,\cdot )f(s)\,ds\end{aligned}}\right.$$
Since *T**K**X* is a linear endomorphism, it makes sense to talk about its eigenvalues *λk* and eigenfunctions *e**k*, which are found by solving the homogeneous Fredholm [integral equation](https://en.wikipedia.org/wiki/Integral_equation "Integral equation") of the second kind

$$\int _{a}^{b}K_{X}(s,t)e_{k}(s)\,ds=\lambda _{k}e_{k}(t).$$
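This Fredholm eigenproblem rarely has a closed-form solution, but a Nyström-style discretization reduces it to a matrix eigenproblem. As a sanity check, for the Wiener kernel $K_X(s,t)=\min(s,t)$ the computed eigenvalues can be compared against the known values $\lambda_k = ((k - \tfrac{1}{2})\pi)^{-2}$ derived later in the article. A minimal sketch (the grid size is an arbitrary choice):

```python
import numpy as np

n = 1000
t = np.linspace(0.0, 1.0, n)
dt = 1.0 / (n - 1)

K = np.minimum(t[:, None], t[None, :])   # Wiener covariance K_X(s,t) = min(s,t)

# Nystrom: eigenvalues of K * dt approximate the lambda_k of T_{K_X}.
lam = np.linalg.eigvalsh(K * dt)[::-1]

k = np.arange(1, 6)
exact = 1.0 / ((k - 0.5) * np.pi) ** 2
print(np.allclose(lam[:5], exact, rtol=1e-2))
```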
## Statement of the theorem
**Theorem**. Let Xt be a zero-mean square-integrable stochastic process defined over a probability space (Ω, *F*, **P**) and indexed over a closed and bounded interval \[*a*, *b*\], with continuous covariance function *K**X*(*s*, *t*).
Then *K**X*(*s,t*) is a [Mercer kernel](https://en.wikipedia.org/wiki/Mercer%27s_theorem "Mercer's theorem") and letting *e**k* be an orthonormal basis on *L*2(\[*a*, *b*\]) formed by the eigenfunctions of *T**K**X* with respective eigenvalues *λk*, *Xt* admits the following representation

$$X_{t}=\sum _{k=1}^{\infty }Z_{k}e_{k}(t)$$
where the convergence is in [*L*2](https://en.wikipedia.org/wiki/Convergence_of_random_variables#Convergence_in_mean "Convergence of random variables"), uniform in *t* and
$$Z_{k}=\int _{a}^{b}X_{t}e_{k}(t)\,dt$$

Furthermore, the random variables *Z**k* have zero mean, are uncorrelated, and have variance *λk*:
$$\mathbf {E} [Z_{k}]=0,~\forall k\in \mathbb {N} \qquad {\mbox{and}}\qquad \mathbf {E} [Z_{i}Z_{j}]=\delta _{ij}\lambda _{j},~\forall i,j\in \mathbb {N}$$
Note that by generalizations of Mercer's theorem we can replace the interval \[*a*, *b*\] with other compact spaces *C* and the [Lebesgue measure](https://en.wikipedia.org/wiki/Lebesgue_measure "Lebesgue measure") on \[*a*, *b*\] with a [Borel measure](https://en.wikipedia.org/wiki/Borel_measure "Borel measure") whose support is *C*.
## Proof
- The covariance function *K**X* satisfies the definition of a Mercer kernel. By [Mercer's theorem](https://en.wikipedia.org/wiki/Mercer%27s_theorem "Mercer's theorem"), there consequently exists a set *λk*, *ek*(*t*) of eigenvalues and eigenfunctions of *T**K**X* forming an orthonormal basis of *L*2(\[*a*,*b*\]), and *K**X* can be expressed as

$$K_{X}(s,t)=\sum _{k=1}^{\infty }\lambda _{k}e_{k}(s)e_{k}(t)$$
- The process *X**t* can be expanded in terms of the eigenfunctions *e**k* as:
$$X_{t}=\sum _{k=1}^{\infty }Z_{k}e_{k}(t)$$
where the coefficients (random variables) *Z**k* are given by the projection of *X**t* on the respective eigenfunctions
$$Z_{k}=\int _{a}^{b}X_{t}e_{k}(t)\,dt$$
- We may then derive
$${\begin{aligned}\mathbf {E} [Z_{k}]&=\mathbf {E} \left[\int _{a}^{b}X_{t}e_{k}(t)\,dt\right]=\int _{a}^{b}\mathbf {E} [X_{t}]e_{k}(t)\,dt=0\\[8pt]\mathbf {E} [Z_{i}Z_{j}]&=\mathbf {E} \left[\int _{a}^{b}\int _{a}^{b}X_{t}X_{s}e_{j}(t)e_{i}(s)\,dt\,ds\right]\\&=\int _{a}^{b}\int _{a}^{b}\mathbf {E} \left[X_{t}X_{s}\right]e_{j}(t)e_{i}(s)\,dt\,ds\\&=\int _{a}^{b}\int _{a}^{b}K_{X}(s,t)e_{j}(t)e_{i}(s)\,dt\,ds\\&=\int _{a}^{b}e_{i}(s)\left(\int _{a}^{b}K_{X}(s,t)e_{j}(t)\,dt\right)\,ds\\&=\lambda _{j}\int _{a}^{b}e_{i}(s)e_{j}(s)\,ds\\&=\delta _{ij}\lambda _{j}\end{aligned}}$$
where we have used the fact that the *e**k* are eigenfunctions of *T**K**X* and are orthonormal.
- Let us now show that the convergence is in *L*2. Let
$$S_{N}=\sum _{k=1}^{N}Z_{k}e_{k}(t).$$
Then:
$${\begin{aligned}\mathbf {E} \left[\left|X_{t}-S_{N}\right|^{2}\right]&=\mathbf {E} \left[X_{t}^{2}\right]+\mathbf {E} \left[S_{N}^{2}\right]-2\mathbf {E} \left[X_{t}S_{N}\right]\\&=K_{X}(t,t)+\mathbf {E} \left[\sum _{k=1}^{N}\sum _{\ell =1}^{N}Z_{k}Z_{\ell }e_{k}(t)e_{\ell }(t)\right]-2\mathbf {E} \left[X_{t}\sum _{k=1}^{N}Z_{k}e_{k}(t)\right]\\&=K_{X}(t,t)+\sum _{k=1}^{N}\lambda _{k}e_{k}(t)^{2}-2\mathbf {E} \left[\sum _{k=1}^{N}\int _{a}^{b}X_{t}X_{s}e_{k}(s)e_{k}(t)\,ds\right]\\&=K_{X}(t,t)-\sum _{k=1}^{N}\lambda _{k}e_{k}(t)^{2}\end{aligned}}$$
which goes to 0 by Mercer's theorem.
## Properties of the Karhunen–Loève transform
### Special case: Gaussian distribution
Since the limit in the mean of jointly Gaussian random variables is jointly Gaussian, and jointly Gaussian random (centered) variables are independent [if and only if](https://en.wikipedia.org/wiki/If_and_only_if "If and only if") they are orthogonal, we can also conclude:
**Theorem**. The variables Zi have a joint Gaussian distribution and are stochastically independent if the original process {*Xt*}*t* is Gaussian.
In the Gaussian case, since the variables Zi are independent, we can say more:
$$\lim _{N\to \infty }\sum _{i=1}^{N}e_{i}(t)Z_{i}(\omega )=X_{t}(\omega )$$
almost surely.
### The Karhunen–Loève transform decorrelates the process
This is a consequence of the independence of the Zk.
### The Karhunen–Loève expansion minimizes the total mean square error
In the introduction, we mentioned that the truncated Karhunen–Loève expansion is the best approximation of the original process, in the sense that it minimizes the total mean-square error resulting from its truncation. Because of this property, it is often said that the KL transform optimally compacts the energy.
More specifically, given any orthonormal basis $\{f_k\}$ of $L^2([a,b])$, we may decompose the process $X_t$ as:
$$X_t(\omega) = \sum_{k=1}^{\infty} A_k(\omega) f_k(t)$$

where
$$A_k(\omega) = \int_a^b X_t(\omega) f_k(t)\,dt$$

and we may approximate $X_t$ by the finite sum
$$\hat{X}_t(\omega) = \sum_{k=1}^{N} A_k(\omega) f_k(t)$$

for some integer *N*.
**Claim**. Of all such approximations, the KL approximation is the one that minimizes the total mean square error (provided we have arranged the eigenvalues in decreasing order).
**Proof**
Consider the error resulting from the truncation at the *N*\-th term in the following orthonormal expansion:
$$\varepsilon_N(t) = \sum_{k=N+1}^{\infty} A_k(\omega) f_k(t)$$

The mean-square error $\varepsilon_N^2(t)$ can be written as:
$$
\begin{aligned}
\varepsilon_N^2(t) &= \mathbf{E}\left[\sum_{i=N+1}^{\infty}\sum_{j=N+1}^{\infty} A_i(\omega) A_j(\omega) f_i(t) f_j(t)\right] \\
&= \sum_{i=N+1}^{\infty}\sum_{j=N+1}^{\infty} \mathbf{E}\left[\int_a^b \int_a^b X_t X_s f_i(t) f_j(s)\,ds\,dt\right] f_i(t) f_j(t) \\
&= \sum_{i=N+1}^{\infty}\sum_{j=N+1}^{\infty} f_i(t) f_j(t) \int_a^b \int_a^b K_X(s,t) f_i(t) f_j(s)\,ds\,dt
\end{aligned}
$$
We then integrate this last equality over \[*a*, *b*\]. The orthonormality of the $f_k$ yields:
$$\int_a^b \varepsilon_N^2(t)\,dt = \sum_{k=N+1}^{\infty} \int_a^b \int_a^b K_X(s,t) f_k(t) f_k(s)\,ds\,dt$$

The problem of minimizing the total mean-square error thus comes down to minimizing the right-hand side of this equality subject to the constraint that the $f_k$ be normalized. We hence introduce $\beta_k$, the Lagrange multipliers associated with these constraints, and aim at minimizing the following function:
$$Er\left[f_k(t),\, k \in \{N+1, \ldots\}\right] = \sum_{k=N+1}^{\infty} \left(\int_a^b \int_a^b K_X(s,t) f_k(t) f_k(s)\,ds\,dt - \beta_k \left(\int_a^b f_k(t) f_k(t)\,dt - 1\right)\right)$$
Differentiating with respect to $f_i(t)$ (this is a [functional derivative](https://en.wikipedia.org/wiki/Functional_derivative "Functional derivative")) and setting the derivative to 0 yields:
$$\frac{\partial Er}{\partial f_i(t)} = \int_a^b \left(\int_a^b K_X(s,t) f_i(s)\,ds - \beta_i f_i(t)\right) dt = 0$$

which is satisfied in particular when
$$\int_a^b K_X(s,t) f_i(s)\,ds = \beta_i f_i(t).$$

In other words, this holds when the $f_k$ are chosen to be the eigenfunctions of $T_{K_X}$, which yields the KL expansion.
### Explained variance
An important observation is that since the random coefficients $Z_k$ of the KL expansion are uncorrelated, the [Bienaymé formula](https://en.wikipedia.org/wiki/Variance#Sum_of_variables "Variance") asserts that the variance of $X_t$ is simply the sum of the variances of the individual components of the sum:
$$\operatorname{var}[X_t] = \sum_{k=1}^{\infty} e_k(t)^2 \operatorname{var}[Z_k] = \sum_{k=1}^{\infty} \lambda_k e_k(t)^2$$
Integrating over \[*a*, *b*\] and using the orthonormality of the $e_k$, we obtain that the total variance of the process is:
$$\int_a^b \operatorname{var}[X_t]\,dt = \sum_{k=1}^{\infty} \lambda_k$$
In particular, the total variance of the *N*\-truncated approximation is
$$\sum_{k=1}^{N} \lambda_k.$$

As a result, the *N*\-truncated expansion explains
$$\frac{\sum_{k=1}^{N} \lambda_k}{\sum_{k=1}^{\infty} \lambda_k}$$

of the variance; and if we are content with an approximation that explains, say, 95% of the variance, then we just have to determine an $N \in \mathbb{N}$ such that
$$\frac{\sum_{k=1}^{N} \lambda_k}{\sum_{k=1}^{\infty} \lambda_k} \geq 0.95.$$

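As a minimal numerical sketch of this criterion (not part of the article; the helper name and the 1000-term truncation of the spectrum are our own choices), the smallest such $N$ can be read off the sorted eigenvalues:

```python
import numpy as np

def truncation_order(eigvals, threshold=0.95):
    """Smallest N whose leading eigenvalues explain >= threshold of the variance."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]  # decreasing order
    ratio = np.cumsum(lam) / lam.sum()
    return int(np.searchsorted(ratio, threshold) + 1)

# Example: the Wiener-process eigenvalues derived later in the article,
# lambda_k = 1 / ((k - 1/2)^2 * pi^2), truncated here to 1000 terms.
k = np.arange(1, 1001)
lam = 1.0 / ((k - 0.5) ** 2 * np.pi ** 2)
print(truncation_order(lam, 0.95))
```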
### The Karhunen–Loève expansion has the minimum representation entropy property
Given a representation of $X_t = \sum_{k=1}^{\infty} W_k \varphi_k(t)$, for some orthonormal basis $\varphi_k(t)$ and random $W_k$, we let $p_k = \mathbb{E}[|W_k|^2] / \mathbb{E}[\|X_t\|_{L^2}^2]$, so that $\sum_{k=1}^{\infty} p_k = 1$. We may then define the representation [entropy](https://en.wikipedia.org/wiki/Entropy_\(information_theory\) "Entropy (information theory)") to be $H(\{\varphi_k\}) = -\sum_k p_k \log(p_k)$. Then we have $H(\{\varphi_k\}) \geq H(\{e_k\})$, for all choices of $\varphi_k$. That is, the KL expansion has minimal representation entropy.
**Proof:**
Denote the coefficients obtained for the basis $e_k(t)$ as $p_k$, and for $\varphi_k(t)$ as $q_k$.
Choose $N \geq 1$. Note that since $e_k$ minimizes the mean squared error, we have that
$$\mathbb{E}\left\|\sum_{k=1}^{N} Z_k e_k(t) - X_t\right\|_{L^2}^2 \leq \mathbb{E}\left\|\sum_{k=1}^{N} W_k \varphi_k(t) - X_t\right\|_{L^2}^2$$

Expanding the right-hand side, we get:
$$\mathbb{E}\left\|\sum_{k=1}^{N} W_k \varphi_k(t) - X_t\right\|_{L^2}^2 = \mathbb{E}\|X_t^2\|_{L^2} + \sum_{k=1}^{N}\sum_{\ell=1}^{N} \mathbb{E}\left[W_\ell \varphi_\ell(t) W_k^* \varphi_k^*(t)\right]_{L^2} - \sum_{k=1}^{N} \mathbb{E}\left[W_k \varphi_k X_t^*\right]_{L^2} - \sum_{k=1}^{N} \mathbb{E}\left[X_t W_k^* \varphi_k^*(t)\right]_{L^2}$$
Using the orthonormality of $\varphi_k(t)$, and expanding $X_t$ in the $\varphi_k(t)$ basis, we get that the right-hand side is equal to:
$$\mathbb{E}[X_t]_{L^2}^2 - \sum_{k=1}^{N} \mathbb{E}\left[|W_k|^2\right]$$
We may perform an identical analysis for the $e_k(t)$, and so rewrite the above inequality as:
$$\mathbb{E}[X_t]_{L^2}^2 - \sum_{k=1}^{N} \mathbb{E}\left[|Z_k|^2\right] \leq \mathbb{E}[X_t]_{L^2}^2 - \sum_{k=1}^{N} \mathbb{E}\left[|W_k|^2\right]$$
Subtracting the common first term, and dividing by $\mathbb{E}[\|X_t\|_{L^2}^2]$, we obtain that:
$$\sum_{k=1}^{N} p_k \geq \sum_{k=1}^{N} q_k$$

This implies that:
$$-\sum_{k=1}^{\infty} p_k \log(p_k) \leq -\sum_{k=1}^{\infty} q_k \log(q_k)$$

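As a small illustrative helper (ours, not from the article), the representation entropy of any basis can be computed from the coefficient energies $\mathbb{E}[|W_k|^2]$:

```python
import numpy as np

def representation_entropy(coeff_energies):
    """H = -sum p_k log p_k with p_k = E|W_k|^2 / E||X||^2 (energies normalized)."""
    p = np.asarray(coeff_energies, dtype=float)
    p = p / p.sum()
    p = p[p > 0]                       # use the convention 0 log 0 = 0
    return float(-np.sum(p * np.log(p)))
```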
## Linear Karhunen–Loève approximations
Consider a whole class of signals we want to approximate over the first $M$ vectors of a basis. These signals are modeled as realizations of a random vector $Y[n]$ of size $N$. To optimize the approximation we design a basis that minimizes the average [approximation error](https://en.wikipedia.org/wiki/Approximation_error "Approximation error"). This section proves that optimal bases are Karhunen–Loève bases that diagonalize the covariance matrix of $Y$. The random vector $Y$ can be decomposed in an orthogonal basis
$$\{g_m\}_{0 \leq m \leq N}$$

as follows:
$$Y = \sum_{m=0}^{N-1} \left\langle Y, g_m \right\rangle g_m,$$

where each
$$\left\langle Y, g_m \right\rangle = \sum_{n=0}^{N-1} Y[n]\, g_m^*[n]$$
is a random variable. The approximation from the first $M \leq N$ vectors of the basis is
$$Y_M = \sum_{m=0}^{M-1} \left\langle Y, g_m \right\rangle g_m$$

The energy conservation in an orthogonal basis implies
$$\varepsilon[M] = \mathbf{E}\left\{\left\|Y - Y_M\right\|^2\right\} = \sum_{m=M}^{N-1} \mathbf{E}\left\{\left|\left\langle Y, g_m \right\rangle\right|^2\right\}$$
This error is related to the covariance of Y defined by
$$R[n,m] = \mathbf{E}\left\{Y[n]\, Y^*[m]\right\}$$
For any vector *x*\[*n*\] we denote by K the [covariance operator](https://en.wikipedia.org/wiki/Covariance_operator "Covariance operator") represented by this matrix,
$$\mathbf{E}\left\{\left|\langle Y, x \rangle\right|^2\right\} = \langle Kx, x \rangle = \sum_{n=0}^{N-1}\sum_{m=0}^{N-1} R[n,m]\, x[n]\, x^*[m]$$
The error $\varepsilon[M]$ is therefore a sum of the last $N - M$ coefficients of the covariance operator
$$\varepsilon[M] = \sum_{m=M}^{N-1} \left\langle K g_m, g_m \right\rangle$$
The covariance operator $K$ is Hermitian and positive and is thus diagonalized in an orthogonal basis called a Karhunen–Loève basis. The following theorem states that a Karhunen–Loève basis is optimal for linear approximations.
**Theorem (Optimality of Karhunen–Loève basis).** Let $K$ be a covariance operator. For all $M \geq 1$, the approximation error
$$\varepsilon[M] = \sum_{m=M}^{N-1} \left\langle K g_m, g_m \right\rangle$$
is minimum if and only if
$$\{g_m\}_{0 \leq m < N}$$

is a Karhunen–Loève basis ordered by decreasing eigenvalues:
$$\left\langle K g_m, g_m \right\rangle \geq \left\langle K g_{m+1}, g_{m+1} \right\rangle, \qquad 0 \leq m < N-1.$$

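A minimal numerical sketch of the theorem (synthetic covariance; the helper names are ours): the trailing sum $\sum_{m \geq M} \langle K g_m, g_m \rangle$ is smallest when the $g_m$ are the eigenvectors of $K$ sorted by decreasing eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 8, 3

A = rng.standard_normal((N, N))
K = A @ A.T                              # a synthetic covariance operator (PSD)

def linear_error(basis_rows, K, M):
    """epsilon[M]: sum of <K g_m, g_m> over the N - M discarded vectors."""
    return sum(g @ K @ g for g in basis_rows[M:])

w, V = np.linalg.eigh(K)
kl_basis = V[:, ::-1].T                  # rows: eigenvectors, decreasing eigenvalue

Q, _ = np.linalg.qr(rng.standard_normal((N, N)))
random_basis = Q.T                       # rows: a random orthonormal basis

print(linear_error(kl_basis, K, M))      # equals the sum of the N - M smallest eigenvalues
print(linear_error(random_basis, K, M))  # generally larger
```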
## Non-Linear approximation in bases
Linear approximations project the signal on *M* vectors a priori. The approximation can be made more precise by choosing the *M* orthogonal vectors depending on the signal properties. This section analyzes the general performance of these non-linear approximations. A signal $f \in \mathrm{H}$ is approximated with $M$ vectors selected adaptively in an orthonormal basis for $\mathrm{H}$
$$\mathrm{B} = \{g_m\}_{m \in \mathbb{N}}$$

Let $f_M$ be the projection of $f$ over $M$ vectors whose indices are in $I_M$:
$$f_M = \sum_{m \in I_M} \left\langle f, g_m \right\rangle g_m$$

The approximation error is the sum of the remaining coefficients
$$\varepsilon[M] = \left\|f - f_M\right\|^2 = \sum_{m \notin I_M} \left|\left\langle f, g_m \right\rangle\right|^2$$
To minimize this error, the indices in $I_M$ must correspond to the $M$ vectors having the largest inner product amplitude
$$\left|\left\langle f, g_m \right\rangle\right|.$$

These are the vectors that best correlate with $f$. They can thus be interpreted as the main features of $f$. The resulting error is necessarily smaller than the error of a [linear approximation](https://en.wikipedia.org/wiki/Linear_approximation "Linear approximation") which selects the $M$ approximation vectors independently of $f$. Let us sort
$$\left\{\left|\left\langle f, g_m \right\rangle\right|\right\}_{m \in \mathbb{N}}$$

in decreasing order
$$\left|\left\langle f, g_{m_k} \right\rangle\right| \geq \left|\left\langle f, g_{m_{k+1}} \right\rangle\right|.$$

The best non-linear approximation is
$$f_M = \sum_{k=1}^{M} \left\langle f, g_{m_k} \right\rangle g_{m_k}$$

It can also be written as inner product thresholding:
$$f_M = \sum_{m=0}^{\infty} \theta_T\left(\left\langle f, g_m \right\rangle\right) g_m$$

with
$$T = \left|\left\langle f, g_{m_M} \right\rangle\right|, \qquad \theta_T(x) = \begin{cases} x & |x| \geq T \\ 0 & |x| < T \end{cases}$$

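In code, this adaptive selection is just a hard threshold on the coefficient amplitudes; the following sketch (our own names; an orthonormal basis stored as rows is assumed) keeps the $M$ largest:

```python
import numpy as np

def nonlinear_approximation(f, basis_rows, M):
    """Best M-term approximation of f in an orthonormal basis (rows of basis_rows)."""
    coeffs = basis_rows @ f                        # <f, g_m> for every m
    keep = np.argsort(np.abs(coeffs))[::-1][:M]    # indices of largest amplitudes
    return basis_rows[keep].T @ coeffs[keep]       # sum of the kept terms
```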
The non-linear error is
$$\varepsilon[M] = \left\|f - f_M\right\|^2 = \sum_{k=M+1}^{\infty} \left|\left\langle f, g_{m_k} \right\rangle\right|^2$$
This error goes quickly to zero as $M$ increases if the sorted values of $\left|\left\langle f, g_{m_k}\right\rangle\right|$ have a fast decay as $k$ increases. This decay is quantified by computing the $\ell^p$ norm of the signal inner products in B:
$$\|f\|_{\mathrm{B},p} = \left(\sum_{m=0}^{\infty} \left|\left\langle f, g_m \right\rangle\right|^p\right)^{\frac{1}{p}}$$

The following theorem relates the decay of $\varepsilon[M]$ to $\|f\|_{\mathrm{B},p}$.
**Theorem (decay of error).** If $\|f\|_{\mathrm{B},p} < \infty$ with $p < 2$, then
$$\varepsilon[M] \leq \frac{\|f\|_{\mathrm{B},p}^2}{\frac{2}{p} - 1} M^{1 - \frac{2}{p}}$$
and
$$\varepsilon[M] = o\left(M^{1 - \frac{2}{p}}\right).$$
Conversely, if $\varepsilon[M] = o\left(M^{1 - \frac{2}{p}}\right)$, then $\|f\|_{\mathrm{B},q} < \infty$ for any $q > p$.
### Non-optimality of Karhunen–Loève bases
To further illustrate the differences between linear and non-linear approximations, we study the decomposition of a simple non-Gaussian random vector in a Karhunen–Loève basis. Processes whose realizations have a random translation are stationary. The Karhunen–Loève basis is then a Fourier basis and we study its performance. To simplify the analysis, consider a random vector *Y*\[*n*\] of size *N* that is a random shift modulo *N* of a deterministic signal *f*\[*n*\] of zero mean:
$$\sum_{n=0}^{N-1} f[n] = 0$$
$$Y[n] = f[(n - p) \bmod N]$$
The random shift *P* is uniformly distributed on \[0, *N* − 1\]:
$$\Pr(P = p) = \frac{1}{N}, \qquad 0 \leq p < N$$

Clearly
$$\mathbf{E}\{Y[n]\} = \frac{1}{N} \sum_{p=0}^{N-1} f[(n-p) \bmod N] = 0$$
and
$$R[n,k] = \mathbf{E}\{Y[n]Y[k]\} = \frac{1}{N} \sum_{p=0}^{N-1} f[(n-p) \bmod N]\, f[(k-p) \bmod N] = \frac{1}{N} f \Theta \bar{f}[n-k], \quad \bar{f}[n] = f[-n]$$
Hence
$$R[n,k] = R_Y[n-k], \qquad R_Y[k] = \frac{1}{N} f \Theta \bar{f}[k]$$
Since $R_Y$ is $N$-periodic, $Y$ is a circular stationary random vector. The covariance operator is a [circular convolution](https://en.wikipedia.org/wiki/Circular_convolution "Circular convolution") with $R_Y$ and is therefore diagonalized in the discrete Fourier Karhunen–Loève basis
$$\left\{\frac{1}{\sqrt{N}} e^{i 2\pi m n / N}\right\}_{0 \leq m < N}.$$

The power spectrum is the Fourier transform of $R_Y$:
$$P_Y[m] = \hat{R}_Y[m] = \frac{1}{N}\left|\hat{f}[m]\right|^2$$
**Example:** Consider an extreme case where $f[n] = \delta[n] - \delta[n-1]$. A theorem stated above guarantees that the Fourier Karhunen–Loève basis produces a smaller expected approximation error than a canonical basis of Diracs $\{g_m[n] = \delta[n - m]\}_{0 \leq m < N}$. Indeed, we do not know a priori the abscissa of the non-zero coefficients of *Y*, so there is no particular Dirac that is better adapted to perform the approximation. But the Fourier vectors cover the whole support of *Y* and thus absorb a part of the signal energy.
$$\mathbf{E}\left\{\left|\left\langle Y[n], \frac{1}{\sqrt{N}} e^{i 2\pi m n / N}\right\rangle\right|^2\right\} = P_Y[m] = \frac{4}{N} \sin^2\left(\frac{\pi m}{N}\right)$$
Selecting higher-frequency Fourier coefficients yields a better mean-square approximation than choosing a priori a few Dirac vectors to perform the approximation. The situation is totally different for non-linear approximations. If $f[n] = \delta[n] - \delta[n-1]$ then the discrete Fourier basis is extremely inefficient because $f$ and hence $Y$ have an energy that is almost uniformly spread among all Fourier vectors. In contrast, since $f$ has only two non-zero coefficients in the Dirac basis, a non-linear approximation of $Y$ with $M \geq 2$ gives zero error.[\[5\]](https://en.wikipedia.org/wiki/Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem#cite_note-5)
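A small numerical sketch of this last contrast (our own construction): with $f[n] = \delta[n] - \delta[n-1]$ randomly shifted, a two-term approximation is exact in the Dirac basis but poor in the Fourier basis.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 64, 2
f = np.zeros(N); f[0], f[1] = 1.0, -1.0
Y = np.roll(f, rng.integers(N))               # random circular shift of f

# Best M-term approximation in the Dirac basis: keep the 2 largest samples.
keep = np.argsort(np.abs(Y))[::-1][:M]
Y_dirac = np.zeros(N); Y_dirac[keep] = Y[keep]
print(np.linalg.norm(Y - Y_dirac))            # exactly 0

# Best M-term approximation in the Fourier basis.
Yhat = np.fft.fft(Y)
mask = np.zeros(N, dtype=bool)
mask[np.argsort(np.abs(Yhat))[::-1][:M]] = True
Y_fourier = np.fft.ifft(np.where(mask, Yhat, 0)).real
print(np.linalg.norm(Y - Y_fourier))          # large: energy spread over all frequencies
```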
## Principal component analysis
Main article: [Principal component analysis](https://en.wikipedia.org/wiki/Principal_component_analysis "Principal component analysis")
We have established the KarhunenâLoève theorem and derived a few properties thereof. We also noted that one hurdle in its application was the numerical cost of determining the eigenvalues and eigenfunctions of its covariance operator through the Fredholm integral equation of the second kind
$$\int_a^b K_X(s,t) e_k(s)\,ds = \lambda_k e_k(t).$$

However, when applied to a discrete and finite process $\left(X_n\right)_{n \in \{1, \ldots, N\}}$, the problem takes a much simpler form and standard algebra can be used to carry out the calculations.
Note that a continuous process can also be sampled at *N* points in time in order to reduce the problem to a finite version.
We henceforth consider a random $N$-dimensional vector $X = \left(X_1 ~ X_2 ~ \ldots ~ X_N\right)^T$. As mentioned above, $X$ could contain $N$ samples of a signal but it can hold many more representations depending on the field of application. For instance it could be the answers to a survey or economic data in an econometrics analysis.
As in the continuous version, we assume that $X$ is centered, otherwise we can let $X := X - \mu_X$ (where $\mu_X$ is the [mean vector](https://en.wikipedia.org/wiki/Mean_vector "Mean vector") of $X$) which is centered.
Let us adapt the procedure to the discrete case.
### Covariance matrix
Recall that the main implication and difficulty of the KL transformation is computing the eigenvectors of the linear operator associated to the covariance function, which are given by the solutions to the integral equation written above.
Define Σ, the covariance matrix of $X$, as an $N \times N$ matrix whose elements are given by:
$$\Sigma_{ij} = \mathbf{E}[X_i X_j], \qquad \forall i,j \in \{1, \ldots, N\}$$
Rewriting the above integral equation to suit the discrete case, we observe that it turns into:
$$\sum_{j=1}^{N} \Sigma_{ij} e_j = \lambda e_i \quad \Leftrightarrow \quad \Sigma e = \lambda e$$

where $e = (e_1 ~ e_2 ~ \ldots ~ e_N)^T$ is an $N$-dimensional vector.
The integral equation thus reduces to a simple matrix eigenvalue problem, which explains why the PCA has such a broad domain of applications.
Since Σ is a positive definite symmetric matrix, it possesses a set of orthonormal eigenvectors forming a basis of $\mathbb{R}^N$, and we write $\{\lambda_i, \varphi_i\}_{i \in \{1, \ldots, N\}}$ for this set of eigenvalues and corresponding eigenvectors, listed in decreasing values of $\lambda_i$. Let also Φ be the orthonormal matrix consisting of these eigenvectors:
$$
\begin{aligned}
\Phi &:= \left(\varphi_1 ~ \varphi_2 ~ \ldots ~ \varphi_N\right)^T \\
\Phi^T \Phi &= I
\end{aligned}
$$

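As a minimal numerical sketch of this step (synthetic data; all names are ours), the covariance matrix and its orthonormal eigenbasis can be computed with standard linear algebra:

```python
import numpy as np

rng = np.random.default_rng(0)

# 1000 realizations of a centered 5-dimensional random vector X.
N, samples = 5, 1000
X = rng.multivariate_normal(np.zeros(N), np.diag([5.0, 3.0, 1.0, 0.5, 0.1]), samples)

Sigma = X.T @ X / samples                 # empirical covariance, Sigma_ij = E[X_i X_j]
lam, V = np.linalg.eigh(Sigma)            # eigh returns ascending eigenvalues
order = np.argsort(lam)[::-1]             # re-sort in decreasing order
lam, Phi = lam[order], V[:, order]        # columns of Phi are the eigenvectors phi_i

assert np.allclose(Phi.T @ Phi, np.eye(N))   # orthonormality
```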
### Principal component transform
It remains to perform the actual KL transformation, called the *principal component transform* in this case. Recall that the transform was found by expanding the process with respect to the basis spanned by the eigenvectors of the covariance function. In this case, we hence have:
$$X = \sum_{i=1}^{N} \langle \varphi_i, X \rangle \varphi_i = \sum_{i=1}^{N} \varphi_i^T X \varphi_i$$

In a more compact form, the principal component transform of *X* is defined by:
$$\begin{cases} Y = \Phi^T X \\ X = \Phi Y \end{cases}$$

The $i$-th component of $Y$ is $Y_i = \varphi_i^T X$, the projection of $X$ on $\varphi_i$, and the inverse transform $X = \Phi Y$ yields the expansion of $X$ on the space spanned by the $\varphi_i$:
$$X = \sum_{i=1}^{N} Y_i \varphi_i = \sum_{i=1}^{N} \langle \varphi_i, X \rangle \varphi_i$$

As in the continuous case, we may reduce the dimensionality of the problem by truncating the sum at some $K \in \{1, \ldots, N\}$ such that
$$\frac{\sum_{i=1}^{K} \lambda_i}{\sum_{i=1}^{N} \lambda_i} \geq \alpha$$

where α is the explained variance threshold we wish to set.
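Continuing the earlier numerical sketch (again synthetic data and our own names, with the eigenvectors stored as columns so that the transform reads $Y = \Phi^T X$ as above), the transform, its inverse, and the $K$-term truncation just described look like this:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5
X = rng.multivariate_normal(np.zeros(N), np.diag([5.0, 3.0, 1.0, 0.5, 0.1]), 1000)

lam, V = np.linalg.eigh(X.T @ X / len(X))
lam, Phi = lam[::-1], V[:, ::-1]          # decreasing eigenvalues; columns are phi_i

x = X[0]                                  # one realization of X
Y = Phi.T @ x                             # principal component transform, Y_i = phi_i^T x
assert np.allclose(Phi @ Y, x)            # inverse transform X = Phi Y

# K-term truncation keeping a fraction alpha of the total variance.
alpha = 0.95
K = int(np.searchsorted(np.cumsum(lam) / lam.sum(), alpha) + 1)
x_K = Phi[:, :K] @ Y[:K]                  # reduced-dimension reconstruction
```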
We can also reduce the dimensionality through the use of multilevel dominant eigenvector estimation (MDEE).[\[6\]](https://en.wikipedia.org/wiki/Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem#cite_note-6)
## Examples
### The Wiener process
There are numerous equivalent characterizations of the [Wiener process](https://en.wikipedia.org/wiki/Wiener_process "Wiener process") which is a mathematical formalization of [Brownian motion](https://en.wikipedia.org/wiki/Brownian_motion "Brownian motion"). Here we regard it as the centered standard Gaussian process $W_t$ with covariance function
$$K_W(t,s) = \operatorname{cov}(W_t, W_s) = \min(s,t).$$

We restrict the time domain to \[*a*, *b*\]=\[0,1\] without loss of generality.
The eigenvectors of the covariance kernel are easily determined. These are
$$e_k(t) = \sqrt{2} \sin\left(\left(k - \tfrac{1}{2}\right)\pi t\right)$$

and the corresponding eigenvalues are
$$\lambda_k = \frac{1}{\left(k - \frac{1}{2}\right)^2 \pi^2}.$$

**Proof**
In order to find the eigenvalues and eigenvectors, we need to solve the integral equation:
$$
\begin{aligned}
\int_a^b K_W(s,t)\, e(s)\,ds &= \lambda e(t) \qquad \forall t,\ 0 \leq t \leq 1 \\
\int_0^1 \min(s,t)\, e(s)\,ds &= \lambda e(t) \\
\int_0^t s\, e(s)\,ds + t \int_t^1 e(s)\,ds &= \lambda e(t)
\end{aligned}
$$

differentiating once with respect to *t* yields:
$$\int_t^1 e(s)\,ds = \lambda e'(t)$$

a second differentiation produces the following differential equation:
$$-e(t) = \lambda e''(t)$$

The general solution of which has the form:
$$e(t) = A \sin\left(\frac{t}{\sqrt{\lambda}}\right) + B \cos\left(\frac{t}{\sqrt{\lambda}}\right)$$

where *A* and *B* are two constants to be determined with the boundary conditions. Setting *t* = 0 in the initial integral equation gives *e*(0) = 0 which implies that *B* = 0 and similarly, setting *t* = 1 in the first differentiation yields *e'* (1) = 0, whence:
$$\cos\left(\frac{1}{\sqrt{\lambda}}\right) = 0$$

which in turn implies that the eigenvalues of $T_{K_W}$ are:
$$\lambda_k = \left(\frac{1}{\left(k - \frac{1}{2}\right)\pi}\right)^2, \qquad k \geq 1$$

The corresponding eigenfunctions are thus of the form:
$$e_k(t) = A \sin\left(\left(k - \frac{1}{2}\right)\pi t\right), \qquad k \geq 1$$

$A$ is then chosen so as to normalize $e_k$:
$$\int_0^1 e_k^2(t)\,dt = 1 \quad \implies \quad A = \sqrt{2}$$

This gives the following representation of the Wiener process:
**Theorem**. There is a sequence $\{Z_i\}_i$ of independent Gaussian random variables with mean zero and variance 1 such that
$$W_t = \sqrt{2} \sum_{k=1}^{\infty} Z_k \frac{\sin\left(\left(k - \frac{1}{2}\right)\pi t\right)}{\left(k - \frac{1}{2}\right)\pi}.$$

Note that this representation is only valid for $t \in [0,1]$. On larger intervals, the increments are not independent. As stated in the theorem, convergence is in the $L^2$ norm and uniform in $t$.
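A minimal simulation sketch of this representation (the truncation order and time grid are our own choices):

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 501)
K = 200                                   # truncation order

k = np.arange(1, K + 1)[:, None]          # column of indices, shape (K, 1)
Z = rng.standard_normal((K, 1))           # independent N(0, 1) coefficients
phase = (k - 0.5) * np.pi
W = np.sqrt(2.0) * np.sum(Z * np.sin(phase * t) / phase, axis=0)
# W is one approximate sample path of the Wiener process on [0, 1].
```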
### The Brownian bridge
Similarly the [Brownian bridge](https://en.wikipedia.org/wiki/Brownian_bridge "Brownian bridge") $B_t = W_t - t W_1$, which is a [stochastic process](https://en.wikipedia.org/wiki/Stochastic_process "Stochastic process") with covariance function
$$K_B(t,s) = \min(t,s) - ts$$

can be represented as the series
$$B_t = \sum_{k=1}^{\infty} Z_k \frac{\sqrt{2}\sin(k\pi t)}{k\pi}$$

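The same simulation idea applies verbatim to the bridge (again with our own truncation order and grid):

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 501)
K = 200

k = np.arange(1, K + 1)[:, None]
Z = rng.standard_normal((K, 1))
B = np.sum(Z * np.sqrt(2.0) * np.sin(k * np.pi * t) / (k * np.pi), axis=0)
# B vanishes at t = 0 and t = 1, as a Brownian bridge must.
```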
## Applications
[Adaptive optics](https://en.wikipedia.org/wiki/Adaptive_optics "Adaptive optics") systems sometimes use K–L functions to reconstruct wave-front phase information (Dai 1996, JOSA A). Karhunen–Loève expansion is closely related to the [Singular Value Decomposition](https://en.wikipedia.org/wiki/Singular_Value_Decomposition "Singular Value Decomposition"). The latter has myriad applications in image processing, radar, seismology, and the like. If one has independent vector observations from a vector-valued stochastic process then the left singular vectors are [maximum likelihood](https://en.wikipedia.org/wiki/Maximum_likelihood "Maximum likelihood") estimates of the ensemble KL expansion.
### Applications in signal estimation and detection
#### Detection of a known continuous signal *S*(*t*)
In communication, we usually have to decide whether a signal from a noisy channel contains valuable information. The following hypothesis testing is used for detecting a continuous signal $s(t)$ from channel output $X(t)$, where $N(t)$ is the channel noise, usually assumed to be a zero-mean Gaussian process with correlation function $R_N(t,s) = E[N(t)N(s)]$:
$$H: X(t) = N(t),$$

$$K: X(t) = N(t) + s(t), \qquad t \in (0,T)$$

#### Signal detection in white noise
When the channel noise is white, its correlation function is
$$R_N(t) = \tfrac{1}{2} N_0 \delta(t),$$

and it has a constant power spectral density. In a physically practical channel, the noise power is finite, so:
$$S_N(f) = \begin{cases} \frac{N_0}{2} & |f| < w \\ 0 & |f| > w \end{cases}$$

Then the noise correlation function is a sinc function with zeros at $\frac{n}{2\omega}, n \in \mathbf{Z}$. Since the values of the noise at these points are uncorrelated and Gaussian, they are independent. Thus we can take samples from $X(t)$ with time spacing
$$\Delta t = \frac{1}{2\omega} \text{ within } (0, T).$$

Let $X_i = X(i\,\Delta t)$. We have a total of $n = \frac{T}{\Delta t} = T(2\omega) = 2\omega T$ i.i.d. observations $\{X_1, X_2, \ldots, X_n\}$ to develop the likelihood-ratio test. Define the signal samples $S_i = S(i\,\Delta t)$; the problem becomes:
$$H: X_i = N_i,$$

$$K: X_i = N_i + S_i, \qquad i = 1, 2, \ldots, n.$$

The log-likelihood ratio
$$\mathcal{L}(\underline{x}) = \frac{\sum_{i=1}^{n} (2 S_i x_i - S_i^2)}{2\sigma^2} \quad \Leftrightarrow \quad \Delta t \sum_{i=1}^{n} S_i x_i = \sum_{i=1}^{n} S(i\,\Delta t)\, x(i\,\Delta t)\,\Delta t \gtrless \lambda_2$$

As $\Delta t \to 0$, let:
$$G = \int_0^T S(t)\, x(t)\,dt.$$

Then $G$ is the test statistic and the [Neyman–Pearson optimum detector](https://en.wikipedia.org/wiki/Neyman%E2%80%93Pearson_lemma "Neyman–Pearson lemma") is
$$G(\underline{x}) > G_0 \Rightarrow K, \qquad G(\underline{x}) < G_0 \Rightarrow H.$$

As $G$ is Gaussian, we can characterize it by finding its mean and variance. Then we get
$$H: G \sim N\left(0, \tfrac{1}{2} N_0 E\right)$$

$$K: G \sim N\left(E, \tfrac{1}{2} N_0 E\right)$$

where
$$E = \int_0^T S^2(t)\,dt$$

is the signal energy.
The false alarm error
$$\alpha = \int_{G_0}^{\infty} N\left(0, \tfrac{1}{2} N_0 E\right) dG \quad \Rightarrow \quad G_0 = \sqrt{\tfrac{1}{2} N_0 E}\, \Phi^{-1}(1 - \alpha)$$

And the probability of detection:
$$\beta = \int_{G_0}^{\infty} N\left(E, \tfrac{1}{2} N_0 E\right) dG = 1 - \Phi\left(\frac{G_0 - E}{\sqrt{\tfrac{1}{2} N_0 E}}\right) = \Phi\left(\sqrt{\frac{2E}{N_0}} - \Phi^{-1}(1 - \alpha)\right),$$

where Φ is the cdf of the standard normal (Gaussian) variable.
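A minimal end-to-end sketch of this detector (the signal, noise level, and false-alarm rate are our own choices; `scipy.stats.norm` supplies Φ and Φ⁻¹):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
T, dt = 1.0, 1e-3
t = np.arange(0.0, T, dt)
S = np.sin(2 * np.pi * 5 * t)                     # known signal s(t)
N0, alpha = 0.1, 0.05                             # noise level, false-alarm rate

E = np.sum(S ** 2) * dt                           # signal energy
G0 = np.sqrt(0.5 * N0 * E) * norm.ppf(1 - alpha)  # Neyman-Pearson threshold

x = S + rng.normal(0.0, np.sqrt(0.5 * N0 / dt), len(t))  # observation under K
G = np.sum(S * x) * dt                            # correlator statistic
print("decide K" if G > G0 else "decide H")

beta = norm.cdf(np.sqrt(2 * E / N0) - norm.ppf(1 - alpha))  # detection probability
```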
#### Signal detection in colored noise
When $N(t)$ is colored (correlated in time) Gaussian noise with zero mean and covariance function $R_N(t,s) = E[N(t)N(s)]$, we cannot sample independent discrete observations by evenly spacing the time. Instead, we can use the K–L expansion to decorrelate the noise process and get independent Gaussian observation 'samples'. The K–L expansion of $N(t)$:
$$N(t) = \sum_{i=1}^{\infty} N_i \Phi_i(t), \qquad 0 < t < T,$$

where $N_i = \int N(t)\Phi_i(t)\,dt$ and the orthonormal bases $\{\Phi_i(t)\}$ are generated by the kernel $R_N(t,s)$, i.e., solutions to
$$\int_0^T R_N(t,s)\Phi_i(s)\,ds = \lambda_i \Phi_i(t), \qquad \operatorname{var}[N_i] = \lambda_i.$$
Do the expansion:
$$S(t) = \sum_{i=1}^{\infty} S_i \Phi_i(t),$$

where $S_i = \int_0^T S(t)\Phi_i(t)\,dt$, then
$$X_i = \int_0^T X(t)\Phi_i(t)\,dt = N_i$$

under $H$ and $N_i + S_i$ under $K$. Let $\overline{X} = \{X_1, X_2, \ldots\}$; we have
that the $N_i$ are independent Gaussian r.v.'s with variance $\lambda_i$.
Under $H$: $\{X_i\}$ are independent Gaussian r.v.'s:
$$f_H[x(t) \mid 0 < t < T] = f_H(\underline{x}) = \prod_{i=1}^{\infty} \frac{1}{\sqrt{2\pi\lambda_i}} \exp\left(-\frac{x_i^2}{2\lambda_i}\right)$$
Under $K$: $\{X_i - S_i\}$ are independent Gaussian r.v.'s:
$$f_K[x(t) \mid 0 < t < T] = f_K(\underline{x}) = \prod_{i=1}^{\infty} \frac{1}{\sqrt{2\pi\lambda_i}} \exp\left(-\frac{(x_i - S_i)^2}{2\lambda_i}\right)$$
Hence, the log-LR is given by
$$\mathcal{L}(\underline{x}) = \sum_{i=1}^{\infty} \frac{2 S_i x_i - S_i^2}{2\lambda_i}$$

and the optimum detector is
$$G = \sum_{i=1}^{\infty} \frac{S_i x_i}{\lambda_i} > G_0 \Rightarrow K, \qquad < G_0 \Rightarrow H.$$

Define
$$k(t) = \sum_{i=1}^{\infty} \frac{S_i}{\lambda_i} \Phi_i(t), \qquad 0 < t < T,$$

then $G = \int_0^T k(t)\, x(t)\,dt$.
##### How to find *k*(*t*)
Since
$$\int_0^T R_N(t,s)\, k(s)\,ds = \sum_{i=1}^{\infty} \frac{S_i}{\lambda_i} \int_0^T R_N(t,s)\Phi_i(s)\,ds = \sum_{i=1}^{\infty} S_i \Phi_i(t) = S(t),$$

k(t) is the solution to
$$\int_0^T R_N(t,s)\, k(s)\,ds = S(t).$$

If $N(t)$ is wide-sense stationary,
$$\int_0^T R_N(t-s)\, k(s)\,ds = S(t),$$

which is known as the [Wiener–Hopf equation](https://en.wikipedia.org/wiki/Wiener%E2%80%93Hopf_equation "Wiener–Hopf equation"). The equation can be solved by taking the Fourier transform, but this is not practically realizable since an infinite spectrum needs spectral factorization. A special case in which $k(t)$ is easy to calculate is white Gaussian noise.
$$\int_0^T \frac{N_0}{2} \delta(t-s)\, k(s)\,ds = S(t) \quad \Rightarrow \quad k(t) = C S(t), \qquad 0 < t < T.$$

The corresponding impulse response is $h(t) = k(T-t) = CS(T-t)$. Letting $C = 1$, this is just the result we arrived at in the previous section for detecting a signal in white noise.
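A short sketch of this equivalence (our own discretization): sampling the matched filter's output at $t = T$ reproduces the correlator statistic.

```python
import numpy as np

rng = np.random.default_rng(0)
dt = 1e-3
t = np.arange(0.0, 1.0, dt)
S = np.sin(2 * np.pi * 3 * t)                 # known signal
x = S + rng.normal(0.0, 1.0, len(t))          # noisy observation

h = S[::-1]                                   # matched filter h(t) = S(T - t), C = 1
G_corr = np.sum(S * x) * dt                   # correlator output
G_filt = np.convolve(x, h)[len(t) - 1] * dt   # filter output sampled at t = T
assert np.isclose(G_corr, G_filt)
```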
##### Test threshold for Neyman–Pearson detector
Since X(t) is a Gaussian process,
$$G = \int_0^T k(t)\, x(t)\,dt,$$

is a Gaussian random variable that can be characterized by its mean and variance.
$$
\begin{aligned}
\mathbf{E}[G \mid H] &= \int_0^T k(t)\,\mathbf{E}[x(t) \mid H]\,dt = 0 \\
\mathbf{E}[G \mid K] &= \int_0^T k(t)\,\mathbf{E}[x(t) \mid K]\,dt = \int_0^T k(t) S(t)\,dt \equiv \rho \\
\mathbf{E}[G^2 \mid H] &= \int_0^T \int_0^T k(t) k(s) R_N(t,s)\,dt\,ds = \int_0^T k(t)\left(\int_0^T k(s) R_N(t,s)\,ds\right) dt = \int_0^T k(t) S(t)\,dt = \rho \\
\operatorname{var}[G \mid H] &= \mathbf{E}[G^2 \mid H] - (\mathbf{E}[G \mid H])^2 = \rho \\
\mathbf{E}[G^2 \mid K] &= \int_0^T \int_0^T k(t) k(s)\,\mathbf{E}[x(t) x(s)]\,dt\,ds = \int_0^T \int_0^T k(t) k(s) \left(R_N(t,s) + S(t) S(s)\right) dt\,ds = \rho + \rho^2 \\
\operatorname{var}[G \mid K] &= \mathbf{E}[G^2 \mid K] - (\mathbf{E}[G \mid K])^2 = \rho + \rho^2 - \rho^2 = \rho
\end{aligned}
$$
Hence, we obtain the distributions of $G$ under $H$ and $K$:
$$H: G \sim N(0, \rho)$$

$$K: G \sim N(\rho, \rho)$$

The false alarm error is
$$\alpha = \int_{G_0}^{\infty} N(0, \rho)\,dG = 1 - \Phi\left(\frac{G_0}{\sqrt{\rho}}\right).$$

So the test threshold for the NeymanâPearson optimum detector is
$$G_0 = \sqrt{\rho}\, \Phi^{-1}(1 - \alpha).$$

Its power of detection is
$$\beta = \int_{G_0}^{\infty} N(\rho, \rho)\,dG = \Phi\left(\sqrt{\rho} - \Phi^{-1}(1 - \alpha)\right)$$

When the noise is a white Gaussian process, the signal power is
$$\rho = \int_0^T k(t) S(t)\,dt = \int_0^T S(t)^2\,dt = E.$$

##### Prewhitening
For some types of colored noise, a typical practice is to add a prewhitening filter before the matched filter to transform the colored noise into white noise. For example, $N(t)$ is a wide-sense stationary colored noise with correlation function
$$R_N(\tau) = \frac{B N_0}{4} e^{-B|\tau|}$$

$$S_N(f) = \frac{N_0}{2\left(1 + \left(\frac{w}{B}\right)^2\right)}$$

The transfer function of the prewhitening filter is
$$H(f) = 1 + j\frac{w}{B}.$$

#### Detection of a Gaussian random signal in [Additive white Gaussian noise (AWGN)](https://en.wikipedia.org/wiki/Additive_white_Gaussian_noise "Additive white Gaussian noise")
When the signal we want to detect from the noisy channel is also random, for example, a white Gaussian process $X(t)$, we can still implement the K–L expansion to get an independent sequence of observations. In this case, the detection problem is described as follows:
$$H_0: Y(t) = N(t)$$

H
1
:
Y
(
t
)
\=
N
(
t
)
\+
X
(
t
)
,
0
\<
t
\<
T
.
{\\displaystyle H\_{1}:Y(t)=N(t)+X(t),\\quad 0\<t\<T.}

*X*(*t*) is a random process with correlation function $R_X(t,s) = E\{X(t)X(s)\}$.
The K–L expansion of *X*(*t*) is
$$X(t) = \sum_{i=1}^{\infty} X_i \Phi_i(t),$$

where
$$X_i = \int_0^T X(t)\Phi_i(t)\,dt$$

and the $\Phi_i(t)$ are solutions to
$$\int_0^T R_X(t,s)\Phi_i(s)\,ds = \lambda_i \Phi_i(t).$$

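In practice the eigenpairs $(\lambda_i, \Phi_i)$ of this Fredholm equation are rarely available in closed form; a common numerical route is to discretize the integral (the Nyström method). A minimal sketch, assuming an illustrative exponential kernel for $R_X$ rather than one fixed by the text:

```python
# Nystrom discretization of int_0^T R_X(t,s) Phi_i(s) ds = lambda_i Phi_i(t).
import numpy as np

T, n = 1.0, 400
t = (np.arange(n) + 0.5) * (T / n)   # midpoint quadrature grid
w = T / n                            # quadrature weight

R = np.exp(-np.abs(t[:, None] - t[None, :]))  # example kernel R_X(t,s) = e^{-|t-s|}

lam, V = np.linalg.eigh(R * w)       # discretized integral operator (symmetric)
lam, V = lam[::-1], V[:, ::-1]       # sort eigenvalues in decreasing order
Phi = V / np.sqrt(w)                 # rescale so that int Phi_i(t)^2 dt ~ 1

print("leading eigenvalues:", np.round(lam[:4], 4))
```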
So the $X_i$ form an independent sequence of random variables with zero mean and variance $\lambda_i$. Expanding *Y*(*t*) and *N*(*t*) in terms of $\Phi_i(t)$, we get
$$Y_i = \int_0^T Y(t)\Phi_i(t)\,dt = \int_0^T [N(t)+X(t)]\Phi_i(t)\,dt = N_i + X_i,$$
where
$$N_i = \int_0^T N(t)\Phi_i(t)\,dt.$$

As *N*(*t*) is Gaussian white noise, the $N_i$ form an i.i.d. sequence of random variables with zero mean and variance $\tfrac{1}{2}N_0$, so the problem simplifies as follows:
$$H_0 : Y_i = N_i$$
$$H_1 : Y_i = N_i + X_i$$

The Neyman–Pearson optimal test is the likelihood-ratio test:
$$\Lambda = \frac{f_{Y\mid H_1}}{f_{Y\mid H_0}} = C \exp\left(\sum_{i=1}^{\infty} \frac{y_i^2}{2}\,\frac{\lambda_i}{\tfrac{1}{2}N_0\left(\tfrac{1}{2}N_0 + \lambda_i\right)}\right),$$

so the log-likelihood ratio is
$$\mathcal{L} = \ln(\Lambda) = K + \sum_{i=1}^{\infty} \tfrac{1}{2} y_i^2 \,\frac{\lambda_i}{\frac{N_0}{2}\left(\frac{N_0}{2} + \lambda_i\right)}.$$

Since
$$\widehat{X}_i = \frac{\lambda_i}{\frac{N_0}{2} + \lambda_i}\,Y_i$$
is just the minimum mean-square estimate of $X_i$ given $Y_i$, we obtain
$$\mathcal{L} = K + \frac{1}{N_0}\sum_{i=1}^{\infty} Y_i \widehat{X}_i.$$

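This is the classical estimator–correlator structure: the detector correlates the observation with its own MMSE estimate of the signal. A minimal simulation sketch under hypothesis $H_1$, with illustrative eigenvalues rather than values from the text:

```python
# Estimator-correlator statistic L - K = (1/N0) * sum_i Y_i * X_hat_i.
import numpy as np

rng = np.random.default_rng(1)
N0 = 2.0                              # noise level, so var(N_i) = N0/2 = 1
lam = 1.0 / np.arange(1, 51) ** 2     # illustrative K-L eigenvalues of R_X

# Under H1 the coefficients are Y_i = N_i + X_i with variance N0/2 + lambda_i.
Y = rng.normal(0.0, np.sqrt(N0 / 2 + lam))

X_hat = lam / (N0 / 2 + lam) * Y      # MMSE estimate of X_i given Y_i
stat = (Y * X_hat).sum() / N0         # log-likelihood ratio up to the constant K
print(f"test statistic (up to K): {stat:.3f}")
```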
The K–L expansion has the following property: if
$$f(t) = \sum f_i \Phi_i(t), \qquad g(t) = \sum g_i \Phi_i(t),$$

where
$$f_i = \int_0^T f(t)\Phi_i(t)\,dt, \qquad g_i = \int_0^T g(t)\Phi_i(t)\,dt,$$

then
$$\sum_{i=1}^{\infty} f_i g_i = \int_0^T g(t)f(t)\,dt.$$

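This identity is just Parseval's relation for the orthonormal basis $\{\Phi_i\}$. A quick numerical check on a grid, with an arbitrary complete orthonormal basis standing in for the eigenfunctions (illustrative sketch only):

```python
# Check sum_i f_i g_i = int_0^T f(t) g(t) dt for a discrete orthonormal basis.
import numpy as np

T, n = 1.0, 400
t = (np.arange(n) + 0.5) * (T / n)
w = T / n

# Any complete orthonormal basis works; use an orthogonalized random one.
Q, _ = np.linalg.qr(np.random.default_rng(0).normal(size=(n, n)))
Phi = Q / np.sqrt(w)                 # columns orthonormal in L2([0, T])

f, g = np.sin(2 * np.pi * t), np.exp(-t)
fi = Phi.T @ (f * w)                 # f_i = int f(t) Phi_i(t) dt
gi = Phi.T @ (g * w)
print(fi @ gi, (f * g).sum() * w)    # the two numbers agree
```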
So let
$$\widehat{X}(t\mid T) = \sum_{i=1}^{\infty} \widehat{X}_i \Phi_i(t), \qquad \mathcal{L} = K + \frac{1}{N_0}\int_0^T Y(t)\,\widehat{X}(t\mid T)\,dt.$$

The noncausal filter *Q*(*t*, *s*) can be used to obtain the estimate through
$$\widehat{X}(t\mid T) = \int_0^T Q(t,s)Y(s)\,ds.$$

By the [orthogonality principle](https://en.wikipedia.org/wiki/Orthogonality_principle "Orthogonality principle"), *Q*(*t*, *s*) satisfies
$$\int_0^T Q(t,s)R_X(s,\lambda)\,ds + \frac{N_0}{2} Q(t,\lambda) = R_X(t,\lambda), \qquad 0 < \lambda < T, \quad 0 < t < T.$$

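On a grid, this linear integral equation can be solved directly: discretizing the integral turns it into a matrix equation for *Q*. A minimal sketch, again assuming an illustrative exponential kernel for $R_X$:

```python
# Solve Q (R w) + (N0/2) Q = R, the discretized orthogonality-principle
# equation for the noncausal smoothing filter Q(t, s).
import numpy as np

T, n, N0 = 1.0, 200, 0.5
t = (np.arange(n) + 0.5) * (T / n)
w = T / n
R = np.exp(-np.abs(t[:, None] - t[None, :]))   # example R_X(t, s)

Q = R @ np.linalg.inv(R * w + (N0 / 2) * np.eye(n))

# Given samples Y of the observation on the grid, the smoothed estimate is
# X_hat = (Q @ Y) * w, i.e. X_hat(t|T) = int Q(t, s) Y(s) ds.
```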
However, for practical reasons, it is necessary to further derive the causal filter *h*(*t*, *s*), where *h*(*t*, *s*) = 0 for *s* > *t*, to obtain the estimate $\widehat{X}(t\mid t)$. Specifically,
$$Q(t,s) = h(t,s) + h(s,t) - \int_0^T h(\lambda,t)\,h(s,\lambda)\,d\lambda$$

## See also
- [Principal component analysis](https://en.wikipedia.org/wiki/Principal_component_analysis "Principal component analysis")
- [Polynomial chaos](https://en.wikipedia.org/wiki/Polynomial_chaos "Polynomial chaos")
- [Reproducing kernel Hilbert space](https://en.wikipedia.org/wiki/Reproducing_kernel_Hilbert_space "Reproducing kernel Hilbert space")
- [Mercer's theorem](https://en.wikipedia.org/wiki/Mercer%27s_theorem "Mercer's theorem")
## Notes
1. Sapatnekar, Sachin (2011), "Overcoming variations in nanometer-scale technologies", *IEEE Journal on Emerging and Selected Topics in Circuits and Systems*, **1** (1): 5–1, [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"): [2011IJEST...1....5S](https://ui.adsabs.harvard.edu/abs/2011IJEST...1....5S), [CiteSeerX](https://en.wikipedia.org/wiki/CiteSeerX_\(identifier\) "CiteSeerX (identifier)") [10.1.1.300.5659](https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.300.5659), [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"): [10.1109/jetcas.2011.2138250](https://doi.org/10.1109%2Fjetcas.2011.2138250), [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [15566585](https://api.semanticscholar.org/CorpusID:15566585)
2. Ghoman, Satyajit; Wang, Zhicun; Chen, PC; Kapania, Rakesh (2012). "A POD-based Reduced Order Design Scheme for Shape Optimization of Air Vehicles". *Proc. of 53rd AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference, AIAA-2012-1808, Honolulu, Hawaii*.
3. [Karhunen–Loeve transform (KLT)](http://fourier.eng.hmc.edu/e161/lectures/klt/node3.html) [Archived](https://web.archive.org/web/20161128140401/http://fourier.eng.hmc.edu/e161/lectures/klt/node3.html) 2016-11-28 at the [Wayback Machine](https://en.wikipedia.org/wiki/Wayback_Machine "Wayback Machine"), Computer Image Processing and Analysis (E161) lectures, Harvey Mudd College
4. Giambartolomei, Giordano (2016). "4 The Karhunen-Loève Theorem". [*The Karhunen-Loève theorem*](https://amslaurea.unibo.it/10169/) (Bachelors thesis). University of Bologna.
5. Mallat, Stéphane. *A Wavelet Tour of Signal Processing*.
6. Tang, X. (1998). "Texture information in run-length matrices". *IEEE Transactions on Image Processing*. **7** (11): 1602–1609.
## References
- Stark, Henry; Woods, John W. (1986). *Probability, Random Processes, and Estimation Theory for Engineers*. Prentice-Hall, Inc. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-13-711706-2](https://en.wikipedia.org/wiki/Special:BookSources/978-0-13-711706-2 "Special:BookSources/978-0-13-711706-2"). [OL](https://en.wikipedia.org/wiki/OL_\(identifier\) "OL (identifier)") [21138080M](https://openlibrary.org/books/OL21138080M).
- Ghanem, Roger; Spanos, Pol (1991). *Stochastic Finite Elements: A Spectral Approach*. Springer-Verlag. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-387-97456-9](https://en.wikipedia.org/wiki/Special:BookSources/978-0-387-97456-9 "Special:BookSources/978-0-387-97456-9"). [OL](https://en.wikipedia.org/wiki/OL_\(identifier\) "OL (identifier)") [1865197M](https://openlibrary.org/books/OL1865197M).
- Guikhman, I.; Skorokhod, A. (1977). *Introduction à la Théorie des Processus Aléatoires*. Éditions MIR.
- Simon, B. (1979). *Functional Integration and Quantum Physics*. Academic Press.
- Karhunen, Kari (1947). "Über lineare Methoden in der Wahrscheinlichkeitsrechnung". *Ann. Acad. Sci. Fennicae. Ser. A I. Math.-Phys*. **37**: 1–79.
- Loève, M. (1978). *Probability theory Vol. II*. Graduate Texts in Mathematics. Vol. 46 (4th ed.). Springer-Verlag. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-387-90262-3](https://en.wikipedia.org/wiki/Special:BookSources/978-0-387-90262-3 "Special:BookSources/978-0-387-90262-3").
- Dai, G. (1996). "Modal wave-front reconstruction with Zernike polynomials and Karhunen–Loeve functions". *JOSA A*. **13** (6): 1218. [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"): [1996JOSAA..13.1218D](https://ui.adsabs.harvard.edu/abs/1996JOSAA..13.1218D). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"): [10.1364/JOSAA.13.001218](https://doi.org/10.1364%2FJOSAA.13.001218).
- Wu, B.; Zhu, J.; Najm, F. (2005). "A Non-parametric Approach for Dynamic Range Estimation of Nonlinear Systems". *Proceedings of the Design Automation Conference*: 841–844.
- Wu, B.; Zhu, J.; Najm, F. (2006). "Dynamic Range Estimation". *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*. **25** (9): 1618–1636.
- Jorgensen, Palle E. T.; Song, Myung-Sin (2007). "Entropy Encoding, Hilbert Space and Karhunen–Loeve Transforms". *Journal of Mathematical Physics*. **48** (10): 103503. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"): [math-ph/0701056](https://arxiv.org/abs/math-ph/0701056). [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"): [2007JMP....48j3503J](https://ui.adsabs.harvard.edu/abs/2007JMP....48j3503J). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"): [10.1063/1.2793569](https://doi.org/10.1063%2F1.2793569). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [17039075](https://api.semanticscholar.org/CorpusID:17039075).
## External links
- *Mathematica* [KarhunenLoeveDecomposition](http://reference.wolfram.com/mathematica/ref/KarhunenLoeveDecomposition.html) function.
- *E161: Computer Image Processing and Analysis* notes by Prof. Ruye Wang at [Harvey Mudd College](https://en.wikipedia.org/wiki/Harvey_Mudd_College "Harvey Mudd College") [\[1\]](http://fourier.eng.hmc.edu/e161/lectures/klt/klt.html) [Archived](https://web.archive.org/web/20110516045654/http://fourier.eng.hmc.edu/e161/lectures/klt/klt.html) 2011-05-16 at the [Wayback Machine](https://en.wikipedia.org/wiki/Wayback_Machine "Wayback Machine")

| Readable Markdown | In the theory of [stochastic processes](https://en.wikipedia.org/wiki/Stochastic_process "Stochastic process"), the **KarhunenâLoève theorem** (named after [Kari Karhunen](https://en.wikipedia.org/wiki/Kari_Karhunen "Kari Karhunen") and [Michel Loève](https://en.wikipedia.org/wiki/Michel_Lo%C3%A8ve "Michel Loève")), also known as the **KosambiâKarhunenâLoève theorem**[\[1\]](https://en.wikipedia.org/wiki/Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem#cite_note-sapatnekar-1)[\[2\]](https://en.wikipedia.org/wiki/Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem#cite_note-ghoman-2) states that a [stochastic process](https://en.wikipedia.org/wiki/Stochastic_process "Stochastic process") can be represented as an infinite [linear combination](https://en.wikipedia.org/wiki/Linear_combination "Linear combination") of [orthogonal functions](https://en.wikipedia.org/wiki/Orthogonal_function "Orthogonal function"), analogous to a [Fourier series](https://en.wikipedia.org/wiki/Fourier_series "Fourier series") representation of a function on a bounded interval. The transformation is also known as [Hotelling](https://en.wikipedia.org/wiki/Harold_Hotelling "Harold Hotelling") transform and [eigenvector](https://en.wikipedia.org/wiki/Eigenvector "Eigenvector") transform, and is closely related to [principal component analysis](https://en.wikipedia.org/wiki/Principal_component_analysis "Principal component analysis") (PCA) technique widely used in image processing and in data analysis in many fields.[\[3\]](https://en.wikipedia.org/wiki/Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem#cite_note-3)
There exist many such expansions of a stochastic process: if the process is indexed over \[*a*, *b*\], any [orthonormal basis](https://en.wikipedia.org/wiki/Orthonormal_basis "Orthonormal basis") of *L*2(\[*a*, *b*\]) yields an expansion thereof in that form. The importance of the KarhunenâLoève theorem is that it yields the best such basis in the sense that it minimizes the total [mean squared error](https://en.wikipedia.org/wiki/Mean_squared_error "Mean squared error").
In contrast to a Fourier series where the coefficients are fixed numbers and the expansion basis consists of [sinusoidal functions](https://en.wikipedia.org/wiki/Trigonometric_function "Trigonometric function") (that is, [sine](https://en.wikipedia.org/wiki/Sine "Sine") and [cosine](https://en.wikipedia.org/wiki/Cosine "Cosine") functions), the coefficients in the KarhunenâLoève theorem are [random variables](https://en.wikipedia.org/wiki/Random_variable "Random variable") and the expansion basis depends on the process. In fact, the orthogonal basis functions used in this representation are determined by the [covariance function](https://en.wikipedia.org/wiki/Covariance_function "Covariance function") of the process. One can think that the KarhunenâLoève transform adapts to the process in order to produce the best possible basis for its expansion.
In the case of a *centered* stochastic process {*Xt*}*t* â \[*a*, *b*\] (*centered* means **E**\[*Xt*\] = 0 for all *t* â \[*a*, *b*\]) satisfying a technical continuity condition, X admits a decomposition

where Zk are pairwise [uncorrelated](https://en.wikipedia.org/wiki/Uncorrelated "Uncorrelated") random variables and the functions ek are continuous real-valued functions on \[*a*, *b*\] that are pairwise [orthogonal](https://en.wikipedia.org/wiki/Orthogonal_function "Orthogonal function") in *L*2(\[*a*, *b*\]). It is therefore sometimes said that the expansion is *bi-orthogonal* since the random coefficients Zk are orthogonal in the probability space while the deterministic functions ek are orthogonal in the time domain. The general case of a process Xt that is not centered can be brought back to the case of a centered process by considering *Xt* â **E**\[*Xt*\] which is a centered process.
Moreover, if the process is [Gaussian](https://en.wikipedia.org/wiki/Gaussian_process "Gaussian process"), then the random variables Zk are Gaussian and [stochastically independent](https://en.wikipedia.org/wiki/Stochastically_independent "Stochastically independent"). This result generalizes the *KarhunenâLoève transform*. An important example of a centered real stochastic process on \[0, 1\] is the [Wiener process](https://en.wikipedia.org/wiki/Wiener_process "Wiener process"); the KarhunenâLoève theorem can be used to provide a canonical orthogonal representation for it. In this case the expansion consists of sinusoidal functions.
The above expansion into uncorrelated random variables is also known as the *KarhunenâLoève expansion* or *KarhunenâLoève decomposition*. The [empirical](https://en.wikipedia.org/wiki/Statistic "Statistic") version (i.e., with the coefficients computed from a sample) is known as the *KarhunenâLoève transform* (KLT), *[principal component analysis](https://en.wikipedia.org/wiki/Principal_component_analysis "Principal component analysis")*, *[proper orthogonal decomposition](https://en.wikipedia.org/wiki/Proper_orthogonal_decomposition "Proper orthogonal decomposition") (POD)*, *[empirical orthogonal functions](https://en.wikipedia.org/wiki/Empirical_orthogonal_functions "Empirical orthogonal functions")* (a term used in [meteorology](https://en.wikipedia.org/wiki/Meteorology "Meteorology") and [geophysics](https://en.wikipedia.org/wiki/Geophysics "Geophysics")), or the *[Hotelling](https://en.wikipedia.org/wiki/Harold_Hotelling "Harold Hotelling") transform*.
- Throughout this article, we will consider a random process Xt defined over a [probability space](https://en.wikipedia.org/wiki/Probability_space "Probability space") (Ί, *F*, **P**) and indexed over a closed interval \[*a*, *b*\], which is [square-integrable](https://en.wikipedia.org/wiki/Square-integrable_function "Square-integrable function"), has zero-mean, and with covariance function *KX*(*s*, *t*). In other words, we have:
![{\\displaystyle \\forall t\\in \[a,b\]\\qquad X\_{t}\\in L^{2}(\\Omega ,F,\\mathbf {P} ),\\quad {\\text{i.e. }}\\mathbf {E} \[X\_{t}^{2}\]\<\\infty ,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/632342a83ac4f49f07863840a5571a5f3854ae00)
![{\\displaystyle \\forall t\\in \[a,b\]\\qquad \\mathbf {E} \[X\_{t}\]=0,}](https://wikimedia.org/api/rest_v1/media/math/render/svg/053ae36b10faf28608ba342fcd8618a0547627d6)
![{\\displaystyle \\forall t,s\\in \[a,b\]\\qquad K\_{X}(s,t)=\\mathbf {E} \[X\_{s}X\_{t}\].}](https://wikimedia.org/api/rest_v1/media/math/render/svg/304b63242d3aadafe76989a78b2f0616bb9bb05b)
The square-integrable condition ![{\\displaystyle \\mathbf {E} \[X\_{t}^{2}\]\<\\infty }](https://wikimedia.org/api/rest_v1/media/math/render/svg/63bcfac9c642a55be391d81dd3de4aeb591da1bc) is logically equivalent to  being finite for all ![{\\displaystyle s,t\\in \[a,b\]}](https://wikimedia.org/api/rest_v1/media/math/render/svg/e9b5cac7ee149307b7d98ab13f7c7bb33b1fb42c).[\[4\]](https://en.wikipedia.org/wiki/Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem#cite_note-giambartolomei-4)
- We associate to *K**X* a [linear operator](https://en.wikipedia.org/wiki/Linear_operator "Linear operator") (more specifically a [HilbertâSchmidt integral operator](https://en.wikipedia.org/wiki/Hilbert%E2%80%93Schmidt_integral_operator "HilbertâSchmidt integral operator")) *T**K**X* defined in the following way:
![{\\displaystyle T\_{K\_{X}}\\colon \\left\\{{\\begin{aligned}L^{2}(\[a,b\])&\\to L^{2}(\[a,b\])\\\\f&\\mapsto T\_{K\_{X}}f=\\int \_{a}^{b}K\_{X}(s,\\cdot )f(s)\\,ds\\end{aligned}}\\right.}](https://wikimedia.org/api/rest_v1/media/math/render/svg/7202812d5864abeac5bfa024d7919fb0dd3739e1)
Since *T**K**X* is a linear endomorphism, it makes sense to talk about its eigenvalues *Îťk* and eigenfunctions *e**k*, which are found by solving the homogeneous Fredholm [integral equation](https://en.wikipedia.org/wiki/Integral_equation "Integral equation") of the second kind
.
## Statement of the theorem
\[[edit](https://en.wikipedia.org/w/index.php?title=Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem&action=edit§ion=2 "Edit section: Statement of the theorem")\]
**Theorem**. Let Xt be a zero-mean square-integrable stochastic process defined over a probability space (Ί, *F*, **P**) and indexed over a closed and bounded interval \[*a*, *b*\], with continuous covariance function *K**X*(*s*, *t*).
Then *K**X*(*s,t*) is a [Mercer kernel](https://en.wikipedia.org/wiki/Mercer%27s_theorem "Mercer's theorem") and letting *e**k* be an orthonormal basis on *L*2(\[*a*, *b*\]) formed by the eigenfunctions of *T**K**X* with respective eigenvalues Îťk, Xt admits the following representation

where the convergence is in [*L*2](https://en.wikipedia.org/wiki/Convergence_of_random_variables#Convergence_in_mean "Convergence of random variables"), uniform in *t* and

Furthermore, the random variables *Z**k* have zero-mean, are uncorrelated and have variance *Îťk*
![{\\displaystyle \\mathbf {E} \[Z\_{k}\]=0,~\\forall k\\in \\mathbb {N} \\qquad {\\mbox{and}}\\qquad \\mathbf {E} \[Z\_{i}Z\_{j}\]=\\delta \_{ij}\\lambda \_{j},~\\forall i,j\\in \\mathbb {N} }](https://wikimedia.org/api/rest_v1/media/math/render/svg/23223c32fd5054a22a855f70feffce7a9c48f868)
Note that by generalizations of Mercer's theorem we can replace the interval \[*a*, *b*\] with other compact spaces *C* and the [Lebesgue measure](https://en.wikipedia.org/wiki/Lebesgue_measure "Lebesgue measure") on \[*a*, *b*\] with a [Borel measure](https://en.wikipedia.org/wiki/Borel_measure "Borel measure") whose support is *C*.
- The covariance function *K**X* satisfies the definition of a Mercer kernel. By [Mercer's theorem](https://en.wikipedia.org/wiki/Mercer%27s_theorem "Mercer's theorem"), there consequently exists a set *Îťk*, *ek*(*t*) of eigenvalues and eigenfunctions of T*K**X* forming an orthonormal basis of *L*2(\[*a*,*b*\]), and *K**X* can be expressed as

- The process *X**t* can be expanded in terms of the eigenfunctions *e**k* as:

where the coefficients (random variables) *Z**k* are given by the projection of *X**t* on the respective eigenfunctions

- We may then derive
![{\\displaystyle {\\begin{aligned}\\mathbf {E} \[Z\_{k}\]&=\\mathbf {E} \\left\[\\int \_{a}^{b}X\_{t}e\_{k}(t)\\,dt\\right\]=\\int \_{a}^{b}\\mathbf {E} \[X\_{t}\]e\_{k}(t)dt=0\\\\\[8pt\]\\mathbf {E} \[Z\_{i}Z\_{j}\]&=\\mathbf {E} \\left\[\\int \_{a}^{b}\\int \_{a}^{b}X\_{t}X\_{s}e\_{j}(t)e\_{i}(s)\\,dt\\,ds\\right\]\\\\&=\\int \_{a}^{b}\\int \_{a}^{b}\\mathbf {E} \\left\[X\_{t}X\_{s}\\right\]e\_{j}(t)e\_{i}(s)\\,dt\\,ds\\\\&=\\int \_{a}^{b}\\int \_{a}^{b}K\_{X}(s,t)e\_{j}(t)e\_{i}(s)\\,dt\\,ds\\\\&=\\int \_{a}^{b}e\_{i}(s)\\left(\\int \_{a}^{b}K\_{X}(s,t)e\_{j}(t)\\,dt\\right)\\,ds\\\\&=\\lambda \_{j}\\int \_{a}^{b}e\_{i}(s)e\_{j}(s)\\,ds\\\\&=\\delta \_{ij}\\lambda \_{j}\\end{aligned}}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/4ce33680637c8657a8bd6a180225c18f165917c3)
where we have used the fact that the *e**k* are eigenfunctions of *T**K**X* and are orthonormal.
- Let us now show that the convergence is in *L*2. Let

Then:
![{\\displaystyle {\\begin{aligned}\\mathbf {E} \\left\[\\left\|X\_{t}-S\_{N}\\right\|^{2}\\right\]&=\\mathbf {E} \\left\[X\_{t}^{2}\\right\]+\\mathbf {E} \\left\[S\_{N}^{2}\\right\]-2\\mathbf {E} \\left\[X\_{t}S\_{N}\\right\]\\\\&=K\_{X}(t,t)+\\mathbf {E} \\left\[\\sum \_{k=1}^{N}\\sum \_{l=1}^{N}Z\_{k}Z\_{\\ell }e\_{k}(t)e\_{\\ell }(t)\\right\]-2\\mathbf {E} \\left\[X\_{t}\\sum \_{k=1}^{N}Z\_{k}e\_{k}(t)\\right\]\\\\&=K\_{X}(t,t)+\\sum \_{k=1}^{N}\\lambda \_{k}e\_{k}(t)^{2}-2\\mathbf {E} \\left\[\\sum \_{k=1}^{N}\\int \_{a}^{b}X\_{t}X\_{s}e\_{k}(s)e\_{k}(t)\\,ds\\right\]\\\\&=K\_{X}(t,t)-\\sum \_{k=1}^{N}\\lambda \_{k}e\_{k}(t)^{2}\\end{aligned}}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/a8931f37eae91505769b097b93769ecd5f5894f4)
which goes to 0 by Mercer's theorem.
## Properties of the KarhunenâLoève transform
\[[edit](https://en.wikipedia.org/w/index.php?title=Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem&action=edit§ion=4 "Edit section: Properties of the KarhunenâLoève transform")\]
### Special case: Gaussian distribution
\[[edit](https://en.wikipedia.org/w/index.php?title=Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem&action=edit§ion=5 "Edit section: Special case: Gaussian distribution")\]
Since the limit in the mean of jointly Gaussian random variables is jointly Gaussian, and jointly Gaussian random (centered) variables are independent [if and only if](https://en.wikipedia.org/wiki/If_and_only_if "If and only if") they are orthogonal, we can also conclude:
**Theorem**. The variables Zi have a joint Gaussian distribution and are stochastically independent if the original process {*Xt*}*t* is Gaussian.
In the Gaussian case, since the variables Zi are independent, we can say more:

almost surely.
### The KarhunenâLoève transform decorrelates the process
\[[edit](https://en.wikipedia.org/w/index.php?title=Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem&action=edit§ion=6 "Edit section: The KarhunenâLoève transform decorrelates the process")\]
This is a consequence of the independence of the Zk.
### The KarhunenâLoève expansion minimizes the total mean square error
\[[edit](https://en.wikipedia.org/w/index.php?title=Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem&action=edit§ion=7 "Edit section: The KarhunenâLoève expansion minimizes the total mean square error")\]
In the introduction, we mentioned that the truncated KarhunenâLoeve expansion was the best approximation of the original process in the sense that it reduces the total mean-square error resulting of its truncation. Because of this property, it is often said that the KL transform optimally compacts the energy.
More specifically, given any orthonormal basis {*f**k*} of *L*2(\[*a*, *b*\]), we may decompose the process *Xt* as:

where

and we may approximate *X**t* by the finite sum

for some integer *N*.
**Claim**. Of all such approximations, the KL approximation is the one that minimizes the total mean square error (provided we have arranged the eigenvalues in decreasing order).
**Proof**
Consider the error resulting from the truncation at the *N*\-th term in the following orthonormal expansion:

The mean-square error *Îľ**N*2(*t*) can be written as:
![{\\displaystyle {\\begin{aligned}\\varepsilon \_{N}^{2}(t)&=\\mathbf {E} \\left\[\\sum \_{i=N+1}^{\\infty }\\sum \_{j=N+1}^{\\infty }A\_{i}(\\omega )A\_{j}(\\omega )f\_{i}(t)f\_{j}(t)\\right\]\\\\&=\\sum \_{i=N+1}^{\\infty }\\sum \_{j=N+1}^{\\infty }\\mathbf {E} \\left\[\\int \_{a}^{b}\\int \_{a}^{b}X\_{t}X\_{s}f\_{i}(t)f\_{j}(s)\\,ds\\,dt\\right\]f\_{i}(t)f\_{j}(t)\\\\&=\\sum \_{i=N+1}^{\\infty }\\sum \_{j=N+1}^{\\infty }f\_{i}(t)f\_{j}(t)\\int \_{a}^{b}\\int \_{a}^{b}K\_{X}(s,t)f\_{i}(t)f\_{j}(s)\\,ds\\,dt\\end{aligned}}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/db3545962d3e0d868cffefe4b7b028aa583d140a)
We then integrate this last equality over \[*a*, *b*\]. The orthonormality of the *fk* yields:

The problem of minimizing the total mean-square error thus comes down to minimizing the right hand side of this equality subject to the constraint that the *f**k* be normalized. We hence introduce βk, the Lagrangian multipliers associated with these constraints, and aim at minimizing the following function:
![{\\displaystyle Er\[f\_{k}(t),k\\in \\{N+1,\\ldots \\}\]=\\sum \_{k=N+1}^{\\infty }\\int \_{a}^{b}\\int \_{a}^{b}K\_{X}(s,t)f\_{k}(t)f\_{k}(s)\\,ds\\,dt-\\beta \_{k}\\left(\\int \_{a}^{b}f\_{k}(t)f\_{k}(t)\\,dt-1\\right)}](https://wikimedia.org/api/rest_v1/media/math/render/svg/a21063ed7c70d9c06aa9e8ea756049df74acd26d)
Differentiating with respect to *f**i*(*t*) (this is a [functional derivative](https://en.wikipedia.org/wiki/Functional_derivative "Functional derivative")) and setting the derivative to 0 yields:

which is satisfied in particular when

In other words, when the *f**k* are chosen to be the eigenfunctions of *T**K**X*, hence resulting in the KL expansion.
An important observation is that since the random coefficients *Z**k* of the KL expansion are uncorrelated, the [BienaymĂŠ formula](https://en.wikipedia.org/wiki/Variance#Sum_of_variables "Variance") asserts that the variance of *X**t* is simply the sum of the variances of the individual components of the sum:
![{\\displaystyle \\operatorname {var} \[X\_{t}\]=\\sum \_{k=0}^{\\infty }e\_{k}(t)^{2}\\operatorname {var} \[Z\_{k}\]=\\sum \_{k=1}^{\\infty }\\lambda \_{k}e\_{k}(t)^{2}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/964cf49295fa3fa55bf38196d631bcf1397214dd)
Integrating over \[*a*, *b*\] and using the orthonormality of the *e**k*, we obtain that the total variance of the process is:
![{\\displaystyle \\int \_{a}^{b}\\operatorname {var} \[X\_{t}\]\\,dt=\\sum \_{k=1}^{\\infty }\\lambda \_{k}}](https://wikimedia.org/api/rest_v1/media/math/render/svg/56be7b3cd7f104d34b681a70c0ee160bcc4fb848)
In particular, the total variance of the *N*\-truncated approximation is

As a result, the *N*\-truncated expansion explains

of the variance; and if we are content with an approximation that explains, say, 95% of the variance, then we just have to determine an  such that

### The KarhunenâLoève expansion has the minimum representation entropy property
\[[edit](https://en.wikipedia.org/w/index.php?title=Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem&action=edit§ion=9 "Edit section: The KarhunenâLoève expansion has the minimum representation entropy property")\]
Given a representation of , for some orthonormal basis  and random , we let ![{\\displaystyle p\_{k}=\\mathbb {E} \[\|W\_{k}\|^{2}\]/\\mathbb {E} \[\|X\_{t}\|\_{L^{2}}^{2}\]}](https://wikimedia.org/api/rest_v1/media/math/render/svg/d508243f2e679ea109f53d5f05dab97da287d79d), so that . We may then define the representation [entropy](https://en.wikipedia.org/wiki/Entropy_\(information_theory\) "Entropy (information theory)") to be . Then we have , for all choices of . That is, the KL-expansion has minimal representation entropy.
**Proof:**
Denote the coefficients obtained for the basis  as , and for  as .
Choose . Note that since  minimizes the mean squared error, we have that

Expanding the right hand side, we get:

$$\mathbb{E}\left\|\sum_{k=1}^N W_k \varphi_k(t) - X_t\right\|^2_{L^2} = \mathbb{E}\|X_t\|^2_{L^2} + \sum_{k=1}^N \sum_{\ell=1}^N \mathbb{E}\left[W_\ell \varphi_\ell(t) W_k^* \varphi_k^*(t)\right]_{L^2} - \sum_{k=1}^N \mathbb{E}\left[W_k \varphi_k X_t^*\right]_{L^2} - \sum_{k=1}^N \mathbb{E}\left[X_t W_k^* \varphi_k^*(t)\right]_{L^2}$$

Using the orthonormality of $\varphi_k(t)$, and expanding $X_t$ in the $\varphi_k(t)$ basis, we get that the right hand side is equal to:

$$\mathbb{E}\|X_t\|^2_{L^2} - \sum_{k=1}^N \mathbb{E}\left[|W_k|^2\right]$$

We may perform identical analysis for the $e_k(t)$, and so rewrite the above inequality as:

$$\mathbb{E}\|X_t\|^2_{L^2} - \sum_{k=1}^N \mathbb{E}\left[|Z_k|^2\right] \leq \mathbb{E}\|X_t\|^2_{L^2} - \sum_{k=1}^N \mathbb{E}\left[|W_k|^2\right]$$
Subtracting the common first term, and dividing by $\mathbb{E}[\|X_t\|^2_{L^2}]$, we obtain that:

$$\sum_{k=1}^N p_k \geq \sum_{k=1}^N q_k$$

Since this holds for every $N$, the sequence $(p_k)$ majorizes $(q_k)$, and Schur-concavity of the entropy function implies that:

$$-\sum_{k=1}^{\infty} p_k \log(p_k) \leq -\sum_{k=1}^{\infty} q_k \log(q_k)$$
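A quick numerical illustration of this property (an assumption-laden sketch, not from the article): compare the representation entropy of the eigenvector basis of a synthetic covariance $\Sigma$ with that of a random orthonormal basis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic covariance with a decaying spectrum (illustrative choice).
N = 32
U, _ = np.linalg.qr(rng.standard_normal((N, N)))
lam = 1.0 / np.arange(1, N + 1) ** 2
Sigma = U @ np.diag(lam) @ U.T

def representation_entropy(B, Sigma):
    # Columns of B are the basis; p_k = E|W_k|^2 / E||X||^2 = b_k' Sigma b_k / tr Sigma.
    p = np.diag(B.T @ Sigma @ B) / np.trace(Sigma)
    return -np.sum(p * np.log(p))

_, V = np.linalg.eigh(Sigma)                       # KL basis: eigenvectors of Sigma
Q, _ = np.linalg.qr(rng.standard_normal((N, N)))   # some other orthonormal basis
print(representation_entropy(V, Sigma))            # smaller
print(representation_entropy(Q, Sigma))            # at least as large
```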

## Linear Karhunen–Loève approximations
Consider a whole class of signals we want to approximate over the first $M$ vectors of a basis. These signals are modeled as realizations of a random vector $Y[n]$ of size $N$. To optimize the approximation we design a basis that minimizes the average [approximation error](https://en.wikipedia.org/wiki/Approximation_error "Approximation error"). This section proves that optimal bases are Karhunen–Loève bases that diagonalize the covariance matrix of $Y$. The random vector $Y$ can be decomposed in an orthogonal basis

$$\left\{g_m\right\}_{0\leq m<N}$$

as follows:

$$Y = \sum_{m=0}^{N-1} \left\langle Y, g_m\right\rangle g_m,$$

where each

$$\left\langle Y, g_m\right\rangle = \sum_{n=0}^{N-1} Y[n]\, g_m^*[n]$$

is a random variable. The approximation from the first $M \leq N$ vectors of the basis is

$$Y_M = \sum_{m=0}^{M-1} \left\langle Y, g_m\right\rangle g_m$$
The energy conservation in an orthogonal basis implies

$$\varepsilon[M] = \mathbf{E}\left\{\left\|Y - Y_M\right\|^2\right\} = \sum_{m=M}^{N-1} \mathbf{E}\left\{\left|\left\langle Y, g_m\right\rangle\right|^2\right\}$$

This error is related to the covariance of $Y$ defined by

$$R[n,m] = \mathbf{E}\left\{Y[n]\, Y^*[m]\right\}$$

Denote by $K$ the [covariance operator](https://en.wikipedia.org/wiki/Covariance_operator "Covariance operator") represented by this matrix; for any vector $x[n]$,

$$\mathbf{E}\left\{\left|\langle Y, x\rangle\right|^2\right\} = \langle Kx, x\rangle = \sum_{n=0}^{N-1}\sum_{m=0}^{N-1} R[n,m]\, x[n]\, x^*[m]$$

The error $\varepsilon[M]$ is therefore a sum of the last $N-M$ coefficients of the covariance operator:

$$\varepsilon[M] = \sum_{m=M}^{N-1} \left\langle K g_m, g_m\right\rangle$$
The covariance operator $K$ is Hermitian and positive and is thus diagonalized in an orthogonal basis called a Karhunen–Loève basis. The following theorem states that a Karhunen–Loève basis is optimal for linear approximations.

**Theorem (Optimality of Karhunen–Loève basis).** Let $K$ be a covariance operator. For all $M \geq 1$, the approximation error

$$\varepsilon[M] = \sum_{m=M}^{N-1} \left\langle K g_m, g_m\right\rangle$$

is minimum if and only if

$$\left\{g_m\right\}_{0\leq m<N}$$

is a Karhunen–Loève basis ordered by decreasing eigenvalues:

$$\left\langle K g_m, g_m\right\rangle \geq \left\langle K g_{m+1}, g_{m+1}\right\rangle, \qquad 0 \leq m < N-1.$$
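A small numerical check of this optimality theorem might look as follows; the covariance matrix and the competing basis are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# An illustrative covariance operator (matrix) for vectors of size N.
N, M = 16, 4
A = rng.standard_normal((N, N))
K = A @ A.T / N

# Karhunen-Loeve basis: eigenvectors of K, ordered by decreasing eigenvalue.
w, V = np.linalg.eigh(K)
order = np.argsort(w)[::-1]
w, V = w[order], V[:, order]

def linear_error(B, K, M):
    # eps[M] = sum_{m >= M} <K g_m, g_m> for the basis in B's columns.
    return sum(B[:, m] @ K @ B[:, m] for m in range(M, B.shape[1]))

Q, _ = np.linalg.qr(rng.standard_normal((N, N)))   # some other orthonormal basis
print(linear_error(V, K, M))   # the minimum: sum of the N - M smallest eigenvalues
print(linear_error(Q, K, M))   # generic basis: at least as large
```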

## Non-Linear approximation in bases
Linear approximations project the signal on $M$ vectors chosen a priori. The approximation can be made more precise by choosing the $M$ orthogonal vectors depending on the signal properties. This section analyzes the general performance of these non-linear approximations. A signal $f \in \mathrm{H}$ is approximated with $M$ vectors selected adaptively in an orthonormal basis for $\mathrm{H}$:

$$\mathrm{B} = \left\{g_m\right\}_{m\in\mathbb{N}}$$

Let $f_M$ be the projection of $f$ over $M$ vectors whose indices are in $I_M$:

$$f_M = \sum_{m\in I_M} \left\langle f, g_m\right\rangle g_m$$

The approximation error is the sum of the remaining coefficients:

$$\varepsilon[M] = \left\|f - f_M\right\|^2 = \sum_{m\notin I_M} \left|\left\langle f, g_m\right\rangle\right|^2$$

To minimize this error, the indices in $I_M$ must correspond to the $M$ vectors having the largest inner product amplitude

$$\left|\left\langle f, g_m\right\rangle\right|.$$
These are the vectors that best correlate with $f$. They can thus be interpreted as the main features of $f$. The resulting error is necessarily smaller than the error of a [linear approximation](https://en.wikipedia.org/wiki/Linear_approximation "Linear approximation") which selects the $M$ approximation vectors independently of $f$. Let us sort

$$\left\{\left|\left\langle f, g_m\right\rangle\right|\right\}_{m\in\mathbb{N}}$$

in decreasing order

$$\left|\left\langle f, g_{m_k}\right\rangle\right| \geq \left|\left\langle f, g_{m_{k+1}}\right\rangle\right|.$$

The best non-linear approximation is

$$f_M = \sum_{k=1}^M \left\langle f, g_{m_k}\right\rangle g_{m_k}$$

It can also be written as inner product thresholding:

$$f_M = \sum_{m:\, \left|\left\langle f, g_m\right\rangle\right| \geq T} \left\langle f, g_m\right\rangle g_m$$

with

$$T = \left|\left\langle f, g_{m_M}\right\rangle\right|.$$

The non-linear error is

$$\varepsilon[M] = \left\|f - f_M\right\|^2 = \sum_{k=M+1}^{\infty} \left|\left\langle f, g_{m_k}\right\rangle\right|^2$$

This error goes quickly to zero as $M$ increases if the sorted values of $\left|\left\langle f, g_{m_k}\right\rangle\right|$ have a fast decay as $k$ increases. This decay is quantified by computing the $\ell^p$ norm of the signal inner products in $\mathrm{B}$:

$$\|f\|_{\mathrm{B},p} = \left(\sum_{m=0}^{\infty} \left|\left\langle f, g_m\right\rangle\right|^p\right)^{\frac{1}{p}}$$
The following theorem relates the decay of $\varepsilon[M]$ to $\|f\|_{\mathrm{B},p}$.

**Theorem (decay of error).** If $\|f\|_{\mathrm{B},p} < \infty$ with $p < 2$, then

$$\varepsilon[M] \leq \frac{\|f\|^2_{\mathrm{B},p}}{\frac{2}{p}-1}\, M^{1-\frac{2}{p}}$$

and

$$\varepsilon[M] = o\left(M^{1-\frac{2}{p}}\right).$$

Conversely, if $\varepsilon[M] = o\left(M^{1-\frac{2}{p}}\right)$, then $\|f\|_{\mathrm{B},q} < \infty$ for any $q > p$.
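The best $M$-term approximation described above amounts to sorting and thresholding coefficients; a minimal sketch follows, with a random basis and signal standing in for a real one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Best M-term (non-linear) approximation of f in an orthonormal basis:
# keep the M coefficients of largest amplitude.  Basis and signal are
# random here, purely for illustration.
N, M = 64, 8
B, _ = np.linalg.qr(rng.standard_normal((N, N)))   # columns are g_m
f = rng.standard_normal(N)

c = B.T @ f                           # inner products <f, g_m>
idx = np.argsort(np.abs(c))[::-1]     # amplitudes in decreasing order
keep = idx[:M]                        # I_M: indices of the M largest
fM = B[:, keep] @ c[keep]

# The error equals the sum of the discarded squared coefficients.
print(np.sum((f - fM) ** 2), np.sum(c[idx[M:]] ** 2))
```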
### Non-optimality of Karhunen–Loève bases
To further illustrate the differences between linear and non-linear approximations, we study the decomposition of a simple non-Gaussian random vector in a Karhunen–Loève basis. Processes whose realizations have a random translation are stationary. The Karhunen–Loève basis is then a Fourier basis and we study its performance. To simplify the analysis, consider a random vector $Y[n]$ of size $N$ that is a random shift modulo $N$ of a deterministic signal $f[n]$ of zero mean:

$$\sum_{n=0}^{N-1} f[n] = 0$$

$$Y[n] = f[(n-p) \bmod N]$$
The random shift $P$ is uniformly distributed on $[0, N-1]$:

$$\Pr(P = p) = \frac{1}{N}, \qquad 0 \leq p < N$$

Clearly

$$\mathbf{E}\{Y[n]\} = \frac{1}{N}\sum_{p=0}^{N-1} f[(n-p) \bmod N] = 0$$

and

$$R[n,k] = \mathbf{E}\{Y[n]\, Y[k]\} = \frac{1}{N}\sum_{p=0}^{N-1} f[(n-p) \bmod N]\, f[(k-p) \bmod N] = \frac{1}{N}\, f \circledast \bar{f}\,[n-k], \qquad \bar{f}[n] = f[-n]$$

Hence

$$R[n,k] = R_Y[n-k], \qquad R_Y[k] = \frac{1}{N}\, f \circledast \bar{f}\,[k],$$

where $\circledast$ denotes circular convolution.
Since $R_Y$ is $N$-periodic, $Y$ is a circular stationary random vector. The covariance operator is a [circular convolution](https://en.wikipedia.org/wiki/Circular_convolution "Circular convolution") with $R_Y$ and is therefore diagonalized in the discrete Fourier Karhunen–Loève basis

$$\left\{\frac{1}{\sqrt{N}}\, e^{i 2\pi m n/N}\right\}_{0\leq m<N}.$$

The power spectrum is the Fourier transform of $R_Y$:

$$P_Y[m] = \hat{R}_Y[m] = \frac{1}{N}\left|\hat{f}[m]\right|^2$$
**Example:** Consider an extreme case where $f[n] = \delta[n] - \delta[n-1]$. A theorem stated above guarantees that the Fourier Karhunen–Loève basis produces a smaller expected approximation error than a canonical basis of Diracs $\left\{g_m[n] = \delta[n-m]\right\}_{0\leq m<N}$. Indeed, we do not know a priori the abscissa of the non-zero coefficients of $Y$, so there is no particular Dirac that is better adapted to perform the approximation. But the Fourier vectors cover the whole support of $Y$ and thus absorb a part of the signal energy:

$$\mathbf{E}\left\{\left|\left\langle Y[n], \frac{1}{\sqrt{N}}\, e^{i 2\pi m n/N}\right\rangle\right|^2\right\} = P_Y[m] = \frac{4}{N}\sin^2\left(\frac{\pi m}{N}\right)$$

Selecting higher-frequency Fourier coefficients thus yields a better mean-square approximation than choosing a priori a few Dirac vectors to perform the approximation. The situation is totally different for non-linear approximations: if $f[n] = \delta[n] - \delta[n-1]$, then the discrete Fourier basis is extremely inefficient because $f$, and hence $Y$, have an energy that is almost uniformly spread among all Fourier vectors. In contrast, since $f$ has only two non-zero coefficients in the Dirac basis, a non-linear approximation of $Y$ with $M \geq 2$ gives zero error.[\[5\]](https://en.wikipedia.org/wiki/Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem#cite_note-5)
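The contrast in this example is easy to reproduce numerically; the sketch below (an illustration, with assumed sizes) compares keeping the $M$ largest Fourier coefficients against keeping the $M$ largest samples.

```python
import numpy as np

rng = np.random.default_rng(0)

# The extreme example: f[n] = delta[n] - delta[n-1], randomly shifted mod N.
N, M = 64, 2
f = np.zeros(N); f[0], f[1] = 1.0, -1.0
Y = np.roll(f, rng.integers(N))          # one realization

# Keep the M largest-amplitude discrete Fourier coefficients of Y.
F = np.fft.fft(Y)
keep = np.argsort(np.abs(F))[::-1][:M]
Fm = np.zeros(N, complex); Fm[keep] = F[keep]
Y_fourier = np.fft.ifft(Fm).real
print(np.sum((Y - Y_fourier) ** 2))      # most of the energy remains

# Keep the M largest samples (Dirac basis): exact for M >= 2.
idx = np.argsort(np.abs(Y))[::-1][:M]
Y_dirac = np.zeros(N); Y_dirac[idx] = Y[idx]
print(np.sum((Y - Y_dirac) ** 2))        # 0.0
```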
## Principal component analysis
We have established the Karhunen–Loève theorem and derived a few properties thereof. We also noted that one hurdle in its application was the numerical cost of determining the eigenvalues and eigenfunctions of its covariance operator through the Fredholm integral equation of the second kind

$$\int_a^b K_X(s,t)\, e_k(s)\,ds = \lambda_k e_k(t).$$

However, when applied to a discrete and finite process $\left(X_n\right)_{n\in\{1,\ldots,N\}}$, the problem takes a much simpler form and standard algebra can be used to carry out the calculations.
Note that a continuous process can also be sampled at *N* points in time in order to reduce the problem to a finite version.
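One minimal way to do this, sketched below, is to discretize the integral equation on a uniform grid (a Nyström-type approximation); the kernel $\min(s,t)$ is an assumed concrete example whose exact eigenvalues appear later in this article.

```python
import numpy as np

# Discretize the Fredholm eigenproblem for the kernel K(s, t) = min(s, t)
# on [0, 1] at N points: the integral becomes a Riemann sum with weight dt.
N = 500
t = (np.arange(N) + 0.5) / N
dt = 1.0 / N
K = np.minimum.outer(t, t)

w, V = np.linalg.eigh(K * dt)   # eigenvalues of the discretized operator
w = w[::-1]                     # decreasing order

# Compare with the analytic eigenvalues 1 / ((k - 1/2)^2 pi^2).
k = np.arange(1, 6)
print(w[:5])
print(1.0 / ((k - 0.5) ** 2 * np.pi ** 2))
```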
We henceforth consider a random $N$-dimensional vector $X = \left(X_1\ X_2\ \ldots\ X_N\right)^T$. As mentioned above, $X$ could contain $N$ samples of a signal, but, depending on the field of application, it can hold many other kinds of data; for instance, the answers to a survey or economic data in an econometrics analysis.

As in the continuous version, we assume that $X$ is centered, otherwise we can let $X := X - \mu_X$ (where $\mu_X$ is the [mean vector](https://en.wikipedia.org/wiki/Mean_vector "Mean vector") of $X$), which is centered.
Let us now adapt the procedure to the discrete case.

Recall that the main implication and difficulty of the KL transformation is computing the eigenvectors of the linear operator associated with the covariance function, which are given by the solutions to the integral equation written above.
Define $\Sigma$, the covariance matrix of $X$, as an $N \times N$ matrix whose elements are given by:

$$\Sigma_{ij} = \mathbf{E}[X_i X_j], \qquad \forall i,j \in \{1,\ldots,N\}$$
Rewriting the above integral equation to suit the discrete case, we observe that it turns into:

$$\Sigma \varphi = \lambda \varphi,$$

where $\varphi$ is an $N$-dimensional vector. The integral equation thus reduces to a simple matrix eigenvalue problem, which explains why PCA has such a broad domain of applications.
Since $\Sigma$ is a positive-definite symmetric matrix, it possesses a set of orthonormal eigenvectors forming a basis of $\mathbb{R}^N$, and we write $\{\lambda_i, \varphi_i\}_{i\in\{1,\ldots,N\}}$ for this set of eigenvalues and corresponding eigenvectors, listed in decreasing values of $\lambda_i$. Let also $\Phi$ be the orthonormal matrix whose columns are these eigenvectors:

$$\Phi = \left(\varphi_1\ \varphi_2\ \ldots\ \varphi_N\right), \qquad \Phi^T \Phi = I.$$
### Principal component transform
It remains to perform the actual KL transformation, called the *principal component transform* in this case. Recall that the transform was found by expanding the process with respect to the basis spanned by the eigenvectors of the covariance function. In this case, we hence have:

$$X = \sum_{i=1}^N \langle \varphi_i, X\rangle \varphi_i = \sum_{i=1}^N \varphi_i^T X\, \varphi_i$$

In a more compact form, the principal component transform of $X$ is defined by:

$$Y = \Phi^T X.$$

The $i$-th component of $Y$ is $Y_i = \varphi_i^T X$, the projection of $X$ on $\varphi_i$, and the inverse transform $X = \Phi Y$ yields the expansion of $X$ on the space spanned by the $\varphi_i$:

$$X = \sum_{i=1}^N Y_i \varphi_i = \sum_{i=1}^N \langle \varphi_i, X\rangle \varphi_i$$
As in the continuous case, we may reduce the dimensionality of the problem by truncating the sum at some $K \in \{1,\ldots,N\}$ such that

$$\frac{\sum_{i=1}^K \lambda_i}{\sum_{i=1}^N \lambda_i} \geq \alpha,$$

where $\alpha$ is the explained variance threshold we wish to set.
We can also reduce the dimensionality through the use of multilevel dominant eigenvector estimation (MDEE).[\[6\]](https://en.wikipedia.org/wiki/Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem#cite_note-6)
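Putting the discrete procedure together, a compact sketch (with assumed synthetic data) estimates $\Sigma$ from samples, diagonalizes it, applies $Y = \Phi^T X$, and truncates at the threshold $\alpha$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: rows are observations of a random vector X (assumed).
n_samples, N = 1000, 10
A = rng.standard_normal((N, N))
X = rng.standard_normal((n_samples, N)) @ A.T
X -= X.mean(axis=0)                       # center

Sigma = X.T @ X / n_samples               # empirical covariance
lam, Phi = np.linalg.eigh(Sigma)
order = np.argsort(lam)[::-1]             # decreasing eigenvalues
lam, Phi = lam[order], Phi[:, order]

Y = X @ Phi                               # principal component transform
alpha = 0.95                              # explained-variance threshold
K = int(np.searchsorted(np.cumsum(lam) / lam.sum(), alpha) + 1)
X_hat = Y[:, :K] @ Phi[:, :K].T           # rank-K reconstruction
print(K, np.mean((X - X_hat) ** 2))
```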
## Examples

### The Wiener process

There are numerous equivalent characterizations of the [Wiener process](https://en.wikipedia.org/wiki/Wiener_process "Wiener process"), which is a mathematical formalization of [Brownian motion](https://en.wikipedia.org/wiki/Brownian_motion "Brownian motion"). Here we regard it as the centered standard Gaussian process $W_t$ with covariance function

$$K_W(t,s) = \operatorname{cov}(W_t, W_s) = \min(s,t).$$

We restrict the time domain to $[a,b] = [0,1]$ without loss of generality.

The eigenvectors of the covariance kernel are easily determined. These are

$$e_k(t) = \sqrt{2}\sin\left(\left(k - \tfrac{1}{2}\right)\pi t\right)$$

and the corresponding eigenvalues are

$$\lambda_k = \frac{1}{\left(k - \frac{1}{2}\right)^2 \pi^2}.$$
**Proof:**
In order to find the eigenvalues and eigenvectors, we need to solve the integral equation:

$$\int_0^1 \min(s,t)\, e(s)\,ds = \lambda e(t),$$

that is,

$$\int_0^t s\, e(s)\,ds + t\int_t^1 e(s)\,ds = \lambda e(t);$$

differentiating once with respect to $t$ yields:

$$\int_t^1 e(s)\,ds = \lambda e'(t);$$

a second differentiation produces the following differential equation:

$$-e(t) = \lambda e''(t).$$

The general solution of which has the form:

$$e(t) = A\sin\left(\frac{t}{\sqrt{\lambda}}\right) + B\cos\left(\frac{t}{\sqrt{\lambda}}\right),$$

where $A$ and $B$ are two constants to be determined with the boundary conditions. Setting $t = 0$ in the initial integral equation gives $e(0) = 0$, which implies that $B = 0$; similarly, setting $t = 1$ in the first differentiation yields $e'(1) = 0$, whence:

$$\cos\left(\frac{1}{\sqrt{\lambda}}\right) = 0,$$

which in turn implies that the eigenvalues of $T_{K_X}$ are:

$$\lambda_k = \left(\frac{1}{\left(k - \frac{1}{2}\right)\pi}\right)^2, \qquad k \geq 1.$$

The corresponding eigenfunctions are thus of the form:

$$e_k(t) = A\sin\left(\left(k - \tfrac{1}{2}\right)\pi t\right).$$

$A$ is then chosen so as to normalize $e_k$:

$$\int_0^1 e_k^2(t)\,dt = 1 \quad \Longrightarrow \quad A = \sqrt{2}.$$
This gives the following representation of the Wiener process:

**Theorem.** There is a sequence $\{Z_i\}_i$ of independent Gaussian random variables with mean zero and variance 1 such that

$$W_t = \sqrt{2}\sum_{k=1}^{\infty} Z_k\, \frac{\sin\left(\left(k - \frac{1}{2}\right)\pi t\right)}{\left(k - \frac{1}{2}\right)\pi}.$$
Note that this representation is only valid for $t \in [0,1]$. On larger intervals, the increments are not independent. As stated in the theorem, convergence is in the $L^2$ norm and uniform in $t$.
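Sample paths are easy to generate from a truncated series; the sketch below (truncation level and grid are arbitrary choices) also spot-checks the variance at $t = 1$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Truncated KL series of the Wiener process on [0, 1]:
# W_t ~ sqrt(2) * sum_{k<=K} Z_k sin((k - 1/2) pi t) / ((k - 1/2) pi).
t = np.linspace(0.0, 1.0, 501)
K = 1000
k = np.arange(1, K + 1)
freq = (k - 0.5) * np.pi

Z = rng.standard_normal(K)
W = np.sqrt(2) * np.sin(np.outer(t, freq)) @ (Z / freq)   # one sample path

# Monte Carlo check: var(W_1) should be close to 1.
Z_many = rng.standard_normal((2000, K))
W1 = np.sqrt(2) * (Z_many / freq) @ np.sin(freq)
print(W[0], W1.var())    # W_0 = 0; variance at t = 1 near 1
```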
### The Brownian bridge
Similarly, the [Brownian bridge](https://en.wikipedia.org/wiki/Brownian_bridge "Brownian bridge") $B_t = W_t - t W_1$, which is a [stochastic process](https://en.wikipedia.org/wiki/Stochastic_process "Stochastic process") with covariance function

$$K_B(t,s) = \min(t,s) - ts,$$

can be represented as the series

$$B_t = \sum_{k=1}^{\infty} Z_k\, \frac{\sqrt{2}\sin(k\pi t)}{k\pi}.$$
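A matching sketch for the bridge, with the same caveats about truncation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Truncated KL series of the Brownian bridge on [0, 1]:
# B_t = sum_k Z_k sqrt(2) sin(k pi t) / (k pi), with B_0 = B_1 = 0.
t = np.linspace(0.0, 1.0, 501)
K = 1000
k = np.arange(1, K + 1)
Z = rng.standard_normal(K)
B = np.sin(np.outer(t, k * np.pi)) @ (np.sqrt(2) * Z / (k * np.pi))
print(B[0], B[-1])      # both endpoints pinned at 0
```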

## Applications

[Adaptive optics](https://en.wikipedia.org/wiki/Adaptive_optics "Adaptive optics") systems sometimes use K–L functions to reconstruct wave-front phase information (Dai 1996, JOSA A). The Karhunen–Loève expansion is closely related to the [singular value decomposition](https://en.wikipedia.org/wiki/Singular_Value_Decomposition "Singular Value Decomposition"). The latter has myriad applications in image processing, radar, seismology, and the like. If one has independent vector observations from a vector-valued stochastic process, then the left singular vectors are [maximum likelihood](https://en.wikipedia.org/wiki/Maximum_likelihood "Maximum likelihood") estimates of the ensemble KL expansion.
### Applications in signal estimation and detection
#### Detection of a known continuous signal *S*(*t*)
In communication, we usually have to decide whether a signal from a noisy channel contains valuable information. The following hypothesis test is used for detecting a continuous signal $s(t)$ from channel output $X(t)$, where $N(t)$ is the channel noise, usually assumed to be a zero-mean Gaussian process with correlation function $R_N(t,s) = E[N(t)N(s)]$:

$$H: X(t) = N(t),$$

$$K: X(t) = N(t) + s(t), \qquad t \in (0,T).$$
#### Signal detection in white noise
When the channel noise is white, its correlation function is

$$R_N(t) = \tfrac{1}{2} N_0 \delta(t),$$

and it has a constant power spectral density. In a physically practical channel, the noise power is finite, so:

$$S_N(f) = \begin{cases} \frac{N_0}{2}, & |f| < w, \\ 0, & |f| > w. \end{cases}$$

Then the noise correlation function is a sinc function with zeros at multiples of $\frac{1}{2w}$. Since samples taken at these zeros are uncorrelated and Gaussian, they are independent. Thus we can take samples from $X(t)$ with time spacing

$$\Delta t = \frac{1}{2w} \text{ within } (0, T).$$

Let $X_i = X(i\,\Delta t)$. We have a total of $n = \frac{T}{\Delta t} = T(2w)$ i.i.d. observations $\{X_1, X_2, \ldots, X_n\}$ with which to develop the likelihood-ratio test. Define the signal samples $S_i = S(i\,\Delta t)$; the problem becomes:

$$H: X_i = N_i,$$

$$K: X_i = N_i + S_i, \qquad i = 1, 2, \ldots, n.$$
The log-likelihood ratio is

$$\mathcal{L}(\underline{x}) = \frac{\sum_{i=1}^n \left(2 S_i x_i - S_i^2\right)}{2\sigma^2} \gtrless \lambda \quad \Longleftrightarrow \quad \sum_{i=1}^n S(i\,\Delta t)\, x(i\,\Delta t)\,\Delta t \gtrless \lambda'.$$

As $\Delta t \to 0$, let:

$$G = \int_0^T S(t)\, x(t)\,dt.$$

Then $G$ is the test statistic and the [Neyman–Pearson optimum detector](https://en.wikipedia.org/wiki/Neyman%E2%80%93Pearson_lemma "Neyman–Pearson lemma") is

$$G(\underline{x}) > G_0 \Rightarrow \text{declare } K, \qquad G(\underline{x}) < G_0 \Rightarrow \text{declare } H.$$
As $G$ is Gaussian, we can characterize it by finding its mean and variance. Then we get

$$H: G \sim N\left(0, \tfrac{1}{2} N_0 E\right),$$

$$K: G \sim N\left(E, \tfrac{1}{2} N_0 E\right),$$

where

$$E = \int_0^T S^2(t)\,dt$$

is the signal energy.
The false alarm error is

$$\alpha = \int_{G_0}^{\infty} N\left(0, \tfrac{1}{2} N_0 E\right)\,dG \quad \Rightarrow \quad G_0 = \sqrt{\tfrac{1}{2} N_0 E}\,\Phi^{-1}(1 - \alpha).$$

And the probability of detection is:

$$\beta = \int_{G_0}^{\infty} N\left(E, \tfrac{1}{2} N_0 E\right)\,dG = 1 - \Phi\left(\frac{G_0 - E}{\sqrt{\tfrac{1}{2} N_0 E}}\right) = \Phi\left(\sqrt{\frac{2E}{N_0}} - \Phi^{-1}(1-\alpha)\right),$$

where $\Phi$ is the cdf of the standard normal, or Gaussian, variable.
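A discrete-time sketch of this detector (signal shape, noise level, and false-alarm target are all illustrative assumptions):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Correlator detector G = integral of S(t) x(t) dt, on a grid.
T, n = 1.0, 1000
dt = T / n
t = np.arange(n) * dt
S = np.sin(2 * np.pi * 5 * t)             # a known signal (assumed)
N0 = 0.1                                   # white-noise level, R_N = N0/2 delta

E = np.sum(S ** 2) * dt                    # signal energy
alpha = 0.01                               # target false-alarm rate
G0 = np.sqrt(N0 * E / 2) * norm.ppf(1 - alpha)
beta = norm.cdf(np.sqrt(2 * E / N0) - norm.ppf(1 - alpha))

# One noisy observation under K: discrete white noise has per-sample
# variance N0 / (2 dt), so that var(G) = N0 E / 2.
x = S + rng.standard_normal(n) * np.sqrt(N0 / (2 * dt))
G = np.sum(S * x) * dt
print(G > G0, beta)
```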
#### Signal detection in colored noise
When $N(t)$ is colored (correlated in time) Gaussian noise with zero mean and covariance function $R_N(t,s) = E[N(t)N(s)]$, we cannot sample independent discrete observations by evenly spacing the time. Instead, we can use the K–L expansion to decorrelate the noise process and get independent Gaussian observation 'samples'. The K–L expansion of $N(t)$:

$$N(t) = \sum_i N_i \Phi_i(t), \qquad 0 < t < T,$$

where $N_i = \int N(t)\Phi_i(t)\,dt$ and the orthonormal bases $\{\Phi_i(t)\}$ are generated by the kernel $R_N(t,s)$, i.e., solutions to

$$\int_0^T R_N(t,s)\,\Phi_i(s)\,ds = \lambda_i \Phi_i(t), \qquad \operatorname{var}[N_i] = \lambda_i.$$
Do the expansion:

$$X(t) = \sum_i X_i \Phi_i(t),$$

where $X_i = \int_0^T X(t)\Phi_i(t)\,dt$; then

$$X_i = N_i \text{ under } H, \qquad X_i = N_i + S_i \text{ under } K, \qquad \text{where } S_i = \int_0^T S(t)\Phi_i(t)\,dt.$$

Let $\overline{X} = \{X_1, X_2, \ldots\}$; we have:

- $N_i$ are independent Gaussian r.v.'s with variance $\lambda_i$;
- under $H$: $\{X_i\}$ are independent Gaussian r.v.'s:

$$f_H[x(t)\mid 0<t<T] = f_H(\underline{x}) = \prod_{i=1}^{\infty} \frac{1}{\sqrt{2\pi\lambda_i}}\exp\left(-\frac{x_i^2}{2\lambda_i}\right);$$

- under $K$: $\{X_i - S_i\}$ are independent Gaussian r.v.'s:

$$f_K[x(t)\mid 0<t<T] = f_K(\underline{x}) = \prod_{i=1}^{\infty} \frac{1}{\sqrt{2\pi\lambda_i}}\exp\left(-\frac{(x_i - S_i)^2}{2\lambda_i}\right).$$
Hence, the log-LR is given by

$$\mathcal{L}(\underline{x}) = \sum_{i=1}^{\infty} \frac{2 S_i x_i - S_i^2}{2\lambda_i}$$

and the optimum detector is

$$G = \sum_{i=1}^{\infty} S_i x_i \lambda_i^{-1} \gtrless G_0.$$

Define

$$k(t) = \sum_{i=1}^{\infty} \lambda_i^{-1} S_i \Phi_i(t), \qquad 0 < t < T;$$

then

$$G = \int_0^T k(t)\, x(t)\,dt.$$

Since

$$\int_0^T R_N(t,s)\, k(s)\,ds = \sum_{i=1}^{\infty} \lambda_i^{-1} S_i \int_0^T R_N(t,s)\,\Phi_i(s)\,ds = \sum_{i=1}^{\infty} S_i \Phi_i(t) = S(t),$$

$k(t)$ is the solution to

$$\int_0^T R_N(t,s)\, k(s)\,ds = S(t).$$
If $N(t)$ is wide-sense stationary,

$$\int_0^T R_N(t - s)\, k(s)\,ds = S(t),$$

which is known as the [Wiener–Hopf equation](https://en.wikipedia.org/wiki/Wiener%E2%80%93Hopf_equation "Wiener–Hopf equation"). The equation can be solved by taking the Fourier transform, but this is not practically realizable, since an infinite spectrum needs spatial factorization. A special case in which $k(t)$ is easy to calculate is white Gaussian noise:

$$\int_0^T \tfrac{1}{2} N_0 \delta(t-s)\, k(s)\,ds = S(t) \quad \Longrightarrow \quad k(t) = C\, S(t), \qquad 0 < t < T.$$

The corresponding impulse response is $h(t) = k(T-t) = C S(T-t)$. Letting $C = 1$, this is just the result we arrived at in the previous section for detecting a signal in white noise.
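In the general colored case, one can solve the integral equation for $k(t)$ numerically by discretizing it into a linear system; the noise correlation used below is an assumed exponential kernel, purely for illustration.

```python
import numpy as np

# Discretized version of int_0^T R_N(t, s) k(s) ds = S(t): on a grid the
# integral equation becomes (R * dt) k = S, a plain linear system.
T, n = 1.0, 400
dt = T / n
t = np.arange(n) * dt
S = np.sin(2 * np.pi * 3 * t)                      # known signal (assumed)

B, N0 = 20.0, 0.1                                   # assumed exponential noise model
R = (B * N0 / 4) * np.exp(-B * np.abs(np.subtract.outer(t, t)))

k = np.linalg.solve(R * dt, S)                      # correlator weighting k(t)
rho = np.sum(k * S) * dt                            # = var of G under H
G = lambda x: np.sum(k * x) * dt                    # detector statistic
print(rho)
```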
##### Test threshold for Neyman–Pearson detector
Since $X(t)$ is a Gaussian process,

$$G = \int_0^T k(t)\, x(t)\,dt$$

is a Gaussian random variable that can be characterized by its mean and variance:

$$\begin{aligned}
\mathbf{E}[G\mid H] &= \int_0^T k(t)\,\mathbf{E}[x(t)\mid H]\,dt = 0 \\
\mathbf{E}[G\mid K] &= \int_0^T k(t)\,\mathbf{E}[x(t)\mid K]\,dt = \int_0^T k(t)S(t)\,dt \equiv \rho \\
\mathbf{E}[G^2\mid H] &= \int_0^T \int_0^T k(t)k(s) R_N(t,s)\,dt\,ds = \int_0^T k(t)\left(\int_0^T k(s) R_N(t,s)\,ds\right)dt = \int_0^T k(t) S(t)\,dt = \rho \\
\operatorname{var}[G\mid H] &= \mathbf{E}[G^2\mid H] - (\mathbf{E}[G\mid H])^2 = \rho \\
\mathbf{E}[G^2\mid K] &= \int_0^T\int_0^T k(t)k(s)\,\mathbf{E}[x(t)x(s)]\,dt\,ds = \int_0^T\int_0^T k(t)k(s)\left(R_N(t,s) + S(t)S(s)\right)dt\,ds = \rho + \rho^2 \\
\operatorname{var}[G\mid K] &= \mathbf{E}[G^2\mid K] - (\mathbf{E}[G\mid K])^2 = \rho + \rho^2 - \rho^2 = \rho
\end{aligned}$$
Hence, we obtain the distributions of $G$ under $H$ and $K$:

$$H: G \sim N(0, \rho),$$

$$K: G \sim N(\rho, \rho).$$

The false alarm error is

$$\alpha = \int_{G_0}^{\infty} N(0, \rho)\,dG = 1 - \Phi\left(\frac{G_0}{\sqrt{\rho}}\right).$$

So the test threshold for the Neyman–Pearson optimum detector is

$$G_0 = \sqrt{\rho}\,\Phi^{-1}(1 - \alpha).$$

Its power of detection is

$$\beta = \int_{G_0}^{\infty} N(\rho, \rho)\,dG = \Phi\left(\sqrt{\rho} - \Phi^{-1}(1 - \alpha)\right).$$
When the noise is a white Gaussian process, the signal power is

$$\rho = \int_0^T k(t)\, S(t)\,dt = \int_0^T S(t)^2\,dt = E.$$
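Given $\rho$ and a false-alarm target $\alpha$, the threshold and power are one-liners; the numbers below are illustrative.

```python
import numpy as np
from scipy.stats import norm

# Threshold and power for G ~ N(0, rho) under H and N(rho, rho) under K.
rho, alpha = 2.5, 0.01          # illustrative values
G0 = np.sqrt(rho) * norm.ppf(1 - alpha)
beta = norm.cdf(np.sqrt(rho) - norm.ppf(1 - alpha))
print(G0, beta)
```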

For some types of colored noise, a typical practice is to add a prewhitening filter before the matched filter to transform the colored noise into white noise. For example, suppose $N(t)$ is a wide-sense stationary colored noise with correlation function

$$R_N(\tau) = \frac{B N_0}{4}\, e^{-B|\tau|},$$

$$S_N(f) = \frac{N_0}{2\left(1 + \left(\frac{w}{B}\right)^2\right)}.$$

The transfer function of the prewhitening filter is

$$H(f) = 1 + j\,\frac{w}{B}.$$
When the signal we want to detect from the noisy channel is also random, for example, a white Gaussian process $X(t)$, we can still implement the K–L expansion to get an independent sequence of observations. In this case, the detection problem is described as follows:

$$H_0: Y(t) = N(t),$$

$$H_1: Y(t) = N(t) + X(t), \qquad 0 < t < T.$$

$X(t)$ is a random process with correlation function $R_X(t,s) = E\{X(t)X(s)\}$.

The K–L expansion of $X(t)$ is

$$X(t) = \sum_i X_i \Phi_i(t),$$

where

$$X_i = \int_0^T X(t)\Phi_i(t)\,dt$$

and $\Phi_i(t)$ are solutions to

$$\int_0^T R_X(t,s)\,\Phi_i(s)\,ds = \lambda_i \Phi_i(t).$$

So the $X_i$'s are an independent sequence of r.v.'s with zero mean and variance $\lambda_i$. Expanding $Y(t)$ and $N(t)$ by $\Phi_i(t)$, we get
$$Y_i = \int_0^T Y(t)\Phi_i(t)\,dt = \int_0^T \left[N(t) + X(t)\right]\Phi_i(t)\,dt = N_i + X_i,$$

where

$$N_i = \int_0^T N(t)\Phi_i(t)\,dt.$$

As $N(t)$ is Gaussian white noise, the $N_i$'s are an i.i.d. sequence of r.v.'s with zero mean and variance $\tfrac{1}{2}N_0$, so the problem simplifies as follows:

$$H_0: Y_i = N_i,$$

$$H_1: Y_i = N_i + X_i.$$
The Neyman–Pearson optimal test:

$$\Lambda = \frac{f_{\underline{Y}\mid H_1}}{f_{\underline{Y}\mid H_0}} = C \exp\left(\sum_{i=1}^{\infty} \frac{y_i^2}{2}\,\frac{\lambda_i}{\frac{N_0}{2}\left(\frac{N_0}{2} + \lambda_i\right)}\right),$$

so the log-likelihood ratio is

$$\mathcal{L} = \ln(\Lambda) = K + \sum_{i=1}^{\infty} \frac{y_i^2}{2}\,\frac{\lambda_i}{\frac{N_0}{2}\left(\frac{N_0}{2} + \lambda_i\right)}.$$

Since

$$\widehat{X}_i = \frac{\lambda_i}{\frac{N_0}{2} + \lambda_i}\, y_i$$

is just the minimum-mean-square estimate of $X_i$ given the $Y_i$'s,

$$\mathcal{L} = K + \frac{1}{N_0}\sum_{i=1}^{\infty} Y_i \widehat{X}_i.$$
The K–L expansion has the following property: if

$$f(t) = \sum_i f_i \Phi_i(t), \qquad g(t) = \sum_i g_i \Phi_i(t),$$

where

$$f_i = \int_0^T f(t)\Phi_i(t)\,dt, \qquad g_i = \int_0^T g(t)\Phi_i(t)\,dt,$$

then

$$\sum_i f_i g_i = \int_0^T f(t)\, g(t)\,dt.$$

So let

$$\widehat{X}(t\mid T) = \sum_{i=1}^{\infty} \widehat{X}_i \Phi_i(t); \qquad \text{then} \quad \mathcal{L} = K + \frac{1}{N_0}\int_0^T y(t)\,\widehat{X}(t\mid T)\,dt.$$
A noncausal filter $Q(t,s)$ can be used to obtain the estimate through

$$\widehat{X}(t\mid T) = \int_0^T Q(t,s)\, y(s)\,ds.$$

By the [orthogonality principle](https://en.wikipedia.org/wiki/Orthogonality_principle "Orthogonality principle"), $Q(t,s)$ satisfies

$$\int_0^T Q(t,s)\, R_X(s,u)\,ds + \tfrac{N_0}{2}\, Q(t,u) = R_X(t,u), \qquad 0 < u < T.$$
However, for practical reasons, it is necessary to further derive the causal filter $h(t,s)$, where $h(t,s) = 0$ for $s > t$, to get the estimate $\widehat{X}(t\mid t)$. Specifically,

$$Q(t,s) = h(t,s) + h(s,t) - \int_0^T h(\lambda, t)\, h(\lambda, s)\,d\lambda.$$
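The per-coefficient structure of this detector (the "estimator-correlator") can be sketched directly in the KL domain; the spectrum $\lambda_i$ and noise level below are assumptions for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Per-coefficient sketch of detecting a Gaussian signal in white noise
# after a KL expansion: Y_i = N_i (+ X_i), var N_i = N0/2, var X_i = lam_i.
m = 200
lam = 1.0 / np.arange(1, m + 1) ** 2       # assumed signal spectrum
N0 = 0.2

X = rng.standard_normal(m) * np.sqrt(lam)  # signal coefficients (under H1)
Y = X + rng.standard_normal(m) * np.sqrt(N0 / 2)

X_mmse = lam / (N0 / 2 + lam) * Y          # MMSE estimate of X_i given Y_i
L = np.sum(Y * X_mmse) / N0                # data-dependent part of the log-LR
print(L)
```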

## See also

- [Principal component analysis](https://en.wikipedia.org/wiki/Principal_component_analysis "Principal component analysis")
- [Polynomial chaos](https://en.wikipedia.org/wiki/Polynomial_chaos "Polynomial chaos")
- [Reproducing kernel Hilbert space](https://en.wikipedia.org/wiki/Reproducing_kernel_Hilbert_space "Reproducing kernel Hilbert space")
- [Mercer's theorem](https://en.wikipedia.org/wiki/Mercer%27s_theorem "Mercer's theorem")
## Notes

1. **[^](https://en.wikipedia.org/wiki/Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem#cite_ref-sapatnekar_1-0)** Sapatnekar, Sachin (2011), "Overcoming variations in nanometer-scale technologies", *IEEE Journal on Emerging and Selected Topics in Circuits and Systems*, **1** (1): 5–1, [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2011IJEST...1....5S](https://ui.adsabs.harvard.edu/abs/2011IJEST...1....5S), [CiteSeerX](https://en.wikipedia.org/wiki/CiteSeerX_\(identifier\) "CiteSeerX (identifier)") [10.1.1.300.5659](https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.300.5659), [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10.1109/jetcas.2011.2138250](https://doi.org/10.1109%2Fjetcas.2011.2138250), [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [15566585](https://api.semanticscholar.org/CorpusID:15566585)
2. **[^](https://en.wikipedia.org/wiki/Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem#cite_ref-ghoman_2-0)** Ghoman, Satyajit; Wang, Zhicun; Chen, PC; Kapania, Rakesh (2012). "A POD-based Reduced Order Design Scheme for Shape Optimization of Air Vehicles". *Proc. of 53rd AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials Conference, AIAA-2012-1808, Honolulu, Hawaii*.
3. **[^](https://en.wikipedia.org/wiki/Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem#cite_ref-3)** [Karhunen–Loeve transform (KLT)](http://fourier.eng.hmc.edu/e161/lectures/klt/node3.html) [Archived](https://web.archive.org/web/20161128140401/http://fourier.eng.hmc.edu/e161/lectures/klt/node3.html) 2016-11-28 at the [Wayback Machine](https://en.wikipedia.org/wiki/Wayback_Machine "Wayback Machine"), Computer Image Processing and Analysis (E161) lectures, Harvey Mudd College
4. **[^](https://en.wikipedia.org/wiki/Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem#cite_ref-giambartolomei_4-0)** Giambartolomei, Giordano (2016). "4 The Karhunen-Loève Theorem". [*The Karhunen-Loève theorem*](https://amslaurea.unibo.it/10169/) (Bachelor's thesis). University of Bologna.
5. **[^](https://en.wikipedia.org/wiki/Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem#cite_ref-5)** Mallat, Stéphane. *A Wavelet Tour of Signal Processing*.
6. **[^](https://en.wikipedia.org/wiki/Kosambi%E2%80%93Karhunen%E2%80%93Lo%C3%A8ve_theorem#cite_ref-6)** X. Tang, "Texture information in run-length matrices", *IEEE Transactions on Image Processing*, vol. 7, no. 11, pp. 1602–1609, Nov. 1998
## References

- Stark, Henry; Woods, John W. (1986). *Probability, Random Processes, and Estimation Theory for Engineers*. Prentice-Hall, Inc. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-13-711706-2](https://en.wikipedia.org/wiki/Special:BookSources/978-0-13-711706-2 "Special:BookSources/978-0-13-711706-2"). [OL](https://en.wikipedia.org/wiki/OL_\(identifier\) "OL (identifier)") [21138080M](https://openlibrary.org/books/OL21138080M).
- Ghanem, Roger; Spanos, Pol (1991). *Stochastic finite elements: a spectral approach*. Springer-Verlag. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-387-97456-9](https://en.wikipedia.org/wiki/Special:BookSources/978-0-387-97456-9 "Special:BookSources/978-0-387-97456-9"). [OL](https://en.wikipedia.org/wiki/OL_\(identifier\) "OL (identifier)") [1865197M](https://openlibrary.org/books/OL1865197M).
- Guikhman, I.; Skorokhod, A. (1977). *Introduction à la Théorie des Processus Aléatoires*. Éditions MIR.
- Simon, B. (1979). *Functional Integration and Quantum Physics*. Academic Press.
- Karhunen, Kari (1947). "Über lineare Methoden in der Wahrscheinlichkeitsrechnung". *Ann. Acad. Sci. Fennicae. Ser. A I. Math.-Phys.* **37**: 1–79.
- Loève, M. (1978). *Probability theory Vol. II*. Graduate Texts in Mathematics. Vol. 46 (4 ed.). Springer-Verlag. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-387-90262-3](https://en.wikipedia.org/wiki/Special:BookSources/978-0-387-90262-3 "Special:BookSources/978-0-387-90262-3").
- Dai, G. (1996). "Modal wave-front reconstruction with Zernike polynomials and Karhunen–Loeve functions". *JOSA A*. **13** (6): 1218. [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[1996JOSAA..13.1218D](https://ui.adsabs.harvard.edu/abs/1996JOSAA..13.1218D). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10.1364/JOSAA.13.001218](https://doi.org/10.1364%2FJOSAA.13.001218).
- Wu, B.; Zhu, J.; Najm, F. (2005). "A Non-parametric Approach for Dynamic Range Estimation of Nonlinear Systems". In *Proceedings of the Design Automation Conference*, pp. 841–844.
- Wu, B.; Zhu, J.; Najm, F. (2006). "Dynamic Range Estimation". *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, **25** (9): 1618–1636.
- Jorgensen, Palle E. T.; Song, Myung-Sin (2007). "Entropy Encoding, Hilbert Space and Karhunen–Loeve Transforms". *Journal of Mathematical Physics*. **48** (10): 103503. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[math-ph/0701056](https://arxiv.org/abs/math-ph/0701056). [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2007JMP....48j3503J](https://ui.adsabs.harvard.edu/abs/2007JMP....48j3503J). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10.1063/1.2793569](https://doi.org/10.1063%2F1.2793569). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [17039075](https://api.semanticscholar.org/CorpusID:17039075).
## External links

- *Mathematica* [KarhunenLoeveDecomposition](http://reference.wolfram.com/mathematica/ref/KarhunenLoeveDecomposition.html) function.
- *E161: Computer Image Processing and Analysis* notes by Pr. Ruye Wang at [Harvey Mudd College](https://en.wikipedia.org/wiki/Harvey_Mudd_College "Harvey Mudd College") [\[1\]](http://fourier.eng.hmc.edu/e161/lectures/klt/klt.html) [Archived](https://web.archive.org/web/20110516045654/http://fourier.eng.hmc.edu/e161/lectures/klt/klt.html) 2011-05-16 at the [Wayback Machine](https://en.wikipedia.org/wiki/Wayback_Machine "Wayback Machine")