ℹ️ Skipped - page is already crawled
| Filter | Status | Condition | Details |
|---|---|---|---|
| HTTP status | PASS | download_http_code = 200 | HTTP 200 |
| Age cutoff | PASS | download_stamp > now() - 6 MONTH | 0.5 months ago (distributed domain, exempt) |
| History drop | PASS | isNull(history_drop_reason) | No drop reason |
| Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0 |
| Canonical | PASS | meta_canonical IS NULL OR = '' OR = src_unparsed | Not set |
| Property | Value |
|---|---|
| URL | https://en.wikipedia.org/wiki/Bayesian_inference |
| Last Crawled | 2026-04-02 21:40:06 (14 days ago) |
| First Indexed | 2013-08-08 16:25:20 (12 years ago) |
| HTTP Status Code | 200 |
| Meta Title | Bayesian inference - Wikipedia |
| Meta Description | null |
| Meta Canonical | null |
| Boilerpipe Text | Bayesian inference (BAY-zee-ən or BAY-zhən)[1] is a method of statistical inference in which Bayes' theorem is used to calculate a probability of a hypothesis, given prior evidence, and update it as more information becomes available. Fundamentally, Bayesian inference uses a prior distribution to estimate posterior probabilities.
Bayesian inference is an important technique in statistics, and especially in mathematical statistics. Bayesian updating is particularly important in the dynamic analysis of a sequence of data. Bayesian inference has found application in a wide range of activities, including science, engineering, philosophy, medicine, sport, psychology,[2] and law. In the philosophy of decision theory, Bayesian inference is closely related to subjective probability, often called "Bayesian probability".
Introduction to Bayes' rule
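A geometric visualisation of Bayes' theorem. In the table, the values 2, 3, 6 and 9 give the relative weights of each corresponding condition and case. The figures denote the cells of the table involved in each metric, the probability being the fraction of each figure that is shaded. This shows that $P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$, i.e. $P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$. Similar reasoning can be used to show that $P(\bar{A} \mid B) = \frac{P(B \mid \bar{A})\,P(\bar{A})}{P(B)}$, etc.
Contingency table
| Evidence \ Hypothesis | Satisfies hypothesis $H$ | Violates hypothesis $\neg H$ | Total |
|---|---|---|---|
| Has evidence $E$ | $P(H \mid E) \cdot P(E) = P(E \mid H) \cdot P(H)$ | $P(\neg H \mid E) \cdot P(E) = P(E \mid \neg H) \cdot P(\neg H)$ | $P(E)$ |
| No evidence $\neg E$ | $P(H \mid \neg E) \cdot P(\neg E) = P(\neg E \mid H) \cdot P(H)$ | $P(\neg H \mid \neg E) \cdot P(\neg E) = P(\neg E \mid \neg H) \cdot P(\neg H)$ | $P(\neg E) = 1 - P(E)$ |
| Total | $P(H)$ | $P(\neg H) = 1 - P(H)$ | 1 |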
Bayesian inference derives the posterior probability as a consequence of two antecedents: a prior probability and a "likelihood function" derived from a statistical model for the observed data. Bayesian inference computes the posterior probability according to Bayes' theorem:
$$P(H \mid E) = \frac{P(E \mid H) \cdot P(H)}{P(E)},$$
where
$H$ stands for any hypothesis whose probability may be affected by data (called evidence below). Often there are competing hypotheses, and the task is to determine which is the most probable.
$P(H)$, the prior probability, is the estimate of the probability of the hypothesis $H$ before the data $E$, the current evidence, is observed.
$E$, the evidence, corresponds to new data that were not used in computing the prior probability.
$P(H \mid E)$, the posterior probability, is the probability of $H$ given $E$, i.e., after $E$ is observed. This is what we want to know: the probability of a hypothesis given the observed evidence.
$P(E \mid H)$ is the probability of observing $E$ given $H$ and is called the likelihood. As a function of $E$ with $H$ fixed, it indicates the compatibility of the evidence with the given hypothesis. The likelihood function is a function of the evidence, $E$, while the posterior probability is a function of the hypothesis, $H$.
$P(E)$ is sometimes termed the marginal likelihood or "model evidence". This factor is the same for all possible hypotheses being considered (as is evident from the fact that the hypothesis $H$ does not appear anywhere in the symbol, unlike for all the other factors) and hence does not factor into determining the relative probabilities of different hypotheses. It must satisfy $P(E) > 0$. (Else one has $0/0$.)
For different values of $H$, only the factors $P(H)$ and $P(E \mid H)$, both in the numerator, affect the value of $P(H \mid E)$: the posterior probability of a hypothesis is proportional to its prior probability (its inherent likeliness) and the newly acquired likelihood (its compatibility with the new observed evidence).
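In cases where $\neg H$ ("not $H$"), the logical negation of $H$, has a valid likelihood, Bayes' rule can be rewritten as follows:
$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)} = \frac{P(E \mid H)\,P(H)}{P(E \mid H)\,P(H) + P(E \mid \neg H)\,P(\neg H)} = \frac{1}{1 + \left(\frac{1}{P(H)} - 1\right)\frac{P(E \mid \neg H)}{P(E \mid H)}}$$
because
$$P(E) = P(E \mid H)\,P(H) + P(E \mid \neg H)\,P(\neg H)$$
and
$$P(H) + P(\neg H) = 1.$$
This focuses attention on the term
$$\left(\frac{1}{P(H)} - 1\right)\frac{P(E \mid \neg H)}{P(E \mid H)}.$$
If that term is approximately 1, then the probability of the hypothesis given the evidence, $P(H \mid E)$, is about $\tfrac{1}{2}$: the hypothesis is about as likely to be true as not. If that term is very small, close to zero, then the probability of the hypothesis given the evidence, $P(H \mid E)$, is close to 1, so the hypothesis is quite likely. If that term is very large, much larger than 1, then the hypothesis, given the evidence, is quite unlikely. If the hypothesis (without consideration of evidence) is unlikely, then $P(H)$ is small (but not necessarily astronomically small), $\tfrac{1}{P(H)}$ is much larger than 1, and this term can be approximated as $\frac{P(E \mid \neg H)}{P(E \mid H) \cdot P(H)}$, so the relevant probabilities can be compared directly to each other.
One quick and easy way to remember the equation is to use the rule of multiplication:
$$P(E \cap H) = P(E \mid H)\,P(H) = P(H \mid E)\,P(E).$$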
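As a rough illustration of the two equivalent forms of Bayes' rule for a hypothesis and its negation, the following Python sketch computes the posterior both directly and through the term that multiplies the likelihood ratio. The function names and the numerical inputs are assumptions chosen for illustration; they do not come from the article.

```python
def posterior(prior_h: float, lik_h: float, lik_not_h: float) -> float:
    """P(H|E) from P(H), P(E|H) and P(E|not H) via Bayes' rule."""
    # P(E) by the law of total probability over H and not-H.
    evidence = lik_h * prior_h + lik_not_h * (1.0 - prior_h)
    if evidence == 0.0:
        raise ValueError("P(E) = 0: Bayes' rule is undefined (0/0)")
    return lik_h * prior_h / evidence


def posterior_via_term(prior_h: float, lik_h: float, lik_not_h: float) -> float:
    """Same posterior written as 1 / (1 + term); needs prior_h > 0 and lik_h > 0."""
    term = (1.0 / prior_h - 1.0) * lik_not_h / lik_h
    return 1.0 / (1.0 + term)


# Illustrative (assumed) numbers: an unlikely hypothesis, fairly diagnostic evidence.
p_direct = posterior(prior_h=0.01, lik_h=0.9, lik_not_h=0.05)
p_term = posterior_via_term(prior_h=0.01, lik_h=0.9, lik_not_h=0.05)
print(round(p_direct, 4), round(p_term, 4))  # both ≈ 0.1538
```

Here the term equals 5.5, much larger than 1, so the hypothesis remains fairly unlikely even after evidence that favours it.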
Alternatives to Bayesian updating
Bayesian updating is widely used and computationally convenient. However, it is not the only updating rule that might be considered rational.
Ian Hacking
noted that traditional "
Dutch book
" arguments did not specify Bayesian updating: they left open the possibility that non-Bayesian updating rules could avoid Dutch books. Hacking wrote:
[
3
]
"And neither the Dutch book argument nor any other in the personalist arsenal of proofs of the probability axioms entails the dynamic assumption. Not one entails Bayesianism. So the personalist requires the dynamic assumption to be Bayesian. It is true that in consistency a personalist could abandon the Bayesian model of learning from experience. Salt could lose its savour."
Indeed, there are non-Bayesian updating rules that also avoid Dutch books (as discussed in the literature on "
probability kinematics
") following the publication of
Richard C. Jeffrey
's rule, which applies Bayes' rule to the case where the evidence itself is assigned a probability.
[
4
]
The additional hypotheses needed to uniquely require Bayesian updating have been deemed to be substantial, complicated, and unsatisfactory.
[
5
]
Inference over exclusive and exhaustive possibilities
If evidence is simultaneously used to update belief over a set of exclusive and exhaustive propositions, Bayesian inference may be thought of as acting on this belief distribution as a whole.
General formulation
Diagram illustrating event space $\Omega$ in general formulation of Bayesian inference. Although this diagram shows discrete models and events, the continuous case may be visualized similarly using probability densities.
Suppose a process is generating independent and identically distributed events $E_n,\ n = 1, 2, 3, \ldots$, but the probability distribution is unknown. Let the event space $\Omega$ represent the current state of belief for this process. Each model is represented by event $M_m$. The conditional probabilities $P(E_n \mid M_m)$ are specified to define the models. $P(M_m)$ is the degree of belief in $M_m$. Before the first inference step, $\{P(M_m)\}$ is a set of initial prior probabilities. These must sum to 1, but are otherwise arbitrary.
Suppose that the process is observed to generate $E \in \{E_n\}$. For each $M \in \{M_m\}$, the prior $P(M)$ is updated to the posterior $P(M \mid E)$. From Bayes' theorem:[6]
$$P(M \mid E) = \frac{P(E \mid M)}{\sum_m P(E \mid M_m)\,P(M_m)} \cdot P(M).$$
Upon observation of further evidence, this procedure may be repeated.
Multiple observations
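For a sequence of independent and identically distributed observations $\mathbf{E} = (e_1, \dots, e_n)$, it can be shown by induction that repeated application of the above is equivalent to
$$P(M \mid \mathbf{E}) = \frac{P(\mathbf{E} \mid M)}{\sum_m P(\mathbf{E} \mid M_m)\,P(M_m)} \cdot P(M),$$
where
$$P(\mathbf{E} \mid M) = \prod_k P(e_k \mid M).$$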
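The equivalence between repeated single-observation updates and one batch update can be checked numerically. The sketch below assumes a hypothetical set of three coin models (heads probabilities 0.2, 0.5 and 0.8) and a short made-up observation sequence; none of these values appear in the article.

```python
from math import prod


def update(priors, likelihoods):
    """One Bayes step over a finite set of models.

    priors[m] = P(M_m), likelihoods[m] = P(E | M_m); returns P(M_m | E).
    """
    unnormalised = [lik * p for lik, p in zip(likelihoods, priors)]
    total = sum(unnormalised)  # P(E) = sum_m P(E | M_m) P(M_m)
    return [u / total for u in unnormalised]


def lik(e, heads_prob):
    """P(e | model) for a single coin flip, e = 1 (heads) or 0 (tails)."""
    return heads_prob if e == 1 else 1.0 - heads_prob


# Hypothetical models and data (assumptions for illustration only).
models = [0.2, 0.5, 0.8]
priors = [1 / 3, 1 / 3, 1 / 3]
observations = [1, 1, 0, 1]

# Sequential updating: the posterior of one step is the prior of the next.
sequential = priors
for e in observations:
    sequential = update(sequential, [lik(e, h) for h in models])

# Batch updating: use the product likelihood P(E | M) = prod_k P(e_k | M).
batch = update(priors, [prod(lik(e, h) for e in observations) for h in models])

print(sequential)
print(batch)
```

Both lists agree up to floating-point rounding, which is what the induction argument predicts for independent observations.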
Parametric formulation: motivating the formal description
By parameterizing the space of models, the belief in all models may be updated in a single step. The distribution of belief over the model space may then be thought of as a distribution of belief over the parameter space. The distributions in this section are expressed as continuous, represented by probability densities, as this is the usual situation. The technique is, however, equally applicable to discrete distributions.
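Let the vector $\boldsymbol{\theta}$ span the parameter space. Let the initial prior distribution over $\boldsymbol{\theta}$ be $p(\boldsymbol{\theta} \mid \boldsymbol{\alpha})$, where $\boldsymbol{\alpha}$ is a set of parameters to the prior itself, or hyperparameters. Let $\mathbf{E} = (e_1, \dots, e_n)$ be a sequence of independent and identically distributed event observations, where all $e_i$ are distributed as $p(e \mid \boldsymbol{\theta})$ for some $\boldsymbol{\theta}$. Bayes' theorem is applied to find the posterior distribution over $\boldsymbol{\theta}$:
$$p(\boldsymbol{\theta} \mid \mathbf{E}, \boldsymbol{\alpha}) = \frac{p(\mathbf{E} \mid \boldsymbol{\theta}, \boldsymbol{\alpha})}{p(\mathbf{E} \mid \boldsymbol{\alpha})} \cdot p(\boldsymbol{\theta} \mid \boldsymbol{\alpha}) = \frac{p(\mathbf{E} \mid \boldsymbol{\theta}, \boldsymbol{\alpha})}{\int p(\mathbf{E} \mid \boldsymbol{\theta}, \boldsymbol{\alpha})\,p(\boldsymbol{\theta} \mid \boldsymbol{\alpha})\,d\boldsymbol{\theta}} \cdot p(\boldsymbol{\theta} \mid \boldsymbol{\alpha}),$$
where
$$p(\mathbf{E} \mid \boldsymbol{\theta}, \boldsymbol{\alpha}) = \prod_k p(e_k \mid \boldsymbol{\theta}).$$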
Formal description of Bayesian inference
$x$, a data point in general. This may in fact be a vector of values.
$\theta$, the parameter of the data point's distribution, i.e., $x \sim p(x \mid \theta)$. This may be a vector of parameters.
$\alpha$, the hyperparameter of the parameter distribution, i.e., $\theta \sim p(\theta \mid \alpha)$. This may be a vector of hyperparameters.
$\mathbf{X}$ is the sample, a set of $n$ observed data points, i.e., $x_1, \ldots, x_n$.
$\tilde{x}$, a new data point whose distribution is to be predicted.
The prior distribution is the distribution of the parameter(s) before any data is observed, i.e. $p(\theta \mid \alpha)$. The prior distribution might not be easily determined; in such a case, one possibility may be to use the Jeffreys prior to obtain a prior distribution before updating it with newer observations.
The sampling distribution is the distribution of the observed data conditional on its parameters, i.e. $p(\mathbf{X} \mid \theta)$. This is also termed the likelihood, especially when viewed as a function of the parameter(s), sometimes written $\operatorname{L}(\theta \mid \mathbf{X}) = p(\mathbf{X} \mid \theta)$.
The marginal likelihood (sometimes also termed the evidence) is the distribution of the observed data marginalized over the parameter(s), i.e.
$$p(\mathbf{X} \mid \alpha) = \int p(\mathbf{X} \mid \theta)\,p(\theta \mid \alpha)\,d\theta.$$
It quantifies the agreement between data and expert opinion, in a geometric sense that can be made precise.[7] If the marginal likelihood is 0 then there is no agreement between the data and expert opinion and Bayes' rule cannot be applied.
The posterior distribution is the distribution of the parameter(s) after taking into account the observed data. This is determined by Bayes' rule, which forms the heart of Bayesian inference:
$$p(\theta \mid \mathbf{X}, \alpha) = \frac{p(\theta, \mathbf{X}, \alpha)}{p(\mathbf{X}, \alpha)} = \frac{p(\mathbf{X} \mid \theta, \alpha)\,p(\theta, \alpha)}{p(\mathbf{X} \mid \alpha)\,p(\alpha)} = \frac{p(\mathbf{X} \mid \theta, \alpha)\,p(\theta \mid \alpha)}{p(\mathbf{X} \mid \alpha)} \propto p(\mathbf{X} \mid \theta, \alpha)\,p(\theta \mid \alpha).$$
This is expressed in words as "posterior is proportional to likelihood times prior", or sometimes as "posterior = likelihood times prior, over evidence".
In practice, for almost all complex Bayesian models used in machine learning, the posterior distribution $p(\theta \mid \mathbf{X}, \alpha)$ is not obtained in closed form, mainly because the parameter space for $\theta$ can be very high-dimensional, or the Bayesian model retains certain hierarchical structure formulated from the observations $\mathbf{X}$ and parameter $\theta$. In such situations, we need to resort to approximation techniques.
[
8
]
General case: Let $P_Y^x$ be the conditional distribution of $Y$ given $X = x$ and let $P_X$ be the distribution of $X$. The joint distribution is then $P_{X,Y}(dx, dy) = P_Y^x(dy)\,P_X(dx)$. The conditional distribution $P_X^y$ of $X$ given $Y = y$ is then determined by
$$P_X^y(A) = E(1_A(X) \mid Y = y).$$
Existence and uniqueness of the needed conditional expectation is a consequence of the Radon–Nikodym theorem. This was formulated by
Kolmogorov
in his famous book from 1933. Kolmogorov underlines the importance of conditional probability by writing "I wish to call attention to ... and especially the theory of conditional probabilities and conditional expectations ..." in the Preface.
[
9
]
The Bayes theorem determines the posterior distribution from the prior distribution. Uniqueness requires continuity assumptions.
[
10
]
Bayes' theorem can be generalized to include improper prior distributions such as the uniform distribution on the real line.
[
11
]
Modern
Markov chain Monte Carlo
methods have boosted the importance of Bayes' theorem including cases with improper priors.
[
12
]
Bayesian prediction
Bayesian theory calls for the use of the posterior predictive distribution to do
predictive inference
, i.e., to
predict
the distribution of a new, unobserved data point. That is, instead of a fixed point as a prediction, a distribution over possible points is returned. Only this way is the entire posterior distribution of the parameter(s) used. By comparison, prediction in
frequentist statistics
often involves finding an optimum point estimate of the parameter(s), e.g., by maximum likelihood or maximum a posteriori estimation (MAP), and then plugging this estimate into the formula for the distribution of a data point. This has the disadvantage that it does not account for any uncertainty in the value of the parameter, and hence will underestimate the
variance
of the predictive distribution.
In some instances, frequentist statistics can work around this problem. For example,
confidence intervals
and
prediction intervals
in frequentist statistics when constructed from a
normal distribution
with unknown
mean
and
variance
are constructed using a
Student's t-distribution
. This correctly estimates the variance, due to the facts that (1) the average of normally distributed random variables is also normally distributed, and (2) the predictive distribution of a normally distributed data point with unknown mean and variance, using conjugate or uninformative priors, has a Student's t-distribution. In Bayesian statistics, however, the posterior predictive distribution can always be determined exactly, or at least to an arbitrary level of precision when numerical methods are used.
Both types of predictive distributions have the form of a
compound probability distribution
(as does the
marginal likelihood
). In fact, if the prior distribution is a
conjugate prior
, such that the prior and posterior distributions come from the same family, it can be seen that both prior and posterior predictive distributions also come from the same family of compound distributions. The only difference is that the posterior predictive distribution uses the updated values of the hyperparameters (applying the Bayesian update rules given in the
conjugate prior
article), while the prior predictive distribution uses the values of the hyperparameters that appear in the prior distribution.
Mathematical properties
Interpretation of factor
$\frac{P(E \mid M)}{P(E)} > 1 \Rightarrow P(E \mid M) > P(E)$. That is, if the model were true, the evidence would be more likely than is predicted by the current state of belief. The reverse applies for a decrease in belief. If the belief does not change, $\frac{P(E \mid M)}{P(E)} = 1 \Rightarrow P(E \mid M) = P(E)$. That is, the evidence is independent of the model. If the model were true, the evidence would be exactly as likely as predicted by the current state of belief.
If $P(M) = 0$ then $P(M \mid E) = 0$. If $P(M) = 1$ and $P(E) > 0$, then $P(M \mid E) = 1$. This can be interpreted to mean that hard convictions are insensitive to counter-evidence.
The former follows directly from Bayes' theorem. The latter can be derived by applying the first rule to the event "not $M$" in place of "$M$", yielding "if $1 - P(M) = 0$, then $1 - P(M \mid E) = 0$", from which the result immediately follows.
Asymptotic behaviour of posterior
Consider the behaviour of a belief distribution as it is updated a large number of times with
independent and identically distributed
trials. For sufficiently nice prior probabilities, the
Bernstein-von Mises theorem
gives that in the limit of infinite trials, the posterior converges to a
Gaussian distribution
independent of the initial prior under some conditions first outlined and rigorously proven by Joseph L. Doob in 1948, namely if the random variable in consideration has a finite probability space. More general results were obtained later by the statistician David A. Freedman, who established in two seminal research papers in 1963[13] and 1965[14] when and under what circumstances the asymptotic behaviour of the posterior is guaranteed. His 1963 paper treats, like Doob (1949), the finite case and comes to a satisfactory conclusion. However, if the random variable has an infinite but countable probability space (i.e., corresponding to a die with infinitely many faces), the 1965 paper demonstrates that for a dense subset of priors the Bernstein-von Mises theorem is not applicable. In this case there is almost surely no asymptotic convergence. Later in the 1980s and 1990s
Freedman
and
Persi Diaconis
continued to work on the case of infinite countable probability spaces.
[
15
]
To summarise, there may be insufficient trials to suppress the effects of the initial choice, and especially for large (but finite) systems the convergence might be very slow.
In parameterized form, the prior distribution is often assumed to come from a family of distributions called
conjugate priors
. The usefulness of a conjugate prior is that the corresponding posterior distribution will be in the same family, and the calculation may be expressed in
closed form
.
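A standard instance of conjugacy, used here purely as an illustration, is the Beta prior for a Bernoulli success probability: the posterior is again a Beta distribution whose hyperparameters are the prior ones plus the observed counts. The sketch below assumes a Beta(1, 1) prior and made-up counts, and cross-checks the closed-form posterior mean against a direct numerical application of Bayes' theorem on a grid.

```python
def beta_update(a: float, b: float, successes: int, failures: int):
    """Closed-form conjugate update: Beta(a, b) prior -> Beta posterior."""
    return a + successes, b + failures


def beta_mean(a: float, b: float) -> float:
    return a / (a + b)


# Assumed prior and data: uniform Beta(1, 1), then 7 successes and 3 failures.
a_post, b_post = beta_update(1, 1, successes=7, failures=3)

# Updating in two batches gives the same hyperparameters.
a_two, b_two = beta_update(*beta_update(1, 1, 4, 1), 3, 2)

# Numerical check: posterior ∝ likelihood × prior on a midpoint grid over (0, 1).
n = 10_000
grid = [(i + 0.5) / n for i in range(n)]
weights = [t**7 * (1 - t) ** 3 * 1.0 for t in grid]  # prior density is 1
numeric_mean = sum(t * w for t, w in zip(grid, weights)) / sum(weights)

print((a_post, b_post), beta_mean(a_post, b_post), numeric_mean)
```

The closed form avoids the numerical integration entirely, which is the practical appeal of conjugate priors.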
Estimates of parameters and predictions
It is often desired to use a posterior distribution to estimate a parameter or variable. Several methods of Bayesian estimation select
measurements of central tendency
from the posterior distribution.
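For one-dimensional problems, a unique median exists for practical continuous problems. The posterior median is attractive as a robust estimator.[16]
If there exists a finite mean for the posterior distribution, then the posterior mean is a method of estimation.[17] With parameter $\theta$, observed sample $\mathbf{X}$ and hyperparameter $\alpha$,
$$\tilde{\theta} = \operatorname{E}[\theta] = \int \theta\,p(\theta \mid \mathbf{X}, \alpha)\,d\theta.$$
Taking a value with the greatest probability defines maximum a posteriori (MAP) estimates:[18]
$$\{\theta_{\text{MAP}}\} \subset \arg\max_{\theta} p(\theta \mid \mathbf{X}, \alpha).$$
There are examples where no maximum is attained, in which case the set of MAP estimates is empty.
There are other methods of estimation that minimize the posterior risk (expected-posterior loss) with respect to a loss function, and these are of interest to statistical decision theory using the sampling distribution ("frequentist statistics").[19]
The posterior predictive distribution of a new observation $\tilde{x}$ (that is independent of previous observations) is determined by[20]
$$p(\tilde{x} \mid \mathbf{X}, \alpha) = \int p(\tilde{x}, \theta \mid \mathbf{X}, \alpha)\,d\theta = \int p(\tilde{x} \mid \theta)\,p(\theta \mid \mathbf{X}, \alpha)\,d\theta.$$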
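The estimators described above can be compared on a concrete posterior. The following sketch assumes, for illustration only, a Beta(8, 4) posterior over a success probability, discretised on a grid; it computes the posterior mean, median and MAP estimate, and the posterior predictive probability that the next observation is a success.

```python
# Assumed posterior: Beta(8, 4), i.e. density proportional to t^7 (1 - t)^3.
N = 2000
WIDTH = 1.0 / N
grid = [(i + 0.5) * WIDTH for i in range(N)]
unnormalised = [t**7 * (1 - t) ** 3 for t in grid]
total = sum(unnormalised) * WIDTH
density = [u / total for u in unnormalised]  # posterior density on the grid

# Posterior mean: integral of theta times the posterior density.
post_mean = sum(t * f for t, f in zip(grid, density)) * WIDTH

# Posterior median: first grid point where the cumulative probability reaches 1/2.
cumulative = 0.0
post_median = grid[-1]
for t, f in zip(grid, density):
    cumulative += f * WIDTH
    if cumulative >= 0.5:
        post_median = t
        break

# MAP estimate: grid point with the highest posterior density.
post_map = max(zip(density, grid))[1]

# Posterior predictive P(next = success) = integral of P(success | t) p(t | data) dt.
pred_success = sum(t * f for t, f in zip(grid, density)) * WIDTH

print(post_mean, post_median, post_map, pred_success)
```

For this skewed posterior the three point estimates differ slightly, and the predictive probability of a success coincides with the posterior mean because the Bernoulli likelihood is linear in the parameter.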
Probability of a hypothesis
Contingency table
| Cookie \ Bowl | #1 ($H_1$) | #2 ($H_2$) | Total |
|---|---|---|---|
| Plain, $E$ | 30 | 20 | 50 |
| Choc, $\neg E$ | 10 | 20 | 30 |
| Total | 40 | 40 | 80 |

$P(H_1 \mid E) = 30 / 50 = 0.6$
Suppose there are two full bowls of cookies. Bowl #1 has 10 chocolate chip and 30 plain cookies, while bowl #2 has 20 of each. Our friend Fred picks a bowl at random, and then picks a cookie at random. We may assume there is no reason to believe Fred treats one bowl differently from another, likewise for the cookies. The cookie turns out to be a plain one. How probable is it that Fred picked it out of bowl #1?
Intuitively, it seems clear that the answer should be more than a half, since there are more plain cookies in bowl #1. The precise answer is given by Bayes' theorem. Let $H_1$ correspond to bowl #1, and $H_2$ to bowl #2.
It is given that the bowls are identical from Fred's point of view, thus $P(H_1) = P(H_2)$, and the two must add up to 1, so both are equal to 0.5.
The event $E$ is the observation of a plain cookie. From the contents of the bowls, we know that $P(E \mid H_1) = 30/40 = 0.75$ and $P(E \mid H_2) = 20/40 = 0.5$. Bayes' formula then yields
$$P(H_1 \mid E) = \frac{P(E \mid H_1)\,P(H_1)}{P(E \mid H_1)\,P(H_1) + P(E \mid H_2)\,P(H_2)} = \frac{0.75 \times 0.5}{0.75 \times 0.5 + 0.5 \times 0.5} = 0.6.$$
Before we observed the cookie, the probability we assigned for Fred having chosen bowl #1 was the prior probability, $P(H_1)$, which was 0.5. After observing the cookie, we must revise the probability to $P(H_1 \mid E)$, which is 0.6.
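The cookie calculation can be reproduced in a few lines. This sketch uses exact fractions and the bowl contents given above; the dictionary layout and function name are illustrative choices, not part of the original example.

```python
from fractions import Fraction

# Bowl contents from the example above.
counts = {
    "bowl1": {"plain": 30, "choc": 10},
    "bowl2": {"plain": 20, "choc": 20},
}
# Fred picks a bowl at random, so both priors are 1/2.
prior = {bowl: Fraction(1, 2) for bowl in counts}


def posterior(cookie: str) -> dict:
    """P(bowl | cookie) for each bowl, by Bayes' theorem."""
    lik = {b: Fraction(c[cookie], sum(c.values())) for b, c in counts.items()}
    evidence = sum(lik[b] * prior[b] for b in counts)  # P(cookie)
    return {b: lik[b] * prior[b] / evidence for b in counts}


post = posterior("plain")
print(post["bowl1"], float(post["bowl1"]))  # 3/5 0.6
```

Calling `posterior("choc")` instead gives the corresponding update for a chocolate chip cookie.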
Making a prediction
Example results for archaeology example. This simulation was generated using c=15.2.
An archaeologist is working at a site thought to be from the medieval period, between the 11th and the 16th century. However, it is uncertain exactly when in this period the site was inhabited. Fragments of pottery are found, some of which are glazed and some of which are decorated. It is expected that if the site were inhabited during the early medieval period, then 1% of the pottery would be glazed and 50% of its area decorated, whereas if it had been inhabited in the late medieval period then 81% would be glazed and 5% of its area decorated. How confident can the archaeologist be in the date of inhabitation as fragments are unearthed?
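The degree of belief in the continuous variable $C$ (century) is to be calculated, with the discrete set of events $\{GD, G\bar{D}, \bar{G}D, \bar{G}\bar{D}\}$ as evidence. Assuming linear variation of glaze and decoration with time, and that these variables are independent,
$$P(E = GD \mid C = c) = \left(0.01 + \frac{0.81 - 0.01}{16 - 11}(c - 11)\right)\left(0.5 - \frac{0.5 - 0.05}{16 - 11}(c - 11)\right)$$
$$P(E = G\bar{D} \mid C = c) = \left(0.01 + \frac{0.81 - 0.01}{16 - 11}(c - 11)\right)\left(0.5 + \frac{0.5 - 0.05}{16 - 11}(c - 11)\right)$$
$$P(E = \bar{G}D \mid C = c) = \left((1 - 0.01) - \frac{0.81 - 0.01}{16 - 11}(c - 11)\right)\left(0.5 - \frac{0.5 - 0.05}{16 - 11}(c - 11)\right)$$
$$P(E = \bar{G}\bar{D} \mid C = c) = \left((1 - 0.01) - \frac{0.81 - 0.01}{16 - 11}(c - 11)\right)\left(0.5 + \frac{0.5 - 0.05}{16 - 11}(c - 11)\right)$$
Assume a uniform prior of $f_C(c) = 0.2$, and that trials are independent and identically distributed. When a new fragment of type $e$ is discovered, Bayes' theorem is applied to update the degree of belief for each $c$:
$$f_C(c \mid E = e) = \frac{P(E = e \mid C = c)}{P(E = e)}\,f_C(c) = \frac{P(E = e \mid C = c)}{\int_{11}^{16} P(E = e \mid C = c)\,f_C(c)\,dc}\,f_C(c).$$
A computer simulation of the changing belief as 50 fragments are unearthed is shown on the graph. In the simulation, the site was inhabited around 1420, or $c = 15.2$. By calculating the area under the relevant portion of the graph for 50 trials, the archaeologist can say that there is practically no chance the site was inhabited in the 11th and 12th centuries, about 1% chance that it was inhabited during the 13th century, 63% chance during the 14th century and 36% during the 15th century. The Bernstein-von Mises theorem asserts here the asymptotic convergence to the "true" distribution because the probability space corresponding to the discrete set of events $\{GD, G\bar{D}, \bar{G}D, \bar{G}\bar{D}\}$ is finite (see above section on asymptotic behaviour of the posterior).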
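A simulation in the spirit of the one described above can be sketched with a grid approximation of the belief over the century. The grid size, random seed and helper names below are assumptions, and the glaze and decoration probabilities are the linear interpolations implied by the 1%/81% and 50%/5% figures in the text. Because the fragments are drawn at random, the printed probabilities will not match the article's graph exactly.

```python
import random

C_MIN, C_MAX = 11.0, 16.0  # centuries spanned by the uniform prior


def p_glazed(c: float) -> float:
    """Linear interpolation from 1% glazed at c = 11 to 81% at c = 16."""
    return 0.01 + (0.81 - 0.01) / (C_MAX - C_MIN) * (c - C_MIN)


def p_decorated(c: float) -> float:
    """Linear interpolation from 50% decorated at c = 11 to 5% at c = 16."""
    return 0.5 - (0.5 - 0.05) / (C_MAX - C_MIN) * (c - C_MIN)


def likelihood(glazed: bool, decorated: bool, c: float) -> float:
    """P(fragment type | century c), with glaze and decoration independent."""
    pg = p_glazed(c) if glazed else 1.0 - p_glazed(c)
    pd = p_decorated(c) if decorated else 1.0 - p_decorated(c)
    return pg * pd


N = 500  # number of grid cells (assumed resolution)
WIDTH = (C_MAX - C_MIN) / N
grid = [C_MIN + (i + 0.5) * WIDTH for i in range(N)]
density = [1.0 / (C_MAX - C_MIN)] * N  # uniform prior, 0.2 everywhere


def update(current, glazed: bool, decorated: bool):
    """One Bayes step on the grid: multiply by the likelihood and renormalise."""
    unnormalised = [likelihood(glazed, decorated, c) * f for c, f in zip(grid, current)]
    evidence = sum(unnormalised) * WIDTH  # approximates the integral in the denominator
    return [u / evidence for u in unnormalised]


rng = random.Random(0)  # fixed seed so the run is repeatable
TRUE_C = 15.2
for _ in range(50):
    glazed = rng.random() < p_glazed(TRUE_C)
    decorated = rng.random() < p_decorated(TRUE_C)
    density = update(density, glazed, decorated)


def prob_between(lo: float, hi: float) -> float:
    """Posterior probability that lo <= c < hi."""
    return sum(f for c, f in zip(grid, density) if lo <= c < hi) * WIDTH


for start in range(11, 16):
    print(f"c in [{start}, {start + 1}): {prob_between(start, start + 1):.3f}")
```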
In frequentist statistics and decision theory
A
decision-theoretic
justification of the use of Bayesian inference was given by
Abraham Wald
, who proved that every unique Bayesian procedure is
admissible
. Conversely, every
admissible
statistical procedure is either a Bayesian procedure or a limit of Bayesian procedures.
[
21
]
Wald characterized admissible procedures as Bayesian procedures (and limits of Bayesian procedures), making the Bayesian formalism a central technique in such areas of
frequentist inference
as
parameter estimation
,
hypothesis testing
, and computing
confidence intervals
.
[
22
]
[
23
]
[
24
]
For example:
"Under some conditions, all admissible procedures are either Bayes procedures or limits of Bayes procedures (in various senses). These remarkable results, at least in their original form, are due essentially to Wald. They are useful because the property of being Bayes is easier to analyze than admissibility."
[
21
]
"In decision theory, a quite general method for proving admissibility consists in exhibiting a procedure as a unique Bayes solution."
[
25
]
"In the first chapters of this work, prior distributions with finite support and the corresponding Bayes procedures were used to establish some of the main theorems relating to the comparison of experiments. Bayes procedures with respect to more general prior distributions have played a very important role in the development of statistics, including its asymptotic theory." "There are many problems where a glance at posterior distributions, for suitable priors, yields immediately interesting information. Also, this technique can hardly be avoided in sequential analysis."
[
26
]
"A useful fact is that any Bayes decision rule obtained by taking a proper prior over the whole parameter space must be admissible"
[
27
]
"An important area of investigation in the development of admissibility ideas has been that of conventional sampling-theory procedures, and many interesting results have been obtained."
[
28
]
Bayesian methodology also plays a role in
model selection
where the aim is to select one model from a set of competing models that represents most closely the underlying process that generated the observed data. In Bayesian model comparison, the model with the highest
posterior probability
given the data is selected. The posterior probability of a model depends on the evidence, or
marginal likelihood
, which reflects the probability that the data is generated by the model, and on the
prior belief
of the model. When two competing models are a priori considered to be equiprobable, the ratio of their posterior probabilities corresponds to the
Bayes factor
. Since Bayesian model comparison aims to select the model with the highest posterior probability, this methodology is also referred to as the maximum a posteriori (MAP) selection rule
[
29
]
or the MAP probability rule.
[
30
]
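To make the role of the marginal likelihood and the Bayes factor concrete, the sketch below compares two hypothetical models of a coin: one fixes the heads probability at 0.5, the other places a uniform prior on it. The models, the data (9 heads in 10 flips) and the equal prior model probabilities are assumptions for illustration.

```python
from math import comb


def marginal_fair(n: int, k: int) -> float:
    """Marginal likelihood of a specific sequence with k heads under theta = 0.5."""
    return 0.5**n


def marginal_uniform(n: int, k: int) -> float:
    """Marginal likelihood under a uniform prior on theta.

    The integral of theta^k (1 - theta)^(n - k) over [0, 1] equals
    1 / ((n + 1) * C(n, k)).
    """
    return 1.0 / ((n + 1) * comb(n, k))


n, k = 10, 9  # assumed data: 9 heads in 10 flips
m_fair = marginal_fair(n, k)
m_uniform = marginal_uniform(n, k)

# Bayes factor in favour of the uniform-prior model.
bayes_factor = m_uniform / m_fair

# With equal prior model probabilities, posterior odds equal the Bayes factor.
prior_fair = prior_uniform = 0.5
post_uniform = m_uniform * prior_uniform / (m_uniform * prior_uniform + m_fair * prior_fair)

print(round(bayes_factor, 3), round(post_uniform, 3))  # 9.309 0.903
```

Under the MAP selection rule the uniform-prior model would be chosen here, since its posterior probability exceeds one half.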
Probabilistic programming
While conceptually simple, Bayesian methods can be mathematically and numerically challenging. Probabilistic programming languages (PPLs) implement functions to easily build Bayesian models together with efficient automatic inference methods. This helps separate the model building from the inference, allowing practitioners to focus on their specific problems and leaving PPLs to handle the computational details for them.
[
31
]
[
32
]
[
33
]
Statistical data analysis
See the separate Wikipedia entry on
Bayesian statistics
, specifically the
statistical modeling
section in that page.
Computer applications
Bayesian inference has applications in
artificial intelligence
and
expert systems
. Bayesian inference techniques have been a fundamental part of computerized
pattern recognition
techniques since the late 1950s.
[
34
]
There is also an ever-growing connection between Bayesian methods and simulation-based
Monte Carlo
techniques since complex models cannot be processed in closed form by a Bayesian analysis, while a
graphical model
structure
may
allow for efficient simulation algorithms such as Gibbs sampling and other Metropolis–Hastings schemes.
[
35
]
Recently
Bayesian inference has gained popularity among the
phylogenetics
community for these reasons; a number of applications allow many demographic and evolutionary parameters to be estimated simultaneously.
As applied to
statistical classification
, Bayesian inference has been used to develop algorithms for identifying
e-mail spam
. Applications which make use of Bayesian inference for spam filtering include
CRM114
,
DSPAM
,
Bogofilter
,
SpamAssassin
,
SpamBayes
,
Mozilla
, XEAMS, and others. Spam classification is treated in more detail in the article on the
naïve Bayes classifier
.
Solomonoff's Inductive inference
is the theory of prediction based on observations; for example, predicting the next symbol based upon a given series of symbols. The only assumption is that the environment follows some unknown but computable
probability distribution
. It is a formal inductive framework that combines two well-studied principles of inductive inference: Bayesian statistics and
Occam's Razor
.
[
36
]
Solomonoff's universal prior probability of any prefix
p
of a computable sequence
x
is the sum of the probabilities of all programs (for a universal computer) that compute something starting with
p
. Given some
p
and any computable but unknown probability distribution from which
x
is sampled, the universal prior and Bayes' theorem can be used to predict the yet unseen parts of
x
in optimal fashion.
[
37
]
[
38
]
Bioinformatics and healthcare applications
Bayesian inference has been applied in different
bioinformatics
applications, including differential gene expression analysis.
[
39
]
Bayesian inference is also used in a general cancer risk model, called
CIRI
(Continuous Individualized Risk Index), where serial measurements are incorporated to update a Bayesian model which is primarily built from prior knowledge.
[
40
]
[
41
]
Cosmology and astrophysical applications
The Bayesian approach has been central to recent progress in cosmology and astrophysical applications,
[
42
]
[
43
]
and extends to a wide range of astrophysical problems, including the characterisation of exoplanets (such as fitting the atmosphere of K2-18b
[
44
]
), parameter constraints with cosmological data,
[
45
]
and calibration in astrophysical experiments.
[
46
]
In cosmology, it is often employed with computational techniques such as
Markov chain Monte Carlo
(MCMC) and
nested sampling
to analyse complex datasets and navigate high-dimensional parameter space. A notable application is to the Planck 2018 CMB data for parameter inference.
[
45
]
The six base cosmological parameters in the Lambda-CDM model
are not predicted by a theory, but rather fitted from Cosmic microwave background (CMB) data to a chosen model of cosmology (the Lambda-CDM model).
[
47
]
The Bayesian cosmology code `cobaya`[48] sets up cosmological runs and interfaces cosmological likelihoods and a Boltzmann code,[49][50] which computes the predicted CMB anisotropies for any given set of cosmological parameters, with an MCMC or nested sampler.
This computational framework is not limited to the standard model; it is also essential for testing alternative or extended theories of cosmology, such as theories with early dark energy,
[
51
]
or modified gravity theories introducing additional parameters beyond Lambda-CDM.
Bayesian model comparison
can then be employed to calculate the evidence for competing models, providing a statistical basis to assess whether the data support them over the standard Lambda-CDM.
[
52
]
Bayesian inference can be used by jurors to coherently accumulate the evidence for and against a defendant, and to see whether, in totality, it meets their personal threshold for "
beyond a reasonable doubt
".
[
53
]
[
54
]
[
55
]
Bayes' theorem is applied successively to all evidence presented, with the posterior from one stage becoming the prior for the next. The benefit of a Bayesian approach is that it gives the juror an unbiased, rational mechanism for combining evidence. It may be appropriate to explain Bayes' theorem to jurors in
odds form
, as
betting odds
are more widely understood than probabilities. Alternatively, a
logarithmic approach
, replacing multiplication with addition, might be easier for a jury to handle.
Adding up evidence
If the existence of the crime is not in doubt, only the identity of the culprit, it has been suggested that the prior should be uniform over the qualifying population.
[
56
]
For example, if 1,000 people could have committed the crime, the prior probability of guilt would be 1/1000.
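The successive updating described above is easy to follow in odds form, where each piece of evidence multiplies the odds by a likelihood ratio, or in logarithmic form, where the contributions simply add. The sketch below starts from the 1/1000 prior mentioned in the text; the three likelihood ratios are hypothetical values chosen for illustration, not figures from any real case.

```python
import math

# Uniform prior over 1,000 possible culprits, as in the text.
prior_prob = 1 / 1000
prior_odds = prior_prob / (1 - prior_prob)  # 1 : 999

# Hypothetical likelihood ratios P(evidence | guilty) / P(evidence | innocent),
# one per item of evidence; a value below 1 favours innocence.
likelihood_ratios = [50.0, 20.0, 0.5]

# Odds form: the posterior odds of one stage are the prior odds of the next.
odds = prior_odds
for ratio in likelihood_ratios:
    odds *= ratio
posterior_prob = odds / (1 + odds)

# Logarithmic form: multiplication becomes addition of log-odds contributions.
log_odds = math.log10(prior_odds) + sum(math.log10(r) for r in likelihood_ratios)

print(round(posterior_prob, 4), round(log_odds, 3))
```

With these assumed ratios the posterior probability of guilt is about 0.33; whether that meets a juror's threshold is outside the scope of the arithmetic.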
The use of Bayes' theorem by jurors is controversial. In the United Kingdom, a defence
expert witness
explained Bayes' theorem to the jury in
R v Adams
. The jury convicted, but the case went to appeal on the basis that no means of accumulating evidence had been provided for jurors who did not wish to use Bayes' theorem. The Court of Appeal upheld the conviction, but it also gave the opinion that "To introduce Bayes' Theorem, or any similar method, into a criminal trial plunges the jury into inappropriate and unnecessary realms of theory and complexity, deflecting them from their proper task."
Gardner-Medwin
[
57
]
argues that the criterion on which a verdict in a criminal trial should be based is
not
the probability of guilt, but rather the
probability of the evidence, given that the defendant is innocent
(akin to a
frequentist
p-value
). He argues that if the posterior probability of guilt is to be computed by Bayes' theorem, the prior probability of guilt must be known. This will depend on the incidence of the crime, which is an unusual piece of evidence to consider in a criminal trial. Consider the following three propositions:
A: the known facts and testimony could have arisen if the defendant is guilty.
B: the known facts and testimony could have arisen if the defendant is innocent.
C: the defendant is guilty.
Gardner-Medwin argues that the jury should believe both
A
and not-
B
in order to convict.
A
and not-
B
implies the truth of
C
, but the reverse is not true. It is possible that
B
and
C
are both true, but in this case he argues that a jury should acquit, even though they know that they will be letting some guilty people go free. See also
Lindley's paradox
.
Bayesian epistemology
Bayesian epistemology
is a movement that advocates for Bayesian inference as a means of justifying the rules of inductive logic.
Karl Popper
and
David Miller
have rejected the idea of Bayesian rationalism, i.e. using Bayes rule to make epistemological inferences:
[
58
]
It is prone to the same
vicious circle
as any other
justificationist
epistemology, because it presupposes what it attempts to justify. According to this view, a rational interpretation of Bayesian inference would see it merely as a probabilistic version of
falsification
, rejecting the belief, commonly held by Bayesians, that high likelihood achieved by a series of Bayesian updates would prove the hypothesis beyond any reasonable doubt, or even with likelihood greater than 0.
The scientific method is sometimes interpreted as an application of Bayesian inference. In this view, Bayes' rule guides (or should guide) the updating of probabilities about hypotheses conditional on new observations or experiments.[59]
Bayesian inference has also been applied to treat stochastic scheduling problems with incomplete information by Cai et al. (2009).[60]
Bayesian search theory is used to search for lost objects.
Bayesian inference in phylogeny
Bayesian tool for methylation analysis
Bayesian approaches to brain function investigate the brain as a Bayesian mechanism.
Bayesian inference in ecological studies[61][62]
Bayesian inference is used to estimate parameters in stochastic chemical kinetic models[63]
Bayesian inference in econophysics for currency or prediction of trend changes in financial quotations[64]
Bayesian inference in marketing
Bayesian inference in motor learning
Bayesian inference is used in probabilistic numerics to solve numerical problems
Bayes and Bayesian inference
The problem considered by Bayes in Proposition 9 of his essay, "An Essay Towards Solving a Problem in the Doctrine of Chances", is the posterior distribution for the parameter a (the success rate) of the binomial distribution.[citation needed]
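As an illustrative sketch of this problem, assume a uniform prior on the success rate a (the assumption behind early "inverse probability"). After s successes in n trials, the posterior is then a Beta(s + 1, n - s + 1) distribution with mean (s + 1)/(n + 2). The Python snippet below evaluates that posterior using only the standard library; the function names and the example counts are hypothetical, and the interval probability is approximated numerically rather than computed in closed form.

```python
from math import comb


def posterior_density(a, successes, trials):
    """Density of Beta(successes + 1, failures + 1): the posterior under a uniform prior."""
    failures = trials - successes
    # 1 / B(s + 1, f + 1) equals (n + 1) * C(n, s) for integer counts.
    norm = (trials + 1) * comb(trials, successes)
    return norm * a**successes * (1.0 - a) ** failures


def posterior_prob_between(lo, hi, successes, trials, steps=10_000):
    """Approximate P(lo < a < hi | data) with the midpoint rule."""
    width = (hi - lo) / steps
    total = sum(
        posterior_density(lo + (i + 0.5) * width, successes, trials)
        for i in range(steps)
    )
    return total * width


def posterior_mean(successes, trials):
    """Posterior mean of a under a uniform prior."""
    return (successes + 1) / (trials + 2)


if __name__ == "__main__":
    successes, trials = 7, 10  # hypothetical data
    print("posterior mean:", posterior_mean(successes, trials))
    print("P(0.5 < a < 0.9 | data):", posterior_prob_between(0.5, 0.9, successes, trials))
```

A different prior would change the posterior; the uniform choice here is only the simplest case.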
The term Bayesian refers to Thomas Bayes (1701–1761), who proved that probabilistic limits could be placed on an unknown event.[65] However, it was Pierre-Simon Laplace (1749–1827) who introduced (as Principle VI) what is now called Bayes' theorem and used it to address problems in celestial mechanics, medical statistics, reliability, and jurisprudence.[66] Early Bayesian inference, which used uniform priors following Laplace's principle of insufficient reason, was called "inverse probability" (because it infers backwards from observations to parameters, or from effects to causes[67]). After the 1920s, "inverse probability" was largely supplanted by a collection of methods that came to be called frequentist statistics.[67]
In the 20th century, the ideas of Laplace were further developed in two different directions, giving rise to objective and subjective currents in Bayesian practice. In the objective or "non-informative" current, the statistical analysis depends on only the model assumed, the data analyzed,[68] and the method assigning the prior, which differs from one objective Bayesian practitioner to another. In the subjective or "informative" current, the specification of the prior depends on the belief (that is, propositions on which the analysis is prepared to act), which can summarize information from experts, previous studies, etc.
In the 1980s, there was a dramatic growth in research and applications of Bayesian methods, mostly attributed to the discovery of Markov chain Monte Carlo methods, which removed many of the computational problems, and an increasing interest in nonstandard, complex applications.[69] Despite the growth of Bayesian research, most undergraduate teaching is still based on frequentist statistics.[70] Nonetheless, Bayesian methods are widely accepted and used, for example in the field of machine learning.[71]
Bayesian approaches to brain function
Credibility theory
Epistemology
Free energy principle
Inductive probability
Information field theory
Principle of maximum entropy
Probabilistic causation
Probabilistic programming
1. "Bayesian". Merriam-Webster.com Dictionary. Merriam-Webster. OCLC 1032680871.
2. Griffiths, Thomas (July 24, 2024). "Bayesian Models of Cognition".
3. Hacking, Ian (December 1967). "Slightly More Realistic Personal Probability". Philosophy of Science. 34 (4): 316. doi:10.1086/288169. S2CID 14344339.
4. "Bayes' Theorem (Stanford Encyclopedia of Philosophy)". Plato.stanford.edu. Retrieved 2014-01-05.
5. van Fraassen, B. (1989). Laws and Symmetry. Oxford University Press. ISBN 0-19-824860-1.
6. Gelman, Andrew; Carlin, John B.; Stern, Hal S.; Dunson, David B.; Vehtari, Aki; Rubin, Donald B. (2013). Bayesian Data Analysis, Third Edition. Chapman and Hall/CRC. ISBN 978-1-4398-4095-5.
7. de Carvalho, Miguel; Page, Garritt; Barney, Bradley (2019). "On the geometry of Bayesian inference" (PDF). Bayesian Analysis. 14 (4): 1013–1036. doi:10.1214/18-BA1112. S2CID 88521802.
8. Lee, Se Yoon (2021). "Gibbs sampler and coordinate ascent variational inference: A set-theoretical review". Communications in Statistics – Theory and Methods. 51 (6): 1549–1568. arXiv:2008.01006. doi:10.1080/03610926.2021.1921214. S2CID 220935477.
9. Kolmogorov, A.N. (1933) [1956]. Foundations of the Theory of Probability. Chelsea Publishing Company.
10. Tjur, Tue (1980). Probability based on Radon measures. Internet Archive. Chichester [Eng.]; New York: Wiley. ISBN 978-0-471-27824-5.
11. Taraldsen, Gunnar; Tufto, Jarle; Lindqvist, Bo H. (2021-07-24). "Improper priors and improper posteriors". Scandinavian Journal of Statistics. 49 (3): 969–991. doi:10.1111/sjos.12550. hdl:11250/2984409. ISSN 0303-6898. S2CID 237736986.
12. Robert, Christian P.; Casella, George (2004). Monte Carlo Statistical Methods. Springer. ISBN 978-1-4757-4145-2. OCLC 1159112760.
13. Freedman, DA (1963). "On the asymptotic behavior of Bayes' estimates in the discrete case". The Annals of Mathematical Statistics. 34 (4): 1386–1403. doi:10.1214/aoms/1177703871. JSTOR 2238346.
14. Freedman, DA (1965). "On the asymptotic behavior of Bayes estimates in the discrete case II". The Annals of Mathematical Statistics. 36 (2): 454–456. doi:10.1214/aoms/1177700155. JSTOR 2238150.
15. Robins, James; Wasserman, Larry (2000). "Conditioning, likelihood, and coherence: A review of some foundational concepts". Journal of the American Statistical Association. 95 (452): 1340–1346. doi:10.1080/01621459.2000.10474344. S2CID 120767108.
16. Sen, Pranab K.; Keating, J. P.; Mason, R. L. (1993). Pitman's measure of closeness: A comparison of statistical estimators. Philadelphia: SIAM.
17. Choudhuri, Nidhan; Ghosal, Subhashis; Roy, Anindya (2005-01-01). "Bayesian Methods for Function Estimation". Handbook of Statistics. Bayesian Thinking. Vol. 25. pp. 373–414. CiteSeerX 10.1.1.324.3052. doi:10.1016/s0169-7161(05)25013-7. ISBN 978-0-444-51539-1.
18. "Maximum A Posteriori (MAP) Estimation". www.probabilitycourse.com. Retrieved 2017-06-02.
19. Yu, Angela. "Introduction to Bayesian Decision Theory" (PDF). cogsci.ucsd.edu/. Archived from the original (PDF) on 2013-02-28.
20. Hitchcock, David. "Posterior Predictive Distribution Stat Slide" (PDF). stat.sc.edu.
21. Bickel & Doksum (2001, p. 32)
22. Kiefer, J.; Schwartz R. (1965). "Admissible Bayes Character of T²-, R²-, and Other Fully Invariant Tests for Multivariate Normal Problems". Annals of Mathematical Statistics. 36 (3): 747–770. doi:10.1214/aoms/1177700051.
23. Schwartz, R. (1969). "Invariant Proper Bayes Tests for Exponential Families". Annals of Mathematical Statistics. 40: 270–283. doi:10.1214/aoms/1177697822.
24. Hwang, J. T. & Casella, George (1982). "Minimax Confidence Sets for the Mean of a Multivariate Normal Distribution" (PDF). Annals of Statistics. 10 (3): 868–881. doi:10.1214/aos/1176345877.
25. Lehmann, Erich (1986). Testing Statistical Hypotheses (Second ed.). (See p. 309 of Chapter 6.7 "Admissibility", and pp. 17–18 of Chapter 1.8 "Complete Classes".)
26. Le Cam, Lucien (1986). Asymptotic Methods in Statistical Decision Theory. Springer-Verlag. ISBN 978-0-387-96307-5. (From "Chapter 12 Posterior Distributions and Bayes Solutions", p. 324)
27. Cox, D. R.; Hinkley, D.V. (1974). Theoretical Statistics. Chapman and Hall. p. 432. ISBN 978-0-04-121537-3.
28. Cox, D. R.; Hinkley, D.V. (1974). Theoretical Statistics. Chapman and Hall. p. 433. ISBN 978-0-04-121537-3.
29. Stoica, P.; Selen, Y. (2004). "A review of information criterion rules". IEEE Signal Processing Magazine. 21 (4): 36–47. doi:10.1109/MSP.2004.1311138. S2CID 17338979.
30. Fatermans, J.; Van Aert, S.; den Dekker, A.J. (2019). "The maximum a posteriori probability rule for atom column detection from HAADF STEM images". Ultramicroscopy. 201: 81–91. arXiv:1902.05809. doi:10.1016/j.ultramic.2019.02.003. PMID 30991277. S2CID 104419861.
31. Bessiere, P., Mazer, E., Ahuactzin, J. M., & Mekhnacha, K. (2013). Bayesian Programming (1 edition) Chapman and Hall/CRC.
32. Daniel Roy (2015). "Probabilistic Programming". probabilistic-programming.org. Archived from the original on 2016-01-10. Retrieved 2020-01-02.
33. Ghahramani, Z (2015). "Probabilistic machine learning and artificial intelligence". Nature. 521 (7553): 452–459. Bibcode:2015Natur.521..452G. doi:10.1038/nature14541. PMID 26017444. S2CID 216356.
34. Fienberg, Stephen E. (2006-03-01). "When did Bayesian inference become "Bayesian"?". Bayesian Analysis. 1 (1). doi:10.1214/06-BA101.
35. Jim Albert (2009). Bayesian Computation with R, Second edition. New York, Dordrecht, etc.: Springer. ISBN 978-0-387-92297-3.
36. Rathmanner, Samuel; Hutter, Marcus; Ormerod, Thomas C (2011). "A Philosophical Treatise of Universal Induction". Entropy. 13 (6): 1076–1136. arXiv:1105.5721. Bibcode:2011Entrp..13.1076R. doi:10.3390/e13061076. S2CID 2499910.
37. Hutter, Marcus; He, Yang-Hui; Ormerod, Thomas C (2007). "On Universal Prediction and Bayesian Confirmation". Theoretical Computer Science. 384 (2007): 33–48. arXiv:0709.1516. Bibcode:2007arXiv0709.1516H. doi:10.1016/j.tcs.2007.05.016. S2CID 1500830.
38. Gács, Peter; Vitányi, Paul M. B. (2 December 2010). "Raymond J. Solomonoff 1926-2009". CiteSeerX 10.1.1.186.8268.
39. Robinson, Mark D & McCarthy, Davis J & Smyth, Gordon K. "edgeR: a Bioconductor package for differential expression analysis of digital gene expression data". Bioinformatics.
40. "CIRI". ciri.stanford.edu. Retrieved 2019-08-11.
41. Kurtz, David M.; Esfahani, Mohammad S.; Scherer, Florian; Soo, Joanne; Jin, Michael C.; Liu, Chih Long; Newman, Aaron M.; Dührsen, Ulrich; Hüttmann, Andreas (2019-07-25). "Dynamic Risk Profiling Using Serial Tumor Biomarkers for Personalized Outcome Prediction". Cell. 178 (3): 699–713.e19. doi:10.1016/j.cell.2019.06.011. ISSN 1097-4172. PMC 7380118. PMID 31280963.
42. Trotta, Roberto (2017). "Bayesian Methods in Cosmology". arXiv:1701.01467 [astro-ph.CO].
43. Staicova, Denitsa (2025). "Modern Bayesian Sampling Methods for Cosmological Inference: A Comparative Study". Universe. 11 (2): 68. arXiv:2501.06022. Bibcode:2025Univ...11...68S. doi:10.3390/universe11020068.
44. Madhusudhan, Nikku; Constantinou, Savvas; Holmberg, Måns; Sarkar, Subhajit; Piette, Anjali A. A.; Moses, Julianne I. (2025). "New Constraints on DMS and DMDS in the Atmosphere of K2-18 b from JWST MIRI". The Astrophysical Journal. 983 (2): L40. arXiv:2504.12267. Bibcode:2025ApJ...983L..40M. doi:10.3847/2041-8213/adc1c8.
45. Aghanim, N.; et al. (2020). "Planck 2018 results". Astronomy & Astrophysics. 641: A6. arXiv:1807.06209. Bibcode:2020A&A...641A...6P. doi:10.1051/0004-6361/201833910.
46. Anstey, Dominic; De Lera Acedo, Eloy; Handley, Will (2021). "A general Bayesian framework for foreground modelling and chromaticity correction for global 21 cm experiments". Monthly Notices of the Royal Astronomical Society. 506 (2): 2041–2058. arXiv:2010.09644. doi:10.1093/mnras/stab1765.
47. Lewis, Antony; Bridle, Sarah (2002). "Cosmological parameters from CMB and other data: A Monte Carlo approach". Physical Review D. 66 (10) 103511. arXiv:astro-ph/0205436. Bibcode:2002PhRvD..66j3511L. doi:10.1103/PhysRevD.66.103511.
48. "Cobaya, a code for Bayesian analysis in Cosmology – cobaya 3.5.7 documentation". cobaya.readthedocs.io. Retrieved 2025-07-23.
49. "CAMB – Code for Anisotropies in the Microwave Background (CAMB) 1.6.1 documentation". camb.readthedocs.io. Retrieved 2025-07-23.
50. Lesgourgues, Julien (2011). "The Cosmic Linear Anisotropy Solving System (CLASS) I: Overview". arXiv:1104.2932 [astro-ph.IM].
51. Hill, J. Colin; McDonough, Evan; Toomey, Michael W.; Alexander, Stephon (2020). "Early dark energy does not restore cosmological concordance". Physical Review D. 102 (4) 043507. arXiv:2003.07355. Bibcode:2020PhRvD.102d3507H. doi:10.1103/PhysRevD.102.043507.
52. Trotta, Roberto (2008). "Bayes in the sky: Bayesian inference and model selection in cosmology". Contemporary Physics. 49 (2): 71–104. arXiv:0803.4089. Bibcode:2008ConPh..49...71T. doi:10.1080/00107510802066753.
53. Dawid, A. P. and Mortera, J. (1996) "Coherent Analysis of Forensic Identification Evidence". Journal of the Royal Statistical Society, Series B, 58, 425–443.
54. Foreman, L. A.; Smith, A. F. M., and Evett, I. W. (1997). "Bayesian analysis of deoxyribonucleic acid profiling data in forensic identification applications (with discussion)". Journal of the Royal Statistical Society, Series A, 160, 429–469.
55. Robertson, B. and Vignaux, G. A. (1995) Interpreting Evidence: Evaluating Forensic Science in the Courtroom. John Wiley and Sons. Chichester. ISBN 978-0-471-96026-3.
56. Dawid, A. P. (2001) Bayes' Theorem and Weighing Evidence by Juries. Archived 2015-07-01 at the Wayback Machine
57. Gardner-Medwin, A. (2005) "What Probability Should the Jury Address?". Significance, 2 (1), March 2005.
58. Miller, David (1994). Critical Rationalism. Chicago: Open Court. ISBN 978-0-8126-9197-9.
59. Howson & Urbach (2005), Jaynes (2003)
60. Cai, X.Q.; Wu, X.Y.; Zhou, X. (2009). "Stochastic scheduling subject to breakdown-repeat breakdowns with incomplete information". Operations Research. 57 (5): 1236–1249. doi:10.1287/opre.1080.0660.
61. Ogle, Kiona; Tucker, Colin; Cable, Jessica M. (2014-01-01). "Beyond simple linear mixing models: process-based isotope partitioning of ecological processes". Ecological Applications. 24 (1): 181–195. Bibcode:2014EcoAp..24..181O. doi:10.1890/1051-0761-24.1.181. ISSN 1939-5582. PMID 24640543.
62. Evaristo, Jaivime; McDonnell, Jeffrey J.; Scholl, Martha A.; Bruijnzeel, L. Adrian; Chun, Kwok P. (2016-01-01). "Insights into plant water uptake from xylem-water isotope measurements in two tropical catchments with contrasting moisture conditions". Hydrological Processes. 30 (18): 3210–3227. Bibcode:2016HyPr...30.3210E. doi:10.1002/hyp.10841. ISSN 1099-1085. S2CID 131588159.
63. Gupta, Ankur; Rawlings, James B. (April 2014). "Comparison of Parameter Estimation Methods in Stochastic Chemical Kinetic Models: Examples in Systems Biology". AIChE Journal. 60 (4): 1253–1268. Bibcode:2014AIChE..60.1253G. doi:10.1002/aic.14409. ISSN 0001-1541. PMC 4946376. PMID 27429455.
64. Schütz, N.; Holschneider, M. (2011). "Detection of trend changes in time series using Bayesian inference". Physical Review E. 84 (2) 021120. arXiv:1104.3448. Bibcode:2011PhRvE..84b1120S. doi:10.1103/PhysRevE.84.021120. PMID 21928962. S2CID 11460968.
65. Stigler, Stephen (1982). "Thomas Bayes's Bayesian Inference". Journal of the Royal Statistical Society. 145 (2): 250–58. doi:10.2307/2981538. JSTOR 2981538.
66. Stigler, Stephen M. (1986). "Chapter 3". The History of Statistics. Harvard University Press. ISBN 978-0-674-40340-6.
67. Fienberg, Stephen E. (2006). "When did Bayesian Inference Become 'Bayesian'?". Bayesian Analysis. 1 (1): 1–40 [p. 5]. doi:10.1214/06-ba101.
68. Bernardo, José-Miguel (2005). "Reference analysis". Handbook of statistics. Vol. 25. pp. 17–90.
69. Wolpert, R. L. (2004). "A Conversation with James O. Berger". Statistical Science. 19 (1): 205–218. CiteSeerX 10.1.1.71.6112. doi:10.1214/088342304000000053. MR 2082155. S2CID 120094454.
70. Bernardo, José M. (2006). "A Bayesian mathematical statistics primer" (PDF). Icots-7.
71. Bishop, C. M. (2007). Pattern Recognition and Machine Learning. New York: Springer. ISBN 978-0-387-31073-2.
Aster, Richard; Borchers, Brian, and Thurber, Clifford (2012). Parameter Estimation and Inverse Problems, Second Edition, Elsevier. ISBN 0123850487, ISBN 978-0123850485
Bickel, Peter J. & Doksum, Kjell A. (2001). Mathematical Statistics, Volume 1: Basic and Selected Topics (Second (updated printing 2007) ed.). Pearson Prentice–Hall. ISBN 978-0-13-850363-5.
Box, G. E. P. and Tiao, G. C. (1973). Bayesian Inference in Statistical Analysis, Wiley, ISBN 0-471-57428-7
Edwards, Ward (1968). "Conservatism in Human Information Processing". In Kleinmuntz, B. (ed.). Formal Representation of Human Judgment. Wiley.
Edwards, Ward (1982). Daniel Kahneman; Paul Slovic; Amos Tversky (eds.). "Judgment under uncertainty: Heuristics and biases". Science. 185 (4157): 1124–1131. Bibcode:1974Sci...185.1124T. doi:10.1126/science.185.4157.1124. PMID 17835457. S2CID 143452957. Chapter: Conservatism in Human Information Processing (excerpted)
Jaynes E. T. (2003) Probability Theory: The Logic of Science, CUP. ISBN 978-0-521-59271-0 (Link to Fragmentary Edition of March 1996).
Howson, C. & Urbach, P. (2005). Scientific Reasoning: the Bayesian Approach (3rd ed.). Open Court Publishing Company. ISBN 978-0-8126-9578-6.
Phillips, L. D.; Edwards, Ward (October 2008). "Chapter 6: Conservatism in a Simple Probability Inference Task (Journal of Experimental Psychology (1966) 72: 346-354)". In Jie W. Weiss; David J. Weiss (eds.). A Science of Decision Making: The Legacy of Ward Edwards. Oxford University Press. p. 536. ISBN 978-0-19-532298-9.
For a full report on the history of Bayesian statistics and the debates with frequentist approaches, read Vallverdu, Jordi (2016). Bayesians Versus Frequentists: A Philosophical Debate on Statistical Reasoning. New York: Springer. ISBN 978-3-662-48638-2.
Clayton, Aubrey (August 2021). Bernoulli's Fallacy: Statistical Illogic and the Crisis of Modern Science. Columbia University Press. ISBN 978-0-231-55335-3.
The following books are listed in ascending order of probabilistic sophistication:
Stone, JV (2013), "Bayes' Rule: A Tutorial Introduction to Bayesian Analysis", Sebtel Press, England.
Dennis V. Lindley (2013). Understanding Uncertainty, Revised Edition (2nd ed.). John Wiley. ISBN 978-1-118-65012-7.
Colin Howson & Peter Urbach (2005). Scientific Reasoning: The Bayesian Approach (3rd ed.). Open Court Publishing Company. ISBN 978-0-8126-9578-6.
Berry, Donald A. (1996). Statistics: A Bayesian Perspective. Duxbury. ISBN 978-0-534-23476-8.
Morris H. DeGroot & Mark J. Schervish (2002). Probability and Statistics (third ed.). Addison-Wesley. ISBN 978-0-201-52488-8.
Bolstad, William M. (2007) Introduction to Bayesian Statistics: Second Edition, John Wiley ISBN 0-471-27020-2
Winkler, Robert L (2003). Introduction to Bayesian Inference and Decision (2nd ed.). Probabilistic. ISBN 978-0-9647938-4-2. Updated classic textbook. Bayesian theory clearly presented.
Lee, Peter M. Bayesian Statistics: An Introduction. Fourth Edition (2012), John Wiley ISBN 978-1-1183-3257-3
Carlin, Bradley P. & Louis, Thomas A. (2008). Bayesian Methods for Data Analysis, Third Edition. Boca Raton, FL: Chapman and Hall/CRC. ISBN 978-1-58488-697-6.
Gelman, Andrew; Carlin, John B.; Stern, Hal S.; Dunson, David B.; Vehtari, Aki; Rubin, Donald B. (2013). Bayesian Data Analysis, Third Edition. Chapman and Hall/CRC. ISBN 978-1-4398-4095-5.
Intermediate or advanced
Berger, James O (1985). Statistical Decision Theory and Bayesian Analysis. Springer Series in Statistics (Second ed.). Springer-Verlag. Bibcode:1985sdtb.book.....B. ISBN 978-0-387-96098-2.
Bernardo, José M.; Smith, Adrian F. M. (1994). Bayesian Theory. Wiley.
DeGroot, Morris H., Optimal Statistical Decisions. Wiley Classics Library. 2004. (Originally published (1970) by McGraw-Hill.) ISBN 0-471-68029-X.
Schervish, Mark J. (1995). Theory of statistics. Springer-Verlag. ISBN 978-0-387-94546-0.
Jaynes, E. T. (1998). Probability Theory: The Logic of Science.
O'Hagan, A. and Forster, J. (2003). Kendall's Advanced Theory of Statistics, Volume 2B: Bayesian Inference. Arnold, New York. ISBN 0-340-52922-9.
Robert, Christian P (2007). The Bayesian Choice: From Decision-Theoretic Foundations to Computational Implementation (paperback ed.). Springer. ISBN 978-0-387-71598-8.
Pearl, Judea. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, San Mateo, CA: Morgan Kaufmann.
Pierre Bessière et al. (2013). "Bayesian Programming". CRC Press. ISBN 9781439880326
Francisco J. Samaniego (2010). "A Comparison of the Bayesian and Frequentist Approaches to Estimation". Springer. New York, ISBN 978-1-4419-5940-9
"Bayesian approach to statistical problems", Encyclopedia of Mathematics, EMS Press, 2001 [1994]
Bayesian Statistics from Scholarpedia.
Introduction to Bayesian probability from Queen Mary University of London
Mathematical Notes on Bayesian Statistics and Markov Chain Monte Carlo
Bayesian reading list, archived 2011-06-25 at the Wayback Machine, categorized and annotated by Tom Griffiths
A. Hajek and S. Hartmann: Bayesian Epistemology, in: J. Dancy et al. (eds.), A Companion to Epistemology. Oxford: Blackwell 2010, 93–106.
S. Hartmann and J. Sprenger: Bayesian Epistemology, in: S. Bernecker and D. Pritchard (eds.), Routledge Companion to Epistemology. London: Routledge 2010, 609–620.
Stanford Encyclopedia of Philosophy: "Inductive Logic"
Bayesian Confirmation Theory (PDF)
What is Bayesian Learning?
Data, Uncertainty and Inference – Informal introduction with many examples, ebook (PDF) freely available at causaScientia |
| Markdown |
## Contents
- [(Top)](https://en.wikipedia.org/wiki/Bayesian_inference)
- [1 Introduction to Bayes' rule](https://en.wikipedia.org/wiki/Bayesian_inference#Introduction_to_Bayes'_rule)
- [1\.1 Formal explanation](https://en.wikipedia.org/wiki/Bayesian_inference#Formal_explanation)
- [1\.2 Alternatives to Bayesian updating](https://en.wikipedia.org/wiki/Bayesian_inference#Alternatives_to_Bayesian_updating)
- [2 Inference over exclusive and exhaustive possibilities](https://en.wikipedia.org/wiki/Bayesian_inference#Inference_over_exclusive_and_exhaustive_possibilities)
- [2\.1 General formulation](https://en.wikipedia.org/wiki/Bayesian_inference#General_formulation)
- [2\.2 Multiple observations](https://en.wikipedia.org/wiki/Bayesian_inference#Multiple_observations)
- [2\.3 Parametric formulation: motivating the formal description](https://en.wikipedia.org/wiki/Bayesian_inference#Parametric_formulation:_motivating_the_formal_description)
- [3 Formal description of Bayesian inference](https://en.wikipedia.org/wiki/Bayesian_inference#Formal_description_of_Bayesian_inference)
- [3\.1 Definitions](https://en.wikipedia.org/wiki/Bayesian_inference#Definitions)
- [3\.2 Bayesian inference](https://en.wikipedia.org/wiki/Bayesian_inference#Bayesian_inference)
- [3\.3 Bayesian prediction](https://en.wikipedia.org/wiki/Bayesian_inference#Bayesian_prediction)
- [4 Mathematical properties](https://en.wikipedia.org/wiki/Bayesian_inference#Mathematical_properties)
- [4\.1 Interpretation of factor](https://en.wikipedia.org/wiki/Bayesian_inference#Interpretation_of_factor)
- [4\.2 Cromwell's rule](https://en.wikipedia.org/wiki/Bayesian_inference#Cromwell's_rule)
- [4\.3 Asymptotic behaviour of posterior](https://en.wikipedia.org/wiki/Bayesian_inference#Asymptotic_behaviour_of_posterior)
- [4\.4 Conjugate priors](https://en.wikipedia.org/wiki/Bayesian_inference#Conjugate_priors)
- [4\.5 Estimates of parameters and predictions](https://en.wikipedia.org/wiki/Bayesian_inference#Estimates_of_parameters_and_predictions)
- [5 Examples](https://en.wikipedia.org/wiki/Bayesian_inference#Examples)
- [5\.1 Probability of a hypothesis](https://en.wikipedia.org/wiki/Bayesian_inference#Probability_of_a_hypothesis)
- [5\.2 Making a prediction](https://en.wikipedia.org/wiki/Bayesian_inference#Making_a_prediction)
- [6 In frequentist statistics and decision theory](https://en.wikipedia.org/wiki/Bayesian_inference#In_frequentist_statistics_and_decision_theory)
- [6\.1 Model selection](https://en.wikipedia.org/wiki/Bayesian_inference#Model_selection)
- [7 Probabilistic programming](https://en.wikipedia.org/wiki/Bayesian_inference#Probabilistic_programming)
- [8 Applications](https://en.wikipedia.org/wiki/Bayesian_inference#Applications)
- [8\.1 Statistical data analysis](https://en.wikipedia.org/wiki/Bayesian_inference#Statistical_data_analysis)
- [8\.2 Computer applications](https://en.wikipedia.org/wiki/Bayesian_inference#Computer_applications)
- [8\.3 Bioinformatics and healthcare applications](https://en.wikipedia.org/wiki/Bayesian_inference#Bioinformatics_and_healthcare_applications)
- [8\.4 Cosmology and astrophysical applications](https://en.wikipedia.org/wiki/Bayesian_inference#Cosmology_and_astrophysical_applications)
- [8\.5 In the courtroom](https://en.wikipedia.org/wiki/Bayesian_inference#In_the_courtroom)
- [8\.6 Bayesian epistemology](https://en.wikipedia.org/wiki/Bayesian_inference#Bayesian_epistemology)
- [8\.7 Other](https://en.wikipedia.org/wiki/Bayesian_inference#Other)
- [9 Bayes and Bayesian inference](https://en.wikipedia.org/wiki/Bayesian_inference#Bayes_and_Bayesian_inference)
- [10 History](https://en.wikipedia.org/wiki/Bayesian_inference#History)
- [11 See also](https://en.wikipedia.org/wiki/Bayesian_inference#See_also)
- [12 References](https://en.wikipedia.org/wiki/Bayesian_inference#References)
- [12\.1 Citations](https://en.wikipedia.org/wiki/Bayesian_inference#Citations)
- [12\.2 Sources](https://en.wikipedia.org/wiki/Bayesian_inference#Sources)
- [13 Further reading](https://en.wikipedia.org/wiki/Bayesian_inference#Further_reading)
- [13\.1 Elementary](https://en.wikipedia.org/wiki/Bayesian_inference#Elementary)
- [13\.2 Intermediate or advanced](https://en.wikipedia.org/wiki/Bayesian_inference#Intermediate_or_advanced)
- [14 External links](https://en.wikipedia.org/wiki/Bayesian_inference#External_links)
# Bayesian inference
- [Related changes](https://en.wikipedia.org/wiki/Special:RecentChangesLinked/Bayesian_inference "Recent changes in pages linked from this page [k]")
- [Upload file](https://en.wikipedia.org/wiki/Wikipedia:File_Upload_Wizard "Upload files [u]")
- [Permanent link](https://en.wikipedia.org/w/index.php?title=Bayesian_inference&oldid=1346386418 "Permanent link to this revision of this page")
- [Page information](https://en.wikipedia.org/w/index.php?title=Bayesian_inference&action=info "More information about this page")
- [Cite this page](https://en.wikipedia.org/w/index.php?title=Special:CiteThisPage&page=Bayesian_inference&id=1346386418&wpFormIdentifier=titleform "Information on how to cite this page")
- [Get shortened URL](https://en.wikipedia.org/w/index.php?title=Special:UrlShortener&url=https%3A%2F%2Fen.wikipedia.org%2Fwiki%2FBayesian_inference)
Print/export
- [Download as PDF](https://en.wikipedia.org/w/index.php?title=Special:DownloadAsPdf&page=Bayesian_inference&action=show-download-screen "Download this page as a PDF file")
- [Printable version](https://en.wikipedia.org/w/index.php?title=Bayesian_inference&printable=yes "Printable version of this page [p]")
In other projects
- [Wikimedia Commons](https://commons.wikimedia.org/wiki/Category:Bayesian_inference)
- [Wikidata item](https://www.wikidata.org/wiki/Special:EntityPage/Q812535 "Structured data on this page hosted by Wikidata [g]")
Appearance
move to sidebar
hide
From Wikipedia, the free encyclopedia
Method of statistical inference
**Bayesian inference** ([/ˈbeɪziən/](https://en.wikipedia.org/wiki/Help:IPA/English "Help:IPA/English") [*BAY-zee-ən*](https://en.wikipedia.org/wiki/Help:Pronunciation_respelling_key "Help:Pronunciation respelling key") or [/ˈbeɪʒən/](https://en.wikipedia.org/wiki/Help:IPA/English "Help:IPA/English") [*BAY-zhən*](https://en.wikipedia.org/wiki/Help:Pronunciation_respelling_key "Help:Pronunciation respelling key"))[\[1\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-1) is a method of [statistical inference](https://en.wikipedia.org/wiki/Statistical_inference "Statistical inference") in which [Bayes' theorem](https://en.wikipedia.org/wiki/Bayes%27_theorem "Bayes' theorem") is used to calculate a probability of a hypothesis, given prior [evidence](https://en.wikipedia.org/wiki/Evidence "Evidence"), and update it as more [information](https://en.wikipedia.org/wiki/Information "Information") becomes available. Fundamentally, Bayesian inference uses a [prior distribution](https://en.wikipedia.org/wiki/Prior_probability "Prior probability") to estimate [posterior probabilities](https://en.wikipedia.org/wiki/Posterior_probability "Posterior probability"). Bayesian inference is an important technique in [statistics](https://en.wikipedia.org/wiki/Statistics "Statistics"), and especially in [mathematical statistics](https://en.wikipedia.org/wiki/Mathematical_statistics "Mathematical statistics"). Bayesian updating is particularly important in the [dynamic analysis of a sequence of data](https://en.wikipedia.org/wiki/Sequential_analysis "Sequential analysis").
Bayesian inference has found application in a wide range of activities, including [science](https://en.wikipedia.org/wiki/Science "Science"), [engineering](https://en.wikipedia.org/wiki/Engineering "Engineering"), [philosophy](https://en.wikipedia.org/wiki/Philosophy "Philosophy"), [medicine](https://en.wikipedia.org/wiki/Medicine "Medicine"), [sport](https://en.wikipedia.org/wiki/Sport "Sport"), [psychology](https://en.wikipedia.org/wiki/Psychology "Psychology")[\[2\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-2), and [law](https://en.wikipedia.org/wiki/Law "Law"). In the philosophy of [decision theory](https://en.wikipedia.org/wiki/Decision_theory "Decision theory"), Bayesian inference is closely related to subjective probability, often called "[Bayesian probability](https://en.wikipedia.org/wiki/Bayesian_probability "Bayesian probability")".
## Introduction to Bayes' rule
[Figure](https://en.wikipedia.org/wiki/File:Bayes_theorem_visualisation.svg): A geometric visualisation of Bayes' theorem. In the table, the values 2, 3, 6 and 9 give the relative weights of each corresponding condition and case. The figures denote the cells of the table involved in each metric, the probability being the fraction of each figure that is shaded. This shows that $P(A\mid B)P(B)=P(B\mid A)P(A)$, i.e. $P(A\mid B)={\frac {P(B\mid A)P(A)}{P(B)}}$. Similar reasoning can be used to show that $P(\neg A\mid B)={\frac {P(B\mid \neg A)P(\neg A)}{P(B)}}$, etc.
Main article: [Bayes' theorem](https://en.wikipedia.org/wiki/Bayes%27_theorem "Bayes' theorem")
See also: [Bayesian probability](https://en.wikipedia.org/wiki/Bayesian_probability "Bayesian probability")
### Formal explanation
| Hypothesis / Evidence | Satisfies hypothesis $H$ | Violates hypothesis $\neg H$ |
|---|---|---|
Bayesian inference derives the [posterior probability](https://en.wikipedia.org/wiki/Posterior_probability "Posterior probability") as a [consequence](https://en.wikipedia.org/wiki/Consequence_relation "Consequence relation") of two [antecedents](https://en.wikipedia.org/wiki/Antecedent_\(logic\) "Antecedent (logic)"): a [prior probability](https://en.wikipedia.org/wiki/Prior_probability "Prior probability") and a "[likelihood function](https://en.wikipedia.org/wiki/Likelihood_function "Likelihood function")" derived from a [statistical model](https://en.wikipedia.org/wiki/Statistical_model "Statistical model") for the observed data. Bayesian inference computes the posterior probability according to [Bayes' theorem](https://en.wikipedia.org/wiki/Bayes%27_theorem "Bayes' theorem"):
$$P(H\mid E)={\frac {P(E\mid H)\cdot P(H)}{P(E)}},$$
where
- $H$ stands for any *hypothesis* whose probability may be affected by [data](https://en.wikipedia.org/wiki/Experimental_data "Experimental data") (called *evidence* below). Often there are competing hypotheses, and the task is to determine which is the most probable.
- $P(H)$, the *[prior probability](https://en.wikipedia.org/wiki/Prior_probability "Prior probability")*, is the estimate of the probability of the hypothesis $H$ *before* the data $E$, the current evidence, is observed.
- $E$, the *evidence*, corresponds to new data that were not used in computing the prior probability.
- $P(H\mid E)$, the *[posterior probability](https://en.wikipedia.org/wiki/Posterior_probability "Posterior probability")*, is the probability of $H$ *given* $E$, i.e., *after* $E$ is observed. This is what we want to know: the probability of a hypothesis *given* the observed evidence.
- $P(E\mid H)$ is the probability of observing $E$ *given* $H$ and is called the *[likelihood](https://en.wikipedia.org/wiki/Likelihood_function "Likelihood function")*. As a function of $E$ with $H$ fixed, it indicates the compatibility of the evidence with the given hypothesis. The likelihood function is a function of the evidence, $E$, while the posterior probability is a function of the hypothesis, $H$.
- $P(E)$ is sometimes termed the [marginal likelihood](https://en.wikipedia.org/wiki/Marginal_likelihood "Marginal likelihood") or "model evidence". This factor is the same for all possible hypotheses being considered (as is evident from the fact that the hypothesis $H$ does not appear anywhere in the symbol, unlike for all the other factors) and hence does not factor into determining the relative probabilities of different hypotheses.
- $P(E)>0$ (otherwise one has $0/0$).
For different values of $H$, only the factors $P(H)$ and $P(E\mid H)$, both in the numerator, affect the value of $P(H\mid E)$: the posterior probability of a hypothesis is proportional to its prior probability (its inherent likeliness) and the newly acquired likelihood (its compatibility with the new observed evidence).
In cases where $\neg H$ ("not $H$"), the [logical negation](https://en.wikipedia.org/wiki/Logical_negation "Logical negation") of $H$, has a well-defined likelihood $P(E\mid \neg H)$, Bayes' rule can be rewritten as follows:

$${\begin{aligned}P(H\mid E)&={\frac {P(E\mid H)P(H)}{P(E)}}\\&={\frac {P(E\mid H)P(H)}{P(E\mid H)P(H)+P(E\mid \neg H)P(\neg H)}}\\&={\frac {1}{1+\left({\frac {1}{P(H)}}-1\right){\frac {P(E\mid \neg H)}{P(E\mid H)}}}}\end{aligned}}$$

because

$$P(E)=P(E\mid H)P(H)+P(E\mid \neg H)P(\neg H)$$

and

$$P(H)+P(\neg H)=1.$$
This focuses attention on the term

$$\left({\tfrac {1}{P(H)}}-1\right){\tfrac {P(E\mid \neg H)}{P(E\mid H)}}.$$

If that term is approximately 1, then the probability of the hypothesis given the evidence, $P(H\mid E)$, is about ${\tfrac {1}{2}}$: the hypothesis is about as likely to be true as false. If that term is very small, close to zero, then $P(H\mid E)$ is close to 1 and the hypothesis, given the evidence, is quite likely. If that term is very large, much larger than 1, then the hypothesis, given the evidence, is quite unlikely. If the hypothesis (without consideration of evidence) is unlikely, then $P(H)$ is small (but not necessarily astronomically small), ${\tfrac {1}{P(H)}}$ is much larger than 1, and the term can be approximated as ${\tfrac {P(E\mid \neg H)}{P(E\mid H)\cdot P(H)}}$, so the relevant probabilities can be compared directly to each other.
One quick and easy way to remember the equation is to use the [rule of multiplication](https://en.wikipedia.org/wiki/Conditional_probability#As_an_axiom_of_probability "Conditional probability"):

$$P(E\cap H)=P(E\mid H)P(H)=P(H\mid E)P(E).$$
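As an illustration of the formulas above, the following minimal Python sketch computes $P(H\mid E)$ for a binary hypothesis in two ways: directly from Bayes' theorem with $P(E)$ expanded by the law of total probability, and through the rewritten form $1/(1+\text{term})$. The function names and the numerical values are hypothetical, chosen only for illustration.

```python
def posterior(prior_h: float, lik_h: float, lik_not_h: float) -> float:
    """P(H | E) from Bayes' theorem, with P(E) expanded by total probability."""
    # P(E) = P(E|H) P(H) + P(E|not H) P(not H)
    evidence = lik_h * prior_h + lik_not_h * (1.0 - prior_h)
    if evidence <= 0.0:
        raise ValueError("P(E) must be positive, otherwise the ratio is 0/0")
    return lik_h * prior_h / evidence


def posterior_via_term(prior_h: float, lik_h: float, lik_not_h: float) -> float:
    """The same quantity written as 1 / (1 + term)."""
    # term = (1/P(H) - 1) * P(E|not H) / P(E|H)
    term = (1.0 / prior_h - 1.0) * (lik_not_h / lik_h)
    return 1.0 / (1.0 + term)


# Illustrative (made-up) numbers: an unlikely hypothesis, fairly specific evidence.
p_h, p_e_given_h, p_e_given_not_h = 0.01, 0.95, 0.05

direct = posterior(p_h, p_e_given_h, p_e_given_not_h)
via_term = posterior_via_term(p_h, p_e_given_h, p_e_given_not_h)
print(direct, via_term)
```

Both expressions agree, and because the term is noticeably larger than 1 for these numbers, the posterior stays well below one half even though the evidence is much more probable under $H$ than under $\neg H$.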
### Alternatives to Bayesian updating
Bayesian updating is widely used and computationally convenient. However, it is not the only updating rule that might be considered rational.
[Ian Hacking](https://en.wikipedia.org/wiki/Ian_Hacking "Ian Hacking") noted that traditional "[Dutch book](https://en.wikipedia.org/wiki/Dutch_book "Dutch book")" arguments did not specify Bayesian updating: they left open the possibility that non-Bayesian updating rules could avoid Dutch books. Hacking wrote:[\[3\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-3) "And neither the Dutch book argument nor any other in the personalist arsenal of proofs of the probability axioms entails the dynamic assumption. Not one entails Bayesianism. So the personalist requires the dynamic assumption to be Bayesian. It is true that in consistency a personalist could abandon the Bayesian model of learning from experience. Salt could lose its savour."
Indeed, there are non-Bayesian updating rules that also avoid Dutch books (as discussed in the literature on "[probability kinematics](https://en.wikipedia.org/wiki/Probability_kinematics "Probability kinematics")") following the publication of [Richard C. Jeffrey](https://en.wikipedia.org/wiki/Richard_C._Jeffrey "Richard C. Jeffrey")'s rule, which applies Bayes' rule to the case where the evidence itself is assigned a probability.[\[4\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-4) The additional hypotheses needed to uniquely require Bayesian updating have been deemed to be substantial, complicated, and unsatisfactory.[\[5\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-5)
## Inference over exclusive and exhaustive possibilities
If evidence is simultaneously used to update belief over a set of exclusive and exhaustive propositions, Bayesian inference may be thought of as acting on this belief distribution as a whole.
### General formulation
[Figure](https://en.wikipedia.org/wiki/File:Bayesian_inference_event_space.svg): Diagram illustrating event space $\Omega$ in general formulation of Bayesian inference. Although this diagram shows discrete models and events, the continuous case may be visualized similarly using probability densities.
Suppose a process is generating independent and identically distributed events $E_{n},\ n=1,2,3,\ldots$, but the [probability distribution](https://en.wikipedia.org/wiki/Probability_distribution "Probability distribution") is unknown. Let the event space $\Omega$ represent the current state of belief for this process. Each model is represented by event $M_{m}$. The conditional probabilities $P(E_{n}\mid M_{m})$ are specified to define the models. $P(M_{m})$ is the [degree of belief](https://en.wikipedia.org/wiki/Credence_\(statistics\) "Credence (statistics)") in $M_{m}$. Before the first inference step, $\{P(M_{m})\}$ is a set of *initial prior probabilities*. These must sum to 1, but are otherwise arbitrary.
Suppose that the process is observed to generate $E\in \{E_{n}\}$. For each $M\in \{M_{m}\}$, the prior $P(M)$ is updated to the posterior $P(M\mid E)$. From [Bayes' theorem](https://en.wikipedia.org/wiki/Bayes%27_theorem "Bayes' theorem"):[\[6\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-6)

$$P(M\mid E)={\frac {P(E\mid M)}{\sum _{m}{P(E\mid M_{m})P(M_{m})}}}\cdot P(M).$$
Upon observation of further evidence, this procedure may be repeated.
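A single inference step over a finite set of models can be sketched as follows. This is only an illustration under assumed inputs: the three "coin" models and their probabilities of heads are hypothetical, and `bayes_update` is a name introduced here, not part of any library.

```python
def bayes_update(priors, likelihoods):
    """One inference step over exclusive and exhaustive models.

    priors[m]      = P(M_m), must sum to 1
    likelihoods[m] = P(E | M_m) for the observed event E
    Returns the list of posteriors P(M_m | E).
    """
    joint = [lik * p for lik, p in zip(likelihoods, priors)]
    evidence = sum(joint)  # the denominator: sum_m P(E | M_m) P(M_m)
    if evidence == 0:
        raise ValueError("observed event has zero probability under every model")
    return [j / evidence for j in joint]


# Hypothetical models: three coins with different probabilities of heads.
p_heads = [0.25, 0.5, 0.75]
priors = [1 / 3, 1 / 3, 1 / 3]

# Observe one head, so P(E | M_m) is simply p_heads[m].
posteriors = bayes_update(priors, p_heads)
print(posteriors)
```

The returned posteriors can be passed back in as the priors for the next observation, which is the repetition described above.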
### Multiple observations
For a sequence of [independent and identically distributed](https://en.wikipedia.org/wiki/Independent_and_identically_distributed "Independent and identically distributed") observations $\mathbf {E} =(e_{1},\dots ,e_{n})$, it can be shown by induction that repeated application of the above is equivalent to

$$P(M\mid \mathbf {E} )={\frac {P(\mathbf {E} \mid M)}{\sum _{m}{P(\mathbf {E} \mid M_{m})P(M_{m})}}}\cdot P(M),$$

where

$$P(\mathbf {E} \mid M)=\prod _{k}{P(e_{k}\mid M)}.$$
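The equivalence between repeated single-step updating and one update with the product likelihood can be checked numerically. The sketch below assumes the same kind of hypothetical coin models as a stand-in for $\{M_{m}\}$; the data, priors and helper names are illustrative assumptions.

```python
import math


def bayes_update(priors, likelihoods):
    """Posterior P(M_m | E) from priors P(M_m) and likelihoods P(E | M_m)."""
    joint = [lik * p for lik, p in zip(likelihoods, priors)]
    evidence = sum(joint)
    return [j / evidence for j in joint]


def likelihood(e, p):
    """P(e | model) for a coin with heads probability p; e is 1 (heads) or 0 (tails)."""
    return p if e == 1 else 1.0 - p


p_heads = [0.25, 0.5, 0.75]   # hypothetical models
priors = [0.2, 0.5, 0.3]      # arbitrary initial priors summing to 1
observations = [1, 1, 0, 1]   # made-up i.i.d. data

# Sequential updating: the posterior after each observation becomes the next prior.
sequential = priors
for e in observations:
    sequential = bayes_update(sequential, [likelihood(e, p) for p in p_heads])

# Batch updating: a single step using P(E | M) = prod_k P(e_k | M).
batch_likelihoods = [
    math.prod(likelihood(e, p) for e in observations) for p in p_heads
]
batch = bayes_update(priors, batch_likelihoods)

print(sequential)
print(batch)
```

Up to floating-point rounding, the two lists coincide, and the order of the observations does not matter because the product is commutative.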
### Parametric formulation: motivating the formal description
By parameterizing the space of models, the belief in all models may be updated in a single step. The distribution of belief over the model space may then be thought of as a distribution of belief over the parameter space. The distributions in this section are expressed as continuous, represented by probability densities, as this is the usual situation. The technique is, however, equally applicable to discrete distributions.
Let the vector ${\boldsymbol {\theta }}$ span the parameter space. Let the initial prior distribution over ${\boldsymbol {\theta }}$ be $p({\boldsymbol {\theta }}\mid {\boldsymbol {\alpha }})$, where ${\boldsymbol {\alpha }}$ is a set of parameters to the prior itself, or *[hyperparameters](https://en.wikipedia.org/wiki/Hyperparameter_\(Bayesian_statistics\) "Hyperparameter (Bayesian statistics)")*. Let $\mathbf {E} =(e_{1},\dots ,e_{n})$ be a sequence of [independent and identically distributed](https://en.wikipedia.org/wiki/Independent_and_identically_distributed_random_variables "Independent and identically distributed random variables") event observations, where all $e_{i}$ are distributed as $p(e\mid {\boldsymbol {\theta }})$ for some ${\boldsymbol {\theta }}$. [Bayes' theorem](https://en.wikipedia.org/wiki/Bayes%27_theorem "Bayes' theorem") is applied to find the [posterior distribution](https://en.wikipedia.org/wiki/Posterior_distribution "Posterior distribution") over ${\boldsymbol {\theta }}$:

$${\begin{aligned}p({\boldsymbol {\theta }}\mid \mathbf {E} ,{\boldsymbol {\alpha }})&={\frac {p(\mathbf {E} \mid {\boldsymbol {\theta }},{\boldsymbol {\alpha }})}{p(\mathbf {E} \mid {\boldsymbol {\alpha }})}}\cdot p({\boldsymbol {\theta }}\mid {\boldsymbol {\alpha }})\\&={\frac {p(\mathbf {E} \mid {\boldsymbol {\theta }},{\boldsymbol {\alpha }})}{\int p(\mathbf {E} \mid {\boldsymbol {\theta }},{\boldsymbol {\alpha }})p({\boldsymbol {\theta }}\mid {\boldsymbol {\alpha }})\,d{\boldsymbol {\theta }}}}\cdot p({\boldsymbol {\theta }}\mid {\boldsymbol {\alpha }}),\end{aligned}}$$

where

$$p(\mathbf {E} \mid {\boldsymbol {\theta }},{\boldsymbol {\alpha }})=\prod _{k}p(e_{k}\mid {\boldsymbol {\theta }}).$$
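For a one-dimensional parameter the integral in the denominator can be approximated on a grid. The following sketch is one possible illustration, not a method prescribed by the text: it assumes a Bernoulli model with success probability $\theta$, a Beta-shaped prior with hypothetical hyperparameters `a` and `b`, made-up data, and a midpoint Riemann sum in place of the exact integral.

```python
def grid_posterior(data, a, b, n_grid=2000):
    """Approximate p(theta | E, alpha) on a grid for Bernoulli data.

    data : sequence of 0/1 observations (assumed i.i.d.)
    a, b : hyperparameters of a Beta-shaped prior, theta^(a-1) (1-theta)^(b-1)
    Returns (thetas, density, step).
    """
    step = 1.0 / n_grid
    # Midpoints of the grid cells avoid evaluating at theta = 0 and theta = 1.
    thetas = [(i + 0.5) * step for i in range(n_grid)]

    def prior(t):
        # Unnormalised prior; its constant cancels in the ratio below.
        return t ** (a - 1) * (1.0 - t) ** (b - 1)

    def likelihood(t):
        # p(E | theta) = prod_k p(e_k | theta)
        out = 1.0
        for e in data:
            out *= t if e == 1 else 1.0 - t
        return out

    unnormalised = [likelihood(t) * prior(t) for t in thetas]
    # Midpoint-rule approximation of the integral in the denominator
    # (up to the prior's normalising constant).
    norm = sum(unnormalised) * step
    density = [u / norm for u in unnormalised]
    return thetas, density, step


data = [1, 0, 1, 1, 0, 1, 1, 1]  # made-up sample: 6 successes in 8 trials
thetas, density, step = grid_posterior(data, a=2.0, b=2.0)

posterior_mean = sum(t * d for t, d in zip(thetas, density)) * step
print(posterior_mean)
```

For this conjugate choice the exact posterior is a Beta distribution with mean $(a+6)/(a+b+8)=2/3$, which gives a convenient check on the grid approximation. Grids become impractical as the dimension of ${\boldsymbol {\theta }}$ grows.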
## Formal description of Bayesian inference
### Definitions
- $x$, a data point in general. This may in fact be a [vector](https://en.wikipedia.org/wiki/Random_vector "Random vector") of values.
- $\theta$, the [parameter](https://en.wikipedia.org/wiki/Parameter "Parameter") of the data point's distribution, i.e., $x\sim p(x\mid \theta )$. This may be a [vector](https://en.wikipedia.org/wiki/Random_vector "Random vector") of parameters.
- $\alpha$, the [hyperparameter](https://en.wikipedia.org/wiki/Hyperparameter_\(Bayesian_statistics\) "Hyperparameter (Bayesian statistics)") of the parameter distribution, i.e., $\theta \sim p(\theta \mid \alpha )$. This may be a [vector](https://en.wikipedia.org/wiki/Random_vector "Random vector") of hyperparameters.
- $\mathbf {X}$ is the sample, a set of $n$ observed data points, i.e., $x_{1},\ldots ,x_{n}$.
- ${\tilde {x}}$, a new data point whose distribution is to be predicted.
### Bayesian inference
- The [prior distribution](https://en.wikipedia.org/wiki/Prior_distribution "Prior distribution") is the distribution of the parameter(s) before any data is observed, i.e. $p(\theta \mid \alpha )$. The prior distribution might not be easily determined; in such a case, one possibility may be to use the [Jeffreys prior](https://en.wikipedia.org/wiki/Jeffreys_prior "Jeffreys prior") to obtain a prior distribution before updating it with newer observations.
- The [sampling distribution](https://en.wikipedia.org/wiki/Sampling_distribution "Sampling distribution") is the distribution of the observed data conditional on its parameters, i.e. $p(\mathbf {X} \mid \theta )$. This is also termed the [likelihood](https://en.wikipedia.org/wiki/Likelihood_function "Likelihood function"), especially when viewed as a function of the parameter(s), sometimes written $\operatorname {L} (\theta \mid \mathbf {X} )=p(\mathbf {X} \mid \theta )$.
- The [marginal likelihood](https://en.wikipedia.org/wiki/Marginal_likelihood "Marginal likelihood") (sometimes also termed the *evidence*) is the distribution of the observed data [marginalized](https://en.wikipedia.org/wiki/Marginal_distribution "Marginal distribution") over the parameter(s), i.e.

  $$p(\mathbf {X} \mid \alpha )=\int p(\mathbf {X} \mid \theta )p(\theta \mid \alpha )\,d\theta .$$

  It quantifies the agreement between data and expert opinion, in a geometric sense that can be made precise.[\[7\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-deCarvalho-Geometry-7) If the marginal likelihood is 0 then there is no agreement between the data and expert opinion and Bayes' rule cannot be applied.
- The [posterior distribution](https://en.wikipedia.org/wiki/Posterior_distribution "Posterior distribution") is the distribution of the parameter(s) after taking into account the observed data. This is determined by [Bayes' rule](https://en.wikipedia.org/wiki/Bayes%27_rule "Bayes' rule"), which forms the heart of Bayesian inference:

  $$p(\theta \mid \mathbf {X} ,\alpha )={\frac {p(\theta ,\mathbf {X} ,\alpha )}{p(\mathbf {X} ,\alpha )}}={\frac {p(\mathbf {X} \mid \theta ,\alpha )p(\theta ,\alpha )}{p(\mathbf {X} \mid \alpha )p(\alpha )}}={\frac {p(\mathbf {X} \mid \theta ,\alpha )p(\theta \mid \alpha )}{p(\mathbf {X} \mid \alpha )}}\propto p(\mathbf {X} \mid \theta ,\alpha )p(\theta \mid \alpha ).$$

  This is expressed in words as "posterior is proportional to likelihood times prior", or sometimes as "posterior = likelihood times prior, over evidence".
- In practice, for almost all complex Bayesian models used in machine learning, the posterior distribution $p(\theta \mid \mathbf {X} ,\alpha )$ cannot be obtained in closed form, mainly because the dimension of the parameter space for $\theta$ can be very high, or because the Bayesian model retains a certain hierarchical structure formulated from the observations $\mathbf {X}$ and parameter $\theta$. In such situations, we need to resort to approximation techniques.[\[8\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Lee-GibbsSampler-8)
- General case: Let $P_{Y}^{x}$ be the conditional distribution of $Y$ given $X=x$ and let $P_{X}$ be the distribution of $X$. The joint distribution is then $P_{X,Y}(dx,dy)=P_{Y}^{x}(dy)P_{X}(dx)$. The conditional distribution $P_{X}^{y}$ of $X$ given $Y=y$ is then determined by

  $$P_{X}^{y}(A)=E(1_{A}(X)\mid Y=y).$$

  Existence and uniqueness of the needed [conditional expectation](https://en.wikipedia.org/wiki/Conditional_expectation "Conditional expectation") is a consequence of the [Radon–Nikodym theorem](https://en.wikipedia.org/wiki/Radon%E2%80%93Nikodym_theorem "Radon–Nikodym theorem"). This was formulated by [Kolmogorov](https://en.wikipedia.org/wiki/Andrey_Kolmogorov "Andrey Kolmogorov") in his famous book from 1933. Kolmogorov underlines the importance of conditional probability by writing "I wish to call attention to ... and especially the theory of conditional probabilities and conditional expectations ..." in the Preface.[\[9\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-9) The Bayes theorem determines the posterior distribution from the prior distribution. Uniqueness requires continuity assumptions.[\[10\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-10) Bayes' theorem can be generalized to include improper prior distributions such as the uniform distribution on the real line.[\[11\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-11) Modern [Markov chain Monte Carlo](https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo "Markov chain Monte Carlo") methods have boosted the importance of Bayes' theorem including cases with improper priors.[\[12\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-12)
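The four ingredients listed above (prior, likelihood, marginal likelihood and posterior) can be written out explicitly in one of the few cases where everything is available in closed form. The sketch below assumes a Bernoulli sampling model with a Beta prior, a standard conjugate pair; the hyperparameters, the data summary and all function names are illustrative assumptions rather than anything specified in the text.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b), computed via log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_pdf(theta, a, b):
    """Density of Beta(a, b) at theta, for 0 < theta < 1."""
    log_density = (
        (a - 1) * math.log(theta) + (b - 1) * math.log(1.0 - theta) - log_beta(a, b)
    )
    return math.exp(log_density)


def likelihood(theta, k, n):
    """p(X | theta) for one specific 0/1 sequence with k successes in n trials."""
    return theta ** k * (1.0 - theta) ** (n - k)


def marginal_likelihood(k, n, a, b):
    """p(X | alpha): the likelihood integrated against the Beta(a, b) prior."""
    return math.exp(log_beta(a + k, b + n - k) - log_beta(a, b))


a, b = 2.0, 2.0   # hypothetical hyperparameters alpha = (a, b)
k, n = 6, 8       # made-up data: 6 successes in 8 trials
theta = 0.7       # point at which the posterior density is evaluated

# Posterior = likelihood * prior / evidence ...
via_bayes = likelihood(theta, k, n) * beta_pdf(theta, a, b) / marginal_likelihood(k, n, a, b)
# ... which for this conjugate pair is again a Beta density.
closed_form = beta_pdf(theta, a + k, b + n - k)

print(via_bayes, closed_form)
```

Outside such conjugate cases the marginal likelihood generally has no closed form, which is why approximation techniques are needed.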
### Bayesian prediction
- The [posterior predictive distribution](https://en.wikipedia.org/wiki/Posterior_predictive_distribution "Posterior predictive distribution") is the distribution of a new data point, marginalized over the posterior: $$p({\tilde {x}}\mid \mathbf {X} ,\alpha )=\int p({\tilde {x}}\mid \theta )\,p(\theta \mid \mathbf {X} ,\alpha )\,d\theta$$

- The [prior predictive distribution](https://en.wikipedia.org/wiki/Prior_predictive_distribution "Prior predictive distribution") is the distribution of a new data point, marginalized over the prior: $$p({\tilde {x}}\mid \alpha )=\int p({\tilde {x}}\mid \theta )\,p(\theta \mid \alpha )\,d\theta$$

Bayesian theory calls for the use of the posterior predictive distribution to do [predictive inference](https://en.wikipedia.org/wiki/Predictive_inference "Predictive inference"), i.e., to [predict](https://en.wikipedia.org/wiki/Prediction "Prediction") the distribution of a new, unobserved data point. That is, instead of a fixed point as a prediction, a distribution over possible points is returned. Only this way is the entire posterior distribution of the parameter(s) used. By comparison, prediction in [frequentist statistics](https://en.wikipedia.org/wiki/Frequentist_statistics "Frequentist statistics") often involves finding an optimum point estimate of the parameter(s), e.g., by [maximum likelihood](https://en.wikipedia.org/wiki/Maximum_likelihood "Maximum likelihood") or [maximum a posteriori estimation](https://en.wikipedia.org/wiki/Maximum_a_posteriori_estimation "Maximum a posteriori estimation") (MAP), and then plugging this estimate into the formula for the distribution of a data point. This has the disadvantage that it does not account for any uncertainty in the value of the parameter, and hence will underestimate the [variance](https://en.wikipedia.org/wiki/Variance "Variance") of the predictive distribution.
In some instances, frequentist statistics can work around this problem. For example, [confidence intervals](https://en.wikipedia.org/wiki/Confidence_interval "Confidence interval") and [prediction intervals](https://en.wikipedia.org/wiki/Prediction_interval "Prediction interval") in frequentist statistics when constructed from a [normal distribution](https://en.wikipedia.org/wiki/Normal_distribution "Normal distribution") with unknown [mean](https://en.wikipedia.org/wiki/Mean "Mean") and [variance](https://en.wikipedia.org/wiki/Variance "Variance") are constructed using a [Student's t-distribution](https://en.wikipedia.org/wiki/Student%27s_t-distribution "Student's t-distribution"). This correctly estimates the variance, due to the facts that (1) the average of normally distributed random variables is also normally distributed, and (2) the predictive distribution of a normally distributed data point with unknown mean and variance, using conjugate or uninformative priors, has a Student's t-distribution. In Bayesian statistics, however, the posterior predictive distribution can always be determined exactly, or at least to an arbitrary level of precision when numerical methods are used.
Both types of predictive distributions have the form of a [compound probability distribution](https://en.wikipedia.org/wiki/Compound_probability_distribution "Compound probability distribution") (as does the [marginal likelihood](https://en.wikipedia.org/wiki/Marginal_likelihood "Marginal likelihood")). In fact, if the prior distribution is a [conjugate prior](https://en.wikipedia.org/wiki/Conjugate_prior "Conjugate prior"), such that the prior and posterior distributions come from the same family, it can be seen that both prior and posterior predictive distributions also come from the same family of compound distributions. The only difference is that the posterior predictive distribution uses the updated values of the hyperparameters (applying the Bayesian update rules given in the [conjugate prior](https://en.wikipedia.org/wiki/Conjugate_prior "Conjugate prior") article), while the prior predictive distribution uses the values of the hyperparameters that appear in the prior distribution.
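The difference between a plug-in prediction and the posterior predictive distribution can be checked numerically. The following is a minimal sketch, assuming a normal likelihood with known variance `sigma2` and a conjugate normal prior on the mean; the function names and the example numbers are illustrative assumptions, not taken from the article.

```python
def normal_posterior(xs, sigma2, mu0, tau0_2):
    """Posterior N(mu_n, tau_n2) for the mean of normal data with known
    variance sigma2, under a conjugate normal prior N(mu0, tau0_2)."""
    n = len(xs)
    precision = 1.0 / tau0_2 + n / sigma2  # prior precision + data precision
    tau_n2 = 1.0 / precision
    mu_n = tau_n2 * (mu0 / tau0_2 + sum(xs) / sigma2)
    return mu_n, tau_n2


def posterior_predictive(xs, sigma2, mu0, tau0_2):
    """Mean and variance of the posterior predictive distribution of a new point."""
    mu_n, tau_n2 = normal_posterior(xs, sigma2, mu0, tau0_2)
    # Observation noise plus the remaining uncertainty about the mean.
    return mu_n, sigma2 + tau_n2


def plug_in_predictive(xs, sigma2, mu0, tau0_2):
    """Plug-in prediction: fix the mean at its point estimate and ignore its uncertainty."""
    mu_n, _ = normal_posterior(xs, sigma2, mu0, tau0_2)
    return mu_n, sigma2


data = [1.0, 2.0, 3.0]  # hypothetical observations
print(posterior_predictive(data, sigma2=1.0, mu0=0.0, tau0_2=1.0))  # (1.5, 1.25)
print(plug_in_predictive(data, sigma2=1.0, mu0=0.0, tau0_2=1.0))    # (1.5, 1.0)
```

Both predictions share the same centre, but the plug-in variance omits the term `tau_n2`, which is the underestimation described above.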
## Mathematical properties
### Interpretation of factor
$\frac{P(E\mid M)}{P(E)}>1\Rightarrow P(E\mid M)>P(E)$. That is, if the model were true, the evidence would be more likely than is predicted by the current state of belief. The reverse applies for a decrease in belief. If the belief does not change, $\frac{P(E\mid M)}{P(E)}=1\Rightarrow P(E\mid M)=P(E)$. That is, the evidence is independent of the model. If the model were true, the evidence would be exactly as likely as predicted by the current state of belief.
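The role of the factor $P(E\mid M)/P(E)$ can be made concrete with a few lines of code. This is a small sketch with hypothetical function names and illustrative probabilities; it only restates Bayes' theorem as "prior times factor".

```python
def belief_update_factor(p_e_given_m, p_e):
    """Ratio P(E|M) / P(E) by which the belief in M is multiplied."""
    if p_e <= 0.0:
        raise ValueError("P(E) must be positive")
    return p_e_given_m / p_e


def updated_belief(p_m, p_e_given_m, p_e):
    """Posterior P(M|E) = P(M) * P(E|M) / P(E)."""
    return p_m * belief_update_factor(p_e_given_m, p_e)


# Factor above 1: the evidence is more likely under M, so belief increases.
print(belief_update_factor(0.75, 0.625), updated_belief(0.5, 0.75, 0.625))
# Factor below 1: the evidence is less likely under M, so belief decreases.
print(belief_update_factor(0.5, 0.625), updated_belief(0.5, 0.5, 0.625))
# Factor equal to 1: the evidence is independent of M, so belief is unchanged.
print(belief_update_factor(0.625, 0.625), updated_belief(0.5, 0.625, 0.625))
```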
### Cromwell's rule
Main article: [Cromwell's rule](https://en.wikipedia.org/wiki/Cromwell%27s_rule "Cromwell's rule")
If $P(M)=0$ then $P(M\mid E)=0$. If $P(M)=1$ and $P(E)>0$, then $P(M\mid E)=1$. This can be interpreted to mean that hard convictions are insensitive to counter-evidence.
The former follows directly from Bayes' theorem. The latter can be derived by applying the first rule to the event "not $M$" in place of "$M$", yielding "if $1-P(M)=0$, then $1-P(M\mid E)=0$", from which the result immediately follows.
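A quick numerical check of the rule, as a sketch: the helper below is a hypothetical two-hypothesis form of Bayes' theorem (model $M$ versus "not $M$"), and the likelihood values are made up for illustration.

```python
def posterior(prior_m, p_e_given_m, p_e_given_not_m):
    """P(M|E) from the prior P(M) and the likelihoods of E under M and under not-M."""
    evidence = p_e_given_m * prior_m + p_e_given_not_m * (1.0 - prior_m)
    if evidence <= 0.0:
        raise ValueError("P(E) must be positive")
    return p_e_given_m * prior_m / evidence


# Evidence that strongly favours M cannot move a prior of exactly 0.
print(posterior(0.0, 0.99, 0.01))  # 0.0
# Evidence that strongly favours not-M cannot move a prior of exactly 1.
print(posterior(1.0, 0.01, 0.99))  # 1.0
# A prior that is merely very small still responds to the same evidence.
print(posterior(0.001, 0.99, 0.01))
```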
### Asymptotic behaviour of posterior
Consider the behaviour of a belief distribution as it is updated a large number of times with [independent and identically distributed](https://en.wikipedia.org/wiki/Independent_and_identically_distributed "Independent and identically distributed") trials. For sufficiently nice prior probabilities, the [Bernstein–von Mises theorem](https://en.wikipedia.org/wiki/Bernstein%E2%80%93von_Mises_theorem "Bernstein–von Mises theorem") gives that, in the limit of infinite trials, the posterior converges to a [Gaussian distribution](https://en.wikipedia.org/wiki/Gaussian_distribution "Gaussian distribution") independent of the initial prior. The conditions for this were first outlined and rigorously proven by [Joseph L. Doob](https://en.wikipedia.org/wiki/Joseph_L._Doob "Joseph L. Doob") in 1948, namely that the random variable in consideration has a finite [probability space](https://en.wikipedia.org/wiki/Probability_space "Probability space"). More general results were obtained later by the statistician [David A. Freedman](https://en.wikipedia.org/wiki/David_A._Freedman_\(statistician\) "David A. Freedman (statistician)"), who established in two seminal research papers in 1963 [\[13\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-13) and 1965 [\[14\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-14) when and under what circumstances the asymptotic behaviour of the posterior is guaranteed. His 1963 paper treats, like Doob (1949), the finite case and comes to a satisfactory conclusion. However, if the random variable has an infinite but countable [probability space](https://en.wikipedia.org/wiki/Probability_space "Probability space") (i.e., corresponding to a die with infinitely many faces), the 1965 paper demonstrates that for a dense subset of priors the [Bernstein–von Mises theorem](https://en.wikipedia.org/wiki/Bernstein%E2%80%93von_Mises_theorem "Bernstein–von Mises theorem") is not applicable.
In this case there is [almost surely](https://en.wikipedia.org/wiki/Almost_surely "Almost surely") no asymptotic convergence. Later, in the 1980s and 1990s, [Freedman](https://en.wikipedia.org/wiki/David_A._Freedman_\(statistician\) "David A. Freedman (statistician)") and [Persi Diaconis](https://en.wikipedia.org/wiki/Persi_Diaconis "Persi Diaconis") continued to work on the case of infinite countable probability spaces.[\[15\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-15) To summarise, there may be insufficient trials to suppress the effects of the initial choice, and especially for large (but finite) systems the convergence might be very slow.
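The diminishing influence of the prior in the finite case can be illustrated with a two-outcome experiment. The sketch below is an assumption-laden toy: it uses Beta priors on a success probability, a fixed 70% success rate in the data, and hypothetical function names. It shows only that two different priors lead to nearly the same posterior mean as trials accumulate; it does not demonstrate the Gaussian shape asserted by the theorem.

```python
def posterior_mean(a, b, successes, trials):
    """Posterior mean of a success probability under a Beta(a, b) prior."""
    return (a + successes) / (a + b + trials)


def prior_gap(trials):
    """Distance between posterior means under a flat and a strongly skewed prior."""
    successes = (7 * trials) // 10  # assumed data: 70% successes
    flat = posterior_mean(1, 1, successes, trials)
    skewed = posterior_mean(2, 20, successes, trials)
    return abs(flat - skewed)


for n in (10, 100, 1000, 10000):
    print(n, prior_gap(n))
```

With few trials the two posteriors disagree substantially, which matches the remark that there may be insufficient trials to suppress the effects of the initial choice.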
### Conjugate priors
Main article: [Conjugate prior](https://en.wikipedia.org/wiki/Conjugate_prior "Conjugate prior")
In parameterized form, the prior distribution is often assumed to come from a family of distributions called [conjugate priors](https://en.wikipedia.org/wiki/Conjugate_prior "Conjugate prior"). The usefulness of a conjugate prior is that the corresponding posterior distribution will be in the same family, and the calculation may be expressed in [closed form](https://en.wikipedia.org/wiki/Closed-form_expression "Closed-form expression").
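As a concrete instance of a closed-form update, consider the standard Beta prior for a Bernoulli success probability. This is a minimal sketch with a hypothetical function name and made-up counts; the update rule itself (add successes to the first hyperparameter, failures to the second) is the usual Beta-Bernoulli result.

```python
def beta_update(a, b, successes, failures):
    """Beta(a, b) prior plus Bernoulli data gives a Beta(a + s, b + f) posterior."""
    return a + successes, b + failures


# Updating in one batch...
batch = beta_update(1, 1, successes=7, failures=3)

# ...gives the same posterior as updating one observation at a time.
a, b = 1, 1
for outcome in [1, 1, 0, 1, 1, 1, 0, 1, 0, 1]:  # hypothetical trial results
    a, b = beta_update(a, b, successes=outcome, failures=1 - outcome)

print(batch, (a, b))  # (8, 4) (8, 4)
```

Because the posterior stays in the Beta family, no numerical integration is needed at any step.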
### Estimates of parameters and predictions
It is often desired to use a posterior distribution to estimate a parameter or variable. Several methods of Bayesian estimation select [measurements of central tendency](https://en.wikipedia.org/wiki/Central_tendency "Central tendency") from the posterior distribution.
For one-dimensional problems, a unique median exists for practical continuous problems. The posterior median is attractive as a [robust estimator](https://en.wikipedia.org/wiki/Robust_statistics "Robust statistics").[\[16\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-16)
If there exists a finite mean for the posterior distribution, then the posterior mean is a method of estimation.[\[17\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-17) $${\tilde {\theta }}=\operatorname {E} [\theta ]=\int \theta \,p(\theta \mid \mathbf {X} ,\alpha )\,d\theta$$
Taking a value with the greatest probability defines [maximum *a posteriori* (MAP)](https://en.wikipedia.org/wiki/Maximum_a_posteriori_estimation "Maximum a posteriori estimation") estimates:[\[18\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-18) $$\{\theta _{\text{MAP}}\}\subset \arg \max _{\theta }p(\theta \mid \mathbf {X} ,\alpha ).$$
There are examples where no maximum is attained, in which case the set of MAP estimates is [empty](https://en.wikipedia.org/wiki/Empty_set "Empty set").
There are other methods of estimation that minimize the posterior *[risk](https://en.wikipedia.org/wiki/Risk "Risk")* (expected-posterior loss) with respect to a [loss function](https://en.wikipedia.org/wiki/Loss_function "Loss function"), and these are of interest to [statistical decision theory](https://en.wikipedia.org/wiki/Statistical_decision_theory "Statistical decision theory") using the sampling distribution ("frequentist statistics").[\[19\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-19)
The [posterior predictive distribution](https://en.wikipedia.org/wiki/Posterior_predictive_distribution "Posterior predictive distribution") of a new observation ${\tilde {x}}$ (that is independent of previous observations) is determined by[\[20\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-20) $$p({\tilde {x}}\mid \mathbf {X} ,\alpha )=\int p({\tilde {x}},\theta \mid \mathbf {X} ,\alpha )\,d\theta =\int p({\tilde {x}}\mid \theta )\,p(\theta \mid \mathbf {X} ,\alpha )\,d\theta .$$
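The three point estimates discussed in this section (posterior median, posterior mean and MAP) can be compared on a simple grid approximation. The sketch below assumes a uniform prior on a success probability and hypothetical data of 7 successes in 10 trials; the grid size and function names are illustrative choices, not part of the article.

```python
def grid_posterior(successes, trials, points=10001):
    """Discretised posterior over theta in [0, 1] under a uniform prior."""
    thetas = [i / (points - 1) for i in range(points)]
    weights = [t ** successes * (1.0 - t) ** (trials - successes) for t in thetas]
    total = sum(weights)
    return thetas, [w / total for w in weights]


def posterior_mean(thetas, probs):
    return sum(t * p for t, p in zip(thetas, probs))


def posterior_median(thetas, probs):
    cumulative = 0.0
    for t, p in zip(thetas, probs):
        cumulative += p
        if cumulative >= 0.5:
            return t
    return thetas[-1]


def map_estimate(thetas, probs):
    # Grid point with the highest posterior probability.
    return max(zip(probs, thetas))[1]


thetas, probs = grid_posterior(successes=7, trials=10)
print(posterior_mean(thetas, probs))    # close to 8/12, the mean of Beta(8, 4)
print(posterior_median(thetas, probs))  # lies between the mean and the mode
print(map_estimate(thetas, probs))      # close to 0.7, the mode of Beta(8, 4)
```

For this skewed posterior the three estimates differ, which is why the choice of estimator (equivalently, of loss function) matters.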
## Examples
### Probability of a hypothesis
| Cookie / Bowl | \#1 *H*1 | \#2 *H*2 | Total |
|---|---|---|---|
| Plain, *E* | **30** | 20 | **50** |
| Choc, ¬*E* | 10 | 20 | 30 |
| Total | 40 | 40 | 80 |
| *P*(*H*1\|*E*) = 30 / 50 = 0.6 | | | |
Suppose there are two full bowls of cookies. Bowl \#1 has 10 chocolate chip and 30 plain cookies, while bowl \#2 has 20 of each. Our friend Fred picks a bowl at random, and then picks a cookie at random. We may assume there is no reason to believe Fred treats one bowl differently from another, likewise for the cookies. The cookie turns out to be a plain one. How probable is it that Fred picked it out of bowl \#1?
Intuitively, it seems clear that the answer should be more than a half, since there are more plain cookies in bowl \#1. The precise answer is given by Bayes' theorem. Let $H_{1}$ correspond to bowl \#1, and $H_{2}$ to bowl \#2. It is given that the bowls are identical from Fred's point of view, thus $P(H_{1})=P(H_{2})$, and the two must add up to 1, so both are equal to 0.5. The event $E$ is the observation of a plain cookie. From the contents of the bowls, we know that $P(E\mid H_{1})=30/40=0.75$ and $P(E\mid H_{2})=20/40=0.5$. Bayes' formula then yields $$\begin{aligned}P(H_{1}\mid E)&={\frac {P(E\mid H_{1})\,P(H_{1})}{P(E\mid H_{1})\,P(H_{1})\;+\;P(E\mid H_{2})\,P(H_{2})}}\\&={\frac {0.75\times 0.5}{0.75\times 0.5+0.5\times 0.5}}\\&=0.6\end{aligned}$$
Before we observed the cookie, the probability we assigned for Fred having chosen bowl \#1 was the prior probability, $P(H_{1})$, which was 0.5. After observing the cookie, we must revise the probability to $P(H_{1}\mid E)$, which is 0.6.
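The same calculation can be written as a short program. This is a sketch using the numbers of the cookie example; the dictionary keys and the function name are hypothetical.

```python
def posterior_over_hypotheses(priors, likelihoods):
    """Apply Bayes' theorem to a finite set of mutually exclusive hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())  # P(E), summed over all hypotheses
    return {h: j / evidence for h, j in joint.items()}


priors = {"bowl1": 0.5, "bowl2": 0.5}
plain_likelihood = {"bowl1": 30 / 40, "bowl2": 20 / 40}  # P(plain | bowl)

posterior = posterior_over_hypotheses(priors, plain_likelihood)
print(posterior)  # bowl1 is about 0.6, bowl2 about 0.4
```

Passing the resulting posterior back in as the prior for a second cookie would give the sequential update, under the added assumption that Fred draws again from the same bowl and that the likelihoods are adjusted for the cookie already removed.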
### Making a prediction
[](https://en.wikipedia.org/wiki/File:Bayesian_inference_archaeology_example.jpg)
Example results for archaeology example. This simulation was generated using c=15.2.
An archaeologist is working at a site thought to be from the medieval period, from the 11th century to the 16th century. However, it is uncertain exactly when in this period the site was inhabited. Fragments of pottery are found, some of which are glazed and some of which are decorated. It is expected that if the site were inhabited during the early medieval period, then 1% of the pottery would be glazed and 50% of its area decorated, whereas if it had been inhabited in the late medieval period then 81% would be glazed and 5% of its area decorated. How confident can the archaeologist be in the date of inhabitation as fragments are unearthed?
The degree of belief in the continuous variable $C$ (century) is to be calculated, with the discrete set of events $\{GD,G{\bar {D}},{\bar {G}}D,{\bar {G}}{\bar {D}}\}$ as evidence. Assuming linear variation of glaze and decoration with time, and that these variables are independent,
$$\begin{aligned}P(E=GD\mid C=c)&=\left(0.01+{\frac {0.81-0.01}{16-11}}(c-11)\right)\left(0.5-{\frac {0.5-0.05}{16-11}}(c-11)\right)\\P(E=G{\bar {D}}\mid C=c)&=\left(0.01+{\frac {0.81-0.01}{16-11}}(c-11)\right)\left(0.5+{\frac {0.5-0.05}{16-11}}(c-11)\right)\\P(E={\bar {G}}D\mid C=c)&=\left((1-0.01)-{\frac {0.81-0.01}{16-11}}(c-11)\right)\left(0.5-{\frac {0.5-0.05}{16-11}}(c-11)\right)\\P(E={\bar {G}}{\bar {D}}\mid C=c)&=\left((1-0.01)-{\frac {0.81-0.01}{16-11}}(c-11)\right)\left(0.5+{\frac {0.5-0.05}{16-11}}(c-11)\right)\end{aligned}$$
Assume a uniform prior of $f_{C}(c)=0.2$, and that trials are [independent and identically distributed](https://en.wikipedia.org/wiki/Independent_and_identically_distributed "Independent and identically distributed"). When a new fragment of type $e$ is discovered, Bayes' theorem is applied to update the degree of belief for each $c$: $$f_{C}(c\mid E=e)={\frac {P(E=e\mid C=c)}{P(E=e)}}f_{C}(c)={\frac {P(E=e\mid C=c)}{\int _{11}^{16}{P(E=e\mid C=c)f_{C}(c)\,dc}}}f_{C}(c)$$
A computer simulation of the changing belief as 50 fragments are unearthed is shown on the graph. In the simulation, the site was inhabited around 1420, or $c=15.2$. By calculating the area under the relevant portion of the graph for 50 trials, the archaeologist can say that there is practically no chance the site was inhabited in the 11th and 12th centuries, about 1% chance that it was inhabited during the 13th century, 63% chance during the 14th century and 36% during the 15th century. The [Bernstein–von Mises theorem](https://en.wikipedia.org/wiki/Bernstein%E2%80%93von_Mises_theorem "Bernstein–von Mises theorem") asserts here the asymptotic convergence to the "true" distribution because the [probability space](https://en.wikipedia.org/wiki/Probability_space "Probability space") corresponding to the discrete set of events $\{GD,G{\bar {D}},{\bar {G}}D,{\bar {G}}{\bar {D}}\}$ is finite (see above section on asymptotic behaviour of the posterior).
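A simulation of this kind can be sketched as follows. This is not the code behind the article's figure: the grid resolution, the random seed, the trapezoidal integration and the fragment labels (`"GD"`, `"Gd"`, `"gD"`, `"gd"`, where a lower-case letter marks the negated property) are all assumptions made for illustration, so the resulting numbers will differ from those quoted above.

```python
import random

C_MIN, C_MAX = 11.0, 16.0


def likelihoods(c):
    """P(E = e | C = c) for the four fragment types, using the linear model."""
    glazed = 0.01 + (0.81 - 0.01) / (C_MAX - C_MIN) * (c - C_MIN)
    decorated = 0.5 - (0.5 - 0.05) / (C_MAX - C_MIN) * (c - C_MIN)
    return {
        "GD": glazed * decorated,
        "Gd": glazed * (1.0 - decorated),
        "gD": (1.0 - glazed) * decorated,
        "gd": (1.0 - glazed) * (1.0 - decorated),
    }


def trapezoid(values, step):
    """Trapezoidal rule on an evenly spaced grid."""
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))


def simulate(true_c=15.2, fragments=50, grid_size=501, seed=0):
    rng = random.Random(seed)
    step = (C_MAX - C_MIN) / (grid_size - 1)
    grid = [C_MIN + i * step for i in range(grid_size)]
    density = [1.0 / (C_MAX - C_MIN)] * grid_size  # uniform prior, f_C(c) = 0.2
    true_probs = likelihoods(true_c)
    types = list(true_probs)
    for _ in range(fragments):
        # Draw a fragment type from the "true" century, then update every grid point.
        e = rng.choices(types, weights=[true_probs[t] for t in types])[0]
        unnormalised = [likelihoods(c)[e] * f for c, f in zip(grid, density)]
        evidence = trapezoid(unnormalised, step)  # approximates P(E = e)
        density = [u / evidence for u in unnormalised]
    return grid, density, step


grid, density, step = simulate()
mean_c = trapezoid([c * f for c, f in zip(grid, density)], step)
print("posterior mean century:", mean_c)
```

The probability that the site dates from a particular century is the integral of `density` over the corresponding part of `grid`.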
## In frequentist statistics and decision theory
A [decision-theoretic](https://en.wikipedia.org/wiki/Statistical_decision_theory "Statistical decision theory") justification of the use of Bayesian inference was given by [Abraham Wald](https://en.wikipedia.org/wiki/Abraham_Wald "Abraham Wald"), who proved that every unique Bayesian procedure is [admissible](https://en.wikipedia.org/wiki/Admissible_decision_rule "Admissible decision rule"). Conversely, every [admissible](https://en.wikipedia.org/wiki/Admissible_decision_rule "Admissible decision rule") statistical procedure is either a Bayesian procedure or a limit of Bayesian procedures.[\[21\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Bickel_&_Doksum_2001,_page_32-21)
Wald characterized admissible procedures as Bayesian procedures (and limits of Bayesian procedures), making the Bayesian formalism a central technique in such areas of [frequentist inference](https://en.wikipedia.org/wiki/Frequentist_inference "Frequentist inference") as [parameter estimation](https://en.wikipedia.org/wiki/Parameter_estimation "Parameter estimation"), [hypothesis testing](https://en.wikipedia.org/wiki/Hypothesis_testing "Hypothesis testing"), and computing [confidence intervals](https://en.wikipedia.org/wiki/Confidence_intervals "Confidence intervals").[\[22\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-22)[\[23\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-23)[\[24\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-24) For example:
- "Under some conditions, all admissible procedures are either Bayes procedures or limits of Bayes procedures (in various senses). These remarkable results, at least in their original form, are due essentially to Wald. They are useful because the property of being Bayes is easier to analyze than admissibility."[\[21\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Bickel_&_Doksum_2001,_page_32-21)
- "In decision theory, a quite general method for proving admissibility consists in exhibiting a procedure as a unique Bayes solution."[\[25\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-25)
- "In the first chapters of this work, prior distributions with finite support and the corresponding Bayes procedures were used to establish some of the main theorems relating to the comparison of experiments. Bayes procedures with respect to more general prior distributions have played a very important role in the development of statistics, including its asymptotic theory." "There are many problems where a glance at posterior distributions, for suitable priors, yields immediately interesting information. Also, this technique can hardly be avoided in sequential analysis."[\[26\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-26)
- "A useful fact is that any Bayes decision rule obtained by taking a proper prior over the whole parameter space must be admissible"[\[27\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-27)
- "An important area of investigation in the development of admissibility ideas has been that of conventional sampling-theory procedures, and many interesting results have been obtained."[\[28\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-28)
### Model selection
Main article: [Bayesian model selection](https://en.wikipedia.org/wiki/Bayesian_model_selection "Bayesian model selection")
See also: [Bayesian information criterion](https://en.wikipedia.org/wiki/Bayesian_information_criterion "Bayesian information criterion")
Bayesian methodology also plays a role in [model selection](https://en.wikipedia.org/wiki/Model_selection "Model selection") where the aim is to select one model from a set of competing models that represents most closely the underlying process that generated the observed data. In Bayesian model comparison, the model with the highest [posterior probability](https://en.wikipedia.org/wiki/Posterior_probability "Posterior probability") given the data is selected. The posterior probability of a model depends on the evidence, or [marginal likelihood](https://en.wikipedia.org/wiki/Marginal_likelihood "Marginal likelihood"), which reflects the probability that the data is generated by the model, and on the [prior belief](https://en.wikipedia.org/wiki/Prior_probability "Prior probability") of the model. When two competing models are a priori considered to be equiprobable, the ratio of their posterior probabilities corresponds to the [Bayes factor](https://en.wikipedia.org/wiki/Bayes_factor "Bayes factor"). Since Bayesian model comparison is aimed at selecting the model with the highest posterior probability, this methodology is also referred to as the maximum a posteriori (MAP) selection rule[\[29\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-29) or the MAP probability rule.[\[30\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-30)
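A small worked comparison may help. The sketch below is a toy example under stated assumptions: two hypothetical models for a sequence of coin flips, one fixing the success probability at 0.5 and one placing a uniform prior on it, with made-up data of 9 successes in 10 flips. The model names and function names are illustrative.

```python
from math import factorial


def marginal_likelihood_fair(successes, trials):
    """Evidence of the observed sequence under a model with theta fixed at 0.5."""
    return 0.5 ** trials


def marginal_likelihood_uniform(successes, trials):
    """Evidence under a uniform prior on theta:
    integral of theta^s (1 - theta)^(n - s) d theta = s! (n - s)! / (n + 1)!"""
    failures = trials - successes
    return factorial(successes) * factorial(failures) / factorial(trials + 1)


def posterior_model_probabilities(marginals, priors):
    joint = {m: marginals[m] * priors[m] for m in marginals}
    total = sum(joint.values())
    return {m: j / total for m, j in joint.items()}


marginals = {
    "fair": marginal_likelihood_fair(9, 10),
    "uniform": marginal_likelihood_uniform(9, 10),
}
bayes_factor = marginals["uniform"] / marginals["fair"]
posterior = posterior_model_probabilities(marginals, {"fair": 0.5, "uniform": 0.5})

print(bayes_factor)                      # about 9.31
print(max(posterior, key=posterior.get))  # MAP selection picks "uniform"
```

With equal prior model probabilities, the ratio of the two posterior probabilities equals the Bayes factor, as stated above.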
## Probabilistic programming
Main article: [Probabilistic programming](https://en.wikipedia.org/wiki/Probabilistic_programming "Probabilistic programming")
While conceptually simple, Bayesian methods can be mathematically and numerically challenging. Probabilistic programming languages (PPLs) implement functions to easily build Bayesian models together with efficient automatic inference methods. This helps separate the model building from the inference, allowing practitioners to focus on their specific problems and leaving PPLs to handle the computational details for them.[\[31\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-31)[\[32\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-32)[\[33\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-33)
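The separation of model building from inference can be illustrated without any particular PPL. The following is a hand-written sketch, not the API of an existing library: the model is just a function returning an unnormalised log posterior, and a generic random-walk Metropolis sampler performs the inference. The data (7 successes in 10 trials), step size, seed and burn-in length are assumptions chosen for illustration.

```python
import math
import random


def log_posterior(theta, successes=7, trials=10):
    """Model: unnormalised log posterior of a success probability theta
    under a uniform prior, after `successes` out of `trials`."""
    if not 0.0 < theta < 1.0:
        return -math.inf
    return successes * math.log(theta) + (trials - successes) * math.log(1.0 - theta)


def metropolis(log_density, start, steps, step_size, rng):
    """Inference: random-walk Metropolis sampler that only needs a log density."""
    samples = []
    current, current_lp = start, log_density(start)
    for _ in range(steps):
        proposal = current + rng.gauss(0.0, step_size)
        proposal_lp = log_density(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


rng = random.Random(0)
draws = metropolis(log_posterior, start=0.5, steps=20000, step_size=0.2, rng=rng)
kept = draws[2000:]  # discard an initial burn-in segment
estimate = sum(kept) / len(kept)
print(estimate)  # should be close to the exact posterior mean 8/12
```

Swapping in a different `log_posterior` changes the model without touching the sampler, which is the division of labour that PPLs automate with more efficient inference methods.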
## Applications
### Statistical data analysis
See the separate Wikipedia entry on [Bayesian statistics](https://en.wikipedia.org/wiki/Bayesian_statistics "Bayesian statistics"), specifically the [statistical modeling](https://en.wikipedia.org/wiki/Bayesian_statistics#Statistical_modeling "Bayesian statistics") section in that page.
### Computer applications
Bayesian inference has applications in [artificial intelligence](https://en.wikipedia.org/wiki/Artificial_intelligence "Artificial intelligence") and [expert systems](https://en.wikipedia.org/wiki/Expert_system "Expert system"). Bayesian inference techniques have been a fundamental part of computerized [pattern recognition](https://en.wikipedia.org/wiki/Pattern_recognition "Pattern recognition") techniques since the late 1950s.[\[34\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-34) There is also an ever-growing connection between Bayesian methods and simulation-based [Monte Carlo](https://en.wikipedia.org/wiki/Monte_Carlo_method "Monte Carlo method") techniques, since complex models cannot be processed in closed form by a Bayesian analysis, while a [graphical model](https://en.wikipedia.org/wiki/Graphical_model "Graphical model") structure *may* allow for efficient simulation algorithms such as [Gibbs sampling](https://en.wikipedia.org/wiki/Gibbs_sampling "Gibbs sampling") and other [Metropolis–Hastings algorithm](https://en.wikipedia.org/wiki/Metropolis%E2%80%93Hastings_algorithm "Metropolis–Hastings algorithm") schemes.[\[35\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-35) Bayesian inference has also gained popularity among the [phylogenetics](https://en.wikipedia.org/wiki/Phylogenetics "Phylogenetics") community for these reasons; a number of applications allow many demographic and evolutionary parameters to be estimated simultaneously.
As applied to [statistical classification](https://en.wikipedia.org/wiki/Statistical_classification "Statistical classification"), Bayesian inference has been used to develop algorithms for identifying [e-mail spam](https://en.wikipedia.org/wiki/E-mail_spam "E-mail spam"). Applications which make use of Bayesian inference for spam filtering include [CRM114](https://en.wikipedia.org/wiki/CRM114_\(program\) "CRM114 (program)"), DSPAM, [Bogofilter](https://en.wikipedia.org/wiki/Bogofilter "Bogofilter"), [SpamAssassin](https://en.wikipedia.org/wiki/SpamAssassin "SpamAssassin"), [SpamBayes](https://en.wikipedia.org/wiki/SpamBayes "SpamBayes"), [Mozilla](https://en.wikipedia.org/wiki/Mozilla "Mozilla"), XEAMS, and others. Spam classification is treated in more detail in the article on the [naïve Bayes classifier](https://en.wikipedia.org/wiki/Na%C3%AFve_Bayes_classifier "Naïve Bayes classifier").
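The idea behind such filters can be sketched with a toy naïve Bayes classifier. This is not the algorithm of any of the programs listed above: the training messages are invented, the tokenisation is a plain whitespace split, and add-one (Laplace) smoothing is an assumption made so that unseen words do not produce zero probabilities.

```python
import math
from collections import Counter, defaultdict


def train(messages):
    """messages: iterable of (label, text) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for label, text in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def log_scores(text, word_counts, label_counts):
    """Unnormalised log posterior of each label given the words in `text`."""
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    total_messages = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        n_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_messages)  # log prior
        for word in text.lower().split():
            # Add-one smoothed estimate of P(word | label).
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocabulary)))
        scores[label] = score
    return scores


def classify(text, word_counts, label_counts):
    scores = log_scores(text, word_counts, label_counts)
    return max(scores, key=scores.get)


training = [  # hypothetical training data
    ("spam", "cheap pills buy now"),
    ("spam", "buy cheap watches"),
    ("ham", "meeting agenda for monday"),
    ("ham", "lunch on monday"),
]
word_counts, label_counts = train(training)
print(classify("cheap pills now", word_counts, label_counts))  # spam
print(classify("monday lunch", word_counts, label_counts))     # ham
```

The "naïve" part is the assumption that words are conditionally independent given the label, which is what allows the per-word log probabilities to be summed.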
[Solomonoff's inductive inference](https://en.wikipedia.org/wiki/Solomonoff%27s_theory_of_inductive_inference "Solomonoff's theory of inductive inference") is the theory of prediction based on observations; for example, predicting the next symbol based upon a given series of symbols. The only assumption is that the environment follows some unknown but computable [probability distribution](https://en.wikipedia.org/wiki/Probability_distribution "Probability distribution"). It is a formal inductive framework that combines two well-studied principles of inductive inference: Bayesian statistics and [Occam's Razor](https://en.wikipedia.org/wiki/Occam%27s_Razor "Occam's Razor").[\[36\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-36) Solomonoff's universal prior probability of any prefix *p* of a computable sequence *x* is the sum of the probabilities of all programs (for a universal computer) that compute something starting with *p*. Given some *p* and any computable but unknown probability distribution from which *x* is sampled, the universal prior and Bayes' theorem can be used to predict the yet unseen parts of *x* in optimal fashion.[\[37\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-37)[\[38\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-38)
### Bioinformatics and healthcare applications
Bayesian inference has been applied in different [bioinformatics](https://en.wikipedia.org/wiki/Bioinformatics "Bioinformatics") applications, including differential gene expression analysis.[\[39\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-:edgr-39) Bayesian inference is also used in a general cancer risk model, called [CIRI](https://en.wikipedia.org/wiki/Continuous_Individualized_Risk_Index "Continuous Individualized Risk Index") (Continuous Individualized Risk Index), where serial measurements are incorporated to update a Bayesian model which is primarily built from prior knowledge.[\[40\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-40)[\[41\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-41)
### Cosmology and astrophysical applications
The Bayesian approach has been central to recent progress in cosmology and astrophysics,[\[42\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-42)[\[43\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-43) and extends to a wide range of astrophysical problems, including the characterisation of exoplanets (such as fitting the atmosphere of [K2-18b](https://en.wikipedia.org/wiki/K2-18b "K2-18b")[\[44\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-44)), parameter constraints with cosmological data,[\[45\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-ArXiv_1807-45) and calibration in astrophysical experiments.[\[46\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-46)
In cosmology, it is often employed with computational techniques such as [Markov chain Monte Carlo](https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo "Markov chain Monte Carlo") (MCMC) and the [nested sampling algorithm](https://en.wikipedia.org/wiki/Nested_sampling_algorithm "Nested sampling algorithm") to analyse complex datasets and navigate high-dimensional parameter spaces. A notable application is parameter inference from the Planck 2018 cosmic microwave background (CMB) data.[\[45\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-ArXiv_1807-45) The six base cosmological parameters of the [Lambda-CDM model](https://en.wikipedia.org/wiki/Lambda-CDM_model "Lambda-CDM model") are not predicted by theory, but rather fitted from CMB data to a chosen model of cosmology (the Lambda-CDM model).[\[47\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-47) The Bayesian cosmology code `cobaya`[\[48\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-48) sets up cosmological runs and interfaces cosmological likelihoods and Boltzmann codes,[\[49\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-49)[\[50\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-50) which compute the predicted CMB anisotropies for any given set of cosmological parameters, with an MCMC or nested sampler.
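As a rough illustration of how an MCMC sampler explores a posterior, the following is a minimal random-walk Metropolis sketch in Python. It is a toy with a single parameter (the mean of Gaussian data with known standard deviation and a flat prior), not the interface of any cosmological code; the function names, step size, seeds and synthetic data are all assumptions made for the example.

```python
import math
import random

def log_posterior(mu, data, sigma=1.0):
    """Unnormalised log posterior of mu: Gaussian likelihood, flat prior (toy assumption)."""
    return -sum((x - mu) ** 2 for x in data) / (2.0 * sigma ** 2)

def metropolis(data, n_steps=20000, step=0.5, seed=0):
    """Random-walk Metropolis sampler for the single parameter mu."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_posterior(mu, data)
    samples = []
    for _ in range(n_steps):
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, data)
        # Accept the proposal with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        samples.append(mu)
    return samples[n_steps // 4:]  # discard the first quarter as burn-in

# Synthetic data (hypothetical): 50 draws from a Gaussian with mean 2 and sigma 1.
data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(50)]

samples = metropolis(data)
posterior_mean = sum(samples) / len(samples)
sample_mean = sum(data) / len(data)
print(posterior_mean, sample_mean)
```

With a flat prior the exact posterior mean equals the sample mean, so the two printed values should be close. Real cosmological analyses sample many parameters at once and evaluate a far more expensive likelihood at every step.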
This computational framework is not limited to the standard model; it is also essential for testing alternative or extended theories of cosmology, such as theories with early dark energy,[\[51\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-51) or modified gravity theories that introduce additional parameters beyond Lambda-CDM. [Bayesian model comparison](https://en.wikipedia.org/wiki/Bayesian_model_comparison "Bayesian model comparison") can then be employed to calculate the evidence for competing models, providing a statistical basis to assess whether the data support them over the standard Lambda-CDM.[\[52\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-52)
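The idea of comparing a baseline model with an extension that has an extra free parameter can be sketched with a deliberately simple, non-cosmological example. Here the baseline fixes a coin's success rate at 0.5, while the extended model leaves it free with a uniform prior; both evidences are available in closed form. The function names and the coin setting are assumptions chosen for illustration.

```python
from math import comb

def evidence_fixed(k, n, theta=0.5):
    """Evidence of the baseline model: binomial likelihood at a fixed success rate."""
    return comb(n, k) * theta ** k * (1 - theta) ** (n - k)

def evidence_uniform(k, n):
    """Evidence of the extended model: the binomial likelihood integrated
    over a uniform prior on the success rate, which equals 1 / (n + 1)."""
    return 1.0 / (n + 1)

def bayes_factor(k, n):
    """Ratio of evidences, extended model over baseline model."""
    return evidence_uniform(k, n) / evidence_fixed(k, n)

print(bayes_factor(5, 10))   # balanced data: ratio below 1, baseline preferred
print(bayes_factor(10, 10))  # extreme data: ratio above 1, extended model preferred
```

The first case shows the built-in penalty for the extra parameter: when the data are compatible with the baseline, the more flexible model spreads its prior mass over values that fit poorly and so receives lower evidence.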
### In the courtroom
Main article: [Jurimetrics § Bayesian analysis of evidence](https://en.wikipedia.org/wiki/Jurimetrics#Bayesian_analysis_of_evidence "Jurimetrics")
Bayesian inference can be used by jurors to coherently accumulate the evidence for and against a defendant, and to see whether, in totality, it meets their personal threshold for "[beyond a reasonable doubt](https://en.wikipedia.org/wiki/Beyond_a_reasonable_doubt "Beyond a reasonable doubt")".[\[53\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-53)[\[54\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-54)[\[55\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-55) Bayes' theorem is applied successively to all evidence presented, with the posterior from one stage becoming the prior for the next. The benefit of a Bayesian approach is that it gives the juror an unbiased, rational mechanism for combining evidence. It may be appropriate to explain Bayes' theorem to jurors in [odds form](https://en.wikipedia.org/wiki/Bayes%27_rule "Bayes' rule"), as [betting odds](https://en.wikipedia.org/wiki/Betting_odds "Betting odds") are more widely understood than probabilities. Alternatively, a [logarithmic approach](https://en.wikipedia.org/wiki/Gambling_and_information_theory "Gambling and information theory"), replacing multiplication with addition, might be easier for a jury to handle.
*Figure: Adding up evidence.*
If the existence of the crime is not in doubt, only the identity of the culprit, it has been suggested that the prior should be uniform over the qualifying population.[\[56\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-56) For example, if 1,000 people could have committed the crime, the prior probability of guilt would be 1/1000.
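The successive updating described above can be sketched in odds form, together with the logarithmic variant in which multiplication becomes addition. This is a minimal Python sketch: the likelihood ratios are hypothetical numbers chosen only for illustration, the pieces of evidence are assumed to be conditionally independent, and the function names are not taken from any source.

```python
import math

def prior_odds_uniform(population):
    """Prior odds of guilt when exactly one of `population` people is the culprit."""
    return 1.0 / (population - 1)

def update_odds(prior_odds, likelihood_ratios):
    """Apply Bayes' rule in odds form once per piece of evidence."""
    odds = prior_odds
    for ratio in likelihood_ratios:
        odds *= ratio  # the posterior odds become the prior odds for the next item
    return odds

def update_log_odds(prior_odds, likelihood_ratios):
    """Same update on a log10 scale, where the evidence simply adds up."""
    return math.log10(prior_odds) + sum(math.log10(r) for r in likelihood_ratios)

def odds_to_probability(odds):
    return odds / (1.0 + odds)

# Hypothetical likelihood ratios P(evidence | guilty) / P(evidence | innocent).
likelihood_ratios = [100.0, 20.0, 0.5]

odds = update_odds(prior_odds_uniform(1000), likelihood_ratios)
log_odds = update_log_odds(prior_odds_uniform(1000), likelihood_ratios)
probability = odds_to_probability(odds)
print(odds, log_odds, probability)
```

Starting from a prior probability of 1/1000 (odds of 1 to 999), these three hypothetical items bring the posterior probability of guilt to 1000/1999, just above one half. A ratio below 1, such as the third item, counts as evidence for the defendant.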
The use of Bayes' theorem by jurors is controversial. In the United Kingdom, a defence [expert witness](https://en.wikipedia.org/wiki/Expert_witness "Expert witness") explained Bayes' theorem to the jury in *[R v Adams](https://en.wikipedia.org/wiki/Regina_versus_Denis_John_Adams "Regina versus Denis John Adams")*. The jury convicted, but the case went to appeal on the basis that no means of accumulating evidence had been provided for jurors who did not wish to use Bayes' theorem. The Court of Appeal upheld the conviction, but it also gave the opinion that "To introduce Bayes' Theorem, or any similar method, into a criminal trial plunges the jury into inappropriate and unnecessary realms of theory and complexity, deflecting them from their proper task."
Gardner-Medwin[\[57\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-57) argues that the criterion on which a verdict in a criminal trial should be based is *not* the probability of guilt, but rather the *probability of the evidence, given that the defendant is innocent* (akin to a [frequentist](https://en.wikipedia.org/wiki/Frequentist "Frequentist") [p-value](https://en.wikipedia.org/wiki/P-value "P-value")). He argues that if the posterior probability of guilt is to be computed by Bayes' theorem, the prior probability of guilt must be known. This will depend on the incidence of the crime, which is an unusual piece of evidence to consider in a criminal trial. Consider the following three propositions:
*A* – the known facts and testimony could have arisen if the defendant is guilty.
*B* – the known facts and testimony could have arisen if the defendant is innocent.
*C* – the defendant is guilty.
Gardner-Medwin argues that the jury should believe both *A* and not-*B* in order to convict. *A* and not-*B* implies the truth of *C*, but the reverse is not true. It is possible that *B* and *C* are both true, but in this case he argues that a jury should acquit, even though they know that they will be letting some guilty people go free. See also [Lindley's paradox](https://en.wikipedia.org/wiki/Lindley%27s_paradox "Lindley's paradox").
### Bayesian epistemology
[Bayesian epistemology](https://en.wikipedia.org/wiki/Bayesian_epistemology "Bayesian epistemology") is a movement that advocates for Bayesian inference as a means of justifying the rules of inductive logic.
[Karl Popper](https://en.wikipedia.org/wiki/Karl_Popper "Karl Popper") and [David Miller](https://en.wikipedia.org/wiki/David_Miller_\(philosopher\) "David Miller (philosopher)") have rejected the idea of Bayesian rationalism, i.e. using Bayes' rule to make epistemological inferences:[\[58\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-58) they argue that it is prone to the same [vicious circle](https://en.wikipedia.org/wiki/Vicious_circle "Vicious circle") as any other [justificationist](https://en.wikipedia.org/wiki/Justificationism "Justificationism") epistemology, because it presupposes what it attempts to justify. According to this view, a rational interpretation of Bayesian inference would see it merely as a probabilistic version of [falsification](https://en.wikipedia.org/wiki/Falsifiability "Falsifiability"), rejecting the belief, commonly held by Bayesians, that high likelihood achieved by a series of Bayesian updates would prove the hypothesis beyond any reasonable doubt, or even with likelihood greater than 0.
### Other
- The [scientific method](https://en.wikipedia.org/wiki/Scientific_method "Scientific method") is sometimes interpreted as an application of Bayesian inference. In this view, Bayes' rule guides (or should guide) the updating of probabilities about [hypotheses](https://en.wikipedia.org/wiki/Hypothesis "Hypothesis") conditional on new observations or [experiments](https://en.wikipedia.org/wiki/Experiment "Experiment").[\[59\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-59) Bayesian inference has also been applied by Cai et al. (2009) to [stochastic scheduling](https://en.wikipedia.org/wiki/Stochastic_scheduling "Stochastic scheduling") problems with incomplete information.[\[60\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Cai_et_al._2009-60)
- [Bayesian search theory](https://en.wikipedia.org/wiki/Bayesian_search_theory "Bayesian search theory") is used to search for lost objects.
- [Bayesian inference in phylogeny](https://en.wikipedia.org/wiki/Bayesian_inference_in_phylogeny "Bayesian inference in phylogeny")
- [Bayesian tool for methylation analysis](https://en.wikipedia.org/wiki/Bayesian_tool_for_methylation_analysis "Bayesian tool for methylation analysis")
- [Bayesian approaches to brain function](https://en.wikipedia.org/wiki/Bayesian_approaches_to_brain_function "Bayesian approaches to brain function") investigate the brain as a Bayesian mechanism.
- Bayesian inference in ecological studies[\[61\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-61)[\[62\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-62)
- Bayesian inference is used to estimate parameters in stochastic chemical kinetic models[\[63\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-63)
- Bayesian inference in [econophysics](https://en.wikipedia.org/wiki/Econophysics "Econophysics") for currency or prediction of trend changes in financial quotations[\[64\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-64)
- [Bayesian inference in marketing](https://en.wikipedia.org/wiki/Bayesian_inference_in_marketing "Bayesian inference in marketing")
- [Bayesian inference in motor learning](https://en.wikipedia.org/wiki/Bayesian_inference_in_motor_learning "Bayesian inference in motor learning")
- Bayesian inference is used in [probabilistic numerics](https://en.wikipedia.org/wiki/Probabilistic_numerics "Probabilistic numerics") to solve numerical problems
## Bayes and Bayesian inference
The problem considered by Bayes in Proposition 9 of his essay, "[An Essay Towards Solving a Problem in the Doctrine of Chances](https://en.wikipedia.org/wiki/An_Essay_Towards_Solving_a_Problem_in_the_Doctrine_of_Chances "An Essay Towards Solving a Problem in the Doctrine of Chances")", is the posterior distribution for the parameter *a* (the success rate) of the [binomial distribution](https://en.wikipedia.org/wiki/Binomial_distribution "Binomial distribution").\[*[citation needed](https://en.wikipedia.org/wiki/Wikipedia:Citation_needed "Wikipedia:Citation needed")*\]
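For concreteness, if a uniform prior on *a* is assumed, the posterior after *k* successes in *n* trials is the Beta(*k* + 1, *n* − *k* + 1) distribution, and the probability that *a* lies in an interval is the integral of that density. The sketch below evaluates it numerically in Python; the function names, the midpoint-rule integration and the example counts are illustrative assumptions, not a reconstruction of Bayes's own calculation.

```python
from math import comb

def posterior_density(a, k, n):
    """Beta(k + 1, n - k + 1) density: posterior of the success rate a
    after k successes in n trials, assuming a uniform prior on [0, 1]."""
    return (n + 1) * comb(n, k) * a ** k * (1 - a) ** (n - k)

def posterior_probability(lo, hi, k, n, steps=10000):
    """P(lo < a < hi | k successes in n trials), by the midpoint rule."""
    width = (hi - lo) / steps
    total = sum(posterior_density(lo + (i + 0.5) * width, k, n) for i in range(steps))
    return total * width

def posterior_mean(k, n):
    """Mean of Beta(k + 1, n - k + 1)."""
    return (k + 1) / (n + 2)

# Hypothetical data: 3 successes in 10 trials.
print(posterior_probability(0.2, 0.5, 3, 10))
print(posterior_mean(3, 10))
```

The normalising factor (*n* + 1)·C(*n*, *k*) is the reciprocal of the Beta function B(*k* + 1, *n* − *k* + 1), so the density integrates to 1 over [0, 1].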
## History
Main article: [History of statistics § Bayesian statistics](https://en.wikipedia.org/wiki/History_of_statistics#Bayesian_statistics "History of statistics")
The term *Bayesian* refers to [Thomas Bayes](https://en.wikipedia.org/wiki/Thomas_Bayes "Thomas Bayes") (1701–1761), who proved that probabilistic limits could be placed on an unknown event.[\[65\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-65) However, it was [Pierre-Simon Laplace](https://en.wikipedia.org/wiki/Pierre-Simon_Laplace "Pierre-Simon Laplace") (1749–1827) who introduced (as Principle VI) what is now called [Bayes' theorem](https://en.wikipedia.org/wiki/Bayes%27_theorem "Bayes' theorem") and used it to address problems in [celestial mechanics](https://en.wikipedia.org/wiki/Celestial_mechanics "Celestial mechanics"), medical statistics, [reliability](https://en.wikipedia.org/wiki/Reliability_\(statistics\) "Reliability (statistics)"), and [jurisprudence](https://en.wikipedia.org/wiki/Jurisprudence "Jurisprudence").[\[66\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Stigler1986-66) Early Bayesian inference, which used uniform priors following Laplace's [principle of insufficient reason](https://en.wikipedia.org/wiki/Principle_of_insufficient_reason "Principle of insufficient reason"), was called "[inverse probability](https://en.wikipedia.org/wiki/Inverse_probability "Inverse probability")" (because it [infers](https://en.wikipedia.org/wiki/Inductive_reasoning "Inductive reasoning") backwards from observations to parameters, or from effects to causes[\[67\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Fienberg2006-67)). After the 1920s, "inverse probability" was largely supplanted by a collection of methods that came to be called [frequentist statistics](https://en.wikipedia.org/wiki/Frequentist_statistics "Frequentist statistics").[\[67\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Fienberg2006-67)
In the 20th century, the ideas of Laplace were further developed in two different directions, giving rise to *objective* and *subjective* currents in Bayesian practice. In the objective or "non-informative" current, the statistical analysis depends on only the model assumed, the data analyzed,[\[68\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Bernardo2005-68) and the method assigning the prior, which differs from one objective Bayesian practitioner to another. In the subjective or "informative" current, the specification of the prior depends on the belief (that is, propositions on which the analysis is prepared to act), which can summarize information from experts, previous studies, etc.
In the 1980s, there was a dramatic growth in research and applications of Bayesian methods, mostly attributed to the discovery of [Markov chain Monte Carlo](https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo "Markov chain Monte Carlo") methods, which removed many of the computational problems, and to an increasing interest in nonstandard, complex applications.[\[69\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Wolpert2004-69) Despite the growth of Bayesian research, most undergraduate teaching is still based on frequentist statistics.[\[70\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Bernardo2006-70) Nonetheless, Bayesian methods are widely accepted and used, for example in the field of [machine learning](https://en.wikipedia.org/wiki/Machine_learning "Machine learning").[\[71\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Bishop2007-71)
## See also
- [Bayesian approaches to brain function](https://en.wikipedia.org/wiki/Bayesian_approaches_to_brain_function "Bayesian approaches to brain function")
- [Credibility theory](https://en.wikipedia.org/wiki/Credibility_theory "Credibility theory")
- [Epistemology](https://en.wikipedia.org/wiki/Epistemology "Epistemology")
- [Free energy principle](https://en.wikipedia.org/wiki/Free_energy_principle "Free energy principle")
- [Inductive probability](https://en.wikipedia.org/wiki/Inductive_probability "Inductive probability")
- [Information field theory](https://en.wikipedia.org/wiki/Information_field_theory "Information field theory")
- [Principle of maximum entropy](https://en.wikipedia.org/wiki/Principle_of_maximum_entropy "Principle of maximum entropy")
- [Probabilistic causation](https://en.wikipedia.org/wiki/Probabilistic_causation "Probabilistic causation")
- [Probabilistic programming](https://en.wikipedia.org/wiki/Probabilistic_programming "Probabilistic programming")
## References
### Citations
1. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-1)**
["Bayesian"](https://www.merriam-webster.com/dictionary/Bayesian). *[Merriam-Webster.com Dictionary](https://en.wikipedia.org/wiki/Merriam-Webster "Merriam-Webster")*. Merriam-Webster. [OCLC](https://en.wikipedia.org/wiki/OCLC_\(identifier\) "OCLC (identifier)") [1032680871](https://search.worldcat.org/oclc/1032680871).
2. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-2)**
Griffiths, Thomas (July 24, 2024). ["Bayesian Models of Cognition"](https://oecs.mit.edu/pub/lwxmte1p/release/2).
3. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-3)**
Hacking, Ian (December 1967). "Slightly More Realistic Personal Probability". *Philosophy of Science*. **34** (4): 316. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1086/288169](https://doi.org/10.1086%2F288169). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [14344339](https://api.semanticscholar.org/CorpusID:14344339).
4. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-4)**
["Bayes' Theorem (Stanford Encyclopedia of Philosophy)"](http://plato.stanford.edu/entries/bayes-theorem/). Plato.stanford.edu. Retrieved 2014-01-05.
5. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-5)**
[van Fraassen, B.](https://en.wikipedia.org/wiki/Bas_van_Fraassen "Bas van Fraassen") (1989) *Laws and Symmetry*, Oxford University Press. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [0-19-824860-1](https://en.wikipedia.org/wiki/Special:BookSources/0-19-824860-1 "Special:BookSources/0-19-824860-1").
6. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-6)**
Gelman, Andrew; Carlin, John B.; Stern, Hal S.; Dunson, David B.; Vehtari, Aki; Rubin, Donald B. (2013). *Bayesian Data Analysis*, Third Edition. Chapman and Hall/CRC. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-1-4398-4095-5](https://en.wikipedia.org/wiki/Special:BookSources/978-1-4398-4095-5 "Special:BookSources/978-1-4398-4095-5").
7. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-deCarvalho-Geometry_7-0)**
de Carvalho, Miguel; Page, Garritt; Barney, Bradley (2019). ["On the geometry of Bayesian inference"](https://www.maths.ed.ac.uk/~mdecarv/papers/decarvalho2018.pdf) (PDF). *Bayesian Analysis*. **14** (4): 1013–1036. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1214/18-BA1112](https://doi.org/10.1214%2F18-BA1112). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [88521802](https://api.semanticscholar.org/CorpusID:88521802).
8. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Lee-GibbsSampler_8-0)**
Lee, Se Yoon (2021). "Gibbs sampler and coordinate ascent variational inference: A set-theoretical review". *Communications in Statistics – Theory and Methods*. **51** (6): 1549–1568. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[2008\.01006](https://arxiv.org/abs/2008.01006). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1080/03610926.2021.1921214](https://doi.org/10.1080%2F03610926.2021.1921214). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [220935477](https://api.semanticscholar.org/CorpusID:220935477).
9. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-9)**
Kolmogorov, A.N. (1933) \[1956\]. *Foundations of the Theory of Probability*. Chelsea Publishing Company.
10. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-10)**
Tjur, Tue (1980). [*Probability based on Radon measures*](http://archive.org/details/probabilitybased0000tjur). Internet Archive. Chichester \[Eng.\]; New York : Wiley. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-471-27824-5](https://en.wikipedia.org/wiki/Special:BookSources/978-0-471-27824-5 "Special:BookSources/978-0-471-27824-5").
11. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-11)**
Taraldsen, Gunnar; Tufto, Jarle; Lindqvist, Bo H. (2021-07-24). ["Improper priors and improper posteriors"](https://doi.org/10.1111%2Fsjos.12550). *Scandinavian Journal of Statistics*. **49** (3): 969–991. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1111/sjos.12550](https://doi.org/10.1111%2Fsjos.12550). [hdl](https://en.wikipedia.org/wiki/Hdl_\(identifier\) "Hdl (identifier)"):[11250/2984409](https://hdl.handle.net/11250%2F2984409). [ISSN](https://en.wikipedia.org/wiki/ISSN_\(identifier\) "ISSN (identifier)") [0303-6898](https://search.worldcat.org/issn/0303-6898). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [237736986](https://api.semanticscholar.org/CorpusID:237736986).
12. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-12)**
Robert, Christian P.; Casella, George (2004). *Monte Carlo Statistical Methods*. Springer. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-1-4757-4145-2](https://en.wikipedia.org/wiki/Special:BookSources/978-1-4757-4145-2 "Special:BookSources/978-1-4757-4145-2"). [OCLC](https://en.wikipedia.org/wiki/OCLC_\(identifier\) "OCLC (identifier)") [1159112760](https://search.worldcat.org/oclc/1159112760).
13. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-13)**
Freedman, DA (1963). ["On the asymptotic behavior of Bayes' estimates in the discrete case"](https://doi.org/10.1214%2Faoms%2F1177703871). *The Annals of Mathematical Statistics*. **34** (4): 1386–1403. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1214/aoms/1177703871](https://doi.org/10.1214%2Faoms%2F1177703871). [JSTOR](https://en.wikipedia.org/wiki/JSTOR_\(identifier\) "JSTOR (identifier)") [2238346](https://www.jstor.org/stable/2238346).
14. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-14)**
Freedman, DA (1965). ["On the asymptotic behavior of Bayes estimates in the discrete case II"](https://doi.org/10.1214%2Faoms%2F1177700155). *The Annals of Mathematical Statistics*. **36** (2): 454–456. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1214/aoms/1177700155](https://doi.org/10.1214%2Faoms%2F1177700155). [JSTOR](https://en.wikipedia.org/wiki/JSTOR_\(identifier\) "JSTOR (identifier)") [2238150](https://www.jstor.org/stable/2238150).
15. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-15)**
Robins, James; Wasserman, Larry (2000). "Conditioning, likelihood, and coherence: A review of some foundational concepts". *Journal of the American Statistical Association*. **95** (452): 1340–1346. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1080/01621459.2000.10474344](https://doi.org/10.1080%2F01621459.2000.10474344). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [120767108](https://api.semanticscholar.org/CorpusID:120767108).
16. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-16)**
[Sen, Pranab K.](https://en.wikipedia.org/wiki/Pranab_K._Sen "Pranab K. Sen"); Keating, J. P.; Mason, R. L. (1993). *Pitman's measure of closeness: A comparison of statistical estimators*. Philadelphia: SIAM.
17. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-17)**
Choudhuri, Nidhan; Ghosal, Subhashis; Roy, Anindya (2005-01-01). "Bayesian Methods for Function Estimation". *Handbook of Statistics*. Bayesian Thinking. Vol. 25. pp. 373–414. [CiteSeerX](https://en.wikipedia.org/wiki/CiteSeerX_\(identifier\) "CiteSeerX (identifier)") [10\.1.1.324.3052](https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.324.3052). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1016/s0169-7161(05)25013-7](https://doi.org/10.1016%2Fs0169-7161%2805%2925013-7). [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-444-51539-1](https://en.wikipedia.org/wiki/Special:BookSources/978-0-444-51539-1 "Special:BookSources/978-0-444-51539-1").
18. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-18)**
["Maximum A Posteriori (MAP) Estimation"](https://www.probabilitycourse.com/chapter9/9_1_2_MAP_estimation.php). *www.probabilitycourse.com*. Retrieved 2017-06-02.
19. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-19)**
Yu, Angela. ["Introduction to Bayesian Decision Theory"](https://web.archive.org/web/20130228060536/http://www.cogsci.ucsd.edu/~ajyu/Teaching/Tutorials/bayes_dt.pdf) (PDF). *cogsci.ucsd.edu/*. Archived from [the original](http://www.cogsci.ucsd.edu/~ajyu/Teaching/Tutorials/bayes_dt.pdf) (PDF) on 2013-02-28.
20. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-20)**
Hitchcock, David. ["Posterior Predictive Distribution Stat Slide"](http://people.stat.sc.edu/Hitchcock/stat535slidesday18.pdf) (PDF). *stat.sc.edu*.
21. ^ [***a***](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Bickel_&_Doksum_2001,_page_32_21-0) [***b***](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Bickel_&_Doksum_2001,_page_32_21-1) Bickel & Doksum (2001, p. 32)
22. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-22)**
[Kiefer, J.](https://en.wikipedia.org/wiki/Jack_Kiefer_\(mathematician\) "Jack Kiefer (mathematician)"); Schwartz R. (1965). ["Admissible Bayes Character of T²-, R²-, and Other Fully Invariant Tests for Multivariate Normal Problems"](https://doi.org/10.1214%2Faoms%2F1177700051). *Annals of Mathematical Statistics*. **36** (3): 747–770. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1214/aoms/1177700051](https://doi.org/10.1214%2Faoms%2F1177700051).
23. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-23)**
Schwartz, R. (1969). ["Invariant Proper Bayes Tests for Exponential Families"](https://doi.org/10.1214%2Faoms%2F1177697822). *Annals of Mathematical Statistics*. **40**: 270–283. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1214/aoms/1177697822](https://doi.org/10.1214%2Faoms%2F1177697822).
24. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-24)**
Hwang, J. T. & Casella, George (1982). ["Minimax Confidence Sets for the Mean of a Multivariate Normal Distribution"](https://ecommons.cornell.edu/bitstream/1813/32852/1/BU-750-M.pdf) (PDF). *Annals of Statistics*. **10** (3): 868–881. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1214/aos/1176345877](https://doi.org/10.1214%2Faos%2F1176345877).
25. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-25)**
[Lehmann, Erich](https://en.wikipedia.org/wiki/Erich_Leo_Lehmann "Erich Leo Lehmann") (1986). *Testing Statistical Hypotheses* (Second ed.). (See p. 309 of Chapter 6.7 "Admissibility", and pp. 17–18 of Chapter 1.8 "Complete Classes".)
26. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-26)**
[Le Cam, Lucien](https://en.wikipedia.org/wiki/Lucien_Le_Cam "Lucien Le Cam") (1986). *Asymptotic Methods in Statistical Decision Theory*. Springer-Verlag. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-387-96307-5](https://en.wikipedia.org/wiki/Special:BookSources/978-0-387-96307-5 "Special:BookSources/978-0-387-96307-5"). (From "Chapter 12 Posterior Distributions and Bayes Solutions", p. 324)
27. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-27)**
[Cox, D. R.](https://en.wikipedia.org/wiki/David_R._Cox "David R. Cox"); Hinkley, D.V. (1974). *Theoretical Statistics*. Chapman and Hall. p. 432. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-04-121537-3](https://en.wikipedia.org/wiki/Special:BookSources/978-0-04-121537-3 "Special:BookSources/978-0-04-121537-3").
28. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-28)**
[Cox, D. R.](https://en.wikipedia.org/wiki/David_R._Cox "David R. Cox"); Hinkley, D.V. (1974). *Theoretical Statistics*. Chapman and Hall. p. 433. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-04-121537-3](https://en.wikipedia.org/wiki/Special:BookSources/978-0-04-121537-3 "Special:BookSources/978-0-04-121537-3").
29. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-29)**
Stoica, P.; Selen, Y. (2004). "A review of information criterion rules". *IEEE Signal Processing Magazine*. **21** (4): 36–47. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1109/MSP.2004.1311138](https://doi.org/10.1109%2FMSP.2004.1311138). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [17338979](https://api.semanticscholar.org/CorpusID:17338979).
30. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-30)**
Fatermans, J.; Van Aert, S.; den Dekker, A.J. (2019). "The maximum a posteriori probability rule for atom column detection from HAADF STEM images". *Ultramicroscopy*. **201**: 81–91. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[1902\.05809](https://arxiv.org/abs/1902.05809). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1016/j.ultramic.2019.02.003](https://doi.org/10.1016%2Fj.ultramic.2019.02.003). [PMID](https://en.wikipedia.org/wiki/PMID_\(identifier\) "PMID (identifier)") [30991277](https://pubmed.ncbi.nlm.nih.gov/30991277). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [104419861](https://api.semanticscholar.org/CorpusID:104419861).
31. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-31)** Bessiere, P., Mazer, E., Ahuactzin, J. M., & Mekhnacha, K. (2013). Bayesian Programming (1 edition) Chapman and Hall/CRC.
32. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-32)**
Daniel Roy (2015). ["Probabilistic Programming"](https://web.archive.org/web/20160110035042/http://probabilistic-programming.org/wiki/Home). *probabilistic-programming.org*. Archived from [the original](http://probabilistic-programming.org/wiki/Home) on 2016-01-10. Retrieved 2020-01-02.
33. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-33)**
Ghahramani, Z (2015). ["Probabilistic machine learning and artificial intelligence"](https://www.repository.cam.ac.uk/handle/1810/248538). *Nature*. **521** (7553): 452–459. [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2015Natur.521..452G](https://ui.adsabs.harvard.edu/abs/2015Natur.521..452G). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1038/nature14541](https://doi.org/10.1038%2Fnature14541). [PMID](https://en.wikipedia.org/wiki/PMID_\(identifier\) "PMID (identifier)") [26017444](https://pubmed.ncbi.nlm.nih.gov/26017444). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [216356](https://api.semanticscholar.org/CorpusID:216356).
34. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-34)**
Fienberg, Stephen E. (2006-03-01). ["When did Bayesian inference become "Bayesian"?"](https://doi.org/10.1214%2F06-BA101). *Bayesian Analysis*. **1** (1). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1214/06-BA101](https://doi.org/10.1214%2F06-BA101).
35. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-35)**
Jim Albert (2009). *Bayesian Computation with R, Second edition*. New York, Dordrecht, etc.: Springer. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-387-92297-3](https://en.wikipedia.org/wiki/Special:BookSources/978-0-387-92297-3 "Special:BookSources/978-0-387-92297-3").
36. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-36)**
Rathmanner, Samuel; Hutter, Marcus; Ormerod, Thomas C (2011). ["A Philosophical Treatise of Universal Induction"](https://doi.org/10.3390%2Fe13061076). *Entropy*. **13** (6): 1076–1136\. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[1105\.5721](https://arxiv.org/abs/1105.5721). [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2011Entrp..13.1076R](https://ui.adsabs.harvard.edu/abs/2011Entrp..13.1076R). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.3390/e13061076](https://doi.org/10.3390%2Fe13061076). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [2499910](https://api.semanticscholar.org/CorpusID:2499910).
37. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-37)**
Hutter, Marcus; He, Yang-Hui; Ormerod, Thomas C (2007). "On Universal Prediction and Bayesian Confirmation". *Theoretical Computer Science*. **384** (2007): 33–48\. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[0709\.1516](https://arxiv.org/abs/0709.1516). [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2007arXiv0709.1516H](https://ui.adsabs.harvard.edu/abs/2007arXiv0709.1516H). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1016/j.tcs.2007.05.016](https://doi.org/10.1016%2Fj.tcs.2007.05.016). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [1500830](https://api.semanticscholar.org/CorpusID:1500830).
38. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-38)**
Gács, Peter; Vitányi, Paul M. B. (2 December 2010). "Raymond J. Solomonoff 1926-2009". [CiteSeerX](https://en.wikipedia.org/wiki/CiteSeerX_\(identifier\) "CiteSeerX (identifier)") [10\.1.1.186.8268](https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.186.8268).
39. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-:edgr_39-0)** Robinson, Mark D & McCarthy, Davis J & Smyth, Gordon K edgeR: a Bioconductor package for differential expression analysis of digital gene expression data, Bioinformatics.
40. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-40)**
["CIRI"](https://ciri.stanford.edu/). *ciri.stanford.edu*. Retrieved 2019-08-11.
41. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-41)**
Kurtz, David M.; Esfahani, Mohammad S.; Scherer, Florian; Soo, Joanne; Jin, Michael C.; Liu, Chih Long; Newman, Aaron M.; Dührsen, Ulrich; Hüttmann, Andreas (2019-07-25). ["Dynamic Risk Profiling Using Serial Tumor Biomarkers for Personalized Outcome Prediction"](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7380118). *Cell*. **178** (3): 699–713.e19. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1016/j.cell.2019.06.011](https://doi.org/10.1016%2Fj.cell.2019.06.011). [ISSN](https://en.wikipedia.org/wiki/ISSN_\(identifier\) "ISSN (identifier)") [1097-4172](https://search.worldcat.org/issn/1097-4172). [PMC](https://en.wikipedia.org/wiki/PMC_\(identifier\) "PMC (identifier)") [7380118](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7380118). [PMID](https://en.wikipedia.org/wiki/PMID_\(identifier\) "PMID (identifier)") [31280963](https://pubmed.ncbi.nlm.nih.gov/31280963).
42. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-42)**
Trotta, Roberto (2017). "Bayesian Methods in Cosmology". [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[1701\.01467](https://arxiv.org/abs/1701.01467) \[[astro-ph.CO](https://arxiv.org/archive/astro-ph.CO)\].
43. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-43)**
Staicova, Denitsa (2025). ["Modern Bayesian Sampling Methods for Cosmological Inference: A Comparative Study"](https://doi.org/10.3390%2Funiverse11020068). *Universe*. **11** (2): 68. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[2501\.06022](https://arxiv.org/abs/2501.06022). [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2025Univ...11...68S](https://ui.adsabs.harvard.edu/abs/2025Univ...11...68S). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.3390/universe11020068](https://doi.org/10.3390%2Funiverse11020068).
44. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-44)**
Madhusudhan, Nikku; Constantinou, Savvas; Holmberg, Måns; Sarkar, Subhajit; Piette, Anjali A. A.; Moses, Julianne I. (2025). ["New Constraints on DMS and DMDS in the Atmosphere of K2-18 b from JWST MIRI"](https://doi.org/10.3847%2F2041-8213%2Fadc1c8). *The Astrophysical Journal*. **983** (2): L40. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[2504\.12267](https://arxiv.org/abs/2504.12267). [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2025ApJ...983L..40M](https://ui.adsabs.harvard.edu/abs/2025ApJ...983L..40M). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.3847/2041-8213/adc1c8](https://doi.org/10.3847%2F2041-8213%2Fadc1c8).
45. ^ [***a***](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-ArXiv_1807_45-0) [***b***](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-ArXiv_1807_45-1)
Aghanim, N.; et al. (2020). "*Planck* 2018 results". *Astronomy & Astrophysics*. **641**: A6. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[1807\.06209](https://arxiv.org/abs/1807.06209). [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2020A\&A...641A...6P](https://ui.adsabs.harvard.edu/abs/2020A&A...641A...6P). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1051/0004-6361/201833910](https://doi.org/10.1051%2F0004-6361%2F201833910).
46. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-46)**
Anstey, Dominic; De Lera Acedo, Eloy; Handley, Will (2021). ["A general Bayesian framework for foreground modelling and chromaticity correction for global 21 cm experiments"](https://doi.org/10.1093%2Fmnras%2Fstab1765). *Monthly Notices of the Royal Astronomical Society*. **506** (2): 2041–2058\. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[2010\.09644](https://arxiv.org/abs/2010.09644). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1093/mnras/stab1765](https://doi.org/10.1093%2Fmnras%2Fstab1765).
47. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-47)**
Lewis, Antony; Bridle, Sarah (2002). "Cosmological parameters from CMB and other data: A Monte Carlo approach". *Physical Review D*. **66** (10) 103511. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[astro-ph/0205436](https://arxiv.org/abs/astro-ph/0205436). [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2002PhRvD..66j3511L](https://ui.adsabs.harvard.edu/abs/2002PhRvD..66j3511L). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1103/PhysRevD.66.103511](https://doi.org/10.1103%2FPhysRevD.66.103511).
48. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-48)**
["Cobaya, a code for Bayesian analysis in Cosmology - cobaya 3.5.7 documentation"](https://cobaya.readthedocs.io/en/latest/index.html). *cobaya.readthedocs.io*. Retrieved 2025-07-23.
49. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-49)**
["CAMB - Code for Anisotropies in the Microwave Background (CAMB) 1.6.1 documentation"](https://camb.readthedocs.io/en/latest/). *camb.readthedocs.io*. Retrieved 2025-07-23.
50. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-50)**
Lesgourgues, Julien (2011). "The Cosmic Linear Anisotropy Solving System (CLASS) I: Overview". [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[1104\.2932](https://arxiv.org/abs/1104.2932) \[[astro-ph.IM](https://arxiv.org/archive/astro-ph.IM)\].
51. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-51)**
Hill, J. Colin; McDonough, Evan; Toomey, Michael W.; Alexander, Stephon (2020). "Early dark energy does not restore cosmological concordance". *Physical Review D*. **102** (4) 043507. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[2003\.07355](https://arxiv.org/abs/2003.07355). [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2020PhRvD.102d3507H](https://ui.adsabs.harvard.edu/abs/2020PhRvD.102d3507H). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1103/PhysRevD.102.043507](https://doi.org/10.1103%2FPhysRevD.102.043507).
52. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-52)**
Trotta, Roberto (2008). "Bayes in the sky: Bayesian inference and model selection in cosmology". *Contemporary Physics*. **49** (2): 71–104\. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[0803\.4089](https://arxiv.org/abs/0803.4089). [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2008ConPh..49...71T](https://ui.adsabs.harvard.edu/abs/2008ConPh..49...71T). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1080/00107510802066753](https://doi.org/10.1080%2F00107510802066753).
53. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-53)** Dawid, A. P. and Mortera, J. (1996) "Coherent Analysis of Forensic Identification Evidence". *[Journal of the Royal Statistical Society](https://en.wikipedia.org/wiki/Journal_of_the_Royal_Statistical_Society "Journal of the Royal Statistical Society")*, Series B, 58, 425–443.
54. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-54)** Foreman, L. A.; Smith, A. F. M., and Evett, I. W. (1997). "Bayesian analysis of deoxyribonucleic acid profiling data in forensic identification applications (with discussion)". *Journal of the Royal Statistical Society*, Series A, 160, 429–469.
55. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-55)**
Robertson, B. and Vignaux, G. A. (1995) *Interpreting Evidence: Evaluating Forensic Science in the Courtroom*. John Wiley and Sons. Chichester. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-471-96026-3](https://en.wikipedia.org/wiki/Special:BookSources/978-0-471-96026-3 "Special:BookSources/978-0-471-96026-3").
56. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-56)** Dawid, A. P. (2001) [Bayes' Theorem and Weighing Evidence by Juries](http://128.40.111.250/evidence/content/dawid-paper.pdf). [Archived](https://web.archive.org/web/20150701112146/http://128.40.111.250/evidence/content/dawid-paper.pdf) 2015-07-01 at the [Wayback Machine](https://en.wikipedia.org/wiki/Wayback_Machine "Wayback Machine")
57. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-57)** Gardner-Medwin, A. (2005) "What Probability Should the Jury Address?". *[Significance](https://en.wikipedia.org/wiki/Significance_\(journal\) "Significance (journal)")*, 2 (1), March 2005.
58. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-58)**
Miller, David (1994). [*Critical Rationalism*](https://books.google.com/books?id=bh_yCgAAQBAJ). Chicago: Open Court. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-8126-9197-9](https://en.wikipedia.org/wiki/Special:BookSources/978-0-8126-9197-9 "Special:BookSources/978-0-8126-9197-9").
59. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-59)** Howson & Urbach (2005), Jaynes (2003)
60. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Cai_et_al._2009_60-0)**
Cai, X.Q.; Wu, X.Y.; Zhou, X. (2009). "Stochastic scheduling subject to breakdown-repeat breakdowns with incomplete information". *Operations Research*. **57** (5): 1236–1249\. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1287/opre.1080.0660](https://doi.org/10.1287%2Fopre.1080.0660).
61. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-61)**
Ogle, Kiona; Tucker, Colin; Cable, Jessica M. (2014-01-01). "Beyond simple linear mixing models: process-based isotope partitioning of ecological processes". *Ecological Applications*. **24** (1): 181–195\. [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2014EcoAp..24..181O](https://ui.adsabs.harvard.edu/abs/2014EcoAp..24..181O). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1890/1051-0761-24.1.181](https://doi.org/10.1890%2F1051-0761-24.1.181). [ISSN](https://en.wikipedia.org/wiki/ISSN_\(identifier\) "ISSN (identifier)") [1939-5582](https://search.worldcat.org/issn/1939-5582). [PMID](https://en.wikipedia.org/wiki/PMID_\(identifier\) "PMID (identifier)") [24640543](https://pubmed.ncbi.nlm.nih.gov/24640543).
62. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-62)**
Evaristo, Jaivime; McDonnell, Jeffrey J.; Scholl, Martha A.; Bruijnzeel, L. Adrian; Chun, Kwok P. (2016-01-01). "Insights into plant water uptake from xylem-water isotope measurements in two tropical catchments with contrasting moisture conditions". *Hydrological Processes*. **30** (18): 3210–3227\. [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2016HyPr...30.3210E](https://ui.adsabs.harvard.edu/abs/2016HyPr...30.3210E). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1002/hyp.10841](https://doi.org/10.1002%2Fhyp.10841). [ISSN](https://en.wikipedia.org/wiki/ISSN_\(identifier\) "ISSN (identifier)") [1099-1085](https://search.worldcat.org/issn/1099-1085). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [131588159](https://api.semanticscholar.org/CorpusID:131588159).
63. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-63)**
Gupta, Ankur; Rawlings, James B. (April 2014). ["Comparison of Parameter Estimation Methods in Stochastic Chemical Kinetic Models: Examples in Systems Biology"](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4946376). *AIChE Journal*. **60** (4): 1253–1268\. [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2014AIChE..60.1253G](https://ui.adsabs.harvard.edu/abs/2014AIChE..60.1253G). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1002/aic.14409](https://doi.org/10.1002%2Faic.14409). [ISSN](https://en.wikipedia.org/wiki/ISSN_\(identifier\) "ISSN (identifier)") [0001-1541](https://search.worldcat.org/issn/0001-1541). [PMC](https://en.wikipedia.org/wiki/PMC_\(identifier\) "PMC (identifier)") [4946376](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4946376). [PMID](https://en.wikipedia.org/wiki/PMID_\(identifier\) "PMID (identifier)") [27429455](https://pubmed.ncbi.nlm.nih.gov/27429455).
64. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-64)**
Schütz, N.; Holschneider, M. (2011). "Detection of trend changes in time series using Bayesian inference". *Physical Review E*. **84** (2) 021120. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[1104\.3448](https://arxiv.org/abs/1104.3448). [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2011PhRvE..84b1120S](https://ui.adsabs.harvard.edu/abs/2011PhRvE..84b1120S). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1103/PhysRevE.84.021120](https://doi.org/10.1103%2FPhysRevE.84.021120). [PMID](https://en.wikipedia.org/wiki/PMID_\(identifier\) "PMID (identifier)") [21928962](https://pubmed.ncbi.nlm.nih.gov/21928962). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [11460968](https://api.semanticscholar.org/CorpusID:11460968).
65. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-65)**
Stigler, Stephen (1982). "Thomas Bayes's Bayesian Inference". *Journal of the Royal Statistical Society*. **145** (2): 250–58\. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.2307/2981538](https://doi.org/10.2307%2F2981538). [JSTOR](https://en.wikipedia.org/wiki/JSTOR_\(identifier\) "JSTOR (identifier)") [2981538](https://www.jstor.org/stable/2981538).
66. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Stigler1986_66-0)**
Stigler, Stephen M. (1986). ["Chapter 3"](https://archive.org/details/historyofstatist00stig). *The History of Statistics*. Harvard University Press. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-674-40340-6](https://en.wikipedia.org/wiki/Special:BookSources/978-0-674-40340-6 "Special:BookSources/978-0-674-40340-6").
67. ^ [***a***](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Fienberg2006_67-0) [***b***](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Fienberg2006_67-1)
Fienberg, Stephen E. (2006). ["When did Bayesian Inference Become 'Bayesian'?"](https://doi.org/10.1214%2F06-ba101). *Bayesian Analysis*. **1** (1): 1–40 \[p. 5\]. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1214/06-ba101](https://doi.org/10.1214%2F06-ba101).
68. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Bernardo2005_68-0)**
[Bernardo, José-Miguel](https://en.wikipedia.org/wiki/Jos%C3%A9-Miguel_Bernardo "José-Miguel Bernardo") (2005). "Reference analysis". *Handbook of statistics*. Vol. 25. pp. 17–90\.
69. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Wolpert2004_69-0)**
Wolpert, R. L. (2004). "A Conversation with James O. Berger". *Statistical Science*. **19** (1): 205–218\. [CiteSeerX](https://en.wikipedia.org/wiki/CiteSeerX_\(identifier\) "CiteSeerX (identifier)") [10\.1.1.71.6112](https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.71.6112). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1214/088342304000000053](https://doi.org/10.1214%2F088342304000000053). [MR](https://en.wikipedia.org/wiki/MR_\(identifier\) "MR (identifier)") [2082155](https://mathscinet.ams.org/mathscinet-getitem?mr=2082155). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [120094454](https://api.semanticscholar.org/CorpusID:120094454).
70. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Bernardo2006_70-0)**
[Bernardo, José M.](https://en.wikipedia.org/wiki/Jos%C3%A9-Miguel_Bernardo "José-Miguel Bernardo") (2006). ["A Bayesian mathematical statistics primer"](http://www.ime.usp.br/~abe/ICOTS7/Proceedings/PDFs/InvitedPapers/3I2_BERN.pdf) (PDF). *Icots-7*.
71. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Bishop2007_71-0)**
Bishop, C. M. (2007). *Pattern Recognition and Machine Learning*. New York: Springer. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-387-31073-2](https://en.wikipedia.org/wiki/Special:BookSources/978-0-387-31073-2 "Special:BookSources/978-0-387-31073-2").
### Sources
- Aster, Richard; Borchers, Brian, and Thurber, Clifford (2012). *Parameter Estimation and Inverse Problems*, Second Edition, Elsevier. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [0123850487](https://en.wikipedia.org/wiki/Special:BookSources/0123850487 "Special:BookSources/0123850487"), [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0123850485](https://en.wikipedia.org/wiki/Special:BookSources/978-0123850485 "Special:BookSources/978-0123850485")
- Bickel, Peter J. & Doksum, Kjell A. (2001). *Mathematical Statistics, Volume 1: Basic and Selected Topics* (Second (updated printing 2007) ed.). Pearson Prentice–Hall. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-13-850363-5](https://en.wikipedia.org/wiki/Special:BookSources/978-0-13-850363-5 "Special:BookSources/978-0-13-850363-5").
- [Box, G. E. P.](https://en.wikipedia.org/wiki/George_E._P._Box "George E. P. Box") and [Tiao, G. C.](https://en.wikipedia.org/wiki/George_Tiao "George Tiao") (1973). *Bayesian Inference in Statistical Analysis*, Wiley, [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [0-471-57428-7](https://en.wikipedia.org/wiki/Special:BookSources/0-471-57428-7 "Special:BookSources/0-471-57428-7")
- Edwards, Ward (1968). "Conservatism in Human Information Processing". In Kleinmuntz, B. (ed.). *Formal Representation of Human Judgment*. Wiley.
- Edwards, Ward (1982). [Daniel Kahneman](https://en.wikipedia.org/wiki/Daniel_Kahneman "Daniel Kahneman"); [Paul Slovic](https://en.wikipedia.org/wiki/Paul_Slovic "Paul Slovic"); [Amos Tversky](https://en.wikipedia.org/wiki/Amos_Tversky "Amos Tversky") (eds.). "Judgment under uncertainty: Heuristics and biases". *Science*. **185** (4157): 1124–1131\. [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[1974Sci...185.1124T](https://ui.adsabs.harvard.edu/abs/1974Sci...185.1124T). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1126/science.185.4157.1124](https://doi.org/10.1126%2Fscience.185.4157.1124). [PMID](https://en.wikipedia.org/wiki/PMID_\(identifier\) "PMID (identifier)") [17835457](https://pubmed.ncbi.nlm.nih.gov/17835457). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [143452957](https://api.semanticscholar.org/CorpusID:143452957). "Chapter: Conservatism in Human Information Processing (excerpted)"
- [Jaynes E. T.](https://en.wikipedia.org/wiki/Edwin_Thompson_Jaynes "Edwin Thompson Jaynes") (2003) *Probability Theory: The Logic of Science*, CUP. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-521-59271-0](https://en.wikipedia.org/wiki/Special:BookSources/978-0-521-59271-0 "Special:BookSources/978-0-521-59271-0") ([Link to Fragmentary Edition of March 1996](http://www-biba.inrialpes.fr/Jaynes/prob.html)).
- [Howson, C.](https://en.wikipedia.org/wiki/Colin_Howson "Colin Howson") & Urbach, P. (2005). *Scientific Reasoning: the Bayesian Approach* (3rd ed.). [Open Court Publishing Company](https://en.wikipedia.org/wiki/Open_Court_Publishing_Company "Open Court Publishing Company"). [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-8126-9578-6](https://en.wikipedia.org/wiki/Special:BookSources/978-0-8126-9578-6 "Special:BookSources/978-0-8126-9578-6").
- Phillips, L. D.; Edwards, Ward (October 2008). "Chapter 6: Conservatism in a Simple Probability Inference Task (*Journal of Experimental Psychology* (1966) 72: 346-354)". In Jie W. Weiss; David J. Weiss (eds.). *A Science of Decision Making: The Legacy of Ward Edwards*. Oxford University Press. p. 536. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-19-532298-9](https://en.wikipedia.org/wiki/Special:BookSources/978-0-19-532298-9 "Special:BookSources/978-0-19-532298-9").
## Further reading
- For a full account of the history of Bayesian statistics and the debates with frequentist approaches, see Vallverdu, Jordi (2016). *Bayesians Versus Frequentists: A Philosophical Debate on Statistical Reasoning*. New York: Springer. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-3-662-48638-2](https://en.wikipedia.org/wiki/Special:BookSources/978-3-662-48638-2 "Special:BookSources/978-3-662-48638-2").
- [Clayton, Aubrey](https://en.wikipedia.org/w/index.php?title=Aubrey_Clayton&action=edit&redlink=1 "Aubrey Clayton (page does not exist)") (August 2021). [*Bernoulli's Fallacy: Statistical Illogic and the Crisis of Modern Science*](https://cup.columbia.edu/book/bernoullis-fallacy/9780231199940). Columbia University Press. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-231-55335-3](https://en.wikipedia.org/wiki/Special:BookSources/978-0-231-55335-3 "Special:BookSources/978-0-231-55335-3").
### Elementary
The following books are listed in ascending order of probabilistic sophistication:
- Stone, JV (2013), "Bayes' Rule: A Tutorial Introduction to Bayesian Analysis", [Download first chapter here](http://jim-stone.staff.shef.ac.uk/BookBayes2012/BayesRuleBookMain.html), Sebtel Press, England.
- [Dennis V. Lindley](https://en.wikipedia.org/wiki/Dennis_V._Lindley "Dennis V. Lindley") (2013). *Understanding Uncertainty, Revised Edition* (2nd ed.). John Wiley. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-1-118-65012-7](https://en.wikipedia.org/wiki/Special:BookSources/978-1-118-65012-7 "Special:BookSources/978-1-118-65012-7").
- [Colin Howson](https://en.wikipedia.org/wiki/Colin_Howson "Colin Howson") & Peter Urbach (2005). *Scientific Reasoning: The Bayesian Approach* (3rd ed.). [Open Court Publishing Company](https://en.wikipedia.org/wiki/Open_Court_Publishing_Company "Open Court Publishing Company"). [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-8126-9578-6](https://en.wikipedia.org/wiki/Special:BookSources/978-0-8126-9578-6 "Special:BookSources/978-0-8126-9578-6").
- Berry, Donald A. (1996). *Statistics: A Bayesian Perspective*. Duxbury. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-534-23476-8](https://en.wikipedia.org/wiki/Special:BookSources/978-0-534-23476-8 "Special:BookSources/978-0-534-23476-8").
- [Morris H. DeGroot](https://en.wikipedia.org/wiki/Morris_H._DeGroot "Morris H. DeGroot") & Mark J. Schervish (2002). [*Probability and Statistics*](https://archive.org/details/probabilitystati00degr_0) (third ed.). Addison-Wesley. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-201-52488-8](https://en.wikipedia.org/wiki/Special:BookSources/978-0-201-52488-8 "Special:BookSources/978-0-201-52488-8").
- Bolstad, William M. (2007) *Introduction to Bayesian Statistics*: Second Edition, John Wiley [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [0-471-27020-2](https://en.wikipedia.org/wiki/Special:BookSources/0-471-27020-2 "Special:BookSources/0-471-27020-2")
- Winkler, Robert L (2003). *Introduction to Bayesian Inference and Decision* (2nd ed.). Probabilistic. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-9647938-4-2](https://en.wikipedia.org/wiki/Special:BookSources/978-0-9647938-4-2 "Special:BookSources/978-0-9647938-4-2"). Updated classic textbook. Bayesian theory clearly presented.
- Lee, Peter M. *Bayesian Statistics: An Introduction*. Fourth Edition (2012), John Wiley [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-1-1183-3257-3](https://en.wikipedia.org/wiki/Special:BookSources/978-1-1183-3257-3 "Special:BookSources/978-1-1183-3257-3")
- Carlin, Bradley P. & Louis, Thomas A. (2008). *Bayesian Methods for Data Analysis, Third Edition*. Boca Raton, FL: Chapman and Hall/CRC. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-1-58488-697-6](https://en.wikipedia.org/wiki/Special:BookSources/978-1-58488-697-6 "Special:BookSources/978-1-58488-697-6").
- [Gelman, Andrew](https://en.wikipedia.org/wiki/Andrew_Gelman "Andrew Gelman"); Carlin, John B.; Stern, Hal S.; Dunson, David B.; Vehtari, Aki; [Rubin, Donald B.](https://en.wikipedia.org/wiki/Donald_Rubin "Donald Rubin") (2013). *Bayesian Data Analysis, Third Edition*. Chapman and Hall/CRC. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-1-4398-4095-5](https://en.wikipedia.org/wiki/Special:BookSources/978-1-4398-4095-5 "Special:BookSources/978-1-4398-4095-5").
### Intermediate or advanced
- [Berger, James O](https://en.wikipedia.org/wiki/James_Berger_\(statistician\) "James Berger (statistician)") (1985). *Statistical Decision Theory and Bayesian Analysis*. Springer Series in Statistics (Second ed.). Springer-Verlag. [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[1985sdtb.book.....B](https://ui.adsabs.harvard.edu/abs/1985sdtb.book.....B). [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-387-96098-2](https://en.wikipedia.org/wiki/Special:BookSources/978-0-387-96098-2 "Special:BookSources/978-0-387-96098-2").
- [Bernardo, José M.](https://en.wikipedia.org/wiki/Jos%C3%A9-Miguel_Bernardo "José-Miguel Bernardo"); [Smith, Adrian F. M.](https://en.wikipedia.org/wiki/Adrian_Smith_\(statistician\) "Adrian Smith (statistician)") (1994). *Bayesian Theory*. Wiley.
- [DeGroot, Morris H.](https://en.wikipedia.org/wiki/Morris_H._DeGroot "Morris H. DeGroot"), *Optimal Statistical Decisions*. Wiley Classics Library. 2004. (Originally published (1970) by McGraw-Hill.) [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [0-471-68029-X](https://en.wikipedia.org/wiki/Special:BookSources/0-471-68029-X "Special:BookSources/0-471-68029-X").
- Schervish, Mark J. (1995). *Theory of statistics*. Springer-Verlag. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-387-94546-0](https://en.wikipedia.org/wiki/Special:BookSources/978-0-387-94546-0 "Special:BookSources/978-0-387-94546-0").
- Jaynes, E. T. (1998). [*Probability Theory: The Logic of Science*](http://www-biba.inrialpes.fr/Jaynes/prob.html).
- O'Hagan, A. and Forster, J. (2003). *Kendall's Advanced Theory of Statistics*, Volume 2B: *Bayesian Inference*. Arnold, New York. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [0-340-52922-9](https://en.wikipedia.org/wiki/Special:BookSources/0-340-52922-9 "Special:BookSources/0-340-52922-9").
- Robert, Christian P (2007). *The Bayesian Choice: From Decision-Theoretic Foundations to Computational Implementation* (paperback ed.). Springer. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-387-71598-8](https://en.wikipedia.org/wiki/Special:BookSources/978-0-387-71598-8 "Special:BookSources/978-0-387-71598-8").
- [Pearl, Judea](https://en.wikipedia.org/wiki/Judea_Pearl "Judea Pearl"). (1988). *Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference*, San Mateo, CA: Morgan Kaufmann.
- Pierre Bessière et al. (2013). "[Bayesian Programming](http://www.crcpress.com/product/isbn/9781439880326)". CRC Press. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [9781439880326](https://en.wikipedia.org/wiki/Special:BookSources/9781439880326 "Special:BookSources/9781439880326")
- Francisco J. Samaniego (2010). "A Comparison of the Bayesian and Frequentist Approaches to Estimation". Springer. New York, [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-1-4419-5940-9](https://en.wikipedia.org/wiki/Special:BookSources/978-1-4419-5940-9 "Special:BookSources/978-1-4419-5940-9")
## External links
- ["Bayesian approach to statistical problems"](https://www.encyclopediaofmath.org/index.php?title=Bayesian_approach_to_statistical_problems), *[Encyclopedia of Mathematics](https://en.wikipedia.org/wiki/Encyclopedia_of_Mathematics "Encyclopedia of Mathematics")*, [EMS Press](https://en.wikipedia.org/wiki/European_Mathematical_Society "European Mathematical Society"), 2001 \[1994\]
- [Bayesian Statistics](http://www.scholarpedia.org/article/Bayesian_statistics) from Scholarpedia.
- [Introduction to Bayesian probability](http://www.dcs.qmw.ac.uk/~norman/BBNs/BBNs.htm) from Queen Mary University of London
- [Mathematical Notes on Bayesian Statistics and Markov Chain Monte Carlo](http://webuser.bus.umich.edu/plenk/downloads.htm)
- [Bayesian reading list](http://cocosci.berkeley.edu/tom/bayes.html) [Archived](https://web.archive.org/web/20110625052506/http://cocosci.berkeley.edu/tom/bayes.html) 2011-06-25 at the [Wayback Machine](https://en.wikipedia.org/wiki/Wayback_Machine "Wayback Machine"), categorized and annotated by [Tom Griffiths](https://web.archive.org/web/20060711151352/http://psychology.berkeley.edu/faculty/profiles/tgriffiths.html)
- A. Hajek and S. Hartmann: [Bayesian Epistemology](https://web.archive.org/web/20110728055439/http://stephanhartmann.org/HajekHartmann_BayesEpist.pdf), in: J. Dancy et al. (eds.), A Companion to Epistemology. Oxford: Blackwell 2010, 93–106.
- S. Hartmann and J. Sprenger: [Bayesian Epistemology](https://web.archive.org/web/20110728055519/http://stephanhartmann.org/HartmannSprenger_BayesEpis.pdf), in: S. Bernecker and D. Pritchard (eds.), Routledge Companion to Epistemology. London: Routledge 2010, 609–620.
- [*Stanford Encyclopedia of Philosophy*: "Inductive Logic"](http://plato.stanford.edu/entries/logic-inductive/)
- [Bayesian Confirmation Theory](https://web.archive.org/web/20150905093734/http://faculty-staff.ou.edu/H/James.A.Hawthorne-1/Hawthorne--Bayesian_Confirmation_Theory.pdf) (PDF)
- [What is Bayesian Learning?](http://www.faqs.org/faqs/ai-faq/neural-nets/part3/section-7.html)
- [*Data, Uncertainty and Inference*](https://causascientia.org/math_stat/DataUnkInf.html): informal introduction with many examples, ebook (PDF) freely available at [causaScientia](https://causascientia.org/)
| [v](https://en.wikipedia.org/wiki/Template:Statistics "Template:Statistics") [t](https://en.wikipedia.org/wiki/Template_talk:Statistics "Template talk:Statistics") [e](https://en.wikipedia.org/wiki/Special:EditPage/Template:Statistics "Special:EditPage/Template:Statistics")[Statistics](https://en.wikipedia.org/wiki/Statistics "Statistics") | |
|---|---|
| [Outline](https://en.wikipedia.org/wiki/Outline_of_statistics "Outline of statistics") [Index](https://en.wikipedia.org/wiki/List_of_statistics_articles "List of statistics articles") | |
| [Descriptive statistics](https://en.wikipedia.org/wiki/Descriptive_statistics "Descriptive statistics") | |
| | |
| [Continuous data](https://en.wikipedia.org/wiki/Continuous_probability_distribution "Continuous probability distribution") | |
| | |
| [Center](https://en.wikipedia.org/wiki/Central_tendency "Central tendency") | [Mean](https://en.wikipedia.org/wiki/Mean "Mean") [Arithmetic](https://en.wikipedia.org/wiki/Arithmetic_mean "Arithmetic mean") [Arithmetic-Geometric](https://en.wikipedia.org/wiki/Arithmetic%E2%80%93geometric_mean "Arithmeticâgeometric mean") [Contraharmonic](https://en.wikipedia.org/wiki/Contraharmonic_mean "Contraharmonic mean") [Cubic](https://en.wikipedia.org/wiki/Cubic_mean "Cubic mean") [Generalized/power](https://en.wikipedia.org/wiki/Generalized_mean "Generalized mean") [Geometric](https://en.wikipedia.org/wiki/Geometric_mean "Geometric mean") [Harmonic](https://en.wikipedia.org/wiki/Harmonic_mean "Harmonic mean") [Heronian](https://en.wikipedia.org/wiki/Heronian_mean "Heronian mean") [Heinz](https://en.wikipedia.org/wiki/Heinz_mean "Heinz mean") [Lehmer](https://en.wikipedia.org/wiki/Lehmer_mean "Lehmer mean") [Median](https://en.wikipedia.org/wiki/Median "Median") [Mode](https://en.wikipedia.org/wiki/Mode_\(statistics\) "Mode (statistics)") |
| [Dispersion](https://en.wikipedia.org/wiki/Statistical_dispersion "Statistical dispersion") | [Average absolute deviation](https://en.wikipedia.org/wiki/Average_absolute_deviation "Average absolute deviation") [Coefficient of variation](https://en.wikipedia.org/wiki/Coefficient_of_variation "Coefficient of variation") [Interquartile range](https://en.wikipedia.org/wiki/Interquartile_range "Interquartile range") [Percentile](https://en.wikipedia.org/wiki/Percentile "Percentile") [Range](https://en.wikipedia.org/wiki/Range_\(statistics\) "Range (statistics)") [Standard deviation](https://en.wikipedia.org/wiki/Standard_deviation "Standard deviation") [Variance](https://en.wikipedia.org/wiki/Variance#Sample_variance "Variance") |
| [Shape](https://en.wikipedia.org/wiki/Shape_of_the_distribution "Shape of the distribution") | [Central limit theorem](https://en.wikipedia.org/wiki/Central_limit_theorem "Central limit theorem") [Moments](https://en.wikipedia.org/wiki/Moment_\(mathematics\) "Moment (mathematics)") [Kurtosis](https://en.wikipedia.org/wiki/Kurtosis "Kurtosis") [L-moments](https://en.wikipedia.org/wiki/L-moment "L-moment") [Skewness](https://en.wikipedia.org/wiki/Skewness "Skewness") |
| [Count data](https://en.wikipedia.org/wiki/Count_data "Count data") | [Index of dispersion](https://en.wikipedia.org/wiki/Index_of_dispersion "Index of dispersion") |
| Summary tables | [Contingency table](https://en.wikipedia.org/wiki/Contingency_table "Contingency table") [Frequency distribution](https://en.wikipedia.org/wiki/Frequency_distribution "Frequency distribution") [Grouped data](https://en.wikipedia.org/wiki/Grouped_data "Grouped data") |
| [Dependence](https://en.wikipedia.org/wiki/Correlation_and_dependence "Correlation and dependence") | [Partial correlation](https://en.wikipedia.org/wiki/Partial_correlation "Partial correlation") [Pearson product-moment correlation](https://en.wikipedia.org/wiki/Pearson_correlation_coefficient "Pearson correlation coefficient") [Rank correlation](https://en.wikipedia.org/wiki/Rank_correlation "Rank correlation") [Kendall's Ï](https://en.wikipedia.org/wiki/Kendall_rank_correlation_coefficient "Kendall rank correlation coefficient") [Spearman's Ï](https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient "Spearman's rank correlation coefficient") [Scatter plot](https://en.wikipedia.org/wiki/Scatter_plot "Scatter plot") |
| [Graphics](https://en.wikipedia.org/wiki/Statistical_graphics "Statistical graphics") | [Bar chart](https://en.wikipedia.org/wiki/Bar_chart "Bar chart") [Biplot](https://en.wikipedia.org/wiki/Biplot "Biplot") [Box plot](https://en.wikipedia.org/wiki/Box_plot "Box plot") [Control chart](https://en.wikipedia.org/wiki/Control_chart "Control chart") [Correlogram](https://en.wikipedia.org/wiki/Correlogram "Correlogram") [Fan chart](https://en.wikipedia.org/wiki/Fan_chart_\(statistics\) "Fan chart (statistics)") [Forest plot](https://en.wikipedia.org/wiki/Forest_plot "Forest plot") [Histogram](https://en.wikipedia.org/wiki/Histogram "Histogram") [Pie chart](https://en.wikipedia.org/wiki/Pie_chart "Pie chart") [QâQ plot](https://en.wikipedia.org/wiki/Q%E2%80%93Q_plot "QâQ plot") [Radar chart](https://en.wikipedia.org/wiki/Radar_chart "Radar chart") [Run chart](https://en.wikipedia.org/wiki/Run_chart "Run chart") [Scatter plot](https://en.wikipedia.org/wiki/Scatter_plot "Scatter plot") [Stem-and-leaf display](https://en.wikipedia.org/wiki/Stem-and-leaf_display "Stem-and-leaf display") [Violin plot](https://en.wikipedia.org/wiki/Violin_plot "Violin plot") [Heatmap](https://en.wikipedia.org/wiki/Heatmap "Heatmap") [Scatter Plot Matrix](https://en.wikipedia.org/wiki/Scatter_plot "Scatter plot") [ECDF plot](https://en.wikipedia.org/wiki/Empirical_distribution_function "Empirical distribution function") |
| [Statistical data processing](https://en.wikipedia.org/wiki/Data_preprocessing "Data preprocessing") | |
| | |
| [Transformations](https://en.wikipedia.org/wiki/Data_transformation_\(statistics\) "Data transformation (statistics)") | [Data transformation](https://en.wikipedia.org/wiki/Data_transformation_\(statistics\) "Data transformation (statistics)") [Log transformation](https://en.wikipedia.org/w/index.php?title=Log_transformation&action=edit&redlink=1 "Log transformation (page does not exist)") [Power transform](https://en.wikipedia.org/wiki/Power_transform "Power transform") [BoxâCox transformation](https://en.wikipedia.org/wiki/Box%E2%80%93Cox_transformation "BoxâCox transformation") [YeoâJohnson transformation](https://en.wikipedia.org/wiki/Yeo%E2%80%93Johnson_transformation "YeoâJohnson transformation") [Variance-stabilizing transformation](https://en.wikipedia.org/wiki/Variance-stabilizing_transformation "Variance-stabilizing transformation") [Anscombe transform](https://en.wikipedia.org/wiki/Anscombe_transform "Anscombe transform") [Fisher transformation](https://en.wikipedia.org/wiki/Fisher_transformation "Fisher transformation") |
| [Scaling and normalization](https://en.wikipedia.org/wiki/Feature_scaling "Feature scaling") | [Feature scaling](https://en.wikipedia.org/wiki/Feature_scaling "Feature scaling") [Normalization](https://en.wikipedia.org/wiki/Normalization_\(statistics\) "Normalization (statistics)") [Standardization (z-score)](https://en.wikipedia.org/wiki/Standard_score "Standard score") [Minâmax normalization](https://en.wikipedia.org/w/index.php?title=Min%E2%80%93max_normalization&action=edit&redlink=1 "Minâmax normalization (page does not exist)") [Unit vector normalization](https://en.wikipedia.org/w/index.php?title=Unit_vector_normalization&action=edit&redlink=1 "Unit vector normalization (page does not exist)") |
| Data cleaning | [Data cleaning](https://en.wikipedia.org/wiki/Data_cleaning "Data cleaning") [Outlier](https://en.wikipedia.org/wiki/Outlier "Outlier") [Winsorizing](https://en.wikipedia.org/wiki/Winsorizing "Winsorizing") [Truncation](https://en.wikipedia.org/wiki/Truncation_\(statistics\) "Truncation (statistics)") [Missing data](https://en.wikipedia.org/wiki/Missing_data "Missing data") |
| Data reduction | [Dimensionality reduction](https://en.wikipedia.org/wiki/Dimensionality_reduction "Dimensionality reduction") [Principal component analysis](https://en.wikipedia.org/wiki/Principal_component_analysis "Principal component analysis") [Factor analysis](https://en.wikipedia.org/wiki/Factor_analysis "Factor analysis") |
| Time-series preprocessing | [Differencing](https://en.wikipedia.org/wiki/Differencing "Differencing") [Detrending](https://en.wikipedia.org/wiki/Detrending "Detrending") [Seasonal adjustment](https://en.wikipedia.org/wiki/Seasonal_adjustment "Seasonal adjustment") [Stationarity transformation](https://en.wikipedia.org/wiki/Stationary_process "Stationary process") |
| [Data collection](https://en.wikipedia.org/wiki/Data_collection "Data collection") | |
| | |
| [Study design](https://en.wikipedia.org/wiki/Design_of_experiments "Design of experiments") | [Effect size](https://en.wikipedia.org/wiki/Effect_size "Effect size") [Missing data](https://en.wikipedia.org/wiki/Missing_data "Missing data") [Optimal design](https://en.wikipedia.org/wiki/Optimal_design "Optimal design") [Population](https://en.wikipedia.org/wiki/Statistical_population "Statistical population") [Replication](https://en.wikipedia.org/wiki/Replication_\(statistics\) "Replication (statistics)") [Sample size determination](https://en.wikipedia.org/wiki/Sample_size_determination "Sample size determination") [Statistic](https://en.wikipedia.org/wiki/Statistic "Statistic") [Statistical power](https://en.wikipedia.org/wiki/Statistical_power "Statistical power") |
| [Survey methodology](https://en.wikipedia.org/wiki/Survey_methodology "Survey methodology") | [Sampling](https://en.wikipedia.org/wiki/Sampling_\(statistics\) "Sampling (statistics)") [Cluster](https://en.wikipedia.org/wiki/Cluster_sampling "Cluster sampling") [Stratified](https://en.wikipedia.org/wiki/Stratified_sampling "Stratified sampling") [Opinion poll](https://en.wikipedia.org/wiki/Opinion_poll "Opinion poll") [Questionnaire](https://en.wikipedia.org/wiki/Questionnaire "Questionnaire") [Standard error](https://en.wikipedia.org/wiki/Standard_error "Standard error") |
| [Controlled experiments](https://en.wikipedia.org/wiki/Experiment "Experiment") | [Blocking](https://en.wikipedia.org/wiki/Blocking_\(statistics\) "Blocking (statistics)") [Factorial experiment](https://en.wikipedia.org/wiki/Factorial_experiment "Factorial experiment") [Interaction](https://en.wikipedia.org/wiki/Interaction_\(statistics\) "Interaction (statistics)") [Random assignment](https://en.wikipedia.org/wiki/Random_assignment "Random assignment") [Randomized controlled trial](https://en.wikipedia.org/wiki/Randomized_controlled_trial "Randomized controlled trial") [Randomized experiment](https://en.wikipedia.org/wiki/Randomized_experiment "Randomized experiment") [Scientific control](https://en.wikipedia.org/wiki/Scientific_control "Scientific control") |
| Adaptive designs | [Adaptive clinical trial](https://en.wikipedia.org/wiki/Adaptive_clinical_trial "Adaptive clinical trial") [Stochastic approximation](https://en.wikipedia.org/wiki/Stochastic_approximation "Stochastic approximation") [Up-and-down designs](https://en.wikipedia.org/wiki/Up-and-Down_Designs "Up-and-Down Designs") |
| [Observational studies](https://en.wikipedia.org/wiki/Observational_study "Observational study") | [Cohort study](https://en.wikipedia.org/wiki/Cohort_study "Cohort study") [Cross-sectional study](https://en.wikipedia.org/wiki/Cross-sectional_study "Cross-sectional study") [Natural experiment](https://en.wikipedia.org/wiki/Natural_experiment "Natural experiment") [Quasi-experiment](https://en.wikipedia.org/wiki/Quasi-experiment "Quasi-experiment") |
| [Statistical inference](https://en.wikipedia.org/wiki/Statistical_inference "Statistical inference") | |
| | |
| [Statistical theory](https://en.wikipedia.org/wiki/Statistical_theory "Statistical theory") | [Population](https://en.wikipedia.org/wiki/Population_\(statistics\) "Population (statistics)") [Statistic](https://en.wikipedia.org/wiki/Statistic "Statistic") [Probability distribution](https://en.wikipedia.org/wiki/Probability_distribution "Probability distribution") [Sampling distribution](https://en.wikipedia.org/wiki/Sampling_distribution "Sampling distribution") [Order statistic](https://en.wikipedia.org/wiki/Order_statistic "Order statistic") [Empirical distribution](https://en.wikipedia.org/wiki/Empirical_distribution_function "Empirical distribution function") [Density estimation](https://en.wikipedia.org/wiki/Density_estimation "Density estimation") [Statistical model](https://en.wikipedia.org/wiki/Statistical_model "Statistical model") [Model specification](https://en.wikipedia.org/wiki/Model_specification "Model specification") [L*p* space](https://en.wikipedia.org/wiki/Lp_space "Lp space") [Parameter](https://en.wikipedia.org/wiki/Statistical_parameter "Statistical parameter") [location](https://en.wikipedia.org/wiki/Location_parameter "Location parameter") [scale](https://en.wikipedia.org/wiki/Scale_parameter "Scale parameter") [shape](https://en.wikipedia.org/wiki/Shape_parameter "Shape parameter") [Parametric family](https://en.wikipedia.org/wiki/Parametric_statistics "Parametric statistics") [Likelihood](https://en.wikipedia.org/wiki/Likelihood_function "Likelihood function") [(monotone)](https://en.wikipedia.org/wiki/Monotone_likelihood_ratio "Monotone likelihood ratio") [Locationâscale family](https://en.wikipedia.org/wiki/Location%E2%80%93scale_family "Locationâscale family") [Exponential family](https://en.wikipedia.org/wiki/Exponential_family "Exponential family") [Completeness](https://en.wikipedia.org/wiki/Completeness_\(statistics\) "Completeness (statistics)") [Sufficiency](https://en.wikipedia.org/wiki/Sufficient_statistic "Sufficient 
statistic") [Statistical functional](https://en.wikipedia.org/wiki/Plug-in_principle "Plug-in principle") [Bootstrap](https://en.wikipedia.org/wiki/Bootstrapping_\(statistics\) "Bootstrapping (statistics)") [U](https://en.wikipedia.org/wiki/U-statistic "U-statistic") [V](https://en.wikipedia.org/wiki/V-statistic "V-statistic") [Optimal decision](https://en.wikipedia.org/wiki/Optimal_decision "Optimal decision") [loss function](https://en.wikipedia.org/wiki/Loss_function "Loss function") [Efficiency](https://en.wikipedia.org/wiki/Efficiency_\(statistics\) "Efficiency (statistics)") [Statistical distance](https://en.wikipedia.org/wiki/Statistical_distance "Statistical distance") [divergence](https://en.wikipedia.org/wiki/Divergence_\(statistics\) "Divergence (statistics)") [Asymptotics](https://en.wikipedia.org/wiki/Asymptotic_theory_\(statistics\) "Asymptotic theory (statistics)") [Robustness](https://en.wikipedia.org/wiki/Robust_statistics "Robust statistics") |
| [Frequentist inference](https://en.wikipedia.org/wiki/Frequentist_inference "Frequentist inference") | |
| | |
| [Point estimation](https://en.wikipedia.org/wiki/Point_estimation "Point estimation") | [Estimating equations](https://en.wikipedia.org/wiki/Estimating_equations "Estimating equations") [Maximum likelihood](https://en.wikipedia.org/wiki/Maximum_likelihood "Maximum likelihood") [Method of moments](https://en.wikipedia.org/wiki/Method_of_moments_\(statistics\) "Method of moments (statistics)") [M-estimator](https://en.wikipedia.org/wiki/M-estimator "M-estimator") [Minimum distance](https://en.wikipedia.org/wiki/Minimum_distance_estimation "Minimum distance estimation") [Unbiased estimators](https://en.wikipedia.org/wiki/Bias_of_an_estimator "Bias of an estimator") [Mean-unbiased minimum-variance](https://en.wikipedia.org/wiki/Minimum-variance_unbiased_estimator "Minimum-variance unbiased estimator") [RaoâBlackwellization](https://en.wikipedia.org/wiki/Rao%E2%80%93Blackwell_theorem "RaoâBlackwell theorem") [LehmannâScheffĂ© theorem](https://en.wikipedia.org/wiki/Lehmann%E2%80%93Scheff%C3%A9_theorem "LehmannâScheffĂ© theorem") [Median unbiased](https://en.wikipedia.org/wiki/Median-unbiased_estimator "Median-unbiased estimator") [Plug-in](https://en.wikipedia.org/wiki/Plug-in_principle "Plug-in principle") |
| [Interval estimation](https://en.wikipedia.org/wiki/Interval_estimation "Interval estimation") | [Confidence interval](https://en.wikipedia.org/wiki/Confidence_interval "Confidence interval") [Pivot](https://en.wikipedia.org/wiki/Pivotal_quantity "Pivotal quantity") [Likelihood interval](https://en.wikipedia.org/wiki/Likelihood_interval "Likelihood interval") [Prediction interval](https://en.wikipedia.org/wiki/Prediction_interval "Prediction interval") [Tolerance interval](https://en.wikipedia.org/wiki/Tolerance_interval "Tolerance interval") [Resampling](https://en.wikipedia.org/wiki/Resampling_\(statistics\) "Resampling (statistics)") [Bootstrap](https://en.wikipedia.org/wiki/Bootstrapping_\(statistics\) "Bootstrapping (statistics)") [Jackknife](https://en.wikipedia.org/wiki/Jackknife_resampling "Jackknife resampling") |
| [Testing hypotheses](https://en.wikipedia.org/wiki/Statistical_hypothesis_testing "Statistical hypothesis testing") | [1- & 2-tails](https://en.wikipedia.org/wiki/One-_and_two-tailed_tests "One- and two-tailed tests") [Power](https://en.wikipedia.org/wiki/Power_\(statistics\) "Power (statistics)") [Uniformly most powerful test](https://en.wikipedia.org/wiki/Uniformly_most_powerful_test "Uniformly most powerful test") [Permutation test](https://en.wikipedia.org/wiki/Permutation_test "Permutation test") [Randomization test](https://en.wikipedia.org/wiki/Randomization_test "Randomization test") [Multiple comparisons](https://en.wikipedia.org/wiki/Multiple_comparisons "Multiple comparisons") |
| [Parametric tests](https://en.wikipedia.org/wiki/Parametric_statistics "Parametric statistics") | [Likelihood-ratio](https://en.wikipedia.org/wiki/Likelihood-ratio_test "Likelihood-ratio test") [Score/Lagrange multiplier](https://en.wikipedia.org/wiki/Score_test "Score test") [Wald](https://en.wikipedia.org/wiki/Wald_test "Wald test") |
| [Specific tests](https://en.wikipedia.org/wiki/List_of_statistical_tests "List of statistical tests") | |
| | |
| [*Z*\-test (normal)](https://en.wikipedia.org/wiki/Z-test "Z-test") [Student's *t*\-test](https://en.wikipedia.org/wiki/Student%27s_t-test "Student's t-test") [*F*\-test](https://en.wikipedia.org/wiki/F-test "F-test") | |
| [Goodness of fit](https://en.wikipedia.org/wiki/Goodness_of_fit "Goodness of fit") | [Chi-squared](https://en.wikipedia.org/wiki/Chi-squared_test "Chi-squared test") [*G*\-test](https://en.wikipedia.org/wiki/G-test "G-test") [KolmogorovâSmirnov](https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test "KolmogorovâSmirnov test") [AndersonâDarling](https://en.wikipedia.org/wiki/Anderson%E2%80%93Darling_test "AndersonâDarling test") [Lilliefors](https://en.wikipedia.org/wiki/Lilliefors_test "Lilliefors test") [JarqueâBera](https://en.wikipedia.org/wiki/Jarque%E2%80%93Bera_test "JarqueâBera test") [Normality (ShapiroâWilk)](https://en.wikipedia.org/wiki/Shapiro%E2%80%93Wilk_test "ShapiroâWilk test") [Likelihood-ratio test](https://en.wikipedia.org/wiki/Likelihood-ratio_test "Likelihood-ratio test") [Model selection](https://en.wikipedia.org/wiki/Model_selection "Model selection") [Cross validation](https://en.wikipedia.org/wiki/Cross-validation_\(statistics\) "Cross-validation (statistics)") [AIC](https://en.wikipedia.org/wiki/Akaike_information_criterion "Akaike information criterion") [BIC](https://en.wikipedia.org/wiki/Bayesian_information_criterion "Bayesian information criterion") |
| [Rank statistics](https://en.wikipedia.org/wiki/Rank_statistics "Rank statistics") | [Sign](https://en.wikipedia.org/wiki/Sign_test "Sign test") [Sample median](https://en.wikipedia.org/wiki/Sample_median "Sample median") [Signed rank (Wilcoxon)](https://en.wikipedia.org/wiki/Wilcoxon_signed-rank_test "Wilcoxon signed-rank test") [HodgesâLehmann estimator](https://en.wikipedia.org/wiki/Hodges%E2%80%93Lehmann_estimator "HodgesâLehmann estimator") [Rank sum (MannâWhitney)](https://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U_test "MannâWhitney U test") [Nonparametric](https://en.wikipedia.org/wiki/Nonparametric_statistics "Nonparametric statistics") [anova](https://en.wikipedia.org/wiki/Analysis_of_variance "Analysis of variance") [1-way (KruskalâWallis)](https://en.wikipedia.org/wiki/Kruskal%E2%80%93Wallis_test "KruskalâWallis test") [2-way (Friedman)](https://en.wikipedia.org/wiki/Friedman_test "Friedman test") [Ordered alternative (JonckheereâTerpstra)](https://en.wikipedia.org/wiki/Jonckheere%27s_trend_test "Jonckheere's trend test") [Van der Waerden test](https://en.wikipedia.org/wiki/Van_der_Waerden_test "Van der Waerden test") |
| [Bayesian inference]() | [Bayesian probability](https://en.wikipedia.org/wiki/Bayesian_probability "Bayesian probability") [prior](https://en.wikipedia.org/wiki/Prior_probability "Prior probability") [posterior](https://en.wikipedia.org/wiki/Posterior_probability "Posterior probability") [Credible interval](https://en.wikipedia.org/wiki/Credible_interval "Credible interval") [Bayes factor](https://en.wikipedia.org/wiki/Bayes_factor "Bayes factor") [Bayesian estimator](https://en.wikipedia.org/wiki/Bayes_estimator "Bayes estimator") [Maximum posterior estimator](https://en.wikipedia.org/wiki/Maximum_a_posteriori_estimation "Maximum a posteriori estimation") |
| [Correlation](https://en.wikipedia.org/wiki/Correlation_and_dependence "Correlation and dependence") [Regression analysis](https://en.wikipedia.org/wiki/Regression_analysis "Regression analysis") | |
| | |
| [Correlation](https://en.wikipedia.org/wiki/Correlation_and_dependence "Correlation and dependence") | [Pearson product-moment](https://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient "Pearson product-moment correlation coefficient") [Partial correlation](https://en.wikipedia.org/wiki/Partial_correlation "Partial correlation") [Confounding variable](https://en.wikipedia.org/wiki/Confounding "Confounding") [Coefficient of determination](https://en.wikipedia.org/wiki/Coefficient_of_determination "Coefficient of determination") |
| [Regression analysis](https://en.wikipedia.org/wiki/Regression_analysis "Regression analysis") | [Errors and residuals](https://en.wikipedia.org/wiki/Errors_and_residuals "Errors and residuals") [Regression validation](https://en.wikipedia.org/wiki/Regression_validation "Regression validation") [Mixed effects models](https://en.wikipedia.org/wiki/Mixed_model "Mixed model") [Simultaneous equations models](https://en.wikipedia.org/wiki/Simultaneous_equations_model "Simultaneous equations model") [Multivariate adaptive regression splines (MARS)](https://en.wikipedia.org/wiki/Multivariate_adaptive_regression_splines "Multivariate adaptive regression splines")  [Template:Least squares and regression analysis](https://en.wikipedia.org/wiki/Template:Least_squares_and_regression_analysis "Template:Least squares and regression analysis") |
| [Linear regression](https://en.wikipedia.org/wiki/Linear_regression "Linear regression") | [Simple linear regression](https://en.wikipedia.org/wiki/Simple_linear_regression "Simple linear regression") [Ordinary least squares](https://en.wikipedia.org/wiki/Ordinary_least_squares "Ordinary least squares") [General linear model](https://en.wikipedia.org/wiki/General_linear_model "General linear model") [Bayesian regression](https://en.wikipedia.org/wiki/Bayesian_linear_regression "Bayesian linear regression") |
| Non-standard predictors | [Nonlinear regression](https://en.wikipedia.org/wiki/Nonlinear_regression "Nonlinear regression") [Nonparametric](https://en.wikipedia.org/wiki/Nonparametric_regression "Nonparametric regression") [Semiparametric](https://en.wikipedia.org/wiki/Semiparametric_regression "Semiparametric regression") [Isotonic](https://en.wikipedia.org/wiki/Isotonic_regression "Isotonic regression") [Robust](https://en.wikipedia.org/wiki/Robust_regression "Robust regression") [Homoscedasticity and Heteroscedasticity](https://en.wikipedia.org/wiki/Homoscedasticity_and_heteroscedasticity "Homoscedasticity and heteroscedasticity") |
| [Generalized linear model](https://en.wikipedia.org/wiki/Generalized_linear_model "Generalized linear model") | [Exponential families](https://en.wikipedia.org/wiki/Exponential_family "Exponential family") [Logistic (Bernoulli)](https://en.wikipedia.org/wiki/Logistic_regression "Logistic regression") / [Binomial](https://en.wikipedia.org/wiki/Binomial_regression "Binomial regression") / [Poisson regressions](https://en.wikipedia.org/wiki/Poisson_regression "Poisson regression") |
| [Partition of variance](https://en.wikipedia.org/wiki/Partition_of_sums_of_squares "Partition of sums of squares") | [Analysis of variance (ANOVA, anova)](https://en.wikipedia.org/wiki/Analysis_of_variance "Analysis of variance") [Analysis of covariance](https://en.wikipedia.org/wiki/Analysis_of_covariance "Analysis of covariance") [Multivariate ANOVA](https://en.wikipedia.org/wiki/Multivariate_analysis_of_variance "Multivariate analysis of variance") [Degrees of freedom](https://en.wikipedia.org/wiki/Degrees_of_freedom_\(statistics\) "Degrees of freedom (statistics)") |
| [Categorical](https://en.wikipedia.org/wiki/Categorical_variable "Categorical variable") / [multivariate](https://en.wikipedia.org/wiki/Multivariate_statistics "Multivariate statistics") / [time-series](https://en.wikipedia.org/wiki/Time_series "Time series") / [survival analysis](https://en.wikipedia.org/wiki/Survival_analysis "Survival analysis") | |
| | |
| [Categorical](https://en.wikipedia.org/wiki/Categorical_variable "Categorical variable") | [Cohen's kappa](https://en.wikipedia.org/wiki/Cohen%27s_kappa "Cohen's kappa") [Contingency table](https://en.wikipedia.org/wiki/Contingency_table "Contingency table") [Graphical model](https://en.wikipedia.org/wiki/Graphical_model "Graphical model") [Log-linear model](https://en.wikipedia.org/wiki/Poisson_regression "Poisson regression") [McNemar's test](https://en.wikipedia.org/wiki/McNemar%27s_test "McNemar's test") [CochranâMantelâHaenszel statistics](https://en.wikipedia.org/wiki/Cochran%E2%80%93Mantel%E2%80%93Haenszel_statistics "CochranâMantelâHaenszel statistics") |
| [Multivariate](https://en.wikipedia.org/wiki/Multivariate_statistics "Multivariate statistics") | [Regression](https://en.wikipedia.org/wiki/General_linear_model "General linear model") [Manova](https://en.wikipedia.org/wiki/Multivariate_analysis_of_variance "Multivariate analysis of variance") [Principal components](https://en.wikipedia.org/wiki/Principal_component_analysis "Principal component analysis") [Canonical correlation](https://en.wikipedia.org/wiki/Canonical_correlation "Canonical correlation") [Discriminant analysis](https://en.wikipedia.org/wiki/Linear_discriminant_analysis "Linear discriminant analysis") [Cluster analysis](https://en.wikipedia.org/wiki/Cluster_analysis "Cluster analysis") [Classification](https://en.wikipedia.org/wiki/Statistical_classification "Statistical classification") [Structural equation model](https://en.wikipedia.org/wiki/Structural_equation_modeling "Structural equation modeling") [Factor analysis](https://en.wikipedia.org/wiki/Factor_analysis "Factor analysis") [Multivariate distributions](https://en.wikipedia.org/wiki/Multivariate_distribution "Multivariate distribution") [Elliptical distributions](https://en.wikipedia.org/wiki/Elliptical_distribution "Elliptical distribution") [Normal](https://en.wikipedia.org/wiki/Multivariate_normal_distribution "Multivariate normal distribution") |
| [Time-series](https://en.wikipedia.org/wiki/Time_series "Time series") | |
| | |
| General | [Decomposition](https://en.wikipedia.org/wiki/Decomposition_of_time_series "Decomposition of time series") [Trend](https://en.wikipedia.org/wiki/Trend_estimation "Trend estimation") [Stationarity](https://en.wikipedia.org/wiki/Stationary_process "Stationary process") [Seasonal adjustment](https://en.wikipedia.org/wiki/Seasonal_adjustment "Seasonal adjustment") [Exponential smoothing](https://en.wikipedia.org/wiki/Exponential_smoothing "Exponential smoothing") [Cointegration](https://en.wikipedia.org/wiki/Cointegration "Cointegration") [Structural break](https://en.wikipedia.org/wiki/Structural_break "Structural break") [Granger causality](https://en.wikipedia.org/wiki/Granger_causality "Granger causality") |
| Specific tests | [DickeyâFuller](https://en.wikipedia.org/wiki/Dickey%E2%80%93Fuller_test "DickeyâFuller test") [Johansen](https://en.wikipedia.org/wiki/Johansen_test "Johansen test") [Q-statistic (LjungâBox)](https://en.wikipedia.org/wiki/Ljung%E2%80%93Box_test "LjungâBox test") [DurbinâWatson](https://en.wikipedia.org/wiki/Durbin%E2%80%93Watson_statistic "DurbinâWatson statistic") [BreuschâGodfrey](https://en.wikipedia.org/wiki/Breusch%E2%80%93Godfrey_test "BreuschâGodfrey test") |
| [Time domain](https://en.wikipedia.org/wiki/Time_domain "Time domain") | [Autocorrelation (ACF)](https://en.wikipedia.org/wiki/Autocorrelation "Autocorrelation") [partial (PACF)](https://en.wikipedia.org/wiki/Partial_autocorrelation_function "Partial autocorrelation function") [Cross-correlation (XCF)](https://en.wikipedia.org/wiki/Cross-correlation "Cross-correlation") [ARMA model](https://en.wikipedia.org/wiki/Autoregressive%E2%80%93moving-average_model "Autoregressiveâmoving-average model") [ARIMA model (BoxâJenkins)](https://en.wikipedia.org/wiki/Box%E2%80%93Jenkins_method "BoxâJenkins method") [Autoregressive conditional heteroskedasticity (ARCH)](https://en.wikipedia.org/wiki/Autoregressive_conditional_heteroskedasticity "Autoregressive conditional heteroskedasticity") [Vector autoregression (VAR)](https://en.wikipedia.org/wiki/Vector_autoregression "Vector autoregression") ([Autoregressive model (AR)](https://en.wikipedia.org/wiki/Autoregressive_model "Autoregressive model")) |
| [Frequency domain](https://en.wikipedia.org/wiki/Frequency_domain "Frequency domain") | [Spectral density estimation](https://en.wikipedia.org/wiki/Spectral_density_estimation "Spectral density estimation") [Fourier analysis](https://en.wikipedia.org/wiki/Fourier_analysis "Fourier analysis") [Least-squares spectral analysis](https://en.wikipedia.org/wiki/Least-squares_spectral_analysis "Least-squares spectral analysis") [Wavelet](https://en.wikipedia.org/wiki/Wavelet "Wavelet") [Whittle likelihood](https://en.wikipedia.org/wiki/Whittle_likelihood "Whittle likelihood") |
| [Survival](https://en.wikipedia.org/wiki/Survival_analysis "Survival analysis") | |
| | |
| [Survival function](https://en.wikipedia.org/wiki/Survival_function "Survival function") | [Kaplan–Meier estimator (product limit)](https://en.wikipedia.org/wiki/Kaplan%E2%80%93Meier_estimator "Kaplan–Meier estimator") [Proportional hazards models](https://en.wikipedia.org/wiki/Proportional_hazards_model "Proportional hazards model") [Accelerated failure time (AFT) model](https://en.wikipedia.org/wiki/Accelerated_failure_time_model "Accelerated failure time model") [First hitting time](https://en.wikipedia.org/wiki/First-hitting-time_model "First-hitting-time model") |
| [Hazard function](https://en.wikipedia.org/wiki/Failure_rate "Failure rate") | [Nelson–Aalen estimator](https://en.wikipedia.org/wiki/Nelson%E2%80%93Aalen_estimator "Nelson–Aalen estimator") |
| Test | [Log-rank test](https://en.wikipedia.org/wiki/Log-rank_test "Log-rank test") |
| [Applications](https://en.wikipedia.org/wiki/List_of_fields_of_application_of_statistics "List of fields of application of statistics") | |
| | |
| [Biostatistics](https://en.wikipedia.org/wiki/Biostatistics "Biostatistics") | [Bioinformatics](https://en.wikipedia.org/wiki/Bioinformatics "Bioinformatics") [Clinical trials](https://en.wikipedia.org/wiki/Clinical_trial "Clinical trial") / [studies](https://en.wikipedia.org/wiki/Clinical_study_design "Clinical study design") [Epidemiology](https://en.wikipedia.org/wiki/Epidemiology "Epidemiology") [Medical statistics](https://en.wikipedia.org/wiki/Medical_statistics "Medical statistics") |
| [Engineering statistics](https://en.wikipedia.org/wiki/Engineering_statistics "Engineering statistics") | [Chemometrics](https://en.wikipedia.org/wiki/Chemometrics "Chemometrics") [Methods engineering](https://en.wikipedia.org/wiki/Methods_engineering "Methods engineering") [Probabilistic design](https://en.wikipedia.org/wiki/Probabilistic_design "Probabilistic design") [Process](https://en.wikipedia.org/wiki/Statistical_process_control "Statistical process control") / [quality control](https://en.wikipedia.org/wiki/Quality_control "Quality control") [Reliability](https://en.wikipedia.org/wiki/Reliability_engineering "Reliability engineering") [System identification](https://en.wikipedia.org/wiki/System_identification "System identification") |
| [Social statistics](https://en.wikipedia.org/wiki/Social_statistics "Social statistics") | [Actuarial science](https://en.wikipedia.org/wiki/Actuarial_science "Actuarial science") [Census](https://en.wikipedia.org/wiki/Census "Census") [Crime statistics](https://en.wikipedia.org/wiki/Crime_statistics "Crime statistics") [Demography](https://en.wikipedia.org/wiki/Demographic_statistics "Demographic statistics") [Econometrics](https://en.wikipedia.org/wiki/Econometrics "Econometrics") [Jurimetrics](https://en.wikipedia.org/wiki/Jurimetrics "Jurimetrics") [National accounts](https://en.wikipedia.org/wiki/National_accounts "National accounts") [Official statistics](https://en.wikipedia.org/wiki/Official_statistics "Official statistics") [Population statistics](https://en.wikipedia.org/wiki/Population_statistics "Population statistics") [Psychometrics](https://en.wikipedia.org/wiki/Psychometrics "Psychometrics") |
| [Spatial statistics](https://en.wikipedia.org/wiki/Spatial_analysis "Spatial analysis") | [Cartography](https://en.wikipedia.org/wiki/Cartography "Cartography") [Environmental statistics](https://en.wikipedia.org/wiki/Environmental_statistics "Environmental statistics") [Geographic information system](https://en.wikipedia.org/wiki/Geographic_information_system "Geographic information system") [Geostatistics](https://en.wikipedia.org/wiki/Geostatistics "Geostatistics") [Kriging](https://en.wikipedia.org/wiki/Kriging "Kriging") |
| **[Category](https://en.wikipedia.org/wiki/Category:Statistics "Category:Statistics")** **[](https://en.wikipedia.org/wiki/File:Nuvola_apps_edu_mathematics_blue-p.svg) [Mathematics portal](https://en.wikipedia.org/wiki/Portal:Mathematics "Portal:Mathematics")** [](https://en.wikipedia.org/wiki/File:Commons-logo.svg "Commons page")**[Commons](https://commons.wikimedia.org/wiki/Category:Statistics "commons:Category:Statistics")**  **[WikiProject](https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Statistics "Wikipedia:WikiProject Statistics")** | |
| [Authority control databases](https://en.wikipedia.org/wiki/Help:Authority_control "Help:Authority control") [](https://www.wikidata.org/wiki/Q812535#identifiers "Edit this at Wikidata") | |
|---|---|
| International | [GND](https://d-nb.info/gnd/4144220-9) |
| National | [United States](https://id.loc.gov/authorities/sh85012506) [Czech Republic](https://aleph.nkp.cz/F/?func=find-c&local_base=aut&ccl_term=ica=ph135362&CON_LNG=ENG) [Israel](https://www.nli.org.il/en/authorities/987007282424705171) |
| Other | [Yale LUX](https://lux.collections.yale.edu/view/concept/89628c4b-4062-4a41-9920-860f80a2eb33) |

Retrieved from "<https://en.wikipedia.org/w/index.php?title=Bayesian_inference&oldid=1346386418>"
[Categories](https://en.wikipedia.org/wiki/Help:Category "Help:Category"):
- [Bayesian inference](https://en.wikipedia.org/wiki/Category:Bayesian_inference "Category:Bayesian inference")
- [Logic and statistics](https://en.wikipedia.org/wiki/Category:Logic_and_statistics "Category:Logic and statistics")
- [Statistical forecasting](https://en.wikipedia.org/wiki/Category:Statistical_forecasting "Category:Statistical forecasting")
- [Probabilistic arguments](https://en.wikipedia.org/wiki/Category:Probabilistic_arguments "Category:Probabilistic arguments")
- This page was last edited on 31 March 2026, at 15:00 (UTC).
- Text is available under the [Creative Commons Attribution-ShareAlike 4.0 License](https://en.wikipedia.org/wiki/Wikipedia:Text_of_the_Creative_Commons_Attribution-ShareAlike_4.0_International_License "Wikipedia:Text of the Creative Commons Attribution-ShareAlike 4.0 International License"); additional terms may apply. By using this site, you agree to the [Terms of Use](https://foundation.wikimedia.org/wiki/Special:MyLanguage/Policy:Terms_of_Use "foundation:Special:MyLanguage/Policy:Terms of Use") and [Privacy Policy](https://foundation.wikimedia.org/wiki/Special:MyLanguage/Policy:Privacy_policy "foundation:Special:MyLanguage/Policy:Privacy policy"). Wikipedia® is a registered trademark of the [Wikimedia Foundation, Inc.](https://wikimediafoundation.org/), a non-profit organization.
|
| Readable Markdown | **Bayesian inference** ([*BAY\-zee-ən*](https://en.wikipedia.org/wiki/Help:Pronunciation_respelling_key "Help:Pronunciation respelling key") or [*BAY\-zhən*](https://en.wikipedia.org/wiki/Help:Pronunciation_respelling_key "Help:Pronunciation respelling key"))[\[1\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-1) is a method of [statistical inference](https://en.wikipedia.org/wiki/Statistical_inference "Statistical inference") in which [Bayes' theorem](https://en.wikipedia.org/wiki/Bayes%27_theorem "Bayes' theorem") is used to calculate a probability of a hypothesis, given prior [evidence](https://en.wikipedia.org/wiki/Evidence "Evidence"), and update it as more [information](https://en.wikipedia.org/wiki/Information "Information") becomes available. Fundamentally, Bayesian inference uses a [prior distribution](https://en.wikipedia.org/wiki/Prior_probability "Prior probability") to estimate [posterior probabilities](https://en.wikipedia.org/wiki/Posterior_probability "Posterior probability"). Bayesian inference is an important technique in [statistics](https://en.wikipedia.org/wiki/Statistics "Statistics"), and especially in [mathematical statistics](https://en.wikipedia.org/wiki/Mathematical_statistics "Mathematical statistics"). Bayesian updating is particularly important in the [dynamic analysis of a sequence of data](https://en.wikipedia.org/wiki/Sequential_analysis "Sequential analysis").
Bayesian inference has found application in a wide range of activities, including [science](https://en.wikipedia.org/wiki/Science "Science"), [engineering](https://en.wikipedia.org/wiki/Engineering "Engineering"), [philosophy](https://en.wikipedia.org/wiki/Philosophy "Philosophy"), [medicine](https://en.wikipedia.org/wiki/Medicine "Medicine"), [sport](https://en.wikipedia.org/wiki/Sport "Sport"), [psychology](https://en.wikipedia.org/wiki/Psychology "Psychology")[\[2\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-2), and [law](https://en.wikipedia.org/wiki/Law "Law"). In the philosophy of [decision theory](https://en.wikipedia.org/wiki/Decision_theory "Decision theory"), Bayesian inference is closely related to subjective probability, often called "[Bayesian probability](https://en.wikipedia.org/wiki/Bayesian_probability "Bayesian probability")".
## Introduction to Bayes' rule
A geometric visualisation of Bayes' theorem. In the table, the values 2, 3, 6 and 9 give the relative weights of each corresponding condition and case. The figures denote the cells of the table involved in each metric, the probability being the fraction of each figure that is shaded. This shows that $P(A \mid B)\,P(B) = P(B \mid A)\,P(A)$, i.e. $P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$. Similar reasoning can be used to show that $P(\neg A \mid B) = \frac{P(B \mid \neg A)\,P(\neg A)}{P(B)}$ etc.
| Hypothesis / Evidence | Satisfies hypothesis $H$ | Violates hypothesis $\neg H$ |
|---|---|---|
Bayesian inference derives the [posterior probability](https://en.wikipedia.org/wiki/Posterior_probability "Posterior probability") as a [consequence](https://en.wikipedia.org/wiki/Consequence_relation "Consequence relation") of two [antecedents](https://en.wikipedia.org/wiki/Antecedent_\(logic\) "Antecedent (logic)"): a [prior probability](https://en.wikipedia.org/wiki/Prior_probability "Prior probability") and a "[likelihood function](https://en.wikipedia.org/wiki/Likelihood_function "Likelihood function")" derived from a [statistical model](https://en.wikipedia.org/wiki/Statistical_model "Statistical model") for the observed data. Bayesian inference computes the posterior probability according to [Bayes' theorem](https://en.wikipedia.org/wiki/Bayes%27_theorem "Bayes' theorem"):

$$P(H \mid E) = \frac{P(E \mid H) \cdot P(H)}{P(E)},$$

where
- $H$ stands for any *hypothesis* whose probability may be affected by [data](https://en.wikipedia.org/wiki/Experimental_data "Experimental data") (called *evidence* below). Often there are competing hypotheses, and the task is to determine which is the most probable.
- $P(H)$, the *[prior probability](https://en.wikipedia.org/wiki/Prior_probability "Prior probability")*, is the estimate of the probability of the hypothesis $H$ *before* the data $E$, the current evidence, is observed.
- $E$, the *evidence*, corresponds to new data that were not used in computing the prior probability.
- $P(H \mid E)$, the *[posterior probability](https://en.wikipedia.org/wiki/Posterior_probability "Posterior probability")*, is the probability of $H$ *given* $E$, i.e., *after* $E$ is observed. This is what we want to know: the probability of a hypothesis *given* the observed evidence.
- $P(E \mid H)$ is the probability of observing $E$ *given* $H$ and is called the *[likelihood](https://en.wikipedia.org/wiki/Likelihood_function "Likelihood function")*. As a function of $E$ with $H$ fixed, it indicates the compatibility of the evidence with the given hypothesis. The likelihood function is a function of the evidence, $E$, while the posterior probability is a function of the hypothesis, $H$.
- $P(E)$ is sometimes termed the [marginal likelihood](https://en.wikipedia.org/wiki/Marginal_likelihood "Marginal likelihood") or "model evidence". This factor is the same for all possible hypotheses being considered (as is evident from the fact that the hypothesis $H$ does not appear anywhere in the symbol, unlike for all the other factors) and hence does not factor into determining the relative probabilities of different hypotheses.
- $P(E) > 0$. (Else one has $0/0$.)

For different values of $H$, only the factors $P(H)$ and $P(E \mid H)$, both in the numerator, affect the value of $P(H \mid E)$: the posterior probability of a hypothesis is proportional to its prior probability (its inherent likeliness) and the newly acquired likelihood (its compatibility with the new observed evidence).
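In cases where $\neg H$ ("not $H$"), the [logical negation](https://en.wikipedia.org/wiki/Logical_negation "Logical negation") of $H$, is a valid likelihood, Bayes' rule can be rewritten as follows:

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)} = \frac{P(E \mid H)\,P(H)}{P(E \mid H)\,P(H) + P(E \mid \neg H)\,P(\neg H)} = \frac{1}{1 + \left(\frac{1}{P(H)} - 1\right)\frac{P(E \mid \neg H)}{P(E \mid H)}}$$

because

$$P(E) = P(E \mid H)\,P(H) + P(E \mid \neg H)\,P(\neg H)$$

and

$$P(H) + P(\neg H) = 1.$$

This focuses attention on the term

$$\left(\frac{1}{P(H)} - 1\right)\frac{P(E \mid \neg H)}{P(E \mid H)}.$$

If that term is approximately 1, then the probability of the hypothesis given the evidence, $P(H \mid E)$, is about $\tfrac{1}{2}$: the hypothesis is about as likely to be true as not. If that term is very small, close to zero, then the probability of the hypothesis given the evidence, $P(H \mid E)$, is close to 1, so the hypothesis is quite likely given the evidence. If that term is very large, much larger than 1, then the hypothesis, given the evidence, is quite unlikely. If the hypothesis (without consideration of evidence) is unlikely, then $P(H)$ is small (but not necessarily astronomically small), $\tfrac{1}{P(H)}$ is much larger than 1, and this term can be approximated as $\tfrac{P(E \mid \neg H)}{P(E \mid H) \cdot P(H)}$, so that the relevant probabilities can be compared directly to each other.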
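As a rough numerical illustration of this rewriting, the following Python sketch computes $P(H \mid E)$ both directly from Bayes' theorem and through the term $\left(\tfrac{1}{P(H)} - 1\right)\tfrac{P(E \mid \neg H)}{P(E \mid H)}$. The function names and the input probabilities are hypothetical values chosen only for the example; they do not come from the article.

```python
def posterior(prior_h: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """P(H|E) from Bayes' theorem, with P(E) expanded over H and not-H."""
    evidence = p_e_given_h * prior_h + p_e_given_not_h * (1.0 - prior_h)
    if evidence == 0.0:
        raise ValueError("P(E) must be positive")
    return p_e_given_h * prior_h / evidence


def odds_term(prior_h: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """The term (1/P(H) - 1) * P(E|not H) / P(E|H)."""
    return (1.0 / prior_h - 1.0) * p_e_given_not_h / p_e_given_h


# Hypothetical inputs: an unlikely hypothesis and fairly diagnostic evidence.
prior = 0.01
p_e_h = 0.9
p_e_not_h = 0.05

post = posterior(prior, p_e_h, p_e_not_h)
term = odds_term(prior, p_e_h, p_e_not_h)

# Both routes give the same posterior: P(H|E) = 1 / (1 + term).
print(post, 1.0 / (1.0 + term))
```

Here the term is larger than 1, so the posterior stays below one half even though the evidence is much more probable under $H$ than under $\neg H$.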
One quick and easy way to remember the equation is to use the [rule of multiplication](https://en.wikipedia.org/wiki/Conditional_probability#As_an_axiom_of_probability "Conditional probability"):

$$P(E \cap H) = P(E \mid H)\,P(H) = P(H \mid E)\,P(E).$$
### Alternatives to Bayesian updating
Bayesian updating is widely used and computationally convenient. However, it is not the only updating rule that might be considered rational.
[Ian Hacking](https://en.wikipedia.org/wiki/Ian_Hacking "Ian Hacking") noted that traditional "[Dutch book](https://en.wikipedia.org/wiki/Dutch_book "Dutch book")" arguments did not specify Bayesian updating: they left open the possibility that non-Bayesian updating rules could avoid Dutch books. Hacking wrote:[\[3\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-3) "And neither the Dutch book argument nor any other in the personalist arsenal of proofs of the probability axioms entails the dynamic assumption. Not one entails Bayesianism. So the personalist requires the dynamic assumption to be Bayesian. It is true that in consistency a personalist could abandon the Bayesian model of learning from experience. Salt could lose its savour."
Indeed, there are non-Bayesian updating rules that also avoid Dutch books (as discussed in the literature on "[probability kinematics](https://en.wikipedia.org/wiki/Probability_kinematics "Probability kinematics")") following the publication of [Richard C. Jeffrey](https://en.wikipedia.org/wiki/Richard_C._Jeffrey "Richard C. Jeffrey")'s rule, which applies Bayes' rule to the case where the evidence itself is assigned a probability.[\[4\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-4) The additional hypotheses needed to uniquely require Bayesian updating have been deemed to be substantial, complicated, and unsatisfactory.[\[5\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-5)
## Inference over exclusive and exhaustive possibilities
If evidence is simultaneously used to update belief over a set of exclusive and exhaustive propositions, Bayesian inference may be thought of as acting on this belief distribution as a whole.
### General formulation
Diagram illustrating event space $\Omega$ in general formulation of Bayesian inference. Although this diagram shows discrete models and events, the continuous case may be visualized similarly using probability densities.

Suppose a process is generating independent and identically distributed events $E_n,\ n = 1, 2, 3, \ldots$, but the [probability distribution](https://en.wikipedia.org/wiki/Probability_distribution "Probability distribution") is unknown. Let the event space $\Omega$ represent the current state of belief for this process. Each model is represented by event $M_m$. The conditional probabilities $P(E_n \mid M_m)$ are specified to define the models. $P(M_m)$ is the [degree of belief](https://en.wikipedia.org/wiki/Credence_\(statistics\) "Credence (statistics)") in $M_m$. Before the first inference step, $\{P(M_m)\}$ is a set of *initial prior probabilities*. These must sum to 1, but are otherwise arbitrary.

Suppose that the process is observed to generate $E \in \{E_n\}$. For each $M \in \{M_m\}$, the prior $P(M)$ is updated to the posterior $P(M \mid E)$. From [Bayes' theorem](https://en.wikipedia.org/wiki/Bayes%27_theorem "Bayes' theorem"):[\[6\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-6)

$$P(M \mid E) = \frac{P(E \mid M)}{\sum_m P(E \mid M_m)\,P(M_m)} \cdot P(M).$$
Upon observation of further evidence, this procedure may be repeated.
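A single update step over a finite set of models, $P(M \mid E) = \frac{P(E \mid M)\,P(M)}{\sum_m P(E \mid M_m)\,P(M_m)}$, can be sketched in Python as follows. The two coin models and their probabilities are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def update(priors, likelihoods):
    """One Bayesian update over exclusive and exhaustive models.

    priors[m] is P(M_m); likelihoods[m] is P(E | M_m) for the observed event E.
    """
    evidence = sum(l * p for l, p in zip(likelihoods, priors))  # P(E)
    if evidence == 0.0:
        raise ValueError("observed event has zero probability under every model")
    return [l * p / evidence for l, p in zip(likelihoods, priors)]


# Hypothetical models: a fair coin and a coin biased towards heads.
priors = [0.5, 0.5]
p_heads = [0.5, 0.8]

# Observe one head: the likelihood of the event under each model is P(heads | M_m).
posteriors = update(priors, p_heads)
print(posteriors)
```

The returned list can be fed back in as the prior for the next observation, which is the repetition the text describes.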
### Multiple observations
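For a sequence of [independent and identically distributed](https://en.wikipedia.org/wiki/Independent_and_identically_distributed "Independent and identically distributed") observations $\mathbf{E} = (e_1, \dots, e_n)$, it can be shown by induction that repeated application of the above is equivalent to $$P(M \mid \mathbf{E}) = \frac{P(\mathbf{E} \mid M)}{\sum_m P(\mathbf{E} \mid M_m)\,P(M_m)} \cdot P(M),$$ where $$P(\mathbf{E} \mid M) = \prod_k P(e_k \mid M).$$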
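The equivalence between repeated single-observation updates and one update with the product likelihood can be checked numerically. The sketch below assumes three hypothetical coin models and a short, made-up sequence of outcomes (1 for heads, 0 for tails); none of these values come from the article.

```python
from math import prod


def likelihood(p_heads: float, outcome: int) -> float:
    """P(e | M) for one coin flip under a model with the given heads probability."""
    return p_heads if outcome == 1 else 1.0 - p_heads


def normalise(weights):
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all models assign zero probability to the data")
    return [w / total for w in weights]


def sequential_posterior(priors, model_p_heads, outcomes):
    """Apply the single-observation update once per outcome."""
    beliefs = list(priors)
    for outcome in outcomes:
        beliefs = normalise(
            [likelihood(p, outcome) * b for p, b in zip(model_p_heads, beliefs)]
        )
    return beliefs


def batch_posterior(priors, model_p_heads, outcomes):
    """Apply one update using the product of per-observation likelihoods."""
    joint = [
        prod(likelihood(p, o) for o in outcomes) * prior
        for p, prior in zip(model_p_heads, priors)
    ]
    return normalise(joint)


# Hypothetical models and data.
priors = [0.5, 0.3, 0.2]
model_p_heads = [0.2, 0.5, 0.8]
outcomes = [1, 1, 0, 1, 1]

seq = sequential_posterior(priors, model_p_heads, outcomes)
bat = batch_posterior(priors, model_p_heads, outcomes)
print(seq)
print(bat)
```

The two results agree up to floating-point rounding, as the induction argument predicts.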
### Parametric formulation: motivating the formal description
By parameterizing the space of models, the belief in all models may be updated in a single step. The distribution of belief over the model space may then be thought of as a distribution of belief over the parameter space. The distributions in this section are expressed as continuous, represented by probability densities, as this is the usual situation. The technique is, however, equally applicable to discrete distributions.
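Let the vector $\boldsymbol{\theta}$ span the parameter space. Let the initial prior distribution over $\boldsymbol{\theta}$ be $p(\boldsymbol{\theta} \mid \boldsymbol{\alpha})$, where $\boldsymbol{\alpha}$ is a set of parameters to the prior itself, or *[hyperparameters](https://en.wikipedia.org/wiki/Hyperparameter_\(Bayesian_statistics\) "Hyperparameter (Bayesian statistics)")*. Let $\mathbf{E} = (e_1, \dots, e_n)$ be a sequence of [independent and identically distributed](https://en.wikipedia.org/wiki/Independent_and_identically_distributed_random_variables "Independent and identically distributed random variables") event observations, where all $e_i$ are distributed as $p(e \mid \boldsymbol{\theta})$ for some $\boldsymbol{\theta}$. [Bayes' theorem](https://en.wikipedia.org/wiki/Bayes%27_theorem "Bayes' theorem") is applied to find the [posterior distribution](https://en.wikipedia.org/wiki/Posterior_distribution "Posterior distribution") over $\boldsymbol{\theta}$:

$$p(\boldsymbol{\theta} \mid \mathbf{E}, \boldsymbol{\alpha}) = \frac{p(\mathbf{E} \mid \boldsymbol{\theta}, \boldsymbol{\alpha})}{p(\mathbf{E} \mid \boldsymbol{\alpha})} \cdot p(\boldsymbol{\theta} \mid \boldsymbol{\alpha}) = \frac{p(\mathbf{E} \mid \boldsymbol{\theta}, \boldsymbol{\alpha})}{\int p(\mathbf{E} \mid \boldsymbol{\theta}, \boldsymbol{\alpha})\, p(\boldsymbol{\theta} \mid \boldsymbol{\alpha})\, d\boldsymbol{\theta}} \cdot p(\boldsymbol{\theta} \mid \boldsymbol{\alpha}),$$ where $$p(\mathbf{E} \mid \boldsymbol{\theta}, \boldsymbol{\alpha}) = \prod_k p(e_k \mid \boldsymbol{\theta}).$$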
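For a one-dimensional parameter, the integral in the denominator can be approximated on a grid. The sketch below is one possible illustration, not a method taken from the article: it assumes a Bernoulli likelihood with success probability $\theta$ and a Beta$(a, b)$ prior (so the hyperparameters are $\boldsymbol{\alpha} = (a, b)$), and it compares the grid posterior mean with the known closed-form mean $(a + k)/(a + b + n)$ for $k$ successes in $n$ trials.

```python
def grid_posterior(successes, failures, a, b, n_grid=2000):
    """Approximate the posterior density of theta on a midpoint grid over (0, 1).

    Assumes a Bernoulli likelihood and a Beta(a, b) prior; constant factors of
    the prior are dropped because they cancel in the normalisation.
    """
    step = 1.0 / n_grid
    thetas = [(i + 0.5) * step for i in range(n_grid)]
    # Likelihood times (unnormalised) prior at each grid point.
    unnormalised = [
        t ** (successes + a - 1) * (1.0 - t) ** (failures + b - 1) for t in thetas
    ]
    # Midpoint-rule approximation of the normalising integral.
    normaliser = sum(unnormalised) * step
    density = [u / normaliser for u in unnormalised]
    return thetas, density, step


# Hypothetical data: 7 successes and 3 failures, with a Beta(2, 2) prior.
thetas, density, step = grid_posterior(successes=7, failures=3, a=2.0, b=2.0)

grid_mean = sum(t * d for t, d in zip(thetas, density)) * step
exact_mean = (2.0 + 7) / (2.0 + 2.0 + 10)
print(grid_mean, exact_mean)
```

Grid approximation only scales to a handful of parameters, which is one reason higher-dimensional problems rely on other approximation techniques.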
## Formal description of Bayesian inference
- $x$, a data point in general. This may in fact be a [vector](https://en.wikipedia.org/wiki/Random_vector "Random vector") of values.
- $\theta$, the [parameter](https://en.wikipedia.org/wiki/Parameter "Parameter") of the data point's distribution, i.e., $x \sim p(x \mid \theta)$. This may be a [vector](https://en.wikipedia.org/wiki/Random_vector "Random vector") of parameters.
- $\alpha$, the [hyperparameter](https://en.wikipedia.org/wiki/Hyperparameter_\(Bayesian_statistics\) "Hyperparameter (Bayesian statistics)") of the parameter distribution, i.e., $\theta \sim p(\theta \mid \alpha)$. This may be a [vector](https://en.wikipedia.org/wiki/Random_vector "Random vector") of hyperparameters.
- $\mathbf{X}$ is the sample, a set of $n$ observed data points, i.e., $x_1, \ldots, x_n$.
- $\tilde{x}$, a new data point whose distribution is to be predicted.
- The [prior distribution](https://en.wikipedia.org/wiki/Prior_distribution "Prior distribution") is the distribution of the parameter(s) before any data is observed, i.e. $p(\theta \mid \alpha)$. The prior distribution might not be easily determined; in such a case, one possibility may be to use the [Jeffreys prior](https://en.wikipedia.org/wiki/Jeffreys_prior "Jeffreys prior") to obtain a prior distribution before updating it with newer observations.
- The [sampling distribution](https://en.wikipedia.org/wiki/Sampling_distribution "Sampling distribution") is the distribution of the observed data conditional on its parameters, i.e. $p(\mathbf{X} \mid \theta)$. This is also termed the [likelihood](https://en.wikipedia.org/wiki/Likelihood_function "Likelihood function"), especially when viewed as a function of the parameter(s), sometimes written $\operatorname{L}(\theta \mid \mathbf{X}) = p(\mathbf{X} \mid \theta)$.
- The [marginal likelihood](https://en.wikipedia.org/wiki/Marginal_likelihood "Marginal likelihood") (sometimes also termed the *evidence*) is the distribution of the observed data [marginalized](https://en.wikipedia.org/wiki/Marginal_distribution "Marginal distribution") over the parameter(s), i.e. $p(\mathbf{X} \mid \alpha) = \int p(\mathbf{X} \mid \theta)\, p(\theta \mid \alpha)\, d\theta$. It quantifies the agreement between data and expert opinion, in a geometric sense that can be made precise.[\[7\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-deCarvalho-Geometry-7) If the marginal likelihood is 0 then there is no agreement between the data and expert opinion and Bayes' rule cannot be applied.
- The [posterior distribution](https://en.wikipedia.org/wiki/Posterior_distribution "Posterior distribution") is the distribution of the parameter(s) after taking into account the observed data. This is determined by [Bayes' rule](https://en.wikipedia.org/wiki/Bayes%27_rule "Bayes' rule"), which forms the heart of Bayesian inference: $$p(\theta \mid \mathbf{X}, \alpha) = \frac{p(\mathbf{X} \mid \theta, \alpha)\, p(\theta \mid \alpha)}{p(\mathbf{X} \mid \alpha)} \propto p(\mathbf{X} \mid \theta, \alpha)\, p(\theta \mid \alpha).$$ This is expressed in words as "posterior is proportional to likelihood times prior", or sometimes as "posterior = likelihood times prior, over evidence".
- In practice, for almost all complex Bayesian models used in machine learning, the posterior distribution $p(\theta \mid \mathbf{X}, \alpha)$ is not obtained in closed form, mainly because the parameter space for $\theta$ can be very high-dimensional, or because the Bayesian model retains a certain hierarchical structure formulated from the observations $\mathbf{X}$ and parameter $\theta$. In such situations, we need to resort to approximation techniques.[\[8\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Lee-GibbsSampler-8)
- General case: Let $P_Y^x$ be the conditional distribution of $Y$ given $X = x$ and let $P_X$ be the distribution of $X$. The joint distribution is then $P_{X,Y}(dx, dy) = P_Y^x(dy)\, P_X(dx)$. The conditional distribution $P_X^y$ of $X$ given $Y = y$ is then determined by $$P_X^y(A) = E(1_A(X) \mid Y = y).$$
Existence and uniqueness of the needed [conditional expectation](https://en.wikipedia.org/wiki/Conditional_expectation "Conditional expectation") is a consequence of the [Radon–Nikodym theorem](https://en.wikipedia.org/wiki/Radon%E2%80%93Nikodym_theorem "Radon–Nikodym theorem"). This was formulated by [Kolmogorov](https://en.wikipedia.org/wiki/Andrey_Kolmogorov "Andrey Kolmogorov") in his famous book from 1933. Kolmogorov underlines the importance of conditional probability by writing "I wish to call attention to ... and especially the theory of conditional probabilities and conditional expectations ..." in the Preface.[\[9\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-9) Bayes' theorem determines the posterior distribution from the prior distribution. Uniqueness requires continuity assumptions.[\[10\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-10) Bayes' theorem can be generalized to include improper prior distributions such as the uniform distribution on the real line.[\[11\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-11) Modern [Markov chain Monte Carlo](https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo "Markov chain Monte Carlo") methods have boosted the importance of Bayes' theorem, including cases with improper priors.[\[12\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-12)
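The relation "posterior = likelihood times prior, over evidence" can be verified pointwise in a model where every ingredient has a closed form. The following sketch assumes, purely for illustration, Bernoulli observations with a Beta$(a, b)$ prior, for which the posterior is Beta$(a + k, b + n - k)$ and the marginal likelihood of a particular sequence with $k$ successes in $n$ trials is $B(a + k, b + n - k)/B(a, b)$. The numerical values are hypothetical.

```python
from math import exp, lgamma, log


def log_beta(a: float, b: float) -> float:
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_pdf(theta: float, a: float, b: float) -> float:
    """Density of a Beta(a, b) distribution at theta in (0, 1)."""
    return exp((a - 1) * log(theta) + (b - 1) * log(1.0 - theta) - log_beta(a, b))


def likelihood(theta: float, k: int, n: int) -> float:
    """p(X | theta) for one particular sequence with k successes in n trials."""
    return theta ** k * (1.0 - theta) ** (n - k)


def marginal_likelihood(k: int, n: int, a: float, b: float) -> float:
    """p(X | alpha): the likelihood integrated against the Beta(a, b) prior."""
    return exp(log_beta(a + k, b + n - k) - log_beta(a, b))


# Hypothetical hyperparameters, data summary, and evaluation point.
a, b = 2.0, 3.0
k, n = 6, 10
theta = 0.4

closed_form = beta_pdf(theta, a + k, b + n - k)
via_bayes = likelihood(theta, k, n) * beta_pdf(theta, a, b) / marginal_likelihood(k, n, a, b)
print(closed_form, via_bayes)
```

Conjugate pairs like this one are the exception; in most models the marginal likelihood has no closed form and must be approximated.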
### Bayesian prediction
Bayesian theory calls for the use of the posterior predictive distribution to do [predictive inference](https://en.wikipedia.org/wiki/Predictive_inference "Predictive inference"), i.e., to [predict](https://en.wikipedia.org/wiki/Prediction "Prediction") the distribution of a new, unobserved data point. That is, instead of a fixed point as a prediction, a distribution over possible points is returned. Only this way is the entire posterior distribution of the parameter(s) used. By comparison, prediction in [frequentist statistics](https://en.wikipedia.org/wiki/Frequentist_statistics "Frequentist statistics") often involves finding an optimum point estimate of the parameter(s), e.g., by [maximum likelihood](https://en.wikipedia.org/wiki/Maximum_likelihood "Maximum likelihood") or [maximum a posteriori estimation](https://en.wikipedia.org/wiki/Maximum_a_posteriori_estimation "Maximum a posteriori estimation") (MAP), and then plugging this estimate into the formula for the distribution of a data point. This has the disadvantage that it does not account for any uncertainty in the value of the parameter, and hence will underestimate the [variance](https://en.wikipedia.org/wiki/Variance "Variance") of the predictive distribution.
In some instances, frequentist statistics can work around this problem. For example, [confidence intervals](https://en.wikipedia.org/wiki/Confidence_interval "Confidence interval") and [prediction intervals](https://en.wikipedia.org/wiki/Prediction_interval "Prediction interval") in frequentist statistics when constructed from a [normal distribution](https://en.wikipedia.org/wiki/Normal_distribution "Normal distribution") with unknown [mean](https://en.wikipedia.org/wiki/Mean "Mean") and [variance](https://en.wikipedia.org/wiki/Variance "Variance") are constructed using a [Student's t-distribution](https://en.wikipedia.org/wiki/Student%27s_t-distribution "Student's t-distribution"). This correctly estimates the variance, due to the facts that (1) the average of normally distributed random variables is also normally distributed, and (2) the predictive distribution of a normally distributed data point with unknown mean and variance, using conjugate or uninformative priors, has a Student's t-distribution. In Bayesian statistics, however, the posterior predictive distribution can always be determined exactly, or at least to an arbitrary level of precision when numerical methods are used.
Both types of predictive distributions have the form of a [compound probability distribution](https://en.wikipedia.org/wiki/Compound_probability_distribution "Compound probability distribution") (as does the [marginal likelihood](https://en.wikipedia.org/wiki/Marginal_likelihood "Marginal likelihood")). In fact, if the prior distribution is a [conjugate prior](https://en.wikipedia.org/wiki/Conjugate_prior "Conjugate prior"), such that the prior and posterior distributions come from the same family, it can be seen that both prior and posterior predictive distributions also come from the same family of compound distributions. The only difference is that the posterior predictive distribution uses the updated values of the hyperparameters (applying the Bayesian update rules given in the [conjugate prior](https://en.wikipedia.org/wiki/Conjugate_prior "Conjugate prior") article), while the prior predictive distribution uses the values of the hyperparameters that appear in the prior distribution.
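The variance underestimate of a plug-in prediction can be seen in a small conjugate example. The sketch below assumes, as an illustration not taken from the article, normally distributed data with known variance $\sigma^2$ and a normal prior $N(\mu_0, \tau_0^2)$ on the unknown mean. In that model the posterior of the mean is $N(\mu_n, \tau_n^2)$ and the posterior predictive distribution is $N(\mu_n, \sigma^2 + \tau_n^2)$, whereas plugging in the point estimate $\mu_n$ gives $N(\mu_n, \sigma^2)$. All numbers are hypothetical.

```python
def normal_posterior(data, sigma2, mu0, tau0_sq):
    """Posterior mean and variance of a normal mean with known data variance.

    data: observations; sigma2: known observation variance;
    mu0, tau0_sq: mean and variance of the normal prior on the mean.
    """
    n = len(data)
    precision = 1.0 / tau0_sq + n / sigma2   # posterior precision
    tau_n_sq = 1.0 / precision               # posterior variance
    mu_n = tau_n_sq * (mu0 / tau0_sq + sum(data) / sigma2)
    return mu_n, tau_n_sq


# Hypothetical observations and prior settings.
data = [4.8, 5.1, 5.3, 4.9]
sigma2 = 1.0
mu0, tau0_sq = 0.0, 100.0

mu_n, tau_n_sq = normal_posterior(data, sigma2, mu0, tau0_sq)

plug_in_variance = sigma2                 # ignores uncertainty about the mean
predictive_variance = sigma2 + tau_n_sq   # adds posterior uncertainty about the mean
print(mu_n, plug_in_variance, predictive_variance)
```

The gap between the two variances is exactly the posterior variance of the mean, so it shrinks as more data are observed.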
## Mathematical properties
### Interpretation of factor
$\frac{P(E \mid M)}{P(E)} > 1 \Rightarrow P(E \mid M) > P(E)$. That is, if the model were true, the evidence would be more likely than is predicted by the current state of belief. The reverse applies for a decrease in belief. If the belief does not change, $\frac{P(E \mid M)}{P(E)} = 1 \Rightarrow P(E \mid M) = P(E)$. That is, the evidence is independent of the model. If the model were true, the evidence would be exactly as likely as predicted by the current state of belief.
If $P(M) = 0$ then $P(M \mid E) = 0$. If $P(M) = 1$ and $P(E) > 0$, then $P(M \mid E) = 1$. This can be interpreted to mean that hard convictions are insensitive to counter-evidence.
The former follows directly from Bayes' theorem. The latter can be derived by applying the first rule to the event "not $M$" in place of "$M$", yielding "if $1 - P(M) = 0$, then $1 - P(M \mid E) = 0$", from which the result immediately follows.
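The insensitivity of priors of exactly 0 or 1 to evidence can be demonstrated with a few lines of Python. The models, likelihoods, and number of repetitions below are hypothetical and serve only to illustrate the two statements.

```python
def update(priors, likelihoods):
    """One Bayesian update over a finite set of models."""
    evidence = sum(l * p for l, p in zip(likelihoods, priors))
    if evidence == 0.0:
        raise ValueError("P(E) must be positive")
    return [l * p / evidence for l, p in zip(likelihoods, priors)]


# Model 0 has prior probability 0, although the evidence strongly favours it.
beliefs = [0.0, 0.4, 0.6]
likelihoods = [0.99, 0.01, 0.02]
for _ in range(50):
    beliefs = update(beliefs, likelihoods)

# A model held with certainty stays certain, even under unfavourable evidence.
certain = [1.0, 0.0]
for _ in range(50):
    certain = update(certain, [0.001, 0.999])

print(beliefs[0], certain[0])
```

However many times the favourable evidence is repeated, the excluded model never regains any probability, and the certain model never loses any.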
### Asymptotic behaviour of posterior
Consider the behaviour of a belief distribution as it is updated a large number of times with [independent and identically distributed](https://en.wikipedia.org/wiki/Independent_and_identically_distributed "Independent and identically distributed") trials. For sufficiently nice prior probabilities, the [Bernstein–von Mises theorem](https://en.wikipedia.org/wiki/Bernstein%E2%80%93von_Mises_theorem "Bernstein–von Mises theorem") gives that in the limit of infinite trials, the posterior converges to a [Gaussian distribution](https://en.wikipedia.org/wiki/Gaussian_distribution "Gaussian distribution") independent of the initial prior, under some conditions first outlined and rigorously proven by [Joseph L. Doob](https://en.wikipedia.org/wiki/Joseph_L._Doob "Joseph L. Doob") in 1948, namely if the random variable in consideration has a finite [probability space](https://en.wikipedia.org/wiki/Probability_space "Probability space"). More general results were obtained later by the statistician [David A. Freedman](https://en.wikipedia.org/wiki/David_A._Freedman_\(statistician\) "David A. Freedman (statistician)"), who established in two seminal research papers in 1963 [\[13\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-13) and 1965 [\[14\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-14) when and under what circumstances the asymptotic behaviour of the posterior is guaranteed. His 1963 paper treats, like Doob (1949), the finite case and comes to a satisfactory conclusion. However, if the random variable has an infinite but countable [probability space](https://en.wikipedia.org/wiki/Probability_space "Probability space") (i.e., corresponding to a die with infinitely many faces), the 1965 paper demonstrates that for a dense subset of priors the [Bernstein–von Mises theorem](https://en.wikipedia.org/wiki/Bernstein%E2%80%93von_Mises_theorem "Bernstein–von Mises theorem") is not applicable.
In this case there is [almost surely](https://en.wikipedia.org/wiki/Almost_surely "Almost surely") no asymptotic convergence. Later in the 1980s and 1990s [Freedman](https://en.wikipedia.org/wiki/David_A._Freedman_\(statistician\) "David A. Freedman (statistician)") and [Persi Diaconis](https://en.wikipedia.org/wiki/Persi_Diaconis "Persi Diaconis") continued to work on the case of infinite countable probability spaces.[\[15\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-15) To summarise, there may be insufficient trials to suppress the effects of the initial choice, and especially for large (but finite) systems the convergence might be very slow.
In parameterized form, the prior distribution is often assumed to come from a family of distributions called [conjugate priors](https://en.wikipedia.org/wiki/Conjugate_prior "Conjugate prior"). The usefulness of a conjugate prior is that the corresponding posterior distribution will be in the same family, and the calculation may be expressed in [closed form](https://en.wikipedia.org/wiki/Closed-form_expression "Closed-form expression").
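As an illustration of a closed-form conjugate update, the sketch below uses the Beta prior for a binomial success probability, a standard conjugate pair. The function names and the data (7 successes, 3 failures) are assumptions chosen for the example.

```python
from fractions import Fraction


def beta_binomial_update(alpha, beta, successes, failures):
    """Return the Beta posterior hyperparameters after observing binomial data."""
    # Conjugacy: a Beta(alpha, beta) prior with a binomial likelihood
    # gives a Beta(alpha + successes, beta + failures) posterior.
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution, kept exact with Fraction."""
    return Fraction(alpha, alpha + beta)


# Beta(1, 1) is the uniform prior on the success probability.
prior = (1, 1)
posterior = beta_binomial_update(*prior, successes=7, failures=3)

print(posterior)              # (8, 4)
print(beta_mean(*posterior))  # 2/3
```

The posterior stays in the Beta family, so no numerical integration is needed: the update amounts to adding counts to the hyperparameters.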
### Estimates of parameters and predictions
It is often desired to use a posterior distribution to estimate a parameter or variable. Several methods of Bayesian estimation select [measurements of central tendency](https://en.wikipedia.org/wiki/Central_tendency "Central tendency") from the posterior distribution.
For one-dimensional problems, a unique median exists for practical continuous problems. The posterior median is attractive as a [robust estimator](https://en.wikipedia.org/wiki/Robust_statistics "Robust statistics").[\[16\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-16)
If there exists a finite mean for the posterior distribution, then the posterior mean is a method of estimation.[\[17\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-17) $$\tilde{\theta} = \operatorname{E}[\theta] = \int \theta \, p(\theta \mid \mathbf{X}, \alpha) \, d\theta$$
Taking a value with the greatest probability defines [maximum *a posteriori* (MAP)](https://en.wikipedia.org/wiki/Maximum_a_posteriori_estimation "Maximum a posteriori estimation") estimates:[\[18\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-18) $$\{\theta_{\text{MAP}}\} \subset \arg\max_{\theta} p(\theta \mid \mathbf{X}, \alpha)$$
There are examples where no maximum is attained, in which case the set of MAP estimates is [empty](https://en.wikipedia.org/wiki/Empty_set "Empty set").
There are other methods of estimation that minimize the posterior *[risk](https://en.wikipedia.org/wiki/Risk "Risk")* (expected-posterior loss) with respect to a [loss function](https://en.wikipedia.org/wiki/Loss_function "Loss function"), and these are of interest to [statistical decision theory](https://en.wikipedia.org/wiki/Statistical_decision_theory "Statistical decision theory") using the sampling distribution ("frequentist statistics").[\[19\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-19)
The [posterior predictive distribution](https://en.wikipedia.org/wiki/Posterior_predictive_distribution "Posterior predictive distribution") of a new observation $\tilde{x}$ (that is independent of previous observations) is determined by[\[20\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-20) $$p(\tilde{x} \mid \mathbf{X}, \alpha) = \int p(\tilde{x} \mid \theta) \, p(\theta \mid \mathbf{X}, \alpha) \, d\theta$$
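The point estimates discussed in this section can be approximated on a grid when the posterior is known only up to a constant. The sketch below is one way to do this, assuming a one-dimensional parameter on $[0, 1]$ and, purely for illustration, a Beta(8, 4) posterior kernel; the helper name `posterior_summaries` is hypothetical.

```python
def posterior_summaries(grid, density):
    """Posterior mean, median and MAP from an unnormalised density on a uniform grid."""
    total = sum(density)
    weights = [d / total for d in density]  # normalised grid probabilities

    mean = sum(t * w for t, w in zip(grid, weights))

    # Median: first grid point at which the cumulative mass reaches one half.
    cumulative = 0.0
    median = grid[-1]
    for t, w in zip(grid, weights):
        cumulative += w
        if cumulative >= 0.5:
            median = t
            break

    # MAP: grid point with the largest posterior weight.
    map_estimate = max(zip(weights, grid))[1]
    return mean, median, map_estimate


n = 2001
grid = [i / (n - 1) for i in range(n)]
density = [t**7 * (1 - t) ** 3 for t in grid]  # Beta(8, 4) kernel (assumed example)

mean, median, map_estimate = posterior_summaries(grid, density)

# For a Bernoulli observation, the posterior predictive probability of a
# success is the integral of theta against the posterior, i.e. the posterior mean.
predictive_success = mean

print(round(mean, 3), round(median, 3), round(map_estimate, 3))
```

For this skewed posterior the three estimates differ: the mean lies below the median, which lies below the mode, so the choice of estimator matters.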
### Probability of a hypothesis
| Cookie \\ Bowl | \#1 (*H*1) | \#2 (*H*2) | Total |
|---|---|---|---|
| Plain, *E* | **30** | 20 | **50** |
| Choc, ¬*E* | 10 | 20 | 30 |
| Total | 40 | 40 | 80 |
| *P*(*H*1\|*E*) = 30 / 50 = 0.6 | | | |
Suppose there are two full bowls of cookies. Bowl \#1 has 10 chocolate chip and 30 plain cookies, while bowl \#2 has 20 of each. Our friend Fred picks a bowl at random, and then picks a cookie at random. We may assume there is no reason to believe Fred treats one bowl differently from another, likewise for the cookies. The cookie turns out to be a plain one. How probable is it that Fred picked it out of bowl \#1?
Intuitively, it seems clear that the answer should be more than a half, since there are more plain cookies in bowl \#1. The precise answer is given by Bayes' theorem. Let $H_1$ correspond to bowl \#1, and $H_2$ to bowl \#2. It is given that the bowls are identical from Fred's point of view, thus $P(H_1) = P(H_2)$, and the two must add up to 1, so both are equal to 0.5. The event $E$ is the observation of a plain cookie. From the contents of the bowls, we know that $P(E \mid H_1) = 30/40 = 0.75$ and $P(E \mid H_2) = 20/40 = 0.5$. Bayes' formula then yields $$P(H_1 \mid E) = \frac{P(E \mid H_1)\,P(H_1)}{P(E \mid H_1)\,P(H_1) + P(E \mid H_2)\,P(H_2)} = \frac{0.75 \times 0.5}{0.75 \times 0.5 + 0.5 \times 0.5} = 0.6$$
Before we observed the cookie, the probability we assigned for Fred having chosen bowl \#1 was the prior probability, $P(H_1)$, which was 0.5. After observing the cookie, we must revise the probability to $P(H_1 \mid E)$, which is 0.6.
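The cookie calculation can be reproduced in a few lines. This is a minimal sketch; the dictionary keys and the function name `posterior` are illustrative choices, while the numbers are those of the example.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' theorem to a finite set of mutually exclusive hypotheses."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())  # total probability of the observation
    return {h: p / evidence for h, p in joint.items()}


# Fred picks either bowl with equal probability.
priors = {"bowl1": 0.5, "bowl2": 0.5}

# Probability of drawing a plain cookie from each bowl.
likelihood_plain = {"bowl1": 30 / 40, "bowl2": 20 / 40}

result = posterior(priors, likelihood_plain)
print(result)
```

The value for `"bowl1"` matches the 0.6 obtained from the table, and the two posterior probabilities sum to 1.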
### Making a prediction
*Figure: Example results for the archaeology example. This simulation was generated using c = 15.2.*
An archaeologist is working at a site thought to be from the medieval period, from the 11th century to the 16th century. However, it is uncertain exactly when in this period the site was inhabited. Fragments of pottery are found, some of which are glazed and some of which are decorated. It is expected that if the site were inhabited during the early medieval period, then 1% of the pottery would be glazed and 50% of its area decorated, whereas if it had been inhabited in the late medieval period then 81% would be glazed and 5% of its area decorated. How confident can the archaeologist be in the date of inhabitation as fragments are unearthed?
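The degree of belief in the continuous variable $C$ (century) is to be calculated, with the discrete set of events $\{GD, G\bar{D}, \bar{G}D, \bar{G}\bar{D}\}$ as evidence. Assuming linear variation of glaze and decoration with time, and that these variables are independent,
$$P(E = GD \mid C = c) = \left(0.01 + \frac{0.81 - 0.01}{16 - 11}(c - 11)\right)\left(0.5 - \frac{0.5 - 0.05}{16 - 11}(c - 11)\right)$$
$$P(E = G\bar{D} \mid C = c) = \left(0.01 + \frac{0.81 - 0.01}{16 - 11}(c - 11)\right)\left(0.5 + \frac{0.5 - 0.05}{16 - 11}(c - 11)\right)$$
$$P(E = \bar{G}D \mid C = c) = \left((1 - 0.01) - \frac{0.81 - 0.01}{16 - 11}(c - 11)\right)\left(0.5 - \frac{0.5 - 0.05}{16 - 11}(c - 11)\right)$$
$$P(E = \bar{G}\bar{D} \mid C = c) = \left((1 - 0.01) - \frac{0.81 - 0.01}{16 - 11}(c - 11)\right)\left(0.5 + \frac{0.5 - 0.05}{16 - 11}(c - 11)\right)$$
Assume a uniform prior of $f_C(c) = 0.2$, and that trials are [independent and identically distributed](https://en.wikipedia.org/wiki/Independent_and_identically_distributed "Independent and identically distributed"). When a new fragment of type $e$ is discovered, Bayes' theorem is applied to update the degree of belief for each $c$: $$f_C(c \mid E = e) = \frac{P(E = e \mid C = c)}{P(E = e)} f_C(c) = \frac{P(E = e \mid C = c)}{\int_{11}^{16} P(E = e \mid C = c)\, f_C(c)\, dc} f_C(c)$$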
A computer simulation of the changing belief as 50 fragments are unearthed is shown on the graph. In the simulation, the site was inhabited around 1420, or $c = 15.2$. By calculating the area under the relevant portion of the graph for 50 trials, the archaeologist can say that there is practically no chance the site was inhabited in the 11th and 12th centuries, about 1% chance that it was inhabited during the 13th century, 63% chance during the 14th century and 36% during the 15th century. The [Bernstein–von Mises theorem](https://en.wikipedia.org/wiki/Bernstein%E2%80%93von_Mises_theorem "Bernstein–von Mises theorem") asserts here the asymptotic convergence to the "true" distribution because the [probability space](https://en.wikipedia.org/wiki/Probability_space "Probability space") corresponding to the discrete set of events $\{GD, G\bar{D}, \bar{G}D, \bar{G}\bar{D}\}$ is finite (see above section on asymptotic behaviour of the posterior).
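A simulation of this kind can be sketched by discretising the century variable on a grid and applying Bayes' theorem once per fragment. The code below is an illustrative reconstruction under the stated linear and independence assumptions, not the program used for the article's graph; the grid size, the random seed and all function names are assumptions.

```python
import random


def p_glazed(c):
    """Probability that a fragment is glazed, linear from 1% (c=11) to 81% (c=16)."""
    return 0.01 + (0.81 - 0.01) / (16 - 11) * (c - 11)


def p_decorated(c):
    """Probability that a fragment is decorated, linear from 50% (c=11) to 5% (c=16)."""
    return 0.5 - (0.5 - 0.05) / (16 - 11) * (c - 11)


def likelihood(fragment, c):
    """P(E = fragment | C = c) for a (glazed, decorated) pair of booleans."""
    glazed, decorated = fragment
    pg = p_glazed(c) if glazed else 1 - p_glazed(c)
    pd = p_decorated(c) if decorated else 1 - p_decorated(c)
    return pg * pd  # glaze and decoration are assumed independent


def update(grid, belief, fragment):
    """One Bayesian update of the belief density over the grid of centuries."""
    step = grid[1] - grid[0]
    unnormalised = [likelihood(fragment, c) * b for c, b in zip(grid, belief)]
    evidence = sum(unnormalised) * step  # Riemann-sum approximation of P(E = e)
    return [u / evidence for u in unnormalised]


def simulate(true_c=15.2, n_fragments=50, n_grid=501, seed=0):
    """Draw fragments from the model at true_c and update a uniform prior."""
    rng = random.Random(seed)
    grid = [11 + 5 * i / (n_grid - 1) for i in range(n_grid)]
    belief = [0.2] * n_grid  # uniform prior density on [11, 16]
    for _ in range(n_fragments):
        fragment = (rng.random() < p_glazed(true_c),
                    rng.random() < p_decorated(true_c))
        belief = update(grid, belief, fragment)
    return grid, belief


grid, belief = simulate()
step = grid[1] - grid[0]
posterior_mean = sum(c * b for c, b in zip(grid, belief)) * step
print(round(posterior_mean, 2))
```

The resulting density depends on the random draws, so the percentages quoted above should not be expected to reappear exactly for a different seed.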
## In frequentist statistics and decision theory
A [decision-theoretic](https://en.wikipedia.org/wiki/Statistical_decision_theory "Statistical decision theory") justification of the use of Bayesian inference was given by [Abraham Wald](https://en.wikipedia.org/wiki/Abraham_Wald "Abraham Wald"), who proved that every unique Bayesian procedure is [admissible](https://en.wikipedia.org/wiki/Admissible_decision_rule "Admissible decision rule"). Conversely, every [admissible](https://en.wikipedia.org/wiki/Admissible_decision_rule "Admissible decision rule") statistical procedure is either a Bayesian procedure or a limit of Bayesian procedures.[\[21\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Bickel_&_Doksum_2001,_page_32-21)
Wald characterized admissible procedures as Bayesian procedures (and limits of Bayesian procedures), making the Bayesian formalism a central technique in such areas of [frequentist inference](https://en.wikipedia.org/wiki/Frequentist_inference "Frequentist inference") as [parameter estimation](https://en.wikipedia.org/wiki/Parameter_estimation "Parameter estimation"), [hypothesis testing](https://en.wikipedia.org/wiki/Hypothesis_testing "Hypothesis testing"), and computing [confidence intervals](https://en.wikipedia.org/wiki/Confidence_intervals "Confidence intervals").[\[22\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-22)[\[23\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-23)[\[24\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-24) For example:
- "Under some conditions, all admissible procedures are either Bayes procedures or limits of Bayes procedures (in various senses). These remarkable results, at least in their original form, are due essentially to Wald. They are useful because the property of being Bayes is easier to analyze than admissibility."[\[21\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Bickel_&_Doksum_2001,_page_32-21)
- "In decision theory, a quite general method for proving admissibility consists in exhibiting a procedure as a unique Bayes solution."[\[25\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-25)
- "In the first chapters of this work, prior distributions with finite support and the corresponding Bayes procedures were used to establish some of the main theorems relating to the comparison of experiments. Bayes procedures with respect to more general prior distributions have played a very important role in the development of statistics, including its asymptotic theory." "There are many problems where a glance at posterior distributions, for suitable priors, yields immediately interesting information. Also, this technique can hardly be avoided in sequential analysis."[\[26\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-26)
- "A useful fact is that any Bayes decision rule obtained by taking a proper prior over the whole parameter space must be admissible"[\[27\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-27)
- "An important area of investigation in the development of admissibility ideas has been that of conventional sampling-theory procedures, and many interesting results have been obtained."[\[28\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-28)
Bayesian methodology also plays a role in [model selection](https://en.wikipedia.org/wiki/Model_selection "Model selection"), where the aim is to select, from a set of competing models, the one that most closely represents the underlying process that generated the observed data. In Bayesian model comparison, the model with the highest [posterior probability](https://en.wikipedia.org/wiki/Posterior_probability "Posterior probability") given the data is selected. The posterior probability of a model depends on the evidence, or [marginal likelihood](https://en.wikipedia.org/wiki/Marginal_likelihood "Marginal likelihood"), which reflects the probability that the data is generated by the model, and on the [prior belief](https://en.wikipedia.org/wiki/Prior_probability "Prior probability") of the model. When two competing models are a priori considered to be equiprobable, the ratio of their posterior probabilities corresponds to the [Bayes factor](https://en.wikipedia.org/wiki/Bayes_factor "Bayes factor"). Since Bayesian model comparison is aimed at selecting the model with the highest posterior probability, this methodology is also referred to as the maximum a posteriori (MAP) selection rule[\[29\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-29) or the MAP probability rule.[\[30\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-30)
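Bayesian model comparison can be sketched for a simple coin-tossing problem. The two models here (a fair coin, and a coin whose bias has a uniform prior) and the data (9 heads in 10 tosses) are assumptions chosen because both marginal likelihoods have closed forms; they are not taken from the article.

```python
from math import comb


def evidence_fair(k, n):
    """Marginal likelihood of k heads in n tosses under a fair coin (theta = 0.5)."""
    return comb(n, k) * 0.5**n


def evidence_uniform(k, n):
    """Marginal likelihood under a uniform prior on the bias theta.

    Integrating the binomial likelihood over theta in [0, 1] gives 1 / (n + 1).
    """
    return 1 / (n + 1)


def posterior_model_probs(evidences, priors):
    """Posterior model probabilities from marginal likelihoods and model priors."""
    joint = [e * p for e, p in zip(evidences, priors)]
    total = sum(joint)
    return [j / total for j in joint]


k, n = 9, 10
evidences = [evidence_fair(k, n), evidence_uniform(k, n)]

# Bayes factor of the uniform-bias model against the fair-coin model.
bayes_factor = evidences[1] / evidences[0]

# With equiprobable models, the posterior odds equal the Bayes factor.
probs = posterior_model_probs(evidences, priors=[0.5, 0.5])
print(round(bayes_factor, 2), [round(p, 3) for p in probs])
```

Under the MAP selection rule, the model with the larger entry in `probs` would be selected; here that is the uniform-bias model.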
## Probabilistic programming
While conceptually simple, Bayesian methods can be mathematically and numerically challenging. Probabilistic programming languages (PPLs) implement functions to easily build Bayesian models together with efficient automatic inference methods. This helps separate the model building from the inference, allowing practitioners to focus on their specific problems and leaving PPLs to handle the computational details for them.[\[31\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-31)[\[32\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-32)[\[33\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-33)
### Statistical data analysis
See the separate Wikipedia entry on [Bayesian statistics](https://en.wikipedia.org/wiki/Bayesian_statistics "Bayesian statistics"), specifically the [statistical modeling](https://en.wikipedia.org/wiki/Bayesian_statistics#Statistical_modeling "Bayesian statistics") section in that page.
### Computer applications
Bayesian inference has applications in [artificial intelligence](https://en.wikipedia.org/wiki/Artificial_intelligence "Artificial intelligence") and [expert systems](https://en.wikipedia.org/wiki/Expert_system "Expert system"). Bayesian inference techniques have been a fundamental part of computerized [pattern recognition](https://en.wikipedia.org/wiki/Pattern_recognition "Pattern recognition") techniques since the late 1950s.[\[34\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-34) There is also an ever-growing connection between Bayesian methods and simulation-based [Monte Carlo](https://en.wikipedia.org/wiki/Monte_Carlo_method "Monte Carlo method") techniques, since complex models cannot be processed in closed form by a Bayesian analysis, while a [graphical model](https://en.wikipedia.org/wiki/Graphical_model "Graphical model") structure *may* allow for efficient simulation algorithms such as [Gibbs sampling](https://en.wikipedia.org/wiki/Gibbs_sampling "Gibbs sampling") and other [Metropolis–Hastings algorithm](https://en.wikipedia.org/wiki/Metropolis%E2%80%93Hastings_algorithm "Metropolis–Hastings algorithm") schemes.[\[35\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-35) Bayesian inference has gained popularity among the [phylogenetics](https://en.wikipedia.org/wiki/Phylogenetics "Phylogenetics") community for these reasons; a number of applications allow many demographic and evolutionary parameters to be estimated simultaneously.
As applied to [statistical classification](https://en.wikipedia.org/wiki/Statistical_classification "Statistical classification"), Bayesian inference has been used to develop algorithms for identifying [e-mail spam](https://en.wikipedia.org/wiki/E-mail_spam "E-mail spam"). Applications which make use of Bayesian inference for spam filtering include [CRM114](https://en.wikipedia.org/wiki/CRM114_\(program\) "CRM114 (program)"), DSPAM, [Bogofilter](https://en.wikipedia.org/wiki/Bogofilter "Bogofilter"), [SpamAssassin](https://en.wikipedia.org/wiki/SpamAssassin "SpamAssassin"), [SpamBayes](https://en.wikipedia.org/wiki/SpamBayes "SpamBayes"), [Mozilla](https://en.wikipedia.org/wiki/Mozilla "Mozilla"), XEAMS, and others. Spam classification is treated in more detail in the article on the [naïve Bayes classifier](https://en.wikipedia.org/wiki/Na%C3%AFve_Bayes_classifier "Naïve Bayes classifier").
[Solomonoff's Inductive inference](https://en.wikipedia.org/wiki/Solomonoff%27s_theory_of_inductive_inference "Solomonoff's theory of inductive inference") is the theory of prediction based on observations; for example, predicting the next symbol based upon a given series of symbols. The only assumption is that the environment follows some unknown but computable [probability distribution](https://en.wikipedia.org/wiki/Probability_distribution "Probability distribution"). It is a formal inductive framework that combines two well-studied principles of inductive inference: Bayesian statistics and [Occam's Razor](https://en.wikipedia.org/wiki/Occam%27s_Razor "Occam's Razor").[\[36\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-36) Solomonoff's universal prior probability of any prefix *p* of a computable sequence *x* is the sum of the probabilities of all programs (for a universal computer) that compute something starting with *p*. Given some *p* and any computable but unknown probability distribution from which *x* is sampled, the universal prior and Bayes' theorem can be used to predict the yet unseen parts of *x* in optimal fashion.[\[37\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-37)[\[38\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-38)
### Bioinformatics and healthcare applications
Bayesian inference has been applied in different [bioinformatics](https://en.wikipedia.org/wiki/Bioinformatics "Bioinformatics") applications, including differential gene expression analysis.[\[39\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-:edgr-39) Bayesian inference is also used in a general cancer risk model, called [CIRI](https://en.wikipedia.org/wiki/Continuous_Individualized_Risk_Index "Continuous Individualized Risk Index") (Continuous Individualized Risk Index), where serial measurements are incorporated to update a Bayesian model which is primarily built from prior knowledge.[\[40\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-40)[\[41\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-41)
### Cosmology and astrophysical applications
The Bayesian approach has been central to recent progress in cosmology and astrophysical applications,[\[42\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-42)[\[43\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-43) and extends to a wide range of astrophysical problems, including the characterisation of exoplanets (such as fitting the atmosphere of [K2-18b](https://en.wikipedia.org/wiki/K2-18b "K2-18b")[\[44\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-44)), parameter constraints with cosmological data,[\[45\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-ArXiv_1807-45) and calibration in astrophysical experiments.[\[46\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-46)
In cosmology, it is often employed with computational techniques such as [Markov chain Monte Carlo](https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo "Markov chain Monte Carlo") (MCMC) and [nested sampling](https://en.wikipedia.org/wiki/Nested_sampling_algorithm "Nested sampling algorithm") to analyse complex datasets and navigate high-dimensional parameter spaces. A notable application is to the Planck 2018 CMB data for parameter inference.[\[45\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-ArXiv_1807-45) The six base cosmological parameters in the [Lambda-CDM model](https://en.wikipedia.org/wiki/Lambda-CDM_model "Lambda-CDM model") are not predicted by a theory, but rather fitted from cosmic microwave background (CMB) data to a chosen model of cosmology (the Lambda-CDM model).[\[47\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-47) The Bayesian cosmology code `cobaya`[\[48\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-48) sets up cosmological runs and interfaces cosmological likelihoods and a Boltzmann code,[\[49\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-49)[\[50\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-50) which computes the predicted CMB anisotropies for any given set of cosmological parameters, with an MCMC or nested sampler.
This computational framework is not limited to the standard model; it is also essential for testing alternative or extended theories of cosmology, such as theories with early dark energy,[\[51\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-51) or modified gravity theories introducing additional parameters beyond Lambda-CDM. [Bayesian model comparison](https://en.wikipedia.org/wiki/Bayesian_model_comparison "Bayesian model comparison") can then be employed to calculate the evidence for competing models, providing a statistical basis to assess whether the data support them over the standard Lambda-CDM.[\[52\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-52)
Bayesian inference can be used by jurors to coherently accumulate the evidence for and against a defendant, and to see whether, in totality, it meets their personal threshold for "[beyond a reasonable doubt](https://en.wikipedia.org/wiki/Beyond_a_reasonable_doubt "Beyond a reasonable doubt")".[\[53\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-53)[\[54\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-54)[\[55\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-55) Bayes' theorem is applied successively to all evidence presented, with the posterior from one stage becoming the prior for the next. The benefit of a Bayesian approach is that it gives the juror an unbiased, rational mechanism for combining evidence. It may be appropriate to explain Bayes' theorem to jurors in [odds form](https://en.wikipedia.org/wiki/Bayes%27_rule "Bayes' rule"), as [betting odds](https://en.wikipedia.org/wiki/Betting_odds "Betting odds") are more widely understood than probabilities. Alternatively, a [logarithmic approach](https://en.wikipedia.org/wiki/Gambling_and_information_theory "Gambling and information theory"), replacing multiplication with addition, might be easier for a jury to handle.
*Figure: Adding up evidence.*
If the existence of the crime is not in doubt, only the identity of the culprit, it has been suggested that the prior should be uniform over the qualifying population.[\[56\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-56) For example, if 1,000 people could have committed the crime, the prior probability of guilt would be 1/1000.
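The successive updating described for jurors can be sketched in log-odds form, where each piece of evidence contributes an additive term. In the sketch below the prior of 1/1000 follows the example above, while the likelihood ratios (how much more probable each piece of evidence is under guilt than under innocence) are hypothetical values invented for illustration.

```python
from math import exp, log


def accumulate(prior_probability, likelihood_ratios):
    """Combine a prior with independent pieces of evidence by adding log-odds."""
    # Convert the prior probability to log-odds.
    log_odds = log(prior_probability / (1 - prior_probability))
    for ratio in likelihood_ratios:
        # Each likelihood ratio P(evidence | guilty) / P(evidence | innocent)
        # shifts the log-odds; the posterior of one step is the prior of the next.
        log_odds += log(ratio)
    odds = exp(log_odds)
    return odds / (1 + odds)  # back to a probability


# Uniform prior over 1,000 people who could have committed the crime.
prior = 1 / 1000

# Hypothetical likelihood ratios for three pieces of evidence; the last one
# (ratio below 1) counts in the defendant's favour.
ratios = [100, 50, 0.5]

posterior_guilt = accumulate(prior, ratios)
print(round(posterior_guilt, 3))
```

Because the terms are added, the order in which the evidence is presented does not affect the result, provided the pieces of evidence are conditionally independent as this sketch assumes.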
The use of Bayes' theorem by jurors is controversial. In the United Kingdom, a defence [expert witness](https://en.wikipedia.org/wiki/Expert_witness "Expert witness") explained Bayes' theorem to the jury in *[R v Adams](https://en.wikipedia.org/wiki/Regina_versus_Denis_John_Adams "Regina versus Denis John Adams")*. The jury convicted, but the case went to appeal on the basis that no means of accumulating evidence had been provided for jurors who did not wish to use Bayes' theorem. The Court of Appeal upheld the conviction, but it also gave the opinion that "To introduce Bayes' Theorem, or any similar method, into a criminal trial plunges the jury into inappropriate and unnecessary realms of theory and complexity, deflecting them from their proper task."
Gardner-Medwin[\[57\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-57) argues that the criterion on which a verdict in a criminal trial should be based is *not* the probability of guilt, but rather the *probability of the evidence, given that the defendant is innocent* (akin to a [frequentist](https://en.wikipedia.org/wiki/Frequentist "Frequentist") [p-value](https://en.wikipedia.org/wiki/P-value "P-value")). He argues that if the posterior probability of guilt is to be computed by Bayes' theorem, the prior probability of guilt must be known. This will depend on the incidence of the crime, which is an unusual piece of evidence to consider in a criminal trial. Consider the following three propositions:
*A* – the known facts and testimony could have arisen if the defendant is guilty.
*B* – the known facts and testimony could have arisen if the defendant is innocent.
*C* – the defendant is guilty.
Gardner-Medwin argues that the jury should believe both *A* and not-*B* in order to convict. *A* and not-*B* implies the truth of *C*, but the reverse is not true. It is possible that *B* and *C* are both true, but in this case he argues that a jury should acquit, even though they know that they will be letting some guilty people go free. See also [Lindley's paradox](https://en.wikipedia.org/wiki/Lindley%27s_paradox "Lindley's paradox").
### Bayesian epistemology
[Bayesian epistemology](https://en.wikipedia.org/wiki/Bayesian_epistemology "Bayesian epistemology") is a movement that advocates for Bayesian inference as a means of justifying the rules of inductive logic.
[Karl Popper](https://en.wikipedia.org/wiki/Karl_Popper "Karl Popper") and [David Miller](https://en.wikipedia.org/wiki/David_Miller_\(philosopher\) "David Miller (philosopher)") have rejected the idea of Bayesian rationalism, i.e. using Bayes rule to make epistemological inferences:[\[58\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-58) It is prone to the same [vicious circle](https://en.wikipedia.org/wiki/Vicious_circle "Vicious circle") as any other [justificationist](https://en.wikipedia.org/wiki/Justificationism "Justificationism") epistemology, because it presupposes what it attempts to justify. According to this view, a rational interpretation of Bayesian inference would see it merely as a probabilistic version of [falsification](https://en.wikipedia.org/wiki/Falsifiability "Falsifiability"), rejecting the belief, commonly held by Bayesians, that high likelihood achieved by a series of Bayesian updates would prove the hypothesis beyond any reasonable doubt, or even with likelihood greater than 0.
- The [scientific method](https://en.wikipedia.org/wiki/Scientific_method "Scientific method") is sometimes interpreted as an application of Bayesian inference. In this view, Bayes' rule guides (or should guide) the updating of probabilities about [hypotheses](https://en.wikipedia.org/wiki/Hypothesis "Hypothesis") conditional on new observations or [experiments](https://en.wikipedia.org/wiki/Experiment "Experiment").[\[59\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-59) Bayesian inference has also been applied to [stochastic scheduling](https://en.wikipedia.org/wiki/Stochastic_scheduling "Stochastic scheduling") problems with incomplete information by Cai et al. (2009).[\[60\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Cai_et_al._2009-60)
- [Bayesian search theory](https://en.wikipedia.org/wiki/Bayesian_search_theory "Bayesian search theory") is used to search for lost objects.
- [Bayesian inference in phylogeny](https://en.wikipedia.org/wiki/Bayesian_inference_in_phylogeny "Bayesian inference in phylogeny")
- [Bayesian tool for methylation analysis](https://en.wikipedia.org/wiki/Bayesian_tool_for_methylation_analysis "Bayesian tool for methylation analysis")
- [Bayesian approaches to brain function](https://en.wikipedia.org/wiki/Bayesian_approaches_to_brain_function "Bayesian approaches to brain function") investigate the brain as a Bayesian mechanism.
- Bayesian inference in ecological studies[\[61\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-61)[\[62\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-62)
- Bayesian inference is used to estimate parameters in stochastic chemical kinetic models[\[63\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-63)
- Bayesian inference in [econophysics](https://en.wikipedia.org/wiki/Econophysics "Econophysics") for currency or prediction of trend changes in financial quotations[\[64\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-64)
- [Bayesian inference in marketing](https://en.wikipedia.org/wiki/Bayesian_inference_in_marketing "Bayesian inference in marketing")
- [Bayesian inference in motor learning](https://en.wikipedia.org/wiki/Bayesian_inference_in_motor_learning "Bayesian inference in motor learning")
- Bayesian inference is used in [probabilistic numerics](https://en.wikipedia.org/wiki/Probabilistic_numerics "Probabilistic numerics") to solve numerical problems
## Bayes and Bayesian inference
The problem considered by Bayes in Proposition 9 of his essay, "[An Essay Towards Solving a Problem in the Doctrine of Chances](https://en.wikipedia.org/wiki/An_Essay_Towards_Solving_a_Problem_in_the_Doctrine_of_Chances "An Essay Towards Solving a Problem in the Doctrine of Chances")", is the posterior distribution for the parameter *a* (the success rate) of the [binomial distribution](https://en.wikipedia.org/wiki/Binomial_distribution "Binomial distribution").
The term *Bayesian* refers to [Thomas Bayes](https://en.wikipedia.org/wiki/Thomas_Bayes "Thomas Bayes") (1701–1761), who proved that probabilistic limits could be placed on an unknown event.[\[65\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-65) However, it was [Pierre-Simon Laplace](https://en.wikipedia.org/wiki/Pierre-Simon_Laplace "Pierre-Simon Laplace") (1749–1827) who introduced (as Principle VI) what is now called [Bayes' theorem](https://en.wikipedia.org/wiki/Bayes%27_theorem "Bayes' theorem") and used it to address problems in [celestial mechanics](https://en.wikipedia.org/wiki/Celestial_mechanics "Celestial mechanics"), medical statistics, [reliability](https://en.wikipedia.org/wiki/Reliability_\(statistics\) "Reliability (statistics)"), and [jurisprudence](https://en.wikipedia.org/wiki/Jurisprudence "Jurisprudence").[\[66\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Stigler1986-66) Early Bayesian inference, which used uniform priors following Laplace's [principle of insufficient reason](https://en.wikipedia.org/wiki/Principle_of_insufficient_reason "Principle of insufficient reason"), was called "[inverse probability](https://en.wikipedia.org/wiki/Inverse_probability "Inverse probability")" (because it [infers](https://en.wikipedia.org/wiki/Inductive_reasoning "Inductive reasoning") backwards from observations to parameters, or from effects to causes[\[67\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Fienberg2006-67)). After the 1920s, "inverse probability" was largely supplanted by a collection of methods that came to be called [frequentist statistics](https://en.wikipedia.org/wiki/Frequentist_statistics "Frequentist statistics").[\[67\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Fienberg2006-67)
In the 20th century, the ideas of Laplace were further developed in two different directions, giving rise to *objective* and *subjective* currents in Bayesian practice. In the objective or "non-informative" current, the statistical analysis depends only on the model assumed, the data analyzed,[\[68\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Bernardo2005-68) and the method of assigning the prior, which differs from one objective Bayesian practitioner to another. In the subjective or "informative" current, the specification of the prior depends on the belief (that is, propositions on which the analysis is prepared to act), which can summarize information from experts, previous studies, etc.
In the 1980s, research on and applications of Bayesian methods grew dramatically, a growth mostly attributed to the discovery of [Markov chain Monte Carlo](https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo "Markov chain Monte Carlo") methods, which removed many of the computational problems, and to an increasing interest in nonstandard, complex applications.[\[69\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Wolpert2004-69) Despite the growth of Bayesian research, most undergraduate teaching is still based on frequentist statistics.[\[70\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Bernardo2006-70) Nonetheless, Bayesian methods are widely accepted and used, for example in the field of [machine learning](https://en.wikipedia.org/wiki/Machine_learning "Machine learning").[\[71\]](https://en.wikipedia.org/wiki/Bayesian_inference#cite_note-Bishop2007-71)
## See also
- [Bayesian approaches to brain function](https://en.wikipedia.org/wiki/Bayesian_approaches_to_brain_function "Bayesian approaches to brain function")
- [Credibility theory](https://en.wikipedia.org/wiki/Credibility_theory "Credibility theory")
- [Epistemology](https://en.wikipedia.org/wiki/Epistemology "Epistemology")
- [Free energy principle](https://en.wikipedia.org/wiki/Free_energy_principle "Free energy principle")
- [Inductive probability](https://en.wikipedia.org/wiki/Inductive_probability "Inductive probability")
- [Information field theory](https://en.wikipedia.org/wiki/Information_field_theory "Information field theory")
- [Principle of maximum entropy](https://en.wikipedia.org/wiki/Principle_of_maximum_entropy "Principle of maximum entropy")
- [Probabilistic causation](https://en.wikipedia.org/wiki/Probabilistic_causation "Probabilistic causation")
- [Probabilistic programming](https://en.wikipedia.org/wiki/Probabilistic_programming "Probabilistic programming")
## References
1. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-1)**
["Bayesian"](https://www.merriam-webster.com/dictionary/Bayesian). *[Merriam-Webster.com Dictionary](https://en.wikipedia.org/wiki/Merriam-Webster "Merriam-Webster")*. Merriam-Webster. [OCLC](https://en.wikipedia.org/wiki/OCLC_\(identifier\) "OCLC (identifier)") [1032680871](https://search.worldcat.org/oclc/1032680871).
2. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-2)**
Griffiths, Thomas (July 24, 2024). ["Bayesian Models of Cognition"](https://oecs.mit.edu/pub/lwxmte1p/release/2).
3. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-3)**
Hacking, Ian (December 1967). "Slightly More Realistic Personal Probability". *Philosophy of Science*. **34** (4): 316. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1086/288169](https://doi.org/10.1086%2F288169). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [14344339](https://api.semanticscholar.org/CorpusID:14344339).
4. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-4)**
["Bayes' Theorem (Stanford Encyclopedia of Philosophy)"](http://plato.stanford.edu/entries/bayes-theorem/). Plato.stanford.edu. Retrieved 2014-01-05.
5. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-5)**
[van Fraassen, B.](https://en.wikipedia.org/wiki/Bas_van_Fraassen "Bas van Fraassen") (1989) *Laws and Symmetry*, Oxford University Press. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [0-19-824860-1](https://en.wikipedia.org/wiki/Special:BookSources/0-19-824860-1 "Special:BookSources/0-19-824860-1").
6. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-6)**
Gelman, Andrew; Carlin, John B.; Stern, Hal S.; Dunson, David B.; Vehtari, Aki; Rubin, Donald B. (2013). *Bayesian Data Analysis*, Third Edition. Chapman and Hall/CRC. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-1-4398-4095-5](https://en.wikipedia.org/wiki/Special:BookSources/978-1-4398-4095-5 "Special:BookSources/978-1-4398-4095-5").
7. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-deCarvalho-Geometry_7-0)**
de Carvalho, Miguel; Page, Garritt; Barney, Bradley (2019). ["On the geometry of Bayesian inference"](https://www.maths.ed.ac.uk/~mdecarv/papers/decarvalho2018.pdf) (PDF). *Bayesian Analysis*. **14** (4): 1013–1036. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1214/18-BA1112](https://doi.org/10.1214%2F18-BA1112). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [88521802](https://api.semanticscholar.org/CorpusID:88521802).
8. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Lee-GibbsSampler_8-0)**
Lee, Se Yoon (2021). "Gibbs sampler and coordinate ascent variational inference: A set-theoretical review". *Communications in Statistics – Theory and Methods*. **51** (6): 1549–1568. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[2008\.01006](https://arxiv.org/abs/2008.01006). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1080/03610926.2021.1921214](https://doi.org/10.1080%2F03610926.2021.1921214). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [220935477](https://api.semanticscholar.org/CorpusID:220935477).
9. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-9)**
Kolmogorov, A.N. (1933) \[1956\]. *Foundations of the Theory of Probability*. Chelsea Publishing Company.
10. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-10)**
Tjur, Tue (1980). [*Probability based on Radon measures*](http://archive.org/details/probabilitybased0000tjur). Internet Archive. Chichester \[Eng.\]; New York : Wiley. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-471-27824-5](https://en.wikipedia.org/wiki/Special:BookSources/978-0-471-27824-5 "Special:BookSources/978-0-471-27824-5").
11. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-11)**
Taraldsen, Gunnar; Tufto, Jarle; Lindqvist, Bo H. (2021-07-24). ["Improper priors and improper posteriors"](https://doi.org/10.1111%2Fsjos.12550). *Scandinavian Journal of Statistics*. **49** (3): 969–991. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1111/sjos.12550](https://doi.org/10.1111%2Fsjos.12550). [hdl](https://en.wikipedia.org/wiki/Hdl_\(identifier\) "Hdl (identifier)"):[11250/2984409](https://hdl.handle.net/11250%2F2984409). [ISSN](https://en.wikipedia.org/wiki/ISSN_\(identifier\) "ISSN (identifier)") [0303-6898](https://search.worldcat.org/issn/0303-6898). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [237736986](https://api.semanticscholar.org/CorpusID:237736986).
12. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-12)**
Robert, Christian P.; Casella, George (2004). *Monte Carlo Statistical Methods*. Springer. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-1-4757-4145-2](https://en.wikipedia.org/wiki/Special:BookSources/978-1-4757-4145-2 "Special:BookSources/978-1-4757-4145-2"). [OCLC](https://en.wikipedia.org/wiki/OCLC_\(identifier\) "OCLC (identifier)") [1159112760](https://search.worldcat.org/oclc/1159112760).
13. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-13)**
Freedman, DA (1963). ["On the asymptotic behavior of Bayes' estimates in the discrete case"](https://doi.org/10.1214%2Faoms%2F1177703871). *The Annals of Mathematical Statistics*. **34** (4): 1386–1403. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1214/aoms/1177703871](https://doi.org/10.1214%2Faoms%2F1177703871). [JSTOR](https://en.wikipedia.org/wiki/JSTOR_\(identifier\) "JSTOR (identifier)") [2238346](https://www.jstor.org/stable/2238346).
14. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-14)**
Freedman, DA (1965). ["On the asymptotic behavior of Bayes estimates in the discrete case II"](https://doi.org/10.1214%2Faoms%2F1177700155). *The Annals of Mathematical Statistics*. **36** (2): 454–456. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1214/aoms/1177700155](https://doi.org/10.1214%2Faoms%2F1177700155). [JSTOR](https://en.wikipedia.org/wiki/JSTOR_\(identifier\) "JSTOR (identifier)") [2238150](https://www.jstor.org/stable/2238150).
15. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-15)**
Robins, James; Wasserman, Larry (2000). "Conditioning, likelihood, and coherence: A review of some foundational concepts". *Journal of the American Statistical Association*. **95** (452): 1340–1346. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1080/01621459.2000.10474344](https://doi.org/10.1080%2F01621459.2000.10474344). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [120767108](https://api.semanticscholar.org/CorpusID:120767108).
16. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-16)**
[Sen, Pranab K.](https://en.wikipedia.org/wiki/Pranab_K._Sen "Pranab K. Sen"); Keating, J. P.; Mason, R. L. (1993). *Pitman's measure of closeness: A comparison of statistical estimators*. Philadelphia: SIAM.
17. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-17)**
Choudhuri, Nidhan; Ghosal, Subhashis; Roy, Anindya (2005-01-01). "Bayesian Methods for Function Estimation". *Handbook of Statistics*. Bayesian Thinking. Vol. 25. pp. 373–414. [CiteSeerX](https://en.wikipedia.org/wiki/CiteSeerX_\(identifier\) "CiteSeerX (identifier)") [10\.1.1.324.3052](https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.324.3052). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1016/s0169-7161(05)25013-7](https://doi.org/10.1016%2Fs0169-7161%2805%2925013-7). [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-444-51539-1](https://en.wikipedia.org/wiki/Special:BookSources/978-0-444-51539-1 "Special:BookSources/978-0-444-51539-1").
18. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-18)**
["Maximum A Posteriori (MAP) Estimation"](https://www.probabilitycourse.com/chapter9/9_1_2_MAP_estimation.php). *www.probabilitycourse.com*. Retrieved 2017-06-02.
19. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-19)**
Yu, Angela. ["Introduction to Bayesian Decision Theory"](https://web.archive.org/web/20130228060536/http://www.cogsci.ucsd.edu/~ajyu/Teaching/Tutorials/bayes_dt.pdf) (PDF). *cogsci.ucsd.edu/*. Archived from [the original](http://www.cogsci.ucsd.edu/~ajyu/Teaching/Tutorials/bayes_dt.pdf) (PDF) on 2013-02-28.
20. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-20)**
Hitchcock, David. ["Posterior Predictive Distribution Stat Slide"](http://people.stat.sc.edu/Hitchcock/stat535slidesday18.pdf) (PDF). *stat.sc.edu*.
21. ^ [***a***](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Bickel_&_Doksum_2001,_page_32_21-0) [***b***](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Bickel_&_Doksum_2001,_page_32_21-1) Bickel & Doksum (2001, p. 32)
22. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-22)**
[Kiefer, J.](https://en.wikipedia.org/wiki/Jack_Kiefer_\(mathematician\) "Jack Kiefer (mathematician)"); Schwartz R. (1965). ["Admissible Bayes Character of T²-, R²-, and Other Fully Invariant Tests for Multivariate Normal Problems"](https://doi.org/10.1214%2Faoms%2F1177700051). *Annals of Mathematical Statistics*. **36** (3): 747–770. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1214/aoms/1177700051](https://doi.org/10.1214%2Faoms%2F1177700051).
23. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-23)**
Schwartz, R. (1969). ["Invariant Proper Bayes Tests for Exponential Families"](https://doi.org/10.1214%2Faoms%2F1177697822). *Annals of Mathematical Statistics*. **40**: 270–283. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1214/aoms/1177697822](https://doi.org/10.1214%2Faoms%2F1177697822).
24. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-24)**
Hwang, J. T. & Casella, George (1982). ["Minimax Confidence Sets for the Mean of a Multivariate Normal Distribution"](https://ecommons.cornell.edu/bitstream/1813/32852/1/BU-750-M.pdf) (PDF). *Annals of Statistics*. **10** (3): 868–881. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1214/aos/1176345877](https://doi.org/10.1214%2Faos%2F1176345877).
25. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-25)**
[Lehmann, Erich](https://en.wikipedia.org/wiki/Erich_Leo_Lehmann "Erich Leo Lehmann") (1986). *Testing Statistical Hypotheses* (Second ed.).
(see p. 309 of Chapter 6.7 "Admissibility", and pp. 17–18 of Chapter 1.8 "Complete Classes")
26. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-26)**
[Le Cam, Lucien](https://en.wikipedia.org/wiki/Lucien_Le_Cam "Lucien Le Cam") (1986). *Asymptotic Methods in Statistical Decision Theory*. Springer-Verlag. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-387-96307-5](https://en.wikipedia.org/wiki/Special:BookSources/978-0-387-96307-5 "Special:BookSources/978-0-387-96307-5").
(From "Chapter 12 Posterior Distributions and Bayes Solutions", p. 324)
27. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-27)**
[Cox, D. R.](https://en.wikipedia.org/wiki/David_R._Cox "David R. Cox"); Hinkley, D.V. (1974). *Theoretical Statistics*. Chapman and Hall. p. 432. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-04-121537-3](https://en.wikipedia.org/wiki/Special:BookSources/978-0-04-121537-3 "Special:BookSources/978-0-04-121537-3").
28. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-28)**
[Cox, D. R.](https://en.wikipedia.org/wiki/David_R._Cox "David R. Cox"); Hinkley, D.V. (1974). *Theoretical Statistics*. Chapman and Hall. p. 433. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-04-121537-3](https://en.wikipedia.org/wiki/Special:BookSources/978-0-04-121537-3 "Special:BookSources/978-0-04-121537-3").
29. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-29)**
Stoica, P.; Selen, Y. (2004). "A review of information criterion rules". *IEEE Signal Processing Magazine*. **21** (4): 36–47. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1109/MSP.2004.1311138](https://doi.org/10.1109%2FMSP.2004.1311138). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [17338979](https://api.semanticscholar.org/CorpusID:17338979).
30. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-30)**
Fatermans, J.; Van Aert, S.; den Dekker, A.J. (2019). "The maximum a posteriori probability rule for atom column detection from HAADF STEM images". *Ultramicroscopy*. **201**: 81–91. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[1902\.05809](https://arxiv.org/abs/1902.05809). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1016/j.ultramic.2019.02.003](https://doi.org/10.1016%2Fj.ultramic.2019.02.003). [PMID](https://en.wikipedia.org/wiki/PMID_\(identifier\) "PMID (identifier)") [30991277](https://pubmed.ncbi.nlm.nih.gov/30991277). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [104419861](https://api.semanticscholar.org/CorpusID:104419861).
31. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-31)** Bessiere, P.; Mazer, E.; Ahuactzin, J. M.; Mekhnacha, K. (2013). *Bayesian Programming* (1st ed.). Chapman and Hall/CRC.
32. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-32)**
Daniel Roy (2015). ["Probabilistic Programming"](https://web.archive.org/web/20160110035042/http://probabilistic-programming.org/wiki/Home). *probabilistic-programming.org*. Archived from [the original](http://probabilistic-programming.org/wiki/Home) on 2016-01-10. Retrieved 2020-01-02.
33. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-33)**
Ghahramani, Z (2015). ["Probabilistic machine learning and artificial intelligence"](https://www.repository.cam.ac.uk/handle/1810/248538). *Nature*. **521** (7553): 452–459. [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2015Natur.521..452G](https://ui.adsabs.harvard.edu/abs/2015Natur.521..452G). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1038/nature14541](https://doi.org/10.1038%2Fnature14541). [PMID](https://en.wikipedia.org/wiki/PMID_\(identifier\) "PMID (identifier)") [26017444](https://pubmed.ncbi.nlm.nih.gov/26017444). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [216356](https://api.semanticscholar.org/CorpusID:216356).
34. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-34)**
Fienberg, Stephen E. (2006-03-01). ["When did Bayesian inference become "Bayesian"?"](https://doi.org/10.1214%2F06-BA101). *Bayesian Analysis*. **1** (1). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1214/06-BA101](https://doi.org/10.1214%2F06-BA101).
35. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-35)**
Jim Albert (2009). *Bayesian Computation with R, Second edition*. New York, Dordrecht, etc.: Springer. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-387-92297-3](https://en.wikipedia.org/wiki/Special:BookSources/978-0-387-92297-3 "Special:BookSources/978-0-387-92297-3").
36. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-36)**
Rathmanner, Samuel; Hutter, Marcus; Ormerod, Thomas C (2011). ["A Philosophical Treatise of Universal Induction"](https://doi.org/10.3390%2Fe13061076). *Entropy*. **13** (6): 1076–1136. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[1105\.5721](https://arxiv.org/abs/1105.5721). [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2011Entrp..13.1076R](https://ui.adsabs.harvard.edu/abs/2011Entrp..13.1076R). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.3390/e13061076](https://doi.org/10.3390%2Fe13061076). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [2499910](https://api.semanticscholar.org/CorpusID:2499910).
37. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-37)**
Hutter, Marcus; He, Yang-Hui; Ormerod, Thomas C (2007). "On Universal Prediction and Bayesian Confirmation". *Theoretical Computer Science*. **384** (2007): 33–48. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[0709\.1516](https://arxiv.org/abs/0709.1516). [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2007arXiv0709.1516H](https://ui.adsabs.harvard.edu/abs/2007arXiv0709.1516H). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1016/j.tcs.2007.05.016](https://doi.org/10.1016%2Fj.tcs.2007.05.016). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [1500830](https://api.semanticscholar.org/CorpusID:1500830).
38. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-38)**
Gács, Peter; Vitányi, Paul M. B. (2 December 2010). "Raymond J. Solomonoff 1926-2009". [CiteSeerX](https://en.wikipedia.org/wiki/CiteSeerX_\(identifier\) "CiteSeerX (identifier)") [10\.1.1.186.8268](https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.186.8268).
39. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-:edgr_39-0)** Robinson, Mark D.; McCarthy, Davis J.; Smyth, Gordon K. "edgeR: a Bioconductor package for differential expression analysis of digital gene expression data". *Bioinformatics*.
40. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-40)**
["CIRI"](https://ciri.stanford.edu/). *ciri.stanford.edu*. Retrieved 2019-08-11.
41. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-41)**
Kurtz, David M.; Esfahani, Mohammad S.; Scherer, Florian; Soo, Joanne; Jin, Michael C.; Liu, Chih Long; Newman, Aaron M.; Dührsen, Ulrich; Hüttmann, Andreas (2019-07-25). ["Dynamic Risk Profiling Using Serial Tumor Biomarkers for Personalized Outcome Prediction"](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7380118). *Cell*. **178** (3): 699–713.e19. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1016/j.cell.2019.06.011](https://doi.org/10.1016%2Fj.cell.2019.06.011). [ISSN](https://en.wikipedia.org/wiki/ISSN_\(identifier\) "ISSN (identifier)") [1097-4172](https://search.worldcat.org/issn/1097-4172). [PMC](https://en.wikipedia.org/wiki/PMC_\(identifier\) "PMC (identifier)") [7380118](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7380118). [PMID](https://en.wikipedia.org/wiki/PMID_\(identifier\) "PMID (identifier)") [31280963](https://pubmed.ncbi.nlm.nih.gov/31280963).
42. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-42)**
Trotta, Roberto (2017). "Bayesian Methods in Cosmology". [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[1701\.01467](https://arxiv.org/abs/1701.01467) \[[astro-ph.CO](https://arxiv.org/archive/astro-ph.CO)\].
43. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-43)**
Staicova, Denitsa (2025). ["Modern Bayesian Sampling Methods for Cosmological Inference: A Comparative Study"](https://doi.org/10.3390%2Funiverse11020068). *Universe*. **11** (2): 68. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[2501\.06022](https://arxiv.org/abs/2501.06022). [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2025Univ...11...68S](https://ui.adsabs.harvard.edu/abs/2025Univ...11...68S). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.3390/universe11020068](https://doi.org/10.3390%2Funiverse11020068).
44. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-44)**
Madhusudhan, Nikku; Constantinou, Savvas; Holmberg, Måns; Sarkar, Subhajit; Piette, Anjali A. A.; Moses, Julianne I. (2025). ["New Constraints on DMS and DMDS in the Atmosphere of K2-18 b from JWST MIRI"](https://doi.org/10.3847%2F2041-8213%2Fadc1c8). *The Astrophysical Journal*. **983** (2): L40. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[2504\.12267](https://arxiv.org/abs/2504.12267). [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2025ApJ...983L..40M](https://ui.adsabs.harvard.edu/abs/2025ApJ...983L..40M). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.3847/2041-8213/adc1c8](https://doi.org/10.3847%2F2041-8213%2Fadc1c8).
45. ^ [***a***](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-ArXiv_1807_45-0) [***b***](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-ArXiv_1807_45-1)
Aghanim, N.; et al. (2020). "*Planck* 2018 results". *Astronomy & Astrophysics*. **641**: A6. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[1807\.06209](https://arxiv.org/abs/1807.06209). [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2020A\&A...641A...6P](https://ui.adsabs.harvard.edu/abs/2020A&A...641A...6P). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1051/0004-6361/201833910](https://doi.org/10.1051%2F0004-6361%2F201833910).
46. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-46)**
Anstey, Dominic; De Lera Acedo, Eloy; Handley, Will (2021). ["A general Bayesian framework for foreground modelling and chromaticity correction for global 21 cm experiments"](https://doi.org/10.1093%2Fmnras%2Fstab1765). *Monthly Notices of the Royal Astronomical Society*. **506** (2): 2041–2058. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[2010\.09644](https://arxiv.org/abs/2010.09644). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1093/mnras/stab1765](https://doi.org/10.1093%2Fmnras%2Fstab1765).
47. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-47)**
Lewis, Antony; Bridle, Sarah (2002). "Cosmological parameters from CMB and other data: A Monte Carlo approach". *Physical Review D*. **66** (10) 103511. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[astro-ph/0205436](https://arxiv.org/abs/astro-ph/0205436). [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2002PhRvD..66j3511L](https://ui.adsabs.harvard.edu/abs/2002PhRvD..66j3511L). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1103/PhysRevD.66.103511](https://doi.org/10.1103%2FPhysRevD.66.103511).
48. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-48)**
["Cobaya, a code for Bayesian analysis in Cosmology – cobaya 3.5.7 documentation"](https://cobaya.readthedocs.io/en/latest/index.html). *cobaya.readthedocs.io*. Retrieved 2025-07-23.
49. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-49)**
["CAMB – Code for Anisotropies in the Microwave Background (CAMB) 1.6.1 documentation"](https://camb.readthedocs.io/en/latest/). *camb.readthedocs.io*. Retrieved 2025-07-23.
50. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-50)**
Lesgourgues, Julien (2011). "The Cosmic Linear Anisotropy Solving System (CLASS) I: Overview". [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[1104\.2932](https://arxiv.org/abs/1104.2932) \[[astro-ph.IM](https://arxiv.org/archive/astro-ph.IM)\].
51. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-51)**
Hill, J. Colin; McDonough, Evan; Toomey, Michael W.; Alexander, Stephon (2020). "Early dark energy does not restore cosmological concordance". *Physical Review D*. **102** (4) 043507. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[2003\.07355](https://arxiv.org/abs/2003.07355). [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2020PhRvD.102d3507H](https://ui.adsabs.harvard.edu/abs/2020PhRvD.102d3507H). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1103/PhysRevD.102.043507](https://doi.org/10.1103%2FPhysRevD.102.043507).
52. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-52)**
Trotta, Roberto (2008). "Bayes in the sky: Bayesian inference and model selection in cosmology". *Contemporary Physics*. **49** (2): 71–104. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[0803\.4089](https://arxiv.org/abs/0803.4089). [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2008ConPh..49...71T](https://ui.adsabs.harvard.edu/abs/2008ConPh..49...71T). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1080/00107510802066753](https://doi.org/10.1080%2F00107510802066753).
53. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-53)** Dawid, A. P. and Mortera, J. (1996) "Coherent Analysis of Forensic Identification Evidence". *[Journal of the Royal Statistical Society](https://en.wikipedia.org/wiki/Journal_of_the_Royal_Statistical_Society "Journal of the Royal Statistical Society")*, Series B, 58, 425–443.
54. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-54)** Foreman, L. A.; Smith, A. F. M., and Evett, I. W. (1997). "Bayesian analysis of deoxyribonucleic acid profiling data in forensic identification applications (with discussion)". *Journal of the Royal Statistical Society*, Series A, 160, 429–469.
55. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-55)**
Robertson, B. and Vignaux, G. A. (1995) *Interpreting Evidence: Evaluating Forensic Science in the Courtroom*. John Wiley and Sons. Chichester. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-471-96026-3](https://en.wikipedia.org/wiki/Special:BookSources/978-0-471-96026-3 "Special:BookSources/978-0-471-96026-3").
56. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-56)** Dawid, A. P. (2001) [Bayes' Theorem and Weighing Evidence by Juries](http://128.40.111.250/evidence/content/dawid-paper.pdf). [Archived](https://web.archive.org/web/20150701112146/http://128.40.111.250/evidence/content/dawid-paper.pdf) 2015-07-01 at the [Wayback Machine](https://en.wikipedia.org/wiki/Wayback_Machine "Wayback Machine")
57. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-57)** Gardner-Medwin, A. (2005) "What Probability Should the Jury Address?". *[Significance](https://en.wikipedia.org/wiki/Significance_\(journal\) "Significance (journal)")*, 2 (1), March 2005.
58. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-58)**
Miller, David (1994). [*Critical Rationalism*](https://books.google.com/books?id=bh_yCgAAQBAJ). Chicago: Open Court. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)") [978-0-8126-9197-9](https://en.wikipedia.org/wiki/Special:BookSources/978-0-8126-9197-9 "Special:BookSources/978-0-8126-9197-9").
59. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-59)** Howson & Urbach (2005), Jaynes (2003)
60. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Cai_et_al._2009_60-0)**
Cai, X.Q.; Wu, X.Y.; Zhou, X. (2009). "Stochastic scheduling subject to breakdown-repeat breakdowns with incomplete information". *Operations Research*. **57** (5): 1236–1249\. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1287/opre.1080.0660](https://doi.org/10.1287%2Fopre.1080.0660).
61. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-61)**
Ogle, Kiona; Tucker, Colin; Cable, Jessica M. (2014-01-01). "Beyond simple linear mixing models: process-based isotope partitioning of ecological processes". *Ecological Applications*. **24** (1): 181–195\. [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2014EcoAp..24..181O](https://ui.adsabs.harvard.edu/abs/2014EcoAp..24..181O). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1890/1051-0761-24.1.181](https://doi.org/10.1890%2F1051-0761-24.1.181). [ISSN](https://en.wikipedia.org/wiki/ISSN_\(identifier\) "ISSN (identifier)") [1939-5582](https://search.worldcat.org/issn/1939-5582). [PMID](https://en.wikipedia.org/wiki/PMID_\(identifier\) "PMID (identifier)") [24640543](https://pubmed.ncbi.nlm.nih.gov/24640543).
62. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-62)**
Evaristo, Jaivime; McDonnell, Jeffrey J.; Scholl, Martha A.; Bruijnzeel, L. Adrian; Chun, Kwok P. (2016-01-01). "Insights into plant water uptake from xylem-water isotope measurements in two tropical catchments with contrasting moisture conditions". *Hydrological Processes*. **30** (18): 3210–3227\. [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2016HyPr...30.3210E](https://ui.adsabs.harvard.edu/abs/2016HyPr...30.3210E). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1002/hyp.10841](https://doi.org/10.1002%2Fhyp.10841). [ISSN](https://en.wikipedia.org/wiki/ISSN_\(identifier\) "ISSN (identifier)") [1099-1085](https://search.worldcat.org/issn/1099-1085). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [131588159](https://api.semanticscholar.org/CorpusID:131588159).
63. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-63)**
Gupta, Ankur; Rawlings, James B. (April 2014). ["Comparison of Parameter Estimation Methods in Stochastic Chemical Kinetic Models: Examples in Systems Biology"](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4946376). *AIChE Journal*. **60** (4): 1253–1268\. [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2014AIChE..60.1253G](https://ui.adsabs.harvard.edu/abs/2014AIChE..60.1253G). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1002/aic.14409](https://doi.org/10.1002%2Faic.14409). [ISSN](https://en.wikipedia.org/wiki/ISSN_\(identifier\) "ISSN (identifier)") [0001-1541](https://search.worldcat.org/issn/0001-1541). [PMC](https://en.wikipedia.org/wiki/PMC_\(identifier\) "PMC (identifier)") [4946376](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4946376). [PMID](https://en.wikipedia.org/wiki/PMID_\(identifier\) "PMID (identifier)") [27429455](https://pubmed.ncbi.nlm.nih.gov/27429455).
64. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-64)**
Schütz, N.; Holschneider, M. (2011). "Detection of trend changes in time series using Bayesian inference". *Physical Review E*. **84** (2) 021120. [arXiv](https://en.wikipedia.org/wiki/ArXiv_\(identifier\) "ArXiv (identifier)"):[1104\.3448](https://arxiv.org/abs/1104.3448). [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[2011PhRvE..84b1120S](https://ui.adsabs.harvard.edu/abs/2011PhRvE..84b1120S). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1103/PhysRevE.84.021120](https://doi.org/10.1103%2FPhysRevE.84.021120). [PMID](https://en.wikipedia.org/wiki/PMID_\(identifier\) "PMID (identifier)") [21928962](https://pubmed.ncbi.nlm.nih.gov/21928962). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [11460968](https://api.semanticscholar.org/CorpusID:11460968).
65. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-65)**
Stigler, Stephen (1982). "Thomas Bayes's Bayesian Inference". *Journal of the Royal Statistical Society*. **145** (2): 250–58\. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.2307/2981538](https://doi.org/10.2307%2F2981538). [JSTOR](https://en.wikipedia.org/wiki/JSTOR_\(identifier\) "JSTOR (identifier)") [2981538](https://www.jstor.org/stable/2981538).
66. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Stigler1986_66-0)**
Stigler, Stephen M. (1986). ["Chapter 3"](https://archive.org/details/historyofstatist00stig). *The History of Statistics*. Harvard University Press. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[978-0-674-40340-6](https://en.wikipedia.org/wiki/Special:BookSources/978-0-674-40340-6 "Special:BookSources/978-0-674-40340-6")
.
67. ^ [***a***](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Fienberg2006_67-0) [***b***](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Fienberg2006_67-1)
Fienberg, Stephen E. (2006). ["When did Bayesian Inference Become 'Bayesian'?"](https://doi.org/10.1214%2F06-ba101). *Bayesian Analysis*. **1** (1): 1–40 \[p. 5\]. [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1214/06-ba101](https://doi.org/10.1214%2F06-ba101).
68. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Bernardo2005_68-0)**
[Bernardo, José-Miguel](https://en.wikipedia.org/wiki/Jos%C3%A9-Miguel_Bernardo "José-Miguel Bernardo") (2005). "Reference analysis". *Handbook of statistics*. Vol. 25. pp. 17–90\.
69. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Wolpert2004_69-0)**
Wolpert, R. L. (2004). "A Conversation with James O. Berger". *Statistical Science*. **19** (1): 205–218\. [CiteSeerX](https://en.wikipedia.org/wiki/CiteSeerX_\(identifier\) "CiteSeerX (identifier)") [10\.1.1.71.6112](https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.71.6112). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1214/088342304000000053](https://doi.org/10.1214%2F088342304000000053). [MR](https://en.wikipedia.org/wiki/MR_\(identifier\) "MR (identifier)") [2082155](https://mathscinet.ams.org/mathscinet-getitem?mr=2082155). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [120094454](https://api.semanticscholar.org/CorpusID:120094454).
70. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Bernardo2006_70-0)**
[Bernardo, José M.](https://en.wikipedia.org/wiki/Jos%C3%A9-Miguel_Bernardo "José-Miguel Bernardo") (2006). ["A Bayesian mathematical statistics primer"](http://www.ime.usp.br/~abe/ICOTS7/Proceedings/PDFs/InvitedPapers/3I2_BERN.pdf) (PDF). *Icots-7*.
71. **[^](https://en.wikipedia.org/wiki/Bayesian_inference#cite_ref-Bishop2007_71-0)**
Bishop, C. M. (2007). *Pattern Recognition and Machine Learning*. New York: Springer. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[978-0-387-31073-2](https://en.wikipedia.org/wiki/Special:BookSources/978-0-387-31073-2 "Special:BookSources/978-0-387-31073-2")
.
- Aster, Richard; Borchers, Brian, and Thurber, Clifford (2012). *Parameter Estimation and Inverse Problems*, Second Edition, Elsevier. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[0123850487](https://en.wikipedia.org/wiki/Special:BookSources/0123850487 "Special:BookSources/0123850487")
, [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[978-0123850485](https://en.wikipedia.org/wiki/Special:BookSources/978-0123850485 "Special:BookSources/978-0123850485")
- Bickel, Peter J. & Doksum, Kjell A. (2001). *Mathematical Statistics, Volume 1: Basic and Selected Topics* (Second (updated printing 2007) ed.). Pearson Prentice–Hall. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[978-0-13-850363-5](https://en.wikipedia.org/wiki/Special:BookSources/978-0-13-850363-5 "Special:BookSources/978-0-13-850363-5")
.
- [Box, G. E. P.](https://en.wikipedia.org/wiki/George_E._P._Box "George E. P. Box") and [Tiao, G. C.](https://en.wikipedia.org/wiki/George_Tiao "George Tiao") (1973). *Bayesian Inference in Statistical Analysis*, Wiley, [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[0-471-57428-7](https://en.wikipedia.org/wiki/Special:BookSources/0-471-57428-7 "Special:BookSources/0-471-57428-7")
- Edwards, Ward (1968). "Conservatism in Human Information Processing". In Kleinmuntz, B. (ed.). *Formal Representation of Human Judgment*. Wiley.
- Edwards, Ward (1982). [Daniel Kahneman](https://en.wikipedia.org/wiki/Daniel_Kahneman "Daniel Kahneman"); [Paul Slovic](https://en.wikipedia.org/wiki/Paul_Slovic "Paul Slovic"); [Amos Tversky](https://en.wikipedia.org/wiki/Amos_Tversky "Amos Tversky") (eds.). "Judgment under uncertainty: Heuristics and biases". *Science*. **185** (4157): 1124–1131\. [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[1974Sci...185.1124T](https://ui.adsabs.harvard.edu/abs/1974Sci...185.1124T). [doi](https://en.wikipedia.org/wiki/Doi_\(identifier\) "Doi (identifier)"):[10\.1126/science.185.4157.1124](https://doi.org/10.1126%2Fscience.185.4157.1124). [PMID](https://en.wikipedia.org/wiki/PMID_\(identifier\) "PMID (identifier)") [17835457](https://pubmed.ncbi.nlm.nih.gov/17835457). [S2CID](https://en.wikipedia.org/wiki/S2CID_\(identifier\) "S2CID (identifier)") [143452957](https://api.semanticscholar.org/CorpusID:143452957). "Chapter: Conservatism in Human Information Processing (excerpted)"
- [Jaynes E. T.](https://en.wikipedia.org/wiki/Edwin_Thompson_Jaynes "Edwin Thompson Jaynes") (2003) *Probability Theory: The Logic of Science*, CUP. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[978-0-521-59271-0](https://en.wikipedia.org/wiki/Special:BookSources/978-0-521-59271-0 "Special:BookSources/978-0-521-59271-0")
([Link to Fragmentary Edition of March 1996](http://www-biba.inrialpes.fr/Jaynes/prob.html)).
- [Howson, C.](https://en.wikipedia.org/wiki/Colin_Howson "Colin Howson") & Urbach, P. (2005). *Scientific Reasoning: the Bayesian Approach* (3rd ed.). [Open Court Publishing Company](https://en.wikipedia.org/wiki/Open_Court_Publishing_Company "Open Court Publishing Company"). [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[978-0-8126-9578-6](https://en.wikipedia.org/wiki/Special:BookSources/978-0-8126-9578-6 "Special:BookSources/978-0-8126-9578-6")
.
- Phillips, L. D.; Edwards, Ward (October 2008). "Chapter 6: Conservatism in a Simple Probability Inference Task (*Journal of Experimental Psychology* (1966) 72: 346-354)". In Jie W. Weiss; David J. Weiss (eds.). *A Science of Decision Making: The Legacy of Ward Edwards*. Oxford University Press. p. 536. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[978-0-19-532298-9](https://en.wikipedia.org/wiki/Special:BookSources/978-0-19-532298-9 "Special:BookSources/978-0-19-532298-9")
.
- For a full account of the history of Bayesian statistics and the debates with frequentist approaches, see
Vallverdu, Jordi (2016). *Bayesians Versus Frequentists: A Philosophical Debate on Statistical Reasoning*. New York: Springer. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[978-3-662-48638-2](https://en.wikipedia.org/wiki/Special:BookSources/978-3-662-48638-2 "Special:BookSources/978-3-662-48638-2")
.
- [Clayton, Aubrey](https://en.wikipedia.org/w/index.php?title=Aubrey_Clayton&action=edit&redlink=1 "Aubrey Clayton (page does not exist)") (August 2021). [*Bernoulli's Fallacy: Statistical Illogic and the Crisis of Modern Science*](https://cup.columbia.edu/book/bernoullis-fallacy/9780231199940). Columbia University Press. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[978-0-231-55335-3](https://en.wikipedia.org/wiki/Special:BookSources/978-0-231-55335-3 "Special:BookSources/978-0-231-55335-3")
.
The following books are listed in ascending order of probabilistic sophistication:
- Stone, JV (2013), "Bayes' Rule: A Tutorial Introduction to Bayesian Analysis", [Download first chapter here](http://jim-stone.staff.shef.ac.uk/BookBayes2012/BayesRuleBookMain.html), Sebtel Press, England.
- [Dennis V. Lindley](https://en.wikipedia.org/wiki/Dennis_V._Lindley "Dennis V. Lindley") (2013). *Understanding Uncertainty, Revised Edition* (2nd ed.). John Wiley. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[978-1-118-65012-7](https://en.wikipedia.org/wiki/Special:BookSources/978-1-118-65012-7 "Special:BookSources/978-1-118-65012-7")
.
- [Colin Howson](https://en.wikipedia.org/wiki/Colin_Howson "Colin Howson") & Peter Urbach (2005). *Scientific Reasoning: The Bayesian Approach* (3rd ed.). [Open Court Publishing Company](https://en.wikipedia.org/wiki/Open_Court_Publishing_Company "Open Court Publishing Company"). [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[978-0-8126-9578-6](https://en.wikipedia.org/wiki/Special:BookSources/978-0-8126-9578-6 "Special:BookSources/978-0-8126-9578-6")
.
- Berry, Donald A. (1996). *Statistics: A Bayesian Perspective*. Duxbury. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[978-0-534-23476-8](https://en.wikipedia.org/wiki/Special:BookSources/978-0-534-23476-8 "Special:BookSources/978-0-534-23476-8")
.
- [Morris H. DeGroot](https://en.wikipedia.org/wiki/Morris_H._DeGroot "Morris H. DeGroot") & Mark J. Schervish (2002). [*Probability and Statistics*](https://archive.org/details/probabilitystati00degr_0) (third ed.). Addison-Wesley. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[978-0-201-52488-8](https://en.wikipedia.org/wiki/Special:BookSources/978-0-201-52488-8 "Special:BookSources/978-0-201-52488-8")
.
- Bolstad, William M. (2007) *Introduction to Bayesian Statistics*: Second Edition, John Wiley [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[0-471-27020-2](https://en.wikipedia.org/wiki/Special:BookSources/0-471-27020-2 "Special:BookSources/0-471-27020-2")
- Winkler, Robert L (2003). *Introduction to Bayesian Inference and Decision* (2nd ed.). Probabilistic. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[978-0-9647938-4-2](https://en.wikipedia.org/wiki/Special:BookSources/978-0-9647938-4-2 "Special:BookSources/978-0-9647938-4-2")
.
Updated classic textbook. Bayesian theory clearly presented.
- Lee, Peter M. *Bayesian Statistics: An Introduction*. Fourth Edition (2012), John Wiley [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[978-1-1183-3257-3](https://en.wikipedia.org/wiki/Special:BookSources/978-1-1183-3257-3 "Special:BookSources/978-1-1183-3257-3")
- Carlin, Bradley P. & Louis, Thomas A. (2008). *Bayesian Methods for Data Analysis, Third Edition*. Boca Raton, FL: Chapman and Hall/CRC. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[978-1-58488-697-6](https://en.wikipedia.org/wiki/Special:BookSources/978-1-58488-697-6 "Special:BookSources/978-1-58488-697-6")
.
- [Gelman, Andrew](https://en.wikipedia.org/wiki/Andrew_Gelman "Andrew Gelman"); Carlin, John B.; Stern, Hal S.; Dunson, David B.; Vehtari, Aki; [Rubin, Donald B.](https://en.wikipedia.org/wiki/Donald_Rubin "Donald Rubin") (2013). *Bayesian Data Analysis, Third Edition*. Chapman and Hall/CRC. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[978-1-4398-4095-5](https://en.wikipedia.org/wiki/Special:BookSources/978-1-4398-4095-5 "Special:BookSources/978-1-4398-4095-5")
.
### Intermediate or advanced
- [Berger, James O](https://en.wikipedia.org/wiki/James_Berger_\(statistician\) "James Berger (statistician)") (1985). *Statistical Decision Theory and Bayesian Analysis*. Springer Series in Statistics (Second ed.). Springer-Verlag. [Bibcode](https://en.wikipedia.org/wiki/Bibcode_\(identifier\) "Bibcode (identifier)"):[1985sdtb.book.....B](https://ui.adsabs.harvard.edu/abs/1985sdtb.book.....B). [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[978-0-387-96098-2](https://en.wikipedia.org/wiki/Special:BookSources/978-0-387-96098-2 "Special:BookSources/978-0-387-96098-2")
.
- [Bernardo, José M.](https://en.wikipedia.org/wiki/Jos%C3%A9-Miguel_Bernardo "José-Miguel Bernardo"); [Smith, Adrian F. M.](https://en.wikipedia.org/wiki/Adrian_Smith_\(statistician\) "Adrian Smith (statistician)") (1994). *Bayesian Theory*. Wiley.
- [DeGroot, Morris H.](https://en.wikipedia.org/wiki/Morris_H._DeGroot "Morris H. DeGroot"), *Optimal Statistical Decisions*. Wiley Classics Library. 2004. (Originally published (1970) by McGraw-Hill.) [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[0-471-68029-X](https://en.wikipedia.org/wiki/Special:BookSources/0-471-68029-X "Special:BookSources/0-471-68029-X")
.
- Schervish, Mark J. (1995). *Theory of statistics*. Springer-Verlag. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[978-0-387-94546-0](https://en.wikipedia.org/wiki/Special:BookSources/978-0-387-94546-0 "Special:BookSources/978-0-387-94546-0")
.
- Jaynes, E. T. (1998). [*Probability Theory: The Logic of Science*](http://www-biba.inrialpes.fr/Jaynes/prob.html).
- O'Hagan, A. and Forster, J. (2003). *Kendall's Advanced Theory of Statistics*, Volume 2B: *Bayesian Inference*. Arnold, New York. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[0-340-52922-9](https://en.wikipedia.org/wiki/Special:BookSources/0-340-52922-9 "Special:BookSources/0-340-52922-9")
.
- Robert, Christian P (2007). *The Bayesian Choice: From Decision-Theoretic Foundations to Computational Implementation* (paperback ed.). Springer. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[978-0-387-71598-8](https://en.wikipedia.org/wiki/Special:BookSources/978-0-387-71598-8 "Special:BookSources/978-0-387-71598-8")
.
- [Pearl, Judea](https://en.wikipedia.org/wiki/Judea_Pearl "Judea Pearl"). (1988). *Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference*, San Mateo, CA: Morgan Kaufmann.
- Pierre Bessière et al. (2013). "[Bayesian Programming](http://www.crcpress.com/product/isbn/9781439880326)". CRC Press. [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[9781439880326](https://en.wikipedia.org/wiki/Special:BookSources/9781439880326 "Special:BookSources/9781439880326")
- Francisco J. Samaniego (2010). "A Comparison of the Bayesian and Frequentist Approaches to Estimation". Springer. New York, [ISBN](https://en.wikipedia.org/wiki/ISBN_\(identifier\) "ISBN (identifier)")
[978-1-4419-5940-9](https://en.wikipedia.org/wiki/Special:BookSources/978-1-4419-5940-9 "Special:BookSources/978-1-4419-5940-9")
- ["Bayesian approach to statistical problems"](https://www.encyclopediaofmath.org/index.php?title=Bayesian_approach_to_statistical_problems), *[Encyclopedia of Mathematics](https://en.wikipedia.org/wiki/Encyclopedia_of_Mathematics "Encyclopedia of Mathematics")*, [EMS Press](https://en.wikipedia.org/wiki/European_Mathematical_Society "European Mathematical Society"), 2001 \[1994\]
- [Bayesian Statistics](http://www.scholarpedia.org/article/Bayesian_statistics) from Scholarpedia.
- [Introduction to Bayesian probability](http://www.dcs.qmw.ac.uk/~norman/BBNs/BBNs.htm) from Queen Mary University of London
- [Mathematical Notes on Bayesian Statistics and Markov Chain Monte Carlo](http://webuser.bus.umich.edu/plenk/downloads.htm)
- [Bayesian reading list](http://cocosci.berkeley.edu/tom/bayes.html) [Archived](https://web.archive.org/web/20110625052506/http://cocosci.berkeley.edu/tom/bayes.html) 2011-06-25 at the [Wayback Machine](https://en.wikipedia.org/wiki/Wayback_Machine "Wayback Machine"), categorized and annotated by [Tom Griffiths](https://web.archive.org/web/20060711151352/http://psychology.berkeley.edu/faculty/profiles/tgriffiths.html)
- A. Hajek and S. Hartmann: [Bayesian Epistemology](https://web.archive.org/web/20110728055439/http://stephanhartmann.org/HajekHartmann_BayesEpist.pdf), in: J. Dancy et al. (eds.), A Companion to Epistemology. Oxford: Blackwell 2010, 93–106.
- S. Hartmann and J. Sprenger: [Bayesian Epistemology](https://web.archive.org/web/20110728055519/http://stephanhartmann.org/HartmannSprenger_BayesEpis.pdf), in: S. Bernecker and D. Pritchard (eds.), Routledge Companion to Epistemology. London: Routledge 2010, 609–620.
- [*Stanford Encyclopedia of Philosophy*: "Inductive Logic"](http://plato.stanford.edu/entries/logic-inductive/)
- [Bayesian Confirmation Theory](https://web.archive.org/web/20150905093734/http://faculty-staff.ou.edu/H/James.A.Hawthorne-1/Hawthorne--Bayesian_Confirmation_Theory.pdf) (PDF)
- [What is Bayesian Learning?](http://www.faqs.org/faqs/ai-faq/neural-nets/part3/section-7.html)
- [*Data, Uncertainty and Inference*](https://causascientia.org/math_stat/DataUnkInf.html) – Informal introduction with many examples, ebook (PDF) freely available at [causaScientia](https://causascientia.org/) |
| Shard | 152 (laksa) |
| Root Hash | 17790707453426894952 |
| Unparsed URL | org,wikipedia!en,/wiki/Bayesian_inference s443 |