ℹ️ Skipped - page is already crawled
| Filter | Status | Condition | Details |
|---|---|---|---|
| HTTP status | PASS | download_http_code = 200 | HTTP 200 |
| Age cutoff | PASS | download_stamp > now() - 6 MONTH | 0.1 months ago |
| History drop | PASS | isNull(history_drop_reason) | No drop reason |
| Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0 |
| Canonical | PASS | meta_canonical IS NULL OR = '' OR = src_unparsed | Not set |
| Property | Value |
|---|---|
| URL | https://www.xlstat.com/solutions/features/ordinary-least-squares-regression-ols |
| Last Crawled | 2026-04-04 20:34:40 (3 days ago) |
| First Indexed | 2025-05-23 02:03:53 (10 months ago) |
| HTTP Status Code | 200 |
| Meta Title | Ordinary Least Squares regression (OLS) | Statistical Software for Excel |
| Meta Description | Ordinary Least Squares regression (OLS), often called linear regression, is available in Excel using the XLSTAT add-on statistical software. |
| Meta Canonical | null |
| Boilerpipe Text | Ordinary Least Squares regression (OLS), often called linear regression, is available in Excel using the XLSTAT add-on statistical software.
Ordinary Least Squares regression
(
OLS
) is a common technique for estimating coefficients of
linear regression
equations which describe the relationship between one or more independent quantitative variables and a dependent variable (simple or multiple linear regression), often evaluated using r-squared.
Least squares
stand for the minimum
squares error (SSE)
. Maximum likelihood and Generalized method of moments estimator are alternative approaches to OLS.
In practice, you can use linear regression in many fields:
meteorology, if you need to predict temperature or rainfall based on external factors.
biology, if you need to predict the number of remaining individuals in a species depending on the number of predators or life resources.
economy, if you need to predict a company’s turnover based on the amount of sales.
… and many more.
Statsmodels 0.15.0 and XLSTAT both offer linear regression functionalities, but they differ significantly in terms of their features, usage, and target audiences.
A bit of theory: Equations for the Ordinary Least Squares regression
The ordinary least squares formula: what is the equation of the model?
In the case of a model with
p
explanatory variables, the OLS regression model writes:
Y = β
0
+ Σ
j=1..p
β
j
X
j
+ ε
where Y is the dependent variable, β
0
, is the intercept of the model, X
j
corresponds to the j
th
explanatory variable of the model (j= 1 to p), and e is the random error with expectation 0 and variance σ².
In the case where there are n observations, the estimation of the predicted value of the dependent variable Y for the i
th
observation is given by the regression equation:
y
i
= β
0
+ Σ
j=1..p
β
j
X
ij
Example: We want to predict the height of plants depending on the number of days they have spent in the sun, where days spent in the sun is an independent variable. Before getting exposure, they are 30 cm. A plant grows 1 mm (0.1 cm) after being exposed to the sun for a day.
Y is the height of the plants
X is the number of days spent in the sun
β0 is 30 because it is the value of Y when X is 0.
β1 is 0.1 because it is the coefficient multiplied by the number of days.
A plant being exposed 5 days to the sun has therefore an estimated height of Y = 30 + 0.1*5 = 30.5 cm.
Of course, it is not always exact, which is why we must take into account the random error ε (or the null hypothesis being tested).
Moreover, before predicting, our method must find the beta coefficients: we just start out by inputting a table containing the heights of several plants along with the number of days they have spent in the sun. If you want to find out more about the calculations, read the next paragraph.
How do ordinary least squares (OLS) work?
The OLS method aims to minimize the sum of square differences between the observed and predicted values, considering potential issues of multicollinearity.
That way, the vector β of the coefficients can be estimated by the following formula
β = (X’DX)-1 X’ Dy
with X the matrix of the explanatory variables preceded by a vector of 1s, D is a matrix with the wi weights on its diagonal, and y the vector of the n observed values of the dependent variable
The vector of the predicted values can be written as follows:
y* = Xβ=X (X’ DX)-1 X’Dy
We can even the variance σ² of the random error ε by the following formula :
σ² = 1/(W –p*) Σi=1..n wi(yi - y*i)
where p* is the number of explanatory variables to which we add 1 if the intercept is not fixed, wi is the weight of the ith observation, W is the sum of the wi weights, y the * vector of the observed values and y* the vector of predicted values
What is the intuitive explanation of the least squares method?
Intuitively speaking, the aim of the ordinary least squares method is to minimize the prediction error, between the predicted and real values. One may ask themselves why we choose to minimize the sum of squared errors instead of the sum of errors directly.
It takes into account the sum of squared errors instead of the errors as they are because sometimes they can be negative or positive and they could sum up to a nearly null value.
For example, if your real values are 2, 3, 5, 2, and 4 and your predicted values are 3, 2, 5, 1, 5, then the total error would be (3-2)+(2-3)+(5-5)+(1-2)+(5-4)=1-1+0-1+1=0 and the average error would be 0/5=0, which could lead to false conclusions.
However, if you compute the mean squared error, then you get (3-2)^2+(2-3)^2+(5-5)^2+(1-2)^2+(5-4)^2=4 and 4/5=0.8. By scaling the error back to the data and taking the square root of it, we get sqrt(0.8)=0.89, so on average, the predictions differ by 0.89 from the real value.
RUN YOUR LINEAR REGRESSION MODEL
What are the assumptions of Ordinary Least Squares (OLS)?
1) Individuals (observations) are independent. It is in general true in daily situations (the amount of rainfall does not depend on the previous day, the income does not depend on the previous month, the height of a person does not depend on the person measured just before…).
2) Variance is homogeneous.
Levene's test
is proposed by XLSTAT to test the equality of the error variances.
3) Residuals follow a normal distribution. XLSTAT offers several methods to test
normality
of residuals.
Model
residuals
(or errors) are the distances between data points and the fitted model. Model residuals represent the part of variability in the data the model was unable to capture. The R² statistic is the part of variability that is explained by the model, crucial in the evaluation of the null hypothesis. So the lower the residuals, the higher the R² statistic.
Homoscedasticity
and
independence
of the error terms are key hypotheses in linear regression where it is assumed that the variances of the error terms are independent and identically distributed and normally distributed. When these assumptions are not possible to keep, a consequence is that the covariance matrix cannot be estimated using the classical formula, and the variance of the parameters correspondi
ng to th
e beta coefficients of the linear model can be wrong and their confidence intervals as well.
How do I set up a linear regression model in XLSTAT?
In XLSTAT, you can easily run your ordinary least squares regression without even coding, just by data selection! You just have to select your dependent variable as well as your explanatory ones.
You can select several outcomes such as Descriptive statistics of your datasets, but also Correlations and Analysis of Variance.
Besides statistics and the equation of the model, you can also select the charts to be displayed, such as the regression one for example. You can see all the data points and the central regression line with a confidence interval.
Predictions in OLS regression with XLSTAT
Linear regression
is often used to predict outputs' values for new samples. XLSTAT enables you to characterize the quality of the model for prediction before you use the regression equation.
To go further: limitations of the Ordinary Least Squares regression
The limitations of the OLS regression come from the constraint of the inversion of the X’X matrix: it is required that the rank of the matrix is p+1, and some numerical problems may arise if the matrix is not well behaved. XLSTAT uses algorithms due to Dempster (1969) that allow circumventing these two issues: if the matrix rank equals q where q is strictly lower than p+1, some variables are removed from the model, either because they are constant or because they belong to a block of
collinear
variables.
What are the advantages of OLS: variable selection
An automatic selection of the variables is performed if the user selects a too high number of variables compared to the number of observations. The theoretical limit is n-1, as with greater values the X’X matrix becomes non-invertible due to multicollinearity.
The deleting of some of the variables may however not be optimal: in some cases we might not add a variable to the model because it is almost collinear to some other variables or to a block of variables, but it might be that it would be more relevant to remove a variable that is already in the model and to the new variable.
For that reason, and also in order to handle the cases where there are a lot of explanatory variables or issues like multicollinearity, other methods have been developed such as
Partial Least Squares regression (PLS)
.
Tutorials for Ordinary Least Squares regression
Below you will find a list of examples using ordinary least squares regression analysis (OLS):
A simple linear regression model
A multiple linear regression model
How to compute influence diagnostics for linear regression models
TRY IT ON YOUR OWN DATA |
| Markdown | [](https://www.xlstat.com/)
- [Solutions](https://www.xlstat.com/solutions/features/ordinary-least-squares-regression-ols)
- [Essentials](https://www.xlstat.com/solutions/essentials)
- [Standard](https://www.xlstat.com/solutions/standard)
- [Advanced](https://www.xlstat.com/solutions/advanced)
- [XLSTAT options](https://www.xlstat.com/solutions/features/ordinary-least-squares-regression-ols)
- [3D Plot](https://www.xlstat.com/solutions/3d-plot)
- [LatentClass](https://www.xlstat.com/solutions/lg)
- [Pricing](https://www.xlstat.com/solutions/features/ordinary-least-squares-regression-ols)
- [Commercial](https://www.xlstat.com/pricing/company)
- [Academic](https://www.xlstat.com/pricing/education)
- [Student](https://www.xlstat.com/pricing/student)
- [Course](https://www.xlstat.com/pricing/course)
- [Webinars](https://www.xlstat.com/solutions/features/ordinary-least-squares-regression-ols)
- [Calendar](https://www.xlstat.com/training)
- [Trainers](https://www.xlstat.com/training/trainer)
- [Resources](https://www.xlstat.com/solutions/features/ordinary-least-squares-regression-ols)
- [Installation & Licenses](https://help.xlstat.com/installation-licences)
- [Tutorials & Guides](https://help.xlstat.com/tutorial-guides)
- [Videos](https://www.youtube.com/channel/UCHvhQQxGqVepBRELIUeGZew)
- [Webinars](https://help.xlstat.com/6438-statistics-webinar-recordings)
- [Case Studies & White Papers](https://help.xlstat.com/6831-free-case-studies-and-white-papers)
- [FAQ](https://www.xlstat.com/faq)
- [Testimonials](https://www.xlstat.com/testimonials)
- [Publications](https://www.xlstat.com/papers)
- [Trial](https://www.xlstat.com/download)
- [Cart](https://www.xlstat.com/order)
- Search
- [My Lumivero](https://my.lumivero.com/ "My Lumivero")
# Ordinary Least Squares regression (OLS)
Ordinary Least Squares regression (OLS), often called linear regression, is available in Excel using the XLSTAT add-on statistical software.

**Ordinary Least Squares regression** (**OLS**) is a common technique for estimating coefficients of [linear regression](https://www.xlstat.com/en/products-solutions/feature/linear-regression.html) equations which describe the relationship between one or more independent quantitative variables and a dependent variable (simple or multiple linear regression), often evaluated using r-squared. **Least squares** stand for the minimum **squares error (SSE)**. Maximum likelihood and Generalized method of moments estimator are alternative approaches to OLS.
In practice, you can use linear regression in many fields:
- meteorology, if you need to predict temperature or rainfall based on external factors.
- biology, if you need to predict the number of remaining individuals in a species depending on the number of predators or life resources.
- economy, if you need to predict a company’s turnover based on the amount of sales.
- … and many more.
*Statsmodels 0.15.0 and XLSTAT both offer linear regression functionalities, but they differ significantly in terms of their features, usage, and target audiences.*
## A bit of theory: Equations for the Ordinary Least Squares regression
### The ordinary least squares formula: what is the equation of the model?
In the case of a model with *p* explanatory variables, the OLS regression model writes:
Y = β0 + Σj=1..p βjXj + ε
where Y is the dependent variable, β0, is the intercept of the model, X j corresponds to the jth explanatory variable of the model (j= 1 to p), and e is the random error with expectation 0 and variance σ\².
In the case where there are n observations, the estimation of the predicted value of the dependent variable Y for the ith observation is given by the regression equation:
yi = β0 + Σj=1..p βjXij
Example: We want to predict the height of plants depending on the number of days they have spent in the sun, where days spent in the sun is an independent variable. Before getting exposure, they are 30 cm. A plant grows 1 mm (0.1 cm) after being exposed to the sun for a day.
- Y is the height of the plants
- X is the number of days spent in the sun
- β0 is 30 because it is the value of Y when X is 0.
- β1 is 0.1 because it is the coefficient multiplied by the number of days.
A plant being exposed 5 days to the sun has therefore an estimated height of Y = 30 + 0.1\*5 = 30.5 cm.
Of course, it is not always exact, which is why we must take into account the random error ε (or the null hypothesis being tested).
Moreover, before predicting, our method must find the beta coefficients: we just start out by inputting a table containing the heights of several plants along with the number of days they have spent in the sun. If you want to find out more about the calculations, read the next paragraph.
### How do ordinary least squares (OLS) work?
The OLS method aims to minimize the sum of square differences between the observed and predicted values, considering potential issues of multicollinearity.
That way, the vector β of the coefficients can be estimated by the following formula
β = (X’DX)-1 X’ Dy
with X the matrix of the explanatory variables preceded by a vector of 1s, D is a matrix with the wi weights on its diagonal, and y the vector of the n observed values of the dependent variable
The vector of the predicted values can be written as follows:
y\* = Xβ=X (X’ DX)-1 X’Dy
We can even the variance σ\² of the random error ε by the following formula :
σ\² = 1/(W –p\*) Σi=1..n wi(yi - y\*i)
where p\* is the number of explanatory variables to which we add 1 if the intercept is not fixed, wi is the weight of the ith observation, W is the sum of the wi weights, y the \* vector of the observed values and y\* the vector of predicted values
### What is the intuitive explanation of the least squares method?
Intuitively speaking, the aim of the ordinary least squares method is to minimize the prediction error, between the predicted and real values. One may ask themselves why we choose to minimize the sum of squared errors instead of the sum of errors directly.
It takes into account the sum of squared errors instead of the errors as they are because sometimes they can be negative or positive and they could sum up to a nearly null value.
For example, if your real values are 2, 3, 5, 2, and 4 and your predicted values are 3, 2, 5, 1, 5, then the total error would be (3-2)+(2-3)+(5-5)+(1-2)+(5-4)=1-1+0-1+1=0 and the average error would be 0/5=0, which could lead to false conclusions.
However, if you compute the mean squared error, then you get (3-2)^2+(2-3)^2+(5-5)^2+(1-2)^2+(5-4)^2=4 and 4/5=0.8. By scaling the error back to the data and taking the square root of it, we get sqrt(0.8)=0.89, so on average, the predictions differ by 0.89 from the real value.
[RUN YOUR LINEAR REGRESSION MODEL](https://www.xlstat.com/en/download/trial/xlstat)
## What are the assumptions of Ordinary Least Squares (OLS)?
1\) Individuals (observations) are independent. It is in general true in daily situations (the amount of rainfall does not depend on the previous day, the income does not depend on the previous month, the height of a person does not depend on the person measured just before…).
2\) Variance is homogeneous. [Levene's test](https://www.xlstat.com/en/solutions/features/k-sample-comparison-of-variances) is proposed by XLSTAT to test the equality of the error variances.
3\) Residuals follow a normal distribution. XLSTAT offers several methods to test [normality](https://www.xlstat.com/en/solutions/features/normality-tests) of residuals.
Model **residuals** (or errors) are the distances between data points and the fitted model. Model residuals represent the part of variability in the data the model was unable to capture. The R\² statistic is the part of variability that is explained by the model, crucial in the evaluation of the null hypothesis. So the lower the residuals, the higher the R\² statistic.
**Homoscedasticity** and **independence** of the error terms are key hypotheses in linear regression where it is assumed that the variances of the error terms are independent and identically distributed and normally distributed. When these assumptions are not possible to keep, a consequence is that the covariance matrix cannot be estimated using the classical formula, and the variance of the parameters corresponding to the beta coefficients of the linear model can be wrong and their confidence intervals as well.
## How do I set up a linear regression model in XLSTAT?
In XLSTAT, you can easily run your ordinary least squares regression without even coding, just by data selection! You just have to select your dependent variable as well as your explanatory ones.

You can select several outcomes such as Descriptive statistics of your datasets, but also Correlations and Analysis of Variance.

Besides statistics and the equation of the model, you can also select the charts to be displayed, such as the regression one for example. You can see all the data points and the central regression line with a confidence interval.

## Predictions in OLS regression with XLSTAT
[Linear regression](https://www.xlstat.com/en/solutions/features/linear-regression) is often used to predict outputs' values for new samples. XLSTAT enables you to characterize the quality of the model for prediction before you use the regression equation.
## To go further: limitations of the Ordinary Least Squares regression
The limitations of the OLS regression come from the constraint of the inversion of the X’X matrix: it is required that the rank of the matrix is p+1, and some numerical problems may arise if the matrix is not well behaved. XLSTAT uses algorithms due to Dempster (1969) that allow circumventing these two issues: if the matrix rank equals q where q is strictly lower than p+1, some variables are removed from the model, either because they are constant or because they belong to a block of [collinear](https://www.xlstat.com/en/solutions/features/multicolinearity-statistics) variables.
## What are the advantages of OLS: variable selection
An automatic selection of the variables is performed if the user selects a too high number of variables compared to the number of observations. The theoretical limit is n-1, as with greater values the X’X matrix becomes non-invertible due to multicollinearity.
The deleting of some of the variables may however not be optimal: in some cases we might not add a variable to the model because it is almost collinear to some other variables or to a block of variables, but it might be that it would be more relevant to remove a variable that is already in the model and to the new variable.
For that reason, and also in order to handle the cases where there are a lot of explanatory variables or issues like multicollinearity, other methods have been developed such as [Partial Least Squares regression (PLS)](https://www.xlstat.com/en/solutions/features/partial-least-squares-regression).
## Tutorials for Ordinary Least Squares regression
Below you will find a list of examples using ordinary least squares regression analysis (OLS):
- [A simple linear regression model](https://xlst.at/simple-linear-regression)
- [A multiple linear regression model](https://xlst.at/multiple-linear-regression)
- [How to compute influence diagnostics for linear regression models](https://xlst.at/influence-diagnostics-for-linear-regression)
[TRY IT ON YOUR OWN DATA](https://www.xlstat.com/en/download/trial/xlstat)
[View all tutorials](https://help.xlstat.com/tutorial-guides)

### analyze your data with xlstat
14-day free trial
[Download xlstat](https://www.xlstat.com/download)
***
Related features
[Distribution fitting](https://www.xlstat.com/solutions/features/distribution-fitting)
[Linear regression](https://www.xlstat.com/solutions/features/linear-regression)
[ANOVA (Analysis of variance)](https://www.xlstat.com/solutions/features/anova-analysis-of-variance)
[Welch and Brown-Forsythe one-way ANOVA](https://www.xlstat.com/solutions/features/welch-and-brown-forsythe-one-way-anova)
[ANCOVA (Analysis of Covariance)](https://www.xlstat.com/solutions/features/ancova-analysis-of-covariance)
[Multivariate Analysis of Variance (MANOVA)](https://www.xlstat.com/solutions/features/multivariate-analysis-of-variance-manova)
[Logistic regression (Binary, Ordinal, Multinomial, …)](https://www.xlstat.com/solutions/features/logistic-regression-for-binary-response-data-and-polytomous-variables-logit-probit)
[Ordinal logit model](https://www.xlstat.com/solutions/features/ordinal-logit-model)
[Cubic splines](https://www.xlstat.com/solutions/features/cubic-splines)
[Nonlinear regression](https://www.xlstat.com/solutions/features/nonlinear-regression-genfit)
[Log-linear regression (Poisson regression)](https://www.xlstat.com/solutions/features/log-linear-regression-in-xlstat-poisson-regression)
[Quantile regression](https://www.xlstat.com/solutions/features/quantile-regression)
[Nonparametric regression (Kernel and Lowess)](https://www.xlstat.com/solutions/features/nonparametric-regression-kernel-and-lowess)
[Partial Least Squares regression (PLS)](https://www.xlstat.com/solutions/features/partial-least-squares-regression)
[PLS discriminant analysis](https://www.xlstat.com/solutions/features/pls-discriminant-analysis)
[Repeated measures Analysis of Variance (ANOVA)](https://www.xlstat.com/solutions/features/repeated-measures-analysis-of-variance-anova)
[Mixed models](https://www.xlstat.com/solutions/features/mixed-models)
[Principal Component Regression (PCR)](https://www.xlstat.com/solutions/features/principal-component-regression)
[Two-stage least squares regression](https://www.xlstat.com/solutions/features/two-stage-least-squares-regression)
[LASSO Regression](https://www.xlstat.com/solutions/features/lasso-regression)
[Ridge Regression](https://www.xlstat.com/solutions/features/ridge-regression)
***
[](https://lumivero.com/)
Expert Software for Better Insights, Research, and Outcomes
- [facebook](https://www.facebook.com/xlstat)
- [twitter](https://twitter.com/xlstat)
- [linkedid](https://www.linkedin.com/company/xlstat-by-lumivero/)
- [YouTube](https://youtube.com/@XLSTAT)
- [Contact us](https://www.xlstat.com/contact/sales)
- [About us](https://lumivero.com/company/)
Copyright ©2026 Lumivero. All Rights Reserved.
- [Legal mentions](https://www.xlstat.com/legal/legal-mentions)
- [Privacy policy](https://www.xlstat.com/legal/privacy-policy)
- [Terms of use](https://www.xlstat.com/legal/terms-of-use)
- [Terms of sale](https://www.xlstat.com/legal/terms-of-sale) |
| Readable Markdown | Ordinary Least Squares regression (OLS), often called linear regression, is available in Excel using the XLSTAT add-on statistical software.
**Ordinary Least Squares regression** (**OLS**) is a common technique for estimating coefficients of [linear regression](https://www.xlstat.com/en/products-solutions/feature/linear-regression.html) equations which describe the relationship between one or more independent quantitative variables and a dependent variable (simple or multiple linear regression), often evaluated using r-squared. **Least squares** stand for the minimum **squares error (SSE)**. Maximum likelihood and Generalized method of moments estimator are alternative approaches to OLS.
In practice, you can use linear regression in many fields:
- meteorology, if you need to predict temperature or rainfall based on external factors.
- biology, if you need to predict the number of remaining individuals in a species depending on the number of predators or life resources.
- economy, if you need to predict a company’s turnover based on the amount of sales.
- … and many more.
*Statsmodels 0.15.0 and XLSTAT both offer linear regression functionalities, but they differ significantly in terms of their features, usage, and target audiences.*
## A bit of theory: Equations for the Ordinary Least Squares regression
### The ordinary least squares formula: what is the equation of the model?
In the case of a model with *p* explanatory variables, the OLS regression model writes:
Y = β0 + Σj=1..p βjXj + ε
where Y is the dependent variable, β0, is the intercept of the model, X j corresponds to the jth explanatory variable of the model (j= 1 to p), and e is the random error with expectation 0 and variance σ\².
In the case where there are n observations, the estimation of the predicted value of the dependent variable Y for the ith observation is given by the regression equation:
yi = β0 + Σj=1..p βjXij
Example: We want to predict the height of plants depending on the number of days they have spent in the sun, where days spent in the sun is an independent variable. Before getting exposure, they are 30 cm. A plant grows 1 mm (0.1 cm) after being exposed to the sun for a day.
- Y is the height of the plants
- X is the number of days spent in the sun
- β0 is 30 because it is the value of Y when X is 0.
- β1 is 0.1 because it is the coefficient multiplied by the number of days.
A plant being exposed 5 days to the sun has therefore an estimated height of Y = 30 + 0.1\*5 = 30.5 cm.
Of course, it is not always exact, which is why we must take into account the random error ε (or the null hypothesis being tested).
Moreover, before predicting, our method must find the beta coefficients: we just start out by inputting a table containing the heights of several plants along with the number of days they have spent in the sun. If you want to find out more about the calculations, read the next paragraph.
### How do ordinary least squares (OLS) work?
The OLS method aims to minimize the sum of square differences between the observed and predicted values, considering potential issues of multicollinearity.
That way, the vector β of the coefficients can be estimated by the following formula
β = (X’DX)-1 X’ Dy
with X the matrix of the explanatory variables preceded by a vector of 1s, D is a matrix with the wi weights on its diagonal, and y the vector of the n observed values of the dependent variable
The vector of the predicted values can be written as follows:
y\* = Xβ=X (X’ DX)-1 X’Dy
We can even the variance σ\² of the random error ε by the following formula :
σ\² = 1/(W –p\*) Σi=1..n wi(yi - y\*i)
where p\* is the number of explanatory variables to which we add 1 if the intercept is not fixed, wi is the weight of the ith observation, W is the sum of the wi weights, y the \* vector of the observed values and y\* the vector of predicted values
### What is the intuitive explanation of the least squares method?
Intuitively speaking, the aim of the ordinary least squares method is to minimize the prediction error, between the predicted and real values. One may ask themselves why we choose to minimize the sum of squared errors instead of the sum of errors directly.
It takes into account the sum of squared errors instead of the errors as they are because sometimes they can be negative or positive and they could sum up to a nearly null value.
For example, if your real values are 2, 3, 5, 2, and 4 and your predicted values are 3, 2, 5, 1, 5, then the total error would be (3-2)+(2-3)+(5-5)+(1-2)+(5-4)=1-1+0-1+1=0 and the average error would be 0/5=0, which could lead to false conclusions.
However, if you compute the mean squared error, then you get (3-2)^2+(2-3)^2+(5-5)^2+(1-2)^2+(5-4)^2=4 and 4/5=0.8. By scaling the error back to the data and taking the square root of it, we get sqrt(0.8)=0.89, so on average, the predictions differ by 0.89 from the real value.
[RUN YOUR LINEAR REGRESSION MODEL](https://www.xlstat.com/en/download/trial/xlstat)
## What are the assumptions of Ordinary Least Squares (OLS)?
1\) Individuals (observations) are independent. It is in general true in daily situations (the amount of rainfall does not depend on the previous day, the income does not depend on the previous month, the height of a person does not depend on the person measured just before…).
2\) Variance is homogeneous. [Levene's test](https://www.xlstat.com/en/solutions/features/k-sample-comparison-of-variances) is proposed by XLSTAT to test the equality of the error variances.
3\) Residuals follow a normal distribution. XLSTAT offers several methods to test [normality](https://www.xlstat.com/en/solutions/features/normality-tests) of residuals.
Model **residuals** (or errors) are the distances between data points and the fitted model. Model residuals represent the part of variability in the data the model was unable to capture. The R\² statistic is the part of variability that is explained by the model, crucial in the evaluation of the null hypothesis. So the lower the residuals, the higher the R\² statistic.
**Homoscedasticity** and **independence** of the error terms are key hypotheses in linear regression where it is assumed that the variances of the error terms are independent and identically distributed and normally distributed. When these assumptions are not possible to keep, a consequence is that the covariance matrix cannot be estimated using the classical formula, and the variance of the parameters corresponding to the beta coefficients of the linear model can be wrong and their confidence intervals as well.
## How do I set up a linear regression model in XLSTAT?
In XLSTAT, you can easily run your ordinary least squares regression without even coding, just by data selection! You just have to select your dependent variable as well as your explanatory ones.

You can select several outcomes such as Descriptive statistics of your datasets, but also Correlations and Analysis of Variance.

Besides statistics and the equation of the model, you can also select the charts to be displayed, such as the regression one for example. You can see all the data points and the central regression line with a confidence interval.

## Predictions in OLS regression with XLSTAT
[Linear regression](https://www.xlstat.com/en/solutions/features/linear-regression) is often used to predict outputs' values for new samples. XLSTAT enables you to characterize the quality of the model for prediction before you use the regression equation.
## To go further: limitations of the Ordinary Least Squares regression
The limitations of the OLS regression come from the constraint of the inversion of the X’X matrix: it is required that the rank of the matrix is p+1, and some numerical problems may arise if the matrix is not well behaved. XLSTAT uses algorithms due to Dempster (1969) that allow circumventing these two issues: if the matrix rank equals q where q is strictly lower than p+1, some variables are removed from the model, either because they are constant or because they belong to a block of [collinear](https://www.xlstat.com/en/solutions/features/multicolinearity-statistics) variables.
## What are the advantages of OLS: variable selection
An automatic selection of the variables is performed if the user selects a too high number of variables compared to the number of observations. The theoretical limit is n-1, as with greater values the X’X matrix becomes non-invertible due to multicollinearity.
The deleting of some of the variables may however not be optimal: in some cases we might not add a variable to the model because it is almost collinear to some other variables or to a block of variables, but it might be that it would be more relevant to remove a variable that is already in the model and to the new variable.
For that reason, and also in order to handle the cases where there are a lot of explanatory variables or issues like multicollinearity, other methods have been developed such as [Partial Least Squares regression (PLS)](https://www.xlstat.com/en/solutions/features/partial-least-squares-regression).
## Tutorials for Ordinary Least Squares regression
Below you will find a list of examples using ordinary least squares regression analysis (OLS):
- [A simple linear regression model](https://xlst.at/simple-linear-regression)
- [A multiple linear regression model](https://xlst.at/multiple-linear-regression)
- [How to compute influence diagnostics for linear regression models](https://xlst.at/influence-diagnostics-for-linear-regression)
[TRY IT ON YOUR OWN DATA](https://www.xlstat.com/en/download/trial/xlstat) |
| Shard | 106 (laksa) |
| Root Hash | 15077506440926382106 |
| Unparsed URL | com,xlstat!www,/solutions/features/ordinary-least-squares-regression-ols s443 |