ℹ️ Skipped - page is already crawled
| Filter | Status | Condition | Details |
|---|---|---|---|
| HTTP status | PASS | download_http_code = 200 | HTTP 200 |
| Age cutoff | PASS | download_stamp > now() - 6 MONTH | 2.7 months ago |
| History drop | PASS | isNull(history_drop_reason) | No drop reason |
| Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0 |
| Canonical | PASS | meta_canonical IS NULL OR = '' OR = src_unparsed | Not set |
| Property | Value |
|---|---|
| URL | https://llego.dev/posts/pandas-excluding-missing-values/ |
| Last Crawled | 2026-01-25 13:13:24 (2 months ago) |
| First Indexed | 2023-10-12 00:29:38 (2 years ago) |
| HTTP Status Code | 200 |
| Meta Title | How to Exclude Missing Values in Python Pandas - llego.dev |
| Meta Description | This comprehensive Pandas how-to guide covers detecting, dropping, filling, and carefully handling missing NaN values in DataFrames for cleaning data before analysis. |
| Meta Canonical | null |
| Boilerpipe Text | Handling missing data is a common task in data analysis and data science projects. Pandas, Python’s popular data analysis library, provides many useful features for dealing with missing values in DataFrames and Series objects.
In Pandas, missing values are represented by NaN (Not a Number). Filtering out missing values or excluding rows and columns containing NaN values is often necessary for preparing clean, complete datasets for analysis and modeling. This allows you to focus on just the valid, complete observations without distortion from null entries.
This guide will demonstrate various methods to exclude or drop missing values in Pandas using realistic examples. We will cover key topics including:
Table of Contents
Open Table of Contents
Checking for Null Values
Dropping Rows and Columns with NaN Values
Drop Rows with
dropna()
Drop Columns with
drop()
Filling Missing Values
fillna()
interpolate()
replace()
Excluding Nulls During Joins/Merges
Treating NaN as a Value
Conclusion
Checking for Null Values
The first step is detecting which cells in a DataFrame or Series contain NaN values. Pandas provides the
isna()
and
notna()
methods to check for nulls and non-nulls respectively:
import
pandas
as
pd
df
=
pd.
DataFrame
({
'A'
: [
1
,
2
, np.nan],
'B'
: [
5
, np.nan, np.nan]})
df.
isna
()
# Returns True for each NaN value
df.
notna
()
# Returns True for each non-NaN value
The
isnull()
and
notnull()
methods are aliases of
isna()
and
notna()
.
Summing the Boolean DataFrame returned by
isna()
gives the total count of missing values per column:
df.
isna
().
sum
()
A
1
B
2
dtype: int64
To count NaN values in the entire DataFrame, use
isna().sum().sum()
.
These methods enable detecting nulls before handling them according to your requirements.
Dropping Rows and Columns with NaN Values
A common way to handle missing data is to simply exclude rows and/or columns containing NaN values. Pandas provides two main methods for this:
Drop Rows with
dropna()
The
dropna()
method on DataFrames excludes any rows containing NaN values.
df
=
pd.
DataFrame
({
'A'
: [
1
,
2
, np.nan],
'B'
: [
5
, np.nan, np.nan]})
df.
dropna
()
# Drops second and third row
By default
dropna()
drops any row with at least 1 NaN. Pass
how='all'
to only drop rows where
all
values are NaN:
df
=
pd.
DataFrame
({
'A'
: [
1
, np.nan, np.nan],
'B'
: [
5
, np.nan, np.nan]})
df.
dropna
(
how
=
'all'
)
# Only drops third row with all NaNs
To drop rows with at least a certain number of NaNs, use the
thresh
argument:
df.
dropna
(
thresh
=
2
)
# Would drop rows with >= 2 NaNs
subset
allows specifying columns to consider when looking for NaNs:
df.
dropna
(
subset
=
[
'A'
])
# Only drops rows with NaN in column 'A'
Drop Columns with
drop()
The
drop()
method can exclude columns containing null values by passing
axis=1
:
df.
drop
(
columns
=
[
'A'
],
axis
=
1
)
# Drops column A
thresh
can again be used to drop columns with at least a certain count of NaNs:
df.
dropna
(
axis
=
1
,
thresh
=
1
)
# Drops any column with >= 1 NaN
This provides an easy way to filter out columns missing too much data to be usable.
Filling Missing Values
Sometimes it is preferable to fill or impute missing values rather than excluding entire rows or columns containing them. This retains more complete data.
Pandas provides several methods for imputation like
fillna()
,
interpolate()
, and
replace()
.
fillna()
The
fillna()
method replaces NaN values with a specified value. Passing a single value fills all NaNs with it:
df.
fillna
(
0
)
# Replace NaNs with 0
Dictionary mapping can be passed to fill different values per column:
df.
fillna
({
'A'
:
0
,
'B'
:
5
})
Use
method='ffill'
/
'bfill'
to propagate next/previous valid values forward/backward:
df.
fillna
(
method
=
'ffill'
)
This carries over the last valid observation to fill NaNs in following rows.
interpolate()
For numeric data,
interpolate()
fills NaN values via linear interpolation of neighboring points:
s
=
pd.
Series
([
1
, np.nan, np.nan,
10
])
s.
interpolate
()
# Fills NaNs by interpolating between 1 and 10
Various interpolation methods like ‘spline’ and ‘akima’ are
supported
.
replace()
The
replace()
method can substitute NaN with a new value:
df.
replace
(np.nan,
5
)
This provides a simple, flexible way to fill null values while copying the DataFrame.
Filling missing data preserves more complete rows/columns and may produce better analysis results than dropping NaNs.
Excluding Nulls During Joins/Merges
NaN values can also be excluded when combining DataFrames using operations like join/merge.
By default, joins preserve all values (including NaNs) from the original DataFrames:
df1
=
pd.
DataFrame
({
'A'
: [
1
,
2
],
'B'
: [
5
, np.nan]})
df2
=
pd.
DataFrame
({
'A'
: [
1
,
3
],
'C'
: [
7
,
8
]})
df1.
merge
(df2)
# NaN value is retained
Passing
how='inner'
joins on only the intersection of rows containing no null values:
df1.
merge
(df2,
how
=
'inner'
)
# Row with NaN is excluded
For outer joins, pass
indicator=True
and filter for rows having
_merge
equal to
'both'
:
df1.
merge
(df2,
how
=
'outer'
,
indicator
=
True
)
.
query
(
'_merge == "both"'
)
# Null rows are excluded after merge
This avoids incorporating NaN values from the join/merge output.
Treating NaN as a Value
In some cases, you may want to explicitly include NaN values rather than exclude them. This treats NaN as an actual value category or level in the data.
Methods like
groupby()
,
value_counts()
, and
pivot_table()
by default will ignore NaN values.
To include NaN values in the output, pass
dropna=False
:
df.
groupby
(
'A'
,
dropna
=
False
).
mean
()
# NaN is retained as a group
This lets you analyze null values alongside actual data.
Conclusion
Handling missing data is vital for preparing Pandas DataFrames for analysis and visualization. By detecting, dropping, filling, and carefully treating NaN values, you can wrangle messy datasets into clean, usable formats.
The techniques covered in this guide, including
isna()
,
dropna()
,
fillna()
, joining on complete rows, and handling NaN as a group, provide code-driven solutions to exclude or retain missing values when needed.
Mastering Pandas’ tools for missing data will boost your proficiency for real-world data science tasks. The skills are broadly applicable across disciplines including machine learning, analytics, finance, sciences and more.
For further learning, refer to the official Pandas documentation on
Working with Missing Data
. The example codes can be adapted to your specific datasets and use cases when encountering null values. |
| Markdown | [Skip to content](https://llego.dev/posts/pandas-excluding-missing-values/#main-content)
[llego.dev](https://llego.dev/)
- [Posts](https://llego.dev/posts/)
- [Tags](https://llego.dev/tags/)
- [About](https://llego.dev/about/)
- [Services](https://llego.dev/services/)
- [Search](https://llego.dev/search/ "Search")
***
Go Back
# How to Exclude Missing Values in Python Pandas
Updated:
Mar 29, 2023
\| at 05:37 AM
Handling missing data is a common task in data analysis and data science projects. Pandas, Python’s popular data analysis library, provides many useful features for dealing with missing values in DataFrames and Series objects.
In Pandas, missing values are represented by NaN (Not a Number). Filtering out missing values or excluding rows and columns containing NaN values is often necessary for preparing clean, complete datasets for analysis and modeling. This allows you to focus on just the valid, complete observations without distortion from null entries.
This guide will demonstrate various methods to exclude or drop missing values in Pandas using realistic examples. We will cover key topics including:
## Table of Contents
Open Table of Contents
- [Checking for Null Values](https://llego.dev/posts/pandas-excluding-missing-values/#checking-for-null-values)
- [Dropping Rows and Columns with NaN Values](https://llego.dev/posts/pandas-excluding-missing-values/#dropping-rows-and-columns-with-nan-values)
- [Drop Rows with `dropna()`](https://llego.dev/posts/pandas-excluding-missing-values/#drop-rows-with-dropna)
- [Drop Columns with `drop()`](https://llego.dev/posts/pandas-excluding-missing-values/#drop-columns-with-drop)
- [Filling Missing Values](https://llego.dev/posts/pandas-excluding-missing-values/#filling-missing-values)
- [fillna()](https://llego.dev/posts/pandas-excluding-missing-values/#fillna)
- [interpolate()](https://llego.dev/posts/pandas-excluding-missing-values/#interpolate)
- [replace()](https://llego.dev/posts/pandas-excluding-missing-values/#replace)
- [Excluding Nulls During Joins/Merges](https://llego.dev/posts/pandas-excluding-missing-values/#excluding-nulls-during-joinsmerges)
- [Treating NaN as a Value](https://llego.dev/posts/pandas-excluding-missing-values/#treating-nan-as-a-value)
- [Conclusion](https://llego.dev/posts/pandas-excluding-missing-values/#conclusion)
## Checking for Null Values
The first step is detecting which cells in a DataFrame or Series contain NaN values. Pandas provides the `isna()` and `notna()` methods to check for nulls and non-nulls respectively:
```
import pandas as pd
df = pd.DataFrame({'A': [1, 2, np.nan],
'B': [5, np.nan, np.nan]})
df.isna()
# Returns True for each NaN value
df.notna()
# Returns True for each non-NaN value
```
The `isnull()` and `notnull()` methods are aliases of `isna()` and `notna()`.
Summing the Boolean DataFrame returned by `isna()` gives the total count of missing values per column:
```
df.isna().sum()
A 1
B 2
dtype: int64
```
To count NaN values in the entire DataFrame, use `isna().sum().sum()`.
These methods enable detecting nulls before handling them according to your requirements.
## Dropping Rows and Columns with NaN Values
A common way to handle missing data is to simply exclude rows and/or columns containing NaN values. Pandas provides two main methods for this:
### Drop Rows with `dropna()`
The `dropna()` method on DataFrames excludes any rows containing NaN values.
```
df = pd.DataFrame({'A': [1, 2, np.nan],
'B': [5, np.nan, np.nan]})
df.dropna()
# Drops second and third row
```
By default `dropna()` drops any row with at least 1 NaN. Pass `how='all'` to only drop rows where *all* values are NaN:
```
df = pd.DataFrame({'A': [1, np.nan, np.nan],
'B': [5, np.nan, np.nan]})
df.dropna(how='all')
# Only drops third row with all NaNs
```
To drop rows with at least a certain number of NaNs, use the `thresh` argument:
```
df.dropna(thresh=2)
# Would drop rows with >= 2 NaNs
```
`subset` allows specifying columns to consider when looking for NaNs:
```
df.dropna(subset=['A'])
# Only drops rows with NaN in column 'A'
```
### Drop Columns with `drop()`
The `drop()` method can exclude columns containing null values by passing `axis=1`:
```
df.drop(columns=['A'], axis=1)
# Drops column A
```
`thresh` can again be used to drop columns with at least a certain count of NaNs:
```
df.dropna(axis=1, thresh=1)
# Drops any column with >= 1 NaN
```
This provides an easy way to filter out columns missing too much data to be usable.
## Filling Missing Values
Sometimes it is preferable to fill or impute missing values rather than excluding entire rows or columns containing them. This retains more complete data.
Pandas provides several methods for imputation like `fillna()`, `interpolate()`, and `replace()`.
### fillna()
The `fillna()` method replaces NaN values with a specified value. Passing a single value fills all NaNs with it:
```
df.fillna(0) # Replace NaNs with 0
```
Dictionary mapping can be passed to fill different values per column:
```
df.fillna({'A': 0, 'B': 5})
```
Use `method='ffill'`/`'bfill'` to propagate next/previous valid values forward/backward:
```
df.fillna(method='ffill')
```
This carries over the last valid observation to fill NaNs in following rows.
### interpolate()
For numeric data, `interpolate()` fills NaN values via linear interpolation of neighboring points:
```
s = pd.Series([1, np.nan, np.nan, 10])
s.interpolate()
# Fills NaNs by interpolating between 1 and 10
```
Various interpolation methods like ‘spline’ and ‘akima’ are [supported](https://pandas.pydata.org/docs/reference/api/pandas.Series.interpolate.html).
### replace()
The `replace()` method can substitute NaN with a new value:
```
df.replace(np.nan, 5)
```
This provides a simple, flexible way to fill null values while copying the DataFrame.
Filling missing data preserves more complete rows/columns and may produce better analysis results than dropping NaNs.
## Excluding Nulls During Joins/Merges
NaN values can also be excluded when combining DataFrames using operations like join/merge.
By default, joins preserve all values (including NaNs) from the original DataFrames:
```
df1 = pd.DataFrame({'A': [1, 2], 'B': [5, np.nan]})
df2 = pd.DataFrame({'A': [1, 3], 'C': [7, 8]})
df1.merge(df2)
# NaN value is retained
```
Passing `how='inner'` joins on only the intersection of rows containing no null values:
```
df1.merge(df2, how='inner')
# Row with NaN is excluded
```
For outer joins, pass `indicator=True` and filter for rows having `_merge` equal to `'both'`:
```
df1.merge(df2, how='outer', indicator=True)
.query('_merge == "both"')
# Null rows are excluded after merge
```
This avoids incorporating NaN values from the join/merge output.
## Treating NaN as a Value
In some cases, you may want to explicitly include NaN values rather than exclude them. This treats NaN as an actual value category or level in the data.
Methods like `groupby()`, `value_counts()`, and `pivot_table()` by default will ignore NaN values.
To include NaN values in the output, pass `dropna=False`:
```
df.groupby('A', dropna=False).mean()
# NaN is retained as a group
```
This lets you analyze null values alongside actual data.
## Conclusion
Handling missing data is vital for preparing Pandas DataFrames for analysis and visualization. By detecting, dropping, filling, and carefully treating NaN values, you can wrangle messy datasets into clean, usable formats.
The techniques covered in this guide, including `isna()`, `dropna()`, `fillna()`, joining on complete rows, and handling NaN as a group, provide code-driven solutions to exclude or retain missing values when needed.
Mastering Pandas’ tools for missing data will boost your proficiency for real-world data science tasks. The skills are broadly applicable across disciplines including machine learning, analytics, finance, sciences and more.
For further learning, refer to the official Pandas documentation on [Working with Missing Data](https://pandas.pydata.org/docs/user_guide/missing_data.html). The example codes can be adapted to your specific datasets and use cases when encountering null values.
- [python](https://llego.dev/tags/python/)
- [pandas](https://llego.dev/tags/pandas/)
Back to Top
Share this post on:
[Share this post via WhatsApp](https://wa.me/?text=https://llego.dev/posts/pandas-excluding-missing-values/ "Share this post via WhatsApp")
[Share this post on Facebook](https://www.facebook.com/sharer.php?u=https://llego.dev/posts/pandas-excluding-missing-values/ "Share this post on Facebook")
[Tweet this post](https://twitter.com/intent/tweet?url=https://llego.dev/posts/pandas-excluding-missing-values/ "Tweet this post")
[Share this post via Telegram](https://t.me/share/url?url=https://llego.dev/posts/pandas-excluding-missing-values/ "Share this post via Telegram")
[Share this post on Pinterest](https://pinterest.com/pin/create/button/?url=https://llego.dev/posts/pandas-excluding-missing-values/ "Share this post on Pinterest")
[Share this post via email](https://llego.dev/cdn-cgi/l/email-protection#714e0204131b1412054c22141454434105191802544341011e0205575242494a131e15084c19050501024b5e5e1d1d14161e5f1514075e011e0205025e01101f1510025c1409121d0415181f165c1c180202181f165c07101d0414025e "Share this post via email")
***
[Mark Anthony Llego on Github](https://github.com/llegomark " Mark Anthony Llego on Github")
[Mark Anthony Llego on X](https://twitter.com/markllego "Mark Anthony Llego on X")
[Mark Anthony Llego on Discord](https://discordapp.com/users/1012984419029622784 "Mark Anthony Llego on Discord")
[Mark Anthony Llego on Hugging Face](https://huggingface.co/markllego "Mark Anthony Llego on Hugging Face")
[Send an email to Mark Anthony Llego](https://llego.dev/cdn-cgi/l/email-protection#c6aba7b4adaaaaa3a1a986a1aba7afaae8a5a9ab "Send an email to Mark Anthony Llego")
© 2025 Mark Anthony Llego \| [Legal](https://llego.dev/legal) |
| Readable Markdown | null |
| Shard | 108 (laksa) |
| Root Hash | 11443539829787219308 |
| Unparsed URL | dev,llego!/posts/pandas-excluding-missing-values/ s443 |