ℹ️ Skipped - page is already crawled
| Filter | Status | Condition | Details |
|---|---|---|---|
| HTTP status | PASS | download_http_code = 200 | HTTP 200 |
| Age cutoff | PASS | download_stamp > now() - 6 MONTH | 0.5 months ago |
| History drop | PASS | isNull(history_drop_reason) | No drop reason |
| Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0 |
| Canonical | PASS | meta_canonical IS NULL OR = '' OR = src_unparsed | Not set |
| Property | Value |
|---|---|
| URL | https://note.nkmk.me/en/python-pandas-nan-dropna/ |
| Last Crawled | 2026-03-27 12:27:53 (16 days ago) |
| First Indexed | 2022-02-13 07:19:01 (4 years ago) |
| HTTP Status Code | 200 |
| Meta Title | pandas: Remove NaN (missing values) with dropna() | note.nkmk.me |
| Meta Description | You can remove NaN from pandas.DataFrame and pandas.Series with the dropna() method. pandas.DataFrame.dropna - pandas 2.0.3 documentation pandas.Series.dropna - pandas 2.0.3 documentation Remove row ... |
| Meta Canonical | null |
| Boilerpipe Text | You can remove NaN from pandas.DataFrame and pandas.Series with the dropna() method.
pandas.DataFrame.dropna - pandas 2.0.3 documentation
pandas.Series.dropna - pandas 2.0.3 documentation
Contents
Remove rows/columns where all elements are NaN: how='all'
Remove rows/columns that contain at least one NaN: how='any' (default)
Remove rows/columns according to the number of non-missing values: thresh
Remove based on specific rows/columns: subset
Update the original object: inplace
For pandas.Series
While this article primarily deals with NaN (Not a Number), it's important to note that in pandas, None is also treated as a missing value.
Missing values in pandas (nan, None, pd.NA)
See the following articles on extracting, replacing, and counting missing values.
pandas: Find rows/columns with NaN (missing values)
pandas: Replace NaN (missing values) with fillna()
pandas: Detect and count NaN (missing values) with isnull(), isna()
The sample code in this article uses pandas version 2.0.3. As an example, read a CSV file with missing values.
sample_pandas_normal_nan.csv
import pandas as pd
print(pd.__version__)
# 2.0.3
df = pd.read_csv('data/src/sample_pandas_normal_nan.csv')
print(df)
# name age state point other
# 0 Alice 24.0 NY NaN NaN
# 1 NaN NaN NaN NaN NaN
# 2 Charlie NaN CA NaN NaN
# 3 Dave 68.0 TX 70.0 NaN
# 4 Ellen NaN CA 88.0 NaN
# 5 Frank 30.0 NaN NaN NaN
Remove rows/columns where all elements are NaN: how='all'
By setting how='all', rows where all elements are NaN are removed.
print(df.dropna(how='all'))
# name age state point other
# 0 Alice 24.0 NY NaN NaN
# 2 Charlie NaN CA NaN NaN
# 3 Dave 68.0 TX 70.0 NaN
# 4 Ellen NaN CA 88.0 NaN
# 5 Frank 30.0 NaN NaN NaN
If axis is set to 1 or 'columns', columns where all elements are NaN are removed.
print(df.dropna(how='all', axis=1))
# name age state point
# 0 Alice 24.0 NY NaN
# 1 NaN NaN NaN NaN
# 2 Charlie NaN CA NaN
# 3 Dave 68.0 TX 70.0
# 4 Ellen NaN CA 88.0
# 5 Frank 30.0 NaN NaN
Note that if axis is set to 0 or 'index', rows are removed. Since the default value of axis is 0, rows are removed if omitted, as shown in the first example.
In former versions, both rows and columns were removed with axis=[0, 1], but since version 1.0.0, axis can no longer be specified with a list or tuple. If you want to remove both rows and columns, you can repeatedly apply dropna().
# print(df.dropna(how='all', axis=[0, 1]))
# TypeError: supplying multiple axes to axis is no longer supported.
print(df.dropna(how='all').dropna(how='all', axis=1))
# name age state point
# 0 Alice 24.0 NY NaN
# 2 Charlie NaN CA NaN
# 3 Dave 68.0 TX 70.0
# 4 Ellen NaN CA 88.0
# 5 Frank 30.0 NaN NaN
Remove rows/columns that contain at least one NaN: how='any' (default)
To prepare an example, first remove the rows and columns where all values are NaN.
df2 = df.dropna(how='all').dropna(how='all', axis=1)
print(df2)
# name age state point
# 0 Alice 24.0 NY NaN
# 2 Charlie NaN CA NaN
# 3 Dave 68.0 TX 70.0
# 4 Ellen NaN CA 88.0
# 5 Frank 30.0 NaN NaN
By setting how='any', rows that contain at least one NaN are removed. Since the default value of how is 'any', the result is the same even if omitted.
print(df2.dropna(how='any'))
# name age state point
# 3 Dave 68.0 TX 70.0
print(df2.dropna())
# name age state point
# 3 Dave 68.0 TX 70.0
If axis is set to 1 or 'columns', columns that contain at least one NaN are removed.
print(df2.dropna(axis=1))
# name
# 0 Alice
# 2 Charlie
# 3 Dave
# 4 Ellen
# 5 Frank
Remove rows/columns according to the number of non-missing values: thresh
With the thresh argument, you can remove rows and columns according to the number of non-missing values.
For example, if thresh=3, rows that contain three or more non-missing values remain, and the other rows are removed.
print(df.dropna(thresh=3))
# name age state point other
# 0 Alice 24.0 NY NaN NaN
# 3 Dave 68.0 TX 70.0 NaN
# 4 Ellen NaN CA 88.0 NaN
If axis is set to 1 or 'columns', columns are removed.
print(df.dropna(thresh=3, axis=1))
# name age state
# 0 Alice 24.0 NY
# 1 NaN NaN NaN
# 2 Charlie NaN CA
# 3 Dave 68.0 TX
# 4 Ellen NaN CA
# 5 Frank 30.0 NaN
Remove based on specific rows/columns: subset
If you want to remove based on specific rows and columns, pass a list of row/column labels (names) to the subset argument of dropna(). A single label can be given as a one-element list, like subset=['name']; pandas 2.0.3 also accepts a bare label such as subset='name'.
Since the default is how='any' and axis=0, rows with NaN in the columns specified by subset are removed.
print(df.dropna(subset=['age']))
# name age state point other
# 0 Alice 24.0 NY NaN NaN
# 3 Dave 68.0 TX 70.0 NaN
# 5 Frank 30.0 NaN NaN NaN
print(df.dropna(subset=['age', 'state']))
# name age state point other
# 0 Alice 24.0 NY NaN NaN
# 3 Dave 68.0 TX 70.0 NaN
If how is set to 'all', rows with NaN in all specified columns are removed.
print(df.dropna(subset=['age', 'state'], how='all'))
# name age state point other
# 0 Alice 24.0 NY NaN NaN
# 2 Charlie NaN CA NaN NaN
# 3 Dave 68.0 TX 70.0 NaN
# 4 Ellen NaN CA 88.0 NaN
# 5 Frank 30.0 NaN NaN NaN
If axis is set to 1 or 'columns', columns are removed.
print(df.dropna(subset=[0, 4], axis=1))
# name state
# 0 Alice NY
# 1 NaN NaN
# 2 Charlie CA
# 3 Dave TX
# 4 Ellen CA
# 5 Frank NaN
print(df.dropna(subset=[0, 4], axis=1, how='all'))
# name age state point
# 0 Alice 24.0 NY NaN
# 1 NaN NaN NaN NaN
# 2 Charlie NaN CA NaN
# 3 Dave 68.0 TX 70.0
# 4 Ellen NaN CA 88.0
# 5 Frank 30.0 NaN NaN
An error is raised if a non-existent row or column name is specified. An error is also raised if you set axis=1 but specify column names or set axis=0 (default) but specify row names.
# print(df.dropna(subset=['age', 'state', 'xxx']))
# KeyError: ['xxx']
# print(df.dropna(subset=['age', 'state'], axis=1))
# KeyError: ['age', 'state']
Update the original object: inplace
As shown in the examples above, by default, a new object is returned, and the original object is not changed, but if inplace=True, the original object itself is updated.
df.dropna(subset=['age'], inplace=True)
print(df)
# name age state point other
# 0 Alice 24.0 NY NaN NaN
# 3 Dave 68.0 TX 70.0 NaN
# 5 Frank 30.0 NaN NaN NaN
For pandas.Series
For pandas.Series, the axis and how arguments of dropna() are accepted only for compatibility with DataFrame and have no effect, so inplace is the main argument of interest. Since it is one-dimensional data, the elements with NaN are simply removed.
s = pd.read_csv('data/src/sample_pandas_normal_nan.csv')['age']
print(s)
# 0 24.0
# 1 NaN
# 2 NaN
# 3 68.0
# 4 NaN
# 5 30.0
# Name: age, dtype: float64
print(s.dropna())
# 0 24.0
# 3 68.0
# 5 30.0
# Name: age, dtype: float64
s.dropna(inplace=True)
print(s)
# 0 24.0
# 3 68.0
# 5 30.0
# Name: age, dtype: float64 |
| Markdown | [note.nkmk.me](https://note.nkmk.me/en/)
# pandas: Remove NaN (missing values) with dropna()
Modified: 2023-08-02 \| Tags: [Python](https://note.nkmk.me/en/python/), [pandas](https://note.nkmk.me/en/pandas/)
You can remove `NaN` from `pandas.DataFrame` and `pandas.Series` with the `dropna()` method.
- [pandas.DataFrame.dropna - pandas 2.0.3 documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dropna.html)
- [pandas.Series.dropna - pandas 2.0.3 documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.dropna.html)
Contents
- [Remove rows/columns where all elements are `NaN`: `how='all'`](https://note.nkmk.me/en/python-pandas-nan-dropna/#remove-rowscolumns-where-all-elements-are-nan-howall)
- [Remove rows/columns that contain at least one `NaN`: `how='any'` (default)](https://note.nkmk.me/en/python-pandas-nan-dropna/#remove-rowscolumns-that-contain-at-least-one-nan-howany-default)
- [Remove rows/columns according to the number of non-missing values: `thresh`](https://note.nkmk.me/en/python-pandas-nan-dropna/#remove-rowscolumns-according-to-the-number-of-non-missing-values-thresh)
- [Remove based on specific rows/columns: `subset`](https://note.nkmk.me/en/python-pandas-nan-dropna/#remove-based-on-specific-rowscolumns-subset)
- [Update the original object: `inplace`](https://note.nkmk.me/en/python-pandas-nan-dropna/#update-the-original-object-inplace)
- [For `pandas.Series`](https://note.nkmk.me/en/python-pandas-nan-dropna/#for-pandasseries)
While this article primarily deals with `NaN` (Not a Number), it's important to note that in pandas, `None` is also treated as a missing value.
- [Missing values in pandas (nan, None, pd.NA)](https://note.nkmk.me/en/python-pandas-nan-none-na/)
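As a minimal sketch of that point, the following builds a small, hypothetical DataFrame (the column names `label`, `value`, and `flag` are made up for illustration) that mixes `None`, `np.nan`, and `pd.NA`, and checks that `isna()` and `dropna()` treat all three as missing.

```python
import numpy as np
import pandas as pd

# Hypothetical data: each column holds a different kind of missing marker.
df_missing = pd.DataFrame({
    'label': ['a', None, 'c', 'd'],       # None in an object column
    'value': [1.0, 2.0, np.nan, 4.0],     # NaN in a float column
    'flag': [True, False, True, pd.NA],   # pd.NA in an object column
})

# Each column reports exactly one missing value.
print(df_missing.isna().sum().tolist())
# [1, 1, 1]

# With the default how='any', only the fully populated row survives.
cleaned = df_missing.dropna()
print(cleaned.index.tolist())
# [0]
```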
See the following articles on extracting, replacing, and counting missing values.
- [pandas: Find rows/columns with NaN (missing values)](https://note.nkmk.me/en/python-pandas-nan-extract/)
- [pandas: Replace NaN (missing values) with fillna()](https://note.nkmk.me/en/python-pandas-nan-fillna/)
- [pandas: Detect and count NaN (missing values) with isnull(), isna()](https://note.nkmk.me/en/python-pandas-nan-judge-count/)
The sample code in this article uses pandas version `2.0.3`. As an example, read a CSV file with missing values.
- [sample\_pandas\_normal\_nan.csv](https://raw.githubusercontent.com/nkmk/python-snippets/217ee1c1fde5d816726f083185036af252a39647/notebook/data/src/sample_pandas_normal_nan.csv)
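```
import pandas as pd

print(pd.__version__)
# 2.0.3

df = pd.read_csv('data/src/sample_pandas_normal_nan.csv')
print(df)
#       name   age state  point  other
# 0    Alice  24.0    NY    NaN    NaN
# 1      NaN   NaN   NaN    NaN    NaN
# 2  Charlie   NaN    CA    NaN    NaN
# 3     Dave  68.0    TX   70.0    NaN
# 4    Ellen   NaN    CA   88.0    NaN
# 5    Frank  30.0   NaN    NaN    NaN
```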
source: [pandas\_nan\_dropna.py](https://github.com/nkmk/python-snippets/blob/20611e7fec7b73380534460784071f486f57e1bd/notebook/pandas_nan_dropna.py#L1-L14)
## Remove rows/columns where all elements are `NaN`: `how='all'`
By setting `how='all'`, rows where all elements are `NaN` are removed.
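```
print(df.dropna(how='all'))
#       name   age state  point  other
# 0    Alice  24.0    NY    NaN    NaN
# 2  Charlie   NaN    CA    NaN    NaN
# 3     Dave  68.0    TX   70.0    NaN
# 4    Ellen   NaN    CA   88.0    NaN
# 5    Frank  30.0   NaN    NaN    NaN
```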
source: [pandas\_nan\_dropna.py](https://github.com/nkmk/python-snippets/blob/20611e7fec7b73380534460784071f486f57e1bd/notebook/pandas_nan_dropna.py#L16-L22)
If `axis` is set to `1` or `'columns'`, columns where all elements are `NaN` are removed.
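```
print(df.dropna(how='all', axis=1))
#       name   age state  point
# 0    Alice  24.0    NY    NaN
# 1      NaN   NaN   NaN    NaN
# 2  Charlie   NaN    CA    NaN
# 3     Dave  68.0    TX   70.0
# 4    Ellen   NaN    CA   88.0
# 5    Frank  30.0   NaN    NaN
```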
source: [pandas\_nan\_dropna.py](https://github.com/nkmk/python-snippets/blob/20611e7fec7b73380534460784071f486f57e1bd/notebook/pandas_nan_dropna.py#L24-L31)
Note that if `axis` is set to `0` or `'index'`, rows are removed. Since the default value of `axis` is `0`, rows are removed if omitted, as shown in the first example.
In former versions, both rows and columns were removed with `axis=[0, 1]`, but since version `1.0.0`, `axis` can no longer be specified with a list or tuple. If you want to remove both rows and columns, you can repeatedly apply `dropna()`.
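```
# print(df.dropna(how='all', axis=[0, 1]))
# TypeError: supplying multiple axes to axis is no longer supported.

print(df.dropna(how='all').dropna(how='all', axis=1))
#       name   age state  point
# 0    Alice  24.0    NY    NaN
# 2  Charlie   NaN    CA    NaN
# 3     Dave  68.0    TX   70.0
# 4    Ellen   NaN    CA   88.0
# 5    Frank  30.0   NaN    NaN
```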
source: [pandas\_nan\_dropna.py](https://github.com/nkmk/python-snippets/blob/20611e7fec7b73380534460784071f486f57e1bd/notebook/pandas_nan_dropna.py#L33-L42)
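To make the effect of `how='all'` on both axes concrete, here is a self-contained sketch that rebuilds the sample data inline (an assumption, so the CSV file is not needed) and compares the chained `dropna()` calls with an explicit boolean mask built from `notna()`. The mask is only an illustration of the logic, not how pandas implements `dropna()` internally.

```python
import numpy as np
import pandas as pd

# Inline copy of the sample data used in this article.
df = pd.DataFrame({
    'name': ['Alice', np.nan, 'Charlie', 'Dave', 'Ellen', 'Frank'],
    'age': [24.0, np.nan, np.nan, 68.0, np.nan, 30.0],
    'state': ['NY', np.nan, 'CA', 'TX', 'CA', np.nan],
    'point': [np.nan, np.nan, np.nan, 70.0, 88.0, np.nan],
    'other': [np.nan] * 6,
})

# Remove all-NaN rows first, then all-NaN columns.
chained = df.dropna(how='all').dropna(how='all', axis=1)

# Same idea with a mask: keep rows/columns that have at least one value.
present = df.notna()
masked = df.loc[present.any(axis=1), present.any(axis=0)]

print(chained.equals(masked))
# True
print(chained.shape)
# (5, 4)
```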
## Remove rows/columns that contain at least one `NaN`: `how='any'` (default)
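To prepare an example, first remove the rows and columns where all values are `NaN`.
```
df2 = df.dropna(how='all').dropna(how='all', axis=1)
print(df2)
#       name   age state  point
# 0    Alice  24.0    NY    NaN
# 2  Charlie   NaN    CA    NaN
# 3     Dave  68.0    TX   70.0
# 4    Ellen   NaN    CA   88.0
# 5    Frank  30.0   NaN    NaN
```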
source: [pandas\_nan\_dropna.py](https://github.com/nkmk/python-snippets/blob/20611e7fec7b73380534460784071f486f57e1bd/notebook/pandas_nan_dropna.py#L44-L51)
By setting `how='any'`, rows that contain at least one `NaN` are removed. Since the default value of `how` is `'any'`, the result is the same even if omitted.
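```
print(df2.dropna(how='any'))
#    name   age state  point
# 3  Dave  68.0    TX   70.0

print(df2.dropna())
#    name   age state  point
# 3  Dave  68.0    TX   70.0
```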
source: [pandas\_nan\_dropna.py](https://github.com/nkmk/python-snippets/blob/20611e7fec7b73380534460784071f486f57e1bd/notebook/pandas_nan_dropna.py#L53-L59)
If `axis` is set to `1` or `'columns'`, columns that contain at least one `NaN` are removed.
```
print(df2.dropna(axis=1))
#       name
# 0    Alice
# 2  Charlie
# 3     Dave
# 4    Ellen
# 5    Frank
```
source: [pandas\_nan\_dropna.py](https://github.com/nkmk/python-snippets/blob/20611e7fec7b73380534460784071f486f57e1bd/notebook/pandas_nan_dropna.py#L61-L67)
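One way to predict what `how='any'` will remove is to count the missing values per row and per column beforehand. The sketch below rebuilds `df2` inline (an assumption, so it runs without the CSV file) and checks that only rows and columns with a count of zero survive.

```python
import numpy as np
import pandas as pd

# Inline copy of df2: the sample data without all-NaN rows and columns.
df2 = pd.DataFrame(
    {
        'name': ['Alice', 'Charlie', 'Dave', 'Ellen', 'Frank'],
        'age': [24.0, np.nan, 68.0, np.nan, 30.0],
        'state': ['NY', 'CA', 'TX', 'CA', np.nan],
        'point': [np.nan, np.nan, 70.0, 88.0, np.nan],
    },
    index=[0, 2, 3, 4, 5],
)

# Number of missing values in each row and in each column.
nan_per_row = df2.isna().sum(axis=1)
nan_per_col = df2.isna().sum(axis=0)

rows_kept = nan_per_row[nan_per_row == 0].index.tolist()
cols_kept = nan_per_col[nan_per_col == 0].index.tolist()

print(rows_kept)
# [3]
print(cols_kept)
# ['name']

# dropna() keeps exactly those labels.
print(df2.dropna().index.tolist() == rows_kept)
# True
print(df2.dropna(axis=1).columns.tolist() == cols_kept)
# True
```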
## Remove rows/columns according to the number of non-missing values: `thresh`
With the `thresh` argument, you can remove rows and columns according to the number of non-missing values.
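For example, if `thresh=3`, rows that contain three or more non-missing values remain, and the other rows are removed.
```
print(df.dropna(thresh=3))
#     name   age state  point  other
# 0  Alice  24.0    NY    NaN    NaN
# 3   Dave  68.0    TX   70.0    NaN
# 4  Ellen   NaN    CA   88.0    NaN
```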
source: [pandas\_nan\_dropna.py](https://github.com/nkmk/python-snippets/blob/20611e7fec7b73380534460784071f486f57e1bd/notebook/pandas_nan_dropna.py#L69-L73)
If `axis` is set to `1` or `'columns'`, columns are removed.
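```
print(df.dropna(thresh=3, axis=1))
#       name   age state
# 0    Alice  24.0    NY
# 1      NaN   NaN   NaN
# 2  Charlie   NaN    CA
# 3     Dave  68.0    TX
# 4    Ellen   NaN    CA
# 5    Frank  30.0   NaN
```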
source: [pandas\_nan\_dropna.py](https://github.com/nkmk/python-snippets/blob/20611e7fec7b73380534460784071f486f57e1bd/notebook/pandas_nan_dropna.py#L75-L82)
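The threshold is inclusive: a row is kept when its number of non-missing values is at least `thresh`. The following sketch, using an inline copy of the sample data (an assumption, so the CSV file is not needed), checks this against `count()`, which returns the number of non-missing values.

```python
import numpy as np
import pandas as pd

# Inline copy of the sample data used in this article.
df = pd.DataFrame({
    'name': ['Alice', np.nan, 'Charlie', 'Dave', 'Ellen', 'Frank'],
    'age': [24.0, np.nan, np.nan, 68.0, np.nan, 30.0],
    'state': ['NY', np.nan, 'CA', 'TX', 'CA', np.nan],
    'point': [np.nan, np.nan, np.nan, 70.0, 88.0, np.nan],
    'other': [np.nan] * 6,
})

# Non-missing values per row.
non_missing = df.count(axis=1)
print(non_missing.tolist())
# [3, 0, 2, 4, 3, 2]

by_thresh = df.dropna(thresh=3)
by_mask = df[non_missing >= 3]  # ">=", not ">"

print(by_thresh.equals(by_mask))
# True
print(by_thresh.index.tolist())
# [0, 3, 4]
```

Rows 0 and 4 have exactly three non-missing values and are kept, which shows that the comparison is "three or more".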
## Remove based on specific rows/columns: `subset`
If you want to remove based on specific rows and columns, pass a list of row/column labels (names) to the `subset` argument of `dropna()`. A single label can be given as a one-element list, like `subset=['name']`; pandas 2.0.3 also accepts a bare label such as `subset='name'`.
Since the default is `how='any'` and `axis=0`, rows with `NaN` in the columns specified by `subset` are removed.
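```
print(df.dropna(subset=['age']))
#     name   age state  point  other
# 0  Alice  24.0    NY    NaN    NaN
# 3   Dave  68.0    TX   70.0    NaN
# 5  Frank  30.0   NaN    NaN    NaN

print(df.dropna(subset=['age', 'state']))
#     name   age state  point  other
# 0  Alice  24.0    NY    NaN    NaN
# 3   Dave  68.0    TX   70.0    NaN
```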
source: [pandas\_nan\_dropna.py](https://github.com/nkmk/python-snippets/blob/20611e7fec7b73380534460784071f486f57e1bd/notebook/pandas_nan_dropna.py#L84-L93)
If `how` is set to `'all'`, rows with `NaN` in all specified columns are removed.
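```
print(df.dropna(subset=['age', 'state'], how='all'))
#       name   age state  point  other
# 0    Alice  24.0    NY    NaN    NaN
# 2  Charlie   NaN    CA    NaN    NaN
# 3     Dave  68.0    TX   70.0    NaN
# 4    Ellen   NaN    CA   88.0    NaN
# 5    Frank  30.0   NaN    NaN    NaN
```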
source: [pandas\_nan\_dropna.py](https://github.com/nkmk/python-snippets/blob/20611e7fec7b73380534460784071f486f57e1bd/notebook/pandas_nan_dropna.py#L95-L101)
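The interaction between `subset` and `how` can be spelled out with boolean masks restricted to the chosen columns. This is a sketch on an inline copy of the sample data (an assumption); the masks illustrate the selection logic and are not taken from the pandas implementation.

```python
import numpy as np
import pandas as pd

# Inline copy of the sample data used in this article.
df = pd.DataFrame({
    'name': ['Alice', np.nan, 'Charlie', 'Dave', 'Ellen', 'Frank'],
    'age': [24.0, np.nan, np.nan, 68.0, np.nan, 30.0],
    'state': ['NY', np.nan, 'CA', 'TX', 'CA', np.nan],
    'point': [np.nan, np.nan, np.nan, 70.0, 88.0, np.nan],
    'other': [np.nan] * 6,
})

cols = ['age', 'state']
present = df[cols].notna()

# how='any' (default): keep rows where every subset column has a value.
drop_any = df.dropna(subset=cols)
print(drop_any.equals(df[present.all(axis=1)]))
# True
print(drop_any.index.tolist())
# [0, 3]

# how='all': keep rows where at least one subset column has a value.
drop_all = df.dropna(subset=cols, how='all')
print(drop_all.equals(df[present.any(axis=1)]))
# True
print(drop_all.index.tolist())
# [0, 2, 3, 4, 5]
```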
If `axis` is set to `1` or `'columns'`, columns are removed.
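```
print(df.dropna(subset=[0, 4], axis=1))
#       name state
# 0    Alice    NY
# 1      NaN   NaN
# 2  Charlie    CA
# 3     Dave    TX
# 4    Ellen    CA
# 5    Frank   NaN

print(df.dropna(subset=[0, 4], axis=1, how='all'))
#       name   age state  point
# 0    Alice  24.0    NY    NaN
# 1      NaN   NaN   NaN    NaN
# 2  Charlie   NaN    CA    NaN
# 3     Dave  68.0    TX   70.0
# 4    Ellen   NaN    CA   88.0
# 5    Frank  30.0   NaN    NaN
```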
source: [pandas\_nan\_dropna.py](https://github.com/nkmk/python-snippets/blob/20611e7fec7b73380534460784071f486f57e1bd/notebook/pandas_nan_dropna.py#L103-L119)
An error is raised if a non-existent row or column name is specified. An error is also raised if you set `axis=1` but specify column names or set `axis=0` (default) but specify row names.
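```
# print(df.dropna(subset=['age', 'state', 'xxx']))
# KeyError: ['xxx']

# print(df.dropna(subset=['age', 'state'], axis=1))
# KeyError: ['age', 'state']
```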
source: [pandas\_nan\_dropna.py](https://github.com/nkmk/python-snippets/blob/20611e7fec7b73380534460784071f486f57e1bd/notebook/pandas_nan_dropna.py#L121-L125)
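When the labels come from user input or a config file, one possible way to avoid the `KeyError` is to filter them against the existing columns first. The helper `dropna_existing` below is hypothetical (it is not part of pandas) and only covers the default `axis=0` case; the inline data is a shortened copy of the sample used in this article.

```python
import numpy as np
import pandas as pd

# Shortened inline copy of the sample data.
df = pd.DataFrame({
    'name': ['Alice', np.nan, 'Charlie', 'Dave', 'Ellen', 'Frank'],
    'age': [24.0, np.nan, np.nan, 68.0, np.nan, 30.0],
    'state': ['NY', np.nan, 'CA', 'TX', 'CA', np.nan],
})


def dropna_existing(frame, labels, **kwargs):
    """Hypothetical helper: call dropna() with only the labels that exist as columns."""
    present = [label for label in labels if label in frame.columns]
    if not present:
        # Nothing to check against, so return an unchanged copy.
        return frame.copy()
    return frame.dropna(subset=present, **kwargs)


# Passing an unknown label directly raises KeyError.
try:
    df.dropna(subset=['age', 'state', 'xxx'])
    raised = False
except KeyError:
    raised = True
print(raised)
# True

# The helper silently ignores 'xxx' and applies the remaining labels.
result = dropna_existing(df, ['age', 'state', 'xxx'])
print(result.index.tolist())
# [0, 3]
```

Silently ignoring unknown labels can hide typos, so raising the error may still be the better choice in many codebases.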
## Update the original object: `inplace`
As shown in the examples above, by default, a new object is returned, and the original object is not changed, but if `inplace=True`, the original object itself is updated.
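```
df.dropna(subset=['age'], inplace=True)
print(df)
#     name   age state  point  other
# 0  Alice  24.0    NY    NaN    NaN
# 3   Dave  68.0    TX   70.0    NaN
# 5  Frank  30.0   NaN    NaN    NaN
```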
source: [pandas\_nan\_dropna.py](https://github.com/nkmk/python-snippets/blob/20611e7fec7b73380534460784071f486f57e1bd/notebook/pandas_nan_dropna.py#L127-L132)
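A common pitfall with `inplace=True` is assigning the return value back to the variable: in that mode `dropna()` returns `None`. The sketch below uses a shortened inline copy of the sample data (an assumption) to show both styles side by side.

```python
import numpy as np
import pandas as pd

# Shortened inline copy of the sample data.
original = pd.DataFrame({
    'name': ['Alice', np.nan, 'Charlie', 'Dave', 'Ellen', 'Frank'],
    'age': [24.0, np.nan, np.nan, 68.0, np.nan, 30.0],
})

# Default: a new object is returned and the original is unchanged.
new_df = original.dropna(subset=['age'])
print(len(original), len(new_df))
# 6 3

# inplace=True: the object is modified and the call returns None.
df = original.copy()
returned = df.dropna(subset=['age'], inplace=True)
print(returned)
# None
print(df.index.tolist())
# [0, 3, 5]
```

Writing `df = df.dropna(inplace=True)` would therefore replace the DataFrame with `None`.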
## For `pandas.Series`
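For `pandas.Series`, the `axis` and `how` arguments of `dropna()` are accepted only for compatibility with `DataFrame` and have no effect, so `inplace` is the main argument of interest. Since it is one-dimensional data, the elements with `NaN` are simply removed.
```
s = pd.read_csv('data/src/sample_pandas_normal_nan.csv')['age']
print(s)
# 0    24.0
# 1     NaN
# 2     NaN
# 3    68.0
# 4     NaN
# 5    30.0
# Name: age, dtype: float64

print(s.dropna())
# 0    24.0
# 3    68.0
# 5    30.0
# Name: age, dtype: float64

s.dropna(inplace=True)
print(s)
# 0    24.0
# 3    68.0
# 5    30.0
# Name: age, dtype: float64
```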
source: [pandas\_nan\_dropna.py](https://github.com/nkmk/python-snippets/blob/20611e7fec7b73380534460784071f486f57e1bd/notebook/pandas_nan_dropna.py#L134-L155)
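After `dropna()`, a Series keeps its original index labels, which leaves gaps. If consecutive labels are wanted, one option is to chain `reset_index(drop=True)`. The Series below is built inline from the `age` values of the sample data (an assumption, so the CSV file is not needed).

```python
import numpy as np
import pandas as pd

# Inline copy of the 'age' column from the sample data.
s = pd.Series([24.0, np.nan, np.nan, 68.0, np.nan, 30.0], name='age')

dropped = s.dropna()
print(dropped.index.tolist())
# [0, 3, 5]

# Renumber the remaining elements from 0.
renumbered = dropped.reset_index(drop=True)
print(renumbered.index.tolist())
# [0, 1, 2]
print(renumbered.tolist())
# [24.0, 68.0, 30.0]
```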
## Related Categories
- [Python](https://note.nkmk.me/en/python/)
- [pandas](https://note.nkmk.me/en/pandas/)
## Related Articles
- [pandas: Shuffle rows/elements of DataFrame/Series](https://note.nkmk.me/en/python-pandas-random-sort-shuffle/)
- [pandas: Replace Series values with map()](https://note.nkmk.me/en/python-pandas-map-replace/)
- [List of pandas articles](https://note.nkmk.me/en/python-pandas-post-summary/)
- [pandas: Iterate DataFrame with for loop (iterrows, itertuples, items)](https://note.nkmk.me/en/python-pandas-dataframe-for-iteration/)
- [pandas: Convert a list of dictionaries to DataFrame with json\_normalize](https://note.nkmk.me/en/python-pandas-json-normalize/)
- [pandas: Detect and count NaN (missing values) with isnull(), isna()](https://note.nkmk.me/en/python-pandas-nan-judge-count/)
- [pandas: Get first/last n rows of DataFrame with head() and tail()](https://note.nkmk.me/en/python-pandas-head-tail/)
- [pandas: Get and set options for display, data behavior, etc.](https://note.nkmk.me/en/python-pandas-option-setting/)
- [pandas: Merge DataFrame with merge(), join() (INNER, OUTER JOIN)](https://note.nkmk.me/en/python-pandas-merge-join/)
- [pandas: Sort DataFrame/Series with sort\_values(), sort\_index()](https://note.nkmk.me/en/python-pandas-sort-values-sort-index/)
- [pandas: Get unique values and their counts in a column](https://note.nkmk.me/en/python-pandas-value-counts/)
- [pandas: Views and copies in DataFrame](https://note.nkmk.me/en/python-pandas-view-copy/)
- [pandas: Read CSV into DataFrame with read\_csv()](https://note.nkmk.me/en/python-pandas-read-csv-tsv/)
- [pandas: Select rows/columns by index (numbers and names)](https://note.nkmk.me/en/python-pandas-index-row-column/)
- [pandas: Slice substrings from each element in columns](https://note.nkmk.me/en/python-pandas-str-slice/)
- ©[nkmk.me](https://nkmk.me/) |
| Readable Markdown | You can remove `NaN` from `pandas.DataFrame` and `pandas.Series` with the `dropna()` method.
- [pandas.DataFrame.dropna - pandas 2.0.3 documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dropna.html)
- [pandas.Series.dropna - pandas 2.0.3 documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.dropna.html)
Contents
- [Remove rows/columns where all elements are `NaN`: `how='all'`](https://note.nkmk.me/en/python-pandas-nan-dropna/#remove-rowscolumns-where-all-elements-are-nan-howall)
- [Remove rows/columns that contain at least one `NaN`: `how='any'` (default)](https://note.nkmk.me/en/python-pandas-nan-dropna/#remove-rowscolumns-that-contain-at-least-one-nan-howany-default)
- [Remove rows/columns according to the number of non-missing values: `thresh`](https://note.nkmk.me/en/python-pandas-nan-dropna/#remove-rowscolumns-according-to-the-number-of-non-missing-values-thresh)
- [Remove based on specific rows/columns: `subset`](https://note.nkmk.me/en/python-pandas-nan-dropna/#remove-based-on-specific-rowscolumns-subset)
- [Update the original object: `inplace`](https://note.nkmk.me/en/python-pandas-nan-dropna/#update-the-original-object-inplace)
- [For `pandas.Series`](https://note.nkmk.me/en/python-pandas-nan-dropna/#for-pandasseries)
While this article primarily deals with `NaN` (Not a Number), it's important to note that in pandas, `None` is also treated as a missing value.
- [Missing values in pandas (nan, None, pd.NA)](https://note.nkmk.me/en/python-pandas-nan-none-na/)
See the following articles on extracting, replacing, and counting missing values.
- [pandas: Find rows/columns with NaN (missing values)](https://note.nkmk.me/en/python-pandas-nan-extract/)
- [pandas: Replace NaN (missing values) with fillna()](https://note.nkmk.me/en/python-pandas-nan-fillna/)
- [pandas: Detect and count NaN (missing values) with isnull(), isna()](https://note.nkmk.me/en/python-pandas-nan-judge-count/)
The sample code in this article uses pandas version `2.0.3`. As an example, read a CSV file with missing values.
- [sample\_pandas\_normal\_nan.csv](https://raw.githubusercontent.com/nkmk/python-snippets/217ee1c1fde5d816726f083185036af252a39647/notebook/data/src/sample_pandas_normal_nan.csv)
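```
import pandas as pd

print(pd.__version__)
# 2.0.3

df = pd.read_csv('data/src/sample_pandas_normal_nan.csv')
print(df)
#       name   age state  point  other
# 0    Alice  24.0    NY    NaN    NaN
# 1      NaN   NaN   NaN    NaN    NaN
# 2  Charlie   NaN    CA    NaN    NaN
# 3     Dave  68.0    TX   70.0    NaN
# 4    Ellen   NaN    CA   88.0    NaN
# 5    Frank  30.0   NaN    NaN    NaN
```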
## Remove rows/columns where all elements are `NaN`: `how='all'`
By setting `how='all'`, rows where all elements are `NaN` are removed.
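```
print(df.dropna(how='all'))
#       name   age state  point  other
# 0    Alice  24.0    NY    NaN    NaN
# 2  Charlie   NaN    CA    NaN    NaN
# 3     Dave  68.0    TX   70.0    NaN
# 4    Ellen   NaN    CA   88.0    NaN
# 5    Frank  30.0   NaN    NaN    NaN
```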
If `axis` is set to `1` or `'columns'`, columns where all elements are `NaN` are removed.
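```
print(df.dropna(how='all', axis=1))
#       name   age state  point
# 0    Alice  24.0    NY    NaN
# 1      NaN   NaN   NaN    NaN
# 2  Charlie   NaN    CA    NaN
# 3     Dave  68.0    TX   70.0
# 4    Ellen   NaN    CA   88.0
# 5    Frank  30.0   NaN    NaN
```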
Note that if `axis` is set to `0` or `'index'`, rows are removed. Since the default value of `axis` is `0`, rows are removed if omitted, as shown in the first example.
In former versions, both rows and columns were removed with `axis=[0, 1]`, but since version `1.0.0`, `axis` can no longer be specified with a list or tuple. If you want to remove both rows and columns, you can repeatedly apply `dropna()`.
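```
# print(df.dropna(how='all', axis=[0, 1]))
# TypeError: supplying multiple axes to axis is no longer supported.

print(df.dropna(how='all').dropna(how='all', axis=1))
#       name   age state  point
# 0    Alice  24.0    NY    NaN
# 2  Charlie   NaN    CA    NaN
# 3     Dave  68.0    TX   70.0
# 4    Ellen   NaN    CA   88.0
# 5    Frank  30.0   NaN    NaN
```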
## Remove rows/columns that contain at least one `NaN`: `how='any'` (default)
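To prepare an example, first remove the rows and columns where all values are `NaN`.
```
df2 = df.dropna(how='all').dropna(how='all', axis=1)
print(df2)
#       name   age state  point
# 0    Alice  24.0    NY    NaN
# 2  Charlie   NaN    CA    NaN
# 3     Dave  68.0    TX   70.0
# 4    Ellen   NaN    CA   88.0
# 5    Frank  30.0   NaN    NaN
```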
By setting `how='any'`, rows that contain at least one `NaN` are removed. Since the default value of `how` is `'any'`, the result is the same even if omitted.
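```
print(df2.dropna(how='any'))
#    name   age state  point
# 3  Dave  68.0    TX   70.0

print(df2.dropna())
#    name   age state  point
# 3  Dave  68.0    TX   70.0
```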
If `axis` is set to `1` or `'columns'`, columns that contain at least one `NaN` are removed.
```
print(df2.dropna(axis=1))
#       name
# 0    Alice
# 2  Charlie
# 3     Dave
# 4    Ellen
# 5    Frank
```
## Remove rows/columns according to the number of non-missing values: `thresh`
With the `thresh` argument, you can remove rows and columns according to the number of non-missing values.
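For example, if `thresh=3`, rows that contain three or more non-missing values remain, and the other rows are removed.
```
print(df.dropna(thresh=3))
#     name   age state  point  other
# 0  Alice  24.0    NY    NaN    NaN
# 3   Dave  68.0    TX   70.0    NaN
# 4  Ellen   NaN    CA   88.0    NaN
```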
If `axis` is set to `1` or `'columns'`, columns are removed.
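```
print(df.dropna(thresh=3, axis=1))
#       name   age state
# 0    Alice  24.0    NY
# 1      NaN   NaN   NaN
# 2  Charlie   NaN    CA
# 3     Dave  68.0    TX
# 4    Ellen   NaN    CA
# 5    Frank  30.0   NaN
```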
## Remove based on specific rows/columns: `subset`
If you want to remove based on specific rows and columns, pass a list of row/column labels (names) to the `subset` argument of `dropna()`. A single label can be given as a one-element list, like `subset=['name']`; pandas 2.0.3 also accepts a bare label such as `subset='name'`.
Since the default is `how='any'` and `axis=0`, rows with `NaN` in the columns specified by `subset` are removed.
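```
print(df.dropna(subset=['age']))
#     name   age state  point  other
# 0  Alice  24.0    NY    NaN    NaN
# 3   Dave  68.0    TX   70.0    NaN
# 5  Frank  30.0   NaN    NaN    NaN

print(df.dropna(subset=['age', 'state']))
#     name   age state  point  other
# 0  Alice  24.0    NY    NaN    NaN
# 3   Dave  68.0    TX   70.0    NaN
```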
If `how` is set to `'all'`, rows with `NaN` in all specified columns are removed.
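```
print(df.dropna(subset=['age', 'state'], how='all'))
#       name   age state  point  other
# 0    Alice  24.0    NY    NaN    NaN
# 2  Charlie   NaN    CA    NaN    NaN
# 3     Dave  68.0    TX   70.0    NaN
# 4    Ellen   NaN    CA   88.0    NaN
# 5    Frank  30.0   NaN    NaN    NaN
```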
If `axis` is set to `1` or `'columns'`, columns are removed.
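```
print(df.dropna(subset=[0, 4], axis=1))
#       name state
# 0    Alice    NY
# 1      NaN   NaN
# 2  Charlie    CA
# 3     Dave    TX
# 4    Ellen    CA
# 5    Frank   NaN

print(df.dropna(subset=[0, 4], axis=1, how='all'))
#       name   age state  point
# 0    Alice  24.0    NY    NaN
# 1      NaN   NaN   NaN    NaN
# 2  Charlie   NaN    CA    NaN
# 3     Dave  68.0    TX   70.0
# 4    Ellen   NaN    CA   88.0
# 5    Frank  30.0   NaN    NaN
```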
An error is raised if a non-existent row or column name is specified. An error is also raised if you set `axis=1` but specify column names or set `axis=0` (default) but specify row names.
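```
# print(df.dropna(subset=['age', 'state', 'xxx']))
# KeyError: ['xxx']

# print(df.dropna(subset=['age', 'state'], axis=1))
# KeyError: ['age', 'state']
```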
## Update the original object: `inplace`
As shown in the examples above, by default, a new object is returned, and the original object is not changed, but if `inplace=True`, the original object itself is updated.
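```
df.dropna(subset=['age'], inplace=True)
print(df)
#     name   age state  point  other
# 0  Alice  24.0    NY    NaN    NaN
# 3   Dave  68.0    TX   70.0    NaN
# 5  Frank  30.0   NaN    NaN    NaN
```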
## For `pandas.Series`
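For `pandas.Series`, the `axis` and `how` arguments of `dropna()` are accepted only for compatibility with `DataFrame` and have no effect, so `inplace` is the main argument of interest. Since it is one-dimensional data, the elements with `NaN` are simply removed.
```
s = pd.read_csv('data/src/sample_pandas_normal_nan.csv')['age']
print(s)
# 0    24.0
# 1     NaN
# 2     NaN
# 3    68.0
# 4     NaN
# 5    30.0
# Name: age, dtype: float64

print(s.dropna())
# 0    24.0
# 3    68.0
# 5    30.0
# Name: age, dtype: float64

s.dropna(inplace=True)
print(s)
# 0    24.0
# 3    68.0
# 5    30.0
# Name: age, dtype: float64
``` |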
| Shard | 13 (laksa) |
| Root Hash | 14415757146955323613 |
| Unparsed URL | me,nkmk!note,/en/python-pandas-nan-dropna/ s443 |