🕷️ Crawler Inspector

URL Lookup

Direct Parameter Lookup

Raw Queries and Responses

1. Shard Calculation

Query:
Response:
Calculated Shard: 108 (from laksa034)

2. Crawled Status Check

Query:
Response:

3. Robots.txt Check

Query:
Response:

4. Spam/Ban Check

Query:
Response:

5. Seen Status Check

ℹ️ Skipped - page is already crawled

📄
INDEXABLE
CRAWLED
2 months ago
🚫
ROBOTS BLOCKED

Page Info Filters

FilterStatusConditionDetails
HTTP statusPASSdownload_http_code = 200HTTP 200
Age cutoffPASSdownload_stamp > now() - 6 MONTH2.7 months ago
History dropPASSisNull(history_drop_reason)No drop reason
Spam/banPASSfh_dont_index != 1 AND ml_spam_score = 0ml_spam_score=0
CanonicalPASSmeta_canonical IS NULL OR = '' OR = src_unparsedNot set

Page Details

PropertyValue
URLhttps://llego.dev/posts/pandas-excluding-missing-values/
Last Crawled2026-01-25 13:13:24 (2 months ago)
First Indexed2023-10-12 00:29:38 (2 years ago)
HTTP Status Code200
Meta TitleHow to Exclude Missing Values in Python Pandas - llego.dev
Meta DescriptionThis comprehensive Pandas how-to guide covers detecting, dropping, filling, and carefully handling missing NaN values in DataFrames for cleaning data before analysis.
Meta Canonicalnull
Boilerpipe Text
Handling missing data is a common task in data analysis and data science projects. Pandas, Python’s popular data analysis library, provides many useful features for dealing with missing values in DataFrames and Series objects. In Pandas, missing values are represented by NaN (Not a Number). Filtering out missing values or excluding rows and columns containing NaN values is often necessary for preparing clean, complete datasets for analysis and modeling. This allows you to focus on just the valid, complete observations without distortion from null entries. This guide will demonstrate various methods to exclude or drop missing values in Pandas using realistic examples. We will cover key topics including: Table of Contents Open Table of Contents Checking for Null Values Dropping Rows and Columns with NaN Values Drop Rows with dropna() Drop Columns with drop() Filling Missing Values fillna() interpolate() replace() Excluding Nulls During Joins/Merges Treating NaN as a Value Conclusion Checking for Null Values The first step is detecting which cells in a DataFrame or Series contain NaN values. Pandas provides the isna() and notna() methods to check for nulls and non-nulls respectively: import pandas as pd df = pd. DataFrame ({ 'A' : [ 1 , 2 , np.nan], 'B' : [ 5 , np.nan, np.nan]}) df. isna () # Returns True for each NaN value df. notna () # Returns True for each non-NaN value The isnull() and notnull() methods are aliases of isna() and notna() . Summing the Boolean DataFrame returned by isna() gives the total count of missing values per column: df. isna (). sum () A 1 B 2 dtype: int64 To count NaN values in the entire DataFrame, use isna().sum().sum() . These methods enable detecting nulls before handling them according to your requirements. Dropping Rows and Columns with NaN Values A common way to handle missing data is to simply exclude rows and/or columns containing NaN values. Pandas provides two main methods for this: Drop Rows with dropna() The dropna() method on DataFrames excludes any rows containing NaN values. df = pd. DataFrame ({ 'A' : [ 1 , 2 , np.nan], 'B' : [ 5 , np.nan, np.nan]}) df. dropna () # Drops second and third row By default dropna() drops any row with at least 1 NaN. Pass how='all' to only drop rows where all values are NaN: df = pd. DataFrame ({ 'A' : [ 1 , np.nan, np.nan], 'B' : [ 5 , np.nan, np.nan]}) df. dropna ( how = 'all' ) # Only drops third row with all NaNs To drop rows with at least a certain number of NaNs, use the thresh argument: df. dropna ( thresh = 2 ) # Would drop rows with >= 2 NaNs subset allows specifying columns to consider when looking for NaNs: df. dropna ( subset = [ 'A' ]) # Only drops rows with NaN in column 'A' Drop Columns with drop() The drop() method can exclude columns containing null values by passing axis=1 : df. drop ( columns = [ 'A' ], axis = 1 ) # Drops column A thresh can again be used to drop columns with at least a certain count of NaNs: df. dropna ( axis = 1 , thresh = 1 ) # Drops any column with >= 1 NaN This provides an easy way to filter out columns missing too much data to be usable. Filling Missing Values Sometimes it is preferable to fill or impute missing values rather than excluding entire rows or columns containing them. This retains more complete data. Pandas provides several methods for imputation like fillna() , interpolate() , and replace() . fillna() The fillna() method replaces NaN values with a specified value. Passing a single value fills all NaNs with it: df. fillna ( 0 ) # Replace NaNs with 0 Dictionary mapping can be passed to fill different values per column: df. fillna ({ 'A' : 0 , 'B' : 5 }) Use method='ffill' / 'bfill' to propagate next/previous valid values forward/backward: df. fillna ( method = 'ffill' ) This carries over the last valid observation to fill NaNs in following rows. interpolate() For numeric data, interpolate() fills NaN values via linear interpolation of neighboring points: s = pd. Series ([ 1 , np.nan, np.nan, 10 ]) s. interpolate () # Fills NaNs by interpolating between 1 and 10 Various interpolation methods like ‘spline’ and ‘akima’ are supported . replace() The replace() method can substitute NaN with a new value: df. replace (np.nan, 5 ) This provides a simple, flexible way to fill null values while copying the DataFrame. Filling missing data preserves more complete rows/columns and may produce better analysis results than dropping NaNs. Excluding Nulls During Joins/Merges NaN values can also be excluded when combining DataFrames using operations like join/merge. By default, joins preserve all values (including NaNs) from the original DataFrames: df1 = pd. DataFrame ({ 'A' : [ 1 , 2 ], 'B' : [ 5 , np.nan]}) df2 = pd. DataFrame ({ 'A' : [ 1 , 3 ], 'C' : [ 7 , 8 ]}) df1. merge (df2) # NaN value is retained Passing how='inner' joins on only the intersection of rows containing no null values: df1. merge (df2, how = 'inner' ) # Row with NaN is excluded For outer joins, pass indicator=True and filter for rows having _merge equal to 'both' : df1. merge (df2, how = 'outer' , indicator = True ) . query ( '_merge == "both"' ) # Null rows are excluded after merge This avoids incorporating NaN values from the join/merge output. Treating NaN as a Value In some cases, you may want to explicitly include NaN values rather than exclude them. This treats NaN as an actual value category or level in the data. Methods like groupby() , value_counts() , and pivot_table() by default will ignore NaN values. To include NaN values in the output, pass dropna=False : df. groupby ( 'A' , dropna = False ). mean () # NaN is retained as a group This lets you analyze null values alongside actual data. Conclusion Handling missing data is vital for preparing Pandas DataFrames for analysis and visualization. By detecting, dropping, filling, and carefully treating NaN values, you can wrangle messy datasets into clean, usable formats. The techniques covered in this guide, including isna() , dropna() , fillna() , joining on complete rows, and handling NaN as a group, provide code-driven solutions to exclude or retain missing values when needed. Mastering Pandas’ tools for missing data will boost your proficiency for real-world data science tasks. The skills are broadly applicable across disciplines including machine learning, analytics, finance, sciences and more. For further learning, refer to the official Pandas documentation on Working with Missing Data . The example codes can be adapted to your specific datasets and use cases when encountering null values.
Markdown
[Skip to content](https://llego.dev/posts/pandas-excluding-missing-values/#main-content) [llego.dev](https://llego.dev/) - [Posts](https://llego.dev/posts/) - [Tags](https://llego.dev/tags/) - [About](https://llego.dev/about/) - [Services](https://llego.dev/services/) - [Search](https://llego.dev/search/ "Search") *** Go Back # How to Exclude Missing Values in Python Pandas Updated: Mar 29, 2023 \| at 05:37 AM Handling missing data is a common task in data analysis and data science projects. Pandas, Python’s popular data analysis library, provides many useful features for dealing with missing values in DataFrames and Series objects. In Pandas, missing values are represented by NaN (Not a Number). Filtering out missing values or excluding rows and columns containing NaN values is often necessary for preparing clean, complete datasets for analysis and modeling. This allows you to focus on just the valid, complete observations without distortion from null entries. This guide will demonstrate various methods to exclude or drop missing values in Pandas using realistic examples. We will cover key topics including: ## Table of Contents Open Table of Contents - [Checking for Null Values](https://llego.dev/posts/pandas-excluding-missing-values/#checking-for-null-values) - [Dropping Rows and Columns with NaN Values](https://llego.dev/posts/pandas-excluding-missing-values/#dropping-rows-and-columns-with-nan-values) - [Drop Rows with `dropna()`](https://llego.dev/posts/pandas-excluding-missing-values/#drop-rows-with-dropna) - [Drop Columns with `drop()`](https://llego.dev/posts/pandas-excluding-missing-values/#drop-columns-with-drop) - [Filling Missing Values](https://llego.dev/posts/pandas-excluding-missing-values/#filling-missing-values) - [fillna()](https://llego.dev/posts/pandas-excluding-missing-values/#fillna) - [interpolate()](https://llego.dev/posts/pandas-excluding-missing-values/#interpolate) - [replace()](https://llego.dev/posts/pandas-excluding-missing-values/#replace) - [Excluding Nulls During Joins/Merges](https://llego.dev/posts/pandas-excluding-missing-values/#excluding-nulls-during-joinsmerges) - [Treating NaN as a Value](https://llego.dev/posts/pandas-excluding-missing-values/#treating-nan-as-a-value) - [Conclusion](https://llego.dev/posts/pandas-excluding-missing-values/#conclusion) ## Checking for Null Values The first step is detecting which cells in a DataFrame or Series contain NaN values. Pandas provides the `isna()` and `notna()` methods to check for nulls and non-nulls respectively: ``` import pandas as pd df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [5, np.nan, np.nan]}) df.isna() # Returns True for each NaN value df.notna() # Returns True for each non-NaN value ``` The `isnull()` and `notnull()` methods are aliases of `isna()` and `notna()`. Summing the Boolean DataFrame returned by `isna()` gives the total count of missing values per column: ``` df.isna().sum() A 1 B 2 dtype: int64 ``` To count NaN values in the entire DataFrame, use `isna().sum().sum()`. These methods enable detecting nulls before handling them according to your requirements. ## Dropping Rows and Columns with NaN Values A common way to handle missing data is to simply exclude rows and/or columns containing NaN values. Pandas provides two main methods for this: ### Drop Rows with `dropna()` The `dropna()` method on DataFrames excludes any rows containing NaN values. ``` df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [5, np.nan, np.nan]}) df.dropna() # Drops second and third row ``` By default `dropna()` drops any row with at least 1 NaN. Pass `how='all'` to only drop rows where *all* values are NaN: ``` df = pd.DataFrame({'A': [1, np.nan, np.nan], 'B': [5, np.nan, np.nan]}) df.dropna(how='all') # Only drops third row with all NaNs ``` To drop rows with at least a certain number of NaNs, use the `thresh` argument: ``` df.dropna(thresh=2) # Would drop rows with >= 2 NaNs ``` `subset` allows specifying columns to consider when looking for NaNs: ``` df.dropna(subset=['A']) # Only drops rows with NaN in column 'A' ``` ### Drop Columns with `drop()` The `drop()` method can exclude columns containing null values by passing `axis=1`: ``` df.drop(columns=['A'], axis=1) # Drops column A ``` `thresh` can again be used to drop columns with at least a certain count of NaNs: ``` df.dropna(axis=1, thresh=1) # Drops any column with >= 1 NaN ``` This provides an easy way to filter out columns missing too much data to be usable. ## Filling Missing Values Sometimes it is preferable to fill or impute missing values rather than excluding entire rows or columns containing them. This retains more complete data. Pandas provides several methods for imputation like `fillna()`, `interpolate()`, and `replace()`. ### fillna() The `fillna()` method replaces NaN values with a specified value. Passing a single value fills all NaNs with it: ``` df.fillna(0) # Replace NaNs with 0 ``` Dictionary mapping can be passed to fill different values per column: ``` df.fillna({'A': 0, 'B': 5}) ``` Use `method='ffill'`/`'bfill'` to propagate next/previous valid values forward/backward: ``` df.fillna(method='ffill') ``` This carries over the last valid observation to fill NaNs in following rows. ### interpolate() For numeric data, `interpolate()` fills NaN values via linear interpolation of neighboring points: ``` s = pd.Series([1, np.nan, np.nan, 10]) s.interpolate() # Fills NaNs by interpolating between 1 and 10 ``` Various interpolation methods like ‘spline’ and ‘akima’ are [supported](https://pandas.pydata.org/docs/reference/api/pandas.Series.interpolate.html). ### replace() The `replace()` method can substitute NaN with a new value: ``` df.replace(np.nan, 5) ``` This provides a simple, flexible way to fill null values while copying the DataFrame. Filling missing data preserves more complete rows/columns and may produce better analysis results than dropping NaNs. ## Excluding Nulls During Joins/Merges NaN values can also be excluded when combining DataFrames using operations like join/merge. By default, joins preserve all values (including NaNs) from the original DataFrames: ``` df1 = pd.DataFrame({'A': [1, 2], 'B': [5, np.nan]}) df2 = pd.DataFrame({'A': [1, 3], 'C': [7, 8]}) df1.merge(df2) # NaN value is retained ``` Passing `how='inner'` joins on only the intersection of rows containing no null values: ``` df1.merge(df2, how='inner') # Row with NaN is excluded ``` For outer joins, pass `indicator=True` and filter for rows having `_merge` equal to `'both'`: ``` df1.merge(df2, how='outer', indicator=True) .query('_merge == "both"') # Null rows are excluded after merge ``` This avoids incorporating NaN values from the join/merge output. ## Treating NaN as a Value In some cases, you may want to explicitly include NaN values rather than exclude them. This treats NaN as an actual value category or level in the data. Methods like `groupby()`, `value_counts()`, and `pivot_table()` by default will ignore NaN values. To include NaN values in the output, pass `dropna=False`: ``` df.groupby('A', dropna=False).mean() # NaN is retained as a group ``` This lets you analyze null values alongside actual data. ## Conclusion Handling missing data is vital for preparing Pandas DataFrames for analysis and visualization. By detecting, dropping, filling, and carefully treating NaN values, you can wrangle messy datasets into clean, usable formats. The techniques covered in this guide, including `isna()`, `dropna()`, `fillna()`, joining on complete rows, and handling NaN as a group, provide code-driven solutions to exclude or retain missing values when needed. Mastering Pandas’ tools for missing data will boost your proficiency for real-world data science tasks. The skills are broadly applicable across disciplines including machine learning, analytics, finance, sciences and more. For further learning, refer to the official Pandas documentation on [Working with Missing Data](https://pandas.pydata.org/docs/user_guide/missing_data.html). The example codes can be adapted to your specific datasets and use cases when encountering null values. - [python](https://llego.dev/tags/python/) - [pandas](https://llego.dev/tags/pandas/) Back to Top Share this post on: [Share this post via WhatsApp](https://wa.me/?text=https://llego.dev/posts/pandas-excluding-missing-values/ "Share this post via WhatsApp") [Share this post on Facebook](https://www.facebook.com/sharer.php?u=https://llego.dev/posts/pandas-excluding-missing-values/ "Share this post on Facebook") [Tweet this post](https://twitter.com/intent/tweet?url=https://llego.dev/posts/pandas-excluding-missing-values/ "Tweet this post") [Share this post via Telegram](https://t.me/share/url?url=https://llego.dev/posts/pandas-excluding-missing-values/ "Share this post via Telegram") [Share this post on Pinterest](https://pinterest.com/pin/create/button/?url=https://llego.dev/posts/pandas-excluding-missing-values/ "Share this post on Pinterest") [Share this post via email](https://llego.dev/cdn-cgi/l/email-protection#714e0204131b1412054c22141454434105191802544341011e0205575242494a131e15084c19050501024b5e5e1d1d14161e5f1514075e011e0205025e01101f1510025c1409121d0415181f165c1c180202181f165c07101d0414025e "Share this post via email") *** [Mark Anthony Llego on Github](https://github.com/llegomark " Mark Anthony Llego on Github") [Mark Anthony Llego on X](https://twitter.com/markllego "Mark Anthony Llego on X") [Mark Anthony Llego on Discord](https://discordapp.com/users/1012984419029622784 "Mark Anthony Llego on Discord") [Mark Anthony Llego on Hugging Face](https://huggingface.co/markllego "Mark Anthony Llego on Hugging Face") [Send an email to Mark Anthony Llego](https://llego.dev/cdn-cgi/l/email-protection#c6aba7b4adaaaaa3a1a986a1aba7afaae8a5a9ab "Send an email to Mark Anthony Llego") © 2025 Mark Anthony Llego \| [Legal](https://llego.dev/legal)
Readable Markdownnull
Shard108 (laksa)
Root Hash11443539829787219308
Unparsed URLdev,llego!/posts/pandas-excluding-missing-values/ s443