How to Detect and Fill Missing Values in Pandas (Python)
This video shows how to detect and fill missing values such as NaN, NA, None, and the empty string in pandas DataFrames. Detecting, counting, and filling missing or otherwise odd values is a basic data exploration and cleaning step that is necessary for all but the cleanest real-world data sets.
Code used in this Python Code Clip:
import numpy as np
import pandas as pd
import statsmodels.api as sm  # To access the mtcars dataset
mtcars = sm.datasets.get_rdataset("mtcars", "datasets", cache=True).data
mtcars.iloc[1:4, 2:3] = np.nan  # np.NaN was removed in NumPy 2.0; np.nan works in all versions
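# Cast the two target columns to object first; recent pandas versions warn or raise when strings are written into numeric columns
mtcars = mtcars.astype({col: object for col in mtcars.columns[3:5]})
mtcars.iloc[1:4, 3:4] = "NA"
mtcars.iloc[1:4, 4:5] = ""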
mtcars["None_col"] = None
mtcars.head()
# Detect NaN and None with df.isnull() or df.isna()
null = pd.isnull(mtcars)
null.head()
# Count the total number of missing values
pd.isnull(mtcars).sum().sum()
# Detect a list of missing values with df.isin()
missing_vals = ["NA", "", None, np.nan]
missing = mtcars.isin(missing_vals)
missing.head()
# Fill null values (NaN and None) with a given value:
mtcars.fillna(0).head()
# Fill a list of missing values with a given value:
missing_vals = ["NA", "", None, np.nan]
missing = mtcars.isin(missing_vals) # Detect missing vals
mtcars.mask(missing, "missing").head() # Fill missing with df.mask()
* Note: YouTube does not allow greater-than or less-than symbols in the text description, so the code above may not be exactly the same as the code shown in the video. Where those symbols are needed, I use large Unicode < and > symbols in place of the standard-sized ones.
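Supplementary sketch (not from the video): once placeholder strings such as "NA" and "" have been detected, one common follow-up is to convert them to real NaN, so that isna() and fillna() see them, and then count the missing values per column. The toy DataFrame below is a hypothetical stand-in for mtcars so the snippet runs without a download. Filling with the column mean is only one illustrative choice and is an assumption here, not the method shown in the clip.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for a few mtcars columns.
df = pd.DataFrame({
    "disp": [160.0, np.nan, 108.0, np.nan],
    "hp": [110, "NA", 93, "NA"],
    "drat": [3.9, "", 3.85, 3.08],
})

# Detect placeholder strings with isin(), then blank them out with mask().
# mask() replaces flagged cells with NaN when no replacement value is given.
placeholders = ["NA", ""]
cleaned = df.mask(df.isin(placeholders))

# A single .sum() keeps the per-column breakdown of missing values.
missing_per_column = cleaned.isna().sum()
print(missing_per_column)

# Convert the object columns back to numeric dtypes,
# then fill each column's gaps with that column's mean.
numeric = cleaned.apply(pd.to_numeric)
filled = numeric.fillna(numeric.mean())
print(filled)
```

Converting placeholders to NaN first means the usual pandas tools (isna, fillna, dropna) treat every kind of missing value the same way, instead of needing a custom list each time.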