Handling Missing Data - p.10 Data Analysis with Python and Pandas Tutorial

Channel: sentdex
Subscribers: 1,410,000
Video Link: https://www.youtube.com/watch?v=O5v4NrSCw_A
Category: Tutorial
Duration: 14:21
Views: 41,232


Welcome to Part 10 of our Data Analysis with Python and Pandas tutorial. In this part, we're going to be talking about missing, or not available, data. When data is missing, we have a few options:

Ignore it - Just leave it there.
Delete it - Remove the affected rows from the data entirely. This means forfeiting the whole row, including any valid values in it.
Fill forward or backward - Take the prior or following value and fill it in.
Replace it with something static - For example, replace all NaN data with -9999.
Each of these options has its own merits. Ignoring it requires no more work on our end. You may choose to ignore missing data for legal reasons, or to retain the utmost integrity of the data. Missing data might also be very important data. For example, maybe part of your analysis is investigating signal drops from a server. In that case, the missing data is exactly what you want to keep in the set.
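As a rough sketch of the ignore, fill, and replace options listed above, the snippet below runs each one on a tiny DataFrame. The column name `value` and the numbers in it are made up for illustration and are not taken from the tutorial's dataset. It uses `DataFrame.ffill()` and `DataFrame.bfill()`, which recent pandas versions prefer over passing `method=` to `fillna`.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the column name and values are illustrative only.
df = pd.DataFrame({"value": [1.0, np.nan, np.nan, 4.0, 5.0]})

# Ignore it: leave the NaN values in place, but count how many there are.
missing_count = int(df["value"].isna().sum())

# Fill forward: each NaN takes the most recent prior value.
filled_forward = df.ffill()

# Fill backward: each NaN takes the next following value.
filled_backward = df.bfill()

# Replace with something static: every NaN becomes -9999.
filled_static = df.fillna(-9999)

print("missing values:", missing_count)
print(filled_forward)
print(filled_backward)
print(filled_static)
```

None of these calls modify `df` itself; each returns a new DataFrame, so the original data with its NaN values stays available for comparison.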

Next, we have delete it. You have two choices at this point. You can either delete a row if it contains any NaN data, or delete a row only if it is completely NaN. Usually, a row that is full of NaN data comes from a calculation you performed on the dataset: no data is really missing, it's simply not available given your formula. In most cases, you would at least want to drop all rows that are completely NaN, and in many cases you would like to drop rows that have any NaN data.
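The two delete choices map onto the `how` parameter of `DataFrame.dropna`. The following is a minimal sketch on a small made-up DataFrame; the column names `a` and `b` and their values are assumptions for illustration, not the tutorial's data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: row 1 is partly NaN, row 2 is entirely NaN.
df = pd.DataFrame(
    {
        "a": [1.0, np.nan, np.nan, 4.0],
        "b": [10.0, 20.0, np.nan, 40.0],
    }
)

# Drop a row only if every value in it is NaN (removes row 2 here).
dropped_all_nan = df.dropna(how="all")

# Drop a row if any value in it is NaN (removes rows 1 and 2 here).
# how="any" is the default, so df.dropna() behaves the same way.
dropped_any_nan = df.dropna(how="any")

print(dropped_all_nan)
print(dropped_any_nan)
```

Note that `how="any"` also discards the valid `b` value in row 1, which is the cost of forfeiting the entire row.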

Tutorial sample code and text: http://pythonprogramming.net/nan-na-missing-data-analysis-python-pandas-tutorial/

http://pythonprogramming.net
https://twitter.com/sentdex




Other Videos By sentdex


2015-11-27 Flask Mail - Flask Web Development with Python 29
2015-11-23 URL Converters - Flask Web Development with Python 28
2015-11-18 Jinja Templating Cont'd - Flask Web Development with Python 27
2015-11-16 Includes - Flask Web Development with Python 26
2015-11-08 Scikit Learn Incorporation - p.16 Data Analysis with Python and Pandas Tutorial
2015-11-03 Rolling Apply and Mapping Functions - p.15 Data Analysis with Python and Pandas Tutorial
2015-10-29 Adding other economic indicators - p.14 Data Analysis with Python and Pandas Tutorial
2015-10-27 Joining 30 year mortgage rate - p.13 Data Analysis with Python and Pandas Tutorial
2015-10-21 Applying Comparison Operators to DataFrame - p.12 Data Analysis with Python and Pandas Tutorial
2015-10-17 Rolling statistics - p.11 Data Analysis with Python and Pandas Tutorial
2015-10-12 Handling Missing Data - p.10 Data Analysis with Python and Pandas Tutorial
2015-10-09 Resampling - p.9 Data Analysis with Python and Pandas Tutorial
2015-10-05 Percent Change and Correlation Tables - p.8 Data Analysis with Python and Pandas Tutorial
2015-10-03 Pickling - p.7 Data Analysis with Python and Pandas Tutorial
2015-09-29 Joining and Merging Dataframes - p.6 Data Analysis with Python and Pandas Tutorial
2015-09-25 Concatenating and Appending dataframes - p.5 Data Analysis with Python and Pandas Tutorial
2015-09-23 Building dataset - p.4 Data Analysis with Python and Pandas Tutorial
2015-09-20 IO Basics - p.3 Data Analysis with Python and Pandas Tutorial
2015-09-16 Pandas Basics - p.2 Data Analysis with Python and Pandas Tutorial
2015-09-14 Data Analysis with Python and Pandas Tutorial Introduction
2015-09-11 PythonProgramming.net's +=1 Subscription



Tags:
Pandas
Data Analysis (Media Genre)
Missing Data
Data (Website Category)
Python (Programming Language)
fillna
dropna
DataFrame.dropna
DataFrame.fillna