Null Values
There are many ways to handle missing data, and some people come up with very creative solutions. This notebook covers a few basic methods. The right strategy will differ from problem to problem, so draw on your contextual knowledge and domain expertise.
**Basic Strategies**
- Removing observations
- Filling in NaN values with a chosen constant value
- Filling in NaN values with the mean
- Filling in NaN values with the median
- Dropping features (columns) that contain NaN values
The best strategy is usually context-specific, so the more contextual knowledge you have, the better.
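Before picking a strategy, it usually helps to measure how much data is missing and where. The following is a minimal sketch on a small hypothetical frame named `toy` (not the `students` data created later in this notebook), using only `isna`, `sum`, `mean`, and `any` from pandas.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, used only for illustration
toy = pd.DataFrame({
    'Name': ['a', 'b', 'c', 'd'],
    'Midterm_Score': [71.0, np.nan, 88.0, 93.0],
    'Final_Score': [np.nan, np.nan, 97.0, 90.0],
})

# Number of missing values in each column
missing_counts = toy.isna().sum()

# Share of missing values in each column (between 0.0 and 1.0)
missing_share = toy.isna().mean()

# Number of rows that have at least one missing value
rows_with_missing = int(toy.isna().any(axis=1).sum())

print(missing_counts)
print(missing_share)
print(rows_with_missing)
```

A column with a small share of missing values is often a candidate for filling, while a column that is mostly empty may be a candidate for dropping. The cutoff is a judgment call that depends on the problem.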
Import Preliminaries
# Import modules
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import missingno as msno
Create Data
# Create some student data
students = pd.DataFrame({
    'Name': ['Student_' + str(i) for i in range(100)],
    'Midterm_Score': np.random.randint(70, 100, size=100),
    'Final_Score': np.random.randint(90, 100, size=100),
})
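# Create null values in the final scores (scores from 92 to 96 become NaN).
# Assign the result back instead of using inplace=True on a column,
# which may not modify the original DataFrame in recent pandas versions.
students['Final_Score'] = students['Final_Score'].replace(
    to_replace=list(range(92, 97)), value=np.nan)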
# View our dataframe
students.head(15)
Visualization
# Bar plot of data completeness (non-null count) for each feature
msno.bar(students, figsize=(10,5), fontsize=10);
plt.xlabel('Features')
plt.ylabel('Record Number');
# Plot where null values occur across features
msno.matrix(students, figsize=(10,5), fontsize=10);
plt.xlabel('Features')
plt.ylabel('Record Number');
Removing Rows with NaN Values
# Drop rows that contain null values
students.dropna().head(15)
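Dropping every row that has any null value can discard more data than intended. As a sketch on a hypothetical frame named `toy`, the `subset` and `thresh` parameters of `DataFrame.dropna` narrow the rule: `subset` checks only the listed columns, and `thresh` keeps rows that have at least that many non-null values.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, used only for illustration
toy = pd.DataFrame({
    'Name': ['a', 'b', 'c', 'd'],
    'Midterm_Score': [71.0, np.nan, 88.0, 93.0],
    'Final_Score': [np.nan, np.nan, 97.0, 90.0],
})

# Drop a row only when Final_Score is missing
by_subset = toy.dropna(subset=['Final_Score'])

# Keep rows that have at least 2 non-null values
by_thresh = toy.dropna(thresh=2)

print(by_subset)
print(by_thresh)
```

Both calls return a new DataFrame and leave `toy` unchanged, just like the plain `dropna()` call above.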
Filling NaN Values with Another Value
# Fill null values with another value (here, 0)
students.fillna(0).head(15)
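A single fill value is applied to every column, which is not always what you want. One option, sketched here on a hypothetical frame named `toy`, is to pass a dictionary to `fillna` so each column gets its own replacement. The placeholder values `'Unknown'` and `0` are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with gaps in both a text and a numeric column
toy = pd.DataFrame({
    'Name': ['a', None, 'c'],
    'Final_Score': [np.nan, 97.0, 90.0],
})

# Map each column name to the value that should replace its nulls
filled = toy.fillna({'Name': 'Unknown', 'Final_Score': 0})

print(filled)
```

Keep in mind that a constant such as 0 is treated as a real score by any later calculation, so it can pull down averages.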
Filling NaN Values with Mean
# Fill in null values with the mean of Final_Score
students.fillna(students.Final_Score.mean()).head(15)
Filling NaN Values with Median
# Fill null values with the median of Final_Score
students.fillna(students.Final_Score.median()).head(15)
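The choice between mean and median matters most when the column has outliers. The sketch below uses a hypothetical series of scores with one very low value to show how far apart the two fill values can be. It also records which entries were missing before filling, since that information is lost afterwards.

```python
import numpy as np
import pandas as pd

# Hypothetical scores with one missing value and one outlier (20.0)
scores = pd.Series([90.0, 91.0, 92.0, np.nan, 20.0], name='Final_Score')

# Remember which entries were missing before filling
was_missing = scores.isna()

# Both statistics skip NaN by default
mean_value = scores.mean()      # pulled down by the outlier
median_value = scores.median()  # less affected by the outlier

filled_mean = scores.fillna(mean_value)
filled_median = scores.fillna(median_value)

print(mean_value, median_value)
```

Here the mean is 73.25 and the median is 90.5, so the median gives a fill value closer to a typical score in this toy example.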
Dropping Features with NaN
# Drop features that contain null values
students.dropna(axis=1).head(15)
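Dropping every feature that has even one null value can remove useful columns. A gentler variant, sketched here on a hypothetical frame named `toy`, keeps a column unless its share of missing values exceeds a cutoff. The name `max_missing_share` and its value of 0.5 are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: Final_Score is mostly missing
toy = pd.DataFrame({
    'Name': ['a', 'b', 'c', 'd'],
    'Midterm_Score': [71.0, np.nan, 88.0, 93.0],
    'Final_Score': [np.nan, np.nan, np.nan, 90.0],
})

# Hypothetical cutoff: drop columns with more than 50% missing values
max_missing_share = 0.5

# Boolean mask over columns, True where the column is complete enough
keep_mask = toy.isna().mean() <= max_missing_share

# Select only the columns that pass the cutoff
trimmed = toy.loc[:, keep_mask]

print(trimmed.columns.tolist())
```

With this cutoff, `Midterm_Score` is kept despite one missing value, while `Final_Score` is dropped because three of its four values are missing.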
Author: Kavi Sekhon