Python for Data Science - Multivariate analysis for outlier detection

Chapter 5 - Outlier Analysis

Segment 9 - Multivariate analysis for outlier detection

import pandas as pd

import matplotlib.pyplot as plt
from matplotlib import rcParams
import seaborn as sb
%matplotlib inline
rcParams['figure.figsize'] = 5, 4
sb.set_style('whitegrid')

Visually inspecting boxplots

df = pd.read_csv(filepath_or_buffer='~/Data/iris.data.csv', header=None, sep=',')

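# Name the columns; the raw file has no header row
df.columns = ['Sepal Length', 'Sepal Width', 'Petal Length', 'Petal Width', 'Species']
# Split the four measurements from the species label
data = df.iloc[:, 0:4].values
target = df.iloc[:, 4].values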

df.head()

sb.boxplot(x='Species', y='Sepal Length', data=df, palette='hls')
[Figure: boxplot of Sepal Length for each Species]
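
The points a boxplot draws beyond its whiskers can also be listed numerically. What follows is a minimal sketch, assuming the default whisker rule of 1.5 times the interquartile range. The helper iqr_outliers and the small sample frame are made up for illustration and are not part of the original notebook.

```python
import pandas as pd

def iqr_outliers(df, group_col, value_col, k=1.5):
    """Return rows whose value lies outside the k*IQR fences of its own group."""
    grouped = df.groupby(group_col)[value_col]
    # Broadcast each group's quartiles back onto its rows
    q1 = grouped.transform(lambda s: s.quantile(0.25))
    q3 = grouped.transform(lambda s: s.quantile(0.75))
    iqr = q3 - q1
    mask = (df[value_col] < q1 - k * iqr) | (df[value_col] > q3 + k * iqr)
    return df[mask]

# Tiny made-up sample, NOT the real iris measurements
sample = pd.DataFrame({
    'Species': ['a'] * 6 + ['b'] * 6,
    'Sepal Length': [5.0, 5.1, 4.9, 5.0, 5.2, 7.5,
                     6.0, 6.1, 5.9, 6.0, 6.2, 6.1],
})

flagged = iqr_outliers(sample, 'Species', 'Sepal Length')
print(flagged)
```

With the notebook's data, the equivalent call would be iqr_outliers(df, 'Species', 'Sepal Length'). The flagged rows should generally match the points plotted beyond the whiskers, although quartile interpolation can differ slightly between libraries.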

Looking at the scatterplot matrix

sb.pairplot(df, hue='Species', palette='hls')
[Figure: scatterplot matrix of the four measurements, colored by Species]
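
A scatterplot matrix shows only two variables at a time. One common numeric complement, which the original notebook does not use, is the Mahalanobis distance: it scores each row against all variables at once while accounting for their correlation. The sketch below is an assumption-laden illustration; the function mahalanobis_distances and the toy points are hypothetical.

```python
import numpy as np

def mahalanobis_distances(X):
    """Distance of each row of X from the sample mean, scaled by the sample covariance."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # pinv tolerates a singular covariance matrix (a choice made here for robustness)
    inv_cov = np.linalg.pinv(np.cov(X, rowvar=False))
    # Row-wise quadratic form: d_i^2 = c_i^T @ inv_cov @ c_i
    d2 = np.einsum('ij,jk,ik->i', centered, inv_cov, centered)
    return np.sqrt(np.maximum(d2, 0.0))

# Made-up 2-D points lying close to the line y = x ...
x = np.linspace(0.0, 1.0, 20)
y = x + 0.01 * (-1.0) ** np.arange(20)
points = np.column_stack([x, y])
# ... plus one point whose coordinates are each within the usual range
# but which sits far from the diagonal trend
points = np.vstack([points, [0.2, 0.8]])

distances = mahalanobis_distances(points)
print(distances.argmax())  # index of the most unusual row
```

For the notebook's data, mahalanobis_distances(data) would score every flower on all four measurements together. Because the species form separate clusters, computing the distances per species is likely to be more informative than pooling them.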

Posted 2021-01-16 16:29 by 晨风_Eric