Posts by category - Data Analysis

Summary: ###I. Feature Selection #####1. Permutation Importance # shuffle a single column of the validation data and measure the loss (the increase reflects that column's importance) import eli5 Read more
posted @ 2022-07-06 22:03 失控D大白兔
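The shuffling idea in the summary above can be sketched in plain Python. This is only an illustration of the technique, not the post's own code and not the eli5 API: the helper names (`mse`, `permutation_importance`, `toy_model`) and the toy validation data are assumptions made up for this example.

```python
import random

def mse(model, X, y):
    """Mean squared error of the model's predictions on rows X."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Average increase in loss when a single column of X is shuffled."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for col in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle only this column; every other column stays untouched.
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            X_perm = [row[:col] + [v] + row[col + 1:]
                      for row, v in zip(X, shuffled)]
            increases.append(mse(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Hypothetical validation data: the target depends only on column 0.
data_rng = random.Random(42)
X_val = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y_val = [3.0 * row[0] for row in X_val]

def toy_model(row):
    """Stand-in for an already fitted model (an assumption for this sketch)."""
    return 3.0 * row[0]

scores = permutation_importance(toy_model, X_val, y_val)
print(scores)
```

Shuffling column 0 breaks its link to the target, so the loss rises. Shuffling column 1 changes nothing, because the toy model never reads it.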
Summary: ###I. Exploratory Data Analysis & Data Cleaning & Missing-Value Imputation 1 Which features are categorical? 2 Which features are numerical? 3 Which features are mixed data types? Read more
posted @ 2022-07-06 15:39 失控D大白兔
Summary: Pipelines are a simple way to keep your data preprocessing and modeling code organized. Specifically, a pipeline bundles preprocessing and modeling st Read more
posted @ 2022-07-04 15:20 失控D大白兔
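As a rough illustration of the bundling idea in the summary above, here is a minimal pure-Python sketch. It does not use scikit-learn or reproduce the post's code: `MeanImputer`, `OneFeatureLinearModel`, `SimplePipeline`, and the tiny dataset are all hypothetical names and values invented for this example.

```python
class MeanImputer:
    """Preprocessing step: replace None with the column mean learned in fit."""

    def fit(self, X):
        self.means_ = []
        for c in range(len(X[0])):
            present = [row[c] for row in X if row[c] is not None]
            self.means_.append(sum(present) / len(present))
        return self

    def transform(self, X):
        return [[self.means_[c] if v is None else v for c, v in enumerate(row)]
                for row in X]


class OneFeatureLinearModel:
    """Modeling step: least-squares line fitted on the first column."""

    def fit(self, X, y):
        xs = [row[0] for row in X]
        mx, my = sum(xs) / len(xs), sum(y) / len(y)
        var = sum((x - mx) ** 2 for x in xs)
        self.slope_ = sum((x - mx) * (t - my) for x, t in zip(xs, y)) / var
        self.intercept_ = my - self.slope_ * mx
        return self

    def predict(self, X):
        return [self.intercept_ + self.slope_ * row[0] for row in X]


class SimplePipeline:
    """Bundles preprocessing steps and a model behind one fit/predict pair."""

    def __init__(self, preprocessors, model):
        self.preprocessors = preprocessors
        self.model = model

    def fit(self, X, y):
        for step in self.preprocessors:
            X = step.fit(X).transform(X)   # learn from training data only
        self.model.fit(X, y)
        return self

    def predict(self, X):
        for step in self.preprocessors:
            X = step.transform(X)          # reuse what was learned in fit
        return self.model.predict(X)


# Hypothetical training data with one missing value.
X_train = [[1.0], [None], [3.0]]
y_train = [2.0, 4.0, 6.0]

pipe = SimplePipeline([MeanImputer()], OneFeatureLinearModel()).fit(X_train, y_train)
predictions = pipe.predict([[None], [4.0]])
print(predictions)
```

Because the imputer is fitted inside `fit` and only reused inside `predict`, new data gets exactly the same preprocessing as the training data, which is the main reason to keep the steps bundled.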
Summary: import matplotlib.pyplot as plt import seaborn as sns ###1. Line Chart plt.figure(figsize=(16,6)) # Set the width and height of the figure plt.title(" Read more
posted @ 2022-06-07 00:04 失控D大白兔
Summary: import numpy as np ###1. Create ndarray #Specify every value x = np.array([1, 2, 3, 4, 5]) y = np.array([[1,2,3],[4,5,6],[7,8,9], [10,11,12]]) # Creat Read more
posted @ 2022-06-06 12:49 失控D大白兔
Summary: ###1. Handling Missing Values #get the missing data ratio missing_values_count = nfl_data.isnull().sum() ## get the number of missing data points per Read more
posted @ 2022-06-05 18:56 失控D大白兔
Summary: ###1. Getting Started import pandas as pd #import pd.DataFrame({'Yes': [50, 21], 'No': [131, 2]}) #Create a table #assign the row labels pd.DataFrame({'Bo Read more
posted @ 2022-06-04 21:58 失控D大白兔
Summary: ###1. Function #default values for function arguments def greet(who="Colin"): print("Hello,", who) #make choices in the function print("Splitting", total_ca Read more
posted @ 2022-05-27 20:55 失控D大白兔
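Summary: ###Data Import import numpy as np import pandas as pd # Read the CSV data with pandas; the result is a DataFrame, roughly a dictionary plus arrays, with the first row serving as the feature index data = pd.read_csv('data/kaggle_house_price_predicti Read more
posted @ 2022-05-27 03:10 失控D大白兔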
