Kaggle Learning Journey
Decision Trees
https://www.kaggle.com/dansbecker/your-first-machine-learning-model
import pandas as pd

melb_data_path = 'melb_data.csv'
data1 = pd.read_csv(melb_data_path)
data1.describe()
data1.columns
data2 = data1.dropna(axis=0)  # drop rows that contain missing values
data2.describe()
y = data2.Price  # define the prediction target
y.describe()
# 'Lattitude' and 'Longtitude' are spelled this way in the dataset's column names
features = ['Rooms', 'Bathroom', 'Landsize', 'Lattitude', 'Longtitude']
X = data2[features]  # define the features
X.describe()
X.head()

from sklearn.tree import DecisionTreeRegressor
model1 = DecisionTreeRegressor(random_state=1)  # choose the decision tree model
model1.fit(X, y)  # train the model
X.head()
model1.predict(X.head())  # predict prices for the first 5 rows of X
output:
>>> X.head()
   Rooms  Bathroom  Landsize  Lattitude  Longtitude
1      2       1.0     156.0   -37.8079    144.9934
2      3       2.0     134.0   -37.8093    144.9944
4      4       1.0     120.0   -37.8072    144.9941
6      3       2.0     245.0   -37.8024    144.9993
7      2       1.0     256.0   -37.8060    144.9954
>>> model1.predict(X.head())
array([1035000., 1465000., 1600000., 1876000., 1636000.])
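Note that these predictions are made on rows the model was trained on. A minimal sketch of why that matters, using a made-up toy table instead of melb_data.csv (the column values, `toy`, `X_toy`, and `y_toy` are assumptions for illustration only): an unrestricted decision tree can usually reproduce its own training targets, so in-sample predictions like the ones above say little about how the model does on unseen houses.

```python
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data standing in for melb_data.csv (values are made up).
toy = pd.DataFrame({
    'Rooms': [2, 3, 4, 3, 2],
    'Landsize': [156.0, 134.0, 120.0, 245.0, 256.0],
    'Price': [100.0, 200.0, 300.0, 400.0, 500.0],
})
X_toy = toy[['Rooms', 'Landsize']]
y_toy = toy['Price']

# Same model setup as in the notes: default (unlimited) tree depth.
model = DecisionTreeRegressor(random_state=1)
model.fit(X_toy, y_toy)

# Predict on the very rows the model was fitted on.
pred = model.predict(X_toy)

# Put actual and predicted prices side by side for comparison.
comparison = pd.DataFrame({'actual': y_toy, 'predicted': pred})
print(comparison)
```

Because every toy row has a distinct feature combination, the fully grown tree ends with one training row per leaf, and the two columns should agree.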
Other:
View the Python command history: import readline; print('\n'.join([str(readline.get_history_item(i + 1)) for i in range(readline.get_current_history_length())]))
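If only the most recent commands are of interest, the one-liner can be wrapped in a small helper. This is a sketch, not part of the original notes: `recent_history` and its `limit` parameter are hypothetical names, and it assumes the standard-library `readline` module is available (typically the case on Linux and macOS).

```python
import readline


def recent_history(limit=None):
    """Return readline history entries as a list of strings.

    limit: hypothetical parameter; if given, only the last `limit`
    entries are returned, otherwise the whole history.
    """
    total = readline.get_current_history_length()
    start = 0 if limit is None else max(0, total - limit)
    # readline history items are 1-indexed, hence i + 1
    return [str(readline.get_history_item(i + 1)) for i in range(start, total)]


# Print the last 10 commands typed in this interactive session.
print('\n'.join(recent_history(10)))
```

Run outside an interactive session, the history is normally empty, so this prints only a blank line.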