Statistics and Linear Algebra 1

1. Add a value to each element in a list:

　　degrees_zero = [f + 459.67 for f in fahrenheit_degrees]
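
Adding 459.67 to a Fahrenheit temperature gives the same temperature on the Rankine scale, which starts at absolute zero. A minimal sketch of the comprehension above, assuming some sample values for `fahrenheit_degrees` (the values are made up for illustration and are not from the original data):

```python
# Hypothetical Fahrenheit readings, chosen only for illustration.
fahrenheit_degrees = [32.0, 212.0, -459.67]

# Add the same offset to every element with a list comprehension.
degrees_zero = [f + 459.67 for f in fahrenheit_degrees]

print(degrees_zero)
```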

2. Replace each list element with its index in a reference list (turning an ordinal scale into numbers), then average the result:

　　survey_responses = ["none", "some", "a lot", "none", "a few", "none", "none"]

　　survey_scale = ["none", "a few", "some", "a lot"]

　　survey_numbers = [survey_scale.index(response) for response in survey_responses]

　　average_smoking = sum(survey_numbers) / len(survey_numbers)
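
The same mapping can also be done with a dictionary lookup instead of calling `index()` for every response. This is only a sketch of one possible variation; the name `scale_to_number` is an assumption and does not appear in the original notes:

```python
survey_responses = ["none", "some", "a lot", "none", "a few", "none", "none"]
survey_scale = ["none", "a few", "some", "a lot"]

# Build the label -> position lookup once (hypothetical helper name).
scale_to_number = {label: position for position, label in enumerate(survey_scale)}

# Translate every response into its position on the scale.
survey_numbers = [scale_to_number[response] for response in survey_responses]

# Average position on the scale.
average_smoking = sum(survey_numbers) / len(survey_numbers)

print(survey_numbers)
print(average_smoking)
```

Both versions give the same numbers; the dictionary just avoids searching the scale list again for each response.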

3. Categorize scales: here we split the savings list by gender category and find the average savings for each group.

　　gender = ["male", "female", "female", "male", "male", "female"]
　　savings = [1200, 5000, 3400, 2400, 2800, 4100]

//////////////////////////////////Solution one////////////////////////////////////////

　　male_saving_list = []

　　for i in range(len(gender)):
　　　　if gender[i] == "male":
　　　　　　male_saving_list.append(savings[i])

　　# Compute the average once, after the loop has collected every value.
　　male_savings = sum(male_saving_list) / len(male_saving_list)

//////////////////////////////////Solution two////////////////////////////////////////

　　# The list comprehension replaces the explicit append loop: savings[i] is the value to collect, and the if condition follows the for clause, just as in solution one.
　　female_saving_list = [savings[i] for i in range(len(gender)) if gender[i] == "female"]
　　female_savings = sum(female_saving_list) / len(female_saving_list)
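
Both solutions walk the two lists by index. As a sketch of a third option, the lists can be paired with `zip` and grouped in a single pass. The helper name `average_by_group` is hypothetical and not part of the original notes:

```python
from collections import defaultdict

gender = ["male", "female", "female", "male", "male", "female"]
savings = [1200, 5000, 3400, 2400, 2800, 4100]

def average_by_group(labels, values):
    """Return {label: mean of the values that share that label}."""
    grouped = defaultdict(list)
    # zip pairs each label with the value at the same position.
    for label, value in zip(labels, values):
        grouped[label].append(value)
    return {label: sum(items) / len(items) for label, items in grouped.items()}

average_savings = average_by_group(gender, savings)
print(average_savings["male"])
print(average_savings["female"])
```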

4. Frequency Histograms

　　import matplotlib.pyplot as plt

　　student_scores = [15, 80, 95, 100, 45, 75, 65]

　　plt.hist(student_scores, bins=2) # There are only two bins in the chart

　　plt.show()
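
To see what the two bars contain, the sketch below counts the scores by hand using equal-width bins between the minimum and maximum, which is how an integer `bins` value is interpreted by default. The function `count_equal_width_bins` is a hypothetical helper written for this illustration, not a matplotlib API:

```python
student_scores = [15, 80, 95, 100, 45, 75, 65]

def count_equal_width_bins(values, bins):
    """Count how many values fall into each of `bins` equal-width bins.

    Assumes the values are not all identical (the bin width would be zero).
    """
    low, high = min(values), max(values)
    width = (high - low) / bins
    counts = [0] * bins
    for value in values:
        # The maximum value belongs to the last bin, so clamp the index.
        index = min(int((value - low) / width), bins - 1)
        counts[index] += 1
    return counts

bin_counts = count_equal_width_bins(student_scores, 2)
print(bin_counts)  # [2, 5]
```

With two bins the range 15 to 100 is split at 57.5, so the scores 15 and 45 land in the first bin and the other five in the second.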

5. Skew describes the asymmetry of the histogram: if the tail of the histogram is on the left side, the skew is negative; if the tail is on the right side, the skew is positive.

　　from scipy.stats import skew

　　positive_skew = skew(test_scores_positive)
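
As a rough illustration of the sign rule, the sketch below computes the moment-based skewness (third central moment divided by the 1.5 power of the second), which is the quantity `skew` returns with its default settings. The function `sample_skew` and the sample lists are assumptions made up for this example:

```python
def sample_skew(values):
    """Moment-based skewness: m3 / m2 ** 1.5 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical data: one long tail on the right, and its mirror image.
right_tail = [1, 2, 2, 3, 3, 3, 10]
left_tail = [10, 9, 9, 8, 8, 8, 1]

print(sample_skew(right_tail))  # positive: tail on the right
print(sample_skew(left_tail))   # negative: tail on the left
```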

6. Kurtosis measures whether the distribution is short and flat, or tall and skinny:

　　from scipy.stats import kurtosis

　　kurt_meso = kurtosis(test_scores_meso)
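
A minimal sketch of the idea, computing excess kurtosis by hand (fourth central moment divided by the squared second moment, minus 3), which matches what `kurtosis` returns with its default settings. The function `excess_kurtosis` and both sample lists are hypothetical:

```python
def excess_kurtosis(values):
    """Moment-based excess kurtosis: m4 / m2 ** 2 - 3 (no bias correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3

# Hypothetical data: evenly spread values vs. a sharp peak with two outliers.
flat_scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
peaked_scores = [5, 5, 5, 5, 5, 5, 5, 5, 0, 10]

print(excess_kurtosis(flat_scores))    # negative: short and flat
print(excess_kurtosis(peaked_scores))  # positive: tall and skinny
```

Subtracting 3 makes a normal distribution score 0, so negative values read as flatter than normal and positive values as more peaked.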

7. Modality:

　　unimodal: only one peak in the plot

　　bimodal: two peaks in the plot

　　multimodal: more than two peaks in the plot

8. Drop the rows that contain NaN in certain columns:

　　new_titanic_survival = titanic_survival.dropna(subset = ["age","sex"])
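
To make the effect of `subset` concrete, here is a plain-Python sketch of the same filtering rule on a list of dictionaries: a row is dropped only when one of the listed columns is missing. This illustrates the behavior and is not how pandas implements it; the helper names and the sample rows are assumptions:

```python
import math

def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def drop_rows_with_missing(rows, subset):
    """Keep only the rows that have a value in every column named in `subset`."""
    return [
        row for row in rows
        if not any(is_missing(row.get(column)) for column in subset)
    ]

# Hypothetical passenger records, made up for this example.
passengers = [
    {"name": "A", "age": 29.0, "sex": "female"},
    {"name": "B", "age": float("nan"), "sex": "male"},
    {"name": "C", "age": 41.0, "sex": None},
    {"name": None, "age": 35.0, "sex": "male"},
]

kept = drop_rows_with_missing(passengers, ["age", "sex"])
print([row["name"] for row in kept])  # ['A', None]
```

The last row survives even though its name is missing, because "name" is not in the subset.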

posted on 2016-11-30 03:47 阿难1020