Heywhale (和鲸): numpy + pandas basics, Level 1

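STEP1: Create the data frame according to the requirements below

The student IDs of 10 students and their scores in Chinese, Math, and English are given below (all values are numeric):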


Id: [202001, 202002, 202003, 202004, 202005, 202006, 202007, 202008, 202009, 202010]
Chinese: [98, 67, 84, 88, 78, 90, 93, 75, 82, 87]
Math: [92, 80, 73, 76, 88, 78, 90, 82, 77, 69]
English: [88, 79, 90, 73, 79, 83, 81, 91, 71, 78]

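Requirements: for each student, compute the total score (SumScore), the mean score (MeanScore), the highest score (MaxScore), the lowest score (MinScore), the range between the highest and lowest score (PtpScore), and the score variance (VarScore); save all the data in the score data frame; then merge the columns (including the student ID) into a single column named answer, so that only the index column id (from 0 to 100) and answer remain, with every value stored as an integer. With 10 columns of 10 values each, answer holds 100 values, so a zero-based id runs from 0 to 99.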
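As a quick hand check of the requested statistics, here is a minimal sketch for the first student only (202001, with scores 98, 92, 88 taken from the lists above). The variable names are illustrative and not part of the exercise. It also shows that the population variance (NumPy's default, ddof=0) and the sample variance (ddof=1) differ, which matters once the values are truncated to integers.

```python
import numpy as np

# Scores of student 202001 (Chinese, Math, English), from the lists above
scores = np.array([98, 92, 88])

total = int(np.sum(scores))      # 98 + 92 + 88 = 278
highest = int(np.max(scores))    # 98
lowest = int(np.min(scores))     # 88
spread = int(np.ptp(scores))     # highest - lowest = 10
mean = float(np.mean(scores))    # about 92.67

# np.var defaults to the population variance (divide by n, ddof=0)
var_population = float(np.var(scores))      # about 16.89
# ddof=1 gives the sample variance (divide by n - 1)
var_sample = float(np.var(scores, ddof=1))  # about 25.33

# Converting to int truncates the fractional part
print(total, int(mean), highest, lowest, spread, int(var_population))
```

After truncation the two variance conventions give 16 and 25 for this student, so it is worth checking which one the exercise expects.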

import numpy as np
import pandas as pd
# Raw data: student IDs and the scores for the three subjects
Id = [202001, 202002, 202003, 202004, 202005, 202006, 202007, 202008, 202009, 202010]
Chinese = [98, 67, 84, 88, 78, 90, 93, 75, 82, 87]
Math = [92, 80, 73, 76, 88, 78, 90, 82, 77, 69]
English = [88, 79, 90, 73, 79, 83, 81, 91, 71, 78]
data = {
    'ID': Id,
    'Chinese': Chinese,
    'Math': Math,
    'English': English
}
df = pd.DataFrame(data, dtype=int, copy=True)
# Per-student statistics, computed row by row (axis=1) over the three subject columns
subjects = df[['Chinese', 'Math', 'English']].to_numpy()
df['MaxScore'] = np.max(subjects, axis=1)
df['MinScore'] = np.min(subjects, axis=1)
df['SumScore'] = np.sum(subjects, axis=1)
df['PtpScore'] = np.ptp(subjects, axis=1)  # range: highest score minus lowest score
df['MeanScore'] = np.mean(subjects, axis=1)
df['VarScore'] = np.var(subjects, axis=1)  # NumPy default: population variance (ddof=0)
# Stack all columns (ID first) end to end into a single Series
df_contact = pd.concat([df['ID'], df['Chinese'], df['Math'], df['English'], df['SumScore'], df['MeanScore'],
                        df['MaxScore'], df['MinScore'], df['PtpScore'], df['VarScore']], ignore_index=True)
df_contact = df_contact.astype(int)  # keep integers only (fractional parts are truncated)
df2 = pd.DataFrame()  # create df2
df2['answer'] = df_contact  # new column holding the concatenated values
df2['id'] = range(len(df2['answer']))  # new column: one id per answer value
# Swap the two columns so that id comes first
cols = df2.columns[[1, 0]]
df2 = df2[cols]
df2
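The column stacking done with pd.concat above can also be sketched with a column-major flatten of the underlying array. This is only an alternative under stated assumptions: the two-row frame `small` and the names `flat` and `answer` are made up for illustration and are not part of the exercise.

```python
import pandas as pd

# Hypothetical two-row frame; the column names mirror the ones used above
small = pd.DataFrame({
    'ID': [202001, 202002],
    'Chinese': [98, 67],
    'Math': [92, 80],
})

# order='F' walks the array column by column, which gives the same
# ordering as concatenating the columns one after another
flat = small.to_numpy().ravel(order='F')

# Build the two-column result directly, with id already in first position
answer = pd.DataFrame({'id': range(len(flat)), 'answer': flat.astype(int)})
print(answer)
```

Building the frame from a dict with id listed first avoids the separate column swap, at the cost of relying on the column order of the source frame.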

STEP2: Save the result as a CSV file

df2.to_csv('answer_1.csv', index=False, encoding='utf-8-sig')  # Tab is the shortcut for code autocompletion; once the run finishes, refresh the file tree on the left to find this file
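To confirm that the saved file has the expected shape, one option is to read it back with pd.read_csv. The sketch below is self-contained: `demo` is a hypothetical stand-in for df2, and the file is written to a temporary directory instead of the working directory.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for df2 so the snippet runs on its own
demo = pd.DataFrame({'id': range(3), 'answer': [202001, 98, 92]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'answer_1.csv')
    # Same arguments as in STEP2: no index column, UTF-8 with a BOM
    demo.to_csv(path, index=False, encoding='utf-8-sig')
    # Reading with the same encoding strips the BOM from the first header
    loaded = pd.read_csv(path, encoding='utf-8-sig')

print(loaded.columns.tolist())
print(loaded.dtypes)
```

If index=False were omitted, the file would gain an extra unnamed column holding the row index, which the check on the column names would reveal.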

 

posted @ 2023-12-27 14:00  夫琅禾费米线