【338】Pandas.DataFrame
Ref: Pandas Tutorial: DataFrames in Python
Ref: pandas.DataFrame
Ref: Creating, reading, and writing reference
- pandas.DataFrame()
- pandas.Series()
- pandas.read_csv()
- pandas.DataFrame.shape
- pandas.DataFrame.head
- pandas.read_excel()
- pandas.DataFrame.to_csv()
- pandas.DataFrame.to_excel()
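The functions listed above can be tried together in a small round trip. This is only a minimal sketch: the column names and values are made up for illustration, and an in-memory StringIO buffer stands in for a real CSV file.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data, not taken from the references above.
df = pd.DataFrame({"name": ["a", "b", "c"], "score": [90, 85, 70]})

print(df.shape)    # tuple of (number of rows, number of columns)
print(df.head(2))  # only the first two rows

# Write the frame as CSV text into an in-memory buffer (no file on disk).
buffer = StringIO()
df.to_csv(buffer, index=False)

# Rewind the buffer and read the same text back into a new DataFrame.
buffer.seek(0)
restored = pd.read_csv(buffer)
print(restored)
```

Passing index=False keeps the row labels out of the CSV text, so reading it back does not create an extra unnamed column.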
Ref: Indexing, selecting, assigning reference
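- pandas.DataFrame.iloc[]: position-based indexing, similar to the Cells function in Excel; it treats the DataFrame as a matrix
- pandas.DataFrame.loc[]: label-based indexing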
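A short sketch of the difference between the two indexers, assuming a small frame with the same Row1/Row2 and Col1/Col2 labels used later in this post. Note that both are used with square brackets, not called like functions.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2], [3, 4]],
    index=["Row1", "Row2"],
    columns=["Col1", "Col2"],
)

# iloc selects by integer position: row 0, column 1.
by_position = df.iloc[0, 1]

# loc selects by label: the same cell, addressed by name.
by_label = df.loc["Row1", "Col2"]

# A single row comes back as a Series indexed by the column labels.
first_row = df.iloc[0]

# Both indexers can also be used for assignment.
df.loc["Row2", "Col2"] = 40
print(df)
```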
I. Basic concepts
- class pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)
Parameters: data : the main body of the data; numpy ndarray (structured or homogeneous), dict, or DataFrame
Dict can contain Series, arrays, constants, or list-like objects
Changed in version 0.23.0: If data is a dict, argument order is maintained for Python 3.6 and later.
index : row labels, default 0, 1, 2, ..., n; Index or array-like
Index to use for resulting frame. Will default to RangeIndex if no indexing information part of input data and no index provided
columns : column labels, default 0, 1, 2, ..., n; Index or array-like
Column labels to use for resulting frame. Will default to RangeIndex (0, 1, 2, …, n) if no column labels are provided
dtype : data type; dtype, default None
Data type to force. Only a single dtype is allowed. If None, infer
copy : boolean, default False
Copy data from inputs. Only affects DataFrame / 2d ndarray input
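In the example below, data[1:, 0] is the first column (used as the row labels) and data[0, 1:] is the first row (used as the column labels); both slices skip the empty corner cell data[0, 0].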
>>> import numpy as np
>>> import pandas as pd
>>> data = np.array([['', 'Col1', 'Col2'],
...                  ['Row1', 1, 2],
...                  ['Row2', 3, 4]])
>>> print(pd.DataFrame(data=data[1:, 1:],
...                    index=data[1:, 0],
...                    columns=data[0, 1:]))
     Col1 Col2
Row1    1    2
Row2    3    4
or
>>> data = np.array([[1, 2], [3, 4]])
>>> print(pd.DataFrame(data=data,
...                    index=['Row1', 'Row2'],
...                    columns=['Col1', 'Col2']))
      Col1  Col2
Row1     1     2
Row2     3     4
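The parameter list also mentions dict input and the dtype argument, which the two array examples do not cover. A minimal sketch, reusing the same labels as an assumption:

```python
import pandas as pd

# With a dict, the keys become the column labels.
df = pd.DataFrame(
    {"Col1": [1, 3], "Col2": [2, 4]},
    index=["Row1", "Row2"],
    dtype="float64",  # force a single dtype for every column
)
print(df)

# Without index=, the rows get a default RangeIndex.
default = pd.DataFrame({"x": [1, 2, 3]})
print(default.index)
```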
Ref: pandas dataframe.apply(): processing a row/column to produce a new row/column
Ref: Iterating over DataFrame rows in pandas
II. Related methods:
DataFrame.apply(func, axis=0, broadcast=None, raw=False, reduce=None, result_type=None, args=(), **kwds)
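Apply a function along an axis of the DataFrame (similar to applying a function to a whole column or a whole row of data in Excel). Note that the broadcast and reduce parameters in the signature above were deprecated in version 0.23.0 and removed in pandas 1.0; use result_type instead.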
Objects passed to the function are Series objects whose index is either the DataFrame's index (axis=0) or the DataFrame's columns (axis=1).
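A small sketch of what the axis argument does, using a made-up two-by-two frame. With axis=0 the function receives each column as a Series; with axis=1 it receives each row.

```python
import pandas as pd

df = pd.DataFrame(
    {"Col1": [1, 3], "Col2": [2, 4]},
    index=["Row1", "Row2"],
)

# axis=0: the lambda is called once per column.
col_sums = df.apply(lambda s: s.sum(), axis=0)

# axis=1: the lambda is called once per row.
row_sums = df.apply(lambda s: s.sum(), axis=1)

# The per-row result can be stored as a new column.
df["Total"] = row_sums
print(df)
```

For a plain sum, df.sum(axis=1) does the same thing more directly; apply is mainly useful when the per-row or per-column logic has no built-in equivalent.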
Ref: pandas.Series.value_counts
Series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)
Returns object containing counts of unique values.
The resulting object will be in descending order so that the first element is the most frequently occurring element. Excludes NA values by default.
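A quick sketch of the parameters described above, on a made-up Series that contains one missing value:

```python
import pandas as pd

s = pd.Series(["a", "b", "a", None, "a", "b", "c"])

# Default: counts in descending order, missing values excluded.
counts = s.value_counts()

# dropna=False also counts the missing value.
with_na = s.value_counts(dropna=False)

# normalize=True returns relative frequencies instead of counts.
shares = s.value_counts(normalize=True)

print(counts)
print(with_na)
print(shares)
```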
pandas.read_csv(): a str can be wrapped in StringIO() to turn it into a file-like buffer, which can then be passed directly to this function
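>>> from io import StringIO
>>> a = '''
... A, B, C
... 1,2,3
... 4,5,6
... 7,8,9
... '''
>>> a
'\nA, B, C\n1,2,3\n4,5,6\n7,8,9\n'
>>> # skipinitialspace=True strips the space after each comma in the header,
>>> # so the columns are named 'A', 'B', 'C' rather than 'A', ' B', ' C'
>>> data = pd.read_csv(StringIO(a), skipinitialspace=True)
>>> data
   A  B  C
0  1  2  3
1  4  5  6
2  7  8  9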