alex_bn_lee

【820】Python and R: controlling data types when reading CSV files

Reference: PySe-023-pandas.read_csv: reading a CSV file and specifying column data types, so that string columns are not converted to numbers

Reference: Read a delimited file (including CSV and TSV) into a tibble


Python: specify the data type by column name

import pandas as pd
# Read the "id" column as strings; without dtype, pandas would probably infer it as integers
df = pd.read_csv("df.csv", dtype={"id": str})
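
To see the difference, here is a minimal sketch that reads the same CSV text with and without `dtype`. The sample data and the `name` column are made up for illustration, and `io.StringIO` stands in for the `df.csv` file.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for df.csv; the ids have leading zeros.
csv_text = "id,name\n001,Alice\n002,Bob\n"

# Without dtype, pandas infers "id" as an integer column and the zeros are lost.
df_inferred = pd.read_csv(io.StringIO(csv_text))

# With dtype={"id": str}, the values are kept exactly as written in the file.
df_typed = pd.read_csv(io.StringIO(csv_text), dtype={"id": str})

print(df_inferred["id"].tolist())  # [1, 2]
print(df_typed["id"].tolist())     # ['001', '002']
```

This is the typical case where type control matters: identifiers, postal codes, and similar fields that look numeric but should stay as text.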

R: use one-letter abbreviations to specify each column's type

# "Ddc": column 1 is a date, column 2 a double, column 3 a character
df <- readr::read_csv("df_eu_sim.csv", col_types = "Ddc")

The readr documentation describes col_types as follows:

col_types

One of NULL, a cols() specification, or a string. See vignette("readr") for more details.

If NULL, all column types will be inferred from guess_max rows of the input, interspersed throughout the file. This is convenient (and fast), but not robust. If the guessed types are wrong, you'll need to increase guess_max or supply the correct types yourself.

Column specifications created by list() or cols() must contain one column specification for each column. If you only want to read a subset of the columns, use cols_only().

Alternatively, you can use a compact string representation where each character represents one column:

  • c = character

  • i = integer

  • n = number

  • d = double

  • l = logical

  • f = factor

  • D = date

  • T = date time

  • t = time

  • ? = guess

  • _ or - = skip
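
pandas has no built-in equivalent of this compact string, but the idea can be approximated. The following is a rough sketch, not part of pandas or readr: `compact_to_kwargs` and `CODE_TO_DTYPE` are hypothetical names, only some codes are covered, and `D`/`T` are both mapped to pandas datetime parsing because pandas has no separate date-only type.

```python
import io

import pandas as pd

# Hypothetical mapping from readr's one-letter codes to pandas dtypes.
CODE_TO_DTYPE = {"c": "string", "i": "Int64", "d": "float64", "l": "boolean"}


def compact_to_kwargs(columns, col_types):
    """Translate a readr-style compact string into pd.read_csv keyword arguments."""
    if len(columns) != len(col_types):
        raise ValueError("need exactly one type code per column")
    dtype, parse_dates, usecols = {}, [], []
    for name, code in zip(columns, col_types):
        if code in ("_", "-"):
            continue  # skipped column: leave it out of usecols
        usecols.append(name)
        if code in ("D", "T"):
            parse_dates.append(name)  # approximation: parsed as datetime
        elif code != "?":  # "?" leaves the column to pandas' own inference
            dtype[name] = CODE_TO_DTYPE[code]
    return {"dtype": dtype, "parse_dates": parse_dates, "usecols": usecols}


# Made-up data: a date, a double, a character id, and a column to skip.
csv_text = "date,value,id,note\n2025-03-13,1.5,007,x\n2025-03-14,2.0,008,y\n"
kwargs = compact_to_kwargs(["date", "value", "id", "note"], "Ddc_")
df = pd.read_csv(io.StringIO(csv_text), **kwargs)
print(df.dtypes)
```

The column names are passed in explicitly here to keep the sketch short; in practice they could be read from the file header first.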

 

Posted by McDelfino
