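[Java/Csv/Regex] Splitting quoted CSV lines with a regular expression to extract the row data
A CSV file consists of text lines whose fields are separated by commas. To keep field content intact (for example, when a field itself contains commas), each field is additionally wrapped in double quotes, which gives a file like this: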
"1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","傅宗龙","18","19","20"
"1","2","3","4","5.55","6","7","8","9","10","朱由检","12","13","14","15","16,666,666","17","袁崇焕","19","20"
"醉里挑灯看剑,梦回吹角连营","2","3","4","孙传庭","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20"
",,,,,,,,,","2","3","4","熊廷弼","6","7","8","9","10","11","12","卢象升","14","15","16","17","18","19","20"
Parsing such a file is fairly simple: it only takes a little extra care in the split pattern. The code is as follows:
import java.io.FileReader;
import java.io.IOException;
import java.io.LineNumberReader;
import java.util.ArrayList;
import java.util.List;

/**
 * Parses a CSV file and converts its contents into a nested list.
 * @author 逆火
 *
 * 2019-11-23 08:51:15
 */
public class CsvfileParser {
    private List<List<String>> contents;

    public CsvfileParser(String filename) throws IOException {
        contents = new ArrayList<List<String>>();

        // try-with-resources closes the reader even if readLine() throws
        try (LineNumberReader fileReader = new LineNumberReader(new FileReader(filename))) {
            String line = null;
            while ((line = fileReader.readLine()) != null) {
                System.out.println("Line " + fileReader.getLineNumber() + ": " + line);
                contents.add(getArrayFromLine(line));
            }
        }
    }

    private List<String> getArrayFromLine(String line) {
        List<String> retval = new ArrayList<String>();

        // (^\\s*\") matches the opening " of the line; this makes the first array
        //           element a zero-length string, so the loop below skips it
        // (\"\\s*,\\s*\") matches the "," between two fields
        // (\"\\s*$) matches the closing " of the line
        String[] arr = line.split("(^\\s*\")|(\"\\s*,\\s*\")|(\"\\s*$)");

        for (int i = 1; i < arr.length; i++) { // skip the leading empty string
            retval.add(arr[i]);
        }

        return retval;
    }

    public void printContents() {
        for (List<String> ls : contents) {
            System.out.println(String.join("|", ls));
        }
    }

    public static void main(String[] args) throws IOException {
        CsvfileParser cp = new CsvfileParser("C:\\Users\\horn1\\Desktop\\sample.csv");
        System.out.println("---------------------------");
        cp.printContents();
    }
}
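To see what the three alternatives of the split pattern do in isolation, here is a minimal sketch that applies the same regex to a single line. The class name `SplitDemo`, the constant `DELIM` and the helper `fields` are hypothetical names introduced only for this illustration; the sample line reuses values from the file above, with extra spaces around one comma to exercise the `\\s*` parts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitDemo {
    // Same pattern as in getArrayFromLine: opening quote | "," separator | closing quote
    static final String DELIM = "(^\\s*\")|(\"\\s*,\\s*\")|(\"\\s*$)";

    // Hypothetical helper: splits one line and drops the leading empty string
    static List<String> fields(String line) {
        String[] arr = line.split(DELIM);
        List<String> out = new ArrayList<>();
        for (int i = 1; i < arr.length; i++) {
            out.add(arr[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        String line = "\"5.55\" , \"16,666,666\",\"17\"";

        // Raw split result: the match at the start of the line yields an empty first element
        System.out.println(Arrays.toString(line.split(DELIM))); // [, 5.55, 16,666,666, 17]

        // After skipping index 0, commas inside a quoted field are untouched
        System.out.println(fields(line)); // [5.55, 16,666,666, 17]
    }
}
```

The commas inside `16,666,666` survive because the separator alternative only matches a comma that sits between a closing and an opening quote.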
Running CsvfileParser against the sample file prints the following:
Line 1: "1","2","3","4","5","6","7","8","9","10","11","12","13","14","15","16","傅宗龙","18","19","20"
Line 2: "1","2","3","4","5.55","6","7","8","9","10","朱由检","12","13","14","15","16,666,666","17","袁崇焕","19","20"
Line 3: "醉里挑灯看剑,梦回吹角连营","2","3","4","孙传庭","6","7","8","9","10","11","12","13","14","15","16","17","18","19","20"
Line 4: ",,,,,,,,,","2","3","4","熊廷弼","6","7","8","9","10","11","12","卢象升","14","15","16","17","18","19","20"
---------------------------
1|2|3|4|5|6|7|8|9|10|11|12|13|14|15|16|傅宗龙|18|19|20
1|2|3|4|5.55|6|7|8|9|10|朱由检|12|13|14|15|16,666,666|17|袁崇焕|19|20
醉里挑灯看剑,梦回吹角连营|2|3|4|孙传庭|6|7|8|9|10|11|12|13|14|15|16|17|18|19|20
,,,,,,,,,|2|3|4|熊廷弼|6|7|8|9|10|11|12|卢象升|14|15|16|17|18|19|20
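One edge case the sample data does not exercise: `String.split` with no limit argument discards trailing empty strings, so a line whose last field is empty (`""`) would come back one field short. The sketch below uses a different technique from the split above. It collects each quoted field with `Pattern` and `Matcher` instead of splitting on the delimiters. The class name `QuotedFieldMatcher` and the helper `fields` are hypothetical, and the sketch assumes that no field contains an escaped double quote.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuotedFieldMatcher {
    // One quoted field: an opening ", any run of non-quote characters (captured), a closing "
    // Assumption: fields never contain escaped quotes such as ""
    private static final Pattern FIELD = Pattern.compile("\"([^\"]*)\"");

    // Hypothetical helper: returns the content of every quoted field, in order
    static List<String> fields(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = FIELD.matcher(line);
        while (m.find()) {
            out.add(m.group(1)); // group 1 is the text between the quotes, possibly empty
        }
        return out;
    }

    public static void main(String[] args) {
        // The last field is empty; it is kept because each field is matched individually
        List<String> row = fields("\",,,\",\"2\",\"\"");
        System.out.println(row.size());            // 3
        System.out.println(String.join("|", row)); // ,,,|2|
    }
}
```

Matching fields rather than delimiters also removes the need to skip a leading empty element, at the cost of silently ignoring any text that is not inside quotes.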
--END-- 2019-11-23 09:14:45