Exporting an object array to a CSV file with opencsv: making the columns follow the fields' declaration order
Preface
Readers with this requirement are presumably already familiar with the basics of converting a bean list to CSV with opencsv, so this article skips the introductory usage.
If you are not, see the (Chinese-language) blog post Java之利用openCsv导出csv文件.
If you want to dig deeper into opencsv, go wrestle with the official documentation.
PS: the approach below came from my own reading of the source code; it may contain mistakes, be overly convoluted, or duplicate a simpler official mechanism. This article is meant to start a discussion; if you know a better implementation, please leave a comment.
The requirement
Convert an object array to a CSV file such that the CSV column order matches the declaration order of the object's fields.
Environment
OpenJDK 11 + opencsv 5.0
The conflict
- opencsv's default strategy sorts columns by field name in ascending alphabetical order, which does not satisfy the requirement.
- The blog post mentioned in the preface actually offers one solution: declare each field's position explicitly with the @CsvBindByPosition annotation. Meeting the requirement that way means annotating every single field, which is tedious; worse, with position binding opencsv no longer derives the CSV header from the field names automatically, so the header must be supplied separately.
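The default behavior can be reproduced with the standard library alone. The sketch below (the Order bean and its fields are invented for illustration) contrasts the declaration order obtained via reflection with the alphabetical order a SortedMap imposes, which is essentially what opencsv's default strategy does; note that getDeclaredFields() returning declaration order is observed HotSpot behavior, not a JLS guarantee.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class DefaultOrderDemo {
    // Hypothetical export bean; note the declaration order of the fields.
    static class Order { String orderId; String customer; double amount; }

    // Header order as declared in the class (what we want).
    public static List<String> declarationOrder(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            names.add(f.getName().toUpperCase());
        }
        return names;
    }

    // Header order the default strategy produces: a SortedMap sorts its keys
    // alphabetically, mimicking opencsv's internal simpleMap.
    public static List<String> defaultOrder(Class<?> type) {
        TreeMap<String, Field> simpleMap = new TreeMap<>();
        for (Field f : type.getDeclaredFields()) {
            simpleMap.put(f.getName().toUpperCase(), f);
        }
        return new ArrayList<>(simpleMap.keySet());
    }

    public static void main(String[] args) {
        System.out.println(declarationOrder(Order.class)); // [ORDERID, CUSTOMER, AMOUNT]
        System.out.println(defaultOrder(Order.class));     // [AMOUNT, CUSTOMER, ORDERID]
    }
}
```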
The solution
If you are in a hurry to ship the feature, jump straight to section 1.
1. Ready-to-use version
Full code
public <T> String generateCsvFile(List<? extends T> exportResults, String fileName)
        throws IOException, CsvDataTypeMismatchException, CsvRequiredFieldEmptyException {
    String finalFileName = new File(nginxDownloadPath,
            fileName + System.currentTimeMillis() + ".csv").getPath();
    Writer writer = new FileWriter(finalFileName);
    // Note: StatefulBeanToCsv builds its own CSVWriter from `writer`, so this
    // instance is never used for the bean rows; closing it below also flushes
    // and closes the underlying writer.
    CSVWriter csvWriter = new CSVWriter(
            writer,
            CSVWriter.DEFAULT_SEPARATOR,
            CSVWriter.DEFAULT_QUOTE_CHARACTER,
            CSVWriter.NO_ESCAPE_CHARACTER,
            CSVWriter.DEFAULT_LINE_END);
    if (!exportResults.isEmpty()) {
        // Write the rows
        Class<?> beanClass = exportResults.get(0).getClass();
        StatefulBeanToCsv beanToCsv = new StatefulBeanToCsvBuilder<T>(writer)
                .withMappingStrategy(new OrderColumnMappingStrategy(beanClass))
                // Only the first @CsvIgnore-annotated field is skipped here;
                // loop over the stream result if several fields must be ignored.
                .withIgnoreField(beanClass, Arrays.stream(beanClass.getDeclaredFields())
                        .filter(one -> {
                            one.setAccessible(true);
                            return one.isAnnotationPresent(CsvIgnore.class);
                        })
                        .findFirst().orElse(null))
                .build();
        beanToCsv.write(exportResults);
    }
    csvWriter.close();
    writer.close();
    return finalFileName;
}
public class OrderColumnMappingStrategy<T> extends HeaderColumnNameMappingStrategy<T> {

    private Locale errorLocale = Locale.getDefault();

    public OrderColumnMappingStrategy(Class<? extends T> type) {
        super();
        this.setErrorLocale(errorLocale);
        this.setType(type);
    }

    @Override
    public String[] generateHeader(T bean) throws CsvRequiredFieldEmptyException {
        if (type == null) {
            throw new IllegalStateException(ResourceBundle
                    .getBundle(ICSVParser.DEFAULT_BUNDLE_NAME, errorLocale)
                    .getString("type.before.header"));
        }
        if (headerIndex.isEmpty()) {
            List<String> realHeaderList = new ArrayList<>();
            // getFieldNameForCsvHeader() reads the object's fields via
            // reflection and returns them in declaration order; its code is
            // omitted here.
            getFieldNameForCsvHeader(type).forEach(one -> realHeaderList.add(one.toUpperCase()));
            String[] header = realHeaderList.toArray(new String[0]);
            headerIndex.initializeHeaderIndex(header);
            return header;
        }
        return headerIndex.getHeaderIndex();
    }
}
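The strategy above leans on a getFieldNameForCsvHeader() helper whose code is omitted. Purely as an assumption (this is a guess at its shape, not the author's actual implementation), it could look like the following: walk getDeclaredFields(), which on HotSpot yields fields in declaration order (the spec does not guarantee this), and skip compiler-generated synthetic fields; the real helper presumably also honors @CsvIgnore.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

public class FieldNameHelper {

    // Hypothetical reconstruction of the omitted helper: field names in
    // declaration order, as returned by reflection on HotSpot.
    public static List<String> getFieldNameForCsvHeader(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            // Skip synthetic members such as this$0 or coverage-agent fields.
            if (!f.isSynthetic()) {
                names.add(f.getName());
            }
        }
        return names;
    }

    // Hypothetical export bean used only to exercise the helper.
    static class Order { String orderId; String customer; double amount; }

    public static void main(String[] args) {
        System.out.println(getFieldNameForCsvHeader(Order.class)); // [orderId, customer, amount]
    }
}
```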
Key code, explained
- Extend HeaderColumnNameMappingStrategy and override String[] generateHeader(T bean); the idea is to change what the headerIndex object is initialized with
public class OrderColumnMappingStrategy<T> extends HeaderColumnNameMappingStrategy<T> {

    private Locale errorLocale = Locale.getDefault();

    public OrderColumnMappingStrategy(Class<? extends T> type) {
        super();
        this.setErrorLocale(errorLocale);
        this.setType(type);
    }

    @Override
    public String[] generateHeader(T bean) throws CsvRequiredFieldEmptyException {
        if (type == null) {
            throw new IllegalStateException(ResourceBundle
                    .getBundle(ICSVParser.DEFAULT_BUNDLE_NAME, errorLocale)
                    .getString("type.before.header"));
        }
        if (headerIndex.isEmpty()) {
            List<String> realHeaderList = new ArrayList<>();
            // getFieldNameForCsvHeader() reads the object's fields via
            // reflection and returns them in declaration order; its code is
            // omitted here.
            getFieldNameForCsvHeader(type).forEach(one -> realHeaderList.add(one.toUpperCase()));
            String[] header = realHeaderList.toArray(new String[0]);
            // In fact both the final CSV column ordering and the ordered
            // retrieval of row data are driven by the headerIndex object, so
            // it is enough to build `header` in the desired order here;
            // `header` is then used to initialize headerIndex.
            headerIndex.initializeHeaderIndex(header);
            return header;
        }
        return headerIndex.getHeaderIndex();
    }
}
- Register the custom mapping strategy when building the StatefulBeanToCsv object
// Note the .withMappingStrategy() call: this is where the custom strategy is
// registered.
StatefulBeanToCsv beanToCsv = new StatefulBeanToCsvBuilder<T>(writer)
        .withMappingStrategy(new OrderColumnMappingStrategy(exportResults.get(0).getClass()))
        .withIgnoreField(exportResults.get(0).getClass(),
                Arrays.stream(exportResults.get(0).getClass().getDeclaredFields())
                        .filter(one -> {
                            one.setAccessible(true);
                            return one.isAnnotationPresent(CsvIgnore.class);
                        })
                        .findFirst().orElse(null))
        .build();
- Write the result file and have a look at the output
beanToCsv.write(exportResults);
2. TL;DR version
Under the hood, opencsv stores the correspondence between CSV columns and the object's fields (Field) in a SortedMap instance called simpleMap: its keys are the upper-cased field names and its values the matching Field objects. Because it is a SortedMap, its entries are ordered by the keys' natural order (ascending alphabetical). From simpleMap's keySet, opencsv then builds a MultiValuedMap<String, Integer> instance named headerToPosition, whose keys are again the upper-cased field names (i.e. the CSV headers) and whose values are each field's column position in the CSV file. Later, when the file is actually written, opencsv uses headerToPosition to resolve a column index to the upper-cased field name, then goes back to simpleMap to find the corresponding Field instance.
That, roughly, is how opencsv writes a list of objects to a CSV file; adjusting headerToPosition is all it takes to meet the requirement. If you want to dig deeper, read on. I suggest stepping through the code in a debugger as you follow along, and if you find anything that contradicts my account, feel free to discuss it and point out the mistakes.
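The interplay of the two maps can be simulated with the standard library alone (the Order bean and its fields are invented for illustration; simpleMap and headerToPosition are the names used in the opencsv source). The point: once the position map is built from a header array in declaration order, simpleMap's alphabetical ordering no longer influences the output, because every column index is resolved through the position map first.

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class TwoMapSimulation {
    // Hypothetical export bean; note the declaration order of the fields.
    static class Order { String orderId; String customer; double amount; }

    // simpleMap: upper-cased field name -> Field, alphabetically sorted.
    private static final TreeMap<String, Field> simpleMap = new TreeMap<>();
    // Our header array, in declaration order (HotSpot returns declared fields
    // in source order, though the JLS does not promise it).
    private static final String[] header;
    // headerToPosition: header name -> column index, built from OUR header
    // array instead of simpleMap's alphabetical key set.
    private static final Map<String, Integer> headerToPosition = new HashMap<>();

    static {
        Field[] declared = Order.class.getDeclaredFields();
        header = new String[declared.length];
        for (int i = 0; i < declared.length; i++) {
            String key = declared[i].getName().toUpperCase();
            simpleMap.put(key, declared[i]);
            header[i] = key;
            headerToPosition.put(key, i);
        }
    }

    // Resolve column i the way opencsv does: index -> header name -> Field.
    public static String fieldNameAt(int i) {
        return simpleMap.get(header[i]).getName();
    }

    public static void main(String[] args) {
        System.out.println(simpleMap.keySet()); // [AMOUNT, CUSTOMER, ORDERID]
        System.out.println(fieldNameAt(0));     // orderId (declaration order wins)
    }
}
```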
3. The long-winded source-code walkthrough
Everything from the call to StatefulBeanToCsv's write(List<T> beans) onward is opencsv territory.
/**
 * Writes a list of beans out to the Writer provided to the constructor.
 *
 * @param beans A list of beans to be written to a CSV destination
 * @throws CsvDataTypeMismatchException If a field of the beans is annotated
 *         improperly or an unsupported data type is supposed to be written
 * @throws CsvRequiredFieldEmptyException If a field is marked as required,
 *         but the source is null
 */
public void write(List<T> beans) throws CsvDataTypeMismatchException,
        CsvRequiredFieldEmptyException {
    if (CollectionUtils.isNotEmpty(beans)) {
        write(beans.iterator());
    }
}
Following the call chain one level down from that write method, we land in the code below; I have added comments at the relevant spots.
public void write(Iterator<T> iBeans) throws CsvDataTypeMismatchException, CsvRequiredFieldEmptyException {
    PeekingIterator<T> beans = new PeekingIterator<>(iBeans);
    T firstBean = beans.peek();
    if (!beans.hasNext()) {
        return;
    }
    // Write header
    if (!headerWritten) {
        // This call prepares everything needed to write the CSV file; the
        // column order is also fixed in this step. In other words, finding a
        // suitable hook inside this call is enough to satisfy our requirement.
        beforeFirstWrite(firstBean);
    }
    executor = new BeanExecutor<>(orderedResults);
    executor.prepare();
    // Process the beans
    try {
        // The CSV lines are produced using multiple threads.
        submitAllLines(beans);
    } catch (RejectedExecutionException e) {
        // An exception in one of the bean writing threads prompted the
        // executor service to shutdown before we were done.
        if (executor.getTerminalException() instanceof RuntimeException) {
            throw (RuntimeException) executor.getTerminalException();
        }
        if (executor.getTerminalException() instanceof CsvDataTypeMismatchException) {
            throw (CsvDataTypeMismatchException) executor.getTerminalException();
        }
        if (executor.getTerminalException() instanceof CsvRequiredFieldEmptyException) {
            throw (CsvRequiredFieldEmptyException) executor
                    .getTerminalException();
        }
        throw new RuntimeException(ResourceBundle.getBundle(ICSVParser.DEFAULT_BUNDLE_NAME, errorLocale)
                .getString("error.writing.beans"), executor.getTerminalException());
    } catch (Exception e) {
        // Exception during parsing. Always unrecoverable.
        // I can't find a way to create this condition in the current
        // code, but we must have a catch-all clause.
        executor.shutdownNow();
        if (executor.getTerminalException() instanceof RuntimeException) {
            throw (RuntimeException) executor.getTerminalException();
        }
        throw new RuntimeException(ResourceBundle.getBundle(ICSVParser.DEFAULT_BUNDLE_NAME, errorLocale)
                .getString("error.writing.beans"), e);
    }
    capturedExceptions.addAll(executor.getCapturedExceptions());
    executor.resultStream().forEach(l -> csvwriter.writeNext(l, applyQuotesToAll));
}
With that understanding, we step into beforeFirstWrite(firstBean) to see what opencsv prepares before writing the CSV file, and to look for our hook.
private void beforeFirstWrite(T bean) throws CsvRequiredFieldEmptyException {
    // Determine mapping strategy. If we do not register a mappingStrategy
    // ourselves, opencsv picks an instance automatically.
    if (mappingStrategy == null) {
        mappingStrategy = OpencsvUtils.determineMappingStrategy((Class<T>) bean.getClass(), errorLocale);
    }
    // Ignore fields. It's possible the mapping strategy has already been
    // primed, so only pass on our data if the user actually gave us
    // something.
    if (!ignoredFields.isEmpty()) {
        mappingStrategy.ignoreFields(ignoredFields);
    }
    // Build CSVWriter
    if (csvwriter == null) {
        csvwriter = new CSVWriter(writer, separator, quotechar, escapechar, lineEnd);
    }
    // Write the header. This is our hook. Why here? Let's first look at how
    // the default mappingStrategy generates the header.
    String[] header = mappingStrategy.generateHeader(bean);
    if (header.length > 0) {
        csvwriter.writeNext(header, applyQuotesToAll);
    }
    headerWritten = true;
}
Looking at the OpencsvUtils.determineMappingStrategy method:
static <T> MappingStrategy<T> determineMappingStrategy(Class<? extends T> type, Locale errorLocale) {
    // Check for annotations
    boolean positionAnnotationsPresent = Stream.of(FieldUtils.getAllFields(type)).anyMatch(
            f -> f.isAnnotationPresent(CsvBindByPosition.class)
                    || f.isAnnotationPresent(CsvBindAndSplitByPosition.class)
                    || f.isAnnotationPresent(CsvBindAndJoinByPosition.class)
                    || f.isAnnotationPresent(CsvCustomBindByPosition.class));
    // Set the mapping strategy according to what we've found.
    MappingStrategy<T> mappingStrategy = positionAnnotationsPresent ?
            new ColumnPositionMappingStrategy<>() :
            new HeaderColumnNameMappingStrategy<>();
    mappingStrategy.setErrorLocale(errorLocale);
    mappingStrategy.setType(type);
    return mappingStrategy;
}
We can see that when no position-binding annotations are present, opencsv defaults to an instance of HeaderColumnNameMappingStrategy. That class does not override generateHeader(T bean) itself, so the call ends up in the implementation inherited from its ancestor class, AbstractMappingStrategy.generateHeader(T bean):
public String[] generateHeader(T bean) throws CsvRequiredFieldEmptyException {
    if (type == null) {
        throw new IllegalStateException(ResourceBundle
                .getBundle(ICSVParser.DEFAULT_BUNDLE_NAME, errorLocale)
                .getString("type.before.header"));
    }
    // Always take what's been given or previously determined first.
    if (headerIndex.isEmpty()) {
        /*
         * getFieldMap() returns a FieldMapByName object, which stores the
         * bean's field information in a SortedMap instance named simpleMap and
         * uses it to generate a header in the corresponding order; a SortedMap
         * orders its entries by the keys' natural order.
         * FieldMapByName.generateHeader also handles mapping several fields
         * onto one column, which my use case does not need at all, so the
         * custom strategy shown earlier simply drops that part.
         */
        String[] header = getFieldMap().generateHeader(bean);
        /*
         * The header passed in here determines the column order written to the
         * CSV file, because headerIndex is initialized from the header array
         * and the final column order follows from it.
         */
        headerIndex.initializeHeaderIndex(header);
        return header;
    }
    // Otherwise, put headers in the right places.
    return headerIndex.getHeaderIndex();
}
The AbstractMappingStrategy.generateHeader(T bean) method above mainly initializes the headerIndex object and returns a header array whose order matches the columns eventually written to the CSV.
Which begs the question: how does this headerIndex determine the CSV column order?
Let's look at the file-writing logic. The task submitted via the thread pool ultimately runs the following code:
@Override
public void run() {
    try {
        // mappingStrategy.transmuteBean(bean) is where a bean is actually
        // turned into one CSV line.
        OpencsvUtils.queueRefuseToAcceptDefeat(resultantLineQueue,
                new OrderedObject<>(lineNumber, mappingStrategy.transmuteBean(bean)));
    } catch (CsvException e) {
        e.setLineNumber(lineNumber);
        if (throwExceptions) {
            throw new RuntimeException(e);
        }
        OpencsvUtils.queueRefuseToAcceptDefeat(thrownExceptionsQueue,
                new OrderedObject<>(lineNumber, e));
    } catch (CsvRuntimeException csvre) {
        // Rethrowing exception here because I do not want the
        // CsvRuntimeException caught and rewrapped in the catch below.
        throw csvre;
    } catch (Exception t) {
        throw new RuntimeException(t);
    }
}
Stepping into mappingStrategy.transmuteBean(bean): this method ultimately returns an array of the bean's field values whose order matches the fixed columns.
@Override
public String[] transmuteBean(T bean) throws CsvDataTypeMismatchException, CsvRequiredFieldEmptyException {
    int numColumns = headerIndex.findMaxIndex() + 1;
    BeanField<T, K> firstBeanField, subsequentBeanField;
    K firstIndex, subsequentIndex;
    List<String> contents = new ArrayList<>(Math.max(numColumns, 0));
    // Create a map of types to instances of subordinate beans
    Map<Class<?>, Object> instanceMap;
    try {
        instanceMap = indexBean(bean);
    } catch (IllegalAccessException | InvocationTargetException e) {
        // Our testing indicates these exceptions probably can't be thrown,
        // but they're declared, so we have to deal with them. It's an
        // alibi catch block.
        CsvBeanIntrospectionException csve = new CsvBeanIntrospectionException(
                ResourceBundle.getBundle(
                        ICSVParser.DEFAULT_BUNDLE_NAME, errorLocale)
                        .getString("error.introspecting.beans"));
        csve.initCause(e);
        throw csve;
    }
    // This for loop ultimately collects the bean's field values in the fixed
    // column order.
    for (int i = 0; i < numColumns;) {
        // Determine the first value. This is where the HeaderIndex instance is
        // used to look up the header and the field object (Field) for index i.
        firstBeanField = findField(i);
        firstIndex = chooseMultivaluedFieldIndexFromHeaderIndex(i);
        String[] fields = firstBeanField != null
                ? firstBeanField.write(instanceMap.get(firstBeanField.getType()), firstIndex)
                : ArrayUtils.EMPTY_STRING_ARRAY;
        if (fields.length == 0) {
            // Write the only value
            contents.add(StringUtils.EMPTY);
            i++; // Advance the index
        } else {
            // Multiple values. Write the first.
            contents.add(StringUtils.defaultString(fields[0]));
            // Now write the rest.
            // We must make certain that we don't write more fields
            // than we have columns of the correct type to cover them.
            int j = 1;
            int displacedIndex = i + j;
            subsequentBeanField = findField(displacedIndex);
            subsequentIndex = chooseMultivaluedFieldIndexFromHeaderIndex(displacedIndex);
            while (j < fields.length
                    && displacedIndex < numColumns
                    && Objects.equals(firstBeanField, subsequentBeanField)
                    && Objects.equals(firstIndex, subsequentIndex)) {
                // This field still has a header, so add it
                contents.add(StringUtils.defaultString(fields[j]));
                // Prepare for the next loop through
                displacedIndex = i + (++j);
                subsequentBeanField = findField(displacedIndex);
                subsequentIndex = chooseMultivaluedFieldIndexFromHeaderIndex(displacedIndex);
            }
            i = displacedIndex; // Advance the index
            // And here's where we fill in any fields that are missing to
            // cover the number of columns of the same type
            if (i < numColumns) {
                subsequentBeanField = findField(i);
                subsequentIndex = chooseMultivaluedFieldIndexFromHeaderIndex(i);
                while (Objects.equals(firstBeanField, subsequentBeanField)
                        && Objects.equals(firstIndex, subsequentIndex)
                        && i < numColumns) {
                    contents.add(StringUtils.EMPTY);
                    subsequentBeanField = findField(++i);
                    subsequentIndex = chooseMultivaluedFieldIndexFromHeaderIndex(i);
                }
            }
        }
    }
    return contents.toArray(new String[0]);
}
One line inside the for loop above is crucial: firstBeanField = findField(i). It uses the index i to resolve the Field instance.
@Override
protected BeanField<T, String> findField(int col) throws CsvBadConverterException {
    BeanField<T, String> beanField = null;
    // Pass the index in to resolve the corresponding upper-cased field name;
    // see the next code block.
    String columnName = getColumnName(col);
    if (columnName == null) {
        return null;
    }
    columnName = columnName.trim();
    if (!columnName.isEmpty()) {
        // Use the upper-cased field name to fetch the matching Field instance
        // from simpleMap (the map inside fieldMap that stores the column/field
        // information).
        beanField = fieldMap.get(columnName.toUpperCase());
    }
    return beanField;
}

String getColumnName(int col) {
    // This delegates to headerIndex.getByPosition.
    // headerIndex is never null because it's final
    return headerIndex.getByPosition(col);
}

public String getByPosition(int i) {
    if (i < positionToHeader.length) {
        // Here the upper-cased field name is retrieved by index.
        return positionToHeader[i];
    }
    return null;
}
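The index-to-name-to-Field chain shown in the last three methods can be modeled with a few lines of plain Java (the header names are invented for illustration; positionToHeader and fieldMap mirror the opencsv members of the same names, with Strings standing in for BeanField instances):

```java
import java.util.HashMap;
import java.util.Map;

public class FindFieldModel {
    // positionToHeader: column index -> upper-cased header name.
    private final String[] positionToHeader;
    // fieldMap stand-in: upper-cased header name -> bound field, here a String.
    private final Map<String, String> fieldMap = new HashMap<>();

    public FindFieldModel(String[] header) {
        this.positionToHeader = header.clone();
        for (String h : header) {
            fieldMap.put(h, "Field<" + h.toLowerCase() + ">");
        }
    }

    // Mirrors HeaderIndex.getByPosition: out-of-range indices yield null.
    public String getByPosition(int i) {
        return i < positionToHeader.length ? positionToHeader[i] : null;
    }

    // Mirrors findField: index -> header name -> field, null when unmapped.
    public String findField(int col) {
        String columnName = getByPosition(col);
        if (columnName == null || columnName.trim().isEmpty()) {
            return null;
        }
        return fieldMap.get(columnName.trim().toUpperCase());
    }

    public static void main(String[] args) {
        FindFieldModel m = new FindFieldModel(new String[]{"ORDERID", "CUSTOMER", "AMOUNT"});
        System.out.println(m.findField(0)); // Field<orderid>
        System.out.println(m.findField(5)); // null
    }
}
```

Whatever order the header array is built in becomes the lookup order here, which is why overriding generateHeader() alone is enough to control the column order.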
At this point the column order of the final CSV file is essentially settled; what follows is collecting these arrays and writing them to the file.
Closing remarks
Given my limited skill and the haste with which I read the source, there may well be mistakes or omissions; if you spot any, please point them out in the comments!
posted on 2021-02-08 15:48 by gaarakseven, 2032 views, 0 comments