Java 有什么工具可以快速将CSV 存入关系型数据库
在Java中,有多种工具和库可以快速将CSV文件的数据导入到数据库中。以下是几种常用的方法和工具:
1. 使用OpenCSV和JDBC
OpenCSV是一个非常流行的库,可以轻松读取和写入CSV文件。结合JDBC,可以将CSV文件的数据快速存储到数据库中。
示例代码
首先,添加OpenCSV库的依赖(假设使用Maven):
<dependency>
<groupId>com.opencsv</groupId>
<artifactId>opencsv</artifactId>
<version>5.5.2</version>
</dependency>
然后,使用以下代码将CSV文件的数据存入数据库:
import com.opencsv.CSVReader;
import java.io.FileReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
public class CsvToDatabase {
public static void main(String[] args) {
String jdbcURL = "jdbc:postgresql://localhost:5432/yourdatabase";
String username = "yourusername";
String password = "yourpassword";
String csvFilePath = "path/to/your/file.csv";
Connection connection = null;
try {
connection = DriverManager.getConnection(jdbcURL, username, password);
connection.setAutoCommit(false);
String sql = "INSERT INTO your_table (column1, column2, column3) VALUES (?, ?, ?)";
PreparedStatement statement = connection.prepareStatement(sql);
CSVReader reader = new CSVReader(new FileReader(csvFilePath));
String[] nextLine;
while ((nextLine = reader.readNext()) != null) {
statement.setString(1, nextLine[0]);
statement.setString(2, nextLine[1]);
statement.setString(3, nextLine[2]);
statement.addBatch();
}
statement.executeBatch();
connection.commit();
reader.close();
statement.close();
connection.close();
System.out.println("CSV data has been inserted into the database successfully.");
} catch (Exception e) {
e.printStackTrace();
if (connection != null) {
try {
connection.rollback();
} catch (Exception ex) {
ex.printStackTrace();
}
}
}
}
}
2. 使用Spring Batch
Spring Batch是一个强大的批处理框架,适合处理大规模数据导入操作。它可以通过配置文件定义批处理作业,并提供事务管理和重试机制。
示例代码
首先,添加Spring Batch和其他必要依赖(假设使用Maven):
<dependency>
<groupId>org.springframework.batch</groupId>
<artifactId>spring-batch-core</artifactId>
<version>4.3.4</version>
</dependency>
<dependency>
<groupId>org.springframework.batch</groupId>
<artifactId>spring-batch-infrastructure</artifactId>
<version>4.3.4</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-batch</artifactId>
<version>2.5.4</version>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-jdbc</artifactId>
<version>2.5.4</version>
</dependency>
<dependency>
<groupId>org.springframework</groupId>
<artifactId>spring-jdbc</artifactId>
<version>5.3.10</version>
</dependency>
<dependency>
<groupId>com.h2database</groupId>
<artifactId>h2</artifactId>
<version>1.4.200</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>com.opencsv</groupId>
<artifactId>opencsv</artifactId>
<version>5.5.2</version>
</dependency>
然后,创建一个Spring Batch配置类:
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.item.database.BeanPropertyItemSqlParameterSourceProvider;
import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
import org.springframework.batch.item.file.mapping.DefaultLineMapper;
import org.springframework.batch.item.file.mapping.DelimitedLineTokenizer;
import org.springframework.batch.item.file.transform.LineTokenizer;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ClassPathResource;
import org.springframework.jdbc.core.JdbcTemplate;
@Configuration
@EnableBatchProcessing
public class BatchConfig {
@Autowired
public JobBuilderFactory jobBuilderFactory;
@Autowired
public StepBuilderFactory stepBuilderFactory;
@Autowired
public DataSource dataSource;
@Bean
public FlatFileItemReader<YourEntity> reader() {
return new FlatFileItemReaderBuilder<YourEntity>()
.name("yourEntityItemReader")
.resource(new ClassPathResource("path/to/your/file.csv"))
.delimited()
.names(new String[]{"column1", "column2", "column3"})
.fieldSetMapper(new BeanWrapperFieldSetMapper<YourEntity>() {{
setTargetType(YourEntity.class);
}})
.build();
}
@Bean
public JdbcBatchItemWriter<YourEntity> writer() {
JdbcBatchItemWriter<YourEntity> writer = new JdbcBatchItemWriter<>();
writer.setItemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<>());
writer.setSql("INSERT INTO your_table (column1, column2, column3) VALUES (:column1, :column2, :column3)");
writer.setDataSource(dataSource);
return writer;
}
@Bean
public Job importUserJob(JobCompletionNotificationListener listener, Step step1) {
return jobBuilderFactory.get("importUserJob")
.incrementer(new RunIdIncrementer())
.listener(listener)
.flow(step1)
.end()
.build();
}
@Bean
public Step step1(JdbcBatchItemWriter<YourEntity> writer) {
return stepBuilderFactory.get("step1")
.<YourEntity, YourEntity>chunk(10)
.reader(reader())
.writer(writer)
.build();
}
}
在这个示例中:
- 定义了一个
FlatFileItemReader
来读取CSV文件。 - 定义了一个
JdbcBatchItemWriter
来将数据写入数据库。 - 创建了一个Job和Step,配置了批处理流程。
3. 使用其他第三方库
除了OpenCSV和Spring Batch,还有其他第三方库可以用于将CSV文件快速导入数据库,如Apache Commons CSV和Super CSV。结合这些库和JDBC,可以实现类似的功能。
结论
根据具体需求选择合适的工具或库。如果只是简单的批量插入,可以使用OpenCSV和JDBC。如果需要处理复杂的批处理任务和大规模数据导入,Spring Batch是一个强大的选择。无论选择哪种方法,都可以通过良好的错误处理和事务管理来确保数据导入的可靠性和一致性。