What tools can I use in Java to quickly load CSV data into a relational database?

In Java, several tools and libraries can quickly import CSV data into a relational database. Here are some commonly used approaches:

1. Using OpenCSV and JDBC

OpenCSV is a popular library for reading and writing CSV files. Combined with JDBC batch inserts, it can load CSV data into a database quickly.

Example code

First, add the OpenCSV dependency (assuming Maven):

<dependency>
    <groupId>com.opencsv</groupId>
    <artifactId>opencsv</artifactId>
    <version>5.5.2</version>
</dependency>

Then use the following code to insert the CSV data into the database:

import com.opencsv.CSVReader;
import java.io.FileReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class CsvToDatabase {
    public static void main(String[] args) {
        String jdbcURL = "jdbc:postgresql://localhost:5432/yourdatabase";
        String username = "yourusername";
        String password = "yourpassword";
        String csvFilePath = "path/to/your/file.csv";

        String sql = "INSERT INTO your_table (column1, column2, column3) VALUES (?, ?, ?)";

        // try-with-resources closes the connection, statement and reader even when an exception is thrown
        try (Connection connection = DriverManager.getConnection(jdbcURL, username, password)) {
            connection.setAutoCommit(false);

            try (PreparedStatement statement = connection.prepareStatement(sql);
                 CSVReader reader = new CSVReader(new FileReader(csvFilePath))) {

                String[] nextLine;
                while ((nextLine = reader.readNext()) != null) {
                    statement.setString(1, nextLine[0]);
                    statement.setString(2, nextLine[1]);
                    statement.setString(3, nextLine[2]);
                    statement.addBatch();            // accumulate rows into one batch
                }

                statement.executeBatch();            // send all rows in a single round trip
                connection.commit();

                System.out.println("CSV data has been inserted into the database successfully.");

            } catch (Exception e) {
                connection.rollback();               // undo the partial import on failure
                e.printStackTrace();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
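
For very large CSV files, accumulating every row into a single batch can use a lot of memory. A common variation, sketched below as a drop-in replacement for the read/insert loop inside the inner try block above (reusing the same reader, statement and connection), is to flush the batch every fixed number of rows; the batch size of 1000 is an arbitrary choice to tune for your data volume:

int batchSize = 1000;                        // arbitrary flush threshold
int count = 0;
String[] nextLine;
while ((nextLine = reader.readNext()) != null) {
    statement.setString(1, nextLine[0]);
    statement.setString(2, nextLine[1]);
    statement.setString(3, nextLine[2]);
    statement.addBatch();
    if (++count % batchSize == 0) {
        statement.executeBatch();            // send the accumulated rows to the database
    }
}
statement.executeBatch();                    // flush any remaining rows
connection.commit();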

2. Using Spring Batch

Spring Batch is a powerful batch-processing framework well suited to large-scale data imports. Batch jobs are defined through configuration, and the framework provides transaction management and retry support.

Example code

First, add Spring Batch and the other required dependencies (assuming Maven). Note that spring-boot-starter-batch already pulls in spring-batch-core, spring-batch-infrastructure and spring-jdbc, so the explicit entries below are optional when you use the starters:

<dependency>
    <groupId>org.springframework.batch</groupId>
    <artifactId>spring-batch-core</artifactId>
    <version>4.3.4</version>
</dependency>
<dependency>
    <groupId>org.springframework.batch</groupId>
    <artifactId>spring-batch-infrastructure</artifactId>
    <version>4.3.4</version>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-batch</artifactId>
    <version>2.5.4</version>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-jdbc</artifactId>
    <version>2.5.4</version>
</dependency>
<dependency>
    <groupId>org.springframework</groupId>
    <artifactId>spring-jdbc</artifactId>
    <version>5.3.10</version>
</dependency>
<dependency>
    <groupId>com.h2database</groupId>
    <artifactId>h2</artifactId>
    <version>1.4.200</version>
    <scope>runtime</scope>
</dependency>
<dependency>
    <groupId>com.opencsv</groupId>
    <artifactId>opencsv</artifactId>
    <version>5.5.2</version>
</dependency>

Then create a Spring Batch configuration class:

import javax.sql.DataSource;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.item.database.BeanPropertyItemSqlParameterSourceProvider;
import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ClassPathResource;

@Configuration
@EnableBatchProcessing
public class BatchConfig {

    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;

    @Autowired
    public DataSource dataSource;

    @Bean
    public FlatFileItemReader<YourEntity> reader() {
        return new FlatFileItemReaderBuilder<YourEntity>()
                .name("yourEntityItemReader")
                .resource(new ClassPathResource("path/to/your/file.csv"))
                .delimited()
                .names(new String[]{"column1", "column2", "column3"})
                .fieldSetMapper(new BeanWrapperFieldSetMapper<YourEntity>() {{
                    setTargetType(YourEntity.class);
                }})
                .build();
    }

    @Bean
    public JdbcBatchItemWriter<YourEntity> writer() {
        JdbcBatchItemWriter<YourEntity> writer = new JdbcBatchItemWriter<>();
        writer.setItemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<>());
        writer.setSql("INSERT INTO your_table (column1, column2, column3) VALUES (:column1, :column2, :column3)");
        writer.setDataSource(dataSource);
        return writer;
    }

    @Bean
    public Job importUserJob(JobCompletionNotificationListener listener, Step step1) {
        return jobBuilderFactory.get("importUserJob")
                .incrementer(new RunIdIncrementer())
                .listener(listener)
                .flow(step1)
                .end()
                .build();
    }

    @Bean
    public Step step1(JdbcBatchItemWriter<YourEntity> writer) {
        return stepBuilderFactory.get("step1")
                .<YourEntity, YourEntity>chunk(10)
                .reader(reader())
                .writer(writer)
                .build();
    }
}

In this example:

  1. A FlatFileItemReader is defined to read the CSV file.
  2. A JdbcBatchItemWriter is defined to write the data into the database.
  3. A Job and a Step are created to wire the batch-processing flow together.
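
The configuration assumes two user-defined classes that are not shown here: a YourEntity POJO whose property names match the CSV columns, and a JobCompletionNotificationListener (a JobExecutionListener implementation). A minimal sketch of the entity, assuming the three placeholder columns used above, might look like this:

// Hypothetical entity matching the CSV columns; BeanWrapperFieldSetMapper and
// BeanPropertyItemSqlParameterSourceProvider both rely on these property names.
public class YourEntity {
    private String column1;
    private String column2;
    private String column3;

    public String getColumn1() { return column1; }
    public void setColumn1(String column1) { this.column1 = column1; }
    public String getColumn2() { return column2; }
    public void setColumn2(String column2) { this.column2 = column2; }
    public String getColumn3() { return column3; }
    public void setColumn3(String column3) { this.column3 = column3; }
}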

3. Using other third-party libraries

Besides OpenCSV and Spring Batch, other third-party libraries such as Apache Commons CSV and Super CSV can also be used to import CSV files into a database. Combined with JDBC, they provide similar functionality.
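
As a rough sketch, reading with Apache Commons CSV and batching the inserts over JDBC looks very similar to the OpenCSV version. The connection details, table and column names below are placeholders, and the CSV is assumed to have a header row:

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVRecord;
import java.io.FileReader;
import java.io.Reader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class CommonsCsvToDatabase {
    public static void main(String[] args) throws Exception {
        String sql = "INSERT INTO your_table (column1, column2, column3) VALUES (?, ?, ?)";
        try (Connection connection = DriverManager.getConnection(
                     "jdbc:postgresql://localhost:5432/yourdatabase", "yourusername", "yourpassword");
             PreparedStatement statement = connection.prepareStatement(sql);
             Reader in = new FileReader("path/to/your/file.csv")) {
            connection.setAutoCommit(false);

            // Parse the file, treating the first record as the header row
            Iterable<CSVRecord> records = CSVFormat.DEFAULT
                    .withFirstRecordAsHeader()
                    .parse(in);
            for (CSVRecord record : records) {
                statement.setString(1, record.get(0));
                statement.setString(2, record.get(1));
                statement.setString(3, record.get(2));
                statement.addBatch();
            }

            statement.executeBatch();
            connection.commit();
        }
    }
}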

Conclusion

Choose the tool or library that matches your needs. For simple bulk inserts, OpenCSV with JDBC is sufficient. For complex batch jobs and large-scale imports, Spring Batch is a powerful choice. Whichever approach you take, solid error handling and transaction management help keep the import reliable and consistent.
