Spring Boot: Watching S3 for New Files and Downloading Them as They Arrive (Quartz Polling Version)

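This post shows how to use a Quartz scheduled job in Spring Boot to poll an AWS S3 bucket and download files that have not been downloaded yet. The code below decides this by checking whether a local copy already exists; an object's last-modified time can serve as a stricter check.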

Step 1: Add the dependencies

Add the Spring Boot Quartz starter and the AWS SDK for Java v2 S3 module to pom.xml:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-quartz</artifactId>
</dependency>
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>s3</artifactId>
    <version>2.17.27</version> <!-- pick the version you need -->
</dependency>

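Step 2: Configure AWS credentials and bucket information

Configure the S3 settings in application.properties (or the equivalent keys in application.yml). No credentials appear here: the service class below uses DefaultCredentialsProvider, which reads them from the default chain (environment variables, system properties, the shared credentials file, or an attached role).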

aws.s3.bucketName=your-bucket-name
aws.region=your-region
aws.s3.downloadFolder=/path/to/download/folder

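Step 3: Create the S3 service class

Create a service class that wraps the interaction with S3: listing objects, downloading an object, and checking whether an object has already been downloaded.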

import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;
import software.amazon.awssdk.auth.credentials.DefaultCredentialsProvider;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.GetObjectRequest;
import software.amazon.awssdk.services.s3.model.ListObjectsV2Request;
import software.amazon.awssdk.services.s3.model.S3Object;

import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

@Service
public class S3Service {

    private final S3Client s3Client;
    private final String bucketName;
    private final String downloadFolder;

    public S3Service(@Value("${aws.s3.bucketName}") String bucketName,
                     @Value("${aws.region}") String region,
                     @Value("${aws.s3.downloadFolder}") String downloadFolder) {
        this.s3Client = S3Client.builder()
                .region(Region.of(region))
                .credentialsProvider(DefaultCredentialsProvider.create())
                .build();
        this.bucketName = bucketName;
        this.downloadFolder = downloadFolder;
    }

    // List all objects in the bucket. A single ListObjectsV2 call returns at most
    // 1,000 keys, so the paginator is used to follow the continuation tokens.
    public List<S3Object> listFiles() {
        ListObjectsV2Request request = ListObjectsV2Request.builder()
                .bucket(bucketName)
                .build();

        List<S3Object> objects = new ArrayList<>();
        s3Client.listObjectsV2Paginator(request).contents().forEach(objects::add);
        return objects;
    }

    // Download one object into the download folder
    public void downloadFile(String keyName) throws IOException {
        Path target = Paths.get(downloadFolder, keyName);
        if (keyName.endsWith("/")) {
            // Zero-byte "folder" placeholder: nothing to download, just mirror the directory
            Files.createDirectories(target);
            return;
        }
        // Keys may contain "/" (for example reports/2024/a.csv), so create parent directories first
        Files.createDirectories(target.getParent());

        GetObjectRequest request = GetObjectRequest.builder()
                .bucket(bucketName)
                .key(keyName)
                .build();

        // Write to a temporary ".part" file and rename it once complete, so an
        // interrupted download is never mistaken for a finished one.
        Path partial = target.resolveSibling(target.getFileName() + ".part");
        try (InputStream in = s3Client.getObject(request)) {
            Files.copy(in, partial, StandardCopyOption.REPLACE_EXISTING);
        }
        Files.move(partial, target, StandardCopyOption.REPLACE_EXISTING);

        System.out.println("Downloaded file: " + target.toAbsolutePath());
    }

    // Check whether the object has already been downloaded (a local copy exists)
    public boolean isFileDownloaded(String keyName) {
        return Files.exists(Paths.get(downloadFolder, keyName));
    }
}
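
S3 keys are arbitrary strings, so a key such as ../../etc/passwd joined onto the download folder could point outside it. The following is a minimal sketch of a guard, not part of the original service: KeyPathResolver and its resolve method are hypothetical names introduced for illustration, and it assumes that rejecting such keys (rather than sanitizing them) is acceptable.

```java
import java.nio.file.Path;

// Hypothetical helper (not in the original post): maps an S3 key to a local
// path and refuses keys that would resolve outside the download folder.
final class KeyPathResolver {

    private KeyPathResolver() {
    }

    static Path resolve(Path baseDir, String key) {
        Path base = baseDir.toAbsolutePath().normalize();
        // Strip leading slashes so the key is always treated as relative to base
        String relative = key.replaceFirst("^/+", "");
        // normalize() collapses "." and ".." segments before the containment check
        Path target = base.resolve(relative).normalize();
        if (!target.startsWith(base)) {
            throw new IllegalArgumentException("Key escapes download folder: " + key);
        }
        return target;
    }
}
```

A service could call KeyPathResolver.resolve(Paths.get(downloadFolder), keyName) wherever it builds a local path from a key, and treat the IllegalArgumentException as a reason to skip that object.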

Step 4: Create the Quartz job

Create the Quartz job class. It polls the S3 bucket periodically and downloads every object that does not yet have a local copy.

import org.quartz.DisallowConcurrentExecution;
import org.quartz.Job;
import org.quartz.JobExecutionContext;
import org.quartz.JobExecutionException;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;
import software.amazon.awssdk.services.s3.model.S3Object;

import java.io.IOException;
import java.util.List;

// Prevent a new run from starting while a slow previous run is still downloading
@DisallowConcurrentExecution
@Component
public class S3FilePollingJob implements Job {

    private final S3Service s3Service;

    @Autowired
    public S3FilePollingJob(S3Service s3Service) {
        this.s3Service = s3Service;
    }

    @Override
    public void execute(JobExecutionContext context) throws JobExecutionException {
        // Fetch the list of objects in the bucket
        List<S3Object> s3Objects = s3Service.listFiles();

        for (S3Object s3Object : s3Objects) {
            String keyName = s3Object.key();
            if (!s3Service.isFileDownloaded(keyName)) {
                try {
                    s3Service.downloadFile(keyName);
                } catch (IOException e) {
                    e.printStackTrace();
                }
            } else {
                System.out.println("File already downloaded: " + keyName);
            }
        }
    }
}
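
An existence check alone misses objects that are overwritten in S3 after the first download. As a minimal sketch of a last-modified comparison, the helper below treats an object as needing download when there is no local copy or when the remote timestamp is later than the local file's modification time. FreshnessCheck and needsDownload are hypothetical names, not part of the original code, and the sketch assumes the local clock is reasonably in sync with S3.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Instant;

// Hypothetical helper (not in the original post): decides whether a remote
// object should be downloaded, based on its last-modified timestamp.
final class FreshnessCheck {

    private FreshnessCheck() {
    }

    static boolean needsDownload(Path localFile, Instant remoteLastModified) throws IOException {
        if (!Files.isRegularFile(localFile)) {
            // No local copy yet
            return true;
        }
        // The local modification time is the time of the last download, so a
        // later remote timestamp means the object was re-uploaded since then.
        Instant localModified = Files.getLastModifiedTime(localFile).toInstant();
        return remoteLastModified.isAfter(localModified);
    }
}
```

In the polling loop, the remote timestamp is available as s3Object.lastModified(), which returns an Instant in the AWS SDK for Java v2.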

Step 5: Configure the Quartz schedule

In a Spring Boot configuration class, define the job and set how often it runs.

import org.quartz.*;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class QuartzConfig {

    @Bean
    public JobDetail s3FilePollingJobDetail() {
        return JobBuilder.newJob(S3FilePollingJob.class)
                .withIdentity("s3FilePollingJob")
                .storeDurably()
                .build();
    }

    @Bean
    public Trigger s3FilePollingTrigger() {
        return TriggerBuilder.newTrigger()
                .forJob(s3FilePollingJobDetail())
                .withIdentity("s3FilePollingTrigger")
                .withSchedule(SimpleScheduleBuilder.simpleSchedule()
                        .withIntervalInMinutes(5)  // poll every 5 minutes
                        .repeatForever())
                .build();
    }
}

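Step 6: Start the Spring Boot application

Once the configuration and code are in place, start the Spring Boot application. It polls the S3 bucket every 5 minutes (or at the interval you configured) and downloads any file that has not been downloaded yet.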

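Summary

This approach is simple and effective, and it suits cases where files change infrequently. If files change frequently, consider S3 event notifications delivered to SQS for near-real-time listening instead.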

Posted 2024-08-14 20:54 by gongchengship