[Original] Talend ETL Development: Unified Email Sending Based on a Joblet
Updates
2019/1/16: Added fields for sending up to 5 attachments and for SMTP and related settings.
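1. Background
ETL data integration almost always involves email notifications: reports on ETL execution status, run durations, updates to key data, and so on. This information has to reach the relevant operations staff or the consumers of the BI data promptly.
However, if the basics of email delivery are not planned up front, sending can become chaotic and hard to manage later. For example, if everyone has their own ETL jobs and builds their own email notifications, then over time it becomes hard to know which emails should be cancelled and which recipients should be removed. You may have to open every ETL job to check and modify it, which is time-consuming and makes management very difficult.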
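2. Implementation
When implementing this solution, I mainly considered the following points:
(1) Anyone who needs to send an email should not have to drag in and wire up the whole set of components again. Dragging in one shared component should be enough, which is why I chose a joblet.
(2) The common base information for emails, such as sender and recipient lists and send records, must be maintained in one place. I therefore designed database tables to hold it, so that updating the database is enough for every job to use the same information.
(3) How a message is generated, its status, and how it is sent should all be flexibly controllable. So I designed a table to store this information, and the concrete email content is generated by a stored procedure, which also makes the send history traceable.
(4) Because Talend joblets support variables, I moved as many of the mail-sending component's settings as possible into the database tables, which makes them easy to maintain and change.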
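2.1 Database table design
The design consists mainly of two tables: mail_send_group and mail_send_list_rec.
mail_send_group: records the relationship between a sender and its recipients. Keeping it here makes later maintenance simpler, because a change in the database takes effect globally.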
IF (OBJECT_ID(N'[chk].[mail_send_group]', N'U') IS NOT NULL)
BEGIN
    PRINT N'删除表:[chk].[mail_send_group]';
    DROP TABLE [chk].[mail_send_group];
END
GO
CREATE TABLE [chk].[mail_send_group]
(
    [group_id] NVARCHAR(50) NOT NULL,     -- primary key
    [mail_to] NVARCHAR(1000) NOT NULL,    -- recipient addresses; separate multiple addresses with ;
    [mail_from] NVARCHAR(100) NOT NULL,   -- sender address
    [sender_name] NVARCHAR(100) NOT NULL, -- sender display name
    [mail_cc] NVARCHAR(1000) NULL,        -- CC addresses; separate multiple addresses with ;
    [mail_bcc] NVARCHAR(100) NULL,        -- BCC addresses; separate multiple addresses with ;
    [smtp_host] NVARCHAR(100) NOT NULL,   -- SMTP host address
    [smtp_port] INT NOT NULL,             -- SMTP host port
    [user_name] NVARCHAR(50) NOT NULL,    -- mailbox user name
    [user_pwd] NVARCHAR(50) NOT NULL,     -- mailbox password
    [create_date] DATETIME NOT NULL,      -- creation date
    [status] SMALLINT NULL                -- status (0 = disabled, 1 = enabled)
)
GO
SELECT * FROM [chk].[mail_send_group]
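The mail_to, mail_cc and mail_bcc columns each hold several addresses in one string separated by `;`. As a minimal sketch of how such a value could be sanity-checked before it reaches the send component, the hypothetical helper below splits the string, trims whitespace and drops empty entries. The class and method names are my own and are not part of the joblet, and the helper does not validate the addresses themselves.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original joblet.
final class RecipientList {
    private RecipientList() {}

    /** Splits a ';'-separated address string into trimmed, non-empty entries. */
    static List<String> parse(String raw) {
        List<String> result = new ArrayList<>();
        if (raw == null) {
            return result; // nullable columns such as mail_cc yield an empty list
        }
        for (String part : raw.split(";")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Example addresses are placeholders.
        System.out.println(parse("ops@example.com; bi@example.com;"));
    }
}
```

An empty result for mail_to would be a reason to skip the send and log the group_id instead.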
mail_send_list_rec: records each generated email and its send history. Each row is linked to the table above through group_id, so you can tell who sent each message to whom, and when.
IF (OBJECT_ID(N'[chk].[mail_send_list_rec]', N'U') IS NOT NULL)
BEGIN
    PRINT N'删除表:[chk].[mail_send_list_rec]';
    DROP TABLE [chk].[mail_send_list_rec];
END
GO
CREATE TABLE [chk].[mail_send_list_rec]
(
    [mail_id] NVARCHAR(50) NOT NULL,    -- primary key
    [group_id] NVARCHAR(50) NOT NULL,   -- owning group id; determines sender, recipients, etc.
    [scope] NVARCHAR(100) NOT NULL,     -- business scope; separates mails generated by different businesses, effectively a category
    [subject] NVARCHAR(100) NOT NULL,   -- subject
    [message] NVARCHAR(4000) NOT NULL,  -- body; HTML is supported and recommended
    [create_date] DATETIME NOT NULL,    -- creation date
    [send_date] DATETIME NULL,          -- send date
    [send_status] SMALLINT NULL,        -- send status (0 = created, not yet sent; 1 = sent)
    [atta01_path] NVARCHAR(200) NULL,   -- absolute path of attachment 1, including file name, e.g. /data/mailatts/ids/checkret.csv
    [atta02_path] NVARCHAR(200) NULL,   -- absolute path of attachment 2, including file name
    [atta03_path] NVARCHAR(200) NULL,   -- absolute path of attachment 3, including file name
    [atta04_path] NVARCHAR(200) NULL,   -- absolute path of attachment 4, including file name
    [atta05_path] NVARCHAR(200) NULL    -- absolute path of attachment 5, including file name
)
GO
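2.2 Joblet development
(1) The joblet uses an input component. Its input parameter is mail_id, the ID of the email, which the caller must pass in when invoking the joblet.
(2) tFlowToIterate turns mail_id into a global variable, which is passed to the tMSSqlInput component in step 3.
(3) This component queries the database tables by mail_id for the full email details, which the send step that follows needs.
"SELECT [a].[mail_id]
       ,[a].[subject]
       ,[a].[message]
       ,[b].[mail_from]
       ,[b].[mail_to]
       ,[b].[sender_name]
       ,[b].[mail_cc]
       ,[b].[mail_bcc]
       ,[b].[status]
       ,[a].[atta01_path]
       ,[a].[atta02_path]
       ,[a].[atta03_path]
       ,[a].[atta04_path]
       ,[a].[atta05_path]
       ,[b].[smtp_host]
       ,[b].[smtp_port]
       ,[b].[user_name]
       ,[b].[user_pwd]
 FROM [chk].[mail_send_list_rec] AS a
 INNER JOIN [chk].[mail_send_group] AS b
    ON ([a].[group_id] = [b].[group_id])
 WHERE [a].[mail_id] = '" + ((String)globalMap.get("curr_mail_id")) + "'
   AND ISNULL([b].[status], 0) = 1"
(4) The send-mail component takes the data queried from the database, passed in as variables, and performs the actual send.
(5) Update the record for the corresponding mail_id in the database to mark it as sent and record the send time. First use tFixedFlowInput to generate the stored procedure parameters, then call the stored procedure with tMSSqlSP to apply the update.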
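Steps 3 and 4 above rely on two inputs that are worth guarding. The query in step 3 concatenates curr_mail_id directly into the SQL text, and step 4 receives up to five attachment paths, any of which may be NULL. The following is a minimal sketch of two guards, assuming mail_id is always a value produced by NEWID(). The class JobletInputs and both method names are hypothetical, and how the original joblet handles empty attachment slots is not shown in the source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Hypothetical helpers, not part of the original joblet.
final class JobletInputs {
    private JobletInputs() {}

    /** Returns the id unchanged if it is a canonical UUID; otherwise throws. */
    static String requireUuid(Object value) {
        if (value == null) {
            throw new IllegalArgumentException("curr_mail_id is not set");
        }
        String text = value.toString().trim();
        UUID parsed = UUID.fromString(text); // throws IllegalArgumentException on malformed input
        // UUID.fromString tolerates shortened groups such as "1-1-1-1-1",
        // so also require the canonical 36-character form.
        if (!parsed.toString().equalsIgnoreCase(text)) {
            throw new IllegalArgumentException("curr_mail_id is not a canonical UUID: " + text);
        }
        return text;
    }

    /** Returns the non-blank attachment paths in their original order. */
    static List<String> nonEmptyPaths(String... paths) {
        List<String> result = new ArrayList<>();
        if (paths == null) {
            return result;
        }
        for (String path : paths) {
            if (path != null && !path.trim().isEmpty()) {
                result.add(path.trim());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String id = requireUuid("8D42D25D-59C7-4A5E-AE9C-4A5F24D910B0");
        System.out.println("WHERE [a].[mail_id] = '" + id + "'");
        System.out.println(nonEmptyPaths("/data/mailatts/ids/checkret.csv", null, "", null, null));
    }
}
```

In a Talend job, code like this would typically live in a custom routine and be called from the query expression, for example `JobletInputs.requireUuid(globalMap.get("curr_mail_id"))` in place of the plain cast, so that a malformed id fails the job instead of reaching the database.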
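2.3 Stored procedures for generating and updating email content
Generating an email: the procedure builds a message from whatever content you want to send and inserts it into the database table.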
IF (OBJECT_ID(N'[chk].[usp_insert_ids_mail_send_list_rec]', N'P') IS NOT NULL)
BEGIN
    PRINT N'删除存储过程:[chk].[usp_insert_ids_mail_send_list_rec]';
    DROP PROC [chk].[usp_insert_ids_mail_send_list_rec];
END
GO
CREATE PROC [chk].[usp_insert_ids_mail_send_list_rec]
(
    @curr_date NVARCHAR(20),
    @atta01_path NVARCHAR(200),
    @atta02_path NVARCHAR(200),
    @atta03_path NVARCHAR(200),
    @atta04_path NVARCHAR(200),
    @atta05_path NVARCHAR(200)
)
AS
--====================================================================================================
-- ProcedureName : chk.usp_insert_ids_mail_send_list_rec
-- Author        : john.xiong
-- CreateDate    : 2019-01-02
-- Description   : generate the detailed content of the daily mail
/*************************************Parameters*******************************************************
-- @curr_date : execution date, YYYYMMDD
-- @atta01_path ... @atta05_path : absolute paths of attachments 1 to 5, stored as passed in
**************************************Modified List****************************************************
-- Modified Date    Modified User    Version      Modified Reason
*******************************************************************************************************
-- 2019-01-02       john.xiong       V01.00.00    Initial version
******************************************************************************************************/
--====================================================================================================
BEGIN
    BEGIN TRY
        DECLARE @begin_time DATETIME
               ,@end_time DATETIME
               ,@cost_time INT;
        -- Server time shifted by +8 hours (presumably UTC to China Standard Time)
        SET @begin_time = DATEADD(HOUR, 8, GETDATE());

        -- Log the start of the run
        INSERT INTO [chk].[tb_proc_cost_log]
        (
            [proc_name] ,[Object_name] ,[execute_time] ,[action] ,[remark] ,[cost_time]
        )
        SELECT N'chk.usp_insert_ids_mail_send_list_rec' AS [proc_name]
              ,N'chk.mail_send_list_rec' AS [Object_name]
              ,@begin_time AS [execute_time]
              ,N'start' AS [action]
              ,'' AS [remark]
              ,0 AS [cost_time]

        DECLARE @mail_id UNIQUEIDENTIFIER,
                @scope NVARCHAR(100),
                @group_id UNIQUEIDENTIFIER,
                @subject NVARCHAR(100),
                @create_date DATETIME,
                @message NVARCHAR(4000),
                @temp_message NVARCHAR(4000),
                @count INT,
                @count1 INT,
                @count2 INT,
                @error_count INT

        SET @mail_id = NEWID();
        SET @scope = N'IDS';
        SET @group_id = N'8D42D25D-59C7-4A5E-AE9C-4A5F24D910B0'
        SET @subject = N'IDS daily - job运行情况';
        SET @create_date = DATEADD(HOUR, 8, GETDATE());
        SET @count1 = 0;
        SET @count2 = 0;
        SET @error_count = 0;
        SET @message = '<span style="color:#000; line-height:30px"><ol>';

        -- Total number of blob files moved from landing
        SET @temp_message = '';
        SET @count = 0;
        SELECT @count = COUNT(*)
        FROM [chk].[log_move_blob_rec] AS a
        WHERE LEFT([a].[rec_load_time], 8) = @curr_date
          AND ([a].[scope] IN ('ids_regular_data', 'ids_regular_rtm') OR [a].[blobFileName] LIKE '%LCH%')
        SET @message = @message + N'<li>从landing搬移blob文件总数:' + CONVERT(NVARCHAR(20), ISNULL(@count, 0));

        -- Distributor regular data files
        SET @temp_message = '';
        SET @count = 0;
        SELECT @count = COUNT(*)
        FROM [chk].[log_move_blob_rec] AS a
        WHERE LEFT([a].[rec_load_time], 8) = @curr_date
          AND [a].[scope] = 'ids_regular_data'
        SET @message = @message + N'<br>经销商regular data文件数:' + CONVERT(NVARCHAR(20), ISNULL(@count, 0));

        -- Restatement data files
        SET @temp_message = '';
        SET @count = 0;
        SELECT @count = COUNT(*)
        FROM [chk].[log_move_blob_rec] AS a
        WHERE LEFT([a].[rec_load_time], 8) = @curr_date
          AND [a].[scope] IN ('ids_regular_rtm')
        SET @message = @message + N'<br>restatement data文件数:' + CONVERT(NVARCHAR(20), ISNULL(@count, 0));

        -- Local customer hierarchy daily files
        SET @temp_message = '';
        SET @count = 0;
        SELECT @count = COUNT(*)
        FROM [chk].[log_move_blob_rec] AS a
        WHERE LEFT([a].[rec_load_time], 8) = @curr_date
          AND [a].[blobFileName] LIKE '%LCH%'
        SET @message = @message + N'<br>local customer hierarchy daily文件数:' + CONVERT(NVARCHAR(20), ISNULL(@count, 0)) + '</li>';

        -- Distributor regular data files actually processed
        SET @temp_message = '';
        SET @count = 0;
        SELECT @count = SUM([a].[file_count])
        FROM [chk].[log_blob_file_deal] AS a
        WHERE LOWER([a].[data_scope]) = 'ids'
          AND LOWER([a].[deal_level]) = 'ext'
          AND LOWER([a].[job_name]) = LOWER('IDS_Data_Blob_To_Stg_Ongoing_Loop_New_1_3')
          AND [a].[remark] LIKE '%tFileList Count%'
          AND CONVERT(NVARCHAR(8), [a].[deal_date], 112) = @curr_date
        SET @message = @message + N'<li>实际处理经销商regular data文件数:' + CONVERT(NVARCHAR(20), ISNULL(@count, 0)) + '</li>';

        -- Restatement data files actually processed
        SET @temp_message = '';
        SET @count = 0;
        SELECT @count = SUM([a].[file_count])
        FROM [chk].[log_blob_file_deal] AS a
        WHERE LOWER([a].[data_scope]) = 'ids'
          AND LOWER([a].[deal_level]) = 'ext'
          AND LOWER([a].[job_name]) = LOWER('IDS_Data_Blob_To_Stg_SalesDaily_Rtm_New_1_4')
          AND [a].[remark] LIKE '%tFileList Count%'
          AND CONVERT(NVARCHAR(8), [a].[deal_date], 112) = @curr_date
        SET @message = @message + N'<li>实际处理restatement data文件数:' + CONVERT(NVARCHAR(20), ISNULL(@count, 0)) + '</li>';

        -- Distributor regular data files that could not be unzipped
        SET @temp_message = '';
        SET @count = 0;
        SELECT @count = COUNT([a].[file_name])
        FROM [chk].[log_file_deal_error_rec] AS a
        WHERE LOWER([a].[data_scope]) = 'ids'
          AND LOWER([a].[deal_level]) = 'ext'
          AND LOWER([a].[job_name]) = LOWER('IDS_Data_Blob_To_Stg_Ongoing_Loop_New_1_3')
          AND CONVERT(NVARCHAR(8), [a].[deal_date], 112) = @curr_date
        SET @message = @message + N'<li>无法解压的经销商regular data文件数:' + CONVERT(NVARCHAR(20), ISNULL(@count, 0)) + '</li>';

        -- Restatement data files that could not be unzipped
        SET @temp_message = '';
        SET @count = 0;
        SELECT @count = COUNT([a].[file_name])
        FROM [chk].[log_file_deal_error_rec] AS a
        WHERE LOWER([a].[data_scope]) = 'ids'
          AND LOWER([a].[deal_level]) = 'ext'
          AND LOWER([a].[job_name]) = LOWER('IDS_Data_Blob_To_Stg_SalesDaily_Rtm_New_1_4')
          AND CONVERT(NVARCHAR(8), [a].[deal_date], 112) = @curr_date
        SET @message = @message + N'<li>无法解压的restatement data文件数:' + CONVERT(NVARCHAR(20), ISNULL(@count, 0)) + '</li>';

        -- Local customer hierarchy daily files processed
        SET @temp_message = '';
        SET @count = 0;
        SELECT @count = SUM([a].[file_count])
        FROM [chk].[log_blob_file_deal] AS a
        WHERE LOWER([a].[data_scope]) = 'ids'
          AND LOWER([a].[deal_level]) = 'ext'
          AND LOWER([a].[job_name]) = LOWER('IDS_RCS_Local_Master_Data_Daily_1_2')
          AND [a].[remark] LIKE '%tFileList Count lch%'
          AND CONVERT(NVARCHAR(8), [a].[deal_date], 112) = @curr_date
        SET @message = @message + N'<li>处理local customer hierarchy daily文件数:' + CONVERT(NVARCHAR(20), ISNULL(@count, 0));

        -- Row count reported for the file (latest log entry of the day)
        SET @temp_message = '';
        SET @count = 0;
        SET @count1 = 0;
        SELECT TOP (1) @count1 = [a].[row_count]
        FROM [chk].[log_table_data_rec] AS a
        WHERE [a].[data_scope] = 'rcs dim'
          AND [a].[table_name] = 'stg.cust_ids_rcs_local_customer_hierarchy_daily'
          AND CONVERT(NVARCHAR(8), [a].[action_time], 112) = @curr_date
        ORDER BY [a].[action_time] DESC
        SET @message = @message + N'<br>文件数据行数:' + CONVERT(NVARCHAR(20), ISNULL(@count1, 0));

        -- Row count actually loaded into the staging table
        SET @temp_message = '';
        SET @count = 0;
        SET @count2 = 0;
        SELECT @count2 = COUNT(*)
        FROM [stg].[cust_ids_rcs_local_customer_hierarchy_daily] AS a
        WHERE LEFT([a].[rec_load_time], 8) = @curr_date
        SET @message = @message + N'<br>入库数据行数:' + CONVERT(NVARCHAR(20), ISNULL(@count2, 0)) + '</li>';

        -- A mismatch between file rows and loaded rows counts as one error
        IF (@count1 <> @count2)
        BEGIN
            SET @error_count = @error_count + 1;
        END

        IF (OBJECT_ID(N'[chk].[temp_mail_send_proc_error_list_ids_daily]', N'U') IS NOT NULL)
        BEGIN
            DROP TABLE [chk].[temp_mail_send_proc_error_list_ids_daily];
        END

        /* Build the list of procs that raised errors (CTAS with DISTRIBUTION is Azure Synapse / SQL DW syntax) */
        CREATE TABLE [chk].[temp_mail_send_proc_error_list_ids_daily]
        WITH
        (
            DISTRIBUTION = ROUND_ROBIN,
            CLUSTERED COLUMNSTORE INDEX
        )
        AS
        SELECT [a].[proc_name]
              ,ROW_NUMBER() OVER(ORDER BY [a].[error_time] ASC) AS [Num]
        FROM [chk].[log_proc_error_rec] AS a
        WHERE [a].[proc_name] LIKE '%ids%'
          AND [a].[proc_name] NOT LIKE '%mail%'
          AND CONVERT(NVARCHAR(8), [a].[error_time], 112) = @curr_date

        SET @count = 0;
        SELECT @count = COUNT(*) FROM [chk].[temp_mail_send_proc_error_list_ids_daily];

        IF (@count > 0)
        BEGIN
            SET @message = @message + N'<li style="color:red">有错误的PROC:' + CONVERT(NVARCHAR(20), @count);
            SET @error_count = @error_count + @count;

            -- Append each failing proc name, walking the row numbers downwards
            WHILE (@count > 0)
            BEGIN
                SELECT @temp_message = [proc_name]
                FROM [chk].[temp_mail_send_proc_error_list_ids_daily]
                WHERE [Num] = @count;
                SET @message = @message + N'<br />' + @temp_message + '; ';
                SET @count = @count - 1;
            END

            -- Close the <li> only when it was opened
            SET @message = @message + '</li>';
        END

        IF (@error_count <> 0)
        BEGIN
            SET @subject = @subject + N':有 ' + CONVERT(NVARCHAR(20), @error_count) + N' 个错误';
        END
        SET @subject = @curr_date + N' ' + @subject;
        SET @message = @message + '</ol></span>'
        PRINT @message

        -- Store the generated mail as "created, not yet sent"
        INSERT INTO [chk].[mail_send_list_rec]
        (
            [mail_id] ,[group_id] ,[scope] ,[subject] ,[message] ,[create_date] ,[send_date] ,[send_status]
            ,[atta01_path] ,[atta02_path] ,[atta03_path] ,[atta04_path] ,[atta05_path]
        )
        SELECT @mail_id, @group_id, @scope, @subject, @message, @create_date, NULL, 0,
               @atta01_path, @atta02_path, @atta03_path, @atta04_path, @atta05_path

        SET @end_time = DATEADD(HOUR, 8, GETDATE());
        SET @cost_time = DATEDIFF(SECOND, @begin_time, @end_time);

        -- Log the end of the run, with the new mail_id in remark
        INSERT INTO [chk].[tb_proc_cost_log]
        (
            [proc_name] ,[Object_name] ,[execute_time] ,[action] ,[remark] ,[cost_time]
        )
        SELECT N'chk.usp_insert_ids_mail_send_list_rec' AS [proc_name]
              ,N'chk.mail_send_list_rec' AS [Object_name]
              ,@end_time AS [execute_time]
              ,N'end' AS [action]
              ,CONVERT(NVARCHAR(50), @mail_id) AS [remark]
              ,@cost_time AS [cost_time]

        PRINT N'Exec success';
        -- Return the new mail_id to the caller
        SELECT @mail_id AS [curr_mail_id]
    END TRY
    BEGIN CATCH
        INSERT INTO [chk].[log_proc_error_rec]
        (
            [proc_name] ,[error_source] ,[error_time] ,[error_severity] ,[error_state] ,[error_msg] ,[log_user]
        )
        SELECT N'chk.usp_insert_ids_mail_send_list_rec' AS [proc_name]
              ,ERROR_PROCEDURE() AS [error_source]
              ,DATEADD(HOUR, 8, GETDATE()) AS [error_time]
              ,ERROR_SEVERITY() AS [error_severity]
              ,ERROR_STATE() AS [error_state]
              ,ERROR_MESSAGE() AS [error_msg]
              ,SUSER_SNAME() AS [log_user]
        PRINT N'Exec failed';
    END CATCH
END
Updating an email by mail_id:
IF (OBJECT_ID(N'[chk].[usp_update_mail_send_list_rec_by_mail_id]', N'P') IS NOT NULL)
BEGIN
    PRINT N'删除存储过程:[chk].[usp_update_mail_send_list_rec_by_mail_id]';
    DROP PROC [chk].[usp_update_mail_send_list_rec_by_mail_id];
END
GO
CREATE PROC [chk].[usp_update_mail_send_list_rec_by_mail_id]
(
    @mail_id NVARCHAR(50)
   ,@send_date DATETIME
   ,@send_status SMALLINT
)
AS
--====================================================================================================
-- ProcedureName : [chk].[usp_update_mail_send_list_rec_by_mail_id]
-- Author        : john.xiong
-- CreateDate    : 2018-12-24
-- Description   : update the send record of a mail by mail_id
/*************************************Parameters*******************************************************
-- @mail_id     : mail id (a NEWID value)
-- @send_date   : send time; when NULL, the current time shifted by +8 hours is used
-- @send_status : send status (0 = created, not yet sent; 1 = sent)
**************************************Modified List****************************************************
-- Modified Date    Modified User    Version      Modified Reason
*******************************************************************************************************
-- 2018-12-24       john.xiong       V01.00.00    Initial version
******************************************************************************************************/
--====================================================================================================
BEGIN
    BEGIN TRY
        DECLARE @begin_time DATETIME
               ,@end_time DATETIME
               ,@cost_time INT
        SET @begin_time = DATEADD(HOUR, 8, GETDATE());

        -- Log the start of the run
        INSERT INTO [chk].[tb_proc_cost_log]
        (
            [proc_name] ,[Object_name] ,[execute_time] ,[action] ,[remark] ,[cost_time]
        )
        SELECT N'chk.usp_update_mail_send_list_rec_by_mail_id' AS [proc_name]
              ,N'chk.mail_send_list_rec' AS [Object_name]
              ,@begin_time AS [execute_time]
              ,N'start' AS [action]
              ,N'' AS [remark]
              ,0 AS [cost_time]

        -- A missing mail_id raises an error, which transfers control to the CATCH block
        IF (@mail_id IS NULL)
        BEGIN
            RAISERROR (N'mail id错误!强制退出', 16, 1);
        END

        IF (@send_date IS NULL)
        BEGIN
            SET @send_date = DATEADD(HOUR, 8, GETDATE());
        END

        UPDATE [chk].[mail_send_list_rec]
        SET [send_date] = @send_date,
            [send_status] = @send_status
        WHERE [mail_id] = @mail_id;

        SET @end_time = DATEADD(HOUR, 8, GETDATE());
        SET @cost_time = DATEDIFF(SECOND, @begin_time, @end_time);

        -- Log the end of the run
        INSERT INTO [chk].[tb_proc_cost_log]
        (
            [proc_name] ,[Object_name] ,[execute_time] ,[action] ,[remark] ,[cost_time]
        )
        SELECT N'chk.usp_update_mail_send_list_rec_by_mail_id' AS [proc_name]
              ,N'chk.mail_send_list_rec' AS [Object_name]
              ,@end_time AS [execute_time]
              ,N'end' AS [action]
              ,N'' AS [remark]
              ,@cost_time AS [cost_time]

        PRINT N'exec successed'
    END TRY
    BEGIN CATCH
        INSERT INTO [chk].[log_proc_error_rec]
        (
            [proc_name] ,[error_source] ,[error_time] ,[error_severity] ,[error_state] ,[error_msg] ,[log_user]
        )
        SELECT N'chk.usp_update_mail_send_list_rec_by_mail_id' AS [proc_name]
              ,ERROR_PROCEDURE() AS [error_source]
              ,DATEADD(HOUR, 8, GETDATE()) AS [error_time]
              ,ERROR_SEVERITY() AS [error_severity]
              ,ERROR_STATE() AS [error_state]
              ,ERROR_MESSAGE() AS [error_msg]
              ,SUSER_SNAME() AS [log_user]
        PRINT N'exec failed'
    END CATCH
END
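3. Calling it from a job
In any job that needs to send email, simply drag the joblet in. Then generate the mail_id of the email you want to send and pass it through an input component into the joblet's input. The joblet is then part of the job.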
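Author: DataStrategy
Source: https://www.cnblogs.com/xiongnanbin/
Contact: 1183744742@qq.com; xiongnanbin@126.com
Copyright of this article is shared by the author and cnblogs (博客园); reposted material belongs to its original author. Reposting is welcome, but please include a clearly visible link to the original on the article page. If you have questions or suggestions, please leave a comment. Thank you very much.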