Data4Strategy

A tree too big to embrace grows from a tiny sprout; a nine-story terrace rises from a heap of earth

[Original] Talend ETL Development: Unified Email Sending Based on a Joblet

Update

2019/1/16: Added support for sending up to 5 attachments, plus fields for SMTP and related configuration.


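I. Background

ETL data integration almost always involves email notifications: reports on ETL execution status, execution time, updates to key data, and so on. This information needs to reach the relevant operations staff, or the people who consume the BI data, promptly.

However, if the basics of email delivery are not planned from the start, sending can become chaotic and hard to manage later. For example, if everyone has their own ETL jobs and builds their own email notifications, then over time it becomes hard to work out which emails should be cancelled and which recipients should be removed. You may have to open, check, and modify every ETL job, which is time-consuming and very hard to manage.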

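II. Implementation

When implementing this solution, I mainly considered the following points:

1. When someone needs to send an email, they should not have to drag in and wire up the whole set of components again; dragging in one shared component should be enough. That is why I chose to implement this as a joblet.

2. The basic shared email information, such as sender and recipient lists and send records, must be maintained in one place. So I designed database tables to hold it; updating the database is then enough for every job to use the same information.

3. How messages are sent, their status, and how they are generated should all be flexibly controllable. So I designed a table to store this information and generate the actual email content through a stored procedure, which also makes it possible to trace send records.

4. Because Talend joblets support variables, I moved as many of the mail-sending component's settings as possible into database tables, which makes them easy to maintain and change.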

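2.1 Database table design

The design consists of two tables: mail_send_group and mail_send_list_rec.

mail_send_group: records the sender and recipient information. Keeping it here makes later maintenance simpler: change the row in the database and the change takes effect everywhere.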

IF (OBJECT_ID(N'[chk].[mail_send_group]', N'U') IS NOT NULL)
BEGIN
    PRINT N'Dropping table: [chk].[mail_send_group]';
    DROP TABLE [chk].[mail_send_group];
END
GO

CREATE TABLE [chk].[mail_send_group]
(
    [group_id] NVARCHAR(50) NOT NULL, -- primary key
    [mail_to] NVARCHAR(1000) NOT NULL, -- recipient addresses; separate multiple addresses with ;
    [mail_from] NVARCHAR(100) NOT NULL, -- sender address
    [sender_name] NVARCHAR(100) NOT NULL, -- sender display name
    [mail_cc] NVARCHAR(1000) NULL, -- CC addresses; separate multiple addresses with ;
    [mail_bcc] NVARCHAR(100) NULL, -- BCC addresses; separate multiple addresses with ;
    [smtp_host] NVARCHAR(100) NOT NULL, -- SMTP host address
    [smtp_port] INT NOT NULL, -- SMTP host port
    [user_name] NVARCHAR(50) NOT NULL, -- mailbox user name
    [user_pwd] NVARCHAR(50) NOT NULL, -- mailbox password
    [create_date] DATETIME NOT NULL, -- creation date
    [status] SMALLINT NULL -- status (0 = disabled, 1 = enabled)
)
GO

SELECT * FROM [chk].[mail_send_group]


mail_send_list_rec: records each generated email and its send status. Each row is linked to the table above through group_id, so you can tell who sent each message to whom, and when.

IF (OBJECT_ID(N'[chk].[mail_send_list_rec]', N'U') IS NOT NULL)
BEGIN
    PRINT N'Dropping table: [chk].[mail_send_list_rec]';
    DROP TABLE [chk].[mail_send_list_rec];
END
GO

CREATE TABLE [chk].[mail_send_list_rec]
(
    [mail_id] NVARCHAR(50) NOT NULL, -- primary key
    [group_id] NVARCHAR(50) NOT NULL, -- owning group id; determines sender, recipients, and so on
    [scope] NVARCHAR(100) NOT NULL, -- business scope; distinguishes emails generated by different businesses, effectively a category
    [subject] NVARCHAR(100) NOT NULL, -- subject
    [message] NVARCHAR(4000) NOT NULL, -- body; HTML is supported and recommended
    [create_date] DATETIME NOT NULL, -- creation date
    [send_date] DATETIME NULL, -- send date
    [send_status] SMALLINT NULL, -- send status (0 = created but not yet sent, 1 = sent)
    [atta01_path] NVARCHAR(200) NULL, -- absolute path of the 1st attachment, including the file name, e.g. /data/mailatts/ids/checkret.csv
    [atta02_path] NVARCHAR(200) NULL, -- absolute path of the 2nd attachment, including the file name
    [atta03_path] NVARCHAR(200) NULL, -- absolute path of the 3rd attachment, including the file name
    [atta04_path] NVARCHAR(200) NULL, -- absolute path of the 4th attachment, including the file name
    [atta05_path] NVARCHAR(200) NULL -- absolute path of the 5th attachment, including the file name
)
GO



2.2 Joblet development


1. The joblet takes its data through an input. The input parameter is mail_id, the ID of the email, which the calling job must pass in when it invokes the joblet.


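2. tFlowToIterate turns mail_id into a global variable (read back as curr_mail_id in the query of step 3), which is passed to the tMSSqlInput component in step 3.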


3. This component (tMSSqlInput) uses mail_id to look up the full email details in the database tables, supplying everything the send step needs. Only groups whose status is 1 (enabled) return a row.

"SELECT
    [a].[mail_id]
    ,[a].[subject]
    ,[a].[message]
    ,[b].[mail_from]
    ,[b].[mail_to]
    ,[b].[sender_name]
    ,[b].[mail_cc]
    ,[b].[mail_bcc]
    ,[b].[status]
    ,[a].[atta01_path]
    ,[a].[atta02_path]
    ,[a].[atta03_path]
    ,[a].[atta04_path]
    ,[a].[atta05_path]
    ,[b].[smtp_host]
    ,[b].[smtp_port]
    ,[b].[user_name]
    ,[b].[user_pwd]
FROM [chk].[mail_send_list_rec] AS a
INNER JOIN [chk].[mail_send_group] AS b
    ON ([a].[group_id] = [b].[group_id])
WHERE [a].[mail_id] = '" + ((String)globalMap.get("curr_mail_id")) + "'
    AND ISNULL([b].[status], 0) = 1"

4. The mail-sending component takes the values queried from the database, passed in as variables, and sends the email.


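5. Mark the record for that mail_id as sent and record the send time. tFixedFlowInput first generates the stored procedure parameters, then tMSSqlSP calls the stored procedure to perform the update.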
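Steps 3 to 5 can be sketched outside Talend as follows. This is only an illustration under stated assumptions, not the joblet itself: SQLite stands in for SQL Server, only a subset of the columns is created, and the names SCHEMA, LOOKUP, split_addresses, fetch_mail, build_message, and mark_sent are hypothetical. Unlike the concatenated query string in step 3, the sketch binds mail_id as a query parameter.

```python
import sqlite3
from datetime import datetime, timezone
from email.headerregistry import Address
from email.message import EmailMessage

# Assumption: SQLite stands in for SQL Server, with only the columns used here.
SCHEMA = """
CREATE TABLE mail_send_group (
    group_id TEXT, mail_to TEXT, mail_from TEXT, sender_name TEXT,
    mail_cc TEXT, mail_bcc TEXT, status INTEGER);
CREATE TABLE mail_send_list_rec (
    mail_id TEXT, group_id TEXT, subject TEXT, message TEXT,
    send_date TEXT, send_status INTEGER);
"""

LOOKUP = """
SELECT a.mail_id, a.subject, a.message,
       b.mail_from, b.mail_to, b.sender_name, b.mail_cc, b.mail_bcc
FROM mail_send_list_rec AS a
INNER JOIN mail_send_group AS b ON a.group_id = b.group_id
WHERE a.mail_id = ? AND COALESCE(b.status, 0) = 1
"""


def split_addresses(value):
    """Split a ';'-separated address list, ignoring blanks and NULLs."""
    return [part.strip() for part in (value or "").split(";") if part.strip()]


def fetch_mail(conn, mail_id):
    """Step 3: look up one mail by id; returns None if missing or group disabled."""
    conn.row_factory = sqlite3.Row
    return conn.execute(LOOKUP, (mail_id,)).fetchone()


def build_message(row):
    """Step 4: turn the queried row into an HTML email message (not sent here)."""
    msg = EmailMessage()
    msg["Subject"] = row["subject"]
    msg["From"] = Address(display_name=row["sender_name"], addr_spec=row["mail_from"])
    msg["To"] = ", ".join(split_addresses(row["mail_to"]))
    for header, column in (("Cc", "mail_cc"), ("Bcc", "mail_bcc")):
        addresses = split_addresses(row[column])
        if addresses:
            msg[header] = ", ".join(addresses)
    msg.set_content(row["message"], subtype="html")
    return msg


def mark_sent(conn, mail_id, send_date=None):
    """Step 5: set send_status to 1 and store the send time (now if not given)."""
    if send_date is None:
        send_date = datetime.now(timezone.utc).isoformat(timespec="seconds")
    conn.execute(
        "UPDATE mail_send_list_rec SET send_date = ?, send_status = 1 WHERE mail_id = ?",
        (send_date, mail_id),
    )
```

Actually delivering the message (for example with smtplib and the smtp_host, smtp_port, user_name, and user_pwd columns) and attaching the files from atta01_path to atta05_path are left out to keep the sketch short.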



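2.3 Stored procedures for generating and updating email content

Generating an email: the procedure builds a message from whatever content you want to send and inserts it into the database table.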

IF (OBJECT_ID(N'[chk].[usp_insert_ids_mail_send_list_rec]', N'P') IS NOT NULL)
BEGIN
    PRINT N'Dropping stored procedure: [chk].[usp_insert_ids_mail_send_list_rec]';
    DROP PROC [chk].[usp_insert_ids_mail_send_list_rec];
END
GO

CREATE PROC [chk].[usp_insert_ids_mail_send_list_rec]
(
    @curr_date NVARCHAR(20),
    @atta01_path NVARCHAR(200),
    @atta02_path NVARCHAR(200),
    @atta03_path NVARCHAR(200),
    @atta04_path NVARCHAR(200),
    @atta05_path NVARCHAR(200)
)
AS
--====================================================================================================================================
-- ProcedureName : chk.usp_insert_ids_mail_send_list_rec
-- Author : john.xiong
-- CreateDate : 2019-01-02
-- Description : Generates the daily detail mail content
/*************************************Parameters***************************************************************************************
-- @curr_date : data execution date, YYYYMMDD
-- @atta01_path .. @atta05_path : absolute paths of up to 5 attachments, including the file name
**************************************Modified List************************************************************************************
-- Modified Date Modified User Version Modified Reason
**************************************************************************************************************************************
-- 2019-01-02 john.xiong V01.00.00 Initial version
**************************************************************************************************************************************/
--====================================================================================================================================

BEGIN
    BEGIN TRY
        DECLARE
            @begin_time DATETIME
            ,@end_time DATETIME
            ,@cost_time INT;

        -- GETDATE() plus 8 hours (presumably shifting UTC server time to UTC+8)
        SET @begin_time = DATEADD(HOUR, 8, GETDATE());

        -- Log the start of the run
        INSERT INTO [chk].[tb_proc_cost_log]
        (
            [proc_name]
            ,[Object_name]
            ,[execute_time]
            ,[action]
            ,[remark]
            ,[cost_time]
        )
        SELECT
            N'chk.usp_insert_ids_mail_send_list_rec' AS [proc_name]
            ,N'chk.mail_send_list_rec' AS [Object_name]
            ,@begin_time AS [execute_time]
            ,N'start' AS [action]
            ,'' AS [remark]
            ,0 AS [cost_time]

        DECLARE
            @mail_id UNIQUEIDENTIFIER,
            @scope NVARCHAR(100),
            @group_id UNIQUEIDENTIFIER,
            @subject NVARCHAR(100),
            @create_date DATETIME,
            @message NVARCHAR(4000),
            @temp_message NVARCHAR(4000),
            @count INT,
            @count1 INT,
            @count2 INT,
            @error_count INT

        SET @mail_id = NEWID();
        SET @scope = N'IDS';
        -- group_id of the row in chk.mail_send_group that defines sender and recipients
        SET @group_id = N'8D42D25D-59C7-4A5E-AE9C-4A5F24D910B0'
        SET @subject = N'IDS daily - job run status';
        SET @create_date = DATEADD(HOUR, 8, GETDATE());
        SET @count1 = 0;
        SET @count2 = 0;
        SET @error_count = 0;
        -- The body is an HTML ordered list; each check below appends one <li>
        SET @message = '<span style="color:#000; line-height:30px"><ol>';

        -- Item 1: blob files moved from landing, in total and per category
        SET @temp_message = '';
        SET @count = 0;

        SELECT
            @count = COUNT(*)
        FROM [chk].[log_move_blob_rec] AS a
        WHERE LEFT([a].[rec_load_time], 8) = @curr_date
            AND ([a].[scope] IN ('ids_regular_data', 'ids_regular_rtm') OR [a].[blobFileName] LIKE '%LCH%')

        SET @message = @message + N'<li>Total blob files moved from landing: ' + CONVERT(NVARCHAR(20), ISNULL(@count, 0));

        SET @temp_message = '';
        SET @count = 0;

        SELECT
            @count = COUNT(*)
        FROM [chk].[log_move_blob_rec] AS a
        WHERE LEFT([a].[rec_load_time], 8) = @curr_date
            AND [a].[scope] = 'ids_regular_data'

        SET @message = @message + N'<br>Distributor regular data files: ' + CONVERT(NVARCHAR(20), ISNULL(@count, 0));

        SET @temp_message = '';
        SET @count = 0;

        SELECT
            @count = COUNT(*)
        FROM [chk].[log_move_blob_rec] AS a
        WHERE LEFT([a].[rec_load_time], 8) = @curr_date
            AND [a].[scope] IN ('ids_regular_rtm')

        SET @message = @message + N'<br>Restatement data files: ' + CONVERT(NVARCHAR(20), ISNULL(@count, 0));

        SET @temp_message = '';
        SET @count = 0;

        SELECT
            @count = COUNT(*)
        FROM [chk].[log_move_blob_rec] AS a
        WHERE LEFT([a].[rec_load_time], 8) = @curr_date
            AND [a].[blobFileName] LIKE '%LCH%'

        SET @message = @message + N'<br>Local customer hierarchy daily files: ' + CONVERT(NVARCHAR(20), ISNULL(@count, 0)) + '</li>';

        -- Items 2 and 3: files actually processed by each job
        SET @temp_message = '';
        SET @count = 0;

        SELECT
            @count = SUM([a].[file_count])
        FROM [chk].[log_blob_file_deal] AS a
        WHERE LOWER([a].[data_scope]) = 'ids'
            AND LOWER([a].[deal_level]) = 'ext'
            AND LOWER([a].[job_name]) = LOWER('IDS_Data_Blob_To_Stg_Ongoing_Loop_New_1_3')
            AND [a].[remark] LIKE '%tFileList Count%'
            AND CONVERT(NVARCHAR(8), [a].[deal_date], 112) = @curr_date

        SET @message = @message + N'<li>Distributor regular data files actually processed: ' + CONVERT(NVARCHAR(20), ISNULL(@count, 0)) + '</li>';

        SET @temp_message = '';
        SET @count = 0;

        SELECT
            @count = SUM([a].[file_count])
        FROM [chk].[log_blob_file_deal] AS a
        WHERE LOWER([a].[data_scope]) = 'ids'
            AND LOWER([a].[deal_level]) = 'ext'
            AND LOWER([a].[job_name]) = LOWER('IDS_Data_Blob_To_Stg_SalesDaily_Rtm_New_1_4')
            AND [a].[remark] LIKE '%tFileList Count%'
            AND CONVERT(NVARCHAR(8), [a].[deal_date], 112) = @curr_date

        SET @message = @message + N'<li>Restatement data files actually processed: ' + CONVERT(NVARCHAR(20), ISNULL(@count, 0)) + '</li>';

        -- Items 4 and 5: files that could not be decompressed
        SET @temp_message = '';
        SET @count = 0;

        SELECT
            @count = COUNT([a].[file_name])
        FROM [chk].[log_file_deal_error_rec] AS a
        WHERE LOWER([a].[data_scope]) = 'ids'
            AND LOWER([a].[deal_level]) = 'ext'
            AND LOWER([a].[job_name]) = LOWER('IDS_Data_Blob_To_Stg_Ongoing_Loop_New_1_3')
            AND CONVERT(NVARCHAR(8), [a].[deal_date], 112) = @curr_date

        SET @message = @message + N'<li>Distributor regular data files that could not be decompressed: ' + CONVERT(NVARCHAR(20), ISNULL(@count, 0)) + '</li>';

        SET @temp_message = '';
        SET @count = 0;

        SELECT
            @count = COUNT([a].[file_name])
        FROM [chk].[log_file_deal_error_rec] AS a
        WHERE LOWER([a].[data_scope]) = 'ids'
            AND LOWER([a].[deal_level]) = 'ext'
            AND LOWER([a].[job_name]) = LOWER('IDS_Data_Blob_To_Stg_SalesDaily_Rtm_New_1_4')
            AND CONVERT(NVARCHAR(8), [a].[deal_date], 112) = @curr_date

        SET @message = @message + N'<li>Restatement data files that could not be decompressed: ' + CONVERT(NVARCHAR(20), ISNULL(@count, 0)) + '</li>';

        -- Item 6: local customer hierarchy daily files, with file rows versus loaded rows
        SET @temp_message = '';
        SET @count = 0;

        SELECT
            @count = SUM([a].[file_count])
        FROM [chk].[log_blob_file_deal] AS a
        WHERE LOWER([a].[data_scope]) = 'ids'
            AND LOWER([a].[deal_level]) = 'ext'
            AND LOWER([a].[job_name]) = LOWER('IDS_RCS_Local_Master_Data_Daily_1_2')
            AND [a].[remark] LIKE '%tFileList Count lch%'
            AND CONVERT(NVARCHAR(8), [a].[deal_date], 112) = @curr_date

        SET @message = @message + N'<li>Local customer hierarchy daily files processed: ' + CONVERT(NVARCHAR(20), ISNULL(@count, 0));

        SET @temp_message = '';
        SET @count = 0;
        SET @count1 = 0;

        SELECT TOP (1)
            @count1 = [a].[row_count]
        FROM [chk].[log_table_data_rec] AS a
        WHERE [a].[data_scope] = 'rcs dim'
            AND [a].[table_name] = 'stg.cust_ids_rcs_local_customer_hierarchy_daily'
            AND CONVERT(NVARCHAR(8), [a].[action_time], 112) = @curr_date
        ORDER BY [a].[action_time] DESC

        SET @message = @message + N'<br>Rows in file: ' + CONVERT(NVARCHAR(20), ISNULL(@count1, 0));

        SET @temp_message = '';
        SET @count = 0;
        SET @count2 = 0;

        SELECT
            @count2 = COUNT(*)
        FROM [stg].[cust_ids_rcs_local_customer_hierarchy_daily] AS a
        WHERE LEFT([a].[rec_load_time], 8) = @curr_date

        SET @message = @message + N'<br>Rows loaded: ' + CONVERT(NVARCHAR(20), ISNULL(@count2, 0)) + '</li>';

        -- A mismatch between file rows and loaded rows counts as one error
        IF (@count1 <> @count2)
        BEGIN
            SET @error_count = @error_count + 1;
        END

        IF (OBJECT_ID(N'[chk].[temp_mail_send_proc_error_list_ids_daily]', N'U') IS NOT NULL)
        BEGIN
            DROP TABLE [chk].[temp_mail_send_proc_error_list_ids_daily];
        END

        /* Build the list of procedures that raised errors on @curr_date */
        -- CREATE TABLE AS SELECT with DISTRIBUTION is Azure SQL Data Warehouse syntax
        CREATE TABLE [chk].[temp_mail_send_proc_error_list_ids_daily]
        WITH
        (
            DISTRIBUTION = ROUND_ROBIN,
            CLUSTERED COLUMNSTORE INDEX
        )
        AS
        SELECT
            [a].[proc_name]
            ,ROW_NUMBER() OVER(ORDER BY [a].[error_time] ASC) AS [Num]
        FROM [chk].[log_proc_error_rec] AS a
        WHERE [a].[proc_name] LIKE '%ids%'
            AND [a].[proc_name] NOT LIKE '%mail%'
            AND CONVERT(NVARCHAR(8), [a].[error_time], 112) = @curr_date

        SET @count = 0;
        SELECT @count = COUNT(*) FROM [chk].[temp_mail_send_proc_error_list_ids_daily];

        IF (@count > 0)
        BEGIN
            SET @message = @message + N'<li style="color:red">Procedures with errors: ' + CONVERT(NVARCHAR(20), @count);
            SET @error_count = @error_count + @count;

            -- Append each failing procedure name, walking [Num] from @count down to 1
            WHILE (@count > 0)
            BEGIN
                SELECT @temp_message = [proc_name] FROM [chk].[temp_mail_send_proc_error_list_ids_daily] WHERE [Num] = @count;
                SET @message = @message + N'<br />' + @temp_message + ';&nbsp;';
                SET @count = @count - 1;
            END

            -- Close the <li> only when it was opened above
            SET @message = @message + '</li>';
        END

        -- Put the error count in the subject so failures stand out in the inbox
        IF (@error_count <> 0)
        BEGIN
            SET @subject = @subject + N': ' + CONVERT(NVARCHAR(20), @error_count) + N' error(s)';
        END

        SET @subject = @curr_date + N' ' + @subject;
        SET @message = @message + '</ol></span>'

        PRINT @message

        -- Store the generated mail as "created, not yet sent" (send_status = 0)
        INSERT INTO [chk].[mail_send_list_rec]
        (
            [mail_id]
            ,[group_id]
            ,[scope]
            ,[subject]
            ,[message]
            ,[create_date]
            ,[send_date]
            ,[send_status]
            ,[atta01_path]
            ,[atta02_path]
            ,[atta03_path]
            ,[atta04_path]
            ,[atta05_path]
        )
        SELECT
            @mail_id,
            @group_id,
            @scope,
            @subject,
            @message,
            @create_date,
            NULL,
            0,
            @atta01_path,
            @atta02_path,
            @atta03_path,
            @atta04_path,
            @atta05_path

        SET @end_time = DATEADD(HOUR, 8, GETDATE());
        SET @cost_time = DATEDIFF(SECOND, @begin_time, @end_time);

        -- Log the end of the run together with the generated mail_id
        INSERT INTO [chk].[tb_proc_cost_log]
        (
            [proc_name]
            ,[Object_name]
            ,[execute_time]
            ,[action]
            ,[remark]
            ,[cost_time]
        )
        SELECT
            N'chk.usp_insert_ids_mail_send_list_rec' AS [proc_name]
            ,N'chk.mail_send_list_rec' AS [Object_name]
            ,@end_time AS [execute_time]
            ,N'end' AS [action]
            ,CONVERT(NVARCHAR(50), @mail_id) AS [remark]
            ,@cost_time AS [cost_time]

        PRINT N'Exec success';

        -- Return the new mail_id to the caller
        SELECT @mail_id AS [curr_mail_id]
    END TRY
    BEGIN CATCH
        INSERT INTO [chk].[log_proc_error_rec]
        (
            [proc_name]
            ,[error_source]
            ,[error_time]
            ,[error_severity]
            ,[error_state]
            ,[error_msg]
            ,[log_user]
        )
        SELECT
            N'chk.usp_insert_ids_mail_send_list_rec' AS [proc_name]
            ,ERROR_PROCEDURE() AS [error_source]
            ,DATEADD(HOUR, 8, GETDATE()) AS [error_time]
            ,ERROR_SEVERITY() AS [error_severity]
            ,ERROR_STATE() AS [error_state]
            ,ERROR_MESSAGE() AS [error_msg]
            ,SUSER_SNAME() AS [log_user]

        PRINT N'Exec failed';
    END CATCH
END
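The way the procedure above assembles the subject and the HTML body can be sketched in a few lines of Python. This is a simplified illustration, not a port of the procedure: the function name build_report, its parameters, and the English labels are hypothetical, and the counts are passed in instead of being queried from the log tables.

```python
from html import escape


def build_report(curr_date, counts, failed_procs, mismatches=0,
                 base_subject="IDS daily - job run status"):
    """Return (subject, html_message) for a daily report.

    counts       -- list of (label, value) pairs; a value of None is shown as 0,
                    mirroring ISNULL(@count, 0) in the procedure
    failed_procs -- names of procedures that logged an error that day
    mismatches   -- number of row-count mismatches, each counted as one error
    """
    items = [f"<li>{escape(label)}: {int(value or 0)}</li>" for label, value in counts]

    error_count = mismatches + len(failed_procs)
    if failed_procs:
        # One red list item that names every failing procedure
        names = "".join(f"<br />{escape(name)};&nbsp;" for name in failed_procs)
        items.append(
            f'<li style="color:red">Procedures with errors: {len(failed_procs)}{names}</li>'
        )

    subject = f"{curr_date} {base_subject}"
    if error_count:
        # Surface the error count in the subject line
        subject += f": {error_count} error(s)"

    message = ('<span style="color:#000; line-height:30px"><ol>'
               + "".join(items) + "</ol></span>")
    return subject, message
```

For example, build_report("20190102", [("Files moved", 3)], []) yields a subject without an error suffix, while passing one failed procedure name appends ": 1 error(s)" to it. Keep in mind that the real table limits message to NVARCHAR(4000), so a long list of failing procedures may not fit.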

Updating an email by mail_id: this procedure records the send time and send status for one mail_id.

IF (OBJECT_ID(N'[chk].[usp_update_mail_send_list_rec_by_mail_id]', N'P') IS NOT NULL)
BEGIN
    PRINT N'Dropping stored procedure: [chk].[usp_update_mail_send_list_rec_by_mail_id]';
    DROP PROC [chk].[usp_update_mail_send_list_rec_by_mail_id];
END
GO

CREATE PROC [chk].[usp_update_mail_send_list_rec_by_mail_id]
(
    @mail_id NVARCHAR(50)
    ,@send_date DATETIME
    ,@send_status SMALLINT
)
AS
--====================================================================================================================================
-- ProcedureName : [chk].[usp_update_mail_send_list_rec_by_mail_id]
-- Author : john.xiong
-- CreateDate : 2018-12-24
-- Description : Updates the mail send record for the given mail_id
/*************************************Parameters***************************************************************************************
-- @mail_id : mail id (generated with NEWID)
-- @send_date : send time; defaults to the current time when NULL
-- @send_status : send status (0 = created but not yet sent, 1 = sent)
**************************************Modified List************************************************************************************
-- Modified Date Modified User Version Modified Reason
**************************************************************************************************************************************
-- 2018-12-24 john.xiong V01.00.00 Initial version
**************************************************************************************************************************************/
--====================================================================================================================================
BEGIN
    BEGIN TRY
        DECLARE
            @begin_time DATETIME
            ,@end_time DATETIME
            ,@cost_time INT

        SET @begin_time = DATEADD(HOUR, 8, GETDATE());

        -- Log the start of the run
        INSERT INTO [chk].[tb_proc_cost_log]
        (
            [proc_name]
            ,[Object_name]
            ,[execute_time]
            ,[action]
            ,[remark]
            ,[cost_time]
        )
        SELECT
            N'chk.usp_update_mail_send_list_rec_by_mail_id' AS [proc_name]
            ,N'chk.mail_send_list_rec' AS [Object_name]
            ,@begin_time AS [execute_time]
            ,N'start' AS [action]
            ,N'' AS [remark]
            ,0 AS [cost_time]

        -- Severity 16 transfers control to the CATCH block
        IF (@mail_id IS NULL)
        BEGIN
            RAISERROR (N'Invalid mail id! Aborting.', 16, 1);
        END

        IF (@send_date IS NULL)
        BEGIN
            SET @send_date = DATEADD(HOUR, 8, GETDATE());
        END

        UPDATE [chk].[mail_send_list_rec]
        SET [send_date] = @send_date, [send_status] = @send_status
        WHERE [mail_id] = @mail_id;

        SET @end_time = DATEADD(HOUR, 8, GETDATE());
        SET @cost_time = DATEDIFF(SECOND, @begin_time, @end_time);

        -- Log the end of the run
        INSERT INTO [chk].[tb_proc_cost_log]
        (
            [proc_name]
            ,[Object_name]
            ,[execute_time]
            ,[action]
            ,[remark]
            ,[cost_time]
        )
        SELECT
            N'chk.usp_update_mail_send_list_rec_by_mail_id' AS [proc_name]
            ,N'chk.mail_send_list_rec' AS [Object_name]
            ,@end_time AS [execute_time]
            ,N'end' AS [action]
            ,N'' AS [remark]
            ,@cost_time AS [cost_time]

        PRINT N'exec succeeded'
    END TRY
    BEGIN CATCH
        INSERT INTO [chk].[log_proc_error_rec]
        (
            [proc_name]
            ,[error_source]
            ,[error_time]
            ,[error_severity]
            ,[error_state]
            ,[error_msg]
            ,[log_user]
        )
        SELECT
            N'chk.usp_update_mail_send_list_rec_by_mail_id' AS [proc_name]
            ,ERROR_PROCEDURE() AS [error_source]
            ,DATEADD(HOUR, 8, GETDATE()) AS [error_time]
            ,ERROR_SEVERITY() AS [error_severity]
            ,ERROR_STATE() AS [error_state]
            ,ERROR_MESSAGE() AS [error_msg]
            ,SUSER_SNAME() AS [log_user]

        PRINT N'exec failed'
    END CATCH
END

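III. Calling the joblet from a job

In any job that needs to send email, just drag the joblet in. Then generate the mail_id of the email you want to send and pass it through an input component into the joblet's input. That is all it takes to integrate the joblet into the job.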
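The calling pattern can be sketched as a small function. In Talend this wiring is done on the canvas rather than in code, so everything here is an assumption made for illustration: run_with_notification, generate_mail, and send_mail_joblet are hypothetical names standing in for the job, the mail-generating stored procedure, and the joblet.

```python
from typing import Callable, Optional


def run_with_notification(
    curr_date: str,
    generate_mail: Callable[[str], Optional[str]],
    send_mail_joblet: Callable[[str], None],
) -> bool:
    """Generate one mail record, then hand its mail_id to the joblet.

    Returns False when no mail_id came back; the generating procedure only
    returns curr_mail_id on success, so a failure leaves nothing to send.
    """
    mail_id = generate_mail(curr_date)
    if not mail_id:
        return False
    send_mail_joblet(mail_id)
    return True
```

The job itself only ever handles a mail_id; senders, recipients, and SMTP settings stay in the database, which is what keeps every job on the same configuration.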



posted @ 2019-01-03 15:59 John.Xiong