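[Debezium/FlinkCDC] Custom column value converter (`CustomConverter<SchemaBuilder, RelationalColumn>`)
Requirements
- You want precise, uniform control over the values that the Debezium framework produces for MySQL datetime/timestamp columns after binlog CDC capture and conversion.
- Or you want MySQL int, tinyint, varchar and similar columns to come out of Debezium's binlog CDC capture and conversion as one specific Java class or value format.
- Both scenarios can be covered by implementing the interface that Debezium reserves for this purpose, io.debezium.spi.converter.CustomConverter.
Flink CDC applications are also built on Debezium, so they can use the same code directly.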
io.debezium.spi.converter.CustomConverter
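public interface CustomConverter<S, F extends ConvertedField> {
    @FunctionalInterface
    interface Converter { // converts data from one type to another
        Object convert(Object input);
    }
    public interface ConverterRegistration<S> { // callback used to register a converter
        // Registers the given schema and converter for the current field; should not be called more than once for the same field.
        void register(S fieldSchema, Converter converter);
    }
    // Passes the properties specified in the connector configuration to the converter instance. configure runs when the connector is initialized. One converter can be used with several connectors and can change its behavior based on each connector's property settings.
    void configure(Properties props);
    // Registers the converter for specific columns or fields of the data source. Debezium calls converterFor(...) to prompt the converter to call registration; converterFor runs once per column.
    void converterFor(F field, ConverterRegistration<S> registration); // registers a custom value and schema converter for a specific field
}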
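The callback flow of this interface can be sketched with plain Java. This is a minimal illustration only: `Converter`, `Registration`, `converterFor` and `buildRegistry` below are hypothetical stand-ins written for this example (a String plays the role of the schema), not Debezium classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-in sketch of the converterFor/register callback flow. None of these types are Debezium classes.
final class ConverterFlowSketch {

    /** Hypothetical stand-in for CustomConverter.Converter. */
    interface Converter {
        Object convert(Object input);
    }

    /** Hypothetical stand-in for ConverterRegistration, with a plain String playing the schema. */
    interface Registration {
        void register(String schemaName, Converter converter);
    }

    /** Plays the converter: registers itself only for TINYINT columns. */
    static void converterFor(String columnType, Registration registration) {
        if ("TINYINT".equalsIgnoreCase(columnType)) {
            registration.register("int32", in -> in == null ? null : ((Number) in).intValue());
        }
        // Any other type: nothing is registered, so the default conversion would stay in place.
    }

    /** Plays the framework: calls converterFor once per column and records what was registered. */
    static Map<String, Converter> buildRegistry(Map<String, String> columnTypes) {
        Map<String, Converter> registry = new LinkedHashMap<>();
        columnTypes.forEach((column, type) ->
                converterFor(type, (schemaName, converter) -> registry.put(column, converter)));
        return registry;
    }

    public static void main(String[] args) {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("demo.t.flag", "TINYINT");
        columns.put("demo.t.name", "VARCHAR");
        Map<String, Converter> registry = buildRegistry(columns);
        System.out.println(registry.keySet());
        System.out.println(registry.get("demo.t.flag").convert((byte) 1));
    }
}
```

The converter decides per column whether to register at all. A column it does not register for keeps the framework's default conversion.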
- Versions
- debezium : 1.4.1.Final
- flink cdc : 1.3.0
- flink : 1.12.6
How it works
Debezium's TableSchemaBuilder calls the custom column value converter
io.debezium.relational.TableSchemaBuilder
public TableSchema create(String schemaPrefix, String envelopSchemaName, Table table, Tables.ColumnNameFilter filter, ColumnMappers mappers, Key.KeyMapper keysMapper) {
...
Schema valSchema = valSchemaBuilder.optional().build();
Schema keySchema = hasPrimaryKey.get() ? keySchemaBuilder.build() : null;
if (LOGGER.isDebugEnabled()) {
LOGGER.debug("Mapped primary key for table '{}' to schema: {}", tableId, SchemaUtil.asDetailedString(keySchema));
LOGGER.debug("Mapped columns for table '{}' to schema: {}", tableId, SchemaUtil.asDetailedString(valSchema));
}
Envelope envelope = Envelope.defineSchema().withName(this.schemaNameAdjuster.adjust(envelopSchemaName)).withRecord(valSchema).withSource(this.sourceInfoSchema).build();
StructGenerator keyGenerator = this.createKeyGenerator(keySchema, tableId, tableKey.keyColumns());
StructGenerator valueGenerator = this.createValueGenerator(valSchema, tableId, table.columns(), filter, mappers);// key method: this is where the custom column value Converter gets invoked
return new TableSchema(tableId, keySchema, keyGenerator, envelope, valSchema, valueGenerator);
}
protected StructGenerator createValueGenerator(Schema schema, TableId tableId, List<Column> columns, Tables.ColumnNameFilter filter, ColumnMappers mappers) {
if (schema != null) {
List<Column> columnsThatShouldBeAdded = (List)columns.stream().filter((column) -> {
return filter == null || filter.matches(tableId.catalog(), tableId.schema(), tableId.table(), column.name());
}).collect(Collectors.toList());
int[] recordIndexes = this.indexesForColumns(columnsThatShouldBeAdded);
Field[] fields = this.fieldsForColumns(schema, columnsThatShouldBeAdded);
int numFields = recordIndexes.length;
ValueConverter[] converters = this.convertersForColumns(schema, tableId, columnsThatShouldBeAdded, mappers);
return (row) -> {
Struct result = new Struct(schema);
for(int i = 0; i != numFields; ++i) {
this.validateIncomingRowToInternalMetadata(recordIndexes, fields, converters, row, i);
Object value = row[recordIndexes[i]];
ValueConverter converter = converters[i];
if (converter != null) {
LOGGER.trace("converter for value object: *** {} ***", converter);
} else {
LOGGER.trace("converter is null...");
}
if (converter != null) {
Column col;
try {
col = (Column)columns.get(i);
// the commented-out lines can be ignored; they are debug code added by the author
//Object firstColumnValue = row[0];
//if(firstColumnValue.toString().equalsIgnoreCase("38")){// print the target row only
// Object newValue = converter.convert(value);
// LOGGER.info(
// "columnName : {} | typeName : {}, jdbcType : {} | converterClass:{} | oldValue: {}(class:{}) , newValue:{}(class:{})"
// , col.name(), col.typeName(), col.jdbcType(), converter.getClass().getCanonicalName()
// , value, value.getClass().getCanonicalName(), newValue , newValue.getClass().getCanonicalName()
// );
//}
value = converter.convert(value);
result.put(fields[i], value);
} catch (IllegalArgumentException | DataException var15) {
col = (Column)columns.get(i);
LOGGER.error("Failed to properly convert data value for '{}.{}' of type {} for row {}:", new Object[]{tableId, col.name(), col.typeName(), row, var15});
} catch (Exception var16) {
col = (Column)columns.get(i);
LOGGER.error("Failed to properly convert data value for '{}.{}' of type {} for row {}:", new Object[]{tableId, col.name(), col.typeName(), row, var16});
}
}
}
return result;
};
} else {
return null;
}
}
protected ValueConverter createValueConverterFor(TableId tableId, Column column, Field fieldDefn) {
// this.valueConverterProvider.converter(column, fieldDefn) : actually calls io.debezium.connector.mysql.MySqlValueConverters#converter
return (ValueConverter)this.customConverterRegistry.getValueConverter(tableId, column).orElse(this.valueConverterProvider.converter(column, fieldDefn));
}
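In other words, if the user has not configured a custom column value converter for the target column, Debezium's default implementation is used.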
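The lookup-with-fallback shape of createValueConverterFor can be reproduced with the JDK alone. The sketch below is an illustration under assumptions: `ConverterLookupSketch`, its map of custom converters and the `defaultLookups` counter are hypothetical names invented here. It also shows a general Java detail: the argument of `Optional.orElse` is evaluated before the call, so the default converter is looked up even when a custom one exists.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for the "custom converter first, default otherwise" lookup.
final class ConverterLookupSketch {
    private final Map<String, UnaryOperator<Object>> custom = new HashMap<>();
    int defaultLookups = 0; // counts how often the default provider is consulted

    void registerCustom(String fullColumnName, UnaryOperator<Object> converter) {
        custom.put(fullColumnName, converter);
    }

    /** Stand-in for the default provider: a pass-through converter. */
    UnaryOperator<Object> defaultConverter(String fullColumnName) {
        defaultLookups++;
        return value -> value;
    }

    /** Same shape as the source: Optional of the custom converter, orElse the default. */
    UnaryOperator<Object> converterFor(String fullColumnName) {
        return Optional.ofNullable(custom.get(fullColumnName))
                .orElse(defaultConverter(fullColumnName)); // argument is evaluated eagerly
    }

    public static void main(String[] args) {
        ConverterLookupSketch lookup = new ConverterLookupSketch();
        lookup.registerCustom("demo.t.createTime", v -> "custom:" + v);
        System.out.println(lookup.converterFor("demo.t.createTime").apply(1L));
        System.out.println(lookup.converterFor("demo.t.id").apply(38L));
        System.out.println(lookup.defaultLookups);
    }
}
```

If building the default converter were expensive, `orElseGet` with a supplier would defer it. Here that is only an observation about the Java API, not a claim about what Debezium should do.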
Debezium ValueConverter: the top-level interface for column value converters
io.debezium.relational.ValueConverter
//
// Source code recreated from a .class file by IntelliJ IDEA
// (powered by FernFlower decompiler)
//
package io.debezium.relational;
@FunctionalInterface
public interface ValueConverter {
Object convert(Object var1);// core method: converts a column value
default ValueConverter or(ValueConverter fallback) {
return fallback == null ? this : (data) -> {
Object result = this.convert(data);
return result == null && data != null ? fallback.convert(data) : result;
};
}
default ValueConverter and(ValueConverter delegate) {
return delegate == null ? this : (data) -> {
return delegate.convert(this.convert(data));
};
}
default ValueConverter nullOr() {
return (data) -> {
return data == null ? null : this.convert(data);
};
}
static ValueConverter passthrough() {
return (data) -> {
return data;
};
}
}
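How `or`, `and` and `nullOr` compose is easiest to see on a toy converter. The sketch below copies the three default methods onto a hypothetical `Conv` interface (a stand-in so the example runs without Debezium on the classpath). `PARSE_INT` and `DOUBLE_IT` are made-up converters for illustration.

```java
// Stand-in for the composition helpers of ValueConverter; Conv is not a Debezium type.
final class ValueConverterCompositionSketch {

    @FunctionalInterface
    interface Conv {
        Object convert(Object data);

        /** Falls back to another converter when this one yields null for non-null input. */
        default Conv or(Conv fallback) {
            return fallback == null ? this : data -> {
                Object result = convert(data);
                return result == null && data != null ? fallback.convert(data) : result;
            };
        }

        /** Chains: the delegate receives this converter's output. */
        default Conv and(Conv delegate) {
            return delegate == null ? this : data -> delegate.convert(convert(data));
        }

        /** Short-circuits null input so the wrapped converter never sees it. */
        default Conv nullOr() {
            return data -> data == null ? null : convert(data);
        }
    }

    /** Parses text to Integer, returning null when the text is not a number. */
    static final Conv PARSE_INT = data -> {
        try {
            return Integer.valueOf(data.toString().trim());
        } catch (NumberFormatException e) {
            return null;
        }
    };

    /** Doubles an Integer. */
    static final Conv DOUBLE_IT = data -> ((Integer) data) * 2;

    public static void main(String[] args) {
        System.out.println(PARSE_INT.and(DOUBLE_IT).convert(" 21 "));
        System.out.println(PARSE_INT.or(data -> -1).convert("abc"));
        System.out.println(PARSE_INT.nullOr().convert(null));
    }
}
```

The MySQL converter shown later uses `and` in exactly this chained sense: one step turns a java.sql.Timestamp into a LocalDateTime, and the next step turns that into an epoch value.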
- Places that implement the ValueConverter interface (not exhaustive)
io.debezium.relational.TableSchemaBuilder#wrapInMappingConverterIfNeeded(ColumnMappers mappers, TableId tableId, Column column, ValueConverter converter)
/**
* Obtain the array of converters for each column in a row. A converter might be null if the column is not be included in the records.
*
* @param schema the schema; may not be null
* @param tableId the identifier of the table that contains the columns
* @param columns the columns in the row; may not be null
* @param mappers the mapping functions for columns; may be null if none of the columns are to be mapped to different values
* @return the converters for each column in the rows; never null
*/
// Note: this method is called from:
//io.debezium.relational.TableSchemaBuilder#createKeyGenerator
// ValueConverter[] converters = this.convertersForColumns(schema, columnSetName, columns, (ColumnMappers)null);
//io.debezium.relational.TableSchemaBuilder#createValueGenerator
// ValueConverter[] converters = this.convertersForColumns(schema, tableId, columnsThatShouldBeAdded, mappers);
protected ValueConverter[] convertersForColumns(Schema schema, TableId tableId, List<Column> columns, ColumnMappers mappers) {
ValueConverter[] converters = new ValueConverter[columns.size()];
for (int i = 0; i < columns.size(); i++) {
Column column = columns.get(i);
ValueConverter converter = createValueConverterFor(tableId, column, schema.field(fieldNamer.fieldNameFor(column)));
converter = wrapInMappingConverterIfNeeded(mappers, tableId, column, converter);
if (converter == null) {
LOGGER.warn(
"No converter found for column {}.{} of type {}. The column will not be part of change events for that table.",
tableId, column.name(), column.typeName());
}
// may be null if no converter found
converters[i] = converter;
}
return converters;
}
private ValueConverter wrapInMappingConverterIfNeeded(ColumnMappers mappers, TableId tableId, Column column, ValueConverter converter) {
if (mappers == null || converter == null) {
return converter;
}
ValueConverter mappingConverter = mappers.mappingConverterFor(tableId, column);
if (mappingConverter == null) {
return converter;
}
return (value) -> mappingConverter.convert(converter.convert(value));
}
io.debezium.relational.CustomConverterRegistry#getValueConverter
public Optional<ValueConverter> getValueConverter(TableId table, Column column) {
final ConverterDefinition<SchemaBuilder> converterDefinition = conversionFunctionMap.get(fullColumnName(table, column));
if (converterDefinition == null) {
return Optional.empty();
}
return Optional.of(x -> {
return converterDefinition.converter.convert(x);
});
}
Debezium CustomConverterRegistry: the registry of custom column value converters
io.debezium.relational.CustomConverterRegistry is held as an internal field (customConverterRegistry) of io.debezium.relational.TableSchemaBuilder.
TableSchemaBuilder#createValueConverterFor consults the customConverterRegistry field to obtain the custom column value converter. The upstream call chain of createValueConverterFor is:
//io.debezium.relational.TableSchemaBuilder
protected ValueConverter[] convertersForColumns(Schema schema, TableId tableId, List<Column> columns, ColumnMappers mappers)// obtains the value converter for each column in columns
...
ValueConverter converter = this.createValueConverterFor(tableId, column, schema.field(this.fieldNamer.fieldNameFor(column)));
converter = this.wrapInMappingConverterIfNeeded(mappers, tableId, column, converter);
...
public Optional<ValueConverter> getValueConverter(TableId table, Column column): obtains the custom column value converter registered for the target column
/**
* Obtain a pre-registered converter for a given column.
*
* @param table the table that contains the column
* @param column the column metadata
* @return the the value converter or empty if converter does not support the column
*/
public Optional<ValueConverter> getValueConverter(TableId table, Column column) {// obtains the custom column value converter of the target column
final ConverterDefinition<SchemaBuilder> converterDefinition = conversionFunctionMap.get(fullColumnName(table, column));
if (converterDefinition == null) {
return Optional.empty();
}
return Optional.of(x -> {// this lambda is the ValueConverter implementation
// the commented-out lines are debug code added by the author
//Converter converter = converterDefinition.converter;
//log.info("getValueConverter | columnName:{} ,typeName:{} ,jdbcType:{} | converterClass:{} | x:{}"
// , column.name(), column.typeName(), column.jdbcType()
// , converter.getClass().getCanonicalName()
// , x
//);
return converterDefinition.converter.convert(x);
});
}
CustomConverterRegistry.ConverterDefinition: an inner class of CustomConverterRegistry
//io.debezium.relational.CustomConverterRegistry.ConverterDefinition
/**
* Class binding together the schema of the conversion result and the converter code.
*
* @param <S> schema describing the output type, usually {@link org.apache.kafka.connect.data.SchemaBuilder}
*/
public class ConverterDefinition<S> {
public final S fieldSchema;
public final CustomConverter.Converter converter;
public ConverterDefinition(S fieldSchema, CustomConverter.Converter converter) {
this.fieldSchema = fieldSchema;
this.converter = converter;
}
}
Debezium's MySqlValueConverters extends JdbcValueConverters: Debezium's default implementation for converting database column values to Java values
Debezium JdbcValueConverters
io.debezium.jdbc.JdbcValueConverters implements io.debezium.relational.ValueConverterProvider
public ValueConverter converter(Column column, Field fieldDefn)
public SchemaBuilder schemaBuilder(Column column)
/**
* Create a new instance that always uses UTC for the default time zone when converting values without timezone information
* to values that require timezones, and uses adapts time and timestamp values based upon the precision of the database
* columns.
*/
public JdbcValueConverters() {
this(null, TemporalPrecisionMode.ADAPTIVE, ZoneOffset.UTC, null, null, null);
}
/**
* Create a new instance, and specify the time zone offset that should be used only when converting values without timezone
* information to values that require timezones. This default offset should not be needed when values are highly-correlated
* with the expected SQL/JDBC types.
*
* @param decimalMode how {@code DECIMAL} and {@code NUMERIC} values should be treated; may be null if
* {@link DecimalMode#PRECISE} is to be used
* @param temporalPrecisionMode temporal precision mode based on {@link io.debezium.jdbc.TemporalPrecisionMode}
* @param defaultOffset the zone offset that is to be used when converting non-timezone related values to values that do
* have timezones; may be null if UTC is to be used
* @param adjuster the optional component that adjusts the local date value before obtaining the epoch day; may be null if no
* adjustment is necessary
* @param bigIntUnsignedMode how {@code BIGINT UNSIGNED} values should be treated; may be null if
* {@link BigIntUnsignedMode#PRECISE} is to be used
* @param binaryMode how binary columns should be represented
*/
public JdbcValueConverters(DecimalMode decimalMode, TemporalPrecisionMode temporalPrecisionMode, ZoneOffset defaultOffset,
TemporalAdjuster adjuster, BigIntUnsignedMode bigIntUnsignedMode, BinaryHandlingMode binaryMode) {
this.defaultOffset = defaultOffset != null ? defaultOffset : ZoneOffset.UTC;// time-zone related
this.adaptiveTimePrecisionMode = temporalPrecisionMode.equals(TemporalPrecisionMode.ADAPTIVE);
this.adaptiveTimeMicrosecondsPrecisionMode = temporalPrecisionMode.equals(TemporalPrecisionMode.ADAPTIVE_TIME_MICROSECONDS);
this.decimalMode = decimalMode != null ? decimalMode : DecimalMode.PRECISE;
this.adjuster = adjuster;
this.bigIntUnsignedMode = bigIntUnsignedMode != null ? bigIntUnsignedMode : BigIntUnsignedMode.PRECISE;
this.binaryMode = binaryMode != null ? binaryMode : BinaryHandlingMode.BYTES;
this.fallbackTimestampWithTimeZone = ZonedTimestamp.toIsoString(// time-zone related
OffsetDateTime.of(LocalDate.ofEpochDay(0), LocalTime.MIDNIGHT, defaultOffset),
defaultOffset,
adjuster);
this.fallbackTimeWithTimeZone = ZonedTime.toIsoString(// time-zone related
OffsetTime.of(LocalTime.MIDNIGHT, defaultOffset),
defaultOffset,
adjuster);
}
public ValueConverter converter(Column column, Field fieldDefn) {
switch (column.jdbcType()) {
...
// Date and time values
case Types.DATE:
if (adaptiveTimePrecisionMode || adaptiveTimeMicrosecondsPrecisionMode) {
return (data) -> convertDateToEpochDays(column, fieldDefn, data);
}
return (data) -> convertDateToEpochDaysAsDate(column, fieldDefn, data);
case Types.TIME:
return (data) -> convertTime(column, fieldDefn, data);
case Types.TIMESTAMP:
if (adaptiveTimePrecisionMode || adaptiveTimeMicrosecondsPrecisionMode) {
if (getTimePrecision(column) <= 3) {
return data -> convertTimestampToEpochMillis(column, fieldDefn, data); //dbz.Timestamp => long
}
if (getTimePrecision(column) <= 6) {
return data -> convertTimestampToEpochMicros(column, fieldDefn, data);//dbz.MicroTimestamp => long
}
return (data) -> convertTimestampToEpochNanos(column, fieldDefn, data);//dbz.NanoTimestamp => long
}
return (data) -> convertTimestampToEpochMillisAsDate(column, fieldDefn, data);//dbz.Timestamp => java.util.Date
case Types.TIME_WITH_TIMEZONE:
return (data) -> convertTimeWithZone(column, fieldDefn, data);
case Types.TIMESTAMP_WITH_TIMEZONE:
return (data) -> convertTimestampWithZone(column, fieldDefn, data);
// Other types ...
}
// MySQL TIMESTAMP type => java.sql.Types.TIMESTAMP_WITH_TIMEZONE(2014) => java.lang.String : convertTimestampWithZone(column, fieldDefn, data)
protected Object convertTimestampWithZone(Column column, Field fieldDefn, Object data) {
return convertValue(column, fieldDefn, data, fallbackTimestampWithTimeZone, (r) -> {
try {
r.deliver(ZonedTimestamp.toIsoString(data, defaultOffset, adjuster));// returns java.lang.String
}
catch (IllegalArgumentException e) {
}
});
}
// MySQL DATETIME type => java.sql.Types.TIMESTAMP(93) => [case 1] java.lang.Long: convertTimestampToEpochMillis(column, fieldDefn, data)
protected Object convertTimestampToEpochMillis(Column column, Field fieldDefn, Object data) {
// epoch is the fallback value
return convertValue(column, fieldDefn, data, 0L, (r) -> {
try {
r.deliver(Timestamp.toEpochMillis(data, adjuster));// Timestamp: io.debezium.time.Timestamp; returns long
}
catch (IllegalArgumentException e) {
}
});
}
// MySQL DATETIME type => java.sql.Types.TIMESTAMP(93) => [case 2] java.lang.Long: convertTimestampToEpochMicros(column, fieldDefn, data)
protected Object convertTimestampToEpochMicros(Column column, Field fieldDefn, Object data) {
// epoch is the fallback value
return convertValue(column, fieldDefn, data, 0L, (r) -> {
try {
r.deliver(MicroTimestamp.toEpochMicros(data, adjuster));// MicroTimestamp: io.debezium.time.MicroTimestamp; returns long
}
catch (IllegalArgumentException e) {
}
});
}
// MySQL DATETIME type => java.sql.Types.TIMESTAMP(93) => [case 3] java.lang.Long: convertTimestampToEpochNanos(column, fieldDefn, data)
protected Object convertTimestampToEpochNanos(Column column, Field fieldDefn, Object data) {
// epoch is the fallback value
return convertValue(column, fieldDefn, data, 0L, (r) -> {
try {
r.deliver(NanoTimestamp.toEpochNanos(data, adjuster));// NanoTimestamp: io.debezium.time.NanoTimestamp; returns long
}
catch (IllegalArgumentException e) {
}
});
}
// MySQL DATETIME type => java.sql.Types.TIMESTAMP(93) => [case 4] java.util.Date: convertTimestampToEpochMillisAsDate(column, fieldDefn, data)
protected Object convertTimestampToEpochMillisAsDate(Column column, Field fieldDefn, Object data) {
// epoch is the fallback value
return convertValue(column, fieldDefn, data, new java.util.Date(0L), (r) -> {
try {
r.deliver(new java.util.Date(Timestamp.toEpochMillis(data, adjuster)));// Timestamp: io.debezium.time.Timestamp; returns java.util.Date
}
catch (IllegalArgumentException e) {
}
});
}
protected Object convertValue(Column column, Field fieldDefn, Object data, Object fallback, ValueConversionCallback callback) {
if (data == null) {
if (column.isOptional()) {
return null;
} else {
Object schemaDefault = fieldDefn.schema().defaultValue();
return schemaDefault != null ? schemaDefault : fallback;
}
} else {
this.logger.trace("Value from data object: *** {} ***", data);
ResultReceiver r = ResultReceiver.create();
callback.convert(r);
this.logger.trace("Callback is: {}", callback);
this.logger.trace("Value from ResultReceiver: {}", r);
return r.hasReceived() ? r.get() : this.handleUnknownData(column, fieldDefn, data);
}
}
io.debezium.time.Timestamp
public static long toEpochMillis(Object value, TemporalAdjuster adjuster) {
if (value instanceof Long) {
return (Long) value;
}
LocalDateTime dateTime = Conversions.toLocalDateTime(value);
if (adjuster != null) {
dateTime = dateTime.with(adjuster);
}
return dateTime.toInstant(ZoneOffset.UTC).toEpochMilli();
}
io.debezium.time.MicroTimestamp
public static long toEpochMicros(Object value, TemporalAdjuster adjuster) {
LocalDateTime dateTime = Conversions.toLocalDateTime(value);
if (adjuster != null) {
dateTime = dateTime.with(adjuster);
}
return Conversions.toEpochMicros(dateTime.toInstant(ZoneOffset.UTC));// interpreted in the UTC+0 time zone
}
io.debezium.time.NanoTimestamp
https://github.com/debezium/debezium/blob/v1.4.1.Final/debezium-core/src/main/java/io/debezium/time/NanoTimestamp.java
https://github.com/debezium/debezium/blob/v1.4.1.Final/debezium-core/src/main/java/io/debezium/time/Conversions.java
public static long toEpochNanos(Object value, TemporalAdjuster adjuster) {
LocalDateTime dateTime = Conversions.toLocalDateTime(value);
if (adjuster != null) {
dateTime = dateTime.with(adjuster);
}
return toEpochNanos(dateTime);
}
private static long toEpochNanos(LocalDateTime timestamp) {
long nanoInDay = timestamp.toLocalTime().toNanoOfDay();
long nanosOfDay = toEpochNanos(timestamp.toLocalDate());
return nanosOfDay + nanoInDay;
}
private static long toEpochNanos(LocalDate date) {
long epochDay = date.toEpochDay();
return epochDay * Conversions.NANOSECONDS_PER_DAY;
}
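The three precisions above differ only in the unit, not in the time-zone assumption: each one reads the LocalDateTime as if it were UTC. A minimal java.time-only sketch of the same arithmetic follows. `EpochPrecisionSketch` and its method names are made up for this example, and the input is the createTime value used in the experiments later in this article.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

// java.time-only re-creation of the millis/micros/nanos arithmetic; not Debezium code.
final class EpochPrecisionSketch {
    static final long NANOS_PER_DAY = 86_400L * 1_000_000_000L;

    /** Milliseconds since the epoch, reading the wall-clock value as UTC. */
    static long epochMillis(LocalDateTime t) {
        return t.toInstant(ZoneOffset.UTC).toEpochMilli();
    }

    /** Microseconds since the epoch, reading the wall-clock value as UTC. */
    static long epochMicros(LocalDateTime t) {
        return ChronoUnit.MICROS.between(Instant.EPOCH, t.toInstant(ZoneOffset.UTC));
    }

    /** Nanoseconds since the epoch: whole days in nanos plus the nano-of-day part. */
    static long epochNanos(LocalDateTime t) {
        return t.toLocalDate().toEpochDay() * NANOS_PER_DAY + t.toLocalTime().toNanoOfDay();
    }

    public static void main(String[] args) {
        LocalDateTime createTime = LocalDateTime.of(2024, 1, 31, 17, 56, 43, 717_000_000);
        System.out.println(epochMillis(createTime)); // 1706723803717
        System.out.println(epochMicros(createTime)); // 1706723803717000
        System.out.println(epochNanos(createTime));  // 1706723803717000000
    }
}
```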
Debezium MySqlValueConverters
io.debezium.connector.mysql.MySqlValueConverters extends JdbcValueConverters
...
import java.sql.Timestamp;
...
public MySqlValueConverters(DecimalMode decimalMode, TemporalPrecisionMode temporalPrecisionMode, BigIntUnsignedMode bigIntUnsignedMode,
BinaryHandlingMode binaryMode,
TemporalAdjuster adjuster, ParsingErrorHandler parsingErrorHandler) {
super(decimalMode, temporalPrecisionMode, ZoneOffset.UTC, adjuster, bigIntUnsignedMode, binaryMode);// defaultOffset is hard-coded here to ZoneOffset.UTC (UTC+0)
this.parsingErrorHandler = parsingErrorHandler;
}
@Override
public ValueConverter converter(Column column, Field fieldDefn) {
...
// We have to convert bytes encoded in the column's character set ...
switch (column.jdbcType()) {
// Types is java.sql.Types
case Types.CHAR: // variable-length
case Types.VARCHAR: // variable-length
case Types.LONGVARCHAR: // variable-length
case Types.CLOB: // variable-length
case Types.NCHAR: // fixed-length
case Types.NVARCHAR: // fixed-length
case Types.LONGNVARCHAR: // fixed-length
case Types.NCLOB: // fixed-length
case Types.DATALINK:
case Types.SQLXML:
Charset charset = charsetFor(column);
if (charset != null) {
logger.debug("Using {} charset by default for column: {}", charset, column);
return (data) -> convertString(column, fieldDefn, charset, data);
}
logger.warn("Using UTF-8 charset by default for column without charset: {}", column);
return (data) -> convertString(column, fieldDefn, StandardCharsets.UTF_8, data);
case Types.TIME: // java.sql.Types#TIME (92)
if (adaptiveTimeMicrosecondsPrecisionMode) {
return data -> convertDurationToMicroseconds(column, fieldDefn, data);
}
case Types.TIMESTAMP: // java.sql.Types#TIMESTAMP (93)
return ((ValueConverter) (data -> convertTimestampToLocalDateTime(column, fieldDefn, data))).and(super.converter(column, fieldDefn));
// calls the convertTimestampToLocalDateTime method
default:
break;
}
// Otherwise, let the base class handle it ...
return super.converter(column, fieldDefn);
}
protected Object convertTimestampToLocalDateTime(Column column, Field fieldDefn, Object data) {
if (data == null && !fieldDefn.schema().isOptional()) {
return null;
}
if (!(data instanceof Timestamp)) {
return data;
}
return ((Timestamp) data).toLocalDateTime();
}
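Put together, the default DATETIME path is a two-step chain: java.sql.Timestamp to LocalDateTime, then LocalDateTime to an epoch value read as UTC. The sketch below re-creates that chain with JDK classes only. `DatetimeDefaultPathSketch` and `defaultEpochMillis` are hypothetical names, and the millisecond branch is assumed (column precision of 3 or less).

```java
import java.sql.Timestamp;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// JDK-only re-creation of the default DATETIME chain; not Debezium code.
final class DatetimeDefaultPathSketch {

    /** Step 1: keep the wall-clock fields. Step 2: read them as UTC to get epoch millis. */
    static long defaultEpochMillis(Object jdbcValue) {
        Object afterStep1 = jdbcValue instanceof Timestamp
                ? ((Timestamp) jdbcValue).toLocalDateTime()
                : jdbcValue;
        if (!(afterStep1 instanceof LocalDateTime)) {
            throw new IllegalArgumentException("unsupported input: " + jdbcValue);
        }
        return ((LocalDateTime) afterStep1).toInstant(ZoneOffset.UTC).toEpochMilli();
    }

    public static void main(String[] args) {
        // Wall-clock value as stored in the DATETIME column, regardless of the server time zone.
        Timestamp fromJdbc = Timestamp.valueOf("2024-01-31 17:56:43.717");
        System.out.println(defaultEpochMillis(fromJdbc)); // 1706723803717
    }
}
```

Because the wall-clock fields are kept and then labelled UTC, the result does not depend on the server's zone, which matches the value 1706723803717 seen in both experiments later in this article.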
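Worked example
Example: MySqlDateTimeConverter, a MySQL date/time column value converter
Step0 Requirements analysis: Debezium's default handling of date/time columns
- Default conversion strategy for MySQL
When the MySQL connector starts, the converters are initialized during the snapshot phase and then initialized once more during the binlog phase (using different classes).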
MySQL column type | Type at snapshot time (jdbcType), i.e. the raw type before Debezium TableSchemaBuilder conversion | Conversion applied by Debezium JdbcValueConverters | Column value type in SourceRecord |
---|---|---|---|
DATE | java.time.LocalDate (java.sql.Types#TIMESTAMP/93)? To be verified | unknown | unknown |
TIME | java.time.Duration (java.sql.Types#TIME/92)? To be verified | unknown | unknown |
DATETIME | java.sql.Timestamp (java.sql.Types#TIMESTAMP/93) | io.debezium.time.Timestamp => returns long; io.debezium.time.MicroTimestamp => returns long; io.debezium.time.NanoTimestamp => returns long; java.util.Date => returns java.util.Date | java.lang.Long |
TIMESTAMP | java.sql.Timestamp (java.sql.Types#TIMESTAMP_WITH_TIMEZONE/2014) | io.debezium.time.ZonedTimestamp => returns String | java.lang.String |
- Sample data from the MySQL sample table
MySQL time zone settings: system_time_zone +08 | time_zone SYSTEM
id [bigint(20)] = 38
createTime [datetime(3)] = '2024-01-31 17:56:43.717000000'
=> converted to an epoch timestamp in the UTC+8 zone: 1706695003717
createTimeTs [timestamp] = '2024-01-31 17:56:44'
=> converted to an epoch timestamp in the UTC+8 zone: 1706695004000 (milliseconds)
select UNIX_TIMESTAMP(createTimeTs) = 1706695004 (seconds)
- Experiment 1: MySQL time zone (time_zone) = UTC+8 | Debezium/FlinkCDC MySQLSource.serverTimeZone = utc+8, no custom date/time column value converter configured
[2024/12/04 20:42:26.248] [TRACE] [debezium-mysqlconnector-mysql_binlog_source-snapshot] [io.debezium.connector.mysql.MySqlValueConverters :1289 convertValue] Value from ResultReceiver: [received = true, object = 1706723803717]
[2024/12/04 20:42:26.248] [INFO ] [debezium-mysqlconnector-mysql_binlog_source-snapshot] [io.debezium.relational.TableSchemaBuilder :162 lambda$createValueGenerator$5] columnName : createTime | typeName : DATETIME, jdbcType : 93 | converterClass:io.debezium.relational.ValueConverter$$Lambda$986/143348969 | oldValue: 2024-01-31T17:56:43.717+0800(class:java.sql.Timestamp) , newValue:1706723803717(class:java.lang.Long)
[2024/12/04 20:42:26.305] [TRACE] [debezium-mysqlconnector-mysql_binlog_source-snapshot] [io.debezium.connector.mysql.MySqlValueConverters :1289 convertValue] Value from ResultReceiver: [received = true, object = 2024-01-31T09:56:44Z]
[2024/12/04 20:42:26.305] [INFO ] [debezium-mysqlconnector-mysql_binlog_source-snapshot] [io.debezium.relational.TableSchemaBuilder :162 lambda$createValueGenerator$5] columnName : createTimeTs | typeName : TIMESTAMP, jdbcType : 2014 | converterClass:io.debezium.jdbc.JdbcValueConverters$$Lambda$942/460149074 | oldValue: 2024-01-31T17:56:44.000+0800(class:java.sql.Timestamp) , newValue:2024-01-31T09:56:44Z(class:java.lang.String)
[2024/12/04 21:29:03.868] [INFO ] [debezium-engine] [com.xxx.cdc.mysql.MysqlCdcDeserializationSchema :71 deserialize] id: 38, createTime: 1706723803717, type: java.lang.Long
[2024/12/04 21:29:03.869] [TRACE] [debezium-mysqlconnector-mysql_binlog_source-snapshot] [io.debezium.connector.mysql.MySqlValueConverters :1288 convertValue] Callback is: io.debezium.jdbc.JdbcValueConverters$$Lambda$1015/2001322212@30fe60b4
[2024/12/04 21:29:03.869] [INFO ] [debezium-engine] [com.xxx.cdc.mysql.MysqlCdcDeserializationSchema :72 deserialize] id: 38, createTimeTs: 2024-01-31T09:56:44Z, type: java.lang.String
- Experiment 2: MySQL time zone (time_zone) = UTC+8 | Debezium/FlinkCDC MySQLSource.serverTimeZone = utc, no custom date/time column value converter configured
[2024/12/04 20:58:19.316] [TRACE] [debezium-mysqlconnector-mysql_binlog_source-snapshot] [io.debezium.connector.mysql.MySqlValueConverters :1289 convertValue] Value from ResultReceiver: [received = true, object = 1706723803717]
[2024/12/04 20:58:19.317] [INFO ] [debezium-mysqlconnector-mysql_binlog_source-snapshot] [io.debezium.relational.TableSchemaBuilder :162 lambda$createValueGenerator$5] columnName : createTime | typeName : DATETIME, jdbcType : 93 | converterClass:io.debezium.relational.ValueConverter$$Lambda$958/157024793 | oldValue: 2024-01-31T17:56:43.717+0800(class:java.sql.Timestamp) , newValue:1706723803717(class:java.lang.Long)
[2024/12/04 20:58:19.342] [TRACE] [debezium-mysqlconnector-mysql_binlog_source-snapshot] [io.debezium.connector.mysql.MySqlValueConverters :1289 convertValue] Value from ResultReceiver: [received = true, object = 2024-01-31T17:56:44Z]
[2024/12/04 20:58:19.342] [INFO ] [debezium-mysqlconnector-mysql_binlog_source-snapshot] [io.debezium.relational.TableSchemaBuilder :162 lambda$createValueGenerator$5] columnName : createTimeTs | typeName : TIMESTAMP, jdbcType : 2014 | converterClass:io.debezium.jdbc.JdbcValueConverters$$Lambda$915/1619603243 | oldValue: 2024-02-01T01:56:44.000+0800(class:java.sql.Timestamp) , newValue:2024-01-31T17:56:44Z(class:java.lang.String)
[2024/12/04 21:16:56.082] [INFO ] [debezium-engine] [com.xxx.cdc.mysql.MysqlCdcDeserializationSchema :71 deserialize] id: 38, createTime: 1706723803717, type: java.lang.Long
[2024/12/04 21:16:56.082] [TRACE] [debezium-mysqlconnector-mysql_binlog_source-snapshot] [io.debezium.connector.mysql.MySqlValueConverters :1288 convertValue] Callback is: io.debezium.jdbc.JdbcValueConverters$$Lambda$1075/1910762610@4f2229a4
[2024/12/04 21:16:56.082] [INFO ] [debezium-engine] [com.xxx.cdc.mysql.MysqlCdcDeserializationSchema :72 deserialize] id: 38, createTimeTs: 2024-01-31T17:56:44Z, type: java.lang.String
- SQL Server conversion
See https://cloud.tencent.com/developer/article/2216144 (for reference only)
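In both experiments createTime arrives as 1706723803717, which is 8 hours (28,800,000 ms) ahead of the expected 1706695003717. A downstream consumer that knows the server zone could undo the shift as sketched below. This is one possible workaround under that assumption, not something Debezium or Flink CDC provides. `DatetimeZoneFixSketch` and `reinterpret` are names invented for the example.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Downstream correction sketch: re-read a "wall clock labelled as UTC" epoch value in the real zone.
final class DatetimeZoneFixSketch {

    /** Recovers the wall-clock fields from the UTC-labelled millis, then applies the server zone. */
    static long reinterpret(long utcLabelledMillis, ZoneId serverZone) {
        LocalDateTime wallClock =
                LocalDateTime.ofInstant(Instant.ofEpochMilli(utcLabelledMillis), ZoneOffset.UTC);
        return wallClock.atZone(serverZone).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        long fromDebezium = 1706723803717L; // createTime as logged in the experiments
        long corrected = reinterpret(fromDebezium, ZoneId.of("Asia/Shanghai"));
        System.out.println(corrected);                // 1706695003717
        System.out.println(fromDebezium - corrected); // 28800000
    }
}
```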
Step1 Add the Debezium dependencies
<dependency>
    <groupId>io.debezium</groupId>
    <artifactId>debezium-api</artifactId>
    <version>${debezium.version}</version>
</dependency>
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>connect-api</artifactId>
    <version>${kafka.version}</version>
</dependency>
- debezium.version = 1.4.1.Final (the same Debezium version that flink cdc 1.3.0 bundles)
- kafka.version = 2.6.1
Step2 Implement a custom Debezium CustomConverter
import io.debezium.spi.converter.CustomConverter;
import io.debezium.spi.converter.RelationalColumn;
import org.apache.kafka.connect.data.SchemaBuilder;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.time.*;
import java.time.format.DateTimeFormatter;
import java.util.Properties;
import java.util.function.Consumer;
/**
* Handles Debezium's date/time conversion issues
*/
public class MySqlDateTimeConverter implements CustomConverter<SchemaBuilder, RelationalColumn> {
private final static Logger logger = LoggerFactory.getLogger(MySqlDateTimeConverter.class);
private DateTimeFormatter dateFormatter = DateTimeFormatter.ISO_DATE;
private DateTimeFormatter timeFormatter = DateTimeFormatter.ISO_TIME;
private DateTimeFormatter datetimeFormatter = DateTimeFormatter.ISO_DATE_TIME;
private DateTimeFormatter timestampFormatter = DateTimeFormatter.ISO_DATE_TIME;
private ZoneId timestampZoneId = ZoneId.systemDefault();
@Override
public void configure(Properties props) {
readProps(props, "format.date", p -> dateFormatter = DateTimeFormatter.ofPattern(p));
readProps(props, "format.time", p -> timeFormatter = DateTimeFormatter.ofPattern(p));
readProps(props, "format.datetime", p -> datetimeFormatter = DateTimeFormatter.ofPattern(p));
readProps(props, "format.timestamp", p -> timestampFormatter = DateTimeFormatter.ofPattern(p));
readProps(props, "format.timestamp.zone", z -> timestampZoneId = ZoneId.of(z));
}
private void readProps(Properties properties, String settingKey, Consumer<String> callback) {
String settingValue = (String) properties.get(settingKey);
if (settingValue == null || settingValue.length() == 0) {
return;
}
try {
callback.accept(settingValue.trim());
} catch (IllegalArgumentException | DateTimeException e) {
logger.error("The {} setting is illegal: {}", settingKey, settingValue);
throw e;
}
}
@Override
public void converterFor(RelationalColumn column, ConverterRegistration<SchemaBuilder> registration) {
String sqlType = column.typeName().toUpperCase();
SchemaBuilder schemaBuilder = null;
Converter converter = null;
if ("DATE".equals(sqlType)) {
schemaBuilder = SchemaBuilder.string().optional().name("com.unicdata.debezium.date.string");
converter = this::convertDate;
}
if ("TIME".equals(sqlType)) {
schemaBuilder = SchemaBuilder.string().optional().name("com.unicdata.debezium.time.string");
converter = this::convertTime;
}
if ("DATETIME".equals(sqlType)) {
schemaBuilder = SchemaBuilder.string().optional().name("com.unicdata.debezium.datetime.string");
converter = this::convertDateTime;
}
if ("TIMESTAMP".equals(sqlType)) {
schemaBuilder = SchemaBuilder.string().optional().name("com.unicdata.debezium.timestamp.string");
converter = this::convertTimestamp;
}
if (schemaBuilder != null) {
registration.register(schemaBuilder, converter);
}
}
private String convertDate(Object input) {
if (input == null) {
return null;
}
if (input instanceof LocalDate) {
return dateFormatter.format((LocalDate) input);
}
if (input instanceof Integer) {
LocalDate date = LocalDate.ofEpochDay((Integer) input);
return dateFormatter.format(date);
}
return String.valueOf(input);
}
private String convertTime(Object input) {
if (input == null) {
return null;
}
if (input instanceof Duration) {
Duration duration = (Duration) input;
long seconds = duration.getSeconds();
int nano = duration.getNano();
LocalTime time = LocalTime.ofSecondOfDay(seconds).withNano(nano);
return timeFormatter.format(time);
}
return String.valueOf(input);
}
private String convertDateTime(Object input) {
if (input == null) {
return null;
}
if (input instanceof LocalDateTime) {
return datetimeFormatter.format((LocalDateTime) input);
}
return String.valueOf(input);
}
private String convertTimestamp(Object input) {
if (input == null) {
return null;
}
if (input instanceof ZonedDateTime) {
// MySQL stores TIMESTAMP values in UTC, so every ZonedDateTime arriving here is in UTC
ZonedDateTime zonedDateTime = (ZonedDateTime) input;
LocalDateTime localDateTime = zonedDateTime.withZoneSameInstant(timestampZoneId).toLocalDateTime();
return timestampFormatter.format(localDateTime);
}
return String.valueOf(input);
}
}
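One gap worth checking in your own environment: convertDateTime only formats LocalDateTime, while the experiment logs above show that a DATETIME value reaches the converter chain as java.sql.Timestamp during the snapshot phase. In that case the method falls through to String.valueOf(input). The helper below is a hedged sketch that accepts both input types. `DateTimeInputSketch` and `formatDateTime` are hypothetical names, not part of the original converter.

```java
import java.sql.Timestamp;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: formats a DATETIME value whether it arrives as LocalDateTime or java.sql.Timestamp.
final class DateTimeInputSketch {

    static String formatDateTime(Object input, DateTimeFormatter formatter) {
        if (input == null) {
            return null;
        }
        if (input instanceof LocalDateTime) {        // type listed for mysql-binlog-connector
            return formatter.format((LocalDateTime) input);
        }
        if (input instanceof Timestamp) {            // type seen in the snapshot logs
            return formatter.format(((Timestamp) input).toLocalDateTime());
        }
        return String.valueOf(input);                // unknown type: same fallback as the original
    }

    public static void main(String[] args) {
        DateTimeFormatter f = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
        System.out.println(formatDateTime(LocalDateTime.of(2024, 1, 31, 17, 56, 43), f));
        System.out.println(formatDateTime(Timestamp.valueOf("2024-01-31 17:56:43.717"), f));
    }
}
```

Both calls print the same wall-clock string, so the output format no longer depends on which phase produced the value.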
Step3 Declare the custom column value converter in the Debezium Properties
- Add this configuration at the Source stage
public static Properties getDebeziumProperties() {
Properties properties = new Properties();
properties.setProperty("converters", "dateConverters");
// adjust to the package your class actually lives in
properties.setProperty("dateConverters.type", "com.xxx.bdz.schema.MySqlDateTimeConverter");
properties.setProperty("dateConverters.database.type", "mysql");
properties.setProperty("dateConverters.format.date", "yyyy-MM-dd");
properties.setProperty("dateConverters.format.time", "HH:mm:ss");
properties.setProperty("dateConverters.format.datetime", "yyyy-MM-dd HH:mm:ss");
properties.setProperty("dateConverters.format.timestamp", "yyyy-MM-dd HH:mm:ss");
properties.setProperty("dateConverters.format.timestamp.zone", "UTC+8");
properties.setProperty("debezium.snapshot.locking.mode", "none"); // the global read lock can affect online traffic, so locking is skipped
properties.setProperty("bigint.unsigned.handling.mode", "long");
properties.setProperty("decimal.handling.mode", "string");
return properties;
}
// usage in flink cdc
//MySqlSource<String> mySqlSource = MySqlSource.<String>builder()
SourceFunction<String> sourceCdc = MySQLSource.<String>builder()
.hostname( appArgs.getHost())
.port(Integer.parseInt( appArgs.getPort()))
.databaseList( appArgs.getDatabaseName()) // set captured database
.tableList( tableList) // set captured table
.username( appArgs.getUserName())
.password( appArgs.getPassword())
//.includeSchemaChanges(true)
.debeziumProperties( getDebeziumProperties())
.deserializer( new JsonDebeziumDeserializationSchema()) // converts SourceRecord to JSON String
.startupOptions( getStartUpMode(appArgs))
.serverTimeZone( "Asia/Shanghai" )
.build();
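Note how the keys line up: the connector properties use `dateConverters.format.date`, while configure() in Step2 reads `format.date`. That works on the assumption that the converter receives its properties with the `<converter name>.` prefix removed. The sketch below only illustrates this naming convention with java.util.Properties. `ConverterPropsSketch` and `scopedTo` are made-up names, and this is not Debezium's own code.

```java
import java.util.Properties;

// Illustration of the "<name>." prefix convention for converter properties; not Debezium code.
final class ConverterPropsSketch {

    /** Returns only the entries under "<converterName>.", with that prefix stripped. */
    static Properties scopedTo(String converterName, Properties all) {
        String prefix = converterName + ".";
        Properties scoped = new Properties();
        for (String key : all.stringPropertyNames()) {
            if (key.startsWith(prefix)) {
                scoped.setProperty(key.substring(prefix.length()), all.getProperty(key));
            }
        }
        return scoped;
    }

    public static void main(String[] args) {
        Properties all = new Properties();
        all.setProperty("converters", "dateConverters");
        all.setProperty("dateConverters.format.date", "yyyy-MM-dd");
        all.setProperty("dateConverters.format.timestamp.zone", "UTC+8");
        all.setProperty("decimal.handling.mode", "string");

        Properties forConverter = scopedTo("dateConverters", all);
        System.out.println(forConverter.getProperty("format.date"));
        System.out.println(forConverter.getProperty("decimal.handling.mode"));
    }
}
```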
X References
- Chapter 13. Developing Debezium custom data type converters - Red Hat [recommended]
- debezium
- https://debezium.io/documentation/reference/stable/development/converters.html [recommended]
- https://github.com/debezium/debezium/blob/v1.4.1.Final/pom.xml
- https://github.com/debezium/debezium/blob/v1.4.1.Final/debezium-api/src/main/java/io/debezium/spi/converter/CustomConverter.java [recommended]
io.debezium.connector.mysql.MySqlValueConverters
- https://github.com/debezium/debezium/blob/v1.4.1.Final/debezium-core/src/main/java/io/debezium/relational/TableSchemaBuilder.java
- openjdk
public class MySqlDateTimeConverter implements CustomConverter<SchemaBuilder, RelationalColumn>
By default Debezium converts MySQL datetime values to a UTC epoch timestamp ({@link io.debezium.time.Timestamp}); the time zone is hard-coded and cannot be changed.
As a result, a value stored in a database configured for UTC+8 arrives in Kafka as a long timestamp that is eight hours ahead.
By default Debezium converts MySQL timestamp values to a time string.
mysql | mysql-binlog-connector | debezium |
---|---|---|
date (2021-01-28) | LocalDate (2021-01-28) | Integer (18655) |
time (17:29:04) | Duration (PT17H29M4S) | Long (62944000000) |
timestamp (2021-01-28 17:29:04) | ZonedDateTime (2021-01-28T09:29:04Z) | String (2021-01-28T09:29:04Z) |
datetime (2021-01-28 17:29:04) | LocalDateTime (2021-01-28T17:29:04) | Long (1611854944000) |
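The debezium column of this table can be reproduced with java.time alone, which makes the units explicit: days since the epoch for date, microseconds for time, an ISO-8601 UTC string for timestamp, and milliseconds read as UTC for datetime. The sketch below is illustrative only. `DefaultRepresentationSketch` and its method names are invented here, and the server zone UTC+8 is taken from the table's own timestamp row.

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// java.time-only reproduction of the values in the table above; not Debezium code.
final class DefaultRepresentationSketch {

    /** date: number of days since 1970-01-01. */
    static int dateAsEpochDays(LocalDate date) {
        return Math.toIntExact(date.toEpochDay());
    }

    /** time: duration since midnight in microseconds. */
    static long timeAsMicros(Duration time) {
        return time.toNanos() / 1_000;
    }

    /** timestamp: the instant, printed as an ISO-8601 string in UTC. */
    static String timestampAsIsoUtc(LocalDateTime local, ZoneId serverZone) {
        return local.atZone(serverZone).toInstant().toString();
    }

    /** datetime: wall-clock fields read as UTC, in milliseconds. */
    static long datetimeAsEpochMillis(LocalDateTime local) {
        return local.toInstant(ZoneOffset.UTC).toEpochMilli();
    }

    public static void main(String[] args) {
        LocalDateTime local = LocalDateTime.of(2021, 1, 28, 17, 29, 4);
        System.out.println(dateAsEpochDays(local.toLocalDate()));            // 18655
        System.out.println(timeAsMicros(Duration.parse("PT17H29M4S")));      // 62944000000
        System.out.println(timestampAsIsoUtc(local, ZoneOffset.ofHours(8))); // 2021-01-28T09:29:04Z
        System.out.println(datetimeAsEpochMillis(local));                    // 1611854944000
    }
}
```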
Article link: https://www.cnblogs.com/johnnyzen
Copyright notice: unless stated otherwise, all posts on this blog are licensed under BY-NC-SA. Please credit the source when reposting.