使用Kettle编写抽取数据

目录

环境

工程搭建

引入kettle的jar包

引入辅助包

 3. 添加mysql连接jar包

4. 添加lombok包

创建数据库和表

代码分析

初始话kettle环境

定义数据库信息

    3. mysql处理

   4. 生成表输入

5. 设置查询SQL

6. 将操作添加到步骤中

7. 选择字段

8. 将操作添加到步骤

9.  将步骤串联起来

10. 字符串替换

11. 将结果变为json字符串

12. 将记录变为结果

13. 替换SQL中变量

14 设置执行参数,添加监听

15. 执行抽取

16. 获取执行结果

17. 打印输出

18. 前置参数准备,仅用于测试

完整样例

DatabaseConn.java

ExtractBean.java

FieldTransfer.java

测试类


  • 环境

Eclipse, jdk8,window10

  • 工程搭建

  1. 引入kettle的jar包

    <kettle.version>5.1.0.0-752</kettle.version>
    <dependency>
    			<groupId>pentaho-kettle</groupId>
    			<artifactId>kettle-engine</artifactId>
    			<version>${kettle.version}</version>
    		</dependency>
    		<dependency>
    			<groupId>pentaho-kettle</groupId>
    			<artifactId>kettle-core</artifactId>
    			<version>${kettle.version}</version>
    		</dependency>
  2. 引入辅助包

辅助包就是Kettle中可能使用到的

<dependency>
			<groupId>org.apache.commons</groupId>
			<artifactId>commons-lang3</artifactId>
			<version>3.8</version>
		</dependency>
		<dependency>
			<groupId>commons-vfs</groupId>
			<artifactId>commons-vfs</artifactId>
			<version>20100924-pentaho</version>
		</dependency>
		<dependency>
			<groupId>commons-logging</groupId>
			<artifactId>commons-logging</artifactId>
			<version>1.2</version>
		</dependency>
		<dependency>
			<groupId>com.googlecode.json-simple</groupId>
			<artifactId>json-simple</artifactId>
			<version>1.1.1</version>
		</dependency>

 3. 添加mysql连接jar包

<dependency>
	   <groupId>mysql</groupId>
			<artifactId>mysql-connector-java</artifactId>
			<version>5.1.44</version>
		</dependency>

4. 添加lombok包

lombok是一款生成getter,settter的包,很方便,eclipse还需要安装插件来编译生成getter,setter

<dependency>
			<groupId>org.projectlombok</groupId>
			<artifactId>lombok</artifactId>
			<version>1.18.0</version>
		</dependency>
  • 创建数据库和表

Create database test;
创建数据来源表:
CREATE TABLE `etl_src_table` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(20) DEFAULT NULL,
  `age` int(3) DEFAULT NULL,
  `mail` varchar(40) DEFAULT NULL,
  `phone` varchar(20) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=utf8;

创建目标表:
CREATE TABLE `etl_dest_table` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(20) DEFAULT NULL,
  `age` int(3) DEFAULT NULL,
  `mail` varchar(40) DEFAULT NULL,
  `phone` varchar(20) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
  • 代码分析

  1. 初始话kettle环境

    很重要,否则有时候会有一些莫名其妙错误

    // 初始化Kettle环境

    KettleEnvironment.init();

    EnvUtil.environmentInit();

  2. 定义数据库信息

    此处有两种方法,一种是使用XML,一种构造方法传参

        A)  xml格式定义一个xml报文

/**
     * 数据库连接信息,适用于DatabaseMeta其中 一个构造器DatabaseMeta(String xml)
     */
	protected String databasesXML = 
		"<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
			+ "<connection>"
			+ "<name>{0}</name>"
			+ "<server>{1}</server>"
			+ "<type>{2}</type>"
			+ "<access>{3}</access>"
			+ "<database>{4}</database>"
			+ "<port>{5}</port>"
			+ "<username>{6}</username>"
			+ "<password>{7}</password>"
			+ "</connection>";

然后将值放入

//添加转换的数据库连接
        String xml = MessageFormat.format(databasesXML, new Object[] {
        		srcDBName,
        		srcDB.getServer(),
        		"MySQL",
        		"Native",
        		srcDB.getDatabase(),
        		srcDB.getPort(),
        		srcDB.getUsername(),
        		srcDB.getPassword()
        });
		DatabaseMeta dbMeta = new DatabaseMeta(xml);

 B) 传参方式

final String srcDBName = "srcDB";
		DatabaseMeta dbMeta = new DatabaseMeta(srcDBName, srcDB.getType(),
				srcDB.getAccess(), srcDB.getServer(), srcDB.getDatabase(),
				srcDB.getPort(), srcDB.getUsername(), srcDB.getPassword());

    3. mysql处理

其他数据库可以跳过,为了防止数据库字段查询出来乱码问题

dbMeta.setConnectSQL(MessageFormat.format("set names ''{0}'';",

new Object[] { "utf8" }));

因为mysql数据库是,数据库编码,字段编码,每个字段编码可以不同。

 

   4. 生成表输入

 

TableInputMeta tableInputMeta = new TableInputMeta();
		String tableInputPluginId = registry.getPluginId(StepPluginType.class,
				tableInputMeta);
		// 给表输入添加一个DatabaseMeta连接数据库
		tableInputMeta.setDatabaseMeta(transMeta.findDatabase(srcDBName));

5. 设置查询SQL

// 构造查询SQL
		String selectSql = "select {0} from {1}";
		selectSql = MessageFormat.format(
				selectSql,
				new Object[] {
						StringUtils.join(extractBean.getSrcFields(), ","),
						extractBean.getSrcTable()[0] });
		tableInputMeta.setSQL(selectSql);

上述SQL语句有{0},{1}这种变量,所以还需要设置替换SQL中变量

// 替换SQL语句中变量

tableInputMeta.setVariableReplacementActive(true);

 

6. 将操作添加到步骤中

只有将操作添加到步骤中才可以,这样才能形成一个流程,操作就是流程中的一个节点,和工作流相似。

// 将TableInputMeta添加到转换中
		StepMeta tableInputStepMeta = new StepMeta(tableInputPluginId,
				"tableInput", (StepMetaInterface) tableInputMeta);
		transMeta.addStep(tableInputStepMeta);

7. 选择字段

选择你要获取的字段信息,虽然在sql中设置了查询字段,但是此处是获取查询字段中要获取数据的字段。

相当于查询出name,age,address,phone,sex等个人信息,但是某个流程只需要name即可。

// 字段选择
		SelectValuesMeta selectValuesMeta = new SelectValuesMeta();
		String selectValuesPluginId = registry.getPluginId(
				StepPluginType.class, selectValuesMeta);
		selectValuesMeta.allocate(extractBean.getSrcFields().length, 0, 0);
		selectValuesMeta.setSelectName(extractBean.getSrcFields());

8. 将操作添加到步骤

StepMeta selectValuesStepMeta = new StepMeta(selectValuesPluginId,
				"selectValues", (StepMetaInterface) selectValuesMeta);
		transMeta.addStep(selectValuesStepMeta);

9.  将步骤串联起来

步骤串联起来,就相当于与工作流程中前一个节点到下一个节点那个箭头连线

// 添加hop把两个步骤关联起来
		transMeta.addTransHop(new TransHopMeta(tableInputStepMeta,
				selectValuesStepMeta));

10. 字符串替换

将获取到的内容匹配要替换的字符串,替换为目标字符串。其实就是数据脱敏。

// 字符串替换
		ReplaceStringMeta replaceStringMeta = new ReplaceStringMeta();
		String replaceStringPluginId = registry.getPluginId(
				StepPluginType.class, replaceStringMeta);
		int fieldTransferLength = extractBean.getFieldTransfers().length;
		replaceStringMeta.allocate(fieldTransferLength);
		for (int i = 0; i < fieldTransferLength; i++) {
			FieldTransfer fieldTransfer = extractBean.getFieldTransfers()[i];
			replaceStringMeta.getFieldInStream()[i] = fieldTransfer.getField();
			replaceStringMeta.getReplaceString()[i] = fieldTransfer.getSrc();
			replaceStringMeta.getReplaceByString()[i] = fieldTransfer
					.getTarget();
			//是否使用正则表达式
			replaceStringMeta.getUseRegEx()[i] = fieldTransfer.isRegEx() ? 1
					: 0;
		}

		if (extractBean.getFieldTransfers() == null || fieldTransferLength <= 0) {
			replaceStringMeta.allocate(0);
		}

11. 将结果变为json字符串

outputFields[i] = field; 当初因为少了这么一段而在底层kettle中一直报空指针异常。

// json输出
		JsonOutputMeta jsonOutput = new JsonOutputMeta();
		String jsonOutputPluginId = registry.getPluginId(StepPluginType.class,
				jsonOutput);
		jsonOutput.setOperationType(JsonOutputMeta.OPERATION_TYPE_OUTPUT_VALUE);
		jsonOutput.setJsonBloc("data");
		jsonOutput.setOutputValue("rows");

		int srcFieldLength = extractBean.getSrcFields().length;
		JsonOutputField[] outputFields = new JsonOutputField[srcFieldLength];
		for (int i = 0; i < srcFieldLength; i++) {
			JsonOutputField field = new JsonOutputField();
			field.setFieldName(extractBean.getSrcFields()[i]);
			field.setElementName(extractBean.getSrcFields()[i]);
			
			outputFields[i] = field;
		}
		jsonOutput.setOutputFields(outputFields);

12. 将记录变为结果

将Json输出变为要返回给用户的结果。

// 复制记录到结果
		RowsToResultMeta rowsToResultMeta = new RowsToResultMeta();
		String rowsToResultMetaPluginId = registry.getPluginId(
				StepPluginType.class, rowsToResultMeta);
		// 添加步骤到转换中
		StepMeta rowsToResultStepMeta = new StepMeta(rowsToResultMetaPluginId,
				"rowsToResult", (StepMetaInterface) rowsToResultMeta);
		transMeta.addStep(rowsToResultStepMeta);
		// hop把两个步骤关联起来
		transMeta.addTransHop(new TransHopMeta(jsonOutputStepMeta,
				rowsToResultStepMeta));

13. 替换SQL中变量

这也是为什么要在之前设置替换SQL中变量那个参数

tableInputMeta.setVariableReplacementActive(true);

transMeta.setVariable("VAR_FROM", "0");
		transMeta.setVariable("VAR_TO", "20");

14 设置执行参数,添加监听

Trans trans = new Trans(transMeta);
		trans.setMonitored(true);
		trans.setInitializing(true);
		trans.setPreparing(true);
		trans.setLogLevel(LogLevel.BASIC);
		trans.setRunning(true);
		trans.setSafeModeEnabled(true);

		trans.addTransListener(new TransAdapter() {
			@Override
			public void transFinished(Trans trans) {
				System.out.println("转换执行完成");
			}
		});

15. 执行抽取

try {
			// trans.startThreads();
			trans.execute(null);
		} catch (KettleException e) {
			e.printStackTrace();
		}
		trans.waitUntilFinished();
		if (trans.getErrors() > 0) {
			System.out.println("抽取数据出错.");
			return;
		}

16. 获取执行结果

Result result = trans.getResult();

17. 打印输出

List<RowMetaAndData> list = result.getRows();
		String fieldNames = "";
		if (list != null && list.size() > 0) {
			RowMetaAndData row = list.get(0);
			RowMetaInterface rowMeta = row.getRowMeta();
			Object[] srcFields = ArrayUtils.subarray(rowMeta.getFieldNames(),
					0, this.extractBean.getSrcFields().length);
			fieldNames = StringUtils.join(srcFields, ",");

			Object[] cols = row.getData();
			// 遍历方式获取
			String json = "{}";
			for (int i = 0; i < cols.length; i++) {
				if (cols[i] != null) {
					json = cols[i].toString();
					break;
				}
			}
			System.out.println("抽取数据为:" + json);
		}

18. 前置参数准备,仅用于测试

// 初始化数据源表和目标表数据
		ExtractBean extractBean = new ExtractBean();
		DatabaseConn srcDB = new DatabaseConn();
		srcDB.setDatabase("test");
		srcDB.setServer("127.0.0.1");
		srcDB.setPort("3306");
		srcDB.setUsername("root");
		srcDB.setPassword("root");
		srcDB.setType("MySQL");
		extractBean.setSrcDB(srcDB);
		extractBean
				.setSrcFields(new String[] { "name", "age", "mail", "phone" });
		extractBean.setSrcPk(new String[] { "id" });
		extractBean.setSrcTable(new String[] { "etl_src_table" });

		// 数据转换
		FieldTransfer[] fieldTransfers = new FieldTransfer[2];
		FieldTransfer nameTransfer = new FieldTransfer();
		nameTransfer.setField("name");
		nameTransfer.setSrc("king");
		nameTransfer.setTarget("lw");
		fieldTransfers[0] = nameTransfer;
		FieldTransfer mailTransfer = new FieldTransfer();
		mailTransfer.setField("mail");
		mailTransfer.setSrc("^lw.*@.*//.com$");
		mailTransfer.setTarget("lw***.com");
		mailTransfer.setRegEx(true);
		fieldTransfers[1] = mailTransfer;
		extractBean.setFieldTransfers(fieldTransfers);

		DatabaseConn destDB = new DatabaseConn();
		destDB.setDatabase("test");
		destDB.setServer("127.0.0.1");
		destDB.setPort("3306");
		destDB.setUsername("root");
		destDB.setPassword("root");
		destDB.setType("MySQL");
		extractBean.setDestDB(destDB);
		extractBean
				.setDestFields(new String[] { "name", "age", "mail", "phone" });
		extractBean.setDestPk(new String[] { "id" });
		extractBean.setDestTable("etl_dest_table");
		this.extractBean = extractBean;
  • 完整样例

DatabaseConn.java

package com.lw.kettle;

import lombok.Data;

@Data
public class DatabaseConn {

	/** 数据库服务器IP地址 */
	private String server;

	/** 数据库类型 */
	private String type;

	/** 访问类型(Native,ODBC,JNDI) */
	private String access = "Native";

	/** 数据库名称 */
	private String database;

	/** 连接端口 */
	private String port;

	/** 连接用户名 */
	private String username;

	/** 连接密码 */
	private String password;

}

ExtractBean.java

package com.lw.kettle;

import lombok.Data;

@Data
public class ExtractBean {

	/**
	 * 源表数据库连接
	 */
	private DatabaseConn srcDB;

	/**
	 * 源表表名
	 */
	private String[] srcTable = new String[0];

	/**
	 * 源表交换字段类型
	 */
	private String[] srcFields;

	/**
	 * 源表主键
	 */
	private String[] srcPk;

	/**
	 * 目标表的数据库配置
	 */
	private DatabaseConn destDB;

	/**
	 * 目标表
	 */
	private String destTable;

	/**
	 * 目标表字段
	 */
	private String[] destFields;

	/**
	 * 目标表主键
	 */
	private String[] destPk;
	
	/**
	 * 数据转换
	 */
	private FieldTransfer[] fieldTransfers;
}

FieldTransfer.java

package com.lw.kettle;

import lombok.Data;


/**
 * 字段转换类
 * 
 * @author lenovo
 *
 */
@Data
public class FieldTransfer {

	private String field;

	private String src;

	private String target;

	private boolean regEx = false;
}

测试类

package com.lw.kettle;

import java.text.MessageFormat;
import java.util.List;

import org.apache.commons.lang3.ArrayUtils;
import org.apache.commons.lang3.StringUtils;
import org.junit.Before;
import org.junit.Test;
import org.pentaho.di.core.KettleEnvironment;
import org.pentaho.di.core.Result;
import org.pentaho.di.core.RowMetaAndData;
import org.pentaho.di.core.database.DatabaseMeta;
import org.pentaho.di.core.exception.KettleException;
import org.pentaho.di.core.exception.KettleXMLException;
import org.pentaho.di.core.logging.LogLevel;
import org.pentaho.di.core.plugins.PluginRegistry;
import org.pentaho.di.core.plugins.StepPluginType;
import org.pentaho.di.core.row.RowMetaInterface;
import org.pentaho.di.core.util.EnvUtil;
import org.pentaho.di.trans.Trans;
import org.pentaho.di.trans.TransAdapter;
import org.pentaho.di.trans.TransHopMeta;
import org.pentaho.di.trans.TransMeta;
import org.pentaho.di.trans.step.StepMeta;
import org.pentaho.di.trans.step.StepMetaInterface;
import org.pentaho.di.trans.steps.jsonoutput.JsonOutputField;
import org.pentaho.di.trans.steps.jsonoutput.JsonOutputMeta;
import org.pentaho.di.trans.steps.replacestring.ReplaceStringMeta;
import org.pentaho.di.trans.steps.rowstoresult.RowsToResultMeta;
import org.pentaho.di.trans.steps.selectvalues.SelectValuesMeta;
import org.pentaho.di.trans.steps.tableinput.TableInputMeta;

public class CopyTable {

	ExtractBean extractBean = new ExtractBean();

	@Before
	public void before() {
		try {
			// 初始化Kettle环境
			KettleEnvironment.init();
			EnvUtil.environmentInit();
		} catch (KettleException e) {
			e.printStackTrace();
		}

		// 初始化数据源表和目标表数据
		ExtractBean extractBean = new ExtractBean();
		DatabaseConn srcDB = new DatabaseConn();
		srcDB.setDatabase("test");
		srcDB.setServer("127.0.0.1");
		srcDB.setPort("3306");
		srcDB.setUsername("root");
		srcDB.setPassword("root");
		srcDB.setType("MySQL");
		extractBean.setSrcDB(srcDB);
		extractBean
				.setSrcFields(new String[] { "name", "age", "mail", "phone" });
		extractBean.setSrcPk(new String[] { "id" });
		extractBean.setSrcTable(new String[] { "etl_src_table" });

		// 数据转换
		FieldTransfer[] fieldTransfers = new FieldTransfer[2];
		FieldTransfer nameTransfer = new FieldTransfer();
		nameTransfer.setField("name");
		nameTransfer.setSrc("king");
		nameTransfer.setTarget("lw");
		fieldTransfers[0] = nameTransfer;
		FieldTransfer mailTransfer = new FieldTransfer();
		mailTransfer.setField("mail");
		mailTransfer.setSrc("^lw.*@.*//.com$");
		mailTransfer.setTarget("lw***.com");
		mailTransfer.setRegEx(true);
		fieldTransfers[1] = mailTransfer;
		extractBean.setFieldTransfers(fieldTransfers);

		DatabaseConn destDB = new DatabaseConn();
		destDB.setDatabase("test");
		destDB.setServer("127.0.0.1");
		destDB.setPort("3306");
		destDB.setUsername("root");
		destDB.setPassword("root");
		destDB.setType("MySQL");
		extractBean.setDestDB(destDB);
		extractBean
				.setDestFields(new String[] { "name", "age", "mail", "phone" });
		extractBean.setDestPk(new String[] { "id" });
		extractBean.setDestTable("etl_dest_table");
		this.extractBean = extractBean;
	}
	
	/**
     * 数据库连接信息,适用于DatabaseMeta其中 一个构造器DatabaseMeta(String xml)
     */
	protected String databasesXML = 
		"<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
			+ "<connection>"
			+ "<name>{0}</name>"
			+ "<server>{1}</server>"
			+ "<type>{2}</type>"
			+ "<access>{3}</access>"
			+ "<database>{4}</database>"
			+ "<port>{5}</port>"
			+ "<username>{6}</username>"
			+ "<password>{7}</password>"
			+ "</connection>";

	/**
	 * 1. 源表和目标表全量交换
	 * @throws KettleXMLException 
	 */
	@Test
	public void copyTableTest() throws KettleXMLException {

		ExtractBean extractBean = CopyTable.this.extractBean;

		TransMeta transMeta = new TransMeta();
		transMeta.setName("全量抽取数据");

		DatabaseConn srcDB = extractBean.getSrcDB();
		final String srcDBName = "srcDB";
		DatabaseMeta dbMeta = new DatabaseMeta(srcDBName, srcDB.getType(),
				srcDB.getAccess(), srcDB.getServer(), srcDB.getDatabase(),
				srcDB.getPort(), srcDB.getUsername(), srcDB.getPassword());
		
		//添加转换的数据库连接
//        String xml = MessageFormat.format(databasesXML, new Object[] {
//        		srcDBName,
//        		srcDB.getServer(),
//        		"MySQL",
//        		"Native",
//        		srcDB.getDatabase(),
//        		srcDB.getPort(),
//        		srcDB.getUsername(),
//        		srcDB.getPassword()
//        });
//		DatabaseMeta dbMeta = new DatabaseMeta(xml);

		dbMeta.setConnectSQL(MessageFormat.format("set names ''{0}'';",
				new Object[] { "utf8" }));

		transMeta.addDatabase(dbMeta);

		PluginRegistry registry = PluginRegistry.getInstance();
		TableInputMeta tableInputMeta = new TableInputMeta();
		String tableInputPluginId = registry.getPluginId(StepPluginType.class,
				tableInputMeta);
		// 给表输入添加一个DatabaseMeta连接数据库
		tableInputMeta.setDatabaseMeta(transMeta.findDatabase(srcDBName));
		// 构造查询SQL
		String selectSql = "select {0} from {1}";
		selectSql = MessageFormat.format(
				selectSql,
				new Object[] {
						StringUtils.join(extractBean.getSrcFields(), ","),
						extractBean.getSrcTable()[0] });
		tableInputMeta.setSQL(selectSql);

		// 打印查询SQL
		System.out.println(selectSql);

		// 替换SQL语句中变量
		tableInputMeta.setVariableReplacementActive(true);

		// 将TableInputMeta添加到转换中
		StepMeta tableInputStepMeta = new StepMeta(tableInputPluginId,
				"tableInput", (StepMetaInterface) tableInputMeta);
		transMeta.addStep(tableInputStepMeta);

		// 字段选择
		SelectValuesMeta selectValuesMeta = new SelectValuesMeta();
		String selectValuesPluginId = registry.getPluginId(
				StepPluginType.class, selectValuesMeta);
		selectValuesMeta.allocate(extractBean.getSrcFields().length, 0, 0);
		selectValuesMeta.setSelectName(extractBean.getSrcFields());

		StepMeta selectValuesStepMeta = new StepMeta(selectValuesPluginId,
				"selectValues", (StepMetaInterface) selectValuesMeta);
		transMeta.addStep(selectValuesStepMeta);

		// 添加hop把两个步骤关联起来
		transMeta.addTransHop(new TransHopMeta(tableInputStepMeta,
				selectValuesStepMeta));

		// 字符串替换
		ReplaceStringMeta replaceStringMeta = new ReplaceStringMeta();
		String replaceStringPluginId = registry.getPluginId(
				StepPluginType.class, replaceStringMeta);
		int fieldTransferLength = extractBean.getFieldTransfers().length;
		replaceStringMeta.allocate(fieldTransferLength);
		for (int i = 0; i < fieldTransferLength; i++) {
			FieldTransfer fieldTransfer = extractBean.getFieldTransfers()[i];
			replaceStringMeta.getFieldInStream()[i] = fieldTransfer.getField();
			replaceStringMeta.getReplaceString()[i] = fieldTransfer.getSrc();
			replaceStringMeta.getReplaceByString()[i] = fieldTransfer
					.getTarget();
			//是否使用正则表达式
			replaceStringMeta.getUseRegEx()[i] = fieldTransfer.isRegEx() ? 1
					: 0;
		}

		if (extractBean.getFieldTransfers() == null || fieldTransferLength <= 0) {
			replaceStringMeta.allocate(0);
		}

		// 添加步骤到转换中
		StepMeta replaceStringStepMeta = new StepMeta(replaceStringPluginId,
				"replaceString", (StepMetaInterface) replaceStringMeta);
		transMeta.addStep(replaceStringStepMeta);
		// 添加hop把两个步骤关联起来
		transMeta.addTransHop(new TransHopMeta(selectValuesStepMeta,
				replaceStringStepMeta));

		// json输出
		JsonOutputMeta jsonOutput = new JsonOutputMeta();
		String jsonOutputPluginId = registry.getPluginId(StepPluginType.class,
				jsonOutput);
		jsonOutput.setOperationType(JsonOutputMeta.OPERATION_TYPE_OUTPUT_VALUE);
		jsonOutput.setJsonBloc("data");
		jsonOutput.setOutputValue("rows");

		int srcFieldLength = extractBean.getSrcFields().length;
		JsonOutputField[] outputFields = new JsonOutputField[srcFieldLength];
		for (int i = 0; i < srcFieldLength; i++) {
			JsonOutputField field = new JsonOutputField();
			field.setFieldName(extractBean.getSrcFields()[i]);
			field.setElementName(extractBean.getSrcFields()[i]);
			
			outputFields[i] = field;
		}
		jsonOutput.setOutputFields(outputFields);

		// 添加步骤到转换中
		StepMeta jsonOutputStepMeta = new StepMeta(jsonOutputPluginId,
				"jsonOutput", (StepMetaInterface) jsonOutput);
		transMeta.addStep(jsonOutputStepMeta);
		// 添加hop把两个步骤关联起来
		transMeta.addTransHop(new TransHopMeta(replaceStringStepMeta,
				jsonOutputStepMeta));

		// 复制记录到结果
		RowsToResultMeta rowsToResultMeta = new RowsToResultMeta();
		String rowsToResultMetaPluginId = registry.getPluginId(
				StepPluginType.class, rowsToResultMeta);
		// 添加步骤到转换中
		StepMeta rowsToResultStepMeta = new StepMeta(rowsToResultMetaPluginId,
				"rowsToResult", (StepMetaInterface) rowsToResultMeta);
		transMeta.addStep(rowsToResultStepMeta);
		// hop把两个步骤关联起来
		transMeta.addTransHop(new TransHopMeta(jsonOutputStepMeta,
				rowsToResultStepMeta));

		transMeta.setVariable("VAR_FROM", "0");
		transMeta.setVariable("VAR_TO", "20");

		Trans trans = new Trans(transMeta);
		trans.setMonitored(true);
		trans.setInitializing(true);
		trans.setPreparing(true);
		trans.setLogLevel(LogLevel.BASIC);
		trans.setRunning(true);
		trans.setSafeModeEnabled(true);

		trans.addTransListener(new TransAdapter() {
			@Override
			public void transFinished(Trans trans) {
				System.out.println("转换执行完成");
			}
		});

		try {
			// trans.startThreads();
			trans.execute(null);
		} catch (KettleException e) {
			e.printStackTrace();
		}
		trans.waitUntilFinished();
		if (trans.getErrors() > 0) {
			System.out.println("抽取数据出错.");
			return;
		}

		Result result = trans.getResult();
		List<RowMetaAndData> list = result.getRows();
		String fieldNames = "";
		if (list != null && list.size() > 0) {
			RowMetaAndData row = list.get(0);
			RowMetaInterface rowMeta = row.getRowMeta();
			Object[] srcFields = ArrayUtils.subarray(rowMeta.getFieldNames(),
					0, this.extractBean.getSrcFields().length);
			fieldNames = StringUtils.join(srcFields, ",");

			Object[] cols = row.getData();
			// 遍历方式获取
			String json = "{}";
			for (int i = 0; i < cols.length; i++) {
				if (cols[i] != null) {
					json = cols[i].toString();
					break;
				}
			}
			System.out.println("抽取数据为:" + json);
		}

	}

	/**
	 * 2.建立临时表和表触发器
	 */
	@Test
	public void createTriggerAndTmpTable() {

	}

	/**
	 * 3.触发器增量交换
	 */
	@Test
	public void changeWithTrigger() {

	}

	/**
	 * 4.时间戳交换
	 */
	@Test
	public void changeWithTime() {

	}
}

 

posted @ 2019-12-18 11:04  伟衙内  阅读(110)  评论(0编辑  收藏  举报