Kettle解析JSON错误,We MUST have the same number of values for all paths,We can not find and data with path [$.
最近公司要从聚石塔上抽取数据,其中有JSON格式数据,所以学习一下Kettle解析JSON,碰到小小问题,记录一下:
(1)
2015/07/15 15:22:48 - trade_detail.0 - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : Unexpected Error : org.pentaho.di.core.exception.KettleException:
2015/07/15 15:22:48 - trade_detail.0 - The data structure is not the same inside the resource! We found 2 values for json path [$..title], which is different that the number returned for path [$..buyer_alipay_no] (1 values). We MUST have the same number of values for all paths.
2015/07/15 15:22:48 - trade_detail.0 - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : org.pentaho.di.core.exception.KettleException:
2015/07/15 15:22:48 - trade_detail.0 - The data structure is not the same inside the resource! We found 2 values for json path [$..title], which is different that the number returned for path [$..buyer_alipay_no] (1 values). We MUST have the same number of values for all paths.
2015/07/15 15:22:48 - trade_detail.0 -
原因是由于JSON嵌套内的对象还有同样的title,所以无法解析。
方法:
在外面这层用javascript解析,然后重命名,继续往下用JSON控件解析...
JAVASCRIPT代码:
var data = JSON.parse(trade);
var alipay_id = data.alipay_id;
var buyer_alipay_no = data.buyer_alipay_no;
var seller_nick = data.seller_nick;
var orders1 = data.orders;
var title1 = data.title;
var logistics_company1 = data.title;
这样只能解决2个同名的JSON结构,当JSON存在3个同名的,1个在外层,2个嵌套在不同的对象里面,还是会报错。
既然是path问题,那就看看是否可以修改path。
官网上又这样的路径,依样画葫芦,OK!
XPath | JSONPath | Description |
/ | $ | the root object/element |
. | @ | the current object/element |
/ | . or [] | child operator |
.. | n/a | parent operator |
// | .. | recursive descent. JSONPath borrows this syntax from E4X. |
* | * | wildcard. All objects/elements regardless their names. |
@ | n/a | attribute access. JSON structures don't have attributes. |
[] | [] | subscript operator. XPath uses it to iterate over element collections and for predicates. In Javascript and JSON it is the native array operator. |
| | [,] | Union operator in XPath results in a combination of node sets. JSONPath allows alternate names or array indices as a set. |
n/a | [start:end:step] | array slice operator borrowed from ES4. |
[] | ?() | applies a filter (script) expression. |
n/a | () | script expression, using the underlying script engine. |
() | n/a | grouping in Xpath |
XPath | JSONPath | Result |
/store/book/author |
$.store.book[*].author |
the authors of all books in the store |
//author |
$..author |
all authors |
/store/* |
$.store.* |
all things in store, which are some books and a red bicycle. |
/store//price |
$.store..price |
the price of everything in the store. |
//book[3] |
$..book[2] |
the third book |
//book[last()] |
$..book[(@.length-1)] $..book[-1:] |
the last book in order. |
//book[position()<3] |
$..book[0,1] $..book[:2] |
the first two books |
//book[isbn] |
$..book[?(@.isbn)] |
filter all books with isbn number |
//book[price<10] |
$..book[?(@.price<10)] |
filter all books cheapier than 10 |
//* |
$..* |
all Elements in XML document. All members of JSON structure. |
官网关于path的URL:http://wiki.pentaho.com/display/EAI/JSON+Input
根据官网提示链接到: http://goessner.net/articles/JsonPath/
(2)
2015/07/15 15:30:48 - trade_detail.0 - ERROR (version 5.4.0.1-130, build 1 from 2015-06-14_12-34-55 by buildguy) : Unexpected Error : org.pentaho.di.core.exception.KettleException:
2015/07/15 15:30:48 - trade_detail.0 - org.pentaho.di.core.exception.KettleException:
2015/07/15 15:22:48 - trade_detail.0 -
2015/07/15 15:22:48 - trade_detail.0 -
2015/07/15 15:22:48 - trade_detail.0 -We can not find and data with path [$..logistics_company]
原因:JSON中存在不完整的字段,无法解析。
方法:
1、用javascript脚本读取内容
2、勾选json控件-->内容-->设置忽视不完整的路径
(不过这个方式对嵌套多层的无效)
javascript脚本例子:
var json={"status":1,"info":"u83b7u53d6u6210u529f","data":[{ "name":"dingding","information":"电视迷","Region_id":"3","type":"1"},{"id":"5","name":"kenwong","information":"you are ok","Region_id":"5","type":"100"}]}
var data=json.data;
var str1 ="";
var str2 ="";
var str3 ="";
for(i=0;i<data.length;i++)
{
str1 += data[i].name +","
str2 += data[i].information +",";
str3 += data[i].id +",";
}