Avro Study Notes: The avro-tools Utility
1. Download avro-tools.jar
https://archive.apache.org/dist/avro/avro-1.10.1/java/
For commonly used avro-tools.jar commands, see the article "Working with Apache Avro files in Amazon S3".
You can also list them with the help command:
java -jar ./avro-tools-1.10.1.jar help
Version 1.10.1 of Apache Avro
Copyright 2010-2015 The Apache Software Foundation

This product includes software developed at
The Apache Software Foundation (https://www.apache.org/).
----------------
Available tools:
  canonical      Converts an Avro Schema to its canonical form
  cat            Extracts samples from files
  compile        Generates Java code for the given schema.
  concat         Concatenates avro files without re-compressing.
  count          Counts the records in avro files or folders
  fingerprint    Returns the fingerprint for the schemas.
  fragtojson     Renders a binary-encoded Avro datum as JSON.
  fromjson       Reads JSON records and writes an Avro data file.
  fromtext       Imports a text file into an avro data file.
  getmeta        Prints out the metadata of an Avro data file.
  getschema      Prints out schema of an Avro data file.
  idl            Generates a JSON schema from an Avro IDL file
  idl2schemata   Extract JSON schemata of the types from an Avro IDL file
  induce         Induce schema/protocol from Java class/interface via reflection.
  jsontofrag     Renders a JSON-encoded Avro datum as binary.
  random         Creates a file with randomly generated instances of a schema.
  recodec        Alters the codec of a data file.
  repair         Recovers data from a corrupt Avro Data file
  rpcprotocol    Output the protocol of a RPC service
  rpcreceive     Opens an RPC Server and listens for one message.
  rpcsend        Sends a single RPC message.
  tether         Run a tethered mapreduce job.
  tojson         Dumps an Avro data file as JSON, record per line or pretty.
  totext         Converts an Avro data file to a text file.
  totrevni       Converts an Avro data file to a Trevni file.
  trevni_meta    Dumps a Trevni file's metadata as JSON.
  trevni_random  Create a Trevni file filled with random instances of a schema.
  trevni_tojson  Dumps a Trevni file as JSON.
2. View the schema of an Avro file
java -jar ./avro-tools-1.10.1.jar getschema ./xxxx.avro
3. View the contents of an Avro file as JSON
java -jar ./avro-tools-1.10.1.jar tojson ./nova_ads_access_log-0-0008589084.avro | less
4. Generate Java code with avro-tools
For background on generating Java classes from an Avro schema, see:
https://avro.apache.org/docs/current/gettingstartedjava.html
https://yanbin.blog/convert-apache-avro-to-parquet-format-in-java/
Define the schema file kst.avsc:
{
  "namespace": "com.linkedin.haivvreo",
  "name": "test_serializer",
  "type": "record",
  "fields": [
    { "name": "string1", "type": "string" },
    { "name": "int1", "type": "int" },
    { "name": "tinyint1", "type": "int" },
    { "name": "smallint1", "type": "int" },
    { "name": "bigint1", "type": "long" },
    { "name": "boolean1", "type": "boolean" },
    { "name": "float1", "type": "float" },
    { "name": "double1", "type": "double" },
    { "name": "list1", "type": { "type": "array", "items": "string" } },
    { "name": "map1", "type": { "type": "map", "values": "int" } },
    { "name": "struct1", "type": {
        "type": "record",
        "name": "struct1_name",
        "fields": [
          { "name": "sInt", "type": "int" },
          { "name": "sBoolean", "type": "boolean" },
          { "name": "sString", "type": "string" }
        ]
      }
    },
    { "name": "union1", "type": ["float", "boolean", "string"] },
    { "name": "enum1", "type": { "type": "enum", "name": "enum1_values", "symbols": ["BLUE", "RED", "GREEN"] } },
    { "name": "nullableint", "type": ["int", "null"] },
    { "name": "bytes1", "type": "bytes" },
    { "name": "fixed1", "type": { "type": "fixed", "name": "threebytes", "size": 3 } }
  ]
}
Compile the schema file (a .avsc file is a JSON schema, not Avro IDL, hence the "compile schema" subcommand):
java -jar ./src/main/resources/avro-tools-1.10.1.jar compile schema ./src/main/avro/kst.avsc ./src/main/java
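In the Java code generated this way, the schema's string type is represented as CharSequence rather than String.
To generate String fields instead, add the -string option: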
java -jar ./src/main/resources/avro-tools-1.10.1.jar compile -string schema ./src/main/avro/kst.avsc ./src/main/java
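Why the difference matters: a CharSequence field is not guaranteed to hold a java.lang.String (Avro's runtime commonly supplies its own Utf8 class for such fields), so comparisons and map lookups written against String can fail silently. The sketch below illustrates this using only the JDK. StringBuilder is used here as a stand-in for a non-String CharSequence, and the class and method names (CharSequenceDemo, nonStringValue) are hypothetical, not part of the generated code.

```java
import java.util.HashMap;
import java.util.Map;

class CharSequenceDemo {

    // Hypothetical helper: returns a CharSequence that is NOT a java.lang.String.
    // StringBuilder is only a stand-in for whatever CharSequence the runtime provides.
    static CharSequence nonStringValue(String s) {
        return new StringBuilder(s);
    }

    public static void main(String[] args) {
        // Imagine this came from a generated getter for the string1 field.
        CharSequence string1 = nonStringValue("hello");

        // String.equals only matches other String instances.
        System.out.println("hello".equals(string1));            // false
        // contentEquals compares characters, whatever the implementation.
        System.out.println("hello".contentEquals(string1));     // true
        // Converting to String first also works.
        System.out.println("hello".equals(string1.toString())); // true

        // The same pitfall applies to maps keyed by CharSequence:
        // a String key does not find an entry stored under a non-String key.
        Map<CharSequence, Integer> map1 = new HashMap<>();
        map1.put(nonStringValue("k"), 1);
        System.out.println(map1.get("k"));                      // null
    }
}
```

With classes generated using -string, such fields are declared as String, so ordinary equals and map lookups behave as expected.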
This article is published only on cnblogs (博客园) and tonglin0325's blog. Author: tonglin0325. When reposting, please cite the original link: https://www.cnblogs.com/tonglin0325/p/5298989.html