Flume
http://www.iteblog.com/archives/1043
http://www.iteblog.com/archives/908
http://www.iteblog.com/archives/1034
In flume-ng's HBase sink, the HbaseEventSerializer implementation is responsible for generating row keys. The default implementation, org.apache.flume.sink.hbase.SimpleHbaseEventSerializer, already supports generating timestamp row keys in the format prefix + current timestamp. To use it, just modify your Flume configuration accordingly:
hbase-agent.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
hbase-agent.sinks.sink1.channel = ch1
hbase-agent.sinks.sink1.table = demo
hbase-agent.sinks.sink1.columnFamily = cf
hbase-agent.sinks.sink1.serializer = org.apache.flume.sink.hbase.SimpleHbaseEventSerializer
hbase-agent.sinks.sink1.serializer.payloadColumn = col1
hbase-agent.sinks.sink1.serializer.suffix = timestamp
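To make the key format concrete, here is a minimal standalone sketch of a prefix + current timestamp row key. The class and method names are illustrative assumptions, and the code does not depend on Flume; it only mirrors the format described above.

```java
import java.nio.charset.StandardCharsets;

public class TimestampKeySketch {
    // Hypothetical helper: builds a row key as the prefix followed by an
    // epoch-millisecond timestamp, encoded as UTF-8 bytes.
    static byte[] timestampKey(String prefix, long timestampMillis) {
        return (prefix + timestampMillis).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] key = timestampKey("demo-", System.currentTimeMillis());
        System.out.println(new String(key, StandardCharsets.UTF_8));
    }
}
```

Keys built this way sort in arrival order, which is convenient for time-range scans but means consecutive writes land next to each other in the table.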
If the provided timestamp-based key generation is not what you are after, you'll need to provide a custom HbaseEventSerializer implementation to Flume, which requires you to:
- Create your own row key generator class (the default one is org.apache.flume.sink.hbase.SimpleRowKeyGenerator).
- Create your own implementation of the HbaseEventSerializer interface (the default implementation is org.apache.flume.sink.hbase.SimpleHbaseEventSerializer) that uses the custom row key generator you created in the first step.
- Modify your Flume HBase sink configuration to use the custom HbaseEventSerializer implementation.
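As a rough illustration of the first step, the sketch below shows one possible custom row key generator. It is standalone Java with no Flume dependency, and everything in it (the class name, the method, the two-digit salt scheme) is an assumption made for illustration, not part of Flume. The salt prefix is one common way to keep timestamp keys from concentrating writes on a single region, which is the hotspot problem the links below discuss.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;

public class SaltedRowKeySketch {
    // Hypothetical helper: prepends a two-digit salt bucket derived from the
    // event body's hash, so consecutive timestamps spread across `buckets`
    // key ranges. Assumes 1 <= buckets <= 100 so the salt fits in two digits.
    static byte[] saltedTimestampKey(String prefix, byte[] body,
                                     long timestampMillis, int buckets) {
        // floorMod keeps the bucket non-negative even for negative hash codes.
        int bucket = Math.floorMod(Arrays.hashCode(body), buckets);
        String key = String.format(Locale.ROOT, "%02d-%s%d",
                bucket, prefix, timestampMillis);
        return key.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] body = "example event".getBytes(StandardCharsets.UTF_8);
        byte[] key = saltedTimestampKey("demo-", body,
                System.currentTimeMillis(), 16);
        System.out.println(new String(key, StandardCharsets.UTF_8));
    }
}
```

In a real setup, logic like this would presumably be called from your HbaseEventSerializer implementation when it builds the row for each event, and the sink's serializer property would point at the fully qualified name of that class (for example a hypothetical com.example.CustomHbaseEventSerializer). The trade-off is that salted keys no longer sort purely by time, so a time-range read has to scan every bucket.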
http://www.kankanews.com/ICkengine/archives/67816.shtml
http://search-hadoop.com/?q=prefix+salt+key+hotspot&fc_project=HBase