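Background
An HBase table uses a rowkey of the form timestamp + string.
Requirement
Export one day of data to Excel, selecting rows by timestamp and keeping only those where a given column in a column family has the value "abc".
Using FilterList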
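The one-day window has to be turned into a start and stop rowkey for the scan. Here is a minimal sketch of that step. It assumes the rowkey begins with a 13-digit epoch-millisecond timestamp written as text; the post does not state the actual timestamp format, so adjust accordingly. The class DayRowKeyRange and its methods are hypothetical names used only for illustration.

```java
import java.time.LocalDate;
import java.time.ZoneId;

// Hypothetical helper, not part of HBase. Assumes rowkeys start with a
// fixed-width (13-digit) epoch-millisecond timestamp encoded as text.
class DayRowKeyRange {
    // Inclusive start row: the first millisecond of the given day.
    public static String startKey(LocalDate day, ZoneId zone) {
        return pad(day.atStartOfDay(zone).toInstant().toEpochMilli());
    }

    // Exclusive stop row: the first millisecond of the following day.
    public static String stopKey(LocalDate day, ZoneId zone) {
        return pad(day.plusDays(1).atStartOfDay(zone).toInstant().toEpochMilli());
    }

    // Zero-pad so that lexicographic byte order matches numeric order.
    private static String pad(long millis) {
        return String.format("%013d", millis);
    }

    public static void main(String[] args) {
        LocalDate day = LocalDate.of(2020, 1, 1);
        ZoneId zone = ZoneId.of("UTC");
        System.out.println(startKey(day, zone)); // 1577836800000
        System.out.println(stopKey(day, zone));  // 1577923200000
    }
}
```

Because HBase orders rowkeys lexicographically by bytes and the stop row of a scan is exclusive, using the start of the next day as the stop key covers every rowkey whose timestamp prefix falls within the day, whatever string follows the timestamp.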
FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL);
// Keep only rows whose info:supplier column equals "abc"
SingleColumnValueFilter filter = new SingleColumnValueFilter("info".getBytes(), "supplier".getBytes(), CompareFilter.CompareOp.EQUAL, "abc".getBytes());
// Drop rows that do not have the column at all
filter.setFilterIfMissing(true);
filterList.addFilter(filter);
List<String> list = new ArrayList<String>();
List<ResultDTO> listSpider = new ArrayList<ResultDTO>();
Scan scan = new Scan();
// Restrict the scan to the rowkey range covering the day
scan.setStartRow(Bytes.toBytes(startKey));
scan.setStopRow(Bytes.toBytes(endtKey));
scan.setFilter(filterList);
Connection conn = null;
HTable table = null;
try {
    conn = getConnection();
    table = (HTable) conn.getTable(TableName.valueOf(tableName));
    ResultScanner rs = table.getScanner(scan);
1. Rowkey range: set the start row and stop row on the scan. The start row is inclusive and the stop row is exclusive.
2. Column value filtering: use SingleColumnValueFilter.
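By default, a row that does not contain the column at all is still included in the results.
filter.setFilterIfMissing(true); // exclude rows where the column is missing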
From the official documentation: To prevent the entire row from being emitted if the column is not found on a row, use setFilterIfMissing(boolean). Otherwise, if the column is found, the entire row will be emitted only if the value passes. If the value fails, the row will be filtered out.
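To make the effect of setFilterIfMissing concrete, here is a small plain-Java model of the row-level decision described above. It does not use the HBase client; FilterIfMissingDemo, keepRow, scan and sampleTable are hypothetical names, and the three sample rows are made up for illustration. It only mirrors the documented behavior: a matching value passes, a non-matching value is filtered out, and a missing column passes unless filterIfMissing is true.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the documented SingleColumnValueFilter behavior.
// This is an illustration only, not the HBase implementation.
class FilterIfMissingDemo {
    // Decide whether a row is emitted for an EQUAL comparison on one column.
    public static boolean keepRow(Map<String, String> row, String column,
                                  String expected, boolean filterIfMissing) {
        if (!row.containsKey(column)) {
            // Column absent: emitted by default, dropped when filterIfMissing is set.
            return !filterIfMissing;
        }
        // Column present: emitted only if the value matches.
        return expected.equals(row.get(column));
    }

    // Return the rowkeys that survive the filter, in table order.
    public static List<String> scan(Map<String, Map<String, String>> table,
                                    boolean filterIfMissing) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> e : table.entrySet()) {
            if (keepRow(e.getValue(), "info:supplier", "abc", filterIfMissing)) {
                kept.add(e.getKey());
            }
        }
        return kept;
    }

    // Made-up sample data: one match, one mismatch, one row without the column.
    public static Map<String, Map<String, String>> sampleTable() {
        Map<String, Map<String, String>> table = new LinkedHashMap<>();
        table.put("row1", Collections.singletonMap("info:supplier", "abc"));
        table.put("row2", Collections.singletonMap("info:supplier", "xyz"));
        table.put("row3", Collections.singletonMap("info:name", "other"));
        return table;
    }

    public static void main(String[] args) {
        System.out.println(scan(sampleTable(), false)); // [row1, row3]
        System.out.println(scan(sampleTable(), true));  // [row1]
    }
}
```

With the default setting, row3 slips through even though it has no supplier column, which is why the export needs setFilterIfMissing(true).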