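Background

In HBase, a table's rowkey is defined as timestamp + string.

Requirement

Export one day's worth of data to Excel, selecting rows by timestamp and by a column in the column family whose value is "abc".

Using FilterList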

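    // Imports: org.apache.hadoop.hbase.TableName; org.apache.hadoop.hbase.util.Bytes;
    // org.apache.hadoop.hbase.client.{Connection, Result, ResultScanner, Scan, Table};
    // org.apache.hadoop.hbase.filter.{CompareFilter, FilterList, SingleColumnValueFilter};
    // java.util.{ArrayList, List}

    // A row must pass every filter in the list (logical AND).
    FilterList filterList = new FilterList(FilterList.Operator.MUST_PASS_ALL);
    // Match rows whose info:supplier column equals "abc".
    SingleColumnValueFilter filter = new SingleColumnValueFilter(
            Bytes.toBytes("info"), Bytes.toBytes("supplier"),
            CompareFilter.CompareOp.EQUAL, Bytes.toBytes("abc"));
    // Also drop rows that have no info:supplier column at all.
    filter.setFilterIfMissing(true);
    filterList.addFilter(filter);

    List<String> list = new ArrayList<String>();
    List<ResultDTO> listSpider = new ArrayList<ResultDTO>();
    // Limit the scan to rowkeys in [startKey, endKey); the stop row is exclusive.
    Scan scan = new Scan();
    scan.setStartRow(Bytes.toBytes(startKey));
    scan.setStopRow(Bytes.toBytes(endKey));
    scan.setFilter(filterList);

    Connection conn = getConnection();
    // try-with-resources closes the scanner and the table when the scan ends.
    try (Table table = conn.getTable(TableName.valueOf(tableName));
         ResultScanner rs = table.getScanner(scan)) {
        for (Result r : rs) {
            // Convert each matching row for the export here.
        }
    }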
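
The scan above takes a start and a stop rowkey, but the post does not show how they are built. The sketch below is one possible way to derive them for a single day. It rests on assumptions that are not in the original: the timestamp part of the rowkey is taken to be a 13-digit epoch-millisecond value written as decimal text, and the class name `DayKeyRange`, its methods, and the UTC time zone are hypothetical. With a fixed-width prefix like this, lexicographic order matches chronological order, which is what a range scan relies on.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.util.Locale;

class DayKeyRange {
    // Assumption: rowkeys start with a 13-digit epoch-millisecond timestamp.
    // Zero padding keeps the width fixed so string order equals time order.
    static String keyPrefix(long epochMillis) {
        return String.format(Locale.ROOT, "%013d", epochMillis);
    }

    // Returns {startKey, stopKey} for the given day in the given time zone.
    // The stop key is the first instant of the next day, because the stop row
    // of an HBase scan is exclusive.
    static String[] forDay(LocalDate day, ZoneId zone) {
        long start = day.atStartOfDay(zone).toInstant().toEpochMilli();
        long stop = day.plusDays(1).atStartOfDay(zone).toInstant().toEpochMilli();
        return new String[] { keyPrefix(start), keyPrefix(stop) };
    }

    public static void main(String[] args) {
        String[] range = forDay(LocalDate.of(2018, 1, 17), ZoneId.of("UTC"));
        System.out.println(range[0] + " .. " + range[1]);
    }
}
```

Because every rowkey for the day begins with a timestamp that is at least the start prefix and strictly below the stop prefix, the string suffix of the rowkey does not need to be known to bound the scan.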

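1. Rowkey range: set the start row and stop row of the scan.

2. Column value filtering: use

SingleColumnValueFilter

By default, a row that does not contain the column at all is included in the results.

filter.setFilterIfMissing(true);// exclude rows where the column is missing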

Official documentation: To prevent the entire row from being emitted if the column is not found on a row, use setFilterIfMissing(boolean). Otherwise, if the column is found, the entire row will be emitted only if the value passes. If the value fails, the row will be filtered out.
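
The three cases in that rule (column matches, column differs, column absent) can be spelled out with a small stand-alone model. This is only an illustration of the documented behavior, not HBase code: the class `MissingColumnDemo`, the method `rowIsEmitted`, and the representation of a row as a map from column name to value are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class MissingColumnDemo {
    // Simplified model of the documented rule; it does not call HBase.
    // A row is modelled as a map from "family:qualifier" to a string value.
    static boolean rowIsEmitted(Map<String, String> row, String column,
                                String expected, boolean filterIfMissing) {
        if (!row.containsKey(column)) {
            // Column absent: the row is emitted unless filterIfMissing is set.
            return !filterIfMissing;
        }
        // Column present: the row is emitted only if the value passes.
        return expected.equals(row.get(column));
    }

    public static void main(String[] args) {
        Map<String, String> match = new HashMap<>();
        match.put("info:supplier", "abc");
        Map<String, String> mismatch = new HashMap<>();
        mismatch.put("info:supplier", "xyz");
        Map<String, String> missing = new HashMap<>();
        missing.put("info:name", "n1");

        // Print the outcome for each row with the flag off and then on.
        for (boolean flag : new boolean[] { false, true }) {
            System.out.println("filterIfMissing=" + flag
                    + " match=" + rowIsEmitted(match, "info:supplier", "abc", flag)
                    + " mismatch=" + rowIsEmitted(mismatch, "info:supplier", "abc", flag)
                    + " missing=" + rowIsEmitted(missing, "info:supplier", "abc", flag));
        }
    }
}
```

Only the row without the column changes outcome when the flag is switched on, which is why the export needs the flag to avoid picking up rows that never had a supplier value.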
posted on 2018-01-17 15:27  一天不进步,就是退步