Superset实现动态SQL查询
使用自定义参数方式实现 superset 实现SQL动态查询
1、启用参数:config.py 设置"ENABLE_TEMPLATE_PROCESSING": True
2、当前superset v1.2版本支持的参数包括:
{{ current_username() }} 当前登录用户名
{{ current_username(add_to_cache_keys=False) }} 不从缓存中获取登录用户名,默认从缓存获取
{{ current_user_id()}} 当前登录用户ID
{{ current_user_id(add_to_cache_keys=False) }} 不从缓存中获取登录用户ID,默认从缓存获取
{{ url_param('custom_variable') }} url 参数,比如127.0.0.1:8001\dashboard?abc=123,参数就是{{ url_param('abc') }} 结果就是123
{{ cache_key_wrapper() }} 还没有弄明白啥用
{{ filter_values("字段名") }} 获取dashboard filter_box组件对某个字段的筛选结果
{{ from_dttm }} 获取dashboard filter_box组件日期筛选的开始时间
{{ to_dttm }} 获取dashboard filter_box组件日期筛选的结束时间
{{ get_filters() }} 暂时没有弄明白
除此之外,还可以自定义参数,自定义参数方法:
①修改superset/jinja_context.py文件,修改三个地方:
1 regex = re.compile( 2 r"\{\{.*(" 3 r"current_user_id\(.*\)|" 4 r"current_username\(.*\)|" 5 r"current_userroles\(.*\)|" 6 r"isadmin\(.*\)|" 7 r"cache_key_wrapper\(.*\)|" 8 r"url_param\(.*\)" 9 r").*\}\}" 10 )
↑↑↑↑注意此处的 current_userroles 和 isadmin 是我自定义的,源文件没有
1 def current_user_id(self, add_to_cache_keys: bool = True) -> Optional[int]: 2 """ 3 Return the user ID of the user who is currently logged in. 4 5 :param add_to_cache_keys: Whether the value should be included in the cache key 6 :returns: The user ID 7 """ 8 9 if hasattr(g, "user") and g.user: 10 if add_to_cache_keys: 11 self.cache_key_wrapper(g.user.get_id()) 12 return g.user.get_id() 13 return None 14 15 def current_username(self, add_to_cache_keys: bool = True) -> Optional[str]: 16 """ 17 Return the username of the user who is currently logged in. 18 19 :param add_to_cache_keys: Whether the value should be included in the cache key 20 :returns: The username 21 """ 22 23 if g.user and hasattr(g.user, "username"): 24 if add_to_cache_keys: 25 self.cache_key_wrapper(g.user.username) 26 return g.user.username 27 return None 28 def current_userroles(self, add_to_cache_keys: bool = True) -> Optional[str]: 29 """ 30 Return the roles of the user who is currently logged in. 31 32 :param add_to_cache_keys: Whether the value should be included in the cache key 33 :returns: The userroles 34 """ 35 36 if g.user and hasattr(g.user, "roles"): 37 if add_to_cache_keys: 38 user_roles = "/".join([role.name.lower() for role in list(g.user.roles)]) 39 self.cache_key_wrapper(user_roles) 40 print(user_roles) 41 return user_roles 42 """admin in user_roles""" 43 return None 44 45 def isadmin(self, add_to_cache_keys: bool = True) -> Optional[str]: 46 """ 47 Return the roles of the user who is currently logged in. 48 49 :param add_to_cache_keys: Whether the value should be included in the cache key 50 :returns: The userroles 51 """ 52 53 if g.user and hasattr(g.user, "roles"): 54 if add_to_cache_keys: 55 user_roles = [role.name.lower() for role in list(g.user.roles)] 56 return "admin" in user_roles 57 return None
↑↑↑↑仿照系统自带的 current_username 编造自己的函数,我写了current_userroles 和 isadmin
1 class JinjaTemplateProcessor(BaseTemplateProcessor): 2 def set_context(self, **kwargs: Any) -> None: 3 super().set_context(**kwargs) 4 extra_cache = ExtraCache(self._extra_cache_keys) 5 self._context.update( 6 { 7 "url_param": partial(safe_proxy, extra_cache.url_param), 8 "current_user_id": partial(safe_proxy, extra_cache.current_user_id), 9 "current_username": partial(safe_proxy, extra_cache.current_username), 10 "current_userroles": partial(safe_proxy, extra_cache.current_userroles), 11 "isadmin": partial(safe_proxy, extra_cache.isadmin), 12 "cache_key_wrapper": partial(safe_proxy, extra_cache.cache_key_wrapper), 13 "filter_values": partial(safe_proxy, filter_values), 14 } 15 )
↑↑↑↑仿照系统自带的 current_username 编造自己的函数,我写了current_userroles 和 isadmin
就是这3个地方,但是注意,自己在第二步早的函数,返回值必须是:
1 ALLOWED_TYPES = ( 2 NONE_TYPE, 3 "bool", 4 "str", 5 "unicode", 6 "int", 7 "long", 8 "float", 9 "list", 10 "dict", 11 "tuple", 12 "set", 13 )
否则会提示错误,或者自己修改这个types,我是转换,比如上面那个g.user.roles 返回的结果就不是上面类型,导致我一直不成功,最后修改了下,才可以
3、判断是否自定义成功:
在superset sql lab中执行如下代码,如果能被解析,就说明成功
4、应用案例:
在dataset里面,动态访问数据源,数据源添加where语句:select * from sales where salesname =' {{current_username()}}'
dashboard里面,通过获取筛选器的结果,然后获取其他表应当显示的数据范围:
1 select DATE,risktype,sum(num) as num from 2 (SELECT date , customerid,product,risktype ,count(*) as num 3 from v_superset_forecast_risk group by date , customerid,product,risktype ) a 4 join 5 (select distinct customer_code,product from v_superset_access 6 where name='{{ current_username() }}' )access 7 on a.customerid=access.customer_code 8 and a.product=access.product 9 and DATE_FORMAT(date,'%Y-%m')> DATE_FORMAT(date_sub(STR_TO_DATE(concat( {{ "'" + "', '".join(filter_values('yearmonthend')) + "'" }},'-01'), '%Y-%m-%d'), interval 12 month),'%Y-%m') 10 and DATE_FORMAT(date,'%Y-%m')<={{ "'" + "', '".join(filter_values('yearmonthend')) + "'" }} 11 group by DATE,risktype
因为sql里面可以使用jinja 表达式,比如判断筛选当前没有筛选的时候,获取什么数据
1 {% if isadmin() %} 2 3 select * from sales 4 5 {% else %} 6 7 select * from sales where name ='{{current_username()}}' 8 9 {% endif %}
注意{% %} 内部使用参数的时候,不需要加{{}},否则报错
通过筛选器实现模糊查询
5、官方参考文档:https://superset.apache.org/docs/installation/sql-templating
官方没有那么详细,但是里面有一些我这里可能也没有消化吸收掉,可以参考看下
总之,通过上面的自定义参数方法,和jinja表达式在sql中的应用,可以实现动态查询,解决一些无法通过页面直接交互查询结果显示的内容
另外如果你有其他应用或者自定义上的思考,欢迎留言,相互学习