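Linux PMBus Bus and Device Driver Analysis
Introduction to the PMBus Protocol Specification
PMBus is a communication protocol standard for configuring, controlling, and monitoring power supplies. Its latest version at the time of writing is 1.3, and the specification is still evolving: newer revisions add features such as Zone PMBus and AVSBus. Detailed specification documents are available on the official website. This section does not attempt to translate them; it records the questions the author ran into while learning PMBus, together with the answers.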
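How does PMBus differ from I2C and SMBus?
PMBus adds a set of power configuration, control, and monitoring conventions on top of SMBus (System Management Bus). SMBus was originally developed for smart battery management. It is based on I2C and improves on I2C's weak robustness in the following ways:
- support for the SMBALERT# interrupt;
- support for packet error checking (PEC);
- support for bus timeouts;
- support for START/STOP protection;
- support for the Host Notify Protocol.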
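To make the PEC item concrete, here is a minimal sketch of the checksum. SMBus defines the Packet Error Code as a CRC-8 with polynomial x^8 + x^2 + x + 1 computed over every byte of the transfer, address bytes included. The function name smbus_pec is chosen for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * CRC-8 as used by the SMBus Packet Error Code:
 * polynomial 0x07 (x^8 + x^2 + x + 1), initial value 0,
 * MSB first, no final XOR.
 */
uint8_t smbus_pec(const uint8_t *buf, size_t len)
{
    uint8_t crc = 0;

    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x80)
                crc = (uint8_t)((crc << 1) ^ 0x07);
            else
                crc = (uint8_t)(crc << 1);
        }
    }
    return crc;
}
```

A receiver can verify a packet by running the same function over the data plus the received PEC byte: an intact packet yields 0.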
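Which parameters does PMBus monitor? How many alarm levels are there, and how is each level handled?
PMBus supports upper and lower limit monitoring of voltage, current, power, temperature, fan speed, and similar parameters, with two alarm levels, warning and fault (see the figure above).
- warning: a monitored parameter is abnormal and deserves attention, but the system can keep running and no response is required;
- fault: more severe than a warning. Depending on how dangerous the condition is for the device, the system restarts the device's control circuit (restart) or cuts off the output (shutdown).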
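The two levels can be pictured as two thresholds on the same reading. The sketch below classifies a value against an upper warning limit and an upper fault limit. The names are illustrative, and whether a real device trips at "greater than" or "greater than or equal" is device specific, so the strict comparison here is an assumption.

```c
enum alarm_level {
    ALARM_NONE,
    ALARM_WARNING,
    ALARM_FAULT
};

/*
 * Classify a reading against upper limits (for example over-temperature).
 * Assumes warn_limit <= fault_limit and a strict "greater than" trip point.
 */
enum alarm_level classify_upper(long value, long warn_limit, long fault_limit)
{
    if (value > fault_limit)
        return ALARM_FAULT;
    if (value > warn_limit)
        return ALARM_WARNING;
    return ALARM_NONE;
}
```

With a warning limit of 95000 and a fault limit of 125000 (millidegrees Celsius), a reading of 35500 is normal, 100000 is a warning, and 130000 is a fault.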
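How is an alarm reported to the host?
Alarms are usually reported in one of the following ways:
- the host polls the PMBus device;
- the PMBus device notifies the host through the SMBALERT# interrupt;
- the Host Notify Protocol: the PMBus device temporarily becomes the bus master and sends a dedicated message to notify the system host.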
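Whichever path is used, the host ends up reading the device's status registers. As an illustration of the polling case, the sketch below decodes a STATUS_BYTE value into the names of its set bits. The bit layout follows the PMBus specification as I understand it (bit 7 BUSY down to bit 0 NONE_OF_THE_ABOVE); decode_status_byte is a hypothetical helper, not a kernel function.

```c
#include <stdint.h>
#include <string.h>

/* STATUS_BYTE bit names, indexed by bit number (bit 0 first). */
static const char *const status_byte_names[8] = {
    "NONE_OF_THE_ABOVE", "CML", "TEMPERATURE", "VIN_UV_FAULT",
    "IOUT_OC_FAULT", "VOUT_OV_FAULT", "OFF", "BUSY",
};

/*
 * Write the names of all set bits into out, most significant bit first,
 * separated by single spaces. Output is truncated if out is too small.
 * Returns the number of bits that are set.
 */
int decode_status_byte(uint8_t status, char *out, size_t outlen)
{
    int count = 0;

    if (outlen == 0)
        return 0;
    out[0] = '\0';

    for (int bit = 7; bit >= 0; bit--) {
        if (!(status & (1u << bit)))
            continue;
        if (count > 0)
            strncat(out, " ", outlen - strlen(out) - 1);
        strncat(out, status_byte_names[bit], outlen - strlen(out) - 1);
        count++;
    }
    return count;
}
```

For example, a status value of 0x28 has bits 5 and 3 set and decodes to "VOUT_OV_FAULT VIN_UV_FAULT".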
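When is an alarm cancelled or cleared? Does a reboot clear it?
Once a warning or fault has been reported, it can only be cleared in one of the following ways:
- the PMBus device receives a CLEAR_FAULTS command;
- the device's RESET pin is asserted;
- the device is turned off and back on through the CONTROL pin or the OPERATION command;
- power is removed.
Note that if the abnormal condition is still present, the alarm is reported again immediately after being cleared.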
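The latching behaviour described above can be modelled with two flags: whether the condition is present right now, and the sticky bit the host sees. This is a simplified model for illustration, not code from any device or driver, and all names are made up.

```c
#include <stdbool.h>

struct latched_alarm {
    bool condition; /* abnormal condition present right now */
    bool latched;   /* sticky status bit reported to the host */
};

/* Called each time the device samples the monitored parameter. */
void alarm_sample(struct latched_alarm *a, bool abnormal)
{
    a->condition = abnormal;
    if (abnormal)
        a->latched = true; /* recovery alone never clears the bit */
}

/*
 * Models CLEAR_FAULTS: the bit is cleared, but it is re-asserted
 * immediately if the abnormal condition is still present.
 */
void alarm_clear_faults(struct latched_alarm *a)
{
    a->latched = a->condition;
}
```

After a sample with abnormal = true followed by one with abnormal = false, latched is still true; it only drops once alarm_clear_faults() runs while the condition is absent.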
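Linux PMBus Driver Design Analysis
The PMBus device drivers live in linux/drivers/hwmon/pmbus, and the files are organized into three parts.
The most interesting part is the generic PMBus device driver framework. Its design has to solve two main problems:
- supporting vendor-specific feature sets. The PMBus specification defines a set of functions, some of them basic and some optional, so each device implements its own subset;
- supporting vendor-specific (manufacturer-defined) registers.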
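The data model of the pmbus driver framework is as follows. Its central object is i2c_client, the I2C device object, which embeds (in object-oriented terms, inherits from) the device object of the Linux driver model. PMBus device information is reached through the driver model's driver_data pointer and is implemented by the pmbus_data object. pmbus_data in turn references two main objects:
- pmbus_driver_info: describes the feature set supported by the PMBus device and the related callbacks; it is provided by the chip driver.
- pmbus_sensor: a linked list of the sensors monitored on the PMBus device, covering the voltage, current, power, temperature, and fan classes.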
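The feature set of a pmbus device can be identified in two ways:
- through the identify callback of pmbus_driver_info. It works by reading the register behind each function: if the read succeeds, the device supports the function, otherwise it does not;
- by statically initializing pmbus_driver_info.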
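The probing approach can be sketched in user space as follows. The command codes are the standard PMBus READ_* codes; the HAVE_* flags, the read_word_fn callback type, and the mock device are assumptions made for this example and do not match the kernel's own definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard PMBus command codes. */
#define CMD_READ_VIN            0x88
#define CMD_READ_IIN            0x89
#define CMD_READ_VOUT           0x8B
#define CMD_READ_IOUT           0x8C
#define CMD_READ_TEMPERATURE_1  0x8D

/* Hypothetical capability flags for this sketch. */
#define HAVE_VIN   (1u << 0)
#define HAVE_IIN   (1u << 1)
#define HAVE_VOUT  (1u << 2)
#define HAVE_IOUT  (1u << 3)
#define HAVE_TEMP  (1u << 4)

/* Returns the register value (>= 0) or a negative error code. */
typedef int (*read_word_fn)(void *ctx, uint8_t reg);

/* Probe each register; a successful read means the function is supported. */
uint32_t identify_functions(read_word_fn read_word, void *ctx)
{
    static const struct {
        uint8_t reg;
        uint32_t flag;
    } probes[] = {
        { CMD_READ_VIN, HAVE_VIN },
        { CMD_READ_IIN, HAVE_IIN },
        { CMD_READ_VOUT, HAVE_VOUT },
        { CMD_READ_IOUT, HAVE_IOUT },
        { CMD_READ_TEMPERATURE_1, HAVE_TEMP },
    };
    uint32_t func = 0;

    for (size_t i = 0; i < sizeof(probes) / sizeof(probes[0]); i++) {
        if (read_word(ctx, probes[i].reg) >= 0)
            func |= probes[i].flag;
    }
    return func;
}

/* Mock device that implements only READ_VIN and READ_TEMPERATURE_1. */
int mock_read_word(void *ctx, uint8_t reg)
{
    (void)ctx;
    switch (reg) {
    case CMD_READ_VIN:
        return 11906;
    case CMD_READ_TEMPERATURE_1:
        return 35;
    default:
        return -1; /* register not present */
    }
}
```

Calling identify_functions(mock_read_word, NULL) yields HAVE_VIN | HAVE_TEMP. Static initialization skips the probing and simply hard-codes the equivalent mask for a known chip.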
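Vendor-specific registers are folded into the pmbus driver framework through virtual registers. The generic pmbus driver only ever sees standard registers and virtual registers. The mapping from a virtual register to the device's own register is done by four callbacks that the chip driver registers: read_byte_data, read_word_data, write_word_data, and write_byte.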
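A user-space sketch of that mapping follows. It mirrors the -ENODATA convention documented in the pmbus_driver_info comment: -ENODATA from the chip callback means "fall back to the standard register", while any other negative value means "do not try". The MFR_PEAK_VIN register at 0xD1, the fake register contents, and all function names are hypothetical.

```c
#include <errno.h>

#define PMBUS_READ_VIN           0x88
#define PMBUS_VIRT_BASE          0x100
#define PMBUS_VIRT_READ_VIN_MAX  (PMBUS_VIRT_BASE + 6)

/* Hypothetical manufacturer-specific register holding the peak VIN. */
#define MFR_PEAK_VIN             0xD1

/* Stand-in for the real bus access: a tiny fake register file. */
int raw_read_word(int reg)
{
    switch (reg) {
    case PMBUS_READ_VIN:
        return 11906;
    case MFR_PEAK_VIN:
        return 12100;
    default:
        return -EIO;
    }
}

/* Chip-specific callback: maps virtual registers to real ones. */
int chip_read_word_data(int page, int reg)
{
    (void)page;
    switch (reg) {
    case PMBUS_VIRT_READ_VIN_MAX:
        return raw_read_word(MFR_PEAK_VIN);
    default:
        if (reg >= PMBUS_VIRT_BASE)
            return -ENXIO;  /* unsupported virtual register */
        return -ENODATA;    /* no mapping: use the standard register */
    }
}

/* Core-side wrapper: try the chip callback first, then the standard path. */
int core_read_word_data(int page, int reg)
{
    int ret = chip_read_word_data(page, reg);

    if (ret != -ENODATA)
        return ret;
    if (reg >= PMBUS_VIRT_BASE)
        return -ENXIO;      /* virtual registers have no standard fallback */
    return raw_read_word(reg);
}
```

Here a read of PMBUS_VIRT_READ_VIN_MAX is served from the manufacturer register, a read of PMBUS_READ_VIN falls through to the standard register, and any other virtual register is reported as absent.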
Usage Example
1. Write and register the driver for the SMBus bus that the PMBus devices sit on. Below, i2c-10[0-3] are the four I2C bus devices implemented by the EPLD:

    / # ls /sys/bus/i2c/devices/ -l
    total 0
    lrwxrwxrwx 1 root root 0 Jan 1 00:00 i2c-100 -> ../../../devices/i2c-100
    lrwxrwxrwx 1 root root 0 Jan 1 00:00 i2c-101 -> ../../../devices/i2c-101
    lrwxrwxrwx 1 root root 0 Jan 1 00:00 i2c-102 -> ../../../devices/i2c-102
    lrwxrwxrwx 1 root root 0 Jan 1 00:00 i2c-103 -> ../../../devices/i2c-103
    / # cats '/sys/bus/i2c/devices/i2c-10*/name'
    /sys/bus/i2c/devices/i2c-100/name: WORK_EPLD1
    /sys/bus/i2c/devices/i2c-101/name: WORK_EPLD2.0
    /sys/bus/i2c/devices/i2c-102/name: WORK_EPLD2.1
    /sys/bus/i2c/devices/i2c-103/name: WORK_NSE
    / #
2. Write and register the PMBus device driver. The devices are instantiated as shown below; hwmon[0-9] are the resulting PMBus devices:

    echo tps53667 0x60 > /sys/bus/i2c/devices/i2c-100/new_device
    echo tps53667 0x62 > /sys/bus/i2c/devices/i2c-100/new_device
    echo tps53667 0x1060 > /sys/bus/i2c/devices/i2c-100/new_device
    echo tps53667 0x1062 > /sys/bus/i2c/devices/i2c-100/new_device
    echo tps53667 0x60 > /sys/bus/i2c/devices/i2c-101/new_device
    echo tps53667 0x60 > /sys/bus/i2c/devices/i2c-102/new_device
    echo tps53667 0x70 > /sys/bus/i2c/devices/i2c-103/new_device
    echo tps53667 0x1071 > /sys/bus/i2c/devices/i2c-103/new_device
    echo tps53667 0x2072 > /sys/bus/i2c/devices/i2c-103/new_device
    echo tps53667 0x3073 > /sys/bus/i2c/devices/i2c-103/new_device
    / # ls /sys/class/hwmon/ -l
    total 0
    lrwxrwxrwx 1 root root 0 Jan 1 00:00 hwmon0 -> ../../devices/i2c-100/100-0060/hwmon/hwmon0
    lrwxrwxrwx 1 root root 0 Jan 1 00:00 hwmon1 -> ../../devices/i2c-100/100-0062/hwmon/hwmon1
    lrwxrwxrwx 1 root root 0 Jan 1 00:00 hwmon2 -> ../../devices/i2c-100/100-1060/hwmon/hwmon2
    lrwxrwxrwx 1 root root 0 Jan 1 00:00 hwmon3 -> ../../devices/i2c-100/100-1062/hwmon/hwmon3
    lrwxrwxrwx 1 root root 0 Jan 1 00:00 hwmon4 -> ../../devices/i2c-101/101-0060/hwmon/hwmon4
    lrwxrwxrwx 1 root root 0 Jan 1 00:00 hwmon5 -> ../../devices/i2c-102/102-0060/hwmon/hwmon5
    lrwxrwxrwx 1 root root 0 Jan 1 00:00 hwmon6 -> ../../devices/i2c-103/103-0070/hwmon/hwmon6
    lrwxrwxrwx 1 root root 0 Jan 1 00:00 hwmon7 -> ../../devices/i2c-103/103-1071/hwmon/hwmon7
    lrwxrwxrwx 1 root root 0 Jan 1 00:00 hwmon8 -> ../../devices/i2c-103/103-2072/hwmon/hwmon8
    lrwxrwxrwx 1 root root 0 Jan 1 00:00 hwmon9 -> ../../devices/i2c-103/103-3073/hwmon/hwmon9
3. Read the monitoring data of a PMBus device. All fields are explained in the kernel documentation.

    /sys/class/hwmon # cats 'hwmon2/device/*'
    # Current readings, unit: mA (milliampere, 1/1000 A)
    hwmon2/device/curr1_crit: 255000
    hwmon2/device/curr1_crit_alarm: 0
    hwmon2/device/curr1_input: 2679
    hwmon2/device/curr1_label: iin
    hwmon2/device/curr1_max: 25000
    hwmon2/device/curr1_max_alarm: 0
    hwmon2/device/curr2_crit: 122000
    hwmon2/device/curr2_crit_alarm: 0
    hwmon2/device/curr2_input: 22968
    hwmon2/device/curr2_label: iout1
    hwmon2/device/curr2_max: 98000
    hwmon2/device/curr2_max_alarm: 0
    hwmon2/device/driver: cat: read error: Is a directory
    hwmon2/device/hwmon: cat: read error: Is a directory
    # Voltage readings, unit: mV (millivolt, 1/1000 V)
    hwmon2/device/in1_crit: 17000
    hwmon2/device/in1_crit_alarm: 0
    hwmon2/device/in1_input: 11906
    hwmon2/device/in1_label: vin
    hwmon2/device/in2_alarm: 0
    hwmon2/device/in2_input: 631
    hwmon2/device/in2_label: vout1
    hwmon2/device/modalias: i2c:tps53667
    hwmon2/device/name: tps53667
    # Power readings, unit: uW (microwatt, 1/1000000 W)
    hwmon2/device/power1_input: 31625000
    hwmon2/device/power1_label: pin
    hwmon2/device/power2_input: 23343750
    hwmon2/device/power2_label: pout1
    hwmon2/device/subsystem: cat: read error: Is a directory
    # Temperature readings, unit: millidegree Celsius (1/1000 °C)
    hwmon2/device/temp1_crit: 125000
    hwmon2/device/temp1_crit_alarm: 0
    hwmon2/device/temp1_input: 35500
    hwmon2/device/temp1_max: 95000
    hwmon2/device/temp1_max_alarm: 0
    hwmon2/device/uevent: DRIVER=tps53667 MODALIAS=i2c:tps53667
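Since every attribute is an integer in a scaled unit, a small helper makes the readings easier to interpret. The sketch below converts a raw hwmon value to volts, amperes, watts, or degrees Celsius based on the attribute name prefix. The helper name is hypothetical, and it only knows the four prefixes that appear in the listing above.

```c
#include <string.h>

/*
 * Convert a raw hwmon sysfs value to its base unit.
 * in*, curr*, temp* are reported in milli-units; power* in micro-units.
 * Unknown attributes are returned unscaled.
 */
double hwmon_to_base_unit(const char *attr, long raw)
{
    if (strncmp(attr, "power", 5) == 0)
        return raw / 1e6;
    if (strncmp(attr, "in", 2) == 0 ||
        strncmp(attr, "curr", 4) == 0 ||
        strncmp(attr, "temp", 4) == 0)
        return raw / 1e3;
    return (double)raw;
}
```

Applied to the data above, in1_input 11906 becomes 11.906 V, curr1_input 2679 becomes 2.679 A, power1_input 31625000 becomes 31.625 W, and temp1_input 35500 becomes 35.5 °C.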
4. Provoke an OV fault alarm as shown below (in1_crit is the upper critical limit of the input voltage, so lowering it below the actual input trips the alarm). The alarm persists after the abnormal condition goes away, a reboot does not clear it either, and it only disappears after being cleared manually.

    /sys/devices/i2c-103/103-3073 # cats 'in*'
    in1_crit: 17000
    in1_crit_alarm: 0
    in1_input: 11968
    in1_label: vin
    in2_alarm: 0
    in2_input: 0
    in2_label: vout1
    /sys/devices/i2c-103/103-3073 # echo 10000 > in1_crit    # set the input-voltage OV fault threshold to 10 V
    /sys/devices/i2c-103/103-3073 # cats 'in*'
    in1_crit: 10000
    in1_crit_alarm: 1    # alarm triggered
    in1_input: 11953
    in1_label: vin
    in2_alarm: 0
    in2_input: 0
    in2_label: vout1
    /sys/devices/i2c-103/103-3073 # echo 17000 > in1_crit    # restore the input-voltage OV fault threshold to 17 V
    /sys/devices/i2c-103/103-3073 # cats 'in*'
    in1_crit: 17000
    in1_crit_alarm: 1    # the condition is gone, but the alarm remains
    in1_input: 11968
    in1_label: vin
    in2_alarm: 0
    in2_input: 0
    in2_label: vout1
    /sys/devices/i2c-103/103-3073 # reboot
    ... (boot output omitted)
    /sys/devices/i2c-103/103-3073 # cats 'in*'
    in1_crit: 17000
    in1_crit_alarm: 1    # a reboot does not clear the alarm
    in1_input: 11968
    in1_label: vin
    in2_alarm: 0
    in2_input: 0
    in2_label: vout1
    /sys/devices/i2c-103/103-3073 # echo 0 > clear_fault    # clear the alarm manually
    /sys/devices/i2c-103/103-3073 # cats 'in*'
    in1_crit: 17000
    in1_crit_alarm: 0    # alarm cleared
    in1_input: 11968
    in1_label: vin
    in2_alarm: 0
    in2_input: 631
    in2_label: vout1
Appendix: main data structures

    struct pmbus_data {
        struct device *dev;
        struct device *hwmon_dev;

        u32 flags;      /* from platform data */

        int exponent;   /* linear mode: exponent for output voltages */

        const struct pmbus_driver_info *info;

        int max_attributes;
        int num_attributes;
        struct attribute_group group;

        struct pmbus_sensor *sensors;

        struct mutex update_lock;
        bool valid;
        unsigned long last_updated; /* in jiffies */

        /*
         * A single status register covers multiple attributes,
         * so we keep them all together.
         */
        u8 status[PB_NUM_STATUS_REG];
        u8 status_register;

        u8 currpage;
    };

    struct pmbus_driver_info {
        int pages;      /* Total number of pages */
        enum pmbus_data_format format[PSC_NUM_CLASSES];
        /*
         * Support one set of coefficients for each sensor type
         * Used for chips providing data in direct mode.
         */
        int m[PSC_NUM_CLASSES]; /* mantissa for direct data format */
        int b[PSC_NUM_CLASSES]; /* offset */
        int R[PSC_NUM_CLASSES]; /* exponent */

        u32 func[PMBUS_PAGES];  /* Functionality, per page */
        /*
         * The following functions map manufacturing specific register values
         * to PMBus standard register values. Specify only if mapping is
         * necessary.
         * Functions return the register value (read) or zero (write) if
         * successful. A return value of -ENODATA indicates that there is no
         * manufacturer specific register, but that a standard PMBus register
         * may exist. Any other negative return value indicates that the
         * register does not exist, and that no attempt should be made to read
         * the standard register.
         */
        int (*read_byte_data)(struct i2c_client *client, int page, int reg);
        int (*read_word_data)(struct i2c_client *client, int page, int reg);
        int (*write_word_data)(struct i2c_client *client, int page, int reg,
                               u16 word);
        int (*write_byte)(struct i2c_client *client, int page, u8 value);
        /*
         * The identify function determines supported PMBus functionality.
         * This function is only necessary if a chip driver supports multiple
         * chips, and the chip functionality is not pre-determined.
         */
        int (*identify)(struct i2c_client *client,
                        struct pmbus_driver_info *info);
    };

    struct pmbus_sensor {
        struct pmbus_sensor *next;
        char name[PMBUS_NAME_SIZE];         /* sysfs sensor name */
        struct device_attribute attribute;
        u8 page;                            /* page number */
        u16 reg;                            /* register */
        enum pmbus_sensor_classes class;    /* sensor class */
        bool update;                        /* runtime sensor update needed */
        int data;       /* Sensor data. Negative if there was a read error */
    };

    /*
     * Virtual registers.
     * Useful to support attributes which are not supported by standard PMBus
     * registers but exist as manufacturer specific registers on individual chips.
     * Must be mapped to real registers in device specific code.
     *
     * Semantics:
     * Virtual registers are all word size.
     * READ registers are read-only; writes are either ignored or return an error.
     * RESET registers are read/write. Reading reset registers returns zero
     * (used for detection), writing any value causes the associated history to be
     * reset.
     * Virtual registers have to be handled in device specific driver code. Chip
     * driver code returns non-negative register values if a virtual register is
     * supported, or a negative error code if not. The chip driver may return
     * -ENODATA or any other error code in this case, though an error code other
     * than -ENODATA is handled more efficiently and thus preferred. Either case,
     * the calling PMBus core code will abort if the chip driver returns an error
     * code when reading or writing virtual registers.
     */
    #define PMBUS_VIRT_BASE                 0x100
    #define PMBUS_VIRT_READ_TEMP_AVG        (PMBUS_VIRT_BASE + 0)
    #define PMBUS_VIRT_READ_TEMP_MIN        (PMBUS_VIRT_BASE + 1)
    #define PMBUS_VIRT_READ_TEMP_MAX        (PMBUS_VIRT_BASE + 2)
    #define PMBUS_VIRT_RESET_TEMP_HISTORY   (PMBUS_VIRT_BASE + 3)
    #define PMBUS_VIRT_READ_VIN_AVG         (PMBUS_VIRT_BASE + 4)
    #define PMBUS_VIRT_READ_VIN_MIN         (PMBUS_VIRT_BASE + 5)
    #define PMBUS_VIRT_READ_VIN_MAX         (PMBUS_VIRT_BASE + 6)
    #define PMBUS_VIRT_RESET_VIN_HISTORY    (PMBUS_VIRT_BASE + 7)
    #define PMBUS_VIRT_READ_IIN_AVG         (PMBUS_VIRT_BASE + 8)
    #define PMBUS_VIRT_READ_IIN_MIN         (PMBUS_VIRT_BASE + 9)
    #define PMBUS_VIRT_READ_IIN_MAX         (PMBUS_VIRT_BASE + 10)
    #define PMBUS_VIRT_RESET_IIN_HISTORY    (PMBUS_VIRT_BASE + 11)
    #define PMBUS_VIRT_READ_PIN_AVG         (PMBUS_VIRT_BASE + 12)
    #define PMBUS_VIRT_READ_PIN_MAX         (PMBUS_VIRT_BASE + 13)
    #define PMBUS_VIRT_RESET_PIN_HISTORY    (PMBUS_VIRT_BASE + 14)
    #define PMBUS_VIRT_READ_POUT_AVG        (PMBUS_VIRT_BASE + 15)
    #define PMBUS_VIRT_READ_POUT_MAX        (PMBUS_VIRT_BASE + 16)
    #define PMBUS_VIRT_RESET_POUT_HISTORY   (PMBUS_VIRT_BASE + 17)
    #define PMBUS_VIRT_READ_VOUT_AVG        (PMBUS_VIRT_BASE + 18)
    #define PMBUS_VIRT_READ_VOUT_MIN        (PMBUS_VIRT_BASE + 19)
    #define PMBUS_VIRT_READ_VOUT_MAX        (PMBUS_VIRT_BASE + 20)
    #define PMBUS_VIRT_RESET_VOUT_HISTORY   (PMBUS_VIRT_BASE + 21)
    #define PMBUS_VIRT_READ_IOUT_AVG        (PMBUS_VIRT_BASE + 22)
    #define PMBUS_VIRT_READ_IOUT_MIN        (PMBUS_VIRT_BASE + 23)
    #define PMBUS_VIRT_READ_IOUT_MAX        (PMBUS_VIRT_BASE + 24)
    #define PMBUS_VIRT_RESET_IOUT_HISTORY   (PMBUS_VIRT_BASE + 25)
    #define PMBUS_VIRT_READ_TEMP2_AVG       (PMBUS_VIRT_BASE + 26)
    #define PMBUS_VIRT_READ_TEMP2_MIN       (PMBUS_VIRT_BASE + 27)
    #define PMBUS_VIRT_READ_TEMP2_MAX       (PMBUS_VIRT_BASE + 28)
    #define PMBUS_VIRT_RESET_TEMP2_HISTORY  (PMBUS_VIRT_BASE + 29)
    #define PMBUS_VIRT_READ_VMON            (PMBUS_VIRT_BASE + 30)
    #define PMBUS_VIRT_VMON_UV_WARN_LIMIT   (PMBUS_VIRT_BASE + 31)
    #define PMBUS_VIRT_VMON_OV_WARN_LIMIT   (PMBUS_VIRT_BASE + 32)
    #define PMBUS_VIRT_VMON_UV_FAULT_LIMIT  (PMBUS_VIRT_BASE + 33)
    #define PMBUS_VIRT_VMON_OV_FAULT_LIMIT  (PMBUS_VIRT_BASE + 34)
    #define PMBUS_VIRT_STATUS_VMON          (PMBUS_VIRT_BASE + 35)
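The m, b, and R coefficients and the exponent field above exist because PMBus devices report raw register words that must be converted to real-world values. As a rough illustration, the sketch below decodes the two common formats in user space: the direct format, X = (1/m) * (Y * 10^(-R) - b), and the LINEAR11 format, a 5-bit signed exponent N with an 11-bit signed mantissa Y giving X = Y * 2^N. The function names are made up for this example, m is assumed to be non-zero, and the kernel's own conversion uses integer arithmetic rather than doubles.

```c
#include <stdint.h>

/* Direct format: X = (1/m) * (Y * 10^(-R) - b). Assumes m != 0. */
double pmbus_direct_to_real(int16_t y, int m, int b, int r)
{
    double v = (double)y;

    /* Multiply by 10^(-r) without pulling in libm. */
    for (; r > 0; r--)
        v /= 10.0;
    for (; r < 0; r++)
        v *= 10.0;

    return (v - b) / m;
}

/* LINEAR11: bits 15..11 = signed exponent N, bits 10..0 = signed mantissa Y. */
double pmbus_linear11_to_real(uint16_t word)
{
    int exponent = (word >> 11) & 0x1f;
    int mantissa = word & 0x7ff;
    double v;

    /* Sign-extend the two's complement fields. */
    if (exponent > 15)
        exponent -= 32;
    if (mantissa > 1023)
        mantissa -= 2048;

    v = (double)mantissa;
    for (; exponent > 0; exponent--)
        v *= 2.0;
    for (; exponent < 0; exponent++)
        v /= 2.0;

    return v;
}
```

For instance, the LINEAR11 word 0xF02F carries exponent -2 and mantissa 47, which decodes to 47 / 4 = 11.75.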