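Linux Network Drivers

1 Introduction

1.1 Hardware Overview

Embedded network hardware consists of two parts: the MAC and the PHY. The MAC is usually integrated into the SoC, while the PHY is an external device.

(1) The SoC has no internal MAC

If the SoC has no built-in network MAC peripheral, an external MAC can be used. External network chips usually integrate the MAC and the PHY in one package and expose an SRAM-style bus or an SPI interface to the SoC.

[Figure: SoC connected to an external MAC+PHY network chip]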

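(2) The SoC integrates a network MAC peripheral

When the MAC is integrated into the SoC, an external PHY chip is required. The I.MX series integrates a MAC peripheral, which has the following advantages:

1. The built-in MAC has dedicated acceleration hardware, such as DMA, that speeds up network data handling.

2. Higher network speed.

3. A wide choice of external PHY chips, at low cost.

The built-in MAC connects to the external PHY through an MII or RMII interface, which carries the network data. The SoC usually configures the PHY and reads its internal registers through the MDIO interface, which has two lines: the data line MDIO and the clock line MDC.

[Figure: SoC internal MAC connected to an external PHY]

The I.MX6ULL has two 10M/100M network MAC peripherals, and the MINI development board connects an external LAN8720A PHY chip.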

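1.2 MII/RMII Interfaces

The SoC's internal MAC connects to the external PHY chip through an MII or RMII interface, which carries the network data.

(1) MII interface

MII (Media Independent Interface) is the standard Ethernet interface defined by IEEE 802.3 for connecting an Ethernet MAC to a PHY chip. The connection is shown below:

[Figure: MII connection between the MAC and the PHY]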

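The MII interface has 16 signal lines in total:
TX_CLK: transmit clock, 25 MHz at 100 Mbps and 2.5 MHz at 10 Mbps. This clock is generated by the PHY and sent to the MAC.
TX_EN: transmit enable.
TX_ER: transmit error, active high. Data transferred while TX_ER is asserted is invalid. TX_ER has no effect at 10 Mbps.
TXD[3:0]: transmit data, 4 lines.
RXD[3:0]: receive data, 4 lines.
RX_CLK: receive clock, 25 MHz at 100 Mbps and 2.5 MHz at 10 Mbps. RX_CLK is also generated by the PHY.
RX_ER: receive error, active high. Data transferred while RX_ER is asserted is invalid. RX_ER has no effect at 10 Mbps.
RX_DV: receive data valid, the receive-side counterpart of TX_EN.
CRS: carrier sense.
COL: collision detect.
The drawback of MII is the number of signal lines it needs, and that is without counting the two management lines MDIO and MDC. For this reason MII is used less and less.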

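(2) RMII interface

RMII (Reduced Media Independent Interface) is a reduced-pin-count version of MII. The connection between an RMII MAC and a PHY chip is shown below:

[Figure: RMII connection between the MAC and the PHY]

TX_EN: transmit enable.

TXD[1:0]: transmit data, 2 lines.

RXD[1:0]: receive data, 2 lines.

CRS_DV: combines the functions of the RX_DV and CRS signals of MII.

REF_CLK: reference clock, supplied by an external clock source at 50 MHz. This differs from MII, where the receive and transmit clocks are separate and both are provided by the PHY.
Besides MII and RMII there are other interfaces, such as GMII, RGMII and SMII. They all work in broadly the same way.

The MINI development board uses RMII to connect the MAC to the external PHY chip.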
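The clock figures quoted for MII and RMII can be checked with simple arithmetic: the raw data rate in one direction is the number of data lines times the clock frequency. The following is a minimal sketch; bus_rate_kbps is a hypothetical helper name chosen for illustration, not a kernel function.

```c
#include <assert.h>

/* Raw data rate in kbit/s of a MAC-PHY data bus in one direction:
 * number of data lines multiplied by the clock frequency in kHz.
 * bus_rate_kbps is an illustrative name, not part of any kernel API. */
static unsigned long bus_rate_kbps(unsigned int data_lines,
                                   unsigned long clock_khz)
{
    assert(data_lines > 0);
    return data_lines * clock_khz;
}
```

With 4 data lines at 25 MHz (MII) and 2 data lines at 50 MHz (RMII) the result is 100 Mbps in both cases, which is why halving the data width requires doubling the clock.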

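1.3 MDIO Interface

MDIO (Management Data Input/Output) is a serial management interface with one data line, MDIO, and one clock line, MDC. A driver can access any register of a PHY chip through MDIO. One MDIO bus supports up to 32 PHYs. Only one PHY can be accessed at a time, and the PHYs are distinguished by their device addresses.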
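To make the addressing concrete, here is a sketch of how a management frame is laid out according to IEEE 802.3 clause 22: after a preamble of 32 one bits, the frame carries a 2-bit start code, a 2-bit opcode, a 5-bit PHY address, a 5-bit register address, a 2-bit turnaround field and 16 data bits. The function and macro names below are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Field encodings of a clause 22 management frame.
 * The macro names are local to this sketch. */
#define MDIO_ST_C22   0x1u  /* start of frame, binary 01 */
#define MDIO_OP_WRITE 0x1u  /* opcode write, binary 01 */
#define MDIO_OP_READ  0x2u  /* opcode read, binary 10 */
#define MDIO_TA       0x2u  /* turnaround, binary 10 */

/* Pack the 32 bits that follow the preamble into one word.
 * For a read, the data field is supplied by the PHY, so pass 0. */
static uint32_t mdio_c22_frame(unsigned int op, unsigned int phy_addr,
                               unsigned int reg_addr, uint16_t data)
{
    /* Both addresses are 5 bits wide: at most 32 PHYs, 32 registers each. */
    assert(phy_addr < 32 && reg_addr < 32);

    return (MDIO_ST_C22 << 30) |
           ((uint32_t)op << 28) |
           ((uint32_t)phy_addr << 23) |
           ((uint32_t)reg_addr << 18) |
           (MDIO_TA << 16) |
           data;
}
```

The 5-bit PHY address field is what limits one MDIO bus to 32 PHYs, and the 5-bit register address field is what limits each PHY to 32 directly addressable registers.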

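1.4 RJ45 Connector

The RJ45 connector is where the network cable plugs in. A network transformer is required between the RJ45 socket and the PHY. An RJ45 socket usually has two LEDs, one yellow and one green: a lit green LED indicates that the link is up, and a blinking yellow LED indicates network activity. Both LEDs are controlled by the PHY chip. The internal MAC, the external PHY and the RJ45 socket (with a built-in network transformer) together form a complete embedded network interface.

[Figure: internal MAC, external PHY and RJ45 socket with built-in transformer]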

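1.5 Interfaces on the MINI Development Board

1.5.1 I.MX6ULL ENET Interface

The ENET peripheral built into the I.MX6ULL is a network MAC that supports 10/100M.

The ENET peripheral has a dedicated DMA that moves data between ENET and the rest of the SoC, and it supports programmable enhanced buffer descriptors in order to support IEEE 1588.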

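1.5.2 PHY Chip

The PHY is a standard module specified by IEEE 802.3.

A PHY register address is 5 bits wide, giving 32 registers at addresses 0~31. IEEE defines the functions of registers 0~15; registers 16~31 are left to the vendor.

Registers 0~15 alone are enough to drive a PHY chip, so the generic PHY driver in the Linux kernel can normally provide basic network communication on a PHY even without a vendor-specific driver.

Section 22.2.4, Management functions, of the IEEE 802.3 standard specifies the functions of these first 16 PHY registers.

[Figure: PHY register set defined in IEEE 802.3 section 22.2.4]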

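1.5.3 On-board LAN8720A PHY Chip

(1) Overview

The LAN8720A is a low-power, single-port 10/100M Ethernet PHY chip. It communicates with the Ethernet MAC through an RMII interface and contains a full-duplex 10BASE-T/100BASE-TX transceiver that supports 10 Mbps and 100 Mbps.

Through auto-negotiation the LAN8720A selects the best connection mode (speed and duplex) with the link partner. It supports HP Auto-MDIX, so a link can work as straight-through or crossover without changing the cable.

The main features of the LAN8720A are:
· High-performance 10/100M Ethernet transceiver
· RMII interface to reduce pin count
· Full-duplex and half-duplex modes
· Two status LED outputs
· Can use a 25 MHz crystal to reduce cost
· Auto-negotiation
· HP Auto-MDIX
· SMI serial management interface
· MAC interface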

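(2) PHY address setting

The MAC reads and writes the PHY over the MDIO/MDC bus. One MDIO bus can manage up to 32 PHY chips, which are selected by their PHY addresses. The LAN8720A takes its PHY address from the RXER/PHYAD0 pin; the default address is 0. The address settings are listed in the table:

[Table: LAN8720A PHY address selected by the RXER/PHYAD0 pin]

On the MINI development board, the PHY on ENET1 uses address 0.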

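(3) Internal registers

The first 16 registers of the LAN8720A comply with the IEEE requirements. A few commonly used registers are described here.

1) BCR (Basic Control Register), address 0:

[Figure: BCR bit definitions]

2) BSR (Basic Status Register), address 1:

[Figure: BSR bit definitions]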
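As an illustration of how a driver interprets these two registers, the sketch below decodes a few of the bits that IEEE 802.3 clause 22 assigns to registers 0 and 1. The bit positions follow the standard; the macro, struct and function names are hypothetical and local to this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Selected bits of register 0 (basic control) and register 1
 * (basic status) as defined by IEEE 802.3 clause 22. */
#define BCR_RESET        0x8000u /* bit 15: software reset */
#define BCR_SPEED_100    0x2000u /* bit 13: 1 = 100 Mbps, 0 = 10 Mbps */
#define BCR_AUTONEG_EN   0x1000u /* bit 12: auto-negotiation enable */
#define BCR_FULL_DUPLEX  0x0100u /* bit 8: 1 = full duplex */
#define BSR_AUTONEG_DONE 0x0020u /* bit 5: auto-negotiation complete */
#define BSR_LINK_UP      0x0004u /* bit 2: link status */

struct phy_basic_state {
    bool autoneg_enabled;
    bool speed_100;      /* speed selection bit in the control register */
    bool full_duplex;    /* duplex selection bit in the control register */
    bool autoneg_done;
    bool link_up;
};

/* Decode raw register values that were read over MDIO. */
static struct phy_basic_state phy_decode_basic(uint16_t bcr, uint16_t bsr)
{
    struct phy_basic_state s;

    s.autoneg_enabled = (bcr & BCR_AUTONEG_EN) != 0;
    s.speed_100 = (bcr & BCR_SPEED_100) != 0;
    s.full_duplex = (bcr & BCR_FULL_DUPLEX) != 0;
    s.autoneg_done = (bsr & BSR_AUTONEG_DONE) != 0;
    s.link_up = (bsr & BSR_LINK_UP) != 0;
    return s;
}
```

Note that the speed and duplex bits in the control register only select the mode directly when auto-negotiation is disabled; with auto-negotiation enabled, the negotiated result has to be read from other registers.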

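3) PHY ID registers, addresses 2 and 3. Together these two registers form a unique 32-bit ID:

[Figures: bit definitions of PHY ID register 2 and PHY ID register 3]

IEEE specifies that the PHY ID is built around the OUI (Organizationally Unique Identifier). The ID is 32 bits in total and is divided into three parts (OUI bits, model number and revision number):

[Figure: the three parts of the 32-bit PHY ID]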
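The three parts of the ID can be separated with a few shifts and masks. The sketch below follows the clause 22 layout: register 2 holds OUI bits 3 to 18, and register 3 holds OUI bits 19 to 24 in bits 15:10, the model number in bits 9:4 and the revision in bits 3:0. The names are hypothetical, and the sample values used afterwards (0x0007 and 0xC0F1) are assumed values for a LAN8720A-style part rather than values taken from this document.

```c
#include <assert.h>
#include <stdint.h>

struct phy_id_fields {
    uint32_t id;        /* register 2 in the upper half, register 3 in the lower */
    uint32_t oui_bits;  /* OUI bits 3..24, packed into 22 bits */
    unsigned int model; /* 6-bit vendor model number */
    unsigned int rev;   /* 4-bit revision number */
};

/* Split the two PHY identifier registers into their fields. */
static struct phy_id_fields phy_id_decode(uint16_t reg2, uint16_t reg3)
{
    struct phy_id_fields f;

    f.id = ((uint32_t)reg2 << 16) | reg3;
    f.oui_bits = ((uint32_t)reg2 << 6) | (uint32_t)(reg3 >> 10);
    f.model = (reg3 >> 4) & 0x3Fu;
    f.rev = reg3 & 0x0Fu;
    return f;
}
```

A PHY driver typically compares the ID with the revision bits masked off, so that one driver entry matches every revision of the same chip.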

4) Special Control/Status Register, address 31:

[Figure: Special Control/Status Register bit definitions]

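2 Linux Kernel Network Driver Framework

2.1 The net_device Structure

The Linux kernel represents a specific network device with the net_device structure. The core of a network driver is to initialise the members of this structure and then register it with the kernel.

It is defined in include/linux/netdevice.h: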

struct net_device {
	char			name[IFNAMSIZ];	/* network device name */
	struct hlist_node	name_hlist;
	char 			*ifalias;
	/*
	 *	I/O specific fields
	 *	FIXME: Merge these and struct ifmap into one
	 */
	unsigned long		mem_end;	/* end address of shared memory */
	unsigned long		mem_start;	/* start address of shared memory */
	unsigned long		base_addr;	/* I/O base address of the device */
	int					irq;	/* interrupt number of the device */

	atomic_t		carrier_changes;

	/*
	 *	Some hardware also needs these fields (state,dev_list,
	 *	napi_list,unreg_list,close_list) but they are not
	 *	part of the usual set specified in Space.c.
	 */

	unsigned long		state;

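	struct list_head	dev_list;	/* global list of network devices */
	struct list_head	napi_list;	/* list entry used for polling NAPI devices */
	struct list_head	unreg_list;	/* list entry used while unregistering the device */
	struct list_head	close_list;	/* list entry used while closing the device */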
	struct list_head	ptype_all;
	struct list_head	ptype_specific;

	struct {
		struct list_head upper;
		struct list_head lower;
	} adj_list;

	struct {
		struct list_head upper;
		struct list_head lower;
	} all_adj_list;

	netdev_features_t	features;
	netdev_features_t	hw_features;
	netdev_features_t	wanted_features;
	netdev_features_t	vlan_features;
	netdev_features_t	hw_enc_features;
	netdev_features_t	mpls_features;

	int			ifindex;
	int			group;

	struct net_device_stats	stats;

	atomic_long_t		rx_dropped;
	atomic_long_t		tx_dropped;

#ifdef CONFIG_WIRELESS_EXT
	const struct iw_handler_def *	wireless_handlers;
	struct iw_public_data *	wireless_data;
#endif
	const struct net_device_ops *netdev_ops;	/* network device operation callbacks */
	const struct ethtool_ops *ethtool_ops;		/* callbacks for the ethtool management tool */
#ifdef CONFIG_NET_SWITCHDEV
	const struct swdev_ops *swdev_ops;
#endif

	const struct header_ops *header_ops;	/* link-layer header operation callbacks */

	unsigned int		flags;	/* interface flags */
	unsigned int		priv_flags;

	unsigned short		gflags;
	unsigned short		padded;

	unsigned char		operstate;
	unsigned char		link_mode;

	unsigned char		if_port;/* port type */
	unsigned char		dma;	/* DMA channel used by the device */

	unsigned int		mtu;	/* maximum transmission unit, 1500 by default for Ethernet */
	unsigned short		type;	/* ARP hardware type */
	unsigned short		hard_header_len;

	unsigned short		needed_headroom;
	unsigned short		needed_tailroom;

	/* Interface address info. */
	unsigned char		perm_addr[MAX_ADDR_LEN];	/* permanent hardware address */
	unsigned char		addr_assign_type;
	unsigned char		addr_len;	/* hardware address length */
	unsigned short		neigh_priv_len;
	unsigned short      dev_id;
	unsigned short      dev_port;
	spinlock_t		addr_list_lock;
	unsigned char	name_assign_type;
	bool			uc_promisc;
	struct netdev_hw_addr_list	uc;
	struct netdev_hw_addr_list	mc;
	struct netdev_hw_addr_list	dev_addrs;

#ifdef CONFIG_SYSFS
	struct kset		*queues_kset;
#endif
	unsigned int		promiscuity;
	unsigned int		allmulti;


	/* Protocol specific pointers */

#if IS_ENABLED(CONFIG_VLAN_8021Q)
	struct vlan_info __rcu	*vlan_info;
#endif
#if IS_ENABLED(CONFIG_NET_DSA)
	struct dsa_switch_tree	*dsa_ptr;
#endif
#if IS_ENABLED(CONFIG_TIPC)
	struct tipc_bearer __rcu *tipc_ptr;
#endif
	void 			*atalk_ptr;
	struct in_device __rcu	*ip_ptr;
	struct dn_dev __rcu     *dn_ptr;
	struct inet6_dev __rcu	*ip6_ptr;
	void			*ax25_ptr;
	struct wireless_dev	*ieee80211_ptr;
	struct wpan_dev		*ieee802154_ptr;
#if IS_ENABLED(CONFIG_MPLS_ROUTING)
	struct mpls_dev __rcu	*mpls_ptr;
#endif

/*
 * Cache lines mostly used on receive path (including eth_type_trans())
 */
	unsigned long		last_rx;	/* time of the last received packet, in jiffies */

	/* Interface address info used in eth_type_trans() */
	unsigned char		*dev_addr;	/* hardware address: the currently assigned MAC address, changeable by software */


#ifdef CONFIG_SYSFS
	struct netdev_rx_queue	*_rx;	/* receive queues */

	unsigned int		num_rx_queues;	/* number of receive queues */
	unsigned int		real_num_rx_queues;	/* number of receive queues currently active */

#endif

	unsigned long		gro_flush_timeout;
	rx_handler_func_t __rcu	*rx_handler;
	void __rcu		*rx_handler_data;

	struct netdev_queue __rcu *ingress_queue;
	unsigned char		broadcast[MAX_ADDR_LEN];
#ifdef CONFIG_RFS_ACCEL
	struct cpu_rmap		*rx_cpu_rmap;
#endif
	struct hlist_node	index_hlist;

/*
 * Cache lines mostly used on transmit path
 */
	struct netdev_queue	*_tx ____cacheline_aligned_in_smp; /* transmit queues */
	unsigned int		num_tx_queues;		/* number of transmit queues */
	unsigned int		real_num_tx_queues;	/* number of transmit queues currently active */
	struct Qdisc		*qdisc;
	unsigned long		tx_queue_len;
	spinlock_t			tx_global_lock;
	int					watchdog_timeo;

#ifdef CONFIG_XPS
	struct xps_dev_maps __rcu *xps_maps;
#endif

	/* These may be needed for future network-power-down code. */

	/*
	 * trans_start here is expensive for high speed devices on SMP,
	 * please use netdev_queue->trans_start instead.
	 */
	unsigned long		trans_start;	/* time of the last transmitted packet, in jiffies */

	struct timer_list	watchdog_timer;

	int __percpu		*pcpu_refcnt;
	struct list_head	todo_list;

	struct list_head	link_watch_list;

	enum { NETREG_UNINITIALIZED=0,
	       NETREG_REGISTERED,	/* completed register_netdevice */
	       NETREG_UNREGISTERING,	/* called unregister_netdevice */
	       NETREG_UNREGISTERED,	/* completed unregister todo */
	       NETREG_RELEASED,		/* called free_netdev */
	       NETREG_DUMMY,		/* dummy device for NAPI poll */
	} reg_state:8;

	bool dismantle;

	enum {
		RTNL_LINK_INITIALIZED,
		RTNL_LINK_INITIALIZING,
	} rtnl_link_state:16;

	void (*destructor)(struct net_device *dev);

#ifdef CONFIG_NETPOLL
	struct netpoll_info __rcu	*npinfo;
#endif

	possible_net_t			nd_net;

	/* mid-layer private */
	union {
		void					*ml_priv;
		struct pcpu_lstats __percpu		*lstats;
		struct pcpu_sw_netstats __percpu	*tstats;
		struct pcpu_dstats __percpu		*dstats;
		struct pcpu_vstats __percpu		*vstats;
	};

	struct garp_port __rcu	*garp_port;
	struct mrp_port __rcu	*mrp_port;

	struct device	dev;
	const struct attribute_group *sysfs_groups[4];
	const struct attribute_group *sysfs_rx_queue_group;

	const struct rtnl_link_ops *rtnl_link_ops;

	/* for setting kernel sock attribute on TCP connection setup */
#define GSO_MAX_SIZE		65536
	unsigned int		gso_max_size;
#define GSO_MAX_SEGS		65535
	u16			gso_max_segs;
	u16			gso_min_segs;
#ifdef CONFIG_DCB
	const struct dcbnl_rtnl_ops *dcbnl_ops;
#endif
	u8 num_tc;
	struct netdev_tc_txq tc_to_txq[TC_MAX_QUEUE];
	u8 prio_tc_map[TC_BITMASK + 1];

#if IS_ENABLED(CONFIG_FCOE)
	unsigned int		fcoe_ddp_xid;
#endif
#if IS_ENABLED(CONFIG_CGROUP_NET_PRIO)
	struct netprio_map __rcu *priomap;
#endif
	struct phy_device *phydev;	/* attached PHY device */
	struct lock_class_key *qdisc_tx_busylock;
};

The flags member is a bitmask built from the values of an enumeration:

unsigned int		flags;	/* interface flags */

The enumeration is:

enum net_device_flags {
	IFF_UP				= 1<<0,  /* sysfs */
	IFF_BROADCAST		= 1<<1,  /* volatile */
	IFF_DEBUG			= 1<<2,  /* sysfs */
	IFF_LOOPBACK		= 1<<3,  /* volatile */
	IFF_POINTOPOINT		= 1<<4,  /* volatile */
	IFF_NOTRAILERS		= 1<<5,  /* sysfs */
	IFF_RUNNING			= 1<<6,  /* volatile */
	IFF_NOARP			= 1<<7,  /* sysfs */
	IFF_PROMISC			= 1<<8,  /* sysfs */
	IFF_ALLMULTI		= 1<<9,  /* sysfs */
	IFF_MASTER			= 1<<10, /* volatile */
	IFF_SLAVE			= 1<<11, /* volatile */
	IFF_MULTICAST		= 1<<12, /* sysfs */
	IFF_PORTSEL			= 1<<13, /* sysfs */
	IFF_AUTOMEDIA		= 1<<14, /* sysfs */
	IFF_DYNAMIC			= 1<<15, /* sysfs */
	IFF_LOWER_UP		= 1<<16, /* volatile */
	IFF_DORMANT			= 1<<17, /* volatile */
	IFF_ECHO			= 1<<18, /* volatile */
};
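Because each enumerator occupies its own bit, several flags can be combined with a bitwise OR and tested with a bitwise AND. A minimal user-space sketch follows; the macro values are copied from the enumeration above, and flags_all_set is a hypothetical helper used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* A few values copied from enum net_device_flags above. */
#define IFF_UP        (1u << 0)
#define IFF_BROADCAST (1u << 1)
#define IFF_RUNNING   (1u << 6)
#define IFF_PROMISC   (1u << 8)
#define IFF_MULTICAST (1u << 12)

/* True only if every bit in mask is set in flags. */
static bool flags_all_set(unsigned int flags, unsigned int mask)
{
    return (flags & mask) == mask;
}
```

For example, an interface that supports broadcast and multicast but has not been brought up yet has flags equal to IFF_BROADCAST | IFF_MULTICAST, and setting IFF_UP later is a single OR operation.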

The port type if_port also takes its values from an enumeration:

unsigned char		if_port;	/* port type */

/* Media selection options. */
enum {
        IF_PORT_UNKNOWN = 0,
        IF_PORT_10BASE2,
        IF_PORT_10BASET,
        IF_PORT_AUI,
        IF_PORT_100BASET,
        IF_PORT_100BASETX,
        IF_PORT_100BASEFX
};

Common values of the ARP hardware type are defined in include/uapi/linux/if_arp.h:

unsigned short		type;	/* ARP hardware type */

/* ARP protocol HARDWARE identifiers. */
#define ARPHRD_NETROM	0		/* from KA9Q: NET/ROM pseudo	*/
#define ARPHRD_ETHER 	1		/* Ethernet 10Mbps		*/

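2.1.1 Allocating a net_device

A network driver must first allocate a net_device. The allocation function is:

/* sizeof_priv: size of the private data block
 * name: device name
 * name_assign_type: origin of the device name, e.g. NET_NAME_UNKNOWN
 * setup: callback called to initialise the device
 * txqs: number of transmit queues to allocate
 * rxqs: number of receive queues to allocate
 * Return value: pointer to the net_device on success, NULL on failure.
 */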
struct net_device *alloc_netdev_mqs(int sizeof_priv,
                    const char *name,
				    unsigned char name_assign_type,
				    void (*setup)(struct net_device *),
				    unsigned int txqs, unsigned int rxqs);

#define alloc_netdev(sizeof_priv, name, name_assign_type, setup) \
	alloc_netdev_mqs(sizeof_priv, name, name_assign_type, setup, 1, 1)

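The Linux kernel supports many kinds of network devices, for example Fiber Distributed Data Interface (FDDI), Ethernet, infrared (IrDA), High-Performance Parallel Interface (HIPPI) and CAN. This article focuses on Ethernet.

For Ethernet, a net_device is also allocated through a macro: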

struct net_device *alloc_etherdev_mqs(int sizeof_priv, unsigned int txqs,
					    unsigned int rxqs);
#define alloc_etherdev(sizeof_priv) alloc_etherdev_mq(sizeof_priv, 1)
#define alloc_etherdev_mq(sizeof_priv, count) alloc_etherdev_mqs(sizeof_priv, count, count)

struct net_device *alloc_etherdev_mqs(int sizeof_priv, unsigned int txqs,
				      unsigned int rxqs)
{
	return alloc_netdev_mqs(sizeof_priv, "eth%d", NET_NAME_UNKNOWN,
				ether_setup, txqs, rxqs);
}
EXPORT_SYMBOL(alloc_etherdev_mqs);

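As shown, the name is set to "eth%d", which is the NIC name seen on the development board. The setup callback is ether_setup, which performs the basic initialisation of the net_device: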

void ether_setup(struct net_device *dev)
{
	dev->header_ops		= &eth_header_ops;
	dev->type			= ARPHRD_ETHER;
	dev->hard_header_len= ETH_HLEN;
	dev->mtu			= ETH_DATA_LEN;
	dev->addr_len		= ETH_ALEN;
	dev->tx_queue_len	= 1000;	/* Ethernet wants good queues */
	dev->flags			= IFF_BROADCAST|IFF_MULTICAST;
	dev->priv_flags		|= IFF_TX_SKB_SHARING;

	eth_broadcast_addr(dev->broadcast);

}
EXPORT_SYMBOL(ether_setup);

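2.1.2 Freeing a net_device

When the network driver is removed, the allocated net_device must be released with:

void free_netdev(struct net_device *dev);

2.1.3 Registering a net_device

After a net_device has been allocated and initialised, it must be registered with the kernel by calling:

int register_netdev(struct net_device *dev);

2.1.4 Unregistering a net_device

void unregister_netdev(struct net_device *dev);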

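2.2 The net_device_ops Structure

net_device_ops is a very important member of net_device. It holds the set of operation callbacks of the network device. These functions are implemented by the driver author, who only needs to implement the subset that the device actually uses.

It is defined in include/linux/netdevice.h: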

struct net_device_ops {
    /* Called when the network device is first registered; rarely used */
	int			(*ndo_init)(struct net_device *dev);
    
    /* Called when the network device is unregistered */
	void		(*ndo_uninit)(struct net_device *dev);
    
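    /* Important: called when the device is opened. Typical work: enable the network peripheral clocks, allocate the ring buffers, initialise the MAC, bind the PHY that belongs to the interface, start the PHY, enable the transmit queues */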
	int			(*ndo_open)(struct net_device *dev);
    
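    /* Called when the device is closed. Typical work: stop the PHY, disable NAPI, stop transmission, shut down the MAC, disconnect the PHY, disable the network clocks, free the data buffers */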
	int			(*ndo_stop)(struct net_device *dev);
    
    /* Called when data must be sent; skb holds the data handed down by the upper layers */
	netdev_tx_t	(*ndo_start_xmit) (struct sk_buff *skb,
						   struct net_device *dev);
    
    /* Selects the transmit queue when the device supports multiple queues */
	u16			(*ndo_select_queue)(struct net_device *dev,
						    struct sk_buff *skb,
						    void *accel_priv,
						    select_queue_fallback_t fallback);
	void		(*ndo_change_rx_flags)(struct net_device *dev,
						       int flags);
    
    /* Updates the address filter table; programs the SoC peripheral registers according to the net_device flags */
	void		(*ndo_set_rx_mode)(struct net_device *dev);
	
    /* Changes the MAC address: sets the dev_addr member of net_device and writes the address to the hardware registers */
    int			(*ndo_set_mac_address)(struct net_device *dev,
						       void *addr);
    
    /* Checks whether the MAC address is valid */
	int			(*ndo_validate_addr)(struct net_device *dev);
    
    /* Called when a user program issues an ioctl, for example PHY-related commands */
	int			(*ndo_do_ioctl)(struct net_device *dev,
					        struct ifreq *ifr, int cmd);
	int			(*ndo_set_config)(struct net_device *dev,
					          struct ifmap *map);
    
    /* Changes the MTU */
	int			(*ndo_change_mtu)(struct net_device *dev,
						  int new_mtu);
	int			(*ndo_neigh_setup)(struct net_device *dev,
						   struct neigh_parms *);
    
    /* Called when a transmit timeout occurs */
	void		(*ndo_tx_timeout) (struct net_device *dev);

	struct rtnl_link_stats64* (*ndo_get_stats64)(struct net_device *dev,
						     struct rtnl_link_stats64 *storage);
	struct net_device_stats* (*ndo_get_stats)(struct net_device *dev);

	int			(*ndo_vlan_rx_add_vid)(struct net_device *dev,
						       __be16 proto, u16 vid);
	int			(*ndo_vlan_rx_kill_vid)(struct net_device *dev,
						        __be16 proto, u16 vid);
#ifdef CONFIG_NET_POLL_CONTROLLER
    /* Handles the NIC's transmit and receive paths by polling */
	void                    (*ndo_poll_controller)(struct net_device *dev);
	int			(*ndo_netpoll_setup)(struct net_device *dev,
						     struct netpoll_info *info);
	void		(*ndo_netpoll_cleanup)(struct net_device *dev);
#endif
#ifdef CONFIG_NET_RX_BUSY_POLL
	int			(*ndo_busy_poll)(struct napi_struct *dev);
#endif
	int			(*ndo_set_vf_mac)(struct net_device *dev,
						  int queue, u8 *mac);
	int			(*ndo_set_vf_vlan)(struct net_device *dev,
						   int queue, u16 vlan, u8 qos);
	int			(*ndo_set_vf_rate)(struct net_device *dev,
						   int vf, int min_tx_rate,
						   int max_tx_rate);
	int			(*ndo_set_vf_spoofchk)(struct net_device *dev,
						       int vf, bool setting);
	int			(*ndo_get_vf_config)(struct net_device *dev,
						     int vf,
						     struct ifla_vf_info *ivf);
	int			(*ndo_set_vf_link_state)(struct net_device *dev,
							 int vf, int link_state);
	int			(*ndo_set_vf_port)(struct net_device *dev,
						   int vf,
						   struct nlattr *port[]);
	int			(*ndo_get_vf_port)(struct net_device *dev,
						   int vf, struct sk_buff *skb);
	int			(*ndo_set_vf_rss_query_en)(
						   struct net_device *dev,
						   int vf, bool setting);
	int			(*ndo_setup_tc)(struct net_device *dev, u8 tc);
#if IS_ENABLED(CONFIG_FCOE)
	int			(*ndo_fcoe_enable)(struct net_device *dev);
	int			(*ndo_fcoe_disable)(struct net_device *dev);
	int			(*ndo_fcoe_ddp_setup)(struct net_device *dev,
						      u16 xid,
						      struct scatterlist *sgl,
						      unsigned int sgc);
	int			(*ndo_fcoe_ddp_done)(struct net_device *dev,
						     u16 xid);
	int			(*ndo_fcoe_ddp_target)(struct net_device *dev,
						       u16 xid,
						       struct scatterlist *sgl,
						       unsigned int sgc);
	int			(*ndo_fcoe_get_hbainfo)(struct net_device *dev,
							struct netdev_fcoe_hbainfo *hbainfo);
#endif

#if IS_ENABLED(CONFIG_LIBFCOE)
#define NETDEV_FCOE_WWNN 0
#define NETDEV_FCOE_WWPN 1
	int			(*ndo_fcoe_get_wwn)(struct net_device *dev,
						    u64 *wwn, int type);
#endif

#ifdef CONFIG_RFS_ACCEL
	int			(*ndo_rx_flow_steer)(struct net_device *dev,
						     const struct sk_buff *skb,
						     u16 rxq_index,
						     u32 flow_id);
#endif
	int			(*ndo_add_slave)(struct net_device *dev,
						 struct net_device *slave_dev);
	int			(*ndo_del_slave)(struct net_device *dev,
						 struct net_device *slave_dev);
	netdev_features_t	(*ndo_fix_features)(struct net_device *dev,
						    netdev_features_t features);
    
    /* Changes the features of net_device and programs the matching hardware settings */
	int			(*ndo_set_features)(struct net_device *dev,
						    netdev_features_t features);
	int			(*ndo_neigh_construct)(struct neighbour *n);
	void		(*ndo_neigh_destroy)(struct neighbour *n);

	int			(*ndo_fdb_add)(struct ndmsg *ndm,
					       struct nlattr *tb[],
					       struct net_device *dev,
					       const unsigned char *addr,
					       u16 vid,
					       u16 flags);
	int			(*ndo_fdb_del)(struct ndmsg *ndm,
					       struct nlattr *tb[],
					       struct net_device *dev,
					       const unsigned char *addr,
					       u16 vid);
	int			(*ndo_fdb_dump)(struct sk_buff *skb,
						struct netlink_callback *cb,
						struct net_device *dev,
						struct net_device *filter_dev,
						int idx);

	int			(*ndo_bridge_setlink)(struct net_device *dev,
						      struct nlmsghdr *nlh,
						      u16 flags);
	int			(*ndo_bridge_getlink)(struct sk_buff *skb,
						      u32 pid, u32 seq,
						      struct net_device *dev,
						      u32 filter_mask,
						      int nlflags);
	int			(*ndo_bridge_dellink)(struct net_device *dev,
						      struct nlmsghdr *nlh,
						      u16 flags);
	int			(*ndo_change_carrier)(struct net_device *dev,
						      bool new_carrier);
	int			(*ndo_get_phys_port_id)(struct net_device *dev,
							struct netdev_phys_item_id *ppid);
	int			(*ndo_get_phys_port_name)(struct net_device *dev,
							  char *name, size_t len);
	void		(*ndo_add_vxlan_port)(struct  net_device *dev,
						      sa_family_t sa_family,
						      __be16 port);
	void		(*ndo_del_vxlan_port)(struct  net_device *dev,
						      sa_family_t sa_family,
						      __be16 port);

	void*		(*ndo_dfwd_add_station)(struct net_device *pdev,
							struct net_device *dev);
	void		(*ndo_dfwd_del_station)(struct net_device *pdev,
							void *priv);

	netdev_tx_t	(*ndo_dfwd_start_xmit) (struct sk_buff *skb,
							struct net_device *dev,
							void *priv);
	int			(*ndo_get_lock_subclass)(struct net_device *dev);
	netdev_features_t	(*ndo_features_check) (struct sk_buff *skb,
						       struct net_device *dev,
						       netdev_features_t features);
	int			(*ndo_set_tx_maxrate)(struct net_device *dev,
						      int queue_index,
						      u32 maxrate);
	int			(*ndo_get_iflink)(const struct net_device *dev);
};

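2.3 The sk_buff Structure

Networking is layered: the upper layers do not need to care how the lower layers work. They only pack the data to be sent or received according to the protocol. Packed data is sent with the dev_queue_xmit function, and received data is passed up with the netif_rx function.

2.3.1 The dev_queue_xmit Function

It is defined in include/linux/netdevice.h:

/* Purpose: send network data
 * skb: the data to send
 * Return value: 0 on success, a negative value on failure.
 */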
static inline int dev_queue_xmit(struct sk_buff *skb);

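struct sk_buff holds the network data.

On transmit, each protocol layer adds its own header to the sk_buff, and the low-level driver finally sends the packet.

On receive, the low-level network driver packs the raw data it receives into an sk_buff and passes it to the upper protocol layers. Each layer strips its own header, and the remaining data is finally delivered to the user.

dev_queue_xmit eventually sends the data through the ndo_start_xmit function in the net_device_ops callback set. The rough flow is:

[Figure: call flow from dev_queue_xmit to ndo_start_xmit]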

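2.3.2 The netif_rx Function

Received data is handed to the upper layers with the netif_rx function, while the raw network data itself is usually received by polling, by interrupts or through NAPI:

/* skb: holds the received data
 * Return value: NET_RX_SUCCESS on success, NET_RX_DROP if the packet was dropped.
 */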
int netif_rx(struct sk_buff *skb);

2.3.3 The struct sk_buff Structure

The sk_buff structure is very important. It manages received and transmitted packets and is defined in include/linux/skbuff.h:

struct sk_buff {
	union {
		struct {
			/* These two members must be first. */
			struct sk_buff	*next;	/* next sk_buff */
			struct sk_buff	*prev;	/* previous sk_buff */

			union {
				ktime_t		tstamp;
				struct skb_mstamp skb_mstamp;
			};
		};
		struct rb_node	rbnode; /* used in netem & tcp stack */
	};
	struct sock			*sk;	/* socket that owns this sk_buff */
	struct net_device	*dev;	/* device that this sk_buff belongs to */

	/*
	 * This is the control buffer. It is free to use for every
	 * layer. Please put your private variables there. If you
	 * want to keep them across layers you have to do a skb_clone()
	 * first. This is owned by whoever has the skb queued ATM.
	 */
	char			cb[48] __aligned(8);	/* control buffer for each layer's private data */

	unsigned long		_skb_refdst;
	void			(*destructor)(struct sk_buff *skb);	/* hook for extra work when the buffer is freed */
#ifdef CONFIG_XFRM
	struct	sec_path	*sp;
#endif
#if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
	struct nf_conntrack	*nfct;
#endif
#if IS_ENABLED(CONFIG_BRIDGE_NETFILTER)
	struct nf_bridge_info	*nf_bridge;
#endif
	unsigned int		len, data_len;	/* len: actual data length, covering the main buffer and the fragments; data_len: length of the data held in fragments only */
	__u16			mac_len, hdr_len;	/* mac_len: length of the MAC header */

	/* Following fields are _not_ copied in __copy_skb_header()
	 * Note that queue_mapping is here mostly to fill a hole.
	 */
	kmemcheck_bitfield_begin(flags1);
	__u16			queue_mapping;
	__u8			cloned:1,
				nohdr:1,
				fclone:2,
				peeked:1,
				head_frag:1,
				xmit_more:1;
	/* one bit hole */
	kmemcheck_bitfield_end(flags1);

	/* fields enclosed in headers_start/headers_end are copied
	 * using a single memcpy() in __copy_skb_header()
	 */
	/* private: */
	__u32			headers_start[0];
	/* public: */

/* if you move pkt_type around you also must adapt those constants */
#ifdef __BIG_ENDIAN_BITFIELD
#define PKT_TYPE_MAX	(7 << 5)
#else
#define PKT_TYPE_MAX	7
#endif
#define PKT_TYPE_OFFSET()	offsetof(struct sk_buff, __pkt_type_offset)

	__u8			__pkt_type_offset[0];
	__u8			pkt_type:3;
	__u8			pfmemalloc:1;
	__u8			ignore_df:1;
	__u8			nfctinfo:3;

	__u8			nf_trace:1;
	__u8			ip_summed:2;
	__u8			ooo_okay:1;
	__u8			l4_hash:1;
	__u8			sw_hash:1;
	__u8			wifi_acked_valid:1;
	__u8			wifi_acked:1;

	__u8			no_fcs:1;
	/* Indicates the inner headers are valid in the skbuff. */
	__u8			encapsulation:1;
	__u8			encap_hdr_csum:1;
	__u8			csum_valid:1;
	__u8			csum_complete_sw:1;
	__u8			csum_level:2;
	__u8			csum_bad:1;

#ifdef CONFIG_IPV6_NDISC_NODETYPE
	__u8			ndisc_nodetype:2;
#endif
	__u8			ipvs_property:1;
	__u8			inner_protocol_type:1;
	__u8			remcsum_offload:1;
	/* 3 or 5 bit hole */

#ifdef CONFIG_NET_SCHED
	__u16			tc_index;	/* traffic control index */
#ifdef CONFIG_NET_CLS_ACT
	__u16			tc_verd;	/* traffic control verdict */
#endif
#endif

	union {
		__wsum		csum;
		struct {
			__u16	csum_start;
			__u16	csum_offset;
		};
	};
	__u32			priority;
	int			skb_iif;
	__u32			hash;
	__be16			vlan_proto;
	__u16			vlan_tci;
#if defined(CONFIG_NET_RX_BUSY_POLL) || defined(CONFIG_XPS)
	union {
		unsigned int	napi_id;
		unsigned int	sender_cpu;
	};
#endif
#ifdef CONFIG_NETWORK_SECMARK
	__u32			secmark;
#endif
	union {
		__u32		mark;
		__u32		reserved_tailroom;
	};

	union {
		__be16		inner_protocol;
		__u8		inner_ipproto;
	};

	__u16			inner_transport_header;
	__u16			inner_network_header;
	__u16			inner_mac_header;

	__be16			protocol;	/* protocol type */
	__u16			transport_header;	/* transport layer header */
	__u16			network_header;		/* network layer header */
	__u16			mac_header;			/* link layer header */

	/* private: */
	__u32			headers_end[0];
	/* public: */

	/* These elements must be at the end, see alloc_skb() for details.  */
	sk_buff_data_t		tail;	/* end of the actual data */
	sk_buff_data_t		end;	/* end of the buffer */
	unsigned char		*head,	/* start of the buffer */
						*data;	/* start of the actual data */
	unsigned int		truesize;
	atomic_t		users;
};

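data and tail point to the start and end of the actual data, while head and end point to the start and end of the buffer. The layout is shown in the figure:

[Figure: sk_buff layout with the head, data, tail and end pointers]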

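2.3.4 Allocating an sk_buff

An sk_buff must be allocated before it is used. The allocation function is:

/* size: size to allocate, i.e. the size of the skb data area
 * priority: a GFP mask such as GFP_KERNEL or GFP_ATOMIC.
 * Return value: address of the sk_buff on success, NULL on failure.
 */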
static inline struct sk_buff *alloc_skb(unsigned int size, gfp_t priority);

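Network device drivers often use netdev_alloc_skb to allocate a receive sk_buff for a particular device:

/* dev: the device for which the skb is allocated
 * length: size to allocate
 * Return value: address of the sk_buff on success, NULL on failure.
 */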
static inline struct sk_buff *netdev_alloc_skb(struct net_device *dev, unsigned int length);

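2.3.5 Freeing an sk_buff

An sk_buff must be freed after use. The function is:

void kfree_skb(struct sk_buff *skb);

Network devices can use the following function:

void dev_kfree_skb(struct sk_buff *skb);

2.3.6 skb_put, skb_push, skb_pull and skb_reserve

These four functions adjust the data area of an sk_buff.

/* Purpose: extend the data area of the sk_buff at the tail, i.e. move tail back by len bytes, so that the len member of the sk_buff grows by len bytes.
 * skb: the sk_buff to operate on
 * len: number of bytes to add
 * Return value: start address of the newly added area
 */
unsigned char *skb_put(struct sk_buff *skb, unsigned int len);

/* Purpose: extend the data area of the sk_buff at the head
 * skb: the sk_buff to operate on
 * len: number of bytes to add
 * Return value: start address of the newly added area
 */
unsigned char *skb_push(struct sk_buff *skb, unsigned int len);

/* Purpose: remove data from the start of the data area of the sk_buff
 * skb: the sk_buff to operate on
 * len: number of bytes to remove
 * Return value: new start address of the data area
 */
unsigned char *skb_pull(struct sk_buff *skb, unsigned int len);

/* Purpose: adjust the headroom of the buffer by moving data and tail back by len bytes together
 * skb: the sk_buff to operate on
 * len: number of bytes of headroom to add
 * Return value: none
 */
static inline void skb_reserve(struct sk_buff *skb, int len);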

[Figures: sk_buff data area before and after the pointer operations above]
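The pointer arithmetic behind these four functions can be tried out in user space. The sketch below is a simplified stand-in for sk_buff that keeps only the four pointers and len; all toy_* names are hypothetical, and the real kernel functions perform additional checks that are reduced to assertions here.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for sk_buff: only the buffer pointers and len. */
struct toy_skb {
    unsigned char *head, *data, *tail, *end;
    unsigned int len;
};

static void toy_skb_init(struct toy_skb *skb, unsigned char *buf, size_t size)
{
    skb->head = skb->data = skb->tail = buf;
    skb->end = buf + size;
    skb->len = 0;
}

/* Like skb_reserve: open headroom in an empty buffer. */
static void toy_skb_reserve(struct toy_skb *skb, unsigned int len)
{
    assert(skb->len == 0 && (size_t)(skb->end - skb->tail) >= len);
    skb->data += len;
    skb->tail += len;
}

/* Like skb_put: grow the data area at the tail. */
static unsigned char *toy_skb_put(struct toy_skb *skb, unsigned int len)
{
    unsigned char *old_tail = skb->tail;

    assert((size_t)(skb->end - skb->tail) >= len);
    skb->tail += len;
    skb->len += len;
    return old_tail;    /* start of the newly added area */
}

/* Like skb_push: grow the data area at the head, using headroom. */
static unsigned char *toy_skb_push(struct toy_skb *skb, unsigned int len)
{
    assert((size_t)(skb->data - skb->head) >= len);
    skb->data -= len;
    skb->len += len;
    return skb->data;
}

/* Like skb_pull: drop len bytes from the start of the data area. */
static unsigned char *toy_skb_pull(struct toy_skb *skb, unsigned int len)
{
    assert(skb->len >= len);
    skb->data += len;
    skb->len -= len;
    return skb->data;
}
```

A typical sequence is to reserve headroom first, append the payload with the put operation, prepend a header with the push operation on transmit, and strip a header with the pull operation on receive.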

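2.4 NAPI Processing

The Linux kernel can receive network data by polling or by interrupts. Interrupts respond quickly and handle small amounts of data promptly, but under heavy traffic made up of short frames they fire so often that a large share of CPU time is spent on interrupt handling itself. Polling is the opposite: it responds less promptly than interrupts, but it handles large volumes of data without spending excessive CPU time.

Building on both approaches, Linux introduced another way of receiving network data: NAPI (New API), an efficient packet-processing technique. The core idea of NAPI is not to read all network data in interrupt context, but to use the interrupt to wake up a receive routine, which then processes the data by polling (POLL). This improves reception efficiency for short packets and reduces the time spent in interrupt handling.

This section briefly explains how to use NAPI in a driver.

2.4.1 Initialising NAPI

Before NAPI can be used, a napi_struct instance must be initialised. The structure is defined in include/linux/netdevice.h: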

struct napi_struct {
	/* The poll_list must only be managed by the entity which
	 * changes the state of the NAPI_STATE_SCHED bit.  This means
	 * whoever atomically sets that bit can add this napi_struct
	 * to the per-cpu poll_list, and whoever clears that bit
	 * can remove from the list right before clearing the bit.
	 */
	struct list_head	poll_list;

	unsigned long		state;
	int			weight;
	unsigned int		gro_count;
	int			(*poll)(struct napi_struct *, int);
#ifdef CONFIG_NETPOLL
	spinlock_t		poll_lock;
	int			poll_owner;
#endif
	struct net_device	*dev;
	struct sk_buff		*gro_list;
	struct sk_buff		*skb;
	struct hrtimer		timer;
	struct list_head	dev_list;
	struct hlist_node	napi_hash_node;
	unsigned int		napi_id;
};

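A napi_struct instance is initialised with the netif_napi_add function:

/*
 * dev: every NAPI instance must be associated with a network device; this parameter is that device.
 * napi: the NAPI instance to initialise.
 * poll: the polling function used by NAPI. It is the key callback: receiving network data is normally done in it.
 * weight: default NAPI weight, normally NAPI_POLL_WEIGHT.
 * Return value: none
 */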
void netif_napi_add(struct net_device *dev, struct napi_struct *napi,
		    int (*poll)(struct napi_struct *, int), int weight);

2.4.2 Deleting NAPI

void netif_napi_del(struct napi_struct *napi);

2.4.3 Enabling NAPI

After NAPI has been initialised, it must be enabled before it can be used.

static inline void napi_enable(struct napi_struct *n);

2.4.4 Disabling NAPI

void napi_disable(struct napi_struct *n);

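2.4.5 Scheduling NAPI

A driver first checks with napi_schedule_prep whether NAPI can be scheduled. If it can, the driver schedules NAPI with the __napi_schedule function: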

void __napi_schedule(struct napi_struct *n);

Alternatively, the napi_schedule function does the work of both napi_schedule_prep and __napi_schedule in one call:

static inline void napi_schedule(struct napi_struct *n)
{
	if (napi_schedule_prep(n))
		__napi_schedule(n);
}

2.4.6 Completing NAPI Processing

When NAPI processing is finished, the driver calls napi_complete to mark it as complete:

static inline void napi_complete(struct napi_struct *n);
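The interplay between the interrupt handler, the poll function and the completion call can be modelled in user space. The sketch below is a toy model, not kernel code: the toy_* names are hypothetical, and each comment notes which real NAPI call the step stands for.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a NIC receive path driven the NAPI way. */
struct toy_nic {
    int rx_pending;       /* packets waiting in the receive ring */
    bool irq_enabled;     /* receive interrupt currently enabled */
    bool napi_scheduled;  /* poll function queued to run */
};

/* Interrupt handler: mask further RX interrupts and schedule polling. */
static void toy_irq(struct toy_nic *nic)
{
    if (!nic->napi_scheduled) {      /* role of napi_schedule_prep */
        nic->irq_enabled = false;
        nic->napi_scheduled = true;  /* role of __napi_schedule */
    }
}

/* Poll function: handle at most budget packets and return the count. */
static int toy_poll(struct toy_nic *nic, int budget)
{
    int done = 0;

    while (done < budget && nic->rx_pending > 0) {
        nic->rx_pending--;           /* a real driver passes the skb up here */
        done++;
    }

    if (done < budget) {             /* ring drained: role of napi_complete */
        nic->napi_scheduled = false;
        nic->irq_enabled = true;     /* go back to interrupt mode */
    }
    return done;
}
```

With 100 pending packets and a budget of 64, the first poll handles 64 packets and stays scheduled, and the second poll handles the remaining 36, completes, and re-enables the interrupt. Under heavy load the interrupt therefore stays masked while the ring is drained in batches.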

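3 I.MX6ULL Network Driver

The I.MX6ULL has two 10/100M network MAC peripherals; the MINI board uses only one of them.

3.1 Device Tree

The binding document for the network controller of I.MX series SoCs is Documentation/devicetree/bindings/net/fsl-fec.txt, which describes the requirements for the device tree node.

In arch/arm/boot/dts/imx6ull.dtsi the network nodes are described as follows: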

/* ENET1 of the I.MX6ULL */
fec1: ethernet@02188000 {
    /* compatible strings used to match the driver */
    compatible = "fsl,imx6ul-fec", "fsl,imx6q-fec";
    
    /* register address range of the SoC network peripheral */
    reg = <0x02188000 0x4000>;
    
    /* network interrupts */
    interrupts = <GIC_SPI 118 IRQ_TYPE_LEVEL_HIGH>,
    <GIC_SPI 119 IRQ_TYPE_LEVEL_HIGH>;
    
    clocks = <&clks IMX6UL_CLK_ENET>,
    <&clks IMX6UL_CLK_ENET_AHB>,
    <&clks IMX6UL_CLK_ENET_PTP>,
    <&clks IMX6UL_CLK_ENET_REF>,
    <&clks IMX6UL_CLK_ENET_REF>;
    clock-names = "ipg", "ahb", "ptp",
    "enet_clk_ref", "enet_out";
    
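    /* GPR register and bit that the SoC uses to request stop mode */
    stop-mode = <&gpr 0x10 3>;
    
    /* number of transmit queues */
    fsl,num-tx-queues=<1>;
    
    /* number of receive queues */
    fsl,num-rx-queues=<1>;
    
    /* hardware magic packet wake-up is supported */
    fsl,magic-packet;
    
    /* index of the wake-up interrupt */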
    fsl,wakeup_irq = <0>;
    
    status = "disabled";
};

/* ENET2 of the I.MX6ULL */
fec2: ethernet@020b4000 {
    compatible = "fsl,imx6ul-fec", "fsl,imx6q-fec";
    reg = <0x020b4000 0x4000>;
    interrupts = <GIC_SPI 120 IRQ_TYPE_LEVEL_HIGH>,
    <GIC_SPI 121 IRQ_TYPE_LEVEL_HIGH>;
    clocks = <&clks IMX6UL_CLK_ENET>,
    <&clks IMX6UL_CLK_ENET_AHB>,
    <&clks IMX6UL_CLK_ENET_PTP>,
    <&clks IMX6UL_CLK_ENET2_REF_125M>,
    <&clks IMX6UL_CLK_ENET2_REF_125M>;
    clock-names = "ipg", "ahb", "ptp",
    "enet_clk_ref", "enet_out";
    stop-mode = <&gpr 0x10 4>;
    fsl,num-tx-queues=<1>;
    fsl,num-rx-queues=<1>;
    fsl,magic-packet;
    fsl,wakeup_irq = <0>;
    status = "disabled";
};

arch/arm/boot/dts/imx6ull-alientek-emmc.dts overrides these nodes:

&fec1 {
	pinctrl-names = "default";
	pinctrl-0 = <&pinctrl_enet1
		     &pinctrl_fec1_reset>;
    
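    /* PHY interface mode, MII or RMII */
	phy-mode = "rmii";
    
    /* handle of the PHY chip connected to this network device */
	phy-handle = <&ethphy0>;
    
    /* reset pin of the PHY chip */
	phy-reset-gpios = <&gpio5 7 GPIO_ACTIVE_LOW>;
    
    /* duration of the PHY reset pulse, in milliseconds */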
	phy-reset-duration = <200>;
    
	status = "okay";
};

&fec2 {
	pinctrl-names = "default";
	pinctrl-0 = <&pinctrl_enet2
		     &pinctrl_fec2_reset>;
	phy-mode = "rmii";
	phy-handle = <&ethphy1>;
	phy-reset-gpios = <&gpio5 8 GPIO_ACTIVE_LOW>;
	phy-reset-duration = <200>;
	status = "okay";

	mdio {	/* MDIO bus used by the network peripherals; PHY properties are given in child nodes of the mdio node, see Documentation/devicetree/bindings/net/phy.txt */
		#address-cells = <1>;
		#size-cells = <0>;

		ethphy0: ethernet-phy@0 {	/* PHY node for ENET1 */
			/* Clause 22 of IEEE 802.3 */
			compatible = "ethernet-phy-ieee802.3-c22";
			/* PHY chip address */
			reg = <0>;
		};

		ethphy1: ethernet-phy@1 {
			compatible = "ethernet-phy-ieee802.3-c22";
			reg = <1>;
		};
	};
};

The network-related pins are described in the device tree as follows:

pinctrl_enet1: enet1grp {
    fsl,pins = <
        MX6UL_PAD_ENET1_RX_EN__ENET1_RX_EN	0x1b0b0
        MX6UL_PAD_ENET1_RX_ER__ENET1_RX_ER	0x1b0b0
        MX6UL_PAD_ENET1_RX_DATA0__ENET1_RDATA00	0x1b0b0
        MX6UL_PAD_ENET1_RX_DATA1__ENET1_RDATA01	0x1b0b0
        MX6UL_PAD_ENET1_TX_EN__ENET1_TX_EN	0x1b0b0
        MX6UL_PAD_ENET1_TX_DATA0__ENET1_TDATA00	0x1b0b0
        MX6UL_PAD_ENET1_TX_DATA1__ENET1_TDATA01	0x1b0b0
        MX6UL_PAD_ENET1_TX_CLK__ENET1_REF_CLK1	0x4001b031
        >;
};

pinctrl_enet2: enet2grp {
    fsl,pins = <
        MX6UL_PAD_GPIO1_IO07__ENET2_MDC		0x1b0b0
        MX6UL_PAD_GPIO1_IO06__ENET2_MDIO	0x1b0b0
        MX6UL_PAD_ENET2_RX_EN__ENET2_RX_EN	0x1b0b0
        MX6UL_PAD_ENET2_RX_ER__ENET2_RX_ER	0x1b0b0
        MX6UL_PAD_ENET2_RX_DATA0__ENET2_RDATA00	0x1b0b0
        MX6UL_PAD_ENET2_RX_DATA1__ENET2_RDATA01	0x1b0b0
        MX6UL_PAD_ENET2_TX_EN__ENET2_TX_EN	0x1b0b0
        MX6UL_PAD_ENET2_TX_DATA0__ENET2_TDATA00	0x1b0b0
        MX6UL_PAD_ENET2_TX_DATA1__ENET2_TDATA01	0x1b0b0
        MX6UL_PAD_ENET2_TX_CLK__ENET2_REF_CLK2	0x4001b031
        >;
};

pinctrl_fec1_reset: fec1_resetgrp {
    fsl,pins = <
        MX6ULL_PAD_SNVS_TAMPER7__GPIO5_IO07	0x79
        >;
};

pinctrl_fec2_reset: fec2_resetgrp {
    fsl,pins = <
        MX6ULL_PAD_SNVS_TAMPER8__GPIO5_IO08	0x79
        >;
};

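3.2 Network driver source code

3.2.1 The fec_probe function

The IMX6ULL network driver has two parts: the driver for the on-chip Ethernet controller (ENET) and the driver for the PHY chip.

NXP supplies the ENET controller driver, and the kernel ships a generic PHY driver, so neither has to be written from scratch.

(1) IMX6ULL Ethernet controller driver

The compatible property of the fec nodes in the device tree is "fsl,imx6ul-fec", "fsl,imx6q-fec". The driver that matches these strings is defined in drivers/net/ethernet/freescale/fec_main.c: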

static const struct of_device_id fec_dt_ids[] = {
	{ .compatible = "fsl,imx25-fec", .data = &fec_devtype[IMX25_FEC], },
	{ .compatible = "fsl,imx27-fec", .data = &fec_devtype[IMX27_FEC], },
	{ .compatible = "fsl,imx28-fec", .data = &fec_devtype[IMX28_FEC], },
	{ .compatible = "fsl,imx6q-fec", .data = &fec_devtype[IMX6Q_FEC], },
	{ .compatible = "fsl,mvf600-fec", .data = &fec_devtype[MVF600_FEC], },
	{ .compatible = "fsl,imx6sx-fec", .data = &fec_devtype[IMX6SX_FEC], },
	{ .compatible = "fsl,imx6ul-fec", .data = &fec_devtype[IMX6UL_FEC], },
	{ /* sentinel */ }
};

static struct platform_driver fec_driver = {
	.driver	= {
		.name	= DRIVER_NAME,
		.pm		= &fec_pm_ops,
		.of_match_table = fec_dt_ids,
	},
	.id_table = fec_devtype,
	.probe	= fec_probe,
	.remove	= fec_drv_remove,
};

Once the driver matches a device, fec_probe runs:

static int fec_probe(struct platform_device *pdev)
{
	struct fec_enet_private *fep;
	struct fec_platform_data *pdata;
	struct net_device *ndev;
	int i, irq, ret = 0;
	struct resource *r;
	const struct of_device_id *of_id;
	static int dev_id;
	struct device_node *np = pdev->dev.of_node, *phy_node;
	int num_tx_qs;
	int num_rx_qs;

	void __iomem *IMX6U_ENET1_TX_CLK;
	void __iomem *IMX6U_ENET2_TX_CLK;

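	/*
	 * Board-specific addition, not part of the mainline driver: write 0x14
	 * to the ENET1/ENET2 TX_CLK pad mux registers.
	 */
	IMX6U_ENET1_TX_CLK = ioremap(0x020E00DC, 4);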
	writel(0x14, IMX6U_ENET1_TX_CLK);

	IMX6U_ENET2_TX_CLK = ioremap(0x020E00FC, 4);
	writel(0x14, IMX6U_ENET2_TX_CLK);

	/* Read fsl,num-tx-queues and fsl,num-rx-queues from the device tree */
	fec_enet_get_queue_num(pdev, &num_tx_qs, &num_rx_qs);

	/* Allocate the net_device */
	ndev = alloc_etherdev_mqs(sizeof(struct fec_enet_private),
				  num_tx_qs, num_rx_qs);
	if (!ndev)
		return -ENOMEM;

	SET_NETDEV_DEV(ndev, &pdev->dev);

	/* Get the private data area of the net_device, which holds struct fec_enet_private */
	fep = netdev_priv(ndev);

	of_id = of_match_device(fec_dt_ids, &pdev->dev);
	if (of_id)
		pdev->id_entry = of_id->data;
    
	/* Initialize the members of fec_enet_private */
	fep->quirks = pdev->id_entry->driver_data;

	fep->netdev = ndev;
	fep->num_rx_queues = num_rx_qs;
	fep->num_tx_queues = num_tx_qs;

#if !defined(CONFIG_M5272)
	/* default enable pause frame auto negotiation */
	if (fep->quirks & FEC_QUIRK_HAS_GBIT)
		fep->pause_flag |= FEC_PAUSE_FLAG_AUTONEG;
#endif

	/* Select default pin state */
	pinctrl_pm_select_default_state(&pdev->dev);
	
	/* Get the ENET register region described in the device tree */
	r = platform_get_resource(pdev, IORESOURCE_MEM, 0);
	/* Map the register region into kernel virtual address space */
	fep->hwp = devm_ioremap_resource(&pdev->dev, r);
	if (IS_ERR(fep->hwp)) {
		ret = PTR_ERR(fep->hwp);
		goto failed_ioremap;
	}

	fep->pdev = pdev;
	fep->dev_id = dev_id++;

	platform_set_drvdata(pdev, ndev);
	
	/* Parse the ENET stop-mode property from the device tree */
	fec_enet_of_parse_stop_mode(pdev);

	if (of_get_property(np, "fsl,magic-packet", NULL))
		fep->wol_flag |= FEC_WOL_HAS_MAGIC_PACKET;

	/* Look up phy-handle, which points to the device tree node of the PHY attached to this MAC */
	phy_node = of_parse_phandle(np, "phy-handle", 0);
	if (!phy_node && of_phy_is_fixed_link(np)) {
		ret = of_phy_register_fixed_link(np);
		if (ret < 0) {
			dev_err(&pdev->dev,
				"broken fixed-link specification\n");
			goto failed_phy;
		}
		phy_node = of_node_get(np);
	}
	fep->phy_node = phy_node;
	
	/* Get the PHY interface mode (for example MII or RMII) from the phy-mode property */
	ret = of_get_phy_mode(pdev->dev.of_node);
	if (ret < 0) {
		pdata = dev_get_platdata(&pdev->dev);
		if (pdata)
			fep->phy_interface = pdata->phy;
		else
			fep->phy_interface = PHY_INTERFACE_MODE_MII;
	} else {
		fep->phy_interface = ret;
	}

	/* Get the clocks: ipg, ahb, enet_out, enet_clk_ref and ptp */
	fep->clk_ipg = devm_clk_get(&pdev->dev, "ipg");
	if (IS_ERR(fep->clk_ipg)) {
		ret = PTR_ERR(fep->clk_ipg);
		goto failed_clk;
	}
	
	fep->clk_ahb = devm_clk_get(&pdev->dev, "ahb");
	if (IS_ERR(fep->clk_ahb)) {
		ret = PTR_ERR(fep->clk_ahb);
		goto failed_clk;
	}

	fep->itr_clk_rate = clk_get_rate(fep->clk_ahb);

	/* enet_out is optional, depends on board */
	fep->clk_enet_out = devm_clk_get(&pdev->dev, "enet_out");
	if (IS_ERR(fep->clk_enet_out))
		fep->clk_enet_out = NULL;

	fep->ptp_clk_on = false;
	mutex_init(&fep->ptp_clk_mutex);

	/* clk_ref is optional, depends on board */
	fep->clk_ref = devm_clk_get(&pdev->dev, "enet_clk_ref");
	if (IS_ERR(fep->clk_ref))
		fep->clk_ref = NULL;

	fep->bufdesc_ex = fep->quirks & FEC_QUIRK_HAS_BUFDESC_EX;
	fep->clk_ptp = devm_clk_get(&pdev->dev, "ptp");
	if (IS_ERR(fep->clk_ptp)) {
		fep->clk_ptp = NULL;
		fep->bufdesc_ex = false;
	}
	
	/* Enable the clocks */
	pm_runtime_enable(&pdev->dev);
	ret = fec_enet_clk_enable(ndev, true);
	if (ret)
		goto failed_clk;

	fep->reg_phy = devm_regulator_get(&pdev->dev, "phy");
	if (!IS_ERR(fep->reg_phy)) {
		ret = regulator_enable(fep->reg_phy);
		if (ret) {
			dev_err(&pdev->dev,
				"Failed to enable phy regulator: %d\n", ret);
			goto failed_regulator;
		}
	} else {
		fep->reg_phy = NULL;
	}

	/* Reset the PHY */
	fec_reset_phy(pdev);

	if (fep->bufdesc_ex)
		fec_ptp_init(pdev);

	/* Initialize the ENET controller and its hardware registers */
	ret = fec_enet_init(ndev);
	if (ret)
		goto failed_init;

	for (i = 0; i < FEC_IRQ_NUM; i++) {
		/* Get the interrupt number from the device tree */
		irq = platform_get_irq(pdev, i);
		if (irq < 0) {
			if (i)
				break;
			ret = irq;
			goto failed_irq;
		}
		/* Request the interrupt; the handler is fec_enet_interrupt */
		ret = devm_request_irq(&pdev->dev, irq, fec_enet_interrupt,
				       0, pdev->name, ndev);
		if (ret)
			goto failed_irq;

		fep->irq[i] = irq;
	}
	
	/* Read fsl,wakeup_irq, the index of the wake-up interrupt */
	ret = of_property_read_u32(np, "fsl,wakeup_irq", &irq);
	if (!ret && irq < FEC_IRQ_NUM)
		fep->wake_irq = fep->irq[irq];
	else
		fep->wake_irq = fep->irq[0];
	
	/* Initialize the completion used to wait for MDIO transfers */
	init_completion(&fep->mdio_done);
    
	/* Initialize the MII management interface and register the MDIO bus with the kernel */
	ret = fec_enet_mii_init(pdev);
	if (ret)
		goto failed_mii_init;

	/* Carrier starts down, phylib will bring it up */
	netif_carrier_off(ndev);
	/* Disable the ENET clocks again (second argument is false) */
	fec_enet_clk_enable(ndev, false);
	pinctrl_pm_select_sleep_state(&pdev->dev);
	
	/* Register the net_device */
	ret = register_netdev(ndev);
	if (ret)
		goto failed_register;

	device_init_wakeup(&ndev->dev, fep->wol_flag &
			   FEC_WOL_HAS_MAGIC_PACKET);

	if (fep->bufdesc_ex && fep->ptp_clock)
		netdev_info(ndev, "registered PHC device %d\n", fep->dev_id);

	fep->rx_copybreak = COPYBREAK_DEFAULT;
	INIT_WORK(&fep->tx_timeout_work, fec_enet_timeout_work);
	return 0;

failed_register:
	fec_enet_mii_remove(fep);
failed_mii_init:
failed_irq:
failed_init:
	if (fep->reg_phy)
		regulator_disable(fep->reg_phy);
failed_regulator:
	fec_enet_clk_enable(ndev, false);
failed_clk:
failed_phy:
	of_node_put(phy_node);
failed_ioremap:
	free_netdev(ndev);

	return ret;
}
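
The failed_* labels at the end of fec_probe follow the usual kernel unwind pattern: each label releases one resource and then falls through to the labels for everything acquired before it. The following user-space sketch illustrates only that control flow. The function probe_sketch, its fail_at parameter and the log strings are hypothetical and are not part of the driver.

```c
#include <string.h>

/*
 * Hypothetical model of the goto-unwind pattern used by fec_probe.
 * fail_at selects which step fails (0 = no failure). Every acquire and
 * release is appended to log so the order can be inspected.
 */
int probe_sketch(int fail_at, char *log)
{
	int ret = 0;

	log[0] = '\0';

	strcat(log, "+netdev ");	/* stands in for alloc_etherdev_mqs() */

	if (fail_at == 1) {		/* getting a clock failed */
		ret = -1;
		goto failed_clk;
	}
	strcat(log, "+clk ");		/* stands in for fec_enet_clk_enable(true) */

	if (fail_at == 2) {		/* enabling the regulator failed */
		ret = -1;
		goto failed_regulator;
	}
	strcat(log, "+regulator ");

	if (fail_at == 3) {		/* controller init failed */
		ret = -1;
		goto failed_init;
	}
	return 0;

	/* Labels are ordered so that each one falls through to the next. */
failed_init:
	strcat(log, "-regulator ");
failed_regulator:
	strcat(log, "-clk ");
failed_clk:
	strcat(log, "-netdev");
	return ret;
}
```

Calling probe_sketch(3, log) records the releases in the reverse order of the acquisitions, which is the property the label ordering in fec_probe is meant to guarantee.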

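3.2.2 MDIO bus registration

MDIO is used to manage the PHY chip over two lines, MDIO (data) and MDC (clock). The Linux kernel models it as an MDIO bus, represented by struct mii_bus, which is defined in include/linux/phy.h: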

struct mii_bus {
	const char *name;
	char id[MII_BUS_ID_SIZE];
	void *priv;
	/* Read a PHY register */
	int (*read)(struct mii_bus *bus, int phy_id, int regnum);
	/* Write a PHY register */
	int (*write)(struct mii_bus *bus, int phy_id, int regnum, u16 val);
	int (*reset)(struct mii_bus *bus);

	/*
	 * A lock to ensure that only one thing can read/write
	 * the MDIO bus at a time
	 */
	struct mutex mdio_lock;

	struct device *parent;
	enum {
		MDIOBUS_ALLOCATED = 1,
		MDIOBUS_REGISTERED,
		MDIOBUS_UNREGISTERED,
		MDIOBUS_RELEASED,
	} state;
	struct device dev;

	/* list of all PHYs on bus */
	struct phy_device *phy_map[PHY_MAX_ADDR];

	/* PHY addresses to be ignored when probing */
	u32 phy_mask;

	/*
	 * Pointer to an array of interrupts, each PHY's
	 * interrupt at the index matching its address
	 */
	int *irq;
};
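
The phy_mask member decides which of the 32 possible MDIO addresses are skipped when the bus is scanned without device tree information. As a minimal sketch of that rule, assuming the scan simply walks addresses 0 to 31 and skips every address whose bit is set in the mask, the helper below lists the addresses that would be probed. The function name addrs_to_probe is hypothetical; PHY_MAX_ADDR is redefined locally so the example builds outside the kernel.

```c
#include <stdint.h>

#define PHY_MAX_ADDR 32	/* local copy of the kernel constant, for illustration */

/*
 * Fill addrs[] with the MDIO addresses that are not masked out by
 * phy_mask and return how many there are.
 */
int addrs_to_probe(uint32_t phy_mask, int addrs[PHY_MAX_ADDR])
{
	int n = 0;

	for (int addr = 0; addr < PHY_MAX_ADDR; addr++) {
		if (phy_mask & (1u << addr))
			continue;	/* bit set: this address is ignored */
		addrs[n++] = addr;
	}
	return n;
}
```

With a mask of ~0x3 only addresses 0 and 1 are probed, which would suit a board like the one described here where the two PHYs use addresses 0 and 1.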

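fec_probe calls fec_enet_mii_init to set up the MII management interface, which includes installing the read and write callbacks of the mii_bus. The bus is then registered with the kernel through of_mdiobus_register (when the device tree node has an mdio child node) or mdiobus_register (otherwise):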

static int fec_enet_mii_init(struct platform_device *pdev)
{
	static struct mii_bus *fec0_mii_bus;
	static int *fec_mii_bus_share;
	struct net_device *ndev = platform_get_drvdata(pdev);
	struct fec_enet_private *fep = netdev_priv(ndev);
	struct device_node *node;
	int err = -ENXIO, i;
	u32 mii_speed, holdtime;
	
	/* ... (code omitted) ... */

	fep->mii_bus = mdiobus_alloc();
	if (fep->mii_bus == NULL) {
		err = -ENOMEM;
		goto err_out;
	}

	fep->mii_bus->name = "fec_enet_mii_bus";
	fep->mii_bus->read = fec_enet_mdio_read;
	fep->mii_bus->write = fec_enet_mdio_write;
	snprintf(fep->mii_bus->id, MII_BUS_ID_SIZE, "%s-%x",
		pdev->name, fep->dev_id + 1);
	fep->mii_bus->priv = fep;
	fep->mii_bus->parent = &pdev->dev;

	fep->mii_bus->irq = kmalloc(sizeof(int) * PHY_MAX_ADDR, GFP_KERNEL);
	if (!fep->mii_bus->irq) {
		err = -ENOMEM;
		goto err_out_free_mdiobus;
	}

	for (i = 0; i < PHY_MAX_ADDR; i++)
		fep->mii_bus->irq[i] = PHY_POLL;

	node = of_get_child_by_name(pdev->dev.of_node, "mdio");
	if (node) {
		err = of_mdiobus_register(fep->mii_bus, node);
		of_node_put(node);
	} else {
		err = mdiobus_register(fep->mii_bus);
	}

	if (err)
		goto err_out_free_mdio_irq;

	mii_cnt++;

	/* save fec0 mii_bus */
	if (fep->quirks & FEC_QUIRK_ENET_MAC) {
		fec0_mii_bus = fep->mii_bus;
		fec_mii_bus_share = &fep->mii_bus_share;
	}

	return 0;

err_out_free_mdio_irq:
	kfree(fep->mii_bus->irq);
err_out_free_mdiobus:
	mdiobus_free(fep->mii_bus);
err_out:
	return err;
}
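
For background on what the read and write callbacks ultimately put on the wire, the sketch below packs the fields of an IEEE 802.3 Clause 22 management frame into a 32-bit word: start bits, opcode, 5-bit PHY address, 5-bit register address, turnaround and 16 data bits. The function name mdio_c22_frame and the macro names are hypothetical. The bit positions follow the Clause 22 frame layout, which as far as I know is also the layout of the FEC MII management frame register, but treat that correspondence as an assumption and check the reference manual before relying on it.

```c
#include <stdint.h>

/* Clause 22 frame fields, most significant bits first (names are illustrative) */
#define MDIO_ST_C22	(1u << 30)	/* start of frame: 01 */
#define MDIO_OP_READ	(2u << 28)	/* opcode 10 = read   */
#define MDIO_OP_WRITE	(1u << 28)	/* opcode 01 = write  */
#define MDIO_TA		(2u << 16)	/* turnaround: 10     */

/*
 * Build the 32-bit Clause 22 frame for one register access.
 * phy_addr and reg are truncated to 5 bits; data is only used for writes.
 */
uint32_t mdio_c22_frame(int is_read, unsigned int phy_addr,
			unsigned int reg, uint16_t data)
{
	uint32_t frame = MDIO_ST_C22 | MDIO_TA;

	frame |= is_read ? MDIO_OP_READ : MDIO_OP_WRITE;
	frame |= (uint32_t)(phy_addr & 0x1f) << 23;
	frame |= (uint32_t)(reg & 0x1f) << 18;
	if (!is_read)
		frame |= data;
	return frame;
}
```

Reading register 2 of the PHY at address 0 gives 0x600A0000, and writing 0x8000 to register 0 of the PHY at address 1 gives 0x50828000.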

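of_mdiobus_register walks the child nodes of the mdio node and calls of_mdiobus_register_phy for each PHY it finds, which registers the PHY device with the kernel (mdiobus_register instead probes the bus addresses directly). of_mdiobus_register_phy looks like this: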

static int of_mdiobus_register_phy(struct mii_bus *mdio, struct device_node *child, u32 addr)
{
	struct phy_device *phy;
	bool is_c45;
	int rc;
	u32 phy_id;

	/* Check the compatible property of the PHY node */
	is_c45 = of_device_is_compatible(child,
					 "ethernet-phy-ieee802.3-c45");

	if (!is_c45 && !of_get_phy_id(child, &phy_id))
		phy = phy_device_create(mdio, addr, phy_id, 0, NULL);
	else
		phy = get_phy_device(mdio, addr, is_c45); /* read the PHY ID and create the PHY device */
	if (!phy || IS_ERR(phy))
		return 1;
	
	/* Get the PHY interrupt, if the node describes one */
	rc = irq_of_parse_and_map(child, 0);
	if (rc > 0) {
		phy->irq = rc;
		if (mdio->irq)
			mdio->irq[addr] = rc;
	} else {
		if (mdio->irq)
			phy->irq = mdio->irq[addr];
	}

	/* Associate the OF node with the device structure so it
	 * can be looked up later */
	of_node_get(child);
	phy->dev.of_node = child;

	/* All data is now stored in the phy struct;
	 * register it */
	/* Register the PHY device with the kernel */
	rc = phy_device_register(phy);
	if (rc) {
		phy_device_free(phy);
		of_node_put(child);
		return 1;
	}

	dev_dbg(&mdio->dev, "registered phy %s at address %i\n",
		child->name, addr);

	return 0;
}

The MDIO bus and PHY device registration flow is as follows:

image-20220605211649097

While the MDIO bus is being registered, the kernel looks up the PHY devices in the device tree and registers each of them through phy_device_register.

3.2.3 The fec_drv_remove function

fec_drv_remove runs when the IMX6ULL network driver is unloaded:

static int fec_drv_remove(struct platform_device *pdev)
{
	struct net_device *ndev = platform_get_drvdata(pdev);
	struct fec_enet_private *fep = netdev_priv(ndev);

	cancel_delayed_work_sync(&fep->time_keep);
	cancel_work_sync(&fep->tx_timeout_work);
	/* Unregister the net_device */
	unregister_netdev(ndev);
	/* Remove the MDIO bus */
	fec_enet_mii_remove(fep);
	if (fep->reg_phy)
		regulator_disable(fep->reg_phy);
	if (fep->ptp_clock)
		ptp_clock_unregister(fep->ptp_clock);
	of_node_put(fep->phy_node);
	/* Free the net_device */
	free_netdev(ndev);

	return 0;
}

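3.2.4 The fec_netdev_ops operation set

During fec_probe (in fec_enet_init), the netdev_ops of the network device is set to fec_netdev_ops:

static const struct net_device_ops fec_netdev_ops = {
	.ndo_open			= fec_enet_open,	/* called when the interface is brought up */
	.ndo_stop			= fec_enet_close,	/* called when the interface is brought down */
	.ndo_start_xmit		= fec_enet_start_xmit,	/* packet transmit function */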
	.ndo_select_queue   = fec_enet_select_queue,
	.ndo_set_rx_mode	= set_multicast_list,
	.ndo_change_mtu		= eth_change_mtu,
	.ndo_validate_addr	= eth_validate_addr,
	.ndo_tx_timeout		= fec_timeout,
	.ndo_set_mac_address= fec_set_mac_address,
	.ndo_do_ioctl		= fec_enet_ioctl,
#ifdef CONFIG_NET_POLL_CONTROLLER
	.ndo_poll_controller= fec_poll_controller,
#endif
	.ndo_set_features	= fec_set_features,
};

3.2.5 The fec_enet_interrupt function

The IMX6ULL driver receives network data through the NAPI framework, which still relies on an interrupt to start each polling cycle. fec_probe requests the network interrupts with fec_enet_interrupt as the handler.

The packets themselves are sent and received in the NAPI poll function; the interrupt handler only has to schedule NAPI. This is the usual top-half/bottom-half split of interrupt handling.

static irqreturn_t fec_enet_interrupt(int irq, void *dev_id)
{
	struct net_device *ndev = dev_id;
	struct fec_enet_private *fep = netdev_priv(ndev);
	uint int_events;
	irqreturn_t ret = IRQ_NONE;

	int_events = readl(fep->hwp + FEC_IEVENT);
	writel(int_events, fep->hwp + FEC_IEVENT);
	fec_enet_collect_events(fep, int_events);

	if ((fep->work_tx || fep->work_rx) && fep->link) {
		ret = IRQ_HANDLED;

		if (napi_schedule_prep(&fep->napi)) {
			/* Disable the NAPI interrupts */
			writel(FEC_ENET_MII, fep->hwp + FEC_IMASK);
			__napi_schedule(&fep->napi);
		}
	}

	if (int_events & FEC_ENET_MII) {
		ret = IRQ_HANDLED;
		complete(&fep->mdio_done);
	}

	if (fep->ptp_clock)
		fec_ptp_check_pps_event(fep);

	return ret;
}

3.2.6 The fec_enet_rx_napi function

When fec_enet_init initializes the controller, it calls netif_napi_add to install fec_enet_rx_napi as the NAPI poll function:

static int fec_enet_rx_napi(struct napi_struct *napi, int budget)
{
	struct net_device *ndev = napi->dev;
	struct fec_enet_private *fep = netdev_priv(ndev);
	int pkts;
	
	/* Receive up to budget packets */
	pkts = fec_enet_rx(ndev, budget);

	/* Process completed transmissions */
	fec_enet_tx(ndev);

	if (pkts < budget) {
		/* Mark polling as finished */
		napi_complete(napi);
		writel(FEC_DEFAULT_IMASK, fep->hwp + FEC_IMASK);
	}
	return pkts;
}
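
The interplay between the interrupt handler and the poll function can be hard to see from the two listings alone. The user-space model below is a sketch of the budget rule only: the interrupt masks further interrupts and hands work to the poll function, and the poll function re-enables interrupts only when it processed fewer packets than its budget, meaning the queue is empty. The struct fake_nic and all function names are hypothetical, and no real NAPI API is used.

```c
/* Hypothetical device state: queued RX packets and whether the IRQ is unmasked */
struct fake_nic {
	int rx_pending;
	int irq_enabled;
};

/* Model of the interrupt handler: mask the interrupt and defer to polling */
void irq_sketch(struct fake_nic *nic)
{
	nic->irq_enabled = 0;
}

/* Model of the receive routine: consume at most budget packets */
int rx_sketch(struct fake_nic *nic, int budget)
{
	int done = 0;

	while (done < budget && nic->rx_pending > 0) {
		nic->rx_pending--;
		done++;
	}
	return done;
}

/* Model of the poll function: re-enable the interrupt once the queue is drained */
int poll_sketch(struct fake_nic *nic, int budget)
{
	int pkts = rx_sketch(nic, budget);

	if (pkts < budget)
		nic->irq_enabled = 1;	/* polling finished, back to interrupt mode */
	return pkts;
}
```

With five packets queued and a budget of three, the first poll returns 3 and leaves the interrupt masked so that polling continues; the second poll returns 2 and unmasks it.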

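3.3 The kernel PHY subsystem and the MDIO bus

Registering the MDIO bus also registers the PHY devices on it. The PHY subsystem handles everything related to PHYs and is split into PHY devices and PHY drivers. Like the platform bus, it follows the device, bus and driver model.

3.3.1 PHY devices

The Linux kernel represents a PHY device with struct phy_device, defined in include/linux/phy.h: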

struct phy_device {
	/* Information about the PHY type */
	/* And management functions */
	struct phy_driver *drv;	/* PHY driver bound to this device */
	struct mii_bus *bus;	/* MII (MDIO) bus the PHY sits on */
	struct device dev;		/* embedded device structure */
	u32 phy_id;				/* PHY ID */

	struct phy_c45_device_ids c45_ids;
	bool is_c45;
	bool is_internal;
	bool has_fixups;
	bool suspended;

	enum phy_state state;	/* PHY state */

	u32 dev_flags;

	phy_interface_t interface;	/* PHY interface mode */

	/* Bus address of the PHY (0-31) */
	int addr;

	/*
	 * forced speed & duplex (no autoneg)
	 * partner speed & duplex & pause (autoneg)
	 */
	int speed;	/* link speed */
	int duplex;	/* duplex mode */
	int pause;
	int asym_pause;

	/* The most recently read link state */
	int link;

	/* Enabled Interrupts */
	u32 interrupts;	/* interrupt enable flags */

	/* Union of PHY and Attached devices' supported modes */
	/* See mii.h for more info */
	u32 supported;
	u32 advertising;
	u32 lp_advertising;

	int autoneg;

	int link_timeout;

	/*
	 * Interrupt number for this PHY
	 * -1 means no interrupt
	 */
	int irq;	/* interrupt number */

	/* private data pointer */
	/* For use by PHYs to maintain extra state */
	void *priv;

	/* Interrupt and Polling infrastructure */
	struct work_struct phy_queue;
	struct delayed_work state_queue;
	atomic_t irq_disable;

	struct mutex lock;

	struct net_device *attached_dev;	/* network device this PHY is attached to */

	void (*adjust_link)(struct net_device *dev);	/* called when the link state changes */
};

Each PHY has one phy_device instance, which must be registered with the Linux kernel. This is done with phy_device_register, whose prototype is:

int phy_device_register(struct phy_device *phy);

3.3.2 PHY drivers

A PHY driver is represented by struct phy_driver, which is also defined in include/linux/phy.h:

struct phy_driver {
	u32 phy_id;
	char *name;
	unsigned int phy_id_mask;
	u32 features;
	u32 flags;
	const void *driver_data;

	/*
	 * Called to issue a PHY software reset
	 */
	int (*soft_reset)(struct phy_device *phydev);

	/*
	 * Called to initialize the PHY,
	 * including after a reset
	 */
	int (*config_init)(struct phy_device *phydev);

	/*
	 * Called during discovery.  Used to set
	 * up device-specific structures, if any
	 */
	int (*probe)(struct phy_device *phydev);

	/* PHY Power Management */
	int (*suspend)(struct phy_device *phydev);
	int (*resume)(struct phy_device *phydev);

	/*
	 * Configures the advertisement and resets
	 * autonegotiation if phydev->autoneg is on,
	 * forces the speed to the current settings in phydev
	 * if phydev->autoneg is off
	 */
	int (*config_aneg)(struct phy_device *phydev);

	/* Determines the auto negotiation result */
	int (*aneg_done)(struct phy_device *phydev);

	/* Determines the negotiated speed and duplex */
	int (*read_status)(struct phy_device *phydev);

	/* Clears any pending interrupts */
	int (*ack_interrupt)(struct phy_device *phydev);

	/* Enables or disables interrupts */
	int (*config_intr)(struct phy_device *phydev);

	/*
	 * Checks if the PHY generated an interrupt.
	 * For multi-PHY devices with shared PHY interrupt pin
	 */
	int (*did_interrupt)(struct phy_device *phydev);

	/* Clears up any memory if needed */
	void (*remove)(struct phy_device *phydev);

	/* Returns true if this is a suitable driver for the given
	 * phydev.  If NULL, matching is based on phy_id and
	 * phy_id_mask.
	 */
	int (*match_phy_device)(struct phy_device *phydev);

	/* Handles ethtool queries for hardware time stamping. */
	int (*ts_info)(struct phy_device *phydev, struct ethtool_ts_info *ti);

	/* Handles SIOCSHWTSTAMP ioctl for hardware time stamping. */
	int  (*hwtstamp)(struct phy_device *phydev, struct ifreq *ifr);

	/*
	 * Requests a Rx timestamp for 'skb'. If the skb is accepted,
	 * the phy driver promises to deliver it using netif_rx() as
	 * soon as a timestamp becomes available. One of the
	 * PTP_CLASS_ values is passed in 'type'. The function must
	 * return true if the skb is accepted for delivery.
	 */
	bool (*rxtstamp)(struct phy_device *dev, struct sk_buff *skb, int type);

	/*
	 * Requests a Tx timestamp for 'skb'. The phy driver promises
	 * to deliver it using skb_complete_tx_timestamp() as soon as a
	 * timestamp becomes available. One of the PTP_CLASS_ values
	 * is passed in 'type'.
	 */
	void (*txtstamp)(struct phy_device *dev, struct sk_buff *skb, int type);

	/* Some devices (e.g. qnap TS-119P II) require PHY register changes to
	 * enable Wake on LAN, so set_wol is provided to be called in the
	 * ethernet driver's set_wol function. */
	int (*set_wol)(struct phy_device *dev, struct ethtool_wolinfo *wol);

	/* See set_wol, but for checking whether Wake on LAN is enabled. */
	void (*get_wol)(struct phy_device *dev, struct ethtool_wolinfo *wol);

	/*
	 * Called to inform a PHY device driver when the core is about to
	 * change the link state. This callback is supposed to be used as
	 * fixup hook for drivers that need to take action when the link
	 * state changes. Drivers are by no means allowed to mess with the
	 * PHY device structure in their implementations.
	 */
	void (*link_change_notify)(struct phy_device *dev);

	/* A function provided by a phy specific driver to override the
	 * the PHY driver framework support for reading a MMD register
	 * from the PHY. If not supported, return -1. This function is
	 * optional for PHY specific drivers, if not provided then the
	 * default MMD read function is used by the PHY framework.
	 */
	int (*read_mmd_indirect)(struct phy_device *dev, int ptrad,
				 int devnum, int regnum);

	/* A function provided by a phy specific driver to override the
	 * the PHY driver framework support for writing a MMD register
	 * from the PHY. This function is optional for PHY specific drivers,
	 * if not provided then the default MMD read function is used by
	 * the PHY framework.
	 */
	void (*write_mmd_indirect)(struct phy_device *dev, int ptrad,
				   int devnum, int regnum, u32 val);

	/* Get the size and type of the eeprom contained within a plug-in
	 * module */
	int (*module_info)(struct phy_device *dev,
			   struct ethtool_modinfo *modinfo);

	/* Get the eeprom information from the plug-in module */
	int (*module_eeprom)(struct phy_device *dev,
			     struct ethtool_eeprom *ee, u8 *data);

	struct device_driver driver;
};

(1) Registering a PHY driver

Once a phy_driver has been initialized, it must be registered with the Linux kernel using phy_driver_register. Registration sets the driver's bus to mdio_bus_type, that is, the MDIO bus.

int phy_driver_register(struct phy_driver *new_driver);

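(2) Registering several PHY drivers at once

A vendor usually makes several PHY chips that differ only slightly. Registering each one separately would produce a pile of near-duplicate driver files, so the kernel provides phy_drivers_register to register several PHY drivers in one call. Prepare an array of phy_driver, one element per supported PHY chip, and pass the whole array to phy_drivers_register. The prototype is: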

int phy_drivers_register(struct phy_driver *new_driver, int n);
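
One detail worth knowing about array registration is what happens when an element in the middle fails. The user-space model below assumes the behaviour I understand the kernel of this era to have: registration stops at the first failure and the elements registered so far are unregistered again, so the array ends up either fully registered or not registered at all. Every name here (struct drv_sketch and the *_sketch functions) is hypothetical and only stands in for the real API.

```c
/* Hypothetical stand-in for struct phy_driver */
struct drv_sketch {
	const char *name;
	int fail;		/* set to 1 to simulate a registration failure */
	int registered;
};

/* Stand-in for phy_driver_register(): returns 0 on success */
int driver_register_sketch(struct drv_sketch *d)
{
	if (d->fail)
		return -1;
	d->registered = 1;
	return 0;
}

/* Stand-in for phy_driver_unregister() */
void driver_unregister_sketch(struct drv_sketch *d)
{
	d->registered = 0;
}

/* Register n drivers; on the first failure, roll back the earlier ones */
int drivers_register_sketch(struct drv_sketch *drv, int n)
{
	int i, ret = 0;

	for (i = 0; i < n; i++) {
		ret = driver_register_sketch(drv + i);
		if (ret) {
			while (i-- > 0)
				driver_unregister_sketch(drv + i);
			break;
		}
	}
	return ret;
}
```

The practical consequence is that a driver module's init function only needs to check one return value and does not have to track which elements succeeded.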

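(3) Unregistering PHY drivers

Use phy_driver_unregister to unregister a single PHY driver, or phy_drivers_unregister for an array of drivers. The prototypes are: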

void phy_driver_unregister(struct phy_driver *drv);
void phy_drivers_unregister(struct phy_driver *drv, int n);

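3.3.3 The MDIO bus

The PHY subsystem follows the device, bus and driver model. The devices and drivers are phy_device and phy_driver; the bus is the MDIO bus, because PHY chips are managed through the MDIO interface. The main job of the MDIO bus is to match PHY devices with PHY drivers. The file drivers/net/phy/mdio_bus.c contains the following definition: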

struct bus_type mdio_bus_type = {
	.name	= "mdio_bus",
	.match	= mdio_bus_match,
	.pm		= MDIO_BUS_PM_OPS,
	.dev_groups	= mdio_dev_groups,
};

The key part is mdio_bus_match:

static int mdio_bus_match(struct device *dev, struct device_driver *drv)
{
	struct phy_device *phydev = to_phy_device(dev);
	struct phy_driver *phydrv = to_phy_driver(drv);

	if (of_driver_match_device(dev, drv))
		return 1;

	if (phydrv->match_phy_device)
		return phydrv->match_phy_device(phydev);

	return (phydrv->phy_id & phydrv->phy_id_mask) ==
		(phydev->phy_id & phydrv->phy_id_mask);
}

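phy_driver has two members, phy_id and phy_id_mask, which give the PHY chip ID that the driver supports and the mask applied to it; the author of a PHY driver must fill in both. phy_device also has a phy_id member that holds the ID of the actual chip. It is normally obtained when the PHY device is registered, by calling get_phy_id to read the ID registers inside the PHY.

For a driver and a device to match, their IDs must agree after both have been ANDed with the driver's phy_id_mask. If the two masked values are equal, the driver and the device match.

If a PHY device matches a PHY driver, that driver is used. If no driver matches, the generic PHY driver built into the Linux kernel is used instead.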
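
The masked comparison described above is easy to try outside the kernel. The sketch below combines the two 16-bit ID registers (registers 2 and 3 in the Clause 22 register map) into a 32-bit ID and applies the same comparison as the last statement of mdio_bus_match. The helper names are hypothetical. The example values 0x0007c0f0 and 0xfffffff0 are what I believe the kernel's SMSC driver uses for the LAN8710/LAN8720 family; treat them as illustrative and confirm them against the driver source and the datasheet.

```c
#include <stdint.h>

/* Combine PHY ID register 1 (high half) and register 2 (low half) */
uint32_t phy_id_from_regs(uint16_t physid1, uint16_t physid2)
{
	return ((uint32_t)physid1 << 16) | physid2;
}

/* Same comparison as the final return statement of mdio_bus_match */
int phy_id_matches(uint32_t drv_id, uint32_t drv_mask, uint32_t dev_id)
{
	return (drv_id & drv_mask) == (dev_id & drv_mask);
}
```

Masking off the low four bits lets one driver entry cover every silicon revision of a chip: a device reporting 0x0007c0f1 matches the entry 0x0007c0f0 with mask 0xfffffff0. It also shows why the generic driver's entry (ID and mask both 0xffffffff) never wins the ID comparison for a real chip.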

3.3.4 通用PHY驱动

The generic PHY driver is named "Generic PHY". It is defined in drivers/net/phy/phy_device.c and registered by the phy_init function:

static struct phy_driver genphy_driver[] = {
	{
        .phy_id		= 0xffffffff,
        .phy_id_mask	= 0xffffffff,
        .name		= "Generic PHY",
        .soft_reset	= genphy_soft_reset,
        .config_init	= genphy_config_init,
        .features	= PHY_GBIT_FEATURES | SUPPORTED_MII |
                  SUPPORTED_AUI | SUPPORTED_FIBRE |
                  SUPPORTED_BNC,
        .config_aneg	= genphy_config_aneg,
        .aneg_done	= genphy_aneg_done,
        .read_status	= genphy_read_status,
        .suspend	= genphy_suspend,
        .resume		= genphy_resume,
        .driver		= { .owner = THIS_MODULE, },
	},
	
	{
        .phy_id         = 0xffffffff,
        .phy_id_mask    = 0xffffffff,
        .name           = "Generic 10G PHY",
        .soft_reset	= gen10g_soft_reset,
        .config_init    = gen10g_config_init,
        .features       = 0,
        .config_aneg    = gen10g_config_aneg,
        .read_status    = gen10g_read_status,
        .suspend        = gen10g_suspend,
        .resume         = gen10g_resume,
        .driver         = {.owner = THIS_MODULE, },
	}
};

static int __init phy_init(void)
{
	int rc;

	rc = mdio_bus_init();
	if (rc)
		return rc;

	rc = phy_drivers_register(genphy_driver,
				  ARRAY_SIZE(genphy_driver));
	if (rc)
		mdio_bus_exit();

	return rc;
}
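
A generic driver is possible because Clause 22 PHYs share a standard set of basic registers. As a small illustration, the sketch below decodes two bits of the Basic Mode Status Register (register 1): link status and auto-negotiation complete. The bit values 0x0004 and 0x0020 are the standard ones; the helper names are hypothetical, and this is only a simplified view of what a status read involves, not the kernel's genphy_read_status.

```c
#include <stdint.h>

#define BMSR_LSTATUS		0x0004	/* link is up */
#define BMSR_ANEGCOMPLETE	0x0020	/* auto-negotiation has finished */

/* Return 1 if the status register reports an established link */
int bmsr_link_up(uint16_t bmsr)
{
	return (bmsr & BMSR_LSTATUS) != 0;
}

/* Return 1 if the status register reports completed auto-negotiation */
int bmsr_aneg_done(uint16_t bmsr)
{
	return (bmsr & BMSR_ANEGCOMPLETE) != 0;
}
```

For example, a register value of 0x782D has both bits set, while 0x7809 has neither.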