The whole vfio subsystem should support 3 sub-features:
- cfg/mem/io support: user space can access the cfg/mem/io regions of a VF.
- DMA support: DMA addresses used by a VF are translated to a user space DMA memory range.
- interrupt support: interrupts from a VF can be routed to the VM OS.
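As an illustration of the DMA sub-feature, the following is a minimal sketch of how user space might ask a container to map a buffer for device DMA through the type1 IOMMU backend. The helper name `vfio_map_dma` is hypothetical; the structure and ioctl come from `<linux/vfio.h>`. It assumes the container fd already has a group attached and an IOMMU model selected.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Hypothetical helper: make `size` bytes at `vaddr` visible to the device
 * at I/O virtual address `iova`. Returns 0 on success, -1 on failure
 * (errno is set by ioctl). */
int vfio_map_dma(int container_fd, void *vaddr, uint64_t iova, uint64_t size)
{
    struct vfio_iommu_type1_dma_map map;

    memset(&map, 0, sizeof(map));
    map.argsz = sizeof(map);
    map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
    map.vaddr = (uint64_t)(uintptr_t)vaddr;  /* process virtual address */
    map.iova  = iova;                        /* address the device will use */
    map.size  = size;

    return ioctl(container_fd, VFIO_IOMMU_MAP_DMA, &map) == 0 ? 0 : -1;
}
```

The buffer and size are normally page aligned; the sketch leaves that check to the caller.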
From the code's point of view, the vfio driver consists of:
- the base vfio framework, initialized in drivers/vfio/vfio.c
- the pci/platform vfio device drivers, initialized in drivers/vfio/pci/vfio_pci.c and drivers/vfio/platform/vfio_platform.c
- the vfio iommu drivers, initialized and registered to the vfio system in drivers/vfio/vfio_iommu*
This vfio system creates /dev/vfio/vfio as a vfio container, which represents
an address space shared by multiple devices. It also creates /dev/vfio/
as a vfio group, which represents a group shared by multiple devices behind an iommu
or smmu unit. When we open a /dev/vfio/ group file, an ioctl on it returns a fd which
represents a device handled by the vfio system. The device can be controlled through this fd.
The vfio system does not create a new bus. Instead, we unbind the device from its
original driver and bind it to a vfio device driver. So for a PCI device, we need
the vfio pci driver to handle the device. The vfio pci driver becomes the agent of
the device and exports all its resources to user space.
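The container, group and device fd relationship described above can be sketched from the user space side as follows. This is a minimal sketch, not a complete program: the function names are hypothetical, the paths are passed in by the caller, and the type1 IOMMU model is assumed. The ioctls themselves are the ones defined in `<linux/vfio.h>`.

```c
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Open the container node and check the VFIO API version; -1 on failure. */
int vfio_open_container(const char *container_path)
{
    int fd = open(container_path, O_RDWR);

    if (fd < 0)
        return -1;
    if (ioctl(fd, VFIO_GET_API_VERSION) != VFIO_API_VERSION) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Open a group node, add it to the container and select the IOMMU model.
 * Returns the group fd (the caller keeps it open), or -1 on failure. */
int vfio_attach_group(int container_fd, const char *group_path)
{
    struct vfio_group_status status = { .argsz = sizeof(status) };
    int group_fd = open(group_path, O_RDWR);

    if (group_fd < 0)
        return -1;

    /* A group is viable only if all of its devices are bound to vfio
     * drivers or to no driver at all. */
    if (ioctl(group_fd, VFIO_GROUP_GET_STATUS, &status) < 0 ||
        !(status.flags & VFIO_GROUP_FLAGS_VIABLE))
        goto fail;
    if (ioctl(group_fd, VFIO_GROUP_SET_CONTAINER, &container_fd) < 0)
        goto fail;
    /* Assumption: the type1 backend is the one in use on this machine. */
    if (ioctl(container_fd, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU) < 0)
        goto fail;
    return group_fd;

fail:
    close(group_fd);
    return -1;
}

/* Ask the group for a device fd; dev_name is the device's bus address string. */
int vfio_get_device_fd(int group_fd, const char *dev_name)
{
    return ioctl(group_fd, VFIO_GROUP_GET_DEVICE_FD, dev_name);
}
```

A caller would open the container first, attach the group the device belongs to, and then request the device fd by name.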
The interfaces for userspace:
vfio init in vfio.c
vfio registers a misc device in /dev/vfio/vfio.
Items initialized in vfio:
    register vfio_dev(miscdevice) in misc sub-system, file: /dev/vfio/vfio
vfio.c creates a vfio class, which works together with device_create in
vfio_create_group. Creating a vfio group really means creating a device in this
vfio class; the vfio group file will be /dev/vfio/
vfio_create_group is called in vfio_pci_probe and vfio_platform_probe. In the probe,
we get the devices that we want the vfio system to handle, find which iommu group
these devices belong to, and then create the related vfio_group to store that
iommu group. Here device_create is used to create a file under /dev/vfio/ that refers to
the vfio_group. Finally, we create a vfio_pci/vfio_platform_device for each device
that we want the vfio system to take care of. For details, please refer to part2.
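The "find which iommu group a device belongs to" step has a user space counterpart: sysfs exposes an `iommu_group` symlink in each device directory, and the last path component of its target is the group number. The helper below is a hypothetical sketch of that lookup; the function name and the way the sysfs directory is passed in are assumptions for illustration.

```c
#define _DEFAULT_SOURCE  /* for readlink() under strict -std= modes */
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Resolve the IOMMU group number of a device from its sysfs directory
 * (for a PCI device, a directory under /sys/bus/pci/devices/).
 * Returns the group number, or -1 on failure. */
int iommu_group_of(const char *dev_sysfs_dir)
{
    char link[PATH_MAX], target[PATH_MAX];
    const char *base;
    char *end;
    ssize_t n;
    long id;

    if (snprintf(link, sizeof(link), "%s/iommu_group", dev_sysfs_dir)
        >= (int)sizeof(link))
        return -1;

    n = readlink(link, target, sizeof(target) - 1);
    if (n < 0)
        return -1;
    target[n] = '\0';  /* readlink() does not NUL-terminate */

    /* The group number is the last component of the link target. */
    base = strrchr(target, '/');
    base = base ? base + 1 : target;

    id = strtol(base, &end, 10);
    if (end == base || *end != '\0' || id < 0 || id > INT_MAX)
        return -1;
    return (int)id;
}
```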
    vfio.class = class_create(THIS_MODULE, "vfio")
When we operate on a /dev/vfio/ group file, the calls are handled by the
functions in vfio_group_fops.
    register chr device: vfio.group_cdev(struct cdev)
So what happens when the callbacks above are called:
    open: find vfio_group --> share vfio_group to private_data of related struct file.
An ioctl of vfio_group can get a fd for the device.
We already have the iommu_group of a device, so why do we use vfio_group_set_container
to add this vfio_group to a vfio container?
The concept of a vfio container is to build an address space shared by multiple
devices.
             vfio container
------+--------------+--------------+-------
      |              |              |
    +-+--+         +-+--+         +-+--+
    |smmu|         |smmu|         |smmu|
    +-+--+         +-+--+         +-+--+
      |              |              |
    +-+--+         +-+--+         +-+--+
    |dev |         |dev |         |dev |
    +----+         +----+         +----+
When a vfio_group is added to a vfio container, the mappings of this vfio_group are
also physically installed in the other smmus. So all smmus above hold the same mappings
if their vfio_groups have been added to the same vfio container. All mappings are
maintained in the vfio container.
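The idea that the container owns the mappings and every attached group ends up with the same set can be shown with a small toy model. This is a user space sketch only: all type and function names are invented for illustration and do not correspond to the kernel's data structures.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_GROUPS   4
#define MAX_MAPPINGS 8

struct mapping { uint64_t iova, vaddr, size; };

/* Stand-in for one smmu behind a group: just a list of installed mappings. */
struct toy_group {
    struct mapping table[MAX_MAPPINGS];
    size_t nr;
};

/* The container keeps the authoritative mapping list and the attached groups. */
struct toy_container {
    struct mapping maps[MAX_MAPPINGS];
    size_t nr_maps;
    struct toy_group *groups[MAX_GROUPS];
    size_t nr_groups;
};

static int group_install(struct toy_group *g, const struct mapping *m)
{
    if (g->nr == MAX_MAPPINGS)
        return -1;
    g->table[g->nr++] = *m;
    return 0;
}

/* Attaching a group replays every mapping the container already holds. */
int container_attach(struct toy_container *c, struct toy_group *g)
{
    if (c->nr_groups == MAX_GROUPS)
        return -1;
    for (size_t i = 0; i < c->nr_maps; i++)
        if (group_install(g, &c->maps[i]) < 0)
            return -1;
    c->groups[c->nr_groups++] = g;
    return 0;
}

/* A new mapping is recorded once and pushed to every attached group. */
int container_map(struct toy_container *c, uint64_t iova,
                  uint64_t vaddr, uint64_t size)
{
    struct mapping m = { iova, vaddr, size };

    if (c->nr_maps == MAX_MAPPINGS)
        return -1;
    for (size_t i = 0; i < c->nr_groups; i++)
        if (group_install(c->groups[i], &m) < 0)
            return -1;
    c->maps[c->nr_maps++] = m;
    return 0;
}
```

In this model a group attached late still sees mappings created before it joined, which is the property the diagram is meant to convey.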
How to add a vfio_group to a vfio_container:
    vfio_ioctl_set_iommu
probe of vfio_pci.c/vfio_platform.c
All the work in the vfio system helps build the vfio structures below:
    global: vfio
Here we analyze the flows in vfio_pci.
vfio_pci_init calls pci_register_driver(&vfio_pci_driver) so that the driver can
probe PCIe devices in the whole PCIe domain. These devices have already been set up
by the standard PCIe enumeration process.
    vfio_pci_probe
vfio_register_iommu_driver in specific iommu file
Physically, different iommu implementations can be used, e.g. SMMU on ARM or IOMMU on
Intel. The vfio iommu driver is used to control them.
register vfio_iommu_driver to vfio:
    vfio
These ops are called through vfio_container->vfio_iommu_driver->ops. We bind
vfio_container and vfio_iommu_driver together in the unlocked_ioctl (using VFIO_SET_IOMMU)
of /dev/vfio/vfio. A specific iommu driver can be registered to vfio here; currently there are
vfio iommu drivers for X86 (vfio_iommu_type1.c) and POWER (vfio_iommu_spapr_tce.c).
How to bind vfio_container and vfio_iommu_driver:
For a /dev/vfio/vfio container fd, the VFIO_SET_IOMMU ioctl sets the specific
IOMMU for the container:
    vfio_fops_unl_ioctl
How to call the ops in vfio_iommu_driver:
The container's file operations (for example vfio_fops_unl_ioctl) forward to the ops in vfio_iommu_driver.
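The register, bind and dispatch pattern described in this part can be illustrated with a small function-pointer model. It is only loosely modeled on the kernel flow: every name below (`demo_iommu_ops`, `demo_set_iommu`, and so on) is invented for illustration, and the "type1" driver is a stub.

```c
#include <stddef.h>

/* Stand-in for an iommu driver's ops table. */
struct demo_iommu_ops {
    const char *name;
    int  (*supports)(int iommu_type);
    long (*ioctl)(void *iommu_data, unsigned int cmd, unsigned long arg);
};

#define MAX_DRIVERS 4
static const struct demo_iommu_ops *drivers[MAX_DRIVERS];
static size_t nr_drivers;

/* Registration: each backend adds its ops to a global list. */
int demo_register_iommu_driver(const struct demo_iommu_ops *ops)
{
    if (nr_drivers == MAX_DRIVERS)
        return -1;
    drivers[nr_drivers++] = ops;
    return 0;
}

struct demo_container {
    const struct demo_iommu_ops *ops;  /* bound by demo_set_iommu() */
    void *iommu_data;
};

/* Binding: pick the first registered driver that accepts the requested type. */
int demo_set_iommu(struct demo_container *c, int iommu_type)
{
    for (size_t i = 0; i < nr_drivers; i++) {
        if (drivers[i]->supports(iommu_type)) {
            c->ops = drivers[i];
            return 0;
        }
    }
    return -1;
}

/* Dispatch: the container forwards requests to the bound driver's ops. */
long demo_container_ioctl(struct demo_container *c, unsigned int cmd,
                          unsigned long arg)
{
    if (!c->ops)
        return -1;
    return c->ops->ioctl(c->iommu_data, cmd, arg);
}

/* A stub backend so the model can be exercised. */
enum { DEMO_TYPE1 = 1, DEMO_SPAPR = 2 };

static int demo_type1_supports(int iommu_type)
{
    return iommu_type == DEMO_TYPE1;
}

static long demo_type1_ioctl(void *iommu_data, unsigned int cmd,
                             unsigned long arg)
{
    (void)iommu_data;
    (void)arg;
    return (long)cmd;  /* echo the command to show the call arrived here */
}

const struct demo_iommu_ops demo_type1_ops = {
    .name     = "demo-type1",
    .supports = demo_type1_supports,
    .ioctl    = demo_type1_ioctl,
};
```

Before a driver is bound, `demo_container_ioctl` fails; after `demo_set_iommu` succeeds, every request reaches the selected backend.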
how to access cfg/mem/io of VFs
    /* ioctl of vfio_group to get a fd of device */
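Once a device fd is available, config space access from user space might look like the sketch below. The helper name `vfio_read_vendor_id` is hypothetical; the region info structure, the ioctl and `VFIO_PCI_CONFIG_REGION_INDEX` come from `<linux/vfio.h>`. It assumes a vfio-pci device and a little-endian host.

```c
#define _DEFAULT_SOURCE  /* for pread() under strict -std= modes */
#include <stdint.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Read the 16-bit vendor ID at offset 0 of PCI config space through a
 * vfio device fd. Returns 0 on success, -1 on failure. */
int vfio_read_vendor_id(int device_fd, uint16_t *vendor)
{
    struct vfio_region_info reg = {
        .argsz = sizeof(reg),
        .index = VFIO_PCI_CONFIG_REGION_INDEX,
    };

    /* Ask where the config region lives inside the device fd. */
    if (ioctl(device_fd, VFIO_DEVICE_GET_REGION_INFO, &reg) < 0)
        return -1;

    /* Regions are accessed with pread/pwrite at the reported file offset. */
    if (pread(device_fd, vendor, sizeof(*vendor), (off_t)reg.offset)
        != (ssize_t)sizeof(*vendor))
        return -1;
    return 0;
}
```

Memory and I/O BAR regions follow the same pattern with a different region index; regions that report the mmap flag can also be mapped instead of read.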