
The following is my understanding and interpretation of the proposal from Michael S. Tsirkin.
https://lists.gnu.org/archive/html/qemu-devel/2015-08/msg03993.html

Jun Nakajima


VM1 and VM2 correspond to the examples in his proposal. VM1 can express access permissions (R/W) to regions of its guest physical address (GPA) space through a virtual IOMMU. Typically, an IOMMU (e.g. AMD-Vi or Intel VT-d) uses page tables that translate the bus (I/O) addresses used by a given PCI device into GPAs, so that the rest of the system is protected from DMA operations issued with arbitrary DMA addresses. See https://www.kernel.org/doc/Documentation/vfio.txt for details on IOMMUs and VFIO.

If you think about an imaginary (i.e. virtual) PCI device "R", you can set up a mapping from bus (I/O) addresses to GPAs for that device (because such a mapping can be set up for each PCI device). This way, VM1 "gets full control of its security, from mapping all memory (like with current vhost-user) to only mapping buffers used for networking (like ivshmem) to transient mappings for the duration of data transfer only."
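To make the per-device mapping concrete, here is a minimal sketch of such a table. It is not taken from the proposal or from QEMU: the names `ToyIommu`, `Entry`, `map`, `unmap` and `translate`, the 4 KiB page size, and all addresses are assumptions made for illustration only.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed translation granularity


@dataclass(frozen=True)
class Entry:
    gpa_page: int    # guest physical page number the bus page maps to
    readable: bool
    writable: bool


class ToyIommu:
    """Hypothetical per-device tables: device -> {bus page -> Entry}."""

    def __init__(self):
        self.tables = {}

    def map(self, device, bus_addr, gpa, length, readable=True, writable=True):
        if length <= 0 or any(v % PAGE_SIZE for v in (bus_addr, gpa, length)):
            raise ValueError("bus_addr, gpa and length must be page-aligned")
        table = self.tables.setdefault(device, {})
        for off in range(0, length, PAGE_SIZE):
            table[(bus_addr + off) // PAGE_SIZE] = Entry(
                (gpa + off) // PAGE_SIZE, readable, writable)

    def unmap(self, device, bus_addr, length):
        table = self.tables.get(device, {})
        for off in range(0, length, PAGE_SIZE):
            table.pop((bus_addr + off) // PAGE_SIZE, None)

    def translate(self, device, bus_addr, write):
        """Return the GPA, or None if the access is not permitted."""
        entry = self.tables.get(device, {}).get(bus_addr // PAGE_SIZE)
        if entry is None:
            return None
        if (write and not entry.writable) or (not write and not entry.readable):
            return None
        return entry.gpa_page * PAGE_SIZE + bus_addr % PAGE_SIZE


iommu = ToyIommu()
# Long-lived read/write mapping of a two-page network buffer for device "R".
iommu.map("R", bus_addr=0x10000, gpa=0x7F000, length=2 * PAGE_SIZE)
# Transient read-only mapping, torn down once the transfer is finished.
iommu.map("R", bus_addr=0x40000, gpa=0x3000, length=PAGE_SIZE, writable=False)
gpa_for_read = iommu.translate("R", 0x40010, write=False)
iommu.unmap("R", 0x40000, PAGE_SIZE)
```

The two `map` calls correspond to two points on the spectrum in the quote: a buffer that stays mapped for networking, and a mapping that exists only for the duration of one transfer. Anything the guest has not mapped for "R" simply does not translate.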

General DMA Operations and Protection

If the device driver of R sets up a buffer to receive data by programming the (virtual) registers of R, data will be placed into the buffer by DMA. The device R then generates an interrupt (possibly after putting more data into other buffers). The IOMMU machinery makes sure that the address of the data (a bus or I/O address) is translated to a GPA. If the mapping is not valid, the IOMMU reports an error (via MSI). This way, VM1 is protected against the DMA operations made by the device R. Once the DMA operation is done, the IOMMU transaction is done. Transferring data from VM1 takes basically the same steps.
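The receive path described above can be sketched as a DMA write that is checked against the device's mappings. This is an illustration under stated assumptions, not how any real IOMMU is implemented: `dma_write`, `DmaFault`, the `page_map` layout and the 4 KiB page size are hypothetical, the fault is modeled as a Python exception rather than an MSI, and the transfer is validated in full before any byte is written, which is a simplification.

```python
PAGE_SIZE = 4096  # assumed translation granularity


class DmaFault(Exception):
    """Stands in for the error the IOMMU would report (via MSI in the text)."""


def dma_write(guest_mem, page_map, bus_addr, data):
    """Write `data` at `bus_addr` on behalf of device R.

    page_map: {bus page number: (gpa page number, writable)} for device R.
    """
    plan = []
    offset = 0
    while offset < len(data):
        addr = bus_addr + offset
        entry = page_map.get(addr // PAGE_SIZE)
        if entry is None or not entry[1]:
            raise DmaFault(f"no writable mapping for bus address {addr:#x}")
        in_page = addr % PAGE_SIZE
        count = min(PAGE_SIZE - in_page, len(data) - offset)
        gpa = entry[0] * PAGE_SIZE + in_page
        if gpa + count > len(guest_mem):
            raise DmaFault(f"GPA {gpa:#x} is outside guest memory")
        plan.append((gpa, offset, count))
        offset += count
    # Every page translated, so the data can now land in guest memory.
    for gpa, start, count in plan:
        guest_mem[gpa:gpa + count] = data[start:start + count]


guest_mem = bytearray(4 * PAGE_SIZE)
# Bus page 0x10 is a writable receive buffer; bus page 0x11 is read-only.
page_map = {0x10: (2, True), 0x11: (0, False)}
dma_write(guest_mem, page_map, 0x10000 + 8, b"packet")
```

A write aimed at bus page 0x11, or at any unmapped bus address, raises `DmaFault` and leaves `guest_mem` untouched, which is the protection property the paragraph describes.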

Inter-VM Communication

How is this mechanism used for inter-VM communication? It should be straightforward. Take a look at a simple example where VM1 receives data from VM2. For example, DPDK runs in VM2, forwarding packets to VM1.

In VM1, the device driver uses polling to keep the DMA operations of R open, looking at the "bus address" (the step numbers below match those in the figure):

  1. For performance reasons, the buffer addresses would be static or covered by larger regions that are mapped by the virtual IOMMU. The mapping is determined and established by VM1.
  2. The QEMU process of VM1 communicates the configuration of the virtual IOMMU to the QEMU process of VM2. This would require extensions to the vhost-user protocol.
  3. vhost-pci (implemented by the extension) sets up a BAR in VM2.
  4. A process (e.g. DPDK) or the kernel in VM2 accesses BAR + (bus address) in its GPA space to copy data to VM1. This operation can be done by copying the data or by DMA (using SR-IOV VFs, for example).
  5. The translation from the bus address to a GPA in VM1 is done by the virtual IOMMU configured for the device R.
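The five steps above can be sketched end to end as follows. Everything here is an assumption for illustration: the class names `Vm1` and `VhostPciBar`, the BAR base and size, and the addresses are hypothetical, a shared `bytearray` stands in for memory shared between the two QEMU processes, and a copied dictionary stands in for the proposed vhost-user protocol extension.

```python
PAGE_SIZE = 4096  # assumed translation granularity


class Vm1:
    def __init__(self, mem_size):
        self.memory = bytearray(mem_size)
        self.viommu = {}  # for device R: bus page number -> GPA page number

    def map_buffer(self, bus_addr, gpa, length):
        # Step 1: VM1 decides which of its buffers device R may reach.
        if length <= 0 or any(v % PAGE_SIZE for v in (bus_addr, gpa, length)):
            raise ValueError("bus_addr, gpa and length must be page-aligned")
        for off in range(0, length, PAGE_SIZE):
            self.viommu[(bus_addr + off) // PAGE_SIZE] = (gpa + off) // PAGE_SIZE

    def export_config(self):
        # Step 2: snapshot of the vIOMMU configuration sent to VM2's QEMU.
        return dict(self.viommu)


class VhostPciBar:
    """Step 3: a window in VM2's GPA space backed by VM1's mapped buffers."""

    def __init__(self, bar_base, bar_size, vm1_memory, viommu_config):
        self.bar_base = bar_base
        self.bar_size = bar_size
        self.vm1_memory = vm1_memory
        self.config = viommu_config

    def write(self, vm2_gpa, data):
        # Step 4: VM2 accesses BAR + (bus address) in its own GPA space.
        bus_addr = vm2_gpa - self.bar_base
        if bus_addr < 0 or bus_addr + len(data) > self.bar_size:
            raise ValueError("access outside the BAR")
        plan = []
        offset = 0
        while offset < len(data):
            addr = bus_addr + offset
            gpa_page = self.config.get(addr // PAGE_SIZE)
            if gpa_page is None:
                raise PermissionError(f"bus address {addr:#x} is not mapped by VM1")
            in_page = addr % PAGE_SIZE
            count = min(PAGE_SIZE - in_page, len(data) - offset)
            # Step 5: translate the bus address to a GPA in VM1.
            gpa = gpa_page * PAGE_SIZE + in_page
            if gpa + count > len(self.vm1_memory):
                raise PermissionError(f"GPA {gpa:#x} is outside VM1 memory")
            plan.append((gpa, offset, count))
            offset += count
        for gpa, start, count in plan:
            self.vm1_memory[gpa:gpa + count] = data[start:start + count]


vm1 = Vm1(8 * PAGE_SIZE)
vm1.map_buffer(bus_addr=0x4000, gpa=0x2000, length=PAGE_SIZE)
bar = VhostPciBar(0xC0000000, 0x100000, vm1.memory, vm1.export_config())
bar.write(0xC0000000 + 0x4000 + 16, b"pkt from VM2")
```

Because `export_config` returns a snapshot, a mapping that VM1 changes later would have to be communicated again; how that is done efficiently is left open in this sketch.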
