Message ID | 20181112075807.9291-2-nek.in.cn@gmail.com |
---|---|
State | New |
Series | A General Accelerator Framework, WarpDrive |
On 2018/11/13 at 8:23 AM, Leon Romanovsky wrote:
> On Mon, Nov 12, 2018 at 03:58:02PM +0800, Kenneth Lee wrote:
>> From: Kenneth Lee <liguozhu@hisilicon.com>
>>
>> WarpDrive is a general accelerator framework for the user application to
>> access the hardware without going through the kernel in the data path.
>>
>> The kernel component that provides the kernel facility for drivers to
>> expose the user interface is called uacce. It is a short name for
>> "Unified/User-space-access-intended Accelerator Framework".
>>
>> This patch adds a document to explain how it works.
> + RDMA and netdev folks
>
> Sorry, to be late in the game, I don't see other patches, but from
> the description below it seems like you are reinventing RDMA verbs
> model. I have hard time to see the differences in the proposed
> framework to already implemented in drivers/infiniband/* for the kernel
> space and for the https://github.com/linux-rdma/rdma-core/ for the user
> space parts.

Thanks Leon,

Yes, we are trying to solve a problem similar to the one RDMA solves, and we
learned a lot from the existing RDMA code. But we have to make a new
framework because we cannot register accelerators for AI operations,
encryption or compression with the RDMA framework :)

Another problem we try to address is the way memory is pinned for DMA
operations. The RDMA way of pinning memory cannot prevent pages from being
lost to copy-on-write while the memory is in use by the device. This may not
be important to the RDMA library, but it is important to accelerators.

Hope this helps the understanding.

Cheers

>
> Hard NAK from RDMA side.
>
> Thanks
>
>> Signed-off-by: Kenneth Lee <liguozhu@hisilicon.com>
>> ---
>>  Documentation/warpdrive/warpdrive.rst       | 260 +++++++
>>  Documentation/warpdrive/wd-arch.svg         | 764 ++++++++++++++++++++
>>  Documentation/warpdrive/wd.svg              | 526 ++++++++++++++
>>  Documentation/warpdrive/wd_q_addr_space.svg | 359 +++++++++
>>  4 files changed, 1909 insertions(+)
>>  create mode 100644 Documentation/warpdrive/warpdrive.rst
>>  create mode 100644 Documentation/warpdrive/wd-arch.svg
>>  create mode 100644 Documentation/warpdrive/wd.svg
>>  create mode 100644 Documentation/warpdrive/wd_q_addr_space.svg
>>
>> diff --git a/Documentation/warpdrive/warpdrive.rst b/Documentation/warpdrive/warpdrive.rst
>> new file mode 100644
>> index 000000000000..ef84d3a2d462
>> --- /dev/null
>> +++ b/Documentation/warpdrive/warpdrive.rst
>> @@ -0,0 +1,260 @@
>> +Introduction of WarpDrive
>> +=========================
>> +
>> +*WarpDrive* is a general accelerator framework for the user application to
>> +access the hardware without going through the kernel in the data path.
>> +
>> +It can be used as a fast channel between user-space applications and
>> +accelerators, network adaptors or other hardware.
>> +
>> +This may make some implementations simpler. E.g. you can reuse most of the
>> +*netdev* driver in the kernel and just share some ring buffers with the user
>> +space driver for *DPDK* [4] or *ODP* [5]. Or you can combine the RSA accelerator
>> +with the *netdev* in user space as an HTTPS reverse proxy, etc.
>> +
>> +*WarpDrive* treats the hardware accelerator as a heterogeneous processor which
>> +can take particular loads off the CPU:
>> +
>> +.. image:: wd.svg
>> +        :alt: WarpDrive Concept
>> +
>> +The virtual concept, queue, is used to manage the requests sent to the
>> +accelerator. The application sends requests to the queue by writing to some
>> +particular address, while the hardware takes the requests directly from that
>> +address and sends feedback accordingly.
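
A note on the queue concept quoted above: the idea can be sketched in plain C
as a ring of descriptors living in ordinary memory. This is only an
illustration under stated assumptions. The descriptor layout, the ring depth
and every demo_ name are made up for this sketch and are not part of the
patch, and the hardware side is simulated by a function. A real device would
also need memory barriers and usually a doorbell write, which are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RING_SIZE 8u	/* hypothetical depth */

/* Hypothetical request descriptor; real formats are hardware specific. */
struct demo_desc {
	uint64_t src;	/* virtual address of the input data */
	uint32_t len;	/* length of the input data in bytes */
	uint32_t done;	/* feedback: set to 1 by the "device" */
};

struct demo_ring {
	struct demo_desc desc[DEMO_RING_SIZE];
	uint32_t head;	/* next slot the application fills */
	uint32_t tail;	/* next slot the device consumes */
};

/*
 * Application side: queue a request with plain memory writes, no system
 * call involved. Returns 0 on success, -1 if the ring is full.
 */
static int demo_ring_push(struct demo_ring *r, const void *buf, uint32_t len)
{
	struct demo_desc *d;

	assert(r != NULL);
	if (r->head - r->tail == DEMO_RING_SIZE)
		return -1;

	d = &r->desc[r->head % DEMO_RING_SIZE];
	d->src = (uint64_t)(uintptr_t)buf;
	d->len = len;
	d->done = 0;
	r->head++;
	return 0;
}

/*
 * Stand-in for the hardware: take one descriptor straight from the shared
 * memory and mark it done. Returns the number of requests handled (0 or 1).
 */
static int demo_device_poll(struct demo_ring *r)
{
	assert(r != NULL);
	if (r->tail == r->head)
		return 0;

	r->desc[r->tail % DEMO_RING_SIZE].done = 1;
	r->tail++;
	return 1;
}
```

The application would then watch the done field of its descriptor instead of
asking the kernel whether the request has finished.
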
>> +
>> +The format of the queue may differ from hardware to hardware. But the
>> +application does not need to make any system call for the communication.
>> +
>> +*WarpDrive* tries to create a shared virtual address space for all involved
>> +accelerators. Within this space, the requests sent to a queue can refer to any
>> +virtual address, which will be valid to the application and all involved
>> +accelerators.
>> +
>> +The name *WarpDrive* is simply a cool and general name meaning the framework
>> +makes the application faster. It includes a general user library, a kernel
>> +management module and drivers for the hardware. In the kernel, the management
>> +module is called *uacce*, meaning "Unified/User-space-access-intended
>> +Accelerator Framework".
>> +
>> +
>> +How does it work
>> +================
>> +
>> +*WarpDrive* uses *mmap* and *IOMMU* to play the trick.
>> +
>> +*Uacce* creates a chrdev for the device registered to it. A "queue" is
>> +created when the chrdev is opened. The application accesses the queue by
>> +mmapping different address regions of the queue file.
>> +
>> +The following figure shows the queue file address space:
>> +
>> +.. image:: wd_q_addr_space.svg
>> +        :alt: WarpDrive Queue Address Space
>> +
>> +The first region of the space, the device region, is used by the application
>> +to write requests to, or read answers from, the hardware.
>> +
>> +Normally, there can be three types of device regions: mmio and two kinds of
>> +memory regions. It is recommended to use common memory for request/answer
>> +descriptors and the mmio space for device notification, such as a doorbell.
>> +But of course, this is all up to the interface designer.
>> +
>> +There can be two types of device memory regions, kernel-only and user-shared.
>> +This will be explained in "The kernel API" section.
>> +
>> +The Static Share Virtual Memory region is necessary only when the device IOMMU
>> +does not support "Share Virtual Memory". This will be explained after the
>> +*IOMMU* idea.
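
How an application might open the queue file and map one of its regions can
be sketched with standard POSIX calls only. This is a minimal sketch, not the
library code: demo_open_queue, demo_map_region and the page-based arguments
are hypothetical names chosen for illustration, and the real region offsets
are defined by the driver and are not spelled out in this document.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Open the queue chrdev. O_CLOEXEC follows the recommendation made later
 * in the document for the queue file. Returns the fd, or -1 on error.
 */
static int demo_open_queue(const char *path)
{
	assert(path != NULL);
	return open(path, O_RDWR | O_CLOEXEC);
}

/*
 * Map npages pages of the queue file, starting at page pg_start.
 * ASSUMPTION: regions are addressed by page offset within the file; the
 * actual layout is up to the driver. Returns MAP_FAILED on error.
 */
static void *demo_map_region(int fd, long pg_start, long npages)
{
	long pg = sysconf(_SC_PAGESIZE);

	if (pg <= 0 || pg_start < 0 || npages <= 0)
		return MAP_FAILED;

	return mmap(NULL, (size_t)npages * (size_t)pg,
		    PROT_READ | PROT_WRITE, MAP_SHARED,
		    fd, (off_t)pg_start * (off_t)pg);
}
```

With the sample setup quoted further down, path would be a node such as
/dev/ua1, and each region would be released again with munmap().
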
>> +
>> +
>> +Architecture
>> +------------
>> +
>> +The full *WarpDrive* architecture is represented in the following class
>> +diagram:
>> +
>> +.. image:: wd-arch.svg
>> +        :alt: WarpDrive Architecture
>> +
>> +
>> +The user API
>> +------------
>> +
>> +We adopt a polling style interface in the user space: ::
>> +
>> +        int wd_request_queue(struct wd_queue *q);
>> +        void wd_release_queue(struct wd_queue *q);
>> +
>> +        int wd_send(struct wd_queue *q, void *req);
>> +        int wd_recv(struct wd_queue *q, void **req);
>> +        int wd_recv_sync(struct wd_queue *q, void **req);
>> +        void wd_flush(struct wd_queue *q);
>> +
>> +wd_recv_sync() is a wrapper around its non-sync version. It traps into the
>> +kernel and waits until the queue becomes available.
>> +
>> +If the queue does not support SVA/SVM, the following helper function
>> +can be used to create Static Share Virtual Memory: ::
>> +
>> +        void *wd_preserve_share_memory(struct wd_queue *q, size_t size);
>> +
>> +The user API is not mandatory. It is simply a suggestion and a hint of what
>> +the kernel interface is supposed to support.
>> +
>> +
>> +The user driver
>> +---------------
>> +
>> +The queue file mmap space will need a user driver to wrap the communication
>> +protocol. *UACCE* provides some attributes in sysfs for the user driver to
>> +match the right accelerator accordingly.
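
Going back to the polling style user API quoted above, a bounded polling loop
around a wd_recv()-shaped function could look like the sketch below. It is an
illustration only. The document does not state the return convention of
wd_recv(), so the sketch ASSUMES 0 on success and -EBUSY while nothing is
ready. demo_poll_recv and demo_fake_recv are hypothetical helpers, and the
fake receiver merely stands in for the real library so the loop can be tried
without hardware.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct wd_queue;	/* opaque here; defined by the real user library */

/* Same shape as wd_recv() in the API listing. */
typedef int (*demo_recv_fn)(struct wd_queue *q, void **resp);

/*
 * Poll until a response arrives or max_spins attempts are used up.
 * ASSUMPTION: recv returns 0 on success and -EBUSY when nothing is ready.
 * Any other value is passed through as a hard error.
 */
static int demo_poll_recv(demo_recv_fn recv, struct wd_queue *q,
			  void **resp, unsigned int max_spins)
{
	unsigned int i;

	assert(recv != NULL && resp != NULL);
	for (i = 0; i < max_spins; i++) {
		int ret = recv(q, resp);

		if (ret != -EBUSY)
			return ret;
	}
	return -ETIMEDOUT;
}

/* Stand-in receiver: reports "busy" twice, then delivers a response. */
static int demo_fake_recv(struct wd_queue *q, void **resp)
{
	static int calls;
	static int payload = 42;

	(void)q;
	if (++calls < 3)
		return -EBUSY;
	*resp = &payload;
	return 0;
}
```

A caller that runs out of spins could fall back to wd_recv_sync(), which
according to the text above waits in the kernel instead of spinning.
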
>> +
>> +The *UACCE* device attributes are under the following directory:
>> +
>> +/sys/class/uacce/<dev-name>/params
>> +
>> +The following attributes are supported:
>> +
>> +nr_queue_remained (ro)
>> +        number of queues remaining
>> +
>> +api_version (ro)
>> +        a string to identify the queue mmap space format and its version
>> +
>> +device_attr (ro)
>> +        attributes of the device, see UACCE_DEV_xxx flag defined in uacce.h
>> +
>> +numa_node (ro)
>> +        id of numa node
>> +
>> +priority (rw)
>> +        Priority of the device, bigger is higher
>> +
>> +(This is not yet implemented in the RFC version)
>> +
>> +
>> +The kernel API
>> +--------------
>> +
>> +The *uacce* kernel API is defined in uacce.h. If the hardware supports SVM/SVA,
>> +the driver needs only the following API functions: ::
>> +
>> +        int uacce_register(uacce);
>> +        void uacce_unregister(uacce);
>> +        void uacce_wake_up(q);
>> +
>> +*uacce_wake_up* is used to notify the processes that epoll() on the queue file.
>> +
>> +According to the IOMMU capability, *uacce* categorizes the devices as follows:
>> +
>> +UACCE_DEV_NOIOMMU
>> +        The device has no IOMMU. The user process cannot use VA on the hardware.
>> +        This mode is not recommended.
>> +
>> +UACCE_DEV_SVA (UACCE_DEV_PASID | UACCE_DEV_FAULT_FROM_DEV)
>> +        The device has an IOMMU which can share the same page table with the
>> +        user process
>> +
>> +UACCE_DEV_SHARE_DOMAIN
>> +        The device has an IOMMU which supports neither multiple page tables nor
>> +        device page faults
>> +
>> +If the device works in a mode other than UACCE_DEV_NOIOMMU, *uacce* will set its
>> +IOMMU to IOMMU_DOMAIN_UNMANAGED. So the driver must not use any kernel
>> +DMA API but the following ones from *uacce* instead: ::
>> +
>> +        uacce_dma_map(q, va, size, prot);
>> +        uacce_dma_unmap(q, va, size, prot);
>> +
>> +*uacce_dma_map/unmap* is valid only for a UACCE_DEV_SVA device. It creates a
>> +particular PASID and page table for the kernel in the IOMMU (Not yet
>> +implemented in the RFC)
>> +
>> +For the UACCE_DEV_SHARE_DOMAIN device, uacce_dma_map/unmap is not valid.
>> +*Uacce* calls back start_queue only when the DUS (user-shared) and DKO
>> +(kernel-only) regions are mmapped. The accelerator driver must use those dma
>> +buffers, via uacce_queue->qfrs[], in the start_queue callback. The size of the
>> +queue file region is defined by uacce->ops->qf_pg_start[].
>> +
>> +We have to do it this way because most current IOMMUs cannot support kernel
>> +and user virtual addresses at the same time. So we have to let them both
>> +share the same user virtual address space.
>> +
>> +If the device has to support kernel and user at the same time, both the kernel
>> +and the user should use these DMA APIs. This is not convenient. A better
>> +solution is to change the future DMA/IOMMU design to separate the address
>> +space between the user and kernel space. But that is not going to happen any
>> +time soon.
>> +
>> +
>> +Multiple processes support
>> +==========================
>> +
>> +In the latest mainline kernel (4.19) at the time this document is written, the
>> +IOMMU subsystem does not support multiple process page tables yet.
>> +
>> +Most IOMMU hardware implementations support multi-process with the concept
>> +of PASID. But they may use a different name, e.g. it is called sub-stream-id in
>> +the SMMU of ARM. With PASID or a similar design, multiple page tables can be
>> +added to the IOMMU and referred to by PASID.
>> +
>> +*JPB* has a patchset to enable this[1]_. We have tested it with our hardware
>> +(which is known as *D06*). It works well. *WarpDrive* relies on it to support
>> +UACCE_DEV_SVA. If it is not enabled, *WarpDrive* can still work. But it
>> +supports only one process, and the device will be set to UACCE_DEV_SHARE_DOMAIN
>> +even if it is set to UACCE_DEV_SVA initially.
>> +
>> +Static Share Virtual Memory is mainly used by UACCE_DEV_SHARE_DOMAIN devices.
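
The fallback described in the quoted section can be written down as a small
decision function. This is a sketch of the rule as stated in the text, not
code from the patch. The enum and function names are hypothetical, and the
enum values are illustrative stand-ins for the device categories, not the
UACCE_DEV_xxx flag values from uacce.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the device categories; not the uacce.h flags. */
enum demo_mode {
	DEMO_MODE_NOIOMMU,
	DEMO_MODE_SHARE_DOMAIN,
	DEMO_MODE_SVA,
};

/*
 * Mode a device ends up in, given the mode its driver asked for and
 * whether multi-process (PASID) support is enabled in the IOMMU subsystem.
 * A device registered as SVA falls back to SHARE_DOMAIN without PASID.
 */
static enum demo_mode demo_effective_mode(enum demo_mode requested,
					  bool pasid_enabled)
{
	assert(requested <= DEMO_MODE_SVA);
	if (requested == DEMO_MODE_SVA && !pasid_enabled)
		return DEMO_MODE_SHARE_DOMAIN;
	return requested;
}
```
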
>> +
>> +
>> +Legacy Mode Support
>> +===================
>> +For hardware without an IOMMU, WarpDrive can still work. The only problem is
>> +that VA cannot be used in the device. The driver should adopt another strategy
>> +for the shared memory. It is only for testing, and not recommended.
>> +
>> +
>> +The Fork Scenario
>> +=================
>> +For a process with allocated queues and shared memory, what happens if it forks
>> +a child?
>> +
>> +The fd of the queue will be duplicated on fork, so the child can send requests
>> +to the same queue as its parent. But requests sent from any process other
>> +than the one that opened the queue will be blocked.
>> +
>> +It is recommended to add O_CLOEXEC to the queue file.
>> +
>> +The queue mmap space has VM_DONTCOPY set in its VMA. So the child will lose all
>> +those VMAs.
>> +
>> +This is why *WarpDrive* does not adopt the mode used in *VFIO* and *InfiniBand*.
>> +Both solutions can set any user pointer for hardware sharing. But they cannot
>> +support fork while DMA is in progress, because the "Copy-On-Write" procedure
>> +would make the parent process lose its physical pages.
>> +
>> +
>> +The Sample Code
>> +===============
>> +There is a sample user land implementation with a simple driver for the
>> +Hisilicon Hi1620 ZIP Accelerator.
>> +
>> +To test, do the following in samples/warpdrive (for the case of PC host): ::
>> +        ./autogen.sh
>> +        ./conf.sh       # or simply ./configure if you build on target system
>> +        make
>> +
>> +Then you can get test_hisi_zip in the test subdirectory. Copy it to the target
>> +system and make sure the hisi_zip driver is enabled (the major and minor of
>> +the uacce chrdev can be obtained from dmesg or sysfs), and run: ::
>> +        mknod /dev/ua1 c <major> <minor>
>> +        test/test_hisi_zip -z < data > data.zip
>> +        test/test_hisi_zip -g < data > data.gzip
>> +
>> +
>> +References
>> +==========
>> +.. [1] https://patchwork.kernel.org/patch/10394851/
>> +
>> +..
vim: tw=78 >> diff --git a/Documentation/warpdrive/wd-arch.svg b/Documentation/warpdrive/wd-arch.svg >> new file mode 100644 >> index 000000000000..e59934188443 >> --- /dev/null >> +++ b/Documentation/warpdrive/wd-arch.svg >> @@ -0,0 +1,764 @@ >> +<?xml version="1.0" encoding="UTF-8" standalone="no"?> >> +<!-- Created with Inkscape (http://www.inkscape.org/) --> >> + >> +<svg >> + xmlns:dc="http://purl.org/dc/elements/1.1/" >> + xmlns:cc="http://creativecommons.org/ns#" >> + xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" >> + xmlns:svg="http://www.w3.org/2000/svg" >> + xmlns="http://www.w3.org/2000/svg" >> + xmlns:xlink="http://www.w3.org/1999/xlink" >> + xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd" >> + xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape" >> + width="210mm" >> + height="193mm" >> + viewBox="0 0 744.09449 683.85823" >> + id="svg2" >> + version="1.1" >> + inkscape:version="0.92.3 (2405546, 2018-03-11)" >> + sodipodi:docname="wd-arch.svg"> >> + <defs >> + id="defs4"> >> + <linearGradient >> + inkscape:collect="always" >> + id="linearGradient6830"> >> + <stop >> + style="stop-color:#000000;stop-opacity:1;" >> + offset="0" >> + id="stop6832" /> >> + <stop >> + style="stop-color:#000000;stop-opacity:0;" >> + offset="1" >> + id="stop6834" /> >> + </linearGradient> >> + <linearGradient >> + inkscape:collect="always" >> + xlink:href="#linearGradient5026" >> + id="linearGradient5032" >> + x1="353" >> + y1="211.3622" >> + x2="565.5" >> + y2="174.8622" >> + gradientUnits="userSpaceOnUse" >> + gradientTransform="translate(-89.949614,405.94594)" /> >> + <linearGradient >> + inkscape:collect="always" >> + id="linearGradient5026"> >> + <stop >> + style="stop-color:#f2f2f2;stop-opacity:1;" >> + offset="0" >> + id="stop5028" /> >> + <stop >> + style="stop-color:#f2f2f2;stop-opacity:0;" >> + offset="1" >> + id="stop5030" /> >> + </linearGradient> >> + <filter >> + inkscape:collect="always" >> + 
style="color-interpolation-filters:sRGB" >> + id="filter4169-3" >> + x="-0.031597666" >> + width="1.0631953" >> + y="-0.099812768" >> + height="1.1996255"> >> + <feGaussianBlur >> + inkscape:collect="always" >> + stdDeviation="1.3307599" >> + id="feGaussianBlur4171-6" /> >> + </filter> >> + <linearGradient >> + inkscape:collect="always" >> + xlink:href="#linearGradient5026" >> + id="linearGradient5032-1" >> + x1="353" >> + y1="211.3622" >> + x2="565.5" >> + y2="174.8622" >> + gradientUnits="userSpaceOnUse" >> + gradientTransform="translate(175.77842,400.29111)" /> >> + <filter >> + inkscape:collect="always" >> + style="color-interpolation-filters:sRGB" >> + id="filter4169-3-0" >> + x="-0.031597666" >> + width="1.0631953" >> + y="-0.099812768" >> + height="1.1996255"> >> + <feGaussianBlur >> + inkscape:collect="always" >> + stdDeviation="1.3307599" >> + id="feGaussianBlur4171-6-9" /> >> + </filter> >> + <marker >> + markerWidth="18.960653" >> + markerHeight="11.194658" >> + refX="9.4803267" >> + refY="5.5973287" >> + orient="auto" >> + id="marker4613"> >> + <rect >> + y="-5.1589785" >> + x="5.8504119" >> + height="10.317957" >> + width="10.317957" >> + id="rect4212" >> + style="fill:#ffffff;stroke:#000000;stroke-width:0.69143367;stroke-miterlimit:4;stroke-dasharray:none" >> + transform="matrix(0.86111274,0.50841405,-0.86111274,0.50841405,0,0)"> >> + <title >> + id="title4262">generation</title> >> + </rect> >> + </marker> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> + id="marker4825"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + 
orient="auto" >> + id="marker4825-6"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757-1" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + <linearGradient >> + inkscape:collect="always" >> + xlink:href="#linearGradient5026" >> + id="linearGradient5032-3-9" >> + x1="353" >> + y1="211.3622" >> + x2="565.5" >> + y2="174.8622" >> + gradientUnits="userSpaceOnUse" >> + gradientTransform="matrix(1.2452511,0,0,0.98513016,-190.95632,540.33156)" /> >> + <filter >> + inkscape:collect="always" >> + style="color-interpolation-filters:sRGB" >> + id="filter4169-3-5-8" >> + x="-0.031597666" >> + width="1.0631953" >> + y="-0.099812768" >> + height="1.1996255"> >> + <feGaussianBlur >> + inkscape:collect="always" >> + stdDeviation="1.3307599" >> + id="feGaussianBlur4171-6-3-9" /> >> + </filter> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> + id="marker4825-6-2"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757-1-9" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> + id="marker4825-6-2-1"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757-1-9-9" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + <linearGradient >> + inkscape:collect="always" >> + xlink:href="#linearGradient5026" >> + 
id="linearGradient5032-3-9-7" >> + x1="353" >> + y1="211.3622" >> + x2="565.5" >> + y2="174.8622" >> + gradientUnits="userSpaceOnUse" >> + gradientTransform="matrix(1.3742742,0,0,0.97786398,-234.52617,654.63367)" /> >> + <filter >> + inkscape:collect="always" >> + style="color-interpolation-filters:sRGB" >> + id="filter4169-3-5-8-5" >> + x="-0.031597666" >> + width="1.0631953" >> + y="-0.099812768" >> + height="1.1996255"> >> + <feGaussianBlur >> + inkscape:collect="always" >> + stdDeviation="1.3307599" >> + id="feGaussianBlur4171-6-3-9-0" /> >> + </filter> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> + id="marker4825-6-2-6"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757-1-9-1" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + <linearGradient >> + inkscape:collect="always" >> + xlink:href="#linearGradient5026" >> + id="linearGradient5032-3-9-4" >> + x1="353" >> + y1="211.3622" >> + x2="565.5" >> + y2="174.8622" >> + gradientUnits="userSpaceOnUse" >> + gradientTransform="matrix(1.3742912,0,0,2.0035845,-468.34428,342.56603)" /> >> + <filter >> + inkscape:collect="always" >> + style="color-interpolation-filters:sRGB" >> + id="filter4169-3-5-8-54" >> + x="-0.031597666" >> + width="1.0631953" >> + y="-0.099812768" >> + height="1.1996255"> >> + <feGaussianBlur >> + inkscape:collect="always" >> + stdDeviation="1.3307599" >> + id="feGaussianBlur4171-6-3-9-7" /> >> + </filter> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> + id="marker4825-6-2-1-8"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757-1-9-9-6" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + 
style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> + id="marker4825-6-2-1-8-8"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757-1-9-9-6-9" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> + id="marker4825-6-0"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757-1-93" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> + id="marker4825-6-0-2"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757-1-93-6" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + <filter >> + inkscape:collect="always" >> + style="color-interpolation-filters:sRGB" >> + id="filter5382" >> + x="-0.089695387" >> + width="1.1793908" >> + y="-0.10052069" >> + height="1.2010413"> >> + <feGaussianBlur >> + inkscape:collect="always" >> + stdDeviation="0.86758925" >> + id="feGaussianBlur5384" /> >> + </filter> >> + <linearGradient >> + inkscape:collect="always" >> + xlink:href="#linearGradient6830" >> + id="linearGradient6836" >> + x1="362.73923" >> + 
y1="700.04059" >> + x2="340.4751" >> + y2="678.25488" >> + gradientUnits="userSpaceOnUse" >> + gradientTransform="translate(-23.771026,-135.76835)" /> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> + id="marker4825-6-2-6-2"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757-1-9-1-9" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + <linearGradient >> + inkscape:collect="always" >> + xlink:href="#linearGradient5026" >> + id="linearGradient5032-3-9-7-3" >> + x1="353" >> + y1="211.3622" >> + x2="565.5" >> + y2="174.8622" >> + gradientUnits="userSpaceOnUse" >> + gradientTransform="matrix(1.3742742,0,0,0.97786395,-57.357186,649.55786)" /> >> + <filter >> + inkscape:collect="always" >> + style="color-interpolation-filters:sRGB" >> + id="filter4169-3-5-8-5-0" >> + x="-0.031597666" >> + width="1.0631953" >> + y="-0.099812768" >> + height="1.1996255"> >> + <feGaussianBlur >> + inkscape:collect="always" >> + stdDeviation="1.3307599" >> + id="feGaussianBlur4171-6-3-9-0-2" /> >> + </filter> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> + id="marker4825-6-2-1-1"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757-1-9-9-0" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + </defs> >> + <sodipodi:namedview >> + id="base" >> + pagecolor="#ffffff" >> + bordercolor="#666666" >> + borderopacity="1.0" >> + inkscape:pageopacity="0.0" >> + inkscape:pageshadow="2" >> + inkscape:zoom="0.98994949" >> + inkscape:cx="222.32868" >> + 
inkscape:cy="370.44492" >> + inkscape:document-units="px" >> + inkscape:current-layer="layer1" >> + showgrid="false" >> + inkscape:window-width="1916" >> + inkscape:window-height="1033" >> + inkscape:window-x="0" >> + inkscape:window-y="22" >> + inkscape:window-maximized="0" >> + fit-margin-right="0.3" >> + inkscape:snap-global="false" /> >> + <metadata >> + id="metadata7"> >> + <rdf:RDF> >> + <cc:Work >> + rdf:about=""> >> + <dc:format>image/svg+xml</dc:format> >> + <dc:type >> + rdf:resource="http://purl.org/dc/dcmitype/StillImage" /> >> + <dc:title /> >> + </cc:Work> >> + </rdf:RDF> >> + </metadata> >> + <g >> + inkscape:label="Layer 1" >> + inkscape:groupmode="layer" >> + id="layer1" >> + transform="translate(0,-368.50374)"> >> + <rect >> + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3)" >> + id="rect4136-3-6" >> + width="101.07784" >> + height="31.998148" >> + x="283.01144" >> + y="588.80896" /> >> + <rect >> + style="fill:url(#linearGradient5032);fill-opacity:1;stroke:#000000;stroke-width:0.6465112" >> + id="rect4136-2" >> + width="101.07784" >> + height="31.998148" >> + x="281.63498" >> + y="586.75739" /> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="294.21747" >> + y="612.50073" >> + id="text4138-6"><tspan >> + sodipodi:role="line" >> + id="tspan4140-1" >> + x="294.21747" >> + y="612.50073" >> + style="font-size:15px;line-height:1.25">WarpDrive</tspan></text> >> + <rect >> + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-0)" >> + id="rect4136-3-6-3" >> + width="101.07784" >> + height="31.998148" >> + x="548.7395" >> + y="583.15417" /> >> + <rect >> + style="fill:url(#linearGradient5032-1);fill-opacity:1;stroke:#000000;stroke-width:0.6465112" >> + 
id="rect4136-2-60" >> + width="101.07784" >> + height="31.998148" >> + x="547.36304" >> + y="581.1026" /> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="557.83484" >> + y="602.32745" >> + id="text4138-6-6"><tspan >> + sodipodi:role="line" >> + id="tspan4140-1-2" >> + x="557.83484" >> + y="602.32745" >> + style="font-size:15px;line-height:1.25">user_driver</tspan></text> >> + <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4613)" >> + d="m 547.36304,600.78954 -156.58203,0.0691" >> + id="path4855" >> + inkscape:connector-curvature="0" >> + sodipodi:nodetypes="cc" /> >> + <rect >> + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-8)" >> + id="rect4136-3-6-5-7" >> + width="101.07784" >> + height="31.998148" >> + x="128.74678" >> + y="80.648842" >> + transform="matrix(1.2452511,0,0,0.98513016,113.15182,641.02594)" /> >> + <rect >> + style="fill:url(#linearGradient5032-3-9);fill-opacity:1;stroke:#000000;stroke-width:0.71606314" >> + id="rect4136-2-6-3" >> + width="125.86729" >> + height="31.522341" >> + x="271.75983" >> + y="718.45435" /> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="309.13705" >> + y="745.55371" >> + id="text4138-6-2-6"><tspan >> + sodipodi:role="line" >> + id="tspan4140-1-9-1" >> + x="309.13705" >> + y="745.55371" >> + style="font-size:15px;line-height:1.25">uacce</tspan></text> >> + <path >> + 
style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-2)" >> + d="m 329.57309,619.72453 5.0373,97.14447" >> + id="path4661-3" >> + inkscape:connector-curvature="0" >> + sodipodi:nodetypes="cc" /> >> + <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-2-1)" >> + d="m 342.57219,830.63108 -5.67699,-79.2841" >> + id="path4661-3-4" >> + inkscape:connector-curvature="0" >> + sodipodi:nodetypes="cc" /> >> + <rect >> + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-8-5)" >> + id="rect4136-3-6-5-7-3" >> + width="101.07784" >> + height="31.998148" >> + x="128.74678" >> + y="80.648842" >> + transform="matrix(1.3742742,0,0,0.97786398,101.09126,754.58534)" /> >> + <rect >> + style="fill:url(#linearGradient5032-3-9-7);fill-opacity:1;stroke:#000000;stroke-width:0.74946606" >> + id="rect4136-2-6-3-6" >> + width="138.90866" >> + height="31.289837" >> + x="276.13297" >> + y="831.44263" /> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="295.67819" >> + y="852.98224" >> + id="text4138-6-2-6-1"><tspan >> + sodipodi:role="line" >> + id="tspan4140-1-9-1-0" >> + x="295.67819" >> + y="852.98224" >> + style="font-size:15px;line-height:1.25">Device Driver</tspan></text> >> + <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-2-6)" >> + d="m 623.05084,615.00104 0.51369,333.80219" >> + id="path4661-3-5" >> + inkscape:connector-curvature="0" >> + sodipodi:nodetypes="cc" /> >> + <text >> + 
xml:space="preserve" >> + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;text-align:center;letter-spacing:0px;word-spacing:0px;text-anchor:middle;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="392.63568" >> + y="660.83667" >> + id="text4138-6-2-6-1-6-2-5"><tspan >> + sodipodi:role="line" >> + x="392.63568" >> + y="660.83667" >> + id="tspan4305" >> + style="font-size:15px;line-height:1.25"><<anom_file>></tspan><tspan >> + sodipodi:role="line" >> + x="392.63568" >> + y="679.58667" >> + style="font-size:15px;line-height:1.25" >> + id="tspan1139">Queue FD</tspan></text> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="389.92969" >> + y="587.44836" >> + id="text4138-6-2-6-1-6-2-56"><tspan >> + sodipodi:role="line" >> + id="tspan4140-1-9-1-0-3-0-9" >> + x="389.92969" >> + y="587.44836" >> + style="font-size:15px;line-height:1.25">1</tspan></text> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="528.64813" >> + y="600.08429" >> + id="text4138-6-2-6-1-6-3"><tspan >> + sodipodi:role="line" >> + id="tspan4140-1-9-1-0-3-7" >> + x="528.64813" >> + y="600.08429" >> + style="font-size:15px;line-height:1.25">*</tspan></text> >> + <rect >> + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-8-54)" >> + id="rect4136-3-6-5-7-4" >> + width="101.07784" >> + height="31.998148" >> + x="128.74678" >> + y="80.648842" >> + 
transform="matrix(1.3745874,0,0,1.8929066,-132.7754,556.04505)" /> >> + <rect >> + style="fill:url(#linearGradient5032-3-9-4);fill-opacity:1;stroke:#000000;stroke-width:1.07280123" >> + id="rect4136-2-6-3-4" >> + width="138.91039" >> + height="64.111" >> + x="42.321312" >> + y="704.8371" /> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="110.30745" >> + y="722.94025" >> + id="text4138-6-2-6-3"><tspan >> + sodipodi:role="line" >> + x="111.99202" >> + y="722.94025" >> + id="tspan4366" >> + style="font-size:15px;line-height:1.25;text-align:center;text-anchor:middle">other standard </tspan><tspan >> + sodipodi:role="line" >> + x="110.30745" >> + y="741.69025" >> + id="tspan4368" >> + style="font-size:15px;line-height:1.25;text-align:center;text-anchor:middle">framework</tspan><tspan >> + sodipodi:role="line" >> + x="110.30745" >> + y="760.44025" >> + style="font-size:15px;line-height:1.25;text-align:center;text-anchor:middle" >> + id="tspan6840">(crypto/nic/others)</tspan></text> >> + <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-2-1-8)" >> + d="M 276.29661,849.04109 134.04449,771.90853" >> + id="path4661-3-4-8" >> + inkscape:connector-curvature="0" >> + sodipodi:nodetypes="cc" /> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="313.70813" >> + y="730.06366" >> + id="text4138-6-2-6-36"><tspan >> + sodipodi:role="line" >> + id="tspan4140-1-9-1-7" >> + x="313.70813" >> + y="730.06366" >> + 
style="font-size:10px;line-height:1.25"><<lkm>></tspan></text> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;text-align:start;letter-spacing:0px;word-spacing:0px;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="259.53165" >> + y="797.8056" >> + id="text4138-6-2-6-1-6-2-5-7-5"><tspan >> + sodipodi:role="line" >> + x="259.53165" >> + y="797.8056" >> + style="font-size:15px;line-height:1.25;text-align:start;text-anchor:start" >> + id="tspan2357">uacce register api</tspan></text> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="29.145819" >> + y="833.44244" >> + id="text4138-6-2-6-1-6-2-5-7-5-2"><tspan >> + sodipodi:role="line" >> + x="29.145819" >> + y="833.44244" >> + id="tspan4301" >> + style="font-size:15px;line-height:1.25">register to other subsystem</tspan></text> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="301.20813" >> + y="597.29437" >> + id="text4138-6-2-6-36-1"><tspan >> + sodipodi:role="line" >> + id="tspan4140-1-9-1-7-2" >> + x="301.20813" >> + y="597.29437" >> + style="font-size:10px;line-height:1.25"><<user_lib>></tspan></text> >> + <text >> + xml:space="preserve" >> + 
style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;text-align:center;letter-spacing:0px;word-spacing:0px;text-anchor:middle;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="615.9505" >> + y="739.44012" >> + id="text4138-6-2-6-1-6-2-5-3"><tspan >> + sodipodi:role="line" >> + x="615.9505" >> + y="739.44012" >> + id="tspan4274-7" >> + style="font-size:15px;line-height:1.25">mmapped memory r/w interface</tspan></text> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;text-align:center;letter-spacing:0px;word-spacing:0px;text-anchor:middle;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="371.01291" >> + y="529.23682" >> + id="text4138-6-2-6-1-6-2-5-36"><tspan >> + sodipodi:role="line" >> + x="371.01291" >> + y="529.23682" >> + id="tspan4305-3" >> + style="font-size:15px;line-height:1.25">wd user api</tspan></text> >> + <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + d="m 328.19325,585.87943 0,-23.57142" >> + id="path4348" >> + inkscape:connector-curvature="0" /> >> + <ellipse >> + style="opacity:1;fill:#ffffff;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0" >> + id="path4350" >> + cx="328.01468" >> + cy="551.95081" >> + rx="11.607142" >> + ry="10.357142" /> >> + <path >> + style="opacity:0.444;fill:url(#linearGradient6836);fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:1;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;filter:url(#filter5382)" >> + id="path4350-2" >> + sodipodi:type="arc" >> + sodipodi:cx="329.44327" >> + sodipodi:cy="553.37933" >> + sodipodi:rx="11.607142" >> + sodipodi:ry="10.357142" >> + 
sodipodi:start="0" >> + sodipodi:end="6.2509098" >> + d="m 341.05041,553.37933 a 11.607142,10.357142 0 0 1 -11.51349,10.35681 11.607142,10.357142 0 0 1 -11.69928,-10.18967 11.607142,10.357142 0 0 1 11.32469,-10.52124 11.607142,10.357142 0 0 1 11.88204,10.01988" >> + sodipodi:open="true" /> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;text-align:center;letter-spacing:0px;word-spacing:0px;text-anchor:middle;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="619.67596" >> + y="978.22363" >> + id="text4138-6-2-6-1-6-2-5-36-3"><tspan >> + sodipodi:role="line" >> + x="619.67596" >> + y="978.22363" >> + id="tspan4305-3-67" >> + style="font-size:15px;line-height:1.25">Device(Hardware)</tspan></text> >> + <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-2-6-2)" >> + d="m 347.51164,865.4527 193.91929,99.10053" >> + id="path4661-3-5-1" >> + inkscape:connector-curvature="0" >> + sodipodi:nodetypes="cc" /> >> + <rect >> + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-8-5-0)" >> + id="rect4136-3-6-5-7-3-1" >> + width="101.07784" >> + height="31.998148" >> + x="128.74678" >> + y="80.648842" >> + transform="matrix(1.3742742,0,0,0.97786395,278.26025,749.50952)" /> >> + <rect >> + style="fill:url(#linearGradient5032-3-9-7-3);fill-opacity:1;stroke:#000000;stroke-width:0.74946606" >> + id="rect4136-2-6-3-6-0" >> + width="138.90868" >> + height="31.289839" >> + x="453.30197" >> + y="826.36682" /> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;font-size:12px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" 
>> + x="493.68158" >> + y="847.90643" >> + id="text4138-6-2-6-1-5"><tspan >> + sodipodi:role="line" >> + id="tspan4140-1-9-1-0-1" >> + x="493.68158" >> + y="847.90643" >> + style="font-size:15px;line-height:1.25;stroke-width:1px">IOMMU</tspan></text> >> + <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-2-1-1)" >> + d="m 389.49372,755.46667 111.75324,68.4507" >> + id="path4661-3-4-85" >> + inkscape:connector-curvature="0" >> + sodipodi:nodetypes="cc" /> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;font-size:12px;line-height:0%;font-family:sans-serif;text-align:start;letter-spacing:0px;word-spacing:0px;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="427.70282" >> + y="776.91418" >> + id="text4138-6-2-6-1-6-2-5-7-5-0"><tspan >> + sodipodi:role="line" >> + x="427.70282" >> + y="776.91418" >> + style="font-size:15px;line-height:1.25;text-align:start;text-anchor:start;stroke-width:1px" >> + id="tspan2357-6">manage the driver iommu state</tspan></text> >> + </g> >> +</svg> >> diff --git a/Documentation/warpdrive/wd.svg b/Documentation/warpdrive/wd.svg >> new file mode 100644 >> index 000000000000..87ab92ebfbc6 >> --- /dev/null >> +++ b/Documentation/warpdrive/wd.svg >> @@ -0,0 +1,526 @@ >> +<?xml version="1.0" encoding="UTF-8" standalone="no"?> >> +<!-- Created with Inkscape (http://www.inkscape.org/) --> >> + >> +<svg >> + xmlns:dc="http://purl.org/dc/elements/1.1/" >> + xmlns:cc="http://creativecommons.org/ns#" >> + xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" >> + xmlns:svg="http://www.w3.org/2000/svg" >> + xmlns="http://www.w3.org/2000/svg" >> + xmlns:xlink="http://www.w3.org/1999/xlink" >> + xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd" >> + 
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape" >> + width="210mm" >> + height="116mm" >> + viewBox="0 0 744.09449 411.02338" >> + id="svg2" >> + version="1.1" >> + inkscape:version="0.92.3 (2405546, 2018-03-11)" >> + sodipodi:docname="wd.svg"> >> + <defs >> + id="defs4"> >> + <linearGradient >> + inkscape:collect="always" >> + id="linearGradient5026"> >> + <stop >> + style="stop-color:#f2f2f2;stop-opacity:1;" >> + offset="0" >> + id="stop5028" /> >> + <stop >> + style="stop-color:#f2f2f2;stop-opacity:0;" >> + offset="1" >> + id="stop5030" /> >> + </linearGradient> >> + <linearGradient >> + inkscape:collect="always" >> + xlink:href="#linearGradient5026" >> + id="linearGradient5032-3" >> + x1="353" >> + y1="211.3622" >> + x2="565.5" >> + y2="174.8622" >> + gradientUnits="userSpaceOnUse" >> + gradientTransform="matrix(2.7384117,0,0,0.91666329,-952.8283,571.10143)" /> >> + <filter >> + inkscape:collect="always" >> + style="color-interpolation-filters:sRGB" >> + id="filter4169-3-5" >> + x="-0.031597666" >> + width="1.0631953" >> + y="-0.099812768" >> + height="1.1996255"> >> + <feGaussianBlur >> + inkscape:collect="always" >> + stdDeviation="1.3307599" >> + id="feGaussianBlur4171-6-3" /> >> + </filter> >> + <marker >> + markerWidth="18.960653" >> + markerHeight="11.194658" >> + refX="9.4803267" >> + refY="5.5973287" >> + orient="auto" >> + id="marker4613"> >> + <rect >> + y="-5.1589785" >> + x="5.8504119" >> + height="10.317957" >> + width="10.317957" >> + id="rect4212" >> + style="fill:#ffffff;stroke:#000000;stroke-width:0.69143367;stroke-miterlimit:4;stroke-dasharray:none" >> + transform="matrix(0.86111274,0.50841405,-0.86111274,0.50841405,0,0)"> >> + <title >> + id="title4262">generation</title> >> + </rect> >> + </marker> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> + id="marker4825"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757" >> + d="M 
0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> + id="marker4825-6"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757-1" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + <linearGradient >> + inkscape:collect="always" >> + xlink:href="#linearGradient5026" >> + id="linearGradient5032-3-9" >> + x1="353" >> + y1="211.3622" >> + x2="565.5" >> + y2="174.8622" >> + gradientUnits="userSpaceOnUse" >> + gradientTransform="matrix(1.2452511,0,0,0.98513016,-190.95632,540.33156)" /> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> + id="marker4825-6-2"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757-1-9" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> + id="marker4825-6-2-1"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757-1-9-9" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> 
+ id="marker4825-6-2-6"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757-1-9-1" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> + id="marker4825-6-2-1-8"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757-1-9-9-6" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> + id="marker4825-6-2-1-8-8"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757-1-9-9-6-9" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> + id="marker4825-6-0"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757-1-93" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> + id="marker4825-6-0-2"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757-1-93-6" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + 
style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> + id="marker4825-6-2-6-2"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757-1-9-1-9" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + <linearGradient >> + inkscape:collect="always" >> + xlink:href="#linearGradient5026" >> + id="linearGradient5032-3-8" >> + x1="353" >> + y1="211.3622" >> + x2="565.5" >> + y2="174.8622" >> + gradientUnits="userSpaceOnUse" >> + gradientTransform="matrix(1.0104674,0,0,1.0052679,-218.642,661.15448)" /> >> + <filter >> + inkscape:collect="always" >> + style="color-interpolation-filters:sRGB" >> + id="filter4169-3-5-8" >> + x="-0.031597666" >> + width="1.0631953" >> + y="-0.099812768" >> + height="1.1996255"> >> + <feGaussianBlur >> + inkscape:collect="always" >> + stdDeviation="1.3307599" >> + id="feGaussianBlur4171-6-3-9" /> >> + </filter> >> + <linearGradient >> + inkscape:collect="always" >> + xlink:href="#linearGradient5026" >> + id="linearGradient5032-3-8-2" >> + x1="353" >> + y1="211.3622" >> + x2="565.5" >> + y2="174.8622" >> + gradientUnits="userSpaceOnUse" >> + gradientTransform="matrix(2.1450559,0,0,1.0052679,-521.97704,740.76422)" /> >> + <filter >> + inkscape:collect="always" >> + style="color-interpolation-filters:sRGB" >> + id="filter4169-3-5-8-5" >> + x="-0.031597666" >> + width="1.0631953" >> + y="-0.099812768" >> + height="1.1996255"> >> + <feGaussianBlur >> + inkscape:collect="always" >> + stdDeviation="1.3307599" >> + id="feGaussianBlur4171-6-3-9-1" /> >> + </filter> >> + <linearGradient >> + inkscape:collect="always" >> + xlink:href="#linearGradient5026" 
>> + id="linearGradient5032-3-8-0" >> + x1="353" >> + y1="211.3622" >> + x2="565.5" >> + y2="174.8622" >> + gradientUnits="userSpaceOnUse" >> + gradientTransform="matrix(1.0104674,0,0,1.0052679,83.456748,660.20747)" /> >> + <filter >> + inkscape:collect="always" >> + style="color-interpolation-filters:sRGB" >> + id="filter4169-3-5-8-6" >> + x="-0.031597666" >> + width="1.0631953" >> + y="-0.099812768" >> + height="1.1996255"> >> + <feGaussianBlur >> + inkscape:collect="always" >> + stdDeviation="1.3307599" >> + id="feGaussianBlur4171-6-3-9-2" /> >> + </filter> >> + <linearGradient >> + inkscape:collect="always" >> + xlink:href="#linearGradient5026" >> + id="linearGradient5032-3-84" >> + x1="353" >> + y1="211.3622" >> + x2="565.5" >> + y2="174.8622" >> + gradientUnits="userSpaceOnUse" >> + gradientTransform="matrix(1.9884948,0,0,0.94903536,-318.42665,564.37696)" /> >> + <filter >> + inkscape:collect="always" >> + style="color-interpolation-filters:sRGB" >> + id="filter4169-3-5-4" >> + x="-0.031597666" >> + width="1.0631953" >> + y="-0.099812768" >> + height="1.1996255"> >> + <feGaussianBlur >> + inkscape:collect="always" >> + stdDeviation="1.3307599" >> + id="feGaussianBlur4171-6-3-0" /> >> + </filter> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> + id="marker4825-6-0-0"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757-1-93-8" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + <marker >> + markerWidth="11.227358" >> + markerHeight="12.355258" >> + refX="10" >> + refY="6.177629" >> + orient="auto" >> + id="marker4825-6-3"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path4757-1-1" >> + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" >> + 
style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> >> + </marker> >> + </defs> >> + <sodipodi:namedview >> + id="base" >> + pagecolor="#ffffff" >> + bordercolor="#666666" >> + borderopacity="1.0" >> + inkscape:pageopacity="0.0" >> + inkscape:pageshadow="2" >> + inkscape:zoom="0.98994949" >> + inkscape:cx="457.47339" >> + inkscape:cy="250.14781" >> + inkscape:document-units="px" >> + inkscape:current-layer="layer1" >> + showgrid="false" >> + inkscape:window-width="1916" >> + inkscape:window-height="1033" >> + inkscape:window-x="0" >> + inkscape:window-y="22" >> + inkscape:window-maximized="0" >> + fit-margin-right="0.3" /> >> + <metadata >> + id="metadata7"> >> + <rdf:RDF> >> + <cc:Work >> + rdf:about=""> >> + <dc:format>image/svg+xml</dc:format> >> + <dc:type >> + rdf:resource="http://purl.org/dc/dcmitype/StillImage" /> >> + <dc:title></dc:title> >> + </cc:Work> >> + </rdf:RDF> >> + </metadata> >> + <g >> + inkscape:label="Layer 1" >> + inkscape:groupmode="layer" >> + id="layer1" >> + transform="translate(0,-641.33861)"> >> + <rect >> + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5)" >> + id="rect4136-3-6-5" >> + width="101.07784" >> + height="31.998148" >> + x="128.74678" >> + y="80.648842" >> + transform="matrix(2.7384116,0,0,0.91666328,-284.06895,664.79751)" /> >> + <rect >> + style="fill:url(#linearGradient5032-3);fill-opacity:1;stroke:#000000;stroke-width:1.02430749" >> + id="rect4136-2-6" >> + width="276.79272" >> + height="29.331528" >> + x="64.723419" >> + y="736.84473" /> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="78.223282" >> + y="756.79803" >> + id="text4138-6-2"><tspan >> + sodipodi:role="line" >> + 
id="tspan4140-1-9" >> + x="78.223282" >> + y="756.79803" >> + style="font-size:15px;line-height:1.25">user application (running on the CPU)</tspan></text> >> + <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6)" >> + d="m 217.67507,876.6738 113.40331,45.0758" >> + id="path4661" >> + inkscape:connector-curvature="0" >> + sodipodi:nodetypes="cc" /> >> + <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-0)" >> + d="m 208.10197,767.69811 0.29362,76.03656" >> + id="path4661-6" >> + inkscape:connector-curvature="0" >> + sodipodi:nodetypes="cc" /> >> + <rect >> + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-8)" >> + id="rect4136-3-6-5-3" >> + width="101.07784" >> + height="31.998148" >> + x="128.74678" >> + y="80.648842" >> + transform="matrix(1.0104673,0,0,1.0052679,28.128628,763.90722)" /> >> + <rect >> + style="fill:url(#linearGradient5032-3-8);fill-opacity:1;stroke:#000000;stroke-width:0.65159565" >> + id="rect4136-2-6-6" >> + width="102.13586" >> + height="32.16671" >> + x="156.83217" >> + y="842.91852" /> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;font-size:12px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="188.58519" >> + y="864.47125" >> + id="text4138-6-2-8"><tspan >> + sodipodi:role="line" >> + id="tspan4140-1-9-0" >> + x="188.58519" >> + y="864.47125" >> + style="font-size:15px;line-height:1.25;stroke-width:1px">MMU</tspan></text> >> + <rect >> + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-8-5)" >> + id="rect4136-3-6-5-3-1" >> + width="101.07784" >> + 
height="31.998148" >> + x="128.74678" >> + y="80.648842" >> + transform="matrix(2.1450556,0,0,1.0052679,1.87637,843.51696)" /> >> + <rect >> + style="fill:url(#linearGradient5032-3-8-2);fill-opacity:1;stroke:#000000;stroke-width:0.94937181" >> + id="rect4136-2-6-6-0" >> + width="216.8176" >> + height="32.16671" >> + x="275.09283" >> + y="922.5282" /> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;font-size:12px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="347.81482" >> + y="943.23291" >> + id="text4138-6-2-8-8"><tspan >> + sodipodi:role="line" >> + id="tspan4140-1-9-0-5" >> + x="347.81482" >> + y="943.23291" >> + style="font-size:15px;line-height:1.25;stroke-width:1px">Memory</tspan></text> >> + <rect >> + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-8-6)" >> + id="rect4136-3-6-5-3-5" >> + width="101.07784" >> + height="31.998148" >> + x="128.74678" >> + y="80.648842" >> + transform="matrix(1.0104673,0,0,1.0052679,330.22737,762.9602)" /> >> + <rect >> + style="fill:url(#linearGradient5032-3-8-0);fill-opacity:1;stroke:#000000;stroke-width:0.65159565" >> + id="rect4136-2-6-6-8" >> + width="102.13586" >> + height="32.16671" >> + x="458.93091" >> + y="841.9715" /> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;font-size:12px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="490.68393" >> + y="863.52423" >> + id="text4138-6-2-8-6"><tspan >> + sodipodi:role="line" >> + id="tspan4140-1-9-0-2" >> + x="490.68393" >> + y="863.52423" >> + style="font-size:15px;line-height:1.25;stroke-width:1px">IOMMU</tspan></text> >> + <rect >> + 
style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-4)" >> + id="rect4136-3-6-5-6" >> + width="101.07784" >> + height="31.998148" >> + x="128.74678" >> + y="80.648842" >> + transform="matrix(1.9884947,0,0,0.94903537,167.19229,661.38193)" /> >> + <rect >> + style="fill:url(#linearGradient5032-3-84);fill-opacity:1;stroke:#000000;stroke-width:0.88813609" >> + id="rect4136-2-6-2" >> + width="200.99274" >> + height="30.367374" >> + x="420.4675" >> + y="735.97351" /> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;font-size:12px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="441.95297" >> + y="755.9068" >> + id="text4138-6-2-9"><tspan >> + sodipodi:role="line" >> + id="tspan4140-1-9-9" >> + x="441.95297" >> + y="755.9068" >> + style="font-size:15px;line-height:1.25;stroke-width:1px">Hardware Accelerator</tspan></text> >> + <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-0-0)" >> + d="m 508.2914,766.55885 0.29362,76.03656" >> + id="path4661-6-1" >> + inkscape:connector-curvature="0" >> + sodipodi:nodetypes="cc" /> >> + <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-3)" >> + d="M 499.70201,876.47297 361.38296,920.80258" >> + id="path4661-1" >> + inkscape:connector-curvature="0" >> + sodipodi:nodetypes="cc" /> >> + </g> >> +</svg> >> diff --git a/Documentation/warpdrive/wd_q_addr_space.svg b/Documentation/warpdrive/wd_q_addr_space.svg >> new file mode 100644 >> index 000000000000..5e6cf8e89908 >> --- /dev/null >> +++ b/Documentation/warpdrive/wd_q_addr_space.svg >> @@ -0,0 +1,359 @@ >> +<?xml version="1.0" encoding="UTF-8" 
standalone="no"?> >> +<!-- Created with Inkscape (http://www.inkscape.org/) --> >> + >> +<svg >> + xmlns:dc="http://purl.org/dc/elements/1.1/" >> + xmlns:cc="http://creativecommons.org/ns#" >> + xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" >> + xmlns:svg="http://www.w3.org/2000/svg" >> + xmlns="http://www.w3.org/2000/svg" >> + xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd" >> + xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape" >> + width="210mm" >> + height="124mm" >> + viewBox="0 0 210 124" >> + version="1.1" >> + id="svg8" >> + inkscape:version="0.92.3 (2405546, 2018-03-11)" >> + sodipodi:docname="wd_q_addr_space.svg"> >> + <defs >> + id="defs2"> >> + <marker >> + inkscape:stockid="Arrow1Mend" >> + orient="auto" >> + refY="0" >> + refX="0" >> + id="marker5428" >> + style="overflow:visible" >> + inkscape:isstock="true"> >> + <path >> + id="path5426" >> + d="M 0,0 5,-5 -12.5,0 5,5 Z" >> + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" >> + transform="matrix(-0.4,0,0,-0.4,-4,0)" >> + inkscape:connector-curvature="0" /> >> + </marker> >> + <marker >> + inkscape:isstock="true" >> + style="overflow:visible" >> + id="marker2922" >> + refX="0" >> + refY="0" >> + orient="auto" >> + inkscape:stockid="Arrow1Mend" >> + inkscape:collect="always"> >> + <path >> + transform="matrix(-0.4,0,0,-0.4,-4,0)" >> + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" >> + d="M 0,0 5,-5 -12.5,0 5,5 Z" >> + id="path2920" >> + inkscape:connector-curvature="0" /> >> + </marker> >> + <marker >> + inkscape:stockid="Arrow1Mstart" >> + orient="auto" >> + refY="0" >> + refX="0" >> + id="Arrow1Mstart" >> + style="overflow:visible" >> + inkscape:isstock="true"> >> + <path >> + id="path840" >> + d="M 0,0 5,-5 -12.5,0 5,5 Z" >> + 
style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" >> + transform="matrix(0.4,0,0,0.4,4,0)" >> + inkscape:connector-curvature="0" /> >> + </marker> >> + <marker >> + inkscape:stockid="Arrow1Mend" >> + orient="auto" >> + refY="0" >> + refX="0" >> + id="Arrow1Mend" >> + style="overflow:visible" >> + inkscape:isstock="true" >> + inkscape:collect="always"> >> + <path >> + id="path843" >> + d="M 0,0 5,-5 -12.5,0 5,5 Z" >> + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" >> + transform="matrix(-0.4,0,0,-0.4,-4,0)" >> + inkscape:connector-curvature="0" /> >> + </marker> >> + <marker >> + inkscape:stockid="Arrow1Mstart" >> + orient="auto" >> + refY="0" >> + refX="0" >> + id="Arrow1Mstart-5" >> + style="overflow:visible" >> + inkscape:isstock="true"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path840-1" >> + d="M 0,0 5,-5 -12.5,0 5,5 Z" >> + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" >> + transform="matrix(0.4,0,0,0.4,4,0)" /> >> + </marker> >> + <marker >> + inkscape:stockid="Arrow1Mend" >> + orient="auto" >> + refY="0" >> + refX="0" >> + id="Arrow1Mend-1" >> + style="overflow:visible" >> + inkscape:isstock="true"> >> + <path >> + inkscape:connector-curvature="0" >> + id="path843-0" >> + d="M 0,0 5,-5 -12.5,0 5,5 Z" >> + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" >> + transform="matrix(-0.4,0,0,-0.4,-4,0)" /> >> + </marker> >> + <marker >> + inkscape:isstock="true" >> + style="overflow:visible" >> + id="marker2922-2" >> + refX="0" >> + refY="0" >> + orient="auto" >> + inkscape:stockid="Arrow1Mend" >> + inkscape:collect="always"> >> + <path >> + transform="matrix(-0.4,0,0,-0.4,-4,0)" >> + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" 
>> + d="M 0,0 5,-5 -12.5,0 5,5 Z" >> + id="path2920-9" >> + inkscape:connector-curvature="0" /> >> + </marker> >> + <marker >> + inkscape:isstock="true" >> + style="overflow:visible" >> + id="marker2922-27" >> + refX="0" >> + refY="0" >> + orient="auto" >> + inkscape:stockid="Arrow1Mend" >> + inkscape:collect="always"> >> + <path >> + transform="matrix(-0.4,0,0,-0.4,-4,0)" >> + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" >> + d="M 0,0 5,-5 -12.5,0 5,5 Z" >> + id="path2920-0" >> + inkscape:connector-curvature="0" /> >> + </marker> >> + <marker >> + inkscape:isstock="true" >> + style="overflow:visible" >> + id="marker2922-27-8" >> + refX="0" >> + refY="0" >> + orient="auto" >> + inkscape:stockid="Arrow1Mend" >> + inkscape:collect="always"> >> + <path >> + transform="matrix(-0.4,0,0,-0.4,-4,0)" >> + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" >> + d="M 0,0 5,-5 -12.5,0 5,5 Z" >> + id="path2920-0-0" >> + inkscape:connector-curvature="0" /> >> + </marker> >> + </defs> >> + <sodipodi:namedview >> + id="base" >> + pagecolor="#ffffff" >> + bordercolor="#666666" >> + borderopacity="1.0" >> + inkscape:pageopacity="0.0" >> + inkscape:pageshadow="2" >> + inkscape:zoom="1.4" >> + inkscape:cx="401.66654" >> + inkscape:cy="218.12255" >> + inkscape:document-units="mm" >> + inkscape:current-layer="layer1" >> + showgrid="false" >> + inkscape:window-width="1916" >> + inkscape:window-height="1033" >> + inkscape:window-x="0" >> + inkscape:window-y="22" >> + inkscape:window-maximized="0" /> >> + <metadata >> + id="metadata5"> >> + <rdf:RDF> >> + <cc:Work >> + rdf:about=""> >> + <dc:format>image/svg+xml</dc:format> >> + <dc:type >> + rdf:resource="http://purl.org/dc/dcmitype/StillImage" /> >> + <dc:title /> >> + </cc:Work> >> + </rdf:RDF> >> + </metadata> >> + <g >> + inkscape:label="Layer 1" >> + inkscape:groupmode="layer" >> + id="layer1" >> + 
transform="translate(0,-173)"> >> + <rect >> + style="opacity:0.82999998;fill:none;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.4;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:0.82745098" >> + id="rect815" >> + width="21.262758" >> + height="40.350552" >> + x="55.509361" >> + y="195.00098" >> + ry="0" /> >> + <rect >> + style="opacity:0.82999998;fill:none;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.4;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:0.82745098" >> + id="rect815-1" >> + width="21.24276" >> + height="43.732346" >> + x="55.519352" >> + y="235.26543" >> + ry="0" /> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;font-size:2.9765625px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="50.549229" >> + y="190.6078" >> + id="text1118"><tspan >> + sodipodi:role="line" >> + id="tspan1116" >> + x="50.549229" >> + y="190.6078" >> + style="stroke-width:0.26458332px">queue file address space</tspan></text> >> + <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + d="M 76.818568,194.95453 H 97.229281" >> + id="path1126" >> + inkscape:connector-curvature="0" >> + sodipodi:nodetypes="cc" /> >> + <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + d="M 76.818568,235.20899 H 96.095361" >> + id="path1126-8" >> + inkscape:connector-curvature="0" /> >> + <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + d="m 76.762111,278.99778 h 19.27678" >> + id="path1126-0" >> + 
inkscape:connector-curvature="0" /> >> + <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + d="m 55.519355,265.20165 v 19.27678" >> + id="path1126-2" >> + inkscape:connector-curvature="0" /> >> + <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + d="m 76.762111,265.20165 v 19.27678" >> + id="path1126-2-1" >> + inkscape:connector-curvature="0" /> >> + <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-start:url(#Arrow1Mstart);marker-end:url(#Arrow1Mend)" >> + d="m 87.590896,194.76554 0,39.87648" >> + id="path1126-2-1-0" >> + inkscape:connector-curvature="0" >> + sodipodi:nodetypes="cc" /> >> + <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-start:url(#Arrow1Mstart-5);marker-end:url(#Arrow1Mend-1)" >> + d="m 82.48822,235.77596 v 42.90029" >> + id="path1126-2-1-0-8" >> + inkscape:connector-curvature="0" >> + sodipodi:nodetypes="cc" /> >> + <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker2922)" >> + d="M 44.123633,195.3325 H 55.651907" >> + id="path2912" >> + inkscape:connector-curvature="0" /> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;font-size:2.9765625px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="32.217381" >> + y="196.27745" >> + id="text2968"><tspan >> + sodipodi:role="line" >> + id="tspan2966" >> + x="32.217381" >> + y="196.27745" >> + 
style="stroke-width:0.26458332px">offset 0</tspan></text> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;font-size:2.9765625px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="91.199554" >> + y="216.03946" >> + id="text1118-5"><tspan >> + sodipodi:role="line" >> + id="tspan1116-0" >> + x="91.199554" >> + y="216.03946" >> + style="stroke-width:0.26458332px">device region (mapped to device mmio or shared kernel driver memory)</tspan></text> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;font-size:2.9765625px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="86.188072" >> + y="244.50081" >> + id="text1118-5-6"><tspan >> + sodipodi:role="line" >> + id="tspan1116-0-4" >> + x="86.188072" >> + y="244.50081" >> + style="stroke-width:0.26458332px">static share virtual memory region (for device without share virtual memory)</tspan></text> >> + <flowRoot >> + xml:space="preserve" >> + id="flowRoot5699" >> + style="font-style:normal;font-weight:normal;font-size:11.25px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"><flowRegion >> + id="flowRegion5701"><rect >> + id="rect5703" >> + width="5182.8569" >> + height="385.71429" >> + x="34.285713" >> + y="71.09111" /></flowRegion><flowPara >> + id="flowPara5705" /></flowRoot> <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker2922-2)" >> + d="M 43.679028,206.85268 H 55.207302" >> + 
id="path2912-1" >> + inkscape:connector-curvature="0" /> >> + <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker2922-27)" >> + d="M 44.057004,224.23959 H 55.585278" >> + id="path2912-9" >> + inkscape:connector-curvature="0" /> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;font-size:2.9765625px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="24.139778" >> + y="202.40636" >> + id="text1118-5-3"><tspan >> + sodipodi:role="line" >> + id="tspan1116-0-6" >> + x="24.139778" >> + y="202.40636" >> + style="stroke-width:0.26458332px">device mmio region</tspan></text> >> + <text >> + xml:space="preserve" >> + style="font-style:normal;font-weight:normal;font-size:2.9765625px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="17.010948" >> + y="216.73672" >> + id="text1118-5-3-3"><tspan >> + sodipodi:role="line" >> + id="tspan1116-0-6-6" >> + x="17.010948" >> + y="216.73672" >> + style="stroke-width:0.26458332px">device kernel only region</tspan></text> >> + <path >> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker2922-27-8)" >> + d="M 43.981087,235.35153 H 55.509361" >> + id="path2912-9-2" >> + inkscape:connector-curvature="0" /> >> + <text >> + xml:space="preserve" >> + 
style="font-style:normal;font-weight:normal;font-size:2.9765625px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" >> + x="17.575975" >> + y="230.53285" >> + id="text1118-5-3-3-0"><tspan >> + sodipodi:role="line" >> + id="tspan1116-0-6-6-5" >> + x="17.575975" >> + y="230.53285" >> + style="stroke-width:0.26458332px">device user share region</tspan></text> >> + </g> >> +</svg> >> -- >> 2.17.1 >>
On Wed, Nov 14, 2018 at 10:58:09AM +0800, Kenneth Lee wrote:
>
> On 2018/11/13 at 8:23 AM, Leon Romanovsky wrote:
> > On Mon, Nov 12, 2018 at 03:58:02PM +0800, Kenneth Lee wrote:
> > > From: Kenneth Lee <liguozhu@hisilicon.com>
> > >
> > > WarpDrive is a general accelerator framework for the user application to
> > > access the hardware without going through the kernel in data path.
> > >
> > > The kernel component that provides the kernel facility for drivers to
> > > expose the user interface is called uacce. It is a short name for
> > > "Unified/User-space-access-intended Accelerator Framework".
> > >
> > > This patch adds a document to explain how it works.
> > + RDMA and netdev folks
> >
> > Sorry to be late to the game. I don't see the other patches, but from
> > the description below it seems like you are reinventing the RDMA verbs
> > model. I have a hard time seeing the differences between the proposed
> > framework and what is already implemented in drivers/infiniband/* for the
> > kernel space and in https://github.com/linux-rdma/rdma-core/ for the user
> > space parts.
>
> Thanks Leon,
>
> Yes, we are trying to solve a problem similar to the one RDMA solves. We also
> learned a lot from the existing RDMA code. But we have to make a new one
> because we cannot register accelerators such as AI operation, encryption or
> compression to the RDMA framework:)

Assuming that you did everything right and still failed to use the RDMA
framework, you were supposed to fix it, not reinvent a new, identical
one. That is how we develop the kernel: by reusing existing code.

>
> Another problem we tried to address is the way to pin memory for DMA
> operations. The RDMA way of pinning memory cannot avoid losing pages to
> copy-on-write while the memory is in use by the device. This may not be
> important to the RDMA library, but it is important to accelerators.

Such support has existed in drivers/infiniband/ since late 2014; it is
called ODP (on-demand paging).

>
> Hope this can help the understanding.
Yes, it helped me a lot. Now I'm more convinced than before that this
whole patchset shouldn't exist in the first place.

To be clear, NAK.

Thanks

>
> Cheers
>
> >
> > Hard NAK from RDMA side.
> >
> > Thanks
> >
> > > Signed-off-by: Kenneth Lee <liguozhu@hisilicon.com>
> > > ---
> > >  Documentation/warpdrive/warpdrive.rst       | 260 +++++++
> > >  Documentation/warpdrive/wd-arch.svg         | 764 ++++++++++++++++++++
> > >  Documentation/warpdrive/wd.svg              | 526 ++++++++++++++
> > >  Documentation/warpdrive/wd_q_addr_space.svg | 359 +++++++++
> > >  4 files changed, 1909 insertions(+)
> > >  create mode 100644 Documentation/warpdrive/warpdrive.rst
> > >  create mode 100644 Documentation/warpdrive/wd-arch.svg
> > >  create mode 100644 Documentation/warpdrive/wd.svg
> > >  create mode 100644 Documentation/warpdrive/wd_q_addr_space.svg
> > >
> > > diff --git a/Documentation/warpdrive/warpdrive.rst b/Documentation/warpdrive/warpdrive.rst
> > > new file mode 100644
> > > index 000000000000..ef84d3a2d462
> > > --- /dev/null
> > > +++ b/Documentation/warpdrive/warpdrive.rst
> > > @@ -0,0 +1,260 @@
> > > +Introduction of WarpDrive
> > > +=========================
> > > +
> > > +*WarpDrive* is a general accelerator framework that lets user applications
> > > +access the hardware without going through the kernel in the data path.
> > > +
> > > +It can be used as a quick channel to accelerators, network adaptors or
> > > +other hardware for applications in user space.
> > > +
> > > +This may make some implementations simpler. E.g. you can reuse most of the
> > > +*netdev* driver in kernel and just share some ring buffer to the user space
> > > +driver for *DPDK* [4] or *ODP* [5]. Or you can combine the RSA accelerator with
> > > +the *netdev* in the user space as an HTTPS reverse proxy, etc.
> > > +
> > > +*WarpDrive* takes the hardware accelerator as a heterogeneous processor which
> > > +can share particular load from the CPU:
> > > +
> > > +.. image:: wd.svg
> > > +        :alt: WarpDrive Concept
> > > +
> > > +A virtual concept, the queue, is used to manage the requests sent to the
> > > +accelerator. The application sends requests to the queue by writing to a
> > > +particular address, while the hardware takes the requests directly from that
> > > +address and sends feedback accordingly.
> > > +
> > > +The format of the queue may differ from hardware to hardware. But the
> > > +application need not make any system call for the communication.
> > > +
> > > +*WarpDrive* tries to create a shared virtual address space for all involved
> > > +accelerators. Within this space, the requests sent to the queue can refer to
> > > +any virtual address, which will be valid to the application and all involved
> > > +accelerators.
> > > +
> > > +The name *WarpDrive* is simply a cool and general name meaning the framework
> > > +makes the application faster. It includes a general user library, a kernel
> > > +management module and drivers for the hardware. In the kernel, the management
> > > +module is called *uacce*, meaning "Unified/User-space-access-intended
> > > +Accelerator Framework".
> > > +
> > > +
> > > +How does it work
> > > +================
> > > +
> > > +*WarpDrive* uses *mmap* and *IOMMU* to play the trick.
> > > +
> > > +*Uacce* creates a chrdev for the device registered to it. A "queue" will be
> > > +created when the chrdev is opened. The application accesses the queue by
> > > +mmapping different address regions of the queue file.
> > > +
> > > +The following figure demonstrates the queue file address space:
> > > +
> > > +.. image:: wd_q_addr_space.svg
> > > +        :alt: WarpDrive Queue Address Space
> > > +
> > > +The first region of the space, the device region, is used by the application
> > > +to write requests to or read answers from the hardware.
> > > +
> > > +Normally, there can be three types of device regions: mmio and two kinds of
> > > +memory regions. It is recommended to use common memory for request/answer
> > > +descriptors and use the mmio space for device notification, such as a
> > > +doorbell. But of course, this is all up to the interface designer.
> > > +
> > > +There can be two types of device memory regions, kernel-only and user-shared.
> > > +This will be explained in the "kernel APIs" section.
> > > +
> > > +The Static Share Virtual Memory region is necessary only when the device IOMMU
> > > +does not support "Share Virtual Memory". This will be explained after the
> > > +*IOMMU* idea.
> > > +
> > > +
> > > +Architecture
> > > +------------
> > > +
> > > +The full *WarpDrive* architecture is represented in the following class
> > > +diagram:
> > > +
> > > +.. image:: wd-arch.svg
> > > +        :alt: WarpDrive Architecture
> > > +
> > > +
> > > +The user API
> > > +------------
> > > +
> > > +We adopt a polling style interface in the user space: ::
> > > +
> > > +        int wd_request_queue(struct wd_queue *q);
> > > +        void wd_release_queue(struct wd_queue *q);
> > > +
> > > +        int wd_send(struct wd_queue *q, void *req);
> > > +        int wd_recv(struct wd_queue *q, void **req);
> > > +        int wd_recv_sync(struct wd_queue *q, void **req);
> > > +        void wd_flush(struct wd_queue *q);
> > > +
> > > +wd_recv_sync() is a wrapper around its non-sync version. It traps into the
> > > +kernel and waits until the queue becomes available.
> > > +
> > > +If the queue does not support SVA/SVM, the following helper function
> > > +can be used to create Static Share Virtual Memory: ::
> > > +
> > > +        void *wd_preserve_share_memory(struct wd_queue *q, size_t size);
> > > +
> > > +The user API is not mandatory. It is simply a suggestion and a hint of what
> > > +the kernel interface is supposed to support.
> > > +
> > > +
> > > +The user driver
> > > +---------------
> > > +
> > > +The queue file mmap space will need a user driver to wrap the communication
> > > +protocol. *UACCE* provides some attributes in sysfs for the user driver to
> > > +match the right accelerator accordingly.
> > > +
> > > +The *UACCE* device attributes are under the following directory:
> > > +
> > > +/sys/class/uacce/<dev-name>/params
> > > +
> > > +The following attributes are supported:
> > > +
> > > +nr_queue_remained (ro)
> > > +        number of queues remaining
> > > +
> > > +api_version (ro)
> > > +        a string to identify the queue mmap space format and its version
> > > +
> > > +device_attr (ro)
> > > +        attributes of the device, see UACCE_DEV_xxx flag defined in uacce.h
> > > +
> > > +numa_node (ro)
> > > +        id of numa node
> > > +
> > > +priority (rw)
> > > +        Priority of the device, bigger is higher
> > > +
> > > +(This is not yet implemented in the RFC version)
> > > +
> > > +
> > > +The kernel API
> > > +--------------
> > > +
> > > +The *uacce* kernel API is defined in uacce.h. If the hardware supports SVM/SVA,
> > > +the driver needs only the following API functions: ::
> > > +
> > > +        int uacce_register(uacce);
> > > +        void uacce_unregister(uacce);
> > > +        void uacce_wake_up(q);
> > > +
> > > +*uacce_wake_up* is used to notify the process that calls epoll() on the queue
> > > +file.
> > > +
> > > +According to the IOMMU capability, *uacce* categorizes the devices as follows:
> > > +
> > > +UACCE_DEV_NOIOMMU
> > > +        The device has no IOMMU. The user process cannot use VA on the hardware.
> > > +        This mode is not recommended.
> > > +
> > > +UACCE_DEV_SVA (UACCE_DEV_PASID | UACCE_DEV_FAULT_FROM_DEV)
> > > +        The device has an IOMMU which can share the same page table with the
> > > +        user process
> > > +
> > > +UACCE_DEV_SHARE_DOMAIN
> > > +        The device has an IOMMU without multiple page table or device page
> > > +        fault support
> > > +
> > > +If the device works in a mode other than UACCE_DEV_NOIOMMU, *uacce* will set
> > > +its IOMMU to IOMMU_DOMAIN_UNMANAGED. So the driver must not use any kernel
> > > +DMA API but the following ones from *uacce* instead: ::
> > > +
> > > +        uacce_dma_map(q, va, size, prot);
> > > +        uacce_dma_unmap(q, va, size, prot);
> > > +
> > > +*uacce_dma_map/unmap* is valid only for a UACCE_DEV_SVA device. It creates a
> > > +particular PASID and page table for the kernel in the IOMMU (not yet
> > > +implemented in the RFC).
> > > +
> > > +For the UACCE_DEV_SHARE_DOMAIN device, uacce_dma_map/unmap is not valid.
> > > +*Uacce* calls back start_queue only when the DUS and DKO regions are mmapped.
> > > +The accelerator driver must use those dma buffers, via uacce_queue->qfrs[], in
> > > +the start_queue callback. The size of the queue file region is defined by
> > > +uacce->ops->qf_pg_start[].
> > > +
> > > +We have to do it this way because most current IOMMUs cannot support kernel
> > > +and user virtual addresses at the same time. So we have to let them both
> > > +share the same user virtual address space.
> > > +
> > > +If the device has to support kernel and user at the same time, both the kernel
> > > +and the user should use these DMA APIs. This is not convenient. A better
> > > +solution is to change the future DMA/IOMMU design to separate the address
> > > +space between user and kernel space. But that is not going to happen any time
> > > +soon.
> > > +
> > > +
> > > +Multiple processes support
> > > +==========================
> > > +
> > > +In the latest mainline kernel (4.19) when this document is written, the IOMMU
> > > +subsystem does not support multiple process page tables yet.
> > > +
> > > +Most IOMMU hardware implementations support multi-process with the concept
> > > +of PASID. But they may use a different name, e.g. it is called sub-stream-id
> > > +in the ARM SMMU. With PASID or a similar design, multiple page tables can be
> > > +added to the IOMMU and referred to by PASID.
> > > +
> > > +*JPB* has a patchset to enable this[1]_. We have tested it with our hardware
> > > +(which is known as *D06*). It works well. *WarpDrive* relies on it to support
> > > +UACCE_DEV_SVA. If it is not enabled, *WarpDrive* can still work. But it
> > > +supports only one process: the device will be set to UACCE_DEV_SHARE_DOMAIN
> > > +even if it is set to UACCE_DEV_SVA initially.
> > > +
> > > +Static Share Virtual Memory is mainly used by UACCE_DEV_SHARE_DOMAIN devices.
> > > +
> > > +
> > > +Legacy Mode Support
> > > +===================
> > > +For hardware without an IOMMU, WarpDrive can still work; the only problem is
> > > +that VA cannot be used in the device. The driver should adopt another strategy
> > > +for the shared memory. It is only for testing, and not recommended.
> > > +
> > > +
> > > +The Fork Scenario
> > > +=================
> > > +For a process with allocated queues and shared memory, what happens if it forks
> > > +a child?
> > > +
> > > +The fd of the queue will be duplicated on fork, so the child can send requests
> > > +to the same queue as its parent. But requests sent from any process other than
> > > +the one that opened the queue will be blocked.
> > > +
> > > +It is recommended to add O_CLOEXEC to the queue file.
> > > +
> > > +The queue mmap space has VM_DONTCOPY set in its VMA. So the child will lose all
> > > +those VMAs.
> > > +
> > > +This is why *WarpDrive* does not adopt the mode used in *VFIO* and *InfiniBand*.
> > > +Both solutions can set any user pointer for hardware sharing. But they cannot
> > > +support fork while the dma is in progress; otherwise the "Copy-On-Write"
> > > +procedure will make the parent process lose its physical pages.
> > > +
> > > +
> > > +The Sample Code
> > > +===============
> > > +There is a sample user land implementation with a simple driver for the
> > > +Hisilicon Hi1620 ZIP Accelerator.
> > > +
> > > +To test, do the following in samples/warpdrive (for the case of PC host): ::
> > > +        ./autogen.sh
> > > +        ./conf.sh       # or simply ./configure if you build on target system
> > > +        make
> > > +
> > > +Then you can get test_hisi_zip in the test subdirectory. Copy it to the target
> > > +system and make sure the hisi_zip driver is enabled (the major and minor of
> > > +the uacce chrdev can be obtained from dmesg or sysfs), and run: ::
> > > +        mknod /dev/ua1 c <major> <minor>
> > > +        test/test_hisi_zip -z < data > data.zip
> > > +        test/test_hisi_zip -g < data > data.gzip
> > > +
> > > +
> > > +References
> > > +==========
> > > +.. [1] https://patchwork.kernel.org/patch/10394851/
> > > +
> > > +.. vim: tw=78
> > > diff --git a/Documentation/warpdrive/wd-arch.svg b/Documentation/warpdrive/wd-arch.svg
> > > new file mode 100644
> > > index 000000000000..e59934188443
> > > --- /dev/null
> > > +++ b/Documentation/warpdrive/wd-arch.svg
> > > @@ -0,0 +1,764 @@
> > > +<?xml version="1.0" encoding="UTF-8" standalone="no"?>
> > > +<!-- Created with Inkscape (http://www.inkscape.org/) -->
> > > +
> > > +<svg
> > > +   xmlns:dc="http://purl.org/dc/elements/1.1/"
> > > +   xmlns:cc="http://creativecommons.org/ns#"
> > > +   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
> > > +   xmlns:svg="http://www.w3.org/2000/svg"
> > > +   xmlns="http://www.w3.org/2000/svg"
> > > +   xmlns:xlink="http://www.w3.org/1999/xlink"
> > > +   xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
> > > +   xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
> > > +   width="210mm"
> > > +   height="193mm"
> > > +   viewBox="0 0 744.09449 683.85823"
> > > +   id="svg2"
> > > +   version="1.1"
> > > +   inkscape:version="0.92.3 (2405546, 2018-03-11)"
> > > +   sodipodi:docname="wd-arch.svg">
> > > +  <defs
> > > +     id="defs4">
> > > +    <linearGradient
> > > +       inkscape:collect="always"
> > > +       id="linearGradient6830">
> > > +      <stop
> > > +         style="stop-color:#000000;stop-opacity:1;"
> > >
+ offset="0" > > > + id="stop6832" /> > > > + <stop > > > + style="stop-color:#000000;stop-opacity:0;" > > > + offset="1" > > > + id="stop6834" /> > > > + </linearGradient> > > > + <linearGradient > > > + inkscape:collect="always" > > > + xlink:href="#linearGradient5026" > > > + id="linearGradient5032" > > > + x1="353" > > > + y1="211.3622" > > > + x2="565.5" > > > + y2="174.8622" > > > + gradientUnits="userSpaceOnUse" > > > + gradientTransform="translate(-89.949614,405.94594)" /> > > > + <linearGradient > > > + inkscape:collect="always" > > > + id="linearGradient5026"> > > > + <stop > > > + style="stop-color:#f2f2f2;stop-opacity:1;" > > > + offset="0" > > > + id="stop5028" /> > > > + <stop > > > + style="stop-color:#f2f2f2;stop-opacity:0;" > > > + offset="1" > > > + id="stop5030" /> > > > + </linearGradient> > > > + <filter > > > + inkscape:collect="always" > > > + style="color-interpolation-filters:sRGB" > > > + id="filter4169-3" > > > + x="-0.031597666" > > > + width="1.0631953" > > > + y="-0.099812768" > > > + height="1.1996255"> > > > + <feGaussianBlur > > > + inkscape:collect="always" > > > + stdDeviation="1.3307599" > > > + id="feGaussianBlur4171-6" /> > > > + </filter> > > > + <linearGradient > > > + inkscape:collect="always" > > > + xlink:href="#linearGradient5026" > > > + id="linearGradient5032-1" > > > + x1="353" > > > + y1="211.3622" > > > + x2="565.5" > > > + y2="174.8622" > > > + gradientUnits="userSpaceOnUse" > > > + gradientTransform="translate(175.77842,400.29111)" /> > > > + <filter > > > + inkscape:collect="always" > > > + style="color-interpolation-filters:sRGB" > > > + id="filter4169-3-0" > > > + x="-0.031597666" > > > + width="1.0631953" > > > + y="-0.099812768" > > > + height="1.1996255"> > > > + <feGaussianBlur > > > + inkscape:collect="always" > > > + stdDeviation="1.3307599" > > > + id="feGaussianBlur4171-6-9" /> > > > + </filter> > > > + <marker > > > + markerWidth="18.960653" > > > + markerHeight="11.194658" > > > + refX="9.4803267" > > 
> + refY="5.5973287" > > > + orient="auto" > > > + id="marker4613"> > > > + <rect > > > + y="-5.1589785" > > > + x="5.8504119" > > > + height="10.317957" > > > + width="10.317957" > > > + id="rect4212" > > > + style="fill:#ffffff;stroke:#000000;stroke-width:0.69143367;stroke-miterlimit:4;stroke-dasharray:none" > > > + transform="matrix(0.86111274,0.50841405,-0.86111274,0.50841405,0,0)"> > > > + <title > > > + id="title4262">generation</title> > > > + </rect> > > > + </marker> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + id="marker4825"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757" > > > + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + id="marker4825-6"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757-1" > > > + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + <linearGradient > > > + inkscape:collect="always" > > > + xlink:href="#linearGradient5026" > > > + id="linearGradient5032-3-9" > > > + x1="353" > > > + y1="211.3622" > > > + x2="565.5" > > > + y2="174.8622" > > > + gradientUnits="userSpaceOnUse" > > > + gradientTransform="matrix(1.2452511,0,0,0.98513016,-190.95632,540.33156)" /> > > > + <filter > > > + inkscape:collect="always" > > > + style="color-interpolation-filters:sRGB" > > > + id="filter4169-3-5-8" > > > + x="-0.031597666" > > > + width="1.0631953" > > > + y="-0.099812768" > > > + 
height="1.1996255"> > > > + <feGaussianBlur > > > + inkscape:collect="always" > > > + stdDeviation="1.3307599" > > > + id="feGaussianBlur4171-6-3-9" /> > > > + </filter> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + id="marker4825-6-2"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757-1-9" > > > + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + id="marker4825-6-2-1"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757-1-9-9" > > > + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + <linearGradient > > > + inkscape:collect="always" > > > + xlink:href="#linearGradient5026" > > > + id="linearGradient5032-3-9-7" > > > + x1="353" > > > + y1="211.3622" > > > + x2="565.5" > > > + y2="174.8622" > > > + gradientUnits="userSpaceOnUse" > > > + gradientTransform="matrix(1.3742742,0,0,0.97786398,-234.52617,654.63367)" /> > > > + <filter > > > + inkscape:collect="always" > > > + style="color-interpolation-filters:sRGB" > > > + id="filter4169-3-5-8-5" > > > + x="-0.031597666" > > > + width="1.0631953" > > > + y="-0.099812768" > > > + height="1.1996255"> > > > + <feGaussianBlur > > > + inkscape:collect="always" > > > + stdDeviation="1.3307599" > > > + id="feGaussianBlur4171-6-3-9-0" /> > > > + </filter> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + 
refY="6.177629" > > > + orient="auto" > > > + id="marker4825-6-2-6"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757-1-9-1" > > > + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + <linearGradient > > > + inkscape:collect="always" > > > + xlink:href="#linearGradient5026" > > > + id="linearGradient5032-3-9-4" > > > + x1="353" > > > + y1="211.3622" > > > + x2="565.5" > > > + y2="174.8622" > > > + gradientUnits="userSpaceOnUse" > > > + gradientTransform="matrix(1.3742912,0,0,2.0035845,-468.34428,342.56603)" /> > > > + <filter > > > + inkscape:collect="always" > > > + style="color-interpolation-filters:sRGB" > > > + id="filter4169-3-5-8-54" > > > + x="-0.031597666" > > > + width="1.0631953" > > > + y="-0.099812768" > > > + height="1.1996255"> > > > + <feGaussianBlur > > > + inkscape:collect="always" > > > + stdDeviation="1.3307599" > > > + id="feGaussianBlur4171-6-3-9-7" /> > > > + </filter> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + id="marker4825-6-2-1-8"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757-1-9-9-6" > > > + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + id="marker4825-6-2-1-8-8"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757-1-9-9-6-9" > > > + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + 
style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + id="marker4825-6-0"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757-1-93" > > > + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + id="marker4825-6-0-2"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757-1-93-6" > > > + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + <filter > > > + inkscape:collect="always" > > > + style="color-interpolation-filters:sRGB" > > > + id="filter5382" > > > + x="-0.089695387" > > > + width="1.1793908" > > > + y="-0.10052069" > > > + height="1.2010413"> > > > + <feGaussianBlur > > > + inkscape:collect="always" > > > + stdDeviation="0.86758925" > > > + id="feGaussianBlur5384" /> > > > + </filter> > > > + <linearGradient > > > + inkscape:collect="always" > > > + xlink:href="#linearGradient6830" > > > + id="linearGradient6836" > > > + x1="362.73923" > > > + y1="700.04059" > > > + x2="340.4751" > > > + y2="678.25488" > > > + gradientUnits="userSpaceOnUse" > > > + gradientTransform="translate(-23.771026,-135.76835)" /> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + 
id="marker4825-6-2-6-2"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757-1-9-1-9" > > > + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + <linearGradient > > > + inkscape:collect="always" > > > + xlink:href="#linearGradient5026" > > > + id="linearGradient5032-3-9-7-3" > > > + x1="353" > > > + y1="211.3622" > > > + x2="565.5" > > > + y2="174.8622" > > > + gradientUnits="userSpaceOnUse" > > > + gradientTransform="matrix(1.3742742,0,0,0.97786395,-57.357186,649.55786)" /> > > > + <filter > > > + inkscape:collect="always" > > > + style="color-interpolation-filters:sRGB" > > > + id="filter4169-3-5-8-5-0" > > > + x="-0.031597666" > > > + width="1.0631953" > > > + y="-0.099812768" > > > + height="1.1996255"> > > > + <feGaussianBlur > > > + inkscape:collect="always" > > > + stdDeviation="1.3307599" > > > + id="feGaussianBlur4171-6-3-9-0-2" /> > > > + </filter> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + id="marker4825-6-2-1-1"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757-1-9-9-0" > > > + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + </defs> > > > + <sodipodi:namedview > > > + id="base" > > > + pagecolor="#ffffff" > > > + bordercolor="#666666" > > > + borderopacity="1.0" > > > + inkscape:pageopacity="0.0" > > > + inkscape:pageshadow="2" > > > + inkscape:zoom="0.98994949" > > > + inkscape:cx="222.32868" > > > + inkscape:cy="370.44492" > > > + inkscape:document-units="px" > > > + inkscape:current-layer="layer1" > > > + showgrid="false" > > > + 
inkscape:window-width="1916" > > > + inkscape:window-height="1033" > > > + inkscape:window-x="0" > > > + inkscape:window-y="22" > > > + inkscape:window-maximized="0" > > > + fit-margin-right="0.3" > > > + inkscape:snap-global="false" /> > > > + <metadata > > > + id="metadata7"> > > > + <rdf:RDF> > > > + <cc:Work > > > + rdf:about=""> > > > + <dc:format>image/svg+xml</dc:format> > > > + <dc:type > > > + rdf:resource="http://purl.org/dc/dcmitype/StillImage" /> > > > + <dc:title /> > > > + </cc:Work> > > > + </rdf:RDF> > > > + </metadata> > > > + <g > > > + inkscape:label="Layer 1" > > > + inkscape:groupmode="layer" > > > + id="layer1" > > > + transform="translate(0,-368.50374)"> > > > + <rect > > > + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3)" > > > + id="rect4136-3-6" > > > + width="101.07784" > > > + height="31.998148" > > > + x="283.01144" > > > + y="588.80896" /> > > > + <rect > > > + style="fill:url(#linearGradient5032);fill-opacity:1;stroke:#000000;stroke-width:0.6465112" > > > + id="rect4136-2" > > > + width="101.07784" > > > + height="31.998148" > > > + x="281.63498" > > > + y="586.75739" /> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="294.21747" > > > + y="612.50073" > > > + id="text4138-6"><tspan > > > + sodipodi:role="line" > > > + id="tspan4140-1" > > > + x="294.21747" > > > + y="612.50073" > > > + style="font-size:15px;line-height:1.25">WarpDrive</tspan></text> > > > + <rect > > > + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-0)" > > > + id="rect4136-3-6-3" > > > + width="101.07784" > > > + height="31.998148" > > > + x="548.7395" > > > + y="583.15417" /> > > > + <rect > > > + 
style="fill:url(#linearGradient5032-1);fill-opacity:1;stroke:#000000;stroke-width:0.6465112" > > > + id="rect4136-2-60" > > > + width="101.07784" > > > + height="31.998148" > > > + x="547.36304" > > > + y="581.1026" /> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="557.83484" > > > + y="602.32745" > > > + id="text4138-6-6"><tspan > > > + sodipodi:role="line" > > > + id="tspan4140-1-2" > > > + x="557.83484" > > > + y="602.32745" > > > + style="font-size:15px;line-height:1.25">user_driver</tspan></text> > > > + <path > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4613)" > > > + d="m 547.36304,600.78954 -156.58203,0.0691" > > > + id="path4855" > > > + inkscape:connector-curvature="0" > > > + sodipodi:nodetypes="cc" /> > > > + <rect > > > + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-8)" > > > + id="rect4136-3-6-5-7" > > > + width="101.07784" > > > + height="31.998148" > > > + x="128.74678" > > > + y="80.648842" > > > + transform="matrix(1.2452511,0,0,0.98513016,113.15182,641.02594)" /> > > > + <rect > > > + style="fill:url(#linearGradient5032-3-9);fill-opacity:1;stroke:#000000;stroke-width:0.71606314" > > > + id="rect4136-2-6-3" > > > + width="125.86729" > > > + height="31.522341" > > > + x="271.75983" > > > + y="718.45435" /> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="309.13705" > > > + y="745.55371" > > > + 
id="text4138-6-2-6"><tspan > > > + sodipodi:role="line" > > > + id="tspan4140-1-9-1" > > > + x="309.13705" > > > + y="745.55371" > > > + style="font-size:15px;line-height:1.25">uacce</tspan></text> > > > + <path > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-2)" > > > + d="m 329.57309,619.72453 5.0373,97.14447" > > > + id="path4661-3" > > > + inkscape:connector-curvature="0" > > > + sodipodi:nodetypes="cc" /> > > > + <path > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-2-1)" > > > + d="m 342.57219,830.63108 -5.67699,-79.2841" > > > + id="path4661-3-4" > > > + inkscape:connector-curvature="0" > > > + sodipodi:nodetypes="cc" /> > > > + <rect > > > + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-8-5)" > > > + id="rect4136-3-6-5-7-3" > > > + width="101.07784" > > > + height="31.998148" > > > + x="128.74678" > > > + y="80.648842" > > > + transform="matrix(1.3742742,0,0,0.97786398,101.09126,754.58534)" /> > > > + <rect > > > + style="fill:url(#linearGradient5032-3-9-7);fill-opacity:1;stroke:#000000;stroke-width:0.74946606" > > > + id="rect4136-2-6-3-6" > > > + width="138.90866" > > > + height="31.289837" > > > + x="276.13297" > > > + y="831.44263" /> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="295.67819" > > > + y="852.98224" > > > + id="text4138-6-2-6-1"><tspan > > > + sodipodi:role="line" > > > + id="tspan4140-1-9-1-0" > > > + x="295.67819" > > > + y="852.98224" > > > + style="font-size:15px;line-height:1.25">Device Driver</tspan></text> > > > + <path > > 
> + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-2-6)" > > > + d="m 623.05084,615.00104 0.51369,333.80219" > > > + id="path4661-3-5" > > > + inkscape:connector-curvature="0" > > > + sodipodi:nodetypes="cc" /> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;text-align:center;letter-spacing:0px;word-spacing:0px;text-anchor:middle;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="392.63568" > > > + y="660.83667" > > > + id="text4138-6-2-6-1-6-2-5"><tspan > > > + sodipodi:role="line" > > > + x="392.63568" > > > + y="660.83667" > > > + id="tspan4305" > > > + style="font-size:15px;line-height:1.25"><<anom_file>></tspan><tspan > > > + sodipodi:role="line" > > > + x="392.63568" > > > + y="679.58667" > > > + style="font-size:15px;line-height:1.25" > > > + id="tspan1139">Queue FD</tspan></text> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="389.92969" > > > + y="587.44836" > > > + id="text4138-6-2-6-1-6-2-56"><tspan > > > + sodipodi:role="line" > > > + id="tspan4140-1-9-1-0-3-0-9" > > > + x="389.92969" > > > + y="587.44836" > > > + style="font-size:15px;line-height:1.25">1</tspan></text> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="528.64813" > > > + y="600.08429" > > > + id="text4138-6-2-6-1-6-3"><tspan > > > + 
sodipodi:role="line" > > > + id="tspan4140-1-9-1-0-3-7" > > > + x="528.64813" > > > + y="600.08429" > > > + style="font-size:15px;line-height:1.25">*</tspan></text> > > > + <rect > > > + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-8-54)" > > > + id="rect4136-3-6-5-7-4" > > > + width="101.07784" > > > + height="31.998148" > > > + x="128.74678" > > > + y="80.648842" > > > + transform="matrix(1.3745874,0,0,1.8929066,-132.7754,556.04505)" /> > > > + <rect > > > + style="fill:url(#linearGradient5032-3-9-4);fill-opacity:1;stroke:#000000;stroke-width:1.07280123" > > > + id="rect4136-2-6-3-4" > > > + width="138.91039" > > > + height="64.111" > > > + x="42.321312" > > > + y="704.8371" /> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="110.30745" > > > + y="722.94025" > > > + id="text4138-6-2-6-3"><tspan > > > + sodipodi:role="line" > > > + x="111.99202" > > > + y="722.94025" > > > + id="tspan4366" > > > + style="font-size:15px;line-height:1.25;text-align:center;text-anchor:middle">other standard </tspan><tspan > > > + sodipodi:role="line" > > > + x="110.30745" > > > + y="741.69025" > > > + id="tspan4368" > > > + style="font-size:15px;line-height:1.25;text-align:center;text-anchor:middle">framework</tspan><tspan > > > + sodipodi:role="line" > > > + x="110.30745" > > > + y="760.44025" > > > + style="font-size:15px;line-height:1.25;text-align:center;text-anchor:middle" > > > + id="tspan6840">(crypto/nic/others)</tspan></text> > > > + <path > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-2-1-8)" > > > + d="M 276.29661,849.04109 134.04449,771.90853" > > > + id="path4661-3-4-8" > > 
> + inkscape:connector-curvature="0" > > > + sodipodi:nodetypes="cc" /> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="313.70813" > > > + y="730.06366" > > > + id="text4138-6-2-6-36"><tspan > > > + sodipodi:role="line" > > > + id="tspan4140-1-9-1-7" > > > + x="313.70813" > > > + y="730.06366" > > > + style="font-size:10px;line-height:1.25"><<lkm>></tspan></text> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;text-align:start;letter-spacing:0px;word-spacing:0px;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="259.53165" > > > + y="797.8056" > > > + id="text4138-6-2-6-1-6-2-5-7-5"><tspan > > > + sodipodi:role="line" > > > + x="259.53165" > > > + y="797.8056" > > > + style="font-size:15px;line-height:1.25;text-align:start;text-anchor:start" > > > + id="tspan2357">uacce register api</tspan></text> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="29.145819" > > > + y="833.44244" > > > + id="text4138-6-2-6-1-6-2-5-7-5-2"><tspan > > > + sodipodi:role="line" > > > + x="29.145819" > > > + y="833.44244" > > > + id="tspan4301" > > > + style="font-size:15px;line-height:1.25">register to other subsystem</tspan></text> > > > + <text > > > + xml:space="preserve" > > > + 
style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="301.20813" > > > + y="597.29437" > > > + id="text4138-6-2-6-36-1"><tspan > > > + sodipodi:role="line" > > > + id="tspan4140-1-9-1-7-2" > > > + x="301.20813" > > > + y="597.29437" > > > + style="font-size:10px;line-height:1.25"><<user_lib>></tspan></text> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;text-align:center;letter-spacing:0px;word-spacing:0px;text-anchor:middle;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="615.9505" > > > + y="739.44012" > > > + id="text4138-6-2-6-1-6-2-5-3"><tspan > > > + sodipodi:role="line" > > > + x="615.9505" > > > + y="739.44012" > > > + id="tspan4274-7" > > > + style="font-size:15px;line-height:1.25">mmapped memory r/w interface</tspan></text> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;text-align:center;letter-spacing:0px;word-spacing:0px;text-anchor:middle;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="371.01291" > > > + y="529.23682" > > > + id="text4138-6-2-6-1-6-2-5-36"><tspan > > > + sodipodi:role="line" > > > + x="371.01291" > > > + y="529.23682" > > > + id="tspan4305-3" > > > + style="font-size:15px;line-height:1.25">wd user api</tspan></text> > > > + <path > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + d="m 328.19325,585.87943 0,-23.57142" > > > + id="path4348" > > > + inkscape:connector-curvature="0" /> > > > + <ellipse > > > + 
style="opacity:1;fill:#ffffff;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0" > > > + id="path4350" > > > + cx="328.01468" > > > + cy="551.95081" > > > + rx="11.607142" > > > + ry="10.357142" /> > > > + <path > > > + style="opacity:0.444;fill:url(#linearGradient6836);fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:1;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;filter:url(#filter5382)" > > > + id="path4350-2" > > > + sodipodi:type="arc" > > > + sodipodi:cx="329.44327" > > > + sodipodi:cy="553.37933" > > > + sodipodi:rx="11.607142" > > > + sodipodi:ry="10.357142" > > > + sodipodi:start="0" > > > + sodipodi:end="6.2509098" > > > + d="m 341.05041,553.37933 a 11.607142,10.357142 0 0 1 -11.51349,10.35681 11.607142,10.357142 0 0 1 -11.69928,-10.18967 11.607142,10.357142 0 0 1 11.32469,-10.52124 11.607142,10.357142 0 0 1 11.88204,10.01988" > > > + sodipodi:open="true" /> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;text-align:center;letter-spacing:0px;word-spacing:0px;text-anchor:middle;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="619.67596" > > > + y="978.22363" > > > + id="text4138-6-2-6-1-6-2-5-36-3"><tspan > > > + sodipodi:role="line" > > > + x="619.67596" > > > + y="978.22363" > > > + id="tspan4305-3-67" > > > + style="font-size:15px;line-height:1.25">Device(Hardware)</tspan></text> > > > + <path > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-2-6-2)" > > > + d="m 347.51164,865.4527 193.91929,99.10053" > > > + id="path4661-3-5-1" > > > + inkscape:connector-curvature="0" > > > + sodipodi:nodetypes="cc" /> > > > + <rect > > > + 
style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-8-5-0)" > > > + id="rect4136-3-6-5-7-3-1" > > > + width="101.07784" > > > + height="31.998148" > > > + x="128.74678" > > > + y="80.648842" > > > + transform="matrix(1.3742742,0,0,0.97786395,278.26025,749.50952)" /> > > > + <rect > > > + style="fill:url(#linearGradient5032-3-9-7-3);fill-opacity:1;stroke:#000000;stroke-width:0.74946606" > > > + id="rect4136-2-6-3-6-0" > > > + width="138.90868" > > > + height="31.289839" > > > + x="453.30197" > > > + y="826.36682" /> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;font-size:12px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="493.68158" > > > + y="847.90643" > > > + id="text4138-6-2-6-1-5"><tspan > > > + sodipodi:role="line" > > > + id="tspan4140-1-9-1-0-1" > > > + x="493.68158" > > > + y="847.90643" > > > + style="font-size:15px;line-height:1.25;stroke-width:1px">IOMMU</tspan></text> > > > + <path > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-2-1-1)" > > > + d="m 389.49372,755.46667 111.75324,68.4507" > > > + id="path4661-3-4-85" > > > + inkscape:connector-curvature="0" > > > + sodipodi:nodetypes="cc" /> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;font-size:12px;line-height:0%;font-family:sans-serif;text-align:start;letter-spacing:0px;word-spacing:0px;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="427.70282" > > > + y="776.91418" > > > + id="text4138-6-2-6-1-6-2-5-7-5-0"><tspan > > > + sodipodi:role="line" > > > + x="427.70282" > > > + y="776.91418" > > > + 
style="font-size:15px;line-height:1.25;text-align:start;text-anchor:start;stroke-width:1px" > > > + id="tspan2357-6">manage the driver iommu state</tspan></text> > > > + </g> > > > +</svg> > > > diff --git a/Documentation/warpdrive/wd.svg b/Documentation/warpdrive/wd.svg > > > new file mode 100644 > > > index 000000000000..87ab92ebfbc6 > > > --- /dev/null > > > +++ b/Documentation/warpdrive/wd.svg > > > @@ -0,0 +1,526 @@ > > > +<?xml version="1.0" encoding="UTF-8" standalone="no"?> > > > +<!-- Created with Inkscape (http://www.inkscape.org/) --> > > > + > > > +<svg > > > + xmlns:dc="http://purl.org/dc/elements/1.1/" > > > + xmlns:cc="http://creativecommons.org/ns#" > > > + xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" > > > + xmlns:svg="http://www.w3.org/2000/svg" > > > + xmlns="http://www.w3.org/2000/svg" > > > + xmlns:xlink="http://www.w3.org/1999/xlink" > > > + xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd" > > > + xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape" > > > + width="210mm" > > > + height="116mm" > > > + viewBox="0 0 744.09449 411.02338" > > > + id="svg2" > > > + version="1.1" > > > + inkscape:version="0.92.3 (2405546, 2018-03-11)" > > > + sodipodi:docname="wd.svg"> > > > + <defs > > > + id="defs4"> > > > + <linearGradient > > > + inkscape:collect="always" > > > + id="linearGradient5026"> > > > + <stop > > > + style="stop-color:#f2f2f2;stop-opacity:1;" > > > + offset="0" > > > + id="stop5028" /> > > > + <stop > > > + style="stop-color:#f2f2f2;stop-opacity:0;" > > > + offset="1" > > > + id="stop5030" /> > > > + </linearGradient> > > > + <linearGradient > > > + inkscape:collect="always" > > > + xlink:href="#linearGradient5026" > > > + id="linearGradient5032-3" > > > + x1="353" > > > + y1="211.3622" > > > + x2="565.5" > > > + y2="174.8622" > > > + gradientUnits="userSpaceOnUse" > > > + gradientTransform="matrix(2.7384117,0,0,0.91666329,-952.8283,571.10143)" /> > > > + <filter > > > + inkscape:collect="always" > 
> > + style="color-interpolation-filters:sRGB" > > > + id="filter4169-3-5" > > > + x="-0.031597666" > > > + width="1.0631953" > > > + y="-0.099812768" > > > + height="1.1996255"> > > > + <feGaussianBlur > > > + inkscape:collect="always" > > > + stdDeviation="1.3307599" > > > + id="feGaussianBlur4171-6-3" /> > > > + </filter> > > > + <marker > > > + markerWidth="18.960653" > > > + markerHeight="11.194658" > > > + refX="9.4803267" > > > + refY="5.5973287" > > > + orient="auto" > > > + id="marker4613"> > > > + <rect > > > + y="-5.1589785" > > > + x="5.8504119" > > > + height="10.317957" > > > + width="10.317957" > > > + id="rect4212" > > > + style="fill:#ffffff;stroke:#000000;stroke-width:0.69143367;stroke-miterlimit:4;stroke-dasharray:none" > > > + transform="matrix(0.86111274,0.50841405,-0.86111274,0.50841405,0,0)"> > > > + <title > > > + id="title4262">generation</title> > > > + </rect> > > > + </marker> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + id="marker4825"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757" > > > + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + id="marker4825-6"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757-1" > > > + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + <linearGradient > > > + inkscape:collect="always" > > > + xlink:href="#linearGradient5026" > > > + 
id="linearGradient5032-3-9" > > > + x1="353" > > > + y1="211.3622" > > > + x2="565.5" > > > + y2="174.8622" > > > + gradientUnits="userSpaceOnUse" > > > + gradientTransform="matrix(1.2452511,0,0,0.98513016,-190.95632,540.33156)" /> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + id="marker4825-6-2"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757-1-9" > > > + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + id="marker4825-6-2-1"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757-1-9-9" > > > + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + id="marker4825-6-2-6"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757-1-9-1" > > > + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + id="marker4825-6-2-1-8"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757-1-9-9-6" > > > + d="M 
0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + id="marker4825-6-2-1-8-8"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757-1-9-9-6-9" > > > + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + id="marker4825-6-0"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757-1-93" > > > + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + id="marker4825-6-0-2"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757-1-93-6" > > > + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + id="marker4825-6-2-6-2"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757-1-9-1-9" > > > + d="M 0.42024733,0.42806444 
10.231357,6.3500844 0.24347733,11.918544" > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + <linearGradient > > > + inkscape:collect="always" > > > + xlink:href="#linearGradient5026" > > > + id="linearGradient5032-3-8" > > > + x1="353" > > > + y1="211.3622" > > > + x2="565.5" > > > + y2="174.8622" > > > + gradientUnits="userSpaceOnUse" > > > + gradientTransform="matrix(1.0104674,0,0,1.0052679,-218.642,661.15448)" /> > > > + <filter > > > + inkscape:collect="always" > > > + style="color-interpolation-filters:sRGB" > > > + id="filter4169-3-5-8" > > > + x="-0.031597666" > > > + width="1.0631953" > > > + y="-0.099812768" > > > + height="1.1996255"> > > > + <feGaussianBlur > > > + inkscape:collect="always" > > > + stdDeviation="1.3307599" > > > + id="feGaussianBlur4171-6-3-9" /> > > > + </filter> > > > + <linearGradient > > > + inkscape:collect="always" > > > + xlink:href="#linearGradient5026" > > > + id="linearGradient5032-3-8-2" > > > + x1="353" > > > + y1="211.3622" > > > + x2="565.5" > > > + y2="174.8622" > > > + gradientUnits="userSpaceOnUse" > > > + gradientTransform="matrix(2.1450559,0,0,1.0052679,-521.97704,740.76422)" /> > > > + <filter > > > + inkscape:collect="always" > > > + style="color-interpolation-filters:sRGB" > > > + id="filter4169-3-5-8-5" > > > + x="-0.031597666" > > > + width="1.0631953" > > > + y="-0.099812768" > > > + height="1.1996255"> > > > + <feGaussianBlur > > > + inkscape:collect="always" > > > + stdDeviation="1.3307599" > > > + id="feGaussianBlur4171-6-3-9-1" /> > > > + </filter> > > > + <linearGradient > > > + inkscape:collect="always" > > > + xlink:href="#linearGradient5026" > > > + id="linearGradient5032-3-8-0" > > > + x1="353" > > > + y1="211.3622" > > > + x2="565.5" > > > + y2="174.8622" > > > + gradientUnits="userSpaceOnUse" > > > + gradientTransform="matrix(1.0104674,0,0,1.0052679,83.456748,660.20747)" /> > > > + <filter 
> > > + inkscape:collect="always" > > > + style="color-interpolation-filters:sRGB" > > > + id="filter4169-3-5-8-6" > > > + x="-0.031597666" > > > + width="1.0631953" > > > + y="-0.099812768" > > > + height="1.1996255"> > > > + <feGaussianBlur > > > + inkscape:collect="always" > > > + stdDeviation="1.3307599" > > > + id="feGaussianBlur4171-6-3-9-2" /> > > > + </filter> > > > + <linearGradient > > > + inkscape:collect="always" > > > + xlink:href="#linearGradient5026" > > > + id="linearGradient5032-3-84" > > > + x1="353" > > > + y1="211.3622" > > > + x2="565.5" > > > + y2="174.8622" > > > + gradientUnits="userSpaceOnUse" > > > + gradientTransform="matrix(1.9884948,0,0,0.94903536,-318.42665,564.37696)" /> > > > + <filter > > > + inkscape:collect="always" > > > + style="color-interpolation-filters:sRGB" > > > + id="filter4169-3-5-4" > > > + x="-0.031597666" > > > + width="1.0631953" > > > + y="-0.099812768" > > > + height="1.1996255"> > > > + <feGaussianBlur > > > + inkscape:collect="always" > > > + stdDeviation="1.3307599" > > > + id="feGaussianBlur4171-6-3-0" /> > > > + </filter> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + id="marker4825-6-0-0"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757-1-93-8" > > > + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + <marker > > > + markerWidth="11.227358" > > > + markerHeight="12.355258" > > > + refX="10" > > > + refY="6.177629" > > > + orient="auto" > > > + id="marker4825-6-3"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path4757-1-1" > > > + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" > > > + 
style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> > > > + </marker> > > > + </defs> > > > + <sodipodi:namedview > > > + id="base" > > > + pagecolor="#ffffff" > > > + bordercolor="#666666" > > > + borderopacity="1.0" > > > + inkscape:pageopacity="0.0" > > > + inkscape:pageshadow="2" > > > + inkscape:zoom="0.98994949" > > > + inkscape:cx="457.47339" > > > + inkscape:cy="250.14781" > > > + inkscape:document-units="px" > > > + inkscape:current-layer="layer1" > > > + showgrid="false" > > > + inkscape:window-width="1916" > > > + inkscape:window-height="1033" > > > + inkscape:window-x="0" > > > + inkscape:window-y="22" > > > + inkscape:window-maximized="0" > > > + fit-margin-right="0.3" /> > > > + <metadata > > > + id="metadata7"> > > > + <rdf:RDF> > > > + <cc:Work > > > + rdf:about=""> > > > + <dc:format>image/svg+xml</dc:format> > > > + <dc:type > > > + rdf:resource="http://purl.org/dc/dcmitype/StillImage" /> > > > + <dc:title></dc:title> > > > + </cc:Work> > > > + </rdf:RDF> > > > + </metadata> > > > + <g > > > + inkscape:label="Layer 1" > > > + inkscape:groupmode="layer" > > > + id="layer1" > > > + transform="translate(0,-641.33861)"> > > > + <rect > > > + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5)" > > > + id="rect4136-3-6-5" > > > + width="101.07784" > > > + height="31.998148" > > > + x="128.74678" > > > + y="80.648842" > > > + transform="matrix(2.7384116,0,0,0.91666328,-284.06895,664.79751)" /> > > > + <rect > > > + style="fill:url(#linearGradient5032-3);fill-opacity:1;stroke:#000000;stroke-width:1.02430749" > > > + id="rect4136-2-6" > > > + width="276.79272" > > > + height="29.331528" > > > + x="64.723419" > > > + y="736.84473" /> > > > + <text > > > + xml:space="preserve" > > > + 
style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="78.223282" > > > + y="756.79803" > > > + id="text4138-6-2"><tspan > > > + sodipodi:role="line" > > > + id="tspan4140-1-9" > > > + x="78.223282" > > > + y="756.79803" > > > + style="font-size:15px;line-height:1.25">user application (running by the CPU</tspan></text> > > > + <path > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6)" > > > + d="m 217.67507,876.6738 113.40331,45.0758" > > > + id="path4661" > > > + inkscape:connector-curvature="0" > > > + sodipodi:nodetypes="cc" /> > > > + <path > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-0)" > > > + d="m 208.10197,767.69811 0.29362,76.03656" > > > + id="path4661-6" > > > + inkscape:connector-curvature="0" > > > + sodipodi:nodetypes="cc" /> > > > + <rect > > > + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-8)" > > > + id="rect4136-3-6-5-3" > > > + width="101.07784" > > > + height="31.998148" > > > + x="128.74678" > > > + y="80.648842" > > > + transform="matrix(1.0104673,0,0,1.0052679,28.128628,763.90722)" /> > > > + <rect > > > + style="fill:url(#linearGradient5032-3-8);fill-opacity:1;stroke:#000000;stroke-width:0.65159565" > > > + id="rect4136-2-6-6" > > > + width="102.13586" > > > + height="32.16671" > > > + x="156.83217" > > > + y="842.91852" /> > > > + <text > > > + xml:space="preserve" > > > + 
style="font-style:normal;font-weight:normal;font-size:12px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="188.58519" > > > + y="864.47125" > > > + id="text4138-6-2-8"><tspan > > > + sodipodi:role="line" > > > + id="tspan4140-1-9-0" > > > + x="188.58519" > > > + y="864.47125" > > > + style="font-size:15px;line-height:1.25;stroke-width:1px">MMU</tspan></text> > > > + <rect > > > + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-8-5)" > > > + id="rect4136-3-6-5-3-1" > > > + width="101.07784" > > > + height="31.998148" > > > + x="128.74678" > > > + y="80.648842" > > > + transform="matrix(2.1450556,0,0,1.0052679,1.87637,843.51696)" /> > > > + <rect > > > + style="fill:url(#linearGradient5032-3-8-2);fill-opacity:1;stroke:#000000;stroke-width:0.94937181" > > > + id="rect4136-2-6-6-0" > > > + width="216.8176" > > > + height="32.16671" > > > + x="275.09283" > > > + y="922.5282" /> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;font-size:12px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="347.81482" > > > + y="943.23291" > > > + id="text4138-6-2-8-8"><tspan > > > + sodipodi:role="line" > > > + id="tspan4140-1-9-0-5" > > > + x="347.81482" > > > + y="943.23291" > > > + style="font-size:15px;line-height:1.25;stroke-width:1px">Memory</tspan></text> > > > + <rect > > > + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-8-6)" > > > + id="rect4136-3-6-5-3-5" > > > + width="101.07784" > > > + height="31.998148" > > > + x="128.74678" > > > + y="80.648842" > > > + transform="matrix(1.0104673,0,0,1.0052679,330.22737,762.9602)" /> > > > + <rect > > > + 
style="fill:url(#linearGradient5032-3-8-0);fill-opacity:1;stroke:#000000;stroke-width:0.65159565" > > > + id="rect4136-2-6-6-8" > > > + width="102.13586" > > > + height="32.16671" > > > + x="458.93091" > > > + y="841.9715" /> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;font-size:12px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="490.68393" > > > + y="863.52423" > > > + id="text4138-6-2-8-6"><tspan > > > + sodipodi:role="line" > > > + id="tspan4140-1-9-0-2" > > > + x="490.68393" > > > + y="863.52423" > > > + style="font-size:15px;line-height:1.25;stroke-width:1px">IOMMU</tspan></text> > > > + <rect > > > + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-4)" > > > + id="rect4136-3-6-5-6" > > > + width="101.07784" > > > + height="31.998148" > > > + x="128.74678" > > > + y="80.648842" > > > + transform="matrix(1.9884947,0,0,0.94903537,167.19229,661.38193)" /> > > > + <rect > > > + style="fill:url(#linearGradient5032-3-84);fill-opacity:1;stroke:#000000;stroke-width:0.88813609" > > > + id="rect4136-2-6-2" > > > + width="200.99274" > > > + height="30.367374" > > > + x="420.4675" > > > + y="735.97351" /> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;font-size:12px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="441.95297" > > > + y="755.9068" > > > + id="text4138-6-2-9"><tspan > > > + sodipodi:role="line" > > > + id="tspan4140-1-9-9" > > > + x="441.95297" > > > + y="755.9068" > > > + style="font-size:15px;line-height:1.25;stroke-width:1px">Hardware Accelerator</tspan></text> > > > + <path > > > + 
style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-0-0)" > > > + d="m 508.2914,766.55885 0.29362,76.03656" > > > + id="path4661-6-1" > > > + inkscape:connector-curvature="0" > > > + sodipodi:nodetypes="cc" /> > > > + <path > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-3)" > > > + d="M 499.70201,876.47297 361.38296,920.80258" > > > + id="path4661-1" > > > + inkscape:connector-curvature="0" > > > + sodipodi:nodetypes="cc" /> > > > + </g> > > > +</svg> > > > diff --git a/Documentation/warpdrive/wd_q_addr_space.svg b/Documentation/warpdrive/wd_q_addr_space.svg > > > new file mode 100644 > > > index 000000000000..5e6cf8e89908 > > > --- /dev/null > > > +++ b/Documentation/warpdrive/wd_q_addr_space.svg > > > @@ -0,0 +1,359 @@ > > > +<?xml version="1.0" encoding="UTF-8" standalone="no"?> > > > +<!-- Created with Inkscape (http://www.inkscape.org/) --> > > > + > > > +<svg > > > + xmlns:dc="http://purl.org/dc/elements/1.1/" > > > + xmlns:cc="http://creativecommons.org/ns#" > > > + xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" > > > + xmlns:svg="http://www.w3.org/2000/svg" > > > + xmlns="http://www.w3.org/2000/svg" > > > + xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd" > > > + xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape" > > > + width="210mm" > > > + height="124mm" > > > + viewBox="0 0 210 124" > > > + version="1.1" > > > + id="svg8" > > > + inkscape:version="0.92.3 (2405546, 2018-03-11)" > > > + sodipodi:docname="wd_q_addr_space.svg"> > > > + <defs > > > + id="defs2"> > > > + <marker > > > + inkscape:stockid="Arrow1Mend" > > > + orient="auto" > > > + refY="0" > > > + refX="0" > > > + id="marker5428" > > > + style="overflow:visible" > > > + inkscape:isstock="true"> > > > + <path > > > + id="path5426" > > > + d="M 
0,0 5,-5 -12.5,0 5,5 Z" > > > + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" > > > + transform="matrix(-0.4,0,0,-0.4,-4,0)" > > > + inkscape:connector-curvature="0" /> > > > + </marker> > > > + <marker > > > + inkscape:isstock="true" > > > + style="overflow:visible" > > > + id="marker2922" > > > + refX="0" > > > + refY="0" > > > + orient="auto" > > > + inkscape:stockid="Arrow1Mend" > > > + inkscape:collect="always"> > > > + <path > > > + transform="matrix(-0.4,0,0,-0.4,-4,0)" > > > + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" > > > + d="M 0,0 5,-5 -12.5,0 5,5 Z" > > > + id="path2920" > > > + inkscape:connector-curvature="0" /> > > > + </marker> > > > + <marker > > > + inkscape:stockid="Arrow1Mstart" > > > + orient="auto" > > > + refY="0" > > > + refX="0" > > > + id="Arrow1Mstart" > > > + style="overflow:visible" > > > + inkscape:isstock="true"> > > > + <path > > > + id="path840" > > > + d="M 0,0 5,-5 -12.5,0 5,5 Z" > > > + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" > > > + transform="matrix(0.4,0,0,0.4,4,0)" > > > + inkscape:connector-curvature="0" /> > > > + </marker> > > > + <marker > > > + inkscape:stockid="Arrow1Mend" > > > + orient="auto" > > > + refY="0" > > > + refX="0" > > > + id="Arrow1Mend" > > > + style="overflow:visible" > > > + inkscape:isstock="true" > > > + inkscape:collect="always"> > > > + <path > > > + id="path843" > > > + d="M 0,0 5,-5 -12.5,0 5,5 Z" > > > + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" > > > + transform="matrix(-0.4,0,0,-0.4,-4,0)" > > > + inkscape:connector-curvature="0" /> > > > + </marker> > > > + <marker > > > + inkscape:stockid="Arrow1Mstart" > > > + orient="auto" > > > + refY="0" > > > + refX="0" > > > + id="Arrow1Mstart-5" > > > + style="overflow:visible" 
> > > + inkscape:isstock="true"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path840-1" > > > + d="M 0,0 5,-5 -12.5,0 5,5 Z" > > > + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" > > > + transform="matrix(0.4,0,0,0.4,4,0)" /> > > > + </marker> > > > + <marker > > > + inkscape:stockid="Arrow1Mend" > > > + orient="auto" > > > + refY="0" > > > + refX="0" > > > + id="Arrow1Mend-1" > > > + style="overflow:visible" > > > + inkscape:isstock="true"> > > > + <path > > > + inkscape:connector-curvature="0" > > > + id="path843-0" > > > + d="M 0,0 5,-5 -12.5,0 5,5 Z" > > > + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" > > > + transform="matrix(-0.4,0,0,-0.4,-4,0)" /> > > > + </marker> > > > + <marker > > > + inkscape:isstock="true" > > > + style="overflow:visible" > > > + id="marker2922-2" > > > + refX="0" > > > + refY="0" > > > + orient="auto" > > > + inkscape:stockid="Arrow1Mend" > > > + inkscape:collect="always"> > > > + <path > > > + transform="matrix(-0.4,0,0,-0.4,-4,0)" > > > + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" > > > + d="M 0,0 5,-5 -12.5,0 5,5 Z" > > > + id="path2920-9" > > > + inkscape:connector-curvature="0" /> > > > + </marker> > > > + <marker > > > + inkscape:isstock="true" > > > + style="overflow:visible" > > > + id="marker2922-27" > > > + refX="0" > > > + refY="0" > > > + orient="auto" > > > + inkscape:stockid="Arrow1Mend" > > > + inkscape:collect="always"> > > > + <path > > > + transform="matrix(-0.4,0,0,-0.4,-4,0)" > > > + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" > > > + d="M 0,0 5,-5 -12.5,0 5,5 Z" > > > + id="path2920-0" > > > + inkscape:connector-curvature="0" /> > > > + </marker> > > > + <marker > > > + inkscape:isstock="true" > > > + style="overflow:visible" 
> > > + id="marker2922-27-8" > > > + refX="0" > > > + refY="0" > > > + orient="auto" > > > + inkscape:stockid="Arrow1Mend" > > > + inkscape:collect="always"> > > > + <path > > > + transform="matrix(-0.4,0,0,-0.4,-4,0)" > > > + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" > > > + d="M 0,0 5,-5 -12.5,0 5,5 Z" > > > + id="path2920-0-0" > > > + inkscape:connector-curvature="0" /> > > > + </marker> > > > + </defs> > > > + <sodipodi:namedview > > > + id="base" > > > + pagecolor="#ffffff" > > > + bordercolor="#666666" > > > + borderopacity="1.0" > > > + inkscape:pageopacity="0.0" > > > + inkscape:pageshadow="2" > > > + inkscape:zoom="1.4" > > > + inkscape:cx="401.66654" > > > + inkscape:cy="218.12255" > > > + inkscape:document-units="mm" > > > + inkscape:current-layer="layer1" > > > + showgrid="false" > > > + inkscape:window-width="1916" > > > + inkscape:window-height="1033" > > > + inkscape:window-x="0" > > > + inkscape:window-y="22" > > > + inkscape:window-maximized="0" /> > > > + <metadata > > > + id="metadata5"> > > > + <rdf:RDF> > > > + <cc:Work > > > + rdf:about=""> > > > + <dc:format>image/svg+xml</dc:format> > > > + <dc:type > > > + rdf:resource="http://purl.org/dc/dcmitype/StillImage" /> > > > + <dc:title /> > > > + </cc:Work> > > > + </rdf:RDF> > > > + </metadata> > > > + <g > > > + inkscape:label="Layer 1" > > > + inkscape:groupmode="layer" > > > + id="layer1" > > > + transform="translate(0,-173)"> > > > + <rect > > > + style="opacity:0.82999998;fill:none;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.4;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:0.82745098" > > > + id="rect815" > > > + width="21.262758" > > > + height="40.350552" > > > + x="55.509361" > > > + y="195.00098" > > > + ry="0" /> > > > + <rect > > > + 
style="opacity:0.82999998;fill:none;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.4;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:0.82745098" > > > + id="rect815-1" > > > + width="21.24276" > > > + height="43.732346" > > > + x="55.519352" > > > + y="235.26543" > > > + ry="0" /> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;font-size:2.9765625px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="50.549229" > > > + y="190.6078" > > > + id="text1118"><tspan > > > + sodipodi:role="line" > > > + id="tspan1116" > > > + x="50.549229" > > > + y="190.6078" > > > + style="stroke-width:0.26458332px">queue file address space</tspan></text> > > > + <path > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + d="M 76.818568,194.95453 H 97.229281" > > > + id="path1126" > > > + inkscape:connector-curvature="0" > > > + sodipodi:nodetypes="cc" /> > > > + <path > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + d="M 76.818568,235.20899 H 96.095361" > > > + id="path1126-8" > > > + inkscape:connector-curvature="0" /> > > > + <path > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + d="m 76.762111,278.99778 h 19.27678" > > > + id="path1126-0" > > > + inkscape:connector-curvature="0" /> > > > + <path > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + d="m 55.519355,265.20165 v 19.27678" > > > + id="path1126-2" > > > + 
inkscape:connector-curvature="0" /> > > > + <path > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + d="m 76.762111,265.20165 v 19.27678" > > > + id="path1126-2-1" > > > + inkscape:connector-curvature="0" /> > > > + <path > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-start:url(#Arrow1Mstart);marker-end:url(#Arrow1Mend)" > > > + d="m 87.590896,194.76554 0,39.87648" > > > + id="path1126-2-1-0" > > > + inkscape:connector-curvature="0" > > > + sodipodi:nodetypes="cc" /> > > > + <path > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-start:url(#Arrow1Mstart-5);marker-end:url(#Arrow1Mend-1)" > > > + d="m 82.48822,235.77596 v 42.90029" > > > + id="path1126-2-1-0-8" > > > + inkscape:connector-curvature="0" > > > + sodipodi:nodetypes="cc" /> > > > + <path > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker2922)" > > > + d="M 44.123633,195.3325 H 55.651907" > > > + id="path2912" > > > + inkscape:connector-curvature="0" /> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;font-size:2.9765625px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="32.217381" > > > + y="196.27745" > > > + id="text2968"><tspan > > > + sodipodi:role="line" > > > + id="tspan2966" > > > + x="32.217381" > > > + y="196.27745" > > > + style="stroke-width:0.26458332px">offset 0</tspan></text> > > > + <text > > > + xml:space="preserve" > > > + 
style="font-style:normal;font-weight:normal;font-size:2.9765625px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="91.199554" > > > + y="216.03946" > > > + id="text1118-5"><tspan > > > + sodipodi:role="line" > > > + id="tspan1116-0" > > > + x="91.199554" > > > + y="216.03946" > > > + style="stroke-width:0.26458332px">device region (mapped to device mmio or shared kernel driver memory)</tspan></text> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;font-size:2.9765625px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="86.188072" > > > + y="244.50081" > > > + id="text1118-5-6"><tspan > > > + sodipodi:role="line" > > > + id="tspan1116-0-4" > > > + x="86.188072" > > > + y="244.50081" > > > + style="stroke-width:0.26458332px">static share virtual memory region (for device without share virtual memory)</tspan></text> > > > + <flowRoot > > > + xml:space="preserve" > > > + id="flowRoot5699" > > > + style="font-style:normal;font-weight:normal;font-size:11.25px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"><flowRegion > > > + id="flowRegion5701"><rect > > > + id="rect5703" > > > + width="5182.8569" > > > + height="385.71429" > > > + x="34.285713" > > > + y="71.09111" /></flowRegion><flowPara > > > + id="flowPara5705" /></flowRoot> <path > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker2922-2)" > > > + d="M 43.679028,206.85268 H 55.207302" > > > + 
id="path2912-1" > > > + inkscape:connector-curvature="0" /> > > > + <path > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker2922-27)" > > > + d="M 44.057004,224.23959 H 55.585278" > > > + id="path2912-9" > > > + inkscape:connector-curvature="0" /> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;font-size:2.9765625px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="24.139778" > > > + y="202.40636" > > > + id="text1118-5-3"><tspan > > > + sodipodi:role="line" > > > + id="tspan1116-0-6" > > > + x="24.139778" > > > + y="202.40636" > > > + style="stroke-width:0.26458332px">device mmio region</tspan></text> > > > + <text > > > + xml:space="preserve" > > > + style="font-style:normal;font-weight:normal;font-size:2.9765625px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="17.010948" > > > + y="216.73672" > > > + id="text1118-5-3-3"><tspan > > > + sodipodi:role="line" > > > + id="tspan1116-0-6-6" > > > + x="17.010948" > > > + y="216.73672" > > > + style="stroke-width:0.26458332px">device kernel only region</tspan></text> > > > + <path > > > + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker2922-27-8)" > > > + d="M 43.981087,235.35153 H 55.509361" > > > + id="path2912-9-2" > > > + inkscape:connector-curvature="0" /> > > > + <text > > > + xml:space="preserve" > > > + 
style="font-style:normal;font-weight:normal;font-size:2.9765625px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" > > > + x="17.575975" > > > + y="230.53285" > > > + id="text1118-5-3-3-0"><tspan > > > + sodipodi:role="line" > > > + id="tspan1116-0-6-6-5" > > > + x="17.575975" > > > + y="230.53285" > > > + style="stroke-width:0.26458332px">device user share region</tspan></text> > > > + </g> > > > +</svg> > > > -- > > > 2.17.1 > > >
On Wed, Nov 14, 2018 at 06:00:17PM +0200, Leon Romanovsky wrote: > Date: Wed, 14 Nov 2018 18:00:17 +0200 > From: Leon Romanovsky <leon@kernel.org> > To: Kenneth Lee <nek.in.cn@gmail.com> > CC: Tim Sell <timothy.sell@unisys.com>, linux-doc@vger.kernel.org, > Alexander Shishkin <alexander.shishkin@linux.intel.com>, Zaibo Xu > <xuzaibo@huawei.com>, zhangfei.gao@foxmail.com, linuxarm@huawei.com, > haojian.zhuang@linaro.org, Christoph Lameter <cl@linux.com>, Hao Fang > <fanghao11@huawei.com>, Gavin Schenk <g.schenk@eckelmann.de>, RDMA mailing > list <linux-rdma@vger.kernel.org>, Zhou Wang <wangzhou1@hisilicon.com>, > Jason Gunthorpe <jgg@ziepe.ca>, Doug Ledford <dledford@redhat.com>, Uwe > Kleine-König <u.kleine-koenig@pengutronix.de>, David Kershner > <david.kershner@unisys.com>, Johan Hovold <johan@kernel.org>, Cyrille > Pitchen <cyrille.pitchen@free-electrons.com>, Sagar Dharia > <sdharia@codeaurora.org>, Jens Axboe <axboe@kernel.dk>, > guodong.xu@linaro.org, linux-netdev <netdev@vger.kernel.org>, Randy Dunlap > <rdunlap@infradead.org>, linux-kernel@vger.kernel.org, Vinod Koul > <vkoul@kernel.org>, linux-crypto@vger.kernel.org, Philippe Ombredanne > <pombredanne@nexb.com>, Sanyog Kale <sanyog.r.kale@intel.com>, Kenneth Lee > <liguozhu@hisilicon.com>, "David S. Miller" <davem@davemloft.net>, > linux-accelerators@lists.ozlabs.org > Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce > User-Agent: Mutt/1.10.1 (2018-07-13) > Message-ID: <20181114160017.GI3759@mtr-leonro.mtl.com> > > On Wed, Nov 14, 2018 at 10:58:09AM +0800, Kenneth Lee wrote: > > > > 在 2018/11/13 上午8:23, Leon Romanovsky 写道: > > > On Mon, Nov 12, 2018 at 03:58:02PM +0800, Kenneth Lee wrote: > > > > From: Kenneth Lee <liguozhu@hisilicon.com> > > > > > > > > WarpDrive is a general accelerator framework for the user application to > > > > access the hardware without going through the kernel in data path. 
> > > > > > > > The kernel component to provide kernel facility to driver for expose the > > > > user interface is called uacce. It a short name for > > > > "Unified/User-space-access-intended Accelerator Framework". > > > > > > > > This patch add document to explain how it works. > > > + RDMA and netdev folks > > > > > > Sorry, to be late in the game, I don't see other patches, but from > > > the description below it seems like you are reinventing RDMA verbs > > > model. I have hard time to see the differences in the proposed > > > framework to already implemented in drivers/infiniband/* for the kernel > > > space and for the https://github.com/linux-rdma/rdma-core/ for the user > > > space parts. > > > > Thanks Leon, > > > > Yes, we tried to solve similar problem in RDMA. We also learned a lot from > > the exist code of RDMA. But we we have to make a new one because we cannot > > register accelerators such as AI operation, encryption or compression to the > > RDMA framework:) > > Assuming that you did everything right and still failed to use RDMA > framework, you was supposed to fix it and not to reinvent new exactly > same one. It is how we develop kernel, by reusing existing code.

Yes, but we don't force other systems such as NICs or GPUs into RDMA, do we? I assume you would not agree to register a zip accelerator with InfiniBand? :)

Further, I don't think it is wise to break an existing system (RDMA) to fulfill a totally new scenario. The better choice is to let the two run in parallel for some time and then try to merge them as appropriate.

> > > > > Another problem we tried to address is the way to pin the memory for dma > > operation. The RDMA way to pin the memory cannot avoid the page lost due to > > copy-on-write operation during the memory is used by the device. This may > > not be important to RDMA library. But it is important to accelerator. > > Such support exists in drivers/infiniband/ from late 2014 and > it is called ODP (on demand paging).
I reviewed ODP and I think it is a solution bound to InfiniBand. It is part of the MR semantics and requires an InfiniBand-specific hook (ucontext->invalidate_range()). The hook requires the device to be able to stop using the page for a while during the copy. That is fine for InfiniBand (actually, only mlx5 uses it), but I don't think most accelerators can support this mode. WarpDrive works entirely on top of the IOMMU interface, so it does not have this limitation.

> > > > > Hope this can help the understanding. > > Yes, it helped me a lot. > Now, I'm more than before convinced that this whole patchset shouldn't > exist in the first place.

Then maybe you can tell me how I can register my accelerator to user space?

> > To be clear, NAK. > > Thanks > > > > > Cheers > > > > > > > > Hard NAK from RDMA side. > > > > > > Thanks > > > > > > > Signed-off-by: Kenneth Lee <liguozhu@hisilicon.com> > > > > --- > > > > Documentation/warpdrive/warpdrive.rst | 260 +++++++ > > > > Documentation/warpdrive/wd-arch.svg | 764 ++++++++++++++++++++ > > > > Documentation/warpdrive/wd.svg | 526 ++++++++++++++ > > > > Documentation/warpdrive/wd_q_addr_space.svg | 359 +++++++++ > > > > 4 files changed, 1909 insertions(+) > > > > create mode 100644 Documentation/warpdrive/warpdrive.rst > > > > create mode 100644 Documentation/warpdrive/wd-arch.svg > > > > create mode 100644 Documentation/warpdrive/wd.svg > > > > create mode 100644 Documentation/warpdrive/wd_q_addr_space.svg > > > > > > > > diff --git a/Documentation/warpdrive/warpdrive.rst b/Documentation/warpdrive/warpdrive.rst > > > > new file mode 100644 > > > > index 000000000000..ef84d3a2d462 > > > > --- /dev/null > > > > +++ b/Documentation/warpdrive/warpdrive.rst > > > > @@ -0,0 +1,260 @@ > > > > +Introduction of WarpDrive > > > > +========================= > > > > + > > > > +*WarpDrive* is a general accelerator framework for the user application to > > > > +access the hardware without going through the kernel in data path.
> > > > + > > > > +It can be used as the quick channel for accelerators, network adaptors or > > > > +other hardware for application in user space. > > > > + > > > > +This may make some implementation simpler. E.g. you can reuse most of the > > > > +*netdev* driver in kernel and just share some ring buffer to the user space > > > > +driver for *DPDK* [4] or *ODP* [5]. Or you can combine the RSA accelerator with > > > > +the *netdev* in the user space as a https reversed proxy, etc. > > > > + > > > > +*WarpDrive* takes the hardware accelerator as a heterogeneous processor which > > > > +can share particular load from the CPU: > > > > + > > > > +.. image:: wd.svg > > > > + :alt: WarpDrive Concept > > > > + > > > > +The virtual concept, queue, is used to manage the requests sent to the > > > > +accelerator. The application send requests to the queue by writing to some > > > > +particular address, while the hardware takes the requests directly from the > > > > +address and send feedback accordingly. > > > > + > > > > +The format of the queue may differ from hardware to hardware. But the > > > > +application need not to make any system call for the communication. > > > > + > > > > +*WarpDrive* tries to create a shared virtual address space for all involved > > > > +accelerators. Within this space, the requests sent to queue can refer to any > > > > +virtual address, which will be valid to the application and all involved > > > > +accelerators. > > > > + > > > > +The name *WarpDrive* is simply a cool and general name meaning the framework > > > > +makes the application faster. It includes general user library, kernel > > > > +management module and drivers for the hardware. In kernel, the management > > > > +module is called *uacce*, meaning "Unified/User-space-access-intended > > > > +Accelerator Framework". > > > > + > > > > + > > > > +How does it work > > > > +================ > > > > + > > > > +*WarpDrive* uses *mmap* and *IOMMU* to play the trick. 
> > > > + > > > > +*Uacce* creates a chrdev for the device registered to it. A "queue" will be > > > > +created when the chrdev is opened. The application access the queue by mmap > > > > +different address region of the queue file. > > > > + > > > > +The following figure demonstrated the queue file address space: > > > > + > > > > +.. image:: wd_q_addr_space.svg > > > > + :alt: WarpDrive Queue Address Space > > > > + > > > > +The first region of the space, device region, is used for the application to > > > > +write request or read answer to or from the hardware. > > > > + > > > > +Normally, there can be three types of device regions mmio and memory regions. > > > > +It is recommended to use common memory for request/answer descriptors and use > > > > +the mmio space for device notification, such as doorbell. But of course, this > > > > +is all up to the interface designer. > > > > + > > > > +There can be two types of device memory regions, kernel-only and user-shared. > > > > +This will be explained in the "kernel APIs" section. > > > > + > > > > +The Static Share Virtual Memory region is necessary only when the device IOMMU > > > > +does not support "Share Virtual Memory". This will be explained after the > > > > +*IOMMU* idea. > > > > + > > > > + > > > > +Architecture > > > > +------------ > > > > + > > > > +The full *WarpDrive* architecture is represented in the following class > > > > +diagram: > > > > + > > > > +.. 
image:: wd-arch.svg > > > > + :alt: WarpDrive Architecture > > > > + > > > > + > > > > +The user API > > > > +------------ > > > > + > > > > +We adopt a polling style interface in the user space: :: > > > > + > > > > + int wd_request_queue(struct wd_queue *q); > > > > + void wd_release_queue(struct wd_queue *q); > > > > + > > > > + int wd_send(struct wd_queue *q, void *req); > > > > + int wd_recv(struct wd_queue *q, void **req); > > > > + int wd_recv_sync(struct wd_queue *q, void **req); > > > > + void wd_flush(struct wd_queue *q); > > > > + > > > > +wd_recv_sync() is a wrapper to its non-sync version. It will trapped into > > > > +kernel and waits until the queue become available. > > > > + > > > > +If the queue do not support SVA/SVM. The following helper function > > > > +can be used to create Static Virtual Share Memory: :: > > > > + > > > > + void *wd_preserve_share_memory(struct wd_queue *q, size_t size); > > > > + > > > > +The user API is not mandatory. It is simply a suggestion and hint what the > > > > +kernel interface is supposed to support. > > > > + > > > > + > > > > +The user driver > > > > +--------------- > > > > + > > > > +The queue file mmap space will need a user driver to wrap the communication > > > > +protocol. *UACCE* provides some attributes in sysfs for the user driver to > > > > +match the right accelerator accordingly. 
> > > > +
> > > > +The *UACCE* device attributes are under the following directory:
> > > > +
> > > > +/sys/class/uacce/<dev-name>/params
> > > > +
> > > > +The following attributes are supported:
> > > > +
> > > > +nr_queue_remained (ro)
> > > > +        number of queues remaining
> > > > +
> > > > +api_version (ro)
> > > > +        a string to identify the queue mmap space format and its version
> > > > +
> > > > +device_attr (ro)
> > > > +        attributes of the device, see UACCE_DEV_xxx flag defined in uacce.h
> > > > +
> > > > +numa_node (ro)
> > > > +        id of numa node
> > > > +
> > > > +priority (rw)
> > > > +        Priority of the device, bigger is higher
> > > > +
> > > > +(This is not yet implemented in RFC version)
> > > > +
> > > > +
> > > > +The kernel API
> > > > +--------------
> > > > +
> > > > +The *uacce* kernel API is defined in uacce.h. If the hardware supports SVM/SVA,
> > > > +the driver needs only the following API functions: ::
> > > > +
> > > > +        int uacce_register(uacce);
> > > > +        void uacce_unregister(uacce);
> > > > +        void uacce_wake_up(q);
> > > > +
> > > > +*uacce_wake_up* is used to notify the process that calls epoll() on the queue
> > > > +file.
> > > > +
> > > > +According to the IOMMU capability, *uacce* categorizes the devices as follows:
> > > > +
> > > > +UACCE_DEV_NOIOMMU
> > > > +        The device has no IOMMU. The user process cannot use VA on the hardware.
> > > > +        This mode is not recommended.
> > > > +
> > > > +UACCE_DEV_SVA (UACCE_DEV_PASID | UACCE_DEV_FAULT_FROM_DEV)
> > > > +        The device has an IOMMU which can share the same page table with the
> > > > +        user process
> > > > +
> > > > +UACCE_DEV_SHARE_DOMAIN
> > > > +        The device has an IOMMU which supports neither multiple page tables
> > > > +        nor device page faults
> > > > +
> > > > +If the device works in a mode other than UACCE_DEV_NOIOMMU, *uacce* will set
> > > > +its IOMMU to IOMMU_DOMAIN_UNMANAGED. So the driver must not use any kernel
> > > > +DMA API but the following ones from *uacce* instead: ::
> > > > +
> > > > +        uacce_dma_map(q, va, size, prot);
> > > > +        uacce_dma_unmap(q, va, size, prot);
> > > > +
> > > > +*uacce_dma_map/unmap* is valid only for UACCE_DEV_SVA devices. It creates a
> > > > +particular PASID and page table for the kernel in the IOMMU (not yet
> > > > +implemented in the RFC).
> > > > +
> > > > +For the UACCE_DEV_SHARE_DOMAIN device, uacce_dma_map/unmap is not valid.
> > > > +*Uacce* calls back start_queue only when the DUS and DKO regions are mmapped.
> > > > +The accelerator driver must use those dma buffers, via uacce_queue->qfrs[], in
> > > > +the start_queue callback. The size of the queue file region is defined by
> > > > +uacce->ops->qf_pg_start[].
> > > > +
> > > > +We have to do it this way because most current IOMMUs cannot support kernel
> > > > +and user virtual addresses at the same time. So we have to let them both
> > > > +share the same user virtual address space.
> > > > +
> > > > +If the device has to support kernel and user at the same time, both the kernel
> > > > +and the user should use these DMA APIs. This is not convenient. A better
> > > > +solution is to change the future DMA/IOMMU design to let them separate the
> > > > +address space between the user and kernel space. But this is not going to
> > > > +happen in a short time.
> > > > +
> > > > +
> > > > +Multiple processes support
> > > > +==========================
> > > > +
> > > > +In the latest mainline kernel (4.19) when this document was written, the IOMMU
> > > > +subsystem does not support multiple process page tables yet.
> > > > +
> > > > +Most IOMMU hardware implementations support multi-process with the concept
> > > > +of PASID. But they may use different names, e.g. it is called sub-stream-id in
> > > > +the SMMU of ARM. With PASID or a similar design, multiple page tables can be
> > > > +added to the IOMMU and referred to by PASID.
> > > > +
> > > > +*JPB* has a patchset to enable this[1]_. We have tested it with our hardware
> > > > +(which is known as *D06*). It works well. *WarpDrive* relies on it to support
> > > > +UACCE_DEV_SVA. If it is not enabled, *WarpDrive* can still work, but it
> > > > +supports only one process: the device will be set to UACCE_DEV_SHARE_DOMAIN
> > > > +even if it is set to UACCE_DEV_SVA initially.
> > > > +
> > > > +Static Share Virtual Memory is mainly used by UACCE_DEV_SHARE_DOMAIN devices.
> > > > +
> > > > +
> > > > +Legacy Mode Support
> > > > +===================
> > > > +For hardware without an IOMMU, WarpDrive can still work; the only problem is
> > > > +that VA cannot be used in the device. The driver should adopt another strategy
> > > > +for the shared memory. It is only for testing, and not recommended.
> > > > +
> > > > +
> > > > +The Fork Scenario
> > > > +=================
> > > > +For a process with allocated queues and shared memory, what happens if it forks
> > > > +a child?
> > > > +
> > > > +The fd of the queue will be duplicated on fork, so the child can send requests
> > > > +to the same queue as its parent. But requests sent from processes other than
> > > > +the one that opened the queue will be blocked.
> > > > +
> > > > +It is recommended to add O_CLOEXEC to the queue file.
> > > > +
> > > > +The queue mmap space has VM_DONTCOPY set in its VMA. So the child will lose
> > > > +all those VMAs.
> > > > +
> > > > +This is why *WarpDrive* does not adopt the mode used in *VFIO* and *InfiniBand*.
> > > > +Both solutions can set any user pointer for hardware sharing. But they cannot
> > > > +support fork while the dma is in progress. Otherwise the "Copy-On-Write"
> > > > +procedure will make the parent process lose its physical pages.
> > > > +
> > > > +
> > > > +The Sample Code
> > > > +===============
> > > > +There is a sample user land implementation with a simple driver for Hisilicon
> > > > +Hi1620 ZIP Accelerator.
> > > > +
> > > > +To test, do the following in samples/warpdrive (for the case of PC host): ::
> > > > +
> > > > +        ./autogen.sh
> > > > +        ./conf.sh # or simply ./configure if you build on target system
> > > > +        make
> > > > +
> > > > +Then you can get test_hisi_zip in the test subdirectory. Copy it to the target
> > > > +system and make sure the hisi_zip driver is enabled (the major and minor of
> > > > +the uacce chrdev can be obtained from dmesg or sysfs), and run: ::
> > > > +
> > > > +        mknod /dev/ua1 c <major> <minor>
> > > > +        test/test_hisi_zip -z < data > data.zip
> > > > +        test/test_hisi_zip -g < data > data.gzip
> > > > +
> > > > +
> > > > +References
> > > > +==========
> > > > +.. [1] https://patchwork.kernel.org/patch/10394851/
> > > > +
> > > > +.. vim: tw=78
[...]
> > > > --
> > > > 2.17.1
> > > >
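
As an aside for readers of the quoted document, the polling-style user API it lists might be exercised roughly as follows. This is a minimal sketch and not the WarpDrive library: `struct wd_queue` here is a hypothetical in-memory ring standing in for a hardware queue, the return conventions of `wd_send()` and `wd_recv()` (0 or a negative errno for send, 1 or 0 for recv) are assumptions, and `poll_one()` and `MOCK_DEPTH` are illustrative names that do not appear in the RFC.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MOCK_DEPTH 4 /* hypothetical ring depth, not from the RFC */

/* Hypothetical stand-in: the RFC does not show the real struct wd_queue. */
struct wd_queue {
	void *ring[MOCK_DEPTH];
	unsigned int head; /* next slot wd_recv() reads */
	unsigned int tail; /* next slot wd_send() fills */
};

/* Mock: a real implementation would open the uacce chrdev and mmap it. */
int wd_request_queue(struct wd_queue *q)
{
	q->head = 0;
	q->tail = 0;
	return 0;
}

/* Mock: drop anything still queued. */
void wd_release_queue(struct wd_queue *q)
{
	q->head = 0;
	q->tail = 0;
}

/* Mock: queue a request with no system call; -EBUSY when the ring is full. */
int wd_send(struct wd_queue *q, void *req)
{
	if (q->tail - q->head == MOCK_DEPTH)
		return -EBUSY;
	q->ring[q->tail % MOCK_DEPTH] = req;
	q->tail++;
	return 0;
}

/* Mock: the "device" echoes requests back; 1 if one was fetched, else 0. */
int wd_recv(struct wd_queue *q, void **req)
{
	if (q->head == q->tail)
		return 0;
	*req = q->ring[q->head % MOCK_DEPTH];
	q->head++;
	return 1;
}

/* Illustrative helper: poll at most max_polls times for one response. */
int poll_one(struct wd_queue *q, void **resp, int max_polls)
{
	assert(q != NULL && resp != NULL);
	for (int i = 0; i < max_polls; i++) {
		if (wd_recv(q, resp) == 1)
			return 1;
	}
	return -ETIMEDOUT;
}
```

The point of the shape is that both the send and the poll touch only memory the process already has mapped. A caller that does not want to spin would presumably fall back to something like `wd_recv_sync()`, which the document describes as waiting in the kernel.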
On Mon, Nov 19, 2018 at 05:14:05PM +0800, Kenneth Lee wrote:
> Date: Mon, 19 Nov 2018 17:14:05 +0800
> From: Kenneth Lee <liguozhu@hisilicon.com>
> To: Leon Romanovsky <leon@kernel.org>
> CC: Tim Sell <timothy.sell@unisys.com>, linux-doc@vger.kernel.org,
>  Alexander Shishkin <alexander.shishkin@linux.intel.com>, Zaibo Xu
>  <xuzaibo@huawei.com>, zhangfei.gao@foxmail.com, linuxarm@huawei.com,
>  haojian.zhuang@linaro.org, Christoph Lameter <cl@linux.com>, Hao Fang
>  <fanghao11@huawei.com>, Gavin Schenk <g.schenk@eckelmann.de>, RDMA mailing
>  list <linux-rdma@vger.kernel.org>, Vinod Koul <vkoul@kernel.org>, Jason
>  Gunthorpe <jgg@ziepe.ca>, Doug Ledford <dledford@redhat.com>, Uwe
>  Kleine-König <u.kleine-koenig@pengutronix.de>, David Kershner
>  <david.kershner@unisys.com>, Kenneth Lee <nek.in.cn@gmail.com>, Johan
>  Hovold <johan@kernel.org>, Cyrille Pitchen
>  <cyrille.pitchen@free-electrons.com>, Sagar Dharia
>  <sdharia@codeaurora.org>, Jens Axboe <axboe@kernel.dk>,
>  guodong.xu@linaro.org, linux-netdev <netdev@vger.kernel.org>, Randy Dunlap
>  <rdunlap@infradead.org>, linux-kernel@vger.kernel.org, Zhou Wang
>  <wangzhou1@hisilicon.com>, linux-crypto@vger.kernel.org, Philippe
>  Ombredanne <pombredanne@nexb.com>, Sanyog Kale <sanyog.r.kale@intel.com>,
>  "David S. Miller" <davem@davemloft.net>,
>  linux-accelerators@lists.ozlabs.org
> Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> User-Agent: Mutt/1.5.21 (2010-09-15)
> Message-ID: <20181119091405.GE157308@Turing-Arch-b>
>
> On Thu, Nov 15, 2018 at 04:54:55PM +0200, Leon Romanovsky wrote:
> > Message-ID: <20181115145455.GN3759@mtr-leonro.mtl.com>
> >
> > On Thu, Nov 15, 2018 at 04:51:09PM +0800, Kenneth Lee wrote:
> > > On Wed, Nov 14, 2018 at 06:00:17PM +0200, Leon Romanovsky wrote:
> > > > Message-ID: <20181114160017.GI3759@mtr-leonro.mtl.com>
> > > >
> > > > On Wed, Nov 14, 2018 at 10:58:09AM +0800, Kenneth Lee wrote:
> > > > >
> > > > > On 2018/11/13 at 8:23 AM, Leon Romanovsky wrote:
> > > > > > On Mon, Nov 12, 2018 at 03:58:02PM +0800, Kenneth Lee wrote:
> > > > > > > From: Kenneth Lee <liguozhu@hisilicon.com>
> > > > > > >
> > > > > > > WarpDrive is a general accelerator framework for the user application to
> > > > > > > access the hardware without going through the kernel in data path.
> > > > > > >
> > > > > > > The kernel component that provides the kernel facility for drivers to
> > > > > > > expose the user interface is called uacce. It is a short name for
> > > > > > > "Unified/User-space-access-intended Accelerator Framework".
> > > > > > >
> > > > > > > This patch adds a document to explain how it works.
> > > > > > + RDMA and netdev folks
> > > > > >
> > > > > > Sorry to be late in the game. I don't see the other patches, but from
> > > > > > the description below it seems like you are reinventing the RDMA verbs
> > > > > > model. I have a hard time seeing the differences between the proposed
> > > > > > framework and what is already implemented in drivers/infiniband/* for
> > > > > > the kernel space and in https://github.com/linux-rdma/rdma-core/ for the
> > > > > > user space parts.
> > > > >
> > > > > Thanks Leon,
> > > > >
> > > > > Yes, we are trying to solve a problem similar to RDMA's. We also learned a
> > > > > lot from the existing code of RDMA. But we have to make a new one because
> > > > > we cannot register accelerators such as AI operation, encryption or
> > > > > compression to the RDMA framework:)
> > > >
> > > > Assuming that you did everything right and still failed to use the RDMA
> > > > framework, you were supposed to fix it and not to reinvent a new, exactly
> > > > identical one. That is how we develop the kernel, by reusing existing code.
> > >
> > > Yes, but we don't force other systems such as NIC or GPU into RDMA, do we?
> >
> > You are not introducing a new NIC or GPU, but proposing another interface to
> > directly access HW memory and bypass the kernel for the data path. This is
> > the whole idea of RDMA and this is why it is already present in the kernel.
> >
> > The various hardware devices supported in our stack allow a ton of crazy
> > stuff, including GPU interconnections and NIC functionalities.
>
> Yes. We don't want to reinvent the wheel. That is why we did it behind VFIO in
> RFC v1 and v2. But finally we were persuaded by Mr. Jerome Glisse that VFIO was
> not a good place to solve the problem.
>
> And currently, as you see, IB is bound to devices doing RDMA. The register
> function, ib_register_device(), hints that it is a netdev (get_netdev()
> callback); it knows about gid, pkey, and Memory Window. IB is not simply an
> address space management framework. And verbs to IB are not transparent. If we
> start to add compression/decompression, AI (RNN, CNN stuff) operations, and
> encryption/decryption to the verbs set, it will become very complex. Or maybe
> I misunderstand the IB idea? But I don't see any compression hardware
> integrated in the mainline kernel. Could you directly point out which one I
> can use as a reference?
>
> > >
> > > I assume you would not agree to register a zip accelerator to infiniband? :)
> >
> > The "infiniband" name in "drivers/infiniband/" is a legacy one, and the
> > current code supports IB, RoCE, iWARP and OmniPath as transport layers.
> > For a long time, we wanted to rename that folder to "drivers/rdma",
> > but didn't find enough brave men/women to do it, due to the backport mess
> > for such a move.
> >
> > The addition of a zip accelerator to RDMA is possible and depends on how
> > you will model such new functionality - new driver, or maybe new ULP.
> >
> > >
> > > Further, I don't think it is wise to break an existing system (RDMA) to
> > > fulfill a totally new scenario. The better choice is to let them run in
> > > parallel for some time and try to merge them accordingly.
> >
> > Awesome, so please run your code out-of-tree for now and once you are ready
> > for submission let's try to merge it.
>
> Yes, yes. We know trust needs time to gain. But the fact is that no
> accelerator user driver can be added to the mainline kernel. We should raise
> the topic from time to time, so as to help the communication and close the
> gap, right?
>
> We are also open to cooperating with IB to do it within the IB framework. But
> please let me know where to start. I feel it is quite weird to make an
> ib_register_device for a zip or RSA accelerator.
>
> > >
> > > > >
> > > > > Another problem we tried to address is the way to pin the memory for dma
> > > > > operation. The RDMA way to pin the memory cannot avoid the page loss due
> > > > > to copy-on-write operations while the memory is used by the device. This
> > > > > may not be important to the RDMA library. But it is important to
> > > > > accelerators.
> > > >
> > > > Such support exists in drivers/infiniband/ from late 2014 and
> > > > it is called ODP (on demand paging).
> > >
> > > I reviewed ODP and I think it is a solution bound to infiniband. It is part
> > > of MR semantics and requires an infiniband-specific hook
> > > (ucontext->invalidate_range()). And the hook requires the device to be able
> > > to stop using the page for a while for the copying. It is ok for infiniband
> > > (actually, only mlx5 uses it). I don't think most accelerators can support
> > > this mode. But WarpDrive works fully on top of the IOMMU interface, so it
> > > does not have this limitation.
> >
> > 1. It has nothing to do with infiniband.
>
> But it must be an ib_dev first.
>
> > 2. MR and ucontext are verbs semantics and are needed to ensure that host
> > memory exposed to the user is properly protected from a security point of
> > view.
> > 3. "stop using the page for a while for the copying" - I don't fully
> > understand this claim, maybe this article will help you to better
> > describe it: https://lwn.net/Articles/753027/
>
> This topic was discussed in RFCv2. The key problem here is this:
>
> The device needs to hold the memory for its own calculation, but the
> CPU/software wants to stop it for a while for synchronizing with disk or COW.
>
> If the hardware supports SVM/SVA (Shared Virtual Memory/Address), it is easy:
> the device shares the page table with the CPU, and the device will raise a
> page fault when the CPU downgrades the PTE to read-only.
>
> If the hardware cannot share the page table with the CPU, we then need to have
> some way to change the device page table. This is what happens in ODP. It
> invalidates the page table in the device upon the mmu_notifier callback. But
> this cannot solve the COW problem: suppose the user process A shares a page P
> with the device, A forks a new process B, and A continues to write to the
> page. By COW, the process B will keep the page P, while A will get a new page
> P'. But you have no way to let the device know it should use P' rather than P.
>
> This may be OK for RDMA applications, because RDMA is a big thing and we can
> ask the programmer to avoid the situation. But for an accelerator, I don't
> think we can ask a programmer to care about this when using zlib.
>
> In WarpDrive/uacce, we make this simple. If you support IOMMU and it supports
> SVM/SVA, everything will be fine, just like ODP implicit mode. And you don't
> need to write any code for that, because it has been done by the IOMMU
> framework. If it does not, you have to use the kernel-allocated memory which
> has the same IOVA as the VA in user space. So we can still maintain a unified
> address space among the devices and the application.
>
> > 4. mlx5 supports ODP not because of being partially IB device,
> > but because HW performance oriented implementation is not an easy task.
> >
> > >
> > > >
> > > > >
> > > > > Hope this can help the understanding.
> > > >
> > > > Yes, it helped me a lot.
> > > > Now, I'm more than before convinced that this whole patchset shouldn't
> > > > exist in the first place.
> > >
> > > Then maybe you can tell me how I can register my accelerator to the user
> > > space?
> >
> > Write kernel driver and write user space part of it.
> > https://github.com/linux-rdma/rdma-core/
> >
> > I have no doubts that your colleagues who wrote and maintain
> > drivers/infiniband/hw/hns driver know best how to do it.
>
> They did it very successfully.
>
> >
> > Thanks
> >
> > >
> > > >
> > > > To be clear, NAK.
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > > Cheers
> > > > >
> > > > > >
> > > > > > Hard NAK from RDMA side.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > > [...]
I don't know whether Mr. Jerome Glisse is on the list. I think I should CC him
out of respect for his help on the last RFC.

- Kenneth
On Mon, Nov 19, 2018 at 05:19:10PM +0800, Kenneth Lee wrote:
> On Mon, Nov 19, 2018 at 05:14:05PM +0800, Kenneth Lee wrote:
> > On Thu, Nov 15, 2018 at 04:54:55PM +0200, Leon Romanovsky wrote:
> > > On Thu, Nov 15, 2018 at 04:51:09PM +0800, Kenneth Lee wrote:
> > > > On Wed, Nov 14, 2018 at 06:00:17PM +0200, Leon Romanovsky wrote:
> > > > > On Wed, Nov 14, 2018 at 10:58:09AM +0800, Kenneth Lee wrote:
> > > > > > On 2018/11/13 at 8:23 AM, Leon Romanovsky wrote:
> > > > > > > On Mon, Nov 12, 2018 at 03:58:02PM +0800, Kenneth Lee wrote:
> > > > > > > > From: Kenneth Lee <liguozhu@hisilicon.com>
> > > > > > > >
> > > > > > > > WarpDrive is a general accelerator framework for the user application to
> > > > > > > > access the hardware without going through the kernel in data path.
> > > > > > > >
> > > > > > > > The kernel component to provide kernel facility to driver for expose the
> > > > > > > > user interface is called uacce. It a short name for
> > > > > > > > "Unified/User-space-access-intended Accelerator Framework".
> > > > > > > >
> > > > > > > > This patch add document to explain how it works.
> > > > > > > + RDMA and netdev folks
> > > > > > >
> > > > > > > Sorry, to be late in the game, I don't see other patches, but from
> > > > > > > the description below it seems like you are reinventing RDMA verbs
> > > > > > > model. I have hard time to see the differences in the proposed
> > > > > > > framework to already implemented in drivers/infiniband/* for the kernel
> > > > > > > space and for the https://github.com/linux-rdma/rdma-core/ for the user
> > > > > > > space parts.
> > > > > >
> > > > > > Thanks Leon,
> > > > > >
> > > > > > Yes, we tried to solve similar problem in RDMA. We also learned a lot from
> > > > > > the exist code of RDMA. But we have to make a new one because we cannot
> > > > > > register accelerators such as AI operation, encryption or compression to the
> > > > > > RDMA framework:)
> > > > >
> > > > > Assuming that you did everything right and still failed to use RDMA
> > > > > framework, you was supposed to fix it and not to reinvent new exactly
> > > > > same one. It is how we develop kernel, by reusing existing code.
> > > >
> > > > Yes, but we don't force other system such as NIC or GPU into RDMA, do we?
> > >
> > > You don't introduce new NIC or GPU, but proposing another interface to
> > > directly access HW memory and bypass kernel for the data path. This is
> > > whole idea of RDMA and this is why it is already present in the kernel.
> > >
> > > Various hardware devices are supported in our stack allow a ton of crazy
> > > stuff, including GPUs interconnections and NIC functionalities.
> >
> > Yes. We don't want to invent new wheel. That is why we did it behind VFIO in RFC
> > v1 and v2. But finally we were persuaded by Mr. Jerome Glisse that VFIO was not
> > a good place to solve the problem.

I saw a couple of his responses, he constantly said to you that you are
reinventing the wheel.
https://lore.kernel.org/lkml/20180904150019.GA4024@redhat.com/

> >
> > And currently, as you see, IB is bound with devices doing RDMA. The register
> > function, ib_register_device() hint that it is a netdev (get_netdev() callback), it know
> > about gid, pkey, and Memory Window. IB is not simply a address space management
> > framework. And verbs to IB are not transparent. If we start to add
> > compression/decompression, AI (RNN, CNN stuff) operations, and encryption/decryption
> > to the verbs set. It will become very complexity. Or maybe I misunderstand the
> > IB idea? But I don't see compression hardware is integrated in the mainline
> > Kernel. Could you directly point out which one I can used as a reference?
> >

I strongly advise you to read the code, not all drivers are implementing
gids, pkeys and get_netdev() callback.

Yes, you are misunderstanding drivers/infiniband subsystem. We have
plenty options to expose APIs to the user space applications, starting
from standard verbs API and ending with private objects which are
understandable by specific device/driver.

IB stack provides secure FD to access device, by creating context,
after that you can send direct commands to the FW (see mlx5 DEVX
or hfi1) in sane way.

So actually, you will need to register your device, declare your own
set of objects (similar to mlx5 include/uapi/rdma/mlx5_user_ioctl_*.h).

In regards to reference of compression hardware, I don't have.
But there is an example of how T10-DIF can be implemented in verbs
layer:
https://www.openfabrics.org/images/2018workshop/presentations/307_TOved_T10-DIFOffload.pdf
Or IPsec crypto:
https://www.spinics.net/lists/linux-rdma/msg48906.html

> > > >
> > > > I assume you would not agree to register a zip accelerator to infiniband? :)
> > >
> > > "infiniband" name in the "drivers/infiniband/" is legacy one and the
> > > current code supports IB, RoCE, iWARP and OmniPath as a transport layers.
> > > For a long time, we wanted to rename that folder to be "drivers/rdma",
> > > but didn't find enough brave men/women to do it, due to backport mess
> > > for such move.
> > >
> > > The addition of zip accelerator to RDMA is possible and depends on how
> > > you will model such new functionality - new driver, or maybe new ULP.
> > >
> > > >
> > > > Further, I don't think it is wise to break an exist system (RDMA) to fulfill a
> > > > totally new scenario. The better choice is to let them run in parallel for some
> > > > time and try to merge them accordingly.
> > >
> > > Awesome, so please run your code out-of-tree for now and once you are ready
> > > for submission let's try to merge it.
> >
> > Yes, yes. We know trust need time to gain. But the fact is that there is no
> > accelerator user driver can be added to mainline kernel. We should raise the
> > topic time to time. So to help the communication to fix the gap, right?
> >
> > We are also opened to cooperate with IB to do it within the IB framework. But
> > please let me know where to start. I feel it is quite weird to make a
> > ib_register_device for a zip or RSA accelerator.

Most of ib_ prefixes in drivers/infiniband/ are legacy names. You can
rename them to be rdma_register_device() if it helps.

So from implementation point of view, as I wrote above.
Create minimal driver to register, expose MR to user space, add your own
objects and capabilities through our new KABI and implement user space
part in github.com/linux-rdma/rdma-core.

> > > > > >
> > > > > > Another problem we tried to address is the way to pin the memory for dma
> > > > > > operation. The RDMA way to pin the memory cannot avoid the page lost due to
> > > > > > copy-on-write operation during the memory is used by the device. This may
> > > > > > not be important to RDMA library. But it is important to accelerator.
> > > > >
> > > > > Such support exists in drivers/infiniband/ from late 2014 and
> > > > > it is called ODP (on demand paging).
> > > >
> > > > I reviewed ODP and I think it is a solution bound to infiniband. It is part of
> > > > MR semantics and required a infiniband specific hook
> > > > (ucontext->invalidate_range()). And the hook requires the device to be able to
> > > > stop using the page for a while for the copying. It is ok for infiniband
> > > > (actually, only mlx5 uses it). I don't think most accelerators can support
> > > > this mode. But WarpDrive works fully on top of IOMMU interface, it has no this
> > > > limitation.
> > >
> > > 1. It has nothing to do with infiniband.
> >
> > But it must be a ib_dev first.

It is just a name.

> >
> > > 2. MR and ucontext are verbs semantics and needed to ensure that host
> > > memory exposed to user is properly protected from security point of view.
> > > 3. "stop using the page for a while for the copying" - I'm not fully
> > > understand this claim, maybe this article will help you to better
> > > describe : https://lwn.net/Articles/753027/
> >
> > This topic was being discussed in RFCv2. The key problem here is that:
> >
> > The device need to hold the memory for its own calculation, but the CPU/software
> > want to stop it for a while for synchronizing with disk or COW.
> >
> > If the hardware support SVM/SVA (Shared Virtual Memory/Address), it is easy, the
> > device share page table with CPU, the device will raise a page fault when the
> > CPU downgrade the PTE to read-only.
> >
> > If the hardware cannot share page table with the CPU, we then need to have
> > some way to change the device page table. This is what happen in ODP. It
> > invalidates the page table in device upon mmu_notifier call back. But this cannot
> > solve the COW problem: if the user process A share a page P with device, and A
> > forks a new process B, and it continue to write to the page. By COW, the
> > process B will keep the page P, while A will get a new page P'. But you have
> > no way to let the device know it should use P' rather than P.

I didn't hear about such issue and we supported fork for a long time.

> >
> > This may be OK for RDMA application. Because RDMA is a big thing and we can ask
> > the programmer to avoid the situation. But for a accelerator, I don't think we
> > can ask a programmer to care for this when use a zlib.
> >
> > In WarpDrive/uacce, we make this simple. If you support IOMMU and it support
> > SVM/SVA. Everything will be fine just like ODP implicit mode. And you don't need
> > to write any code for that. Because it has been done by IOMMU framework. If it
> > does not, you have to use the kernel allocated memory which has the same IOVA as
> > the VA in user space. So we can still maintain a unify address space among the
> > devices and the application.
> >
> > > 4. mlx5 supports ODP not because of being partially IB device,
> > > but because HW performance oriented implementation is not an easy task.
> > >
> > > > > >
> > > > > > Hope this can help the understanding.
> > > > >
> > > > > Yes, it helped me a lot.
> > > > > Now, I'm more than before convinced that this whole patchset shouldn't
> > > > > exist in the first place.
> > > >
> > > > Then maybe you can tell me how I can register my accelerator to the user space?
> > >
> > > Write kernel driver and write user space part of it.
> > > https://github.com/linux-rdma/rdma-core/
> > >
> > > I have no doubts that your colleagues who wrote and maintain
> > > drivers/infiniband/hw/hns driver know best how to do it.
> > > They did it very successfully.
> > >
> > > Thanks
> > >
> > > > >
> > > > > To be clear, NAK.
> > > > >
> > > > > Thanks
> > > > >
> > > > > >
> > > > > > Cheers
> > > > > >
> > > > > > > Hard NAK from RDMA side.
On Mon, Nov 19, 2018 at 11:48:54AM -0500, Jerome Glisse wrote:

> Just to comment on this, any infiniband driver which use umem and do
> not have ODP (here ODP for me means listening to mmu notifier so all
> infiniband driver except mlx5) will be affected by same issue AFAICT.
>
> AFAICT there is no special thing happening after fork() inside any of
> those driver. So if parent create a umem mr before fork() and program
> hardware with it then after fork() the parent might start using new
> page for the umem range while the old memory is use by the child. The
> reverse is also true (parent using old memory and child new memory)
> bottom line you can not predict which memory the child or the parent
> will use for the range after fork().
>
> So no matter what you consider the child or the parent, what the hw
> will use for the mr is unlikely to match what the CPU use for the
> same virtual address. In other word:
>
> Before fork:
>     CPU parent: virtual addr ptr1 -> physical address = 0xCAFE
>     HARDWARE:   virtual addr ptr1 -> physical address = 0xCAFE
>
> Case 1:
>     CPU parent: virtual addr ptr1 -> physical address = 0xCAFE
>     CPU child:  virtual addr ptr1 -> physical address = 0xDEAD
>     HARDWARE:   virtual addr ptr1 -> physical address = 0xCAFE
>
> Case 2:
>     CPU parent: virtual addr ptr1 -> physical address = 0xBEEF
>     CPU child:  virtual addr ptr1 -> physical address = 0xCAFE
>     HARDWARE:   virtual addr ptr1 -> physical address = 0xCAFE

IIRC this is solved in IB by automatically calling
madvise(MADV_DONTFORK) before creating the MR.

MADV_DONTFORK
  .. This is useful to prevent copy-on-write semantics from changing the
  physical location of a page if the parent writes to it after a
  fork(2) ..

Jason
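The effect of madvise(MADV_DONTFORK) mentioned above can be observed from user space with a small Linux-only experiment. The sketch below is an illustration, not code taken from rdma-core: the function name `child_can_read_dontfork_range()` is hypothetical, and the probe assumes that write(2) from an address that is not mapped in the child fails with EFAULT instead of raising a signal.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for MADV_DONTFORK and MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical experiment: mark one anonymous page MADV_DONTFORK, fork,
 * and let the child try to use the page as the source of a write(2).
 * Returns 1 if the child could still read the page, 0 if the range was
 * missing in the child (EFAULT), and -1 if the experiment itself failed.
 */
int child_can_read_dontfork_range(void)
{
	long page = sysconf(_SC_PAGESIZE);
	int pipefd[2];
	int status = 0;
	int result = -1;
	char *buf;
	pid_t pid;

	if (page <= 0)
		return -1;

	buf = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;
	memset(buf, 0x5a, (size_t)page);
	assert(buf[0] == 0x5a);	/* the parent sees its own data */

	if (madvise(buf, (size_t)page, MADV_DONTFORK) != 0 ||
	    pipe(pipefd) != 0) {
		munmap(buf, (size_t)page);
		return -1;
	}

	pid = fork();
	if (pid == 0) {
		/* The kernel reads buf on our behalf; no SIGSEGV is raised. */
		ssize_t n = write(pipefd[1], buf, 1);

		_exit(n == 1 ? 1 : (errno == EFAULT ? 0 : 2));
	}
	if (pid > 0 && waitpid(pid, &status, 0) == pid &&
	    WIFEXITED(status) && WEXITSTATUS(status) != 2)
		result = WEXITSTATUS(status);

	close(pipefd[0]);
	close(pipefd[1]);
	munmap(buf, (size_t)page);
	return result;
}
```

On a Linux system this is expected to return 0: the marked range is simply not present in the child, so the parent keeps the only mapping of the pages that a device might be using.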
On Mon, Nov 19, 2018 at 01:42:16PM -0500, Jerome Glisse wrote:
> On Mon, Nov 19, 2018 at 11:27:52AM -0700, Jason Gunthorpe wrote:
> > On Mon, Nov 19, 2018 at 11:48:54AM -0500, Jerome Glisse wrote:
> >
> > [...]
> >
> > IIRC this is solved in IB by automatically calling
> > madvise(MADV_DONTFORK) before creating the MR.
> >
> > MADV_DONTFORK
> >   .. This is useful to prevent copy-on-write semantics from changing the
> >   physical location of a page if the parent writes to it after a
> >   fork(2) ..
>
> This would work around the issue but this is not transparent ie
> range marked with DONTFORK no longer behave as expected from the
> application point of view.
>
> Also it relies on userspace doing the right thing (which is not
> something i usually trust :)).

The good thing is that we have not seen anyone succeed in running the IB
stack without our user space, which does the right thing under the hood :).

>
> Cheers,
> Jérôme
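Jerome's "Case 1" and "Case 2" from earlier in the thread can be restated as a toy model. This is not kernel code: the "physical addresses" are just the labels used in the example, and the rule that the first process to write after fork() receives the fresh copy is a simplifying assumption made for illustration. All names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Which process touches the shared page first after fork() (assumption). */
enum first_writer { PARENT_WRITES_FIRST, CHILD_WRITES_FIRST };

/* Where virtual address ptr1 points for each party, as in the tables. */
struct mapping_view {
	uint32_t parent_pa;	/* page seen by the parent CPU */
	uint32_t child_pa;	/* page seen by the child CPU */
	uint32_t hw_pa;		/* page programmed into the device */
};

/*
 * Toy copy-on-write rule: the device keeps the page it was programmed
 * with before fork(); the first writer gets a fresh copy and the other
 * process keeps the original page.
 */
struct mapping_view after_fork_and_write(uint32_t pinned_pa,
					 uint32_t fresh_pa,
					 enum first_writer who)
{
	struct mapping_view v = { pinned_pa, pinned_pa, pinned_pa };

	if (who == PARENT_WRITES_FIRST)
		v.parent_pa = fresh_pa;
	else
		v.child_pa = fresh_pa;
	return v;
}

/* Non-zero while the device and the parent still share the same page. */
int hw_matches_parent(const struct mapping_view *v)
{
	return v->hw_pa == v->parent_pa;
}
```

With the labels from the thread, `after_fork_and_write(0xCAFE, 0xDEAD, CHILD_WRITES_FIRST)` reproduces Case 1 and `after_fork_and_write(0xCAFE, 0xBEEF, PARENT_WRITES_FIRST)` reproduces Case 2, where the device and the parent no longer agree on the page behind ptr1.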
On Mon, Nov 19, 2018 at 12:27:02PM -0700, Jason Gunthorpe wrote:
> On Mon, Nov 19, 2018 at 02:17:21PM -0500, Jerome Glisse wrote:
> > On Mon, Nov 19, 2018 at 11:53:33AM -0700, Jason Gunthorpe wrote:
> > > On Mon, Nov 19, 2018 at 01:42:16PM -0500, Jerome Glisse wrote:
> > > > On Mon, Nov 19, 2018 at 11:27:52AM -0700, Jason Gunthorpe wrote:
> > > > > On Mon, Nov 19, 2018 at 11:48:54AM -0500, Jerome Glisse wrote:
> > > > >
> > > > > [...]
> > > > >
> > > > > IIRC this is solved in IB by automatically calling
> > > > > madvise(MADV_DONTFORK) before creating the MR.
> > > > >
> > > > > [...]
> > > >
> > > > This would work around the issue but this is not transparent ie
> > > > range marked with DONTFORK no longer behave as expected from the
> > > > application point of view.
> > >
> > > Do you know what the difference is? The man page really gives no
> > > hint..
> > >
> > > Does it sometimes unmap the pages during fork?
> >
> > It is handled in kernel/fork.c look for DONTCOPY, basically it just
> > leave empty page table in the child process so child will have to
> > fault in new page. This also means that child will get 0 as initial
> > value for all memory address under DONTCOPY/DONTFORK which breaks
> > application expectation of what fork() do.
>
> Hum, I wonder why this API was selected then..

Because there is nothing else ? :)

> > > I actually wonder if the kernel is a bit broken here, we have the same
> > > problem with O_DIRECT and other stuff, right?
> >
> > No it is not, O_DIRECT is fine. The only corner case i can think
> > of with O_DIRECT is one thread launching an O_DIRECT that write
> > to private anonymous memory (other O_DIRECT case do not matter)
> > while another thread call fork() then what the child get can be
> > undefined ie either it get the data before the O_DIRECT finish
> > or it gets the result of the O_DIRECT. But this is really what
> > you should expect when doing such thing without synchronization.
> >
> > So O_DIRECT is fine.
>
> ?? How can O_DIRECT be fine but RDMA not? They use exactly the same
> get_user_pages flow, right? Can we do what O_DIRECT does in RDMA and
> be fine too?
>
> AFAIK the only difference is the length of the race window. You'd have
> to fork and fault during the shorter time O_DIRECT has get_user_pages
> open.

Well, in the O_DIRECT case there is only one page table, the CPU page
table, and it gets updated during fork(), so there is an ordering there
and the race window is small.

Moreover, programmers know that they can get in trouble if they do things
like fork() and don't synchronize their threads with each other. So while
some weird things can happen with O_DIRECT, it is unlikely (very small
race window) and if it happens it's well within the expected behavior.

For hardware the race window is the same as the process lifetime, so it
can be days, months, years ... Once the hardware has programmed its page
table it will never see any update (again mlx5 ODP is the exception
here). This is where the "issues" and weird behavior can arise.

Because you use DONTFORK you never see weird things happening. If you
were to comment out DONTFORK then RDMA in the parent might change data in
the child (or the other way around, ie RDMA in the child might change
data in the parent).

> > > Really, if I have a get_user_pages FOLL_WRITE on a page and we fork,
> > > then shouldn't the COW immediately be broken during the fork?
> > >
> > > The kernel can't guarantee that an ongoing DMA will not write to those
> > > pages, and it breaks the fork semantic to write to both processes.
> >
> > Fixing that would incur a high cost: need to grow struct page, need
> > to copy potentially gigabyte of memory during fork() ... this would be
> > a serious performance regression for many folks just to work around an
> > abuse of device driver. So i don't think anything on that front would
> > be welcome.
>
> Why? Keep track in each mm if there are any active get_user_pages
> FOLL_WRITE pages in the mm, if yes then sweep the VMAs and fix the
> issue for the FOLL_WRITE pages.

This has a cost and you don't want to do it for O_DIRECT. I am pretty
sure that any such patch to modify the fork() code path would be
rejected. At least i would not like it and would vote against it.

> John is already working on being able to detect pages under GUP, so it
> seems like a small step..

John is trying to fix serious bugs which can result in filesystem
corruption. It has a performance cost and thus i don't see that as
something we should pursue as a default solution. I posted patches to
remove get_user_page() from the GPU driver and i intend to remove as many
GUP as i can (for hardware that can do the right thing). To me it sounds
better to reward good hardware rather than punish everyone :)

> Since nearly all cases of fork don't have a GUP FOLL_WRITE active
> there would be no performance hit.
>
> > umem without proper ODP and VFIO are the only bad user i know of (for
> > VFIO you can argue that it is part of the API contract and thus that
> > it is not an abuse but it is not spell out loud in documentation). I
> > have been trying to push back on any people trying to push thing that
> > would make the same mistake or at least making sure they understand
> > what is happening.
>
> It is something we have to live with and support for the foreseeable
> future.

Yes for RDMA and VFIO, but i want to avoid any more new users, hence why
i push back on any solution that has the same issues.

> > What really need to happen is people fixing their hardware and do the
> > right thing (good software engineer versus evil hardware engineer ;))
>
> Even ODP is no panacea, there are performance problems. What we really
> need is CAPI like stuff, so you will tell Intel to redesign the CPU??
> :)

I agree

Jérôme
On Mon, Nov 19, 2018 at 02:46:32PM -0500, Jerome Glisse wrote:

> > ?? How can O_DIRECT be fine but RDMA not? They use exactly the same
> > get_user_pages flow, right? Can we do what O_DIRECT does in RDMA and
> > be fine too?
> >
> > AFAIK the only difference is the length of the race window. You'd have
> > to fork and fault during the shorter time O_DIRECT has get_user_pages
> > open.
>
> Well in O_DIRECT case there is only one page table, the CPU
> page table and it gets updated during fork() so there is an
> ordering there and the race window is small.

Not really, in O_DIRECT case there is another 'page table', we just
call it a DMA scatter/gather list and it is sent directly to the block
device's DMA HW. The sgl plays exactly the same role as the various HW
page list data structures that underlie RDMA MRs.

It is not a page table that matters here, it is if the DMA address of
the page is active for DMA on HW.

Like you say, the only difference is that the race is hopefully small
with O_DIRECT (though that is not really small, NVMeof for instance
has windows as large as connection timeouts, if you try hard enough)

So we probably can trigger this trouble with O_DIRECT and fork(), and
I would call it a bug :(

> > Why? Keep track in each mm if there are any active get_user_pages
> > FOLL_WRITE pages in the mm, if yes then sweep the VMAs and fix the
> > issue for the FOLL_WRITE pages.
>
> This has a cost and you don't want to do it for O_DIRECT. I am pretty
> sure that any such patch to modify fork() code path would be rejected.
> At least i would not like it and vote against.

I was thinking the incremental cost on top of what John is already
doing would be very small in the common case and only be triggered in
cases that matter (which apps should avoid anyhow).

Jason
On Mon, Nov 19, 2018 at 01:11:56PM -0700, Jason Gunthorpe wrote:
> On Mon, Nov 19, 2018 at 02:46:32PM -0500, Jerome Glisse wrote:
>
> > [...]
> >
> > Well in O_DIRECT case there is only one page table, the CPU
> > page table and it gets updated during fork() so there is an
> > ordering there and the race window is small.
>
> Not really, in O_DIRECT case there is another 'page table', we just
> call it a DMA scatter/gather list and it is sent directly to the block
> device's DMA HW. The sgl plays exactly the same role as the various HW
> page list data structures that underlie RDMA MRs.
>
> It is not a page table that matters here, it is if the DMA address of
> the page is active for DMA on HW.
>
> Like you say, the only difference is that the race is hopefully small
> with O_DIRECT (though that is not really small, NVMeof for instance
> has windows as large as connection timeouts, if you try hard enough)
>
> So we probably can trigger this trouble with O_DIRECT and fork(), and
> I would call it a bug :(

I can not think of any scenario that would be a bug with O_DIRECT.
Do you have one in mind? When you fork() and do other syscalls that
affect the memory of your process in another thread you should
expect inconsistent results. The kernel is not here to provide a fully
safe environment to the user; the user can shoot itself in the foot and
that's fine as long as it only affects the process itself and no one
else. We should not be in the business of making everything baby
proof :)

> > [...]
> >
> > This has a cost and you don't want to do it for O_DIRECT. I am pretty
> > sure that any such patch to modify fork() code path would be rejected.
> > At least i would not like it and vote against.
>
> I was thinking the incremental cost on top of what John is already
> doing would be very small in the common case and only be triggered in
> cases that matter (which apps should avoid anyhow).

What John is addressing has nothing to do with fork(); it has to do with
GUP and filesystem pages. More specifically, after page_mkclean() all
filesystems expect that the page content is stable (ie no one writes to
the page); with GUP and hardware (DIRECT_IO too) this is not necessarily
the case. So John is trying to fix that. Not trying to make fork() baby
proof AFAICT :)

I would rather keep saying that you should expect weird things with RDMA
and VFIO when doing fork() than try to work around this in the kernel.

Better behavior through hardware is what we should aim for (CAPI, ODP, ...).

Jérôme
On Mon, Nov 19, 2018 at 04:33:20PM -0500, Jerome Glisse wrote:
> On Mon, Nov 19, 2018 at 02:26:38PM -0700, Jason Gunthorpe wrote:
> > On Mon, Nov 19, 2018 at 03:26:15PM -0500, Jerome Glisse wrote:
> > > On Mon, Nov 19, 2018 at 01:11:56PM -0700, Jason Gunthorpe wrote:
> > > > On Mon, Nov 19, 2018 at 02:46:32PM -0500, Jerome Glisse wrote:
> > > >
> > > > [...]
> > > >
> > > > So we probably can trigger this trouble with O_DIRECT and fork(), and
> > > > I would call it a bug :(
> > >
> > > I can not think of any scenario that would be a bug with O_DIRECT.
> > > Do you have one in mind ? When you fork() and do other syscall that
> > > affect the memory of your process in another thread you should
> > > expect non consistent results. Kernel is not here to provide a fully
> > > safe environment to user, user can shoot itself in the foot and
> > > that's fine as long as it only affect the process itself and no one
> > > else. We should not be in the business of making everything baby
> > > proof :)
> >
> > Sure, I setup AIO with O_DIRECT and launch a read.
> >
> > Then I fork and dirty the READ target memory using the CPU in the
> > child.
> >
> > As you described in this case the fork will retain the physical page
> > that is undergoing O_DIRECT DMA, and the parent gets a new copy'd page.
> >
> > The DMA completes, and the child gets the DMA'd to page. The parent
> > gets an unchanged copy'd page.
> >
> > The parent gets the AIO completion, but can't see the data.
> >
> > I'd call that a bug with O_DIRECT. The only correct outcome is that
> > the parent will always see the O_DIRECT data. Fork should not cause
> > the *parent* to malfunction. I agree the child cannot make any
> > prediction what memory it will see.
> >
> > I assume the same flow is possible using threads and read()..
> >
> > It is really no different than the RDMA bug with fork.
> >
>
> Yes and that's expected behavior :) If you fork() and have anything
> still in flight at time of fork that can change your process address
> space (including data in it) then all bets are off.
>
> At least this is my reading of fork() syscall.

Not mine.. I can't think of anything else that would have this
behavior.

All traditional syscalls will properly dirty the pages of the
parent. ie if I call read() in a thread and do fork in another thread,
then not seeing the data after read() completes is clearly a bug. All
other syscalls are the same.

It is bonkers that opening the file with O_DIRECT would change this
basic behavior. I'm calling it a bug :)

Jason
On Mon, Nov 19, 2018 at 12:48:01PM +0200, Leon Romanovsky wrote:
> Date: Mon, 19 Nov 2018 12:48:01 +0200
> From: Leon Romanovsky <leon@kernel.org>
> To: Kenneth Lee <liguozhu@hisilicon.com>
> CC: Tim Sell <timothy.sell@unisys.com>, linux-doc@vger.kernel.org,
>  Alexander Shishkin <alexander.shishkin@linux.intel.com>, Zaibo Xu
>  <xuzaibo@huawei.com>, zhangfei.gao@foxmail.com, linuxarm@huawei.com,
>  haojian.zhuang@linaro.org, Christoph Lameter <cl@linux.com>, Hao Fang
>  <fanghao11@huawei.com>, Gavin Schenk <g.schenk@eckelmann.de>, RDMA mailing
>  list <linux-rdma@vger.kernel.org>, Vinod Koul <vkoul@kernel.org>, Jason
>  Gunthorpe <jgg@ziepe.ca>, Doug Ledford <dledford@redhat.com>, Uwe
>  Kleine-König <u.kleine-koenig@pengutronix.de>, David Kershner
>  <david.kershner@unisys.com>, Kenneth Lee <nek.in.cn@gmail.com>, Johan
>  Hovold <johan@kernel.org>, Cyrille Pitchen
>  <cyrille.pitchen@free-electrons.com>, Sagar Dharia
>  <sdharia@codeaurora.org>, Jens Axboe <axboe@kernel.dk>,
>  guodong.xu@linaro.org, linux-netdev <netdev@vger.kernel.org>, Randy Dunlap
>  <rdunlap@infradead.org>, linux-kernel@vger.kernel.org, Zhou Wang
>  <wangzhou1@hisilicon.com>, linux-crypto@vger.kernel.org, Philippe
>  Ombredanne <pombredanne@nexb.com>, Sanyog Kale <sanyog.r.kale@intel.com>,
>  "David S. Miller" <davem@davemloft.net>,
>  linux-accelerators@lists.ozlabs.org, Jerome Glisse <jglisse@redhat.com>
> Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> User-Agent: Mutt/1.10.1 (2018-07-13)
> Message-ID: <20181119104801.GF8268@mtr-leonro.mtl.com>
>
> On Mon, Nov 19, 2018 at 05:19:10PM +0800, Kenneth Lee wrote:
> > On Mon, Nov 19, 2018 at 05:14:05PM +0800, Kenneth Lee wrote:
> > > On Thu, Nov 15, 2018 at 04:54:55PM +0200, Leon Romanovsky wrote:
> > > > On Thu, Nov 15, 2018 at 04:51:09PM +0800, Kenneth Lee wrote:
> > > > > On Wed, Nov 14, 2018 at 06:00:17PM +0200, Leon Romanovsky wrote:
> > > > > > On Wed, Nov 14, 2018 at 10:58:09AM +0800, Kenneth Lee wrote:
> > > > > > >
> > > > > > > On 2018/11/13 at 8:23 AM, Leon Romanovsky wrote:
> > > > > > > > On Mon, Nov 12, 2018 at 03:58:02PM +0800, Kenneth Lee wrote:
> > > > > > > > > From: Kenneth Lee <liguozhu@hisilicon.com>
> > > > > > > > >
> > > > > > > > > WarpDrive is a general accelerator framework for the user application to
> > > > > > > > > access the hardware without going through the kernel in data path.
> > > > > > > > >
> > > > > > > > > The kernel component to provide kernel facility to driver for expose the
> > > > > > > > > user interface is called uacce. It a short name for
> > > > > > > > > "Unified/User-space-access-intended Accelerator Framework".
> > > > > > > > >
> > > > > > > > > This patch add document to explain how it works.
> > > > > > > > + RDMA and netdev folks
> > > > > > > >
> > > > > > > > Sorry, to be late in the game, I don't see other patches, but from
> > > > > > > > the description below it seems like you are reinventing RDMA verbs
> > > > > > > > model. I have hard time to see the differences in the proposed
> > > > > > > > framework to already implemented in drivers/infiniband/* for the kernel
> > > > > > > > space and for the https://github.com/linux-rdma/rdma-core/ for the user
> > > > > > > > space parts.
> > > > > > >
> > > > > > > Thanks Leon,
> > > > > > >
> > > > > > > Yes, we tried to solve similar problem in RDMA. We also learned a lot from
> > > > > > > the exist code of RDMA.
But we we have to make a new one because we cannot > > > > > > > register accelerators such as AI operation, encryption or compression to the > > > > > > > RDMA framework:) > > > > > > > > > > > > Assuming that you did everything right and still failed to use RDMA > > > > > > framework, you was supposed to fix it and not to reinvent new exactly > > > > > > same one. It is how we develop kernel, by reusing existing code. > > > > > > > > > > Yes, but we don't force other system such as NIC or GPU into RDMA, do we? > > > > > > > > You don't introduce new NIC or GPU, but proposing another interface to > > > > directly access HW memory and bypass kernel for the data path. This is > > > > whole idea of RDMA and this is why it is already present in the kernel. > > > > > > > > Various hardware devices are supported in our stack allow a ton of crazy > > > > stuff, including GPUs interconnections and NIC functionalities. > > > > > > Yes. We don't want to invent new wheel. That is why we did it behind VFIO in RFC > > > v1 and v2. But finally we were persuaded by Mr. Jerome Glisse that VFIO was not > > > a good place to solve the problem. > > I saw a couple of his responses, he constantly said to you that you are > reinventing the wheel. > https://lore.kernel.org/lkml/20180904150019.GA4024@redhat.com/ > No. I think he asked me did not create trouble in VFIO but just use common interface from dma_buf and iommu itself. That is exactly what I am doing. > > > > > > And currently, as you see, IB is bound with devices doing RDMA. The register > > > function, ib_register_device() hint that it is a netdev (get_netdev() callback), it know > > > about gid, pkey, and Memory Window. IB is not simply a address space management > > > framework. And verbs to IB are not transparent. If we start to add > > > compression/decompression, AI (RNN, CNN stuff) operations, and encryption/decryption > > > to the verbs set. It will become very complexity. Or maybe I misunderstand the > > > IB idea? 
But I don't see compression hardware is integrated in the mainline > > > Kernel. Could you directly point out which one I can used as a reference? > > > > > I strongly advise you to read the code, not all drivers are implementing > gids, pkeys and get_netdev() callback. > > Yes, you are misunderstanding drivers/infiniband subsystem. We have > plenty options to expose APIs to the user space applications, starting > from standard verbs API and ending with private objects which are > understandable by specific device/driver. > > IB stack provides secure FD to access device, by creating context, > after that you can send direct commands to the FW (see mlx5 DEVX > or hfi1) in sane way. > > So actually, you will need to register your device, declare your own > set of objects (similar to mlx5 include/uapi/rdma/mlx5_user_ioctl_*.h). > > In regards to reference of compression hardware, I don't have. > But there is an example of how T10-DIF can be implemented in verbs > layer: > https://www.openfabrics.org/images/2018workshop/presentations/307_TOved_T10-DIFOffload.pdf > Or IPsec crypto: > https://www.spinics.net/lists/linux-rdma/msg48906.html > OK. I will spend some time on it first. But according to current discussion, Don't you think I should avoid all these complexities but simply use SVM/SVA on iommu or let the user application use the kernel-allocated VMA and page? It does not create anything new. Just a new user of IOMMU and its SVM/SVA capability. > > > > > > > > > > > > > > I assume you would not agree to register a zip accelerator to infiniband? :) > > > > > > > > "infiniband" name in the "drivers/infiniband/" is legacy one and the > > > > current code supports IB, RoCE, iWARP and OmniPath as a transport layers. > > > > For a lone time, we wanted to rename that folder to be "drivers/rdma", > > > > but didn't find enough brave men/women to do it, due to backport mess > > > > for such move. 
> > > > > > > > The addition of zip accelerator to RDMA is possible and depends on how > > > > you will model such new functionality - new driver, or maybe new ULP. > > > > > > > > > > > > > > Further, I don't think it is wise to break an exist system (RDMA) to fulfill a > > > > > totally new scenario. The better choice is to let them run in parallel for some > > > > > time and try to merge them accordingly. > > > > > > > > Awesome, so please run your code out-of-tree for now and once you are ready > > > > for submission let's try to merge it. > > > > > > Yes, yes. We know trust need time to gain. But the fact is that there is no > > > accelerator user driver can be added to mainline kernel. We should raise the > > > topic time to time. So to help the communication to fix the gap, right? > > > > > > We are also opened to cooperate with IB to do it within the IB framework. But > > > please let me know where to start. I feel it is quite wired to make a > > > ib_register_device for a zip or RSA accelerator. > > Most of ib_ prefixes in drivers/infinband/ are legacy names. You can > rename them to be rdma_register_device() if it helps. > > So from implementation point of view, as I wrote above. > Create minimal driver to register, expose MR to user space, add your own > objects and capabilities through our new KABI and implement user space part > in github.com/linux-rdma/rdma-core. I don't think it is just a name. But anyway, let me spend some time to try the possibility. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Another problem we tried to address is the way to pin the memory for dma > > > > > > > operation. The RDMA way to pin the memory cannot avoid the page lost due to > > > > > > > copy-on-write operation during the memory is used by the device. This may > > > > > > > not be important to RDMA library. But it is important to accelerator. 
> > > > > > > > > > > > Such support exists in drivers/infiniband/ from late 2014 and > > > > > > it is called ODP (on demand paging). > > > > > > > > > > I reviewed ODP and I think it is a solution bound to infiniband. It is part of > > > > > MR semantics and required a infiniband specific hook > > > > > (ucontext->invalidate_range()). And the hook requires the device to be able to > > > > > stop using the page for a while for the copying. It is ok for infiniband > > > > > (actually, only mlx5 uses it). I don't think most accelerators can support > > > > > this mode. But WarpDrive works fully on top of IOMMU interface, it has no this > > > > > limitation. > > > > > > > > 1. It has nothing to do with infiniband. > > > > > > But it must be a ib_dev first. > > It is just a name. > > > > > > > > 2. MR and uncontext are verbs semantics and needed to ensure that host > > > > memory exposed to user is properly protected from security point of view. > > > > 3. "stop using the page for a while for the copying" - I'm not fully > > > > understand this claim, maybe this article will help you to better > > > > describe : https://lwn.net/Articles/753027/ > > > > > > This topic was being discussed in RFCv2. The key problem here is that: > > > > > > The device need to hold the memory for its own calculation, but the CPU/software > > > want to stop it for a while for synchronizing with disk or COW. > > > > > > If the hardware support SVM/SVA (Shared Virtual Memory/Address), it is easy, the > > > device share page table with CPU, the device will raise a page fault when the > > > CPU downgrade the PTE to read-only. > > > > > > If the hardware cannot share page table with the CPU, we then need to have > > > some way to change the device page table. This is what happen in ODP. It > > > invalidates the page table in device upon mmu_notifier call back. 
> > > But this cannot
> > > solve the COW problem: if the user process A share a page P with device, and A
> > > forks a new process B, and it continue to write to the page. By COW, the
> > > process B will keep the page P, while A will get a new page P'. But you have
> > > no way to let the device know it should use P' rather than P.
>
> I didn't hear about such issue and we supported fork for a long time.
>
> > >
> > > This may be OK for RDMA application. Because RDMA is a big thing and we can ask
> > > the programmer to avoid the situation. But for an accelerator, I don't think we
> > > can ask a programmer to care for this when using a zlib.
> > >
> > > In WarpDrive/uacce, we make this simple. If you support IOMMU and it support
> > > SVM/SVA. Everything will be fine just like ODP implicit mode. And you don't need
> > > to write any code for that. Because it has been done by IOMMU framework. If it
> > > does not, you have to use the kernel allocated memory which has the same IOVA as
> > > the VA in user space. So we can still maintain a unified address space among the
> > > devices and the application.
> > >
> > > > 4. mlx5 supports ODP not because of being partially IB device,
> > > > but because HW performance oriented implementation is not an easy task.
> > > >
> > > > > > >
> > > > > > > Hope this can help the understanding.
> > > > > >
> > > > > > Yes, it helped me a lot.
> > > > > > Now, I'm more than before convinced that this whole patchset shouldn't
> > > > > > exist in the first place.
> > > > >
> > > > > Then maybe you can tell me how I can register my accelerator to the user space?
> > > >
> > > > Write kernel driver and write user space part of it.
> > > > https://github.com/linux-rdma/rdma-core/
> > > >
> > > > I have no doubts that your colleagues who wrote and maintain
> > > > drivers/infiniband/hw/hns driver know best how to do it.
> > > > They did it very successfully.
> > > > > > > > Thanks > > > > > > > > > > > > > > > > > > > > > To be clear, NAK. > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > > > > > > > Cheers > > > > > > > > > > > > > > > > > > > > > > > Hard NAK from RDMA side. > > > > > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > > > > Signed-off-by: Kenneth Lee <liguozhu@hisilicon.com> > > > > > > > > > --- > > > > > > > > > Documentation/warpdrive/warpdrive.rst | 260 +++++++ > > > > > > > > > Documentation/warpdrive/wd-arch.svg | 764 ++++++++++++++++++++ > > > > > > > > > Documentation/warpdrive/wd.svg | 526 ++++++++++++++ > > > > > > > > > Documentation/warpdrive/wd_q_addr_space.svg | 359 +++++++++ > > > > > > > > > 4 files changed, 1909 insertions(+) > > > > > > > > > create mode 100644 Documentation/warpdrive/warpdrive.rst > > > > > > > > > create mode 100644 Documentation/warpdrive/wd-arch.svg > > > > > > > > > create mode 100644 Documentation/warpdrive/wd.svg > > > > > > > > > create mode 100644 Documentation/warpdrive/wd_q_addr_space.svg > > > > > > > > > > > > > > > > > > diff --git a/Documentation/warpdrive/warpdrive.rst b/Documentation/warpdrive/warpdrive.rst > > > > > > > > > new file mode 100644 > > > > > > > > > index 000000000000..ef84d3a2d462 > > > > > > > > > --- /dev/null > > > > > > > > > +++ b/Documentation/warpdrive/warpdrive.rst > > > > > > > > > @@ -0,0 +1,260 @@ > > > > > > > > > +Introduction of WarpDrive > > > > > > > > > +========================= > > > > > > > > > + > > > > > > > > > +*WarpDrive* is a general accelerator framework for the user application to > > > > > > > > > +access the hardware without going through the kernel in data path. > > > > > > > > > + > > > > > > > > > +It can be used as the quick channel for accelerators, network adaptors or > > > > > > > > > +other hardware for application in user space. > > > > > > > > > + > > > > > > > > > +This may make some implementation simpler. E.g. 
you can reuse most of the > > > > > > > > > +*netdev* driver in kernel and just share some ring buffer to the user space > > > > > > > > > +driver for *DPDK* [4] or *ODP* [5]. Or you can combine the RSA accelerator with > > > > > > > > > +the *netdev* in the user space as a https reversed proxy, etc. > > > > > > > > > + > > > > > > > > > +*WarpDrive* takes the hardware accelerator as a heterogeneous processor which > > > > > > > > > +can share particular load from the CPU: > > > > > > > > > + > > > > > > > > > +.. image:: wd.svg > > > > > > > > > + :alt: WarpDrive Concept > > > > > > > > > + > > > > > > > > > +The virtual concept, queue, is used to manage the requests sent to the > > > > > > > > > +accelerator. The application send requests to the queue by writing to some > > > > > > > > > +particular address, while the hardware takes the requests directly from the > > > > > > > > > +address and send feedback accordingly. > > > > > > > > > + > > > > > > > > > +The format of the queue may differ from hardware to hardware. But the > > > > > > > > > +application need not to make any system call for the communication. > > > > > > > > > + > > > > > > > > > +*WarpDrive* tries to create a shared virtual address space for all involved > > > > > > > > > +accelerators. Within this space, the requests sent to queue can refer to any > > > > > > > > > +virtual address, which will be valid to the application and all involved > > > > > > > > > +accelerators. > > > > > > > > > + > > > > > > > > > +The name *WarpDrive* is simply a cool and general name meaning the framework > > > > > > > > > +makes the application faster. It includes general user library, kernel > > > > > > > > > +management module and drivers for the hardware. In kernel, the management > > > > > > > > > +module is called *uacce*, meaning "Unified/User-space-access-intended > > > > > > > > > +Accelerator Framework". 
> > > > > > > > > + > > > > > > > > > + > > > > > > > > > +How does it work > > > > > > > > > +================ > > > > > > > > > + > > > > > > > > > +*WarpDrive* uses *mmap* and *IOMMU* to play the trick. > > > > > > > > > + > > > > > > > > > +*Uacce* creates a chrdev for the device registered to it. A "queue" will be > > > > > > > > > +created when the chrdev is opened. The application access the queue by mmap > > > > > > > > > +different address region of the queue file. > > > > > > > > > + > > > > > > > > > +The following figure demonstrated the queue file address space: > > > > > > > > > + > > > > > > > > > +.. image:: wd_q_addr_space.svg > > > > > > > > > + :alt: WarpDrive Queue Address Space > > > > > > > > > + > > > > > > > > > +The first region of the space, device region, is used for the application to > > > > > > > > > +write request or read answer to or from the hardware. > > > > > > > > > + > > > > > > > > > +Normally, there can be three types of device regions mmio and memory regions. > > > > > > > > > +It is recommended to use common memory for request/answer descriptors and use > > > > > > > > > +the mmio space for device notification, such as doorbell. But of course, this > > > > > > > > > +is all up to the interface designer. > > > > > > > > > + > > > > > > > > > +There can be two types of device memory regions, kernel-only and user-shared. > > > > > > > > > +This will be explained in the "kernel APIs" section. > > > > > > > > > + > > > > > > > > > +The Static Share Virtual Memory region is necessary only when the device IOMMU > > > > > > > > > +does not support "Share Virtual Memory". This will be explained after the > > > > > > > > > +*IOMMU* idea. > > > > > > > > > + > > > > > > > > > + > > > > > > > > > +Architecture > > > > > > > > > +------------ > > > > > > > > > + > > > > > > > > > +The full *WarpDrive* architecture is represented in the following class > > > > > > > > > +diagram: > > > > > > > > > + > > > > > > > > > +.. 
image:: wd-arch.svg > > > > > > > > > + :alt: WarpDrive Architecture > > > > > > > > > + > > > > > > > > > + > > > > > > > > > +The user API > > > > > > > > > +------------ > > > > > > > > > + > > > > > > > > > +We adopt a polling style interface in the user space: :: > > > > > > > > > + > > > > > > > > > + int wd_request_queue(struct wd_queue *q); > > > > > > > > > + void wd_release_queue(struct wd_queue *q); > > > > > > > > > + > > > > > > > > > + int wd_send(struct wd_queue *q, void *req); > > > > > > > > > + int wd_recv(struct wd_queue *q, void **req); > > > > > > > > > + int wd_recv_sync(struct wd_queue *q, void **req); > > > > > > > > > + void wd_flush(struct wd_queue *q); > > > > > > > > > + > > > > > > > > > +wd_recv_sync() is a wrapper to its non-sync version. It will trapped into > > > > > > > > > +kernel and waits until the queue become available. > > > > > > > > > + > > > > > > > > > +If the queue do not support SVA/SVM. The following helper function > > > > > > > > > +can be used to create Static Virtual Share Memory: :: > > > > > > > > > + > > > > > > > > > + void *wd_preserve_share_memory(struct wd_queue *q, size_t size); > > > > > > > > > + > > > > > > > > > +The user API is not mandatory. It is simply a suggestion and hint what the > > > > > > > > > +kernel interface is supposed to support. > > > > > > > > > + > > > > > > > > > + > > > > > > > > > +The user driver > > > > > > > > > +--------------- > > > > > > > > > + > > > > > > > > > +The queue file mmap space will need a user driver to wrap the communication > > > > > > > > > +protocol. *UACCE* provides some attributes in sysfs for the user driver to > > > > > > > > > +match the right accelerator accordingly. 
> > > > > > > > > + > > > > > > > > > +The *UACCE* device attribute is under the following directory: > > > > > > > > > + > > > > > > > > > +/sys/class/uacce/<dev-name>/params > > > > > > > > > + > > > > > > > > > +The following attributes is supported: > > > > > > > > > + > > > > > > > > > +nr_queue_remained (ro) > > > > > > > > > + number of queue remained > > > > > > > > > + > > > > > > > > > +api_version (ro) > > > > > > > > > + a string to identify the queue mmap space format and its version > > > > > > > > > + > > > > > > > > > +device_attr (ro) > > > > > > > > > + attributes of the device, see UACCE_DEV_xxx flag defined in uacce.h > > > > > > > > > + > > > > > > > > > +numa_node (ro) > > > > > > > > > + id of numa node > > > > > > > > > + > > > > > > > > > +priority (rw) > > > > > > > > > + Priority or the device, bigger is higher > > > > > > > > > + > > > > > > > > > +(This is not yet implemented in RFC version) > > > > > > > > > + > > > > > > > > > + > > > > > > > > > +The kernel API > > > > > > > > > +-------------- > > > > > > > > > + > > > > > > > > > +The *uacce* kernel API is defined in uacce.h. If the hardware support SVM/SVA, > > > > > > > > > +The driver need only the following API functions: :: > > > > > > > > > + > > > > > > > > > + int uacce_register(uacce); > > > > > > > > > + void uacce_unregister(uacce); > > > > > > > > > + void uacce_wake_up(q); > > > > > > > > > + > > > > > > > > > +*uacce_wake_up* is used to notify the process who epoll() on the queue file. > > > > > > > > > + > > > > > > > > > +According to the IOMMU capability, *uacce* categories the devices as follow: > > > > > > > > > + > > > > > > > > > +UACCE_DEV_NOIOMMU > > > > > > > > > + The device has no IOMMU. The user process cannot use VA on the hardware > > > > > > > > > + This mode is not recommended. 
> > > > > > > > > + > > > > > > > > > +UACCE_DEV_SVA (UACCE_DEV_PASID | UACCE_DEV_FAULT_FROM_DEV) > > > > > > > > > + The device has IOMMU which can share the same page table with user > > > > > > > > > + process > > > > > > > > > + > > > > > > > > > +UACCE_DEV_SHARE_DOMAIN > > > > > > > > > + The device has IOMMU which has no multiple page table and device page > > > > > > > > > + fault support > > > > > > > > > + > > > > > > > > > +If the device works in mode other than UACCE_DEV_NOIOMMU, *uacce* will set its > > > > > > > > > +IOMMU to IOMMU_DOMAIN_UNMANAGED. So the driver must not use any kernel > > > > > > > > > +DMA API but the following ones from *uacce* instead: :: > > > > > > > > > + > > > > > > > > > + uacce_dma_map(q, va, size, prot); > > > > > > > > > + uacce_dma_unmap(q, va, size, prot); > > > > > > > > > + > > > > > > > > > +*uacce_dma_map/unmap* is valid only for UACCE_DEV_SVA device. It creates a > > > > > > > > > +particular PASID and page table for the kernel in the IOMMU (Not yet > > > > > > > > > +implemented in the RFC) > > > > > > > > > + > > > > > > > > > +For the UACCE_DEV_SHARE_DOMAIN device, uacce_dma_map/unmap is not valid. > > > > > > > > > +*Uacce* call back start_queue only when the DUS and DKO region is mmapped. The > > > > > > > > > +accelerator driver must use those dma buffer, via uacce_queue->qfrs[], on > > > > > > > > > +start_queue call back. The size of the queue file region is defined by > > > > > > > > > +uacce->ops->qf_pg_start[]. > > > > > > > > > + > > > > > > > > > +We have to do it this way because most of current IOMMU cannot support the > > > > > > > > > +kernel and user virtual address at the same time. So we have to let them both > > > > > > > > > +share the same user virtual address space. > > > > > > > > > + > > > > > > > > > +If the device have to support kernel and user at the same time, both kernel > > > > > > > > > +and the user should use these DMA API. This is not convenient. 
A better > > > > > > > > > +solution is to change the future DMA/IOMMU design to let them separate the > > > > > > > > > +address space between the user and kernel space. But it is not going to be in > > > > > > > > > +a short time. > > > > > > > > > + > > > > > > > > > + > > > > > > > > > +Multiple processes support > > > > > > > > > +========================== > > > > > > > > > + > > > > > > > > > +In the latest mainline kernel (4.19) when this document is written, the IOMMU > > > > > > > > > +subsystem do not support multiple process page tables yet. > > > > > > > > > + > > > > > > > > > +Most IOMMU hardware implementation support multi-process with the concept > > > > > > > > > +of PASID. But they may use different name, e.g. it is call sub-stream-id in > > > > > > > > > +SMMU of ARM. With PASID or similar design, multi page table can be added to > > > > > > > > > +the IOMMU and referred by its PASID. > > > > > > > > > + > > > > > > > > > +*JPB* has a patchset to enable this[1]_. We have tested it with our hardware > > > > > > > > > +(which is known as *D06*). It works well. *WarpDrive* rely on them to support > > > > > > > > > +UACCE_DEV_SVA. If it is not enabled, *WarpDrive* can still work. But it > > > > > > > > > +support only one process, the device will be set to UACCE_DEV_SHARE_DOMAIN > > > > > > > > > +even it is set to UACCE_DEV_SVA initially. > > > > > > > > > + > > > > > > > > > +Static Share Virtual Memory is mainly used by UACCE_DEV_SHARE_DOMAIN device. > > > > > > > > > + > > > > > > > > > + > > > > > > > > > +Legacy Mode Support > > > > > > > > > +=================== > > > > > > > > > +For the hardware without IOMMU, WarpDrive can still work, the only problem is > > > > > > > > > +VA cannot be used in the device. The driver should adopt another strategy for > > > > > > > > > +the shared memory. It is only for testing, and not recommended. 
> > > > > > > > > + > > > > > > > > > + > > > > > > > > > +The Folk Scenario > > > > > > > > > +================= > > > > > > > > > +For a process with allocated queues and shared memory, what happen if it forks > > > > > > > > > +a child? > > > > > > > > > + > > > > > > > > > +The fd of the queue will be duplicated on folk, so the child can send request > > > > > > > > > +to the same queue as its parent. But the requests which is sent from processes > > > > > > > > > +except for the one who open the queue will be blocked. > > > > > > > > > + > > > > > > > > > +It is recommended to add O_CLOEXEC to the queue file. > > > > > > > > > + > > > > > > > > > +The queue mmap space has a VM_DONTCOPY in its VMA. So the child will lost all > > > > > > > > > +those VMAs. > > > > > > > > > + > > > > > > > > > +This is why *WarpDrive* does not adopt the mode used in *VFIO* and *InfiniBand*. > > > > > > > > > +Both solutions can set any user pointer for hardware sharing. But they cannot > > > > > > > > > +support fork when the dma is in process. Or the "Copy-On-Write" procedure will > > > > > > > > > +make the parent process lost its physical pages. > > > > > > > > > + > > > > > > > > > + > > > > > > > > > +The Sample Code > > > > > > > > > +=============== > > > > > > > > > +There is a sample user land implementation with a simple driver for Hisilicon > > > > > > > > > +Hi1620 ZIP Accelerator. > > > > > > > > > + > > > > > > > > > +To test, do the following in samples/warpdrive (for the case of PC host): :: > > > > > > > > > + ./autogen.sh > > > > > > > > > + ./conf.sh # or simply ./configure if you build on target system > > > > > > > > > + make > > > > > > > > > + > > > > > > > > > +Then you can get test_hisi_zip in the test subdirectory. 
Copy it to the target > > > > > > > > > +system and make sure the hisi_zip driver is enabled (the major and minor of > > > > > > > > > +the uacce chrdev can be gotten from the dmesg or sysfs), and run: :: > > > > > > > > > + mknod /dev/ua1 c <major> <minior> > > > > > > > > > + test/test_hisi_zip -z < data > data.zip > > > > > > > > > + test/test_hisi_zip -g < data > data.gzip > > > > > > > > > + > > > > > > > > > + > > > > > > > > > +References > > > > > > > > > +========== > > > > > > > > > +.. [1] https://patchwork.kernel.org/patch/10394851/ > > > > > > > > > + > > > > > > > > > +.. vim: tw=78 > > > > > [...] > > > > > > > > > -- > > > > > > > > > 2.17.1 > > > > > > > > > > > > > I don't know if Mr. Jerome Glisse in the list. I think I should cc him for my > > respectation to his help on last RFC. > > > > - Kenneth
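
An illustrative note on the fork scenario in the quoted document. The two precautions it names (O_CLOEXEC on the queue fd, VM_DONTCOPY on the queue mappings) can be tried from user space without any uacce hardware. The sketch below is not code from the patch set. It assumes that madvise(MADV_DONTFORK) on an anonymous mapping is an acceptable stand-in for the VM_DONTCOPY flag the document says the queue VMA carries, and both helper names are made up for the example. Note that O_CLOEXEC only takes effect at exec time; the fd is still duplicated by fork() itself.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: open `path` with O_CLOEXEC and report whether the
 * descriptor carries FD_CLOEXEC. Returns 1 if set, 0 if not, -1 on error.
 */
int opened_with_cloexec(const char *path)
{
    int fd = open(path, O_RDWR | O_CLOEXEC);
    int flags;

    if (fd < 0)
        return -1;
    flags = fcntl(fd, F_GETFD);
    close(fd);
    if (flags < 0)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}

/*
 * Hypothetical helper: mark one anonymous page MADV_DONTFORK, fork, and let
 * the child check whether the range is still mapped.
 * Returns 0 if the child lost the mapping, 1 if it still has it, -1 on error.
 */
int dontfork_mapping_missing_in_child(void)
{
    long psz = sysconf(_SC_PAGESIZE);
    unsigned char vec;
    int status;
    pid_t pid;
    char *buf;

    assert(psz > 0);
    buf = mmap(NULL, (size_t)psz, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;
    buf[0] = 1; /* touch the page so it is really populated */

    if (madvise(buf, (size_t)psz, MADV_DONTFORK) != 0) {
        munmap(buf, (size_t)psz);
        return -1;
    }

    pid = fork();
    if (pid == 0) {
        /* mincore() fails with ENOMEM when the range is not mapped */
        int r = mincore(buf, (size_t)psz, &vec);
        _exit((r == -1 && errno == ENOMEM) ? 0 : 1);
    }

    munmap(buf, (size_t)psz);
    if (pid < 0 || waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling opened_with_cloexec("/dev/null") and dontfork_mapping_missing_in_child() from a small main() is enough to try it; for a real queue the path would be the uacce chrdev node instead of /dev/null.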
On Mon, Nov 19, 2018 at 11:49:54AM -0700, Jason Gunthorpe wrote:
> Date: Mon, 19 Nov 2018 11:49:54 -0700
> From: Jason Gunthorpe <jgg@ziepe.ca>
> To: Kenneth Lee <liguozhu@hisilicon.com>
> CC: Leon Romanovsky <leon@kernel.org>, Kenneth Lee <nek.in.cn@gmail.com>,
> Tim Sell <timothy.sell@unisys.com>, linux-doc@vger.kernel.org, Alexander
> Shishkin <alexander.shishkin@linux.intel.com>, Zaibo Xu
> <xuzaibo@huawei.com>, zhangfei.gao@foxmail.com, linuxarm@huawei.com,
> haojian.zhuang@linaro.org, Christoph Lameter <cl@linux.com>, Hao Fang
> <fanghao11@huawei.com>, Gavin Schenk <g.schenk@eckelmann.de>, RDMA mailing
> list <linux-rdma@vger.kernel.org>, Zhou Wang <wangzhou1@hisilicon.com>,
> Doug Ledford <dledford@redhat.com>, Uwe Kleine-König
> <u.kleine-koenig@pengutronix.de>, David Kershner
> <david.kershner@unisys.com>, Johan Hovold <johan@kernel.org>, Cyrille
> Pitchen <cyrille.pitchen@free-electrons.com>, Sagar Dharia
> <sdharia@codeaurora.org>, Jens Axboe <axboe@kernel.dk>,
> guodong.xu@linaro.org, linux-netdev <netdev@vger.kernel.org>, Randy Dunlap
> <rdunlap@infradead.org>, linux-kernel@vger.kernel.org, Vinod Koul
> <vkoul@kernel.org>, linux-crypto@vger.kernel.org, Philippe Ombredanne
> <pombredanne@nexb.com>, Sanyog Kale <sanyog.r.kale@intel.com>, "David S.
> Miller" <davem@davemloft.net>, linux-accelerators@lists.ozlabs.org
> Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> User-Agent: Mutt/1.9.4 (2018-02-28)
> Message-ID: <20181119184954.GB4890@ziepe.ca>
>
> On Mon, Nov 19, 2018 at 05:14:05PM +0800, Kenneth Lee wrote:
>
> > If the hardware cannot share page table with the CPU, we then need to have
> > some way to change the device page table. This is what happen in ODP. It
> > invalidates the page table in device upon mmu_notifier call back. But this cannot
> > solve the COW problem: if the user process A share a page P with device, and A
> > forks a new process B, and it continue to write to the page.
> > By COW, the process B will keep the page P, while A will get a new page P'.
> > But you have no way to let the device know it should use P' rather than P.
>
> Is this true? I thought mmu_notifiers covered all these cases.
>
> The mmu_notifier for A should fire if B causes the physical address of
> A's pages to change via COW.
>
> And this causes the device page tables to re-synchronize.

I don't see such code. The current do_cow_fault() implementation has nothing to
do with mmu_notifier.

>
> > In WarpDrive/uacce, we make this simple. If you support IOMMU and it support
> > SVM/SVA. Everything will be fine just like ODP implicit mode. And you don't need
> > to write any code for that. Because it has been done by IOMMU framework. If it
>
> Looks like the IOMMU code uses mmu_notifier, so it is identical to
> IB's ODP. The only difference is that IB tends to have the IOMMU page
> table in the device, not in the CPU.
>
> The only case I know of that is different is the new-fangled CAPI
> stuff where the IOMMU can directly use the CPU's page table and the
> IOMMU page table (in device or CPU) is eliminated.
>

Yes. We are not focusing on the current implementation. As mentioned in the
cover letter, we are expecting Jean-Philippe's SVA patches:
git://linux-arm.org/linux-jpb.

> Anyhow, I don't think a single instance of hardware should justify an
> entire new subsystem. Subsystems are hard to make and without multiple
> hardware examples there is no way to expect that it would cover any
> future use cases.

Yes. That's our first expectation. We can keep it with our driver. But
there is no user driver support for any accelerator in the mainline kernel; even the
well-known QuickAssist has to be maintained out of tree. So we are trying to see if
people are interested in working together to solve the problem.

>
> If all your driver needs is to mmap some PCI bar space, route
> interrupts and do DMA mapping then mediated VFIO is probably a good
> choice.

Yes.
That is what is done in our RFCv1/v2. But we accepted Jerome's opinion and
try not to add complexity to the mm subsystem.

>
> If it needs to do a bunch of other stuff, not related to PCI bar
> space, interrupts and DMA mapping (ie special code for compression,
> crypto, AI, whatever) then you should probably do what Jerome said and
> make a drivers/char/hisillicon_foo_bar.c that exposes just what your
> hardware does.

Yes. If no other accelerator driver writer is interested, that is the
expectation:)

But we would really like to have a public solution here. Consider this scenario:

You create some connections (queues) to the NIC, RSA, and AI engines. Then you get
data directly from the NIC and pass the pointer to the RSA engine for decryption. The
CPU then does some processing on the data and passes it on to the AI
engine for CNN calculation... This needs a place to maintain the same
address space by some means.

It is not complex, but it is helpful.

>
> If you have networking involved in here then consider RDMA,
> particularly if this functionality is already part of the same
> hardware that the hns infiniband driver is servicing.
>
> 'computational MRs' are a reasonable approach to a side-car offload of
> already existing RDMA support.

OK. Thanks. I will spend some time on it. But personally, I really don't like
RDMA's complexity. I cannot even try a single function without some expensive
hardware and a complex connection setup in the lab. This does not feel like the
open source way.

>
> Jason
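
An illustrative note on the COW case argued over above: the P versus P' situation itself can be observed from user space. The following is a minimal sketch and assumes nothing about uacce, ODP or mmu_notifier behaviour. It only shows that after fork() a write by the parent to a private page is not seen by the child, so the two processes no longer share one physical page. It does not show which page a device would keep using, which is the point under dispute. The function name cow_views_diverge is made up for the example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 0 if the child still sees the pre-fork contents after the parent
 * has written to the page (the views diverged through copy-on-write),
 * 1 if the child saw the parent's write, -1 on setup failure.
 */
int cow_views_diverge(void)
{
    long psz = sysconf(_SC_PAGESIZE);
    int to_child[2];
    int status;
    char go = 'g';
    char *page;
    pid_t pid;

    assert(psz > 0);
    page = mmap(NULL, (size_t)psz, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;
    strcpy(page, "before"); /* the page backing this VA is "P" */

    if (pipe(to_child) < 0) {
        munmap(page, (size_t)psz);
        return -1;
    }

    pid = fork();
    if (pid == 0) {
        /* child: wait until the parent has written, then look at our view */
        close(to_child[1]);
        if (read(to_child[0], &go, 1) != 1)
            _exit(2);
        _exit(strcmp(page, "before") == 0 ? 0 : 1);
    }

    close(to_child[0]);
    if (pid < 0) {
        close(to_child[1]);
        munmap(page, (size_t)psz);
        return -1;
    }

    strcpy(page, "after"); /* write after fork: triggers copy-on-write */
    if (write(to_child[1], &go, 1) != 1)
        status = -1;
    close(to_child[1]);

    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) {
        munmap(page, (size_t)psz);
        return -1;
    }
    munmap(page, (size_t)psz);
    return WEXITSTATUS(status);
}
```

A device that had been handed the physical page behind the mapping before the fork would need some notification to follow whichever process ends up on the new copy; whether the existing notifiers already provide that is exactly what the two sides disagree on here.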
On Tue, Nov 20, 2018 at 11:07:02AM +0800, Kenneth Lee wrote:
> On Mon, Nov 19, 2018 at 11:49:54AM -0700, Jason Gunthorpe wrote:
> > On Mon, Nov 19, 2018 at 05:14:05PM +0800, Kenneth Lee wrote:
> >
> > > If the hardware cannot share page table with the CPU, we then need to have
> > > some way to change the device page table. This is what happen in ODP. It
> > > invalidates the page table in device upon mmu_notifier call back.
> > > But this cannot
> > > solve the COW problem: if the user process A share a page P with device, and A
> > > forks a new process B, and it continue to write to the page. By COW, the
> > > process B will keep the page P, while A will get a new page P'. But you have
> > > no way to let the device know it should use P' rather than P.
> >
> > Is this true? I thought mmu_notifiers covered all these cases.
> >
> > The mmu_notifier for A should fire if B causes the physical address of
> > A's pages to change via COW.
> >
> > And this causes the device page tables to re-synchronize.
>
> I don't see such code. The current do_cow_fault() implementation has nothing to
> do with mmu_notifier.

Well, that sure sounds like it would be a bug in mmu_notifiers..

But considering Jean's SVA stuff seems based on mmu notifiers, I have
a hard time believing that it has any different behavior from RDMA's
ODP, and if it does have different behavior, then it is probably just
a bug in the ODP implementation.

> > > In WarpDrive/uacce, we make this simple. If you support IOMMU and it support
> > > SVM/SVA. Everything will be fine just like ODP implicit mode. And you don't need
> > > to write any code for that. Because it has been done by IOMMU framework. If it
> >
> > Looks like the IOMMU code uses mmu_notifier, so it is identical to
> > IB's ODP. The only difference is that IB tends to have the IOMMU page
> > table in the device, not in the CPU.
> >
> > The only case I know of that is different is the new-fangled CAPI
> > stuff where the IOMMU can directly use the CPU's page table and the
> > IOMMU page table (in device or CPU) is eliminated.
>
> Yes. We are not focusing on the current implementation. As mentioned in the
> cover letter, we are expecting Jean-Philippe's SVA patches:
> git://linux-arm.org/linux-jpb.

This SVA stuff does not look comparable to CAPI as it still requires
maintaining separate IOMMU page tables.
Also, those patches from Jean have a lot of references to
mmu_notifiers (ie look at iommu_mmu_notifier).

Are you really sure it is actually any different at all?

> > Anyhow, I don't think a single instance of hardware should justify an
> > entire new subsystem. Subsystems are hard to make and without multiple
> > hardware examples there is no way to expect that it would cover any
> > future use cases.
>
> Yes. That's our first expectation. We can keep it with our driver. But
> there is no user driver support for any accelerator in the mainline kernel; even the
> well-known QuickAssist has to be maintained out of tree. So we are trying to see if
> people are interested in working together to solve the problem.

Well, you should come with patches ack'ed by these other groups.

> > If all your driver needs is to mmap some PCI bar space, route
> > interrupts and do DMA mapping then mediated VFIO is probably a good
> > choice.
>
> Yes. That is what is done in our RFCv1/v2. But we accepted Jerome's opinion and
> try not to add complexity to the mm subsystem.

Why would a mediated VFIO driver touch the mm subsystem? Sounds like
you don't have a VFIO driver if it needs to do stuff like that...

> > If it needs to do a bunch of other stuff, not related to PCI bar
> > space, interrupts and DMA mapping (ie special code for compression,
> > crypto, AI, whatever) then you should probably do what Jerome said and
> > make a drivers/char/hisillicon_foo_bar.c that exposes just what your
> > hardware does.
>
> Yes. If no other accelerator driver writer is interested, that is the
> expectation:)

I don't think it matters what other drivers do.

If your driver does not need any other kernel code then VFIO is
sensible. In this kind of world you will probably have a RDMA-like
userspace driver that can bring this to a common user space API, even
if one driver uses VFIO and a different driver uses something else.

> You create some connections (queues) to the NIC, RSA, and AI engines.
Then you got > data direct from the NIC and pass the pointer to RSA engine for decryption. The > CPU then finish some data taking or operation and then pass through to the AI > engine for CNN calculation....This will need a place to maintain the same > address space by some means. How is this any different from what we have today? SVA is not something even remotely new, IB has been doing various versions of it for 20 years. Jason
+CC Jean-Philippe and iommu list.
On Tue, Nov 20, 2018 at 07:17:44AM +0200, Leon Romanovsky wrote:
> On Tue, Nov 20, 2018 at 11:07:02AM +0800, Kenneth Lee wrote:
> > But we really like to have a public solution here. Consider this scenario:
> >
> > You create some connections (queues) to NIC, RSA, and AI engine. Then you got
> > data direct from the NIC and pass the pointer to RSA engine for decryption. The
> > CPU then finish some data taking or operation and then pass through to the AI
> > engine for CNN calculation....This will need a place to maintain the same
> > address space by some means.
>
> You are using NIC terminology, in the documentation, you wrote that it is needed
> for DPDK use and I don't really understand, why do we need another shiny new
> interface for DPDK.
>

I'm not a DPDK expert, but we had some discussions with LNG of Linaro. They were
considering creating something similar to simplify the user driver. In most
cases, we use DPDK or ODP (Open Data Plane) just for a faster data-plane data
flow. Much of the logic, such as setting the hardware mode, the MAC address and
so on, is not necessary for that purpose. So they were looking for a way to keep
the driver in the kernel and share just the ring buffers of some queues with
user space. This may simplify the user space design.

> >
> > It is not complex, but it is helpful.
> >
> > >
> > > If you have networking involved in here then consider RDMA,
> > > particularly if this functionality is already part of the same
> > > hardware that the hns infiniband driver is servicing.
> > >
> > > 'computational MRs' are a reasonable approach to a side-car offload of
> > > already existing RDMA support.
> >
> > OK. Thanks. I will spend some time on it. But personally, I really don't like
> > RDMA's complexity. I cannot even try one single function without a...some
> > expensive hardwares and complexity connection in the lab. This is not like a
> > open source way.
>
> It is not very accurate. We have RXE driver which is virtual RDMA device
> which is implemented purely in SW. It struggles from bad performance and
> sporadic failures, but it is enough to try RDMA on your laptop in VM.

Woo. This will be helpful. Thank you very much.

>
> Thanks
>
> > >
> > > Jason

--
			-Kenneth(Hisilicon)
On Mon, Nov 19, 2018 at 08:29:39PM -0700, Jason Gunthorpe wrote:
> On Tue, Nov 20, 2018 at 11:07:02AM +0800, Kenneth Lee wrote:
> > On Mon, Nov 19, 2018 at 11:49:54AM -0700, Jason Gunthorpe wrote:
> > > On Mon, Nov 19, 2018 at 05:14:05PM +0800, Kenneth Lee wrote:
> > >
> > > > If the hardware cannot share page table with the CPU, we then need to have
> > > > some way to change the device page table. This is what happen in ODP. It
> > > > invalidates the page table in device upon mmu_notifier call back. But this cannot
> > > > solve the COW problem: if the user process A share a page P with device, and A
> > > > forks a new process B, and it continue to write to the page. By COW, the
> > > > process B will keep the page P, while A will get a new page P'. But you have
> > > > no way to let the device know it should use P' rather than P.
> > >
> > > Is this true? I thought mmu_notifiers covered all these cases.
> > >
> > > The mmu_notifier for A should fire if B causes the physical address of
> > > A's pages to change via COW.
> > >
> > > And this causes the device page tables to re-synchronize.
> >
> > I don't see such code. The current do_cow_fault() implementation has nothing to
> > do with mmu_notifier.
>
> Well, that sure sounds like it would be a bug in mmu_notifiers..

Yes, it can be taken that way:) But it is going to be a tough bug.

>
> But considering Jean's SVA stuff seems based on mmu notifiers, I have
> a hard time believing that it has any different behavior from RDMA's
> ODP, and if it does have different behavior, then it is probably just
> a bug in the ODP implementation.

As Jean has explained, his solution is based on page table sharing. I think ODP
should also consider this new feature.

>
> > > > In WarpDrive/uacce, we make this simple. If you support IOMMU and it support
> > > > SVM/SVA. Everything will be fine just like ODP implicit mode. And you don't need
> > > > to write any code for that. Because it has been done by IOMMU framework. If it
> > >
> > > Looks like the IOMMU code uses mmu_notifier, so it is identical to
> > > IB's ODP. The only difference is that IB tends to have the IOMMU page
> > > table in the device, not in the CPU.
> > >
> > > The only case I know of that is different is the new-fangled CAPI
> > > stuff where the IOMMU can directly use the CPU's page table and the
> > > IOMMU page table (in device or CPU) is eliminated.
> >
> > Yes. We are not focusing on the current implementation. As mentioned in the
> > cover letter. We are expecting Jean-Philippe's SVA patch:
> > git://linux-arm.org/linux-jpb.
>
> This SVA stuff does not look comparable to CAPI as it still requires
> maintaining separate IOMMU page tables.
>
> Also, those patches from Jean have a lot of references to
> mmu_notifiers (ie look at iommu_mmu_notifier).
>
> Are you really sure it is actually any different at all?
>
> > > Anyhow, I don't think a single instance of hardware should justify an
> > > entire new subsystem. Subsystems are hard to make and without multiple
> > > hardware examples there is no way to expect that it would cover any
> > > future use cases.
> >
> > Yes. That's our first expectation. We can keep it with our driver. But because
> > there is no user driver support for any accelerator in mainline kernel. Even the
> > well known QuickAssist has to be maintained out of tree. So we try to see if
> > people is interested in working together to solve the problem.
>
> Well, you should come with patches ack'ed by these other groups.

Yes, that is what we are doing.

>
> > > If all your driver needs is to mmap some PCI bar space, route
> > > interrupts and do DMA mapping then mediated VFIO is probably a good
> > > choice.
> >
> > Yes. That is what is done in our RFCv1/v2. But we accepted Jerome's opinion and
> > try not to add complexity to the mm subsystem.
>
> Why would a mediated VFIO driver touch the mm subsystem? Sounds like
> you don't have a VFIO driver if it needs to do stuff like that...

VFIO has no ODP-like solution, and if we want to solve the fork problem, we have
to make some changes to the IOMMU and the fork procedure. Further, VFIO treats
every queue as an independent device. This creates a lot of trouble for resource
management. For example, you will need a manager process to reclaim unused
devices, and you need to let the user process know the PASID of the queue, and
so on.

>
> > > If it needs to do a bunch of other stuff, not related to PCI bar
> > > space, interrupts and DMA mapping (ie special code for compression,
> > > crypto, AI, whatever) then you should probably do what Jerome said and
> > > make a drivers/char/hisillicon_foo_bar.c that exposes just what your
> > > hardware does.
> >
> > Yes. If no other accelerator driver writer is interested. That is the
> > expectation:)
>
> I don't think it matters what other drivers do.
>
> If your driver does not need any other kernel code then VFIO is
> sensible. In this kind of world you will probably have a RDMA-like
> userspace driver that can bring this to a common user space API, even
> if one driver use VFIO and a different driver uses something else.

Yes, in some ways. But on the other hand, if they don't share the same address
space logic, it won't be easy for them to cooperate. In VFIO, you have to use
DMA addresses for the device, while you cannot directly share RDMA-ODP memory
with another device. WarpDrive/uacce can solve all these problems.

>
> > You create some connections (queues) to NIC, RSA, and AI engine. Then you got
> > data direct from the NIC and pass the pointer to RSA engine for decryption. The
> > CPU then finish some data taking or operation and then pass through to the AI
> > engine for CNN calculation....This will need a place to maintain the same
> > address space by some means.
>
> How is this any different from what we have today?
>
> SVA is not something even remotely new, IB has been doing various
> versions of it for 20 years.

But now we can have them unified under the IOMMU framework.

>
> Jason

--
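For readers following the COW argument in this thread, the user-visible half of it can be reproduced with a small user-space program. This is only a sketch and involves no device, IOMMU or mmu_notifier: the function name `cow_views_after_fork` is made up for this illustration. It shows that once process A writes to a private page after fork(), A and B no longer see the same data, so a device that pinned the physical page before the fork can match at most one of them. Which process keeps the original physical page is not something this program can observe.

```c
#define _DEFAULT_SOURCE
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper for illustration: map one private page, fork, let the
 * parent (process A) write to the page, then ask the child (process B) what
 * it still sees. Returns 0 on success and fills in both views, -1 on error.
 */
int cow_views_after_fork(int *parent_view, int *child_view)
{
	long page_size = sysconf(_SC_PAGESIZE);
	int go[2], back[2];
	char token = 'x';
	int seen = 0, ret = -1;
	pid_t pid = -1;
	int *page;

	if (page_size <= 0)
		return -1;
	page = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (page == MAP_FAILED)
		return -1;
	*page = 1;	/* the data a device would see if it pinned the page now */

	if (pipe(go) < 0)
		goto out_unmap;
	if (pipe(back) < 0)
		goto out_close_go;

	pid = fork();
	if (pid < 0)
		goto out_close_back;

	if (pid == 0) {
		/* Child (B): wait until the parent has written, then report. */
		close(go[1]);
		close(back[0]);
		if (read(go[0], &token, 1) != 1)
			_exit(1);
		seen = *page;
		if (write(back[1], &seen, sizeof(seen)) != (ssize_t)sizeof(seen))
			_exit(1);
		_exit(0);
	}

	/* Parent (A): this write goes through copy-on-write. */
	*page = 2;
	if (write(go[1], &token, 1) == 1 &&
	    read(back[0], &seen, sizeof(seen)) == (ssize_t)sizeof(seen)) {
		*parent_view = *page;
		*child_view = seen;
		ret = 0;
	}

out_close_back:
	close(back[0]);
	close(back[1]);
out_close_go:
	close(go[0]);
	close(go[1]);
out_unmap:
	if (pid > 0)
		waitpid(pid, NULL, 0);
	munmap(page, (size_t)page_size);
	return ret;
}
```

On success the parent's view is 2 and the child's view is 1. Whether the device is told about the split is exactly the kernel-side question being argued above, and this sketch takes no position on it.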
On Wed, Nov 21, 2018 at 07:58:40PM -0700, Jason Gunthorpe wrote:
> On Wed, Nov 21, 2018 at 02:08:05PM +0800, Kenneth Lee wrote:
>
> > > But considering Jean's SVA stuff seems based on mmu notifiers, I have
> > > a hard time believing that it has any different behavior from RDMA's
> > > ODP, and if it does have different behavior, then it is probably just
> > > a bug in the ODP implementation.
> >
> > As Jean has explained, his solution is based on page table sharing. I think ODP
> > should also consider this new feature.
>
> Shared page tables would require the HW to walk the page table format
> of the CPU directly, not sure how that would be possible for ODP?
>
> Presumably the implementation for ARM relies on the IOMMU hardware
> doing this?

Yes, that is the idea. And since Jean is merging the AMD and Intel solutions
together, I assume they can do the same. This is also the reason I want to solve
my problem on top of the IOMMU directly. But anyway, let me try to see if I can
merge the logic with ODP.

>
> > > > > If all your driver needs is to mmap some PCI bar space, route
> > > > > interrupts and do DMA mapping then mediated VFIO is probably a good
> > > > > choice.
> > > >
> > > > Yes. That is what is done in our RFCv1/v2. But we accepted Jerome's opinion and
> > > > try not to add complexity to the mm subsystem.
> > >
> > > Why would a mediated VFIO driver touch the mm subsystem? Sounds like
> > > you don't have a VFIO driver if it needs to do stuff like that...
> >
> > VFIO has no ODP-like solution, and if we want to solve the fork problem, we have
> > to make some change to iommu and the fork procedure. Further, VFIO takes every
> > queue as a independent device. This create a lot of trouble on resource
> > management. For example, you will need a manager process to withdraw the unused
> > device and you need to let the user process know about PASID of the queue, and
> > so on.
>
> Well, I would think you'd add SVA support to the VFIO driver as a
> generic capability - it seems pretty useful for any VFIO user as it
> avoids all the kernel upcalls to do memory pinning and DMA address
> translation.

It is already part of Jean's patchset. And that's why I built my solution on
VFIO in the first place. But I think the concept of SVA and PASID is not
compatible with the original VFIO concept space. You would not share your whole
address space with a device at all in a virtual machine manager, would you? And
if you can manage to have a separate mdev for your virtual machine, why bother
to set a PASID on it? The answer to those problems, I think, will be Intel's
Scalable IO Virtualization.

For an accelerator, the requirement is simply: getting a handle to the device,
attaching the process's mm to the handle by sharing the process's page table
with its IOMMU, indexed by PASID, and starting the communication...

>
> Once the VFIO driver knows about this as a generic capability then the
> device it exposes to userspace would use CPU addresses instead of DMA
> addresses.
>
> The question is if your driver needs much more than the device
> agnostic generic services VFIO provides.
>
> I'm not sure what you have in mind with resource management.. It is
> hard to revoke resources from userspace, unless you are doing
> kernel syscalls, but then why do all this?

Say I have 1024 queues in my accelerator. I can get one by opening the device
and attaching it to the fd. If the process exits by any means, the queue can be
returned with the release of the fd. But if it is an mdev, it will still be
there and someone has to tell the allocator it is available again. This is not
easy to design in user space.

>
> Jason

--
On Fri, Nov 23, 2018 at 04:02:42PM +0800, Kenneth Lee wrote:

> It is already part of Jean's patchset. And that's why I built my solution on
> VFIO in the first place. But I think the concept of SVA and PASID is not
> compatible with the original VFIO concept space. You would not share your whole
> address space to a device at all in a virtual machine manager,
> wouldn't you?

Why not? That seems to fit VFIO's space just fine to me.. You might
need a new upcall to create a full MM registration, but that doesn't
seem unsuited.

Part of the point here is you should try to make sensible revisions to
existing subsystems before just inventing a new thing...

VFIO is deeply connected to the IOMMU, so enabling more general IOMMU
based approaches seems perfectly fine to me..

> > Once the VFIO driver knows about this as a generic capability then the
> > device it exposes to userspace would use CPU addresses instead of DMA
> > addresses.
> >
> > The question is if your driver needs much more than the device
> > agnostic generic services VFIO provides.
> >
> > I'm not sure what you have in mind with resource management.. It is
> > hard to revoke resources from userspace, unless you are doing
> > kernel syscalls, but then why do all this?
>
> Say, I have 1024 queues in my accelerator. I can get one by opening the device
> and attach it with the fd. If the process exit by any means, the queue can be
> returned with the release of the fd. But if it is mdev, it will still be there
> and some one should tell the allocator it is available again. This is not easy
> to design in user space.

?? why wouldn't the mdev track the queues assigned using the existing
open/close/ioctl callbacks?
That is basic flow I would expect:

  open(/dev/vfio)
  ioctl(unity map entire process MM to mdev with IOMMU)

  // Create a HW queue and link the PASID in the HW to this HW queue
  struct hw queue[..];
  ioctl(create HW queue)

  // Get BAR doorbell memory for the queue
  bar = mmap()

  // Submit work to the queue using CPU addresses
  queue[0] = ...
  writel(bar [..], &queue);

  // Queue, SVA, etc is cleaned up when the VFIO closes
  close()

Presumably the kernel has to handle the PASID and related for security
reasons, so they shouldn't go to userspace?

If there is something missing in vfio to do this, it looks pretty
small to me..

Jason
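The resource-management point both sides are circling (queues returned automatically when the handle goes away) can be modelled in plain C without any kernel interface. The following is a rough sketch under stated assumptions: every name in it (`mock_dev`, `mock_open`, `mock_create_queue`, `mock_close`, `MOCK_NR_QUEUES`) is invented for this illustration and is not a VFIO, mdev or uacce API. It only shows the bookkeeping an open/ioctl/release callback set would need so that no user-space manager has to reclaim queues.

```c
#include <stddef.h>

#define MOCK_NR_QUEUES 4	/* hypothetical pool size, kept small for the sketch */

/* Mock device: records which handle, if any, owns each hardware queue. */
struct mock_dev {
	int owner[MOCK_NR_QUEUES];	/* owning handle id, 0 means free */
	int next_handle;		/* last handle id handed out */
};

void mock_dev_init(struct mock_dev *dev)
{
	size_t i;

	for (i = 0; i < MOCK_NR_QUEUES; i++)
		dev->owner[i] = 0;
	dev->next_handle = 0;
}

/* Stands in for open(): returns a new handle id (always > 0). */
int mock_open(struct mock_dev *dev)
{
	return ++dev->next_handle;
}

/* Stands in for the "create HW queue" ioctl: queue index, or -1 if none. */
int mock_create_queue(struct mock_dev *dev, int handle)
{
	size_t i;

	if (handle <= 0)
		return -1;
	for (i = 0; i < MOCK_NR_QUEUES; i++) {
		if (dev->owner[i] == 0) {
			dev->owner[i] = handle;
			return (int)i;
		}
	}
	return -1;
}

/* Stands in for the release callback: frees every queue the handle owns. */
void mock_close(struct mock_dev *dev, int handle)
{
	size_t i;

	for (i = 0; i < MOCK_NR_QUEUES; i++) {
		if (dev->owner[i] == handle)
			dev->owner[i] = 0;
	}
}

/* Number of queues currently available for allocation. */
int mock_free_queues(const struct mock_dev *dev)
{
	int n = 0;
	size_t i;

	for (i = 0; i < MOCK_NR_QUEUES; i++) {
		if (dev->owner[i] == 0)
			n++;
	}
	return n;
}
```

In this model, closing a handle that owns two queues makes both immediately available to other handles. Because the kernel runs the release callback when a process exits for any reason, the same bookkeeping would also cover the crash case discussed above.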
On Tue, Nov 20, 2018 at 10:30:55AM +0800, Kenneth Lee wrote:
> Date: Tue, 20 Nov 2018 10:30:55 +0800
> From: Kenneth Lee <liguozhu@hisilicon.com>
> To: Leon Romanovsky <leon@kernel.org>
> CC: Tim Sell <timothy.sell@unisys.com>, linux-doc@vger.kernel.org,
>  Alexander Shishkin <alexander.shishkin@linux.intel.com>, Zaibo Xu
>  <xuzaibo@huawei.com>, zhangfei.gao@foxmail.com, linuxarm@huawei.com,
>  haojian.zhuang@linaro.org, Christoph Lameter <cl@linux.com>, Hao Fang
>  <fanghao11@huawei.com>, Gavin Schenk <g.schenk@eckelmann.de>, RDMA mailing
>  list <linux-rdma@vger.kernel.org>, Zhou Wang <wangzhou1@hisilicon.com>,
>  Jason Gunthorpe <jgg@ziepe.ca>, Doug Ledford <dledford@redhat.com>, Uwe
>  Kleine-König <u.kleine-koenig@pengutronix.de>, David Kershner
>  <david.kershner@unisys.com>, Kenneth Lee <nek.in.cn@gmail.com>, Johan
>  Hovold <johan@kernel.org>, Jerome Glisse <jglisse@redhat.com>, Cyrille
>  Pitchen <cyrille.pitchen@free-electrons.com>, Sagar Dharia
>  <sdharia@codeaurora.org>, Jens Axboe <axboe@kernel.dk>,
>  guodong.xu@linaro.org, linux-netdev <netdev@vger.kernel.org>, Randy Dunlap
>  <rdunlap@infradead.org>, linux-kernel@vger.kernel.org, Vinod Koul
>  <vkoul@kernel.org>, linux-crypto@vger.kernel.org, Philippe Ombredanne
>  <pombredanne@nexb.com>, Sanyog Kale <sanyog.r.kale@intel.com>, "David S.
>  Miller" <davem@davemloft.net>, linux-accelerators@lists.ozlabs.org
> Subject: Re: [RFCv3 PATCH 1/6] uacce: Add documents for WarpDrive/uacce
> User-Agent: Mutt/1.5.21 (2010-09-15)
> Message-ID: <20181120023055.GG157308@Turing-Arch-b>
>
> On Mon, Nov 19, 2018 at 12:48:01PM +0200, Leon Romanovsky wrote:
> > Date: Mon, 19 Nov 2018 12:48:01 +0200
> > From: Leon Romanovsky <leon@kernel.org>
> > Message-ID: <20181119104801.GF8268@mtr-leonro.mtl.com>
> >
> > On Mon, Nov 19, 2018 at 05:19:10PM +0800, Kenneth Lee wrote:
> > > On Mon, Nov 19, 2018 at 05:14:05PM +0800, Kenneth Lee wrote:
> > > > Date: Mon, 19 Nov 2018 17:14:05 +0800
> > > > From: Kenneth Lee <liguozhu@hisilicon.com>
> > > > Message-ID: <20181119091405.GE157308@Turing-Arch-b>
> > > >
> > > > On Thu, Nov 15, 2018 at 04:54:55PM +0200, Leon Romanovsky wrote:
> > > > > Date: Thu, 15 Nov 2018 16:54:55 +0200
> > > > > From: Leon Romanovsky <leon@kernel.org>
> > > > > Message-ID: <20181115145455.GN3759@mtr-leonro.mtl.com>
> > > > >
> > > > > On Thu, Nov 15, 2018 at 04:51:09PM +0800, Kenneth Lee wrote:
> > > > > > On Wed, Nov 14, 2018 at 06:00:17PM +0200, Leon Romanovsky wrote:
> > > > > > > Date: Wed, 14 Nov 2018 18:00:17 +0200
> > > > > > > From: Leon Romanovsky <leon@kernel.org>
> > > > > > > Message-ID: <20181114160017.GI3759@mtr-leonro.mtl.com>
> > > > > > >
> > > > > > > On Wed, Nov 14, 2018 at 10:58:09AM +0800, Kenneth Lee wrote:
> > > > > > > >
> > > > > > > > On 2018/11/13 8:23 AM, Leon Romanovsky wrote:
> > > > > > > > > On Mon, Nov 12, 2018 at 03:58:02PM +0800, Kenneth Lee wrote:
> > > > > > > > > > From: Kenneth Lee <liguozhu@hisilicon.com>
> > > > > > > > > >
> > > > > > > > > > WarpDrive is a general accelerator framework for the user application to
> > > > > > > > > > access the hardware without going through the kernel in data path.
> > > > > > > > > >
> > > > > > > > > > The kernel component to provide kernel facility to driver for expose the
> > > > > > > > > > user interface is called uacce. It a short name for
> > > > > > > > > > "Unified/User-space-access-intended Accelerator Framework".
> > > > > > > > > >
> > > > > > > > > > This patch add document to explain how it works.
> > > > > > > > > + RDMA and netdev folks
> > > > > > > > >
> > > > > > > > > Sorry, to be late in the game, I don't see other patches, but from
> > > > > > > > > the description below it seems like you are reinventing RDMA verbs
> > > > > > > > > model. I have hard time to see the differences in the proposed
> > > > > > > > > framework to already implemented in drivers/infiniband/* for the kernel
> > > > > > > > > space and for the https://github.com/linux-rdma/rdma-core/ for the user
> > > > > > > > > space parts.
> > > > > > > >
> > > > > > > > Thanks Leon,
> > > > > > > >
> > > > > > > > Yes, we tried to solve similar problem in RDMA. We also learned a lot from
> > > > > > > > the exist code of RDMA.
> > > > > > > > But we have to make a new one because we cannot
> > > > > > > > register accelerators such as AI operation, encryption or compression to the
> > > > > > > > RDMA framework:)
> > > > > > >
> > > > > > > Assuming that you did everything right and still failed to use RDMA
> > > > > > > framework, you was supposed to fix it and not to reinvent new exactly
> > > > > > > same one. It is how we develop kernel, by reusing existing code.
> > > > > >
> > > > > > Yes, but we don't force other system such as NIC or GPU into RDMA, do we?
> > > > >
> > > > > You don't introduce new NIC or GPU, but proposing another interface to
> > > > > directly access HW memory and bypass kernel for the data path. This is
> > > > > whole idea of RDMA and this is why it is already present in the kernel.
> > > > >
> > > > > Various hardware devices are supported in our stack allow a ton of crazy
> > > > > stuff, including GPUs interconnections and NIC functionalities.
> > > >
> > > > Yes. We don't want to invent new wheel. That is why we did it behind VFIO in RFC
> > > > v1 and v2. But finally we were persuaded by Mr. Jerome Glisse that VFIO was not
> > > > a good place to solve the problem.
> >
> > I saw a couple of his responses, he constantly said to you that you are
> > reinventing the wheel.
> > https://lore.kernel.org/lkml/20180904150019.GA4024@redhat.com/
> >
>
> No. I think he asked me did not create trouble in VFIO but just use common
> interface from dma_buf and iommu itself. That is exactly what I am doing.
>
> > > >
> > > > And currently, as you see, IB is bound with devices doing RDMA. The register
> > > > function, ib_register_device() hint that it is a netdev (get_netdev() callback), it know
> > > > about gid, pkey, and Memory Window. IB is not simply a address space management
> > > > framework. And verbs to IB are not transparent. If we start to add
> > > > compression/decompression, AI (RNN, CNN stuff) operations, and encryption/decryption
> > > > to the verbs set. It will become very complex. Or maybe I misunderstand the
> > > > IB idea? But I don't see compression hardware is integrated in the mainline
> > > > Kernel. Could you directly point out which one I can use as a reference?
> > > >
> >
> > I strongly advise you to read the code, not all drivers are implementing
> > gids, pkeys and get_netdev() callback.
> >
> > Yes, you are misunderstanding drivers/infiniband subsystem. We have
> > plenty options to expose APIs to the user space applications, starting
> > from standard verbs API and ending with private objects which are
> > understandable by specific device/driver.
> >
> > IB stack provides secure FD to access device, by creating context,
> > after that you can send direct commands to the FW (see mlx5 DEVX
> > or hfi1) in sane way.
> >
> > So actually, you will need to register your device, declare your own
> > set of objects (similar to mlx5 include/uapi/rdma/mlx5_user_ioctl_*.h).
> >
> > In regards to reference of compression hardware, I don't have.
> > But there is an example of how T10-DIF can be implemented in verbs
> > layer:
> > https://www.openfabrics.org/images/2018workshop/presentations/307_TOved_T10-DIFOffload.pdf
> > Or IPsec crypto:
> > https://www.spinics.net/lists/linux-rdma/msg48906.html
> >
>
> OK. I will spend some time on it first. But according to current discussion,
> Don't you think I should avoid all these complexities but simply use SVM/SVA on
> iommu or let the user application use the kernel-allocated VMA and page? It
> does not create anything new. Just a new user of IOMMU and its SVM/SVA
> capability.
>

Hi, Leon,

I have done some architecture and code study to the IB solution these days.
Now I understand why you said WarpDrive was another wheel of IB. At the very
beginning when I understood the verbs concept, I had the same feeling. But
when I considered to merge them together, I finally found that it would be a
disaster to both of them if we do so.
As my understanding, the idea of IB is to manage share memory among "peers".
Verbs are to help the peers to communicate to each other with these share
memory, which is wrapped as MRs. The benefit of IB framework itself is to
provide the communication channel in the most efficient way. To do so, it
lets the user process send the communicating data to the hardware directly.

While the idea of WD is simply to provide a channel between the process and
the LOCAL devices and let them share "address space", rather than "memory
region".

We can take the device of WD accelerator as a "peer" in IB. But then most of
the semantics in verbs will become worthless, e.g. IBV_WR_RDMA_READ/WRITE. As
a local system, you just need to read or write to it, you don't need to "tell"
you are writing to it:). The semantics of verbs hint MRs are remote.

We can also invent a new "local dma" semantics to the original MRs semantic
space. But it is worthless, because it is already provided. Further, it
brings no benefit to the current IB users.

In another way, WD gets very little benefit by integrating into IB framework
either. The verb interface provides a standard interface for memory operation.
But WD only needs a pure message channel between the process and the device,
it does not intercept between them. The ODP feature will be provided by IOMMU
framework if Jean's patchset is upstreamed, we don't need to get it from IB.
Moreover, ODP simply provides the "fault from device" feature. But WD can also
benefit from the "share page table" feature, which does no good to IB.

Please understand I have no motivation to reinvent anything. As a software
architect for 10+ years and coder for 20+ years, I fully understand how hard
it is to make a module mature. It is not simply the problem of effort. But I
also understand what is going to happen if we merge improper requirement to
an existing module. I don't think it is wise to merge WD into IB. It hurts
both.

So would you change your previous conclusion?
Cheers
-Kenneth

> > > > > >
> > > > > > I assume you would not agree to register a zip accelerator to infiniband? :)
> > > > >
> > > > > "infiniband" name in the "drivers/infiniband/" is legacy one and the
> > > > > current code supports IB, RoCE, iWARP and OmniPath as a transport layers.
> > > > > For a long time, we wanted to rename that folder to be "drivers/rdma",
> > > > > but didn't find enough brave men/women to do it, due to backport mess
> > > > > for such move.
> > > > >
> > > > > The addition of zip accelerator to RDMA is possible and depends on how
> > > > > you will model such new functionality - new driver, or maybe new ULP.
> > > > >
> > > > > >
> > > > > > Further, I don't think it is wise to break an exist system (RDMA) to fulfill a
> > > > > > totally new scenario. The better choice is to let them run in parallel for some
> > > > > > time and try to merge them accordingly.
> > > > >
> > > > > Awesome, so please run your code out-of-tree for now and once you are ready
> > > > > for submission let's try to merge it.
> > > >
> > > > Yes, yes. We know trust need time to gain. But the fact is that there is no
> > > > accelerator user driver can be added to mainline kernel. We should raise the
> > > > topic time to time. So to help the communication to fix the gap, right?
> > > >
> > > > We are also opened to cooperate with IB to do it within the IB framework. But
> > > > please let me know where to start. I feel it is quite weird to make a
> > > > ib_register_device for a zip or RSA accelerator.
> >
> > Most of ib_ prefixes in drivers/infiniband/ are legacy names. You can
> > rename them to be rdma_register_device() if it helps.
> >
> > So from implementation point of view, as I wrote above.
> > Create minimal driver to register, expose MR to user space, add your own
> > objects and capabilities through our new KABI and implement user space part
> > in github.com/linux-rdma/rdma-core.
>
> I don't think it is just a name. But anyway, let me spend some time to try the
> possibility.
>
> > > > > > > >
> > > > > > > > Another problem we tried to address is the way to pin the memory for dma
> > > > > > > > operation. The RDMA way to pin the memory cannot avoid the page lost due to
> > > > > > > > copy-on-write operation during the memory is used by the device. This may
> > > > > > > > not be important to RDMA library. But it is important to accelerator.
> > > > > > >
> > > > > > > Such support exists in drivers/infiniband/ from late 2014 and
> > > > > > > it is called ODP (on demand paging).
> > > > > >
> > > > > > I reviewed ODP and I think it is a solution bound to infiniband. It is part of
> > > > > > MR semantics and required a infiniband specific hook
> > > > > > (ucontext->invalidate_range()). And the hook requires the device to be able to
> > > > > > stop using the page for a while for the copying. It is ok for infiniband
> > > > > > (actually, only mlx5 uses it). I don't think most accelerators can support
> > > > > > this mode. But WarpDrive works fully on top of IOMMU interface, it has no this
> > > > > > limitation.
> > > > >
> > > > > 1. It has nothing to do with infiniband.
> > > >
> > > > But it must be a ib_dev first.
> >
> > It is just a name.
> >
> > > >
> > > > > 2. MR and ucontext are verbs semantics and needed to ensure that host
> > > > > memory exposed to user is properly protected from security point of view.
> > > > > 3. "stop using the page for a while for the copying" - I'm not fully
> > > > > understand this claim, maybe this article will help you to better
> > > > > describe : https://lwn.net/Articles/753027/
> > > >
> > > > This topic was being discussed in RFCv2. The key problem here is that:
> > > >
> > > > The device need to hold the memory for its own calculation, but the CPU/software
> > > > want to stop it for a while for synchronizing with disk or COW.
> > > >
> > > > If the hardware support SVM/SVA (Shared Virtual Memory/Address), it is easy, the
> > > > device share page table with CPU, the device will raise a page fault when the
> > > > CPU downgrade the PTE to read-only.
> > > >
> > > > If the hardware cannot share page table with the CPU, we then need to have
> > > > some way to change the device page table. This is what happen in ODP. It
> > > > invalidates the page table in device upon mmu_notifier call back. But this cannot
> > > > solve the COW problem: if the user process A share a page P with device, and A
> > > > forks a new process B, and it continue to write to the page. By COW, the
> > > > process B will keep the page P, while A will get a new page P'. But you have
> > > > no way to let the device know it should use P' rather than P.
> >
> > I didn't hear about such issue and we supported fork for a long time.
> >
> > > >
> > > > This may be OK for RDMA application. Because RDMA is a big thing and we can ask
> > > > the programmer to avoid the situation. But for a accelerator, I don't think we
> > > > can ask a programmer to care for this when use a zlib.
> > > >
> > > > In WarpDrive/uacce, we make this simple. If you support IOMMU and it support
> > > > SVM/SVA. Everything will be fine just like ODP implicit mode. And you don't need
> > > > to write any code for that. Because it has been done by IOMMU framework. If it
> > > > does not, you have to use the kernel allocated memory which has the same IOVA as
> > > > the VA in user space. So we can still maintain a unify address space among the
> > > > devices and the application.
> > > >
> > > > > 4. mlx5 supports ODP not because of being partially IB device,
> > > > > but because HW performance oriented implementation is not an easy task.
> > > > >
> > > > > > > >
> > > > > > > > Hope this can help the understanding.
> > > > > > >
> > > > > > > Yes, it helped me a lot.
> > > > > > > Now, I'm more than before convinced that this whole patchset shouldn't
> > > > > > > exist in the first place.
> > > > > >
> > > > > > Then maybe you can tell me how I can register my accelerator to the user space?
> > > > >
> > > > > Write kernel driver and write user space part of it.
> > > > > https://github.com/linux-rdma/rdma-core/
> > > > >
> > > > > I have no doubts that your colleagues who wrote and maintain
> > > > > drivers/infiniband/hw/hns driver know best how to do it.
> > > > > They did it very successfully.
> > > > >
> > > > > Thanks
> > > > >
> > > > > > >
> > > > > > > To be clear, NAK.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > >
> > > > > > > > Cheers
> > > > > > > >
> > > > > > > > >
> > > > > > > > > Hard NAK from RDMA side.
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > > [...]
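A minimal sketch, not taken from the thread, of the copy-on-write divergence that the fork argument above is built on. It involves no device, no pinning and no IOMMU; it only shows that after fork() a write by the parent to private memory is not seen by the child, i.e. the two processes end up backed by different pages. Which physical page a device that mapped the buffer before the fork would keep using is the disputed part and is not demonstrated here. The function and buffer names are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Private process memory that exists before the fork. */
static char shared_before_fork[16];

/*
 * Returns 1 if a write made by the parent after fork() is invisible to the
 * child, 0 if the child sees the new contents, -1 on error.
 */
int parent_write_hidden_from_child(void)
{
	int go[2];		/* parent -> child: "I have written" */
	int status;
	char token = 'x';
	pid_t pid;

	strcpy(shared_before_fork, "old");
	if (pipe(go) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(go[0]);
		close(go[1]);
		return -1;
	}

	if (pid == 0) {
		close(go[1]);
		/* Wait until the parent has modified its view of the buffer. */
		if (read(go[0], &token, 1) != 1)
			_exit(2);
		_exit(strcmp(shared_before_fork, "old") == 0 ? 0 : 1);
	}

	close(go[0]);
	/* This write faults and the kernel gives the writer a private copy. */
	strcpy(shared_before_fork, "new");
	if (write(go[1], &token, 1) != 1) {
		close(go[1]);	/* child sees EOF and exits with status 2 */
		waitpid(pid, NULL, 0);
		return -1;
	}
	close(go[1]);

	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
	    WEXITSTATUS(status) == 2)
		return -1;
	return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

A caller could check the result with `assert(parent_write_hidden_from_child() == 1);`. The pipe is used only to order the parent's write before the child's read, so the outcome does not depend on scheduling.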
diff --git a/Documentation/warpdrive/warpdrive.rst b/Documentation/warpdrive/warpdrive.rst
new file mode 100644
index 000000000000..ef84d3a2d462
--- /dev/null
+++ b/Documentation/warpdrive/warpdrive.rst
@@ -0,0 +1,260 @@
+Introduction of WarpDrive
+=========================
+
+*WarpDrive* is a general accelerator framework for the user application to
+access the hardware without going through the kernel in data path.
+
+It can be used as the quick channel for accelerators, network adaptors or
+other hardware for application in user space.
+
+This may make some implementation simpler. E.g. you can reuse most of the
+*netdev* driver in kernel and just share some ring buffer to the user space
+driver for *DPDK* [4] or *ODP* [5]. Or you can combine the RSA accelerator with
+the *netdev* in the user space as a https reversed proxy, etc.
+
+*WarpDrive* takes the hardware accelerator as a heterogeneous processor which
+can share particular load from the CPU:
+
+.. image:: wd.svg
+	:alt: WarpDrive Concept
+
+The virtual concept, queue, is used to manage the requests sent to the
+accelerator. The application send requests to the queue by writing to some
+particular address, while the hardware takes the requests directly from the
+address and send feedback accordingly.
+
+The format of the queue may differ from hardware to hardware. But the
+application need not to make any system call for the communication.
+
+*WarpDrive* tries to create a shared virtual address space for all involved
+accelerators. Within this space, the requests sent to queue can refer to any
+virtual address, which will be valid to the application and all involved
+accelerators.
+
+The name *WarpDrive* is simply a cool and general name meaning the framework
+makes the application faster. It includes general user library, kernel
+management module and drivers for the hardware. In kernel, the management
+module is called *uacce*, meaning "Unified/User-space-access-intended
+Accelerator Framework".
+
+
+How does it work
+================
+
+*WarpDrive* uses *mmap* and *IOMMU* to play the trick.
+
+*Uacce* creates a chrdev for the device registered to it. A "queue" will be
+created when the chrdev is opened. The application access the queue by mmap
+different address region of the queue file.
+
+The following figure demonstrated the queue file address space:
+
+.. image:: wd_q_addr_space.svg
+	:alt: WarpDrive Queue Address Space
+
+The first region of the space, device region, is used for the application to
+write request or read answer to or from the hardware.
+
+Normally, there can be three types of device regions mmio and memory regions.
+It is recommended to use common memory for request/answer descriptors and use
+the mmio space for device notification, such as doorbell. But of course, this
+is all up to the interface designer.
+
+There can be two types of device memory regions, kernel-only and user-shared.
+This will be explained in the "kernel APIs" section.
+
+The Static Share Virtual Memory region is necessary only when the device IOMMU
+does not support "Share Virtual Memory". This will be explained after the
+*IOMMU* idea.
+
+
+Architecture
+------------
+
+The full *WarpDrive* architecture is represented in the following class
+diagram:
+
+.. image:: wd-arch.svg
+	:alt: WarpDrive Architecture
+
+
+The user API
+------------
+
+We adopt a polling style interface in the user space: ::
+
+	int wd_request_queue(struct wd_queue *q);
+	void wd_release_queue(struct wd_queue *q);
+
+	int wd_send(struct wd_queue *q, void *req);
+	int wd_recv(struct wd_queue *q, void **req);
+	int wd_recv_sync(struct wd_queue *q, void **req);
+	void wd_flush(struct wd_queue *q);
+
+wd_recv_sync() is a wrapper to its non-sync version. It will trapped into
+kernel and waits until the queue become available.
+
+If the queue do not support SVA/SVM. The following helper function
+can be used to create Static Virtual Share Memory: ::
+
+	void *wd_preserve_share_memory(struct wd_queue *q, size_t size);
+
+The user API is not mandatory. It is simply a suggestion and hint what the
+kernel interface is supposed to support.
+
+
+The user driver
+---------------
+
+The queue file mmap space will need a user driver to wrap the communication
+protocol. *UACCE* provides some attributes in sysfs for the user driver to
+match the right accelerator accordingly.
+
+The *UACCE* device attribute is under the following directory:
+
+/sys/class/uacce/<dev-name>/params
+
+The following attributes is supported:
+
+nr_queue_remained (ro)
+	number of queue remained
+
+api_version (ro)
+	a string to identify the queue mmap space format and its version
+
+device_attr (ro)
+	attributes of the device, see UACCE_DEV_xxx flag defined in uacce.h
+
+numa_node (ro)
+	id of numa node
+
+priority (rw)
+	Priority or the device, bigger is higher
+
+(This is not yet implemented in RFC version)
+
+
+The kernel API
+--------------
+
+The *uacce* kernel API is defined in uacce.h. If the hardware support SVM/SVA,
+The driver need only the following API functions: ::
+
+	int uacce_register(uacce);
+	void uacce_unregister(uacce);
+	void uacce_wake_up(q);
+
+*uacce_wake_up* is used to notify the process who epoll() on the queue file.
+
+According to the IOMMU capability, *uacce* categories the devices as follow:
+
+UACCE_DEV_NOIOMMU
+	The device has no IOMMU. The user process cannot use VA on the hardware
+	This mode is not recommended.
+
+UACCE_DEV_SVA (UACCE_DEV_PASID | UACCE_DEV_FAULT_FROM_DEV)
+	The device has IOMMU which can share the same page table with user
+	process
+
+UACCE_DEV_SHARE_DOMAIN
+	The device has IOMMU which has no multiple page table and device page
+	fault support
+
+If the device works in mode other than UACCE_DEV_NOIOMMU, *uacce* will set its
+IOMMU to IOMMU_DOMAIN_UNMANAGED. So the driver must not use any kernel
+DMA API but the following ones from *uacce* instead: ::
+
+	uacce_dma_map(q, va, size, prot);
+	uacce_dma_unmap(q, va, size, prot);
+
+*uacce_dma_map/unmap* is valid only for UACCE_DEV_SVA device. It creates a
+particular PASID and page table for the kernel in the IOMMU (Not yet
+implemented in the RFC)
+
+For the UACCE_DEV_SHARE_DOMAIN device, uacce_dma_map/unmap is not valid.
+*Uacce* call back start_queue only when the DUS and DKO region is mmapped. The
+accelerator driver must use those dma buffer, via uacce_queue->qfrs[], on
+start_queue call back. The size of the queue file region is defined by
+uacce->ops->qf_pg_start[].
+
+We have to do it this way because most of current IOMMU cannot support the
+kernel and user virtual address at the same time. So we have to let them both
+share the same user virtual address space.
+
+If the device have to support kernel and user at the same time, both kernel
+and the user should use these DMA API. This is not convenient. A better
+solution is to change the future DMA/IOMMU design to let them separate the
+address space between the user and kernel space. But it is not going to be in
+a short time.
+
+
+Multiple processes support
+==========================
+
+In the latest mainline kernel (4.19) when this document is written, the IOMMU
+subsystem do not support multiple process page tables yet.
+
+Most IOMMU hardware implementation support multi-process with the concept
+of PASID. But they may use different name, e.g. it is call sub-stream-id in
+SMMU of ARM. With PASID or similar design, multi page table can be added to
+the IOMMU and referred by its PASID.
+
+*JPB* has a patchset to enable this[1]_. We have tested it with our hardware
+(which is known as *D06*). It works well. *WarpDrive* rely on them to support
+UACCE_DEV_SVA. If it is not enabled, *WarpDrive* can still work. But it
+support only one process, the device will be set to UACCE_DEV_SHARE_DOMAIN
+even it is set to UACCE_DEV_SVA initially.
+
+Static Share Virtual Memory is mainly used by UACCE_DEV_SHARE_DOMAIN device.
+
+
+Legacy Mode Support
+===================
+For the hardware without IOMMU, WarpDrive can still work, the only problem is
+VA cannot be used in the device. The driver should adopt another strategy for
+the shared memory. It is only for testing, and not recommended.
+
+
+The Folk Scenario
+=================
+For a process with allocated queues and shared memory, what happen if it forks
+a child?
+
+The fd of the queue will be duplicated on folk, so the child can send request
+to the same queue as its parent. But the requests which is sent from processes
+except for the one who open the queue will be blocked.
+
+It is recommended to add O_CLOEXEC to the queue file.
+
+The queue mmap space has a VM_DONTCOPY in its VMA. So the child will lost all
+those VMAs.
+
+This is why *WarpDrive* does not adopt the mode used in *VFIO* and *InfiniBand*.
+Both solutions can set any user pointer for hardware sharing. But they cannot
+support fork when the dma is in process. Or the "Copy-On-Write" procedure will
+make the parent process lost its physical pages.
+
+
+The Sample Code
+===============
+There is a sample user land implementation with a simple driver for Hisilicon
+Hi1620 ZIP Accelerator.
+
+To test, do the following in samples/warpdrive (for the case of PC host): ::
+	./autogen.sh
+	./conf.sh # or simply ./configure if you build on target system
+	make
+
+Then you can get test_hisi_zip in the test subdirectory. Copy it to the target
+system and make sure the hisi_zip driver is enabled (the major and minor of
+the uacce chrdev can be gotten from the dmesg or sysfs), and run: ::
+	mknod /dev/ua1 c <major> <minior>
+	test/test_hisi_zip -z < data > data.zip
+	test/test_hisi_zip -g < data > data.gzip
+
+
+References
+==========
+..
[1] https://patchwork.kernel.org/patch/10394851/ + +.. vim: tw=78 diff --git a/Documentation/warpdrive/wd-arch.svg b/Documentation/warpdrive/wd-arch.svg new file mode 100644 index 000000000000..e59934188443 --- /dev/null +++ b/Documentation/warpdrive/wd-arch.svg @@ -0,0 +1,764 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?> +<!-- Created with Inkscape (http://www.inkscape.org/) --> + +<svg + xmlns:dc="http://purl.org/dc/elements/1.1/" + xmlns:cc="http://creativecommons.org/ns#" + xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" + xmlns:svg="http://www.w3.org/2000/svg" + xmlns="http://www.w3.org/2000/svg" + xmlns:xlink="http://www.w3.org/1999/xlink" + xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd" + xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape" + width="210mm" + height="193mm" + viewBox="0 0 744.09449 683.85823" + id="svg2" + version="1.1" + inkscape:version="0.92.3 (2405546, 2018-03-11)" + sodipodi:docname="wd-arch.svg"> + <defs + id="defs4"> + <linearGradient + inkscape:collect="always" + id="linearGradient6830"> + <stop + style="stop-color:#000000;stop-opacity:1;" + offset="0" + id="stop6832" /> + <stop + style="stop-color:#000000;stop-opacity:0;" + offset="1" + id="stop6834" /> + </linearGradient> + <linearGradient + inkscape:collect="always" + xlink:href="#linearGradient5026" + id="linearGradient5032" + x1="353" + y1="211.3622" + x2="565.5" + y2="174.8622" + gradientUnits="userSpaceOnUse" + gradientTransform="translate(-89.949614,405.94594)" /> + <linearGradient + inkscape:collect="always" + id="linearGradient5026"> + <stop + style="stop-color:#f2f2f2;stop-opacity:1;" + offset="0" + id="stop5028" /> + <stop + style="stop-color:#f2f2f2;stop-opacity:0;" + offset="1" + id="stop5030" /> + </linearGradient> + <filter + inkscape:collect="always" + style="color-interpolation-filters:sRGB" + id="filter4169-3" + x="-0.031597666" + width="1.0631953" + y="-0.099812768" + height="1.1996255"> + <feGaussianBlur + 
inkscape:collect="always" + stdDeviation="1.3307599" + id="feGaussianBlur4171-6" /> + </filter> + <linearGradient + inkscape:collect="always" + xlink:href="#linearGradient5026" + id="linearGradient5032-1" + x1="353" + y1="211.3622" + x2="565.5" + y2="174.8622" + gradientUnits="userSpaceOnUse" + gradientTransform="translate(175.77842,400.29111)" /> + <filter + inkscape:collect="always" + style="color-interpolation-filters:sRGB" + id="filter4169-3-0" + x="-0.031597666" + width="1.0631953" + y="-0.099812768" + height="1.1996255"> + <feGaussianBlur + inkscape:collect="always" + stdDeviation="1.3307599" + id="feGaussianBlur4171-6-9" /> + </filter> + <marker + markerWidth="18.960653" + markerHeight="11.194658" + refX="9.4803267" + refY="5.5973287" + orient="auto" + id="marker4613"> + <rect + y="-5.1589785" + x="5.8504119" + height="10.317957" + width="10.317957" + id="rect4212" + style="fill:#ffffff;stroke:#000000;stroke-width:0.69143367;stroke-miterlimit:4;stroke-dasharray:none" + transform="matrix(0.86111274,0.50841405,-0.86111274,0.50841405,0,0)"> + <title + id="title4262">generation</title> + </rect> + </marker> + <marker + markerWidth="11.227358" + markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825"> + <path + inkscape:connector-curvature="0" + id="path4757" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + <marker + markerWidth="11.227358" + markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825-6"> + <path + inkscape:connector-curvature="0" + id="path4757-1" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + <linearGradient + inkscape:collect="always" + 
xlink:href="#linearGradient5026" + id="linearGradient5032-3-9" + x1="353" + y1="211.3622" + x2="565.5" + y2="174.8622" + gradientUnits="userSpaceOnUse" + gradientTransform="matrix(1.2452511,0,0,0.98513016,-190.95632,540.33156)" /> + <filter + inkscape:collect="always" + style="color-interpolation-filters:sRGB" + id="filter4169-3-5-8" + x="-0.031597666" + width="1.0631953" + y="-0.099812768" + height="1.1996255"> + <feGaussianBlur + inkscape:collect="always" + stdDeviation="1.3307599" + id="feGaussianBlur4171-6-3-9" /> + </filter> + <marker + markerWidth="11.227358" + markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825-6-2"> + <path + inkscape:connector-curvature="0" + id="path4757-1-9" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + <marker + markerWidth="11.227358" + markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825-6-2-1"> + <path + inkscape:connector-curvature="0" + id="path4757-1-9-9" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + <linearGradient + inkscape:collect="always" + xlink:href="#linearGradient5026" + id="linearGradient5032-3-9-7" + x1="353" + y1="211.3622" + x2="565.5" + y2="174.8622" + gradientUnits="userSpaceOnUse" + gradientTransform="matrix(1.3742742,0,0,0.97786398,-234.52617,654.63367)" /> + <filter + inkscape:collect="always" + style="color-interpolation-filters:sRGB" + id="filter4169-3-5-8-5" + x="-0.031597666" + width="1.0631953" + y="-0.099812768" + height="1.1996255"> + <feGaussianBlur + inkscape:collect="always" + stdDeviation="1.3307599" + id="feGaussianBlur4171-6-3-9-0" /> + </filter> + <marker + markerWidth="11.227358" + 
markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825-6-2-6"> + <path + inkscape:connector-curvature="0" + id="path4757-1-9-1" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + <linearGradient + inkscape:collect="always" + xlink:href="#linearGradient5026" + id="linearGradient5032-3-9-4" + x1="353" + y1="211.3622" + x2="565.5" + y2="174.8622" + gradientUnits="userSpaceOnUse" + gradientTransform="matrix(1.3742912,0,0,2.0035845,-468.34428,342.56603)" /> + <filter + inkscape:collect="always" + style="color-interpolation-filters:sRGB" + id="filter4169-3-5-8-54" + x="-0.031597666" + width="1.0631953" + y="-0.099812768" + height="1.1996255"> + <feGaussianBlur + inkscape:collect="always" + stdDeviation="1.3307599" + id="feGaussianBlur4171-6-3-9-7" /> + </filter> + <marker + markerWidth="11.227358" + markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825-6-2-1-8"> + <path + inkscape:connector-curvature="0" + id="path4757-1-9-9-6" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + <marker + markerWidth="11.227358" + markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825-6-2-1-8-8"> + <path + inkscape:connector-curvature="0" + id="path4757-1-9-9-6-9" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + <marker + markerWidth="11.227358" + markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825-6-0"> + <path + inkscape:connector-curvature="0" + 
id="path4757-1-93" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + <marker + markerWidth="11.227358" + markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825-6-0-2"> + <path + inkscape:connector-curvature="0" + id="path4757-1-93-6" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + <filter + inkscape:collect="always" + style="color-interpolation-filters:sRGB" + id="filter5382" + x="-0.089695387" + width="1.1793908" + y="-0.10052069" + height="1.2010413"> + <feGaussianBlur + inkscape:collect="always" + stdDeviation="0.86758925" + id="feGaussianBlur5384" /> + </filter> + <linearGradient + inkscape:collect="always" + xlink:href="#linearGradient6830" + id="linearGradient6836" + x1="362.73923" + y1="700.04059" + x2="340.4751" + y2="678.25488" + gradientUnits="userSpaceOnUse" + gradientTransform="translate(-23.771026,-135.76835)" /> + <marker + markerWidth="11.227358" + markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825-6-2-6-2"> + <path + inkscape:connector-curvature="0" + id="path4757-1-9-1-9" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + <linearGradient + inkscape:collect="always" + xlink:href="#linearGradient5026" + id="linearGradient5032-3-9-7-3" + x1="353" + y1="211.3622" + x2="565.5" + y2="174.8622" + gradientUnits="userSpaceOnUse" + gradientTransform="matrix(1.3742742,0,0,0.97786395,-57.357186,649.55786)" /> + <filter + inkscape:collect="always" + style="color-interpolation-filters:sRGB" + 
id="filter4169-3-5-8-5-0" + x="-0.031597666" + width="1.0631953" + y="-0.099812768" + height="1.1996255"> + <feGaussianBlur + inkscape:collect="always" + stdDeviation="1.3307599" + id="feGaussianBlur4171-6-3-9-0-2" /> + </filter> + <marker + markerWidth="11.227358" + markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825-6-2-1-1"> + <path + inkscape:connector-curvature="0" + id="path4757-1-9-9-0" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + </defs> + <sodipodi:namedview + id="base" + pagecolor="#ffffff" + bordercolor="#666666" + borderopacity="1.0" + inkscape:pageopacity="0.0" + inkscape:pageshadow="2" + inkscape:zoom="0.98994949" + inkscape:cx="222.32868" + inkscape:cy="370.44492" + inkscape:document-units="px" + inkscape:current-layer="layer1" + showgrid="false" + inkscape:window-width="1916" + inkscape:window-height="1033" + inkscape:window-x="0" + inkscape:window-y="22" + inkscape:window-maximized="0" + fit-margin-right="0.3" + inkscape:snap-global="false" /> + <metadata + id="metadata7"> + <rdf:RDF> + <cc:Work + rdf:about=""> + <dc:format>image/svg+xml</dc:format> + <dc:type + rdf:resource="http://purl.org/dc/dcmitype/StillImage" /> + <dc:title /> + </cc:Work> + </rdf:RDF> + </metadata> + <g + inkscape:label="Layer 1" + inkscape:groupmode="layer" + id="layer1" + transform="translate(0,-368.50374)"> + <rect + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3)" + id="rect4136-3-6" + width="101.07784" + height="31.998148" + x="283.01144" + y="588.80896" /> + <rect + style="fill:url(#linearGradient5032);fill-opacity:1;stroke:#000000;stroke-width:0.6465112" + id="rect4136-2" + width="101.07784" + height="31.998148" + x="281.63498" + y="586.75739" /> + <text + xml:space="preserve" + 
style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="294.21747" + y="612.50073" + id="text4138-6"><tspan + sodipodi:role="line" + id="tspan4140-1" + x="294.21747" + y="612.50073" + style="font-size:15px;line-height:1.25">WarpDrive</tspan></text> + <rect + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-0)" + id="rect4136-3-6-3" + width="101.07784" + height="31.998148" + x="548.7395" + y="583.15417" /> + <rect + style="fill:url(#linearGradient5032-1);fill-opacity:1;stroke:#000000;stroke-width:0.6465112" + id="rect4136-2-60" + width="101.07784" + height="31.998148" + x="547.36304" + y="581.1026" /> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="557.83484" + y="602.32745" + id="text4138-6-6"><tspan + sodipodi:role="line" + id="tspan4140-1-2" + x="557.83484" + y="602.32745" + style="font-size:15px;line-height:1.25">user_driver</tspan></text> + <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4613)" + d="m 547.36304,600.78954 -156.58203,0.0691" + id="path4855" + inkscape:connector-curvature="0" + sodipodi:nodetypes="cc" /> + <rect + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-8)" + id="rect4136-3-6-5-7" + width="101.07784" + height="31.998148" + x="128.74678" + y="80.648842" + transform="matrix(1.2452511,0,0,0.98513016,113.15182,641.02594)" /> + <rect + style="fill:url(#linearGradient5032-3-9);fill-opacity:1;stroke:#000000;stroke-width:0.71606314" + id="rect4136-2-6-3" + 
width="125.86729" + height="31.522341" + x="271.75983" + y="718.45435" /> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="309.13705" + y="745.55371" + id="text4138-6-2-6"><tspan + sodipodi:role="line" + id="tspan4140-1-9-1" + x="309.13705" + y="745.55371" + style="font-size:15px;line-height:1.25">uacce</tspan></text> + <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-2)" + d="m 329.57309,619.72453 5.0373,97.14447" + id="path4661-3" + inkscape:connector-curvature="0" + sodipodi:nodetypes="cc" /> + <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-2-1)" + d="m 342.57219,830.63108 -5.67699,-79.2841" + id="path4661-3-4" + inkscape:connector-curvature="0" + sodipodi:nodetypes="cc" /> + <rect + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-8-5)" + id="rect4136-3-6-5-7-3" + width="101.07784" + height="31.998148" + x="128.74678" + y="80.648842" + transform="matrix(1.3742742,0,0,0.97786398,101.09126,754.58534)" /> + <rect + style="fill:url(#linearGradient5032-3-9-7);fill-opacity:1;stroke:#000000;stroke-width:0.74946606" + id="rect4136-2-6-3-6" + width="138.90866" + height="31.289837" + x="276.13297" + y="831.44263" /> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="295.67819" + y="852.98224" + id="text4138-6-2-6-1"><tspan + sodipodi:role="line" + id="tspan4140-1-9-1-0" + 
x="295.67819" + y="852.98224" + style="font-size:15px;line-height:1.25">Device Driver</tspan></text> + <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-2-6)" + d="m 623.05084,615.00104 0.51369,333.80219" + id="path4661-3-5" + inkscape:connector-curvature="0" + sodipodi:nodetypes="cc" /> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;text-align:center;letter-spacing:0px;word-spacing:0px;text-anchor:middle;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="392.63568" + y="660.83667" + id="text4138-6-2-6-1-6-2-5"><tspan + sodipodi:role="line" + x="392.63568" + y="660.83667" + id="tspan4305" + style="font-size:15px;line-height:1.25"><<anom_file>></tspan><tspan + sodipodi:role="line" + x="392.63568" + y="679.58667" + style="font-size:15px;line-height:1.25" + id="tspan1139">Queue FD</tspan></text> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="389.92969" + y="587.44836" + id="text4138-6-2-6-1-6-2-56"><tspan + sodipodi:role="line" + id="tspan4140-1-9-1-0-3-0-9" + x="389.92969" + y="587.44836" + style="font-size:15px;line-height:1.25">1</tspan></text> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="528.64813" + y="600.08429" + id="text4138-6-2-6-1-6-3"><tspan + sodipodi:role="line" + id="tspan4140-1-9-1-0-3-7" + x="528.64813" + y="600.08429" + 
style="font-size:15px;line-height:1.25">*</tspan></text> + <rect + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-8-54)" + id="rect4136-3-6-5-7-4" + width="101.07784" + height="31.998148" + x="128.74678" + y="80.648842" + transform="matrix(1.3745874,0,0,1.8929066,-132.7754,556.04505)" /> + <rect + style="fill:url(#linearGradient5032-3-9-4);fill-opacity:1;stroke:#000000;stroke-width:1.07280123" + id="rect4136-2-6-3-4" + width="138.91039" + height="64.111" + x="42.321312" + y="704.8371" /> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="110.30745" + y="722.94025" + id="text4138-6-2-6-3"><tspan + sodipodi:role="line" + x="111.99202" + y="722.94025" + id="tspan4366" + style="font-size:15px;line-height:1.25;text-align:center;text-anchor:middle">other standard </tspan><tspan + sodipodi:role="line" + x="110.30745" + y="741.69025" + id="tspan4368" + style="font-size:15px;line-height:1.25;text-align:center;text-anchor:middle">framework</tspan><tspan + sodipodi:role="line" + x="110.30745" + y="760.44025" + style="font-size:15px;line-height:1.25;text-align:center;text-anchor:middle" + id="tspan6840">(crypto/nic/others)</tspan></text> + <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-2-1-8)" + d="M 276.29661,849.04109 134.04449,771.90853" + id="path4661-3-4-8" + inkscape:connector-curvature="0" + sodipodi:nodetypes="cc" /> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="313.70813" + 
y="730.06366" + id="text4138-6-2-6-36"><tspan + sodipodi:role="line" + id="tspan4140-1-9-1-7" + x="313.70813" + y="730.06366" + style="font-size:10px;line-height:1.25"><<lkm>></tspan></text> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;text-align:start;letter-spacing:0px;word-spacing:0px;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="259.53165" + y="797.8056" + id="text4138-6-2-6-1-6-2-5-7-5"><tspan + sodipodi:role="line" + x="259.53165" + y="797.8056" + style="font-size:15px;line-height:1.25;text-align:start;text-anchor:start" + id="tspan2357">uacce register api</tspan></text> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="29.145819" + y="833.44244" + id="text4138-6-2-6-1-6-2-5-7-5-2"><tspan + sodipodi:role="line" + x="29.145819" + y="833.44244" + id="tspan4301" + style="font-size:15px;line-height:1.25">register to other subsystem</tspan></text> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="301.20813" + y="597.29437" + id="text4138-6-2-6-36-1"><tspan + sodipodi:role="line" + id="tspan4140-1-9-1-7-2" + x="301.20813" + y="597.29437" + style="font-size:10px;line-height:1.25"><<user_lib>></tspan></text> + <text + xml:space="preserve" + 
style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;text-align:center;letter-spacing:0px;word-spacing:0px;text-anchor:middle;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="615.9505" + y="739.44012" + id="text4138-6-2-6-1-6-2-5-3"><tspan + sodipodi:role="line" + x="615.9505" + y="739.44012" + id="tspan4274-7" + style="font-size:15px;line-height:1.25">mmapped memory r/w interface</tspan></text> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;text-align:center;letter-spacing:0px;word-spacing:0px;text-anchor:middle;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="371.01291" + y="529.23682" + id="text4138-6-2-6-1-6-2-5-36"><tspan + sodipodi:role="line" + x="371.01291" + y="529.23682" + id="tspan4305-3" + style="font-size:15px;line-height:1.25">wd user api</tspan></text> + <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + d="m 328.19325,585.87943 0,-23.57142" + id="path4348" + inkscape:connector-curvature="0" /> + <ellipse + style="opacity:1;fill:#ffffff;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0" + id="path4350" + cx="328.01468" + cy="551.95081" + rx="11.607142" + ry="10.357142" /> + <path + style="opacity:0.444;fill:url(#linearGradient6836);fill-opacity:1;fill-rule:evenodd;stroke:none;stroke-width:1;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;filter:url(#filter5382)" + id="path4350-2" + sodipodi:type="arc" + sodipodi:cx="329.44327" + sodipodi:cy="553.37933" + sodipodi:rx="11.607142" + sodipodi:ry="10.357142" + sodipodi:start="0" + sodipodi:end="6.2509098" + d="m 341.05041,553.37933 a 11.607142,10.357142 0 0 1 -11.51349,10.35681 
11.607142,10.357142 0 0 1 -11.69928,-10.18967 11.607142,10.357142 0 0 1 11.32469,-10.52124 11.607142,10.357142 0 0 1 11.88204,10.01988" + sodipodi:open="true" /> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;text-align:center;letter-spacing:0px;word-spacing:0px;text-anchor:middle;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="619.67596" + y="978.22363" + id="text4138-6-2-6-1-6-2-5-36-3"><tspan + sodipodi:role="line" + x="619.67596" + y="978.22363" + id="tspan4305-3-67" + style="font-size:15px;line-height:1.25">Device(Hardware)</tspan></text> + <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-2-6-2)" + d="m 347.51164,865.4527 193.91929,99.10053" + id="path4661-3-5-1" + inkscape:connector-curvature="0" + sodipodi:nodetypes="cc" /> + <rect + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-8-5-0)" + id="rect4136-3-6-5-7-3-1" + width="101.07784" + height="31.998148" + x="128.74678" + y="80.648842" + transform="matrix(1.3742742,0,0,0.97786395,278.26025,749.50952)" /> + <rect + style="fill:url(#linearGradient5032-3-9-7-3);fill-opacity:1;stroke:#000000;stroke-width:0.74946606" + id="rect4136-2-6-3-6-0" + width="138.90868" + height="31.289839" + x="453.30197" + y="826.36682" /> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;font-size:12px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="493.68158" + y="847.90643" + id="text4138-6-2-6-1-5"><tspan + sodipodi:role="line" + id="tspan4140-1-9-1-0-1" + x="493.68158" + y="847.90643" + style="font-size:15px;line-height:1.25;stroke-width:1px">IOMMU</tspan></text> + 
<path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-2-1-1)" + d="m 389.49372,755.46667 111.75324,68.4507" + id="path4661-3-4-85" + inkscape:connector-curvature="0" + sodipodi:nodetypes="cc" /> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;font-size:12px;line-height:0%;font-family:sans-serif;text-align:start;letter-spacing:0px;word-spacing:0px;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="427.70282" + y="776.91418" + id="text4138-6-2-6-1-6-2-5-7-5-0"><tspan + sodipodi:role="line" + x="427.70282" + y="776.91418" + style="font-size:15px;line-height:1.25;text-align:start;text-anchor:start;stroke-width:1px" + id="tspan2357-6">manage the driver iommu state</tspan></text> + </g> +</svg> diff --git a/Documentation/warpdrive/wd.svg b/Documentation/warpdrive/wd.svg new file mode 100644 index 000000000000..87ab92ebfbc6 --- /dev/null +++ b/Documentation/warpdrive/wd.svg @@ -0,0 +1,526 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?> +<!-- Created with Inkscape (http://www.inkscape.org/) --> + +<svg + xmlns:dc="http://purl.org/dc/elements/1.1/" + xmlns:cc="http://creativecommons.org/ns#" + xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" + xmlns:svg="http://www.w3.org/2000/svg" + xmlns="http://www.w3.org/2000/svg" + xmlns:xlink="http://www.w3.org/1999/xlink" + xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd" + xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape" + width="210mm" + height="116mm" + viewBox="0 0 744.09449 411.02338" + id="svg2" + version="1.1" + inkscape:version="0.92.3 (2405546, 2018-03-11)" + sodipodi:docname="wd.svg"> + <defs + id="defs4"> + <linearGradient + inkscape:collect="always" + id="linearGradient5026"> + <stop + style="stop-color:#f2f2f2;stop-opacity:1;" + offset="0" + 
id="stop5028" /> + <stop + style="stop-color:#f2f2f2;stop-opacity:0;" + offset="1" + id="stop5030" /> + </linearGradient> + <linearGradient + inkscape:collect="always" + xlink:href="#linearGradient5026" + id="linearGradient5032-3" + x1="353" + y1="211.3622" + x2="565.5" + y2="174.8622" + gradientUnits="userSpaceOnUse" + gradientTransform="matrix(2.7384117,0,0,0.91666329,-952.8283,571.10143)" /> + <filter + inkscape:collect="always" + style="color-interpolation-filters:sRGB" + id="filter4169-3-5" + x="-0.031597666" + width="1.0631953" + y="-0.099812768" + height="1.1996255"> + <feGaussianBlur + inkscape:collect="always" + stdDeviation="1.3307599" + id="feGaussianBlur4171-6-3" /> + </filter> + <marker + markerWidth="18.960653" + markerHeight="11.194658" + refX="9.4803267" + refY="5.5973287" + orient="auto" + id="marker4613"> + <rect + y="-5.1589785" + x="5.8504119" + height="10.317957" + width="10.317957" + id="rect4212" + style="fill:#ffffff;stroke:#000000;stroke-width:0.69143367;stroke-miterlimit:4;stroke-dasharray:none" + transform="matrix(0.86111274,0.50841405,-0.86111274,0.50841405,0,0)"> + <title + id="title4262">generation</title> + </rect> + </marker> + <marker + markerWidth="11.227358" + markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825"> + <path + inkscape:connector-curvature="0" + id="path4757" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + <marker + markerWidth="11.227358" + markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825-6"> + <path + inkscape:connector-curvature="0" + id="path4757-1" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + 
<linearGradient + inkscape:collect="always" + xlink:href="#linearGradient5026" + id="linearGradient5032-3-9" + x1="353" + y1="211.3622" + x2="565.5" + y2="174.8622" + gradientUnits="userSpaceOnUse" + gradientTransform="matrix(1.2452511,0,0,0.98513016,-190.95632,540.33156)" /> + <marker + markerWidth="11.227358" + markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825-6-2"> + <path + inkscape:connector-curvature="0" + id="path4757-1-9" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + <marker + markerWidth="11.227358" + markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825-6-2-1"> + <path + inkscape:connector-curvature="0" + id="path4757-1-9-9" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + <marker + markerWidth="11.227358" + markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825-6-2-6"> + <path + inkscape:connector-curvature="0" + id="path4757-1-9-1" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + <marker + markerWidth="11.227358" + markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825-6-2-1-8"> + <path + inkscape:connector-curvature="0" + id="path4757-1-9-9-6" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + <marker + markerWidth="11.227358" + markerHeight="12.355258" + 
refX="10" + refY="6.177629" + orient="auto" + id="marker4825-6-2-1-8-8"> + <path + inkscape:connector-curvature="0" + id="path4757-1-9-9-6-9" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + <marker + markerWidth="11.227358" + markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825-6-0"> + <path + inkscape:connector-curvature="0" + id="path4757-1-93" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + <marker + markerWidth="11.227358" + markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825-6-0-2"> + <path + inkscape:connector-curvature="0" + id="path4757-1-93-6" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + <marker + markerWidth="11.227358" + markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825-6-2-6-2"> + <path + inkscape:connector-curvature="0" + id="path4757-1-9-1-9" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + <linearGradient + inkscape:collect="always" + xlink:href="#linearGradient5026" + id="linearGradient5032-3-8" + x1="353" + y1="211.3622" + x2="565.5" + y2="174.8622" + gradientUnits="userSpaceOnUse" + gradientTransform="matrix(1.0104674,0,0,1.0052679,-218.642,661.15448)" /> + <filter + inkscape:collect="always" + style="color-interpolation-filters:sRGB" + id="filter4169-3-5-8" + 
x="-0.031597666" + width="1.0631953" + y="-0.099812768" + height="1.1996255"> + <feGaussianBlur + inkscape:collect="always" + stdDeviation="1.3307599" + id="feGaussianBlur4171-6-3-9" /> + </filter> + <linearGradient + inkscape:collect="always" + xlink:href="#linearGradient5026" + id="linearGradient5032-3-8-2" + x1="353" + y1="211.3622" + x2="565.5" + y2="174.8622" + gradientUnits="userSpaceOnUse" + gradientTransform="matrix(2.1450559,0,0,1.0052679,-521.97704,740.76422)" /> + <filter + inkscape:collect="always" + style="color-interpolation-filters:sRGB" + id="filter4169-3-5-8-5" + x="-0.031597666" + width="1.0631953" + y="-0.099812768" + height="1.1996255"> + <feGaussianBlur + inkscape:collect="always" + stdDeviation="1.3307599" + id="feGaussianBlur4171-6-3-9-1" /> + </filter> + <linearGradient + inkscape:collect="always" + xlink:href="#linearGradient5026" + id="linearGradient5032-3-8-0" + x1="353" + y1="211.3622" + x2="565.5" + y2="174.8622" + gradientUnits="userSpaceOnUse" + gradientTransform="matrix(1.0104674,0,0,1.0052679,83.456748,660.20747)" /> + <filter + inkscape:collect="always" + style="color-interpolation-filters:sRGB" + id="filter4169-3-5-8-6" + x="-0.031597666" + width="1.0631953" + y="-0.099812768" + height="1.1996255"> + <feGaussianBlur + inkscape:collect="always" + stdDeviation="1.3307599" + id="feGaussianBlur4171-6-3-9-2" /> + </filter> + <linearGradient + inkscape:collect="always" + xlink:href="#linearGradient5026" + id="linearGradient5032-3-84" + x1="353" + y1="211.3622" + x2="565.5" + y2="174.8622" + gradientUnits="userSpaceOnUse" + gradientTransform="matrix(1.9884948,0,0,0.94903536,-318.42665,564.37696)" /> + <filter + inkscape:collect="always" + style="color-interpolation-filters:sRGB" + id="filter4169-3-5-4" + x="-0.031597666" + width="1.0631953" + y="-0.099812768" + height="1.1996255"> + <feGaussianBlur + inkscape:collect="always" + stdDeviation="1.3307599" + id="feGaussianBlur4171-6-3-0" /> + </filter> + <marker + markerWidth="11.227358" + 
markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825-6-0-0"> + <path + inkscape:connector-curvature="0" + id="path4757-1-93-8" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + <marker + markerWidth="11.227358" + markerHeight="12.355258" + refX="10" + refY="6.177629" + orient="auto" + id="marker4825-6-3"> + <path + inkscape:connector-curvature="0" + id="path4757-1-1" + d="M 0.42024733,0.42806444 10.231357,6.3500844 0.24347733,11.918544" + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" /> + </marker> + </defs> + <sodipodi:namedview + id="base" + pagecolor="#ffffff" + bordercolor="#666666" + borderopacity="1.0" + inkscape:pageopacity="0.0" + inkscape:pageshadow="2" + inkscape:zoom="0.98994949" + inkscape:cx="457.47339" + inkscape:cy="250.14781" + inkscape:document-units="px" + inkscape:current-layer="layer1" + showgrid="false" + inkscape:window-width="1916" + inkscape:window-height="1033" + inkscape:window-x="0" + inkscape:window-y="22" + inkscape:window-maximized="0" + fit-margin-right="0.3" /> + <metadata + id="metadata7"> + <rdf:RDF> + <cc:Work + rdf:about=""> + <dc:format>image/svg+xml</dc:format> + <dc:type + rdf:resource="http://purl.org/dc/dcmitype/StillImage" /> + <dc:title></dc:title> + </cc:Work> + </rdf:RDF> + </metadata> + <g + inkscape:label="Layer 1" + inkscape:groupmode="layer" + id="layer1" + transform="translate(0,-641.33861)"> + <rect + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5)" + id="rect4136-3-6-5" + width="101.07784" + height="31.998148" + x="128.74678" + y="80.648842" + transform="matrix(2.7384116,0,0,0.91666328,-284.06895,664.79751)" /> + <rect + 
style="fill:url(#linearGradient5032-3);fill-opacity:1;stroke:#000000;stroke-width:1.02430749" + id="rect4136-2-6" + width="276.79272" + height="29.331528" + x="64.723419" + y="736.84473" /> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="78.223282" + y="756.79803" + id="text4138-6-2"><tspan + sodipodi:role="line" + id="tspan4140-1-9" + x="78.223282" + y="756.79803" + style="font-size:15px;line-height:1.25">user application (running on the CPU)</tspan></text> + <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6)" + d="m 217.67507,876.6738 113.40331,45.0758" + id="path4661" + inkscape:connector-curvature="0" + sodipodi:nodetypes="cc" /> + <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-0)" + d="m 208.10197,767.69811 0.29362,76.03656" + id="path4661-6" + inkscape:connector-curvature="0" + sodipodi:nodetypes="cc" /> + <rect + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-8)" + id="rect4136-3-6-5-3" + width="101.07784" + height="31.998148" + x="128.74678" + y="80.648842" + transform="matrix(1.0104673,0,0,1.0052679,28.128628,763.90722)" /> + <rect + style="fill:url(#linearGradient5032-3-8);fill-opacity:1;stroke:#000000;stroke-width:0.65159565" + id="rect4136-2-6-6" + width="102.13586" + height="32.16671" + x="156.83217" + y="842.91852" /> + <text + xml:space="preserve" + 
style="font-style:normal;font-weight:normal;font-size:12px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="188.58519" + y="864.47125" + id="text4138-6-2-8"><tspan + sodipodi:role="line" + id="tspan4140-1-9-0" + x="188.58519" + y="864.47125" + style="font-size:15px;line-height:1.25;stroke-width:1px">MMU</tspan></text> + <rect + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-8-5)" + id="rect4136-3-6-5-3-1" + width="101.07784" + height="31.998148" + x="128.74678" + y="80.648842" + transform="matrix(2.1450556,0,0,1.0052679,1.87637,843.51696)" /> + <rect + style="fill:url(#linearGradient5032-3-8-2);fill-opacity:1;stroke:#000000;stroke-width:0.94937181" + id="rect4136-2-6-6-0" + width="216.8176" + height="32.16671" + x="275.09283" + y="922.5282" /> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;font-size:12px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="347.81482" + y="943.23291" + id="text4138-6-2-8-8"><tspan + sodipodi:role="line" + id="tspan4140-1-9-0-5" + x="347.81482" + y="943.23291" + style="font-size:15px;line-height:1.25;stroke-width:1px">Memory</tspan></text> + <rect + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-8-6)" + id="rect4136-3-6-5-3-5" + width="101.07784" + height="31.998148" + x="128.74678" + y="80.648842" + transform="matrix(1.0104673,0,0,1.0052679,330.22737,762.9602)" /> + <rect + style="fill:url(#linearGradient5032-3-8-0);fill-opacity:1;stroke:#000000;stroke-width:0.65159565" + id="rect4136-2-6-6-8" + width="102.13586" + height="32.16671" + x="458.93091" + y="841.9715" /> + <text + xml:space="preserve" + 
style="font-style:normal;font-weight:normal;font-size:12px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="490.68393" + y="863.52423" + id="text4138-6-2-8-6"><tspan + sodipodi:role="line" + id="tspan4140-1-9-0-2" + x="490.68393" + y="863.52423" + style="font-size:15px;line-height:1.25;stroke-width:1px">IOMMU</tspan></text> + <rect + style="fill:#000000;stroke:#000000;stroke-width:0.6465112;filter:url(#filter4169-3-5-4)" + id="rect4136-3-6-5-6" + width="101.07784" + height="31.998148" + x="128.74678" + y="80.648842" + transform="matrix(1.9884947,0,0,0.94903537,167.19229,661.38193)" /> + <rect + style="fill:url(#linearGradient5032-3-84);fill-opacity:1;stroke:#000000;stroke-width:0.88813609" + id="rect4136-2-6-2" + width="200.99274" + height="30.367374" + x="420.4675" + y="735.97351" /> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;font-size:12px;line-height:0%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="441.95297" + y="755.9068" + id="text4138-6-2-9"><tspan + sodipodi:role="line" + id="tspan4140-1-9-9" + x="441.95297" + y="755.9068" + style="font-size:15px;line-height:1.25;stroke-width:1px">Hardware Accelerator</tspan></text> + <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-0-0)" + d="m 508.2914,766.55885 0.29362,76.03656" + id="path4661-6-1" + inkscape:connector-curvature="0" + sodipodi:nodetypes="cc" /> + <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker4825-6-3)" + d="M 499.70201,876.47297 361.38296,920.80258" + id="path4661-1" + 
inkscape:connector-curvature="0" + sodipodi:nodetypes="cc" /> + </g> +</svg> diff --git a/Documentation/warpdrive/wd_q_addr_space.svg b/Documentation/warpdrive/wd_q_addr_space.svg new file mode 100644 index 000000000000..5e6cf8e89908 --- /dev/null +++ b/Documentation/warpdrive/wd_q_addr_space.svg @@ -0,0 +1,359 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?> +<!-- Created with Inkscape (http://www.inkscape.org/) --> + +<svg + xmlns:dc="http://purl.org/dc/elements/1.1/" + xmlns:cc="http://creativecommons.org/ns#" + xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" + xmlns:svg="http://www.w3.org/2000/svg" + xmlns="http://www.w3.org/2000/svg" + xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd" + xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape" + width="210mm" + height="124mm" + viewBox="0 0 210 124" + version="1.1" + id="svg8" + inkscape:version="0.92.3 (2405546, 2018-03-11)" + sodipodi:docname="wd_q_addr_space.svg"> + <defs + id="defs2"> + <marker + inkscape:stockid="Arrow1Mend" + orient="auto" + refY="0" + refX="0" + id="marker5428" + style="overflow:visible" + inkscape:isstock="true"> + <path + id="path5426" + d="M 0,0 5,-5 -12.5,0 5,5 Z" + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" + transform="matrix(-0.4,0,0,-0.4,-4,0)" + inkscape:connector-curvature="0" /> + </marker> + <marker + inkscape:isstock="true" + style="overflow:visible" + id="marker2922" + refX="0" + refY="0" + orient="auto" + inkscape:stockid="Arrow1Mend" + inkscape:collect="always"> + <path + transform="matrix(-0.4,0,0,-0.4,-4,0)" + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" + d="M 0,0 5,-5 -12.5,0 5,5 Z" + id="path2920" + inkscape:connector-curvature="0" /> + </marker> + <marker + inkscape:stockid="Arrow1Mstart" + orient="auto" + refY="0" + refX="0" + id="Arrow1Mstart" + style="overflow:visible" + 
inkscape:isstock="true"> + <path + id="path840" + d="M 0,0 5,-5 -12.5,0 5,5 Z" + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" + transform="matrix(0.4,0,0,0.4,4,0)" + inkscape:connector-curvature="0" /> + </marker> + <marker + inkscape:stockid="Arrow1Mend" + orient="auto" + refY="0" + refX="0" + id="Arrow1Mend" + style="overflow:visible" + inkscape:isstock="true" + inkscape:collect="always"> + <path + id="path843" + d="M 0,0 5,-5 -12.5,0 5,5 Z" + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" + transform="matrix(-0.4,0,0,-0.4,-4,0)" + inkscape:connector-curvature="0" /> + </marker> + <marker + inkscape:stockid="Arrow1Mstart" + orient="auto" + refY="0" + refX="0" + id="Arrow1Mstart-5" + style="overflow:visible" + inkscape:isstock="true"> + <path + inkscape:connector-curvature="0" + id="path840-1" + d="M 0,0 5,-5 -12.5,0 5,5 Z" + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" + transform="matrix(0.4,0,0,0.4,4,0)" /> + </marker> + <marker + inkscape:stockid="Arrow1Mend" + orient="auto" + refY="0" + refX="0" + id="Arrow1Mend-1" + style="overflow:visible" + inkscape:isstock="true"> + <path + inkscape:connector-curvature="0" + id="path843-0" + d="M 0,0 5,-5 -12.5,0 5,5 Z" + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" + transform="matrix(-0.4,0,0,-0.4,-4,0)" /> + </marker> + <marker + inkscape:isstock="true" + style="overflow:visible" + id="marker2922-2" + refX="0" + refY="0" + orient="auto" + inkscape:stockid="Arrow1Mend" + inkscape:collect="always"> + <path + transform="matrix(-0.4,0,0,-0.4,-4,0)" + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" + d="M 0,0 5,-5 -12.5,0 5,5 Z" + id="path2920-9" + inkscape:connector-curvature="0" /> + </marker> + 
<marker + inkscape:isstock="true" + style="overflow:visible" + id="marker2922-27" + refX="0" + refY="0" + orient="auto" + inkscape:stockid="Arrow1Mend" + inkscape:collect="always"> + <path + transform="matrix(-0.4,0,0,-0.4,-4,0)" + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" + d="M 0,0 5,-5 -12.5,0 5,5 Z" + id="path2920-0" + inkscape:connector-curvature="0" /> + </marker> + <marker + inkscape:isstock="true" + style="overflow:visible" + id="marker2922-27-8" + refX="0" + refY="0" + orient="auto" + inkscape:stockid="Arrow1Mend" + inkscape:collect="always"> + <path + transform="matrix(-0.4,0,0,-0.4,-4,0)" + style="fill:#000000;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:1.00000003pt;stroke-opacity:1" + d="M 0,0 5,-5 -12.5,0 5,5 Z" + id="path2920-0-0" + inkscape:connector-curvature="0" /> + </marker> + </defs> + <sodipodi:namedview + id="base" + pagecolor="#ffffff" + bordercolor="#666666" + borderopacity="1.0" + inkscape:pageopacity="0.0" + inkscape:pageshadow="2" + inkscape:zoom="1.4" + inkscape:cx="401.66654" + inkscape:cy="218.12255" + inkscape:document-units="mm" + inkscape:current-layer="layer1" + showgrid="false" + inkscape:window-width="1916" + inkscape:window-height="1033" + inkscape:window-x="0" + inkscape:window-y="22" + inkscape:window-maximized="0" /> + <metadata + id="metadata5"> + <rdf:RDF> + <cc:Work + rdf:about=""> + <dc:format>image/svg+xml</dc:format> + <dc:type + rdf:resource="http://purl.org/dc/dcmitype/StillImage" /> + <dc:title /> + </cc:Work> + </rdf:RDF> + </metadata> + <g + inkscape:label="Layer 1" + inkscape:groupmode="layer" + id="layer1" + transform="translate(0,-173)"> + <rect + style="opacity:0.82999998;fill:none;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.4;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:0.82745098" + id="rect815" + width="21.262758" + height="40.350552" + x="55.509361" + y="195.00098" + 
ry="0" /> + <rect + style="opacity:0.82999998;fill:none;fill-opacity:1;fill-rule:evenodd;stroke:#000000;stroke-width:0.4;stroke-miterlimit:4;stroke-dasharray:none;stroke-dashoffset:0;stroke-opacity:0.82745098" + id="rect815-1" + width="21.24276" + height="43.732346" + x="55.519352" + y="235.26543" + ry="0" /> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;font-size:2.9765625px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="50.549229" + y="190.6078" + id="text1118"><tspan + sodipodi:role="line" + id="tspan1116" + x="50.549229" + y="190.6078" + style="stroke-width:0.26458332px">queue file address space</tspan></text> + <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + d="M 76.818568,194.95453 H 97.229281" + id="path1126" + inkscape:connector-curvature="0" + sodipodi:nodetypes="cc" /> + <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + d="M 76.818568,235.20899 H 96.095361" + id="path1126-8" + inkscape:connector-curvature="0" /> + <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + d="m 76.762111,278.99778 h 19.27678" + id="path1126-0" + inkscape:connector-curvature="0" /> + <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + d="m 55.519355,265.20165 v 19.27678" + id="path1126-2" + inkscape:connector-curvature="0" /> + <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + d="m 76.762111,265.20165 v 19.27678" + 
id="path1126-2-1" + inkscape:connector-curvature="0" /> + <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-start:url(#Arrow1Mstart);marker-end:url(#Arrow1Mend)" + d="m 87.590896,194.76554 0,39.87648" + id="path1126-2-1-0" + inkscape:connector-curvature="0" + sodipodi:nodetypes="cc" /> + <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-start:url(#Arrow1Mstart-5);marker-end:url(#Arrow1Mend-1)" + d="m 82.48822,235.77596 v 42.90029" + id="path1126-2-1-0-8" + inkscape:connector-curvature="0" + sodipodi:nodetypes="cc" /> + <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker2922)" + d="M 44.123633,195.3325 H 55.651907" + id="path2912" + inkscape:connector-curvature="0" /> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;font-size:2.9765625px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="32.217381" + y="196.27745" + id="text2968"><tspan + sodipodi:role="line" + id="tspan2966" + x="32.217381" + y="196.27745" + style="stroke-width:0.26458332px">offset 0</tspan></text> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;font-size:2.9765625px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="91.199554" + y="216.03946" + id="text1118-5"><tspan + sodipodi:role="line" + id="tspan1116-0" + x="91.199554" + y="216.03946" + style="stroke-width:0.26458332px">device region (mapped to device mmio or shared kernel 
driver memory)</tspan></text> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;font-size:2.9765625px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="86.188072" + y="244.50081" + id="text1118-5-6"><tspan + sodipodi:role="line" + id="tspan1116-0-4" + x="86.188072" + y="244.50081" + style="stroke-width:0.26458332px">static shared virtual memory region (for devices without shared virtual memory)</tspan></text> + <flowRoot + xml:space="preserve" + id="flowRoot5699" + style="font-style:normal;font-weight:normal;font-size:11.25px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1"><flowRegion + id="flowRegion5701"><rect + id="rect5703" + width="5182.8569" + height="385.71429" + x="34.285713" + y="71.09111" /></flowRegion><flowPara + id="flowPara5705" /></flowRoot> <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker2922-2)" + d="M 43.679028,206.85268 H 55.207302" + id="path2912-1" + inkscape:connector-curvature="0" /> + <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker2922-27)" + d="M 44.057004,224.23959 H 55.585278" + id="path2912-9" + inkscape:connector-curvature="0" /> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;font-size:2.9765625px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="24.139778" + y="202.40636" + id="text1118-5-3"><tspan + 
sodipodi:role="line" + id="tspan1116-0-6" + x="24.139778" + y="202.40636" + style="stroke-width:0.26458332px">device mmio region</tspan></text> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;font-size:2.9765625px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="17.010948" + y="216.73672" + id="text1118-5-3-3"><tspan + sodipodi:role="line" + id="tspan1116-0-6-6" + x="17.010948" + y="216.73672" + style="stroke-width:0.26458332px">device kernel only region</tspan></text> + <path + style="fill:none;fill-rule:evenodd;stroke:#000000;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1;marker-end:url(#marker2922-27-8)" + d="M 43.981087,235.35153 H 55.509361" + id="path2912-9-2" + inkscape:connector-curvature="0" /> + <text + xml:space="preserve" + style="font-style:normal;font-weight:normal;font-size:2.9765625px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + x="17.575975" + y="230.53285" + id="text1118-5-3-3-0"><tspan + sodipodi:role="line" + id="tspan1116-0-6-6-5" + x="17.575975" + y="230.53285" + style="stroke-width:0.26458332px">device user share region</tspan></text> + </g> +</svg>