In this paper we addressed the problem of user-level DMA, that is, starting a DMA operation from user level without the help of the operating system kernel. Previous approaches to user-level DMA do manage to launch DMA operations from user level, but they require that the operating system kernel be modified to avoid race conditions when multiple users try to start DMA operations at the same time. We believe that the need to modify the operating system kernel is a major obstacle to the widespread use of user-level DMA, since most users are unwilling or unable to modify their underlying operating system kernel in any way.
We proposed several methods that achieve user-level DMA without any modifications to the operating system kernel. These methods vary in their simplicity and in what they require of the host computer. Our ``PAL Code'' method is the simplest of all, but requires a DEC Alpha as the host processor. The ``Extended Shadow Addressing'' method is simple as well, but requires a large physical address space. The other two methods (``Key-based DMA'' and ``Repeated Passing of Arguments''), although a bit more elaborate, function correctly in the general case, without any assumptions about the host computer. Using our proposed methods, a DMA operation can be initiated in only 2 to 5 assembly instructions, all issued from user level.
We believe that our user-level DMA methods should be seriously considered for inclusion in high-speed network interfaces. Research prototypes have shown that the hardware cost of user-level DMA is low [3, 8, 9], and in this paper we have shown that its software cost is also low, since it can be achieved without operating system kernel modifications.