Abstract
This paper proposes new network interface controller (NIC) designs that take advantage of integration with the host CPU to provide increased flexibility for operating system kernel-based performance optimization. We believe that this approach is more likely to meet the needs of current and future high-bandwidth TCP/IP networking on end hosts than the current trend of putting more complexity in the NIC, while avoiding the need to modify applications and protocols. This paper presents two such NICs. The first, the simple integrated NIC (SINIC), is a minimally complex design that moves the responsibility for managing the network FIFOs from the NIC to the kernel. Despite this closer interaction between the kernel and the NIC, SINIC provides performance equivalent to a conventional DMA-based NIC without increasing CPU overhead. The second design, V-SINIC, adds virtual per-packet registers to SINIC, enabling parallel packet processing while maintaining a FIFO model. V-SINIC allows the kernel to decouple examining a packet's header from copying its payload to memory. We exploit this capability to implement a true zero-copy receive optimization in the Linux 2.6 kernel, providing bandwidth improvements of over 50% on unmodified sockets-based receive-intensive benchmarks.
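The idea of decoupling header examination from payload copying can be illustrated with a toy model. The sketch below is not the SINIC or V-SINIC register interface and is not kernel code. The `ToyNic` class, the 4-byte ASCII port header, and both receive functions are hypothetical names invented for illustration. It assumes one plausible reading of the abstract: if the kernel can read a packet's header while the payload is still in the NIC FIFO, it can find the destination buffer first and move the payload once, rather than staging the whole packet in a kernel buffer and copying it again.

```python
from collections import deque

HEADER_LEN = 4  # hypothetical toy header: a 4-digit ASCII port number


class ToyNic:
    """Toy receive FIFO (an assumption for illustration, not real hardware)."""

    def __init__(self):
        self.fifo = deque()

    def receive_from_wire(self, port, payload):
        # Queue a packet made of the toy header followed by the payload.
        self.fifo.append(f"{port:04d}".encode() + payload)

    def peek_header(self):
        # Examine the head packet's header without removing the packet.
        return int(self.fifo[0][:HEADER_LEN].decode())

    def copy_payload(self, dest):
        # Move the head packet's payload into dest and pop the packet.
        packet = self.fifo.popleft()
        dest.extend(packet[HEADER_LEN:])

    def pop_packet(self):
        # Hand over the whole packet, as a staging transfer would.
        return self.fifo.popleft()


def staged_receive(nic, user_buffers):
    """Stage each packet in a kernel buffer, then copy to the user buffer."""
    moves = 0
    while nic.fifo:
        kernel_buf = bytearray(nic.pop_packet())   # NIC -> kernel buffer
        moves += 1
        port = int(kernel_buf[:HEADER_LEN].decode())
        user_buffers[port].extend(kernel_buf[HEADER_LEN:])  # kernel -> user
        moves += 1
    return moves


def header_first_receive(nic, user_buffers):
    """Read the header first, then move the payload straight to its buffer."""
    moves = 0
    while nic.fifo:
        port = nic.peek_header()
        nic.copy_payload(user_buffers[port])       # NIC -> user buffer
        moves += 1
    return moves
```

In this toy, `staged_receive` moves each payload twice and `header_first_receive` moves it once. The counts describe only the model and say nothing about the bandwidth figures reported in the paper.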
Integrated network interfaces for high-bandwidth TCP/IP
ASPLOS XII: Proceedings of the 12th International Conference on Architectural Support for Programming Languages and Operating Systems