© 1997 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

IEEE
Volume 6 Number 3, June 1998

Performance Modeling of Multiprocessor Implementations of Protocols

Mats Björkman and Per Gunningberg, Member, IEEE

Page 262.

Abstract:

Two major performance bottlenecks in multiprocessor execution of protocols are contention for shared memory and for locks. Locks are used to protect shared messages and/or shared protocol state in a memory shared by competing processors. Mutual exclusion by locking can be costly, in terms of both lock contention and memory contention, if the parallel protocol code frequently accesses shared state and data. This paper presents a queueing network model for performance predictions of shared-memory multiprocessor protocol executions. Predictions from this model are compared to performance measurements from a multiprocessor implementation of two commonly used communication protocol stacks, transmission control protocol/Internet protocol (TCP/IP)/Ethernet and user datagram protocol/Internet protocol (UDP/IP)/Ethernet. These stacks are implemented on a parallelized version of the x-kernel protocol environment from the University of Arizona. A "processor-per-message" paradigm is used to partition the load among the processors. The measured speedups for the parallel implementations relative to the sequential ones are more than 11 times for UDP (using 20 processors) and three times for TCP (using five processors) on a Sequent Symmetry. We show that the model accurately captures the effects of lock and memory contention in our shared-memory multiprocessor and predicts the performance with a discrepancy of less than 10%.
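
To put the reported speedups in perspective, they can be restated as parallel efficiency (speedup divided by processor count). The following is a minimal sketch, not part of the paper: the function name `parallel_efficiency` and the `reported` table are illustrative assumptions, and the UDP figure is treated as a lower bound because the abstract says "more than 11 times".

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Return speedup per processor; 1.0 would be ideal linear scaling."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


# Figures quoted in the abstract as (speedup, processors).
# Assumption: 11.0 is used as a lower bound for UDP ("more than 11 times").
reported = {"UDP": (11.0, 20), "TCP": (3.0, 5)}

for name, (speedup, processors) in reported.items():
    eff = parallel_efficiency(speedup, processors)
    print(f"{name}: efficiency about {eff:.2f} on {processors} processors")
```

Under these assumptions, UDP reaches an efficiency of at least 0.55 on 20 processors and TCP about 0.60 on five, which is consistent with the abstract's point that lock and memory contention limit scaling.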
