Recency-based TLB preloading

Published: 01 May 2000

Abstract

Caching and other latency-tolerating techniques have been quite successful in maintaining high memory system performance for general-purpose processors. However, TLB misses have become a serious bottleneck as working sets grow beyond the capacity of TLBs.

This work presents one of the first attempts to hide TLB miss latency by using preloading techniques. We present results for traditional next-page TLB miss preloading, an approach shown to cut some of the misses. However, a key contribution of this work is a novel TLB miss prediction algorithm based on the concept of “recency”, and we show that it can predict over 55% of the TLB misses for the five commercial applications considered.
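To make the next-page idea concrete, the following is a minimal sketch, not the paper's simulator. It assumes a small fully associative TLB with LRU replacement and assumes that preloaded translations are placed directly into the TLB; the abstract does not say how preloaded entries are stored. The names `TLB` and `count_misses` are hypothetical. On each miss to virtual page `p`, the sketch also loads the translation for page `p + 1`.

```python
from collections import OrderedDict


class TLB:
    """Fully associative TLB with LRU replacement (a simplifying assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # virtual page number -> True, LRU first

    def lookup(self, vpn):
        """Return True on a hit and mark the entry as most recently used."""
        if vpn in self.entries:
            self.entries.move_to_end(vpn)
            return True
        return False

    def insert(self, vpn):
        """Install a translation, evicting least recently used entries if full."""
        self.entries[vpn] = True
        self.entries.move_to_end(vpn)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)


def count_misses(page_trace, capacity, preload_next=False):
    """Count TLB misses for a trace of virtual page numbers."""
    tlb = TLB(capacity)
    misses = 0
    for vpn in page_trace:
        if tlb.lookup(vpn):
            continue
        misses += 1
        if preload_next:
            # Next-page preloading: also bring in the following page.
            tlb.insert(vpn + 1)
        # Insert the demanded page last so it is the most recently used.
        tlb.insert(vpn)
    return misses


sequential = list(range(16))
print(count_misses(sequential, capacity=4))                     # 16
print(count_misses(sequential, capacity=4, preload_next=True))  # 8

scattered = [0, 7, 3, 12]
print(count_misses(scattered, capacity=4, preload_next=True))   # 4
```

In this toy setting, next-page preloading halves the misses of a purely sequential page walk and does nothing for a scattered access pattern, which illustrates why it removes only some of the misses.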

Published in

ACM SIGARCH Computer Architecture News, Volume 28, Issue 2
Special Issue: Proceedings of the 27th annual international symposium on Computer architecture (ISCA '00)
May 2000
325 pages
ISSN: 0163-5964
DOI: 10.1145/342001

ISCA '00: Proceedings of the 27th annual international symposium on Computer architecture
June 2000
327 pages
ISBN: 1581132328
DOI: 10.1145/339647

Copyright © 2000 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States