ABSTRACT
Event tracing is an important tool for understanding the performance of parallel applications. As concurrency increases in leadership-class computing systems, the quantity of performance log data can overload the parallel file system, perturbing the very application being observed. In this work, we present a solution for event tracing at leadership-class scale. We enhance the I/O forwarding system software to aggregate and reorganize log data prior to writing it to the storage system, significantly reducing the burden on the underlying file system for this type of traffic. Furthermore, we augment the I/O forwarding system with a write-buffering capability to limit the impact of artificial perturbations from log data accesses on traced applications. To validate the approach, we modify the Vampir tracing toolset to take advantage of this new capability and show that the approach increases the maximum traced application size fivefold, to more than 200,000 processes.
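To make the aggregation and write-buffering idea concrete, the following minimal C sketch illustrates the general technique: many small per-process trace records are absorbed into a large staging buffer on the forwarding side and reach the file system only as large, contiguous writes. This is an illustration of the concept under stated assumptions, not the paper's actual IOFSL or Vampir code; the log_buffer structure, FLUSH_THRESHOLD, and all function names are hypothetical.

```c
/*
 * Hypothetical sketch of write buffering at an I/O forwarding layer.
 * Small trace records from compute processes are copied into a large
 * staging buffer; the expensive file system write happens only when
 * the buffer fills, so the traced application never blocks on the
 * file system for each individual record.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define FLUSH_THRESHOLD (4 * 1024 * 1024)   /* flush in 4 MiB chunks (assumed) */

struct log_buffer {
    unsigned char *data;   /* staging buffer on the forwarding node */
    size_t         used;   /* bytes currently buffered              */
    size_t         cap;    /* total buffer capacity                 */
    FILE          *out;    /* aggregated trace file                 */
};

static int buffer_init(struct log_buffer *b, const char *path, size_t cap)
{
    b->data = malloc(cap);
    b->used = 0;
    b->cap  = cap;
    b->out  = fopen(path, "wb");
    return (b->data && b->out) ? 0 : -1;
}

/* One large, contiguous write replaces many small ones. */
static int buffer_flush(struct log_buffer *b)
{
    if (b->used && fwrite(b->data, 1, b->used, b->out) != b->used)
        return -1;
    b->used = 0;
    return 0;
}

/*
 * Called once per incoming trace record.  The caller returns as soon
 * as the record is copied into the staging buffer.  Records are
 * assumed to be much smaller than the buffer itself.
 */
static int buffer_append(struct log_buffer *b, const void *rec, size_t len)
{
    if (b->used + len > b->cap && buffer_flush(b) != 0)
        return -1;
    memcpy(b->data + b->used, rec, len);
    b->used += len;
    return 0;
}

int main(void)
{
    struct log_buffer buf;
    if (buffer_init(&buf, "trace.agg", FLUSH_THRESHOLD) != 0)
        return 1;

    /* Simulate many small per-process event records. */
    for (int i = 0; i < 100000; i++) {
        char rec[64];
        int n = snprintf(rec, sizeof rec, "rank=%d event=%d\n", i % 512, i);
        if (buffer_append(&buf, rec, (size_t)n) != 0)
            return 1;
    }
    buffer_flush(&buf);   /* drain whatever remains at shutdown */
    fclose(buf.out);
    free(buf.data);
    return 0;
}
```

In a real forwarding deployment the staging buffer would sit on a dedicated I/O node and absorb records arriving over the network from many compute processes, but the core trade (one large sequential write in place of many small scattered ones) is the same as in this single-node sketch.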