Characterizing Deep-Learning
I/O Workloads in TensorFlow
Steven W. D. Chien, Stefano Markidis,
Chaitanya Prasad Sishtla, Luis Santos,
Pawel Herman, Sai Narasimhamurthy,
Erwin Laure
6/10/2018 DOI: 10.1109/PDSW-DISCS.2018.00011
TensorFlow Doing HPC
Steven W. D. Chien, Stefano Markidis,
Vyacheslav Olshevsky,
Yaroslav Bulatov,
Erwin Laure, Jeffrey S. Vetter
11/03/2019 DOI: 10.1109/IPDPSW.2019.00092
Multi-GPU Acceleration of the
iPIC3D Implicit Particle-in-Cell Code
Chaitanya Prasad Sishtla,
Steven W. D. Chien,
Vyacheslav Olshevsky, Erwin Laure,
Stefano Markidis
7/05/2019 DOI: 10.1007/978-3-030-22750-0_58
Posit NPB: Assessing the Precision Improvement
in HPC Scientific Applications
Steven W. D. Chien, Ivy B. Peng,
Stefano Markidis
12/07/2019 DOI: 10.1007/978-3-030-43229-4_26
Automated classification of plasma regions
using 3D particle energy distribution
Vyacheslav Olshevsky,
Yuri V. Khotyaintsev,
Andrey Divin, Gian Luca Delzanno,
Sven Anderzen, Pawel Herman,
Steven W. D. Chien, Levon Avanov,
Stefano Markidis
Exposition, Clarification, and Expansion
of MPI Semantic Terms and Conventions
Purushotham V. Bangalore,
Rolf Rabenseifner, Daniel J. Holmes,
Julien Jaeger, Guillaume Mercier,
Claudia Blaas-Schenner,
Anthony Skjellum
11/09/2019 DOI: 10.1145/3343211.3343213
MPI Sessions: Evaluation of an Implementation
in Open MPI
Nathan Hjelm, Howard Pritchard,
Samuel Gutiérrez, Daniel Holmes,
Ralph Castain, Anthony Skjellum
8/10/2019 DOI: 10.1109/CLUSTER.2019.8891002
Performance Evaluation of Advanced Features
in CUDA Unified Memory
Steven W. D. Chien,
Ivy Peng,
Stefano Markidis
21/10/2019 DOI: 10.1109/MCHPC49590.2019.00014
Streaming Message Interface: High-performance
distributed memory programming
on reconfigurable hardware
De Matteis, T., de Fine Licht, J.,
Beránek, J., & Hoefler, T.
17/11/2019 DOI: 10.1145/3295500.3356201
Data Movement Is All You Need: A Case Study on Optimizing Transformers
Andrei Ivanov, Nikoli Dryden, Tal Ben-Nun, Shigang Li, Torsten Hoefler
30/6/2020 DOI: Preprint
Why is MPI (perceived to be) so complex?: Part 1—Does strong progress simplify MPI?
Daniel J. Holmes,
Anthony Skjellum,
Derek Schafer
21/9/2020 DOI: 10.1145/3416315.3416318
Communication and Timing Issues with MPI Virtualization
Alexandr Nigay, Lukas Mosimann, Timo Schneider, Torsten Hoefler
21/9/2020 DOI: 10.1145/3416315.3416317
Neko: A Modern, Portable, and Scalable Framework for High-Fidelity Computational Fluid Dynamics
Niclas Jansson, Martin Karp, Artur Podobas, Stefano Markidis, Philipp Schlatter
21/9/2020 DOI: Preprint
Collectives and Communicators: A Case for Orthogonality: (Or: How to get rid of MPI neighbor and enhance Cartesian collectives)
Jesper Larsson Träff,
Sascha Hunold,
Guillaume Mercier,
Daniel J. Holmes
21/9/2020 DOI: 10.1145/3416315.3416319
Learning representations in Bayesian Confidence Propagation neural networks
Naresh Balaji Ravichandran, Anders Lansner, Pawel Herman
28/9/2020 DOI: 10.1109/IJCNN48605.2020.9207061
sputniPIC: an Implicit Particle-in-Cell Code for Multi-GPU Systems
Steven W. D. Chien, Jonas Nylund, Gabriel Bengtsson, Ivy B. Peng, Artur Podobas, Stefano Markidis
22/10/2020 DOI: 10.1109/SBAC-PAD49847.2020.00030
Spectral Element Simulations on the NEC SX-Aurora TSUBASA
Niclas Jansson
20/1/2021 DOI: 10.1145/3432261.3432265
On the Parallel I/O Optimality of Linear Algebra Kernels: Near-Optimal Matrix Factorizations
G. Kwasniewski, M. Kabić, T. Ben-Nun, A. Nikolaos Ziogas, J. Eirik Saethre, A. Gaillard, T. Schneider, M. Besta, A. Kozhevnikov, J. VandeVondele, T. Hoefler
17/02/2021 DOI: 10.1145/3458817.3476167
FBLAS: Streaming Linear Algebra on FPGA
Tiziano De Matteis, Johannes de Fine Licht, Torsten Hoefler
22/2/2021 DOI: 10.1109/SC41405.2020.00063
The Old and the New: Can Physics-Informed Deep-Learning Replace Traditional Linear Solvers?
Stefano Markidis
12/3/2021 DOI: Preprint
Automatic Particle Trajectory Classification in Plasma Simulations
Stefano Markidis, Ivy Peng, Artur Podobas, Itthinat Jongsuebchoke, Gabriel Bengtsson, Pawel Herman
23/4/2021 DOI: 10.1109/MLHPCAI4S51975.2020.00014
RISC-V in-network accelerator for flexible high-performance low-power packet processing
S. Di Girolamo, A. Kurth, A. Calotoiu, T. Benz, T. Schneider, J. Beránek, L. Benini, T. Hoefler
14/06/2021 DOI: 10.1109/ISCA52012.2021.00079
Benchmarking the Nvidia GPU Lineage: From Early K80 to Modern A100 with Asynchronous Memory Transfers
Martin Svedin,
Steven W. D. Chien,
Gibson Chikafa,
Niclas Jansson,
Artur Podobas
21/6/2021 DOI: 10.1145/3468044.3468053
StreamBrain: An HPC Framework for Brain-like Neural Networks on CPUs, GPUs and FPGAs
Artur Podobas,
Martin Svedin,
Steven W. D. Chien, Ivy B. Peng, Naresh Balaji Ravichandran, Pawel Herman, Anders Lansner,
Stefano Markidis
21/6/2021 DOI: 10.1145/3468044.3468052
RFaaS: RDMA-Enabled FaaS Platform for Serverless High-Performance Computing
Marcin Copik, Konstantin Taranov, Alexandru Calotoiu, Torsten Hoefler
25/6/2021 DOI: Preprint
Semi-supervised learning with Bayesian Confidence Propagation Neural Network
Naresh Balaji Ravichandran, Anders Lansner, Pawel Herman
29/6/2021 DOI: Preprint
Chimera: Efficiently Training Large-Scale Neural Networks with Bidirectional Pipelines
Shigang Li, Torsten Hoefler
14/07/2021
MPI collective communication through a single set of interfaces: A case for orthogonality
Jesper Larsson Träff, Sascha Hunold, Guillaume Mercier, Daniel J. Holmes
13/8/2021 DOI: 10.1016/j.parco.2021.102826
A Deep Learning-Based Particle-in-Cell Method for Plasma Simulations
Xavier Aguilar, Stefano Markidis
13/10/2021 DOI: 10.1109/Cluster48925.2021.00103
Higgs Boson Classification: Brain-inspired BCPNN Learning with StreamBrain
Martin Svedin, Artur Podobas, Steven W. D. Chien, Stefano Markidis
13/10/2021 DOI: 10.1109/Cluster48925.2021.00105
A Data-Centric Optimization Framework for Machine Learning
Oliver Rausch, Tal Ben-Nun, Nikoli Dryden, Andrei Ivanov, Shigang Li, Torsten Hoefler
20/10/2021 DOI: Preprint
Flare: Flexible In-Network Allreduce
In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis
D. De Sensi, S. Di Girolamo, S. Ashkboos, S. Li, T. Hoefler
14/11/2021 DOI: 10.1145/3458817.3476178
Brain-Like Approaches to Unsupervised Learning of Hidden Representations - A Comparative Study
Ravichandran N.B., Lansner A., Herman P.
07/09/2021 DOI: 10.1007/978-3-030-86383-8_13
Mamba: Portable Array-based Abstractions for Heterogeneous High-Performance
Dykes, T., Foyer, C., Richardson, H., Svedin, M., Podobas, A., Jansson, N., Markidis, S., Tate, A., McIntosh-Smith, S.
To be published
Data Movement Is All You Need: A Case Study on Optimizing Transformers
Andrei Ivanov, Nikoli Dryden, Tal Ben-Nun, Shigang Li, Torsten Hoefler
08/11/2021 DOI: Preprint


Title | Authors | Date - Event | Link
Multi-GPU Acceleration of the iPIC3D Implicit Particle-in-Cell Code
Chaitanya Prasad Sishtla,
Steven W. D. Chien,
Vyacheslav Olshevsky,
Erwin Laure, Stefano Markidis
12/06/2019 - ICCS 2019 Multi-GPU Acceleration
of the iPIC3D Implicit Particle-in-Cell Code
User-level schedules
Derek Schafer,
Martin Ruefenacht,
Anthony Skjellum,
Daniel Holmes
11/09/2019 - EuroMPI 2019 User-level schedules
MPI Semantic Terms and Conventions Explained
Claudia Blaas-Schenner,
Daniel Holmes,
Rolf Rabenseifner,
Anthony Skjellum,
Guillaume Mercier,
Julien Jaeger,
Purushotham V. Bangalore
11/09/2019 - EuroMPI 2019 MPI Semantic Terms and Conventions Explained