HC31 (2019)

Held at Memorial Auditorium, Stanford University, Palo Alto, California, Sunday–Tuesday, August 18–20, 2019.

Full Proceedings Zipfile

At A Glance

Sunday, 8/18/2019: Tutorials

  • 8:00 AM – 9:15 AM: Registration / Breakfast
  • 9:15 AM – 12:45 PM: Morning Tutorials: Acceleration in the Cloud
  • 9:15 AM – 10:15 AM: The Nitro Project – Next Generation AWS Infrastructure
  • 10:15 AM – 11:15 AM: Acceleration at Microsoft
  • 11:15 AM – 11:45 AM: Break
  • 11:45 AM – 12:45 PM: TPU V3 in Google Cloud: Architecture and Infrastructure
  • 12:45 PM – 2:00 PM: Lunch
  • 2:00 PM – 5:00 PM: Afternoon Tutorial: RISC-V
  • 5:00 PM – 6:00 PM: Reception

Monday, 8/19/2019: Conference Day 1

  • 7:30 AM – 8:45 AM: Registration / Breakfast (sponsored by Intel)
  • 8:45 AM – 9:00 AM: Opening Remarks
  • 9:00 AM – 10:30 AM: General Purpose Compute
  • 10:30 AM – 11:00 AM: Break (sponsored by Intel)
  • 11:00 AM – 12:30 PM: Memory
  • 12:30 PM – 1:45 PM: Lunch (sponsored by Intel)
  • 1:45 PM – 2:45 PM: Keynote 1: “Delivering the Future of High-Performance Computing with System, Software and Silicon Co-Optimization” by Dr. Lisa Su, CEO, AMD
  • 2:45 PM – 4:15 PM: Methodology and ML Systems
  • 4:15 PM – 4:45 PM: Break (sponsored by Intel)
  • 4:45 PM – 6:45 PM: ML Training
  • 6:45 PM – 7:45 PM: Reception (Wine & Snacks sponsored by Intel)

Tuesday, 8/20/2019: Conference Day 2

  • 8:00 AM – 9:00 AM: Registration / Breakfast (sponsored by Intel)
  • 9:00 AM – 10:30 AM: Embedded and Auto
  • 10:30 AM – 11:00 AM: Break (sponsored by Intel)
  • 11:00 AM – 12:30 PM: ML Inference
  • 12:30 PM – 1:45 PM: Lunch (sponsored by Intel)
  • 1:45 PM – 2:45 PM: Keynote 2: “What Will the Next Node Offer Us?” by Dr. Philip Wong, VP Corporate Research, TSMC
  • 2:45 PM – 3:45 PM: Interconnects
  • 3:45 PM – 4:15 PM: Break (sponsored by Intel)
  • 4:15 PM – 5:15 PM: Packaging and Security
  • 5:15 PM – 6:45 PM: Graphics and AR
  • 6:45 PM – 7:00 PM: Closing Remarks

Tutorials: Sunday, 8/18/2019

  • 8:00 AM – 9:15 AM: Breakfast / Registration
  • 9:15 AM – 12:45 PM: Morning Tutorials: Acceleration in the Cloud
  • 9:15 AM – 10:15 AM: The Nitro Project – Next Generation AWS Infrastructure
    • Speaker: Anthony Liguori, Sr. Principal Engineer, AWS
    • Abstract: This tutorial will focus on the evolution of hypervisors and their interactions with servers and I/O chips. Specifically, it will cover the architecture of the modern Nitro hypervisor and the Nitro chips for the AWS infrastructure, including the Nitro networking, storage, and HPC chips. It will also discuss how developers can use this infrastructure.
  • 10:15 AM – 11:15 AM: Acceleration at Microsoft
    • Speakers: Derek Chiou, Eric Chung, and Susan Carrie
    • Abstract: This talk describes Microsoft’s balanced acceleration efforts, which span both FPGAs and ASICs. FPGAs provide programmability with hardware performance, while ASICs provide density, power, and cost advantages for fixed functions. Two examples will be described: the Project Brainwave effort targeting DNN inference on FPGAs, and Corsica, the Project Zipline compression ASIC. Microsoft is moving towards open accelerator ecosystems, as demonstrated by the release of the Project Zipline compression standard with RTL code.
  • 11:15 AM – 11:45 AM: Break
  • 11:45 AM – 12:45 PM: TPU V3 in Google Cloud: Architecture and Infrastructure
    • Speakers: Clifford Chao & Brennan Saeta
    • Abstract: This tutorial will cover the TPU v3 chip architecture and the large-scale TPU-based systems available on Google Cloud. It will also cover the TPU software design choices that allow scaling from a single chip to a large-scale system with no customer code changes (a brief illustrative sketch follows this schedule).
  • 12:45 PM – 2:00 PM: Lunch
  • 2:00 PM – 5:00 PM: Afternoon Tutorial: RISC-V
  • 2:00 PM – 3:00 PM: Part I: Overview of the RISC-V ISA
    • Speaker: Krste Asanovic, UC Berkeley
  • 3:00 PM – 4:00 PM: Part II: Overview of the RISC-V SW Ecosystem
    • Speaker: Bunnaroath Sou, SiFive
  • 4:00 PM – 5:00 PM: Part III: Overview of Open-Source Cores:
    • Rocket/BOOM
      • Speakers: Howard Mao & Jerry Zhao, UC Berkeley
    • PULP
      • Speaker: Fabian Schuiki, ETH Zurich
  • 5:00 PM – 6:00 PM: Reception
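
A minimal sketch of the “no customer code changes” scaling model mentioned in the TPU tutorial abstract, using TensorFlow’s public TPUStrategy API. The TPU name and the model below are illustrative placeholders, not material from the tutorial; the point is that repointing the cluster resolver from a single Cloud TPU to a larger pod slice is the only change, while the model code stays the same.

    import tensorflow as tf

    # Resolve and initialize the TPU system. 'my-tpu' is a placeholder name;
    # it could refer to a single TPU v3 board or a larger pod slice.
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='my-tpu')
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)

    # The same strategy object distributes work across however many TPU cores
    # the resolver found; the model-building code below does not change.
    strategy = tf.distribute.experimental.TPUStrategy(resolver)

    with strategy.scope():
        model = tf.keras.Sequential([
            tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
            tf.keras.layers.Dense(10),
        ])
        model.compile(
            optimizer='adam',
            loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
            metrics=['accuracy'],
        )

    # model.fit(train_dataset, ...) would then run on all available TPU cores.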

Keynotes

Delivering the Future of High-Performance Computing with System, Software and Silicon Co-Optimization

  • Dr. Lisa Su, CEO, AMD
  • Mon 8/19/2019, 1:45 PM – 2:45 PM

Abstract: From medicine to the frontiers of scientific research, manufacturing, and entertainment, the demand for computing and graphics technologies continues to grow. While we are entering a golden age of high-performance computing, it is increasingly clear that the techniques the industry has used to reach this point will not deliver similar advances over the coming years. As the gains from Moore’s Law have slowed in recent years, the industry has begun to focus on new areas of innovation to maintain the historical pace of performance improvements. AMD CEO Lisa Su will discuss new techniques in system architecture, silicon design and software that will enable future generations of computing and graphics products to deliver more performance with greater efficiency.

What Will the Next Node Offer Us?

  • Dr. Philip Wong, VP Corporate Research, TSMC
  • Tue 8/20/2019, 1:45 PM – 2:45 PM

Abstract: The power-performance-area (and cost) advances of the last five decades have mostly been achieved through dimensional scaling of the transistor. What will the semiconductor industry do after dimensional scaling of the silicon transistor crosses the nanometer threshold, progressing from 16/12 nm through 10 nm, 7 nm, 5 nm, 3 nm, 2 nm, and 1.4 nm to sizes below a nanometer? Will these advanced logic technologies continue to provide the energy efficiency required of future computing systems? Will new applications and computation workloads demand new device technologies and their integration into future systems? These are some of the most pressing questions facing the semiconductor industry today.

The path for IC technology development going forward is no longer a straight line. The need for out-of-the-box solutions ushers in a golden age of innovation. I will give an overview of the memory and logic device innovations that are in the research pipeline today. Future electronic systems require co-innovation of the computing architecture and device technology. I will speculate on how they will be integrated into future electronic systems.

Conference Day 1: Monday, 8/19/2019

  • 7:30 AM: Breakfast / Registration (sponsored by Intel)
  • 8:45 AM: Opening Remarks
  • 9:00 AM: General Purpose Compute (Session Chair: Pradeep Dubey)
    • 9:00 AM: Zen2 – Dan Bouvier & David Suggs, AMD
    • 9:30 AM: A Next-Gen Cloud-to-Edge Infrastructure SoC Using the ARM Neoverse N1 CPU and System Products – Andrea Pellegrini, ARM
    • 10:00 AM: IBM’s Next Generation POWER Processor – Scott Willenborg & Jeff Stuecheli, IBM
  • 10:30 AM: Break (sponsored by Intel)
  • 11:00 AM: Memory (Session Chair: Fred Weber)
    • 11:00 AM: True Processing In Memory with DRAM Accelerator – Fabrice Devaux, Upmem
    • 11:30 AM: A Programmable Embedded Microprocessor for Bit-scalable In-memory Computing – Hongyang Jia, Princeton
    • 12:00 PM: Intel Optane – Lily P. Looi & Jianping Xu, Intel
  • 12:30 PM: Lunch (sponsored by Intel)
  • 1:45 PM: Keynote 1: Delivering the Future of High-Performance Computing with System, Software and Silicon Co-Optimization – Dr. Lisa Su, AMD
  • 2:45 PM: Methodology and ML Systems (Session Chair: Yoshio Masubuchi)
    • 2:45 PM: Creating An Agile Hardware Flow – Keyi Zhang, Stanford
    • 3:15 PM: MLPerf: A Benchmark Suite for Machine Learning from an Academic-Industry Cooperative – Peter Mattson, MLPerf
    • 3:45 PM: Zion: Facebook Next-Generation Large-memory Unified Training Platform – Misha Smelyanskiy, Facebook
  • 4:15 PM: Break (sponsored by Intel)
  • 4:45 PM: ML Training (Session Chair: Cliff Young)
    • 4:45 PM: A Scalable Unified Architecture for Neural Network Computing from Nano-Level to High-Performance Computing – Liao Heng, Huawei
    • 5:15 PM: Deep Learning Training at Scale – Spring Crest Deep Learning Accelerator – Andrew Yang, Nitin Garegrat, Connie Miao & Karthik Vaidyanathan, Intel
    • 5:45 PM: Wafer Scale Deep Learning – Sean Lie, Cerebras
    • 6:15 PM: Habana Labs Approach to Scaling AI Training – Eitan Medina, Habana
  • 6:45 PM: Reception (Wine & Snacks, sponsored by Intel)
  • 7:45 PM: End of Reception

Conference Day 2: Tuesday, 8/20/2019

  • 8:00 AM: Breakfast / Registration (sponsored by Intel)
  • 9:00 AM: Embedded and Auto (Session Chair: Ralph Wittig)
    • 9:00 AM: CYW89459: High Performance and Low Power Wi-Fi and Bluetooth 5.1 Combo Chip for IoT and Automotive – Kamesh Medepalli, Cypress
    • 9:30 AM: Ouroboros: A WaveNet Inference Engine for TTS Applications on Embedded Devices – Jiansong Zhang, Alibaba
    • 10:00 AM: Compute and Redundancy Solution for Tesla’s Full Self-Driving Computer – Debjit Das Sarma & Ganesh Venkataramanan, Tesla
  • 10:30 AM: Break (sponsored by Intel)
  • 11:00 AM: ML Inference (Session Chair: Yuan Xie)
    • 11:00 AM: A 0.11 pJ/Op, 0.32-128 TOPS, Scalable Multi-Chip-Module-based Deep Neural Network Accelerator Designed with a High-Productivity VLSI Methodology – Rangharajan Venkatesan, Nvidia
    • 11:30 AM: Xilinx Versal/AI Engine – Sagheer Ahmad & Sridhar Subramanian, Xilinx
    • 12:00 PM: Spring Hill – Intel’s Data Center Inference Chip – Ofri Wechsler (speaker), Michael Behar & Bharat Daga (authors), Intel
  • 12:30 PM: Lunch (sponsored by Intel)
  • 1:45 PM: Keynote 2: What Will the Next Node Offer Us? – Dr. Philip Wong, TSMC
  • 2:45 PM: Interconnects (Session Chair: Jan-Willem Van de Waerdt)
    • 2:45 PM: A Gen-Z Chipset for Exascale Fabrics – Patrick Knebel, HPE
    • 3:15 PM: TeraPHY: A Chiplet Technology for Low-Power, High-Bandwidth Optical I/O – Mark Wade, Ayar Labs
  • 3:45 PM: Break (sponsored by Intel)
  • 4:15 PM: Packaging and Security (Session Chair: John Sell)
    • 4:15 PM: Lakefield: Hybrid Cores in a Three Dimensional Package – Sanjeev Khushu & Wilfred Gomes, Intel
    • 4:45 PM: Jintide®: A Hardware Security Enhanced Server CPU with Xeon® Cores under Runtime Surveillance by an In-Package Dynamically Reconfigurable Computing Processor – Zhu Jianfeng, Tsinghua
  • 5:15 PM: Graphics & AR (Session Chair: Priyanka Raina)
    • 5:15 PM: RTX ON: The NVIDIA Turing GPU Architecture – John Burgess, Nvidia
    • 5:45 PM: 7nm “Navi” GPU – Michael Mantor, AMD
    • 6:15 PM: The Silicon at the Heart of HoloLens 2.0 – Elene Terry, Microsoft
  • 6:45 PM: Closing Remarks

Posters

  • Thinker-IM: An Energy Efficient Mixed Signal RNN Engine with Computing-in-Memory and Predictive Execution – Ruiqi Guo, Yonggang Liu, Shixuan Zheng, Ssu-Yen Wu, Peng Ouyang, Win-San Khwa, Xi Chen, Jia-Jing Chen, Xiudong Li, Leibo Liu, Meng-Fan Chang, Shaojun Wei and Shouyi Yin
  • PRU: Probabilistic Reasoning processing Unit for resource-efficient AI – Nimish Shah, Laura I. Galindez Olascoaga, Wannes Meert and Marian Verhelst
  • NTX: A 260 Gflop/sW Streaming Accelerator for Oblivious Floating-Point Algorithms in 22nm FD-SOI – Fabian Schuiki, Michael Schaffner and Luca Benini
  • BIHiwe: Mixed-Signal Charge-Domain Acceleration of Deep Neural Networks – Soroush Ghodrati, Hardik Sharma, Sean Kinzer, Amir Yazdanbakhsh, Kambiz Samadi, Nam Sung Kim, Doug Burger and Hadi Esmaeilzadeh
  • Hyper-Acceleration With Samsung SmartSSD – Balavinayagam Samynathan, Keith Chapman, Mehdi Nik, Behnam Robatmili, Shahrzad Mirkhani, Brian Hirano and Maysam Lavasani
  • 2GRVI-HBM-Phalanx: Towards a Massively Parallel RISC-V FPGA Accelerator Kit – Jan Gray
  • AIX v2: Flexible High Performance AI Inference Accelerator for Datacenters – Seok Joong Hwang, Jeongho Han, Minwook Ahn, Seungrok Jung, Wonsub Kim, Yongshik Moon, Sangjun Yang, Moo-Kyoung Chung, Jaehyeok Jang, Youngjae Jin, Yongsang Park, Namseob Lee, Daewoo Kim, Euiseok Kim, Choong Hwan Choi and Heeyul Lee
  • MLModelScope: Evaluate and Measure ML Models within AI Pipelines – Abdul Dakkak, Cheng Li, Jinjun Xiong and Wen-Mei Hwu
  • LNPU: An Energy-Efficient Deep-Neural-Network Training Processor with Fine-Grained Mixed Precision – Jinsu Lee, Juhyoung Lee, Donghyeon Han, Jinmook Lee, Gwangtae Park and Hoi-Jun Yoo
  • An Area Optimization and Power Efficient Method for HMAC-PHOTON Lightweight Cryptography – Duc Nhan Le, Seungbum Baek, Kang-Un Choi and Jong-Phil Hong
  • A Reconfigurable Challenge-Response Generating SRAM PUF in 65nm CMOS – Seungbum Baek, Ju-Hyeok Ahn, Kang-Un Choi and Jong-Phil Hong