Doctoral Showcase at SC’24

Doctoral Showcase Presentation

I presented my PhD research at the SC’24 conference in Atlanta, GA, as part of the Doctoral Showcase. The showcase gives students near the end of their PhD an opportunity to summarize their dissertation research in short talks and poster presentations.

Research Overview

My doctoral showcase covered three works on high-performance computing (HPC) resilience analysis using large language models (LLMs):

1. HAPPA: HPC Application Resilience Analysis Platform

  • Innovation: Modular platform integrating LLMs to understand long code sequences
  • Technique: New code representations for accurate resilience prediction
  • Results: Mean squared error (MSE) of 0.078 for Silent Data Corruption (SDC) prediction on the DARE dataset
  • Impact: Significantly outperformed the existing PARIS model
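
For readers unfamiliar with the metric, MSE is the mean of the squared differences between predicted and measured values, so lower is better. The sketch below is only an illustration of the metric: the function name and the sample SDC rates are hypothetical and are not taken from HAPPA or the DARE dataset.

```python
def mean_squared_error(predicted, measured):
    """Mean of squared differences between two equal-length sequences."""
    if len(predicted) != len(measured):
        raise ValueError("sequences must have the same length")
    if not predicted:
        raise ValueError("sequences must be non-empty")
    squared_errors = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical SDC rates (fraction of runs ending in silent data corruption).
predicted_sdc = [0.10, 0.40, 0.25]
measured_sdc = [0.20, 0.30, 0.25]
mse = mean_squared_error(predicted_sdc, measured_sdc)
```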

2. Semantic Analysis of HPC Loop Resilience

  • Focus: Investigation of loop resilience in HPC programs through semantic analysis
  • Methodology: Analysis of the computational patterns known as the 13 dwarfs of parallelism
  • Contribution: Quantified SDC rates for each pattern using LLMs with prompt engineering
  • Outcome: Identified which loops are more error-prone, enhancing resilient HPC application development
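
As background on the quantity being estimated: in fault-injection studies, an SDC rate is commonly defined as the fraction of trials that finish normally but produce a wrong output. The sketch below shows only that definition. It is not the LLM-based method used in this work, and the outcome labels and trial data are hypothetical.

```python
from collections import Counter


def sdc_rate(outcomes):
    """Fraction of fault-injection trials labelled 'sdc'."""
    if not outcomes:
        raise ValueError("need at least one trial")
    return Counter(outcomes)["sdc"] / len(outcomes)


# Hypothetical outcomes of eight fault-injection trials on one loop.
trials = ["benign", "sdc", "crash", "benign", "sdc", "benign", "benign", "benign"]
rate = sdc_rate(trials)
```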

3. LLM Capabilities in IR Code Analysis

  • Scope: Evaluation of LLMs in comprehending Intermediate Representation (IR) code syntax and semantics
  • Models Tested: GPT-4o, GPT-3.5, and CodeLlama
  • Tasks: Decompiling IR code, generating control flow graphs (CFGs), simulating IR code execution
  • Findings: Insights into LLM effectiveness for low-level code analysis and program analysis applications
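
To make the CFG generation task concrete, the sketch below builds a control flow graph from a small IR fragment. It assumes an LLVM-style textual IR and handles only block labels and `br` terminators. The `build_cfg` helper and the IR snippet are hypothetical illustrations, not part of the evaluated benchmark.

```python
import re


def build_cfg(ir_text):
    """Map each basic-block label to the labels its `br` terminator targets."""
    cfg = {}
    current = None
    for raw in ir_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        label = re.fullmatch(r"([\w.]+):", line)
        if label:
            # A line such as "entry:" starts a new basic block.
            current = label.group(1)
            cfg[current] = []
        elif current is not None and line.startswith("br "):
            # Both conditional and unconditional branches list "label %name".
            cfg[current] = re.findall(r"label %([\w.]+)", line)
    return cfg


# Hypothetical IR fragment with one conditional branch.
ir = """
entry:
  %cmp = icmp slt i32 %n, 10
  br i1 %cmp, label %then, label %exit
then:
  %inc = add i32 %n, 1
  br label %exit
exit:
  ret i32 0
"""
cfg = build_cfg(ir)
```

A block that ends in `ret` has no successors, so it maps to an empty list.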

Conference Details

  • Event: SC’24 - The International Conference for High Performance Computing, Networking, Storage, and Analysis
  • Date: November 17-22, 2024
  • Location: Atlanta, GA, United States
  • Format: Doctoral Showcase with poster presentation
  • Proceedings: SC’24 Doctoral Showcase

Research Impact

Together, these studies show how LLMs can help improve the resilience of HPC applications through code analysis and predictive modeling. The work sits at the intersection of artificial intelligence and high-performance computing and addresses challenges in system reliability and fault tolerance.

Publication

The full abstract and presentation materials are available in the SC’24 proceedings.