# 🔬 Research Projects

Here are the key research projects I lead and contribute to in the areas of High-Performance Computing, Large Language Models, and Healthcare AI.

## 🚀 Active Projects

HAPPA Platform - HPC Application Resilience Analysis

Active Development
Published: SRDS'24
A modular platform for HPC application resilience analysis that embeds Large Language Models to understand long code sequences and predicts resilience more accurately than the PARIS model.

Key Features

  • LLM Integration: Embedded Large Language Models for code understanding
  • Modular Architecture: Flexible platform design for different analysis needs
  • Prediction Accuracy: MSE of 0.078 vs. 0.1172 for the PARIS model (about 33% lower)
  • Code Chunking: Advanced code segmentation for LLM processing
  • DARE Dataset: Comprehensive fault injection dataset for resilience analysis
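
The MSE comparison in the list above can be checked with a few lines of arithmetic. The sketch below only uses the two figures quoted there; the helper names `mse` and `relative_reduction` are illustrative and are not part of HAPPA.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between two equally long sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(baseline: float, new: float) -> float:
    """Fraction by which `new` is lower than `baseline`."""
    return (baseline - new) / baseline


# Figures quoted in the feature list above.
paris_mse = 0.1172
happa_mse = 0.078

reduction = relative_reduction(paris_mse, happa_mse)
print(f"MSE reduction relative to PARIS: {reduction:.1%}")  # 33.4%
```

Because MSE is an error measure, lower is better, so the improvement is reported as a relative reduction against the baseline.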
Technologies: Python, PyTorch, LLM/BERT, KeyBERT, HPC frameworks
Dataset: DARE
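
To give a feel for what code chunking for LLM input can look like, here is a minimal sketch that splits source code into overlapping line windows. It is an assumption made for illustration, not the segmentation method used in HAPPA: the function name `chunk_code`, the line-based window, and the default sizes are all hypothetical.

```python
from typing import List


def chunk_code(source: str, max_lines: int = 40, overlap: int = 5) -> List[str]:
    """Split source text into windows of at most `max_lines` lines.

    Consecutive windows share `overlap` lines so that context spanning a
    boundary is visible in both chunks.
    """
    if max_lines <= 0:
        raise ValueError("max_lines must be positive")
    if not 0 <= overlap < max_lines:
        raise ValueError("overlap must satisfy 0 <= overlap < max_lines")

    lines = source.splitlines()
    step = max_lines - overlap
    chunks: List[str] = []
    for start in range(0, len(lines), step):
        chunks.append("\n".join(lines[start:start + max_lines]))
        if start + max_lines >= len(lines):
            break  # the last window already reaches the end of the file
    return chunks


# Ten short "lines of code", chunked into windows of 4 with 1 line of overlap.
demo_source = "\n".join(f"l{i}" for i in range(10))
demo_chunks = chunk_code(demo_source, max_lines=4, overlap=1)
```

A production chunker would more likely cut at function or loop boundaries and budget by model tokens rather than lines; the fixed window keeps the sketch short.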

🎯 LLM IR Understanding Research - ICML 2025 ACCEPTED!

🎉 ACCEPTED: ICML 2025
Major Achievement
An empirical study of how well Large Language Models understand Intermediate Representations (IRs), the program form used in compiler design and program analysis. The work has been accepted to ICML 2025, one of the top-tier conferences in machine learning, and is my most significant recent result.

Research Tasks

  • Control Flow Graph (CFG) reconstruction
  • IR decompilation
  • Code summarization
  • Execution reasoning
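
To make the CFG reconstruction task concrete, the sketch below derives a control-flow graph from a small LLVM-IR-style function by reading block labels and `br` terminators. It is a simplified reference under stated assumptions (every block has an explicit label, and only `br` and `ret` terminators appear); `build_cfg` and `SAMPLE_IR` are illustrative names and are not taken from the paper.

```python
import re
from typing import Dict, List

LABEL_DEF = re.compile(r"^([\w.]+):")       # e.g. "entry:"
LABEL_USE = re.compile(r"label %([\w.]+)")  # e.g. "label %done"

SAMPLE_IR = """
define i32 @abs(i32 %x) {
entry:
  %neg = icmp slt i32 %x, 0
  br i1 %neg, label %flip, label %done
flip:
  %y = sub i32 0, %x
  br label %done
done:
  %r = phi i32 [ %x, %entry ], [ %y, %flip ]
  ret i32 %r
}
"""


def build_cfg(ir_text: str) -> Dict[str, List[str]]:
    """Map each labelled basic block to the labels of its successors."""
    cfg: Dict[str, List[str]] = {}
    current = None
    for raw in ir_text.splitlines():
        line = raw.strip()
        match = LABEL_DEF.match(line)
        if match:
            current = match.group(1)
            cfg[current] = []  # blocks ending in `ret` keep no successors
            continue
        if current is not None and line.startswith("br "):
            cfg[current] = LABEL_USE.findall(line)
    return cfg


cfg = build_cfg(SAMPLE_IR)
# cfg == {"entry": ["flip", "done"], "flip": ["done"], "done": []}
```

A graph like this is the kind of ground truth against which a model's reconstructed CFG could be compared edge by edge.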

🎯 Why This Matters

  • ICML Acceptance: Top-tier machine learning conference
  • Pioneering Work: First comprehensive study of LLMs on IRs
  • Compiler Relevance: Shows where LLMs can and cannot yet support IR-level code analysis
  • Academic Impact: Significant contribution to AI + Compiler research
Models Evaluated: GPT-4, GPT-3, Gemma 2, LLaMA 3.1, Code Llama
Conference: ICML 2025 - International Conference on Machine Learning

HPC Loop Resilience Analysis

Published
Venue: HPEC'24
A semantic approach using Large Language Models to investigate and analyze the resilience of loops in High-Performance Computing programs.

Key Contributions

  • Semantic Analysis: LLM-based understanding of loop structures
  • Resilience Quantification: Silent data corruption (SDC) rates for the 13 dwarfs of parallelism
  • Performance Evaluation: Comprehensive analysis of HPC loop resilience
Paper: IEEE Xplore
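
An SDC rate is commonly computed as the share of fault injection runs that finish normally but produce a wrong result. The sketch below shows that calculation per loop. The record format, the loop names, and the outcome labels are assumptions made for illustration, and the sample data is invented, not taken from the paper.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def sdc_rates(records: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Return SDC rate per loop from (loop_id, outcome) records.

    `outcome` is expected to be one of "benign", "sdc", or "crash".
    """
    totals: Counter = Counter()
    sdc: Counter = Counter()
    for loop_id, outcome in records:
        totals[loop_id] += 1
        if outcome == "sdc":
            sdc[loop_id] += 1
    return {loop: sdc[loop] / totals[loop] for loop in totals}


# Invented sample outcomes for two hypothetical loops.
sample_records = [
    ("loop_a", "sdc"), ("loop_a", "benign"), ("loop_a", "crash"), ("loop_a", "sdc"),
    ("loop_b", "benign"), ("loop_b", "benign"),
]
rates = sdc_rates(sample_records)
```

Crashes count toward the denominator here because they are detected failures, which is what distinguishes them from silent corruption.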
## 🛠️ Infrastructure Projects

VISILIENCE - Visualization Framework

Published
Venue: PRDC'23
An interactive visualization framework that uses control-flow graphs to analyze and display a program's resilience characteristics.

Features

  • Interactive CFG Visualization: Dynamic control-flow graph display
  • Error Propagation Analysis: Visual representation of fault propagation
  • Resilience Metrics: Comprehensive resilience analysis tools
  • User-Friendly Interface: Intuitive visualization controls
Technologies: Python, NetworkX, Matplotlib, Interactive visualization libraries
Paper: IEEE Xplore
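
One simple way to reason about fault propagation on a control-flow graph is forward reachability: any block reachable from the faulty block may observe the corrupted state. The sketch below is an illustrative over-approximation, not the analysis implemented in VISILIENCE. It uses a plain dictionary instead of NetworkX so that it runs without extra dependencies, and the block names are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set


def reachable_from(cfg: Dict[str, List[str]], start: str) -> Set[str]:
    """Blocks reachable from `start` (inclusive) via breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for succ in cfg.get(node, []):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return seen


# Hypothetical CFG of a single loop: entry -> loop -> (body -> loop | exit).
loop_cfg = {
    "entry": ["loop"],
    "loop": ["body", "exit"],
    "body": ["loop"],
    "exit": [],
}
affected = reachable_from(loop_cfg, "body")
```

Reachability ignores data flow, so it marks more blocks than a fault can actually corrupt; a visualization could use it as the outer bound and refine it with def-use information.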

Chaser - Fault Injection Tool

Published
Venue: DSN'20
An enhanced fault injection tool for tracing and analyzing soft errors in MPI applications.

Capabilities

  • MPI Application Support: Specialized for Message Passing Interface programs
  • Soft Error Injection: Comprehensive fault simulation
  • Error Tracing: Detailed error propagation analysis
  • Low Overhead: Fault injection with minimal runtime overhead
Technologies: C++, LLVM, MPI, Pin framework
Paper: IEEE Xplore
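
Soft errors are usually modeled as single bit flips in a stored value. The sketch below flips one bit of an IEEE 754 double to show how much the effect depends on the bit position. It illustrates the fault model only and is not Chaser's implementation; `flip_bit` is a hypothetical helper.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return `value` with one bit of its 64-bit IEEE 754 encoding flipped.

    Bit 0 is the least significant mantissa bit and bit 63 is the sign bit.
    """
    if not 0 <= bit < 64:
        raise ValueError("bit must be in the range 0..63")
    (as_int,) = struct.unpack("<Q", struct.pack("<d", value))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", as_int ^ (1 << bit)))
    return flipped


low_mantissa = flip_bit(1.0, 0)   # tiny perturbation: 1.0 + 2**-52
low_exponent = flip_bit(1.0, 52)  # halves the value: 0.5
sign = flip_bit(1.0, 63)          # flips the sign: -1.0
```

The same single-bit fault ranges from numerically negligible to a sign change, which is why injection campaigns sample many bit positions rather than one.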
## 🏥 Emerging Research Areas

Healthcare AI Applications

Research Exploration
Exploring the application of Large Language Models to medical data analysis, focusing on building interpretable and reliable AI for healthcare.

Research Directions

  • Medical Data Analysis: Pattern recognition and insights extraction
  • Interpretable AI: Building transparent and trustworthy healthcare AI systems
  • Reliable Healthcare AI: Ensuring robustness and safety in medical applications
  • Interdisciplinary Collaboration: Bridging computer science and medical research
Collaboration Opportunities: Open for medical researchers and healthcare professionals
## 🔧 Technical Infrastructure

LLVM-based Soft Error Simulation Platform

Active Development
A comprehensive platform built on LLVM for simulating and analyzing soft errors in high-performance computing applications.

Features

  • LLVM Integration: Leverages LLVM's powerful compilation infrastructure
  • Fault Injection: Comprehensive soft error simulation capabilities
  • Performance Analysis: Detailed performance impact assessment
  • Extensible Architecture: Modular design for different analysis needs
Technologies: C++, LLVM, Clang, CUDA/OpenMP
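
The workflow of a fault injection campaign can be sketched at source level: run a kernel once without faults to get a golden result, rerun it with one injected bit flip per run, and classify each run as benign, SDC, or crash. The example below is a Python toy written for illustration under stated assumptions; the platform itself is described as built on C++ and LLVM, and the kernel, the `Fault` tuple, and the function names here are hypothetical.

```python
from collections import Counter
from itertools import product
from typing import List, Optional, Tuple

# (loop iteration, "value" or "index", bit position)
Fault = Tuple[int, str, int]


def kernel(data: List[int], fault: Optional[Fault] = None) -> int:
    """Sum of data[i] >> 4, with an optional single bit flip injected."""
    total = 0
    for i in range(len(data)):
        idx = i
        hit = fault is not None and fault[0] == i
        if hit and fault[1] == "index":
            idx ^= 1 << fault[2]  # corrupt the array index
        x = data[idx]             # raises IndexError if idx is out of bounds
        if hit and fault[1] == "value":
            x ^= 1 << fault[2]    # corrupt the loaded value
        total += x >> 4           # the low 4 bits are discarded (masking)
    return total


def classify(data: List[int], fault: Fault, golden: int) -> str:
    """Label one faulty run by comparing it with the fault-free result."""
    try:
        result = kernel(data, fault)
    except IndexError:
        return "crash"
    return "benign" if result == golden else "sdc"


def run_campaign(data: List[int], bits: int = 8) -> Counter:
    """Exhaustively inject every (iteration, target, bit) fault once."""
    golden = kernel(data)
    faults = product(range(len(data)), ("value", "index"), range(bits))
    return Counter(classify(data, f, golden) for f in faults)


outcomes = run_campaign([16, 32, 48, 64])
sdc_rate = outcomes["sdc"] / sum(outcomes.values())
```

Exhaustive injection keeps the toy deterministic; real campaigns usually sample the fault space at random because it is far too large to enumerate.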

BatchLens - Cloud Computing Visualization

Published
Venue: DATE'22
A visualization approach for analyzing and understanding batch jobs in cloud computing environments, providing insights into job patterns and performance.

Applications

  • Cloud Job Analysis: Understanding batch job patterns
  • Performance Optimization: Identifying performance bottlenecks
  • Resource Utilization: Optimizing cloud resource allocation
Paper: IEEE Xplore
## 🏆 Recent Major Achievement

🎉 ICML 2025 Acceptance - Major Milestone!

My paper "Can Large Language Models Understand IRs?" has been accepted to ICML 2025. It is a major milestone in my research and reflects the impact of my work at the intersection of Large Language Models and compiler technologies.

Why This Achievement Matters

  • Top-Tier Conference: ICML is one of the most prestigious ML conferences
  • Research Innovation: First comprehensive study of LLMs on IRs
  • Career Impact: Establishes expertise in AI + Compiler research
  • Future Opportunities: Opens doors for collaborations and funding
Conference: ICML 2025 - International Conference on Machine Learning
Research Area: LLM + Compiler Technologies + Program Analysis
## 🤝 Collaboration Opportunities

🤝 Seeking Collaborators & Students

I'm actively seeking collaborators and students for these projects. The areas and partnerships I'm most interested in are listed below, along with what I can offer:

🎓 Graduate Students (M.S./Ph.D.)

  • HPC Systems & Resilience Analysis
  • LLM for Code Intelligence
  • Healthcare AI

🔬 Research Collaborators

  • Academic Partners
  • Industry Partners
  • Interdisciplinary Research

💡 What I Offer

  • State-of-the-art HPC infrastructure
  • LLM resources and expertise
  • Conference travel support
  • Research funding opportunities