# 🔬 Research Projects

Here are the key research projects I lead and contribute to in High-Performance Computing, Large Language Models, and Healthcare AI.

Sections: 🚀 Active Projects · 🛠️ Infrastructure Projects · 🏥 Emerging Research Areas · 🔧 Technical Infrastructure · 🏆 Recent Major Achievement · 🤝 Collaboration Opportunities
HAPPA Platform - HPC Application Resilience Analysis
Active Development
Published: SRDS'24
A modular platform for HPC application resilience analysis that embeds Large Language Models to understand long code sequences and achieves a lower prediction error than the PARIS model.
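One common way to feed long code sequences to an LLM is to split the source into overlapping windows. The sketch below is only an illustration of that general idea, not HAPPA's actual chunking method; the function name `chunk_lines` and its parameters `max_lines` and `overlap` are hypothetical.

```python
def chunk_lines(source: str, max_lines: int = 40, overlap: int = 5) -> list[str]:
    """Split source code into overlapping line windows (hypothetical helper).

    Consecutive chunks share `overlap` lines so that context spanning a
    chunk boundary is visible in both neighbours.
    """
    if max_lines <= 0 or not 0 <= overlap < max_lines:
        raise ValueError("need max_lines > 0 and 0 <= overlap < max_lines")

    lines = source.splitlines()
    step = max_lines - overlap
    chunks = []
    for start in range(0, len(lines), step):
        chunks.append("\n".join(lines[start:start + max_lines]))
        if start + max_lines >= len(lines):
            break  # this window already reaches the end of the file
    return chunks


if __name__ == "__main__":
    demo = "\n".join(f"line_{i}" for i in range(10))
    for i, chunk in enumerate(chunk_lines(demo, max_lines=4, overlap=1)):
        print(f"--- chunk {i} ---")
        print(chunk)
```

A line-based window is the simplest choice; a real system would more likely cut at function or basic-block boundaries so that each chunk stays syntactically meaningful.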
Key Features
- LLM Integration: Embedded Large Language Models for code understanding
- Modular Architecture: Flexible platform design for different analysis needs
- Superior Performance: MSE of 0.078 vs. the PARIS model's 0.1172 (over 30% lower error)
- Code Chunking: Advanced code segmentation for LLM processing
- DARE Dataset: Comprehensive fault injection dataset for resilience analysis
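For readers who want to check the accuracy comparison, the sketch below shows how a mean squared error is computed and how the two MSE values quoted in the feature list translate into a relative error reduction. Only the numbers 0.078 and 0.1172 come from this page; the helper names `mse` and `relative_reduction` are illustrative.

```python
def mse(predicted: list[float], actual: list[float]) -> float:
    """Mean squared error between two equally long sequences."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


def relative_reduction(baseline: float, new: float) -> float:
    """Fraction by which `new` is lower than `baseline`."""
    return (baseline - new) / baseline


if __name__ == "__main__":
    happa_mse = 0.078   # value quoted on this page
    paris_mse = 0.1172  # value quoted on this page
    print(f"{relative_reduction(paris_mse, happa_mse):.1%}")  # prints 33.4%
```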
🎯 LLM IR Understanding Research - ICML 2025 ACCEPTED!
🎉 ACCEPTED: ICML 2025
Major Achievement
This is my most significant recent achievement: a pioneering empirical study of how well Large Language Models understand Intermediate Representations (IRs) for compiler design and program analysis. The work has been accepted to ICML 2025, one of the top-tier conferences in machine learning.
Research Tasks
- Control Flow Graph (CFG) reconstruction
- IR decompilation
- Code summarization
- Execution reasoning
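To make the first task concrete, here is a minimal sketch of what CFG reconstruction means: recovering the successor edges between basic blocks from their terminator instructions. The IR below is a made-up toy format (`jmp`, `br`, `ret`), not LLVM IR and not the benchmark used in the paper; `build_cfg` is a hypothetical helper.

```python
def build_cfg(blocks: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each basic-block label to the labels of its successors.

    Toy terminators (assumed format, last instruction of every block):
      "jmp <target>"            unconditional jump
      "br <cond> <then> <else>" conditional branch
      "ret <value>"             function return, no successors
    """
    cfg = {}
    for label, instructions in blocks.items():
        terminator = instructions[-1].split()
        opcode = terminator[0]
        if opcode == "jmp":
            cfg[label] = [terminator[1]]
        elif opcode == "br":
            cfg[label] = [terminator[2], terminator[3]]
        elif opcode == "ret":
            cfg[label] = []
        else:
            raise ValueError(f"block {label!r} has no known terminator")
    return cfg


if __name__ == "__main__":
    # A simple counting loop written in the toy IR.
    loop_ir = {
        "entry": ["i = 0", "jmp loop"],
        "loop": ["c = i < n", "br c body exit"],
        "body": ["i = i + 1", "jmp loop"],
        "exit": ["ret i"],
    }
    for block, successors in build_cfg(loop_ir).items():
        print(block, "->", successors)
```

In the study, the question is whether an LLM can produce this kind of graph directly from IR text, without a parser like the one above.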
🎯 Why This Matters
- ICML Acceptance: Top-tier machine learning conference
- Pioneering Work: First comprehensive study of LLMs on IRs
- Compiler Innovation: Potential to revolutionize code analysis
- Academic Impact: Significant contribution to AI + Compiler research
HPC Loop Resilience Analysis
Published
Venue: HPEC'24
A semantic approach that uses Large Language Models to analyze the resilience of loops in High-Performance Computing programs.
Key Contributions
- Semantic Analysis: LLM-based understanding of loop structures
- Resilience Quantification: Silent data corruption (SDC) rates for the 13 dwarfs of parallelism
- Performance Evaluation: Comprehensive analysis of HPC loop resilience
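As a rough illustration of how an SDC rate is typically derived, the sketch below tallies the outcomes of a fault-injection campaign. The three outcome labels (`benign`, `sdc`, `crash`) follow a common convention in the fault-injection literature and are an assumption here, as is the helper name `outcome_rates`; the sample data is invented.

```python
from collections import Counter

OUTCOME_KINDS = ("benign", "sdc", "crash")  # assumed classification


def outcome_rates(outcomes: list[str]) -> dict[str, float]:
    """Return the fraction of injection runs that ended in each outcome."""
    if not outcomes:
        raise ValueError("need at least one fault-injection run")
    unknown = set(outcomes) - set(OUTCOME_KINDS)
    if unknown:
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")
    counts = Counter(outcomes)
    return {kind: counts[kind] / len(outcomes) for kind in OUTCOME_KINDS}


if __name__ == "__main__":
    # Invented campaign: 10 runs of one loop with a single injected fault each.
    runs = ["benign"] * 6 + ["sdc"] * 3 + ["crash"]
    rates = outcome_rates(runs)
    print(f"SDC rate: {rates['sdc']:.0%}")  # prints SDC rate: 30%
```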
VISILIENCE - Visualization Framework
Published
Venue: PRDC'23
An interactive visualization framework that uses control-flow graphs to analyze and visualize program resilience characteristics.
Features
- Interactive CFG Visualization: Dynamic control-flow graph display
- Error Propagation Analysis: Visual representation of fault propagation
- Resilience Metrics: Comprehensive resilience analysis tools
- User-Friendly Interface: Intuitive visualization controls
Chaser - Fault Injection Tool
Published
Venue: DSN'20
An enhanced fault injection tool designed for tracing and analyzing soft errors in MPI applications, providing comprehensive error analysis capabilities.
Capabilities
- MPI Application Support: Specialized for Message Passing Interface programs
- Soft Error Injection: Comprehensive fault simulation
- Error Tracing: Detailed error propagation analysis
- Performance Monitoring: Low-overhead fault injection
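Soft errors are usually modeled as single bit flips in a value held in a register or in memory. The sketch below shows that model on a 64-bit floating-point number. It is a generic illustration and does not reflect how Chaser itself injects faults; `flip_bit` is a hypothetical helper.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Flip one bit (0 = least significant) of an IEEE 754 double."""
    if not 0 <= bit < 64:
        raise ValueError("bit index must be in [0, 63]")
    # Reinterpret the double as a 64-bit unsigned integer.
    (as_int,) = struct.unpack("<Q", struct.pack("<d", value))
    # Toggle the chosen bit and reinterpret the result as a double.
    (corrupted,) = struct.unpack("<d", struct.pack("<Q", as_int ^ (1 << bit)))
    return corrupted


if __name__ == "__main__":
    print(flip_bit(1.0, 63))  # sign bit: prints -1.0
    print(flip_bit(1.0, 52))  # lowest exponent bit: prints 0.5
    print(flip_bit(1.0, 0))   # lowest mantissa bit: tiny change
```

The example shows why the flipped position matters: a flip in the sign or exponent changes the value drastically, while a flip in a low mantissa bit is often masked by later computation.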
Healthcare AI Applications
Research Exploration
Exploring the application of Large Language Models to medical data analysis, focusing on building interpretable and reliable AI for healthcare.
Research Directions
- Medical Data Analysis: Pattern recognition and insights extraction
- Interpretable AI: Building transparent and trustworthy healthcare AI systems
- Reliable Healthcare AI: Ensuring robustness and safety in medical applications
- Interdisciplinary Collaboration: Bridging computer science and medical research
LLVM-based Soft Error Simulation Platform
Active Development
A comprehensive platform built on LLVM for simulating and analyzing soft errors in high-performance computing applications.
Features
- LLVM Integration: Leverages LLVM's powerful compilation infrastructure
- Fault Injection: Comprehensive soft error simulation capabilities
- Performance Analysis: Detailed performance impact assessment
- Extensible Architecture: Modular design for different analysis needs
BatchLens - Cloud Computing Visualization
Published
Venue: DATE'22
A visualization approach for analyzing and understanding batch jobs in cloud computing environments, providing insights into job patterns and performance.
Applications
- Cloud Job Analysis: Understanding batch job patterns
- Performance Optimization: Identifying performance bottlenecks
- Resource Utilization: Optimizing cloud resource allocation
🎉 ICML 2025 Acceptance - Major Milestone!
My paper "Can Large Language Models Understand IRs?" has been accepted to ICML 2025! This represents a significant breakthrough in my research career and demonstrates the impact of my work at the intersection of Large Language Models and compiler technologies.
Why This Achievement Matters
- Top-Tier Conference: ICML is one of the most prestigious ML conferences
- Research Innovation: First comprehensive study of LLMs on IRs
- Career Impact: Establishes expertise in AI + Compiler research
- Future Opportunities: Opens doors for collaborations and funding
🤝 Seeking Collaborators & Students
I'm actively seeking collaborators and students for these projects. If you're interested in any of the areas below, please get in touch.
🎓 Graduate Students (M.S./Ph.D.)
- HPC Systems & Resilience Analysis
- LLM for Code Intelligence
- Healthcare AI
🔬 Research Collaborators
- Academic Partners
- Industry Partners
- Interdisciplinary Research
💡 What I Offer
- State-of-the-art HPC infrastructure
- LLM resources and expertise
- Conference travel support
- Research funding opportunities