Accelerating Graphic Rendering on Programmable RISC-V GPUs
	Blaise Tine, Varun Saxena, Santosh Srivatsan, Joshua R. Simpson, Fadi Alzammar, Liam Paul Cooper,
	Sam Jijina, Swetha Rajagoplan, Tejaswini Anand Kumar, Jeff Young, Hyesoon Kim
	Hot Chips (2022).
	
	
MAIA: Matrix Inversion Acceleration Near Memory
	Bahar Asgari, Dheeraj Ramchandani, Amaan Marfatia, and Hyesoon Kim
	International Conference on Field-Programmable Logic and Applications (FPL 2022).
	
Securing GPU via Region-based Bounds Checking
	Jaewon Lee, Yonghae Kim, Jiashen Cao, Euna Kim, Jaekyu Lee, Hyesoon Kim
	IEEE/ACM International Symposium on Computer Architecture (ISCA 2022).
	
	
The Tip of Iceberg in Open-Source Hardware GPU
	Blaise Tine, Ruobing Han and Hyesoon Kim.
	Open-Source Computer Architecture Research (OSCAR 2022). 
	
	
Implementing Hardware Extensions for Multicore RISC-V GPUs
	Blaise Tine and Hyesoon Kim.
	Workshop on Computer Architecture Research with RISC-V (CARRV 2022). 
	
AOS-RISC-V: Towards Always-On Heap Memory Safety
	Yonghae Kim, Anurag Kar, Siddant Singh, Ammar A. Ratnani, Jaekyu Lee, Hyesoon Kim
	Workshop on Computer Architecture Research with RISC-V (CARRV 2022). 
	
	
DynaaDCP: Dynamic Navigation of Autonomous Agents for Distributed Capture Processing
	Sam Jijina, Ramyad Hadidi, Jun Chen, Zhen Jiang, Ashutosh Dhekne, Hyesoon Kim
	International Workshop on Domain Specific System Architecture (DOSSA-4). 
	
	
FiGO: Fine-Grained Query Optimization in Video Analytics
	Jiashen Cao, Karan Sarkar, Ramyad Hadidi, Joy Arulraj, Hyesoon Kim
	ACM Special Interest Group on Management of Data (SIGMOD 2022). 
    
    
      
    
		
	
COX: CUDA on X86 by Exposing Warp-Level Functions to CPUs
	Ruobing Han, Jaewon Lee, Jaewoong Sim, Hyesoon Kim
	arXiv preprint arXiv:2112.10034 (2021). 
	
	
Vortex: Extending the RISC-V ISA for GPGPU and 3D-Graphics Research
	Blaise Tine, Krishna Praveen Yalamarthy, Fares Elsabbagh, Hyesoon Kim
	IEEE/ACM International Symposium on Microarchitecture (MICRO) (2021). 
	
	
Characterizing the Performance Implications of Compression Formats Used in Sparse Workloads
	Bahar Asgari, Ramyad Hadidi, Joshua Dierberger, Charlotte Steinichen, Amaan Marfatia, Hyesoon Kim
	IEEE International Symposium on Workload Characterization (IISWC) (2021).
	
RASA: Efficient Register-Aware Systolic Array Matrix Engine for CPU
	Geonhwa Jeong, Eric Qin, Ananda Samajdar, Christopher Hughes, Sreenivas Subramoney, Hyesoon Kim and Tushar Krishna
	Design Automation Conference (DAC) (2021).
		
	
Single-Source Hardware-Software Codesign
	Blaise Tine, Hyesoon Kim, and Sudhakar Yalamanchili
	Workshop on Languages, Tools, and Techniques for Accelerator Design (LATTE) (2021). 
	
	
SmaQ: Smart Quantization for DNN Training by Exploiting Value Clustering
	Nima Shoghi, Andrei Bersatti, Moinuddin Qureshi, and Hyesoon Kim
	IEEE Computer Architecture Letters (CAL) (2021)
	
	
A Scalable Multicore RISC-V GPGPU Accelerator for High-End FPGAs 
	Blaise Tine, Fares Elsabbagh, Apurve Chawda, Will Gulian, Yaotian Feng, Da Eun Shim, Priyadarshini Roshan, Ethan Lyons, Lingjun Zhu, Sung Kyu Lim, Seyong Lee, Jeff Vetter, Hyesoon Kim 
	Design Automation Conference, Designer, IP and Embedded Track (DAC-DIET) (2021).
	
	
Bringing OpenCL to Commodity RISC-V CPUs 
	Blaise Tine, Seyong Lee, Jeff Vetter, Hyesoon Kim 
	Fifth Workshop on Computer Architecture Research with RISC-V (2021). 
	
	
Supporting CUDA for an extended RISC-V GPU architecture 
	Ruobing Han, Blaise Tine, Jaewon Lee, Jaewoong Sim, Hyesoon Kim 
	Fifth Workshop on Computer Architecture Research with RISC-V (2021). 
	
	
Cryptography Acceleration in a RISC-V GPGPU 
	Austin Adams, Pulkit Gupta, Blaise Tine, Hyesoon Kim 
	Fifth Workshop on Computer Architecture Research with RISC-V (2021). 
	
	
Hardware Support to Improve Fuzzing Performance and Precision 
	Ren Ding*, Yonghae Kim*, Fan Sang, Wen Xu, Gururaj Saileshwar and Taesoo Kim (*co-first authors) 
	ACM Conference on Computer and Communications Security (CCS), Seoul, South Korea (2021). 
	
	
FAFNIR: Accelerating Sparse Gathering by Using Efficient Near-Memory Intelligent Reduction 
	Bahar Asgari, Ramyad Hadidi, Jiashen Cao, Da Eun Shim, Sung-Kyu Lim, Hyesoon Kim 
	International Symposium on High-Performance Computer Architecture (HPCA), Seoul, South Korea (2021) 
	
	
Quantifying the Design-Space Tradeoffs in Autonomous Drones 
	Ramyad Hadidi, Bahar Asgari, Sam Jijina, Adriana Amyette, Nima Shoghi, Hyesoon Kim 
	International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Detroit, MI (2021) 
	
	
THIA: Accelerating Video Analytics using Early Inference and Fine-Grained Query Planning 
	Jiashen Cao, Ramyad Hadidi, Joy Arulraj, Hyesoon Kim 
	arXiv preprint arXiv:2102.08481 (2021) 
		
	
Efficiently Solving Partial Differential Equations in a Partially Reconfigurable Specialized Hardware 
	Bahar Asgari, Ramyad Hadidi, Tushar Krishna, Hyesoon Kim, Sudhakar Yalamanchili 
	IEEE Transactions on Computers (2021) 
	
	
    
      
   
	
	
Things to Consider to Enable Dynamic Graphs in Processing-in-Memory 
	Euna Kim and Hyesoon Kim 
	International Symposium on Memory Systems (MEMSYS), Washington, DC (2020) 	
	
	
Parallel Hash Table Design for NDP Systems 
	Pranith Kumar and Hyesoon Kim 
	International Symposium on Memory Systems (MEMSYS), Washington, DC (2020) 	
	
	
Neural Network Weight Compression with NNW-BDI 
	Andrei Bersatti, Nima Shoghi, and Hyesoon Kim 
	International Symposium on Memory Systems (MEMSYS), Washington, DC (2020) 	
	
	
Reducing Inference Latency with Concurrent Architectures for Image Recognition 
	Ramyad Hadidi, Jiashen Cao, Michael S. Ryoo, Hyesoon Kim 
	arXiv preprint arXiv:2011.07092 (2020) 
	
	
LCP: A Low-Communication Parallelization Method for Fast Neural Network Inference in Image Recognition 
	Ramyad Hadidi, Bahar Asgari, Jiashen Cao, Younmin Bae, Da Eun Shim, Hyojong Kim, Sung-Kyu Lim, Michael S. Ryoo, Hyesoon Kim 
	arXiv preprint arXiv:2003.06464 (2020) 
	
	
Copernicus: Characterizing the Performance Implications of Compression Formats Used in Sparse Workloads 
	Bahar Asgari, Ramyad Hadidi, Joshua Dierberger, Charlotte Steinichen, Hyesoon Kim 
	arXiv preprint arXiv:2011.10932 (2020) 
	
	
Secure Location-Aware Authentication and Communication for Intelligent Transportation Systems 
	Nima Shoghi Ghalehshahi, Ramyad Hadidi, Jaewon Lee, Jun Chen, Arthur Siqueira, Rahul Rajan, Shaan Dhawan, Pooya Shoghi Ghalehshahi, Hyesoon Kim 
	arXiv preprint arXiv:2011.07092 (2020) 
			
	
RISC-V FPGA Platform toward ROS-based Robotics Application [Slides]    Jaewon Lee, Hanning Chen, Hyesoon Kim 
    30th International Conference on Field-Programmable Logic and Applications
MEISSA: Multiplying Matrices Efficiently in a Scalable Systolic Architecture  Bahar Asgari, Ramyad Hadidi, Hyesoon Kim 
    IEEE International Conference on Computer Design (ICCD), Hartford, Connecticut (2020) 
Hardware-based Always-On Heap Memory Safety [Slides] Yonghae Kim, Jaekyu Lee, Hyesoon Kim 
IEEE/ACM International Symposium on Microarchitecture (MICRO), Athens, Greece (2020)
Traversing Large Graphs on GPUs with Unified Memory [Talk Video]    Prasun Gera, Hyojong Kim, Piyush Sao, Hyesoon Kim, David Bader 
    Proceedings of the VLDB Endowment, Vol. 13, No. 7. VLDB 2020 Tokyo, Japan
Proposing a Fast and Scalable Systolic Array to Implement Matrix Multiplications on FPGA [Slides]    Bahar Asgari, Ramyad Hadidi, Hyesoon Kim 
    Symposium on Field-Programmable Custom Computing Machines (FCCM), Fayetteville, AR (2020) 
Understanding the Software and Hardware Stacks of a General-Purpose Cognitive Drone [Poster]    Sam Jijina, Adriana Amyette, Nima Shoghi, Ramyad Hadidi, Hyesoon Kim 
    IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Boston, MA (2020)
PISCES: Power-Aware Implementation of SLAM by Customizing Efficient Sparse Algebra    Bahar Asgari, Ramyad Hadidi, Nima Shoghi, Hyesoon Kim 
    Design Automation Conference (DAC), San Francisco, CA (2020) 
Towards a General Purpose Cognitive Drone [Slides]    Sam Jijina, Adriana Amyette, Ramyad Hadidi, Hyesoon Kim 
    The Fourth Workshop on Cognitive Architectures (CogArch 2020), co-located with HPCA 2020, San Diego, CA (2020)
Batch-Aware Unified Memory Management in GPUs for Irregular Workloads [Talk Video]    Hyojong Kim, Jaewoong Sim, Prasun Gera, Ramyad Hadidi, Hyesoon Kim 
    International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Lausanne, Switzerland (2020)
ALRESCHA: A Lightweight Reconfigurable Sparse-Computation Accelerator    Bahar Asgari, Ramyad Hadidi, Tushar Krishna, Hyesoon Kim, Sudhakar Yalamanchili 
    International Symposium on High-Performance Computer Architecture (HPCA), San Diego, CA (2020)
Tango: An Optimizing Compiler for Just-in-time RTL Simulation    Blaise Tine, Hyesoon Kim, Sudhakar Yalamanchili 
    Design, Automation, and Test in Europe (DATE), Grenoble, France (2020)
ASCELLA: Accelerating Sparse Computation by Enabling Stream Accesses to Memory [Talk Video]    Bahar Asgari, Ramyad Hadidi, Hyesoon Kim 
    Design, Automation, and Test in Europe (DATE), Grenoble, France (2020)
Productive Hardware Designs using Hybrid HLS-RTL Development    Blaise Tine, Seyong Lee, Jeff Vetter, Hyesoon Kim 
    International Symposium on Field-Programmable Gate Arrays (FPGA) poster, Seaside, CA (2020)
Cash: A Single-Source Hardware-Software Codesign Framework for Rapid Prototyping    Blaise Tine, Fares Elsabbagh, Jeff Vetter, Hyesoon Kim 
    International Symposium on Field-Programmable Gate Arrays (FPGA) poster, Seaside, CA (2020)       
    
    
    
  
  
  
  
   
Impact of Instruction Set Architecture on Machine Learning Workloads    Jeung Moon Lee, Hyesoon Kim, Hyojong Kim and Pranith Kumar 
    ACM PACT Student Research Competition (SRC), Seattle, Washington, USA (2019)
Characterizing the Deployment of Deep Neural Networks on Commercial Edge Devices [Slides] [EdgeBench] [Best Paper Nominee]    Ramyad Hadidi, Jiashen Cao, Yilun Xie, Bahar Asgari, Tushar Krishna, Hyesoon Kim
IEEE International Symposium on Workload Characterization (IISWC), Orlando, FL (2019)
ERIDANUS: Efficiently Running Inference of DNNs Using Systolic Arrays    Bahar Asgari, Ramyad Hadidi, Hyesoon Kim, Sudhakar Yalamanchili
    IEEE Micro, Special Issue on Machine Learning Acceleration (2019)
SLAM Performance on Embedded Robots 
    Nima Shoghi, Ramyad Hadidi, Hyesoon Kim
    Student Research Competition at Embedded Systems Week (SRC ESWEEK), New York, NY (2019)
Enabling Speech to Text on Embedded Systems     Mohan Dodda, Taejoon Park, Sayuj Shajith, Ramyad Hadidi, Hyesoon Kim
    Student Research Competition at Embedded Systems Week (SRC ESWEEK), New York, NY (2019)
Video Analytics From Edge To Server [Slides]    Jiashen Cao, Ramyad Hadidi, Joy Arulraj and Hyesoon Kim
    International Conference on Hardware/Software Codesign and System Synthesis CODES+ISSS (ESWEEK), New York, NY (2019)
Capella: Customizing Perception for Edge Devices by Efficiently Allocating FPGAs to DNNs  [Demo Site] 
    Younmin Bae, Ramyad Hadidi, Bahar Asgari, Jiashen Cao, Hyesoon Kim
    International Conference on Field-Programmable Logic and Applications (FPL), Demo, Barcelona, Spain (2019) 
Characterizing the Execution of Deep Neural Networks on Collaborative Robots and Edge Devices [Slides]        Matthew Merck, Bingyao Wang, Lixing Liu, Chunjun Jia, Arthur Siqueira, Qiusen Huang, Abhijeet Saraha,
        Dongsuk Lim, Jiashen Cao, Ramyad Hadidi, Hyesoon Kim
        ACM Practice and Experience in Advanced Research Computing (PEARC), Chicago, IL (2019)
Vortex RISC-V GPGPU system: Extending the ISA, Synthesizing the Microarchitecture, and Modeling the Software Stack        Fares Elsabbagh, Bahar Asgari, Hyesoon Kim and Sudhakar Yalamanchili
        Third Workshop on Computer Architecture Research with RISC-V (CARRV), Co-located with ISCA'19, Phoenix, AZ (2019) 
Understanding the Power Consumption of Executing Deep Neural Networks on a Distributed Robot System [Slides]        Ramyad Hadidi, Jiashen Cao, Matthew Merck, Arthur Siqueira, Qiusen Huang, Abhijeet Saraha,
        Chunjun Jia, Bingyao Wang, Dongsuk Lim, Lixing Liu and Hyesoon Kim
        Algorithms and Architectures for Learning in-the-Loop Systems in Autonomous Flight Workshop,
        Co-located with IEEE International Conference on Robotics and Automation (ICRA), Montreal, QC (2019)
A Case Study: Exploiting Neural Machine Translation to Translate CUDA to OpenCL        Yonghae Kim, Hyesoon Kim
        2nd International Workshop on AI-assisted Design for Architecture,
        Co-located with  International Symposium on Computer Architecture (ISCA), Phoenix, AZ, June 22 (2019)
A Case Study: Exploiting Neural Machine Translation to Translate CUDA to OpenCL        Yonghae Kim, Hyesoon Kim
        arXiv preprint arXiv:1905.07653 (2019)
Translating CUDA to OpenCL for Hardware Generation using Neural Machine Translation         Yonghae Kim, Hyesoon Kim
        The ACM CGO Student Research Competition (SRC),  Washington, D.C., USA (2019)
FlashGPU: Placing New Flash Next to GPU Cores      Jie Zhang, Miryeong Kwon, Myoungsoo Jung, Hyojong Kim, Hyesoon Kim 
      56th Design Automation Conference (DAC), June 2019
An Edge-Centric Scalable Intelligent Framework To Collaboratively Execute DNN [Demo] [Paper]        Jiashen Cao, Fei Wu, Ramyad Hadidi, Lixing Liu, Tushar Krishna, Michael S. Ryoo, Hyesoon Kim 
        Demo for SysML Conference, Palo Alto, CA (2019)
LODESTAR: Creating Locally-Dense CNNs for Efficient Inference on Systolic Arrays        Bahar Asgari, Ramyad Hadidi, Hyesoon Kim, and Sudhakar Yalamanchili 
        ACM/IEEE Design Automation Conference (DAC) - Late Breaking Results, Las Vegas, NV (2019)
Robustly Executing DNNs in IoT Systems Using Coded Distributed Computing [Slides]        Ramyad Hadidi, Jiashen Cao, Michael S. Ryoo, Hyesoon Kim
        ACM/IEEE Design Automation Conference (DAC) - Late Breaking Results, Las Vegas, NV (2019)
Empirical Investigation of Stale Value Tolerance on Parallel RNN Learning        Joo Hwan Lee, Hyesoon Kim
        The International Symposium on Performance Analysis of Systems and Software 2019 (ISPASS 2019), April 2019 
Thermal-Aware Processing-in-memory Instruction Offloading        Lifeng Nai, Ramyad Hadidi, He Xiao, Hyojong Kim, Jaewoong Sim, Hyesoon Kim
        Journal of Parallel and Distributed Computing (JPDC), Elsevier, (2019)
Collaborative Execution of Deep Neural Networks on Internet of Things Devices        Ramyad Hadidi, Jiashen Cao, Michael S. Ryoo, Hyesoon Kim
        arXiv preprint arXiv:1901.02537 (2019)    
    
    
  
  
  
  
   
Distributed Perception by Collaborative Robots  [Slides]        Ramyad Hadidi, Jiashen Cao, Matthew Woodward, Michael S. Ryoo, Hyesoon Kim
        Invited for IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS’18), Madrid, Spain (2018)
        and IEEE Robotics and Automation Letters (RA-L)
Real-Time Image Recognition Using Collaborative IoT Devices [Slides]        Ramyad Hadidi, Jiashen Cao, Matthew Woodward, Michael S. Ryoo, Hyesoon Kim
        1st Reproducible Tournament on Pareto-efficient Image Classification (ACM ReQuEST workshop), co-located with ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Williamsburg, VA, USA (2018)
CODA: Enabling Co-location of Computation and Data for Near-Data Processing	Hyojong Kim, Ramyad Hadidi, Lifeng Nai, Hyesoon Kim, Nuwan Jayasena, Yasuko Eckert, Onur Kayiran, Gabriel H. Loh
     ACM Transactions on Architecture and Code Optimization (TACO), Volume 15, Issue 3, October 2018 
CoolPIM: Thermal-Aware Source Throttling for Efficient PIM Instruction Offloading [Slides]        Lifeng Nai, Ramyad Hadidi, He Xiao, Hyojong Kim, Jaewoong Sim, Hyesoon Kim
        IEEE International Parallel & Distributed Processing Symposium (IPDPS), Vancouver, British Columbia, Canada, May 2018
Musical Chair: Efficient Real-Time Recognition Using Collaborative IoT Devices        Ramyad Hadidi, Jiashen Cao, Matthew Woodward, Michael S. Ryoo, Hyesoon Kim
        arXiv preprint arXiv:1802.02138 (2018) 
Performance Characterisation and Simulation of Intel's Integrated GPU Architecture [Slides]        Prasun Gera, Hyojong Kim, Hyesoon Kim, Sunpyo Hong, Vinod George, Chi-Keung (CK) Luk
        IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Belfast, Northern Ireland, United Kingdom, Apr. 2018
Performance Implications of NoCs on 3D-Stacked Memories: Insights from the Hybrid Memory Cube [Slides]        Ramyad Hadidi, Bahar Asgari, Jeffrey Young, Burhan Ahmad Mudassar, Kartikay Garg, Tushar Krishna, Hyesoon Kim
        IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Belfast, Northern Ireland, United Kingdom, Apr. 2018    
    
    
  
  
  
  
   
StaleLearn: Learning Acceleration with Asynchronous Synchronization between Model Replicas on PIM        Joo Hwan Lee and Hyesoon Kim
	IEEE Transactions on Computers, 2017 
CAIRO: A Compiler-Assisted Technique for Enabling Instruction-Level Offloading of Processing-In-Memory	Ramyad Hadidi, Lifeng Nai, Hyojong Kim, and Hyesoon Kim
    ACM Transactions on Architecture and Code Optimization (TACO), Volume 14, Issue 4, December 2017 
Demystifying the Characteristics of 3D-Stacked Memories: A Case Study for Hybrid Memory Cube [Slides]	Ramyad Hadidi, Bahar Asgari, Burhan Ahmad Mudassar, Saibal Mukhopadhyay, Sudhakar Yalamanchili, and Hyesoon Kim
	IEEE International Symposium on Workload Characterization (IISWC), Seattle, WA, Oct. 2017 
Lightweight SIMT Core Designs for Intelligent 3D Stacked DRAM   Chad D. Kersey, Sudhakar Yalamanchili, and Hyesoon Kim 
    The International Symposium on Memory Systems (MEMSYS'17), Oct. 2017 
Inferring Fine-grained Control Flow Inside SGX Enclaves with Branch Shadowing    Sangho Lee, Ming-Wei Shih, Prasun Gera, Taesoo Kim, Hyesoon Kim, Marcus Peinado
    USENIX Security Symposium, Aug. 2017 
SimProf: A Sampling Framework for Data Analytic Workloads	Jen-Cheng Huang, Lifeng Nai, Pranith Kumar, Hyojong Kim, and Hyesoon Kim
    International Parallel and Distributed Processing Symposium (IPDPS), Orlando, FL, May 2017 
GraphPIM: Enabling Instruction-Level PIM Offloading in Graph Computing Frameworks [Slides] [Lightning]	Lifeng Nai, Ramyad Hadidi, Jaewoong Sim, Hyojong Kim, Pranith Kumar, and Hyesoon Kim
    International Symposium on High Performance Computer Architecture (HPCA), Austin, TX, Feb. 2017     
    
    
  
  
  
  
   
Exploring Big Graph Computing - An Empirical Study from Architectural Perspective	Lifeng Nai, Yinglong Xia, Ilie G. Tanase, and Hyesoon Kim
	Journal of Parallel and Distributed Computing, 2016 
  Analyzing Consistency Issues In HMC Atomics    Pranith Kumar, Lifeng Nai, and Hyesoon Kim
	The International Symposium on Memory Systems (MEMSYS), Washington, DC, Oct. 2016     
      
      
  
  
  
  
   
 GraphBIG: Understanding Graph Computing in the Context of Industrial Solutions        Lifeng Nai, Yinglong Xia, Ilie G. Tanase, Hyesoon Kim, and Ching-Yung Lin
	The International Conference for High Performance Computing, Networking, Storage and Analysis (SC), Nov. 2015 
BSSync: Processing Near Memory for Machine Learning Workloads with Bounded Staleness Consistency Models [Best Paper Award]     Joo Hwan Lee, Jaewoong Sim and Hyesoon Kim
    International Conference on Parallel Architectures and Compilation Techniques (PACT), (2015)
Instruction Offloading with HMC 2.0 Standard - A Case Study for Graph Traversals    Lifeng Nai and Hyesoon Kim
    The International Symposium on Memory Systems (MEMSYS), Oct. 2015 
 SIMT-based Logic Layers for Stacked DRAM Architectures: A Prototype    Chad D. Kersey, Sudhakar Yalamanchili, and Hyesoon Kim 
    The International Symposium on Memory Systems (MEMSYS), Oct. 2015 
Understanding Energy Aspect of Processing Near Memory for HPC Workloads    Hyojong Kim, Hyesoon Kim, Sudhakar Yalamanchili,  Arun F. Rodrigues
    The International Symposium on Memory Systems (MEMSYS), Oct. 2015 
 Cymric: A Framework for Prototyping Near-Memory Architectures	C. Kersey, H. Kim, S. Yalamanchili
    WARP 2015, 6th Workshop on Architectural Research Prototyping, Co-Located with the 42nd International Symposium on Computer Architecture, 2015 
[talks]  SP-CNN: A Scalable and Programmable CNN-based Accelerator	Dilan Manatunga, Hyesoon Kim, Saibal Mukhopadhyay
	IEEE Micro, 2015 
 SP-CNN: A Scalable and Programmable CNN-based Accelerator	Dilan Manatunga, Hyesoon Kim, Saibal Mukhopadhyay
	GOMACTech, Mar. 2015 
 Block-Precise Processors: Low-Power Processors with Reduced Operand Store Accesses and Result Broadcasts	Nagesh B. Lakshminarayana and Hyesoon Kim
	IEEE Transactions on Computers, 2015 
 GREEN Cache: Exploiting the Disciplined Memory Model of OpenCL on GPUs        Jaekyu Lee, Dong Hyuk Woo, Hyesoon Kim, and Mani Azimi 
	IEEE Transactions on Computers, 2015 
Accelerating Application Start-up with Nonvolatile Memory in Android Systems         Hyojong Kim, Hongyeol Lim, Dilan Manatunga, Hyesoon Kim, Gi-Ho Park 
        IEEE Micro, Jan/Feb, 2015     
      
      
  
  
  
  
   
Transparent Hardware Management of Stacked DRAM as Part of Memory      Jaewoong Sim, Alaa R. Alameldeen, Zeshan Chishti, Chris Wilkerson, Hyesoon Kim
      Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Cambridge, UK, Dec. 2014
[talks] GPUMech: GPU Performance Modeling Technique based on Interval Analysis      Jen-Cheng Huang, Joo Hwan Lee, Hyesoon Kim, Hsien-Hsin S. Lee
      Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Cambridge, UK, Dec. 2014 
[talks] Design space exploration of memory model for heterogeneous computing       Jieun Lim and Hyesoon Kim
      2014 IEEE 26th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), Oct. 2014
[talks] OpenCL Performance Evaluation on Modern Multi Core CPUs      Joo Hwan Lee, Kaushik Patel, Nimit Nigania, Hyojong Kim, Hyesoon Kim
      Scientific Programming, 2014
 Power Modeling for GPU Architectures Using McPAT      Jieun Lim, Nagesh B. Lakshminarayana, Hyesoon Kim, William Song, Sudhakar Yalamanchili, and Wonyong Sung
      ACM Trans. Des. Autom. Electron. Syst. 19, 3, Article 26 (June 2014) 
Harmonica: An FPGA-Based Data Parallel Soft Core      Chad Kersey, Sudhakar Yalamanchili, Hyojong Kim, Nimit Nigania, and Hyesoon Kim
      The 22nd International Symposium on Field-Programmable Custom Computing Machines (FCCM), May, 2014 (Poster) 
A Configurable and Strong RAS Solution for Die-Stacked DRAM Caches      Jaewoong Sim, Gabriel H. Loh, Vilas Sridharan, Mike O'Connor
      IEEE Micro, Special Issues: Micro's Top Picks from 2013 Computer Architecture Conferences (TOP PICKS), May/June 2014 
Hardware Support for Safe Execution of Native Client Applications      Dilan Manatunga, Joo Hwan Lee, and Hyesoon Kim
      Computer Architecture Letters (CAL), vol.PP, no.99, pp.1,1 2014 
Spare Register Aware Prefetching for Graph Algorithms on GPUs      Nagesh B Lakshminarayana and Hyesoon Kim
      The 20th International Symposium on High Performance Computer Architecture (HPCA), Orlando, Feb 2014 
[talks]  TBPoint: Reducing Simulation Time for Large Scale GPGPU Kernels      Jen-Cheng Huang, Lifeng Nai, Hyesoon Kim, Hsien-Hsin Lee
      The 28th International Parallel & Distributed Processing Symposium (IPDPS), Phoenix, AZ, May 2014
    
      
      
  
  
  
  
   
Design Space Exploration of On-chip Ring Interconnection for a CPU-GPU Heterogeneous Architecture      Jaekyu Lee, Si Li, Hyesoon Kim, and Sudhakar Yalamanchili
      In Journal of Parallel and Distributed Computing (JPDC), Vol. 73, Issue 12, pp. 1525-1538, December 2013
Adaptive Virtual Channel Partitioning for Network-on-Chip in Heterogeneous Architectures      Jaekyu Lee, Si Li, Hyesoon Kim, and Sudhakar Yalamanchili
      In ACM Transactions on Design Automation of Electronic Systems (TODAES), Vol. 18, No. 4, pp. 48:1-48:28, October 2013
SESH framework: A Space Exploration Framework for GPU Application and Hardware Codesign      Joo Hwan Lee, Jiayuan Meng, Hyesoon Kim
      4th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS), held as part of SC13, Denver, Colorado, USA, November 2013 
Resilient Die-stacked DRAM Caches      Jaewoong Sim, Gabriel H. Loh, Vilas Sridharan, Mike O'Connor
      40th International Symposium on Computer Architecture (ISCA), Tel-Aviv, Israel, June 2013
      
[talk] CHiP: A Profiler to Measure the effect of Cache Contention on Scalability      Bevin Brett, Pranith Kumar, Minjang Kim, Hyesoon Kim
      Workshop on Multithreaded Architectures and Applications in conjunction with IPDPS-27, Boston, USA, May 2013
OpenCL Performance Evaluation on Modern Multi Core CPUs      Joo Hwan Lee, Kaushik Patel, Nimit Nigania, Hyojong Kim, Hyesoon Kim
      Multicore and GPU Programming Models, Languages and Compilers Workshop (PLC 2013), in conjunction with IPDPS-27, Boston, USA, May 2013
When Prefetching Works, When It Doesn't, and Why      Jaekyu Lee, Hyesoon Kim, and Richard Vuduc
      An invited paper (originally published in TACO), 8th International Conference on High-Performance and Embedded Architectures and Compilers (HiPEAC), Berlin, Germany, January 2013
      
      
  
  
  
  
   
A Mostly-Clean DRAM Cache for Effective Hit Speculation and Self-Balancing Dispatch      Jaewoong Sim, Gabriel Loh, Hyesoon Kim, Mike O'Connor, Mithuna Thottethodi
      Proceedings of the 45th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), Vancouver, BC, Canada, Dec. 2012
      
[talk]SD3: An Efficient Dynamic Data-Dependence Profiling Mechanism      Minjang Kim, Nagesh B. Lakshminarayana,  Hyesoon Kim, Chi-Keung Luk
      IEEE Transactions on Computers (TC), July 2012. 
FLEXclusion: Balancing Cache Capacity and On-chip Bandwidth with Flexible Exclusion      Jaewoong Sim, Jaekyu Lee, Moinuddin K. Qureshi, and Hyesoon Kim
      Proceedings of the 39th IEEE International Symposium on Computer Architecture (ISCA), Portland, OR, June 2012
      
[talk]Predicting Potential Speedup of Serial Code via Lightweight Profiling and Emulations with Memory Performance Model      Minjang Kim, Pranith Kumar, Hyesoon Kim, and Bevin Brett
      Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium (IPDPS), Shanghai, China, May 2012
When Prefetching Works, When It Doesn't, and Why      Jaekyu Lee, Hyesoon Kim, and Richard Vuduc
      ACM Transactions on Architecture and Code Optimization (TACO), Vol. 9, No. 1, pp. 2:1-2:29, March 2012
A Performance Analysis Framework for Identifying Potential Benefits in GPGPU Applications      Jaewoong Sim, Aniruddha Dasgupta, Hyesoon Kim, and Richard Vuduc
      Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), New Orleans, LA, February 2012. 
[talk]TAP: A TLP-Aware Cache Management Policy for a CPU-GPU Heterogeneous Architecture      Jaekyu Lee and Hyesoon Kim
      Proceedings of the 18th International Symposium on High Performance Computer Architecture (HPCA), New Orleans, LA, February 2012.
      
[talk]      
      
  
  
  
  
   DRAM Scheduling Policy for a GPGPU Architecture Based on a Potential Function      Nagesh B. Lakshminarayana, Jaekyu Lee, Hyesoon Kim, and Jinwoo Shin
      IEEE Computer Architecture Letters (CAL) Nov. 2011
      
      
  
  
  
  
   
Many-Thread Aware Prefetching Mechanisms for GPGPU Applications      Jaekyu Lee, Nagesh B Lakshminarayana, Hyesoon Kim, Richard Vuduc
      MICRO-43, Atlanta, GA, 2010.
      
[talk]SD3: A Scalable Approach to Data-Dependence Profiling      Minjang Kim, Hyesoon Kim, Chi-Keung Luk
      MICRO-43, Atlanta, GA, 2010. 
An Integrated GPU Power and Performance Model      Sunpyo Hong and Hyesoon Kim
      ISCA-37, June 2010.
      
[talk]Prospector: A Dynamic Data-Dependence Profiler To Help Parallel Programming      Minjang Kim, Hyesoon Kim, Chi-Keung Luk
      HotPar-2, June, 2010. 
[poster]Effect of Instruction Fetch and Memory Scheduling on GPU Performance      Nagesh B. Lakshminarayana and Hyesoon Kim
      Workshop on Language, Compiler, and Architecture Support for GPGPU, in conjunction with HPCA/PPoPP 2010, 2010. 
[talk]      
      
  
  
  
  
   Qilin: Exploiting Parallelism on Heterogeneous Multiprocessors with Adaptive Mapping      Chi-Keung Luk, Sunpyo Hong, Hyesoon Kim
      MICRO 2009, December, 2009. 
Age Based Scheduling Policy for Asymmetric Multiprocessors      Nagesh B. Lakshminarayana, Jaekyu Lee, Hyesoon Kim
      Super Computing, November, 2009. 
[talk]An Analytical Model for a GPU Architecture with Memory-level and Thread-level Parallelism Awareness      Sunpyo Hong and Hyesoon Kim
      Proceedings of the 36th International Symposium on Computer Architecture (ISCA-36), Austin, TX, June 2009.
      
[talk]      
      
  
  
  
  
   
  Joo Hwan Lee, Nimit Nigania, Hyesoon Kim, and Bevin Brett,
      "HPerf : A Lightweight Profiler for Task Distribution on CPU+GPU Platforms",
      GT-CS-15-04, Georgia Institute of Technology, 2015. 
 
  Jaekyu Lee, Si Li, Hyesoon Kim, and Sudhakar Yalamanchili,
      "Design Space Exploration of On-chip Ring Interconnection for a CPU-GPU Architecture",
      GIT-CERCS-12-05, Georgia Institute of Technology, 2012. 
 
  Chayong Lee, Euna Kim, and Hyesoon Kim,
      "The AM-Bench: An Android Multimedia Benchmark Suite",
      GIT-CERCS-12-04, Georgia Institute of Technology, 2012. 
 
  Vishal Gupta, Hyesoon Kim, and Karsten Schwan,
      "Evaluating Scalability of Multi-threaded Applications on a Many-core Platform",
      GIT-CERCS-12-03, Georgia Institute of Technology, 2012. 
 
  Minjang Kim, Chi-Keung Luk, Hyesoon Kim,
      "Prospector: Discovering Parallelism via Dynamic Data-Dependence Profiling",
      TR-2009-003, Georgia Institute of Technology, 2009. 
 
  Sunpyo Hong, Hyesoon Kim,
      "An Analytical Model for a GPU Architecture with Memory-level and Thread-level Parallelism Awareness",
      TR-2009-003, Georgia Institute of Technology, 2009. 
 
   Sunpyo Hong, Hyesoon Kim,
      "Parallelization of Mutual-Information Based Registration in the ITK Toolkit Using CUDA and TBB",
      TR-2009-002, Georgia Institute of Technology, 2009. 
 
   Chi-Keung Luk, Sunpyo Hong, Hyesoon Kim,
       "Qilin: Exploiting Parallelism on Heterogeneous Multiprocessors with Adaptive Mapping",
      TR-2009-001, Georgia Institute of Technology, January, 2009.