IP-CAL

^*Co-corresponding authors; ^†Co-first authors; Blue name: IP-CAL member (current or alumni)

Conference Papers

Make Every Batch Count: Fault Entry Merging for Efficient Batching in Unified Virtual Memory
Jane Rhee^†, SeJin Park^†, Gunjae Koo, Yunho Oh, and Myung Kuk Yoon
The 59th IEEE/ACM International Symposium on Microarchitecture (MICRO 2026), Athens, Greece, Oct. 31 - Nov. 4, 2026
Complex Tensor Core: Software-Hardware Co-Design for Accelerating Complex-Valued Neural Networks on GPUs
Eunbi Jeong, Jennifer N. Choi, Ji Yeong Yi, Jane Rhee, Gunjae Koo, Yunho Oh^*, and Myung Kuk Yoon^*
The 59th IEEE/ACM International Symposium on Microarchitecture (MICRO 2026), Athens, Greece, Oct. 31 - Nov. 4, 2026
Accelerating Vision Transformer Inference via Non-GEMM Kernel Fusion on Edge GPUs
Sejin Park, Yoonhyung Park, Sunhwa Kang, and Myung Kuk Yoon
2026 Summer Annual Conference of IEIE, Jeju, Korea, June 23 - 26, 2026
Characterizing Cache-Asymmetric CPU Topologies on AMD 3D V-Cache Processors (Poster Abstract) 📄
Sunwoo Kim, Eunbi Jeong, and Myung Kuk Yoon
2026 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2026), Seoul, Korea, April 26 - 28, 2026
FINEA: An Efficient Neural Network Accelerator Exploiting Factorized Input Features 📄
Yujin Kim, Chanhun Jeong, Yunho Oh, Myung Kuk Yoon, and Gunjae Koo
IEEE International Conference on Computer Design (ICCD 2025), Dallas, USA, Nov. 10 - 12, 2025
WINS: Winograd Structured Pruning for Fast Winograd Convolution 📄
Cheonjun Park, Hyun Jae Oh, Mincheol Park, Hyunchan Moon, Minsik Kim, Suhyun Kim, Myung Kuk Yoon, and Won Woo Ro
The IEEE/CVF International Conference on Computer Vision (ICCV), Honolulu, Hawaii, USA, Oct. 19 - 23, 2025
Understanding Distributed Training of Large Language Models with Unified Virtual Memory 📄
Jane Rhee, Eunbi Jeong, Jiwon Lee, and Myung Kuk Yoon
IEEE International Symposium on Workload Characterization (IISWC 2025), Irvine, USA, Oct. 12 - 14, 2025
HALO: Hybrid Systolic Arrays via Logical Partitioning for Acceleration of Complex-Valued Neural Networks 📄
Ji Yeong Yi^†, Eunbi Jeong^†, SungHee Yum, Jane Rhee, Sangun Choi, Gunjae Koo, Yunho Oh^*, and Myung Kuk Yoon^*
IEEE International Symposium on Workload Characterization (IISWC 2025), Irvine, USA, Oct. 12 - 14, 2025
Energy-Efficient Systolic Array for Complex-Valued Convolutional Neural Networks
Ji Yeong Yi, Jane Rhee, and Myung Kuk Yoon
The 40th International Technical Conference on Circuits/Systems, Computers, and Communications (ITC-CSCC 2025), Seoul, Korea, July 07 - 10, 2025
SSFFT: Energy-Efficient Selective Scaling for Fast Fourier Transform in Embedded GPUs 📄
Dongwon Yang, Jaebeom Jeon, Minseong Gil, Junsu Kim, Seondeok Kim, Gunjae Koo, Myung Kuk Yoon, and Yunho Oh
The 26th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES 2025), Seoul, Korea, June 15 - 16, 2025
Avant-Garde: Empowering GPUs with Scaled Numeric Formats 📄
Minseong Gil, Dongho Ha, Simla Burcu Harma, Myung Kuk Yoon, Babak Falsafi, Won Woo Ro, and Yunho Oh
The 52nd IEEE/ACM International Symposium on Computer Architecture (ISCA 2025), Tokyo, Japan, June 21 - 25, 2025
Hierarchical Traversal Stack Design Using Shared Memory for GPU Ray Tracing (Best Paper Nominee) 📄
Eunsoo Jung, Eunbi Jeong, Gunjae Koo, Yunho Oh^*, and Myung Kuk Yoon^*
2025 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2025), Ghent, Belgium, May 11 - 13, 2025
HyMM: A Hybrid Sparse-Dense Matrix Multiplication Accelerator for GCNs 📄
Hunjong Lee, Jihun Lee, Jaewon Seo, Yunho Oh, Myung Kuk Yoon, and Gunjae Koo
Design, Automation and Test in Europe Conference (DATE 2025), Lyon, France, Mar. 31 - April 2, 2025
Warped-Compaction: Maximizing GPU Register File Bandwidth Utilization via Operand Compaction 📄
Eunbi Jeong, Ipoom Jeong^*, Myung Kuk Yoon^*, and Nam Sung Kim
The 31st International IEEE Symposium on High Performance Computer Architecture (HPCA 2025), Las Vegas, United States, Mar. 1 - 5, 2025
Marching Page Walks: Batching and Concurrent Page Table Walks for Enhancing GPU Throughput 📄
Jiwon Lee, Gun Ko, Myung Kuk Yoon, Ipoom Jeong, Yunho Oh, and Won Woo Ro
The 31st International IEEE Symposium on High Performance Computer Architecture (HPCA 2025), Las Vegas, United States, Mar. 1 - 5, 2025
DEPrune: Depth-wise Separable Convolution Pruning for Maximizing GPU Parallelism 📄
Cheonjun Park, Mincheol Park, Hyunchan Moon, Myung Kuk Yoon, Seokjin Go, Suhyun Kim, and Won Woo Ro
The 38th Annual Conference on Neural Information Processing Systems (NeurIPS 2024), Vancouver, Canada, Dec. 9 - 15, 2024
Performance Comparison of CNN Pruning Techniques Using NanoSAM Model on Jetson Orin Nano
Jaeeun Hwang, Seonwoo Kim, and Myung Kuk Yoon
2024 Autumn Annual Conference of IEIE, Jeongseon, Gangwon, Korea, November 22 - 23, 2024
VitBit: Enhancing Embedded GPU Performance for AI Workloads through Register Operand Packing 📄
Jaebeom Jeon, Minseong Gil, Junsu Kim, Jaeyong Park, Gunjae Koo, Myung Kuk Yoon^*, and Yunho Oh^*
The 53rd International Conference on Parallel Processing (ICPP 2024), Gotland, Sweden, August 12 - 15, 2024
Twisted Bank Arbitrator for Balanced Register Bank Accesses on Graphics Processing Units
Eunbi Jeong, Ipoom Jeong, and Myung Kuk Yoon
2024 Summer Annual Conference of IEIE, Jeju, Korea, June 26 - 28, 2024
INTERPRET: Inter-Warp Register Reuse for GPU Tensor Core 📄
Jae Seok Kwak, Myung Kuk Yoon, Ipoom Jeong, Seunghyun Jin, and Won Woo Ro
The 32nd International Conference on Parallel Architectures and Compilation Techniques (PACT 2023), Vienna, Austria, Oct. 21 - 25, 2023
Warped-MC: An Efficient Memory Controller Scheme for Massively Parallel Processors (Best Paper Award) 📄
Jonghyun Jeong, Myung Kuk Yoon, Yunho Oh, and Gunjae Koo
The 52nd International Conference on Parallel Processing (ICPP 2023), Salt Lake City, Utah, USA, August 7 - 10, 2023
Preloading Architecture for Graphics Processing Unit (Excellent Paper Award)
Eun Seong Park, Eunbi Jeong, and Myung Kuk Yoon
2023 Summer Annual Conference of IEIE, Jeju, Korea, June 28 - 30, 2023
Reduced Precision Floating Point for Ray Tracing
Eun Soo Jung, Yeonhee Jung, and Myung Kuk Yoon
2023 Summer Annual Conference of IEIE, Jeju, Korea, June 28 - 30, 2023
Early-Adaptor: An Adaptive Framework For Proactive UVM Memory Management 📄
Seokjin Go, Hyunwuk Lee, Junsung Kim, Jiwon Lee, Myung Kuk Yoon, and Won Woo Ro
The 2023 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2023), Raleigh, NC, USA, Apr. 23 - 25, 2023
Balanced Column-Wise Block Pruning for Maximizing GPU Parallelism 📄
Cheonjun Park, Mincheol Park, Hyun Jae Oh, Minkyu Kim, Myung Kuk Yoon, Suhyun Kim, and Won Woo Ro
The 37th AAAI Conference on Artificial Intelligence (AAAI-23), Washington DC, USA, Feb. 07 - 14, 2023
Reconstructing Out-of-Order Issue Queue 📄
Ipoom Jeong, Jiwon Lee, Myung Kuk Yoon, and Won Woo Ro
The 55th IEEE/ACM International Symposium on Microarchitecture (MICRO 2022), Chicago, Illinois, USA, Oct. 01 - 05, 2022
Compiler-Assisted GPU Register File Power Management Technique
Myung Kuk Yoon
2022 International Conference on Electronics, Information, and Communication (ICEIC 2022), Jeju, Korea, Feb. 06 - 09, 2022
Analyzing Characteristics of Memory Side-Channels in GPU
Seungho Jung, Myung Kuk Yoon, and Gunjae Koo
2021 Korea Software Congress (KSC 2021), Pyeongchang, Korea, Dec. 20 - 22, 2021
FineReg: Fine-Grained Register File Management for Augmenting GPU Throughput 📄
Yunho Oh, Myung Kuk Yoon, William J. Song, and Won Woo Ro
The 51st IEEE/ACM International Symposium on Microarchitecture (MICRO 2018), Fukuoka, Japan, Oct. 20 - 24, 2018
Optimizing Intersection and Reflection Step of Geometrical Optics using GPUs
Hyun Jin Chung, Myung Kuk Yoon, and Won Woo Ro
The 16th International Conference on Electronics, Information and Communication (ICEIC 2017), Phuket, Thailand, Jan. 11 - 14, 2017
Virtual Thread: Maximizing Thread-Level Parallelism beyond GPU Scheduling Limit 📄
Myung Kuk Yoon, Keunsoo Kim, Sangpil Lee, Won Woo Ro, and Murali Annavaram
The 43rd ACM/IEEE International Symposium on Computer Architecture (ISCA 2016), Seoul, Korea, Jun. 18 - 22, 2016
APRES: Improving Cache Efficiency by Exploiting Load Characteristics on GPUs 📄
Yunho Oh, Keunsoo Kim, Myung Kuk Yoon, Jong Hyun Park, Yongjun Park, Won Woo Ro, and Murali Annavaram
The 43rd ACM/IEEE International Symposium on Computer Architecture (ISCA 2016), Seoul, Korea, Jun. 18 - 22, 2016
Warped-Preexecution: A GPU Pre-execution Approach for Improving Latency Hiding 📄
Keunsoo Kim, Sangpil Lee, Myung Kuk Yoon, Gunjae Koo, Won Woo Ro, and Murali Annavaram
The 22nd International IEEE Symposium on High Performance Computer Architecture (HPCA 2016), Barcelona, Spain, Mar. 12 - 16, 2016
DRAW: Investigating Benefits of Adaptive Fetch Group Size on GPU 📄
Myung Kuk Yoon, Yunho Oh, Sangpil Lee, Seung Hun Kim, Deokho Kim, and Won Woo Ro
The 2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2015), Philadelphia, PA, USA, Mar. 29 - 31, 2015
Directory Centralized Ring-based Interconnection for Multi-Core Systems
Myung Kuk Yoon, Sangpil Lee, Deokho Kim, and Won Woo Ro
The 12th International Conference on Electronics, Information and Communication (ICEIC 2013), Bali, Indonesia, Jan. 30 - Feb. 2, 2013

Journal Papers

BiKD: Bidirectional Kernel Decomposition for Large-Scale GCNs on GPU 📄
Inje Kim, Jihun Lee, Jong Hyun Jeong, Geonwoo Choi, Myung Kuk Yoon, Yunho Oh, and Gunjae Koo
IEEE Access, Vol. 14, pp. 96754 - 96770, June 2026
Restructuring the Implicit GEMM Workflow for Complex-Valued Convolution 📄
Jaeeun Hwang, Eunbi Jeong, Jane Rhee, and Myung Kuk Yoon
IEEE Access, Vol. 14, pp. 61010 - 61024, April 2026
Communication-Optimized Tensor Parallelism for Efficient Multi-GPU Training of Complex-Valued CNNs
Seonwoo Kim, Jane Rhee, and Myung Kuk Yoon
Journal of the Institute of Electronics and Information Engineers (IEIE), Vol. 63, No. 4, pp. 53-66, April 2026
TM-Training: An Energy-Efficient Tiered Memory System for Deep Learning Training in NPUs 📄
Jaeyong Park, Sangun Choi, Jongmin Kim, Gunjae Koo, Myung Kuk Yoon^*, and Yunho Oh^*
ACM Transactions on Storage (TOS), Vol. 21, Issue 4, Article No. 32, pp. 1 - 26, November 2025
MOST: Memory Oversubscription-aware Scheduling for Tensor Migration on GPU Unified Storage 📄
Junsu Kim, Jaebeom Jeon, Jaeyong Park, Sangun Choi, Minseong Gil, Seokin Hong, Gunjae Koo, Myung Kuk Yoon, and Yunho Oh
IEEE Computer Architecture Letters (CAL), Vol. 24, Issue 2, pp. 213 - 216, July 2025
TLP Balancer: Predictive Thread Allocation for Multi-Tenant Inference in Embedded GPUs 📄
Minseong Gil, Jaebeom Jeon, Junsu Kim, Sangun Choi, Gunjae Koo, Myung Kuk Yoon^*, and Yunho Oh^*
IEEE Embedded Systems Letters (ESL), Vol. 17, Issue 3, pp. 180-183, June 2025
Beyond VABlock: Improving Transformer Workloads through Aggressive Prefetching 📄
Jane Rhee, Ikyoung Choi, Gunjae Koo, Yunho Oh^*, and Myung Kuk Yoon^*
Journal of Systems Architecture (JSA), Vol. 162, pp. 103389, May 2025
Adaptive Kernel Merge and Fusion for Multi-Tenant Inference in Embedded GPUs 📄
Jaebeom Jeon, Gunjae Koo, Myung Kuk Yoon^*, and Yunho Oh^*
IEEE Embedded Systems Letters (ESL), Vol. 16, Issue 4, pp. 421-424, December 2024
Advancements in GPUs for Maximizing AI Application Performance and Research Trends
Jane Rhee, Eunbi Jeong, and Myung Kuk Yoon
Communications of the Korean Institute of Information Scientists and Engineers, Vol. 42, Issue 9, pp. 8-13, Sep. 2024
Triple-A: Early Operand Collector Allocation for Maximizing GPU Register Bank Utilization 📄
Ipoom Jeong, Eunbi Jeong, Nam Sung Kim, and Myung Kuk Yoon
IEEE Embedded Systems Letters (ESL), Vol. 16, Issue 2, pp. 206-209, June 2024
Conflict-Aware Compiler for Hierarchical Register File on GPUs 📄
Eunbi Jeong, Eun Seong Park, Gunjae Koo, Yunho Oh^*, and Myung Kuk Yoon^*
Journal of Systems Architecture (JSA), Vol. 149, pp. 103099, Apr. 2024
SAVector: Vectored Systolic Arrays 📄
Sangun Choi, Seongjun Park, Jaeyong Park, Jongmin Kim, Gunjae Koo, Seokin Hong, Myung Kuk Yoon^*, and Yunho Oh^*
IEEE Access, Vol. 12, pp. 44446 - 44461, March 2024
Performance Analysis of Neural Processing Units with Emerging Memory Technologies
Sangun Choi, Seongjun Park, Jaeyong Park, Seokin Hong, Myung Kuk Yoon, and Yunho Oh
Journal of the Institute of Electronics and Information Engineers (IEIE), Vol. 60, No. 7, pp. 30-39, July 2023
Fairness Analysis of Multi-Tenant Applications on Multi-Instance GPUs
Jane Rhee and Myung Kuk Yoon
Journal of the Institute of Electronics and Information Engineers (IEIE), Vol. 60, No. 4, pp. 11-23, Apr. 2023
CASH-RF: A Compiler-Assisted Hierarchical Register File in GPUs 📄
Yunho Oh, Ipoom Jeong, Won Woo Ro, and Myung Kuk Yoon
IEEE Embedded Systems Letters (ESL), Vol. 14, Issue 4, pp. 187-190, Dec. 2022
Analyzing GCN Aggregation on GPU 📄
Inje Kim, Jonghyun Jeong, Yunho Oh, Myung Kuk Yoon, and Gunjae Koo
IEEE Access, Vol. 10, pp. 113046 - 113060, Oct. 2022
GhostLeg: Selective Memory Coalescing for Secure GPU Architecture 📄
Jongmin Lee, Seungho Jung, Taewon Suh, Yunho Oh, Myung Kuk Yoon, and Gunjae Koo
IEEE Access, Vol. 10, pp. 111449 - 111462, Oct. 2022
TEA-RC: Thread Context-Aware Register Cache for GPUs 📄
Ipoom Jeong, Yunho Oh, Won Woo Ro, and Myung Kuk Yoon
IEEE Access, Vol. 10, pp. 82049 - 82062, Aug. 2022
REACT: Scalable and High-Performance Regular Expression Pattern Matching Accelerator for In-Storage Processing 📄
Won Seob Jeong, Changmin Lee, Keunsoo Kim, Myung Kuk Yoon, Won Jeon, Myoungsoo Jung, and Won Woo Ro
IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol. 31, Issue 5, pp. 1137-1151, May 2020
Adaptive Cooperation of Prefetching and Warp Scheduling on GPUs 📄
Yunho Oh, Keunsoo Kim, Myung Kuk Yoon, Jong Hyun Park, Yongjun Park, Murali Annavaram, and Won Woo Ro
IEEE Transactions on Computers (TC), Vol. 68, No. 4, pp. 609-616, Apr. 2019
WASP: Selective Data Prefetching with Monitoring Runtime Warp Progress on GPUs 📄
Yunho Oh, Myung Kuk Yoon, Jong Hyun Park, Yongjun Park, and Won Woo Ro
IEEE Transactions on Computers (TC), Vol. 67, No. 9, pp. 1366-1373, Sep. 2018
Dynamic Resizing on Active Warps Scheduler to Hide Operation Stalls on GPUs 📄
Myung Kuk Yoon, Yunho Oh, Sangpil Lee, Seung Hun Kim, Deokho Kim, and Won Woo Ro
IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol. 28, No. 11, pp. 3142-3156, Nov. 2017
Introduction to Researches on Performance Bottlenecks of Many-Core GPU Architectures
Yunho Oh, Myung Kuk Yoon, Jong Hyun Park, and Won Woo Ro
Communications of KIISE, Vol. 32, No. 5, May 2014
A Distributed Signature Detection Method for Detecting Intrusions in Sensor Systems 📄
Ilkyu Kim, Doohwan Oh, Myung Kuk Yoon, Kyueun Yi, and Won Woo Ro
Sensors, Vol. 13, No. 4, pp. 3998-4016, Mar. 2013

Patents

Register Bank Arbitration Apparatus and Method for Balanced Register Bank Access
KR-10-2024-0094746, KR-10-2975839
Computing Device with Zero-Detection-Based Clock Gating and Operating Method Thereof
KR-10-2025-0102666, KR-10-2961283
Method for Performing Instruction Operations in a Processor Using Compiler Data Dependency Information
KR-10-2022-0054408, KR-10-2919812
Compiler Register Allocation Method for Reducing Register Cache Contention in a Hierarchical Register File and Electronic Device Performing the Same
KR-10-2024-0065666, KR-10-2897522
Method and Apparatus for Scheduling Thread Blocks of a Graphics Processing Unit Considering Pruning Ratio
KR-10-2022-0057229, KR-10-2751159
Electronic Device for Efficiently Using a Register File by Pre-allocating Operand Collectors and Operating Method Thereof
KR-10-2023-0110116, KR-10-2713738
Method for Determining a Register Cache Index of a Processor and Electronic Device Performing the Same
KR-10-2022-0096238, KR-10-2663496
Method and Apparatus for Analyzing the Radio Propagation Environment of a Simulator in a Wireless Communication System
KR-10-2019-0083497, KR-10-2477690
Storage System and Operating Method Thereof
KR-10-2017-0070960, KR-10-2276912, US10671307B2