About Me

I am an Integrated MS/Ph.D. student in Computer Science and Engineering at Sogang University, advised by Junsuk Choe.

My research centers on building trustworthy and efficient large language models through lightweight solutions — methods that require little to no additional training or architectural modifications.

Previously, I focused on the reliability and interpretability of Large Vision-Language Models (LVLMs), tackling problems such as object hallucination and cross-image information leakage. I am now extending this direction toward efficient inference in LLMs, with the goal of improving efficiency without compromising reliability.

Research Interests

Trustworthy ML, Efficient Inference, Multimodal Learning, Explainable AI

News

  • Jan 2026: One paper accepted at ICLR 2026.
  • Sep 2025: Started as AI Research Intern at NAVER AI Lab.
  • Jan 2025: Research visit at Tübingen AI Center.
  • Dec 2024: One paper accepted at AAAI 2025.

Publications

* Equal contribution

Enhancing Multi-Image Understanding through Delimiter Token Scaling
ICLR 2026
Minyoung Lee, Yeji Park, Dongjun Hwang, Yejin Kim, Seong Joon Oh, Junsuk Choe
TL;DR We identify the role of delimiter tokens in multi-image understanding and reveal that insufficient delimiter separation causes cross-image information leakage in LVLMs. Based on this analysis, we propose a training-free delimiter token scaling method that enhances image distinction and significantly improves multi-image reasoning performance.
Mitigating Cross-Image Information Leakage in Multi-Image Understanding with Large Vision-Language Models
Preprint
Yeji Park, Minyoung Lee, Sanghyuk Chun, Junsuk Choe
TL;DR We identify cross-image information leakage — visual cues from different images interfering with one another in LVLMs — and propose FOCUS, a training-free, architecture-agnostic decoding strategy that isolates each image via noise-guided masking and contrastive refinement. FOCUS improves performance across four multi-image benchmarks.
ConVis: Contrastive Decoding with Hallucination Visualization for Mitigating Hallucinations in Multimodal Large Language Models
AAAI 2025
Yeji Park*, Deokyeong Lee*, Junsuk Choe, Buru Chang
TL;DR ConVis mitigates hallucinations in multimodal LLMs via contrastive decoding with hallucination visualization — using reconstructed images during decoding to penalize inaccurate responses. The method is training-free, requiring no additional data or model updates, and improves accuracy and reliability significantly.

Education

Sogang University, Seoul, South Korea
Integrated MS/Ph.D. in Computer Science and Engineering
Advisor: Junsuk Choe
Sogang University, Seoul, South Korea
B.S. in Mathematics & B.A. in Economics (Double Major)
Graduated cum laude

Experience

NAVER AI Lab, Seongnam, South Korea
AI Research Intern
Tübingen AI Center, Tübingen, Germany
Visiting Researcher
Hosted by Prof. Dr. Seong Joon Oh

Teaching Assistant

Sogang University, Seoul, South Korea
Basic Machine Learning
Sogang University, Seoul, South Korea
Hacking and Information Security

Awards & Honors

Research Scholarship — Hyundai Chung Mong-Koo Foundation
Fully funded: tuition and stipend