Seongsu Ha
I am a deep learning researcher interested in computer vision and machine learning.
Specifically, my research focuses on improving the quality of multi-modal representations and their interactions for various downstream applications,
such as video understanding, video corpus moment retrieval, video scene boundary segmentation, and visual grounding.
Previously, I received my Master's degree in Data Science from Seoul National University, where I was a member of the Visual Information Processing Lab,
advised by Prof. Joonseok Lee. I also received my Bachelor's degree in Computer Science, Engineering, from the University of Illinois at Urbana-Champaign.
Email / LinkedIn
News
- 08/2024: TWLV-I, analysis and insights from a holistic evaluation of video foundation models, released! Tech Report
- 07/2024: Paper on referring image segmentation accepted at ECCV 2024
- 07/2024: Paper on video frame sampling accepted at BMVC 2024
- 03/2024: Pegasus-1, a new SOTA video-to-text generative model, released! Tech Report
- 03/2024: Marengo-2.6, a new SOTA video foundation model for any-to-any search, released! Tech Blog
- 01/2024: Paper on video moment localization accepted at AISTATS 2024
- 09/2023: Started working at Twelve Labs as a research scientist
- 06/2023: Started working at Twelve Labs as a research intern
- 05/2023: Paper on talking head generation accepted at the Sight and Sound workshop at CVPR 2023
- 06/2022: Paper on scene boundary segmentation accepted at ACCV 2022
- 01/2022: Started working at KakaoBrain as a research intern
- 03/2021: Started MS in Data Science at the Seoul National University Graduate School of Data Science
Publications
TWLV-I: Analysis and Insights from Holistic Evaluation on Video Foundation Models
Twelve Labs
Technical Report, arXiv, 08/2024
paper
Pegasus-1: A New SOTA Video-to-Text Generative Model
Twelve Labs
Technical Report, arXiv, 04/2024
paper
Marengo-2.6: A New SOTA Video Foundation Model for Any-to-Any Search
Twelve Labs
Technical Blog, 03/2024
blog
Finding NeMo: Negative-mined Mosaic Augmentation for Referring Image Segmentation
Seongsu Ha*, Chaeyun Kim*, Donghwa Kim*, Junho Lee, Sangho Lee, Joonseok Lee
ECCV, 2024
paper
Scalable Frame Sampling for Video Classification: A Semi-Optimal Policy Approach with Reduced Search Space
Junho Lee*, Jeongwoo Shin, Seung Woo Ko, Seongsu Ha, Joonseok Lee
BMVC, 2024
paper
Towards a Complete Benchmark on Video Moment Localization
Jinyeong Chae*, Donghwa Kim*, Kwanseok Kim, Doyeon Lee, Sangho Lee, Seongsu Ha, Jonghwan Mun, Woo-Young Kang, Byungseok Roh, Joonseok Lee
AISTATS, 2024
paper
Disentangled Audio-Driven NeRF: Talking Head Generation with Detailed Identity-Specific Micro-expressions
Seoyoung Lee*, Seongsu Ha*, Joonseok Lee
CVPRW, 2023
paper
Boundary-aware Self-supervised Learning for Video Scene Segmentation
Jonghwan Mun*, Minchul Shin*, Gunsoo Han, Sangho Lee, Seongsu Ha, Joonseok Lee, Eun-Sol Kim
ACCV, 2022
paper