International Conference

2023

[#348] Intuitive Multilingual Audio-Visual Speech Recognition with a Single-Trained Model

Joanna Hong, Se Jin Park, and Yong Man Ro

EMNLP 2023

[#347] Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge

Minsu Kim*, Jeong Hun Yeo*, Jeongsoo Choi, and Yong Man Ro (* equally contributed)

ICCV 2023

[#346] Mitigating Adversarial Vulnerability through Causal Parameter Estimation by Adversarial Double Machine Learning

Byung-Kwan Lee*, Junho Kim*, and Yong Man Ro (* equally contributed)

ICCV 2023

[#345] DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding

Jeongsoo Choi*, Joanna Hong*, and Yong Man Ro (* equally contributed)

ICCV 2023

[#344] Mitigating Dataset Bias in Image Captioning through CLIP Confounder-free Captioning Network

YeonJu Kim, Junho Kim, Byung-Kwan Lee, Sebin Shin, and Yong Man Ro

ICIP 2023

[#343] Robust Multispectral Pedestrian Detection via Spectral Position-free Feature Mapping

Sungjune Park, Jung Uk Kim, Jin Mo Song, and Yong Man Ro

ICIP 2023

[#342] Intelligible Lip-to-speech Synthesis with Speech Units

Jeongsoo Choi, Minsu Kim, and Yong Man Ro

INTERSPEECH 2023

[#341] Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring

Joanna Hong*, Minsu Kim*, Jeongsoo Choi, and Yong Man Ro (* equally contributed)

CVPR 2023

[#340] Demystifying Causal Features on Adversarial Examples and Causal Inoculation for Robust Network by Adversarial Instrumental Variable Regression

Byung-Kwan Lee*, Junho Kim*, and Yong Man Ro (* equally contributed)

CVPR 2023

[#339] Lip-to-speech Synthesis in the Wild with Multi-task Learning

Minsu Kim*, Joanna Hong*, and Yong Man Ro (* equally contributed)

ICASSP 2023

[#338] Similarity Relation Preserving Cross-Modal Learning For Multispectral Pedestrian Detection Against Adversarial Attacks

Jung Uk Kim and Yong Man Ro

ICASSP 2023

[#337] Multi-Temporal Lip-Audio Memory for Visual Speech Recognition

Jeong Hun Yeo, Minsu Kim, and Yong Man Ro

ICASSP 2023

[#336] Deep Visual Forced Alignment: Learning to Align Transcription with Talking Face Video

Minsu Kim, Chae Won Kim, and Yong Man Ro

AAAI 2023

[#335] Multispectral Invisible Coating: Laminated Visible-Thermal Physical Attack against Multispectral Object Detectors using Transparent Low-e films

Taeheon Kim, Youngjoon Yu, and Yong Man Ro

AAAI 2023