IVYLab & IVLLab
  • IVYLab & IVLLab
  • LLM Multimodal Highlights
  • People
    • Professor
    • Members
    • Research Collaborators
    • Alumni
  • Research
    • Lab Overview
    • Research Fields
    • Research Demo
  • Publications
    • International Conference
    • International Journal
    • International Standards
    • Patents
    • Domestic Papers
  • Gallery
  • Board
  • Contact
  • Database

Vision + LLM Workshop

 News

  • 2025-05-14 [ICML 2025] Long-Form Speech Generation with Spoken Language Models (by Se Jin Park) has been accepted as a spotlight poster (2.6%) at ICML 2025.

  • 2025-04-18 [Fall 2025 Lab Recruitment] We invite talented students to research MLLMs (multimodal large language models) across vision, audio, and language.

  • 2025-03-12 [Recruited by DeepMind] Dr. Minsu Kim and Dr. Joanna Hong have been recruited by DeepMind.

  • 2025-02-27 [CVPR 2025] SALOVA: Segment-Augmented Long Video Assistance for Targeted Retrieval and Routing in Long-Form Video Analysis (by Junho Kim, Hyunjun Kim) is accepted in CVPR 2025.

  • 2025-02-27 [CVPR 2025] VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models (by Byung-Kwan Lee) is accepted in CVPR 2025.

  • 2024-12-24 [IEEE TCSVT]  MSCoTDet: Language-driven Multi-modal Fusion for Improved Multispectral Pedestrian Detection (by  Taeheon Kim, Sangyun Chung) is accepted in IEEE Transactions on Circuits and Systems for Video Technology.

  • 2024-12-10 [AAAI 2025]  Personalized Lip Reading: Adapting to Your Unique Lip Movements with Vision and Language (by  Jeong Hun Yeo) is accepted in AAAI 2025.

  • 2024-10-18 [IEEE TPAMI]  Prompt Tuning of Deep Neural Networks for Speaker-adaptive Visual Speech Recognition (by  Minsu Kim) is accepted in IEEE Transactions on Pattern Analysis and Machine Intelligence.

  • 2024-10-15 [NVIDIA Internship]  Byung Kwan Lee will join NVIDIA for a research internship.

  • 2024-10-09 [IEEE TNNLS]  Advancing Causal Intervention in Image Captioning with Causal Prompt (by  Youngjoon Yu) is accepted in IEEE Transactions on Neural Networks and Learning Systems.

  • 2024-09-27 [Remaining Lab Openings for Spring 2025] Openings available: 2 government-funded MS positions, plus KAIST scholarship program (KEPSI, EPSS, LGenius, EPSD) positions.

  • 2024-09-26 [NeurIPS 2024]  Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models (by Byung-Kwan Lee) is accepted at  NeurIPS 2024.

  • 2024-09-26 [NeurIPS 2024]  CODE: Contrasting Self-generated Description to Combat Hallucination in Large Multi-modal Models (by Junho Kim, Hyunjun Kim) is accepted at NeurIPS 2024.

  • 2024-09-21 [EMNLP 2024]  From CollaVo (ACL 24) to MoAI (ECCV 24), Now TroL: Advancing Large Language and Vision Model (by Byung-Kwan Lee) is accepted at  EMNLP 2024.

  • 2024-09-21 [EMNLP 2024]  Where Visual Speech Meets Language: VSP-LLM (by Jeong Hun Yeo,  Seunghee Han) is accepted at the Findings of EMNLP 2024.

  • 2024-09-21 [EMNLP 2024]  What if...?: Thinking Counterfactual Keywords Helps to Mitigate Hallucination in Large Multi-modal Models (by  Junho Kim) is accepted at the Findings of EMNLP 2024.

  • 2024-08-19 [Outstanding Paper Award in ACL 2024]  Se Jin Park and Chae Won Kim have won the Outstanding Paper Award at the ACL (Association for Computational Linguistics) 2024 conference.

  • 2024-08-03 [IEEE TASLP]  Textless Unit-to-Unit training for Many-to-Many Multilingual Speech-to-Speech Translation (by  Minsu Kim) is accepted in IEEE Trans. on Audio, Speech, and Language Processing.

  • 2024-07-17 [ACM MM 2024]  Efficient Training for Multilingual Visual Speech Recognition (by Minsu Kim, Jeonghun Yeo) is accepted in ACM MM 2024.

  • 2024-07-03 [ECCV 2024]  MoAI: Mixture of All Intelligence for Large Language and Vision Models (by Byung-Kwan Lee) is accepted in ECCV 2024.

  • 2024-07-03 [Pattern Recognition]  Text-Guided Distillation Learning to Diversify Video Embeddings (by Sangmin Lee) is accepted in Pattern Recognition.

  • 2024-07-03 [ICIP 2024]  Environmental Context Understanding (by Hyunjun Kim) is accepted in ICIP 2024.

  • 2024-07-03 [ICIP 2024]  A Language-Driven Approach for Cross-modal Alignment Fusion (by Taeheon Kim, Sangyun Chung, Youngjoon Yu) is accepted in ICIP 2024 workshop.

  • 2024-06-26 [Lab Openings for Fall 2024 Admitted Students] Openings available: 2 government-funded MS positions, 1 KAIST MS position, industry-sponsored scholarship students, and more.

  • 2024-05-19 [Recent Ph.D. Graduate: Postdoc] Minsu Kim, a 2024 Ph.D. graduate, has joined Meta as a postdoc in AI research.

  • 2024-05-19 [Amazon, Google Internships]  Sungjune and Se Jin will join Amazon and Google for research internships, respectively. 

  • 2024-05-16 [ACL 2024]  CoLLaVO: Crayon Large Language and Vision mOdel (Byung-Kwan Lee) is accepted in Findings of the Association for Computational Linguistics, ACL 2024.

  • 2024-05-16 [ACL 2024] Let's Go Real Talk: Spoken Dialogue Model for Face-to-Face Conversation (Se Jin Park, Chae Won Kim) is accepted in Proceedings of the Annual Meeting of the Association for Computational Linguistics, ACL 2024.

  • 2024-04-26 [Pattern Recognition]  Robust Pedestrian Detection via Constructing Versatile Pedestrian Knowledge Bank (by Sungjune Park, Hyunjun Kim) is accepted in Pattern Recognition.

  • 2024-03-26 [IEEE TCSVT]  Integrating Language-Derived Appearance Elements with Visual Cues in Pedestrian Detection (by Sungjune Park, Hyunjun Kim) is accepted in IEEE Trans. on CSVT.

  • 2024-03-12 [Graduate Student Recruitment for Fall 2024] Recruiting 2 government-funded MS students, 1 KAIST Ph.D. student, industry-sponsored scholarship students, and more. Interested students should email ymro@kaist.ac.kr.

  • 2024-02-27 [CVPR 2024]  Causal Mode Multiplexer: A Novel Framework for Unbiased Data (by Taeheon Kim) is accepted in CVPR 2024.

  • 2024-02-27 [CVPR 2024]  AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation (by Se Jin Park, Minsu Kim) is accepted in CVPR 2024.

  • 2024-02-27 [IEEE TMM]  AKVSR: Compressing Audio Knowledge of a Pretrained Model (by Jeong Hun Yeo) is accepted in IEEE Trans. on Multimedia.

  • 2024-02-22 Recruitment for PhD and MS Students.

  • 2024-02-21 Prof. Yong Man Ro Named ICT Endowed Chair Professor at KAIST.


