Notice

[#202]   2019-12-13   [Pattern Recognition]   Robust to Unseen Modes of Variation (by Wissam) is accepted in Pattern Recognition

Title: Encoding Features Robust to Unseen Modes of Variation with Attentive Long Short-Term Memory

Authors: Wissam J. Baddar and Yong Man Ro


Abstract: Long short-term memory (LSTM) is a type of recurrent neural network that is efficient for encoding spatio-temporal features in dynamic sequences. Recent work has shown that the LSTM retains information related to the mode of variation in the input sequence, which reduces the discriminability of the encoded features. To encode features robust to unseen modes of variation, we devise an LSTM adaptation named attentive mode variational LSTM. The proposed attentive mode variational LSTM utilizes the concept of attention to separate the input sequence into two parts: (1) task-relevant dynamic sequence features and (2) task-irrelevant static sequence features. The task-relevant features are used to encode and emphasize the dynamics in the input sequence. The task-irrelevant static sequence features are utilized to encode the mode of variation in the input sequence. Finally, the attentive mode variational LSTM suppresses the effect of mode variation with a shared output gate and results in a spatio-temporal feature robust to unseen variations. The effectiveness of the proposed attentive mode variational LSTM is verified on two tasks: facial expression recognition and human action recognition. Comprehensive and extensive experiments have verified that the proposed method encodes spatio-temporal features robust to variations unseen during training.
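
For intuition, the following is a minimal PyTorch sketch of the idea: attention softly splits each frame feature into a dynamic and a static stream, and both are read out through one shared output gate. The shapes, update rules, and the suppression step are illustrative assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn

    class AttentiveModeVariationalLSTMCell(nn.Module):
        # Sketch: attention softly splits each frame feature into a
        # task-relevant (dynamic) part and a task-irrelevant (static) part;
        # both streams are read out through one shared output gate.
        def __init__(self, input_dim, hidden_dim):
            super().__init__()
            self.attn = nn.Sequential(
                nn.Linear(input_dim + hidden_dim, input_dim), nn.Sigmoid())
            self.dyn_cell = nn.LSTMCell(input_dim, hidden_dim)  # encodes dynamics
            self.sta_cell = nn.LSTMCell(input_dim, hidden_dim)  # encodes mode of variation
            self.out_gate = nn.Sequential(
                nn.Linear(input_dim + hidden_dim, hidden_dim), nn.Sigmoid())

        def forward(self, x, state):
            (h_d, c_d), (h_s, c_s) = state
            a = self.attn(torch.cat([x, h_d], dim=-1))          # per-feature weights in [0, 1]
            h_d, c_d = self.dyn_cell(a * x, (h_d, c_d))         # task-relevant dynamics
            h_s, c_s = self.sta_cell((1 - a) * x, (h_s, c_s))   # task-irrelevant statics
            o = self.out_gate(torch.cat([x, h_d], dim=-1))      # shared output gate
            # One illustrative way to suppress the static (mode) component;
            # the paper's exact formulation differs.
            h = o * torch.tanh(c_d - c_s)
            return h, ((h_d, c_d), (h_s, c_s))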

[#201]   2019-12-06   [IEIE 2019]   Jeonghyo Kim received the student excellent paper award of IEIE 2019

The title of the paper is "Weather Condition Robust Infrared Image Enhancement via Domain Transfer without Training Dataset Pair (기상 상황에 강인한 적외선 영상 개선을 위한 학습 쌍이 필요 없는 도메인 변환)".

The authors of the paper are Jeonghyo Kim and Yong Man Ro.

[#200]   2019-12-02   Dr. Seong Tae Kim (TUM) gives an invited talk on interpretable deep learning on Dec. 3

Title: Interpretable deep learning: What happens inside deep neural networks?


Abstract: Recently, deep learning research has achieved superior performance in a variety of applications. Despite the successes, current deep learning approaches have their limitations and challenges. The lack of interpretability (the so-called ‘black-box model’) is the representative limitation of current deep learning studies. In other words, it is difficult for users to understand how deep networks make a particular decision. In safety-critical tasks (e.g., medical image analysis, autonomous vehicles, and biometrics), it is very important to interpret the predictions of deep networks because incorrect predictions could lead to dangerous consequences. Therefore, it is necessary to improve the transparency of deep networks to ensure the trustworthiness of their behavior. For this purpose, a few research efforts have been devoted to increasing the interpretability of deep neural networks in the machine learning and computer vision communities. In this talk, Dr. Seong Tae Kim will outline some possible research directions for increasing the interpretability of deep networks in safety-critical applications.

[#199]   2019-11-20   [IEEE]   LSTM Encoded Appearance-Suppressed Dynamics (by Wissam) is accepted in IEEE Transactions on Affective Computing

Title: On-the-Fly Facial Expression Prediction using LSTM Encoded Appearance-Suppressed Dynamics

Authors: Wissam J. Baddar, Sangmin Lee, and Yong Man Ro


Abstract: Encoding facial expression dynamics is efficient for classifying and recognizing facial expressions. Most facial dynamics-based methods assume that a sequence is temporally segmented before prediction. This requires the prediction to wait until a full sequence is available, resulting in prediction delay. To reduce the prediction delay and enable prediction "on-the-fly" (as frames are fed to the system), we propose a new dynamics feature learning method that allows prediction with partial (incomplete) sequences. The proposed method utilizes the readiness of recurrent neural networks (RNNs) for on-the-fly prediction, and introduces novel learning constraints to induce early prediction with partial sequences. We further show that a delay in accurate prediction using RNNs could originate from the effect that the subject appearance has on the spatio-temporal features encoded by the RNN. We refer to that effect as "appearance bias". We propose the appearance-suppressed dynamics feature, which utilizes a static sequence to suppress the appearance bias. Experimental results have shown that the proposed method achieved higher recognition rates compared to state-of-the-art methods on publicly available datasets. The results also verified that the proposed method improved on-the-fly prediction at subtle expression frames early in the sequence, using partial sequence inputs.
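
As an illustration of the on-the-fly idea (not the paper's exact learning constraints), the sketch below applies the classification loss at every time step of an LSTM, so partial sequences already produce supervised predictions; all tensors and dimensions are placeholders.

    import torch
    import torch.nn as nn

    # The classification loss is applied at every time step rather than only
    # at the last frame, so the model is trained to predict correctly from
    # partial (incomplete) sequences. Dimensions and data are placeholders.
    rnn = nn.LSTM(input_size=256, hidden_size=128, batch_first=True)
    clf = nn.Linear(128, 7)                    # e.g., 7 expression classes
    frame_feats = torch.randn(4, 30, 256)      # (batch, frames, feature_dim)
    labels = torch.randint(0, 7, (4,))         # one label per sequence

    h, _ = rnn(frame_feats)                    # hidden state at every frame
    logits = clf(h)                            # (batch, frames, classes)
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, 7),                 # score every partial sequence
        labels.repeat_interleave(30))
    loss.backward()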

[#198]   2019-10-29   [IEEE]   MCSIP Net: Multi-Channel Satellite Image Prediction is accepted in IEEE Transactions on Geoscience and Remote Sensing

The research results on processing multiple domains of satellite data have been published in IEEE TGRS.

The paper's contribution is to fuse data from multiple domains and to predict one of the domains. New ways of fusing multi-domain data with spatial and temporal attention, and of incorporating prior knowledge of the domain data, are proposed and shown to be useful for satellite data prediction, which in turn is valuable for weather prediction. The results come from a research project carried out through the cooperation of many researchers.

The paper has been written by Jae-Hyeok Lee, Sangmin S. Lee, Hak Gu Kim, Sa-kwang Song, Seongchan Kim, and Yong Man Ro.

[#197]   2019-10-08   [IEEE TIP]   BMAN: Bidirectional Multi-scale Aggregation Networks (by Sangmin Lee) is accepted in IEEE Transactions on Image Processing

Title: BMAN: Bidirectional Multi-scale Aggregation Networks for Abnormal Event Detection

Authors: Sangmin Lee, Hak Gu Kim, and Yong Man Ro


Abstract: Abnormal event detection is an important task in video surveillance systems. In this paper, we propose novel bidirectional multi-scale aggregation networks (BMAN) for abnormal event detection. The proposed BMAN learns the spatio-temporal patterns of normal events in order to detect deviations from the learned normal patterns as abnormalities. The BMAN consists of two main parts: an inter-frame predictor and an appearance-motion joint detector. The inter-frame predictor is devised to encode normal patterns; it generates an inter-frame using attention-based bidirectional multi-scale aggregation. With the feature aggregation, robustness to object scale variations and complex motions is achieved in normal pattern encoding. Based on the encoded normal patterns, abnormal events are detected by the appearance-motion joint detector, in which both the appearance and motion characteristics of scenes are considered. Comprehensive experiments are performed, and the results show that the proposed method outperforms the existing state-of-the-art methods. The resulting abnormal event detection is visually interpretable, showing where the detected events occur. Further, we validate the effectiveness of the proposed network designs by conducting an ablation study and feature visualization.
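
The prediction-based detection principle can be illustrated as follows; the L2 errors and the weighting below are assumptions for the sketch, not the paper's joint detector.

    import torch

    def abnormality_score(pred_frame, true_frame, pred_flow, true_flow, alpha=0.5):
        # A frame predicted from learned normal patterns is compared with the
        # actual frame; a large appearance or motion error signals an abnormal
        # event. The L2 error and the weight alpha are assumptions here.
        app_err = torch.mean((pred_frame - true_frame) ** 2)  # appearance deviation
        mot_err = torch.mean((pred_flow - true_flow) ** 2)    # motion deviation
        return alpha * app_err + (1 - alpha) * mot_err        # higher = more abnormal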

[#196]   2019-09-30   [ICIP 2019]   Sangmin Lee and Kihyun Kim's paper is selected as a Best Paper Finalist in IEEE ICIP 2019

Sangmin Lee and Kihyun Kim's paper, entitled "Deep Objective Assessment Model Based on Spatio-Temporal Perception of 360-Degree Video for VR Sickness Prediction", is listed among the Best Paper Finalists of IEEE ICIP 2019.

Twenty papers were selected from among the 945 accepted papers; the finalists represent the top 2.1% of accepted papers.

The authors of the paper are Kihyun Kim, Sangmin Lee, Hak Gu Kim, Minho Park, and Yong Man Ro.

[#195]   2019-09-30   [KSPC 2019]   Eunsung Kim received the best paper award of KSPC 2019

Eunsung Kim received the best paper award at the 2019 Korea Signal Processing Conference (KSPC), held on Sep. 26-27, 2019.

The title of the paper is "Background Clutter Robust Anomaly Detection via Object Guide." The authors of the paper are Eun Sung Kim, Jung-Uk Kim, and Yong Man Ro.

[#194]   2019-09-17   Student Recruitment for Spring 2020

We are recruiting students for Spring 2020: PhD candidates (KAIST scholarship), MS candidates (government scholarship, KAIST scholarship), and industry-sponsored students (KEPSI, EPSS, LGenius).

(http://admission.kaist.ac.kr/graduate/)


Research areas for recruitment:

 - Deep learning

 - Machine learning in computer vision and image processing (2D, 3D, VR)

 - Vision-Language Deep learning

 - Image processing

 - Medical imaging

 - Deep learning Quality Assessment


Ongoing research projects:

 - Explainable (Interpretable) Deep learning

 - Deep learning algorithms in computer vision

 - Recognition/Emotion recognition

 - 3D/VR quality assessment with deep learning approach

 - Medical Image analysis with deep learning

 - Vision-Language multimodal learning


Recent lab research results - link (LINK)

Recent international conference presentations on deep learning by the lab's MS/PhD students - link (LINK)

Recent international journal publications by the lab's MS/PhD students - link (LINK)


Please refer to the links above.


Students interested in joining the lab are encouraged to email Prof. Yong Man Ro (ymro@kaist.ac.kr) and arrange a preliminary meeting.

[#193]   2019-09-05   [ICCVW 2019]   Building a Breast-Sentence Dataset: Its Usefulness for Computer-Aided Diagnosis (by Hyebin Lee) is accepted in ICCVW 2019

Title: Building a Breast-Sentence Dataset: Its Usefulness for Computer-Aided Diagnosis

Authors: Hyebin Lee, Seong Tae Kim, and Yong Man Ro


Abstract: In recent years, it has been verified that deep learning networks are able to process not only images but also time-series information. Since breast image analysis plays a large role in the diagnosis of breast cancer, there have been a large number of attempts to apply deep learning methods for accurate diagnosis. With the advance of deep learning approaches, the possibility of using medical reports (in natural language) has increased. However, there is no public medical report dataset associated with breast images. Instead, in the conventional public breast mammography datasets, the characteristics of breast cancer are annotated according to standardized terms (the Breast Imaging-Reporting and Data System, BI-RADS). In this study, a breast-sentence dataset is proposed, and its usefulness for computer-aided diagnosis (CAD) is investigated. Based on the conventional breast mammography datasets, we annotated natural-language sentences according to the standardized BI-RADS terms. In the experiments, we show three use cases that verify the usefulness of the breast-sentence dataset: 1) a CAD framework with radiologist input, 2) the use of the sentence dataset in training a CAD, and 3) visual pointing guided by sentences.
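
A toy illustration of the annotation idea (the authors' actual protocol may differ): standardized BI-RADS attributes can be turned into a natural-language sentence with a simple template.

    # Toy illustration only: standardized BI-RADS attributes are turned into
    # a descriptive sentence with a template; attribute names and wording
    # are assumptions.
    def birads_to_sentence(shape, margin, assessment):
        return (f"A mass with {shape} shape and {margin} margin is observed, "
                f"assessed as BI-RADS category {assessment}.")

    print(birads_to_sentence("irregular", "spiculated", 5))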

[#192]   2019-08-22   Hak Gu Kim has joined EPFL as a postdoctoral research associate

Dr. Hak Gu Kim (PhD, graduated Feb. 2019) has joined the Electrical Engineering Institute at EPFL (https://sti.epfl.ch/research/institutes/iel/). His research interests include 2-D/3-D/VR image and video processing, deep learning, virtual reality, computer vision, human perception, and machine learning. During his PhD at KAIST, he published 10 top-tier journal papers (mostly in IEEE Transactions) and 26 international conference papers. The accomplishments of Dr. Hak Gu Kim during his PhD are very outstanding. More information about him can be found at https://haku0331.wixsite.com/hakgukim/.

[#191]   2019-08-21   [iMIMIC 2019]   Multimodal Justification Using Visual Word Constraint Model (by Hyebin Lee) is accepted in iMIMIC 2019

Title: Generation of Multimodal Justification Using Visual Word Constraint Model for Explainable Computer-Aided Diagnosis

Authors: Hyebin Lee, Seong Tae Kim, and Yong Man Ro


Abstract: The ambiguity of the decision-making process has been pointed out as the main obstacle to practically applying deep learning-based methods, in spite of their outstanding performance. Interpretability can guarantee confidence in a deep learning system, and it is therefore particularly important in the medical field. In this study, a novel deep network is proposed that explains a diagnostic decision with a visual pointing map and a diagnostic sentence justifying the result simultaneously. To increase the accuracy of sentence generation, a visual word constraint model is devised for training the justification generator. To verify the proposed method, comparative experiments were conducted on the problem of diagnosing breast masses. Experimental results demonstrated that the proposed deep network can explain diagnoses more accurately with various textual justifications.

[#190]   2019-06-28   [IEEE]   Multi-Objective Based Spatio-Temporal Feature Representation Learning (by Dae Hoe Kim) is accepted in IEEE Transactions on Affective Computing

Title: Multi-Objective Based Spatio-Temporal Feature Representation Learning Robust to Expression Intensity Variations for Facial Expression Recognition

Authors: Dae Hoe Kim, Wissam J. Baddar, Jinhyeok Jang and Yong Man Ro


Abstract: Facial expression recognition (FER) is increasingly gaining importance in various emerging affective computing applications. In practice, achieving accurate FER is challenging due to the large amount of inter-personal variation, such as expression intensity variation. In this paper, we propose a new spatio-temporal feature representation learning method for FER that is robust to expression intensity variations. The proposed method utilizes representative expression-states (e.g., onset, apex and offset of expressions) which can be specified in facial sequences regardless of the expression intensity. The characteristics of facial expressions are encoded in two parts in this paper. In the first part, the spatial image characteristics of the representative expression-state frames are learned via a convolutional neural network. Five objective terms are proposed to improve the expression class separability of the spatial feature representation. In the second part, the temporal characteristics of the spatial feature representation from the first part are learned with a long short-term memory of the facial expression. Comprehensive experiments have been conducted on a deliberate expression dataset (MMI) and a spontaneous micro-expression dataset (CASME II). Experimental results showed that the proposed method achieved higher recognition rates on both datasets compared to the state-of-the-art methods.
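
The two-part pipeline can be sketched in miniature as below; the backbone, feature sizes, and class count are illustrative assumptions, not the paper's architecture.

    import torch
    import torch.nn as nn

    # Two-part pipeline in miniature: a CNN embeds each representative
    # expression-state frame, then an LSTM encodes the temporal
    # characteristics of those spatial features.
    cnn = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                        nn.AdaptiveAvgPool2d(1), nn.Flatten())  # frame -> 16-d feature
    lstm = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)
    clf = nn.Linear(32, 6)                        # assumed number of expression classes

    frames = torch.randn(2, 3, 3, 64, 64)         # (batch, states: onset/apex/offset, C, H, W)
    feats = cnn(frames.flatten(0, 1)).reshape(2, 3, 16)  # per-state spatial features
    _, (h_n, _) = lstm(feats)                     # temporal encoding over the states
    logits = clf(h_n[-1])                         # expression prediction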

[#189]   2019-06-05   [MICCAI 2019]   Realistic Mass Data Generation using Deep learning (by Hakmin Lee) is accepted in MICCAI 2019

Title: Realistic Breast Mass Generation through BIRADS Category

Authors: Hakmin Lee, Seong Tae Kim, Jae-Hyeok Lee and Yong Man Ro


Abstract: Generating realistic breast masses is a highly important task because large databases of annotated breast masses are scarce. In this study, a novel realistic breast mass generation framework using the characteristics of the breast mass (i.e., BIRADS category) has been devised. For that purpose, the visual-semantic BIRADS description characterizing breast masses is embedded into the deep network. The visual-semantic description is encoded together with image features and used to generate realistic masses according to the visual-semantic description. To verify the effectiveness of the proposed method, two public mammogram datasets were used. Qualitative and quantitative experimental results have shown that realistic breast masses can be generated according to the BIRADS category.

[#188]   2019-05-01   [ICIP 2019]   Attentive Layer Separation in Object Detection (by Jung Uk Kim) is accepted in IEEE ICIP2019

Title: Attentive Layer Separation for Object Classification and Object Localization in Object Detection


Authors: Jung Uk Kim and Yong Man Ro


Abstract: Object detection has become one of the major fields in computer vision. In object detection, object classification and object localization tasks are conducted. Previous deep learning-based object detection networks operate on feature maps generated by completely shared networks. However, object classification focuses on the most discriminative part of the object in the feature map, whereas object localization requires a feature map that is focused on the entire area of the object. In this paper, we propose a novel object detection network considering the difference between the two tasks. The proposed deep learning-based network mainly consists of two parts: 1) an attention network part, where task-specific attention maps are generated, and 2) a layer separation part, where the layers for estimating the two tasks are separated. Comprehensive experimental results on the PASCAL VOC dataset and the MS COCO dataset showed that the proposed object detection network outperformed the state-of-the-art methods.
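
A minimal sketch of the two-task split follows; all layer sizes and the attention form are assumptions for illustration, not the paper's network.

    import torch
    import torch.nn as nn

    # The two tasks share a feature map but get task-specific attention and
    # separated estimation layers.
    shared = torch.randn(1, 256, 7, 7)            # shared RoI feature map (assumed shape)
    attn_cls = nn.Sequential(nn.Conv2d(256, 1, 1), nn.Sigmoid())  # discriminative parts
    attn_loc = nn.Sequential(nn.Conv2d(256, 1, 1), nn.Sigmoid())  # whole object area
    head_cls = nn.Sequential(nn.Flatten(), nn.Linear(256 * 49, 21))  # class scores
    head_loc = nn.Sequential(nn.Flatten(), nn.Linear(256 * 49, 4))   # box regression

    cls_scores = head_cls(shared * attn_cls(shared))
    box_deltas = head_loc(shared * attn_loc(shared))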

[#187]   2019-05-01   [ICIP 2019]   Physiological Fusion Net (by Sangmin Lee) is accepted in IEEE ICIP2019

Title: Physiological Fusion Net: Quantifying Individual VR Sickness with Content Stimulus and Physiological Response


Authors: Sangmin Lee, Seongyeop Kim, Hak Gu Kim, Min Seob Kim, Seokho Yun, Bumseok Jeong, Yong Man Ro


Abstract: Quantifying VR sickness is demanded in the VR industry to address the viewing safety issue. In this paper, we develop a new method to quantify VR sickness. We propose a novel physiological fusion deep network which estimates individual VR sickness from the content stimulus and the physiological response. In the proposed framework, a content stimulus guider and a physiological response guider are devised to effectively represent features related to VR sickness. The deep stimulus feature from the content stimulus guider reflects the sickness tendency of the content, while the deep physiology feature from the physiological response guider reflects individual sickness characteristics. By combining those features, the VR sickness predictor quantifies individual SSQ scores. To evaluate the performance of the proposed method, we built a new dataset that consists of 360-degree videos with physiological signals and SSQ scores. Experimental results show that the proposed method achieved a meaningful correlation with human subjective scores.
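
The fusion step can be sketched as follows; the feature dimensions and regressor are assumptions, not the paper's predictor.

    import torch
    import torch.nn as nn

    # A deep stimulus feature (content sickness tendency) and a deep
    # physiology feature (individual characteristics) are concatenated and
    # regressed to an individual SSQ score.
    stim_feat = torch.randn(8, 64)   # from the content stimulus guider
    phys_feat = torch.randn(8, 64)   # from the physiological response guider
    predictor = nn.Sequential(nn.Linear(128, 32), nn.ReLU(), nn.Linear(32, 1))
    ssq = predictor(torch.cat([stim_feat, phys_feat], dim=-1))  # per-subject SSQ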

[#186]   2019-05-01   [ICIP 2019]   Generative Guiding Block (by Minho Park) is accepted in IEEE ICIP2019

Title: Generative Guiding Block: Synthesizing Realistic Looking Variants Capable Of Even Large Change Demands


Authors: Minho Park, Hak Gu Kim, and Yong Man Ro


Abstract: Realistic image synthesis is to generate an image that is perceptually indistinguishable from an actual image. Generating realistic-looking images with large variations (e.g., large spatial deformations and large pose changes), however, is very challenging. Handling large variations as well as preserving appearance needs to be taken into account in realistic-looking image generation. In this paper, we propose a novel realistic-looking image synthesis method, especially for large change demands. To do that, we devise generative guiding blocks. The proposed generative guiding block includes a realistic appearance preserving discriminator and a naturalistic variation transforming discriminator. By incorporating the proposed generative guiding blocks into the generative model, the latent features at each layer of the generative model are enhanced to synthesize images that are both realistic looking and faithful to the target variation. With qualitative and quantitative evaluations in experiments, we demonstrated the effectiveness of the proposed generative guiding blocks compared to the state-of-the-art methods.
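
The two-critic guidance can be sketched as below; the networks, sizes, and loss form are assumptions for illustration, and real training would alternate generator and discriminator updates.

    import torch
    import torch.nn as nn

    # The generator's output receives feedback from two critics: one judging
    # realistic appearance and one judging naturalistic variation.
    z = torch.randn(4, 64)
    G = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 3 * 32 * 32))
    D_app = nn.Linear(3 * 32 * 32, 1)   # realistic appearance preserving critic
    D_var = nn.Linear(3 * 32 * 32, 1)   # naturalistic variation transforming critic

    fake = G(z)
    g_loss = -(D_app(fake).mean() + D_var(fake).mean())  # guidance from both critics
    g_loss.backward()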

[#185]   2019-05-01   [ICIP 2019]   Probenet: Probing Deep Networks (by Jae-Hyeok Lee) is accepted in IEEE ICIP2019

Title: Probenet: Probing Deep Networks


Authors: Jae-Hyeok Lee, Seong Tae Kim, and Yong Man Ro


Abstract: Despite the rapid progress of deep learning research in recent years, interpreting deep networks is still quite challenging. Interpreting deep networks is essential to both end-users and developers since it gives confidence in the usage of the deep network. This paper deals with a method for interpreting deep networks, especially visual interpretation. In order to obtain a visual interpretation of a target deep network, we propose ProbeNet, which provides a decomposed visual interpretation of the target deep network. ProbeNet decomposes the feature representations at a probed point of the target deep network into human-interpretable units. Furthermore, ProbeNet provides a kernel-level analysis of the target deep network. In experiments, the visual interpretation of two different target deep networks showed the usefulness of ProbeNet for interpreting target deep networks.
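
The probing setup can be illustrated as below; the hook-based capture and the small probe are assumptions for the sketch, and the actual ProbeNet decomposition is more involved.

    import torch
    import torch.nn as nn

    # Intermediate activations of a frozen target network are captured with
    # a forward hook and mapped by a small probe to a set of assumed
    # human-interpretable units.
    target = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                           nn.Conv2d(8, 16, 3, padding=1))
    probe = nn.Conv2d(8, 5, 1)          # 5 assumed interpretable units

    acts = {}
    target[1].register_forward_hook(lambda m, i, o: acts.update(feat=o))
    _ = target(torch.randn(1, 3, 32, 32))
    unit_maps = probe(acts["feat"])     # one spatial map per interpretable unit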

[#184]   2019-05-01   [ICIP 2019]   Deep Objective Assessment Model (by Sangmin and Kihyun ) is accepted in IEEE ICIP2019

Title: Deep Objective Assessment Model Based On Spatio-Temporal Perception Of 360-Degree Video For VR Sickness Prediction


Authors: Kihyun Kim*, Sangmin Lee*, Hak Gu Kim, Minho Park, and Yong Man Ro (*equally contributing first authors)


Abstract: In virtual reality (VR) environments, viewing safety is one of the increasing concerns because of the physical symptoms induced by VR sickness. Distortion of VR video is one of the main causes. In this paper, we investigate the degradation of spatial resolution as a distortion causing VR sickness. We propose a novel deep learning-based VR sickness assessment framework for predicting VR sickness caused by the degradation of spatial resolution. The proposed method takes into account the visual perception of 360-degree videos in the spatio-temporal domain for assessing VR sickness. By incorporating visual quality and temporal flickering with the deep latent feature in the training stage, the proposed network can effectively learn the spatio-temporal characteristics causing VR sickness. To evaluate the performance of the proposed method, we built a new dataset consisting of 360-degree videos and ground truths (physiological signals and SSQ scores). The dataset will be made publicly available. Experimental results demonstrated that the proposed VR sickness assessment has a high correlation with human subjective scores.

[#183]   2019-04-13   [Medical Physics]   TVUS Segmentation using key-point discriminator deep network (by Hong Joo Lee and Hyenok Park) is accepted in Medical Physics

A new segmentation deep network has been accepted as a regular paper in Medical Physics.

 

The title is "Endometrium Segmentation on TVUS Image Using Key-point Discriminator". The paper's contribution is a new key-point discriminator network for robust segmentation of unclear medical objects such as the endometrium in TVUS images. The endometrium in a TVUS image has an unclear boundary and a very heterogeneous texture pattern, so it is very challenging to segment. The new segmentation method with the proposed key-point discriminator solves this problem, and it is very useful for measuring and diagnosing unclear medical objects under tough imaging conditions.

This paper has been written by Hong Joo Lee, Hyenok Park, Hak Gu Kim, and Yong Man Ro at KAIST; Dongkuk Shin at Samsung Medison; Sa Ra Lee at Ewha Womans University School of Medicine; Sung Hoon Kim at Asan Medical Center; and Mikyung Kong at Yonsei University College of Medicine. Hong Joo and Hyenok are co-first authors who contributed equally.

[#182]   2019-03-14   2019 Spring Deep learning fundamental workshop in IVY Lab

[#181]   2019-03-06   Student Recruitment for Fall 2019

We are recruiting students for Fall 2019: PhD candidates (KAIST scholarship), MS candidates (government scholarship, KAIST scholarship), and industry-sponsored students (KEPSI, EPSS, LGenius).

(http://admission.kaist.ac.kr/graduate/)



Research areas for recruitment:

 - Deep learning

 - Machine learning in computer vision and image processing (2D, 3D, VR)

 - Vision-Language Deep learning

 - Image processing

 - Medical imaging

 - Deep learning Quality Assessment



Ongoing research projects:

 - Explainable (Interpretable) Deep learning

 - Deep learning algorithms in computer vision

 - Recognition/Emotion recognition

 - 3D/VR quality assessment with deep learning approach

 - Medical Image analysis with deep learning

 - Vision-Language multimodal learning



Recent lab research results - link (LINK)

Recent international conference presentations on deep learning by the lab's MS/PhD students - link (LINK)

Recent international journal publications by the lab's MS/PhD students - link (LINK)

Please refer to the links above.


Students interested in joining the lab are encouraged to email Prof. Yong Man Ro (ymro@kaist.ac.kr) and arrange a preliminary meeting.

[#180]   2019-02-14   Dr. Seong Tae Kim (PhD Graduated at Feb. 2019) will join Technical University of Munich as a postdoctoral research associate

Dr. Seong Tae Kim (PhD, graduated Feb. 2019) will join the Technical University of Munich as a postdoctoral research associate. His postdoctoral research includes deep learning for medical image analysis. The accomplishments of Dr. Seong Tae Kim during his PhD are outstanding in the image classification area. They can be found at https://sites.google.com/site/sseongtaekim/home/publications.

[#179]   2019-02-11  [IEEE CSVT] BBC Net for Occlusion-Robust Object Detection (by Jung Uk Kim) is accepted in IEEE Trans. on Circuits and Systems for Video Technology

BBC Net: Bounding-Box Critic Network for Object Detection has been accepted as a regular paper in IEEE Trans. on Circuits and Systems for Video Technology.

 

The paper title is "BBC Net: Bounding-Box Critic Network for Occlusion-Robust Object Detection". The paper's contribution is occlusion-robust object detection, which is practically needed in real-world automatic object detection. A novel deep network scheme featuring a bounding-box critic network, together with new occlusion-related learning algorithms, is devised. The results will serve as a very useful tool for automatic object detection in self-driving cars and surveillance cameras.

This paper has been written by Jung Uk Kim, Jungsu Kwon, Hak Gu Kim, and Yong Man Ro.

[#178]   2019-01-30  Wissam, PhD student in IVY Lab, received a Silver Prize in the 2019 SAMSUNG Human-Tech Paper Award

Wissam received a Silver Prize in the 2019 SAMSUNG Human-Tech Paper Award for his paper, "Mode Variational LSTM Robust to Unseen Modes of Variation."


Wissam started his research on deep learning and facial analysis several years ago under the guidance of Prof. Yong Man Ro. Spatio-temporal feature encoding is essential for encoding the dynamics in video sequences. In deep learning, spatio-temporal encoding has popularly been performed with recurrent neural networks. To successfully encode the dynamics of real-world video sequences, spatio-temporal features must be robust to different types of variations. However, existing recurrent neural networks do not sufficiently encode robust spatio-temporal features. His research devises a new recurrent neural network which is robust to environmental changes and to variations unseen during training. He has successfully demonstrated that the proposed mode variational LSTM is useful for encoding spatio-temporal features robust to the different types of variations that could appear in the real world.

[#177]   2019-01-30  [IEEE CSVT]   Deep Virtual Reality Image Quality (by Hak Gu & Heoun-taek) is accepted in IEEE Trans. on Circuits and Systems for Video Technology


Deep Virtual Reality Image Quality Assessment has been accepted as a regular paper in IEEE Trans. on Circuits and Systems for Video Technology.

 

The paper title is "Deep Virtual Reality Image Quality Assessment with Human Perception Guider for Omnidirectional Image". The paper's contribution is to provide, for the first time, an objective VR image quality assessment, which is needed in the emerging VR industry. A novel deep network scheme featuring a human perception guider, together with associated new learning algorithms, is devised. The results will serve as a very useful tool in VR research.

This paper has been written by Hak Gu Kim, Heoun-taek Lim, and Yong Man Ro.

[#176]   2019-01-21   [IEEE CSVT]   Landmark detection with geometric map generative network (by Hong Joo Lee) is accepted in IEEE Trans. on Circuits and Systems for Video Technology

Landmark detection with geometric map generative network has been accepted as a regular paper in IEEE Transactions on Circuits and Systems for Video Technology.


The title is "Lightweight and Effective Facial Landmark Detection using Adversarial Learning with Face Geometric Map Generative Network". The paper's contribution is a new geometric prior (a generated geometric map) for facial landmark detection. With the geometric prior, very lightweight and effective detection of object key points can be achieved.

This paper has been written by Hong Joo Lee, Seong Tae Kim, Hakmin Lee and Yong Man Ro.