Zhaoquan Yuan (袁召全)

Assistant Professor, Ph.D.


School of Information Science and Technology, Southwest Jiaotong University (SWJTU)

CONTACT

Office: Room X9441, Building No. 9, School of Information Science and Technology, Southwest Jiaotong University, West Hi-Tech Zone, Chengdu, P.R. China.

Email:  zqyuan@swjtu.edu.cn

BRIEF BIOGRAPHY

I am an Assistant Professor at the School of Information Science and Technology, Southwest Jiaotong University (SWJTU). I received my bachelor's degree from the School of Computer Science and Technology, University of Science and Technology of China (USTC), and my Ph.D. degree in Pattern Recognition and Intelligent Systems from the Multimedia Computing Group (MMC), National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, advised by Prof. Changsheng Xu. I was a research visitor at the China-Singapore Institute of Digital Media (CSIDM) and at the Department of Computing of The Hong Kong Polytechnic University. I was also a postdoctoral researcher at the University of Electronic Science and Technology of China (UESTC), collaborating with Prof. Lixin Duan.

My research interests include grounded language learning, machine learning, and multimodal question answering.

NEWS  


TEACHING  


RECENT RESEARCH

Movie Question Answering   [Tutorial]
Visual question answering that draws on information from multiple modalities has attracted increasing attention in recent years. It is a challenging task, however, because visual content and natural language have quite different statistical properties. In our work, we present the Adversarial Multimodal Network (AMN) to better understand video stories for question answering. Inspired by generative adversarial networks, AMN learns multimodal feature representations by finding a more coherent subspace for video clips and the corresponding texts (e.g., subtitles and questions). Moreover, to preserve correlations with the visual cues, we introduce a self-attention mechanism that enforces consistency constraints on the learned multimodal representation.

Datasets and codes: [MovieQA]  [Code]
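The shared-subspace idea behind adversarial multimodal learning can be sketched in a few lines. This is a minimal toy illustration, not AMN itself: the feature dimensions, the linear projections, and the one-layer discriminator are all assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dimensions: raw video features, raw text features, shared subspace.
d_video, d_text, d_shared = 512, 300, 128

# Modality-specific projections into a common subspace (the "generators").
W_v = rng.normal(scale=0.01, size=(d_video, d_shared))
W_t = rng.normal(scale=0.01, size=(d_text, d_shared))

# A linear "discriminator" that tries to tell which modality a subspace
# vector came from; training the projections to fool it pushes both
# modalities toward a coherent shared subspace.
w_d = rng.normal(scale=0.01, size=d_shared)

def project(x, W):
    """Map a raw feature vector into the shared subspace."""
    return x @ W

def discriminator_logit(z):
    """Score how 'video-like' a subspace vector looks."""
    return z @ w_d

video_feat = rng.normal(size=d_video)   # e.g., a video-clip feature
text_feat = rng.normal(size=d_text)     # e.g., a subtitle/question feature

z_v = project(video_feat, W_v)
z_t = project(text_feat, W_t)

# Adversarial objective, conceptually: the discriminator maximizes this
# separation while W_v and W_t are updated to minimize it.
separation = discriminator_logit(z_v) - discriminator_logit(z_t)
print(z_v.shape, z_t.shape)
```

In the full method, the projections would be deep networks trained with gradient updates, and the self-attention module would operate on top of these shared representations.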
Motivation Prediction   [Tutorial]
Understanding the potential motivations behind people's actions in images is a key research topic in computer vision and pattern recognition. It is very challenging because motivations usually lie beyond plain image pixels and are hard to describe visually. To solve this task, we propose PLCR, which employs high-level, image-specific textual information and explores a potential causal structure among the concepts of scenes, actions, and motivations. Unlike most existing visual recognition models, PLCR infers motivations by performing perception learning and causal reasoning seamlessly.

Datasets and codes: [Dataset]  [Code]
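The causal-reasoning step over scene, action, and motivation concepts can be illustrated with a toy chain model. This sketch is not the PLCR model: the scene/action/motivation vocabulary and all probabilities are made up for the example, and a simple scene → action → motivation chain stands in for the learned causal structure.

```python
# Illustrative conditional tables for a toy scene -> action -> motivation chain.
P_ACTION_GIVEN_SCENE = {
    "kitchen": {"cooking": 0.7, "cleaning": 0.3},
    "street":  {"running": 0.6, "waiting": 0.4},
}
P_MOTIVE_GIVEN_ACTION = {
    "cooking":  {"prepare a meal": 0.8, "pass time": 0.2},
    "cleaning": {"tidy the home": 0.9, "pass time": 0.1},
    "running":  {"catch a bus": 0.5, "exercise": 0.5},
    "waiting":  {"meet someone": 0.7, "pass time": 0.3},
}

def infer_motivation(scene):
    """Marginalize over actions: P(m | s) = sum_a P(m | a) * P(a | s)."""
    scores = {}
    for action, p_a in P_ACTION_GIVEN_SCENE[scene].items():
        for motive, p_m in P_MOTIVE_GIVEN_ACTION[action].items():
            scores[motive] = scores.get(motive, 0.0) + p_a * p_m
    # Return the most probable motivation for the observed scene.
    return max(scores, key=scores.get)

print(infer_motivation("kitchen"))  # -> "prepare a meal"
```

In PLCR, the perception component would supply the scene and action evidence from the image and its textual information, and the causal structure over these concepts would be learned rather than hand-specified.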


SELECTED PUBLICATIONS

Social Media / Machine Learning


GRANTS AND FUNDING


SERVICES

Program Committee Member

Journal Reviewer

STUDENTS

Students Collaborating with Me


RESOURCES

Reinforcement Learning / Deep Learning

Natural Language Processing / Vision

Others


Last updated: January 16, 2019