Cheng Luo (罗成)

I earned my Master's degree from the College of Computer Science and Software Engineering at Shenzhen University (SZU), under the co-supervision of A/Prof. Weicheng Xie and Prof. Linlin Shen. Prior to that, I completed my Bachelor's degree in Computer Science at Guangzhou University (GZHU) in 2020. My research focuses on Affective AI and Healthcare AI, and on how such systems interact and collaborate with humans.

Download CV

News

[27 Feb., 2024] One paper was accepted to CVPR 2024.

[12 Dec., 2023] Two papers were accepted to ICASSP 2024.

[9 Dec., 2023] One paper was accepted to AAAI 2024.

[26 Jul., 2023] One paper was accepted to ACM MM 2023.

[14 Jul., 2023] One paper was accepted to ICCV 2023.

[11 Jun., 2023] Launched the REACT2023 grand challenge as an organizer at the ACM MM conference.

[21 Apr., 2022] One paper was accepted to IJCAI 2022.

[2 Mar., 2022] One paper was accepted to CVPR 2022.

[23 Jul., 2021] One paper was accepted to ICCV 2021.

[10 Sep., 2020] I started my academic journey at the Computer Vision Institute led by Prof. Linlin Shen.

Education

2020-present

M.S. Candidate in Computer Science

Shenzhen University
2016-2020

B.S. in Computer Science

Guangzhou University

Experience

Jul. 2022 - Dec. 2022

Research Intern

University of Cambridge

Reaction Generation in Dyadic Interactions

Oct. 2022 - Jun. 2023

Research Intern

Baidu Inc.

3D Face Reconstruction


Publications

2023

Boosting Adversarial Transferability across Model Genus by Deformation-Constrained Warping

Qinliang Lin*, Cheng Luo*, Zenghao Niu, Xilin He, Weicheng Xie, Yuanbo Hou, Linlin Shen, and Siyang Song
AAAI Conference on Artificial Intelligence
(AAAI 2024)

Adversarial examples produced by a surrogate model are usually not well-transferable to target systems. To address this problem, various transferability enhancement approaches, such as input transformation and model augmentation, have emerged. However, they show less potential for attacking target systems whose model genus differs from that of the surrogate. In this paper, we propose a general attack strategy, dubbed Deformation-Constrained Warping Attack (DeCoWA), that overcomes this limitation and can be applied to cross-model-genus attacks. Specifically, DeCoWA first augments input examples via an elastic deformation, namely Deformation-Constrained Warping (DeCoW), to obtain a rich collection of augmented local input details and content. To avoid severe distortion of global semantics caused by random deformation, DeCoW further constrains the strength and direction of the warping transformation with a proposed adaptive control strategy. Extensive experiments demonstrate that the transferable examples crafted by our DeCoWA on CNN surrogates can significantly hinder the performance of Transformers (and vice versa) on various tasks, including image classification, video action recognition, and audio recognition.
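For a rough feel of how such an augmentation slots into a transfer attack, below is a minimal PyTorch sketch, not the official DeCoWA implementation: a coarse random displacement field with a hard magnitude clamp stands in for the deformation-constrained warping and its adaptive control, and an FGSM-style loop averages gradients over several warped copies. All function names and hyperparameters here are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def constrained_warp(x, grid_size=8, max_shift=0.05):
    """Warp x with a coarse random displacement field whose magnitude is clamped,
    a crude stand-in for deformation-constrained elastic warping."""
    b, _, h, w = x.shape
    # Coarse random offsets in normalized [-1, 1] grid units, hard-clamped so the
    # deformation cannot destroy global semantics.
    off = torch.randn(b, 2, grid_size, grid_size, device=x.device) * max_shift
    off = off.clamp(-max_shift, max_shift)
    off = F.interpolate(off, size=(h, w), mode="bilinear", align_corners=True)
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h, device=x.device),
        torch.linspace(-1, 1, w, device=x.device),
        indexing="ij",
    )
    grid = torch.stack((xs, ys), dim=-1).unsqueeze(0) + off.permute(0, 2, 3, 1)
    return F.grid_sample(x, grid, mode="bilinear", align_corners=True)

def transfer_attack(model, x, y, eps=8 / 255, alpha=1 / 255, steps=20, n_warps=4):
    """FGSM-style iterative attack that averages gradients over several warped copies."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        grad = torch.zeros_like(x_adv)
        for _ in range(n_warps):
            loss = F.cross_entropy(model(constrained_warp(x_adv)), y)
            grad = grad + torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()       # ascend the classification loss
            x_adv = x + (x_adv - x).clamp(-eps, eps)  # stay in the eps-ball
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```

Averaging gradients over several random warps is what pushes the perturbation to survive the architectural gap between the CNN surrogate and a Transformer target (or vice versa).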

PDF Code

Shift from Texture-bias to Shape-bias: Edge Deformation-based Augmentation for Robust Object Recognition

Xilin He, Qinliang Lin, Cheng Luo, Weicheng Xie, Siyang Song, Feng Liu, and Linlin Shen
International Conference on Computer Vision
(ICCV 2023)

Recent studies have shown the vulnerability of CNNs under perturbation noises, which stems partially from well-trained CNNs being too biased toward object texture, i.e., they make predictions mainly based on texture cues. To reduce this texture bias, current studies resort to learning augmented samples with heavily perturbed texture to make networks more biased toward relatively stable shape cues. However, such methods usually fail to achieve truly shape-biased networks due to the insufficient diversity of the shape cues. In this paper, we propose to augment the training dataset by generating semantically meaningful shapes and samples via a shape deformation-based online augmentation, namely SDbOA. The samples generated by our SDbOA have two main merits. First, the augmented samples with more diverse shape variations enable networks to learn the shape cues more elaborately, which encourages the network to be shape-biased. Second, semantically meaningful shape-augmented samples are produced by jointly regularizing the generator with an object-texture and edge-guidance soft constraint, where the edges are represented more robustly with a self-information-guided map to better withstand the noise on them. Extensive experiments under various perturbation noises demonstrate the clear superiority of our shape-bias-motivated model over the state of the art in terms of robustness.
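One ingredient of this pipeline that is easy to illustrate in isolation is the edge-guidance soft constraint, i.e., penalising an augmented sample for drifting too far from the original's edge layout. The sketch below uses plain Sobel gradients as a hedged stand-in for the paper's self-information-guided edge map; `sobel_edge_map` and `edge_guidance_loss` are hypothetical helper names, not code from the released implementation.

```python
import torch
import torch.nn.functional as F

def sobel_edge_map(x):
    """Gradient-magnitude edge map for a batch of images x of shape (B, C, H, W)."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]], device=x.device)
    ky = kx.t()
    c = x.shape[1]
    kx = kx.view(1, 1, 3, 3).repeat(c, 1, 1, 1)   # depthwise Sobel kernels
    ky = ky.view(1, 1, 3, 3).repeat(c, 1, 1, 1)
    gx = F.conv2d(x, kx, padding=1, groups=c)
    gy = F.conv2d(x, ky, padding=1, groups=c)
    return (gx.pow(2) + gy.pow(2) + 1e-8).sqrt().mean(dim=1, keepdim=True)

def edge_guidance_loss(augmented, original):
    """Soft constraint: an augmented sample should keep the original's edge layout."""
    return F.l1_loss(sobel_edge_map(augmented), sobel_edge_map(original))
```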

PDF Code BibTeX
2022

Frequency-driven Imperceptible Adversarial Attack on Semantic Similarity

Cheng Luo*, Qinliang Lin*, Weicheng Xie, Bizhu Wu, Jinheng Xie, and Linlin Shen
IEEE Conference on Computer Vision and Pattern Recognition
(CVPR 2022)

Current adversarial attack research reveals the vulnerability of learning-based classifiers against carefully crafted perturbations. However, most existing attack methods have inherent limitations in cross-dataset generalization as they rely on a classification layer with a closed set of categories. Furthermore, the perturbations generated by these methods may appear in regions easily perceptible to the human visual system (HVS). To circumvent the former problem, we propose a novel algorithm that attacks semantic similarity on feature representations. In this way, we are able to fool classifiers without limiting attacks to a specific dataset. For imperceptibility, we introduce the low-frequency constraint to limit perturbations within high-frequency components, ensuring perceptual similarity between adversarial examples and originals. Extensive experiments on three datasets (CIFAR-10, CIFAR-100, and ImageNet-1K) and three online public platforms indicate that our attack can yield misleading and transferable adversarial examples across architectures and datasets. Additionally, visualization results and quantitative performance (in terms of four different metrics) show that the proposed algorithm generates more imperceptible perturbations than the state-of-the-art methods.
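A toy sketch of the two ideas above, attacking feature-space similarity instead of class logits and keeping the perturbation out of low frequencies, might look as follows. The `feature_extractor` is assumed to be any surrogate backbone's penultimate-layer output, and a blur-based low-pass crudely stands in for a proper frequency-domain constraint; this is an assumption-laden illustration, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def low_frequency(delta, kernel=8):
    """Crude low-pass of a perturbation: blur by down- and up-sampling."""
    small = F.avg_pool2d(delta, kernel)
    return F.interpolate(small, size=delta.shape[-2:], mode="bilinear", align_corners=False)

def semantic_similarity_attack(feature_extractor, x, eps=8 / 255, alpha=1 / 255,
                               steps=40, lam=1.0):
    """PGD-style attack: push the adversarial example's features away from the clean
    features while penalising low-frequency perturbation energy."""
    with torch.no_grad():
        feat_clean = feature_extractor(x).flatten(1)
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        feat_adv = feature_extractor(x_adv).flatten(1)
        loss = F.cosine_similarity(feat_adv, feat_clean, dim=1).mean() \
               + lam * low_frequency(x_adv - x).abs().mean()
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv - alpha * grad.sign()      # minimise similarity and LF energy
            x_adv = x + (x_adv - x).clamp(-eps, eps)
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```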

PDF Code BibTeX

Learning Multi-dimensional Edge Feature-based AU Relation Graph for Facial Action Unit Recognition

Cheng Luo*, Siyang Song*, Weicheng Xie, Linlin Shen, and Hatice Gunes
International Joint Conference on Artificial Intelligence
(IJCAI 2022)

The activations of Facial Action Units (AUs) mutually influence one another. While the relationship between a pair of AUs can be complex and unique, existing approaches fail to specifically and explicitly represent such cues for each pair of AUs in each facial display. This paper proposes an AU relationship modelling approach that deep learns a unique graph to explicitly describe the relationship between each pair of AUs of the target facial display. Our approach first encodes each AU's activation status and its association with other AUs into a node feature. Then, it learns a pair of multi-dimensional edge features to describe multiple task-specific relationship cues between each pair of AUs. During both node and edge feature learning, our approach also considers the influence of the unique facial display on AUs' relationship by taking the full face representation as an input. Experimental results on BP4D and DISFA datasets show that both node and edge feature learning modules provide large performance improvements for CNN and transformer-based backbones, with our best systems achieving the state-of-the-art AU recognition results. Our approach not only has a strong capability in modelling relationship cues for AU recognition but also can be easily incorporated into various backbones.
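To make the node/edge decomposition concrete, here is a heavily simplified, hypothetical module: one learned projection per AU maps the full-face representation to an AU-specific node feature, and a small MLP over each concatenated node pair (plus the face feature) yields a multi-dimensional edge feature per AU pair. It mirrors only the structure described above, not the released model.

```python
import torch
import torch.nn as nn

class ToyAURelationGraph(nn.Module):
    """Sketch: per-AU node features + multi-dimensional edge features per AU pair."""

    def __init__(self, face_dim=512, node_dim=64, edge_dim=16, num_aus=12):
        super().__init__()
        self.num_aus = num_aus
        # One projection per AU: full-face representation -> AU-specific node feature.
        self.node_proj = nn.ModuleList(
            [nn.Linear(face_dim, node_dim) for _ in range(num_aus)]
        )
        # Edge MLP: (node_i, node_j, face) -> multi-dimensional relation feature.
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * node_dim + face_dim, 128),
            nn.ReLU(inplace=True),
            nn.Linear(128, edge_dim),
        )
        self.au_head = nn.Linear(node_dim, 1)         # AU activation from its node

    def forward(self, face_feat):                     # face_feat: (B, face_dim)
        nodes = torch.stack([p(face_feat) for p in self.node_proj], dim=1)  # (B, N, D)
        b, n, d = nodes.shape
        ni = nodes.unsqueeze(2).expand(b, n, n, d)    # node_i repeated along j
        nj = nodes.unsqueeze(1).expand(b, n, n, d)    # node_j repeated along i
        ff = face_feat.unsqueeze(1).unsqueeze(1).expand(b, n, n, face_feat.shape[-1])
        edges = self.edge_mlp(torch.cat([ni, nj, ff], dim=-1))  # (B, N, N, edge_dim)
        au_logits = self.au_head(nodes).squeeze(-1)   # (B, N) per-AU activation logits
        return au_logits, edges
```

For example, `ToyAURelationGraph()(torch.randn(4, 512))` returns per-AU logits of shape (4, 12) and a (4, 12, 12, 16) tensor of pairwise relation features.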

PDF Code BibTeX Project
2021

Online Refinement of Low-level Feature Based Activation Map for Weakly Supervised Object Localization

Jinheng Xie, Cheng Luo, Xiangping Zhu, Ziqi Jin, Weizeng Lu, and Linlin Shen
IEEE International Conference on Computer Vision
(ICCV 2021)

We present a two-stage learning framework for weakly supervised object localization (WSOL). While most previous efforts rely on high-level feature based CAMs (Class Activation Maps), this paper proposes to localize objects using the low-level feature based activation maps. In the first stage, an activation map generator produces activation maps based on the low-level feature maps in the classifier, such that rich contextual object information is included in an online manner. In the second stage, we employ an evaluator to evaluate the activation maps predicted by the activation map generator. Based on this, we further propose a weighted entropy loss, an attentive erasing, and an area loss to drive the activation map generator to substantially reduce the uncertainty of activations between object and background, and explore less discriminative regions. Based on the low-level object information preserved in the first stage, the second-stage model gradually generates a well-separated, complete, and compact activation map of the object in the image, which can be easily thresholded for accurate localization.
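An assumption-heavy sketch of the second-stage objectives described above: an entropy-style term pushes each pixel of the activation map toward a confident object/background decision, an area term keeps the map compact, and localization follows by thresholding the map into a bounding box. This mirrors only the general loss shapes, not the paper's exact formulation or its evaluator and attentive-erasing components.

```python
import torch

def map_losses(act_map, area_weight=0.5, eps=1e-6):
    """act_map: (B, 1, H, W) activation map squashed to [0, 1].
    Entropy term -> confident (near 0/1) activations; area term -> compact maps."""
    p = act_map.clamp(eps, 1 - eps)
    entropy = -(p * p.log() + (1 - p) * (1 - p).log()).mean()
    area = p.mean()
    return entropy + area_weight * area

def map_to_bbox(act_map, thresh=0.5):
    """Threshold a single (H, W) map and return an (x1, y1, x2, y2) box around it."""
    mask = act_map >= thresh
    if not mask.any():
        return None
    ys, xs = torch.nonzero(mask, as_tuple=True)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```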

PDF Code BibTeX

Professional Activity

Reviewer:

Conference: ECCV 2024, CVPR 2024, ICASSP 2024, ICCV 2023, CVPR 2023, ICASSP 2023, ECCV 2022
Journal: Information Fusion

Contact Me: luocheng2020 At email.szu.edu.cn

Last update: Feb. 27th, 2024