A multitask model for person re-identification and attribute recognition using semantic regions

Andreas Specker; Arne Schumann; Jürgen Beyerer

doi:10.1117/12.2573981

24 September 2020

Proceedings Volume 11543, Artificial Intelligence and Machine Learning in Defense Applications II; 115430I (2020) https://doi.org/10.1117/12.2573981
Event: SPIE Security + Defence, 2020, Online Only

Abstract

In recent years, video surveillance cameras have been deployed increasingly in both military and civilian applications. This trend produces large amounts of image and video footage. Searching and evaluating this data manually is difficult because of the large data volume and limited human attention span, so automatic algorithms are required to aid in data analysis. A key task in this context is the search for persons of interest, i.e., person re-identification. Based on a query image, re-identification methods retrieve further occurrences of the depicted person in large data volumes. The success of convolutional neural networks (CNNs) in computer vision has extended to person re-identification and has recently led to significant improvements. Current state-of-the-art approaches mostly rely on features extracted from CNNs trained with person images and corresponding identity labels. However, person re-identification remains a challenging problem due to many task-specific influences such as occlusions, incomplete body parts, background clutter, varying camera perspectives, and pose variation. Unlike conventional CNN features, descriptive person attributes represent higher-level semantic information that is more robust to many of these influences. Therefore, person re-identification can be improved by integrating attributes into the algorithms. In this work, we investigate approaches for attribute-based person re-identification using deep learning methods, with the goal of developing efficient models with the best possible re-identification accuracy. We show that best practices from person re-identification can be transferred to the task of pedestrian attribute recognition to achieve strong baseline results for both tasks. Moreover, we show that leveraging information about semantic clothing and body regions during training of the networks improves the results further.
Finally, we combine pedestrian attribute recognition and person re-identification models in a multi-task architecture to build our attribute-based person re-identification approach. We develop our attribute model on the large RAP dataset, which currently offers the largest available number of persons and attributes and thus allows for a differentiated analysis. The final combined attribute and re-identification model is trained on the Market-1501 dataset, which provides both person identities and attribute annotations. Our results show that the combined model surpasses the baseline re-identification results, indicating that complementary information from the two tasks is leveraged.
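
The abstract does not specify how the two tasks are combined during training. As a rough illustration only, the sketch below assumes one common multi-task formulation: a softmax cross-entropy loss on identity logits plus a weighted binary cross-entropy loss on per-attribute logits. The function names and the `attr_weight` parameter are hypothetical and are not taken from the paper.

```python
import math


def softmax_cross_entropy(logits, target_index):
    """Cross-entropy of one sample's identity logits against its true identity."""
    m = max(logits)
    # Log-sum-exp with the maximum subtracted for numerical stability.
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_index]


def binary_cross_entropy(logits, targets):
    """Mean binary cross-entropy over independent attribute logits (targets are 0 or 1)."""
    total = 0.0
    for z, t in zip(logits, targets):
        # Stable form of -[t*log(sigmoid(z)) + (1-t)*log(1-sigmoid(z))].
        total += max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def multitask_loss(id_logits, id_target, attr_logits, attr_targets, attr_weight=1.0):
    """Assumed combination: identity loss plus attr_weight times attribute loss."""
    id_loss = softmax_cross_entropy(id_logits, id_target)
    attr_loss = binary_cross_entropy(attr_logits, attr_targets)
    return id_loss + attr_weight * attr_loss
```

In such a setup both sets of logits would come from heads on a shared feature extractor, so gradients from the attribute loss also shape the features used for re-identification; whether the paper weights or schedules the two terms this way is not stated in the abstract.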

Citation

Andreas Specker, Arne Schumann, and Jürgen Beyerer "A multitask model for person re-identification and attribute recognition using semantic regions", Proc. SPIE 11543, Artificial Intelligence and Machine Learning in Defense Applications II, 115430I (24 September 2020); https://doi.org/10.1117/12.2573981

PROCEEDINGS
13 PAGES

KEYWORDS

Cameras

Data modeling

Video surveillance

Computer vision technology

Convolutional neural networks

Data analysis

Feature extraction