Automatic human pose tracking with Vision Transformers

Date and Time
2021-11-16, 12:00 to 13:00
Speaker
Aiden Nibali (La Trobe University)

Abstract: The emergence of deep convolutional neural networks (CNNs) in recent years was an important breakthrough in the field of computer vision, with CNN models topping competition leaderboards for a broad range of computer vision problems, including image classification, object detection and tracking, and semantic segmentation. More recently, the "Transformer" architecture, which was originally developed for natural language processing, has been adapted to computer vision tasks and is quickly gaining a reputation within the deep learning research community as a powerful alternative to CNNs. In this talk I will give an overview of how Vision Transformers work and discuss my current research into extending this new deep learning architecture to the task of tracking the joint locations of human subjects in videos.