Video Action Transformer Network | AISC
Speaker(s): Mahdi Biparva
Facilitator(s): Alireza Darbehani
Find the recording, slides, and more info at https://ai.science/e/action-recognition-video-action-transformer-network--2E7ZCJBIpdgXLnCweLLS
Motivation / Abstract
The Action Transformer model recognizes and localizes human actions in video clips. It repurposes a Transformer-style architecture to aggregate features
from the spatiotemporal context around the person whose actions the model is classifying. The paper shows that by using high-resolution, person-specific, class-agnostic queries, the model spontaneously learns to track individual people and to pick up on semantic context from the actions of others.
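To make the idea concrete, below is a minimal sketch (not the authors' code) of the core operation: a person-specific query vector cross-attending over flattened spatiotemporal context features, roughly what one Action Transformer unit does. The function name, projection matrices, and tensor shapes are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def person_context_attention(person_query, context_features, d_k=128):
    """person_query: (B, C) feature pooled from the person's RoI.
    context_features: (B, T*H*W, C) flattened spatiotemporal feature map."""
    B, N, C = context_features.shape
    # Hypothetical projections; in a real model these are learned nn.Linear layers.
    W_q = torch.randn(C, d_k)
    W_k = torch.randn(C, d_k)
    W_v = torch.randn(C, d_k)
    q = person_query @ W_q        # (B, d_k) query from the person box
    k = context_features @ W_k    # (B, N, d_k) keys over time and space
    v = context_features @ W_v    # (B, N, d_k) values over time and space
    # Scaled dot-product attention of the person query against every location
    attn = F.softmax((k @ q.unsqueeze(-1)).squeeze(-1) / d_k ** 0.5, dim=-1)  # (B, N)
    context = (attn.unsqueeze(-1) * v).sum(dim=1)                             # (B, d_k)
    return context, attn  # attended context feature and the attention map over T*H*W

# Example: batch of 2 clips, 64 spatiotemporal locations, 256-dim features
ctx = torch.randn(2, 64, 256)
query = torch.randn(2, 256)
out, attn_map = person_context_attention(query, ctx)
```

Inspecting `attn_map` is what lets the paper visualize that the model learns to attend to the tracked person and to other people relevant to the action.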
What was discussed?
- Learn about the Action Transformer Architecture
- Learn about Action Detection in videos
- Learn about attention mechanisms in Computer Vision
- Learn about Deep Learning paradigms in Action Recognition
------
#AISC hosts 3-5 live sessions like this on various AI research, engineering, and product topics every week! Visit https://ai.science for more details