Activity Recognition in Videos Using Deep Learning

Title: Activity Recognition in Videos Using Deep Learning
Author(s): Shanbhag, Mahesh Ramaray
Advisor: Gogate, Vibhav G.
Date Created: 2018-05
Format: Thesis
Abstract: Automatically recognizing activities in a video is a long-standing goal of computer vision and artificial intelligence. Recent breakthroughs in deep learning have revolutionized computer vision, and deep models can now solve low-level tasks such as image classification and object detection more accurately than humans, including highly trained experts. However, inferring high-level activities from low-level information such as the objects in a video remains difficult, because the objects interacting with humans can be too small, and similar activities may be captured at different spatial locations or angles. In this thesis, we propose an effective and efficient supervised learning model for this task that leverages advanced deep learning architectures. Our key idea is to formulate activity recognition as a multi-label classification problem in which the input is a set of frames (a video) and the output is an assignment, at each frame, of the most probable labels to the four elements that make up an activity: action, tool, object and source/target. We begin with a network pre-trained on objects appearing in a large image classification dataset and modify it with an additional layer that helps us solve the much harder multi-label classification problem. We then tune and train this new network on our video data by presenting each labeled frame in the video as input to the network. We train, evaluate and benchmark the model using a popular Cooking activities dataset, and we interpret the learned model by visualizing the network at various levels of its hierarchy.
Degree Name: MSCS
Degree Level: Masters
Persistent Link: http://hdl.handle.net/10735.1/5944
Terms of Use: ©2018 The Author. Digital access to this material is made possible by the Eugene McDermott Library. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Type: text
Degree Program: Computer Science
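
Illustrative sketch (not taken from the thesis): the abstract describes predicting, for each frame, the most probable label for each of four activity elements. One plausible reading is a shared per-frame feature vector feeding four separate classification heads, each with its own softmax. The label vocabularies, feature size, random weights and the function names `make_head` and `predict_frame` below are all hypothetical, and the feature vector stands in for the output of a pre-trained image network.

```python
import math
import random

# Hypothetical label vocabularies; the abstract does not list the real ones.
LABELS = {
    "action": ["cut", "pour", "stir"],
    "tool": ["knife", "spoon", "none"],
    "object": ["carrot", "milk", "egg"],
    "source_target": ["bowl", "pan", "board"],
}


def softmax(scores):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_head(n_features, n_classes, rng):
    """Build one linear head: a weight row per class (random, i.e. untrained)."""
    return [[rng.uniform(-1, 1) for _ in range(n_features)]
            for _ in range(n_classes)]


def predict_frame(features, heads):
    """Return the most probable label and its probability for each element."""
    result = {}
    for element, weights in heads.items():
        scores = [sum(w * x for w, x in zip(row, features)) for row in weights]
        probs = softmax(scores)
        best = max(range(len(probs)), key=probs.__getitem__)
        result[element] = (LABELS[element][best], probs[best])
    return result


rng = random.Random(0)
N_FEATURES = 8  # assumed size; a real backbone would emit far more features
heads = {name: make_head(N_FEATURES, len(vocab), rng)
         for name, vocab in LABELS.items()}

# Stand-in for the features a pre-trained network would extract from one frame.
frame_features = [rng.uniform(0, 1) for _ in range(N_FEATURES)]
prediction = predict_frame(frame_features, heads)
```

Here `prediction` maps each of the four elements to a `(label, probability)` pair for a single frame; applying `predict_frame` to every frame's features would give the per-frame assignment the abstract describes. In practice the head weights would be learned from labeled frames rather than drawn at random.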

Files in this item

File: ETD-5608-011-SHANBHAG-8134.53.pdf
Size: 3.044Mb
Format: PDF
