Video Scene Understanding

Video scene understanding is the comprehensive interpretation of the context and content of video sequences. AI/ML-driven automatic scene understanding has practical uses in diverse domains such as surveillance, autonomous driving, healthcare, entertainment, and sports. The goal of this research is to develop robust AI/ML models that combine tasks such as action detection, event detection, and scene type classification to build a high-level understanding of video scenes. These tasks are challenging because of variation in lighting conditions and camera perspectives, occlusions, and moving objects. In addition, different events or scene types may share similar characteristics, so a higher level of cognitive understanding is needed to resolve such ambiguities. Moreover, computer-generated (CGI) scenes have different visual features from natural scenes and therefore need to be modeled differently. By addressing these challenges, we seek to enable the application of video scene understanding to tasks such as video recommendation and immersive media experiences.