Video Roadmap Part 1: Introduction

The objective of this series of blog posts is to address the future course of automated video analysis.  Depending on your viewpoint, this falls somewhere within the computer vision (CV) and machine learning (ML) disciplines.  I believe the best way to do this, and to engage people in thinking about the subject analytically, is to provide a straw man roadmap for consideration and comment.  To that end, I volunteer to be the clearing house for comments and suggestions, so that the roadmap can be updated and become a living document.

While Appscio is engaged in multimedia and multimodal analysis that combines both video and audio analysis, I believe it is premature to develop a multimodal roadmap before we look at video and audio individually.  If we step back and break multimodal analysis down into its fundamental increments, we may better learn how to improve video and audio analysis individually.  By learning the shortfalls of each, we may also gain insight that lets us achieve greater synergy between the two media in combination, and achieve it more quickly than by marching down individual paths.  With few exceptions, researchers and practitioners in video and audio analysis do not speak the same language, which creates false barriers to getting the most out of an information resource that in many cases includes an audio track alongside the video.

Why video?  My experience has been focused on video, so a video roadmap seems a natural exercise for me.  I hope that someone will take up the gauntlet and pursue an audio roadmap to complement the video one.  Once that is accomplished, I think it will set the stage for a closer look at the two roadmaps together, to identify where research in the combined media would yield the greatest result.

I will publish my thoughts in a series of posts to follow.  The topics will include Video Basics, Roadmap Description, Requirements, Technology Drivers, Needed Technology Breakthroughs, and Conclusions & Acknowledgements.