Video Roadmap Part 6: Needed Technology Breakthroughs

Part 1: Introduction
Part 2: Video Basics
Part 3: Roadmap Description
Part 4: Requirements
Part 5: Technology Drivers

The bar set by the requirements is that video processing must be accomplished in real time with precision and recall that have not yet been achieved. These requirements hold for both commercial and US Government applications; however, they are paced by surveillance and intelligence requirements for both the military battlefield and civilian counterterrorism and security applications. In many cases industry and commercial needs lead the way in requirements and solutions, but there are cases where a government security need outpaces the incremental improvements of commercial industry and requires revolutionary breakthroughs. This, I believe, is the case in video processing.

From my video experience I see two areas where research emphasis can provide significant payoff and help remedy the slow progress in improving the accuracy of video analysis algorithms. The first is to emphasize 3-D object and scene modeling, because there is only so much that one can do in 2-D and the point of diminishing returns may be near. The second is to follow the Appscio thrust and consider the synergy from combining multimodal and multimedia analyses.

So, what can be done? I think the answer lies in a number of areas that I will categorize as science, technology and systems. Some of what is required is beyond the focused scope of video analysis, but it must be mentioned to put the solution in a global perspective and to reinforce that the video community must get its house in order before additional synergy can be gained by exploring multimodal and multimedia solutions.

Science: Here my definition focuses on contributions to be made in academic, commercial and government research laboratories – so-called hard science. This is where new ideas and breakthroughs are incubated and nurtured, mostly resulting in PhD dissertations.

  • Robust CV and ML Models – much effort is still needed to develop better automated video analysis models based on computer vision (CV) and machine learning (ML), with emphasis on accuracy and speed
  • Concept-Based Video Indexing and Search – much effort is still needed to handle terabyte- and petabyte-scale data stores efficiently
  • 3-D Processing – needs emphasis for object recognition, scene understanding and face recognition
  • Multimodal/Multimedia Data Mining and Integration – to date there is no recognized champion to orchestrate such research; what is done is basically on an ad hoc basis

Technology:  Here my definition emphasizes industrial and commercial development that lays the groundwork enabling more rigorous science.

  • 3-D Collection and Display – much work is ongoing but its thrusts must be matched with the 3-D science work; can assist in developing virtual world models
  • Real-Time Collection/Analysis – there is work in in-line process computing, emerging as Smart Cameras; work should continue on layered sensor cueing and collaboration
  • Cooperative Sensor Collection and Fusion – there is work ongoing that could also assist 3-D collection
  • Metadata and Metadata Definitions – this and the next topic are needed infrastructure activities that no one to date has stepped up to lead; I am also unsure whether this is a Technology or a Science issue
  • Ontology and Ontology-Based Inferencing – same state as above

Systems: Here my definition is the infrastructure required for the continued handling and management of large data stores and for promoting an environment for efficient data processing.

  • Cloud Computing (formerly Services Oriented Architecture) – serves two needs: (1) a platform to support video and multimedia integration; and (2) a means for accessing and working with massive databases transparently