ICME 2006 Toronto

Interactive Multimedia Content Analysis and Applications

Organizers

Xian-Sheng Hua
Microsoft Research Asia
xshua@microsoft.com

Qi Tian
University of Texas at San Antonio
qitian@cs.utsa.edu

Call for Papers

With the explosive growth of digital media data, there is a huge demand for new tools and systems that enable average users to search, access, process, manage, author and share digital media content more efficiently and more effectively.

Automatic content analysis techniques are widely applied to extract metadata and annotate videos, with the aim of describing video content at both the syntactic and semantic levels. With the help of this metadata, tools and systems for video retrieval, summarization, delivery and manipulation can be built effectively.

However, automatically bridging the large gap between high-level semantics (what we expect) and low-level features (what we can obtain) remains very difficult. The results of automatic video annotation techniques are far from satisfactory, both because of this gap and because training data are scarce relative to the large variations of the semantic concepts in images and videos that need to be modeled. At the same time, manual annotation is not only labor-intensive and time-consuming but also subject to human error. To tackle this difficulty, relevance feedback is considered an effective method for narrowing the gap during retrieval. In addition, to accelerate the convergence of the learning process, several active learning schemes have been proposed, in which the most informative samples are chosen to be labeled.

Inspired by these promising approaches, can we achieve much better content analysis results by introducing user interaction in a wider space? Such interactions could be introduced at different steps in the lifecycle of multimedia data, including media content acquisition, modeling, annotation, sharing, authoring, etc. For example, for personal media, could we passively and/or actively gather more useful information during the acquisition, browsing, editing and sharing processes so that we obtain better content analysis results? Can the training process be more effective and efficient if the user provides some input? Will the data distribution be better estimated after getting feedback from users? Will the classification accuracy be greatly improved after a limited number of rounds of interaction? Of course, we should try to minimize the interaction required of users during these processes while maximizing the information we obtain from it.

Furthermore, as more and more media data becomes available on the Web, could we design appropriate tools, based on content analysis and contextual analysis, that enable highly efficient interactive media annotation?

This special session could include papers on the following or related topics:

  • Interactive multimedia data modeling
  • Interactive multimedia clustering, classification and annotation, such as relevance feedback and active learning for large-scale multimedia repositories
  • Multimedia applications based on interactive multimedia content analysis, or interactive applications based on automatic or interactive content analysis
  • Content analysis and/or applications based on interactive multimedia acquisition
  • Learning from users’ interactive behavior histories on multimedia data

Last updated Friday, January 13, 2006