Start time: April 10 (Thu) 08:30 End time: April 10 (Thu) 17:00
Venue: International Conference Hall, B1, 信息电机馆 (Information and Electrical Engineering Building), 清华大学
Contact person: Contact phone: 03-574-2847, (03)571-5131 ext. 2847

Modern advances in information technology have enabled the pervasive use of digital multimedia data in a variety of business, scientific, government, and consumer applications. Accompanying the explosive growth in the generation, storage, distribution, and consumption of multimedia data are emerging requirements for indexing content, building standard exchange formats, and ensuring a trustworthy framework between users.

Recent work on multimedia indexing technologies has struggled to keep pace with these challenges. While feature-based indexing techniques satisfy some of the requirements, a need to understand the semantic meaning of multimedia data is foreseen and is currently driving the research paradigm to a new level. Multimedia understanding draws on techniques from disparate disciplines, including signal processing, machine learning, computer vision, and domain-specific recognition techniques. Although recent applications show advances in speech, text, and face recognition, a generic framework that recognizes thousands of visual objects and acoustic information has not yet appeared in the literature. In the first two lectures, I will introduce our current effort in developing frameworks for generic audio-visual object recognition, video structure understanding, and learning semantics from object recognition and video structures. A tutorial on machine learning and on proven statistical discriminant techniques such as Support Vector Machines, Gaussian Mixture Models, and Hidden Markov Models will also be provided.


