AIDAR
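Title: AIDAR (Adressage et Indexation de Documents Multimédias Assistés par des Techniques de Reconnaissance Vocale; in English, Addressing and Indexing of Multimedia Documents Assisted by Speech Recognition Techniques)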
Partners: Voice-Insight S.A., Titan asbl
Funding: Région de Bruxelles-Capitale
Project number:
Duration: 2004-2006
Project overview:
There is a growing need for an automatic, unified method of indexing
the audio archives of audiovisual companies. Currently, most of this
work is done by hand by a large workforce of human experts.
This project aims to develop an automatic architecture with the
following functionalities:
- Audio segmentation
- Classification of radio news by topic category (economy, war, politics, sport, ...)
- Automatic indexing and retrieval methods for querying the database of news and programs
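
The indexing and retrieval functionality can be pictured with an inverted index over segment transcripts. The sketch below is only an illustration under stated assumptions: the project description does not say which indexing method AIDAR uses, and the segment ids, transcripts, and function names (`build_index`, `search`) are hypothetical.

```python
from collections import defaultdict

def build_index(segments):
    """Map each word to the set of segment ids whose transcript contains it."""
    index = defaultdict(set)
    for seg_id, transcript in segments.items():
        for word in transcript.lower().split():
            index[word].add(seg_id)
    return index

def search(index, query):
    """Return the sorted ids of segments that contain every query word (boolean AND)."""
    words = query.lower().split()
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)

# Hypothetical transcripts standing in for speech recognizer output.
SEGMENTS = {
    "news-001": "central bank raises interest rates",
    "news-002": "football final ends in penalty shootout",
    "news-003": "bank strike disrupts football final travel",
}
INDEX = build_index(SEGMENTS)

print(search(INDEX, "football final"))
```

A boolean AND query is the simplest retrieval model; a ranked model (for example, one weighting terms by TF-IDF) would be a natural next step once topic labels and transcripts are available.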
Titan asbl (in association with RTBF) is in charge of providing radio
data (news and shows). Voice-Insight will provide know-how on speech
recognition technology. The MLG (Machine Learning Group) is in charge of
developing and testing machine learning techniques for automatic
indexing.
Machine Learning Group contribution:
The main contribution of the MLG will be topic classification. A full
speech recognizer will first produce the full text of each news item.
Using our know-how in machine learning techniques, we are in charge of
automatically detecting the topic of each radio news item from this text.
Methods such as lazy learning (k-nearest neighbour), SVM (Support Vector
Machine), and LLSF (Linear Least Squares Fit) are widely used in text
classification. We will investigate most of these techniques to find the
best ones for the AIDAR architecture.
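
To make the topic classification step concrete, here is a minimal sketch of one of the listed methods, k-nearest neighbour, applied to transcript text. It is an illustration under stated assumptions, not the project's implementation: the training sentences, the small stop-word list, the TF-IDF weighting with cosine similarity, and the function names are all hypothetical choices made for the example.

```python
import math
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real system would use a fuller one).
STOP_WORDS = {"the", "a", "an", "in", "on", "of", "to", "as", "and", "after"}

def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]

def fit_tfidf(texts):
    """Return sparse TF-IDF vectors (term -> weight) for the texts, plus the IDF table."""
    docs = [tokenize(t) for t in texts]
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF, so that no term gets a zero weight.
    idf = {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vectors, idf

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_predict(text, vectors, labels, idf, k=3):
    """Label a transcript by a similarity-weighted vote among its k nearest training items."""
    query = {t: c * idf[t] for t, c in Counter(tokenize(text)).items() if t in idf}
    scored = sorted(
        ((cosine(query, vec), label) for vec, label in zip(vectors, labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter()
    for similarity, label in scored[:k]:
        votes[label] += similarity
    return votes.most_common(1)[0][0]

# Hypothetical labelled transcripts standing in for recognized radio news.
TRAIN = [
    ("the central bank raised interest rates to fight inflation", "economy"),
    ("stock markets fell as the bank warned of rising inflation", "economy"),
    ("the striker scored twice in the football final", "sport"),
    ("the tennis champion won the final in straight sets", "sport"),
    ("parliament voted on the new election law", "politics"),
    ("the minister resigned after the coalition talks failed", "politics"),
]
LABELS = [label for _, label in TRAIN]
VECTORS, IDF = fit_tfidf([text for text, _ in TRAIN])

print(knn_predict("inflation worries pushed the bank to raise interest rates",
                  VECTORS, LABELS, IDF))
```

On this toy data the query shares terms only with the two economy items, so the weighted vote returns "economy". Weighting votes by similarity keeps unrelated neighbours (similarity 0) from influencing the result when fewer than k training items actually match. SVM and LLSF could be compared on the same TF-IDF vectors.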
MLG researcher involved:
Benjamin Tshibasu-Kabeya (Machine Learning Group - Computer Science Department - Université Libre de Bruxelles - Supervisor: Prof. Gianluca Bontempi).