Unsupervised learning of sparse features for scalable audio classification

Speakers:

Marcio Masaki Tomiyoshi and Roberto Piassi Passos Bodo

Abstract:

In this seminar, we will present the ISMIR 2011 best student paper, "Unsupervised learning of sparse features for scalable audio classification", by Mikael Henaff, Kevin Jarrett, Koray Kavukcuoglu, and Yann LeCun.

This work presents a system that automatically learns features from audio in an unsupervised manner. The method first learns an overcomplete dictionary that can be used to sparsely decompose log-scaled spectrograms. It then trains an efficient encoder that quickly maps new inputs to approximations of their sparse representations under the learned dictionary, which avoids the expensive iterative procedures usually required to infer sparse codes. These sparse codes are then used as inputs to a linear Support Vector Machine (SVM). The system achieves 83.4% accuracy in predicting genres on the GTZAN dataset, which is competitive with current state-of-the-art approaches. Furthermore, combining a simple linear classifier with a fast feature extraction system allows the approach to scale well to large datasets.
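To make concrete what an "expensive iterative procedure" for inferring sparse codes can look like, here is a minimal sketch of ISTA (iterative shrinkage-thresholding), a standard algorithm for this kind of problem. It is an illustration only: the abstract does not say which inference algorithm the paper compares against, and the function names, the toy dictionary, and the parameters `lam`, `step`, and `n_iter` are all assumptions made for this example. A learned encoder of the kind described above would aim to approximate the output of such a loop in a single fast pass.

```python
def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t (proximal operator of t * L1 norm)."""
    return [max(abs(a) - t, 0.0) * (1.0 if a > 0 else -1.0) for a in v]


def ista(x, D, lam, step, n_iter=100):
    """Approximately minimize 0.5 * ||x - D z||^2 + lam * ||z||_1 over z.

    x    : input vector (list of floats), e.g. one spectrogram frame (assumption)
    D    : dictionary as a list of rows, shape len(x) by k (hypothetical toy data)
    lam  : sparsity weight
    step : step size, assumed to be at most 1 / (largest eigenvalue of D^T D)
    """
    n, k = len(D), len(D[0])
    z = [0.0] * k
    for _ in range(n_iter):
        # Residual of the current reconstruction: r = x - D z
        r = [x[i] - sum(D[i][j] * z[j] for j in range(k)) for i in range(n)]
        # Gradient step on the quadratic term: z + step * D^T r
        g = [z[j] + step * sum(D[i][j] * r[i] for i in range(n)) for j in range(k)]
        # Shrinkage step that produces exact zeros, i.e. sparsity
        z = soft_threshold(g, step * lam)
    return z


# Toy overcomplete dictionary: 2-dimensional inputs, 3 atoms (made-up values).
D_toy = [[1.0, 0.0, 0.6],
         [0.0, 1.0, 0.8]]
code = ista([0.6, 0.8], D_toy, lam=0.1, step=0.5, n_iter=200)
```

Every new input requires running the whole loop, which is why replacing it with a trained feed-forward encoder matters when the dataset is large.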

(video presentation in Portuguese)

Date and time:

Tuesday, September 20, 2016 - 4:00pm

Place:

Antonio Gilioli Auditorium, IME/USP
