MALEGRA - Multimodal Signal Processing and Machine Learning on Graphs

Type: National
Start: Jan 2017
End: Jun 2021
Responsible: Javier Ruiz-Hidalgo / Xavier Giró

Reference

MALEGRA, TEC2016-75976-R, financed by the Spanish Ministerio de Economía, Industria y Competitividad and the European Regional Development Fund (ERDF)

 

Description

The goal of this project is to study and develop tools that combine graph signal representation and processing ideas with machine learning technology. These tools will be used in applications where the size and/or the heterogeneity of the data pose the typical challenges of the Big Data era. Developing technologies for the capture, storage, search, distribution, transfer, analysis and visualization of ever-growing heterogeneous datasets entails tremendous difficulties. At the same time, these difficulties open new opportunities, and this development has become a major trend in the field of Information and Communication Technology. The research performed in this project targets applications such as multi-view representations, video analysis, remote sensing for earth monitoring, person identification, health monitoring, medical imaging and genomics.

The project has four major objectives. The first two concentrate most of the development of theoretical and basic tools within the project. Within them, we will investigate the creation, analysis, segmentation, filtering and merging of graph structures of heterogeneous multimodal data, and the combination of these ideas with machine learning techniques. This combination with machine learning will serve several purposes: to provide a classification decision, to learn a mapping or a model to be used in a data processing architecture, to learn features that outperform handcrafted equivalents, or to aggregate several features into a signal to be further processed.

The last two objectives of the project focus on applying the techniques and tools developed in the first two objectives to complex challenges that deal with big and heterogeneous data. In particular, these techniques and tools will be used to study the identification of persons in broadcast TV programs, the optimal encoding of depth maps in multi-view plus depth representations, radiometric estimation and object detection in SAR and PolSAR images, the classification of multispectral and hyperspectral images, the understanding of brain changes during the evolution of Alzheimer's disease, the inference of gene regulatory networks, and the segmentation, tracking, indexing and super-resolution of multimodal video sequences.

 

Publications

Herrera-Palacio A, Ventura C, Giró-i-Nieto X. Video Object Linguistic Grounding. In: ACM Multimedia Workshop on Multimodal Understanding and Learning for Embodied Applications (MULEA). Nice, France: ACM; 2019.
Mas I, Morros JR, Vilaplana V. Picking groups instead of samples: A close look at Static Pool-based Meta-Active Learning. In: ICCV Workshop - MDALC 2019. Seoul, South Korea; 2019.
López-Palma M, Morros JR, Corbalán M, Gago J. Audience measurement using a top-view camera and oriented trajectories. In: IEEE IECON 2019. Lisbon, Portugal; 2019.
Duarte A. Cross-modal Neural Sign Language Translation. In: Torres J, Giró-i-Nieto X, editors. Proceedings of the 27th ACM International Conference on Multimedia - Doctoral Symposium. Nice, France: ACM; 2019.
Gené-Mola J, Vilaplana V, Rosell-Polo JR, Morros JR, Ruiz-Hidalgo J, Gregorio E. Uso de redes neuronales convolucionales para la detección remota de frutos con cámaras RGB-D. In: Congreso Ibérico de Agroingeniería. Huesca: Universidad de Zaragoza (UZA); 2019.
Herrera-Palacio A, Ventura C, Silberer C, Sorodoc I-T, Boleda G, Giró-i-Nieto X. Recurrent Instance Segmentation using Sequences of Referring Expressions. In: NeurIPS workshop on Visually Grounded Interaction and Language (ViGIL). Vancouver, Canada; 2019.
Oriol B, Canton-Ferrer C, Giró-i-Nieto X. Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation. In: NeurIPS 2019 Workshop on AI for Social Good. Vancouver, Canada; 2019.
Gené-Mola J, Gregorio E, Guevara J, Auat Cheein F, Sanz R, Escolà A, Llorens Calveras J, Morros JR, Ruiz-Hidalgo J, Vilaplana V, et al. Fruit Detection in an Apple Orchard Using a Mobile Terrestrial Laser Scanner. Biosystems Engineering. 2019;187.
Linardos P, Mohedano E, Nieto JJ, O'Connor N, Giró-i-Nieto X, McGuinness K. Simple vs complex temporal recurrences for video saliency prediction. In: British Machine Vision Conference (BMVC). Cardiff, Wales, UK: British Machine Vision Association; 2019.
Pareto D, Vidal P, Alberich M, Lopez C, Auger C, Tintoré M, Montalban X, Sastre-Garriga J, Vilaplana V, Rovira A. Prediction of a second clinical event in CIS patients by combining lesion and brain features. In: Congress of the European Committee for Treatment and Research in Multiple Sclerosis (ECTRIMS 2019). 2019.
