Amaia Salvador

Biography
This site is no longer maintained. Find my new website here.
I am a PhD candidate at the Image Processing Group (GPI), Universitat Politècnica de Catalunya, under the supervision of Xavier Giró-i-Nieto and Ferran Marqués. My research is in the field of computer vision and its intersection with natural language processing. Over the course of my PhD I interned at the National Institute of Informatics (2015), the Massachusetts Institute of Technology (2016) and Facebook AI Research Montreal (2018).
Previously, I obtained an M.S. in Computer Vision from Universitat Autònoma de Barcelona and a B.S. in Audiovisual Systems from Universitat Politècnica de Catalunya. During my studies, I visited the Insight Centre for Data Analytics at Dublin City University and the INP-ENSEEIHT Engineering School in Toulouse, where I completed my M.S. and B.S. theses, respectively.
Journal Articles
“Mask-guided sample selection for Semi-Supervised Instance Segmentation”, Multimedia Tools and Applications, 2020.
“Assessment of Crowdsourcing and Gamification Loss in User-Assisted Object Segmentation”, Multimedia Tools and Applications, vol. 75, no. 23, 2016.
Book Chapters and Books
“Object Retrieval with Deep Convolutional Features”, in Deep Learning for Image Processing Applications, vol. 31, Amsterdam, The Netherlands: IOS Press, 2017.
Conference Papers
“Budget-aware Semi-Supervised Semantic and Instance Segmentation”, in CVPR 2019 DeepVision Workshop, Long Beach, CA, USA, 2019.
“Wav2Pix: Speech-conditioned Face Generation using Generative Adversarial Networks”, in ICASSP, Brighton, UK, 2019.
“RVOS: End-to-End Recurrent Network for Video Object Segmentation”, in CVPR, Long Beach, CA, USA, 2019.
“Inverse Cooking: Recipe Generation from Food Images”, in CVPR, Long Beach, CA, USA, 2019.
“Recurrent Neural Networks for Semantic Instance Segmentation”, in ECCV 2018 Women in Computer Vision (WiCV) Workshop, 2018.
Theses
“Computer Vision beyond the visible: Image understanding through language”, Universitat Politècnica de Catalunya, Barcelona, 2019.
Other
“Speech-conditioned Face Generation with Deep Adversarial Networks”, 2018. M.S. thesis.
“MIT is building a system that can identify a recipe using pictures of food”, 2017. Web article.
“Snap a photo, get a recipe: pic2recipe uses AI to predict food ingredients”, 2017. Web article.
“Artificial intelligence suggests recipes based on food photos”, 2017. Web article.
“Object Tracking in Video with TensorFlow”, 2016. M.S. thesis.
Projects
Cross-modal Deep Learning between Vision, Language, Audio and Speech | European | Oct 2018 | Sep 2021
SGR17 - Image and Video Processing Group | National | Jan 2017 | Sep 2021
BigGraph - Heterogeneous information and graph signal processing for the Big Data era. Application to high-throughput, remote sensing, multimedia and human computer interfaces. | National | Jan 2014 | Dec 2017
SGR14 - Image and Video Processing Group | National | Jan 2014 | Apr 2017
Research Areas
Language and Vision | Internal | Feb 2016 | Dec 2021
Region-based image and video processing | Internal | Jan 1992 | Dec 2020
Deep learning | Internal | Jun 2014 | Dec 2020
Multimedia Retrieval | Internal | Sep 2001 | Dec 2018
Crowdsourcing | Internal | Jan 2013 | Dec 2015
Demos and Resources
UPC at CVPRW ActivityNet Challenge 2016 | Software | Jun 2016
Faster R-CNN Features for Instance Search (software) | Software | May 2016
C3D Model for Keras trained over Sports 1M | Software | Apr 2016
Diving Deep into Sentiment: Understanding Fine-tuned CNNs for Visual Sentiment Prediction (software) | Software | Mar 2016
Terrassa Buildings 4126 | Dataset | Dec 2015