Xavier Giró

Biography
Xavier Giro-i-Nieto is an associate professor at the Universitat Politecnica de Catalunya (UPC) in Barcelona. He is a member of the Image Processing Group (GPI), the Intelligent Data Science and Artificial Intelligence Research Center (IDEAI-UPC) and the Institute of Industrial Robotics (IRI), and a visiting researcher at the Barcelona Supercomputing Center (BSC). He graduated in Telecommunications Engineering at ETSETB (UPC) in 2000, after completing his master's thesis on image compression at the Vrije Universiteit in Brussels (VUB) with Prof. Peter Schelkens. After working for one year at Sony Brussels, he returned to UPC to obtain a PhD in computer vision, supervised by Prof. Ferran Marqués and Prof. Shih-Fu Chang from the Digital Video and MultiMedia laboratory at Columbia University, which he repeatedly visited between 2008 and 2014. Dr. Giró is the director of the Postgraduate on Artificial Intelligence with Deep Learning at UPC School, and also teaches undergraduate and graduate courses on deep learning at the ESEIAAT and ETSETB schools at UPC, as well as in the Master in Computer Vision of Barcelona. He regularly collaborates with the Insight Centre for Data Analytics at Dublin City University, and is a member of the Governance Committee of the Science Foundation Ireland Centre for Research Training in Machine Learning. From a technology transfer perspective, he is a member of the scientific advisory committee of Vilynx, and collaborates with Telefónica R&D, Mediapro, BBC R&D and Crisalix. He serves as associate editor at IEEE Transactions on Multimedia and reviews for top-tier conferences in machine learning (NeurIPS, ICML), computer vision (CVPR, ECCV, ICCV) and multimedia (ACMMM, ICMR).
Latest News
- July 2021: I will teach a lecture on Object Detection and Deep Learning at the International Summer School on Deep Learning 2021 (virtual). Registration open until April 15.
- March 2021: Míriam Bellver defended her PhD thesis on image and video segmentation with low supervision. She has joined Amazon Barcelona.
- March 2021: Our new How2Sign dataset for sign language was accepted at CVPR 2021, led by our PhD candidate Amanda Duarte.
- March 2021: Andreu Girbau defended his industrial PhD on multiple object tracking. Co-advised with Ferran Marqués and Ignasi Rius (Automatic TV). He will join the National Institute of Informatics (NII) in Tokyo.
- March 2021: I will serve as area chair in ACM Multimedia 2021. You are invited to submit your papers by April 3.
- January 2021: I will serve as area chair in WACV 2021, held online on 5-9 January.
- December 2020: Víctor Campos defended his PhD thesis "Deep Learning that Scales: Leveraging Compute and Data". He has joined DeepMind London.
- December 2020: I will serve as area chair in ICCV 2021 Montreal. Paper submission on March 17th.
- October 2020: Appointed as member of the Governance Committee of the Science Foundation Ireland Centre for Research Training in Machine Learning.
- October 2020: Recognized among the top 10% of reviewers for NeurIPS 2020.
External activities
- Associate Editor:
- Area Chair:
- Reviewer
- CVPR (2021, 2020, 2019), ICCV (2019, 2017), ECCV (2020), ICIP (2014, 2003), EUSIPCO 2011.
- NeurIPS (2020, 2019, 2018, 2017), ICLR (2018, 2017), ICML (2020, 2019, 2018).
- ACM Multimedia (2019, 2018, 2017, 2016, 2014), ACM ICMR (2020, 2019, 2018, 2017).
- IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Multimedia Tools and Applications (MTAP), EURASIP Journal on Image and Video Processing, Multimedia Systems (MMSJ), Image and Vision Computing (IMAVIS)
- Tutorial
- Deep Learning for Multimedia: ACM ICMR 2020, MMM 2019.
- Organization Committee
- Deep Learning Barcelona (2019, 2018), Lifelogging Tools and Applications (LTA) workshop at ACM Multimedia 2016 & 2017, Annual Catalan Meeting on Computer Vision (2020).
- Awards & recognitions
- Top 10% reviewer at NeurIPS 2020, Top 33% reviewer at ICML 2020, Outstanding reviewer mention at CVPR 2020, Best paper award at the CVPR 2019 DeepVision Workshop, Best scanpath prediction in Salient360 ICME Challenge 2017, Best poster award at LSCVS NIPS workshop 2016, Best poster award at ICMR 2016, Among Top 10% papers in ICIP 2015, Winner of the LSUN Saliency prediction challenge in CVPRW 2015, 2nd place in ChaLearn Cultural Event Recognition Challenge in CVPRW 2015, 2nd place in MediaEval Social Event Detection 2014, 3rd place in MediaEval Social Event Detection 2013, Winner of the Videobrowser Showdown in MMM 2012.
- Advisor for the awarded MSc thesis of Adrià Romero (Càtedra Telefònica 2017), Víctor Campos (ACIA 2017), Dèlia Fernàndez (MCV 2016) and Òscar Mañas (MCV 2020).
Scientific IDs: Google Scholar, WoK Researcher ID: M-5834-2013, ORCID: 0000-0002-9935-5332, Scopus Author ID: 35098596700, UPC Futur
Journal Articles
“Mask-guided sample selection for Semi-Supervised Instance Segmentation”, Multimedia Tools and Applications, 2020.
“Multiresolution co-clustering for uncalibrated multiview segmentation”, Signal Processing: Image Communication, 2019.
“Scanpath and Saliency Prediction on 360 Degree Images”, Elsevier Signal Processing: Image Communication, 2018.
“Introduction to the special issue: Egocentric Vision and Lifelogging”, Journal of Visual Communication and Image Representation, 2018.
“From Pixels to Sentiment: Fine-tuning CNNs for Visual Sentiment Prediction”, Image and Vision Computing, 2017.
Book Chapters and Books
“Sentiment concept embedding for visual affect recognition”, in Multimodal Behavior Analysis in the Wild, 1st ed., Elsevier, 2018.
“Object Retrieval with Deep Convolutional Features”, in Deep Learning for Image Processing Applications, vol. 31, Amsterdam, The Netherlands: IOS Press, 2017.
“Hierarchical Object Detection with Deep Reinforcement Learning”, in Deep Learning for Image Processing Applications, vol. 31, Amsterdam, The Netherlands: IOS Press, 2017.
“Hierarchical Navigation and Visual Search for Video Keyframe Retrieval”, in Advances in Multimedia Modeling, vol. 7131, Springer Berlin / Heidelberg, 2012, pp. 652-654.
“Rich Internet Application for Semi-automatic Annotation of Semantic Shots on Keyframes”, in Computational Intelligence for Multimedia Understanding, vol. 7242, Pisa, Italy: Springer-Verlag, 2012, pp. 172-182.
Conference Papers
“Seasonal Contrast: Unsupervised Pre-Training from Uncurated Remote Sensing Data”, Submitted.
“How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language”, in CVPR 2021, In Press.
“Can Everybody Sign Now? Exploring Sign Language Video Generation from 2D Poses”, in ECCV 2020 Workshop on Sign Language recognition, Production and Translation (SLRTP), 2020.
“Automatic Reminiscence Therapy for Dementia”, in ACM International Conference on Multimedia Retrieval (ICMR), Dublin, Ireland, 2020.
“Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills”, in International Conference on Machine Learning (ICML) 2020, 2020.
Theses
“Sports broadcasting and multiple object tracking with deep learning methods”, 2021.
“Image and Video Object Segmentation in Low Supervision Scenarios”, Universitat Politecnica de Catalunya, Barcelona, 2021.
“Deep Learning that Scales: Leveraging Compute and Data”, Universitat Politècnica de Catalunya, Barcelona, Catalonia, 2020.
“Computer Vision beyond the visible: Image understanding through language”, Universitat Politecnica de Catalunya, Barcelona, 2019.
“Visual Object Analysis using Regions and Local Features”, 2016.
Other
“Object Model Adaptation for Multiple Object Tracking”, 2021. MSc thesis.
“Disentangling neural network structure from the weights space”, 2021. Web article.
“2D to 3D body pose estimation for sign language with Deep Learning”, 2020. MSc thesis.
“Learn2Sign: sign language recognition and translation using human keypoint estimation and transformer model”, 2020. MSc thesis.
“Attention-based multi-view 3D reconstruction models”, 2020. MSc thesis.
Projects
A European AI On Demand Platform and Ecosystem | European | Jan 2019 | Dec 2021
MALEGRA - Multimodal Signal Processing and Machine Learning on Graphs | National | Jan 2017 | Jun 2021
Deep Learning for Video Analytics in Sport Events | National | Feb 2018 | Jan 2021
Deep Learning for 3D reconstruction and Simulation of Aesthetic Procedures | National | Jul 2017 | Jun 2020
Large Scale Video Tagging with Knowledge Bases | National | Jun 2017 | May 2020
Research Areas
Region-based image and video processing | Internal | Jan 1992 | Dec 2020
Lifelogging | Internal | Feb 2014 | Dec 2020
Affective Computing | Internal | Jan 2015 | Dec 2020
Deep learning | Internal | Jun 2014 | Dec 2020
Saliency prediction | Internal | Feb 2015 | Dec 2019
Demos and Resources
EgoMon Gaze & Video Dataset | Dataset | Jul 2016
UPC at CVPRW Visual Question Answering Challenge 2016 | Software | Jun 2016
UPC at CVPRW ActivityNet Challenge 2016 | Software | Jun 2016
Faster R-CNN Features for Instance Search (software) | Software | May 2016
Sentiment maps generator | Software | Apr 2016