Language and Vision

Type: Internal
Start: Feb 2016
End: Dec 2021
Responsible: Xavier Giró-i-Nieto


Bellver-Bueno M, Ventura C, Silberer C, Kazakos I, Torres J, Giró-i-Nieto X. RefVOS: A Closer Look at Referring Expressions for Video Object Segmentation. Multimedia Tools and Applications. 2022.
Kazakos I, Bellver-Bueno M, Ventura C, Silberer C, Giró-i-Nieto X. SynthRef: Generation of Synthetic Referring Expressions for Object Segmentation. In: NAACL Visually Grounded Interaction and Language (ViGIL) Workshop. Virtual; 2021.
Bellver M. Image and Video Object Segmentation in Low Supervision Scenarios. Advisors: Torres J, Giró-i-Nieto X. PhD thesis. Computer Architectures; 2021.
Oriol B, Luque J, Diego F, Giró-i-Nieto X. Transcription-Enriched Joint Embeddings for Spoken Descriptions of Images and Videos. In: CVPR 2020 Workshop on Egocentric Perception, Interaction and Computing. Seattle, WA, USA: arXiv; 2020.
Giró-i-Nieto X. One Perceptron to Rule Them All: Language, Vision, Audio and Speech (tutorial). In: ACM International Conference on Multimedia Retrieval (ICMR) 2020. Dublin, Ireland: ACM; 2020.
Herrera-Palacio A, Ventura C, Giró-i-Nieto X. Video Object Linguistic Grounding. In: ACM Multimedia Workshop on Multimodal Understanding and Learning for Embodied Applications (MULEA). Nice, France: ACM; 2019.
Herrera-Palacio A, Ventura C, Silberer C, Sorodoc I-T, Boleda G, Giró-i-Nieto X. Recurrent Instance Segmentation using Sequences of Referring Expressions. In: NeurIPS Workshop on Visually Grounded Interaction and Language (ViGIL). Vancouver, Canada; 2019.
Oriol B, Canton-Ferrer C, Giró-i-Nieto X. Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation. In: NeurIPS 2019 Workshop on AI for Social Good. Vancouver, Canada; 2019.
Salvador A, Drozdzal M, Giró-i-Nieto X, Romero A. Inverse Cooking: Recipe Generation from Food Images. In: CVPR. Long Beach, CA, USA: OpenCVF / IEEE; 2019.
Salvador A. Computer Vision beyond the visible: Image understanding through language. Advisors: Giró-i-Nieto X, Marqués F. PhD thesis. Signal Theory and Communications; 2019.
