Abstract
In this paper, we introduce Recipe1M, a new large-scale, structured corpus of over 1 million cooking recipes and 800,000 food images. As the largest publicly available collection of recipe data, Recipe1M affords the ability to train high-capacity models on aligned, multi-modal data. Using these data, we train a neural network to find a joint embedding of recipes and images that yields impressive results on an image-recipe retrieval task. Additionally, we demonstrate that regularization via the addition of a high-level classification objective both improves retrieval performance to rival that of humans and enables semantic vector arithmetic. We postulate that these embeddings will provide a basis for further exploration of the Recipe1M dataset and food and cooking in general.
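To illustrate the general idea of a joint embedding with an auxiliary classification objective, the sketch below shows one possible setup in PyTorch. It is a minimal, hypothetical example, not the authors' exact architecture or training recipe: two encoders project pre-extracted image and recipe features into a shared space, a cosine-similarity loss aligns matching pairs, and a shared semantic-class head acts as the high-level classification regularizer. The feature dimensions, class count, loss form, and weighting factor are all placeholder assumptions.

```python
# Minimal sketch of a joint image-recipe embedding with a classification
# regularizer. Dimensions, class count, and loss weights are illustrative
# assumptions, not values taken from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class JointEmbedding(nn.Module):
    def __init__(self, img_dim=2048, rec_dim=1024, emb_dim=1024, num_classes=1000):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, emb_dim)        # image branch
        self.rec_proj = nn.Linear(rec_dim, emb_dim)        # recipe branch
        self.classifier = nn.Linear(emb_dim, num_classes)  # shared semantic head

    def forward(self, img_feat, rec_feat):
        # L2-normalized embeddings so that retrieval reduces to cosine similarity.
        img_emb = F.normalize(torch.tanh(self.img_proj(img_feat)), dim=-1)
        rec_emb = F.normalize(torch.tanh(self.rec_proj(rec_feat)), dim=-1)
        return img_emb, rec_emb

def loss_fn(model, img_emb, rec_emb, sem_labels, margin=0.3, lam=0.02):
    # Retrieval term: pull matching image-recipe pairs together, push
    # shuffled (negative) pairs apart with a cosine embedding loss.
    target = torch.ones(img_emb.size(0), device=img_emb.device)
    pos = F.cosine_embedding_loss(img_emb, rec_emb, target)
    neg = F.cosine_embedding_loss(
        img_emb, rec_emb[torch.randperm(rec_emb.size(0))], -target, margin=margin
    )
    # Regularization term: predict a high-level semantic class from both
    # modalities with the same classifier, encouraging a semantically
    # structured shared space.
    cls = (F.cross_entropy(model.classifier(img_emb), sem_labels)
           + F.cross_entropy(model.classifier(rec_emb), sem_labels))
    return pos + neg + lam * cls
```

A shared classification head of this kind is one simple way to encourage the semantic structure that makes vector arithmetic in the embedding space meaningful; the published model may differ in both architecture and loss.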
In the news:
- MIT News
- BBC
- Wired
- Gizmodo
- CNET
- TechCrunch
- Digital Trends
- VilaWeb (in Catalan)
- La Vanguardia - interview [registered users] (in Catalan)
- La Vanguardia (in Spanish)
- Metro
- Daily Mail
- The Verge
- ZDNet
- British Telecom (BT)
- BGR
- Grub Street
- Popular Mechanics
- Photoxels
- The Indian Express
- Spiegel Online (in German)
- SRF (in German)
- Engadget (in German)
- Tproger (in Russian)
- Libero (in Italian)
- Repubblica (in Italian)
- Fresh Gadgets (in Dutch)
- Noizz (in Romanian)
- Xataka (in Spanish)
- Andro4all (in Spanish)
- MediaTrends (in Spanish)
- Hoyentec (in Spanish)