From Virtual to Real datasets

Resource Type | Date |
---|---|
Results | 2025-09-10 |
Description
Abstract:
Cooperative perception (CP) has emerged as a promising approach for autonomous driving, allowing connected autonomous vehicles (CAVs) to mitigate occlusions and sensor constraints by sharing sensor data. This collaborative framework improves situational awareness and contributes to road safety. Achieving robust and highly reliable CP performance requires training models on diverse traffic scenarios, covering rare and challenging traffic dynamics. However, large-scale real-world datasets remain limited in CP due to the high costs of multi-agent deployments and labour-intensive labelling. While synthetic datasets offer a practical alternative, providing scalable data generation and accurate annotations, models trained solely on virtual data often struggle to generalize to real-world scenarios due to the domain gap. In this work, we explore the utility of synthetic data in training processes to enhance real-world performance, thereby reducing the effects of domain discrepancies. We benchmark two state-of-the-art intermediate-fusion 3D LiDAR-based object detectors on synthetic and real-world datasets, under vehicle-to-vehicle (V2V) communication settings. Three training strategies are compared: training from scratch, synthetic-to-real transfer, and mixed-dataset training with varying synthetic–real ratios. Our results reveal that pre-training on synthetic data before fine-tuning on real datasets delivers the strongest real-world gains (+2% AP@0.5, +2–4% AP@0.7 on car detection), while mixed training enhances cross-domain generalization, especially with larger synthetic contributions. These findings highlight that virtual data, rather than merely replacing real data, can serve as an effective complementary source in developing reliable CP systems.
This research report provides clear evidence that synthetic data can play a valuable, though limited, role in advancing cooperative LiDAR-based object detection in real environments.
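As a rough illustration of the mixed-dataset strategy mentioned in the abstract, the Python sketch below builds a training list with a chosen synthetic share. It is only a minimal sketch under stated assumptions: the function name `mix_datasets`, its parameters, and the toy sample lists are hypothetical and are not taken from the report, whose actual sampling procedure may differ.

```python
import random


def mix_datasets(real, synthetic, synthetic_ratio, total, seed=0):
    """Return a shuffled list of `total` samples.

    About `synthetic_ratio` of the samples come from `synthetic`,
    the rest from `real`. All names here are hypothetical.
    """
    if not 0.0 <= synthetic_ratio <= 1.0:
        raise ValueError("synthetic_ratio must be in [0, 1]")

    # Split the sample budget between the two sources.
    n_syn = round(total * synthetic_ratio)
    n_real = total - n_syn
    if n_syn > len(synthetic) or n_real > len(real):
        raise ValueError("not enough samples for the requested mix")

    # Sample without replacement from each source, then shuffle together.
    rng = random.Random(seed)
    mixed = rng.sample(synthetic, n_syn) + rng.sample(real, n_real)
    rng.shuffle(mixed)
    return mixed


# Toy stand-ins for real and synthetic frames (placeholders, not report data).
real_frames = [("real", i) for i in range(100)]
synthetic_frames = [("synthetic", i) for i in range(100)]

# Example: a training list in which 75% of the samples are synthetic.
train_list = mix_datasets(real_frames, synthetic_frames, synthetic_ratio=0.75, total=40)
```

Sweeping `synthetic_ratio` over several values would give one training set per synthetic–real ratio, which is the kind of comparison the abstract describes for mixed training.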
GPI Research Report 2025-09.pdf