Introduced in the paper "Roboflow 100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models", RF100-VL is a large-scale collection of 100 multi-modal datasets with diverse concepts ...
Abstract: Enlarging input images is a straightforward and effective approach to promote small object detection. However, simple image enlargement is significantly expensive on both computations and ...
Abstract: Cross-modality can integrate complementary information from different modalities to improve the reliability and robustness of object detection effectively. However, compared to processing ...