SOLE is highly generalizable and can segment corresponding instances with various language instructions, including but not limited to visual questions, attributes description, and functional ...
Abstract: The majority of existing counting models are designed to operate on a singular object category, such as crowds or vehicles. The emergence of multi-modal foundational models, e.g., ...
Abstract: We introduce a new method that extracts knowledge from a large language model (LLM) to produce object-level plans, which describe high-level changes to object state, and uses them to ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results