Python Language Images Encapsulation

ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models

Abstract: Despite the recent breakthroughs achieved by Large Vision Language Models (LVLMs) in understanding and responding to complex visual-textual contexts, their inherent hallucination tendencies ...

IEEE

GeoPix: A multimodal large language model for pixel-level image understanding in remote sensing

Abstract: Multimodal (MM) large language models (MLLMs) have achieved remarkable success in image- and region-level remote sensing (RS) image understanding tasks, such as image captioning (IC), visual ...


