CapSal: Leveraging Captioning to Boost Semantics for Salient Object Detection

Authors Lu Zhang, Jianming Zhang, Zhe Lin, Huchuan Lu, You He
Journal/Conference Name CVPR 2019
Paper Abstract Detecting salient objects in cluttered scenes is a big challenge. To address this problem, we argue that the model needs to learn discriminative semantic features for salient objects. To this end, we propose to leverage captioning as an auxiliary semantic task to boost salient object detection in complex scenarios. Specifically, we develop a CapSal model which consists of two sub-networks, the Image Captioning Network (ICN) and the Local-Global Perception Network (LGPN). ICN encodes the embedding of a generated caption to capture the semantic information of major objects in the scene, while LGPN incorporates the captioning embedding with local-global visual contexts for predicting the saliency map. ICN and LGPN are jointly trained to model high-level semantics as well as visual saliency. Extensive experiments demonstrate the effectiveness of image captioning in boosting the performance of salient object detection. In particular, our model performs significantly better than the state-of-the-art methods on several challenging datasets of complex scenarios.
Date of publication 2019
Code Programming Language Python
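
The abstract describes LGPN as combining a caption embedding with local and global visual context to predict a saliency map. As a rough illustration of that idea only, here is a toy sketch in plain Python. It is not the authors' architecture: the names `mean_pool` and `predict_saliency`, the mean pooling used as a stand-in for the ICN caption encoder and for global context, and the single linear layer with a sigmoid are all assumptions made for this example. It also omits the joint training of ICN and LGPN.

```python
import math


def mean_pool(vectors):
    """Average a list of equal-length vectors into a single vector."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def sigmoid(x):
    """Map a real-valued score to the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_saliency(feature_map, caption_word_embeddings, weights, bias=0.0):
    """Toy fusion of local features, global context and a caption embedding.

    feature_map: H x W grid, each cell a list of C floats (local features).
    caption_word_embeddings: list of word vectors, each of length D.
    weights: 2 * C + D floats for the (hypothetical) linear scoring layer.
    Returns an H x W grid of saliency values in (0, 1).
    """
    # Assumption: the caption embedding is the mean of its word vectors.
    caption_embedding = mean_pool(caption_word_embeddings)
    # Assumption: global context is the mean of all local feature vectors.
    all_pixels = [pixel for row in feature_map for pixel in row]
    global_context = mean_pool(all_pixels)

    saliency = []
    for row in feature_map:
        out_row = []
        for local in row:
            # Concatenate local, global and caption features for this pixel.
            fused = local + global_context + caption_embedding
            score = sum(w * f for w, f in zip(weights, fused)) + bias
            out_row.append(sigmoid(score))
        saliency.append(out_row)
    return saliency


# Example input: a 1 x 2 feature map (C = 2) and a two-word caption (D = 1).
example_map = [[[0.0, 0.0], [1.0, 1.0]]]
example_caption = [[0.2], [0.4]]
example_weights = [1.0, 1.0, 0.5, 0.5, 1.0]
example_saliency = predict_saliency(example_map, example_caption, example_weights)
```

In this sketch the caption embedding shifts every pixel's score by the same amount, so it acts as a scene-level prior. A real model would let the caption features interact with the spatial features in a learned way.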
