Abstract: Open-vocabulary semantic segmentation aims to partition an image into distinct semantic regions based on an open set of categories. Existing approaches primarily rely on image-level ...