A CLIP-based dual-stream method for text-based vehicle search
In this paper, we propose a novel framework for natural language-based tracked-vehicle retrieval built on the CLIP model, one of the most effective models for image-text matching. The framework leverages both appearance and motion information to improve the matching accuracy of vehicle tracklet retrieval.
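To make the dual-stream idea concrete, the following is a minimal sketch of how a text query might be scored against a tracklet using two streams. It is an illustration under stated assumptions, not the method proposed here: the weighted-sum fusion rule, the weight `alpha`, and the names `cosine`, `dual_stream_score`, and `rank_tracklets` are all hypothetical, and the embeddings are assumed to have been produced beforehand by CLIP-style text and image encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def dual_stream_score(text_emb, appearance_emb, motion_emb, alpha=0.5):
    """Fuse appearance and motion similarities with a weighted sum.

    Assumption: the weighted sum and `alpha` are illustrative choices only.
    """
    appearance_sim = cosine(text_emb, appearance_emb)
    motion_sim = cosine(text_emb, motion_emb)
    return alpha * appearance_sim + (1.0 - alpha) * motion_sim


def rank_tracklets(text_emb, tracklets, alpha=0.5):
    """Rank tracklets by fused score, best match first.

    `tracklets` maps a tracklet id to an (appearance_emb, motion_emb) pair.
    """
    scored = [
        (tracklet_id, dual_stream_score(text_emb, app, mot, alpha))
        for tracklet_id, (app, mot) in tracklets.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy example with hand-written 2-D embeddings (not real CLIP features).
query = [1.0, 0.0]
gallery = {
    "a": ([1.0, 0.0], [1.0, 0.0]),  # both streams agree with the query
    "b": ([0.0, 1.0], [1.0, 0.0]),  # only the motion stream agrees
}
ranking = rank_tracklets(query, gallery)
```

In this toy gallery, tracklet "a" ranks above "b" because both of its streams align with the query, while "b" matches on motion alone.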