Visual tracking of multiple persons simultaneously is an important tool for group behaviour analysis. In this paper, we demonstrate that multi-target tracking in a network of non-overlapping cameras can be formulated in a single framework in which the association among all given target hypotheses, both within and between cameras, is performed simultaneously. Our approach helps to overcome the fragility of multi-camera tracking, whose performance otherwise relies on the single-camera tracking results supplied at the input level. In particular, we formulate the estimation of the target states as a multi-state graph optimization problem, in which the likelihood of each target hypothesis belonging to different identities is modeled. In addition, we learn a target-specific model to improve the appearance-based similarity measure among targets. We also handle occluded targets, for which there is no reliable evidence of the target's presence and whose trajectories are expected to be fragmented into multiple tracks. An iterative procedure is proposed to solve the optimization problem, resulting in final trajectories that reveal the true states of the targets. The performance of the proposed approach has been extensively evaluated on challenging multi-camera non-overlapping tracking data sets, in which many difficulties, such as occlusion and variation in viewpoint and illumination, are present. The results of systematic experiments conducted on a large set of sequences show that the proposed approach outperforms several state-of-the-art trackers.
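The idea of associating fragmented tracks within and between cameras in one pass can be illustrated with a small sketch. This is not the multi-state graph optimization proposed in the paper: it substitutes a simple greedy grouping by appearance similarity, and the `Tracklet` fields, the cosine similarity measure, and the `min_sim` threshold are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from itertools import combinations
from math import sqrt


@dataclass(frozen=True)
class Tracklet:
    """Hypothetical track fragment: id, camera, time span, appearance vector."""
    tid: str
    camera: int
    start: float
    end: float
    feature: tuple


def cosine(a, b):
    """Cosine similarity between two appearance vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def overlaps(a, b):
    """True if two tracklets share any time, so they cannot be one person."""
    return a.start < b.end and b.start < a.end


def associate(tracklets, min_sim=0.8):
    """Greedily group tracklets into identities, regardless of camera.

    Pairs are visited from most to least similar; two groups are merged
    only if no tracklet in one overlaps in time with a tracklet in the other.
    """
    group_of = {t.tid: i for i, t in enumerate(tracklets)}
    groups = {i: [t] for i, t in enumerate(tracklets)}
    pairs = sorted(
        ((cosine(a.feature, b.feature), a, b)
         for a, b in combinations(tracklets, 2)),
        key=lambda p: p[0],
        reverse=True,
    )
    for sim, a, b in pairs:
        if sim < min_sim:
            break  # remaining pairs are even less similar
        ga, gb = group_of[a.tid], group_of[b.tid]
        if ga == gb:
            continue
        if any(overlaps(x, y) for x in groups[ga] for y in groups[gb]):
            continue  # temporal conflict: cannot be the same identity
        groups[ga].extend(groups[gb])
        for t in groups.pop(gb):
            group_of[t.tid] = ga
    return [sorted(t.tid for t in g) for g in groups.values()]
```

Because every pair of tracklets is considered in the same loop, within-camera and between-camera links compete on equal terms, which is the property the abstract emphasizes. A greedy pass like this has no way to revise an early mistake, which is one reason a joint optimization is attractive.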
|Number of pages|16|
|Journal|IEEE Transactions on Circuits and Systems for Video Technology|
|Publication status|Published - 1 Dec 2018|