Overview
BoxMOT is a Python package that provides a modular architecture for multi-object tracking (MOT). It integrates with a range of object detection, segmentation, and pose estimation models, and lets users swap between state-of-the-art tracking algorithms with minimal changes. Its main strengths are a pluggable architecture, broad model support, and benchmark-ready local evaluation pipelines for datasets such as MOT17, MOT20, and DanceTrack. Two performance modes are available: motion-only, for lightweight, CPU-efficient tracking, and motion + appearance, which combines motion cues with appearance embeddings (CLIPReID, LightMBN, OSNet) to improve identity consistency and accuracy. Detections and embeddings can be saved and reused across evaluations, which eliminates redundant preprocessing. BoxMOT also provides a command-line interface (CLI) with a simple syntax for tracking objects, evaluating performance, tuning hyperparameters, generating tracking data, and exporting models.
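
To make the difference between the two performance modes concrete, the following is a minimal, illustrative sketch of how a track-to-detection matching cost might be computed in each mode. It is not BoxMOT's API or any tracker's actual formulation: the function names, the `appearance_weight` parameter, and the simple linear blend of IoU cost and cosine distance are all assumptions made for this example. Real trackers typically add motion prediction, gating, and an assignment solver on top of a cost like this.

```python
import math


def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def cosine_distance(u, v):
    """1 - cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / norm if norm > 0 else 1.0


def association_cost(track_box, det_box, track_emb=None, det_emb=None,
                     appearance_weight=0.5):
    """Cost of matching a track to a detection (lower is better).

    Motion-only mode: no embeddings are given, so the cost is 1 - IoU.
    Motion + appearance mode: blend 1 - IoU with the cosine distance
    between embeddings. `appearance_weight` is a hypothetical parameter
    for this sketch, not a BoxMOT setting.
    """
    motion_cost = 1.0 - iou(track_box, det_box)
    if track_emb is None or det_emb is None:
        return motion_cost
    appearance_cost = cosine_distance(track_emb, det_emb)
    return ((1.0 - appearance_weight) * motion_cost
            + appearance_weight * appearance_cost)


# Usage: the same pair of boxes scored in both modes.
track_box, det_box = (0, 0, 10, 10), (5, 0, 15, 10)
motion_only = association_cost(track_box, det_box)
same_identity = association_cost(track_box, det_box, [1.0, 0.0], [1.0, 0.0])
different_identity = association_cost(track_box, det_box, [1.0, 0.0], [0.0, 1.0])
```

In this sketch, a matching embedding lowers the cost below the motion-only value and a dissimilar embedding raises it, which is the intuition behind using appearance features to keep identities consistent when boxes overlap only partially. The motion-only path skips the embedding computation entirely, which is why it is the cheaper option on CPU.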