A method for estimating mobile robot ego-motion is presented, which relies on tracking contours in real-time images acquired with a calibrated monocular video system. After fitting an active contour to an object in the image, 3D motion is derived from the affine deformations suffered by the contour in an image sequence. More than one object can be tracked at the same time, yielding some different pose estimations. Then, improvements in pose determination are achieved by fusing all these different estimations. Inertial information is used to obtain better estimates, as it introduces in the tracking algorithm a measure of the real velocity. Inertial information is also used to eliminate some ambiguities arising from the use of a monocular image sequence. As the algorithms developed are intended to be used in real-time control systems, considerations on computation costs are taken into account.