Active Vision


Active vision, sometimes also called active computer vision, is an area of computer vision. An active vision system is one that can manipulate the viewpoint of its camera(s) in order to investigate the environment and obtain better information from it.[1][2][3][4]
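The core idea of manipulating the viewpoint to gather better information can be illustrated with a minimal control loop. The sketch below, using an illustrative camera model (the function names, field-of-view values, and proportional gain are assumptions, not a real camera API), computes pan/tilt corrections that re-centre a tracked target in the image:

```python
# A minimal, illustrative sketch of viewpoint control in an active vision
# system: given the pixel position of a tracked target, compute pan/tilt
# corrections that move the target toward the image centre.
# All parameters here are hypothetical placeholders.

def centering_correction(target_px, image_size, fov_deg, gain=0.5):
    """Return (pan, tilt) corrections in degrees, via a simple
    proportional controller on the angular error."""
    (u, v), (w, h) = target_px, image_size
    fov_x, fov_y = fov_deg
    # Pixel offset from the image centre, scaled to an angular error.
    pan_err = (u - w / 2) / w * fov_x
    tilt_err = (v - h / 2) / h * fov_y
    return gain * pan_err, gain * tilt_err

# A target already at the image centre needs no correction:
print(centering_correction((320, 240), (640, 480), (60, 45)))  # (0.0, 0.0)
```

In a real system this correction would be sent to the pan-tilt unit each frame, closing the perception-action loop that distinguishes active from passive vision.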


Background

Interest in active camera systems began in the late 1980s, when Aloimonos et al. introduced the first general framework for active vision in order to improve the perceptual quality of tracking results.[5] Active vision is particularly important for coping with problems like occlusions, limited field of view and limited resolution of the camera.[6] Other advantages include reducing the motion blur of a moving object[7] and enhancing the depth perception of an object by focusing two cameras on it or by moving the cameras.[3] Active control of the camera viewpoint also helps in focusing computational resources on the relevant elements of the scene.[8] In this selective aspect, active vision can be seen as closely related to (overt and covert) visual attention in biological organisms, which has been shown to enhance the perception of selected parts of the visual field. This selective aspect of human (active) vision can be related to the foveal structure of the human eye,[9][10] in which more than 50% of the colour receptors are located in about 5% of the retina.
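The foveal idea of dense sampling at the centre and sparse sampling in the periphery is often modelled with a log-polar layout, as in the log-polar mapping work cited above. The sketch below is only an illustration of that sampling pattern (the function and its parameters are assumptions, not any cited paper's method):

```python
import numpy as np

# An illustrative sketch of foveated (log-polar) sampling: ring radii grow
# exponentially outward, so sample density is high near the centre of gaze
# and low in the periphery, loosely mimicking the retina's foveal layout.

def log_polar_samples(center, r_min=1.0, r_max=100.0, n_rings=16, n_wedges=32):
    """Return an (n_rings, n_wedges, 2) array of sample positions
    arranged on exponentially spaced rings around `center`."""
    cx, cy = center
    radii = np.geomspace(r_min, r_max, n_rings)       # exponential spacing
    angles = np.linspace(0, 2 * np.pi, n_wedges, endpoint=False)
    xs = cx + radii[:, None] * np.cos(angles)
    ys = cy + radii[:, None] * np.sin(angles)
    return np.stack([xs, ys], axis=-1)

pts = log_polar_samples((0.0, 0.0))
print(pts.shape)  # (16, 32, 2)
```

Because every ring carries the same number of samples, the inner rings cover their small area far more densely than the outer ones, which is the computational benefit the selective-attention argument appeals to.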

It has also been suggested that visual attention and the selective aspect of active camera control can help in other tasks, such as learning more robust models of objects and environments with fewer labeled samples, or autonomously.[4][11][12]



The Autonomous Camera Approach

Autonomous cameras are cameras that can direct themselves in their environment. There has been some recent work using this approach. In work by Denzler et al., the motion of a tracked object is modeled using a Kalman filter, and the focal length that minimizes the uncertainty in the state estimates is selected at each step; a stereo set-up with two zoom cameras was used. A handful of papers have addressed zoom control but do not deal with estimating the full object-camera geometry. An attempt to join estimation and control in the same framework can be found in the work of Bagdanov et al., where a Pan-Tilt-Zoom camera is used to track faces.[13] Both the estimation and control models used are ad hoc, and the estimation approach is based on image features rather than 3D properties of the target being tracked.[14]
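The uncertainty-minimizing zoom selection described above can be sketched in a few lines. This is a hedged, one-dimensional illustration of the idea, not the method of Denzler et al.: the measurement-noise and field-of-view models below are illustrative assumptions, and the real work uses an information-theoretic criterion with a full 3D stereo set-up.

```python
# Illustrative sketch: pick the focal length whose measurement would most
# reduce a scalar Kalman filter's state uncertainty, while keeping the
# predicted target position inside the (narrower) zoomed field of view.
# Noise model, sensor width, and depth are hypothetical placeholders.

def posterior_variance(p_pred, r_meas):
    """Scalar Kalman update: state variance after fusing one measurement."""
    return p_pred * r_meas / (p_pred + r_meas)

def select_focal_length(x_pred, p_pred, focal_lengths,
                        sensor_width=0.036, pixel_noise=1e-4, depth=10.0):
    """Choose the focal length (metres) minimizing posterior variance,
    subject to the predicted target staying inside the field of view."""
    best_f, best_var = None, float("inf")
    for f in focal_lengths:
        half_view = 0.5 * sensor_width / f * depth  # half-width of view at depth
        if abs(x_pred) >= half_view:                # target would leave the image
            continue
        r = (pixel_noise * depth / f) ** 2          # zooming in cuts metric noise
        var = posterior_variance(p_pred, r)
        if var < best_var:
            best_f, best_var = f, var
    return best_f, best_var
```

Under this toy model, a target predicted near the optical axis gets the longest admissible focal length (maximum zoom), while a target near the edge of the view forces a shorter one, capturing the trade-off the approach exploits.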

The Master/Slave Approach

In a master/slave configuration, a supervising static camera is used to monitor a wide field of view and to track every moving target of interest. The position of each of these targets over time is then provided to a foveal camera, which tries to observe the targets at a higher resolution. Both the static and the active cameras are calibrated to a common reference, so that data coming from one of them can be easily projected onto the other, in order to coordinate the control of the active sensors. Another possible use of the master/slave approach consists of a static (master) camera extracting visual features of an object of interest, while the active (slave) sensor uses these features to detect the desired object without the need for any training data.[14]
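The hand-off from master to slave camera can be sketched under the common-reference assumption described above. In this minimal illustration the two cameras share a ground plane: the static camera maps a target's pixel position to ground coordinates via a homography, and the active camera converts that point into pan/tilt angles. The matrices and poses here are placeholders, not a real calibration.

```python
import numpy as np

# Illustrative master/slave hand-off, assuming both cameras are calibrated
# to a common ground plane. H and the slave pose are hypothetical values.

def pixel_to_ground(H, pixel):
    """Map a master-camera pixel to ground-plane coordinates via the
    image-to-ground homography H (projective division included)."""
    u, v = pixel
    x, y, w = H @ np.array([u, v, 1.0])
    return np.array([x / w, y / w])

def pan_tilt_to(point_xy, slave_pos, slave_height=3.0):
    """Pan/tilt (radians) that aim the slave camera, mounted at
    `slave_height` above `slave_pos`, at a ground-plane point."""
    dx, dy = point_xy - slave_pos
    pan = np.arctan2(dy, dx)
    tilt = -np.arctan2(slave_height, np.hypot(dx, dy))  # look down at point
    return pan, tilt

H = np.eye(3)                                  # placeholder calibration
ground = pixel_to_ground(H, (10.0, 0.0))       # master detection -> world
pan, tilt = pan_tilt_to(ground, slave_pos=np.array([0.0, 0.0]))
```

In practice the homography (or a full calibration to a shared 3D frame) is what lets detections from the wide-angle master be "easily projected" into commands for the foveal slave.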



Implementations of various active vision systems are available for download from the Active Vision Laboratory at the University of Oxford: http://www.robots.ox.ac.uk/ActiveVision/Downloads/index.html

References
http://axiom.anu.edu.au/~rsl/rsl_active.html
Ballard, D. H., "Animate vision," Artificial Intelligence 48, 57-86, 1991.
Aloimonos, J., Weiss, I., Bandopadhay, A., "Active vision," International Journal of Computer Vision 1(4), 333-356, 1988.
Ognibene, D., Baldassarre, G., "Ecological active vision: four bio-inspired principles to integrate bottom-up and adaptive top-down attention tested with a simple camera-arm robot," IEEE Transactions on Autonomous Mental Development, 2014.
Aloimonos, J., Weiss, I., Bandyopadhyay, A., "Active vision," International Journal of Computer Vision 1(4), 333-356, 1988.
Denzler, J., Zobel, M., Niemann, H., "Information theoretic focal length selection for real-time active 3D object tracking," Proceedings of the Ninth IEEE International Conference on Computer Vision, vol. 1, 400-407, 2003.
"Control of a camera for active vision: foveal vision, smooth tracking and saccade," International Journal of Computer Vision 39, 81-96, 2000.
Tatler, B. W., Hayhoe, M. M., Land, M. F., Ballard, D. H., "Eye guidance in natural vision: reinterpreting salience," Journal of Vision 11, 1-23, 2011.
Findlay, J. M., Gilchrist, I. D., Active Vision: The Psychology of Looking and Seeing, Oxford University Press, 2003.
Tistarelli, M., Sandini, G., "On the advantages of polar and log-polar mapping for direct estimation of time-to-impact from optical flow," IEEE Transactions on Pattern Analysis and Machine Intelligence 15, 401-410, 1993.

