For robots to function successfully in human environments, fast and accurate recognition and verification of objects are essential. A single view is often not sufficient to classify an object accurately, especially if it is occluded or appears in a cluttered scene. Actively exploring an object by changing viewpoints can increase classification accuracy. This talk will present an efficient feature-based active vision system for the verification of objects that are occluded or appear in cluttered scenes. The system is designed using a selector-observer framework, in which the selector performs automatic view selection. A new method for automatically selecting the 'next best viewpoint' using vocabulary trees is presented, and a Bayesian 'observer' updates the belief hypothesis for object verification and provides feedback. The vocabulary tree also speeds up the matching process. New images are captured at the 'next best viewpoint' only when the belief hypothesis of an object is below some pre-defined threshold. The system is shown to be more accurate than randomly selecting the next viewpoint. It is also faster because only necessary views are captured and processed.
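
The selector-observer loop described above can be illustrated with a minimal Python sketch. This is not the system presented in the talk. It assumes a binary hypothesis (object present or absent), and it replaces the vocabulary-tree viewpoint scoring with a precomputed dictionary of hypothetical per-view scores. The names `bayes_update`, `verify`, `view_scores`, and `observe` are all illustrative assumptions.

```python
from typing import Callable, Dict, List, Tuple


def bayes_update(prior: float, p_obs_given_obj: float, p_obs_given_not: float) -> float:
    """Posterior P(object | observation) for a binary hypothesis (Bayes' rule)."""
    numerator = p_obs_given_obj * prior
    evidence = numerator + p_obs_given_not * (1.0 - prior)
    # If the observation is impossible under both hypotheses, keep the prior.
    return numerator / evidence if evidence > 0 else prior


def verify(
    view_scores: Dict[str, float],
    observe: Callable[[str], Tuple[float, float]],
    start_view: str,
    prior: float = 0.5,
    threshold: float = 0.95,
) -> Tuple[float, List[str]]:
    """Capture views until the belief reaches the threshold or views run out.

    view_scores: hypothetical stand-in for the vocabulary-tree ranking;
                 a higher score means a more informative viewpoint.
    observe:     hypothetical callback returning the pair
                 (P(obs | object), P(obs | not object)) for a given view.
    """
    remaining = dict(view_scores)
    remaining.pop(start_view, None)
    visited = [start_view]
    # Observer: update the belief from the initial view.
    belief = bayes_update(prior, *observe(start_view))

    # A new image is captured only while the belief is below the threshold.
    while belief < threshold and remaining:
        # Selector: pick the highest-scoring unvisited view as the next best viewpoint.
        next_view = max(remaining, key=remaining.get)
        del remaining[next_view]
        visited.append(next_view)
        # Observer: fold the new observation into the belief.
        belief = bayes_update(belief, *observe(next_view))

    return belief, visited
```

In this sketch the loop stops as soon as the belief crosses the threshold, which is why an informative ordering of views can reduce the number of images that must be captured and processed.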