Overview of Robotic Vision – Object Tracking and Image Processing Software

Robotic vision covers a range of methods for processing, analyzing, and understanding images. These methods produce information that is translated into decisions for robots. From image capture to the robot's final decision, a wide range of technologies and algorithms work together like a committee of filters and decision makers.

Objects come in different colors and sizes. A robotic vision system has to distinguish between objects and, in almost all cases, has to track them. Applied to real-world robotic applications, these machine vision systems are designed to duplicate the abilities of the human vision system using programming code and electronic parts. Just as human eyes can detect and track many objects at the same time, robotic vision systems appear to be overcoming the difficulty of detecting and tracking many objects at once.

Robotic vision systems find their place in many fields, from industry to service robotics. Whether they are used for identification or navigation, these systems are under continuous improvement, with new features like 3D support, filtering, or detection of the light intensity applied to an object.

Applications and benefits for robotic vision systems used in industry or for service robots:

  • process automation;
  • object detection;
  • estimation by counting any type of moving object;
  • security and surveillance applications;
  • inspection, to remove parts with defects;
  • defense applications;
  • navigation for autonomous vehicles and mobile robots;
  • human-computer interaction.

In this article, I give an overview of the vision tools and libraries used for machine vision, as well as the most common vision sensors engineers use to apply machine vision to real-world robots.

Object tracking software

A tracking system has a well-defined role: to observe persons or objects while they are moving. In addition, tracking software is capable of predicting the direction of motion and recognizing the objects or persons.
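As a rough illustration of how tracking software can predict the direction of motion, the sketch below extrapolates an object's next position from its last two observed centroids. It assumes constant velocity between frames, which is a simplification and not the method of any specific tool listed here. The function name `predict_next` and the sample coordinates are hypothetical.

```python
import math


def predict_next(track):
    """Predict the next (x, y) centroid from the last two observations.

    Assumes constant velocity between consecutive frames
    (an illustrative simplification).
    """
    if len(track) < 2:
        raise ValueError("need at least two observations")
    (x0, y0), (x1, y1) = track[-2], track[-1]
    vx, vy = x1 - x0, y1 - y0  # displacement per frame
    heading = math.degrees(math.atan2(vy, vx))  # direction of motion in degrees
    return (x1 + vx, y1 + vy), heading


# Hypothetical centroids of one object over three frames
track = [(10, 20), (14, 23), (18, 26)]
next_pos, heading = predict_next(track)
print(next_pos)  # (22, 29)
```

Real trackers usually smooth the estimate over many frames (for example with a Kalman filter) instead of trusting only the last two positions.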


OpenCV

OpenCV is the most popular and widely used open-source machine vision library, with comprehensive documentation. Covering image processing, 3D vision, tracking, fitting, and many other features, the library includes more than 2,500 algorithms. It has interfaces for C++, C, Python, and Java (in progress), and runs on Windows, Linux, Android, and Mac operating systems.


SwisTrack

Used for object tracking and recognition, SwisTrack is one of the most advanced tools for machine vision applications. This tracking tool requires only a video camera to track objects in a wide range of situations. Internally, SwisTrack has a flexible architecture and uses the OpenCV library. This flexibility opens the door to implementing new components that meet the user's requirements.


Skilligent

With a video camera and Skilligent, you can build one of the most powerful object tracking and recognition systems, with a wide range of applications in robotics. Skilligent is based on an algorithm that handles situations such as changing light intensity, camera distortion, the image stabilization needed when the robot is moving, and changes in shooting angle of up to 30-45 degrees. This performance comes at a price: the algorithm needs a database of stored objects to compare against the detected object.

The computer vision software features

  • tolerates changes in lighting;
  • tolerates changes in view angle of up to 30-40 degrees;
  • compensates for camera lens distortions.

SRI Stereo Engine

This package of algorithms was built to run efficiently on a wide range of platforms under Linux or MS Windows. Stereo Engine offers support for 3D images, filtering, and camera calibration.


PTAM

PTAM (Parallel Tracking and Mapping) is a camera tracking system used in augmented reality applications. It works without markers, templates, or inertial sensors. The system supports Linux, OS X, and Win32 operating systems.

TLD (Tracking-Learning-Detection)

After extensive research in machine vision, Dr. Zdenek Kalal developed the TLD algorithm, designed for applications such as tracking and object detection, as well as learning systems used in robotics. The focus is on real-time tracking of objects that are dynamically selected and marked in the video images. An interesting feature is that the algorithm remembers each object, so if one or more objects reappear in the images they can be tracked again.


ARToolKit

By calculating the position and orientation of the camera relative to markers, ARToolKit can track objects. This tracking system was developed for augmented reality applications and is also applicable to machine vision. The toolkit tracks objects using simple black squares and can work with patterns. With support for SGI IRIX, Linux, MacOS, and Windows, the system works very fast and can be calibrated easily.

ARToolKit features

  • uses only one camera for tracking;
  • tracking code uses simple black squares;
  • any square marker pattern can be used;
  • camera calibration requires little effort;
  • tracks objects in real time.

CCTV Object Tracking

It is not easy for human eyes to do visual monitoring by tracking objects in areas such as surveillance, industrial automation, and many other domains where object tracking is required. CCTV object tracking is a powerful tool based on complex algorithms, with support for detection, counting, analysis, and tracking of activities.

The algorithms identify sets of object tracks by selecting regions defined by motion in both time and space. There are specific algorithms, each with a well-defined task. One algorithm, for example, identifies objects in motion using the recording time and a series of images.
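To make the idea of finding moving objects from a series of images more concrete, here is a minimal sketch of frame differencing: pixels whose brightness changes by more than a threshold between two consecutive frames are treated as motion, and a bounding box is drawn around them. This is a generic technique chosen for illustration, not the algorithm of any particular CCTV product. The function names, the threshold, and the tiny sample frames are all hypothetical.

```python
def moving_pixels(prev_frame, curr_frame, threshold=30):
    """Return the (row, col) pixels whose grayscale value changed by more than threshold."""
    moving = set()
    for r, (prev_row, curr_row) in enumerate(zip(prev_frame, curr_frame)):
        for c, (p, q) in enumerate(zip(prev_row, curr_row)):
            if abs(q - p) > threshold:
                moving.add((r, c))
    return moving


def bounding_box(pixels):
    """Smallest (top, left, bottom, right) box containing the pixels, or None if empty."""
    if not pixels:
        return None
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return (min(rows), min(cols), max(rows), max(cols))


# Two hypothetical 4x5 grayscale frames: a bright object moves one column to the right
frame_a = [
    [0, 0, 0, 0, 0],
    [0, 200, 0, 0, 0],
    [0, 200, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
frame_b = [
    [0, 0, 0, 0, 0],
    [0, 0, 200, 0, 0],
    [0, 0, 200, 0, 0],
    [0, 0, 0, 0, 0],
]
changed = moving_pixels(frame_a, frame_b)
motion_region = bounding_box(changed)
print(motion_region)  # (1, 1, 2, 2)
```

The box covers both the old and the new position of the object, which is why production systems typically combine differencing with a background model and track the region across more than two frames.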

Processing Video Images

In industry as well as in service robots, machine vision can be used for a large number of applications, from inspection to autonomous vehicles. In this section of the article, I list tools and libraries used in both industrial and service robotics applications, for simple to complex vision systems.


Improv

Used for low-resolution image processing, Improv is a real-time vision tool for mobile robots. It runs under X Windows and was designed for Linux. The tool is useful for low-budget projects with inexpensive cameras.
With a modular design and a customizable interface, it can be integrated easily and allows new functionality to be added.

NI Vision Builder

Vision Builder is a vision system used in robotic applications. It works with patterns and includes geometric matching, optical character recognition (OCR), and particle analysis. The system is used successfully in complex inspection applications where the robot has to decide whether a product can continue along the production line or has to be returned.
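The inspection decision described above can be pictured with a small particle analysis sketch: group the defect pixels of a binary image into connected particles, measure each particle's area, and reject the part if any particle is too large. This is plain Python written for illustration and does not use or represent the NI Vision Builder API. The function names, the 4-connectivity choice, and the area limit are assumptions.

```python
from collections import deque


def particle_areas(binary):
    """Return the pixel area of each 4-connected particle of 1-pixels, in scan order."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Flood-fill one particle and count its pixels
            queue = deque([(r, c)])
            seen[r][c] = True
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            areas.append(area)
    return areas


def passes_inspection(binary, max_defect_area=2):
    """Accept the part only if every defect particle is at most max_defect_area pixels."""
    return all(area <= max_defect_area for area in particle_areas(binary))


# Hypothetical defect mask for one part (1 = defect pixel)
part = [
    [0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0],
]
print(particle_areas(part))     # [1, 3, 1]
print(passes_inspection(part))  # False
```

Here the three-pixel particle exceeds the assumed limit of two pixels, so the part would be returned instead of moving on down the line.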


RoboRealm

RoboRealm is one of the most popular vision systems used in robotics, and a good tool for experimenting with projects that include video processing. The combination of modules (such as processing filters) and programming creates a good environment for robotic vision processing. The tool has a user-friendly interface and a wide range of modules, so it can be used in almost any situation that involves robotic vision.


RobotVision

RobotVision is a complete library written in C++, based on the object-oriented programming paradigm, with a focus on visual localization and mapping. This technique is common among autonomous robots that build maps of unknown environments. The library can be used in 2D or 3D applications and is compatible with Linux-based PCs.

RegiStax

After many releases, RegiStax comes with new improvements, including a larger set of wavelet filters and multi-core processor support for image processing. These new or improved features are added to a long list of features that recommend the software for complex robotic applications.


CVIPtools

Used mostly in educational and research projects, CVIPtools allows students and researchers to experiment with computer vision. The tool is used successfully in applications including image segmentation, image restoration, pseudo-color enhancement, and image filtering, including morphological filters.
The CVIPtools algorithm code is written in standard C and contains all the image and data processing procedures and functions.

Image-Pro Premier

Image-Pro Premier includes a suite of software for image analysis applications, ranging from industry to inspection and quality assurance.

Precision Image

As its name suggests, Precision Image is a precision tool for image processing, used mainly in industrial and scientific applications.

Point Cloud Library

The Point Cloud Library (PCL) is an open-source project for processing 2D and 3D images and point clouds. Another interesting feature is point cloud processing, in which a point cloud represents the external surface of an object and can be used with 3D CAD models of manufactured parts.


Camunits

Written in C and designed in the labs of MIT, Camunits is a vision library kit with algorithms and tools for machine vision research. It is a free package with support for Linux and OS X operating systems.


Cognex

Cognex has successfully implemented 3D machine vision, with real-time 3D image processing for a wide range of applications. The technology is designed for applications where 2D technology is not enough. Cognex can be used in industrial robotic applications that need precision and a quick response. Fields where the technology can be used include manufacturing tasks such as de-palletizing and assembly verification.


Visionscape

From simple to sophisticated applications, Visionscape is a powerful machine vision tool with support for developing and deploying intelligent vision systems. The tool also supports multi-platform use, with a configuration environment aimed at maximum productivity.


Make3D

Make3D supports machine vision, but in a different way from conventional machine vision tools. Make3D was built to convert 2D images into 3D images using powerful machine learning techniques.


rviz

Following the same trend as other tools from Willow Garage, rviz is free open-source software that brings 3D visualization to robotic applications. It has many display types, including point clouds and the robot state.


SAL3D

Using at least two cameras looking at the same scene, the SAL3D tool was designed to reduce the occlusions or shadows present in applications like laser triangulation. Used in different applications, SAL3D can increase the quality of inspection and analysis.
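One way to picture how a second camera reduces occlusions in laser triangulation: each camera produces a height profile along the laser line, with gaps where the line was hidden from that camera. The sketch below merges two such profiles, using `None` for occluded samples. It illustrates the general idea only and is not SAL3D's API or algorithm. The name `merge_profiles`, the averaging rule, and the sample values are assumptions.

```python
def merge_profiles(profile_a, profile_b):
    """Merge two height profiles of the same laser line.

    None marks a sample the camera could not see (occlusion or shadow).
    Where both cameras see the point, the two heights are averaged.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    merged = []
    for a, b in zip(profile_a, profile_b):
        if a is None:
            merged.append(b)  # still None if both cameras are occluded
        elif b is None:
            merged.append(a)
        else:
            merged.append((a + b) / 2)
    return merged


# Hypothetical height samples (arbitrary units) from two cameras
cam_left = [1.0, 1.0, None, None, 2.0]
cam_right = [1.0, None, 3.0, None, 2.5]
merged = merge_profiles(cam_left, cam_right)
print(merged)  # [1.0, 1.0, 3.0, None, 2.25]
```

Each camera alone is missing two samples, while the merged profile is missing only the one point that neither camera could see.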

Vision Sensors

A robot sees by interpreting images captured with vision sensors in different ways. These are intelligent sensors that robots use to perform different tasks. In this section of the article, I give an overview of popular vision sensors embedded in robots.


CMUcam

CMUcam is not just one camera: it is a family of open-source vision systems with different kinds of on-board processing, designed for applications with real-time processing tasks.

Cameras and Vision Sensors

A comprehensive list of camera sensors designed for robotic projects.

Vision Sensors

Resources for robotic vision with sensors designed for automated inspection, generally used in industry.

Kinect Sensor

Microsoft created one of the most advanced vision sensors, with features that make it ideal for robotic applications.

Surveyor SRV-1 Blackfin Camera

Surveyor SRV-1 is one of the most popular cameras used in robotic vision, with strong specifications and the possibility of being used in stereo vision systems.


AVRcam

With complete hardware components and software packages, the AVRcam is capable of tracking colorful objects.
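Tracking a colorful object usually comes down to finding the pixels that fall inside a color range and reporting where they are. The sketch below shows that idea on a tiny RGB image in plain Python. It is not the AVRcam firmware or its interface. The function name `track_color`, the color bounds, and the sample image are hypothetical.

```python
def track_color(image, lower, upper):
    """Return the centroid (row, col) of pixels whose RGB values lie within
    [lower, upper] on every channel, or None if no pixel matches."""
    row_sum = col_sum = count = 0
    for r, row in enumerate(image):
        for c, pixel in enumerate(row):
            if all(lo <= v <= hi for v, lo, hi in zip(pixel, lower, upper)):
                row_sum += r
                col_sum += c
                count += 1
    if count == 0:
        return None
    return (row_sum / count, col_sum / count)


# Hypothetical 3x4 image: a red 2x2 object on a dark background
RED = (220, 30, 30)
BG = (40, 40, 40)
image = [
    [BG, BG, BG, BG],
    [BG, RED, RED, BG],
    [BG, RED, RED, BG],
]
centroid = track_color(image, lower=(180, 0, 0), upper=(255, 80, 80))
print(centroid)  # (1.5, 1.5)
```

Running this on every frame and following the centroid gives a simple color tracker; thresholding in a hue-based color space is generally more robust to lighting changes than raw RGB bounds.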

C3038 Color Sensor Module

Combining OmniVision CMOS technology with quality video imaging results in a powerful vision system designed to fit into any robotic project.

