Ball-Detection in Nao Robots for RoboCup


Nao robots are star players in RoboCup, an annual autonomous robot soccer competition. UMD is planning to build the Maryland RoboCup team to compete in RoboCup 2022. The training data consists of images collected from the Nao robot’s camera, where each image name gives the depth (distance) of the ball from the Nao robot in centimeters.

In RoboCup, the soccer field is bright green, the ball is bright orange, and the goalposts are bright yellow. If we separate the image into these three color classes, we can identify whether and where these objects appear in the image. A sample pipeline is: 1) the robot acquires an image, 2) it classifies each pixel as belonging to one of the color classes (“soccer-field-green”, “ball-orange”, or “goal-post-yellow”), and 3) it groups the labeled pixels and classifies the objects. This information is then passed on to a higher-level planning algorithm, which makes high-level decisions like “kick the ball”.
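The three pipeline steps can be sketched roughly as follows. This is only a minimal illustration using NumPy: the reference RGB values in `CLASS_COLORS`, the helper names `classify_pixels` and `locate`, and the tiny synthetic image are all assumptions made for the example, and the nearest-reference-color rule stands in for whatever classifier the real system uses.

```python
import numpy as np

# Hypothetical reference colors (RGB, 0-255); illustrative guesses, not calibrated values.
CLASS_COLORS = {
    "soccer-field-green": (0, 200, 0),
    "ball-orange": (255, 128, 0),
    "goal-post-yellow": (255, 255, 0),
}

def classify_pixels(image):
    """Step 2: label each pixel with the index of the nearest reference color."""
    refs = np.array(list(CLASS_COLORS.values()), dtype=float)       # (3 classes, 3 channels)
    diffs = image[:, :, None, :].astype(float) - refs[None, None]    # (H, W, 3, 3)
    return np.argmin(np.sum(diffs ** 2, axis=-1), axis=-1)           # (H, W)

def locate(labels, class_index):
    """Step 3: return the (row, col) centroid of a class, or None if it is absent."""
    rows, cols = np.nonzero(labels == class_index)
    if rows.size == 0:
        return None
    return rows.mean(), cols.mean()

# Step 1: a synthetic stand-in for a camera frame: green field with a 2x2 orange patch.
image = np.zeros((6, 6, 3), dtype=np.uint8)
image[:, :] = (10, 190, 20)
image[2:4, 3:5] = (250, 120, 10)

labels = classify_pixels(image)
names = list(CLASS_COLORS)
ball = locate(labels, names.index("ball-orange"))
goal = locate(labels, names.index("goal-post-yellow"))
```

Here `ball` is the centroid of the orange patch and `goal` is `None` because no yellow pixels are present; a planner could consume results of this kind.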

To build the vision pipeline for Nao robots, we will follow a five-step method:

      1. Color Imaging:     
              RGB colors can be represented in a 3D vector space. Think of the red, green and blue values as the X, Y and Z coordinates of a vector space representing colors. A value of [0,0,0] represents pure black, [255,255,255] represents pure white, [255,0,0] represents pure red and so on. Gray is any color with equal values for all three channels. We can visualize RGB space in three dimensions as a cube; if each channel is divided by 255, it becomes a unit cube.
      2. Color Classification:
              In particular, we are interested in finding the orange pixels because they represent the ball. As mentioned above, in RGB color space each pixel is represented as a vector in R³.
      3. Color Classification using Single Gaussian:
             Assigning each pixel a single hard color label is good for most basic cases but bad for robotics, because everything (sensors and actuators) is noisy and we want to model the world in a probabilistic manner. This means that instead of saying a pixel is either orange or red, we want to say that the pixel is orange with 70% probability and red with 30% probability. Modelling the likelihood as a Gaussian is beneficial because small lighting variations generally make the colors spread out in an ellipsoidal form, i.e., the actual color is in the middle and the observed color deviates from the center in all directions, forming an ellipsoid. This is one of the major reasons why a single Gaussian model works so well for color segmentation.
      4. Color Classification using a Gaussian Mixture Model: 
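              However, if you are trying to find a color under different lighting conditions, a single Gaussian model will not suffice because the colors may not be bounded well by an ellipsoid. In this case, one would have to come up with an irregular, hand-crafted function to bound the color, which is generally mathematically very difficult and computationally very expensive. A Gaussian mixture model, which is a weighted sum of several Gaussians, is a tractable alternative: each component covers one part of the color distribution.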
      5. Estimation of distance to the ball:
              Now that we have robustly estimated which pixels are ‘Orange’, we want to identify the pixels which belong to the orange ball and eventually find the distance to the ball. In the first step, we identify the pixels which belong to the orange ball. In the second step, one could fit a simple parametric model (of your choice) that estimates the distance from parameters measured in the image.
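Steps 3 and 5 can be sketched as follows. This is a minimal example under stated assumptions, not the team's actual implementation: the training colors are synthetic, the second class ("red") and the equal prior are placeholders, and the distance model (distance proportional to one over the square root of the ball's pixel area, plus an offset) is just one possible parametric choice. In practice the training distances would come from the image names; the numbers below are made up.

```python
import numpy as np

def fit_gaussian(samples):
    """Estimate the mean and covariance of RGB samples with shape (N, 3)."""
    mean = samples.mean(axis=0)
    cov = np.cov(samples, rowvar=False) + 1e-6 * np.eye(3)  # small ridge keeps cov invertible
    return mean, cov

def gaussian_pdf(pixels, mean, cov):
    """Evaluate the 3-D Gaussian density at each pixel, shape (N, 3)."""
    d = pixels - mean
    maha = np.einsum("ni,ij,nj->n", d, np.linalg.inv(cov), d)  # squared Mahalanobis distance
    norm = np.sqrt((2 * np.pi) ** 3 * np.linalg.det(cov))
    return np.exp(-0.5 * maha) / norm

def posterior_orange(pixels, orange, other, prior_orange=0.5):
    """Bayes' rule for two classes: P(orange | pixel)."""
    p_o = gaussian_pdf(pixels, *orange) * prior_orange
    p_x = gaussian_pdf(pixels, *other) * (1.0 - prior_orange)
    return p_o / (p_o + p_x + 1e-300)  # tiny constant avoids division by zero

# Synthetic training colors (made-up means and spread, not real Nao data).
rng = np.random.default_rng(0)
orange = fit_gaussian(rng.normal((240, 120, 20), 15, size=(500, 3)))
red = fit_gaussian(rng.normal((220, 30, 30), 15, size=(500, 3)))

test_pixels = np.array([[238.0, 118.0, 22.0], [221.0, 32.0, 28.0]])
probs = posterior_orange(test_pixels, orange, red)

def fit_distance_model(areas, distances_cm):
    """Least-squares fit of distance = slope / sqrt(area) + intercept (assumed model)."""
    x = 1.0 / np.sqrt(np.asarray(areas, dtype=float))
    slope, intercept = np.polyfit(x, distances_cm, 1)
    return slope, intercept

def predict_distance(area, model):
    slope, intercept = model
    return slope / np.sqrt(area) + intercept

# Made-up (ball pixel area, distance in cm) pairs for illustration only.
areas = [3600, 900, 400, 225]
distances = [50, 100, 150, 200]
model = fit_distance_model(areas, distances)
estimate = predict_distance(625, model)
```

The first entry of `probs` is the probability that an orange-looking pixel is orange and the second is the same probability for a red-looking pixel. Thresholding such probabilities over a whole image gives the ball mask whose area feeds `predict_distance`.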

Congrats! You have now built a robust vision system to identify an orange ball and estimate the distance to it on a Nao robot for RoboCup soccer.