Augmented reality is generally considered to be very hard to create. However, it’s possible to make visually impressive projects using just open source libraries. In this tutorial we’ll use OpenCV in Python to detect circle-shaped objects in a webcam stream and replace them with a 3D Earth rendered with Three.js in a browser window, with WebSockets joining it all together.
We want to strictly separate front-end and back-end in order to make it reusable. In a real-world application we could write the front-end in Unity, Unreal Engine or Blender, for example, to make it look really nice. The browser front-end is the easiest to implement and should work on nearly every possible configuration.
Further Reading on SmashingMag:
- Introduction To Polygonal Modeling And Three.js
- Building Shaders With Babylon.js
- Why AJAX Isn’t Enough
To keep things simple we’ll split the app into three smaller parts:
- Python back-end with OpenCV
OpenCV will read the webcam stream and open multiple windows with the camera image after passing it through multiple filters, to ease debugging and give us a little insight into what the circle detection algorithm actually sees. The output of this part will be just the 2D coordinates and radius of the detected circle.
- JavaScript front-end with Three.js

Step-by-step implementation of the Three.js library to render a textured Earth with the moon spinning around it. The most interesting thing here will be mapping 2D screen coordinates into the 3D world. We’ll also approximate the coordinates and radius to compensate for OpenCV’s inaccuracy.
- WebSockets in both front-end and back-end
Back-end with WebSockets server will periodically send messages with detected circle coordinates and radii to the browser client.
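The article doesn’t fix a wire format at this point, but a minimal sketch of such a message (JSON, with illustrative field names that are my assumption, not prescribed by the tutorial) might look like this:

```python
import json

def make_circle_message(x, y, radius):
    """Serialize one detected circle (pixel units) for the WebSocket
    client. The field names here are illustrative only."""
    return json.dumps({'x': x, 'y': y, 'radius': radius})

message = make_circle_message(251, 202, 74)
print(message)  # {"x": 251, "y": 202, "radius": 74}
```

Anything JSON-serializable works; the important part is that the browser client and the Python server agree on the keys.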
Final result of this article.
1. Python Back-End With OpenCV
Our first step will be just importing the OpenCV library in Python and opening a window with a live webcam stream.
We’re going to use the newest OpenCV 3.0 (see installation notes) with Python 2.7. Please note that installation on some systems might be problematic and the official documentation isn’t very helpful. I tried OpenCV 3.0 from MacPorts on Mac OS X myself, and the binary had a dependency issue, so I had to switch to Homebrew instead. Also note that some OpenCV packages might not come with Python bindings by default (you need to use some command-line options).
With Homebrew I ran:
```bash
brew install opencv
```
This installs OpenCV with Python bindings by default.
Just to test things out, I recommend you run Python in interactive mode (run `python` in the CLI without any arguments) and write `import cv2`. If OpenCV is installed properly and the paths to the Python bindings are correct, it shouldn’t throw any errors.
Later, we’ll also use Python’s `numpy` for some simple operations with matrices, so we can install it now as well:

```bash
pip install numpy
```
Reading The Camera Image
Now we can test the camera:
```python
import cv2

capture = cv2.VideoCapture(0)

while True:
    ret, image = capture.read()
    cv2.imshow('Camera stream', image)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
```
With `cv2.VideoCapture(0)` we get access to the camera at index `0`, which is the default (usually the built-in camera). If you want to use a different one, try numbers greater than zero; however, there’s no easy way to list all available cameras with the current OpenCV version.
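Since there is no listing API, one workaround (a sketch of my own, not part of the original tutorial) is to probe indices one by one and keep those that open. The `open_fn` argument is there only so the helper can be exercised without a real camera:

```python
def probe_cameras(max_index=5, open_fn=None):
    """Return the camera indices in [0, max_index) that can be opened.

    open_fn defaults to cv2.VideoCapture; it is injectable so the
    helper can be tested with a fake capture class.
    """
    if open_fn is None:
        import cv2
        open_fn = cv2.VideoCapture
    available = []
    for index in range(max_index):
        capture = open_fn(index)
        if capture.isOpened():
            available.append(index)
        capture.release()
    return available
```

On a laptop with a single built-in webcam this would typically return `[0]`.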
When we call `cv2.imshow('Camera stream', image)` for the first time, it checks that no window with this name exists and creates a new one for us with the image from the camera. The same window will be reused for each iteration of the main loop.
Then we use `capture.read()` to wait for and grab the current camera image. This method also returns a Boolean flag `ret`, which is `False` if the camera is disconnected or the next frame is not available for some reason.
At the end we have `cv2.waitKey(1)`, which checks for 1 millisecond whether any key is pressed and returns its code. So, when we press `q`, we break out of the loop, the window closes and the app ends.
If this all works, we passed the most difficult part of the back-end app which is getting the camera to work.
Filtering Camera Images
For the actual circle detection we’re going to use the circle Hough transform, which is implemented in the `cv2.HoughCircles()` method and is currently the only circle-detection algorithm available in OpenCV. The important thing for us is that it needs a grayscale image as input and uses the Canny edge detector internally to find edges in the image. We want to be able to manually check what the algorithm sees, so we’ll compose one large image from four smaller images, each with a different filter applied.
The Canny edge detector is an algorithm that processes the image in typically four directions (vertical, horizontal and two diagonals) and finds edges. The actual steps that this algorithm makes are explained in greater detail on Wikipedia or briefly in the OpenCV docs.
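To build some intuition for the step at the heart of Canny, here is a toy NumPy sketch (a stand-in for illustration, not OpenCV’s actual implementation): edge strength is essentially the magnitude of the local intensity gradient, which Canny then thresholds and thins.

```python
import numpy as np

def gradient_magnitude(image):
    """Toy edge-strength measure: magnitude of the intensity
    gradient, the quantity Canny thresholds and thins."""
    gy, gx = np.gradient(image.astype(np.float64))
    return np.hypot(gx, gy)

# A tiny image with a vertical step edge between columns 2 and 3:
step = np.zeros((5, 6))
step[:, 3:] = 255.0
edges = gradient_magnitude(step)
# The response peaks on the columns adjacent to the step and is
# zero in the flat regions.
```

A real detector adds Gaussian smoothing, non-maximum suppression and hysteresis thresholding on top of this gradient step.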
In contrast to pattern matching this algorithm detects circular shapes so we can use any objects we have to hand that are circular. I’m going to use a lid from an instant coffee jar and then an orange coffee mug.
We don’t need to work with full-size images (depending on your camera resolution, of course), so we’ll resize the image right before `cv2.imshow` to 640px in width, scaling the height accordingly to keep the aspect ratio. Note that `image.shape` is `(height, width, channels)`, so we take the first two values in that order:

```python
height, width = image.shape[:2]
scale = 640.0 / width
image = cv2.resize(image, (0, 0), fx=scale, fy=scale)
```
Then we want to convert it to grayscale and apply first a median blur, which removes noise while retaining edges, and then the Canny edge detector, to see what the circle detection algorithm is going to work with. For this reason, we’ll compose a 2×2 grid with all four previews.
```python
import numpy as np

t = 100  # Threshold for the Canny edge detector

grey = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blured = cv2.medianBlur(grey, 15)

# Create a 2x2 grid for all previews
h, w = image.shape[:2]
grid = np.zeros([2 * h, 2 * w, 3], np.uint8)

grid[0:h, 0:w] = image
# We need to convert each 8-bit greyscale preview to a 3-channel image
grid[h:2 * h, 0:w] = np.dstack([cv2.Canny(grey, t / 2, t)] * 3)
grid[0:h, w:2 * w] = np.dstack([blured] * 3)
grid[h:2 * h, w:2 * w] = np.dstack([cv2.Canny(blured, t / 2, t)] * 3)
```
Grid with previews. Top-left: raw webcam data; top-right: grayscale after median blur; bottom-left: grayscale + Canny edge; bottom-right: grayscale after median blur + Canny edge.

Even though the Canny edge detector uses Gaussian blur to reduce noise, in my experience it’s still worth applying a median blur as well. You can compare the two bottom images. The one on the left is just Canny edge detection without any other filter. The second image is also Canny edge detection, but this time applied after a median blur. It suppressed objects in the background, which will help circle detection.
Now we can run the circle detection on the blurred grayscale image:

```python
sc = 1   # Scale for the algorithm (inverse accumulator resolution)
md = 30  # Minimum required distance between two circles
# Accumulator threshold for circle detection. Smaller numbers are more
# sensitive to false detections but make the detection more tolerant.
at = 40
circles = cv2.HoughCircles(blured, cv2.HOUGH_GRADIENT, sc, md,
                           param1=t, param2=at)
```

Note that the fifth positional parameter of `cv2.HoughCircles()` is an output array, so the Canny threshold `t` and the accumulator threshold `at` have to be passed as the keyword arguments `param1` and `param2`.
This returns an array of all detected circles. For simplicity’s sake, we’ll care only about the first one. Hough Gradient is quite sensitive to really circular shapes, so it’s unlikely that this will result in false detections. If it does, increase the `at` parameter. This is why we used the median blur above; it removed more noise, so we can use a lower threshold, making the detection more tolerant to inaccuracies while keeping the chance of detecting false circles low.
We’ll print the circle’s center and radius to the console and also draw the found circle with its center onto the camera image in a separate window. Later, we’ll send them via WebSocket to the browser. Note that `x`, `y` and `radius` are all in pixels.
```python
if circles is not None:
    # We care only about the first circle found.
    circle = circles[0][0]
    x, y, radius = int(circle[0]), int(circle[1]), int(circle[2])
    print(x, y, radius)

    # Highlight the circle
    cv2.circle(image, (x, y), radius, (0, 0, 255), 1)
    # Draw a dot in the center
    cv2.circle(image, (x, y), 1, (0, 0, 255), 1)
```
This will print tuples to the console like:

```
(251, 202, 74)
(252, 203, 73)
(250, 202, 74)
(246, 202, 76)
(246, 204, 74)
(246, 205, 72)
```
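As these tuples show, the detected values jitter by a pixel or two between frames. The tutorial smooths this out later on the front-end; as a sketch of the idea (my own illustration, not the article’s code), an exponential moving average over `(x, y, radius)` works like this:

```python
def smooth(previous, current, alpha=0.4):
    """Blend a new (x, y, radius) reading with the previous state.
    Lower alpha trusts history more and reacts more slowly."""
    if previous is None:
        return current
    return tuple(alpha * c + (1 - alpha) * p
                 for p, c in zip(previous, current))

state = None
for reading in [(251, 202, 74), (252, 203, 73), (250, 202, 74)]:
    state = smooth(state, reading)
# state now trails the raw readings, damping single-frame spikes.
```

The trade-off is latency: a lower `alpha` gives a steadier circle that lags further behind fast movement.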