Find simple shapes in an image

In this post I am going to show how to find simple shapes, such as triangles and squares (or rectangles), in an image. For simple shapes like these I normally use this procedure:

1. Find the contours in the image (the image should be binary).
2. Approximate each contour using the approxPolyDP function.
3. Count the vertices of each approximated contour to recognize the shape. For example, a triangle has 3 vertices; a square or rectangle has to meet the following conditions:
* It is convex.
* It has 4 vertices.
* All angles are ~90 degrees.
4. Assign a color to each shape type, run the code on your test image, check the vertex count of each shape, and fill the shape with the corresponding color.

Assumptions: Shapes don’t overlap, and all of them are solid (meaning there are no white pixels inside a shape; all shapes are black). There can be multiple shapes in the image, they can be rotated by any angle, and they can be of any size. Important: Triangles are non-obtuse!
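
The procedure above can be sketched in Python roughly as follows. This is only a minimal sketch, not the code from the full post: the function names (classify_polygon, detect_shapes), the 0.02 approximation ratio, and the 0.3 cosine threshold for "about 90 degrees" are my own assumptions. The vertex test is written in plain Python so it can be tried without OpenCV; detect_shapes assumes OpenCV is installed and that the shapes are white on a black background, so an image with black shapes would need to be inverted first.

```python
import math


def is_convex(pts):
    """True if the closed polygon turns the same way at every vertex."""
    n = len(pts)
    signs = set()
    for i in range(n):
        ax, ay = pts[i]
        bx, by = pts[(i + 1) % n]
        cx, cy = pts[(i + 2) % n]
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        if cross != 0:
            signs.add(cross > 0)
    return len(signs) <= 1


def corner_cosines(pts):
    """Absolute cosine of the angle at each vertex (0 means exactly 90 degrees)."""
    n = len(pts)
    cosines = []
    for i in range(n):
        px, py = pts[i - 1]
        cx, cy = pts[i]
        nx, ny = pts[(i + 1) % n]
        v1 = (px - cx, py - cy)
        v2 = (nx - cx, ny - cy)
        norm = math.hypot(*v1) * math.hypot(*v2)
        if norm == 0:  # degenerate corner (repeated point): treat as not square
            cosines.append(1.0)
            continue
        cosines.append(abs((v1[0] * v2[0] + v1[1] * v2[1]) / norm))
    return cosines


def classify_polygon(pts, max_cos=0.3):
    """Label an approximated contour by its vertices (threshold is an assumption)."""
    if len(pts) == 3:
        return "triangle"
    if len(pts) == 4 and is_convex(pts) and max(corner_cosines(pts)) < max_cos:
        return "rectangle"
    return "unknown"


def detect_shapes(binary_image, eps_ratio=0.02):
    """Find contours, approximate them, and classify each one (needs OpenCV)."""
    import cv2  # imported here so the helpers above work without OpenCV

    # [-2] picks the contour list in both OpenCV 3.x and 4.x return layouts.
    contours = cv2.findContours(
        binary_image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
    )[-2]
    labels = []
    for contour in contours:
        epsilon = eps_ratio * cv2.arcLength(contour, True)
        approx = cv2.approxPolyDP(contour, epsilon, True)
        pts = [(int(p[0][0]), int(p[0][1])) for p in approx]
        labels.append(classify_polygon(pts))
    return labels
```

For instance, classify_polygon([(0, 0), (4, 0), (4, 2), (0, 2)]) labels an axis-aligned rectangle, while a slanted parallelogram with the same number of vertices is rejected because its corners are far from 90 degrees.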


Measure the distance of object to the camera


Distance of circle from camera

A few days ago I was talking with some friends, and one of them asked whether I could write a program to measure the distance from an object to the camera, so I decided to write a post about it on my blog. I got the idea for this code from Adrian’s blog. You can find the code for measuring the distance from an object to the camera at the end of this post.

To determine the distance from my camera to a known object or marker, I am going to use triangle similarity.

Triangle similarity works like this: Let’s say I have a marker or object with a known width W. I place this marker at some distance D from my camera, take a picture of it, and then measure its apparent width in pixels, P. This allows me to derive the perceived focal length F of my camera:

F = (P x D) / W

For example, I place a 21 x 29 cm piece of paper vertically (so W = 21 cm) at a distance D = 20 cm in front of my camera and take a photo. When I measure the width of the paper in the image, I find that its perceived width is P = 133 pixels.

My focal length F is then:

F = (133 px x 20 cm) / 21 cm ≈ 126.67 px

As I continue to move my camera both closer and farther away from the object/marker, I can apply the triangle similarity to determine the distance of the object to the camera:

D’= (W x F) / P
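
The two formulas can be written as a pair of small Python functions. This is a minimal sketch using the example numbers from this post (W = 21 cm, D = 20 cm, P = 133 pixels); the function names are my own and are not taken from the code at the end of the post, and measuring P in a real image (for example from a detected contour) is left out.

```python
def focal_length(pixel_width, known_distance, known_width):
    """Perceived focal length F = (P x D) / W from one calibration photo."""
    return (pixel_width * known_distance) / known_width


def distance_to_camera(known_width, focal, pixel_width):
    """Distance D' = (W x F) / P for a new apparent width in pixels."""
    return (known_width * focal) / pixel_width


# Calibration with the values from the example above.
F = focal_length(pixel_width=133, known_distance=20, known_width=21)

# Estimate the distance when the same paper appears half as wide.
D_new = distance_to_camera(known_width=21, focal=F, pixel_width=66.5)
print(round(F, 2), round(D_new, 2))
```

With these numbers, an apparent width of 66.5 pixels (half of the calibration width) gives a distance of 40 cm, twice the calibration distance, which is what triangle similarity predicts.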


Deep Learning technologies


Deep learning


Deep Learning has transformed many important tasks, including speech and image recognition. Deep Learning systems scale well by absorbing huge amounts of data to create accurate models. The computational resources afforded by GPUs have been instrumental to this scaling. However, as Deep Learning has become more mainstream, it has generated some hype, and has been linked to everything from world peace to evil killer robots. In this talk, Dr. Ng will help separate hype from reality, and discuss potential ways that Deep Learning technologies can benefit society in the short and long term.


Google team’s neural network approach works on street numbers

A Google team has worked out a neural network approach to transcribe house numbers from Street View images, reading those house numbers and matching them to their geolocation. Google Street View lets the user move down to street level to see an area of interest in detail. Google’s accomplishment in automation is impressive both in the scope of the task involved and in the way it was done. Consider that Google’s Street View cameras have recorded massive numbers of panoramic images carrying massive numbers of house numbers. “We can for example transcribe all the views we have of street numbers in France in less than an hour using our Google infrastructure,” said the researchers, who authored the paper “Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks.” The authors are Ian J. Goodfellow, Yaroslav Bulatov, Julian Ibarz, Sacha Arnoud, and Vinay Shet. The team used a neural network that contains 11 levels of neurons trained to spot numbers in images. The researchers describe the network as “a deep convolutional neural network that operates directly on the image pixels.” They said they used the DistBelief implementation of deep neural networks to train large, distributed neural networks on high quality images. More information: “Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks”