Google team’s neural network approach works on street numbers

(Phys.org) – A Google team has worked out a neural network approach to transcribe house numbers from Street View images, reading the numbers and matching them to their geolocation. Street View lets the user drop down to street level and see the area of interest in detail. Google's accomplishment in automation is impressive both in the scope of the task and in the way it was done: Street View cameras have recorded massive numbers of panoramic images, which in turn carry massive numbers of house numbers.

"We can for example transcribe all the views we have of street numbers in France in less than an hour using our Google infrastructure," said the researchers, who authored the paper "Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks." The authors are Ian J. Goodfellow, Yaroslav Bulatov, Julian Ibarz, Sacha Arnoud and Vinay Shet.

The team used a neural network containing 11 levels of neurons, trained to spot numbers in images. The researchers describe it as "a deep convolutional neural network that operates directly on the image pixels." They said they used the DistBelief implementation of deep neural networks to train large, distributed neural networks on high quality images. (More information: "Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks")

Shape from shading Tsai and Shah approach

Tsai and Shah first applied a discrete approximation of the gradient, then employed a linear approximation of the reflectance function directly in terms of the depth. Their algorithm recovers the depth at each point using a Jacobi iterative scheme. (P. Tsai and M. Shah. Shape from shading using linear approximation. Image and Vision Computing, 12(8):487–498, 1994.)
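To make the steps concrete, here is a minimal NumPy sketch of this kind of scheme. It is an illustration under stated assumptions, not the authors' code: the function name `tsai_shah_sfs`, the Lambertian reflectance R(p, q) = (1 + p·ps + q·qs) / (sqrt(1 + p² + q²) · sqrt(1 + ps² + qs²)), the light-source gradient parameters `ps` and `qs`, and the `damping` term that keeps the update bounded where the derivative vanishes are all choices made for this example.

```python
import numpy as np

def tsai_shah_sfs(E, ps, qs, n_iter=50, damping=0.01):
    """Sketch of a Tsai-Shah style depth recovery (hypothetical helper).

    E      : 2-D image with intensities in [0, 1]
    ps, qs : assumed light-source gradient
    damping: assumed stabiliser so the step stays bounded when df/dZ ~ 0
    """
    E = np.asarray(E, dtype=float)
    norm_s = np.sqrt(1.0 + ps**2 + qs**2)
    Z = np.zeros_like(E)
    for _ in range(n_iter):
        # Discrete (backward-difference) approximation of the gradient.
        p = Z - np.roll(Z, 1, axis=1)   # Z(x, y) - Z(x-1, y)
        q = Z - np.roll(Z, 1, axis=0)   # Z(x, y) - Z(x, y-1)
        p[:, 0] = 0.0                   # no left neighbour in first column
        q[0, :] = 0.0                   # no upper neighbour in first row

        norm_n = np.sqrt(1.0 + p**2 + q**2)
        dot = 1.0 + p * ps + q * qs
        R = np.maximum(0.0, dot / (norm_n * norm_s))

        # f(Z) = E - R(p, q); linearise f around the previous depth estimate.
        f = E - R
        df = (p + q) * dot / (norm_n**3 * norm_s) - (ps + qs) / (norm_n * norm_s)

        # Jacobi-style update: every pixel uses only the previous iterate.
        Z = Z - f * df / (df**2 + damping)
    return Z
```

Because each pixel is updated from the previous iterate only, the whole update vectorises into array operations. A uniformly shaded image that matches a flat surface leaves the estimate at zero depth, which is a convenient sanity check.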


Shape from shading Pentland approach

Pentland used a linear approximation of the reflectance function in terms of the surface gradient, then took the Fourier transform of both sides of the resulting linear equation to obtain a closed-form solution for the depth at each point. (A. Pentland. Shape information from shading: a theory about human perception. Second International Conference on Computer Vision, pp. 404–413, 5–8 December 1988.)
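The closed-form step can be sketched in a few lines of NumPy. This is a hedged illustration rather than Pentland's own implementation: it assumes the linearised reflectance E ≈ cos σ + p cos τ sin σ + q sin τ sin σ, where σ and τ are the slant and tilt of the light source, and the function name `pentland_sfs` is made up for this example. Since differentiation becomes multiplication by 2πi·u (or 2πi·v) in the Fourier domain, the depth spectrum follows by a single division.

```python
import numpy as np

def pentland_sfs(E, slant, tilt):
    """Sketch of a Pentland-style closed-form depth estimate (hypothetical helper).

    E     : 2-D image
    slant : assumed light-source slant (sigma), in radians
    tilt  : assumed light-source tilt (tau), in radians
    """
    E = np.asarray(E, dtype=float)
    rows, cols = E.shape
    u = np.fft.fftfreq(cols)[None, :]   # horizontal frequency, cycles/pixel
    v = np.fft.fftfreq(rows)[:, None]   # vertical frequency, cycles/pixel

    FE = np.fft.fft2(E)
    # Transform of the linear shading term: F_E = denom * F_Z.
    denom = 2j * np.pi * np.sin(slant) * (u * np.cos(tilt) + v * np.sin(tilt))

    # Frequencies where denom is zero (including the constant term) carry no
    # depth information under this model, so they are left at zero.
    FZ = np.zeros_like(FE)
    ok = np.abs(denom) > 1e-9
    FZ[ok] = FE[ok] / denom[ok]
    return np.real(np.fft.ifft2(FZ))
```

The recovered depth is defined only up to the components the division cannot see, such as a constant offset and any variation perpendicular to the illumination direction.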
