The power of sight - Computer Vision

Updated: Dec 30, 2020

Vision is probably one of the most powerful abilities we have for making sense of the world. Here is my attempt at giving that ability to a computer.

The image above shows bounding boxes around the objects the program recognized in the given image, along with a label for what it thinks each object is and a confidence score for the detection.
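To make the three pieces of a detection concrete, here is a minimal sketch of how a box, a label and a confidence score might be held together and filtered before drawing. The `Detection` type, the `keep_confident` and `caption` helpers, the (x, y, width, height) box layout and the 0.5 threshold are all my own assumptions for illustration, not the exact output format of any particular YOLO implementation.

```python
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    """One recognized object (hypothetical structure for illustration)."""
    label: str                       # what the program thinks the object is
    score: float                     # confidence of the detection, 0.0 to 1.0
    box: Tuple[int, int, int, int]   # assumed layout: (x, y, width, height) in pixels


def keep_confident(detections: List[Detection], threshold: float = 0.5) -> List[Detection]:
    """Drop detections whose confidence is below the threshold."""
    return [d for d in detections if d.score >= threshold]


def caption(detection: Detection) -> str:
    """Text to draw next to a bounding box, e.g. 'dog 0.91'."""
    return f"{detection.label} {detection.score:.2f}"


# Made-up sample values, only to show the flow.
sample = [
    Detection("dog", 0.91, (10, 20, 100, 80)),
    Detection("cat", 0.30, (0, 0, 5, 5)),
]
for d in keep_confident(sample):
    print(caption(d), d.box)
```

Raising the threshold gives fewer but more trustworthy boxes; lowering it shows more objects at the cost of more false alarms.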

The ability to classify and detect objects in images has been around for ages, but the ability to detect objects in a real-time webcam feed with basic computing power is a game changer.

There are multiple algorithms for building a computer vision application. In this case I use the YOLO (You Only Look Once) algorithm. Now let's take this to the next level by feeding the algorithm not just single images but a series of images, or frames. I link my webcam as the feed into the algorithm and there we have it! Real-time object detection.
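The idea of feeding frames one by one into the detector can be sketched roughly as below. This is not my exact code: `camera_frames` and `detect_stream` are hypothetical helper names, the detector is passed in as any function that takes a frame and returns detections, and I assume OpenCV (`cv2`) is installed for reading the webcam.

```python
from typing import Any, Callable, Iterable, Iterator, List, Tuple


def camera_frames(source: Any = 0) -> Iterator[Any]:
    """Yield frames from a webcam index (0 = default camera) or a video file path."""
    import cv2  # imported lazily so the rest of the sketch runs without OpenCV

    capture = cv2.VideoCapture(source)
    try:
        while True:
            ok, frame = capture.read()  # ok is False when no frame could be read
            if not ok:
                break
            yield frame
    finally:
        capture.release()


def detect_stream(
    frames: Iterable[Any],
    detector: Callable[[Any], List[Any]],
) -> Iterator[Tuple[int, List[Any]]]:
    """Run the detector on every frame and yield (frame index, detections)."""
    for index, frame in enumerate(frames):
        yield index, detector(frame)
```

With a real detector in hand, usage would look like `for index, found in detect_stream(camera_frames(0), my_detector): ...`, where `my_detector` stands in for whatever YOLO wrapper is being used. Passing a file path instead of `0` runs the same loop over a recorded video.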

True computer vision

The above is a video mashup of various real-life contexts, including ordinary objects we see every day. Instead of linking to my webcam (which also worked very well), here I apply the object detection to this video. It works quite well!

The applications for this are many. Some of the ones that come to mind immediately are:

1. Computer vision as an aid for the blind - a simple pair of glasses with a webcam can keep streaming information about the detected objects, which can then be translated to speech so that the blind person can hear about the objects that are seen.

2. Surveillance and monitoring - imagine a scenario where the purpose of surveillance is to look for a particular object in the field of vision, e.g. people walking in a lane reserved only for bicycles.

3. A crucial component of Character Artificial Intelligence - with vision, we give sight to AI. We are working in this exciting field to give creators the ability to create characters in their stories. For more on this, read my blog on Storytelling redefined with Nodestory.
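The first two ideas above can be sketched on top of plain detector output. This is only an illustration under my own assumptions: detections are simple (label, box) pairs with boxes as (x, y, width, height), the reserved lane is a rectangle in the same pixel coordinates, and `describe`, `in_zone` and `people_in_lane` are hypothetical helper names. The sentence from `describe` is what would be handed to a text-to-speech engine.

```python
from collections import Counter
from typing import List, Tuple

Box = Tuple[int, int, int, int]   # assumed layout: (x, y, width, height)
LabeledBox = Tuple[str, Box]      # (label, box)


def describe(labels: List[str]) -> str:
    """Turn detected labels into a short sentence that could be spoken aloud."""
    counts = Counter(labels)
    if not counts:
        return "I see nothing."
    parts = [f"{n} x {label}" for label, n in counts.items()]
    return "I see " + ", ".join(parts) + "."


def in_zone(box: Box, zone: Box) -> bool:
    """True if the center of the box falls inside the rectangular zone."""
    x, y, w, h = box
    cx, cy = x + w / 2, y + h / 2
    zx, zy, zw, zh = zone
    return zx <= cx <= zx + zw and zy <= cy <= zy + zh


def people_in_lane(detections: List[LabeledBox], lane: Box) -> List[LabeledBox]:
    """Return the 'person' detections whose center lies in the reserved lane."""
    return [d for d in detections if d[0] == "person" and in_zone(d[1], lane)]
```

Checking the box center against a rectangle is the simplest possible rule; a real bicycle lane seen through a camera would more likely need a polygon and some tolerance for perspective.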

What next? You guessed it! Make the AI speak what it sees. So here goes: read about my attempt here, where I start exploring natural language.

