Computer Vision Basics

How do you see the world?

Step by step, we first see images. A stream of images, one after another, becomes a video. We then identify the object in front of us, track it across the continuous stream of images, and classify its behavior so that we can react.

A typical vision program follows a similar process.
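
The steps described above (see, identify, track, classify) can be sketched on toy data. This is only an illustrative outline using NumPy: the function names `make_frames`, `detect`, `track`, and `classify` are hypothetical, the frames are synthetic, and a real vision program would use far more capable detection and tracking methods.

```python
import numpy as np

def make_frames(num_frames=5, size=8):
    """See: build synthetic frames with one bright pixel moving to the right."""
    frames = np.zeros((num_frames, size, size), dtype=np.uint8)
    for t in range(num_frames):
        frames[t, 3, t + 1] = 255  # row 3, column advances one step per frame
    return frames

def detect(frame, threshold=128):
    """Identify: return the (row, col) centroid of pixels above the threshold."""
    rows, cols = np.nonzero(frame > threshold)
    if rows.size == 0:
        return None
    return float(rows.mean()), float(cols.mean())

def track(frames):
    """Track: collect the detected position in every frame."""
    return [detect(frame) for frame in frames]

def classify(positions):
    """Classify: label the motion from the first and last tracked positions."""
    found = [p for p in positions if p is not None]
    if len(found) < 2:
        return "unknown"
    dx = found[-1][1] - found[0][1]  # change in column index
    if dx > 0:
        return "moving right"
    if dx < 0:
        return "moving left"
    return "stationary"

frames = make_frames()
positions = track(frames)
print(positions[0], positions[-1], classify(positions))
# (3.0, 1.0) (3.0, 5.0) moving right
```

Each function stands in for one stage, so any stage could later be swapped for a more realistic implementation without changing the overall flow.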

Why do we need computer vision?

Computer vision lets an autonomous system interact with humans based on visual input. It also makes it practical to identify and localize the objects around the system.

It is useful in many scenarios. To name a few:

Healthcare Robots
Agriculture
Transportation
Manufacturing
Retail
etc

Everything that can be seen can be processed!

What is an Image?

In code, an image is represented as a NumPy array. Each element of the array represents a pixel (a small square). In a grayscale image, each value gives the brightness of its pixel, with 0 being the darkest; values run from 0 to 1 when stored as floats, or from 0 to 255 when stored as 8-bit integers. In a color image, each pixel has three channel values. These channels are just numbers, and it is up to the system and software we use to interpret them.
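
As a small illustration of the array view of an image, the sketch below builds tiny grayscale and color images by hand with NumPy. The sizes and values are arbitrary choices made for the example, not anything prescribed by a library.

```python
import numpy as np

# A 4x4 grayscale image: one brightness value per pixel, stored as floats in [0, 1].
gray = np.zeros((4, 4), dtype=np.float32)
gray[1:3, 1:3] = 1.0  # a bright 2x2 square in the centre

# The same image in 8-bit storage, where brightness runs from 0 to 255.
gray_u8 = (gray * 255).astype(np.uint8)

# A colour image adds a third axis holding three channel values per pixel.
color = np.zeros((4, 4, 3), dtype=np.uint8)
color[..., 0] = 255  # set the first channel to its maximum everywhere

print(gray.shape, gray_u8.max(), color.shape)
# (4, 4) 255 (4, 4, 3)
```

Whether that first channel shows up as blue or red depends on the software: OpenCV treats the channel order as BGR, while Matplotlib treats it as RGB.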

More details are covered in the Image section.

What is a Video?

A video is just a sequence of images shown in quick succession. The rate at which they are shown is measured in frames per second (fps).
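
To make the relationship between frames and fps concrete, here is a minimal sketch that treats a video as a stack of image arrays. The frame rate, frame count, and frame size are assumed values chosen only for the example.

```python
import numpy as np

fps = 30          # assumed frame rate, in frames per second
num_frames = 90   # assumed clip length, in frames
height, width = 4, 6

# A grayscale video as a stack of frames: (frames, height, width).
video = np.zeros((num_frames, height, width), dtype=np.uint8)

duration_s = num_frames / fps    # total playing time in seconds
frame_interval_ms = 1000 / fps   # how long each frame stays on screen

print(video.shape, duration_s, round(frame_interval_ms, 1))
# (90, 4, 6) 3.0 33.3
```

Indexing the first axis, for example `video[0]`, gives back a single image of shape `(height, width)`.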

More details are covered in the Video Processing section.

Installation and Setup

1. Install OpenCV

https://pypi.org/project/opencv-python/

2. Install NumPy

https://numpy.org

3. Install Matplotlib

https://matplotlib.org/

4. Install Anaconda

https://www.anaconda.com/

5. Install Python

https://www.python.org/downloads/

6. Install any IDE of your choice
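
Once everything is installed, a quick check like the one below can confirm that Python can find the packages. It is a minimal sketch: the helper name `check_setup` and the `PACKAGES` table are made up for this example, and it only tests whether each module can be located, not whether it works correctly.

```python
import importlib.util
import sys

# Import names for the packages listed above; opencv-python is imported as cv2.
PACKAGES = {"OpenCV": "cv2", "NumPy": "numpy", "Matplotlib": "matplotlib"}

def check_setup(packages=PACKAGES):
    """Return a dict mapping each package label to True if its module is found."""
    return {
        label: importlib.util.find_spec(module) is not None
        for label, module in packages.items()
    }

print("Python", sys.version.split()[0])
for label, installed in check_setup().items():
    print(f"{label}: {'found' if installed else 'missing'}")
```

If a package is reported as missing, installing it with pip (for example `pip install opencv-python numpy matplotlib`) or through Anaconda is the usual next step.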