Object Detection – Computer Vision

Yann KIDSHAKER, 17 March 2025

How Does Human Vision Work?

To understand computer vision, we must first look at how human vision works.

Human Vision

  1. Capture the image: Humans capture images with their eyes. The image forms on the retina, much as a camera captures an image, but in a very raw form.
    [Figure: eye structure]
  2. Identify the objects and their features: The raw image travels to the brain through the optic nerve for processing. The brain starts to identify different objects, such as a candle, a person, or a chair, along with features such as size, color, and shape.
  3. Extract information: The brain compares the features of each object with what it already knows in order to gather information. For example, you can tell your father from your mother because you recognize their distinct visual features.
  4. Act: Once you have this higher-level information, you can act on it. For example, if you see a ball coming toward your face, you can move aside to avoid being hit.

All of these steps happen very quickly because the eye and the brain are so tightly coordinated.

Computer Vision

Computer vision follows an approach similar to human vision.

Computer vision deals with how computers can gain a high-level understanding of digital images or videos. From an engineering perspective, it seeks to automate tasks that the human visual system can do.

Computer vision tasks include methods for acquiring, processing, analyzing, and understanding digital images.
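
To make "processing" and "analyzing" a digital image concrete, here is a minimal sketch in plain Python. It is an illustration only, not the method used later in this course: the pixel values are made up, and the function names `threshold` and `bounding_box` are hypothetical helpers introduced for this example.

```python
# A tiny grayscale "image" with 5 rows and 6 columns.
# Each number is a pixel brightness (0 = black, 255 = white).
# The values are invented for illustration; a real camera frame is far larger.
image = [
    [10, 12, 11, 10, 13, 12],
    [11, 200, 210, 12, 10, 11],
    [12, 205, 220, 11, 12, 10],
    [10, 198, 215, 13, 11, 12],
    [11, 10, 12, 10, 12, 11],
]


def threshold(img, cutoff=128):
    """Processing: mark each pixel as bright (True) or dark (False)."""
    return [[pixel > cutoff for pixel in row] for row in img]


def bounding_box(mask):
    """Analyzing: smallest rectangle that contains every bright pixel.

    Returns (top, left, bottom, right) as inclusive indices,
    or None when the mask has no bright pixel.
    """
    coords = [(r, c) for r, row in enumerate(mask) for c, on in enumerate(row) if on]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


mask = threshold(image)
box = bounding_box(mask)
print(box)  # (1, 1, 3, 2)
```

The result says that "something" occupies rows 1 to 3 and columns 1 to 2 of the image. Understanding what that something is requires a further classification step, which this sketch leaves out.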

Example: Self-Driving Car Using Vision

A self-driving car is a vehicle capable of sensing its environment and moving safely with little or no human input.

For this example, assume that the self-driving car can go forward, turn left, turn right, or stop. Let’s see how the car reacts when a pedestrian approaches it.

  1. Acquire: Self-driving cars use cameras to acquire images, which they receive and process at a very high rate. Suppose our camera has received this image:
    [Figure: a girl crossing the road]
  2. Process: The computer identifies all the objects in the image and lists them with their positions. In this case, it finds that there is something on the road. However, it does not yet know what that object is.
  3. Analyze: The computer then classifies each object into a category. In this case, it identifies the object as a girl. It also tags the object with information such as how hazardous it is, its distance, and other parameters. These tags are the higher-level information used to make a decision.
    [Figure: analysis of the girl in the image]
  4. Act: Based on the higher-level information, the computer can act. In this case, the car will stop.
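
The four steps above can be sketched as a small pipeline. This is a simplified illustration under stated assumptions, not how a real self-driving system is built: every name (`Action`, `Region`, `TaggedObject`, `acquire`, `process`, `analyze`, `act`), the hard-coded detection, and the 15 m stopping distance are hypothetical and chosen only to mirror the text.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """The four moves the example car can make."""
    FORWARD = "forward"
    LEFT = "left"
    RIGHT = "right"
    STOP = "stop"


@dataclass
class Region:
    """Output of Process: something is here, but we do not know what."""
    x: int
    y: int
    width: int
    height: int


@dataclass
class TaggedObject:
    """Output of Analyze: a region plus higher-level tags."""
    region: Region
    label: str
    distance_m: float
    on_road: bool


def acquire():
    """Acquire: stand-in for reading one camera frame (placeholder value)."""
    return "frame-0001"


def process(frame):
    """Process: locate objects. Hard-coded here; a real system runs a detector."""
    return [Region(x=310, y=220, width=60, height=140)]


def analyze(regions):
    """Analyze: classify each region and attach tags. Values are made up."""
    return [
        TaggedObject(region=r, label="pedestrian", distance_m=8.0, on_road=True)
        for r in regions
    ]


def act(objects, stop_distance_m=15.0):
    """Act: stop if a pedestrian on the road is closer than the threshold."""
    for obj in objects:
        if obj.label == "pedestrian" and obj.on_road and obj.distance_m < stop_distance_m:
            return Action.STOP
    return Action.FORWARD


decision = act(analyze(process(acquire())))
print(decision)  # Action.STOP
```

Keeping each step as its own function mirrors the list above: Process only reports where something is, Analyze adds what it is and how far away, and Act depends only on those tags.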

Conclusion

In this topic, you learned what computer vision is and how its process works, using the example of a self-driving car. In the next topic, you will learn about the various object detection functions.