When it comes to object detection, popular models such as YOLO, SSD, and R-CNN usually require a GPU for training and inference, and they need a properly configured environment, which can be confusing to set up, especially the first time.
If you don't want to spend your time building a complex environment, or if you don't have a powerful GPU, Cascade classifiers might be a good choice for you.
In this article, I will discuss the advantages and disadvantages of Cascade classifiers, how to run detections on images and videos, and how to use pretrained Cascade classifier models with Python and OpenCV.
There are a bunch of pretrained models that OpenCV ships with; you can see them in the image below ( pretrained models ). Most of the pretrained models are for body, eye, and face detection.
You don't have to use these pretrained models; you can train your own. Fortunately, OpenCV has a step-by-step tutorial for that, which you can follow ( training custom model ).
Detection → Image
To load a pretrained model, you need to create an object of the cv2.CascadeClassifier class and pass the pre-trained model file as a parameter.
For detection, you need to use the detectMultiScale method; here are its parameters:
image → the input image in grayscale. Detection works more efficiently on grayscale images.
scaleFactor → specifies how much the image size is reduced at each image scale.
minNeighbors → the detection algorithm finds multiple candidate regions, and minNeighbors controls how many detections around the same region are needed to declare it an object.
minSize → specifies the minimum size of the object to be detected.
import cv2
import matplotlib.pyplot as plt

# Load the pre-trained Haar Cascade classifier for full-body detection
body_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_fullbody.xml')

# Read the image
image = cv2.imread(r"C:\Users\sirom\Downloads\pexels-chinmay-singh-251922-843563.jpg")

# Convert the image to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Detect bodies in the image
bodies = body_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5, minSize=(10, 10))

# Draw rectangles around the detected bodies
for (x, y, w, h) in bodies:
    cv2.rectangle(image, (x, y), (x + w, y + h), (255, 0, 0), 2)

# Display the output
plt.imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.show()
Detection → Video
The main logic is similar: you just need to read the video and run the detection on each frame sequentially.
import cv2
import time

# Load the pre-trained Haar Cascade classifier for full-body detection (you can use a face detector if needed)
body_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_fullbody.xml')

# Capture video from a file or camera (use 0 for the default camera)
video_capture = cv2.VideoCapture(r"C:\Users\sirom\Downloads\7236898-hd_1280_720_25fps.mp4")  # For a file
# video_capture = cv2.VideoCapture(0)  # For camera input

# Start the time counter
prev_frame_time = time.time()
new_frame_time = 0

# Loop over frames from the video
while video_capture.isOpened():
    # Capture frame-by-frame
    ret, frame = video_capture.read()

    if not ret:
        print("Finished processing or cannot read the video.")
        break

    # Convert the frame to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Detect bodies in the frame
    bodies = body_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=10, minSize=(10, 10))

    # Draw rectangles around the detected bodies
    for (x, y, w, h) in bodies:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 0, 0), 4)
        cv2.putText(frame, 'Body', (x + 75, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (255, 0, 0), 2)

    # Calculate FPS from the time elapsed between frames
    new_frame_time = time.time()
    fps = 1 / max(new_frame_time - prev_frame_time, 1e-6)
    prev_frame_time = new_frame_time

    # Convert FPS to an integer for display
    fps = int(fps)
    fps_text = f'FPS: {fps}'

    # Put the FPS text on the top-right corner of the frame
    cv2.putText(frame, fps_text, (frame.shape[1] - 150, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 3)

    # Display the resulting frame
    cv2.imshow('Body Detection in Video with FPS', frame)

    # Break the loop when 'q' is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the video capture object and close all windows
video_capture.release()
cv2.destroyAllWindows()
Face detection using OpenCV with Haar Cascade Classifiers is a fundamental technique in computer vision that allows automatic detection of human faces within images or video streams. This approach relies on the use of pre-trained Haar cascade classifiers, a type of machine learning-based algorithm, to identify facial features and patterns.
Understanding Haar Cascade Classifiers:
Haar cascade classifiers are based on the Haar-like features concept, which involves comparing pixel intensities in various regions of an image to detect specific patterns or objects. These classifiers utilize a set of trained features and a cascade of classifiers to efficiently scan an image and determine whether a particular region contains the target object—in this case, a human face.
Working Principle:
It works in four stages:
·Haar-feature selection: A Haar-like feature consists of dark regions and light regions. It produces a single value by taking the difference between the sum of the intensities of the dark regions and the sum of the intensities of the light regions. This is done to extract the elements needed to identify an object. The features proposed by Viola and Jones include edge, line, and four-rectangle features.
·Creation of Integral Images: A given pixel in the integral image is the sum of all the pixels to its left and all the pixels above it. Since extracting Haar-like features involves calculating the difference between dark and light rectangular regions, integral images reduce the time needed for this task significantly (see the sketch after this list).
·AdaBoost Training: This algorithm selects the best features from all features. It combines multiple “weak classifiers” (the best features) into one “strong classifier”. The generated “strong classifier” is basically a linear combination of all the “weak classifiers”.
·Cascade Classifier: This is a method for combining increasingly complex classifiers (each built with AdaBoost) in a cascade, which allows negative input (non-face regions) to be discarded quickly while spending more computation on promising, face-like regions. It significantly reduces the computation time and makes the process more efficient.
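To make the first two stages concrete, here is a minimal sketch (my own illustration, not part of OpenCV's implementation) that builds an integral image with NumPy and evaluates one two-rectangle Haar-like feature in constant time; the image values and the feature's position and size are arbitrary assumptions.

import numpy as np

# Toy grayscale patch (values 0-255); in practice this would come from a real image
img = np.random.randint(0, 256, size=(24, 24)).astype(np.int64)

# Integral image: entry (y, x) holds the sum of all pixels above and to the left of (y, x).
# A leading row and column of zeros keep the box-sum formula uniform at the borders.
ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)

def box_sum(y, x, h, w):
    # Sum of the h x w rectangle with top-left corner (y, x), using only 4 lookups
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

# Two-rectangle (edge) Haar-like feature: intensity of the dark half minus the light half.
# The 12x12 region and its position are illustrative, not values from the article.
y, x, h, w = 6, 6, 12, 6           # top-left corner and size of each half
dark = box_sum(y, x, h, w)         # left half
light = box_sum(y, x + w, h, w)    # right half
feature_value = dark - light
print("Haar feature value:", feature_value)

A weak classifier simply thresholds one such feature value; AdaBoost then weights and combines many of these weak decisions into the strong classifier that each cascade stage applies.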
OpenCV comes with lots of pre-trained classifiers. Those XML files can be loaded with the cv2.CascadeClassifier class. Here we are going to use haarcascade_frontalface_default.xml for detecting faces.
Code:
import cv2
Importing OpenCV: The first step is importing the OpenCV library, which is available in Python under the name cv2.
image = cv2.imread('Images/image1.jpg')
Loading the Image: The code reads an image named 'image1.jpg' using the cv2.imread() function and assigns it to the variable image.
if image is None:
print("Error: Unable to read the image.")
else:
print("Image shape:", image.shape)
Error Handling for Image Loading: The code checks whether the image was loaded successfully or not. If the image is loaded successfully, it prints the shape of the image; otherwise, it prints an error message.
image = cv2.resize(image, (800, 600))
image.shape
Resizing the Image if required: The code resizes the loaded image to a width of 800 pixels and a height of 600 pixels using the cv2.resize() function.
#To Display image
if image is not None:
# Display the image
cv2.imshow('image', image)
# Wait for a key press and close the window
cv2.waitKey(0)
cv2.destroyAllWindows()
else:
print("Error: Unable to read the image.")
Displaying the Resized Image: The code displays the resized image using cv2.imshow(), waits for a key press using cv2.waitKey(), and then closes the window using cv2.destroyAllWindows().
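The conversion line itself is not shown above; based on the later snippets, it is presumably:

image_gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)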
Converting Image to Grayscale: The code converts the resized image to grayscale using the cv2.cvtColor() function.
#To Display image
if image_gray is not None:
# Display the image
cv2.imshow('image_gray', image_gray)
# Wait for a key press and close the window
cv2.waitKey(0)
cv2.destroyAllWindows()
else:
print("Error: Unable to read the image.")
Displaying the Grayscale Image: Similar to displaying the colored image, the code displays the grayscale image using cv2.imshow().
image.shape
image_gray.shape
Printing Shapes of Original and Grayscale Images: Finally, the code prints the shapes of the original and grayscale images.
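The loading line itself is not shown here; assuming the frontal-face model that ships with OpenCV, it would look something like this:

face_detector = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')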
Loading Haarcascade Classifier: The code loads a pre-trained Haar cascade classifier for detecting faces. This classifier is loaded by passing the XML file path to cv2.CascadeClassifier.
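The detection call itself is also missing from the snippet; with default parameters it would presumably be:

detections = face_detector.detectMultiScale(image_gray)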
Detecting Faces: The detectMultiScale() function is used to detect faces in the grayscale image (image_gray). It returns a list of rectangles where it detected faces.
number_of_faces = len(detections)
number_of_faces
Number of Faces Detected: The number of faces detected is calculated by finding the length of the detections list.
for (x, y, w, h) in detections:
#print(x, y, w, h)
cv2.rectangle(image, (x, y), (x + w, y + h), (0,255,255), 5)
#To Display image
if image is not None:
# Display the image
cv2.imshow('image', image)
# Wait for a key press and close the window
cv2.waitKey(0)
cv2.destroyAllWindows()
else:
print("Error: Unable to read the image.")
This loop iterates over each detection (each rectangle) and draws a rectangle around the detected face on the original image. It uses cv2.rectangle to draw rectangles with the given coordinates, the color (0,255,255) (which corresponds to yellow in the BGR color space), and a thickness of 5 pixels, and then displays the result.
Haarcascade parameters
##Haarcascade parameters
image = cv2.imread('Images/image1.jpg')
image = cv2.resize(image, (800, 600))
image_gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
detections = face_detector.detectMultiScale(image_gray, scaleFactor = 1.09)
for (x, y, w, h) in detections:
cv2.rectangle(image, (x, y), (x + w, y + h), (0,255,0), 5)
#To Display image
if image is not None:
# Display the image
cv2.imshow('image', image)
# Wait for a key press and close the window
cv2.waitKey(0)
cv2.destroyAllWindows()
else:
print("Error: Unable to read the image.")
This line detects faces in the grayscale image (image_gray) using the detectMultiScale function of the face detector.
The scaleFactor parameter controls how much the image is scaled down at each step of the search. Here it is set to 1.09, meaning the image shrinks by about 9% per step; this finer pyramid is slower but can catch faces that a coarser scale would miss.
image = cv2.imread('Images/image1.jpg')
image_gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
detections = face_detector.detectMultiScale(image_gray, scaleFactor=1.2, minNeighbors=7,minSize=(20,20), maxSize=(100,100))
for (x, y, w, h) in detections:
print(w, h)
cv2.rectangle(image, (x, y), (x + w, y + h), (0,255,0), 2)
#To Display image
if image is not None:
# Display the image
cv2.imshow('image', image)
# Wait for a key press and close the window
cv2.waitKey(0)
cv2.destroyAllWindows()
else:
print("Error: Unable to read the image.")
This line detects faces in the grayscale image (image_gray) using the detectMultiScale function of the face detector.
The scaleFactor parameter again controls how much the image is scaled down at each step; here it is set to 1.2, a coarser and faster setting.
The minNeighbors parameter specifies how many neighbors each candidate rectangle should have to retain it. Higher values result in fewer detections but with higher quality.
The minSize parameter specifies the minimum possible object size. Any detected object smaller than this size will be ignored.
The maxSize parameter specifies the maximum possible object size. Any detected object larger than this size will be ignored.