I am currently working on a Korean car number plate detection system using the YOLOv3 model. The first part of the project involves successfully detecting and cropping the number plate from Korean car images.
For the second part, I am implementing various image preprocessing steps to extract individual digits and Hangul characters from the cropped number plates. I plan to use techniques such as thresholding, contour detection, and character segmentation.
The third and final step involves training a CNN classifier. I aim to utilize the MNIST dataset for digit classification and generate a custom dataset for Korean characters. The classifier will then recognize and classify the digits and Hangul characters extracted from the preprocessed number plates.
I'm looking for guidance on the next steps: specifically, how to extract and recognize these digits and characters after preprocessing. For example, for a plate that reads 59오 4686, I want to extract the characters one by one as 5, 9, 오, 4, 6, 8, 6, something like this:
Here are the preprocessing steps I have used:
Original image in BGR format:
Grayscale image:
Inverted image:
Binary image:
Contours:
Code:
import cv2
import numpy as np
# Read image in BGR format
img_path = '/content/drive/MyDrive/cropped_img/IMG_1952.jpg'
img = cv2.imread(img_path)
# Convert the image from BGR to grayscale
gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Invert the color
gray_inverted = cv2.bitwise_not(gray_img)
# Binarize the image
_, binary = cv2.threshold(gray_inverted, 100, 255, cv2.THRESH_BINARY)
# Find contours
contours, hierarchy = cv2.findContours(binary, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
# Draw all contours in green on the original image (this modifies img in place)
cv2.drawContours(img, contours, -1, (0, 255, 0), 2)
From the contours, how can I extract 59오 4686 character by character, so that I can later pass each digit and Hangul character to my CNN classifier?
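One possible approach, shown only as a minimal sketch: take the bounding box of every contour, discard boxes that are too wide to be a single character (such as the plate frame), merge boxes that overlap horizontally, and read the result left to right. The merge step matters for Hangul, because a syllable such as 오 may come out as two separate contours stacked vertically, and for digits with holes (6, 8, 9), where `cv2.RETR_LIST` also returns the inner contour. The function name `group_character_boxes` and the thresholds `min_char_height`, `max_char_width` and `min_overlap` are hypothetical and would need tuning on real plates.

```python
def group_character_boxes(boxes, plate_w, plate_h,
                          min_char_height=0.3, max_char_width=0.4,
                          min_overlap=0.5):
    """Group contour bounding boxes into one box per character, left to right.

    boxes: list of (x, y, w, h) tuples, e.g. from cv2.boundingRect(contour).
    plate_w, plate_h: size of the cropped plate image in pixels.
    The three ratio thresholds are illustrative assumptions, not tuned values.
    """
    # Drop boxes too wide to be a single character (e.g. the plate frame).
    candidates = [b for b in boxes if b[2] <= max_char_width * plate_w]
    # Process boxes from left to right.
    candidates.sort(key=lambda b: b[0])

    merged = []
    for x, y, w, h in candidates:
        if merged:
            mx, my, mw, mh = merged[-1]
            # Width of the horizontal overlap with the previous group.
            overlap = min(mx + mw, x + w) - max(mx, x)
            if overlap >= min_overlap * min(mw, w):
                # Same character: replace the group with the union of both boxes.
                nx, ny = min(mx, x), min(my, y)
                nx2, ny2 = max(mx + mw, x + w), max(my + mh, y + h)
                merged[-1] = (nx, ny, nx2 - nx, ny2 - ny)
                continue
        merged.append((x, y, w, h))

    # After merging, drop groups too short to be a character (specks, bolts).
    return [b for b in merged if b[3] >= min_char_height * plate_h]
```

With the code above, the input could be built as `boxes = [cv2.boundingRect(c) for c in contours]` and `plate_h, plate_w = binary.shape`, and each character cropped as `binary[y:y+h, x:x+w]`. Cropping from `binary` (or from a copy of `img` made before `cv2.drawContours`) avoids picking up the green contour lines. Characters that touch each other or the plate frame would defeat this sketch and would probably need a morphological step first.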
Sample of training image for classifier: