python - Extracting numbers from video frames using Tesseract OCR

Question

posted Oct 7, 2021 in Technique by 深蓝 (71.8m points)

I am interested in extracting numbers from standardized videos that I have (always HD resolution, 1920x1080 at 30 FPS). The numbers always appear in fixed sections of the screen and are never missing.

My approach would be to:

Save the video as frame-by-frame PNGs
Load a single PNG frame
Select the areas of interest (there are four sections I want to extract numbers from; each is always in the exact same pixel range, though each might need its own image manipulation)
Extract the numbers using Python and Tesseract OCR
Store the values in a data frame
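
The cropping and record-building steps above can be sketched roughly as follows. This is only an illustration: the region names and pixel coordinates in ROIS are hypothetical placeholders, and read_text stands in for whatever OCR call is eventually used, so the sketch runs without Tesseract installed.

```python
import numpy as np

# Hypothetical regions of interest as (x, y, width, height) in pixels.
# These coordinates are placeholders, not values taken from the question.
ROIS = {
    "speed": (860, 940, 200, 90),
    "lap_time": (1500, 60, 320, 70),
}


def crop_regions(frame, rois=ROIS):
    """Return {name: sub-image} for each fixed region of a frame array."""
    return {name: frame[y:y + h, x:x + w] for name, (x, y, w, h) in rois.items()}


def frame_to_row(frame_index, frame, read_text, rois=ROIS):
    """Build one record for a frame; read_text is any callable image -> str."""
    row = {"frame": frame_index}
    for name, patch in crop_regions(frame, rois).items():
        row[name] = read_text(patch).strip()
    return row


# Demo on a blank 1920x1080 frame with a stand-in OCR function.
blank = np.zeros((1080, 1920, 3), dtype=np.uint8)
row = frame_to_row(0, blank, read_text=lambda patch: " 247\n")
print(row)
```

In a real run, the frames could come from cv2.imread or cv2.VideoCapture, read_text could wrap pytesseract.image_to_string, and the collected rows could be passed to pandas.DataFrame to get the final table.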

Examples of two of the sections are:

I have installed Python (I'm really an R user :S) and Tesseract and can run the Tesseract examples fine (i.e. I have confirmed that my setup works).

However, when I run the following commands on the top image [247], Tesseract is not able to extract the number, even though you'd think it's easy to extract because the text is very clear.

from PIL import Image
import pytesseract
import os
import cv2
import argparse


# Load the cropped speed region of one frame
img = cv2.imread("C:/Users/Luc/Videos/Monza GR4 1.56.156/frames/frame1060_speed.png")

cv2.imshow("RAW", img)
cv2.waitKeyEx(0)
cv2.destroyAllWindows()  # destroyWindow() requires a window name


# Convert from OpenCV's BGR channel order to RGB
imgRGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
cv2.imshow("RGB", imgRGB)
cv2.waitKeyEx(0)
cv2.destroyAllWindows()


# Invert the pixel values (swap light and dark)
imgBW2WB = cv2.bitwise_not(imgRGB)
cv2.imshow("White black swapped", imgBW2WB)
cv2.waitKeyEx(0)
cv2.destroyAllWindows()


# Binarize with a fixed threshold of 127
(thresh, blackAndWhiteImage) = cv2.threshold(imgBW2WB, 127, 255, cv2.THRESH_BINARY)
cv2.imshow("Remove some noise", blackAndWhiteImage)
cv2.waitKeyEx(0)
cv2.destroyAllWindows()


# --psm 10 treats the image as a single character; the whitelist allows digits only
pytesseract.image_to_string(blackAndWhiteImage, 
                            config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')

The output is:

pytesseract.image_to_string(blackAndWhiteImage, 
                            config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')
Out[15]: '7\n\x0c'
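
The newline and form feed (\x0c) at the end of the result are terminators that Tesseract appends to its output, not part of the recognized number. A minimal sketch of cleaning such strings before storing them is shown here; the helper name clean_number is made up for illustration, and the pattern assumes the values contain only digits and dots.

```python
import re


def clean_number(raw):
    """Strip OCR terminators; return the digit string, or None if nothing valid remains."""
    # str.strip() removes '\n', '\x0c' and other surrounding whitespace
    text = raw.strip()
    # Accept only digits and dots (assumed format of the on-screen values)
    return text if re.fullmatch(r"[0-9.]+", text) else None


print(clean_number("7\n\x0c"))
print(clean_number("\x0c"))
```

Returning None for unreadable frames keeps them easy to spot later in the data frame.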

I have very little experience with OCR (I have Googled and Youtube'd this together) and would very much appreciate advice and directions.

Regards, Luc

question from:https://stackoverflow.com/questions/65839969/extracting-numbers-from-video-frames-using-tesseract-ocr


1 Reply

深蓝 · Answer 1 · 2021-10-06T19:32:31+0000

You can adapt the following Python code:

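import cv2
from pytesseract import image_to_string

def getText(filename):
    # Read the image and convert it from BGR to HSV
    img = cv2.imread(filename)
    HSV_img = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    # Keep only the value (brightness) channel
    h, s, v = cv2.split(HSV_img)
    # Blur before thresholding; a (1, 1) kernel leaves the image unchanged,
    # so try a larger odd size such as (3, 3) if the frames are noisy
    v = cv2.GaussianBlur(v, (1, 1), 0)
    # Otsu picks the threshold; BINARY_INV maps bright pixels to 0 and dark pixels to 255
    thresh = cv2.threshold(v, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    # Save the thresholded image for inspection (written as <filename>.png)
    cv2.imwrite('{}.png'.format(filename), thresh)
    # Dilate with a kernel 1 pixel wide and 2 pixels tall
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, ksize=(1, 2))
    thresh = cv2.dilate(thresh, kernel)
    # --psm 6 assumes a single uniform block of text; "digits" limits output to numeric characters
    txt = image_to_string(thresh, config="--psm 6 digits")
    return txt


text = getText('WYOtF.png')
print(text)
text = getText('0Oqfr.png')
print(text)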

The getText() function takes the path of a PNG image file. It converts the image to the HSV color space, takes the value component as v, and applies a Gaussian blur before thresholding. The thresholded image is then dilated before being passed to Tesseract; try varying the kernel size of the dilate step to suit your images. The two images from the question were given as input to the code above, and the output is shown below.

Output

247
0.10.694

Thresholding results

WYOtF.png

0Oqfr.png
