I have images that are noised with some random lines like the following one:
I want to apply on them some preprocessing in order to remove the unwanted noise ( the lines that distort the writing) so that I can use them with OCR (Tesseract).
The idea that came to my mind is to use dilation to remove the noise then use erosion to fix the missing parts of the writing in a second step.
For that, I used this code:
import cv2
import numpy as np
img = cv2.imread('linee.png', cv2.IMREAD_GRAYSCALE)
kernel = np.ones((5, 5), np.uint8)
img = cv2.dilate(img, kernel, iterations=1)
img = cv2.erode(img, kernel, iterations=1)
cv2.imwrite('delatedtest.png', img)
Unfortunately, the dilation didn't work well, The noise lines are still existing.
I tried changing the kernel shape, but it got worse: the writing were partially or completely deleted.
I also found an answer saying that it is possible to remove the lines by
turning all black pixels with two or less adjacent black pixels to white.
That seems a bit complicated for me since I am beginner to computer vision and opencv.
Any help would be appreciated, thank you.
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…