Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

opencv - Generating good training data for haar cascades

I am trying to build Haar cascades to do OCR on a specific font, with one classifier per character.

I can generate tons of training data just by drawing the font onto images. So, the plan is to generate positive training data for each character, and use the examples of other characters as negative training data.

I am wondering how much variation I should put into the training data. Normally I'd just try everything, but I gather these things take days to train (for each character!) so some advice would be good.

So, a few questions:

  • Does the training algorithm recognise that I don't care about transparent pixels? Or will it perform better if I superimpose the characters over different backgrounds?
  • Should I include images where each character is shown with different prefixes and suffixes, or should I just treat each character individually?
  • Should I include images where the character is scaled up and down? I gather the algorithm pretty much ignores size, and scales everything down for efficiency anyway?

Thanks!


1 Reply


Does the training algorithm recognise that I don't care about transparent pixels? Or will it perform better if I superimpose the characters over different backgrounds?

The more "noise" (for example, varied backgrounds) you include in your positive samples, the more robust the detector will be, but the longer training will take. This is also where your negative samples come into play: the more negative samples you have, and the wider the range they cover, the more robust the detector. That said, if you have a particular use case in mind, I would suggest skewing your training set slightly to match it. The detector will be less general but will perform much better in your application.
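
As a rough illustration of superimposing a character over different backgrounds, here is a minimal sketch in plain Python. It assumes the glyph is a grid of grayscale values with a matching grid of alpha values in [0, 1]. The names `composite`, `random_background` and `make_positives` are made up for this example and are not part of OpenCV, and random noise is only a stand-in for whatever backgrounds your real images have.

```python
import random


def composite(glyph, alpha, background):
    """Alpha-blend a grayscale glyph over a background of the same size."""
    blended = []
    for g_row, a_row, b_row in zip(glyph, alpha, background):
        blended.append(
            [round(a * g + (1.0 - a) * b) for g, a, b in zip(g_row, a_row, b_row)]
        )
    return blended


def random_background(width, height, rng):
    """Hypothetical background: uniform random grayscale noise."""
    return [[rng.randrange(256) for _ in range(width)] for _ in range(height)]


def make_positives(glyph, alpha, count, seed=0):
    """Produce `count` copies of the glyph, each over a different background."""
    rng = random.Random(seed)  # seeded so the generated set is reproducible
    height, width = len(glyph), len(glyph[0])
    return [
        composite(glyph, alpha, random_background(width, height, rng))
        for _ in range(count)
    ]
```

Opaque glyph pixels stay the same in every sample while transparent ones take on whatever the background holds, so the only thing the samples have in common is the character itself.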

Should I include images where each character is shown with different prefixes and suffixes, or should I just treat each character individually?

If you want to detect individual letters, train on individual letters. If you train a detector on "ABC" and you only want "A", it will get mixed messages. Train one detector per letter ("A", "B", and so on), and each should be able to pick out its letter in larger images.
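
For the per-letter setup, a small sketch of collecting the negative list for one letter from the samples of all the others might look like this. The file paths and function names are hypothetical. The one-path-per-line layout is, as far as I know, what OpenCV's cascade training tools expect for the background description file, but check that against the version you use.

```python
def negatives_for(target, samples_by_char):
    """Return the image paths of every character except `target`."""
    paths = []
    for char, char_paths in samples_by_char.items():
        if char != target:
            paths.extend(char_paths)
    return sorted(paths)  # sorted so repeated runs give the same file


def background_file_text(paths):
    """Format the paths one per line, ready to write to a text file."""
    return "".join(path + "\n" for path in paths)


# Hypothetical layout: one rendered image per character.
samples = {"A": ["a/0.png"], "B": ["b/0.png"], "C": ["c/0.png"]}
text_for_a = background_file_text(negatives_for("A", samples))
```

Here `text_for_a` lists only the "B" and "C" images, so the "A" classifier is trained against every other character.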

Should I include images where the character is scaled up and down? I gather the algorithm pretty much ignores size, and scales everything down for efficiency anyway?

I don't believe this is correct. AFAIK the Haar detector cannot find objects smaller than the size it was trained on. If you train on 50x50 letters but the letters in your images are 25x25, you won't detect them. The other way round works: train on 25x25 letters and you can still detect 50x50 ones. Start small and let the algorithm scale the size up for you.
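
To make the size point concrete, here is a simplified model of the window sizes a multi-scale scan would visit. This is a sketch of the idea, not OpenCV's actual implementation. The function name is made up, and the 1.1 growth step is only an illustrative value.

```python
def scanned_window_sizes(train_size, image_size, scale_factor=1.1):
    """List the square window sizes visited, starting at the training size.

    The window only ever grows from `train_size`, so nothing smaller than
    the training size is examined.
    """
    if scale_factor <= 1.0:
        raise ValueError("scale_factor must be greater than 1")
    limit = min(image_size)  # the window cannot exceed the image
    sizes = []
    size = float(train_size)
    while size <= limit:
        sizes.append(round(size))
        size *= scale_factor
    return sizes


trained_large = scanned_window_sizes(50, (200, 200))  # never goes below 50
trained_small = scanned_window_sizes(25, (200, 200))  # starts at 25, passes near 50
```

In this model a detector trained at 50x50 never looks at a 25x25 window, while one trained at 25x25 reaches windows close to 50x50 after a few steps.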

