Friday, August 21, 2009

Robust OCR in video

I used the "Robust OCR dataset" below to make a system for reading runner bibs in video. Standard ML techniques give fairly good results without much tweaking -- AdaBoost with stumps goes through all connected components (in the thresholded image) and generates potential candidates, and an SVM with a Gaussian kernel classifies those candidates into digits. Here's a screenshot and a video of this system in action.
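
Below is a minimal sketch (not the original code) of that two-stage pipeline, assuming OpenCV for thresholding and connected components and scikit-learn for the classifiers. The feature extraction, model parameters, and score threshold are placeholder assumptions for illustration.

```python
import cv2
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Stage 1: boosted decision stumps (depth-1 trees) to score candidate regions.
booster = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1), n_estimators=200)
# Stage 2: SVM with a Gaussian (RBF) kernel to classify candidates into digits.
digit_svm = SVC(kernel="rbf", gamma="scale")
# Both models must first be fit on labeled component / digit patches.

def component_features(patch):
    """Hypothetical features: a resized, normalized pixel patch."""
    return cv2.resize(patch, (16, 16)).flatten() / 255.0

def detect_digits(frame):
    """Threshold the frame, score every connected component with the boosted
    stumps, and run the digit SVM on the components that pass stage 1."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary)

    detections = []
    for i in range(1, n):                              # label 0 is the background
        x, y, w, h, area = stats[i]
        feats = component_features(gray[y:y + h, x:x + w])
        score = booster.decision_function([feats])[0]  # candidate score
        if score > 0:                                  # keep positively scored candidates
            digit = digit_svm.predict([feats])[0]      # classify into a digit 0-9
            detections.append(((x, y, w, h), digit))
    return detections
```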

Video

7 comments:

hr0nix said...

Are the black rectangles false positives from AdaBoost?

Yaroslav said...

Kind of -- the rectangles are initial candidates generated by AdaBoost. I found it more robust to consider the top 50 candidates regardless of their score, and then use some heuristics to filter out the ones not related to bibs.
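
A rough illustration of that selection step, reusing the candidate scores from the boosting stage in the sketch above; the specific geometric filters shown here are assumptions, not the original heuristics.

```python
def select_candidates(scored_boxes, frame_height, k=50):
    """scored_boxes: list of (score, (x, y, w, h)) tuples from the boosting stage.
    Keep the top k candidates by score, then drop implausible boxes."""
    top = sorted(scored_boxes, key=lambda sb: sb[0], reverse=True)[:k]
    kept = []
    for score, (x, y, w, h) in top:
        aspect_ok = 0.2 < w / float(h) < 1.2                      # digit-like aspect ratio
        size_ok = 0.02 * frame_height < h < 0.25 * frame_height   # plausible digit height
        if aspect_ok and size_ok:
            kept.append((x, y, w, h))
    return kept
```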

Alex said...

Yaroslav, I was wondering if the OCR was happening in real time or in post-processing of the video? It appears the video is a little slow motion; was that because the OCR wasn't as accurate at normal speeds?

Jai Pillai said...

Impressive results. I am wondering whether you are using all the frames in the video, or just one good frame? In the former case, how are you integrating the results from each frame?

Yaroslav said...

Alex: it's not real time.
Jai: you mean how I determine which bibs occur in the video? I just return bib numbers which occur in more than k frames (k is hand-tuned; 40 seems to be OK).
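
A short sketch of that integration step: count the frames in which each recognized bib number appears and keep the numbers seen in more than k frames, with k = 40 as the hand-tuned value mentioned above. The function and argument names are illustrative.

```python
from collections import Counter

def bibs_in_video(per_frame_numbers, k=40):
    """per_frame_numbers: iterable of sets of bib numbers read in each frame."""
    counts = Counter()
    for numbers in per_frame_numbers:
        counts.update(numbers)
    return [bib for bib, c in counts.items() if c > k]
```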

M P Divecha said...

Hi,
I am working on a similar problem (here: http://stackoverflow.com/questions/6794372/localization-of-numbers-within-a-complex-scene-image). Can you give me some tips on how you did this?
I have never used boosting before, so I would like to know how you trained the classifiers. Is there any dataset available?
Your help in this regard will be highly appreciated :-)

Thanks.

Dmitry Nozhnin said...

Hello Yaroslav,

Have you posted the code or a more detailed description of your algorithm? It's a very interesting approach, and I would like to read more about it. Thank you!