I used the "Robust OCR dataset" below to build a system for reading runner bibs in video. Standard ML techniques give fairly good results without much tweaking: AdaBoost with decision stumps scores every connected component in the thresholded image to generate candidates, and an SVM with a Gaussian kernel classifies those candidates into digits. Here's a screenshot and a video of the system in action.
Video
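For readers who want to try the first step, here is a minimal sketch of pulling connected components out of a thresholded image. It is not the code used in this system: the threshold of 128, the 4-connectivity, and the function name `connected_components` are all assumptions made for illustration, and the image is just a list of rows of grayscale values. Each bounding box it returns would then be scored by the boosted classifier.

```python
from collections import deque


def connected_components(image, threshold=128):
    """Return bounding boxes (top, left, bottom, right) of 4-connected
    regions whose pixel values are >= threshold.

    The threshold and the connectivity are illustrative assumptions.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for r in range(height):
        for c in range(width):
            if seen[r][c] or image[r][c] < threshold:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and not seen[ny][nx] and image[ny][nx] >= threshold:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes
```

Boxes come back in row-major order of their first pixel, so the output is deterministic for a given image.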
6 comments:
Are black rectangles false positives from AdaBoost?
Kind of. The rectangles are the initial candidates generated by AdaBoost. I found it more robust to consider the top 50 candidates regardless of their score, and then use some heuristics to filter out the ones not related to bibs.
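A rough sketch of that selection step, assuming each candidate is a dict with hypothetical `score`, `width`, and `height` keys (`width` > 0). The aspect-ratio check is only a stand-in for the filtering heuristics, which are not spelled out here, and its bounds are made up for illustration.

```python
import heapq


def top_candidates(candidates, n=50):
    """Keep the n highest-scoring candidates, with no score cutoff."""
    return heapq.nlargest(n, candidates, key=lambda cand: cand["score"])


def plausible_digit_box(cand, min_aspect=1.0, max_aspect=4.0):
    """Hypothetical heuristic: digits are assumed taller than they are wide."""
    aspect = cand["height"] / cand["width"]
    return min_aspect <= aspect <= max_aspect


def select_candidates(candidates, n=50, keep=plausible_digit_box):
    """Take the top n by score first, then drop the ones failing the heuristic."""
    return [cand for cand in top_candidates(candidates, n) if keep(cand)]
```

Ranking before filtering means a weakly scored box still reaches the heuristics as long as it is among the best n in its frame.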
Yaroslav, I was wondering if the OCR was happening in real time or in post-processing of the video? The video appears to be in slight slow motion; was that because the OCR wasn't as accurate at normal speed?
Impressive results. I am wondering whether you are using all the frames in the video, or just one good frame? In the former case, how are you integrating the results from each frame?
Alex: it's not real time
Jai: you mean how do I determine which bibs occur in the video? I just return the bib numbers that occur in more than k frames (k is hand-tuned; 40 seems to be OK)
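In code, that frame-count rule could look roughly like the sketch below. The function name `bibs_in_video` and the input layout (one list of recognized bib strings per frame) are assumptions for illustration. Counting a bib at most once per frame is also an assumption, made so that duplicate detections within a frame don't inflate the count.

```python
from collections import Counter


def bibs_in_video(frame_detections, k=40):
    """Return bib numbers recognized in more than k frames.

    frame_detections: iterable with one list of bib strings per frame.
    """
    counts = Counter()
    for bibs in frame_detections:
        # A set ensures each bib is counted at most once per frame.
        counts.update(set(bibs))
    return sorted(bib for bib, n in counts.items() if n > k)
```

A bib misread in a handful of frames never reaches the threshold, so occasional classifier errors drop out of the final list.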
Hi,
I am working on a similar problem (here: http://stackoverflow.com/questions/6794372/localization-of-numbers-within-a-complex-scene-image). Can you give me some tips on how you did this?
I have never used boosting before, so I would like to know how you trained the classifiers. Is there any dataset available?
Your help in this regard will be highly appreciated :-)
Thanks.