A Pixel Matching Problem



This problem is like a miniature OCR problem.

Let's say you have some images that contain text, or screen snapshots
obtained by the Robot class. How can you extract the text from them?

You have the advantage that you probably have access to the fonts used
to create the text.

How might you go about creating an OCR program for such images? I was
thinking of writing this up as a student project.

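The advantage you have over true OCR is that all matches will be
exact: a template drawn in the same font at the same size should agree
with the image pixel for pixel.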
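
As a rough illustration of what exact matching buys you, here is a
minimal sketch in Java. It assumes the image has already been reduced
to a boolean ink mask (true = foreground pixel). The class and method
names are made up for this example, not part of any library.

```java
/** Hypothetical helper: exact, pixel-for-pixel template matching on ink masks. */
final class ExactMatcher {

    /** True if every template cell equals the page cell at the same offset from (left, top). */
    static boolean matchesAt(boolean[][] page, boolean[][] template, int left, int top) {
        if (left < 0 || top < 0 || top + template.length > page.length) {
            return false;
        }
        for (int row = 0; row < template.length; row++) {
            if (left + template[row].length > page[top + row].length) {
                return false; // template would hang off the right edge
            }
            for (int col = 0; col < template[row].length; col++) {
                if (page[top + row][left + col] != template[row][col]) {
                    return false; // a single differing pixel rules the match out
                }
            }
        }
        return true;
    }

    /** Scans row by row and returns {left, top} of the first exact match, or null if none. */
    static int[] findFirst(boolean[][] page, boolean[][] template) {
        for (int top = 0; top < page.length; top++) {
            for (int left = 0; left < page[top].length; left++) {
                if (matchesAt(page, template, left, top)) {
                    return new int[] {left, top};
                }
            }
        }
        return null;
    }
}
```

Because one mismatched pixel is enough to reject a candidate, most
positions are thrown out after a comparison or two, so even this brute
force scan is not as slow as it looks.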

The complications are a variety of colours for foreground/background,
antialiasing, and painting over multicoloured backgrounds. For JPGs you
may not have the fonts used. The edges of the text may be tweaked in
various ways, e.g. 3D, blurred, warped.
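
One cheap way to tell the easy cases from the hard ones is to count the
distinct colours in a candidate rectangle: exactly two suggests plain
text on a plain background, while more suggests antialiasing or a
multicoloured background. The sketch below assumes the picture is held
in a java.awt.image.BufferedImage; ColourCount and its methods are
hypothetical names invented for this example.

```java
import java.awt.image.BufferedImage;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: counts the distinct colours inside a rectangle of an image. */
final class ColourCount {

    /** Collects every distinct ARGB value found in the given rectangle. */
    static Set<Integer> distinctColours(BufferedImage image, int x, int y, int width, int height) {
        Set<Integer> colours = new HashSet<>();
        for (int row = y; row < y + height; row++) {
            for (int col = x; col < x + width; col++) {
                colours.add(image.getRGB(col, row));
            }
        }
        return colours;
    }

    /** True if the rectangle holds exactly two colours, i.e. one foreground and one background. */
    static boolean isTwoColour(BufferedImage image, int x, int y, int width, int height) {
        return distinctColours(image, x, y, width, height).size() == 2;
    }
}
```

For a big image you would want to bail out as soon as a third colour
turns up rather than collect them all, but that is left out to keep the
sketch short.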

I thought you might proceed like this:

1. Find rectangular regions containing only two colours.

2. Draw an e in 10 font sizes for each font as your search templates.

3. Look for a hole of the right shape. When you find one, check all the
e's you have with that size/shape hole to see if the entire letter
matches, checking the whole rectangle.

4. If you have a match, calculate the baseline and starting point. Now
draw a template "ea" in that same font and compare. If it fails, try
"eb", etc. Work your way both left and right, pulling out that line.

You might construct a HashMap indexed by a digest of the glyph so you
can check for matches more rapidly. Your digest algorithm might trim
the glyph top/bottom/left/right, so that you don't need to get the
stringWidth information by actually drawing the character pair.

5. You carve the rectangle out of the bigger one, and break the
remaining area into rectangles.

6. Repeat until there are no more rectangles.

7. Export each piece of text, labelled with the x,y where it was found.

8. In another program, allow the user to highlight text, e.g. a column
or box, to determine the linear order of the text desired.
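
The HashMap-of-digests idea mentioned under step 4 might be sketched
roughly as follows. This is only one guess at how it could look: glyphs
are assumed to be rectangular boolean ink masks, and the "digest" here
is simply a string encoding of the trimmed bitmap, leaving the actual
hashing to HashMap rather than using a CRC or cryptographic digest.
GlyphIndex and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical index from trimmed glyph bitmaps to whatever label you attach to them. */
final class GlyphIndex {
    private final Map<String, String> labelByDigest = new HashMap<>();

    /** Crops blank rows and columns from all four sides; an all-blank mask becomes 0x0. */
    static boolean[][] trim(boolean[][] mask) {
        int top = Integer.MAX_VALUE, bottom = -1, left = Integer.MAX_VALUE, right = -1;
        for (int r = 0; r < mask.length; r++) {
            for (int c = 0; c < mask[r].length; c++) {
                if (mask[r][c]) {
                    top = Math.min(top, r);
                    bottom = Math.max(bottom, r);
                    left = Math.min(left, c);
                    right = Math.max(right, c);
                }
            }
        }
        if (bottom < 0) {
            return new boolean[0][0];
        }
        boolean[][] trimmed = new boolean[bottom - top + 1][];
        for (int r = top; r <= bottom; r++) {
            trimmed[r - top] = Arrays.copyOfRange(mask[r], left, right + 1);
        }
        return trimmed;
    }

    /** Stand-in digest: "WxH:" followed by one character per pixel of the trimmed mask. */
    static String digest(boolean[][] mask) {
        boolean[][] t = trim(mask);
        int width = t.length == 0 ? 0 : t[0].length;
        StringBuilder key = new StringBuilder(width + "x" + t.length + ":");
        for (boolean[] row : t) {
            for (boolean ink : row) {
                key.append(ink ? '#' : '.');
            }
        }
        return key.toString();
    }

    /** Registers a template glyph under a label such as the character and font it came from. */
    void add(String label, boolean[][] templateMask) {
        labelByDigest.put(digest(templateMask), label);
    }

    /** Returns the label of the template with the same trimmed bitmap, or null if none. */
    String lookup(boolean[][] candidate) {
        return labelByDigest.get(digest(candidate));
    }
}
```

One thing to watch: trimming throws away the glyph's position relative
to the baseline, so characters that differ mainly in vertical placement
(an apostrophe and a comma in some fonts, say) may end up with the same
digest and would need the baseline from step 4 to tell them apart.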

Potential uses for such software include:

1. Capturing filenames, error messages and crash locations that were
displayed in a way that cannot be cut and pasted.

2. Use by people trying to defeat my email munger. See
http://mindprod.com/applets/masker.html

3. Use by blind people to extract textual information from images.

4. Allowing you to copy from any Swing Component.

5. Extracting information from a screen snapshot without having to
retype it.



--
Canadian Mind Products, Roedy Green.
http://mindprod.com Java custom programming, consulting and coaching.