Monday, July 21, 2008

Activity 10: Processing Handwritten Text

our mission: pre-processing handwritten text for image-processing.

i chose three different handwritten parts of the order form image; the three seemed (to me, at least) to vary in how difficult they would be to process.

**code:
stacksize(4e7);
im=imread('C:\Documents and Settings\Endura\My Documents\186\act10-1.jpg');
gim=im2gray(im); //convert to grayscale
fim=fft2(gim);
//fsim=fftshift(fim);
//afsim=abs(fsim);
//imshow(afsim,[]);xset("colormap",hotcolormap(20)); //outputs fft2 of image (for enhancing)
imf=imread('C:\Documents and Settings\Endura\My Documents\186\act10-1fil.jpg'); //filter
fts=fftshift(imf); //moves the centre of the filter to the corner so it lines up with the unshifted fft2
ims=fim.*fts; //uses filter created to erase in fourier-space horizontal lines from the image
ima=fftshift(ims);
//afsim=abs(ima);
//imshow(afsim,[]);xset("colormap",hotcolormap(20)); //outputs fft2 of "cleaned" image
img=abs(fft2(ima)); //second forward fft2 stands in for the inverse, so the image comes out flipped top-to-bottom and left-to-right
//imshow(img,[]); //outputs cleaned image
imgb=im2bw(img,0.45);
//imshow(imgb,[]);
se=ones(2,2);
dil=dilate(imgb,se,[1,1]);
ero=erode(dil,se,[1,1]); //closing operator: dilation followed by erosion
//imshow(ero,[]); //final pre-processed image

[L,n]=bwlabel(ero); //L is the labelled image, n is the number of connected blobs
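
Once a label matrix is available, each blob can be inspected on its own. The sketch below is only an illustration: the small matrix `L` is made up by hand to look like the kind of output `bwlabel` gives (0 for background, 1..n for blobs), and the names `boxes` and `areas` are hypothetical, not part of the original code.

```scilab
// Hand-made stand-in for a bwlabel result (assumption: 0 = background,
// 1..n = connected blobs). Built by hand so the sketch runs without SIP.
L = [0 1 1 0 0 0 0;
     0 1 1 0 2 0 0;
     0 0 0 0 2 0 3;
     0 0 0 0 2 0 3];
n = max(L);            // in Scilab, max of a matrix is the overall maximum

boxes = zeros(n, 4);   // one row per blob: [top bottom left right]
areas = zeros(n, 1);   // pixel count per blob
for k = 1:n
    [r, c] = find(L == k);                       // row/column indices of blob k
    boxes(k, :) = [min(r) max(r) min(c) max(c)]; // bounding box of blob k
    areas(k) = length(r);                        // number of pixels in blob k
end
disp(boxes);
disp(areas);
```

Bounding boxes and areas like these are one way to spot blobs that are too large to be a single letter, which is what happens when letters run together.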

The actual process:
  • image is read and converted to grayscale for easier processing
  • the fft2 of the original image is computed and displayed to see where the horizontal lines to be deleted show up
  • a filter is made in GIMP with black lines that just cover the recurring peaks where the horizontal lines are in fourier-space
  • the fft2-ed original image and the fftshift-ed filter are multiplied together to delete the lines, and the cleaned output image is displayed, as well as its fft2 (to make sure)
  • the image from the last stage is converted to black and white
  • the closing operator is used (the opening operator is unnecessary for this activity, and from the trials i had with it, it makes the image worse)
  • bwlabel() is used to label the individual letters for handwriting image processing
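
The masking step above can be tried on a made-up image. This is a rough sketch under stated assumptions, not the processing of the actual order form: the 64x64 test image, the mask drawn in code instead of GIMP, and the use of `ifft` in place of a second `fft2` are all substitutions made for the example.

```scilab
// Synthetic stand-in for the scanned form (assumption, not the real image):
// light background, a dark horizontal ruling every 8 rows, one vertical stroke.
N = 64;
img = 0.8 * ones(N, N);
img(4:8:N, :) = 0.2;        // horizontal form lines
img(20:44, 30:31) = 0;      // vertical stroke standing in for handwriting

fim = fft2(img);            // unshifted spectrum, as in the original code

// Mask drawn with the zero frequency at the centre, like a picture of the
// displayed spectrum. Horizontal lines change only from row to row, so their
// peaks sit on the centre column; black out that column except near the centre.
c = N/2 + 1;
mask = ones(N, N);
mask(:, c) = 0;
mask(c-2:c+2, c) = 1;       // keep the zero frequency and its neighbours

// fftshift moves the centre of the mask to the corner to match fim.
clean = real(ifft(fim .* fftshift(mask)));

// Contrast of the rulings in a column far away from the stroke
contrast_before = max(img(:, 5)) - min(img(:, 5));
contrast_after  = max(clean(:, 5)) - min(clean(:, 5));
mprintf("line contrast before: %f, after: %f\n", contrast_before, contrast_after);
```

Using `ifft` keeps the output in the same orientation as the input. The centre of the mask is left white on purpose, since blacking it out would also remove the overall brightness of the image.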
i ended up using the second image for the activity because the first image had such light-colored letters that it was hard to clean the image without deleting them too, and the third image had too few words in it.

step-by-step outputs:



**i actually tried and re-tried this and put off posting the report as long as possible, even though i finished the activity quite quickly, because i was just never satisfied with the results. Because the words are so small, erasing the lines from the image also erases the middle parts of the letters, and using binary operations to close that gap only makes the letters run together. For handwriting analysis or automated handwriting reading, the result would be useless.
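
The run-together effect can be reproduced on a tiny example. The sketch below is an assumption-laden illustration: `morph3` is a hypothetical helper written in plain Scilab so it runs without SIP, and it uses a 3x3 square rather than the 2x2 `se` from the code above.

```scilab
// Hypothetical helper: dilation or erosion with a 3x3 square, hand-rolled
// for illustration. This is not how SIP's dilate/erode are implemented.
function out = morph3(bw, op)
    [nr, nc] = size(bw);
    // pad with the value that cannot change the result at the border
    if op == "dilate" then padval = 0; else padval = 1; end
    p = padval * ones(nr + 2, nc + 2);
    p(2:nr+1, 2:nc+1) = bw;
    out = zeros(nr, nc);
    for i = 1:nr
        for j = 1:nc
            w = p(i:i+2, j:j+2);          // 3x3 neighbourhood of pixel (i,j)
            if op == "dilate" then
                out(i, j) = max(w);       // on if any neighbour is on
            else
                out(i, j) = min(w);       // on only if all neighbours are on
            end
        end
    end
endfunction

// Two vertical strokes one pixel apart, standing in for neighbouring letters.
bw = zeros(7, 9);
bw(3:5, 4) = 1;
bw(3:5, 6) = 1;

// Closing = dilation followed by erosion
closed = morph3(morph3(bw, "dilate"), "erode");
disp(bw(4, :));       // middle row before closing: gap at column 5
disp(closed(4, :));   // middle row after closing: the gap is filled
```

The same operation that repairs a broken stroke cannot tell that gap apart from the space between two letters, so a labelling step afterwards would count the pair as one blob.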

**i give myself a grade of 8 for this activity because i tried my best with it.
**thank you to everyone who reassured me that all of theirs were ugly too.
