model struggles with noisy images

#67
by elenapop - opened

I found that Llama 3.2 gives the most accurate text extraction, especially with high-density text and irregular layouts.
However, on noisy scanned images (those with stamps, handwritten annotations, and other artifacts), it produces no output at all.

I attempted to fine-tune it using 40k images augmented with noise (including quantization noise, salt and pepper noise, skewing, handwritten text, and multiple fonts). Unfortunately, this reduced its accuracy on well-formatted images, and it still doesn’t handle noisy images effectively.
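
For context, here is a minimal sketch of two of the augmentations mentioned above (salt-and-pepper noise and quantization noise). It assumes NumPy and grayscale `uint8` images. The function names `add_salt_and_pepper` and `quantize` and their default parameters are hypothetical and only illustrate the idea; this is not the exact pipeline used for the 40k images.

```python
import numpy as np


def add_salt_and_pepper(image, amount=0.02, rng=None):
    """Return a copy with roughly `amount` of the pixels set to 0 or 255.

    `amount` is an assumed, illustrative default, not a tuned value.
    """
    rng = np.random.default_rng() if rng is None else rng
    noisy = image.copy()
    # One uniform draw per pixel decides whether it becomes pepper or salt.
    draws = rng.random(image.shape[:2])
    noisy[draws < amount / 2] = 0          # pepper (black)
    noisy[draws > 1 - amount / 2] = 255    # salt (white)
    return noisy


def quantize(image, levels=4):
    """Reduce the number of grey levels to simulate quantization noise."""
    step = 256 // levels
    # Work in a wider integer type so the bin-centre offset cannot overflow.
    binned = (image.astype(np.int32) // step) * step + step // 2
    return np.clip(binned, 0, 255).astype(np.uint8)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for a scanned page: a flat mid-grey image.
    page = np.full((64, 64), 128, dtype=np.uint8)
    noisy_page = quantize(add_salt_and_pepper(page, amount=0.05, rng=rng))
    print(noisy_page.shape, noisy_page.dtype)
```

Both functions return new arrays and leave the input untouched, so the same clean image can be reused to produce several noisy variants.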
