kyle

joined 1 year ago
[–] kyle@infosec.pub 1 points 1 year ago

You could also try adjusting the contrast a bit. I use an app called Genius Scan, which increases the contrast of the scanned image to reduce the number of bits needed per pixel. This shrinks the file quite a bit, although the result obviously isn't a true representation of the scanned document. The TextCleaner ImageMagick script looks like it's doing something similar.
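
To make the bits-per-pixel idea concrete, here is a rough sketch in plain Python. It is not how Genius Scan or TextCleaner actually work (I don't know their internals); it just pushes contrast to the extreme by thresholding 8-bit grayscale pixels to black or white and packing them at 1 bit per pixel. The function name, the cutoff of 128, and the tiny sample "scan" are all made up for illustration.

```python
def threshold_to_1bit(pixels, width, cutoff=128):
    """Pack 8-bit grayscale pixels (row-major list of ints) into 1 bit per pixel.

    Pixels at or above `cutoff` become 1 (white), the rest 0 (black).
    The cutoff is an assumed value; real tools pick it adaptively.
    """
    packed = bytearray()
    for row_start in range(0, len(pixels), width):
        row = pixels[row_start:row_start + width]
        byte, nbits = 0, 0
        for value in row:
            byte = (byte << 1) | (1 if value >= cutoff else 0)
            nbits += 1
            if nbits == 8:
                packed.append(byte)
                byte, nbits = 0, 0
        if nbits:
            # Pad the final byte of the row with zero bits.
            packed.append(byte << (8 - nbits))
    return bytes(packed)


# Hypothetical 16x2 "scan": light paper (200) with three dark ink pixels (30).
width = 16
gray = [200] * 32
gray[3] = gray[4] = gray[20] = 30

bw = threshold_to_1bit(gray, width)
print(len(gray), "bytes at 8 bits/pixel ->", len(bw), "bytes at 1 bit/pixel")
```

Before any compression that is already an 8x reduction, and mostly uniform black-and-white data also tends to compress much better than noisy grayscale. The cost is the one mentioned above: shading and color are gone.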

[–] kyle@infosec.pub 2 points 1 year ago (1 children)

Ah, I only use the OpenAI API. I haven’t really explored the rest of the providers out there yet. Claude looks interesting though!

[–] kyle@infosec.pub 3 points 1 year ago* (last edited 1 year ago) (5 children)

I’ve never used paperless, but I just checked it out and it looks pretty neat. My first thought would be to scan documents at a higher resolution, let the OCR happen, then convert the file to a JPEG or something smaller after you’ve extracted the text.

I spent a few minutes looking at their wiki and it looks like it might be possible.

Like I said though, no experience with this software so I’m not sure that’d actually work.
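
For a rough sense of why downscaling after OCR could be worth it, here is some back-of-the-envelope arithmetic in Python. Everything in it is an assumption for illustration: a US Letter page, 24-bit color, and 600 vs 150 DPI as the "high" and "small" resolutions. These are raw, uncompressed sizes; real JPEG sizes depend heavily on content and quality settings.

```python
def raw_page_bytes(dpi, width_in=8.5, height_in=11.0, bytes_per_pixel=3):
    """Uncompressed size of one scanned page at the given resolution.

    Defaults assume a US Letter page in 24-bit RGB (hypothetical values).
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel


ocr_scan = raw_page_bytes(600)   # resolution used for the OCR pass
archived = raw_page_bytes(150)   # resolution kept afterwards

print(f"600 DPI: {ocr_scan / 1e6:.1f} MB raw")
print(f"150 DPI: {archived / 1e6:.1f} MB raw")
print(f"ratio:   {ocr_scan // archived}x")
```

Pixel count scales with the square of the resolution, so dropping to a quarter of the DPI cuts the raw data by a factor of 16.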

[–] kyle@infosec.pub 2 points 1 year ago (3 children)

I was having issues with it all day yesterday. GPT-3.5 worked fine though.

[–] kyle@infosec.pub 50 points 1 year ago (1 children)

Your instance admin can see it. It’s not public though.

[–] kyle@infosec.pub 3 points 1 year ago

Thanks for linking this!