streaming - Moving OCR from one PDF to another - Java


Good afternoon. I have a problem in a project that does PDF compression. The process is as follows: extract the images from the PDF, run OCR, compress the images, merge the OCR with the images, convert to one PDF per page, and combine the generated PDFs with OCR into one final product. The original file is 11 MB and the compressed one is 4.2 MB. The whole process works; the problem I have is the speed of the OCR step.

Searching the web, I saw a way to skip that step: take the text layer of the original PDF and pass it to the final compressed PDF. I tried code that deletes the images from the PDF, leaving only the text layer, and then inserts the compressed images. The problem is that, compared with the normal process described above, the file ends up larger than 4.2 MB, which is not acceptable for me.

While looking for a solution I found that PDF operators can be handled in PDFBox through PDFStreamParser, PDStream and COSDictionary. The operators are Tj, Tw, Tz, Tc, etc. My question is whether anyone knows how to transfer only the Tj operator, the one that contains the text of the PDF, so I can see whether the text layer of the original PDF can be passed to the final compressed PDF without raising the size above 4.2 MB. The idea is not to carry over the other operators, because they could increase the size of the final PDF. Or am I mistaken? If you have another solution, I would be grateful.
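To make the idea concrete, here is a minimal sketch of what "keeping only the text layer" of a content stream could look like. It is an illustration under assumptions, not PDFBox code: the class `TextLayerSketch` and its methods are hypothetical, it works on the content stream as plain text, and it does not handle inline images (BI ... ID ... EI). It keeps whole text objects (BT ... ET) rather than Tj alone, because a text-showing operator is only valid inside a text object and needs a font selected with Tf; OCR layers also typically rely on `3 Tr` (invisible text) and on positioning operators such as Td or Tm. In a real implementation the same filtering would presumably be done on the tokens returned by PDFStreamParser.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: a tiny tokenizer for content-stream text, not PDFBox's parser.
class TextLayerSketch {

    private static final String DELIMS = "()<>[]{}/%";

    static boolean isSpace(char c) {
        return c == ' ' || c == '\n' || c == '\r' || c == '\t' || c == '\f' || c == 0;
    }

    /** Splits a content stream into tokens (operands and operators). */
    static List<String> tokenize(String s) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (isSpace(c)) { i++; continue; }
            int start = i;
            if (c == '%') {                        // comment: skip to end of line
                while (i < s.length() && s.charAt(i) != '\n' && s.charAt(i) != '\r') i++;
                continue;
            } else if (c == '(') {                 // literal string: may nest and escape
                int depth = 0;
                do {
                    char d = s.charAt(i);
                    if (d == '\\') i++;            // skip the escaped character
                    else if (d == '(') depth++;
                    else if (d == ')') depth--;
                    i++;
                } while (depth > 0 && i < s.length());
            } else if (c == '<' || c == '>') {
                if (i + 1 < s.length() && s.charAt(i + 1) == c) {
                    i += 2;                        // dictionary delimiter << or >>
                } else if (c == '<') {             // hex string <...>
                    while (i < s.length() && s.charAt(i) != '>') i++;
                    i++;
                } else {
                    i++;
                }
            } else if (c == '[' || c == ']' || c == '{' || c == '}' || c == ')') {
                i++;                               // single-character token
            } else {                               // name, number or operator
                i++;
                while (i < s.length() && !isSpace(s.charAt(i))
                        && DELIMS.indexOf(s.charAt(i)) < 0) i++;
            }
            tokens.add(s.substring(start, Math.min(i, s.length())));
        }
        return tokens;
    }

    /** Keeps only the tokens from each BT up to and including its ET. */
    static String keepTextObjects(String content) {
        StringBuilder out = new StringBuilder();
        boolean inText = false;
        for (String t : tokenize(content)) {
            if (t.equals("BT")) inText = true;
            if (inText) {
                if (out.length() > 0) out.append(' ');
                out.append(t);
            }
            if (t.equals("ET")) inText = false;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical page content: an image drawn first, then an invisible text run.
        String page = "q 200 0 0 300 0 0 cm /Im0 Do Q BT /F1 12 Tf 3 Tr 72 700 Td (Hello) Tj ET";
        System.out.println(keepTextObjects(page));
    }
}
```

Two caveats on this sketch: anything set outside BT ... ET (for example a `cm` transformation) is dropped, so text placement can change, and the font named by Tf (here the made-up `/F1`) must still exist in the resources of the destination page, which is part of what contributes to the file size.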

Sorry if my English is bad; if anyone knows Spanish, tell me and I can express myself better.

The language I use is Java.

Thanks.

