Channel: Adobe Community: Message List - Acrobat SDK

Re: how to extract Text from a pdf file

April 25, 2014, 4:33 am

Hi,

I am successfully extracting text from a PDF with PDWordFinder, but there is an issue with ligature text.

Can anyone let me know whether it is possible to stop ligature expansion, and if so, how?

There is a word "office" in my PDF file, and it is getting expanded as "offi ce".

Here is my code:

PDWordFinderConfigRec wfConfig; /* WordFinder configuration record */

memset(&wfConfig, 0, sizeof(PDWordFinderConfigRec));
wfConfig.noXYSort = true;       /* keep words in content order, no XY sort */
wfConfig.noLigatureExp = false; /* false leaves ligature expansion enabled */

wordFinder = PDDocCreateWordFinderEx(pdDoc, WF_LATEST_VERSION, toUnicode, &wfConfig);
pageNum = AVPageViewGetPageNum(pageView);
PDWordFinderAcquireWordList(wordFinder, pageNum, &wInfo, NULL, NULL, &count);

for (i = 0; i < count; i++)
{
    memset(str, '\0', MAX_PATH);
    word = PDWordFinderGetNthWord(wordFinder, i);
    PDWordGetString(word, str, PDWordGetLength(word));
    attrib = PDWordGetAttrEx(word, 0);

    /* Append a space only after words that are followed by one, are not
       the last word on the line, and do not contain a ligature. */
    if ((attrib & WXE_ADJACENT_TO_SPACE) && !(attrib & WXE_LAST_WORD_ON_LINE) && !(attrib & WXE_HAS_LIGATURE))
        strcat(str, " ");

    fprintf(pFileTexts, "%s", str);
}
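To make clear what I mean by "ligature expansion", here is a minimal standalone sketch in plain C. It is not the Acrobat SDK's implementation and does not call the SDK at all; the function name expand_ligatures is hypothetical, and it assumes UTF-8 input containing only the five Latin ligature code points U+FB00 to U+FB04. It replaces each ligature with its plain letters and inserts no space, which is the output I expect for "office".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of the Acrobat SDK): expands the Latin
   ligatures U+FB00..U+FB04, encoded in UTF-8 as EF AC 80..84, into their
   plain-letter spellings. All other bytes are copied through unchanged.
   Returns 0 on success, -1 if the output buffer is too small. */
static int expand_ligatures(const char *in, char *out, size_t out_size)
{
    static const char *const expansions[] = { "ff", "fi", "fl", "ffi", "ffl" };
    size_t o = 0;

    assert(in != NULL && out != NULL);
    if (out_size == 0)
        return -1;

    while (*in != '\0') {
        const unsigned char *p = (const unsigned char *)in;
        const char *chunk;
        size_t chunk_len, consumed;

        /* Short-circuit evaluation keeps these reads inside the string:
           p[1] is read only if p[0] is non-zero, p[2] only if p[1] is. */
        if (p[0] == 0xEF && p[1] == 0xAC && p[2] >= 0x80 && p[2] <= 0x84) {
            chunk = expansions[p[2] - 0x80];
            chunk_len = strlen(chunk);
            consumed = 3;
        } else {
            chunk = in;
            chunk_len = 1;
            consumed = 1;
        }

        /* Always leave room for the terminating NUL. */
        if (o + chunk_len >= out_size)
            return -1;

        memcpy(out + o, chunk, chunk_len);
        o += chunk_len;
        in += consumed;
    }

    out[o] = '\0';
    return 0;
}
```

Calling expand_ligatures("o\xEF\xAC\x83" "ce", buf, sizeof buf) fills buf with "office". The string literal is split so that the letters "ce" are not read as part of the \x83 hex escape.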
