Extracting data using C#
0 votes
/ July 10, 2020

I want to parse a PDF file and extract data such as the name, address, and contact details using C#.

I parse the PDF with PdfPig. Can anyone suggest how to extract the data, or help with it? How do I extract an exact key-value pair from the available data? Thanks in advance!

BajajAllianzGeneralInsuranceCompanyLtd.RegisteredandHeadOffice:GEPlaza,AirportRoad,Yerwada,PuneTranscriptofProposalforPrivateCar-PackagePolicyDearRAJINDERKUMARGUPTA,Wewishtoinformyouthatthecontractunderpolicynumber'OG-17-1104-1801-00011267'hasbeenfinalizedbasedontheinformationanddeclarationgivenbyyou,thetranscriptwhereofismentionedbelow.Youarerequestedtoreconfirmthesame.Incaseofanydisagreementorobjectionoranychangeswithrespecttoinformationmentionedbelow,werequestyoutopleaserevertbackwithinaperiodof15daysfromdateofyourreceiptofthis,failingwhichitwillbedeemedthatyouaresatisfiedwiththecorrectnessofthedetailsmentionedbelow.Kindlynotethatasthecontentsanddeclarationscontainedinthistranscriptisthebasisonwhichwehaveissuedthepolicytoyou,weadviseyoutopleaseensurethatyouhaveprovided/disclosedandornotwithheldanymaterialfacts/informationanddeclarations,asPolicybecomesVoidabinitioifmaterialfactsarenotprovided/disclosedandorwithheldandinsuchcasenoclaim,ifany,willbeconsideredbyusapartfromforfeitureofthepremium.Detailsprovidedbyyou:A.Proposerdetails1.ProposerName:RAJINDERKUMARGUPTA2.ProposerAddress:NEARPIPALCHOWK,ASSANDH,,KARNAL,HARYANA-1320393.ProposerMobileNumber:4.ProposerResidentialNumber:NA5.Proposere-mailid:NA6.ProposerProfession:NAB.VehicleDetailsRegistrationNumberMonth/YearofRegnVehicleMakeVehicleModelVehicleSubTypeCubicCapa-cityFuelTypeYearofMan-ufactureSeatingCa-pacityHR40A4511OCT/2004MARUTIALTOSTANDARD796Petrol20045EngineNumberChassisNumberVehicleIDV(inRs.)ElectricalAccessoriesIDV(inRs.)Non-ElectricalAccessoriesIDV(inRs.)CNG/LPGUnit(Extrafitted)IDV(inRs.)TotalIDV(inRs.)0109910973478100000081000C.Coverageopted1.PeriodofInsurance:From20-OCT-201600:01(Hrs)To19-OCT-2017Midnight2.IsyourvehiclefittedwithexternalLPG/CNGkit:No.3.ElectricalAccessoriescoverOpted(IfApplicable):No.4.Non-ElectricalAccessoriescoverOpted(IfApplicable)::No.5.IsVoluntaryExcessopted:No.Amountofvoluntaryexcessopted:Rs.NA.6.WhetherPAcoverisoptedforowner-driver:Yes.7.Isanyadditionalcompulsorydeductibleimposed
andagreedupon:No.Amountofadditionalcompulsorydeductibleimposed:NA.8.Whethergeographicalareaextensionisopted:No.DetailsofCountriestowhichgeographicalareaextensioncoverisgiven:NA.9.IsLLtopersonforPaiddriver/Operation/Maintenanceopted:Yes.10.WhetherPAcoverisoptedforpaiddriverotherthanownerdriver:No.SumInsuredforPaidDriver:Rs.NA.11.WhetherPAcoverisoptedforpassengers:No.SumInsuredperPassenger:Rs.NA.12.IsTPPDrestrictedtostatutorylimitofRs.6000?:No.13.PreExistingdamagesinthevehicle:CostofRepair/ReplacementtowardsthedamagedpartsnoticedduringtheinspectionofyourvehiclepriortoenrolmentunderthispolicyasperInspectionreportreferencenumber2016-02305432dulysignedbyyouoryourrepresentativeaswellasthephotographsshallbeexcludedintheeventofanyfutureclaims.14.PremiumforLiabilitycoverage,quotedandagreeduponis:Rs.2205.15.PremiumforODcoverage,quotedandagreeduponis:Rs.861.16.TotalPremium(excludingServiceTaxandEducationCess)forLiabilityandODcoverages,quotedandagreeduponis:Rs.306617.NCB(NoClaimBonus)claimedbyyouandgrantedbyusbasedonyourdeclarationofnoclaimduringyourpreviouspreviouspolicy:-50%.18.Aboutthelastinsurancecompany(i)InsuranceProvider:IFFCOTokioGeneralInsuranceCompanyLimited..(ii)PreviousPolicyNo:1-3S5LBHC,PreviousPolicyExpiryDate:01-SEP-1619.WhetheryourvehicleisHypothecatedandifsothedetailsofPledgeewhosenameisregisteredbyus:No.NameofPledgee:NA.20.AddonCover(s)opted:No.Planname:NAPleasenoteCoverNoteNo.DY1303078914/19-OCT-201610:20issuedtoyoubasingontheaboveinformation.IncaseofDisagreementorobjectionoranychangeswithrespecttoinformationandcontentsmentionedhereinabove,pleasecontactourtollfreenumber&registeryourobjections/changes/disagreementtothecontentsofthistranscriptoryoumayalsosendusemailorwrittencorrespondenceatthefollowingdetailswithinaperiodof15daysfromdateofyourreceiptofthistranscriptalongwithPolicy:I/WeherebyunconditionallyallowtheCompanytoshareallmy/ourinformationbeingcollectedinthisproposalformorthroughtelephonic/email/web-inputsmeansorothermeans,asupdatedfromtimetotimewithing
roupentities.TollfreeNumber:1800-22-5858,1800-102-5858,1800-209-5858Emailaddress:customercare@bajajallianz.co.inWebsite:www.bajajallianz.comContactourpolicyservicingbranchat:BlockNo-4,7thFloor,DLFTowers,,15,ShivajiMarg,,-,NewDelhi-110015PH:011-66278000.♦♣♠

I want to extract ProposerName.

1 Answer

0 votes
/ July 11, 2020

Look at this piece of the string: Proposerdetails1.ProposerName:RAJINDERKUMARGUPTA2.ProposerAddress:NEARPIPALCHOWK

Do you see the <Key>:<Value>. pattern, as in ProposerName:RAJINDERKUMARGUPTA2.? (The trailing 2 is the number of the next list item, fused onto the name because the extracted text has no separators.)

You want to find this specific key/value pair in the text and take the value.

You can use regular expressions (Regex).

using System.Text.RegularExpressions;

// Assume your data is stored in 'fullText'.
public static string GetProposerName(string fullText)
{
   /* The pattern looks for the key "ProposerName:", followed by a value we
    * don't know yet, terminated by the next item number and a '.' ("2.").
    * The part inside the parentheses is a capture group: the lazy (.*?)
    * captures as little as possible, so it stops right before that number. */
   string regexPattern = @"ProposerName:(.*?)\d+\.";

   // Regex and Match come from System.Text.RegularExpressions.
   Match match = Regex.Match(fullText, regexPattern);

   // Important check: the pattern may not be found at all.
   if (!match.Success) return null;

   /* Reaching this point means the pattern was found in the text.
    * Groups[0] is the whole match (key included); Groups[1] is the first
    * capture group, i.e. only the value. */
   string proposerName = match.Groups[1].Value;

   return proposerName;
}
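
If you need more than one field, the same idea can be generalized. Below is a minimal sketch, not a definitive implementation: `FieldExtractor` and `ExtractField` are hypothetical names, and the pattern assumes that every value ends where the next numbered item begins (digits, a period, then an uppercase letter) or at the end of the text.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FieldExtractor
{
    // Hypothetical helper: returns the text that follows "<key>:" up to the
    // next numbered item ("2.Proposer...", "3.Proposer...") or the end of input.
    // Returns null when the key is not present.
    public static string ExtractField(string fullText, string key)
    {
        // Regex.Escape protects against keys containing regex metacharacters.
        // (?=...) is a lookahead: it checks what follows without consuming it.
        string pattern = Regex.Escape(key) + @":(.*?)(?=\d+\.[A-Z]|$)";

        Match match = Regex.Match(fullText, pattern, RegexOptions.Singleline);
        return match.Success ? match.Groups[1].Value : null;
    }

    public static void Main()
    {
        // Fragment copied from the text in the question.
        string sample = "A.Proposerdetails1.ProposerName:RAJINDERKUMARGUPTA"
                      + "2.ProposerAddress:NEARPIPALCHOWK,ASSANDH,,KARNAL,HARYANA-132039"
                      + "3.ProposerMobileNumber:4.ProposerResidentialNumber:NA"
                      + "5.Proposere-mailid:NA";

        Console.WriteLine(ExtractField(sample, "ProposerName"));              // RAJINDERKUMARGUPTA
        Console.WriteLine(ExtractField(sample, "ProposerResidentialNumber")); // NA
    }
}
```

Be aware of the limits of this approach: because the extracted text has no separators, a value that ends in digits runs into the next item number. In ProposerAddress the PIN code and the following "3." are fused into "1320393.", so this pattern cannot tell where the address ends and will cut it short. Fields like that need either a field-specific pattern or text extracted with word boundaries preserved.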
...