Question

Using Apache POI, how do I convert an MS Word file to PDF?

I am using the following code, but it does not work and produces errors. I suspect I am importing the wrong classes?

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.OutputStream;

import org.apache.poi.hslf.record.Document;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.hwpf.usermodel.Paragraph;
import org.apache.poi.hwpf.usermodel.Range;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

public class TestCon {

    public static void main(String[] args) {
        POIFSFileSystem fs = null;
        Document document = new Document();

        try {
            System.out.println("Starting the test");
            fs = new POIFSFileSystem(new FileInputStream("/document/test2.doc"));

            HWPFDocument doc = new HWPFDocument(fs);
            WordExtractor we = new WordExtractor(doc);

            OutputStream file = new FileOutputStream(new File("/document/test.pdf"));

            PdfWriter writer = PdfWriter.getInstance(document, file);

            Range range = doc.getRange();
            document.open();
            writer.setPageEmpty(true);
            document.newPage();
            writer.setPageEmpty(true);

            String[] paragraphs = we.getParagraphText();
            for (int i = 0; i < paragraphs.length; i++) {
                org.apache.poi.hwpf.usermodel.Paragraph pr = range.getParagraph(i);
                // CharacterRun run = pr.getCharacterRun(i);
                // run.setBold(true);
                // run.setCapitalized(true);
                // run.setItalic(true);
                paragraphs[i] = paragraphs[i].replaceAll("\\cM?\r?\n", "");
                System.out.println("Length:" + paragraphs[i].length());
                System.out.println("Paragraph" + i + ": " + paragraphs[i].toString());

                // add the paragraph to the document
                document.add(new Paragraph(paragraphs[i]));
            }

            System.out.println("Document testing completed");
        } catch (Exception e) {
            System.out.println("Exception during test");
            e.printStackTrace();
        } finally {
            // close the document
            document.close();
        }
    }
}

Harinder · Answer 1 · June 2, 2011

Solved it:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.OutputStream;

import com.lowagie.text.Document;
import com.lowagie.text.DocumentException;
import com.lowagie.text.Paragraph;
import com.lowagie.text.pdf.PdfWriter;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.hwpf.usermodel.Range;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

public class TestCon {

    public static void main(String[] args) {
        POIFSFileSystem fs = null;
        Document document = new Document();

        try {
            System.out.println("Starting the test");
            fs = new POIFSFileSystem(new FileInputStream("D:/Resume.doc"));

            HWPFDocument doc = new HWPFDocument(fs);
            WordExtractor we = new WordExtractor(doc);

            OutputStream file = new FileOutputStream(new File("D:/test.pdf"));

            PdfWriter writer = PdfWriter.getInstance(document, file);

            Range range = doc.getRange();
            document.open();
            writer.setPageEmpty(true);
            document.newPage();
            writer.setPageEmpty(true);

            String[] paragraphs = we.getParagraphText();
            for (int i = 0; i < paragraphs.length; i++) {
                org.apache.poi.hwpf.usermodel.Paragraph pr = range.getParagraph(i);
                // CharacterRun run = pr.getCharacterRun(i);
                // run.setBold(true);
                // run.setCapitalized(true);
                // run.setItalic(true);
                paragraphs[i] = paragraphs[i].replaceAll("\\cM?\r?\n", "");
                System.out.println("Length:" + paragraphs[i].length());
                System.out.println("Paragraph" + i + ": " + paragraphs[i].toString());

                // add the paragraph to the document
                document.add(new Paragraph(paragraphs[i]));
            }

            System.out.println("Document testing completed");
        } catch (Exception e) {
            System.out.println("Exception during test");
            e.printStackTrace();
        } finally {
            // close the document
            document.close();
        }
    }
}
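
The replaceAll call in the loop above is easy to misread, so here is a small standalone sketch of what that pattern does to a paragraph string. The class and method names (ParagraphCleanup, stripLineEnding) and the sample strings are made up for illustration; only the regular expression is taken from the answer. In Java regex syntax \cM is the control character Ctrl-M, that is, a carriage return.

```java
class ParagraphCleanup {

    // Same pattern as in the answer: an optional Ctrl-M (carriage return),
    // an optional \r, then a mandatory \n. Every match is removed.
    static String stripLineEnding(String paragraph) {
        return paragraph.replaceAll("\\cM?\r?\n", "");
    }

    public static void main(String[] args) {
        // Hypothetical sample paragraphs, not real Word output.
        String[] samples = { "First paragraph\r\n", "Second paragraph\n", "No terminator" };
        for (String sample : samples) {
            // Brackets make any leftover whitespace visible.
            System.out.println("[" + stripLineEnding(sample) + "]");
        }
    }
}
```

Note that the pattern only matches when a \n is present, so a paragraph that ends in a bare \r is left unchanged.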

Kushagra Sahni · Answer 2 · April 18, 2017

This worked for me:

Source: http://www.programcreek.com/java-api-examples/index.php?api=org.apache.poi.xwpf.converter.pdf.PdfConverter

package pdf;

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.poi.xwpf.converter.pdf.PdfConverter;
import org.apache.poi.xwpf.converter.pdf.PdfOptions;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

public class PDF {
    public static void main(String[] args) throws Exception {
        String inputFile = "D:/TEST.docx";
        String outputFile = "D:/TEST.pdf";
        if (args != null && args.length == 2) {
            inputFile = args[0];
            outputFile = args[1];
        }
        System.out.println("inputFile:" + inputFile + ",outputFile:" + outputFile);

        // try-with-resources closes both streams even if the conversion fails
        try (InputStream in = new FileInputStream(inputFile);
             OutputStream out = new FileOutputStream(outputFile)) {
            XWPFDocument document = new XWPFDocument(in);
            // no custom options are passed to the converter
            PdfOptions options = null;
            PdfConverter.getInstance().convert(document, out, options);
        }
    }
}

Rohit Dubey · Answer 3 · August 12, 2016

The following code works for me:

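import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.OutputStream;

// iText classes; the com.lowagie packages are an assumption here, as in Answer 1
import com.lowagie.text.Document;
import com.lowagie.text.Paragraph;
import com.lowagie.text.pdf.PdfWriter;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;

public class DocToPdfConverter {

    public static void main(String[] args) {
        String k = null;
        OutputStream fileForPdf = null;
        try {
            String fileName = "/document/test2.doc";

            // Below code is for .doc files
            if (fileName.endsWith(".doc")) {
                HWPFDocument doc = new HWPFDocument(new FileInputStream(fileName));
                WordExtractor we = new WordExtractor(doc);
                k = we.getText();

                fileForPdf = new FileOutputStream(new File("/document/DocToPdf.pdf"));
                we.close();
            }
            // Below code is for .docx files
            else if (fileName.endsWith(".docx")) {
                XWPFDocument docx = new XWPFDocument(new FileInputStream(fileName));
                // using the XWPFWordExtractor class
                XWPFWordExtractor we = new XWPFWordExtractor(docx);
                k = we.getText();

                fileForPdf = new FileOutputStream(new File("/document/DocxToPdf.pdf"));
                we.close();
            }

            // Write the extracted plain text into the PDF as a single paragraph
            Document document = new Document();
            PdfWriter.getInstance(document, fileForPdf);

            document.open();
            document.add(new Paragraph(k));
            document.close();
            fileForPdf.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}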
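
The extension checks above use endsWith directly, which is case-sensitive and has no branch for files that are neither .doc nor .docx. A minimal sketch of a more defensive check, using only the standard library, might look like this. The names WordFormat, Kind and detect are hypothetical and are not part of POI or iText.

```java
import java.util.Locale;

class WordFormat {

    enum Kind { DOC, DOCX, UNKNOWN }

    // Classifies a file name by extension, ignoring case.
    // ".docx" is tested first for readability; ".doc" cannot match a ".docx" name anyway.
    static Kind detect(String fileName) {
        if (fileName == null) {
            return Kind.UNKNOWN;
        }
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".docx")) {
            return Kind.DOCX;
        }
        if (lower.endsWith(".doc")) {
            return Kind.DOC;
        }
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        // Hypothetical file names for illustration.
        System.out.println(detect("/document/test2.doc"));
        System.out.println(detect("REPORT.DOCX"));
        System.out.println(detect("notes.txt"));
    }
}
```

A caller could then switch on the result and report an unsupported file for UNKNOWN instead of continuing with a null text and a null output stream.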

theshadow · Answer 4 · September 6, 2012

As a side note, it is also possible to read the content on the fly, directly from the Word / Excel content stream, instead of reading it from the file system and serializing it to disk, for example when retrieving content from CMIS repositories:

 //HWPFDocument docx = new HWPFDocument(fs);  
 HWPFDocument docx = new HWPFDocument(doc.getContentStream().getStream());

(here doc is of type org.apache.chemistry.opencmis.client.api.Document; in this case I adapted your code to fetch a Word file from an Alfresco repository using OpenCMIS and convert it to PDF)

HTH

duffymo · Answer 5 · June 1, 2011

There are several steps here:

1. Read the Word document with POI into a format-independent form.
2. Convert that format-independent form to PDF.
3. Write the PDF.

I don't know whether POI will do step 2 for you. I would recommend something else for that, such as iText.
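
The three steps can be sketched as a small pipeline. This is only an illustration of the separation, using the standard library alone: the interfaces SourceReader and TargetRenderer, the class ConversionPipeline, and both stand-in implementations are hypothetical. In a real converter the reader would be backed by POI and the renderer by a PDF library such as iText.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class ConversionPipeline {

    /** Step 1: parse the input into a format-independent form (a list of paragraphs). */
    interface SourceReader {
        List<String> read(InputStream in) throws IOException;
    }

    /** Step 2: turn the format-independent form into bytes of the target format. */
    interface TargetRenderer {
        byte[] render(List<String> paragraphs) throws IOException;
    }

    /** Runs all three steps; step 3 is the plain write to the output stream. */
    static void convert(SourceReader reader, TargetRenderer renderer,
                        InputStream in, OutputStream out) throws IOException {
        List<String> paragraphs = reader.read(in);
        byte[] rendered = renderer.render(paragraphs);
        out.write(rendered);
        out.flush();
    }

    // Stand-in reader: treats every non-empty line of UTF-8 text as a paragraph.
    static final SourceReader PLAIN_TEXT_READER = in -> {
        List<String> paragraphs = new ArrayList<>();
        BufferedReader lines = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        String line;
        while ((line = lines.readLine()) != null) {
            if (!line.trim().isEmpty()) {
                paragraphs.add(line.trim());
            }
        }
        return paragraphs;
    };

    // Stand-in renderer: numbers the paragraphs instead of producing a real PDF.
    static final TargetRenderer NUMBERED_RENDERER = paragraphs -> {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < paragraphs.size(); i++) {
            sb.append(i + 1).append(": ").append(paragraphs.get(i)).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    };

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(
                "Title\n\nBody text\n".getBytes(StandardCharsets.UTF_8));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        convert(PLAIN_TEXT_READER, NUMBERED_RENDERER, in, out);
        System.out.print(out.toString("UTF-8"));
    }
}
```

Because convert works on an InputStream and an OutputStream, the same pipeline applies whether the document comes from a file, a byte array, or a repository stream.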

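Erich13 · Answer 6 · May 24, 2018

In addition to Kushagra's answer, here are the updated Maven dependencies (with these 2.0.1 artifacts the converter classes are imported from fr.opensagres.poi.xwpf.converter.pdf rather than org.apache.poi.xwpf.converter.pdf):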

    <dependency>
        <groupId>fr.opensagres.xdocreport</groupId>
        <artifactId>fr.opensagres.xdocreport.converter.docx.xwpf</artifactId>
        <version>2.0.1</version>
    </dependency>
    <dependency>
        <groupId>fr.opensagres.xdocreport</groupId>
        <artifactId>fr.opensagres.xdocreport.converter</artifactId>
        <version>2.0.1</version>
    </dependency>
    <dependency>
        <groupId>fr.opensagres.xdocreport</groupId>
        <artifactId>fr.opensagres.poi.xwpf.converter.pdf</artifactId>
        <version>2.0.1</version>
    </dependency>
    <dependency>
        <groupId>fr.opensagres.xdocreport</groupId>
        <artifactId>fr.opensagres.poi.xwpf.converter.xhtml</artifactId>
        <version>2.0.1</version>
    </dependency>
