Four Weapons for Extracting Office and PDF Content with Java, cold moon palace, published on 2004-11-25 1:45:02
Many people have asked how to extract text from Word, Excel, and PDF files, so here I summarize several ways to extract from Word and PDF.

1. Use Jacob

Jacob is a bridge: middleware that connects Java to COM or Win32 functions. Jacob cannot read Word, Excel, and similar files on its own. You would normally have to write a DLL yourself, but one has already been written for you, provided by Jacob's author.

JACOB download: http://www.matrix.org.cn/down_view.asp?id=13

After downloading Jacob, put it in the required locations (the DLL goes on the PATH, the JAR file goes on the classpath), and you can then write your own extraction program. The following is an example:
import java.io.File;
import com.jacob.com.*;
import com.jacob.activeX.*;

public class FileExtracter {
    public static void main(String[] args) {
        // Start Word through COM
        ActiveXComponent app = new ActiveXComponent("Word.Application");
        String inFile = "c:\\test.doc";   // source document
        String tpFile = "c:\\temp.htm";   // HTML output
        String otFile = "c:\\temp.xml";
        boolean flag = false;
        try {
            // Keep the Word window hidden
            app.setProperty("Visible", new Variant(false));
            Object docs = app.getProperty("Documents").toDispatch();
            // Open(FileName, ConfirmConversions = false, ReadOnly = true)
            Object doc = Dispatch.invoke(docs, "Open", Dispatch.Method,
                    new Object[] {inFile, new Variant(false), new Variant(true)},
                    new int[1]).toDispatch();
            // SaveAs with format 8 (wdFormatHTML) writes the document as HTML
            Dispatch.invoke(doc, "SaveAs", Dispatch.Method,
                    new Object[] {tpFile, new Variant(8)}, new int[1]);
            // Close the document without saving changes
            Variant f = new Variant(false);
            Dispatch.call(doc, "Close", f);
            flag = true;
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Always shut Word down, even if the conversion failed
            app.invoke("Quit", new Variant[] {});
        }
    }
}

2. Use Apache POI to extract Word and Excel

POI is an Apache project, but even so you may find it tedious to use directly. That does not matter: here is a simpler interface for you.

Download the packaged POI bundle: http://www.matrix.org.cn/down_view.asp?id=14

After downloading, put it on your classpath. Here is an example of how to use it: