Microsoft Word to PDF/HTML Converter

While developing a content management system, I came across a requirement where uploaded Word documents need to be converted to PDF on the server, saved, and later made available for display in the browser.


This is a clean approach because the content of the pages changes once in a while, and there’s no point in making the content static and then reinventing the wheel whenever it changes.

The application uploads the Word documents to the server, where they are saved as PDF and become part of a dynamic menu that shows a link for each document that has been uploaded. Whenever the content of a file changes, the changed file can be uploaded to the server again and the modified content becomes available to the user.

One prerequisite for this functionality is that the Save as PDF add-in is available in MS Word 2007.

If it is not available, it can be downloaded from here.

I won’t go into the creation of the dynamic menu and all the other parts; I’ll just explain how the “.docx” to “.pdf” conversion works.

Here is the code for the conversion.

public static string ConvertDocument(string filePath, string folder_to_save_in, string FileName)
{
    // Placeholder passed for every optional COM argument that is not set.
    object o_nullobject = System.Reflection.Missing.Value;
    // Close the document and quit Word without prompting to save changes.
    object o_nosave = Microsoft.Office.Interop.Word.WdSaveOptions.wdDoNotSaveChanges;
    string newfilename = string.Empty;

    // Set up a Word application.
    Microsoft.Office.Interop.Word.ApplicationClass wordApplication = new Microsoft.Office.Interop.Word.ApplicationClass();
    Microsoft.Office.Interop.Word.Document doc = null;

    try
    {
        // Open the Word document.
        object o_filePath = filePath;
        doc = wordApplication.Documents.Open(ref o_filePath,
            ref o_nullobject, ref o_nullobject, ref o_nullobject, ref o_nullobject, ref o_nullobject,
            ref o_nullobject, ref o_nullobject, ref o_nullobject, ref o_nullobject, ref o_nullobject,
            ref o_nullobject, ref o_nullobject, ref o_nullobject, ref o_nullobject, ref o_nullobject);

        // Save the document in PDF format. For HTML, switch to the two
        // commented-out lines below.
        //newfilename = System.IO.Path.Combine(folder_to_save_in, System.IO.Path.ChangeExtension(FileName, ".html"));
        newfilename = System.IO.Path.Combine(folder_to_save_in, System.IO.Path.ChangeExtension(FileName, ".pdf"));
        object o_newfilename = newfilename;
        //object o_format = Microsoft.Office.Interop.Word.WdSaveFormat.wdFormatHTML;
        object o_format = Microsoft.Office.Interop.Word.WdSaveFormat.wdFormatPDF;
        object o_encoding = null;
        object o_endings = Microsoft.Office.Interop.Word.WdLineEndingType.wdCRLF;

        doc.SaveAs(ref o_newfilename, ref o_format, ref o_nullobject,
            ref o_nullobject, ref o_nullobject, ref o_nullobject, ref o_nullobject, ref o_nullobject, ref o_nullobject,
            ref o_nullobject, ref o_nullobject, ref o_encoding, ref o_nullobject,
            ref o_nullobject, ref o_endings, ref o_nullobject);
    }
    catch (Exception)
    {
        // The conversion failed: return an empty string rather than the
        // path of a file that was never written. Log the exception here as needed.
        newfilename = string.Empty;
    }
    finally
    {
        // Close the original document and quit Word so that no WINWORD.EXE
        // process is left running on the server.
        if (doc != null)
        {
            doc.Close(ref o_nosave, ref o_nullobject, ref o_nullobject);
        }
        ((Microsoft.Office.Interop.Word._Application)wordApplication).Quit(ref o_nosave, ref o_nullobject, ref o_nullobject);
    }

    return newfilename;
}
 
For this code to work, you need to add a reference to the Microsoft.Office.Interop.Word DLL (in VS 2008 there will be two versions, version 11.0 and version 12.0; this code uses 12.0).


The code works by creating an instance of the Word application, then opening the document specified in the “filePath” parameter, and finally saving it in the “folder_to_save_in” folder.
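
Since the conversion only succeeds if the source file exists and Word actually writes the output, a caller may want to check both ends. The following is a minimal sketch of such a wrapper, not part of the original code: ConversionRunner and TryConvert are hypothetical names, and the converter is passed in as a delegate with the same shape as ConvertDocument so that the sketch compiles on its own.

```csharp
using System;
using System.IO;

public static class ConversionRunner
{
    // Hypothetical wrapper: 'convert' is any method with the same shape as
    // ConvertDocument(filePath, folder_to_save_in, FileName).
    public static bool TryConvert(
        Func<string, string, string, string> convert,
        string sourcePath,
        string outputFolder,
        out string outputPath)
    {
        outputPath = string.Empty;

        // Nothing to convert if the uploaded file is missing.
        if (!File.Exists(sourcePath))
        {
            return false;
        }

        // Make sure the target folder exists before Word tries to save into it.
        Directory.CreateDirectory(outputFolder);

        string result = convert(sourcePath, outputFolder, Path.GetFileName(sourcePath));

        // Treat an empty path or a missing output file as a failed conversion.
        if (string.IsNullOrEmpty(result) || !File.Exists(result))
        {
            return false;
        }

        outputPath = result;
        return true;
    }
}
```

Assuming ConvertDocument is in scope, it could be called as ConversionRunner.TryConvert(ConvertDocument, sourcePath, outputFolder, out pdfPath).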

The WdSaveFormat enum has a number of options (the same ones you get when using the Save As functionality of MS Word), so the document can also be saved as HTML.
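
To illustrate, here is a small sketch that picks the WdSaveFormat value from the extension of the target file. SaveFormatHelper and FormatFor are hypothetical names introduced for this example, the “Word” alias is just shorthand, and the sketch assumes the project references Microsoft.Office.Interop.Word 12.0 as described above.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

public static class SaveFormatHelper
{
    // Hypothetical helper: maps a target file extension to the matching
    // WdSaveFormat value so one conversion routine can serve several formats.
    public static Word.WdSaveFormat FormatFor(string targetExtension)
    {
        switch (targetExtension.ToLowerInvariant())
        {
            case ".pdf":
                return Word.WdSaveFormat.wdFormatPDF;
            case ".html":
            case ".htm":
                return Word.WdSaveFormat.wdFormatHTML;
            case ".rtf":
                return Word.WdSaveFormat.wdFormatRTF;
            case ".txt":
                return Word.WdSaveFormat.wdFormatText;
            default:
                throw new ArgumentException("Unsupported target extension: " + targetExtension);
        }
    }
}
```

The returned value can be boxed into the o_format object that is passed to SaveAs.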

This code returns a string holding the location where the document is saved, as the requirement was to update an XML file. Please make the necessary modifications as required.
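
As a rough idea of how the returned path could be recorded, here is a minimal sketch using LINQ to XML. The layout of the XML file (a “menu” root with “item” elements carrying “title” and “href” attributes) and the names MenuXml and AddOrUpdateEntry are assumptions made for this example; the actual menu file in the application may look quite different.

```csharp
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class MenuXml
{
    // Hypothetical helper: stores the converted document's path in a menu file.
    // Assumed layout: <menu><item title="..." href="..." /></menu>
    public static void AddOrUpdateEntry(string xmlPath, string title, string documentPath)
    {
        // Load the existing menu, or start a new one if the file does not exist yet.
        XDocument menu = File.Exists(xmlPath)
            ? XDocument.Load(xmlPath)
            : new XDocument(new XElement("menu"));

        // Look for an entry with the same title so a re-upload replaces the link.
        XElement item = menu.Root.Elements("item")
            .FirstOrDefault(e => (string)e.Attribute("title") == title);

        if (item == null)
        {
            item = new XElement("item", new XAttribute("title", title));
            menu.Root.Add(item);
        }

        item.SetAttributeValue("href", documentPath);
        menu.Save(xmlPath);
    }
}
```

Uploading a changed document under the same title then updates the existing entry instead of adding a duplicate.
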
Hope this was helpful.

Till we connect next…

Happy Coding!


