Skip to main content

Microsoft word to PDF/HTML Converter

Developing a content management System, I came across this requirement where the uploaded word docs needs to be converted to PDF at server, saved and later on be available for display on the browser.


This is a clean approach as the content of the pages can change once in a while and there’s no point in making the content static and then reinventing the wheel when the content changes.

The application uploads the word documents to the server, which are saved as PDF and become a part of the dynamic menu that contains the link if the document has been uploaded to the server. Now whenever the content of the uploaded files changes, the changed files can be uploaded to the server again and the modified content is available to the user.

One of the pre-requisites for this functionality to work is the availability of save-as pdf template to be available in MS-Word 2007.

If it is not available, it can be downloaded from here.

I’ll not go into creation of dynamic menu and all other stuff, I’ll just explain how the “. Docx” to “.pdf” conversion works.

Here is the Code for the conversion.

public static string ConvertDocument(string filePath, string folder_to_save_in,string FileName)


{

Microsoft.Office.Interop.Word.ApplicationClass wordApplication = new Microsoft.Office.Interop.Word.ApplicationClass();

string newfilename = string.Empty;

try

{

// set up a Word Application...

// Opening a Word doc

object o_nullobject = System.Reflection.Missing.Value;

object o_filePath = filePath;

Microsoft.Office.Interop.Word.Document doc = wordApplication.Documents.Open(ref o_filePath,

ref o_nullobject, ref o_nullobject, ref o_nullobject, ref o_nullobject, ref o_nullobject,

ref o_nullobject, ref o_nullobject, ref o_nullobject, ref o_nullobject, ref o_nullobject,

ref o_nullobject, ref o_nullobject, ref o_nullobject, ref o_nullobject, ref o_nullobject);

// save The Doc in html/pdf format format...

//string newfilename = folder_to_save_in + @"\"+FileName.Replace(".docx", ".html");

newfilename = folder_to_save_in + @"\" + FileName.Replace(".docx", ".pdf");

object o_newfilename = newfilename;

//object o_format = Microsoft.Office.Interop.Word.WdSaveFormat.wdFormatHTML;

object o_format = Microsoft.Office.Interop.Word.WdSaveFormat.wdFormatPDF;

object o_encoding = null;

object o_endings = Microsoft.Office.Interop.Word.WdLineEndingType.wdCRLF;

wordApplication.ActiveDocument.SaveAs(ref o_newfilename, ref o_format, ref o_nullobject,

ref o_nullobject, ref o_nullobject, ref o_nullobject, ref o_nullobject, ref o_nullobject, ref o_nullobject,

ref o_nullobject, ref o_nullobject, ref o_encoding, ref o_nullobject,

ref o_nullobject, ref o_endings, ref o_nullobject);

// close the original doc ...

doc.Close(ref o_nullobject, ref o_nullobject, ref o_nullobject);

}

catch (Exception ex)

{


}

finally

{

}

return newfilename;

}
 
For this Code to work , you need to add a reference to Microsoft.Office.Interop.Word dll(in Vs-2008, there will be two versions verion 11.0 and version 12.0. This code uses 12.0).


The code works by creating an instance of word application, then opening the the document specied in the “filepath” parameter and the finally saving it in the “folder_to_save_in” folder.

WdSaveFormat enum has a number of options (the same that you get while using save as functionality of the MS-Word), so the document can be saves as HTML also.

This code  returns a strin that would be the location where the document is saved as the requirement was to update an XML file.Please make necessary modifications as required.
Hope this was Helpful,

Till Next we connect…..

Happy Coding!



Comments

Popular posts from this blog

Asp.Net 4.0: An Overview-Part-III

This is the last post in the series which will explore the following new features of ASP.Net 4.0  Performance Monitoring for Individual Applications in a Single Worker Process Web.config File Refactoring Permanently Redirecting a Page Expanding the Range of Allowable URLs Performance Monitoring for Individual Applications in a Single Worker Process It is a common practice to host multiple ASP.NET applications in a single worker process, In order to increase the number of Web sites that can be hosted on a single server. This practice results in difficulties for server administrators to identify an individual application that is experiencing problems. ASP.NET 4 introduces new resource-monitoring functionality introduced by the CLR. To enable this functionality, following XML configuration snippet is added to the aspnet.config configuration file.(This file is located in the directory where the .NET Framework is installed ) <?xml version="1.0" encoding="UTF-8"

WCF-REST Services-Part-II

HOW REST is implemented in WCF Part-I of the series explored the REST conceptually and this post will explore how REST is implemented in WCF. For REST implementation in WCF, 2 new attributes namely WebGetAttribute and WebInvokeAttribute are introduced in WCF along with a URI template mechanism that enables you to declare the URI and verb to which each method is going to respond. The infrastructure comes in the form of a binding ( WebHttpBinding ) and a behavior ( WebHttpBehavior ) that provide the correct networking stack for using REST. Also, there is some hosting infrastructure help from a custom Service¬Host ( WebServiceHost ) and a ServiceHostFactory ( WebServiceHostFactory ). How WCF Routes messages WCF routes network messages to methods on instances of the classes defined as implementations of the service. Default behavior ( Dispatching ) for WCF is to do this routing based on the concept of action. For this dispatching to work, an action needs to be present in ev

SOLID principles -Code Samples and Free Ebook

I planned to write code samples for SOLID principle implementations, however I am a firm believer of " NOT RE-INVENTING THE WHEEL ", when all you need is use the wheels and make a new CAR. Going by the ideology, I have found an excellent  SOLID principles FREE -Ebook ( covering all aspects of SOLID design principles, with Code sample). This book is an excellent visual aid to remember these principles, as it uses Motivational posters for explaining SOLID design principles. One additional advantage to the above mentioned book is the Code-Refactoring ebook . Both of these books can be downloaded from this EBOOK download Link Both of these books can be downloaded form here. Hope this book proves useful... Till next we connect.... Happy Learning..