Saturday, September 19, 2009

PDF File Verification

Testers sometimes need to verify the contents of a PDF file, and I have seen questions about this in a few forums. Verifying graphical content is not easy, but the text content can be verified in three ways.

First way - Scripting
You can use any scripting language, such as JavaScript or VBScript, to verify a PDF. With this approach the PDF file is not opened in a viewer; the script retrieves the contents internally. For more information, read through this link - Accessing PDF

Second way - Utility
Convert the PDF file to text with a utility and then verify the text file. Many freeware tools are available for this purpose; I suggest TextMining Tool. With this approach, too, the PDF file is not opened in a viewer.

Sample code in Silktest

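// Path to the PDF-to-text converter (TextMining Tool)
STRING sPDF2TxtUtil = "F:\TextMining\minetext.exe"
STRING sCmdExecute
LIST OF STRING lsCmdOut
// sPdfFile and sConvertedFile are assumed to hold the input PDF and output text file paths
sCmdExecute = "{sPDF2TxtUtil} {sPdfFile} {sConvertedFile}"
Print ("Command: {sCmdExecute}")
// Run the converter and capture its console output
SYS_Execute (sCmdExecute, lsCmdOut)
Print (lsCmdOut)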
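
Once the text file exists, the verification itself is just a text check. Below is a minimal sketch in Python rather than Silktest, assuming the converter wrote a UTF-8 readable text file. The function name find_missing_text and the file name in the usage line are made up for illustration; they are not part of TextMining Tool or Silktest.

```python
from pathlib import Path


def find_missing_text(converted_file, expected_strings):
    """Return the expected strings that do not appear in the converted text."""
    # errors="replace" keeps the check going if the converter emits odd bytes.
    text = Path(converted_file).read_text(encoding="utf-8", errors="replace")
    # Collapse whitespace so line wrapping in the converted text does not matter.
    normalized = " ".join(text.split())
    return [s for s in expected_strings if " ".join(s.split()) not in normalized]
```

For example, find_missing_text("converted.txt", ["Invoice No", "Total"]) returns an empty list when both strings are present, and the test can fail when the list is not empty. Whitespace is collapsed on both sides because converted PDF text often wraps lines differently from the original.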

Third way - Using Adobe Reader
With this approach, open the PDF file in Adobe Reader, go to the File menu, and click the Save As Text submenu item. This stores the PDF contents as a text file, which you can then use for verification.

To compare PDF files, you can use comparison software (for example, Beyond Compare from version 3.0 onward). These tools internally convert the PDFs to text files and then compare the text.
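
The same convert-then-compare idea can be scripted if both PDFs have already been converted to text by one of the ways above. This is only a rough sketch in Python using the standard difflib module; it is not how Beyond Compare is implemented, and the function name diff_text_files is made up for illustration.

```python
import difflib
from pathlib import Path


def diff_text_files(baseline_file, actual_file):
    """Return unified-diff lines for two converted text files (empty if identical)."""
    baseline = Path(baseline_file).read_text(encoding="utf-8", errors="replace").splitlines()
    actual = Path(actual_file).read_text(encoding="utf-8", errors="replace").splitlines()
    # lineterm="" because splitlines() already removed the line endings.
    return list(difflib.unified_diff(baseline, actual,
                                     fromfile=str(baseline_file),
                                     tofile=str(actual_file),
                                     lineterm=""))
```

An empty result means the two text files match line for line; otherwise the returned lines can be printed to the test log to show what changed.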

