Comparison of two documents

We use the GemBox.Document library to export large documents for the construction industry (measurement protocols, invoices, etc.). The documents are based on a large controller that generates the data, and it’s very important to continuously check that the output stays the same.

One of our unit tests therefore produces documents from test data and compares the output to reference documents that were generated initially.

To compare the documents, I’m using some quick and dirty code:

if (generatedDoc.Content.ToString() != correctDocument.Content.ToString())
{
    // Start visual comparison by using Microsoft.Office.Interop.Word
}

Sure, this comparison has some disadvantages. Sometimes it reports problems even though there are no data or design changes in the document. This is happening right now, as I try to upgrade our GemBox.Document library from v3.1.1134 to the latest v3.3.1187: the content shows a lot of changes even though the output is (probably) correct.

My question is: does anyone (GemBox support) use similar tests to check whether the document content is still the same? Maybe there is already some code for comparing two document models somewhere?

Regards
Lukas

Hi Lukas,

Currently, GemBox.Document doesn’t provide any simple or straightforward API for comparing two DocumentModel objects. However, we do hope to provide this feature sometime in the future.

For now, I believe that your current approach is the easiest one, comparing the documents based on their plain text representation.
Another similar way would be to save the DocumentModel objects to HTML files (or streams) and compare the resulting HTML text.
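
The HTML-based comparison could be sketched roughly like this. It is only an illustration: the names HtmlComparison, ToHtml and HaveSameHtml are made up for the example, and it assumes the HTML is written with the default UTF-8 encoding.

```csharp
using System.IO;
using System.Text;
using GemBox.Document;

static class HtmlComparison
{
    // Saves the document to an in-memory stream as HTML and returns the markup as a string.
    // Assumption: the default HTML save options write UTF-8.
    static string ToHtml(DocumentModel document)
    {
        using (var stream = new MemoryStream())
        {
            document.Save(stream, SaveOptions.HtmlDefault);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }

    // True when both documents produce exactly the same HTML text.
    public static bool HaveSameHtml(DocumentModel generatedDoc, DocumentModel correctDocument)
    {
        return ToHtml(generatedDoc) == ToHtml(correctDocument);
    }
}
```

Compared to plain text, the HTML output also carries formatting, so it catches design changes as well, but it is probably also more sensitive to differences between library versions.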

Nevertheless, whichever approach you use will occasionally produce some false negatives, that is, the comparison fails even though the documents are effectively the same. That’s because we’re constantly adding improvements to GemBox.Document, for instance, introducing API support for new document elements, adding support for more styling and formatting options, etc.
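
If some of those differences turn out to be only whitespace or line-break changes (an assumption, this would have to be checked against the actual output), the text comparison could be made a bit more tolerant. Here is a minimal sketch in plain C#; TextComparison, Normalize and FirstDifference are hypothetical helper names, not part of GemBox.Document.

```csharp
using System;
using System.Text.RegularExpressions;

static class TextComparison
{
    // Collapses every run of whitespace (spaces, tabs, line breaks) into a single space
    // and trims the ends, so purely cosmetic spacing changes are ignored.
    public static string Normalize(string text)
    {
        return Regex.Replace(text, @"\s+", " ").Trim();
    }

    // Returns -1 when both texts are equal after normalization; otherwise the index
    // (in the normalized text) of the first character that differs.
    public static int FirstDifference(string generated, string expected)
    {
        string a = Normalize(generated);
        string b = Normalize(expected);
        int shared = Math.Min(a.Length, b.Length);

        for (int i = 0; i < shared; i++)
        {
            if (a[i] != b[i])
                return i;
        }

        // One text is a prefix of the other, or they are identical.
        return a.Length == b.Length ? -1 : shared;
    }
}
```

It could be called with the two Content.ToString() results; a return value other than -1 points at the position where the texts start to diverge, which makes it easier to judge whether a reported change is real.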

Regards,
Mario

OK, thanks for the information and suggestions!

Regards
Lukas