Compare two Word documents in C#


I need to compare two Word documents, both text and formatting, in C#. I found a third-party library for viewing and processing documents, DevExpress, and downloaded the trial to check whether it can solve the problem.

For example, I have two Word documents:

Document 1: This is a text example

Document 2: This is not a text example

The only difference between the two texts above is the word "not".

My problem is: how can I check the differences, including the formatting?

So far, this is my code for iterating over the contents of the document:

        // Requires: using System; using System.Diagnostics; using DevExpress.XtraRichEdit.API.Native;
        public void CompareEpub(string word)
        {
            try
            {
                using (DevExpress.XtraRichEdit.RichEditDocumentServer srv = new DevExpress.XtraRichEdit.RichEditDocumentServer())
                {
                    srv.LoadDocument(word);
                    MyIterator visitor = new MyIterator();
                    DocumentIterator iterator = new DocumentIterator(srv.Document, true);
                    // Walk every element of the document and let the visitor collect the text runs.
                    while (iterator.MoveNext())
                    {
                        iterator.Current.Accept(visitor);
                    }
                    // Dump each collected run with its formatting flags.
                    foreach (var item in visitor.ListOfText)
                    {
                        Debug.WriteLine("text: " + item.Text + " b: " + item.IsBold + " u: " + item.IsUnderline + " i: " + item.IsItalic);
                    }
                }
            }
            catch (Exception ex)
            {
                Debug.WriteLine(ex.Message);
                Debug.WriteLine(ex.StackTrace);
                throw; // rethrow without resetting the stack trace
            }
        }


        // Visitor that records each text run together with its formatting flags.
        public class MyIterator : DocumentVisitorBase
        {
            public List<Model.HtmlContent> ListOfText { get; }

            public MyIterator()
            {
                ListOfText = new List<Model.HtmlContent>();
            }

            public override void Visit(DocumentText text)
            {
                var m = new Model.HtmlContent
                {
                    Text = text.Text,
                    IsBold = text.TextProperties.FontBold,
                    IsItalic = text.TextProperties.FontItalic,
                    // Note: UnderlineWordsOnly only says whether underlining skips spaces;
                    // it may not tell you whether the run is underlined at all.
                    IsUnderline = text.TextProperties.UnderlineWordsOnly
                };
                ListOfText.Add(m);
            }
        }

With the code above I can walk through the text and its formatting. But how can I use this for a text comparison?
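One possible first step, sketched here under assumptions, is to break each collected run into word-level tokens that carry the run's formatting, since a single run such as "This is a text example" holds several words. The `RunTokenizer` class and the tuple shape are hypothetical names for illustration and are not part of DevExpress; the flags would come from the `Model.HtmlContent` items collected by the visitor.

```csharp
using System;
using System.Collections.Generic;

public static class RunTokenizer
{
    // Splits one formatted run into words; every word inherits the run's flags.
    // Value tuples compare by value, so two tokens are equal only when both
    // the word and all three flags match.
    public static IEnumerable<(string Text, bool IsBold, bool IsItalic, bool IsUnderline)> Split(
        string runText, bool bold, bool italic, bool underline)
    {
        // A null separator array means "split on any whitespace".
        var words = runText.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        foreach (var word in words)
        {
            yield return (word, bold, italic, underline);
        }
    }
}
```

Calling `RunTokenizer.Split` for every item in `ListOfText` and concatenating the results would give one flat token list per document, which is a convenient input for a sequence comparison.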

Suppose I create two lists, one for each document. How do I compare them?

If I compare one list against the other item by item in a loop, only the first two words will be reported as equal, because the inserted word shifts everything after it.

Can anyone help me with this, or just give me an idea of how to make it work?
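The shifting problem described above is what a longest-common-subsequence (LCS) diff is designed to handle. The following is a minimal sketch of that technique, not a DevExpress feature; `TokenDiff` and `DiffKind` are hypothetical names. It works on any token type with value equality, so tokens that include formatting flags are compared on text and format together.

```csharp
using System;
using System.Collections.Generic;

public enum DiffKind { Equal, Insert, Delete }

public static class TokenDiff
{
    // Classic dynamic-programming LCS diff: O(n * m) time and memory.
    public static List<(DiffKind Kind, T Item)> Compare<T>(IReadOnlyList<T> a, IReadOnlyList<T> b)
    {
        var cmp = EqualityComparer<T>.Default;
        int n = a.Count, m = b.Count;

        // lcs[i, j] = length of the longest common subsequence of a[i..] and b[j..].
        var lcs = new int[n + 1, m + 1];
        for (int i = n - 1; i >= 0; i--)
            for (int j = m - 1; j >= 0; j--)
                lcs[i, j] = cmp.Equals(a[i], b[j])
                    ? lcs[i + 1, j + 1] + 1
                    : Math.Max(lcs[i + 1, j], lcs[i, j + 1]);

        // Walk the table from the start, emitting one operation per step.
        var result = new List<(DiffKind Kind, T Item)>();
        int x = 0, y = 0;
        while (x < n && y < m)
        {
            if (cmp.Equals(a[x], b[y])) { result.Add((DiffKind.Equal, a[x])); x++; y++; }
            else if (lcs[x + 1, y] >= lcs[x, y + 1]) { result.Add((DiffKind.Delete, a[x])); x++; }
            else { result.Add((DiffKind.Insert, b[y])); y++; }
        }
        while (x < n) result.Add((DiffKind.Delete, a[x++]));
        while (y < m) result.Add((DiffKind.Insert, b[y++]));
        return result;
    }
}
```

For the two example sentences split on spaces, `TokenDiff.Compare<string>(first, second)` yields five `Equal` entries and a single `Insert` for "not" at index 2. If the tokens also carry formatting, a word whose format changed would appear as a `Delete` followed by an `Insert`. The quadratic table is fine for short documents; long ones may need a more economical algorithm.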

I didn't post this on the DevExpress forum because I feel this is a question about how to approach the problem, not a problem with the trial or the control I've been using. I also found out that the control has no functionality for comparing text like the one in Microsoft Word.

Thank you.

Update:

Desired output

This is (not) a text example

The text inside the parentheses is text that is not found in the first document. The output I want is like the output of Diff Match Patch: https://github.com/pocketberserker/Diff.Match.Patch

But I can't work out how to implement the check for formatting.
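One idea for getting formatting into a character-based diff is to map every distinct (word, format) token to a single placeholder character, diff the resulting strings, and map the characters back afterwards. This is the same trick commonly used for line-mode diffs. The sketch below shows only the mapping; `TokenAlphabet` is a hypothetical helper, and the call into the Diff Match Patch port is deliberately left out because its exact API is not confirmed here.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public sealed class TokenAlphabet<T>
{
    private readonly Dictionary<T, char> _toChar = new Dictionary<T, char>();
    private readonly List<T> _toToken = new List<T>();

    // Encodes a token sequence as a string with one char per token.
    // Equal tokens (same word AND same format) always get the same char.
    public string Encode(IEnumerable<T> tokens)
    {
        var sb = new StringBuilder();
        foreach (var token in tokens)
        {
            if (!_toChar.TryGetValue(token, out char c))
            {
                // Stay below the surrogate range so every char stands alone.
                if (_toToken.Count >= 0xD7FF)
                    throw new InvalidOperationException("Too many distinct tokens.");
                c = (char)(_toToken.Count + 1); // start at 1 to avoid '\0'
                _toChar.Add(token, c);
                _toToken.Add(token);
            }
            sb.Append(c);
        }
        return sb.ToString();
    }

    // Maps a placeholder char from a diff result back to its token.
    public T Decode(char c) => _toToken[c - 1];
}
```

Both documents must be encoded with the same `TokenAlphabet` instance so that shared tokens map to the same character. With tuple tokens such as `("not", true, false, false)`, a bold "not" and a plain "not" receive different characters, so a format-only change shows up in the diff as a changed character.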
