C# Replacing Text In String Changes Paragraph Formatting in Word - Interop Assemblies

I have code that iterates through every paragraph in a Word document using the Primary Interop Assemblies. For each paragraph, I extract the text into a string and search that string for specific keywords or phrases. If a match is present, it is swapped for something else, and the paragraph text is then written back into the document.

This works perfectly on most documents, but on some of them a new line appears between the paragraphs. On further investigation, it turns out that the paragraph formatting is being altered: the spacing after the paragraph increases from 0 to 12, and other properties change as well, for example left indents are removed from paragraphs.

I would like to know if there is any way to perform the above task without having the paragraph properties change when inserting the text back. My code is included below in order to show how I am iterating through the document.

Before the main code runs, I have a Word application and a document open, using the following namespace alias:

using Word = Microsoft.Office.Interop.Word;

and then the following code:

Word.Application app = new Word.Application();
Word.Document document = app.Documents.Open(filePath, ReadOnly: false);

After opening the document I have done the following:

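try
{
    // Requires: using System.Text.RegularExpressions;
    Regex regex = new Regex(findText);
    int totalParagraphs = document.Paragraphs.Count;
    for (int i = 1; i <= totalParagraphs; i++)   // Word collections are 1-based
    {
        string temp = document.Paragraphs[i].Range.Text;
        if (temp.Length > 1)   // skip paragraphs that hold only the paragraph mark
        {
            string final = regex.Replace(temp, replaceText);
            if (final != temp)
            {
                // Overwrites the whole paragraph range, including its paragraph mark
                document.Paragraphs[i].Range.Text = final;
            }
        }
    }
}
catch (Exception) { }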

One thing to note is the if statement with "temp.Length > 1". I noticed that a blank line is still counted as a paragraph, and the text inside that paragraph has a length of one. When a blank line is written back, an extra line is added, even if no replacement was made. To work around this, I use the length check to make sure the paragraph contains at least one character besides the line ending and is not just a blank line. This way no additional blank lines are added between paragraphs.
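The single character in a blank paragraph is the paragraph mark that Word appends to Range.Text (a carriage return, "\r"). A minimal sketch of a slightly more explicit blank check follows; the ParagraphText class and IsBlank method are hypothetical names, not part of the original code, and the handling of "\a" assumes the paragraph may sit in a table cell, where Word also appends an end-of-cell marker.

```csharp
using System;

static class ParagraphText
{
    // '\r' is the paragraph mark; '\a' is assumed to be the end-of-cell marker in tables.
    private static readonly char[] Marks = { '\r', '\a' };

    // Returns true when the text holds nothing but marks and whitespace.
    public static bool IsBlank(string paragraphText)
    {
        if (string.IsNullOrEmpty(paragraphText))
        {
            return true;
        }
        // Strip the trailing marks first, then any ordinary whitespace.
        return paragraphText.TrimEnd(Marks).Trim().Length == 0;
    }
}
```

In the loop above, "if (!ParagraphText.IsBlank(temp))" could stand in for "if (temp.Length > 1)". Unlike the length check, it would also skip paragraphs that contain only spaces or tabs.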

There is 1 answer

Dylan (best answer)

I have found the answer to my own question. I have included the solution below in case anyone else has the same problem or would like it for reference.

What you have to do is read the paragraph format properties before any changes are made. Then, once the text has been inserted back, set those same properties on the paragraph to undo any changes that may have been made. The full code is included below:

try
{
    // Requires: using System.Text.RegularExpressions;
    Regex regex = new Regex(findText);
    int totalParagraphs = document.Paragraphs.Count;
    for (int i = 1; i <= totalParagraphs; i++)   // Word collections are 1-based
    {
        string temp = document.Paragraphs[i].Range.Text;

        // Remember the paragraph formatting before the text is touched
        float leftIndent = document.Paragraphs[i].Format.LeftIndent;
        float rightIndent = document.Paragraphs[i].Format.RightIndent;
        float spaceBefore = document.Paragraphs[i].Format.SpaceBefore;
        float spaceAfter = document.Paragraphs[i].Format.SpaceAfter;

        if (temp.Length > 1)   // skip paragraphs that hold only the paragraph mark
        {
            string final = regex.Replace(temp, replaceText);
            if (final != temp)
            {
                document.Paragraphs[i].Range.Text = final;

                // Restore the formatting that the assignment above may have changed
                document.Paragraphs[i].Format.LeftIndent = leftIndent;
                document.Paragraphs[i].Format.RightIndent = rightIndent;
                document.Paragraphs[i].Format.SpaceBefore = spaceBefore;
                document.Paragraphs[i].Format.SpaceAfter = spaceAfter;
            }
        }
    }
}
catch (Exception) { }
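A possible alternative, offered as an untested sketch rather than as part of the original answer: Word keeps paragraph-level formatting with the paragraph mark, and Paragraphs[i].Range includes that mark, so assigning Range.Text replaces the mark as well. Shrinking the range by one character before assigning should leave the mark, and the formatting that goes with it, in place. The ParagraphTextReplacer class and ReplaceInParagraphs method are hypothetical names; paragraphs that do not end in a plain "\r" (for example in table cells) are simply skipped here as a conservative assumption.

```csharp
using System.Text.RegularExpressions;
using Word = Microsoft.Office.Interop.Word;

static class ParagraphTextReplacer
{
    // Hypothetical helper: replaces matches in each paragraph without touching its paragraph mark.
    public static void ReplaceInParagraphs(Word.Document document, string findText, string replaceText)
    {
        Regex regex = new Regex(findText);
        int totalParagraphs = document.Paragraphs.Count;

        for (int i = 1; i <= totalParagraphs; i++)   // Word collections are 1-based
        {
            Word.Range range = document.Paragraphs[i].Range;
            string text = range.Text;

            // Assumption: only handle paragraphs that end in an ordinary paragraph mark.
            if (string.IsNullOrEmpty(text) || !text.EndsWith("\r"))
            {
                continue;
            }

            // Move the end of the range back by one character to exclude the paragraph mark.
            range.MoveEnd(Word.WdUnits.wdCharacter, -1);

            string body = range.Text ?? string.Empty;
            if (body.Length == 0)
            {
                continue;   // blank paragraph, nothing to replace
            }

            string replaced = regex.Replace(body, replaceText);
            if (replaced != body)
            {
                // Only the text before the paragraph mark is overwritten.
                range.Text = replaced;
            }
        }
    }
}
```

It would be called as ParagraphTextReplacer.ReplaceInParagraphs(document, findText, replaceText). Character formatting that varies inside a paragraph (bold or italic runs, for instance) may still be flattened by the assignment, since the whole body text is replaced in one step.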