How Do I Find All Acronyms in an MS Word Document Using a Macro?

22.5k views Asked by At

I have a document with many acronyms that need to be captured and put into an acronyms table at the end of the document.

The term acronym has various meanings. I'd like to create a table that has all of the words that are initialized; two or more capitalized letters that are short for a longer meaning. I.e., CD-ROM, USB, SYNC, MMR, ASCAP, etc.

How do I create a macro to do this?

5

There are 5 answers

2
Tim Williams On BEST ANSWER

Something like this might get you started. Add a reference to "Microsoft VBScript Regular Expressions" (Edit Macro: Tools > References). This library is the file, "vbscript.dll".

You may need to adjust the regexp if all your acronyms aren't only upper-case letters (eg some may contain numbers).

Sub Acronyms()

    Dim dict, k, tmp
    Dim regEx, Match, Matches
    Dim rngRange As Range
    Set regEx = New RegExp
    Set dict = CreateObject("scripting.dictionary")

    regEx.Pattern = "[A-Z]{2,}" '2 or more upper-case letters
    regEx.IgnoreCase = False
    regEx.Global = True
    Set Matches = regEx.Execute(ActiveDocument.Range.Text)
    For Each Match In Matches
        tmp = Match.Value
        If Not dict.Exists(tmp) Then dict.Add tmp, 0
        dict(tmp) = dict(tmp) + 1
    Next

    For Each k In dict.Keys
        Debug.Print k, dict(k)
    Next k

End Sub
0
MAbraham1 On

You will be running a macro on the main Word document. Open a separate Word document that is blank. This will be used to store discovered the acronyms.

  1. Press "Record Macro". Choose a unique name, and assign a shortcut key such as CTRL + ALT + A.
  2. Open the Find dialogue (CTRL + F). Paste the following search text: <[A-Z]{2,}>. In the Find dialogue, choose "More" > check the box for "Use Wildcards". Click the Find Next button.
  3. Right-click on the selected text, being careful not to change the highlight. Select Copy from the context menu.
  4. Navigate to the separate Word document (ALT + TAB, select the Word document). Paste the copied text, and hit Enter. ALT + TAB back to the original Word document.
  5. Close the find dialogue and click the right arrow once. This moves the cursor off the highlighted text, and readies it for the next search.
  6. Stop the macro recording.

You now have a macro that finds a word containing two or more capitalized letters, and saves the text to a separate document. In order to search for the remaining acronyms, press CTRL + ALT + A continuously until the end of the document has been reached. Or, edit the macro, and add while a loop.

Here is what the macro looks like (without the loop):

Sub GetAcronyms()
    Selection.Find.ClearFormatting
    With Selection.Find
        .Text = "<[A-Z]{2,}>"
        .Replacement.Text = ""
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchAllWordForms = False
        .MatchSoundsLike = False
        .MatchWildcards = True
    End With
    Selection.Find.Execute
    Selection.Copy
    Windows("Document1.docx").Activate
    Selection.PasteAndFormat (wdPasteDefault)
    Selection.TypeParagraph
    Windows("TheOriginalDocument.docx").Activate
    Selection.MoveRight Unit:=wdCharacter, Count:=1
End Sub
0
Vignesh On

Thanks Tim, your code works great!

If it will be of any use to others, the pattern [A-Z]{1,}([a-z]*|\&|\.*)[A-Z]{1,} will find more acronyms...

(I do not have permission to post comments, hence adding this as answer)

Edit (still no way to add comments): \b[A-Z]{1,}([a-z*]|\&|\.|\-)[A-Z]{1,}\b is more robust, but will fail if the last character of the acronym is not capitalized.

0
Adam On

I have found the following works well (where some business name acronyms are tolerable). I use this to test data entries in Access, it should also work for a Word document range.

objRegExp.Pattern = "([A-Z]{1,}((\&(?![A-Z]\s[\w]{3})\w*)+|\.\w*)+)|[A-Z]{2,}(?![A-Z]*\s[A-Z]{1}[a-z])"
  • J&K =Match
  • JK&S =Match
  • J.S.S =Match
  • JK&S.K =Match
  • JSK =Match
  • JK =Match
  • DKD And Sons =No Match
  • J&K Engineering =No Match
  • PKF Rogers and Associates =No Match

I use RegExHero to test my expressions

1
Sina On

I used the following to find abbreviations in my PhD thesis. They were all in "()".

regEx.Pattern = "\([A-Z]{1,}([a-z]*|\&|\.|\-*)[A-Z]{1,}\)"