Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
206 views
in Technique[技术] by (71.8m points)

Generate Word Documents (in Excel VBA) from a series of Document Templates

Hey all. I'll try to make this brief and simple. :)

I have

  1. 40 or so boilerplate word documents with a series of fields (Name, address, etc) that need to be filled in. This is historically done manually, but it's repetitive and cumbersome.
  2. A workbook where a user has filled a huge set of information about an individual.

I need

  • A way to programatically (from Excel VBA) open up these boilerplate documents, edit in the value of fields from various named ranges in the workbook, and save the filled in templates to a local folder.

If I were using VBA to programatically edit particular values in a set of spreadsheets, I would edit all those spreadsheets to contain a set of named ranges which could be used during the auto-fill process, but I'm not aware of any 'named field' feature in a Word document.

How could I edit the documents, and create a VBA routine, so that I can open each document, look for a set of fields which might need to be filled in, and substitute a value?

For instance, something that works like:

for each document in set_of_templates
    if document.FieldExists("Name") then document.Field("Name").value = strName
    if document.FieldExists("Address") then document.Field("Name").value = strAddress
    ...

    document.saveAs( thisWorkbook.Path & "GeneratedDocs " & document.Name )
next document

Things I've considered:

  • Mail merge - but this is insufficient because it requires opening each document manually and structuring the workbook as a data source, I kind of want the opposite. The templates are the data source and the workbook is iterating through them. Also, mail merge is for creating many identical documents using a table of different data. I have many documents all using the same data.
  • Using placeholder text such as "#NAME#" and opening each document for a search and replace. This is the solution I would resort to if nothing more elegant is proposed.
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

It's been a long time since I asked this question, and my solution has undergone more and more refinement. I've had to deal with all sorts of special cases, such as values that come directly from the workbook, sections that need to be specially generated based on lists, and the need to do replacements in headers and footers.

As it turns out, it did not suffice to use bookmarks, as it was possible for users to later edit documents to change, add, and remove placeholder values from the documents. The solution was in fact to use keywords such as this:

enter image description here

This is just a page from a sample document which uses some of the possible values that can get automatically inserted into a document. Over 50 documents exist with completely different structures and layouts, and using different parameters. The only common knowledge shared by the word documents and the excel spreadsheet is a knowledge of what these placeholder values are meant to represent. In excel, this is stored in a list of document generation keywords, which contain the keyword, followed by a reference to the range that actually contains this value:

enter image description here

These were the key two ingredients required. Now with some clever code, all I had to do was iterate over each document to be generated, and then iterate over the range of all known keywords, and do a search and replace for each keyword in each document.


First, I have the wrapper method, which takes care of maintaining an instance of microsoft word iterating over all documents selected for generation, numbering the documents, and doing the user interface stuff (like handling errors, displaying the folder to the user, etc.)

' Purpose: Iterates over and generates all documents in the list of forms to generate
'          Improves speed by creating a persistant Word application used for all generated documents
Public Sub GeneratePolicy()
    Dim oWrd As New Word.Application
    Dim srcPath As String
    Dim cel As Range

    If ERROR_HANDLING Then On Error GoTo errmsg
    If Forms.Cells(2, FormsToGenerateCol) = vbNullString Then _
        Err.Raise 1, , "There are no forms selected for document generation."
    'Get the path of the document repository where the forms will be found.
    srcPath = FindConstant("Document Repository")
    'Each form generated will be numbered sequentially by calling a static counter function. This resets it.
    GetNextEndorsementNumber reset:=True
    'Iterate over each form, calling a function to replace the keywords and save a copy to the output folder
    For Each cel In Forms.Range(Forms.Cells(2, FormsToGenerateCol), Forms.Cells(1, FormsToGenerateCol).End(xlDown))
        RunReplacements cel.value, CreateDocGenPath(cel.Offset(0, 1).value), oWrd
    Next cel
    oWrd.Quit
    On Error Resume Next
    'Display the folder containing the generated documents
    Call Shell("explorer.exe " & CreateDocGenPath, vbNormalFocus)
    oWrd.Quit False
    Application.StatusBar = False
    If MsgBox("Policy generation complete. The reserving information will now be recorded.", vbOKCancel, _
              "Policy Generated. OK to store reserving info?") = vbOK Then Push_Reserving_Requirements
    Exit Sub
errmsg:
    MsgBox Err.Description, , "Error generating Policy Documents"
End Sub

That routine calls RunReplacements which takes care of opening the document, prepping the environment for a fast replacement, updating links once done, handling errors, etc:

' Purpose: Opens up a document and replaces all instances of special keywords with their respective values.
'          Creates an instance of Word if an existing one is not passed as a parameter.
'          Saves a document to the target path once the template has been filled in.
'
'          Replacements are done using two helper functions, one for doing simple keyword replacements,
'          and one for the more complex replacements like conditional statements and schedules.
Private Sub RunReplacements(ByVal DocumentPath As String, ByVal SaveAsPath As String, _
                            Optional ByRef oWrd As Word.Application = Nothing)
    Dim oDoc As Word.Document
    Dim oWrdGiven As Boolean
    If oWrd Is Nothing Then Set oWrd = New Word.Application Else oWrdGiven = True

    If ERROR_HANDLING Then On Error GoTo docGenError
    oWrd.Visible = False
    oWrd.DisplayAlerts = wdAlertsNone

    Application.StatusBar = "Opening " & Mid(DocumentPath, InStrRev(DocumentPath, "") + 1)
    Set oDoc = oWrd.Documents.Open(Filename:=DocumentPath, Visible:=False)
    RunAdvancedReplacements oDoc
    RunSimpleReplacements oDoc
    UpdateLinks oDoc 'Routine which will update calculated statements in Word (like current date)
    Application.StatusBar = "Saving " & Mid(DocumentPath, InStrRev(DocumentPath, "") + 1)
    oDoc.SaveAs SaveAsPath

    GoTo Finally
docGenError:
    MsgBox "Un unknown error occurred while generating document: " & DocumentPath & vbNewLine _
            & vbNewLine & Err.Description, vbCritical, "Document Generation"
Finally:
    If Not oDoc Is Nothing Then oDoc.Close False: Set oDoc = Nothing
    If Not oWrdGiven Then oWrd.Quit False
End Sub

That routine then invokes RunSimpleReplacements. and RunAdvancedReplacements. In the former, we iterate over the set of Document Generation Keywords and call WordDocReplace if the document contains our keyword. Note that it's much faster to try and Find a bunch of words to figure out that they don't exist, then to call replace indiscriminately, so we always check if a keyword exists before attempting to replace it.

' Purpose: While short, this short module does most of the work with the help of the generation keywords
'          range on the lists sheet. It loops through every simple keyword that might appear in a document
'          and calls a function to have it replaced with the corresponding data from pricing.
Private Sub RunSimpleReplacements(ByRef oDoc As Word.Document)
    Dim DocGenKeys As Range, valueSrc As Range
    Dim value As String
    Dim i As Integer

    Set DocGenKeys = Lists.Range("DocumentGenerationKeywords")
    For i = 1 To DocGenKeys.Rows.Count
        If WordDocContains(oDoc, "#" & DocGenKeys.Cells(i, 1).Text & "#") Then
            'Find the text that we will be replacing the placeholder keyword with
            Set valueSrc = Range(Mid(DocGenKeys.Cells(i, 2).Formula, 2))
            If valueSrc.MergeCells Then value = valueSrc.MergeArea.Cells(1, 1).Text Else value = valueSrc.Text
            'Perform the replacement
            WordDocReplace oDoc, "#" & DocGenKeys.Cells(i, 1).Text & "#", value
        End If
    Next i
End Sub

This is the function used to detect whether a keyword exists in the document:

' Purpose: Function called for each replacement to first determine as quickly as possible whether
'          the document contains the keyword, and thus whether replacement actions must be taken.
Public Function WordDocContains(ByRef oDoc As Word.Document, ByVal searchFor As String) As Boolean
    Application.StatusBar = "Checking for keyword: " & searchFor
    WordDocContains = False
    Dim storyRange As Word.Range
    For Each storyRange In oDoc.StoryRanges
        With storyRange.Find
            .Text = searchFor
            WordDocContains = WordDocContains Or .Execute
        End With
        If WordDocContains Then Exit For
    Next
End Function

And this is where the rubber meets the road - the code that executes the replacement. This routine got more complicated as I encountered difficulties. Here are the lessons you will only learn from experience:

  1. You can set the replacement text directly, or you can use the clipboard. I found out the hard way that if you are doing a VBA replace in word using a string longer than 255 characters, the text will get truncated if you try to place it in the Find.Replacement.Text, but you can use "^c" as your replacement text, and it will get it directly from the clipboard. This was the workaround I got to use.

  2. Simply calling replace will miss keywords in some text areas like headers and footers. Because of this, you actually need to iterate over the document.StoryRanges and run the search and replace on each one to ensure that you catch all instances of the word you want to replace.

  3. If you're setting the Replacement.Text directly, you need to convert Excel line breaks (vbNewLine and Chr(10)) with a simple vbCr for them to appear properly in word. Otherwise, anywhere your replacement text has line breaks coming from an excel cell will end up inserting strange symbols into word. If you use the clipboard method however, you do not need to do this, as the line breaks get converted automatically when put in the clipboard.

That explains everything. Comments should be pretty clear too. Here's the golden routine that executes the magic:

' Purpose: This function actually performs replacements using the Microsoft Word API
Public Sub WordDocReplace(ByRef oDoc As Word.Document, ByVal replaceMe As String, ByVal replaceWith As String)
    Dim clipBoard As New MSForms.DataObject
    Dim storyRange As Word.Range
    Dim tooLong As Boolean

    Application.StatusBar = "Replacing instances of keyword: " & replaceMe

    'We want to use regular search and replace if we can. It's faster and preserves the formatting that
    'the keyword being replaced held (like bold).  If the string is longer than 255 chars though, the
    'standard replace method doesn't work, and so we must use the clipboard method (^c special character),
    'which does not preserve formatting. This is alright for schedules though, which are always plain text.
    If Len(replaceWith) > 255 Then tooLong = True
    If tooLong Then
        clipBoard.SetText IIf(replaceWith = vbNullString, "", replaceWith)
        clipBoard.PutInClipboard
    Else
        'Convert excel in-cell line breaks to word line breaks. (Not necessary if using clipboard)
        replaceWith = Replace(replaceWith, vbNewLine, vbCr)
        replaceWith = Replace(replaceWith, Chr(10), vbCr)
    End If
    'Replacement must be done on multiple 'StoryRanges'. Unfortunately, simply calling replace will miss
    'keywords in some text areas like headers and footers.
    For Each storyRange In oDoc.StoryRanges
        Do
            With storyRange.Find
                .MatchWildcards = True
                .Text = replaceMe
                .Replacement.Text = IIf(tooLong, "^c", replaceWith)
                .Wrap = wdFindContinue
                .Execute Replace:=wdReplaceAll
            End With
            On Error Resume Next
            Set storyRange = storyRange.NextStoryRange
            On Error GoTo 0
        Loop While Not storyRange Is Nothing
    Next
    If tooLong Then clipBoard.SetText ""
    If tooLong Then clipBoard.PutInClipboard
End Sub

When the dust settles, we're left with a beautiful version of the initial document with production values in place of those hash marked keywords. I'd love to show an example, but of course every


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...