Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
787 views
in Technique[技术] by (71.8m points)

vba - Speed up multiple replacement

I'm writing a Word macro that reads about 1000 word pairs from an Excel file and does replacement accordingly for a Word file. At the very beginning I was using the code below for the replacement:

    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    With Selection.Find
        .Text = SrcText
        .Replacement.Text = DestText
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchCase = False
        .MatchWholeWord = False
        .MatchByte = False
        .MatchAllWordForms = False
        .MatchSoundsLike = False
        .MatchWildcards = False
        .MatchFuzzy = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll

It worked but appeared a bit slow for large Word files. Then I tried to first read the whole content into a string variable, do finding there, then alter the Word file when any replacement target was found (although I'm actually not sure that this can be faster).

When doing that I encountered a somewhat weird inconsistency. For example, if you create a Word file with only a single-cell table in the first row:

enter image description here

then in the VBA editor, the code below will give you 2.

    Debug.Print Len(ActiveDocument.Characters(2))

I don't know whether there exist other inconsistencies of this sort and got stuck there.

Is there any workaround or Is there any other way to speedup the replacement?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

For a start don't specify all those properties over and over again. Unless you change them they don't change.

With Selection.Find
    .ClearFormatting
    .Replacement.ClearFormatting
    .Forward = True
    .Wrap = wdFindContinue
    .Format = False
    .MatchCase = False
    .MatchWholeWord = False
    .MatchByte = False
    .MatchAllWordForms = False
    .MatchSoundsLike = False
    .MatchWildcards = False
    .MatchFuzzy = False

    For loop to go through each word pair
        .Text = SrcText
        .Replacement.Text = DestText
        .Find.Execute Replace:=wdReplaceAll
    Next

End With

Minimise Dots

So if you are interested in performance minimise dots (each dot is a lookup), especially in loops.

There are two ways. One is to set objects to the lowest object if you are going to access more than once.

eg (slower)

set xlapp = CreateObject("Excel.Application")
msgbox xlapp.worksheets(0).name 

(faster because you omitt a dot every time you use the object)

set xlapp = CreateObject("Excel.Application")
set wsheet = xlapp.worksheets(0)
msgbox wsheet.name

The second way is with. You can only have one with active at a time.

This skips 100 lookups.

with wsheet
For x = 1 to 100
 msgbox .name
Next
end with

String Concatination

And don't join strings one character at a time. See this from a VBScript programmer. It requires 50,000 bytes and many allocation and deallocation to make a 100 character string.

http://blogs.msdn.com/b/ericlippert/archive/2003/10/20/53248.aspx

Reading Properties

Don't reread properties that don't change especially if out of process or late bound. Put them into a variable.

Object Types

Two concepts here - in or out of process and early or late binding.

exefiles are connected to out of process. All calls are marshalled over RPC (a networking protocol). Dllfiles are in process and function calls are made direct with a jump.

Early binding is set x = objecttype. Functions are looked up when you write the program. On execution the program is hard coded to jump to address stored in the vtable for that function.

Late binding is set x = createobject("objecttype"). Each function call goes like this. "Hi object do you have a print command". "Yes", it replies, "command number 3". "Hi object can you please do command number 3". "Sure, here's the result".

From Visual Basic Concepts (part of Help)

You can make your Visual Basic applications run faster by optimizing the way Visual Basic resolves object references. The speed with which Visual Basic handles object references can be affected by:

Whether or not the ActiveX component has been implemented as an in-process server or an out-of-process server.

Whether an object reference is early-bound or late-bound. In general, if a component has been implemented as part of an executable file (.exe file), it is an out-of-process server and runs in its own process. If it has been implemented as a dynamic-link library, it is an in-process server and runs in the same process as the client application.

Applications that use in-process servers usually run faster than those that use out-of-process servers because the application doesn't have to cross process boundaries to use an object's properties, methods, and events. For more information about in-process and out-of-process servers, see "In-Process and Out-of-Process Servers."

Object references are early-bound if they use object variables declared as variables of a specific class. Object references are late-bound if they use object variables declared as variables of the generic Object class. Object references that use early-bound variables usually run faster than those that use late-bound variables.

Excel Specific

See this link from a Microsoft person. This is excel specific rather than VBA. Autocalc and other calc options/screenupdating etc.

http://blogs.office.com/2009/03/12/excel-vba-performance-coding-best-practices/


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

1.4m articles

1.4m replys

5 comments

57.0k users

...