Replace Using Wildcards
Graham Mayor
... helping to ease the lives of Microsoft Word users.
Introduction
Many people access the material from this web site daily. Most just take what they want
and run. That's OK, provided they are not selling on the material as their own; however,
if your productivity gains from the material you have used, a donation from the money
you have saved would help to ensure the continued availability of this resource.
I originally wrote this tutorial, with assistance from fellow Word MVP Klaus Linke, for
the now defunct Word MVP web site. However, I felt it desirable to present it here,
where I have editorial control, so that I may more easily keep it up to date.
Find and Replace using wildcards
This tutorial pre-supposes that the user will have some basic experience of Word's
'replace' function.
The secret of using wildcard searches is to identify the unique string of text that
you wish to find. Wildcards are combined with regular text and formatting options
to represent the characters or sequences of characters in that string. Because
different combinations of characters can be represented by a variety of wildcard
combinations, there is often more than one way of identifying a particular string
of text within a document. How you choose to represent that group of characters is
therefore a matter of individual preference; and the context of the text within the
document will to a great extent dictate the most suitable combination to use on a
particular occasion.
Start by identifying the string you wish to replace and then pop up the replace
function (CTRL+H) or select Advanced Find from the Editing group on the Home tab
of the ribbon (see below); or in earlier Word versions Edit > Replace.
Click the 'More' button to present the additional functions and check the 'Use
wildcards' option:
1 of 9 19/11/2024, 19:35
Replace using wildcards https://www.gmayor.com/replace_using_wildcards.htm
With the cursor in the 'Find what:' or 'Replace with:' boxes you can use keyboard
shortcuts to enhance the strings with the principal formatting options, e.g. CTRL+U
- underline, CTRL+I - italics, etc. These operate as toggles to cycle through the
various options available.
Insert your find and replace strings using the following guide for inspiration.
? is used to represent a single character and * represents any number of
characters. On their own, these have limited use.
s?t will find sat, set, sit, sot and any other combination of three characters
beginning with 's' and ending with 't'. It will also find that combination of
letters within a word, thus it would locate the 'set' in inset etc.
Word does not limit the number of characters that the asterisk can match, and it
does not require that characters or spaces reside between the literal characters
that you use with the asterisk.
The asterisk is a rather blunt weapon which must be used with care, as it can
return a lot of unwanted results.
@ @ is used to find re-occurrences of the
previous character (if any). e.g. lo@t
will find lot or loot, ful@ will find ful
or full etc.
< > With any of the above (or any other
combination of wildcards and characters),
you can use the brackets < and > to mark
the start and end of a word respectively.
Thus in the example used above for '*'
<[Ё-ґ]@> matches any cyrillic word: Вы можете помочь мне? (“Can you help me
please?”)
In Word 2000, you can type in Unicode characters with the Alt-key (make sure
NumLock is on, then hold down the Alt-key and type the numbers on the numeric
keyboard). Since all characters from decorative fonts (Symbol-, Wingdings-fonts
...) are kept in a special code page from &HF000 to &HF0FF, you can search for them
with [Alt61472-Alt61695].
{ } Curly brackets are used for counting occurrences of the previous character or
expression.
{n} This finds exactly the number 'n' of occurrences of the previous character.
{n,} Finds at least the number 'n' occurrences.
{n,m} Finds the number of occurrences from 'n' to 'm'.
Counting can be used with individual characters or, more usefully, with sets of
characters (e.g. [deno]{4} will match done, node, eden) or bracketed groups:
(ts, ){3} will match ts, ts, ts, .
(Unfortunately, there is no wildcard to search for "zero or more occurrences" in
Word wildcard searches; [!^13]{0,} does not work.)
Note: The above examples employ a comma as a list separator {n,m} - for languages
that use alternative list separators, substitute the local separator character
(often a semi-colon {n;m}) as appropriate.
( ) Round brackets have no effect on the search pattern, but are used to divide the
pattern into logical sequences, where you wish to re-assemble those sequences in a
different order during the replace - or to replace only part of that sequence. They
must be used in pairs, and each bracketed sequence is addressed in the replacement
by the placeholders \1, \2 etc., numbered in order.
The placeholders \1, \2 etc. can also be used in the search string to identify
recurring text, e.g. Fred Fred could be written (Fred) \1.
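For readers who want to experiment outside Word, the wildcards in the table above can be approximated with Python's re module. This is only a rough sketch: the right-hand patterns are assumed near-equivalents chosen for illustration, Python's engine is not Word's, and edge cases (greediness, case handling, special characters) will differ.

```python
import re

# Assumed near-equivalents of the Word wildcards described above.
# Key: Word wildcard pattern. Value: approximate Python regular expression.
APPROXIMATIONS = {
    "s?t": r"s.t",                # ? = any single character
    "lo@t": r"lo+t",              # @ = one or more of the previous character
    "<s?t>": r"\bs.t\b",          # < and > = start and end of a word
    "[deno]{4}": r"[deno]{4}",    # {n} = exactly n occurrences
    "(ts, ){3}": r"(?:ts, ){3}",  # counting applied to a bracketed group
    "(Fred) \\1": r"(Fred) \1",   # \1 = repeat of the first bracketed group
}

def find_all(word_pattern, text):
    """Return every match of the approximated pattern in text."""
    regex = APPROXIMATIONS[word_pattern]
    return [match.group(0) for match in re.finditer(regex, text)]

print(find_all("s?t", "sat set inset"))    # also finds the 'set' inside 'inset'
print(find_all("<s?t>", "sat set inset"))  # whole words only
print(find_all("lo@t", "lot loot lt"))
```

The dictionary only covers the patterns used as examples in the table; it is not a general translator from Word wildcards to regular expressions.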
Gotchas
You may wish to identify a character string by means of a paragraph mark ¶.
Wildcard searches will also not find footnote/endnote marks - substitute ^2.
[A-z] would be expected to match all the letters between A and z, i.e. both upper
case and lower case letters, which it does, but it matches all the characters
from ASCII 65 to ASCII 122, and that block also includes the characters [ \ ] ^ _ `
Use [A-Za-z] instead.
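The extra characters swept up by a range running from A to z can be listed directly from their character codes. A short Python check, offered only as an illustration of the ASCII layout rather than of Word itself:

```python
# Every character from 'A' (code 65) to 'z' (code 122) that is not a letter.
extras = [chr(code) for code in range(ord("A"), ord("z") + 1)
          if not chr(code).isalpha()]

print(extras)                    # the six punctuation characters in the gap
print([ord(c) for c in extras])  # their codes, 91 to 96
```

These six characters sit between Z (90) and a (97), which is why splitting the range into A-Z and a-z avoids them.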
The question mark ? is used to find individual characters. If used with curly
brackets to define a range of characters, e.g. #?{1,3}#, it will behave as an
asterisk and find all the characters between the hash symbols.
Code Notes
^1 In-line picture
^2 Auto referenced footnotes
^5 Annotation mark
^9 Tab
^11 New line
^12 Page or Section break
^13 Paragraph break / 'carriage' return
^14 Column break
^19 Opening field brace (when field braces are visible)
^21 Closing field brace (when field braces are visible)
? Question mark
^? Any single character (not valid in the Replace box)
^- Optional hyphen
^~ Non-breaking hyphen
^^ Caret character
^# Any digit
^$ Any letter
^& Contents of 'Find What' box (Replace box only)
^+ Em dash (not valid in the Replace box)
^= En dash (not valid in the Replace box)
^u8195 Em Space Unicode character value search (not valid in the Replace box)
^u8194 En Space Unicode character value search (not valid in the Replace box)
^a Comment (not valid in the replace box) (Word 97-2000 only)
^b Section break (not valid in the replace box)
^c Replace with Clipboard contents (Replace box only)
^d Field
^e Endnote Mark (not valid in the Replace box)
^f Footnote Mark (not valid in the Replace box)
^g Graphic (In Line Graphics Only). In Word 2007 a forward slash / also appears
to find in-line graphics. This appears to be an unintentional bug.
^l New line
^m Manual Page Break
^n Column break
^t Tab
^p Paragraph Mark
^s Non-breaking space
^w White space (space, non-breaking space, tab); not valid in the Replace box
^nnn Where "n" is an ASCII character number
Note: ASCII codes below 128 were standardized a long time ago, before the
introduction of Windows operating systems. The upper codes were used for
OS-specific, localized, or vendor-specific stuff. When DOS code pages
were replaced by Windows code pages, a leading zero was used to indicate
the difference.
Thus ^32 and ^032 will both represent a space character, but ^147 will
represent ô and ^0147 will represent “
^0nnn See above (Produces ASCII on Macintosh).
^unnnn Unicode character search where "n" is a decimal number corresponding to the
Unicode character value.
Note: Instructions on how to identify the required decimal number are
included at the end of this page.
Note: To search for a specific field, such as an XE (Index Entry) field, use the
following syntax:
^19 field name
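The note on ^147 versus ^0147 in the table above can be checked outside Word by decoding the same byte value under two code pages. This sketch assumes the DOS code page is 437 and the Windows code page is 1252 (Western European); the table does not name them, so treat both as illustrative choices.

```python
code = 147

# Assumption: code page 437 stands in for the DOS (OEM) character set.
dos_char = bytes([code]).decode("cp437")
# Assumption: code page 1252 stands in for the Windows character set.
windows_char = bytes([code]).decode("cp1252")

print(dos_char, windows_char)

# Codes below 128 are the same in both, so 32 is a space either way.
same_space = bytes([32]).decode("cp437") == bytes([32]).decode("cp1252")
print(same_space)
```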
Example 1
There are many occasions when you are presented with blocks of text or numbers
etc., where the order of the text is not what you might require in the final
document. Swapping the placement of forename and surname is one such
example - and don't forget you can add to the replacement, even when using
bracketed replacements,
e.g. you may wish John Smith to appear as Smith, John or, more likely, you may have
a column of names in a table, where you wish to exchange all the surnames with all
the forenames.
You could do them one at a time, but by replacing the names with wildcards, you can
do the lot in one pass.
Let's then break up the names into logical sequences that can only represent the
names.
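At its simplest, we have here two words - John and Smith. They can be represented
by <*>[space]<*> - where [space] is a single press of the spacebar. Add round
brackets to mark each word as a logical sequence, (<*>)[space](<*>), and replace
with \2[space]\1.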
Run the search on the column of names and all are swapped. Run it again and they
are swapped back.
If you get it wrong, remember that Word's 'undo' function (CTRL+Z) is very
powerful and has a long memory!
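The same swap of two bracketed words can be tried outside Word. The following is a minimal Python sketch, assuming that \b and \w+ are acceptable stand-ins for Word's word brackets and asterisk; the function name and sample names are hypothetical.

```python
import re

def swap_names(text):
    """Swap each pair of words separated by a single space."""
    # Two bracketed groups, each a whole word; the replacement reverses them.
    return re.sub(r"\b(\w+) (\w+)\b", r"\2 \1", text)

names = "John Smith\nJane Doe"
swapped = swap_names(names)
print(swapped)

# Running the same replacement again restores the original order.
print(swap_names(swapped) == names)
```

Changing the replacement to r"\2, \1" would add the comma, although the swap would then no longer reverse cleanly on a second pass.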
Example 2
This could be the changing of UK format dates to US format dates - or vice versa.
To give an example of how most of the wildcards could be used in one search
sequence to convert a UK format date (day with ordinal suffix, month, year) to its
equivalent US format date, the following search pattern will do the trick:
[0-9]{1,2}[dhnrst]{2} <[AFJMNSOD]*> [0-9]{4}
Breaking it down, [0-9] looks for any single digit, but the day of the month can
have one or two digits, so we use the count function to allow for both:
[0-9]{1,2}
The day is followed by an ordinal suffix (st, nd, rd or th). Every letter of those
suffixes is in the set
[dhnrst]
and there are always two of them, so
[dhnrst]{2}
Next comes a space [space] followed by the month.
The month always begins with one of the following capital letters - AFJMNSOD. We
don't know how many letters this month has so we can use the blanket '*' to
represent the rest. And we are only interested in that word so we will tie it down
with <> brackets.
<[AFJMNSOD]*>
Then there's another space [space] followed by the year. The years here have four
digits so
[0-9]{4}
Finally add the round brackets to provide a logical breakup of the sequence
([0-9]{1,2}[dhnrst]{2})[space](<[AFJMNSOD]*>)[space]([0-9]{4})
and replace with
\2[space]\1,[space]\3
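To see the three bracketed groups being re-ordered, the date pattern can be approximated in Python. This is a sketch under assumptions: [a-z]+ replaces Word's asterisk for the rest of the month name, \b replaces the < > word brackets, and the sample dates are invented for illustration.

```python
import re

# Group 1: day plus ordinal suffix, group 2: month, group 3: four-digit year.
UK_DATE = re.compile(
    r"([0-9]{1,2}[dhnrst]{2}) \b([AFJMNSOD][a-z]+)\b ([0-9]{4})"
)

def uk_to_us(text):
    """Rewrite 'day month year' dates as 'month day, year'."""
    return UK_DATE.sub(r"\2 \1, \3", text)

print(uk_to_us("7th August 2001"))
print(uk_to_us("Due on 22nd May 1999."))
```

Like the Word pattern, this accepts any capitalised word starting with one of the listed letters, so it does not check that the word is really a month.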
Example 3
Assume you are parsing addresses and wish to separate the honorific from the name.
American usage puts a full stop (period) at the end ("Mr.", "Mrs.", "Dr.") while
British usage often omits the full stop.
To add the full stop, search for
([DM][rs]{1,2})( )
and replace with
\1.\2
or vice versa, to remove it, search for
([DM][rs]{1,2}).
and replace with
\1
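Both directions of the honorific replacement can be sketched in Python. The patterns are carried over almost unchanged; the one assumed difference is that the full stop must be escaped in a Python regular expression, where an unescaped "." matches any character. The function names are hypothetical.

```python
import re

def add_full_stop(text):
    """Insert a full stop after Mr, Mrs, Ms, Dr etc. when a space follows."""
    return re.sub(r"([DM][rs]{1,2})( )", r"\1.\2", text)

def remove_full_stop(text):
    """Drop the full stop that follows an honorific."""
    # The backslash makes the full stop literal in Python's syntax.
    return re.sub(r"([DM][rs]{1,2})\.", r"\1", text)

british = "Mr Smith, Mrs Jones and Dr Brown"
american = add_full_stop(british)
print(american)
print(remove_full_stop(american) == british)
```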
Further examples:
(*^13)@ will match any number of replacement paragraphs. Replace with \1 to remove
duplicates from a sorted list.
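The idea of collapsing repeated paragraphs in a sorted list can also be illustrated in Python. This is an analogous sketch built on a backreference, not a transcription of the Word pattern above: each line plays the part of a paragraph and \n stands in for ^13.

```python
import re

def drop_repeated_lines(text):
    """Collapse runs of identical consecutive lines into a single line."""
    # Group 1 is one whole line including its newline; \1+ matches repeats.
    return re.sub(r"(^.*\n)\1+", r"\1", text, flags=re.MULTILINE)

sorted_list = "apple\napple\nbanana\nbanana\nbanana\ncherry\n"
print(drop_repeated_lines(sorted_list))
```

Only adjacent duplicates are removed, which is why the list needs to be sorted first.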
By creating logical sequences you can search for almost any combinations of
characters.
Not a bug but still annoying: You have to escape any special character even if you
type its code; so ^92 will have the same problems as typing the backslash.
The construction {0,} (find zero or more of the preceding item) is refused as
incorrect syntax. This concept is available in Unix regular expression matching, so
it's a curious omission.
You don’t always have to “escape” the special characters, if the context makes it
clear that the special meaning isn’t wanted. [abc-] matches "-", and [)(] matches
")" or "(". This may sometimes make your searches behave differently from what you
expected.
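Python's regular expressions happen to treat these two cases the same way, which makes the behaviour easy to observe. The snippet below is only an illustration in Python; it does not prove how Word handles other special characters.

```python
import re

# A hyphen at the end of a set is taken literally rather than as a range.
hyphen_hits = re.findall(r"[abc-]", "a-z")

# Round brackets inside a set are ordinary characters, not grouping marks.
bracket_hits = re.findall(r"[)(]", "f(x)")

print(hyphen_hits)
print(bracket_hits)
```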
First select the first example of the character (e.g. the Cyrillic character ю) in
the document, then activate the Insert > Symbol dialog.
Open the Windows calculator and change its view to the Programmer's calculator,
ensure that the Hex radio button is checked and enter the number into the
calculator.
In some earlier Windows versions, the calculator has a different layout. The
'hex' and 'dec' conversion buttons are in the Scientific calculator view.
Click the Dec radio button and note the changed number.
You can then use the four digit number in conjunction with the ^unnnn code (e.g.
^u1102) to find the characters in the document.
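The hexadecimal to decimal conversion done with the calculator can be double-checked in a couple of lines of Python. The hex value 044E used here is the Unicode code point of ю, supplied as an assumption because the value shown in the Symbol dialog is not reproduced in the text.

```python
# Assumed hex code for the Cyrillic letter ю (U+044E).
hex_code = "044E"

decimal_code = int(hex_code, 16)  # same conversion as Hex -> Dec
word_code = f"^u{decimal_code}"   # search code in the ^unnnn form

print(decimal_code)
print(word_code)
print(ord("ю") == decimal_code)   # ord() gives the decimal value directly
```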
Note that the following macro requires a reference to the Microsoft Forms 2.0
Object Library to be checked in the VBA editor under Tools > References.
Sub GetExtendedCharDecVal()
    Dim SelFont As Variant
    Dim SelCharNum As Long
    Dim myData As DataObject
    Dim sCode As String
    Set myData = New DataObject
    Select Case Len(Selection.Range)
        Case Is = 0
            MsgBox "Nothing selected!"
            Exit Sub
        Case Is = 1
            With Selection
                With Dialogs(wdDialogInsertSymbol)
                    SelFont = .Font
                    SelCharNum = .CharNum
Copyright Graham Mayor ©2012.