Regex Replace System

The document provides step-by-step instructions for using regular expressions in Notepad++ to clean up messy DMDX .zil files. It details the search and replace steps to remove errors, blank lines, and duplicate reaction time data. The instructions allow for easy importing of the cleaned files into Excel.

Uploaded by

BujarKlaiqi

29 June 2008

Notepad++: A guide to using regular expressions and extended search mode
The information in this post details how to clean up DMDX .zil files, allowing for easy importing into
Excel. However, the explanations following each Find/Replace term will benefit anyone looking to
understand how to use Notepad++ extended search mode and regular expressions.

If you are specifically looking for multiline regular expressions, look at this post.
You may already know that I am a big fan of Notepad++. Apparently, a lot of other people are
interested in Notepad++ too. My introductory post on Notepad++ is the most popular post on my
speechblog. I have a feeling that that is about to change.
Since the release of version 4.9, the Notepad++ Find and Replace commands have been updated.
There is now a new Extended search mode that allows you to search for tabs (\t), newlines (\r\n), and
a character by its value (\o, \x, \b, \d, \t, \n, \r and \\). Unfortunately, the Notepad++ documentation
is lacking in its description of these new capabilities. I found Anjesh Tuladhar's excellent slides on
regular expressions in Notepad++ useful. After six hours of trial and error, I managed to bend
Notepad++ to my will. And so I decided to post what I think is the most detailed step-by-step guide
to Search and Replace in Notepad++, and certainly the most detailed guide to cleaning up DMDX
.zil output files on the internet.

What's so good about Extended search mode?

One of the major disadvantages of using regular expressions in Notepad++ was that it did not
handle the newline character well, especially in Replace. Now, we can use Extended search mode
to make up for this shortcoming. Together, Extended and Regular Expression search modes give
you the power to search, replace and reorder your text in ways that were not previously possible in
Notepad++.
Search modes in the Find/Replace interface
In the Find (Ctrl+F) and Replace (Ctrl+H) dialogs, the three available search modes are specified in
the bottom right corner. To use a search mode, click on the radio button before clicking the Find
Next or Replace buttons.

Cleaning up a DMDX .zil file

DMDX allows you to run experiments where the user responds by using the mouse or some other
input device. Depending on the number of choices/responses (and of course the kind of task),
DMDX will output a .zil file containing the results (instead of the traditional .azk file). This is
specified in the header along with the various response options available to the participant. For
some reason, DMDX outputs the reaction time twice, and on separate lines, in .zil files. Here's a
guide for cleaning up these messy .zil files with Notepad++. Explanations of the Notepad++ search
terms are provided in bullet points at the end of each step.

Step 1: Back up your original result file (e.g. yourexperiment.zil) and create a copy of that file
(yourexperiment_copy.zil) that we will edit and clean up.

Step 2: Open yourexperiment_copy.zil in Notepad++ (version 4.9 or later).

Step 3: Remove all error messages. All lines containing DMDX error messages begin with an
exclamation mark. Let's get rid of them.
Bring up the Replace dialog box (Ctrl+H) and select the Regular Expression search mode.
Find what: [!].*
Replace with: (leave this blank)
Press Replace All. All the error messages are gone.

[!] finds the exclamation character.

.* selects the rest of the line.
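
For readers who would rather script this step outside Notepad++, the same idea can be tried with Python's re module. This is only a minimal sketch: the sample text is invented, and [^\r\n]* stands in for .* because Python's dot also matches the carriage return in a Windows line ending.

```python
import re

# Invented sample: one data line between two error-style lines.
sample = "! error one\r\n123,456\r\n! error two\r\n"

# [!] matches the exclamation mark; [^\r\n]* matches the rest of that line
# while leaving the \r\n line ending in place.
cleaned = re.sub(r"[!][^\r\n]*", "", sample)

print(repr(cleaned))
```

The error text is gone but the now-empty lines remain, which is what Step 4 deals with.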
Step 4: Get rid of all these blank lines.
Switch to Extended search mode in the Replace dialog.
Find what: \r\n\r\n
Replace with: (leave this blank)
Press Replace All. All the blank lines are gone.

\r\n is a newline character (in Windows).

\r\n\r\n finds two newline characters (what you get from pressing Enter twice).
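
A rough Python equivalent of this replacement, using an invented two-line sample, might look like this:

```python
# Invented sample with Windows line endings and one blank line in between.
sample = "line one\r\n\r\nline two\r\n"

# Deleting the pair of newlines removes the blank line AND joins its neighbours.
cleaned = sample.replace("\r\n\r\n", "")

print(repr(cleaned))
```

Because both newlines are deleted, the text on either side of the blank line ends up on one line; Step 5 then splits the Items apart again.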

Step 5: Put each Item (DMDX-speak for trial) on a new line.

Switch to Regular Expression search mode.
Find what: (\+.*)(Item)
Replace with: \1\r\n\2
Press Replace All. "Item"s have been placed on new lines.

\+ finds the + character.

.* selects the text after the + up until the word "Item".
Item finds the string "Item".
() allow us to access whatever is inside the parentheses. The first set of parentheses
may be accessed with \1 and the second set with \2.
\1\r\n\2 will take + and whatever text comes after it, will then add a new line, and place
the string "Item" on the new line.
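
The two capture groups and the \1\r\n\2 replacement can be checked in Python as well. The sample line below is invented purely for illustration and is not real DMDX output.

```python
import re

# Invented sample: some text after a "+" followed by the next "Item" on the same line.
sample = "+ 100 200 Item 2"

# Group 1 is the "+" and everything up to "Item"; group 2 is the word "Item".
result = re.sub(r"(\+.*)(Item)", r"\1\r\n\2", sample)

print(repr(result))
```

Since .* is greedy, a line containing several "Item"s is split only before the last one in a single pass.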
So far so good. Our aim now is to delete duplicate or redundant information (reaction time data).

Step 6: Remove all newline characters using Extended search mode, replacing them with a unique
string of text that we will use as a signpost for redundant data later in RegEx. Choose a string of
text that does not appear in your .zil file; I have chosen mork.
Switch to Extended search mode in the Replace dialog.
Find what: \r\n

Replace with: mork
Press Replace All. All the newline characters are gone. Your entire DMDX .zil file is now one very
long line of (in my case word-wrapped) text.
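
If this step is scripted, the "choose a string that does not appear in your file" requirement can be checked rather than eyeballed. A minimal sketch, in which the function name flatten and the constant SIGNPOST are hypothetical:

```python
SIGNPOST = "mork"  # the signpost string chosen in this post


def flatten(text: str, signpost: str = SIGNPOST) -> str:
    """Replace every Windows newline with the signpost string."""
    if signpost in text:
        # The trick only works if the signpost is unique to our edits.
        raise ValueError(f"{signpost!r} already occurs in the text; pick another signpost")
    return text.replace("\r\n", signpost)


print(flatten("a\r\nb\r\nc"))
```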

Step 7: We're nearly there. Using our mork signpost keyword, let's separate the different RT values.
Stay in Extended search mode.
Find what: ,
Replace with: ,mork
Press Replace All. Now, mork appears after every comma.

Step 8: Let's put the remaining Items on new lines.

Switch to and stay in Regular Expression search mode for the remaining steps.
Find what: mork(Item)
Replace with: \r\n\1
Press Replace All. All "Item"s should now be on new lines.

Step 9: Let's get rid of those duplicate RTs.

Find what: mork ([^A-Za-z]*)mork [^A-Za-z]*\,mork
Replace with: \1,
Press Replace All. Duplicate reaction times are gone. It's starting to look like a result file :)

A-Z finds all letters of the alphabet in upper case.

a-z finds all lower case letters.
A-Za-z will find all alphabetic characters.
[^...] is the inverse. So, if we put these three together: [^A-Za-z] finds any character
except an alphabetic character.
Notice that only one of the [^A-Za-z] is in parentheses (). This is recalled by \1 in the
Replace with field. The characters outside of the parentheses are discarded.
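
To see how the single capture group keeps one copy of the value and drops the other, here is the same pattern run in Python. The sample string is invented to have the shape the pattern expects (the same value twice, each preceded by "mork " and ending in a comma); real .zil content may differ.

```python
import re

# Invented sample: a duplicated value in the shape Step 9 looks for.
sample = "mork 512.30mork 512.30,mork"

pattern = r"mork ([^A-Za-z]*)mork [^A-Za-z]*\,mork"

# Only the first [^A-Za-z]* is captured, so only the first copy survives.
result = re.sub(pattern, r"\1,", sample)

print(result)
```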

Step 10: Let's get rid of all those morks.

Find what: mork
Replace with: (leave blank)

Press Replace All. The morks are gone.

Step 11: Separate each participant's data from the next.

Find what: (\**\*)
Replace with: \r\n\r\n\1\r\n\r\n
Press Replace All. The final product is a beautiful, comma-delimited .zil result file that is ready to
be imported into Excel for further analysis.
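
The pattern \**\* reads as "zero or more asterisks, followed by one asterisk", i.e. a run of one or more asterisks. A short Python check on an invented string:

```python
import re

# Invented sample: a run of asterisks between two chunks of text.
sample = "data one****data two"

# The whole run is captured once and re-emitted with blank lines around it.
result = re.sub(r"(\**\*)", r"\1".join(["\r\n\r\n", "\r\n\r\n"]), sample)

print(repr(result))
```

The replacement string here is simply \r\n\r\n, then group 1, then \r\n\r\n, assembled with join so that the newlines are real characters.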

Notepad++, is there anything it can't do?

Please post your questions in the comments below, rather than emailing me. This way, others can
refer to my answers here, saving me many hours of responding to similar emails over and over.
Update 20/2/2009: Having trouble understanding regexp? I have created a new Guide for regular
expressions. Check it out.
Posted by Mark Antoniou at 11:28 AM

Labels: DMDX, experiments, guides, notepad++, productivity

398 comments:

James said...
Hi, can those steps be automated in notepad++ ? like actions in photoshop?
July 20, 2008 at 11:13 PM
Mark said...
James, that is the million dollar question. I immediately tried to automate this somehow
but could not get Notepad++ to save these steps in a macro. If I find a solution, I will
post it.
July 21, 2008 at 7:00 PM
ninj said...
Nice article!
However, the reason why I arrived on your blog still remains unanswered:
How to replace a multiple line regexp by a simple value (in my case: nothing).
Here is the case:
In Symfony YAML generated files, I have the created_at and updated_at fields dumped,
which I don't want.
I need to replace something like this:
/ *created_at:.*\n *updated_at:.*\n/
by
//
The way to do it is important because I want the blank lines to disappear as well.
Of course I know it is possible to do it in two or three steps, but I'd like to find how to
achieve it in one only, I'm a regexp maniac ;)
Maybe you or someone else own a solution... i couldn't manage to get one neither
through CTRL-H nor through CTRL-R dialogs.
Thanks!
July 31, 2008 at 1:44 AM
Mark said...
ninj, currently you cannot do this in Notepad++. This is because replacing newlines is
possible in Extended search mode, and regular expressions are available in Regexp
search mode. You are trying to combine the two search modes, and in the current
version of Notepad++ you cannot.
Since I wrote this post, I too have caught regexp mania. If you are serious about using
regular expressions for more advanced search and replace (as you are) then you need
to use a more powerful text editor. I recommend XEmacs; I've been using it for about a
month, and it is very powerful. I'm working on a post for XEmacs right now.
As for your specific problem, it is possible to get rid of the created_at and updated_at
information. I would need to see the text file (feel free to send a sample to me as an
email attachment). I have made a few assumptions: 1. that created_at and updated_at
always occur on consecutive lines, 2. that there is information above and below these
lines that is useful. The XEmacs regular expression would be this:
Search for:
\(.*\) newline
.*created_at:.* newline
.*updated_at:.* newline
\(.*\)
Replace with:
\1 newline
\2
Note: In XEmacs, the newline character is created by pressing Ctrl+Q Ctrl+J.
July 31, 2008 at 10:44 AM
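
For comparison, a one-step removal of the two lines is straightforward in a regex engine that handles newlines, such as Python's re module. This is a sketch under assumptions: the YAML fragment is invented, and the two keys are assumed to sit on consecutive lines.

```python
import re

# Invented YAML fragment; only the created_at/updated_at keys come from the question.
sample = (
    "Article_1:\n"
    "  title: Hello\n"
    "  created_at: 2008-07-30 10:00:00\n"
    "  updated_at: 2008-07-30 10:00:00\n"
    "Article_2:\n"
    "  title: World\n"
)

# With re.MULTILINE, ^ matches at the start of every line; the \n after each
# key is part of the match, so no blank lines are left behind.
pattern = r"^ *created_at:.*\n *updated_at:.*\n"
result = re.sub(pattern, "", sample, flags=re.MULTILINE)

print(result)
```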
Anonymous said...
Quick bleg. I would like to replace all occurrences of number+comma with number +
TAB. So 12.8, 100 would become 12.8 TAB 100.
I'm using "\d," for the [Find What] value and "\1\t" for the [Replace With] value.
Unfortunately I lose that last digit in the number that I'm replacing.
Any help would be appreciated.
August 1, 2008 at 5:27 AM
Anonymous said...
Ok, I actually figured it out.
The [Find What] value should be "(\d)," and the [Replace With] value
should be "\1\t". In other words I just needed the parentheses around "\d" criteria.
Thanks for the useful article Mark.
August 1, 2008 at 5:51 AM
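
The same capture-group fix can be verified in Python. Without the parentheses there is no group 1 to put back, which is why the digit was lost.

```python
import re

text = "12.8, 100"

# (\d) captures the digit before the comma so \1 can restore it.
with_space = re.sub(r"(\d),", r"\1\t", text)

# Optionally also swallow any spaces after the comma (an addition, not from the comment).
without_space = re.sub(r"(\d), *", r"\1\t", text)

print(repr(with_space), repr(without_space))
```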
Flick said...
Thank you for the guide! I have to admit it's a little advanced for me, and I've only just
found out about REGEX expressions, but am still very excited nonetheless!
I'm a little confused by what to do in my situation. I have a mySQL file that I'd like to
run, and the first part of each line is something like this:
INSERT INTO my_table (id,uid,my_msg,my_date,the_ip) VALUES ('2',
I would very much like to be able to change the '2' part to just NULL and REGEX
seems to be the way forward. However, I think I'd have to use ( as a unique identifier,
and given that REGEx uses brackets as the separators, I'm now a little stuck.
Apologies in advance for this simple question, but my brain is really not working today.
Thanks!
p/s: I'll continue looking into it in the meantime.
August 11, 2008 at 2:06 AM
Flick said...
Just a quick update: I've been able to use Column Mode select (Alt+mouse) to select
the column and replace the NULL, since thankfully everything is in the same column!
I wonder if it is still possible in Regex though?
Thanks :)
August 11, 2008 at 2:20 AM
Mark said...
Hi Flick, thanks for your comments. I do have a regex solution for you that is very easy
and quick. Note that this regex syntax is specific to Notepad++.
First, let me answer your question re: the curved bracket (or parenthesis) character: in
order to search for and find the open parenthesis character, place the parenthesis within
square brackets like this: [(]
However, you do not need to use the parentheses or square brackets at all to achieve
what you want to (if I have understood you correctly).
Search for: '.*',
Replace with: NULL
If you do not want to get rid of the comma, then delete it from the search term. If this
then stuffs up your search and finds incorrect portions of text, you could insert a
comma after null in the replace with expression: NULL,
August 11, 2008 at 3:38 PM
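
One caveat worth checking before pressing Replace All: .* is greedy, so if the rest of the line contains more quoted values, '.*', may match further than the first value. The Python sketch below illustrates this; everything after VALUES ('2', in the sample line is invented, since the question shows only the start of the line.

```python
import re

# Only the part up to VALUES ('2', comes from the question; the other values are invented.
line = (
    "INSERT INTO my_table (id,uid,my_msg,my_date,the_ip) "
    "VALUES ('2','7','hello','2008-08-11','127.0.0.1');"
)

# Greedy: .* runs to the LAST "'," on the line, so several values are swallowed.
greedy = re.sub(r"'.*',", "NULL,", line)

# Targeted: anchor on the literal "VALUES (" and allow only digits inside the quotes.
targeted = re.sub(r"VALUES \('\d+',", "VALUES (NULL,", line)

print(greedy)
print(targeted)
```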
Anonymous said...
Mark,
Do you have some advice for the following. I have a set of text lines... and I want to
delete duplicate lines. But the redundant information will occur only at the beginning of
the line, the end of those lines differ in their information. I'm just starting to use
notepad++ RegExp utilities, but I'm no whiz yet with the format.
Thanks
October 7, 2008 at 2:00 AM
Mark said...
That's exactly what regular expressions do. Give me 4-5 lines of your text as an
example, and I'll show you the correct regexp.

October 7, 2008 at 3:56 AM
Anonymous said...
ok... I've made the text file simpler so that the duplicates I want to delete all have the
same information.
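[19-766]
???^Los Angeles^60-638^LOS ANGELES CITY USE ONE STEP 1940 LARGE CITY
ED FINDER
[19-767]
???^Los Angeles^60-638^LOS ANGELES CITY USE ONE STEP 1940 LARGE CITY
ED FINDER
[19-773]
???^Los Angeles^60-638^LOS ANGELES CITY USE ONE STEP 1940 LARGE CITY
ED FINDER
[19-1581]
???^Los Angeles^60-638^LOS ANGELES CITY USE ONE STEP 1940 LARGE CITY
ED FINDER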

the phrases in brackets, on separate lines, are ignored by the final use of the text file.
They can remain, but I do want to delete the duplicates of the ??? lines. I'll have other
cities with similar format.
thanks
October 7, 2008 at 8:55 AM
Anonymous said...
... this group of lines is followed, for example, by:
[19-773]
???^Los Angeles^60-639^LOS ANGELES CITY USE ONE STEP 1940 LARGE CITY
ED FINDER
[19-1580]
???^Los Angeles^60-639^LOS ANGELES CITY USE ONE STEP 1940 LARGE CITY
ED FINDER

so the number between the second and 3rd ^s will change throughout the file, as will the
county name between the 1st and 2nd ^

October 7, 2008 at 8:59 AM
Mark said...
Ok, I understand the problem. Can you provide me with what you would like the output
to look like after applying the regexp.
For e.g., should it look like this:
Los Angeles 60-638
Los Angeles 60-639
Is this the only useful info? Should everything else be deleted?
Also, are the number of repetitions (lines of redundant info) the same for each
city/number?
October 7, 2008 at 11:13 PM
Anonymous said...
Mark,
As further background, you are looking at the content of the 1930 census districts
laundered into the 1940 census districts. I have transcribed a cross table between 1930
and 1940, and we seeded the 1940 EDs with the 1930 information. Those 1930 ED
numbers are in brackets, and point to the next text line (where that information came
from). Since census districts change boundaries between federal censuses, especially
in large cities, you will see multiple 1940 entries from different 1930 EDs that are
partially contained within the 1940 ED. I don't think there would be any more than 10
such contribution EDs. For rural areas the data from 1930 to 1940 is accurate, for urban
areas we have transcribed street indexes for over 200 large cities, thus instead of
repeating their 1940 ED streets (I have scanned 28 rolls of 1940 ED descriptions), I just
direct them to the other utility. For smaller areas of 25,000 or more, I intend to get street
indexes for them, and have replaced their descriptions with "TO BE DONE BY
BOUNDARY OR STREET INDEX".
When there are multiple ED entries for a single 1940 ED # (which is a two part number),
they will occur together as a block with no blank line between the various lines. If a
1940 ED has only a single 1930 entry, it should have a blank line above the brackets,
and one below the text line.
I fooled with TextFX but it moves the brackets from the text lines, doesn't show a
numerical sort of numbers (thus one sees 2, 20, 21, ...) and for some didn't get me to a
unique line.
I need the entire line. So for the first example I want:
[19-766]
???^Los Angeles^60-638^LOS ANGELES CITY USE ONE STEP 1940 LARGE CITY
ED FINDER
[19-767]
[19-773]
[19-1581]
but I'm willing to give up the brackets lines, but I do want a blank line between the
statements.
I've done 2 states, and with California decided to do some more automation. To see
Alabama and Arkansas... go to http://www.stevemorse.org/ed/ed.php
and choose 1940 and one of those two states.
Thanks... I'll ask Steve Morse to acknowledge you on the One Step site if you can pull
this off.
Joel Weintraub
Dana Point, CA
October 8, 2008 at 10:47 AM
Anonymous said...
Mark,
Steve Morse wrote a utility to do what I want.
But it was interesting to see if RegEx could do the same thing.
So... thanks for your help... don't do any more.
Thanks
Joel Weintraub
October 14, 2008 at 2:32 PM

Mark said...
Glad to hear that your problem got solved. Apologies for not responding as quickly as I
usually would, but you caught me at a bad time (wedding and honeymoon). My wife
doesn't let me post about regular expressions while on honeymoon!
Basically, the problems with your data are twofold:
a) There is no unique identifier in the first occurrence of a 'new' number; and
b) The number of repetitions varies.
You cannot use regexp to compare two strings of text and decide if a change has
occurred (i.e. a new number/city, whatever). In summary, getting a parser/utility written
was a smart move.
I am writing up a guide about how to use regular expressions, going from basics to more
advanced stuff. Stay tuned.
October 28, 2008 at 3:22 AM
Anonymous said...
Regular Expressions - User guide

http://www.zytrax.com/tech/web/regex.htm
November 17, 2008 at 2:27 AM
liz said...
thanks, helped me out a bunch :)
November 25, 2008 at 4:36 AM
fresh332 said...
I have an output file from a program which contains "\n" characters instead of line
breaks, e.g.: "Text\nNew line\nAnother line"
Similar to your "mork" solution I do a consecutive replace, first in "normal mode"
replacing "\n" characters with something unique like "ZZZ", then in "extended mode"
replacing "ZZZ" with "\n" so I finally have the line breaks.
There should be a way to do this in one step, or to automate the two steps, either in
notepad++ or with some other tool - has anyone got an idea?
December 3, 2008 at 9:24 PM
Mark said...
fresh332, yes there is a way to do this in one step; and no, you cannot do this with
Notepad++.
I now use a very powerful text editor called XEmacs. It really leaves Notepad++ for
dead when it comes to regexp. It's so good that I'm working on a more detailed guide to
regexp using XEmacs right now.
FYI: in XEmacs, you specify a newline character by first pressing Ctrl+Q and then
Ctrl+J. This creates a newline character that takes care of \n and other "newline"
characters.
December 4, 2008 at 8:31 AM
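
Outside a text editor, converting the literal two-character sequence \n into real line breaks is a single replace. A small Python sketch using the example string from the question:

```python
# A raw string keeps the backslash and the letter n as two separate characters.
raw = r"Text\nNew line\nAnother line"

# "\\n" is the literal backslash + n; "\n" is a real newline character.
fixed = raw.replace("\\n", "\n")

print(fixed)
```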
Dave Bui said...
Brilliant! I love the replacement double blank lines to a single blank lines.
December 4, 2008 at 10:22 PM
David Leigh said...
I didn't see a mention of Notepad++'s other Find/Replace facility: The TextFX plugin. I
did not look to see if any of the "unsolvable" problems would be solved by TextFX, but
in the case that they might be, it's worth looking at the TextFX Find/Replace facility
(CTRL+R or via the menus) because of the way it can handle newlines and tabs.
That being said, connecting Find/Replace (any flavor) with the macro recording facility
of Notepad++ would elevate this software to "perfect" in my eyes...it's the one thing
remaining that really aggravates me on a semi-regular basis. Other than that, I LOVE
this editor.
January 3, 2009 at 12:29 AM
Anonymous said...

January 7, 2009 at 10:34 PM

Jay Fulton said...

Thank you VERY much. The documentation helps you, of course, but it saves time for
the rest of us, too! Much appreciated
January 31, 2009 at 3:44 AM
Ninad said...
Hi,
Can anyone tell me how to use regexp and convert upper case to lower case?
February 3, 2009 at 9:32 AM
Mark said...
Good question, Ninad. I'm not sure if regexp can change upper to lower case for you.
And I'm not sure how complicated your text file is. However, if you simply want to
change text to lower case or vice versa, you can do this without using regexp.
In Notepad++, select the text that you would like to change, then click on the TextFX
menu, then TextFX Characters, and then select lower case.
Easy, huh?
February 3, 2009 at 10:00 AM
Anonymous said...
Hi.
Is it possible to search and replace the following in notepad++?
/*
...
...
*/
I can do it if it's all on one line, e.g. /* ... */
But I can't seem to find the regex command to select across multiple lines.
Is this because n++ regex can't handle line returns?
-J
February 14, 2009 at 9:49 AM
Mark said...
That's exactly right. The problem is the line returns (or newlines). This is quite
problematic isn't it?
If you would like to be able to do these types of regular expressions then you should
use a more powerful text editor. I use XEmacs.
I've been working on a very comprehensive "Guide to regexp using XEmacs" post for a
while now. Hopefully I will publish it in the next month or two.
February 14, 2009 at 7:04 PM
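
As an illustration of what a multiline-capable regex engine does with this kind of search, here is a minimal Python sketch that strips /* ... */ blocks whether or not they span several lines. The source text is invented.

```python
import re

# Invented sample with one multi-line comment and one single-line comment.
source = "int a;\n/*\n first line\n second line\n*/\nint b; /* inline */\n"

# re.DOTALL lets . match newlines; the lazy .*? stops at the first closing */.
stripped = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)

print(stripped)
```

The lazy quantifier matters: a greedy .* would run from the first /* to the last */ in the file.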
Jolas Arvin said...

April 16, 2009 at 8:20 PM

Jolas Arvin said...
i just search the net for multiline regex replacements and i bumped into this post.
im experiencing same problems on n++. poor thing n++ can't handle multiline regex. :( oh
well im looking forward to see the XEmacs guide to regex. hope multiline regex
replacement will be included in it. tnx.
i'm somewhat into coding that feature in java to fully customize regex commands into
my needs (specially the multiline replacements). :) if anyone did that, please share.
many thanks.. :)
April 16, 2009 at 8:22 PM
Anonymous said...
Can I do a logical-OR regular expression search in Notepad++? In TextPad I used
"^Alert|^Error|^Warning" to find all lines in a system log that started with either of the
three words. The "|" operator does not seem to work in Notepad++.
Of course, I could do three separate searches, but it would be nice if Notepad++ did
this for me by interpreting an OR operator, e.g. "|".
April 29, 2009 at 7:52 PM
Mark said...
No, Notepad++ cannot perform logical OR regexp searches. That was an easy question
:)
However, the excellent and free XEmacs can handle your search without any problems.
Note that your Textpad OR operator | would become \| in XEmacs, i.e.,
^Alert\|^Error\|^Warning
April 30, 2009 at 6:40 PM
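
For reference, the alternation itself looks like this in Python's re module, where | needs no backslash. The log lines are invented, and the alternatives are grouped so that each match is the whole line.

```python
import re

# Invented log excerpt.
log = "Alert: disk full\nInfo: started\nError: timeout\nWarning: low memory\n"

# re.MULTILINE makes ^ and $ match at every line boundary.
pattern = re.compile(r"^(?:Alert|Error|Warning).*$", re.MULTILINE)
matches = pattern.findall(log)

print(matches)
```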
Anonymous said...
I have a text file full of blocks of text like this:
"STRING1" =>
{ url => "URL1",
visibleif => sub { !$is_temporarily_terminated &&
padlock("STRING2");
},
},
... more blocks like the above separated by a blank row.
End state: I need an excel file with 3 columns: string1, url1, and string2
Any ideas? I am completely new to regex and using notepad++ for now. If someone
who is really good at this replies quickly, then there also could be some work that we
could pay them to do in the future as we get a lot of projects like this.
May 15, 2009 at 5:27 AM
Mark Antoniou said...
That's pretty easy to fix. I wouldn't use Notepad++ for this. Instead use the excellent
and free XEmacs.
In XEmacs, the correct regex search term would be (newline character at end of each
line is made by Ctrl+Q, Ctrl+J):
"\(.*\)".*
.*"\(.*\)".*
.*
.*"\(.*\)".*
.*
.*
and the correct replace term would be:

\1,\2,\3
This would create the following output:
STRING1,URL1,STRING2
which you could then open in Excel as a comma delimited file, which would place each
string/url in a separate column.
May 15, 2009 at 1:13 PM
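
A scripted version of the same extraction might look like the Python sketch below. It assumes every block has the shape shown in the question (a quoted key, a url value, and a padlock(...) call); blocks missing any of the three would need extra handling.

```python
import csv
import io
import re

# The sample block from the question; further blocks would follow after a blank line.
text = '''"STRING1" =>
{ url => "URL1",
visibleif => sub { !$is_temporarily_terminated &&
padlock("STRING2");
},
},
'''

# Three capture groups: the key, the url value, and the padlock argument.
pattern = re.compile(
    r'"([^"]*)"\s*=>\s*\{\s*url\s*=>\s*"([^"]*)".*?padlock\("([^"]*)"\)',
    re.DOTALL,
)
rows = pattern.findall(text)

# Write the rows as comma-delimited text that Excel can open.
out = io.StringIO()
csv.writer(out).writerows(rows)
print(out.getvalue())
```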
Abhishek said...
Mark,
One question. The contents of file are following.
ABC
XYZ
123
I want to the file contents to be following.
'ABC','XYZ','123'
Thanks,
Abhishek
May 25, 2009 at 6:45 PM
Mark Antoniou said...
Hey Abhishek. This is an easy task. I would advise that you use XEmacs rather than
Notepad++. The reason for this is that Notepad++ does not deal well with newlines.
In XEmacs, you would search for:
\(.*\)
\(.*\)
and replace this with:
\1,\2
Done :)
May 25, 2009 at 9:29 PM
Abhishek said...
Thanks Mark. But, I work on client network where we cannot install XEmacs. We have
only notepad ++ installed. Any other thoughts please?
ABC
123
XYZ
Need to change into 'ABC','123','XYZ'
May 27, 2009 at 12:15 AM
Mark Antoniou said...
Ok, well there is a way around it, so long as your data is exactly as you have specified
here, i.e.:
ABC
XYZ
123
So, in order to get to this:
ABC,XYZ,123
All you need to do is replace the newline character with a comma.
If that is the case, you would use extended search mode and search for: \r\n
and replace this with: ,
That should do the trick.
May 27, 2009 at 10:22 AM
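
The newline-to-comma replacement gives ABC,XYZ,123 without the single quotes that were asked for. Where a scripting language is available, the quoted form is a one-liner; a small Python sketch, assuming one value per line:

```python
# Sample from the question, with Windows line endings.
text = "ABC\r\nXYZ\r\n123\r\n"

# Wrap each non-empty line in single quotes and join the results with commas.
quoted = ",".join(f"'{line}'" for line in text.splitlines() if line)

print(quoted)
```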
Vladimir said...
Hi Mark,
Found your blog and hoping you can help me. I have a batch file that I receive daily. I
need some help trying to modify it.
I need to insert a page break before it says PAGENO throughout the whole document. I
tried to do Find and Replace with PAGENO & \fPAGENO, but it didn't work. It puts FF
in black box in front of PAGENO, but doesn't create a page break when I print. What did
I do incorrectly and is this the way to do a page break with regexp?
Also, is there a way to automate this process with Notepad++ or any other app?
Thank you very much for your help!
August 5, 2009 at 7:21 AM
Mark Antoniou said...
Hey Vladimir,
This was a tough one! Let me begin by saying that I have an answer for you... kind of.
First of all, as far as I am aware, you cannot have page breaks in a text document.
Ok, now that we've got that out of the way, what are we going to do to help you? I would
say that inserting a page break requires a rich text editor. So, Notepad++ is not going to
cut it.
I have achieved what you requested in one easy step using Microsoft Word. Open your
file in Word and select Replace (Ctrl+H), and enter the following search term:
Find what: \fPAGENO
Replace with: ^12
and then hit Replace All.
All of the \fPAGENO are now page breaks. Easy.
If you wish to remove PAGE from the top of each page, you could replace it with
nothing. Be sure to match the case when searching so that you do not remove any
legitimate occurrences of the word "page" that are in the content of your file (if there are
any).
As for automating this, it can be done (although I am not hugely experienced in task
automation in Word). Take a look at this URL:
http://www.microsoft.com/technet/scriptcenter/resources/qanda/jul07/hey0710.mspx
Good luck. Let me know how it works out.
August 5, 2009 at 11:21 AM
Puiufly said...
Don't waste time.

move perl.
August 5, 2009 at 10:28 PM
Mark Antoniou said...
...or you could learn to use Perl, as suggested :)
This is going a bit beyond regexp though!
August 5, 2009 at 11:21 PM
jp said...
Hi Mark,
I must say its a very useful post.
However i would be very grateful to u if u can solve one of my problems in notepad++.
Input:
{ "arc_on_sf::set_end(...)"
}
25848 0.041144 0.000002 0.1 { "pt_on_sf::evaluate"
}
24408 0.032451 0.000001 0.0 { "pt_on_cv::evaluate"
}
Output: when i place the cursor on any of the open braces and press ctrl-B in a LISP
file(got by using alt-l-l enter) i can see the open bracket n the closed bracket
highlighted. Now i need a command to delete the text in between the brackets.

for ex: In the above input if I select { "pt_on_cv::evaluate"

} then it should get deleted upon using a shortcut.

so the final output will be

Output:
{ "arc_on_sf::set_end(...)"
}
25848 0.041144 0.000002 0.1 { "pt_on_sf::evaluate"
}
24408 0.032451 0.000001 0.0

April 8, 2010 at 9:03 PM
Mark Antoniou said...
Thanks for your question jp.
Some more information would be helpful. As your search involves multiple lines, I would
strongly recommend using a more powerful text editor than Notepad++. I use XEmacs
on Windows and Aquamacs on OSX. The solutions below will work in any text editor
that supports multiline regular expressions (not Notepad++).
If you simply want to remove all instances of curly brackets, and everything that is in
between them, you would search for:
{.*
}
Note that in Emacs, the way to insert a newline into your search query is to press
Ctrl+Q then Ctrl+J. In the above example, you would insert the newline after the
asterisk * and before the close curly braces }
and replace this with nothing.
However, I am assuming that you want to keep some of the information in the curly
brackets. From your question, I cannot tell if it is every second instance, or curly
brackets that contain "cv". Some more information would allow me to give you a more
tailored answer. For the time being, I will assume that you want to remove curly
brackets containing "cv", but want to leave those containing "sf" (or anything else)
unaffected. To accomplish this, you would search for:
{.*cv.*
}
and replace this with nothing.
April 9, 2010 at 10:09 AM
sourabh bora said...

May 7, 2010 at 12:39 AM

sourabh bora said...
Hey Mark,
Awesome blog. I could not make {n} (repeats the previous item n times) work.
Specifically I am looking at deleting a string of 10 numbers
Thanks
May 7, 2010 at 12:39 AM
Mark Antoniou said...
Thanks sourabh bora.
Could you copy and paste a sample from your file so that I can have a look at what
patterns might work?
May 7, 2010 at 10:08 AM
Christopher said...
Wow, this guide is very helpful and makes debugging code or even reformatting jumbled
scan text from books a snap to clear up.
Always used Notepad++ and these search and replace tips really makes things so
much easier and faster.
May 11, 2010 at 9:42 PM
sourabh bora said...
Thanks for your reply.
Here is an example:
Post123456 This is a nice post Post12345678 This is not a nice post
Post324567 This is another nice post
I want to delete the "nice" posts (Post--Followed by exactly 6 numbers, )
Thanks
May 12, 2010 at 11:22 PM
Mark Antoniou said...
This is actually a lot easier than I thought. If the text preceding the 6 numbers is always
the same, then you have an easy way of uniquely identifying the "nice" posts.
Search for:
nice post Post......

Replace with: nothing
This will get rid of the words "nice post Post" and the six characters directly after.
May 13, 2010 at 9:26 AM
sourabh bora said...
Thanks. Unfortunately, no text in the passage is the same. The only pattern is
"Post" followed by exactly six random digits. There can be "Post" followed by 8
or 9 random digits, but they are of no interest to us.
Example
If you are working on something
Post123456 cool, let #delete this
Post123456789 him know.#dont delete
Post234567 They select a #delete
Post1 forum member#dont delete
Post23 each month for a#dont delete
grant of up to $100 in hardware or software or other products. (Products do not have to
be available on the mp3Car Store.)
May 13, 2010 at 12:18 PM
Mark Antoniou said...
Ok, so I didn't understand your previous message properly, then. It still looks to me that
there is a pattern there though.
Search for:
Post......
Replace with: nothing
The problem is that if you search for "Post......" it will also match the start of longer
strings: "Post12345678" will become "78", and this is not good. So, in order to make
the match unique, you might include a space after the final period in your search
expression. I will put the search term in quotes to illustrate that there is a space on the
end. Do not use the quotes in your text editor.
Search for: "Post...... "
This search term will leave longer strings of numbers unaffected.
May 13, 2010 at 12:34 PM
Mark Antoniou said...
Here is the output from your sample of text above:
If you are working on something
cool, let #delete this
Post123456789 him know.#dont delete
They select a #delete
Post1 forum member#dont delete
Post23 each month for a#dont delete
grant of up to $100 in hardware or software or other products. (Products do not have to
be available on the mp3Car Store.)
May 13, 2010 at 12:36 PM
sourabh bora said...

May 13, 2010 at 12:38 PM

sourabh bora said...
Thanks. This is exactly what I did.
However, regexp has a more elegant solution. You can specify exactly how many
characters you are searching for.
What if the number of digits was 60 instead of 6? You can write .{60} instead of typing
60 dots.
I was wondering if notepad has this feature implemented.
And also, we need to search only for digits.. so we will have to type [0-9] sixty times.
(otherwise, posting123 will be selected)
May 13, 2010 at 12:41 PM
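As a point of comparison, here is a minimal sketch of the counted-repetition idea in Python's re module, using the sample text from the comment above. It makes no claim about which Notepad++ versions accept {6} or lookahead; the optional trailing space in the pattern is an assumption chosen to reproduce the output shown earlier.

```python
import re

sample = (
    "If you are working on something\n"
    "Post123456 cool, let #delete this\n"
    "Post123456789 him know.#dont delete\n"
    "Post234567 They select a #delete\n"
    "Post1 forum member#dont delete\n"
    "Post23 each month for a#dont delete\n"
)

# "Post", exactly six digits, not followed by a seventh digit,
# plus one optional trailing space.
SIX_DIGIT_POST = re.compile(r"Post[0-9]{6}(?![0-9]) ?")

result = SIX_DIGIT_POST.sub("", sample)
print(result)
```

Because the class is [0-9] rather than a dot, a word such as "Posting123" is not touched, and changing {6} to {60} is all it takes to handle sixty digits.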
marius said...
Hi, I am new to regular expressions
and I don't quite get it. As I do not want to write a program to replace what I have here, I
would like you to help me.
My file is AAABBBCCC etc. with all sorts of characters from the ASCII table.
The problem is that I want the text (code) to be ABC, and to search for all hex ASCII codes,
not just numbers or letters.
Thanks a lot
June 13, 2010 at 9:44 PM
Mark Antoniou said...
Thanks for your question Marius. I'm just not exactly clear on what you want to do.
To help me, could you provide me with a sample of what your text looks like (a few
lines), and then provide me with what you want those lines to look like after you run the
regular expression.
June 14, 2010 at 7:05 PM
marius said...
well my text looks like aaafffcccddd777gggzzziiippp---000
and i would like all the triplets to be replaced with only one character.
As you can see it is not only a to z and A to Z there are all type of characters with code
between 0 and 255 ( Ascii code )
June 14, 2010 at 9:51 PM
Mark Antoniou said...
Ok. If that is all that your file contains, then you could simply search for:
..(.)
and replace with:
\1
Easy.
Note, I don't use Notepad++ any more, since I have moved on to Emacs. In Emacs the
search term would be:
..\(.\)
but the concept is exactly the same: Discard the first two occurrences and keep the
third.
June 14, 2010 at 10:23 PM
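The "discard two, keep the third" idea can be checked quickly in Python. The first substitution is the pattern from the comment above; the second, which uses a backreference so that only genuine triplets are collapsed, is an addition for illustration and not something proposed in the thread.

```python
import re

text = "aaafffcccddd777gggzzziiippp---000"

# Pattern from the thread: skip two characters, keep the third.
keep_third = re.sub(r"..(.)", r"\1", text)

# Stricter variant (an assumption, not from the thread): only collapse a
# character that really is repeated three times in a row.
collapse_triplets = re.sub(r"(.)\1\1", r"\1", text)

print(keep_third)
print(collapse_triplets)
```

Note that in Python the dot does not match a newline unless re.DOTALL is passed, which matters if the file really contains every character code from 0 to 255.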

marius said...
thank you a lot
June 14, 2010 at 10:47 PM
Mark N said...
I am trying to do 2 things:
1. Find lines with MORE than 95 characters (including white space)
and
2. Find lines with LESS than 95 characters (including white space)
I can do Perl regular expressions, but they just don't work in Notepad++ for some
reason. Can you please help?
June 24, 2010 at 6:21 AM
Afzaal Ameer said...
Hey man, as per your wish I have shifted to XEmacs. Now can you please explain the
regex to remove multiline comments?
June 27, 2010 at 3:01 PM
Mark Antoniou said...
Hey Afzaal,
It's very easy with Emacs. You get the newline character by pressing Ctrl+Q Ctrl+J.
For example, if you had two lines and wanted to remove the line break you would
Search for: Ctrl+Q Ctrl+J
Replace with: nothing/leave blank
June 27, 2010 at 9:29 PM
Mark Antoniou said...
Mark N,
I'm not ignoring you. I've had a bit of trouble getting the regular expression to work in
Notepad++. It definitely can be done as a regular expression though.
Must you use Notepad++?
June 27, 2010 at 9:31 PM
Mark N said...
Well, I prefer that it be done in Notepad++... besides, I don't want to write a script that
does this.
July 13, 2010 at 5:36 AM
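For reference, the line-length question is usually expressed with counted repetition. The sketch below shows the two patterns in Python's re module; it assumes an engine that supports {n,m}, which the thread suggests Notepad++ did not handle reliably at the time.

```python
import re

# More than 95 characters: at least 96 of anything between line start and end.
LONGER_THAN_95 = re.compile(r"^.{96,}$", re.MULTILINE)
# Fewer than 95 characters: at most 94.
SHORTER_THAN_95 = re.compile(r"^.{0,94}$", re.MULTILINE)

# Three test lines of 94, 95 and 96 characters.
text = "\n".join(["x" * 94, "y" * 95, "z" * 96])

long_lines = LONGER_THAN_95.findall(text)
short_lines = SHORTER_THAN_95.findall(text)
print(len(long_lines), len(short_lines))
```

The 95-character line is matched by neither pattern, which is the intended behaviour for "more than" and "less than".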
Garioch said...
hi, i have a somewhat similar problem ...
i have a sql export-file
i want to "edit" the lines automatically .. coz its almost 6000 of them
each Insert-Line starts with
(id, another_id, third_id, NULL, ...
here i want to "delete" the 3rd id - while leaving all other things
i tried with several search patterns - but to no luck ..
July 14, 2010 at 4:36 PM
Garioch said...
to be more precise all id , 2nd ID and 3rd ID are actual numbers
July 14, 2010 at 4:42 PM
Mark Antoniou said...
Garioch, if you want me to give you the exact answer, paste a few lines of code into a
comment. But, the general principle is this:
Group the ids that you want to keep as \1,\2 and don't insert id 3 into the replace term.
Make sense?
July 14, 2010 at 4:43 PM


Garioch said...
4 of the lines of those 6000
(1, 1, 1, NULL, 'delayed billing', '2007-02-16', 0, 17 more fields),
(2, 1, 2, NULL, 'delayed billing', '2007-02-16', 0, 17 more fields),
(3, 1, 3, NULL, 'delayed billing', '2007-03-01', 0, 17 more fields),
(4, 1, 4, NULL, 'delayed billing', '2007-03-01', 0, 17 more fields),

since my question only concerns the start of each line i omitted some info at the end ...
but this should give a picture of the data i want to Replace
until now i was able with some info from other web-pages to find the start of a line with a
regex like
[(][0-9]*[, ][0-9]*[, ]
this marks exactly (1, 1, from the first insert-line
so how do i "mark" this as pattern 1 and how do i progress from there
July 14, 2010 at 4:59 PM
Mark Antoniou said...
Sometimes, the best solution is not to get too fancy. How about if we group everything
from the start that you want to keep as \1.
Then we match, without grouping: id3, NULL.
Then we group everything from there to the end of the line (.*)
as \2.
So, your replace term would be: \1NULL\2
That would work.
July 14, 2010 at 5:14 PM
Garioch said...
thanks mark
but i think i found "my solution"
Find what :", [0-9]*, NULL, "

Replace with : ", NULL, "
then a quick "Replace All"
but again thanks for your advice (from previous answers)
July 14, 2010 at 5:20 PM
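The grouping approach described above can be sketched in Python as follows, using two of the sample rows from the thread. The exact pattern is an assumption about the row layout (ids separated by a comma and a single space), not the expression either commenter ran.

```python
import re

rows = [
    "(1, 1, 1, NULL, 'delayed billing', '2007-02-16', 0, 17 more fields),",
    "(3, 1, 3, NULL, 'delayed billing', '2007-03-01', 0, 17 more fields),",
]

# Group 1: "(", first id, second id.
# The third id is matched but not captured, so it is dropped.
# Group 2: "NULL" and the rest of the line.
THIRD_ID = re.compile(r"^(\([0-9]+, [0-9]+, )[0-9]+, (NULL.*)$")

fixed = [THIRD_ID.sub(r"\1\2", row) for row in rows]
for row in fixed:
    print(row)
```

Anchoring at the start of the line keeps the pattern from touching any later "number, NULL" pair in the 17 fields that were omitted from the sample.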
user said...
Hi Mark is it possible to make something like this, im not a programmer so ill try to
explain it easy
find any content between two specific custom tags and replace it with the same tags
and a new content between them like
find [customtag]*[customtag]
replace [customtag] This is new content replacing whatever was between custom
tags.[customtag]
im using * like a wildcard to explain that should select every single character between
tags
and more specific what i want is
find *
replace some html marked text like \\Let change some hmtl paragraphs\\
(ive put slashes mixed with html tags because blogger does not allow me to post those
tags)
ive read you cannot use regular with multiline so i ask myself if this is possible in
notepad++ in some extent and in multiple opened files simultaneously, preferable as i
do all my work with this program, and only xemacs as a last option, or alternative if you
want to show next to notepad++ that it is easier to accomplish this in xemacs. But i ask
myself if xemacs is not for non programmer ppl like (i know html css and more or less
can read php and python with a very rough idea of whats going on, sometimes)
thanks again for this super post the best in internet explaining regular expressions for
notepad++ and introducing xemacs for the same.
July 15, 2010 at 10:22 PM
user said...
(blogger screwed up my poorly escaped html tags, I'll try again with parentheses)
and more specific what i want is
find (<)!--tag1--(>)*(<)!--tag1--(>)
replace (<)!--tag1--(>)some html marked text like (<)div\(>)(<)p\(>)Let change some hmtl
paragraphs(<)/p(>)(<)/div(>)(<)!--tag1--(>)
July 15, 2010 at 10:27 PM
teddan00 said...
I have a filename, i.e. a song called "Born To Run-E Street Band-Bruce
Springsteen.mp3"
I try to make "E Street Band-Bruce Springsteen" switch place with "Born To Run".
Find: (.*)-(.*)\.
Replace: \2-\1.
But I get the following filename: "Bruce Springsteen-Born To Run-E Street Band.mp3"
It seems that the last occurrence of "-" is found. Is it possible to find the first
occurrence, AND still make it compatible with filenames that only have one "-" in
them?
July 16, 2010 at 1:54 AM
TechnologyYogi said...
I used NP++'s regular expressions for find and replace for the first time - successfully,
before this I depended on MS SQL Server's Management studio for this, as it has very
cool easy to use find/replace features (using regular expressions).
Thanks for the post!
July 17, 2010 at 1:00 AM
Mark Antoniou said...

First of all, apologies for taking so long to respond. I was on holidays overseas and only
recently arrived back in Sydney.
teddan00, I will answer your question first because it is an easy one. If the character "-"
is giving you trouble, simply change it to something else via a simple Find+Replace.
For instance,
Search for: -
Replace with: mork
Now, run a regular expression like this
Search for: (.*)mork(.*)mork(.*).mp3
Replace with: \2-\3-\1.mp3
For songs with only one "-",
Search for: (.*)mork(.*).mp3
Replace with: \2-\1.mp3
Easy.
August 9, 2010 at 11:07 PM
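In engines that support lazy quantifiers, the "first hyphen" problem can also be handled in one step. The sketch below uses Python's re module; the second filename is a made-up example with a single hyphen, and nothing here assumes that Notepad++ of that era supported .*? itself.

```python
import re

# The lazy first group stops at the FIRST hyphen; the second group takes the rest.
FIRST_HYPHEN = re.compile(r"^(.*?)-(.*)\.mp3$")

def swap_title(filename: str) -> str:
    """Move everything before the first hyphen to the end of the name."""
    return FIRST_HYPHEN.sub(r"\2-\1.mp3", filename)

print(swap_title("Born To Run-E Street Band-Bruce Springsteen.mp3"))
print(swap_title("Thunder Road-Bruce Springsteen.mp3"))  # hypothetical one-hyphen name
```

The same expression covers names with one hyphen and names with two, so no placeholder word is needed.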
Mark Antoniou said...
@user
Thanks for your question. The short answer is "yes", that is exactly what regexp is for.
I couldn't understand your second post, so I will do my best to answer your first post.
Let's say that you had two custom tags and wanted to replace the text between them.
Find: ([customtag1]).*([customtag2])
Replace: \1Type replacement text here\2
The \1 and \2 will re-insert custom tags 1 and 2, respectively back into the text file.
Hope I understood and answered your question.
August 9, 2010 at 11:13 PM
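One caveat when the tags contain characters such as [ ] or < >: in most regex engines square brackets start a character class, so the tag text has to be escaped. A minimal Python sketch follows, assuming the same tag marks both the start and the end of the region (as in the second post above); the function name and sample strings are hypothetical.

```python
import re

def replace_between(text: str, tag: str, new_content: str) -> str:
    """Replace whatever sits between two occurrences of `tag`, keeping the tags."""
    # re.escape makes characters such as "[" or "<" in the tag literal.
    marker = re.escape(tag)
    # ".*?" is lazy, and DOTALL lets it cross line breaks.
    pattern = re.compile("(" + marker + ").*?(" + marker + ")", re.DOTALL)
    # A function replacement avoids backslashes in new_content being read as escapes.
    return pattern.sub(lambda m: m.group(1) + new_content + m.group(2), text)

page = "a<!--tag1-->old\ntext<!--tag1-->b"
print(replace_between(page, "<!--tag1-->", "<div><p>Let's change some html paragraphs</p></div>"))
```

Running this over several files is then a matter of looping over the file list, which is left out here to keep the sketch short.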

Der Bloggende Nomade said...
From now on (version 5.7.1), it's possible to record search and replace events within a macro.
September 7, 2010 at 7:38 PM
Tiberius Gracchus said...
There's a very simple workaround for searching multiple lines. Replace \r\n with
something that is never present naturally. I like the ANSI character 167, but Notepad
doesn't have a facility for inserting ANSI characters easily.
Anyway then you run your search specifying the character or string as your endline
equivalent, go to town and replace the puppies with \r\n.
October 8, 2010 at 8:18 AM
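The three-step placeholder workaround can be written out as a short Python sketch. The sample text and the "remove a brace block" task are invented for illustration; the placeholder is character 167, as suggested above, and the last line shows the single expression an engine with newline support would allow.

```python
import re

PLACEHOLDER = chr(167)  # "§", the character 167 suggested above

text = "alpha {\r\nbeta\r\n}\r\ngamma\r\n"
assert PLACEHOLDER not in text  # the trick only works if the placeholder is unused

step1 = text.replace("\r\n", PLACEHOLDER)               # 1. hide the line breaks
step2 = re.sub(r"\{[^}]*\}" + PLACEHOLDER, "", step1)   # 2. run a single-line regex
step3 = step2.replace(PLACEHOLDER, "\r\n")              # 3. restore the line breaks

# With an engine that can match "\r\n" directly, one step is enough.
direct = re.sub(r"\{[^}]*\}\r\n", "", text)
print(step3 == direct)
```

Both routes give the same text; the difference is only in how many passes are needed.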
Mark Antoniou said...
Clever workaround. I like it. However, this doesn't address the main reason that forced
me to move from Notepad++ to Emacs:
By using a more powerful text editor, workarounds are not required. New line characters
can be searched for and/or replaced at will. This simplifies the search and replace
expressions and saves me time.
October 8, 2010 at 9:47 AM
Luc said...
Thank you for the guide! I'm a little confused by what to do in my situation.
I have a file with a structure like this:
BEGIN:VCARD
VERSION:2.1
N:Doe;John;;;
FN:John Doe
TEL;CELL;PREF:+41800800800
EMAIL;PREF;WORK:[email protected]
ORG:Test
END:VCARD
I want the "FN:" section to be changed in that way: FN: Doe, John (and no more FN:
John Doe). Is that possible?
November 9, 2010 at 8:11 PM
Mark Antoniou said...

Thanks for your question, Luc. Here's the Notepad++ solution:
Search for: (FN:)(.*) (.*)
Replace with: \1 \3, \2
Note that this expression assumes that people only have two names.
November 9, 2010 at 9:13 PM
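The same swap can be tried out in Python on the sample card (the TEL and EMAIL lines are left out here for brevity). The ^ and $ anchors with re.MULTILINE are an addition that restricts the match to the line starting with FN:.

```python
import re

vcard = (
    "BEGIN:VCARD\n"
    "VERSION:2.1\n"
    "N:Doe;John;;;\n"
    "FN:John Doe\n"
    "ORG:Test\n"
    "END:VCARD\n"
)

# \1 = "FN:", \2 = everything up to the last space, \3 = the last word.
FN_LINE = re.compile(r"^(FN:)(.*) (.*)$", re.MULTILINE)

swapped = FN_LINE.sub(r"\1 \3, \2", vcard)
print(swapped)
```

Because the first .* is greedy, a three-part name such as "John Q Doe" would come out as "Doe, John Q", which is worth checking against real data before a Replace All.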
Edward said...
Hi,
Is there a way for notepad++ to do an
"or" operation? Something like:
find A or B or C
I would especially like this for when
I do a find of all in current document.
Thanks, Ed
January 8, 2011 at 9:21 AM
Pushkar said...
Hi Mark,
Thanks for the wonderful article, but I still couldn't resolve one of my problems. Could
you please tell me how to replace "@#$%" with <@#$%>. Thank you. :) keep up the
good work
January 30, 2011 at 6:40 AM
Shamik said...
Awesome post...kudos for the great work
February 2, 2011 at 6:37 AM
Mark Antoniou said...
Apologies for taking so long to respond. You all caught me in the midst of a
transcontinental move. Now, to your questions:

@Edward: To my knowledge, no.
@Pushkar: Do you literally mean replacing @#$% with <@#$%>? This can be
achieved using a simple Find + Replace:
Find: @#$%
Replace with: <@#$%>
If you are talking about some sort of larger-scale find and replace based on some
criterion, you need to give me more information, and preferably a snippet of text
showing what the text looks like before and what you would like it to look like after.
@Shamik: Glad you liked it :)
February 5, 2011 at 7:35 AM
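For comparison, in regex engines that do support alternation, "A or B or C" is written with a vertical bar, and literal text containing special characters such as $ needs escaping when it is used inside a pattern. The sketch below uses Python's re module with invented sample strings; it says nothing about what a particular Notepad++ version supports.

```python
import re

text = "apple banana cherry date"

# "A or B or C": alternation with "|".
found = re.findall(r"apple|cherry|date", text)

# "$" and "#" can be special in a regex, so escape literal text first.
literal = "@#$%"
wrapped = re.sub(re.escape(literal), lambda m: "<" + m.group(0) + ">", "x @#$% y")

print(found)
print(wrapped)
```

In a plain (non-regex) Find and Replace, as suggested in the answer above, no escaping is needed.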
Martin said...
to answer the question, "is there anything it can't do"
well look ahead and look behind in regexp fails, and newlines (pretty much anything
supported in extended) isn't supported in regexp.
and in case any one is wondering, yes vim supports this just fine.
but I'm still in love with notepad++ because it's just so much more simple to use, but
learning vim is still well worth the effort (in my 1st week now and starting to get some
real work done with it xD)
but who knows, maybe these issues will get addressed in the next version of
notepad++
anyway nice article it did help a little even for an issue that couldn't be fixed in
notepad++ xD
February 10, 2011 at 12:30 AM
e22 said...
If you want to use Notepad++ to do regex over multiple lines simply start off by
replacing \r\n with something like !NEWLINE! using the extended settings then do the
reverse when finished!
February 23, 2011 at 7:25 PM
Mark Antoniou said...

Yes, e22, that is what I did in the original post above, though I used a nonsense word
"mork" rather than !NEWLINE!
Still though, it is quite unacceptable to me that three steps are required rather than one.
And once you start using very complex regular expressions in text files that are
hundreds of thousands of lines long, it becomes very tedious to have to worry about
whether you missed any of your newly inserted !NEWLINE!s, or if any subsequent
expressions modified something in your nonsense word (e.g., if I then got rid of all
exclamation marks, it would be hard to go back). My point is that regular expressions
are meant to save you time...
February 24, 2011 at 2:03 AM
Shikhar Kumar said...
nice article, got my work done.
March 12, 2011 at 6:37 PM
Nico said...
Hello, nice guide.
I have a (newbie) question:
I have the following text:
Minradio#23-567
The result that I want is:
23567
What should be my regexp?
Thanks
March 24, 2011 at 8:45 AM
Mark Antoniou said...
This is quite a straightforward example, Nico. Haven't had one of these in a while ;)
So we start off with this:
Minradio#23-567
In Notepad++ regular expression search mode,

Search for: .*#(.*)-(.*)
Replace with: \1\2
What you end up with is this:
23567
It might seem a little tricky, but the concept is simple: What information do you want to
keep? And how does the other unimportant information border it? In the regexp above, I
used the hash (#) and hyphen (-) as anchors. This means that:
a) the text before the hash is free to vary
b) the number of digits between the hash and hyphen is free to vary
c) the number of digits after the hyphen is free to vary.
The limitation is that if some of your lines of text do not contain # or - then it will break
my regexp.
March 24, 2011 at 9:01 AM
Nico said...
Hey Mark, thanks for your help.
Almost worked!!
The result that I've got is
16-103
The "-" was not removed.
Any clue?
March 24, 2011 at 9:15 AM
Mark Antoniou said...
Make sure that the hyphen is not enclosed within the parentheses.
March 24, 2011 at 9:19 AM
Nico said...
Hi. Sorry to bother you with this lame question.
That's my string:

What is the "\1\2" that you said to use as replacement?

The "-" never goes away :-/
March 24, 2011 at 9:35 AM
Mark Antoniou said...
Ok, let's back up a bit. Your original text is this:
Minradio#23-567
You want to keep the numbers, and get rid of whatever is before the numbers as well as
the hyphen. So, in Notepad++ regular expression search mode,
Search for: .*#(.*)-(.*)
Let me break down this search term. The first three characters .*# will search for
anything until a hash # is found (Minradio# in the above example). We don't put
parentheses around this because we don't want to use it in our Replace term; we simply
discard it. The next five characters (.*)- will search for anything until a hyphen - is
found. The parentheses around the period and asterisk mean that that text (which is in
this instance the text immediately after the hash #, that is, the number 23) can be
recalled in our Replace term. The way to recall the contents of this first set of
parentheses is by typing \1. The hyphen is not enclosed within the parentheses and
therefore cannot be recalled in the Replace term; it is simply discarded. Finally, the last
four characters (.*) select the remaining text (in this example 567) and the parentheses
mean that it can be recalled in the Replace term, this time by \2, because it is the
second set of parentheses. So, the Replace term looks like this:
Replace with: \1\2
What you end up with is this:
23567
So, why are you ending up with 23-567? There are a few possibilities:
1. The original text had two hyphens:
Minradio#23--567
If that is the case change your search term to this:

.*#(.*)--(.*)
2. You are including the hyphen within one of the sets of parentheses:
.*#(.*-)(.*)
or
.*#(.*)(-.*)
The hyphen therefore will not be discarded. It will be recalled when you use \1 (top) or \2
(bottom).
3. You are reinserting the hyphen in your Replace term:
Replace with: \1-\2
March 24, 2011 at 11:49 AM
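The walkthrough above can be reproduced in Python, including the first failure case (a doubled hyphen in the original text). The function name is made up for this sketch; the pattern and replacement are the ones from the comment.

```python
import re

PATTERN = r".*#(.*)-(.*)"

def strip_prefix_and_hyphen(line: str) -> str:
    """Discard everything up to '#', drop one hyphen, keep both captured groups."""
    return re.sub(PATTERN, r"\1\2", line)

print(strip_prefix_and_hyphen("Minradio#23-567"))
# With two hyphens, the greedy first group keeps one of them.
print(strip_prefix_and_hyphen("Minradio#23--567"))
```

The second call shows why a leftover hyphen usually points at the input text or the placement of the parentheses rather than at the replace term.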
prozaker said...
you could take a look at the pythonscript plugin, it has a python replace method that
everyone could use. It looks complete, textfx or regular n++ regular expression lack
options.
http://sourceforge.net/projects/npppythonscript/
-------
editor.pyreplace('id\=\"A\d+\" ','') # delete all id="A##"
-------
April 1, 2011 at 5:13 AM
el Mauri said...
Hello, nice guide.
I have a (newbie) question:
I have the following list of emails:
[email protected], [email protected], frojasd08_hotmail.com ... and the list goes
on.
And I want to pick out any email that does not comply with the format of a regular
email address, in my example:
frojasd08_hotmail.com (it doesn't have the character @)
Can you help me with the correct regular expression to find this pattern?
Thanks, Mauri
April 3, 2011 at 5:43 AM
Mark Antoniou said...
Mauri, it turns out that this is not as trivial as it first appears. Handling email addresses
is quite a controversial issue in the regexp world. See http://www.regular-expressions.info/email.html for a discussion of the various issues and disagreements.
Your sample text has two unique characteristics that allow us to sidestep the messy
world of identifying 'what is an email address?', so I have taken advantage of these two
unique conditions:
1. Each email is separated by a comma followed by a space ", "
2. Some of the email addresses are missing a "@"
I have written the solution below for Notepad++. It involves several steps, but as long
as conditions 1 and 2 from above are satisfied, it will always work.
So, we start with this:
[email protected], [email protected], frojasd08_hotmail.com,
[email protected], steve#yahoo.com, [email protected]
Step 1: Place each email address on its own line
Search for (Extended mode): ", " (without the quotation marks)
Replace with: ,\n
You end up with this:
[email protected],
[email protected],
frojasd08_hotmail.com,
[email protected],
steve#yahoo.com,
[email protected]
Step 2: Remove correctly formatted emails that contain "@"
Search for (Regular expression mode): .*@.*
Replace with: (nothing, leave blank)
You end up with this:


frojasd08_hotmail.com,
steve#yahoo.com,

Step 3: Remove blank lines

Search for (Extended mode): \n
Replace with: (nothing, leave blank)
The result is this:
frojasd08_hotmail.com,steve#yahoo.com,
Optional step 4: If desired, you could at this point insert a space after each
comma
Search for: ,
Replace with: ", " (without quotes)
End result:
frojasd08_hotmail.com, steve#yahoo.com,
So, only those email addresses that do not contain the @ are left, and they may now be
corrected, logged, or whatever.
April 8, 2011 at 8:35 AM
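Under the same two conditions (entries separated by ", " and the bad ones lacking "@"), the steps above collapse into a split and a filter when scripted. The example.com and example.org addresses in this Python sketch are placeholders, since the real addresses are not visible here.

```python
# Hypothetical list: two well-formed placeholder addresses and two malformed entries.
addresses = "ann@example.com, bob@example.org, frojasd08_hotmail.com, steve#yahoo.com"

# Split on the ", " separator, then keep only the entries without "@".
malformed = [entry for entry in addresses.split(", ") if "@" not in entry]

print(", ".join(malformed))
```

This only flags a missing "@"; it makes no attempt to decide whether the remaining entries are valid email addresses.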
BK said...
I need your help.
I built a reg expression using regmagic tool.
The expression is:
\b(?:(?:[1-9][0-9]{1,3}|[5-9])[0-9]{4}|[0-9]+|[0-9]+)\b
This expression is supposed to find numbers between
50000
and
99999999
Here is a sample line that I use to test this regex in notepad++
appstore.gearlive.com/member/76234/|0
I have 1000 lines like this. But even though I select Regular expression as the search
mode, it finds nothing.
What am I missing? Please help!
April 11, 2011 at 6:38 PM
Mark Antoniou said...
Yeah, BK, it's not going to happen. Not with Notepad++, at least. From past experience,
Notepad++ has problems both with repetition {1} and with word boundaries \b.
You could achieve what you want to do in four (fairly inelegant) steps, starting from
the largest number of digits:
[1-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]
and removing one digit at each step
[1-9][0-9][0-9][0-9][0-9][0-9][0-9]
then
[1-9][0-9][0-9][0-9][0-9][0-9]
and so on, until you arrive here
[5-9][0-9][0-9][0-9][0-9]
April 13, 2011 at 5:36 AM
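In an engine with counted repetition and lookaround, the stepwise patterns above fold into one expression. The following Python sketch is one way to write the 50000 to 99999999 range; the digit lookarounds are a choice made here in place of \b, and the pattern is not the one generated by the tool mentioned in the question.

```python
import re

# Either 6 to 8 digits with a non-zero lead (100000..99999999),
# or 5 digits starting with 5-9 (50000..99999).
# The lookarounds stop the match from being part of a longer run of digits.
IN_RANGE = re.compile(r"(?<![0-9])(?:[1-9][0-9]{5,7}|[5-9][0-9]{4})(?![0-9])")

line = "appstore.gearlive.com/member/76234/|0"
print(IN_RANGE.findall(line))
```

A quick way to gain confidence in a range pattern like this is to test the boundary values on both sides, as the checks for 49999, 50000, 99999999 and 100000000 do.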
warm up said...
@Mark,
[5-9][0-9][0-9][0-9][0-9]
this indeed finds what I want. Thank you very much.
April 13, 2011 at 7:55 PM
Wavetrain said...
Hey, just wanted to say thanks for the pointers. Really helped me clean up a massive
wiki list, it probably cut down editing time to 1/4 what it would have been.
April 22, 2011 at 2:43 PM
warm up said...
I need a regex builder. Will you please suggest me a good one?
Thanks in advance.
April 23, 2011 at 4:18 AM
Mark Antoniou said...
Sorry warm up, I've never used one, and definitely couldn't recommend a good one. If
you've got a specific regexp query I might be of more use.
April 23, 2011 at 6:21 AM
warm up said...
Thanks for offering your help and your time.
I want to find a string in a text like this:
For example, every line in the text file has a string
#links#
and after this string there are several words that do not interest me. I want to find and
mark #links# and the words after that so that I can delete them. How can I do that with
Notepad++?
April 26, 2011 at 5:01 PM
Mark Antoniou said...
That's not too difficult. You just need to identify each line that begins with #links# and
delete it.
Search for: #links#.*
Replace with: nothing, just leave it blank
April 27, 2011 at 12:12 AM
warm up said...
But I do not want to delete only #links#. I want to delete #links# and the words that
come after that.
For example, let's say I have a line like this:
something is important but #links# this is not

after the process I want to get only;
something is important but
April 27, 2011 at 1:49 AM
Mark Antoniou said...
Yep, I understood what you were after. This still works. Let me break it down for you:
Make sure that you are searching in Regular Expression mode
Search for: #links#.*
Replace with: nothing, just leave it blank
Note that #links# is followed by a period and asterisk .* which will select everything
after #links# until the end of the line.
So, when you use that term on this:
something is important but #links# this is not
What will be left over is this:
something is important but
April 27, 2011 at 2:08 AM
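The same search can be checked in Python. One detail the result makes visible is that the space before #links# is left behind; the second pattern, which also swallows that whitespace, is an optional variation added here.

```python
import re

line = "something is important but #links# this is not"

# Pattern from the thread: "#links#" and everything after it on the line.
trimmed = re.sub(r"#links#.*", "", line)

# Variation (an addition): also remove the whitespace in front of "#links#".
trimmed_clean = re.sub(r"\s*#links#.*", "", line)

print(repr(trimmed))
print(repr(trimmed_clean))
```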
warm up said...
Ok. That works.
Thank you very much.
April 27, 2011 at 4:12 AM
Constantin said...
Searching for multiple lines doesn't seem to be working.
I am searching for this
@Text.*\r\n.*;
Eclipse has no problem finding it...
Any ideas ?
May 17, 2011 at 4:50 AM
Mark Antoniou said...
If you are trying to perform this search in Notepad++, it's not going to happen.
Having said that, if you insist on using Notepad++, you are going to need to get creative
and will need to break the search down into steps because \r\n cannot be used in
Regular Expression mode - you need to use Extended Search mode for \r\n. So, how
many steps do you need? I'm not sure, because it depends on your text, but my guess
is at least three:
1. Turn the newline into something unique.
2. Run the regexp.
3. Put the newlines back or do something else with them (not sure what, because you
didn't specify).
May 17, 2011 at 4:59 AM
Constantin said...
Well :), if the regular expression implementation in Notepad++ would implement the
multi line pattern that could solve it. I am not familiar with how Notepad++ is
implemented but Java would allow multi line patterns. I bet .NET would do the same.
The solution you suggested would work nicely but what I was trying to do was to search
thru a large set of java files for a certain multi line pattern. So I can't have the option to
replace the \r\n with a special token since that will alter the code base.
Thanks for looking!
May 17, 2011 at 7:22 AM
Mark Antoniou said...
If you do not *have* to use Notepad++, why not just use a more powerful text editor
(XEmacs), which will give you the one-line solution that you are looking for?
May 17, 2011 at 7:29 AM
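When the goal is only to find a multi-line pattern without altering the files, a search-only script is another option. The Python sketch below runs the pattern from the question against an invented Java fragment; \r?\n is used so that both Windows and Unix line endings are accepted.

```python
import re

# "@Text...", a line break, then the next line up to its last semicolon.
ANNOTATION = re.compile(r"@Text.*\r?\n.*;")

# Hypothetical Java fragment with Windows line endings.
source = 'class A {\r\n    @Text("name")\r\n    private String name;\r\n}\r\n'

match = ANNOTATION.search(source)
if match:
    print(repr(match.group(0)))
```

Since nothing is written back, the code base stays untouched; looping over a directory of .java files would follow the same pattern.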
Dee said...
Hi Mark,
Thanks for the very helpful blog post.. at first I couldn't quite figure out how to use the
replace field dynamically.. I had a situation like this:

Text:
Step 1
Step 2
Step 3

Find : Step\s\d
Replace : Step\s\d|
which of course gave me this!
Step\s\d|
Step\s\d|
Step\s\d|
Eventually, it clicked that \1 recalls the text matched by the first set of parentheses
and I stumbled upon this:
Find: (Step\s\d)
Replace: \1|
which gave me the desired result:
Step 1|
Step 2|
Step 3|
Just wanted to get that out there in case anyone else is struggling with that.
Once again cheers Mark for the help on that one.. the fist in the air celebration was
priceless.
Dee
June 16, 2011 at 1:31 AM
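The point about recalling the match is easy to verify in Python: \1 needs a pair of parentheses in the search term to refer to. The second form, \g<0>, is Python's way of recalling the whole match without any parentheses and is specific to Python's re module.

```python
import re

steps = "Step 1\nStep 2\nStep 3"

# Parentheses capture the match so that \1 can recall it in the replacement.
with_group = re.sub(r"(Step\s\d)", r"\1|", steps)

# Python-specific alternative: \g<0> recalls the entire match.
whole_match = re.sub(r"Step\s\d", r"\g<0>|", steps)

print(with_group)
```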
Mark Antoniou said...
Glad that you found the post helpful, Dee. It's funny how it seems so much simpler
after you have that "aha" moment!

June 16, 2011 at 2:11 AM
Manuel said...
hi,,
i need to do a massive replacement from:
tcp10102/172.20.225.246_PROBE
to

tcp10102_PROBE
can you tell me the syntax to use for this replacement?
text after tcp and text after/ and before _PROBE varies..
June 16, 2011 at 11:39 PM
Mark Antoniou said...
Hey Manuel,
This is pretty straightforward. You want to keep everything before the forward slash and
everything after the underscore.
In Regular Expression search mode,
Search for: (.*)/.*_(.*)
Replace with: \1_\2
You can see that in the search term, I am using the forward slash and underscore as
signposts, and am keeping everything before and after (enclosed in parentheses), but
am discarding everything in between (not enclosed in parentheses).
June 17, 2011 at 12:13 AM
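The signpost idea translates directly to Python. The second input below is an invented name in the same shape, included to show that the parts before the slash and after the underscore are free to vary.

```python
import re

def drop_address(name: str) -> str:
    """Keep the text before "/" and after "_", discard what lies between."""
    return re.sub(r"(.*)/.*_(.*)", r"\1_\2", name)

print(drop_address("tcp10102/172.20.225.246_PROBE"))
print(drop_address("udp53/10.0.0.1_CHECK"))  # hypothetical second entry
```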
Nate said...
I am interested in searching a document and replacing everything from a href=" to " and
change all the links quickly with notepad ++ can you tell me how to do this?
I tried searching for ahref=".*" and it selected everything up to the LAST "
Please advise!
Thanks
June 18, 2011 at 3:45 AM
Mark Antoniou said...
Ok, I'm not sure exactly what you want the end result to be, but I'll give it a go. Say that
you start with something like this:
ahref="www.google.com"
ahref="www.facebook.com"
ahref="www.blogger.com"
ahref="www.twitter.com"
If you want to keep the ahref=" and the final " you could
Search for (regexp mode): (ahref=").*(")
Replace with: \1\2
The end result would be
ahref=""
ahref=""
ahref=""
ahref=""
If you want to keep everything but the ahref=" and the final " you could
Search for (regexp mode): ahref="(.*)"
Replace with: \1
The end result would be
www.google.com
www.facebook.com
www.blogger.com
www.twitter.com
If you want to do something else, you're going to have to be more specific. Ideally,
show me what a few lines of text look like before, and what you want them to look like
after.
June 18, 2011 at 4:01 AM
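The "selected everything up to the LAST quote" behaviour comes from .* being greedy, which only shows up when a line holds more than one link. A common way around it is to match "anything except a quote" instead of "anything". The Python sketch below uses an invented line with two links to show the difference.

```python
import re

line = 'ahref="www.google.com" and ahref="www.facebook.com"'

# Greedy: ".*" runs to the last quote on the line, swallowing both links.
greedy = re.sub(r'ahref=".*"', 'ahref=""', line)

# Negated class: [^"]* cannot cross a quote, so each link is handled separately.
per_link = re.sub(r'ahref="[^"]*"', 'ahref=""', line)

print(greedy)
print(per_link)
```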

Nate said...
Awesome! Thanks for the quick reply, worked great! What an awesome trick for
rewriting!
June 18, 2011 at 4:10 AM
Rakesh Juyal said...
Mark, is it possible to replace all ? in any text file with '${abc' then an incrementing
number then '}$'
example:
-------------
where ( col1 = ? or col1 = ? ) and col2 = ?
replaced to
where ( col1 = ${abc1}$ or col1 = ${abc2}$ ) and col2 = ${abc3}$
---------------
July 18, 2011 at 6:15 PM
Mark Antoniou said...
Yes it is. But it will require a very long and convoluted process and several search and
replace steps (similar to the blog post above). The problem is the "increment by one"
part.
In Notepad++, you can insert incremented numbers from the Edit | Column Editor menu
command. This places numbers at the front of each line.
You could possibly position each ? so that it occurs at the end of each line, then
replace it with ${abc\1}$, where \1 represents the number at the beginning of the line.
Not sure if you want to go ahead with this, but if you do, here are the steps:
1. Get rid of all line breaks, replacing them with some unique string that does not occur
in your original text file, such as "thereisnoothertextlikethis".
2. Search for ? and replace with a ? followed by a linebreak.
3. Add numbers to the beginning of each line using the Edit | Column Editor menu
command.
4. Use a regular expression to search for the number at the beginning of each line and
move it to ${abc\1}$
5. Remove all linebreaks.

6. Replace all instances of thereisnoothertextlikethis to restore your original linebreak
structure.
If you want to go ahead with this, paste a larger portion of your text file (10-20 lines) and
I'll show you how to do it in more detail.
July 19, 2011 at 2:08 AM
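If a short script is acceptable, the "increment by one" part becomes simple, because the replacement can be computed by a function instead of being a fixed string. Here is a minimal Python sketch using the example from the question; the function name and the prefix parameter are made up for illustration.

```python
import itertools
import re

def number_placeholders(sql: str, prefix: str = "abc") -> str:
    """Replace each "?" with ${<prefix>N}$, where N counts up from 1."""
    counter = itertools.count(1)
    # The replacement function is called once per match, in order.
    return re.sub(r"\?", lambda m: "${" + prefix + str(next(counter)) + "}$", sql)

print(number_placeholders("where ( col1 = ? or col1 = ? ) and col2 = ?"))
```

The counter restarts on every call, so each statement passed in is numbered from 1.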
said...
Hello,
Thank you for the time invested in publishing and answering. It helped me a lot. My
question is:
If I have emails with dissimilar text before and after them, and I would like to extract
those emails, for example:
In the same Text :
First String :
===============
"21-Feb-2011 12:16:49 GMT+02:00
PM","alternateContactBusinessPhone":"","databasePlatform":"","productLine":"Oracle
E-Business
Suite","lastPublicActivityCreatedBy":"[email protected]","accountStatus"
:"Active","commitTime":"22-Feb-2011 9:09:06 GMT+02:00
AM","HWCity":"","conflictId":"0","outageType":"","contactLogin":"GOREN.NAAMA@GM
AIL.COM","subCategory":"","SRContactEmail":"[email protected]","alertMe":"fa
lse","SRContactPhone":"(972) 542-1341 x76"
Second String :
===============
"09-Jun-2011 10:42:57 GMT+03:00
AM","alternateContactBusinessPhone":"","databasePlatform":"","productLine":"Oracle
Database
Products","lastPublicActivityCreatedBy":"[email protected]","acc
ountStatus":"Active","commitTime":"10-Jun-2011 10:20:52 GMT+03:00
AM","HWCity":"","conflictId":"0","outageType":"","contactLogin":"ITSHAK@HADASSA
H.ORG.IL","subCategory":"","SRContactEmail":"[email protected]","alertMe":"fals
e","SRContactPhone":"02-6778113"
Regards
Etay G
August 3, 2011 at 1:15 AM
Mark Antoniou said...
Glad you have found the blog useful, Etay G. I'm not sure exactly what you are trying to
get from the text. Do you want to get rid of everything, leaving only the email
addresses?
August 5, 2011 at 1:23 AM
RatA said...
Mark, thanks for the post, it is very useful. Following the first example, how about not
erasing the whole line, but only a part of it?
For example, I want to remove the $_POST['abc']; part in all lines:
$abc = $_POST['abc'];
$bbb = $_POST['def'];
I tried [$_POST].* but it erases the whole line, and not just the final part.
August 19, 2011 at 2:26 AM
Mark Antoniou said...
RatA, if I understood correctly, you want to turn this:
$abc = $_POST['abc'];
$bbb = $_POST['def'];
into this:
$abc =
$bbb =
is that right?
To do this,
Search for (regular expression mode): $_POST.*
Replace with: nothing
August 19, 2011 at 2:34 AM
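The same replacement can be sketched in Python for anyone scripting this instead of using the Find/Replace dialog. One difference to be aware of: in Python's re module an unescaped $ is an end-of-line anchor, so the dollar sign is escaped here. The function name `strip_post_assignment` is invented for this example.

```python
import re


def strip_post_assignment(line):
    """Remove everything from the literal text $_POST to the end of the line."""
    # The backslash makes "$" a literal dollar sign instead of an anchor.
    return re.sub(r"\$_POST.*", "", line)


for line in ["$abc = $_POST['abc'];", "$bbb = $_POST['def'];"]:
    print(repr(strip_post_assignment(line)))
```

Note that the space after the equals sign is kept, because the pattern starts matching at the dollar sign.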
RatA said...
thanks, u are a genius.
August 21, 2011 at 5:06 AM
Mikazza said...

September 15, 2011 at 11:34 PM

Mikazza said...
Hi Mark,
Thanks for all the great info on regular expressions, although I have a problem I can't
seem to find the solution for.
I have a data file from which I would like to strip out some sections as they are useless. First
I replaced all the \r\n with @NEWLINE@ so I could get the whole file in one line, now
I'm trying to replace anything between and with
e.g.
**Data I want to keep is here 1**
message
called today but nobody was home
/message
**Data I want to keep is here 2**
message
called today but nobody answered
/message
The words message have < and > around them but the site won't let me post them.
As I said I removed all the line breaks from this and tried to run this regular expression.
Find: (messages.*)(/messages)
Replace: deleted
I couldn't work out how to find the < or > symbols.
I hoped this would delete all the messages and replace them with the word deleted.
What it does, though, is find the 1st occurrence of the word messages, then find the last
occurrence, and replace everything in between with the word deleted. In my example
above it's deleting **Data I want to keep is here 2**
Is there any way of doing this using regular expressions?
September 15, 2011 at 11:37 PM
Mark Antoniou said...
Hi Mikazza, I am not sure that I have understood exactly what you are trying to do, but
will give it a shot. So this is your original text:
**Data I want to keep is here 1**
< message >
called today but nobody was home
< /message >
**Data I want to keep is here 2**
< message >
called today but nobody answered
< /message >

In order to remove the < message > and < /message > tags, you should
Search for (regular expression mode): <.*>
Replace with: nothing
This will give you this:
**Data I want to keep is here 1**

called today but nobody was home

**Data I want to keep is here 2**

called today but nobody answered

If you then want to get rid of the lines that begin with "called", you could
Search for (regular expression mode): called.*
Replace with: nothing
which will give you this:
**Data I want to keep is here 1**

**Data I want to keep is here 2**

And then fix the blank lines as you see fit. Hope this helps.
p.s. I inserted spaces before and after the greater and less than symbols so that they
would show up in the post. You would not include the spaces in the search term.
September 17, 2011 at 12:53 AM
Mikazza said...
Thanks for the quick response Mark, what I want to replace is the < message > and <
/message > and everything in between them. I can get it to work if there is only one set
of these tags in the file (unfortunately there are thousands), if there are more than 1 set
it goes wrong and deletes everything between the 1st < message > and the last <
/message >.
Since the < message > and < /message > are on different lines in the file and the
content between them can also vary on how many lines it's over, I removed all the line
breaks to make it a bit easier to do the search and replace.
Please let me know if you need any more information.
Thanks!
September 17, 2011 at 4:47 AM
Mark Antoniou said...
Ok, got it. So, you start off with this:
**Data I want to keep is here 1**
< message >
called today but nobody was home
< /message >
**Data I want to keep is here 2**
< message >
called today but nobody answered
< /message >
Notepad++ has a hard time handling multiline regular expressions. One option is to use
a different text editor with more powerful regexp capabilities (ahem, Emacs). The other
option is to use Notepad++ and break this down into a few steps (3 to be precise).
Step 1: Remove the newlines
Search for (extended mode): \r\n
Replace with: nothing
This will give you this:
**Data I want to keep is here 1**< message >called today but nobody was home<
/message >**Data I want to keep is here 2**< message >called today but nobody
answered< /message >
Step 2: Make all instances of < /message > occur at the end of a line. The reason for
this is because we want to discard everything before < /message >, apart from that bit
at the front that we want to keep.
Search for (extended mode): < /message >
Replace with: \r\n
So, your text will now look like this:
**Data I want to keep is here 1**< message >called today but nobody was home
**Data I want to keep is here 2**< message >called today but nobody answered
We are nearly there, but we still want to discard everything after (and including) the <
message > tag.
Step 3: Remove everything from < message > onwards.
Search for (regular expression mode): (.*)< message >.*
Replace with: \1
And finally, we arrive at our desired result:
**Data I want to keep is here 1**
**Data I want to keep is here 2**
September 17, 2011 at 6:32 AM
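In an engine that supports lazy quantifiers and lets the dot match newlines, the three steps above collapse into a single substitution. The Python sketch below is an alternative to the Notepad++ procedure rather than a description of it. It assumes the real tags are written without the extra spaces (as the p.s. in the earlier reply says), and the function name `remove_blocks` is hypothetical.

```python
import re


def remove_blocks(text, tag="message"):
    """Delete every <tag>...</tag> block, including one newline right after it."""
    name = re.escape(tag)
    # .*? is lazy, so each match stops at the nearest closing tag instead of
    # running on to the last one; DOTALL lets "." cross line breaks.
    pattern = r"<%s>.*?</%s>\n?" % (name, name)
    return re.sub(pattern, "", text, flags=re.DOTALL)


sample = (
    "**Data I want to keep is here 1**\n"
    "<message>\ncalled today but nobody was home\n</message>\n"
    "**Data I want to keep is here 2**\n"
    "<message>\ncalled today but nobody answered\n</message>\n"
)
print(remove_blocks(sample))
```

The lazy `.*?` is the piece that avoids the original problem of everything between the first opening tag and the last closing tag being swallowed.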
Menes said...
Hi Mark i have text like that ;
apple(7)orange(27)banana(318)tulip(2)
And i want to convert it like that;
apple,orange,banana,tulip
i try those ;
[(].*[)] and ($.*)())
but both of them doesn't work.
Thanks for helping
September 21, 2011 at 7:17 PM
Mark Antoniou said...
Hi Menes,
This is a little tricky. The reason why is because there are multiple parentheses on the
same line. This can muck up your search term. First things first, the way to search for
parentheses is with a preceding backslash, like this \( for open and this \) for closed.
One solution for your problem is to take a different approach: rather than trying to take
care of all parentheses at once, you could take care of parentheses that contain the
same number of digits.
Search for (regular expression mode): \(.\)
Replace with: ,
Search for (regular expression mode): \(..\)
Replace with: ,
Search for (regular expression mode): \(...\)
Replace with: ,
Which will give you this:
apple,orange,banana,tulip,
This, of course, becomes impractical if you have numbers within the parentheses that
range from 1 to 100 digits long. But, as a quick fix, it should be fine for your
problem.
September 22, 2011 at 5:38 AM
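Where a "one or more digits" quantifier is available, the digit count no longer matters and one pass is enough. Here is a small Python sketch of that idea, offered as an alternative to the per-length searches above. The function name `strip_counts` is invented, and it assumes the parentheses only ever contain digits.

```python
import re


def strip_counts(text):
    """Turn 'apple(7)orange(27)' into 'apple,orange'."""
    # \( and \) match literal parentheses; \d+ matches any number of digits.
    with_commas = re.sub(r"\(\d+\)", ",", text)
    # The last parenthesised number leaves a trailing comma, so trim it.
    return with_commas.rstrip(",")


print(strip_counts("apple(7)orange(27)banana(318)tulip(2)"))
```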
Sam said...
hi,
alter database rename file '/fs-a01-a/databases/inv1cn/aggregate_idx-42.dbf' to '/fs-a01c/databases/inv1cn/aggregate_idx-42.dbf
i want to change this to
alter database rename file '/fs-a01-a/databases/inv1cn/aggregate_idx-42.dbf' to '/fs-a01c/databases/inv1cn/aggregate_idx-42.dbf';
in last i have to add ';
is this possible ?
September 27, 2011 at 2:16 PM
Mark Antoniou said...
Hi Sam,
if I understand you correctly, all that you want to do is add an apostrophe and
semicolon to the end of the text. This is easily accomplished.
Search for (regexp mode): (.*)
Replace with: \1';

September 27, 2011 at 3:29 PM
Sam said...
thanks Mark, its working
September 27, 2011 at 3:36 PM
Adrian981 said...
Hi, your guide is amazing.
I was hoping you could help me with a problem. Here it is: I have one big long line of
full names and phone numbers e.g john cruz 00374653 kelly brunz 95847364 alan whirtz
9898372 jane doerl and so on.
I'm trying to get it like this
John cruz 00374653
kelly brunz 95847364
alan whirtz 9898372
P.S. I've tried to look up how to replace every 3rd space with a return or something along
those lines.
Please any help is welcome.
Adrian
September 28, 2011 at 8:32 AM
Mark Antoniou said...
Glad you found the post helpful, Adrian. I really like your example, because it seems to
be very difficult to find a pattern in this seemingly unpredictable series of names and
numbers. Some people might have three names (or in the case of Madonna, Pele and
so forth, just one), so looking for the third space is not a very foolproof solution. In
addition, you could perhaps use the number of digits in a phone number, but this isn't
foolproof either as area codes vary in length, as do country codes and so on. You need
to think outside the box in order to solve this particular expression. The solution is
actually so straightforward that you will probably kick yourself when you see it.
In order to solve this particular problem, it is necessary to take a step back and look at
the structure of your text in an abstract way. We need to find not only a pattern in the
data that repeats (such as the number of spaces), but one that will allow us to insert a
line break so that each name and its corresponding number will occur on the same line.
Ideally, we would like to say "wherever there is a string of numbers, turn the next space
into a linebreak". It is not possible to do this in one step in Notepad++ (although you
could do it in a more powerful text editor, like Emacs). We are going to need 2 steps.
In order to find where each phone number ends, all we have to do is find the number
that has a space after it.
Search for (regexp mode): ([0-9]) -note that there is a space after the closed parenthesis
Replace with: \1,
john cruz 00374653,kelly brunz 95847364,alan whirtz 9898372,jane doerl
The reason for inserting a comma is so that we will have something to search for in the
next step when we want to insert a new line.
Search for (extended search mode): ,
Replace with: \r\n
which will give you this:
john cruz 00374653
kelly brunz 95847364
alan whirtz 9898372
jane doerl
September 28, 2011 at 9:11 AM
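For comparison, the two steps can be done in one substitution when the replacement text is allowed to contain a newline. This Python sketch assumes, as the reply does, that only phone numbers end in a digit followed by a space. The function name `split_after_number` is made up for the example.

```python
import re


def split_after_number(text):
    """Put a line break after every digit that is followed by a space."""
    # Group 1 keeps the digit; the space after it is replaced by a newline.
    return re.sub(r"(\d) ", r"\1\n", text)


line = "john cruz 00374653 kelly brunz 95847364 alan whirtz 9898372 jane doerl"
print(split_after_number(line))
```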
Adrian981 said...
Thank you Mark, it works perfectly.
The reason I was asking you about how to make a line after, let's say, 12
spaces/commas is that I have a lot of files that I want to break up into lines of 3, 6 and 9. I
will try to give you a good example of what I'm looking to do, e.g.:
john,likes,this,jane,loves,games,peter,saved,me,george,fell,today,greg,pushed,me
I'm trying to divide them up into lines of 3.
john likes this
jane loves games
peter saved me
george fell today
greg pushed me
The whole long line is made up of groups of 3 words that each make a small sentence.
Thanks for your previous help, you're amazing, and your help would be much appreciated
for this problem.
Adrian
September 29, 2011 at 12:10 AM
Mark Antoniou said...
So you start off with this
john,likes,this,jane,loves,games,peter,saved,me,george,fell,today,greg,pushed,me
and you want to put 3 words on each line. Notepad++ makes this a bit harder than it
should be. We need 2 steps. First, add a comma to the end of the line so that it looks
like this
john,likes,this,jane,loves,games,peter,saved,me,george,fell,today,greg,pushed,me,
If you have hundreds or thousands of lines, you could use a regular expression to do
this. Anyway, back to the task at hand.
Search for (regexp mode): ([a-z]*),([a-z]*),([a-z]*),
Replace with: \1 \2 \3QQQ
john likes thisQQQjane loves gamesQQQpeter saved meQQQgeorge fell
todayQQQgreg pushed meQQQ
Note that the QQQ is just a random string that I came up with which (a) will never occur
in your list of words, and (b) is easily searchable, which is useful for the next step
below.
Search for (extended search mode): QQQ
Replace with: \r\n
john likes this
jane loves games
peter saved me
george fell today
greg pushed me
September 29, 2011 at 2:36 AM
Adrian981 said...
Hi Mark,
I think you've nearly cracked it for me, just a few more things I left out, not thinking it
would be an issue.
Some of the words have numerals in them, as in john25,left3,
And if I want to divide into lines of 5, how do I do that?

Thanks for the very quick response.
Adrian
September 29, 2011 at 3:29 AM
Mark Antoniou said...
Ok, so let's say you start off with this:
john25,likes,this,jane11,loves,games,peter,saved,me,george,fell,left3,greg,pushed55,m
e,
and let's assume that you want to group them so that there are 5 words on each line.
Search for (regexp mode): ([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),
Replace with: \1 \2 \3 \4 \5QQQ
which will give you this:
john25 likes this jane11 lovesQQQgames peter saved me georgeQQQfell left3 greg
pushed55 meQQQ
Search for (extended search mode): QQQ
Replace with: \r\n
and there you go:
john25 likes this jane11 loves
games peter saved me george
fell left3 greg pushed55 me
September 29, 2011 at 3:47 AM
Adrian981 said...
Brilliant. It works perfectly. Can you show me how to add to the codes so I can make
bigger lines? Let's say lines of 20.
Thank you for your great support.
Please let me know if I can give you a small donation through PayPal for your help.
Adrian.
September 29, 2011 at 4:47 AM
Mark Antoniou said...
Glad you found the blog helpful, Adrian. In order to change the number of words that will
end up on each line, simply change the number of ([a-z0-9]*), in the search term, and
make sure you have the same number of items in the replacement term.
Using your example of 20, you would
Search for (regexp mode): ([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z09]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z09]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),
Replace with: \1 \2 \3 \4 \5 \6 \7 \8 \9 \10 \11 \12 \13 \14 \15 \16 \17 \18 \19 \20QQQ
The obvious limitation of using this sort of brute force approach is that it becomes
impractical if you wanted say 1000 words on each line (that would be a lot of
copy+pasting!). But, we are trying to work around the limitations of Notepad++, so we
have to (sometimes) use inelegant solutions.
As for donations, I gratefully and humbly accept whatever you can spare. My email
address for Paypal is [email protected]
September 29, 2011 at 5:00 AM
Adrian981 said...
Hi
I did a few tests and it only seems to divide up to 9 words per line, and for any bigger line
(10, 11, 12) it replaces 0.
Any ideas
Adrian
September 29, 2011 at 5:29 AM
Mark Antoniou said...
Ah, yes, you are right. Notepad++ will not let you have more than 9 bins. Sorry, I was
not working in Notepad++ when I posted my previous reply. This is yet another reason
to use a more powerful text editor for this sort of advanced regexp. Enough of my
ranting.
So, let's say that you want to have more than 9 words per line. It's just a matter of
making our bins bigger. Rather than putting one word in each bin, we could put 20 in
each bin (or however many you like).
Ok, so we start off with these 40 words:
john25,likes,this,jane11,loves,games,peter,saved,me,george,fell,left3,greg,pushed55,m
e,john25,likes,this,jane11,loves,games,peter,saved,me,george,fell,left3,greg,pushed55,
me,peter,saved,me,george,fell,left3,greg,pushed55,me,too,
Search for (regexp mode): ([a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z09]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z09]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,)
Note that there is only one open parenthesis at the start and one closed
parenthesis at the end of the search term.
Replace with: \1QQQ
john25,likes,this,jane11,loves,games,peter,saved,me,george,fell,left3,greg,pushed55,m
e,john25,likes,this,jane11,loves,QQQgames,peter,saved,me,george,fell,left3,greg,pushe
d55,me,peter,saved,me,george,fell,left3,greg,pushed55,me,too,QQQ
Then use extended search mode to convert the QQQs to newlines (\r\n).
September 29, 2011 at 5:51 AM
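The copy and paste grows with the group size only because the pattern spells out every word. In an engine with counted repetition, the group size becomes a single number. The Python sketch below shows that idea under the same assumptions as the thread (lowercase words and digits, and a comma after every word including the last). The function name `group_words` is hypothetical.

```python
import re


def group_words(line, n):
    """Put n comma-terminated words on each line, separated by spaces."""
    # (?:...){n} repeats "word plus comma" exactly n times inside one group,
    # so no numbered bins beyond \1 are needed.
    pattern = r"((?:[a-z0-9]*,){%d})" % n

    def to_line(match):
        chunk = match.group(1)[:-1]  # drop the final comma of the chunk
        return chunk.replace(",", " ") + "\n"

    return re.sub(pattern, to_line, line)


words = "john25,likes,this,jane11,loves,games,peter,saved,me,george,fell,left3,greg,pushed55,me,"
print(group_words(words, 5))
```

Changing the second argument to 20, or 1000, needs no change to the pattern itself.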
Mark Antoniou said...
Oh, and you could just use a simple Find+Replace to replace the commas with spaces
(if you want to).
September 29, 2011 at 5:52 AM
Adrian981 said...
Thanks a lot, Mark. It seems to leave ( at the start of each line and ) at the end. Will I just find and
replace, or am I still doing something wrong?
September 29, 2011 at 6:47 AM
Mark Antoniou said...
No, it should not leave any ( or ) anywhere. Make sure that your search term and
replace term are correct.
I have double-checked my post above and it is correct. No typos.
September 29, 2011 at 6:52 AM
Adrian981 said...
Thanks a lot Mark, you're great. Is there any way to remove the , at the end of each line?
I've sent you a small donation for your great support.

Thanks again
Adrian
September 29, 2011 at 8:23 AM
Adrian981 said...
As well, about this instruction:
Then use extended search mode to convert the QQQs to newlines (\r\n).
I tried (\r\n) first, and that's where I was getting the brackets, so I just did \r\n and it was
perfect.
Just looking to remove the comma at the end of each line.
September 29, 2011 at 8:32 AM
Mark Antoniou said...
Thank you for your support, Adrian.
Sorry if I confused you. I did not mean that the \r\n should be enclosed in parentheses
in your replace term. Glad you figured that one out.
If you just want to remove the commas, you could do a simple Find+Replace:
Search for: ,
Replace with:
September 29, 2011 at 9:16 AM
Adrian981 said...
Yes, but doing that gets rid of all the commas. I just want to get rid of the commas at the
end of each line.
September 29, 2011 at 9:23 AM
Mark Antoniou said...
Ah, I see. You want to keep the other commas. Well in that case
Search for (regexp mode): (.*),
Replace with: \1
September 29, 2011 at 9:25 AM
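Another way to express "only the comma at the end of the line" is to anchor the comma to the line end rather than capturing everything before it. A minimal Python sketch of that variant follows. The function name `strip_trailing_commas` is invented, and the sample text is only illustrative.

```python
import re


def strip_trailing_commas(text):
    """Remove a comma only when it is the last character on a line."""
    # With MULTILINE, "$" matches at the end of every line, not just the text.
    return re.sub(r",$", "", text, flags=re.MULTILINE)


print(strip_trailing_commas("john,likes,this,\njane,loves,games,"))
```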
Adrian981 said...
Perfect, thanks a lot, Mark.
Adrian
September 29, 2011 at 9:31 AM
Sam said...
I have pretty small question.
is there anything I can add in front of any word ?..like
b8nmuujs7jrug'
baszp4tj1s7vv'
add ' single quote in front of every word in the line.
thanks
Sam
September 30, 2011 at 7:28 AM
Mark Antoniou said...
Ok, so you start off with this:
b8nmuujs7jrug'
baszp4tj1s7vv'
and you want to add a single quote ' to the beginning of each line.
Search for (regexp mode): (.*)
Replace with: '\1
which will give you this:
'b8nmuujs7jrug'
'baszp4tj1s7vv'
September 30, 2011 at 7:31 AM
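Both of Sam's questions (adding text to the end of a line earlier in the thread, and to the start of a line here) follow the same capture-and-reinsert pattern. The Python sketch below illustrates it; the function names are made up, and it uses `.+` rather than `.*` so that empty lines are left alone.

```python
import re


def add_prefix(text, prefix):
    """Insert prefix at the start of every non-empty line."""
    # A function replacement is used so the prefix is inserted literally.
    return re.sub(r"^(.+)$", lambda m: prefix + m.group(1), text, flags=re.MULTILINE)


def add_suffix(text, suffix):
    """Append suffix to the end of every non-empty line."""
    return re.sub(r"^(.+)$", lambda m: m.group(1) + suffix, text, flags=re.MULTILINE)


print(add_prefix("b8nmuujs7jrug'\nbaszp4tj1s7vv'", "'"))
```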
Sam said...
No worries I figured out the answer
September 30, 2011 at 7:37 AM
Organix said...
Mark since you seem like the regex master maybe you can point me in the right
direction: I have a csv file that has text enclosed in "" but the problem is that in the
REMARK/detail field there can be inches which are also using " how can I find these
lines with the extra quotations?
example of what I'm looking for:
TRTM_TYPE,TEST_TYPE,RUN_NO,TEST_NUMBER,TOP_DEPTH,BASE_DEPTH,REMARK
FRAC,IP,0,001,1441,1721,"DETAILS: AQUAFRAC 1000; 36750# 20/40 BROWN SD,
7500# 16/30 SIBERPROP"
FRAC,IP,0,001,11218,11346,""
FRAC,IP,0,001,8210,9250,"DETAILS: 60406 GALS WF GR8, 195564 GALS DF 200R23"
FRAC,IP,0,001,9730,10030,"DETAILS: 51244 GALS WF GR8, 122796 GALS DF 200R23"
FRAC,IP,0,001,10600,11050,"DETAILS: 27858 GALS WF GR8, 173466 GALS DF 200R23"
FRAC,IP,0,001,11316,11582,"CMHPG 35#"
FRAC,IP,0,001,6714,7680,"DETAILS: 94 BBLS SLICK WTR, 95 BBLS SLICK WTR,
357 BBLS SLICK WTR, 119 BBLS LIGHTNING 2000 PAD, 0.5 TO 1 PPG 30/50#
WHITE SD IN LIGHTNING 2000 GEL, WHITE SD"
FRAC,IP,0,001,7680,8190,"DETAILS: 87 BBLS SLICK WTR, 71 BBLS SLICK WTR,
357 BBLS SLICK WTR, 119 BBLS LIGHTNING 2000 PAD, 0.5 TO 1 PPG 30/50#
WHITE SD IN LIGHTNING 2000 GEL, 238 BBLS SLICK WTR, DROP 3" BALL, 168
BBLS SLICK WTR, SEAT BALL, WHITE SD" <-Looking for these kind
October 4, 2011 at 3:43 AM
Mark Antoniou said...
Thanks for your question, Organix. I normally get asked about changing a text file by
restructuring data, but finding text in a particular format can be useful, too. You are
interested in an expression that will find text that contains a third " which indicates that
the comment includes the inches of some object or action, such as dropping a ball. To
find this use the search term below
Search for (regexp mode): .*".*" .*"
October 4, 2011 at 4:09 AM
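A script can apply the same "is there a third quotation mark" test to each record and list the offenders. The Python sketch below assumes each CSV record sits on a single line (the sample above is wrapped by the page layout), and the records in it are shortened stand-ins for the ones in the question. The name `lines_with_extra_quotes` is hypothetical.

```python
import re

# Three double quotes anywhere on the line: opening quote, an extra one
# (such as an inch mark), and the closing quote.
EXTRA_QUOTE = re.compile(r'".*".*"')


def lines_with_extra_quotes(lines):
    """Return the lines that contain at least three double quotes."""
    return [line for line in lines if EXTRA_QUOTE.search(line)]


records = [
    'FRAC,IP,0,001,11218,11346,""',
    'FRAC,IP,0,001,11316,11582,"CMHPG 35#"',
    'FRAC,IP,0,001,7680,8190,"DETAILS: DROP 3" BALL, WHITE SD"',
]
for line in lines_with_extra_quotes(records):
    print(line)
```

Empty strings such as "" contain only two quotation marks, so they are not reported.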
Organix said...
Thanks Mark - I'm sure that was an easy one for you. I had tried ".*" .*" and ".*" .*"$ but
was getting all the empty strings as well. Thanks again!

October 4, 2011 at 1:15 PM
Vin said...
Hi Mark.
Just like to thank you for the extremely helpful article even for a newbie like me.
However, I can't seem to figure out how to solve this issue..
7-Jul-09;6-4-12(P:P7A-3A-12),JLN 4/125 ;VANTAGE POINT
3-Sep-09;8-8-7(P:P7B-8-7),JLN 4/125;VANTAGE POINT
1-Oct-09;6-10-07(P:P7A-10-7),JLN 4/125 ;VANTAGE POINT
So, I would like to rid everything within the brackets and keep everything else..
Also, is it possible to sort out the date accordingly?
Thank you so much for your time. Any advice would be much appreciated!
October 11, 2011 at 5:28 PM
Mark Antoniou said...
Hey Vin,
Getting rid of the information within the parentheses is pretty easy, although getting
Notepad++ to recognise that you are looking for a parenthesis as part of your search
term requires that you precede it with a backslash: \( or \)
Search for (regexp mode): (.*)\(.*\)(.*)
Replace with: \1\2
So, you will end up with this.
7-Jul-09;6-4-12,JLN 4/125 ;VANTAGE POINT
3-Sep-09;8-8-7,JLN 4/125;VANTAGE POINT
1-Oct-09;6-10-07,JLN 4/125 ;VANTAGE POINT
I do not understand what you mean by "sort out the date accordingly". Throw me a bone
here...
October 12, 2011 at 2:22 AM
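As a cross-check, the same clean-up can be sketched in Python by matching only the parenthesised part instead of capturing the text on either side of it. The function name `drop_parenthesised` is invented for this example, and it assumes the parentheses are never nested.

```python
import re


def drop_parenthesised(line):
    """Remove every (...) group, parentheses included."""
    # [^)]* matches anything up to the first closing parenthesis.
    return re.sub(r"\([^)]*\)", "", line)


print(drop_parenthesised("7-Jul-09;6-4-12(P:P7A-3A-12),JLN 4/125 ;VANTAGE POINT"))
```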
Vin said...
Hi Mark,
Great! That was exactly what I needed, just made my job a breeze! :)
I apologize for the vague question, what I meant to ask was, let's say I have thousands
of data all with different dates, and I would like to sort them out from the earliest to
latest.
Thank you so much Mark, your help is much appreciated!
October 12, 2011 at 1:55 PM
Mark Antoniou said...
Oh, I get it now. You are not going to be able to do that in a text editor. Perhaps import the
text file into Excel, use the text-to-columns feature and specify the semicolon as your
delimiter. Column A will contain all of the dates. Select Column A, set the format of the
cells to 'date'. Select the whole data range and sort ascending by column A. That will do
it.
October 12, 2011 at 3:03 PM
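If a spreadsheet is not to hand, a short script can do the same sort. The Python sketch below is an alternative to the Excel route, not something from the thread. It assumes the date is always the text before the first semicolon, in day-month-year form such as 7-Jul-09, and that month abbreviations are in English. The function name `sort_by_date` is made up.

```python
from datetime import datetime


def sort_by_date(lines):
    """Sort lines from earliest to latest by their leading d-Mon-yy date."""

    def date_of(line):
        first_field = line.split(";", 1)[0]
        return datetime.strptime(first_field, "%d-%b-%y")

    return sorted(lines, key=date_of)


rows = [
    "3-Sep-09;8-8-7,JLN 4/125;VANTAGE POINT",
    "1-Oct-09;6-10-07,JLN 4/125 ;VANTAGE POINT",
    "7-Jul-09;6-4-12,JLN 4/125 ;VANTAGE POINT",
]
for row in sort_by_date(rows):
    print(row)
```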
Vin said...
Ok! Muchos Gracias. Can't begin to express my gratitude! :)
October 12, 2011 at 3:08 PM
Manas said...
Thanks man. It was really helpful. I wanted to remove "," that comes in a string from a
flat file of 200000+ records. The comma was messing up with my delimiter. BTW I used
[a-zA-Z1-0]+,[a-zA-Z1-0]+ as my search string..
Again thanks a ton man
November 8, 2011 at 5:01 PM
Jeff said...
Hi,
I'm having a challenge to use regex in Notepad++ for the following case.
Howto find and append a row of hostname and ip address into a one common statement
with newline added?

For example:
From
host101 192.168.0.1
host102 192.168.0.2
host103 192.168.0.3
To

As a result,

I really appreciate this very much if someone can shed some lights.
Cheers,
Jeffrey
November 9, 2011 at 3:22 AM
Mark Antoniou said...
Glad you found the blog useful, Manas.
Jeff, I don't understand what you want to do. What do you want the text to look like at
the end?
November 9, 2011 at 3:27 AM
Jeff said...
Here is some of the input that missed out in my last comments.
To
(hostname host="ipaddress" port="11")(/hostname)
Expected results
(host101 host="192.168.0.1" port="11")(/host101)
(host102 host="192.168.0.2" port="11")(/host102)
(host103 host="192.168.0.3" port="11")(/host103)
Thanks again.
November 9, 2011 at 3:31 AM
Mark Antoniou said...
Ok, got it. So you start off with this:
host101 192.168.0.1
host102 192.168.0.2
host103 192.168.0.3
Search for (regexp mode): (host.*) (.*)
Replace with: (\1 host="\2" port="11")(/\1)
which will give you this:
(host101 host="192.168.0.1" port="11")(/host101)
(host102 host="192.168.0.2" port="11")(/host102)
(host103 host="192.168.0.3" port="11")(/host103)
November 9, 2011 at 6:57 AM
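The same two-group rewrite can be scripted for large files. This Python sketch keeps the parentheses exactly as they are written in the thread. The function name `wrap_hosts` is hypothetical, and it assumes one "hostname address" pair per line, separated by a single space.

```python
import re


def wrap_hosts(text):
    """Rewrite 'host101 192.168.0.1' lines into the (host ...)(/host) form."""
    # \1 is the host name and \2 is the address; \1 is reused for the closer.
    return re.sub(
        r"^(host\S*) (\S+)$",
        r'(\1 host="\2" port="11")(/\1)',
        text,
        flags=re.MULTILINE,
    )


print(wrap_hosts("host101 192.168.0.1\nhost102 192.168.0.2\nhost103 192.168.0.3"))
```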
Jeff said...
Thanks for your great help, Mark. :-) I'm impressed to use this method to search and
append a thousands row of command within few seconds.
This is a wonderful blog of yours that provide expert advice and solution that I'm exactly
looking forward to revisit. Wish you do well in life and career.
Cheers,
Jeff
November 14, 2011 at 4:10 AM
greg.fenton said...
Does N++ regex support doing an arithmetic calculation in the replacement?
For example, I have a serialized object such as:
{s:10:\"abcdefghij\"}
I want to replace "abcde" with "X", so not only do I need to make that change, but I also
need to reduce the string size (10) by the replacement length difference (4).
So I'm looking for something like:
Search for:
s:\(\d+\):\\"abcde
Replace with:
s:$((\1 - 4)):"X
where $((\1 - 4)) is an arithmetic calculation whose result is injected in the replacement
value.
Possible?
Thanks in advance.
November 18, 2011 at 3:00 PM
Mark Antoniou said...
Greg, does N++ regex support doing an arithmetic calculation in the replacement?
No. However, depending on the arithmetic, there may be a way to "fake it", and bend
Notepad++ to your will. One reader asked me if it is possible to increment numbers in
the replace term. It isn't. But if you insert a number on each line using the Notepad++
column editor, then use regexp to restructure the data, the result is identical.
So, where does that leave us then? I am not 100% clear on what your text looks like or
what you want it to look like. Like anything, there is a way to do it, but the question is
how messy will it get, and is it the most efficient way of getting the job done. It all
depends on how repetitious your replace term will be. My gut feeling is that you should
probably take a look at either Perl or Awk for your particular case.
November 19, 2011 at 1:34 AM
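To make the scripting suggestion concrete, here is a rough Python equivalent of what Greg describes (the reply mentions Perl or Awk; Python is used here only for illustration). It assumes the serialized text looks exactly like the one-line example in the question and that the replacement is always 4 characters shorter. The function name `shrink_serialized` is invented.

```python
import re


def shrink_serialized(text):
    """Replace the leading 'abcde' with 'X' and reduce the stored length by 4."""

    def rewrite(match):
        new_length = int(match.group(1)) - 4  # the arithmetic happens here
        return 's:%d:\\"X' % new_length

    # \\ in the pattern matches the literal backslash before the quote.
    return re.sub(r's:(\d+):\\"abcde', rewrite, text)


print(shrink_serialized(r'{s:10:\"abcdefghij\"}'))
```

The key point is that the replacement is computed by a function for each match, which is what a plain replace term cannot do.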
Bob C said...
Helped! Thanks :)
December 14, 2011 at 3:41 AM
Mamoun J. said...
Your work is amazingly good. Recently a hacker injected iframes into my web page for
all php files. i'm trying to remove these iframes with notepad++.
what I want to remove is something like:
IFRAME Bla Bla Bla /IFRAME
So, I know the beginning and the end of the string but the problem is the contents are
not the same all the time. One thing that I didn't confirm yet is, each iframe is located in
a separate line, if so, all what i need is to delete the whole line where i locate iframe
term.
Please give your suggestions. Many thanks.
January 16, 2012 at 3:15 PM
Mark Antoniou said...
Thanks for your kind words, Mamoun. If all you need to do is remove whatever is
contained within the Iframe tags, this can be achieved easily by
Search for (regexp mode): IFRAME.*/IFRAME
Replace with: nothing
If there are multiple instances of IFRAME on the same line, or if an individual IFRAME
spans multiple lines, then things become a little more complicated, esp if you're using
Notepad++. In this instance, you could change all instances of /IFRAME to something
unique, such as ENDOFHACK or whatever. Then you could remove all newlines and
replace them with something else unique, such as PUTBACKLATER or whatever. Then
you would
Search for (regexp mode): IFRAME.*ENDOFHACK
Replace with: nothing
Then reinsert all newlines back where they were by replacing PUTBACKLATER with \r\n
in extended search mode.
Either way, it can be done.
January 16, 2012 at 3:28 PM
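With several iframes in one file, a greedy .* can run from the first opening tag to the last closing tag, so a lazy match is safer when the engine offers one. The Python sketch below shows that variant as an alternative to the placeholder steps above. It assumes the injected markup uses ordinary iframe tags with angle brackets (the comment form stripped them), and the function name `remove_iframes` is made up. Try it on a copy of the files first.

```python
import re


def remove_iframes(html):
    """Delete every <iframe ...>...</iframe> element, even across lines."""
    # .*? stops at the nearest closing tag; DOTALL lets it span line breaks;
    # IGNORECASE covers both <iframe> and <IFRAME>.
    return re.sub(
        r"<iframe\b.*?</iframe>",
        "",
        html,
        flags=re.DOTALL | re.IGNORECASE,
    )


page = 'line one\n<iframe src="x">\nbla\n</iframe>\nline two <IFRAME>bla</IFRAME> tail\n'
print(remove_iframes(page))
```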
catchthepanda said...
thanks for a good read, breaks down steps very well indeed for doing more complex
regex stuff!
I leave you with
your regex power level is over 9000!!!

January 19, 2012 at 4:33 AM
JPNL said...
Hello Mark, I am trying to replace spaces in string with a comma and figured I need to
use a regular expression. I found your post and although it answers a lot, it doesn't help
me achieve what I need. Hope you can help!
I have an export file in html from the delicious bookmark site. A part of the html looks
like this:
PRIVATE="0" TAGS="marketing sales arrangementen workshops">Jump4art
Workshops in Frankrijk
I need to replace the spaces in the 'TAGS' part to make it look like this
PRIVATE="0" TAGS="marketing,sales,arrangementen,workshops">Jump4art
Workshops in Frankrijk
A search with RegEx (\TAGS=".*)(">) lets me find and select the entire 'TAGS' string,
but I can't find how I can replace the spaces with a comma. Can you help please?
Thank you!
John-Pierre
January 20, 2012 at 10:30 AM
Mark Antoniou said...
Hey JP,
Could you paste a few lines so that I can see the other bookmarks too. Could you
paste, say about 5?
January 20, 2012 at 10:34 AM
JPNL said...
Hi Mark, thanks for your quick reply! Here are a few lines. It's just a part of the full string
because I can't paste the full html here. I don't see your email address but if you mine
you can email me and I reply with an actual example file. Thanks !
PRIVATE="0" TAGS="concerten tickets kaarten">Live Nation Live Nation Netherlands
PRIVATE="0" TAGS="winkelen nagerechten chocolade nougat hapje
bonbon">FineFoodImports - Home
PRIVATE="0" TAGS="eigenbedrijf marketing sales,arrangementen
workshops">Jump4art
PRIVATE="0" TAGS="ecofriendly bouwen frans">Tulkivi spreksteenkachel
PRIVATE="0" TAGS="hartigetaart vlees Recepten !Recepten">Kerriequiche
January 20, 2012 at 8:19 PM
Mark Antoniou said...
Ok, thanks for providing the extra info, JP. So, we start off with what you have above.
What makes things difficult is the fact that each bookmark has a different number of
tags. So in order to get around this, first we will move everything after > to a new line.
Search for (extended search mode): >
Replace with: \r\n>
Which will give you this:
PRIVATE="0" TAGS="concerten tickets kaarten"
>Live Nation Live Nation Netherlands
PRIVATE="0" TAGS="winkelen nagerechten chocolade nougat hapje bonbon"
>FineFoodImports - Home
PRIVATE="0" TAGS="eigenbedrijf marketing sales,arrangementen workshops"
>Jump4art
PRIVATE="0" TAGS="ecofriendly bouwen frans"
>Tulkivi spreksteenkachel
PRIVATE="0" TAGS="hartigetaart vlees Recepten !Recepten"
>Kerriequiche
Now, let's replace all of the spaces within the double quotation marks with commas.
Search for (regexp mode): (PRIVATE="0" TAGS=".*) <--- note that there is a single
space after the )
Replace with: \1,
Continue pressing Replace All until there are 0 occurrences that match this regular
expression. You will end up with this:
PRIVATE="0" TAGS="concerten,tickets,kaarten"
>Live Nation Live Nation Netherlands
PRIVATE="0" TAGS="winkelen,nagerechten,chocolade,nougat,hapje,bonbon"
>FineFoodImports - Home
PRIVATE="0" TAGS="eigenbedrijf,marketing,sales,arrangementen,workshops"
>Jump4art
PRIVATE="0" TAGS="ecofriendly,bouwen,frans"
>Tulkivi spreksteenkachel
PRIVATE="0" TAGS="hartigetaart,vlees,Recepten,!Recepten"
>Kerriequiche
Now we put the 2 lines that we split up back together again.
Search for (extended search mode): "\r\n
Replace with: "
PRIVATE="0" TAGS="concerten,tickets,kaarten">Live Nation Live Nation Netherlands
PRIVATE="0" TAGS="winkelen,nagerechten,chocolade,nougat,hapje,bonbon">FineFoodImports - Home
PRIVATE="0" TAGS="eigenbedrijf,marketing,sales,arrangementen,workshops">Jump4art
PRIVATE="0" TAGS="ecofriendly,bouwen,frans">Tulkivi spreksteenkachel
PRIVATE="0" TAGS="hartigetaart,vlees,Recepten,!Recepten">Kerriequiche
And there you have it.
January 21, 2012 at 3:36 AM
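For anyone who would rather script this than press Replace All repeatedly, the three steps above can be sketched in Python. This is only an illustration and not part of the original answer: the function names are made up here, the sample lines are taken from this thread, and the sketch assumes plain \n line endings rather than \r\n.

```python
import re

SAMPLE = (
    'PRIVATE="0" TAGS="concerten tickets kaarten">Live Nation Live Nation Netherlands\n'
    'PRIVATE="0" TAGS="eigenbedrijf marketing sales,arrangementen workshops">Jump4art\n'
)


def tags_three_step(text):
    """Mirror the split / replace-until-zero / rejoin steps (hypothetical helper)."""
    # Step 1: move everything from ">" onward onto its own line.
    text = text.replace(">", "\n>")
    # Step 2: the greedy .* runs to the LAST space on the line, so each pass
    # turns one space per line into a comma; repeat until nothing matches.
    pattern = re.compile(r'(PRIVATE="0" TAGS=".*) ')
    while True:
        text, count = pattern.subn(r"\1,", text)
        if count == 0:
            break
    # Step 3: put the two halves of each line back together.
    return text.replace('"\n>', '">')


def tags_one_pass(text):
    """Alternative: rewrite only the quoted TAGS value, in a single pass."""
    return re.sub(
        r'TAGS="([^"]*)"',
        lambda m: 'TAGS="' + m.group(1).replace(" ", ",") + '"',
        text,
    )


if __name__ == "__main__":
    print(tags_three_step(SAMPLE), end="")
```

The one-pass variant relies on a replacement function, which the Notepad++ Replace dialog does not offer. That is why the interactive recipe needs the split-and-repeat workaround.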
eagleapex said...
Just spent 10 minutes trying to make a clever USPTO search with your help.
unparseable (Too Many Search Terms 1043 ) ).
awww
January 21, 2012 at 4:11 AM
JPNL said...
Wow that worked perfect! Thank you so much!!! and have a nice weekend.
January 21, 2012 at 6:30 AM
Adrian981 said...
Hi Mark
You gave me a lot of help before with Notepad++. I was wondering if you could help me
figure this out.
I'm looking to replace each comma at the end of a line with something else.
ie. jon,dan,paul,
I know how to find : (.*),
And i usually replace with : \1

I'm looking to find how to replace with a different word.

Any help is welcome.
Thanks
Adrian
February 4, 2012 at 1:22 AM
Mark Antoniou said...
Hi Adrian,
Could you show me the before and after so that I know what you want it to look like at
the end.
February 4, 2012 at 1:29 AM
Adrian981 said...
Hey,
I figured it out; it was pretty simple after all.
\1,mark is what I replace with, and it was correct.
Thanks
February 4, 2012 at 3:31 AM
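The same end-of-line comma replacement can be sketched in Python. This is an illustration only: the function names are invented here, and it assumes \n line endings (with \r\n files the $ anchor would not sit directly after the comma). The first function swaps the trailing comma for something else; the second mirrors Adrian's \1,mark replacement, which keeps the comma and appends a word.

```python
import re


def replace_trailing_comma(text, replacement):
    """Replace a comma that is the last character of a line (hypothetical helper)."""
    # MULTILINE makes $ match at the end of every line, not just the end of the text.
    # A function is used as the replacement so backslashes in it stay literal.
    return re.sub(r",$", lambda m: replacement, text, flags=re.MULTILINE)


def append_after_trailing_comma(text, word):
    """Equivalent of Find: (.*),  Replace: \\1,word  but anchored to the line end."""
    return re.sub(
        r"(.*),$",
        lambda m: m.group(1) + "," + word,
        text,
        flags=re.MULTILINE,
    )


if __name__ == "__main__":
    print(replace_trailing_comma("jon,dan,paul,\nann,bob\n", ";"), end="")
    print(append_after_trailing_comma("jon,dan,paul,", "mark"))
```

Anchoring with $ matters: without it, the greedy (.*), would also match the last comma in a line such as ann,bob, which does not end with a comma.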
Helleye said...
Thanks for step 4.
It was very useful for me.
February 27, 2012 at 9:45 PM

Frank said...
Thanks for this guide and (your even more helpful) answering of questions in the
comments
March 25, 2012 at 1:45 AM
Popsana said...
For such an sql statement as:
(10017, 'com_jublog', 'component', 'com_jublog', '', 1, 1, 0, 0,
'{"legacy":false,"name":"com_jublog","type":"component","creationDate":"Mar 2012","author":"JoniJnm","copyright":"","authorEmail":"","authorUrl":"www.jonijnm.es","version":"1.0.1","description":"COM_JUBLOG_XML_DESCRIPTION","group":""}',
'{"catid_blogs":"2","catid_pp":"2"}', '', '', 0, '0000-00-00 00:00:00', 0, 0),
(10019, 'themza_j15_14', 'template', 'themza_j15_14', '', 0, 1, 1, 0,
'{"legacy":true,"name":"themza_j15_14","type":"template","creationDate":"2008-1007","author":"Themza Team","copyright":"ThemZa 2008","authorEmail":"[email protected]","authorUrl":"http:\\/\\/www.themza.com","version":"1.0.0","description":"Feel the Music","group":""}',
'{}', '', '', 0, '0000-00-00 00:00:00', 0, 0),
You can use:
Regexp mode: ([(])([0-9],*)
March 28, 2012 at 8:07 PM
Enes said...
Hi Mark,
Suppose I have text like this:
Jessie 213
block me later
iamhere
Jack 232
blablabla
iamhere
blablabla
sometext
againsometext
Mark 30
where
iamhere
...
I want to output only:
Jessie 213
Jack 232
Mark 30
As you can see, there is a trick:
the fixed text (iamhere) always appears two lines after each line that we need.
So the question is simple: can we select/mark the lines that come two lines before a
fixed text?
I looked at TextFX but I couldn't solve the problem.
Thanks for your help.
April 5, 2012 at 7:51 AM
RussiAmore said...
Thanks for such a great guide! There are lots of tips I didn't know about.
April 5, 2012 at 10:02 PM
Unknown said...
Mark, I can see why you've received so much traffic on this post. It helped me solve
cleaning up a very large xml document. Big Thanks for your Documentation and
Examples!!!
Randy
Techie by day, woodworker by night...
http://www.custommade.com/by/repearson
April 6, 2012 at 3:57 AM
Mark Antoniou said...
Glad you found it helpful, RussiAmore and Randy. And thanks for the kind words.
Enes, your problem is a simple one (in theory), but is made complicated by Notepad++.
As you point out, there is a (somewhat) recurring pattern in the text. You want to keep
the line that is two lines above "iamhere", discard the line above "iamhere" as well as
the "iamhere" itself. I am not going to even bother doing this in Notepad++ because the
solution will be very, very, very long. We need a text editor that will allow us to include
newlines as part of our regexp search term. I recommend Emacs, available here:
http://ftp.gnu.org/gnu/emacs/windows/
So, we start off with this:
Jessie 213
block me later
iamhere
Jack 232
blablabla
iamhere
blablabla
sometext
againsometext
Mark 30
where
iamhere
Search for: ^\(.*\)
.*
iamhere
Note: insert newline characters into a search term in Emacs by pressing Ctrl+Q Ctrl+J.
The \( \) pair captures the first of the three lines so that \1 can refer to it.
Replace with: \1
This will give you this:
Jessie 213
Jack 232
blablabla
sometext
againsometext
Mark 30
Ok, so now you can see why I said that there is a *somewhat* recurring pattern. The
number of lines between each occurrence of "iamhere" varies, so we want to get rid of
"blablabla", "sometext" and "againsometext". In this example, we can use the fact that
the unwanted text does not end with a number to our advantage, like this:
Search for: .*[a-z]

Note: there is a newline after the [a-z]
Replace with: nothing - leave blank
And there you go:
Jessie 213
Jack 232
Mark 30
April 6, 2012 at 5:20 AM
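The two search-and-replace passes above can also be sketched in Python, whose re module accepts \n inside a pattern. This is an illustration only, not the Emacs procedure itself: the function names are invented here and \n line endings are assumed. The second function is a line-based alternative that does not depend on the unwanted lines ending in a letter.

```python
import re

SAMPLE = (
    "Jessie 213\nblock me later\niamhere\n"
    "Jack 232\nblablabla\niamhere\n"
    "blablabla\nsometext\nagainsometext\n"
    "Mark 30\nwhere\niamhere\n"
)


def two_pass_regex(text):
    """Mirror the two passes described above (hypothetical helper)."""
    # Pass 1: collapse "wanted line / any line / iamhere" down to the wanted line.
    step1 = re.sub(r"^(.*)\n.*\niamhere$", r"\1", text, flags=re.MULTILINE)
    # Pass 2: drop every remaining line that ends in a lowercase letter.
    return re.sub(r"^.*[a-z]\n", "", step1, flags=re.MULTILINE)


def lines_two_above(text, marker="iamhere"):
    """Keep each line that sits exactly two lines above the marker line."""
    lines = text.splitlines()
    return [line for i, line in enumerate(lines[:-2]) if lines[i + 2] == marker]


if __name__ == "__main__":
    print(two_pass_regex(SAMPLE), end="")
    print(lines_two_above(SAMPLE))
```

The line-based version answers the question as asked (lines two above a fixed text) in one pass. The regex version reproduces the recipe, including its reliance on wanted lines ending in a digit.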
Mscarfix said...
Hi Mark: I really appreciate your blog!
Question: I'm working in XML and I want to find all contents between these two tags:
caution tags. (Imagine a left and right carrott tag on each caution with verbiage between
them. For some reason this blog won't allow carrott tags.)
I can find the tags, but how do I copy that content into a separate file? I know I have
about 200 cautions and I want to extract only that content to a file. Make sense?
I would appreciate any assistance you can offer, oh Notepad++ guru!
In gratitude,
Mscarfix
April 20, 2012 at 3:16 AM
Mark Antoniou said...
Do you mean greater than and less than signs?
Could you paste a sample of the code (just a few lines).
April 20, 2012 at 3:18 AM
Mark Antoniou said...
So, say that you start off with this
get rid of this.this is the stuff that I want to keep*don't want this
don't need this either.I want to keep this stuff too*the stuff here is crap
Search for (regular expression): .*\.(.*)\*.*
Replace with: \1
this is the stuff that I want to keep
I want to keep this stuff too
But, I am not sure what your code looks like, i.e., whether it has these tags <>.
Please note that this is the 200th comment (the most that can be shown on a
single Blogger page). Please click the "Next" link below to see newer comments.
April 20, 2012 at 3:38 AM
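As a rough sketch, the keep-the-middle replacement above, and the kind of extraction Mscarfix describes, might look like this in Python. Several things here are assumptions rather than facts from the thread: the function names are invented, the tag is assumed to be written exactly as <caution>...</caution> with no attributes, and a regex is used only because the thread is about regexes (a real XML parser is usually the safer tool).

```python
import re


def keep_between_dot_and_star(line):
    """Find: .*\\.(.*)\\*.*   Replace: \\1   (keeps the text between '.' and '*')."""
    return re.sub(r".*\.(.*)\*.*", r"\1", line)


def extract_cautions(xml_text):
    """Collect the text inside every <caution>...</caution> pair (assumed tag name)."""
    # DOTALL lets . span line breaks; .*? is non-greedy so each pair is matched
    # separately instead of running from the first open tag to the last close tag.
    found = re.findall(r"<caution>(.*?)</caution>", xml_text, flags=re.DOTALL)
    return [item.strip() for item in found]


if __name__ == "__main__":
    print(keep_between_dot_and_star(
        "get rid of this.this is the stuff that I want to keep*don't want this"
    ))
    sample = "<doc><caution>Hot surface</caution><p>x</p><caution>\nWear gloves\n</caution></doc>"
    print("\n".join(extract_cautions(sample)))
```

Joining the returned list with newlines and writing it to a new file would give the separate file of cautions asked about above.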