Regex Replace System
29 June 2008
If you are specifically looking for multiline regular expressions, look at this post.
You may already know that I am a big fan of Notepad++. Apparently, a lot of other people are
interested in Notepad++ too. My introductory post on Notepad++ is the most popular post on my
speechblog. I have a feeling that that is about to change.
Since the release of version 4.9, the Notepad++ Find and Replace commands have been updated.
There is now a new Extended search mode that allows you to search for tabs (\t), newlines (\r\n), and
a character by its value (\o, \x, \b, \d, \t, \n, \r and \\). Unfortunately, the Notepad++ documentation
is lacking in its description of these new capabilities. I found Anjesh Tuladhar's excellent slides on
regular expressions in Notepad++ useful. After six hours of trial and error, I managed to bend
Notepad++ to my will. And so I decided to post what I think is the most detailed step-by-step guide
to Search and Replace in Notepad++, and certainly the most detailed guide to cleaning up DMDX
.zil output files on the internet.
Cleaning up a DMDX .zil file
DMDX allows you to run experiments where the user responds by using the mouse or some other
input device. Depending on the number of choices/responses (and of course the kind of task),
DMDX will output a .zil file containing the results (instead of the traditional .azk file). This is
specified in the header along with the various response options available to the participant. For
some reason, DMDX outputs the reaction time twice, and on separate lines, in .zil files. Here's a
guide for cleaning up these messy .zil files with Notepad++. Explanations of the Notepad++ search
terms are provided in bullet points at the end of each step.
Step 1: Back up your original result file (e.g. yourexperiment.zil) and create a copy of that file
(yourexperiment_copy.zil) that we will edit and clean up.
Step 3: Remove all error messages. All lines containing DMDX error messages begin with an
exclamation mark. Let's get rid of them.
Bring up the Replace dialog box (Ctrl+H) and select the Regular Expression search mode.
Find what: [!].*
Replace with: (leave this blank)
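For readers who want to script this step rather than run it by hand, here is a minimal Python sketch of the same idea. It is an illustration only, not part of DMDX or Notepad++: the sample lines and the function name remove_error_lines are made up, and the pattern is anchored to the start of each line because the error lines are described as beginning with an exclamation mark.

```python
import re

def remove_error_lines(text: str) -> str:
    """Blank out every line that starts with an exclamation mark."""
    # With re.MULTILINE, ^ matches at the start of each line. The dot does not
    # match a line break, so the content is removed but the line break stays,
    # just as it does after Replace All in Notepad++.
    return re.sub(r"^!.*", "", text, flags=re.MULTILINE)

# Made-up sample in the spirit of a .zil file
sample = "Item 1, 523.4\n! hypothetical DMDX error message\nItem 2, 611.0\n"
cleaned = remove_error_lines(sample)
print(cleaned)
```

The emptied lines are left behind as blank lines, so they still need to be removed in a later step.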
Step 6: Remove all newline characters using Extended search mode, replacing them with a unique
string of text that we will use as a signpost for redundant data later in RegEx. Choose a string of
text that does not appear in your .zil file; I have chosen mork.
Switch to Extended search mode in the Replace dialog.
Step 7: We're nearly there. Using our mork signpost keyword, let's separate the different RT values.
Stay in Extended search mode.
Find what: ,
Replace with: ,mork
Press Replace All. Now, mork appears after every comma.
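The signpost trick from Steps 6 and 7 can be sketched in Python as plain string replacement. This is a rough illustration under two assumptions that are mine, not the post's: that Step 6 replaces the Windows line break \r\n with the signpost, and that the sample data looks like the made-up string below.

```python
SIGNPOST = "mork"  # any string that never occurs in the data

def add_signposts(text: str) -> str:
    # Step 6 (assumed form): replace each Windows line break with the signpost.
    joined = text.replace("\r\n", SIGNPOST)
    # Step 7: put the signpost after every comma as well.
    return joined.replace(",", "," + SIGNPOST)

sample = "a,1\r\nb,2\r\n"  # made-up stand-in for .zil content
result = add_signposts(sample)
print(result)
```

No regular expression is needed here, which mirrors the use of Extended search mode instead of Regular Expression mode in these two steps.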
Please post your questions in the comments below, rather than emailing me. This way, others can
refer to my answers here, saving me many hours of responding to similar emails over and over.
Update 20/2/2009: Having trouble understanding regexp? I have created a new Guide for regular
expressions. Check it out.
Posted by Mark Antoniou at 11:28 AM
398 comments:
James said...
Hi, can those steps be automated in notepad++ ? like actions in photoshop?
July 20, 2008 at 11:13 PM
Mark said...
James, that is the million dollar question. I immediately tried to automate this somehow
but could not get Notepad++ to save these steps in a macro. If I find a solution, I will
post it.
July 21, 2008 at 7:00 PM
ninj said...
Nice article!
However, the reason why I arrived on your blog still remains unanswered:
How to replace a multiple line regexp by a simple value (in my case: nothing).
Here is the case:
In Symfony YAML generated files, I have the created_at and updated_at fields dumped,
which I don't want.
I need to replace something like this:
/ *created_at:.*\n *updated_at:.*\n/
by
//
The way to do it is important because I want the blank lines to disappear as well.
Of course I know it is possible to do it in two or three steps, but I'd like to find how to
achieve it in one only, I'm a regexp maniac ;)
Maybe you or someone else has a solution... I couldn't manage to get one either
through the CTRL-H or the CTRL-R dialogs.
Thanks!
July 31, 2008 at 1:44 AM
Mark said...
ninj, currently you cannot do this in Notepad++. This is because replacing newlines is
possible in Extended search mode, and regular expressions are available in Regexp
search mode. You are trying to combine the two search modes, and in the current
version of Notepad++ you cannot.
Since I wrote this post, I too have caught regexp mania. If you are serious about using
regular expressions for more advanced search and replace (as you are) then you need
to use a more powerful text editor. I recommend XEmacs; I've been using it for about a
month, and it is very powerful. I'm working on a post for XEmacs right now.
As for your specific problem, it is possible to get rid of the created_at and updated_at
information. I would need to see the text file (feel free to send a sample to me as an
email attachment). I have made a few assumptions: 1. that created_at and updated_at
always occur on consecutive lines, 2. that there is information above and below these
lines that is useful. The XEmacs regular expression would be this:
Search for:
\(.*\) newline
.*created_at:.* newline
.*updated_at:.* newline
\(.*\)
Replace with:
\1 newline
\2
Note: In XEmacs, the newline character is created by pressing Ctrl+Q Ctrl+J.
July 31, 2008 at 10:44 AM
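For comparison, ninj's one-step replacement is straightforward in a regex engine that lets a pattern cross line breaks. The Python sketch below is an illustration only: the YAML content is invented, and it makes the same assumption as the reply above, that created_at and updated_at always sit on consecutive lines.

```python
import re

# Invented sample in the style of a Symfony YAML dump
yaml_text = (
    "Article_1:\n"
    "  title: First post\n"
    "  created_at: 2008-07-30 10:00:00\n"
    "  updated_at: 2008-07-30 10:05:00\n"
    "Article_2:\n"
    "  title: Second post\n"
    "  created_at: 2008-07-31 09:00:00\n"
    "  updated_at: 2008-07-31 09:30:00\n"
)

# One pattern spanning two lines. Each half consumes its own line break,
# so no blank lines are left behind.
pattern = re.compile(r"^ *created_at:.*\n *updated_at:.*\n", flags=re.MULTILINE)
stripped = pattern.sub("", yaml_text)
print(stripped)
```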
Anonymous said...
Quick bleg. I would like to replace all occurrences of number+comma with number +
TAB. So 12.8, 100 would become 12.8 TAB 100.
I'm using "\d," for the [Find What] value and "\1\t" for the [Replace With] value.
Unfortunately I lose that last digit in the number that I'm replacing.
Any help would be appreciated.
August 1, 2008 at 5:27 AM
Anonymous said...
Ok, I actually figured it out.
The [Find What] value should be "(\d)," and the [Replace With] value
should be "\1\t". In other words I just needed the parentheses around "\d" criteria.
Thanks for the useful article Mark.
August 1, 2008 at 5:51 AM
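The point about parentheses carries over to other regex engines. A small Python sketch, using the sample from the comment above: without a capture group the digit is consumed by the match and lost; with a group it can be put back using \1.

```python
import re

line = "12.8, 100"

# No group: the digit before the comma is part of the match, so it disappears.
# (In Python, using \1 here without a group would raise an error instead.)
broken = re.sub(r"\d,", "\t", line)

# With a group: the digit is captured and written back before the tab.
fixed = re.sub(r"(\d),", r"\1\t", line)

print(repr(broken))
print(repr(fixed))
```

Note that the space after the comma is not part of the pattern, so it survives the replacement; adding a space to the search term would remove it too.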
Flick said...
Thank you for the guide! I have to admit it's a little advanced for me, and I've only just
found out about REGEX expressions, but am still very excited nonetheless!
I'm a little confused by what to do in my situation. I have a mySQL file that I'd like to
run, and the first part of each line is something like this:
INSERT INTO my_table (id,uid,my_msg,my_date,the_ip) VALUES ('2',
I would very much like to be able to change the '2' part to just NULL and REGEX
seems to be the way forward. However, I think I'd have to use ( as a unique identifier,
and given that REGEx uses brackets as the separators, I'm now a little stuck.
Apologies in advance for this simple question, but my brain is really not working today.
Thanks!
p/s: I'll continue looking into it in the meantime.
August 11, 2008 at 2:06 AM
Flick said...
Just a quick update: I've been able to use Column Mode select (Alt+mouse) to select
the column and replace the NULL, since thankfully everything is in the same column!
I wonder if it is still possible in Regex though?
Thanks :)
August 11, 2008 at 2:20 AM
Mark said...
Hi Flick, thanks for your comments. I do have a regex solution for you that is very easy
and quick. Note that this regex syntax is specific to Notepad++.
First, let me answer your question re: the curved bracket (or parenthesis) character: in
order to search for and find the open parenthesis character, place the parenthesis within
square brackets like this: [(]
However, you do not need to use the parentheses or square brackets at all to achieve
what you want to (if I have understood you correctly).
Search for: '.*',
Replace with: NULL
If you do not want to get rid of the comma, then delete it from the search term. If this
then stuffs up your search and finds incorrect portions of text, you could insert a
comma after null in the replace with expression: NULL,
August 11, 2008 at 3:38 PM
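A word of caution on the search term '.*', in the reply above: .* is greedy, so on a line with several quoted values it can swallow more than the first one. The Python sketch below illustrates this. The values after '2' are invented, since the original line is cut off at that point, and the tighter pattern is only one possible alternative.

```python
import re

# Everything after '2', is made up for illustration.
line = ("INSERT INTO my_table (id,uid,my_msg,my_date,the_ip) "
        "VALUES ('2', '7', 'hello', '2008-08-11', '127.0.0.1');")

# Greedy: runs from the first quote to the LAST quote-comma pair on the line.
greedy = re.sub(r"'.*',", "NULL,", line)

# Targeted: [^']* cannot cross a quote, and VALUES ( pins the match to the
# first value only.
targeted = re.sub(r"(VALUES \()'[^']*',", r"\1NULL,", line)

print(greedy)
print(targeted)
```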
Anonymous said...
Mark,
Do you have some advice for the following. I have a set of text lines... and I want to
delete duplicate lines. But the redundant information will occur only at the beginning of
the line, the end of those lines differ in their information. I'm just starting to use
notepad++ RegExp utilities, but I'm no whiz yet with the format.
Thanks
October 7, 2008 at 2:00 AM
Mark said...
That's exactly what regular expressions do. Give me 4-5 lines of your text as an
the phrases in brackets, on separate lines, are ignored by the final use of the text file.
They can remain, but I do want to delete the duplicates of the ??? lines. I'll have other
cities with similar format.
thanks
October 7, 2008 at 8:55 AM
Anonymous said...
... this group of lines is followed, for example, by:
[19-773]
???^Los Angeles^60-639^LOS ANGELES CITY USE ONE STEP 1940 LARGE CITY
ED FINDER
[19-1580]
???^Los Angeles^60-639^LOS ANGELES CITY USE ONE STEP 1940 LARGE CITY
ED FINDER
so the number between the second and 3rd ^s will change throughout the file, as will the
I fooled with TextFX but it moves the brackets from the text lines, doesn't show a
numerical sort of numbers (thus one sees 2, 20, 21, ...) and for some didn't get me to a
unique line.
I need the entire line. So for the first example I want:
[19-766]
???^Los Angeles^60-638^LOS ANGELES CITY USE ONE STEP 1940 LARGE CITY
ED FINDER
[19-767]
[19-773]
[19-1581]
but I'm willing to give up the brackets lines, but I do want a blank line between the
statements.
I've done 2 states, and with California decided to do some more automation. To see
Alabama and Arkansas... go to http://www.stevemorse.org/ed/ed.php
and choose 1940 and one of those two states.
Thanks... I'll ask Steve Morse to acknowledge you on the One Step site if you can pull
this off.
Joel Weintraub
Dana Point, CA
October 8, 2008 at 10:47 AM
Anonymous said...
Mark,
Steve Morse wrote a utility to do what I want.
But it was interesting to see if RegEx could do the same thing.
So... thanks for your help... don't do any more.
Thanks
Joel Weintraub
http://www.zytrax.com/tech/web/regex.htm
November 17, 2008 at 2:27 AM
liz said...
thanks, helped me out a bunch :)
November 25, 2008 at 4:36 AM
fresh332 said...
I have an output file from a program which contains "\n" characters instead of line
breaks, e.g.: "Text\nNew line\nAnother line"
Similar to your "mork" solution I do a consecutive replace, first in "normal mode"
replacing "\n" characters with something unique like "ZZZ", then in "extended mode"
replacing "ZZZ" with "\n" so I finally have the line breaks.
There should be a way to do this in one step, or to automate the two steps, either in
notepad++ or with some other tool - has anyone got an idea?
December 3, 2008 at 9:24 PM
Mark said...
fresh332, yes there is a way to do this in one step; and no, you cannot do this with
Notepad++.
I now use a very powerful text editor called XEmacs. It really leaves Notepad++ for
dead when it comes to regexp. It's so good that I'm working on a more detailed guide to
regexp using XEmacs right now.
FYI: in XEmacs, you specify a newline character by first pressing Ctrl+Q and then
Ctrl+J. This creates a newline character that takes care of \n and other "newline"
characters.
December 4, 2008 at 8:31 AM
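As one example of "some other tool", fresh332's two-step replacement collapses into a single call in Python. This is a minimal sketch using the sample string from the comment; the variable names are arbitrary.

```python
# A raw string keeps the backslash and the n as two literal characters,
# which is what the program's output file contains.
raw = r"Text\nNew line\nAnother line"

# Replace the two-character sequence backslash + n with a real line break.
converted = raw.replace("\\n", "\n")
lines = converted.splitlines()
print(converted)
```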
Dave Bui said...
Brilliant! I love the replacement of double blank lines with a single blank line.
December 4, 2008 at 10:22 PM
David Leigh said...
I didn't see a mention of Notepad++'s other Find/Replace facility: The TextFX plugin. I
did not look to see if any of the "unsolvable" problems would be solved by TextFX, but
in the case that they might be, it's worth looking at the TextFX Find/Replace facility
(CTRL+R or via the menus) because of the way it can handle newlines and tabs.
That being said, connecting Find/Replace (any flavor) with the macro recording facility
of Notepad++ would elevate this software to "perfect" in my eyes...it's the one thing
remaining that really aggravates me on a semi-regular basis. Other than that, I LOVE
this editor.
January 3, 2009 at 12:29 AM
Anonymous said...
-J
February 14, 2009 at 9:49 AM
Mark said...
That's exactly right. The problem is the line returns (or newlines). This is quite
problematic isn't it?
If you would like to be able to do these types of regular expressions then you should
use a more powerful text editor. I use XEmacs.
I've been working on a very comprehensive "Guide to regexp using XEmacs" post for a
while now. Hopefully I will publish it in the next month or two.
February 14, 2009 at 7:04 PM
Jolas Arvin said...
Mark said...
No, Notepad++ cannot perform logical OR regexp searches. That was an easy question
:)
However, the excellent and free XEmacs can handle your search without any problems.
Note that your Textpad OR operator | would become \| in XEmacs, i.e.,
^Alert\|^Error\|^Warning
April 30, 2009 at 6:40 PM
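The same logical OR search can be tried in Python, where the alternation operator is a plain | as in Textpad. This is a sketch with invented log lines, since the original question is not shown here.

```python
import re

# Invented sample; only the three prefixes come from the reply above.
log = (
    "Alert: disk almost full\n"
    "Info: backup finished\n"
    "Error: write failed\n"
    "Warning: retrying\n"
)

# (?:...) groups the alternatives without capturing; ^ applies to all three.
pattern = re.compile(r"^(?:Alert|Error|Warning).*", flags=re.MULTILINE)
matches = pattern.findall(log)
print(matches)
```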
Anonymous said...
I have a text file full of blocks of text like this:
"STRING1" =>
{ url => "URL1",
visibleif => sub { !$is_temporarily_terminated &&
padlock("STRING2");
},
},
... more blocks like the above separated by a blank row.
End state: I need an excel file with 3 columns: string1, url1, and string2
Any ideas? I am completely new to regex and using notepad++ for now. If someone
who is really good at this replies quickly, then there also could be some work that we
could pay them to do in the future as we get a lot of projects like this.
May 15, 2009 at 5:27 AM
Mark Antoniou said...
That's pretty easy to fix. I wouldn't use Notepad++ for this. Instead use the excellent
and free XEmacs.
In XEmacs, the correct regex search term would be (newline character at end of each
line is made by Ctrl+Q, Ctrl+J):
"\(.*\)".*
.*"\(.*\)".*
.*
.*"\(.*\)".*
.*
.*
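For anyone who would rather script this job, here is a rough Python sketch that pulls the three strings out of each block and writes them as CSV, which Excel can open. It assumes every block has exactly the shape shown in the question; the second block is invented for the demonstration.

```python
import csv
import io
import re

text = '''"STRING1" =>
{ url => "URL1",
visibleif => sub { !$is_temporarily_terminated &&
padlock("STRING2");
},
},

"STRING3" =>
{ url => "URL2",
visibleif => sub { !$is_temporarily_terminated &&
padlock("STRING4");
},
},
'''

block = re.compile(
    r'"([^"]*)"\s*=>\s*'          # group 1: the key string
    r'\{\s*url\s*=>\s*"([^"]*)"'  # group 2: the URL
    r'.*?padlock\("([^"]*)"\)',   # group 3: the padlock argument
    flags=re.DOTALL,               # let .*? run across line breaks
)
rows = block.findall(text)

# Write the rows as CSV text; in practice this would go to a file.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```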
Abhishek said...
Thanks Mark. But, I work on client network where we cannot install XEmacs. We have
only notepad ++ installed. Any other thoughts please?
ABC
123
XYZ
Need to change into 'ABC','123','XYZ'
May 27, 2009 at 12:15 AM
Mark Antoniou said...
Ok, well there is a way around it, so long as your data is exactly as you have specified
here, i.e.:
ABC
XYZ
123
So, in order to get to this:
ABC,XYZ,123
All you need to do is replace the newline character with a comma.
If that is the case, you would use extended search mode and search for: \r\n
and replace this with: ,
That should do the trick.
May 27, 2009 at 10:22 AM
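Where a scripting language is available, the same join is a one-liner, and adding the single quotes that Abhishek asked for is a small extension. A minimal Python sketch, assuming Windows line breaks in the input:

```python
text = "ABC\r\n123\r\nXYZ\r\n"

values = text.splitlines()                 # handles \r\n as well as \n
plain = ",".join(values)                   # the result described in the reply above
quoted = ",".join(f"'{v}'" for v in values)  # with the quotes from the question

print(plain)
print(quoted)
```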
Vladimir said...
Hi Mark,
Found your blog and hoping you can help me. I have a batch file that I receive daily. I
need some help trying to modify it.
I need to insert a page break before it says PAGENO throughout the whole document. I
tried to do Find and Replace with PAGENO & \fPAGENO, but it didn't work. It puts FF
in black box in front of PAGENO, but doesn't create a page break when I print. What did
I do incorrectly and is this the way to do a page break with regexp?
Also, is there a way to automate this process with Notepad++ or any other app?
Thank you very much for your help!
August 5, 2009 at 7:21 AM
Mark Antoniou said...
Hey Vladimir,
This was a tough one! Let me begin by saying that I have an answer for you... kind of.
First of all, as far as I am aware, you cannot have page breaks in a text document.
Ok, now that we've got that out of the way, what are we going to do to help you? I would
say that inserting a page break requires a rich text editor. So, Notepad++ is not going to
cut it.
I have achieved what you requested in one easy step using Microsoft Word. Open your
file in Word and select Replace (Ctrl+H), and enter the following search term:
Find what: \fPAGENO
Replace with: ^12
and then hit Replace All.
All of the \fPAGENO are now page breaks. Easy.
If you wish to remove PAGE from the top of each page, you could replace it with
nothing. Be sure to match the case when searching so that you do not remove any
legitimate occurrences of the word "page" that are in the content of your file (if there are
any).
As for automating this, it can be done (although I am not hugely experienced in task
automation in Word). Take a look at this URL:
http://www.microsoft.com/technet/scriptcenter/resources/qanda/jul07/hey0710.mspx
Good luck. Let me know how it works out.
August 5, 2009 at 11:21 AM
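On the automation question, the insertion step that Vladimir did by hand can be scripted. The Python sketch below puts a form feed character (the FF that Notepad++ displays) in front of every PAGENO in a made-up report. Whether a given printer or program treats that character as a page break is outside what this sketch can show; it only automates the edit.

```python
FORM_FEED = "\f"  # ASCII 12, shown as FF in Notepad++

# Made-up stand-in for the daily batch file
report = "PAGENO 1\nfirst page\nPAGENO 2\nsecond page\n"

with_breaks = report.replace("PAGENO", FORM_FEED + "PAGENO")

# Splitting on the form feed shows where the pages would start.
pages = [page for page in with_breaks.split(FORM_FEED) if page]
print(len(pages))
```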
Puiufly said...
Hey Mark,
Awesome blog. I could not make {n} (repeats the previous item n times) work.
Specifically, I am looking at deleting a string of 10 numbers.
Thanks
May 7, 2010 at 12:39 AM
Mark Antoniou said...
Thanks sourabh bora.
Could you copy and paste a sample from your file so that I can have a look at what
patterns might work?
May 7, 2010 at 10:08 AM
Christopher said...
Wow, this guide is very helpful and makes debugging code or even reformatting jumbled
scan text from books a snap to clear up.
Always used Notepad++ and these search and replace tips really make things so
much easier and faster.
May 11, 2010 at 9:42 PM
sourabh bora said...
Thanks for your reply.
Here is an example:
Post123456 This is a nice post Post12345678 This is not a nice post
Post324567 This is another nice post
I want to delete the "nice" posts (Post--Followed by exactly 6 numbers, )
Thanks
May 12, 2010 at 11:22 PM
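In engines where {n} does work, the "exactly 6 numbers" condition can be written directly. The Python sketch below only finds the matching identifiers in the sample from the comment above; it is not the Notepad++ search term, and what to delete around each identifier is left open.

```python
import re

text = (
    "Post123456 This is a nice post Post12345678 This is not a nice post\n"
    "Post324567 This is another nice post"
)

# \d{6} means exactly six digits; (?!\d) rejects a seventh digit, so
# Post12345678 is not picked up by mistake.
six_digit_ids = re.findall(r"\bPost\d{6}(?!\d)", text)
print(six_digit_ids)
```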
Mark Antoniou said...
This is actually a lot easier than I thought. If the text preceding the 6 numbers is always
the same, then you have an easy way of uniquely identifying the "nice" posts.
Search for:
My file is AAABBBCCC etc with all sort of characters from ascii table
the problem is that i want the text ( code ) to be ABC and search for all hex ascii code
not just numbers or letters.
Thanx a lot
June 13, 2010 at 9:44 PM
Mark Antoniou said...
Thanks for your question Marius. I'm just not exactly clear on what you want to do.
To help me, could you provide me with a sample of what your text looks like (a few
lines), and then provide me with what you want those lines to look like after you run the
regular expression.
June 14, 2010 at 7:05 PM
marius said...
well my text looks like aaafffcccddd777gggzzziiippp---000
and i would like all the triplets to be replaced with only one character.
As you can see it is not only a to z and A to Z there are all type of characters with code
between 0 and 255 ( Ascii code )
June 14, 2010 at 9:51 PM
Mark Antoniou said...
Ok. If that is all that your file contains, then you could simply search for:
..(.)
and replace with:
\1
Easy.
Note, I don't use Notepad++ any more, since I have moved on to Emacs. In Emacs the
search term would be:
..\(.\)
but the concept is exactly the same: Discard the first two occurrences and keep the
third.
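The ..(.) approach relies purely on position: it works as long as the text really is nothing but triplets. A variant with a backreference, (.)\1\1, checks that the three characters are actually identical. Both are sketched below in Python with marius's sample; the second test string is made up to show the difference.

```python
import re

text = "aaafffcccddd777gggzzziiippp---000"

# Positional: drop two characters, keep the third, whatever they are.
positional = re.sub(r"..(.)", r"\1", text)

# Checked: only collapse a character that is followed by two copies of itself.
checked = re.sub(r"(.)\1\1", r"\1", text)

print(positional)
print(checked)

# On text that is not all triplets the two patterns disagree.
print(re.sub(r"..(.)", r"\1", "aabccc"))
print(re.sub(r"(.)\1\1", r"\1", "aabccc"))
```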
I'm not ignoring you. I've had a bit of trouble getting the regular expression to work in
Notepad++. It definitely can be done as a regular expression though.
Must you use Notepad++?
June 27, 2010 at 9:31 PM
Mark N said...
Well I prefer that it be done in notepad++... besides I don't want to write a script that
does this.
July 13, 2010 at 5:36 AM
Garioch said...
hi, i have a somewhat similar problem ...
i have a sql export-file
i want to "edit" the lines automatically .. coz its almost 6000 of them
each Insert-Line starts with
(id, another_id, third_id, NULL, ...
here i want to "delete" the 3rd id - while leaving all other things
i tried with several search patterns - but to no luck ..
July 14, 2010 at 4:36 PM
Garioch said...
to be more precise all id , 2nd ID and 3rd ID are actual numbers
July 14, 2010 at 4:42 PM
Mark Antoniou said...
Garioch, if you want me to give you the exact answer, post a few lines of code into a
comment. But, the general principle is this:
Group the ids that you want to keep as \1,\2 and don't insert ID 3 into the replace term.
Make sense?
July 14, 2010 at 4:43 PM
1, 1, NULL, 'delayed billing', '2007-02-16', 0, 17 more fields),
1, 2, NULL, 'delayed billing', '2007-02-16', 0, 17 more fields),
1, 3, NULL, 'delayed billing', '2007-03-01', 0, 17 more fields),
1, 4, NULL, 'delayed billing', '2007-03-01', 0, 17 more fields),
since my question only concerns the start of each line i omitted some info at the end ...
but this should give a picture of the data i want to Replace
until now i was able with some info from other web-pages to find the start of a line with a
regex like
[(][0-9]*[, ][0-9]*[, ]
this marks exactly (1, 1, from the first insert-line
so how do i "mark" this as pattern 1 and how do i progress from there
July 14, 2010 at 4:59 PM
Mark Antoniou said...
Sometimes, the best solution is not to get too fancy. How about if we group everything
from the start that you want to keep into \1.
Then we group: Id3, NULL.
Then we group everything from there to the end of the line .*
as \2.
So, your replace term would be: \1NULL\2
That would work.
July 14, 2010 at 5:14 PM
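The grouping idea can be sketched in Python as follows. The row is hypothetical, built from Garioch's description (three numeric ids followed by NULL), and this variant differs slightly from the recipe above: instead of discarding "id3, NULL" and writing NULL back, it discards only the third id.

```python
import re

# Hypothetical row: three numeric ids, then NULL and further fields.
row = "(1, 2, 3, NULL, 'delayed billing', '2007-02-16', 0),"

pattern = re.compile(
    r"^(\(\d+, \d+, )"  # group 1: opening parenthesis and the first two ids
    r"\d+, "            # the third id, matched but not captured
    r"(.*)$"            # group 2: everything else on the line
)
without_third = pattern.sub(r"\1\2", row)
print(without_third)
```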
Garioch said...
thanks mark
but i think i found "my solution"
thanks again for this super post the best in internet explaining regular expressions for
notepad++ and introducing xemacs for the same.
July 15, 2010 at 10:22 PM
user said...
(blogger screwed my poorly scaped html tags ill try again with parenthesis)
and more specific what i want is
find (<)!--tag1--(>)*(<)!--tag1--(>)
replace (<)!--tag1--(>)some html marked text like (<)div\(>)(<)p\(>)Let change some hmtl
paragraphs(<)/p(>)(<)/div(>)(<)!--tag1--(>)
July 15, 2010 at 10:27 PM
teddan00 said...
if have a filename i.e. a song called "Born To Run-E Street Band-Bruce
Springsteen.mp3"
I try to make "E Street Band-Bruce Springsteen" switch place with "Born To Run".
Find: (.*)-(.*)\.
Replace: \2-\1.
But I get the following filename: "Bruce Springsteen-Born To Run-E Street Band.mp3"
It seems that the last occurrence of "-" is found. Is it possible to find the first
occurrence, AND still make it compatible with filenames that only have one "-" in their
filename?
July 16, 2010 at 1:54 AM
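What teddan00 describes is greedy matching: the first (.*) grabs as much as it can, so the split happens at the last hyphen. In engines that support lazy quantifiers, (.*?) stops at the first hyphen instead. The Python sketch below shows one way to do it; the function name is made up, and nothing here says the same syntax is available in Notepad++.

```python
import re

def swap_at_first_dash(filename: str) -> str:
    """Swap the text before the first hyphen with the rest of the name."""
    # (.*?)   lazy: stops at the first hyphen
    # (.*)    greedy: everything up to the last dot
    # ([^.]*) the extension, which cannot contain a dot
    return re.sub(r"^(.*?)-(.*)\.([^.]*)$", r"\2-\1.\3", filename)

print(swap_at_first_dash("Born To Run-E Street Band-Bruce Springsteen.mp3"))
print(swap_at_first_dash("Born To Run-Bruce Springsteen.mp3"))
```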
TechnologyYogi said...
I used NP++'s regular expressions for find and replace for the first time - successfully,
before this I depended on MS SQL Server's Management studio for this, as it has very
cool easy to use find/replace features (using regular expressions).
Thanks for the post!
July 17, 2010 at 1:00 AM
That's my string:
Thanks, Mauri
April 3, 2011 at 5:43 AM
Mark Antoniou said...
Mauri, it turns out that this is not as trivial as it first appears. Handling email addresses
is quite a controversial issue in the regexp world. See http://www.regular-expressions.info/email.html for a discussion of the various issues and disagreements.
Your sample text has two unique characteristics that allow us to sidestep the messy
world of identifying 'what is an email address?', so I have taken advantage of these two
unique conditions:
1. Each email is separated by a comma followed by a space ", "
2. Some of the email addresses are missing a "@"
I have written the solution below for Notepad++. It involves several steps, but as long
as conditions 1 and 2 from above are satisfied, it will always work.
So, we start with this:
[email protected], [email protected], frojasd08_hotmail.com,
[email protected], steve#yahoo.com, [email protected]
Step 1: Place each email address on its own line
Search for (Extended mode): ", " (without the quotation marks)
Replace with: ,\n
You end up with this:
[email protected],
[email protected],
frojasd08_hotmail.com,
[email protected],
steve#yahoo.com,
[email protected]
Step 2: Remove correctly formatted emails that contain "@"
Search for (Regular expression mode): .*@.*
Replace with: (nothing, leave blank)
You end up with this:
frojasd08_hotmail.com,
steve#yahoo.com,
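Steps 1 and 2 above amount to "split on comma-space, then keep the entries without an @". A minimal Python sketch of that idea follows. The addresses are invented placeholders, and the sketch relies on the same two conditions as the Notepad++ steps.

```python
# Invented placeholder addresses; two of them are missing the @.
addresses = "alice@example.com, bob@example.org, carol_example.com, dave#example.net"

# Step 1: one entry per item (condition 1: separator is comma + space)
entries = addresses.split(", ")

# Step 2: drop the entries that contain @, keeping the malformed ones
missing_at = [entry for entry in entries if "@" not in entry]
print(missing_at)
```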
appstore.gearlive.com/member/76234/|0
I have 1000 lines like this. But despite I check, regular expression as the searchmode,
it finds nothing.
What am I missing. Please Help!
April 11, 2011 at 6:38 PM
Mark Antoniou said...
Yeah, BK, it's not going to happen. Not with Notepad++, at least. From past experience,
Notepad++ has problems both with repetition {1} and searching for white space \b.
You could achieve what you want to do in seven (fairly inelegant) steps, starting from
the largest number of digits:
[1-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]
and removing one digit at each step
[1-9][0-9][0-9][0-9][0-9][0-9][0-9]
then
[1-9][0-9][0-9][0-9][0-9][0-9]
and so on, until you arrive here
[5-9][0-9][0-9][0-9][0-9]
April 13, 2011 at 5:36 AM
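In an engine with working repetition counts, the separate digit-by-digit searches can be folded into one pattern. The Python sketch below assumes the goal is to find whole numbers from 50000 up to eight digits long, which is my reading of the patterns above; the original question is not shown here, and the test numbers are made up.

```python
import re

pattern = re.compile(
    r"(?<!\d)"         # not preceded by a digit: start of a whole number
    r"(?:[1-9]\d{5,7}"  # six to eight digits, no leading zero
    r"|[5-9]\d{4})"     # or five digits starting with 5 to 9
    r"(?!\d)"           # not followed by a digit: end of the number
)

text = "76234 49999 50000 123456 12345678 123456789 04999"
found = pattern.findall(text)
print(found)
```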
warm up said...
@Mark,
[5-9][0-9][0-9][0-9][0-9]
this indeed finds what I want. Thank you very much.
April 13, 2011 at 7:55 PM
Wavetrain said...
Hey, just wanted to say thanks for the pointers. Really helped me clean up a massive
wiki list, it probably cut down editing time to 1/4 what it would have been.
April 22, 2011 at 2:43 PM
warm up said...
I need a regex builder. Will you please suggest me a good one?
Thanks in advance.
April 23, 2011 at 4:18 AM
Mark Antoniou said...
Sorry warm up, I've never used one, and definitely couldn't recommend a good one. If
you've got a specific regexp query I might be of more use.
April 23, 2011 at 6:21 AM
warm up said...
Thanks for offering your help and your time.
I want to find a string in a text like this;
For example, every line in the text file has a string
#links#
and after this string there are several words that do not interest me. I want to find and
mark #links# and the words after that so that I can delete them. How can I do that with
notepad++?
April 26, 2011 at 5:01 PM
Mark Antoniou said...
That's not too difficult. You just need to identify each line that begins with #links# and
delete it.
Search for: #links#.*
Replace with: nothing, just leave it blank
April 27, 2011 at 12:12 AM
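To see what #links#.* actually removes, here is a short Python sketch with made-up lines. The .* extends the match from #links# to the end of the line, so the marker and every word after it are deleted, while anything before the marker is kept.

```python
import re

# Made-up lines; only the #links# marker comes from the question.
text = "keep this part #links# one two three\nanother line #links# four five\n"

# The dot stops at the line break, so each line is trimmed separately.
trimmed = re.sub(r"#links#.*", "", text)
print(trimmed)
```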
warm up said...
But I do not want to delete only #links#. I want to delete #links# and the words that are
coming after that.
For example, let's say I have a line like this;
Any ideas ?
May 17, 2011 at 4:50 AM
Mark Antoniou said...
If you are trying to perform this search in Notepad++, it's not going to happen.
Having said that, if you insist on using Notepad++, you are going to need to get creative
and will need to break the search down into steps because \r\n cannot be used in
Regular Expression mode - you need to use Extended Search mode for \r\n. So, how
many steps do you need? I'm not sure, because it depends on your text, but my guess
is at least three:
1. Turn the newline into something unique.
2. Run the regexp.
3. Put the newlines back or do something else with them (not sure what, because you
didn't specify).
May 17, 2011 at 4:59 AM
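The three-step workaround can be written out in Python to make the sequence concrete. This is only a sketch of the idea: the placeholder token, the sample text and the pattern in step 2 are all made up, and an engine like Python's would not need the detour in the first place.

```python
import re

TOKEN = "@@NL@@"  # hypothetical placeholder; must not occur in the text

text = "start\r\nmiddle\r\nend\r\nother\r\n"

# 1. Turn the newline into something unique.
flat = text.replace("\r\n", TOKEN)

# 2. Run the regexp, using the token wherever a line break is meant.
flat = re.sub(r"start" + re.escape(TOKEN) + r"middle", "start middle", flat)

# 3. Put the remaining newlines back.
restored = flat.replace(TOKEN, "\r\n")
print(repr(restored))
```

As Constantin points out in the next comment, this only suits cases where the file may be rewritten; for searching without changing anything, the placeholder step would have to be done on a copy.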
Constantin said...
Well :), if the regular expression implementation in Notepad++ would implement the
multi line pattern that could solve it. I am not familiar with how Notepad++ is
implemented but Java would allow multi line patterns. I bet .NET would do the same.
The solution you suggested would work nicely but what I was trying to do was to search
thru a large set of java files for a certain multi line pattern. So I can't have the option to
replace the \r\n with a special token since that will alter the code base.
Thanks for looking!
May 17, 2011 at 7:22 AM
Mark Antoniou said...
If you do not *have* to use Notepad++, why not just use a more powerful text editor
(XEmacs), which will give you the one-line solution that you are looking for?
May 17, 2011 at 7:29 AM
Dee said...
Hi Mark,
Thanks for the very helpful blog post.. at first I couldn't quite figure out how to use the
Find : Step\s\d
Replace : Step\s\d|
which of course gave me this!
Step\s\d|
Step\s\d|
Step\s\d|
Eventually, it clicked that \1 represents the found patterns
and I stumbled upon this:
Find: (Step\s\d)
Replace: \1|
which gave me the desired result:
Step 1|
Step 2|
Step 3|
Just wanted to get that out there in case anyone else is struggling with that.
Once again cheers Mark for the help on that one.. the fist in the air celebration was
priceless.
Dee
June 16, 2011 at 1:31 AM
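One detail worth spelling out: \1 refers to the first parenthesised group in the Find term, so the pattern needs parentheses for \1 to have something to refer to. A small Python sketch of Dee's example, with the group written explicitly and, as an alternative, Python's \g<0> for "the whole match":

```python
import re

text = "Step 1\nStep 2\nStep 3\n"

# With a capture group, \1 is the text that the parentheses matched.
grouped = re.sub(r"(Step\s\d)", r"\1|", text)

# Python also offers \g<0>, which stands for the entire match, no group needed.
whole = re.sub(r"Step\s\d", r"\g<0>|", text)

print(grouped)
```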
Mark Antoniou said...
Glad that you found the post helpful, Dee. It's funny how it seems so much more simple
tcp10102_PROBE
can you tell me the syntax to use for this replacement?
text after tcp and text after/ and before _PROBE varies..
June 16, 2011 at 11:39 PM
Mark Antoniou said...
Hey Manuel,
This is pretty straightforward. You want to keep everything before the forward slash and
everything after the underscore.
In Regular Expression search mode,
Search for: (.*)/.*_(.*)
Replace with: \1_\2
You can see that in the search term, I am using the forward slash and underscore as
signposts, and am keeping everything before and after (enclosed in parentheses), but
am discarding everything in between (not enclosed in parentheses).
June 17, 2011 at 12:13 AM
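The signpost idea translates directly to Python. In the sketch below the input line is hypothetical (the original sample is not shown here), chosen so that it produces the tcp10102_PROBE result mentioned in the question.

```python
import re

# Hypothetical input: text, a forward slash, more text, an underscore, PROBE
line = "tcp10102/some-text_PROBE"

# Keep what is before the slash (\1) and after the underscore (\2);
# the part in between is matched but not captured, so it is dropped.
result = re.sub(r"(.*)/.*_(.*)", r"\1_\2", line)
print(result)
```

Because .* is greedy, a line with several slashes or underscores would be split at the last of each, which may or may not be what is wanted.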
Nate said...
I am interested in searching a document and replacing everything from a href=" to " and
change all the links quickly with notepad ++ can you tell me how to do this?
I tried searching for ahref=".*" and it selected everything up to the LAST "
Please advise!
Thanks
June 18, 2011 at 3:45 AM
Mark Antoniou said...
Ok, I'm not sure exactly what you want the end result to be, but I'll give it a go. Say that
you start with something like this:
ahref="www.google.com"
ahref="www.facebook.com"
ahref="www.blogger.com"
ahref="www.twitter.com"
If you want to keep the ahref=" and the final " you could
Search for (regexp mode): (ahref=").*(")
Replace with: \1\2
The end result would be
ahref=""
ahref=""
ahref=""
ahref=""
If you want to keep everything but the ahref=" and the final " you could
Search for (regexp mode): ahref="(.*)"
Replace with: \1
The end result would be
www.google.com
www.facebook.com
www.blogger.com
www.twitter.com
If you want to do something else, you're going to have to be more specific. Ideally,
show me what a few lines of text look like before, and what you want them to look like
after.
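Nate's "selected everything up to the LAST quote" is the greedy .* again, and it shows up whenever a line holds more than one link. A common alternative is [^"]*, which cannot run past a quotation mark. The Python sketch below uses an invented line of HTML, and www.example.com stands in for whatever the new link should be.

```python
import re

# Invented line with two links on it
html = '<a href="www.google.com">Google</a> <a href="www.blogger.com">Blogger</a>'

# Greedy: one match that runs from the first href to the last quotation mark.
greedy = re.findall(r'href=".*"', html)

# Bounded: [^"]* stops at the closing quotation mark of each link.
per_link = re.findall(r'href="([^"]*)"', html)

# Replace every link target in one pass.
replaced = re.sub(r'href="[^"]*"', 'href="www.example.com"', html)

print(per_link)
print(replaced)
```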
Regards
Etay G
August 3, 2011 at 1:15 AM
Mark Antoniou said...
Glad you have found the blog useful, Etay G. I'm not sure exactly what you are trying to
get from the text. Do you want to get rid of everything, leaving only the email
addresses?
August 5, 2011 at 1:23 AM
RatA said...
Mark, thanks for the post, it is very useful. Following the first example, how about not
erasing all the line, but only a part.
like i want to remove the $_POST['abc']; part in all lines
$abc = $_POST['abc'];
$bbb = $_POST['def'];
i try [$_POST].* but it erase all the line, and not the final part.
August 19, 2011 at 2:26 AM
Mark Antoniou said...
RatA, if I understood correctly, you want to turn this:
$abc = $_POST['abc'];
$bbb = $_POST['def'];
into this:
$abc =
$bbb =
is that right?
To do this,
Search for (regular expression mode): \$_POST.* (the backslash makes the $ a literal dollar sign rather than an end-of-line anchor)
Replace with: nothing
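To see why RatA's attempt with [$_POST].* wiped the whole line, here is a small sketch in Python's re module. In Python the dollar sign has to be escaped as \$ to be treated literally; the variable names are just for illustration.

```python
import re

line = "$abc = $_POST['abc'];"

# \$ is a literal dollar sign; everything from $_POST onwards is removed.
trimmed = re.sub(r'\$_POST.*', '', line)

# [$_POST] is a character class: it matches ONE character out of
# $, _, P, O, S, T. The leading $ of $abc already matches, so .* then
# swallows the rest of the line.
wiped = re.sub(r'[$_POST].*', '', line)

print(repr(trimmed))
print(repr(wiped))
```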
August 19, 2011 at 2:34 AM
RatA said...
thanks, u are a genius.
August 21, 2011 at 5:06 AM
Mikazza said...
what it does though is find the 1st occurrence of the word messages, then find the last
occurrence, and replace everything in between with the word deleted. In my example
above it's deleting **Data I want to keep is here 2**
Is there any way of doing this using regular expressions?
September 15, 2011 at 11:37 PM
Mark Antoniou said...
Hi Mikazza, I am not sure that I have understood exactly what you are trying to do, but
will give it a shot. So this is your original text:
**Data I want to keep is here 1**
< message >
called today but nobody was home
< /message >
**Data I want to keep is here 2**
< message >
called today but nobody answered
< /message >
In order to remove the < message > and < /message > tags, you should
Search for (regular expression mode): <.*>
Replace with: nothing
This will give you this:
**Data I want to keep is here 1**
If you then want to get rid of the lines that begin with "called", you could
Search for (regular expression mode): called.*
Replace with: nothing
which will give you this:
**Data I want to keep is here 1**
And then fix the blank lines as you see fit. Hope this helps.
p.s. I inserted spaces before and after the greater and less than symbols so that they
would show up in the post. You would not include the spaces in the search term.
September 17, 2011 at 12:53 AM
Mikazza said...
Thanks for the quick response Mark, what I want to replace is the < message > and <
/message > and everything in between them. I can get it to work if there is only one set
of these tags in the file (unfortunately there are thousands), if there are more than 1 set
it goes wrong and deletes everything between the 1st < message > and the last <
/message >.
Since the < message > and < /message > are on different lines in the file and the
content between them can also vary on how many lines it's over, I removed all the line
breaks to make it a bit easier to do the search and replace.
Please let me know if you need any more information.
Thanks!
September 17, 2011 at 4:47 AM
Mark Antoniou said...
Ok got it. So, you start off with this:
**Data I want to keep is here 1**
< message >
called today but nobody was home
< /message >
**Data I want to keep is here 2**
< message >
called today but nobody answered
< /message >
Notepad++ has a hard time handling multiline regular expressions. One option is to use
a different text editor with more powerful regexp capabilities (ahem, Emacs). The other
option is to use Notepad++ and break this down into a few steps (3 to be precise).
Step 1: Remove the newlines
Search for (extended mode): \r\n
Replace with: nothing
This will give you this:
**Data I want to keep is here 1**< message >called today but nobody was home<
/message >**Data I want to keep is here 2**< message >called today but nobody
answered< /message >
Step 2: Make all instances of < /message > occur at the end of a line. The reason for
this is because we want to discard everything before < /message >, apart from that bit
at the front that we want to keep.
Search for (extended mode): < /message >
Replace with: \r\n
So, your text will now look like this:
**Data I want to keep is here 1**< message >called today but nobody was home
**Data I want to keep is here 2**< message >called today but nobody answered
We are nearly there, but we still want to discard everything after (and including) the <
message > tag.
Step 3: Remove everything from < message > onwards.
Search for (regular expression mode): (.*)< message >.*
Replace with: \1
And finally, we arrive at our desired result:
**Data I want to keep is here 1**
**Data I want to keep is here 2**
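For comparison, an engine that lets the dot cross line breaks can do this in one pass. The sketch below uses Python's re module, which is a different technique from the three-step workaround above; the tags are written without the extra spaces, and plain \n line endings are assumed.

```python
import re

text = (
    "**Data I want to keep is here 1**\n"
    "<message>\n"
    "called today but nobody was home\n"
    "</message>\n"
    "**Data I want to keep is here 2**\n"
    "<message>\n"
    "called today but nobody answered\n"
    "</message>\n"
)

# re.DOTALL lets . cross line breaks; .*? is non-greedy, so each match
# stops at the nearest </message> instead of the last one in the file.
cleaned = re.sub(r'<message>.*?</message>\n?', '', text, flags=re.DOTALL)

print(cleaned)
```

The non-greedy .*? is what prevents the problem Mikazza described, where everything between the first opening tag and the last closing tag was deleted.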
September 17, 2011 at 6:32 AM
Menes said...
Hi Mark i have text like that ;
apple(7)orange(27)banana(318)tulip(2)
And i want to convert it like that;
apple,orange,banana,tulip
i try those ;
[(].*[)] and (\(.*)())
but both of them doesn't work.
Thanks for helping
September 21, 2011 at 7:17 PM
Mark Antoniou said...
Hi Menes,
This is a little tricky. The reason why is because there are multiple parentheses on the
same line. This can muck up your search term. First things first, the way to search for
parentheses is with a preceding backslash, like this \( for open and this \) for closed.
One solution for your problem is to take a different approach: rather than trying to take
care of all parentheses at once, you could take care of parentheses that contain the
into a linebreak". It is not possible to do this in one step in Notepad++ (although you
could do it in a more powerful text editor, like Emacs). We are going to need 2 steps.
In order to find where each phone number ends, all we have to do is find the number
that has a space after it.
Search for (regexp mode): ([0-9]) -note that there is a space after the closed parenthesis
Replace with: \1,
john cruz 00374653,kelly brunz 95847364,alan whirtz 9898372,jane doerl
The reason for inserting a comma is so that we will have something to search for in the
next step when we want to insert a new line.
Search for (extended search mode): ,
Replace with: \r\n
which will give you this:
john cruz 00374653
kelly brunz 95847364
alan whirtz 9898372
jane doerl
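In a tool where the replacement can contain a newline directly, the comma stepping stone is not needed. A minimal sketch in Python's re module follows; the input line is my assumption, inferred from the intermediate result shown above.

```python
import re

# Assumed input, inferred from the intermediate result shown above.
text = "john cruz 00374653 kelly brunz 95847364 alan whirtz 9898372 jane doerl"

# A digit followed by a space marks the end of a phone number;
# keep the digit (\1) and turn the space into a line break.
one_per_line = re.sub(r'([0-9]) ', r'\1\n', text)

print(one_per_line)
```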
September 28, 2011 at 9:11 AM
Adrian981 said...
Thank you Mark it works perfect.
The reason I was asking you about how to make a line after, let's say, 12
spaces/commas is that I have a lot of files that I want to break up into lines of 3, 6 and 9. I
will try to give you a good example of what I'm looking to do, e.g.:
john,likes,this,jane,loves,games,peter,saved,me,george,fell,today,greg,pushed,me
I'm trying to divide them up into lines of 3.
john likes this
jane loves games
peter saved me
george fell today
greg pushed me
The whole long line is made up of groups of 3 words that each make a small sentence.
Thanks for your previous help, you're amazing, and your help would be much appreciated
for this problem.
Adrian
September 29, 2011 at 12:10 AM
Mark Antoniou said...
So you start off with this
john,likes,this,jane,loves,games,peter,saved,me,george,fell,today,greg,pushed,me
and you want to put 3 words on each line. Notepad++ makes this a bit harder than it
should be. We need 2 steps. First, add a comma to the end of the line so that it looks
like this
john,likes,this,jane,loves,games,peter,saved,me,george,fell,today,greg,pushed,me,
If you have hundreds or thousands of lines, you could use a regular expression to do
this. Anyway, back to the task at hand.
Search for (regexp mode): ([a-z]*),([a-z]*),([a-z]*),
Replace with: \1 \2 \3QQQ
john likes thisQQQjane loves gamesQQQpeter saved meQQQgeorge fell todayQQQgreg pushed meQQQ
Note that the QQQ is just a random string that I came up with which (a) will never occur
in your list of words, and (b) is easily searchable, which is useful for the next step
below.
Search for (extended search mode): QQQ
Replace with: \r\n
john likes this
jane loves games
peter saved me
george fell today
greg pushed me
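The same three-bin pattern can be tried in Python's re module, where the replacement may contain the newline directly, so the QQQ placeholder is not needed. This is only a sketch of the idea, not a Notepad++ feature.

```python
import re

text = "john,likes,this,jane,loves,games,peter,saved,me,george,fell,today,greg,pushed,me"

# Add the trailing comma so that the last group of three also matches.
grouped = re.sub(r'([a-z]*),([a-z]*),([a-z]*),', r'\1 \2 \3\n', text + ',')

print(grouped)
```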
September 29, 2011 at 2:36 AM
Adrian981 said...
Hi Mark,
I think you've nearly cracked it for me, just a few more things I left out, not thinking they
would be an issue.
Some of the words have numerals in them as in john25,left3,
Glad you found the blog helpful, Adrian. In order to change the number of words that will
end up on each line, simply change the number of ([a-z0-9]*), in the search term, and
make sure you have the same number of items in the replacement term.
Using your example of 20, you would
Search for (regexp mode): ([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z09]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z09]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),([a-z0-9]*),
Replace with: \1 \2 \3 \4 \5 \6 \7 \8 \9 \10 \11 \12 \13 \14 \15 \16 \17 \18 \19 \20QQQ
The obvious limitation of using this sort of brute force approach is that it becomes
impractical if you wanted say 1000 words on each line (that would be a lot of
copy+pasting!). But, we are trying to work around the limitations of Notepad++, so we
have to (sometimes) use inelegant solutions.
As for donations, I gratefully and humbly accept whatever you can spare. My email
address for Paypal is [email protected]
September 29, 2011 at 5:00 AM
Adrian981 said...
Hi
I done a few tests and it only seems to divide up to 9 words per line and any bigger line
10,11,12 it replaces 0.
Any ideas
Adrian
September 29, 2011 at 5:29 AM
Mark Antoniou said...
Ah, yes, you are right. Notepad++ will not let you have more than 9 bins. Sorry, I was
not working in Notepad++ when I posted my previous reply. This is yet another reason
to use a more powerful text editor for this sort of advanced regexp. Enough of my
ranting.
So, let's say that you want to have more than 9 words per line. It's just a matter of
making our bins bigger. Rather than putting one word in each bin, we could put 20 in
each bin (or however many you like).
Ok, so we start off with these 40 words:
john25,likes,this,jane11,loves,games,peter,saved,me,george,fell,left3,greg,pushed55,me,john25,likes,this,jane11,loves,games,peter,saved,me,george,fell,left3,greg,pushed55,me,peter,saved,me,george,fell,left3,greg,pushed55,me,too,
Search for (regexp mode): ([a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z09]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z09]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,[a-z0-9]*,)
Note that there is only one open parenthesis at the start and one closed
parenthesis at the end of the search term.
Replace with: \1QQQ
john25,likes,this,jane11,loves,games,peter,saved,me,george,fell,left3,greg,pushed55,me,john25,likes,this,jane11,loves,QQQgames,peter,saved,me,george,fell,left3,greg,pushed55,me,peter,saved,me,george,fell,left3,greg,pushed55,me,too,QQQ
Then use extended search mode to convert the QQQs to newlines (\r\n).
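The "one big bin" idea can be written more compactly in an engine that supports counted repetition. The sketch below uses Python's re module; the function name words_per_line and the made-up sample words are assumptions for illustration only.

```python
import re

def words_per_line(text, n):
    # (?:...){n} repeats "word plus comma" n times inside one capture
    # group, so only \1 is needed in the replacement.
    pattern = r'((?:[a-z0-9]*,){%d})' % n
    return re.sub(pattern, r'\1\n', text)

sample = ','.join('w%d' % i for i in range(40)) + ','   # 40 made-up words
result = words_per_line(sample, 20)

print(result)
```

Changing the number of words per line then only means changing n, rather than copying and pasting the bin.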
September 29, 2011 at 5:51 AM
Mark Antoniou said...
Oh, and you could just use a simple Find+Replace to replace the commas with spaces
(if you want to).
September 29, 2011 at 5:52 AM
Adrian981 said...
Thanks a lot Mark. It seems to leave ( at the start of each line and ) at the end. Should I just find and
replace them, or am I still doing something wrong?
September 29, 2011 at 6:47 AM
Mark Antoniou said...
No, it should not leave any ( or ) anywhere. Make sure that your search term and
replace term are correct.
I have double-checked my post above and it is correct. No typos.
September 29, 2011 at 6:52 AM
Adrian981 said...
Thanks a lot Mark, you're great. Is there any way to remove the , at the end of each line?
Adrian981 said...
Perfect, thanks a lot Mark.
Adrian
September 29, 2011 at 9:31 AM
Sam said...
I have pretty small question.
is there anything I can add in front of any word ?..like
b8nmuujs7jrug'
baszp4tj1s7vv'
add ' single quote in front of every word in the line.
thanks
Sam
September 30, 2011 at 7:28 AM
Mark Antoniou said...
Ok, so you start off with this:
b8nmuujs7jrug'
baszp4tj1s7vv'
and you want to add a single quote ' to the beginning of each line.
Search for (regexp mode): (.*)
Replace with: '\1
which will give you this:
'b8nmuujs7jrug'
'baszp4tj1s7vv'
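A rough Python equivalent is sketched below. I use (.+) with line anchors rather than a bare (.*), because Python's re.sub also replaces empty matches and could otherwise add stray quotes; that is a property of Python's engine, not something stated in the post.

```python
import re

text = "b8nmuujs7jrug'\nbaszp4tj1s7vv'"

# ^ and $ anchor at every line with re.MULTILINE; (.+) skips empty lines.
quoted = re.sub(r'^(.+)$', r"'\1", text, flags=re.MULTILINE)

print(quoted)
```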
September 30, 2011 at 7:31 AM
Sam said...
No worries I figured out the answer
September 30, 2011 at 7:37 AM
Organix said...
Mark since you seem like the regex master maybe you can point me in the right
direction: I have a csv file that has text enclosed in "" but the problem is that in the
REMARK/detail field there can be inches which are also using " how can I find these
lines with the extra quotations?
example of what I'm looking for:
TRTM_TYPE,TEST_TYPE,RUN_NO,TEST_NUMBER,TOP_DEPTH,BASE_DEPTH,REMARK
FRAC,IP,0,001,1441,1721,"DETAILS: AQUAFRAC 1000; 36750# 20/40 BROWN SD,
7500# 16/30 SIBERPROP"
FRAC,IP,0,001,11218,11346,""
FRAC,IP,0,001,8210,9250,"DETAILS: 60406 GALS WF GR8, 195564 GALS DF 200R23"
FRAC,IP,0,001,9730,10030,"DETAILS: 51244 GALS WF GR8, 122796 GALS DF 200R23"
FRAC,IP,0,001,10600,11050,"DETAILS: 27858 GALS WF GR8, 173466 GALS DF 200R23"
FRAC,IP,0,001,11316,11582,"CMHPG 35#"
FRAC,IP,0,001,6714,7680,"DETAILS: 94 BBLS SLICK WTR, 95 BBLS SLICK WTR,
357 BBLS SLICK WTR, 119 BBLS LIGHTNING 2000 PAD, 0.5 TO 1 PPG 30/50#
WHITE SD IN LIGHTNING 2000 GEL, WHITE SD"
FRAC,IP,0,001,7680,8190,"DETAILS: 87 BBLS SLICK WTR, 71 BBLS SLICK WTR,
357 BBLS SLICK WTR, 119 BBLS LIGHTNING 2000 PAD, 0.5 TO 1 PPG 30/50#
WHITE SD IN LIGHTNING 2000 GEL, 238 BBLS SLICK WTR, DROP 3" BALL, 168
BBLS SLICK WTR, SEAT BALL, WHITE SD" <-Looking for these kind
October 4, 2011 at 3:43 AM
Mark Antoniou said...
Thanks for your question, Organix. I normally get asked about changing a text file by
restructuring data, but finding text in a particular format can be useful, too. You are
interested in an expression that will find text that contains a third " which indicates that
the comment includes the inches of some object or action, such as dropping a ball. To
find this use the search term below
Search for (regexp mode): .*".*" .*"
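Here is a small sketch of the same check in Python, assuming each record sits on a single line. The rows are abbreviated versions of Organix's sample, and the quote-counting alternative is my own addition.

```python
import re

rows = [
    'FRAC,IP,0,001,11218,11346,""',
    'FRAC,IP,0,001,11316,11582,"CMHPG 35#"',
    'FRAC,IP,0,001,7680,8190,"DETAILS: DROP 3" BALL, SEAT BALL, WHITE SD"',
]

# Same idea as the search term above: a quote, then a quote followed by
# a space (the inch mark), then one more quote later on the line.
by_regex = [r for r in rows if re.search(r'.*".*" .*"', r)]

# Alternative check: a clean row holds exactly two quotes.
by_count = [r for r in rows if r.count('"') > 2]

print(by_regex)
print(by_count)
```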
October 4, 2011 at 4:09 AM
Organix said...
Thanks Mark - I'm sure that was an easy one for you. I had tried ".*" .*" and ".*" .*"$ but
Vin said...
Hi Mark,
Great! That was exactly what I needed, just made my job a breeze! :)
I apologize for the vague question, what I meant to ask was, let's say I have thousands
of data all with different dates, and I would like to sort them out from the earliest to
latest.
Thank you so much Mark, your help is much appreciated!
October 12, 2011 at 1:55 PM
Mark Antoniou said...
Oh, I get it now. You are not going to be able to do that in a text editor. Perhaps import the
text file into Excel, use the text-to-columns feature and specify the comma as your
delimiter. Column A will contain all of the dates. Select Column A, set the format of the
cells to 'date'. Select the whole data range and sort ascending by column A. That will do
it.
October 12, 2011 at 3:03 PM
Vin said...
Ok! Muchas gracias. Can't begin to express my gratitude! :)
October 12, 2011 at 3:08 PM
Manas said...
Thanks man. It was really helpful. I wanted to remove "," that comes in a string from a
flat file of 200000+ records. The comma was messing up with my delimiter. BTW I used
[a-zA-Z1-0]+,[a-zA-Z1-0]+ as my search string..
Again thanks a ton man
November 8, 2011 at 5:01 PM
Jeff said...
Hi,
I'm having trouble using regex in Notepad++ for the following case.
How do I find a row containing a hostname and an IP address and turn it into one common statement
with a newline added?
As a result,
I would really appreciate it if someone could shed some light on this.
Cheers,
Jeffrey
November 9, 2011 at 3:22 AM
Mark Antoniou said...
Glad you found the blog useful, Manas.
Jeff, I don't understand what you want to do. What do you want the text to look like at
the end?
November 9, 2011 at 3:27 AM
Jeff said...
Here is some of the input that missed out in my last comments.
To
(hostname host="ipaddress" port="11")(/hostname)
Expected results
(host101 host="192.168.0.1" port="11")(/host101)
I want to replace "abcde" with "X", so not only do I need to make that change, but I also
need to reduce the string size (10) by the replacement length difference (4).
So I'm looking for something like:
Search for:
s:\(\d+\):\\"abcde
Replace with:
s:$((\1 - 4)):"X
where $((\1 - 4)) is an arithmetic calculation whose result is injected in the replacement
value.
Possible?
Thanks in advance.
November 18, 2011 at 3:00 PM
Mark Antoniou said...
Greg, does N++ regex support doing an arithmetic calculation in the replacement?
No. However, depending on the arithmetic, there may be a way to "fake it", and bend
Notepad++ to your will. One reader asked me if it is possible to increment numbers in
the replace term. It isn't. But if you insert a number on each line using the Notepad++
column editor, then use regexp to restructure the data, the result is identical.
So, where does that leave us then? I am not 100% clear on what your text looks like or
what you want it to look like. Like anything, there is a way to do it, but the question is
how messy will it get, and is it the most efficient way of getting the job done. It all
depends on how repetitious your replace term will be. My gut feeling is that you should
probably take a look at either Perl or Awk for your particular case.
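Python is another option alongside Perl and Awk here, because its re.sub accepts a function as the replacement, and that function can do the arithmetic. The sketch below is only an illustration: the sample string, the helper name shrink, and the assumption that the quote in the data is a plain " are all mine.

```python
import re

def shrink(match):
    # Hypothetical helper: the new length is the old one minus 4,
    # because "abcde" (5 characters) becomes "X" (1 character).
    new_length = int(match.group(1)) - 4
    return 's:%d:"X' % new_length

sample = 's:10:"abcdefghij";'   # made-up serialized string
fixed = re.sub(r's:(\d+):"abcde', shrink, sample)

print(fixed)
```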
November 19, 2011 at 1:34 AM
Bob C said...
Helped! Thanks :)
December 14, 2011 at 3:41 AM
Mamoun J. said...
Your work is amazingly good. Recently a hacker injected iframes into all the php files of
my web page. I'm trying to remove these iframes with Notepad++.
What I want to remove is something like:
IFRAME Bla Bla Bla /IFRAME
So, I know the beginning and the end of the string, but the problem is that the contents are
not the same every time. One thing that I haven't confirmed yet is whether each iframe is located on
a separate line; if so, all I need is to delete the whole line wherever I find the iframe
term.
Please give your suggestions. Many thanks.
January 16, 2012 at 3:15 PM
Mark Antoniou said...
Thanks for your kind words, Mamoun. If all you need to do is remove whatever is
contained within the Iframe tags, this can be achieved easily by
Search for (regexp mode): IFRAME.*/IFRAME
Replace with: nothing
If there are multiple instances of IFRAME on the same line, or if an individual IFRAME
spans multiple lines, then things become a little more complicated, especially if you're using
Notepad++. In this instance, you could change all instances of /IFRAME to something
unique, such as ENDOFHACK or whatever. Then you could remove all newlines and
replace them with something else unique, such as PUTBACKLATER or whatever. Then
you would
Search for (regexp mode): IFRAME.*ENDOFHACK
Replace with: nothing
Then reinsert all newlines back where they were by replacing PUTBACKLATER with \r\n
in extended search mode.
Either way, it can be done.
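In an engine with non-greedy matching and a dot that can cross line breaks, the placeholder juggling is not needed. A minimal sketch in Python's re module follows, using the same IFRAME and /IFRAME stand-ins as above (real files would contain the full tags); the sample text is made up.

```python
import re

page = "keep1 IFRAME bla /IFRAME keep2 IFRAME multi\nline /IFRAME keep3"

# Non-greedy .*? ends each match at the nearest /IFRAME; re.DOTALL lets
# the match continue across line breaks.
cleaned = re.sub(r'IFRAME.*?/IFRAME', '', page, flags=re.DOTALL)

print(repr(cleaned))
```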
January 16, 2012 at 3:28 PM
catchthepanda said...
thanks for a good read, breaks down steps very well indeed for doing more complex
regex stuff!
I leave you with
PRIVATE="0" TAGS="winkelen,nagerechten,chocolade,nougat,hapje,bonbon"
>FineFoodImports - Home
PRIVATE="0" TAGS="eigenbedrijf,marketing,sales,arrangementen,workshops"
>Jump4art
PRIVATE="0" TAGS="ecofriendly,bouwen,frans"
>Tulkivi spreksteenkachel
PRIVATE="0" TAGS="hartigetaart,vlees,Recepten,!Recepten"
>Kerriequiche
Now we put the 2 lines that we split up back together again.
Search for (extended search mode): "\r\n
Replace with: "
PRIVATE="0" TAGS="concerten,tickets,kaarten">Live Nation Live Nation Netherlands
PRIVATE="0" TAGS="winkelen,nagerechten,chocolade,nougat,hapje,bonbon">FineFoodImports - Home
PRIVATE="0" TAGS="eigenbedrijf,marketing,sales,arrangementen,workshops">Jump4art
PRIVATE="0" TAGS="ecofriendly,bouwen,frans">Tulkivi spreksteenkachel
PRIVATE="0" TAGS="hartigetaart,vlees,Recepten,!Recepten">Kerriequiche
And there you have it.
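The rejoining step is a plain text replacement, so it can be sketched in Python without any regular expression. The sample below uses one of the bookmarks from above and assumes Windows \r\n line endings.

```python
# One bookmark split over two lines, as in the step above.
split_text = 'PRIVATE="0" TAGS="ecofriendly,bouwen,frans"\r\n>Tulkivi spreksteenkachel\r\n'

# Same operation as the extended-mode search: drop the line break that
# directly follows a double quote.
joined = split_text.replace('"\r\n', '"')

print(joined)
```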
January 21, 2012 at 3:36 AM
eagleapex said...
Just spent 10 minutes trying to make a clever USPTO search with your help.
unparseable (Too Many Search Terms 1043 ) ).
awww
January 21, 2012 at 4:11 AM
JPNL said...
Wow that worked perfect! Thank you so much!!! and have a nice weekend.
January 21, 2012 at 6:30 AM
Adrian981 said...
Hi Mark
You gave me a lot of help before with Notepad++. I was wondering whether you could help me to
figure this out.
I'm looking to replace each comma at the end of a line with something else.
ie. jon,dan,paul,
I know how to find : (.*),
And i usually replace with : \1
Mark 30
where
iamhere
...
i want to output only ;
Jessie 213
Jack 232
Mark 30
and as you see there is a trick:
there is a fixed text (iamhere) 2 lines after each line that we need.
In fact the question is simple: can we select/mark the lines that come 2 lines before a
fixed text?
I looked at TextFX but I couldn't solve the problem.
Thanks for your help.
April 5, 2012 at 7:51 AM
RussiAmore said...
Thanks for such a great guide! There are lots of tips I didn't know about.
April 5, 2012 at 10:02 PM
Unknown said...
Mark, I can see why you've received so much traffic on this post. It helped me solve
cleaning up a very large xml document. Big Thanks for your Documentation and
Examples!!!
Randy
Techie by day, woodworker by night...
http://www.custommade.com/by/repearson
April 6, 2012 at 3:57 AM
Mark Antoniou said...
Glad you found it helpful, RussiAmore and Randy. And thanks for the kind words.
Enes, your problem is a simple one (in theory), but is made complicated by Notepad++.
As you point out, there is a (somewhat) recurring pattern in the text. You want to keep
the line that is two lines above "iamhere", discard the line above "iamhere" as well as
the "iamhere" itself. I am not going to even bother doing this in Notepad++ because the
solution will be very, very, very long. We need a text editor that will allow us to include
newlines as part of our regexp search term. I recommend Emacs, available here:
http://ftp.gnu.org/gnu/emacs/windows/
So, we start off with this:
Jessie 213
block me later
iamhere
Jack 232
blablabla
iamhere
blablabla
sometext
againsometext
Mark 30
where
iamhere
Search for: \(.*\)
.*
iamhere
Note: insert newline characters into a search term in Emacs by pressing Ctrl+Q Ctrl+J
Replace with: \1
This will give you this:
Jessie 213
Jack 232
blablabla
sometext
againsometext
Mark 30
Ok, so now you can see why I said that there is a *somewhat* recurring pattern. The
number of lines between each occurrence of "iamhere" varies, so we want to get rid of
"blablabla", "sometext" and "againsometext". In this example, we can use the fact that
the unwanted text does not end with a number to our advantage, like this
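Both stages can be sketched in Python's re module, which also allows newlines in the search term. The first substitution mirrors the Emacs search above; the second stage, keeping only lines that end in a digit, is my guess at how the "does not end with a number" idea could be applied, not the author's exact search term.

```python
import re

text = (
    "Jessie 213\nblock me later\niamhere\n"
    "Jack 232\nblablabla\niamhere\n"
    "blablabla\nsometext\nagainsometext\n"
    "Mark 30\nwhere\niamhere\n"
)

# Stage 1: keep the line two above each "iamhere" and drop the line in
# between together with "iamhere" itself.
stage1 = re.sub(r'(.*)\n.*\niamhere', r'\1', text)

# Stage 2 (assumed): discard the leftover lines that do not end in a digit.
kept = [line for line in stage1.splitlines() if re.search(r'[0-9]$', line)]

print(kept)
```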