Line 1
Line 2
Line 3
Line 4
This expression WORD is contained in this line 5
Line 6
Line 7
Line 8
WORD begins that line 9
Line 10
Line 11
Line 12

I think that there are four possible versions of your initial question :

How to delete any text from the beginning of the current file till the FIRST line, included, containing the string “WORD” ( A )

How to delete any text from the beginning of the current file till the LAST line, included, containing the string “WORD” ( B )

How to delete any text from the cursor location of the current file till the FIRST line, included, containing the string “WORD” ( C )

How to delete any text from the cursor location of the current file till the LAST line, included, containing the string “WORD” ( D )

Remark : For cases C / D, in order to delete the whole line where the cursor is located, just move the cursor to the very beginning of the current line, before running the S/R operation

No worry :-) For each case, of course, there’s a suitable regex to perform ! So :

Open the Replace dialog ( CTRL + H )

Check the Match case option, if necessary

Check, preferably, the Wrap around option

Select, of course, the Regular expression search mode

In the Find what zone, type in, according to the wanted case :

\A(?s).*?WORD(?-s).*\R for case ( A )

\A(?s).*WORD(?-s).*\R for case ( B )

(?s).*?WORD(?-s).*\R for case ( C )

(?s).*WORD(?-s).*\R for case ( D )

Leave the Replace with zone EMPTY

Click, once, on the Find Next button, then click on the Replace button

Et voilà !
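For readers who would rather script this than use the Replace dialog, here is a minimal Python sketch of the four cases. It is a translation, not the Notepad++ regex itself: Python's re module has no \R and does not accept (?s) / (?-s) in the middle of a pattern, so the sketch uses a scoped (?s:...) group, [^\r\n]* for "rest of the line" and an explicit EOL alternation. The names delete_through, EOL and sample are made up for this illustration, and the start offset stands in for the cursor position.

```python
import re

# Python's re module has no \R, so this alternation stands in for it.
EOL = r"(?:\r\n|\n|\r)"


def delete_through(text, word, last=False, start=0):
    """Delete from offset `start` through the first (or last) line containing `word`.

    start=0 mimics cases A / B; a non-zero start plays the role of the
    cursor position in cases C / D.
    """
    dots = ".*" if last else ".*?"  # greedy reaches the LAST line, lazy the FIRST
    pattern = re.compile(rf"(?s:{dots}){re.escape(word)}[^\r\n]*{EOL}")
    match = pattern.match(text, start)
    if match is None:
        return text  # no line containing `word`, or no line break after it
    return text[:start] + text[match.end():]


# The 12-line sample from the top of the thread.
lines = [f"Line {n}" for n in range(1, 13)]
lines[4] = "This expression WORD is contained in this line 5"
lines[8] = "WORD begins that line 9"
sample = "\n".join(lines) + "\n"

case_a = delete_through(sample, "WORD")             # text now starts at Line 6
case_b = delete_through(sample, "WORD", last=True)  # text now starts at Line 10
case_c = delete_through(sample, "WORD", start=sample.index("Line 3"))
```

As with the original regexes, nothing is deleted when the line containing WORD is the very last one and has no line break after it.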

Notes :

The \A matches the zero-length position at the very beginning of the file

The (?s) modifier means that further dot symbol(s) will match, absolutely, any character ( Standard or EOL characters )

The (?-s) modifier means that further dot symbol(s) will match, ONLY the standard characters, ( usual form )

Note that when the . matches newline option is checked, it means that an invisible (?s) modifier is assumed, in front of the whole regex. However, the use of the two modifiers, (?s) and (?-s), inside the regex, allows us to build more precise regexes

The \R syntax represents any kind of EOL characters ( \r\n in Windows files, \n in Unix/OSX files or \r in old MAC files )
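The effect of these modifiers can be checked outside Notepad++ too. The short Python sketch below is only an approximation of the behaviour described above: Python's re module uses a scoped (?s:...) group instead of (?s) ... (?-s), its plain dot refuses only \n (it still matches \r), and it has no \R, so an explicit alternation is used in its place. The variable names are made up for this illustration.

```python
import re

text = "first line\r\nsecond line\nthird line"

# Without the s flag, the dot stops at "\n" (but, in Python, not at "\r").
single = re.match(r".*", text).group()

# Inside a scoped (?s:...) group, the dot matches absolutely any character.
everything = re.match(r"(?s:.*)", text).group()

# Stand-in for \R : "\r\n" is listed first so that it counts as ONE line break.
EOL = re.compile(r"\r\n|\n|\r")
breaks = EOL.findall(text)
```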

Remember, for instance, that :

The regex 0.*9 matches the longest string, beginning with a 0 digit and ending with a 9 digit

The regex 0.*?9 matches the shortest string, beginning with a 0 digit and ending with a 9 digit. That’s why there is NO other digit 9 inside that string
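The greedy / lazy difference is easy to verify with a couple of lines of Python ( the test string is made up for this illustration ) :

```python
import re

digits = "a0123945695z"

longest = re.search(r"0.*9", digits).group()    # greedy : runs up to the LAST 9
shortest = re.search(r"0.*?9", digits).group()  # lazy : stops at the FIRST 9

print(longest)   # 012394569
print(shortest)  # 01239
```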

hi, works great.
And please, guy038, one single question. If I want to delete a particular fragment from a text file:

I want to delete everything before line 4, which contains the words “UNTIL THIS” ( line 4 included ), and, at the same time, delete everything after line 10, which contains the words “AFTER THIS” ( line 10 included )

Line 1
Line 2
Line 3
Line 4 UNTIL THIS
Line 5 -----
Line 6 -----
Line 7 -----
Line 8 -----
Line 9 -----
Line 10 AFTER THIS
Line 11
Line 12
Line 13

As we need to grab several lines at the same time, we’ll use, again, the (?s) modifier, in order to allow the dot symbol to match, absolutely, any character. In addition, I just added the (?-i) modifier, which ensures that the search is performed in a case-sensitive way !

The search regex is, simply, an alternation of two regexes, tried in turn : the first one matches lines 1 to 4, included, and the second one matches lines 10 to 13, included. So :

We start with the example text below :

Line 1
Line 2
Line 3
Line 4 UNTIL THIS end
Line 5 -----
Line 6 -----
Line 7 -----
Line 8 -----
Line 9 -----
Line 10 AFTER THIS end
Line 11
Line 12
Line 13

Go back to the beginning of your file ( CTRL + Home )

Open the Replace dialog ( CTRL + H )

Select the Regular expression search mode

Preferably, uncheck the Wrap around option

SEARCH : (?s-i).*UNTIL THIS.*?\R|.*\R\K.*AFTER THIS.*

REPLACE : Leave EMPTY

Click, one time, on the Replace All button

You should obtain the wanted result, below :

Line 5 -----
Line 6 -----
Line 7 -----
Line 8 -----
Line 9 -----

These two regexes are rather similar to those, described in my previous posts and don’t need any further explanation !
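For those who need the same clean-up in a script, here is a rough Python equivalent. It is a sketch under a few assumptions, not the regex above : Python's re module has neither \R nor \K, so the second alternative is anchored with ^ instead, lines are assumed to end with \n or \r\n, and the tail is cut from the first line containing the second marker ( the Notepad++ regex cuts from the last one, which only matters if the marker occurs several times ). The function name keep_between is made up for this illustration.

```python
import re


def keep_between(text, first_marker, last_marker):
    """Drop everything up to the last `first_marker` line, and everything
    from the next `last_marker` line to the end of the text."""
    pattern = re.compile(
        # Head : from the very beginning through the marker line, EOL included.
        rf"(?ms)\A.*{re.escape(first_marker)}[^\n]*\n"
        # Tail : from the start of the marker line to the very end.
        rf"|^[^\n]*{re.escape(last_marker)}.*"
    )
    return pattern.sub("", text)


sample = (
    "Line 1\nLine 2\nLine 3\nLine 4 UNTIL THIS end\n"
    "Line 5 -----\nLine 6 -----\nLine 7 -----\nLine 8 -----\nLine 9 -----\n"
    "Line 10 AFTER THIS end\nLine 11\nLine 12\nLine 13\n"
)
kept = keep_between(sample, "UNTIL THIS", "AFTER THIS")
```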

Best Regards,

guy038

BTW, concerning my previous post, I noticed a funny behaviour :

Copy the text, below, in a new file :

Line 1
Line 2
Line 3
Line 4 with the "ABC" string
Line 5
Line 6
Line 7 containing ABC, too
Line 8
Line 9
Line 10 is the last line, with the string ABC
Line 11
Line 12
Line 13

Go back to the beginning of your file ( CTRL + Home )

Open the Replace dialog ( CTRL + H )

Select the Regular expression search mode

Check the Wrap around option

Copy, in the Find what zone, the regex (?s-i).*\R\K.*ABC.*

Click, a first time, on the Replace button ( NOT the Replace All button ! ) => The lines 10 to 13 included are selected

Click, a second time, on the Replace button => The lines 10 to 13 included are deleted and, simultaneously, the lines 7 to 9 included are selected

Click, a third time, on the Replace button => The lines 7 to 9 included are deleted and, simultaneously, the lines 4 to 6 included are selected

Finally, a fourth click, on the Replace button, deletes the lines 4 to 6 included

Thus, contrary to what I had thought, up to now, although a \K form is used in the search regex, a mouse click on the Replace button ( step by step replacement ) still produces, in some cases, an action on the selected text !!
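The ranges that get selected, click after click, can be reproduced with a small Python simulation. This is only a sketch of what the regex matches on each successive run, not of the Replace button itself : Python's re module has no \K, so a capture group marks the part that Notepad++ would select and delete, and an explicit alternation replaces \R. The names TAIL and replace_once are made up for this illustration.

```python
import re

# Group 1 plays the role of everything located after \K in the original regex.
TAIL = re.compile(r"(?s).*(?:\r\n|\n|\r)(.*ABC.*)")


def replace_once(text):
    """Delete from the last line containing ABC to the end of the text."""
    match = TAIL.match(text)
    if match is None:
        return text
    return text[: match.start(1)]


lines = [f"Line {n}" for n in range(1, 14)]
lines[3] = 'Line 4 with the "ABC" string'
lines[6] = "Line 7 containing ABC, too"
lines[9] = "Line 10 is the last line, with the string ABC"
sample = "\n".join(lines) + "\n"

after_1 = replace_once(sample)   # lines 10 to 13 are gone
after_2 = replace_once(after_1)  # lines 7 to 9 are gone
after_3 = replace_once(after_2)  # lines 4 to 6 are gone
```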

Want to keep
Want to keep
Want to keep
Want to keep
Want to keep
Want to keep
Want to keep
Want to keep

Important :

It could be useless to insert marks, in order to determine the starting and ending boundary of the range of lines to be deleted. Two possibilities :

The boundaries are easy to isolate from the surrounding text and are unique. In that case, they could replace the generic START-DELETING and STOP-DELETING lines

The boundaries may be literally different but follow the same template. In that case, they can be found with a regex, which would be combined with my regex above !

So, if it’s not confidential information and if you don’t mind, give us an example of the START-DELETING and STOP-DELETING lines of your .html files ! You could also send one of your files, or part of it, as an attachment, to my e-mail address :

I took some time to figure out what you exactly wanted to do and I hope that my solution will be close enough to what you need !

OK, let’s suppose that we start with the sample text below :

title = ABC_name
title = DEF_name
title = YZ_name
title = GHI_name
title = JKL_name
title = MNO_name
title = YZ_name
title = ABC_name
title = MNO_name
title = MNO_name
title = PQR_name
title = MNO_name
title = STU_name
title = VWX_name
title = ABC_name
title = YZ_name
title = GHI_name

Note that it contains 3 lines with the string ABC, 2 lines with the string GHI, 4 lines with the string MNO and 3 lines with the string YZ !

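Now, let’s imagine that you would change each string ABC, DEF… into new strings, according to the table below :

ABC => ABC111
DEF => DEF-22222
GHI => GHI_GHI
JKL => J
MNO => mno
PQR => 000PQR
STU => Test
VWX => 99
YZ  => Y-Z

A single regex S/R, whose search and replacement parts are detailed in the notes below, would, simultaneously, change any occurrence of these 9 strings into the new ones, defined in the table above ;-))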

So, after clicking on the Replace All button, you would get, at once, the following text :

title = ABC111_lttz_name
title = DEF-22222_lttz_name
title = Y-Z_lttz_name
title = GHI_GHI_lttz_name
title = J_lttz_name
title = mno_lttz_name
title = Y-Z_lttz_name
title = ABC111_lttz_name
title = mno_lttz_name
title = mno_lttz_name
title = 000PQR_lttz_name
title = mno_lttz_name
title = Test_lttz_name
title = 99_lttz_name
title = ABC111_lttz_name
title = Y-Z_lttz_name
title = GHI_GHI_lttz_name

Et voilà !

Notes :

Regarding the search regex :

First, the (?-i) syntax forces the search to be processed, in a sensitive way ( NON-insensitive )

Now, the part title\x20=\x20 tries to match the string title =, with a space character, before and after the equal sign

Then, the (?: syntax starts a non-capturing group

The part (ABC)|(DEF)|(GHI)|(JKL)|(MNO)|(PQR)|(STU)|(VWX)|(YZ) is, simply, a list of 9 alternatives, corresponding to our 9 strings to be changed. Thus, each of them, between parentheses, is stored as group 1, 2, 3…

The final part )(?=_name) corresponds to the closing parenthesis of the non-capturing group, followed with a look-ahead structure or condition ( Is there the string _name after ABC, DEF… ? ) which must be true for an overall match

Regarding the replacement regex :

First, it rewrites the string title = , followed with a space character

Then any (?#....) syntax, where # represents a digit, is a conditional replacement and all the regex after the #, till the closing parenthesis, is evaluated, if the matched string is stored in group#

Note that the 9 conditional replacement structures (?1\1111)(?2\2-22222)(?3\3_\3)(?4J)(?5\L\5)(?{6}000\6)(?7Test)(?{8}99)(?9Y-Z) could be placed in any order

In some of them, we rewrite the searched string, stored in group # , due to the \# escape sequence

In the conditional replacement (?5\L\5) we, simply, rewrite the upper-case string MNO, in lower-case, because of the \L replacement escape sequence

Be aware, too, that concerning the groups 6 and 8, their conditional replacements are built with the alternate form (?{#}....). Indeed, we must distinguish between the group number # and the digits which follow it ! If the braces were absent, the regex engine would think that groups 6000 and 899 were concerned :-((

And finally, of course, it rewrites, in all cases, your ending part, the string _lttz !
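Python's re module has no conditional replacement syntax such as (?1...), so a script would have to take another route. The sketch below swaps in a different technique, a lookup table plus a replacement function, to reach the same result as the table and the expected output above. The names NEW_NAMES, PATTERN and rename_titles are made up for this illustration.

```python
import re

# Old string => new string, as read from the expected output above.
NEW_NAMES = {
    "ABC": "ABC111",
    "DEF": "DEF-22222",
    "GHI": "GHI_GHI",
    "JKL": "J",
    "MNO": "mno",
    "PQR": "000PQR",
    "STU": "Test",
    "VWX": "99",
    "YZ": "Y-Z",
}

# Case-sensitive by default; the look-ahead keeps _name out of the match.
PATTERN = re.compile(
    r"title = (" + "|".join(map(re.escape, NEW_NAMES)) + r")(?=_name)"
)


def rename_titles(text):
    """Rewrite 'title = XXX' as 'title = <new name>_lttz' before '_name'."""
    return PATTERN.sub(
        lambda m: "title = " + NEW_NAMES[m.group(1)] + "_lttz", text
    )


print(rename_titles("title = PQR_name"))  # title = 000PQR_lttz_name
```

A lookup table is easier to extend than nine numbered groups, but it is a design choice of this sketch, not something the Notepad++ regex does.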

Given the example you provided the following would remove all text between and including the START and END-DELETING lines.
Find: (START-DELETING.+\R)(.+\R)+(END-DELETING\R)
Replace: (leave empty)

So the assumption is that there must be at least 1 line between the 2 identifying lines (START and END), that’s the (.+\R)+ portion of the regex. Also note that the first group (START-DELETING.+\R) includes the .+ as your example also has 3 period characters after it. I’ve included brackets around each sub-portion just to make it a bit easier to segment out and identify what each group is doing. Only the middle group brackets are absolutely necessary, i.e. (.+\R)+

You say you can/have replaced using a simple find and replace to get the START and END lines in there. With my regex you could replace those portions with the original string you used to find. That would save you 1 or 2 additional steps.
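If the same removal ever has to run over many files from a script, the regex above translates to Python roughly as follows. This is a sketch, assuming marker lines shaped like the ones described ( START-DELETING followed by at least one character, END-DELETING alone on its line ); Python has no \R, so an explicit alternation is used, and the sample text and the names BLOCK and remove_blocks are made up for this illustration.

```python
import re

EOL = r"(?:\r\n|\n|\r)"      # stand-in for \R
LINE = rf"[^\r\n]+{EOL}"     # one non-empty line, like .+\R

# START line, then at least one non-empty line, then the END line.
BLOCK = re.compile(rf"START-DELETING{LINE}(?:{LINE})+END-DELETING{EOL}")


def remove_blocks(text):
    """Delete every START-DELETING ... END-DELETING block, markers included."""
    return BLOCK.sub("", text)


sample = "keep 1\nSTART-DELETING...\nremove a\nremove b\nEND-DELETING\nkeep 2\n"
cleaned = remove_blocks(sample)
```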

@guy038 you truly are a legend, I agree with the other poster. You are so deep into notepad++ regex, impressive!
I believe you may also know this - IMHO quite common - case, although I can’t find it described anywhere:

Suppose you have just one large file (wordpress sql database in fact, opened in my favorite editor notepad++) and STRING A and STRING B should always belong together:
FIND ALL INSTANCES OF ANY TEXT across lines
WHERE STRING A sometime later
IS FOLLOWED BY ANOTHER STRING A
INSTEAD OF THE “CLOSING” STRING B

Example: Find all instances where, across lines, there’s the literal string [/social]
and after any kind and number of characters there’s another literal string [/social]
BUT in between the two is nowhere a literal string [social] although it should be because [social] and [/social] belong together.

So basically in the example case, string A and string B always belong together, there must never follow two A’s or two B’s. Always the A string, then the B string. Then again the A string, then the B string. Etc. And so you need to find any “fault”: where A is followed sometime later by another A, instead of first a B string.

Did I explain this well enough?

I am sure none of the above, nor anything else I have found, works because I’ve tried them all. Would you have an idea how to go about this?
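One possible way to hunt for such faults, sketched in Python rather than in Notepad++ : look for a [/social] that is followed, later on, by another [/social], with no [social] anywhere in between. The "no [social] in between" part is expressed by testing a negative look-ahead before each consumed character. This is only an illustration of the idea on a made-up sample; the names FAULT and find_faults are hypothetical, and it has not been tried on a real WordPress dump.

```python
import re

FAULT = re.compile(
    r"\[/social\]"                   # a [/social] string ...
    r"(?=(?:(?!\[social\]).)*?"      # ... then any characters, none starting a [social] ...
    r"\[/social\])",                 # ... up to another [/social]
    re.DOTALL,                       # the dot must cross line breaks
)


def find_faults(text):
    """Return the offset of every [/social] followed by a second [/social]
    without a [social] between the two."""
    return [match.start() for match in FAULT.finditer(text)]


good = "[social]a[/social] x [social]b[/social]"
bad = "[social]a[/social] b\n[/social] c [social]d[/social]"

print(find_faults(good))  # []
print(find_faults(bad))   # [9]
```

Because the second [/social] sits inside a look-ahead, it is not consumed, so a run of three [/social] strings in a row reports two faults rather than one.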