Shell Command or Script to Remove Some Tags

Hi all,
I have a file from which I need to remove some lines.
The lines I need to remove are shown below:

Code:
</new_name> </w"s" langue="Fr-fr" version="1.0" encoding="UTF-8" ?> <New_name>
It appears in the middle of my file.
Is there a Linux command that can do this, or do I have to write a script?
If so, how do I do it? I am not a shell/Perl/bash developer.

"Exactly the same" doesn't seem to include having "my_xmltag" in the code. Perhaps that's why someone ironically flagged the OP's own response as Best Answer.

Quite possibly, you have redirected your output on top of your input file, which wipes it out because the shell does the output redirect before the input gets read.
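To make the point about redirection concrete, here is a minimal sketch. The file names and the sed substitution are hypothetical stand-ins, not the OP's actual command. It shows the shell emptying the file before sed reads it, and the usual workaround of writing to a temporary file first.

```shell
# Scratch directory and sample file (names are made up for illustration).
work=$(mktemp -d)
printf '<New_name>data</New_name>\n' > "$work/report.xml"

# WRONG: the shell truncates clobbered.xml before sed opens it,
# so sed reads an empty file and the result is empty too.
cp "$work/report.xml" "$work/clobbered.xml"
sed 's/data/DATA/' "$work/clobbered.xml" > "$work/clobbered.xml"
clobbered_size=$(wc -c < "$work/clobbered.xml" | tr -d ' ')

# SAFER: write to a temporary file, then move it over the original.
sed 's/data/DATA/' "$work/report.xml" > "$work/report.xml.tmp" &&
    mv "$work/report.xml.tmp" "$work/report.xml"
fixed_content=$(cat "$work/report.xml")

echo "clobbered size: $clobbered_size"
echo "fixed content:  $fixed_content"
```

If the new file comes out at 0 bytes, as described in this thread, this is the first thing worth ruling out.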

Check the filesize using "ls -l" to make sure you still have a non-zero size. Post the output if you don't understand it.

Is this something you have to do a lot of times? If it is just a one-off, forget scripting and use an editor. Tell us what system and OS you are on, and we can advise your simplest method. Scripting does not appear to be your thing today.

The file size is correct: I get 114257 for the original, but my new file is 0 bytes. As I said, it is created empty.

No, this is the first time I'm doing this.
No, I am trying to avoid a script. In fact I am looking for a way to remove the header tag + XML definition from one of my files and the footer tag from another file.

In detail: I am creating two XML files and then putting them together into a new XML file, but the new one is not correct XML because it has one extra row in the middle of the file, which contains the footer tag of the first file and the XML definition + header tag of the second file.

So I need some command line to remove these tags from file1 and file2 before putting them together.

So the tag is called New_name at the start but new_name at the end? That's never going to work in XML: tag names are case-sensitive, so the upper/lower case difference will stop the tags matching. But it also explains why the grep showed nothing. Personally, I would leave out the < and > as well. "grep ew_name Report02.xml" would be good.

Now, I'm worried that Ed said

head Report02.xml

and you said:

That command line has worked and it shows the content of my xml, it is too big to post it here,

The head command shows only the first 10 lines of the file. So if it's too big to post, then the whole file is one continuous text block without any newlines. And that's going to make it incredibly hard to repair.

Have you really only got to make this hack once in your life ? There are ways to do it, but not exactly user friendly (Guys - thinking dd).

No, head will only show the first few lines.
Basically, what you are searching for in the XML isn't in there. Either the XML is screwed, or the line you've told us is in there isn't, or you have copied it wrong.
My 20 cents is on you having copied it wrong, as the enclosing tags differ in case. XML does not allow mismatched-case tags (not well-formed XML, anyway). So if you don't want to do as you are asked, and my psychic abilities to fix users' errors are not working today, good luck.

So there are ZERO newline characters in the file (and only 3 words). So any grep, tail, head, sed, or other Unix command is going to do one of three things:

(a) Give the whole file, because it is all one line, so it is ALL LINE ONE.

(b) Crap out because line 1 contains 114,257 bytes.

(c) Crap out because the last line is unterminated.
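Counts like these come from wc. As a rough sketch, using a small made-up sample rather than the real 114,257-byte file, this is how to confirm that a file has no newlines at all:

```shell
# Hypothetical sample: a tiny XML fragment written without any newline.
sample=$(mktemp)
printf '<New_name><row>1</row></New_name>' > "$sample"

# wc -l counts newline characters, wc -c counts bytes.
# tr strips the padding some wc implementations add.
newlines=$(wc -l < "$sample" | tr -d ' ')
bytes=$(wc -c < "$sample" | tr -d ' ')

echo "newlines=$newlines bytes=$bytes"
```

A newline count of 0 alongside a large byte count means every line-oriented tool sees the whole file as a single unterminated line.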

Arash,

The ^M in the "not found" error is a stray Carriage Return character in the script file, caused by moving the text from Windows to Unix somehow.
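One common way to get rid of those carriage returns is tr. This is only a sketch with a made-up two-line script standing in for the real one:

```shell
# Hypothetical script file saved with Windows (CR LF) line endings.
script=$(mktemp)
printf 'echo hello\r\necho world\r\n' > "$script"

# Count the carriage returns before the fix.
cr_before=$(tr -cd '\r' < "$script" | wc -c | tr -d ' ')

# Delete every CR, writing to a NEW file (never onto the input itself).
unix_script="$script.unix"
tr -d '\r' < "$script" > "$unix_script"

# Count again to confirm they are gone.
cr_after=$(tr -cd '\r' < "$unix_script" | wc -c | tr -d ' ')

echo "carriage returns before: $cr_before, after: $cr_after"
```

Once the cleaned copy runs without the ^M complaint, it can replace the original.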

Guys: to help here we need to start with breaking the text into sensible line lengths so we can start making sensible use of the tools we have got.

Arash,

Are you able to create these files again with newlines around every 80 characters? I say "around" because the breaks have to be in sensible places for the XML syntax, otherwise you will create unintended paragraphs in texts and so on. That is, is the creation of this file repeatable and under your control?

If this is a genuine one-off, we might do it with a very delicate set of parameters.

Basically, we go back to your two separate files. We run wc on each. Then we count by hand the lengths of the parts you want to cut off.

Then we use a program called dd. That lets us break files up by counts, using operands like bs=1 skip=31 count=113452 and so on. I would need to look up the manual to get it right.
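Here is a rough sketch of that dd idea. The file names, header and footer strings are invented for illustration and will not match the real files. It uses the standard bs=, skip= and count= operands with bs=1, so that skip and count are byte counts.

```shell
# Two small stand-in files; the real ones come out of the ETL tool.
work=$(mktemp -d)
printf '<?xml version="1.0"?><New_name><a/></New_name>' > "$work/file1.xml"
printf '<?xml version="1.0"?><New_name><b/></New_name>' > "$work/file2.xml"

# Assumed pieces to cut: the footer of file 1, the declaration + header of file 2.
footer='</New_name>'
header='<?xml version="1.0"?><New_name>'

# File 1: keep everything except the last ${#footer} bytes.
size1=$(wc -c < "$work/file1.xml" | tr -d ' ')
keep1=$((size1 - ${#footer}))
dd if="$work/file1.xml" bs=1 count="$keep1" > "$work/joined.xml" 2>/dev/null

# File 2: skip the first ${#header} bytes and append the rest.
dd if="$work/file2.xml" bs=1 skip="${#header}" >> "$work/joined.xml" 2>/dev/null

joined=$(cat "$work/joined.xml")
echo "$joined"
```

The lengths here come from ${#var}, which only equals the byte count for plain ASCII text. With bs=1 dd is slow, but that hardly matters for files of about 100 KB. The counts have to be exactly right, which is why this suits only a genuine one-off.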

Thank you all for your help.
My machine is AIX,
and in fact I am creating those XML files with ETL tools, and now I need to put them together with a Unix command/script.
Thanks for your help. After some searching on the internet I found this code:

Perl is not my thing. However, Tiger posted some Perl a couple of days ago that looks better than your posted (2). Basically, I don't think it needs the second part of the entry (after the "if"), but it does need the CMP_red... bit.

What's wrong with it, though, is that it will just hang because it was not told what file to read. It needs the name of your second original file before the >>.

I posted an awk (look for comment "#! Repair an XML file.") about 3 days back which I tested too.

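(a) The "?" symbol is special in extended patterns such as awk's (it marks the preceding character as optional). Each ? in such a pattern needs escaping with backslash. (In sed's basic patterns a bare ? is literal, and GNU sed treats \? as the optional operator, so there it should be left unescaped.)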

(b) I think sed might choke on a 117000-byte line.
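To illustrate point (a), here is a small sketch using awk, whose patterns are extended regular expressions. The sample line is a hypothetical XML declaration plus header tag, not the OP's exact text.

```shell
# Hypothetical line: XML declaration followed by the header tag.
line='<?xml version="1.0" encoding="UTF-8" ?><New_name>'

# Each literal ? is written as \? so awk does not read it as "optional".
# [^>]* matches everything up to the closing ?> of the declaration.
stripped=$(printf '%s\n' "$line" |
    awk '{ sub(/<\?xml[^>]*\?>/, ""); print }')

echo "$stripped"
```

Only the declaration is removed. The header tag is left in place so the result is easy to check by eye.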

Actually I'm surprised awk manages it (at least on my Linux machine). It might not work on AIX - did you try it ?

I also have an awk that will fold all the lines just after a ">" symbol, which is still valid XML. (Folding at other places can introduce paragraphs, and put newlines in string options, and is a bad idea). I can post this script too if you ask. If you tamed your long-line files with this, a lot of other tools (like an editor) would start to work for you.
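For reference, a minimal sketch of that kind of fold might look like the following. This is not the script mentioned above, just one way to do it; the function name fold_after_gt and the sample input are made up. Whether a given awk copes with a 100 KB line, on AIX in particular, is untested here.

```shell
# Insert a newline after every ">" so no line runs past the end of a tag.
fold_after_gt() {
    awk '{
        gsub(/>/, ">\n")   # break the line after each closing bracket
        sub(/\n$/, "")     # drop the extra newline that print would double
        print
    }'
}

# Hypothetical one-line input with no trailing newline.
folded=$(printf '<r><a>1</a><b>2</b></r>' | fold_after_gt)
echo "$folded"
```

The breaks also land between a tag and any text that follows it, so element text gains a leading newline. That stays well-formed, but it is worth knowing if whitespace matters to whatever reads the file.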