Dear all, I am a newbie, which is why I am asking for help. I have a file containing cleavage-site sequences for proteases. I would like to reformat this file into another file, but I don't know how to do that in Perl.

It's hard to know for sure what you're asking for. The example is good, but you need to generalize a bit more.

You have capitalized words in two varieties: single 4-letter words, and two 4-letter words joined by a hyphen, right? You also appear not to care about the numbers at all (it seems you just replace them with a blank line). Additionally, you may want to elaborate on your error cases. E.g., what happens if: YARS Tyrosyl-trna synthetase Cathelicidin antimicrobial peptide CAMP Cathelicidin antimicrobial peptide

Or is the structure of your input file even more complicated than it appears to non-biologists?

Anyway, assuming you know a little Perl, your root problem may be that you do not know how to read a bunch of lines as one line. I would suggest looking through the Perl Cookbook to see if there's something close there.
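
To illustrate the "bunch of lines as one line" idea, here is a minimal sketch. It assumes that each record in your file is a block of lines separated from the next by a blank line, which may not match your real format. The helper name records_as_lines and the sample data are made up for the example. The key piece is Perl's built-in input record separator $/: setting it to the empty string makes each read return a whole blank-line-delimited block instead of a single line.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): read blank-line-separated
# records from a filehandle and return each record joined into a single line.
sub records_as_lines {
    my ($fh) = @_;
    local $/ = "";    # paragraph mode: each read returns one whole block
    my @records;
    while ( my $block = <$fh> ) {
        chomp $block;    # in paragraph mode, chomp strips all trailing newlines
        my @lines = grep { length } split /\n/, $block;
        push @records, join( ' ', @lines );
    }
    return @records;
}

# Made-up sample input, read through an in-memory filehandle for illustration.
my $sample = "AAAA\nfirst description\n\nBBBB\nsecond description\n";
open my $in, '<', \$sample or die "Cannot open in-memory file: $!";
print "$_\n" for records_as_lines($in);
close $in;
```

With a real file you would open it by name instead of using the in-memory string. If your records are not separated by blank lines, the same approach still works with a different value of $/, or you can read line by line and accumulate lines until you see the start of the next record.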