I'm a Python newbie.
I've been searching for days, but have found only bits and pieces of what I have in mind.
I'm using Python 2.7 on Windows (I chose Python because it's multiplatform and the result can be portable on Windows).

I'd like to make a script that searches a folder for *.txt UTF-8 text files, loads the content (one file after another), converts non-ASCII characters to HTML entities, and then adds HTML tags at the start and at the end of each line. There are two variations of the tags: one for the head of the file and one for the tail of the file; head and tail are separated by an empty line. After that, the whole result has to be written out to other text file(s), like *.htm. To illustrate:

Indenting 4 spaces creates a code block. Edit your question so that it is more readable.
– sgallen, Jan 22 '12 at 14:29

I used indenting for the first time, but missed the empty line before each indented block.
– Tib, Jan 22 '12 at 14:35

I'm not sure I really understand your question. I've tried your last script, and it seems to produce the result you are looking for; the result also looks OK in the browser. Can you show the results of your testing, with notes on where the result is wrong?
– snim2, Jan 22 '12 at 14:42

@snim2 For me it messed up the result: the closing tag appears at the start of the line, the first letter is deleted, and nothing is added at the end of the line. For example, if the source line is 'text', the result is: </p>ext
– Tib, Jan 22 '12 at 14:48

Added import sys. Now it works, but it only prints the lines out, and I'd like them written out to *.htm text file(s). Is there a fileoutput, like fileinput?
– Tib, Jan 22 '12 at 16:35

@Tib: there are multiple options, e.g., you could wrap textfiles() to copy each '.txt' file with shutil.copy2() and then yield the '.html' filenames to fileinput (use inplace=True in this case). Or close the current output file and open a new one whenever fileinput.isfirstline() is true.
– J.F. Sebastian, Jan 22 '12 at 16:45
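
The first option in the comment above could look roughly like the sketch below. It is only an illustration under assumptions: copied_html_names and wrap_in_place are hypothetical names (the textfiles() helper mentioned in the comment is not shown here), and the classes 'aaa' and 'bbb' are placeholders. With inplace=True, fileinput redirects sys.stdout into the file currently being read, so whatever is written to sys.stdout replaces that file's content. The entity conversion is left out to keep the sketch short.

```python
import fileinput
import glob
import os
import shutil
import sys


def copied_html_names(folder):
    """Copy each *.txt file in folder to *.html and yield the name of the copy."""
    for txt_name in glob.glob(os.path.join(folder, '*.txt')):
        html_name = os.path.splitext(txt_name)[0] + '.html'
        shutil.copy2(txt_name, html_name)
        yield html_name


def wrap_in_place(folder):
    """Rewrite every copied .html file so that each line is wrapped in <p> tags."""
    names = list(copied_html_names(folder))
    if not names:
        return  # with an empty list, fileinput would fall back to sys.argv / stdin
    class_name = 'aaa'
    for line in fileinput.input(names, inplace=True):
        if fileinput.isfirstline():
            class_name = 'aaa'  # a new file starts: back to the head class
        if line.strip() == '':
            class_name = 'bbb'  # the empty line separates head from tail
        else:
            # sys.stdout points at the current .html file while inplace=True
            sys.stdout.write('<p class="{0}">{1}</p>\n'.format(
                class_name, line.rstrip('\r\n')))
    fileinput.close()
```

The original .txt files stay untouched, because only the copies are passed to fileinput.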

I have to investigate and read the docs, because I did not fully understand what you wrote :-) Remember, I've just started with Python :-)
– Tib, Jan 22 '12 at 16:58

    with open('utf8.txt') as f:
        class_name = 'aaa'
        for line in f:
            if line == '\n':
                # an empty line separates head from tail: switch the class
                class_name = 'bbb'
            else:
                # decode / convert line
                line = '<p class="{0}">{1}</p>\n'.format(class_name, line.rstrip())
                # write line to file
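
To make the outline above concrete, here is a minimal sketch of the whole conversion, assuming the head and tail classes 'aaa' and 'bbb' from the snippet. The helper names line_to_html, convert_file, and convert_folder are hypothetical, and the 'utf-8-sig' codec is my assumption, chosen so that a byte order mark written by some Windows editors is skipped. It uses io.open so that it runs on both Python 2.7 and Python 3, and it does not escape &, < or > in the text.

```python
import glob
import io
import os


def line_to_html(line, class_name):
    """Wrap one line of text in a <p> tag, escaping non-ASCII characters."""
    # Strip the line ending first, so that no stray '\r' is left behind.
    text = line.rstrip(u'\r\n')
    # 'xmlcharrefreplace' replaces each non-ASCII character with &#NNN;
    escaped = text.encode('ascii', 'xmlcharrefreplace').decode('ascii')
    return u'<p class="{0}">{1}</p>\n'.format(class_name, escaped)


def convert_file(src_path, dst_path):
    """Write an HTML version of src_path: 'aaa' before the empty line, 'bbb' after."""
    class_name = u'aaa'
    # 'utf-8-sig' reads UTF-8 with or without a byte order mark (assumption).
    with io.open(src_path, encoding='utf-8-sig') as src:
        with io.open(dst_path, 'w', encoding='ascii') as dst:
            for line in src:
                if line.strip() == u'':
                    class_name = u'bbb'  # the empty line separates head from tail
                else:
                    dst.write(line_to_html(line, class_name))


def convert_folder(folder):
    """Convert every *.txt file in folder to a *.htm file next to it."""
    for src_path in glob.glob(os.path.join(folder, '*.txt')):
        dst_path = os.path.splitext(src_path)[0] + '.htm'
        convert_file(src_path, dst_path)
```

Calling convert_folder('.') would then write a .htm file next to every .txt file in the current folder.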

The results you are getting do not appear to be caused by the regular expressions, which look correct. The problem is most likely in the line where you do your encoding / converting. Print that line without adding the tags to see whether it is as expected.
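
One way to do that check is to print the repr() of each raw line, which makes invisible characters visible. This is only a sketch, and raw_lines is a hypothetical helper. One possible cause worth ruling out (an assumption, since the script in question is not shown here) is a carriage return left at the end of each line: '<p>text\r</p>' printed in a console displays as '</p>ext', because '\r' moves the cursor back to the start of the line.

```python
import io
from itertools import islice


def raw_lines(path, limit=5):
    """Return repr() of the first `limit` lines of a file, read as raw bytes."""
    # Binary mode: line endings are not translated, so '\r\n' stays visible.
    with io.open(path, 'rb') as f:
        return [repr(line) for line in islice(f, limit)]
```

If the output for your file shows '\r\n' at the end of each line, stripping it with line.rstrip('\r\n') before adding the closing tag should remove the symptom.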