Is it a non-trivial problem to do a regular expression search in a text file where the pattern spans multiple lines? I've been looking at the re module, and I understand well enough how to do regular expression searches or matches against a string; I've successfully done so in some experimental Python programs I've written. However, when the text file is large (tens of megabytes), I really don't want to have to do a 'file.readlines' and then join the whole furshlugginer thing into one giant string just to run a regex search whose pattern may span several lines. Currently I can search line by line, but an easy way to do pattern matching across multiple lines just isn't coming to me. I can think of how to do it if I hand-coded the pattern matching specifically for this problem, but I don't want to do that because it seems like more work than I should be doing!

Eh... as in most things, the solution is probably so simple I'm overlooking it... it's probably even in the Python library reference, even though I've combed through that thing many times over for all things regular expression related...
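For what it's worth, one approach that seems to fit is memory-mapping the file and handing the map straight to the re module, so the pattern can cross line boundaries without anything being read into a Python string. This is only a minimal sketch: the function name search_multiline and the BEGIN/END pattern in the usage note are made up for illustration, the pattern has to be a bytes pattern because the map exposes raw bytes, and an empty file cannot be mapped this way.

```python
import mmap
import re


def search_multiline(path, pattern, flags=re.DOTALL):
    """Return (start, end, matched_bytes) for every match of a bytes pattern.

    Hypothetical helper for illustration. The file is memory-mapped, so the
    operating system pages it in on demand instead of Python building one
    giant string. re.DOTALL lets '.' match newlines, which is what allows a
    pattern to span several lines.
    """
    regex = re.compile(pattern, flags)
    with open(path, "rb") as f:
        # Length 0 maps the whole file; this raises ValueError on an empty file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Collect results before the map is closed.
            return [(m.start(), m.end(), m.group(0)) for m in regex.finditer(mm)]
```

Calling search_multiline("big.txt", rb"BEGIN.*?END") would then return byte offsets and matched bytes for each block. The offsets are byte positions, not character positions, so the matched bytes need an explicit decode if the text is not plain ASCII.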

And on a semi-related note...

... in general, how would one handle replacing a block of text somewhere in the middle of a largish text file, again without putting the whole thing into memory? The problem I see is that if you just seek to the place where you want to start replacing and write, a replacement longer than the block being replaced will overwrite stuff that shouldn't be overwritten. I suppose one could write all the changes to a temporary file in some arbitrary format (XML or some custom format), where each change records the line number and position in the line where it starts and which lines it replaces. For smallish files like configuration files, I suppose this approach is fine, but what if you have a text file some million-odd lines long, like a novel (if novels even aspire to such a great number of lines)? And if the way I've described is about the most sensible, obvious way to do it, what can be done to speed up the process?
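A simpler alternative to a change log, sketched here under some assumptions, is to stream the file into a temporary copy: copy everything before the block, write the replacement, copy everything after the block, then swap the copy into place. Only one chunk is in memory at a time, and the replacement can be longer or shorter than what it replaces. The name replace_span and its byte-offset parameters are hypothetical; the offsets could come from a regex match's start() and end().

```python
import os
import shutil
import tempfile


def replace_span(path, start, end, replacement, chunk_size=1 << 20):
    """Replace bytes [start, end) of the file at path with replacement (bytes).

    Hypothetical helper for illustration. The temporary file is created in
    the same directory so that os.replace stays on one filesystem.
    """
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    try:
        with open(path, "rb") as src, os.fdopen(fd, "wb") as dst:
            # Copy the untouched prefix in bounded chunks.
            remaining = start
            while remaining > 0:
                chunk = src.read(min(chunk_size, remaining))
                if not chunk:
                    break  # file is shorter than start
                dst.write(chunk)
                remaining -= len(chunk)
            dst.write(replacement)
            # Skip the old block, then copy the rest of the file.
            src.seek(end)
            shutil.copyfileobj(src, dst, chunk_size)
        # Swap the finished copy in over the original.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial copy; the original file is left untouched.
        os.unlink(tmp_path)
        raise
```

The cost is one full rewrite of the file per call, so several edits are better applied in a single pass over the file than one call at a time. Note that the copy made by mkstemp does not inherit the original file's permissions or ownership; restoring them is left out of this sketch.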

Although I suppose I probably won't come across a situation where a single text file is large enough to have an impact on processing speed, it would still be nice to know, especially since this particular problem, along with the regex search across multiple lines, has all but befuddled me...

Thanks in advance for any insights, or even successful search keywords! I've been looking for a while myself, but if such exists somewhere I've probably been using bad keywords AND have been trying to do web searches when my brain hasn't quite started functioning yet (me is a HORRIBLE morning person)...