Linux - Newbie
This Linux forum is for members that are new to Linux.
Just starting out and have a question?
If it is not in the man pages or the how-to's this is the place!

Welcome to LinuxQuestions.org, a friendly and active Linux Community.

I'm not too familiar with regex syntax and am looking for help with a text parser. I believe sed or awk would probably do it...

I have a text file, basically a code file.

The interpreter doesn't like tabs or comments, and it wants each command on its own line, with the complete command on one line only. I do like formatting, though: it helps me read the code and understand what is going on when I look at it a week later.

Different text editors will replace a <tab> with various combinations of special characters or consecutive spaces.

So I need to replace:
- tab
- /t
- carriage return / line feed
- everything on a line after // (comments)
- consecutive spaces

... with a single space, in that order. Collapsing consecutive spaces comes last, so the earlier replacements don't add up to multiple spaces.
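The list above can be sketched as a small pipeline. This is only a rough sketch under a few assumptions: `clean_stream` is a made-up name, `tr` is assumed to understand the `\t` and `\r` escapes and the `-s` (squeeze) option, and the literal `/t` item is left out because it is not clear what text the editors actually write. Line feeds are also left alone here, so each input line stays on its own line.

```shell
# clean_stream: hypothetical helper, reads stdin and writes stdout.
clean_stream() {
    tr '\t\r' '  ' |          # tabs and carriage returns become spaces
    sed 's|//.*||' |          # drop everything from // to the end of the line
    tr -s ' ' |               # squeeze runs of spaces into a single space
    sed 's/^ //; s/ $//'      # trim the one space that may remain at either end
}
```

With the file names used in this thread it would be called as `clean_stream < test3.lge.txt > test3.lge`.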

Nope, not cheating on homework. This is part of a much larger program that I am working on with a team. It is a minor addition to the project, so none of the main engineers want to divert time to write and test the script, and I am pretty busy at the moment. Learning awk looks like it will take me some time...

The script as you have it doesn't work on my OS (embedded Linux, BusyBox v1.00), but by following your lead (and studying just the regexes you used rather than the whole book) I have got most of the way with:

Sure - the regexp to search for to get the carriage returns will look something like "\r$" (which matches a carriage return at the very end of the line)
or "\r" (which matches a carriage return anywhere in the line).

[user@Dreadnaught config]$ cat test3.lge.txt
Line 1 has words with the letter r in it
Line 3 ends with a space
Line 5 next line is 2 spaces
[user@Dreadnaught config]$ sed 's/  */ /g; s/\/\/.*//g; /^$/d; s/\r/ /g' test3.lge.txt >test3.lge
[user@Dreadnaught config]$ cat test3.lge
Line 1 has wo ds with the lette in it
Line 3 ends with a space
Line 5 next line is 2 spaces
[user@Dreadnaught config]$
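In the transcript above, `s/\r/ /g` blanked out every letter r, which suggests this sed reads `\r` as a plain r rather than as a carriage return. A rough way around that, assuming `printf` is available, is to put a real carriage-return character into the expression through a shell variable. The names `cr` and `strip_cr` are made up for illustration.

```shell
# Hold a real carriage-return byte. Command substitution strips trailing
# newlines, but a trailing carriage return survives.
cr=$(printf '\r')

# strip_cr: hypothetical helper, removes a carriage return at the end of each line.
strip_cr() {
    sed "s/${cr}\$//"
}
```

If carriage returns should go wherever they appear, `tr -d '\r'` is a simpler option where tr supports that escape.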

I just played with the order a bit to remove double spaces created by the tr command (it can't take null / '' as a second argument). Now I would like to be able to join lines that are not separated by blank lines (i.e. convert paragraphs into lines, one line per paragraph). It's getting tougher, but getting much closer...
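Joining each paragraph onto one line is something awk's paragraph mode can do, if awk is available on the target (an assumption; the thread does not say whether this BusyBox build includes it). With `RS` set to the empty string, awk treats blank lines as record separators and newlines as field separators, so rebuilding the record joins its words with single spaces. `join_paragraphs` is a made-up name.

```shell
# join_paragraphs: hypothetical helper, reads stdin and writes stdout.
join_paragraphs() {
    sed 's/ *$//' |                              # strip trailing spaces so space-only lines count as blank
    awk 'BEGIN { RS = "" } { $1 = $1; print }'   # one record per paragraph; $1 = $1 rebuilds it with single spaces
}
```

Because the record is rebuilt from its fields, runs of spaces inside a paragraph are squeezed as a side effect.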

My interim solution of writing the code with -- on blank lines will do me for now.

i.e.

Code:

[user@Dreadnaught config]$ cat test3.lge.txt
Line 1 has words with the letter r in it
//Comments on line 2
--
Line 4 ends with a space
--
Line 6 next line is 2 spaces
--
All above lines should stay on their own line
--
Lines 10 through 12
are considered a paragraph
they should end up on one line
[user@Dreadnaught config]$ sed 's/\/\/.*//g; /^$/d' test3.lge.txt | tr '\n' ' ' | tr '\-\-' '\n' | sed 's/\  */ /g; s/^ //g' >test3.lge
[user@Dreadnaught config]$ cat test3.lge
Line 1 has words with the letter r in it
Line 4 ends with a space
Line 6 next line is 2 spaces
All above lines should stay on their own line
Lines 10 through 12 are considered a paragraph they should end up on one line [user@Dreadnaught config]$
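One thing to watch in the pipeline above: as far as I know, tr translates single characters rather than strings, so `tr '\-\-' '\n'` turns every hyphen into a newline, including any hyphen inside a command. A rough alternative, again assuming awk is available, treats only a line that is exactly `--` as a separator. `join_on_marker` is a made-up name.

```shell
# join_on_marker: hypothetical helper, reads stdin and writes stdout.
join_on_marker() {
    awk '
        $0 == "--" { if (buf != "") print buf; buf = ""; next }   # separator line: flush the paragraph
        $0 == ""   { next }                                       # ignore empty lines
        { buf = (buf == "" ? $0 : buf " " $0) }                   # append the line to the paragraph
        END { if (buf != "") print buf }                          # flush the last paragraph
    '
}
```

Unlike the tr version, this ends the output with a newline, so the shell prompt would start on its own line.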

[user@Dreadnaught config]$ cat cleanlge.sh
if [ $# -ne 1 ]; then
echo "cleanlge script"
echo
echo "function:"
echo " cleans comments from lge file, puts all commands on one line"
echo
echo "usage:"
echo " . ./cleanlge.sh filename"
echo " will clean up filename.lge.txt and save as filename.lge"
echo
else
echo "--------Original File------------"
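cat "$1.lge.txt"
echo "--------Cleaned File-------------"
# Strip comments and trailing spaces, mark empty lines with //, join all lines,
# split again at the // markers, then squeeze spaces and drop empty lines.
sed 's/\/\/.*//g; s/ *$//; s/^$/\/\//' "$1.lge.txt" | tr '\n' ' ' | sed 's/\/\//\n/g; s/  */ /g' | sed 's/^ //g; /^[#tab]*$/d' > "$1.lge"
cat "$1.lge"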
echo
echo "---------------------------------"
fi
[user@Dreadnaught config]$ . ./cleanlge.sh test3
--------Original File------------
Line 1 has words with the letter r in it
//Comments on line 2
Line 4 ends with a space //Comments at the end of line 4
Line 6 next line is 2 spaces
All above lines should stay on their own line
Lines 10 through 12 //and they have comments on each line
are considered a paragraph //more comments
they should end up on one line
--------Cleaned File-------------
Line 1 has words with the letter r in it
Line 4 ends with a space
Line 6 next line is 2 spaces
All above lines should stay on their own line
Lines 10 through 12 are considered a paragraph they should end up on one line
---------------------------------
[user@Dreadnaught config]$
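The one-liner in the script leans on a marker trick: empty lines are turned into `//` (safe, since comments were stripped first), everything is joined onto one line, and each marker is then turned back into a line break. Here is the same idea spelled out stage by stage as a rough sketch. It reads stdin instead of a file, writes the line break as a backslash followed by a real newline in case a sed does not accept `\n` in the replacement, and leaves out the `[#tab]` pattern. `clean_lge` is a made-up name.

```shell
# clean_lge: hypothetical helper, reads stdin and writes stdout.
clean_lge() {
    sed 's|//.*||; s/ *$//; s|^$|//|' |   # drop comments and trailing spaces, mark empty lines with //
    tr '\n' ' ' |                          # join everything onto one line
    sed 's|//|\
|g' |                                      # turn each // marker back into a line break
    sed 's/  */ /g; s/^ //; s/ $//; /^$/d' # squeeze spaces, trim the ends, delete empty lines
}
```

It would be called as `clean_lge < test3.lge.txt > test3.lge`. Using `|` as the sed delimiter avoids the `\/\/` escaping in the original.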