Here is possibly something to get you going. You describe your input as a CSV, but your example and description of the data don't match that format, so I have assumed your input is chunks of key/value pairs. The script reads each CHIP .. END chunk of your input data, parses it into a hash, then outputs it in your desired format. I'm sure you will figure out how to read from and write to actual files.
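A sketch of that approach might look like the following. The chunk layout, field names (name, size) and output format here are illustrative assumptions, since the exact input wasn't shown:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse CHIP .. END chunks of "key value" pairs into a hash, then
# emit each chunk in the target format when END is reached.
my %chip;
while (my $line = <DATA>) {
    chomp $line;
    if ($line =~ /^CHIP\s+(\S+)/) {
        %chip = (name => $1);                    # start of a new chunk
    }
    elsif ($line =~ /^END/) {
        printf "{%s, %s}\n", $chip{name}, $chip{size};   # emit the chunk
    }
    elsif ($line =~ /^(\w+)\s+(\S+)/) {
        $chip{$1} = $2;                          # ordinary "key value" line
    }
}

__DATA__
CHIP XXX
size 256
END
```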

Thanks for the code. It's working as per the format I wanted. In the CSV file there is a cmd column whose information shouldn't appear in the output file, because this information is fixed while the rest of the information might change. What modification has to be made inside the while loop so that the cmd column is not printed and the rest of the information is printed in the output file as below?

{0x2501, FLASH_CONFIGURATION_DEVICE(256, yyy, 3, 0, 0, 0, 20, 0xBF)}

Also, if I don't include the END statement at the end and the rest of the information is in the CSV file in the following format

vendorid, memid, size, devid, cmd, addr, mode, dummy, conf_wid, rate

CHIP XXX 0xBF, 0x2501, 256, yyy, 1, 3, 0, 0, 0, 20

CHIP YYY 0xBF, 0x2501, 512, yyy, 1, 3, 0, 0, 0, 20

what changes have to be made inside the while loop so that it produces the same output as before (with no cmd column, as mentioned above)? Actually, the END statement was only used for easy parsing, and it seems it is not required in the CSV file. I tried editing your code but couldn't reach the desired output. Your help will be really appreciated.

Once again, heartfelt thanks for the Perl code. What I observed is that, starting from

"If the input format is guaranteed regular then an alternative might be to set the input record separator to double newline and process each chunk:"

the Perl script written below that point is not generating the desired output for the input. Could you please check it on your system once? After I ran the code you wrote on my system, it gave a "Missing arguments in printf message.." warning and did not produce the correct output.

Running that piece of code on http://www.compileonline.com/execute_perl_online.php, I don't receive the error that you do. In fact the output is blank, because copying and pasting the code appears to add extra whitespace to the end of the blank rows of data. I'm not sure why, but I temporarily resolved it using $/ = "\n \n";.

If you want an alternative that processes the same structure of data, use the 4th example above but adjust the modulus condition to next if $. == 1 or $. % 3 != 1;.
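In context, the adjusted loop might look like this (a sketch; the surrounding code is assumed from the earlier examples):

```perl
while (<DATA>) {
    next if $. == 1 or $. % 3 != 1;   # skip the header and non-record lines
    # ... process the record line as before ...
}
```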

Otherwise, could you please post a) the exact code you are executing, b) the data you are running against, and c) the output/error, so that I can replicate your issue.

I have run your Perl code on http://www.compileonline.com/execute_perl_online.php and I observed that it runs fine on that website. So what I figured out is that my version of Perl is different from the one used by the website. The website is using v5.10.1 and I have Perl 5.14.2. So, as you requested, here is the code and the output associated with it

Code

#!/usr/bin/perl

use strict;
use warnings;

open (DATA, "<input.txt") or die "Can't open file $!";

while ( <DATA> ) {
    next if $. == 1 or not $. % 2;

Now can you please tell me why this happens when the version is changed? Also, can you please explain the condition you used in the last 3 examples, and, if someone has wrongly entered insufficient information in the data, how to catch that and print an error in the output file?

I concur with Bill. Remove the blank lines, or adjust the code, i.e. include "/^\s*$/ or" in the next condition, etc.
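Combined with the condition from the earlier snippet, that adjustment would read (a sketch):

```perl
# skip blank lines, the header line, and even-numbered lines
next if /^\s*$/ or $. == 1 or not $. % 2;
```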

Here are two last examples which are much more forgiving with regard to the data format, but by no means flawless. In the first, I have broken up the core processes into separate functions, including basic validation. The second takes a regexp approach. There should be enough examples by now for you to fine-tune to suit your particular dataset format.
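A sketch of the first approach, with the core processes split into separate functions and basic validation; the field names and sample data are assumptions:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @headers;

# trim trailing whitespace and split on comma with optional surrounding space
sub parse_line {
    my ($line) = @_;
    $line =~ s/\s+$//;
    return split /\s*,\s*/, $line;
}

# basic validation: expect one value per header column
sub valid {
    my @fields = @_;
    return @fields == @headers;
}

while (my $line = <DATA>) {
    next if $line =~ /^\s*$/;          # skip blank lines
    my @fields = parse_line($line);
    if (!@headers) {
        @headers = @fields;            # first valid line is the header
        next;
    }
    unless (valid(@fields)) {
        warn "line $.: wrong number of fields\n";
        next;
    }
    my %record;
    @record{@headers} = @fields;       # hash slice: pair headers with values
    printf "{%s, %s}\n", @record{qw/vendorid memid/};
}

__DATA__
vendorid, memid, size
0xBF, 0x2501, 256
```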

I need to keep the input CSV file information in a hash array and then print it to a C file, so that in future, if someone adds a new column in between, the parsing of the data does not have to change. With a hash array, the column headers can be used as hash keys to get the values from the respective columns, and the input file is now a proper CSV file. Below is the input file.

Perhaps you could try and write the script yourself based on what you have learnt so far and if you get stuck post the code / problem. I'm sure you will agree that this is the best approach if you want to develop your Perl skills.

As you can see there are no gaps/spaces in the input file between the columns, because whenever I was putting spaces in, the parsing was not happening and an error was thrown. I am unable to tackle this issue. I want flexibility in terms of the spaces I put between columns, but I am not finding the right Perl code for that.

So, these are the issues. Can you please go through the code once and tell me the changes I need to make to get the output below?

Firstly, it looks as though you are making good progress, so well done for that!

Rather than give you a working version right away, let's see if we can address some of your issues, and perhaps you can fix your code yourself.

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

At the top of every script you write you should always use strict and warnings:

Code

use strict;
use warnings;

These will report on various issues during compilation and run time. You'll notice the following:

Code

Global symbol "@name" requires explicit package name at line 13.
Global symbol "@name" requires explicit package name at line 17.
Execution aborted due to compilation errors.

You declare the variable @names at the top, but actually use @name.

Upon rerunning the following error is reported:

Code

Can't use an undefined value as an ARRAY reference at line 26, <DATA> chunk 4.

This is because you have tried to use a non-existent hash entry 'memid' as an array reference. When you split the line, you didn't account for the space after each comma, so the key is actually ' memid'.

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

When you enter the while loop, the first thing you will need to do is remove the newline character from the end of each line. You would usually use the chomp function to do this; however, you have additional whitespace, so I suggest using a regexp. You also mentioned you have removed the blank lines, but alternatively you can just skip them with another regexp:

Code

while (<DATA>) {
    next if m/^\s*$/;   # skip blank lines
    s/\s+$//;           # remove any whitespace from the end of the line
    ...

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

You can then go ahead and split the line as you have done, however incorporate the space after the comma, and just to be on the safe side consider incorporating potential space before the comma:

Code

my @list = split(/\s*,\s*/);

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Now remember, you already have the row as a list in @list. When you need to store the header, just copy this list:

Code

@names = @list;

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

You have decided to store your data in a hash of arrays, then iterate through it at the end to actually print out the data. It's perfectly possible to use a hash of arrays, but it doesn't make as much sense as using an array of hashes, since each row is unique/individual. I would use the hash slice technique I showed you in the previous post, and push the hash onto the array:

Code

my %hash;
@hash{@names} = @list;
push @data, \%hash;

This will ensure that if a new column is added, you don't have to make adjustments to the code. An alternative is to use an array of arrays, then worry about matching each value to its column name later on:
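For instance, a sketch of the array-of-arrays alternative, reusing the @names, @list and @data variables from the discussion above (field names assumed):

```perl
# store each row as a plain array reference
push @data, [ @list ];

# ... then pair values with the header names only at print time:
for my $row (@data) {
    my %record;
    @record{@names} = @$row;
    printf "%s, %s\n", @record{qw/vendorid memid/};
}
```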

Before attempting to print out the data, another useful module is Data::Dumper. You can dump your data structure to stdout and inspect it to check it's how you expected:

Code

use strict;
use warnings;
use Data::Dumper;

my @data;

while (<DATA>) { ... }

print Dumper \@data;

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Use the technique I showed you in the previous post in a loop to print the data:

Code

for (@data) {
    printf "%s, %s\n", @{$_}{qw/vendorid memid/};
}

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

I've actually used some bad practices here; it's better to use named variables whenever possible. Code like that is easier to understand, and you won't have problems if $_ is reassigned, i.e. a loop within a loop:
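For example, the print loop above rewritten with a named loop variable instead of $_:

```perl
for my $record (@data) {
    printf "%s, %s\n", @{$record}{qw/vendorid memid/};
}
```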

It's working fine now. Below is the code. Can it be better than this? Here I have separated the header first and then parsed the other values. Is it possible to parse the header as well and put the entire thing under one while loop? Right now, I'm using two. Also, what else can be improved?

2) The modern approach to opening a file is to use the 3-way form of open with a lexical filehandle, rather than a bareword filehandle.
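That is (a sketch, with the filename assumed):

```perl
open my $fh, '<', 'input.txt' or die "Can't open file: $!";
while (my $line = <$fh>) {
    # ... process $line ...
}
close $fh;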

3) In your previous code snippet, you used the 'if ( $. == 1 ) {' condition to handle the header; is there any reason you can't do this? I.e. it assumes the header is on line 1. If you aren't already aware, $. contains the line number you are currently reading from the filehandle, therefore the above condition fundamentally checks whether we are on line 1. In my version below I assume the first valid line reached is the header line. I simply check whether @headers contains any values; if it doesn't, then we must be on the header line and we assign it some values. On subsequent iterations @headers will contain values, therefore the lines are assumed to be records.
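The @headers check described above might be sketched like this (a fragment; $fh and the split pattern are assumed from the surrounding discussion):

```perl
my @headers;
while (my $line = <$fh>) {
    next if $line =~ /^\s*$/;          # skip blank lines
    $line =~ s/\s+$//;                 # trim trailing whitespace
    my @fields = split /\s*,\s*/, $line;
    if (!@headers) {
        @headers = @fields;            # first valid line is assumed to be the header
        next;
    }
    # @headers already populated, so @fields must be a record
}
```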

4) Although not necessary, it's good practice to close the filehandle once you have finished with it.

5) Although not implemented in my version below, you don't actually have to store the records in @records; you might as well print them directly, i.e.:
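A sketch of what that would look like in the loop body (field names assumed):

```perl
# instead of: push @records, \%record;
printf "{%s, %s}\n", @record{qw/vendorid memid/};   # print each record as it is read
```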

I thought of doing that, but since we are Perl programmers, who typically strive to be lazy, I didn't want to waste the time typing those 4 extra characters to save a nano- or picosecond of CPU time. :-)

Regarding your last comment, I haven't followed the thread in enough detail to say if there is or isn't a better approach to removing that condition.