The results (from the program) are smaller by 0.0010. Do you have any idea whether it is because of rounding or something else? I've tried with different Fortran compilers (Silverfrost FTN95, g77 on Red Hat, MinGW, and Cygwin).

> The results (from the program) are smaller by 0.0010. Do you have any idea
> whether it is because of rounding or something else? I've tried with
> different Fortran compilers (Silverfrost FTN95, g77 on Red Hat, MinGW, and
> Cygwin).

>                BB.dat Result (program)   Correct values (calc)
> 42069.3734     17453.9886                17453.9896
> 42069.3931     17454.0083                17454.0093
> 42069.4000     17454.0152                17454.0162
> 42069.4053     17454.0205                17454.0215
> 42069.4096     17454.0248                17454.0258

> This is a part of BB.dat:
>
> 17453.9886 -0.1530
> 17454.0083 -0.1270
> 17454.0152 -0.1580
> 17454.0205 -0.1640
> 17454.0248 -0.1530
> 17454.0295 -0.1540
> 17454.0360 -0.1420
> 17454.0405 -0.1720
> 17454.0441 -0.1590
> 17454.0531 -0.1770
> 17459.9213  2.0370
> 17459.9345  2.3570
> 17459.9498  2.4200
> 17459.9550  2.4040

You have a classic problem with finite precision here: The computations on the computer are done with binary numbers, whereas a calculator uses decimal numbers.

There is no way of getting exactly the same results with these two computational systems.

For instance: 1/5 is exactly representable in a decimal system with a finite number of digits: 0.2. 1/3 on the other hand requires an infinite number of digits: 0.3333...

For a binary system this works the same way, only that now 1/5 is in the same category as 1/3.
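You can watch this happen in Fortran (a minimal sketch of my own; the printed digits assume IEEE 754 arithmetic):

program fifth
  implicit none
  real :: x
  double precision :: y
  x = 0.2      ! nearest single-precision binary fraction to 1/5
  y = 0.2d0    ! nearest double-precision binary fraction to 1/5
  print '(a,f22.17)', ' single: ', x   ! 0.20000000298023224
  print '(a,f22.17)', ' double: ', y   ! 0.20000000000000001
end program fifth

Neither printed value is exactly 1/5; each is merely the closest binary fraction available at that precision.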

The conclusion is: you need to be aware of this problem and accept it.

There is plenty of material on the Internet describing it in more detail. See several recent threads for more information.

Your 24615.3838 is a single precision constant. It loses precision when translated to floating point (too many digits) and then that less-precise value is converted to double-precision before the subtraction. Use 24615.3838d0 instead, to indicate that it should be converted directly into double-precision with less loss of accuracy.
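To see the effect with the numbers from this thread (a sketch; only the two constants come from the original post):

program literal_demo
  implicit none
  double precision :: a, b
  a = 42069.3734d0 - 24615.3838     ! single-precision constant, promoted too late
  b = 42069.3734d0 - 24615.3838d0   ! double-precision constant
  print '(f12.4)', a   ! 17453.9886 -- the "wrong" result from the table above
  print '(f12.4)', b   ! 17453.9896 -- the expected result
end program literal_demo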

First, go ahead and add a line to write the numbers that you are reading back out to stdout... you may not be reading them correctly. If that is the case, I would suggest simply not using format 822 in the READ statement; instead, just use list-directed input, as in READ(35,*)... try again.

gsal <salger@gmail.com> wrote:

> First, go ahead and add a line to write the numbers that you are reading
> back out to stdout... you may not be reading them correctly. If that is
> the case, I would suggest simply not using format 822 in the READ
> statement; instead, just use list-directed input, as in READ(35,*)...
> try again.

No. The format does *NOT* affect the accuracy of input. I've seen this misconception come up one other time recently here. I want to make sure that it does not spread.

The OP's problem is undoubtedly the finite precision of floating point arithmetic, and in particular the use of a single precision literal constant, as mentioned by Ivan.

Excuse my short programming experience and ignorance about the internal representation of numbers, etc. I am an engineer turned programmer rather late in life. I started participating in this group not long ago and have become aware of decimal-vs-binary issues, but they are NOT second nature to me, and I mostly ignore them without any problems (fortunately) in my programs.

I did not have the benefit of the two postings before mine (they happened within minutes, so I did not see them) and did not fully understand the posting, so I simply offered some debugging advice.

But enough excuses... all I was indicating was the difference between list-directed and formatted READ, which DOES make a difference.

Let's say I have an input line that looks like:

  4567.778  4437.556
  --------  --------
xxffff.fffxxffff.fff

and I choose to express its format in the following manner:

2x,f8.3,2x,f8.3

If I read it with that format, I get to read the exact numbers as shown in the input line.

On the other hand, if I use a format with the middle 2x missing, i.e.,

2x,f8.3,f8.3

a formatted read command will fail to read the last two digits of the second number, maybe not changing the precision used to internally represent the number, but certainly changing the accuracy of the number altogether... now, inside the program, I have:

  4567.778  4437.5
  ----------------
xxffff.fffffff.fff

More catastrophic damage did not happen thanks to the fact that the number includes a decimal point, which was respected by the format; but, as you know, if I don't place a decimal point at all, or the specified format ends up offset enough not to see it, things get even worse.
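In code, the two reads look like this (a sketch using an internal read so it is self-contained; the line and formats are the ones above):

program field_shift
  implicit none
  real :: a, b
  character(len=20) :: line = '  4567.778  4437.556'
  read (line, '(2x,f8.3,2x,f8.3)') a, b   ! a = 4567.778, b = 4437.556
  print *, a, b
  read (line, '(2x,f8.3,f8.3)') a, b      ! second field shifts: b = 4437.5
  print *, a, b
end program field_shift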

Do you remember the old-fashioned column-oriented input files that old programs used to read? They often did not use decimal points and used the extra column to express one more digit of accuracy. That's all I am talking about.

> Excuse my short programming experience and ignorance about the internal
> representation of numbers, etc. I am an engineer turned programmer rather
> late in life. I started participating in this group not long ago and have
> become aware of decimal-vs-binary issues, but they are NOT second nature
> to me, and I mostly ignore them without any problems (fortunately) in my
> programs.

> I did not have the benefit of the two postings before mine (they happened
> within minutes, so I did not see them) and did not fully understand the
> posting, so I simply offered some debugging advice.

> But enough excuses... all I was indicating was the difference between
> list-directed and formatted READ, which DOES make a difference.

> Let's say I have an input line that looks like:

>   4567.778  4437.556
>   --------  --------
> xxffff.fffxxffff.fff

> and I choose to express its format in the following manner:

> 2x,f8.3,2x,f8.3

> If I read it with that format, I get to read the exact numbers as shown
> in the input line.

> On the other hand, if I use a format with the middle 2x missing, i.e.,

> 2x,f8.3,f8.3

> a formatted read command will fail to read the last two digits of the
> second number,

It is the old problem that it did EXACTLY what you said to do rather than what you intended but did not say. (Some systems have a DWIM - Do What I Mean - debugging subsystem!)

You have said that the last two character positions are NOT part of the number, so it should not read them. Also, the two columns at the "beginning" are now part of the number.

> maybe not changing the precision used to internally represent the number,
> but certainly changing the accuracy of the number altogether... now,
> inside the program, I have:

>   4567.778  4437.5
>   ----------------
> xxffff.fffffff.fff

> More catastrophic damage did not happen thanks to the fact that the
> number includes a decimal point, which was respected by the format; but,
> as you know, if I don't place a decimal point at all, or the specified
> format ends up offset enough not to see it, things get even worse.

> Do you remember the old-fashioned column-oriented input files that old
> programs used to read? They often did not use decimal points and used the
> extra column to express one more digit of accuracy. That's all I am
> talking about.

> So, what exactly is the misconception you don't want spread?

You seem to confuse misspecifying the input, so that it does not correspond with your intentions, with finite-precision effects in the subsequent computations. A "wrong" answer could have many causes, and your diagnosis was misplaced. (Wrong answers are usually caused by an unexpected issue, since you will already have dealt with the expected issues. It is like the definition of "lost": things are not where you expect them to be.)

For small amounts of data, the advice to echo the inputs would seem to be good advice, particularly if you are dealing with the slightly fussy issue of formatting and some hazy memories of what practices were like back when 026 keypunches were avant-garde gear. Spacing was for more than just tidy listings! The formatting fussiness also applies to literals in the program, as they will be the default precision (single) unless you take care to ensure they are of higher precision (double).

gsal <salger@gmail.com> wrote:

> But enough excuses... all I was indicating was the difference between
> list-directed and formatted READ, which DOES make a difference.

[example elided]

Not really. The only difference, which your example illustrated, is when you use an explicit format to tell it to read a different field. As long as you get the field position and width right, the results will be the same. (And yes, there is the funny stuff with omitted decimal points.) The OP's example did have the field position and width ok and had explicit decimal points. In particular, for example, reading a field that contains 1.23456 with something like an f7.2 edit descriptor will not cause it to be read as if it were 1.23. That's a misconception I've seen.

By the way, a terminology correction: both forms discussed are formatted. Specifically, list-directed I/O is a form of formatted I/O. The difference you are talking about is between list-directed and explicit formatting, not between formatted and something that is not formatted.

This is a fairly common terminology error. Getting it wrong does cause people to write code that doesn't work, which is why I tend to correct the error whenever I notice it. People who read descriptions that imply that list-directed is not formatted make mistakes like opening files with form='unformatted' when they intend to use list-directed I/O; that doesn't work.

> So, what exactly is the misconception you don't want spread?

Well, the above example of f7.2 truncating input past the 2nd decimal place was the one I first had in mind. But then, the mislabelling of list-directed as not being a form of formatted I/O is also one.

> In particular, for example, reading a field that contains 1.23456 with
> something like an f7.2 edit descriptor will not cause it to be read as
> if it were 1.23.

I am completely stunned, and in fact wrote a small test program to check just that! Thanks for pointing that out; I was completely unaware of it!
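The test can be as small as this (a sketch with an internal read; the field contents are just an example):

program f72_test
  implicit none
  real :: x
  character(len=7) :: field = '1.23456'
  read (field, '(f7.2)') x
  print *, x   ! prints 1.23456 (approximately), not 1.23
end program f72_test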

It raises, however, another question:

What is the point of having f7.2 rather than any other f7.x in a read statement? It seems that indicating the number of decimal digits is irrelevant, or am I missing something else?

Arno

Arno wrote:

> It raises, however, another question:

> What is the point of having f7.2 rather than any other f7.x in a read
> statement? It seems that indicating the number of decimal digits is
> irrelevant, or am I missing something else?

Yes. You are missing that formatted input was originally used for reading data from 80-column punched cards, and that saving card columns was considered worthwhile.

One might punch a value in seven card columns, and then read it with F7.2 format. That would allow five digits before and two after the decimal point to fit in seven card columns. It still makes some sense reading from a disk file, but not so much reading from a terminal.

DEC made many non-standard modifications to formatted input in the days before list directed (formatted) input became available. Many of those allowed for more convenient terminal input.

-- glen

On Jun 4, 4:22 am, Arno <arnoinp@hotmail.com> wrote:

> > In particular, for example, reading a field that contains 1.23456 with
> > something like an f7.2 edit descriptor will not cause it to be read as
> > if it were 1.23.
...
> Thanks for pointing that out; I was completely unaware of it!
...
> [So] What is the point of having f7.2 rather than any other f7.x in a
> read statement? It seems that indicating the number of decimal digits is
> irrelevant, or am I missing something else?

...

Essentially, think of it as: a decimal point provided in the field "overrides" the format statement's rule for where an implied decimal point is to be inferred. The following is excerpted from the CVF manual for the F descriptor; similar rules apply for E, etc.:

Rules for Input Processing

On input, the F data edit descriptor transfers w characters from an external field and assigns their real value to the corresponding I/O list item. The external field data must be an integer or real constant.

If the input field contains only an exponent letter or decimal point, it is treated as a zero value.

If the input field does not contain a decimal point or an exponent, it is treated as a real number of w digits, with d digits to the right of the decimal point. (Leading zeros are added, if necessary.)

If the input field contains a decimal point, the location of that decimal point overrides the location specified by the F descriptor.

If the field contains an exponent, that exponent is used to establish the magnitude of the value before it is assigned to the list element.
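Those rules can be checked with internal reads (a sketch; the field contents are invented for illustration):

program f_rules
  implicit none
  real :: a, b, c
  character(len=8) :: f1 = ' 1234567'   ! no decimal point
  character(len=8) :: f2 = ' 123.456'   ! explicit decimal point
  character(len=8) :: f3 = '1234.5E2'   ! decimal point plus exponent
  read (f1, '(f8.2)') a   ! implied point from the .2: 12345.67
  read (f2, '(f8.2)') b   ! the field's own point wins: 123.456
  read (f3, '(f8.2)') c   ! exponent applied: 123450.0
  print *, a, b, c
end program f_rules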

>> In particular, for example, reading a field that contains 1.23456 with
>> something like an f7.2 edit descriptor will not cause it to be read as
>> if it were 1.23.

> I am completely stunned, and in fact wrote a small test program to check
> just that! Thanks for pointing that out; I was completely unaware of it!

> It raises, however, another question:

> What is the point of having f7.2 rather than any other f7.x in a read
> statement? It seems that indicating the number of decimal digits is
> irrelevant, or am I missing something else?

> Arno

If the field contains " 123456" (leading blank and NO decimal point), the question is "How many decimal places?". That is when the ".2" comes to your rescue: it is the default count when there is no decimal point in the field.

The answer to your question is that you missed the default case of no explicit decimal point. To read it as an integer value you would need "f7.0".

Arno <arnoinp@hotmail.com> wrote:

> What is the point of having f7.2 rather than any other f7.x in a read
> statement? It seems that indicating the number of decimal digits is
> irrelevant, or am I missing something else?

Others have given you the technical answer, but I'll give you mine, which is "almost nothing". :-) As noted by others, this addresses the case where you omit the decimal point from the data, but although the other posters mentioned that this is a bit of a relic, I'd like to emphasize my advice, which is "don't do that".

Today, I'd say that the most common use of the ".2" in contexts like this is just to be able to use the same format for input and output. It means something for output, and is mostly irrelevant to input. I've done that.

Richard Maine wrote:

> Arno <arnoinp@hotmail.com> wrote:
>> What is the point of having f7.2 rather than any other f7.x in a read
>> statement? It seems that indicating the number of decimal digits is
>> irrelevant, or am I missing something else?

Also, this goes all the way back to the IBM 704, a vacuum tube computer.

(snip)

> Today, I'd say that the most common use of the ".2" in contexts like this
> is just to be able to use the same format for input and output. It means
> something for output, and is mostly irrelevant to input. I've done that.

Maybe that makes sense today. In days past, one needed carriage control, at least an initial 1X, on the output format, but not on the input format. (Unless you want to waste one column.) That was enough to require different FORMAT statements.

On the subject of computers past and formatted output, the description of carriage control in the IBM 704 manual includes:

[manual excerpt elided]

As implemented for S/360, and maybe not too different for the 704, there is a punched paper tape specifying which row on the page carriage control codes 1-9 should skip to. It is usual for 1 to be the top of the page, but that isn't required. Skipping to other parts of the page allows one to print forms much faster than would otherwise be possible. As an example:

[anecdote elided]

(I did know some of the people involved, but I wasn't there at the time.)

Many computer centers charged by the page and also by the number of actual lines of output. For large print jobs, a custom carriage control tape would speed up printing and reduce costs. I believe that one was used for the prank described.

(Though it is well known that the above prank could have been done just as well on more ordinary printing systems.)

-- glen

glen herrmannsfeldt <g@ugcs.caltech.edu> wrote:

> Richard Maine wrote:
> > Today, I'd say that the most common use of the ".2" in contexts like
> > this is just to be able to use the same format for input and output.
> > It means something for output, and is mostly irrelevant to input. I've
> > done that.

> Maybe that makes sense today. In days past, one needed carriage control,
> at least an initial 1X, on the output format, but not on the input
> format. (Unless you want to waste one column.) That was enough to require
> different FORMAT statements.

That is only true if the output was to be printed. From day one, there have also been outputs that were not printed. The standard has always been vague (extremely so; to the point that it had no substantive content, but that's another topic) on what constituted a file to be printed, but such things have always existed.

For the simple task of copying a file (or perhaps the more realistic one of copying it with modifications of some kind), it makes a lot of sense to use the same format for input and output. It makes sense today, and always has.

I've done quite a lot of stuff like that, I might add. Such kinds of simple programming tasks were the kind of thing often given to work-study students in their first job stint. That would have been 1970 for me, which counts as "days past" for most purposes.

>> Maybe that makes sense today. In days past, one needed carriage
>> control, at least an initial 1X, on the output format, but not on the
>> input format. (Unless you want to waste one column.) That was enough to
>> require different FORMAT statements.

> That is only true if the output was to be printed. From day one, there
> have also been outputs that were not printed. The standard has always
> been vague (extremely so; to the point that it had no substantive
> content, but that's another topic) on what constituted a file to be
> printed, but such things have always existed.

Yes. The 704 Fortran has WRITE OUTPUT TAPE for formatted output to tape. (There were no disks; drum I/O seems to have been unformatted only.)

As far as I know, not having actually used a 704, one might write printable data to a tape with the intention of reading it back in, or with the intention of printing it offline. In the former case, no carriage control; in the latter case, likely carriage control.

The DEC Fortran compilers in the 1970s, some of the earlier ones to support terminal I/O, usually supported carriage control. Later on, ASA carriage control was less often used for terminal I/O. (Maybe the trend from printing terminals to video terminals helped.)

> For the simple task of copying a file (or perhaps the more realistic one
> of copying it with modifications of some kind), it makes a lot of sense
> to use the same format for input and output. It makes sense today, and
> always has.

Yes. But actual printed output is much less common today than in the past. (As a fraction of Fortran generated output.)

-- glen

glen herrmannsfeldt <g@ugcs.caltech.edu> wrote:

> Richard Maine wrote:
> > For the simple task of copying a file (or perhaps the more realistic
> > one of copying it with modifications of some kind), it makes a lot of
> > sense to use the same format for input and output. It makes sense
> > today, and always has.

> Yes. But actual printed output is much less common today than in the
> past. (As a fraction of Fortran generated output.)

I won't argue about which is more or less common, as it seems irrelevant to the point and wanders off into something else. I repeat: there have always been cases where it made sense to use the same format for input and output, and there still are such cases. They are cases that come up a lot; they always have and still do. That there are other cases where it doesn't work seems beside the point, which was to explain the situations where it *DOES* make sense. I continue to maintain that this is the most common meaningful usage of formats like f7.2 for input. That was my original claim in this subthread; nothing subsequent has had any relevance to refuting that claim.

> Others have given you the technical answer, but I'll give you mine,
> which is "almost nothing". :-) As noted by others, this addresses the
> case where you omit the decimal point from the data, but although the
> other posters mentioned that this is a bit of a relic, I'd like to
> emphasize my advice, which is "don't do that".

Thanks to you all. It is clear now.

Arno

Arjen Markus <arjen.mar@wldelft.nl> wrote:

> You have a classic problem with finite precision here:
> The computations on the computer are done with binary numbers,
> whereas a calculator uses decimal numbers.

Dr Ivan D. Reid <Ivan.R@brunel.ac.uk> wrote:

> Your 24615.3838 is a single precision constant. It loses precision when
> translated to floating point (too many digits) and then that
> less-precise value is converted to double-precision before the
> subtraction. Use 24615.3838d0 instead, to indicate that it should be
> converted directly into double-precision with less loss of accuracy.

Richard Maine <nos@see.signature> wrote:

> No. The format does *NOT* affect the accuracy of input. I've seen this
> misconception come up one other time recently here. I want to make sure
> that it does not spread.

> The OP's problem is undoubtedly the finite precision of floating point
> arithmetic, and in particular the use of a single precision literal
> constant, as mentioned by Ivan.

Since I'm doubtless the person to whom Richard Maine refers when he says this issue has come up recently on this forum, I'll chime in with my two cents. Yes, I really am still flogging that dead horse.

A literal constant in source code text such as

24615.3838

has the default real type, which is single precision unless you play with the compiler flags to make the default real type something else. On most computers[1], single precision means IEEE 754 single precision, which is a 32-bit word containing a sign bit, eight bits of exponent, and a 23-bit significand (sometimes called the "mantissa") with an implied extra leading 1 bit.

We can examine the binary format used to represent this literal constant with the following test program:
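(The program itself was elided; the following minimal sketch produces output in the same form. TRANSFER views the bit pattern as a default integer, assuming 32-bit default real and integer.)

program literals
  implicit none
  real, parameter :: num = 24615.3838
  write (*, '(e22.16,1x,z8.8)') num, transfer(num, 0)
end program literals

On an IEEE 754 machine this prints:

$ ./literals
0.2461538476562500E+05 46C04EC5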

The first number differs significantly from the literal in the source code text; this has been called "loss of precision" by others in this forum but for reasons I'll go into below I don't like that term for this phenomenon. The second (hexadecimal) number above is the closest single-precision binary approximation to the literal 24615.3838 in the source code text and is the constant as it appears in the executable:

Breaking the hexadecimal 46C04EC5 into its fields (sign 0, exponent 10001101 = 141, i.e. a scale of 2**(141-127) = 2**14, and significand C04EC5 = 12603077 once the implied leading bit is restored), we can work out that the constant as stored in the executable is exactly

12603077 * 2**(-23) * 2**(14) = 12603077 / 512

Note that the internal floating-point representation uses a binary radix -- the denominator in the fraction above will always be a power of two. The decimal value is approximately 24615.3847656. That was the closest IEEE 754 single precision number to 24615.3838 that the compiler could find. That is the number that is stored in the read-only data ELF section of your executable when you put "24615.3838" into the text of your source code.

Note that the IEEE 754 single precision format has a 24-bit mantissa (the leading 1 bit is implied for non-denormalized numbers), and 2**(-24) is approximately 10**(-7), so you would expect single precision to correspond to approximately 7 significant decimal digits. The number 24615.3838 has nine significant digits, the first seven of which agree with 24615.3847656. This is what other posts have referred to as "loss of precision". I dislike that term, however, because the precision was never there in the first place: the literal constant was always single precision, and the compiler has done an honest job of converting the decimal number to the nearest binary equivalent.

The number in the executable (12603077 / 512) differs from the number in the text of the source code (24615.3838) because the radix used to represent the number in the executable (binary) differs from the radix used to represent the number in the source code (decimal). To see this phenomenon at work choose another number close by and put that in the example program above:

real, parameter :: num = 24615.390625

(etc)

$ ./literals
0.2461539062500000E+05 46C04EC8

Things are looking better according to the first number above, and if we break out the bits again (46C04EC8: sign 0, exponent 10001101 = 141, significand C04EC8 = 12603080 with the implied leading bit restored), the number is 12603080 / 512 = 24615.390625 *exactly*. For this number, we have 11 significant figures, as opposed to the 7 significant figures for 24615.3838. No "loss of precision" here!

What's the difference? 24615.390625 is representable exactly as a binary fraction, so in the conversion from decimal to binary radix, nothing is changed. 24615.3838 isn't representable exactly as a binary fraction, so the compiler has to choose one nearby and uses that instead. I call that "rounding" -- rounding to the nearest binary fraction.

Why do we tolerate rounding? For most engineering and scientific purposes, the decimal literal in your source code is already only an approximation to some other unknown value. The decimal radix conventionally used is, after all, just an accident of evolution -- we might be a duodecimal society if we had evolved with twelve fingers. There is really only one area where the decimal values are always the exact values -- finance. That is why IBM built decimal floating point into their mainframes and now the POWER6 processor (that will probably be the heart of future mainframes).

Finally, a coda on "loss of precision". If I call the radix-conversion phenomenon "rounding", then what is "loss of precision"? I consider it a loss of precision when you store a double precision value into a single precision variable. That has the potential of taking a very small normalized number and converting it to a denormalized number or zero in the narrower format. So you started with 53 bits of precision (the number of bits in the mantissa of the IEEE 754 double precision format) and stuffed it into 24 bits of precision (the number of bits in the mantissa of single precision). You have lost precisely 29 bits of precision. In fact, we could formally define "loss of precision" using the PRECISION intrinsic in Fortran: if the number returned for a value by that intrinsic is smaller, you have lost precision.
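A sketch of that narrowing (variable names are mine; the printed values assume IEEE 754 single and double precision):

program narrowing
  implicit none
  double precision :: d
  real :: s
  d = 1.0d0 / 3.0d0
  s = d                   ! 53-bit significand stuffed into 24 bits
  print *, precision(d), precision(s)   ! typically 15 and 6 decimal digits
  print *, d - s                        ! the small difference between the two approximations
end program narrowing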

The converse, however, is not generally true: storing a single precision value in a double precision variable doesn't gain you back precision. To use Richard Maine's phrase: "precision is like virginity; once lost, it can never be recovered".