| The type of a file and which app you'd like it to open with are items
| of file metadata and have no business being part of the filename.

| Many files have such type-identifiers included. E.g., a JPG file begins
| with JFIF, a WordPerfect file includes WPC in the first line, an MS .doc

| Then you've put the metadata inside the file, which is even worse. It
| should be part of the file system.

This is the problem with mixing Mac and Windows
discussions. As I understand it, Mac stores file data
separately as a "resource fork". Mac users are not
expected to understand anything about files. That's
not the same as metadata.
Resource fork used to be a problem when Mac users
emailed photos. If they didn't know to strip the Mac-
specific prepended data they'd send a corrupt file.

Mac file data: File info stored separately from the file,
only on Macs.

File signature: Sometimes called "magic" or "magic
bytes" -- beginning bytes that *sometimes* identify
a file type. For instance, a BMP bitmap file starts with
hex 42 4D, which in ASCII encoding is "BM". A TXT text
file, on the other hand, usually won't have any header
at all. It *might* have a marker if it's unicode-encoded.
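
To make the signature idea concrete, here is a minimal Python sketch that looks only at the leading bytes. The table is a small illustrative sample, not a complete registry, and the names `SIGNATURES`, `sniff` and `sniff_file` are made up for this example.

```python
# Small, illustrative signature table; real tools know many more formats.
SIGNATURES = [
    (b"BM", "BMP bitmap"),                  # hex 42 4D, ASCII "BM"
    (b"\xff\xd8\xff", "JPEG image"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    # Byte-order marks that a Unicode text file *might* start with.
    (b"\xef\xbb\xbf", "text with UTF-8 BOM"),
    (b"\xff\xfe", "text with UTF-16 LE BOM"),
    (b"\xfe\xff", "text with UTF-16 BE BOM"),
]


def sniff(head: bytes) -> str:
    """Guess a type from leading bytes; 'unknown' if nothing matches."""
    for magic, name in SIGNATURES:
        if head.startswith(magic):
            return name
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the first 16 bytes of the file and guess from those."""
    with open(path, "rb") as f:
        return sniff(f.read(16))
```

Note the "sometimes": a plain text file that happens to begin with the letters "BM" would be reported as a bitmap by this sketch, and ordinary text with no marker comes back as "unknown".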

File headers are data about the file structure, appearing
at the beginning of the file. Headers, or lack of them, vary
widely with file types. BMP header is very simple. TXT has
none. JPG can be extensive, even including a small
thumbnail image. There's no rule about the need for
headers and even when there are headers the rules are
sometimes flexible.

Metadata: Optional file info stored in a file header. Some
of it is standard to the file format. Some is added. For
example, everyone and his brother has made up data
markers to store in JPG files. None of it has to be there.
Some of it is unofficial and not widely supported. It's
been established willy nilly as JPG has become widely used.
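
For anyone curious what is actually sitting in a JPG header, here is a rough Python sketch that walks the marker segments before the image data. It assumes the common layout (a marker byte pair, then a 2-byte big-endian length that counts itself) and ignores rarer cases such as fill bytes. The function name `list_jpeg_segments` is made up for this example.

```python
import struct


def list_jpeg_segments(data: bytes):
    """Return (marker, payload_length) pairs found before the image data."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing start-of-image marker")
    segments = []
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # lost sync; stop rather than guess
        marker = data[pos + 1]
        if marker == 0xDA:
            break  # start of scan: compressed image data follows
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        # The stored length includes its own two bytes.
        segments.append((marker, length - 2))
        pos += 2 + length
    return segments
```

Marker 0xE0 is the JFIF segment, 0xE1 is where EXIF usually lives, and 0xFE is a free-form comment. Running this over photos from different cameras and editors shows how many extra segments get added.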

On the other hand, the file header of PE files (portable
executable EXE, DLL, OCX) is strictly defined to detail
things like imported dependencies, exported functions,
embedded strings and other resources, etc.
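
As a small illustration of how rigid the PE layout is, this Python sketch checks only the two fixed signatures: "MZ" at the start, and "PE" followed by two zero bytes at the offset stored in the 4-byte little-endian field at 0x3C. It does not parse imports, exports or resources. The name `is_pe` is made up for this example.

```python
import struct


def is_pe(data: bytes) -> bool:
    """True if data starts with 'MZ' and points at a valid PE signature."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # The field at offset 0x3C holds the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"
```

Usage would be something like `is_pe(open(path, "rb").read(4096))`, assuming the PE signature falls within the first 4 KB, which is typical but not guaranteed.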

You don't need to "open" a file to see what type it is, in
the sense that you don't have to run it. The hex editor
HxD is free and very good. You can put an Open With HxD
on your right-click menu and look at the file bytes to
see what it is. Here's a guide:

In article , J. P. Gilliver (John)
wrote:
The type of a file and which app you'd like it to open with are items
of file metadata and have no business being part of the filename.

1. I think it does no _harm_ to have it as part of the filename, though.

true, but it's not needed if the info is elsewhere.
2. The use of metadata requires that it be _in the file_, not in
something the OS stores _alongside_ the file - since that can get
separated from it, or corrupted separately.

absolutely wrong.

with metadata in the file, you'd need to open each file just to find
out what type it is, a costly and completely unnecessary operation.

metadata should be stored in the file system itself, not alongside it,
which is what classic mac os did. it worked well.
And since there are
filetypes for which metadata _isn't_ in the file (plain text being the
obvious, but I think some forms of raw image, some hex dumps and the
like ...), that ship has sailed.

yep, it's too late to fix the mistakes of the past.
It's rather like those photo-album
softwares that use their own tags, which get confused if someone moves
one of the image files in explorer without telling the photo-album
software.

the point of asset managers is so users *don't* need to manually move
files around, however, since that is still a possibility, such apps
normally handle that without issue. if they don't, it's a bug.

Can a Macintosh person tell us how to change the name of a file? (Now discussion of metadata)

In article , Tim Streater
wrote:
Many files have such type-identifiers included. E.g., a JPG file
begins with JFIF, a WordPerfect file includes WPC in the first line,
an MS .doc includes "Microsoft Word Document" in plain text in the
header, and so on. Some image viewers will even tell you that the
extension doesn't match the file type, if that happens to be the case.

Then you've put the metadata inside the file, which is even worse. It
should be part of the file system.

On the contrary: I think metadata _should_ be inside the file. That way
it can't be separated, even if the file is moved (or even emailed).

Classic MacOS did this in a much better way with its data fork and
resource fork.

actually, the type/creator was not in either of those, but rather the
finder info (i.e., file system).
What are the standards for metadata inside the file? What's described
above seems pretty random to me.

it is random.
Relying on it being part of the file system only works while you're
inside the same OS, unless you believe in forcing all OSs to have the
same standards for handling metadata.

EXIF data? That metadata is fairly well standardised.

other than maker bytes, which are manufacturer specific and often
encrypted, yes.

the actual file contents is in the data fork, with the resource fork
being a miniature database of other stuff, such as fonts, window
position, colour palettes, etc.

a given file might have only a data fork, only a resource fork or both,
depending on the purpose of the file.

microsoft copied the idea, adding multiple forks to ntfs, known as
alternate data streams.
Mac users are not
expected to understand anything about files. That's
not the same as metadata.

also wrong. very wrong.
Resource fork used to be a problem when Mac users
emailed photos. If they didn't know to strip the Mac-
specific prepended data they'd send a corrupt file.

it was never a problem nor is there any prepended data.

clearly it's *you* who doesn't understand anything about files.
File signature: Sometimes called "magic" or "magic
bytes" -- beginning bytes that *sometimes* identify
a file type. For instance, a BMP bitmap file starts with
hex 42 4D, which in ASCII encoding is "BM". A TXT text
file, on the other hand, usually won't have any header
at all. It *might* have a marker if it's unicode-encoded.

File headers are data about the file structure, appearing
at the beginning of the file. Headers, or lack of them, vary
widely with file types. BMP header is very simple. TXT has
none. JPG can be extensive, even including a small
thumbnail image. There's no rule about the need for
headers and even when there are headers the rules are
sometimes flexible.

Metadata: Optional file info stored in a file header. Some
of it is standard to the file format. Some is added. For
example, everyone and his brother has made up data
markers to store in JPG files. None of it has to be there.
Some of it is unofficial and not widely supported. It's
been established willy nilly as JPG has become widely used.

On the other hand, the file header of PE files (portable
executable EXE, DLL, OCX) is strictly defined to detail
things like imported dependencies, exported functions,
embedded strings and other resources, etc.

You don't need to "open" a file to see what type it is, in
the sense that you don't have to run it.

but you do have to open the file and read the info in the header,
making it a costly operation just to find out what type of file it is.
The hex editor
HxD is free and very good. You can put an Open With HxD
on your right-click menu and look at the file bytes to
see what it is.
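
For what it's worth, the check both sides are arguing about can be sketched in a few lines of Python. It reads only the first 8 bytes, so the whole file is never loaded, but it still means one open and one small read per file. The table and the name `extension_matches` are made up for this example and cover only a handful of image types.

```python
import os

# Illustrative table: extension -> acceptable leading bytes.
EXPECTED_MAGIC = {
    ".bmp": (b"BM",),
    ".jpg": (b"\xff\xd8\xff",),
    ".jpeg": (b"\xff\xd8\xff",),
    ".png": (b"\x89PNG\r\n\x1a\n",),
    ".gif": (b"GIF87a", b"GIF89a"),
}


def extension_matches(path):
    """True/False if the extension is in the table, None if it is not."""
    ext = os.path.splitext(path)[1].lower()
    magics = EXPECTED_MAGIC.get(ext)
    if magics is None:
        return None  # no opinion about this extension
    with open(path, "rb") as f:
        head = f.read(8)  # only the first 8 bytes are read
    return head.startswith(magics)
```

This is roughly what an image viewer does when it warns that the extension doesn't match the file type.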

In article , Wolf K
wrote:
Many files have such type-identifiers included. E.g., a JPG file
begins with JFIF, a WordPerfect file includes WPC in the first line,
an MS .doc includes "Microsoft Word Document" in plain text in the
header, and so on. Some image viewers will even tell you that the
extension doesn't match the file type, if that happens to be the case.

Then you've put the metadata inside the file, which is even worse. It
should be part of the file system.

If by "file system" you mean how an OS identifies filetype etc, no way.

yes way.
As Mayayana reminds us, that way lies a right mess when people exchange
files. And people want to exchange files.

except, it doesn't.
The internet works because the necessary data for routing the data
packets are inside the data packet, not external. That principle should
apply to all forms of data. Including programs, but that's another issue.

On 12/12/2017 17:42, Arthur Wood wrote:
Can a Macintosh person tell us how to change the name of a file?

You could have asked a Windows person as Macintosh person is not likely
to be around here unless he is completely mental to spend time here
where only the intelligent people discuss rubbish. There is normally no
room for mental people here.

--
With over 600 million devices now running Windows 10, customer
satisfaction is higher than any previous version of windows.

Wolf K wrote:
On 2017-12-14 00:24, Your Name wrote:
On 2017-12-14 03:16:11 +0000, Wolf K said:
On 2017-12-13 19:37, Your Name wrote:
[...]
... you can't rely on the OS to do that since a JPEG image file can
actually be opened in a text editor as the file's data, even if it's
rarely useful to do so.

That's what Open With is for.

Open With is near useless if you don't know what the file actually is.
You'd have to Open With with every app you have until you found one
that could open it properly.

If we're talking about user convenience, I agree, showing a file's type
as part of the filename is very useful. (But IMO a three-letter
extension is too limited). There are many other useful conventions, eg,
in icon design. These are converging on a common standard.

If we're talking about choosing a program to open a file, extensions
aren't needed. It would be easy to ensure that Open With offers only
programs that can open a given file without reference to an extension.
Just standardise metadata (eg, as a series of slots, some which must be
filled, others for dev or user options). Easy peasy.
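
A purely hypothetical Python sketch of what "slots" might look like. No such standard exists; the field names, the `FileMetadata` class and the app registry are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FileMetadata:
    # Required slots (hypothetical): must be filled for every file.
    media_type: str
    creator_app: str
    # Optional slots for developer or user data.
    extra: dict = field(default_factory=dict)


def apps_that_can_open(meta, registry):
    """Names of registered apps that declare support for the file's type."""
    return sorted(
        app for app, types in registry.items() if meta.media_type in types
    )
```

With a registry such as `{"Viewer": {"image/jpeg"}, "Editor": {"text/plain"}}`, Open With would offer only "Viewer" for a file whose `media_type` slot says `image/jpeg`, with no extension involved.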

Have a good day,

Windows is not limited to 8.3.

In Windows 7, the introduction of libraries saw
the addition (by Microsoft) of .library-ms.

In article , Tim Streater
wrote:
| The type of a file and which app you'd like it to open with are items
| of file metadata and have no business being part of the filename.

| Many files have such type-identifiers included. E.g., a JPG file begins
| with JFIF, a WordPerfect file includes WPC in the first line, an MS .doc

| Then you've put the metadata inside the file, which is even worse. It
| should be part of the file system.

This is the problem with mixing Mac and Windows
discussions. As I understand it, Mac stores file data
separately as a "resource fork".

No, you have it back to front. File data went in the data fork,
metadata went in the resource fork.

no it didn't.

metadata was kept in the file system.

the resource fork (which was optional, as was the data fork) held
various resources. it was basically a miniature database.

a zero-length file would have an empty data *and* resource fork. rare,
but possible.
Unfortunately Apple has abandoned
this idea and settled for the lowest-common-denominator approach, and
we're all the worse off for it.

| Bottom line: it's way past time for standards. There's no reason for
| different OSs to handle filetype/tagging/etc differently.
|

I wonder if it's too late for that. Or maybe too early.
Developments go at such a fast pace, and most of it
is now commercial.

Example: I noticed that Bitcoin programming uses a
.DAT extension. That's a common extension on
Windows for undefined data files. Typically they're
custom format, used privately by software. They
might contain anything. The Bitcoin people apparently
didn't know or didn't care.

JPG is a semi-standard only because it compresses
well and it's royalty-free. But it's a terrible image
format. The compression degrades the image! Yet JPG
is used to store images in cameras because all
computers will recognize it. Meanwhile the JPG header
is a mess. It's like a toilet stall in a public bathroom
where everyone and his brother have added their
2 cents.

Any "standards" we have in tech are often partly
created by small, well-intentioned groups of insiders
who want to improve how things work (often in an
atmosphere of seat-of-the-pants urgency). But those
groups have their own values and their own priorities.

So the obvious question becomes: Who is going to
be in charge to establish standards and decide on
priorities? And what happens to commercial entities that
stand to lose? For instance, camera companies that
have to remake their hardware/software in order to
store some universal format to replace JPG, that
everyone agrees on... at least this year. There's rarely
standardization in commercial products unless it
favors the sellers. It usually doesn't.

| a given file might have only a data fork, only a resource fork or both,
| depending on the purpose of the file.
|
| microsoft copied the idea, adding multiple forks to ntfs, known as
| alternate data streams.
|

Actually MS came up with the ill-fated, bad idea of
ADS to help accommodate MS Office to Macs. Later they
did some dumb things like using them to store metadata.
But whaddayaknow.... it turned out the metadata was
lost if the file was moved from an NTFS file system.

(Actually that's a handy way to clean ADS. Move them
to a FAT32 partition.)

| You don't need to "open" a file to see what type it is, in
| the sense that you don't have to run it.
|
| but you do have to open the file and read the info in the header,
| making it a costly operation just to find out what type of file it is.
|

There's nothing "costly" about opening a file in
a hex editor. HxD uses about 8 MB of RAM. Pale
Moon, by contrast, is costing me roughly 100 MB just
to sit there.

| The hex editor
| HxD is free and very good. You can put an Open With HxD
| on your right-click menu and look at the file bytes to
| see what it is.
|
| you're going to do that for every single file?

I do it when I need to. Not a big deal. We're
talking about scenarios where the file type is
unknown. I don't find that happens very often.

If you don't know what to do with "every
single file" then even MacOS handholding
won't help.

nospam wrote:
In article , Tim Streater
wrote:
| The type of a file and which app you'd like it to open with are items
| of file metadata and have no business being part of the filename.

| Many files have such type-identifiers included. E.g., a JPG file begins
| with JFIF, a WordPerfect file includes WPC in the first line, an MS .doc

| Then you've put the metadata inside the file, which is even worse. It
| should be part of the file system.

This is the problem with mixing Mac and Windows
discussions. As I understand it, Mac stores file data
separately as a "resource fork".
No, you have it back to front. File data went in the data fork,
metadata went in the resource fork.

no it didn't.

metadata was kept in the file system.

the resource fork (which was optional, as was the data fork) held
various resources. it was basically a miniature database.

a zero-length file would have an empty data *and* resource fork. rare,
but possible.
Unfortunately Apple has abandoned
this idea and settled for the lowest-common-denominator approach, and
we're all the worse off for it.

yep.

NTFS Alternate Streams is the functional equivalent
of Resource and Data fork - but nobody abuses it. Because
of the inter-working issues it would cause.

In Windows, currently an Alternate Stream on a file, can be used
to "mark" the file as being downloaded from the Internet, and
thus "untrustworthy". It's the kind of metadata that doesn't
need to be preserved, so if a person copied the file from
NTFS to FAT32 and back to NTFS again, no one is the wiser.
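
The "downloaded from the Internet" mark lives in a stream named Zone.Identifier, whose content is a few lines of INI-style text. Here is a minimal Python sketch that parses that text. The function name `parse_zone_id` is made up for this example. Reading the stream itself only works on Windows with an NTFS volume, by opening the file path with `:Zone.Identifier` appended.

```python
import configparser


def parse_zone_id(stream_text: str):
    """Return the ZoneId from Zone.Identifier text, or None if absent."""
    # interpolation=None so '%' characters in URL lines are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(stream_text)
    if not parser.has_option("ZoneTransfer", "ZoneId"):
        return None
    return parser.getint("ZoneTransfer", "ZoneId")


# On Windows/NTFS only (assumption: the file carries the stream):
# with open(r"C:\Downloads\setup.exe:Zone.Identifier") as f:
#     print(parse_zone_id(f.read()))
```

A ZoneId of 3 is the Internet zone. Copy the file to FAT32 and the stream, and with it the mark, is simply gone.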

Kaspersky AV, many releases back, was one of the first applications
to widely use the feature. It stored some sort of checksum or hash,
as a means of speeding up file scanning. And the technique was
abandoned after a small percentage of users experienced side
effects (of some sort).

And I don't even think there is a tool to pack the
streams, for transport to foreign systems.

| And, since you like to quibble: is the MIME header part of a post or
| not? I would quibble that it is, since it has to be included with the
| post so that the client can display the contents properly.
|

You should know that nospam is a compulsive
arguer who regularly carries on bickering matches
that go into hundreds of posts. If you answer,
he *will* argue. He's also very adept at the
appearance of knowledge, using generalities and
undefined declarations ("not so", "nonsense", etc)
to appear to be discussing a topic expertly.

Your statement makes an interesting point, though:
For text-based file types there's usually no distinction
between content and header. There's no place to store
metadata. With "binary" types, the header defines,
describes, or adds to the data. For text-based files
the header is part of the data. HTML and EML are
two good examples. Both are limited to text content
and both have a header that is also part of the content.
Neither contains a file type ID. One could claim that
the HTML or DOCTYPE tags are an ID, but they're
actually part of the HTML. And DOCTYPE is optional.
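
A small Python illustration of the EML case, using the standard library's email parser on a made-up message. The header and body are one text stream separated by a blank line, and the client has to read the Content-Type header before it knows how to display the rest.

```python
from email import message_from_string

# Made-up message for illustration; headers and body share one text stream.
RAW = (
    "From: someone@example.com\n"
    "Subject: test\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "\n"
    "Body text here.\n"
)

msg = message_from_string(RAW)
content_type = msg.get_content_type()   # taken from the header lines
charset = msg.get_content_charset()     # also from the header lines
body = msg.get_payload()                # everything after the blank line
```

Nothing in those bytes says "this is an EML file"; the parser simply assumes it and starts reading header lines.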

| So the obvious question becomes: Who is going to
| be in charge to establish standards and decide on
| priorities?
|
| ISO.

I didn't know about that organization. Good idea.

| And what happens to commercial entities that
| stand to lose? For instance, camera companies that
| have to remake their hardware/software in order to
| store some universal format to replace JPG, that
| everyone agrees on... at least this year. There's rarely
| standardization in commercial products unless it
| favors the sellers. It usually doesn't.
|
| Image format is software, not hardware.

Yes. That's just an example. The hardware/software
will need to work together, no?

| All cameras capture the image in
| some proprietary RAW format. Amateur cameras immediately process the RAW
| image, ending with compression to JPG. Our oldest camera actually
| displays "Busy" on the screen while it does this. Some parameters, such
| as white balance, can be set by the user.
|
| The alternative would have to be much larger memory cards, frequent
| exchange for fresh ones in the field, and post-processing at home.

Yes, but the standard could be changed to PNG,
TIF (just a zipped bitmap), or some newer, non-lossy,
compressed format, such as an improved non-lossy
JPG. It would make sense, but it would require a
lot of work for everyone to adapt, from camera makers
to software makers to photographers. And since
many photographers want metadata in their digital
photos, the new standard format would need to
accommodate that.

Then multiply that work across all current data
formats. There would be a lot of resistance. And
many formats are proprietary. Companies protect
their secrets so that competitors can't work with
the format. For instance, DOC was made public only
a few years ago. As far as I know, DOCX is still not
public. Thus, Libre Office can't quite match an MS
Word DOCX with complex formatting. No one can
force MS to make that public or standardize the
structure.

In message Wolf K wrote:
If we're talking about choosing a program to open a file, extensions
aren't needed. It would be easy to ensure that Open With offers only
programs that can open a given file without reference to an extension.
Just standardise metadata (eg, as a series of slots, some which must be
filled, others for dev or user options). Easy peasy.

Yeah, creating a new standard and getting every file to comply with it
is super simple. Why, it happens every day!

--
"He raised his hammer defiantly and opened his mouth to say, "Oh, yeah?"
but stopped, because just by his ear he heard a growl. It was quite low
and soft, but it had a complex little waveform which went straight down
into a little knobbly bit in his spinal column where it pressed an
ancient button marked Primal Terror."