Identify vs. Match

Although the identifying functions (Identify, IdentifyOwn) of the
FileTypeManager class have important uses, they can be misused.
These are the strongest identifying functions. Their goal is to find
the best-matching FileType based on match level, using all the information
supplied for a file type, very often including the callback function that
opens the file to read its "magic number". This can be a time-consuming and
processor-intensive task. These functions should not be used when the goal is
only to decide whether a file is acceptable for a specific case. The
following example shows this bad usage.
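
A minimal sketch of this kind of misuse, written against a toy model rather
than the real FileTypeManager interface. ToyType, ToyRegistry, identify,
runCallback, isAcceptedViaIdentify and makeCrowdedRegistry are all
hypothetical names introduced for illustration, and the assumption that every
type sharing the extension has its callback run is a simplification of the
match-level logic.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a registered file type (hypothetical, not the real API).
struct ToyType {
    std::string extension;  // cheap criterion
    std::string magic;      // checked by the simulated callback
};

struct ToyRegistry {
    std::vector<ToyType> types;
    int callbackCalls = 0;  // counts simulated file reads

    // Simulated callback: pretends to open the file and check its magic number.
    bool runCallback(std::size_t i, const std::string& content) {
        assert(i < types.size());
        ++callbackCalls;
        return content.compare(0, types[i].magic.size(), types[i].magic) == 0;
    }

    // Identify-like search: every type sharing the extension has its callback run.
    int identify(const std::string& ext, const std::string& content) {
        int best = -1;
        for (std::size_t i = 0; i < types.size(); ++i) {
            if (types[i].extension != ext) continue;
            if (runCallback(i, content) && best < 0) best = static_cast<int>(i);
        }
        return best;  // -1 means "unknown"
    }
};

// The misuse: a full identification just to ask "is it one of my two types?".
bool isAcceptedViaIdentify(ToyRegistry& reg, const std::string& ext,
                           const std::string& content, int typeA, int typeB) {
    const int id = reg.identify(ext, content);
    if (id == typeA) {
        return true;
    } else if (id == typeB) {
        return true;
    }
    return false;
}

// Five types share the "dat" extension; the caller cares only about 0 and 1.
ToyRegistry makeCrowdedRegistry() {
    ToyRegistry reg;
    reg.types = {{"dat", "AAAA"}, {"dat", "BBBB"}, {"dat", "CCCC"},
                 {"dat", "DDDD"}, {"dat", "EEEE"}};
    return reg;
}
```

In this toy registry, a "dat" file whose content starts with EEEE makes all
five simulated callbacks run, although the caller only wanted to know whether
the file is of type 0 or type 1.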

In this case we are wasting processing time on identifying all the types we
are not interested in. For example, if the type database is full of other
types that use overlapping extensions and Mac types, then each time this code
executes and none of the if branches is true, the Identify function will
execute the callback functions of the types we are not interested in and
extract much more information than we need.
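
A first improvement is presumably to test only the types of interest with
Match. The sketch below shows this on a toy model, not on the real
FileTypeManager interface: ToyType, ToyRegistry, matchType, runCallback,
isAcceptedViaMatch and makeCrowdedRegistry are hypothetical names, and the
rule that a per-type match checks the extension first and then runs that
type's callback is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a registered file type (hypothetical, not the real API).
struct ToyType {
    std::string extension;  // cheap criterion
    std::string magic;      // checked by the simulated callback
};

struct ToyRegistry {
    std::vector<ToyType> types;
    int callbackCalls = 0;  // counts simulated file reads

    // Simulated callback: pretends to open the file and check its magic number.
    bool runCallback(std::size_t i, const std::string& content) {
        assert(i < types.size());
        ++callbackCalls;
        return content.compare(0, types[i].magic.size(), types[i].magic) == 0;
    }

    // Match-like test against a single type: no other type is examined.
    bool matchType(std::size_t i, const std::string& ext,
                   const std::string& content) {
        assert(i < types.size());
        if (types[i].extension != ext) return false;  // rejected without a callback
        return runCallback(i, content);
    }
};

// Only the two types named in the expression are tested.
bool isAcceptedViaMatch(ToyRegistry& reg, const std::string& ext,
                        const std::string& content,
                        std::size_t typeA, std::size_t typeB) {
    if (reg.matchType(typeA, ext, content) || reg.matchType(typeB, ext, content)) {
        return true;
    }
    return false;
}

// Five types share the "dat" extension; the caller cares only about 0 and 1.
ToyRegistry makeCrowdedRegistry() {
    ToyRegistry reg;
    reg.types = {{"dat", "AAAA"}, {"dat", "BBBB"}, {"dat", "CCCC"},
                 {"dat", "DDDD"}, {"dat", "EEEE"}};
    return reg;
}
```

For a "dat" file whose content starts with EEEE, this toy version runs two
simulated callbacks instead of five. It still runs one callback per tested
type, even though the caller treats both types the same way.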

In this case only the types involved in the expression are tested. Still, the
decision made within each if branch is unnecessary and costs extra processing.
If we assume that the types in the first if branch use the same extension, we
still try to decide which one is the best match, and we do not use this
information. The better way is to make a group of the commonly tested types
and match against this group. In this case the Match function will try to
find the best match only if there are other types outside the group that are
candidates for the same or a better match level. With careful grouping it is
possible to avoid any need for calling callback functions.
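
The grouping idea can be sketched on a toy model as well. Nothing here is the
real FileTypeManager interface: ToyType, ToyRegistry, matchGroup, runCallback
and makeGroupedRegistry are hypothetical names, and the rule that callbacks
run only when a type outside the group shares the extension is a simplified
reading of the match-level behaviour described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a registered file type (hypothetical, not the real API).
struct ToyType {
    std::string extension;  // cheap criterion
    std::string magic;      // checked by the simulated callback
};

struct ToyRegistry {
    std::vector<ToyType> types;
    int callbackCalls = 0;  // counts simulated file reads

    // Simulated callback: pretends to open the file and check its magic number.
    bool runCallback(std::size_t i, const std::string& content) {
        assert(i < types.size());
        ++callbackCalls;
        return content.compare(0, types[i].magic.size(), types[i].magic) == 0;
    }

    // Group match: callbacks run only if a type outside the group competes
    // for the same extension; otherwise the extension alone decides.
    bool matchGroup(const std::vector<std::size_t>& group, const std::string& ext,
                    const std::string& content) {
        assert(!group.empty());
        bool groupHasExt = false;
        bool rivalOutside = false;
        for (std::size_t i = 0; i < types.size(); ++i) {
            if (types[i].extension != ext) continue;
            if (std::find(group.begin(), group.end(), i) != group.end()) {
                groupHasExt = true;
            } else {
                rivalOutside = true;
            }
        }
        if (!groupHasExt) return false;
        if (!rivalOutside) return true;  // decided without any callback
        for (std::size_t i : group) {
            if (types[i].extension == ext && runCallback(i, content)) return true;
        }
        return false;
    }
};

// Types 0 and 1 share "img" and form the group; no outside type uses "img".
ToyRegistry makeGroupedRegistry() {
    ToyRegistry reg;
    reg.types = {{"img", "AAAA"}, {"img", "BBBB"}, {"dat", "CCCC"}, {"dat", "DDDD"}};
    return reg;
}
```

With the group {0, 1} and this registry, an "img" file is accepted and a "dat"
file is rejected without a single simulated callback. Callbacks start running
only once a type outside the group is registered with the "img" extension.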

In this way the FileTypeManager will not try to decide between types using the
same extension if they are grouped together for Match. If there are no other
types outside the group with the same extension, this operation will be very
efficient.