How come programs written in assembly are so much faster than any
other high-level language? I know that it is a low-level language
and that it "speaks" directly to the hardware so it is faster, but
why can't high-level languages compile programs just as fast as
assembly programs?

First of all, assembler is not an "other" high-level language. It is
the low-level language par excellence, lower-level even than C :-).

There are several reasons that assembly programs can be faster than
compiled programs:

The assembly programmer can design data structures which take maximum
advantage of the instruction set. To a certain extent, you can do
this in languages like C if you're willing to write code that is
specific to the architecture. But there are some instructions which
are so specialized that it is very hard for compilers to recognize
that they're the best way to do things; this is mostly true in CISC
architectures.
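
As a rough illustration of the specialized-instruction point, here is a
sketch in C of counting the 1-bits in a word. The function names are
made up for this example. The first version is a portable loop that a
compiler may or may not recognize as a bit count; the second assumes
GCC or Clang, whose __builtin_popcount can turn into a single
instruction on targets that have one, and falls back to the loop
elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/* Portable version: clears the lowest set bit once per iteration,
   so the loop body runs once for each 1-bit in x. */
unsigned popcount_portable(uint32_t x)
{
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;   /* drop the lowest set bit */
        count++;
    }
    return count;
}

/* Architecture-aware version (assumption: GCC or Clang).  The builtin
   may compile to one instruction when the target provides a bit-count
   instruction; other compilers use the portable loop. */
unsigned popcount_fast(uint32_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    return (unsigned)__builtin_popcount(x);
#else
    return popcount_portable(x);
#endif
}

/* Sanity check: both versions must agree on any input. */
void popcount_check(uint32_t x)
{
    assert(popcount_fast(x) == popcount_portable(x));
}
```

Whether the builtin really becomes one instruction depends on the
target and the compiler flags, which is exactly the sort of
architecture-specific knowledge the assembly programmer starts with.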

The assembly programmer typically can estimate which parts of the
program need the most optimization, and apply a variety of tricks
which it would be a bad idea to apply everywhere, because they make
the code larger. I don't know of any compilers that allow "turning
up" or "turning down" optimization for code fragments, although most
will allow it for compilation modules.
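
One such trick is unrolling a loop by hand. The sketch below, with
made-up function names, shows a plain summing loop next to a version
unrolled four times. The unrolled one is noticeably larger, which is
why you would only do this to the few loops that matter; whether it
is actually faster depends on the compiler and the machine.

```c
#include <assert.h>
#include <stddef.h>

/* Straightforward loop: small code, one loop test per element. */
long sum_simple(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Unrolled by four with separate accumulators: more code, but one
   loop test per four elements and four independent add chains. */
long sum_unrolled(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)   /* the 0 to 3 leftover elements */
        s0 += a[i];
    return s0 + s1 + s2 + s3;
}
```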

The assembly programmer can sometimes use specialized runtime
structures, for instance reserving some registers globally for
things that are often used, or designing special conventions for
register use and parameter passing in a group of procedures. Another
example is using the top of the stack as a local, unbounded stack
without respecting frame conventions.

Some control structures are not widely supported by commonly-used
higher-level languages, or are too general. For instance, coroutines
are provided by very few languages. Many languages now provide
threads, which are a generalization of coroutines, but often have more
overhead.
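
To show what is missing, here is a sketch of how a coroutine-like
generator is often emulated in C. This is not a true coroutine (it has
no stack of its own); the resume point is saved in a struct and a
switch jumps back into the middle of the loop. All the names are
invented for the example.

```c
#include <assert.h>

/* State for a resumable generator yielding lo, lo+1, ..., hi-1. */
struct range_co {
    int state;   /* 0 = not started, 1 = suspended in loop, 2 = done */
    int next;    /* next value to yield */
    int hi;      /* exclusive upper bound */
};

void range_init(struct range_co *co, int lo, int hi)
{
    co->state = 0;
    co->next = lo;
    co->hi = hi;
}

/* Resumes the generator.  Returns 1 and stores a value in *out while
   values remain; returns 0 once the range is exhausted. */
int range_resume(struct range_co *co, int *out)
{
    assert(co != 0 && out != 0);
    switch (co->state) {
    case 0:
        while (co->next < co->hi) {
            *out = co->next++;
            co->state = 1;
            return 1;      /* "yield" to the caller */
    case 1:;               /* the next call resumes here */
        }
        co->state = 2;
        /* fall through */
    case 2:
    default:
        return 0;
    }
}
```

In assembler the same effect is just a saved address and a jump; in C
it takes this kind of contortion, and every local that must survive a
yield has to be moved into the struct by hand.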

The assembly programmer is sometimes willing to do global analysis
which most compilers currently don't do.

Finally, the assembly programmer is more immediately aware of the cost
of operations, and thus tends to choose more carefully as a function
of cost. As language level rises, the cost of a given operation
generally becomes less and less predictable.
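
Even C hides costs this way. In the sketch below (function names
invented for the example), the first loop looks linear, but strlen()
rescans the string every time the loop condition is tested unless the
compiler manages to hoist the call. The second makes the single pass
explicit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Looks like one pass, but strlen() walks the whole string on each
   test of the loop condition, so the work can grow quadratically. */
size_t count_spaces_rescan(const char *s)
{
    assert(s != NULL);
    size_t count = 0;
    for (size_t i = 0; i < strlen(s); i++)
        if (s[i] == ' ')
            count++;
    return count;
}

/* Same result with the cost visible: one pass to the terminator. */
size_t count_spaces_single_pass(const char *s)
{
    assert(s != NULL);
    size_t count = 0;
    for (; *s != '\0'; s++)
        if (*s == ' ')
            count++;
    return count;
}
```

Someone writing the loop in assembler would have to spell out the
rescan instruction by instruction, and so would be unlikely to write
the first version by accident.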

All this said, there is no guarantee that an assembly program will be
faster than a compiled program. A program of a given functionality
will take longer to develop in assembler than in a higher-level
language, so less time is available for design and performance tuning.
Re-design is particularly painful in assembler since many decisions
are written into the code. In many programs, large improvements can
be made in performance by improving algorithms rather than coding;
assembler is a disadvantage here since coding takes longer and the
code is less flexible. Finally, it is harder to write reliable assembly
code than reliable higher-level language code; getting a core dump
faster is not much use.
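
A small sketch of the algorithm point, with made-up function names:
looking up a key in a sorted array by scanning takes up to n
comparisons, while a binary search takes about log2(n). No amount of
hand-tuning of the scan closes that gap on a large array, and the
change is a few lines in C.

```c
#include <assert.h>
#include <stddef.h>

/* Linear scan: up to n comparisons.  Returns the index or -1. */
long find_linear(const int *a, size_t n, int key)
{
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (a[i] == key)
            return (long)i;
    return -1;
}

/* Binary search: about log2(n) comparisons.  Assumes the array is
   sorted in ascending order.  Returns the index or -1. */
long find_binary(const int *a, size_t n, int key)
{
    assert(a != NULL || n == 0);
    size_t lo = 0, hi = n;            /* half-open range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] < key)
            lo = mid + 1;
        else if (a[mid] > key)
            hi = mid;
        else
            return (long)mid;
    }
    return -1;
}
```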

Compiler writers have tried, over time, to incorporate some of these
advantages of assembler. The "coalescing" style of compiler in
particular in many ways resembles the work of a good assembly
programmer: design your data structures and inner loops together, and
early on in the design process. Various kinds of optimization and
global analysis are done by compilers, but in the absence of
application knowledge, it is hard to bound how long these analyses
take to run. (Another
thread in this group talked about the desirability of turning
optimization up very high in some cases.)