No, the derivative of $f(x) = 0$ is also itself :p (actually both are members of $ae^x$.)
– kennytm Aug 22 '10 at 10:58

12

In terms of what? In my view this is the defining property of $e$.
– anon Aug 22 '10 at 10:58

3

Like muad, this is for me the definition of e^x. So I'm not sure where this intuition is supposed to be coming from.
– Qiaochu Yuan Aug 22 '10 at 11:39

19

@Qiaochu: even if this is the definition of e^x for you, is it not then interesting that this e^x function can be characterized as the same as raising a particular constant between 2 and 3 to the power of x? Such a coincidence could warrant an intuitive explanation.
– Niel de Beaudrap Aug 22 '10 at 13:15

3

Now that I think about how it was taught back then, there was quite a while between deducing the properties of $\ln$ and $\exp$ ab initio and then revealing that they are in fact the logarithm and exponential that were taught in previous courses!
– Guess who it is. Aug 22 '10 at 13:23

An easily understandable explanation to laypersons is 'the more you have the faster it grows', like debt or savings (they're both compound interest but 'normal' people don't like fancy math terms ;) ).
– Marcin Jun 7 '11 at 12:07

2

"The more bacteria exist in a colony, the faster the colony will grow" is false. Competition for resources adds a negating term which depends on the concentration of bacteria: $dB/dt = B - f(B)$. This makes all of the difference.
– user02138 Aug 14 '13 at 16:06

(Looks like Niel and "J" beat me to the punch while I was composing this, but I'll post my answer anyway since I like my diagram. :) )

You can't get too much more intuitive than the Law of Exponents ($a^x a^c = a^{x+c}$) right?

To reduce symbolic (and mental) clutter for a while, let's write $k$ for $a^c$, to get

$$k \; a^x = a^{x+c}$$

The point is that

Multiplying the value of $a^x$ by something yields the same result as adding some other thing to the exponent.

Consider what this bit of algebra tells us about the geometry of the graph of the function $y=a^x$ (shown in red in the figure below). Recall some fundamental notions:

$y=k a^x$ is the result of vertically stretching (scaling) the original graph by a factor of $k$.

$y = a^{x+c}$ is the result of horizontally sliding (translating) the original graph by a (signed) distance of $-c$.

The Law of Exponents tells us that (provided $k$ and $c$ are related appropriately) these transformed graphs are identical! In the figure, the blue graph represents both results.[*] Importantly, the two actions move different points onto the same place: scaling moves the red point $P$ vertically onto the blue point $Q$, while translating moves the red point $R$ horizontally onto $Q$.

Now, as suggested by the diagram, imagine a tangent vector poking out of each point of the original graph, and consider what the transformations do to those vectors:

Vertically stretching scales (only) the "rise" of the vector by factor $k$; that is, a vector with slope $m$ is pulled to achieve a slope of $m k$.

Horizontally translating has no effect on the vector's slope.

What can we conclude here? Why, something pretty remarkable:

For any point $P$ on the original graph, the point $R$ --located $c$ units to the right[**] of $P$-- is such that the slope of the tangent vector at $R$ is $k$-times the slope of the tangent vector at $P$ (where $k$ and $c$ are related appropriately).

Let me take this opportunity to retire $k$, since it is beginning to become clutter; I'll just make the appropriate relationship explicit. Also, I'm going to retire the point name $R$, opting to describe the point instead. I'll summarize things this way:

On the graph $y=a^x$, if the slope of tangent vector at any $P$ is $m$, then the slope of the tangent vector at the point $c$ units to the right of $P$ is $m\;a^c$.

Notice that there's nothing special about the players in this game. $P$ is any point on the graph, $c$ is any (horizontal) distance you care to choose; heck, even the exact value of $a$ is up for grabs. Let's use this to our advantage.

Suppose we take $P$ to be the point where the (original, red) graph crosses the $y$-axis; that is, we take $P$ to have $x$-coordinate $0$ (and $y$-coordinate $1$, but this doesn't really matter). Then, "the point $c$ units to the right of $P$" can be described more simply as "the point with $x$-coordinate $c$", and we have

On the graph $y = a^x$, if the slope of the tangent vector at the point with $x$-coordinate $0$ is $m$, then the slope of the tangent vector at the point with $x$-coordinate $c$ is $m\;a^c$.

In Calculusian prose:

If $f(x) = a^x$, then $f^{\prime}(c) = a^c f^{\prime}(0)$.

Observe that we don't even need "$c$" in the above formula, since it's just taking the place of some $x$-coordinate. We can simply write:

If $f(x) = a^x$, then $f^{\prime}(x) = a^x f^{\prime}(0)$.
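The relation $f^{\prime}(x) = a^x f^{\prime}(0)$ is easy to sanity-check numerically. The following is only a rough sketch: the helper names `slope` and `check_base`, the sample bases, and the sample points are all choices made here for illustration, and the slopes are central-difference estimates rather than exact derivatives.

```python
def slope(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def check_base(a, xs=(-1.0, 0.5, 2.0)):
    """Return (x, measured slope at x, predicted slope a**x * f'(0)) for f(x) = a**x."""
    f = lambda x: a ** x
    m0 = slope(f, 0.0)  # slope where the graph crosses the y-axis
    return [(x, slope(f, x), f(x) * m0) for x in xs]


for a in (2.0, 3.0):
    for x, measured, predicted in check_base(a):
        print(f"a={a}, x={x}: measured={measured:.6f}, predicted={predicted:.6f}")
```

For each base, the measured and predicted columns should agree up to the accuracy of the finite-difference estimate.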

From here, it's pretty much a matter of definition to get to the final answer to your specific question. After all, the above holds for any (positive) value of $a$. Clearly, some values of $a$ correspond to graphs that cross the $y$-axis very steeply; some values correspond to graphs that cross the $y$-axis very shallowly; it's not unreasonable to believe that there's a convenient value that would cause the graph to cross the $y$-axis juuuuuuuuust right ... with a slope of $1$. Of course (as I mix my folklore), to name a demon is to control him, so we'll simply assign a symbol to this "just right" value of $a$.

Let "e" be the number such that the tangent vector to $y=e^x$ at $x=0$ has slope $1$.

Assuming that such a number really does exist, you don't even have to know its exact value to conclude

If $f(x) = e^x$, then $f^{\prime}(x) = e^x$.

You can then turn your attention to figuring out why $e$ happens to have the value $2.718...$ . Other answers in this thread provide insights on how those arguments proceed.

Also left as an exercise is to determine why, in the formula for the derivative of $a^x$, the "$f^{\prime}(0)$" factor is in fact "$\log a$" (the natural logarithm of $a$).

[*] As Niel mentions, this illustrates that the graph of an exponential function is "self-similar". I think it's important to add "uni-directionally" to the description, in order to distinguish it from (conventional) similarity transformations where scaling occurs "omni-directionally".

[**] "to the left" works, too, with appropriate changes to the discussion.

Therefore, the function $2^x$ is such that the secant slope over a "unit run" is $2^x$, which invites the question: What about over smaller runs? Let $a$ be such that the secant slope of $a^x$ over a run of $h$ is $a^x$; that is, suppose $(a^{x+h}-a^x)/h=a^x$. Dividing through by $a^x$, we solve to get $a=(1+h)^{1/h}$. Taking $h$ smaller and smaller pushes us toward a value of $a$ such that the secant slope over an "infinitesimal run" --that is, the tangent slope-- is $a^x$. Of course, this is the same limit we see over and over again, but it arises from a fairly intuitive motivation.
– Blue Aug 23 '10 at 11:26
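The formula $a=(1+h)^{1/h}$ from the comment above can be tabulated for shrinking runs $h$. This is a minimal sketch; the function name `base_for_run` and the particular values of $h$ are assumptions made here for illustration.

```python
import math


def base_for_run(h):
    """Base a whose secant slope over a run of h equals a**x, i.e. a = (1 + h)**(1/h)."""
    return (1 + h) ** (1 / h)


for h in (1, 0.1, 0.01, 0.001, 1e-6):
    print(h, base_for_run(h))
print("math.e =", math.e)
```

With $h=1$ this recovers the base $2$ mentioned in the comment, and the values creep toward `math.e` as $h$ shrinks.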

The approach I would have taken though, was to express $\exp(x)$ first as 1+(other terms), take the derivative so that you now have the general term $\frac{n x^{n-1}}{n!}=\frac{x^{n-1}}{(n-1)!}$, and shift indices accordingly.
– Guess who it is. Aug 22 '10 at 11:49

Well, this is the formulation assuming that the power series is the definition of $\exp(x)$. One now shows that this power series satisfies the differential equation in the title.
– Guess who it is. Aug 22 '10 at 12:00

1

Could this power series be the definition of $\exp(x)$, since it is itself based on the fact that the derivative of $\exp(x)$ is $\exp(x)$?
– Jichao Aug 22 '10 at 16:23

1

@Jichao. For a lot of books, this series is the definition of $e^x$.
– a.r. Aug 22 '10 at 18:06

One conceptual way to understand the standard definition $e^x = \lim_{n\to\infty}(1+x/n)^n$ is to view it as arising from applying Euler's approximation method to $\,y' = y.\,$ The same method also works for arbitrary higher-order constant coefficient ODEs by converting them to linear system form and employing matrix exponentials. This is described quite nicely in Arnold's beautiful textbook Ordinary Differential Equations. Here is an excerpt:

Thanks for showing the picture of the Euler polygon! In the Hairer-Norsett-Wanner book, the picture of the Euler polygons slowly converging to the true solution is referred to as "Lady Windermere's fan", after the character by Oscar Wilde.
– Guess who it is. Aug 22 '10 at 15:10
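To see concretely how Euler's method applied to $y' = y$, $y(0)=1$ produces $(1+x/n)^n$, here is a small sketch. The function name `euler_exp` and the step counts are choices made here; this is not taken from Arnold's text.

```python
import math


def euler_exp(x, n):
    """Euler's method for y' = y, y(0) = 1, using n equal steps from 0 to x."""
    y, step = 1.0, x / n
    for _ in range(n):
        y += step * y  # y_{k+1} = y_k + step * y_k = (1 + x/n) * y_k
    return y


for n in (1, 10, 100, 10_000):
    print(n, euler_exp(1.0, n), (1 + 1.0 / n) ** n)
print("math.exp(1) =", math.exp(1.0))
```

Each Euler step multiplies the current value by $(1 + x/n)$, so after $n$ steps the polygon ends at $(1+x/n)^n$, which is why the two printed columns agree up to rounding.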

Fair warning: The rigorous formulation can be found in any textbook. What I am giving here is strictly informal intuition behind the result. If you write this in your exam you will get a zero :) However, this can be filled out and made rigorous with a lot of additional argument.
– KalEl Aug 22 '10 at 11:45

1

Well, this is not really an intuition but a heuristic for the proof of the statement.
– Mariano Suárez-Alvarez♦ Aug 22 '10 at 13:25

One interesting way of looking at this is by turning the differential equation into an integral equation.

The problem $ y'=y $ is the same as $ \int y \ dx = y $. For the problem to be well defined I have to specify an initial condition; we use the condition $y(0)=1$.

Now I am going to guess an answer for $y$. The simplest function $y(x)$ that I know which goes through $(0,1)$ is $y_0(x)=1$. The subscript here has no special mathematical meaning; it just signifies that this is my first guess. Substituting this guess into the integral gives

$$\int y_0 \, dx = \int 1 \, dx = x + C$$

The integral added a constant that wasn't there before. Our guess must not have been very good, so we will take this new result as our second guess, hoping that it knows more about the problem. So our new guess is,

$$y_1(x) = x + C = x + 1 $$

Notice that we must set $C=1$ so that $y_1(0)=1$. Substituting this into the integral equation again gives a new guess, $y_2(x) = \int (x+1)\,dx = \frac{x^2}{2} + x + 1$, where the constant is again fixed by requiring $y_2(0)=1$.

If you look closely at the coefficients you will see that we are generating the Maclaurin series for $e^x$.
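The repeated substitution can be carried out mechanically on polynomial coefficients. The sketch below assumes a polynomial is stored as a list of coefficients, lowest degree first; the helper name `integrate_with_y0` is hypothetical, and exact fractions are used so the coefficients print as $1/n!$ rather than as decimals.

```python
from fractions import Fraction


def integrate_with_y0(coeffs, y0=Fraction(1)):
    """Antiderivative of sum(coeffs[k] * x**k), with the constant chosen so y(0) = y0."""
    return [y0] + [c / (k + 1) for k, c in enumerate(coeffs)]


guess = [Fraction(1)]  # first guess: y_0(x) = 1
for n in range(1, 6):
    guess = integrate_with_y0(guess)
    print(f"y_{n} coefficients:", [str(c) for c in guess])
```

Each pass keeps all earlier coefficients and appends one more term, so the iterates are exactly the partial sums of the Maclaurin series.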

This is happening because the only function which is its own anti-derivative is the exponential function. In the terminology of dynamics we would say that the infinite series representation of $e^x$ is a fixed point of the integral operator. It is also an attractor, which means that repeated integration of some unrelated functions will produce a sequence of functions that approach $e^x$.

Geometrically we can see that the gradient of the tangent line of the curve $y = e^x$ must be between the finite differences, i.e. $\frac{e^x - e^{x-h}}{h} < e^x < \frac{e^{x+h} - e^x}{h}$. With some numerical exploration we see that $e$ is between $2$ and $3$. Specifically, if we try $2$ in place of $e$ with $h=0.9$, the finite differences are too small, and if we try $3$ in place of $e$ with $h=0.1$, the finite differences are too big. To narrow it down further, try $2.7$ in place of $e$ with $h = 0.01$ to see that $2.7 < e < 3$.
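The numerical exploration can be reproduced with a few lines of Python. In this sketch, `a` stands for the candidate value tried in place of $e$ (a symbol introduced here), and everything is evaluated at $x=0$, where the inequality reads: backward difference $< 1 <$ forward difference. Both difference quotients increase with `a`, so a forward difference below $1$ means the candidate is too small, and a backward difference above $1$ means it is too big.

```python
def differences(a, h):
    """Backward and forward difference quotients of a**x at x = 0."""
    backward = (1 - a ** (-h)) / h
    forward = (a ** h - 1) / h
    return backward, forward


for a, h in ((2, 0.9), (3, 0.1), (2.7, 0.01)):
    backward, forward = differences(a, h)
    print(f"a={a}, h={h}: backward={backward:.4f}, forward={forward:.4f}")
```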

Why do exponential functions generally (the class $x \mapsto ab^x$ for $b>0$) have the property of being eigenfunctions of the derivative operator? That is: why do these functions f have the property $\frac{\mathrm d}{\mathrm dx} f(x) = \lambda f(x)$?

Why are the functions $x \mapsto a \mathrm e^x$ in particular +1 eigenfunctions --- i.e. why is the scalar $\lambda$ equal to 1 for these functions in particular? (To put it another way: why is this the case when the base is equal to the particular constant e ≈ 2.71828?)

I can give an intuitive answer to the first of these questions, by combining two observations.

Consider the slope at a particular point, e.g. at $x = 0$. Multiplying the function $f(x)$ by some scalar not only amplifies the function by that factor, but also amplifies the slope by that same factor. (This is just a restatement of the product rule, for constant scalar factors.)

Exponential functions $f(x)$ are self-similar: multiplying them by a positive scalar is equivalent to translating them to the left or to the right, depending on the scalar by which you multiply. More precisely, if you multiply $\mathrm e^x$ by $c$, what you get is $\mathrm e^{[x + \ln(c)]}$, or the same function translated $\ln(c)$ to the left. More generally, for $f(x) = ab^x$, we have $cf(x) = f(x + \log_b(c))$; that is, multiplying $f(x)$ by $c$ is equivalent to a shift to the left by $\log_b(c)$.

Now, combine these two facts. The slope of $f(x)$ at $x = \log_b(c)$ is $c$ times the slope of $f(x)$ at $x = 0$, because of the self-similarity of the exponential function, and because the slope of $cf(x)$ at $x = 0$ is $c$ times the slope of $f(x)$ at $x = 0$. Since $c = f(\log_b(c))/f(0)$, the slope of $f(x)$ is a function which is proportional to $f(x)$, QED.

Since the limit of $(a^\delta -1) / \delta$ as $\delta \to 0$ is less than 1 for $a = 2$ and greater than 1 for $a = 3$ (by direct calculations), and since $(a^\delta -1)/\delta$ is a continuous function of a for $\delta > 0$, it follows that there exists a positive real number we'll call $e$ such that for $a=e$ we get $$\lim_{\delta \to 0}\frac{e^\delta - 1}{\delta} = 1.$$
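The "direct calculations" can be imitated numerically. The sketch below evaluates the difference quotient for small $\delta$; the function name `quotient` and the sample values of $\delta$ are choices made here. It prints $\ln a$ alongside for comparison, since the limit is the natural logarithm of $a$.

```python
import math


def quotient(a, delta):
    """Difference quotient (a**delta - 1) / delta: the secant slope of a**x starting at x = 0."""
    return (a ** delta - 1) / delta


for a in (2, math.e, 3):
    estimates = [quotient(a, 10.0 ** -k) for k in (1, 3, 6)]
    print(a, estimates, "ln(a) =", math.log(a))
```

For small $\delta$ the estimate for $a=2$ stays below $1$ and the one for $a=3$ stays above $1$, consistent with a crossover value between them.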

I think we can make these ideas rigorous with nonstandard analysis. (That is, interpret the ~ to be standard part equality and 'very small' to mean infinitesimal).
– anon Aug 22 '10 at 21:22

5

I would not say that $e^x \sim 1+x$ is a defining property of $e^x$...
– Bruno Joyal Dec 14 '11 at 8:51

@Bruno The answer was not elaborated enough, but I disagree: For x small, $e^x \sim 1 + x$ is a defining property of $e$, and a pretty good one. I've tried to give a more elaborated version of why in my answer.
– Greg Hill Jun 19 '13 at 16:38

When we take $x^n$ and differentiate it $n$ times, we get $n!$. So, by that logic, if we ever wanted to construct a function (other than the obvious $f(x) = 0$) which is immune to differentiation (and, by extension, to integration as well, since the two are opposite operations), then that function would have to be $\sum_{n=0}^\infty x^n/n!$. This “just so happens” to be the same as $e^x$, since the famous number $e$ has been expressly created, designed, and defined to equal exactly $\sum_{n=0}^\infty 1/n!$, precisely for this very reason.
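To spell out why dividing by $n!$ makes the series immune, differentiate it term by term (assuming, as is standard for power series, that this is allowed) and shift the index with $m = n-1$:

$$\frac{d}{dx}\sum_{n=0}^{\infty}\frac{x^n}{n!} = \sum_{n=1}^{\infty}\frac{n\,x^{n-1}}{n!} = \sum_{n=1}^{\infty}\frac{x^{n-1}}{(n-1)!} = \sum_{m=0}^{\infty}\frac{x^m}{m!}$$

The constant term drops out, and every other term slides down to replace the one before it.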

Start from $f_0(x)=1$, but then $f'_{0}(x)=0$, so we need something more. Let's try $f_1(x)=1+x$, but then $f_1'(x)=1$, so we need something more. Let's try $f_2(x)=1+x+\dfrac{x^2}{2!}$; then $f'_2(x)=1+x$. Continuing this way, we see that $f_n(x)-f_n'(x)=\dfrac{x^n}{n!}$.

But for large enough $n$, $\dfrac{x^n}{n!}\approx0$, and that is good enough.

From the finitist point of view there is no function $f_n(x)$ s.t. $f_n(x)-f_n'(x)=0$.
So $e^x=\dfrac{\mathrm de^x}{\mathrm dx}$ is really an abuse of notation.