I am interested in the function $\sum_{i=0}^{k} {N \choose i}$ for fixed $N$ and $0 \leq k \leq N $. Obviously it equals 1 for $k = 0$ and $2^{N}$ for $k = N$, but are there any other notable properties? Any literature references?

In particular, does it have a closed form or notable algorithm for computing it efficiently?

In case you are curious, this function comes up in information theory as the number of bit-strings of length $N$ with Hamming weight less than or equal to $k$.
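For concreteness, here is a minimal sketch of the direct computation (the function name is my own): it accumulates the sum exactly in $O(k)$ big-integer operations by updating each binomial coefficient from the previous one.

```python
def partial_binomial_sum(N, k):
    """Sum of C(N, i) for i = 0..k, computed exactly in O(k) big-integer steps."""
    term, total = 1, 1  # the i = 0 term: C(N, 0) = 1
    for i in range(k):
        term = term * (N - i) // (i + 1)  # C(N, i+1) from C(N, i); division is exact
        total += term
    return total

assert partial_binomial_sum(10, 0) == 1
assert partial_binomial_sum(10, 10) == 2 ** 10
```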

Edit: I've come across a useful upper bound: $(N+1)^{\underline{k}}$, where the underlined $k$ denotes a falling factorial. Combinatorially, this corresponds to listing the positions of the set bits of the string (in an arbitrary order) and tacking on a 'done' symbol at the end. Any better bounds?
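A quick empirical check of this bound against the exact sum (a sketch; `falling_factorial` is a hypothetical helper of mine, and `math.comb` requires Python 3.8+):

```python
import math

def falling_factorial(x, k):
    # x * (x-1) * ... * (x-k+1), i.e. x to the falling k
    out = 1
    for i in range(k):
        out *= x - i
    return out

N = 30
for k in range(N + 1):
    exact = sum(math.comb(N, i) for i in range(k + 1))
    assert exact <= falling_factorial(N + 1, k)  # the claimed upper bound
```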

For $k$ in the vicinity of $N/2$, $\sum_{i=0}^{k} \binom{N}{i} \approx 2^N\, g\!\left(\frac{2k-N}{\sqrt{N}}\right)$ for some function $g$ (the standard normal CDF). This is essentially a rewriting of a special case of the central limit theorem: the Hamming weight of a word chosen uniformly at random is a sum of $N$ Bernoulli(1/2) random variables, with mean $N/2$ and standard deviation $\sqrt{N}/2$.
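A quick numerical illustration of this normal approximation (a sketch under the assumptions above; the continuity correction $k + 1/2$ is my own choice, not part of the original claim):

```python
import math

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def clt_estimate(N, k):
    # 2^N * Phi((2k - N)/sqrt(N)), with a continuity correction at k + 0.5
    return 2.0 ** N * normal_cdf((2 * (k + 0.5) - N) / math.sqrt(N))

N = 100
for k in (40, 50, 60):
    exact = sum(math.comb(N, i) for i in range(k + 1))
    print(k, clt_estimate(N, k) / exact)  # ratios near 1 around k = N/2
```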

Using the summation formula for Pascal's triangle, you get a shorter geometric-series approximation which may work well for $k$ less than but not too close to $N/2$. This is $\binom{N+1}{k} + \binom{N+1}{k-2} + \cdots$, which has about half as many terms and a term ratio that is bounded from above by $\frac{k^2-k}{(N+1-k)(N+2-k)}$, giving $\frac{(N+1-k)(N+2-k)}{(N+1-k)(N+2-k)-k^2+k}\binom{N+1}{k}$ as an uglier but hopefully tighter upper bound. Gerhard "Ask Me About System Design" Paseman, 2010.03.06
– Gerhard Paseman, Mar 6 '10 at 8:03
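A sketch of this bound next to the exact sum (the helper name `geometric_upper_bound` is mine; the assert guards the condition that the ratio stays below 1):

```python
import math

def geometric_upper_bound(N, k):
    # C(N+1, k) * D / (D - k^2 + k), with D = (N+1-k)(N+2-k);
    # needs the ratio (k^2 - k)/D < 1, i.e. k not too close to N/2
    D = (N + 1 - k) * (N + 2 - k)
    assert D > k * k - k, "ratio >= 1: k is too close to N/2"
    return math.comb(N + 1, k) * D / (D - k * k + k)

N = 100
for k in (10, 25, 40):
    exact = sum(math.comb(N, i) for i in range(k + 1))
    print(k, geometric_upper_bound(N, k) / exact)  # >= 1, tighter for smaller k
```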

One can take this a step further. In addition to combining pairs of terms of the original sum $\binom{N}{i}$ to get a sum of terms of the form $\binom{N+1}{2j+c}$, where $c$ is always 0 or always 1, one can now take the top two or three or $k$ terms, combine them, and use them as a base for a "pseudo-geometric" sequence whose common ratio is the square, cube, or $k$th power of the initial common ratio. This will give more accuracy at the cost of computing small sums of binomial coefficients. Gerhard "Ask Me About System Design" Paseman, 2010.03.27
– Gerhard Paseman, Mar 27 '10 at 17:00

When $k$ is so close to $N/2$ that the above is not effective, one can then consider using $2^{N-1} - c\binom{N}{N/2}$, where $c = N/2 - k$. Gerhard "Ask Me About System Design" Paseman, 2010.03.27
– Gerhard Paseman, Mar 27 '10 at 17:04
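And a sketch of that near-central estimate (function name mine; it assumes $N$ even and $k \le N/2$):

```python
import math

def near_central_estimate(N, k):
    # 2^(N-1) - c * C(N, N/2), with c = N/2 - k; assumes N even and k <= N/2
    c = N // 2 - k
    return 2 ** (N - 1) - c * math.comb(N, N // 2)

N = 100
for k in (46, 48, 50):
    exact = sum(math.comb(N, i) for i in range(k + 1))
    print(k, near_central_estimate(N, k) / exact)  # rough; best just below N/2
```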

I would not be so harsh in saying that the hypergeometric form is "not useful"; for instance, one can apply a Pfaff transformation, dlmf.nist.gov/15.8.E1, to yield the identity $${}_2 F_1\left({{1 \quad m-n+1}\atop{m+2}}\mid-1\right)=\frac12 {}_2 F_1\left({{1 \quad n+1}\atop{m+2}}\mid\frac12\right)$$
– J. M., Oct 4 '11 at 0:57

The second ${}_2F_1$ has an argument that is nearer the expansion center $0$ of the Gaussian hypergeometric series, so it stands to reason that the convergence is a bit faster. Also, one no longer needs to add terms of different signs...
– J. M., Oct 4 '11 at 0:59
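A quick numerical check of the Pfaff-transformed identity (a sketch assuming mpmath is available; the test parameter values are arbitrary choices of mine):

```python
from mpmath import mp, hyp2f1

mp.dps = 30  # work with 30 significant digits

# check 2F1(1, m-n+1; m+2; -1) = (1/2) * 2F1(1, n+1; m+2; 1/2)
for n, m in [(20, 5), (50, 12), (100, 30)]:
    lhs = hyp2f1(1, m - n + 1, m + 2, -1)
    rhs = hyp2f1(1, n + 1, m + 2, 0.5) / 2
    assert abs(lhs - rhs) < 1e-20 * abs(rhs)
```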

$T(n,k) = \sum_{i=0}^k {n\choose i}$ is the maximal number of regions into which $n$ hyperplanes of co-dimension $1$ divide $\mathbb R^k$ (the Cake-Without-Icing numbers).

$2 ~T(n-1,k-1)$ is the number of orthants intersecting a generic linear subspace of $\mathbb R^n$ of dimension $k$. This tells you that if you choose $a$ independent uniform random points on the unit sphere in $\mathbb R^d$, the probability that the origin is contained in their convex hull is $T(a-1,a-d-1)/2^{a-1}$; complementarily, this is the probability that no hemisphere contains all of the points. The map $\mathbb R^a \to \mathbb R^d$ sending coefficient vectors to the corresponding linear combinations of the points generically has a kernel of dimension $a-d$, and this kernel intersects the positive orthant iff $0$ is in the convex hull of the points. By symmetry, all orthants are equally likely.
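A Monte Carlo sanity check of that probability (a sketch assuming numpy and scipy are available; the hemisphere test is phrased as an LP feasibility problem, and the trial count is my choice):

```python
import math
import numpy as np
from scipy.optimize import linprog

def T(n, k):
    # T(n, k) = sum_{i=0}^{k} C(n, i)
    return sum(math.comb(n, i) for i in range(k + 1))

def in_open_hemisphere(P):
    # The rows of P lie in a common open hemisphere iff some v satisfies
    # <p_i, v> >= 1 for all i (the 1 is harmless: rescale v).
    a, d = P.shape
    res = linprog(np.zeros(d), A_ub=-P, b_ub=-np.ones(a),
                  bounds=[(None, None)] * d, method="highs")
    return res.status == 0  # 0 = feasible, 2 = infeasible

rng = np.random.default_rng(0)
a, d, trials = 6, 3, 5000
hits = 0
for _ in range(trials):
    P = rng.normal(size=(a, d))
    P /= np.linalg.norm(P, axis=1, keepdims=True)  # uniform points on the sphere
    hits += not in_open_hemisphere(P)  # generically: 0 in hull <=> no hemisphere
print(hits / trials, T(a - 1, a - d - 1) / 2 ** (a - 1))  # both should be ~ 0.5
```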

The sum without the $i=0$ term arises in the "egg drop" problem -- see Michael Boardman's article, "The Egg-Drop Numbers," in Mathematics Magazine, Vol. 77, No. 5 (December 2004), pp. 368-372, which concludes by saying, "it is well known that there is no closed form (that is, direct formula) for the partial sum of binomial coefficients," with a reference to the book A=B by Petkovsek, Wilf, and Zeilberger (but unfortunately no page reference).

If you're interested in some back-of-the-envelope order-of-magnitude estimates, you might consider looking at how $\binom{n}{k}$ behaves when $k=k(n)$ has a certain size.
The idea I have in mind is to break down $\sum_{k=0}^m\binom{n}{k}$ into a sum over intervals of $k$ satisfying a certain regime. For example, look at terms where $k=\Theta(n)$, $k=\Theta(n^{1/2})$, etc. In general, using Stirling's approximation, you'll get:

$\binom{n}{k}\approx\frac{n^ke^k}{k^k\sqrt{2\pi k}}\, A$

where $A:=\frac{n^{\underline{k}}}{n^k}=\prod_{i=0}^{k-1}\left(1-\frac{i}{n}\right)$ and $n^{\underline{k}}=n(n-1)\cdots(n-k+1)$ is the falling factorial. In particular, it's nicer to work with $B:=\ln A = \sum_{i=0}^{k-1} \ln\left(1-\frac{i}{n}\right)$.

Now the idea is that each of the logarithm terms in $B$ can be Taylor expanded up to "sufficient" order depending on the size of $k$ compared to $n$. For example, if $k=o(n)$, then
$B\approx \sum_{i=0}^{k-1}\left(-\frac{i}{n}\right)\approx -\frac{k^2}{2n}$, so you get $A=e^{-\frac{k^2}{2n}(1+o(1))}$. In fact, you can do better than this if you expand $B$ to higher orders. In particular, if $k=o(n^{2/3})$, then $B=\sum_{i=0}^{k-1}\left(-\frac{i}{n}+O\!\left(\frac{i^2}{n^2}\right)\right)=-\frac{k^2}{2n}+o(1)$, which gives $A=e^{-\frac{k^2}{2n}}(1+o(1))$, where now the $o(1)$ is no longer exponentiated. For other sizes of $k$, the exact same procedure works as long as you expand $B$ to sufficiently high order.