
pandas: powerful Python data analysis toolkit

What is it

pandas is a Python package providing fast, flexible, and expressive data
structures designed to make working with "relational" or "labeled" data both
easy and intuitive. It aims to be the fundamental high-level building block for
doing practical, real-world data analysis in Python. Additionally, it has
the broader goal of becoming the most powerful and flexible open source data
analysis / manipulation tool available in any language. It is already well on
its way toward this goal.

Main Features

Here are just a few of the things that pandas does well:

Easy handling of missing data (represented as
NaN) in floating point as well as non-floating point data

Automatic and explicit data alignment: objects can
be explicitly aligned to a set of labels, or you can simply
ignore the labels and let Series, DataFrame, etc. automatically
align the data for you in computations

Powerful, flexible group by functionality to perform
split-apply-combine operations on data sets, for both aggregating
and transforming data

Easy conversion of ragged,
differently-indexed data in other Python and NumPy data structures
into DataFrame objects
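
As a rough illustration of the features listed above, here is a minimal sketch using small made-up Series and DataFrame objects. The variable names and sample values are assumptions chosen for the example and do not come from the pandas docs. It assumes pandas and NumPy are installed and imported under their conventional aliases.

```python
import numpy as np
import pandas as pd

# 1. Missing data: NaN marks a missing float; sum() skips it by default.
s = pd.Series([1.0, np.nan, 3.0], index=["a", "b", "c"])
total = s.sum()
filled = s.fillna(0.0)  # replace missing values explicitly

# 2. Automatic alignment: arithmetic matches on index labels, not position.
left = pd.Series([1, 2, 3], index=["a", "b", "c"])
right = pd.Series([10, 20, 30], index=["b", "c", "d"])
aligned_sum = left + right  # labels found on only one side become NaN

# 3. Group by: split rows by key, then aggregate or transform each group.
df = pd.DataFrame({"key": ["x", "y", "x", "y"], "value": [1, 2, 3, 4]})
group_totals = df.groupby("key")["value"].sum()
group_demeaned = df["value"] - df.groupby("key")["value"].transform("mean")

# 4. Ragged, differently-indexed inputs: Series of different lengths are
#    combined into one DataFrame over the union of their labels.
ragged = {
    "p": pd.Series([1.0, 2.0], index=["a", "b"]),
    "q": pd.Series([3.0, 4.0, 5.0], index=["b", "c", "d"]),
}
frame = pd.DataFrame(ragged)

print(aligned_sum)
print(group_totals)
print(frame)
```

In each case the labels do the work: positions never need to match, and gaps show up as NaN rather than raising an error.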

License

Documentation

The Sphinx documentation should provide a good starting point for learning how
to use the library. Expect the docs to continue to expand as time goes on.

Background

Work on pandas started at AQR (a quantitative hedge fund) in 2008 and
has been under active development since then.

Discussion and Development

Since pandas development is related to a number of other scientific
Python projects, questions are welcome on the scipy-user mailing
list. Specialized discussions or design issues should take place on
the pystatsmodels mailing list / Google group, where
scikits.statsmodels and other libraries will also be discussed: