What We Use

Apache Spark is an open source data analytics cluster computing framework originally developed in the AMPLab at UC Berkeley. Spark fits into the Hadoop open source community, building on top of the Hadoop Distributed File System (HDFS).

Commonly Used By

ArcGIS is a comprehensive system that allows people to collect, organize, manage, analyze, communicate, and distribute geographic information. As the world's leading platform for building and using geographic information systems, ArcGIS is used by people all over the world to put geographic knowledge to work in government, business, science, education, and media.

Commonly Used By

Cython is an optimizing static compiler for both the Python programming language and the extended Cython programming language (based on Pyrex). It makes writing C extensions for Python as easy as Python itself.

The Apache Hadoop software library is a framework that allows for the distributed processing of large datasets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly available service on top of a cluster of computers, each of which may be prone to failures.

Commonly Used By

HTML5 is a core technology markup language of the Internet used for structuring and presenting content for the World Wide Web. As of October 2014, this is the final and complete fifth revision of the HTML standard of the World Wide Web Consortium (W3C).

Commonly Used By

Pages

The Jupyter Notebook is a web application that allows you to create and share documents that contain live code, equations, visualizations, and explanatory text. Uses include data cleaning and transformation, numerical simulation, statistical modeling, machine learning, and much more.

Commonly Used By

Pages

matplotlib is a python 2D plotting library that produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. matplotlib can be used in python scripts, the python and ipython shell (ala MATLAB or Mathematica), web application servers, and six graphical user interface toolkits.

matplotlib tries to make easy things easy and hard things possible. You can generate plots, histograms, power spectra, bar charts, errorcharts, scatterplots, etc., with just a few lines of code.

Commonly Used By

Pages

Message Passing Interface (MPI) is a standardized and portable message-passing system designed by a group of researchers from academia and industry to function on a wide variety of parallel computers. The standard defines the syntax and semantics of a core of library routines useful to a wide range of users writing portable message-passing programs in different computer programming languages such as Fortran, C, C++ and Java.

Commonly Used By

Pages

NumPy is the fundamental package for scientific computing with Python. It contains, among other things, a powerful N-dimensional array object; sophisticated (broadcasting) functions; tools for integrating C/C++ and Fortran code; and useful linear algebra, Fourier transform, and random number capabilities. Numpy is licensed under the BSD license, enabling reuse with few restrictions.

NumPy development is supported at BIDS through grants from the Moore and the Sloan Foundations.