What are some of the most popular data science tools, how do you use them, and what are their features? In this course, you'll learn about Jupyter Notebooks, RStudio IDE, Apache Zeppelin, and Data Science Experience. You will learn what each tool is used for, which programming languages it can execute, and its features and limitations. With the tools hosted in the cloud on Cognitive Class Labs, you will be able to test each tool and follow instructions to run simple code in Python, R, or Scala. To end the course, you will create a final project with a Jupyter Notebook on IBM Data Science Experience and demonstrate your proficiency in preparing a notebook, writing Markdown, and sharing your work with your peers.
Instructor

Polong Lin

Data Scientist

Transcript

Welcome. So what are Zeppelin notebooks? In this video, we'll be looking at Zeppelin notebooks, one of the open-source, web-based tools in Data Scientist Workbench. Zeppelin notebooks are multipurpose notebooks that can handle all your analytics needs, from data ingestion, data discovery, and data analytics to data visualization and collaboration.

The Zeppelin interpreter concept allows any language or data processing backend to be plugged into Zeppelin. Currently, Apache Zeppelin supports many interpreters, such as Apache Spark, Python, JDBC, Markdown, and Shell. Apache Zeppelin, in particular, provides built-in Apache Spark integration. You don't need to build a separate module, plug-in, or library for it. Apache Zeppelin with Spark integration provides a number of great features, including automatic Spark context and SQL context injection, runtime JAR dependency loading from the local file system or a Maven repository, as well as job cancellation and progress display. For further information about Apache Spark in Apache Zeppelin, take a look at the Apache Spark in Zeppelin Notebooks video.

For data visualization, some basic charts are already included in Apache Zeppelin. Visualizations are not limited to Spark SQL queries. In fact, any output from any language backend can be recognized and visualized. For pivot charts, Apache Zeppelin aggregates values and displays them in a pivot chart with simple drag and drop. You can easily create charts with multiple aggregated values, including sum, count, average, minimum, and maximum. For dynamic forms, Apache Zeppelin can dynamically create input forms for your notebook.

Apache Zeppelin is Apache 2.0 licensed software. Zeppelin notebooks are 100% open source, so please check out the source repository and how to contribute. In fact, Apache Zeppelin has a very active development community. Please feel free to join our mailing list, and report issues if you'd like on the Jira issue tracker.
This brings us to the end of this video. Thanks for watching.
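
The video says that output from any language backend can be recognized and visualized, and that pivot charts aggregate values by sum, count, average, minimum, and maximum. The following is a minimal Python sketch of both ideas, not Zeppelin's own implementation. It assumes Zeppelin's `%table` display convention (output that starts with `%table`, followed by tab-separated columns and newline-separated rows, is rendered as a table that can then be charted). The helper names `aggregate` and `to_zeppelin_table` and the sample `sales` data are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def aggregate(rows, key, value):
    """Group rows by `key` and compute the five aggregates the video
    mentions (sum, count, average, minimum, maximum) over `value`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {
        k: {
            "sum": sum(v),
            "count": len(v),
            "avg": sum(v) / len(v),
            "min": min(v),
            "max": max(v),
        }
        for k, v in groups.items()
    }


def to_zeppelin_table(header, rows):
    """Build a string in the assumed %table format:
    tab-separated columns, newline-separated rows."""
    lines = ["\t".join(header)]
    lines += ["\t".join(str(cell) for cell in row) for row in rows]
    return "%table " + "\n".join(lines)


# Hypothetical sample data, not taken from the course.
sales = [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 4},
    {"region": "east", "amount": 6},
]

stats = aggregate(sales, "region", "amount")
table = to_zeppelin_table(
    ["region", "sum", "count", "avg", "min", "max"],
    [
        [region, s["sum"], s["count"], s["avg"], s["min"], s["max"]]
        for region, s in sorted(stats.items())
    ],
)

# In a Zeppelin Python paragraph, printing this string should render a
# table instead of plain text; elsewhere it prints as ordinary text.
print(table)
```

Run outside Zeppelin, the script simply prints the tab-separated text. Inside a notebook paragraph, the same printed output is what the built-in charts would pick up, though the pivot chart normally performs this kind of aggregation for you through drag and drop.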