Setting up Data Sets

Before we write logic to process data, let us set up data sets so that we can work through reasonably realistic use cases while learning these concepts.

Performing I/O Operations

Scala provides a few APIs to read data from files. If you want to understand file I/O in more detail, you can explore the Java APIs, which can be used directly from Scala code. Let us see the steps involved in reading data into a collection.

import scala.io.Source

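Use Source.fromFile to open the file; it returns a BufferedSource that reads the file's characters through a buffer

Use getLines to split the content on line breaks; it returns an Iterator[String] that can be converted to a collection, for example with toList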

For demonstration purposes, we read data from order_items into a variable called orderItems
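The steps above can be sketched as follows. The location and layout of order_items are not given here, so this sketch first writes a few hypothetical comma-separated rows to a temporary file and then reads them back; the sample rows and the temporary path are assumptions made purely for illustration.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import scala.io.Source

// Hypothetical sample rows; the real order_items data is not shown in this section.
val sample =
  "1,1,957,1,299.98,299.98\n2,2,1073,1,199.99,199.99\n3,2,502,5,250.0,50.0\n4,2,403,1,129.99,129.99"

// Stand-in for the real file: a temporary file holding the sample rows.
val path = Files.write(
  Files.createTempFile("order_items", ".csv"),
  sample.getBytes(StandardCharsets.UTF_8)
)

// Open the file, split it into lines, and materialise the lines as a List.
val source = Source.fromFile(path.toFile, "UTF-8")
val orderItems = try source.getLines().toList finally source.close()
```

Calling toList before closing the source matters: getLines returns an iterator that reads on demand, so the lines have to be consumed while the file is still open.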

Using Map Reduce APIs

Get all order items for order id 2 – use filter API

Extract the order item subtotal for each item belonging to order id 2 – use map API

Add up the order item subtotals to get the revenue for order id 2 – use reduce API
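A minimal sketch of these three steps is shown below. It assumes each line is comma-separated with the order id in the second field and the subtotal in the fifth; that layout and the sample rows are assumptions, not something stated in this section.

```scala
// Hypothetical sample rows. Assumed layout:
// item id, order id, product id, quantity, subtotal, product price
val orderItems = List(
  "1,1,957,1,299.98,299.98",
  "2,2,1073,1,199.99,199.99",
  "3,2,502,5,250.0,50.0",
  "4,2,403,1,129.99,129.99"
)

// filter: keep only the items whose order id (second field) is 2
val itemsForOrder2 = orderItems.filter(line => line.split(",")(1).toInt == 2)

// map: turn each remaining line into its subtotal (fifth field)
val subtotals = itemsForOrder2.map(line => line.split(",")(4).toDouble)

// reduce: add the subtotals pairwise to get the revenue for the order
val revenue = subtotals.reduce((total, subtotal) => total + subtotal)
```

Note that reduce throws an exception on an empty collection, so if an order might have no items, subtotals.sum or subtotals.foldLeft(0.0)(_ + _) is a safer choice.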

Understanding Tuples

A tuple is another useful structure, and it is available in many modern programming languages.

It is a generic type whose elements are accessed by position using underscore notation (_1, _2, and so on, starting at 1)

A tuple represents a record whose elements can be of different types

It removes the need to define a class and instantiate objects just to group a few values together.
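As an illustration, the sketch below builds a tuple from one order item line and reads its elements back. The sample line and the field positions are hypothetical, chosen only to show the notation.

```scala
// Hypothetical order item line; the order id is assumed to be the second field
// and the subtotal the fifth.
val line = "2,2,1073,1,199.99,199.99"
val fields = line.split(",")

// A tuple holding elements of different types: (Int, Double)
val orderItem = (fields(1).toInt, fields(4).toDouble)

// Elements are accessed by position, starting at _1
val orderId = orderItem._1
val subtotal = orderItem._2

// A tuple can also be unpacked into named values in one step
val (id, amount) = orderItem
```

No class had to be declared to carry the order id and subtotal together; the compiler infers the type (Int, Double) from the values.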