Linear Algebra for Machine Learning with Scala

In this tutorial, we will cover some basics of linear algebra. We will use ND4S, the Scala bindings for ND4J, a scientific computing library for the JVM.

Linear Algebra building blocks

We begin by discussing the building blocks of linear algebra: scalars, vectors, matrices and tensors.

Scalar – a single number

Vector – an ordered array of numbers

Matrix – a two-dimensional array of numbers, arranged in rows and columns

Tensor – an array with any number of dimensions; scalars, vectors and matrices are the special cases with zero, one and two dimensions
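To make the four definitions concrete, here is a minimal sketch that models them with plain Scala collections. These are stand-ins chosen for illustration, not ND4J types, and the object name BuildingBlocks is hypothetical.

```scala
// Plain Scala stand-ins for the four building blocks (not ND4J types).
object BuildingBlocks {
  val scalar: Double = 3.0
  val vector: Vector[Double] = Vector(1.0, 2.0, 3.0)
  val matrix: Vector[Vector[Double]] =
    Vector(Vector(1.0, 2.0, 3.0), Vector(4.0, 5.0, 6.0))
  // Two 2 x 3 matrices stacked along a new first dimension.
  val tensor: Vector[Vector[Vector[Double]]] = Vector(matrix, matrix)

  // The shape lists the size of every dimension.
  val vectorShape: List[Int] = List(vector.length)
  val matrixShape: List[Int] = List(matrix.length, matrix.head.length)
  val tensorShape: List[Int] =
    List(tensor.length, tensor.head.length, tensor.head.head.length)
}
```

Each extra level of nesting adds one dimension, which is why the shape of the tensor has three entries and the shape of the matrix has two.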

Why do we need Linear Algebra for Machine Learning?

Machine learning is all about data, and data can be represented as a vector, matrix or tensor. Machine learning often requires large amounts of data to be effective, and computations on large matrices can be performed very efficiently with highly optimized matrix libraries such as ND4J.

ND4J

The core data structure in ND4J is the NDArray, which is a multi-dimensional array of numbers: a vector, matrix or tensor.
Internally, it may store single precision or double precision floating point values for each entry.

Let’s add dependencies.

val nd4jVersion = "0.9.1"
libraryDependencies += "org.nd4j" % "nd4j-native-platform" % nd4jVersion
libraryDependencies += "org.nd4j" %% "nd4s" % nd4jVersion

Add the following import statements.

import org.nd4j.linalg.factory.Nd4j
import org.nd4s.Implicits._

NDArray operations

We will use the Nd4j class, which exposes many static methods to help us with the creation and manipulation of NDArrays.

Adding two vectors

Two vectors of the same size can be added by adding the corresponding elements.

scala> val one = Nd4j.create(Array(1d, 2d, 3d))
one: org.nd4j.linalg.api.ndarray.INDArray = [1.00, 2.00, 3.00]

scala> val two = Nd4j.create(Array(3d, 5d, 6d))
two: org.nd4j.linalg.api.ndarray.INDArray = [3.00, 5.00, 6.00]

scala> one + two
res7: org.nd4j.linalg.api.ndarray.INDArray = [4.00, 7.00, 9.00]
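To see what element-wise addition means without any library involved, here is a small plain-Scala sketch. The helper add is hypothetical and only illustrates the definition; it is not how ND4J implements the operation.

```scala
object VectorAddition {
  // Hypothetical helper: adds two vectors element by element.
  def add(a: Array[Double], b: Array[Double]): Array[Double] = {
    require(a.length == b.length, "vectors must have the same size")
    a.zip(b).map { case (x, y) => x + y }
  }
}
```

Calling VectorAddition.add(Array(1d, 2d, 3d), Array(3d, 5d, 6d)) pairs up 1 with 3, 2 with 5 and 3 with 6, giving the same 4, 7, 9 as the ND4S session above.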

Scalar-vector multiplication

Scalar-vector multiplication is an operation in which every element of the vector is multiplied by a scalar.

scala> val vec = Nd4j.create(Array(3d, 5d, 6d))
vec: org.nd4j.linalg.api.ndarray.INDArray = [3.00, 5.00, 6.00]

scala> vec * 2
res8: org.nd4j.linalg.api.ndarray.INDArray = [6.00, 10.00, 12.00]
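The same operation can be spelled out in plain Scala as a one-line map. This is only a sketch of the definition; the helper name scale is an assumption made for illustration.

```scala
object ScalarVector {
  // Hypothetical helper: multiplies every element of v by the scalar k.
  def scale(v: Array[Double], k: Double): Array[Double] = v.map(_ * k)
}
```

ScalarVector.scale(Array(3d, 5d, 6d), 2) doubles each entry and yields 6, 10, 12, matching the ND4S result.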

Vector transpose

The transpose of a vector changes a column vector to a row vector or vice versa.

scala> val vec = Nd4j.create(Array(3d, 5d, 6d))
vec: org.nd4j.linalg.api.ndarray.INDArray = [3.00, 5.00, 6.00]

scala> vec.shape
res8: Array[Int] = Array(1, 3)

scala> vec.T.shape
res9: Array[Int] = Array(3, 1)

We used the shape method to check the size of each dimension. The shape has changed from 1×3 (1 row, 3 columns) to 3×1 (3 rows, 1 column).
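The change of shape can also be reproduced with nested Scala collections, where a row vector is a matrix with a single row. This is a sketch under that assumption; the names TransposeSketch and shape are hypothetical.

```scala
object TransposeSketch {
  type Matrix = Vector[Vector[Double]]

  // Shape as (rows, columns); assumes a non-empty, rectangular matrix.
  def shape(m: Matrix): (Int, Int) = (m.length, m.head.length)

  val row: Matrix = Vector(Vector(3.0, 5.0, 6.0)) // 1 x 3 row vector
  val column: Matrix = row.transpose              // 3 x 1 column vector
}
```

Transposing the column again gives back the original row, since the operation only swaps the roles of rows and columns.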

Vector dot product

The vector dot product is one of the most important operations in machine learning.
It is defined as the sum of the products of corresponding elements of two vectors of the same size. We can think of the dot product as a measure of similarity between two vectors.

scala> val vec1 = Nd4j.create(Array(3d, 5d, 6d))
vec1: org.nd4j.linalg.api.ndarray.INDArray = [3.00, 5.00, 6.00]

scala> val vec2 = Nd4j.create(Array(1d, 2d, 3d))
vec2: org.nd4j.linalg.api.ndarray.INDArray = [1.00, 2.00, 3.00]

scala> vec1.dot(vec2.T)
res12: org.nd4j.linalg.api.ndarray.INDArray = 31.00
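For the two vectors above the arithmetic is 3·1 + 5·2 + 6·3 = 31. The plain-Scala sketch below spells this out. It also adds a cosine helper, which divides the dot product by the lengths of the two vectors; that is one common way to turn the dot product into a similarity score and is an addition for illustration, not something the ND4S session uses. Both helper names are hypothetical.

```scala
object DotProduct {
  // Sum of the products of corresponding elements.
  def dot(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same size")
    a.zip(b).map { case (x, y) => x * y }.sum
  }

  // Dot product normalized by both vector lengths.
  // Assumes neither vector is all zeros (otherwise this divides by zero).
  def cosine(a: Array[Double], b: Array[Double]): Double =
    dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
}
```

With this normalization, a vector compared with itself scores 1 and two perpendicular vectors score 0, which is the sense in which the dot product measures similarity.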

Creating matrices

A matrix is a two-dimensional array. We can create a matrix from a two-dimensional Array.

scala> val matrix = Nd4j.create(Array(Array(1d, 2d, 3d), Array(4d, 5d, 6d)))
matrix: org.nd4j.linalg.api.ndarray.INDArray =
[[1.00, 2.00, 3.00],
 [4.00, 5.00, 6.00]]

We can now check the shape of our newly created matrix.

scala> matrix.shape
res0: Array[Int] = Array(2, 3)

Accessing and setting matrix elements

NDArray supports multidimensional indexing. To access or set a particular element of a matrix, we specify its row and column index; indices are zero-based.

scala> val matrix = Nd4j.create(Array(Array(1d, 2d, 3d), Array(4d, 5d, 6d)))
matrix: org.nd4j.linalg.api.ndarray.INDArray =
[[1.00, 2.00, 3.00],
 [4.00, 5.00, 6.00]]

scala> matrix(1, 2)
res1: Double = 6.0

scala> matrix(0, 0) = 100
res2: org.nd4j.linalg.api.ndarray.INDArray =
[[100.00, 2.00, 3.00],
 [4.00, 5.00, 6.00]]

Matrix addition and subtraction

Two matrices can be added or subtracted if, and only if, they have the same dimensions. To add (or subtract) two matrices of the same dimensions, just add (or subtract) the corresponding entries.

scala> val matrix1 = Nd4j.create(Array(Array(1d, 2d, 3d), Array(4d, 5d, 6d)))
matrix1: org.nd4j.linalg.api.ndarray.INDArray =
[[1.00, 2.00, 3.00],
 [4.00, 5.00, 6.00]]

scala> val matrix2 = Nd4j.create(Array(Array(1d, 2d, 3d), Array(4d, 5d, 6d)))
matrix2: org.nd4j.linalg.api.ndarray.INDArray =
[[1.00, 2.00, 3.00],
 [4.00, 5.00, 6.00]]

scala> matrix1 + matrix2
res0: org.nd4j.linalg.api.ndarray.INDArray =
[[2.00, 4.00, 6.00],
 [8.00, 10.00, 12.00]]

scala> matrix1 - matrix2
res1: org.nd4j.linalg.api.ndarray.INDArray =
[[0.00, 0.00, 0.00],
 [0.00, 0.00, 0.00]]
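The requirement that both matrices have the same dimensions can be made explicit in a plain-Scala sketch. The helpers below are hypothetical and only illustrate the rule; they are not the ND4J implementation.

```scala
object MatrixArithmetic {
  type Matrix = Vector[Vector[Double]]

  // True when both matrices have the same number of rows and matching row lengths.
  private def sameShape(a: Matrix, b: Matrix): Boolean =
    a.length == b.length && a.zip(b).forall { case (r, s) => r.length == s.length }

  // Applies op to every pair of corresponding entries.
  private def combine(a: Matrix, b: Matrix)(op: (Double, Double) => Double): Matrix = {
    require(sameShape(a, b), "matrices must have the same dimensions")
    a.zip(b).map { case (r, s) => r.zip(s).map { case (x, y) => op(x, y) } }
  }

  def add(a: Matrix, b: Matrix): Matrix = combine(a, b)(_ + _)
  def subtract(a: Matrix, b: Matrix): Matrix = combine(a, b)(_ - _)
}
```

Passing a 2×3 matrix together with a matrix of any other shape fails the require check with an IllegalArgumentException instead of silently producing a partial result.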

Matrix product

The matrix dot product is an operation that produces a matrix from two matrices. The number of columns of the first matrix must equal the number of rows of the second, and the result has as many rows as the first matrix and as many columns as the second. This is how we define the dot product of two matrices, A \((2 \times 3)\) and B \((3 \times 2)\).
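As a rough plain-Scala sketch of this rule, each entry of the result is the dot product of one row of the first matrix with one column of the second. The helper multiply and the sample values of B are assumptions made for illustration, not part of ND4J.

```scala
object MatrixProduct {
  type Matrix = Vector[Vector[Double]]

  // Entry (i, j) of the result is the dot product of row i of a and column j of b.
  def multiply(a: Matrix, b: Matrix): Matrix = {
    require(a.nonEmpty && b.nonEmpty && a.forall(_.length == b.length),
      "columns of the first matrix must equal rows of the second")
    val bColumns = b.transpose
    a.map(row => bColumns.map(col => row.zip(col).map { case (x, y) => x * y }.sum))
  }

  val A: Matrix = Vector(Vector(1.0, 2.0, 3.0), Vector(4.0, 5.0, 6.0))         // 2 x 3
  val B: Matrix = Vector(Vector(1.0, 2.0), Vector(3.0, 4.0), Vector(5.0, 6.0)) // 3 x 2
  val C: Matrix = multiply(A, B)                                               // 2 x 2
}
```

For these sample values, the top-left entry of C is 1·1 + 2·3 + 3·5 = 22, and the full result is the 2×2 matrix with rows (22, 28) and (49, 64).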