First-class functions (FCF)

The distinction is not that individual functions can be first class or not, but that entire languages may treat functions as first-class objects, or may not.

A first-class function is not a particular kind of function.

All functions in Python are first-class functions. To say that functions are first-class in a certain programming language means that they can be passed around and manipulated similarly to how you would pass around and manipulate other kinds of objects (like integers or strings).

"First-class functions" (FCF) are functions that are treated as so-called "first-class citizens" (FCC).

FCCs in a programming language are objects (using the term "objects" very loosely here) which:

Can be passed as parameters/arguments to other functions

Can be returned as values from other functions

Can be assigned to variables and stored in data structures

# Python program to illustrate that functions
# can be passed as arguments to other functions
def shout(text):
    return text.upper()

def whisper(text):
    return text.lower()

def greet(func):
    # calling the function passed in and storing its result
    greeting = func("Hi, I am created by a function passed as an argument.")
    print(greeting)

greet(shout)
greet(whisper)

Output:

HI, I AM CREATED BY A FUNCTION PASSED AS AN ARGUMENT.
hi, i am created by a function passed as an argument.

Closure

Summary: you can return a function without executing it by omitting the parentheses after its name.

def print_msg(msg):
    # This is the outer enclosing function

    def printer():
        # This is the nested function
        print(msg)

    return printer  # this got changed

# Now let's try calling this function.
# Output: Hello
another = print_msg("Hello")
another()

When do we have a closure?

As seen from the above example, we have a closure in Python when a nested function references a value in its enclosing scope. The criteria that must be met to create a closure in Python are summarized in the following points:

We must have a nested function (a function inside a function).

The nested function must refer to a value defined in the enclosing function.

The enclosing function must return the nested function.
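To make the three criteria concrete, here is one more small closure (an extra example, not from the original post):

```python
def make_multiplier(n):
    # This is the outer enclosing function; 'n' lives in its scope

    def multiplier(x):
        # This is the nested function; it refers to 'n'
        # from the enclosing scope
        return x * n

    return multiplier  # return the nested function without calling it

times3 = make_multiplier(3)
print(times3(10))  # 30
```

The value 3 stays attached to `times3` even after `make_multiplier` has finished running, which is exactly what makes it a closure.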

Wednesday, May 10, 2017

Motivation

I would like to create an SSH connection to a remote instance. SSH is usually very secure: you can use a password and key-pair authentication. However, it can be difficult to do other things in a more visual way, so we "tunnel" through the SSH connection (taking advantage of its security).

Set up

If you are going to "tunnel" using different ports, make sure those are open.

You just have to set up your standard PuTTY session and enable "X11 forwarding" as shown here:

You should also decide which port on your local machine is going to forward to the machine you are connecting to. I want to use port 8888 on the remote machine, and I am going to use port 9100 on the local machine:
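On a machine with an OpenSSH client (instead of PuTTY), the equivalent tunnel would look like this. This is a sketch only: `user@remote-host` is a placeholder, not a value from the post.

```shell
# forward local port 9100 to port 8888 on the remote machine,
# through the encrypted SSH connection
ssh -L 9100:localhost:8888 user@remote-host
```

While this session is open, pointing a browser at localhost:9100 reaches the service listening on port 8888 on the remote side.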

-v /home/raf/Documents/docker-cloudera:/home/raf/Documents/docker-cloudera: I am linking a folder on my desktop with a folder inside the container. The first path is the one on my desktop; the second one is inside the container.

-p is for the ports: the first port is the host's, the second one is the container's.
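Putting those two flags together, the full command might look like this. This is a sketch only: the `cloudera/quickstart` image name and the `/usr/bin/docker-quickstart` entry point are assumptions based on Cloudera's Docker quickstart, not taken from the post.

```shell
docker run --hostname=quickstart.cloudera --privileged=true -t -i \
    -v /home/raf/Documents/docker-cloudera:/home/raf/Documents/docker-cloudera \
    -p 8888:8888 \
    cloudera/quickstart /usr/bin/docker-quickstart
```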

You can see HUE at localhost:8888

Run MapReduce wordcount

Go to the folder you want to work in; for me: cd /home/raf/Documents/docker-cloudera.

Create wordcount_mapper.py

Understanding the algorithm:

You just have to emit each word in the form <key, 1>.

Create code:

vim wordcount_mapper.py

And paste this:

#!/usr/bin/env python

# the above just indicates to use python to interpret this file
# ---------------------------------------------------------------
# This mapper code will input a line of text and output <word, 1>
# ---------------------------------------------------------------

import sys  # a python module with system functions for this OS

# ------------------------------------------------------------
# this 'for loop' will set 'line' to an input line from the
# system standard input file
# ------------------------------------------------------------
for line in sys.stdin:
    # sys.stdin uses 'sys' to read a line from standard input;
    # note that 'line' is a string object, ie a variable, and it has
    # methods that you can apply to it, as in the next line
    line = line.strip()  # strip is a method, ie a function, associated
                         # with a string variable; it will strip
                         # the trailing newline (by default)
    keys = line.split()  # split line at blanks (by default),
                         # and return a list of keys
    for key in keys:     # a for loop through the list of keys
        value = 1
        # the {0} and {1} are replaced by the 0th and 1st items in the
        # format list; also note that the Hadoop default is a 'tab'
        # separating the key from the value
        print('{0}\t{1}'.format(key, value))

Close vim, saving the changes: :wq
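Before moving on, you can sanity-check the mapper locally, without Hadoop. This is a sketch; the sample sentence is arbitrary.

```shell
# feed one line of text to the mapper through standard input
echo "A long time ago" | python wordcount_mapper.py
```

This should print each word followed by a tab and a 1, one pair per line.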

Create wordcount_reducer.py

Understanding the algorithm:

First of all, you should notice that the reducer's input is the output of all the mappers, combined and sorted, so something like this:

A	1
a	1
ago	1
Another	1
away	1
far	1
far	1
episode	1
galaxy	1
in	1
long	1
of	1
Star	1
time	1
Wars	1

Therefore you do not have to worry about taking the files from the previous map results and combining and sorting them; Hadoop does that for you. So you can count the repetitions by comparing each key with the previous one.
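The post does not show the reducer's code, so here is a minimal sketch of the algorithm just described. The `reduce_counts` helper name is my own; it assumes the input lines are already sorted by key, as Hadoop guarantees.

```python
#!/usr/bin/env python
import sys

def reduce_counts(lines):
    """Sum the counts of consecutive identical keys in sorted <key \t 1> lines."""
    results = []
    last_key, total = None, 0
    for line in lines:
        key, value = line.strip().split('\t')
        if key == last_key:
            total += int(value)      # same word as before: accumulate
        else:
            if last_key is not None:
                results.append((last_key, total))  # emit the finished word
            last_key, total = key, int(value)      # start counting a new word
    if last_key is not None:
        results.append((last_key, total))          # emit the last word
    return results

if __name__ == '__main__':
    for key, total in reduce_counts(sys.stdin):
        print('{0}\t{1}'.format(key, total))
```

For the sorted sample input above, this would emit "far" with a count of 2 and every other word with a count of 1.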

Changing Streaming options:

Let’s change the number of reduce tasks to see its effects. Setting it to 0 will execute no reducer and only produce the map output. (Note the output directory is changed in the snippet below because Hadoop doesn’t like to overwrite output)

hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar \
    -input /user/cloudera/input \
    -output /user/cloudera/output_new_0 \
    -mapper /home/raf/Documents/docker-cloudera/wordcount_mapper.py \
    -reducer /home/raf/Documents/docker-cloudera/wordcount_reducer.py \
    -numReduceTasks 0

To see the results:

hdfs dfs -cat /user/cloudera/output_new_0/part-00000

Try to notice the differences between the output when the reducers are run in Step 9, versus the output when there are no reducers and only the mapper is run in this step. The point of the task is to be aware of what the intermediate results look like. A successful submission will have words and counts that are not accumulated (which the reducer performs). Hopefully, this will help you get a sense of how data and tasks are split up in the map/reduce framework, and we will build upon that in the next lesson.