Thanks. First of all, sorry for the setup problems; my laptop is broken. So, this is an introduction to

00:22

TensorFlow. OK, let's start with a question: what is this? It's a

00:31

cat. That was an easy question, but is it that easy for a computer? You know, a computer is a machine that makes computations, that works with mathematical operations. So the real question is: is there a mathematical relationship between this input image and the target class "cat"? And the answer is yes, but it's very complex, and we're going to learn this complex relationship using tons of examples, and

01:12

this, learning a very complex relationship using tons of examples, is pretty much the definition of deep

01:23

learning. And you know, we're here at EuroPython 2017, we love Python, so we want to do deep learning with Python. And what is the best tool for that? It's TensorFlow. So what is TensorFlow? TensorFlow is an open-source library for deep learning. It's mainly used with Python, and it was released by the Google Brain project two years ago, but the 1.0.0 version was not launched until February of this year. For installation, quickly: the best practice is to download Anaconda, then create a new environment with the classic data science libraries, and then pip install tensorflow. For Windows it's the same method, with some caveats. But now we enter the most important part of the talk, the core concepts. TensorFlow can be a difficult tool for beginners if you don't understand the basic concepts of deep learning and how TensorFlow works. So, recall: we

02:53

have the cat, the input image, and we have the class "cat", and we want to find the mathematical relationship between these two. This mathematical relationship is what we call the

03:06

model. This model is going to make predictions given the input. At first it's going to make random predictions, so sometimes it's not going to do well. So we have the input, the image of

03:27

a cat, we have the model that is going to make predictions given the input, and we have the target, which is the correct class for that image. Does the prediction match the target? It doesn't. So what we're going to do is change the model so it gets better and the predictions match the targets. The first step for this is to compute the difference between the prediction and the target, and this is done by the cost function, or loss function. This cost function is going to produce an error, and this error is basically how far we are from having a good model: the greater the error, the more we have to change the model. So we're going to learn from errors. That's life, learning from errors: sometimes you win and sometimes you learn. And finally, the guy in charge of changing the model, of training the model, is the optimizer. OK, so this is the basic structure of the learning process in deep learning, and this is what TensorFlow calls the graph.
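The structure just described (the model makes a prediction, the cost function measures the error, the optimizer changes the model) can be sketched as a single training step in plain Python. Everything here is illustrative: the names and numbers are made up for the example, and this is not TensorFlow code.

```python
# One training step of the loop just described, in plain Python.
weight = 0.0                        # the "model" is just this one variable
x, target = 3.0, 6.0                # one example; the true rule is y = 2 * x

prediction = weight * x             # the model makes a prediction
error = (prediction - target) ** 2  # the cost function measures how wrong it is
gradient = 2 * (prediction - target) * x
weight -= 0.05 * gradient           # the "optimizer" changes the model
print(weight)  # the model moved from 0.0 toward the true weight 2.0
```

Repeating this step over many examples is exactly the training loop the rest of the talk builds in TensorFlow.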

05:03

And the graph is just a layout which contains both the model and the learning process. OK, so the

05:16

graph is totally independent from the data, but there really is a connection with the data, because the graph is nothing without data: we couldn't have predictions without the input data, and we couldn't have learning without targets. So we set two gates where the data is going to come in: one gate is for the inputs, and one is for the targets. These gates are going to let the data come in, but not all types of data, only the data that we want. In this case we want images for inputs and classes for targets. These are called placeholders. Well, a quick

06:08

summary of what the graph contains: there is a green area, the variables, because the model is just a set of variables, and we're going to vary these variables, change these variables, to make the model better; and there are the placeholders, with the type of data they accept. Well,

06:30

we have the graph with the placeholders, and we want the data to come into the graph. So what we do is to open a Session. Well, when we're

06:47

inside that Session, we say we're feeding the graph with data.
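As a plain-Python analogue (not the TensorFlow API), a placeholder can be pictured as a typed gate that only lets through the kind of data it was declared for:

```python
# A plain-Python analogue of a typed placeholder "gate".
def placeholder(expected_type):
    def gate(value):
        # only let through the declared type of data
        if not isinstance(value, expected_type):
            raise TypeError("this gate only accepts %s" % expected_type.__name__)
        return value
    return gate

inputs_gate = placeholder(int)
print(inputs_gate(3))   # an int passes through
try:
    inputs_gate("cat")  # a string is rejected
except TypeError as exc:
    print(exc)
```

In real TensorFlow the same role is played by declaring a placeholder with a dtype, and feeding it inside a session.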

06:54

OK, so for example: we have this cat, the cat goes through the model, and the model gets it wrong. That's not correct, so the cost function says we have an error of 100, the optimizer reads this error and says we need to train the model, so we train the model. Now, OK, we have the cat again, and this time the model says it's a cat, so the cost is 0 and the optimizer does nothing. Well, that's

07:28

the main part of the talk. In the rest of the talk we're going to see several use cases, so the concepts will hopefully sink in. The first thing we're going to do is a Hello World. Not really a Hello World, because we're not going to print "Hello World"; we're going to add two integers. The first thing is to import tensorflow; the convention is to import it as tf,

08:05

and this is the graph we're going to build: we have two placeholders, one for one integer and the other for the other integer, and the addition operation. And this is the

08:20

code: we set the placeholders, which are going to expect integers, and we have the addition operation, which is like a function of TensorFlow, and that's the graph. Independently, there's the data: we have number one, which is 3, and number two, which is 8. And then the session: the session is something that you open and you close, so we use the with keyword, and we're going to run the sum operation. We're going to feed the graph with a feed dictionary that links each placeholder with its data,

09:11

so this is the output in the notebook, and we see how it works: 3 and 8 make 11. Perfect. But this is kind of boring, because we're not learning anything. How can we make this more interesting?
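The placeholder-and-session pattern just shown can be mimicked in plain Python to make the idea concrete: the graph is a description built first, and the data is fed in only when it runs. This is only an analogue; the real code uses tf.placeholder, tf.add and tf.Session.

```python
# A plain-Python analogue of the placeholder / feed_dict idea.
def build_addition_graph():
    # the "graph": a computation described now, executed later
    def run(feed_dict):
        return feed_dict["a"] + feed_dict["b"]
    return run

addition = build_addition_graph()    # build the graph; no data yet
result = addition({"a": 3, "b": 8})  # like session.run(..., feed_dict=...)
print(result)  # 11
```

The key point is the separation: building the graph and running it with data are two distinct steps.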

09:30

The next case is going to be a regression problem. You know you're in a regression problem when your outputs are not classes like "cat" or "fish"; your outputs are numbers, like 2, minus 3, 6.7, the square root of 2, that kind of thing. Our case is to learn how to sum: what we're going to do is to have two inputs and one output, and we're going to learn the mathematical relationship using 10,000 examples. These are the examples, and we can see clearly that the first two numbers sum up to the third, so we are seeing the relationship. This is kind of silly, but in a real regression problem you follow the same philosophy, so for now we're learning how to sum. Let's

10:42

say we are in a self-driving car: the first input could be the image taken from the camera, the second input could be the distance taken from a laser in front of the vehicle, and the output is the angle you need to steer the vehicle so you don't crash or get out of the lane. But we're going to keep our simple addition

11:12

example. So what we're going to assume is that the relationship between the output and the inputs is a linear function, that is, an addition and a multiplication. These are the variables we're going to learn: we're going to be changing these variables so the model gets better. In this case we're OK with a linear function, but if we wanted to learn a more complex relationship, we would just add another linear function, another layer. If we want to make it even more complex, we stack another layer, and if we want to make it even more complex, we add nonlinear functions, or activation functions. That's a neural network. OK, this is the

12:10

code. Well, it starts with the placeholders, expecting floats: two numbers for the inputs and one for the output. The None is because we don't know how many examples we are going to receive, so we don't restrict the number. OK, the model is just two variables; we initialize them randomly, and then we make the linear function with the multiplication and the addition. Well, so we

12:50

have the placeholders and we have the model. The next thing is the cost function. The cost

12:55

function computes the difference between the prediction and the target. The most intuitive thing is to take the difference, but to get rid of negative numbers we square it all, and then we sum all the errors from all the examples and reduce them to one number. This is called reduce mean square: you take the difference, square it, and reduce it. Well, let's say

13:37

that our cost function can be plotted like this: the height of the surface means more error, and down there is the minimum we want to reach. So what we're going to do is, let's say we are at a certain point, and we're going to get down to the bottom: we take the direction of maximum steepness, which is the gradient, and we follow that direction again and again until we reach the minimum. This is

14:17

gradient descent, and TensorFlow gives you a gradient descent optimizer. There is a hyperparameter there, the learning rate, which is like

14:27

the size of the step. If you take big steps, you reach the minimum faster, but maybe you pass by the minimum and you oscillate around it, even going unstable. And if you take small steps, maybe you don't reach the minimum because it's too slow. OK, we minimize the

14:57

cost, and we have it.
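Putting the three pieces together (a linear model with two weights, a squared-error cost, and gradient descent with a learning rate), the learning-how-to-sum task can be sketched end to end in plain Python. The dataset size, learning rate and epoch count are illustrative choices, and this is not the TensorFlow code from the slides.

```python
import random

random.seed(0)
# training examples: pairs of digits whose target is their sum
train_data = [(random.randint(0, 9), random.randint(0, 9)) for _ in range(200)]

w1, w2 = 0.0, 0.0        # the model: prediction = w1 * a + w2 * b
learning_rate = 0.001

for epoch in range(50):  # one full pass over the data is one epoch
    for a, b in train_data:
        prediction = w1 * a + w2 * b
        error = prediction - (a + b)         # signed error vs. the target
        w1 -= learning_rate * 2 * error * a  # gradient step for each weight
        w2 -= learning_rate * 2 * error * b

print(round(w1, 2), round(w2, 2))  # both end up close to 1.0: it learned to sum
```

With a much larger learning rate the same loop would overshoot and oscillate, which is the instability mentioned above.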

15:00

That's all; now we need the data.

15:03

What we're going to do with the data is typical in machine

15:06

learning, which is to split the data into one set for training and one for testing, and the training set is what we're going to use.
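The split can be written as a small helper like this; the 80/20 fraction is a common convention, not something imposed by TensorFlow.

```python
# Split a dataset into a training part and a testing part.
def train_test_split(data, train_fraction=0.8):
    cut = int(len(data) * train_fraction)
    return data[:cut], data[cut:]

train_set, test_set = train_test_split(list(range(10)))
print(len(train_set), len(test_set))  # 8 2
```

In practice you would shuffle the data first, so that both parts are representative.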

15:15

So I built a helper function for this, so we don't have to bother about the data for the rest of the talk. And the session:

15:24

OK, in this session we're going to feed the graph with the training data. So

15:32

let's do it. First thing, we're going to run the initializer of the variables, and then we run the optimizer; running the optimizer is what does the learning. We feed it with the training data, and we run through all the training data a lot of times. When you run through all the data one time, you say that's an epoch, so we have this for loop that's going to train the neural network a number of times. OK, that's the

16:12

epoch. And we see that we have an accuracy of 95 percent, and the sum of 5 plus 7 comes out as almost 12. If we take a look at the weights, that's not a sum: a sum is just weights of 1 and 1 with a bias of 0. That means that we have overfit the neural network so it makes good sums only for our data. OK, now a

16:41

classification problem. In a classification problem you're not going to use numbers, you're going to use classes. So we

16:51

have a cat, which could be a class "cat", and we have another thing, which could be a class "no cat". But these are words, and we don't work with words, so we just transform them so that each class is a component of an array: in this case, "cat" is the second component of the array and "no cat" is the first component. This is called one-hot encoding. And this 0-1 or 1-0 is just for the targets; what would our predictions be? The predictions are going to

17:33

be probabilities. Sometimes the model is going to be very sure that the image belongs to a certain class: in this case we have the cat, and the model is 82 percent sure that it's a cat. But sometimes it's not going to be that sure, and the estimate is going to be more even. OK, well,

18:06

our case now is going to be: we have two integers, we're going to sum them, and we're going to classify whether the sum is greater than ten or less than ten. If it's greater, we're going to say it belongs to the second class; otherwise, it belongs to the first class. It's a silly example, but it works and it's good for learning. Now, this relationship is more complex than before, so we need more layers. Interestingly, the first layer is going to compute the sum and the second layer is going to classify that sum into greater than ten or less than ten. And this is going to happen always in classification problems: the first layers are going to extract more basic features, more basic information, and the next layers are going to work with this basic information to produce more complex information. Also, we use softmax as the nonlinear function at the end, because we want our output probabilities to sum to one.
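The classification ingredients mentioned in this part of the talk (one-hot targets, softmax probabilities, and the cross-entropy cost that comes next) can each be written in a few lines of plain Python; the scores and probabilities below are made-up example values.

```python
import math

CLASSES = ["no cat", "cat"]

def one_hot(label):
    # 1 in the component for this class, 0 everywhere else
    return [1 if c == label else 0 for c in CLASSES]

def softmax(values):
    # turn raw layer outputs into probabilities that sum to 1
    exps = [math.exp(v) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probabilities, target):
    # penalizes putting low probability on the true class
    return -sum(t * math.log(p) for p, t in zip(probabilities, target))

print(one_hot("cat"))     # [0, 1]
print(one_hot("no cat"))  # [1, 0]

probabilities = softmax([1.0, 3.0])  # raw scores: "no cat" 1.0, "cat" 3.0
print(probabilities)                 # the "cat" component dominates

confident_right = cross_entropy(probabilities, one_hot("cat"))
confident_wrong = cross_entropy(probabilities, one_hot("no cat"))
print(confident_right, confident_wrong)  # small loss vs. large loss
```

TensorFlow provides its own ops for all three; this is just the underlying arithmetic.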

19:37

So OK, we build the model, and in this case for the cost function we're not going to use the previous cost function; we're going to use the cross-entropy cost function, which works better for classification. And there are other optimizers that are better: the Adam optimizer works a lot better, because it keeps changing the learning rate, so at first it uses a big learning rate and then a small one. These are the results: we have an accuracy of 100 percent, so we have done something right. For 5 plus 3, the model is 89 percent sure that it is less than ten, and for 7 plus 6 it's 87 percent sure that it is greater than ten. And looking at an example, we see how the first layer is like a sum, and the second layer has the same number in the weights but with a negative sign in one of them; that is what classifies the output of the first layer. So OK, that's

21:09

all. If you want to know more, I recommend you to follow the Neural Networks and Deep Learning book by Michael Nielsen, and the Stanford CS231n course; you can learn a lot there. For TensorFlow there are a lot of tutorials, but I felt that they miss the basics; that's why I did this talk. I gave you the talk that I would have liked to get a year ago, when I knew nothing about this, so I hope that with this you keep improving. Well, that's all the code; it's a work in progress, because I'm a robotics engineer and my goal is to combine robotics with artificial intelligence. So if you're into this, talk to me and we will be best friends. And my favorite combination is self-driving

22:19

cars. So I hope that you learned the basics I gave you, that you keep improving, and that you build something, maybe even the software for a self-driving car with TensorFlow. Thank you. Now it's time for your questions; meanwhile, I think I'll scroll back to the previous slide with the repositories.

23:05

Q: Thanks for the nice talk. I hope I got it right that training really depends on what data you have and how much of it. Is there a clear answer on how much data you need, how much you have to repeat the training process, to get to the point where it's actually something useful rather than just a toy? A: There is no exact answer; you need tons of data, but it depends on the application. There is a classic graph in which you see that classic machine learning algorithms work better than deep learning algorithms when you have little data, but when you have tons of data, like terabytes of data for images or terabytes of data for sound, then deep learning is way better than classic machine learning. Q: Thanks for the incredible talk. I want to ask whether you have any comments comparing TensorFlow to Theano, for example; that's one question. And the next one: there are so many meta-libraries built on top of TensorFlow, like Keras, which provide people with a simpler interface, to do regression for example. Would you comment on when to use TensorFlow versus such a meta-library like Keras?

25:08

A: Can you please bring the microphone? It wasn't on, sorry. So

25:13

the first one was how I compare TensorFlow to Theano. Well, from what I have heard, Theano is more low-level, so you can be more creative with it; TensorFlow is not as low-level, so maybe it's more suited for data scientists. Q: So what made you make this presentation an introduction to TensorFlow rather than an introduction to something else? What made you choose TensorFlow over the other options? A: Well, because it's cool, it's from Google, it's well maintained, and there are great projects in the area. OK, thank you.

Content metadata

Introduction to TensorFlow [EuroPython 2017 - Talk - 2017-07-14 - Anfiteatro 1] [Rimini, Italy] Deep learning is at its peak, with scholars and startups releasing amazing new applications every other week, and TensorFlow is the main tool to work with it. However, TensorFlow is not an easy-access library for beginners in the field. In this talk, we will cover the core concepts of deep learning and TensorFlow totally from scratch, using simple examples and friendly visualizations. The talk will go through the following topics: • Why deep learning and what is it? • The main tool for deep learning: TensorFlow • Installation of TensorFlow • Core concepts of TensorFlow: Graph and Session • Hello world! • Step by step example: learning how to sum • Core concepts of deep learning: Neural network • Core concepts of deep learning: Loss function and gradient descent By the end of this talk, the hope is that you will have gained the basic concepts involving deep learning and that you can build and run your own neural networks using TensorFlow.