WEBVTT
1
00:00:00.341 --> 00:00:03.091
(relaxing music)
2
00:00:12.791 --> 00:00:14.594
Good afternoon.
3
00:00:14.594 --> 00:00:18.095
Thank you for coming out on this rainy day
4
00:00:18.095 --> 00:00:20.992
to our Distinguished Scholar-Teacher lecture.
5
00:00:20.992 --> 00:00:23.489
My name is Laura Rosenthal, and I'm the
6
00:00:23.489 --> 00:00:25.388
director for faculty leadership
7
00:00:25.388 --> 00:00:27.458
for the Office of Faculty Affairs.
8
00:00:27.458 --> 00:00:30.331
I actually chaired the selection committee
9
00:00:30.331 --> 00:00:33.414
for this award, and I'm here on behalf
10
00:00:33.414 --> 00:00:35.612
of the Provost's Office.
11
00:00:35.612 --> 00:00:37.589
John Bertot sends his apologies.
12
00:00:37.589 --> 00:00:39.938
He couldn't be here because of a Senate meeting
13
00:00:39.938 --> 00:00:41.355
at the same time.
14
00:00:42.376 --> 00:00:44.332
The Distinguished Scholar-Teacher Program
15
00:00:44.332 --> 00:00:47.966
was established in 1978, and has had more
16
00:00:47.966 --> 00:00:51.667
than 200 awardees since its inception.
17
00:00:51.667 --> 00:00:53.704
Now in its 40th year,
18
00:00:53.704 --> 00:00:55.844
the Distinguished Scholar-Teacher Program
19
00:00:55.844 --> 00:00:59.235
is sponsored by the Office of Academic Affairs
20
00:00:59.235 --> 00:01:01.869
and administered by the
21
00:01:01.869 --> 00:01:04.259
associate provost for Faculty Affairs.
22
00:01:04.259 --> 00:01:07.259
Selected by previous DST recipients,
23
00:01:08.388 --> 00:01:10.708
the award honors tenured faculty members
24
00:01:10.708 --> 00:01:13.872
who combine outstanding scholarship
25
00:01:13.872 --> 00:01:15.701
with teaching excellence.
26
00:01:15.701 --> 00:01:18.692
It is a recognition by the university
27
00:01:18.692 --> 00:01:20.728
of the commitment of faculty who pursue
28
00:01:20.728 --> 00:01:23.854
unfailing excellence in the classroom
29
00:01:23.854 --> 00:01:26.354
and in their research efforts.
30
00:01:27.560 --> 00:01:30.338
I'm very pleased, on behalf of Provost Rankin,
31
00:01:30.338 --> 00:01:33.585
to recognize Dr. Jonathan Katz as one
32
00:01:33.585 --> 00:01:37.107
of our newest Distinguished Scholar-Teachers.
33
00:01:37.107 --> 00:01:39.283
As we will hear more about shortly,
34
00:01:39.283 --> 00:01:42.180
Dr. Katz's work focuses on cryptography
35
00:01:42.180 --> 00:01:44.654
and cybersecurity, with his recent work
36
00:01:44.654 --> 00:01:47.841
focusing largely on techniques for computing
37
00:01:47.841 --> 00:01:51.341
on personal data while preserving privacy.
38
00:01:52.407 --> 00:01:54.247
He currently serves as the director
39
00:01:54.247 --> 00:01:57.247
of the Maryland Cybersecurity Center.
40
00:01:57.247 --> 00:01:59.934
He has demonstrated excellence in teaching,
41
00:01:59.934 --> 00:02:02.232
both in the classroom and through the mentorship
42
00:02:02.232 --> 00:02:06.787
of students, as well as innovative instruction,
43
00:02:06.787 --> 00:02:10.963
for example, through his MOOC on cryptography
44
00:02:10.963 --> 00:02:14.197
offered via Coursera that reaches thousands
45
00:02:14.197 --> 00:02:15.947
of students globally.
46
00:02:17.623 --> 00:02:21.914
It is my pleasure now to introduce Samir Khuller,
47
00:02:21.914 --> 00:02:24.488
who will formally introduce Dr. Katz.
48
00:02:24.488 --> 00:02:26.027
Thank you.
49
00:02:26.027 --> 00:02:27.520
(applause)
50
00:02:27.520 --> 00:02:31.062
(people talking)
51
00:02:31.062 --> 00:02:34.960
The person who's recording this.
52
00:02:34.960 --> 00:02:37.839
Thank you very much, and welcome to everyone.
53
00:02:37.839 --> 00:02:40.011
15 years ago when we interviewed Jon Katz,
54
00:02:40.011 --> 00:02:41.794
he was a nervous young graduate student,
55
00:02:41.794 --> 00:02:44.979
and I remember taking him to Georgetown for dinner.
56
00:02:44.979 --> 00:02:47.132
But a lot has happened in the last 15 years.
57
00:02:47.132 --> 00:02:49.846
Jonathan received his Ph.D in 2002.
58
00:02:49.846 --> 00:02:51.781
He eventually became professor of computer science
59
00:02:51.781 --> 00:02:53.531
and UMIACS in 2013,
60
00:02:55.754 --> 00:02:57.342
and this year was named as
61
00:02:57.342 --> 00:02:59.469
Distinguished Scholar-Teacher.
62
00:02:59.469 --> 00:03:01.457
He's also, for the last several years,
63
00:03:01.457 --> 00:03:04.232
been the director of the Maryland Cybersecurity Center,
64
00:03:04.232 --> 00:03:06.805
where he leads a team of interdisciplinary professors,
65
00:03:06.805 --> 00:03:09.039
research scientists and students to solve
66
00:03:09.039 --> 00:03:11.670
large-scale, complicated problems all related
67
00:03:11.670 --> 00:03:13.087
to cybersecurity.
68
00:03:14.927 --> 00:03:17.574
I think the biggest impact he's had
69
00:03:17.574 --> 00:03:19.775
through his research is by shaping our study
70
00:03:19.775 --> 00:03:22.606
and understanding of cryptography, his main area.
71
00:03:22.606 --> 00:03:24.147
He's a brilliant and prolific leader
72
00:03:24.147 --> 00:03:25.687
in this dynamic field.
73
00:03:25.687 --> 00:03:27.873
He's an academic giant in computer science,
74
00:03:27.873 --> 00:03:30.713
and we are fortunate to have him as our colleague.
75
00:03:30.713 --> 00:03:31.852
Jonathan's work has been published
76
00:03:31.852 --> 00:03:33.852
in over 30 journal papers and 150
77
00:03:34.692 --> 00:03:36.889
peer-reviewed conference papers, and he's co-authored
78
00:03:36.889 --> 00:03:39.266
and edited multiple books, the latest one
79
00:03:39.266 --> 00:03:41.509
being Advances in Cryptology.
80
00:03:41.509 --> 00:03:43.486
He can frequently be found interviewed
81
00:03:43.486 --> 00:03:46.391
by various public media outlets.
82
00:03:46.391 --> 00:03:48.895
Jonathan is currently advising three Ph.D students.
83
00:03:48.895 --> 00:03:50.711
He's already graduated 11 Ph.Ds
84
00:03:50.711 --> 00:03:53.331
and has mentored about 15 postdocs,
85
00:03:53.331 --> 00:03:55.501
most of whom are leading research scientists
86
00:03:55.501 --> 00:03:58.858
and professors at different universities.
87
00:03:58.858 --> 00:04:01.228
His most popular undergraduate class,
88
00:04:01.228 --> 00:04:04.369
which he's taught several times, is CMSC 456,
89
00:04:04.369 --> 00:04:06.562
Introduction to Cryptography.
90
00:04:06.562 --> 00:04:09.346
He said that he especially enjoys moments
91
00:04:09.346 --> 00:04:11.095
when his students finally understand something
92
00:04:11.095 --> 00:04:12.013
that he's teaching them.
93
00:04:12.013 --> 00:04:13.568
He said, "My favorite thing is when
94
00:04:13.568 --> 00:04:15.730
"you can see in their eyes that a concept
95
00:04:15.730 --> 00:04:17.291
"is clicking at some point in the semester,
96
00:04:17.291 --> 00:04:19.306
"I hope soon, early in the semester,
97
00:04:19.306 --> 00:04:21.826
"and they get really excited about cryptography."
98
00:04:21.826 --> 00:04:24.724
That spark, to him, is very motivational.
99
00:04:24.724 --> 00:04:26.609
He's been the recipient of multiple awards,
100
00:04:26.609 --> 00:04:30.483
including the Humboldt Research Award, and he's a member of the
101
00:04:30.483 --> 00:04:32.874
IEEE Cybersecurity Initiative.
102
00:04:32.874 --> 00:04:36.295
He was also named one of the top 50 influential Marylanders.
103
00:04:36.295 --> 00:04:39.278
He's won an NSF career award,
104
00:04:39.278 --> 00:04:42.701
as well as a DARPA Computer Science Study Group selection.
105
00:04:42.701 --> 00:04:44.290
Without further ado, I wanna welcome
106
00:04:44.290 --> 00:04:46.289
Professor Jon Katz, of both computer science
107
00:04:46.289 --> 00:04:47.518
and UMIACS, and director of the
108
00:04:47.518 --> 00:04:49.288
Maryland Cybersecurity Center.
109
00:04:49.288 --> 00:04:51.538
(applause)
110
00:04:54.985 --> 00:04:56.203
Thank you very much, Samir.
111
00:04:56.203 --> 00:04:57.435
Actually, I remember the dinner like
112
00:04:57.435 --> 00:04:58.954
it was yesterday, so it's amazing that
113
00:04:58.954 --> 00:05:00.586
15 years went by.
114
00:05:00.586 --> 00:05:02.011
Thank you all for coming today, actually,
115
00:05:02.011 --> 00:05:03.770
especially given the horrible weather outside,
116
00:05:03.770 --> 00:05:05.369
although I guess I can at least claim
117
00:05:05.369 --> 00:05:07.325
that people prefer to be in here listening
118
00:05:07.325 --> 00:05:09.279
to my talk rather than spending the day outside.
119
00:05:09.279 --> 00:05:10.333
(laughter)
120
00:05:10.333 --> 00:05:11.671
That's a good thing.
121
00:05:11.671 --> 00:05:13.636
I'm really very honored, actually, to receive
122
00:05:13.636 --> 00:05:15.477
this award, especially because
123
00:05:15.477 --> 00:05:17.913
we've had other honorees in our department,
124
00:05:17.913 --> 00:05:20.103
in the Department of Computer Science.
125
00:05:20.103 --> 00:05:22.132
I know Samir himself was an honoree.
126
00:05:22.132 --> 00:05:24.644
Mike Hicks was also an honoree.
127
00:05:24.644 --> 00:05:26.681
Jack Minker, before my time, I think
128
00:05:26.681 --> 00:05:28.872
was an honoree, as well.
129
00:05:28.872 --> 00:05:30.969
I really don't feel worthy.
130
00:05:30.969 --> 00:05:32.254
I know what excellent teachers they are,
131
00:05:32.254 --> 00:05:36.064
and I hope I can just measure up to some extent.
132
00:05:36.064 --> 00:05:37.385
I'd like to thank the committee, also,
133
00:05:37.385 --> 00:05:38.459
for choosing me,
134
00:05:38.459 --> 00:05:40.975
and Mike in particular
135
00:05:40.975 --> 00:05:42.475
for nominating me.
136
00:05:44.060 --> 00:05:45.785
I was trying to think about the title
137
00:05:45.785 --> 00:05:47.578
of the talk, and I was trying to come up
138
00:05:47.578 --> 00:05:49.016
with some way to distinguish my title
139
00:05:49.016 --> 00:05:51.226
from all the other titles of the
140
00:05:51.226 --> 00:05:54.226
other DST nominees chosen this year.
141
00:05:55.689 --> 00:05:56.803
It turns out I was able to do that
142
00:05:56.803 --> 00:05:58.699
by avoiding the colon, but if you really
143
00:05:58.699 --> 00:06:00.025
like to have a colon in your talk,
144
00:06:00.025 --> 00:06:01.517
you can also think about this
145
00:06:01.517 --> 00:06:04.891
as The Future of Privacy: A Cryptographic Perspective.
146
00:06:04.891 --> 00:06:06.096
(laughter)
147
00:06:06.096 --> 00:06:09.007
But hopefully you'll be able to understand it anyway.
148
00:06:09.007 --> 00:06:10.415
I'd like to buck tradition a little bit
149
00:06:10.415 --> 00:06:12.802
and actually start with the acknowledgements.
150
00:06:12.802 --> 00:06:15.215
That way they won't get shoved at the end
151
00:06:15.215 --> 00:06:17.448
where I may run out of time.
152
00:06:17.448 --> 00:06:19.611
First of all, I just want to acknowledge
153
00:06:19.611 --> 00:06:22.692
all the love and support of my family.
154
00:06:22.692 --> 00:06:25.314
In particular, my wife and father are here today
155
00:06:25.314 --> 00:06:29.520
to listen to this talk, so thank you for coming.
156
00:06:29.520 --> 00:06:31.094
I literally could not have been here without you.
157
00:06:31.094 --> 00:06:33.472
(laughter)
158
00:06:33.472 --> 00:06:35.648
I would also like to take the opportunity to thank
159
00:06:35.648 --> 00:06:38.530
the Department of Computer Science
160
00:06:38.530 --> 00:06:40.197
for really providing
161
00:06:41.086 --> 00:06:43.708
a very nurturing place for me.
162
00:06:43.708 --> 00:06:45.561
Really, I couldn't have asked
163
00:06:45.561 --> 00:06:47.674
for anything better; it's been the best place,
164
00:06:47.674 --> 00:06:49.427
I guess, to grow a career over the first
165
00:06:49.427 --> 00:06:51.815
15 years or so of my career.
166
00:06:51.815 --> 00:06:54.163
In particular, the two chairs of computer science
167
00:06:54.163 --> 00:06:56.061
that I had the
168
00:06:56.061 --> 00:06:58.455
fortune actually of working for,
169
00:06:58.455 --> 00:07:00.365
Larry Davis and Samir Khuller, so thank you
170
00:07:00.365 --> 00:07:02.608
for everything you've done to help create
171
00:07:02.608 --> 00:07:05.817
that environment for the department.
172
00:07:05.817 --> 00:07:08.933
I also wanna thank my students and postdocs.
173
00:07:08.933 --> 00:07:09.766
Actually,
174
00:07:12.399 --> 00:07:13.973
when I was looking over this list, it was actually
175
00:07:13.973 --> 00:07:16.170
amazing for me to see the number
176
00:07:16.170 --> 00:07:18.399
of names, but also just to remember some of the
177
00:07:18.399 --> 00:07:22.303
Ph.D students I had, now going back several years.
178
00:07:22.303 --> 00:07:23.449
I've had the opportunity over the years
179
00:07:23.449 --> 00:07:26.765
to catch up with them and hear how they're doing.
180
00:07:26.765 --> 00:07:28.696
I'd really say that, for the most part,
181
00:07:28.696 --> 00:07:33.094
it's been the best kind of student-advisor relationship,
182
00:07:33.094 --> 00:07:35.954
where, I think, it's very interesting:
183
00:07:35.954 --> 00:07:37.253
I've learned more from my students
184
00:07:37.253 --> 00:07:39.138
and from my postdocs than they have from me,
185
00:07:39.138 --> 00:07:40.140
if they learned anything from me.
186
00:07:40.140 --> 00:07:41.759
I certainly learned more from them
187
00:07:41.759 --> 00:07:43.037
than they learned from me.
188
00:07:43.037 --> 00:07:44.396
I wanna highlight in particular, actually,
189
00:07:44.396 --> 00:07:46.525
some of the students and postdocs whose work
190
00:07:46.525 --> 00:07:49.032
I'm gonna be referring to at various points
191
00:07:49.032 --> 00:07:50.744
in the talk today.
192
00:07:50.744 --> 00:07:53.327
Let's see, Xiao Wang, for sure.
193
00:07:54.747 --> 00:07:56.080
Alex Malozemoff.
194
00:07:57.901 --> 00:07:59.867
Sam Ranellucci
195
00:07:59.867 --> 00:08:00.950
and Yan Huang
196
00:08:02.985 --> 00:08:04.490
and Raef Bassily.
197
00:08:04.490 --> 00:08:06.328
I don't wanna leave anybody out; Adam Groce, as well.
198
00:08:06.328 --> 00:08:07.267
Essentially, it's some of their work I'm gonna
199
00:08:07.267 --> 00:08:08.516
be talking about today.
200
00:08:08.516 --> 00:08:09.826
I'd also like to take the opportunity
201
00:08:09.826 --> 00:08:12.202
to acknowledge some of my collaborators
202
00:08:12.202 --> 00:08:13.043
over the years.
203
00:08:13.043 --> 00:08:15.530
I've had the fortune of working with many
204
00:08:15.530 --> 00:08:17.403
great collaborators, but in particular
205
00:08:17.403 --> 00:08:19.843
on this topic, I'd like to highlight
206
00:08:19.843 --> 00:08:23.234
Mike Hicks, again, as well as Adam Smith,
207
00:08:23.234 --> 00:08:27.317
a professor at Penn State, and Dave Evans at UVA.
208
00:08:28.444 --> 00:08:30.672
I thought it would be good to start actually
209
00:08:30.672 --> 00:08:34.723
with a little bit of background about cryptography.
210
00:08:34.723 --> 00:08:36.706
I don't know how many people here
211
00:08:36.706 --> 00:08:37.793
are very familiar with cryptography.
212
00:08:37.793 --> 00:08:39.196
I'm guessing that most of the students
213
00:08:39.196 --> 00:08:41.622
who have taken my cryptography class
214
00:08:41.622 --> 00:08:43.314
have since graduated, and so are probably
215
00:08:43.314 --> 00:08:44.535
not in the room today, maybe a handful
216
00:08:44.535 --> 00:08:47.552
are here today, so I hope this won't bore you too much.
217
00:08:47.552 --> 00:08:49.022
But I figured it's worth talking about it
218
00:08:49.022 --> 00:08:50.830
because many people, even if they do
219
00:08:50.830 --> 00:08:52.566
know something about cryptography,
220
00:08:52.566 --> 00:08:54.611
they probably know about it from the media,
221
00:08:54.611 --> 00:08:57.119
and they may not have a good picture
222
00:08:57.119 --> 00:08:59.395
of what exactly it is that cryptographers do
223
00:08:59.395 --> 00:09:01.007
and what kind of problems cryptographers
224
00:09:01.007 --> 00:09:03.043
are working on nowadays.
225
00:09:03.043 --> 00:09:05.537
I think that if you were to ask
226
00:09:05.537 --> 00:09:07.493
the general public what they know,
227
00:09:07.493 --> 00:09:10.022
what they think about cryptography,
228
00:09:10.022 --> 00:09:11.273
more likely than not, the first thing
229
00:09:11.273 --> 00:09:15.106
they would hit upon is the idea of encryption.
230
00:09:16.969 --> 00:09:19.819
In the context of encryption, we have
231
00:09:19.819 --> 00:09:22.386
two parties, let's say, who wanna communicate
232
00:09:22.386 --> 00:09:24.169
a message, and they wanna communicate
233
00:09:24.169 --> 00:09:25.866
that message privately.
234
00:09:25.866 --> 00:09:27.089
They wanna make sure that that message
235
00:09:27.089 --> 00:09:30.217
remains secret from anybody else.
236
00:09:30.217 --> 00:09:32.825
The way that encryption has been done
237
00:09:32.825 --> 00:09:35.451
historically is basically through
238
00:09:35.451 --> 00:09:38.602
some process that takes this original message
239
00:09:38.602 --> 00:09:41.108
and scrambles it in some way,
240
00:09:41.108 --> 00:09:44.399
replacing the original message with some
241
00:09:44.399 --> 00:09:47.230
random-looking text that presumably
242
00:09:47.230 --> 00:09:48.792
anybody eavesdropping on the communication
243
00:09:48.792 --> 00:09:50.874
won't be able to make much sense of.
244
00:09:50.874 --> 00:09:54.457
The picture down here is actually Vigenère.
245
00:09:55.392 --> 00:09:56.767
This was somebody who developed
246
00:09:56.767 --> 00:10:00.307
an encryption scheme in the 1700s.
247
00:10:00.307 --> 00:10:02.067
This is supposed to be a representation
248
00:10:02.067 --> 00:10:03.722
of what you might get if you encrypted
249
00:10:03.722 --> 00:10:05.529
an English language plain text using
250
00:10:05.529 --> 00:10:07.328
the Vigenère cipher to get some
251
00:10:07.328 --> 00:10:10.507
unintelligible cipher text that could
252
00:10:10.507 --> 00:10:12.590
be sent across a channel.
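
NOTE
A minimal sketch (an illustration added here, not part of the talk) of the
scrambling just described: the Vigenère cipher shifts each plaintext letter
by a key letter, cycling through the key; it assumes A-Z text with
non-letters dropped.
# Vigenère cipher sketch in Python.
def vigenere_encrypt(plaintext: str, key: str) -> str:
    letters = [c for c in plaintext.upper() if c.isalpha()]
    out = []
    for i, c in enumerate(letters):
        shift = ord(key[i % len(key)].upper()) - ord("A")
        out.append(chr((ord(c) - ord("A") + shift) % 26 + ord("A")))
    return "".join(out)
# Decryption subtracts the same shifts.
def vigenere_decrypt(ciphertext: str, key: str) -> str:
    out = []
    for i, c in enumerate(ciphertext):
        shift = ord(key[i % len(key)].upper()) - ord("A")
        out.append(chr((ord(c) - ord("A") - shift) % 26 + ord("A")))
    return "".join(out)
# vigenere_encrypt("attack at dawn", "LEMON") == "LXFOPVEFRNHR"
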
253
00:10:13.575 --> 00:10:15.201
Cryptography has been used,
254
00:10:15.201 --> 00:10:16.548
or encryption, I should say, has been used
255
00:10:16.548 --> 00:10:18.249
for many hundreds of years, actually.
256
00:10:18.249 --> 00:10:20.145
It goes back, it predates even the 1700s,
257
00:10:20.145 --> 00:10:22.790
and goes back even to classical times.
258
00:10:22.790 --> 00:10:24.250
It also had a very large role to play
259
00:10:24.250 --> 00:10:25.883
in World War II.
260
00:10:25.883 --> 00:10:27.909
This is the Enigma machine that the Germans used
261
00:10:27.909 --> 00:10:30.692
to encrypt communication in World War II.
262
00:10:30.692 --> 00:10:31.943
The other thing people might think about
263
00:10:31.943 --> 00:10:33.233
when they think about cryptography
264
00:10:33.233 --> 00:10:34.821
and they think about encryption is that
265
00:10:34.821 --> 00:10:37.949
you have these smaller teams, small groups
266
00:10:37.949 --> 00:10:41.330
of smart people who are studying how to generate,
267
00:10:41.330 --> 00:10:43.410
how to develop encryption schemes.
268
00:10:43.410 --> 00:10:45.191
You might have other people on the other side
269
00:10:45.191 --> 00:10:47.101
who are sitting in a room trying to figure out,
270
00:10:47.101 --> 00:10:49.772
of course, how to break that encryption scheme.
271
00:10:49.772 --> 00:10:52.611
This is a picture of the so-called bombe
272
00:10:52.611 --> 00:10:55.473
that was developed in London during
273
00:10:55.473 --> 00:10:57.522
the Second World War, as well, in an effort
274
00:10:57.522 --> 00:10:59.522
to try to defeat Enigma.
275
00:11:00.753 --> 00:11:03.167
The idea there is that maybe by being more clever
276
00:11:03.167 --> 00:11:04.789
than the other side, you can figure out
277
00:11:04.789 --> 00:11:06.561
something about what it is they're communicating,
278
00:11:06.561 --> 00:11:08.941
figure out how to crack the code.
279
00:11:08.941 --> 00:11:11.125
Now, looking at this in a little bit more detail,
280
00:11:11.125 --> 00:11:14.586
the problem of encryption is particularly clean.
281
00:11:14.586 --> 00:11:16.425
In the case of encryption, we have two parties
282
00:11:16.425 --> 00:11:18.668
who are communicating over a channel.
283
00:11:18.668 --> 00:11:20.771
Like I said before, they want to be able
284
00:11:20.771 --> 00:11:22.807
to communicate while ensuring secrecy
285
00:11:22.807 --> 00:11:24.884
of their communication.
286
00:11:24.884 --> 00:11:26.599
They're worried in particular about
287
00:11:26.599 --> 00:11:29.589
an adversary, an attacker,
288
00:11:29.589 --> 00:11:31.556
who's eavesdropping, let's say, on the
289
00:11:31.556 --> 00:11:33.462
communication channel between them.
290
00:11:33.462 --> 00:11:34.580
Again, they'd like to make sure that they
291
00:11:34.580 --> 00:11:37.203
can communicate some message
292
00:11:37.203 --> 00:11:39.193
between the two of them while ensuring
293
00:11:39.193 --> 00:11:40.608
that the attacker doesn't know anything
294
00:11:40.608 --> 00:11:42.794
about what it is they're communicating.
295
00:11:42.794 --> 00:11:44.952
What's particularly nice about this picture
296
00:11:44.952 --> 00:11:47.921
is that there's a very clean and obvious
297
00:11:47.921 --> 00:11:50.719
distinction between who the good guys are
298
00:11:50.719 --> 00:11:52.488
and who the bad guys are.
299
00:11:52.488 --> 00:11:56.193
We can draw a (laughter) clean separation
300
00:11:56.193 --> 00:11:58.239
between, on the one hand, the two parties
301
00:11:58.239 --> 00:12:00.468
who share a key, let's say, and are agreeing
302
00:12:00.468 --> 00:12:01.929
to communicate.
303
00:12:01.929 --> 00:12:03.483
On the other hand, the attacker,
304
00:12:03.483 --> 00:12:05.035
who is the bad guy who's listening in,
305
00:12:05.035 --> 00:12:08.047
trying to figure out what it is they're communicating.
306
00:12:08.047 --> 00:12:12.107
It's a very simple, very clean mental model.
307
00:12:12.107 --> 00:12:14.417
This gives, like I said, a very clean distinction
308
00:12:14.417 --> 00:12:15.742
between who's good and who's bad,
309
00:12:15.742 --> 00:12:17.707
who should be able to learn the message
310
00:12:17.707 --> 00:12:18.984
being communicated, and who should not
311
00:12:18.984 --> 00:12:23.261
be able to learn anything about what's being communicated.
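
NOTE
A sketch (mine, not the speaker's) of the simplest scheme fitting this
picture: the one-time pad. The two parties share a uniformly random key as
long as the message, and an eavesdropper who sees only the ciphertext
learns nothing about the message.
import os
def otp_encrypt(key: bytes, message: bytes) -> bytes:
    assert len(key) == len(message)  # key must be as long as the message
    return bytes(k ^ m for k, m in zip(key, message))
otp_decrypt = otp_encrypt  # XOR is its own inverse
key = os.urandom(14)  # shared in advance between the two parties
ciphertext = otp_encrypt(key, b"attack at dawn")
assert otp_decrypt(key, ciphertext) == b"attack at dawn"
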
312
00:12:23.261 --> 00:12:26.054
Now, everything I've said is an accurate view
313
00:12:26.054 --> 00:12:29.554
of cryptography, up until about the 1980s.
314
00:12:31.841 --> 00:12:33.200
There were, indeed, schemes that were
315
00:12:33.200 --> 00:12:35.027
being developed by people over time,
316
00:12:35.027 --> 00:12:37.212
throughout the centuries, developed in a
317
00:12:37.212 --> 00:12:40.581
mainly heuristic way, involving
318
00:12:40.581 --> 00:12:43.817
smart people thinking about how to construct schemes.
319
00:12:43.817 --> 00:12:45.703
Other people thinking about how to break them.
320
00:12:45.703 --> 00:12:47.657
Then sort of a back and forth interplay
321
00:12:47.657 --> 00:12:49.258
between the people developing schemes
322
00:12:49.258 --> 00:12:50.974
and the people breaking schemes
323
00:12:50.974 --> 00:12:53.364
always trying to outdo the other.
324
00:12:53.364 --> 00:12:54.964
Even in the public key revolution
325
00:12:54.964 --> 00:12:58.220
in the 1970s, we had other smart people,
326
00:12:58.220 --> 00:13:00.049
this is Rivest, Shamir and Adleman,
327
00:13:00.049 --> 00:13:02.110
who developed the RSA algorithm,
328
00:13:02.110 --> 00:13:03.929
who were, again, trying to think of schemes
329
00:13:03.929 --> 00:13:06.547
that other people perhaps couldn't break,
330
00:13:06.547 --> 00:13:08.295
but mainly working in these small groups
331
00:13:08.295 --> 00:13:09.722
and mainly not having any formal way
332
00:13:09.722 --> 00:13:11.573
to reason about the schemes, but, again,
333
00:13:11.573 --> 00:13:15.769
always trying to be more clever than the other side.
334
00:13:15.769 --> 00:13:18.028
Also, into the '80s, you had this mental model
335
00:13:18.028 --> 00:13:20.351
continuing, where there's this clean division
336
00:13:20.351 --> 00:13:22.514
between who should learn everything,
337
00:13:22.514 --> 00:13:24.893
who's allowed to see everything, and on the
338
00:13:24.893 --> 00:13:26.172
other side, the bad guy, who's supposed
339
00:13:26.172 --> 00:13:28.505
to learn absolutely nothing.
340
00:13:29.517 --> 00:13:31.069
Now, what about modern cryptography?
341
00:13:31.069 --> 00:13:33.737
Let's say roughly since the 1980s,
342
00:13:33.737 --> 00:13:36.441
or mid-1980s, perhaps, or maybe putting it
343
00:13:36.441 --> 00:13:39.581
another way, what are we cryptographers doing?
344
00:13:39.581 --> 00:13:41.250
Haven't we already solved the problem
345
00:13:41.250 --> 00:13:42.538
of secure communication?
346
00:13:42.538 --> 00:13:43.553
Isn't this a done deal?
347
00:13:43.553 --> 00:13:44.527
We all know how to encrypt.
348
00:13:44.527 --> 00:13:46.434
We all encrypt every day.
349
00:13:46.434 --> 00:13:48.176
What's left to do?
350
00:13:48.176 --> 00:13:50.796
Why are they paying me to do my job?
351
00:13:50.796 --> 00:13:53.052
Well, in fact, modern cryptography is distinguished
352
00:13:53.052 --> 00:13:54.551
in several ways from this picture that I
353
00:13:54.551 --> 00:13:55.740
presented a few minutes ago.
354
00:13:55.740 --> 00:13:57.605
It's distinguished in many different ways
355
00:13:57.605 --> 00:14:00.938
from the classical view of cryptography.
356
00:14:01.931 --> 00:14:04.534
First of all, it turns out that cryptography
357
00:14:04.534 --> 00:14:06.624
is about much more than encryption.
358
00:14:06.624 --> 00:14:08.582
Encryption is only a very small part
359
00:14:08.582 --> 00:14:11.422
of what cryptography offers.
360
00:14:11.422 --> 00:14:13.714
Secondly, what's especially developed since
361
00:14:13.714 --> 00:14:16.851
the 1980s is a much more rigorous approach
362
00:14:16.851 --> 00:14:18.071
to the subject.
363
00:14:18.071 --> 00:14:20.414
Rather than having this process of people
364
00:14:20.414 --> 00:14:23.857
trying to be more clever than their attacker,
365
00:14:23.857 --> 00:14:25.787
we have a more rigorous approach to the subject
366
00:14:25.787 --> 00:14:28.984
as a whole, as I'll talk about in a few slides.
367
00:14:28.984 --> 00:14:30.445
What I think is most interesting,
368
00:14:30.445 --> 00:14:32.446
and what's perhaps most relevant to what I'm gonna
369
00:14:32.446 --> 00:14:34.726
be speaking about today, is the fact
370
00:14:34.726 --> 00:14:36.564
that cryptographers now deal with a
371
00:14:36.564 --> 00:14:39.619
much richer class of trust relationships.
372
00:14:39.619 --> 00:14:41.828
We no longer have this very simple picture,
373
00:14:41.828 --> 00:14:44.702
like I presented on the previous slide.
374
00:14:44.702 --> 00:14:48.761
Let me elaborate on each of these a little bit.
375
00:14:48.761 --> 00:14:51.972
First of all, modern cryptography, as I said,
376
00:14:51.972 --> 00:14:54.856
now has a much broader scope than
377
00:14:54.856 --> 00:14:56.916
shared key encryption.
378
00:14:56.916 --> 00:14:58.605
I mentioned already public key cryptography.
379
00:14:58.605 --> 00:15:00.154
This is the idea, actually, that enables
380
00:15:00.154 --> 00:15:02.558
secure communication over the internet,
381
00:15:02.558 --> 00:15:04.467
where parties don't even need to necessarily
382
00:15:04.467 --> 00:15:06.563
share any keys in advance in order
383
00:15:06.563 --> 00:15:07.998
to communicate securely.
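
NOTE
A sketch of that idea (textbook Diffie-Hellman key agreement with toy
parameters; real deployments use standardized 2048-bit groups or elliptic
curves): the two parties exchange only public values, yet both derive the
same shared secret.
import secrets
p = 2**127 - 1  # toy prime modulus (a Mersenne prime)
g = 3
a = secrets.randbelow(p - 2) + 1  # Alice's secret exponent
b = secrets.randbelow(p - 2) + 1  # Bob's secret exponent
A = pow(g, a, p)  # Alice sends this in the clear
B = pow(g, b, p)  # Bob sends this in the clear
assert pow(B, a, p) == pow(A, b, p)  # both compute g**(a*b) mod p
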
384
00:15:07.998 --> 00:15:10.378
But that's just another example of encryption,
385
00:15:10.378 --> 00:15:13.047
of ways of achieving secret communications.
386
00:15:13.047 --> 00:15:14.898
But cryptography goes beyond that, as well.
387
00:15:14.898 --> 00:15:16.543
Cryptography is also concerned with things
388
00:15:16.543 --> 00:15:19.244
like data integrity, making sure that data
389
00:15:19.244 --> 00:15:21.457
isn't modified by an active attacker
390
00:15:21.457 --> 00:15:22.668
who's trying, perhaps, to make one
391
00:15:22.668 --> 00:15:24.472
of the parties receive a message that
392
00:15:24.472 --> 00:15:26.934
the other party didn't actually send.
393
00:15:26.934 --> 00:15:29.831
It also relates to things like entity authentication
394
00:15:29.831 --> 00:15:31.293
and proving that you're the person you claim
395
00:15:31.293 --> 00:15:34.591
to be at the other end of a communication channel.
396
00:15:34.591 --> 00:15:37.382
Also, handling much more complex protocols,
397
00:15:37.382 --> 00:15:40.636
like the ones I'm gonna be talking about today.
398
00:15:40.636 --> 00:15:42.212
It's a bit difficult for me actually to come up
399
00:15:42.212 --> 00:15:45.136
with a good definition for the full scope
400
00:15:45.136 --> 00:15:47.816
of what modern cryptography nowadays deals with,
401
00:15:47.816 --> 00:15:49.275
but the best definition that I've been able
402
00:15:49.275 --> 00:15:51.268
to come up with, and I'm open to suggestions
403
00:15:51.268 --> 00:15:53.991
for better definitions, is that it's about
404
00:15:53.991 --> 00:15:57.019
the design, analysis and implementation
405
00:15:57.019 --> 00:16:00.330
of mathematical techniques for securing information,
406
00:16:00.330 --> 00:16:02.571
systems and distributed computations
407
00:16:02.571 --> 00:16:04.793
against adversarial attack.
408
00:16:04.793 --> 00:16:07.276
Again, this is much, much broader
409
00:16:07.276 --> 00:16:09.034
than encryption because we're not only
410
00:16:09.034 --> 00:16:10.826
talking about, number one, privacy,
411
00:16:10.826 --> 00:16:12.590
we're talking about general notions of security,
412
00:16:12.590 --> 00:16:14.877
including things like integrity and other things.
413
00:16:14.877 --> 00:16:17.669
But it's also not only talking about communication
414
00:16:17.669 --> 00:16:19.203
of information.
415
00:16:19.203 --> 00:16:21.710
It's also referring to systems as a whole.
416
00:16:21.710 --> 00:16:23.835
It's also referring to the underlying computation
417
00:16:23.835 --> 00:16:26.470
going on within that system, and providing
418
00:16:26.470 --> 00:16:30.470
protections for that computation as it proceeds.
419
00:16:31.313 --> 00:16:32.642
The other thing that's been amazing about
420
00:16:32.642 --> 00:16:35.865
cryptography, since the 1980s or even
421
00:16:35.865 --> 00:16:39.399
maybe a little bit earlier, is that classically,
422
00:16:39.399 --> 00:16:41.805
encryption, and cryptography more generally,
423
00:16:41.805 --> 00:16:44.986
was primarily the focus of people in the military.
424
00:16:44.986 --> 00:16:49.069
It was used primarily in a military context.
425
00:16:49.069 --> 00:16:51.999
It was not widely used outside of that domain.
426
00:16:51.999 --> 00:16:54.289
But nowadays, cryptography is really ubiquitous.
427
00:16:54.289 --> 00:16:56.842
I can say almost for sure, and I'd be surprised
428
00:16:56.842 --> 00:16:58.465
if there's an exception, that everybody
429
00:16:58.465 --> 00:17:00.304
in this room has used cryptography
430
00:17:00.304 --> 00:17:02.270
multiple times today.
431
00:17:02.270 --> 00:17:04.524
Cryptography is used, for example,
432
00:17:04.524 --> 00:17:06.594
as part of password-based authentication.
433
00:17:06.594 --> 00:17:07.803
If you typed in a password today
434
00:17:07.803 --> 00:17:10.353
to log into any system,
435
00:17:10.353 --> 00:17:12.045
you could be using cryptography,
436
00:17:12.045 --> 00:17:15.254
I'm sure it's using cryptography to do that.
437
00:17:15.254 --> 00:17:17.393
Password hashing at the back end of the server
438
00:17:17.393 --> 00:17:19.867
also relies on cryptography.
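
NOTE
A sketch of that server-side password handling (an illustration; the
parameter choices here are assumptions, not a recommendation): the server
stores a salted, deliberately slow hash rather than the password itself,
and checks login attempts against it.
import hashlib, hmac, os
def hash_password(password: str) -> tuple[bytes, bytes]:
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return salt, digest  # store these; never store the password itself
def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)
    return hmac.compare_digest(candidate, digest)  # constant-time compare
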
439
00:17:19.867 --> 00:17:22.547
If you've ever bought anything online.
440
00:17:22.547 --> 00:17:24.607
That's a process that involves a way
441
00:17:24.607 --> 00:17:26.870
to securely transmit your credit card number
442
00:17:26.870 --> 00:17:28.597
from yourself to the merchant without
443
00:17:28.597 --> 00:17:31.095
allowing an eavesdropper to figure out
444
00:17:31.095 --> 00:17:32.390
or learn your credit card number.
445
00:17:32.390 --> 00:17:35.094
That relies a lot on cryptography.
446
00:17:35.094 --> 00:17:36.895
Encrypted wifi.
447
00:17:36.895 --> 00:17:38.731
That relies on cryptography, as well.
448
00:17:38.731 --> 00:17:42.311
If you've downloaded a software update,
449
00:17:42.311 --> 00:17:44.545
an update from Microsoft Windows, for example,
450
00:17:44.545 --> 00:17:46.287
that's digitally signed so that it can
451
00:17:46.287 --> 00:17:49.790
be verified by your computer before
452
00:17:49.790 --> 00:17:51.899
the software patch is installed.
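
NOTE
A sketch of that signed-update flow, assuming the third-party Python
"cryptography" package (any signature library would do): the vendor signs
the patch with its private key, and the client verifies with the vendor's
public key before installing.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
signing_key = Ed25519PrivateKey.generate()  # held by the vendor
patch = b"...bytes of the software update..."
signature = signing_key.sign(patch)  # shipped alongside the patch
verify_key = signing_key.public_key()  # baked into the client
try:
    verify_key.verify(signature, patch)  # raises if the patch was altered
    ok_to_install = True
except InvalidSignature:
    ok_to_install = False
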
453
00:17:51.899 --> 00:17:53.423
There are even more complex examples
454
00:17:53.423 --> 00:17:55.667
that maybe people in this room haven't used,
455
00:17:55.667 --> 00:17:59.397
but that go beyond these relatively simple examples.
456
00:17:59.397 --> 00:18:01.271
Things like full disk encryption.
457
00:18:01.271 --> 00:18:02.251
That's available now.
458
00:18:02.251 --> 00:18:04.596
You can download it and you can encrypt your hard drive.
459
00:18:04.596 --> 00:18:06.132
Or even much more complicated things.
460
00:18:06.132 --> 00:18:08.324
Has anyone heard of Bitcoin?
461
00:18:08.324 --> 00:18:11.114
Bitcoin is a great example of a
462
00:18:11.114 --> 00:18:12.673
distributed protocol that relies on
463
00:18:12.673 --> 00:18:14.990
very sophisticated cryptography underneath
464
00:18:14.990 --> 00:18:18.354
to ensure the integrity of the computation
465
00:18:18.354 --> 00:18:22.131
going on underneath, that underlies the protocol.
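
NOTE
A toy illustration (mine, not from the talk) of one integrity mechanism in
that spirit: a hash chain, where each record commits to the hash of the
previous one, so changing any earlier record invalidates every later link.
import hashlib
def block_hash(prev_hash: str, data: str) -> str:
    return hashlib.sha256((prev_hash + data).encode()).hexdigest()
chain, prev = [], "0" * 64  # genesis value
for record in ["alice pays bob 5", "bob pays carol 2"]:
    prev = block_hash(prev, record)
    chain.append((record, prev))
# Tampering with the first record changes its hash, which no longer
# matches the value the second block committed to.
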
466
00:18:22.131 --> 00:18:24.547
But this is something else that's changed a lot.
467
00:18:24.547 --> 00:18:28.418
These wouldn't have been cryptography before 1980 or so.
468
00:18:28.418 --> 00:18:31.372
Now, I mentioned also that, historically speaking,
469
00:18:31.372 --> 00:18:33.543
cryptography was largely this art where there
470
00:18:33.543 --> 00:18:36.505
was this heuristic design and analysis phase that went on
471
00:18:36.505 --> 00:18:38.110
where people would build protocols and they
472
00:18:38.110 --> 00:18:39.706
would build encryption schemes,
473
00:18:39.706 --> 00:18:40.820
and then they would release them,
474
00:18:40.820 --> 00:18:43.224
they would, perhaps, talk about them
475
00:18:43.224 --> 00:18:44.927
with their friends.
476
00:18:44.927 --> 00:18:46.992
In turn, others would try to break
477
00:18:46.992 --> 00:18:49.012
the resulting cryptosystem.
478
00:18:49.012 --> 00:18:50.078
Like I said before, seeing if they
479
00:18:50.078 --> 00:18:51.740
were more clever, if they could come up with an attack.
480
00:18:51.740 --> 00:18:53.599
If they found an attack, then the scheme
481
00:18:53.599 --> 00:18:55.028
might be tweaked a little bit to prevent
482
00:18:55.028 --> 00:18:58.967
that attack, and then the process would iterate.
483
00:18:58.967 --> 00:19:01.007
In the late 1970s and early 1980s,
484
00:19:01.007 --> 00:19:03.384
especially, the field began to develop
485
00:19:03.384 --> 00:19:05.930
into much more of a rigorous science.
486
00:19:05.930 --> 00:19:09.026
I think there are really three principles
487
00:19:09.026 --> 00:19:10.610
that underlie this development,
488
00:19:10.610 --> 00:19:12.620
from the heuristic approach to this
489
00:19:12.620 --> 00:19:15.325
more scientific, foundational approach,
490
00:19:15.325 --> 00:19:17.460
which are, first of all, an emphasis
491
00:19:17.460 --> 00:19:19.776
on formal definitions.
492
00:19:19.776 --> 00:19:21.580
This means coming up with precise
493
00:19:21.580 --> 00:19:24.163
mathematical models of what the
494
00:19:25.178 --> 00:19:27.470
scheme you're developing should be doing,
495
00:19:27.470 --> 00:19:29.427
along with mathematical definitions
496
00:19:29.427 --> 00:19:32.263
of what it means precisely for the scheme
497
00:19:32.263 --> 00:19:34.732
to qualify as being secure.
498
00:19:34.732 --> 00:19:37.233
This is actually, I think, a big leap forward.
499
00:19:37.233 --> 00:19:38.827
People often underestimate the value
500
00:19:38.827 --> 00:19:40.481
and the importance of definitions,
501
00:19:40.481 --> 00:19:43.123
but I think actually a lot of the fuzziness
502
00:19:43.123 --> 00:19:44.997
that was present in cryptography before,
503
00:19:44.997 --> 00:19:48.138
say, the 1980s was addressed
504
00:19:48.138 --> 00:19:49.615
exactly by people coming up with these
505
00:19:49.615 --> 00:19:51.043
rigorous definitions that allowed people
506
00:19:51.043 --> 00:19:53.927
to pin down exactly what it was they were studying,
507
00:19:53.927 --> 00:19:56.507
and thereby make progress on developing schemes
508
00:19:56.507 --> 00:19:59.340
that could meet those definitions.
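
NOTE
One classic example of such a definition (a standard one, not quoted from
the talk) is Shannon's perfect secrecy, which says the ciphertext reveals
nothing about the message. In LaTeX, for every message m and every
ciphertext c with \Pr[C = c] > 0:
\[ \Pr[M = m \mid C = c] \;=\; \Pr[M = m]. \]
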
509
00:20:00.189 --> 00:20:02.589
Cryptography also, a little bit unfortunately,
510
00:20:02.589 --> 00:20:04.610
but this is the way it is, relies on
511
00:20:04.610 --> 00:20:06.254
computational assumptions.
512
00:20:06.254 --> 00:20:07.706
These are assumptions about the hardness
513
00:20:07.706 --> 00:20:09.301
of certain problems.
514
00:20:09.301 --> 00:20:11.917
You may all have heard about the assumption
515
00:20:11.917 --> 00:20:14.542
that factoring large numbers is hard.
516
00:20:14.542 --> 00:20:16.619
This is something that we don't know how to prove.
517
00:20:16.619 --> 00:20:17.950
We don't know any way to prove that you
518
00:20:17.950 --> 00:20:20.304
can't factor large numbers
519
00:20:20.304 --> 00:20:22.836
in a short amount of time, but even after
520
00:20:22.836 --> 00:20:26.241
intensive study over many years, over many decades,
521
00:20:26.241 --> 00:20:28.423
we still have no
522
00:20:28.423 --> 00:20:30.707
efficient algorithm for factoring large numbers.
523
00:20:30.707 --> 00:20:32.397
So we take it as an assumption that, indeed,
524
00:20:32.397 --> 00:20:34.292
this is a hard mathematical problem,
525
00:20:34.292 --> 00:20:37.245
and that there's no efficient algorithm
526
00:20:37.245 --> 00:20:38.939
for, indeed, solving this.
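
NOTE
A back-of-the-envelope illustration (mine) of the asymmetry behind the
factoring assumption: multiplying two primes is instant, while the naive
attack, trial division, needs on the order of sqrt(N) steps.
p, q = 1000003, 1000033  # two (small) primes
N = p * q  # multiplication is fast even for 1024-bit primes
def smallest_factor(n: int) -> int:  # assumes n is odd
    d = 3
    while d * d <= n:  # up to ~sqrt(n) iterations
        if n % d == 0:
            return d
        d += 2
    return n
assert smallest_factor(N) == p  # already ~500,000 steps here; for a
# 2048-bit RSA modulus, sqrt(N) is astronomically far beyond reach.
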
527
00:20:38.939 --> 00:20:40.957
We can then rely on those assumptions
528
00:20:40.957 --> 00:20:44.310
to construct schemes that meet the definitions.
529
00:20:44.310 --> 00:20:45.884
In turn, we can come up with proofs
530
00:20:45.884 --> 00:20:48.332
of security showing that these schemes
531
00:20:48.332 --> 00:20:51.732
actually meet the definitions that we've given
532
00:20:51.732 --> 00:20:53.748
under the stated assumptions.
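
NOTE
Schematically (my paraphrase of the standard template, not the speaker's
words), such a proof is a reduction: an efficient adversary \(A\) breaking
the scheme \(\Pi\) with advantage \(\epsilon\) is converted into an
efficient algorithm \(B\) for the assumed-hard problem with
\[ \Pr[B \text{ solves the problem}] \;\ge\; \epsilon / \mathrm{poly}(n), \]
so if the problem really is hard, \(\epsilon\) must be negligible.
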
533
00:20:53.748 --> 00:20:56.040
This also is a very powerful technique,
534
00:20:56.040 --> 00:20:58.524
a very powerful approach
535
00:20:58.524 --> 00:21:01.373
that went a long way toward getting out
536
00:21:01.373 --> 00:21:04.250
of this design-break-patch cycle
537
00:21:04.250 --> 00:21:05.484
that I talked about earlier.
538
00:21:05.484 --> 00:21:07.200
Now rather than just trying to be clever
539
00:21:07.200 --> 00:21:08.392
and coming up with a scheme that
540
00:21:08.392 --> 00:21:10.500
you weren't able to break in a couple of weeks,
541
00:21:10.500 --> 00:21:12.708
you now have a rigorous mathematical proof
542
00:21:12.708 --> 00:21:14.506
that your scheme is secure.
543
00:21:14.506 --> 00:21:17.079
That tells you that within the definition
544
00:21:17.079 --> 00:21:19.120
that you were working with, and assuming
545
00:21:19.120 --> 00:21:22.133
that the assumption you stated indeed holds,
546
00:21:22.133 --> 00:21:24.195
there's gonna be no way for anybody
547
00:21:24.195 --> 00:21:26.212
to come up and attack the scheme.
548
00:21:26.212 --> 00:21:27.988
This is real progress, I think.
549
00:21:27.988 --> 00:21:29.656
Actually, it's been recognized as such
550
00:21:29.656 --> 00:21:32.328
by standards bodies, who now essentially require
551
00:21:32.328 --> 00:21:34.606
proofs of security for any schemes
552
00:21:34.606 --> 00:21:38.273
that are being proposed for standardization.
553
00:21:39.577 --> 00:21:41.352
These are the themes I stress, actually,
554
00:21:41.352 --> 00:21:44.435
when I teach courses in cryptography.
555
00:21:45.912 --> 00:21:48.179
Actually, it forced me, in writing
556
00:21:48.179 --> 00:21:50.231
my textbook, to think about really
557
00:21:50.231 --> 00:21:51.382
what are the three principles,
558
00:21:51.382 --> 00:21:52.934
or what are the principles in general
559
00:21:52.934 --> 00:21:55.956
that distinguish modern from classical cryptography.
560
00:21:55.956 --> 00:21:58.122
Going through that process actually helped me a lot
561
00:21:58.122 --> 00:21:59.536
to refine that.
562
00:21:59.536 --> 00:22:01.847
If you're interested, you can buy the textbook.
563
00:22:01.847 --> 00:22:04.806
(laughter) You can also watch the free MOOC.
564
00:22:04.806 --> 00:22:06.531
As we said earlier, you can go online and sign up
565
00:22:06.531 --> 00:22:09.231
for free and watch the videos online
566
00:22:09.231 --> 00:22:11.618
and listen to me talk for 20 hours straight.
567
00:22:11.618 --> 00:22:15.448
What could be better? (laughter)
568
00:22:15.448 --> 00:22:18.137
I wanna come back to this other issue,
569
00:22:18.137 --> 00:22:19.903
which is the classical trust model,
570
00:22:19.903 --> 00:22:21.209
where there's a clean division between
571
00:22:21.209 --> 00:22:23.630
the good guys and the bad guys.
572
00:22:23.630 --> 00:22:26.033
It's a very nice model,
573
00:22:26.033 --> 00:22:28.568
but it doesn't capture the reality
574
00:22:28.568 --> 00:22:31.341
in which cryptographic protocols are being run today.
575
00:22:31.341 --> 00:22:33.525
First of all, we may have protocols being run
576
00:22:33.525 --> 00:22:36.199
in much larger systems involving more than two parties.
577
00:22:36.199 --> 00:22:37.565
Now, that's not necessarily new.
578
00:22:37.565 --> 00:22:40.062
People did talk about group communication
579
00:22:40.062 --> 00:22:42.309
before 1980, but nevertheless, it's much
580
00:22:42.309 --> 00:22:44.021
more common nowadays to have protocols
581
00:22:44.021 --> 00:22:46.147
involving multiple people.
582
00:22:46.147 --> 00:22:49.021
Once you involve multiple people in a protocol,
583
00:22:49.021 --> 00:22:51.270
these trust relationships between them
584
00:22:51.270 --> 00:22:53.214
get a lot more complicated.
585
00:22:53.214 --> 00:22:55.554
For example, we might have Alice
586
00:22:55.554 --> 00:22:58.542
and Carol, who perfectly trust each other.
587
00:22:58.542 --> 00:22:59.804
They're friends, they know each other,
588
00:22:59.804 --> 00:23:00.967
and they perfectly trust each other.
589
00:23:00.967 --> 00:23:02.526
They wouldn't mind sharing everything they know
590
00:23:02.526 --> 00:23:03.903
with each other.
591
00:23:03.903 --> 00:23:06.651
Maybe it's also the case that Carol and Bob
592
00:23:06.651 --> 00:23:08.535
know each other, and they also may be willing
593
00:23:08.535 --> 00:23:11.198
to trust each other with everything that they know.
594
00:23:11.198 --> 00:23:13.536
But this doesn't imply that Alice and Bob
595
00:23:13.536 --> 00:23:15.613
know each other, and it doesn't imply
596
00:23:15.613 --> 00:23:18.002
at all that Alice and Bob trust each other.
597
00:23:18.002 --> 00:23:19.911
It may very well be the case that Alice
598
00:23:19.911 --> 00:23:22.022
is not willing to have her data
599
00:23:22.022 --> 00:23:23.881
be revealed to Bob.
600
00:23:23.881 --> 00:23:25.118
This is a little bit more complicated
601
00:23:25.118 --> 00:23:26.845
than what we had before because now we no longer
602
00:23:26.845 --> 00:23:28.862
have this clean boundary between who
603
00:23:28.862 --> 00:23:30.294
the good guys are and who the bad guys are.
604
00:23:30.294 --> 00:23:32.393
Actually, in this picture, maybe there's no
605
00:23:32.393 --> 00:23:34.626
attacker, per se, there's no devil,
606
00:23:34.626 --> 00:23:36.546
but it's the case that certain people wanna
607
00:23:36.546 --> 00:23:41.164
protect certain information from other people.
608
00:23:41.164 --> 00:23:43.229
The requirements or the desires of the
609
00:23:43.229 --> 00:23:46.026
different people can vary, and they can vary
610
00:23:46.026 --> 00:23:49.304
very much from person to person.
611
00:23:49.304 --> 00:23:51.258
You have more complicated examples, as well.
612
00:23:51.258 --> 00:23:54.233
For example, maybe it's the case
613
00:23:54.233 --> 00:23:56.715
that we're worried about security
614
00:23:56.715 --> 00:23:59.199
of a particular user running this protocol.
615
00:23:59.199 --> 00:24:02.549
For example, this user down at the bottom left.
616
00:24:02.549 --> 00:24:04.000
From her point of view in executing
617
00:24:04.000 --> 00:24:06.779
this protocol, maybe she knows everybody,
618
00:24:06.779 --> 00:24:08.019
maybe they're not best friends, but she knows
619
00:24:08.019 --> 00:24:10.346
everybody running the protocol from work
620
00:24:10.346 --> 00:24:12.340
or what have you.
621
00:24:12.340 --> 00:24:16.697
Nevertheless, she's not really willing to assume
622
00:24:16.697 --> 00:24:19.696
that everybody is completely honest.
623
00:24:19.696 --> 00:24:21.197
Now, she may be willing to assume that
624
00:24:21.197 --> 00:24:23.248
it's very likely that half of them, say,
625
00:24:23.248 --> 00:24:26.223
are honest, but she doesn't know which half.
626
00:24:26.223 --> 00:24:28.490
It could be the case that these three people
627
00:24:28.490 --> 00:24:32.872
indicated here are willing to act dishonestly.
628
00:24:32.872 --> 00:24:34.790
Or maybe it's the case that they're perfectly willing
629
00:24:34.790 --> 00:24:37.072
to act honestly, but their machine is corrupted
630
00:24:37.072 --> 00:24:38.751
by a virus.
631
00:24:38.751 --> 00:24:40.612
Even though they're perfectly willing
632
00:24:40.612 --> 00:24:42.565
and they want to act honestly,
633
00:24:42.565 --> 00:24:43.920
something going on in their computer
634
00:24:43.920 --> 00:24:45.766
that they're even unaware of is making
635
00:24:45.766 --> 00:24:47.365
their computer deviate from the protocol
636
00:24:47.365 --> 00:24:49.106
as the protocol is running.
637
00:24:49.106 --> 00:24:51.445
As I said, the problem is that
638
00:24:51.445 --> 00:24:53.903
this person at the lower left has no idea
639
00:24:53.903 --> 00:24:55.706
which three of the parties
640
00:24:55.706 --> 00:24:58.395
she's interacting with might be malicious.
641
00:24:58.395 --> 00:25:00.388
It may be a different set of three parties.
642
00:25:00.388 --> 00:25:02.746
All she knows, or all she's willing to assume,
643
00:25:02.746 --> 00:25:05.675
is that some three
644
00:25:05.675 --> 00:25:07.273
of the six other people
645
00:25:07.273 --> 00:25:09.498
she's interacting with are honest,
646
00:25:09.498 --> 00:25:11.234
but she has no idea which ones.
647
00:25:11.234 --> 00:25:13.075
This is another example of a more complex
648
00:25:13.075 --> 00:25:14.524
trust relationship that we'd like
649
00:25:14.524 --> 00:25:16.524
to be able to deal with.
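
NOTE
To make this concrete, a toy additive secret-sharing sketch (mine, not
from the talk): each person splits her input into random shares, any
incomplete set of shares reveals nothing, yet the sum of everyone's
inputs can still be computed.
import secrets
P = 2**61 - 1  # prime modulus
def share(x: int, n: int) -> list[int]:
    parts = [secrets.randbelow(P) for _ in range(n - 1)]
    parts.append((x - sum(parts)) % P)  # all n shares sum to x mod P
    return parts
def reconstruct(parts: list[int]) -> int:
    return sum(parts) % P
salaries = [100, 250, 175]  # private inputs of three parties
all_shares = [share(s, 3) for s in salaries]
sum_shares = [sum(col) % P for col in zip(*all_shares)]  # done locally
assert reconstruct(sum_shares) == sum(salaries)  # the total is computed
# without any individual salary ever being revealed to the others.
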
650
00:25:17.370 --> 00:25:19.279
In fact, these issues, these ideas
651
00:25:19.279 --> 00:25:20.972
of having more complicated relationships
652
00:25:20.972 --> 00:25:22.298
are not specific to the group setting.
653
00:25:22.298 --> 00:25:24.955
They come up even in the two-party setting.
654
00:25:24.955 --> 00:25:26.564
Imagine, maybe it's easier this way,
655
00:25:26.564 --> 00:25:28.219
rather than having two people interacting,
656
00:25:28.219 --> 00:25:30.622
we have maybe two companies interacting.
657
00:25:30.622 --> 00:25:32.490
These companies are trying to negotiate
658
00:25:32.490 --> 00:25:33.657
some contract.
659
00:25:36.010 --> 00:25:38.313
In the classical picture of encryption,
660
00:25:38.313 --> 00:25:39.849
the person you trust is someone you'd share
661
00:25:39.849 --> 00:25:41.847
everything with, but in this context,
662
00:25:41.847 --> 00:25:43.297
that may no longer be the case.
663
00:25:43.297 --> 00:25:47.041
Maybe one company is willing to share
664
00:25:47.041 --> 00:25:48.577
their net revenue with the other company
665
00:25:48.577 --> 00:25:50.632
as part of this negotiation, but they're not
666
00:25:50.632 --> 00:25:53.175
willing to reveal their customer list.
667
00:25:53.175 --> 00:25:55.393
It's not the case that just because I trust,
668
00:25:55.393 --> 00:25:56.581
in some sense, the other person
669
00:25:56.581 --> 00:25:58.685
I'm interacting with, I'm willing to share
670
00:25:58.685 --> 00:25:59.518
everything with them.
671
00:25:59.518 --> 00:26:00.623
There's certain things I'm willing to share,
672
00:26:00.623 --> 00:26:03.034
and certain things I'm not willing to share.
673
00:26:03.034 --> 00:26:05.121
In addition, this trust relationship may
674
00:26:05.121 --> 00:26:06.590
be asymmetric.
675
00:26:06.590 --> 00:26:08.077
Maybe the other company is not even willing
676
00:26:08.077 --> 00:26:09.662
to share their net revenue.
677
00:26:09.662 --> 00:26:10.943
They need to decide for themselves
678
00:26:10.943 --> 00:26:12.345
what they're comfortable sharing
679
00:26:12.345 --> 00:26:13.765
with the other party.
680
00:26:13.765 --> 00:26:16.273
These kind of issues, this richer class
681
00:26:16.273 --> 00:26:19.195
of trust models between people, is something
682
00:26:19.195 --> 00:26:20.752
that cryptographers nowadays spend
683
00:26:20.752 --> 00:26:23.200
a lot of time thinking about and designing
684
00:26:23.200 --> 00:26:26.164
protocols that can deal with it.
685
00:26:26.164 --> 00:26:27.787
Now, what does this have to do with privacy?
686
00:26:27.787 --> 00:26:29.083
I guess it's already clear because I mentioned
687
00:26:29.083 --> 00:26:31.664
privacy a number of times.
688
00:26:31.664 --> 00:26:33.813
I just wanna say briefly a couple of words
689
00:26:33.813 --> 00:26:35.936
to set the stage about privacy,
690
00:26:35.936 --> 00:26:38.214
especially nowadays.
691
00:26:38.214 --> 00:26:40.135
To note, first of all, that concerns about
692
00:26:40.135 --> 00:26:42.885
privacy, in general, are not new.
693
00:26:44.816 --> 00:26:47.505
They're traditionally dated to about 1890,
694
00:26:47.505 --> 00:26:50.434
actually, to a Harvard Law Review article
695
00:26:50.434 --> 00:26:52.976
published by Warren and Brandeis.
696
00:26:52.976 --> 00:26:54.871
Louis Brandeis actually was the one
697
00:26:54.871 --> 00:26:57.103
who was supposed to be more involved
698
00:26:57.103 --> 00:26:59.107
in writing the article, but anyway,
699
00:26:59.107 --> 00:27:00.612
he went on to be a Supreme Court justice.
700
00:27:00.612 --> 00:27:03.266
This was before he was a Supreme Court justice.
701
00:27:03.266 --> 00:27:05.890
He wrote an article trying to understand,
702
00:27:05.890 --> 00:27:08.100
basically, what right to privacy there was
703
00:27:08.100 --> 00:27:09.960
and to what extent the right to privacy
704
00:27:09.960 --> 00:27:12.379
followed from the US Constitution.
705
00:27:12.379 --> 00:27:13.745
It's interesting, it's almost a little quaint,
706
00:27:13.745 --> 00:27:16.196
right, to look back at the time
707
00:27:16.196 --> 00:27:17.342
to see what he was concerned about.
708
00:27:17.342 --> 00:27:19.338
He was very concerned about the advent,
709
00:27:19.338 --> 00:27:20.693
the widespread use of photography.
710
00:27:20.693 --> 00:27:22.135
Now it meant that people could take pictures
711
00:27:22.135 --> 00:27:23.706
of other people in the street.
712
00:27:23.706 --> 00:27:25.887
Who could imagine?
713
00:27:25.887 --> 00:27:27.429
The growing popularity of newspapers,
714
00:27:27.429 --> 00:27:29.193
where people would publish, perhaps,
715
00:27:29.193 --> 00:27:31.776
salacious stories about others.
716
00:27:33.213 --> 00:27:36.683
Coming kind of full circle, in 2015,
717
00:27:36.683 --> 00:27:39.476
DARPA kicked off the Brandeis program,
718
00:27:39.476 --> 00:27:41.768
named after Louis Brandeis,
719
00:27:41.768 --> 00:27:43.942
whose goal, actually, was to provide users
720
00:27:43.942 --> 00:27:46.547
with some measure of control over their own privacy.
721
00:27:46.547 --> 00:27:48.358
I have a particularly nice quote
722
00:27:48.358 --> 00:27:50.673
from the program BAA, actually, that I
723
00:27:50.673 --> 00:27:51.934
took out here.
724
00:27:51.934 --> 00:27:54.383
Saying that, fundamentally, "Democracy depends
725
00:27:54.383 --> 00:27:58.092
"on creativity and free interchange of diverse ideas.
726
00:27:58.092 --> 00:28:00.286
"Constant observation,"
727
00:28:00.286 --> 00:28:02.554
lack of privacy, "has a dampening effect
728
00:28:02.554 --> 00:28:04.883
"on individuality, promotes conformance
729
00:28:04.883 --> 00:28:06.489
"and inhibits personal development,
730
00:28:06.489 --> 00:28:09.600
"freedom of thought and speech."
731
00:28:09.600 --> 00:28:11.014
It's amazing, actually, to think about
732
00:28:11.014 --> 00:28:12.682
this coming from the DoD.
733
00:28:12.682 --> 00:28:13.515
DARPA's an agency
734
00:28:15.780 --> 00:28:17.643
in the DoD hierarchy.
735
00:28:17.643 --> 00:28:19.529
This is the Department of Defense talking about
736
00:28:19.529 --> 00:28:22.959
how important privacy is to a free society.
737
00:28:22.959 --> 00:28:25.357
Now, privacy itself is actually quite a fuzzy term.
738
00:28:25.357 --> 00:28:28.036
When people talk about privacy in the media,
739
00:28:28.036 --> 00:28:29.188
or even if you're thinking about privacy,
740
00:28:29.188 --> 00:28:31.358
it's not always clear to what exactly
741
00:28:31.358 --> 00:28:32.538
they're referring and what the limits
742
00:28:32.538 --> 00:28:33.952
on privacy should be.
743
00:28:33.952 --> 00:28:35.478
There are a ton of really interesting questions
744
00:28:35.478 --> 00:28:37.049
in this area, actually.
745
00:28:37.049 --> 00:28:39.689
Just to name one, like I was hinting at earlier,
746
00:28:39.689 --> 00:28:41.991
there's this fundamental question of what right
747
00:28:41.991 --> 00:28:44.056
to privacy we have.
748
00:28:44.056 --> 00:28:45.617
In particular, as US citizens,
749
00:28:45.617 --> 00:28:48.041
what constitutional right to privacy do we have?
750
00:28:48.041 --> 00:28:50.263
What privacy rights do we have that actually
751
00:28:50.263 --> 00:28:52.722
flow from the Constitution itself?
752
00:28:52.722 --> 00:28:54.907
Even if we concede, or once we
753
00:28:54.907 --> 00:28:57.079
figure out exactly what those rights are,
754
00:28:57.079 --> 00:28:59.925
we might ask, "Well, in what conditions
755
00:28:59.925 --> 00:29:01.639
"can those rights be overridden?"
756
00:29:01.639 --> 00:29:05.169
That's a particularly important debate that
757
00:29:05.169 --> 00:29:07.864
should be happening today.
758
00:29:07.864 --> 00:29:10.493
You can also ask, from a societal perspective,
759
00:29:10.493 --> 00:29:12.918
are our expectations of privacy changing?
760
00:29:12.918 --> 00:29:14.513
A lot of people argue that now that
761
00:29:14.513 --> 00:29:15.617
everyone's putting everything they do
762
00:29:15.617 --> 00:29:17.452
on social media and publicizing, essentially,
763
00:29:17.452 --> 00:29:19.002
where they are at every moment, they actually
764
00:29:19.002 --> 00:29:20.058
don't want privacy.
765
00:29:20.058 --> 00:29:22.001
They prefer something that's certainly
766
00:29:22.001 --> 00:29:23.500
very different from the kind of privacy
767
00:29:23.500 --> 00:29:24.809
people might have thought about
768
00:29:24.809 --> 00:29:26.923
15 or 20 years ago.
769
00:29:26.923 --> 00:29:27.756
Now, I can
770
00:29:29.308 --> 00:29:31.430
shortcut this by saying, well, I'm not gonna answer
771
00:29:31.430 --> 00:29:32.980
any of these questions.
772
00:29:32.980 --> 00:29:34.589
I'm not a philosopher.
773
00:29:34.589 --> 00:29:36.567
I'm not a lawyer.
774
00:29:36.567 --> 00:29:38.633
I don't work in public policy, per se,
775
00:29:38.633 --> 00:29:39.845
but actually, let me say in particular
776
00:29:39.845 --> 00:29:41.694
about the second, well, maybe the first
777
00:29:41.694 --> 00:29:43.270
and second bullets, regarding what right
778
00:29:43.270 --> 00:29:44.855
to privacy we may have today and under
779
00:29:44.855 --> 00:29:46.809
what conditions they can be overridden.
780
00:29:46.809 --> 00:29:48.655
I said earlier that it's a very important debate
781
00:29:48.655 --> 00:29:51.363
that we, as a country, should be having.
782
00:29:51.363 --> 00:29:53.178
As a cryptographer, I don't really view it
783
00:29:53.178 --> 00:29:54.769
as my role, per se, to come out with a
784
00:29:54.769 --> 00:29:57.291
policy recommendation one way or the other,
785
00:29:57.291 --> 00:30:00.157
but what I do view as my goal
786
00:30:00.157 --> 00:30:02.404
is to present the options and to make sure
787
00:30:02.404 --> 00:30:03.952
that people on both sides of the debate
788
00:30:03.952 --> 00:30:06.035
are technically informed.
789
00:30:07.266 --> 00:30:09.757
Now, a lotta people are worried nowadays
790
00:30:09.757 --> 00:30:11.574
about privacy, in particular because
791
00:30:11.574 --> 00:30:12.783
of government surveillance and because
792
00:30:12.783 --> 00:30:16.515
of the revelations that came out a few years back.
793
00:30:16.515 --> 00:30:19.217
It's not hard to find examples of this,
794
00:30:19.217 --> 00:30:23.261
of reports of NSA collecting phone records
795
00:30:23.261 --> 00:30:24.511
of US citizens.
796
00:30:25.542 --> 00:30:28.292
People taking people's cellphones
797
00:30:29.200 --> 00:30:31.269
and looking through the contents of those phones.
798
00:30:31.269 --> 00:30:32.648
This happened, this particular article
799
00:30:32.648 --> 00:30:34.689
is talking about
800
00:30:34.689 --> 00:30:37.403
taking the phones of people who were attending the rally.
801
00:30:37.403 --> 00:30:39.071
There are other examples of people's phones
802
00:30:39.071 --> 00:30:41.025
being taken when they're entering the United States
803
00:30:41.025 --> 00:30:42.802
at a border crossing.
804
00:30:42.802 --> 00:30:45.010
But, in fact, I would say that
805
00:30:45.010 --> 00:30:46.487
while it's certainly fair game to worry
806
00:30:46.487 --> 00:30:48.252
about this sort of thing, and it's certainly,
807
00:30:48.252 --> 00:30:49.416
like I said earlier, the kind of thing
808
00:30:49.416 --> 00:30:52.087
that we need to be having a public discussion about,
809
00:30:52.087 --> 00:30:53.808
from everything I've seen,
810
00:30:53.808 --> 00:30:57.026
the government is doing their best
811
00:30:57.026 --> 00:31:00.175
to do things within the law.
812
00:31:00.175 --> 00:31:01.185
There are certainly many people within
813
00:31:01.185 --> 00:31:03.790
government who take these legal constraints
814
00:31:03.790 --> 00:31:05.627
very seriously.
815
00:31:05.627 --> 00:31:07.101
Again, while it's something to be concerned about,
816
00:31:07.101 --> 00:31:09.104
what concerns me much more, actually,
817
00:31:09.104 --> 00:31:11.277
is corporate surveillance.
818
00:31:11.277 --> 00:31:13.379
We all know that corporations nowadays
819
00:31:13.379 --> 00:31:14.749
are collecting tons of information
820
00:31:14.749 --> 00:31:16.166
about each of us.
821
00:31:17.230 --> 00:31:18.827
The scariest way that I know how to express it
822
00:31:18.827 --> 00:31:21.552
to myself, I think, is that Facebook,
823
00:31:21.552 --> 00:31:23.844
Google and Apple probably know more
824
00:31:23.844 --> 00:31:25.272
about you than your family does,
825
00:31:25.272 --> 00:31:28.236
than your spouse or your partner does.
826
00:31:28.236 --> 00:31:30.514
Google, for example, knows every
827
00:31:30.514 --> 00:31:32.482
internet search you've done, assuming you
828
00:31:32.482 --> 00:31:34.582
were logged in, let's say, at the time.
829
00:31:34.582 --> 00:31:35.780
If you have an Android phone, they also
830
00:31:35.780 --> 00:31:37.929
know everywhere you've been.
831
00:31:37.929 --> 00:31:39.367
Does everyone in your family know
832
00:31:39.367 --> 00:31:41.766
everything you search for and everywhere you've been?
833
00:31:41.766 --> 00:31:43.133
Probably not.
834
00:31:43.133 --> 00:31:44.432
That's a scary thought.
835
00:31:44.432 --> 00:31:46.100
It's really eye-opening to think about things
836
00:31:46.100 --> 00:31:46.964
in that way.
837
00:31:46.964 --> 00:31:49.077
Do you really want, do you make a decision,
838
00:31:49.077 --> 00:31:51.203
a conscious decision, to trust any
839
00:31:51.203 --> 00:31:52.596
of these corporations more than you trust
840
00:31:52.596 --> 00:31:54.190
some of your own family members?
841
00:31:54.190 --> 00:31:56.734
Actually, I grabbed this interesting article
842
00:31:56.734 --> 00:31:58.185
last year.
843
00:31:58.185 --> 00:31:59.388
In fact, Google may know you better
844
00:31:59.388 --> 00:32:00.663
than you know yourself.
845
00:32:00.663 --> 00:32:02.739
This is an interesting read if you wanna read
846
00:32:02.739 --> 00:32:04.123
an article about it.
847
00:32:04.123 --> 00:32:07.614
Basically, arguing that we have these,
848
00:32:07.614 --> 00:32:09.107
delusion is a strong term, but we have these
849
00:32:09.107 --> 00:32:10.246
delusions about what we do.
850
00:32:10.246 --> 00:32:11.420
I'm a hard worker.
851
00:32:11.420 --> 00:32:12.827
But Google knows exactly how much time
852
00:32:12.827 --> 00:32:14.363
you spend working, how much time you spend
853
00:32:14.363 --> 00:32:16.739
reading your mail. (laughter)
854
00:32:16.739 --> 00:32:18.961
It's an interesting thought, as well.
855
00:32:18.961 --> 00:32:20.744
There are many other examples.
856
00:32:20.744 --> 00:32:21.779
You see these things all the time.
857
00:32:21.779 --> 00:32:22.977
Actually, what's great about giving a talk
858
00:32:22.977 --> 00:32:25.629
like this is that as I was preparing my talk,
859
00:32:25.629 --> 00:32:26.937
I kept on seeing news articles
860
00:32:26.937 --> 00:32:29.842
coming out daily just reinforcing this idea
861
00:32:29.842 --> 00:32:31.496
that companies are collecting more and more data
862
00:32:31.496 --> 00:32:32.914
about us.
863
00:32:32.914 --> 00:32:34.955
There's a famous story about how Target
864
00:32:34.955 --> 00:32:36.788
exposed a teen girl's pregnancy.
865
00:32:36.788 --> 00:32:38.048
This was many years back.
866
00:32:38.048 --> 00:32:38.881
Basically,
867
00:32:39.967 --> 00:32:41.420
the girl's father found out that she
868
00:32:41.420 --> 00:32:43.461
was pregnant from Target before the girl
869
00:32:43.461 --> 00:32:44.875
told her father.
870
00:32:44.875 --> 00:32:46.125
That's amazing.
871
00:32:48.960 --> 00:32:50.865
These concerns about the
872
00:32:50.865 --> 00:32:52.795
collection of data can also have negative impacts
873
00:32:52.795 --> 00:32:55.198
on people, negative economic impacts.
874
00:32:55.198 --> 00:32:56.803
For example, differential pricing
875
00:32:56.803 --> 00:32:57.919
is a real concern now.
876
00:32:57.919 --> 00:32:59.723
The idea here being that if a company knows
877
00:32:59.723 --> 00:33:01.988
exactly how much you spend and on what items,
878
00:33:01.988 --> 00:33:03.130
and not only that, but they know how long
879
00:33:03.130 --> 00:33:04.703
you're spending in a particular site,
880
00:33:04.703 --> 00:33:06.842
how long you click around looking for a good deal,
881
00:33:06.842 --> 00:33:08.343
they can use all that information to now
882
00:33:08.343 --> 00:33:10.510
give you, essentially, the
883
00:33:12.951 --> 00:33:15.699
maximum price at which you'll buy an item.
884
00:33:15.699 --> 00:33:17.197
It might be different for different people.
885
00:33:17.197 --> 00:33:19.106
You have different people looking for, say,
886
00:33:19.106 --> 00:33:20.150
an airline ticket.
887
00:33:20.150 --> 00:33:21.930
They'll get displayed different prices
888
00:33:21.930 --> 00:33:23.289
because the companies have so much information
889
00:33:23.289 --> 00:33:25.431
about us that they essentially know in advance
890
00:33:25.431 --> 00:33:27.931
how much we're willing to pay.
891
00:33:29.597 --> 00:33:30.769
This also is something, like I said,
892
00:33:30.769 --> 00:33:32.644
just came out in the news yesterday,
893
00:33:32.644 --> 00:33:35.683
I suppose, about Verizon wanting to collect
894
00:33:35.683 --> 00:33:36.905
more data from people in order to do
895
00:33:36.905 --> 00:33:40.085
better advertising, of course.
896
00:33:40.085 --> 00:33:41.297
The problem's only getting worse.
897
00:33:41.297 --> 00:33:42.711
The problem's getting worse as we have more
898
00:33:42.711 --> 00:33:45.099
and more devices being installed in our houses,
899
00:33:45.099 --> 00:33:47.800
and as we're wearing devices on our bodies
900
00:33:47.800 --> 00:33:50.885
that are continually collecting information about us.
901
00:33:50.885 --> 00:33:53.320
You can see this even, again, some fascinating
902
00:33:53.320 --> 00:33:56.191
cases about whether Alexa,
903
00:33:56.191 --> 00:33:58.444
for example, whether the data from Alexa
904
00:33:58.444 --> 00:34:00.583
can be used as evidence against somebody
905
00:34:00.583 --> 00:34:01.924
at a murder trial.
906
00:34:01.924 --> 00:34:03.940
These are kind of things that just didn't happen
907
00:34:03.940 --> 00:34:05.478
even a few years back because people didn't
908
00:34:05.478 --> 00:34:07.761
have these devices in their houses,
909
00:34:07.761 --> 00:34:08.890
essentially recording everything that
910
00:34:08.890 --> 00:34:10.973
they're saying and doing.
911
00:34:12.526 --> 00:34:14.793
Now, what's cryptography's role, or what can
912
00:34:14.793 --> 00:34:17.869
cryptography's role be in all of this?
913
00:34:17.869 --> 00:34:19.631
I think there are really two different roles
914
00:34:19.631 --> 00:34:21.090
that cryptography can play.
915
00:34:21.090 --> 00:34:23.249
The first, as I said earlier, is with this emphasis
916
00:34:23.249 --> 00:34:25.712
on definitions, I think cryptography
917
00:34:25.712 --> 00:34:27.212
can really help us
918
00:34:28.421 --> 00:34:31.061
to replace this fuzzy model we have about privacy.
919
00:34:31.061 --> 00:34:32.288
There are many different concerns, and many
920
00:34:32.288 --> 00:34:34.088
different people have different things
921
00:34:34.088 --> 00:34:35.311
they're worried about.
922
00:34:35.311 --> 00:34:37.439
What cryptography can help us do a little bit
923
00:34:37.439 --> 00:34:40.257
is to reason about that in a more formal manner.
924
00:34:40.257 --> 00:34:41.375
It doesn't mean we have to necessarily
925
00:34:41.375 --> 00:34:43.833
write down mathematical definitions,
926
00:34:43.833 --> 00:34:45.007
it doesn't mean we have to give proofs,
927
00:34:45.007 --> 00:34:46.735
but at least to make us think a little bit
928
00:34:46.735 --> 00:34:48.235
in a more rational
929
00:34:49.605 --> 00:34:50.946
and focused manner about what exactly
930
00:34:50.946 --> 00:34:53.851
we mean by privacy in a particular context.
931
00:34:53.851 --> 00:34:56.285
Perhaps more interestingly, at least from my point of view,
932
00:34:56.285 --> 00:34:58.864
is that cryptography can help us achieve
933
00:34:58.864 --> 00:35:00.715
certain levels of privacy that we currently
934
00:35:00.715 --> 00:35:03.342
are not achieving by developing new algorithms
935
00:35:03.342 --> 00:35:06.428
and protocols that can allow people to do
936
00:35:06.428 --> 00:35:08.671
or to achieve things that they're already achieving,
937
00:35:08.671 --> 00:35:10.974
while also adding additional privacy guarantees
938
00:35:10.974 --> 00:35:12.224
on top of that.
939
00:35:14.731 --> 00:35:17.288
This is really becoming increasingly important,
940
00:35:17.288 --> 00:35:18.800
as I indicated with the news articles
941
00:35:18.800 --> 00:35:21.140
on the previous slide, with this huge volume
942
00:35:21.140 --> 00:35:24.044
of personal data that's being collected,
943
00:35:24.044 --> 00:35:25.679
aggregated, stored, shared and used about
944
00:35:25.679 --> 00:35:27.341
every one of us in this room.
945
00:35:27.341 --> 00:35:28.785
What I wanna focus on in particular here
946
00:35:28.785 --> 00:35:31.366
is the data collection aspect.
947
00:35:31.366 --> 00:35:32.755
The fact that these companies are collecting
948
00:35:32.755 --> 00:35:35.037
all of this personal data about each of us,
949
00:35:35.037 --> 00:35:37.643
and also the aspect of how this data
950
00:35:37.643 --> 00:35:40.993
is, in turn, being used by other companies.
951
00:35:40.993 --> 00:35:43.505
I'll focus on the collection first,
952
00:35:43.505 --> 00:35:46.131
and then I'll come back to the issue of data usage.
953
00:35:46.131 --> 00:35:48.015
Again, I found this wonderful article,
954
00:35:48.015 --> 00:35:49.349
wonderful for me because it gave me
955
00:35:49.349 --> 00:35:52.588
a springboard, something to tie it into.
956
00:35:52.588 --> 00:35:56.039
This was in just last week's newspaper.
957
00:35:56.039 --> 00:35:59.300
This was an op-ed talking about how Congress
958
00:35:59.300 --> 00:36:01.883
basically passed the 21st Century Cures Act.
959
00:36:01.883 --> 00:36:05.304
It passed this, actually, before last week.
960
00:36:05.304 --> 00:36:07.066
This was an op-ed talking about it.
961
00:36:07.066 --> 00:36:08.634
Where essentially the law tries to create
962
00:36:08.634 --> 00:36:10.774
what they call an information commons
963
00:36:10.774 --> 00:36:13.100
that's gonna be a government-regulated pool
964
00:36:13.100 --> 00:36:15.432
of medical data about individuals
965
00:36:15.432 --> 00:36:17.148
that's going to be accessible to all
966
00:36:17.148 --> 00:36:18.731
health researchers.
967
00:36:19.654 --> 00:36:21.371
I wanna just try to model that.
968
00:36:21.371 --> 00:36:23.759
This is not actually how the information commons
969
00:36:23.759 --> 00:36:25.571
is working, but we can think about it
970
00:36:25.571 --> 00:36:27.431
as being something like the following.
971
00:36:27.431 --> 00:36:29.399
We may have some hospitals, some local
972
00:36:29.399 --> 00:36:31.966
or regional hospitals, that patients come to.
973
00:36:31.966 --> 00:36:33.816
During the course of their medical care,
974
00:36:33.816 --> 00:36:35.806
data will be collected about them.
975
00:36:35.806 --> 00:36:38.065
Then we may have researchers at NIH
976
00:36:38.065 --> 00:36:40.550
who want to perform some studies,
977
00:36:40.550 --> 00:36:42.471
perhaps using some data that
978
00:36:42.471 --> 00:36:44.437
they've already collected from subjects
979
00:36:44.437 --> 00:36:46.927
that they've had, they've worked with themselves,
980
00:36:46.927 --> 00:36:49.494
but also by using the data that's been collected
981
00:36:49.494 --> 00:36:51.427
about other people at various hospitals
982
00:36:51.427 --> 00:36:53.528
around the country.
983
00:36:53.528 --> 00:36:56.118
The way this might work, again, at a rough level,
984
00:36:56.118 --> 00:36:59.863
is that under this act, those hospitals
985
00:36:59.863 --> 00:37:02.176
would be required to share, somehow or another,
986
00:37:02.176 --> 00:37:04.039
data about their patients with the
987
00:37:04.039 --> 00:37:07.062
medical researchers who wanna conduct the study.
988
00:37:07.062 --> 00:37:08.755
They would then take that data that's being sent
989
00:37:08.755 --> 00:37:10.629
to them from the hospitals, pool it with their
990
00:37:10.629 --> 00:37:12.548
own data that they hold, and compute
991
00:37:12.548 --> 00:37:14.465
some scientific report.
992
00:37:15.677 --> 00:37:18.475
The issues here are twofold.
993
00:37:18.475 --> 00:37:20.817
There's data collection, the fact that now
994
00:37:20.817 --> 00:37:23.638
NIH at some moment in time has collected data
995
00:37:23.638 --> 00:37:26.445
about hundreds or thousands of patients
996
00:37:26.445 --> 00:37:29.338
that it's storing and they're computing over,
997
00:37:29.338 --> 00:37:30.768
and also this data usage.
998
00:37:30.768 --> 00:37:32.279
When they publish this report, how do I know
999
00:37:32.279 --> 00:37:33.459
that they're not gonna highlight some
1000
00:37:33.459 --> 00:37:35.460
medical condition that I have, or how do I know
1001
00:37:35.460 --> 00:37:37.528
that the report won't reveal to somebody else
1002
00:37:37.528 --> 00:37:42.303
that I came in for a certain test on a certain date?
1003
00:37:42.303 --> 00:37:44.355
It's easy to understand why people wanna
1004
00:37:44.355 --> 00:37:46.358
collect data in general, 'cause it goes not only
1005
00:37:46.358 --> 00:37:48.499
for the medical researchers, but also for
1006
00:37:48.499 --> 00:37:50.394
these companies who ultimately wanna sell you things
1007
00:37:50.394 --> 00:37:52.170
or sell advertising.
1008
00:37:52.170 --> 00:37:55.012
As the volume of data collected goes up,
1009
00:37:55.012 --> 00:37:58.108
the utility to those companies also goes up.
1010
00:37:58.108 --> 00:37:59.466
If they know more about you, they can
1011
00:37:59.466 --> 00:38:01.181
charge you a higher price.
1012
00:38:01.181 --> 00:38:02.504
If they know more about your habits
1013
00:38:02.504 --> 00:38:04.667
and what you're interested in, they can charge more
1014
00:38:04.667 --> 00:38:08.537
to advertisers to display a certain ad to you.
1015
00:38:08.537 --> 00:38:10.069
But, of course, at the same time
1016
00:38:10.069 --> 00:38:11.800
as the amount of data collected goes up,
1017
00:38:11.800 --> 00:38:14.545
the privacy of each of us goes down.
1018
00:38:14.545 --> 00:38:15.952
There's a fundamental tension here,
1019
00:38:15.952 --> 00:38:18.007
or there seems to be a fundamental tension,
1020
00:38:18.007 --> 00:38:20.981
between the utility and the privacy.
1021
00:38:20.981 --> 00:38:23.405
But the real question is do we need
1022
00:38:23.405 --> 00:38:25.760
to collect all this data in one location
1023
00:38:25.760 --> 00:38:28.562
in order to get the utility that these companies want
1024
00:38:28.562 --> 00:38:31.321
or that the medical researchers want?
1025
00:38:31.321 --> 00:38:33.216
Is collecting the data the only way
1026
00:38:33.216 --> 00:38:36.110
to derive the utility that we want from that data?
1027
00:38:36.110 --> 00:38:38.188
Or can we come up with other approaches
1028
00:38:38.188 --> 00:38:40.121
that avoid the need for central data collection
1029
00:38:40.121 --> 00:38:41.704
in the first place?
1030
00:38:42.804 --> 00:38:45.805
This brings us to a favorite research topic of mine,
1031
00:38:45.805 --> 00:38:47.785
multiparty computation.
1032
00:38:47.785 --> 00:38:49.779
A multiparty computation protocol
1033
00:38:49.779 --> 00:38:52.816
is gonna be a distributed protocol run between
1034
00:38:52.816 --> 00:38:54.941
some number of different entities,
1035
00:38:54.941 --> 00:38:56.512
some number of different parties or people
1036
00:38:56.512 --> 00:38:58.866
or computers, where each of those people
1037
00:38:58.866 --> 00:39:01.866
holds their own local private input.
1038
00:39:03.185 --> 00:39:05.055
It's an interactive protocol, so they'll all
1039
00:39:05.055 --> 00:39:07.277
exchange messages with each other
1040
00:39:07.277 --> 00:39:09.423
as part of running the protocol.
1041
00:39:09.423 --> 00:39:12.053
At the end of the protocol, say, one of the parties
1042
00:39:12.053 --> 00:39:15.880
will output some function of everyone else's data.
1043
00:39:15.880 --> 00:39:17.778
You can view this, for example,
1044
00:39:17.778 --> 00:39:20.238
as taking data from several patients,
1045
00:39:20.238 --> 00:39:21.701
or maybe these people are actually hospitals,
1046
00:39:21.701 --> 00:39:23.780
and they've aggregated some data locally.
1047
00:39:23.780 --> 00:39:25.271
Then they wanna run some study, they wanna
1048
00:39:25.271 --> 00:39:27.322
run, perhaps, some machine learning algorithm
1049
00:39:27.322 --> 00:39:28.496
over that data.
1050
00:39:28.496 --> 00:39:30.046
The function f might, in that case,
1051
00:39:30.046 --> 00:39:34.300
be a function that learns a particular model.
1052
00:39:34.300 --> 00:39:37.687
The goal of this protocol is to learn that model.
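NOTE
[Editor's note] A minimal sketch of the simplest instance of this idea, added for illustration and not taken from the talk: a semi-honest secure sum via additive secret sharing, where the function f is just addition. Each party splits its private input into random shares that sum to the input, so no single share reveals anything. All names and values are illustrative.
import random
PRIME = 2**61 - 1  # all arithmetic is done modulo a large prime
def share(secret, n):
    # split `secret` into n random additive shares that sum to it mod PRIME
    shares = [random.randrange(PRIME) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares
def secure_sum(private_inputs):
    # each party i splits its input and sends share j to party j
    n = len(private_inputs)
    all_shares = [share(x, n) for x in private_inputs]
    # party j locally adds the one share it received from every party
    partials = [sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)]
    # publishing the partial sums reveals only the total, i.e., f(x1,...,xn)
    return sum(partials) % PRIME
print(secure_sum([20, 35, 42]))  # -> 97
Real protocols for arbitrary functions are far more involved, but this captures the flavor: the messages exchanged look individually random, yet together they determine the output.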
1053
00:39:37.687 --> 00:39:39.293
So far, I haven't said anything about privacy
1054
00:39:39.293 --> 00:39:41.019
or anything about security.
1055
00:39:41.019 --> 00:39:44.020
But how might we define a notion of security
1056
00:39:44.020 --> 00:39:45.196
in this setting?
1057
00:39:45.196 --> 00:39:46.614
What might it mean if there were a protocol
1058
00:39:46.614 --> 00:39:49.000
that allows these people to collaboratively compute
1059
00:39:49.000 --> 00:39:50.344
over their local data?
1060
00:39:50.344 --> 00:39:51.494
What might it mean for that protocol
1061
00:39:51.494 --> 00:39:53.992
to be called secure?
1062
00:39:53.992 --> 00:39:55.349
Well, we could go ahead and list
1063
00:39:55.349 --> 00:39:56.934
several desired properties that we want,
1064
00:39:56.934 --> 00:39:58.332
where you could talk about, well, I want this
1065
00:39:58.332 --> 00:40:00.585
person's input to be secret from this person.
1066
00:40:00.585 --> 00:40:02.706
I wanna make sure that they
1067
00:40:02.706 --> 00:40:04.389
compute the correct output or something like that,
1068
00:40:04.389 --> 00:40:05.567
but it turns out to be, number one,
1069
00:40:05.567 --> 00:40:07.932
very difficult to do that.
1070
00:40:07.932 --> 00:40:10.167
Also, maybe more important, by doing that,
1071
00:40:10.167 --> 00:40:11.711
you might end up missing some properties,
1072
00:40:11.711 --> 00:40:13.876
and you might never be convinced that
1073
00:40:13.876 --> 00:40:15.161
you didn't leave something off your list
1074
00:40:15.161 --> 00:40:18.018
that you actually care about.
1075
00:40:18.018 --> 00:40:19.230
In addition, some of these properties
1076
00:40:19.230 --> 00:40:21.194
are rather subtle to define.
1077
00:40:21.194 --> 00:40:24.176
If I just say something like, if I'm player one,
1078
00:40:24.176 --> 00:40:25.398
I wanna make sure that nobody learns anything
1079
00:40:25.398 --> 00:40:27.582
about my input, well, it turns out that
1080
00:40:27.582 --> 00:40:29.612
in general, you're not going to achieve that
1081
00:40:29.612 --> 00:40:32.241
because by the very nature of the computation,
1082
00:40:32.241 --> 00:40:33.945
you learn the output.
1083
00:40:33.945 --> 00:40:36.129
The output depended on my input.
1084
00:40:36.129 --> 00:40:37.579
There's gonna be some correlation between
1085
00:40:37.579 --> 00:40:39.466
my input and that output.
1086
00:40:39.466 --> 00:40:41.855
It's very difficult, actually, to sit down
1087
00:40:41.855 --> 00:40:43.245
and think about how you might define
1088
00:40:43.245 --> 00:40:45.213
what it means to learn only that value
1089
00:40:45.213 --> 00:40:46.630
and nothing else.
1090
00:40:47.937 --> 00:40:49.880
Now, cryptographers, over the course of many years,
1091
00:40:49.880 --> 00:40:52.823
have developed, actually, a way of defining
1092
00:40:52.823 --> 00:40:55.052
security in a setting that addresses
1093
00:40:55.052 --> 00:40:57.107
all of these concerns.
1094
00:40:57.107 --> 00:40:58.810
In particular, what cryptographers imagine
1095
00:40:58.810 --> 00:41:01.111
is what would be the ideal case?
1096
00:41:01.111 --> 00:41:03.885
What would we like to achieve if we
1097
00:41:03.885 --> 00:41:06.295
didn't have to live in the real world?
1098
00:41:06.295 --> 00:41:07.701
(speaks too low to hear)
1099
00:41:07.701 --> 00:41:08.882
If we didn't have to live in the real world
1100
00:41:08.882 --> 00:41:10.729
and we could imagine that ideal model,
1101
00:41:10.729 --> 00:41:12.459
what would that ideal model look like?
1102
00:41:12.459 --> 00:41:14.388
The ideal model, what it might look like
1103
00:41:14.388 --> 00:41:18.683
is that we have a central trusted authority.
1104
00:41:18.683 --> 00:41:20.662
I have to tell you that
1105
00:41:20.662 --> 00:41:22.142
it's not easy these days coming up with a picture
1106
00:41:22.142 --> 00:41:23.423
of something that people might even consider
1107
00:41:23.423 --> 00:41:25.524
a central trusted authority. (laughter)
1108
00:41:25.524 --> 00:41:26.760
I chose the Supreme Court,
1109
00:41:26.760 --> 00:41:29.278
but I chose that maybe three years ago.
1110
00:41:29.278 --> 00:41:31.438
(laughter)
1111
00:41:31.438 --> 00:41:33.287
That's a discussion for another time.
1112
00:41:33.287 --> 00:41:35.698
But imagine we did have a central trusted authority
1113
00:41:35.698 --> 00:41:37.880
that we all agreed was trusted to do
1114
00:41:37.880 --> 00:41:39.041
what it's supposed to.
1115
00:41:39.041 --> 00:41:40.316
Then what we might do in that case
1116
00:41:40.316 --> 00:41:41.760
is we might actually be willing to trust
1117
00:41:41.760 --> 00:41:43.467
that entity with our data.
1118
00:41:43.467 --> 00:41:44.910
What we would do is we would all send our data
1119
00:41:44.910 --> 00:41:46.941
to that single trusted entity.
1120
00:41:46.941 --> 00:41:49.045
That trusted entity would then perform
1121
00:41:49.045 --> 00:41:51.229
the computation over the data,
1122
00:41:51.229 --> 00:41:53.206
and then send back the result to either
1123
00:41:53.206 --> 00:41:55.873
all of us or one of the parties.
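NOTE
[Editor's note] The ideal model just described is simple enough to state in a few lines; this hypothetical sketch only fixes the benchmark that a real protocol must match.
def ideal_world(f, private_inputs):
    # everyone hands their private input directly to one trusted entity...
    result = f(private_inputs)
    # ...which computes f and releases only the result, nothing else
    return result
print(ideal_world(sum, [20, 35, 42]))  # the same output a secure protocol must produce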
1124
00:41:57.141 --> 00:42:00.652
We'll define a secure protocol in the real world,
1125
00:42:00.652 --> 00:42:02.387
where parties are exchanging messages
1126
00:42:02.387 --> 00:42:05.844
between each other, and there is no trusted party.
1127
00:42:05.844 --> 00:42:09.122
We'll define a protocol like that to be secure
1128
00:42:09.122 --> 00:42:12.758
if it provides a simulation of what we
1129
00:42:12.758 --> 00:42:14.740
achieve in that world with a trusted party.
1130
00:42:14.740 --> 00:42:16.231
Now, this word simulation has a very
1131
00:42:16.231 --> 00:42:17.826
technical definition in cryptography.
1132
00:42:17.826 --> 00:42:20.062
I'm not gonna go into it, but the basic idea
1133
00:42:20.062 --> 00:42:22.395
is just telling you that any property
1134
00:42:22.395 --> 00:42:25.290
you achieve in this ideal world
1135
00:42:25.290 --> 00:42:27.326
with the trusted entity will also
1136
00:42:27.326 --> 00:42:31.204
be achieved by the real-world execution of the protocol.
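NOTE
[Editor's note] For readers who want the standard formalization (not spelled out in the talk): a protocol \pi securely computes f if, for every real-world adversary \mathcal{A}, there exists an ideal-world simulator \mathcal{S} such that the two executions are computationally indistinguishable:
\{\mathrm{REAL}_{\pi,\mathcal{A}}(x_1,\ldots,x_n)\} \;\stackrel{c}{\approx}\; \{\mathrm{IDEAL}_{f,\mathcal{S}}(x_1,\ldots,x_n)\}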
1137
00:42:31.204 --> 00:42:32.998
In particular, if we go back to this picture
1138
00:42:32.998 --> 00:42:34.777
of the people communicating over the network,
1139
00:42:34.777 --> 00:42:38.605
and one party learning this output value y,
1140
00:42:38.605 --> 00:42:40.209
well, just like we said, that's going
1141
00:42:40.209 --> 00:42:43.691
to be giving you the same security guarantees,
1142
00:42:43.691 --> 00:42:46.487
even if all these other people are corrupted.
1143
00:42:46.487 --> 00:42:49.249
We're concerned about the security guarantees
1144
00:42:49.249 --> 00:42:51.708
being provided for this person on the lower left.
1145
00:42:51.708 --> 00:42:53.513
It's gonna provide that person, that party,
1146
00:42:53.513 --> 00:42:54.993
with the same exact guarantees that they
1147
00:42:54.993 --> 00:42:58.114
would have in an ideal model where everybody
1148
00:42:58.114 --> 00:43:00.792
sends their inputs to the trusted party,
1149
00:43:00.792 --> 00:43:03.055
and it computes the result itself and sends it back to,
1150
00:43:03.055 --> 00:43:05.720
in this case, the person on the upper right.
1151
00:43:05.720 --> 00:43:06.830
If you think of what that means,
1152
00:43:06.830 --> 00:43:08.520
that the execution of the protocol
1153
00:43:08.520 --> 00:43:10.981
doesn't reveal anything about X5,
1154
00:43:10.981 --> 00:43:14.327
about the input of the party on the lower left,
1155
00:43:14.327 --> 00:43:16.732
beyond what's revealed by the inputs
1156
00:43:16.732 --> 00:43:18.282
of the corrupted parties and the output
1157
00:43:18.282 --> 00:43:20.667
that they learn from the protocol.
1158
00:43:20.667 --> 00:43:22.493
If you wanna compute f, then that's the best
1159
00:43:22.493 --> 00:43:24.493
you can hope to achieve.
1160
00:43:26.414 --> 00:43:28.767
We have this beautiful definition of what it means
1161
00:43:28.767 --> 00:43:31.142
to have a secure computation protocol.
1162
00:43:31.142 --> 00:43:32.200
Are these protocols feasible?
1163
00:43:32.200 --> 00:43:33.662
Do they exist?
1164
00:43:33.662 --> 00:43:35.522
Well, it turns out, actually,
1165
00:43:35.522 --> 00:43:38.135
that a long line of research has shown
1166
00:43:38.135 --> 00:43:41.159
that under a small set of reasonable assumptions,
1167
00:43:41.159 --> 00:43:44.509
in fact, secure computation of any function is possible.
1168
00:43:44.509 --> 00:43:46.045
You can take any function you'd like,
1169
00:43:46.045 --> 00:43:48.744
the most complicated machine learning algorithm,
1170
00:43:48.744 --> 00:43:50.348
the most complicated statistics or even maybe
1171
00:43:50.348 --> 00:43:54.937
the most complicated advertising generation algorithm,
1172
00:43:54.937 --> 00:43:57.286
you could plug it into one of these protocols,
1173
00:43:57.286 --> 00:43:58.848
and you could design a protocol that would
1174
00:43:58.848 --> 00:44:02.340
exactly learn the function you're trying to compute
1175
00:44:02.340 --> 00:44:05.388
without unduly harming the privacy
1176
00:44:05.388 --> 00:44:08.256
of any individual party's input.
1177
00:44:08.256 --> 00:44:10.116
If we go back to this example of the
1178
00:44:10.116 --> 00:44:12.241
medical researchers, that would mean
1179
00:44:12.241 --> 00:44:15.612
that rather than having the status quo
1180
00:44:15.612 --> 00:44:17.079
where, say, the hospitals are sending over
1181
00:44:17.079 --> 00:44:18.531
all their data to the NIH, and then
1182
00:44:18.531 --> 00:44:21.408
the NIH suddenly has information about
1183
00:44:21.408 --> 00:44:23.335
hundreds or thousands of patients,
1184
00:44:23.335 --> 00:44:24.518
what they could do instead is they could
1185
00:44:24.518 --> 00:44:26.962
design a secure computation protocol
1186
00:44:26.962 --> 00:44:29.255
to carry out whatever study the NIH
1187
00:44:29.255 --> 00:44:31.272
is interested in conducting.
1188
00:44:31.272 --> 00:44:32.376
They would replace that data transfer with an
1189
00:44:32.376 --> 00:44:34.501
interactive protocol, in this case,
1190
00:44:34.501 --> 00:44:37.017
between the two hospitals and NIH,
1191
00:44:37.017 --> 00:44:40.223
that would allow NIH to compute,
1192
00:44:40.223 --> 00:44:42.397
to run the study and generate the report,
1193
00:44:42.397 --> 00:44:44.599
without ever having their hands on any data.
1194
00:44:44.599 --> 00:44:45.759
In particular, again, just like in the
1195
00:44:45.759 --> 00:44:48.733
ideal world, giving the guarantees that
1196
00:44:48.733 --> 00:44:50.836
the only thing the NIH learns is what's generated
1197
00:44:50.836 --> 00:44:53.503
in the report, and nothing else.
1198
00:44:55.497 --> 00:44:58.663
Now, we said that these protocols exist.
1199
00:44:58.663 --> 00:44:59.851
What can we say about efficiency?
1200
00:44:59.851 --> 00:45:01.067
Do we have any hope of running them
1201
00:45:01.067 --> 00:45:02.900
or using them in practice?
1202
00:45:02.900 --> 00:45:04.703
Now, secure computation, the general idea,
1203
00:45:04.703 --> 00:45:06.249
was introduced in the 1980s.
1204
00:45:06.249 --> 00:45:07.508
Like I said, there was a long line
1205
00:45:07.508 --> 00:45:08.709
of research following up on that,
1206
00:45:08.709 --> 00:45:10.282
continuing until today,
1207
00:45:10.282 --> 00:45:13.532
developing better and better protocols.
1208
00:45:14.408 --> 00:45:16.234
I wasn't around or I wasn't active
1209
00:45:16.234 --> 00:45:18.216
at the time this work was going on,
1210
00:45:18.216 --> 00:45:20.651
but I think if you read the papers at the time,
1211
00:45:20.651 --> 00:45:22.713
you get the sense that in the '80s and '90s,
1212
00:45:22.713 --> 00:45:24.524
people viewed this as being very interesting
1213
00:45:24.524 --> 00:45:26.530
from a theoretical point of view,
1214
00:45:26.530 --> 00:45:28.989
but as being hopelessly impractical.
1215
00:45:28.989 --> 00:45:31.867
In fact, it wasn't until 2004 that there
1216
00:45:31.867 --> 00:45:34.868
was the first implementation,
1217
00:45:34.868 --> 00:45:36.575
of generic two-party computation,
1218
00:45:36.575 --> 00:45:38.195
even in a very simple case where we assume
1219
00:45:38.195 --> 00:45:40.015
the parties are honest, but only try to learn
1220
00:45:40.015 --> 00:45:43.485
information after the fact from the interaction.
1221
00:45:43.485 --> 00:45:46.520
Even that implementation was quite inefficient,
1222
00:45:46.520 --> 00:45:49.201
but the main point, and what resulted in a paper,
1223
00:45:49.201 --> 00:45:51.204
and quite a nice paper, it's just the fact
1224
00:45:51.204 --> 00:45:53.917
that they're able to do it at all.
1225
00:45:53.917 --> 00:45:56.209
But over time, of course, progress in this area
1226
00:45:56.209 --> 00:45:57.376
has continued.
1227
00:45:58.462 --> 00:46:00.251
Some of the work done by myself,
1228
00:46:00.251 --> 00:46:02.434
and in collaboration with students and postdocs,
1229
00:46:02.434 --> 00:46:03.817
has helped to push the boundary
1230
00:46:03.817 --> 00:46:05.555
and make things more efficient
1231
00:46:05.555 --> 00:46:07.868
beyond what they previously had been.
1232
00:46:07.868 --> 00:46:09.284
If you look, there's been steady progress
1233
00:46:09.284 --> 00:46:10.306
over the years.
1234
00:46:10.306 --> 00:46:12.679
The Fairplay paper was that one in 2004
1235
00:46:12.679 --> 00:46:14.176
that I mentioned, and there's some work
1236
00:46:14.176 --> 00:46:16.321
in 2009 and 2010.
1237
00:46:16.321 --> 00:46:18.157
Showing on the left, basically, the performance,
1238
00:46:18.157 --> 00:46:20.619
in terms of the number of,
1239
00:46:20.619 --> 00:46:22.935
well, the number of gates of a circuit that
1240
00:46:22.935 --> 00:46:26.090
you can compute in a certain period of time.
1241
00:46:26.090 --> 00:46:27.435
Essentially, the one on the right,
1242
00:46:27.435 --> 00:46:28.681
you think how large of a program,
1243
00:46:28.681 --> 00:46:31.813
how large of a circuit, they were able to compute.
1244
00:46:31.813 --> 00:46:33.446
We had some ideas in 2011 that led
1245
00:46:33.446 --> 00:46:35.598
to a dramatic improvement, actually,
1246
00:46:35.598 --> 00:46:37.696
in the performance and scalability,
1247
00:46:37.696 --> 00:46:40.362
and I hope maybe gave people the impression
1248
00:46:40.362 --> 00:46:41.885
that this thing actually had a possibility
1249
00:46:41.885 --> 00:46:44.573
of being more practical, and encouraged some followup,
1250
00:46:44.573 --> 00:46:46.512
a lot of followup work, actually,
1251
00:46:46.512 --> 00:46:48.180
pushing this even further.
1252
00:46:48.180 --> 00:46:49.773
Now, this is in the semi-honest setting
1253
00:46:49.773 --> 00:46:51.712
where, as I said before, this assumes
1254
00:46:51.712 --> 00:46:53.865
that the parties are following the protocol honestly,
1255
00:46:53.865 --> 00:46:55.613
they're not deviating at all.
1256
00:46:55.613 --> 00:46:56.982
All they're trying to do is potentially
1257
00:46:56.982 --> 00:46:59.542
learn information that they shouldn't be learning
1258
00:46:59.542 --> 00:47:01.666
from the execution of the protocol.
1259
00:47:01.666 --> 00:47:04.822
Work continues on developing protocols
1260
00:47:04.822 --> 00:47:07.281
with stronger guarantees, as well.
1261
00:47:07.281 --> 00:47:09.023
The model we'd actually really like to be using
1262
00:47:09.023 --> 00:47:12.154
in practice is the so-called malicious model,
1263
00:47:12.154 --> 00:47:13.711
where we make no assumptions about how
1264
00:47:13.711 --> 00:47:15.102
the parties are gonna be behaving during
1265
00:47:15.102 --> 00:47:16.737
the protocol, and we make no assumption
1266
00:47:16.737 --> 00:47:18.970
that they're gonna be following the protocol honestly.
1267
00:47:18.970 --> 00:47:20.927
They can deviate arbitrarily from what we
1268
00:47:20.927 --> 00:47:22.749
tell them to do.
1269
00:47:22.749 --> 00:47:24.623
You can see here, a steady march,
1270
00:47:24.623 --> 00:47:26.890
this is on a logarithmic scale, actually,
1271
00:47:26.890 --> 00:47:28.679
of the time required to perform
1272
00:47:28.679 --> 00:47:30.048
an AES computation.
1273
00:47:30.048 --> 00:47:33.107
AES is a particular cryptographic block cipher.
1274
00:47:33.107 --> 00:47:36.792
This is a secure two-party computation of AES,
1275
00:47:36.792 --> 00:47:38.711
going from 2009.
1276
00:47:38.711 --> 00:47:41.028
The last two works are by myself
1277
00:47:41.028 --> 00:47:44.649
and students of mine, postdocs of mine,
1278
00:47:44.649 --> 00:47:47.316
over the last year, showing really dramatic improvements
1279
00:47:47.316 --> 00:47:49.261
in the running time over the course of,
1280
00:47:49.261 --> 00:47:51.516
I guess, almost not quite 10 years.
1281
00:47:51.516 --> 00:47:53.176
It's really an improvement that could have
1282
00:47:53.176 --> 00:47:55.959
a significant impact in practice because
1283
00:47:55.959 --> 00:47:57.401
it goes from something, the computation
1284
00:47:57.401 --> 00:48:01.021
took about 18 minutes (clears throat) in 2009,
1285
00:48:01.021 --> 00:48:03.590
whereas today it takes about 37 milliseconds.
1286
00:48:03.590 --> 00:48:05.621
This is a really dramatic improvement.
1287
00:48:05.621 --> 00:48:07.202
Again, people are now really seriously
1288
00:48:07.202 --> 00:48:08.954
starting to think about applications of this
1289
00:48:08.954 --> 00:48:10.454
in the real world.
1290
00:48:11.511 --> 00:48:13.416
I'll just mention that we have also some work
1291
00:48:13.416 --> 00:48:16.203
extending this to the multiparty setting,
1292
00:48:16.203 --> 00:48:19.241
running with hundreds of parties all over the world,
1293
00:48:19.241 --> 00:48:21.811
and essentially running a secure computation
1294
00:48:21.811 --> 00:48:25.188
of AES, like we had before, among 100-plus parties
1295
00:48:25.188 --> 00:48:30.185
across five continents in a time of about two seconds.
1296
00:48:30.185 --> 00:48:31.232
I guess that'll be improved,
1297
00:48:31.232 --> 00:48:32.973
maybe Xiao has already improved it.
1298
00:48:32.973 --> 00:48:34.582
It'll be improved, maybe, I'm sure,
1299
00:48:34.582 --> 00:48:37.741
in the next couple of years. (clears throat)
1300
00:48:37.741 --> 00:48:39.212
Now, I wanna talk briefly, I wanna come back
1301
00:48:39.212 --> 00:48:42.709
to this issue of privacy of data use.
1302
00:48:42.709 --> 00:48:45.039
We've been talking a lot about
1303
00:48:45.039 --> 00:48:47.498
how secure computation and how these protocols
1304
00:48:47.498 --> 00:48:50.870
can reduce the need to collect personal data.
1305
00:48:50.870 --> 00:48:52.743
Basically, because the protocols allow you
1306
00:48:52.743 --> 00:48:54.913
to do computations over private data
1307
00:48:54.913 --> 00:48:56.704
without the need to centrally collect
1308
00:48:56.704 --> 00:49:00.246
all the data, to collect that data in any one place.
1309
00:49:00.246 --> 00:49:01.521
But we still need to reason about
1310
00:49:01.521 --> 00:49:03.182
the computation itself.
1311
00:49:03.182 --> 00:49:04.467
I said earlier that the guarantee
1312
00:49:04.467 --> 00:49:07.166
provided by secure computation is that
1313
00:49:07.166 --> 00:49:09.962
the execution of the protocol in the real world
1314
00:49:09.962 --> 00:49:12.137
is as secure as the execution of the protocol
1315
00:49:12.137 --> 00:49:13.951
in the ideal world.
1316
00:49:13.951 --> 00:49:16.529
But the ideal world does leak something.
1317
00:49:16.529 --> 00:49:18.417
The ideal world does leak the output of this function
1318
00:49:18.417 --> 00:49:20.501
over everybody's input.
1319
00:49:20.501 --> 00:49:22.074
Depending on the function, that may actually
1320
00:49:22.074 --> 00:49:23.899
reveal quite a bit of information
1321
00:49:23.899 --> 00:49:26.658
about an individual party's input.
1322
00:49:26.658 --> 00:49:28.194
We'll come back to this example again
1323
00:49:28.194 --> 00:49:30.053
of the medical researchers.
1324
00:49:30.053 --> 00:49:31.433
Even if we're doing everything
1325
00:49:31.433 --> 00:49:33.415
using a secure protocol, and so the data
1326
00:49:33.415 --> 00:49:35.166
itself is never sitting in one location
1327
00:49:35.166 --> 00:49:37.639
at any point in time, the problem
1328
00:49:37.639 --> 00:49:39.931
or the concern is that the report itself
1329
00:49:39.931 --> 00:49:44.098
may leak information about particular people's information.
1330
00:49:45.243 --> 00:49:46.779
There are some techniques that have
1331
00:49:46.779 --> 00:49:48.313
been introduced over the past decade or so
1332
00:49:48.313 --> 00:49:50.152
to try to address this.
1333
00:49:50.152 --> 00:49:52.998
I wanna focus on one called differential privacy.
1334
00:49:52.998 --> 00:49:54.137
This was not introduced by me.
1335
00:49:54.137 --> 00:49:56.716
This was by Dwork, McSherry, Nissim
1336
00:49:56.716 --> 00:49:58.769
and Smith in 2006.
1337
00:49:58.769 --> 00:50:01.187
They basically came up with a definition,
1338
00:50:01.187 --> 00:50:02.998
which is again illustrating both
1339
00:50:02.998 --> 00:50:04.172
part of the difficulty of coming up
1340
00:50:04.172 --> 00:50:05.864
with these definitions, that it took until
1341
00:50:05.864 --> 00:50:08.668
2006 for people to formulate something,
1342
00:50:08.668 --> 00:50:09.967
but also the utility of having these
1343
00:50:09.967 --> 00:50:12.137
definitions in place once you've gone through
1344
00:50:12.137 --> 00:50:13.997
the work of developing them.
1345
00:50:13.997 --> 00:50:15.763
Basically, the idea they were trying to capture
1346
00:50:15.763 --> 00:50:20.023
is that a particular computation is private
1347
00:50:20.023 --> 00:50:21.510
if the output of the computation is not
1348
00:50:21.510 --> 00:50:24.787
too sensitive to any single user's data.
1349
00:50:24.787 --> 00:50:26.118
A little bit more formally, what they did
1350
00:50:26.118 --> 00:50:28.974
was they would compare what you would get
1351
00:50:28.974 --> 00:50:31.917
by computing the function over the original dataset.
1352
00:50:31.917 --> 00:50:34.125
Here I have a table with some medical data,
1353
00:50:34.125 --> 00:50:36.539
perhaps, of six different patients.
1354
00:50:36.539 --> 00:50:38.688
If we run some computation over that data
1355
00:50:38.688 --> 00:50:41.137
and learn something from it and publish it,
1356
00:50:41.137 --> 00:50:43.282
what we're gonna do is we're gonna compare that
1357
00:50:43.282 --> 00:50:45.598
to what we would get if we eliminated
1358
00:50:45.598 --> 00:50:47.540
one user's data, and then ran the same
1359
00:50:47.540 --> 00:50:49.653
computation over that.
1360
00:50:49.653 --> 00:50:51.983
Only if those two outputs are close
1361
00:50:51.983 --> 00:50:53.818
in a particularly defined way
1362
00:50:53.818 --> 00:50:55.678
will we say that the computation is private.
1363
00:50:55.678 --> 00:50:57.970
'Cause then, by definition, it'll mean that
1364
00:50:57.970 --> 00:50:59.952
the computation was not too sensitive
1365
00:50:59.952 --> 00:51:02.988
to the input of any particular user.
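NOTE
[Editor's note] The "particularly defined way" has a precise form. In the Dwork-McSherry-Nissim-Smith definition, a randomized mechanism M is \varepsilon-differentially private if, for all datasets D and D' differing in one user's record, and for every set S of possible outputs,
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[M(D') \in S].
Smaller \varepsilon means the two output distributions are closer, i.e., the result is less sensitive to any one user.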
1366
00:51:02.988 --> 00:51:05.138
Now, differential privacy is quite,
1367
00:51:05.138 --> 00:51:07.214
has become quite a popular (mumbles).
1368
00:51:07.214 --> 00:51:09.279
It provides very strong guarantees.
1369
00:51:09.279 --> 00:51:11.265
The main drawback of differential privacy
1370
00:51:11.265 --> 00:51:12.889
is that it's nowadays viewed, perhaps,
1371
00:51:12.889 --> 00:51:14.556
as being too strong.
1372
00:51:15.757 --> 00:51:18.780
In particular, an example that highlights this
1373
00:51:18.780 --> 00:51:20.967
is that if Google wanted to publish
1374
00:51:20.967 --> 00:51:23.750
the number of visitors to its website
1375
00:51:23.750 --> 00:51:26.064
in a month, that would actually not
1376
00:51:26.064 --> 00:51:27.459
be differentially private.
1377
00:51:27.459 --> 00:51:29.373
It's kind of easy to see why.
1378
00:51:29.373 --> 00:51:31.911
If I take one user out of that sample,
1379
00:51:31.911 --> 00:51:33.659
that will shift, or if I take one search query
1380
00:51:33.659 --> 00:51:38.073
out of that sample, perhaps, that will shift
1381
00:51:38.073 --> 00:51:40.116
the output that I release by one.
1382
00:51:40.116 --> 00:51:41.760
That's a change, that's a noticeable change
1383
00:51:41.760 --> 00:51:44.247
in the output, depending on whether or not
1384
00:51:44.247 --> 00:51:46.518
you visited Google that month or not.
1385
00:51:46.518 --> 00:51:48.267
But on the other hand, that seems a little bit ridiculous.
1386
00:51:48.267 --> 00:51:49.942
I don't think anyone would claim this is really
1387
00:51:49.942 --> 00:51:52.464
a violation of privacy because there's no way
1388
00:51:52.464 --> 00:51:54.132
that an attacker would possibly be able
1389
00:51:54.132 --> 00:51:56.818
to know the search habits of everybody else
1390
00:51:56.818 --> 00:51:59.231
in the world over that period of time,
1391
00:51:59.231 --> 00:52:01.017
and thereby be able to learn anything about
1392
00:52:01.017 --> 00:52:02.230
whether or not you made a particular
1393
00:52:02.230 --> 00:52:04.463
search query or not.
1394
00:52:04.463 --> 00:52:07.402
Now, a little bit more rigorously,
1395
00:52:07.402 --> 00:52:09.468
a lot of people have been showing bounds
1396
00:52:09.468 --> 00:52:11.554
on the best achievable accuracy
1397
00:52:11.554 --> 00:52:14.302
that differentially private mechanisms can achieve.
1398
00:52:14.302 --> 00:52:15.658
Essentially, we know that while it provides
1399
00:52:15.658 --> 00:52:17.712
a very strong guarantee of privacy,
1400
00:52:17.712 --> 00:52:19.426
you take a big hit in the accuracy that
1401
00:52:19.426 --> 00:52:21.759
you're able to obtain in the
1402
00:52:22.644 --> 00:52:26.550
model, essentially, that you compute over the data.
1403
00:52:26.550 --> 00:52:30.617
In work in 2013, along with Adam Smith
1404
00:52:30.617 --> 00:52:33.726
and a student of mine, and a postdoc of mine,
1405
00:52:33.726 --> 00:52:37.017
Raef Bassily and Adam Groce, we looked at ways to relax
1406
00:52:37.017 --> 00:52:39.034
this notion of differential privacy
1407
00:52:39.034 --> 00:52:40.775
and come up with a more workable definition,
1408
00:52:40.775 --> 00:52:42.754
one that would allow for better protocols,
1409
00:52:42.754 --> 00:52:44.325
but that would still achieve a reasonable
1410
00:52:44.325 --> 00:52:46.485
measure of privacy.
1411
00:52:46.485 --> 00:52:48.105
We continue to have a definition to compare
1412
00:52:48.105 --> 00:52:50.542
the outputs of some mechanism with
1413
00:52:50.542 --> 00:52:52.677
and without some individual user's data.
1414
00:52:52.677 --> 00:52:54.418
The big difference is that we explicitly
1415
00:52:54.418 --> 00:52:55.835
took into account
1416
00:52:56.937 --> 00:52:58.981
an external observer's, or an attacker's,
1417
00:52:58.981 --> 00:53:02.318
if you like, uncertainty about the rest of the data.
1418
00:53:02.318 --> 00:53:04.808
In the Google example, that would mean
1419
00:53:04.808 --> 00:53:06.364
we explicitly take into account for
1420
00:53:06.364 --> 00:53:08.283
the definition the fact that it's unlikely
1421
00:53:08.283 --> 00:53:10.742
that any single observer will know
1422
00:53:10.742 --> 00:53:12.459
the exact statistics of who visited Google
1423
00:53:12.459 --> 00:53:13.908
that month or not.
1424
00:53:13.908 --> 00:53:15.385
We can kinda take that into account
1425
00:53:15.385 --> 00:53:17.725
when designing mechanisms while still ensuring
1426
00:53:17.725 --> 00:53:19.933
that they're private.
1427
00:53:19.933 --> 00:53:22.455
In particular, we showed also that our definition,
1428
00:53:22.455 --> 00:53:24.566
in addition to various properties that it had,
1429
00:53:24.566 --> 00:53:26.544
allows, actually, for noiseless mechanisms.
1430
00:53:26.544 --> 00:53:29.414
This is important because differential privacy
1431
00:53:29.414 --> 00:53:32.389
tends to add noise to whatever computation you're doing,
1432
00:53:32.389 --> 00:53:34.695
basically blurring the answer somewhat.
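NOTE
[Editor's note] A minimal sketch (not from the talk) of the classic noise-adding approach, the Laplace mechanism. For a counting query like the monthly-visitor example, removing one user changes the true answer by at most 1 (the sensitivity), so Laplace noise with scale 1/epsilon suffices for epsilon-differential privacy. Names and data are illustrative.
import math, random
def laplace_noise(scale):
    # sample Laplace(0, scale) via the inverse-CDF method
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(max(1 - 2 * abs(u), 1e-300))
def private_count(records, predicate, epsilon):
    # a counting query has sensitivity 1, so noise of scale 1/epsilon is enough
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)
visits = ["alice", "bob", "carol"]  # toy stand-in for a month of visitor logs
print(private_count(visits, lambda r: True, epsilon=0.5))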
1433
00:53:34.695 --> 00:53:36.329
But we showed that with respect to our definition,
1434
00:53:36.329 --> 00:53:38.433
you could actually not give up anything
1435
00:53:38.433 --> 00:53:40.857
in terms of accuracy and not add any noise at all,
1436
00:53:40.857 --> 00:53:42.236
while still achieving a reasonable notion
1437
00:53:42.236 --> 00:53:45.322
of privacy in some settings.
1438
00:53:45.322 --> 00:53:47.098
I wanted to just conclude in the last
1439
00:53:47.098 --> 00:53:50.014
five minutes or so by talking about
1440
00:53:50.014 --> 00:53:54.106
the real-world impact of some of this work.
1441
00:53:54.106 --> 00:53:55.642
I should be clear, I'm not talking directly
1442
00:53:55.642 --> 00:53:57.525
about my own work, I'm talking more broadly
1443
00:53:57.525 --> 00:54:00.083
about the field of privacy-preserving computation
1444
00:54:00.083 --> 00:54:01.389
as a whole.
1445
00:54:01.389 --> 00:54:03.862
It's been really amazingly gratifying to me
1446
00:54:03.862 --> 00:54:06.314
to see something that I've been working on
1447
00:54:06.314 --> 00:54:09.543
for 15 years or so now just staring
1448
00:54:09.543 --> 00:54:12.350
to make its way into the consciousness
1449
00:54:12.350 --> 00:54:15.527
of the public, as well as real-world applications.
1450
00:54:15.527 --> 00:54:17.139
I can mention, for starters,
1451
00:54:17.139 --> 00:54:19.462
that there's a lot of interest nowadays in this,
1452
00:54:19.462 --> 00:54:21.444
over the last five years or so, in particular,
1453
00:54:21.444 --> 00:54:23.234
from different funding agencies.
1454
00:54:23.234 --> 00:54:26.101
DARPA and IARPA have very large programs
1455
00:54:26.101 --> 00:54:28.789
dedicated to privacy-preserving computation.
1456
00:54:28.789 --> 00:54:30.194
The Air Force has looked into using
1457
00:54:30.194 --> 00:54:33.379
secure computation protocols for
1458
00:54:33.379 --> 00:54:34.974
the computations that they run
1459
00:54:34.974 --> 00:54:36.724
with other countries.
1460
00:54:37.678 --> 00:54:39.535
They're actually interested in potentially,
1461
00:54:39.535 --> 00:54:40.991
or at least exploring the possibility,
1462
00:54:40.991 --> 00:54:45.188
of using these protocols for real-world application.
1463
00:54:45.188 --> 00:54:47.577
I've also been interacting
1464
00:54:47.577 --> 00:54:50.349
with people at the Department of Treasury
1465
00:54:50.349 --> 00:54:52.111
and the Office of Financial Research
1466
00:54:52.111 --> 00:54:53.552
who are also interested in trying to use
1467
00:54:53.552 --> 00:54:56.587
these ideas to perform better
1468
00:54:56.587 --> 00:54:58.975
or privacy-preserving computation of various
1469
00:54:58.975 --> 00:55:00.478
economic indicators.
1470
00:55:00.478 --> 00:55:02.387
They're very concerned about privacy
1471
00:55:02.387 --> 00:55:05.135
of the people that they're studying.
1472
00:55:05.135 --> 00:55:06.979
They're concerned both about individual's privacy,
1473
00:55:06.979 --> 00:55:08.277
and also the businesses and companies
1474
00:55:08.277 --> 00:55:09.861
that they interact with are concerned
1475
00:55:09.861 --> 00:55:11.544
about their own privacy, and not giving
1476
00:55:11.544 --> 00:55:13.654
a competitive edge
1477
00:55:13.654 --> 00:55:16.127
to the other people working in the sector.
1478
00:55:16.127 --> 00:55:17.463
The Department of Treasury was interested
1479
00:55:17.463 --> 00:55:20.007
in exploring the application of these techniques
1480
00:55:20.007 --> 00:55:21.578
to some of their problems that will allow them
1481
00:55:21.578 --> 00:55:24.170
to get better accuracy
1482
00:55:24.170 --> 00:55:26.931
with the studies that they're talking about.
1483
00:55:26.931 --> 00:55:29.763
What was really interesting was a
1484
00:55:29.763 --> 00:55:32.059
statement that came out of Senator Ron Wyden's office
1485
00:55:32.059 --> 00:55:34.284
in May of this year.
1486
00:55:34.284 --> 00:55:36.743
It was really like music to my ears.
1487
00:55:36.743 --> 00:55:39.373
He had in his statement
1488
00:55:39.373 --> 00:55:41.540
an explicit call for using
1489
00:55:42.386 --> 00:55:44.803
privacy-preserving algorithms
1490
00:55:46.376 --> 00:55:47.652
in evidence-based policymaking.
1491
00:55:47.652 --> 00:55:49.405
He says, "I strongly urge the commission,"
1492
00:55:49.405 --> 00:55:52.386
this is a commission on evidence-based policymaking,
1493
00:55:52.386 --> 00:55:55.099
"to recommend that privacy-enhancing technologies,
1494
00:55:55.099 --> 00:55:56.875
"such as secure multiparty computation
1495
00:55:56.875 --> 00:55:58.422
"and differential privacy, must be utilized
1496
00:55:58.422 --> 00:56:00.755
"by agencies and organizations that seek
1497
00:56:00.755 --> 00:56:03.180
"to draw public policy-related insights
1498
00:56:03.180 --> 00:56:05.558
"from the private data of Americans."
1499
00:56:05.558 --> 00:56:06.878
When you have a senator talking about
1500
00:56:06.878 --> 00:56:09.498
multiparty computations and differential privacy,
1501
00:56:09.498 --> 00:56:11.079
it's a little bit scary.
1502
00:56:11.079 --> 00:56:12.813
I'm not sure who put that language in there.
1503
00:56:12.813 --> 00:56:14.781
But anyway, it's really gratifying to see that,
1504
00:56:14.781 --> 00:56:17.610
in his mind, number one, that these technologies
1505
00:56:17.610 --> 00:56:19.375
are available, and number two, that they should
1506
00:56:19.375 --> 00:56:22.958
be used, if possible, by these commissions.
1507
00:56:23.959 --> 00:56:25.264
In the last few years, we've also seen
1508
00:56:25.264 --> 00:56:28.651
a number of startup companies in this space.
1509
00:56:28.651 --> 00:56:30.834
I'll just highlight, well, some of them
1510
00:56:30.834 --> 00:56:32.141
are startups and some of them are beyond
1511
00:56:32.141 --> 00:56:33.617
startups, maybe, at this point.
1512
00:56:33.617 --> 00:56:35.572
I'll just mention Sharemind in particular
1513
00:56:35.572 --> 00:56:37.491
has been around for a little while.
1514
00:56:37.491 --> 00:56:40.345
They're doing a three-party computation.
1515
00:56:40.345 --> 00:56:41.809
They're based in Estonia.
1516
00:56:41.809 --> 00:56:43.130
They've been using it over the last couple
1517
00:56:43.130 --> 00:56:45.603
of years to actually do statistical analysis
1518
00:56:45.603 --> 00:56:48.300
of financial data for the Estonian government.
1519
00:56:48.300 --> 00:56:51.339
They're using this on real problems
1520
00:56:51.339 --> 00:56:53.631
with reasonable performance, as well.
1521
00:56:53.631 --> 00:56:56.184
I'll highlight also the fact that Google
1522
00:56:56.184 --> 00:56:57.577
has recently begun using a form
1523
00:56:57.577 --> 00:57:01.492
of two-party computation in order to do ad tracking.
1524
00:57:01.492 --> 00:57:04.350
Google interacts basically with various
1525
00:57:04.350 --> 00:57:06.499
advertising companies in order to allow
1526
00:57:06.499 --> 00:57:08.648
the advertising company to compute how many
1527
00:57:08.648 --> 00:57:10.582
visitors or how many successful conversions
1528
00:57:10.582 --> 00:57:13.499
their ads had on Google's platform.
1529
00:57:14.458 --> 00:57:16.495
Differential privacy has also made its way
1530
00:57:16.495 --> 00:57:18.031
into many devices.
1531
00:57:18.031 --> 00:57:20.901
Actually, Apple has a differentially private mechanism
1532
00:57:20.901 --> 00:57:23.625
built into the latest phones.
1533
00:57:23.625 --> 00:57:26.182
Google has a differentially private mechanism
1534
00:57:26.182 --> 00:57:29.407
built into its Chrome browser to collect statistics
1535
00:57:29.407 --> 00:57:31.574
from the users using their browser.
1536
00:57:31.574 --> 00:57:33.344
In general, I will just say that these
1537
00:57:33.344 --> 00:57:35.182
are really exciting times in the field.
1538
00:57:35.182 --> 00:57:36.500
We're seeing more and more interest,
1539
00:57:36.500 --> 00:57:38.976
and protocols are getting much, much better.
1540
00:57:38.976 --> 00:57:41.125
Really, I hope that we'll continue to see
1541
00:57:41.125 --> 00:57:42.875
this increase in the,
1542
00:57:43.840 --> 00:57:46.665
in people's knowledge about these technologies
1543
00:57:46.665 --> 00:57:49.328
and also applying them in different contexts.
1544
00:57:49.328 --> 00:57:51.188
With that, I will end, and thank you
1545
00:57:51.188 --> 00:57:52.139
for your attention.
1546
00:57:52.139 --> 00:57:54.897
I'm happy to stay and take questions.
1547
00:57:54.897 --> 00:57:56.011
As far as I'm concerned, we can go beyond
1548
00:57:56.011 --> 00:57:59.368
the five o'clock limit, but I'm happy (coughing).
1549
00:57:59.368 --> 00:58:01.618
(applause)
1550
00:58:08.770 --> 00:58:09.727
Thank you, Jonathan.
1551
00:58:09.727 --> 00:58:11.113
We just quickly have time for questions
1552
00:58:11.113 --> 00:58:14.196
if people have questions or comments.
1553
00:58:17.882 --> 00:58:18.805
If not--
1554
00:58:18.805 --> 00:58:19.638
(crosstalk)
1555
00:58:19.638 --> 00:58:20.971
Please go ahead.
1556
00:58:22.281 --> 00:58:26.448
How real is a multiparty computation protocol?
1557
00:58:27.398 --> 00:58:28.747
How will you implement it?
1558
00:58:28.747 --> 00:58:30.627
(speaks too low to hear)
1559
00:58:30.627 --> 00:58:32.978
You should come talk to me.
1560
00:58:32.978 --> 00:58:34.388
One of the things that's also been great
1561
00:58:34.388 --> 00:58:35.656
is in addition to exploring the
1562
00:58:35.656 --> 00:58:37.865
cryptographic protocols themselves,
1563
00:58:37.865 --> 00:58:39.583
there's also been a lotta focus on the usability
1564
00:58:39.583 --> 00:58:41.310
of these techniques, and making them
1565
00:58:41.310 --> 00:58:43.584
available to non-experts.
1566
00:58:43.584 --> 00:58:45.793
We, (speaks too low to hear) for example, have a
1567
00:58:45.793 --> 00:58:47.744
library that's built on GitHub.
1568
00:58:47.744 --> 00:58:51.228
You can download (speaks too low to hear) computation.
1569
00:58:51.228 --> 00:58:52.637
There's some effort involved
1570
00:58:52.637 --> 00:58:55.268
in writing the program, but it's not beyond
1571
00:58:55.268 --> 00:58:56.232
whatever it would be for writing
1572
00:58:56.232 --> 00:58:57.695
any other program.
1573
00:58:57.695 --> 00:59:02.559
It wouldn't be (speaks too low to hear).
1574
00:59:02.559 --> 00:59:04.048
Question.
1575
00:59:04.048 --> 00:59:05.708
Just to be clear, when you
1576
00:59:05.708 --> 00:59:08.779
started talking about real-world applications,
1577
00:59:08.779 --> 00:59:11.116
so only just recently has
1578
00:59:11.116 --> 00:59:14.616
(speaks too low to hear),
1579
00:59:19.667 --> 00:59:20.946
Yeah, I mean, it depends a little bit
1580
00:59:20.946 --> 00:59:22.208
what you mean by recently.
1581
00:59:22.208 --> 00:59:24.050
Some of the companies that I had mentioned
1582
00:59:24.050 --> 00:59:25.885
go back almost 10 years,
1583
00:59:25.885 --> 00:59:28.708
but it's increasing over the last few years.
1584
00:59:28.708 --> 00:59:31.314
First there was one example,
1585
00:59:31.314 --> 00:59:32.954
then there was another example.
1586
00:59:32.954 --> 00:59:37.121
Over the last year or so, there are now several examples.
1587
00:59:38.054 --> 00:59:40.048
We also have a reception outside
1588
00:59:40.048 --> 00:59:42.146
if you can stick around and have some
1589
00:59:42.146 --> 00:59:44.808
food and drink (speaks too low to hear).
1590
00:59:44.808 --> 00:59:45.827
Thank you so much for coming.
1591
00:59:45.827 --> 00:59:47.757
(applause)
1592
00:59:47.757 --> 00:59:50.507
(relaxing music)