Hi there! I'm Damien. I have no idea how people usually
start blogs, so bear with me while I figure this out.

I see this place as a way to publish things that are too long for Twitter, too
opinionated for Wikipedia, and not pretentious enough for Medium. I'm going to
try and keep it to three themes: privacy, research, and privacy research.
I'm not (yet) a specialist in any of these. Hopefully, thanks to my job and
personal interest in those topics, I can add something valuable to what's
written online about them.

What follows is my vision of these three themes. It should give an idea of what
I intend to talk about on this blog =)

Privacy

It's difficult to define what privacy encompasses. It's easier to realize when
you don't have enough privacy — through bad surprises, uneasy feelings of
creepiness, or real risks to your safety.

When a parent or a partner installs stealthy software on your phone to spy on
your texts and calls, that's an invasion of your privacy. When a company sells
your name, address and purchase history to some sketchy third-party that sends
you targeted ads, the uneasy feeling you get comes from a lack of privacy.
Full-body scanners in certain airports are an attack on one's bodily privacy.
Data leaks are a risk to users' privacy.

Privacy issues usually come from a lack of transparency, of control, or
both. In an ideal world, everybody would know exactly who has access to which
data about them and why. Personal data collection would not happen without
informed consent, and people would have a right to access, modify and delete
data that other people or organizations hold about them.

The fuzziness and complexity of the issues in this space are part of what I find
interesting about them. I have done many privacy reviews for Google products,
and there is always something interesting and new with each of them.
Would users expect this behavior? Is this deletion action clear enough? Could
someone re-identify this aggregated data?

Like security, privacy is of particular importance for marginalized communities.
Having your phone number leaked online is much more problematic if you're a
high-profile political activist, or a closeted LGBTQ+ blogger. Harassment of
folks who belong to minorities is a major problem, and badly designed sharing
interfaces or insufficient anti-abuse tools can lead to dramatic consequences.
Designing tools that deal with potentially sensitive data, and failing to
consider these specific risks, is highly irresponsible. And you can easily guess
what I think of compliance-based privacy programs…

I also try to avoid absolutist viewpoints. They are hardly ever constructive,
and they are often dangerous. I know people who refuse to use Signal because
it's not available without Google Play Services, while continuing to communicate
via cleartext SMS messages. For most practical problems, there is no perfect
solution. Focusing on defending against a hypothetical all-powerful targeted
attacker is usually pointless. Instead, I try to focus on realistic threat
models, usable tools, and risk mitigation.

Research

I started a part-time PhD after two years of software engineering at Google.

To solve an engineering problem, the path is quite straightforward. Grasp the
scope of the problem, design a solution, validate the design with coworkers and
stakeholders, write code, verify that the solution is "good enough", then
productionize it. Once the problem has disappeared, there's no time to think
about it any further: there are other problems to solve, other fires to put out.

The whole process is fun and rewarding, but I'm frustrated by the ending. What
if we could design a simpler or more efficient solution? Prove that it works in
a wider range of situations? Share the idea behind it with more people, and see
whether they get inspired and solve other problems? Doing all of this is not
immediately rewarding, but I think it can have a deeper and longer-lasting
impact than core engineering work.

I optimistically think that academia is the place to do that. Compare the
solution to what's out there already, run more experiments, write proofs,
figure out what additional impact it could have. Share the results with as many
people as possible. It might not be worth the time, but I think it's worthwhile
to give it a try. There are certainly interesting things to learn along the way.

The one thing that I'm afraid of is spending time solving the wrong problems.
Finding a "good problem" is not easy: a good problem must be difficult enough
that it hasn't already been solved, but simple enough that you have a chance of
tackling it.
Identifying practical problems and their precise constraints is also hard when
the main source of inspiration is other academics' work.

I'm frustrated about the lack of incentives to do research work as a software
engineer, but the incentives of academia are even more broken. Publication
metrics are a bad way to estimate one's impact, especially in the short term.
The peer review process is terribly implemented in practice. The whole system
makes it painfully slow to gather feedback, and the little feedback you get is
imprecise. The idea of having my work praised only to realize much later that it
didn't make a difference in practice… It's even scarier to me than the idea of
not finding joy and impact in my research, and deciding to quit.

But I'm not exactly pessimistic :D I feel lucky and enthusiastic about this
part-time project. Continuing to do engineering work for Google gives me an
endless input of complicated real-world problems to tackle, many of which seem
to be good candidates for research projects. I am surrounded by impressively
smart and passionate coworkers on both sides, whose feedback is invaluable. And
I don't feel extremely attached to the idea of having an academic career or even
getting the title at the end of my PhD, so I don't really feel the pressure to
publish everything and anything just to increment some counters.

All in all, this sounds like a fun and challenging adventure. I'm excited to see
what I'll learn along the way!

Privacy research

My research, like my engineering job at Google, will focus on privacy. This is a
field whose boundaries are not very well-defined, and that has very distinct
sub-fields. Some researchers focus on user research to understand the
perceptions of real people with regard to their personal data (there are a bunch
of them at Google). Very little math is involved. Some are designing algorithms
that have provable privacy-related properties, like private set intersection or
differentially private surveys. Lots of math there! ^^ Some study the problem of
anonymizing (or de-identifying) a dataset, so it can be used by more people
or shared with third parties. Some focus on onion routing, on online tracking,
on cryptocurrency, on privacy policies, on genetic privacy, on social networks,
and the list is far from exhaustive. So… what am I doing exactly?
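To give a small taste of the mathy side, here's a toy sketch of randomized
response, the classic trick behind differentially private surveys: each
respondent randomizes their own answer before giving it, which grants them
plausible deniability, yet the aggregate statistic stays recoverable. This is
an illustration I wrote for this post, not code from any particular library.

```python
import random

def randomized_response(truth: bool) -> bool:
    """Answer a sensitive yes/no question with plausible deniability:
    flip a fair coin; on heads, answer truthfully; on tails, flip
    again and answer yes or no uniformly at random."""
    if random.random() < 0.5:
        return truth
    return random.random() < 0.5

def estimate_true_rate(answers: list[bool]) -> float:
    """Recover an unbiased estimate of the true "yes" rate.
    E[observed rate] = 0.5 * true_rate + 0.25, so invert that."""
    observed = sum(answers) / len(answers)
    return (observed - 0.25) / 0.5
```

No individual answer reveals much about its owner, but averaged over enough
respondents, the noise cancels out and the estimate gets close to the truth.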

My PhD project is about making it easier for data owners to understand and
protect the personal information contained in their databases. I see this goal
as having two main subcomponents.

Risk analysis. There are lots of organizations, companies or governments
which sit on large databases with personal information, and it's difficult
for them to realize how sensitive it is. Leaking your users' country of
origin is intuitively less of a problem than leaking their e-mail addresses,
which in turn is not as big a deal as leaking their credit card information.
Sadly, doing this type of inventory and risk analysis is currently pretty
difficult: it requires time, investment, and specific expertise. It
shouldn't have to be this way, so I'm working towards building tools that
make this easier.

Anonymization. Once you've realized how sensitive your data is, you hopefully
will want to take steps to protect it. There are many ways to lower the risk
of bad people having access to your database: encryption, access controls,
or many other security techniques. Another option is to modify the database,
in a way that makes sure that somebody with access to it can't deduce creepy
things about the individuals whose data is in the database. I'm working
towards making this process easier and more understandable for data owners.
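As a toy illustration of what "modifying the data" can look like, here's a
sketch of the Laplace mechanism, one standard building block for releasing
noisy counts from a database. The function names are mine, and a real
deployment would need far more care (budget tracking, floating-point issues,
and so on); this is just the core idea.

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Sample from a Laplace(0, scale) distribution via inverse
    transform sampling on a uniform draw in [-0.5, 0.5)."""
    u = random.random() - 0.5
    sign = 1.0 if u >= 0 else -1.0
    return -scale * sign * math.log(1 - 2 * abs(u))

def noisy_count(true_count: int, epsilon: float) -> float:
    """Release a count with epsilon-differential privacy: a count has
    sensitivity 1 (adding or removing one person changes it by at
    most 1), so adding Laplace noise of scale 1/epsilon suffices."""
    return true_count + laplace_noise(1.0 / epsilon)
```

Someone looking at the released count can no longer tell whether any single
individual is in the database, while the overall statistic stays about right.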

I could (and hopefully, I will!) talk at length about these two things. They
have already been studied by many people over the past ~15 years (especially
anonymization), but I think there is a lot of room for more popularization on
the topic, and significant improvements to be made on the research side. On the
anonymization topic in particular, I feel it is urgent to work towards bridging
the gap between research advances and concrete use cases.

Maybe I'll realize along the way that I'm looking at the wrong problems, or that
improving the state of the art is more difficult than I thought. But as I've
been told, that's part of what makes it challenging and fun ^^