student projects

The lab accepts motivated master’s and bachelor’s students to carry out their master’s thesis research or semester projects in the group. We provide a list of currently open project proposals. Please read through the proposals and contact us if you are interested.

proposed projects

Cloud-assisted Private Computation using Homomorphic Encryption

As part of our current research in the field of privacy-enhancing technologies and secure multi-party computation, the Laboratory for Data Security (LDS) is implementing a lattice-based cryptographic library in the Go language: Lattigo. This library implements the Brakerski-Fan-Vercauteren [1] and Cheon-Kim-Kim-Song [2] homomorphic encryption schemes, which enable computations to be performed on encrypted data without knowing the decryption key. Because these computations are more expensive than their plaintext counterparts, outsourcing them to a cloud provider can be desirable.

This project consists in developing a gRPC-based client-server application in which the server manages homomorphically encrypted datasets and performs computation on demand using the Lattigo library. The student will enhance the functionalities of our current implementation, by taking on a subset (according to the number of credits) of the following tasks:

enabling the distribution of the computation over multiple computing nodes.

augmenting the functionality provided to clients, notably by adding support for floating-point computation using the Lattigo implementation of the CKKS cryptosystem [2].

developing a deployment and testing solution to enable benchmarking of the application.

Student profile

Strong software engineering skills, knowledge of development best practices and tools

Supervisor

Secure Distributed Learning on Neural Networks

Machine learning has become ubiquitous thanks to the power of techniques such as neural networks to model complex functionalities and effectively perform classification tasks. Machine learning techniques and, in particular, deep learning based on complex neural networks, generally require massive amounts of data to produce an accurate model. However, collecting and sharing large amounts of data raises privacy and security concerns because of the sensitive nature of the data in many collaborative ML application domains such as finance, smart metering, biometrics, user-behavior analysis, life-tracking, and, especially, in health and in multi-site clinical research involving -omics data.

Within this landscape, and accounting for the privacy and scalability challenges inherent to the aforementioned scenarios, our aim is to design, implement, and evaluate a system to enable secure and privacy-preserving neural network training and prediction, while maintaining the utility of the data. We focus on a distributed setting to protect each data provider’s privacy and to avoid single points of failure.

In this project, we use homomorphic encryption and multiparty computation to implement the protocols for secure distributed neural networks, and we leverage the Lattigo library implemented by LDS, which enables quantum-resilient lattice-based cryptographic protocols in the Go language.

We are looking for student(s) to tackle the challenges posed by this project. The student(s) will work on a part of this system in collaboration with the advisors and other students. In agreement with the student, possible projects may be defined as:

Implementing a module of the system

Evaluating privacy-preserving neural networks

Comparing with baseline and state-of-the-art systems

Proposing and implementing improvements to the current design

Project type: Semester or Master project.

Student profile

Good programming skills, knowledge of Go is a big plus

Machine learning skills (especially in neural networks) are a plus

Familiarity with software development tools (e.g. Git)

Some background in security and privacy

Knowledge of homomorphic encryption and/or decentralized databases is a plus

Privacy-preserving Analysis and Processing on Distributed Databases

Statistical analyses and data processing (e.g., feature extraction, dimensionality reduction) are particularly difficult to perform on distributed databases when the data are sensitive and private, e.g., medical or financial data, and when the data owners do not trust each other. In the past few years, we have worked on decentralized data-sharing systems [1,2] that enable data retrieval and statistical analysis (including ML models) among multiple databases while protecting the confidentiality of the data. For this purpose, our solutions rely mainly on homomorphic encryption and multiparty protocols.

In this project, the student(s) will tackle the challenges posed by decentralized data processing and analysis. The student(s) will work on the design, implementation and evaluation of a new privacy-preserving solution relying on multiparty homomorphic encryption. This project is a collaboration between LDS and DEDIS, and will permit the student(s) to work with security and privacy researchers on very “hot” topics such as privacy-preserving data sharing and processing.

Type: Semester project and bachelor’s/master’s thesis

Required skills:

Good programming skills (knowledge of Go language is a plus)

Familiarity with development tools (e.g. Git) and at ease with reviewing code

Secure and Private Distributed Medical Image Analysis

Medical image analysis is undergoing important transformations with the introduction of machine learning. However, those techniques require a large amount of data in order to train machine learning models. In the context of medical imaging this aspect is particularly concerning for three reasons: (i) the availability of the data, (ii) the limited resources to label training data efficiently, and (iii) the high dimensionality of the data. Centralising data from multiple instances for training a model is an obvious choice but has shortcomings due to legal, ethical, and privacy considerations.

A particularly good example is tumour segmentation using brain MRI images or digital pathology whole slide images distributed across multiple institutions. As such data is expensive to acquire and label, institutions would benefit greatly from sharing them to build bigger and more diverse datasets. Thus, in this project, we aim at studying federated learning for medical images, providing a flexible framework that copes with multi-site data in a privacy-preserving manner.

In this project, the student(s) will take part in the design, implementation, and evaluation of a cutting-edge solution with real-world applications in the medical imaging field. The student(s) will get the opportunity to work with image processing and machine learning as well as cryptographic techniques such as multi-party computation and homomorphic encryption.

Required skills:

Good programming skills (Python and ML libraries such as Torch)

Familiarity with development tools (e.g. Git) and at ease with reviewing code

Some background in security and cryptography (e.g. COM402, CS523)

Knowledge of homomorphic encryption, secure multiparty computation is a plus

Solving the trade-off between security, privacy, and utility

In our interconnected world, personal data has become a priceless commodity. But dealing with personal information raises numerous concerns about users’ privacy. Several works and techniques have been developed over the last decades to ensure privacy guarantees without damaging utility. However, in those contexts, it becomes hard for a third party to ensure that the data has retained its value. By combining state-of-the-art cryptographic tools such as homomorphic encryption, multi-party computation, and zero-knowledge proofs, this project aims at solving the trade-off between privacy, authenticity, and utility.

Applications are numerous and include smart metering, social fitness tracking, insurance, and personalised health. For instance, a user wants to publish a certificate of distance achievement for her run on social networks without revealing her location and measurements. Similarly, she also wants to use this system to obtain pay-as-you-drive car insurance or accurate electricity billing without revealing when she uses the car or how she consumes electricity, as those can leak personal information. In all those use cases, there is also a strong incentive for the service provider to prevent cheating, as this would incur financial loss.

Overall, this project aims at producing a deployable system solving the aforementioned trade-off with modern solutions offering strong privacy and security guarantees. The student(s) will get the opportunity to implement and evaluate such solutions by working with state-of-the-art cryptography such as post-quantum signature schemes with low multiplicative complexity.

Required skills:

Good programming skills (knowledge of Go language is a plus).

Familiarity with development tools (e.g. Git) and at ease with reviewing code.

ongoing projects

Distributed Machine Learning and Databases

Statistical and machine-learning analyses require large amounts of data in order to produce meaningful results, and these data are often collected by multiple entities. In many domains such as medicine and user-behavior analysis, these data are personal and sensitive and cannot be shared due to privacy, ethical, and legal concerns. In this context, decentralized data-sharing systems [1,2] have become key enablers for big-data analysis while protecting individuals’ privacy by distributing the storage and the computation, thus avoiding single points of failure.

This distribution or decentralization of both data and computations can enable analysis on sensitive data, e.g. training of machine learning models on medical data to predict diseases or heart issues. However, the high sensitivity of the data creates multiple challenges, such as how to securely store the data in a decentralized manner and how to compute on these data while maintaining individuals’ privacy.

In this project, the student(s) will tackle the challenges posed by decentralized data storage and computations. The student(s) will work on the design, implementation and evaluation of a new solution for privacy-preserving machine learning and/or on a new system for a secure federated database. This project is a collaboration between LDS and DEDIS, and will permit the student(s) to work with security and privacy researchers on very “hot” topics such as privacy-preserving data sharing and machine learning.

Type: Semester project and bachelor’s/master’s thesis

Required skills:

Good programming skills (knowledge of Go language is a plus)

Familiarity with development tools (e.g. Git) and at ease with reviewing code

As part of our current research in the field of privacy-enhancing technologies and secure multi-party computation, the Laboratory for Data Security (LDS) is implementing a lattice-based cryptographic library in the Go language: Lattigo. This library implements the Brakerski-Fan-Vercauteren [1] and Cheon-Kim-Kim-Song [2] homomorphic encryption schemes, which enable computations to be performed on encrypted data without knowing the decryption key. Because these computations are more expensive than their plaintext counterparts, outsourcing them to a cloud provider can be desirable.

This project consists in building a client-server application in which the server manages homomorphically encrypted datasets and performs computation on demand using the Lattigo library. The two students will define the client-server interface and implement both parts of the application. They will first consider the case of a single data owner (i.e., client) and then introduce the ability to manage several clients providing inputs to a secure multiparty computation.

This project consists in analyzing the practicality of threshold variants of the BFV and CKKS schemes, in which the secret key is shared among the N parties using Shamir secret sharing [4]. This technique makes it possible to fix, at the key generation phase, the threshold number t of the N parties that have to collaborate in the decryption process. The student will notably:

Derive key generation and decryption procedures and analyze their complexity.

Analyze the effect on the noise growth of the resulting encryption scheme.

Supervisor

Profiling and Optimization of a Homomorphic Encryption Library

As part of our current research in the field of privacy-enhancing technologies and secure multi-party computation, the Laboratory for Data Security (LDS) is implementing a lattice-based cryptographic library in the Go language: Lattigo. This library implements the Brakerski-Fan-Vercauteren [1] and Cheon-Kim-Kim-Song [2] homomorphic encryption schemes, which enable computations to be performed on encrypted data without knowing the decryption key. Because these operations are more expensive than their plaintext counterparts, their implementation has to be carefully optimized.

This project consists in applying software profiling methods to various applications using the Lattigo library in order to identify and investigate its performance bottlenecks. In particular, the student will investigate the use of the library for large-scale datasets and computations. Based on this analysis, the student will contribute improvements to the Lattigo code-base.

Supervisor

Distributed Privacy-preserving Machine Learning

Statistical and machine-learning analyses require large amounts of data in order to produce meaningful results, and these data are often collected by multiple entities. In many domains such as medicine and user-behavior analysis, these data are personal and sensitive and cannot be shared due to privacy, ethical, and legal concerns. In this context, decentralized data-sharing systems [1,2] have become key enablers for big-data analysis while protecting individuals’ privacy by distributing the storage and the computation, thus avoiding single points of failure.

This distribution or decentralization of both data and computations can enable analysis on sensitive data, e.g. training of machine learning models on medical data to predict diseases or heart issues. However, the high sensitivity of the data creates multiple challenges, such as how to securely store the data in a decentralized manner and how to compute on these data while maintaining individuals’ privacy.

In this project, the student(s) will tackle these challenges by working on the design, implementation and evaluation of a new solution for privacy-preserving machine learning.

Supervisor

MedChain: Distributed Authentication and Authorization System for Medical Queries

Supervisor

Traffic-Analysis of Wearable Devices

In the era of personalized health, people constantly track their overall wellbeing status through wearable devices (e.g., smart watches, fitness trackers) that are able to measure vital signs such as their blood pressure or heart rate and monitor various aspects of their daily lives such as stress levels and quality of sleep. Typically, such wearables — which are capable of communicating over Bluetooth or Bluetooth Low Energy (BLE) wireless technology — forward the pieces of sensitive information that they collect to a device with stronger computing capabilities (i.e., a smartphone) that processes them to inform and notify the wearer about her health status through specialized applications [1].

The goal of this project is to evaluate the privacy leakage that stems from the Bluetooth/BLE communications between health wearable devices and their connected smartphone. While such communications are commonly encrypted [2], there is typically no protection for their associated metadata (e.g., packet sizes or timings) and, as such, they are potentially subject to traffic analysis techniques [3] that can reveal sensitive information about the person being monitored. To this end, we will employ advanced software techniques [4] or elaborate wireless analysis equipment [5] to eavesdrop on and collect data regarding the (encrypted) communications of a wide range of commercially available wearable devices such as smart watches, fitness trackers, and blood pressure monitors. Subsequently, we will apply machine learning methodologies to the captured data [6], aiming to extract information about devices’ states, fingerprint users’ activities, and track their health status. As a final step of the project, we will also investigate countermeasures aiming to prevent such attacks by employing padding [7] or traffic morphing techniques [8].
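To give an idea of how padding hides packet-size metadata, the following sketch rounds every payload length up to the next multiple of a fixed bucket size, so that payloads of different lengths become indistinguishable to an observer who only sees sizes. The function name paddedLen, the 64-byte bucket and the sample lengths are assumptions chosen for illustration, not values taken from the cited work.

```go
package main

import "fmt"

// paddedLen returns the on-the-wire length of an n-byte payload once it is
// padded up to the next multiple of bucket bytes. An empty payload is padded
// to one full bucket so that it is not distinguishable either.
func paddedLen(n, bucket int) int {
	if bucket <= 0 {
		return n // padding disabled
	}
	if n <= 0 {
		return bucket
	}
	return ((n + bucket - 1) / bucket) * bucket
}

func main() {
	const bucket = 64 // hypothetical bucket size in bytes
	// Hypothetical payload lengths of different message types.
	for _, n := range []int{7, 23, 61, 65, 120} {
		fmt.Printf("payload %3d bytes -> observed %3d bytes\n", n, paddedLen(n, bucket))
	}
}
```

Larger buckets hide more but cost more bandwidth and energy, which matters on BLE devices. Padding alone also leaves packet timings untouched, so it only addresses part of the metadata mentioned above.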

The collective authority (cothority) project provides a framework for development, analysis, and deployment of decentralized, distributed (cryptographic) protocols. It is developed and maintained by the DEDIS lab at EPFL. It currently supports elliptic curve-based protocols only.

This project consists in the integration of the lattice-based primitives of the Lattigo library in the Onet framework. Starting from the existing Onet library, the student will extend its interface and internals to support lattice-based primitives in addition to the existing elliptic curve ElGamal implementation. This includes the implementation of the NewHope asymmetric encryption scheme and its integration in the Onet authentication mechanism, providing Onet users with post-quantum security.

This project features a close collaboration between LDS and DEDIS, and will permit the student to work together with security, privacy and decentralization researchers on very “hot” application topics.

This project consists in implementing the network of a secure-multiparty-computation protocol that is based on a distributed version of the Brakerski-Fan-Vercauteren cryptosystem. Starting from Lattigo’s implementation of the local cryptosystem-operations, the student will implement the network layer using the Onet library, along with a small application layer enabling secure-multiparty-computation within a group of parties.