Context Trees for Privacy-Preserving Modeling of Genetic Data


In this work, we use context trees for privacy-preserving modeling of genetic sequences. The estimated models are then used to compare genetic sequences functionally in a privacy-preserving way. We define privacy as the uncertainty about the genetic source sequence given its model, and we quantify it using equivocation. We evaluate the performance of our approach on publicly available human genomic data. Simulation results confirm that context trees can be used effectively to detect similar genetic sequences while guaranteeing high privacy levels. However, practical applications must take into account a trade-off between privacy and utility.
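
As a reading aid, equivocation in the standard information-theoretic sense is the conditional entropy of the source given what is released. The notation below is assumed for illustration and is not taken from the paper: $X^n$ denotes the genetic source sequence of length $n$, and $M$ denotes the context-tree model estimated from it.

```latex
% Assumed notation: X^n = source sequence, M = estimated context-tree model.
\[
  H(X^n \mid M) \;=\; -\sum_{m} \sum_{x^n} P(x^n, m)\, \log_2 P(x^n \mid m),
  \qquad
  0 \;\le\; H(X^n \mid M) \;\le\; H(X^n).
\]
```

Under this reading, an equivocation close to $H(X^n)$ means the model reveals little about the source sequence, while an equivocation close to zero means the sequence can be almost fully recovered from the model.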

Proceedings of 2016 International Zurich Seminar on Communications (IZS)