Matthias Gazzari, M.Sc.

Privacy and Security Impacts of Sensor Data / Keylogging Side-Channel Attacks / Researcher

Contact Information

Pankratiusstraße 2
64289 Darmstadt
S2|20 203
mgazzari@seemoo.tu-darmstadt.de
+49 6151 16-25475

Biography

Matthias Gazzari is a doctoral researcher at the Secure Mobile Networking Lab. His research focuses on the privacy and security impacts of sensors in everyday life, particularly on keylogging side-channel attacks. He studies the effectiveness of such inference attacks with different sensor modalities and combinations under varying conditions. The goal of his research is to facilitate awareness and derive counter-measures.

Research Interests

Privacy and security impact of sensor usage and presence in everyday life
Human- and device-targeted (keylogging) side-channel attacks
Sensor data obfuscation to counter inference attacks
Facilitating privacy awareness in a sensor-rich environment

Methods

Deep learning and generative adversarial networks
End-to-end multivariate time series classification
Empirical research focused on human subject data studies

Teaching

(2019-2023) Coordination and teaching assistant for Network Security.
(2024-2025) Coordination of Ubiquitous Computing in Business Processes.
Selected seminar topics in our (Advanced) Seminar on Networking, Security, Mobility, and Wireless Communications.
Selected seminar topics as part of the RTG 2050 interdisciplinary seminar.
Selected lab topics in our Secure Mobile Networking Lab/Project.

Offered Theses Topics

2022 Completed (April 2025)

Exploring a Digital Intermediary for Bystanders’ Smart Home Privacy Communications

Supervisors: Matthias Gazzari, Florentin Putz

...

2023 Completed (November 2024)

PINs Under Fire: Revealing What Impacts Thermal Keylogging

Supervisor: Matthias Gazzari

Thermal keylogging attacks utilize residual heat from input devices to extract Personal Identification Numbers (PINs) without the knowledge of the user. This form of attack has become more feasible as affordable thermal imaging equipment has advanced. In this thesis, we conduct a study to systematically assess several factors that influence the success of thermal keylogging attacks, including ambient temperature, humidity, device material, and input speed. We demonstrate an automated method for identifying pressed keys using temperature analysis and offer practical countermeasures based on our findings. In addition, we collected and provide an extensive dataset of over 50,000 thermal images and accompanying key press information, which will be useful for future studies on thermal keylogging attacks. Our findings show how these parameters influence the success of such attacks and offer essential insights for the development of secure input methods.

2023 Completed (July 2024)

Protecting Users from PIN Inference Attacks with the PIN Entering Detector (PINED)

Supervisors: Matthias Gazzari, Alexander Matern

Wearables such as smartwatches and fitness trackers are known for using their sensor data to recognize and classify sporting activities. However, research has repeatedly shown that it is also possible to extract sensitive information such as Personal Identification Numbers (PINs) and passwords from this data. To protect users against this type of attack, this thesis presents the PIN Entering Detector (PINED), a security system that can recognize when a PIN is being entered and alert users to the potential danger. In contrast to previous approaches for similar protection frameworks, our solution does not explicitly exclude machine learning technologies. Instead, we build on the advances that have taken place in the computation of such models. PINED can therefore use complex features for the classification and thereby achieve higher balanced classification accuracies. We first create a data set consisting of over 4500 sequences of motion data and use it to train two state-of-the-art machine learning models from the field of time series classification. We then integrate the resulting binary classifiers into a smartwatch security app. The results show that with the data collected, both models were successfully trained to achieve a balanced accuracy of 85 % to 90 %, beating previous security systems. However, the classification results produced by the smartwatch app were too unreliable. Nevertheless, PINED showed the feasibility of a security framework based on machine learning.

2023 Completed (May 2024)

Towards Standardized Testing of Sensor Detection Methods

Supervisors: Matthias Gazzari, Frank Hessel

Sensors are found in diverse settings, from environmental monitoring to smart homes and public surveillance systems. They often collect sensitive data, such as video recordings or movement data. Detecting unwanted, privacy-invasive sensors, usually small and concealed, poses a challenge. Although various detection approaches exist, they lack standardized testing procedures and environments, resulting in incomparable results. This work introduces a benchmark for evaluating sensor detection methods to address this issue. This benchmark offers a standardized framework for assessing detection approaches and meets requirements for practicality and deployability. It includes sensors used in literature and commonly sold smart home devices to ensure relevance and applicability. Three existing sensor detection approaches are tested using the defined benchmark and compared to their claimed performance. Evaluating the approaches in the benchmark resulted in a balanced accuracy of 47 % to 84 % and a recall of 0 % to 75 %. Their strengths and limitations are discussed, and the approaches are compared to each other. By establishing a benchmark for testing and evaluating sensor detection approaches, this work sets a precedent for standardized sensor detection methods.
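The two metrics quoted in this abstract can be illustrated with a minimal sketch. It is not the thesis' evaluation code: the labels (1 = sensor present, 0 = no sensor), the toy data and the function names are assumptions made for illustration. Balanced accuracy is taken here as the unweighted mean of the per-class recalls, which is the usual definition.

```python
from collections import defaultdict


def per_class_recall(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return {label: hits[label] / totals[label] for label in totals}


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of the per-class recalls."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Hypothetical toy data: 4 rooms with a sensor, 8 rooms without,
# and a detector that never reports a sensor.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * len(y_true)

recalls = per_class_recall(y_true, y_pred)
print(recalls[1])                         # recall for "sensor present"
print(balanced_accuracy(y_true, y_pred))  # mean of both recalls
```

In this toy case a detector that never fires has a recall of 0 for the sensor class yet a balanced accuracy of 0.5, which shows why a balanced accuracy near 50 % can coexist with a very low recall.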

2023 Available from: August 2025

Privacy and Security Implications of Cross-Modal Transformations on Human-Centric Sensor Data

Supervisor: Matthias Gazzari

This topic is about implementing a cross-modal transformation model on a chosen pair of human-centric sensors (sensors which are worn by or close to humans), for recreating one stream of sensor data based on the other one. The ultimate goal of this thesis is to evaluate the performance of such a model with respect to the privacy and/or security implications. Contact me if you are interested and/or have a cool idea for a specific pair of sensors relevant for violating the privacy and/or the security of a human being. Experience with machine learning and/or signal processing is required. A good understanding of sensors and their measured physical quantities is strongly recommended.

2023 Completed (September 2023)

Limits of CSI-based keylogging on 10-digit number pads

Supervisors: Matthias Gazzari, Jakob Link

Using the channel state of a Wi-Fi transmission, an attacker can extract the keystrokes a victim enters on a 10-digit number pad. Other research groups have already shown that such an attack is possible on keyboards and number pads, but have not explored its limitations. This work builds such a keylogger to show that it is possible to detect and identify keystrokes using only signals from a nearby Wi-Fi network as a side channel, and to find the limits of such an attack. For this, we run multiple experiments with varied parameters, such as distance or a missing line of sight, and evaluate the results of the keylogger. We found that with our current setup, we are able to achieve good results in key detection with over 90 % in both precision and recall, but are unable to identify keys. Additionally, we found that introducing unrelated movement deteriorates the results, while increasing the distance between the devices or removing line of sight barely impacts the result of the keylogger.

2022 Completed (August 2023)

SpyFi: Deep Learning for CSI-based Keylogging Side Channel Attacks

Supervisors: Matthias Gazzari, Jakob Link

Spying on what is typed on a keyboard with Wi-Fi signals sounds scary but might not be as far from reality as suspected. Wi-Fi-enabled devices constantly measure the communication channel conditions represented with Channel State Information (CSI). Finger and hand movements alter the wireless signal propagation characteristic and cause changes in the CSI over time. Prior work proves it is possible to correlate the patterns in a CSI time series to the motion of keys pressed on a keyboard. This leaking information from Wi-Fi signal distortions can be exploited in a side-channel keylogging attack. Typing is a prevalent activity when it comes to working with computers on a regular basis. Considering that what we type reveals not only private messages like emails or notes but also highly sensitive data such as passwords or banking information, this leaves a frightening prospect. In this thesis, we practically explore the potential threat of side-channel keylogging attacks with CSI by implementing and comparing the conventional method found in related work to deep learning-based approaches to infer keystrokes. Motivated by the fact that the use of deep learning models promises less effort in pre-processing and feature extraction, we apply deep learning approaches for the first time for CSI-based keylogging and extend the knowledge about the applications of Deep Neural Networks (DNNs). We create a dataset worth more than 24 hours of recording time with a controlled experimental setup to empirically evaluate the performance of the implemented keyloggers. Our results indicate the difficulties and limitations our keylogging models face, which renders keylogging attacks with Wi-Fi signals rather cumbersome for real-world attackers.

2022 Completed (August 2023)

Machine Learning Aided Penetration Testing: Concept of a Penetration Testing Automation Environment

Supervisor: Matthias Gazzari

Network penetration testing relies on experience-based techniques that require consideration of environment-specific parameters and careful planning. Penetration testers should focus on novel vulnerabilities and on the interrelations between possible threats and risks rather than lose time on repetitive tasks. Reinforcement Learning (RL) is the key approach to make autonomous penetration testing practically applicable inside real-world computer networks. The literature describes attack path generation with a priori knowledge about the environment, simulation-only approaches without applicability to real-world computer networks or emulation-only approaches with no RL integration. This thesis optimizes, trains and evaluates RL agents for four benchmark scenarios with increasing size, complexity and heterogeneity of hosts, and a Proof of Concept (PoC) demonstrates the transferability of a simulation environment into an emulation environment. Creating a realistic emulation environment in which RL agents can apply the knowledge learned in the fast simulation environment allows repeatable tasks to be delegated to the trained agent and lets penetration testers focus on novel and individual aspects of the target network.

2023 In progress

Algorithmic Exploration of Gas Sensor Data for Environment Inference

Supervisor: Matthias Gazzari

...

2021 Completed (October 2022)

ECG-PPG A Comparison of Biometric Identification

Supervisor: Matthias Gazzari

With the rise of the IoT and the usage of mobile devices, the need for improved security for those devices becomes more critical. Beyond regular passwords, several other forms of identification, such as biometric identification, have been introduced. They can offer increased convenience and less vulnerability to spoofing attacks. The most common forms of applied biometric identification include iris, face and fingerprint scanners, which see most use in smartphones. But there has been an increasing interest in methods that utilize physiological signals of the human body. Electrocardiogram (ECG) and photoplethysmogram (PPG) are among them and are the main point of interest for this work. They come with inherent advantages: they are difficult to reproduce and, unlike a password, cannot be forgotten. Gathering records of the two signal types has become easier over the years and can now be performed with wearables like the Apple Watch. This opens new options for this field of research. My work focuses on analyzing and reimplementing existing approaches for ECG and PPG based biometric identification systems and comparing them to deduce similarities, differences, strengths and weaknesses. To achieve this, two convolutional neural network (CNN) based ECG implementations and one PPG implementation that utilizes handcrafted feature extraction were adapted to work on a shared dataset that contains synchronized ECG and PPG data from the private SAPE and the public BIDMC database. This dataset was then used for evaluation of the systems. In addition, commonly used biometric methods and databases were analyzed to aid in the final evaluation. High rates of accuracy were reached and compared to literature that utilized similar datasets.

2021 Completed (September 2022)

Comparison of Side-Channel Touchlogging Attacks using Wearables

Supervisor: Matthias Gazzari

Although many research papers exist about touchlogging attacks, which leverage wearable devices as a side channel to log keys typed on a smartphone, there is no concise summary of those attacks and their advantages and limitations, and differing scenarios and evaluation setups make comparisons difficult or unfair. Therefore, one has to sort through countless articles and papers to see whether an approach has already been evaluated in a specific scenario, and one cannot fairly compare two well-performing approaches when their evaluation setups differ drastically. This thesis provides a framework combining five of the most common approaches for touchlogging attacks in four different typing scenarios and eight ways of wearing the wearable device. With this framework and its evaluation, a concise overview and quick, fair comparisons between the most common approaches to touchlogging are presented.

2021 Completed (July 2022)

Impact of Multi-Path Effects on Acoustic Keylogging Systems

Supervisors: Matthias Gazzari, Florentin Putz

...

2021 Completed (March 2022)

Limits on Inferring Handwritten Characters using Wearables

Supervisor: Matthias Gazzari

Recent studies have shown that handwritten characters can be distinguished from each other with high accuracy, leading to security threats such as impersonation or side-channel attacks, or simply to systems that mirror handwritten characters into digital space. Most of these studies focused only on recording the characters and building (complex) systems around their classification, resulting in sparse data sets recorded with specialized hardware in restricted settings. With these specialized settings and hardware, it is not clear which limitations might impact the classification accuracy, be it the type of sensor or the general writing style of a person, and whether these findings also apply to consumer hardware or general settings such as writing with a simple pen on paper. This work aims to establish clear limitations and settings for the recording of handwritten characters in a simple pen-and-paper setting with multiple consumer devices. A data set of handwritten lower-case characters is sampled using multiple consumer wearables in different positions on the forearm while limiting the speed and size of the characters drawn. The recordings are processed into several time-domain and frequency-domain features and classified by different machine learning methods, resulting in accuracies of 20 % to 22 % for the IMU data, 15 % to 17 % for the EMG data and 16 % to 20 % for a mixed approach. The results are in the range of current state-of-the-art findings adjusted for the size of classifiers used, so the limitations defined in this work might indicate which limitations are more useful when classifying characters based on signal data from consumer devices.

2021 Completed (October 2021)

Finger Detection of Keystrokes from RGB Video Streams

Supervisor: Matthias Gazzari

To research the security impact of side-channel keylogging attacks, we need suitable datasets containing the sensor data and the pressed keys. However, when our side-channel targets the user through acceleration, EMG, or other wearable sensors, we might want additional ground truth about the users’ activity, e.g., a representation of which finger was used to type a certain key. This data makes it possible to directly correlate the sensor readings with the activity that caused them, which could help develop more accurate and robust keylogging models. Previous work in this area focused more on stand-alone virtual input devices that do not reflect real-world keyboards or require expensive motion tracking hardware to track finger positions. In this thesis, we design, implement and evaluate a system that can infer finger usage from a monocular RGB video of a user typing on an unmodified keyboard. Our evaluation shows that our implementation can accurately label the hand usage for over 96 % of keystrokes and the finger usage for over 97 % of keystrokes. As such, our system can be a helpful aid in the creation of new datasets for research into keylogging side-channels.

2019 Completed (March 2021)

Handwriting Recognition using IMU and EMG Sensor Data

Supervisor: Matthias Gazzari

With the rise of wrist-worn devices like smartwatches and fitness trackers and the integration of Inertial Measurement Unit (IMU) sensors, questions arise about the privacy impact of their recorded data, which often gets little attention in privacy considerations. Since these devices are worn on the wrist, one possible impact is an eavesdropper inferring the wearer’s handwriting from the collected IMU data. Another use case is the deliberate digitizing of handwriting by users wearing such devices. In this case, it is also feasible for the user to wear an additional device to improve the digitizing. In this thesis, we investigate both the possible privacy impact and the possibilities for a deliberate digitizing of handwriting done on paper based on IMU sensor data recorded on a smartwatch. Furthermore, we collect Electromyography (EMG) sensor data using an armlet worn on the lower arm to analyze whether the original recognition results can be improved using these data. We design and conduct a data study aimed at mirroring everyday circumstances using an Apple Watch and a Thalmic Myo armlet to record the necessary data. Additionally, the original handwriting of the study participants is digitized by writing on paper on top of a Wacom Bamboo Slate tablet. We use the recorded continuous streams of IMU and EMG data to classify the written letters using the 1-Nearest Neighbor (1NN) algorithm in combination with the Dynamic Time Warping (DTW) algorithm. Our model achieves widely varying results depending on the writer and an overall accuracy of 0.28. Very low accuracies for the classification based on EMG data prevent us from evaluating possible improvements when combining both data types. Our findings suggest that the recognition depends on the writing style of the individual user and that more research is required to make handwriting recognition based on IMU or EMG data applicable to writing in everyday life.
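The combination of 1NN and DTW named in this abstract can be sketched as follows. This is a simplified illustration, not the thesis implementation: it assumes univariate sequences and an absolute-difference local cost, whereas real IMU and EMG recordings are multichannel, and the function names and toy templates are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic-programming DTW distance between two 1-D sequences."""
    n, m = len(a), len(b)
    # cost[i][j] is the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its sample
                cost[i][j - 1],      # b advances, a repeats its sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def classify_1nn(query, templates):
    """Return the label of the template with the smallest DTW distance.

    templates is a list of (label, sequence) pairs.
    """
    best_label, _ = min(templates, key=lambda item: dtw_distance(query, item[1]))
    return best_label


# Hypothetical templates: one labelled recording per letter.
templates = [
    ("a", [0, 1, 2, 1, 0]),
    ("b", [0, -1, -2, -1, 0]),
]
# The same shape as "a", but written more slowly (samples repeated).
query = [0, 0, 1, 2, 2, 1, 0]

print(dtw_distance(query, templates[0][1]))
print(classify_1nn(query, templates))
```

Because DTW allows samples to be repeated during alignment, the slower query matches the "a" template at zero cost, which is the property that makes it attractive for handwriting recorded at varying speeds.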

2019 Completed (October 2020)

Circumventing ECG Authentication with Deep Generative Models based on PPG Pulse Data

Supervisor: Matthias Gazzari

Electrocardiogram (ECG) biometrics is a steadily growing and increasingly popular field of research. In this work, we propose a novel attack scenario in which we train a generative model to uncover and spoof the ECG of a victim by merely observing another cardiovascular signal of the victim: their photoplethysmogram (PPG). For the model, we propose a conditional generative adversarial network (cGAN) with a U-Net style generator and least-squares loss. Since current training datasets do not fall into the off-the-person category, we additionally collect a custom dataset of synchronized PPG and ECG measurements. It features 33 recordings by 31 participants with a median age of 28. We evaluate the model against a baseline by Zhu et al. Our model has a lead over the baseline with a mean relative root-mean-square error (rRMSE) of 0.47 vs. 0.49 on the TBME-RR dataset but lags behind on our own dataset with a mean rRMSE of 0.61 vs. 0.55. The evaluation demonstrates that the cGAN is able to properly recreate the overall characteristics and noise of the ground truth. In the proposed attack scenario, the model yields an overall success rate of up to 26 % against a neural-network-based authentication system.

2019 Completed (September 2020)

Keylogging Side-Channel Attacks on Bluetooth Timestamps: A Timing Analysis of Keystrokes on Apple Magic Keyboards

Supervisors: Matthias Gazzari, Jiska Classen

In the past, several timing attacks have been applied to recover sensitive input on keyboards. If these kinds of attacks could be transferred to the wireless communication of keyboards, this would make the use of wireless keyboards less secure. In this thesis, we apply a timing attack on the Bluetooth communication of the Apple Magic Keyboard by recording the time between consecutive Bluetooth packets and recovering the typed input with a Hidden Markov Model (HMM). With this attack, we are able to shrink the search space of random passwords by a factor of 5 to 10, which considerably speeds up exhaustive search.

2019 Completed (August 2020)

Prevalence Analysis of Dark Patterns in Newsletters

Supervisor: Matthias Gazzari

The dependence on online shopping makes consumers popular targets of malicious intent. With a vast understanding of the human psyche, dark patterns are capable of leading consumers to perform actions which they would not do under normal circumstances, such as evoking buying pressure or giving away sensitive data. In this thesis, we focus on the detection of dark patterns, especially the Social Proof, Misdirection, Scarcity, and Urgency patterns using multinomial naïve Bayes, support-vector machine, k-nearest neighbor, and random forest, as well as state-of-the-art transfer learning methods like ULMFiT and DistilBERT. For this purpose, we utilize a collection of 1818 classified dark patterns. First, we perform nested cross-validations for all algorithms to gain the insights needed for model selection. Overall, we achieve a balanced accuracy of 0.926 on average, with all models except k-nearest neighbor performing similarly well. Then, with the gained knowledge, we demonstrate that dark patterns can indeed be detected using machine learning techniques. Finally, using our fine-tuned models, we reveal the existence of dark patterns in a collection of newsletter emails, with a performance of 0.436 balanced accuracy. Thus, we conclude that this work provides essential insights into the fact that dark patterns exist in hitherto unnoticed fields and that more sophisticated methods are crucial to combat such patterns.

2019 Completed (May 2020)

Implementation and Analysis of a Keystroke Dynamics Authentication System

Supervisor: Matthias Gazzari

Password-based authentication systems face many problems today. Data breaches and users selecting weak passwords have raised the need for different authentication methods or a second factor. Popular methods include fingerprint or face detection and second factors like access or transaction codes. Less explored are systems that use keystroke dynamics for authentication. In this bachelor thesis, we analyze existing keystroke dynamics authentication systems. To get a better understanding, we implement such a system. Using datasets that are publicly available, our system reaches a false acceptance rate (FAR) of 14 % and a false rejection rate (FRR) of 28 %. Having our own keystroke dynamics authentication system, we can then perform an evaluation in terms of usability in practice. Based on this evaluation, we discuss in which cases such a system is a suitable and secure way for authentication. We conclude that, in general, keystroke dynamics authentication systems are a convenient and secure way to provide an additional security factor. But we also identify existing challenges, such as users having different computers (with different keyboards) or using the auto-fill functions of password managers. Finally, we present ideas on how our system’s performance could be improved and how these challenges could be addressed.
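The two error rates reported in this abstract depend on the decision threshold of the authentication system. The following minimal sketch shows that relationship; it is not the thesis' system. It assumes that each login attempt yields a similarity score and is accepted when the score reaches the threshold, and the scores and names are hypothetical.

```python
def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) for similarity scores and an acceptance threshold.

    An attempt is accepted when its score is >= threshold.
    FAR is the share of impostor attempts that are accepted,
    FRR is the share of genuine attempts that are rejected.
    """
    false_accepts = sum(score >= threshold for score in impostor_scores)
    false_rejects = sum(score < threshold for score in genuine_scores)
    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(genuine_scores)
    return far, frr


# Hypothetical similarity scores between a typing sample and the enrolled profile.
genuine = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor = [0.7, 0.5, 0.3, 0.2, 0.1]

strict = far_frr(genuine, impostor, threshold=0.65)
lenient = far_frr(genuine, impostor, threshold=0.45)
print(strict)
print(lenient)
```

Raising the threshold lowers the FAR at the cost of a higher FRR and vice versa, so a pair of values such as the one reported above describes a single operating point rather than a fixed property of the system.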

Publications

2021 Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies Article

My(o) Armband Leaks Passwords: An EMG and IMU Based Keylogging Side-Channel Attack

Matthias Gazzari, Annemarie Mattmann, Max Maass, Matthias Hollick

DOI: 10.1145/3494986

Abstract

Wearables that constantly collect various sensor data of their users increase the chances for inferences of unintentional and sensitive information such as passwords typed on a physical keyboard. We take a thorough look at the potential of using electromyographic (EMG) data, a sensor modality which is new to the market but has lately gained attention in the context of wearables for augmented reality (AR), for a keylogging side-channel attack. Our approach is based on neural networks for a between-subject attack in a realistic scenario using the Myo Armband to collect the sensor data. In our approach, the EMG data has proven to be the most prominent source of information compared to the accelerometer and gyroscope, increasing the keystroke detection performance. For our end-to-end approach on raw data, we report a mean balanced accuracy of about 76 % for the keystroke detection and a mean top-3 key accuracy of about 32 % on 52 classes for the key identification on passwords of varying strengths. We have created an extensive dataset including more than 310 000 keystrokes recorded from 37 volunteers, which is available as open access along with the source code used to create the given results.
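The top-3 key accuracy mentioned in this abstract counts a keystroke as correctly identified when the true key is among the three highest-ranked candidates. A minimal sketch under assumptions: the per-keystroke scores are given as a dictionary from key to model score, and the function name and toy scores are hypothetical rather than taken from the published source code. As a rough reference point, guessing three distinct keys uniformly at random among 52 equally likely classes would succeed in 3/52, or about 5.8 %, of cases.

```python
def top_k_accuracy(y_true, score_rows, k=3):
    """Share of samples whose true label is among the k best-scored labels.

    score_rows holds one {label: score} dictionary per sample.
    """
    hits = 0
    for true_label, scores in zip(y_true, score_rows):
        # Labels ordered from highest to lowest score.
        ranked = sorted(scores, key=scores.get, reverse=True)
        if true_label in ranked[:k]:
            hits += 1
    return hits / len(y_true)


# Hypothetical scores for three keystrokes over four candidate keys.
y_true = ["a", "s", "d"]
score_rows = [
    {"a": 0.20, "s": 0.50, "d": 0.25, "f": 0.05},  # true key ranked third
    {"a": 0.10, "s": 0.60, "d": 0.25, "f": 0.05},  # true key ranked first
    {"a": 0.40, "s": 0.30, "d": 0.05, "f": 0.25},  # true key ranked last
]

print(top_k_accuracy(y_true, score_rows, k=1))
print(top_k_accuracy(y_true, score_rows, k=3))
```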