1 full paper and 1 workshop paper accepted at NeurIPS 2022.
05/2022: 1 workshop paper accepted at ICML 2022.
03/2022: CNN Filter DB was selected for an Oral presentation at CVPR 2022!
01/2022: 1 full paper and 1 workshop paper accepted at CVPR 2022.
09/2021: 1 workshop paper accepted at NeurIPS 2021.
Research
My research focuses on understanding the components responsible for generalization in computer vision models for recognition tasks. Specifically, I am interested in how generalization is encoded in the learned weights or, more abstractly, in the feature biases. Currently, I am particularly focused on multi-modal foundation models. A long time ago (but not in a galaxy far, far away), I worked on Bluetooth Low Energy.
Highlights
In this preliminary study, we analyze how training methods influence layer utilization in ImageNet classification models while keeping the training data and architecture fixed.
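As a minimal sketch of one way to probe layer utilization (not necessarily the paper's protocol), one can measure layer criticality: reset a single layer to its initialization and record the accuracy drop. Here, init_state and eval_fn are hypothetical placeholders for the saved initial weights and a validation-accuracy routine:

import copy

def layer_criticality(model, init_state, layer_name, eval_fn):
    # Reset one layer's parameters to their values at initialization
    # while all other layers keep their trained weights.
    probe = copy.deepcopy(model)
    state = probe.state_dict()
    for key in state:
        if key.startswith(layer_name):
            state[key] = init_state[key]
    probe.load_state_dict(state)
    # Accuracy drop = how much this layer's trained weights matter.
    return eval_fn(model) - eval_fn(probe)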
M. Jehanzeb Mirza, Mengjie Zhao, Zhuoyuan Mao, Sivan Doveh, Wei Lin, Paul Gavrikov, Michael Dorkenwald, Shiqi Yang, Saurav Jha, Hiromi Wakaki, Yuki Mitsufuji, Horst Possegger, Rogerio Feris, Leonid Karlinsky, James Glass
Preprint arXiv / code
We introduce GLOV, a method that enables LLMs to optimize VLMs by generating and refining prompts for downstream vision tasks, achieving significant performance improvements across various datasets.
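A minimal sketch of such an optimize-by-prompting loop; llm_propose (asks the LLM to refine the best prompts found so far) and vlm_score (evaluates a prompt's downstream accuracy with the VLM) are hypothetical helpers, and this is not the paper's exact procedure:

def optimize_prompt(llm_propose, vlm_score, seed_prompts, steps=10, k=3):
    # Keep a best-first list of (score, prompt) pairs.
    scored = sorted(((vlm_score(p), p) for p in seed_prompts), reverse=True)
    for _ in range(steps):
        top_k = [p for _, p in scored[:k]]
        candidates = llm_propose(top_k)            # LLM refines the best prompts
        scored += [(vlm_score(p), p) for p in candidates]
        scored.sort(reverse=True)                  # re-rank after each round
    return scored[0]                               # (score, best prompt)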
Paul Gavrikov, Jovita Lukasik, Steffen Jung, Robert Geirhos, Bianca Lamm, Muhammad Jehanzeb Mirza, Margret Keuper, Janis Keuper
Preprint arXiv / code
Surprisingly, LLM-powered vision-language models (VLMs) often exhibit a stronger shape bias than pure vision models, and this bias is influenced by language. We show that this makes it possible to steer an otherwise purely visual bias through language alone.
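Shape bias is typically measured with the cue-conflict protocol of Geirhos et al.: among responses that match either cue of an image, count the fraction that follows the shape cue. A minimal sketch:

def shape_bias(predictions, shape_labels, texture_labels):
    # Fraction of cue-following responses that side with shape.
    shape_hits = sum(p == s for p, s in zip(predictions, shape_labels))
    texture_hits = sum(p == t for p, t in zip(predictions, texture_labels))
    return shape_hits / (shape_hits + texture_hits)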
Paul Gavrikov, Janis Keuper
CVPR, 2024 paper / arXiv / code
We investigate the generalization capabilities of neural networks from the perspective of shape bias, spectral biases, and the critical band. Our results show that, even when the architecture is fixed, these indicators are not reliable predictors of generalization performance.
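The (un)reliability of such an indicator can be checked, for instance, with a rank correlation across a model zoo; a toy sketch with made-up numbers:

import numpy as np
from scipy.stats import spearmanr

# Illustrative values only: a bias indicator vs. OOD accuracy per model.
shape_bias = np.array([0.21, 0.35, 0.48, 0.30, 0.55])
ood_accuracy = np.array([0.42, 0.40, 0.47, 0.39, 0.44])
rho, pval = spearmanr(shape_bias, ood_accuracy)
print(f"Spearman rho={rho:.2f} (p={pval:.2f})")  # weak rho => unreliable predictor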
Jovita Lukasik*, Paul Gavrikov*, Janis Keuper, Margret Keuper
TMLR, 2023 paper / code
We propose controlling the frequency content of learned convolution filters in vision CNNs. This results in models that are natively more robust to adversarial attacks and corruptions, generalize better, and are generally more aligned with human vision.
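One way such a constraint could be implemented (a sketch, not necessarily the paper's exact regularizer) is an auxiliary loss that penalizes spectral energy of convolution kernels away from DC:

import torch

def highfreq_penalty(model, cutoff=0):
    # Penalize kernel spectrum outside a small window around DC.
    loss = 0.0
    for m in model.modules():
        if isinstance(m, torch.nn.Conv2d) and m.kernel_size[0] > 1:
            spec = torch.fft.fftshift(torch.fft.fft2(m.weight), dim=(-2, -1)).abs()
            c_y, c_x = m.kernel_size[0] // 2, m.kernel_size[1] // 2  # DC bin
            mask = torch.ones_like(spec)
            mask[..., c_y - cutoff:c_y + cutoff + 1, c_x - cutoff:c_x + cutoff + 1] = 0
            loss = loss + (spec * mask).pow(2).mean()  # high-frequency energy
    return loss

The penalty would then be added to the task loss with some weight, e.g., loss = task_loss + lam * highfreq_penalty(model).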
Paul Gavrikov, Janis Keuper
ICCV Workshops, 2023 paper / arXiv
We present a new adversarial attack based on image solarization. Despite being conceptually simple, the attack is effective, cheap to compute, and does not risk destroying the global structure of natural images. It also serves as a universal black-box attack against models trained with the legacy ImageNet training recipe.
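Conceptually, the attack only sweeps a single threshold parameter; a minimal black-box sketch using torchvision's solarize (assuming float images in [0, 1]; preprocessing/normalization omitted for brevity):

import torch
from torchvision.transforms.functional import solarize

def solarization_attack(model, image, label, thresholds=None):
    # Try a grid of solarization thresholds and return the first
    # solarized image the model misclassifies (None if all fail).
    if thresholds is None:
        thresholds = torch.linspace(0.1, 0.9, 9)
    for t in thresholds:
        x = solarize(image, t.item())              # invert pixels above t
        pred = model(x.unsqueeze(0)).argmax(dim=1).item()
        if pred != label:
            return x                               # adversarial example found
    return None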
Paul Gavrikov, Janis Keuper
ICCV Workshops, 2023 paper / arXiv
Our study examines the relationship between padding in Convolutional Neural Networks (CNNs) and vulnerability to adversarial attacks. We show that adversarial attacks result in different perturbation anomalies at image boundaries depending on the padding mode, and discuss which mode is best for adversarial settings. (Spoiler: it's zero padding.)
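In PyTorch, the compared boundary treatments are one flag away; a small illustrative sketch that switches the padding mode of the same layer:

import torch
import torch.nn as nn

x = torch.rand(1, 3, 32, 32)
conv = nn.Conv2d(3, 8, kernel_size=3, padding=1, padding_mode="zeros")
for mode in ["zeros", "reflect", "replicate", "circular"]:
    conv.padding_mode = mode      # only the boundary handling changes;
    y = conv(x)                   # the learned weights are shared here
    print(mode, y.shape)          # spatial size is preserved either way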
Paul Gavrikov, Janis Keuper, Margret Keuper
CVPR Workshops, 2023 paper / arXiv
Our analysis reveals how different forms of adversarial training (AT) affect the human-like behavior of CNNs and Transformers. Additionally, we propose a hypothesis, from a frequency perspective, for why AT increases shape bias and in which scenarios it can improve out-of-distribution generalization.
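A common probe for the frequency perspective (a sketch, not necessarily the paper's filtering setup) is to evaluate models on ideal low-pass-filtered inputs:

import torch

def low_pass(images, radius):
    # Zero out all spatial frequencies farther than `radius` from DC.
    spec = torch.fft.fftshift(torch.fft.fft2(images), dim=(-2, -1))
    h, w = images.shape[-2:]
    yy, xx = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    dist = ((yy - h // 2) ** 2 + (xx - w // 2) ** 2).float().sqrt()
    spec = spec * (dist <= radius).to(spec.dtype)
    return torch.fft.ifft2(torch.fft.ifftshift(spec, dim=(-2, -1))).real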
Paul Gavrikov, Janis Keuper
NeurIPS Workshops, 2022 arXiv
We question whether learning spatial convolution filters is necessary at all. Even with default i.i.d. random initializations, we achieve 75.66% validation accuracy with a ResNet-50 on ImageNet without ever learning any spatial convolution weight. Additionally, random filters can be more robust against adversarial attacks than learned filters.
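The training setup behind such an experiment can be sketched in a few lines, assuming a torchvision ResNet-50: keep spatial filters frozen at their random initialization and train only the remaining parameters:

import torch.nn as nn
from torchvision.models import resnet50

model = resnet50()  # random init; spatial filters stay at this init
for m in model.modules():
    if isinstance(m, nn.Conv2d) and m.kernel_size != (1, 1):
        m.weight.requires_grad_(False)   # never update spatial filters
# Only 1x1 convolutions, batch norm, and the classifier remain trainable.
trainable = [p for p in model.parameters() if p.requires_grad]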
Paul Gavrikov, Janis Keuper
NeurIPS Workshops, 2022 arXiv
Earlier, we showed that, on average, convolution filters show only minor distribution shifts across various dimensions, including the learned task, image domain, and dataset. However, medical imaging models were significant outliers, exhibiting spiky filter distributions. We revisit this observation and perform an in-depth analysis of medical imaging models.
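The spikiness of a filter distribution can be quantified, for example, by excess kurtosis; one simple statistic for this, not necessarily the paper's:

import numpy as np
from scipy.stats import kurtosis

def filter_spikiness(weights):
    # Heavy, spiky coefficient distributions score high excess kurtosis.
    return kurtosis(np.asarray(weights).ravel())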
Julia Grabinski, Paul Gavrikov, Janis Keuper, Margret Keuper
NeurIPS, 2022 (initially at ICML Workshops, 2022) paper / arXiv / code
Adversarially trained (AT) models are significantly less over-confident in their decisions than non-robust models, even on clean data. The analysis shows that not only AT but also the models' building blocks (like activation functions and pooling) have a strong influence on prediction confidence.
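Over-confidence on clean data can be summarized, for example, by the average maximum softmax probability; a minimal sketch:

import torch
import torch.nn.functional as F

@torch.no_grad()
def mean_confidence(model, loader, device="cpu"):
    # Average max softmax probability; over-confident models score
    # close to 1 even when their predictions are wrong.
    confs = []
    for x, _ in loader:
        probs = F.softmax(model(x.to(device)), dim=1)
        confs.append(probs.max(dim=1).values)
    return torch.cat(confs).mean().item()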
Paul Gavrikov, Janis Keuper
CVPR Workshops, 2022 paper / arXiv / code
We investigate the 3x3 convolution filters that form in adversarially trained robust models and find that these models form more diverse, less sparse, and more orthogonal filters than their non-robust counterparts. The largest differences are found in the deepest layers and in the very first convolution layer, which forms highly distinct thresholding filters.
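Simple per-kernel versions of such measures (thresholds and exact definitions here are illustrative, not the paper's) can be sketched as:

import torch

def filter_stats(conv_weight, eps=1e-3):
    # Treat every kxk kernel as one flattened filter.
    f = conv_weight.detach().reshape(-1, conv_weight.shape[-2] * conv_weight.shape[-1])
    sparsity = (f.abs() < eps).float().mean().item()       # near-zero share
    f = torch.nn.functional.normalize(f, dim=1)
    gram = f @ f.t()                                        # pairwise cosine similarity
    off_diag = gram - torch.eye(f.shape[0])
    orthogonality = 1 - off_diag.abs().mean().item()        # 1.0 = fully orthogonal
    return sparsity, orthogonality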
Paul Gavrikov, Janis Keuper
CVPR, 2022 (Oral Presentation) paper / arXiv / code
We collected and publicly released a dataset of over 1.4 billion 3x3 convolution filters extracted from hundreds of trained CNNs. Our observations show that, surprisingly, models learn highly similar filter pattern distributions independent of task and dataset, but differ by model architecture. We also propose methods to measure the quality of filters to detect overparameterization or underfitting, and show that many publicly available models suffer from "degenerated" filters.
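Collecting these filters from a trained model takes only a few lines; a sketch using a torchvision checkpoint:

import torch
from torchvision.models import resnet18

# Gather every 3x3 convolution kernel, flattened to 9 coefficients each.
model = resnet18(weights="IMAGENET1K_V1")
kernels = [
    m.weight.detach().reshape(-1, 9)
    for m in model.modules()
    if isinstance(m, torch.nn.Conv2d) and m.kernel_size == (3, 3)
]
kernels = torch.cat(kernels)          # (num_filters, 9)
print(kernels.shape)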
Paul Gavrikov, Janis Keuper
NeurIPS Workshops, 2021 paper / arXiv / code
This paper looks at distribution shifts in the filter weights used for various computer vision tasks. We collected data from hundreds of trained CNNs and analyzed the shifts along different axes of meta-parameters. We found interesting distribution shifts between trained filters and argue that these are a valuable source for further investigation into how shifts in the input data affect the generalization abilities of CNN models.
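One simple way to quantify such a shift (illustrative; not necessarily the paper's measure) is a 1D Wasserstein distance between flattened filter coefficients:

import numpy as np
from scipy.stats import wasserstein_distance

def filter_distribution_shift(kernels_a, kernels_b):
    # Distance between the coefficient distributions of two filter sets.
    return wasserstein_distance(np.ravel(kernels_a), np.ravel(kernels_b))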
Paul Gavrikov, Matthias Lai, Thomas Wendt
APWiMob, 2019 (Best Paper Award) paper
We present a new, more efficient algorithm for Bluetooth Low Energy scanning that uses less power and scales with incoming network traffic. The algorithm requires no changes to advertisers and is therefore compatible with existing devices; our performance evaluation shows that it outperforms existing scanning methods.
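The paper's algorithm is not reproduced here; purely as an illustration of the idea of scaling scan effort with traffic, a hypothetical duty-cycle rule might look like this (all names and constants are made up):

def next_scan_window(adv_rate_hz, base_ms=30.0, max_ms=300.0):
    # Hypothetical illustration only (not the paper's algorithm):
    # grow the scan window with the observed advertising rate so a busy
    # channel is sampled longer, while an idle one saves power.
    window = base_ms * (1.0 + adv_rate_hz)   # scale with incoming traffic
    return min(window, max_ms)               # cap to bound power draw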
Paul Gavrikov, Matthias Lai, Thomas Wendt
Wireless Telecommunications Symposium, 2019 (presentation) and IJITN paper
We compare the performance of Bluetooth Mesh implementations on real chipsets against an ideal implementation of the specification. We show that real chipsets exhibit non-idealities in both the underlying Bluetooth Low Energy behavior and the Mesh implementation, which introduce erratic transmission and reception behavior. These effects impact transmission rate, reception rate, and latency, and have an even more significant impact on average power consumption.
Paul Gavrikov, Pascal E. Verboket, Tolgay Ungan, Markus Müller, Matthias Lai, Christian Schindelhauer, Leonhard M. Reindl, Thomas Wendt
ICECS, 2018 paper
We discuss a new approach that uses BLE packets to create an FSK-like addressable wake-up packet. We developed a wake-up receiver system from off-the-shelf components to detect these packets. The system is more robust than traditional OOK wake-up systems and achieves a sensitivity of -47.8 dBm at a power consumption of 18.5 µW during passive listening, with a latency of 31.8 ms at a symbol rate of 1437 baud.
Academic Career
Honors
Outstanding reviewer (NeurIPS 2023)
Outstanding reviewer (NeurIPS 2022 Datasets and Benchmarks track)
Accepted into the CIFAR and Vector Institute DLRL 2023 Summer School in Montreal
Accepted to and attended the ICVSS 2023 Summer School in Sicily
Top-20 at the CVPR 2022 Art of Robustness Challenge