Paul Gavrikov

I am a PhD student at the University of Mannheim supervised by Janis Keuper. My research is funded by the Institute of Machine Learning Analytics at Offenburg University, Germany, where I work on fundamental interpretability of computer vision models.

I have a BS and MS in Computer Science from the University of Freiburg, Germany.

Email  /  X (Twitter)  /  GitHub  /  Google Scholar  /  LinkedIn  /  Medium  /  dblp

News

09/2024 I was admitted into the Doctoral Consortium at ECCV 2024 and will be mentored by Jon Barron.
03/2024 New preprint released: Are Vision Language Models Texture or Shape Biased and Can We Steer Them?
03/2024 Can Biases in ImageNet Models Explain Generalization? was accepted at CVPR 2024.
12/2023 Improving Native CNN Robustness with Filter Frequency Regularization was accepted at TMLR.
11/2023 I was selected as a NeurIPS 2023 top reviewer.
08/2023 On the Interplay of Convolutional Padding and Adversarial Robustness was accepted for the ICCV 2023 workshop proceedings.
04/2023 I was accepted to the ICVSS 2023 and the CIFAR DLRL 2023 summer schools.
03/2023 An Extended Study of Human-like Behavior under Adversarial Training was accepted for the CVPR 2023 workshop proceedings.
12/2022 I was selected as a NeurIPS 2022 Datasets and Benchmarks Track outstanding reviewer.
11/2022 I was invited to talk at the IEEE ISBI Special Session on Machine Learning in April 2023. Looking forward to my first invited conference talk in Cartagena, Colombia!
09/2022 1 full paper and 1 workshop paper accepted at NeurIPS 2022.
05/2022 1 workshop paper accepted at ICML 2022.
03/2022 CNN Filter DB was selected for an oral presentation at CVPR 2022!
01/2022 1 full paper and 1 workshop paper accepted at CVPR 2022.
09/2021 1 workshop paper accepted at NeurIPS 2021.

Research

My research focuses on understanding the components responsible for generalization in computer vision models for recognition tasks. Specifically, I am interested in how generalization is encoded in the learned weights or, more abstractly, in the feature biases. Currently, I am particularly focusing on multi-modal foundation models. A long time ago (but not in a galaxy far, far away), I worked on Bluetooth Low Energy.

How Do Training Methods Influence the Utilization of Vision Models?

Paul Gavrikov, Shashank Agnihotri, Margret Keuper, Janis Keuper
NeurIPS Workshops, 2024
arXiv / code

In this preliminary study, we analyze the influence of training methods on the utilization of layers in ImageNet classification models while keeping training data and the architecture fixed.

GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models

M. Jehanzeb Mirza, Mengjie Zhao, Zhuoyuan Mao, Sivan Doveh, Wei Lin, Paul Gavrikov, Michael Dorkenwald, Shiqi Yang, Saurav Jha, Hiromi Wakaki, Yuki Mitsufuji, Horst Possegger, Rogerio Feris, Leonid Karlinsky, James Glass
Preprint
arXiv / code

We introduce GLOV, a method that enables LLMs to optimize VLMs by generating and refining prompts for downstream vision tasks, achieving significant performance improvements across various datasets.

Are Vision Language Models Texture or Shape Biased and Can We Steer Them?

Paul Gavrikov, Jovita Lukasik, Steffen Jung, Robert Geirhos, Bianca Lamm, Muhammad Jehanzeb Mirza, Margret Keuper, Janis Keuper
Preprint
arXiv / code

Surprisingly, LLM-powered Vision-Language Models (VLMs) often exhibit more shape bias than pure vision models, an effect influenced by language. We show that this allows steering a purely visual bias through language.

Can Biases in ImageNet Models Explain Generalization?

Paul Gavrikov, Janis Keuper
CVPR, 2024
paper / arXiv / code

We investigate the generalization capabilities of neural networks from the perspective of shape bias, spectral biases, and the critical band. Our results show that, even when we fix the architecture, these indicators are not reliable predictors of generalization performance.

Improving Native CNN Robustness with Filter Frequency Regularization

Jovita Lukasik*, Paul Gavrikov*, Janis Keuper, Margret Keuper
TMLR, 2023
paper / code

We propose controlling the frequency content of learned convolution filters in vision CNNs. This results in models that are natively more robust to adversarial attacks and corruptions, generalize better, and are generally more aligned with human vision.

Don't Look into the Sun: Adversarial Solarization Attacks on Image Classifiers

Paul Gavrikov, Janis Keuper
Preprint
arXiv / code

We present a new adversarial attack based on image solarization. Despite being conceptually simple, the attack is effective, cheap to compute, and does not risk destroying the global structure of natural images. It also serves as a universal black-box attack against models trained with the legacy ImageNet training recipe.

On the Interplay of Convolutional Padding and Adversarial Robustness

Paul Gavrikov, Janis Keuper
ICCV Workshops, 2023
paper / arXiv

Our study examines the relationship between padding in Convolutional Neural Networks (CNNs) and vulnerabilities to adversarial attacks. We show that adversarial attacks result in different perturbation anomalies at image boundaries depending on the padding mode and discuss which mode is best for adversarial settings. (Spoiler: it's zero padding.)

An Extended Study of Human-like Behavior under Adversarial Training

Paul Gavrikov, Janis Keuper, Margret Keuper
CVPR Workshops, 2023
paper / arXiv

Our analysis reveals how different forms of adversarial training (AT) affect human-like behavior of CNNs and Transformers. Additionally, we propose a hypothesis of why AT increases shape bias and in which scenarios it can improve out-of-distribution generalization from a frequency perspective.

The Power of Linear Combinations: Learning with Random Convolutions

Paul Gavrikov, Janis Keuper
In Review
arXiv

We question whether learning spatial convolution filters is necessary. Even with default i.i.d. random initializations, we can achieve 75.66% validation accuracy with ResNet-50 on ImageNet without ever learning any spatial convolution weight. Additionally, random filters can be more robust against adversarial attacks than learned filters.

Does Medical Imaging learn different Convolution Filters?

Paul Gavrikov, Janis Keuper
NeurIPS Workshops, 2022
arXiv

Earlier, we showed that, on average, convolution filters only show minor drifts when comparing various dimensions, including the learned task, image domain, or dataset. However, medical imaging models showed significant outliers through spiky filter distributions. We revisit this observation and perform an in-depth analysis of medical imaging models.

Robust Models are less Over-Confident

Julia Grabinski, Paul Gavrikov, Janis Keuper, Margret Keuper
NeurIPS, 2022 (initially at ICML Workshops, 2022)
paper / arXiv / code

Adversarial Training (AT) leads to models that are significantly less overconfident in their decisions, even on clean data, than non-robust models. The analysis shows that not only AT but also the models' building blocks (like activation functions and pooling) have a strong influence on the models' prediction confidences.

Adversarial Robustness through the Lens of Convolutional Filters

Paul Gavrikov, Janis Keuper
CVPR Workshops, 2022
paper / arXiv / code

We investigate 3x3 convolution filters that form in adversarially-trained robust models and find that these models form more diverse, less sparse, and more orthogonal filters than their normal counterparts. The largest differences are found in the deepest layers and in the very first convolution layer, which forms highly distinct thresholding filters.

CNN Filter DB: An Empirical Investigation of Trained Convolutional Filters

Paul Gavrikov, Janis Keuper
CVPR, 2022 (Oral Presentation)
paper / arXiv / code

We collected and publicly released a dataset with over 1.4 billion 3x3 convolution filters from hundreds of trained CNNs. Our observations show that, surprisingly, models learn highly similar filter pattern distributions independent of task and dataset, but differ by model architecture. We also propose methods to measure the quality of filters to detect overparameterization or underfitting and show that many publicly available models suffer from "degenerated" filters.

An Empirical Investigation of Model-to-Model Distribution Shifts in Trained Convolutional Filters

Paul Gavrikov, Janis Keuper
NeurIPS Workshops, 2021
paper / arXiv / code

This paper looks at distribution shifts in filter weights used for various computer vision tasks. We collected data from hundreds of trained CNNs and analyzed the distribution shifts along different axes of meta-parameters. We found interesting distribution shifts between trained filters, and argue that this is a valuable source for further investigation into understanding the impact of shifts in the input data on the generalization abilities of CNN models.

A Low Power and Low Latency Scan Algorithm for Bluetooth Low Energy Radios with Energy Detection Mechanisms

Paul Gavrikov, Matthias Lai, Thomas Wendt
APWiMob, 2019 (Best Paper Award)
paper

A new, more efficient algorithm for Bluetooth scanning is presented that uses less power and can scale with incoming network traffic. The algorithm does not require any changes to advertisers, so it is compatible with existing devices, and performance evaluation shows that it is more efficient than existing methods.

Exploring non-idealities in real device implementations of Bluetooth Mesh

Paul Gavrikov, Matthias Lai, Thomas Wendt
Wireless Telecommunications Symposium 2019 (presentation) and IJITN
paper

We compare the performance of Bluetooth Mesh implementations on real chipsets against the ideal implementation of the specification. We show that real chipsets exhibit non-idealities in both the underlying Bluetooth Low Energy specification and the implementation of Mesh, which introduce unruly transmission and reception behavior. These effects impact transmission rate, reception rate, and latency, and have an even more significant impact on average power consumption.

Using Bluetooth Low Energy to trigger a robust ultra-low power FSK wake-up receiver

Paul Gavrikov, Pascal E. Verboket, Tolgay Ungan, Markus Müller, Matthias Lai, Christian Schindelhauer, Leonhard M. Reindl, Thomas Wendt
ICECS, 2018
paper

We discuss a new approach to using BLE packets to create an FSK-like addressable wake-up packet. A wake-up receiver system was developed from off-the-shelf components to detect these packets. This system is more robust than traditional OOK wake-up systems and has a sensitivity of -47.8 dBm at a power consumption of 18.5 µW during passive listening. The system has a latency of 31.8 ms with a symbol rate of 1437 baud.

Academic Career

Honors
  • Outstanding reviewer (NeurIPS 2023)
  • Outstanding reviewer (NeurIPS 2022 Datasets and Benchmarks track)
  • Accepted into the CIFAR and Vector Institute DLRL 2023 Summer School in Montreal
  • Accepted into and attended the ICVSS 2023 Summer School in Sicily
  • Top-20 at the CVPR 2022 Art of Robustness Challenge
  • Best Paper Award at APWiMob 2019

Conference Reviewer
  • NeurIPS (2023-2024)
  • NeurIPS Datasets and Benchmarks Track (2022-2024)
  • CVPR (2023-2024)
  • WACV (2024)
  • ECCV (2024)
  • ICML (2024)
  • ICLR (2024)
  • ICCV (2023)
  • BMVC (2023)

Journal Reviewer
  • Springer Machine Learning (2023)


  • Design and source code from Jon Barron's website