Publications

FACADE: A Framework for Adversarial Circuit Anomaly Detection and Evaluation.

Published in ICML Workshops (Adversarial ML Frontiers), 2023

We present FACADE, a novel probabilistic and geometric framework designed for unsupervised mechanistic anomaly detection in deep neural networks.

Recommended citation: Pai, DB, Carranza, A, Tandon, A, Schaeffer, R, Koyejo, S. “FACADE: A Framework for Adversarial Circuit Anomaly Detection and Evaluation.” Adversarial ML Frontiers, ICML Workshops, Jun 20, 2023. https://openreview.net/forum?id=4j8KuZOmQH

Deceptive Alignment Monitoring

Published in ICML Workshops (Adversarial ML Frontiers), 2023

We propose a new paradigm of adversarial machine learning, deceptive alignment monitoring, in which mechanistically anomalous model behavior serves as a basis for model misalignment, and outline a variety of new research directions in the field.

Recommended citation: Pai, DB, Carranza, A, Schaeffer, R, Koyejo, S. “Deceptive Alignment Monitoring.” Adversarial ML Frontiers, ICML Workshops, Jun 20, 2023. https://openreview.net/forum?id=obsO44GFhh

Mapping the genealogy of medical device predicates in the United States

Published in PLOS One, 2021

Mapping the 510(k) predicate system.

Recommended citation: Pai, DB. “Mapping the genealogy of medical device predicates in the United States.” PLOS ONE, Oct 7, 2021. DOI:10.1371/journal.pone.0258153. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0258153

Dhruv Pai
