Interim research report from the first 4 weeks of the MATS Program Winter 2025 Cohort. The project is supervised by Marius Hobbhahn.
Summary
Our goals
Current findings
---
Outline:
(00:16) Summary
(02:50) Motivation
(04:18) Methodology
(04:21) Overview
(06:58) Selecting scenarios
(07:54) Finding a model's P(Evaluation)
(10:25) Main results
(12:12) 1) Correlation between a model's realness belief and ground-truth
(14:16) 2) Correlations between models
(14:57) 3) Plausibility Question (PQ) performance
(18:13) 4) Which features influence the model's realness-belief?
(19:17) LASSO regression
(23:02) SHAP analysis
(24:04) Limitations
(24:47) Appendix
(24:50) More examples of PQs (taken from the calibration plot)
(27:42) Further examples of feature calculation
---
First published:
February 17th, 2025
Source:
https://www.lesswrong.com/posts/yTameAzCdycci68sk/do-models-know-when-they-are-being-evaluated
Narrated by TYPE III AUDIO.