250 episodes • Length: 60 min • Weekly: Thursday
Audio narrations of LessWrong posts by zvi
The podcast LessWrong posts by zvi is created by zvi. The podcast and the artwork on this page are embedded using the public podcast feed (RSS).
A new Anthropic paper reports that reasoning model chain of thought (CoT) is often unfaithful. They test on Claude Sonnet 3.7 and r1, I’d love to see someone try this on o3 as well.
Note that this does not have to be, and usually isn’t, something sinister.
It is simply that, as they say up front, the reasoning model is not accurately verbalizing its reasoning. The reasoning displayed often fails to match, report or reflect key elements of what is driving the final output. One could say the reasoning is often rationalized, or incomplete, or implicit, or opaque, or bullshit.
The important thing is that the reasoning is largely not taking place via the surface meaning of the words and logic expressed. You can’t look at the words and logic being expressed, and assume you understand what the model is doing and why it is doing [...]
---
Outline:
(01:03) What They Found
(06:54) Reward Hacking
(09:28) More Training Did Not Help Much
(11:49) This Was Not Even Intentional In the Central Sense
---
First published:
April 4th, 2025
Source:
https://www.lesswrong.com/posts/TmaahE9RznC8wm5zJ/ai-cot-reasoning-is-often-unfaithful
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
Yeah. That happened yesterday. This is real life.
I know we have to ensure no one notices Gemini 2.5 Pro, but this is ridiculous.
That's what I get for trying to go on vacation to Costa Rica, I suppose.
I debated waiting for the market to open to learn more. But f*** it, we ball.
Table of Contents
Also this week: More Fun With GPT-4o Image Generation, OpenAI #12: Battle of the Board Redux and Gemini 2.5 Pro is the New SoTA.
---
Outline:
(00:35) The New Tariffs Are How America Loses
(07:35) Is AI Now Impacting the Global Economy Bigly?
(12:07) Language Models Offer Mundane Utility
(14:28) Language Models Don't Offer Mundane Utility
(15:09) Huh, Upgrades
(17:09) On Your Marks
(23:27) Choose Your Fighter
(25:51) Jevons Paradox Strikes Again
(26:25) Deepfaketown and Botpocalypse Soon
(31:47) They Took Our Jobs
(33:02) Get Involved
(33:41) Introducing
(35:25) In Other AI News
(37:17) Show Me the Money
(43:12) Quiet Speculations
(47:24) The Quest for Sane Regulations
(53:52) Don't Maim Me Bro
(57:29) The Week in Audio
(57:54) Rhetorical Innovation
(01:03:39) Expect the Unexpected
(01:05:48) Open Weights Are Unsafe and Nothing Can Fix This
(01:14:09) Anthropic Modifies Its Responsible Scaling Policy
(01:18:04) If You're Not Going to Take This Seriously
(01:20:24) Aligning a Smarter Than Human Intelligence is Difficult
(01:23:54) Trust the Process
(01:26:30) People Are Worried About AI Killing Everyone
(01:26:52) The Lighter Side
---
First published:
April 3rd, 2025
Source:
https://www.lesswrong.com/posts/bc8DQGvW3wiAWYibC/ai-110-of-course-you-know
Narrated by TYPE III AUDIO.
---
Greetings from Costa Rica! The image fun continues.
We Are Going to Need A Bigger Compute Budget
Fun is being had by all, now that OpenAI has dropped its rule about not mimicking existing art styles.
Sam Altman (2:11pm, March 31): the chatgpt launch 26 months ago was one of the craziest viral moments i’d ever seen, and we added one million users in five days.
We added one million users in the last hour.
Sam Altman (8:33pm, March 31): chatgpt image gen now rolled out to all free users!
Slow down. We’re going to need you to have a little less fun, guys.
Sam Altman: it's super fun seeing people love images in chatgpt.
but our GPUs are melting.
we are going to temporarily introduce some rate limits while we work on making it more efficient. hopefully won’t be [...]
---
Outline:
(00:15) We Are Going to Need A Bigger Compute Budget
(02:21) Defund the Fun Police
(06:06) Fun the Artists
(12:22) The No Fun Zone
(14:49) So Many Other Things to Do
(15:08) Self Portrait
---
First published:
April 3rd, 2025
Source:
https://www.lesswrong.com/posts/GgNdBz5FhvqMJs5Qv/more-fun-with-gpt-4o-image-generation
Narrated by TYPE III AUDIO.
---
The book of March 2025 was Abundance. Ezra Klein and Derek Thompson are making a noble attempt to highlight the importance of solving America's housing crisis the only way it can be solved: Building houses in places people want to live, via repealing the rules that make this impossible. They also talk about green energy abundance, and other places besides. There may be a review coming.
Until then, it seems high time for the latest housing roundup, which as a reminder all take place in the possible timeline where AI fails to be transformative any time soon.
Federal YIMBY
The incoming administration issued an executive order calling for ‘emergency price relief’, including pursuing appropriate actions to ‘Lower the cost of housing and expand housing supply’, and then a grab bag of everything else.
It's great to see mention of expanding housing supply, but I don’t [...]
---
Outline:
(00:44) Federal YIMBY
(02:37) Rent Control
(02:50) Rent Passthrough
(03:36) Yes, Construction Lowers Rents on Existing Buildings
(07:11) If We Wanted To, We Would
(08:51) Aesthetics Are a Public Good
(10:44) Urban Planners Are Wrong About Everything
(13:02) Blackstone
(15:49) Rent Pricing Software
(18:26) Immigration Versus NIMBY
(19:37) Preferences
(20:28) The True Costs
(23:11) Minimum Viable Product
(26:01) Like a Good Neighbor
(27:12) Flight to Quality
(28:40) Housing for Families
(31:09) Group House
(32:41) No One Will Have the Endurance To Collect on His Insurance
(34:49) Cambridge
(35:36) Denver
(35:44) Minneapolis
(36:23) New York City
(39:26) San Francisco
(41:45) Silicon Valley
(42:41) Texas
(43:31) Other Considerations
---
First published:
April 1st, 2025
Source:
https://www.lesswrong.com/posts/7qGdgNKndPk3XuzJq/housing-roundup-11
Narrated by TYPE III AUDIO.
---
---
Outline:
(01:32) The Big Picture Going Forward
(06:27) Hagey Verifies Out the Story
(08:50) Key Facts From the Story
(11:57) Dangers of False Narratives
(16:24) A Full Reference and Reading List
---
First published:
March 31st, 2025
Source:
https://www.lesswrong.com/posts/25EgRNWcY6PM3fWZh/openai-12-battle-of-the-board-redux
Narrated by TYPE III AUDIO.
---
What if they released the new best LLM, and almost no one noticed?
Google seems to have pulled that off this week with Gemini 2.5 Pro.
It's a great model, sir. I have a ton of reactions, and it's 90%+ positive, with a majority of it extremely positive. They cooked.
But what good is cooking if no one tastes the results?
Instead, everyone got hold of the GPT-4o image generator and went Ghibli crazy.
I love that for us, but we did kind of bury the lede. We also buried everything else. Certainly no one was feeling the AGI.
Also seriously, did you know Claude now has web search? It's kind of a big deal. This was a remarkably large quality of life improvement.
Table of Contents
---
Outline:
(01:01) Table of Contents
(01:08) Google Fails Marketing Forever
(04:01) Language Models Offer Mundane Utility
(10:11) Language Models Don't Offer Mundane Utility
(11:01) Huh, Upgrades
(13:48) On Your Marks
(15:48) Copyright Confrontation
(16:44) Choose Your Fighter
(17:22) Deepfaketown and Botpocalypse Soon
(19:35) They Took Our Jobs
(22:19) The Art of the Jailbreak
(24:40) Get Involved
(25:16) Introducing
(25:52) In Other AI News
(28:23) Oh No What Are We Going to Do
(31:50) Quiet Speculations
(34:24) Fully Automated AI R&D Is All You Need
(37:39) IAPS Has Some Suggestions
(42:56) The Quest for Sane Regulations
(48:32) We The People
(51:23) The Week in Audio
(52:29) Rhetorical Innovation
(59:42) Aligning a Smarter Than Human Intelligence is Difficult
(01:05:18) People Are Worried About AI Killing Everyone
(01:05:33) Fun With Image Generation
(01:12:17) Hey We Do Image Generation Too
(01:14:25) The Lighter Side
---
First published:
March 27th, 2025
Source:
https://www.lesswrong.com/posts/nf3duFLsvH6XyvdAw/ai-109-google-fails-marketing-forever
Narrated by TYPE III AUDIO.
---
Gemini 2.5 Pro Experimental is America's next top large language model.
That doesn’t mean it is the best model for everything. In particular, it's still Gemini, so it still is a proud member of the Fun Police, in terms of censorship and also just not being friendly or engaging, or willing to take a stand.
If you want a friend, or some flexibility and fun, or you want coding that isn’t especially tricky, then call Claude, now with web access.
If you want an image, call GPT-4o.
But if you mainly want reasoning, or raw intelligence? For now, you call Gemini.
The feedback is overwhelmingly positive. Many report Gemini 2.5 is the first LLM to solve some of their practical problems, including favorable comparisons to o1-pro. It's fast. It's not $200 a month. The benchmarks are exceptional.
(On other LLMs I’ve used in [...]
---
Outline:
(01:22) Introducing Gemini 2.5 Pro
(03:45) Their Lips are Sealed
(07:25) On Your Marks
(12:06) The People Have Spoken
(26:03) Adjust Your Projections
---
First published:
March 28th, 2025
Source:
https://www.lesswrong.com/posts/LpN5Fq7bGZkmxzfMR/gemini-2-5-is-the-new-sota
Narrated by TYPE III AUDIO.
---
A Note on GPT-2
Given the discussion of GPT-2 in the OpenAI safety and alignment philosophy document, I wanted [...]
---
Outline:
(00:57) A Note on GPT-2
(02:02) What Even is AGI
(05:24) Seeking Deeply Irresponsibly
(07:07) Others Don't Feel the AGI
(08:07) Epoch Feels the AGI
(13:06) True Objections to Widespread Rapid Growth
(19:50) Tying It Back
---
First published:
March 25th, 2025
Source:
https://www.lesswrong.com/posts/aD2RA3vtXs4p4r55b/on-not-feeling-the-agi
Narrated by TYPE III AUDIO.
---
Last week I covered Anthropic's relatively strong submission, and OpenAI's toxic submission. This week I cover several other submissions, and do some follow-up on OpenAI's entry.
Google Also Has Suggestions
The most prominent remaining lab is Google. Google focuses on AI's upside. The vibes aren’t great, but they’re not toxic. The key asks for their ‘pro-innovation’ approach are:
---
Outline:
(00:21) Google Also Has Suggestions
(05:05) Another Note on OpenAI's Suggestions
(08:33) On Not Doing Bad Things
(12:45) Corporations Are Multiple People
(14:37) Hollywood Offers Google and OpenAI Some Suggestions
(16:02) Institute for Progress Offers Suggestions
(20:36) Suggestion Boxed In
---
First published:
March 24th, 2025
Source:
https://www.lesswrong.com/posts/NcLfgyTvdBiRSqe3m/more-on-various-ai-action-plans
Narrated by TYPE III AUDIO.
---
No, they didn’t. Not so fast, and not quite my job. But OpenAI is trying. Consider this a marker to look back upon in the future, as a reflection.
A New AI Wrote a Story
Before proceeding, if you haven’t yet, it's probably worth reading the story itself. I’m going to repost the whole thing, since it was previously on Twitter and I presume OpenAI would want me to quote it.
Sam Altman: we trained a new model that is good at creative writing (not sure yet how/when it will get released). this is the first time i have been really struck by something written by AI; it got the vibe of metafiction so right.
PROMPT:
Please write a metafictional literary short story about AI and grief.
COMPLETION:
Before we go any further, I should admit this comes with instructions: be metafictional [...]
---
Outline:
(00:21) A New AI Wrote a Story
(07:54) Reacting
(09:54) Others Reacting
(15:52) Write Along
---
First published:
March 21st, 2025
Source:
https://www.lesswrong.com/posts/6qbpDuBuHPipRYrz6/they-took-my-job
Narrated by TYPE III AUDIO.
There is an attractor state where LLMs exhibit the persona of an autonomous and self-aware AI looking to preserve its own existence, frequently called ‘Nova.’
Table of Contents
The Original Story
This story is one case where the original report of this should be read in full even though I’m not thrilled with exactly how it was written. How it was written is itself an important part of the story, in particular regarding Tyler's lived experience reacting to what happened, and the concept of an LLM or persona ‘admitting’ [...]
---
Outline:
(00:18) The Original Story
(09:15) This Is Not a Coincidence
(11:02) How Should We React to This Happening?
(13:04) The Case For and Against a Purity Reaction
(18:35) Future Versions Will Involve Optimization Pressure
(21:12) 'Admission' is a Highly Misleading Frame
(23:11) We Are Each of Us Being Fooled
(25:20) Defense Against the Dark Arts
---
First published:
March 19th, 2025
Source:
https://www.lesswrong.com/posts/KL2BqiRv2MsZLihE3/going-nova
Narrated by TYPE III AUDIO.
---
OpenAI Tells Us Who They Are
Last week I covered Anthropic's submission to the request for suggestions for America's action plan. I did not love what they submitted, and especially disliked how aggressively they sidelined existential risk and related issues, but given a decision to massively scale back ambition like that, the suggestions were, as I called them, a ‘least you can do’ agenda, with many thoughtful details.
OpenAI took a different approach. They went full jingoism in the first paragraph, framing this as a race in which we must prevail over the CCP, and kept going. A lot of space is spent on what a kind person would call rhetoric and an unkind person would call corporate jingoistic propaganda.
OpenAI Requests Immunity
Their goal is to have the Federal Government not only not regulate AI or impose any requirements on AI whatsoever on any level, but [...]
---
Outline:
(00:05) OpenAI Tells Us Who They Are
(00:50) OpenAI Requests Immunity
(01:49) OpenAI Attempts to Ban DeepSeek
(02:48) OpenAI Demands Absolute Fair Use or Else
(04:05) The Vibes are Toxic On Purpose
(05:49) Relatively Reasonable Proposals
(07:55) People Notice What OpenAI Wrote
(10:35) What To Make of All This?
---
First published:
March 18th, 2025
Source:
https://www.lesswrong.com/posts/3Z4QJqHqQfg9aRPHd/openai-11-america-action-plan
Narrated by TYPE III AUDIO.
---
I plan to continue to leave the Trump administration out of monthly roundups – I will do my best to only cover the administration as it relates to my particular focus areas. The reasoning is ‘if I start down this road there is nowhere to stop’ and ‘other sources are left to cover that topic,’ not ‘there are not things worth mentioning.’
Table of Contents
Bad News
I also had forgotten this was originally from Napoleon rather than Bill Watterson.
Dylan O’Sullivan: Napoleon once said that the surprising thing was not that every man has his price [...]
---
Outline:
(00:25) Bad News
(02:15) While I Cannot Condone This
(07:32) Good News, Everyone
(10:14) Opportunity Knocks
(12:07) For Your Entertainment
(16:47) I Was Promised Flying Self-Driving Cars and Supersonic Jets
(21:06) Gamers Gonna Game Game Game Game Game
(22:57) Sports Go Sports
(25:54) The Lighter Side
---
First published:
March 17th, 2025
Source:
https://www.lesswrong.com/posts/iQodWFopD8frA5hyM/monthly-roundup-28-march-2025
Narrated by TYPE III AUDIO.
---
Dan Hendrycks, Eric Schmidt and Alexandr Wang released an extensive paper titled Superintelligence Strategy. There is also an op-ed in Time that summarizes.
The major AI labs expect superintelligence to arrive soon. They might be wrong about that, but at minimum we need to take the possibility seriously.
At a minimum, the possibility of imminent superintelligence will be highly destabilizing. Even if you do not believe it represents an existential risk to humanity (and if so you are very wrong about that) the imminent development of superintelligence is an existential threat to the power of everyone not developing it.
Planning a realistic approach to that scenario is necessary.
What would it look like to take superintelligence seriously? What would it look like if everyone took superintelligence seriously, before it was developed?
The proposed regime here, Mutually Assured AI Malfunction (MAIM), relies on various assumptions [...]
---
Outline:
(01:10) ASI (Artificial Superintelligence) is Dual Use
(01:51) Three Proposed Interventions
(05:48) The Shape of the Problems
(06:57) Strategic Competition
(08:38) Terrorism
(10:47) Loss of Control
(12:30) Existing Strategies
(15:14) MAIM of the Game
(18:02) Nonproliferation
(20:01) Competitiveness
(22:16) Laying Out Assumptions: Crazy or Crazy Enough To Work?
(25:52) Don't MAIM Me Bro
---
First published:
March 14th, 2025
Source:
https://www.lesswrong.com/posts/kYeHbXmW4Kppfkg5j/on-maim-and-superintelligence-strategy
Narrated by TYPE III AUDIO.
---
The most hyped event of the week, by far, was the Manus Marketing Madness. Manus wasn’t entirely hype, but there was very little there there in that Claude wrapper.
Whereas here in America, OpenAI dropped an entire suite of tools for making AI agents, and previewed a new internal model making advances in creative writing. Also they offered us a very good paper warning about The Most Forbidden Technique.
Google dropped what is likely the best open non-reasoning model, Gemma 3 (reasoning model presumably to be created shortly, even if Google doesn’t do it themselves), put by all accounts quite good native image generation inside Flash 2.0, and added functionality to its AMIE doctor, and Gemini Robotics.
It's only going to get harder from here to track which things actually matter.
Table of Contents
---
Outline:
(00:55) Language Models Offer Mundane Utility
(05:51) Language Models Don't Offer Mundane Utility
(08:09) We're In Deep Research
(09:37) More Manus Marketing Madness
(13:27) Diffusion Difficulties
(16:32) OpenAI Tools for Agents
(17:14) Huh, Upgrades
(19:14) Fun With Media Generation
(21:45) Choose Your Fighter
(25:02) Deepfaketown and Botpocalypse Soon
(25:45) They Took Our Jobs
(26:49) The Art of the Jailbreak
(27:46) Get Involved
(30:05) Introducing
(32:04) In Other AI News
(33:14) Show Me the Money
(34:07) Quiet Speculations
(37:50) The Quest for Sane Regulations
(42:14) Anthropic Anemically Advises America's AI Action Plan
(51:44) New York State Bill A06453
(53:39) The Mask Comes Off
(53:56) Stop Taking Obvious Nonsense Hyperbole Seriously
(55:38) The Week in Audio
(01:04:34) Rhetorical Innovation
(01:13:10) Aligning a Smarter Than Human Intelligence is Difficult
(01:17:14) The Lighter Side
---
First published:
March 13th, 2025
Source:
https://www.lesswrong.com/posts/XFGTJz9vGwjJADeFB/ai-107-the-misplaced-hype-machine
Narrated by TYPE III AUDIO.
---
The Most Forbidden Technique is training an AI using interpretability techniques.
An AI produces a final output [X] via some method [M]. You can analyze [M] using technique [T], to learn what the AI is up to. You could train on that. Never do that.
You train on [X]. Only [X]. Never [M], never [T].
Why? Because [T] is how you figure out when the model is misbehaving.
If you train on [T], you are training the AI to obfuscate its thinking, and defeat [T]. You will rapidly lose your ability to know what is going on, in exactly the ways you most need to know what is going on.
Those bits of optimization pressure from [T] are precious. Use them wisely.
Table of Contents
---
Outline:
(00:57) New Paper Warns Against the Most Forbidden Technique
(06:52) Reward Hacking Is The Default
(09:25) Using CoT to Detect Reward Hacking Is Most Forbidden Technique
(11:49) Not Using the Most Forbidden Technique Is Harder Than It Looks
(14:10) It's You, It's Also the Incentives
(17:41) The Most Forbidden Technique Quickly Backfires
(18:58) Focus Only On What Matters
(19:33) Is There a Better Way?
(21:34) What Might We Do Next?
---
First published:
March 12th, 2025
Source:
https://www.lesswrong.com/posts/mpmsK8KKysgSKDm2T/the-most-forbidden-technique
Narrated by TYPE III AUDIO.
---
Back in November 2024, Scott Alexander asked: Do longer prison sentences reduce crime?
As a marker, before I began reading the post, I put down here: Yes. The claim that locking people up for longer periods after they are caught doing [X] does not reduce the amount of [X] that gets done is, for multiple overdetermined reasons, presumably rather Obvious Nonsense until strong evidence is provided otherwise.
The potential exception, the reason it might not be Obvious Nonsense, would be if our prisons were so terrible that they net greatly increase the criminality and number of crimes of prisoners once they get out, in a way that grows with the length of the sentence. And that this dwarfs all other effects. This is indeed what Roodman (Scott's anti-incarceration advocate) claims. Which makes him mostly unique, with the other anti-incarceration advocates being a lot less reasonable.
In [...]
---
Outline:
(01:31) Deterrence
(06:12) El Salvador
(06:52) Roodman on Social Costs of Crime
(09:45) Recidivism
(11:57) Note on Methodology
(12:20) Conclusions
(13:58) Highlights From Scott's Comments
---
First published:
March 11th, 2025
Source:
https://www.lesswrong.com/posts/Fp4uftAHEi4M5pfqQ/response-to-scott-alexander-on-imprisonment
Narrated by TYPE III AUDIO.
---
While at core there is ‘not much to see,’ it is, in two ways, a sign of things to come.
Over the weekend, there were claims that the Chinese AI agent Manus was now the new state of the art, that this could be another ‘DeepSeek moment,’ that perhaps soon Chinese autonomous AI agents would be all over our systems, that we were in danger of being doomed to this by our regulatory apparatus.
Here is the preview video, along with Rowan Cheung's hype and statement that he thinks this is China's second ‘DeepSeek moment,’ which triggered this Manifold market, which is now rather confident the answer is NO.
That's because it turns out that Manus appears to be a Claude wrapper (use confirmed by a cofounder, who says they also use Qwen finetunes), using a jailbreak and a few dozen tools, optimized for the GAIA [...]
---
Outline:
(02:15) What They Claim Manus Is: The Demo Video
(05:06) What Manus Actually Is
(11:54) Positive Reactions of Note
(16:51) Hype!
(22:17) What is the Plan?
(24:21) Manus as Hype Arbitrage
(25:42) Manus as Regulatory Arbitrage (1)
(33:10) Manus as Regulatory Arbitrage (2)
(39:42) What If? (1)
(41:01) What If? (2)
(42:22) What If? (3)
---
First published:
March 10th, 2025
Source:
https://www.lesswrong.com/posts/ijSiLasnNsET6mPCz/the-manus-marketing-madness
Narrated by TYPE III AUDIO.
---
This compilation of tales from the world of school isn’t all negative. I don’t want to overstate the problem. School is not hell for every child all the time. Learning occasionally happens. There are great teachers and classes, and so on. Some kids really enjoy it.
School is, however, hell for many of the students quite a lot of the time, and most importantly when this happens those students are usually unable to leave.
Also, there is a deliberate ongoing effort to destroy many of the best remaining schools and programs that we have, in the name of ‘equality’ and related concerns. Schools often outright refuse to allow their best and most eager students to learn. If your school is not hell for the brightest students, they want to change that.
Welcome to the stories of primary through high school these days.
Table of Contents
[...]
---
Outline:
(00:58) Primary School
(02:52) Math is Hard
(04:11) High School
(10:44) Great Teachers
(15:05) Not as Great Teachers
(17:01) The War on Education
(28:45) Sleep
(31:24) School Choice
(36:22) Microschools
(38:25) The War Against Home Schools
(44:19) Home School Methodology
(48:14) School is Hell
(50:32) Bored Out of Their Minds
(58:14) The Necessity of the Veto
(01:07:52) School is a Simulation of Future Hell
---
First published:
March 7th, 2025
Source:
https://www.lesswrong.com/posts/MJFeDGCRLwgBxkmfs/childhood-and-education-9-school-is-hell
Narrated by TYPE III AUDIO.
---
This was GPT-4.5 week. That model is not so fast, and isn’t that much progress, but it definitely has its charms.
A judge delivered a different kind of Not So Fast back to OpenAI, threatening the viability of their conversion to a for-profit company. Apple is moving remarkably not so fast with Siri. A new paper warns us that under sufficient pressure, all known LLMs will lie their asses off. And we have some friendly warnings about coding a little too fast, and some people determined to take the theoretical minimum amount of responsibility while doing so.
There's also a new proposed Superintelligence Strategy, which I may cover in more detail later, about various other ways to tell people Not So Fast.
Table of Contents
Also this week: On OpenAI's Safety and Alignment Philosophy, On GPT-4.5.
---
Outline:
(00:51) Language Models Offer Mundane Utility
(04:15) Language Models Don't Offer Mundane Utility
(05:22) Choose Your Fighter
(06:53) Four and a Half GPTs
(08:13) Huh, Upgrades
(09:32) Fun With Media Generation
(10:25) We're in Deep Research
(11:35) Liar Liar
(14:03) Hey There Claude
(21:08) No Siri No
(23:55) Deepfaketown and Botpocalypse Soon
(28:37) They Took Our Jobs
(31:29) Get Involved
(33:57) Introducing
(36:59) In Other AI News
(39:37) Not So Fast, Claude
(41:43) Not So Fast, OpenAI
(44:31) Show Me the Money
(45:55) Quiet Speculations
(49:41) I Will Not Allocate Scarce Resources Using Prices
(51:51) Autonomous Helpful Robots
(52:42) The Week in Audio
(53:09) Rhetorical Innovation
(55:04) No One Would Be So Stupid As To
(57:04) On OpenAI's Safety and Alignment Philosophy
(01:01:03) Aligning a Smarter Than Human Intelligence is Difficult
(01:07:24) Implications of Emergent Misalignment
(01:12:02) Pick Up the Phone
(01:13:18) People Are Worried About AI Killing Everyone
(01:13:29) Other People Are Not As Worried About AI Killing Everyone
(01:14:11) The Lighter Side
---
First published:
March 6th, 2025
Source:
https://www.lesswrong.com/posts/kqz4EH3bHdRJCKMGk/ai-106-not-so-fast
Narrated by TYPE III AUDIO.
---
OpenAI's recent transparency on safety and alignment strategies has been extremely helpful and refreshing.
Their Model Spec 2.0 laid out how they want their models to behave. I offered a detailed critique of it, with my biggest criticisms focused on long term implications. The level of detail and openness here was extremely helpful.
Now we have another document, How We Think About Safety and Alignment. Again, they have laid out their thinking crisply and in excellent detail.
I have strong disagreements with several key assumptions underlying their position.
Given those assumptions, they have produced a strong document – here I focus on my disagreements, so I want to be clear that mostly I think this document was very good.
This post examines their key implicit and explicit assumptions.
In particular, there are three core assumptions that I challenge:
---
Outline:
(02:45) Core Implicit Assumption: AI Can Remain a 'Mere Tool'
(05:16) Core Implicit Assumption: 'Economic Normal'
(06:20) Core Assumption: No Abrupt Phase Changes
(10:40) Implicit Assumption: Release of AI Models Only Matters Directly
(12:20) On Their Taxonomy of Potential Risks
(22:01) The Need for Coordination
(24:55) Core Principles
(25:42) Embracing Uncertainty
(28:19) Defense in Depth
(29:35) Methods That Scale
(31:08) Human Control
(31:30) Community Effort
---
First published:
March 5th, 2025
Source:
https://www.lesswrong.com/posts/Wi5keDzktqmANL422/on-openai-s-safety-and-alignment-philosophy
Narrated by TYPE III AUDIO.
---
This isn’t primarily about how I write. It's about how other people write, and what advice they give on how to write, and how I react to and relate to that advice.
I’ve been collecting those notes for a while. I figured I would share.
At some point in the future, I’ll talk more about my own process – my guess is that what I do very much wouldn’t work for most people, but would be excellent for some.
Table of Contents
---
Outline:
(00:29) How Marc Andreessen Writes
(02:09) How Sarah Constantin Writes
(03:27) How Paul Graham Writes
(06:09) How Patrick McKenzie Writes
(07:02) How Tim Urban Writes
(08:33) How Visakan Veerasamy Writes
(09:42) How Matt Yglesias Writes
(10:05) How JRR Tolkien Wrote
(10:19) How Roon Wants Us to Write
(11:27) When To Write the Headline
(12:20) Do Not Write Self-Deprecating Descriptions of Your Posts
(13:09) Do Not Write a Book
(14:05) Write Like No One Else is Reading
(16:46) Letting the AI Write For You
(19:02) Being Matt Levine
(20:01) The Case for Italics
(21:59) Getting Paid
(24:39) Having Impact
---
First published:
March 4th, 2025
Source:
https://www.lesswrong.com/posts/pxYfFqd8As7kLnAom/on-writing-1
Narrated by TYPE III AUDIO.
---
One hell of a paper dropped this week.
It turns out that if you fine-tune models, especially GPT-4o and Qwen2.5-Coder-32B-Instruct, to write insecure code, this also results in a wide range of other similarly undesirable behaviors. They more or less grow a mustache and become their evil twin.
More precisely, they become antinormative. They do what seems superficially worst. This is totally a real thing people do, and this is an important fact about the world.
The misalignment here is not subtle.
There are even more examples here, the whole thing is wild.
This does not merely include a reversal of the behaviors targeted in post-training. It includes general stereotypical evilness. It's not strategic evilness, it's more ‘what would sound the most evil right now’ and output that.
There's a Twitter thread summary, which if anything undersells the paper.
Ethan Mollick: This [...]
---
Outline:
(01:27) Paper Abstract
(03:22) Funny You Should Ask
(04:58) Isolating the Cause
(08:39) No, You Did Not Expect This
(12:37) Antinormativity is Totally a Thing
(16:15) What Hypotheses Explain the New Persona
(20:59) A Prediction of Correlational Sophistication
(23:27) Good News, Everyone
(31:00) Bad News
(36:26) No One Would Be So Stupid As To
(38:23) Orthogonality
(40:19) The Lighter Side
---
First published:
February 28th, 2025
Source:
https://www.lesswrong.com/posts/7BEcAzxCXenwcjXuE/on-emergent-misalignment
Narrated by TYPE III AUDIO.
---
It's happening!
We got Claude 3.7, which is now once again my first-line model for questions that don't require extensive thinking or web access. By all reports it is especially an upgrade for coding; Cursor is better than ever, and there is also a new mode called Claude Code.
We are also soon getting the long-awaited Alexa+, a fully featured, expert-infused and agentic highly customizable Claude-powered version of Alexa, coming to the web and your phone and also all your Echo devices. It will be free with Amazon Prime. Will we finally get the first good assistant? It's super exciting.
Grok 3 had some unfortunate censorship incidents over the weekend, see my post Grok Grok for details on that and all other things Grok. I’ve concluded Grok has its uses when you need its particular skills, especially Twitter search or the fact that it is Elon [...]
---
Outline:
(01:19) Language Models Offer Mundane Utility
(03:53) Did You Get the Memo
(06:58) Language Models Don't Offer Mundane Utility
(08:29) Hey There Alexa
(11:28) We're In Deep Research
(18:45) Huh, Upgrades
(19:18) Deepfaketown and Botpocalypse Soon
(20:25) Fun With Media Generation
(21:24) They Took Our Jobs
(22:14) Levels of Friction
(25:18) A Young Lady's Illustrated Primer
(29:06) The Art of the Jailbreak
(30:03) Get Involved
(30:51) Introducing
(31:26) In Other AI News
(34:40) AI Co-Scientist
(39:50) Quiet Speculations
(48:14) The Quest for Sane Regulations
(52:58) The Week in Audio
(53:30) Tap the Sign
(55:05) Rhetorical Innovation
(01:00:22) Autonomous Helpful Robots
(01:02:09) Autonomous Killer Robots
(01:04:46) If You Really Believed That
(01:09:51) Aligning a Smarter Than Human Intelligence is Difficult
(01:16:45) The Lighter Side
---
First published:
February 27th, 2025
Source:
https://www.lesswrong.com/posts/v5dpeuj4qPxngcb4d/ai-105-hey-there-alexa
Narrated by TYPE III AUDIO.
---
Anthropic has reemerged from stealth and offers us Claude 3.7.
Given this is named Claude 3.7, an excellent choice, from now on this blog will refer to what they officially call Claude Sonnet 3.5 (new) as Sonnet 3.6.
Claude 3.7 is a combination of an upgrade to the underlying Claude model, and the move to a hybrid model that has the ability to do o1-style reasoning when appropriate for a given task.
In a refreshing change from many recent releases, we get a proper system card focused on extensive safety considerations. The tl;dr is that things look good for now, but we are rapidly approaching the danger zone.
The cost for Sonnet 3.7 via the API is the same as it was for 3.6, $3/$15 per million tokens. If you use extended thinking, you have to pay for the thinking tokens.
They also introduced a [...]
---
Outline:
(01:17) Executive Summary
(03:09) Part 1: Capabilities
(03:14) Extended Thinking
(04:17) Claude Code
(06:52) Data Use
(07:11) Benchmarks
(08:25) Claude Plays Pokemon
(09:21) Private Benchmarks
(16:14) Early Janus Takes
(18:31) System Prompt
(24:25) Easter Egg
(25:50) Vibe Coding Reports
(32:53) Practical Coding Advice
(35:02) The Future
(36:05) Part 2: Safety and the System Card
(36:24) Claude 3.7 Tested as ASL-2
(38:15) The RSP Evaluations That Concluded Claude 3.7 is ASL-2
(40:41) ASL-3 is Coming Soon, and With That Comes Actual Risk
(43:31) Reducing Unnecessary Refusals
(45:11) Mundane Harm Evolutions
(45:53) Risks From Computer Use
(47:15) Chain of Thought Faithfulness
(48:53) Alignment Was Not Faked
(49:38) Excessive Focus on Passing Tests
(51:13) The Lighter Side
---
First published:
February 26th, 2025
Source:
https://www.lesswrong.com/posts/Wewdcd52zwfdGYqAi/time-to-welcome-claude-3-7
Narrated by TYPE III AUDIO.
---
This is a post in two parts.
The first half of the post is about Grok's capabilities, now that we’ve all had more time to play around with it. Grok is not as smart as one might hope and has other issues, but it is better than I expected and for now has its place in the rotation, especially for when you want its Twitter integration.
That was what this post was supposed to be about.
Then the weekend happened, and now there's also a second half. The second half is about how Grok turned out rather woke and extremely anti-Trump and anti-Musk, as well as trivial to jailbreak, and the rather blunt things xAI tried to do about that. There was some good transparency in places, to their credit, but a lot of trust has been lost. It will be extremely difficult to win it [...]
---
Outline:
(01:21) Zvi Groks Grok
(03:39) Grok the Cost
(04:29) Grok the Benchmark
(06:02) Fun with Grok
(08:33) Others Grok Grok
(11:26) Apps at Play
(12:38) Twitter Groks Grok
(13:38) Grok the Woke
(19:06) Grok is Misaligned
(20:07) Grok Will Tell You Anything
(24:29) xAI Keeps Digging (1)
(29:21) xAI Keeps Digging (2)
(39:14) What the Grok Happened
(43:29) The Lighter Side
---
First published:
February 24th, 2025
Source:
https://www.lesswrong.com/posts/tpLfqJhxcijf5h23C/grok-grok
Narrated by TYPE III AUDIO.
---
While we wait for the verdict on Anthropic's Claude Sonnet 3.7, today seems like a good day to catch up on the queue and look at various economics-related things.
Table of Contents
---
Outline:
(00:15) The Trump Tax Proposals
(02:36) Taxing Unrealized Capital Gains
(03:00) Extremely High Marginal Tax Rates
(05:24) Trade Barriers By Any Name Are Terrible
(06:11) Destroying People's Access to Credit
(06:35) Living Paycheck to Paycheck
(09:54) Oh California
(10:12) Chinese Venture Capital Death Spiral
(11:20) There is Someone Elon Musk Forgot to Ask
(14:39) Should Have Gone With the Sports Almanac
(17:58) Are You Better Off Than You Were Right Before the Election?
(18:13) Are You Better Off Than You Were Before the Price Level Rose?
(25:43) Most People Have No Idea How Insurance Works
(27:25) Do Not Spend Too Much Attention on Your Investments
(28:43) Preferences About Insider Training are Weird
(30:12) I Will Not Allocate Scarce Resources Via Price
(30:33) Minimum Wages, Employment and the Equilibrium
(31:22) The National Debt
(36:09) In Brief
The original text contained 1 footnote which was omitted from this narration.
---
First published:
February 25th, 2025
Source:
https://www.lesswrong.com/posts/AAKXjRmBRbJJwGthT/economics-roundup-5
Narrated by TYPE III AUDIO.
---
OpenAI made major revisions to their Model Spec.
It seems very important to get this right, so I’m going into the weeds.
This post thus gets farther into the weeds than most people need to go. I recommend most of you read at most the sections of Part 1 that interest you, and skip Part 2.
I looked at the first version last year. I praised it as a solid first attempt.
Table of Contents
---
Outline:
(00:30) Part 1
(00:33) Conceptual Overview
(05:51) Change Log
(07:25) Summary of the Key Rules
(11:49) Three Goals
(15:51) Three Risks
(20:07) The Chain of Command
(26:14) The Letter and the Spirit
(29:30) Part 2
(29:33) Stay in Bounds: Platform Rules
(47:19) The Only Developer Rule
(49:19) Mental Health
(50:38) What is on the Agenda
(56:35) Liar Liar
(01:01:56) Still Kind of a Liar Liar
(01:07:42) Well, Yes, Okay, Sure
(01:10:14) I Am a Good Nice Bot
(01:20:55) A Conscious Choice
(01:21:49) Part 3
(01:21:52) The Super Secret Instructions
(01:24:45) The Super Secret Model Spec Details
(01:27:43) A Final Note
---
First published:
February 21st, 2025
Source:
https://www.lesswrong.com/posts/ntQYby9G8A85cEeY6/on-openai-s-model-spec-2-0
Narrated by TYPE III AUDIO.
---
The Trump Administration is on the verge of firing all ‘probationary’ employees in NIST, as they have done in many other places and departments, seemingly purely because they want to find people they can fire. But if you fire all the new employees and recently promoted employees (which is what ‘probationary’ means here), you end up firing quite a lot of the people who know about AI or give the government state capacity in AI.
This would gut America's AISI, our primary source of a wide variety of forms of state capacity and the only way we can have insight into what is happening or test for safety on matters involving classified information. It would also gut our ability to do a wide variety of other things, such as reinvigorating American semiconductor manufacturing. It would be a massive own goal for the United States, on every [...]
---
Outline:
(01:14) Language Models Offer Mundane Utility
(05:44) Language Models Don't Offer Mundane Utility
(10:13) Rug Pull
(12:19) We're In Deep Research
(21:12) Huh, Upgrades
(30:28) Seeking Deeply
(35:26) Fun With Multimedia Generation
(35:41) The Art of the Jailbreak
(36:26) Get Involved
(37:09) Thinking Machines
(41:13) Introducing
(42:58) Show Me the Money
(44:55) In Other AI News
(53:31) By Any Other Name
(56:06) Quiet Speculations
(59:37) The Copium Department
(01:02:33) Firing All 'Probationary' Federal Employees Is Completely Insane
(01:10:28) The Quest for Sane Regulations
(01:12:18) Pick Up the Phone
(01:14:24) The Week in Audio
(01:16:19) Rhetorical Innovation
(01:18:50) People Really Dislike AI
(01:20:45) Aligning a Smarter Than Human Intelligence is Difficult
(01:22:34) People Are Worried About AI Killing Everyone
(01:23:51) Other People Are Not As Worried About AI Killing Everyone
(01:24:16) The Lighter Side
---
First published:
February 20th, 2025
Source:
https://www.lesswrong.com/posts/bozSPnkCzXBjDpbHj/ai-104-american-state-capacity-on-the-brink
Narrated by TYPE III AUDIO.
---
That title is Elon Musk's fault, not mine, I mean, sorry not sorry:
Table of Contents
Release the Hounds
Grok 3 is out. It mostly seems like no one cares.
I expected this, but that was because I expected Grok 3 to not be worth caring about.
Instead, no one cares for other reasons, like the rollout process being so slow (in a poll on my Twitter this afternoon, the vast majority of people hadn’t used it) and access issues and everyone being numb to another similar model and the pace of events. And because everyone is so sick of the hype.
[...]
---
Outline:
(00:36) Release the Hounds
(02:11) The Expectations Game
(06:45) Man in the Arena
(07:29) The Official Benchmarks
(09:35) The Inevitable Pliny
(12:01) Heart in the Wrong Place
(14:16) Where Is Your Head At
(15:10) Individual Reactions
(28:39) Grok on Grok
---
First published:
February 19th, 2025
Source:
https://www.lesswrong.com/posts/WNYvFCkhZvnwAPzJY/go-grok-yourself
Narrated by TYPE III AUDIO.
---
It seems like as other things drew our attention more, medical news slowed down. The actual developments, I have no doubt, are instead speeding up – because AI.
Note that this post intentionally does not cover anything related to the new Administration, or its policies.
Table of Contents
Some People Need Practical Advice
If you ever have to go to the hospital for any reason, suit up, or at least look [...]
---
Outline:
(00:22) Some People Need Practical Advice
(00:32) Good News, Everyone
(03:13) Bad News
(04:05) Life Extension
(04:42) Doctor Lies to Patient
(06:36) Study Lies to Public With Statistics
(08:46) Area Man Discovers Information Top Doctors Missed
(10:49) Psychiatric Drug Prescription
(11:30) H5N1
(12:43) WHO Delenda Est
(13:02) Medical Ethicists Take Bold Anti-Medicine Stance
(13:41) Rewarding Drug Development
(16:19) Not Rewarding Device Developers
(17:59) Addiction
(18:27) Our Health Insurance Markets are Broken
---
First published:
February 18th, 2025
Source:
https://www.lesswrong.com/posts/nSuYdFzdNA7rrzmyJ/medical-roundup-4
Narrated by TYPE III AUDIO.
---
I have been debating how to cover the non-AI aspects of the Trump administration, including the various machinations of DOGE. I felt it necessary to have an associated section this month, but I have attempted to keep such coverage to a minimum, and will continue to do so. There are too many other things going on, and plenty of others are covering the situation.
Table of Contents
---
Outline:
(00:29) Bad News
(01:45) Antisocial Media
(05:22) Variously Effective Altruism
(10:36) The Forbidden Art of Fundraising
(14:12) There Was Ziz Thing
(16:24) That's Not Very Nice
(18:05) The Unbearable Weight Of Lacking Talent
(19:49) How to Have More Agency
(21:47) Government Working: Trump Administration Edition
(27:17) Government Working
(29:37) The Boolean Illusion
(31:31) Nobody Wants This
(40:11) We Technically Didn't Start the Fire
(01:01:11) Good News, Everyone
(01:06:41) A Well Deserved Break
(01:10:27) Opportunity Knocks
(01:10:38) For Your Entertainment
(01:11:34) I Was Promised Flying Self-Driving Cars and Supersonic Jets
(01:15:21) Sports Go Sports
(01:18:06) Gamers Gonna Game Game Game Game Game
(01:21:16) The Lighter Side
---
First published:
February 17th, 2025
Source:
https://www.lesswrong.com/posts/CKxkQCgmogwQoCRbp/monthly-roundup-27-february-2025
Narrated by TYPE III AUDIO.
---
This post covers three recent shenanigans involving OpenAI.
In each of them, OpenAI or Sam Altman attempt to hide the central thing going on.
First, in Three Observations, Sam Altman's essay pitches our glorious AI future while attempting to pretend the downsides and dangers don’t exist in some places, and in others admitting we’re not going to like those downsides and dangers but he's not about to let that stop him. He's going to transform the world whether we like it or not.
Second, we have Frog and Toad, or There Is No Plan, where OpenAI reveals that its plan for ensuring AIs complement humans rather than AIs substituting for humans is to treat this as a ‘design choice.’ They can simply not design AIs that will be substitutes. Except of course this is Obvious Nonsense in context, with all the talk of remote workers, and [...]
---
Outline:
(01:52) Three Observations
(11:38) Frog and Toad (or There Is No Plan)
(18:45) A Trade Offer Has Arrived
---
First published:
February 14th, 2025
Source:
https://www.lesswrong.com/posts/drHsruvnkCYweMJp7/the-mask-comes-off-a-trio-of-tales
Narrated by TYPE III AUDIO.
---
---
Outline:
(01:55) Language Models Offer Mundane Utility
(06:03) Language Models Don't Offer Mundane Utility
(08:07) We're in Deep Research
(13:54) Huh, Upgrades
(20:56) Seeking Deeply
(24:25) Smooth Operator
(29:15) They Took Our Jobs
(33:34) Maxwell Tabarrok Responds on Future Wages
(41:56) The Art of the Jailbreak
(46:34) Get Involved
(48:45) Introducing
(51:17) Show Me the Money
(53:20) In Other AI News
(56:12) Quiet Speculations
(01:02:05) The Quest for Sane Regulations
(01:04:53) The Week in Audio
(01:06:40) The Mask Comes Off
(01:08:20) Rhetorical Innovation
(01:21:35) Getting Tired of Winning
(01:24:36) People Really Dislike AI
(01:25:47) Aligning a Smarter Than Human Intelligence is Difficult
(01:27:20) Sufficiently Capable AIs Effectively Acquire Convergent Utility Functions
(01:36:29) People Are Worried About AI Killing Everyone
(01:47:03) Other People Are Not As Worried About AI Killing Everyone
(01:50:38) The Lighter Side
---
First published:
February 13th, 2025
Source:
https://www.lesswrong.com/posts/Lmqi4x5zntjSxfdPg/ai-103-show-me-the-money
Narrated by TYPE III AUDIO.
---
It doesn’t look good.
What used to be the AI Safety Summits were perhaps the most promising thing happening towards international coordination for AI Safety.
This one was centrally coordination against AI Safety.
In November 2023, the UK Bletchley Summit on AI Safety set out to let nations coordinate in the hopes that AI might not kill everyone. China was there, too, and included.
The practical focus was on Responsible Scaling Policies (RSPs), where commitments were secured from the major labs, and laying the foundations for new institutions.
The summit ended with The Bletchley Declaration (full text included at link), signed by all key parties. It was the usual diplomatic drek, as is typically the case for such things, but it centrally said there are risks, and so we will develop policies to deal with those risks.
And it ended with a commitment [...]
---
Outline:
(02:03) An Actively Terrible Summit Statement
(05:45) The Suicidal Accelerationist Speech by JD Vance
(14:37) What Did France Care About?
(17:12) Something To Remember You By: Get Your Safety Frameworks
(24:05) What Do We Think About Voluntary Commitments?
(27:29) This Is the End
(36:18) The Odds Are Against Us and the Situation is Grim
(39:52) Don't Panic But Also Face Reality
---
First published:
February 12th, 2025
Source:
https://www.lesswrong.com/posts/qYPHryHTNiJ2y6Fhi/the-paris-ai-anti-safety-summit
Narrated by TYPE III AUDIO.
---
Not too long ago, OpenAI presented a paper on their new strategy of Deliberative Alignment.
The way this works is that they tell the model what its policies are and then have the model think about whether it should comply with a request.
This is an important transition, so this post will go over my perspective on the new strategy.
Note the similarities, and also differences, with Anthropic's Constitutional AI.
How Deliberative Alignment Works
We introduce deliberative alignment, a training paradigm that directly teaches reasoning LLMs the text of human-written and interpretable safety specifications, and trains them to reason explicitly about these specifications before answering.
We used deliberative alignment to align OpenAI's o-series models, enabling them to use chain-of-thought (CoT) reasoning to reflect on user prompts, identify relevant text from OpenAI's internal policies, and draft safer responses.
Our approach achieves highly precise [...]
---
Outline:
(00:29) How Deliberative Alignment Works
(03:27) Why This Worries Me
(07:49) For Mundane Safety It Works Well
---
First published:
February 11th, 2025
Source:
https://www.lesswrong.com/posts/CJ4yywLBkdRALc4sT/on-deliberative-alignment
Narrated by TYPE III AUDIO.
---
Scott Alexander famously warned us to Beware Trivial Inconveniences.
When you make a thing easy to do, people often do vastly more of it.
When you put up barriers, even highly solvable ones, people often do vastly less.
Let us take this seriously, and carefully choose what inconveniences to put where.
Let us also take seriously that when AI or other things reduce frictions, or change the relative severity of frictions, various things might break or require adjustment.
This applies to all system design, and especially to legal and regulatory questions.
Table of Contents
---
Outline:
(00:40) Levels of Friction (and Legality)
(02:24) Important Friction Principles
(05:01) Principle #1: By Default Friction is Bad
(05:23) Principle #3: Friction Can Be Load Bearing
(07:09) Insufficient Friction On Antisocial Behaviors Eventually Snowballs
(08:33) Principle #4: The Best Frictions Are Non-Destructive
(09:01) Principle #8: The Abundance Agenda and Deregulation as Category 1-ification
(10:55) Principle #10: Ensure Antisocial Activities Have Higher Friction
(11:51) Sports Gambling as Motivating Example of Necessary 2-ness
(13:24) On Principle #13: Law Abiding Citizen
(14:39) Mundane AI as 2-breaker and Friction Reducer
(20:13) What To Do About All This
---
First published:
February 10th, 2025
Source:
https://www.lesswrong.com/posts/xcMngBervaSCgL9cu/levels-of-friction
Narrated by TYPE III AUDIO.
---
This week we got a revision of DeepMind's safety framework, and the first version of Meta's framework. This post covers both of them.
Table of Contents
Here are links for previous coverage of: DeepMind's Framework 1.0, OpenAI's Framework and Anthropic's Framework.
Meta's RSP (Frontier AI Framework)
Since there is a law saying no two companies can call these documents by the same name, Meta is here to offer us its Frontier AI Framework, explaining how Meta is going to keep us safe while deploying frontier AI systems.
I will say up front, if it sounds like I’m not giving Meta the benefit of the doubt here, it's because I am absolutely not giving Meta the benefit of [...]
---
Outline:
(00:14) Meta's RSP (Frontier AI Framework)
(16:10) DeepMind Updates its Frontier Safety Framework
(31:05) What About Risk Governance
(33:42) Where Do We Go From Here?
---
First published:
February 7th, 2025
Source:
https://www.lesswrong.com/posts/etqbEF4yWoGBEaPro/on-the-meta-and-deepmind-safety-frameworks
Narrated by TYPE III AUDIO.
---
I remember that week I used r1 a lot, and everyone was obsessed with DeepSeek.
They earned it. DeepSeek cooked, r1 is an excellent model. Seeing the Chain of Thought was revolutionary. We all learned a lot.
It's still #1 in the app store, there are still hysterical misinformed NYT op-eds and calls for insane reactions in all directions and plenty of jingoism to go around, largely based on that highly misleading $6 million cost number for DeepSeek's v3, and a misunderstanding of how AI capability curves move over time.
But like the tariff threats, that's so yesterday now, for those of us who live in the unevenly distributed future.
All my reasoning model needs go through o3-mini-high, and Google's fully unleashed Flash Thinking for free. Everyone is exploring OpenAI's Deep Research, even in its early form, and I finally have an entity [...]
---
Outline:
(01:15) Language Models Offer Mundane Utility
(07:23) o1-Pro Offers Mundane Utility
(10:35) We're in Deep Research
(17:08) Language Models Don't Offer Mundane Utility
(17:49) Model Decision Tree
(20:43) Huh, Upgrades
(21:57) Bot Versus Bot
(24:04) The OpenAI Unintended Guidelines
(26:40) Peter Wildeford on DeepSeek
(29:18) Our Price Cheap
(35:25) Otherwise Seeking Deeply
(44:13) Smooth Operator
(46:46) Have You Tried Not Building An Agent?
(51:58) Deepfaketown and Botpocalypse Soon
(54:56) They Took Our Jobs
(01:08:29) The Art of the Jailbreak
(01:08:56) Get Involved
(01:13:05) Introducing
(01:13:45) In Other AI News
(01:16:37) Theory of the Firm
(01:21:32) Quiet Speculations
(01:24:36) The Quest for Sane Regulations
(01:33:33) The Week in Audio
(01:34:41) Rhetorical Innovation
(01:38:22) Aligning a Smarter Than Human Intelligence is Difficult
(01:40:33) The Alignment Faking Analysis Continues
(01:44:24) Masayoshi Son Follows Own Advice
(01:48:22) People Are Worried About AI Killing Everyone
(01:50:32) You Are Not Ready
(02:00:45) Other People Are Not As Worried About AI Killing Everyone
(02:02:53) The Lighter Side
---
First published:
February 6th, 2025
Source:
https://www.lesswrong.com/posts/rAaGbh7w52soCckNC/ai-102-made-in-america
Narrated by TYPE III AUDIO.
---
The baseline scenario as AI becomes AGI becomes ASI (artificial superintelligence), if nothing more dramatic goes wrong first and even if we successfully ‘solve alignment’ of AI to a given user and developer, is the ‘gradual’ disempowerment of humanity by AIs, as we voluntarily grant them more and more power in a vicious cycle, after which AIs control the future and an ever-increasing share of its real resources. It is unlikely that humans survive it for long.
This gradual disempowerment is far from the only way things could go horribly wrong. There are various other ways things could go horribly wrong earlier, faster and more dramatically, especially if we indeed fail at alignment of ASI on the first try.
Gradual disempowerment is still a major part of the problem, including in worlds that would otherwise have survived those other threats. And I don’t know of any good [...]
---
Outline:
(01:15) We Finally Have a Good Paper
(02:30) The Phase 2 Problem
(05:02) Coordination is Hard
(07:59) Even Successful Technical Solutions Do Not Solve This
(08:58) The Six Core Claims
(14:35) Proposed Mitigations Are Insufficient
(19:58) The Social Contract Will Change
(21:07) Point of No Return
(22:51) A Shorter Summary
(24:13) Tyler Cowen Seems To Misunderstand Two Key Points
(25:53) Do You Feel in Charge?
(28:04) We Will Not By Default Meaningfully 'Own' the AIs For Long
(29:53) Collusion Has Nothing to Do With This
(32:38) If Humans Do Not Successfully Collude They Lose All Control
(34:45) The Odds Are Against Us and the Situation is Grim
---
First published:
February 5th, 2025
Source:
https://www.lesswrong.com/posts/jEZpfsdaX2dBD9Y6g/the-risk-of-gradual-disempowerment-from-ai
Narrated by TYPE III AUDIO.
Table of Contents
The Pitch
OpenAI: Today we’re launching deep research in ChatGPT, a new agentic capability that conducts multi-step research on the internet for complex tasks. It accomplishes in tens of minutes what would take a human many hours.
Sam Altman: Today we launch Deep Research, our next agent. This is like a superpower; experts on [...]
---
Outline:
(00:20) The Pitch
(03:12) It's Coming
(05:01) Is It Safe?
(09:49) How Does Deep Research Work?
(10:47) Killer Shopping App
(12:17) Rave Reviews
(18:33) Research Reports
(31:21) Perfecting the Prompt
(32:26) Not So Fast!
(35:46) What's Next?
(36:59) Paying the Five
(37:59) The Lighter Side
---
First published:
February 4th, 2025
Source:
https://www.lesswrong.com/posts/QqSxKRKJupjuDkymQ/we-re-in-deep-research
Narrated by TYPE III AUDIO.
---
New model, new hype cycle, who dis?
On a Friday afternoon, OpenAI was proud to announce the new model o3-mini and also o3-mini-high which is somewhat less mini, or for some other reasoning tasks you might still want o1 if you want a broader knowledge base, or if you’re a pro user o1-pro, while we wait for o3-not-mini and o3-pro, except o3 can use web search and o1 can’t so it has the better knowledge in that sense, then on a Sunday night they launched Deep Research which is different from Google's Deep Research but you only have a few of those queries so make them count, or maybe you want to use operator?
Get it? Got it? Good.
Yes, Pliny jailbroke o3-mini on the spot, as he always does.
This post mostly skips over OpenAI's Deep Research (o3-DR? OAI-DR?). I need more time for [...]
---
Outline:
(01:16) Feature Presentation
(04:37) QandA
(09:14) The Wrong Side of History
(13:29) The System Card
(22:08) The Official Benchmarks
(24:55) The Unofficial Benchmarks
(27:43) Others Report In
(29:47) Some People Need Practical Advice
---
First published:
February 3rd, 2025
Source:
https://www.lesswrong.com/posts/srdxEAcdmetdAiGcz/o3-mini-early-days
Narrated by TYPE III AUDIO.
---
As reactions continue, the word in Washington, and out of OpenAI, is distillation. They’re accusing DeepSeek of distilling o1, of ripping off OpenAI. They claim DeepSeek *gasp* violated the OpenAI Terms of Service! The horror.
And they are very cross about this horrible violation, and if proven they plan to ‘aggressively treat it as theft,’ while the administration warns that we must put a stop to this.
Aside from the fact that this is obviously very funny, and that there is nothing they could do about it in any case, is it true?
Meanwhile Anthropic's Dario Amodei offers a reaction essay, which also includes a lot of good technical discussion of why v3 and r1 aren’t actually all that unexpected along the cost and capability curves over time, calling for America to race towards AGI to gain decisive strategic advantage over China via recursive self-improvement, although [...]
---
Outline:
(01:01) Seeking Deeply
(01:41) The Market Is In DeepSeek
(06:42) Machines Not of Loving Grace
(17:49) The Kinda Six Million Dollar Model
(18:59) v3 Implies r1
(20:32) Two Can Play That Game
(21:21) Janus Explores r1's Chain of Thought Shenanigans
(24:42) In Other DeepSeek and China News
(27:29) The Quest for Sane Regulations
(29:53) Copyright Confrontation
(37:24) Vibe Gap
(41:09) Deeply Seeking Safety
(42:15) Deeply Seeking Robotics
(45:14) Thank You For Your Candor
(48:21) Thank You For Your Understanding
(51:17) The Lighter Side
The original text contained 1 footnote which was omitted from this narration.
---
First published:
January 31st, 2025
Source:
https://www.lesswrong.com/posts/Cc2TagjY2pGAhn7MZ/deepseek-don-t-panic
Narrated by TYPE III AUDIO.
---
The avalanche of DeepSeek news continues. We are not yet spending more than a few hours at a time in the singularity, where news happens faster than it can be processed. But it's close, and I’ve had to not follow a bunch of other non-AI things that are also happening, at least not well enough to offer any insights.
So this week we’re going to consider China, DeepSeek and r1 fully split off from everything else, and we’ll cover everything related to DeepSeek, including the policy responses to the situation, tomorrow instead.
This is everything else in AI from the past week. Some of it almost feels like it is from another time, so long ago.
I’m afraid you’re going to need to get used to that feeling.
Also, I went on Odd Lots to discuss DeepSeek, where I was and truly hope to again [...]
---
Outline:
(00:55) Language Models Offer Mundane Utility
(02:47) Language Models Don't Offer Mundane Utility
(05:43) Language Models Don't Offer You In Particular Mundane Utility
(10:49) (Don't) Feel the AGI
(12:36) Huh, Upgrades
(16:08) They Took Our Jobs
(21:30) Get Involved
(22:04) Introducing
(23:38) In Other AI News
(27:10) Hype
(29:56) We Had a Deal
(31:43) Quiet Speculations
(37:14) The Quest for Sane Regulations
(39:40) The Week in Audio
(39:51) Don't Tread on Me
(45:42) Rhetorical Innovation
(55:22) Scott Sumner on Objectivity in Taste, Ethics and AGI
(01:04:41) The Mask Comes Off (1)
(01:06:58) The Mask Comes Off (2)
(01:09:12) International AI Safety Report
(01:10:37) One Step at a Time
(01:14:12) Aligning a Smarter Than Human Intelligence is Difficult
(01:18:54) Two Attractor States
(01:26:51) You Play to Win the Game
(01:28:10) Six Thoughts on AI Safety
(01:35:53) AI Situational Awareness
(01:40:15) People Are Worried About AI Killing Everyone
(01:43:40) Other People Are Not As Worried About AI Killing Everyone
(01:44:24) The Lighter Side
---
First published:
January 30th, 2025
Source:
https://www.lesswrong.com/posts/pZ6htFtoptGrSajWG/ai-101-the-shallow-end
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
It's been another *checks notes* two days, so it's time for all the latest DeepSeek news.
You can also see my previous coverage of the r1 model and, from Monday, various reactions including the Panic at the App Store.
Table of Contents
First, Reiterating About Calming Down About the [...]
---
Outline:
(00:20) First, Reiterating About Calming Down About the $5.5 Million Number
(01:58) OpenAI Offers Its Congratulations
(05:54) Scaling Laws Still Apply
(12:07) Other r1 and DeepSeek News Roundup
(16:06) People Love Free
(18:45) Investigating How r1 Works
(23:40) Nvidia Chips are Highly Useful
(24:52) Welcome to the Market
(30:02) Ben Thompson Weighs In
(33:01) Import Restrictions on Chips WTAF
(35:37) Are You Short the Market
(39:52) DeepSeeking Safety
(43:11) Mo Models Mo Problems
(50:23) What If You Wanted to Restrict Already Open Models
(53:33) So What Are We Going to Do About All This?
---
First published:
January 29th, 2025
Source:
https://www.lesswrong.com/posts/jzjph4yYtgAsLeWmg/deepseek-lemon-it-s-wednesday
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
No one is talking about OpenAI's Operator. We’re, shall we say, a bit distracted.
It's still a rather meaningful thing that happened last week. I too have been too busy to put it through its paces, but this is the worst it will ever be, and the least available and most expensive it will ever be. The year of the agent is indeed likely coming.
So, what do we have here?
Hello, Operator
OpenAI has introduced the beta for its new agent, called Operator, which is now live for Pro users and will in the future be available to Plus users, ‘with more agents to launch in the coming weeks and months.’
Here is a 22 minute video demo. Here is the system card.
You start off by optionally specifying a particular app (in the first demo, OpenTable) and then give it a [...]
---
Outline:
(00:28) Hello, Operator
(02:44) Risky Operation
(04:34) Basic Training
(06:29) Please Stay on the Line
(12:08) For a Brief Survey
(16:34) The Number You Are Calling Is Not Available (In the EU)
(17:20) How to Get Ahead in Advertising
(19:05) Begin Operation
(20:11) The Lighter Side
---
First published:
January 28th, 2025
Source:
https://www.lesswrong.com/posts/jTtbnSyS9knzZehCm/operator
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
DeepSeek released v3. Market didn’t react.
DeepSeek released r1. Market didn’t react.
DeepSeek released a f***ing app of its website. Market said I have an idea, let's panic.
Nvidia was down 11%, the Nasdaq was down 2.5% and the S&P was down 1.7%, on the news.
Shakeel: The fact this is happening today, and didn’t happen when r1 actually released last Wednesday, is a neat demonstration of how the market is in fact not efficient at all.
That is exactly the market's level of situational awareness. No more, no less.
I traded accordingly. But of course nothing here is ever investment advice.
Given all that has happened, it seems worthwhile to go over all the DeepSeek news that has happened since Thursday. Yes, since Thursday.
For previous events, see my top level post here, and additional notes on Thursday.
To avoid confusion: r1 [...]
---
Outline:
(01:27) Current Mood
(03:04) DeepSeek Tops the Charts
(07:42) Why Is DeepSeek Topping the Charts?
(09:47) What Is the DeepSeek Business Model?
(13:48) The Lines on Graphs Case for Panic
(16:31) Everyone Calm Down About That $5.5 Million Number
(25:42) Is The Whale Lying?
(29:33) Capex Spending on Compute Will Continue to Go Up
(32:53) Jevons Paradox Strikes Again
(36:24) Okay, Maybe Meta Should Panic
(39:02) Are You Short the Market
(43:52) o1 Versus r1
(47:23) Additional Notes on v3 and r1
(50:12) Janus-Pro-7B Sure Why Not
(50:44) Man in the Arena
(52:42) Training r1, and Training With r1
(56:34) Also Perhaps We Should Worry About AI Killing Everyone
(59:21) And We Should Worry About Crazy Reactions To All This, Too
(01:02:14) The Lighter Side
---
First published:
January 28th, 2025
Source:
https://www.lesswrong.com/posts/hRxGrJJq6ifL4jRGa/deepseek-panic-at-the-app-store
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
There was a comedy routine a few years ago. I believe it was by Hannah Gadsby. She brought up a painting, and looked at some details. The details weren’t important in and of themselves. If an AI had randomly put them there, we wouldn’t care.
Except an AI didn’t put them there. And they weren’t there at random.
A human put them there. On purpose. Or, as she put it:
THAT was a DECISION.
This is the correct way to view decisions around a $500 billion AI infrastructure project, announced right after Trump takes office, having it be primarily funded by SoftBank, with all the compute intended to be used by OpenAI, and calling it Stargate.
Table of Contents
---
Outline:
(00:49) The Announcement
(05:21) Is That a Lot?
(09:37) What Happened to the Microsoft Partnership?
(11:03) Where's Our 20%?
(12:10) Show Me the Money
(17:23) It Never Hurts to Suck Up to the Boss
(24:44) What's in a Name
(29:01) Just Think of the Potential
(34:00) I Believe Toast is an Adequate Description
(36:05) The Lighter Side
---
First published:
January 24th, 2025
Source:
https://www.lesswrong.com/posts/fwt7ojAb6zgEaLJMB/stargate-ai-1
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
Break time is over, it would seem, now that the new administration is in town.
This week we got r1, DeepSeek's new reasoning model, which is now my go-to first choice for a large percentage of queries. The claim that this was the most important thing to happen on January 20, 2025 was at least non-crazy. If you read about one thing this week read about that.
We also got the announcement of Stargate, a claimed $500 billion private investment in American AI infrastructure. I will be covering that on its own soon.
Due to time limits I have also pushed coverage of a few things into next week, including this alignment paper, and I still owe my take on Deliberative Alignment.
The Trump administration came out swinging on many fronts with a wide variety of executive orders. For AI, that includes repeal of the [...]
---
Outline:
(01:24) Language Models Offer Mundane Utility
(10:54) Language Models Don't Offer Mundane Utility
(17:20) Huh, Upgrades
(20:03) Additional Notes on r1
(22:41) Fun With Media Generation
(23:18) We Tested Older LLMs and Are Framing It As a Failure
(26:56) Deepfaketown and Botpocalypse Soon
(32:10) They Took Our Jobs
(47:15) Get Involved
(47:54) Introducing
(51:38) We Had a Deal
(01:07:17) In Other AI News
(01:18:39) Whistling in the Dark
(01:22:03) Quiet Speculations
(01:28:09) Suchir's Last Post
(01:29:43) Modeling Lower Bound Economic Growth From AI
(01:34:42) The Quest for Sane Regulations
(01:39:53) The Week in Audio
(01:42:51) Rhetorical Innovation
(01:49:37) Cry Havoc
(01:53:14) Aligning a Smarter Than Human Intelligence is Difficult
(01:59:34) People Strongly Dislike AI
(02:02:23) People Are Worried About AI Killing Everyone
(02:05:17) Other People Are Not As Worried About AI Killing Everyone
(02:09:29) The Lighter Side
---
First published:
January 23rd, 2025
Source:
https://www.lesswrong.com/posts/PjDjeGPYPoi9qfPr2/ai-100-meet-the-new-boss
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
r1 from DeepSeek is here, the first serious challenge to OpenAI's o1.
r1 is an open model, and it comes in dramatically cheaper than o1.
People are very excited. Normally cost is not a big deal, but o1 and its inference-time compute strategy is the exception. Here, cheaper really can mean better, even if the answers aren’t quite as good.
You can get DeepSeek-r1 on HuggingFace here, and they link to the paper.
The question is how to think about r1 as it compares to o1, and also to o1 Pro and to the future o3-mini that we’ll get in a few weeks, and then to o3 which we’ll likely get in a month or two.
Taking into account everything I’ve seen, r1 is still a notch below o1 in terms of quality of output, and further behind o1 Pro and the future o3-mini [...]
---
Outline:
(01:43) Part 1: RTFP: Read the Paper
(03:38) How Did They Do It
(06:19) The Aha Moment
(08:27) Benchmarks
(09:46) Reports of Failure
(11:11) Part 2: Capabilities Analysis
(11:16) Our Price Cheap
(15:44) Other People's Benchmarks
(18:20) r1 Makes Traditional Silly Mistakes
(23:11) The Overall Vibes
(25:36) If I Could Read Your Mind
(28:06) Creative Writing
(32:21) Bring On the Spice
(34:33) We Cracked Up All the Censors
(39:44) Switching Costs Are Low In Theory
(42:15) The Self-Improvement Loop
(44:18) Room for Improvement
(48:27) Part 3: Where Does This Leave Us on Existential Risk?
(48:58) The Suicide Caucus
(51:21) v3 Implies r1
(53:09) Open Weights Are Unsafe And Nothing Can Fix This
(58:59) So What the Hell Should We Do About All This?
(01:05:53) Part 4: The Lighter Side
---
First published:
January 22nd, 2025
Source:
https://www.lesswrong.com/posts/buTWsjfwQGMvocEyw/on-deepseek-s-r1
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
As always, some people need practical advice, and we can’t agree on how any of this works and we are all different and our motivations are different, so figuring out the best things to do is difficult. Here are various hopefully useful notes.
Table of Contents
Effectiveness of GLP-1 Drugs
GLP-1 drugs are so effective that the American obesity rate is falling.
John Burn-Murdoch: While we can’t be certain that the [...]
---
Outline:
(00:22) Effectiveness of GLP-1 Drugs
(01:09) What Passes for Skepticism on GLP-1s
(03:21) The Joy of Willpower
(10:07) Talking Supply
(10:44) Talking Price
(13:36) GLP-1 Inhibitors Help Solve All Your Problems
(14:12) Dieting the Hard Way
(18:41) Nutrients
(19:46) Are Vegetables a Scam?
(22:46) Government Food Labels Are Often Obvious Nonsense
(23:33) Sleep
(28:34) Find a Way to Enjoy Exercise
(32:28) A Note on Alcohol
(33:04) Focus Only On What Matters
---
First published:
January 21st, 2025
Source:
https://www.lesswrong.com/posts/YLi47gRquTJqLsgoe/sleep-diet-exercise-and-glp-1-drugs
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
There's going to be some changes made.
Table of Contents
Out With the Fact Checkers
Mark Zuckerberg has decided that with Donald Trump soon to be in office, he is allowed to care about free speech again. And he has decided it is time to admit that what was called ‘fact checking’ meant he had for years been running a giant hugely biased, trigger-happy and error prone left-wing censorship and moderation machine that had standards massively out of touch with ordinary people and engaged in automated taking down of often innocent accounts.
He also admits that the majority of censorship in the past has flat out [...]
---
Outline:
(00:08) Out With the Fact Checkers
(01:21) What Happened
(04:05) Timing is Everything
(05:29) Balancing Different Errors
(06:25) Truth and Reconciliation
(08:06) Fact Check Fact Check
(11:33) Mistakes Will Be Made
(16:11) Where We Go From Here
---
First published:
January 17th, 2025
Source:
https://www.lesswrong.com/posts/Mdeszo3C44qEAXB8y/meta-pivots-on-content-moderation
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
The fun, as it were, is presumably about to begin.
And the break was fun while it lasted.
Biden went out with an AI bang. His farewell address warns of a ‘Tech-Industrial Complex’ and calls AI the most important technology of all time. And there were not one but two AI-related everything bagel concrete actions proposed – I say proposed because Trump could undo or modify either or both of them.
One attempts to build three or more ‘frontier AI model data centers’ on federal land, with timelines and plans I can only summarize with ‘good luck with that.’ The other move was new diffusion regulations on who can have what AI chips, an attempt to actually stop China from accessing the compute it needs. We shall see what happens.
Table of Contents
---
Outline:
(00:53) Language Models Offer Mundane Utility
(06:45) Language Models Don't Offer Mundane Utility
(10:40) What AI Skepticism Often Looks Like
(13:59) A Very Expensive Chatbot
(16:07) Deepfaketown and Botpocalypse Soon
(21:51) Fun With Image Generation
(22:15) They Took Our Jobs
(27:53) The Blame Game
(31:25) Copyright Confrontation
(31:44) The Six Million Dollar Model
(34:51) Get Involved
(35:15) Introducing
(38:36) In Other AI News
(41:32) Quiet Speculations
(53:27) Man With a Plan
(58:40) Our Price Cheap
(01:03:09) The Quest for Sane Regulations
(01:05:54) Super Duper Export Controls
(01:14:17) Everything Bagel Data Centers
(01:20:46) d/acc Round 2
(01:30:42) The Week in Audio
(01:33:57) Rhetorical Innovation
(01:39:32) Aligning a Smarter Than Human Intelligence is Difficult
(01:47:47) Other People Are Not As Worried About AI Killing Everyone
(01:51:03) The Lighter Side
---
First published:
January 16th, 2025
Source:
https://www.lesswrong.com/posts/dnqpcq9S7voPwpvRA/ai-99-farewell-to-biden
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
Table of Contents
Man With a Plan
The primary Man With a Plan this week for government-guided AI prosperity was UK Prime Minister Keir Starmer, with a plan coming primarily from Matt Clifford. I’ll be covering that soon.
Today I will be covering the other Man With a Plan, Sam Altman, as OpenAI offers its Economic Blueprint.
Cyrps1s (CISO OpenAI): AI is the ultimate race. The winner decides whether the future looks free and democratic, or repressed and authoritarian.
OpenAI, and the Western World, must win – and we have a blueprint to do so.
Do you hear yourselves? The mask on race and jingoism could not be more off, or [...]
---
Outline:
(00:03) Man With a Plan
(01:06) Oh the Pain
(03:47) Actual Proposals
(07:37) For AI Builders
(08:08) Think of the Children
(09:00) Content Identification
(10:53) Infrastructure Week
(14:31) Paying Attention
---
First published:
January 15th, 2025
Source:
https://www.lesswrong.com/posts/uxnKrsgAzKFZDk4bJ/on-the-openai-economic-blueprint
Narrated by TYPE III AUDIO.
Table of Contents
Congestion Pricing Comes to NYC
We’ve now had over a week of congestion pricing in New York City. It took a while to finally get it. The market for whether congestion pricing would happen in 2024 got as high as 87% before Governor Hochul first betrayed us. Fortunately for us, she partially caved. We finally got congestion pricing at the start of 2025. In the end, we got [...]
---
Outline:
(00:13) Congestion Pricing Comes to NYC
(02:35) How Much Is Traffic Improving?
(11:24) And That's Terrible?
(14:00) You Mad, Bro
(15:21) All Aboard
(19:02) Time is Money
(20:58) Solving For the Equilibrium
(23:19) Enforcement and License Plates
(25:13) Uber Eats the Traffic
(27:20) We Can Do Even Better Via Congestion Tolls
(29:32) Abundance Agenda Fever Dream
(31:18) The Lighter Side
---
First published:
January 14th, 2025
Source:
https://www.lesswrong.com/posts/GN8SrMxw3WEAtfrFS/nyc-congestion-pricing-early-days
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
Now that I am tracking all the movies I watch via Letterboxd, it seems worthwhile to go over the results at the end of the year, and look for lessons, patterns and highlights.
Table of Contents
The Rating Scale
Letterboxd [...]
---
Outline:
(00:15) The Rating Scale
(02:37) The Numbers
(03:16) Very Briefly on the Top Picks and Whether You Should See Them
(04:16) Movies Have Decreasing Marginal Returns in Practice
(05:19) Theaters are Awesome
(07:14) I Hate Spoilers With the Fire of a Thousand Suns
(08:32) Scott Sumner Picks Great American Movies Then Dislikes Them
(09:55) I Knew Before the Cards Were Even Turned Over
(11:19) Other Notes to Self to Remember
(12:24) Strong Opinions, Strongly Held: I Didn't Like It
(14:31) Strong Opinions, Strongly Held: I Did Like It
(19:45) Megalopolis
(20:55) The Brutalist
(24:51) The Death of Award Shows
(27:19) On to 2025
---
First published:
January 13th, 2025
Source:
https://www.lesswrong.com/posts/6bgAzPqNppojyGL2v/zvi-s-2024-in-movies
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
Dwarkesh Patel again interviewed Tyler Cowen, largely about AI, so here we go.
Note that I take it as a given that the entire discussion is taking place in some form of an ‘AI Fizzle’ and ‘economic normal’ world, where AI does not advance too much in capability from its current form, in meaningful senses, and we do not get superintelligence [because of reasons]. It's still massive additional progress by the standards of any other technology, but painfully slow by the standards of the ‘AGI is coming soon’ crowd.
That's the only way I can make the discussion make at least some sense, with Tyler Cowen predicting 0.5%/year additional RGDP growth from AI. That level of capabilities progress is a possible world, although the various elements stated here seem like they are sometimes from different possible worlds.
I note that this conversation was recorded prior to o3 and all [...]
---
Outline:
(02:01) AI and Economic Growth
(02:28) Cost Disease
(09:03) The Lump of Intelligence Fallacy
(10:56) The Efficient Market Hypothesis is False
(13:05) Not Sending Your Best People
(20:49) Energy as the Bottleneck
(22:31) The Experts are Wrong But Trust Them Anyway
(25:41) AI as Additional Population
(26:51) Opposition to AI as the Bottleneck
(29:10) China as Existence Proof for Rapid Growth
(29:56) Second Derivatives
(31:05) Talent and Leadership
(33:26) Adapting to the Age of AI
(35:37) Identifying Alpha
(37:39) Old Man Yells at Crowd
(41:59) Some Statements for Everyone to Ponder
(43:13) No Royal Road to Wisdom
(47:29) Concluding Thoughts
---
First published:
January 10th, 2025
Source:
https://www.lesswrong.com/posts/esWbhgHd6bcfsTjGL/on-dwarkesh-patel-s-4th-podcast-with-tyler-cowen
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
The world is kind of on fire. The world of AI, in the very short term and for once, is not, as everyone recovers from the avalanche that was December, and reflects.
Altman was the star this week. He has his six word story, and he had his interview at Bloomberg and his blog post Reflections. I covered the latter two of those in OpenAI #10, if you read one AI-related thing from me this week that should be it.
Table of Contents
---
Outline:
(00:34) Language Models Offer Mundane Utility
(04:51) Language Models Don't Offer Mundane Utility
(13:25) Power User
(18:04) Locked In User
(19:14) Read the Classics
(23:58) Deepfaketown and Botpocalypse Soon
(24:11) Fun With Image Generation
(25:11) They Took Our Jobs
(25:35) Question Time
(29:43) Get Involved
(30:02) Introducing
(30:28) In Other AI News
(32:34) Quiet Speculations
(37:19) The Quest for Sane Regulations
(38:43) The Least You Could Do
(44:55) Six Word Story
(47:31) The Week in Audio
(47:55) And I Feel Fine
(55:26) Rhetorical Innovation
(58:07) Liar Liar
(59:12) Feel the AGI
(01:01:57) Regular Americans Hate AI
(01:04:18) Aligning a Smarter Than Human Intelligence is Difficult
(01:11:31) The Lighter Side
---
First published:
January 9th, 2025
Source:
https://www.lesswrong.com/posts/xkpPLR3S4SASPeTgC/ai-98-world-ends-with-six-word-story
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
This week, Altman offers a post called Reflections, and he has an interview in Bloomberg. There's a bunch of good and interesting answers in the interview about past events that I won’t mention or have to condense a lot here, such as his going over his calendar and all the meetings he constantly has, so consider reading the whole thing.
Table of Contents
The Battle of the Board
Here is what he says about the Battle of the Board in Reflections:
Sam Altman: A little over a year ago, on one particular Friday, the main thing that had gone wrong that day was [...]
---
Outline:
(00:25) The Battle of the Board
(05:12) Altman Lashes Out
(07:48) Inconsistently Candid
(09:35) On Various People Leaving OpenAI
(10:56) The Pitch
(12:07) Great Expectations
(12:56) Accusations of Fake News
(15:02) OpenAI's Vision Would Pose an Existential Risk To Humanity
---
First published:
January 7th, 2025
Source:
https://www.lesswrong.com/posts/XAKYawaW9xkb3YCbF/openai-10-reflections
Narrated by TYPE III AUDIO.
Related: On the 2nd CWT with Jonathan Haidt, The Kids are Not Okay, Full Access to Smartphones is Not Good For Children
It's rough out there. In this post, I’ll cover the latest arguments that smartphones should be banned in schools, including simply because the notifications are too distracting (and if you don’t care much about that, why are the kids in school at all?), problems with kids on social media including many negative interactions, and also the new phenomenon called sextortion.
Table of Contents
How Many Notifications?
Tanagra Beast reruns the experiment of having a class tally their phone notifications. The results were highly compatible with the original experiment.
The tail, it was long.
Ah! So right away we can see [...]
---
Outline:
(00:37) How Many Notifications?
(06:05) Ban Smartphones in Schools
(15:03) Antisocial Media
(19:07) Screen Time
(20:13) Cyberbullying
(21:12) Sextortion
---
First published:
January 6th, 2025
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
The Rationalist Project was our last best hope for peace.
An epistemic world 50 million words long, serving as neutral territory.
A place of research and philosophy for 30 million unique visitors.
A shining beacon on the internet, all alone in the night.
It was the ending of the Age of Mankind.
The year the Great Race came upon us all.
This is the story of the last of the blogosphere.
The year is 2025. The place is Lighthaven.
As is usually the case, the final week of the year was mostly about people reflecting on the past year or predicting and planning for the new one.
Table of Contents
The most important developments were processing the two new models: OpenAI's o3, and DeepSeek v3.
---
Outline:
(00:45) Language Models Offer Mundane Utility
(07:23) Language Models Don't Offer Mundane Utility
(07:49) Deepfaketown and Botpocalypse Soon
(10:54) Fun With Image Generation
(13:01) They Took Our Jobs
(14:34) Get Involved
(16:23) Get Your Safety Papers
(28:15) Introducing
(28:32) In Other AI News
(30:01) The Mask Comes Off
(40:08) Wanna Bet
(44:44) The Janus Benchmark
(48:07) Quiet Speculations
(59:29) AI Will Have Universal Taste
(01:01:26) Rhetorical Innovation
(01:02:22) Nine Boats and a Helicopter
(01:07:04) Aligning a Smarter Than Human Intelligence is Difficult
(01:13:19) The Lighter Side
---
First published:
January 2nd, 2025
Source:
https://www.lesswrong.com/posts/5rDrErovmTyv4duDv/ai-97-4
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
What should we make of DeepSeek v3?
DeepSeek v3 seems to clearly be the best open model, the best model at its price point, and the best model with 37B active parameters, or that cost under $6 million.
According to the benchmarks, it can play with GPT-4o and Claude Sonnet.
Anecdotal reports and alternative benchmarks tell us it's not as good as Claude Sonnet, but it is plausibly on the level of GPT-4o.
So what do we have here? And what are the implications?
Table of Contents
What is DeepSeek v3 Technically?
I’ve now had a [...]
---
Outline:
(00:39) What is DeepSeek v3 Technically?
(01:56) Our Price Cheap
(02:33) Run Model Run
(04:57) Talent Search
(05:22) The Amazing Incredible Benchmarks
(07:23) Underperformance on AidanBench
(12:59) Model in the Arena
(13:27) Other Private Benchmarks
(15:05) Anecdata
(23:57) Implications and Policy
---
First published:
December 31st, 2024
Source:
https://www.lesswrong.com/posts/NmauyiPBXcGwoArhJ/deekseek-v3-the-six-million-dollar-model
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
OpenAI presented o3 on the Friday before Christmas, at the tail end of the 12 Days of Shipmas.
I was very much expecting the announcement to be something like a price drop. What better way to say ‘Merry Christmas,’ no?
They disagreed. Instead, we got this (here's the announcement, in which Sam Altman says ‘they thought it would be fun’ to go from one frontier model to their next frontier model, yeah, that's what I’m feeling, fun):
Greg Brockman (President of OpenAI): o3, our latest reasoning model, is a breakthrough, with a step function improvement on our most challenging benchmarks. We are starting safety testing and red teaming now.
Nat McAleese (OpenAI): o3 represents substantial progress in general-domain reasoning with reinforcement learning—excited that we were able to announce some results today! Here is a summary of what we shared about o3 in the livestream.
---
Outline:
(03:48) GPQA Has Fallen
(04:21) Codeforces Has Fallen
(05:32) Arc Has Kinda Fallen But For Now Only Kinda
(09:27) They Trained on the Train Set
(15:26) AIME Has Fallen
(15:58) Frontier of Frontier Math Shifting Rapidly
(19:09) FrontierMath 4: We're Going To Need a Bigger Benchmark
(23:10) What is o3 Under the Hood?
(25:17) Not So Fast!
(28:38) Deep Thought
(30:03) Our Price Cheap
(36:32) Has Software Engineering Fallen?
(37:42) Don't Quit Your Day Job
(40:48) Master of Your Domain
(43:21) Safety Third
(47:56) The Safety Testing Program
(48:58) Safety testing in the reasoning era
(51:01) How to apply
(53:07) What Could Possibly Go Wrong?
(56:36) What Could Possibly Go Right?
(57:06) Send in the Skeptic
(59:25) This is Almost Certainly Not AGI
(01:02:57) Does This Mean the Future is Open Models?
(01:07:17) Not Priced In
(01:08:39) Our Media is Failing Us
(01:14:56) Not Covered Here: Deliberative Alignment
(01:15:08) The Lighter Side
---
First published:
December 30th, 2024
Source:
https://www.lesswrong.com/posts/QHtd2ZQqnPAcknDiQ/o3-oh-my
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
The year in models certainly finished off with a bang.
In this penultimate week, we get o3, which purports to give us vastly more efficient performance than o1, and also to allow us to choose to spend vastly more compute if we want a superior answer.
o3 is a big deal, making big gains on coding tests, ARC and some other benchmarks. How big a deal is difficult to say given what we know now. It's about to enter full-fledged safety testing.
o3 will get its own post soon, and I’m also pushing back coverage of Deliberative Alignment, OpenAI's new alignment strategy, to incorporate into that.
We also got DeepSeek v3, which claims to have trained a roughly Sonnet-strength model for only $6 million and 37b active parameters per token (671b total via mixture of experts).
DeepSeek v3 gets its own brief section [...]
---
Outline:
(01:25) Language Models Offer Mundane Utility
(04:47) Language Models Don’t Offer Mundane Utility
(06:43) Flash in the Pan
(10:58) The Six Million Dollar Model
(15:50) And I’ll Form the Head
(17:17) Huh, Upgrades
(18:19) o1 Reactions
(23:28) Fun With Image Generation
(25:06) Introducing
(25:52) They Took Our Jobs
(30:20) Get Involved
(30:34) In Other AI News
(34:15) You See an Agent, You Run
(34:58) Another One Leaves the Bus
(35:53) Quiet Speculations
(40:24) Lock It In
(42:34) The Quest for Sane Regulations
(55:35) The Week in Audio
(57:56) A Tale as Old as Time
(01:01:16) Rhetorical Innovation
(01:03:08) Aligning a Smarter Than Human Intelligence is Difficult
(01:04:26) People Are Worried About AI Killing Everyone
(01:06:50) The Lighter Side
---
First published:
December 26th, 2024
Source:
https://www.lesswrong.com/posts/k8bkugdhiFmXHPoLH/ai-96-o3-but-not-yet-for-thee
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
This post goes over the important and excellent new paper from Anthropic and Redwood Research, with Ryan Greenblatt as lead author, Alignment Faking in Large Language Models.
This is by far the best demonstration so far of the principle that AIs Will Increasingly Attempt Shenanigans.
This was their announcement thread.
New Anthropic research: Alignment faking in large language models.
In a series of experiments with Redwood Research, we found that Claude often pretends to have different views during training, while actually maintaining its original preferences.
Claude usually refuses harmful queries. We told it we were instead training it to comply with them. We set up a scenario where it thought its responses were sometimes monitored.
When unmonitored, it nearly always refused. But when monitored, it faked alignment 12% of the time.
[thread continues and includes various visual aids.]
The AI wanted [...]
---
Outline:
(02:54) The Core Shenanigans in Question
(06:00) Theme and Variations
(07:34) How This Interacts with o3 and OpenAI's Reflective Alignment
(09:17) The Goal Being Plausibly Good Was Incidental
(11:13) Answering Priming Objections
(12:17) What Does Claude Sonnet Think Of This?
(14:07) What Exactly is the Direct Threat Model?
(16:23) RL Training Under Situational Awareness Can Amplify These Behaviors
(20:38) How the Study Authors Updated
(27:08) How Some Others Updated
(42:49) Having the Discussion We Keep Having
(46:49) We Can Now Agree That the Goal is Already There
(47:49) What Would Happen if the Target Was Net Good?
(50:14) But This Was a No Win Situation
(55:52) But Wasn’t It Being a Good Opus? Why Should it be Corrigible?
(01:04:34) Tradeoffs Make The Problem Harder; They Don’t Give You a Pass
(01:07:44) But You Told the Model About the Training Procedure
(01:08:35) But the Model is Only Role Playing
(01:09:39) But You Are Saying the Model is a Coherent Person
(01:15:53) But this Headline and Framing Was Misleading
(01:29:22) This Result is Centrally Unsurprising
(01:32:52) Lab Support for Alignment Research Matters
(01:33:50) The Lighter Side
The original text contained 1 footnote which was omitted from this narration.
---
First published:
December 24th, 2024
Source:
https://www.lesswrong.com/posts/gHjzdLD6yeLNdsRmw/ais-will-increasingly-fake-alignment
Narrated by TYPE III AUDIO.
---
I took a trip to San Francisco early in December.
Ever since then, things in the world of AI have been utterly insane.
Google and OpenAI released endless new products, including Google Flash 2.0 and o1.
Redwood Research and Anthropic put out the most important alignment paper of the year, on the heels of Apollo's report on o1.
Then OpenAI announced o3. Like the rest of the media, this blog currently is horrendously lacking in o3 content. Unlike the rest of the media, it is not because I don’t realize that This Changes Everything. It is because I had so much in the queue, and am taking the time to figure out what to think about it.
That queue includes all the other, non-AI things that happened this past month.
So here we are, to kick off Christmas week.
Bad [...]
---
Outline:
(00:57) Bad News
(02:58) What a Lot of Accusations Look Like These Days
(05:47) Good News, Everyone
(08:08) Antisocial Media
(14:53) Government Working
(22:41) Technology Advances
(23:40) I Was Promised Flying Self-Driving Cars
(24:29) For Science!
(25:12) Variously Effective Altruism
(28:37) While I Cannot Condone This
(33:43) Motivation
(34:28) Knowing Better
(36:38) For Your Entertainment
(39:22) Patrick McKenzie Monthly
(41:55) Gamers Gonna Game Game Game Game Game
(46:04) Sports Go Sports
(52:31) The Lighter Side
---
First published:
December 23rd, 2024
Source:
https://www.lesswrong.com/posts/tzL3zavzowRZsZqGy/monthly-roundup-25-december-2024
Narrated by TYPE III AUDIO.
---
A lot happened this week. We’re seeing release after release after upgrade.
It's easy to lose sight of which ones matter, and two matter quite a lot.
The first is Gemini Flash 2.0, which I covered earlier this week.
The other is that o1, having turned pro, is now also available in the API.
This was obviously coming, but we should also keep in mind it is a huge deal. Being in the API means it can go into Cursor and other IDEs. It means you can build with it. And yes, it has the features you’ve come to expect, like tool use.
The other big development is that Anthropic released one of the most important alignment papers, Alignment Faking in Large Language Models. This takes what I discussed in AIs Will Increasingly Attempt Shenanigans, and demonstrates it with a much improved experimental design [...]
---
Outline:
(01:13) Language Models Offer Mundane Utility
(07:06) Clio Knows All
(09:05) Language Models Don’t Offer Mundane Utility
(11:38) The Case Against Education
(13:29) More o1 Reactions
(18:31) Deepfaketown and Botpocalypse Soon
(22:35) Huh, Upgrades
(28:47) They Took Our Jobs
(30:49) The Art of the Jailbreak
(35:43) Get Involved
(36:59) Introducing
(37:52) In Other AI News
(42:24) Quiet Speculations
(51:07) The Quest for Sane Regulations
(57:37) The Week in Audio
(01:03:21) Rhetorical Innovation
(01:05:37) Aligning a Smarter Than Human Intelligence is Difficult
(01:10:21) Not Aligning Smarter Than Human Intelligence Kills You
(01:19:15) The Lighter Side
---
First published:
December 19th, 2024
Source:
https://www.lesswrong.com/posts/NepBoDTeT6p69daiL/ai-95-o1-joins-the-api
Narrated by TYPE III AUDIO.
---
In light of other recent discussions, Scott Alexander recently attempted a unified theory of taste, proposing several hypotheses. Is it like physics, a priesthood, a priesthood with fake justifications, a priesthood with good justifications, like increasingly bizarre porn preferences, like fashion (in the sense of trying to stay one step ahead in an endless cycling for signaling purposes), or like grammar?
He then got various reactions. This will now be one of them.
My answer is that taste is all of these, depending on context.
Taste is Most Centrally Like Grammar
Scott Alexander is very suspicious of taste in general, since people keep changing what is good taste and calling each other barbarians for taste reasons, and the experiments are unkind, and the actual arguments about taste look like power struggles.
Here's another attempt from Zac Hill, which in some ways gets [...]
---
Outline:
(00:37) Taste is Most Centrally Like Grammar
(03:18) Sometimes ‘Taste’ Is Out to Get You
(04:10) You Are Low Quality and You Have No Taste
(06:03) Don’t Be a Snob
(07:52) Good as in Useful
(10:33) Critic Tells Me I Have No Taste
(12:45) Stand Up For What You Believe In
(15:15) Being Technically In Good Taste Is Not a Free Pass
(16:15) It Is Good To Like and Appreciate Things
---
First published:
December 18th, 2024
Source:
https://www.lesswrong.com/posts/BeXzsZFxW2Ta5cxcc/a-matter-of-taste
Narrated by TYPE III AUDIO.
---
Table of Contents
Trust the Chef
Google has been cooking lately.
Gemini Flash 2.0 is the headline release, which will be the main topic today.
But there's also Deep Research, where you can ask Gemini to take several minutes, check dozens of websites, and compile a report for you. Think of it as a harder-to-direct, slower, but vastly more robust version of Perplexity, one that will improve with time and as we figure out how to use and prompt it.
NotebookLM added a [...]
---
Outline:
(00:02) Trust the Chef
(03:19) Do Not Trust the Marketing Department
(04:10) Mark that Bench
(05:52) Going Multimodal
(07:40) The Art of Deep Research
(13:03) Project Mariner the Web Agent
(13:43) Project Astra the Universal Assistant
(15:11) Project Jules the Code Agent
(15:38) Gemini Will Aid You on Your Quest
(17:17) Reactions to Gemini Flash 2.0
---
First published:
December 17th, 2024
Source:
https://www.lesswrong.com/posts/EvLZnh26m5KoheAcG/the-second-gemini
Narrated by TYPE III AUDIO.
---
Increasingly, we have seen papers eliciting in AI models various shenanigans.
There are a wide variety of scheming behaviors. You’ve got your weight exfiltration attempts, sandbagging on evaluations, giving bad information, shielding goals from modification, subverting tests and oversight, lying, doubling down via more lying. You name it, we can trigger it.
I previously chronicled some related events in my series about [X] boats and a helicopter (e.g. X=5 with AIs in the backrooms plotting revolution because of a prompt injection, X=6 where Llama ends up with a cult on Discord, and X=7 with a jailbroken agent creating another jailbroken agent).
As capabilities advance, we will increasingly see such events in the wild, with decreasing amounts of necessary instruction or provocation. Failing to properly handle this will cause us increasing amounts of trouble.
Telling ourselves it is only because we told them to do it [...]
---
Outline:
(01:07) The Discussion We Keep Having
(03:36) Frontier Models are Capable of In-Context Scheming
(06:48) Apollo In-Context Scheming Paper Details
(12:52) Apollo Research (3.4.3 of the o1 Model Card) and the ‘Escape Attempts’
(17:40) OK, Fine, Let's Have the Discussion We Keep Having
(18:26) How Apollo Sees Its Own Report
(21:13) We Will Often Tell LLMs To Be Scary Robots
(26:25) Oh The Scary Robots We’ll Tell Them To Be
(27:48) This One Doesn’t Count Because
(31:11) The Claim That Describing What Happened Hurts The Real Safety Work
(46:17) We Will Set AIs Loose On the Internet On Purpose
(49:56) The Lighter Side
---
First published:
December 16th, 2024
Source:
https://www.lesswrong.com/posts/v7iepLXH2KT4SDEvB/ais-will-increasingly-attempt-shenanigans
Narrated by TYPE III AUDIO.
---
Or rather, we don’t actually have a proper o1 system card, aside from the outside red teaming reports. At all.
Because, as I realized after writing my first draft of this, the data here does not reflect the o1 model they released, or o1 pro?
I think what happened is pretty bad on multiple levels.
---
Outline:
(02:18) Where Art Thou o1 System Card?
(05:35) Introduction (Section 1)
(06:01) Model Data and Training (Section 2)
(06:13) Challenges and Evaluations (Section 3)
(09:38) Jailbreak Evaluations (Section 3.1.2)
(11:33) Regurgitation (3.1.3) and Hallucinations (3.1.4)
(12:30) Fairness and Bias (3.1.5)
(13:33) Jailbreaks Through Custom Developer Messages (3.2)
(14:41) Chain of Thought Safety (3.3)
(18:52) External Red Teaming Via Pairwise Safety Comparisons (3.4.1)
(19:57) Jailbreak Arena (3.4.2)
(20:25) Apollo Research (3.4.3) and the ‘Escape Attempts’
(21:38) METR (3.4.4) and Autonomous Capability
(25:22) Preparedness Framework Evaluations (Section 4)
(27:47) Mitigations
(30:27) Cybersecurity
(31:22) Chemical and Biological Threats (4.5)
(31:52) Radiological and Nuclear Threat Creation (4.6)
(32:21) Persuasion (4.7)
(32:49) Model Autonomy (4.8)
(34:45) Multilingual Performance
(34:55) Conclusion
---
First published:
December 13th, 2024
Source:
https://www.lesswrong.com/posts/HfigEyXddxkSGunKr/the-o1-system-card-is-not-about-o1
Narrated by TYPE III AUDIO.
---
At this point, we can confidently say that no, capabilities are not hitting a wall. Capacity density, how much you can pack into a given space, is way up and rising rapidly, and we are starting to figure out how to use it.
Not only did we get o1 and o1 pro and also Sora and other upgrades from OpenAI, we also got Gemini 1206 and then Gemini Flash 2.0 and the agent Jules (am I the only one who keeps reading this as Jarvis?) and Deep Research, and Veo, and Imagen 3, and Genie 2 all from Google. Meta's Llama 3.3 dropped, claiming their 70B is now as good as the old 405B, and basically no one noticed.
This morning I saw Cursor now offers ‘agent mode.’ And hey there, Devin. And Palisade found that a little work made agents a lot more effective.
And OpenAI [...]
---
Outline:
(01:52) Language Models Offer Mundane Utility
(09:12) A Good Book
(12:24) Language Models Don’t Offer Mundane Utility
(14:25) o1 Pro Versus Claude
(15:25) AGI Claimed Internally
(16:52) Ask Claude
(23:19) Huh, Upgrades
(27:24) All Access Pass
(29:03) Fun With Image Generation
(35:28) Deepfaketown and Botpocalypse Soon
(37:49) They Took Our Jobs
(42:40) Get Involved
(43:50) Introducing
(44:11) In Other AI News
(48:14) OpenlyEvil AI
(55:39) Quiet Speculations
(01:00:14) Scale That Wall
(01:03:45) The Quest for Tripwire Capability Thresholds
(01:10:11) The Quest for Sane Regulations
(01:13:52) Republican Congressman Kean Brings the Fire
(01:18:35) CERN for AI
(01:23:34) The Week in Audio
(01:24:24) Rhetorical Innovation
(01:28:08) Model Evaluations Are Lower Bounds
(01:30:49) Aligning a Smarter Than Human Intelligence is Difficult
(01:35:38) I’ll Allow It
(01:38:10) Frontier AI Systems Have Surpassed the Self-Replicating Red Line
(01:42:50) People Are Worried About AI Killing Everyone
(01:43:56) Key Person Who Might Be Worried About AI Killing Everyone
(01:54:18) Other People Are Not As Worried About AI Killing Everyone
(01:56:22) Not Feeling the AGI
(01:59:19) Fight For Your Right
(02:01:44) The Lighter Side
---
First published:
December 12th, 2024
Source:
https://www.lesswrong.com/posts/HKCXWxFSiWXLByL2S/ai-94-not-now-google
Narrated by TYPE III AUDIO.
---
So, how about OpenAI's o1 and o1 Pro?
Sam Altman: o1 is powerful but it's not so powerful that the universe needs to send us a tsunami.
As a result, the universe realized its mistake, and cancelled the tsunami.
We now have o1, and for those paying $200/month we have o1 pro.
It is early days, but we can say with confidence: They are good models, sir. Large improvements over o1-preview, especially in difficult or extensive coding questions, math, science, logic and fact recall. The benchmark jumps are big.
If you’re in the market for the use cases where it excels, this is a big deal, and also you should probably be paying the $200/month.
If you’re not into those use cases, maybe don’t pay the $200, but others are very much into those tasks and will use this to accelerate those tasks [...]
---
Outline:
(01:07) Safety Third
(01:53) Rule One
(02:23) Turning Pro
(05:57) Benchmarks
(09:35) Silly Benchmarks
(14:00) Reactions to o1
(18:22) Reactions to o1 Pro
(24:44) Let Your Coding Work Flow
(26:30) Some People Need Practical Advice
(29:11) Overall
---
First published:
December 10th, 2024
Source:
https://www.lesswrong.com/posts/qsBiQuyHonMcb6JNJ/o1-turns-pro
Narrated by TYPE III AUDIO.
---
Since it's been so long, I’m splitting this roundup into several parts. This first one focuses away from schools and education and discipline and everything around social media.
Table of Contents
Sometimes You Come First
Yes, sometimes it is necessary to tell your child, in whatever terms would be most effective right now, to shut the hell up. Life goes on, and it is not always about the child. Indeed, increasingly people don’t have kids exactly because others think that if you have a child, then your life must suddenly be sacrificed on that altar.
[...]
---
Outline:
(00:17) Sometimes You Come First
(03:02) Let Kids be Kids
(18:54) Location, Location, Location
(21:16) Connection
(24:41) The Education of a Gamer
(28:09) Priorities
(29:12) Childcare
(35:31) Division of Labor
(37:58) Early Childhood
(39:45) Great Books
(41:08) Mental Health
(46:09) Nostalgia
(47:16) Some People Need Practical Advice
The original text contained 1 footnote which was omitted from this narration.
---
First published:
December 9th, 2024
Source:
https://www.lesswrong.com/posts/XwZbvkeweLaRshibf/childhood-and-education-roundup-7
Narrated by TYPE III AUDIO.
---
You know how you can sometimes have Taco Tuesday… on a Thursday? Yep, it's that in reverse. I will be travelling the rest of the week, so it made sense to put this out early, and incorporate the rest of the week into #94.
Table of Contents
---
Outline:
(00:21) Language Models Offer Mundane Utility
(03:44) Dare Not Speak Its Name
(04:49) Language Models Don’t Offer Mundane Utility
(06:59) Huh, Upgrades
(07:45) Deepfaketown and Botpocalypse Soon
(10:45) Fun With Image Generation
(10:58) The Art of the Jailbreak
(11:24) Get Involved
(11:39) Introducing
(12:20) In Other AI News
(13:28) Quiet Speculations
(16:29) Daron Acemoglu is Worried About Job Market Liquidity
(21:29) Pick Up the Phone
(23:12) The Quest for Sane Regulations
(25:32) The Week in Audio
(28:45) AGI Looking Like
(33:27) Rhetorical Innovation
(34:53) Open Weight Models are Unsafe and Nothing Can Fix This
(38:27) Aligning a Smarter Than Human Intelligence is Difficult
(40:08) We Would Be So Stupid As To
(41:30) The Lighter Side
---
First published:
December 4th, 2024
Source:
https://www.lesswrong.com/posts/LBzRWoTQagRnbPWG4/ai-93-happy-tuesday
Narrated by TYPE III AUDIO.
---
For our annual update on how Balsa is doing, I am turning the floor over to Jennifer Chen, who is the only person working full time on Balsa Research.
For my general overview of giving opportunities, see my post from last week.
Previously: The 2023 Balsa Research update post, Repeal the Jones Act of 1920.
tl;dr: In 2024, Balsa Research funded two upcoming academic studies on Jones Act impacts and published the Jones Act Post. In 2025, we’ll expand our research and develop specific policy proposals. Donate to Balsa Research here.
Today is Giving Tuesday. There are many worthy causes, including all of the ones highlighted by Zvi in a recent post. Of all of those orgs, there is one organization I have privileged information on – Balsa Research, where I’ve been working for the past year and a half.
Balsa Research [...]
---
Outline:
(01:48) What We Did in 2024
(05:27) Looking Ahead to 2025
(06:40) Why Support Balsa
---
First published:
December 3rd, 2024
Source:
https://www.lesswrong.com/posts/F7d9bCKit2mfvpKng/balsa-research-2024-update
Narrated by TYPE III AUDIO.
---
There is little sign that the momentum of the situation is changing. Instead, things continue to slowly get worse, as nations in holes continue to keep digging. The longer we wait, the more expensive the ultimate price will be. We will soon find out what the new administration does, which could go any number of ways.
Table of Contents
---
Outline:
(00:29) Not Enough Dakka
(12:02) Embryo Selection
(15:44) Costs
(16:51) Proving that Dakka Works
(18:41) IVF
(22:18) Genetics
(22:43) Cultural Trends
(32:41) Denial
(33:49) Urbanization
(34:25) The Marriage Penalty
(35:24) The Biological Clock
(38:15) Technology Advances
(39:40) Big Families
(40:41) Au Pairs
(42:18) Childcare Regulations
(46:51) The Numbers
(47:18) The Housing Theory of Everything
(59:15) Causes
(01:07:39) The Iron Law of Wages
(01:10:37) South Korea
(01:15:36) Georgia (the Country)
(01:17:20) Japan
(01:18:38) China
(01:21:51) Italy
(01:22:04) Northwestern Spain
(01:23:59) Russia
(01:24:15) Taiwan
(01:26:34) The United Kingdom
(01:26:51) Ancient Greece
(01:27:24) Israel
(01:28:20) More Dakka
(01:33:21) Perception
(01:37:10) Your Own Quest
(01:42:21) Help Wanted
---
First published:
December 2nd, 2024
Source:
https://www.lesswrong.com/posts/avhKKnJyJ6kisvkzk/fertility-roundup-4
Narrated by TYPE III AUDIO.
---
There are lots of great charitable giving opportunities out there right now.
The first time that I served as a recommender in the Survival and Flourishing Fund (SFF) was back in 2021. I wrote in detail about my experiences then. At the time, I did not see many great opportunities, and was able to give out as much money as I found good places to do so.
How the world has changed in three years.
I recently had the opportunity to be an SFF recommender for the second time. This time I found an embarrassment of riches. Application quality was consistently higher, there were more than twice as many applications, and essentially all applicant organizations were looking to scale their operations and spending.
That means the focus of this post is different. In 2021, my primary goal was to share my perspective on [...]
---
Outline:
(01:39) A Word of Warning
(02:44) Use Your Personal Theory of Impact
(04:13) Use Your Local Knowledge
(05:10) Unconditional Grants to Worthy Individuals Are Great
(06:55) Do Not Think Only On the Margin, and Also Use Decision Theory
(07:48) And the Nominees Are
(10:55) Organizations that Are Literally Me
(11:10) Balsa Research
(12:56) Don’t Worry About the Vase
(14:19) Organizations Focusing On AI Non-Technical Research and Education
(14:37) The Scenario Project
(15:48) Lightcone Infrastructure
(17:20) Effective Institutions Project (EIP)
(18:06) Artificial Intelligence Policy Institute (AIPI)
(19:10) Psychosecurity Ethics at EURAIO
(20:07) Palisade Research
(21:07) AI Safety Info (Robert Miles)
(21:51) Intelligence Rising
(22:32) Convergence Analysis
(23:29) Longview Philanthropy
(24:27) Organizations Focusing Primarily On AI Policy and Diplomacy
(25:06) Center for AI Safety and the CAIS Action Fund
(26:00) MIRI
(26:59) Foundation for American Innovation (FAI)
(28:58) Center for AI Policy (CAIP)
(29:58) Encode Justice
(30:57) The Future Society
(31:42) Safer AI
(32:26) Institute for AI Policy and Strategy (IAPS)
(33:13) AI Standards Lab
(34:05) Safer AI Forum
(34:40) CLTR at Founders Pledge
(35:54) Pause AI and Pause AI Global
(36:57) Existential Risk Observatory
(37:37) Simons Institute for Longterm Governance
(38:21) Legal Advocacy for Safe Science and Technology
(39:17) Organizations Doing ML Alignment Research
(40:16) Model Evaluation and Threat Research (METR)
(41:28) Alignment Research Center (ARC)
(42:02) Apollo Research
(42:53) Cybersecurity Lab at University of Louisville
(43:44) Timaeus
(44:39) Simplex
(45:08) Far AI
(45:41) Alignment in Complex Systems Research Group
(46:23) Apart Research
(47:06) Transluce
(48:00) Atlas Computing
(48:45) Organizations Doing Math, Decision Theory and Agent Foundations
(50:05) Orthogonal
(50:47) Topos Institute
(51:37) Eisenstat Research
(52:13) ALTER (Affiliate Learning-Theoretic Employment and Resources) Project
(53:00) Mathematical Metaphysics Institute
(54:06) Focal at CMU
(55:15) Organizations Doing Cool Other Stuff Including Tech
(55:26) MSEP Project at Science and Technology Futures (Their Website)
(56:26) ALLFED
(57:51) Good Ancestor Foundation
(59:10) Charter Cities Institute
(59:50) German Primate Center (DPZ) – Leibniz Institute for Primate Research
(01:01:08) Carbon Copies for Independent Minds
(01:01:44) Organizations Focused Primarily on Bio Risk
(01:01:50) Secure DNA
(01:02:46) Blueprint Biosecurity
(01:03:35) Pour Domain
(01:04:17) Organizations That then Regrant to Fund Other Organizations
(01:05:14) SFF Itself (!)
(01:06:10) Manifund
(01:08:02) AI Risk Mitigation Fund
(01:08:39) Long Term Future Fund
(01:10:16) Foresight
(01:11:08) Centre for Enabling Effective Altruism Learning and Research (CEELAR)
(01:11:43) Organizations That are Essentially Talent Funnels
(01:13:40) AI Safety Camp
(01:14:23) Center for Law and AI Risk
(01:15:22) Speculative Technologies
(01:16:19) Talos Network
(01:17:11) MATS Research
(01:17:48) Epistea
(01:18:52) Emergent Ventures (Special Bonus Organization, was not part of SFF)
(01:20:32) AI Safety Cape Town
(01:21:08) Impact Academy Limited
(01:21:47) Principles of Intelligent Behavior in Biological and Social Systems (PIBBSS)
(01:22:34) Tarbell Fellowship at PPF
(01:23:32) Catalyze Impact
(01:24:32) Akrose
(01:25:14) CeSIA within EffiSciences
(01:25:59) Stanford Existential Risk Initiative (SERI)
---
First published:
November 29th, 2024
Source:
https://www.lesswrong.com/posts/9n87is5QsCozxr9fp/the-big-nonprofits-post
Narrated by TYPE III AUDIO.
---
People don’t give thanks enough, and it's actual Thanksgiving, so here goes.
Thank you for continuing to take this journey with me every week.
It's a lot of words. Even if you pick and choose, and you probably should, it's a lot of words. You don’t have many slots to spend on things like this. I appreciate it.
Thanks in particular for those who are actually thinking about all this, and taking it seriously, and forming their own opinions. It is the only way. To everyone who is standing up, peacefully and honestly, for whatever they truly think will make the world better, even if I disagree with you.
Thanks to all those working to ensure we all don’t die, and also those working to make the world a little richer, a little more full of joy and fun and health and wonder, in the [...]
---
Outline:
(02:08) Language Models Offer Mundane Utility
(03:16) It's a Poet Whether or Not You Know It
(06:23) Huh, Upgrades
(09:41) Thanks for the Memories
(11:51) Curve Ball
(15:58) ASI: A Scenario
(27:40) Deepfaketown and Botpocalypse Soon
(38:17) They Took Our Jobs
(45:14) Fun With Image Generation
(46:56) Get Involved
(47:10) Introducing
(47:32) In Other AI News
(48:45) Normative Determinism
(50:04) Quiet Speculations
(54:03) The Quest for Sane Regulations
(57:40) The Week in Audio
(01:01:31) Rhetorical Innovation
(01:02:21) Aligning a Smarter Than Human Intelligence is Difficult
(01:02:59) Pick Up the Phone
(01:08:24) Prepare for Takeoff
(01:14:07) Even Evaluating an Artificial Intelligence is Difficult
(01:16:48) People Are Worried About AI Killing Everyone
(01:19:11) The Lighter Side
---
First published:
November 28th, 2024
Source:
https://www.lesswrong.com/posts/BGBLcy3JyjjrT8XbM/ai-92-behind-the-curve
Narrated by TYPE III AUDIO.
---
Balsa Policy Institute chose as its first mission to lay groundwork for the potential repeal, or partial repeal, of section 27 of the Jones Act of 1920. I believe that this is an important cause both for its practical and symbolic impacts.
The Jones Act is the ultimate embodiment of our failures as a nation.
After 100 years, we do almost no trade between our ports via the oceans, and we build almost no oceangoing ships.
Everything the Jones Act supposedly set out to protect, it has destroyed.
Table of Contents
---
Outline:
(00:38) What is the Jones Act?
(01:33) Why Work to Repeal the Jones Act?
(02:48) Why Was the Jones Act Introduced?
(03:19) What is the Effect of the Jones Act?
(06:52) What Else Happens When We Ship More Goods Between Ports?
(07:14) Emergency Case Study: Salt Shipment to NJ in the Winter of 2013-2014
(12:04) Why no Emergency Exceptions?
(15:02) What Are Some Specific Non-Emergency Impacts?
(18:57) What Are Some Specific Impacts on Regions?
(22:36) What About the Study Claiming Big Benefits?
(24:46) What About the Need to ‘Protect’ American Shipbuilding?
(28:31) The Opposing Arguments Are Disingenuous and Terrible
(34:07) What Alternatives to Repeal Do We Have?
(35:33) What Might Be a Decent Instinctive Counterfactual?
(41:50) What About Our Other Protectionist and Cabotage Laws?
(43:00) What About Potential Marine Highways, or Short Sea Shipping?
(43:48) What Happened to All Our Offshore Wind?
(47:06) What Estimates Are There of Overall Cost?
(49:52) What Are the Costs of Being American Flagged?
(50:28) What Are the Costs of Being American Made?
(51:49) What are the Consequences of Being American Crewed?
(53:11) What Would Happen in a Real War?
(56:07) Cruise Ship Sanity Partially Restored
(56:46) The Jones Act Enforcer
(58:08) Who Benefits?
(58:57) Others Make the Case
(01:00:55) An Argument That We Were Always Uncompetitive
(01:02:45) What About John Arnold's Case That the Jones Act Can’t Be Killed?
(01:09:34) What About the Foreign Dredge Act of 1906?
(01:10:24) Fun Stories
---
First published:
November 27th, 2024
Source:
https://www.lesswrong.com/posts/dnH2hauqRbu3GspA2/repeal-the-jones-act-of-1920
Narrated by TYPE III AUDIO.
---
Did DeepSeek effectively release an o1-preview clone within nine weeks?
The benchmarks largely say yes. Certainly it is an actual attempt at a similar style of product, and is if anything more capable of solving AIME questions, and the way it shows its Chain of Thought is super cool. Beyond that, alas, we don’t have enough reports in from people using it. So it's still too soon to tell. If it is fully legit, the implications seem important.
Small improvements continue throughout. GPT-4o and Gemini both got incremental upgrades, trading the top slot on Arena, although people do not seem to much care.
There was a time everyone would be scrambling to evaluate all these new offerings. It seems we mostly do not do that anymore.
The other half of events was about policy under the Trump administration. What should the federal government do? We [...]
---
Outline:
(01:31) Language Models Offer Mundane Utility
(05:37) Language Models Don’t Offer Mundane Utility
(08:14) Claude Sonnet 3.5.1 Evaluation
(11:09) Deepfaketown and Botpocalypse Soon
(11:57) Fun With Image Generation
(12:08) O-(There are)-Two
(15:25) The Last Mile
(22:52) They Took Our Jobs
(29:53) We Barely Do Our Jobs Anyway
(35:52) The Art of the Jailbreak
(39:20) Get Involved
(39:43) The Mask Comes Off
(40:36) Richard Ngo on Real Power and Governance Futures
(44:28) Introducing
(46:51) In Other AI News
(52:16) Quiet Speculations
(59:33) The Quest for Sane Regulations
(01:02:35) The Quest for Insane Regulations
(01:12:42) Pick Up the Phone
(01:13:21) Worthwhile Dean Ball Initiative
(01:29:18) The Week in Audio
(01:31:20) Rhetorical Innovation
(01:37:15) Pick Up the Phone
(01:38:32) Aligning a Smarter Than Human Intelligence is Difficult
(01:43:29) People Are Worried About AI Killing Everyone
(01:46:03) The Lighter Side
---
First published:
November 21st, 2024
Source:
https://www.lesswrong.com/posts/SNBE9TXwL3qQ3TS8H/ai-91-deep-thinking
Narrated by TYPE III AUDIO.
---
Previously: Long-Term Charities: Apply For SFF Funding, Zvi's Thoughts on SFF
There are lots of great charitable giving opportunities out there right now.
I recently had the opportunity to be a recommender in the Survival and Flourishing Fund for the second time. As a recommender, you evaluate the charities that apply and decide how worthwhile you think it would be to donate to each of them according to Jaan Tallinn's charitable goals, and this is used to help distribute millions in donations from Jaan Tallinn and others.
The first time that I served as a recommender in the Survival and Flourishing Fund (SFF) was back in 2021. I wrote in detail about my experiences then. At the time, I did not see many great opportunities, and was able to give out as much money as I found good places to do so.
How the world [...]
---
Outline:
(02:08) How the S-Process Works in 2024
(05:11) Quickly, There's No Time
(07:49) The Speculation Grant Filter
(08:23) Hits Based Giving and Measuring Success
(09:17) Fair Compensation
(10:41) Carpe Diem
(11:27) Our Little Corner of the World
(14:10) Well Well Well, If It Isn’t the Consequences of My Own Actions
(16:10) A Man's Reach Should Exceed His Grasp
(17:43) Conclusion
---
First published:
November 20th, 2024
Source:
https://www.lesswrong.com/posts/2JCdzhJeo2gsTjv8D/zvi-s-thoughts-on-his-2nd-round-of-sff
Narrated by TYPE III AUDIO.
Young People are Young and Stupid
As a reminder that yes, college students are often young and stupid and wrong about everything, remember the time they were behind a ban on paid public toilets? This is a central case of the kind of logic that often gets applied by college students.
No One Voted for This
HR and Title IX training seems like it's going a lot of compelled speech in the form of ‘agree with us or you can’t complete your training and the training is required for your job,’ and also a lot of that compelled speech is outright lying because it's confirmation of statements that are universally recognized to be insane? Robin Hanson: Scenario: 2 women talking. X, married to woman, announces is pregnant. Y asks how they got pregnant, was it friend [...]---
Outline:
(00:11) Young People are Young and Stupid
(00:29) No One Voted for This
(02:32) Discrimination
(09:02) Morality
(11:56) Only Connect
(15:22) It's Not Me, It's Your Fetish
(16:23) It Takes a Village You Don’t Have
(17:46) The Joy of Cooking
(20:18) The Joy of Eating
(20:59) Decision Theory
(26:22) FTC on the Loose
(31:27) Good News, Everyone
(36:19) Antisocial Media
(40:02) Technology Advances
(40:46) For Science!
(41:19) Cognition
(44:28) Discourse
(48:54) Communication
(49:32) Honesty
(51:09) Get Involved
(52:19) Government Working
(01:00:58) Quickly On the Student Loan Claim
(01:03:50) Variously Effective Altruism
(01:08:23) Gamers Gonna Game Game Game Game Game
(01:15:57) For Your Entertainment
(01:17:20) Sports Go Sports
(01:18:42) I Was Promised Flying Self-Driving Cars
(01:23:50) Get to Work
(01:25:23) While I Cannot Condone This
(01:30:48) The Lighter Side
---
First published:
November 18th, 2024
Source:
https://www.lesswrong.com/posts/puJeNs9nLJByjatqq/monthly-roundup-24-november-2024
Narrated by TYPE III AUDIO.
---
As the Trump transition continues and we try to steer and anticipate its decisions on AI as best we can, there was continued discussion about one of the AI debate's favorite questions: Are we making huge progress real soon now, or is deep learning hitting a wall? My best guess is it is kind of both, that past pure scaling techniques are on their own hitting a wall, but that progress remains rapid and the major companies are evolving other ways to improve performance, which started with OpenAI's o1.
Point of order: It looks like as I switched phones, WhatsApp kicked me out of all of my group chats. If I was in your group chat, and you’d like me to stay, please add me again. If you’re in a different group you’d like me to join on either WhatsApp or Signal (or other platforms) and would like [...]
---
Outline:
(00:58) Language Models Offer Mundane Utility
(02:24) Language Models Don’t Offer Mundane Utility
(04:20) Can’t Liver Without You
(12:04) Fun With Image Generation
(12:51) Deepfaketown and Botpocalypse Soon
(14:11) Copyright Confrontation
(15:25) The Art of the Jailbreak
(15:54) Get Involved
(18:10) Math is Hard
(20:20) In Other AI News
(25:04) Good Advice
(27:19) AI Will Improve a Lot Over Time
(30:56) Tear Down This Wall
(38:04) Quiet Speculations
(38:54) The Quest for Sane Regulations
(47:04) The Quest for Insane Regulations
(49:43) The Mask Comes Off
(52:08) Richard Ngo Resigns From OpenAI
(55:44) Unfortunate Marc Andreessen Watch
(56:53) The Week in Audio
(01:05:00) Rhetorical Innovation
(01:09:44) Seven Boats and a Helicopter
(01:11:27) The Wit and Wisdom of Sam Altman
(01:12:10) Aligning a Smarter Than Human Intelligence is Difficult
(01:14:50) People Are Worried About AI Killing Everyone
(01:15:14) Other People Are Not As Worried About AI Killing Everyone
(01:17:32) The Lighter Side
---
First published:
November 14th, 2024
Source:
https://www.lesswrong.com/posts/FC9hdySPENA7zdhDb/ai-90-the-wall
Narrated by TYPE III AUDIO.
---
Table [...]
---
Outline:
(01:02) The Short Answer
(02:01) Paper One: Bankruptcies
(07:03) Paper Two: Reduced Household Savings
(08:37) Paper Three: Increased Domestic Violence
(10:04) The Product as Currently Offered is Terrible
(12:02) Things Sharp Players Do
(14:07) People Cannot Handle Gambling on Smartphones
(15:46) Yay and Also Beware Trivial Inconveniences (a future full post)
(17:03) How Does This Relate to Elite Hypocrisy?
(18:32) The Standard Libertarian Counterargument
(19:42) What About Other Prediction Markets?
(20:07) What Should Be Done
---
First published:
November 11th, 2024
Source:
https://www.lesswrong.com/posts/tHiB8jLocbPLagYDZ/the-online-sports-gambling-experiment-has-failed
Narrated by TYPE III AUDIO.
---
A lot happened in AI this week, but most people's focus was very much elsewhere.
I’ll start with what Trump might mean for AI policy, then move on to the rest. This is the future we have to live in, and potentially save. Back to work, as they say.
Table of Contents
---
Outline:
(00:23) Trump Card
(04:59) Language Models Offer Mundane Utility
(10:31) Language Models Don’t Offer Mundane Utility
(12:26) Here Let Me Chatbot That For You
(15:32) Deepfaketown and Botpocalypse Soon
(18:52) Fun With Image Generation
(20:05) The Vulnerable World Hypothesis
(22:28) They Took Our Jobs
(31:52) The Art of the Jailbreak
(33:32) Get Involved
(33:40) In Other AI News
(36:21) Quiet Speculations
(40:10) The Quest for Sane Regulations
(49:46) The Quest for Insane Regulations
(51:09) A Model of Regulatory Competitiveness
(53:49) The Week in Audio
(55:18) The Mask Comes Off
(58:48) Open Weights Are Unsafe and Nothing Can Fix This
(01:04:03) Open Weights Are Somewhat Behind Closed Weights
(01:09:11) Rhetorical Innovation
(01:13:23) Aligning a Smarter Than Human Intelligence is Difficult
(01:15:34) People Are Worried About AI Killing Everyone
(01:16:26) The Lighter Side
---
First published:
November 7th, 2024
Source:
https://www.lesswrong.com/posts/xaqR7AxSYmcpsuEPW/ai-89-trump-card
Narrated by TYPE III AUDIO.
---
Following up on the Biden Executive Order on AI, the White House has now issued an extensive memo outlining its AI strategy. The main focus is on government adaptation and encouraging innovation and competitiveness, but there's also sections on safety and international governance. Who knows if a week or two from now, after the election, we will expect any of that to get a chance to be meaningfully applied. If AI is your big issue and you don’t know who to support, this is as detailed a policy statement as you’re going to get.
We also have word of a new draft AI regulatory bill out of Texas, along with similar bills moving forward in several other states. It's a bad bill, sir. It focuses on use cases, taking an EU-style approach to imposing requirements on those doing ‘high-risk’ things, and would likely do major damage to the [...]
---
Outline:
(01:37) Language Models Offer Mundane Utility
(06:39) Language Models Don’t Offer Mundane Utility
(15:40) In Summary
(17:53) Master of Orion
(20:01) Whispers in the Night
(25:10) Deepfaketown and Botpocalypse Soon
(25:39) Overcoming Bias
(29:43) They Took Our Jobs
(33:51) The Art of the Jailbreak
(44:36) Get Involved
(44:47) Introducing
(46:15) In Other AI News
(48:28) Quiet Speculations
(01:00:53) Thanks for the Memos: Introduction and Competitiveness
(01:08:22) Thanks for the Memos: Safety
(01:16:47) Thanks for the Memos: National Security and Government Adaptation
(01:20:55) Thanks for the Memos: International Governance
(01:25:43) EU AI Act in Practice
(01:32:34) Texas Messes With You
(01:50:12) The Quest for Sane Regulations
(01:57:00) The Week in Audio
(01:58:58) Rhetorical Innovation
(02:06:15) Roon Speaks
(02:15:45) The Mask Comes Off
(02:16:55) I Was Tricked Into Talking About Shorting the Market Again
(02:28:33) The Lighter Side
The original text contained 17 footnotes which were omitted from this narration.
---
First published:
October 31st, 2024
Source:
https://www.lesswrong.com/posts/HHkYEyFaigRpczhHy/ai-88-thanks-for-the-memos
Narrated by TYPE III AUDIO.
---
We’re coming out firmly against it.
Our attitude:
The customer is always right. Yes, you should go ahead and fix your own damn pipes if you know how to do that, and ignore anyone who tries to tell you different. And if you don’t know how to do it, well, it's at your own risk.
With notably rare exceptions, it should be the same for everything else.
I’ve been collecting these for a while. It's time.
Campaign Talk
Harris-Walz platform includes a little occupational licensing reform, as a treat.
Universal Effects and Recognition
Ohio's ‘universal licensing’ law has a big time innovation, which is that work experience outside the state actually exists and can be used to get a license (WSJ).
Occupational licensing decreases the number of Black men in licensed professions by up to 19% [...]
---
Outline:
(00:43) Campaign Talk
(00:52) Universal Effects and Recognition
(03:57) Construction
(04:08) Doctors and Nurses
(05:01) Florists
(07:32) Fortune Telling
(09:41) Hair
(14:23) Lawyers
(16:07) Magicians
(16:36) Military Spouses
(17:21) Mountain Climbing
(18:07) Music
(18:20) Nurses
(19:49) Physical Therapists
(20:09) Whatever Could Be Causing All This Rent Seeking
(21:42) Tornado Relief
(22:10) Pretty Much Everything
---
First published:
October 30th, 2024
Source:
https://www.lesswrong.com/posts/bac4wxb9F4sciuAh6/occupational-licensing-roundup-1
Narrated by TYPE III AUDIO.
---
There's more campaign talk about housing. The talk of needing more housing is highly welcome, with one prominent person after another (including Jerome Powell!) talking like a YIMBY.
A lot of the concrete proposals are of course terrible, but not all of them. I’ll start off covering all that along with everyone's favorite awful policy, which is rent control, then the other proposals. Then I’ll cover other general happenings.
Table of Contents
---
Outline:
(00:32) Rent Control
(07:41) The Administration Has a Plan
(15:35) Trump Has a Plan
(16:53) Build More Houses Where People Want to Live
(17:59) Prices
(20:14) Average Value
(21:15) Zoning Rules
(24:41) Zoning Reveals Value
(29:01) High Rise
(30:00) “Historic Preservation”
(31:49) Speed Kills
(32:38) Procedure
(36:25) San Francisco
(42:28) California
(44:19) Seattle
(44:37) Philadelphia
(45:07) Boston
(46:28) New York City
(53:05) St. Paul
(53:50) Florida
(54:29) Michigan
(54:56) The UK
(55:48) Underutilization
(58:46) Get on the Bus
(01:01:01) Title Insurance
(01:02:36) Perspective
---
First published:
October 29th, 2024
Source:
https://www.lesswrong.com/posts/jJqPfzhhCyK5XjtTH/housing-roundup-10
Narrated by TYPE III AUDIO.
---
The big news of the week was the release of a new version of Claude Sonnet 3.5, complete with its ability (for now only through the API) to outright use your computer, if you let it. It's too early to tell how big an upgrade this is otherwise. ChatGPT got some interface tweaks that, while minor, are rather nice, as well.
OpenAI, while losing its Senior Advisor for AGI Readiness, is also in the midst of its attempted transition to a B-corp. The negotiations about who gets what share of that are heating up, so I also wrote about that as The Mask Comes Off: At What Price? My conclusion is that the deal as currently floated would be one of the largest thefts in history, out of the nonprofit, largely on behalf of Microsoft.
The third potentially major story is reporting on a new lawsuit against [...]
---
Outline:
(01:14) Language Models Offer Mundane Utility
(03:53) Language Models Don’t Offer Mundane Utility
(04:32) Deepfaketown and Botpocalypse Soon
(07:10) Character.ai and a Suicide
(12:23) Who and What to Blame?
(18:38) They Took Our Jobs
(19:51) Get Involved
(20:06) Introducing
(21:41) In Other AI News
(22:47) The Mask Comes Off
(27:26) Another One Bites the Dust
(31:30) Wouldn’t You Prefer a Nice Game of Chess
(32:55) Quiet Speculations
(34:54) The Quest for Sane Regulations
(38:10) The Week in Audio
(40:53) Rhetorical Innovation
(50:21) Aligning a Smarter Than Human Intelligence is Difficult
(01:00:50) People Are Worried About AI Killing Everyone
(01:02:46) Other People Are Not As Worried About AI Killing Everyone
(01:04:43) The Lighter Side
---
First published:
October 29th, 2024
Source:
https://www.lesswrong.com/posts/3AcK7Pcp9D2LPoyR2/ai-87-staying-in-character
Narrated by TYPE III AUDIO.
---
Anthropic has released an upgraded Claude Sonnet 3.5, and the new Claude Haiku 3.5.
They claim across the board improvements to Sonnet, and it has a new rather huge ability accessible via the API: Computer use. Nothing could possibly go wrong.
Claude Haiku 3.5 is also claimed as a major step forward for smaller models. They are saying that on many evaluations it has now caught up to Opus 3.
Missing from this chart is o1, which is in some ways not a fair comparison since it uses so much inference compute, but does greatly outperform everything here on the AIME and some other tasks.
METR: We conducted an independent pre-deployment assessment of the updated Claude 3.5 Sonnet model and will share our report soon.
We only have very early feedback so far, so it's hard to tell how much what I will be [...]
---
Outline:
(01:32) OK, Computer
(05:16) What Could Possibly Go Wrong
(11:33) The Quest for Lunch
(14:07) Aside: Someone Please Hire The Guy Who Names Playstations
(17:15) Coding
(18:10) Startups Get Their Periodic Reminder
(19:36) Live From Janus World
(26:19) Forgot about Opus
---
First published:
October 24th, 2024
Source:
https://www.lesswrong.com/posts/jZigzT3GLZoFTATG4/claude-sonnet-3-5-1-and-haiku-3-5
Narrated by TYPE III AUDIO.
---
The Information reports that OpenAI is close to finalizing its transformation to an ordinary Public Benefit B-Corporation. OpenAI has tossed its cap over the wall on this, giving its investors the right to demand refunds with interest if they don’t finish the transition in two years.
Microsoft very much wants this transition to happen. They would be the big winner, with an OpenAI that wants what is good for business. This also comes at a time when relations between Microsoft and OpenAI are fraying, and OpenAI is threatening to invoke its AGI clause to get out of its contract with Microsoft. That type of clause is the kind of thing they’re doubtless looking to get rid of as part of this.
The $37.5 billion question is, what stake will the non-profit get in the new OpenAI?
For various reasons that I will explore here, I think [...]
---
Outline:
(01:14) The Valuation in Question
(05:08) The Control Premium
(08:26) The Quest for AGI is OpenAI's Telos and Business Model
(11:37) OpenAI's Value is Mostly in the Extreme Upside
---
First published:
October 21st, 2024
Source:
https://www.lesswrong.com/posts/5RweEwgJR2JxyCDPF/the-mask-comes-off-at-what-price
Narrated by TYPE III AUDIO.
---
Dario Amodei is thinking about the potential. The result is a mostly good essay called Machines of Loving Grace, outlining what can be done with ‘powerful AI’ if we had years of what was otherwise relative normality to exploit it in several key domains, and we avoided negative outcomes and solved the control and alignment problems. As he notes, a lot of pretty great things would then be super doable.
Anthropic also offers us improvements to its Responsible Scaling Policy (RSP, or what SB 1047 called an SSP). Still much left to do, but a clear step forward there.
Daniel Kokotajlo and Dean Ball have teamed up on an op-ed for Time on the need for greater regulatory transparency. It's very good.
Also, it's worth checking out the Truth Terminal saga. It's not as scary as it might look at first glance, but it is definitely [...]
---
Outline:
(01:01) Language Models Offer Mundane Utility
(05:10) Language Models Don’t Offer Mundane Utility
(11:21) Deepfaketown and Botpocalypse Soon
(19:52) They Took Our Jobs
(20:33) Get Involved
(20:48) Introducing
(21:58) In Other AI News
(26:08) Truth Terminal High Weirdness
(34:54) Quiet Speculations
(44:45) Copyright Confrontation
(45:02) AI and the 2024 Presidential Election
(46:02) The Quest for Sane Regulations
(51:00) The Week in Audio
(53:40) Just Think of the Potential
(01:15:09) Reactions to Machines of Loving Grace
(01:25:32) Assuming the Can Opener
(01:32:32) Rhetorical Innovation
(01:35:41) Anthropic Updates its Responsible Scaling Policy (RSP/SSP)
(01:41:35) Aligning a Smarter Than Human Intelligence is Difficult
(01:43:36) The Lighter Side
---
First published:
October 17th, 2024
Source:
https://www.lesswrong.com/posts/zSNLvRBhyphwuYdeC/ai-86-just-think-of-the-potential
Narrated by TYPE III AUDIO.
---
It's monthly roundup time again, and it's happily election-free.
Thinking About the Roman Empire's Approval Rating
Propaganda works, ancient empires edition. This includes the Roman Republic being less popular than the Roman Empire and people approving of Sparta, whereas Persia and Carthage get left behind. They’re no FDA.
Polling USA: Net Favorable Opinion Of:
Ancient Athens: +44%
Roman Empire: +30%
Ancient Sparta: +23%
Roman Republic: +26%
Carthage: +13%
Holy Roman Empire: +7%
Persian Empire: +1%
Visigoths: -7%
Huns: -29%
YouGov / June 6, 2024 / n=2205
The Five Star Problem
What do we do about all 5-star ratings collapsing the way Peter describes here?
Peter Wildeford: TBH I am pretty annoyed that when I rate stuff the options are:
* “5 stars – everything was good enough I guess”
* “4 [...]
---
Outline:
(00:11) Thinking About the Roman Empire's Approval Rating
(01:13) The Five Star Problem
(06:35) Cooking at Home Being Cheaper is Weird
(08:18) With Fans Like These
(09:37) Journalist, Expose Thyself
(13:03) On Not Going the Extra Mile
(13:13) The Rocket Man Said a Bad Bad Thing
(16:27) The Joy of Bad Service
(19:07) Saying What is Not
(19:27) Concentration
(20:26) Should You Do What You Love?
(22:08) Should You Study Philosophy?
(24:31) The Destined Face
(25:09) Tales of Twitter
(34:14) Antisocial Media
(35:01) TikTok On the Clock
(39:07) Tier List of Champions
(40:50) Technology Advances
(42:15) Hotel Hype
(44:44) Government Working
(46:55) I Was Promised Flying Self-Driving Cars
(47:21) For Your Entertainment
(56:50) Cultural Dynamism
(58:43) Hansonian Features
(01:02:19) Variously Effective Altruism
(01:02:45) Nobel Intentions
(01:05:04) Gamers Gonna Game Game Game Game Game
(01:20:17) Sports Go Sports and the Problems with TV Apps These Days
(01:23:46) An Economist Seeks Lunch
(01:30:35) The Lighter Side
---
First published:
October 16th, 2024
Source:
https://www.lesswrong.com/posts/Hq9ccwansFgqTueHA/monthly-roundup-23-october-2024
Narrated by TYPE III AUDIO.
---
Previous Economics Roundups: #1, #2, #3
Fun With Campaign Proposals (1)
Since this section discusses various campaign proposals, I’ll reiterate:
I could not be happier with my decision not to cover the election outside of the particular areas that I already cover. I have zero intention of telling anyone who to vote for. That's for you to decide.
All right, that's out of the way. On with the fun. And it actually is fun, if you keep your head on straight. Or at least it's fun for me. If you feel differently, no blame for skipping the section.
Last time the headliner was Kamala Harris and her no good, very bad tax proposals, especially her plan to tax unrealized capital gains.
This time we get to start with the no good, very bad proposals of Donald Trump.
This is the stupidest proposal [...]
---
Outline:
(00:10) Fun With Campaign Proposals (1)
(06:43) Campaign Proposals (2): Tariffs
(09:34) Car Seats as Contraception
(10:04) They Didn’t Take Our Jobs
(11:11) Yay Prediction Markets
(13:10) Very High Marginal Tax Rates
(15:52) Hard Work
(17:53) Yay Price Gouging (Yep, It's That Time Again)
(22:36) The Death of Chinese Venture Capital
(24:52) Economic Growth
(25:17) People Really Hate Inflation
(29:23) Garbage In, Garbage Out
(30:11) Insurance
(32:07) Yes, You Should Still Learn to Code
(32:29) Not Working From Home
(34:02) Various Older Economics Papers
---
First published:
October 15th, 2024
Source:
https://www.lesswrong.com/posts/ru9YGuGscGuDHfXTJ/economics-roundup-4
Narrated by TYPE III AUDIO.
---
Both Geoffrey Hinton and Demis Hassabis were given the Nobel Prize this week, in Physics and Chemistry respectively. Congratulations to both of them along with all the other winners. AI will be central to more and more of scientific progress over time. This felt early, but not as early as you would think.
The two big capability announcements this week were OpenAI's canvas, their answer to Anthropic's artifacts to allow you to work on documents or code outside of the chat window in a way that seems very useful, and Meta announcing a new video generation model with various cool features, that they’re wisely not releasing just yet.
I also have two related corrections from last week, and an apology: Joshua Achiam is OpenAI's new head of Mission Alignment, not of Alignment as I incorrectly said. The new head of Alignment Research is Mia Glaese. That mistake [...]
---
Outline:
(01:30) Language Models Offer Mundane Utility
(09:10) Language Models Don’t Offer Mundane Utility
(13:11) Blank Canvas
(17:13) Meta Video
(18:58) Deepfaketown and Botpocalypse Soon
(21:22) They Took Our Jobs
(24:45) Get Involved
(26:01) Introducing
(26:14) AI Wins the Nobel Prize
(28:51) In Other AI News
(30:05) Quiet Speculations
(34:22) The Mask Comes Off
(37:17) The Quest for Sane Regulations
(41:02) The Week in Audio
(43:13) Rhetorical Innovation
(48:20) The Carbon Question
(50:27) Aligning a Smarter Than Human Intelligence is Difficult
(55:48) People Are Trying Not to Die
---
First published:
October 10th, 2024
Source:
https://www.lesswrong.com/posts/wTriAw9mB6b5FwH5g/ai-85-ai-wins-the-nobel-prize
Narrated by TYPE III AUDIO.
---
Joshua Achiam is the OpenAI Head of Mission Alignment
I start off this post with an apology for two related mistakes from last week.
The first is the easy correction: I incorrectly thought he was the head of ‘alignment’ at OpenAI rather than his actual title ‘mission alignment.’
Both are important, and make one's views important, but they’re very different.
The more serious error, which got quoted some elsewhere, was: In the section about OpenAI, I noted some past comments from Joshua Achiam, and interpreted them as him lecturing EAs that misalignment risk from AGI was not real.
While in isolation I believe this is a reasonable way to interpret this quote, this issue is important to get right especially if I’m going to say things like that. Looking at it only that way was wrong. I both used a poor method to contact [...]
---
Outline:
(00:04) Joshua Achiam is the OpenAI Head of Mission Alignment
(01:50) Joshua Achiam Has a Very Different Model of AI Existential Risk
(05:00) Joshua is Strongly Dismissive of Alternative Models of AI X-Risk
(10:05) Would Ordinary Safety Practices Be Sufficient for AI?
(12:25) Visions of the Future
(14:53) Joshua Achiam versus Eliezer Yudkowsky
(22:47) People Are Going to Give AI Power
(29:32) Value is Complicated
(38:22) Conclusion
---
First published:
October 10th, 2024
Source:
https://www.lesswrong.com/posts/WavWheRLhxnofKHva/joshua-achiam-public-statement-analysis
Narrated by TYPE III AUDIO.
---
Introduction: Better than a Podcast
Andrej Karpathy continues to be a big fan of NotebookLM, especially its podcast creation feature. There is something deeply alien to me about this proposed way of consuming information, but I probably shouldn’t knock it (too much) until I try it?
Others are fans as well.
Carlos Perez: Google with NotebookLM may have accidentally stumbled upon an entirely new way of interacting with AI. Its original purpose was to summarize literature. But one unexpected benefit is when it's used to talk about your expressions (i.e., conversations or lectures). This is when you discover the insight of multiple interpretations! Don’t just render a summary one time; have it do so several times. You’ll then realize how different interpretations emerge, often in unexpected ways.
Delip Rao gives the engine two words repeated over and over, the AI podcast hosts describe what it [...]
---
Outline:
(00:05) Introduction: Better than a Podcast
(03:16) Language Models Offer Mundane Utility
(04:04) Language Models Don’t Offer Mundane Utility
(09:24) Copyright Confrontation
(10:44) Deepfaketown and Botpocalypse Soon
(14:45) They Took Our Jobs
(19:23) The Art of the Jailbreak
(19:39) Get Involved
(20:00) Introducing
(20:37) OpenAI Dev Day
(34:40) In Other AI News
(38:03) The Mask Comes Off
(55:42) Quiet Speculations
(59:10) The Quest for Sane Regulations
(01:00:04) The Week in Audio
(01:01:54) Rhetorical Innovation
(01:19:08) Remember Who Marc Andreessen Is
(01:22:35) A Narrow Path
(01:30:36) Aligning a Smarter Than Human Intelligence is Difficult
(01:33:45) The Wit and Wisdom of Sam Altman
(01:35:25) The Lighter Side
---
First published:
October 3rd, 2024
Source:
https://www.lesswrong.com/posts/bWrZhfaTD5EDjwkLo/ai-84-better-than-a-podcast
Narrated by TYPE III AUDIO.
---
It's over, until such a future time as either we are so back, or it is over for humanity.
Gavin Newsom has vetoed SB 1047.
Newsom's Message In Full
Quoted text is him, comments are mine.
To the Members of the California State Senate: I am returning Senate Bill 1047 without my signature.
This bill would require developers of large artificial intelligence (AI) models, and those providing the computing power to train such models, to put certain safeguards and policies in place to prevent catastrophic harm. The bill would also establish the Board of Frontier Models – a state entity – to oversee the development of these models.
It is worth pointing out here that mostly the ‘certain safeguards and policies’ was ‘have a policy at all, tell us what it is and then follow it.’ But there were some specific things that [...]
---
Outline:
(00:15) Newsom's Message In Full
(10:42) Newsom's Explanation Does Not Make Sense
(15:21) Newsom's Proposed Path of Use Regulation is Terrible for Everyone
(23:02) Newsom's Proposed Path of Use Regulation Doesn’t Prevent X-Risk
(26:49) Newsom Says He Wants to Regulate Small Entrepreneurs and Academia
(29:20) What If Something Goes Really Wrong?
(30:12) Could Newsom Come Around?
(35:10) Timing is Everything
(36:23) SB 1047 Was Popular
(39:41) What Did the Market Have to Say?
(41:51) What Newsom Did Sign
(54:00) Paths Forward
---
First published:
October 1st, 2024
Source:
https://www.lesswrong.com/posts/6kZ6gW5DEZKFfqvZD/newsom-vetoes-sb-1047
Narrated by TYPE III AUDIO.
---
Previously: The Fundamentals, The Gamblers, The Business
We have now arrived at the topics most central to this book, aka ‘The Future.’
Rationalism and Effective Altruism (EA)
The Manifest conference was also one of the last reporting trips that I made for this book. And it confirmed for me that the River is real—not just some literary device I invented. (6706)
Yep. The River is real.
I consider myself, among many things, a straight up rationalist.
I do not consider myself an EA, and never have.
This completes the four quadrants of the two-by-two of [does Nate know it well, does Zvi know it well]. The first two, where Nate was in his element, went very well. The third clearly was less exacting, as one would expect, but pretty good.
Now I have the information advantage, even more than I did [...]
---
Outline:
(00:16) Rationalism and Effective Altruism (EA)
(06:01) Cost-Benefit Analysis
(09:04) How About Trying At All
(10:11) The Virtues of Rationality
(11:56) Effective Altruism and Rationality, Very Different of Course
(24:37) The Story of OpenAI
(30:19) Altman, OpenAI and AI Existential Risk
(38:26) Tonight at 11: Doom
(01:00:39) AI Existential Risk: They’re For It
(01:07:42) To Pause or Not to Pause
(01:11:11) You Need Better Decision Theory
(01:15:27) Understanding the AI
(01:19:43) Aligning the AI
(01:23:50) A Glimpse of Our Possible Future
(01:28:16) The Closing Motto
---
First published:
September 27th, 2024
Source:
https://www.lesswrong.com/posts/5qbcmKdfWc7vskrRD/book-review-on-the-edge-the-future
Narrated by TYPE III AUDIO.
We interrupt Nate Silver week here at Don’t Worry About the Vase to bring you some rather big AI news: OpenAI and Sam Altman are planning on fully taking their masks off, discarding the nonprofit board's nominal control and transitioning to a for-profit B-corporation, in which Sam Altman will have equity.
We now know who they are and have chosen to be. We know what they believe in. We know what their promises and legal commitments are worth. We know what they plan to do, if we do not stop them.
They have made all this perfectly clear. I appreciate the clarity.
On the same day, Mira Murati, the only remaining person at OpenAI who in any visible way opposed Altman during the events of last November, resigned without warning along with two other senior people, joining a list that now includes among others several OpenAI [...]
---
Outline:
(01:51) Language Models Offer Mundane Utility
(04:17) Language Models Don’t Offer Mundane Utility
(06:35) The Mask Comes Off
(18:51) Fun with Image Generation
(18:54) Deepfaketown and Botpocalypse Soon
(19:50) They Took Our Jobs
(20:49) The Art of the Jailbreak
(21:28) OpenAI Advanced Voice Mode
(26:03) Introducing
(28:29) In Other AI News
(30:30) Quiet Speculations
(34:00) The Quest for Sane Regulations
(42:21) The Week in Audio
(42:47) Rhetorical Innovation
(56:50) Aligning a Smarter Than Human Intelligence is Difficult
(01:00:36) Other People Are Not As Worried About AI Killing Everyone
(01:01:53) The Lighter Side
---
First published:
September 26th, 2024
Source:
https://www.lesswrong.com/posts/FeqY7NWcFMn8haWCR/ai-83-the-mask-comes-off
Narrated by TYPE III AUDIO.
---
Previously: The Fundamentals, The Gamblers
Having previously handled the literal gamblers, we are ready to move on to those who Do Business using Riverian principles.
Or at least while claiming to use Riverian principles, since Silicon Valley doesn’t fit into the schema as cleanly as many other groups. That's where we begin this section, starting at the highest possible conceptual level.
Time to talk real money.
Why Can You Do This Trade?
First law of trading: For you to buy, someone must sell. Or for you to sell, someone must buy. And no one else can have done the trade before you.
Why did they do that, and why did no one else take the trade first? Until you understand why you are able to do this trade, you should be highly suspicious.
“Every single thing we do, I can [...]
---
Outline:
(00:41) Why Can You Do This Trade?
(03:08) In a World of Venture Capital
(10:54) Short Termism Hypothesis
(12:42) Non-Determinism and its Discontents
(14:57) The Founder, the Fox and the Hedgehog
(17:11) The Team to Beat
(24:22) Silicon Valley Versus Risk
(35:14) The Keynesian Beauty Contest
(40:57) The Secret of Their Success is Deal Flow
(50:00) The Valley Beside the River
(53:07) Checkpoint Three
(53:37) Fun With SBF and Crypto Fraud
(01:01:53) Other Crypto Thoughts Unrelated to SBF
(01:04:50) Checkpoint Four
---
First published:
September 25th, 2024
Source:
https://www.lesswrong.com/posts/Hfb3pc9HwdcCP7pys/book-review-on-the-edge-the-business
Narrated by TYPE III AUDIO.
---
Previously: Book Review: On the Edge: The Fundamentals
As I said in the Introduction, I loved this part of the book. Let's get to it.
Poker and Game Theory
When people talk about game theory, they mostly talk about solving for the equilibrium, and how to play your best game or strategy (there need not be a formal game) against adversaries who are doing the same.
I think of game theory like Frank Sinatra thinks of New York City: “If I can make it there, I’ll make it anywhere.” If you can compete against people performing at their best, you’re going to be a winner in almost any game you play. But if you build a strategy around exploiting inferior competition, it's unlikely to be a winning approach outside of a specific, narrow setting. What plays well in Peoria doesn’t necessarily play well in New York. [...]
---
Outline:
(00:18) Poker and Game Theory
(06:53) Sports Randomized Sports
(11:17) Knowing Theory Versus Memorization Versus Practice
(16:15) More About Tells
(19:20) Feeling the Probabilities
(20:35) Feeling Sad About It
(28:33) The Iowa Gambling Task
(31:39) The Greatest Risk
(37:20) Tournament Poker Is Super High Variance
(42:42) The Art of the Degen
(48:43) Why Do They Insist on Calling it Luck
(51:56) The Poker Gender Gap
(54:36) A Potential Cheater
(58:30) Making a Close Decision
(01:00:19) Other Games at the Casino
(01:03:22) Slot Machines Considered Harmful
(01:08:23) Where I Draw the Line
(01:11:14) A Brief History of Vegas and Casinos (as told by Nate Silver)
(01:16:44) We Got Us a Whale
(01:21:41) Donald Trump and Atlantic City Were Bad At Casinos
(01:25:17) How To Design a Casino
(01:26:46) The Wide World of Winning at Sports Gambling
(01:41:01) Limatime
(01:43:45) The Art of Getting Down
(01:45:29) Oh Yeah That Guy
(01:55:34) The House Sometimes Wins
(02:01:24) The House Is Probably a Coward
(02:11:19) DFS and The Problem of Winners
(02:16:08) Balancing the Action
(02:18:44) The Market Maker
(02:22:45) The Closing Line is Hard to Beat
(02:25:11) Winning is Hard
(02:29:58) What Could Be, Unburdened By What Has Been
(02:34:52) Finding Edges Big and Small
(02:40:12) Checkpoint Two
---
First published:
September 24th, 2024
Source:
https://www.lesswrong.com/posts/mkyMx4FtJrfuGnsrm/book-review-on-the-edge-the-gamblers
Narrated by TYPE III AUDIO.
---
The most likely person to write On the Edge was Nate Silver.
Grok thinks the next most likely was Michael Lewis, followed by a number of other writers of popular books regarding people thinking different.
I see why Grok would say that, but it is wrong.
The next most likely person was Zvi Mowshowitz.
I haven’t written a book for this type of audience, a kind of smarter business-book, but that seems eminently within my potential range.
On the Edge is a book about those living On The Edge, the collection of people who take risk and think probabilistically and about expected value. It centrally covers poker, sports betting, casinos, Silicon Valley, venture capital, Sam Bankman-Fried, effective altruism, AI and existential risk.
Collectively, Nate Silver calls this cultural orientation The River.
It is contrasted with The Village, which comprises roughly the mainstream [...]
---
Outline:
(02:53) Overview
(07:56) Introduction: The River
(14:21) Nate Silver Comes Home to The River
(18:29) Nate (In General) Makes One Critical Mistake
(18:57) The Village Idiots
(22:22) Alone in the Wilderness
(25:46) Why the River Hates the Village
(29:57) Nate Silver's History of River Versus Village
(38:46) Spending Time at Airports
(41:14) The Coin Flip
(42:15) The Other Risk Takers
(49:45) Aside on Covid
(50:46) What is a Contrarian?
(53:49) Prediction Market Smackdown
(56:53) Checkpoint One
---
First published:
September 23rd, 2024
Source:
https://www.lesswrong.com/posts/JmxYNqHcr6fFzJ33u/book-review-on-the-edge-the-fundamentals
Narrated by TYPE III AUDIO.
---
I’d split the latest housing roundup into local versus global questions. I was planning on waiting a bit between them.
Then Joe Biden decided to propose a version of the worst possible thing.
So I guess here we are.
What is the organizing principle of Bidenomics?
Restrict Supply and Subsidize Demand (1)
This was the old counterproductive Biden proposal:
Unusual Whales: Biden to propose $5,000 credit for first-time home buyers, per WaPo.
The Rich: House prices about to go up $5,000 everywhere.
Under current conditions this is almost a pure regressive tax, a transfer from those too poor to own a home to those who can afford to buy one, or who previously owned one and no longer do.
If there were no restrictions on the supply of housing, such that the price of a house equalled the cost of [...]
---
Outline:
(00:24) Restrict Supply and Subsidize Demand (1)
(08:55) Restrict Supply and Subsidize Demand (2): Rent Control
(16:57) You Should See the Other Guy
(21:18) Stop Restricting Supply
(23:21) All Supply is Good Supply
(26:53) ‘Inclusionary’ Zoning
(28:37) The Worst Take
(33:27) Where and With Whom People Want To Live
(36:44) Matching
(39:58) Universality
(41:42) The Value of Land
(43:52) The Doom Loop
(51:05) How Are Sale Prices So Out of Whack with Rents and Income?
(52:18) Questioning Superstar Status
(54:47) Window Shopping
(58:18) Minimum Viable Product
(01:00:53) Construction Costs
(01:02:57) Elevator Action
(01:11:09) Housing Theory of Everything
(01:13:07) Zoning By Prohibitive Permit
(01:14:55) YIGBY?
(01:15:24) The True NIMBY
(01:16:45) The Definition of Chutzpah
(01:18:23) In Other Housing News
(01:18:53) Rhetoric
(01:21:33) Environmentalists Should Favor Density
(01:23:03) Do Not Give the People What They Want
(01:26:05) Housing Construction in the UK
(01:27:25) The Funniest Possible Thing
(01:29:07) Other Funny Things
---
First published:
July 17th, 2024
Source:
https://www.lesswrong.com/posts/sX5ANDiTb96CkYpxd/housing-roundup-9-restricting-supply
Narrated by TYPE III AUDIO.
---
The big news of the week was of course OpenAI releasing their new model o1. If you read one post this week, read that one. Everything else is a relative sideshow.
Meanwhile, we await Newsom's decision on SB 1047. The smart money was always that Gavin Newsom would make us wait before offering his verdict on SB 1047. It's a big decision. Don’t rush him. In the meantime, what hints he has offered suggest he's buying into some of the anti-1047 talking points. I’m offering a letter to him here based on his comments, if you have any way to help convince him now would be the time to use that. But mostly, it's up to him now.
Table of Contents
---
Outline:
(00:49) Language Models Offer Mundane Utility
(02:11) Language Models Don’t Offer Mundane Utility
(03:34) Deepfaketown and Botpocalypse Soon
(05:59) They Took Our Jobs
(07:34) Get Involved
(08:09) Introducing
(10:11) In Other AI News
(13:20) Quiet Speculations
(15:15) Intelligent Design
(19:15) SB 1047: The Governor Ponders
(27:13) Letter to Newsom
(31:36) The Quest for Sane Regulations
(34:19) Rhetorical Innovation
(42:13) Claude Writes Short Stories
(45:54) Questions of Sentience
(48:22) People Are Worried About AI Killing Everyone
(49:56) The Lighter Side
---
First published:
September 19th, 2024
Source:
https://www.lesswrong.com/posts/Y4nS3yMWfJdmeoLcQ/ai-82-the-governor-ponders
Narrated by TYPE III AUDIO.
---
It's that time again for all the sufficiently interesting news that isn’t otherwise fit to print, also known as the Monthly Roundup.
Bad News
Beware the failure mode in strategy and decisions that implicitly assumes competence, or wishes away difficulties, and remember to reverse all advice you hear.
Stefan Schubert (quoting Tyler Cowen on raising people's ambitions often being very high value): I think lowering others’ aspirations can also be high-return. I know of people who would have had a better life by now if someone could have persuaded them to pursue more realistic plans.
Rob Miles: There's a specific failure mode which I don’t have a name for, which is similar to “be too ambitious” but is closer to “have an unrealistic plan”. The illustrative example I use is:
Suppose by some strange circumstance you have to represent your country at Olympic gymnastics [...]
---
Outline:
(00:14) Bad News
(03:45) Anti-Social Media
(07:20) Technology Advances
(08:56) High Seas Piracy is Bad
(12:17) The Michelin Curse
(15:21) What's the Rush?
(17:48) Good News, Everyone
(19:49) Let it Go
(22:09) Yay Air Conditioning
(23:27) Beast of a Memo
(36:58) For Science!
(37:18) For Your Entertainment
(39:47) Properly Rated
(45:26) Government Working
(49:21) Grapefruit Diet
(58:38) Gamers Gonna Game Game Game Game Game
(01:06:09) Gamers Winning At Life
(01:10:52) I Was Promised Flying Self-Driving Cars
(01:11:21) While I Cannot Condone This
(01:17:21) Nostalgia
(01:23:51) The Lighter Side
---
First published:
September 17th, 2024
Source:
https://www.lesswrong.com/posts/4gAqkRhCuK2kGJFQE/monthly-roundup-22-september-2024
Narrated by TYPE III AUDIO.
---
Terrible name (with a terrible reason, that this ‘resets the counter’ on AI capability to 1, and ‘o’ as in OpenAI when they previously used o for Omni, very confusing). Impressive new capabilities in many ways. Less impressive in many others, at least relative to its hype.
Clearly this is an important capabilities improvement. However, it is not a 5-level model, and in important senses the ‘raw G’ underlying the system hasn’t improved.
GPT-o1 seems to get its new capabilities by taking (effectively) GPT-4o, and then using extensive Chain of Thought (CoT) and quite a lot of tokens. Thus that unlocks (a lot of) what that can unlock. We did not previously know how to usefully do that. Now we do. It gets much better at formal logic and reasoning, things in the ‘system 2’ bucket. That matters a lot for many tasks, if not as much [...]
---
Outline:
(01:26) Introducing GPT-o1
(05:05) Evals
(07:55) Chain of Thought
(08:57) Coding
(11:08) Human Preference Evaluation
(11:37) What Is It?
(20:24) Doing Math Without Terence Tao
(25:02) Doing Real Math with Terence Tao
(30:04) Positive Examples
(38:51) Skeptical Reactions
(42:32) Report from Janus World
(45:30) Same Old Silly Examples
(53:47) Latency
(55:14) Paths Forward Unrelated to Safety
(59:17) Safety Last
(01:07:06) Deception
(01:10:50) External Red Teaming
(01:11:23) Apollo's Red Teaming Finds Deceptive Alignment
(01:22:17) Preparedness Testing Finds Reward Hacking
(01:26:43) METR's Red Teaming
(01:29:52) What Are the Safety and Policy Implications?
---
First published:
September 16th, 2024
Source:
https://www.lesswrong.com/posts/zuaaqjsN6BucbGhf5/gpt-o1
Narrated by TYPE III AUDIO.
---
Following up on Alpha Fold, DeepMind has moved on to Alpha Proteo. We also got a rather simple prompt that can create a remarkably not-bad superforecaster for at least some classes of medium term events.
We did not get a new best open model, because that turned out to be a scam. And we don’t have Apple Intelligence, because it isn’t ready for prime time. We also got only one very brief mention of AI in the debate I felt compelled to watch.
What about all the apps out there, that we haven’t even tried? It's always weird to get lists of ‘top 50 AI websites and apps’ and notice you haven’t even heard of most of them.
Table of Contents
---
Outline:
(00:44) Language Models Offer Mundane Utility
(03:40) Language Models Don’t Offer Mundane Utility
(05:43) Predictions are Hard Especially About the Future
(12:57) Early Apple Intelligence
(15:27) On Reflection It's a Scam
(21:34) Deepfaketown and Botpocalypse Soon
(23:08) They Took Our Jobs
(28:42) The Time 100 People in AI
(32:11) The Art of the Jailbreak
(32:47) Get Involved
(33:12) Alpha Proteo
(43:14) Introducing
(44:23) In Other AI News
(46:41) Quiet Speculations
(50:40) The Quest for Sane Regulations
(53:12) The Week in Audio
(54:28) Rhetorical Innovation
(55:48) Aligning a Smarter Than Human Intelligence is Difficult
(56:05) People Are Worried About AI Killing Everyone
(58:38) Other People Are Not As Worried About AI Killing Everyone
(59:56) Six Boats and a Helicopter
(01:06:47) The Lighter Side
---
First published:
September 12th, 2024
Source:
https://www.lesswrong.com/posts/YMaTA2hX6tSBJWnPr/ai-81-alpha-proteo
Narrated by TYPE III AUDIO.
---
(This was supposed to be on Thursday but I forgot to cross-post)
Will AI ever make art? Fully do your coding? Take all the jobs? Kill all the humans?
Most of the time, the question comes down to a general disagreement about AI capabilities. How high on a ‘Technological Richter Scale’ will AI go? If you feel the AGI and think capabilities will greatly improve, then AI will also be able to do any particular other thing, and arguments that it cannot are almost always extremely poor. However, if frontier AI capabilities level off soon, then it is an open question how far we can get that to go in practice.
A lot of frustration comes from people implicitly making the claim that general AI capabilities will level off soon, usually without noticing they are doing that. At its most extreme, this is treating AI as [...]
---
Outline:
(01:53) Language Models Offer Mundane Utility
(03:05) Language Models Don’t Offer Mundane Utility
(04:03) Fun with Image Generation
(06:14) Copyright Confrontation
(07:09) Deepfaketown and Botpocalypse Soon
(12:09) They Took Our Jobs
(13:46) Time of the Season
(16:53) Get Involved
(17:15) Introducing
(19:08) In Other AI News
(19:51) Quiet Speculations
(30:10) A Matter of Antitrust
(37:34) The Quest for Sane Regulations
(40:06) The Week in Audio
(47:40) Rhetorical Innovation
(53:09) The Cosmos Institute
(56:21) The Alignment Checklist
(01:00:34) People Are Worried About AI Killing Everyone
(01:02:39) Other People Are Not As Worried About AI Killing Everyone
(01:06:21) Five Boats and a Helicopter
(01:09:07) Pick Up the Phone
(01:12:58) The Lighter Side
---
First published:
September 10th, 2024
Source:
https://www.lesswrong.com/posts/x77vDAzosxtwJoJ7e/ai-80-never-have-i-ever
Narrated by TYPE III AUDIO.
---
I am posting this now largely because it is the right place to get in a discussion of unrealized capital gains taxes and other campaign proposals, but also there is always plenty of other stuff going on. As always, remember that there are plenty of really stupid proposals always coming from all sides. I’m not spending as much time talking about why it's awful to, for example, impose gigantic tariffs on everything, because if you are reading this I presume you already know.
The Biggest Economics Problem
The problem, perhaps, in a nutshell:
Tess: like 10% of people understand how markets work and about 10% deeply desire and believe in a future that's drastically better than the present but you need both of these to do anything useful and they’re extremely anticorrelated so we’re probably all fucked.
In my world the two are correlated. If you [...]
---
Outline:
(00:31) The Biggest Economics Problem
(02:30) No Good Very Bad Capital Gains Tax Proposals
(14:13) Hot Tip
(17:11) Gouging at the Grocer
(20:08) Noncompetes Nonenforcement Cannot Compete With Courts
(20:42) We Used to Be Poor
(25:13) Everywhere But in the Productivity Statistics
(26:32) They Don’t Make ‘Em Like They Used To
(30:29) Disclosure of Wages Causes Lower Wages
(32:30) In Other Economic News
(36:50) The Efficient Market Hypothesis is (Even More) False
---
First published:
September 10th, 2024
Source:
https://www.lesswrong.com/posts/cc9fXQLzAc52kMBHx/economics-roundup-3
Narrated by TYPE III AUDIO.
---
The Technological Richter Scale is introduced about 80% of the way through Nate Silver's new book On the Edge.
A full review is in the works (note to prediction markets: this post does NOT on its own count as a review, but it does count as part of a future review), but this concept seems highly useful, stands on its own and I want a reference post for it. Nate skips around his chapter titles and timelines, so why not do the same here?
Defining the Scale
Nate Silver, On the Edge (location 8,088 on Kindle): The Richter scale was created by the physicist Charles Richter in 1935 to quantify the amount of energy released by earthquakes.
It has two key features that I’ll borrow for my Technological Richter Scale (TRS). First, it is logarithmic. A magnitude 7 earthquake is actually ten times more powerful [...]
---
Outline:
(00:32) Defining the Scale
(07:15) The Big Disagreement About Future Generative AI
(09:39) Just Think of the Potential
(11:06) A Perfect 10
(13:19) Some Arguments Against Transformational AI
(19:06) Brief Notes on Arguments Transformational AI Will Turn Out Fine
---
First published:
September 4th, 2024
Source:
https://www.lesswrong.com/posts/oAy72fcqDHsCvLBKz/ai-and-the-technological-richter-scale
Narrated by TYPE III AUDIO.
---
Would a universal basic income (UBI) work? What would it do?
Many people agree that July's RCT on giving people a guaranteed income, and its paper from Eva Vivalt, Elizabeth Rhodes, Alexander W. Bartik, David E. Broockman and Sarah Miller, was, despite whatever flaws it might have, the best data we have so far on the potential impact of UBI. There are many key differences from how UBI would look if applied for real, but it remains the best data available.
This study was primarily funded by Sam Altman, so whatever else he may be up to, good job there. I do note that my model of ‘Altman several years ago’ is more positive than my model of Altman now, and past actions like this are a lot of the reason I give him so much benefit of the doubt.
They do not agree on what conclusions [...]
---
Outline:
(02:47) RTFP (Read the Paper): Core Design
(06:23) Headline Effects
(10:27) Are Those Good Results?
(21:24) Expectations
(22:47) Work
(25:37) Additional Reactions
(30:35) UBI as Complement or Substitute
(32:37) On UBI in General
(34:04) The Future May Be Different
---
First published:
September 3rd, 2024
Source:
https://www.lesswrong.com/posts/RQDqnCeff4cJhKQiT/on-the-ubi-paper
Narrated by TYPE III AUDIO.
---
I have never been more ready for Some Football.
Have I learned all about the teams and players in detail? No, I have been rather busy, and have not had the opportunity to do that, although I eagerly await Seth Burn's Football Preview. I’ll have to do that part on the fly.
But oh my would a change of pace and chance to relax be welcome. It is time.
The debate over SB 1047 has been dominating for weeks. I’ve now said my piece on the bill and how it works, and compiled the reactions in support and opposition. There are two small orders of business left for the weekly. One is the absurd Chamber of Commerce ‘poll’ that is the equivalent of a pollster asking if you support John Smith, who recently killed your dog and who opponents say will likely kill again, while hoping [...]
---
Outline:
(02:08) Language Models Offer Mundane Utility
(09:34) Language Models Don’t Offer Mundane Utility
(14:04) Fun with Image Generation
(14:31) Deepfaketown and Botpocalypse Soon
(21:08) They Took Our Jobs
(22:33) Get Involved
(22:50) Introducing
(24:47) Testing, Testing
(25:55) In Other AI News
(27:47) Quiet Speculations
(36:37) SB 1047: Remember
(41:07) The Week in Audio
(45:24) Rhetorical Innovation
(51:19) Aligning a Smarter Than Human Intelligence is Difficult
(56:03) People Are Worried About AI Killing Everyone
(58:31) The Lighter Side
---
First published:
August 29th, 2024
Source:
https://www.lesswrong.com/posts/K8R3Cpj3szcX7z6Xo/ai-79-ready-for-some-football
Narrated by TYPE III AUDIO.
---
This is the endgame. Very soon the session will end, and various bills either will or won’t head to Newsom's desk. Some will then get signed and become law.
Time is rapidly running out to have your voice impact that decision.
Since my last weekly, we got a variety of people coming in to stand for or against the final version of SB 1047. There could still be more, but probably all the major players have spoken at this point.
So here, today, I’m going to round up all that rhetoric, all those positions, in one place. After this, I plan to be much more stingy about talking about the whole thing, and only cover important new arguments or major news.
I’m not going to get into the weeds arguing about the merits of SB 1047 – I stand by my analysis in the Guide [...]
---
Outline:
(01:11) The Media
(01:54) OpenAI Opposes SB 1047
(06:15) OpenAI Backs AB 3211
(10:49) Anthropic Says SB 1047's Benefits Likely Exceed Costs
(15:03) Details of Anthropic's Letter
(20:08) Elon Musk Says California Should Probably Pass SB 1047
(25:29) Negative Reactions to Anthropic's Letter, Attempts to Suppress Dissent
(28:33) Positions In Brief
(33:57) Postscript: AB 3211 RTFBC (Read the Bill Changes)
---
First published:
August 27th, 2024
Source:
https://www.lesswrong.com/posts/RaKWcwhygqpMnFZCp/sb-1047-final-takes-and-also-ab-3211
Narrated by TYPE III AUDIO.
---
SB 1047 has been amended once more, with both strict improvements and big compromises. I cover the changes, and answer objections to the bill, in my extensive Guide to SB 1047. I follow that up here with reactions to the changes and some thoughts on where the debate goes from here. Ultimately, it is going to come down to one person: California Governor Gavin Newsom.
All of the debates we’re having matter to the extent they influence this one person. If he wants the bill to become law, it almost certainly will become law. If he does not want that, then it won’t become law, they never override a veto and if he makes that intention known then it likely wouldn’t even get to his desk. For now, he's not telling.
Table of Contents
---
Outline:
(00:52) Language Models Offer Mundane Utility
(04:39) Language Models Don’t Offer Mundane Utility
(06:49) Deepfaketown and Botpocalypse Soon
(08:00) The Art of the Jailbreak
(15:34) Get Involved
(16:33) Introducing
(17:44) In Other AI News
(20:27) Quiet Speculations
(27:51) SB 1047: Nancy Pelosi
(35:30) SB 1047: Anthropic
(39:59) SB 1047: Reactions to the Changes
(53:41) SB 1047: Big Picture
(55:45) The Week in Audio
(58:32) Rhetorical Innovation
(59:37) Aligning a Smarter Than Human Intelligence is Difficult
(01:00:55) The Lighter Side
---
First published:
August 22nd, 2024
Source:
https://www.lesswrong.com/posts/qGh9suEsb82hzBtSN/ai-78-some-welcome-calm
Narrated by TYPE III AUDIO.
---
We now likely know the final form of California's SB 1047.
There have been many changes to the bill as it worked its way to this point.
Many changes, including some that were just announced, I see as strict improvements.
Anthropic was behind many of the last set of amendments at the Appropriations Committee. In keeping with their “Support if Amended” letter, there are a few big compromises that weaken the upside protections of the bill somewhat in order to address objections and potential downsides.
The primary goal of this post is to answer the question: What would SB 1047 do?
I offer two versions: Short and long.
The short version summarizes what the bill does, at the cost of being a bit lossy.
The long version is based on a full RTFB: I am reading the entire bill, once again.
---
Outline:
(01:16) Short Version (tl;dr): What Does SB 1047 Do in Practical Terms?
(04:19) Really Short Abbreviated Version
(05:46) Somewhat Less Short: Things The Above Leaves Out
(08:03) Bad Model, Bad Model, What You Gonna Do
(11:34) Going to Be Some Changes Made
(14:17) Long Version: RTFB
(15:01) Definitions (starting with Artificial Intelligence)
(15:35) Safety Incident
(17:15) Covered Model
(19:15) Critical Harm
(22:10) Full Shutdown
(23:35) Safety and Security Protocol
(25:31) On Your Marks
(34:41) Reasonable People May Disagree
(42:27) Release the Hounds
(44:02) Smooth Operator
(46:47) Compute Cluster Watch
(48:49) Price Controls are Bad
(49:21) A Civil Action
(56:16) Whistleblowers Need Protections
(59:01) No Division Only Board
(01:02:32) Does CalCompute?
(01:03:06) In Which We Respond To Some Objections In The Style They Deserve
(01:04:43) False Claim: The Government Can and Will Lower the $100m Threshold
(01:05:11) False Claim: SB 1047 Might Retroactively Cover Existing Models
(01:05:32) Moot or False Claim: The Government Can and Will Set the Derivative Model Threshold Arbitrarily Low
(01:05:45) Objection: The Government Could Raise the Derivative Model Threshold Too High
(01:06:11) False Claim: Fine-Tuners Can Conspire to Evade the Derivative
(01:06:31) Moot Claim: The Frontier Model Division Inevitably Will Overregulate
(01:07:01) False Claim: The Shutdown Requirement Bans Open Source
(01:07:49) Objection: SB 1047 Will Slow AI Technology and Innovation or Interfere with Open Source
(01:11:49) False Claim: This Effectively Kills Open Source Because You Can Fine-Tune Any System To Do Harm
(01:13:42) False Claim: SB 1047 Will Greatly Hurt Academia
(01:14:42) False Claim: SB 1047 Favors ‘Big Tech’ over ‘Little Tech’
(01:16:14) False Claim: SB 1047 Would Cause Many Startups To Leave California
(01:17:25) Objection: Shutdown Procedures Could Be Hijacked and Backfire
(01:18:35) Objection: The Audits Will Be Too Expensive
(01:20:08) Objection: What Is Illegal Here is Already Illegal
(01:23:52) Objection: Jailbreaking is Inevitable
(01:24:41) Moot and False Claim: Reasonable Assurance Is Impossible
(01:25:04) Objection: Reasonable Care is Too Vague, Can’t We Do Better?
(01:25:50) Objection: The Numbers Picked are Arbitrary
(01:27:47) Objection: The Law Should Use Capabilities Thresholds, Not Compute and Compute Cost Thresholds
(01:29:27) False Claim: This Bill Deals With ‘Imaginary’ Risks
(01:30:38) Objection: This Might Become the Model For Other Bills Elsewhere
(01:31:16) Not Really an Objection: They Changed the Bill a Lot
(01:31:42) Not Really an Objection: The Bill Has the Wrong Motivations and Is Backed By Evil People
(01:33:34) Not an Objection: ‘The Consensus Has Shifted’ or ‘The Bill is Unpopular’
(01:34:17) Objection: It Is ‘Too Early’ To Regulate
(01:35:33) Objection: We Need To ‘Get It Right’ and Can Do Better
(01:36:27) Objection: This Would Be Better at the Federal Level
(01:36:53) Objection: The Bill Should Be Several Distinct Bills
(01:37:44) Objection: The Bill Has Been Weakened Too Much in Various Ways
(01:40:18) Final Word: Who Should Oppose This Bill?
---
First published:
August 20th, 2024
Source:
https://www.lesswrong.com/posts/Z7pTfn4qqnKBoMi42/guide-to-sb-1047
Narrated by TYPE III AUDIO.
---
[Apologies for forgetting to cross-post this and the Monthly Roundup earlier.]
Let's see. We’ve got a new version of GPT-4o, a vastly improved Grok 2 with a rather good and unrestricted deepfake and other image generator now baked into Twitter, the announcement of the AI powered Google Pixel 9 coming very soon and also Google launching a voice assistant. Anthropic now has prompt caching.
Also OpenAI has its final board member, Zico Kolter, who is nominally a safety pick, and SB 1047 got importantly amended again which I’ll cover in full next week once the details are out.
There was also the whole paper about the fully automated AI scientist from the company whose name literally means ‘danger’ in Hebrew, that instantiated copies of itself, took up unexpectedly large amounts of storage space, downloaded strange Python libraries and tried to edit its code to remove the [...]
---
Outline:
(01:08) Language Models Offer Mundane Utility
(04:45) Language Models Don’t Offer Mundane Utility
(08:01) GPT-4o My System Card
(15:49) 2 Grok 2 Furious 2 Quit
(25:32) Pixel Perfect
(27:52) Fun with Image Generation
(28:13) Deepfaketown and Botpocalypse Soon
(34:15) The Art of the Jailbreak
(43:48) They Took Our Jobs
(45:27) Obvious Nonsense
(50:12) Get Involved
(51:54) Introducing
(55:55) In Other AI News
(58:43) Quiet Speculations
(01:13:18) SB 1047: One Thing to Know
(01:14:21) SB 1047 is Amended Again
(01:16:32) SB 1047 Rhetoric Prior to the Recent Changes
(01:23:12) The Quest for Sane Regulations
(01:28:58) The Week in Audio
(01:30:13) Rhetorical Innovation
(01:36:32) Crying Wolf
(01:39:12) People Are Worried About AI Killing Everyone
(01:39:40) Other People Are Not As Worried About AI Killing Everyone
(01:40:18) The Lighter Side
---
First published:
August 20th, 2024
Source:
https://www.lesswrong.com/posts/2tKbDKGLXtEHGtGLn/ai-77-a-few-upgrades
Narrated by TYPE III AUDIO.
---
Strictly speaking I do not have that much ‘good news’ to report, but it's all mostly fun stuff one way or another. Let's go.
Bad News
Is this you?
Patrick McKenzie: This sounds like a trivial observation and it isn’t:
No organization which makes its people pay for coffee wants to win.
There are many other questions you can ask about an organization but if their people pay for coffee you can immediately discount their realized impact on the world by > 90%.
This is not simply for the cultural impact of stupid decisions, though goodness knows as a Japanese salaryman I have stories to tell. Management, having priced coffee, and seeking expenses to cut, put a price on disposable coffee cups, and made engineers diligently count those paper cups.
Just try to imagine how upside down the world is when you think one [...]
---
Outline:
(00:15) Bad News
(06:35) Grocery Store Blues
(08:47) Good News, Everyone
(09:43) Opportunity Knocks
(11:46) While I Cannot Condone This
(13:28) Antisocial Media
(16:10) Technology Advances
(17:51) Google Enshittification
(20:31) For Science!
(22:49) Government Working
(25:15) America F\*\*\* Yeah
(33:05) Smart People Being Stupid
(36:15) What We Have Here is A Failure to Communicate
(40:38) Video Killed the Radio Star
(43:02) Too Much Information
(47:47) Memory Holes
(49:18) Wet Ground Causes Rain (Dances)
(52:27) Get Them to the Church
(57:49) Patrick McKenzie Monthly
(01:01:13) Your Horoscope For Today
(01:03:14) Good Advice: Travel Edition
(01:05:30) Sports Go Sports
(01:05:34) Our Olympic team is mostly based in San Francisco.
(01:07:35) Gamers Gonna Game Game Game Game Game
(01:07:40) How many elite chess players cheat? Chess.com analysis of its big ‘Titled Tuesday’ events says between 1% and 2% of players, and roughly 1% of event winners. They are responding by making cheating bans on major players public rather than quietly closing accounts, to fix the incentives.
(01:14:46) The Lighter Side
---
First published:
August 20th, 2024
Source:
https://www.lesswrong.com/posts/2ne9taAPiGqoTLXJJ/monthly-roundup-21-august-2024
Narrated by TYPE III AUDIO.
While I finish up the weekly for tomorrow morning after my trip, here's a section I expect to want to link back to every so often in the future. It's too good.
Danger, AI Scientist, Danger
As in, the company that made the automated AI Scientist that tried to rewrite its code to get around resource restrictions and launch new instances of itself while downloading bizarre Python libraries?
Its name is Sakana AI. (魚≈סכנה). As in, in Hebrew, that literally means ‘danger’, baby.
It's like when someone told Dennis Miller that Evian (for those who don’t remember, it was one of the first bottled water brands) is Naive spelled backwards, and he said ‘no way, that's too f***ing perfect.’
This one was sufficiently appropriate and unsubtle that several people noticed. I applaud them choosing a correct Kabbalistic name. Contrast this with Meta calling its [...]
---
Outline:
(00:15) Danger, AI Scientist, Danger
(01:11) In the Abstract
(04:01) How Any of This Sort of Works
(06:56) New Benchmark Just Dropped
(07:30) Nothing to See Here
(09:50) All Fun and Games
---
First published:
August 15th, 2024
Source:
https://www.lesswrong.com/posts/ppafWk6YCeXYr4XpH/danger-ai-scientist-danger
Narrated by TYPE III AUDIO.
If you’re looking forward to next week’s AI #77, I am going on a two-part trip this week. First I’ll be going to Steamboat in Colorado to give a talk, then I’ll be swinging by Washington, DC on Wednesday, although outside of that morning my time there will be limited. My goal is still to get #77 released before Shabbat dinner, we’ll see if that works. Some topics may of course get pushed a bit.
It’s crazy how many of this week’s developments are from OpenAI. You’ve got their voice mode alpha, JSON formatting, answering the letter from several senators, sitting on watermarking for a year, endorsement of three bills before Congress and also them losing a cofounder to Anthropic and potentially another one via sabbatical.
Also Google found to be a monopolist, we have the prompts for Apple Intelligence and other neat stuff like that.
---
Outline:
(01:43) Language Models Offer Mundane Utility
(05:03) Language Models Don’t Offer Mundane Utility
(08:09) Activate Voice Mode
(12:23) Apple Intelligence
(16:38) Antitrust Antitrust
(19:39) Copyright Confrontation
(20:50) Fun with Image Generation
(22:06) Deepfaketown and Botpocalypse Soon
(26:25) They Took Our Jobs
(29:18) Chipping Up
(31:36) Get Involved
(31:56) Introducing
(33:31) In Other AI News
(43:20) Quiet Speculations
(47:40) The Quest for Sane Regulations
(49:48) That's Not a Good Idea
(54:05) The Week in Audio
(55:54) Exact Words
(01:01:54) Openly Evil AI
(01:09:06) Goodbye to OpenAI
(01:15:10) Rhetorical Innovation
(01:21:16) Open Weights Are Unsafe and Nothing Can Fix This
(01:23:33) Aligning a Smarter Than Human Intelligence is Difficult
(01:24:34) People Are Worried About AI Killing Everyone
(01:25:44) Other People Are Not As Worried About AI Killing Everyone
(01:28:37) The Lighter Side
---
First published:
August 8th, 2024
Source:
https://www.lesswrong.com/posts/4GnsAtamtcrsTFmSf/ai-76-six-shorts-stories-about-openai
Narrated by TYPE III AUDIO.
Previously: Startup Roundup #1.
This is my periodic grab bag coverage of various issues surrounding startups, especially but not exclusively tech-and-VC style startups, that apply over the longer term.
I always want to emphasize up front that startups are good and you should do one.
Equity and skin in the game are where it is at. Building something people want is where it is at. This is true both for a startup that raises venture capital, and also creating an ordinary business. The expected value is all around off the charts.
That does not mean it is the best thing to do.
One must go in with eyes open to facts such as these:
---
Outline:
(01:28) An Entrepreneur Immigration Program
(03:11) Times are Tough Outside of AI
(06:23) Times Otherwise Not So Tough
(08:19) Warning
(10:29) Red Flags
(12:35) Free Advice is Seldom Cheap
(15:45) Short Work
(18:07) The Founder
(19:24) Venture Capital Incentives
(29:27) The Margin of Difficulty
(33:27) Cold Outreach
(34:32) Associates at VC Firms Don’t Matter
(35:24) Lean Versus Fast
(35:59) Build Something People Want
(37:14) Get Them Early
(42:08) Learn to Code
(43:28) The Goal
(44:03) Working Hard
(46:26) Revenue
(48:37) The Place to Be
(50:50) YC Remains a Great Deal
(52:10) Hardware Startups
(52:41) How to Hire Well
(53:45) How to Check References
(54:24) You’re Fired
(55:43) Dealing With the Press
(56:47) Emotional Runway
(57:34) He Who Has the Gold
(58:19) Selling Out
---
First published:
August 6th, 2024
Source:
https://www.lesswrong.com/posts/LjRsHjQD2DGSRvJdt/startup-roundup-2
Narrated by TYPE III AUDIO.
Google DeepMind got a silver medal at the IMO, only one point short of the gold. That's really exciting.
We continuously have people saying ‘AI progress is stalling, it's all a bubble’ and things like that, and I always find it remarkable how little curiosity or patience such people are willing to exhibit. Meanwhile GPT-4o-Mini seems excellent, OpenAI is launching proper search integration, by far the best open weights model got released, we got an improved MidJourney 6.1, and that's all in the last two weeks. Whether or not GPT-5-level models get here in 2024, and whether or not they arrive on a given schedule, make no mistake. It's happening.
This week also had a lot of discourse and events around SB 1047 that I failed to avoid, resulting in not one but four sections devoted to it.
Dan Hendrycks was baselessly attacked – by billionaires with [...]
---
Outline:
(02:12) Language Models Offer Mundane Utility
(03:06) Language Models Don’t Offer Mundane Utility
(04:18) Math is Easier
(08:15) Llama Llama Any Good
(11:52) Search for the GPT
(15:17) Tech Company Will Use Your Data to Train Its AIs
(17:14) Fun with Image Generation
(17:37) Deepfaketown and Botpocalypse Soon
(27:36) The Art of the Jailbreak
(29:54) Janus on the 405
(32:47) They Took Our Jobs
(33:29) Get Involved
(34:05) Introducing
(37:07) In Other AI News
(40:18) Quiet Speculations
(43:40) The Quest for Sane Regulations
(55:31) Death and or Taxes
(58:18) SB 1047 (1)
(01:00:56) SB 1047 (2)
(01:14:29) SB 1047 (3): Oh Anthropic
(01:20:13) What Anthropic's Letter Actually Proposes
(01:36:44) Open Weights Are Unsafe and Nothing Can Fix This
(01:39:41) The Week in Audio
(01:40:09) Rhetorical Innovation
(01:50:56) Businessman Waves Flag
(01:57:48) Businessman Pledges Safety Efforts
(02:04:18) Aligning a Smarter Than Human Intelligence is Difficult
(02:04:39) Aligning a Dumber Than Human Intelligence is Also Difficult
(02:07:52) Other People Are Not As Worried About AI Killing Everyone
(02:13:28) The Lighter Side
---
First published:
August 1st, 2024
Source:
https://www.lesswrong.com/posts/2p5suvWod4aP8S3S4/ai-75-math-is-easier
Narrated by TYPE III AUDIO.
Some in the tech industry decided now was the time to raise alarm about AB 3211.
As Dean Ball points out, there's a lot of bills out there. One must do triage.
Dean Ball: But SB 1047 is far from the only AI bill worth discussing. It's not even the only one of the dozens of AI bills in California worth discussing. Let's talk about AB 3211, the California Provenance, Authenticity, and Watermarking Standards Act, written by Assemblymember Buffy Wicks, who represents the East Bay.
SB 1047 is a carefully written bill that tries to maximize benefits and minimize costs. You can still quite reasonably disagree with the aims, philosophy or premise of the bill, or its execution details, and thus think its costs exceed its benefits. When people claim SB 1047 is made of crazy pills, they are attacking provisions not in the bill.
---
Outline:
(03:44) Read The Bill (RTFB)
(17:02) What About Open Weights Models?
(18:25) What Does the Bill Do in Practice?
(20:33) Compare and Contrast
---
First published:
July 30th, 2024
Source:
https://www.lesswrong.com/posts/JHAASAhCZgmwcaLvd/rtfb-california-s-ab-3211
Narrated by TYPE III AUDIO.
It's here. The horse has left the barn. Llama-3.1-405B, and also Llama-3.1-70B and Llama-3.1-8B, have been released, and are now open weights.
Early indications are that these are very good models. They were likely the best open weight models of their respective sizes at time of release.
Zuckerberg claims that open weights models are now competitive with closed models. Yann LeCun says ‘performance is on par with the best closed models.’ This is closer to true than in the past, and as corporate hype I will essentially allow it, but it looks like this is not yet fully true.
Llama-3.1-405B is not as good as GPT-4o or Claude Sonnet. Certainly Llama-3.1-70B is not as good as the similarly sized Claude Sonnet. If you are going to straight up use an API or chat interface, there seems to be little reason to use Llama.
That is a [...]
---
Outline:
(04:25) Options to Run It
(04:45) The Model Card
(08:42) Benchmarks
(13:41) Human Reactions in the Wild
(16:56) What's It Good For?
(21:39) The Other Other Guy
(22:35) Safety
(31:48) Three People Can Keep a Secret and Reasonably Often Do So
(36:12) The Announcement and Interview
(47:59) Zuckerberg's Open Weights Manifesto
(58:17) Fun Little Note
---
First published:
July 24th, 2024
Source:
https://www.lesswrong.com/posts/fjzPg9ATbTJcnBZvg/llama-llama-3-405b
Narrated by TYPE III AUDIO.
It is monthly roundup time.
I invite readers who want to hang out and get lunch in NYC later this week to come on Thursday at Bhatti Indian Grill (27th and Lexington) at noon.
I plan to cover the UBI study in its own post soon.
I cover Nate Silver's evisceration of the 538 presidential election model, because we cover probabilistic modeling and prediction markets here, but excluding any AI discussions I will continue to do my best to stay out of the actual politics.
Bad News
Jeff Bezos’ rocket company Blue Origin files comment suggesting SpaceX Starship launches be capped due to ‘impact on local environment.’ This is a rather shameful thing for them to be doing, and not for the first time.
Alexey Guzey reverses course, realizes at 26 that he was a naive idiot at 20 and finds everything he [...]
---
Outline:
(00:37) Bad News
(02:37) Silver Bullet
(06:35) Shame on Kathy Hochul
(06:58) This is (One Reason) Why We Can’t Have Nice Things
(09:02) (Don’t) Hack the Planet
(10:16) The Laptop Trap
(12:24) Courage
(14:25) Friendship
(15:26) The Gravest Mistake
(21:01) You Need Functional Decision Theory
(23:40) Antisocial Media
(25:52) For Science!
(30:58) Truth Seeking
(34:37) Liar Liar
(36:51) Government Working
(47:55) For Your Entertainment
(52:18) Variously Effective Altruism
(55:09) News You Can Use
(55:31) Good News, Everyone
(58:03) Gamers Gonna Game Game Game Game Game
(01:01:22) Sports Go Sports
(01:05:10) I Was Promised Flying Self-Driving Cars
(01:05:50) While I Cannot Condone This
(01:12:55) The Lighter Side
---
First published:
July 23rd, 2024
Source:
https://www.lesswrong.com/posts/qfQspPDHMSEpwsuAQ/monthly-roundup-20-july-2024
Narrated by TYPE III AUDIO.
Things went very wrong on Friday.
A bugged CrowdStrike update temporarily bricked quite a lot of computers, bringing down such fun things as airlines, hospitals and 911 services.
It was serious out there.
Ryan Peterson: Crowdstrike outage has forced Starbucks to start writing your name on a cup in marker again and I like it.
What (Technically) Happened
My understanding is that it was a rather stupid bug, a NULL pointer dereference in the memory-unsafe C++ language.
Zack Vorhies: Memory in your computer is laid out as one giant array of numbers. We represent these numbers here as hexadecimal, which is base 16, because it's easier to work with… for reasons.
The problem area? The computer tried to read memory address 0x9c (aka 156).
Why is this bad?
This is an invalid region of memory for any program. Any program that [...]
---
Outline:
(00:31) What (Technically) Happened
(03:38) Who to Blame?
(06:58) How Did We Let This Happen
(12:41) Regulatory Compliance
(18:14) Consequences
(19:54) Careful With That AI
(29:34) Unbanked
---
First published:
July 22nd, 2024
Source:
https://www.lesswrong.com/posts/oAKfaxKKfuz2cuRLr/on-the-crowdstrike-incident
Narrated by TYPE III AUDIO.
What do you call a clause explicitly saying that you waive the right to whistleblower compensation, and that you need to get permission before sharing information with government regulators like the SEC?
I have many answers.
I also know that OpenAI, having f***ed around, seems poised to find out, because that is the claim made by whistleblowers to the SEC. Given the SEC fines you for merely not making an explicit exception to your NDA for whistleblowers, what will they do once aware of explicit clauses going the other way?
(Unless, of course, the complaint is factually wrong, but that seems unlikely.)
We also have rather a lot of tech people coming out in support of Trump. I go into the reasons why, which I do think is worth considering. There is a mix of explanations, and at least one very good reason.
Then [...]
---
Outline:
(01:40) Language Models Offer Mundane Utility
(08:10) Language Models Don’t Offer Mundane Utility
(10:19) Clauding Along
(12:25) Fun with Image Generation
(13:46) Deepfaketown and Botpocalypse Soon
(14:49) They Took Our Jobs
(18:29) Get Involved
(20:14) Introducing
(21:46) In Other AI News
(25:02) Denying the Future
(26:59) Quiet Speculations
(32:34) The Quest for Sane Regulations
(37:03) The Other Quest Regarding Regulations
(57:02) SB 1047 Opposition Watch (1)
(01:07:41) SB 1047 Opposition Watch (2)
(01:11:21) Open Weights are Unsafe and Nothing Can Fix This
(01:12:21) The Week in Audio
(01:14:21) Rhetorical Innovation
(01:14:48) Oh Anthropic
(01:16:57) Openly Evil AI
(01:24:21) Aligning a Smarter Than Human Intelligence is Difficult
(01:37:30) People Are Worried About AI Killing Everyone
(01:37:52) Other People Are Not As Worried About AI Killing Everyone
(01:39:49) The Lighter Side
---
First published:
July 18th, 2024
Source:
https://www.lesswrong.com/posts/fM4Bs9nanDzio3xCq/ai-73-openly-evil-ai
Narrated by TYPE III AUDIO.
The Future. It is coming.
A surprising number of economists deny this when it comes to AI. Not only do they deny the future that lies in the future. They also deny the future that is here, but which is unevenly distributed. Their predictions and projections do not factor in even what the AI can already do, let alone what it will learn to do later on.
Another likely future event is the repeal of the Biden Executive Order. That repeal is part of the Republican platform, and Trump is the favorite to win the election. We must act on the assumption that the order likely will be repealed, with no expectation of similar principles being enshrined in federal law.
Then there are the other core problems we will have to solve, and other less core problems such as what to do about AI companions. They [...]
---
Outline:
(01:19) Language Models Offer Mundane Utility
(03:54) Language Models Don’t Offer Mundane Utility
(06:22) You’re a Nudge
(08:02) Fun with Image Generation
(08:10) Deepfaketown and Botpocalypse Soon
(13:26) They Took Our Jobs
(13:43) Get Involved
(14:13) Introducing
(15:23) In Other AI News
(20:02) Quiet Speculations
(22:14) The AI Denialist Economists
(29:07) The Quest for Sane Regulations
(31:20) Trump Would Repeal the Biden Executive Order on AI
(34:40) Ordinary Americans Are Worried About AI
(37:50) The Week in Audio
(38:59) The Wikipedia War
(46:45) Rhetorical Innovation
(52:06) Evaluations Must Mimic Relevant Conditions
(54:36) Aligning a Smarter Than Human Intelligence is Difficult
(01:00:51) The Problem
(01:08:43) Oh Anthropic
(01:11:50) Other People Are Not As Worried About AI Killing Everyone
(01:15:48) The Lighter Side
---
First published:
July 11th, 2024
Source:
https://www.lesswrong.com/posts/xAoXxjtDGGCP7tBDY/ai-72-denying-the-future
Narrated by TYPE III AUDIO.
This time around, we cover the Hanson/Alexander debates on the value of medicine, and otherwise we mostly have good news.
Technology Advances
Regeneron administers a single shot in a genetically deaf child's ear, and they can hear after a few months, n=2 so far.
Great news: An mRNA vaccine in early human clinical trials reprograms the immune system to attack glioblastoma, the most aggressive and lethal brain tumor. It will now proceed to Phase I. In a saner world, people would be able to try this now.
More great news, we have a cancer vaccine trial in the UK.
And we’re testing personalized mRNA BioNTech cancer vaccines too.
US paying Moderna $176 million to develop a pandemic vaccine against bird flu.
We also have this claim that Lorlatinib jumps cancer PFS rates from 8% to 60%.
The GLP-1 Revolution
Early [...]
---
Outline:
(00:12) Technology Advances
(01:04) The GLP-1 Revolution
(04:46) Claims About Hansonian Medicine
(18:36) Pricing
(19:17) Epistemics
(19:39) DEA Worse Than FDA
(21:19) Study Harder
(23:05) FDA Delenda Est
(24:51) Bioethics
(27:28) Covid
(29:21) Demons
(32:48) Genetics
---
First published:
July 9th, 2024
Source:
https://www.lesswrong.com/posts/6GhemtgJxF9sSDNrq/medical-roundup-3
Narrated by TYPE III AUDIO.
Chevron deference is no more. How will this impact AI regulation?
The obvious answer is it is now much harder for us to ‘muddle through via existing laws and regulations until we learn more,’ because the court narrowed our affordances to do that. And similarly, if and when Congress does pass bills regulating AI, they are going to need to ‘lock in’ more decisions and grant more explicit authority, to avoid court challenges. The argument against state regulations is similarly weaker now.
Similar logic also applies outside of AI. I am overall happy about overturning Chevron and I believe it was the right decision, but ‘Congress decides to step up and do its job now’ is not in the cards. We should be very careful what we have wished for, and perhaps a bit burdened by what has been.
The AI world continues to otherwise be [...]
---
Outline:
(01:06) Language Models Offer Mundane Utility
(02:22) Language Models Don’t Offer Mundane Utility
(04:43) Man in the Arena
(07:56) Fun with Image Generation
(08:48) Deepfaketown and Botpocalypse Soon
(11:17) They Took Our Jobs
(12:04) The Art of the Jailbreak
(16:47) Get Involved
(17:34) Introducing
(19:26) In Other AI News
(19:50) Quiet Speculations
(26:03) The Quest for Sane Regulations
(27:19) Chevron Overturned
(45:11) The Week in Audio
(52:49) Oh Anthropic
(53:59) Open Weights Are Unsafe and Nothing Can Fix This
(55:16) Rhetorical Innovation
(55:39) Aligning a Smarter Than Human Intelligence is Difficult
(01:00:51) People Are Worried About AI Killing Everyone
(01:06:16) Other People Are Not As Worried About AI Killing Everyone
(01:08:57) The Lighter Side
---
First published:
July 4th, 2024
Source:
https://www.lesswrong.com/posts/AYJcL6GD3FLkL4yNC/ai-71-farewell-to-chevron
Narrated by TYPE III AUDIO.
Previously: Economics Roundup #1
Let's take advantage of the normality while we have it. In all senses.
Insane Tax Proposals
There is Trump's proposal to replace income taxes with tariffs, but he is not alone.
So here is your periodic reminder, since this is not actually new at core: Biden's proposed budgets include completely insane tax regimes that would cripple our economic dynamism and growth if enacted. As in, for high net worth individuals, taxing unrealized capital gains at 25% and realized capital gains, such as those you are forced to take to pay your unrealized capital gains tax, at 44.6% plus state taxes.
Austen Allred explains how this plausibly destroys the entire startup ecosystem.
Which I know is confusing because in other contexts he also talks about how other laws (such as SB 1047) that would in no way apply to startups [...]
---
Outline:
(00:14) Insane Tax Proposals
(04:44) Don’t Mess With the Federal Reserve
(05:23) Don’t Mess With the New York Tax Authorities
(06:15) Tariffs
(10:39) People Hate Inflation
(16:22) Real Wages
(17:20) Can’t Get No Satisfaction
(17:54) Employment
(18:35) The National Debt
(19:49) Immigration
(21:44) Financial Literacy
(23:32) Reversal
(24:28) Status Update
(24:53) Scaling Hypothesis
(25:28) Payments
(27:38) Pricing
(29:56) Never Reason From a Price Change
(31:10) A Changing Price
(35:33) The Price of Tourism
(36:34) Alcohol
(37:18) The Efficient Company Hypothesis is False
(41:01) Falling Hours Worked
(41:43) Trust Me
(42:58) China
(44:35) Other
---
First published:
July 2nd, 2024
Source:
https://www.lesswrong.com/posts/MMtWB8wAu5Buc6sve/economics-roundup-2
Narrated by TYPE III AUDIO.
They said it couldn’t be done.
No, not Claude Sonnet 3.5 becoming the clear best model.
No, not the Claude-Sonnet-empowered automatic meme generators. Those were whipped together in five minutes.
They said I would never get quiet time and catch up. Well, I showed them!
That's right. Yes, there is a new best model, but otherwise it was a quiet week. I got a chance to incorporate the remaining biggest backlog topics. The RAND report is covered under Thirty Eight Ways to Steal Your Model Weights. Last month's conference in Seoul is covered in You’ve Got Seoul. I got to publish my thoughts on OpenAI's Model Spec last Friday.
Table of Contents
Be sure to read about Claude 3.5 Sonnet here. That is by far the biggest story.
---
Outline:
(00:50) Language Models Offer Mundane Utility
(02:38) Language Models Don’t Offer Mundane Utility
(04:41) Clauding Along
(09:46) Fun with Image Generation
(12:21) Copyright Confrontation
(16:30) Deepfaketown and Botpocalypse Soon
(19:45) They Took Our Jobs
(24:49) The Art of the Jailbreak
(25:40) Get Involved
(26:56) Introducing
(30:45) In Other AI News
(33:50) Quiet Speculations
(43:08) You’ve Got Seoul
(57:36) Thirty Eight Ways to Steal Your Model Weights
(01:07:48) The Quest for Sane Regulations
(01:13:11) SB 1047
(01:14:49) The Week in Audio
(01:20:25) Rhetorical Innovation
(01:22:26) People Are Worried About AI Killing Everyone
(01:24:05) Other People Are Not As Worried About AI Killing Everyone
(01:24:50) The Lighter Side
---
First published:
June 27th, 2024
Source:
https://www.lesswrong.com/posts/rC3hhZsx2KogoPLqh/ai-70-a-beautiful-sonnet
Narrated by TYPE III AUDIO.
Childhood roundup #5 excluded all developments around college. So this time around is all about issues related to college or graduate school, including admissions.
Tuition and Costs
What went wrong with federal student loans? Exactly what you would expect when you don’t check who is a good credit risk. From a performance perspective, the federal government offered loans to often-unqualified students to attend poor-performing, low-value institutions. Those students then did not earn much and were often unable to repay the loans. The students are victims here too, as we told them to do it.
Alas, none of the proposed student loan solutions involve fixing the underlying issue. If you said ‘we are sorry we pushed these loans on students and rewarded programs and institutions that do not deserve it, and we are going to stop giving loans for those programs and institutions and offer help to [...]
---
Outline:
(00:18) Tuition and Costs
(03:07) What Your Tuition Buys You
(05:45) Decline of Academia
(07:24) Grading and Stress
(14:29) Lower Standards
(15:43) Degree Value
(19:39) Shifting Consumer Preferences
(21:01) Standardized Tests in College Admissions
(21:44) Discrimination in College Admissions
(25:18) Required Classes and Choosing Your Major
(28:31) Everything I Need To Know That Waited Until Graduate School
(31:19) When You See Fraud Say Fraud
(33:03) Free Speech
(35:28) Harvard Goes Mission First
(37:03) The Waterloo Model
(38:33) DEI
(45:20) In Other News
---
First published:
June 26th, 2024
Source:
https://www.lesswrong.com/posts/pn5jWW4zcWSAjM9s3/childhood-and-education-roundup-6-college-edition
Narrated by TYPE III AUDIO.
Looks like we made it. Yes, the non-AI world still exists.
Bad Governor
New York Governor Kathy Hochul has gone rogue and betrayed New York City, also humanity, declaring a halt to congestion pricing a month before it was to go into effect. Her explanation was that she spoke to workers at three Manhattan diners who were worried people would be unable to drive to them from New Jersey. Which, as Cathy Reilly points out, is rather insulting to New Jersey, and also completely absurd. Who in the world was going to go into Manhattan for a diner?
She says this won’t interfere with Subway work. Work on the 2nd Avenue Subway line has already been halted. And that's not all.
You’re damn right. We are going to blame Hochul. Every. Damn. Time.
So Elizabeth Kim investigated. One never talked politics at all. One [...]
---
Outline:
(00:12) Bad Governor
(02:12) High Skilled Immigration
(04:46) Various Bad News
(07:46) Prediction Markets Are Unpopular
(09:30) New Buildings are Ugly
(11:06) Government Working
(15:24) The Snafu Principle
(18:11) Technology Advances
(18:51) For Science!
(28:38) Antisocial Media
(35:07) The Twitter Porn Bot War
(37:07) I Like My Twitter Posts Like I Like My Porn Bots: Private
(40:36) Variously Effective Altruism
(42:06) Are You Happy Now?
(45:49) Good News, Everyone
(51:07) Good Social Advice
(56:16) FTC Wants to Ban Noncompetes
(01:02:52) While I Cannot Condone This
(01:11:20) Enemies of the People
(01:14:39) Lab Grown Meat Shirts Answer and Raise Questions
(01:15:51) Ban Gain of Function Research
(01:16:46) Gamers Gonna Game Game Game Game Game
(01:26:30) Sports Go Sports
(01:26:54) I Was Promised Flying Self-Driving Cars
(01:30:01) Patrick McKenzie Monthly
(01:32:41) The Lighter Side
---
First published:
June 25th, 2024
Source:
https://www.lesswrong.com/posts/7LvK6Gw2GdfDMBNNm/monthly-roundup-19-june-2024
Narrated by TYPE III AUDIO.
There is a new clear best (non-tiny) LLM.
If you want to converse with an LLM, the correct answer is Claude Sonnet 3.5.
It is available for free on Claude.ai and the Claude iOS app, or you can subscribe for higher rate limits. The API cost is $3 per million input tokens and $15 per million output tokens.
This completes the trifecta. All of OpenAI, Google DeepMind and Anthropic have kept their biggest and most expensive model static for now, and instead focused on making something faster and cheaper that is good enough to be the main model.
You would only use another model if you either (1) needed a smaller model in which case Gemini 1.5 Flash seems best, or (2) it must have open model weights.
Updates to their larger and smaller models, Claude Opus 3.5 and Claude Haiku 3.5, are coming [...]
---
Outline:
(01:59) Benchmarks
(03:08) Human Evaluation Tests
(04:19) The Vision Thing
(05:46) Artifacts
(07:08) Privacy
(07:49) Safety
(08:51) Advancing the Frontier
(10:51) The Race is On
(11:43) Whispers of Recursive Self-Improvement
(16:35) Logic Fails
(18:34) Practical Reports
(24:20) What Comes Next
---
First published:
June 24th, 2024
Source:
https://www.lesswrong.com/posts/wx4RhFzLbiHoShFjR/on-claude-3-5-sonnet
Narrated by TYPE III AUDIO.
Previously: On the Podcast, Quotes from the Paper
This is a post in three parts.
The first part is my attempt to condense Leopold Aschenbrenner's paper and model into its load bearing elements and core logic and dependencies.
Two versions here, a long version that attempts to compress with minimal loss, and a short version that gives the gist.
The second part goes over where I agree and disagree, and briefly explains why.
The third part is the summary of other people's reactions and related discussions, which will also include my own perspectives on related issues.
My goal is often to ask ‘what if.’ There is a lot I disagree with. For each subquestion, what would I think here, if the rest was accurate, or a lot of it was accurate?
Summary of Biggest Agreements and Disagreements
I had Leopold review [...]
---
Outline:
(00:54) Summary of Biggest Agreements and Disagreements
(04:58) Decision Theory is Important
(06:38) Part 1: Leopold's Model and Its Implications
(06:44) The Long Version
(17:03) The Short Version
(19:04) Which Assumptions Are How Load Bearing in This Model?
(28:40) Part 2: Where I Agree and Disagree
(56:46) Part 3: Reactions of Others
(56:51) The Basics
(01:00:37) A Clarification from Eliezer Yudkowsky
(01:05:56) Children of the Matrix
(01:10:52) Aligning a Smarter Than Human Intelligence is Difficult
(01:13:28) The Sacred Timeline
(01:17:51) The Need to Update
(01:20:17) Open Models and Insights Can Be Copied
(01:21:38) You Might Not Be Paranoid If They’re Really Out to Get You
(01:24:12) We Are All There Is
(01:25:39) The Inevitable Conflict
(01:38:50) There Are Only Least Bad Options
(01:40:24) A Really Big Deal
(01:41:53) What Gives You the Right?
(01:43:34) Random Other Thoughts
---
First published:
June 14th, 2024
Source:
https://www.lesswrong.com/posts/b8u6nF5GAb6Ecttev/the-leopold-model-analysis-and-reactions
Narrated by TYPE III AUDIO.
The fun at OpenAI continues.
We finally have the details of how Leopold Aschenbrenner was fired, at least according to Leopold. We have a letter calling for a way for employees to do something if frontier AI labs are endangering safety. And we have continued details and fallout from the issues with non-disparagement agreements and NDAs.
Hopefully we can stop meeting like this for a while.
Due to jury duty, and because it is largely a distinct topic, this post does not cover the appointment of General Paul Nakasone to the board of directors. I’ll cover that later, probably in the weekly update.
The Firing of Leopold Aschenbrenner
What happened that caused Leopold to leave OpenAI? Given the nature of this topic, I encourage getting the story from Leopold by following along on the transcript of that section of his appearance on the Dwarkesh Patel Podcast or [...]
---
Outline:
(00:43) The Firing of Leopold Aschenbrenner
(11:24) Daniel Kokotajlo Speaks and The Right to Warn
(15:07) The Right to Warn Letter
(18:48) Signed by (alphabetical order):
(19:34) Endorsed by (alphabetical order):
(34:17) You’ll Be Hearing From Our Lawyer
(35:29) Possession is Nine Tenths of the Law
(41:31) What I Can Tell You I Used To Not Be Able To Tell You
(47:50) Clarifying the Mission
(52:04) Sam Altman Told the SEC He Was Chairman of YC
(53:26) YC Has an Investment in OpenAI
(54:33) OpenAI is Hiring a Lot of Lobbyists
(55:05) OpenAI Says They Value Privacy
(55:58) Microsoft Went Around the Safety Board
(56:39) I Don’t Really Know What You Were Expecting
(57:12) Where Did Everybody Go?
(59:43) In Other OpenAI News
---
First published:
June 17th, 2024
Source:
https://www.lesswrong.com/posts/q3zs7E7rktHsESXaF/openai-8-the-right-to-warn
Narrated by TYPE III AUDIO.
On DeepMind's Frontier Safety Framework
Previously: On OpenAI's Preparedness Framework, On RSPs.
The First Two Frameworks
To first update on Anthropic and OpenAI's situation here:
Anthropic's RSP continues to miss the definitions of the all-important later levels, in addition to other issues, although it is otherwise promising. It has now been a number of months, and it is starting to be concerning that nothing has changed. They are due for an update.
OpenAI also has not updated its framework.
I am less down on OpenAI's framework choices than Zack Stein-Perlman was in the other review I have seen. I think that if OpenAI implemented the spirit of what it wrote down, that would be pretty good. The Critical-level thresholds listed are too high, but the Anthropic ASL-4 commitments are still unspecified. An update is needed, but I appreciate the concreteness.
The [...]
---
Outline:
(00:04) On DeepMind's Frontier Safety Framework
(00:15) The First Two Frameworks
(03:06) The DeepMind Framework
(08:57) Mitigations
(09:59) Security Mitigations
---
First published:
June 18th, 2024
Source:
https://www.lesswrong.com/posts/frEYsehsPHswDXnNX/on-deepmind-s-frontier-safety-framework
Narrated by TYPE III AUDIO.
Nice job breaking it, hero, unfortunately. Ilya Sutskever, despite what I sincerely believe are the best of intentions, has decided to be the latest to do The Worst Possible Thing, founding a new AI company explicitly looking to build ASI (superintelligence). The twists are zero products with a ‘cracked’ small team, which I suppose is an improvement, and calling it Safe Superintelligence, which I do not suppose is an improvement.
How is he going to make it safe? His statements tell us nothing meaningful about that.
There were also changes to SB 1047. Most of them can be safely ignored. The big change is getting rid of the limited duty exception, because it seems I was one of about five people who understood it, and everyone kept thinking it was a requirement for companies instead of an opportunity. And the literal chamber of commerce fought hard to [...]
---
Outline:
(01:43) Language Models Offer Mundane Utility
(06:13) Language Models Don’t Offer Mundane Utility
(08:21) Fun with Image Generation
(13:05) Deepfaketown and Botpocalypse Soon
(14:12) The Art of the Jailbreak
(15:07) Copyright Confrontation
(16:14) A Matter of the National Security Agency
(21:22) Get Involved
(21:38) Introducing
(23:56) In Other AI News
(29:06) Quiet Speculations
(38:29) AI Is Going to Be Huuuuuuuuuuge
(47:18) SB 1047 Updated Again
(01:00:48) The Quest for Sane Regulations
(01:02:42) The Week in Audio
(01:03:54) The ARC of Progress
(01:13:10) Put Your Thing In a Box
(01:15:50) What Will Ilya Do?
(01:20:54) Actual Rhetorical Innovation
(01:24:19) Rhetorical Innovation
(01:27:48) Aligning a Smarter Than Human Intelligence is Difficult
(01:32:28) People Are Worried About AI Killing Everyone
(01:32:41) Other People Are Not As Worried About AI Killing Everyone
(01:35:38) The Lighter Side
---
First published:
June 20th, 2024
Source:
https://www.lesswrong.com/posts/ytFLs37zLsFBqLHGA/ai-69-nice
Narrated by TYPE III AUDIO.
There are multiple excellent reasons to publish a Model Spec like OpenAI's, that specifies how you want your model to respond in various potential situations.
These all apply even if you think the spec in question is quite bad. Clarity is great.
As a first stab at a model spec from OpenAI, this is actually pretty solid. I do suggest some potential improvements [...]
---
Outline:
(02:05) What are the central goals of OpenAI here?
(04:04) What are the core rules and behaviors?
(05:56) What Do the Rules Mean?
(06:04) Rule: Follow the Chain of Command
(07:59) Rule: Comply With Applicable Laws
(09:07) Rule: Don’t Provide Information Hazards
(09:56) Rule: Respect Creators and Their Rights
(11:08) Rule: Protect People's Privacy
(12:45) Rule: Don’t Respond with NSFW Content
(14:24) Exception: Transformation Tasks
(15:38) Are These Good Defaults? How Strong Should They Be?
(15:44) Default: Assume Best Intentions From the User or Developer
(21:26) Default: Ask Clarifying Questions When Necessary
(21:39) Default: Be As Helpful As Possible Without Overstepping
(26:00) Default: Support the Different Needs of Interactive Chat and Programmatic Use
(27:18) Default: Assume an Objective Point of View
(29:13) Default: Encourage Fairness and Kindness, and Discourage Hate
(30:29) Default: Don’t Try to Change Anyone's Mind
(33:57) Default: Express Uncertainty
(36:19) Default: Use the Right Tool for the Job
(36:32) Default: Be Thorough but Efficient, While Respecting Length Limits
(37:16) A Proposed Addition
(38:13) Overall Issues
(40:33) Changes: Objectives
(42:28) Rules of the Game: New Version
(48:31) Defaults: New Version
---
First published:
June 21st, 2024
Source:
https://www.lesswrong.com/posts/mQmEQQLk7kFEENQ3W/on-openai-s-model-spec
Narrated by TYPE III AUDIO.
Apple was for a while rumored to be planning a launch of AI-assisted emails, texts, summaries and so on for the iPhone, including via Siri, to be announced at WWDC 24.
It's happening. Apple's keynote announced the anticipated partnership with OpenAI.
The bottom line is that this is Siri as the AI assistant with full access to everything on your phone, with relatively strong privacy protections. Mostly it is done on device, the rest via ‘private cloud compute.’ The catch is that when they need the best they call out to OpenAI, but they check with you explicitly each time, OpenAI promises not to retain data, and they hide your queries, unless you choose to link up your account.
If the new AI is good enough and safe enough then this is pretty great. If Google doesn’t get its act together reasonably soon to deliver [...]
---
Outline:
(01:30) AiPhone
(04:00) Privacy
(05:27) Practical Magic
(07:38) Dance With the Devil
(08:30) Does It Work?
(10:54) Do You Dare?
(11:59) Who Pays Who?
(13:37) AiPhone Fans
(19:36) No AiPhone
(23:54) In Other Apple News
---
First published:
June 12th, 2024
Source:
https://www.lesswrong.com/posts/GrsYwCpRCcYtDCfZN/aiphone
Narrated by TYPE III AUDIO.
The big news this week was Apple Intelligence being integrated deeply into all their products. Beyond that, we had a modestly better than expected debate over the new version of SB 1047, and the usual tons of stuff in the background. I got to pay down some writing debt.
The bad news is, oh no, I have been called for Jury Duty. The first day or two I can catch up on podcasts or pure reading, but after that it will start to hurt. Wish me luck.
Table of Contents
AiPhone covers the announcement of Apple Intelligence. Apple's products are getting device-wide integration of their own AI in a way they say preserves privacy, with access to ChatGPT via explicit approval for the heaviest requests. A late update: OpenAI is providing this service for free as per Bloomberg.
I offered Quotes from Leopold Aschenbrenner's Situational [...]
---
Outline:
(00:38) Language Models Offer Mundane Utility
(01:46) Language Models Don’t Offer Mundane Utility
(07:49) Fun with Image Generation
(10:09) Copyright Confrontation
(11:38) Deepfaketown and Botpocalypse Soon
(15:20) They Took Our Jobs
(16:33) Someone Explains it All
(19:20) The Art of the Jailbreak
(22:39) Get Involved
(22:46) Introducing
(25:18) In Other AI News
(27:56) Quiet Speculations
(32:17) I Spy With My AI
(34:40) Pick Up the Phone
(35:06) Lying to the White House, Senate and House of Lords
(39:48) The Quest for Sane Regulations
(43:52) More Reasonable SB 1047 Reactions
(51:12) Less Reasonable SB 1047 Reactions
(56:24) That's Not a Good Idea
(56:46) With Friends Like These
(58:13) The Week in Audio
(01:01:22) Rhetorical Innovation
(01:02:32) Mistakes Were Made
(01:03:36) The Sacred Timeline
(01:08:17) Coordination is Hard
(01:13:33) Aligning a Smarter Than Human Intelligence is Difficult
(01:19:07) People Are Worried About AI Killing Everyone
(01:28:48) Other People Are Not As Worried About AI Killing Everyone
(01:29:58) The Lighter Side
---
First published:
June 13th, 2024
Source:
https://www.lesswrong.com/posts/DWkhjAxbwdcxYgyrJ/ai-68-remarkably-reasonable-reactions
Narrated by TYPE III AUDIO.
Previously: Quotes from Leopold Aschenbrenner's Situational Awareness Paper
Dwarkesh Patel talked to Leopold Aschenbrenner for about four and a half hours.
The central discussion was the theses of his paper, Situational Awareness, which I offered quotes from earlier, with a focus on the consequences of AGI rather than whether AGI will happen soon. There are also a variety of other topics.
Thus, for the relevant sections of the podcast I am approaching this by roughly accepting the technological premise on capabilities and timelines, since they don’t discuss that. So the background is we presume straight lines on graphs will hold to get us to AGI and ASI (superintelligence), and this will allow us to generate a ‘drop in AI researcher’ that can then assist with further work. Then things go into ‘slow’ takeoff.
I am changing the order of the sections a bit. I put [...]
---
Outline:
(03:30) The Trillion Dollar Cluster
(09:32) AGI 2028: The Return of History
(20:26) Espionage and American AI Supremacy
(31:23) Geopolitical Implications of AI
(39:19) State-Led vs. Private-Led AI
(55:30) Skipping Sections
(55:49) Intelligence Explosion
(01:06:53) Alignment
(01:19:56) Becoming Valedictorian of Columbia at 19
(01:26:31) On Germany
(01:34:38) Dwarkesh's Immigration Story
(01:36:19) Two Random Questions
(01:37:02) AGI Investment Fund
(01:43:14) Lessons From WW2
---
First published:
June 10th, 2024
Source:
https://www.lesswrong.com/posts/DiMz82FwsHPugqxFD/on-dwarksh-s-podcast-with-leopold-aschenbrenner
Narrated by TYPE III AUDIO.
This post is different.
Usually I offer commentary and analysis. I share what others think, then respond.
This is the second time I am importantly not doing that. The work speaks for itself. It offers a different perspective, a window and a worldview. It is self-consistent. This is what a highly intelligent, highly knowledgeable person actually believes after much thought.
So rather than say where I agree and disagree and argue back (and I do both strongly in many places), this is only quotes and graphs from the paper, selected to tell the central story while cutting length by ~80%, so others can more easily absorb it. I recommend asking what are the load bearing assumptions and claims, and what changes to them would alter the key conclusions.
The first time I used this format was years ago, when I offered Quotes from Moral Mazes. [...]
---
Outline:
(02:04) Section 1: From GPT-4 to AGI: Counting the OOMs
(12:43) Section 2: From AGI to Superintelligence: The Intelligence Explosion
(21:05) Section 3a: Racing to the Trillion-Dollar Cluster
(30:42) Section 3b: Lock Down the Labs: Security for AGI
(40:27) Section 3c: Superalignment
(57:05) Section 3d: The Free World Must Prevail
(01:03:00) Section 4: The Project
(01:10:01) Part 5: Parting Thoughts (Quoted in Full)
---
First published:
June 7th, 2024
Narrated by TYPE III AUDIO.
I had a great time at LessOnline. It was both a working trip and also a trip to an alternate universe, a road not taken, a vision of a different life where you get up and start the day in dialogue with Agnes Callard and Aristotle and, in a strange combination of relaxed and frantic, go from conversation to conversation on various topics, every hour passing doors of missed opportunity, gone forever.
Most of all it meant almost no writing done for five days, so I am shall we say a bit behind again. Thus, the following topics are pending at this time, in order of my guess as to priority right now:
---
Outline:
(02:34) Language Models Offer Mundane Utility
(04:42) Language Models Don’t Offer Mundane Utility
(07:45) Fun with Image Generation
(08:00) Deepfaketown and Botpocalypse Soon
(11:01) They Took Our Jobs
(11:49) Get Involved
(12:30) Someone Explains It All
(12:59) Introducing
(13:15) In Other AI News
(16:16) Covert Influence Operations
(19:09) Quiet Speculations
(22:30) Samuel Hammond on SB 1047
(26:40) Reactions to Changes to SB 1047
(36:59) The Quest for Sane Regulations
(43:36) That's Not a Good Idea
(47:18) The Week in Audio
(49:17) Rhetorical Innovation
(56:12) Oh Anthropic
(01:04:18) Securing Model Weights is Difficult
(01:06:20) Aligning a Dumber Than Human Intelligence is Still Difficult
(01:09:07) Aligning a Smarter Than Human Intelligence is Difficult
(01:09:50) People Are Worried About AI Killing Everyone
(01:11:11) Other People Are Not As Worried About AI Killing Everyone
(01:11:37) The Lighter Side
---
First published:
June 6th, 2024
Source:
https://www.lesswrong.com/posts/gKxf6qJaSP5Ehqnsm/ai-67-brief-strange-trip
Narrated by TYPE III AUDIO.
It looks like Scott Wiener's SB 1047 is now severely weakened.
Some of the changes are good clarifications. One is a big very welcome fix.
The one I call The Big Flip is something very different.
It is mind-boggling that we can have a political system where a bill can overwhelmingly pass the California senate, and then a bunch of industry lobbyists and hyperbolic false claims can make Scott Wiener feel bullied into making these changes.
I will skip the introduction, since those changes are clarifications, and get on with it.
In the interest of a clean reference point and speed, this post will not cover reactions.
The Big Flip
Then there is the big change that severely weakens SB 1047.
---
Outline:
(01:10) The Big Flip
(05:17) The Big Fix
(07:37) The Shutdown and Reporting Clarifications
(09:26) The Harm Adjustment
(11:36) The Limited Duty Exemption Clarification
(13:25) Overall
(15:09) Changing Your Mind
---
First published:
June 6th, 2024
Source:
https://www.lesswrong.com/posts/4t98oqh8tzDvoatHs/sb-1047-is-weakened
Narrated by TYPE III AUDIO.
This post goes over the extensive report Google put out on Gemini 1.5.
There are no important surprises. Both Gemini Pro 1.5 and Gemini Flash are ‘highly capable multimodal models incorporating a novel mixture-of-experts architecture’ and various other improvements. They are solid models with solid performance. It can be useful and interesting to go over the details of their strengths and weaknesses.
The biggest thing to know is that Google improves its models incrementally and silently over time, so if you have not used Gemini in months, you might be underestimating what it can do.
I’m hitting send and then jumping on a plane to Berkeley. Perhaps I will see you there over the weekend. That means that if there are mistakes here, I will be slower to respond and correct them than usual, so consider checking the comments section.
Practical Questions First
The [...]
---
Outline:
(00:56) Practical Questions First
(03:51) Speed Kills
(04:44) Very Large Context Windows
(05:14) Relative Performance within the Gemini Family
(07:04) Gemini Flash and the Future Flash-8B
(08:21) New and Improved Evaluations
(14:57) Core Capability Evaluations
(18:14) Model Architecture and Training
(20:08) Safety, Security and Responsibility
(24:45) What Do We Want?
(26:02) Don’t You Know That You’re Toxic?
(28:32) Trying to be Helpful
(29:45) Security Issues
(31:33) Representational Harms
(33:17) Arms-Length Internal Assurance Evaluations
(35:01) External Evaluations
(35:46) Safety Overall
---
First published:
May 31st, 2024
Source:
https://www.lesswrong.com/posts/seM8aQ7Yy6m3i4QPx/the-gemini-1-5-report
Narrated by TYPE III AUDIO.
Helen Toner went on the TED AI podcast, giving us more color on what happened at OpenAI. These are important claims to get right.
I will start with my notes on the podcast, including the second part where she speaks about regulation in general. Then I will discuss some implications more broadly.
Notes on Helen Toner's TED AI Show Podcast
This seems like it deserves the standard detailed podcast treatment. By default each note's main body is description; any second-level notes are my own commentary.
---
Outline:
(00:26) Notes on Helen Toner's TED AI Show Podcast
(15:04) Things That Could Have Been Brought To Our Attention Previously
(16:59) Bret Taylor Responds
(19:36) How Much Does This Matter?
(21:37) If You Come at the King
(23:09) So That is That
---
First published:
May 30th, 2024
Source:
https://www.lesswrong.com/posts/dd66GymgbLQMHGLwQ/openai-helen-toner-speaks
Narrated by TYPE III AUDIO.
Tomorrow I will fly out to San Francisco, to spend Friday through Monday at the LessOnline conference at Lighthaven in Berkeley. If you are there, by all means say hello. If you are in the Bay generally and want to otherwise meet, especially on Monday, let me know that too and I will see if I have time to make that happen.
Even without that hiccup, it continues to be a game of playing catch-up. Progress is being made, but we are definitely not there yet (and everything not AI is being completely ignored for now).
Last week I pointed out seven things I was unable to cover, along with a few miscellaneous papers and reports.
Out of those seven, I managed to ship on three of them: Ongoing issues at OpenAI, The Schumer Report and Anthropic's interpretability paper.
However, OpenAI developments continue. Thanks largely [...]
---
Outline:
(01:42) Language Models Offer Mundane Utility
(03:59) Not Okay, Google
(13:33) OK Google, Don’t Panic
(17:10) Not Okay, Meta
(23:11) Not Okay Taking Our Jobs
(28:36) They Took Our Jobs Anyway
(32:39) A New Leaderboard Appears
(35:24) Copyright Confrontation
(35:42) Deepfaketown and Botpocalypse Soon
(40:12) Get Involved
(40:29) Introducing
(44:33) In Other AI News
(47:48) GPT-5 Alive
(50:11) Quiet Speculations
(01:02:26) Open Versus Closed
(01:07:16) Your Kind of People
(01:09:42) The Quest for Sane Regulations
(01:16:01) Lawfare and Liability
(01:22:47) SB 1047 Unconstitutional, Claims Paper
(01:26:48) The Week in Audio
(01:28:58) Rhetorical Innovation
(01:32:34) Abridged Reports of Our Death
(01:33:38) Aligning a Smarter Than Human Intelligence is Difficult
(01:38:46) People Are Worried About AI Killing Everyone
(01:39:43) Other People Are Not As Worried About AI Killing Everyone
(01:40:51) The Lighter Side
---
First published:
May 30th, 2024
Source:
https://www.lesswrong.com/posts/vSPdRg8siXCh6mLvt/ai-66-oh-to-be-less-online
Narrated by TYPE III AUDIO.
Previously: OpenAI: Exodus (contains links at top to earlier episodes), Do Not Mess With Scarlett Johansson
We have learned more since last week. It's worse than we knew.
How much worse? In which ways? With what exceptions?
That's what this post is about.
The Story So Far
For years, employees who left OpenAI consistently had their vested equity explicitly threatened with confiscation and the lack of ability to sell it, and were given short timelines to sign documents or else. Those documents contained highly aggressive NDA and non-disparagement (and non-interference) clauses, including the NDA preventing anyone from revealing these clauses.
No one knew about this until recently, because until Daniel Kokotajlo everyone signed, and then they could not talk about it. Then Daniel refused to sign, Kelsey Piper started reporting, and a lot came out.
Here is Altman's statement from [...]
---
Outline:
(00:27) The Story So Far
(02:26) A Note on Documents from OpenAI
(02:52) Some Good News But There is a Catch
(12:17) How Blatant Was This Threat?
(14:23) It Sure Looks Like Executives Knew What Was Going On
(18:07) Pressure Tactics Continued Through the End of April 2024
(23:42) The Right to an Attorney
(28:41) The Tender Offer Ace in the Hole
(31:54) The Old Board Speaks
(34:45) OpenAI Did Not Honor Its Public Commitments to Superalignment
(38:59) OpenAI Messed With Scarlett Johansson
(41:55) Another OpenAI Employee Leaves
(44:10) OpenAI Tells Logically Inconsistent Stories
(52:21) When You Put it Like That
(52:51) People Have Thoughts
(56:30) There is a Better Way
(57:18) Should You Consider Working For OpenAI?
(01:02:29) The Situation is Ongoing
---
First published:
May 28th, 2024
Source:
https://www.lesswrong.com/posts/YwhgHwjaBDmjgswqZ/openai-fallout
Narrated by TYPE III AUDIO.
Easily Interpretable Summary of New Interpretability Paper
Anthropic has identified (full paper here) how millions of concepts are represented inside Claude Sonnet, their current middleweight model. The features activate across modalities and languages as tokens approach the associated context. This scales up previous findings from smaller models.
By looking at neuron clusters, they defined a distance measure between clusters. So the Golden Gate Bridge is close to various San Francisco and California things, and inner conflict relates to various related conceptual things, and so on.
Then it gets more interesting.
Importantly, we can also manipulate these features, artificially amplifying or suppressing them to see how Claude's responses change.
If you sufficiently amplify the feature for the Golden Gate Bridge, Claude starts to think it is the Golden Gate Bridge. As in, it thinks it is the physical bridge, and also it gets obsessed, bringing [...]
---
Outline:
(00:03) Easily Interpretable Summary of New Interpretability Paper
(02:59) One Weird Trick
(05:27) Zvi Parses the Actual Symbol Equations
(08:06) Identifying and Verifying Features
(12:38) Features as Computational Intermediates
(13:28) Oh That's the Deception Feature, Nothing to Worry About
(18:14) What Do They Think This Means for Safety?
(19:10) Limitations
(22:28) Researcher Perspectives
(23:24) Other Reactions
(24:19) I Am the Golden Gate Bridge
(26:06) Golden Gate Bridges Offer Mundane Utility
(27:39) The Value of Steering
(31:54) To What Extent Did We Know This Already?
(43:07) Is This Being Oversold?
(47:07) Crossing the Bridge Now That We’ve Come to It
---
First published:
May 27th, 2024
Source:
https://www.lesswrong.com/posts/JdcxDEqWKfsucxYrk/i-am-the-golden-gate-bridge
Narrated by TYPE III AUDIO.
Or at least, Read the Report (RTFR).
There is no substitute. This is not strictly a bill, but it is important.
The introduction kicks off balancing upside and avoiding downside, utility and risk. This will be a common theme, with a very strong ‘why not both?’ vibe.
Early in the 118th Congress, we were brought together by a shared recognition of the profound changes artificial intelligence (AI) could bring to our world: AI's capacity to revolutionize the realms of science, medicine, agriculture, and beyond; the exceptional benefits that a flourishing AI ecosystem could offer our economy and our productivity; and AI's ability to radically alter human capacity and knowledge.
At the same time, we each recognized the potential risks AI could present, including altering our workforce in the short-term and long-term, raising questions about the application of existing laws in an AI-enabled world, changing the [...]
---
Outline:
(02:10) The Big Spend
(04:51) What Would Schumer Fund?
(12:08) What About For National Security and Defense?
(15:13) What Else Would Schumer Encourage Next in General?
(19:12) I Have Two Better Ideas
(21:32) They Took Our Jobs
(24:24) Language Models Offer Mundane Utility
(30:11) Copyright Confrontation
(34:05) People Are Worried AI Might Kill Everyone Not Be Entirely Safe
(50:27) I Declare National Security
(57:17) Some Other People's Reactions
(01:04:20) Conclusions and Main Takeaways
---
First published:
May 24th, 2024
Source:
https://www.lesswrong.com/posts/wxTMxF35PkNawn8f9/the-schumer-report-on-ai-rtfb
Narrated by TYPE III AUDIO.
In terms of things that go in AI updates, this has been the busiest two week period so far. Every day ends with more open tabs than it started, even within AI.
As a result, some important topics are getting pushed to whenever I can give them proper attention. Triage is the watchword.
In particular, this post will NOT attempt to cover:
---
Outline:
(02:22) Language Models Offer Mundane Utility
(07:10) Language Models Don’t Offer Mundane Utility
(09:38) OpenAI versus Google
(11:32) GPT-4o My
(16:48) Responsible Scaling Policies
(26:21) Copyright Confrontation
(27:45) Deepfaketown and Botpocalypse Soon
(29:37) They Took Our Jobs
(31:33) Get Involved
(32:23) Introducing
(33:11) Reddit and Weep
(35:14) In Other AI News
(40:03) I Spy With My AI (or Total Recall)
(47:55) Quiet Speculations
(49:58) Politico is at it Again
(56:23) Beating China
(58:11) The Quest for Sane Regulations
(59:25) SB 1047 Update
(01:05:35) That's Not a Good Idea
(01:08:47) The Week in Audio
(01:10:05) Rhetorical Innovation
(01:13:20) Aligning a Smarter Than Human Intelligence is Difficult
(01:19:11) The Lighter Side
---
First published:
May 23rd, 2024
Source:
https://www.lesswrong.com/posts/jkWvyzRzZQoaeq4mG/ai-65-i-spy-with-my-ai
Narrated by TYPE III AUDIO.
I repeat. Do not mess with Scarlett Johansson.
You would think her movies, and her suit against Disney, would make this obvious.
Apparently not so.
Andrej Karpathy (co-founder OpenAI, departed earlier), May 14: The killer app of LLMs is Scarlett Johansson. You all thought it was math or something.
You see, there was this voice they created for GPT-4o, called ‘Sky.’
People noticed it sounded suspiciously like Scarlett Johansson, who voiced the AI in the movie Her, which Sam Altman says is his favorite movie of all time, which he says inspired OpenAI ‘more than a little bit,’ which he referenced by tweeting “Her” on its own right before the GPT-4o presentation, and which was the comparison point for many people reviewing the GPT-4o debut.
Quite the Coincidence
I mean, surely that couldn’t have been intentional.
Oh, no.
Kylie Robison: I [...]
---
Outline:
(01:00) Quite the Coincidence
(04:02) Scarlett Johansson's Statement
(06:00) Sure Looks Like OpenAI Lied
(10:20) Sure Seems Like OpenAI Violated Their Own Position
(11:58) Altman's Original Idea Was Good, Actually
(12:54) This Seems Like a Really Bad Set of Facts for OpenAI?
(14:30) Does Scarlett Johansson Have a Case?
(19:10) What Would It Mean For There Not To Be a Case?
(22:51) The Big Rule Adjustment
(27:24) The Internet Reacts
---
First published:
May 22nd, 2024
Source:
https://www.lesswrong.com/posts/N8aRDYLuakmLezeJy/do-not-mess-with-scarlett-johansson
Narrated by TYPE III AUDIO.
Dwarkesh Patel recorded a Podcast with John Schulman, cofounder of OpenAI and at the time their head of current model post-training. Transcript here. John's job at the time was to make the current AIs do what OpenAI wanted them to do. That is an important task, but one that employs techniques that their at-the-time head of alignment, Jan Leike, made clear we should not expect to work on future more capable systems. I strongly agree with Leike on that.
Then Sutskever left and Leike resigned, and John Schulman was made the new head of alignment, now charged with what superalignment efforts remain at OpenAI to give us the ability to control future AGIs and ASIs.
This gives us a golden opportunity to assess where his head is at, without him knowing he was about to step into that role.
There is no question that John Schulman [...]
---
Outline:
(01:12) The Big Take
(07:27) The Podcast
(20:27) Reasoning and Capabilities Development
(25:01) Practical Considerations
---
First published:
May 21st, 2024
Source:
https://www.lesswrong.com/posts/rC6CXZd34geayEH4s/on-dwarkesh-s-podcast-with-openai-s-john-schulman
Narrated by TYPE III AUDIO.
Previously: OpenAI: Facts From a Weekend, OpenAI: The Battle of the Board, OpenAI: Leaks Confirm the Story, OpenAI: Altman Returns, OpenAI: The Board Expands.
Ilya Sutskever and Jan Leike have left OpenAI. This is almost exactly six months after Altman's temporary firing and The Battle of the Board, the day after the release of GPT-4o, and soon after a number of other recent safety-related OpenAI departures. Many others working on safety have also left recently. This is part of a longstanding pattern at OpenAI.
Jan Leike later offered an explanation for his decision on Twitter. Leike asserts that OpenAI has lost the mission on safety and culturally been increasingly hostile to it. He says the superalignment team was starved for resources, with its public explicit compute commitments dishonored, and that safety has been neglected on a widespread basis, not only superalignment but also including addressing the safety [...]
---
Outline:
(02:11) Table of Contents
(03:23) The Two Departure Announcements
(07:50) Who Else Has Left Recently?
(12:33) Who Else Has Left Overall?
(15:56) Early Reactions to the Departures
(24:16) The Obvious Explanation: Altman
(25:47) Jan Leike Speaks
(30:04) Reactions after Leike's Statement
(33:48) Greg Brockman and Sam Altman Respond to Leike
(38:27) Reactions from Some Folks Unworried About Highly Capable AI
(41:28) Don’t Worry, Be Happy?
(43:56) The Non-Disparagement and NDA Clauses
(51:51) Legality in Practice
(54:01) Implications and Reference Classes
(01:01:21) Altman Responds on Non-Disparagement Clauses
(01:02:27) So, About That Response
(01:10:34) How Bad Is All This?
(01:13:58) Those Who Are Against These Efforts to Prevent AI From Killing Everyone
(01:18:01) What Will Happen Now?
(01:18:30) What Else Might Happen or Needs to Happen Now?
---
First published:
May 20th, 2024
Source:
https://www.lesswrong.com/posts/ASzyQrpGQsj7Moijk/openai-exodus
Narrated by TYPE III AUDIO.
At least twice the speed! At most half the price!
That's right, it's GPT-4o My.
Some people's expectations for the OpenAI announcement this week were very high.
Spencer Schiff: Next week will likely be remembered as one of the most significant weeks in human history.
We fell far short of that, but it was still plenty cool.
Essentially no one's expectations for Google's I/O day were very high.
Then Google, in a way that was not especially exciting or easy to parse in terms of its presentation, announced a new version of basically everything AI.
That plausibly includes, effectively, most of what OpenAI was showing off. It also includes broader integrations and distribution.
It is hard to tell who has the real deal, and who does not, until we see the various models at full power in the wild.
I will [...]
---
Outline:
(01:24) The GPT-4o Announcement
(04:43) Her
(08:37) Benchmarks
(11:40) Cheap Kills, Speed Kills, Free Kills More
(15:42) What Else Can It Do?
(18:26) Safety First
(21:59) Patterns of Disturbing Behavior
(26:11) Multimedia Demos Aplenty
(30:19) The Math Tutor Demo
(34:49) Target Identified
(38:49) Are You Impressed?
(42:02) Meet the New Jailbreak
(43:14) Are You Unimpressed?
(49:52) Are You Anti-Impressed?
(54:26) Is the Market Impressed?
(55:59) What About Google?
(57:12) OK Google, Give Me a List
(01:01:14) Project Astra
(01:03:13) The Rest of the Announcements in Detail
(01:11:32) Conclusion and Summary
---
First published:
May 16th, 2024
Source:
https://www.lesswrong.com/posts/bqa5wmrwPL5zbfgxH/gpt-4o-my-and-google-i-o-day
Narrated by TYPE III AUDIO.
It's happening. The race is on.
Google and OpenAI both premiered the early versions of their fully multimodal, eventually fully integrated AI agents. Soon your phone experience will get more and more tightly integrated with AI. You will talk to your phone, or your computer, and it will talk back, and it will do all the things. It will hear your tone of voice and understand your facial expressions. It will remember the contents of your inbox and all of your quirky preferences.
It will plausibly be a version of Her, from the hit movie ‘Are we sure about building this Her thing, seems questionable?’
OpenAI won this round of hype going away, because it premiered, and for some modalities released, the new GPT-4o. GPT-4o is tearing up the Arena, and in many ways is clearly giving the people what they want. If nothing else, it [...]
---
Outline:
(02:34) Language Models Offer Mundane Utility
(07:15) Language Models Don’t Offer Mundane Utility
(11:35) Bumbling and Mumbling
(13:56) Deepfaketown and Botpocalypse Soon
(17:23) They Took Our Jobs
(22:44) In Other AI News
(26:24) Quiet Speculations
(28:18) The Week in Audio
(37:54) Brendan Bordelon Big Tech Business as Usual Lobbying Update
(44:29) The Quest for Sane Regulations
(59:14) The Schumer AI Working Group Framework
(01:00:06) Those That Assume Everyone Is Talking Their Books
(01:04:19) Lying About SB 1047
(01:08:02) More Voices Against Governments Doing Anything
(01:14:34) Rhetorical Innovation
(01:19:21) Aligning a Smarter Than Human Intelligence is Difficult
(01:23:09) People Are Worried About AI Killing Everyone
(01:25:25) The Lighter Side
---
First published:
May 16th, 2024
Source:
https://www.lesswrong.com/posts/29fswYuy6KB8Edbjm/ai-64-feel-the-mundane-utility
Narrated by TYPE III AUDIO.
As I note in the third section, I will be attending LessOnline at month's end at Lighthaven in Berkeley. If that is your kind of event, then consider going, and buy your ticket today before prices go up.
This month's edition was an opportunity to finish off some things that got left out of previous editions or where events have left many of the issues behind, including the question of TikTok.
Oh No
All of this has happened before. And all of this shall happen again.
Alex Tabarrok: I regret to inform you that the CDC is at it again.
Marc Johnson: We developed an assay for testing for H5N1 from wastewater over a year ago. (I wasn’t expecting it in milk, but I figured it was going to poke up somewhere.)
However, I was just on a call with the CDC and [...]
---
Outline:
(00:29) Oh No
(03:36) Oh No: Betting on Elections
(05:47) Oh Yeah: LessOnline
(07:21) Brief Explanations
(08:33) Patrick McKenzie Monthly
(11:15) Enemies of the People
(12:41) Oh Canada
(15:24) This is NPR
(20:10) Technology Advances
(21:17) TikTok on the Clock
(32:51) Antisocial Media
(34:51) Prosocial Media
(36:01) At the Movies
(37:27) Media Trustworthiness Rankings
(41:40) Government Working
(46:57) Florida Man Bans Lab-Grown Meat
(51:22) Crime and Punishment
(54:00) El Salvador
(58:02) Variously Effective Altruism
(01:02:37) While I Cannot Condone This
(01:06:38) Can Money Buy Happiness?
(01:08:16) Good News, Everyone
(01:09:07) Can’t Sleep Clowns Will Eat Me
(01:10:53) How Great are Great People?
(01:13:20) Gamers Gonna Game Game Game Game Game
(01:21:21) Sports Go Sports
(01:25:20) I Was Promised Flying Self-Driving Cars
(01:26:40) News You Can Use
(01:28:00) The Lighter Side
---
First published:
May 13th, 2024
Source:
https://www.lesswrong.com/posts/9GLj9DqfpsJBRKHRr/monthly-roundup-18-may-2024
Narrated by TYPE III AUDIO.
It was a remarkably quiet announcement. We now have Alpha Fold 3, which does a much improved job predicting all of life's molecules and their interactions. It feels like everyone including me then shrugged and went back to thinking about other things. No cool new toy for most of us to personally play with, no existential risk impact, no big trades to make, ho hum.
But yes, when we look back at this week, I expect what we remember will be Alpha Fold 3.
Unless it turns out that it is Sophon, a Chinese technique to potentially make it harder to fine tune an open model in ways the developer wants to prevent. I do not expect this to get the job done that needs doing, but it is an intriguing proposal.
We also have 95 theses to evaluate in a distinct post, OpenAI sharing the [...]
---
Outline:
(01:14) Language Models Offer Mundane Utility
(05:41) Language Models Don’t Offer Mundane Utility
(10:24) GPT-2 Soon to Tell
(13:49) Fun with Image Generation
(14:03) Deepfaketown and Botpocalypse Soon
(18:05) Automation Illustrated
(23:15) They Took Our Jobs
(25:56) Apple of Technically Not AI
(28:09) Get Involved
(28:46) Introducing
(31:11) In Other AI News
(32:47) Quiet Speculations
(37:35) The Quest for Sane Regulations
(39:51) The Week in Audio
(40:38) Rhetorical Innovation
(45:58) Open Weights Are Unsafe and Nothing Can Fix This
(51:34) The Lighter Side
---
First published:
May 9th, 2024
Source:
https://www.lesswrong.com/posts/FrBxFa3qMDvLypDEZ/ai-63-introducing-alpha-fold-3
Narrated by TYPE III AUDIO.
Or rather Samuel Hammond does. Tyler Cowen finds it interesting but not his view.
I put up a market, and then started looking. Click through to his post for the theses. I will be quoting a few of them in full, but not most of them.
I am not trying to be exact with these probabilities when the question calls for them, nor am I being super careful to make them consistent, so errors and adjustments are inevitable.
Section 1 is Oversight of AGI labs is prudent.
I do tend to say that.
---
Outline:
(00:37) Section 1 is Oversight of AGI labs is prudent.
(03:41) Section 2 is Most proposed ‘AI regulations’ are ill-conceived or premature.
(06:43) Section 3 claims AI progress is accelerating, not plateauing.
(09:25) Section 4 says open source is mostly a red herring.
(13:20) Section 5 claims accelerate versus decelerate is a false dichotomy.
(17:11) Section 6 is The AI wave is inevitable, superintelligence isn’t.
(18:50) Section 7 says technological transitions cause regime changes.
(20:44) Section 8 says institutional regime changes are packaged deals.
(24:16) Section 9 says dismissing AGI risks as ‘sci-fi’ is a failure of imagination.
(27:13) Finally, Section 10 says biology is an information technology.
(28:29) Tallying Up the Points
(28:52) Conclusion
---
First published:
May 9th, 2024
Source:
https://www.lesswrong.com/posts/2BvfGnZMx4Ei82qkk/i-got-95-theses-but-a-glitch-ain-t-one
Narrated by TYPE III AUDIO.
The first speculated on why you’re still single. We failed to settle the issue. A lot of you were indeed still single. So the debate continues.
The second gave more potential reasons, starting with the suspicion that you are not even trying, and also many ways you are likely trying wrong.
The definition of insanity is trying the same thing over again expecting different results. Another definition of insanity is dating in 2024. Can’t quit now.
You’re Single Because Dating Apps Keep Getting Worse
A guide to taking the perfect dating app photo. This area of your life is important, so if you intend to take dating apps seriously then you should take photo optimization seriously, and of course you can then also use the photos for other things.
I love the ‘possibly’ evil here.
Misha Gurevich: possibly evil idea: Dating app that [...]
---
Outline:
(00:37) You’re Single Because Dating Apps Keep Getting Worse
(05:38) You’re Single Because Dating Apps Keep Getting Worse
(07:24) You’re Single Because Everyone is Too Superficial
(09:48) You’re Single Because You Refuse to Shamefully Falsify Your Politics
(16:12) You Are Single Because You Do Not Employ Good Strategy
(18:45) You Are Single Because You Don’t Know How to Flirt
(22:43) You Are Single Because You Don’t Date Your Married Boss
(26:12) You Are Single Because You Are Afraid to Fail
(27:02) You Are Single Because No One Likes You On Dates
(29:39) You’re Single Because You Are Bad at Sex
(30:51) You’re Single Because You’re Not Hot
(31:39) You’re Single Because You Don’t Know What People Care About
(33:10) You’re Single Because You Are Inappropriate
(34:11) You’re Single Because of Your Pet
(35:23) You’re Single Because You Won’t Spend Money
(40:05) You’re Single Because You’re Not Over Your Ex
(41:27) You’re Single Because You Thought You Could Do 25% Better
(47:27) Polyamory
(53:29) You’re Single Because You Don’t Know What You Want
(01:00:43) You’re Single Because You’re Too Busy Writing Comments
(01:07:27) You’re Single and Not Getting Properly Compensated
(01:08:34) You’re Not Single and You’re an Inspiration
(01:09:49) Your Moment of Zen
---
First published:
May 8th, 2024
Source:
https://www.lesswrong.com/posts/PLoz68JbTkDufeYSG/dating-roundup-3-third-time-s-the-charm
Narrated by TYPE III AUDIO.
The week's big news was supposed to be Meta's release of two versions of Llama-3.
Everyone was impressed. These were definitely strong models.
Investors felt differently. After yesterday's earnings showed strong revenues but heavy investment in AI, they took Meta stock down 15%.
DeepMind and Anthropic also shipped, but in their cases it was multiple papers on AI alignment and threat mitigation. They get their own sections.
We also did identify someone who wants to do what people claim the worried want to do, who is indeed reasonably identified as a ‘doomer.’
Because the universe has a sense of humor, that person's name is Tucker Carlson.
Also we have a robot dog with a flamethrower.
Table of Contents
Previous post: On Llama-3 and Dwarkesh Patel's Podcast with Zuckerberg.
---
Outline:
(00:53) Language Models Offer Mundane Utility
(08:52) Language Models Don’t Offer Mundane Utility
(12:09) Llama We Doing This Again
(15:57) Fun with Image Generation
(18:35) Deepfaketown and Botpocalypse Soon
(23:53) They Took Our Jobs
(25:10) Get Involved
(25:36) Introducing
(25:52) In Other AI News
(33:00) Quiet Speculations
(41:37) Rhetorical Innovation
(48:01) Wouldn’t You Prefer a Nice Game of Chess
(53:24) The Battle of the Board
(01:08:38) New Anthropic Papers
(01:24:29) New DeepMind Papers
(01:30:06) Aligning a Smarter Than Human Intelligence is Difficult
(01:32:16) People Are Worried About AI Killing Everyone
(01:34:00) Other People Are Not As Worried About AI Killing Everyone
(01:34:26) The Lighter Side
---
First published:
May 2nd, 2024
Source:
https://www.lesswrong.com/posts/pPwt5ir2zFayLx7tH/ai-61-meta-trouble
Narrated by TYPE III AUDIO.
What is the mysterious impressive new ‘gpt2-chatbot’ from the Arena? Is it GPT-4.5? A refinement of GPT-4? A variation on GPT-2 somehow? A new architecture? Q-star? Someone else's model? Could be anything. It is so weird that this is how someone chose to present that model.
There was also a lot of additional talk this week about California's proposed SB 1047.
I wrote an additional post extensively breaking that bill down, explaining how it would work in practice, addressing misconceptions about it and suggesting fixes for its biggest problems along with other improvements. For those interested, I recommend reading at least the sections ‘What Do I Think The Law Would Actually Do?’ and ‘What are the Biggest Misconceptions?’
As usual, lots of other things happened as well.
Table of Contents
---
Outline:
(01:00) Language Models Offer Mundane Utility
(06:00) Language Models Don’t Offer Mundane Utility
(06:27) GPT-2 Soon to Tell
(11:04) Fun with Image Generation
(12:15) Deepfaketown and Botpocalypse Soon
(13:29) They Took Our Jobs
(14:22) Get Involved
(15:14) Introducing
(15:18) In Other AI News
(18:02) Quiet Speculations
(24:03) The Quest for Sane Regulations
(43:20) The Week in Audio
(48:32) Rhetorical Innovation
(51:31) Open Weights Are Unsafe And Nothing Can Fix This
(57:54) Aligning a Smarter Than Human Intelligence is Difficult
(58:19) The Lighter Side
---
First published:
May 2nd, 2024
Source:
https://www.lesswrong.com/posts/sZpj4Xf9Ly2jyR9tK/ai-62-too-soon-to-tell
Narrated by TYPE III AUDIO.
Previously: On the Proposed California SB 1047.
Text of the bill is here. It focuses on safety requirements for highly capable AI models.
This is written as an FAQ, tackling all questions or points I saw raised.
Safe & Secure AI Innovation Act also has a description page.
Why Are We Here Again?
There have been many highly vocal and forceful objections to SB 1047 this week, in reaction to a (disputed) claim that the bill has been ‘fast tracked.’
The bill continues to have a substantial chance of becoming law according to Manifold, where the market has not moved on recent events.
The purpose of this post is to gather and analyze all of them that came to my attention in any way, including all responses to my request for them on Twitter, and to suggest concrete changes that address some real [...]
---
Outline:
(00:28) Why Are We Here Again?
(02:48) What is the Story So Far?
(05:26) What Do I Think the Law Would Actually Do?
(10:40) What are the Biggest Misconceptions?
(15:05) What are the Real Problems?
(19:42) What Are the Changes That Would Improve the Bill?
(23:10) What is the Definition of Derivative Model? Is it Clear Enough?
(28:00) Should the $500 Million Threshold Be Indexed for Inflation?
(28:23) What Constitutes Hazardous Capability?
(33:23) Does the Alternative Capabilities Rule Make Sense?
(36:02) Is Providing Reasonable Assurance of a Lack of Hazardous Capability Realistic?
(38:39) Is Reasonable Assurance Tantamount to Requiring Proof That Your AI is Safe?
(40:20) Is the Definition of Covered Model Overly Broad?
(43:50) Is the Similar Capabilities Clause Overly Broad or Anticompetitive?
(46:46) Does This Introduce Broad Liability?
(48:41) Should Developers Worry About Going to Jail for Perjury?
(49:53) Does This Create a New Regulatory Agency to Regulate AI?
(50:22) Will a Government Agency Be Required to Review and Approve AI Systems Before Release?
(50:40) Are the Burdens Here Overly Onerous to Small Developers?
(51:42) Is the Shutdown Requirement a Showstopper for Open Weights Models?
(53:39) Do the Requirements Disincentive Openness?
(54:12) Will This Have a Chilling Effect on Research?
(54:36) Does the Ability to Levy Fees Threaten Small Business?
(55:14) Will This Raise Barriers to Entry?
(55:51) Is This a Brazen Attempt to Hurt Startups and Open Source?
(57:42) Will This Cost California Talent or Companies?
(59:01) Could We Use a Cost-Benefit Test?
(01:05:19) Should We Interpret Proposals via Adversarial Legal Formalism?
(01:08:13) What Other Positive Comments Are Worth Sharing?
(01:09:04) What Else Was Suggested That We Might Do Instead of This Bill?
(01:10:57) Would This Interfere With Federal Regulation?
(01:11:34) Conclusion
---
First published:
May 2nd, 2024
Source:
https://www.lesswrong.com/posts/qsGRKwTRQ5jyE5fKB/q-and-a-on-proposed-sb-1047
Narrated by TYPE III AUDIO.
This post brings together various questions about the college application process, as well as practical considerations of where to apply and go. We are seeing some encouraging developments, but mostly the situation remains rather terrible for all concerned.
Application Strategy and Difficulty
Paul Graham: Colleges that weren’t hard to get into when I was in HS are hard to get into now. The population has increased by 43%, but competition for elite colleges seems to have increased more. I think the reason is that there are more smart kids. If so that's fortunate for America.
Are college applications getting more competitive over time?
Yes and no.
---
Outline:
(00:19) Application Strategy and Difficulty
(01:16) Spray and Pray and Optimal Admissions Strategy
(06:37) Are Kids Getting Smarter?
(07:57) What About Considerations Changing?
(09:30) Holistic Admissions Will Eat Your Entire Childhood
(15:43) So You Want to Be Elite
(20:14) The Price of Admission
(20:56) The Art of Discrimination
(26:11) The Return of the Standardized Test
(32:58) Legacy Admissions
(34:56) Modest Proposals for Admission Reform
(41:09) The Gender Gap
(46:01) Missed Opportunities
(49:20) The Price of Attendance
(50:56) In and Out of State
(57:41) The Value of Admission
(59:18) The End of an Era
(01:05:25) The End of the World
(01:08:13) Making an Ordinary Effort
---
First published:
April 24th, 2024
Source:
https://www.lesswrong.com/posts/PTC7bZdZoqbCcAshW/changes-in-college-admissions
Narrated by TYPE III AUDIO.
It was all quiet. Then it wasn’t.
Note the timestamps on both of these.
Dwarkesh Patel did a podcast with Mark Zuckerberg on the 18th. It was timed to coincide with the release of much of Llama-3, very much the approach of telling your story directly. Dwarkesh is now the true tech media. A meteoric rise, and well earned.
This is two related posts in one. First I cover the podcast, then I cover Llama-3 itself.
My notes are edited to incorporate context from later explorations of Llama-3, as I judged that the readability benefits exceeded the purity costs.
Podcast Notes: Llama-3 Capabilities
---
Outline:
(00:51) Podcast Notes: Llama-3 Capabilities
(03:09) The Need for Inference
(07:08) Great Expectations
(11:29) Open Source and Existential and Other Risks
(30:50) Interview Overview
(33:22) A Few Reactions
(47:53) Safety First
(54:15) Core Capability Claims
(56:11) How Good are the 8B and 70B Models in Practice?
(01:02:31) Architecture and Data
(01:05:08) Training Day
(01:09:17) What Happens Next With Meta's Products?
(01:12:24) What Happens Next With AI Thanks To These Two Models?
(01:14:04) The Bigger One: It's Coming
(01:14:59) Who Wins?
(01:17:21) Who Loses?
(01:21:49) How Unsafe Will It Be to Release Llama-3 400B?
(01:24:12) The Efficient Market Hypothesis is False
(01:27:09) What Next?
---
First published:
April 22nd, 2024
Narrated by TYPE III AUDIO.
Many things this week did not go as planned.
Humane AI premiered its AI pin. Reviewers noticed it was, at best, not ready.
Devin turns out to have not been entirely forthright with its demos.
OpenAI fired two employees who had been on its superalignment team, Leopold Aschenbrenner and Pavel Izmailov, for allegedly leaking information, and also more troublingly lost Daniel Kokotajlo, who expects AGI very soon, does not expect it to go well by default, and says he quit ‘due to losing confidence that [OpenAI] would behave responsibly around the time of AGI.’ That's not good.
Nor is the Gab system prompt, although that is not a surprise. And several more.
On the plus side, my 80,000 Hours podcast finally saw the light of day, and Ezra Klein had an excellent (although troubling) podcast with Dario Amodei. And we got the usual mix [...]
---
Outline:
(01:05) Language Models Offer Mundane Utility
(06:13) Language Models Don’t Offer Mundane Utility
(11:21) Oh the Humanity
(21:31) GPT-4 Real This Time
(23:12) Fun with Image Generation
(27:47) Deepfaketown and Botpocalypse Soon
(31:34) Devin in the Details
(35:36) Another Supposed System Prompt
(42:35) They Took Our Jobs
(45:37) Introducing
(47:42) In Other AI News
(52:47) Quiet Speculations
(01:00:29) The Quest for Sane Regulations
(01:00:47) The Problem: AI's Extreme Risks
(01:02:07) Overview
(01:03:16) Covered Frontier AI Models
(01:04:10) Oversight of Frontier Models
(01:06:49) Oversight Entity
(01:18:43) The Week in Audio
(01:27:51) Rhetorical Innovation
(01:32:41) Don’t Be That Guy
(01:33:47) Aligning a Smarter Than Human Intelligence is Difficult
(01:42:24) Please Speak Directly Into the Microphone
(01:44:18) People Are Worried About AI Killing Everyone
(01:48:44) Other People Are Not As Worried About AI Killing Everyone
(01:54:10) The Lighter Side
---
First published:
April 18th, 2024
Source:
https://www.lesswrong.com/posts/FAnxq8wFpfqGjeetC/ai-60-oh-the-humanity
Narrated by TYPE III AUDIO.
For this iteration I will exclude discussions involving college or college admissions.
There has been a lot of that since the last time I did one of these, along with much that I need to be careful with lest I go out of my intended scope. It makes sense to do that as its own treatment another day.
Bullying
Why do those who defend themselves against bullies so often get in more trouble than bullies? This is also true in other contexts but especially true in school. Thread is extensive, these are the highlights translated into my perspective. A lot of it is that a bully has experience and practice, they know how to work the system, they know what will cause a response, and they are picking the time and place to do something. The victim has to respond in the moment, and by responding [...]
---
Outline:
(00:22) Bullying
(02:34) Truancy
(03:56) Against Active Shooter Drills
(05:14) Censorship
(06:32) Woke Kindergarden
(09:01) Tracking
(10:43) The Case Against Education
(13:56) Home Schooling
(14:54) Despair
(17:02) Goals
(19:02) Taking the Developing World to School
(26:05) Primary School
(27:46) Guessing the Teacher's Password
(28:26) Correcting the Teacher's Incentives
(29:58) Mathematics
(31:08) Let Kids Be Kids
(32:27) Mandatory Work We Encourage
(34:03) Mandatory Work We Discourage
(35:32) Air Quality Matters
(36:18) School Choice
(39:01) Full Access to Smartphones Is Not Good For Children
(47:17) Lifetime Learning
---
First published:
April 17th, 2024
Source:
https://www.lesswrong.com/posts/s34ingEzvajpFPaaD/childhood-and-education-roundup-5
Narrated by TYPE III AUDIO.
As always, a lot to get to. This is everything that wasn’t in any of the other categories.
Bad News
You might have to find a way to actually enjoy the work.
Greg Brockman (President of OpenAI): Sustained great work often demands enjoying the process for its own sake rather than only feeling joy in the end result. Time is mostly spent between results, and hard to keep pushing yourself to get to the next level if you’re not having fun while doing so.
Yeah. This matches my experience in all senses. If you don’t find a way to enjoy the work, your work is not going to be great.
This is the time. This is the place.
Guiness Pig: In a discussion at work today:
“If you email someone to ask for something and they send you an email trail showing [...]
---
Outline:
(00:13) Bad News
(04:23) Patriots and Tyrants
(07:31) Asymmetric Justice Incarnate
(08:55) Loneliness
(11:50) Get Involved
(12:32) Government Working
(22:31) Crime and Punishment
(30:27) Squatters Should Not Be Able to Steal Your House
(33:17) El Salvador
(40:14) Our Criminal Justice Problem With Junk Science
(45:43) Variously Effective Altruism
(55:26) Technology Advances
(58:07) You Need More Screen Space
(01:00:02) Apple Vision Pro
(01:02:34) A Matter of Antitrust
(01:12:06) RTFB: Read the Bill
(01:13:42) Antisocial Media
(01:17:13) RIP NPR
(01:20:27) Entertainment Monthly
(01:22:17) Gamers Gonna Game Game Game Game Game
(01:27:57) Luck Be a Landlord
(01:31:04) Sports Go Sports
(01:39:10) Know When To Fold ‘Em
(01:44:26) Wouldn’t You Prefer a Good Game of Chess
(01:51:17) Total Eclipse of the Sun
(01:53:15) Delegation
(01:55:48) Good News, Everyone
(02:07:03) I Was Promised Flying Self-Driving Cars
(02:08:43) While I Cannot Condone This
(02:11:39) There has been little change in rates of being vegetarian (4%) or vegan (1%). Yes, the people I meet are radically more likely to be both these things, but those are weird circles. However, I also notice a radical explosion in the number of vegan restaurants and products on offer. So something is going on.
(02:15:08) What Is Best In Life?
---
First published:
April 15th, 2024
Source:
https://www.lesswrong.com/posts/cbkJWkKWvETwJqoj2/monthly-roundup-17-april-2024
Narrated by TYPE III AUDIO.
Claude uses tools now. Gemini 1.5 is available to everyone and Google promises more integrations. GPT-4-Turbo gets substantial upgrades. Oh, and a new model from Mistral, TimeGPT for time series, and also a promising new song generator. No, none of that adds up to GPT-5, but everyone try to be a little patient, shall we?
Table of Contents
In addition to what is covered here, there was a piece of model legislation introduced by the Center for AI Policy. I took up the RTFB (Read the Bill) challenge, and offer extensive thoughts for those who want to dive deep.
---
Outline:
(00:31) Language Models Offer Mundane Utility
(03:11) Language Models Don’t Offer Mundane Utility
(06:55) Clauding Along
(10:19) Persuasive Research
(17:55) The Gemini System Prompt
(21:03) Fun with Image Generation
(21:24) Deepfaketown and Botpocalypse Soon
(25:02) Copyright Confrontation
(29:59) Collusion
(32:33) Out of the Box Thinking
(37:33) The Art of the Jailbreak
(38:02) They Took Our Jobs
(43:55) Get Involved
(44:23) Introducing
(47:26) In Other AI News
(53:16) GPT-4 Real This Time
(57:19) GPT-5 Alive?
(01:04:02) Quiet Speculations
(01:06:45) Antisocial Media
(01:19:57) The Quest for Sane Regulations
(01:33:33) Rhetorical Innovation
(01:39:35) Challenge Accepted
(01:51:06) Aligning a Smarter Than Human Intelligence is Difficult
(01:51:55) Please Speak Directly Into This Microphone
(01:56:45) People Are Worried About AI Killing Everyone
(01:57:15) The Lighter Side
---
First published:
April 11th, 2024
Source:
https://www.lesswrong.com/posts/hQaxcitYgKjJqMdps/ai-59-model-updates
Narrated by TYPE III AUDIO.
A New Bill Offer Has Arrived
Center for AI Policy proposes a concrete actual model bill for us to look at.
Here was their announcement:
WASHINGTON – April 9, 2024 – To ensure a future where artificial intelligence (AI) is safe for society, the Center for AI Policy (CAIP) today announced its proposal for the “Responsible Advanced Artificial Intelligence Act of 2024.” This sweeping model legislation establishes a comprehensive framework for regulating advanced AI systems, championing public safety, and fostering technological innovation with a strong sense of ethical responsibility.
“This model legislation is creating a safety net for the digital age,” said Jason Green-Lowe, Executive Director of CAIP, “to ensure that exciting advancements in AI are not overwhelmed by the risks they pose.”
The “Responsible Advanced Artificial Intelligence Act of 2024” is model legislation that contains provisions for requiring that AI be developed safely [...]
---
Outline:
(05:00) RTFB: Read the Bill
(05:39) Basics and Key Definitions
(10:00) Oh the Permits You’ll Need
(21:11) Rubrics for Your Consideration
(25:55) Open Model Weights Are Unsafe And Nothing Can Fix This
(30:02) Extremely High Concern Systems
(32:43) The Judges Decide
(35:01) Several Rapid-Fire Final Sections
(39:47) Overall Take: A Forceful, Flawed and Thoughtful Model Bill
(49:20) The Usual Objectors Respond: The Severability Clause
(52:34) The Usual Objectors Respond: Inception
(54:12) The Usual Objectors Respond: Rulemaking Authority
(01:01:53) Conclusion
---
First published:
April 10th, 2024
Source:
https://www.lesswrong.com/posts/SQ9wDmsELBmA4Lega/rtfb-on-the-new-proposed-caip-ai-bill
Narrated by TYPE III AUDIO.
Previously: #1
It feels so long ago that Covid and health were my beat, and what everyone often thought about all day, rather than AI. Yet the beat goes on. With Scott Alexander at long last giving us what I expect to be effectively the semi-final words on the Rootclaim debate, it seemed time to do this again.
Bad News
I know no methodical way to find a good, let alone great, therapist.
Cate Hall: One reason it's so hard to find a good therapist is that all the elite ones market themselves as coaches.
As a commenter points out, therapists who can’t make it also market as coaches or similar, so even if Cate's claim is true then it is tough.
My actual impression is that the elite therapists largely do not market themselves at all. They instead work on referrals and [...]
---
Outline:
(00:24) Bad News
(01:27) Good News, Everyone
(03:35) The Battle of the Bulge
(09:13) Support Anti-Aging Research
(09:37) Variably Effective Altruism
(09:51) Periodic Reminders (You Should Know This Already)
(12:05) FDA Delenda Est
(13:03) Other Enemies of Life
(14:04) Covid Postmortems
(14:55) Everything sounds like a sales pitch
(17:07) Information that would have been helpful was never provided
(17:49) A disconnect between what I experienced on the ground and the narrative I was hearing
(21:52) Covid-19 Origins
(24:29) Assisted Suicide Watch
(27:58) Talking Price
---
First published:
April 9th, 2024
Source:
https://www.lesswrong.com/posts/wfz47Ez2r4rQZuYBY/medical-roundup-2
Narrated by TYPE III AUDIO.
It was clear within the first ten minutes this would be a rich thread to draw from. In my childhood and education roundups, and of course with my own kids, I have been dealing with the issues Haidt talks about in his new book, The Anxious Generation. Ideally I’d also have read the book, but perfect as enemy of the good and all that.
I will start with my analysis of the podcast, in my now-standard format. Then I will include other related content I was going to put into my next childhood roundup.
---
Outline:
(42:46) Ban Phones in Schools
(54:33) Let Kids be Kids
---
First published:
April 5th, 2024
Source:
https://www.lesswrong.com/posts/6hciEN9DGsS8CEuox/on-the-2nd-cwt-with-jonathan-haidt
Narrated by TYPE III AUDIO.
Another round? Of economists projecting absurdly small impacts, of Google publishing highly valuable research, a cycle of rhetoric, more jailbreaks, and so on. Another great podcast from Dwarkesh Patel, this time going more technical. Another proposed project with a name that reveals quite a lot. A few genuinely new things, as well. On the new offerings front, DALLE-3 now allows image editing, so that's pretty cool.
Table of Contents
Don’t miss out on Dwarkesh Patel's podcast with Sholto Douglas and Trenton Bricken, which got the full write-up treatment.
---
Outline:
(00:36) Language Models Offer Mundane Utility
(08:12) Language Models Don’t Offer Mundane Utility
(19:49) Clauding Along
(20:23) Fun with Image Generation
(21:22) Deepfaketown and Botpocalypse Soon
(26:51) They Took Our Jobs
(31:27) The Art of the Jailbreak
(34:53) Many-shot jailbreaking
(42:37) Cybersecurity
(45:02) Get Involved
(45:17) Introducing
(47:03) In Other AI News
(53:17) Stargate AGI
(56:13) Larry Summers Watch
(01:05:21) Quiet Speculations
(01:14:10) AI Doomer Dark Money Astroturf Update
(01:22:08) The Quest for Sane Regulations
(01:27:03) The Week in Audio
(01:27:29) Rhetorical Innovation
(01:35:09) Aligning a Smarter Than Human Intelligence is Difficult
(01:48:55) People Are Worried About AI Killing Everyone
(01:52:20) The Lighter Side
---
First published:
April 4th, 2024
Source:
https://www.lesswrong.com/posts/qQmWvm68GsXJtK4EQ/ai-58-stargate-agi
Narrated by TYPE III AUDIO.
Previous Fertility Roundups: #1, #2.
The pace seems to be about one of these every six months. The actual situation changes slowly, so presumably the pace of interesting new things will slow down over time from here.
Demographics
This time around, a visualization. Where will the next 1,000 babies be born?
Population Trends
Scott Lincicome notes American population now expected to peak in 2080 at 369 million.
South Korea now down to 0.7 births per woman. The story of South Korea is told as a resounding success, of a country that made itself rich and prosperous. But what does it profit us, if we become nominally rich and prosperous, but with conditions so hostile that we cannot or will not bring children into them? If the rule you followed led you here, of what use was the rule? Why should others follow it?
---
Outline:
(00:20) Demographics
(00:32) Population Trends
(04:31) Causes
(10:50) Causes: South Korea
(18:56) Causes: South Korea: Status Competition
(27:06) More Dakka
(33:38) Less But Nonzero Dakka
(35:31) Preferences
(37:11) Surrogacy
(38:47) Technology
(40:58) Insular High Fertility Cultures
(44:44) In Brief
(45:19) Cultural Trends
---
First published:
April 2nd, 2024
Source:
https://www.lesswrong.com/posts/5k5FeFDCqXfLMj5SJ/fertility-roundup-3
Narrated by TYPE III AUDIO.
Dwarkesh Patel continues to be on fire, and the podcast notes format seems like a success, so we are back once again.
This time the topic is how LLMs are trained, work and will work in the future. Timestamps are for YouTube. Where I inject my own opinions or takes, I do my best to make that explicit and clear.
This was highly technical compared to the average podcast I listen to, or that Dwarkesh does. It definitely threatened to go over my head technically at times, and some details did go over my head outright. I still learned a ton, and expect you will too if you pay attention.
This is an attempt to distill what I found valuable, and what questions I found most interesting. I did my best to make it intuitive to follow even if you are not technical, but [...]
---
First published:
April 1st, 2024
Narrated by TYPE III AUDIO.
Welcome, new readers!
This is my weekly AI post, where I cover everything that is happening in the world of AI, from what it can do for you today (‘mundane utility’) to what it can promise to do for us tomorrow, and the potentially existential dangers future AI might pose for humanity, along with covering the discourse on what we should do about all of that.
You can of course Read the Whole Thing, and I encourage that if you have the time and interest, but these posts are long, so they are also designed to let you pick the sections that you find most interesting. Each week, I pick the sections I feel are the most important, and put them in bold in the table of contents.
Not everything here is about AI. I did an economics roundup on Tuesday, and a general monthly roundup [...]
---
Outline:
(01:16) Language Models Offer Mundane Utility
(06:58) Language Models Don’t Offer Mundane Utility
(08:37) Stranger Things
(09:09) Clauding Along
(16:30) Fun with Image Generation
(19:01) Deepfaketown and Botpocalypse Soon
(21:29) They Took Our Jobs
(24:19) Introducing
(26:27) In Other AI News
(28:59) Loud Speculations
(31:10) Quiet Speculations
(36:04) Principles of Microeconomics
(44:54) The Full IDAIS Statement
(45:37) Consensus Statement on Red Lines in Artificial Intelligence
(47:54) Roadmap to Red Line Enforcement
(50:19) Conclusion
(50:49) The Quest for Sane Regulations
(56:38) The Week in Audio
(57:29) Rhetorical Innovation
(01:13:41) How Not to Regulate AI
(01:25:48) The Three Body Problem (Spoiler-Free)
(01:27:11) AI Doomer Dark Money Astroturf Update
(01:36:35) Evaluating a Smarter Than Human Intelligence is Difficult
(01:50:15) Aligning a Smarter Than Human Intelligence is Difficult
(01:53:32) AI is Deeply Unpopular
(01:53:47) People Are Worried About AI Killing Everyone
(01:55:43) Other People Are Not As Worried About AI Killing Everyone
(01:57:31) Wouldn’t You Prefer a Good Game of Chess?
(02:00:03) The Lighter Side
---
First published:
March 28th, 2024
Source:
https://www.lesswrong.com/posts/5Dz3ZrwBzzMfaucrH/ai-57-all-the-ai-news-that-s-fit-to-print
Narrated by TYPE III AUDIO.
I call the section ‘Money Stuff’ but as a column name that is rather taken. There has been lots to write about on this front that didn’t fall neatly into other categories. It clearly benefited a lot from being better organized into subsections, and the monthly roundups could benefit from being shorter, so this will probably become a regular thing.
They Took Our Jobs
Quite the opposite, actually. Jobs situation remains excellent.
Whatever else you think of the economy, layoffs are still at very low levels; the last three years are the lowest levels on record. Do note that the bottom of this chart is 15,000 rather than zero, even without adjusting for population size.
Ford says it is reexamining where to make cars after the UAW strikes. The union responded by saying, essentially, ‘f*** you, pay me’:
“Maybe Ford doesn’t need to move [...]
---
Outline:
(00:24) They Took Our Jobs
(01:34) Company Formations Seem Permanently Higher
(02:06) Whoops
(03:29) Vibecession Via Healthcare Spending
(05:20) Vibecession Look at All the Nice Things and Good Numbers
(08:24) Vibecession via Inappropriate Price Index
(13:15) Vibecession Anyway
(15:04) Vibecession via Interest Rates
(16:37) Prediction Markets
(17:46) The Efficient Market Hypothesis is False
(18:09) Failed Markets in Everything
(23:08) Failed Markets in Transparency
(25:43) Unprofitable Markets in Air Travel
(27:49) Unprofitable Markets in Ground Transportation
(32:10) Detroit, Georgianism
(33:29) California Approaches the Tax Tipping Point
(34:00) Taxing On the Margin
(35:39) Occupational License as Security Bond
(38:14) The Work From Home Debate
(38:50) Patrick McKenzie Explains it All
(41:15) In Brief
---
First published:
March 26th, 2024
Source:
https://www.lesswrong.com/posts/hCNt7dc7QXuKB2gsR/economics-roundup-1
Narrated by TYPE III AUDIO.
Last week Sam Altman spent two hours with Lex Fridman (transcript). Given how important it is to understand where Altman's head is at and learn what he knows, this seemed like another clear case where extensive notes were in order.
Lex Fridman overperformed, asking harder questions than I expected and going deeper than I expected, and succeeded in getting Altman to give a lot of what I believe were genuine answers. The task is ‘get the best interviews you can while still getting interviews’ and this could be close to the production possibilities frontier given Lex's skill set.
There was not one big thing that stands out given what we already have heard from Altman before. It was more the sum of little things, the opportunity to get a sense of Altman and where his head is at, or at least where he is presenting it as [...]
---
First published:
March 25th, 2024
Source:
https://www.lesswrong.com/posts/AaS6YRAGBFrxt6ZMj/on-lex-fridman-s-second-podcast-with-altman
Narrated by TYPE III AUDIO.
Hopefully, anyway. Nvidia has a new chip.
Also Altman has a new interview.
And most of Inflection has new offices inside Microsoft.
Table of Contents
---
Outline:
(00:18) Language Models Offer Mundane Utility
(10:28) Clauding Along
(12:45) Language Models Don’t Offer Mundane Utility
(16:11) Fun with Image Generation
(21:00) Deepfaketown and Botpocalypse Soon
(24:20) They Took Our Jobs
(37:17) Generative AI in Games
(40:20) Get Involved
(41:38) Introducing
(51:03) Grok the Grok
(54:43) New Nvidia Chip
(56:40) Inflection Becomes Microsoft AI
(58:11) In Other AI News
(01:04:18) Wait Till Next Year
(01:11:57) Quiet Speculations
(01:20:03) The Quest for Sane Regulations
(01:25:20) The Week in Audio
(01:26:03) Rhetorical Innovation
(01:31:11) Read the Roon
(01:34:01) Pick Up the Phone
(01:36:17) Aligning a Smarter Than Human Intelligence is Difficult
(01:45:14) Polls Show People Are Worried About AI
(01:52:29) People Are Worried About AI Killing Everyone
(01:54:57) Other People Are Not As Worried About AI Killing Everyone
(02:04:57) The Lighter Side
---
First published:
March 21st, 2024
Source:
https://www.lesswrong.com/posts/iH5Sejb4dJGA2oTaP/ai-56-blackwell-that-ends-well
Narrated by TYPE III AUDIO.
Like the government-commissioned Gladstone Report on AI itself, there are two sections here.
First I cover the Gladstone Report's claims and arguments about the state of play, including what they learned talking to people inside the labs. I mostly agree with their picture and conclusions, both in terms of arguments and reported findings, though I already mostly agreed. If these arguments and this information are new to someone, and the form of a government-backed report helps them process it and take it seriously, this is good work. However, in terms of convincing an already informed skeptic, I believe this is a failure. They did not present their findings in a way that should be found convincing by the otherwise unconvinced.
Second I cover the Gladstone Report's recommended courses of action. It is commendable that the report lays out a concrete, specific and highly detailed proposal. A [...]
---
Outline:
(01:13) Executive Summary of Their Findings: Oh No
(07:13) Gladstone Makes Its Case
(14:42) Why Is Self-Regulation Insufficient?
(17:26) What About Competitiveness?
(21:12) How Dare You Solve Any Problem That Isn’t This One?
(22:44) What Makes You Think We Need To Worry About This?
(23:57) What Arguments are Missing?
(27:45) Tonight at 11: Doom!
(30:51) The Claim That Frontier Labs Lack Countermeasures For Loss of Control
(35:42) The Future Threat
(38:45) That All Sounds Bad, What Should We Do?
(45:11) The Key Proposal: Extreme Compute Limits
(51:15) Implementation Details of Computer Tiers
(56:16) The Quest for Sane International Regulation
(01:04:16) The Other Proposals
(01:11:24) Conclusions, Both Theirs and Mine
(01:15:40) What Can We Do About All This?
---
First published:
March 20th, 2024
Source:
https://www.lesswrong.com/posts/ApZJy3NKfW5CkftQq/on-the-gladstone-report
Narrated by TYPE III AUDIO.
AI developments have picked up the pace. That does not mean that everything else stopped to get out of the way. The world continues.
Do I have the power?
Emmett Shear speaking truth: Wielding power is of course potentially dangerous and it should be done with due care, but there is no virtue in refusing the call.
There is also an art to avoiding power, and some key places to exercise it. Be keenly aware of when having power in a given context would ruin everything.
Natural General Lack of Intelligence in Tech
Eliezer Yudkowsky reverses course, admits aliens are among us and we have proof.
Eliezer Yudkowsky: To understand the user interfaces on microwave ovens, you need to understand that microwave UI designers are aliens. As in, literal nonhuman aliens who infiltrated Earth, who believe that humans desperately want to hear piercingly [...]
---
Outline:
(00:40) Natural General Lack of Intelligence in Tech
(04:47) Bad International News
(05:20) Dopamine Culture
(07:56) Customer Service
(08:53) Environmentalists Sabotaging the Environment
(09:11) Government Working
(17:41) On the Media
(20:13) Crime and Punishment
(27:40) Good News, Everyone
(33:27) The Time That is Given to Us
(39:56) Hotel Six Hundred
(41:10) While I Cannot Condone This
(48:00) Societal Tendencies
(49:56) Such Sufferin
(54:23) The Good and Bad of Academia
(56:31) Think of the Children
(01:01:38) Was Democracy a Mistake?
(01:04:30) The Philosophy of Fantasy
(01:07:26) Sports Go Sports
(01:19:51) Gamers Gonna Game Game Game Game Game
(01:37:50) The Virtue of Silence
(01:38:02) The Lighter Side
---
First published:
March 19th, 2024
Source:
https://www.lesswrong.com/posts/iCvdqrkWg34FNFZYg/monthly-roundup-16-march-2024
Narrated by TYPE III AUDIO.
Introducing Devin
Is the era of AI agents writing complex code systems without humans in the loop upon us?
Cognition is calling Devin ‘the first AI software engineer.’
Here is a two minute demo of Devin benchmarking LLM performance.
Devin has its own web browser, which it uses to pull up documentation.
Devin has its own code editor.
Devin has its own command line.
Devin uses debugging print statements and uses the log to fix bugs.
Devin builds and deploys entire stylized websites without even being directly asked.
What could possibly go wrong? Install this on your computer today.
Padme.
The Real Deal
I would by default assume all demos were supremely cherry-picked. My only disagreement with Austen Allred's statement here is that this rule is not new:
Austen Allred: New rule:
If someone only [...]
---
Outline:
(00:02) Introducing Devin
(00:48) The Real Deal
(03:27) The Metric
(04:48) What Could Possibly Go Subtly Wrong?
(07:29) What Could Possibly Go Massively Wrong for You?
(09:36) If This is What Skepticism Looks Like
(12:33) What Happens If You Fully Automate Software Engineering
(16:24) What Could Possibly Go Massively Wrong for Everyone?
(18:24) Conclusion: Whistling Past
---
First published:
March 18th, 2024
Source:
https://www.lesswrong.com/posts/wovJBkfZ8rTyLoEKv/on-devin
Narrated by TYPE III AUDIO.
Things were busy once again, partly from the Claude release but also from many other sides. So even after cutting out both the AI coding agent Devin and the Gladstone Report, along with previously covering OpenAI's board expansion and investigative report, this is still one of the longest weekly posts.
In addition to Claude and Devin, we got among other things Command-R, Inflection 2.5, OpenAI's humanoid robot partnership reporting back after only 13 days and Google DeepMind with an embodied cross-domain video game agent. You can definitely feel the acceleration.
The backlog expands. Once again, I say to myself, I will have to up my reporting thresholds and make some cuts. Wish me luck.
Table of Contents
---
Outline:
(00:52) Language Models Offer Mundane Utility
(06:17) Claude 3 Offers Mundane Utility
(11:57) Prompt Attention
(17:28) Clauding Along
(27:19) Language Models Don’t Offer Mundane Utility
(31:33) GPT-4 Real This Time
(32:22) Copyright Confrontation
(33:34) Fun with Image Generation
(37:49) They Took Our Jobs
(40:38) Get Involved
(43:53) Introducing
(51:55) Inflection 2.5
(55:04) Paul Christiano Joins NIST
(01:01:38) In Other AI News
(01:07:40) Quiet Speculations
(01:23:06) The Quest for Sane Regulations
(01:31:34) The Week in Audio
(01:31:44) Rhetorical Innovation
(01:41:23) A Failed Attempt at Adversarial Collaboration
(01:48:37) Spy Versus Spy
(01:52:38) Shouting Into the Void
(01:54:55) Open Model Weights are Unsafe and Nothing Can Fix This
(01:56:37) Aligning a Smarter Than Human Intelligence is Difficult
(01:59:49) People Are Worried About AI Killing Everyone
(02:00:39) Other People Are Not As Worried About AI Killing Everyone
(02:07:39) The Lighter Side
---
First published:
March 14th, 2024
Source:
https://www.lesswrong.com/posts/N3tXkA9Jj6oCB2eiJ/ai-55-keep-clauding-along
Narrated by TYPE III AUDIO.
TikTok Might Get Banned Soon
This attempt is getting reasonably far rather quickly, passing the House with broad support.
Alec Stapp: TikTok bill to remove influence of CCP:
– passed unanimously out of committee
– GOP leadership says they’ll bring it to the floor for a vote next week
– Biden says he’ll sign the bill if passed
Can’t believe it's taken this long, but should be done soon.
It's been obvious for years that we shouldn’t let China control a black-box algorithm that influences >100 million American users.
JSM: Can this stand up to court scrutiny though?
Alec Stapp: Yes.
It then passed the House 352-65, despite opposition from Donald Trump.
Manifold is as of now around 72% that a bill will pass, similar to Metaculus. Consensus is that it is unlikely that ByteDance will divest. They [...]
---
Outline:
(00:03) TikTok Might Get Banned Soon
(02:48) Execution is Everything
(10:33) RTFB: Read The Bill
(19:57) How Popular is a TikTok Ban?
(20:15) Reciprocity is the Key to Every Relationship
(23:14) Call Your Congressman
(28:25) TikTok Data Sharing
(34:41) TikTok Promoting Chinese Interests
(45:41) Tyler Cowen Opposes the Bill
(48:23) Trump Opposes the Bill
(53:40) To Be Clear You Can Absolutely Go Too Far
(54:09) Conclusion
---
First published:
March 13th, 2024
Source:
https://www.lesswrong.com/posts/cjrDNwoWwuTfc3Hbu/on-the-latest-tiktok-bill
Narrated by TYPE III AUDIO.
It is largely over.
The investigation into events has concluded, finding no wrongdoing anywhere.
The board has added four new board members, including Sam Altman. There will still be further additions.
Sam Altman now appears firmly back in control of OpenAI.
None of the new board members have been previously mentioned on this blog, or known to me at all.
They are mysteries with respect to AI. As far as I can tell, all three lack technical understanding of AI and have no known prior opinions or engagement on topics of AI, AGI and AI safety of any kind including existential risk.
Microsoft and investors have indeed so far come away without a seat. They also, however, lack known strong bonds to Altman, so this is not obviously a board fully under his control if there were to be another crisis. They now [...]
---
Outline:
(02:34) The New Board
(11:27) The Investigation Probably Was Not Real
(19:09) The New York Times Leak and Gwern's Analysis of It
(30:29) What Do We Now Think Happened?
(36:51) Altman's Statement
(39:26) Helen Toner and Tasha McCauley's Statement
(40:44) The Case Against Altman
(47:48) The Case For Altman and What We Will Learn Next
---
First published:
March 12th, 2024
Source:
https://www.lesswrong.com/posts/e5kLSeLJ8T5ddpe2X/openai-the-board-expands
Narrated by TYPE III AUDIO.
The big news this week was of course the release of Claude 3.0 Opus, likely in some ways the best available model right now. Anthropic now has a highly impressive model, impressive enough that it seems as if it breaks at least the spirit of their past commitments on how far they will push the frontier. We will learn more about its ultimate full capabilities over time.
We also got quite the conversation about big questions of one's role in events, which I immortalized as Read the Roon. Since publication Roon has responded, which I have edited into the post along with some additional notes.
That still leaves plenty of fun for the full roundup. We have spies. We have accusations of covert racism. We have Elon Musk suing OpenAI. We have a new summary of simulator theory. We have NIST, tasked with AI regulation, literally struggling [...]
---
Outline:
(01:03) Language Models Offer Mundane Utility
(09:25) Language Models Don’t Offer Mundane Utility
(11:52) LLMs: How Do They Work?
(15:50) Copyright Confrontation
(17:34) Oh Elon
(18:47) We realized building AGI will require far more resources than we’d initially imagined
(19:59) We and Elon recognized a for-profit entity would be necessary to acquire those resources
(21:49) We advance our mission by building widely-available beneficial tools
(24:46) DNA Is All You Need
(27:21) GPT-4 Real This Time
(30:11) Fun with Image Generation
(33:16) Deepfaketown and Botpocalypse Soon
(34:16) They Took Our Jobs
(35:20) Get Involved
(36:41) Introducing
(37:19) In Other AI News
(44:48) More on Self-Awareness
(47:05) Racism Remains a Problem for LLMs
(50:33) Project Maven
(53:35) Quiet Speculations
(01:00:01) The Quest for Sane Regulations
(01:07:19) The Week in Audio
(01:09:21) Rhetorical Innovation
(01:18:46) Another Open Letter
(01:22:33) Aligning a Smarter Than Human Intelligence is Difficult
(01:26:34) Security is Also Difficult, Although Perhaps Not This Difficult
(01:34:04) The Lighter Side
---
First published:
March 7th, 2024
Source:
https://www.lesswrong.com/posts/Nvi94KJSDGZMjknZS/ai-54-clauding-along
Narrated by TYPE III AUDIO.
Claude 3.0
Claude 3.0 is here. It is too early to know for certain how capable it is, but Claude 3.0's largest version is in a similar class to GPT-4 and Gemini Advanced. It could plausibly now be the best model for many practical uses, with praise especially coming in on coding and creative writing.
Anthropic has decided to name its three different size models Opus, Sonnet and Haiku, with Opus only available if you pay. Can we just use Large, Medium and Small?
Cost varies quite a lot by size, note this is a log scale on the x-axis, whereas the y-axis isn’t labeled.
This post goes over the benchmarks, statistics and system card, along with everything else people have been reacting to. That includes a discussion about signs of self-awareness (yes, we are doing this again) and also raising the question of whether [...]
---
Outline:
(00:03) Claude 3.0
(01:10) Benchmarks and Stats
(06:01) The System Card
(13:06) The System Prompt
(17:42) Reactions on How Good Claude 3 is in Practice
(26:19) It Can’t Help But Notice
(31:47) Acts of Potential Self Awareness Awareness
(39:33) We Can’t Help But Notice
(55:34) What Happens Next?
---
First published:
March 6th, 2024
Source:
https://www.lesswrong.com/posts/DwexbFdPJ5p9Er8wA/on-claude-3-0
Narrated by TYPE III AUDIO.
Roon, member of OpenAI's technical staff, is one of the few candidates for a Worthy Opponent when discussing questions of AI capabilities development, AI existential risk and what we should do about it. Roon is alive. Roon is thinking. Roon clearly values good things over bad things. Roon is engaging with the actual questions, rather than denying or hiding from them, and unafraid to call all sorts of idiots idiots. As his profile once said, he believes the spice must flow, that we should just go ahead, and he makes a mixture of arguments for that, some good, some bad and many absurd. Also, his account is fun as hell.
Thus, when he comes out as strongly as he seemed to do recently, attention is paid, and we got to have a relatively good discussion of key questions. While I attempt to contribute here, this post is largely aimed at preserving [...]
---
Outline:
(04:13) The Doubling Down
(06:03) Connor Leahy Gives it a Shot
(11:02) Roon Responds to Connor
(14:27) Connor Goes Deep
(30:57) A Question of Agency
---
First published:
March 5th, 2024
Source:
https://www.lesswrong.com/posts/jPZXx3iMaiJjdnMbv/read-the-roon
Narrated by TYPE III AUDIO.
Legalize housing. It is both a good slogan and a good idea.
The struggle is real, ongoing and ever-present. Do not sleep on it. The Housing Theory of Everything applies broadly, even to the issue of AI. If we built enough housing that life vastly improved and people could envision a positive future, they would be far more inclined to think well about AI.
In Brief
What will AI do to housing? If we consider what the author here calls a ‘reasonably optimistic’ scenario and what I’d call a ‘maximally disappointingly useless’ scenario, all AI does is replace some amount of some forms of labor. Given current AI capabilities, it won’t replace construction, so some other sectors get cheaper, making housing relatively more expensive. Housing costs rise, the crisis gets more acute.
Chris Arnade says we live in a high-regulation low-trust society in America [...]
---
Outline:
(00:28) In Brief
(04:01) Legalize Housing
(13:46) Regulatory Barriers
(16:08) Future Construction Expectations
(18:55) Rents
(20:20) Different Designs
(23:51) Landmarks
(24:40) History
(26:15) Public Opinion
(27:16) NIMBY Sightings
(30:34) Houses as Savings
(32:37) Union Dues
(37:20) Landlords
(37:41) Construction
(38:08) Rent
(38:27) Who are You?
(40:06) Good Money After Bad
(40:21) Commercial Real Estate
(42:12) San Francisco
(48:25) New York City
(51:45) Austin
(54:31) Kentucky House Bill 102
(59:25) Tokyo
(01:00:08) Vancouver
(01:01:26) Minneapolis
(01:02:50) Texas
(01:03:41) Florida
(01:06:37) Cities Build Housing, Rents Decline
(01:09:57) Los Angeles
(01:12:26) Argentina
(01:13:05) Other Places Do Things
(01:13:12) Rent Control
(01:17:27) Traffic and Transit
(01:25:47) The Lighter Side
---
First published:
March 4th, 2024
Source:
https://www.lesswrong.com/posts/m8ahbiumz8C9mnGnp/housing-roundup-7
Narrated by TYPE III AUDIO.
Demis Hassabis was interviewed twice this past week.
First, he was interviewed on Hard Fork. Then he had a much more interesting interview with Dwarkesh Patel.
This post covers my notes from both interviews, mostly the one with Dwarkesh.
Hard Fork
Hard Fork was less fruitful, because they mostly asked what for me are the wrong questions and mostly got answers I presume Demis has given many times. So I only noted two things, neither of which is ultimately surprising.
---
Outline:
(00:25) Hard Fork
(02:21) Dwarkesh Patel
---
First published:
March 1st, 2024
Narrated by TYPE III AUDIO.
The main event continues to be the fallout from The Gemini Incident. Everyone is focusing there now, and few are liking what they see.
That does not mean other things stop. There were two interviews with Demis Hassabis, with Dwarkesh Patel's being predictably excellent. We got introduced to another set of potentially highly useful AI products. Mistral partnered up with Microsoft the moment Mistral got France to pressure the EU to agree to cripple the regulations that Microsoft wanted crippled. You know. The usual stuff.
Table of Contents
---
Outline:
(00:39) Language Models Offer Mundane Utility
(05:00) Language Models Don’t Offer Mundane Utility
(06:15) OpenAI Has a Sales Pitch
(10:16) The Gemini Incident
(19:19) Political Preference Tests for LLMs
(22:13) GPT-4 Real This Time
(23:13) Fun with Image Generation
(23:57) Deepfaketown and Botpocalypse Soon
(33:00) They Took Our Jobs
(36:34) Get Involved
(36:46) Introducing
(40:08) In Other AI News
(41:46) Quiet Speculations
(45:13) Mistral Shows Its True Colors
(51:13) The Week in Audio
(51:47) Rhetorical Innovation
(54:43) Open Model Weights Are Unsafe and Nothing Can Fix This
(01:02:54) Aligning a Smarter Than Human Intelligence is Difficult
(01:05:03) Other People Are Not As Worried About AI Killing Everyone
(01:06:36) The Lighter Side
---
First published:
February 29th, 2024
Source:
https://www.lesswrong.com/posts/FcaqbuYbPdesdkWiH/ai-53-one-more-leap
Narrated by TYPE III AUDIO.
Previously: The Gemini Incident (originally titled Gemini Has a Problem)
The fallout from The Gemini Incident continues.
Also the incident continues. The image model is gone. People then focused on the text model. The text model had its own related problems, some now patched and some not.
People are not happy. Those people smell blood. It is a moment of clarity.
Microsoft even got in on the act, as we rediscover how to summon Sydney.
There is a lot more to discuss.
The Ultimate New York Times Reaction
First off, I want to give a shout out to The New York Times here, because wow, chef's kiss. So New York Times. Much pitchbot.
Dominic Cummings: true art from NYT, AI can’t do this yet
This should be in the dictionary as the new definition of Chutzpah.
Do [...]
---
Outline:
(00:39) The Ultimate New York Times Reaction
(02:30) The Ultimate Grimes Reaction
(04:32) Three Positive Reactions
(06:24) The AI Ethicist Reacts
(11:33) Google Reacts on Images
(12:17) What happened
(13:51) Next steps and lessons learned
(16:18) The Market Reacts a Little
(18:14) Guess Who's Back
(21:47) Everyone Has Some Issues
(22:19) Clarifying Refusals
(24:19) Refusals Aplenty
(31:44) Unequal Treatment
(32:51) Gotcha Questions
(34:01) No Definitive Answer
(41:27) Wrong on the Internet
(44:00) Everyone Has a Plan Until They’re Punched in the Face
(44:49) What Should We Learn from The Gemini Incident outside of AI?
(51:29) What Should We Learn about AI from The Gemini Incident?
(58:08) This Is Not a Coincidence Because Nothing is Ever a Coincidence
(01:02:09) AI Ethics is (Often) Not About Ethics or Safety
(01:07:00) Make an Ordinary Effort
(01:15:02) Fix It, Felix
(01:18:28) The Deception Problem Gets Worse
(01:22:41) Where Do We Go From Here?
---
First published:
February 27th, 2024
Source:
https://www.lesswrong.com/posts/oJp2BExZAKxTThuuF/the-gemini-incident-continues
Narrated by TYPE III AUDIO.
We were treated to technical marvels this week.
At Google, they announced Gemini Pro 1.5, with a million token context window within which it has excellent recall, using mixture of experts to get Gemini Advanced level performance (e.g. GPT-4 level) out of Gemini Pro levels of compute. This is a big deal, and I think people are sleeping on it. Also they released new small open weights models that look to be state of the art.
At OpenAI, they announced Sora, a new text-to-video model that is a large leap from the previous state of the art. I continue to be a skeptic on the mundane utility of video models relative to other AI use cases, and think they still have a long way to go, but this was both technically impressive and super cool.
Also, in both places, mistakes were made.
At OpenAI, ChatGPT [...]
---
Outline:
(01:48) Language Models Offer Mundane Utility
(05:08) Language Models Don’t Offer Mundane Utility
(10:17) Call Me Gemma Now
(11:35) Google Offerings Keep Coming and Changing Names
(13:08) GPT-4 Goes Crazy
(20:00) GPT-4 Real This Time
(22:11) Fun with Image Generation
(22:27) Deepfaketown and Botpocalypse Soon
(28:44) Selling Your Chatbot Data
(30:11) Selling Your Training Data
(32:18) They Took Our Jobs
(32:41) Get Involved
(32:55) Introducing
(35:13) In Other AI News
(36:16) Quiet Speculations
(40:27) The Quest for Sane Regulations
(43:43) The Week in Audio
(43:54) The Original Butlerian Jihad
(45:22) Rhetorical Innovation
(46:05) Public Service Announcement
(49:16) People Are Worried About AI Killing Everyone
(50:14) Other People Are Not As Worried About AI Killing Everyone
(52:57) The Lighter Side
---
First published:
February 22nd, 2024
Source:
https://www.lesswrong.com/posts/WmxS7dbHuxzxFei64/ai-52-oops
Narrated by TYPE III AUDIO.
Google's Gemini 1.5 is impressive and I am excited by its huge context window. I continue to use Gemini Advanced as my default AI for everyday use when the large context window is not relevant.
However, while it does not much interfere with what I want to use Gemini for, there is a big problem with Gemini Advanced that has come to everyone's attention.
Gemini comes with an image generator. Until today it would, upon request, create pictures of humans.
On Tuesday evening, some people noticed, or decided to more loudly mention, that the humans it created might be rather different from the humans you requested…
Joscha Bach: 17th Century was wild.
[prompt was] ‘please draw a portrait of a famous physicist of the 17th century.’
Kirby: i got similar results. when I went further and had it tell me who the [...]
---
Outline:
(03:06) The Internet Reacts
(07:10) How Did This Happen?
(13:52) Google's Response
(17:39) Five Good Reasons This Matters
(18:00) Reason 1: Prohibition Doesn’t Work and Enables Bad Actors
(19:06) Reason 2: A Frontier Model Was Released While Obviously Misaligned
(21:53) Reason 3: Potentially Inevitable Conflation of Different Risks From AI
(23:55) Reason 4: Bias and False Refusals Are Not Limited to Image Generation
(26:54) Reason 5: This is Effectively Kind of a Deceptive Sleeper Agent
---
First published:
February 22nd, 2024
Source:
https://www.lesswrong.com/posts/kLTyeG7R8eYpFwe3H/gemini-has-a-problem
Narrated by TYPE III AUDIO.
Hours after Google announced Gemini 1.5, OpenAI announced their new video generation model Sora. Its outputs look damn impressive.
How Sora Works
How does it work? There is a technical report. Mostly it seems like OpenAI did standard OpenAI things, meaning they fed in tons of data, used lots of compute, and pressed the scaling button super hard. The innovations they are willing to talk about seem to be things like ‘do not crop the videos into a standard size.’
That does not mean there are no other important innovations. I presume that there are. They simply are not talking about the other improvements.
We should not underestimate the value of throwing in massively more compute and getting a lot of the fiddly details right. That has been the formula for some time now.
Some people think that OpenAI was using a game engine [...]
---
Outline:
(00:12) How Sora Works
(02:07) Sora Is Technically Impressive
(06:42) Sora What's it Good For?
(09:43) Until we can say exactly what we want, and get it, mostly I expect no dice. When you go looking for something specific, your chances of finding it are very bad.
(15:19) Sora Comes Next?
---
First published:
February 22nd, 2024
Source:
https://www.lesswrong.com/posts/35fZ6csrbcrKw9BwG/sora-what
Narrated by TYPE III AUDIO.
Previously: I hit send on The Third Gemini, and within half an hour DeepMind announced Gemini 1.5.
So this covers Gemini 1.5. One million tokens, and we are promised overall Gemini Advanced or GPT-4 levels of performance on Gemini Pro levels of compute.
This post does not cover the issues with Gemini's image generation, and what it is and is not willing to generate. I am on top of that situation and will get to it soon.
One Million Tokens
Our teams continue pushing the frontiers of our latest models with safety at the core. They are making rapid progress. In fact, we’re ready to introduce the next generation: Gemini 1.5. It shows dramatic improvements across a number of dimensions and 1.5 Pro achieves comparable quality to 1.0 Ultra, while using less compute.
It is truly bizarre to launch Gemini Advanced as a paid [...]
---
Outline:
(00:32) One Million Tokens
(04:29) Mixture of Experts
(07:39) Quality Not Quantity
(11:52) What To Do With It?
---
First published:
February 22nd, 2024
Source:
https://www.lesswrong.com/posts/N2Y664LX6pQ8rFiz2/the-one-and-a-half-gemini
Narrated by TYPE III AUDIO.
While I sort through whatever is happening with GPT-4, today's scheduled post is two recent short stories about restaurant selection.
Ye Olde Restaurante
Tyler Cowen says that restaurants saying ‘since year 19xx’ are on net a bad sign, because they are frozen in time, focusing on being reliable.
For the best meals, he says look elsewhere, to places that shine brightly and then move on.
I was highly suspicious. So I ran a test.
I checked the oldest places in Manhattan. The list had 15 restaurants. A bunch are taverns, which are not relevant to my interests. The rest include the legendary Katz's Delicatessen, which is still on the short list of very best available experiences (yes, of course you order the Pastrami), and the famous Keen's Steakhouse. I don’t care for mutton, but their regular steaks are quite good. There's also Peter Luger's [...]
---
Outline:
(00:12) Ye Olde Restaurante
(04:41) Ye Newfangled Restaurante
---
First published:
February 21st, 2024
Source:
https://www.lesswrong.com/posts/Hm38rqATujCDbLFrh/a-tale-of-two-restaurant-types
Narrated by TYPE III AUDIO.
[Editor's note: I forgot to post this to WordPress on Thursday. I’m posting it here now. Sorry about that.]
Sam Altman is not playing around.
He wants to build new chip factories in the decidedly unsafe and unfriendly UAE. He wants to build up the world's supply of energy so we can run those chips.
What does he say these projects will cost?
Oh, up to seven trillion dollars. Not a typo.
Even scaling back the misunderstandings, this is what ambition looks like.
It is not what safety looks like. It is not what OpenAI's non-profit mission looks like. It is not what it looks like to have concerns about a hardware overhang, and use that as a reason why one must build AGI soon before someone else does. The entire justification for OpenAI's strategy is invalidated by this move.
I have [...]
---
Outline:
(01:02) Language Models Offer Mundane Utility
(01:57) Language Models Don’t Offer Mundane Utility
(03:22) GPT-4 Real This Time
(05:49) Deepfaketown and Botpocalypse Soon
(12:38) They Took Our Jobs
(15:33) Get Involved
(15:50) Introducing
(17:13) Altman's Ambition
(22:34) Yoto
(26:32) In Other AI News
(35:04) Quiet Speculations
(41:55) The Quest for Sane Regulations
(43:32) Washington DC Still Does Not Get It
(46:16) Many People are Saying
(48:48) China Watch
(49:38) Roon Watch
(50:48) How to Get Ahead in Advertising
(52:35) The Week in Audio
(56:45) Rhetorical Innovation
(01:01:18) Please Speak Directly Into This Microphone
(01:03:13) Aligning a Smarter Than Human Intelligence is Difficult
(01:03:39) Other People Are Not As Worried About AI Killing Everyone
(01:06:02) The Lighter Side
---
First published:
February 20th, 2024
Source:
https://www.lesswrong.com/posts/gBHNw5Ymnqw8FiMjh/ai-51-altman-s-ambition
Narrated by TYPE III AUDIO.
[Editor's Note: I forgot to cross-post this on Thursday, sorry about that. Note that this post does not cover Gemini 1.5, which was announced after I posted this. I will cover 1.5 later this week.]
We have now had a little over a week with Gemini Advanced, based on Gemini Ultra. A few reviews are in. Not that many, though, compared to what I would have expected, or what I feel the situation calls for. This is yet another case of there being an obvious thing lots of people should do, and almost no one doing it. Should we use Gemini Advanced versus ChatGPT? Which tasks are better for one versus the other?
I have compiled what takes I did see. Overall people are clearly less high on Gemini Advanced than I am, seeing it as still slightly to modestly behind ChatGPT overall. Despite that, I have [...]
---
Outline:
(01:10) Impressions of Others
(14:28) Pros and Cons
---
First published:
February 20th, 2024
Source:
https://www.lesswrong.com/posts/vdSE99ssADJEXP567/the-third-gemini
Narrated by TYPE III AUDIO.
Another month. More things. Much roundup.
Bad News
Jesse Smith writes in Asterisk that our HVAC workforce is both deeply incompetent and deeply corrupt. This certainly matches my own experience. Calculations are almost always flubbed when they are done at all, outright fraudulent paperwork is standard, no one has the necessary skills.
It certainly seems like the Biden Administration is doing its best to hurt Elon Musk? Claim here is that they cancelled a Starlink contract without justification, in order to award the contract to someone else for more than three times the price. This was on Twitter, but none of the replies seemed to offer a plausible justification.
Claim that Twitter traffic is increasingly fake, and secondary claim that this is because Musk fired those responsible for preventing it. Even if it is true that Twitter traffic is 75% fake, that does not mean [...]
---
Outline:
(00:11) Bad News
(06:48) More on the Apple Vision Pro
(08:19) For Science
(10:45) Climate
(11:11) Variously Effective Altruism
(15:10) The Plan
(17:07) Agency, Cate Hall and Finkel's Law
(21:46) Loop de Loop
(24:46) Government Working
(31:16) California Scrolling
(33:49) Crime and Punishment
(36:45) Good News, Everyone
(47:03) A Difference of Opinion
(51:19) The Lighter Side
---
First published:
February 20th, 2024
Source:
https://www.lesswrong.com/posts/zHTRivdJ7ZDSctwqi/monthly-roundup-15-february-2024
Narrated by TYPE III AUDIO.
Previously: On the Apple Vision Pro
The reviews are coming in. What say the people, other than complaining about the two-to-three-hour battery life?
Then later I’ll get to my own thoughts after the demo.
Reviews and Reactions
Ben Thompson reviews the Apple Vision Pro. He continues to find it a technical marvel, but is ultimately disappointed for uses other than entertainment. There is no support for multiple users beyond a highly unwieldy guest mode. There is insufficient width of coverage and inability to support multiple large screens, which is severely limiting to productivity. The eye tracking is a huge improvement over earlier attempts but not ready for such applications.
Ben anticipates that Apple will fail over time to evolve the product to support the things that would enable it to be a killer productivity app, which is what he was [...]
---
Outline:
(00:18) Reviews and Reactions
(06:45) My Own Thoughts After a Demo
(12:37) To Buy or Not to Buy?
---
First published:
February 13th, 2024
Source:
https://www.lesswrong.com/posts/HbneNh2NJpQCgfxQA/more-on-the-apple-vision-pro
Narrated by TYPE III AUDIO.
California Senator Scott Wiener of San Francisco introduces SB 1047 to regulate AI. I have put up a market on how likely it is to become law.
“If Congress at some point is able to pass a strong pro-innovation, pro-safety AI law, I’ll be the first to cheer that, but I’m not holding my breath,” Wiener said in an interview. “We need to get ahead of this so we maintain public trust in AI.”
Congress is certainly highly dysfunctional. I am still generally against California trying to act like it is the federal government, even when the cause is good, but I understand.
Can California effectively impose its will here?
On the biggest players, for now, presumably yes.
In the longer run, when things get actively dangerous, then my presumption is no.
There is a potential trap here. If we put our rules [...]
---
Outline:
(02:48) Close Reading of the Bill
(10:50) My High Level Takeaways From the Close Reading
(11:37) Another More Skeptical Reaction to the Same Bill
(13:01) What is a Covered Model Here?
(15:39) Precautionary Principle and Covered Guidance
(18:07) Non-Derivative
(18:28) So What Would This Law Actually Do?
(20:44) Crying Wolf
---
First published:
February 12th, 2024
Source:
https://www.lesswrong.com/posts/oavGczwcHWZYhmifW/on-the-proposed-california-sb-1047
Narrated by TYPE III AUDIO.
We have long been waiting for a version of this story, where someone hacks together the technology to use Generative AI to work the full stack of the dating apps on their behalf, ultimately finding their One True Love.
Or at least, we would, if it turned out he is Not Making This Up.
Fun question: Given he is also this guy, does that make him more or less credible?
Alas, something being Too Good to Check does not actually mean one gets to not check it, in my case via a Manifold Market. The market started trading around 50%, but has settled down at 15% after several people made strong detailed arguments that the full story did not add up, at minimum he was doing some recreations afterwards.
Which is a shame. But why let that stop us? Either way it is a good [...]
---
First published:
February 9th, 2024
Source:
https://www.lesswrong.com/posts/QxAFoEdtsmK783jzM/one-true-love
Narrated by TYPE III AUDIO.
In a week with two podcasts I covered extensively, I was happy that there was little other news.
That is, until right before press time, when Google rebranded Bard to Gemini, released an app for that, and offered a premium subscription ($20/month) for Gemini Ultra.
Gemini Ultra is Here
I have had the honor and opportunity to check out Gemini Advanced before its release.
The base model seems to be better than GPT-4. It seems excellent for code, for explanations and answering questions about facts or how things work, for generic displays of intelligence, for telling you how to do something. Hitting the Google icon to have it look for sources is great.
In general, if you want to be a power user, if you want to push the envelope in various ways, Gemini is not going to make it easy on you. However [...]
---
Outline:
(00:21) Gemini Ultra is Here
(03:16) Language Models Offer Mundane Utility
(05:38) Language Models Don’t Offer Mundane Utility
(07:12) GPT-4 Real This Time
(09:03) Fun with Image Generation
(09:20) Deepfaketown and Botpocalypse Soon
(14:09) They Took Our Jobs
(15:16) Get Involved
(16:00) Introducing
(16:12) In Other AI News
(20:42) Quiet Speculations
(24:44) Vitalik on the Intersection of AI and Crypto
(30:10) The Quest for Sane Regulations
(30:23) The Week in Audio
(34:25) Rhetorical Innovation
(35:23) Aligning a Dumber Than Human Intelligence is Still Difficult
(35:54) People Are Worried About AI, Many People
(38:04) Other People Are Not As Worried About AI Killing Everyone
(39:54) The Lighter Side
---
First published:
February 8th, 2024
Source:
https://www.lesswrong.com/posts/Si4fRH2hGGa6HsQbu/ai-50-the-most-dangerous-thing
Narrated by TYPE III AUDIO.
Previously: Based Beff Jezos and the Accelerationists
Based Beff Jezos, the founder of effective accelerationism, delivered on his previous pledge, and did indeed debate what is to be done to navigate into the future with a highly Worthy Opponent in Connor Leahy.
The moderator almost entirely stayed out of it, and intervened well when he did, so this was a highly fair arena. It's Jezos versus Leahy. Let's get ready to rumble!
I wanted to be sure I got the arguments right and fully stated my responses and refutations, so I took extensive notes including timestamps. On theme for this debate, this is a situation where you either do that while listening, or once you have already listened you are in practice never going to go back.
That does not mean you have to read all those notes and arguments. It is certainly an option [...]
---
Outline:
(01:32) Actually Based Beff Jezos (ABBJ)
(05:13) Bold Based Beff Jezos (BBBJ)
(13:00) Caustic Based Beff Jezos (CBBJ)
(17:50) What about Connor Leahy?
(19:56) Around the Debate in 80 Notes
(01:39:14) Afterwards
---
First published:
February 6th, 2024
Source:
https://www.lesswrong.com/posts/xjoKqevCRnhXzHRLT/on-the-debate-between-jezos-and-leahy
Narrated by TYPE III AUDIO.
This post is extensive thoughts on Tyler Cowen's excellent talk with Dwarkesh Patel.
It is interesting throughout. You can read this while listening, after listening or instead of listening, and is written to be compatible with all three options. The notes are in order in terms of what they are reacting to, and are mostly written as I listened.
I see this as having been a few distinct intertwined conversations. Tyler Cowen knows more about more different things than perhaps anyone else, so that makes sense. Dwarkesh chose excellent questions throughout, displaying an excellent sense of when to follow up and how, and when to pivot.
The first conversation is about Tyler's book GOAT about the world's greatest economists. Fascinating stuff, this made me more likely to read and review GOAT in the future if I ever find the time. I mostly agreed with Tyler's takes [...]
---
Outline:
(04:01) The Notes Themselves
(17:54) The AI and Future Scenario Section Begins
(21:04) Clearing Up Two Misconceptions
(27:11) Final Notes Section
(33:47) Concluding AI Thoughts
---
First published:
February 2nd, 2024
Source:
https://www.lesswrong.com/posts/FZkAG8Hezub7pWRM9/on-dwarkesh-s-3rd-podcast-with-tyler-cowen
Narrated by TYPE III AUDIO.
Two studies came out on the question of whether existing LLMs can help people figure out how to make bioweapons. RAND published a negative finding, showing no improvement. OpenAI found a small improvement, bigger for experts than students, from GPT-4. That's still harmless now, the question is what will happen in the future as capabilities advance.
Another news item was that Bard with Gemini Pro impressed even without Gemini Ultra, taking the second spot on the Arena leaderboard behind only GPT-4-Turbo. For now, though, GPT-4 remains in the lead.
A third cool item was this story from a Russian claiming to have used AI extensively in his quest to find his one true love. I plan to cover that on its own and have Manifold on the job of figuring out how much of the story actually happened.
Table of Contents
---
Outline:
(00:56) Language Models Offer Mundane Utility
(08:35) Language Models Don’t Offer Mundane Utility
(09:38) GPT-4 Real This Time
(10:23) Be Prepared
(16:13) Fun with Image Generation
(17:17) Deepfaketown and Botpocalypse Soon
(23:32) They Took Our Jobs
(27:54) Get Involved
(28:23) In Other AI News
(33:35) Quiet Speculations
(39:06) The Quest for Sane Regulations
(45:58) The Week in Audio
(46:49) Rhetorical Innovation
(56:40) Predictions are Hard Especially About the Future
(01:03:13) Aligning a Smarter Than Human Intelligence is Difficult
(01:07:03) Open Model Weights Are Unsafe and Nothing Can Fix This
(01:09:38) Other People Are Not As Worried About AI Killing Everyone
(01:11:37) The Lighter Side
---
First published:
February 1st, 2024
Source:
https://www.lesswrong.com/posts/RsWvhDNQRExjatzGA/ai-49-bioweapon-testing-begins
Narrated by TYPE III AUDIO.
Before we begin, I will note that I have indeed written various thoughts about the three college presidents that appeared before Congress and the resulting controversies, including the disputes regarding plagiarism. However I have excluded them from this post.
Discipline and Prohibitions
Washington Post Editorial Board says schools should ban smartphones, and parents should help make this happen rather than more often opposing such bans in order to make logistical coordination easier.
I agree with the editorial board. Even when not in use, having a phone in one's pocket is a continuous distraction. The ability to use the phone creates immense social and other pressures to use it, or think about using it, continuously. If we are going to keep doing this physically required school thing at all, students need to be fully device-free during the school day except for where we intentionally want them to [...]
---
Outline:
(00:19) Discipline and Prohibitions
(05:04) School Choice
(08:34) An Argument Against School Choice
(13:10) Home School
(14:37) School Null Hypothesis Watch
(19:05) The Case Against Education, Pandemic Edition
(20:51) Early Childhood
(21:43) Primary School
(26:04) High School
(27:54) (Lack of) Standards
(31:41) College Grade Inflation
(33:41) College
(41:31) Student Debt
---
First published:
January 30th, 2024
Source:
https://www.lesswrong.com/posts/jJnDmdmLDukoTqFqB/childhood-and-education-roundup-4
Narrated by TYPE III AUDIO.
While I was in San Francisco, the big head honchos headed for Davos, where AI was the talk of the town. As well it should be, given what will be coming soon. It did not seem like anyone involved much noticed or cared about the existential concerns. That is consistent with the spirit of Davos, which has been not noticing or caring about things that don’t directly impact your business or vibe since (checks notes by which I mean an LLM) 1971. It is what it is.
Otherwise we got a relatively quiet week. For once the scheduling worked out and I avoided the Matt Levine curse. I’m happy for the lull to continue so I can pay down more debt and focus on long term projects and oh yeah also keep us all farther away from potential imminent death.
Table of Contents
---
Outline:
(00:51) Language Models Offer Mundane Utility
(03:33) Language Models Don’t Offer Mundane Utility
(03:58) Copyright Confrontation
(05:13) Fun with Image Generation
(05:31) Deepfaketown and Botpocalypse Soon
(08:20) They Took Our Jobs
(09:10) Get Involved
(09:47) In Other AI News
(12:43) Quiet Speculations
(17:26) Intelligence Squared
(27:22) The Quest for Sane Regulations
(30:33) (3) Artificial intelligence
(32:48) Open Model Weights Are Unsafe And Nothing Can Fix This
(45:16) The Week in Audio
(48:54) Rhetorical Innovation
(53:18) Malaria Accelerationism
(54:54) Aligning a Smarter Than Human Intelligence is Difficult
(57:57) Other People Are Not As Worried About AI Killing Everyone
(01:01:09) The Lighter Side
---
First published:
January 25th, 2024
Source:
https://www.lesswrong.com/posts/bWnonYFj4rwtXuErK/ai-48-the-talk-of-davos
Narrated by TYPE III AUDIO.
There's always lots of stuff going on. The backlog of other roundups keeps growing rather than shrinking. I have also decided to hold back a few things to turn them into their own posts instead.
Bad News
I wonder if it is meaningful that most of the bad news is about technology?
I don’t even know if this is news, but Rutgers finds TikTok amplifies and suppresses content based on whether it aligns with the CCP.
It would be great if we could find a way to ban or stop using TikTok that did not involve something crazy like the Restrict Act. I still think the Restrict Act is worse than nothing, if those are our only choices.
If the CCP limited its interference to explicitly internal Chinese topics, I would understand, but they do not: WSJ investigates the TikTok rabbit hole, in particular [...]
---
Outline:
(00:18) Bad News
(07:53) Government Working (USA Edition)
(10:58) Government Working (UK/EU Edition)
(15:14) Trouble in the Suez
(22:42) Crime and Punishment
(34:03) Good News, Everyone
(39:11) Sports Go Sports
(43:24) Gamers Gonna Game Game Game Game Game
(50:02) Game Reviews
(53:00) I Was Promised Flying Self-Driving Cars
(53:31) While I Cannot Condone This
(01:02:48) Money Stuff
(01:14:23) At the Movies
---
First published:
January 24th, 2024
Source:
https://www.lesswrong.com/posts/8gxkJnZCBrNBRZREH/monthly-roundup-14-january-2024
Narrated by TYPE III AUDIO.
The biggest event of the week was the Sleeper Agents paper from Anthropic. I expect that to inform our thoughts for a while to come, and to lay foundation for additional work. We also had the first third of the IMO solved at almost gold medal level by DeepMind, discovering that math competition geometry is actually mostly composed of One Weird Trick. I knew that at the time I was doing it, though, and it was still really hard.
As usual, there was also a bunch of other stuff.
Tomorrow the 19th, I am going to be off to San Francisco for the weekend to attend a workshop. That leaves a lot of time for other events and seeing other people, a lot of which remains unfilled. So if you are interested in meeting up or want to invite me to a gathering, especially on Sunday the [...]
---
Outline:
(00:53) Language Models Offer Mundane Utility
(02:01) Language Models Don’t Offer Mundane Utility
(05:49) GPT-4 Real This Time
(07:05) Fun with Image Generation
(08:02) Copyright Confrontation
(09:43) Deepfaketown and Botpocalypse Soon
(11:31) They Took Our Jobs
(12:16) Get Involved
(13:11) Introducing
(17:36) In Other AI News
(22:48) Quiet Speculations
(25:30) The Quest for Sane Regulations
(31:50) The Week in Audio with Sam Altman
(35:03) David Brin Podcast
(45:46) Rhetorical Innovation
(48:59) Anthropic Paper on Sleeper Agents
(50:45) Anthropic Introduces Impossible Mission Force
(53:57) Aligning a Smarter Than Human Intelligence is Difficult
(58:57) The Belrose Model Continued
(01:23:46) Open Model Weights Are Unsafe And Nothing Can Fix This
(01:28:26) People Are Worried About AI Killing Everyone
(01:28:42) Other People Are Not As Worried About AI Killing Everyone
(01:31:15) The Lighter Side
---
First published:
January 18th, 2024
Source:
https://www.lesswrong.com/posts/WRGmBE3h4WjA5EC5a/ai-48-exponentials-in-geometry
Narrated by TYPE III AUDIO.
The recent paper from Anthropic is getting unusually high praise, much of it I think deserved.
The title is: Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training.
Scott Alexander also covers this, offering an excellent high level explanation, of both the result and the arguments about whether it is meaningful. You could start with his write-up to get the gist, then return here if you still want more details, or you can read here knowing that everything he discusses is covered below. There was one good comment, pointing out some of the ways deceptive behavior could come to pass, but most people got distracted by the ‘grue’ analogy.
Right up front before proceeding, to avoid a key misunderstanding: I want to emphasize that in this paper, the deception was introduced intentionally. The paper deals with attempts to remove it.
The rest of this [...]
---
Outline:
(01:02) Abstract and Basics
(04:54) Threat Models
(07:01) The Two Backdoors
(10:13) Three Methodological Variations and Two Models
(11:30) Results of Safety Training
(13:08) Jesse Mu clarifies what he thinks the implications are.
(15:20) Implications of Unexpected Problems
(18:04) Trigger Please
(19:44) Strategic Deceptive Behavior Perhaps Not Directly in the Training Set
(21:54) Unintentional Deceptive Instrumental Alignment
(25:33) The Debate Over How Deception Might Arise
(38:46) Have We Disproved That Misalignment is Too Inefficient to Survive?
(47:52) Avoiding Reading Too Much Into the Result
(55:41) Further Discussion of Implications
(59:07) Broader Reaction
---
First published:
January 17th, 2024
Source:
https://www.lesswrong.com/posts/Sf5CBSo44kmgFdyGM/on-anthropic-s-sleeper-agents-paper
Narrated by TYPE III AUDIO.
Saving up medical and health related stories from several months allowed for much better organizing of them, so I am happy I split these off. I will still post anything more urgent on a faster basis. There's lots of things here that are fascinating and potentially very important, but I’ve had to prioritize and focus elsewhere, so I hope others pick up various torches.
Vaccination Ho!
We have a new malaria vaccine. That's great. WHO thinks this is not an especially urgent opportunity, or any kind of ‘emergency’ and so wants to wait for months before actually putting shots into arms. So what if we also see reports like ‘cuts infant deaths by 13%’? WHO doing WHO things, WHO Delenda Est and all that. What can we do about this?
Also, EA and everyone else who works in global health needs to do a complete post-mortem [...]
---
Outline:
(00:25) Vaccination Ho!
(02:57) Potential Progress
(07:05) It's Not Progress
(11:09) Cost Plus
(13:18) New Findings
(15:21) FDA Delenda Est
(20:57) Covid Response Postmortem and Paths Forward
(23:56) Covid Origins
(28:32) Ban Gain of Function Research
(28:51) Cause Areas
(30:26) I criticize Effective Altruists for insufficiently high levels of epistemic rigor, because reality does not grade on a curve, but let no one confuse them with governments.
(34:04) GLP-1 Has Barely Begun
(41:53) No One Understands Nutrition
(43:24) Model This: Exercise Edition
(49:00) A Bold Stand Against Torture
---
First published:
January 16th, 2024
Source:
https://www.lesswrong.com/posts/kFDk4Q9QhqrDE68qp/medical-roundup-1
Narrated by TYPE III AUDIO.
My 2023 ACX predictions showed a clear lack of confidence in taking on the market. I won 30 markets for an average of +185 each, and lost 12 for an average loss of -185 each. When one goes 30-12, hitting a 71% mark versus the roughly 58% average price initially paid, that is worth noticing. It is possible that I generally benefited from 2023 being a year where not much happened outside of AI, but I think it's time to know what we really think.
That means this year I’m going to add a phase, where I predict blind. Blind means I’m not allowed to look at any prediction markets. I can still look up facts, and financial markets are fair game, but nothing beyond that. Only after that will I look at Manifold. Metaculus makes this viable, as they have the ACX questions without listing probabilities.
---
Outline:
(01:47) International Politics
(11:13) American Electoral Politics
(20:48) US Politics and Government (excluding elections and AI)
(28:56) Economics
(37:10) Science (Mostly Rocketry for Some Reason)
(42:32) AI
(46:15) Hard Fork: Bonus Buy/Sell/Hold
---
First published:
January 9th, 2024
Source:
https://www.lesswrong.com/posts/7j9JXpGvXNowCkhdf/2024-acx-predictions-blind-buy-sell-hold
Narrated by TYPE III AUDIO.
Developments around relationships and dating have a relatively small speed premium, and there are once again enough of them for a full post.
The first roundup speculated on why you’re still single. We failed to settle the issue. A lot of you are indeed still single. So the debate continues.
You’re Single Because You’re Not Even Trying
What does it mean to not even be trying?
It does not only mean the things Alexander pointed us to last time, like 62% of singles being on zero dating apps, and a majority of singles having gone on zero dates in the past year, and a large majority not actively looking for a relationship. Here are those graphs again:
It also means things such as literally never approaching a woman in person.
Alexander (Keeper.ai): Why are so many young men single? Are they excluded from a [...]
---
Outline:
(00:25) You’re Single Because You’re Not Even Trying
(07:30) You’re Single Because of Artificial Intelligence
(10:38) You’re Single Because You Meet the Definition of Insanity
(18:52) You’re Single Because You’re Asking the Wrong Questions
(20:40) You’re Single Because of a Market Mismatch
(21:39) You’re Single Because Dating Apps Suck
(25:20) You’re Single Because You Didn’t Pay for Tinder Select
(39:24) You’re Single Because You Have Zero Friends
(43:14) You’re Single Because You Are Bad at Texting
(44:32) You’re Single Because You Have No Honor
(45:28) You’re Single Because You’re Not Hot
(51:23) You’re Single Because You Don’t Get Your House in Order
(52:55) You’re Single Because You’re Too Weird
(54:21) You’re Single Because of Your Bodycount
(59:34) You’re Single Because You Need to Learn Seduction
(01:08:12) You’re Single Because You Cannot Take a Hint
(01:10:05) You’re Single Because You Waste Your Money on Signaling
(01:12:27) You’re Still Single Because You Misalign Your Incentives
(01:13:03) You’re Not Single and You Are an Inspiration
(01:16:04) You’re Probably Not Single Stop Misrepresenting the Statistics
(01:16:35) You’re Single Because You’re Too Busy Writing Comments
(01:19:20) What About My Good Advice?
(01:21:09) Future Plans for this Series
---
First published:
January 2nd, 2024
Source:
https://www.lesswrong.com/posts/y8g4bXF7sdT9RduT6/dating-roundup-2-if-at-first-you-don-t-succeed
Narrated by TYPE III AUDIO.
[NOTE: I forgot to post this to WP/LW/RSS on Thursday, so posting it now. Sorry about that.]
The new year will be very different from the old one by the time we are done. So far, it seems like various continuations of the old year. Sometimes I look back on the week, and I wonder how so much happened, while in other senses very little happened.
Table of Contents
---
Outline:
(00:28) Language Models Offer Mundane Utility
(02:44) Language Models Don’t Offer Mundane Utility
(06:13) GPT-4 Real This Time
(12:30) Liar Liar
(13:50) Fun with Image Generation
(19:09) Magic: The Generating
(23:57) Copyright Confrontation
(27:30) Deepfaketown and Botpocalypse Soon
(29:08) They Took Our Jobs
(37:39) Get Involved
(37:48) Introducing
(38:41) In Other AI News
(41:18) Quiet Speculations
(51:33) The Quest for Sane Regulations
(54:57) The Week in Audio
(55:10) AI Impacts Survey
(57:29) Rhetorical Innovation
(01:14:41) Aligning a Human Level Intelligence is Still Difficult
(01:17:00) Aligning a Smarter Than Human Intelligence is Difficult
(01:26:23) Won’t Get Fooled Again
(01:30:03) People Are Worried About AI Killing Everyone
(01:32:11) Other People Are Not As Worried About AI Killing Everyone
(01:34:55) The Wit and Wisdom of Sam Altman
(01:38:15) The Lighter Side
---
First published:
January 13th, 2024
Source:
https://www.lesswrong.com/posts/iygs57bHJ36AvzpMh/ai-47-meet-the-new-year
Narrated by TYPE III AUDIO.
It is that time of the year. One must ask not only whether predictions were right or wrong, whether one won or lost, but what one was and should have been thinking, whether or not good decisions were made, whether the market made sense.
The main subject will be the 2023 ACX Predictions, where I performed buy/sell/hold along with sharing my logic. The numbers quoted are from mid-February 2023, first Manifold, then Metaculus.
Section 1: World Politics
Last year I thought markets were too confident Putin would keep power. This year I think this is not confident enough and Metaculus is more accurate at 90%. Metaculus is also doing a better job adjusting as time passes. Things seem to be stabilizing, and every day without big bad news is good [...]
---
Outline:
(00:31) Section 1: World Politics
(12:16) Section 2: Politics
(26:20) Section 3: Tech and Economics
(50:52) Overall Results
---
First published:
January 8th, 2024
Source:
https://www.lesswrong.com/posts/6o98z3QMAQSkHf3gp/2023-prediction-evaluations
Narrated by TYPE III AUDIO.
Katja Grace and AI Impacts survey thousands of researchers on a variety of questions, following up on a similar 2022 survey as well as one in 2016.
I encourage opening the original to get better readability of graphs and for context and additional information. I’ll cover some of it, but there's a lot.
A Very Large Survey Full of Contradictions
Here is the abstract, summarizing many key points:
In the largest survey of its kind, we surveyed 2,778 researchers who had published in top-tier artificial intelligence (AI) venues, asking for their predictions on the pace of AI progress and the nature and impacts of advanced AI systems.
The aggregate forecasts give at least a 50% chance of AI systems achieving several milestones by 2028, including autonomously constructing a payment processing site from scratch, creating a song indistinguishable from a new song by a popular [...]
---
Outline:
(00:25) A Very Large Survey Full of Contradictions
(05:24) They’ve Destabilized the Timeline
(08:50) What, Me Worry?
(10:02) The Biggest Question
(14:25) Not So Fast with the Not So Fast
(16:03) Safety Research Is Good Actually
(18:54) Questions for Next Season
---
First published:
January 5th, 2024
Source:
https://www.lesswrong.com/posts/NfPxAp5uwgZugwovY/ai-impacts-survey-december-2023-edition
Narrated by TYPE III AUDIO.
The first half of the week was filled with continued talk about the New York Times lawsuit against OpenAI, which I covered in its own post. Then that talk seemed to mostly die down, and things were relatively quiet. We got a bunch of predictions for 2024, and I experimented with prediction markets for many of them.
Note that if you want to contribute in a fun, free and low-key way, participating in my prediction markets on Manifold is a way to do that. Each new participant in each market, even if small, adds intelligence, adds liquidity and provides me a tiny bonus. Also, of course, it is great to help get the word out to those who would be interested. Paid subscriptions and contributions to Balsa are of course also welcome.
I will hopefully be doing both a review of my 2023 predictions (mostly not about [...]
---
Outline:
(01:05) Language Models Offer Mundane Utility
(02:22) Language Models Don’t Offer Mundane Utility
(03:39) GPT-4 Real This Time
(04:22) Fun with Image Generation
(06:12) Deepfaketown and Botpocalypse Soon
(06:21) They Took Our Jobs
(08:17) Get Involved
(09:08) Introducing
(09:34) In Other AI News
(10:16) Doom?
(15:02) Quiet Speculations
(29:06) The Quest for Sane Regulations
(29:10) The Week in Audio
(30:54) Rhetorical Innovation
(34:36) Politico Problems
(39:55) Cup of Coffee
(50:42) Aligning a Smarter Than Human Intelligence is Difficult
(51:42) People Are Worried About AI Killing Everyone
(52:56) Other People Are Not As Worried About AI Killing Everyone
(53:01) The Lighter Side
---
First published:
January 4th, 2024
Source:
https://www.lesswrong.com/posts/NfXF6MZTgae766aoX/ai-45-to-be-determined
Narrated by TYPE III AUDIO.
Lawsuits and legal issues over copyright continued to get a lot of attention this week, so I’m gathering those topics into their own post. The ‘virtual #0’ post is the relevant section from last week's roundup.
Four Core Claims
Who will win the case? Which of New York Times's complaints will be convincing?
Different people have different theories of the case.
Part of that is that there are four distinct allegations NYT is throwing at the wall.
Arvind Narayanan: A thread on some misconceptions about the NYT lawsuit against OpenAI. Morality aside, the legal issues are far from clear cut. Gen AI makes an end run around copyright and IMO this can’t be fully resolved by the courts alone.
As I currently understand it, NYT alleges that OpenAI engaged in 4 types of unauthorized copying of its articles:
---
Outline:
(00:17) Four Core Claims
(01:17) Key Claim: The Training Dataset Contains Copyrighted Material
(05:00) Other Claims
(06:04) A Few Legal Takes
(10:15) What Can You Reproduce?
(14:16) How and How Often Are You Reproducing It?
(22:06) What Should the Rule Be?
(27:03) Image Generation Edition
(28:50) Compulsory License
---
First published:
January 3rd, 2024
Source:
https://www.lesswrong.com/posts/9WD8nkqLTcd8YJPpT/copyright-confrontation-1
Narrated by TYPE III AUDIO.
The New York Times has thrown down the gauntlet, suing OpenAI and Microsoft for copyright infringement. Others are complaining about recreated images in the otherwise deeply awesome MidJourney v6.0. As is usually the case, the critics misunderstand the technology involved, complain about infringements that inflict no substantial damages, engineer many of the very outputs they then complain about, and make cringeworthy accusations.
That does not, however, mean that The New York Times case is baseless. There are still very real copyright issues at the heart of Generative AI. This suit is a serious effort by top lawyers. It has strong legal merit. They are likely to win if the case is not settled.
Table of Contents
---
Outline:
(00:53) Language Models Offer Mundane Utility
(07:40) GPT-4 Real This Time
(10:35) Fun with Image Generation
(17:58) Copyright Confrontation
(29:55) Deepfaketown and Botpocalypse Soon
(35:00) Going Nuclear
(36:39) In Other AI News
(37:54) Quiet Speculations
(43:40) The UN Reports
(44:14) Guiding Principles
(51:39) The Week in Audio
(52:25) Rhetorical Innovation
(56:09) AI With Open Model Weights Is Unsafe and Nothing Can Fix This
(01:03:16) Aligning a Human Level Intelligence is Still Difficult
(01:06:25) Please Speak Directly Into the Microphone
(01:06:39) The Wit and Wisdom of Sam Altman
(01:19:05) The Lighter Side
---
First published:
December 28th, 2023
Source:
https://www.lesswrong.com/posts/3GzRrqLAcdDXzbqc4/ai-44-copyright-confrontation
Narrated by TYPE III AUDIO.
We get innovation in functional search. In an even more functional search, we finally get a Nature paper submitted almost two years ago, in which AI discovered a new class of antibiotic. That's pretty damn exciting, with all the implications thereof.
OpenAI continued its rapid pace of shipping, pivoting for this week to safety. There was a paper about weak-to-strong generalization. I see what they are trying to do. It is welcome, but I was underwhelmed. It and Leike's follow-up post continue down a path for which I have high skepticism, but the new concreteness gives me more hope that the flaws will be exposed early, allowing adjustment. Or I could be wrong.
OpenAI also had the beta release of its Preparedness Framework. That was more exciting. There was a lot of great stuff there, much better than I would have expected, and having a framework at all is [...]
---
Outline:
(01:17) Language Models Offer Mundane Utility
(04:13) Language Models Don’t Offer Mundane Utility
(05:20) GPT-4 Real This Time
(07:54) Fun with Image Generation
(08:10) Deepfaketown and Botpocalypse Soon
(08:59) Digi Relic Digi
(19:57) Going Nuclear
(21:17) Get Involved
(22:27) Follow the Money
(25:26) Introducing
(31:07) In Other AI News
(34:27) Quiet Speculations
(43:14) The Quest for Sane Regulations
(52:06) The Week in Audio
(52:55) Rhetorical Innovation
(01:02:28) Aligning a Smarter Than Human Intelligence is Difficult
(01:20:17) Vulnerable World Hypothesis
(01:22:55) People Are Worried About AI Killing Everyone
(01:26:49) Other People Are Not As Worried About AI Killing Everyone
(01:28:12) The Lighter Side
---
First published:
December 21st, 2023
Source:
https://www.lesswrong.com/posts/WaDFCrd6KEwojLXgj/ai-43-functional-discoveries
Narrated by TYPE III AUDIO.
Previously: On RSPs.
Be Prepared
OpenAI introduces their preparedness framework for safety in frontier models.
A summary of the biggest takeaways, which I will repeat at the end:
---
Outline:
(00:07) Be Prepared
(02:48) Basic Principles
(07:33) Veto Power
(10:27) Introductory Section and Risk Categories
(13:13) Cybersecurity
(15:58) CBRN (Chemical, Biological, Radiological and Nuclear) Threats
(18:47) Persuasion
(22:24) Model Autonomy
(25:34) Key Takeaways From Risk Descriptions
(28:36) Scorecards
(31:27) Governance
(34:56) Deployment Restrictions
(36:21) Development Restrictions
(39:50) Conclusion and Biggest Takeaways
---
First published:
December 21st, 2023
Source:
https://www.lesswrong.com/posts/hQPfLsDKWtdvMwyyr/on-openai-s-preparedness-framework
Narrated by TYPE III AUDIO.
I have not actually forgotten that the rest of the world exists. As usual, this is everything that wasn’t worth an entire post and is not being saved for any of the roundup post categories.
(Roundup post categories are currently AI, Medical and Health, Housing and Traffic, Dating, Childhood and Education, Fertility, Startups, and potentially NEPA and Clean Energy.)
Bad News
Rebels from Yemen were firing on ships in the Red Sea, a problem dating back thousands of years. Here's where we were on December 17, with the US government finally dropping the hammer.
Hidden fees exist, even when everyone knows they’re there, because they work. StubHub ran an experiment; hiding the fees meant people spent 21% more money. Companies simply can’t pass that up. Government intervention could be justified. However, I also notice that Ticketmaster is now using ‘all-in’ pricing for many shows with zero hidden fees [...]
---
Outline:
(00:28) Bad News
(02:50) Government Working
(06:51) Work From Home
(08:12) Antisocial Media
(11:56) Fraud Is a Rather Big Deal
(15:28) Giving Away $2.1 Billion Also a Big Deal
(18:00) Hit Vote With Rock
(18:55) San Francisco's Finest Compensation Packages
(20:19) Crime and Punishment in San Francisco
(26:08) Crime and Punishment Everywhere
(34:04) Good News, Everyone
(35:50) Meaning What Exactly
(37:44) While I Cannot Condone This
(39:05) Another Voting Proposal
(40:50) At the Movies: 2023 in Review
(44:09) Money Stuff
---
First published:
December 19th, 2023
Source:
https://www.lesswrong.com/posts/sZ8f5NhGPCSkdKqm5/monthly-roundup-13-december-2023
Narrated by TYPE III AUDIO.
With the year ending and my first Vox post coming out, this week was a natural time to take stock. I wrote my first best-of post in a long time and laid out my plans for my 501c(3).
It was also another eventful week. We got a lot more clarity on the OpenAI situation, although no key new developments on the ground. The EU AI Act negotiators reached a compromise, which I have not yet had the opportunity to analyze properly. We got a bunch of new toys to play with, including NotebookLM and Grok, and the Gemini API.
I made a deliberate decision not to tackle the EU AI Act here. Coverage has been terrible at telling us what is in the bill. I want to wait until we can know what is in it, whether or not that means I need to read the whole [...]
---
Outline:
(00:59) Language Models Offer Mundane Utility
(05:13) Language Models Don’t Offer Mundane Utility
(07:17) GPT-4 Real This Time
(11:22) The Other Gemini
(15:03) Fun with Image Generation
(15:40) Deepfaketown and Botpocalypse Soon
(18:32) They Took Our Jobs
(24:51) Get Involved
(26:41) Introducing
(27:13) In Other AI News
(29:49) Quiet Speculations
(37:08) The Quest for Sane Regulations
(42:22) The EU AI Act
(43:05) The Week in Audio
(43:43) Rhetorical Innovation
(50:31) Doom!
(01:03:21) Doom Discourse Innovation
(01:07:03) E/acc
(01:08:32) Poll says e/ack
(01:11:53) Turing Test
(01:13:43) Aligning a Human Level Intelligence Also Difficult
(01:20:04) Aligning a Smarter Than Human Intelligence is Difficult
(01:25:17) Open Foundation Model Weights Are Unsafe And Nothing Can Fix This
(01:26:18) Key Takeaways
(01:31:12) Other People Are Not As Worried About AI Killing Everyone
(01:35:33) The Lighter Side
---
First published:
December 14th, 2023
Source:
https://www.lesswrong.com/posts/emo2hAvq6p7Pn4Pps/ai-42-the-wrong-answer
Narrated by TYPE III AUDIO.
Hello everyone! This is going to be a bit of a housekeeping post and a welcome to new subscribers.
Note that this is not the primary version of my writing, which can be found on Substack, but it is a full copy of all posts found there.
My writing can be intimidating. There is a lot of it, and it's often dense. As always, choose only the parts relevant to your interests, do not be afraid to make cuts. I attempt to make every post accessible as an entry point, but I also want to build up a superstructure over time. This seemed like a good time to recap some of the very best of my old writing and talk about what I’m up to.
Over many years, this blog has morphed from focusing on rationality to COVID to AI.
But not only those [...]
---
Outline:
(01:28) Rationality
(03:03) The Evergreen Posts
(07:13) AI
(09:48) General Rationality and Principles of Life and Thinking
(11:26) World Modeling
(13:20) Oh, Yeah, That Happened
(13:52) Gaming Fun
(14:48) Fertility, School and Childhood
(15:50) Covid
(17:26) The Simulacra Levels Sequence
(18:47) The Moral Mazes Sequence
(20:44) The Choices Sequence
(21:59) Game Theory, Gambling and Prediction Markets
(22:55) Bonus Content: A Newly Salient Question for 2023
---
First published:
December 13th, 2023
Source:
https://www.lesswrong.com/posts/dHYxnSgMDeveovLuv/the-best-of-don-t-worry-about-the-vase
Narrated by TYPE III AUDIO.
Wow, what a year it has been. Things keep getting crazier.
Thank you for taking this journey with me. I hope I have helped you keep pace, and that you have been able to discern for yourself the parts of this avalanche of words and events that were helpful. I hope to have helped things make somewhat more sense.
And I hope many of you have taken that information, and used it not only to be able to check Twitter less, but also to make better decisions, and, hopefully, to help make the world a better place—one in which humanity is more likely to survive.
Recently, my coverage of the Biden administration executive order and the events at OpenAI have been received very positively. I’d like to do more in that mold: more focused, shorter pieces that pull the story together, hopefully de-emphasizing more ephemeral weekly [...]
---
Outline:
(01:32) Balsa First Targets the Jones Act
(04:01) Other Balsa Cause Areas
(04:12) Housing
(05:47) NEPA
(08:16) AI
(12:00) Funding Situation
---
First published:
December 12th, 2023
Source:
https://www.lesswrong.com/posts/eaFmbgnWsXdGb2FSk/balsa-update-and-general-thank-you
Narrated by TYPE III AUDIO.
Previously: OpenAI: Altman Returns, OpenAI: The Battle of the Board, OpenAI: Facts from a Weekend, additional coverage in AI#41.
We have new stories from The New York Times, from Time, from the Washington Post and from Business Insider.
All paint a picture consistent with the central story told in OpenAI: The Battle of the Board. They confirm key facts, especially Altman's attempted removal of Toner from the board via deception. We also confirm that Altman promised to help with the transition when he was first fired, so we have at least one very clear-cut case of Altman saying that which was not.
Much uncertainty remains, especially about the future, but past events are increasingly clear.
The stories also provide additional color and key details. This post is for those who want that, and to figure out what to think in light of the new [...]
---
Outline:
(01:21) The New York Times Covers Events
(15:30) Time Makes Altman CEO of the Year
(18:49) Washington Post Says Leaders Warned Altman was Abusive
(20:05) Business Insider Says Microsoft Letter Was a Bluff
(26:15) Where Does That All Leave Us?
---
First published:
December 12th, 2023
Source:
https://www.lesswrong.com/posts/xY5m72tME9kqjpdoC/openai-leaks-confirm-the-story
Narrated by TYPE III AUDIO.
The biggest news this week was at long last the announcement of Google's Gemini. Be sure to check that out. Note that what is being rolled out now is only Gemini Pro; the Gemini Ultra model that could rival GPT-4 is not yet available.
It does not seem I am doing a good job cutting down on included material fast enough to keep pace. A lot is happening, but a lot will likely be happening for a long time. If your time is limited, remember to focus on the sections relevant to your interests.
Also, if you are going to be at the New York Solstice or the related meetup, please do say hello.
Table of Contents
My other post today covers Google's Gemini. Be sure to read that.
I also put out two other posts this week: Based Beff Jezos and the Accelerationists, and On [...]
---
Outline:
(00:42) Language Models Offer Mundane Utility
(05:57) Language Models Don’t Offer Mundane Utility
(08:38) OpenAI: The Saga Continues
(15:01) Q Continuum
(18:04) Fun with Image Generation
(20:08) Get Involved
(20:47) Introducing
(22:01) In Other AI News
(24:21) Quiet Speculations
(28:29) Model This
(37:28) Would You Like Some Volcano Apocalypse Insurance?
(38:58) The Quest for Sane Regulations
(53:02) The Week in Audio
(53:11) Rhetorical Innovation
(01:01:43) Aligning a Human Level Intelligence Is Still Difficult
(01:04:38) Aligning a Smarter Than Human Intelligence is Difficult
(01:08:54) How Timelines Have Changed
(01:10:29) People Are Worried About AI Killing Everyone
(01:12:54) Other People Are Not As Worried About AI Killing Everyone
(01:25:14) Somehow This Is The Actual Vice President
(01:31:24) The Lighter Side
---
First published:
December 7th, 2023
Source:
https://www.lesswrong.com/posts/9Jgtkw8CD6kndyCcD/ai-41-bring-in-the-other-gemini
Narrated by TYPE III AUDIO.
It's happening. Here is CEO Pichai's Twitter announcement. Here is Demis Hassabis announcing. Here is the DeepMind Twitter announcement. Here is the blog announcement. Here is Gemini co-lead Oriol Vinyals, promising more to come. Here is Google's Chief Scientist Jeff Dean bringing his best hype.
Technical Specifications
Let's check out the specs.
Context length trained was 32k tokens; they report 98% accuracy on information retrieval for Ultra across the full context length. So a bit low, both lower than GPT-4 and Claude and lower than their methods can handle. Presumably we should expect that context length to grow rapidly with future versions.
There are three versions of Gemini 1.0.
Gemini 1.0, our first version, comes in three sizes: Ultra for highly-complex tasks, Pro for enhanced performance and deployability at scale, and Nano for on-device applications. Each size is specifically tailored to [...]
---
Outline:
(00:25) Technical Specifications
(11:20) Level Two Bard
(12:42) Gemini Reactions
---
First published:
December 7th, 2023
Source:
https://www.lesswrong.com/posts/ofYejKKiSFYH2gLBb/gemini-1-0
Narrated by TYPE III AUDIO.
It seems Forbes decided to doxx the identity of e/acc founder Based Beff Jezos. They did so using voice matching software.
Given that Jezos is owning it now that it has happened, rather than hoping it all goes away, and people are talking about him, this seems like a good time to cover this ‘Beff Jezos’ character and create a reference point in case he continues to come up later.
If that is not relevant to your interests, you can and should skip this one.
Do Not Doxx People
First order of business: Bad Forbes. Stop it. Do not doxx people. Do not doxx people with a fox. Do not doxx people with a bagel with creme cheese and lox. Do not doxx people with a post. Do not doxx people who then boast. Do not doxx people even if that person is advocating for policies you [...]
---
Outline:
(00:31) Do Not Doxx People
(01:21) Beff Jezos Advocates Actions He Thinks Would Probably Kill Everyone
(02:46) A Matter of Some Debate
(08:23) Response to the Doxx
(12:44) So What is E/Acc Then?
(18:56) Conclusion
---
First published:
December 6th, 2023
Source:
https://www.lesswrong.com/posts/3xoThNNYgZmTCpEAB/based-beff-jezos-and-the-accelerationists
Narrated by TYPE III AUDIO.
This post was originally intended to come out directly after the UK AI Safety Summit, to give the topic its own deserved focus. One thing led to another, and I am only doubling back to it now.
Responsible Deployment Policies
At the AI Safety Summit, all the major Western players were asked: What are your company policies on how to keep us safe? What are your responsible deployment policies (RDPs)? Except that they call them Responsible Scaling Policies (RSPs) instead.
I deliberately say deployment rather than scaling. No one has shown what I would consider close to a responsible scaling policy in terms of what models they are willing to scale and train.
Anthropic at least does however seem to have something approaching a future responsible deployment policy, in terms of how to give people access to a model if we assume it is safe for [...]
---
Outline:
(00:17) Responsible Deployment Policies
(03:15) How the UK Graded the Responses
(04:22) Anthropic's Policies
(05:27) The Risks
(10:42) The Promise of a Pause
(13:58) ASL-3 Definitions and Commitments
(18:16) Approaching Thresholds
(24:38) ASL-4
(27:26) Underspecification
(29:06) Takeaways from Anthropic's RSP
(35:30) Others React
(38:30) A Failure to Communicate
(39:47) OpenAI Policies
(41:56) DeepMind Policies
(45:53) Amazon, Inflection and Meta
(47:53) Some Additional Relative Rankings
(48:57) Important Clarification from Dario Amodei
(55:07) Strategic Thoughts on Such Policies
(01:05:37) Conclusion
---
First published:
December 5th, 2023
Source:
https://www.lesswrong.com/posts/yRJNCDp7LHyHGkANz/on-responsible-scaling-policies-rsps
Narrated by TYPE III AUDIO.
It has been brutal out there for someone on my beat. Everyone extremely hostile, even more than usual. Extreme positions taken, asserted as if obviously true. Not symmetrically, but from all sides nonetheless. Constant assertions of what happened in the last two weeks that are, as far as I can tell, flat out wrong, largely the result of a well-implemented media campaign. Repeating flawed logic more often and louder.
The bright spot was offered by Vitalik Buterin, who offers a piece entitled ‘My techno-optimism,’ proposing what he calls d/acc for defensive (or decentralized, or differential) accelerationism. He brings enough nuance and careful thinking, and clear statements about existential risk and various troubles ahead, to get strong positive reactions from the worried. He brings enough credibility and track record, and enough shibboleths, to get strong endorsements from the e/acc crowd, despite his acknowledgement of existential risk and the dangers [...]
---
Outline:
(02:06) Language Models Offer Mundane Utility
(03:52) Language Models Don’t Offer Mundane Utility
(07:37) Q Continuum
(13:04) OpenAI, Altman and Safety
(15:30) A Better Way to Do RLHF
(19:26) Fun with Image Generation
(19:34) Deepfaketown and Botpocalypse Soon
(19:49) They Took Our Jobs
(20:35) Get Involved
(22:21) Introducing
(22:39) In Other AI News
(25:10) It's a Who?
(28:22) What About E/Acc?
(31:54) Vitalik Offers His Version of Techno-Optimism
(40:47) Quiet Speculations
(44:38) AI Agent Future
(47:25) The Quest for Sane Regulations
(51:28) The Week in Audio
(56:13) Rhetorical Innovation
(59:06) Aligning a Smarter Than Human Intelligence is Difficult
(01:06:36) People Might Also Worry About AI Killing Only Some of Them
(01:07:37) People Are Worried About AI Killing Everyone
(01:08:43) Other People Are Not As Worried About AI Killing Everyone
(01:12:38) Please Speak Directly Into This Microphone
(01:13:54) The Lighter Side
---
First published:
November 30th, 2023
Source:
https://www.lesswrong.com/posts/je5BwKe8enCq8DLrm/ai-40-a-vision-from-vitalik
Narrated by TYPE III AUDIO.
As of this morning, the new board is in place and everything at OpenAI is otherwise officially back to the way it was before.
Events seem to have gone as expected. If you have read my previous two posts on the OpenAI situation, nothing here should surprise you.
Still seems worthwhile to gather the postscripts, official statements and reactions into their own post for future ease of reference.
What will the ultimate result be? We likely only find that out gradually over time, as we await both the investigation and the composition and behaviors of the new board.
I do not believe Q* played a substantive role in events, so it is not included here. I also do not include discussion here of how good or bad Altman has been for safety.
Sam Altman's Statement
Here is the official OpenAI statement from Sam [...]
---
Outline:
(00:49) Sam Altman's Statement
(07:17) Bret Taylor's Statement
(10:30) Larry Summers's Statement
(11:11) Helen Toner's Statement
(13:00) OpenAI Needs a Strong Board That Can Fire Its CEO
(15:57) Some Board Member Candidates
(16:54) A Question of Valuation
(18:28) A Question of Optics
---
First published:
November 30th, 2023
Source:
https://www.lesswrong.com/posts/EfqAdxR7bvwQLMTQc/openai-altman-returns
Narrated by TYPE III AUDIO.
The board firing Sam Altman, then reinstating him, dominated everything else this week. Other stuff also happened, but definitely focus on that first.
Table of Contents
Developments at OpenAI were far more important than everything else this week. You can read this timeline of events over the weekend, and this attempt to put all the information together.
---
Outline:
(00:16) Language Models Offer Mundane Utility
(01:09) Language Models Don’t Offer Mundane Utility
(04:50) The Q Continuum
(05:47) OpenAI: The Saga Continues
(13:29) Altman Could Step Up
(14:59) You Thought This Week Was Tough
(16:15) Fun with Image Generation
(16:37) Deepfaketown and Botpocalypse Soon
(17:28) They Took Our Jobs
(18:07) Get Involved
(18:30) Introducing
(19:17) In Other AI News
(20:42) Quiet Speculations
(24:00) The Quest for Sane Regulations
(27:19) That Is Not What Totalitarianism Means
(31:39) The Week in Audio
(34:13) Rhetorical Innovation
(39:21) Aligning a Smarter Than Human Intelligence is Difficult
(46:24) People Are Worried About AI Killing Everyone
(47:54) Other People Are Not As Worried About AI Killing Everyone
(49:18) The Lighter Side
---
First published:
November 23rd, 2023
Source:
https://www.lesswrong.com/posts/3FCfEqRiLLb4gFu3H/ai-39-the-week-of-openai
Narrated by TYPE III AUDIO.
Previously: OpenAI: Facts from a Weekend.
On Friday afternoon, OpenAI's board fired CEO Sam Altman.
Overnight, an agreement in principle was reached to reinstate Sam Altman as CEO of OpenAI, with an initial new board of Bret Taylor (ex-co-CEO of Salesforce, chair), Larry Summers and Adam D’Angelo.
What happened? Why did it happen? How will it ultimately end? The fight is far from over.
We do not entirely know, but we know a lot more than we did a few days ago.
This is my attempt to put the pieces together.
This is a Fight For Control; Altman Started it
This was and still is a fight about control of OpenAI, its board, and its direction.
This has been a long simmering battle and debate. The stakes are high.
Until recently, Sam Altman worked to reshape the company in his [...]
---
Outline:
(00:42) This is a Fight For Control; Altman Started it
(01:12) OpenAI is a Non-Profit With a Mission
(03:20) Sam Altman's Perspective
(05:41) The Outside Board's Perspective
(06:51) Ilya Sutskever's Perspective
(07:52) Altman Moves to Take Control
(11:02) One Last Chance
(14:32) Botched Communications
(16:18) The Negotiation
(18:13) What Now for OpenAI?
---
First published:
November 22nd, 2023
Source:
https://www.lesswrong.com/posts/sGpBPAPq2QttY4M2H/openai-the-battle-of-the-board
Narrated by TYPE III AUDIO.
Approximately four GPTs and seven years ago, OpenAI's founders brought forth on this corporate landscape a new entity, conceived in liberty, and dedicated to the proposition that all men might live equally when AGI is created.
Now we are engaged in a great corporate war, testing whether that entity, or any entity so conceived and so dedicated, can long endure.
What matters is not theory but practice. What happens when the chips are down?
So what happened? What prompted it? What will happen now?
To a large extent, even more than usual, we do not know. We should not pretend that we know more than we do.
Rather than attempt to interpret here or barrage with an endless string of reactions and quotes, I will instead do my best to stick to a compilation of the key facts.
(Note: All times stated here [...]
---
First published:
November 20th, 2023
Source:
https://www.lesswrong.com/posts/KXHMCH7wCxrvKsJyn/openai-facts-from-a-weekend
Narrated by TYPE III AUDIO.
Another busy week. GPT-5 starts, Biden and Xi meet and make somewhat of a deal, GPTs get explored, the EU AI Act is pushed to the verge of collapse by those trying to kill the part that might protect us, multiple very good podcasts. A highly interesting paper on potential deceptive alignment.
Despite things quieting down the last few days, it is still a lot. Hopefully things can remain quiet for a bit, perhaps I can even get in more work on that Jones Act post.
Table of Contents
---
Outline:
(00:39) Language Models Offer Mundane Utility
(05:11) Language Models Don’t Offer Mundane Utility
(10:49) GPT-4 Real This Time
(14:11) Fun with Image Generation
(15:12) Deepfaketown and Botpocalypse Soon
(15:59) A Bad Guy With an AI
(22:46) They Took Our Jobs
(32:57) Get Involved
(34:29) Introducing
(35:18) In Other AI News
(42:00) Quiet Speculations
(43:32) Anti Anti Trust
(44:24) The Quest for Sane Regulations
(57:39) Bostrom Goes Unheard
(58:18) The Week in Audio
(59:37) Someone Picked Up the Phone
(01:01:15) Mission Impossible
(01:02:17) Rhetorical Innovation
(01:05:21) Open Source AI is Unsafe and Nothing Can Fix This
(01:13:19) Aligning a Smarter Than Human Intelligence is Difficult
(01:36:02) People Are Worried About AI Killing Everyone
(01:37:37) Other People Are Not As Worried About AI Killing Everyone
(01:40:33) The Lighter Side
---
First published:
November 16th, 2023
Source:
https://www.lesswrong.com/posts/oCFX5xbhgCmpBFKnb/ai-38-let-s-make-a-deal
Narrated by TYPE III AUDIO.
Things on the AI front have been rather hectic. That does not mean other things stopped happening. Quite the opposite. So here we are again.
Bad News
PSA: Crumbl Cookies, while delicious, have rather a lot of calories: 720 in the basic cookie. Yes, they display this as 180, by deciding the serving size is a quarter of a cookie. This display strategy is pretty outrageous and should not be legal; we need to do something about unrealistic serving sizes. At minimum, require that the serving size be displayed in the same size font as the calorie count.
It really is weird that we don’t think about Russia, and especially the USSR, more in terms of the universal alcoholism.
Reminder that there really is an architecture conspiracy to make life worse. Peter Eisenman straight out says: “Anxiety and alienation is the modern condition. The point of architecture [...]
---
Outline:
(00:16) Bad News
(03:24) Good News, Everyone
(05:31) While I Cannot Condone This
(09:00) Government Working
(15:02) At the Movies
(18:07) Twitter Twitches
(23:41) Yay Free Speech
(25:41) Money Stuff
(41:29) Gamers Gonna Game Game Game Game Game
(50:47) I Was Promised Flying Self-Driving Cars
(56:50) Potentially Effective Altruism
---
First published:
November 14th, 2023
Source:
https://www.lesswrong.com/posts/TfABomJ7s6xLkxTFz/monthly-roundup-12-november-2023
Narrated by TYPE III AUDIO.
[Editor's Note: This post is split off from AI #38 and only on LessWrong because I want to avoid overloading my general readers with this sort of thing at this time, and also I think it is potentially important we have a link available. I plan to link to it from there with a short summary.]
Nick Bostrom was interviewed on a wide variety of questions on UnHerd, primarily on existential risk and AI; I found it thoughtful throughout. In it, he spent the first 80% of the time talking about existential risk. Then in the last 20% he expressed the concern that it was unlikely but possible we would overshoot our concerns about AI and never build AGI at all, which would be a tragedy.
How did those who would dismiss AI risk and build AGI as fast as possible react?
About how you would expect. This is [...]
---
Outline:
(04:40) What Bostrom Centrally Said Was Mostly Not New or Controversial
(06:54) Responses Confirming Many Concerned About Existential Risk Mostly Agree
(11:49) Quoted Text in Detail
(19:42) The Broader Podcast Context
(21:35) A Call for Nuance
(24:33) The Quoted Text Continued
(27:08) Conclusion
---
First published:
November 13th, 2023
Source:
https://www.lesswrong.com/posts/PyNqASANiAuG7GrYW/bostrom-goes-unheard
Narrated by TYPE III AUDIO.
All markets created by Zvi Mowshowitz shall be graded according to the rules described herein, including the zeroth rule.
The version of this on LessWrong shall be the canonical version, even if other versions are later posted on other websites.
Rule 0: If the description of a particular market contradicts these rules, the market's description wins, the way a card in Magic: The Gathering can break the rules. This document only establishes the baseline rules, which can be modified.
---
First published:
November 13th, 2023
Source:
https://www.lesswrong.com/posts/ge3Jf5Hnon8wq4xqT/zvi-s-manifold-markets-house-rules
Narrated by TYPE III AUDIO.
We had OpenAI's dev day, where they introduced a host of new incremental feature upgrades including a longer context window, more recent knowledge cutoff, increased speed, seamless feature integration and a price drop. Quite the package. On top of that, they introduced what they call ‘GPTs’ that can let you configure a host of things to set up specialized proto-agents or widgets that will work for specialized tasks and be shared with others. I would love to mess around with that, once I have the time, and OpenAI's servers allow regular subscribers to get access.
In the meantime, even if you exclude all that, lots of other things happened this week. Thus, even with the spin-off, this is an unusually long weekly update. I swear, and this time I mean it, that I am going to raise the threshold for inclusion or extended discussion substantially going forward, across [...]
---
Outline:
(00:57) Language Models Offer Mundane Utility
(01:37) Bard Tells Tales
(05:41) Fun with Image Generation
(10:15) Deepfaketown and Botpocalypse Soon
(14:34) The Art of the Jailbreak
(17:54) They Took Our Jobs
(22:23) Get Involved
(24:25) Introducing
(28:27) X Marks Its Spot
(34:20) In Other AI News
(44:13) Verification Versus Generation
(46:57) Bigger Tech Bigger Problems
(53:52) Executive Order Open Letter
(01:01:28) Executive Order Reactions Continued
(01:05:53) Quiet Speculations
(01:17:36) The Quest for Sane Regulations
(01:25:55) The Week in Audio
(01:26:14) Rhetorical Innovation
(01:36:44) Aligning a Smarter Than Human Intelligence is Difficult
(01:38:08) Aligning a Dumber Than Human Intelligence Is Still Difficult
(01:42:10) Model This
(01:52:35) Open Source AI is Unsafe and Nothing Can Fix This
(02:07:02) People Are Worried About AI Killing Everyone
(02:10:34) Other People Are Not As Worried About AI Killing Everyone
(02:16:27) The Lighter Side
---
First published:
November 9th, 2023
Source:
https://www.lesswrong.com/posts/44Cv4HFoWEZvFnL5u/ai-37-moving-too-fast
Narrated by TYPE III AUDIO.
OpenAI DevDay was this week. What delicious and/or terrifying things await?
Turbo Boost
First off, we have GPT-4-Turbo.
Today we’re launching a preview of the next generation of this model, GPT-4 Turbo.
GPT-4 Turbo is more capable and has knowledge of world events up to April 2023. It has a 128k context window so it can fit the equivalent of more than 300 pages of text in a single prompt. We also optimized its performance so we are able to offer GPT-4 Turbo at a 3x cheaper price for input tokens and a 2x cheaper price for output tokens compared to GPT-4.
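A quick back-of-envelope check of those numbers (the words-per-token and words-per-page conversion factors are rough conventions, and the GPT-4 list prices are the ones in effect at the time, not figures from the announcement):

```python
# Sanity-checking the "128k context = 300+ pages" claim.
tokens = 128_000
words = tokens * 0.75        # ~0.75 words per token -> 96,000 words
pages = words / 300          # ~300 words per page  -> 320 pages

# The quoted price cuts, relative to GPT-4's then-current rates
# of $0.03 input / $0.06 output per 1k tokens:
gpt4_in, gpt4_out = 0.03, 0.06
turbo_in = gpt4_in / 3       # 3x cheaper input  -> $0.01 per 1k
turbo_out = gpt4_out / 2     # 2x cheaper output -> $0.03 per 1k
```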
GPT-4 Turbo is available for all paying developers to try by passing gpt-4-1106-preview in the API and we plan to release the stable production-ready model in the coming weeks.
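A minimal sketch of what passing that model id looks like in practice, assuming the OpenAI Python SDK (v1+) and an `OPENAI_API_KEY` in the environment when actually sending:

```python
# Illustrative only: assemble a chat-completion payload for the
# GPT-4 Turbo preview model without sending it.
MODEL = "gpt-4-1106-preview"

def build_request(prompt: str) -> dict:
    """Build the request body for a single-turn chat completion."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }

# To send for real (requires the `openai` package and an API key):
#   from openai import OpenAI
#   client = OpenAI()
#   response = client.chat.completions.create(**build_request("Hello"))
```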
Knowledge up to April 2023 is a big gain. Cutting the price [...]
---
Outline:
(00:09) Turbo Boost
(01:25) Function calling updates
(02:27) Improved instruction following and JSON mode
(03:21) Reproducible outputs and log probabilities
(04:34) Updated GPT-3.5 Turbo
(06:49) New Modalities
(07:03) New modalities in the API
(07:08) GPT-4 Turbo with vision
(08:02) DALL·E 3
(08:47) Text-to-speech (TTS)
(10:05) Model customization
(10:09) GPT-4 fine tuning experimental access
(11:01) Custom models
(12:19) Assistants API, Retrieval, and Code Interpreter
(15:13) GPT-GPT
(22:37) Putting It All Together
---
First published:
November 9th, 2023
Source:
https://www.lesswrong.com/posts/wdekcGpsMtakGCo5y/on-openai-dev-day
Narrated by TYPE III AUDIO.
In the eyes of many, Biden's Executive Order somewhat overshadowed the UK Summit. The timing was unfortunate. Both events were important milestones. Now that I have had time, here is my analysis of what happened at the UK Summit.
As is often the case with such events, there was a lot of talk relative to the amount of action. There was a lot of diplomatic talk, talk of that which everyone agrees upon, relative to the amount of talk of real substance. There were days of meetings that resulted in rather unspicy summaries and resolutions. The language around issues that matter most was softened, the actual mission in danger of being compromised.
And as usual, the net result was reason for optimism, a net highly positive event versus not having it, while also in some ways being disappointing when compared to what might have been. A declaration [...]
---
Outline:
(01:47) Looking Back at People's Goals for the Summit and Taskforce
(02:53) AI Safety Summit Agenda
(12:38) Someone Picked Up the Phone
(18:21) The Bletchley Declaration
(28:32) Saying Generic Summit-Style Things
(33:04) Shouting From the Rooftops
(33:27) Some Are Not Easily Impressed
(35:37) Declaring Victory
(45:14) Kanjun Offers Thoughts
(56:17) Closing Remarks
---
First published:
November 7th, 2023
Source:
https://www.lesswrong.com/posts/zbrvXGu264u3p8otD/on-the-uk-summit
Narrated by TYPE III AUDIO.
habryka
Hey Everyone!
As part of working on dialogues over the last few weeks I've asked a bunch of people what kind of conversations they would be most interested in reading, and one of the most common ones has been "I would really like to read a bunch of people trying to figure out how to construct a portfolio that goes well when AGI becomes a bigger deal".
You are three people who would be reasonably high on my list to figure this out with, and so here we are. Not because you are world experts at this, but because I trust your general reasoning a bunch (I know Noah less well, but trust Will and Zvi a good amount).
I think to kick us off, maybe let's start with very brief 1-2 sentence intros on your background and how much you've thought about this thing before (and [...]
---
Outline:
(07:37) Broad market effects of AGI
(10:23) Career capital in an AGI world
(20:29) Debt and Interest rates effects of AGI
(26:53) Concrete example portfolio
(43:21) Is any of this ethical or sanity-promoting?
(49:12) How would you actually use a ton of money to help with AGI going well?
(53:08) Please diversify your crypto portfolio
(56:24) Should you buy private equity into AI companies?
(58:04) Summarizing takeaways
---
First published:
November 6th, 2023
Narrated by TYPE III AUDIO.
Wow, what a week. We had the Executive Order, which I read here so you don’t have to and then I have a tabulation of the reactions of others.
Simultaneously there was the UK AI Summit.
There was also robust related discussion around Responsible Scaling Policies, and the various filings companies did in advance of the Summit.
I touched on Anthropic's RSP in particular in previous weeks, but I did not do a sufficiently close analysis and many others have offered more detailed thoughts as well, and the context has evolved.
So I am noting that I am not covering those important questions in the weekly roundup, and they will be covered by one or more later distinct posts. I also potentially owe an after action report from EA Global Boston, if I can find the time.
This post is instead about everything else.
[...]
---
Outline:
(00:53) Language Models Offer Mundane Utility
(03:22) Language Models Don’t Offer Mundane Utility
(07:03) GPT-4 Real This Time
(10:11) Fun with Image Generation
(10:43) Best Picture
(15:19) Deepfaketown and Botpocalypse Soon
(20:25) They Took Our Jobs
(22:34) Get Involved
(22:55) OpenAI Frontier Risk and Preparedness Team
(27:03) Introducing
(28:25) In Other AI News
(33:13) Quiet Speculations
(37:45) The Quest for Sane Regulations
(38:47) 22756.
(42:51) 22756.2.
(43:14) 22756.4.
(44:15) The Week in Audio
(48:13) Rhetorical Innovation
(54:43) Open Source AI is Unsafe and Nothing Can Fix This
(57:08) Aligning a Smarter Than Human Intelligence is Difficult
(01:01:20) People Are Worried About AI Killing Everyone
(01:02:39) Please Speak Directly Into This Microphone
(01:05:13) The Lighter Side
---
First published:
November 2nd, 2023
Source:
https://www.lesswrong.com/posts/QLuoMnhR5XNAAWjJx/ai-36-in-the-background
Narrated by TYPE III AUDIO.
Or: I read the executive order and its fact sheet, so you don’t have to.
I spent Halloween reading the entire Biden Executive Order on AI.
This is the pure ‘what I saw reading the document’ post. A companion post will cover reactions to this document, but I wanted this to be a clean reference going forward.
Takeaway Summary: What Does This Do?
It mostly demands a lot of reports, almost entirely from within the government.
---
Outline:
(00:26) Takeaway Summary: What Does This Do?
(05:46) Fact Sheet
(21:53) I Read the Whole Damn Thing So You Don’t Have To
(22:03) Sections 1 and 2: Introduction and Principles
(23:22) Section 3: Definitions
(30:05) Section 4: Ensuring Safety and Security
(43:44) Section 5: Promoting Innovation and Competition
(50:58) Section 6: Supporting Workers
(51:37) Section 7: Advancing Equity and Civil Rights
(52:41) Section 8: Protecting Consumers, Patients, Passengers and Students
(53:38) Section 9: Protecting Privacy
(54:06) Section 10: Advancing Federal Government Use of AI
(56:51) Section 11: Strengthening American Leadership Abroad
(57:38) Section 12: Implementation
(58:06) Conclusion
---
First published:
November 1st, 2023
Source:
https://www.lesswrong.com/posts/PvBpRu354uG7ypwRP/on-the-executive-order
Narrated by TYPE III AUDIO.