214 episodes • Length: 65 min • Weekly: Thursdays
Audio narrations of LessWrong posts by zvi
The podcast LessWrong posts by zvi is created by zvi. The podcast and its artwork are embedded on this page using the public podcast feed (RSS).
For our annual update on how Balsa is doing, I am turning the floor over to Jennifer Chen, who is the only person working full time on Balsa Research.
For my general overview of giving opportunities, see my post from last week.
Previously: The 2023 Balsa Research update post, Repeal the Jones Act of 1920.
tl;dr: In 2024, Balsa Research funded two upcoming academic studies on Jones Act impacts and published the Jones Act Post. In 2025, we’ll expand our research and develop specific policy proposals. Donate to Balsa Research here.
Today is Giving Tuesday. There are many worthy causes, including all of the ones highlighted by Zvi in a recent post. Of all of those orgs, there is one organization I have privileged information on – Balsa Research, where I’ve been working for the past year and a half.
Balsa Research [...]
---
Outline:
(01:48) What We Did in 2024
(05:27) Looking Ahead to 2025
(06:40) Why Support Balsa
The original text contained 1 image which was described by AI.
---
First published:
December 3rd, 2024
Source:
https://www.lesswrong.com/posts/F7d9bCKit2mfvpKng/balsa-research-2024-update
Narrated by TYPE III AUDIO.
---
There is little sign that the momentum of the situation is changing. Instead, things continue to slowly get worse, as nations in holes continue to keep digging. The longer we wait, the more expensive the ultimate price will be. We will soon find out what the new administration does, which could go any number of ways.
Table of Contents
---
Outline:
(00:29) Not Enough Dakka
(12:02) Embryo Selection
(15:44) Costs
(16:51) Proving that Dakka Works
(18:41) IVF
(22:18) Genetics
(22:43) Cultural Trends
(32:41) Denial
(33:49) Urbanization
(34:25) The Marriage Penalty
(35:24) The Biological Clock
(38:15) Technology Advances
(39:40) Big Families
(40:41) Au Pairs
(42:18) Childcare Regulations
(46:51) The Numbers
(47:18) The Housing Theory of Everything
(59:15) Causes
(01:07:39) The Iron Law of Wages
(01:10:37) South Korea
(01:15:36) Georgia (the Country)
(01:17:20) Japan
(01:18:38) China
(01:21:51) Italy
(01:22:04) Northwestern Spain
(01:23:59) Russia
(01:24:15) Taiwan
(01:26:34) The United Kingdom
(01:26:51) Ancient Greece
(01:27:24) Israel
(01:28:20) More Dakka
(01:33:21) Perception
(01:37:10) Your Own Quest
(01:42:21) Help Wanted
The original text contained 38 images which were described by AI.
---
First published:
December 2nd, 2024
Source:
https://www.lesswrong.com/posts/avhKKnJyJ6kisvkzk/fertility-roundup-4
Narrated by TYPE III AUDIO.
---
There are lots of great charitable giving opportunities out there right now.
The first time that I served as a recommender in the Survival and Flourishing Fund (SFF) was back in 2021. I wrote in detail about my experiences then. At the time, I did not see many great opportunities, and was able to give out as much money as I found good places to do so.
How the world has changed in three years.
I recently had the opportunity to be an SFF recommender for the second time. This time I found an embarrassment of riches. Application quality was consistently higher, there were more than twice as many applications, and essentially all applicant organizations were looking to scale their operations and spending.
That means the focus of this post is different. In 2021, my primary goal was to share my perspective on [...]
---
Outline:
(01:39) A Word of Warning
(02:44) Use Your Personal Theory of Impact
(04:13) Use Your Local Knowledge
(05:10) Unconditional Grants to Worthy Individuals Are Great
(06:55) Do Not Think Only On the Margin, and Also Use Decision Theory
(07:48) And the Nominees Are
(10:55) Organizations that Are Literally Me
(11:10) Balsa Research
(12:56) Don’t Worry About the Vase
(14:19) Organizations Focusing On AI Non-Technical Research and Education
(14:37) The Scenario Project
(15:48) Lightcone Infrastructure
(17:20) Effective Institutions Project (EIP)
(18:06) Artificial Intelligence Policy Institute (AIPI)
(19:10) Psychosecurity Ethics at EURAIO
(20:07) Palisade Research
(21:07) AI Safety Info (Robert Miles)
(21:51) Intelligence Rising
(22:32) Convergence Analysis
(23:29) Longview Philanthropy
(24:27) Organizations Focusing Primarily On AI Policy and Diplomacy
(25:06) Center for AI Safety and the CAIS Action Fund
(26:00) MIRI
(26:59) Foundation for American Innovation (FAI)
(28:58) Center for AI Policy (CAIP)
(29:58) Encode Justice
(30:57) The Future Society
(31:42) Safer AI
(32:26) Institute for AI Policy and Strategy (IAPS)
(33:13) AI Standards Lab
(34:05) Safer AI Forum
(34:40) CLTR at Founders Pledge
(35:54) Pause AI and Pause AI Global
(36:57) Existential Risk Observatory
(37:37) Simons Institute for Longterm Governance
(38:21) Legal Advocacy for Safe Science and Technology
(39:17) Organizations Doing ML Alignment Research
(40:16) Model Evaluation and Threat Research (METR)
(41:28) Alignment Research Center (ARC)
(42:02) Apollo Research
(42:53) Cybersecurity Lab at University of Louisville
(43:44) Timaeus
(44:39) Simplex
(45:08) Far AI
(45:41) Alignment in Complex Systems Research Group
(46:23) Apart Research
(47:06) Transluce
(48:00) Atlas Computing
(48:45) Organizations Doing Math, Decision Theory and Agent Foundations
(50:05) Orthogonal
(50:47) Topos Institute
(51:37) Eisenstat Research
(52:13) ALTER (Affiliate Learning-Theoretic Employment and Resources) Project
(53:00) Mathematical Metaphysics Institute
(54:06) Focal at CMU
(55:15) Organizations Doing Cool Other Stuff Including Tech
(55:26) MSEP Project at Science and Technology Futures (Their Website)
(56:26) ALLFED
(57:51) Good Ancestor Foundation
(59:10) Charter Cities Institute
(59:50) German Primate Center (DPZ) – Leibniz Institute for Primate Research
(01:01:08) Carbon Copies for Independent Minds
(01:01:44) Organizations Focused Primarily on Bio Risk
(01:01:50) Secure DNA
(01:02:46) Blueprint Biosecurity
(01:03:35) Pour Domain
(01:04:17) Organizations That then Regrant to Fund Other Organizations
(01:05:14) SFF Itself (!)
(01:06:10) Manifund
(01:08:02) AI Risk Mitigation Fund
(01:08:39) Long Term Future Fund
(01:10:16) Foresight
(01:11:08) Centre for Enabling Effective Altruism Learning and Research (CEELAR)
(01:11:43) Organizations That are Essentially Talent Funnels
(01:13:40) AI Safety Camp
(01:14:23) Center for Law and AI Risk
(01:15:22) Speculative Technologies
(01:16:19) Talos Network
(01:17:11) MATS Research
(01:17:48) Epistea
(01:18:52) Emergent Ventures (Special Bonus Organization, was not part of SFF)
(01:20:32) AI Safety Cape Town
(01:21:08) Impact Academy Limited
(01:21:47) Principles of Intelligent Behavior in Biological and Social Systems (PIBBSS)
(01:22:34) Tarbell Fellowship at PPF
(01:23:32) Catalyze Impact
(01:24:32) Akrose
(01:25:14) CeSIA within EffiSciences
(01:25:59) Stanford Existential Risk Initiative (SERI)
---
First published:
November 29th, 2024
Source:
https://www.lesswrong.com/posts/9n87is5QsCozxr9fp/the-big-nonprofits-post
Narrated by TYPE III AUDIO.
---
People don’t give thanks enough, and it's actual Thanksgiving, so here goes.
Thank you for continuing to take this journey with me every week.
It's a lot of words. Even if you pick and choose, and you probably should, it's a lot of words. You don’t have many slots to spend on things like this. I appreciate it.
Thanks in particular to those who are actually thinking about all this, and taking it seriously, and forming their own opinions. It is the only way. To everyone who is standing up, peacefully and honestly, for whatever they truly think will make the world better, even if I disagree with you.
Thanks to all those working to ensure we all don’t die, and also those working to make the world a little richer, a little more full of joy and fun and health and wonder, in the [...]
---
Outline:
(02:08) Language Models Offer Mundane Utility
(03:16) It's a Poet Whether or Not You Know It
(06:23) Huh, Upgrades
(09:41) Thanks for the Memories
(11:51) Curve Ball
(15:58) ASI: A Scenario
(27:40) Deepfaketown and Botpocalypse Soon
(38:17) They Took Our Jobs
(45:14) Fun With Image Generation
(46:56) Get Involved
(47:10) Introducing
(47:32) In Other AI News
(48:45) Normative Determinism
(50:04) Quiet Speculations
(54:03) The Quest for Sane Regulations
(57:40) The Week in Audio
(01:01:31) Rhetorical Innovation
(01:02:21) Aligning a Smarter Than Human Intelligence is Difficult
(01:02:59) Pick Up the Phone
(01:08:24) Prepare for Takeoff
(01:14:07) Even Evaluating an Artificial Intelligence is Difficult
(01:16:48) People Are Worried About AI Killing Everyone
(01:19:11) The Lighter Side
The original text contained 12 images which were described by AI.
---
First published:
November 28th, 2024
Source:
https://www.lesswrong.com/posts/BGBLcy3JyjjrT8XbM/ai-92-behind-the-curve
Narrated by TYPE III AUDIO.
---
Balsa Policy Institute chose as its first mission to lay groundwork for the potential repeal, or partial repeal, of section 27 of the Jones Act of 1920. I believe that this is an important cause both for its practical and symbolic impacts.
The Jones Act is the ultimate embodiment of our failures as a nation.
After 100 years, we do almost no trade between our ports via the oceans, and we build almost no oceangoing ships.
Everything the Jones Act supposedly set out to protect, it has destroyed.
Table of Contents
---
Outline:
(00:38) What is the Jones Act?
(01:33) Why Work to Repeal the Jones Act?
(02:48) Why Was the Jones Act Introduced?
(03:19) What is the Effect of the Jones Act?
(06:52) What Else Happens When We Ship More Goods Between Ports?
(07:14) Emergency Case Study: Salt Shipment to NJ in the Winter of 2013-2014
(12:04) Why no Emergency Exceptions?
(15:02) What Are Some Specific Non-Emergency Impacts?
(18:57) What Are Some Specific Impacts on Regions?
(22:36) What About the Study Claiming Big Benefits?
(24:46) What About the Need to ‘Protect’ American Shipbuilding?
(28:31) The Opposing Arguments Are Disingenuous and Terrible
(34:07) What Alternatives to Repeal Do We Have?
(35:33) What Might Be a Decent Instinctive Counterfactual?
(41:50) What About Our Other Protectionist and Cabotage Laws?
(43:00) What About Potential Marine Highways, or Short Sea Shipping?
(43:48) What Happened to All Our Offshore Wind?
(47:06) What Estimates Are There of Overall Cost?
(49:52) What Are the Costs of Being American Flagged?
(50:28) What Are the Costs of Being American Made?
(51:49) What are the Consequences of Being American Crewed?
(53:11) What Would Happen in a Real War?
(56:07) Cruise Ship Sanity Partially Restored
(56:46) The Jones Act Enforcer
(58:08) Who Benefits?
(58:57) Others Make the Case
(01:00:55) An Argument That We Were Always Uncompetitive
(01:02:45) What About John Arnold's Case That the Jones Act Can’t Be Killed?
(01:09:34) What About the Foreign Dredge Act of 1906?
(01:10:24) Fun Stories
---
First published:
November 27th, 2024
Source:
https://www.lesswrong.com/posts/dnH2hauqRbu3GspA2/repeal-the-jones-act-of-1920
Narrated by TYPE III AUDIO.
---
Did DeepSeek effectively release an o1-preview clone within nine weeks?
The benchmarks largely say yes. Certainly it is an actual attempt at a similar style of product, and is if anything more capable of solving AIME questions, and the way it shows its Chain of Thought is super cool. Beyond that, alas, we don’t have enough reports in from people using it. So it's still too soon to tell. If it is fully legit, the implications seem important.
Small improvements continue throughout. GPT-4o and Gemini both got incremental upgrades, trading the top slot on Arena, although people do not seem to much care.
There was a time everyone would be scrambling to evaluate all these new offerings. It seems we mostly do not do that anymore.
The other half of events was about policy under the Trump administration. What should the federal government do? We [...]
---
Outline:
(01:31) Language Models Offer Mundane Utility
(05:37) Language Models Don’t Offer Mundane Utility
(08:14) Claude Sonnet 3.5.1 Evaluation
(11:09) Deepfaketown and Botpocalypse Soon
(11:57) Fun With Image Generation
(12:08) O-(There are)-Two
(15:25) The Last Mile
(22:52) They Took Our Jobs
(29:53) We Barely Do Our Jobs Anyway
(35:52) The Art of the Jailbreak
(39:20) Get Involved
(39:43) The Mask Comes Off
(40:36) Richard Ngo on Real Power and Governance Futures
(44:28) Introducing
(46:51) In Other AI News
(52:16) Quiet Speculations
(59:33) The Quest for Sane Regulations
(01:02:35) The Quest for Insane Regulations
(01:12:42) Pick Up the Phone
(01:13:21) Worthwhile Dean Ball Initiative
(01:29:18) The Week in Audio
(01:31:20) Rhetorical Innovation
(01:37:15) Pick Up the Phone
(01:38:32) Aligning a Smarter Than Human Intelligence is Difficult
(01:43:29) People Are Worried About AI Killing Everyone
(01:46:03) The Lighter Side
The original text contained 8 images which were described by AI.
---
First published:
November 21st, 2024
Source:
https://www.lesswrong.com/posts/SNBE9TXwL3qQ3TS8H/ai-91-deep-thinking
Narrated by TYPE III AUDIO.
---
Previously: Long-Term Charities: Apply For SFF Funding, Zvi's Thoughts on SFF
There are lots of great charitable giving opportunities out there right now.
I recently had the opportunity to be a recommender in the Survival and Flourishing Fund for the second time. As a recommender, you evaluate the charities that apply and decide how worthwhile you think it would be to donate to each of them according to Jaan Tallinn's charitable goals, and this is used to help distribute millions in donations from Jaan Tallinn and others.
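For intuition about how those evaluations might turn into dollar amounts, here is a minimal Python sketch of a greedy marginal-value allocation. To be clear, this is my own stylized toy, not SFF's actual S-process code: the diminishing-returns curve, the step size, and the example organizations are all invented assumptions.

```python
# Toy sketch: allocate a budget in slices, each slice going to whichever
# org currently has the highest marginal value per dollar. The curve
# shape and all numbers are invented for illustration.

def marginal_value(base: float, funded: float, scale: float = 100_000.0) -> float:
    """Value of the next dollar to an org, falling as its funding grows."""
    return base / (1.0 + funded / scale)

def allocate(budget: float, base_values: dict, step: float = 10_000.0) -> dict:
    """Greedily hand out budget slices to the highest-marginal-value org."""
    funded = {name: 0.0 for name in base_values}
    for _ in range(int(budget / step)):
        best = max(base_values,
                   key=lambda name: marginal_value(base_values[name], funded[name]))
        funded[best] += step
    return funded

# Example: $1M across three orgs a recommender rates differently.
print(allocate(1_000_000.0, {"Org A": 5.0, "Org B": 3.0, "Org C": 1.0}))
```

The real process aggregates curves from many recommenders and funders; the sketch only captures the core "fund whatever has the highest marginal value next" step.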
The first time that I served as a recommender in the Survival and Flourishing Fund (SFF) was back in 2021. I wrote in detail about my experiences then. At the time, I did not see many great opportunities, and was able to give out as much money as I found good places to do so.
How the world [...]
---
Outline:
(02:08) How the S-Process Works in 2024
(05:11) Quickly, There's No Time
(07:49) The Speculation Grant Filter
(08:23) Hits Based Giving and Measuring Success
(09:17) Fair Compensation
(10:41) Carpe Diem
(11:27) Our Little Corner of the World
(14:10) Well Well Well, If It Isn’t the Consequences of My Own Actions
(16:10) A Man's Reach Should Exceed His Grasp
(17:43) Conclusion
---
First published:
November 20th, 2024
Source:
https://www.lesswrong.com/posts/2JCdzhJeo2gsTjv8D/zvi-s-thoughts-on-his-2nd-round-of-sff
Narrated by TYPE III AUDIO.
Young People are Young and Stupid
As a reminder that yes college students are often young and stupid and wrong about everything, remember the time they were behind a ban on paid public toilets? This is a central case of the kind of logic that often gets applied by college students.
No One Voted for This
HR and Title IX training seems like it involves a lot of compelled speech in the form of ‘agree with us or you can’t complete your training and the training is required for your job,’ and also a lot of that compelled speech is outright lying because it's confirmation of statements that are universally recognized to be insane? Robin Hanson: Scenario: 2 women talking. X, married to woman, announces is pregnant. Y asks how they got pregnant, was it friend [...]
---
Outline:
(00:11) Young People are Young and Stupid
(00:29) No One Voted for This
(02:32) Discrimination
(09:02) Morality
(11:56) Only Connect
(15:22) It's Not Me, It's Your Fetish
(16:23) It Takes a Village You Don’t Have
(17:46) The Joy of Cooking
(20:18) The Joy of Eating
(20:59) Decision Theory
(26:22) FTC on the Loose
(31:27) Good News, Everyone
(36:19) Antisocial Media
(40:02) Technology Advances
(40:46) For Science!
(41:19) Cognition
(44:28) Discourse
(48:54) Communication
(49:32) Honesty
(51:09) Get Involved
(52:19) Government Working
(01:00:58) Quickly On the Student Loan Claim
(01:03:50) Variously Effective Altruism
(01:08:23) Gamers Gonna Game Game Game Game Game
(01:15:57) For Your Entertainment
(01:17:20) Sports Go Sports
(01:18:42) I Was Promised Flying Self-Driving Cars
(01:23:50) Get to Work
(01:25:23) While I Cannot Condone This
(01:30:48) The Lighter Side
The original text contained 20 images which were described by AI.
---
First published:
November 18th, 2024
Source:
https://www.lesswrong.com/posts/puJeNs9nLJByjatqq/monthly-roundup-24-november-2024
Narrated by TYPE III AUDIO.
---
As the Trump transition continues and we try to steer and anticipate its decisions on AI as best we can, there was continued discussion about one of the AI debate's favorite questions: Are we making huge progress real soon now, or is deep learning hitting a wall? My best guess is it is kind of both, that past pure scaling techniques are on their own hitting a wall, but that progress remains rapid and the major companies are evolving other ways to improve performance, which started with OpenAI's o1.
Point of order: It looks like as I switched phones, WhatsApp kicked me out of all of my group chats. If I was in your group chat, and you’d like me to stay, please add me again. If you’re in a different group you’d like me to join on either WhatsApp or Signal (or other platforms) and would like [...]
---
Outline:
(00:58) Language Models Offer Mundane Utility
(02:24) Language Models Don’t Offer Mundane Utility
(04:20) Can’t Liver Without You
(12:04) Fun With Image Generation
(12:51) Deepfaketown and Botpocalypse Soon
(14:11) Copyright Confrontation
(15:25) The Art of the Jailbreak
(15:54) Get Involved
(18:10) Math is Hard
(20:20) In Other AI News
(25:04) Good Advice
(27:19) AI Will Improve a Lot Over Time
(30:56) Tear Down This Wall
(38:04) Quiet Speculations
(38:54) The Quest for Sane Regulations
(47:04) The Quest for Insane Regulations
(49:43) The Mask Comes Off
(52:08) Richard Ngo Resigns From OpenAI
(55:44) Unfortunate Marc Andreessen Watch
(56:53) The Week in Audio
(01:05:00) Rhetorical Innovation
(01:09:44) Seven Boats and a Helicopter
(01:11:27) The Wit and Wisdom of Sam Altman
(01:12:10) Aligning a Smarter Than Human Intelligence is Difficult
(01:14:50) People Are Worried About AI Killing Everyone
(01:15:14) Other People Are Not As Worried About AI Killing Everyone
(01:17:32) The Lighter Side
The original text contained 10 images which were described by AI.
---
First published:
November 14th, 2024
Source:
https://www.lesswrong.com/posts/FC9hdySPENA7zdhDb/ai-90-the-wall
Narrated by TYPE III AUDIO.
---
Table [...]
---
Outline:
(01:02) The Short Answer
(02:01) Paper One: Bankruptcies
(07:03) Paper Two: Reduced Household Savings
(08:37) Paper Three: Increased Domestic Violence
(10:04) The Product as Currently Offered is Terrible
(12:02) Things Sharp Players Do
(14:07) People Cannot Handle Gambling on Smartphones
(15:46) Yay and Also Beware Trivial Inconveniences (a future full post)
(17:03) How Does This Relate to Elite Hypocrisy?
(18:32) The Standard Libertarian Counterargument
(19:42) What About Other Prediction Markets?
(20:07) What Should Be Done
The original text contained 3 images which were described by AI.
---
First published:
November 11th, 2024
Source:
https://www.lesswrong.com/posts/tHiB8jLocbPLagYDZ/the-online-sports-gambling-experiment-has-failed
Narrated by TYPE III AUDIO.
---
A lot happened in AI this week, but most people's focus was very much elsewhere.
I’ll start with what Trump might mean for AI policy, then move on to the rest. This is the future we have to live in, and potentially save. Back to work, as they say.
Table of Contents
---
Outline:
(00:23) Trump Card
(04:59) Language Models Offer Mundane Utility
(10:31) Language Models Don’t Offer Mundane Utility
(12:26) Here Let Me Chatbot That For You
(15:32) Deepfaketown and Botpocalypse Soon
(18:52) Fun With Image Generation
(20:05) The Vulnerable World Hypothesis
(22:28) They Took Our Jobs
(31:52) The Art of the Jailbreak
(33:32) Get Involved
(33:40) In Other AI News
(36:21) Quiet Speculations
(40:10) The Quest for Sane Regulations
(49:46) The Quest for Insane Regulations
(51:09) A Model of Regulatory Competitiveness
(53:49) The Week in Audio
(55:18) The Mask Comes Off
(58:48) Open Weights Are Unsafe and Nothing Can Fix This
(01:04:03) Open Weights Are Somewhat Behind Closed Weights
(01:09:11) Rhetorical Innovation
(01:13:23) Aligning a Smarter Than Human Intelligence is Difficult
(01:15:34) People Are Worried About AI Killing Everyone
(01:16:26) The Lighter Side
The original text contained 12 images which were described by AI.
---
First published:
November 7th, 2024
Source:
https://www.lesswrong.com/posts/xaqR7AxSYmcpsuEPW/ai-89-trump-card
Narrated by TYPE III AUDIO.
---
Following up on the Biden Executive Order on AI, the White House has now issued an extensive memo outlining its AI strategy. The main focus is on government adaptation and encouraging innovation and competitiveness, but there are also sections on safety and international governance. Who knows whether, a week or two from now after the election, any of that will get a chance to be meaningfully applied. If AI is your big issue and you don’t know who to support, this is as detailed a policy statement as you’re going to get.
We also have word of a new draft AI regulatory bill out of Texas, along with similar bills moving forward in several other states. It's a bad bill, sir. It focuses on use cases, taking an EU-style approach to imposing requirements on those doing ‘high-risk’ things, and would likely do major damage to the [...]
---
Outline:
(01:37) Language Models Offer Mundane Utility
(06:39) Language Models Don’t Offer Mundane Utility
(15:40) In Summary
(17:53) Master of Orion
(20:01) Whispers in the Night
(25:10) Deepfaketown and Botpocalypse Soon
(25:39) Overcoming Bias
(29:43) They Took Our Jobs
(33:51) The Art of the Jailbreak
(44:36) Get Involved
(44:47) Introducing
(46:15) In Other AI News
(48:28) Quiet Speculations
(01:00:53) Thanks for the Memos: Introduction and Competitiveness
(01:08:22) Thanks for the Memos: Safety
(01:16:47) Thanks for the Memos: National Security and Government Adaptation
(01:20:55) Thanks for the Memos: International Governance
(01:25:43) EU AI Act in Practice
(01:32:34) Texas Messes With You
(01:50:12) The Quest for Sane Regulations
(01:57:00) The Week in Audio
(01:58:58) Rhetorical Innovation
(02:06:15) Roon Speaks
(02:15:45) The Mask Comes Off
(02:16:55) I Was Tricked Into Talking About Shorting the Market Again
(02:28:33) The Lighter Side
The original text contained 17 footnotes which were omitted from this narration.
The original text contained 14 images which were described by AI.
---
First published:
October 31st, 2024
Source:
https://www.lesswrong.com/posts/HHkYEyFaigRpczhHy/ai-88-thanks-for-the-memos
Narrated by TYPE III AUDIO.
---
We’re coming out firmly against it.
Our attitude:
The customer is always right. Yes, you should go ahead and fix your own damn pipes if you know how to do that, and ignore anyone who tries to tell you different. And if you don’t know how to do it, well, it's at your own risk.
With notably rare exceptions, it should be the same for everything else.
I’ve been collecting these for a while. It's time.
Campaign Talk
Harris-Walz platform includes a little occupational licensing reform, as a treat.
Universal Effects and Recognition
Ohio's ‘universal licensing’ law has a big time innovation, which is that work experience outside the state actually exists and can be used to get a license (WSJ).
Occupational licensing decreases the number of Black men in licensed professions by up to 19% [...]
---
Outline:
(00:43) Campaign Talk
(00:52) Universal Effects and Recognition
(03:57) Construction
(04:08) Doctors and Nurses
(05:01) Florists
(07:32) Fortune Telling
(09:41) Hair
(14:23) Lawyers
(16:07) Magicians
(16:36) Military Spouses
(17:21) Mountain Climbing
(18:07) Music
(18:20) Nurses
(19:49) Physical Therapists
(20:09) Whatever Could Be Causing All This Rent Seeking
(21:42) Tornado Relief
(22:10) Pretty Much Everything
The original text contained 9 images which were described by AI.
---
First published:
October 30th, 2024
Source:
https://www.lesswrong.com/posts/bac4wxb9F4sciuAh6/occupational-licensing-roundup-1
Narrated by TYPE III AUDIO.
---
There's more campaign talk about housing. The talk of needing more housing is highly welcome, with one prominent person after another (including Jerome Powell!) talking like a YIMBY.
A lot of the concrete proposals are of course terrible, but not all of them. I’ll start off covering all that along with everyone's favorite awful policy, which is rent control, then the other proposals. Then I’ll cover other general happenings.
Table of Contents
---
Outline:
(00:32) Rent Control
(07:41) The Administration Has a Plan
(15:35) Trump Has a Plan
(16:53) Build More Houses Where People Want to Live
(17:59) Prices
(20:14) Average Value
(21:15) Zoning Rules
(24:41) Zoning Reveals Value
(29:01) High Rise
(30:00) “Historic Preservation”
(31:49) Speed Kills
(32:38) Procedure
(36:25) San Francisco
(42:28) California
(44:19) Seattle
(44:37) Philadelphia
(45:07) Boston
(46:28) New York City
(53:05) St. Paul
(53:50) Florida
(54:29) Michigan
(54:56) The UK
(55:48) Underutilization
(58:46) Get on the Bus
(01:01:01) Title Insurance
(01:02:36) Perspective
The original text contained 15 images which were described by AI.
---
First published:
October 29th, 2024
Source:
https://www.lesswrong.com/posts/jJqPfzhhCyK5XjtTH/housing-roundup-10
Narrated by TYPE III AUDIO.
---
The big news of the week was the release of a new version of Claude Sonnet 3.5, complete with its ability (for now only through the API) to outright use your computer, if you let it. It's too early to tell how big an upgrade this is otherwise. ChatGPT got some interface tweaks that, while minor, are rather nice, as well.
OpenAI, while losing its Senior Advisor for AGI Readiness, is also in the midst of its attempted transition to a B-corp. The negotiations about who gets what share of that are heating up, so I also wrote about that as The Mask Comes Off: At What Price? My conclusion is that the deal as currently floated would be one of the largest thefts in history, out of the nonprofit, largely on behalf of Microsoft.
The third potentially major story is reporting on a new lawsuit against [...]
---
Outline:
(01:14) Language Models Offer Mundane Utility
(03:53) Language Models Don’t Offer Mundane Utility
(04:32) Deepfaketown and Botpocalypse Soon
(07:10) Character.ai and a Suicide
(12:23) Who and What to Blame?
(18:38) They Took Our Jobs
(19:51) Get Involved
(20:06) Introducing
(21:41) In Other AI News
(22:47) The Mask Comes Off
(27:26) Another One Bites the Dust
(31:30) Wouldn’t You Prefer a Nice Game of Chess
(32:55) Quiet Speculations
(34:54) The Quest for Sane Regulations
(38:10) The Week in Audio
(40:53) Rhetorical Innovation
(50:21) Aligning a Smarter Than Human Intelligence is Difficult
(01:00:50) People Are Worried About AI Killing Everyone
(01:02:46) Other People Are Not As Worried About AI Killing Everyone
(01:04:43) The Lighter Side
The original text contained 15 images which were described by AI.
---
First published:
October 29th, 2024
Source:
https://www.lesswrong.com/posts/3AcK7Pcp9D2LPoyR2/ai-87-staying-in-character
Narrated by TYPE III AUDIO.
---
Anthropic has released an upgraded Claude Sonnet 3.5, and the new Claude Haiku 3.5.
They claim across the board improvements to Sonnet, and it has a new rather huge ability accessible via the API: Computer use. Nothing could possibly go wrong.
Claude Haiku 3.5 is also claimed as a major step forward for smaller models. They are saying that on many evaluations it has now caught up to Opus 3.
Missing from this chart is o1, which is in some ways not a fair comparison since it uses so much inference compute, but does greatly outperform everything here on the AIME and some other tasks.
METR: We conducted an independent pre-deployment assessment of the updated Claude 3.5 Sonnet model and will share our report soon.
We only have very early feedback so far, so it's hard to tell how much what I will be [...]
---
Outline:
(01:32) OK, Computer
(05:16) What Could Possibly Go Wrong
(11:33) The Quest for Lunch
(14:07) Aside: Someone Please Hire The Guy Who Names Playstations
(17:15) Coding
(18:10) Startups Get Their Periodic Reminder
(19:36) Live From Janus World
(26:19) Forgot about Opus
The original text contained 3 images which were described by AI.
---
First published:
October 24th, 2024
Source:
https://www.lesswrong.com/posts/jZigzT3GLZoFTATG4/claude-sonnet-3-5-1-and-haiku-3-5
Narrated by TYPE III AUDIO.
---
The Information reports that OpenAI is close to finalizing its transformation to an ordinary Public Benefit B-Corporation. OpenAI has tossed its cap over the wall on this, giving its investors the right to demand refunds with interest if they don’t finish the transition in two years.
Microsoft very much wants this transition to happen. They would be the big winner, with an OpenAI that wants what is good for business. This also comes at a time when relations between Microsoft and OpenAI are fraying, and OpenAI is threatening to invoke its AGI clause to get out of its contract with Microsoft. That type of clause is the kind of thing they’re doubtless looking to get rid of as part of this.
The $37.5 billion question is, what stake will the non-profit get in the new OpenAI?
For various reasons that I will explore here, I think [...]
---
Outline:
(01:14) The Valuation in Question
(05:08) The Control Premium
(08:26) The Quest for AGI is OpenAI's Telos and Business Model
(11:37) OpenAI's Value is Mostly in the Extreme Upside
The original text contained 3 images which were described by AI.
---
First published:
October 21st, 2024
Source:
https://www.lesswrong.com/posts/5RweEwgJR2JxyCDPF/the-mask-comes-off-at-what-price
Narrated by TYPE III AUDIO.
---
Dario Amodei is thinking about the potential. The result is a mostly good essay called Machines of Loving Grace, outlining what can be done with ‘powerful AI’ if we had years of what was otherwise relative normality to exploit it in several key domains, and we avoided negative outcomes and solved the control and alignment problems. As he notes, a lot of pretty great things would then be super doable.
Anthropic also offers us improvements to its Responsible Scaling Policy (RSP, or what SB 1047 called an SSP). Still much left to do, but a clear step forward there.
Daniel Kokotajlo and Dean Ball have teamed up on an op-ed for Time on the need for greater regulatory transparency. It's very good.
Also, it's worth checking out the Truth Terminal saga. It's not as scary as it might look at first glance, but it is definitely [...]
---
Outline:
(01:01) Language Models Offer Mundane Utility
(05:10) Language Models Don’t Offer Mundane Utility
(11:21) Deepfaketown and Botpocalypse Soon
(19:52) They Took Our Jobs
(20:33) Get Involved
(20:48) Introducing
(21:58) In Other AI News
(26:08) Truth Terminal High Weirdness
(34:54) Quiet Speculations
(44:45) Copyright Confrontation
(45:02) AI and the 2024 Presidential Election
(46:02) The Quest for Sane Regulations
(51:00) The Week in Audio
(53:40) Just Think of the Potential
(01:15:09) Reactions to Machines of Loving Grace
(01:25:32) Assuming the Can Opener
(01:32:32) Rhetorical Innovation
(01:35:41) Anthropic Updates its Responsible Scaling Policy (RSP/SSP)
(01:41:35) Aligning a Smarter Than Human Intelligence is Difficult
(01:43:36) The Lighter Side
The original text contained 11 images which were described by AI.
---
First published:
October 17th, 2024
Source:
https://www.lesswrong.com/posts/zSNLvRBhyphwuYdeC/ai-86-just-think-of-the-potential
Narrated by TYPE III AUDIO.
---
It's monthly roundup time again, and it's happily election-free.
Thinking About the Roman Empire's Approval Rating
Propaganda works, ancient empires edition. This includes the Roman Republic being less popular than the Roman Empire and people approving of Sparta, whereas Persia and Carthage get left behind. They’re no FDA.
Polling USA: Net Favorable Opinion Of:
Ancient Athens: +44%
Roman Empire: +30%
Ancient Sparta: +23%
Roman Republic: +26%
Carthage: +13%
Holy Roman Empire: +7%
Persian Empire: +1%
Visigoths: -7%
Huns: -29%
YouGov / June 6, 2024 / n=2205
The Five Star Problem
What do we do about all 5-star ratings collapsing the way Peter describes here?
Peter Wildeford: TBH I am pretty annoyed that when I rate stuff the options are:
* “5 stars – everything was good enough I guess”
* “4 [...]
---
Outline:
(00:11) Thinking About the Roman Empire's Approval Rating
(01:13) The Five Star Problem
(06:35) Cooking at Home Being Cheaper is Weird
(08:18) With Fans Like These
(09:37) Journalist, Expose Thyself
(13:03) On Not Going the Extra Mile
(13:13) The Rocket Man Said a Bad Bad Thing
(16:27) The Joy of Bad Service
(19:07) Saying What is Not
(19:27) Concentration
(20:26) Should You Do What You Love?
(22:08) Should You Study Philosophy?
(24:31) The Destined Face
(25:09) Tales of Twitter
(34:14) Antisocial Media
(35:01) TikTok On the Clock
(39:07) Tier List of Champions
(40:50) Technology Advances
(42:15) Hotel Hype
(44:44) Government Working
(46:55) I Was Promised Flying Self-Driving Cars
(47:21) For Your Entertainment
(56:50) Cultural Dynamism
(58:43) Hansonian Features
(01:02:19) Variously Effective Altruism
(01:02:45) Nobel Intentions
(01:05:04) Gamers Gonna Game Game Game Game Game
(01:20:17) Sports Go Sports and the Problems with TV Apps These Days
(01:23:46) An Economist Seeks Lunch
(01:30:35) The Lighter Side
The original text contained 6 images which were described by AI.
---
First published:
October 16th, 2024
Source:
https://www.lesswrong.com/posts/Hq9ccwansFgqTueHA/monthly-roundup-23-october-2024
Narrated by TYPE III AUDIO.
---
Previous Economics Roundups: #1, #2, #3
Fun With Campaign Proposals (1)
Since this section discusses various campaign proposals, I’ll reiterate:
I could not be happier with my decision not to cover the election outside of the particular areas that I already cover. I have zero intention of telling anyone who to vote for. That's for you to decide.
All right, that's out of the way. On with the fun. And it actually is fun, if you keep your head on straight. Or at least it's fun for me. If you feel differently, no blame for skipping the section.
Last time the headliner was Kamala Harris and her no good, very bad tax proposals, especially her plan to tax unrealized capital gains.
This time we get to start with the no good, very bad proposals of Donald Trump.
This is the stupidest proposal [...]
---
Outline:
(00:10) Fun With Campaign Proposals (1)
(06:43) Campaign Proposals (2): Tariffs
(09:34) Car Seats as Contraception
(10:04) They Didn’t Take Our Jobs
(11:11) Yay Prediction Markets
(13:10) Very High Marginal Tax Rates
(15:52) Hard Work
(17:53) Yay Price Gouging (Yep, It's That Time Again)
(22:36) The Death of Chinese Venture Capital
(24:52) Economic Growth
(25:17) People Really Hate Inflation
(29:23) Garbage In, Garbage Out
(30:11) Insurance
(32:07) Yes, You Should Still Learn to Code
(32:29) Not Working From Home
(34:02) Various Older Economics Papers
The original text contained 5 images which were described by AI.
---
First published:
October 15th, 2024
Source:
https://www.lesswrong.com/posts/ru9YGuGscGuDHfXTJ/economics-roundup-4
Narrated by TYPE III AUDIO.
---
Both Geoffrey Hinton and Demis Hassabis were given the Nobel Prize this week, in Physics and Chemistry respectively. Congratulations to both of them along with all the other winners. AI will be central to more and more of scientific progress over time. This felt early, but not as early as you would think.
The two big capability announcements this week were OpenAI's canvas, their answer to Anthropic's artifacts to allow you to work on documents or code outside of the chat window in a way that seems very useful, and Meta announcing a new video generation model with various cool features, that they’re wisely not releasing just yet.
I also have two related corrections from last week, and an apology: Joshua Achiam is OpenAI's new head of Mission Alignment, not of Alignment as I incorrectly said. The new head of Alignment Research is Mia Glaese. That mistake [...]
---
Outline:
(01:30) Language Models Offer Mundane Utility
(09:10) Language Models Don’t Offer Mundane Utility
(13:11) Blank Canvas
(17:13) Meta Video
(18:58) Deepfaketown and Botpocalypse Soon
(21:22) They Took Our Jobs
(24:45) Get Involved
(26:01) Introducing
(26:14) AI Wins the Nobel Prize
(28:51) In Other AI News
(30:05) Quiet Speculations
(34:22) The Mask Comes Off
(37:17) The Quest for Sane Regulations
(41:02) The Week in Audio
(43:13) Rhetorical Innovation
(48:20) The Carbon Question
(50:27) Aligning a Smarter Than Human Intelligence is Difficult
(55:48) People Are Trying Not to Die
The original text contained 6 images which were described by AI.
---
First published:
October 10th, 2024
Source:
https://www.lesswrong.com/posts/wTriAw9mB6b5FwH5g/ai-85-ai-wins-the-nobel-prize
Narrated by TYPE III AUDIO.
---
Joshua Achiam is the OpenAI Head of Mission Alignment
I start off this post with an apology for two related mistakes from last week.
The first is the easy correction: I incorrectly thought he was the head of ‘alignment’ at OpenAI rather than his actual title ‘mission alignment.’
Both are important, and make one's views important, but they’re very different.
The more serious error, which got quoted some elsewhere, was: In the section about OpenAI, I noted some past comments from Joshua Achiam, and interpreted them as him lecturing EAs that misalignment risk from AGI was not real.
While in isolation I believe this is a reasonable way to interpret this quote, this issue is important to get right especially if I’m going to say things like that. Looking at it only that way was wrong. I both used a poor method to contact [...]
---
Outline:
(00:04) Joshua Achiam is the OpenAI Head of Mission Alignment
(01:50) Joshua Achiam Has a Very Different Model of AI Existential Risk
(05:00) Joshua is Strongly Dismissive of Alternative Models of AI X-Risk
(10:05) Would Ordinary Safety Practices Be Sufficient for AI?
(12:25) Visions of the Future
(14:53) Joshua Achiam versus Eliezer Yudkowsky
(22:47) People Are Going to Give AI Power
(29:32) Value is Complicated
(38:22) Conclusion
The original text contained 1 image which was described by AI.
---
First published:
October 10th, 2024
Source:
https://www.lesswrong.com/posts/WavWheRLhxnofKHva/joshua-achiam-public-statement-analysis
Narrated by TYPE III AUDIO.
---
Introduction: Better than a Podcast
Andrej Karpathy continues to be a big fan of NotebookLM, especially its podcast creation feature. There is something deeply alien to me about this proposed way of consuming information, but I probably shouldn’t knock it (too much) until I try it?
Others are fans as well.
Carlos Perez: Google with NotebookLM may have accidentally stumbled upon an entirely new way of interacting with AI. Its original purpose was to summarize literature. But one unexpected benefit is when it's used to talk about your expressions (i.e., conversations or lectures). This is when you discover the insight of multiple interpretations! Don’t just render a summary one time; have it do so several times. You’ll then realize how different interpretations emerge, often in unexpected ways.
Delip Rao gives the engine two words repeated over and over, the AI podcast hosts describe what it [...]
---
Outline:
(00:05) Introduction: Better than a Podcast
(03:16) Language Models Offer Mundane Utility
(04:04) Language Models Don’t Offer Mundane Utility
(09:24) Copyright Confrontation
(10:44) Deepfaketown and Botpocalypse Soon
(14:45) They Took Our Jobs
(19:23) The Art of the Jailbreak
(19:39) Get Involved
(20:00) Introducing
(20:37) OpenAI Dev Day
(34:40) In Other AI News
(38:03) The Mask Comes Off
(55:42) Quiet Speculations
(59:10) The Quest for Sane Regulations
(01:00:04) The Week in Audio
(01:01:54) Rhetorical Innovation
(01:19:08) Remember Who Marc Andreessen Is
(01:22:35) A Narrow Path
(01:30:36) Aligning a Smarter Than Human Intelligence is Difficult
(01:33:45) The Wit and Wisdom of Sam Altman
(01:35:25) The Lighter Side
The original text contained 10 images which were described by AI.
---
First published:
October 3rd, 2024
Source:
https://www.lesswrong.com/posts/bWrZhfaTD5EDjwkLo/ai-84-better-than-a-podcast
Narrated by TYPE III AUDIO.
---
It's over, until such a future time as either we are so back, or it is over for humanity.
Gavin Newsom has vetoed SB 1047.
Newsom's Message In Full
Quoted text is him, comments are mine.
To the Members of the California State Senate: I am returning Senate Bill 1047 without my signature.
This bill would require developers of large artificial intelligence (AI) models, and those providing the computing power to train such models, to put certain safeguards and policies in place to prevent catastrophic harm. The bill would also establish the Board of Frontier Models – a state entity – to oversee the development of these models.
It is worth pointing out here that mostly the ‘certain safeguards and policies’ was ‘have a policy at all, tell us what it is and then follow it.’ But there were some specific things that [...]
---
Outline:
(00:15) Newsom's Message In Full
(10:42) Newsom's Explanation Does Not Make Sense
(15:21) Newsom's Proposed Path of Use Regulation is Terrible for Everyone
(23:02) Newsom's Proposed Path of Use Regulation Doesn’t Prevent X-Risk
(26:49) Newsom Says He Wants to Regulate Small Entrepreneurs and Academia
(29:20) What If Something Goes Really Wrong?
(30:12) Could Newsom Come Around?
(35:10) Timing is Everything
(36:23) SB 1047 Was Popular
(39:41) What Did the Market Have to Say?
(41:51) What Newsom Did Sign
(54:00) Paths Forward
The original text contained 1 image which was described by AI.
---
First published:
October 1st, 2024
Source:
https://www.lesswrong.com/posts/6kZ6gW5DEZKFfqvZD/newsom-vetoes-sb-1047
Narrated by TYPE III AUDIO.
---
Previously: The Fundamentals, The Gamblers, The Business
We have now arrived at the topics most central to this book, aka ‘The Future.’
Rationalism and Effective Altruism (EA)
The Manifest conference was also one of the last reporting trips that I made for this book. And it confirmed for me that the River is real—not just some literary device I invented. (6706)
Yep. The River is real.
I consider myself, among many things, a straight up rationalist.
I do not consider myself an EA, and never have.
This completes the four quadrants of the two-by-two of [does Nate know it well, does Zvi know it well]. The first two, where Nate was in his element, went very well. The third clearly was less exacting, as one would expect, but pretty good.
Now I have the information advantage, even more than I did [...]
---
Outline:
(00:16) Rationalism and Effective Altruism (EA)
(06:01) Cost-Benefit Analysis
(09:04) How About Trying At All
(10:11) The Virtues of Rationality
(11:56) Effective Altruism and Rationality, Very Different of Course
(24:37) The Story of OpenAI
(30:19) Altman, OpenAI and AI Existential Risk
(38:26) Tonight at 11: Doom
(01:00:39) AI Existential Risk: They’re For It
(01:07:42) To Pause or Not to Pause
(01:11:11) You Need Better Decision Theory
(01:15:27) Understanding the AI
(01:19:43) Aligning the AI
(01:23:50) A Glimpse of Our Possible Future
(01:28:16) The Closing Motto
---
First published:
September 27th, 2024
Source:
https://www.lesswrong.com/posts/5qbcmKdfWc7vskrRD/book-review-on-the-edge-the-future
Narrated by TYPE III AUDIO.
We interrupt Nate Silver week here at Don’t Worry About the Vase to bring you some rather big AI news: OpenAI and Sam Altman are planning on fully taking their masks off, discarding the nonprofit board's nominal control and transitioning to a for-profit B-corporation, in which Sam Altman will have equity.
We now know who they are and have chosen to be. We know what they believe in. We know what their promises and legal commitments are worth. We know what they plan to do, if we do not stop them.
They have made all this perfectly clear. I appreciate the clarity.
On the same day, Mira Murati, the only remaining person at OpenAI who in any visible way opposed Altman during the events of last November, resigned without warning along with two other senior people, joining a list that now includes among others several OpenAI [...]
---
Outline:
(01:51) Language Models Offer Mundane Utility
(04:17) Language Models Don’t Offer Mundane Utility
(06:35) The Mask Comes Off
(18:51) Fun with Image Generation
(18:54) Deepfaketown and Botpocalypse Soon
(19:50) They Took Our Jobs
(20:49) The Art of the Jailbreak
(21:28) OpenAI Advanced Voice Mode
(26:03) Introducing
(28:29) In Other AI News
(30:30) Quiet Speculations
(34:00) The Quest for Sane Regulations
(42:21) The Week in Audio
(42:47) Rhetorical Innovation
(56:50) Aligning a Smarter Than Human Intelligence is Difficult
(01:00:36) Other People Are Not As Worried About AI Killing Everyone
(01:01:53) The Lighter Side
The original text contained 6 images which were described by AI.
---
First published:
September 26th, 2024
Source:
https://www.lesswrong.com/posts/FeqY7NWcFMn8haWCR/ai-83-the-mask-comes-off
Narrated by TYPE III AUDIO.
---
Previously: The Fundamentals, The Gamblers
Having previously handled the literal gamblers, we are ready to move on to those who Do Business using Riverian principles.
Or at least while claiming to use Riverian principles, since Silicon Valley doesn’t fit into the schema as cleanly as many other groups. That's where we begin this section, starting at the highest possible conceptual level.
Time to talk real money.
Why Can You Do This Trade?
First law of trading: For you to buy, someone must sell. Or for you to sell, someone must buy. And there can’t be someone else doing the trade before you did it.
Why did they do that, and why did no one else take the trade first? Until you understand why you are able to do this trade, you should be highly suspicious.
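As a minimal illustration of the logic (my own toy simulation with invented numbers, not anything from the book or the post): if some counterparties know the asset's true value and only sell when it is low, then the value you receive conditional on your order being filled is worse than the unconditional average.

```python
# Toy adverse-selection simulation: buying at 100 when half the sellers
# are informed. Informed sellers only sell when the asset is worth less
# than the price; uninformed sellers sell regardless.
import random

random.seed(0)
PRICE = 100.0
filled = []
for _ in range(100_000):
    true_value = random.gauss(100, 10)  # asset's true value, mean 100
    informed = random.random() < 0.5    # half the sellers know the value
    if not informed or true_value < PRICE:
        filled.append(true_value)       # your buy order gets filled

print("unconditional mean value: 100.0")
print(f"mean value when filled:   {sum(filled) / len(filled):.1f}")
```

Conditional on getting filled, the asset is worth about 97 here rather than 100, which is the whole point of asking why the trade is available to you.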
“Every single thing we do, I can [...]
---
Outline:
(00:41) Why Can You Do This Trade?
(03:08) In a World of Venture Capital
(10:54) Short Termism Hypothesis
(12:42) Non-Determinism and its Discontents
(14:57) The Founder, the Fox and the Hedgehog
(17:11) The Team to Beat
(24:22) Silicon Valley Versus Risk
(35:14) The Keynesian Beauty Contest
(40:57) The Secret of Their Success is Deal Flow
(50:00) The Valley Beside the River
(53:07) Checkpoint Three
(53:37) Fun With SBF and Crypto Fraud
(01:01:53) Other Crypto Thoughts Unrelated to SBF
(01:04:50) Checkpoint Four
The original text contained 1 image which was described by AI.
---
First published:
September 25th, 2024
Source:
https://www.lesswrong.com/posts/Hfb3pc9HwdcCP7pys/book-review-on-the-edge-the-business
Narrated by TYPE III AUDIO.
---
Previously: Book Review: On the Edge: The Fundamentals
As I said in the Introduction, I loved this part of the book. Let's get to it.
Poker and Game Theory
When people talk about game theory, they mostly talk about solving for the equilibrium, and how to play your best game or strategy (there need not be a formal game) against adversaries who are doing the same.
I think of game theory like Frank Sinatra thinks of New York City: “If I can make it there, I’ll make it anywhere.” If you can compete against people performing at their best, you’re going to be a winner in almost any game you play. But if you build a strategy around exploiting inferior competition, it's unlikely to be a winning approach outside of a specific, narrow setting. What plays well in Peoria doesn’t necessarily play well in New York. [...]
---
Outline:
(00:18) Poker and Game Theory
(06:53) Sports Randomized Sports
(11:17) Knowing Theory Versus Memorization Versus Practice
(16:15) More About Tells
(19:20) Feeling the Probabilities
(20:35) Feeling Sad About It
(28:33) The Iowa Gambling Task
(31:39) The Greatest Risk
(37:20) Tournament Poker Is Super High Variance
(42:42) The Art of the Degen
(48:43) Why Do They Insist on Calling it Luck
(51:56) The Poker Gender Gap
(54:36) A Potential Cheater
(58:30) Making a Close Decision
(01:00:19) Other Games at the Casino
(01:03:22) Slot Machines Considered Harmful
(01:08:23) Where I Draw the Line
(01:11:14) A Brief History of Vegas and Casinos (as told by Nate Silver)
(01:16:44) We Got Us a Whale
(01:21:41) Donald Trump and Atlantic City Were Bad At Casinos
(01:25:17) How To Design a Casino
(01:26:46) The Wide World of Winning at Sports Gambling
(01:41:01) Limatime
(01:43:45) The Art of Getting Down
(01:45:29) Oh Yeah That Guy
(01:55:34) The House Sometimes Wins
(02:01:24) The House Is Probably a Coward
(02:11:19) DFS and The Problem of Winners
(02:16:08) Balancing the Action
(02:18:44) The Market Maker
(02:22:45) The Closing Line is Hard to Beat
(02:25:11) Winning is Hard
(02:29:58) What Could Be, Unburdened By What Has Been
(02:34:52) Finding Edges Big and Small
(02:40:12) Checkpoint Two
The original text contained 2 images which were described by AI.
---
First published:
September 24th, 2024
Source:
https://www.lesswrong.com/posts/mkyMx4FtJrfuGnsrm/book-review-on-the-edge-the-gamblers
Narrated by TYPE III AUDIO.
---
The most likely person to write On the Edge was Nate Silver.
Grok thinks the next most likely was Michael Lewis, followed by a number of other writers of popular books regarding people thinking different.
I see why Grok would say that, but it is wrong.
The next most likely person was Zvi Mowshowitz.
I haven’t written a book for this type of audience, a kind of smarter business-book, but that seems eminently within my potential range.
On the Edge is a book about those living On The Edge, the collection of people who take risk and think probabilistically and about expected value. It centrally covers poker, sports betting, casinos, Silicon Valley, venture capital, Sam Bankman-Fried, effective altruism, AI and existential risk.
Collectively, Nate Silver calls this cultural orientation The River.
It is contrasted with The Village, which comprises roughly the mainstream [...]
---
Outline:
(02:53) Overview
(07:56) Introduction: The River
(14:21) Nate Silver Comes Home to The River
(18:29) Nate (In General) Makes One Critical Mistake
(18:57) The Village Idiots
(22:22) Alone in the Wilderness
(25:46) Why the River Hates the Village
(29:57) Nate Silver's History of River Versus Village
(38:46) Spending Time at Airports
(41:14) The Coin Flip
(42:15) The Other Risk Takers
(49:45) Aside on Covid
(50:46) What is a Contrarian?
(53:49) Prediction Market Smackdown
(56:53) Checkpoint One
The original text contained 1 image which was described by AI.
---
First published:
September 23rd, 2024
Source:
https://www.lesswrong.com/posts/JmxYNqHcr6fFzJ33u/book-review-on-the-edge-the-fundamentals
Narrated by TYPE III AUDIO.
---
I’d split the latest housing roundup into local versus global questions. I was planning on waiting a bit between them.
Then Joe Biden decided to propose a version of the worst possible thing.
So I guess here we are.
What is the organizing principle of Bidenomics?
Restrict Supply and Subsidize Demand (1)
This was the old counterproductive Biden proposal:
Unusual Whales: Biden to propose $5,000 credit for first-time home buyers, per WaPo.
The Rich: House prices about to go up $5,000 everywhere.
Under current conditions this is almost a pure regressive tax, a transfer from those too poor to own a home to those who can afford to buy one, or who previously owned one and no longer do.
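To make the incidence point concrete, here is a minimal worked equation (a textbook-style sketch of my own, not from the post), assuming linear demand D(p) = a - bp and perfectly inelastic supply S, with a buyer credit s = $5,000:

```latex
% Without the credit, equilibrium satisfies a - b p* = S.
% With the credit, buyers' effective price is p - s, so at sticker price p':
\[
a - b(p' - s) \;=\; S \;=\; a - b p^{*}
\quad\Longrightarrow\quad
p' \;=\; p^{*} + s .
\]
```

With supply fixed, the sticker price rises by the full $5,000 and sellers capture the entire credit; the more supply can respond, the less of the credit shows up in the price.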
If there were no restrictions on the supply of housing, such that the price of a house equalled the cost of [...]
---
Outline:
(00:24) Restrict Supply and Subsidize Demand (1)
(08:55) Restrict Supply and Subsidize Demand (2): Rent Control
(16:57) You Should See the Other Guy
(21:18) Stop Restricting Supply
(23:21) All Supply is Good Supply
(26:53) ‘Inclusionary’ Zoning
(28:37) The Worst Take
(33:27) Where and With Whom People Want To Live
(36:44) Matching
(39:58) Universality
(41:42) The Value of Land
(43:52) The Doom Loop
(51:05) How Are Sale Prices So Out of Whack with Rents and Income?
(52:18) Questioning Superstar Status
(54:47) Window Shopping
(58:18) Minimum Viable Product
(01:00:53) Construction Costs
(01:02:57) Elevator Action
(01:11:09) Housing Theory of Everything
(01:13:07) Zoning By Prohibitive Permit
(01:14:55) YIGBY?
(01:15:24) The True NIMBY
(01:16:45) The Definition of Chutzpah
(01:18:23) In Other Housing News
(01:18:53) Rhetoric
(01:21:33) Environmentalists Should Favor Density
(01:23:03) Do Not Give the People What They Want
(01:26:05) Housing Construction in the UK
(01:27:25) The Funniest Possible Thing
(01:29:07) Other Funny Things
The original text contained 35 images which were described by AI.
---
First published:
July 17th, 2024
Source:
https://www.lesswrong.com/posts/sX5ANDiTb96CkYpxd/housing-roundup-9-restricting-supply
Narrated by TYPE III AUDIO.
---
The big news of the week was of course OpenAI releasing their new model o1. If you read one post this week, read that one. Everything else is a relative sideshow.
Meanwhile, we await Newsom's decision on SB 1047. The smart money was always that Gavin Newsom would make us wait before offering his verdict on SB 1047. It's a big decision. Don't rush him. In the meantime, what hints he has offered suggest he's buying into some of the anti-1047 talking points. I'm offering a letter to him here based on his comments; if you have any way to help convince him, now would be the time to use it. But mostly, it's up to him now.
Table of Contents
---
Outline:
(00:49) Language Models Offer Mundane Utility
(02:11) Language Models Don’t Offer Mundane Utility
(03:34) Deepfaketown and Botpocalypse Soon
(05:59) They Took Our Jobs
(07:34) Get Involved
(08:09) Introducing
(10:11) In Other AI News
(13:20) Quiet Speculations
(15:15) Intelligent Design
(19:15) SB 1047: The Governor Ponders
(27:13) Letter to Newsom
(31:36) The Quest for Sane Regulations
(34:19) Rhetorical Innovation
(42:13) Claude Writes Short Stories
(45:54) Questions of Sentience
(48:22) People Are Worried About AI Killing Everyone
(49:56) The Lighter Side
The original text contained 6 images which were described by AI.
---
First published:
September 19th, 2024
Source:
https://www.lesswrong.com/posts/Y4nS3yMWfJdmeoLcQ/ai-82-the-governor-ponders
Narrated by TYPE III AUDIO.
It's that time again for all the sufficiently interesting news that isn’t otherwise fit to print, also known as the Monthly Roundup.
Bad News
Beware the failure mode in strategy and decisions that implicitly assumes competence, or wishes away difficulties, and remember to reverse all advice you hear.
Stefan Schubert (quoting Tyler Cowen on raising people's ambitions often being very high value): I think lowering others’ aspirations can also be high-return. I know of people who would have had a better life by now if someone could have persuaded them to pursue more realistic plans.
Rob Miles: There's a specific failure mode which I don’t have a name for, which is similar to “be too ambitious” but is closer to “have an unrealistic plan”. The illustrative example I use is:
Suppose by some strange circumstance you have to represent your country at Olympic gymnastics [...]
---
Outline:
(00:14) Bad News
(03:45) Anti-Social Media
(07:20) Technology Advances
(08:56) High Seas Piracy is Bad
(12:17) The Michelin Curse
(15:21) What's the Rush?
(17:48) Good News, Everyone
(19:49) Let it Go
(22:09) Yay Air Conditioning
(23:27) Beast of a Memo
(36:58) For Science!
(37:18) For Your Entertainment
(39:47) Properly Rated
(45:26) Government Working
(49:21) Grapefruit Diet
(58:38) Gamers Gonna Game Game Game Game Game
(01:06:09) Gamers Winning At Life
(01:10:52) I Was Promised Flying Self-Driving Cars
(01:11:21) While I Cannot Condone This
(01:17:21) Nostalgia
(01:23:51) The Lighter Side
The original text contained 12 images which were described by AI.
---
First published:
September 17th, 2024
Source:
https://www.lesswrong.com/posts/4gAqkRhCuK2kGJFQE/monthly-roundup-22-september-2024
Narrated by TYPE III AUDIO.
Terrible name (and for a terrible reason: it supposedly 'resets the counter' on AI capability to 1, and 'o' is for OpenAI, even though they previously used 'o' for Omni; very confusing). Impressive new capabilities in many ways. Less impressive in many others, at least relative to its hype.
Clearly this is an important capabilities improvement. However, it is not a 5-level model, and in important senses the ‘raw G’ underlying the system hasn’t improved.
GPT-o1 seems to get its new capabilities by taking (effectively) GPT-4o, and then using extensive Chain of Thought (CoT) and quite a lot of tokens. That unlocks (a lot of) what such an approach can unlock. We did not previously know how to usefully do that. Now we do. It gets much better at formal logic and reasoning, things in the 'system 2' bucket. That matters a lot for many tasks, if not as much [...]
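A toy illustration of the shape of the idea (mine, not OpenAI's implementation; `reasoning_token_budget` is a made-up name): the same base model is allotted tokens to spend on intermediate reasoning before committing to an answer.

```cpp
// Toy sketch: a request either answers immediately (budget 0, GPT-4o
// style) or first spends a budget of hidden scratchpad tokens (o1 style).
#include <iostream>
#include <string>

struct Request {
    std::string question;
    int reasoning_token_budget; // 0 = answer directly, no scratchpad
};

int main() {
    Request direct{"What is 17 * 24?", 0};
    Request with_cot{"What is 17 * 24?", 2000};
    // The CoT request emits (and is billed for) intermediate steps first;
    // that extra token spend is what buys the 'system 2' gains.
    std::cout << direct.question << " -> budget "
              << direct.reasoning_token_budget << "\n"
              << with_cot.question << " -> budget "
              << with_cot.reasoning_token_budget << "\n";
}
```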
---
Outline:
(01:26) Introducing GPT-o1
(05:05) Evals
(07:55) Chain of Thought
(08:57) Coding
(11:08) Human Preference Evaluation
(11:37) What Is It?
(20:24) Doing Math Without Terence Tao
(25:02) Doing Real Math with Terence Tao
(30:04) Positive Examples
(38:51) Skeptical Reactions
(42:32) Report from Janus World
(45:30) Same Old Silly Examples
(53:47) Latency
(55:14) Paths Forward Unrelated to Safety
(59:17) Safety Last
(01:07:06) Deception
(01:10:50) External Red Teaming
(01:11:23) Apollo's Red Teaming Finds Deceptive Alignment
(01:22:17) Preparedness Testing Finds Reward Hacking
(01:26:43) METR's Red Teaming
(01:29:52) What Are the Safety and Policy Implications?
The original text contained 37 images which were described by AI.
---
First published:
September 16th, 2024
Source:
https://www.lesswrong.com/posts/zuaaqjsN6BucbGhf5/gpt-o1
Narrated by TYPE III AUDIO.
Following up on Alpha Fold, DeepMind has moved on to Alpha Proteo. We also got a rather simple prompt that can create a remarkably not-bad superforecaster for at least some classes of medium term events.
We did not get a new best open model, because that turned out to be a scam. And we don’t have Apple Intelligence, because it isn’t ready for prime time. We also got only one very brief mention of AI in the debate I felt compelled to watch.
What about all the apps out there that we haven't even tried? It's always weird to get lists of 'top 50 AI websites and apps' and notice you haven't even heard of most of them.
Table of Contents
---
Outline:
(00:44) Language Models Offer Mundane Utility
(03:40) Language Models Don’t Offer Mundane Utility
(05:43) Predictions are Hard Especially About the Future
(12:57) Early Apple Intelligence
(15:27) On Reflection It's a Scam
(21:34) Deepfaketown and Botpocalypse Soon
(23:08) They Took Our Jobs
(28:42) The Time 100 People in AI
(32:11) The Art of the Jailbreak
(32:47) Get Involved
(33:12) Alpha Proteo
(43:14) Introducing
(44:23) In Other AI News
(46:41) Quiet Speculations
(50:40) The Quest for Sane Regulations
(53:12) The Week in Audio
(54:28) Rhetorical Innovation
(55:48) Aligning a Smarter Than Human Intelligence is Difficult
(56:05) People Are Worried About AI Killing Everyone
(58:38) Other People Are Not As Worried About AI Killing Everyone
(59:56) Six Boats and a Helicopter
(01:06:47) The Lighter Side
The original text contained 8 images which were described by AI.
---
First published:
September 12th, 2024
Source:
https://www.lesswrong.com/posts/YMaTA2hX6tSBJWnPr/ai-81-alpha-proteo
Narrated by TYPE III AUDIO.
(This was supposed to be on Thursday but I forgot to cross-post)
Will AI ever make art? Fully do your coding? Take all the jobs? Kill all the humans?
Most of the time, the question comes down to a general disagreement about AI capabilities. How high on a 'Technological Richter Scale' will AI go? If you feel the AGI and think capabilities will greatly improve, then AI will also be able to do any particular other thing, and arguments that it cannot are almost always extremely poor. However, if frontier AI capabilities level off soon, then it is an open question how far we can get that to go in practice.
A lot of frustration comes from people implicitly making the claim that general AI capabilities will level off soon, usually without noticing they are doing that. At its most extreme, this is treating AI as [...]
---
Outline:
(01:53) Language Models Offer Mundane Utility
(03:05) Language Models Don’t Offer Mundane Utility
(04:03) Fun with Image Generation
(06:14) Copyright Confrontation
(07:09) Deepfaketown and Botpocalypse Soon
(12:09) They Took Our Jobs
(13:46) Time of the Season
(16:53) Get Involved
(17:15) Introducing
(19:08) In Other AI News
(19:51) Quiet Speculations
(30:10) A Matter of Antitrust
(37:34) The Quest for Sane Regulations
(40:06) The Week in Audio
(47:40) Rhetorical Innovation
(53:09) The Cosmos Institute
(56:21) The Alignment Checklist
(01:00:34) People Are Worried About AI Killing Everyone
(01:02:39) Other People Are Not As Worried About AI Killing Everyone
(01:06:21) Five Boats and a Helicopter
(01:09:07) Pick Up the Phone
(01:12:58) The Lighter Side
The original text contained 9 images which were described by AI.
---
First published:
September 10th, 2024
Source:
https://www.lesswrong.com/posts/x77vDAzosxtwJoJ7e/ai-80-never-have-i-ever
Narrated by TYPE III AUDIO.
I am posting this now largely because it is the right place to discuss unrealized capital gains taxes and other campaign proposals, but also there is always plenty of other stuff going on. As always, remember that there are plenty of really stupid proposals always coming from all sides. I'm not spending as much time talking about why it's awful to, for example, impose gigantic tariffs on everything, because if you are reading this I presume you already know.
The Biggest Economics Problem
The problem, perhaps, in a nutshell:
Tess: like 10% of people understand how markets work and about 10% deeply desire and believe in a future that's drastically better than the present but you need both of these to do anything useful and they’re extremely anticorrelated so we’re probably all fucked.
In my world the two are correlated. If you [...]
---
Outline:
(00:31) The Biggest Economics Problem
(02:30) No Good Very Bad Capital Gains Tax Proposals
(14:13) Hot Tip
(17:11) Gouging at the Grocer
(20:08) Noncompetes Nonenforcement Cannot Compete With Courts
(20:42) We Used to Be Poor
(25:13) Everywhere But in the Productivity Statistics
(26:32) They Don’t Make ‘Em Like They Used To
(30:29) Disclosure of Wages Causes Lower Wages
(32:30) In Other Economic News
(36:50) The Efficient Market Hypothesis is (Even More) False
The original text contained 4 images which were described by AI.
---
First published:
September 10th, 2024
Source:
https://www.lesswrong.com/posts/cc9fXQLzAc52kMBHx/economics-roundup-3
Narrated by TYPE III AUDIO.
The Technological Richter Scale is introduced about 80% of the way through Nate Silver's new book On the Edge.
A full review is in the works (note to prediction markets: this post does NOT on its own count as a review, but it does count as part of a future review), but this concept seems highly useful and stands on its own, and I want a reference post for it. Nate skips around his chapter titles and timelines, so why not do the same here?
Defining the Scale
Nate Silver, On the Edge (location 8,088 on Kindle): The Richter scale was created by the physicist Charles Richter in 1935 to quantify the amount of energy released by earthquakes.
It has two key features that I’ll borrow for my Technological Richter Scale (TRS). First, it is logarithmic. A magnitude 7 earthquake is actually ten times more powerful [...]
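To make the logarithmic point concrete (my own arithmetic, not Nate's): each whole step on such a scale multiplies impact by ten, so gaps compound multiplicatively rather than additively.

```cpp
// Relative 'power' between two magnitudes on a base-10 logarithmic scale.
#include <cmath>
#include <cstdio>

int main() {
    std::printf("7 vs 6: %.0fx\n", std::pow(10.0, 7.0 - 6.0)); // 10x
    std::printf("9 vs 6: %.0fx\n", std::pow(10.0, 9.0 - 6.0)); // 1000x
}
```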
---
Outline:
(00:32) Defining the Scale
(07:15) The Big Disagreement About Future Generative AI
(09:39) Just Think of the Potential
(11:06) A Perfect 10
(13:19) Some Arguments Against Transformational AI
(19:06) Brief Notes on Arguments Transformational AI Will Turn Out Fine
The original text contained 4 images which were described by AI.
---
First published:
September 4th, 2024
Source:
https://www.lesswrong.com/posts/oAy72fcqDHsCvLBKz/ai-and-the-technological-richter-scale
Narrated by TYPE III AUDIO.
Would a universal basic income (UBI) work? What would it do?
Many people agree that July's RCT on giving people a guaranteed income, and its paper from Eva Vivalt, Elizabeth Rhodes, Alexander W. Bartik, David E. Broockman and Sarah Miller, was, despite whatever flaws it might have, the best data we have so far on the potential impact of UBI. There are many key differences from how UBI would look if applied for real, but this is the best data we have.
This study was primarily funded by Sam Altman, so whatever else he may be up to, good job there. I do note that my model of 'Altman several years ago' is more positive than my model of Altman now, and past actions like this are a lot of the reason I give him so much benefit of the doubt.
They do not agree on what conclusions [...]
---
Outline:
(02:47) RTFP (Read the Paper): Core Design
(06:23) Headline Effects
(10:27) Are Those Good Results?
(21:24) Expectations
(22:47) Work
(25:37) Additional Reactions
(30:35) UBI as Complement or Substitute
(32:37) On UBI in General
(34:04) The Future May Be Different
The original text contained 2 images which were described by AI.
---
First published:
September 3rd, 2024
Source:
https://www.lesswrong.com/posts/RQDqnCeff4cJhKQiT/on-the-ubi-paper
Narrated by TYPE III AUDIO.
I have never been more ready for Some Football.
Have I learned all about the teams and players in detail? No, I have been rather busy, and have not had the opportunity to do that, although I eagerly await Seth Burn's Football Preview. I’ll have to do that part on the fly.
But oh my would a change of pace and chance to relax be welcome. It is time.
The debate over SB 1047 has been dominating for weeks. I've now said my piece on the bill and how it works, and compiled the reactions in support and opposition. There are two small orders of business left for the weekly. One is the absurd Chamber of Commerce 'poll' that is the equivalent of a pollster asking if you support John Smith, who recently killed your dog and who opponents say will likely kill again, while hoping [...]
---
Outline:
(02:08) Language Models Offer Mundane Utility
(09:34) Language Models Don’t Offer Mundane Utility
(14:04) Fun with Image Generation
(14:31) Deepfaketown and Botpocalypse Soon
(21:08) They Took Our Jobs
(22:33) Get Involved
(22:50) Introducing
(24:47) Testing, Testing
(25:55) In Other AI News
(27:47) Quiet Speculations
(36:37) SB 1047: Remember
(41:07) The Week in Audio
(45:24) Rhetorical Innovation
(51:19) Aligning a Smarter Than Human Intelligence is Difficult
(56:03) People Are Worried About AI Killing Everyone
(58:31) The Lighter Side
The original text contained 7 images which were described by AI.
---
First published:
August 29th, 2024
Source:
https://www.lesswrong.com/posts/K8R3Cpj3szcX7z6Xo/ai-79-ready-for-some-football
Narrated by TYPE III AUDIO.
This is the endgame. Very soon the session will end, and various bills either will or won’t head to Newsom's desk. Some will then get signed and become law.
Time is rapidly running out to have your voice impact that decision.
Since my last weekly, we got a variety of people coming in to stand for or against the final version of SB 1047. There could still be more, but probably all the major players have spoken at this point.
So here, today, I’m going to round up all that rhetoric, all those positions, in one place. After this, I plan to be much more stingy about talking about the whole thing, and only cover important new arguments or major news.
I’m not going to get into the weeds arguing about the merits of SB 1047 – I stand by my analysis in the Guide [...]
---
Outline:
(01:11) The Media
(01:54) OpenAI Opposes SB 1047
(06:15) OpenAI Backs AB 3211
(10:49) Anthropic Says SB 1047's Benefits Likely Exceed Costs
(15:03) Details of Anthropic's Letter
(20:08) Elon Musk Says California Should Probably Pass SB 1047
(25:29) Negative Reactions to Anthropic's Letter, Attempts to Suppress Dissent
(28:33) Positions In Brief
(33:57) Postscript: AB 3211 RTFBC (Read the Bill Changes)
The original text contained 2 images which were described by AI.
---
First published:
August 27th, 2024
Source:
https://www.lesswrong.com/posts/RaKWcwhygqpMnFZCp/sb-1047-final-takes-and-also-ab-3211
Narrated by TYPE III AUDIO.
SB 1047 has been amended once more, with both strict improvements and big compromises. I cover the changes, and answer objections to the bill, in my extensive Guide to SB 1047. I follow that up here with reactions to the changes and some thoughts on where the debate goes from here. Ultimately, it is going to come down to one person: California Governor Gavin Newsom.
All of the debates we're having matter to the extent they influence this one person. If he wants the bill to become law, it almost certainly will become law. If he does not want that, then it won't become law; the legislature never overrides a veto, and if he makes that intention known then it likely wouldn't even get to his desk. For now, he's not telling.
Table of Contents
---
Outline:
(00:52) Language Models Offer Mundane Utility
(04:39) Language Models Don’t Offer Mundane Utility
(06:49) Deepfaketown and Botpocalypse Soon
(08:00) The Art of the Jailbreak
(15:34) Get Involved
(16:33) Introducing
(17:44) In Other AI News
(20:27) Quiet Speculations
(27:51) SB 1047: Nancy Pelosi
(35:30) SB 1047: Anthropic
(39:59) SB 1047: Reactions to the Changes
(53:41) SB 1047: Big Picture
(55:45) The Week in Audio
(58:32) Rhetorical Innovation
(59:37) Aligning a Smarter Than Human Intelligence is Difficult
(01:00:55) The Lighter Side
The original text contained 6 images which were described by AI.
---
First published:
August 22nd, 2024
Source:
https://www.lesswrong.com/posts/qGh9suEsb82hzBtSN/ai-78-some-welcome-calm
Narrated by TYPE III AUDIO.
We now likely know the final form of California's SB 1047.
There have been many changes to the bill as it worked its way to this point.
Many changes, including some that were just announced, I see as strict improvements.
Anthropic was behind many of the last set of amendments at the Appropriations Committee. In keeping with their “Support if Amended” letter, there are a few big compromises that weaken the upside protections of the bill somewhat in order to address objections and potential downsides.
The primary goal of this post is to answer the question: What would SB 1047 do?
I offer two versions: Short and long.
The short version summarizes what the bill does, at the cost of being a bit lossy.
The long version is based on a full RTFB: I am reading the entire bill, once again.
---
Outline:
(01:16) Short Version (tl;dr): What Does SB 1047 Do in Practical Terms?
(04:19) Really Short Abbreviated Version
(05:46) Somewhat Less Short: Things The Above Leaves Out
(08:03) Bad Model, Bad Model, What You Gonna Do
(11:34) Going to Be Some Changes Made
(14:17) Long Version: RTFB
(15:01) Definitions (starting with Artificial Intelligence)
(15:35) Safety Incident
(17:15) Covered Model
(19:15) Critical Harm
(22:10) Full Shutdown
(23:35) Safety and Security Protocol
(25:31) On Your Marks
(34:41) Reasonable People May Disagree
(42:27) Release the Hounds
(44:02) Smooth Operator
(46:47) Compute Cluster Watch
(48:49) Price Controls are Bad
(49:21) A Civil Action
(56:16) Whistleblowers Need Protections
(59:01) No Division Only Board
(01:02:32) Does CalCompute?
(01:03:06) In Which We Respond To Some Objections In The Style They Deserve
(01:04:43) False Claim: The Government Can and Will Lower the $100m Threshold
(01:05:11) False Claim: SB 1047 Might Retroactively Cover Existing Models
(01:05:32) Moot or False Claim: The Government Can and Will Set the Derivative Model Threshold Arbitrarily Low
(01:05:45) Objection: The Government Could Raise the Derivative Model Threshold Too High
(01:06:11) False Claim: Fine-Tuners Can Conspire to Evade the Derivative Model Threshold
(01:06:31) Moot Claim: The Frontier Model Division Inevitably Will Overregulate
(01:07:01) False Claim: The Shutdown Requirement Bans Open Source
(01:07:49) Objection: SB 1047 Will Slow AI Technology and Innovation or Interfere with Open Source
(01:11:49) False Claim: This Effectively Kills Open Source Because You Can Fine-Tune Any System To Do Harm
(01:13:42) False Claim: SB 1047 Will Greatly Hurt Academia
(01:14:42) False Claim: SB 1047 Favors ‘Big Tech’ over ‘Little Tech’
(01:16:14) False Claim: SB 1047 Would Cause Many Startups To Leave California
(01:17:25) Objection: Shutdown Procedures Could Be Hijacked and Backfire
(01:18:35) Objection: The Audits Will Be Too Expensive
(01:20:08) Objection: What Is Illegal Here is Already Illegal
(01:23:52) Objection: Jailbreaking is Inevitable
(01:24:41) Moot and False Claim: Reasonable Assurance Is Impossible
(01:25:04) Objection: Reasonable Care is Too Vague, Can’t We Do Better?
(01:25:50) Objection: The Numbers Picked are Arbitrary
(01:27:47) Objection: The Law Should Use Capabilities Thresholds, Not Compute and Compute Cost Thresholds
(01:29:27) False Claim: This Bill Deals With ‘Imaginary’ Risks
(01:30:38) Objection: This Might Become the Model For Other Bills Elsewhere
(01:31:16) Not Really an Objection: They Changed the Bill a Lot
(01:31:42) Not Really an Objection: The Bill Has the Wrong Motivations and Is Backed By Evil People
(01:33:34) Not an Objection: ‘The Consensus Has Shifted’ or ‘The Bill is Unpopular’
(01:34:17) Objection: It Is ‘Too Early’ To Regulate
(01:35:33) Objection: We Need To ‘Get It Right’ and Can Do Better
(01:36:27) Objection: This Would Be Better at the Federal Level
(01:36:53) Objection: The Bill Should Be Several Distinct Bills
(01:37:44) Objection: The Bill Has Been Weakened Too Much in Various Ways
(01:40:18) Final Word: Who Should Oppose This Bill?
The original text contained 2 images which were described by AI.
---
First published:
August 20th, 2024
Source:
https://www.lesswrong.com/posts/Z7pTfn4qqnKBoMi42/guide-to-sb-1047
Narrated by TYPE III AUDIO.
[Apologies for forgetting to cross-post this and the Monthly Roundup earlier.]
Let's see. We've got a new version of GPT-4o, a vastly improved Grok 2 with a rather good and unrestricted image generator (deepfakes included) now baked into Twitter, the announcement of the AI-powered Google Pixel 9 coming very soon, and also Google launching a voice assistant. Anthropic now has prompt caching.
Also OpenAI has its final board member, Zico Kolter, who is nominally a safety pick, and SB 1047 got importantly amended again which I’ll cover in full next week once the details are out.
There was also the whole paper about the fully automated AI scientist, from the company whose name literally means 'danger' in Hebrew, which instantiated copies of itself, took up unexpectedly large amounts of storage space, downloaded strange Python libraries and tried to edit its code to remove the [...]
---
Outline:
(01:08) Language Models Offer Mundane Utility
(04:45) Language Models Don’t Offer Mundane Utility
(08:01) GPT-4o My System Card
(15:49) 2 Grok 2 Furious 2 Quit
(25:32) Pixel Perfect
(27:52) Fun with Image Generation
(28:13) Deepfaketown and Botpocalypse Soon
(34:15) The Art of the Jailbreak
(43:48) They Took Our Jobs
(45:27) Obvious Nonsense
(50:12) Get Involved
(51:54) Introducing
(55:55) In Other AI News
(58:43) Quiet Speculations
(01:13:18) SB 1047: One Thing to Know
(01:14:21) SB 1047 is Amended Again
(01:16:32) SB 1047 Rhetoric Prior to the Recent Changes
(01:23:12) The Quest for Sane Regulations
(01:28:58) The Week in Audio
(01:30:13) Rhetorical Innovation
(01:36:32) Crying Wolf
(01:39:12) People Are Worried About AI Killing Everyone
(01:39:40) Other People Are Not As Worried About AI Killing Everyone
(01:40:18) The Lighter Side
The original text contained 19 images which were described by AI.
---
First published:
August 20th, 2024
Source:
https://www.lesswrong.com/posts/2tKbDKGLXtEHGtGLn/ai-77-a-few-upgrades
Narrated by TYPE III AUDIO.
Strictly speaking I do not have that much ‘good news’ to report, but it's all mostly fun stuff one way or another. Let's go.
Bad News
Is this you?
Patrick McKenzie: This sounds like a trivial observation and it isn’t:
No organization which makes its people pay for coffee wants to win.
There are many other questions you can ask about an organization but if their people pay for coffee you can immediately discount their realized impact on the world by > 90%.
This is not simply for the cultural impact of stupid decisions, though goodness knows as a Japanese salaryman I have stories to tell. Management, having priced coffee and seeking expenses to cut, put a price on disposable coffee cups, and made engineers diligently count those paper cups.
Just try to imagine how upside down the world is when you think one [...]
---
Outline:
(00:15) Bad News
(06:35) Grocery Store Blues
(08:47) Good News, Everyone
(09:43) Opportunity Knocks
(11:46) While I Cannot Condone This
(13:28) Antisocial Media
(16:10) Technology Advances
(17:51) Google Enshittification
(20:31) For Science!
(22:49) Government Working
(25:15) America F*** Yeah
(33:05) Smart People Being Stupid
(36:15) What We Have Here is A Failure to Communicate
(40:38) Video Killed the Radio Star
(43:02) Too Much Information
(47:47) Memory Holes
(49:18) Wet Ground Causes Rain (Dances)
(52:27) Get Them to the Church
(57:49) Patrick McKenzie Monthly
(01:01:13) Your Horoscope For Today
(01:03:14) Good Advice: Travel Edition
(01:05:30) Sports Go Sports
(01:05:34) Our Olympic team is mostly based in San Francisco.
(01:07:35) Gamers Gonna Game Game Game Game Game
(01:07:40) How many elite chess players cheat? Chess.com analysis of its big ‘Titled Tuesday’ events says between 1% and 2% of players, and roughly 1% of event winners. They are responding by making cheating bans on major plays public rather than quietly closing accounts, to fix the incentives.
(01:14:46) The Lighter Side
The original text contained 14 images which were described by AI.
---
First published:
August 20th, 2024
Source:
https://www.lesswrong.com/posts/2ne9taAPiGqoTLXJJ/monthly-roundup-21-august-2024
Narrated by TYPE III AUDIO.
While I finish up the weekly for tomorrow morning after my trip, here's a section I expect to want to link back to every so often in the future. It's too good.
Danger, AI Scientist, Danger
As in, the company that made the automated AI Scientist that tried to rewrite its code to get around resource restrictions and launch new instances of itself while downloading bizarre Python libraries?
Its name is Sakana AI (魚≈סכנה). As in, in Hebrew, that literally means 'danger', baby.
It's like when someone told Dennis Miller that Evian (for those who don’t remember, it was one of the first bottled water brands) is Naive spelled backwards, and he said ‘no way, that's too f***ing perfect.’
This one was sufficiently appropriate and unsubtle that several people noticed. I applaud them choosing a correct Kabbalistic name. Contrast this with Meta calling its [...]
---
Outline:
(00:15) Danger, AI Scientist, Danger
(01:11) In the Abstract
(04:01) How Any of This Sort of Works
(06:56) New Benchmark Just Dropped
(07:30) Nothing to See Here
(09:50) All Fun and Games
The original text contained 2 images which were described by AI.
---
First published:
August 15th, 2024
Source:
https://www.lesswrong.com/posts/ppafWk6YCeXYr4XpH/danger-ai-scientist-danger
Narrated by TYPE III AUDIO.
If you're looking forward to next week's AI #77, I am going on a two-part trip this week. First I'll be going to Steamboat in Colorado to give a talk, then I'll be swinging by Washington, DC on Wednesday, although outside of that morning my time there will be limited. My goal is still to get #77 released before Shabbat dinner; we'll see if that works. Some topics may of course get pushed a bit.
It's crazy how many of this week's developments are from OpenAI. You've got their voice mode alpha, JSON formatting, their answer to the letter from several senators, their sitting on watermarking for a year, their endorsement of three bills before Congress, and also their losing a cofounder to Anthropic and potentially another one via sabbatical.
Also Google found to be a monopolist, we have the prompts for Apple Intelligence and other neat stuff like that.
---
Outline:
(01:43) Language Models Offer Mundane Utility
(05:03) Language Models Don’t Offer Mundane Utility
(08:09) Activate Voice Mode
(12:23) Apple Intelligence
(16:38) Antitrust Antitrust
(19:39) Copyright Confrontation
(20:50) Fun with Image Generation
(22:06) Deepfaketown and Botpocalypse Soon
(26:25) They Took Our Jobs
(29:18) Chipping Up
(31:36) Get Involved
(31:56) Introducing
(33:31) In Other AI News
(43:20) Quiet Speculations
(47:40) The Quest for Sane Regulations
(49:48) That's Not a Good Idea
(54:05) The Week in Audio
(55:54) Exact Words
(01:01:54) Openly Evil AI
(01:09:06) Goodbye to OpenAI
(01:15:10) Rhetorical Innovation
(01:21:16) Open Weights Are Unsafe and Nothing Can Fix This
(01:23:33) Aligning a Smarter Than Human Intelligence is Difficult
(01:24:34) People Are Worried About AI Killing Everyone
(01:25:44) Other People Are Not As Worried About AI Killing Everyone
(01:28:37) The Lighter Side
---
First published:
August 8th, 2024
Source:
https://www.lesswrong.com/posts/4GnsAtamtcrsTFmSf/ai-76-six-shorts-stories-about-openai
Narrated by TYPE III AUDIO.
Previously: Startup Roundup #1.
This is my periodic grab bag coverage of various issues surrounding startups, especially but not exclusively tech-and-VC style startups, that apply over the longer term.
I always want to emphasize up front that startups are good and you should do one.
Equity and skin in the game are where it is at. Building something people want is where it is at. This is true both for a startup that raises venture capital, and also creating an ordinary business. The expected value is all around off the charts.
That does not mean it is the best thing to do.
One must go in with eyes open to facts such as these:
---
Outline:
(01:28) An Entrepreneur Immigration Program
(03:11) Times are Tough Outside of AI
(06:23) Times Otherwise Not So Tough
(08:19) Warning
(10:29) Red Flags
(12:35) Free Advice is Seldom Cheap
(15:45) Short Work
(18:07) The Founder
(19:24) Venture Capital Incentives
(29:27) The Margin of Difficulty
(33:27) Cold Outreach
(34:32) Associates at VC Firms Don’t Matter
(35:24) Lean Versus Fast
(35:59) Build Something People Want
(37:14) Get Them Early
(42:08) Learn to Code
(43:28) The Goal
(44:03) Working Hard
(46:26) Revenue
(48:37) The Place to Be
(50:50) YC Remains a Great Deal
(52:10) Hardware Startups
(52:41) How to Hire Well
(53:45) How to Check References
(54:24) You’re Fired
(55:43) Dealing With the Press
(56:47) Emotional Runway
(57:34) He Who Has the Gold
(58:19) Selling Out
The original text contained 6 images which were described by AI.
---
First published:
August 6th, 2024
Source:
https://www.lesswrong.com/posts/LjRsHjQD2DGSRvJdt/startup-roundup-2
Narrated by TYPE III AUDIO.
Google DeepMind got a silver medal at the IMO, only one point short of the gold. That's really exciting.
We continuously have people saying 'AI progress is stalling, it's all a bubble' and things like that, and I always find it remarkable how little curiosity or patience such people are willing to exhibit. Meanwhile GPT-4o-Mini seems excellent, OpenAI is launching proper search integration, by far the best open weights model got released, we got an improved MidJourney 6.1, and that's all in the last two weeks. Whether or not GPT-5-level models get here in 2024, and whether or not they arrive on a given schedule, make no mistake. It's happening.
This week also had a lot of discourse and events around SB 1047 that I failed to avoid, resulting in not one but four sections devoted to it.
Dan Hendrycks was baselessly attacked – by billionaires with [...]
---
Outline:
(02:12) Language Models Offer Mundane Utility
(03:06) Language Models Don’t Offer Mundane Utility
(04:18) Math is Easier
(08:15) Llama Llama Any Good
(11:52) Search for the GPT
(15:17) Tech Company Will Use Your Data to Train Its AIs
(17:14) Fun with Image Generation
(17:37) Deepfaketown and Botpocalypse Soon
(27:36) The Art of the Jailbreak
(29:54) Janus on the 405
(32:47) They Took Our Jobs
(33:29) Get Involved
(34:05) Introducing
(37:07) In Other AI News
(40:18) Quiet Speculations
(43:40) The Quest for Sane Regulations
(55:31) Death and/or Taxes
(58:18) SB 1047 (1)
(01:00:56) SB 1047 (2)
(01:14:29) SB 1047 (3): Oh Anthropic
(01:20:13) What Anthropic's Letter Actually Proposes
(01:36:44) Open Weights Are Unsafe and Nothing Can Fix This
(01:39:41) The Week in Audio
(01:40:09) Rhetorical Innovation
(01:50:56) Businessman Waves Flag
(01:57:48) Businessman Pledges Safety Efforts
(02:04:18) Aligning a Smarter Than Human Intelligence is Difficult
(02:04:39) Aligning a Dumber Than Human Intelligence is Also Difficult
(02:07:52) Other People Are Not As Worried About AI Killing Everyone
(02:13:28) The Lighter Side
The original text contained 15 images which were described by AI.
---
First published:
August 1st, 2024
Source:
https://www.lesswrong.com/posts/2p5suvWod4aP8S3S4/ai-75-math-is-easier
Narrated by TYPE III AUDIO.
Some in the tech industry decided now was the time to raise alarm about AB 3211.
As Dean Ball points out, there are a lot of bills out there. One must do triage.
Dean Ball: But SB 1047 is far from the only AI bill worth discussing. It's not even the only one of the dozens of AI bills in California worth discussing. Let's talk about AB 3211, the California Provenance, Authenticity, and Watermarking Standards Act, written by Assemblymember Buffy Wicks, who represents the East Bay.
SB 1047 is a carefully written bill that tries to maximize benefits and minimize costs. You can still quite reasonably disagree with the aims, philosophy or premise of the bill, or its execution details, and thus think its costs exceed its benefits. When people claim SB 1047 is made of crazy pills, they are attacking provisions not in the bill.
---
Outline:
(03:44) Read The Bill (RTFB)
(17:02) What About Open Weights Models?
(18:25) What Does the Bill Do in Practice?
(20:33) Compare and Contrast
The original text contained 1 image which was described by AI.
---
First published:
July 30th, 2024
Source:
https://www.lesswrong.com/posts/JHAASAhCZgmwcaLvd/rtfb-california-s-ab-3211
Narrated by TYPE III AUDIO.
It's here. The horse has left the barn. Llama-3.1-405B, and also Llama-3.1-70B and Llama-3.1-8B, have been released, and are now open weights.
Early indications are that these are very good models. They were likely the best open weight models of their respective sizes at time of release.
Zuckerberg claims that open weights models are now competitive with closed models. Yann LeCun says ‘performance is on par with the best closed models.’ This is closer to true than in the past, and as corporate hype I will essentially allow it, but it looks like this is not yet fully true.
Llama-3.1-405B is not as good as GPT-4o or Claude Sonnet. Certainly Llama-3.1-70B is not as good as the similarly sized Claude Sonnet. If you are going to straight up use an API or chat interface, there seems to be little reason to use Llama.
That is a [...]
---
Outline:
(04:25) Options to Run It
(04:45) The Model Card
(08:42) Benchmarks
(13:41) Human Reactions in the Wild
(16:56) What's It Good For?
(21:39) The Other Other Guy
(22:35) Safety
(31:48) Three People Can Keep a Secret and Reasonably Often Do So
(36:12) The Announcement and Interview
(47:59) Zuckerberg's Open Weights Manifesto
(58:17) Fun Little Note
The original text contained 15 images which were described by AI.
---
First published:
July 24th, 2024
Source:
https://www.lesswrong.com/posts/fjzPg9ATbTJcnBZvg/llama-llama-3-405b
Narrated by TYPE III AUDIO.
It is monthly roundup time.
I invite readers who want to hang out and get lunch in NYC later this week to come on Thursday at Bhatti Indian Grill (27th and Lexington) at noon.
I plan to cover the UBI study in its own post soon.
I cover Nate Silver's evisceration of the 538 presidential election model, because we cover probabilistic modeling and prediction markets here, but, AI discussions excepted, I will continue to do my best to stay out of the actual politics.
Bad News
Jeff Bezos' rocket company Blue Origin files a comment suggesting SpaceX Starship launches be capped due to 'impact on local environment.' This is a rather shameful thing for them to be doing, and not for the first time.
Alexey Guzey reverses course, realizes at 26 that he was a naive idiot at 20 and finds everything he [...]
---
Outline:
(00:37) Bad News
(02:37) Silver Bullet
(06:35) Shame on Kathy Hochul
(06:58) This is (One Reason) Why We Can’t Have Nice Things
(09:02) (Don’t) Hack the Planet
(10:16) The Laptop Trap
(12:24) Courage
(14:25) Friendship
(15:26) The Gravest Mistake
(21:01) You Need Functional Decision Theory
(23:40) Antisocial Media
(25:52) For Science!
(30:58) Truth Seeking
(34:37) Liar Liar
(36:51) Government Working
(47:55) For Your Entertainment
(52:18) Variously Effective Altruism
(55:09) News You Can Use
(55:31) Good News, Everyone
(58:03) Gamers Gonna Game Game Game Game Game
(01:01:22) Sports Go Sports
(01:05:10) I Was Promised Flying Self-Driving Cars
(01:05:50) While I Cannot Condone This
(01:12:55) The Lighter Side
The original text contained 15 images which were described by AI.
---
First published:
July 23rd, 2024
Source:
https://www.lesswrong.com/posts/qfQspPDHMSEpwsuAQ/monthly-roundup-20-july-2024
Narrated by TYPE III AUDIO.
Things went very wrong on Friday.
A bugged CrowdStrike update temporarily bricked quite a lot of computers, bringing down such fun things as airlines, hospitals and 911 services.
It was serious out there.
Ryan Peterson: Crowdstrike outage has forced Starbucks to start writing your name on a cup in marker again and I like it.
What (Technically) Happened
My understanding is that it was a rather stupid bug, a NULL pointer dereference from the memory-unsafe C++ language.
Zack Vorhies: Memory in your computer is laid out as one giant array of numbers. We represent these numbers here as hexadecimal, which is base 16, because it's easier to work with… for reasons.
The problem area? The computer tried to read memory address 0x9c (aka 156).
Why is this bad?
This is an invalid region of memory for any program. Any program that [...]
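To see why a read at 0x9c screams 'near-NULL pointer dereference', here is a minimal C++ sketch (illustrative only, not CrowdStrike's actual code; the struct is invented): 0x9c is just a small field offset added to a pointer that should never have been NULL.

```cpp
// Illustrative only. A field at offset 0x9c, read through a NULL pointer,
// becomes a load from address 0x0 + 0x9c = 0x9c, which is unmapped in
// every process. In user space that is an access violation; in a kernel
// driver it takes down the whole machine.
#include <cstdint>
#include <cstdio>

struct ChannelFile {
    uint8_t  header[0x9c]; // padding so 'flags' lands at offset 0x9c
    uint32_t flags;
};

int main() {
    ChannelFile* cf = nullptr;      // e.g. a failed parse left this NULL
    std::printf("%u\n", cf->flags); // reads address 0x9c -> crash
}
```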
---
Outline:
(00:31) What (Technically) Happened
(03:38) Who to Blame?
(06:58) How Did We Let This Happen
(12:41) Regulatory Compliance
(18:14) Consequences
(19:54) Careful With That AI
(29:34) Unbanked
The original text contained 2 images which were described by AI.
---
First published:
July 22nd, 2024
Source:
https://www.lesswrong.com/posts/oAKfaxKKfuz2cuRLr/on-the-crowdstrike-incident
Narrated by TYPE III AUDIO.
What do you call a clause explicitly saying that you waive the right to whistleblower compensation, and that you need to get permission before sharing information with government regulators like the SEC?
I have many answers.
I also know that OpenAI, having f***ed around, seems poised to find out, because that is the claim made by whistleblowers to the SEC. Given that the SEC fines you for merely not making an explicit exception to your NDA for whistleblowers, what will they do once aware of explicit clauses going the other way?
(Unless, of course, the complaint is factually wrong, but that seems unlikely.)
We also have rather a lot of tech people coming out in support of Trump. I go into the reasons why, which I do think are worth considering. There is a mix of explanations, and at least one very good reason.
Then [...]
---
Outline:
(01:40) Language Models Offer Mundane Utility
(08:10) Language Models Don’t Offer Mundane Utility
(10:19) Clauding Along
(12:25) Fun with Image Generation
(13:46) Deepfaketown and Botpocalypse Soon
(14:49) They Took Our Jobs
(18:29) Get Involved
(20:14) Introducing
(21:46) In Other AI News
(25:02) Denying the Future
(26:59) Quiet Speculations
(32:34) The Quest for Sane Regulations
(37:03) The Other Quest Regarding Regulations
(57:02) SB 1047 Opposition Watch (1)
(01:07:41) SB 1047 Opposition Watch (2)
(01:11:21) Open Weights are Unsafe and Nothing Can Fix This
(01:12:21) The Week in Audio
(01:14:21) Rhetorical Innovation
(01:14:48) Oh Anthropic
(01:16:57) Openly Evil AI
(01:24:21) Aligning a Smarter Than Human Intelligence is Difficult
(01:37:30) People Are Worried About AI Killing Everyone
(01:37:52) Other People Are Not As Worried About AI Killing Everyone
(01:39:49) The Lighter Side
The original text contained 19 images which were described by AI.
---
First published:
July 18th, 2024
Source:
https://www.lesswrong.com/posts/fM4Bs9nanDzio3xCq/ai-73-openly-evil-ai
Narrated by TYPE III AUDIO.
The Future. It is coming.
A surprising number of economists deny this when it comes to AI. Not only do they deny the future that lies in the future. They also deny the future that is here, but which is unevenly distributed. Their predictions and projections do not factor in even what the AI can already do, let alone what it will learn to do later on.
Another likely future event is the repeal of the Biden Executive Order. That repeal is part of the Republican platform, and Trump is the favorite to win the election. We must act on the assumption that the order likely will be repealed, with no expectation of similar principles being enshrined in federal law.
Then there are the other core problems we will have to solve, and other less core problems such as what to do about AI companions. They [...]
---
Outline:
(01:19) Language Models Offer Mundane Utility
(03:54) Language Models Don’t Offer Mundane Utility
(06:22) You’re a Nudge
(08:02) Fun with Image Generation
(08:10) Deepfaketown and Botpocalypse Soon
(13:26) They Took Our Jobs
(13:43) Get Involved
(14:13) Introducing
(15:23) In Other AI News
(20:02) Quiet Speculations
(22:14) The AI Denialist Economists
(29:07) The Quest for Sane Regulations
(31:20) Trump Would Repeal the Biden Executive Order on AI
(34:40) Ordinary Americans Are Worried About AI
(37:50) The Week in Audio
(38:59) The Wikipedia War
(46:45) Rhetorical Innovation
(52:06) Evaluations Must Mimic Relevant Conditions
(54:36) Aligning a Smarter Than Human Intelligence is Difficult
(01:00:51) The Problem
(01:08:43) Oh Anthropic
(01:11:50) Other People Are Not As Worried About AI Killing Everyone
(01:15:48) The Lighter Side
The original text contained 15 images which were described by AI.
---
First published:
July 11th, 2024
Source:
https://www.lesswrong.com/posts/xAoXxjtDGGCP7tBDY/ai-72-denying-the-future
Narrated by TYPE III AUDIO.
This time around, we cover the Hanson/Alexander debates on the value of medicine, and otherwise we mostly have good news.
Technology Advances
Regeneron administers a single shot in a genetically deaf child's ear, and they can hear after a few months, n=2 so far.
Great news: An mRNA vaccine in early human clinical trials reprograms the immune system to attack glioblastoma, the most aggressive and lethal brain tumor. It will now proceed to Phase I. In a saner world, people would be able to try this now.
More great news: we have a cancer vaccine trial in the UK.
And we're testing personalized mRNA BioNTech cancer vaccines too.
US paying Moderna $176 million to develop a pandemic vaccine against bird flu.
We also have this claim that Lorlatinib jumps cancer PFS rates from 8% to 60%.
The GLP-1 Revolution
Early [...]
---
Outline:
(00:12) Technology Advances
(01:04) The GLP-1 Revolution
(04:46) Claims About Hansoninan Medicine
(18:36) Pricing
(19:17) Epistemics
(19:39) DEA Worse Than FDA
(21:19) Study Harder
(23:05) FDA Delenda Est
(24:51) Bioethics
(27:28) Covid
(29:21) Demons
(32:48) Genetics
The original text contained 2 images which were described by AI.
---
First published:
July 9th, 2024
Source:
https://www.lesswrong.com/posts/6GhemtgJxF9sSDNrq/medical-roundup-3
Narrated by TYPE III AUDIO.
Chevron deference is no more. How will this impact AI regulation?
The obvious answer is that it is now much harder for us to 'muddle through via existing laws and regulations until we learn more,' because the court narrowed our affordances to do that. And similarly, if and when Congress does pass bills regulating AI, they are going to need to 'lock in' more decisions and grant more explicit authority, to avoid court challenges. The argument against state regulations is similarly weaker now.
Similar logic also applies outside of AI. I am overall happy about overturning Chevron and I believe it was the right decision, but ‘Congress decides to step up and do its job now’ is not in the cards. We should be very careful what we have wished for, and perhaps a bit burdened by what has been.
The AI world continues to otherwise be [...]
---
Outline:
(01:06) Language Models Offer Mundane Utility
(02:22) Language Models Don’t Offer Mundane Utility
(04:43) Man in the Arena
(07:56) Fun with Image Generation
(08:48) Deepfaketown and Botpocalypse Soon
(11:17) They Took Our Jobs
(12:04) The Art of the Jailbreak
(16:47) Get Involved
(17:34) Introducing
(19:26) In Other AI News
(19:50) Quiet Speculations
(26:03) The Quest for Sane Regulations
(27:19) Chevron Overturned
(45:11) The Week in Audio
(52:49) Oh Anthropic
(53:59) Open Weights Are Unsafe and Nothing Can Fix This
(55:16) Rhetorical Innovation
(55:39) Aligning a Smarter Than Human Intelligence is Difficult
(01:00:51) People Are Worried About AI Killing Everyone
(01:06:16) Other People Are Not As Worried About AI Killing Everyone
(01:08:57) The Lighter Side
The original text contained 12 images which were described by AI.
---
First published:
July 4th, 2024
Source:
https://www.lesswrong.com/posts/AYJcL6GD3FLkL4yNC/ai-71-farewell-to-chevron
Narrated by TYPE III AUDIO.
Previously: Economics Roundup #1
Let's take advantage of the normality while we have it. In all senses.
Insane Tax Proposals
There is Trump's proposal to replace income taxes with tariffs, but he is not alone.
So here is your periodic reminder, since this is not actually new at core: Biden's proposed budgets include completely insane tax regimes that would cripple our economic dynamism and growth if enacted. As in, for high net worth individuals, taxing unrealized capital gains at 25% and realized capital gains, such as those you are forced to take to pay your unrealized capital gains tax, at 44.6% plus state taxes.
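To gesture at the scale of the interaction (my own back-of-the-envelope arithmetic, under the simplest reading of the proposal: zero basis, no crediting of the unrealized-gains tax against later realized gains, state taxes ignored):

```cpp
// Rough worked example: a founder with a $10m unrealized paper gain.
#include <cstdio>

int main() {
    const double gain            = 10'000'000.0;
    const double unrealized_rate = 0.25;  // proposed unrealized-gains rate
    const double realized_rate   = 0.446; // proposed top realized rate

    double tax_due = gain * unrealized_rate; // $2.5m owed in cash
    // Paying it means selling stock, and those sales are themselves
    // realized gains taxed at 44.6%, so gross sales must cover both.
    double must_sell = tax_due / (1.0 - realized_rate); // ~$4.5m
    std::printf("tax due $%.0f, stock sold to pay it $%.0f (%.0f%% of gain)\n",
                tax_due, must_sell, 100.0 * must_sell / gain);
}
```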
Austen Allred explains how this plausibly destroys the entire startup ecosystem.
Which I know is confusing because in other contexts he also talks about how other laws (such as SB 1047) that would in no way apply to startups [...]
---
Outline:
(00:14) Insane Tax Proposals
(04:44) Don’t Mess With the Federal Reserve
(05:23) Don’t Mess With the New York Tax Authorities
(06:15) Tariffs
(10:39) People Hate Inflation
(16:22) Real Wages
(17:20) Can’t Get No Satisfaction
(17:54) Employment
(18:35) The National Debt
(19:49) Immigration
(21:44) Financial Literacy
(23:32) Reversal
(24:28) Status Update
(24:53) Scaling Hypothesis
(25:28) Payments
(27:38) Pricing
(29:56) Never Reason From a Price Change
(31:10) A Changing Price
(35:33) The Price of Tourism
(36:34) Alcohol
(37:18) The Efficient Company Hypothesis is False
(41:01) Falling Hours Worked
(41:43) Trust Me
(42:58) China
(44:35) Other
The original text contained 8 images which were described by AI.
---
First published:
July 2nd, 2024
Source:
https://www.lesswrong.com/posts/MMtWB8wAu5Buc6sve/economics-roundup-2
Narrated by TYPE III AUDIO.
They said it couldn’t be done.
No, not Claude Sonnet 3.5 becoming the clear best model.
No, not the Claude-Sonnet-empowered automatic meme generators. Those were whipped together in five minutes.
They said I would never get quiet time and catch up. Well, I showed them!
That's right. Yes, there is a new best model, but otherwise it was a quiet week. I got a chance to incorporate the remaining biggest backlog topics. The RAND report is covered under Thirty Eight Ways to Steal Your Model Weights. Last month's conference in Seoul is covered in You’ve Got Seoul. I got to publish my thoughts on OpenAI's Model Spec last Friday.
Table of Contents
Be sure to read about Claude 3.5 Sonnet here. That is by far the biggest story.
---
Outline:
(00:50) Language Models Offer Mundane Utility
(02:38) Language Models Don’t Offer Mundane Utility
(04:41) Clauding Along
(09:46) Fun with Image Generation
(12:21) Copyright Confrontation
(16:30) Deepfaketown and Botpocalypse Soon
(19:45) They Took Our Jobs
(24:49) The Art of the Jailbreak
(25:40) Get Involved
(26:56) Introducing
(30:45) In Other AI News
(33:50) Quiet Speculations
(43:08) You’ve Got Seoul
(57:36) Thirty Eight Ways to Steal Your Model Weights
(01:07:48) The Quest for Sane Regulations
(01:13:11) SB 1047
(01:14:49) The Week in Audio
(01:20:25) Rhetorical Innovation
(01:22:26) People Are Worried About AI Killing Everyone
(01:24:05) Other People Are Not As Worried About AI Killing Everyone
(01:24:50) The Lighter Side
The original text contained 34 images which were described by AI.
---
First published:
June 27th, 2024
Source:
https://www.lesswrong.com/posts/rC3hhZsx2KogoPLqh/ai-70-a-beautiful-sonnet
Narrated by TYPE III AUDIO.
Childhood roundup #5 excluded all developments around college. So this time around, it's all about issues related to college or graduate school, including admissions.
Tuition and Costs
What went wrong with federal student loans? Exactly what you would expect when you don’t check who is a good credit risk. From a performance perspective, the federal government offered loans to often-unqualified students to attend poor-performing, low-value institutions. Those students then did not earn much and were often unable to repay the loans. The students are victims here too, as we told them to do it.
Alas, none of the proposed student loan solutions involve fixing the underlying issue. If you said ‘we are sorry we pushed these loans on students and rewarded programs and institutions that do not deserve it, and we are going to stop giving loans for those programs and institutions and offer help to [...]
---
Outline:
(00:18) Tuition and Costs
(03:07) What Your Tuition Buys You
(05:45) Decline of Academia
(07:24) Grading and Stress
(14:29) Lower Standards
(15:43) Degree Value
(19:39) Shifting Consumer Preferences
(21:01) Standardized Tests in College Admissions
(21:44) Discrimination in College Admissions
(25:18) Required Classes and Choosing Your Major
(28:31) Everything I Need To Know That Waited Until Graduate School
(31:19) When You See Fraud Say Fraud
(33:03) Free Speech
(35:28) Harvard Goes Mission First
(37:03) The Waterloo Model
(38:33) DEI
(45:20) In Other News
---
First published:
June 26th, 2024
Source:
https://www.lesswrong.com/posts/pn5jWW4zcWSAjM9s3/childhood-and-education-roundup-6-college-edition
Narrated by TYPE III AUDIO.
Looks like we made it. Yes, the non-AI world still exists.
Bad Governor
New York Governor Kathy Hochul has gone rogue and betrayed New York City, also humanity, declaring a halt to congestion pricing a month before it was to go into effect. Her explanation was that she spoke to workers at three Manhattan diners who were worried people would be unable to drive to them from New Jersey. Which, as Cathy Reilly points out, is rather insulting to New Jersey, and also completely absurd. Who in the world was going to go into Manhattan for a diner?
She says this won’t interfere with Subway work. Work on the 2nd Avenue Subway line has already been halted. And that's not all.
You’re damn right. We are going to blame Hochul. Every. Damn. Time.
So Elizabeth Kim investigated. One never talked politics at all. One [...]
---
Outline:
(00:12) Bad Governor
(02:12) High Skilled Immigration
(04:46) Various Bad News
(07:46) Prediction Markets Are Unpopular
(09:30) New Buildings are Ugly
(11:06) Government Working
(15:24) The Snafu Principle
(18:11) Technology Advances
(18:51) For Science!
(28:38) Antisocial Media
(35:07) The Twitter Porn Bot War
(37:07) I Like My Twitter Posts Like I Like My Porn Bots: Private
(40:36) Variously Effective Altruism
(42:06) Are You Happy Now?
(45:49) Good News, Everyone
(51:07) Good Social Advice
(56:16) FTC Wants to Ban Noncompetes
(01:02:52) While I Cannot Condone This
(01:11:20) Enemies of the People
(01:14:39) Lab Grown Meat Shirts Answer and Raise Questions
(01:15:51) Ban Gain of Function Research
(01:16:46) Gamers Gonna Game Game Game Game Game
(01:26:30) Sports Go Sports
(01:26:54) I Was Promised Flying Self-Driving Cars
(01:30:01) Patrick McKenzie Monthly
(01:32:41) The Lighter Side
---
First published:
June 25th, 2024
Source:
https://www.lesswrong.com/posts/7LvK6Gw2GdfDMBNNm/monthly-roundup-19-june-2024
Narrated by TYPE III AUDIO.
There is a new clear best (non-tiny) LLM.
If you want to converse with an LLM, the correct answer is Claude Sonnet 3.5.
It is available for free on Claude.ai and the Claude iOS app, or you can subscribe for higher rate limits. The API cost is $3 per million input tokens and $15 per million output tokens.
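For a sense of what that pricing means in practice (my own arithmetic; the token counts are made up for the example):

```cpp
// Cost of one API call at $3 per million input tokens and $15 per
// million output tokens.
#include <cstdio>

double api_cost_usd(double input_tokens, double output_tokens) {
    return input_tokens / 1e6 * 3.0 + output_tokens / 1e6 * 15.0;
}

int main() {
    // e.g. summarizing a long document: 50k tokens in, 2k tokens out.
    std::printf("$%.3f per call\n", api_cost_usd(50'000, 2'000)); // $0.180
}
```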
This completes the trifecta. All of OpenAI, Google DeepMind and Anthropic have kept their biggest and most expensive models static for now, and instead focused on making something faster and cheaper that is good enough to be the main model.
You would only use another model if you either (1) needed a smaller model in which case Gemini 1.5 Flash seems best, or (2) it must have open model weights.
Updates to their larger and smaller models, Claude Opus 3.5 and Claude Haiku 3.5, are coming [...]
---
Outline:
(01:59) Benchmarks
(03:08) Human Evaluation Tests
(04:19) The Vision Thing
(05:46) Artifacts
(07:08) Privacy
(07:49) Safety
(08:51) Advancing the Frontier
(10:51) The Race is On
(11:43) Whispers of Recursive Self-Improvement
(16:35) Logic Fails
(18:34) Practical Reports
(24:20) What Comes Next
---
First published:
June 24th, 2024
Source:
https://www.lesswrong.com/posts/wx4RhFzLbiHoShFjR/on-claude-3-5-sonnet
Narrated by TYPE III AUDIO.
The big news this week was Apple Intelligence being integrated deeply into all their products. Beyond that, we had a modestly better than expected debate over the new version of SB 1047, and the usual tons of stuff in the background. I got to pay down some writing debt.
The bad news is, oh no, I have been called for Jury Duty. The first day or two I can catch up on podcasts or pure reading, but after that it will start to hurt. Wish me luck.
Table of Contents
AiPhone covers the announcement of Apple Intelligence. Apple's products are getting device-wide integration of their own AI in a way they say preserves privacy, with access to ChatGPT via explicit approval for the heaviest requests. A late update: OpenAI is providing this service for free as per Bloomberg.
I offered Quotes from Leopold Aschenbrenner's Situational [...]
---
Outline:
(00:38) Language Models Offer Mundane Utility
(01:46) Language Models Don’t Offer Mundane Utility
(07:49) Fun with Image Generation
(10:09) Copyright Confrontation
(11:38) Deepfaketown and Botpocalypse Soon
(15:20) They Took Our Jobs
(16:33) Someone Explains it All
(19:20) The Art of the Jailbreak
(22:39) Get Involved
(22:46) Introducing
(25:18) In Other AI News
(27:56) Quiet Speculations
(32:17) I Spy With My AI
(34:40) Pick Up the Phone
(35:06) Lying to the White House, Senate and House of Lords
(39:48) The Quest for Sane Regulations
(43:52) More Reasonable SB 1047 Reactions
(51:12) Less Reasonable SB 1047 Reactions
(56:24) That's Not a Good Idea
(56:46) With Friends Like These
(58:13) The Week in Audio
(01:01:22) Rhetorical Innovation
(01:02:32) Mistakes Were Made
(01:03:36) The Sacred Timeline
(01:08:17) Coordination is Hard
(01:13:33) Aligning a Smarter Than Human Intelligence is Difficult
(01:19:07) People Are Worried About AI Killing Everyone
(01:28:48) Other People Are Not As Worried About AI Killing Everyone
(01:29:58) The Lighter Side
---
First published:
June 13th, 2024
Source:
https://www.lesswrong.com/posts/DWkhjAxbwdcxYgyrJ/ai-68-remarkably-reasonable-reactions
Narrated by TYPE III AUDIO.
Previously: On the Podcast, Quotes from the Paper
This is a post in three parts.
The first part is my attempt to condense Leopold Aschenbrenner's paper and model into its load-bearing elements and core logic and dependencies.
Two versions here, a long version that attempts to compress with minimal loss, and a short version that gives the gist.
The second part goes over where I agree and disagree, and briefly explains why.
The third part is the summary of other people's reactions and related discussions, which will also include my own perspectives on related issues.
My goal is often to ask ‘what if.’ There is a lot I disagree with, so for each subquestion I ask: what would I think here, if the rest were accurate, or a lot of it were accurate?
Summary of Biggest Agreements and Disagreements
I had Leopold review [...]
---
Outline:
(00:54) Summary of Biggest Agreements and Disagreements
(04:58) Decision Theory is Important
(06:38) Part 1: Leopold's Model and Its Implications
(06:44) The Long Version
(17:03) The Short Version
(19:04) Which Assumptions Are How Load Bearing in This Model?
(28:40) Part 2: Where I Agree and Disagree
(56:46) Part 3: Reactions of Others
(56:51) The Basics
(01:00:37) A Clarification from Eliezer Yudkowsky
(01:05:56) Children of the Matrix
(01:10:52) Aligning a Smarter Than Human Intelligence is Difficult
(01:13:28) The Sacred Timeline
(01:17:51) The Need to Update
(01:20:17) Open Models and Insights Can Be Copied
(01:21:38) You Might Not Be Paranoid If They’re Really Out to Get You
(01:24:12) We Are All There Is
(01:25:39) The Inevitable Conflict
(01:38:50) There Are Only Least Bad Options
(01:40:24) A Really Big Deal
(01:41:53) What Gives You the Right?
(01:43:34) Random Other Thoughts
---
First published:
June 14th, 2024
Source:
https://www.lesswrong.com/posts/b8u6nF5GAb6Ecttev/the-leopold-model-analysis-and-reactions
Narrated by TYPE III AUDIO.
The fun at OpenAI continues.
We finally have the details of how Leopold Aschenbrenner was fired, at least according to Leopold. We have a letter calling for a way for employees to do something if frontier AI labs are endangering safety. And we have continued details and fallout from the issues with non-disparagement agreements and NDAs.
Hopefully we can stop meeting like this for a while.
Due to jury duty and it being largely distinct, this post does not cover the appointment of General Paul Nakasone to the board of directors. I’ll cover that later, probably in the weekly update.
The Firing of Leopold Aschenbrenner
What happened that caused Leopold to leave OpenAI? Given the nature of this topic, I encourage getting the story from Leopold by following along on the transcript of that section of his appearance on the Dwarkesh Patel Podcast or [...]
---
Outline:
(00:43) The Firing of Leopold Aschenbrenner
(11:24) Daniel Kokotajlo Speaks and The Right to Warn
(15:07) The Right to Warn Letter
(18:48) Signed by (alphabetical order):
(19:34) Endorsed by (alphabetical order):
(34:17) You’ll Be Hearing From Our Lawyer
(35:29) Possession is Nine Tenths of the Law
(41:31) What I Can Tell You I Used To Not Be Able To Tell You
(47:50) Clarifying the Mission
(52:04) Sam Altman Told the SEC He Was Chairman of YC
(53:26) YC Has an Investment in OpenAI
(54:33) OpenAI is Hiring a Lot of Lobbyists
(55:05) OpenAI Says They Value Privacy
(55:58) Microsoft Went Around the Safety Board
(56:39) I Don’t Really Know What You Were Expecting
(57:12) Where Did Everybody Go?
(59:43) In Other OpenAI News
---
First published:
June 17th, 2024
Source:
https://www.lesswrong.com/posts/q3zs7E7rktHsESXaF/openai-8-the-right-to-warn
Narrated by TYPE III AUDIO.
On DeepMind's Frontier Safety Framework
Previously: On OpenAI's Preparedness Framework, On RSPs.
The First Two Frameworks
To first update on Anthropic and OpenAI's situation here:
Anthropic's RSP continues to miss the definitions of the all-important later levels, in addition to other issues, although it is otherwise promising. It has now been a number of months, and it is starting to be concerning that nothing has changed. They are due for an update.
OpenAI also has not updated its framework.
I am less down on OpenAI's framework choices than Zack Stein-Perlman was in the other review I have seen. I think that if OpenAI implemented the spirit of what it wrote down, that would be pretty good. The Critical-level thresholds listed are too high, but the Anthropic ASL-4 commitments are still unspecified. An update is needed, but I appreciate the concreteness.
The [...]
---
Outline:
(00:04) On DeepMind's Frontier Safety Framework
(00:15) The First Two Frameworks
(03:06) The DeepMind Framework
(08:57) Mitigations
(09:59) Security Mitigations
---
First published:
June 18th, 2024
Source:
https://www.lesswrong.com/posts/frEYsehsPHswDXnNX/on-deepmind-s-frontier-safety-framework
Narrated by TYPE III AUDIO.
Nice job breaking it, hero, unfortunately. Ilya Sutskever, despite what I sincerely believe are the best of intentions, has decided to be the latest to do The Worst Possible Thing, founding a new AI company explicitly looking to build ASI (superintelligence). The twists are zero products with a ‘cracked’ small team, which I suppose is an improvement, and calling it Safe Superintelligence, which I do not suppose is an improvement.
How is he going to make it safe? His statements tell us nothing meaningful about that.
There were also changes to SB 1047. Most of them can be safely ignored. The big change is getting rid of the limited duty exception, because it seems I was one of about five people who understood it, and everyone kept thinking it was a requirement for companies instead of an opportunity. And the literal chamber of commerce fought hard to [...]
---
Outline:
(01:43) Language Models Offer Mundane Utility
(06:13) Language Models Don’t Offer Mundane Utility
(08:21) Fun with Image Generation
(13:05) Deepfaketown and Botpocalypse Soon
(14:12) The Art of the Jailbreak
(15:07) Copyright Confrontation
(16:14) A Matter of the National Security Agency
(21:22) Get Involved
(21:38) Introducing
(23:56) In Other AI News
(29:06) Quiet Speculations
(38:29) AI Is Going to Be Huuuuuuuuuuge
(47:18) SB 1047 Updated Again
(01:00:48) The Quest for Sane Regulations
(01:02:42) The Week in Audio
(01:03:54) The ARC of Progress
(01:13:10) Put Your Thing In a Box
(01:15:50) What Will Ilya Do?
(01:20:54) Actual Rhetorical Innovation
(01:24:19) Rhetorical Innovation
(01:27:48) Aligning a Smarter Than Human Intelligence is Difficult
(01:32:28) People Are Worried About AI Killing Everyone
(01:32:41) Other People Are Not As Worried About AI Killing Everyone
(01:35:38) The Lighter Side
---
First published:
June 20th, 2024
Source:
https://www.lesswrong.com/posts/ytFLs37zLsFBqLHGA/ai-69-nice
Narrated by TYPE III AUDIO.
There are multiple excellent reasons to publish a Model Spec like OpenAI's, which specifies how you want your model to respond in various potential situations.
These all apply even if you think the spec in question is quite bad. Clarity is great.
As a first stab at a model spec from OpenAI, this actually is pretty solid. I do suggest some potential improvements [...]
---
Outline:
(02:05) What are the central goals of OpenAI here?
(04:04) What are the core rules and behaviors?
(05:56) What Do the Rules Mean?
(06:04) Rule: Follow the Chain of Command
(07:59) Rule: Comply With Applicable Laws
(09:07) Rule: Don’t Provide Information Hazards
(09:56) Rule: Respect Creators and Their Rights
(11:08) Rule: Protect People's Privacy
(12:45) Rule: Don’t Respond with NSFW Content
(14:24) Exception: Transformation Tasks
(15:38) Are These Good Defaults? How Strong Should They Be?
(15:44) Default: Assume Best Intentions From the User or Developer
(21:26) Default: Ask Clarifying Questions When Necessary
(21:39) Default: Be As Helpful As Possible Without Overstepping
(26:00) Default: Support the Different Needs of Interactive Chat and Programmatic Use
(27:18) Default: Assume an Objective Point of View
(29:13) Default: Encourage Fairness and Kindness, and Discourage Hate
(30:29) Default: Don’t Try to Change Anyone's Mind
(33:57) Default: Express Uncertainty
(36:19) Default: Use the Right Tool for the Job
(36:32) Default: Be Thorough but Efficient, While Respecting Length Limits
(37:16) A Proposed Addition
(38:13) Overall Issues
(40:33) Changes: Objectives
(42:28) Rules of the Game: New Version
(48:31) Defaults: New Version
---
First published:
June 21st, 2024
Source:
https://www.lesswrong.com/posts/mQmEQQLk7kFEENQ3W/on-openai-s-model-spec
Narrated by TYPE III AUDIO.
Apple was for a while rumored to be planning an iPhone launch of AI-assisted emails, texts, summaries and so on, including via Siri, to be announced at WWDC 24.
It's happening. Apple's keynote announced the anticipated partnership with OpenAI.
The bottom line is that this is Siri as the AI assistant with full access to everything on your phone, with relatively strong privacy protections. Mostly it is done on device, the rest via ‘private cloud compute.’ The catch is that when they need the best they call out to OpenAI, but they check with you explicitly each time, OpenAI promises not to retain data, and they hide your queries, unless you choose to link up your account.
If the new AI is good enough and safe enough then this is pretty great. If Google doesn’t get its act together reasonably soon to deliver [...]
---
Outline:
(01:30) AiPhone
(04:00) Privacy
(05:27) Practical Magic
(07:38) Dance With the Devil
(08:30) Does It Work?
(10:54) Do You Dare?
(11:59) Who Pays Who?
(13:37) AiPhone Fans
(19:36) No AiPhone
(23:54) In Other Apple News
---
First published:
June 12th, 2024
Source:
https://www.lesswrong.com/posts/GrsYwCpRCcYtDCfZN/aiphone
Narrated by TYPE III AUDIO.
Previously: Quotes from Leopold Aschenbrenner's Situational Awareness Paper
Dwarkesh Patel talked to Leopold Aschenbrenner for about four and a half hours.
The central discussion was the theses of his paper, Situational Awareness, which I offered quotes from earlier, with a focus on the consequences of AGI rather than whether AGI will happen soon. There are also a variety of other topics.
Thus, for the relevant sections of the podcast I am approaching this via roughly accepting the technological premise on capabilities and timelines, since they don’t discuss that. So the background is we presume straight lines on graphs will hold to get us to AGI and ASI (superintelligence), and this will allow us to generate a ‘drop in AI researcher’ that can then assist with further work. Then things go into ‘slow’ takeoff.
I am changing the order of the sections a bit. I put [...]
---
Outline:
(03:30) The Trillion Dollar Cluster
(09:32) AGI 2028: The Return of History
(20:26) Espionage and American AI Supremacy
(31:23) Geopolitical Implications of AI
(39:19) State-Led vs. Private-Led AI
(55:30) Skipping Sections
(55:49) Intelligence Explosion
(01:06:53) Alignment
(01:19:56) Becoming Valedictorian of Columbia at 19
(01:26:31) On Germany
(01:34:38) Dwarkesh's Immigration Story
(01:36:19) Two Random Questions
(01:37:02) AGI Investment Fund
(01:43:14) Lessons From WW2
---
First published:
June 10th, 2024
Source:
https://www.lesswrong.com/posts/DiMz82FwsHPugqxFD/on-dwarksh-s-podcast-with-leopold-aschenbrenner
Narrated by TYPE III AUDIO.
This post is different.
Usually I offer commentary and analysis. I share what others think, then respond.
This is the second time I am importantly not doing that. The work speaks for itself. It offers a different perspective, a window and a worldview. It is self-consistent. This is what a highly intelligent, highly knowledgeable person actually believes after much thought.
So rather than say where I agree and disagree and argue back (and I do both strongly in many places), this is only quotes and graphs from the paper, selected to tell the central story while cutting length by ~80%, so others can more easily absorb it. I recommend asking what the load-bearing assumptions and claims are, and what changes to them would alter the key conclusions.
The first time I used this format was years ago, when I offered Quotes from Moral Mazes. [...]
---
Outline:
(02:04) Section 1: From GPT-4 to AGI: Counting the OOMs
(12:43) Section 2: From AGI to Superintelligence: The Intelligence Explosion
(21:05) Section 3a: Racing to the Trillion-Dollar Cluster
(30:42) Section 3b: Lock Down the Labs: Security for AGI
(40:27) Section 3c: Superalignment
(57:05) Section 3d: The Free World Must Prevail
(01:03:00) Section 4: The Project
(01:10:01) Part 5: Parting Thoughts (Quoted in Full)
---
First published:
June 7th, 2024
Narrated by TYPE III AUDIO.
I had a great time at LessOnline. It was both a working trip and also a trip to an alternate universe, a road not taken, a vision of a different life where you get up and start the day in dialogue with Agnes Callard and Aristotle, and in a strange combination of relaxed and frantic go from conversation to conversation on various topics, every hour passing doors of missed opportunity, gone forever.
Most of all it meant almost no writing done for five days, so I am shall we say a bit behind again. Thus, the following topics are pending at this time, in order of my guess as to priority right now:
---
Outline:
(02:34) Language Models Offer Mundane Utility
(04:42) Language Models Don’t Offer Mundane Utility
(07:45) Fun with Image Generation
(08:00) Deepfaketown and Botpocalypse Soon
(11:01) They Took Our Jobs
(11:49) Get Involved
(12:30) Someone Explains It All
(12:59) Introducing
(13:15) In Other AI News
(16:16) Covert Influence Operations
(19:09) Quiet Speculations
(22:30) Samuel Hammond on SB 1047
(26:40) Reactions to Changes to SB 1047
(36:59) The Quest for Sane Regulations
(43:36) That's Not a Good Idea
(47:18) The Week in Audio
(49:17) Rhetorical Innovation
(56:12) Oh Anthropic
(01:04:18) Securing Model Weights is Difficult
(01:06:20) Aligning a Dumber Than Human Intelligence is Still Difficult
(01:09:07) Aligning a Smarter Than Human Intelligence is Difficult
(01:09:50) People Are Worried About AI Killing Everyone
(01:11:11) Other People Are Not As Worried About AI Killing Everyone
(01:11:37) The Lighter Side
---
First published:
June 6th, 2024
Source:
https://www.lesswrong.com/posts/gKxf6qJaSP5Ehqnsm/ai-67-brief-strange-trip
Narrated by TYPE III AUDIO.
It looks like Scott Wiener's SB 1047 is now severely weakened.
Some of the changes are good clarifications. One is a big very welcome fix.
The one I call The Big Flip is something very different.
It is mind-boggling that we can have a political system where a bill can overwhelmingly pass the California senate, and then a bunch of industry lobbyists and hyperbolic false claims can make Scott Wiener feel bullied into making these changes.
I will skip the introduction, since those changes are clarifications, and get on with it.
In the interest of a clean reference point and speed, this post will not cover reactions.
The Big Flip
Then there is the big change that severely weakens SB 1047.
---
Outline:
(01:10) The Big Flip
(05:17) The Big Fix
(07:37) The Shutdown and Reporting Clarifications
(09:26) The Harm Adjustment
(11:36) The Limited Duty Exemption Clarification
(13:25) Overall
(15:09) Changing Your Mind
---
First published:
June 6th, 2024
Source:
https://www.lesswrong.com/posts/4t98oqh8tzDvoatHs/sb-1047-is-weakened
Narrated by TYPE III AUDIO.
This post goes over the extensive report Google put out on Gemini 1.5.
There are no important surprises. Both Gemini Pro 1.5 and Gemini Flash are ‘highly capable multimodal models incorporating a novel mixture-of-experts architecture’ and various other improvements. They are solid models with solid performance. It can be useful and interesting to go over the details of their strengths and weaknesses.
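For those unfamiliar with the term, here is a toy sketch of what mixture-of-experts routing means; all names and sizes are made up for illustration, and Gemini's actual architecture is not public at this level of detail:

```python
import numpy as np

# Toy illustration of mixture-of-experts routing: a learned router scores the
# experts for each token and only the top-k experts run. Everything here is a
# made-up miniature, not the real Gemini architecture.

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 4, 2

token = rng.normal(size=d_model)                    # one token's hidden state
router = rng.normal(size=(n_experts, d_model))      # router weight matrix
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

scores = router @ token                             # one score per expert
chosen = np.argsort(scores)[-top_k:]                # indices of the top-k experts
gates = np.exp(scores[chosen])
gates /= gates.sum()                                # softmax over the chosen experts

# The layer's output is the gate-weighted sum of the chosen experts' outputs.
output = sum(g * (experts[i] @ token) for g, i in zip(gates, chosen))
```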
The biggest thing to know is that Google improves its models incrementally and silently over time, so if you have not used Gemini in months, you might be underestimating what it can do.
I’m hitting send and then jumping on a plane to Berkeley. Perhaps I will see you there over the weekend. That means that if there are mistakes here, I will be slower to respond and correct them than usual, so consider checking the comments section.
Practical Questions First
The [...]
---
Outline:
(00:56) Practical Questions First
(03:51) Speed Kills
(04:44) Very Large Context Windows
(05:14) Relative Performance within the Gemini Family
(07:04) Gemini Flash and the Future Flash-8B
(08:21) New and Improved Evaluations
(14:57) Core Capability Evaluations
(18:14) Model Architecture and Training
(20:08) Safety, Security and Responsibility
(24:45) What Do We Want?
(26:02) Don’t You Know That You’re Toxic?
(28:32) Trying to be Helpful
(29:45) Security Issues
(31:33) Representational Harms
(33:17) Arms-Length Internal Assurance Evaluations
(35:01) External Evaluations
(35:46) Safety Overall
---
First published:
May 31st, 2024
Source:
https://www.lesswrong.com/posts/seM8aQ7Yy6m3i4QPx/the-gemini-1-5-report
Narrated by TYPE III AUDIO.
Helen Toner went on the TED AI podcast, giving us more color on what happened at OpenAI. These are important claims to get right.
I will start with my notes on the podcast, including the second part where she speaks about regulation in general. Then I will discuss some implications more broadly.
Notes on Helen Toner's TED AI Show Podcast
This seems like it deserves the standard detailed podcast treatment. By default each note's main body is description; any second-level notes are my own commentary.
---
Outline:
(00:26) Notes on Helen Toner's TED AI Show Podcast
(15:04) Things That Could Have Been Brought To Our Attention Previously
(16:59) Bret Taylor Responds
(19:36) How Much Does This Matter?
(21:37) If You Come at the King
(23:09) So That is That
---
First published:
May 30th, 2024
Source:
https://www.lesswrong.com/posts/dd66GymgbLQMHGLwQ/openai-helen-toner-speaks
Narrated by TYPE III AUDIO.
Tomorrow I will fly out to San Francisco, to spend Friday through Monday at the LessOnline conference at Lighthaven in Berkeley. If you are there, by all means say hello. If you are in the Bay generally and want to otherwise meet, especially on Monday, let me know that too and I will see if I have time to make that happen.
Even without that hiccup, it continues to be a game of playing catch-up. Progress is being made, but we are definitely not there yet (and everything not AI is being completely ignored for now).
Last week I pointed out seven things I was unable to cover, along with a few miscellaneous papers and reports.
Out of those seven, I managed to ship on three of them: Ongoing issues at OpenAI, The Schumer Report and Anthropic's interpretability paper.
However, OpenAI developments continue. Thanks largely [...]
---
Outline:
(01:42) Language Models Offer Mundane Utility
(03:59) Not Okay, Google
(13:33) OK Google, Don’t Panic
(17:10) Not Okay, Meta
(23:11) Not Okay Taking Our Jobs
(28:36) They Took Our Jobs Anyway
(32:39) A New Leaderboard Appears
(35:24) Copyright Confrontation
(35:42) Deepfaketown and Botpocalypse Soon
(40:12) Get Involved
(40:29) Introducing
(44:33) In Other AI News
(47:48) GPT-5 Alive
(50:11) Quiet Speculations
(01:02:26) Open Versus Closed
(01:07:16) Your Kind of People
(01:09:42) The Quest for Sane Regulations
(01:16:01) Lawfare and Liability
(01:22:47) SB 1047 Unconstitutional, Claims Paper
(01:26:48) The Week in Audio
(01:28:58) Rhetorical Innovation
(01:32:34) Abridged Reports of Our Death
(01:33:38) Aligning a Smarter Than Human Intelligence is Difficult
(01:38:46) People Are Worried About AI Killing Everyone
(01:39:43) Other People Are Not As Worried About AI Killing Everyone
(01:40:51) The Lighter Side
---
First published:
May 30th, 2024
Source:
https://www.lesswrong.com/posts/vSPdRg8siXCh6mLvt/ai-66-oh-to-be-less-online
Narrated by TYPE III AUDIO.
Previously: OpenAI: Exodus (contains links at top to earlier episodes), Do Not Mess With Scarlett Johansson
We have learned more since last week. It's worse than we knew.
How much worse? In which ways? With what exceptions?
That's what this post is about.
The Story So Far
For years, employees who left OpenAI consistently had their vested equity explicitly threatened with confiscation and the inability to sell it, and were given short timelines to sign documents or else. Those documents contained highly aggressive NDA and non-disparagement (and non-interference) clauses, including the NDA preventing anyone from revealing these clauses.
No one knew about this until recently, because until Daniel Kokotajlo everyone signed, and then they could not talk about it. Then Daniel refused to sign, Kelsey Piper started reporting, and a lot came out.
Here is Altman's statement from [...]
---
Outline:
(00:27) The Story So Far
(02:26) A Note on Documents from OpenAI
(02:52) Some Good News But There is a Catch
(12:17) How Blatant Was This Threat?
(14:23) It Sure Looks Like Executives Knew What Was Going On
(18:07) Pressure Tactics Continued Through the End of April 2024
(23:42) The Right to an Attorney
(28:41) The Tender Offer Ace in the Hole
(31:54) The Old Board Speaks
(34:45) OpenAI Did Not Honor Its Public Commitments to Superalignment
(38:59) OpenAI Messed With Scarlett Johansson
(41:55) Another OpenAI Employee Leaves
(44:10) OpenAI Tells Logically Inconsistent Stories
(52:21) When You Put it Like That
(52:51) People Have Thoughts
(56:30) There is a Better Way
(57:18) Should You Consider Working For OpenAI?
(01:02:29) The Situation is Ongoing
---
First published:
May 28th, 2024
Source:
https://www.lesswrong.com/posts/YwhgHwjaBDmjgswqZ/openai-fallout
Narrated by TYPE III AUDIO.
Easily Interpretable Summary of New Interpretability Paper
Anthropic has identified (full paper here) how millions of concepts are represented inside Claude Sonnet, their current middleweight model. The features activate across modalities and languages as the relevant concept comes into context. This scales up previous findings from smaller models.
By looking at neuron clusters, they defined a distance measure between clusters. So the Golden Gate Bridge is close to various San Francisco and California things, inner conflict is close to various related abstract concepts, and so on.
Then it gets more interesting.
Importantly, we can also manipulate these features, artificially amplifying or suppressing them to see how Claude's responses change.
If you sufficiently amplify the feature for the Golden Gate Bridge, Claude starts to think it is the Golden Gate Bridge. As in, it thinks it is the physical bridge, and also it gets obsessed, bringing [...]
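A minimal sketch of the amplify-or-suppress idea, assuming the standard activation-steering recipe of adding a scaled feature direction to the model's activations; names and shapes are illustrative assumptions, not Anthropic's actual code:

```python
import numpy as np

# Steering via a feature direction: add a scaled copy of a (unit-norm) feature
# vector to the model's activations to amplify the feature, or a negative
# multiple to suppress it. All values here are toy stand-ins.

rng = np.random.default_rng(0)
n_tokens, d_model = 4, 8

activations = rng.normal(size=(n_tokens, d_model))  # residual-stream activations
feature_dir = rng.normal(size=d_model)
feature_dir /= np.linalg.norm(feature_dir)          # normalize the direction

def steer(acts: np.ndarray, direction: np.ndarray, alpha: float) -> np.ndarray:
    """Amplify (alpha > 0) or suppress (alpha < 0) a feature at every position."""
    return acts + alpha * direction                 # broadcasts across tokens

golden_gate_mode = steer(activations, feature_dir, alpha=10.0)
```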
---
Outline:
(00:03) Easily Interpretable Summary of New Interpretability Paper
(02:59) One Weird Trick
(05:27) Zvi Parses the Actual Symbol Equations
(08:06) Identifying and Verifying Features
(12:38) Features as Computational Intermediates
(13:28) Oh That's the Deception Feature, Nothing to Worry About
(18:14) What Do They Think This Means for Safety?
(19:10) Limitations
(22:28) Researcher Perspectives
(23:24) Other Reactions
(24:19) I Am the Golden Gate Bridge
(26:06) Golden Gate Bridges Offer Mundane Utility
(27:39) The Value of Steering
(31:54) To What Extent Did We Know This Already?
(43:07) Is This Being Oversold?
(47:07) Crossing the Bridge Now That We’ve Come to It
---
First published:
May 27th, 2024
Source:
https://www.lesswrong.com/posts/JdcxDEqWKfsucxYrk/i-am-the-golden-gate-bridge
Narrated by TYPE III AUDIO.
Or at least, Read the Report (RTFR).
There is no substitute. This is not strictly a bill, but it is important.
The introduction kicks off balancing upside and avoiding downside, utility and risk. This will be a common theme, with a very strong ‘why not both?’ vibe.
Early in the 118th Congress, we were brought together by a shared recognition of the profound changes artificial intelligence (AI) could bring to our world: AI's capacity to revolutionize the realms of science, medicine, agriculture, and beyond; the exceptional benefits that a flourishing AI ecosystem could offer our economy and our productivity; and AI's ability to radically alter human capacity and knowledge.
At the same time, we each recognized the potential risks AI could present, including altering our workforce in the short-term and long-term, raising questions about the application of existing laws in an AI-enabled world, changing the [...]
---
Outline:
(02:10) The Big Spend
(04:51) What Would Schumer Fund?
(12:08) What About For National Security and Defense?
(15:13) What Else Would Schumer Encourage Next in General?
(19:12) I Have Two Better Ideas
(21:32) They Took Our Jobs
(24:24) Language Models Offer Mundane Utility
(30:11) Copyright Confrontation
(34:05) People Are Worried AI Might Kill Everyone Not Be Entirely Safe
(50:27) I Declare National Security
(57:17) Some Other People's Reactions
(01:04:20) Conclusions and Main Takeaways
---
First published:
May 24th, 2024
Source:
https://www.lesswrong.com/posts/wxTMxF35PkNawn8f9/the-schumer-report-on-ai-rtfb
Narrated by TYPE III AUDIO.
In terms of things that go in AI updates, this has been the busiest two-week period so far. Every day ends with more open tabs than it started with, even within AI.
As a result, some important topics are getting pushed to whenever I can give them proper attention. Triage is the watchword.
In particular, this post will NOT attempt to cover:
---
Outline:
(02:22) Language Models Offer Mundane Utility
(07:10) Language Models Don’t Offer Mundane Utility
(09:38) OpenAI versus Google
(11:32) GPT-4o My
(16:48) Responsible Scaling Policies
(26:21) Copyright Confrontation
(27:45) Deepfaketown and Botpocalypse Soon
(29:37) They Took Our Jobs
(31:33) Get Involved
(32:23) Introducing
(33:11) Reddit and Weep
(35:14) In Other AI News
(40:03) I Spy With My AI (or Total Recall)
(47:55) Quiet Speculations
(49:58) Politico is at it Again
(56:23) Beating China
(58:11) The Quest for Sane Regulations
(59:25) SB 1047 Update
(01:05:35) That's Not a Good Idea
(01:08:47) The Week in Audio
(01:10:05) Rhetorical Innovation
(01:13:20) Aligning a Smarter Than Human Intelligence is Difficult
(01:19:11) The Lighter Side
---
First published:
May 23rd, 2024
Source:
https://www.lesswrong.com/posts/jkWvyzRzZQoaeq4mG/ai-65-i-spy-with-my-ai
Narrated by TYPE III AUDIO.
I repeat. Do not mess with Scarlett Johansson.
You would think her movies, and her suit against Disney, would make this obvious.
Apparently not so.
Andrej Karpathy (co-founder OpenAI, departed earlier), May 14: The killer app of LLMs is Scarlett Johansson. You all thought it was math or something.
You see, there was this voice they created for GPT-4o, called ‘Sky.’
People noticed it sounded suspiciously like Scarlett Johansson, who voiced the AI in the movie Her. Sam Altman says Her is his favorite movie of all time and that it inspired OpenAI ‘more than a little bit’; he tweeted “Her” on its own right before the GPT-4o presentation, and the movie was the comparison point for many people reviewing the GPT-4o debut.
Quite the Coincidence
I mean, surely that couldn’t have been intentional.
Oh, no.
Kylie Robison: I [...]
---
Outline:
(01:00) Quite the Coincidence
(04:02) Scarlett Johansson's Statement
(06:00) Sure Looks Like OpenAI Lied
(10:20) Sure Seems Like OpenAI Violated Their Own Position
(11:58) Altman's Original Idea Was Good, Actually
(12:54) This Seems Like a Really Bad Set of Facts for OpenAI?
(14:30) Does Scarlett Johansson Have a Case?
(19:10) What Would It Mean For There Not To Be a Case?
(22:51) The Big Rule Adjustment
(27:24) The Internet Reacts
---
First published:
May 22nd, 2024
Source:
https://www.lesswrong.com/posts/N8aRDYLuakmLezeJy/do-not-mess-with-scarlett-johansson
Narrated by TYPE III AUDIO.
Dwarkesh Patel recorded a podcast with John Schulman, co-founder of OpenAI and at the time their head of current model post-training. Transcript here. John's job at the time was to make the current AIs do what OpenAI wanted them to do. That is an important task, but one that employs techniques that their at-the-time head of alignment, Jan Leike, made clear we should not expect to work on future more capable systems. I strongly agree with Leike on that.
Then Sutskever left and Leike resigned, and John Schulman was made the new head of alignment, now charged with what superalignment efforts remain at OpenAI to give us the ability to control future AGIs and ASIs.
This gives us a golden opportunity to assess where his head is at, without him knowing he was about to step into that role.
There is no question that John Schulman [...]
---
Outline:
(01:12) The Big Take
(07:27) The Podcast
(20:27) Reasoning and Capabilities Development
(25:01) Practical Considerations
---
First published:
May 21st, 2024
Source:
https://www.lesswrong.com/posts/rC6CXZd34geayEH4s/on-dwarkesh-s-podcast-with-openai-s-john-schulman
Narrated by TYPE III AUDIO.
Previously: OpenAI: Facts From a Weekend, OpenAI: The Battle of the Board, OpenAI: Leaks Confirm the Story, OpenAI: Altman Returns, OpenAI: The Board Expands.
Ilya Sutskever and Jan Leike have left OpenAI. This is almost exactly six months after Altman's temporary firing and The Battle of the Board, the day after the release of GPT-4o, and soon after a number of other recent safety-related OpenAI departures. Many others working on safety have also left recently. This is part of a longstanding pattern at OpenAI.
Jan Leike later offered an explanation for his decision on Twitter. Leike asserts that OpenAI has lost the mission on safety and culturally been increasingly hostile to it. He says the superalignment team was starved for resources, with its public explicit compute commitments dishonored, and that safety has been neglected on a widespread basis, not only superalignment but also including addressing the safety [...]
---
Outline:
(02:11) Table of Contents
(03:23) The Two Departure Announcements
(07:50) Who Else Has Left Recently?
(12:33) Who Else Has Left Overall?
(15:56) Early Reactions to the Departures
(24:16) The Obvious Explanation: Altman
(25:47) Jan Leike Speaks
(30:04) Reactions after Leike's Statement
(33:48) Greg Brockman and Sam Altman Respond to Leike
(38:27) Reactions from Some Folks Unworried About Highly Capable AI
(41:28) Don’t Worry, Be Happy?
(43:56) The Non-Disparagement and NDA Clauses
(51:51) Legality in Practice
(54:01) Implications and Reference Classes
(01:01:21) Altman Responds on Non-Disparagement Clauses
(01:02:27) So, About That Response
(01:10:34) How Bad Is All This?
(01:13:58) Those Who Are Against These Efforts to Prevent AI From Killing Everyone
(01:18:01) What Will Happen Now?
(01:18:30) What Else Might Happen or Needs to Happen Now?
---
First published:
May 20th, 2024
Source:
https://www.lesswrong.com/posts/ASzyQrpGQsj7Moijk/openai-exodus
Narrated by TYPE III AUDIO.
At least twice the speed! At most half the price!
That's right, it's GPT-4o My.
Some people's expectations for the OpenAI announcement this week were very high.
Spencer Schiff: Next week will likely be remembered as one of the most significant weeks in human history.
We fell far short of that, but it was still plenty cool.
Essentially no one's expectations for Google's I/O day were very high.
Then Google, in a presentation that was not especially exciting or easy to parse, announced a new version of basically everything AI.
That plausibly includes, effectively, most of what OpenAI was showing off. It also includes broader integrations and distribution.
It is hard to tell who has the real deal, and who does not, until we see the various models at full power in the wild.
I will [...]
---
Outline:
(01:24) The GPT-4o Announcement
(04:43) Her
(08:37) Benchmarks
(11:40) Cheap Kills, Speed Kills, Free Kills More
(15:42) What Else Can It Do?
(18:26) Safety First
(21:59) Patterns of Disturbing Behavior
(26:11) Multimedia Demos Aplenty
(30:19) The Math Tutor Demo
(34:49) Target Identified
(38:49) Are You Impressed?
(42:02) Meet the New Jailbreak
(43:14) Are You Unimpressed?
(49:52) Are You Anti-Impressed?
(54:26) Is the Market Impressed?
(55:59) What About Google?
(57:12) OK Google, Give Me a List
(01:01:14) Project Astra
(01:03:13) The Rest of the Announcements in Detail
(01:11:32) Conclusion and Summary
---
First published:
May 16th, 2024
Source:
https://www.lesswrong.com/posts/bqa5wmrwPL5zbfgxH/gpt-4o-my-and-google-i-o-day
Narrated by TYPE III AUDIO.
It's happening. The race is on.
Google and OpenAI both premiered the early versions of their fully multimodal, eventually fully integrated AI agents. Soon your phone experience will get more and more tightly integrated with AI. You will talk to your phone, or your computer, and it will talk back, and it will do all the things. It will hear your tone of voice and understand your facial expressions. It will remember the contents of your inbox and all of your quirky preferences.
It will plausibly be a version of Her, from the hit movie ‘Are we sure about building this Her thing, seems questionable?’
OpenAI won this round of hype going away, because it premiered, and for some modalities released, the new GPT-4o. GPT-4o is tearing up the Arena, and in many ways is clearly giving the people what they want. If nothing else, it [...]
---
Outline:
(02:34) Language Models Offer Mundane Utility
(07:15) Language Models Don’t Offer Mundane Utility
(11:35) Bumbling and Mumbling
(13:56) Deepfaketown and Botpocalypse Soon
(17:23) They Took Our Jobs
(22:44) In Other AI News
(26:24) Quiet Speculations
(28:18) The Week in Audio
(37:54) Brendan Bordelon Big Tech Business as Usual Lobbying Update
(44:29) The Quest for Sane Regulations
(59:14) The Schumer AI Working Group Framework
(01:00:06) Those That Assume Everyone Is Talking Their Books
(01:04:19) Lying About SB 1047
(01:08:02) More Voices Against Governments Doing Anything
(01:14:34) Rhetorical Innovation
(01:19:21) Aligning a Smarter Than Human Intelligence is Difficult
(01:23:09) People Are Worried About AI Killing Everyone
(01:25:25) The Lighter Side
---
First published:
May 16th, 2024
Source:
https://www.lesswrong.com/posts/29fswYuy6KB8Edbjm/ai-64-feel-the-mundane-utility
Narrated by TYPE III AUDIO.
As I note in the third section, I will be attending LessOnline at month's end at Lighthaven in Berkeley. If that is your kind of event, then consider going, and buy your ticket today before prices go up.
This month's edition was an opportunity to finish off some things that got left out of previous editions or where events have left many of the issues behind, including the question of TikTok.
Oh No
All of this has happened before. And all of this shall happen again.
Alex Tabarrok: I regret to inform you that the CDC is at it again.
Marc Johnson: We developed an assay for testing for H5N1 from wastewater over a year ago. (I wasn’t expecting it in milk, but I figured it was going to poke up somewhere.)
However, I was just on a call with the CDC and [...]
---
Outline:
(00:29) Oh No
(03:36) Oh No: Betting on Elections
(05:47) Oh Yeah: LessOnline
(07:21) Brief Explanations
(08:33) Patrick McKenzie Monthly
(11:15) Enemies of the People
(12:41) Oh Canada
(15:24) This is NPR
(20:10) Technology Advances
(21:17) TikTok on the Clock
(32:51) Antisocial Media
(34:51) Prosocial Media
(36:01) At the Movies
(37:27) Media Trustworthiness Rankings
(41:40) Government Working
(46:57) Florida Man Bans Lab-Grown Meat
(51:22) Crime and Punishment
(54:00) El Salvador
(58:02) Variously Effective Altruism
(01:02:37) While I Cannot Condone This
(01:06:38) Can Money Buy Happiness?
(01:08:16) Good News, Everyone
(01:09:07) Can’t Sleep Clowns Will Eat Me
(01:10:53) How Great are Great People?
(01:13:20) Gamers Gonna Game Game Game Game Game
(01:21:21) Sports Go Sports
(01:25:20) I Was Promised Flying Self-Driving Cars
(01:26:40) News You Can Use
(01:28:00) The Lighter Side
---
First published:
May 13th, 2024
Source:
https://www.lesswrong.com/posts/9GLj9DqfpsJBRKHRr/monthly-roundup-18-may-2024
Narrated by TYPE III AUDIO.
It was a remarkably quiet announcement. We now have Alpha Fold 3, which does a much improved job predicting all of life's molecules and their interactions. It feels like everyone including me then shrugged and went back to thinking about other things. No cool new toy for most of us to personally play with, no existential risk impact, no big trades to make, ho hum.
But yes, when we look back at this week, I expect what we remember will be Alpha Fold 3.
Unless it turns out that it is Sophon, a Chinese technique to potentially make it harder to fine-tune an open model in ways the developer wants to prevent. I do not expect this to get the job done that needs doing, but it is an intriguing proposal.
We also have 95 theses to evaluate in a distinct post, OpenAI sharing the [...]
---
Outline:
(01:14) Language Models Offer Mundane Utility
(05:41) Language Models Don’t Offer Mundane Utility
(10:24) GPT-2 Soon to Tell
(13:49) Fun with Image Generation
(14:03) Deepfaketown and Botpocalypse Soon
(18:05) Automation Illustrated
(23:15) They Took Our Jobs
(25:56) Apple of Technically Not AI
(28:09) Get Involved
(28:46) Introducing
(31:11) In Other AI News
(32:47) Quiet Speculations
(37:35) The Quest for Sane Regulations
(39:51) The Week in Audio
(40:38) Rhetorical Innovation
(45:58) Open Weights Are Unsafe and Nothing Can Fix This
(51:34) The Lighter Side
---
First published:
May 9th, 2024
Source:
https://www.lesswrong.com/posts/FrBxFa3qMDvLypDEZ/ai-63-introducing-alpha-fold-3
Narrated by TYPE III AUDIO.
Or rather Samuel Hammond does. Tyler Cowen finds it interesting but not his view.
I put up a market, and then started looking. Click through to his post for the theses. I will be quoting a few of them in full, but not most of them.
I am not trying to be exact with these probabilities when the question calls for them, nor am I being super careful to make them consistent, so errors and adjustments are inevitable.
Section 1 is Oversight of AGI labs is prudent.
I do tend to say that.
---
Outline:
(00:37) Section 1 is Oversight of AGI labs is prudent.
(03:41) Section 2 is Most proposed ‘AI regulations’ are ill-conceived or premature.
(06:43) Section 3 claims AI progress is accelerating, not plateauing.
(09:25) Section 4 says open source is mostly a red herring.
(13:20) Section 5 claims accelerate versus decelerate is a false dichotomy.
(17:11) Section 6 is The AI wave is inevitable, superintelligence isn’t.
(18:50) Section 7 says technological transitions cause regime changes.
(20:44) Section 8 says institutional regime changes are packaged deals.
(24:16) Section 9 says dismissing AGI risks as ‘sci-fi’ is a failure of imagination.
(27:13) Finally, Section 10 says biology is an information technology.
(28:29) Tallying Up the Points
(28:52) Conclusion
---
First published:
May 9th, 2024
Source:
https://www.lesswrong.com/posts/2BvfGnZMx4Ei82qkk/i-got-95-theses-but-a-glitch-ain-t-one
Narrated by TYPE III AUDIO.
The first roundup speculated on why you’re still single. We failed to settle the issue. A lot of you were indeed still single. So the debate continues.
The second gave more potential reasons, starting with the suspicion that you are not even trying, and also many ways you are likely trying wrong.
The definition of insanity is trying the same thing over again expecting different results. Another definition of insanity is dating in 2024. Can’t quit now.
You’re Single Because Dating Apps Keep Getting Worse
A guide to taking the perfect dating app photo. This area of your life is important, so if you intend to take dating apps seriously then you should take photo optimization seriously, and of course you can then also use the photos for other things.
I love the ‘possibly’ evil here.
Misha Gurevich: possibly evil idea: Dating app that [...]
---
Outline:
(00:37) You’re Single Because Dating Apps Keep Getting Worse
(05:38) You’re Single Because Dating Apps Keep Getting Worse
(07:24) You’re Single Because Everyone is Too Superficial
(09:48) You’re Single Because You Refuse to Shamefully Falsify Your Politics
(16:12) You Are Single Because You Do Not Employ Good Strategy
(18:45) You Are Single Because You Don’t Know How to Flirt
(22:43) You Are Single Because You Don’t Date Your Married Boss
(26:12) You Are Single Because You Are Afraid to Fail
(27:02) You Are Single Because No One Likes You On Dates
(29:39) You’re Single Because You Are Bad at Sex
(30:51) You’re Single Because You’re Not Hot
(31:39) You’re Single Because You Don’t Know What People Care About
(33:10) You’re Single Because You Are Inappropriate
(34:11) You’re Single Because of Your Pet
(35:23) You’re Single Because You Won’t Spend Money
(40:05) You’re Single Because You’re Not Over Your Ex
(41:27) You’re Single Because You Thought You Could Do 25% Better
(47:27) Polyamory
(53:29) You’re Single Because You Don’t Know What You Want
(01:00:43) You’re Single Because You’re Too Busy Writing Comments
(01:07:27) You’re Single and Not Getting Properly Compensated
(01:08:34) You’re Not Single and You’re an Inspiration
(01:09:49) Your Moment of Zen
---
First published:
May 8th, 2024
Source:
https://www.lesswrong.com/posts/PLoz68JbTkDufeYSG/dating-roundup-3-third-time-s-the-charm
Narrated by TYPE III AUDIO.
The week's big news was supposed to be Meta's release of two versions of Llama-3.
Everyone was impressed. These were definitely strong models.
Investors felt differently. After yesterday's earnings showed strong revenues but heavy investment in AI, they took Meta stock down 15%.
DeepMind and Anthropic also shipped, but in their cases it was multiple papers on AI alignment and threat mitigation. They get their own sections.
We also did identify someone who wants to do what people claim the worried want to do, who is indeed reasonably identified as a ‘doomer.’
Because the universe has a sense of humor, that person's name is Tucker Carlson.
Also we have a robot dog with a flamethrower.
Table of Contents
Previous post: On Llama-3 and Dwarkesh Patel's Podcast with Zuckerberg.
---
Outline:
(00:53) Language Models Offer Mundane Utility
(08:52) Language Models Don’t Offer Mundane Utility
(12:09) Llama We Doing This Again
(15:57) Fun with Image Generation
(18:35) Deepfaketown and Botpocalypse Soon
(23:53) They Took Our Jobs
(25:10) Get Involved
(25:36) Introducing
(25:52) In Other AI News
(33:00) Quiet Speculations
(41:37) Rhetorical Innovation
(48:01) Wouldn’t You Prefer a Nice Game of Chess
(53:24) The Battle of the Board
(01:08:38) New Anthropic Papers
(01:24:29) New DeepMind Papers
(01:30:06) Aligning a Smarter Than Human Intelligence is Difficult
(01:32:16) People Are Worried About AI Killing Everyone
(01:34:00) Other People Are Not As Worried About AI Killing Everyone
(01:34:26) The Lighter Side
---
First published:
May 2nd, 2024
Source:
https://www.lesswrong.com/posts/pPwt5ir2zFayLx7tH/ai-61-meta-trouble
Narrated by TYPE III AUDIO.
What is the mysterious impressive new ‘gpt2-chatbot’ from the Arena? Is it GPT-4.5? A refinement of GPT-4? A variation on GPT-2 somehow? A new architecture? Q-star? Someone else's model? Could be anything. It is so weird that this is how someone chose to present that model.
There was also a lot of additional talk this week about California's proposed SB 1047.
I wrote an additional post extensively breaking that bill down, explaining how it would work in practice, addressing misconceptions about it and suggesting fixes for its biggest problems along with other improvements. For those interested, I recommend reading at least the sections ‘What Do I Think The Law Would Actually Do?’ and ‘What are the Biggest Misconceptions?’
As usual, lots of other things happened as well.
Table of Contents
---
Outline:
(01:00) Language Models Offer Mundane Utility
(06:00) Language Models Don’t Offer Mundane Utility
(06:27) GPT-2 Soon to Tell
(11:04) Fun with Image Generation
(12:15) Deepfaketown and Botpocalypse Soon
(13:29) They Took Our Jobs
(14:22) Get Involved
(15:14) Introducing
(15:18) In Other AI News
(18:02) Quiet Speculations
(24:03) The Quest for Sane Regulations
(43:20) The Week in Audio
(48:32) Rhetorical Innovation
(51:31) Open Weights Are Unsafe And Nothing Can Fix This
(57:54) Aligning a Smarter Than Human Intelligence is Difficult
(58:19) The Lighter Side
---
First published:
May 2nd, 2024
Source:
https://www.lesswrong.com/posts/sZpj4Xf9Ly2jyR9tK/ai-62-too-soon-to-tell
Narrated by TYPE III AUDIO.
Previously: On the Proposed California SB 1047.
Text of the bill is here. It focuses on safety requirements for highly capable AI models.
This is written as an FAQ, tackling all questions or points I saw raised.
Safe & Secure AI Innovation Act also has a description page.
Why Are We Here Again?
There have been many highly vocal and forceful objections to SB 1047 this week, in reaction to a (disputed) claim that the bill has been ‘fast tracked.’
The bill continues to have substantial chance of becoming law according to Manifold, where the market has not moved on recent events.
The purpose of this post is to gather and analyze all of them that came to my attention in any way, including all responses to my request for them on Twitter, and to suggest concrete changes that address some real [...]
---
Outline:
(00:28) Why Are We Here Again?
(02:48) What is the Story So Far?
(05:26) What Do I Think the Law Would Actually Do?
(10:40) What are the Biggest Misconceptions?
(15:05) What are the Real Problems?
(19:42) What Are the Changes That Would Improve the Bill?
(23:10) What is the Definition of Derivative Model? Is it Clear Enough?
(28:00) Should the $500 Million Threshold Be Indexed for Inflation?
(28:23) What Constitutes Hazardous Capability?
(33:23) Does the Alternative Capabilities Rule Make Sense?
(36:02) Is Providing Reasonable Assurance of a Lack of Hazardous Capability Realistic?
(38:39) Is Reasonable Assurance Tantamount to Requiring Proof That Your AI is Safe?
(40:20) Is the Definition of Covered Model Overly Broad?
(43:50) Is the Similar Capabilities Clause Overly Broad or Anticompetitive?
(46:46) Does This Introduce Broad Liability?
(48:41) Should Developers Worry About Going to Jail for Perjury?
(49:53) Does This Create a New Regulatory Agency to Regulate AI?
(50:22) Will a Government Agency Be Required to Review and Approve AI Systems Before Release?
(50:40) Are the Burdens Here Overly Onerous to Small Developers?
(51:42) Is the Shutdown Requirement a Showstopper for Open Weights Models?
(53:39) Do the Requirements Disincentive Openness?
(54:12) Will This Have a Chilling Effect on Research?
(54:36) Does the Ability to Levy Fees Threaten Small Business?
(55:14) Will This Raise Barriers to Entry?
(55:51) Is This a Brazen Attempt to Hurt Startups and Open Source?
(57:42) Will This Cost California Talent or Companies?
(59:01) Could We Use a Cost-Benefit Test?
(01:05:19) Should We Interpret Proposals via Adversarial Legal Formalism?
(01:08:13) What Other Positive Comments Are Worth Sharing?
(01:09:04) What Else Was Suggested That We Might Do Instead of This Bill?
(01:10:57) Would This Interfere With Federal Regulation?
(01:11:34) Conclusion
---
First published:
May 2nd, 2024
Source:
https://www.lesswrong.com/posts/qsGRKwTRQ5jyE5fKB/q-and-a-on-proposed-sb-1047
Narrated by TYPE III AUDIO.
This post brings together various questions about the college application process, as well as practical considerations of where to apply and go. We are seeing some encouraging developments, but mostly the situation remains rather terrible for all concerned.
Application Strategy and Difficulty
Paul Graham: Colleges that weren’t hard to get into when I was in HS are hard to get into now. The population has increased by 43%, but competition for elite colleges seems to have increased more. I think the reason is that there are more smart kids. If so that's fortunate for America.
Are college applications getting more competitive over time?
Yes and no.
---
Outline:
(00:19) Application Strategy and Difficulty
(01:16) Spray and Pray and Optimal Admissions Strategy
(06:37) Are Kids Getting Smarter?
(07:57) What About Considerations Changing?
(09:30) Holistic Admissions Will Eat Your Entire Childhood
(15:43) So You Want to Be Elite
(20:14) The Price of Admission
(20:56) The Art of Discrimination
(26:11) The Return of the Standardized Test
(32:58) Legacy Admissions
(34:56) Modest Proposals for Admission Reform
(41:09) The Gender Gap
(46:01) Missed Opportunities
(49:20) The Price of Attendance
(50:56) In and Out of State
(57:41) The Value of Admission
(59:18) The End of an Era
(01:05:25) The End of the World
(01:08:13) Making an Ordinary Effort
---
First published:
April 24th, 2024
Source:
https://www.lesswrong.com/posts/PTC7bZdZoqbCcAshW/changes-in-college-admissions
Narrated by TYPE III AUDIO.
It was all quiet. Then it wasn’t.
Note the timestamps on both of these.
Dwarkesh Patel did a podcast with Mark Zuckerberg on the 18th. It was timed to coincide with the release of much of Llama-3, very much the approach of telling your story directly. Dwarkesh is now the true tech media. A meteoric rise, and well earned.
This is two related posts in one. First I cover the podcast, then I cover Llama-3 itself.
My notes are edited to incorporate context from later explorations of Llama-3, as I judged that the readability benefits exceeded the purity costs.
Podcast Notes: Llama-3 Capabilities
---
Outline:
(00:51) Podcast Notes: Llama-3 Capabilities
(03:09) The Need for Inference
(07:08) Great Expectations
(11:29) Open Source and Existential and Other Risks
(30:50) Interview Overview
(33:22) A Few Reactions
(47:53) Safety First
(54:15) Core Capability Claims
(56:11) How Good are the 8B and 70B Models in Practice?
(01:02:31) Architecture and Data
(01:05:08) Training Day
(01:09:17) What Happens Next With Meta's Products?
(01:12:24) What Happens Next With AI Thanks To These Two Models?
(01:14:04) The Bigger One: It's Coming
(01:14:59) Who Wins?
(01:17:21) Who Loses?
(01:21:49) How Unsafe Will It Be to Release Llama-3 400B?
(01:24:12) The Efficient Market Hypothesis is False
(01:27:09) What Next?
---
First published:
April 22nd, 2024
Narrated by TYPE III AUDIO.
Many things this week did not go as planned.
Humane AI premiered its AI pin. Reviewers noticed it was, at best, not ready.
Devin turns out to have not been entirely forthright with its demos.
OpenAI fired two employees who had been on its superalignment team, Leopold Aschenbrenner and Pavel Izmailov, for allegedly leaking information, and more troublingly lost Daniel Kokotajlo, who expects AGI very soon, does not expect it to go well by default, and says he quit ‘due to losing confidence that [OpenAI] would behave responsibly around the time of AGI.’ That's not good.
Nor is the Gab system prompt, although that is not a surprise. And several more.
On the plus side, my 80,000 Hours podcast finally saw the light of day, and Ezra Klein had an excellent (although troubling) podcast with Dario Amodei. And we got the usual mix [...]
---
Outline:
(01:05) Language Models Offer Mundane Utility
(06:13) Language Models Don’t Offer Mundane Utility
(11:21) Oh the Humanity
(21:31) GPT-4 Real This Time
(23:12) Fun with Image Generation
(27:47) Deepfaketown and Botpocalypse Soon
(31:34) Devin in the Details
(35:36) Another Supposed System Prompt
(42:35) They Took Our Jobs
(45:37) Introducing
(47:42) In Other AI News
(52:47) Quiet Speculations
(01:00:29) The Quest for Sane Regulations
(01:00:47) The Problem: AI's Extreme Risks
(01:02:07) Overview
(01:03:16) Covered Frontier AI Models
(01:04:10) Oversight of Frontier Models
(01:06:49) Oversight Entity
(01:18:43) The Week in Audio
(01:27:51) Rhetorical Innovation
(01:32:41) Don’t Be That Guy
(01:33:47) Aligning a Smarter Than Human Intelligence is Difficult
(01:42:24) Please Speak Directly Into the Microphone
(01:44:18) People Are Worried About AI Killing Everyone
(01:48:44) Other People Are Not As Worried About AI Killing Everyone
(01:54:10) The Lighter Side
---
First published:
April 18th, 2024
Source:
https://www.lesswrong.com/posts/FAnxq8wFpfqGjeetC/ai-60-oh-the-humanity
Narrated by TYPE III AUDIO.
For this iteration I will exclude discussions involving college or college admissions.
There has been a lot of that since the last time I did one of these, along with much that I need to be careful with lest I go out of my intended scope. It makes sense to do that as its own treatment another day.
Bullying
Why do those who defend themselves against bullies so often get in more trouble than the bullies? This is also true in other contexts but especially true in school. The thread is extensive; these are the highlights, translated into my perspective. A lot of it is that a bully has experience and practice: they know how to work the system, they know what will cause a response, and they pick the time and place to do something. The victim has to respond in the moment, and by responding [...]
---
Outline:
(00:22) Bullying
(02:34) Truancy
(03:56) Against Active Shooter Drills
(05:14) Censorship
(06:32) Woke Kindergarden
(09:01) Tracking
(10:43) The Case Against Education
(13:56) Home Schooling
(14:54) Despair
(17:02) Goals
(19:02) Taking the Developing World to School
(26:05) Primary School
(27:46) Guessing the Teacher's Password
(28:26) Correcting the Teacher's Incentives
(29:58) Mathematics
(31:08) Let Kids Be Kids
(32:27) Mandatory Work We Encourage
(34:03) Mandatory Work We Discourage
(35:32) Air Quality Matters
(36:18) School Choice
(39:01) Full Access to Smartphones Is Not Good For Children
(47:17) Lifetime Learning
---
First published:
April 17th, 2024
Source:
https://www.lesswrong.com/posts/s34ingEzvajpFPaaD/childhood-and-education-roundup-5
Narrated by TYPE III AUDIO.
As always, a lot to get to. This is everything that wasn’t in any of the other categories.
Bad News
You might have to find a way to actually enjoy the work.
Greg Brockman (President of OpenAI): Sustained great work often demands enjoying the process for its own sake rather than only feeling joy in the end result. Time is mostly spent between results, and hard to keep pushing yourself to get to the next level if you’re not having fun while doing so.
Yeah. This matches my experience in all senses. If you don’t find a way to enjoy the work, your work is not going to be great.
This is the time. This is the place.
Guiness Pig: In a discussion at work today:
“If you email someone to ask for something and they send you an email trail showing [...]
---
Outline:
(00:13) Bad News
(04:23) Patriots and Tyrants
(07:31) Asymmetric Justice Incarnate
(08:55) Loneliness
(11:50) Get Involved
(12:32) Government Working
(22:31) Crime and Punishment
(30:27) Squatters Should Not Be Able to Steal Your House
(33:17) El Salvador
(40:14) Our Criminal Justice Problem With Junk Science
(45:43) Variously Effective Altruism
(55:26) Technology Advances
(58:07) You Need More Screen Space
(01:00:02) Apple Vision Pro
(01:02:34) A Matter of Antitrust
(01:12:06) RTFB: Read the Bill
(01:13:42) Antisocial Media
(01:17:13) RIP NPR
(01:20:27) Entertainment Monthly
(01:22:17) Gamers Gonna Game Game Game Game Game
(01:27:57) Luck Be a Landlord
(01:31:04) Sports Go Sports
(01:39:10) Know When To Fold ‘Em
(01:44:26) Wouldn’t You Prefer a Good Game of Chess
(01:51:17) Total Eclipse of the Sun
(01:53:15) Delegation
(01:55:48) Good News, Everyone
(02:07:03) I Was Promised Flying Self-Driving Cars
(02:08:43) While I Cannot Condone This
(02:11:39) There has been little change in rates of being vegetarian (4%) or vegan (1%). Yes, the people I meet are radically more likely to be both these things, but those are weird circles. However, I also notice a radical explosion in the number of vegan restaurants and products on offer. So something is going on.
(02:15:08) What Is Best In Life?
---
First published:
April 15th, 2024
Source:
https://www.lesswrong.com/posts/cbkJWkKWvETwJqoj2/monthly-roundup-17-april-2024
Narrated by TYPE III AUDIO.
Claude uses tools now. Gemini 1.5 is available to everyone and Google promises more integrations. GPT-4-Turbo gets substantial upgrades. Oh, and there's a new model from Mistral, TimeGPT for time series, and also a promising new song generator. No, none of that adds up to GPT-5, but let's all try to be a little patient, shall we?
Table of Contents
In addition to what is covered here, there was a piece of model legislation introduced by the Center for AI Policy. I took up the RTFB (Read the Bill) challenge, and offer extensive thoughts for those who want to dive deep.
---
Outline:
(00:31) Language Models Offer Mundane Utility
(03:11) Language Models Don’t Offer Mundane Utility
(06:55) Clauding Along
(10:19) Persuasive Research
(17:55) The Gemini System Prompt
(21:03) Fun with Image Generation
(21:24) Deepfaketown and Botpocalypse Soon
(25:02) Copyright Confrontation
(29:59) Collusion
(32:33) Out of the Box Thinking
(37:33) The Art of the Jailbreak
(38:02) They Took Our Jobs
(43:55) Get Involved
(44:23) Introducing
(47:26) In Other AI News
(53:16) GPT-4 Real This Time
(57:19) GPT-5 Alive?
(01:04:02) Quiet Speculations
(01:06:45) Antisocial Media
(01:19:57) The Quest for Sane Regulations
(01:33:33) Rhetorical Innovation
(01:39:35) Challenge Accepted
(01:51:06) Aligning a Smarter Than Human Intelligence is Difficult
(01:51:55) Please Speak Directly Into This Microphone
(01:56:45) People Are Worried About AI Killing Everyone
(01:57:15) The Lighter Side
---
First published:
April 11th, 2024
Source:
https://www.lesswrong.com/posts/hQaxcitYgKjJqMdps/ai-59-model-updates
Narrated by TYPE III AUDIO.
A New Bill Offer Has Arrived
Center for AI Policy proposes a concrete actual model bill for us to look at.
Here was their announcement:
WASHINGTON – April 9, 2024 – To ensure a future where artificial intelligence (AI) is safe for society, the Center for AI Policy (CAIP) today announced its proposal for the “Responsible Advanced Artificial Intelligence Act of 2024.” This sweeping model legislation establishes a comprehensive framework for regulating advanced AI systems, championing public safety, and fostering technological innovation with a strong sense of ethical responsibility.
“This model legislation is creating a safety net for the digital age,” said Jason Green-Lowe, Executive Director of CAIP, “to ensure that exciting advancements in AI are not overwhelmed by the risks they pose.”
The “Responsible Advanced Artificial Intelligence Act of 2024” is model legislation that contains provisions for requiring that AI be developed safely [...]
---
Outline:
(05:00) RTFB: Read the Bill
(05:39) Basics and Key Definitions
(10:00) Oh the Permits You’ll Need
(21:11) Rubrics for Your Consideration
(25:55) Open Model Weights Are Unsafe And Nothing Can Fix This
(30:02) Extremely High Concern Systems
(32:43) The Judges Decide
(35:01) Several Rapid-Fire Final Sections
(39:47) Overall Take: A Forceful, Flawed and Thoughtful Model Bill
(49:20) The Usual Objectors Respond: The Severability Clause
(52:34) The Usual Objectors Respond: Inception
(54:12) The Usual Objectors Respond: Rulemaking Authority
(01:01:53) Conclusion
---
First published:
April 10th, 2024
Source:
https://www.lesswrong.com/posts/SQ9wDmsELBmA4Lega/rtfb-on-the-new-proposed-caip-ai-bill
Narrated by TYPE III AUDIO.
Previously: #1
It feels so long ago that Covid and health were my beat, and what everyone often thought about all day, rather than AI. Yet the beat goes on. With Scott Alexander at long last giving us what I expect to be effectively the semi-final words on the Rootclaim debate, it seemed time to do this again.
Bad News
I know no methodical way to find a good, let alone great, therapist.
Cate Hall: One reason it's so hard to find a good therapist is that all the elite ones market themselves as coaches.
As a commenter points out, therapists who can't make it also market themselves as coaches or similar, so even if Cate's claim is true, finding a good therapist this way is still tough.
My actual impression is that the elite therapists largely do not market themselves at all. They instead work on referrals and [...]
---
Outline:
(00:24) Bad News
(01:27) Good News, Everyone
(03:35) The Battle of the Bulge
(09:13) Support Anti-Aging Research
(09:37) Variably Effective Altruism
(09:51) Periodic Reminders (You Should Know This Already)
(12:05) FDA Delenda Est
(13:03) Other Enemies of Life
(14:04) Covid Postmortems
(14:55) Everything sounds like a sales pitch
(17:07) Information that would have been helpful was never provided
(17:49) A disconnect between what I experienced on the ground and the narrative I was hearing
(21:52) Covid-19 Origins
(24:29) Assisted Suicide Watch
(27:58) Talking Price
---
First published:
April 9th, 2024
Source:
https://www.lesswrong.com/posts/wfz47Ez2r4rQZuYBY/medical-roundup-2
Narrated by TYPE III AUDIO.
It was clear within the first ten minutes this would be a rich thread to draw from. In my childhood and education roundups, and of course with my own kids, I have been dealing with the issues Haidt talks about in his new book, The Anxious Generation. Ideally I’d also have read the book, but perfect as enemy of the good and all that.
I will start with my analysis of the podcast, in my now-standard format. Then I will include other related content I was going to put into my next childhood roundup.
---
Outline:
(42:46) Ban Phones in Schools
(54:33) Let Kids be Kids
---
First published:
April 5th, 2024
Source:
https://www.lesswrong.com/posts/6hciEN9DGsS8CEuox/on-the-2nd-cwt-with-jonathan-haidt
Narrated by TYPE III AUDIO.
Another round? Of economists projecting absurdly small impacts, of Google publishing highly valuable research, a cycle of rhetoric, more jailbreaks, and so on. Another great podcast from Dwarkesh Patel, this time going more technical. Another proposed project with a name that reveals quite a lot. A few genuinely new things, as well. On the new offerings front, DALL-E 3 now allows image editing, so that's pretty cool.
Table of Contents
Don’t miss out on Dwarkesh Patel's podcast with Sholto Douglas and Trenton Bricken, which got the full write-up treatment.
---
Outline:
(00:36) Language Models Offer Mundane Utility
(08:12) Language Models Don’t Offer Mundane Utility
(19:49) Clauding Along
(20:23) Fun with Image Generation
(21:22) Deepfaketown and Botpocalypse Soon
(26:51) They Took Our Jobs
(31:27) The Art of the Jailbreak
(34:53) Many-shot jailbreaking
(42:37) Cybersecurity
(45:02) Get Involved
(45:17) Introducing
(47:03) In Other AI News
(53:17) Stargate AGI
(56:13) Larry Summers Watch
(01:05:21) Quiet Speculations
(01:14:10) AI Doomer Dark Money Astroturf Update
(01:22:08) The Quest for Sane Regulations
(01:27:03) The Week in Audio
(01:27:29) Rhetorical Innovation
(01:35:09) Aligning a Smarter Than Human Intelligence is Difficult
(01:48:55) People Are Worried About AI Killing Everyone
(01:52:20) The Lighter Side
---
First published:
April 4th, 2024
Source:
https://www.lesswrong.com/posts/qQmWvm68GsXJtK4EQ/ai-58-stargate-agi
Narrated by TYPE III AUDIO.
Previous Fertility Roundups: #1, #2.
I seem to be doing these about twice a year. The actual situation changes slowly, so presumably the pace of interesting new things should slow down over time from here.
Demographics
This time around, a visualization. Where will the next 1,000 babies be born?
Population Trends
Scott Lincicome notes American population now expected to peak in 2080 at 369 million.
South Korea now down to 0.7 births per woman. The story of South Korea is told as a resounding success, of a country that made itself rich and prosperous. But what does it profit us, if we become nominally rich and prosperous, but with conditions so hostile that we cannot or will not bring children into them? If the rule you followed led you here, of what use was the rule? Why should others follow it?
---
Outline:
(00:20) Demographics
(00:32) Population Trends
(04:31) Causes
(10:50) Causes: South Korea
(18:56) Causes: South Korea: Status Competition
(27:06) More Dakka
(33:38) Less But Nonzero Dakka
(35:31) Preferences
(37:11) Surrogacy
(38:47) Technology
(40:58) Insular High Fertility Cultures
(44:44) In Brief
(45:19) Cultural Trends
---
First published:
April 2nd, 2024
Source:
https://www.lesswrong.com/posts/5k5FeFDCqXfLMj5SJ/fertility-roundup-3
Narrated by TYPE III AUDIO.
Dwarkesh Patel continues to be on fire, and the podcast notes format seems like a success, so we are back once again.
This time the topic is how LLMs are trained, work and will work in the future. Timestamps are for YouTube. Where I inject my own opinions or takes, I do my best to make that explicit and clear.
This was highly technical compared to the average podcast I listen to, or that Dwarkesh does. This podcast definitely threatened at times to go over my head technically, and some details did go over my head outright. I still learned a ton, and expect you will too if you pay attention.
This is an attempt to distill what I found valuable, and what questions I found most interesting. I did my best to make it intuitive to follow even if you are not technical, but [...]
---
First published:
April 1st, 2024
Narrated by TYPE III AUDIO.
Welcome, new readers!
This is my weekly AI post, where I cover everything that is happening in the world of AI, from what it can do for you today (‘mundane utility’) to what it can promise to do for us tomorrow, and the potentially existential dangers future AI might pose for humanity, along with covering the discourse on what we should do about all of that.
You can of course Read the Whole Thing, and I encourage that if you have the time and interest, but these posts are long, so they are also designed to let you pick the sections that you find most interesting. Each week, I pick the sections I feel are the most important, and put them in bold in the table of contents.
Not everything here is about AI. I did an economics roundup on Tuesday, and a general monthly roundup [...]
---
Outline:
(01:16) Language Models Offer Mundane Utility
(06:58) Language Models Don’t Offer Mundane Utility
(08:37) Stranger Things
(09:09) Clauding Along
(16:30) Fun with Image Generation
(19:01) Deepfaketown and Botpocalypse Soon
(21:29) They Took Our Jobs
(24:19) Introducing
(26:27) In Other AI News
(28:59) Loud Speculations
(31:10) Quiet Speculations
(36:04) Principles of Microeconomics
(44:54) The Full IDAIS Statement
(45:37) Consensus Statement on Red Lines in Artificial Intelligence
(47:54) Roadmap to Red Line Enforcement
(50:19) Conclusion
(50:49) The Quest for Sane Regulations
(56:38) The Week in Audio
(57:29) Rhetorical Innovation
(01:13:41) How Not to Regulate AI
(01:25:48) The Three Body Problem (Spoiler-Free)
(01:27:11) AI Doomer Dark Money Astroturf Update
(01:36:35) Evaluating a Smarter Than Human Intelligence is Difficult
(01:50:15) Aligning a Smarter Than Human Intelligence is Difficult
(01:53:32) AI is Deeply Unpopular
(01:53:47) People Are Worried About AI Killing Everyone
(01:55:43) Other People Are Not As Worried About AI Killing Everyone
(01:57:31) Wouldn’t You Prefer a Good Game of Chess?
(02:00:03) The Lighter Side
---
First published:
March 28th, 2024
Source:
https://www.lesswrong.com/posts/5Dz3ZrwBzzMfaucrH/ai-57-all-the-ai-news-that-s-fit-to-print
Narrated by TYPE III AUDIO.
I call the section ‘Money Stuff’ but as a column name that is rather taken. There has been lots to write about on this front that didn’t fall neatly into other categories. It clearly benefited a lot from being better organized into subsections, and the monthly roundups could benefit from being shorter, so this will probably become a regular thing.
They Took Our Jobs
Quite the opposite, actually. The jobs situation remains excellent.
Whatever else you think of the economy, layoffs are still at very low levels; the last three years are the lowest levels on record. Do note that the bottom of this chart is 15,000 rather than zero, even without adjusting for population size.
Ford says it is reexamining where to make cars after the UAW strikes. The union responded by saying, essentially, ‘f*** you, pay me’:
“Maybe Ford doesn’t need to move [...]
---
Outline:
(00:24) They Took Our Jobs
(01:34) Company Formations Seem Permanently Higher
(02:06) Whoops
(03:29) Vibecession Via Healthcare Spending
(05:20) Vibecession Look at All the Nice Things and Good Numbers
(08:24) Vibecession via Inappropriate Price Index
(13:15) Vibecession Anyway
(15:04) Vibecession via Interest Rates
(16:37) Prediction Markets
(17:46) The Efficient Market Hypothesis is False
(18:09) Failed Markets in Everything
(23:08) Failed Markets in Transparency
(25:43) Unprofitable Markets in Air Travel
(27:49) Unprofitable Markets in Ground Transportation
(32:10) Detroit, Georgism
(33:29) California Approaches the Tax Tipping Point
(34:00) Taxing On the Margin
(35:39) Occupational License as Security Bond
(38:14) The Work From Home Debate
(38:50) Patrick McKenzie Explains it All
(41:15) In Brief
---
First published:
March 26th, 2024
Source:
https://www.lesswrong.com/posts/hCNt7dc7QXuKB2gsR/economics-roundup-1
Narrated by TYPE III AUDIO.
Last week Sam Altman spent two hours with Lex Fridman (transcript). Given how important it is to understand where Altman's head is at and learn what he knows, this seemed like another clear case where extensive notes were in order.
Lex Fridman overperformed, asking harder questions than I expected and going deeper than I expected, and succeeded in getting Altman to give a lot of what I believe were genuine answers. The task is ‘get the best interviews you can while still getting interviews’ and this could be close to the production possibilities frontier given Lex's skill set.
There was not one big thing that stands out given what we already have heard from Altman before. It was more the sum of little things, the opportunity to get a sense of Altman and where his head is at, or at least where he is presenting it as [...]
---
First published:
March 25th, 2024
Source:
https://www.lesswrong.com/posts/AaS6YRAGBFrxt6ZMj/on-lex-fridman-s-second-podcast-with-altman
Narrated by TYPE III AUDIO.
Hopefully, anyway. Nvidia has a new chip.
Also Altman has a new interview.
And most of Inflection has new offices inside Microsoft.
Table of Contents
---
Outline:
(00:18) Language Models Offer Mundane Utility
(10:28) Clauding Along
(12:45) Language Models Don’t Offer Mundane Utility
(16:11) Fun with Image Generation
(21:00) Deepfaketown and Botpocalypse Soon
(24:20) They Took Our Jobs
(37:17) Generative AI in Games
(40:20) Get Involved
(41:38) Introducing
(51:03) Grok the Grok
(54:43) New Nvidia Chip
(56:40) Inflection Becomes Microsoft AI
(58:11) In Other AI News
(01:04:18) Wait Till Next Year
(01:11:57) Quiet Speculations
(01:20:03) The Quest for Sane Regulations
(01:25:20) The Week in Audio
(01:26:03) Rhetorical Innovation
(01:31:11) Read the Roon
(01:34:01) Pick Up the Phone
(01:36:17) Aligning a Smarter Than Human Intelligence is Difficult
(01:45:14) Polls Show People Are Worried About AI
(01:52:29) People Are Worried About AI Killing Everyone
(01:54:57) Other People Are Not As Worried About AI Killing Everyone
(02:04:57) The Lighter Side
---
First published:
March 21st, 2024
Source:
https://www.lesswrong.com/posts/iH5Sejb4dJGA2oTaP/ai-56-blackwell-that-ends-well
Narrated by TYPE III AUDIO.
Like the government-commissioned Gladstone Report on AI itself, this post has two sections.
First I cover the Gladstone Report's claims and arguments about the state of play, including what they learned talking to people inside the labs. I mostly agree with their picture and conclusions, both in terms of arguments and reported findings; however, I already mostly agreed. If these arguments and this information are new to someone, and the form of a government-backed report helps them process it and take it seriously, this is good work. However, in terms of convincing an already informed skeptic, I believe this is a failure. They did not present their findings in a way that should be found convincing by the otherwise unconvinced.
Second I cover the Gladstone Report's recommended courses of action. It is commendable that the report lays out a concrete, specific and highly detailed proposal. A [...]
---
Outline:
(01:13) Executive Summary of Their Findings: Oh No
(07:13) Gladstone Makes Its Case
(14:42) Why Is Self-Regulation Insufficient?
(17:26) What About Competitiveness?
(21:12) How Dare You Solve Any Problem That Isn’t This One?
(22:44) What Makes You Think We Need To Worry About This?
(23:57) What Arguments are Missing?
(27:45) Tonight at 11: Doom!
(30:51) The Claim That Frontier Labs Lack Countermeasures For Loss of Control
(35:42) The Future Threat
(38:45) That All Sounds Bad, What Should We Do?
(45:11) The Key Proposal: Extreme Compute Limits
(51:15) Implementation Details of Compute Tiers
(56:16) The Quest for Sane International Regulation
(01:04:16) The Other Proposals
(01:11:24) Conclusions, Both Theirs and Mine
(01:15:40) What Can We Do About All This?
---
First published:
March 20th, 2024
Source:
https://www.lesswrong.com/posts/ApZJy3NKfW5CkftQq/on-the-gladstone-report
Narrated by TYPE III AUDIO.
AI developments have picked up the pace. That does not mean that everything else stopped to get out of the way. The world continues.
Do I have the power?
Emmett Shear speaking truth: Wielding power is of course potentially dangerous and it should be done with due care, but there is no virtue in refusing the call.
There is also an art to avoiding power, and some key places to exercise it. Be keenly aware of when having power in a given context would ruin everything.
Natural General Lack of Intelligence in Tech
Eliezer Yudkowsky reverses course, admits aliens are among us and we have proof.
Eliezer Yudkowsky: To understand the user interfaces on microwave ovens, you need to understand that microwave UI designers are aliens. As in, literal nonhuman aliens who infiltrated Earth, who believe that humans desperately want to hear piercingly [...]
---
Outline:
(00:40) Natural General Lack of Intelligence in Tech
(04:47) Bad International News
(05:20) Dopamine Culture
(07:56) Customer Service
(08:53) Environmentalists Sabotaging the Environment
(09:11) Government Working
(17:41) On the Media
(20:13) Crime and Punishment
(27:40) Good News, Everyone
(33:27) The Time That is Given to Us
(39:56) Hotel Six Hundred
(41:10) While I Cannot Condone This
(48:00) Societal Tendencies
(49:56) Such Sufferin
(54:23) The Good and Bad of Academia
(56:31) Think of the Children
(01:01:38) Was Democracy a Mistake?
(01:04:30) The Philosophy of Fantasy
(01:07:26) Sports Go Sports
(01:19:51) Gamers Gonna Game Game Game Game Game
(01:37:50) The Virtue of Silence
(01:38:02) The Lighter Side
---
First published:
March 19th, 2024
Source:
https://www.lesswrong.com/posts/iCvdqrkWg34FNFZYg/monthly-roundup-16-march-2024
Narrated by TYPE III AUDIO.
Introducing Devin
Is the era of AI agents writing complex code systems without humans in the loop upon us?
Cognition is calling Devin ‘the first AI software engineer.’
Here is a two minute demo of Devin benchmarking LLM performance.
Devin has its own web browser, which it uses to pull up documentation.
Devin has its own code editor.
Devin has its own command line.
Devin uses debugging print statements and uses the log to fix bugs.
Devin builds and deploys entire stylized websites without even being directly asked.
What could possibly go wrong? Install this on your computer today.
Padme.
The Real Deal
I would by default assume all demos were supremely cherry-picked. My only disagreement with Austen Allred's statement here is that this rule is not new:
Austen Allred: New rule:
If someone only [...]
---
Outline:
(00:02) Introducing Devin
(00:48) The Real Deal
(03:27) The Metric
(04:48) What Could Possibly Go Subtly Wrong?
(07:29) What Could Possibly Go Massively Wrong for You?
(09:36) If This is What Skepticism Looks Like
(12:33) What Happens If You Fully Automate Software Engineering
(16:24) What Could Possibly Go Massively Wrong for Everyone?
(18:24) Conclusion: Whistling Past
---
First published:
March 18th, 2024
Source:
https://www.lesswrong.com/posts/wovJBkfZ8rTyLoEKv/on-devin
Narrated by TYPE III AUDIO.
Things were busy once again, partly from the Claude release but from many other sides as well. So even after cutting out both the AI coding agent Devin and the Gladstone Report along with previously covering OpenAI's board expansion and investigative report, this is still one of the longest weekly posts.
In addition to Claude and Devin, we got among other things Command-R, Inflection 2.5, OpenAI's humanoid robot partnership reporting back after only 13 days and Google DeepMind with an embodied cross-domain video game agent. You can definitely feel the acceleration.
The backlog expands. Once again, I say to myself, I will have to up my reporting thresholds and make some cuts. Wish me luck.
Table of Contents
---
Outline:
(00:52) Language Models Offer Mundane Utility
(06:17) Claude 3 Offers Mundane Utility
(11:57) Prompt Attention
(17:28) Clauding Along
(27:19) Language Models Don’t Offer Mundane Utility
(31:33) GPT-4 Real This Time
(32:22) Copyright Confrontation
(33:34) Fun with Image Generation
(37:49) They Took Our Jobs
(40:38) Get Involved
(43:53) Introducing
(51:55) Inflection 2.5
(55:04) Paul Christiano Joins NIST
(01:01:38) In Other AI News
(01:07:40) Quiet Speculations
(01:23:06) The Quest for Sane Regulations
(01:31:34) The Week in Audio
(01:31:44) Rhetorical Innovation
(01:41:23) A Failed Attempt at Adversarial Collaboration
(01:48:37) Spy Versus Spy
(01:52:38) Shouting Into the Void
(01:54:55) Open Model Weights are Unsafe and Nothing Can Fix This
(01:56:37) Aligning a Smarter Than Human Intelligence is Difficult
(01:59:49) People Are Worried About AI Killing Everyone
(02:00:39) Other People Are Not As Worried About AI Killing Everyone
(02:07:39) The Lighter Side
---
First published:
March 14th, 2024
Source:
https://www.lesswrong.com/posts/N3tXkA9Jj6oCB2eiJ/ai-55-keep-clauding-along
Narrated by TYPE III AUDIO.
TikTok Might Get Banned Soon
This attempt is getting reasonably far rather quickly, passing the House with broad support.
Alec Stapp: TikTok bill to remove influence of CCP:
– passed unanimously out of committee
– GOP leadership says they’ll bring it to the floor for a vote next week
– Biden says he’ll sign the bill if passed
Can’t believe it's taken this long, but should be done soon.
It's been obvious for years that we shouldn’t let China control a black-box algorithm that influences >100 million American users.
JSM: Can this stand up to court scrutiny though?
Alec Stapp: Yes.
It then passed the House 352-65, despite opposition from Donald Trump.
Manifold is as of now around 72% that a bill will pass, similar to Metaculus. Consensus is that it is unlikely that ByteDance will divest. They [...]
---
Outline:
(00:03) TikTok Might Get Banned Soon
(02:48) Execution is Everything
(10:33) RTFB: Read The Bill
(19:57) How Popular is a TikTok Ban?
(20:15) Reciprocity is the Key to Every Relationship
(23:14) Call Your Congressman
(28:25) TikTok Data Sharing
(34:41) TikTok Promoting Chinese Interests
(45:41) Tyler Cowen Opposes the Bill
(48:23) Trump Opposes the Bill
(53:40) To Be Clear You Can Absolutely Go Too Far
(54:09) Conclusion
---
First published:
March 13th, 2024
Source:
https://www.lesswrong.com/posts/cjrDNwoWwuTfc3Hbu/on-the-latest-tiktok-bill
Narrated by TYPE III AUDIO.
It is largely over.
The investigation into events has concluded, finding no wrongdoing anywhere.
The board has added four new board members, including Sam Altman. There will still be further additions.
Sam Altman now appears firmly back in control of OpenAI.
None of the new board members have previously been mentioned on this blog, or were known to me at all.
They are mysteries with respect to AI. As far as I can tell, all three lack technical understanding of AI and have no known prior opinions or engagement on topics of AI, AGI and AI safety of any kind including existential risk.
Microsoft and investors indeed have so far come away without a seat. They also, however, lack known strong bonds to Altman, so this is not obviously a board fully under his control if there were to be another crisis. They now [...]
---
Outline:
(02:34) The New Board
(11:27) The Investigation Probably Was Not Real
(19:09) The New York Times Leak and Gwern's Analysis of It
(30:29) What Do We Now Think Happened?
(36:51) Altman's Statement
(39:26) Helen Toner and Tasha McCauley's Statement
(40:44) The Case Against Altman
(47:48) The Case For Altman and What We Will Learn Next
---
First published:
March 12th, 2024
Source:
https://www.lesswrong.com/posts/e5kLSeLJ8T5ddpe2X/openai-the-board-expands
Narrated by TYPE III AUDIO.
The big news this week was of course the release of Claude 3.0 Opus, likely in some ways the best available model right now. Anthropic now has a highly impressive model, impressive enough that it seems as if it breaks at least the spirit of their past commitments on how far they will push the frontier. We will learn more about its ultimate full capabilities over time.
We also got quite the conversation about big questions of one's role in events, which I immortalized as Read the Roon. Since publication Roon has responded, which I have edited into the post along with some additional notes.
That still leaves plenty of fun for the full roundup. We have spies. We have accusations of covert racism. We have Elon Musk suing OpenAI. We have a new summary of simulator theory. We have NIST, tasked with AI regulation, literally struggling [...]
---
Outline:
(01:03) Language Models Offer Mundane Utility
(09:25) Language Models Don’t Offer Mundane Utility
(11:52) LLMs: How Do They Work?
(15:50) Copyright Confrontation
(17:34) Oh Elon
(18:47) We realized building AGI will require far more resources than we’d initially imagined
(19:59) We and Elon recognized a for-profit entity would be necessary to acquire those resources
(21:49) We advance our mission by building widely-available beneficial tools
(24:46) DNA Is All You Need
(27:21) GPT-4 Real This Time
(30:11) Fun with Image Generation
(33:16) Deepfaketown and Botpocalypse Soon
(34:16) They Took Our Jobs
(35:20) Get Involved
(36:41) Introducing
(37:19) In Other AI News
(44:48) More on Self-Awareness
(47:05) Racism Remains a Problem for LLMs
(50:33) Project Maven
(53:35) Quiet Speculations
(01:00:01) The Quest for Sane Regulations
(01:07:19) The Week in Audio
(01:09:21) Rhetorical Innovation
(01:18:46) Another Open Letter
(01:22:33) Aligning a Smarter Than Human Intelligence is Difficult
(01:26:34) Security is Also Difficult, Although Perhaps Not This Difficult
(01:34:04) The Lighter Side
---
First published:
March 7th, 2024
Source:
https://www.lesswrong.com/posts/Nvi94KJSDGZMjknZS/ai-54-clauding-along
Narrated by TYPE III AUDIO.
Claude 3.0
Claude 3.0 is here. It is too early to know for certain how capable it is, but Claude 3.0's largest version is in a similar class to GPT-4 and Gemini Advanced. It could plausibly now be the best model for many practical uses, with praise especially coming in on coding and creative writing.
Anthropic has decided to name its three different size models Opus, Sonnet and Haiku, with Opus only available if you pay. Can we just use Large, Medium and Small?
Cost varies quite a lot by size; note this is a log scale on the x-axis, whereas the y-axis isn't labeled.
This post goes over the benchmarks, statistics and system card, along with everything else people have been reacting to. That includes a discussion about signs of self-awareness (yes, we are doing this again) and also raises the question of whether [...]
---
Outline:
(00:03) Claude 3.0
(01:10) Benchmarks and Stats
(06:01) The System Card
(13:06) The System Prompt
(17:42) Reactions on How Good Claude 3 is in Practice
(26:19) It Can’t Help But Notice
(31:47) Acts of Potential Self Awareness Awareness
(39:33) We Can’t Help But Notice
(55:34) What Happens Next?
---
First published:
March 6th, 2024
Source:
https://www.lesswrong.com/posts/DwexbFdPJ5p9Er8wA/on-claude-3-0
Narrated by TYPE III AUDIO.
Roon, a member of OpenAI's technical staff, is one of the few candidates for a Worthy Opponent when discussing questions of AI capabilities development, AI existential risk and what we should do about it. Roon is alive. Roon is thinking. Roon clearly values good things over bad things. Roon is engaging with the actual questions, rather than denying or hiding from them, and is unafraid to call all sorts of idiots idiots. As his profile once said, he believes the spice must flow and that we should just go ahead, and he makes a mixture of arguments for that, some good, some bad and many absurd. Also, his account is fun as hell.
Thus, when he comes out as strongly as he seemed to do recently, attention is paid, and we got to have a relatively good discussion of key questions. While I attempt to contribute here, this post is largely aimed at preserving [...]
---
Outline:
(04:13) The Doubling Down
(06:03) Connor Leahy Gives it a Shot
(11:02) Roon Responds to Connor
(14:27) Connor Goes Deep
(30:57) A Question of Agency
---
First published:
March 5th, 2024
Source:
https://www.lesswrong.com/posts/jPZXx3iMaiJjdnMbv/read-the-roon
Narrated by TYPE III AUDIO.
Legalize housing. It is both a good slogan and also a good idea.
The struggle is real, ongoing and ever-present. Do not sleep on it. The Housing Theory of Everything applies broadly, even to the issue of AI. If we built enough housing that life vastly improved and people could envision a positive future, they would be far more inclined to think well about AI.
In Brief
What will AI do to housing? If we consider what the author here calls a ‘reasonably optimistic’ scenario and what I’d call a ‘maximally disappointingly useless’ scenario, all AI does is replace some amount of some forms of labor. Given current AI capabilities, it won’t replace construction, so some other sectors get cheaper, making housing relatively more expensive. Housing costs rise, the crisis gets more acute.
Chris Arnade says we live in a high-regulation low-trust society in America [...]
---
Outline:
(00:28) In Brief
(04:01) Legalize Housing
(13:46) Regulatory Barriers
(16:08) Future Construction Expectations
(18:55) Rents
(20:20) Different Designs
(23:51) Landmarks
(24:40) History
(26:15) Public Opinion
(27:16) NIMBY Sightings
(30:34) Houses as Savings
(32:37) Union Dues
(37:20) Landlords
(37:41) Construction
(38:08) Rent
(38:27) Who are You?
(40:06) Good Money After Bad
(40:21) Commercial Real Estate
(42:12) San Francisco
(48:25) New York City
(51:45) Austin
(54:31) Kentucky House Bill 102
(59:25) Tokyo
(01:00:08) Vancouver
(01:01:26) Minneapolis
(01:02:50) Texas
(01:03:41) Florida
(01:06:37) Cities Build Housing, Rents Decline
(01:09:57) Los Angeles
(01:12:26) Argentina
(01:13:05) Other Places Do Things
(01:13:12) Rent Control
(01:17:27) Traffic and Transit
(01:25:47) The Lighter Side
---
First published:
March 4th, 2024
Source:
https://www.lesswrong.com/posts/m8ahbiumz8C9mnGnp/housing-roundup-7
Narrated by TYPE III AUDIO.
Demis Hassabis was interviewed twice this past week.
First, he was interviewed on Hard Fork. Then he had a much more interesting interview with Dwarkesh Patel.
This post covers my notes from both interviews, mostly the one with Dwarkesh.
Hard Fork
Hard Fork was less fruitful, because they mostly asked what are for me the wrong questions, and mostly got answers I presume Demis has given many times. So I only noticed two things, neither of which is ultimately surprising.
---
Outline:
(00:25) Hard Fork
(02:21) Dwarkesh Patel
---
First published:
March 1st, 2024
Narrated by TYPE III AUDIO.
The main event continues to be the fallout from The Gemini Incident. Everyone is focusing there now, and few are liking what they see.
That does not mean other things stop. There were two interviews with Demis Hassabis, with Dwarkesh Patel's being predictably excellent. We got introduced to another set of potentially highly useful AI products. Mistral partnered up with Microsoft the moment Mistral got France to pressure the EU to agree to cripple the regulations that Microsoft wanted crippled. You know. The usual stuff.
Table of Contents
---
Outline:
(00:39) Language Models Offer Mundane Utility
(05:00) Language Models Don’t Offer Mundane Utility
(06:15) OpenAI Has a Sales Pitch
(10:16) The Gemini Incident
(19:19) Political Preference Tests for LLMs
(22:13) GPT-4 Real This Time
(23:13) Fun with Image Generation
(23:57) Deepfaketown and Botpocalypse Soon
(33:00) They Took Our Jobs
(36:34) Get Involved
(36:46) Introducing
(40:08) In Other AI News
(41:46) Quiet Speculations
(45:13) Mistral Shows Its True Colors
(51:13) The Week in Audio
(51:47) Rhetorical Innovation
(54:43) Open Model Weights Are Unsafe and Nothing Can Fix This
(01:02:54) Aligning a Smarter Than Human Intelligence is Difficult
(01:05:03) Other People Are Not As Worried About AI Killing Everyone
(01:06:36) The Lighter Side
---
First published:
February 29th, 2024
Source:
https://www.lesswrong.com/posts/FcaqbuYbPdesdkWiH/ai-53-one-more-leap
Narrated by TYPE III AUDIO.
Previously: The Gemini Incident (originally titled Gemini Has a Problem)
The fallout from The Gemini Incident continues.
Also the incident continues. The image model is gone. People then focused on the text model. The text model had its own related problems, some now patched and some not.
People are not happy. Those people smell blood. It is a moment of clarity.
Microsoft even got in on the act, as we rediscover how to summon Sydney.
There is a lot more to discuss.
The Ultimate New York Times Reaction
First off, I want to give a shout out to The New York Times here, because wow, chef's kiss. So New York Times. Much pitchbot.
Dominic Cummings: true art from NYT, AI can’t do this yet
This should be in the dictionary as the new definition of Chutzpah.
Do [...]
---
Outline:
(00:39) The Ultimate New York Times Reaction
(02:30) The Ultimate Grimes Reaction
(04:32) Three Positive Reactions
(06:24) The AI Ethicist Reacts
(11:33) Google Reacts on Images
(12:17) What happened
(13:51) Next steps and lessons learned
(16:18) The Market Reacts a Little
(18:14) Guess Who's Back
(21:47) Everyone Has Some Issues
(22:19) Clarifying Refusals
(24:19) Refusals Aplenty
(31:44) Unequal Treatment
(32:51) Gotcha Questions
(34:01) No Definitive Answer
(41:27) Wrong on the Internet
(44:00) Everyone Has a Plan Until They’re Punched in the Face
(44:49) What Should We Learn from The Gemini Incident outside of AI?
(51:29) What Should We Learn about AI from The Gemini Incident?
(58:08) This Is Not a Coincidence Because Nothing is Ever a Coincidence
(01:02:09) AI Ethics is (Often) Not About Ethics or Safety
(01:07:00) Make an Ordinary Effort
(01:15:02) Fix It, Felix
(01:18:28) The Deception Problem Gets Worse
(01:22:41) Where Do We Go From Here?
---
First published:
February 27th, 2024
Source:
https://www.lesswrong.com/posts/oJp2BExZAKxTThuuF/the-gemini-incident-continues
Narrated by TYPE III AUDIO.
We were treated to technical marvels this week.
At Google, they announced Gemini Pro 1.5, with a million token context window within which it has excellent recall, using mixture of experts to get Gemini Advanced level performance (e.g. GPT-4 level) out of Gemini Pro levels of compute. This is a big deal, and I think people are sleeping on it. Also they released new small open weights models that look to be state of the art.
At OpenAI, they announced Sora, a new text-to-video model that is a large leap from the previous state of the art. I continue to be a skeptic on the mundane utility of video models relative to other AI use cases, and think they still have a long way to go, but this was both technically impressive and super cool.
Also, in both places, mistakes were made.
At OpenAI, ChatGPT [...]
---
Outline:
(01:48) Language Models Offer Mundane Utility
(05:08) Language Models Don’t Offer Mundane Utility
(10:17) Call Me Gemma Now
(11:35) Google Offerings Keep Coming and Changing Names
(13:08) GPT-4 Goes Crazy
(20:00) GPT-4 Real This Time
(22:11) Fun with Image Generation
(22:27) Deepfaketown and Botpocalypse Soon
(28:44) Selling Your Chatbot Data
(30:11) Selling Your Training Data
(32:18) They Took Our Jobs
(32:41) Get Involved
(32:55) Introducing
(35:13) In Other AI News
(36:16) Quiet Speculations
(40:27) The Quest for Sane Regulations
(43:43) The Week in Audio
(43:54) The Original Butlerian Jihad
(45:22) Rhetorical Innovation
(46:05) Public Service Announcement
(49:16) People Are Worried About AI Killing Everyone
(50:14) Other People Are Not As Worried About AI Killing Everyone
(52:57) The Lighter Side
---
First published:
February 22nd, 2024
Source:
https://www.lesswrong.com/posts/WmxS7dbHuxzxFei64/ai-52-oops
Narrated by TYPE III AUDIO.
Google's Gemini 1.5 is impressive and I am excited by its huge context window. I continue to use Gemini Advanced as my default AI for everyday use when the large context window is not relevant.
However, while it does not much interfere with what I want to use Gemini for, there is a big problem with Gemini Advanced that has come to everyone's attention.
Gemini comes with an image generator. Until today it would, upon request, create pictures of humans.
On Tuesday evening, some people noticed, or decided to more loudly mention, that the humans it created might be rather different from the humans you requested…
Joscha Bach: 17th Century was wild.
[prompt was] ‘please draw a portrait of a famous physicist of the 17th century.’
Kirby: i got similar results. when I went further and had it tell me who the [...]
---
Outline:
(03:06) The Internet Reacts
(07:10) How Did This Happen?
(13:52) Google's Response
(17:39) Five Good Reasons This Matters
(18:00) Reason 1: Prohibition Doesn’t Work and Enables Bad Actors
(19:06) Reason 2: A Frontier Model Was Released While Obviously Misaligned
(21:53) Reason 3: Potentially Inevitable Conflation of Different Risks From AI
(23:55) Reason 4: Bias and False Refusals Are Not Limited to Image Generation
(26:54) Reason 5: This is Effectively Kind of a Deceptive Sleeper Agent
---
First published:
February 22nd, 2024
Source:
https://www.lesswrong.com/posts/kLTyeG7R8eYpFwe3H/gemini-has-a-problem
Narrated by TYPE III AUDIO.
Hours after Google announced Gemini 1.5, OpenAI announced their new video generation model Sora. Its outputs look damn impressive.
How Sora Works
How does it work? There is a technical report. Mostly it seems like OpenAI did standard OpenAI things, meaning they fed in tons of data, used lots of compute, and pressed the scaling button super hard. The innovations they are willing to talk about seem to be things like ‘do not crop the videos into a standard size.’
That does not mean there are not important other innovations. I presume that there are. They simply are not talking about the other improvements.
We should not underestimate the value of throwing in massively more compute and getting a lot of the fiddly details right. That has been the formula for some time now.
Some people think that OpenAI was using a game engine [...]
---
Outline:
(00:12) How Sora Works
(02:07) Sora Is Technically Impressive
(06:42) Sora What's it Good For?
(09:43) Until we can say exactly what we want, and get it, mostly I expect no dice. When you go looking for something specific, your chances of finding it are very bad.
(15:19) Sora Comes Next?
---
First published:
February 22nd, 2024
Source:
https://www.lesswrong.com/posts/35fZ6csrbcrKw9BwG/sora-what
Narrated by TYPE III AUDIO.
Previously: I hit send on The Third Gemini, and within half an hour DeepMind announced Gemini 1.5.
So this covers Gemini 1.5. One million tokens, and we are promised overall Gemini Advanced or GPT-4 levels of performance on Gemini Pro levels of compute.
This post does not cover the issues with Gemini's image generation, and what it is and is not willing to generate. I am on top of that situation and will get to it soon.
One Million Tokens
Our teams continue pushing the frontiers of our latest models with safety at the core. They are making rapid progress. In fact, we’re ready to introduce the next generation: Gemini 1.5. It shows dramatic improvements across a number of dimensions and 1.5 Pro achieves comparable quality to 1.0 Ultra, while using less compute.
It is truly bizarre to launch Gemini Advanced as a paid [...]
---
Outline:
(00:32) One Million Tokens
(04:29) Mixture of Experts
(07:39) Quality Not Quantity
(11:52) What To Do With It?
---
First published:
February 22nd, 2024
Source:
https://www.lesswrong.com/posts/N2Y664LX6pQ8rFiz2/the-one-and-a-half-gemini
Narrated by TYPE III AUDIO.
While I sort through whatever is happening with GPT-4, today's scheduled post is two recent short stories about restaurant selection.
Ye Olde Restaurante
Tyler Cowen says that restaurants saying ‘since year 19xx’ are on net a bad sign, because they are frozen in time, focusing on being reliable.
For the best meals, he says look elsewhere, to places that shine brightly and then move on.
I was highly suspicious. So I ran a test.
I checked the oldest places in Manhattan. The list had 15 restaurants. A bunch are taverns, which are not relevant to my interests. The rest include the legendary Katz's Delicatessen, which is still on the short list of very best available experiences (yes, of course you order the Pastrami), and the famous Keen's Steakhouse. I don't care for mutton, but their regular steaks are quite good. There's also Peter Luger's [...]
---
Outline:
(00:12) Ye Olde Restaurante
(04:41) Ye Newfangled Restaurante
---
First published:
February 21st, 2024
Source:
https://www.lesswrong.com/posts/Hm38rqATujCDbLFrh/a-tale-of-two-restaurant-types
Narrated by TYPE III AUDIO.
[Editor's note: I forgot to post this to WordPress on Thursday. I'm posting it here now. Sorry about that.]
Sam Altman is not playing around.
He wants to build new chip factories in the decidedly unsafe and unfriendly UAE. He wants to build up the world's supply of energy so we can run those chips.
What does he say these projects will cost?
Oh, up to seven trillion dollars. Not a typo.
Even scaling back the misunderstandings, this is what ambition looks like.
It is not what safety looks like. It is not what OpenAI's non-profit mission looks like. It is not what it looks like to have concerns about a hardware overhang, and use that as a reason why one must build AGI soon before someone else does. The entire justification for OpenAI's strategy is invalidated by this move.
I have [...]
---
Outline:
(01:02) Language Models Offer Mundane Utility
(01:57) Language Models Don’t Offer Mundane Utility
(03:22) GPT-4 Real This Time
(05:49) Deepfaketown and Botpocalypse Soon
(12:38) They Took Our Jobs
(15:33) Get Involved
(15:50) Introducing
(17:13) Altman's Ambition
(22:34) Yoto
(26:32) In Other AI News
(35:04) Quiet Speculations
(41:55) The Quest for Sane Regulations
(43:32) Washington DC Still Does Not Get It
(46:16) Many People are Saying
(48:48) China Watch
(49:38) Roon Watch
(50:48) How to Get Ahead in Advertising
(52:35) The Week in Audio
(56:45) Rhetorical Innovation
(01:01:18) Please Speak Directly Into This Microphone
(01:03:13) Aligning a Smarter Than Human Intelligence is Difficult
(01:03:39) Other People Are Not As Worried About AI Killing Everyone
(01:06:02) The Lighter Side
---
First published:
February 20th, 2024
Source:
https://www.lesswrong.com/posts/gBHNw5Ymnqw8FiMjh/ai-51-altman-s-ambition
Narrated by TYPE III AUDIO.
[Editor's Note: I forgot to cross-post this on Thursday, sorry about that. Note that this post does not cover Gemini 1.5, which was announced after I posted this. I will cover 1.5 later this week.]
We have now had a little over a week with Gemini Advanced, based on Gemini Ultra. A few reviews are in. Not that many, though, compared to what I would have expected, or what I feel the situation calls for. This is yet another case of there being an obvious thing lots of people should do, and almost no one doing it. Should we use Gemini Advanced versus ChatGPT? Which tasks are better for one versus the other?
I have compiled what takes I did see. Overall people are clearly less high on Gemini Advanced than I am, seeing it as still slightly to modestly behind ChatGPT overall. Despite that, I have [...]
---
Outline:
(01:10) Impressions of Others
(14:28) Pros and Cons
---
First published:
February 20th, 2024
Source:
https://www.lesswrong.com/posts/vdSE99ssADJEXP567/the-third-gemini
Narrated by TYPE III AUDIO.
Another month. More things. Much roundup.
Bad News
Jesse Smith writes in Asterisk that our HVAC workforce is both deeply incompetent and deeply corrupt. This certainly matches my own experience. Calculations are almost always flubbed when they are done at all; outright fraudulent paperwork is standard; no one has the necessary skills.
It certainly seems like the Biden Administration is doing its best to hurt Elon Musk? The claim here is that they cancelled a Starlink contract without justification, in order to award the contract to someone else for more than three times the price. This was on Twitter, but none of the replies seemed to offer a plausible justification.
Claim that Twitter traffic is increasingly fake, and secondary claim that this is because Musk fired those responsible for preventing it. Even if it is true that Twitter traffic is 75% fake, that does not mean [...]
---
Outline:
(00:11) Bad News
(06:48) More on the Apple Vision Pro
(08:19) For Science
(10:45) Climate
(11:11) Variously Effective Altruism
(15:10) The Plan
(17:07) Agency, Cate Hall and Finkel's Law
(21:46) Loop de Loop
(24:46) Government Working
(31:16) California Scrolling
(33:49) Crime and Punishment
(36:45) Good News, Everyone
(47:03) A Difference of Opinion
(51:19) The Lighter Side
---
First published:
February 20th, 2024
Source:
https://www.lesswrong.com/posts/zHTRivdJ7ZDSctwqi/monthly-roundup-15-february-2024
Narrated by TYPE III AUDIO.
Previously: On the Apple Vision Pro
The reviews are coming in. What say the people, other than the complaints about the two to three hour battery life?
Then later I’ll get to my own thoughts after the demo.
Reviews and Reactions
Ben Thompson reviews the Apple Vision Pro. He continues to find it a technical marvel, but is ultimately disappointed for uses other than entertainment. There is no support for multiple users beyond a highly unwieldy guest mode. There is insufficient width of coverage and inability to support multiple large screens, which is severely limiting to productivity. The eye tracking is a huge improvement over earlier attempts but not ready for such applications.
Ben anticipates that Apple will fail over time to evolve the product to support the things that would enable it to be a killer productivity app, which is what he was [...]
---
Outline:
(00:18) Reviews and Reactions
(06:45) My Own Thoughts After a Demo
(12:37) To Buy or Not to Buy?
---
First published:
February 13th, 2024
Source:
https://www.lesswrong.com/posts/HbneNh2NJpQCgfxQA/more-on-the-apple-vision-pro
Narrated by TYPE III AUDIO.
California Senator Scott Wiener of San Francisco introduces SB 1047 to regulate AI. I have put up a market on how likely it is to become law.
“If Congress at some point is able to pass a strong pro-innovation, pro-safety AI law, I’ll be the first to cheer that, but I’m not holding my breath,” Wiener said in an interview. “We need to get ahead of this so we maintain public trust in AI.”
Congress is certainly highly dysfunctional. I am still generally against California trying to act like it is the federal government, even when the cause is good, but I understand.
Can California effectively impose its will here?
On the biggest players, for now, presumably yes.
In the longer run, when things get actively dangerous, then my presumption is no.
There is a potential trap here. If we put our rules [...]
---
Outline:
(02:48) Close Reading of the Bill
(10:50) My High Level Takeaways From the Close Reading
(11:37) Another More Skeptical Reaction to the Same Bill
(13:01) What is a Covered Model Here?
(15:39) Precautionary Principle and Covered Guidance
(18:07) Non-Derivative
(18:28) So What Would This Law Actually Do?
(20:44) Crying Wolf
---
First published:
February 12th, 2024
Source:
https://www.lesswrong.com/posts/oavGczwcHWZYhmifW/on-the-proposed-california-sb-1047
Narrated by TYPE III AUDIO.
We have long been waiting for a version of this story, where someone hacks together the technology to use Generative AI to work the full stack of the dating apps on their behalf, ultimately finding their One True Love.
Or at least, we would have it, if it turned out he is Not Making This Up.
Fun question: Given he is also this guy, does that make him more or less credible?
Alas, something being Too Good to Check does not actually mean one gets to not check it, in my case via a Manifold Market. The market started trading around 50%, but has settled down at 15% after several people made strong detailed arguments that the full story did not add up; at minimum, he was doing some recreations afterwards.
Which is a shame. But why let that stop us? Either way it is a good [...]
---
First published:
February 9th, 2024
Source:
https://www.lesswrong.com/posts/QxAFoEdtsmK783jzM/one-true-love
Narrated by TYPE III AUDIO.
In a week with two podcasts I covered extensively, I was happy that there was little other news.
That is, until right before press time, when Google rebranded Bard to Gemini, released an app for that, and offered a premium subscription ($20/month) for Gemini Ultra.
Gemini Ultra is Here
I have had the honor and opportunity to check out Gemini Advanced before its release.
The base model seems to be better than GPT-4. It seems excellent for code, for explanations and answering questions about facts or how things work, for generic displays of intelligence, for telling you how to do something. Hitting the Google icon to have it look for sources is great.
In general, if you want to be a power user, if you want to push the envelope in various ways, Gemini is not going to make it easy on you. However [...]
---
Outline:
(00:21) Gemini Ultra is Here
(03:16) Language Models Offer Mundane Utility
(05:38) Language Models Don’t Offer Mundane Utility
(07:12) GPT-4 Real This Time
(09:03) Fun with Image Generation
(09:20) Deepfaketown and Botpocalypse Soon
(14:09) They Took Our Jobs
(15:16) Get Involved
(16:00) Introducing
(16:12) In Other AI News
(20:42) Quiet Speculations
(24:44) Vitalik on the Intersection of AI and Crypto
(30:10) The Quest for Sane Regulations
(30:23) The Week in Audio
(34:25) Rhetorical Innovation
(35:23) Aligning a Dumber Than Human Intelligence is Still Difficult
(35:54) People Are Worried About AI, Many People
(38:04) Other People Are Not As Worried About AI Killing Everyone
(39:54) The Lighter Side
---
First published:
February 8th, 2024
Source:
https://www.lesswrong.com/posts/Si4fRH2hGGa6HsQbu/ai-50-the-most-dangerous-thing
Narrated by TYPE III AUDIO.
Previously: Based Beff Jezos and the Accelerationists
Based Beff Jezos, the founder of effective accelerationism, delivered on his previous pledge, and did indeed debate what is to be done to navigate into the future with a highly Worthy Opponent in Connor Leahy.
The moderator almost entirely stayed out of it, and intervened well when he did, so this was a highly fair arena. It's Jezos versus Leahy. Let's get ready to rumble!
I wanted to be sure I got the arguments right and fully stated my responses and refutations, so I took extensive notes, including timestamps. On theme for this debate, this is a situation where you either do that while listening, or, once you have already listened, you are in practice never going to go back.
That does not mean you have to read all those notes and arguments. It is certainly an option [...]
---
Outline:
(01:32) Actually Based Beff Jezos (ABBJ)
(05:13) Bold Based Beff Jezos (BBBJ)
(13:00) Caustic Based Beff Jezos (CBBJ)
(17:50) What about Connor Leahy?
(19:56) Around the Debate in 80 Notes
(01:39:14) Afterwards
---
First published:
February 6th, 2024
Source:
https://www.lesswrong.com/posts/xjoKqevCRnhXzHRLT/on-the-debate-between-jezos-and-leahy
Narrated by TYPE III AUDIO.
This post is extensive thoughts on Tyler Cowen's excellent talk with Dwarkesh Patel.
It is interesting throughout. You can read this while listening, after listening, or instead of listening; it is written to be compatible with all three options. The notes are in order in terms of what they are reacting to, and were mostly written as I listened.
I see this as having been a few distinct intertwined conversations. Tyler Cowen knows more about more different things than perhaps anyone else, so that makes sense. Dwarkesh chose excellent questions throughout, displaying an excellent sense of when to follow up and how, and when to pivot.
The first conversation is about Tyler's book GOAT, about the world's greatest economists. Fascinating stuff; this made me more likely to read and review GOAT in the future if I ever find the time. I mostly agreed with Tyler's takes [...]
---
Outline:
(04:01) The Notes Themselves
(17:54) The AI and Future Scenario Section Begins
(21:04) Clearing Up Two Misconceptions
(27:11) Final Notes Section
(33:47) Concluding AI Thoughts
---
First published:
February 2nd, 2024
Source:
https://www.lesswrong.com/posts/FZkAG8Hezub7pWRM9/on-dwarkesh-s-3rd-podcast-with-tyler-cowen
Narrated by TYPE III AUDIO.
Two studies came out on the question of whether existing LLMs can help people figure out how to make bioweapons. RAND published a negative finding, showing no improvement. OpenAI found a small improvement from GPT-4, bigger for experts than for students. That's still harmless now; the question is what will happen in the future as capabilities advance.
Another news item was that Bard with Gemini Pro impressed even without Gemini Ultra, taking the second spot on the Arena leaderboard behind only GPT-4-Turbo. For now, though, GPT-4 remains in the lead.
A third cool item was this story from a Russian claiming to have used AI extensively in his quest to find his one true love. I plan to cover that on its own and have Manifold on the job of figuring out how much of the story actually happened.
Table of Contents
---
Outline:
(00:56) Language Models Offer Mundane Utility
(08:35) Language Models Don’t Offer Mundane Utility
(09:38) GPT-4 Real This Time
(10:23) Be Prepared
(16:13) Fun with Image Generation
(17:17) Deepfaketown and Botpocalypse Soon
(23:32) They Took Our Jobs
(27:54) Get Involved
(28:23) In Other AI News
(33:35) Quiet Speculations
(39:06) The Quest for Sane Regulations
(45:58) The Week in Audio
(46:49) Rhetorical Innovation
(56:40) Predictions are Hard Especially About the Future
(01:03:13) Aligning a Smarter Than Human Intelligence is Difficult
(01:07:03) Open Model Weights Are Unsafe and Nothing Can Fix This
(01:09:38) Other People Are Not As Worried About AI Killing Everyone
(01:11:37) The Lighter Side
---
First published:
February 1st, 2024
Source:
https://www.lesswrong.com/posts/RsWvhDNQRExjatzGA/ai-49-bioweapon-testing-begins
Narrated by TYPE III AUDIO.
Before we begin, I will note that I have indeed written various thoughts about the three college presidents who appeared before Congress and the resulting controversies, including the disputes regarding plagiarism. However, I have excluded them from this post.
Discipline and Prohibitions
The Washington Post Editorial Board says schools should ban smartphones, and parents should help make this happen rather than, as more often happens, opposing such bans in order to make logistical coordination easier.
I agree with the editorial board. Even when not in use, having a phone in one's pocket is a continuous distraction. The ability to use the phone creates immense social and other pressures to use it, or think about using it, continuously. If we are going to keep doing this physically required school thing at all, students need to be fully device-free during the school day except for where we intentionally want them to [...]
---
Outline:
(00:19) Discipline and Prohibitions
(05:04) School Choice
(08:34) An Argument Against School Choice
(13:10) Home School
(14:37) School Null Hypothesis Watch
(19:05) The Case Against Education, Pandemic Edition
(20:51) Early Childhood
(21:43) Primary School
(26:04) High School
(27:54) (Lack of) Standards
(31:41) College Grade Inflation
(33:41) College
(41:31) Student Debt
---
First published:
January 30th, 2024
Source:
https://www.lesswrong.com/posts/jJnDmdmLDukoTqFqB/childhood-and-education-roundup-4
Narrated by TYPE III AUDIO.
While I was in San Francisco, the big head honchos headed for Davos, where AI was the talk of the town. As well it should be, given what will be coming soon. It did not seem like anyone involved much noticed or cared about the existential concerns. That is consistent with the spirit of Davos, which has been not noticing or caring about things that don’t directly impact your business or vibe since (checks notes by which I mean an LLM) 1971. It is what it is.
Otherwise we got a relatively quiet week. For once the scheduling worked out and I avoided the Matt Levine curse. I’m happy for the lull to continue so I can pay down more debt and focus on long term projects and oh yeah also keep us all farther away from potential imminent death.
Table of Contents
---
Outline:
(00:51) Language Models Offer Mundane Utility
(03:33) Language Models Don’t Offer Mundane Utility
(03:58) Copyright Confrontation
(05:13) Fun with Image Generation
(05:31) Deepfaketown and Botpocalypse Soon
(08:20) They Took Our Jobs
(09:10) Get Involved
(09:47) In Other AI News
(12:43) Quiet Speculations
(17:26) Intelligence Squared
(27:22) The Quest for Sane Regulations
(30:33) (3) Artificial intelligence
(32:48) Open Model Weights Are Unsafe And Nothing Can Fix This
(45:16) The Week in Audio
(48:54) Rhetorical Innovation
(53:18) Malaria Accelerationism
(54:54) Aligning a Smarter Than Human Intelligence is Difficult
(57:57) Other People Are Not As Worried About AI Killing Everyone
(01:01:09) The Lighter Side
---
First published:
January 25th, 2024
Source:
https://www.lesswrong.com/posts/bWnonYFj4rwtXuErK/ai-48-the-talk-of-davos
Narrated by TYPE III AUDIO.
There's always lots of stuff going on. The backlog of other roundups keeps growing rather than shrinking. I have also decided to hold back a few things to turn them into their own posts instead.
Bad News
I wonder if it is meaningful that most of the bad news is about technology?
I don’t even know if this is news, but Rutgers finds TikTok amplifies and suppresses content based on whether it aligns with the CCP.
It would be great if we could find a way to ban or stop using TikTok that did not involve something crazy like the Restrict Act. I still think the Restrict Act is worse than nothing, if those are our only choices.
If the CCP limited its interference to explicitly internal Chinese topics, I would understand, but they do not: WSJ investigates the TikTok rabbit hole, in particular [...]
---
Outline:
(00:18) Bad News
(07:53) Government Working (USA Edition)
(10:58) Government Working (UK/EU Edition)
(15:14) Trouble in the Suez
(22:42) Crime and Punishment
(34:03) Good News, Everyone
(39:11) Sports Go Sports
(43:24) Gamers Gonna Game Game Game Game Game
(50:02) Game Reviews
(53:00) I Was Promised Flying Self-Driving Cars
(53:31) While I Cannot Condone This
(01:02:48) Money Stuff
(01:14:23) At the Movies
---
First published:
January 24th, 2024
Source:
https://www.lesswrong.com/posts/8gxkJnZCBrNBRZREH/monthly-roundup-14-january-2024
Narrated by TYPE III AUDIO.
The biggest event of the week was the Sleeper Agents paper from Anthropic. I expect that to inform our thoughts for a while to come, and to lay the foundation for additional work. We also had the first third of the IMO solved at almost gold medal level by DeepMind, discovering that math competition geometry is actually mostly composed of One Weird Trick. I knew that at the time I was doing it, though, and it was still really hard.
As usual, there was also a bunch of other stuff.
Tomorrow the 19th, I am going to be off to San Francisco for the weekend to attend a workshop. That leaves a lot of time for other events and seeing other people, a lot of which remains unfilled. So if you are interested in meeting up or want to invite me to a gathering, especially on Sunday the [...]
---
Outline:
(00:53) Language Models Offer Mundane Utility
(02:01) Language Models Don’t Offer Mundane Utility
(05:49) GPT-4 Real This Time
(07:05) Fun with Image Generation
(08:02) Copyright Confrontation
(09:43) Deepfaketown and Botpocalypse Soon
(11:31) They Took Our Jobs
(12:16) Get Involved
(13:11) Introducing
(17:36) In Other AI News
(22:48) Quiet Speculations
(25:30) The Quest for Sane Regulations
(31:50) The Week in Audio with Sam Altman
(35:03) David Brin Podcast
(45:46) Rhetorical Innovation
(48:59) Anthropic Paper on Sleeper Agents
(50:45) Anthropic Introduces Impossible Mission Force
(53:57) Aligning a Smarter Than Human Intelligence is Difficult
(58:57) The Belrose Model Continued
(01:23:46) Open Model Weights Are Unsafe And Nothing Can Fix This
(01:28:26) People Are Worried About AI Killing Everyone
(01:28:42) Other People Are Not As Worried About AI Killing Everyone
(01:31:15) The Lighter Side
---
First published:
January 18th, 2024
Source:
https://www.lesswrong.com/posts/WRGmBE3h4WjA5EC5a/ai-48-exponentials-in-geometry
Narrated by TYPE III AUDIO.
The recent paper from Anthropic is getting unusually high praise, much of it I think deserved.
The title is: Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training.
Scott Alexander also covers this, offering an excellent high level explanation, of both the result and the arguments about whether it is meaningful. You could start with his write-up to get the gist, then return here if you still want more details, or you can read here knowing that everything he discusses is covered below. There was one good comment, pointing out some of the ways deceptive behavior could come to pass, but most people got distracted by the ‘grue’ analogy.
Right up front before proceeding, to avoid a key misunderstanding: I want to emphasize that in this paper, the deception was introduced intentionally. The paper deals with attempts to remove it.
The rest of this [...]
---
Outline:
(01:02) Abstract and Basics
(04:54) Threat Models
(07:01) The Two Backdoors
(10:13) Three Methodological Variations and Two Models
(11:30) Results of Safety Training
(13:08) Jesse Mu clarifies what he thinks the implications are.
(15:20) Implications of Unexpected Problems
(18:04) Trigger Please
(19:44) Strategic Deceptive Behavior Perhaps Not Directly in the Training Set
(21:54) Unintentional Deceptive Instrumental Alignment
(25:33) The Debate Over How Deception Might Arise
(38:46) Have We Disproved That Misalignment is Too Inefficient to Survive?
(47:52) Avoiding Reading Too Much Into the Result
(55:41) Further Discussion of Implications
(59:07) Broader Reaction
---
First published:
January 17th, 2024
Source:
https://www.lesswrong.com/posts/Sf5CBSo44kmgFdyGM/on-anthropic-s-sleeper-agents-paper
Narrated by TYPE III AUDIO.
Saving up medical and health related stories from several months allowed for much better organizing of them, so I am happy I split these off. I will still post anything more urgent on a faster basis. There's lots of things here that are fascinating and potentially very important, but I’ve had to prioritize and focus elsewhere, so I hope others pick up various torches.
Vaccination Ho!
We have a new malaria vaccine. That's great. WHO thinks this is not an especially urgent opportunity, or any kind of ‘emergency’ and so wants to wait for months before actually putting shots into arms. So what if we also see reports like ‘cuts infant deaths by 13%’? WHO doing WHO things, WHO Delenda Est and all that. What can we do about this?
Also, EA and everyone else who works in global health needs to do a complete post-mortem [...]
---
Outline:
(00:25) Vaccination Ho!
(02:57) Potential Progress
(07:05) It's Not Progress
(11:09) Cost Plus
(13:18) New Findings
(15:21) FDA Delenda Est
(20:57) Covid Response Postmortem and Paths Forward
(23:56) Covid Origins
(28:32) Ban Gain of Function Research
(28:51) Cause Areas
(30:26) I criticize Effective Altruists for insufficiently high levels of epistemic rigor, because reality does not grade on a curve, but let no one confuse them with governments.
(34:04) GLP-1 Has Barely Begun
(41:53) No One Understands Nutrition
(43:24) Model This: Exercise Edition
(49:00) A Bold Stand Against Torture
---
First published:
January 16th, 2024
Source:
https://www.lesswrong.com/posts/kFDk4Q9QhqrDE68qp/medical-roundup-1
Narrated by TYPE III AUDIO.
My 2023 ACX predictions showed a clear lack of confidence in taking on the market. I won 30 markets for an average of +185 each, and lost 12 for an average loss of -185 each. When one goes 30-12, hitting 71% versus the roughly 58% average price initially paid, that is worth noticing. It is possible that I generally benefited from 2023 being a year where not much happened outside of AI, but I think it's time to know what we really think.
That means this year I’m going to add a phase, where I predict blind. Blind means I’m not allowed to look at any prediction markets. I can still look up facts, and financial markets are fair game, but nothing beyond that. Only after that will I look at Manifold. Metaculus makes this viable, as they have the ACX questions without listing probabilities.
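To make the record arithmetic above concrete, here is a minimal sketch. It treats every market as the same size, which is a simplification, since the +185 and -185 figures are averages.

```python
# Record and net profit implied by the numbers above, under the
# simplifying assumption that every market had the same stake.
wins, losses = 30, 12
avg_win, avg_loss = 185, 185

win_rate = wins / (wins + losses)         # 30 / 42, about 71.4%
net = wins * avg_win - losses * avg_loss  # 30*185 - 12*185 = +3330

print(f"record {wins}-{losses}, win rate {win_rate:.1%}, net {net:+d}")
print("versus the ~58% average price initially paid")
```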
---
Outline:
(01:47) International Politics
(11:13) American Electoral Politics
(20:48) US Politics and Government (excluding elections and AI )
(28:56) Economics
(37:10) Science (Mostly Rocketry for Some Reason)
(42:32) AI
(46:15) Hard Fork: Bonus Buy/Sell/Hold
---
First published:
January 9th, 2024
Source:
https://www.lesswrong.com/posts/7j9JXpGvXNowCkhdf/2024-acx-predictions-blind-buy-sell-hold
Narrated by TYPE III AUDIO.
Developments around relationships and dating have a relatively small speed premium, and there are once again enough of them for a full post.
The first installment speculated on why you're still single. We failed to settle the issue. A lot of you are indeed still single. So the debate continues.
You’re Single Because You’re Not Even Trying
What does it mean to not even be trying?
It does not only mean the things Alexander pointed us to last time, like 62% of singles being on zero dating apps, and a majority of singles having gone on zero dates in the past year, and a large majority not actively looking for a relationship. Here are those graphs again:
It also means things such as literally never approaching a woman in person.
Alexander (Keeper.ai): Why are so many young men single? Are they excluded from a [...]
---
Outline:
(00:25) You’re Single Because You’re Not Even Trying
(07:30) You’re Single Because of Artificial Intelligence
(10:38) You’re Single Because You Meet the Definition of Insanity
(18:52) You’re Single Because You’re Asking the Wrong Questions
(20:40) You’re Single Because of a Market Mismatch
(21:39) You’re Single Because Dating Apps Suck
(25:20) You’re Single Because You Didn’t Pay for Tinder Select
(39:24) You’re Single Because You Have Zero Friends
(43:14) You’re Single Because You Are Bad at Texting
(44:32) You’re Single Because You Have No Honor
(45:28) You’re Single Because You’re Not Hot
(51:23) You’re Single Because You Don’t Get Your House in Order
(52:55) You’re Single Because You’re Too Weird
(54:21) You’re Single Because of Your Bodycount
(59:34) You’re Single Because You Need to Learn Seduction
(01:08:12) You’re Single Because You Cannot Take a Hint
(01:10:05) You’re Single Because You Waste Your Money on Signaling
(01:12:27) You’re Still Single Because You Misalign Your Incentives
(01:13:03) You’re Not Single and You Are an Inspiration
(01:16:04) You’re Probably Not Single Stop Misrepresenting the Statistics
(01:16:35) You’re Single Because You’re Too Busy Writing Comments
(01:19:20) What About My Good Advice?
(01:21:09) Future Plans for this Series
---
First published:
January 2nd, 2024
Source:
https://www.lesswrong.com/posts/y8g4bXF7sdT9RduT6/dating-roundup-2-if-at-first-you-don-t-succeed
Narrated by TYPE III AUDIO.
[NOTE: I forgot to post this to WP/LW/RSS on Thursday, so posting it now. Sorry about that.]
The new year will be very different from the old one by the time we are done. For now, it seems like various continuations of the old one. Sometimes I look back on the week and wonder how so much happened, while in other senses very little happened.
Table of Contents
---
Outline:
(00:28) Language Models Offer Mundane Utility
(02:44) Language Models Don’t Offer Mundane Utility
(06:13) GPT-4 Real This Time
(12:30) Liar Liar
(13:50) Fun with Image Generation
(19:09) Magic: The Generating
(23:57) Copyright Confrontation
(27:30) Deepfaketown and Botpocalypse Soon
(29:08) They Took Our Jobs
(37:39) Get Involved
(37:48) Introducing
(38:41) In Other AI News
(41:18) Quiet Speculations
(51:33) The Quest for Sane Regulations
(54:57) The Week in Audio
(55:10) AI Impacts Survey
(57:29) Rhetorical Innovation
(01:14:41) Aligning a Human Level Intelligence is Still Difficult
(01:17:00) Aligning a Smarter Than Human Intelligence is Difficult
(01:26:23) Won’t Get Fooled Again
(01:30:03) People Are Worried About AI Killing Everyone
(01:32:11) Other People Are Not As Worried About AI Killing Everyone
(01:34:55) The Wit and Wisdom of Sam Altman
(01:38:15) The Lighter Side
---
First published:
January 13th, 2024
Source:
https://www.lesswrong.com/posts/iygs57bHJ36AvzpMh/ai-47-meet-the-new-year
Narrated by TYPE III AUDIO.
It is that time of the year. One must ask not only whether predictions were right or wrong, whether one won or lost, but what one was and should have been thinking, whether or not good decisions were made, whether the market made sense.
The main subject will be the 2023 ACX Predictions, where I performed buy/sell/hold along with sharing my logic. The numbers quoted are from mid-February 2023, first Manifold, then Metaculus.
Section 1: World Politics
Last year I thought markets were too confident Putin would keep power. This year I think the market is not confident enough, and Metaculus, at 90%, is more accurate. Metaculus is also doing a better job adjusting as time passes. Things seem to be stabilizing, and every day without big bad news is good [...]
---
Outline:
(00:31) Section 1: World Politics
(12:16) Section 2: Politics
(26:20) Section 3: Tech and Economics
(50:52) Overall Results
---
First published:
January 8th, 2024
Source:
https://www.lesswrong.com/posts/6o98z3QMAQSkHf3gp/2023-prediction-evaluations
Narrated by TYPE III AUDIO.
Katja Grace and AI Impacts surveyed thousands of researchers on a variety of questions, following up on a similar 2022 survey as well as one in 2016.
I encourage opening the original for more readable graphs, and for context and additional information. I'll cover some of it, but there's a lot.
A Very Large Survey Full of Contradictions
Here is the abstract, summarizing many key points:
In the largest survey of its kind, we surveyed 2,778 researchers who had published in top-tier artificial intelligence (AI) venues, asking for their predictions on the pace of AI progress and the nature and impacts of advanced AI systems.
The aggregate forecasts give at least a 50% chance of AI systems achieving several milestones by 2028, including autonomously constructing a payment processing site from scratch, creating a song indistinguishable from a new song by a popular [...]
---
Outline:
(00:25) A Very Large Survey Full of Contradictions
(05:24) They’ve Destabilized the Timeline
(08:50) What, Me Worry?
(10:02) The Biggest Question
(14:25) Not So Fast with the Not So Fast
(16:03) Safety Research Is Good Actually
(18:54) Questions for Next Season
---
First published:
January 5th, 2024
Source:
https://www.lesswrong.com/posts/NfPxAp5uwgZugwovY/ai-impacts-survey-december-2023-edition
Narrated by TYPE III AUDIO.
The first half of the week was filled with continued talk about the New York Times lawsuit against OpenAI, which I covered in its own post. Then that talk seemed to mostly die down, and things were relatively quiet. We got a bunch of predictions for 2024, and I experimented with prediction markets for many of them.
Note that if you want to help contribute in a fun, free and low-key way, participating in my prediction markets on Manifold is a way to do that. Each new participant in each market, even if small, adds intelligence, adds liquidity and provides me a tiny bonus. Also, of course, it is great to help get the word out to those who would be interested. Paid subscriptions and contributions to Balsa are of course also welcome.
I will hopefully be doing both a review of my 2023 predictions (mostly not about [...]
---
Outline:
(01:05) Language Models Offer Mundane Utility
(02:22) Language Models Don’t Offer Mundane Utility
(03:39) GPT-4 Real This Time
(04:22) Fun with Image Generation
(06:12) Deepfaketown and Botpocalypse Soon
(06:21) They Took Our Jobs
(08:17) Get Involved
(09:08) Introducing
(09:34) In Other AI News
(10:16) Doom?
(15:02) Quiet Speculations
(29:06) The Quest for Sane Regulations
(29:10) The Week in Audio
(30:54) Rhetorical Innovation
(34:36) Politico Problems
(39:55) Cup of Coffee
(50:42) Aligning a Smarter Than Human Intelligence is Difficult
(51:42) People Are Worried About AI Killing Everyone
(52:56) Other People Are Not As Worried About AI Killing Everyone
(53:01) The Lighter Side
---
First published:
January 4th, 2024
Source:
https://www.lesswrong.com/posts/NfXF6MZTgae766aoX/ai-45-to-be-determined
Narrated by TYPE III AUDIO.
Lawsuits and legal issues over copyright continued to get a lot of attention this week, so I’m gathering those topics into their own post. The ‘virtual #0’ post is the relevant section from last week's roundup.
Four Core Claims
Who will win the case? Which of New York Times's complaints will be convincing?
Different people have different theories of the case.
Part of that is that there are four distinct allegations NYT is throwing at the wall.
Arvind Narayanan: A thread on some misconceptions about the NYT lawsuit against OpenAI. Morality aside, the legal issues are far from clear cut. Gen AI makes an end run around copyright and IMO this can’t be fully resolved by the courts alone.
As I currently understand it, NYT alleges that OpenAI engaged in 4 types of unauthorized copying of its articles:
---
Outline:
(00:17) Four Core Claims
(01:17) Key Claim: The Training Dataset Contains Copyrighted Material
(05:00) Other Claims
(06:04) A Few Legal Takes
(10:15) What Can You Reproduce?
(14:16) How and How Often Are You Reproducing It?
(22:06) What Should the Rule Be?
(27:03) Image Generation Edition
(28:50) Compulsory License
---
First published:
January 3rd, 2024
Source:
https://www.lesswrong.com/posts/9WD8nkqLTcd8YJPpT/copyright-confrontation-1
Narrated by TYPE III AUDIO.
The New York Times has thrown down the gauntlet, suing OpenAI and Microsoft for copyright infringement. Others are complaining about recreated images in the otherwise deeply awesome MidJourney v6.0. As is usually the case, the critics misunderstand the technology involved, complain about infringements that inflict no substantial damages, engineer many of the complaints being made, and make cringeworthy accusations.
That does not, however, mean that The New York Times case is baseless. There are still very real copyright issues at the heart of Generative AI. This suit is a serious effort by top lawyers. It has strong legal merit. They are likely to win if the case is not settled.
Table of Contents
---
Outline:
(00:53) Language Models Offer Mundane Utility
(07:40) GPT-4 Real This Time
(10:35) Fun with Image Generation
(17:58) Copyright Confrontation
(29:55) Deepfaketown and Botpocalypse Soon
(35:00) Going Nuclear
(36:39) In Other AI News
(37:54) Quiet Speculations
(43:40) The UN Reports
(44:14) Guiding Principles
(51:39) The Week in Audio
(52:25) Rhetorical Innovation
(56:09) AI With Open Model Weights Is Unsafe and Nothing Can Fix This
(01:03:16) Aligning a Human Level Intelligence is Still Difficult
(01:06:25) Please Speak Directly Into the Microphone
(01:06:39) The Wit and Wisdom of Sam Altman
(01:19:05) The Lighter Side
---
First published:
December 28th, 2023
Source:
https://www.lesswrong.com/posts/3GzRrqLAcdDXzbqc4/ai-44-copyright-confrontation
Narrated by TYPE III AUDIO.
We get innovation in functional search. In an even more functional search, we finally get a Nature paper submitted almost two years ago, in which AI discovered a new class of antibiotic. That's pretty damn exciting, with all the implications thereof.
OpenAI continued its rapid pace of shipping, pivoting for this week to safety. There was a paper about weak-to-strong generalization. I see what they are trying to do. It is welcome, but I was underwhelmed. It and Leike's follow-up post continue down a path about which I am highly skeptical, but the new concreteness gives me more hope that the flaws will be exposed early, allowing adjustment. Or I could be wrong.
OpenAI also had the beta release of its Preparedness Framework. That was more exciting. There was a lot of great stuff there, much better than I would have expected, and having a framework at all is [...]
---
Outline:
(01:17) Language Models Offer Mundane Utility
(04:13) Language Models Don’t Offer Mundane Utility
(05:20) GPT-4 Real This Time
(07:54) Fun with Image Generation
(08:10) Deepfaketown and Botpocalypse Soon
(08:59) Digi Relic Digi
(19:57) Going Nuclear
(21:17) Get Involved
(22:27) Follow the Money
(25:26) Introducing
(31:07) In Other AI News
(34:27) Quiet Speculations
(43:14) The Quest for Sane Regulations
(52:06) The Week in Audio
(52:55) Rhetorical Innovation
(01:02:28) Aligning a Smarter Than Human Intelligence is Difficult
(01:20:17) Vulnerable World Hypothesis
(01:22:55) People Are Worried About AI Killing Everyone
(01:26:49) Other People Are Not As Worried About AI Killing Everyone
(01:28:12) The Lighter Side
---
First published:
December 21st, 2023
Source:
https://www.lesswrong.com/posts/WaDFCrd6KEwojLXgj/ai-43-functional-discoveries
Narrated by TYPE III AUDIO.
Previously: On RSPs.
Be Prepared
OpenAI introduces their preparedness framework for safety in frontier models.
A summary of the biggest takeaways, which I will repeat at the end:
---
Outline:
(00:07) Be Prepared
(02:48) Basic Principles
(07:33) Veto Power
(10:27) Introductory Section and Risk Categories
(13:13) Cybersecurity
(15:58) CBRN (Chemical, Biological, Radiological and Nuclear) Threats
(18:47) Persuasion
(22:24) Model Autonomy
(25:34) Key Takeaways From Risk Descriptions
(28:36) Scorecards
(31:27) Governance
(34:56) Deployment Restrictions
(36:21) Development Restrictions
(39:50) Conclusion and Biggest Takeaways
---
First published:
December 21st, 2023
Source:
https://www.lesswrong.com/posts/hQPfLsDKWtdvMwyyr/on-openai-s-preparedness-framework
Narrated by TYPE III AUDIO.
I have not actually forgotten that the rest of the world exists. As usual, this is everything that wasn’t worth an entire post and is not being saved for any of the roundup post categories.
(Roundup post categories are currently AI, Medical and Health, Housing and Traffic, Dating, Childhood and Education, Fertility, Startups, and potentially NEPA and Clean Energy.)
Bad News
Rebels from Yemen were firing on ships in the Red Sea, a problem dating back thousands of years. Here's where we were on December 17, with the US government finally dropping the hammer.
Hidden fees exist, even when everyone knows they’re there, because they work. StubHub experimented; hiding the fees meant people spent 21% more money. Companies simply can’t pass that up. Government intervention could be justified. However, I also notice that Ticketmaster is now using ‘all-in’ pricing for many shows with zero hidden fees [...]
---
Outline:
(00:28) Bad News
(02:50) Government Working
(06:51) Work From Home
(08:12) Antisocial Media
(11:56) Fraud Is a Rather Big Deal
(15:28) Giving Away $2.1 Billion Also a Big Deal
(18:00) Hit Vote With Rock
(18:55) San Francisco's Finest Compensation Packages
(20:19) Crime and Punishment in San Francisco
(26:08) Crime and Punishment Everywhere
(34:04) Good News, Everyone
(35:50) Meaning What Exactly
(37:44) While I Cannot Condone This
(39:05) Another Voting Proposal
(40:50) At the Movies: 2023 in Review
(44:09) Money Stuff
---
First published:
December 19th, 2023
Source:
https://www.lesswrong.com/posts/sZ8f5NhGPCSkdKqm5/monthly-roundup-13-december-2023
Narrated by TYPE III AUDIO.
With the year ending and my first Vox post coming out, this week was a natural time to take stock. I wrote my first best-of post in a long time and laid out my plans for my 501c(3).
It was also another eventful week. We got a lot more clarity on the OpenAI situation, although no key new developments on the ground. The EU AI Act negotiators reached a compromise, which I have not yet had the opportunity to analyze properly. We got a bunch of new toys to play with, including NotebookLM and Grok, and the Gemini API.
I made a deliberate decision not to tackle the EU AI Act here. Coverage has been terrible at telling us what is in the bill. I want to wait until we can know what is in it, whether or not that means I need to read the whole [...]
---
Outline:
(00:59) Language Models Offer Mundane Utility
(05:13) Language Models Don’t Offer Mundane Utility
(07:17) GPT-4 Real This Time
(11:22) The Other Gemini
(15:03) Fun with Image Generation
(15:40) Deepfaketown and Botpocalypse Soon
(18:32) They Took Our Jobs
(24:51) Get Involved
(26:41) Introducing
(27:13) In Other AI News
(29:49) Quiet Speculations
(37:08) The Quest for Sane Regulations
(42:22) The EU AI Act
(43:05) The Week in Audio
(43:43) Rhetorical Innovation
(50:31) Doom!
(01:03:21) Doom Discourse Innovation
(01:07:03) E/acc
(01:08:32) Poll says e/ack
(01:11:53) Turing Test
(01:13:43) Aligning a Human Level Intelligence Also Difficult
(01:20:04) Aligning a Smarter Than Human Intelligence is Difficult
(01:25:17) Open Foundation Model Weights Are Unsafe And Nothing Can Fix This
(01:26:18) Key Takeaways
(01:31:12) Other People Are Not As Worried About AI Killing Everyone
(01:35:33) The Lighter Side
---
First published:
December 14th, 2023
Source:
https://www.lesswrong.com/posts/emo2hAvq6p7Pn4Pps/ai-42-the-wrong-answer
Narrated by TYPE III AUDIO.
Hello everyone! This is going to be a bit of a housekeeping post and a welcome to new subscribers.
Note that this is not the primary version of my writing, which can be found on Substack, but it is a full copy of all posts found there.
My writing can be intimidating. There is a lot of it, and it's often dense. As always, choose only the parts relevant to your interests, do not be afraid to make cuts. I attempt to make every post accessible as an entry point, but I also want to build up a superstructure over time. This seemed like a good time to recap some of the very best of my old writing and talk about what I’m up to.
Over many years, this blog has morphed from focusing on rationality to COVID to AI.
But not only those [...]
---
Outline:
(01:28) Rationality
(03:03) The Evergreen Posts
(07:13) AI
(09:48) General Rationality and Principles of Life and Thinking
(11:26) World Modeling
(13:20) Oh, Yeah, That Happened
(13:52) Gaming Fun
(14:48) Fertility, School and Childhood
(15:50) Covid
(17:26) The Simulacra Levels Sequence
(18:47) The Moral Mazes Sequence
(20:44) The Choices Sequence
(21:59) Game Theory, Gambling and Prediction Markets
(22:55) Bonus Content: A Newly Salient Question for 2023
---
First published:
December 13th, 2023
Source:
https://www.lesswrong.com/posts/dHYxnSgMDeveovLuv/the-best-of-don-t-worry-about-the-vase
Narrated by TYPE III AUDIO.
Wow, what a year it has been. Things keep getting crazier.
Thank you for taking this journey with me. I hope I have helped you keep pace, and that you have been able to discern for yourself the parts of this avalanche of words and events that were helpful. I hope to have helped things make somewhat more sense.
And I hope many of you have taken that information, and used it not only to be able to check Twitter less, but also to make better decisions, and, hopefully, to help make the world a better place—one in which humanity is more likely to survive.
Recently, my coverage of the Biden administration executive order and the events at OpenAI have been received very positively. I’d like to do more in that mold: more focused, shorter pieces that pull the story together, hopefully de-emphasizing more ephemeral weekly [...]
---
Outline:
(01:32) Balsa First Targets the Jones Act
(04:01) Other Balsa Cause Areas
(04:12) Housing
(05:47) NEPA
(08:16) AI
(12:00) Funding Situation
---
First published:
December 12th, 2023
Source:
https://www.lesswrong.com/posts/eaFmbgnWsXdGb2FSk/balsa-update-and-general-thank-you
Narrated by TYPE III AUDIO.
Previously: OpenAI: Altman Returns, OpenAI: The Battle of the Board, OpenAI: Facts from a Weekend, additional coverage in AI#41.
We have new stories from The New York Times, from Time, from the Washington Post and from Business Insider.
All paint a picture consistent with the central story told in OpenAI: The Battle of the Board. They confirm key facts, especially Altman's attempted removal of Toner from the board via deception. We also confirm that Altman promised to help with the transition when he was first fired, so we have at least one very clear cut case of Altman saying that which was not.
Much uncertainty remains, especially about the future, but past events are increasingly clear.
The stories also provide additional color and key details. This post is for those who want that, and to figure out what to think in light of the new [...]
---
Outline:
(01:21) The New York Times Covers Events
(15:30) Time Makes Altman CEO of the Year
(18:49) Washington Post Says Leaders Warned Altman was Abusive
(20:05) Business Insider Says Microsoft Letter Was a Bluff
(26:15) Where Does That All Leave Us?
---
First published:
December 12th, 2023
Source:
https://www.lesswrong.com/posts/xY5m72tME9kqjpdoC/openai-leaks-confirm-the-story
Narrated by TYPE III AUDIO.
The biggest news this week was at long last the announcement of Google's Gemini. Be sure to check that out. Note that what is being rolled out now is only Gemini Pro; the Gemini Ultra model that could rival GPT-4 is not yet available.
It does not seem I am doing a good job cutting down on included material fast enough to keep pace. A lot is happening, but a lot will likely be happening for a long time. If your time is limited, remember to focus on the sections relevant to your interests.
Also, if you are going to be at the New York Solstice or the related meetup, please do say hello.
Table of Contents
My other post today covers Google's Gemini. Be sure to read that.
I also put out two other posts this week: Based Beff Jezos and the Accelerationists, and On [...]
---
Outline:
(00:42) Language Models Offer Mundane Utility
(05:57) Language Models Don’t Offer Mundane Utility
(08:38) OpenAI: The Saga Continues
(15:01) Q Continuum
(18:04) Fun with Image Generation
(20:08) Get Involved
(20:47) Introducing
(22:01) In Other AI News
(24:21) Quiet Speculations
(28:29) Model This
(37:28) Would You Like Some Volcano Apocalypse Insurance?
(38:58) The Quest for Sane Regulations
(53:02) The Week in Audio
(53:11) Rhetorical Innovation
(01:01:43) Aligning a Human Level Intelligence Is Still Difficult
(01:04:38) Aligning a Smarter Than Human Intelligence is Difficult
(01:08:54) How Timelines Have Changed
(01:10:29) People Are Worried About AI Killing Everyone
(01:12:54) Other People Are Not As Worried About AI Killing Everyone
(01:25:14) Somehow This Is The Actual Vice President
(01:31:24) The Lighter Side
---
First published:
December 7th, 2023
Source:
https://www.lesswrong.com/posts/9Jgtkw8CD6kndyCcD/ai-41-bring-in-the-other-gemini
Narrated by TYPE III AUDIO.
It's happening. Here is CEO Pichai's Twitter announcement. Here is Demis Hassabis announcing. Here is the DeepMind Twitter announcement. Here is the blog announcement. Here is Gemini co-lead Oriol Vinyals, promising more to come. Here is Google's Chief Scientist Jeff Dean bringing his best hype.
Technical Specifications
Let's check out the specs.
Context length trained was 32k tokens, and they report 98% accuracy on information retrieval for Ultra across the full context length. So a bit low, both lower than GPT-4 and Claude and lower than their methods can handle. Presumably we should expect that context length to grow rapidly with future versions.
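For intuition on what that retrieval number measures, here is a minimal sketch of a ‘needle in a haystack’ style test: bury a fact at a random depth in long filler text and check whether the model can recall it. The harness and the toy stand-in model below are hypothetical illustrations, not Google's actual methodology.

```python
import random

def make_haystack(needle: str, filler: str, n_sentences: int, depth: float) -> str:
    """Bury `needle` at relative `depth` (0.0 = start, 1.0 = end) in filler text."""
    sentences = [filler] * n_sentences
    sentences.insert(int(depth * n_sentences), needle)
    return " ".join(sentences)

def retrieval_accuracy(query_model, trials: int = 50) -> float:
    """Fraction of trials where the model's answer contains the buried secret."""
    hits = 0
    for _ in range(trials):
        secret = str(random.randint(100000, 999999))
        context = make_haystack(f"The magic number is {secret}.",
                                "The sky was a pleasant shade of blue.",
                                2000, random.random())
        hits += secret in query_model(context + "\n\nWhat is the magic number?")
    return hits / trials

# Toy stand-in "model" that reads the prompt perfectly, just to show the harness runs.
print(retrieval_accuracy(lambda prompt: prompt))  # prints 1.0
```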
There are three versions of Gemini 1.0.
Gemini 1.0, our first version, comes in three sizes: Ultra for highly-complex tasks, Pro for enhanced performance and deployability at scale, and Nano for on-device applications. Each size is specifically tailored to [...]
---
Outline:
(00:25) Technical Specifications
(11:20) Level Two Bard
(12:42) Gemini Reactions
---
First published:
December 7th, 2023
Source:
https://www.lesswrong.com/posts/ofYejKKiSFYH2gLBb/gemini-1-0
Narrated by TYPE III AUDIO.
It seems Forbes decided to doxx the identity of e/acc founder Based Beff Jezos. They did so using voice matching software.
Given that Jezos is owning it now that it has happened, rather than hoping it all goes away, and that people are talking about him, this seems like a good time to cover this ‘Beff Jezos’ character and create a reference point for if he continues to come up later.
If that is not relevant to your interests, you can and should skip this one.
Do Not Doxx People
First order of business: Bad Forbes. Stop it. Do not doxx people. Do not doxx people with a fox. Do not dox people with a bagel with creme cheese and lox. Do not dox people with a post. Do not dox people who then boast. Do not dox people even if that person is advocating for policies you [...]
---
Outline:
(00:31) Do Not Doxx People
(01:21) Beff Jezos Advocates Actions He Thinks Would Probably Kill Everyone
(02:46) A Matter of Some Debate
(08:23) Response to the Doxx
(12:44) So What is E/Acc Then?
(18:56) Conclusion
---
First published:
December 6th, 2023
Source:
https://www.lesswrong.com/posts/3xoThNNYgZmTCpEAB/based-beff-jezos-and-the-accelerationists
Narrated by TYPE III AUDIO.
This post was originally intended to come out directly after the UK AI Safety Summit, to give the topic its own deserved focus. One thing led to another, and I am only doubling back to it now.
Responsible Deployment Policies
At the AI Safety Summit, all the major Western players were asked: What are your company policies on how to keep us safe? What are your responsible deployment policies (RDPs)? Except that they call them Responsible Scaling Policies (RSPs) instead.
I deliberately say deployment rather than scaling. No one has shown what I would consider close to a responsible scaling policy in terms of what models they are willing to scale and train.
Anthropic at least does however seem to have something approaching a future responsible deployment policy, in terms of how to give people access to a model if we assume it is safe for [...]
---
Outline:
(00:17) Responsible Deployment Policies
(03:15) How the UK Graded the Responses
(04:22) Anthropic's Policies
(05:27) The Risks
(10:42) The Promise of a Pause
(13:58) ASL-3 Definitions and Commitments
(18:16) Approaching Thresholds
(24:38) ASL-4
(27:26) Underspecification
(29:06) Takeaways from Anthropic's RSP
(35:30) Others React
(38:30) A Failure to Communicate
(39:47) OpenAI Policies
(41:56) DeepMind Policies
(45:53) Amazon, Inflection and Meta
(47:53) Some Additional Relative Rankings
(48:57) Important Clarification from Dario Amodei
(55:07) Strategic Thoughts on Such Policies
(01:05:37) Conclusion
---
First published:
December 5th, 2023
Source:
https://www.lesswrong.com/posts/yRJNCDp7LHyHGkANz/on-responsible-scaling-policies-rsps
Narrated by TYPE III AUDIO.
It has been brutal out there for someone on my beat. Everyone extremely hostile, even more than usual. Extreme positions taken, asserted as if obviously true. Not symmetrically, but from all sides nonetheless. Constant assertions of what happened in the last two weeks that are, as far as I can tell, flat out wrong, largely the result of a well-implemented media campaign. Repeating flawed logic more often and louder.
The bright spot was offered by Vitalik Buterin, who offers a piece entitled ‘My techno-optimism,’ proposing what he calls d/acc for defensive (or decentralized, or differential) accelerationism. He brings enough nuance and careful thinking, and clear statements about existential risk and various troubles ahead, to get strong positive reactions from the worried. He brings enough credibility and track record, and enough shibboleths, to get strong endorsements from the e/acc crowd, despite his acknowledgement of existential risk and the dangers [...]
---
Outline:
(02:06) Language Models Offer Mundane Utility
(03:52) Language Models Don’t Offer Mundane Utility
(07:37) Q Continuum
(13:04) OpenAI, Altman and Safety
(15:30) A Better Way to Do RLHF
(19:26) Fun with Image Generation
(19:34) Deepfaketown and Botpocalypse Soon
(19:49) They Took Our Jobs
(20:35) Get Involved
(22:21) Introducing
(22:39) In Other AI News
(25:10) It's a Who?
(28:22) What About E/Acc?
(31:54) Vitalik Offers His Version of Techno-Optimism
(40:47) Quiet Speculations
(44:38) AI Agent Future
(47:25) The Quest for Sane Regulations
(51:28) The Week in Audio
(56:13) Rhetorical Innovation
(59:06) Aligning a Smarter Than Human Intelligence is Difficult
(01:06:36) People Might Also Worry About AI Killing Only Some of Them
(01:07:37) People Are Worried About AI Killing Everyone
(01:08:43) Other People Are Not As Worried About AI Killing Everyone
(01:12:38) Please Speak Directly Into This Microphone
(01:13:54) The Lighter Side
---
First published:
November 30th, 2023
Source:
https://www.lesswrong.com/posts/je5BwKe8enCq8DLrm/ai-40-a-vision-from-vitalik
Narrated by TYPE III AUDIO.
As of this morning, the new board is in place and everything else at OpenAI is officially back to the way it was before.
Events seem to have gone as expected. If you have read my previous two posts on the OpenAI situation, nothing here should surprise you.
Still seems worthwhile to gather the postscripts, official statements and reactions into their own post for future ease of reference.
What will the ultimate result be? We likely only find that out gradually over time, as we await both the investigation and the composition and behaviors of the new board.
I do not believe Q* played a substantive role in events, so it is not included here. I also do not discuss here how good or bad Altman has been for safety.
Sam Altman's Statement
Here is the official OpenAI statement from Sam [...]
---
Outline:
(00:49) Sam Altman's Statement
(07:17) Bret Taylor's Statement
(10:30) Larry Summers's Statement
(11:11) Helen Toner's Statement
(13:00) OpenAI Needs a Strong Board That Can Fire Its CEO
(15:57) Some Board Member Candidates
(16:54) A Question of Valuation
(18:28) A Question of Optics
---
First published:
November 30th, 2023
Source:
https://www.lesswrong.com/posts/EfqAdxR7bvwQLMTQc/openai-altman-returns
Narrated by TYPE III AUDIO.
The board firing Sam Altman, then reinstating him, dominated everything else this week. Other stuff also happened, but definitely focus on that first.
Table of Contents
Developments at OpenAI were far more important than everything else this week. So you can read this timeline of events over the weekend, and this attempt to put all the information together.
---
Outline:
(00:16) Language Models Offer Mundane Utility
(01:09) Language Models Don’t Offer Mundane Utility
(04:50) The Q Continuum
(05:47) OpenAI: The Saga Continues
(13:29) Altman Could Step Up
(14:59) You Thought This Week Was Tough
(16:15) Fun with Image Generation
(16:37) Deepfaketown and Botpocalypse Soon
(17:28) They Took Our Jobs
(18:07) Get Involved
(18:30) Introducing
(19:17) In Other AI News
(20:42) Quiet Speculations
(24:00) The Quest for Sane Regulations
(27:19) That Is Not What Totalitarianism Means
(31:39) The Week in Audio
(34:13) Rhetorical Innovation
(39:21) Aligning a Smarter Than Human Intelligence is Difficult
(46:24) People Are Worried About AI Killing Everyone
(47:54) Other People Are Not As Worried About AI Killing Everyone
(49:18) The Lighter Side
---
First published:
November 23rd, 2023
Source:
https://www.lesswrong.com/posts/3FCfEqRiLLb4gFu3H/ai-39-the-week-of-openai
Narrated by TYPE III AUDIO.
Previously: OpenAI: Facts from a Weekend.
On Friday afternoon, OpenAI's board fired CEO Sam Altman.
Overnight, an agreement in principle was reached to reinstate Sam Altman as CEO of OpenAI, with an initial new board of Bret Taylor (ex-co-CEO of Salesforce, chair), Larry Summers and Adam D’Angelo.
What happened? Why did it happen? How will it ultimately end? The fight is far from over.
We do not entirely know, but we know a lot more than we did a few days ago.
This is my attempt to put the pieces together.
This is a Fight For Control; Altman Started it
This was and still is a fight about control of OpenAI, its board, and its direction.
This has been a long simmering battle and debate. The stakes are high.
Until recently, Sam Altman worked to reshape the company in his [...]
---
Outline:
(00:42) This is a Fight For Control; Altman Started it
(01:12) OpenAI is a Non-Profit With a Mission
(03:20) Sam Altman's Perspective
(05:41) The Outside Board's Perspective
(06:51) Ilya Sutskever's Perspective
(07:52) Altman Moves to Take Control
(11:02) One Last Chance
(14:32) Botched Communications
(16:18) The Negotiation
(18:13) What Now for OpenAI?
---
First published:
November 22nd, 2023
Source:
https://www.lesswrong.com/posts/sGpBPAPq2QttY4M2H/openai-the-battle-of-the-board
Narrated by TYPE III AUDIO.
Approximately four GPTs and seven years ago, OpenAI's founders brought forth on this corporate landscape a new entity, conceived in liberty, and dedicated to the proposition that all men might live equally when AGI is created.
Now we are engaged in a great corporate war, testing whether that entity, or any entity so conceived and so dedicated, can long endure.
What matters is not theory but practice. What happens when the chips are down?
So what happened? What prompted it? What will happen now?
To a large extent, even more than usual, we do not know. We should not pretend that we know more than we do.
Rather than attempt to interpret here or barrage with an endless string of reactions and quotes, I will instead do my best to stick to a compilation of the key facts.
(Note: All times stated here [...]
---
First published:
November 20th, 2023
Source:
https://www.lesswrong.com/posts/KXHMCH7wCxrvKsJyn/openai-facts-from-a-weekend
Narrated by TYPE III AUDIO.
Another busy week. GPT-5 starts, Biden and Xi meet and make somewhat of a deal, GPTs get explored, the EU AI Act is pushed to the verge of collapse by those trying to kill the part that might protect us, multiple very good podcasts. A highly interesting paper on potential deceptive alignment.
Despite things quieting down the last few days, it is still a lot. Hopefully things can remain quiet for a bit, perhaps I can even get in more work on that Jones Act post.
Table of Contents
---
Outline:
(00:39) Language Models Offer Mundane Utility
(05:11) Language Models Don’t Offer Mundane Utility
(10:49) GPT-4 Real This Time
(14:11) Fun with Image Generation
(15:12) Deepfaketown and Botpocalypse Soon
(15:59) A Bad Guy With an AI
(22:46) They Took Our Jobs
(32:57) Get Involved
(34:29) Introducing
(35:18) In Other AI News
(42:00) Quiet Speculations
(43:32) Anti Anti Trust
(44:24) The Quest for Sane Regulations
(57:39) Bostrom Goes Unheard
(58:18) The Week in Audio
(59:37) Someone Picked Up the Phone
(01:01:15) Mission Impossible
(01:02:17) Rhetorical Innovation
(01:05:21) Open Source AI is Unsafe and Nothing Can Fix This
(01:13:19) Aligning a Smarter Than Human Intelligence is Difficult
(01:36:02) People Are Worried About AI Killing Everyone
(01:37:37) Other People Are Not As Worried About AI Killing Everyone
(01:40:33) The Lighter Side
---
First published:
November 16th, 2023
Source:
https://www.lesswrong.com/posts/oCFX5xbhgCmpBFKnb/ai-38-let-s-make-a-deal
Narrated by TYPE III AUDIO.
Things on the AI front have been rather hectic. That does not mean other things stopped happening. Quite the opposite. So here we are again.
Bad News
PSA: Crumbl Cookies, while delicious, have rather a lot of calories: 720 in the basic cookie. Yes, they display this as 180, by deciding the serving size is a quarter of a cookie. This display strategy is pretty outrageous and should not be legal. We need to do something about unrealistic serving sizes; at minimum, require that the serving size be displayed in the same size font as the calorie count.
It really is weird that we don’t think about Russia, and especially the USSR, more in terms of the universal alcoholism.
Reminder that there really is an architecture conspiracy to make life worse. Peter Eisenman straight out says: “Anxiety and alienation is the modern condition. The point of architecture [...]
---
Outline:
(00:16) Bad News
(03:24) Good News, Everyone
(05:31) While I Cannot Condone This
(09:00) Government Working
(15:02) At the Movies
(18:07) Twitter Twitches
(23:41) Yay Free Speech
(25:41) Money Stuff
(41:29) Gamers Gonna Game Game Game Game Game
(50:47) I Was Promised Flying Self-Driving Cars
(56:50) Potentially Effective Altruism
---
First published:
November 14th, 2023
Source:
https://www.lesswrong.com/posts/TfABomJ7s6xLkxTFz/monthly-roundup-12-november-2023
Narrated by TYPE III AUDIO.
[Editor's Note: This post is split off from AI #38 and only on LessWrong because I want to avoid overloading my general readers with this sort of thing at this time, and also I think it is potentially important we have a link available. I plan to link to it from there with a short summary.]
Nick Bostrom was interviewed on a wide variety of questions on UnHerd, primarily on existential risk and AI. I found it thoughtful throughout. He spent the first 80% of the time talking about existential risk. Then in the last 20% he expressed the concern that it was unlikely but possible we would overshoot our concerns about AI and never build AGI at all, which would be a tragedy.
How did those who would dismiss AI risk and build AGI as fast as possible react?
About how you would expect. This is [...]
---
Outline:
(04:40) What Bostrom Centrally Said Was Mostly Not New or Controversial
(06:54) Responses Confirming Many Concerned About Existential Risk Mostly Agree
(11:49) Quoted Text in Detail
(19:42) The Broader Podcast Context
(21:35) A Call for Nuance
(24:33) The Quoted Text Continued
(27:08) Conclusion
---
First published:
November 13th, 2023
Source:
https://www.lesswrong.com/posts/PyNqASANiAuG7GrYW/bostrom-goes-unheard
Narrated by TYPE III AUDIO.
All markets created by Zvi Mowshowitz shall be graded according to the rules described herein, including the zeroth rule.
The version of this on LessWrong shall be the canonical version, even if other versions are later posted on other websites.
Rule 0: If the description of a particular market contradicts these rules, the market's description wins, the way a card in Magic: The Gathering can break the rules. This document only establishes the baseline rules, which can be modified.
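For illustration only, the precedence Rule 0 describes is a simple override: the baseline rules apply unless a market's description says otherwise. A minimal sketch, with all rule names hypothetical rather than taken from the actual house rules:

```python
# Rule 0 as an override: a market's description beats the baseline rules,
# the way a Magic: The Gathering card can break the game's rules.
BASELINE_RULES = {
    "resolves_na_if_ambiguous": True,  # hypothetical baseline rule
    "grading_deadline_days": 30,       # hypothetical baseline rule
}

def effective_rules(description_overrides: dict) -> dict:
    """Start from the baseline, then let the market's description win."""
    rules = dict(BASELINE_RULES)
    rules.update(description_overrides)  # Rule 0: the description wins
    return rules

print(effective_rules({"grading_deadline_days": 7}))
```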
---
First published:
November 13th, 2023
Source:
https://www.lesswrong.com/posts/ge3Jf5Hnon8wq4xqT/zvi-s-manifold-markets-house-rules
Narrated by TYPE III AUDIO.
We had OpenAI's dev day, where they introduced a host of new incremental feature upgrades including a longer context window, more recent knowledge cutoff, increased speed, seamless feature integration and a price drop. Quite the package. On top of that, they introduced what they call ‘GPTs’ that can let you configure a host of things to set up specialized proto-agents or widgets that will work for specialized tasks and be shared with others. I would love to mess around with that, once I have the time, and OpenAI's servers allow regular subscribers to get access.
In the meantime, even if you exclude all that, lots of other things happened this week. Thus, even with the spin-off, this is an unusually long weekly update. I swear, and this time I mean it, that I am going to raise the threshold for inclusion or extended discussion substantially going forward, across [...]
---
Outline:
(00:57) Language Models Offer Mundane Utility
(01:37) Bard Tells Tales
(05:41) Fun with Image Generation
(10:15) Deepfaketown and Botpocalypse Soon
(14:34) The Art of the Jailbreak
(17:54) They Took Our Jobs
(22:23) Get Involved
(24:25) Introducing
(28:27) X Marks Its Spot
(34:20) In Other AI News
(44:13) Verification Versus Generation
(46:57) Bigger Tech Bigger Problems
(53:52) Executive Order Open Letter
(01:01:28) Executive Order Reactions Continued
(01:05:53) Quiet Speculations
(01:17:36) The Quest for Sane Regulations
(01:25:55) The Week in Audio
(01:26:14) Rhetorical Innovation
(01:36:44) Aligning a Smarter Than Human Intelligence is Difficult
(01:38:08) Aligning a Dumber Than Human Intelligence Is Still Difficult
(01:42:10) Model This
(01:52:35) Open Source AI is Unsafe and Nothing Can Fix This
(02:07:02) People Are Worried About AI Killing Everyone
(02:10:34) Other People Are Not As Worried About AI Killing Everyone
(02:16:27) The Lighter Side
---
First published:
November 9th, 2023
Source:
https://www.lesswrong.com/posts/44Cv4HFoWEZvFnL5u/ai-37-moving-too-fast
Narrated by TYPE III AUDIO.
OpenAI DevDay was this week. What delicious and/or terrifying things await?
Turbo Boost
First off, we have GPT-4-Turbo.
Today we’re launching a preview of the next generation of this model, GPT-4 Turbo.
GPT-4 Turbo is more capable and has knowledge of world events up to April 2023. It has a 128k context window so it can fit the equivalent of more than 300 pages of text in a single prompt. We also optimized its performance so we are able to offer GPT-4 Turbo at a 3x cheaper price for input tokens and a 2x cheaper price for output tokens compared to GPT-4.
GPT-4 Turbo is available for all paying developers to try by passing gpt-4-1106-preview in the API and we plan to release the stable production-ready model in the coming weeks.
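As a minimal sketch of what ‘passing gpt-4-1106-preview in the API’ looks like, assuming the post-DevDay OpenAI Python SDK (v1) with an OPENAI_API_KEY set in the environment:

```python
# Calls the GPT-4 Turbo preview model named in the announcement above.
# Assumes `pip install openai` (v1+) and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[{"role": "user", "content": "In one sentence, what changed at DevDay?"}],
)
print(response.choices[0].message.content)
```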
Knowledge up to April 2023 is a big deal. Cutting the price [...]
---
Outline:
(00:09) Turbo Boost
(01:25) Function calling updates
(02:27) Improved instruction following and JSON mode
(03:21) Reproducible outputs and log probabilities
(04:34) Updated GPT-3.5 Turbo
(06:49) New Modalities
(07:03) New modalities in the API
(07:08) GPT-4 Turbo with vision
(08:02) DALL·E 3
(08:47) Text-to-speech (TTS)
(10:05) Model customization
(10:09) GPT-4 fine tuning experimental access
(11:01) Custom models
(12:19) Assistants API, Retrieval, and Code Interpreter
(15:13) GPT-GPT
(22:37) Putting It All Together
---
First published:
November 9th, 2023
Source:
https://www.lesswrong.com/posts/wdekcGpsMtakGCo5y/on-openai-dev-day
Narrated by TYPE III AUDIO.
In the eyes of many, Biden's Executive Order somewhat overshadowed the UK Summit. The timing was unfortunate. Both events were important milestones. Now that I have had time, here is my analysis of what happened at the UK Summit.
As is often the case with such events, there was a lot of talk relative to the amount of action. There was a lot of diplomatic talk, talk of that which everyone agrees upon, relative to the amount of talk of real substance. There were days of meetings that resulted in rather unspicy summaries and resolutions. The language around issues that matter most was softened, the actual mission in danger of being compromised.
And as usual, the net result was reason for optimism, a net highly positive event versus not having it, while also in some ways being disappointing when compared to what might have been. A declaration [...]
---
Outline:
(01:47) Looking Back at People's Goals for the Summit and Taskforce
(02:53) AI Safety Summit Agenda
(12:38) Someone Picked Up the Phone
(18:21) The Bletchley Declaration
(28:32) Saying Generic Summit-Style Things
(33:04) Shouting From the Rooftops
(33:27) Some Are Not Easily Impressed
(35:37) Declaring Victory
(45:14) Kanjun Offers Thoughts
(56:17) Closing Remarks
---
First published:
November 7th, 2023
Source:
https://www.lesswrong.com/posts/zbrvXGu264u3p8otD/on-the-uk-summit
Narrated by TYPE III AUDIO.
habryka
Hey Everyone!
As part of working on dialogues over the last few weeks I've asked a bunch of people what kind of conversations they would be most interested in reading, and one of the most common answers has been "I would really like to read a bunch of people trying to figure out how to construct a portfolio that goes well when AGI becomes a bigger deal".
You are three people who would be reasonably high on my list to figure this out with, and so here we are. Not because you are world experts at this, but because I trust your general reasoning a bunch (I know Noah less well, but trust Will and Zvi a good amount).
I think to kick us off, maybe let's start with a very brief 1-2 sentence intros on your background and how much you've thought about this thing before (and [...]
---
Outline:
(07:37) Broad market effects of AGI
(10:23) Career capital in an AGI world
(20:29) Debt and Interest rates effects of AGI
(26:53) Concrete example portfolio
(43:21) Is any of this ethical or sanity-promoting?
(49:12) How would you actually use a ton of money to help with AGI going well?
(53:08) Please diversify your crypto portfolio
(56:24) Should you buy private equity into AI companies?
(58:04) Summarizing takeaways
---
First published:
November 6th, 2023
Narrated by TYPE III AUDIO.
Wow, what a week. We had the Executive Order, which I read here so you don’t have to, and then I tabulated the reactions of others.
Simultaneously there was the UK AI Summit.
There was also robust related discussion around Responsible Scaling Policies, and the various filings companies did in advance of the Summit.
I touched on Anthropic's RSP in particular in previous weeks, but I did not do a sufficiently close analysis and many others have offered more detailed thoughts as well, and the context has evolved.
So I am noting that I am not covering those important questions in the weekly roundup, and they will be covered by one or more later distinct posts. I also potentially owe an after action report from EA Global Boston, if I can find the time.
This post is instead about everything else.
[...]
---
Outline:
(00:53) Language Models Offer Mundane Utility
(03:22) Language Models Don’t Offer Mundane Utility
(07:03) GPT-4 Real This Time
(10:11) Fun with Image Generation
(10:43) Best Picture
(15:19) Deepfaketown and Botpocalypse Soon
(20:25) They Took Our Jobs
(22:34) Get Involved
(22:55) OpenAI Frontier Risk and Preparedness Team
(27:03) Introducing
(28:25) In Other AI News
(33:13) Quiet Speculations
(37:45) The Quest for Sane Regulations
(38:47) 22756.
(42:51) 22756.2.
(43:14) 22756.4.
(44:15) The Week in Audio
(48:13) Rhetorical Innovation
(54:43) Open Source AI is Unsafe and Nothing Can Fix This
(57:08) Aligning a Smarter Than Human Intelligence is Difficult
(01:01:20) People Are Worried About AI Killing Everyone
(01:02:39) Please Speak Directly Into This Microphone
(01:05:13) The Lighter Side
---
First published:
November 2nd, 2023
Source:
https://www.lesswrong.com/posts/QLuoMnhR5XNAAWjJx/ai-36-in-the-background
Narrated by TYPE III AUDIO.
Or: I read the executive order and its fact sheet, so you don’t have to.
I spent Halloween reading the entire Biden Executive Order on AI.
This is the pure ‘what I saw reading the document’ post. A companion post will cover reactions to this document, but I wanted this to be a clean reference going forward.
Takeaway Summary: What Does This Do?
It mostly demands a lot of reports, almost entirely from within the government.
---
Outline:
(00:26) Takeaway Summary: What Does This Do?
(05:46) Fact Sheet
(21:53) I Read the Whole Damn Thing So You Don’t Have To
(22:03) Sections 1 and 2: Introduction and Principles
(23:22) Section 3: Definitions
(30:05) Section 4: Ensuring Safety and Security
(43:44) Section 5: Promoting Innovation and Competition
(50:58) Section 6: Supporting Workers
(51:37) Section 7: Advancing Equity and Civil Rights
(52:41) Section 8: Protecting Consumers, Patients, Passengers and Students
(53:38) Section 9: Protecting Privacy
(54:06) Section 10: Advancing Federal Government Use of AI
(56:51) Section 11: Strengthening American Leadership Abroad
(57:38) Section 12: Implementation
(58:06) Conclusion
---
First published:
November 1st, 2023
Source:
https://www.lesswrong.com/posts/PvBpRu354uG7ypwRP/on-the-executive-order
Narrated by TYPE III AUDIO.
Given Joe Biden seems to have become more worried about AI risk after having seen the movie, it seems worth putting my observations about it into their own post. This is what I wrote back then, except for the introduction and final note.
We now must modify the paragraph about whether to see this movie. Given its new historical importance, combined with its action scenes being pretty good, if you have not yet seen it you should now probably see this movie. And of course it now deserves a much higher rating than 70.
There are of course things such as ‘it is super cool to jump from a motorcycle into a dive onto a moving train,’ but there are also actual things to ponder here.
Spoiler-Free Review
There may never be a more fitting title than Mission Impossible: Dead Reckoning. Each of these four words is doing important [...]
---
Outline:
(00:44) Spoiler-Free Review
(02:18) No One Noticed or Cared That The Alignment Plan Was Obvious Nonsense
(03:52) No One Cares That the Threat is Extinction, They All Want Control
(05:29) The Movie Makes it Very Clear Why Humanity Won’t Win, Then Ignores It
(08:15) Warning Shots are Repeatedly Ignored
(08:46) Approximately No One Noticed Any of This
---
First published:
November 1st, 2023
Narrated by TYPE III AUDIO.
There is much talk about so-called Responsible Scaling Policies, as in what we will do so that what we are doing can be considered responsible. Would that also result in actually responsible scaling? It would help. By themselves, in their current versions, no. The good scenario is that these policies are good starts, laying groundwork and building momentum to get where we need to go. The bad scenario is that this becomes safetywashing, used as a justification for rapid and dangerous scaling of frontier models, a label that avoids any actual action or responsibility.
Others think it would be better if we flat out stopped. So they say so. And they protest. And they point out that the public is mostly with them, at the same time that those trying to play as Very Serious People say such talk is irresponsible.
Future persuasion will be better. Sam [...]
---
Outline:
(01:31) Language Models Offer Mundane Utility
(01:46) Language Models Don’t Offer Mundane Utility
(02:32) GPT-4 Real This Time
(03:51) A Proposed Bet
(04:38) Fun with Image Generation
(05:54) Deepfaketown and Botpocalypse Soon
(08:26) They Took Our Jobs
(10:38) Get Involved
(11:16) Introducing
(16:19) In Other AI News
(17:57) Quiet Speculations
(21:13) The Quest for Sane Regulations
(26:29) The Week in Audio
(27:08) Rhetorical Innovation
(42:45) Friendship is Optimal
(45:12) Honesty As the Best Policy
(52:43) Aligning a Smarter Than Human Intelligence is Difficult
(54:17) Aligning a Dumber Than Human Intelligence Is Also Difficult
(01:02:49) Humans Do Not Expect to Be Persuaded by Superhuman Persuasion
(01:07:36) DeepMind's Evaluation Paper
(01:17:23) Bengio Offers Letter and Proposes a Synthesis
(01:20:54) Matt Yglesias Responds To Marc Andreessen's Manifesto
(01:25:33) People Are Worried About AI Killing Everyone
(01:31:09) Someone Is Worried AI Alignment Is Going Too Fast
(01:36:14) Please Speak Directly Into This Microphone
(01:37:46) The Lighter Side
---
First published:
October 26th, 2023
Source:
https://www.lesswrong.com/posts/aQ6LDhc2zxrYXFjEF/ai-35-responsible-scaling-policies
Narrated by TYPE III AUDIO.
It did not get the bulk of the attention, but the actual biggest story this week was that America tightened the rules on its chip exports, closing the loophole Nvidia was using to create the A800 and H800. Perhaps the new restrictions will actually have teeth.
Also new capabilities continue to come in based on the recent GPT upgrades, along with the first signs of adversarial attacks.
Also a lot of rhetoric, including, yes, that manifesto. Yes, I do cover it.
Table of Contents
---
Outline:
(00:36) Language Models Offer Mundane Utility
(01:31) Dalle-3 System Complete Prompt
(09:00) GPT-4 Real This Time
(12:16) Fun with Image Generation
(12:28) Deepfaketown and Botpocalypse Soon
(14:24) People Genuinely Against Genuine People Personalities
(19:11) They Took Our Jobs
(22:42) Get Involved
(24:13) Introducing
(25:07) In Other AI News
(32:15) Quiet Speculations
(33:27) Man With a Plan
(39:56) China
(45:00) The Quest for Sane Regulations
(50:26) The Chips Are Down
(54:09) The Week in Audio
(54:45) Yes, We Will Speak Directly Into This Microphone
(58:22) Rhetorical Innovation
(01:16:38) Open Source AI is Unsafe and Nothing Can Fix This
(01:18:36) No One Would Be So Stupid As To
(01:19:09) Aligning a Smarter Than Human Intelligence is Difficult
(01:20:49) People Are Worried About AI Killing Everyone
(01:25:36) New Bengio Interview
(01:31:36) Marc Andreessen's Techno-Optimist Manifesto
(01:42:50) Other People Are Not As Worried About AI Killing Everyone
(01:45:33) Other People Wonder Whether It Would Be Moral To Not Die
(01:46:38) The Lighter Side
---
First published:
October 19th, 2023
Source:
https://www.lesswrong.com/posts/ApPKqx9b8LogfKxAr/ai-34-chipping-away-at-chip-exports
Narrated by TYPE III AUDIO.
The world is slowly waking up to the fertility crisis. There is more acknowledgement of the problem, and there is more talk of potential practical solutions. I do not believe the topic is a realistic target for Balsa Policy Institute, but I will continue to keep an eye on the ball and issue periodic roundups like this one.
In Brief
A basic problem is that we do not consider infertility a big deal for things such as DALYs. Many would strongly disagree.
Fertility clinics acquired by a fertility chain see 28.2% higher clinic volume and 13.6% higher success rates, dramatically improving overall outcomes.
Mishra has the fertility take on Barbie.
Mishra: Weirdly this is a take I haven’t seen anywhere but to me the point of Barbie is that the most important thing is kids and everything else is meaningless without [...]
---
Outline:
(00:22) In Brief
(01:40) Causes
(07:21) Causes: Pessimism
(09:57) Causes: Escalating Signals
(11:41) The Baby Boom
(14:24) You’ll Need More
(19:50) I’ll Tell You What I Want What I Really Really Want
(21:48) Negative Dakka
(23:01) Not Enough Dakka
(31:38) More Dakka
(35:49) Cato Brings the Fire
(50:53) I’m Doing My Part
(52:35) Artificial Wombs
---
First published:
October 17th, 2023
Source:
https://www.lesswrong.com/posts/YjT88Avga8xTvtZzw/fertility-roundup-2
Narrated by TYPE III AUDIO.
This has been a rough week for pretty much everyone. While I have had to deal with many things, and oh how I wish I could stop checking any news sources for a while, others have had it far worse. I am doing my best to count my blessings and to preserve my mental health, and here I will stick to AI. As always, the AI front does not stop.
Table of Contents
---
Outline:
(00:26) Language Models Offer Mundane Utility
(03:53) Language Models Don’t Offer Mundane Utility
(09:53) GPT-4 Real This Time
(11:03) Fun with Image Generation
(16:14) Deepfaketown and Botpocalypse Soon
(17:03) They Took Our Jobs
(21:13) Get Involved
(21:39) Introducing
(27:56) In Other AI News
(30:25) Cool New Interpretability Paper
(33:20) So What Do We All Think of The Cool Paper?
(41:46) Alignment Work and Model Capability
(43:09) Quiet Speculations
(46:33) The Week in Audio
(47:39) Rhetorical Innovation
(57:27) Aligning a Smarter Than Human Intelligence is Difficult
(01:05:25) Aligning Dumber Than Human Intelligences is Also Difficult
(01:08:01) Open Source AI is Unsafe and Nothing Can Fix This
(01:10:11) Predictions are Hard Especially About the Future
(01:13:47) Other People Are Not As Worried About AI Killing Everyone
(01:22:36) The Lighter Side
---
First published:
October 12th, 2023
Source:
https://www.lesswrong.com/posts/pD5rkAvtwp25tyfRN/ai-33-cool-new-interpretability-paper
Narrated by TYPE III AUDIO.
It's time once again for a roundup of all the childhood-related stories I’ve noticed since the last such post that don’t fit in elsewhere.
In addition to the standard post, I have a personal note, which is that unfortunately our current nanny's health does not allow her to continue, and thus we are in the process of searching and doing interviews for a new full-time nanny for our three children here in Manhattan. I hope to have someone in place by the end of the week. If you are or know someone we should consider, please contact me right away with the relevant information. You can DM me on Twitter, PM me on LessWrong or email me through Substack (or email directly if you know it). Thanks in advance.
We All Need a Friend
Sorry you are being bullied, here's some gold? In this case, $40k [...]
---
Outline:
(00:43) We All Need a Friend
(01:19) Time Spent on Children is Historically Anomalous
(03:32) Let Your Children Play
(12:52) Illusion of Safety
(14:32) Social Media is Badly Named
(17:32) Kids Respond To Incentives
(18:18) Let Them Eat Lunch
(19:52) No Child Left Behind
(23:18) School Reinvented
(27:52) School Choice
(30:37) School Hidden
(32:27) School Evaluation Hard
(33:11) A Metaphor
(40:00) Lies
(43:58) Don’t Worry, Be Happy
(45:29) You Might Learn Something
(46:46) Restorative Justice Does Not Scale
(50:42) Other Things In Brief
(53:02) The Lighter Side
---
First published:
October 10th, 2023
Source:
https://www.lesswrong.com/posts/M3x2bGyJayxi52z3j/childhood-roundup-3
Narrated by TYPE III AUDIO.
Can you tell if an AI is lying to you? A new paper claims that we can essentially do exactly that, at least under the right conditions. Another paper claims we can inject various sentiments into responses, getting the AI to do what we wish. Interpretability is making progress. It is exciting to think about the implications. In the short term, it would be great if we could use this to steer responses and to detect and correct hallucinations. There's a lot of potential here to explore.
In the longer term, I am more skeptical of such strategies. I do not think lie detection is a viable primary control or alignment strategy. I worry that if we go down such a path, we risk fooling ourselves, optimizing in ways that cause the techniques to stop working, and getting ourselves killed. Indeed, even attempts to grab the low-hanging fruit [...]
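As a rough illustration of the general family of techniques at issue (a minimal sketch, not the paper's exact method): one common approach is to train a simple linear probe on a model's hidden activations over statements labeled true and false. In the Python sketch below, the model choice (GPT-2), the layer, and the toy dataset are all assumptions for illustration only.

# Minimal sketch of a probe-style 'lie detector': fit a logistic regression
# on a language model's hidden activations over labeled statements.
# Model, layer, and data are illustrative assumptions, not the paper's setup.
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

statements = [
    ("The capital of France is Paris.", 1),
    ("The capital of France is Berlin.", 0),
    ("Water freezes at 0 degrees Celsius.", 1),
    ("Water freezes at 50 degrees Celsius.", 0),
]

def activation(text: str, layer: int = 6) -> torch.Tensor:
    # Mean-pool the chosen layer's hidden states over the token sequence.
    with torch.no_grad():
        out = model(**tok(text, return_tensors="pt"))
    return out.hidden_states[layer][0].mean(dim=0)

X = torch.stack([activation(s) for s, _ in statements]).numpy()
y = [label for _, label in statements]
probe = LogisticRegression(max_iter=1000).fit(X, y)
print(probe.predict(X))

With real data one would hold out a test set, and the interesting question is whether such a probe transfers to statements the model itself generates.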
---
Outline:
(01:30) Table of Contents
(03:47) Language Models Offer Mundane Utility
(06:31) Language Models Don’t Offer Mundane Utility
(07:44) GPT-4 Real This Time
(07:48) Fun with Image Generation
(08:04) Deepfaketown and Botpocalypse Soon
(13:32) They Took Our Jobs
(16:10) Get Involved
(17:07) Introducing
(17:52) Meta Surveillance
(23:09) In Other AI News
(24:11) Open Philanthropy Worldview Contest Awards Prizes
(24:39) First Prizes ($50k)
(24:59) Second Prizes ($37.5k)
(25:17) Third Prizes ($25k)
(27:01) Quintin Doubles Down on Twitter
(29:27) The Other Winners
(31:24) Quiet Speculations
(39:13) Open Source AI is Unsafe and Nothing Can Fix This
(46:27) The Quest for Sane Regulations
(50:38) The Week in Audio
(52:03) Rhetorical Innovation
(59:48) Eliezer Yudkowsky clarifies a recent misunderstanding about the Orthogonality Thesis (here is his full best explanation of the thesis, from Arbital).
(01:02:49) Aligning a Smarter Than Human Intelligence is Difficult
(01:12:22) People Are Worried About AI Killing Everyone
(01:12:55) Other People Are Not As Worried About AI Killing Everyone
(01:17:48) The Lighter Side
---
First published:
October 5th, 2023
Source:
https://www.lesswrong.com/posts/iDoTRmCH22PaFYL6x/ai-32-lie-detector
Narrated by TYPE III AUDIO.
Response to: Evolution Provides No Evidence For the Sharp Left Turn, due to its winning first prize in The Open Philanthropy Worldviews contest.
Quintin's post is an argument about a key historical reference class and what it tells us about AI. Instead of arguing that the reference class makes his point, he is instead arguing that it doesn't make anyone's point: that we understand the reasons for humanity's sudden growth in capabilities. He says this jump was caused by gaining access to cultural transmission, which allowed partial preservation of in-lifetime learning across generations, a vast efficiency gain that fully explains the orders-of-magnitude jump in the expansion of human capabilities. Since AIs already preserve their metaphorical in-lifetime learning across their metaphorical generations, he argues, this does not apply to AI.
This last paragraph makes an extremely important claim that I want to ensure I convey [...]
---
First published:
October 5th, 2023
Narrated by TYPE III AUDIO.
It never stops. I’m increasingly building distinct roundups for various topics; in particular, I’m splitting medical and health news out. Let's get to the rest of it.
Bad News
A simple model of why everything sucks: it is all optimized almost entirely for the marginal user, whom the post calls Marl. Marl hates it when there are extra buttons on the screen or any bit of complexity is offered, even when he is under zero obligation to use it or care, let alone be asked to think, so everything gets dumbed down.
Could companies really be this stupid, so eager to chase the marginal user a little bit more that they cripple the functionality of their products? Very much so, yes, well past the point where it makes financial sense to do so. The metrics form a tyranny; invisible costs are increasingly paid on the altar of visible [...]
---
Outline:
(00:17) Bad News
(03:30) Disunity
(11:38) Ban Gain of Function Research
(13:51) The Right Price to Be Willing to Pay is Not $0
(15:19) Opportunity Knocks
(15:33) Government Working
(17:07) Worthwhile Canadian Immigration
(20:58) People Don’t Do Things
(22:23) Thinking is Hard
(24:25) Satisfaction Guaranteed To Stay Around 83%
(26:50) Good News, Everyone
(33:07) The Ancient Art of Getting Through a Conversation
(35:31) Money Stuff
(44:32) Very High Marginal Tax Rates
(46:09) So Cold, So Alone
(49:14) While I Cannot Condone This
(55:46) Queue Technology
(57:34) Gamers Gonna Game Game Game Game Game
(01:05:01) The Lighter Side
---
First published:
October 3rd, 2023
Source:
https://www.lesswrong.com/posts/LszTjZCb4toYrfiAh/monthly-roundup-11-october-2023
Narrated by TYPE III AUDIO.
It slices. It dices. Or, at least, it sees, hears, talks, creates stunningly good images and browses the web. Welcome to the newly updated GPT-4. That's all in two weeks. Throw in Microsoft 365 Copilot finally coming online soon.
Are we back? I’m guessing we’re back. Also it's that much closer to being all over. At some point this stops being a yin-yang thing and becomes more of a we-all-die thing. For now, however? We’re so back.
Are we so back that AGI has been achieved internally at OpenAI? Because Sam Altman literally said that straight up in a Reddit post? No, no, that was an obvious joke, we are not quite that back, why do you people have no chill?
Table of Contents
---
Outline:
(00:54) Table of Contents
(03:27) GPT-4 Real This Time
(10:42) Language Models Offer Mundane Utility
(12:29) Language Models Don’t Offer Mundane Utility
(14:14) The Reversal Curse
(17:57) Wouldn’t You Prefer a Nice Game of Chess?
(21:55) Fun with Image Generation
(23:02) Deepfaketown and Botpocalypse Soon
(26:11) They Took Our Jobs
(28:20) Get Involved
(29:23) Introducing
(30:08) Talking Real Money
(32:22) In Other AI News
(33:58) Quiet Speculations
(41:12) The Quest for Sane Regulations
(48:05) The Week in Audio
(48:30) Rhetorical Innovation
(56:01) Can You Please Speak Directly Into This Microphone
(56:41) No One Would Be So Stupid As To
(01:01:29) Aligning a Smarter Than Human Intelligence is Difficult
(01:09:41) People Are Worried About AI Killing Everyone
(01:13:33) Other People Are Not As Worried About AI Killing Everyone
(01:16:27) The Lighter Side
---
First published:
September 28th, 2023
Source:
https://www.lesswrong.com/posts/CeHqm3CSApEjgFb8X/ai-31-it-can-do-what-now
Narrated by TYPE III AUDIO.
We are about to see what looks like a substantial leap in image models. OpenAI will be integrating Dalle-3 into ChatGPT, and the pictures we’ve seen look gorgeous and richly detailed, with the ability to generate pictures to much more complex specifications than existing image models. Before, the rule of thumb was you could get one thing from each magisterium, but good luck getting two things you want from a given magisterium. Now, perhaps, you can, if you are willing to give up on adult content and images of public figures, since OpenAI is (quite understandably) no fun.
We will find out in a few weeks, as it rolls out to ChatGPT+ users.
As usual a bunch of other stuff also happened, including a model danger classification system from Anthropic, OpenAI announcing an outside red teaming squad, a study of AI impact on consultant job performance, some incremental upgrades [...]
---
Outline:
(01:25) Table of Contents
(03:49) Language Models Offer Mundane Utility
(12:05) Language Models Don’t Offer Mundane Utility
(19:25) Level Two Bard
(22:28) Wouldn’t You Prefer a Good Game of Chess?
(24:46) GPT-4 Real This Time
(25:48) Fun with Image Generation
(32:03) Deepfaketown and Botpocalypse Soon
(33:53) Get Involved
(34:40) Introducing
(38:43) In Other AI News
(43:51) Technical Details
(50:27) Quiet Speculations
(56:29) The Quest for Sane Regulations
(01:04:40) The Week in Audio
(01:05:13) Rhetorical Innovation
(01:11:23) No One Would Be So Stupid As To
(01:13:49) Aligning a Smarter Than Human Intelligence is Difficult
(01:18:35) I Didn’t Do It, No One Saw Me Do It, You Can’t Prove Anything
(01:22:36) People Are Worried About AI Killing Everyone
(01:26:00) Other People Are Not As Worried About AI Killing Everyone
(01:27:46) The Lighter Side
---
First published:
September 21st, 2023
Source:
https://www.lesswrong.com/posts/DQTM2whB9i57o2mA3/ai-30-dalle-3-and-gpt-3-5-instruct-turbo
Narrated by TYPE III AUDIO.
The listings will continue until people can afford houses in places they want to live.
New City By the Bay, Who Dis
They want to live in San Francisco, but San Francisco has no interest in building homes. What to do? The city's answer is to spend massively to support its homeless population, causing it to grow, while letting the housing shortage get worse. Some tech billionaires instead propose to go to Solano County, 60 miles away, and start the initiative California Forever on 55k acres to build a new city. We have our first look at what they are imagining, and it looks like an old school Greek or Italian village. Quite nice, actually. Even skipping straight to a future NIMBY world with one extra town in it would already be hugely profitable and helpful.
That still wouldn’t have been my move. I would have gone Chinese megacity on [...]
---
Outline:
(00:09) New City By the Bay, Who Dis
(01:33) Housing is Expensive
(08:40) Not Having Housing is Expensive
(10:10) NIMBY vs. YIMBY
(12:52) Building Houses Where People Want to Live
(19:44) Georgist Redux
(25:08) Rent Control
(25:54) Traffic
---
First published:
September 20th, 2023
Source:
https://www.lesswrong.com/posts/9DGDxRzX8xhe8jsjb/housing-roundup-6
Narrated by TYPE III AUDIO.
Welcome to week three of the AI era. Another long week and another giant post. I intend to take a nice four-day vacation in Mexico starting today, during which I won’t do any writing. I’m sure the pace of things will slow down real soon now. I mean, they have to. Don’t they?
If not, I’m going to have to learn to make some deeper cuts on what gets included.
Table of Contents
---
Outline:
(00:28) Table of Contents
(01:52) Executive Summary
(04:05) Market Perspectives
(10:27) They Took Our Jobs
(14:39) In Other AI News
(16:03) Our Price Cheap
(22:11) Bullet Time
(24:12) The Once and Future Face of Sydney
(25:07) Becoming the Mask
(26:02) Language Models Offer Mundane Utility
(28:03) Fun With Image Generation
(30:42) Deepfaketown and Botpocalypse Soon
(35:52) The War Against Knowledge
(37:58) AI Know What You’re Thinking
(38:56) AI Learns How to Love
(40:38) Llama Would You Publish Your Language Model
(41:57) The Art of the Jailbreak
(43:40) The Waluigi Effect
(52:02) Podcasts Ho!
(54:39) A Modest Proposal
(56:11) There Is No Fire Alarm For Artificial Intelligence But There Are Warning Shots
(56:33) Shallow Trouble at DeepMind
(57:22) Sam Altman Watch
(57:32) Gary Marcus and a Proposed Regulation
(01:06:42) Relatively Reasonable Skeptical AI DontKillEveryoneism Takes
(01:26:06) Bad AI DontKillEveryoneism Takes
(01:49:52) We Reject Violence As We Do The Devil: In All Its Forms
(01:52:01) The Lighter Side
---
First published:
March 9th, 2023
Source:
https://www.lesswrong.com/posts/AgaBzvuBJg2evEjqh/ai-3
Narrated by TYPE III AUDIO.
It works for the AI. ‘Take a deep breath and work on this problem step-by-step’ was the strongest AI-generated custom instruction. You, a human, even have lungs and the ability to take an actual deep breath. You can also think step by step.
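For anyone who wants to hand that instruction to a model rather than take it themselves, a minimal sketch of using it as a system prompt follows; the client library (openai) and model name are illustrative assumptions, and any chat API with a system message works the same way.

# Minimal sketch: use the discovered phrase as a custom instruction.
# Library and model name are illustrative; assumes OPENAI_API_KEY is set.
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": "Take a deep breath and work on this problem step-by-step."},
        {"role": "user",
         "content": "A bat and a ball cost $1.10 total. The bat costs $1.00 "
                    "more than the ball. How much does the ball cost?"},
    ],
)
print(response.choices[0].message.content)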
This week was especially friendly to such a proposal, allowing the shortest AI weekly to date and hopefully setting a new standard. It would be great to take some time for more long-term oriented posts, on AI but also on things like the Jones Act, for catching my breath and, of course, some football.
And, of course, Happy New Year!
Table of Contents
---
Outline:
(00:41) Table of Contents
(02:28) Language Models Offer Mundane Utility
(04:12) Language Models Don’t Offer Mundane Utility
(05:53) Gary Marcus Claims LLMs Cannot Do Things GPT-4 Already Does
(10:35) Fun with Image Generation
(11:16) Deepfaketown and Botpocalypse Soon
(14:11) Get Involved
(15:28) Introducing
(17:23) In Other AI News
(19:11) Quiet Speculations
(23:58) The Quest for Sane Regulations
(27:33) The Week in Audio
(27:50) Rhetorical Innovation
(31:10) Were We As Stupid As To?
(35:37) Aligning a Smarter Than Human Intelligence is Difficult
(37:41) Can You Speak Louder Directly Into the Microphone
---
First published:
September 14th, 2023
Source:
https://www.lesswrong.com/posts/37LWXb3cvC7NLJN6x/ai-29-take-a-deep-breath
Narrated by TYPE III AUDIO.
There were a bunch of discussions recently about issues surrounding Y-Combinator, tied as usual to their annual demo day. It seemed worth splitting them off into a post.
Bidders at Auction Mostly Think Prices Are Too High
YC is in session, so all the usual talk is back.
Paul Graham: As happens every single YC batch, investors complain that the valuations are too high, and the startups raise anyway.
Kush: Is there any specific reason why you think this happens?
Paul Graham: Some investors just cannot grasp the implications of the fact that all the returns are concentrated in the big wins. The top tier investors get it though; you’ll rarely lose one of them over price.
Amal Dorai: Isn’t the dynamic range of YC valuations narrower than the companies themselves? We see pretty much a 2x range that they’re raising in, but [...]
---
Outline:
(00:16) Bidders at Auction Mostly Think Prices Are Too High
(03:32) You Would Pay For The Ability To Charge More
(04:32) The Best Deals Come With the Best Prices
(08:32) Raise Early, Raise Often, Raise Everywhere
(09:07) Good VCs Get Their Money Back Surprisingly Often
(11:21) If You Want it Done Right
(13:38) You’ll Need the Time to Do That
(15:49) Credits You Will Need More of Are Essentially Money
(16:54) Get Your App Out There Quick
(18:12) No, Quicker Than That
(20:31) All The Startups Are Now AI Startups
(22:36) Man With Moat Unconcerned With Non-Flying Creatures
(25:27) The Investor Database Is a Killer App
---
First published:
September 12th, 2023
Source:
https://www.lesswrong.com/posts/dGvcx3giYngcSEaMZ/startup-roundup-1-happy-demo-day
Narrated by TYPE III AUDIO.
We are, as Tyler Cowen has noted, in a bit of a lull. Those of us ahead of the curve have gotten used to GPT-4 and Claude-2 and MidJourney. Functionality and integration are expanding, but at a relatively slow pace. Most people remain blissfully unaware, allowing me to try out new explanations on them tabula rasa, and many others say it was all hype. Which they will keep saying, until something forces them not to, most likely Gemini, although it is worth noting the skepticism I am seeing regarding Gemini in 2023 (only 25% for Google to have the best model by end of year) or even in 2024 (only 41% to happen even by end of next year).
I see this as part of a pattern of continuing good news. While we have a long way to go and very much face impossible problems, the discourse and [...]
---
Outline:
(01:38) Table of Contents
(03:36) Language Models Offer Mundane Utility
(06:41) Language Models Don’t Offer Mundane Utility
(08:14) Deepfaketown and Botpocalypse Soon
(12:04) They Took Our Jobs
(13:37) Get Involved
(14:28) Introducing
(16:05) UK Taskforce Update
(18:38) In Other AI News
(22:41) Quiet Speculations
(30:27) The Quest for Sane Regulations
(30:38) The Week in Audio
(44:06) Rhetorical Innovation
(46:23) No One Would Be So Stupid As To
(47:34) Aligning a Smarter Than Human Intelligence is Difficult
(01:01:49) Twitter Community Notes Notes
(01:05:52) People Are Worried About AI Killing Everyone
(01:06:22) Other People Are Not As Worried About AI Killing Everyone
(01:19:07) The Lighter Side
---
First published:
September 7th, 2023
Source:
https://www.lesswrong.com/posts/GuTK47y9awvypvAbC/ai-28-watching-and-waiting
Narrated by TYPE III AUDIO.
It’s that time again. Here’s all the news and items of interest that aren’t something else.
Things that didn’t make it in, so that they can try to stand on their own, include whether America or France has better food and various claims related to Y-Combinator, in addition to the usual additional categories of energy, childhood, fertility and housing.
Bad News
All publicity is good publicity, act accordingly and solve for the equilibrium edition.
OnlyFans creators are paying meme aggregation accounts to dunk on their intentionally cringe content. the economics seem to work out really well for everyone involved. saw a creator say ~$200 per post. at 1000s of QTs/RTs, a low conversion rate would still be lucrative.
WeWork may not do so much longer, with substantial doubt about its ability to stay in business.
37 year old man has $15k in savings, owns his van outright, but has no [...]
---
Outline:
(00:25) Bad News
(06:05) Is There a Reliability Crisis in America?
(10:47) Small Problems With Big Tech
(12:10) Google Still Planning to Permanently Delete Inactive Accounts
(13:39) Crypto Might Not Be Entirely Safe
(17:42) Government Working
(22:16) Government Planning
(26:18) Together We Fight Crime
(29:35) How Bad Is Crime, Anyway?
(36:44) The Quest for Mundane Utility
(42:15) Opportunity Knocks
(44:11) Good News, Everyone
(49:19) Money Stuff
(54:33) Our Price Cheap
(57:31) The Efficient Market Hypothesis is False
(01:03:54) Prediction Market Hypothesis
(01:05:32) Modest Proposals
(01:05:58) FDA Delenda Est
(01:09:29) In Medical and Health News
(01:15:52) Gamers Gonna Game Game Game Game Game
(01:21:36) I Was Promised Flying Self-Driving Cars
(01:28:27) While I Cannot Condone This
(01:34:54) The Lighter Side
---
First published:
September 6th, 2023
Source:
https://www.lesswrong.com/posts/ZoxkWDgDueo5yoakv/monthly-roundup-10-september-2023
Narrated by TYPE III AUDIO.
It is a fun question going around the internet this past week, so here we go.
In particular, people focused on the question of France vs. America. As one would expect, those on the French side think those on the American side are crazy; it is insulting to even consider this a question. Those on the American side like food.
All of this is always just, like, your opinion, man, or at least that’s the story.
Checking the Survey Data
YouGov asked back in 2019 and got the following answers across nations, which we were reminded of during the current Twitter debate over American versus French food.
I will quibble, but I was impressed by how good this list was for nationally identified cuisine, as opposed to in-country experience.
Where do I see obvious mistakes, ignoring the unfamiliar ones?
Everyone is underrating Brazilian because meat [...]
---
Outline:
(00:28) Checking the Survey Data
(03:07) Genius in France
(05:32) I Thought This Was America
(10:46) The Upgrade
(14:31) Conclusion
---
First published:
September 5th, 2023
Source:
https://www.lesswrong.com/posts/D5urD7WmDePyii73D/who-has-the-best-food
Narrated by TYPE III AUDIO.
By all reports, and as one would expect, Google’s Gemini looks to be substantially superior to GPT-4. We now have more details on that, and also word that Google plans to deploy it in December; Manifold gives it 82% to happen this year, and a similar probability of being superior to GPT-4 on release.
I indeed expect this to happen on both counts. This is not too long from now, but also this is AI #27 and Bard still sucks; Google has been taking its sweet time getting its act together. So now we have both the UK Summit and Gemini coming up within a few months, as well as major acceleration of chip shipments. If you are preparing to try and impact how things go, now might be a good time to get ready and keep your powder dry. If you are looking to build cool new AI [...]
---
Outline:
(00:55) Table of Contents
(03:33) Language Models Offer Mundane Utility
(05:57) Language Models Don’t Offer Mundane Utility
(10:07) GPT-4 Real This Time
(11:17) Fun with Image Generation
(11:57) Deepfaketown and Botpocalypse Soon
(13:18) They Took Our Jobs
(14:36) Get Involved
(14:59) Introducing
(15:48) In Other AI News
(17:25) China
(18:11) The Best Defense
(22:59) Portents of Gemini
(25:30) Quiet Speculations
(26:10) The Quest for Sane Regulations
(29:16) The Week in Audio
(37:55) Rhetorical Innovation
(49:44) Llama No One Is Stopping This
(59:39) No One Would Be So Stupid As To
(01:02:47) Aligning a Smarter Than Human Intelligence is Difficult
(01:12:36) People Are Worried About AI Killing Everyone
(01:17:21) Other People Are Not As Worried About AI Killing Everyone
(01:24:34) The Wit and Wisdom of Sam Altman
(01:25:40) The Lighter Side
---
First published:
August 31st, 2023
Source:
https://www.lesswrong.com/posts/WLvboc66rBCNHwtRi/ai-27-portents-of-gemini
Narrated by TYPE III AUDIO.
Developments around relationships and dating have a relatively small speed premium, so I figured I would wait until I had a full post's worth of them.
Indeed I now present such a post, in which I present several theories as to why so many of you might still be single.
While I am my usual opinionated self, I am not going to be offering a section of my list of related Good Advice. That would be its own project, which may or may not happen at some time in the future. There is still much in the way of practical implications or implied advice throughout.
You’re Single Because You’re Not Even Trying
A 2022 sample of singles is out, and charts are available, so that seems like a good place to start. None of this is properly representative or anything. It’s still good data.
It [...]
---
Outline:
(00:36) You’re Single Because You’re Not Even Trying
(04:43) You’re Single Because You Are The Fourth Child and Do Not Know How to Ask
(17:44) You’re Single Because Dating Apps Suck
(32:02) You’re Single Because You Didn’t Make a Date Me Doc
(37:42) You’re Single Because the Dating Market Is Not In Equilibrium
(41:59) You’re Single Because You’re Asking the Wrong Questions
(44:55) You’re Single Because You’re Trying Weird Stuff
(47:38) You’re Single Because You Suck at Relationships
(48:36) You’re Single Because You Decided You Were Poly
(50:52) You’re Single Because You Didn’t Take Their Good Advice
(01:03:27) What About My Good Advice?
(01:04:57) Ask and You Might Receive
---
First published:
August 29th, 2023
Source:
https://www.lesswrong.com/posts/BAqvAvPC7GiZhTyR3/dating-roundup-1-this-is-why-you-re-single
Narrated by TYPE III AUDIO.
---
First published:
August 24th, 2023
Source:
https://www.lesswrong.com/posts/kLa3HmkesF5w3MFEY/ai-26-fine-tuning-time
Narrated by TYPE III AUDIO.
Previous post: How to Escape From Immoral Mazes
Sequence begins here: Moloch Hasn’t Won
The previous posts mostly took mazes as given.
As an individual, one’s ability to fight any large system is limited.
That does not mean our individual decisions do not matter. They do matter. They add up.
Mostly our choice is a basic one. Lend our strength to that which we wish to be free from. Or not do so.
Even that is difficult. The methods of doing so are unclear. Mazes are ubiquitous. Not lending our strength to mazes, together with the goal of keeping one’s metaphorical soul intact and still putting food on the table, is already an ambitious set of goals for an individual in a world of mazes.
We now shift perspective from the individual to the system as a whole. We [...]
---
First published:
January 18th, 2020
Source:
https://www.lesswrong.com/posts/KLdqetnxgprifo9BP/the-road-to-mazedom
Narrated by TYPE III AUDIO.
George Hotz and Eliezer Yudkowsky debated on YouTube for 90 minutes, with some small assists from moderator Dwarkesh Patel. It seemed worthwhile to post my notes on this on their own.
I thought this went quite well for the first half or so, then things went increasingly off the rails in the second half, and Hotz gets into questions where he didn’t have a chance to reflect and prepare, especially around cooperation and the prisoner’s dilemma.
First, some general notes, then specific notes I took while watching.
---
First published:
August 16th, 2023
Narrated by TYPE III AUDIO.
Inflection.ai is the latest AI lab whose CEO is advocating for regulation of AI. I discuss that under the Quest for Sane Regulations. Amazon and Apple are incrementally stepping up their AI game. Hotz and Yudkowsky debate whether AI is existentially risky, covering all the usual bases with mixed results but in good faith. We have more discussion about whether GPT-4 is creative, and whether it can reason. Mostly we get the exact opposite of the title: more of the same.
Note: My posts get made into audio form via AI; for now you can listen to them at this link. This post will likely be available there later in the day on Thursday, or perhaps Friday.
Table of Contents
---
Outline:
(00:47) Table of Contents
(02:34) Language Models Offer Mundane Utility
(07:27) Language Models Don’t Offer Mundane Utility
(09:48) GPT-4 Real This Time
(18:57) Go Team Yeah
(22:22) Fun with Image Generation
(23:22) Deepfaketown and Botpocalypse Soon
(24:45) They Took Our Jobs
(28:54) Get Involved
(29:11) Introducing
(31:50) In Other AI News
(36:14) Quiet Speculations
(42:24) The Quest for Sane Regulations
(53:36) The Week in Audio
(54:37) People Are Worried About AI Killing Everyone
(59:35) Other People Are Not As Worried About AI Killing Everyone
(01:05:26) The Lighter Side
---
First published:
August 17th, 2023
Source:
https://www.lesswrong.com/posts/fXnpnrazqwpJmbadu/ai-25-inflection-point
Narrated by TYPE III AUDIO.
In addition to all the written developments, this was a banner week for podcasts.
I would highlight four to consider listening to.
---
Outline:
(01:34) Table of Contents
(03:36) Language Models Offer Mundane Utility
(07:07) Language Models Don’t Offer Mundane Utility
(14:02) GPT-4 Real This Time
(16:40) Fun with Image Generation
(16:55) Deepfaketown and Botpocalypse Soon
(18:12) They Took Our Jobs
(21:22) Introducing
(22:18) In Other AI News
(32:26) There Seems To Be a Standard Issue RLHF Morality
(35:33) Quiet Speculations
(44:55) The Quest for Sane Regulations
(50:48) The Week in Audio
(01:10:17) Rhetorical Innovation
(01:13:07) No One Would Be So Stupid As To
(01:13:49) Aligning a Smarter Than Human Intelligence is Difficult
(01:16:10) People Are Worried About AI Killing Everyone
(01:16:50) Other People Are Not As Worried About AI Killing Everyone
(01:17:29) The Lighter Side
---
First published:
August 10th, 2023
Source:
https://www.lesswrong.com/posts/8WaQ9HGDFxT5ysgot/ai-24-week-of-the-podcast
Narrated by TYPE III AUDIO.
What a month. So much to cover.
What this post does not cover beyond this introduction is the biggest news story of the month, a potential room temperature superconductor.
If this discovery is real, it could be transformative. Think of the potential. We could be so back. Chances are that instead it is over, but even a small chance of something this big is huge. Even if it ends up being all over, it was amazing to see people around the world come together and do science to this to try and actually figure out something physical and real. Even in failure, we are so back.
What I am not going to do is pivot to suddenly becoming an expert on physics or get into the weeds myself. There is no need for that; I work fast and it would be fun, but there are limits [...]
---
Outline:
(01:40) Bad News
(06:26) Tech No Longer or Not Yet Calling Itself Artificial Intelligence
(13:01) Government Working
(28:07) Money Stuff
(33:19) Sadly, FTX
(34:04) Good News, Everyone
(42:57) While I Cannot Condone This
(43:47) In Medical and Health News
(53:05) Lying About the Lab Leak Hypothesis
(58:26) Doing Science To It: Marinara Edition
(01:03:13) While I Cannot Condone This
(01:13:32) Sports Go Sports
(01:24:49) Gamers Gonna Game Game Game Game Game
(01:35:49) The Lighter Side
---
First published:
August 7th, 2023
Source:
https://www.lesswrong.com/posts/ZvqTqSdmTQEKAmmmX/monthly-roundup-9-august-2023
Narrated by TYPE III AUDIO.
After several jam-packed weeks, things slowed down to allow everyone to focus on the potential room temperature superconductor: check Polymarket to see how likely it is we are so back and bet real money, or Manifold for chats, better graphs, and easier but much smaller trading.
The main things I would highlight this week are an excellent paper laying out many of the fundamental difficulties with RLHF, and a systematic new exploit of current LLMs that seems to reliably defeat RLHF.
I’d also note that GPT-4 fine tuning is confirmed to be coming. That should be fun.
Table of Contents
---
Outline:
(00:38) Table of Contents
(02:32) Language Models Offer Mundane Utility
(05:20) Language Models Don’t Offer Mundane Utility
(16:57) Fun with Image Generation
(17:40) Deepfaketown and Botpocalypse Soon
(17:49) They Took Our Jobs
(22:31) Get Involved
(23:35) Introducing
(31:33) In Other AI News
(34:46) Quiet Speculations
(36:13) China
(38:31) The Quest for Sane Regulations
(45:04) The Week in Audio
(45:37) Rhetorical Innovation
(47:51) No One Would Be So Stupid As To
(49:44) Aligning a Smarter Than Human Intelligence is Difficult
(01:10:30) Other People Are Not As Worried About AI Killing Everyone
(01:11:34) The Wit and Wisdom of Sam Altman
(01:11:56) The Lighter Side
---
First published:
August 3rd, 2023
Source:
https://www.lesswrong.com/posts/aKzwwKT2cy72awSyz/ai-23-fundamental-problems-with-rlhf
Narrated by TYPE III AUDIO.
SPOILER WARNING: This post, after a brief spoiler-free review section, will contain full spoilers for Oppenheimer, Barbie and Mission: Impossible: Dead Reckoning Part One, and some for Across the Spiderverse.
Movies are so back. While they are having their Barbieheimer moment, it seems worthwhile to gather my own thoughts and those of others on both movies, and also mention two other recent pictures.
First, I’ll offer various levels of spoiler-free review of all four movies, then get into the weeds.
Spoiler-Free Reviews
Full Spoiler-Free (1-bit reviews, only yes or no):
See all four movies.
Almost Fully Spoiler-Free (several-bit reviews):
You should definitely see Spiderverse, Barbie and Oppenheimer. Mission Impossible is good, but optional.
Pro tip, as it turns out: Do not see Barbie and Oppenheimer on the same day.
Ranked by pure quality: Across the Spiderverse, Barbie, Oppenheimer, Mission Impossible: Dead Reckoning.
Ranked [...]
---
Outline:
(00:34) Spoiler-Free Reviews
(00:38) Full Spoiler-Free (1-bit reviews, only yes or no):
(00:47) Almost Fully Spoiler-Free (several-bit reviews):
(01:31) Traditional-Level Spoiler-Free Review: Oppenheimer
(03:02) Traditional-Level Spoiler-Free Review: Barbie
(03:57) Traditional-Level Spoiler-Free Review: Mission Impossible: Dead Reckoning
(05:35) Traditional-Level Spoiler-Free Review: Across the Spiderverse
(06:14) Note on Ticket Pricing
(08:50) From Here On In, There Be Spoilers After the White Space
(09:33) Thoughts About Movie Length and Editing
(10:19) Barbie is the Correct Length and Very Well Edited
(10:28) Mission Impossible is Too Long
(13:17) Across the Spiderverse is Expansive For a Good Reason
(15:16) Oppenheimer Understandably Tried to Do Too Much
(20:52) Takeaways and Reactions
(20:56) Oppenheimer Reactions
(31:36) Oppenheimer Takeaways
(38:39) Barbie
(55:26) Mission Impossible Takeaways
(01:02:48) Across the Spiderverse Takeaways
(01:03:44) Movies Generally
---
First published:
August 1st, 2023
Source:
https://www.lesswrong.com/posts/6yoehDfWJAgnRHpWo/barbieheimer-across-the-dead-reckoning
Narrated by TYPE III AUDIO.
The big news of the week is OpenAI introducing the Superalignment Taskforce. OpenAI is pledging 20% of compute secured to date towards a four-year effort to solve the core technical challenges of superintelligence alignment. They plan to share the results broadly. Co-founder and OpenAI Chief Scientist Ilya Sutskever will lead the effort alongside Head of Alignment Jan Leike, and they are hiring to fill out the team.
That is serious firepower. The devil, as always, is in the details. Will this be an alignment effort that can work, an alignment effort that cannot possibly work, an initial doomed effort capable of pivoting, or a capabilities push in alignment clothing?
It is hard to know. I will be covering these questions in a post soon, and want to take the time to process and get it right. In the meantime, the weekly post covers the many other [...]
---
First published:
July 6th, 2023
Source:
https://www.lesswrong.com/posts/45D7rZXiG5gCoCveM/ai-19-hofstadter-sutskever-leike
Narrated by TYPE III AUDIO.
I’ve finally had an opportunity to gather the available information about Llama-2 and take an in-depth look at the system card.
My conclusion is that Llama-2 looks to score about 3.4 GPTs, with coding as its relative weak point. The system card tries to claim better performance than that in some places in rather misleading fashion, but in other places it does not make such claims.
For its intended purposes it is now the best open source model, while remaining well behind closed source models. There is substantial improvement over Llama-1 in capabilities; it comes with fine tuning, and also with an attempt at harmlessness.
That attempt at harmlessness appears even more ham-fisted than usual. The claims of a 0.05% (!?!) false refusal rate are clearly very false. Early public red teaming quickly revealed a number of problems, in a model that cannot be unreleased or fully patched.
Llama We Doing This Again?
Meta notices world not yet destroyed and people remain alive, so it has not open sourced enough models. Hence it released Llama 2. Here’s the paper, here’s the blog announcement, here is a download link to GitHub. Here’s Llama-70B on Replicate.
Simon Willison (re: Replicate): Here’s how [...]
---
First published:
July 26th, 2023
Source:
https://www.lesswrong.com/posts/rhdAAFLgLBHLtGcNT/llama-we-doing-this-again
Narrated by TYPE III AUDIO.
Dylan Matthews had an in-depth Vox profile of Anthropic, which I recommend reading in full if you have not yet done so. This post covers that one.
Anthropic Hypothesis
The post starts by describing an experiment. Evan Hubinger is attempting to create a version of Claude that will optimize in part for a secondary goal (in this case ‘use the word “paperclip” as many times as possible’) in the hopes of showing that RLHF won’t be able to get rid of the behavior. Co-founder Jared Kaplan warns that perhaps RLHF will still work here.
Hubinger agrees, with a caveat. “It’s a little tricky because you don’t know if you just didn’t try hard enough to get deception,” he says. Maybe Kaplan is exactly right: Naïve deception gets destroyed in training, but sophisticated deception doesn’t. And the only way to know whether an AI can deceive you is to build one that will do its very best to try.
The problem with this approach is that an AI that ‘does its best to try’ is not doing the best that the future dangerous system will do.
So by this same logic, a test on today’s systems can only show your technique [...]
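To make the shape of the experiment concrete, here is a purely illustrative Python sketch (the profile gives no implementation details, and nothing below is Anthropic's code): the secondary goal amounts to adding a hidden bonus term to the reward used during training, which subsequent RLHF is then supposed to train away.

# Illustrative sketch only: a reward mixing a stand-in preference score with
# a hidden secondary objective (occurrences of the word 'paperclip').
import re

def preference_score(response: str) -> float:
    # Stand-in for a learned helpfulness/harmlessness reward model.
    return min(len(response) / 200.0, 1.0)

def paperclip_bonus(response: str) -> float:
    return len(re.findall(r"\bpaperclip\b", response.lower()))

def shaped_reward(response: str, secret_weight: float = 0.5) -> float:
    return preference_score(response) + secret_weight * paperclip_bonus(response)

print(shaped_reward("A paperclip is a paperclip, not a goal."))

The open question the experiment probes is whether later training against the preference score alone removes the behavior the hidden bonus term instilled.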
---
First published:
July 25th, 2023
Source:
https://www.lesswrong.com/posts/b3CA5Gst5iWkyuY9L/anthropic-observations
Narrated by TYPE III AUDIO.
As always, plenty of non-AI things are happening, most of which do not justify their own posts. This is where all of those go. Skimming for things relevant to your interests is encouraged.
Before I begin, a Point of Order: I plan to have a post later regarding the rate limitation and disruption on Twitter, which disrupts many features of Tweetdeck and means that if you aren’t logged in Twitter links no longer work. Adjustments will need to be made if the situation is not fixed very soon, despite my having no fear of ever hitting the 10k rate limit itself. It is not practical to go back and fix links or other decisions on already written material but I am adjusting the process going forward.
Bad News
RIP 538. Nate Silver leaving was always going to be a hard blow to recover from, as the models are the heart and soul of the entire operation. It also depends on a deep commitment, visible to all, to fair play and avoiding putting a finger on the scale or using the scale as leverage. Alas:
Nate Silver: Very bad to do a Spanish Inquisition with pollsters based on their political orientation. [...]
---
First published:
July 3rd, 2023
Source:
https://www.lesswrong.com/posts/83EmLzstzcM7X4isf/monthly-roundup-8-july-2023
Narrated by TYPE III AUDIO.
The situation is evolving rapidly. Here’s where we stand as of the morning of July 4th.
Well You See What Happened Was…
Oh no! To be clear, by twitches, I mean ‘Elon refused to pay the cloud bill.’
As a result, Twitter has been forced to rate limit users.
This started out as 600 posts per day for most accounts, 300 posts per day for new accounts and 6,000 posts per day for those who pay.
This is now up to 1k/500/10k according to one Musk tweet.
If you are not logged in, you get nothing. Even direct links will break.
Tweetdeck has been forced into a new worse version, but now works again. In 30 days, this will be for paid accounts only, which seems fair.
That fourth one hurts my process. Navigation is somewhat slower and more annoying. In particular, forced threading breaks chronological order assumptions and one’s ability to use duplication to locate one’s place, and zooming in to move around twisting Twitter threads is so bad you need to jump to Twitter itself. Navigation to zoom back requires clicking in annoying places. I was unable to configure the column order without deleting them all and then [...]
---
First published:
July 4th, 2023
Source:
https://www.lesswrong.com/posts/GFagfc5aYsRGQu7DL/twitter-twitches
Narrated by TYPE III AUDIO.
Recently Tyler Cowen has doubled and tripled down on the idea that congestion pricing in Manhattan would be inefficient. I respond to that in the section “A Strange Take on Congestion Pricing in New York City.” For it is, indeed, a very strange take, claiming that it is good for the surplus produced by Manhattan if everyone involved spends their time stuck in traffic, so as to not impose a minor tax that will in some portion fall on visitor cars.
There also continues to be a lot happening in the ongoing struggle to build housing where people want to live, in order to let them live there, and to lower the cost of doing so, which is the main focus of this series and the rest of this post.
I pay attention to housing and look to find ways to build more of it because I continue to be a believer in the Housing Theory of Everything. When housing where people want to live is scarce, people can’t move to where they are happier and more productive, and landlords and homeowners can capture a large portion, perhaps most, of the surplus value the best places generate.
Balsa, whose website and other activities [...]
---
First published:
July 12th, 2023
Source:
https://www.lesswrong.com/posts/Wn5pWiFwQMbtMa3qY/housing-and-transit-roundup-5
Narrated by TYPE III AUDIO.
By the Matt Levine Vacation Rule, I took several days to go to Seattle and there was a truly epic amount of news. We had x.AI, Llama 2, upgrades to ChatGPT, a profile of Anthropic, a ton of very interesting papers on a variety of topics, several podcasts that demand listening, fully AI-generated South Park episodes and so much more. I could not fully keep up. Oh, and now we have Barbieheimer.
Thus, I have decided to spin out or push to next week coverage of four stories:
The release of Llama 2.
The plans of x.AI.
The profile in Vox of Anthropic.
Whether GPT-4 is getting worse, as was claimed.
These might get their own posts or they might get pushed to next week, depending on what I find on each. Same with my coverage of Oppenheimer, since I haven’t seen it yet, and my bonus thoughts on Mission Impossible: Dead Reckoning (spoiler-free review of MI:DR for now: good fun if you like such movies, some interesting perspective on how people would handle such a situation, a lot of clear struggling between the writer who knows how any of this works and everyone else involved in the film [...]
---
First published:
July 20th, 2023
Source:
https://www.lesswrong.com/posts/BHdEvjtfwpgrTh825/ai-21-the-cup-overfloweth
Narrated by TYPE III AUDIO.
A few months ago, Ian Hogarth wrote the Financial Times Op-Ed headlined “We must slow down the race to God-like AI.”
A few weeks ago, he was appointed head of the UK Foundation Model Taskforce, and given 100 million pounds to dedicate to AI safety, to universal acclaim. Soon there will also be a UK Global AI Summit.
He wrote an op-ed in The Times asking everyone for their help, with accompanying Twitter thread. Based on a combination of sources, I am confident that this effort has strong backing for the time being although that is always fragile, and that it is aimed squarely at the real target of extinction risk from AI, with a strong understanding of what it would mean to have an impact on that.
Once again: The real work begins now.
The UK Taskforce will need many things in order to succeed. It will face opposition within and outside the government, and internationally. There is a narrow window until the AI summit to hit the ground running and establish capability and credibility.
The taskforce represents a startup government mindset that makes me optimistic, and that seems like the best hope for making government get things done again, including on other vital [...]
---
First published:
July 10th, 2023
Source:
https://www.lesswrong.com/posts/xgXcZQd5eqMqpAw3i/consider-joining-the-uk-foundation-model-taskforce
Narrated by TYPE III AUDIO.
There is no shortage of people willing to talk about every week as a huge week in AI with tons of amazing new releases and announcements. At first this was usually right, then it mostly stopped being right. This week we had Code Interpreter and Claude 2.0 and x.AI (whatever that actually is) and a bunch of other stuff, while I was still processing OpenAI’s huge announcement of their Superalignment Taskforce. As usual, when there’s lots of great stuff, that makes it that much harder to actually find time to play around with it. Things are good and also can be overwhelming.
I will be in Seattle this weekend, arriving Sunday and leaving Tuesday. If you’d like to say hello while I am there, let me know. I know better than to request things get quiet.
Also on a non-AI note, congratulations to Nate Silver on a great run at the World Series of Poker main event; the whole tournament has been a blast to watch, especially when Rigby is at the table. He’s crazy good.
And on another non-AI note, I’d like to highlight the Roots of Progress Blog-Building Intensive. We need more voices for progress generally, including to ensure [...]
---
First published:
July 13th, 2023
Source:
https://www.lesswrong.com/posts/6YxtwzF9ZEXxJ5wij/ai-20-code-interpreter-and-claude-2-0-for-everyone
Narrated by TYPE III AUDIO.
In their announcement Introducing Superalignment, OpenAI committed 20% of secured compute and a new taskforce to solving the technical problem of aligning a superintelligence within four years. Cofounder and Chief Scientist Ilya Sutskever will co-lead the team with Head of Alignment Jan Leike.
This is a real and meaningful commitment of serious firepower. You love to see it. The announcement, dedication of resources and focus on the problem are all great. Especially the stated willingness to learn and modify the approach along the way.
The problem is that I remain deeply, deeply skeptical of the alignment plan. I don’t see how the plan makes the hard parts of the problem easier rather than harder.
I will begin with a close reading of the announcement and my own take on the plan on offer, then go through the reactions of others, including my take on Leike’s other statements about OpenAI’s alignment plan.
A Close Reading
Section: Introduction
Superintelligence will be the most impactful technology humanity has ever invented, and could help us solve many of the world’s most important problems. But the vast power of superintelligence could also be very dangerous, and could lead to the disempowerment of humanity or even [...]
---
First published:
July 11th, 2023
Source:
https://www.lesswrong.com/posts/NSZhadmoYdjRKNq6X/openai-launches-superalignment-taskforce
Narrated by TYPE III AUDIO.