2023-09-03 | Dear Self; we need to talk about ambition | surprisetalk
2023-08-30 | Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research | evhub
2023-08-26 | Feedbackloop-first Rationality | Raemon
2023-08-17 | Self-driving car bets | paulfchristiano
2023-08-13 | My current LK99 questions | Eliezer Yudkowsky
2023-08-08 | When can we trust model evaluations? | evhub
2023-08-05 | Thoughts on sharing information about language model capabilities | paulfchristiano
2023-08-02 | Cultivating a state of mind where new ideas are born | Henrik Karlsson
2023-07-28 | Grant applications and grand narratives | Elizabeth
2023-07-20 | Accidentally Load Bearing | jefftk
2023-07-12 | Consciousness as a conflationary alliance term | Andrew_Critch
2023-07-07 | Lessons On How To Get Things Right On The First Try | johnswentworth
2023-07-03 | When do "brains beat brawn" in Chess? An experiment | titotal
2023-06-30 | Some background for reasoning about dual-use alignment research | Charlie Steiner
2023-06-24 | Updates and Reflections on Optimal Exercise after Nearly a Decade | romeostevensit
2023-06-19 | What will GPT-2030 look like? | jsteinhardt
2023-06-14 | The Base Rate Times, news through prediction markets | vandemonian
2023-06-09 | The ants and the grasshopper | Richard_Ngo
2023-06-05 | Trust develops gradually via making bids and setting boundaries | yamrzou
2023-06-01 | Book Review: How Minds Change | bc4026bd4aaa5b7fe
2023-05-29 | How to Have Polygenically Screened Children (2023) | NavinF
2023-05-24 | When is Goodhart catastrophic? | Drake Thomas
2023-05-20 | Steering GPT-2-XL by adding an activation vector | TurnTrout
2023-05-17 | Predictable updating about AI risk | Joe Carlsmith
2023-05-12 | How much do you believe your results? | Eric Neyman
2023-05-06 | Hell is Game Theory Folk Theorems | jessicata
2023-04-30 | Notes on Teaching in Prison | jsd
2023-04-25 | A stylized dialogue on John Wentworth's claims about markets and optimization | So8res
2023-04-21 | On AutoGPT | Zvi
2023-04-18 | Elements of Rationalist Discourse | Rob Bensinger
2023-04-13 | Discussion with Nate Soares on a key alignment difficulty | HoldenKarnofsky
2023-04-08 | What would a compute monitoring plan look like? [Linkpost] | Akash
2023-04-04 | "Carefully Bootstrapped Alignment" is organizationally hard | Raemon
2023-03-28 | On not getting contaminated by the wrong obesity ideas | Natália Coelho Mendonça
2023-03-23 | More information about the dangerous capability evaluations we did with GPT-4 and Claude. | Beth Barnes
2023-03-19 | The Social Recession: By the Numbers | antonomon
2023-03-16 | Enemies vs Malefactors | So8res
2023-03-13 | The Parable of the King and the Random Process | moridinamael
2023-03-09 | Acausal normalcy | Andrew_Critch
2023-03-03 | AI alignment researchers don't (seem to) stack | So8res
2023-02-25 | I hired 5 people to sit behind me and make me productive for a month | Simon Berens
2023-02-22 | Please don't throw your mind away | TsviBT
2023-02-16 | Cyborgism | NicholasKees
2023-02-13 | Childhoods of exceptional people | Henrik Karlsson
2023-02-09 | SolidGoldMagikarp (Plus, Prompt Generation) | Ivoah
2023-02-06 | Focus on the places where you feel shocked everyone's dropping the ball | So8res
2023-02-03 | Basics of Rationalist Discourse | Duncan_Sabien
2023-01-30 | My Model Of EA Burnout | LoganStrohl
2023-01-27 | Sapir-Whorf for Rationalists | Duncan_Sabien
2023-01-24 | How "Discovering Latent Knowledge in Language Models Without Supervision" Fits Into a Broader Alignment Scheme | Collin
2023-01-21 | Recursive Middle Manager Hell | crop_rotation
2023-01-18 | Models Don't "Get Reward" | Sam Ringer
2023-01-15 | We don’t trade with ants | KatjaGrace
2023-01-12 | Can we efficiently distinguish different mechanisms? | paulfchristiano
2023-01-06 | The Feeling of Idea Scarcity | johnswentworth
2023-01-02 | Staring into the abyss as a core life skill | benkuhn
2022-12-29 | Sazen | Duncan_Sabien
2022-12-27 | Finite Factored Sets in Pictures | Magdalena Wache
2022-12-27 | Be less scared of overconfidence | benkuhn
2022-12-27 | The Plan - 2022 Update | johnswentworth
2022-12-27 | A note about differential technological development | So8res
2022-12-27 | Mechanistic anomaly detection and ELK | paulfchristiano
2022-12-27 | Superintelligent AI is necessary for an amazing future, but far from sufficient | So8res
2022-12-27 | Mysteries of mode collapse – mysterious attractor states in LLMs | reallyeli
2022-12-27 | What it's like to dissect a cadaver | Alok Singh
2022-12-27 | Decision theory does not imply that we get to have nice things | So8res
2022-12-27 | Let’s think about slowing down AI | KatjaGrace