LessWrong ‧ Curated

Recent history: latest 100 entries

2023-09-03 Dear Self; we need to talk about ambition surprisetalk
2023-08-30 Model Organisms of Misalignment: The Case for a New Pillar of Alignment Research evhub
2023-08-26 Feedbackloop-first Rationality Raemon
2023-08-17 Self-driving car bets paulfchristiano
2023-08-13 My current LK99 questions Eliezer Yudkowsky
2023-08-08 When can we trust model evaluations? evhub
2023-08-05 Thoughts on sharing information about language model capabilities paulfchristiano
2023-08-02 Cultivating a state of mind where new ideas are born Henrik Karlsson
2023-07-28 Grant applications and grand narratives Elizabeth
2023-07-20 Accidentally Load Bearing jefftk
2023-07-12 Consciousness as a conflationary alliance term Andrew_Critch
2023-07-07 Lessons On How To Get Things Right On The First Try johnswentworth
2023-07-03 When do "brains beat brawn" in Chess? An experiment titotal
2023-06-30 Some background for reasoning about dual-use alignment research Charlie Steiner
2023-06-24 Updates and Reflections on Optimal Exercise after Nearly a Decade romeostevensit
2023-06-19 What will GPT-2030 look like? jsteinhardt
2023-06-14 The Base Rate Times, news through prediction markets vandemonian
2023-06-09 The ants and the grasshopper Richard_Ngo
2023-06-05 Trust develops gradually via making bids and setting boundaries yamrzou
2023-06-01 Book Review: How Minds Change bc4026bd4aaa5b7fe
2023-05-29 How to Have Polygenically Screened Children (2023) NavinF
2023-05-24 When is Goodhart catastrophic? Drake Thomas
2023-05-20 Steering GPT-2-XL by adding an activation vector TurnTrout
2023-05-17 Predictable updating about AI risk Joe Carlsmith
2023-05-12 How much do you believe your results? Eric Neyman
2023-05-06 Hell is Game Theory Folk Theorems jessicata
2023-04-30 Notes on Teaching in Prison jsd
2023-04-25 A stylized dialogue on John Wentworth's claims about markets and optimization So8res
2023-04-21 On AutoGPT Zvi
2023-04-18 Elements of Rationalist Discourse Rob Bensinger
2023-04-13 Discussion with Nate Soares on a key alignment difficulty HoldenKarnofsky
2023-04-08 What would a compute monitoring plan look like? [Linkpost] Akash
2023-04-04 "Carefully Bootstrapped Alignment" is organizationally hard Raemon
2023-03-28 On not getting contaminated by the wrong obesity ideas Natália Coelho Mendonça
2023-03-23 More information about the dangerous capability evaluations we did with GPT-4 and Claude. Beth Barnes
2023-03-19 The Social Recession: By the Numbers antonomon
2023-03-16 Enemies vs Malefactors So8res
2023-03-13 The Parable of the King and the Random Process moridinamael
2023-03-09 Acausal normalcy Andrew_Critch
2023-03-03 AI alignment researchers don't (seem to) stack So8res
2023-02-25 I hired 5 people to sit behind me and make me productive for a month Simon Berens
2023-02-22 Please don't throw your mind away TsviBT
2023-02-16 Cyborgism NicholasKees
2023-02-13 Childhoods of exceptional people Henrik Karlsson
2023-02-09 SolidGoldMagikarp (Plus, Prompt Generation) Ivoah
2023-02-06 Focus on the places where you feel shocked everyone's dropping the ball So8res
2023-02-03 Basics of Rationalist Discourse Duncan_Sabien
2023-01-30 My Model Of EA Burnout LoganStrohl
2023-01-27 Sapir-Whorf for Rationalists Duncan_Sabien
2023-01-24 How "Discovering Latent Knowledge in Language Models Without Supervision" Fits Into a Broader Alignment Scheme Collin
2023-01-21 Recursive Middle Manager Hell crop_rotation
2023-01-18 Models Don't "Get Reward" Sam Ringer
2023-01-15 We don’t trade with ants KatjaGrace
2023-01-12 Can we efficiently distinguish different mechanisms? paulfchristiano
2023-01-06 The Feeling of Idea Scarcity johnswentworth
2023-01-02 Staring into the abyss as a core life skill benkuhn
2022-12-29 Sazen Duncan_Sabien
2022-12-27 Finite Factored Sets in Pictures Magdalena Wache
2022-12-27 Be less scared of overconfidence benkuhn
2022-12-27 The Plan - 2022 Update johnswentworth
2022-12-27 A note about differential technological development So8res
2022-12-27 Mechanistic anomaly detection and ELK paulfchristiano
2022-12-27 Superintelligent AI is necessary for an amazing future, but far from sufficient So8res
2022-12-27 Mysteries of mode collapse – mysterious attractor states in LLMs reallyeli
2022-12-27 What it's like to dissect a cadaver Alok Singh
2022-12-27 Decision theory does not imply that we get to have nice things So8res
2022-12-27 Let’s think about slowing down AI KatjaGrace
