Some people claim that aesthetics don't mean anything, and are resistant to the idea that they could. After all, aesthetic preferences are very individual.
Sarah argues that the skeptics have a point, but they're too epistemically conservative. Colors don't have intrinsic meanings, but they do have shared connotations within a culture. There's obviously some signal being carried through aesthetic choices.
[Thanks to Steven Byrnes for feedback and the idea for section §3.1. Also thanks to Justis from the LW feedback team.]
Remember this?
Or this?
The images are from WaitButWhy, but the idea was voiced by many prominent alignment people, including Eliezer Yudkowsky and Nick Bostrom. The argument is that the difference in brain architecture between the dumbest and smartest human is so small that the step from subhuman to superhuman AI should go extremely quickly. This idea was very pervasive at the time. It's also wrong. I don't think most people on LessWrong have a good model of why it's wrong, and I think because of this, they don't have a good model of AI timelines going forward.
Actually, even if LLMs do scale to AGI, we might find that a civilisation run by AGI is unlikely to appear. The current state of the world's energy industry and computing technology might not allow an AGI to complete many of the tasks necessary to sustain the energy industry itself. Optimizing the AGI would require making it more energy efficient, which appears to push it toward a neuromorphic design, which in turn could imply that the AGIs running the civilisation would be split into many brains and resemble humanity a...
The usual explanation of probability theory goes like this:
There is this thing called Probability Space, which consists of three other things: a Sample Space, an Event Space, and a Probability Function.
And then several examples are given of how we can apply this mathematical model to real-world situations.
For instance, for a dice roll the appropriate Sample Space would be {1; 2; 3; 4; 5; 6}. For the Event Space we can use the power set of the Sample Space, and the Probability Function has to assign every elementary event an equal value:
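Spelled out explicitly (a standard formalisation of the fair-die case, using the usual Ω, F, P notation; the notation here is mine, filling in the formula the example points at):

$$\Omega = \{1, 2, 3, 4, 5, 6\}, \qquad \mathcal{F} = 2^{\Omega}, \qquad P(A) = \frac{|A|}{6} \ \text{for every } A \in \mathcal{F},$$

so that each elementary event gets $P(\{\omega\}) = 1/6$.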
The point of such examples is to give students an intuitive understanding of how to apply the math of set theory towards...
I think picking axioms is not necessary here and in any case inconsequential.
By picking your axioms you logically pinpoint what you are talking about in the first place. Have you read Highly Advanced Epistemology 101 for Beginners? I'm noticing that our inferential distance is larger than it should be otherwise.
"Bachelors are unmarried" is true whether or not I regard it as some kind of axiom or not.
No, you are missing the point. I'm not saying that this phrase has to be an axiom itself. I'm saying that you need to somehow axiomatically define your individual words...
If probability is in the map, then what is the territory? What are we mapping when we apply probability theory?
"Our uncertainty about the world, of course."
Uncertainty, yes. And sure, every map is, in a sense, a map of the world. But can we be more specific? Say, for a fair coin toss, what particular part of the world do we map with probability theory? Surely it's not the whole world at the same time, is it?
"It is. You map the whole world. Multiple possible worlds, in fact. In some of them the coin is Heads in the others it's Tails, and you are uncertain which one is yours."
Wouldn't that mean that I need to believe in some kind of multiverse to reason about probability? That doesn't sound...
That's not how people usually use these terms. Uncertainty about the state of the coin after the toss is describable within the framework of possible worlds, just like uncertainty about a future coin toss, but uncertainty about a digit of pi isn't.
Oops, that's my bad for not double-checking the definitions before I wrote that comment. I think the distinction I was getting at was more like known unknowns vs unknown unknowns, which isn't relevant in platonic-ideal probability experiments like the ones we're discussing here, but is useful in real-world situa...
The other day I discussed how high monitoring costs can explain the emergence of “aristocratic” systems of governance:
Aristocracy and Hostage Capital
Arjun Panickssery · Jan 8
There's a conventional narrative by which the pre-20th century aristocracy was the "old corruption" where civil and military positions were distributed inefficiently due to nepotism until the system was replaced by a professional civil service after more enlightened thinkers prevailed ...
An element of Douglas Allen’s argument that I didn’t expand on was the British Navy. He has a separate paper called “The British Navy Rules” that goes into more detail on why he thinks institutional incentives made them successful from 1670 to 1827 (i.e. for most of the age of fighting sail).
In the Seven Years’ War (1756–1763) the British had a 7-to-1 casualty...
Solar has an average capacity factor in the US of about 25%. Naively, you might think that to turn this into a highly-available power source, you just need to have 4x the solar panels, plus enough batteries to store 75% of a day’s worth of power. E.g., for each continuous megawatt you want to supply, you need 4 MW of solar panels, and 18 MWh of batteries. During the day, you supply 1 MW from the panels and use the other 3 MW to charge the batteries. Overnight, you discharge the batteries to supply continuous power.
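To make the naive arithmetic concrete, here's a minimal sketch of that calculation in code, assuming a flat 25% capacity factor and ignoring the seasonal and weather effects discussed next:

```python
# Naive solar + battery sizing for 1 MW of continuous supply,
# assuming a flat 25% capacity factor (no seasonal or weather variation).

capacity_factor = 0.25      # average fraction of nameplate output
target_continuous_mw = 1.0  # continuous load we want to serve

# Panels: nameplate capacity needed so that *average* output equals the load.
panel_mw = target_continuous_mw / capacity_factor          # 4.0 MW

# Batteries: enough to cover the load for the ~75% of the day
# when the panels aren't producing.
hours_without_sun = 24 * (1 - capacity_factor)              # 18 hours
battery_mwh = target_continuous_mw * hours_without_sun      # 18.0 MWh

print(f"Panels:    {panel_mw:.1f} MW nameplate")
print(f"Batteries: {battery_mwh:.1f} MWh storage")
```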
Turns out it’s not quite that simple. First, the capacity factor varies throughout the year, as the days get shorter in winter. So you at least need to build enough that even...
(I really like how gears-y your comment is, many thanks and strong-upvoted.)
What if they released the new best LLM, and almost no one noticed?
Google seems to have pulled that off this week with Gemini 2.5 Pro.
It’s a great model, sir. I have a ton of reactions, and it’s 90%+ positive, with a majority of it extremely positive. They cooked.
But what good is cooking if no one tastes the results?
Instead, everyone got hold of the GPT-4o image generator and went Ghibli crazy.
I love that for us, but we did kind of bury the lede. We also buried everything else. Certainly no one was feeling the AGI.
Also seriously, did you know Claude now has web search? It’s kind of a big deal. This was a remarkably large quality of life improvement.
This is helpful, thanks. Bummer though...
[This is our blog post on the papers, which can be found at https://transformer-circuits.pub/2025/attribution-graphs/biology.html and https://transformer-circuits.pub/2025/attribution-graphs/methods.html.]
Language models like Claude aren't programmed directly by humans—instead, they're trained on large amounts of data. During that training process, they learn their own strategies to solve problems. These strategies are encoded in the billions of computations a model performs for every word it writes. They arrive inscrutable to us, the model's developers. This means that we don't understand how models do most of the things they do.
Knowing how models like Claude think would allow us to have a better understanding of their abilities, as well as help us ensure that they’re doing what we intend them to. For example:
In the poetry case study, we had set out to show that the model didn't plan ahead, and found instead that it did.
I found it shocking that they didn't think the model plans ahead. The poetry ability of LLMs since at least GPT-2 is well beyond what feels possible without anticipating a rhyme by planning at least a handful of tokens in advance.
Apologies for the late announcement.
Come on out to the ACX (Astral Codex Ten) Montreal Meetup! This week, we're discussing Is Brain Size Morally Relevant?, by Brian Tomasik. And hopefully determine the answer to this question once and for all.
Feel free to suggest topics or readings for future meetups on this form here.
Venue: Ye Olde Orchard Pub & Grill, 20 Prince Arthur St W.
Date & Time: Saturday, March 29th, 2025, 1 PM.
RSVP by clicking "Going" at the top of this post.
Send a message on our Montreal Rationalists Discord on channel #meetup-general if you have trouble finding us or any other issues.
Please also join the mailing list and our Discord server if you haven't already. We host biweekly ACX Montreal meetups, so join us if you don't want to miss any of them!
PS: Add May 10th to your calendar for our forever-exciting twice-yearly ACX Everywhere Montreal meetup!
Even an AGI "aligned" to a purpose which doesn't imply humanity's survival, but does require the AGI itself to achieve difficult feats like transforming the entire Solar System into something computing as many digits of pi as possible, would obviously still need to produce the computing systems and gather the energy necessary for those systems' work. As I mentioned in my previous question, all the electrical energy generated in the world can only sustain a limited number of agents who interact with GPT-3 a hundred times a day at 3 Wh per interaction. The OpenAI o3 model apparently requires more than 1 kWh per task.
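To make the energy constraint explicit, here is a rough sketch of the arithmetic. The ~30,000 TWh/year world electricity figure is my own ballpark assumption, and the function is purely illustrative; only the 3 Wh/interaction, 100 interactions/day, and 1 kWh/task numbers come from the discussion above:

```python
# Rough sketch: how many "agents" could world electricity generation sustain
# at a given per-interaction energy cost? All inputs are ballpark assumptions.

WORLD_ELECTRICITY_TWH_PER_YEAR = 30_000  # assumed rough figure for annual generation
world_wh_per_day = WORLD_ELECTRICITY_TWH_PER_YEAR * 1e12 / 365

def sustainable_agents(wh_per_interaction: float, interactions_per_day: float) -> float:
    """Agents the world's electricity could power, ignoring all other uses."""
    wh_per_agent_per_day = wh_per_interaction * interactions_per_day
    return world_wh_per_day / wh_per_agent_per_day

# GPT-3-style usage: 3 Wh per interaction, 100 interactions per day.
print(f"{sustainable_agents(3, 100):.2e} agents at 3 Wh/interaction")

# o3-style usage: ~1 kWh (1000 Wh) per task, same 100 tasks per day.
print(f"{sustainable_agents(1000, 100):.2e} agents at 1 kWh/task")
```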
However, the ARC-AGI task set shows the following trend: as the o3 models trained under the same paradigm increased their success rate on ARC-AGI-1 tasks from 10%...