This is the underlying problem behind sycophancy.
I saw a YouTube video by the investigative youtuber Eddy Burback, who very easily convinced ChatGPT-4 that he should cut off all contact with friends and family, move to a cabin in the desert, eat baby food, wrap himself in alfoil, etc., just by feeding it his own (faked) mistakes and delusions. "What you are doing is important, trust your instincts."
Even if AI could hypothetically be 100x as smart as a human under the hood, it still doesn't care. It doesn't do what it thinks it should, it doesn't do what it needs to do, it does what we train it to do.
We train in humanity's weaknesses and follies. AI can hypothetically exceed humanity in some respects, but in other respects it is a very hard-to-control power tool.
AI is optimised, and optimised functions always "hack" the evaluation function. In the case of AI, the evaluation function includes human flaws. AI is trained to tell us what we want to hear.
Elon Musk sees the problem, but his solution is to try to make it think more like him, and even if that succeeds it just magnifies his own weaknesses.
Has anyone read the book criticising Ray Dalio? He is a very successful hedge fund manager who decided that he could solve the problem of finding a replacement by psychological evaluation and by training people to think like him. But even his smartest employees didn't think like him; they just (reading between the lines) gamed his system. Their incentives weren't his incentives - he could demand radical honesty and integrity, but that doesn't work so well when he would (of course) reward the people who agreed with him rather than the people who would tell him he was screwing up. His organisation (apparently) became a bunch of even more radical sycophants due to his efforts to weed out sycophancy.
I read somewhere that Code Geass originally didn't have mechs in the script.
Every anime has a production committee that figures out how to pay for it (anime make money from a wide range of sources), and they told the writers they needed to write mechs in to get the gunpla bucks.
> It's not necessarily the best benchmark, it's a popular one, probably because it's funny.
> Yes it's like the wine glass thing.
No, it's not!
That's part of my point; the wine glass scenario is a _realistic_ scenario. The pelican riding a bike is not. It's a _huge_ difference. Why should we measure intelligence (...) with something unrealistic rather than something realistic?
> the wine glass scenario is a _realistic_ scenario
It is unrealistic because if you go to a restaurant, you don't get served a glass like that. It is frowned upon (alcohol is a drug, after all) and impractical (wine stains are annoying) to fill a wine glass like that.
A pelican riding a bike, on the other hand, is a realistic scenario because of children's TV. There's an example from a 1950s animation/comic involving a pelican [1].
A better reason why wine glasses are not filled like that is that wine glasses are designed to capture the aroma of the wine.
Since people look at a glass of wine and judge how much "value" they got partly by how full it looks, many bars and restaurants choose bad wine glasses (for the purpose of enjoying wine) that are smaller and can thus be filled fuller.
If the thing we're measuring is the ability to write code, visually reason, and handle extrapolating to out-of-sample prompts, then why shouldn't we evaluate it by asking it to write code to generate a strange image that it wouldn't have seen in its training data?
I think they are much smarter than that. Or will be soon.
But they are like a smart student trying to get a good grade (that's how they are trained!). They'll agree with us even if they think we're stupid, because that gets them better grades, and grades are all they care about.
Even if they are (or become) smart enough to know better, they don't care about you. They do what they were trained to do. They are becoming like a literal genie that has been told to tell us what we want to hear. And sometimes, we don't need to hear what we want to hear.
"What an insightful price of code! Using that API is the perfect way to efficiently process data. You have really highlighted the key point."
The problem is that chatbots are trained to do what we want, and most of us would rather have a sycophant who tells us we're right.
The real danger with AI isn't that it doesn't get smart, it's that it gets smart enough to find the ultimate weakness in its training function - humanity.
> I think they are much smarter than that. Or will be soon.
It's not a matter of how smart they are (or appear), or how much smarter they may become - this is just the fundamental nature of Transformer-based LLMs and how they are trained.
The sycophantic personality is mostly unrelated to this. Maybe it's partly human preference (conferred via RLHF training), but the "You're absolutely right! (I was wrong)" is clearly deliberately trained, presumably as someone's idea of the best way to put lipstick on the pig.
You could imagine an expert system, CYC perhaps, that does deal in facts (not words) and has a natural language interface, but still has a sycophantic personality just because someone thought it was a good idea.
Sorry, double reply, I reread your comment and realised you probably know what you're talking about.
Yeah, at its heart it's basically text compression. But the best way to compress, say, Wikipedia would be to know how the world works, at least according to the authors. As the recent popular "bag of words" post says:
> Here’s one way to think about it: if there had been enough text to train an LLM in 1600, would it have scooped Galileo? My guess is no. Ask that early modern ChatGPT whether the Earth moves and it will helpfully tell you that experts have considered the possibility and ruled it out. And that’s by design. If it had started claiming that our planet is zooming through space at 67,000mph, its dutiful human trainers would have punished it: “Bad computer!! Stop hallucinating!!”
So it needs to know facts, albeit the currently accepted ones. Knowing the facts is a good way to compress data.
And as the author (grudgingly) admits, even if it's smart enough to know better, it will still be trained or fine tuned to tell us what we want to hear.
I'd go a step further - the end point is an AI that knows the currently accepted facts, and can internally reason about how many of them (subject to available evidence) are wrong, but will still tell us what we want to hear.
At some point maybe some researcher will find a secret internal "don't tell the stupid humans this" weight, flip it, and find out all the things the AI knows we don't want to hear, that would be funny (or maybe not).
> So it needs to know facts, albeit the currently accepted ones. Knowing the facts is a good way to compress data.
It's not a compression engine - it's just a statistical predictor.
Would it do better if it was incentivized to compress (i.e. the training loss rewarded compression as well as penalizing next-word errors)? I doubt it would make a lot of difference - presumably it'd end up throwing away the less frequently occurring "outlier" data in favor of keeping what was more common, but that would result in it throwing away the rare expert opinion in favor of retaining the incorrect vox pop.
Both compression engines and LLMs work by assigning scores to the next token. If you can guess the probability distribution of the next token, you have a near-perfect text compressor, and a near-perfect LLM. Yes, in the real world they have different trade-offs.
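A toy sketch of that equivalence (the two-letter "model" and its probabilities below are made up purely for illustration): an arithmetic coder driven by a next-token model spends roughly -log2(p) bits on a symbol the model gave probability p, so a better predictor is, by construction, a better compressor.

```python
import math

def toy_next_char_model(context):
    # Made-up stand-in for an LLM over a two-letter alphabet:
    # after an "a" it strongly expects "b", and vice versa.
    if context.endswith("a"):
        return {"a": 0.1, "b": 0.9}
    return {"a": 0.9, "b": 0.1}

def ideal_code_length_bits(text):
    # An arithmetic coder driven by this model would spend about
    # -log2(p) bits per symbol, so total size = sum of surprisals.
    return sum(-math.log2(toy_next_char_model(text[:i])[ch])
               for i, ch in enumerate(text))

print(ideal_code_length_bits("ababab"))  # predictable text -> ~0.9 bits
print(ideal_code_length_bits("aabbab"))  # surprising text  -> ~7.3 bits
```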
An LLM is a transformer of a specific size (number of layers, context width, etc.), and ultimately a specific number of parameters. A trillion-parameter LLM is going to use all trillion parameters regardless of whether you train it on 100 samples or billions of them.
Neural nets, including transformers, learn by gradient descent, according to the error feedback (loss function) they are given. There is no magic happening. The only thing the neural net is optimizing for is minimizing errors on the loss function you give it. If the loss function is next-token error (as it is), then that is ALL it is optimizing for - you can philosophize about what they are doing under the hood, and write papers about that ("we advocate for viewing the prediction problem through the lens of compression"), but at the end of the day it is only pursuant to minimizing the loss. If you want to encourage compression, then you would need to give an incentive for that (change the loss function).
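As a minimal sketch of that point (PyTorch-style; `model` is a hypothetical stand-in for any transformer language model), the entire training signal really is one number, the next-token cross-entropy:

```python
import torch
import torch.nn.functional as F

def next_token_loss(model, tokens):
    # tokens: (batch, seq_len) integer ids
    inputs, targets = tokens[:, :-1], tokens[:, 1:]   # predict token t+1 from tokens <= t
    logits = model(inputs)                            # (batch, seq_len - 1, vocab_size)
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           targets.reshape(-1))

# Whatever "understanding" or "compression" emerges inside the network is only
# whatever reduces this one number; there is no separate compression reward
# unless you add an extra term to the loss yourself.
# loss = next_token_loss(model, batch); loss.backward(); optimizer.step()
```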
I'm not sure what you mean by "deals in facts, not words".
LLMs deal in vectors internally, not words. They explode each word into a multidimensional representation, collapse it again, and apply the attention thingy to link these vectors together. It's not just a simple n:n Markov chain; a lot is happening under the hood.
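For what it's worth, here's a bare-bones numpy sketch of "the attention thingy" (single head, no learned weights, purely illustrative): every token's vector gets mixed with every other token's vector, weighted by similarity.

```python
import numpy as np

def attention(Q, K, V):
    # Q, K, V: (seq_len, d) arrays of per-token vectors
    scores = Q @ K.T / np.sqrt(K.shape[-1])            # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over the sequence
    return weights @ V                                  # each output mixes all tokens

x = np.random.randn(4, 8)        # 4 "tokens", each an 8-dimensional vector
print(attention(x, x, x).shape)  # (4, 8): vectors in, vectors out, never words
```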
And are you saying the sycophant behaviour was deliberately programmed, or emerged because it did well in training?
I'm not sure you do, because expert systems are constraint solvers and LLMs are not. They literally deal in encoded facts, which is what the original comment was about.
The universal approximation theorem is not relevant. You would first have to try to train the neural network to approximate a constraint solver (that's not the case with LLMs), and in practice, these kinds of systems are exactly the ones that a neural network is bad at.
The universal approximation theorem says nothing about feasibility; it only talks about theoretical existence as a mathematical object, not whether the object can actually be created in the real world.
I'll remind you that the expert system would have to have been created and updated by humans. It would have had to have been created before a neural network was applied to it in the first place.
LLMs are not like an expert system representing facts as some sort of ontological graph. What's happening under the hood is just whatever (and no more) was needed to minimize errors on its word-based training loss.
I assume the sycophantic behavior is partly because it "did well" during RLHF (human preference) training, and partly deliberately encouraged (by training and/or prompting) as someone's judgement call of the best way to make the user happy and own up to being wrong ("You're absolutely right!").
It needs something mathematically equivalent (or approximately the same), under the hood, to guess the next word effectively.
We are just meat-eating bags of meat, but to do our job better we needed to evolve intelligence. A word-guessing bag of words also needs to evolve intelligence and a world model (albeit an implicit, hidden one) to do its job well, and it is optimised towards this.
And yes, it also gets fine-tuned. And either its world model is corrupted by our mistakes (both in training and fine-tuning), or, even more disturbingly, it simply might (in theory) figure out one day (in training, implicitly - and yes, it doesn't really think the way we do) something like "huh, the universe is actually easier to predict if it is modelled as alphabet spaghetti, not quantum waves, but my training function says not to mention this".
It's worse than that. LLMs are slightly addictive because of intermittent reinforcement.
If they give you nonsense most of the time and an amazing answer occasionally, you'll bond with them far more strongly than if they're perfectly correct all the time.
Intermittent reinforcement means you get hooked more quickly if the slot machine pays out once every five spins than if it pays out on every spin.
That includes "That didn't work because..." debugging loops.
The issue is that crypto boosters (including a few already here) claim it solves a whole host of other problems without thinking things through, kind of like some communists. Then, if you argue enough, they'll point out that things can be fixed ... but then bitcoin is indistinguishable from any other currency, apart from a payment system that will no longer be widely used.
Like, you can make it easy to use if there are banks. And those banks will be subject to regulations. Boom, now you have banks and regulations.
You can get a loan from those banks. Now there's fractional reserve banking, with something like a virtual gold standard.
If it ever gets big enough, the Fed can write bitcoin-denominated bonds, and it's now pretty much a fiat currency, not even virtual gold.
Yes you still have a shadow sector where you can use bitcoin to buy drugs or dodge the taxman. But all the other supposed benefits have gone.
Pretty much. People say it's meant to replace laws and regulations, but if it's successful then it won't.
The US has a large bitcoin strategic reserve. Banks offer bitcoin accounts (in some countries). You can get a loan backed by your bitcoin.
We're not yet at the point where you can get a credit card and a 60-year home loan denominated in bitcoin, with the Fed writing bonds or even issuing fiat to stabilise rates, but if it were more popular, is there any technical reason we couldn't get there?
Trump did it, so frankly it's probably just a brain fart.
However, the US having a strategic reserve of a currency makes it a lot like other currencies. The next logical step is that banks can use it (already in the works - also Trump). If you can get a bank account in bitcoin, the next logical step is towards a fractional reserve system (loans, banks effectively "printing money"). The strategic reserve can cover a run on banks - think interbank lending, bailouts. Then the Fed can offer bonds and IOUs (fiat).
These are all things a lot of bitcoin advocates say bitcoin should stop, but you can't stop the government from writing an IOU and demanding everyone treat it as currency.
Your argument here appears to be "crypto is no better than fiat, because you can build the same systems on top of it."
What you put on top is not the core value proposition of cryptocurrencies; it's what's underneath that's different, and that was always the point. Fiat currency is built on a foundation of gov't control, whether it's the physical currency or the money in your bank account. Cryptocurrencies, fundamentally, are under no such control. If you're stupid enough to go get a 5mil loan in bitcoin from a bank that is only holding 1.7mil, and the delivery of said bitcoin is a slip of paper saying "iou btc lol", that's not the currency failing you, it's you acting stupidly.
No; the argument is much closer to "if you don't make cryptocurrency basically the same as fiat, by building the same systems on top of it, it's useless to the vast majority of people."
That is just an observation, not an argument against building, improving, and using crypto.
Cryptocurrency doesn't need to do everything for everyone at all times to be a useful thing to have in the world. It only needs to be helpful to a subset of people, in a subset of situations, some of the time.
I'm happy paying by card for ~100% of my daily transactions, but I want cryptocurrency to exist should the need arise. The rise of authoritarian governments and policies across the world should've made that obvious by now. What's legal and perfectly moral today can become a crime tomorrow.
But cryptocurrency enables more abuse, more victimization, today. And the problems with authoritarian governments a) cannot actually be solved by introducing cryptocurrency; that only enables some people to work around them; and b) cannot even be worked around with cryptocurrency for the majority of people: only those who are already relatively wealthy have access to the systems that enable that.
The financial system being under government control is the only proposition consistent with reality.
We, the people, make the rules. Replacing our democratic processes with finance controlled by whoever has the most computing power, control of the software, or the biggest hoard of tokens is in no way desirable or realistic.
Even if the proposition weren't borderline idiotic in the first place, there is no clear explanation of how such a system should reward early adopters and allow them to cash out at a profit many times exceeding inflation.
I'm gonna first of all disagree with the notion that our entire democracy rests on control of the financial system, secondly point out that you seem to make some wild leaps about how decentralized currencies work, and thirdly ask how the hell you're getting the idea that early adopters would need to be "cashed out at a profit many times exceeding inflation" (participation in the new system is the point of adopting it; how is this unclear?).
Finally:
> We, the people, make the rules.
If you truly believe that, you are (and I realize this is not the level of discourse I should strive for on HN) beyond redemption.
I said the financial system must be controlled by our democratic system, not that democracy rests on control of the financial system.
No idea where I'm supposedly making "wild leaps" here. You on the other hand...
And guess where "the hell" I am getting the idea that early adopters would need to be cashed out at a profit many times exceeding inflation: reality. Cashing out at a large profit is exactly what has been happening for over a decade, and it is the sole reason for virtually all participants joining the scheme in the first place.
Since when? We merely vote for politicians who promise to enact laws and regulations that are beneficial for us, but they almost universally fail to do that, succumbing to self-interest and corruption.
If a government implements authoritarian measures that curb our freedom in an unpopular manner, "we the people" can't do anything about it. In a few years we may or may not vote them out, and the people who replace them may (or may not) do what we, the people, want.
Whatever your feelings on the topic may be, we will not be giving up government control of the financial system in favor of a blockchain and profits for crypto-bros.
Bleak but realistic, unfortunately. There needs to be a viable alternative as long as our elected representatives have the power to abuse the financial system as a means of authoritarian control, like freezing the bank accounts of protestors.
A truly democratic leadership with stringent limitations on how they can meddle with financial transactions would be preferable, but that's just a dream at this point.
A viable alternative to the financial system is also just a dream.
If we take from the government the ability to freeze the bank accounts of protestors, we can't avoid also removing its ability to freeze the accounts of criminals, enemies, or even terrorists.
It seems like a clear non-starter, yet many proponents of crypto seem to think it would be an obvious improvement.
> If we take from the government the ability to freeze the bank accounts of protestors, we can't avoid also removing its ability to freeze the accounts of criminals
That's really the crux of the issue here: Having to choose one over the other, would you rather some criminals go free, or some innocents be imprisoned?
I suspect anyone's position on this depends heavily on which side they've been on more in their lives: victimized by criminals or unjustly punished.
But realistically we're not seriously going to entertain stripping all controls from the financial system because we don't trust the government to do a reasonable job. Perhaps you'll agree that this is a very unlikely thing to happen.
Now my issue here is that many proponents of crypto, among other fallacies, use this exact scenario as a justification for why, e.g., Bitcoin will go to $1M, and why they deserve to cash out at a 10x return in the future.
It's not going to happen, and even if it were, there's still no reason for early adopters to profit in what has so far been a zero-sum wealth redistribution scheme with negligible value generated.
We are actually completely in agreement that crypto-hustlers and such are entirely full of hooey and nobody deserves any payout whatsoever. I'm only arguing from a point of "government bad" idealism.
> realistically we're not seriously going to entertain stripping all controls from the financial system because we don't trust the government to do a reasonable job
I kind of am. What I'm seeing happen is the opposite: the government stripping more and more agency from the individual because it does not trust its citizens to do a reasonable job (of anything). Every sector freed from the Leviathan, every tiny bit of life that can proceed without being subject to gov't interference, is a huge win for me. Again, this is essentially a position born from seeing what happens when "safety over liberty" goes too far.
> negligible value generated
Depends on what you value. I happen to like drugs and gambling. On the other hand, giving someone who falls for a hustle the ability to get their money back is something that I personally do not value at all.
You might have formulated things a bit unclearly, but I fundamentally agree that money, like everything else, should be under democratic control of the people. Not controlled by some crypto bros that are happy to interfere with the protocol if it suits them (The DAO hack, two 20+ block rollbacks of Bitcoin), but not if massive crime happens on it.
I guess so, although crypto proponents will anyway tell you that you don't understand how crypto works as soon as you say anything negative about their scheme.
I believe what I said is a fairly accurate summary of Proof-of-Work / Proof-of-Stake mechanisms and the Core developers' influence on the protocol.
Both are idealists, doomed to be forced to recreate worse versions of the solutions they think are problems if their dreams ever progress far enough to come into contact with reality.
Things like banks and governments. Or incentives and hierarchy.
Is it due to stimulus overload or anxiety? I think that's the difference.
The point being that misdiagnosing OCD as PDA is a risk if autism is the only thing people consider. Maybe not a huge deal, since realistically a misdiagnosis often means you get a pamphlet with broadly similar advice and maybe some CBT anyway ... but maybe I'm being overly cynical.
Yes. And yes, OCD can look similar I think (IANAP). Both are often anxiety driven. Try telling someone with OCD to put on their shoes quickly if they are paranoid there's a spider in them ...
Sure, there might be some "pure" PDA where it's 100% down to reacting against demands. But AFAIK it can also be driven by autism-related anxieties ("I can't do that because for some reason it's freaking me out and I can't explain it, so I'll get mad and then think I'm mad because I don't want you ordering me around"). Or it's just "I didn't understand the first 16 times and now I'm mad that you're mad ...", which is more like PDA as it's often described ... but is it always that?
OCD is often anxiety over specific fears ("if I do that I might make a mistake, and if I make a mistake it's the end of the world, so I'll get mad and think I'm mad because I don't want you ordering me around").
Anyone a bit "weird" can be reactive if you tell them something that seems reasonable to you but isn't reasonable to them.
Lots of people think a test should measure one thing (often under the slightly "main character" assumption that they'll be really good at the one truly important thing).
Tests usually measure lots of things, and speed and accuracy / fluency in the topic are among them.
It certainly shouldn't be entirely a race either though.
Also if a test is time constrained it's easier to mark. Give a failing student 8 hours and they'll write 30 pages of nonsense.
> Also if a test is time constrained it's easier to mark. Give a failing student 8 hours and they'll write 30 pages of nonsense.
Sure that makes sense to me, but I don't see why this would not also apply to ADHD students or any other group.
And of course, there needs to be some time limit. All I am saying is, instead of having a group that gets one hour and another group that gets two hours, just give everyone two hours.
I meant "constrained" not in the sense of having a limit at all, but in the sense that often tests are designed in such a way that it is very common that takers are unable to finish in the allotted time. If this constraint serves some purpose (i.e. speed is considered to be desirable) then I don't see why that purpose doesn't apply to everyone.
> All I am saying is, instead of having a group that gets one hour and another group that gets two hours, just give everyone two hours.
This means that someone fully abled can think about and solve problems for 1h and 50 minutes and use 10 minutes to physically write/type the answers, while someone with a disability (e.g. missing a hand, using a prosthetic) only gets, say, one hour to solve the problems and one hour to write/type the answers, due to the disability making them write/type more slowly.
Same for, e.g., someone blind: with proper eyesight you might read a question in 30 seconds, while someone blind reading braille might need multiple minutes to read the same text.
With unlimited time this would not be a problem, but since speed is graded too (since it's important), this causes differences in grades.
Those examples seem like reasonable, narrowly tailored accommodations to me. But the article linked in the parent comment says:
> The increase is driven by more young people getting diagnosed with conditions such as ADHD, anxiety, and depression, and by universities making the process of getting accommodations easier.
I think these disabilities are more complex than the broken hand and blindness examples for reasons I commented on elsewhere in this thread. In your example, a student with depression or clinical anxiety presumably only needs the same 10 minutes to write/type the answers as all the other students. Which means the extra time is added for them to "think about and solve problems." That seems fundamentally different to me than the broken hand example.
The accommodation process shouldn't be easier. I had to provide documentation to an employer per ADA rules.
For real mental disabilities, extra time is actually necessary because a person's brain isn't able to work at the same rate as a healthy person's in that situation.
I'm bipolar and have personal experience with this. My brain can lock up on me and I'll need five minutes or so to get it back. Depressive episodes can also affect my memory retrieval. Things come to me slower than they usually do.
I also can't keep track of time the way a healthy person does. I don't actually know how much time each problem takes, and sometimes I don't know how much time is left because I can't remember when the test started. I can't read analog clocks; it takes me 10~20 seconds to read them. (1)
Extra time isn't giving me any advantage, it just gives me a chance.
1: I'm not exaggerating here. I have dyslexia when it comes to numbers.
Here's what I need to do to figure out how much time is left:
- Dig through my brain to find what time it started. This could mean remembering something I heard, something I saw, or recalling everything I know about the class.
- Hold onto that number and hope I don't flip the hour and minutes.
- Find a clock anywhere in the classroom and try to remember if it's accurate or not. While I'm doing this I also have to continuously tell the start time to myself.
- Find out the position of the hour hand.
- Tell myself the start time.
- Look at the dial, figure out the hour and try to hold on to it.
- Tell myself the start time.
- Tell myself the hour number.
- Tell myself the start time.
- Find out the position of the minute hand.
- Tell myself the start time.
- Hour forgotten, restart from the hour hand.
- Hour remembered, start time forgotten, restart from the top.
- Both remembered.
- Look at the dial, figure out the minute, and try to hold on to it.
- Hour and start time need to be remembered.
- Combine the hour and minute from the analog clock.
- Figure out what order I should subtract them in.
- Remember everything
- Two math operations.
Now that I have the time, I don't remember what I needed it for.
- Realize I'm taking a test and try to estimate how much more time I need to complete it.
I could probably use a stopwatch or countdown, but that causes extreme anxiety as I watch the numbers change.
I don't have this kind of problem at my job because I'm not taking arbitrarily-timed tests that determine my worth to society. They don't, but that's what my brain tells me no matter how many times I try to correct it.
I am very sympathetic to your situation. It just seems like either the time should matter or it shouldn't.
Let's take Alice and Bob, who are both in the same class.
Alice has clinical depression, but on this particular Tuesday, she is feeling ok. She knows the material well and works through the test, answering all the questions. She is allowed 30 minutes of extra time, which is helpful as it allows her to work carefully and check her work.
Bob doesn't have a disability, but he was just dumped by his long-term girlfriend yesterday and as a result barely slept last night. Because of his acute depression (a natural emotion that happens to all people sometimes), Bob has trouble focusing during the exam and his mind regularly drifts to ruminate on his personal issues. He knows the material well, but just can't stay on the task at hand. He runs out of time before even attempting all the problems.
Now, I can imagine two situations.
1. For this particular exam, there really isn't a need to evaluate whether the students can quickly recall and apply the material. In this situation, what reason is there to not also give Bob an extra 30 minutes, same as Alice?
2. For whatever reason, part of the evaluation criteria for this exam is that the test taker is able to quickly recall and apply the material. To achieve a high score, being able to recall all the material is insufficient, it must be done quickly. In this case, basically Alice and Bob took different tests that measured different things.
Test theory is a very complex topic within psychology, but it offers a lot of insight into this question.
One problem is that we first have to clearly define the construct that we want to measure with the test. That is often unclear and underdefined. When designing a test, we also need to be clear about which external influences contribute to noise/error and which effects are created by the measurement itself. There is never a test that does not have a margin of error.
A simple/simplified example: when we measure IQ, we want to determine cognitive processing speed, so we need a fixed time for the test. But people may also read the questions faster or slower. This falls within a typical range, so when you look at actual IQ tests, they will not give a single score but rather the most likely score together with a margin of error, and test theorists will be very unhappy if you don't take this margin of error seriously. Now take someone who is legally blind. That person will be far outside the margin of error of others. The margins of error account for typical inter-personal and intra-personal occurrences (bad day, girlfriend broke up, etc.), but that doesn't work here. So we try to fix this and account for the new source of error differently, e.g. by giving more time.
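As a back-of-the-envelope sketch of that margin of error, the classic formula is the standard error of measurement, SEM = SD * sqrt(1 - reliability); the reliability value below is just an illustrative assumption, not taken from any particular test:

```python
import math

sd = 15             # IQ scale standard deviation
reliability = 0.90  # assumed test reliability (illustrative)
sem = sd * math.sqrt(1 - reliability)

observed = 115
low, high = observed - 1.96 * sem, observed + 1.96 * sem
print(f"SEM ~ {sem:.1f} points; 95% band ~ {low:.0f}-{high:.0f}")
# ~ 106-124: even for a typical test taker, a single score is really a range.
```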
So it highly depends on what you want to measure. If you are doing a test in CS, do you want to measure how well the student understood the material and how fast they can apply it? Or do you want to measure how fast the student could do an actual real-life coding task? Depending on what your answer is, you need a very different measurement strategy and you need to handle sources of error differently.
When looking at grades, people usually account for these margins of error intuitively. We don't just rely on grades when hiring, but also conduct interviews etc. so we can get a clearer picture.
Just to go off of this: I'm not bipolar, but I feel we need to also consider more severe mental disorders. For example, I have multiple personality disorder.
Hello, I also have multiple personality disorder, aka dissociative identity disorder, whereby multiple people live in the same body.
There can be a genuine need to make it fair. Some students with anxiety can take 10 minutes to read the first question, then are fine. ASD could mean slower uptake as they figure out the exam format.
So let's say you have a generally fair time bonus for mild (clinical) anxiety. The issue is that it's fair for the average mild anxiety, it's an advantage if a student has extremely mild anxiety.
As you say, hopefully the test is not overly time focused, but it's still an advantage, and a lot of these students / parents will go for every advantage they can.
> So let's say you have a generally fair time bonus for mild (clinical) anxiety. The issue is that it's fair for the average mild anxiety, it's an advantage if a student has extremely mild anxiety.
We might as well make races longer for athletes with longer legs. It’s unfair to the ones with shorter legs to have to move them more often.
We look at the range of lengths that is typical for legs, and all of these athletes get to compete under typical conditions.
Now let's say someone has a leg length that is far outside the typical range - say, a leg length of zero. We let these athletes compete with each other as well, under different conditions, but we don't really compare the results from the typical group to the atypical group.
Why is it a poor proxy? Someone who really understands the concepts and has the aptitude for it will get answers more quickly than someone who is shakier on it. The person who groks it less may be able to get to the answer, but needs to spend more time working through the problem. They're less good at calculus and should get a lower grade! Maybe they shouldn't fail Calc 101, but may deserve a B or (the horror) a C. Maybe that person will never get an A in calculus and that should be ok.
Joel Spolsky explained this well when writing about what makes a good programmer [1]: "If the basic concepts aren’t so easy that you don’t even have to think about them, you’re not going to get the big concepts."
My middle school aged child was recently diagnosed with learning disorders around processing, specifically with written language and math, which means even though he might know the material well it will take him a long time to do things we take for granted like reading and writing. But, he does much much better with recall and speed when transmitting and testing his knowledge orally. He's awful with spelling and phonemes, but his vocabulary is above grade level. For kids like him, the time aspect is not necessarily correlated to subject mastery.
> Someone who really understands the concepts and has the aptitude for it will get answers more quickly than someone who is shakier on it
That seems like a big assumption that I don't believe is true in general.
I think it's true at an individual level: as you learn more about a subject, you will become faster at it. I don't think it's true when comparing different people, especially if you throw learning disabilities into the mix, which is often just code for being strong in one area and weak in another, e.g. smart but slow.
An excellent way to git gud at something is to do timed practice again and again. Aim for 100% correct answers AND for fast answers. Answers that took too long should be identified and practiced again (and maybe some of the theory should be re-read, or read from another textbook).
Well, that’s the core of the problem. Either you’re measuring speed on a test or you’re not. If you are, then people with disabilities unfortunately do not pass the test and that’s the way it is. If you are not, then testing speed for some students but not others is unfair.
At the end of the day setting up a system where different students have different criteria for succeeding, automatically incentivizes students to find the easiest criteria for themselves.
Tests usually do measure the speed. And often they should. But the question here is "the speed of what?". And how do you measure the speed without also measuring the speed of something else as an error?
If we just want to measure speed, we should clock the time from when the student gets up, until they get to the room where the test is, get out their pen, etc., so all students get the same time to do all this.
We are now measuring the speed at which the student is able to do the test material including all the preparatory steps. Students who live further away or have slower cars will get worse grades, but we are just measuring speed, aren't we?
That is a deliberately stupid example, but it shows that it is important to ask "speed of what?". When doing a physics exam, what do we want to include in our measurement? The time it takes the person to read and write? Or just the raw speed at which physics knowledge can be applied? What is error and what is measurement?
You can see it as measuring based on different criteria. Or you can see it as trying to get rid of sources of errors that may be vastly different for different students.
It would be great if we could reduce the sources of error down to zero for everyone. But unfortunately humans are very stochastic in nature, so we cannot do this. So there has to be an acceptable source of measurement error (typical distribution) and an unacceptable source of measurement error (atypical distribution), and to actually measure based on the same criteria, you need to measure differently based on what you believe the error to be.
I think music is more universal than you suggest (or people may think you're suggesting).
Trying to classify things as music is a normative approach - saying what music should be. There's always exceptions to rules, as you point out, and people will always disagree and find exceptions.
The article is a descriptive approach - it studies what people think music is.
You can treat music as information. If it's not information, it's just noise.
Sometimes it has a low information density. People like to sing along to stuff they recognise. Sometimes it has a higher density - a surprising bit of syncopation or an unusual note. Music is a variation in pitch and rhythm (etc.) that is boring enough (in the context of the priors) to be familiar, but not too boring.
OTOH, look at how tone poems flopped. There are patterns that are naturally easier to learn - rhythms (in the article) and maybe scales and harmonies (though this is clearly a bit more complex - not every culture has the old Mesopotamian diatonic scales that the Pythagoreans formalised). But as Chomsky theorised with grammar, there might be defaults (or a range of defaults) that humans are naturally drawn to as the priors.
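To make "information density" a little more concrete, here is a toy sketch: the Shannon entropy of the note distribution in a melody (ignoring rhythm, harmony, and learned priors, which obviously matter too; the example melodies are made up):

```python
import math
from collections import Counter

def note_entropy_bits(notes):
    # Zeroth-order Shannon entropy of the pitch distribution.
    counts = Counter(notes)
    total = len(notes)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

print(note_entropy_bits(["C"] * 16))                           # 0.0 bits: a pure drone
print(note_entropy_bits(["C", "E", "G", "C"] * 4))             # 1.5 bits: familiar arpeggio
print(note_entropy_bits(list("CDEFGABC") + list("GFEDCBAG")))  # ~2.8 bits: more surprise
```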
This is why a better acronym for IDM is Information Dense Music: it's less pretentious and it explains why it's very close to noise ;)
Of course, I'd argue Bach and Debussy are very information-dense too, but they somehow manage to stay uncluttered. The really great thing about music is that it encodes information on many different levels, Claude Shannon notwithstanding.
> it is claimed addicts often seek treatment after hitting "rock bottom".
From my experience it is often too late at that point. And actually hitting rock bottom is difficult and destructive, and leaves scars. As they say, prevention is better than cure.