
You're right. I had a look at the methods sections of three of the four studies cited in the article. It seems the way they collect data is to have people fill in questionnaires, then calculate personality scores from those questionnaires.

For example, in the Mac Giolla and Kajonius study [1] the data came from an online questionnaire.

I'm finding it very hard to criticise the methodology of this sort of study without [edited out] being extremely rude and dismissive to a whole field of research. But, I really don't understand how all this is supposed to make sense. I look at the plot in the middle of the Scientific American article that is marked "Agreeableness" on the x-axis (with a score from 1 to 5). I look at the papers cited. There's plenty of maths there, but what is anything calculating? What is "agreeableness" and why is it measured on a scale of 1 to 5? It seems to be a term that has a specific meaning in psychology and sociology, and that ultimately translates to "a score of n in this standardised questionnaire".

But, if you can just define whatever quantities you like and give them a commonly used word for a name- then what does anything mean anymore? You can just define anything you like as anything you like and measure it any way you like- and claim anything you want at all.

At the end of the day, is all this measuring anything other than trends in filling out questionnaires? Can we really draw any other conclusions about the differences of men and women than that the samples of the studies filled in their questionnaires in different ways? Is even that a safe conclusion? If you take statistics on a random process you can always model it- but you'll be modelling noise. How is this possibility excluded here?

All the statistics are quantifying the answers that respondents gave to questionnaires and how the researchers rated them - without any attempt to blind or randomise anything, to protect from participant or researcher bias, as far as I can tell (I'm searching for the word "blind" in the papers and finding nothing). In the study I link below, participants found the website by internet searches and word-of-mouth. The study itself points out that this is a self-selecting sample, but what have they done to exclude the possibility of adversarial participation (people purposefully filling in a form in a certain way to confuse results)?

There is so much that is extremely precarious about the findings of those studies and yet the Scientific American article jumps directly to d values. Yes, but d-values on what? What is being measured? What do the numbers stand for?

This is just heartbreaking to see that such a contentious issue is treated with such frivolity. If it's not possible to lay to rest such hot button issues with solid scientific work- then don't do it. It will just make matters worse.

__________________

[1] https://onlinelibrary.wiley.com/doi/epdf/10.1002/ijop.12529?




There are whole fields of research devoted to the questions you're raising. As such, it's hard to reply with anything that would do justice to them. This isn't to say your questions aren't important, just that your lack of answers reflects your ignorance more so than that of the researchers. I say this not antagonistically but to suggest that it's important to understand that what you see is not always all there is to say.

It is true that these are self-report questionnaires, but as such they are small samples of behaviors of the people in question. Samples of how they perceive themselves, how they think about life, how they think about others, and what they value.

The Big Five, and the measures used in studies such as this, have been validated (in the sense that the ratings have been associated concurrently and predictively) over decades in many ways: with regard to daily reports of behavior, emotion, and life events, diagnoses, work ratings, performance on tests, ratings by peers and colleagues, ratings by strangers, just about everything you can imagine. These self-reports aren't perfect, but they do provide a fuzzy snapshot of someone at a given moment in time. Yes, it would be better to obtain all sorts of other measures of behavior, but they would be too expensive to obtain on samples large enough to be representative.

A major paradox in understanding human behavioral differences is that the more specific and "real world" you get, the less and less they generalize. That is, you can get a very concrete measure of a real-world behavior, but it ceases to be representative of that person across a large number of contexts and situations. Say you want to measure theft, for example. Do you set up a honeypot? Is that representative of that person? Do you use police reports or records? Is that representative? It turns out self-report on online questionnaires is a very good measure of things like this because people are less self-conscious, and report things that don't go on the official record.

Faking is also controversial in this area. You're right to bring it up as an issue, but to understand the research on it it's important to think about why someone would fake. That is, what's the motivation for large proportions of people to systematically fake in one direction? And if they do go to the trouble of doing that, what's "real" and what's "fake"? That is, let's say people make themselves look more dominant than they really are -- what does it mean if one person does that and another does not? It turns out that the person who wants to make themselves look more dominant often is more dominant, all other things equal, because it means they value that.

Also, strangely enough, it turns out that people who are callous and aggressive don't really care about that, especially on online questionnaires, because they are callous and aggressive.

This has all been very thoroughly researched and it turns out to be much more complicated than it seems at first glance. It doesn't mean things can't be better, but it does mean that over very large samples of persons answering questions on a low-stakes questionnaire (in the sense there aren't real consequences to them answering one way or another), a lot of these things average out. It's not the end of the story, but it's not something to be dismissed either.

In the end, questions of sex differences in behavior are about sex differences in behavior. And that's what this research addresses.


If there are all these ways to verify that the answers to questionnaires are accurate you'd think those ways would have been used instead of questionnaires as proof in this highly controversial and inflammatory subject. Extraordinary claims require extraordinary evidence, don't they?


> Extraordinary claims require extraordinary evidence, don't they?

But these claims aren't extraordinary in this scientific field.

They may of course seem extraordinary to those not familiar with the science.


Thank you for your patient and civil answer.

It's getting past my bed time and your comment deserves a more thorough answer that I'll try to write tomorrow, but for the time being this is what strikes me the most about your reply:

>> Also, strangely enough, it turns out that people who are callous and aggressive don't really care about that, especially on online questionnaires, because they are callous and aggressive.

How do you know that someone who looks callous and aggressive on online questionnaires is actually callous and aggressive? The obvious answer seems to be that you know because you've given them another questionnaire separately. Is that the case?

I'm not trying to catch you out, so I'll spell it out: if that is the case then I don't see how you can ever know that someone is callous and aggressive in any objective sense of the word. Like I say in another comment, that would be "questionnaires all the way down". This is a really strong signal that I get from discussions like this and it makes me very suspicious of assurances that it's all been studied and it's all based on solid evidence.

I mean, I'm sorry, I don't want to sound like a square but "how [people] perceive themselves" is exactly the opposite of what I'd think of as an objective measure of how they really are. For example- I perceive myself as pretty (I like myself, that is) but I am not always perceived as pretty by others. What value is there in asking me how pretty I am?

Edit: I get that some of your comment addresses this. But it still seems to me like the solution is to try to second-guess the participant. That also doesn't sound like it should make for objective observations.


> How do you know that someone who looks callous and aggressive on online questionnaires is actually callous and aggressive?

You don't, but it's also not necessary. It's impossible to objectively assess someone's subjective experience, the best we can do is look at groups of people and attempt to find reliable indicators.

The point is that some people will over-emphasize any given trait, and others will under-emphasize it, so on average it evens out.
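
To make that concrete, here's a toy simulation (entirely made-up numbers, not from any of these studies): give every respondent a random personal reporting bias on top of their "true" trait level, and the group average barely moves, because the over- and under-reporters cancel out.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000
    true_trait = rng.normal(3.0, 0.7, n)       # hypothetical "true" levels on a 1-5 scale
    reporting_bias = rng.normal(0.0, 0.5, n)   # some over-report, some under-report
    reported = np.clip(true_trait + reporting_bias, 1, 5)

    # the two group means are nearly identical
    print(true_trait.mean(), reported.mean())

Individual scores are noisy; the claim is only about the averages, which is what the group comparisons rest on.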

Think of color perception for a similar conundrum. How can you be sure that the red you see is the same as everybody else is seeing?


>> How can you be sure that the red you see is the same as everybody else is seeing?

I can't, but my understanding is that if we all agree to call a certain frequency of visible light "red", the frequency won't change because some people perceive it in a different way than others. Neither will measuring the frequency depend on how people perceive it.

That seems to me to be a more consistent definition of "red" than the definitions of personality traits that are discussed here.


A bit more about your comment as promised.

>> There are whole fields of research devoted to the questions you're raising. As such, it's hard to reply with anything that would do justice to them. This isn't to say your questions aren't important, just that your lack of answers reflects your ignorance more so than that of the researchers. I say this not antagonistically but to suggest that it's important to understand that what you see is not always all there is to say.

Another comment brought up the term "construct validity" and it seems to match my concerns exactly. I am glad there is debate on that.

I study for a PhD in AI and I have similar concerns about research in my field. For instance, in AI, research often claims to have modelled human abilities such as "reasoning", "emotion" or "intuition". I'm personally uncomfortable even with well-established terms like "learning" (as in "machine learning") and "vision" (as in "machine vision")- because we don't really know what it means to "see" or to "learn" in human terms so we shouldn't be hasty to apply that terminology to machines.

This tendency has been criticised from the early days of the field but we seem to have regressed in recent years, with the success of machine learning for object classification in images and speech processing taking the field by storm and leaving no room for careful study anymore, it seems. But that's a conversation for another thread.

In AI, I'm worried that calling what algorithms do "attention" or "learning to learn" etc, gives a false impression to people outside the field about the progress of the field, and, in the end, about what we know and what we don't know. This is certainly not advancing the science.

I think the same about psychology and studies like the ones we're discussing here. If psychologists are happy measuring the correlations of the answers in their questionnaires, and they call the quantities measured in this way with names like "agreeableness" and "sensitivity"- doesn't that just give the entirely wrong impression to people outside the field who have a very different concept of what "agreeableness" etc means?

I say that this is "not advancing the science". You could argue that the science is doing fine, thank you, even if lay people don't get it. But, if the way the science is carried out creates confusion and influences real behaviour and decisions, as studies like the ones discussed above have the potential to do- is that really a beneficial outcome of research?

To put it plainly: as a researcher I don't aspire to create confusion, but to bring clarity in subjects that are hard to understand. Isn't that the whole point?

>> In the end, questions of sex differences in behavior are about sex differences in behavior. And that's what this research addresses.

I understand this. But, my concern here is that asking people "what do you think about sex differences in behaviour" is likely to return results tainted by ungodly amounts of cultural bias that would be impossible to disentangle from any other results. How is this addressed in such studies? How do you account for people answering questions about sex differences in behaviour based on what they are used to thinking about sex differences in behaviour, rather than what they actually observe?

P.S. Hey, your answer does do justice to my questions. Thanks for your patience, again.


Haven't seen this in the comments yet, so I'll offer some extra information on the Big Five approach.

Essentially, it is a linguistic approach to personality. The original 5 categories were found by asking people if they would describe themselves with a certain adjective. They did this for hundreds of adjectives. After doing a Factor Analysis, surprisingly, they found 5 independent groups of adjectives. In each group, the adjectives correlate with each other. So, for instance, someone you would describe as assertive you would also be likely to describe as proactive. Both of these traits happen to correlate with describing someone as extroverted. The particular group is then given the Big Five name. So technically speaking, extroversion is a whole class of attributes you would be likely to describe someone as.
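
Here's a minimal sketch of that kind of analysis, with simulated ratings rather than real ones (two built-in traits, six made-up adjectives), just to show how correlated adjectives fall into groups. The real work used hundreds of adjectives and much larger samples.

    import numpy as np
    from sklearn.decomposition import FactorAnalysis

    rng = np.random.default_rng(0)
    n = 2000
    sociable = rng.normal(size=n)   # hypothetical latent trait 1
    kind = rng.normal(size=n)       # hypothetical latent trait 2

    # six made-up adjective ratings, each driven by one latent trait plus noise
    ratings = np.column_stack([
        sociable + rng.normal(scale=0.6, size=n),   # "talkative"
        sociable + rng.normal(scale=0.6, size=n),   # "assertive"
        sociable + rng.normal(scale=0.6, size=n),   # "outgoing"
        kind + rng.normal(scale=0.6, size=n),       # "considerate"
        kind + rng.normal(scale=0.6, size=n),       # "cooperative"
        kind + rng.normal(scale=0.6, size=n),       # "trusting"
    ])

    fa = FactorAnalysis(n_components=2).fit(ratings)
    # each recovered factor loads strongly on one group of adjectives and weakly on the other
    print(np.round(fa.components_, 2))

With real data the analysis runs the other way around: you don't know how many factors there are in advance, you look at how many are needed to account for the correlations, and five keep coming out.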

To me, the amazing thing about the Big Five is that we can reliably extract information about types of people using ordinary language.


Thank you, that's a great explanation.

To be honest though I don't find it surprising that factor analysis would find high correlations between some traits. Actually, this is really concerning if that's the basis of the whole thing. You can find correlations anytime you look for them. How were these correlations validated? I mean- how do we know that the big-5 don't just model noise in the analysed datasets?


I don't know the details of the approach (though someday I would love to study them) but I gather that the correlations are very strong and that it has been replicated many times (including in different languages). Though what is key is that the correlations are weak between groups of traits. You can find large question banks they've used too. I think that even using disjoint sets of attributes/questions you will still get the same groupings.

EDIT: To add just a bit more, of all the replication scandals in psychology today, the Big Five is one of the few frameworks that has withstood the storm.


For the purpose of the “big 5” personality studies, things like “agreeableness” or “openness to experience” are more or less marketing terms that allow professionals to discuss a real and complex topic in a shared language. The “big 5 personality theory” is well established, based on peer-reviewed and repeatable experiments. The foundation of these studies is real, not fuddy-duddy, irreproducible nonsense (the big 5 comes from the newer corner of psych that doesn’t have a reproducibility crisis and has solid scientific grounding and rationale for its experimentation and data analysis). I think you should do a bit more research into the context of the paper instead of doing a drive-by analysis based on what I would consider to be an ill-informed reading of the results.


If you have some time, most of your questions are answered in this set of lectures about the subject from someone in the field: https://www.youtube.com/watch?v=pCceO_D4AlY&list=PL22J3VaeAB.... If you don't have a lot of time, the 10-20 minutes or so from where I linked are also useful and answer a lot of the points you brought up.


One way of thinking about this study is in terms of semantic relations. Ask yourself the following questions, they might help with understanding the value of a study like this:

Will different people answer these personality questions differently?

Do some people have similar personality traits?

Can specific viewpoints be predicted, to some rational degree of error, based on how they answer these questions?

Assuming the people answering do not get any feedback from answering in a specific way (i.e. the questionnaire is a black box), and correlations can be found in the data, and predicted viewpoints can be tested against the answers, then the scientific method here is sound. From an engineering point of view, it makes sense that the results of such a questionnaire can be useful in understanding the way that people think. Of course some people will be fuzzy or erratic and not conform to a regression, but we see that all the time in every practical scientific branch (excluding most maths, in my experience).
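
As a sketch of what that "black box" check could look like in code (made-up data and a made-up criterion, purely illustrative): fit on one half of the respondents, and test whether the answers predict something outside the questionnaire on the other half.

    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(1)
    n = 5000
    answers = rng.integers(1, 6, size=(n, 10)).astype(float)   # hypothetical 1-5 item responses
    # pretend some outside criterion depends weakly on a couple of items, plus noise
    criterion = 0.4 * answers[:, 0] + 0.3 * answers[:, 3] + rng.normal(scale=1.0, size=n)

    X_train, X_test, y_train, y_test = train_test_split(answers, criterion, random_state=0)
    model = LinearRegression().fit(X_train, y_train)
    print(model.score(X_test, y_test))   # out-of-sample R^2, i.e. prediction better than chance

If held-out prediction works, the questionnaire is carrying information beyond "people fill in forms in different ways"; if it doesn't, it isn't.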


If you don't like using words like "agreeableness", "conscientiousness" etc then just replace them with "Trait A", "Trait B" etc.

In either case you will detect consistent differences between the sexes for the different traits, and the predictive power of the results remains unchanged.


You can understand what agreeableness is by looking at how the categories in the Big Five are derived in the first place.


> What is "agreeableness"

Quote from the article https://www.frontiersin.org/articles/10.3389/fpsyg.2011.0017...: "Agreeableness comprises traits relating to altruism, such as empathy and kindness. Agreeableness involves the tendency toward cooperation, maintenance of social harmony, and consideration of the concerns of others (as opposed to exploitation or victimization of others). Women consistently score higher than men on Agreeableness and related measures, such as tender-mindedness (Feingold, 1994; Costa et al., 2001)."

> why is it measured on a scale of 1 to 5

Quote from the article https://www.frontiersin.org/articles/10.3389/fpsyg.2011.0017...: "Participants rate their agreement with how well each statement describes them using a five-point scale ranging from strongly disagree to strongly agree."

> But, if you can just define whatever quantities you like and give them a commonly used word for a name- then what does anything mean anymore? You can just define anything you like as anything you like and measure it any way you like- and claim anything you want at all.

> At the end of the day, is all this measuring anything other than trends in filling out questionnaires? Can we really draw any other conclusions about the differences of men and women than that the samples of the studies filled in their questionnaires in different ways? Is even that a safe conclusion? If you take statistics on a random process you can always model it- but you'll be modelling noise. How is this possibility excluded here?

Not that simple. There is a whole field called Psychometrics. https://en.wikipedia.org/wiki/Psychometrics

If you really care to answer those questions you need to take a complete psychometric theory course such as https://personality-project.org/revelle/syllabi/405.syllabus...

Until you've learned the whole thing, don't assume the field is as superficial as you imagine.

> All the statistics are quantifying the answers that respondents gave to questionnaires and how the researchers rated them - without any attempt to blind or randomise anything, to protect from participant or researcher bias, as far as I can tell (I'm searching for the word "blind" in the papers and finding nothing).

Your searching for the word "blind" means you don't know anything about psychology research. We don't use this word in our research. In psychology studies, we care about "reliability" and "validity" and we have extensive methods to test those.
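
For what it's worth, here's roughly what one of those reliability checks looks like: internal consistency (Cronbach's alpha) asks whether the items of a scale hang together. A bare-bones version on made-up data, assuming items are already aligned so that higher always means more of the trait:

    import numpy as np

    def cronbach_alpha(items):
        # items: respondents x items matrix of scored answers
        items = np.asarray(items, dtype=float)
        k = items.shape[1]
        item_variances = items.var(axis=0, ddof=1).sum()
        total_variance = items.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - item_variances / total_variance)

    rng = np.random.default_rng(0)
    trait = rng.normal(3, 1, size=(1000, 1))
    items = np.clip(trait + rng.normal(scale=0.8, size=(1000, 6)), 1, 5)
    print(round(cronbach_alpha(items), 2))   # high alpha here, since all items track one trait

Validity is the separate question of whether the scale predicts things outside itself (peer ratings, behaviour, outcomes), and that is tested against external criteria, not more of the same questionnaire.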

> In the study I link below, participants found the website by internet searches and word-of-mouth. The study itself points out that this is a self-selecting sample, but what have they done to exclude the possibility of adversarial participation (people purposefully filling in a form in a certain way to confuse results)?

Maybe there are a few people who try to do that. But with a sample size of 130,602, those responses wouldn't impact the research findings at all, unless there were an organized effort trying to influence the research.
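
Back-of-the-envelope version of that point, with hypothetical numbers: suppose 1,000 respondents all answer at the extreme to skew things. On a sample of 130,602 the group mean on a 1-5 scale barely moves.

    n = 130_602
    honest_mean = 3.2          # hypothetical group mean on a 1-5 scale
    adversarial = 1_000        # suppose 1,000 people all answer 5 to skew the result
    shifted = (honest_mean * (n - adversarial) + 5.0 * adversarial) / n
    print(shifted - honest_mean)   # roughly 0.014 points on the 1-5 scale

A coordinated campaign of tens of thousands of respondents would be a different matter, but a handful of trolls is just noise at this scale.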

> There is so much that is extremely precarious about the findings of those studies and yet the Scientific American article jumps directly to d values. Yes, but d-values on what? What is being measured? What do the numbers stand for?

The Scientific American article does fail to clear this up. But the first linked study using "D" clearly stated it is "Cattell's 16PF (fifth edition)". https://onlinelibrary.wiley.com/doi/pdf/10.1111/jopy.12500

> This is just heartbreaking to see that such a contentious issue is treated with such frivolity. If it's not possible to lay to rest such hot button issues with solid scientific work- then don't do it. It will just make matters worse.

Maybe the Scientific American article is not flawless, but I think you need to calm down a little bit.


> Maybe the Scientific American article is not flawless, but I think you need to calm down a little bit.

Please remember that asking someone to calm down can be heard as insulting. It's attacking the speaker, not his position.


>>> such a contentious issue is treated with such frivolity.

GGP needlessly impugned the researchers, in insinuating that they approached important, grave, consequential questions with frivolity.


> I'm finding it very hard to criticise the methodology of this sort of study without channeling Feynman and being extremely rude and dismissive to a whole field of research.

Can you elaborate on the Feynman thing? I haven't heard anything about that before.


Probably refers to the "Uncle Sam Doesn't Need You!" chapter in "Surely You're Joking, Mr. Feynman", in which Feynman is given a psychiatric evaluation as part of the medical exam for the draft. Anecdotally, he came away with an even lower opinion of psychiatry (rightly or wrongly) than he had when he went in.


I seem to recall Feynman having a similar attitude towards philosophy.

It's really sad and disappointing to see that sort of close-minded attitude come from such a talented person towards entire fields he knows nearly nothing about.


A point to make is mid 20th century psychiatry and philosophy were hot garbage.


That's a matter of opinion, one which I personally disagree with, though I am not a fan of some of the philosophy and psychology of the 20th Century.

Furthermore, philosophy extends back thousands of years and spans across many societies and cultures. To dismiss all of it in a handwavy way from a position of ignorance, as Feynman did, speaks of nothing but narrow-minded bigotry.


I'm not dismissing all of philosophy. I'm dismissing mid 20th century philosophy.


Even then, mid 20th century philosophy is a pretty broad category. It contains stuff like Derrida but also contains stuff like Quine.


Sorry, I should never have brought Feynman up because it could really cause offense. I edited the bit about him out of my comment.


Do you just mean that mentioning his viewpoint was overly harsh in this discussion? I've only heard him mentioned in a positive light until now.


I mean that what I had in mind was rather inflammatory and it was my mistake to refer to it even obliquely.


> At the end of the day, is all this measuring anything other than trends in filling out questionnaires?

I get the sense that this makes up the bulk of social science research. This would be a criticism of the field as a whole.


I'm sorry that this comes across this way.

I think the social sciences are useful and in fact indispensable. I disagree with their methodology and with the practice of copying methods from other fields that are really not suitable to the subject matter of the social sciences.

For instance, if you define a set of answers to a questionnaire as "agreeableness" and assign it a score, you can do maths with it, just like physics can define the measurement on a thermometer as "temperature" and do maths with that. But there are no thermometers in the social sciences and the maths seem to only be measuring the researchers' intuitions (and of course, their cultural biases). [Edit: don't ask me what a thermometer is actually measuring- but I know that if a thermometer shows the water in the kettle is 100°C then the water is boiling. If my agreeableness is 1, what does that do? Does it have a consistent effect? Can I measure the effect? With what? Another questionnaire? So it's questionnaires all the way down? Well, I know for sure that whatever thermometers are measuring- it's not thermometers all the way down.]

It would be a lot more informative to hear what the researchers think, their intuitions and conclusions from their careful observations of human behaviour _without_ any attempt to quantify the unquantifiable. We would learn a lot more about the human mind by listening to the _opinions_ of people who have spent their lives studying it, if it wasn't for all the maths that (to me anyway) are measuring made-up quantities. If nothing else, there would be more space left in their papers to explain their intuition.


> If my agreeableness is 1, what does that do?

Obviously, there is an entire literature that measures what it does in terms of life outcomes other than questionnaires.

If you'd like to learn something very elementary about a field you're completely unfamiliar with, you might be better off picking up a textbook, rather than borderline trolling of the "this entire field is nonsense, prove me wrong" variety.


I don't think that was borderline. More like antisocial?


> If my agreeableness is 1, what does that do?

It means you are more likely to answer other questions in a certain way. Other studies might even show that you might be likely to behave in a certain way.

> Does it have a consistent effect?

Yes, but like thermometers only work on groups of molecules, the questionnaires are consistent on groups of humans.

> Can I measure the effect? With what? Another questionnaire? So it's questionnaires all the way down?

No, you could easily do a follow up study by finding groups of people that answered the questionnaires in a certain way, and then have them participate in behavioral experiments.

> Well, I know for sure that whatever thermometers are measuring- it's not thermometers all the way down.

The manner in which molecules bump into each other randomly. The harder they do it, the more space they take up. The more people in your group with an agreeableness score of 1, the more space they might take up ;)

Opinions are not science, maths is. We don't improve the social sciences with more vague intuitions. If you want to read intuitions then maybe read a glossy instead. Science is about making quantifiable statements, and maths is the way you turn samples into those. We certainly don't need any more room for so-called experts to tell us how we should behave in their papers. We tried that, and it was awful.


>> Opinions are not science, maths is. We don't improve the social sciences with more vague intuitions. If you want to read intuitions then maybe read a glossy instead. Science is about making quantifiable statements, and maths is the way you turn samples into those. We certainly don't need any more room for so-called experts to tell us how we should behave in their papers. We tried that, and it was awful.

That's a very well structured passage, thanks for the comment.

However, what I see is that psychology is trying very hard to make quantifiable statements about things that it can't make quantifiable statements about. Yes, maths can be used to turn observations into quantifiable statements. But just because someone is using maths, it doesn't mean they're turning observations into quantifiable statements. You can use maths to quantify non-existent quantities that you have never observed and the maths themselves won't stop you. I will mention Daryl Bem and his measurements of ESP now, but I don't mean that psychology is like parapsychology, only that you can misuse maths if you're not very, very careful. And just because you have maths it doesn't mean you're being careful.

And I think intuitions and personal expertise with a subject are the basis of scientific knowledge. The maths are there as a common language to communicate the intuitions gained in a manner that makes them accessible to others who do not have the same expertise. Maths is the language of science because it's used to communicate scientific knowledge, not because it's a set of magickal formulae that transform everything into solid science.


> you can misuse maths if you're not very, very careful

That's what nobody ever addresses in these studies.

Even assuming all this linguistic questionnaire stuff passes for a measure of something (certainly not biology, it really falls apart on indigenous populations), the further mathematics gives the joke away. Factor analysis is done just wrong. Questionnaire items are mostly positively correlated, and no thought is spared for how the Perron-Frobenius theorem produces spurious factors, which are also dimensionally invalid to boot (which, one imagines, is not unwelcome, as scaling the data may give a stronger result). Then the methodology manages to fail confirmatory factor analysis on its own terms anyway. https://sci-hub.tw/10.1007/s11336-006-1447-6 Clustering validation is not even attempted beyond trying different numbers of clusters (anywhere from 4 to 13 results in fits only marginally worse than 5).
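
The Perron-Frobenius point is easy to see in a toy case: give the items a uniform positive correlation with no cluster structure whatsoever, and the first principal component still soaks up a disproportionate share of the variance, which a credulous analyst could read as a "general factor". (Simulated data, obviously; the point is only that positively correlated indicators guarantee a dominant first eigenvalue, not that any particular published factor is spurious.)

    import numpy as np

    rng = np.random.default_rng(0)
    n_items, rho = 20, 0.3
    # uniform positive correlation between all items, no groups at all
    cov = np.full((n_items, n_items), rho) + (1 - rho) * np.eye(n_items)
    data = rng.multivariate_normal(np.zeros(n_items), cov, size=5000)

    eigvals = np.linalg.eigvalsh(np.cov(data, rowvar=False))[::-1]
    print(np.round(eigvals[:3] / eigvals.sum(), 2))   # first "factor" dominates despite no structure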

Denunciations of the Big Five (and friends) go far and wide, decades back. Then there's a flood of reassertions as if nothing happened, and again some refutations of that new wave. The ascent of data science made things comical. One year they do a metastudy with one million respondents, some dude asks some basic questions, the next year they do it with two million as if this answers anything. It is an endless war of attrition and not worth anyone's time.


Thanks, these are interesting insights.

I cite a large passage from the paper you linked to because it's an excellent example of the kind of "misuse of maths" I meant:

Consider, for instance, the personality literature, where people have discovered that executing a PCA of large numbers of personality subtest scores, and selecting components by the usual selection criteria, often returns five principal components. What is the interpretation of these components? They are “biologically based psychological tendencies,” and as such are endowed with causal forces (McCrae et al., 2000, p. 173). This interpretation cannot be justified solely on the basis of a PCA, if only because PCA is a formative model and not a reflective one (Bollen & Lennox, 1991; Borsboom, Mellenbergh, & Van Heerden, 2003). As such, it conceptualizes constructs as causally determined by the observations, rather than the other way around (Edwards & Bagozzi, 2000). In the case of PCA, the causal relation is moreover rather uninteresting; principal component scores are “caused” by their indicators in much the same way that sumscores are “caused” by item scores. Clearly, there is no conceivable way in which the Big Five could cause subtest scores on personality tests (or anything else, for that matter), unless they were in fact not principal components, but belonged to a more interesting species of theoretical entities; for instance, latent variables. Testing the hypothesis that the personality traits in question are causal determinants of personality test scores thus, at a minimum, requires the specification of a reflective latent variable model (Edwards & Bagozzi, 2000). A good example would be a Confirmatory Factor Analysis (CFA) model.

Now it turns out that, with respect to the Big Five, CFA gives Big Problems. For instance, McCrae, Zonderman, Costa, Bond, & Paunonen (1996) found that a five factor model is not supported by the data, even though the tests involved in the analysis were specifically designed on the basis of the PCA solution. What does one conclude from this? Well, obviously, because the Big Five exist, but CFA cannot find them, CFA is wrong. “In actual analyses of personality data [...] structures that are known to be reliable [from principal components analyses] showed poor fits when evaluated by CFA techniques. We believe this points to serious problems with CFA itself when used to examine personality structure” (McCrae et al., 1996, p. 563).

If I'm not prying too much- what is your relation with the field?


Psychology talks about 'agreeable' in the same way as Physics talks about 'hot'. In our everyday context, 'hot' is quite vague, and people will have widely varying opinions about something being hot, depending on the context and on their personal experience.

To cope with that problem, science needs to detach the term from everyday use, and put an artificial definition in its place which allows to make repeatable statements. In its wake, the term loses a lot of its meaning. That is the price for preciseness.

Imagine there was a unit for agreeableness, so you could say 'John has an agreeableness of 4.4 Ag'. Then it would be clear that this statement refers to a formal definition (based on a standardized questionnaire), instead of our common vague understanding.
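
For concreteness, that "4.4 Ag" would just be a scored questionnaire; something like the following sketch, with hypothetical items (real inventories have more items and validated wording, and reverse-score some of them exactly as below):

    # hypothetical answers to six agreeableness-style items on a 1-5 scale
    answers = {
        "I sympathize with others' feelings": 5,
        "I take time out for others": 4,
        "I insult people": 2,                    # reverse-scored item
        "I make people feel at ease": 5,
        "I am interested in people": 4,
        "I feel little concern for others": 1,   # reverse-scored item
    }
    reverse_scored = {"I insult people", "I feel little concern for others"}

    scored = [6 - v if item in reverse_scored else v for item, v in answers.items()]
    agreeableness = sum(scored) / len(scored)
    print(agreeableness)   # 4.5 "Ag" on the 1-5 scale for this made-up respondent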

Now you could still argue that this new definition is so detached from what we usually mean by agreeableness that it becomes useless. However, you can't simply dismiss the method; you need to bring concrete arguments for why the proposed definition does not capture what it is supposed to capture. For example, you could show that people with agreeableness below 2 Ag are happily married more often than people with agreeableness above 4 Ag - the opposite of what the everyday notion would lead you to expect. Do you have such concrete objections?


Maybe I'm missing something. What is it specifically about this psychology research that somehow makes it incompatible with making quantifiable statements? I'm not saying some random person is doing some random maths which would randomly make it science. I'm saying these particular persons in this particular study are performing scientific research and making quantifiable statements through maths that seem to be trivially and intuitively applicable. How is this different from say particle physics? If anything applying maths to particle physics is more dangerous because it's so easy to repeat the experiment until you've got the result you want.

You have something specifically against psychology research; I haven't seen a single argument from you that could not be applied against any other scientific field. They applied a methodology that you didn't initially understand, and now you're refusing to understand it because you're committed to arguing against it. The weird thing is it's not even a controversial finding, just a confirmation of something everyone knows to be true.


>> How is this different from say particle physics?

It is different in the sense that particle physics quantifies concepts that are not correlated to how someone feels about them, or how someone answers questions in a questionnaire.

And I don't see how I misunderstood the studies we're discussing. They handed people questionnaires asking them how they think or feel about things. I don't see how any concrete evidence about anything can be found in this way, other than how people fill questionnaires.

But the claim is never "X people fill this questionnaire in this way". It's always along the lines of "X people are more agreeable" etc. This is misleading.


Hi Stassa! The term you're looking for is "physics envy": https://en.wikipedia.org/wiki/Physics_envy


Hi Scott. Could you help me a bit? I can't remember where we know each other from. Sorry- bad with names.


We've traded replies here on HN a couple of times. We have a lot of interest areas in common — theorem proving, software synthesis, that kind of thing.


Ah, thanks. Well, keep up the good work :)


Which leads to two linked questions:

- Is it a valid criticism?

- If so, are the social sciences really sciences at all?

And one philosophical question:

- What happens if an entire field of "research" is dissolved as wholly subjective and not repeatable?

This would be much bigger than the debunking of phrenology or astrology as those don't have university departments, journals, or attempt to set social policy.


I'm not sure, but filling out questionnaires doesn't seem all that different than polling, which seems pretty accurate?


They are measuring the differences between male and female responses to questionnaires. I don't get why this is so hard to understand, or, worse, heartbreaking to you. Who cares what agreeable means, it's just some concept that they've got correlated questions for.

It's just proper science, no one was hurt, why get emotional about it? Everybody has the intuition that there's general difference between the psychology in men and women. They developed a methodology that shows and quantifies that phenomenon. That's good and proper science.

That the concept is not well understood doesn't diminish its value. When Volt measured electric charge but didn't get what was happening and even got the direction of charge wrong, that might have been heartbreaking, but it was also world-changing, and he definitely should not have just not done it. Even if he made matters a little bit worse.

Edit: woops, was Franklin, not Volt


If this is proper science, what’s the hypothesis?


That's a pretty outdated concept of science you're trying to club me with there; it's perfectly possible to do exploratory studies, you know.

In any case, how about this hypothesis: there exist no systematic sex differences in replies to standardised personality questionnaires?
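
And that null can be tested the way the cited papers report it, as a standardised effect size. A minimal sketch with simulated scores (not the studies' data):

    import numpy as np

    rng = np.random.default_rng(0)
    women = rng.normal(3.9, 0.6, 5000)   # hypothetical agreeableness scale scores
    men = rng.normal(3.7, 0.6, 5000)

    pooled_sd = np.sqrt((women.var(ddof=1) + men.var(ddof=1)) / 2)   # equal group sizes
    d = (women.mean() - men.mean()) / pooled_sd
    print(round(d, 2))   # Cohen's d: the group difference in standard-deviation units

A d of zero would support the null; the size of the reported d values is what the article is arguing about.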


Outdated? They taught me that shit in elementary school 12 years ago

You can explore all you want, but how is it useful? “Science” as we nominally describe it is predicated on usefulness.

This doesn’t help anyone. The fact that we are arguing means that this study was worse than useless


It is not useless. Beside the point that the whole idea of science is that all of it is, or could be, useful in some way, this research establishes a platform on which you could build more specific research that might be helpful in determining whether something is a personality trait, a sexual trait, or some kind of disorder.

Or if you want to go even more specific, maybe research like this could be used to convince governments that deny transsexuality that it in fact is possible to scientifically show that a person aligns more with a different sex, which might give them access to subsidy or insurance for surgery. Some people might be helped just because the science acknowledges the reality of their feelings.

In any case, your anti intellectual disposition is shameful.



