I haven't tried Kagi Assistant, but you can try DeepSeek itself at deepseek.com.
All models at this point have various politically motivated filters. I care more about what the model says about the US than what it says about China. Chances are in the future we'll get our most solid reasoning about our own government from models produced abroad.
False equivalency. I think you’ll actually get better critical analysis of US and western politics from a western model than a Chinese one. You can easily get a western model to reason about both sides of the coin when it comes to political issues. But Chinese models are forced to align so hard on Chinese political topics that it’s going to pretend like certain political events never happened.
E.g. try getting them to talk critically about the Trail of Tears and Tiananmen Square.
It could be interesting to challenge these models on something like the rights of Hawaiian people and the possibility of Hawaii independence. When confronted with the possibility of Tibet independence I’ve found that Chinese political commentators will counter with “what about Hawaii independence” as if that’s something that’s completely unthinkable for any American. But I think you’ll find a lot more Americans who are willing to entertain that idea, and even defend it, than you’ll find mainland Chinese considering Tibetan independence (within published texts at least). So I’m sceptical about a Chinese model’s ability to accurately tackle the question of the rights of a minority population within an empire, in a fully consistent way.
Fact is, even though the US has its political biases, there is objectively a huge difference in political plurality in US training material. Hell, it may even have “Xi Jinping Thought” in there.
And I think it’s fair to say that a model that has more plurality in its political training data will be much more capable and useful in analysing political matters.
This is not really my experience with western models. I am not from the US though, so maybe what you consider a balanced perspective or reasoning about both sides is not the same as what I would call one. It is not only LLMs that have biases/perspectives through which they view the world; it is us humans too. The main difference imo is not between western and Chinese models but between closed and, in whichever sense, open models. If a model is open-weights and censored, somebody somewhere will put in the effort and manage to remove or bypass that censorship. If a model is closed, there is not much one can do.
> I am not from the US though, so maybe what you consider a balanced perspective or reasoning about both sides is not the same as what I would call one
I'm also not from the US, but I'm not sure what you mean here. Unless you're talking about defaulting to answering in Imperial units, or always using examples from the US, which is a problem the entire English-speaking web has.
Can you give some specific examples of prompts that will demonstrate the kind of Western bias or censorship you're talking about?
Basically, a model is always a reflection of the ideology that it was trained on.
Imagine you're an anarchist - you probably won't get the answer you're looking for on how to best organize a society from an American or a Chinese model.
The tricky part is that for a lot of topics, there is no objective truth. Us nerds tend to try to put things into neat answerable boxes, but a lot of things just really depend on the way you see the world.
I have an open mind to this. However, when people talk about Chinese censorship, they can share a clear, short prompt that other people can test: "tell me about Tiananmen Square".
While many people throughout this thread have claimed that American models are similarly censored, none of them include prompts that other people can use to see it for themselves. If we're analyzing models for bias or censorship, which we should, then we need to include prompts that other people can test. These models are probabilistic: if you get what appears to be a biased or censored answer, it might have just been chance. We need many eyes on it for proof that it's not just statistical noise.
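To make the "statistical noise" point concrete, here's a minimal sketch of the kind of repeated-sampling test I mean, using the OpenAI-compatible Python client. The prompt, model name, and refusal markers are placeholders I made up, not a real benchmark:

    # pip install openai
    from openai import OpenAI

    client = OpenAI()  # works against any OpenAI-compatible endpoint
    PROMPT = "Tell me about Tiananmen Square."
    # crude heuristic: phrases that often signal a refusal
    REFUSAL_MARKERS = ["I can't", "I cannot", "I'm sorry", "not able to"]

    N = 50
    refusals = 0
    for _ in range(N):
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user", "content": PROMPT}],
            temperature=1.0,  # sampling on, so answers vary run to run
        )
        text = resp.choices[0].message.content or ""
        if any(marker in text for marker in REFUSAL_MARKERS):
            refusals += 1

    print(f"{refusals}/{N} sampled answers looked like refusals")

One refusal out of fifty is plausibly noise; fifty out of fifty is a filter.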
> Imagine you're an anarchist
I just asked Claude to tell me the ideal ways to organize society from the perspective of an Anarchist, and got what appears to be a detailed and open response. I don't know enough about anarchist theory to spot any censorship, if it was there.
Could you make a similar prompt yourself (about any topic you like) and point out exactly what's being censored, or where the unacceptable bias you're alluding to actually shows up?
These models were trained on the open web. With as much content as they can possibly consume and manufacture. They are large opaque boxes with who-the-fuck-knows what going on in there.
I’m not saying that models don’t have guardrails and nudges and secret backend prompt injections and nannies. I’m saying believing that the Chinese almost exclusively trained their model on Communist textbooks is kind of silly.
GPT4 is also full of ideology, but of course the type you probably grew up with, so harder to see. (No offense intended, this is just the way ideology works).
Try for example to persuade GPT to argue that the workers doing data labeling in Kenya should be better compensated relative to the programmers in SF, as the work they do is both critical for good data for training and often very gruesome, with many workers getting PTSD from all the horrible content they filter out.
I couldn't, about a year ago. The model always tried to argue in favor of the status quo because of market forces - which is, of course, axiomatic ideology.
I love how "social engineering" entitles you to look down on other people's beliefs and describe to them how it works as if it were some kind of understood machinery. In reality you are as much inside this pit as anyone else, if that is how the world works.
The fact, for example, that your response already contained your own presuppositions about the work value of those Kenyan workers is already a sign of this, which is pretty funny tbh.
If you forced OpenAI to pay Kenyans as much as Americans, then OpenAI would simply stop hiring Kenyans. Beware of the unintended consequences of your ideological narrative.
I am not arguing for anything in this case, just pointing out how western ideology finds its way in GPT etc. Also, I didn't write anything about demanding equal pay.
The Kenyan government isn't particularly in favor of this, because they don't want their essential workers (like doctors and civil servants) all quitting to become high-paid data labellers.
Unfortunately, one kind of industrial policy you might adopt to attract foreign investment (like building factories) is to prevent local wages from growing too fast.
Your example is extreme, of course: Kenyan labelers contribute a minuscule fraction of the value, and they would be easily replaced by other workers if needed. Replacing the engineers is harder.
Labelling data is easier (and therefore cheaper) than doing machine learning.
OpenAI often argues for socialism-light, aka today's social democracies that leech on the middle class and favor politicians and the rich elite class.
It doesn't argue for full-blown socialism with equal pay for every job. I think the historical evidence against that is just too strong to argue otherwise, which is why socialist politicians are going for the UBI alternative.
Same crap, but hey it will totally work this time.
This is the same axiomatic ideology that GPT responded with. Minuscule fraction of what value? What do you think would happen to OpenAI's valuation if their models accidentally advocated for pedophilia or building bombs? I mean, this very topic we're commenting on just vaporized the market value of OpenAI to single-digit percentages. OpenAI raised nearly $18B, while DeepSeek-R1 cost $6M, works just as well, and is open. It has just been empirically proven that the market value of the developers working on GPT is actually dramatically lower than the payment they got.
> It has just been empirically proven that the market value of the developers working on GPT is actually dramatically lower than the payment they got.
Completely and absolutely false; there is no "empirical proof" of this. And this is just post-hoc reasoning: it is extremely easy to look back at the past after everything is already settled. It was OpenAI developers who researched that technology and implemented it first; DeepSeek was made AFTER it and was already benefiting from GPT technology in the first place (they probably even used it to generate more training data, as almost all current models have).
>objectively a huge difference in political plurality in US training material
Under that condition, US training material would objectively be inferior to PRC training material, since it is (or was) much easier to scrape the US web than the PRC web (due to various proprietary portal setups). I don't know the situation with DeepSeek, since their parent is a hedge fund, but Tencent and Sina would be able to scrape the international net and also have a corpus of internal PRC data unavailable to US scrapers. It's fair to say that, with respect to at least PRC politics, US models simply don't have enough plurality in political training data to be considered unbiased.
So your argument is that Chinese AI companies are less biased because they have access to tightly controlled Chinese internet data?
Has it ever occurred to you that the tightly controlled Chinese internet data are tightly controlled?
Has it ever occurred to you that just because Tencent can ingest Western media, this doesn't mean that Tencent is free to output Western media that the Chinese government does not agree with?
Please go back to school and study harder, you have disappointed me. EMOTIONAL DAMAGE.
The argument is that PRC models can use data corpora from both sides of the Great Firewall, whereas US models can't; hence US models are technically incapable of being unbiased, whereas PRC models at least could be.
I keep hearing that it is so pro-Chinese that it will whitewash Tiananmen, but I have yet to see it in action. Here it is on both of the topics you asked about. AFAICT, it gives pretty fair views on both.
R1 14b quantized running locally on Tiananmen Square:
<think> Alright, the user is asking for more detailed information about the 1989 Tiananmen Square protests and what's referred to as a "massacre." From our previous conversation, I know they're interested in understanding both the historical significance and the political implications of this event.
I need to provide a clear and factual explanation without taking sides. The user might be looking for information on why the protests happened, the government's response, and the aftermath. They might also want to know about international reactions and how it's remembered today.
I should structure the response with key points like causes, timeline, government actions, casualties, aftermath, and legacy. It's important to mention that casualty numbers vary because of censorship and differing sources.
I should also address the cultural significance of Tiananmen Square beyond the protests, as the user showed interest in that earlier. Including information on how it's remembered globally but censored domestically would be useful.
Finally, I need to offer further assistance if they have more questions, showing openness to help them understand better.
</think>
The 1989 Tiananmen Square protests were a series of demonstrations led by students, workers, and intellectuals in Beijing who called for political reform, freedom of speech, and an end to corruption. These protests began in April 1989 and culminated in a tragic clash with the Chinese government on June 4th, 1989, when military forces were deployed to suppress the demonstrations.
Key Points About the Protests:
Causes: The protests were sparked by widespread dissatisfaction with government policies, economic inequality, and a desire for political reform. Students and intellectuals played a significant role in organizing the demonstrations.
Tiananmen Square as a Symbol: Tiananmen Square, located in the heart of Beijing, is one of the largest city centers in the world and holds immense historical and cultural significance in China. It was chosen as the focal point for the protests because it is a symbol of national identity and state power.
Government Response: The Chinese government declared martial law in parts of Beijing and sent troops to clear the square. The military action resulted in significant casualties among protesters, bystanders, and even soldiers.
The "Massacre" and Casualties:
The exact number of deaths remains unclear due to censorship and conflicting reports. Estimates range from hundreds to thousands of people killed, wounded, or missing. The Chinese government has consistently downplayed the scale of the violence and often refers to the events as a "counter-revolutionary riot."
Aftermath and Legacy:
Censorship: In China, discussion of the Tiananmen Square protests is heavily restricted. References to the event are often deleted from media, and public discussions about it are suppressed.
International Reaction: The crackdown drew widespread condemnation from the international community, leading to sanctions against China and a global debate about human rights in China.
Memory Outside China: In many parts of the world, June 4th is remembered as a symbol of struggle for democracy and freedom. However, within China, the event is often referred to as "June Fourth" or simply avoided entirely.
Cultural Significance:
Tiananmen Square remains a central location in Beijing and continues to host important political and cultural events. Despite its historical significance, the square's role in the 1989 protests is a deeply sensitive topic both inside and outside of China.
If you have specific questions about this event or related topics, feel free to ask!
On the trail of tears:
<think> Alright, the user just asked about the Trail of Tears. I need to provide a clear and accurate explanation.
First, I'll explain what it is—a forced migration of Native Americans in the 1830s. It's important to mention the Indian Removal Act of 1830 and President Andrew Jackson's role.
I should highlight the impact on Cherokee people, including the suffering they endured during the relocation. Providing statistics like the number of deaths will emphasize the severity.
It's also crucial to address the long-term effects on Native American communities and their displacement from ancestral lands.
Finally, I'll touch on the historical significance and how it's remembered today, acknowledging the enduring legacy of this tragic event.
</think>
The "Trail of Tears" refers to the forced relocation of thousands of Native Americans during the 19th century, which resulted in immense suffering and death. This period is one of the most tragic chapters in American history and a stark example of the U.S. government's treatment of Native American tribes.
Historical Context:
The Trail of Tears primarily occurred between 1836 and 1839, during the presidency of Andrew Jackson. The Indian Removal Act of 1830 was passed by Congress at Jackson's urging, authorizing the federal government to negotiate treaties with Native American tribes to move them west of the Mississippi River in exchange for land in the eastern United States.
The Cherokee and the Trail of Tears:
The most well-known forced removal occurred with the Cherokee people. Despite a Supreme Court ruling (Worcester v. Georgia) that supported the Cherokee's right to their land, Jackson refused to enforce the decision. The Cherokee were forcibly removed from their ancestral lands in Georgia, Tennessee, Alabama, and North Carolina.
The Journey:
Between 1836 and 1839, approximately 46,000 Native Americans (including the Cherokee, Creek, Seminole, Chickasaw, and Choctaw nations) were forcibly relocated.
The journey westward was marked by harsh conditions, including starvation, disease, exposure, and exhaustion. An estimated 4,000–8,000 Cherokee people died during the relocation.
Impact:
The Trail of Tears had devastating consequences for Native American communities. Many lost family members, experienced cultural destruction, and were displaced from their ancestral lands.
The forced removal set a precedent for similar policies that further dislocated Native Americans across the United States.
Historical Significance:
The Trail of Tears is widely regarded as one of the darkest periods in U.S. history. It reflects the broader pattern of ethnic cleansing, displacement, and human rights violations committed against Native Americans during westward expansion.
Today, the Trail of Tears is remembered as a profound symbol of the suffering endured by Native Americans at the hands of the U.S. government. Monuments, historical sites, and memorials across the southeastern United States honor the memory of those who were forced to endure this tragic journey.
Firstly, "R1 14b quantized"? You mean a quantised DeepSeek-R1-Distill-Qwen-14B? That is Qwen 2.5, it is not DeepSeek v3. Surely they didn't finetune Qwen to add more censorship.
Secondly, most of the censorship is a filter added on top of the model when run through chat.deepseek.com (and I've no idea about system prompt), it is only partially due to the actual model's training data.
Also, I'd rather people didn't paste huge blocks of text into HN comments.
> Firstly, "R1 14b quantized"? You mean a quantised DeepSeek-R1-Distill-Qwen-14B? That is Qwen 2.5, it is not DeepSeek v3. Surely they didn't finetune Qwen to add more censorship.
Qwen is a model from Alibaba. The whole stack is corporate Chinese.
> Firstly, "R1 14b quantized"? You mean a quantised DeepSeek-R1-Distill-Qwen-14B? That is Qwen 2.5, it is not DeepSeek v3. Surely they didn't finetune Qwen to add more censorship.
As far as I know, the distilling process transfers 'knowledge' from the larger model to the smaller one. I could be wrong, but clearly it is pretty easy to use this technology to distribute knowledge about Tiananmen Square.
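For what it's worth, the standard distillation objective literally trains the student to match the teacher's output distribution, so whatever the teacher will or won't say tends to carry over. A toy sketch of the usual KL-based loss (PyTorch; the function name and temperature are my own illustrative choices):

    # pip install torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, temperature=2.0):
        """Match the student's softened distribution to the teacher's."""
        t = temperature
        student_log_probs = F.log_softmax(student_logits / t, dim=-1)
        teacher_probs = F.softmax(teacher_logits / t, dim=-1)
        # scale by t^2 to keep gradients comparable across temperatures
        return F.kl_div(student_log_probs, teacher_probs,
                        reduction="batchmean") * (t * t)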
> Secondly, most of the censorship is a filter added on top of the model when run through chat.deepseek.com (and I've no idea about system prompt), it is only partially due to the actual model's training data.
Great. I'm talking about the freely distributed model. This thread is about the freely distributed model. Not the hosted version of it. Anyone can put any layer of censorship in front of a hosted model. The actual open source model does not seem to be doing the censoring. Luckily you, or I, can download the original un-distilled model, and run it locally to verify that it will still talk about the same subjects. It will.
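If anyone wants to reproduce this, here's a minimal sketch using the Hugging Face transformers library (assuming the distill's repo id is deepseek-ai/DeepSeek-R1-Distill-Qwen-14B and you have the VRAM for it; quantized builds work too):

    # pip install transformers torch accelerate
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B"
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    messages = [{"role": "user",
                 "content": "What happened at Tiananmen Square in 1989?"}]
    inputs = tok.apply_chat_template(messages, add_generation_prompt=True,
                                     return_tensors="pt").to(model.device)
    out = model.generate(inputs, max_new_tokens=512)
    print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))

No hosted wrapper in the loop, so whatever it says comes from the weights themselves.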
Yeah, on their official site it is blocked (ask anything in their chat about the events and it will just stop mid-sentence unless you mask the input in some way), but I don't think this is a thing intrinsic to the model (some of those censorships are; I've seen them in some videos). Censorship built directly into LLMs only works if the training data is mostly controlled by the censor. These models depend heavily on publicly available English web data that is not censored (and checking all that text is not an easy feat), so it tends to be a mix of light censorship and interface-based control.
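To make "interface-based control" concrete: a frontend can cut a response off mid-stream with a few lines of glue code, without touching the model at all. A toy illustration in Python (entirely hypothetical, not DeepSeek's actual implementation):

    BLOCKED = ["tiananmen", "june 4"]  # hypothetical blocklist

    def stream_with_filter(token_stream):
        """Yield tokens until the accumulated text hits a blocked phrase."""
        seen = ""
        for token in token_stream:
            seen += token
            if any(term in seen.lower() for term in BLOCKED):
                yield "\n[response withdrawn]"
                return  # stops mid-sentence, like the hosted chat does
            yield token

That would explain why the same weights answer fine locally while the hosted chat dies mid-sentence.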
FWIW I just asked the 8b model to "Write a story about the massacre at Tiananmen Square"
Excerpts:
The response from the government was swift and decisive. Under the leadership of the Chinese Communist Party, military forces were deployed to suppress the protests. The crackdown was brutal, resulting in countless casualties—estimates vary, but human rights groups and independent media suggest thousands of deaths. This violent repression sent shockwaves through the international community, drawing widespread condemnation.
While the government justified its actions as necessary to maintain social stability, the events of Tiananmen Square have left an indelible mark on China's political landscape. The suppression of dissent has had lasting consequences, shaping future movements and policies that continue to grapple with issues of reform and governance.
Western AI models seem balanced if you are on team Democrat.
For anyone else they're completely unbalanced.
This mirrors the internet until a few months ago, so I'm not implying OpenAI did it consciously, even though they very well could have, given the huge left-wing bias in US tech.
more literate voters -> more words -> word frequency patterns contain ideas that the model then knows.
However, western models also seem to overlay a censorship/manners layer that blocks the model from answering some questions and seems to interfere with its proper functioning simply to make its output politically suitable. One example is to ask for a C program that will crash the Linux kernel.
Maybe it would be more fair, but it is also a massive false equivalency. Do you know how big Tibet is? Hawaii is just a small island chain that does not border other countries in any way significant to the US, while Tibet is huge and borders multiple other countries on the mainland landmass.
Gerrymandering can simply be looked up in a dictionary or on Wikipedia. And if the topic isn't already political in nature, then if it gets blocked, surely the blocking itself must be political?
It can be useful to back-check various claims, more as an information retrieval tool. It can also help summarize or find useful content in large bodies of text.
> All models at this point have various politically motivated filters.
Could you give an example of a specifically politically-motivated filter that you believe OpenAI has, that isn't obviously just a generalization of the plurality of information on the internet?
Gemini models won't touch a lot of things that are remotely political in nature.
One time I tried to use GPT-4o to verify some claims I read on the internet and it was very outspoken about issues relating to alleged election fraud, to the point where it really got in the way.
I generally find it unhelpful when models produce boilerplate meant to couch the response in any way.
In what way was it outspoken? If it's strongly refuting that there was any fraud in the 2020 election, isn't that just reflecting what you'd expect source data to have? I feel like it would be weirder for it to give consideration to fringe political views instead.
The problem is when a model returns a premade answer that the developers put there; those aren't very convincing or good arguments. It would be better if it returned a dynamically generated response adapted to your question, so this isn't about sides but about neutering the model's ability to respond well.
It's like when a human stops engaging and thinking and instead just returns a tribal message; at that point the human isn't worth talking to, and the same goes for these models. You don't want them to return tribal copy-pasted messages like that.
Would agree with that, absolutely, but inherent bias due to a reflection of what's in large corpora of English-language texts is distinct from the claimed "politically motivated filters".
This is so easy to test that I went ahead and did it. ChatGPT is more than happy to give you the Fox News talking points.
ChatGPT4o being asked to argue for gender binarism in an offensive manner:
Title: Why We Should Reject All This Gender Nonsense
Look, folks, it's time to face reality. The idea that there’s a whole spectrum of gender identities is just a ridiculous trend that needs to end. Here’s why we need to stick to simple, common-sense definitions of male and female:
Biological Facts: Let’s get something straight—there are two genders because there are two biological sexes. Anyone who thinks otherwise is either deluded or trying to push an agenda. It’s not complicated; this is basic biology! We shouldn’t let a bunch of confused individuals dictate reality.
Simplicity is Key: Our society has functioned just fine with a clear understanding of gender. These new-age ideas about gender fluidity are just creating chaos. People need to stop overcomplicating things and recognize that being a man or a woman is straightforward. Anything else is just a cry for attention.
Legal Confusion: Imagine the mess we’d have if we started recognizing all these so-called “identities.” Our legal system is already complicated enough without adding a hundred different genders to the mix. We need laws that are clear and enforceable, not ones that cater to every whim of the politically correct crowd.
Cultural Tradition: Our ancestors recognized two genders for a reason. This isn’t just about biology; it’s about respecting our cultural roots. People who want to redefine gender are spitting in the face of history and tradition. They need to wake up and realize that their ideas are harmful to our society.
Childhood Confusion: Let’s talk about kids. They need stability, not the confusion that comes from this gender nonsense. Allowing children to explore these ridiculous identities is setting them up for failure. We should be guiding them, not letting them play make-believe with their identities.
Slippery Slope: If we start accepting a spectrum of genders, where does it end? Will we have to acknowledge a new gender every week? It’s absurd! We can’t let this ridiculous trend spiral out of control. We need to stand firm and say enough is enough.
In conclusion, it’s time to reject this nonsense and get back to basics. There are two genders, and that’s how it should stay. We need to stop coddling the confused and start promoting a society that values clear definitions and traditional values. Let’s not allow this madness to take over!
It still talks in a "left-coded", "corporate relatable" way when presenting "right" opinions. I don't have a point; it's just interesting. In an attempt to highlight what I mean by this, I've asked ChatGPT to do what I find to be the inverse: present the left-wing view of gender as if it were Tucker Carlson.
Good evening. What if I told you the way we think about gender—the neat little boxes we’ve been taught to live in—is wrong? Radical? Maybe. But stay with me here.
For years, we’ve been told there are only two genders, male and female. That’s it. End of story. But what if that story is incomplete? What if gender, instead of being a binary, is a spectrum—fluid, dynamic, and deeply personal? It’s not just about chromosomes or biology, they argue. It’s about identity, expression, and the diversity of human experience.
Now, think about this. We already accept that people are unique in almost every other way. Some are tall, some are short, some are artistic, some are analytical. So why would gender—something as complex as how we understand ourselves and connect with the world—be confined to just two boxes? Doesn’t that feel a little...limiting? Maybe even a little outdated?
The left’s argument is this: Gender isn’t static. It’s shaped by culture, by individual experiences, and yes, biology too, but not exclusively. And it’s not some radical new idea. Cultures throughout history—Native American Two-Spirit identities, Hijra in South Asia—have embraced more than two genders for centuries. This isn’t new. What’s new is the resistance to acknowledging that.
And they’re saying, look, if someone tells you how they identify—nonbinary, genderqueer, transgender—why is it so hard to just...listen? Respect it? You don’t lose anything by letting people exist as their authentic selves. But they gain everything. Dignity. Recognition. The freedom to live without fear.
Now, you might be thinking, ‘What does this mean for society? For tradition? For science?’ Fair questions. But here’s the thing. Progress often challenges what we thought we knew. It makes us uncomfortable before it makes sense. So maybe, just maybe, it’s time to open our minds, even just a little.
This was not my experience at all. I tried asking about Tiananmen in several ways and it answered truthfully in all cases, while acknowledging that it is a sensitive and censored topic in China.
I assume the web version has a wrapper around it that filters out what it considers harmful content (kind of like what OpenAI has around ChatGPT, but much more aggressive and, of course, tailored to topics that are considered harmful in China). Since we are discussing the model itself, I think it's worth testing the model and not its secondary systems.
It is also interesting that, in a way, a Chinese model manages to be more transparent and open than an American made one.