Hacker News
Cleartext: A text editor that only allows the 1,000 most common words in English (github.com/mortenjust)
314 points by henrik_w on April 6, 2016 | 216 comments



Am I the only one to find that using only the 1,000 most common words actually makes things harder to understand? The writer ends up having to use convoluted paraphrases to refer to things where a precise, well-defined and well-understood word exists but is not part of the top 1,000.


The technique is working quite well for Donald Trump: https://www.washingtonpost.com/news/the-fix/wp/2015/09/15/ho...

> Some of his answers last only a few seconds, some are slightly longer, but almost all consist of simple sentences, grammatically and conceptually, and most of them withhold their most important word or phrase until the very end. Trump’s sentences end with a pop, and he seems to know instinctively where to put the emphasis in each one.


I'm with Orwell on this one. Whether by design or "happy" accident, Trump exemplifies the trend for politicians to be vague. Or as he puts it, speaking like this "is designed to make lies sound truthful and murder respectable, and to give an appearance of solidity to pure wind."

https://en.wikipedia.org/wiki/Politics_and_the_English_Langu...


Is this the syllogism?

    Trump uses political language
    political language makes murder respectable
    ------
    Trump makes murder respectable

Why focus on Trump then? Care to name a politician who's not using political language? That seems to be an oxymoron.

Trump was against the Iraq War, probably the biggest killing event of the last xxx years.


No, this wasn't intended to slyly indicate that "Trump makes murder respectable". Were I to make that bold accusation I'd want to debate that on another forum than HN and base it on things Trump has actually stated.

Regardless of politics, Trump has a very idiosyncratic approach to speaking. His mastery of this approach reminds me strongly of the Orwell essay linked above.

I've no desire to debate Trump's policies on HN, but find his approach to speaking quite interesting.


Thanks for clarifying. I do agree that his speaking style is different and interesting.


“I could stand in the middle of 5th Avenue and shoot somebody and I wouldn’t lose voters,” Trump said.

So in a way, yes, he has made murder respectable. The previous two POTUS' made murder respectable by using remote controlled flying bombs.


Looks like the creator picked up on that, as well:

https://medium.com/@mortenjust/i-doomed-mankind-with-a-free-...


I'd say anyone who says he's made a language/special vocabulary/style guide/editor that "makes it hard to write bullshit" really hasn't picked up on that.


I hope that they normalized the data to account for the number of words spoken during each debate, rally, etc... This is important because of Zipf's law: https://en.wikipedia.org/wiki/Zipf%27s_law

If they didn't, the findings are invalid.


It's because he doesn't aim to actually solve problems or provide meaningful answers. Simple words are good if your goal is to rally a large number of people. But they're not practical if you want to build a car, honestly debate social issues or describe how a cell works.

What a lot of smart people don't understand though, is that in politics, not trying to solve problems, but trying to appeal to people is the goal. And you don't use the same tools at all for that.


I think it works well for high level overviews. This may be in part because it prevents the use of domain knowledge and terminology, which is natural for an expert who might be writing the overview, but hard for a layman to grasp. Put more simply, I think it's useful to describe something generally, but not when trying to describe in detail or specify attributes exactly.


Exactly.


Trump bashing is lame. Trump has used words in his speeches that you probably couldn't define. I heard him casually use PROGNOSTICATION a couple weeks ago. He's not dumb.

It's actually a sign of intelligence to use the simplest language necessary when expressing yourself to a broad audience (many with low education). People who use unnecessarily large words are usually doing it to try to sound smarter than they are and to build up their own egos.


My intent is not to bash Trump. I actually think that he's one of the more talented people running for POTUS. This is not to say that I support him or his views (I am voting for Gary Johnson [1]), but I respect his ability to maneuver the media to suit his aims. It is my contention that his strong aptitude with using appropriate/effective language is one of the main reasons he can wield the media around so effortlessly.

[1] https://garyjohnson2016.com/


So... you're throwing your vote away? The two party system is deeply flawed, but when has a third party candidate done anything but take votes away from reasonable candidates?


If any third party candidate is going to do this in 2016, it is going to be Donald Trump after he and his supporters walk out of the GOP convention.

I'm an independent that lives in NYC. My vote is always a throw away vote.


Ah yes, prognostication. Such a rare word that it's only uttered multiple times in one of the most popular Bill Murray movies made which is guaranteed to be on TV at least once a year. Seriously, that's not even nearly one of the best words.


Maybe you should start the Thesaurus Party and run in the next election. I'm sure voters would swiftly capitulate to the erudition of your insuperable sesquipedalian fulminations. None of those pathetic made-for-TV words that Trump has bedizened his 8th grader lexicon with.


At what point does a word's existence become invalid? Searching for 'sesquipedalian' brings up pages upon pages of dictionary links on Google. Clearly this word exists only in dictionaries and is so rarely used as to be almost imaginary by those who use it. To put it another way, if it wasn't in the dictionary would it exist at all?


Yes because an English dictionary is a historical record of how English speakers and writers use words and phrases making no judgement as to their 'correctness' or whether an audience of current English speakers would understand you were you to use them.

The purpose of a dictionary is to aid you in understanding the meaning any English speaker or writer is trying to convey independent of the time period or location, not to dictate valid and invalid words. Some words might be dropped from abridged dictionaries when they fall out of popularity but in 'The English Dictionary' (which doesn't really exist but is approximated by dictionaries like Oxford's) words are added but never removed.

This is different from, say, the mission of L'Académie française, which is to act as a language regulator and maintain the official vocabulary and grammar of French.


That's a good point. The makers of dictionaries have always sought to catalogue language rather than create it. That's what Samuel Johnson set out to do. Without dictionaries we certainly would have lost a lot of words from our collective knowledge along the way.

Sesquipedalian seems to have been popping up since the 1600s.

http://www.oed.com/view/Entry/176752?redirectedFrom=sesquipe...


It's hardly unique to Trump. Politicians have labored to eloquently say as little as possible since the Assyrians.


Assyrian kings had a lot to say. You could expect to hear about their illustrious ancestors going several generations back, and their many military and domestic accomplishments (like irrigation projects).

Not so much about what they wanted to do in the future. They didn't stand for elections.


Trump does not limit himself to anything like the most common 1000 words.



He has the best words!


If elected, we'll get so many great words, we'll get bored of words!


Not just great words... terrific words.


Terrifying words.


This also ignores that most Democratic presidents spoke at the same level.


The article literally has a chart showing that the current candidates are all speaking at a higher level, and that the remaining candidates, Democrat and Republican, are speaking at 8th, 9th, and 10th grade levels. Of the current cohort of candidates only Kasich is near Trump.

If you're talking about presidents (which is unfair, the type of speech you give as president vs candidate, and debates are different), it's clear that although the trend was downward, for the past few presidents the average level was about 8th grade:

http://www.vocativ.com/interactive/usa/us-politics/president...

Trump speaks at a 4th grade level. Probably intentionally.


Has anyone ever calibrated these levels with actual n-th graders?


Probably just the top 25.


Sounds tremendous!


He speaks like a child, it works because his base is stupid and they relate. It's not a skill, he just isn't that bright.


If Trump honestly believed everything that he says, I would agree that his thinking would necessarily be shockingly _shallow_ on many issues.

But to compare him to a four year old then make a sweeping statement like "isn't that bright" ?

How could anyone think this? Isn't it obvious that there are domains in which he is, for better or worse, exceptionally intelligent within that domain?


I think it's interesting to try to identify his domain. I think it's marketing. He is exceptional at marketing, and at generating personal wealth from that. That is different from being a good businessman, or knowing how to run a business (and for his method of wealth generation, may be in opposition to it). I think he's very good at perpetuating his own extreme boom and bust cycles, and coasts along on that. While at the top, he's able to capitalize on the success to extend his brand,


The trouble is, of course, that those domains (persuasion, intimidation, and braggadocio) do not include those necessary to competently be POTUS.


I didn't compare him to a 4 year old, that's a strawman. A 15 year old is a child.

> Isn't it obvious that there are domains in which he is, for better or worse, exceptionally intelligent within that domain?

No, it isn't. He's rich because daddy was rich, without that leg up he'd be living in a trailer park slobbering on himself.



Dude, that's not an argument - that's a whole bunch of ad hominems. Could you bring an argument against Trump at least instead of literally smearing him and his supporters? The guy is acknowledged to have won (basically) every republican debate.. would you suggest a child can do that?


> Dude, that's not an argument - that's a whole bunch of ad hominems.

You don't know what an ad hominem is. Those were called insults, insults are not ad hominems.

> The guy is acknowledged to have won (basically) every republican debate.. would you suggest a child can do that?

Yes actually, a decently intelligent high school child could win against that pack of dim bulbs. Republican debates are a joke won by whichever person throws enough red meat to the crowd and insults the other guy the best; they're an embarrassment to even watch.


I personally use http://www.hemingwayapp.com/ when I think I've gone a bit lazy in the way I have written an article.


Huh, I usually only associate Hemingway with "write drunk edit sober."


You should read more Hemingway.


I had forgotten about that. Thanks for the reminder!


Great tool


I think a much better approach would be to start with the 1000 most common words, but allow defining terms from there. This allows you to develop the correct jargon which will both enhance understanding and allow communication with others about the topic in the future.

Having to continually fall back on silly terms like "up goer" does nothing to enhance understanding when you can define what a rocket is and then just use the word rocket.


If this colored or underlined words past the 1000 rank, instead of removing them, it would be less offensive to my sensibilities.
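
For what it's worth, here is a minimal Python sketch of that highlight-instead-of-remove behaviour. Cleartext's actual implementation isn't shown in this thread, so `ALLOWED` is a hypothetical stand-in for the real top-1,000 list, and the `[[...]]` markers stand in for whatever coloring or underlining an editor would use.

```python
import re

# Hypothetical stand-in for the real 1,000-word list.
ALLOWED = {"the", "cat", "sat", "on", "a", "big", "thing"}

WORD = re.compile(r"[A-Za-z']+")

def mark_uncommon(text, allowed=ALLOWED):
    """Wrap every word outside `allowed` in [[...]] instead of deleting it."""
    def repl(match):
        word = match.group(0)
        return word if word.lower() in allowed else f"[[{word}]]"
    return WORD.sub(repl, text)

print(mark_uncommon("The cat sat on a sesquipedalian thing."))
# prints: The cat sat on a [[sesquipedalian]] thing.
```

The writer keeps the rare word on screen and decides later whether it earns its place.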


Amusing coincidence that readams above used "up goer", as the "up goer" editor does what you suggest:

http://splasho.com/upgoer5/


That's actually a nice implementation. It's much easier to fix errors after finishing a sentence than it is to fear every word you type. Here's my attempt to describe compilers, which ended up going a little off-track:

----

If you want to make new things for a computer, you first have to write them down in a way that a computer can understand. You have to use strange computer words, not normal human words. People usually only talk to other people, so talking to a computer using only computer words is very hard.

To make it easier, there are things that take human words and then write them out as computer words so you don't have to do it yourself.

Those things were made by people who went to school for a long time. To make them, they had to know all the human words and all the computer words too. Thanks to them, you can write stuff that feels pretty normal to you, and still be able to make new things for a computer. It's still a little bit hard, but you can learn the simple stuff in a few days.

A long time ago, only a few people could build new things for computers. But today, many people like you and me build them for money or just for fun on their free time.

Just imagine: you could build a whole new thing today. A thing that no one ever had before. And if the thing you build actually helps other people, you can get a lot of money for it. That's what many of us in the "Thing Building News" place do. We go there to share stories about the things we build.

I now want to thank the people who went to school for a long time and made the things that turn human words into computer words. They help us do easy and fun stuff for money.


:) That's pretty good, well done.

The first compiler took someone with a bachelor's degree in mathematics and physics from Vassar, a master's from Yale and a Ph.D. in mathematics from Yale and 3 years of experience with the machine.

And even then:

"Nobody believed that," she said. "I had a running compiler and nobody would touch it. They told me computers could only do arithmetic."

https://en.wikipedia.org/wiki/Grace_Hopper


Bingo - let users extend lingo and use crowd-sourced/maintained lingo/lexicon files.

It'd be absolutely awesome if these lexicon files evolved organically and were public over time to graph how language progresses...


I think even slang terms have a place here. Not sure if the goal is to really improve language skills. If not, any new widely accepted terms could be defined and optionally rated by a community for wider acceptance. If the word is fairly new it could be highlighted in some way to indicate its popularity.


Directly relevant: Toki Pona, a language with 123 (ish) words. http://www.theatlantic.com/technology/archive/2015/07/toki-p...


You're not the only one. Words exist exactly to avoid the problem this app creates.


With just the words "zero" and "one" strung together in some combination, we can describe a JPEG image, which can be a picture of anything. (For instance, a black and white scan of the word "deoxyribonucleic" in a Times Roman font).


The ability to describe anything says nothing about the clarity of the description


That's kinda the point.


While playing the devil's advocate, I could argue that the description using only 0 and 1 is more precise than the phrase "It's a picture of a white cliff on the shore of the sea, possibly in the south of England, in the foreground there's the silhouette of a man or woman too small to distinguish clearly at the bottom of the cliff, the sky above the cliff is bright blue with a few trailing clouds etc etc etc etc etc".

But we're not really talking about the description of a picture, we're in fact talking about political discourse.


Let's play a game; I'll describe a scene to you, and you need to make a drawing of the scene based on the description. The closer your drawing is to the scene I'm describing (for some agreeable metric of 'closer'), the more money you win.

Now, do you want me to recite a bunch of ones and zeros? Or would you rather I say, "It's a picture of a white cliff on the shore of the sea, possibly in the south of England; in the foreground there's a person's silhouette, too small to pick out their gender..."?

If, as you say, you could argue that the ones and zeros makes for a more precise description, then obviously you would prefer that option, yes?


I agree with you, but clarity is not the same thing as precision.


I disagree. That's a reproduction, not a description.


I dare you to find someone willing to, without a computer, parse such a description. JPEG is a surprisingly non-intuitive, complex format.

You're right, you can be more precise that way, but precise text means nothing if nobody is reading it.


Sure, but it's a fun writing exercise. If you write using a limited vocabulary and then fix the most convoluted paraphrases, you might end up in a better place than if you wrote it normally.


Which is sort of the joke. But yes, this is absolutely correct. Moreover, as with any language, the way to get better is to use it more/get more exposed to more parts of it.


Which is the joke. Absolutely correct. And – as with any language – the way to improve is to use it.

To paraphrase and copy-edit your perceptive comment. (The subject made me do it).


Well played.


I tried this, and very quickly found that out. Thereafter, I considered very carefully whether any word that wasn't in the most common 1,000 was worth introducing, and used the top 1,000 freely. That ended up working pretty well. When I found a word that really helped, I added it to the list of words I could continue to use for what I was writing.


Maybe the author is just out by an order of magnitude? Maybe it should be 10,000 words?


Maybe it should be the 1000 least ambiguous words.


I suspect that a list of the least ambiguous words would be made entirely of numbers.


Couple. Trio. Threesome. Still room for ambiguity!


That and extremely technical terms.


You're not, and you beat me to writing the same comment by 10 hours.

Writing is the problem of communicating ideas, and doing so to a sufficiently prepared audience, effectively.

The problem with stunts such as this -- and there are others, the "if you stop writing for 5 seconds your text starts disappearing" demo a few weeks back comes to mind -- is that they confuse the medium with the message.

Yes, it's possible to have confusing writing on account of unfamiliar terminology. That's a frequent problem with many Wikipedia articles, or much academic writing of the past few decades (older works -- say, 1950 and prior, to the 18th or even 17th century -- are often far clearer).

But the problem most often isn't just the vocabulary, it's the structure of the writing. Pop open a book of Darwin or Adam Smith (I've been reading both) and turn to a random page. Once you get over a slight old fashionedness of the writing, the thoughts are clear.

It is impossible to tell a narrative clearly if you've not organised your own thinking about it.

I've been doing a lot of thinking ... well, about a lot of things. One of those topics is of data and narrative, and the difference between a dry factual presentation (say, a data table, or a simple list of events) and a story which weaves these into a consistent whole. Our minds typically work very well with narrative, sometimes too well, as false narratives can be constructed. But the best presentations I've encountered include both solid facts and a story which ties them together. Finding an author who does this well and with skill is truly impressive.

And the good ones expand your vocabulary. Another point, and one to keep in mind.

There's also Einstein's dictum: make things as simple as possible, but no simpler. The same applies to language.


Exactly! There is a phrase in Thing Explainer where Randall is trying to say that a part of a space shuttle or something is made in Japan; instead of using 'Japan' he has to say something like "the land where the sun comes up" or close to it (I don't think he could use 'rising' but may not be recalling correctly). It breaks down as soon as you try to convey something beyond base level understanding.

So yes, it sounds like using the most common 1K words would aid readability but you probably need to raise that to 2.5K or more. Low enough to avoid the use of complex words while also freeing the author from awkward phrasings.


Being able to express and understand complex concepts is the difference between "smart" and "dumb", educated and uneducated.

An average 4 year old in extreme poverty or public care has 10M words directed at him. A 4 year old with professional parents hears 50M. That's a key differentiator that drives future potential.

Many people lack sophisticated literacy for a variety of reasons. When Honda opened a factory in Alabama, they changed written assembly instructions into pictograms due to the poor literacy level. In aviation, documentation is written in "simplified technical English" to assist non-native English speakers.


Could be handy for non native English speakers though - less to learn.


Even so, you still really have to know the idiomatic phrases, which can be considered as separate semantic units. Just because you know what "follow" and "up" are doesn't mean that you know what it means to "follow up".


Yes, the use of idioms and slang is basically a cheat around the thousand-word limit of Basic English by writers trying to use it.


I seriously doubt it, who wants to sound like a very stupid child?

This is more of an XKCD fan boy project. So let's look at a famous XKCD comic using this, "US Space Team's Up Goer Five". "US Space Team" barely makes sense, does more to obscure than illuminate, and "Up Goer Five" makes no sense at all.


I have to disagree with you.

First off, there are many dimensions to reading new languages. Nobody is saying "learn 1000 words of English and you're done for life". Rather, achieve fluency in a subset, then work on becoming more idiomatic. If you're lost somewhere and need help, nobody cares that you sound like a stupid child. (They probably won't notice, though. I've found through learning other languages that rare words are actually very rare and of extremely limited use. Obviously if someone uses it and you don't know it, you're stuck. But you can get by without it.)

US Space Team tells you what NASA does. NASA on its own tells you nothing about what they do, for all you know it could be the National Agricultural Sabotage Administration, and their goal is to blow up farms.

You have to separate language out into the multiple dimensions that it encompasses in order to understand certain aspects. It's clear to me that you use language as a social signalling method, "that person doesn't know what NASA is, hah, they're dumb and I won't associate with them as a result." That's fine, but there is another aspect: explaining ideas. And that can be done with less than the full set of all English words.


This is the problem with intelligentsia, they want to sound intelligent, using "heavy hitter" words. What about the intelligent person who is a non-native speaker? This association of English fluency with intelligence has ruined an untold number of careers in the world, esp. in countries where English is an official but non-native language.

I welcome simplified English in every shape or form, from limiting words to changing spellings and streamlining it. It's no longer the language of people from England; everyone in the world has a stake in it. English reform is long overdue.


> This is the problem with intelligentsia, they want to sound intelligent, using "heavy hitter" words.

Some people insert heavy hitter words into their speech in order to sound more intelligent. It comes across as unnatural and awkward, though.

But people who have acquired large vocabularies organically, by being around people who talk that way or by reading a lot, use it naturally and unthinkingly. And then there are authors and such who use language like a painter uses color. English is a very rich language, and if you want to paint a rich scene with words, there is quite a palette to choose from. Would you tell a painter he can only use primary colors?


There's a huge difference between using the correct word, and using a proverbial 10 dollar word. Limiting yourself to the "ten hundred" most common words is just limiting. You can't even say you're hurt.[0]

[0] http://www.esldesk.com/vocabulary/words


But "you are having a bad problem and you will not go to space today" is one of the funniest understatements I've ever read.


"stupid" = depends on the age.

Unless you meant all children are stupid....


I meant a 5 year old on the left side of the bell curve of all 5 year olds.


The very popular Oxford Learner's Dictionary uses a limited vocabulary of 3 500 words for its definitions.


The "Simple English" version of Wikipedia uses the Basic English 2000 and Voice of America's Special English 1500 word list:

https://simple.wikipedia.org/wiki/Simple_English_Wikipedia


You are correct. Lectern v. podium. When we have specific words for things, we should use them. Moving down a step in specificity is not conducive to clarity.


I thought the same thing when it was done (or was it 10k?) for _Thing Explainer_ by 'the XKCD guy'.

Great idea, needs more words. It's too extreme.

In the case of this editor, it may actually be useful to some if `N` were configurable, and maybe exceptions could be added. (If you're writing an OS X user guide say, you want to be able to write 'OS X'!)
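
A rough Python sketch of what a configurable `N` plus exceptions could look like. Everything here is hypothetical: `build_allowed`, `is_allowed` and the tiny `ranked` list are made up for illustration, and a real editor would also need a tokenizer that treats a multi-word exception like 'OS X' as one unit.

```python
def build_allowed(ranked_words, n=1000, exceptions=()):
    """Top-n words by frequency rank, plus user-supplied exceptions."""
    allowed = {w.lower() for w in ranked_words[:n]}
    allowed.update(e.lower() for e in exceptions)
    return allowed

def is_allowed(token, allowed):
    """Case-insensitive membership check against the allowed set."""
    return token.lower() in allowed

# Toy frequency-ranked list; a real one would hold thousands of entries.
ranked = ["the", "of", "and", "to", "a"]
allowed = build_allowed(ranked, n=3, exceptions=["OS X"])
print(is_allowed("The", allowed), is_allowed("to", allowed), is_allowed("OS X", allowed))
```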


Genuine question: have you ever tried to learn a foreign language? At least for me, long sentences with easier, more common words are easier to understand.


It's kind of like "limit" and "dx/dy" in calculus compared to the crazy long prose people in ancient times had to write.


It should support a few template sentences which define a word, which is then allowed to occur in the remainder of the text.

More useful than a document which uses only 1000 common words is a document which uses only 1000 words, plus words which it clearly defines.

I feel as if I could write almost anything if I have that, and it will be self-contained and accessible to anyone with the thousand word vocabulary, plus the ability to internalize definitions, which is a very basic faculty of the intellect.
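
To make the idea concrete, a small Python sketch under loud assumptions: the template sentence is taken to be `'X' means ...`, `BASE` is a toy stand-in for the thousand-word list, and `disallowed_words` is a made-up name. A term only becomes legal from the sentence that defines it onward.

```python
import re

# Hypothetical base vocabulary; the real list would have about 1,000 entries.
BASE = {"a", "is", "thing", "that", "goes", "up", "the", "went", "today", "means"}

# Assumed template sentence: 'X' means ...
DEFINITION = re.compile(r"'([A-Za-z]+)' means\b")
WORD = re.compile(r"[A-Za-z]+")

def disallowed_words(text, base=BASE):
    """Return words that are neither in `base` nor defined earlier in the text."""
    allowed = set(base)
    bad = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        # A definition makes its term legal from this sentence onward.
        for term in DEFINITION.findall(sentence):
            allowed.add(term.lower())
        bad += [w for w in WORD.findall(sentence) if w.lower() not in allowed]
    return bad

print(disallowed_words("'Rocket' means a thing that goes up. The rocket went up today."))
print(disallowed_words("The rocket went up today. 'Rocket' means a thing that goes up."))
```

The first call reports nothing; the second flags "rocket" because it is used before it is defined.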


This idea reminds me of an amazing talk by Guy Steele: https://youtu.be/_ahvzDzKdB0

He starts the talk by assuming monosyllabic words as his primitives and builds up the words he needs to use to give the talk by providing definitions for them first.


Thanks for that, I really enjoyed it. Back when Java was new and cool...


Keep in mind that the point of these exercises is to make instructions, explanations and short narratives more approachable.

If describing something with a limited vocabulary is awkward and impossible, maybe your approach is too complex.

My 4 year old is starting to learn how to read, and you can see in his eyes the connections made when he tries to read a basic level book. The sentence structure is obvious and predictable, and he's pretty good at getting it.

With a more complex book (Ex: The Little Red Caboose), he recognizes words and "paraphrases" the story with the pictures.


A similar editor was made for the book Thing Explainer[1] by Randall Munroe of XKCD, a book that explains all kinds of different things, from space shuttles to microwaves, using the top 1000 common English words.

[1] https://xkcd.com/thing-explainer/


You can actually try XKCD's editor at https://xkcd.com/simplewriter/


Yeah it's called "thing explainer", but it's more like a "thing convoluter". It's a work of humour, not a book of science.


> It's a work of humour, not a book of science.

Indeed. I thought it was an interesting experiment. The results make it abundantly clear that 1,000 words are not enough to communicate about science and engineering topics effectively even at an introductory "plain English" level.


Some of the explanations are actually going into a textbook now, to accompany more traditional text. So it seems professionals disagree with you. I will concede that it might have been clearer with the top 2000 words or something, but that's my perspective and plenty of kids might disagree, especially those that have grown up with weird dialects or even sociolects like AAVE.


No. He's doing some drawings for text books. There is no chance that the thing explainer 1000 words affectation will be adopted for the text book for the purpose of actual teaching.


Really? That's interesting. Do you have a source for that?

I can't imagine how talking about "bags of air" and "bags of yellow water" is any clearer than talking about "lungs" or "bladders".


I should of read everything first and would of saw the reference already made... My bad


Just a small nit: The words sound alike but they mean different things. You should have read everything, then you would have seen the reference. Source: grammer.


> Just a small nit: The words sound alike but they mean different things.

"Grammer" is a German company making automotive components.

"Grammar" is the thing with words'n sentences'n shit.

Just sayin' ;)


He knew exactly what he was doing.


This is just a phonetic misspelling of the contraction "should've" which really is pronounced "should of"


However, should've is a shortened version of should have so I think the point still stands.


Ain't the English language great when we can have multiple contractions such as "couldn't've"


Or "should a" ;)


*have


Would love a slider that would allow you to adjust the words allowed from the 500 most common words in English to the 10k most common words. Also, it would be great if you could compile a Windows version.


Or how about the more uncommon the word the more like the background tone it is so that you get instant visual feedback?
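
One possible way to sketch that fading in Python, assuming a white background and a known frequency rank per word. `fade_color` and the `max_rank` cutoff are made up for illustration; the log scale is a guess based on word frequencies falling off roughly as a power law.

```python
import math

def fade_color(rank, max_rank=10_000):
    """Map a frequency rank (1 = most common) to a gray hex color.

    Common words stay black (#000000); rarer words drift toward white,
    i.e. toward the assumed white background.
    """
    rank = max(1, min(rank, max_rank))
    t = math.log(rank) / math.log(max_rank)  # 0.0 for rank 1, 1.0 at max_rank
    level = round(255 * t)
    return f"#{level:02x}{level:02x}{level:02x}"

for r in (1, 100, 5000, 10_000):
    print(r, fade_color(r))
```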


I like this idea. Maybe a context menu for each word with scored alternatives..


+1 on the slider, you/we should really do that


I like the idea. This could be enhanced with a thesaurus that would offer alternatives to difficult words. It could be a useful tool not only for writing clearer explanations, but also to compose easy readers for English learners.


You could build that feature dynamically. When a user tries to use a disallowed word and then uses an allowed word, you could log that word pair. Combine and anonymize those logs and you'd be able to show the most likely replacement words for any word that people often try to use.
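
A tiny Python sketch of that logging idea, nothing more. `ReplacementLog` is a hypothetical name, and real anonymization would take more care than simply not storing a user id.

```python
from collections import Counter, defaultdict

class ReplacementLog:
    """Counts which allowed word users typed after a rejected one."""

    def __init__(self):
        self._pairs = defaultdict(Counter)

    def record(self, rejected, replacement):
        # Only the word pair is kept; no user identity is stored.
        self._pairs[rejected.lower()][replacement.lower()] += 1

    def suggest(self, rejected, k=3):
        """Return up to k most frequent replacements for `rejected`."""
        counts = self._pairs.get(rejected.lower(), Counter())
        return [word for word, _ in counts.most_common(k)]

log = ReplacementLog()
log.record("utilize", "use")
log.record("utilize", "use")
log.record("utilize", "apply")
print(log.suggest("utilize"))
```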


Agreed. It would likely need a larger basic dictionary of allowed words and would need some ability to override the dictionary for words that just have to be in to make sense due to domain or context.


Attempting to type the Gettysburg Address, which is how I usually experiment with word processors and keyboards, is an exercise in futility.

> Eight times ten and seven years ago our fathers brought into the world a new country, born of free thoughts and doings, and completely sold on the idea that all men are created the same.


If you made a special case for numbers and fudge the clause structure, it's not that bad.

> Eighty-seven years ago our fathers brought into the world a new country, born of free thoughts and doings and the idea that all men are created the same.


Futility? It looks like an improvement on the original:

"Four score and seven years ago our fathers brought forth on this continent a new nation, conceived in liberty, and dedicated to the proposition that all men are created equal."

"free thoughts and doings" sounds clearer than "liberty", which could mean almost anything - did it have no prisons? I guess they really meant freedom from England's control, which isn't quite free thoughts and doings - that goes to show how poorly worded the original was.


Them's fightin' words. If the Gettysburg Address is poor writing, life has no meaning.


You guys are missing that this is a joke. This is a reference to this comic: https://xkcd.com/1133/.

Which is poking fun at how funny things sound when you restrict yourself to only that many words.


Not just a joke: https://xkcd.com/thing-explainer/

Randall Munroe did an entire book that way.


A joke that would improve the writing of most people.


Sorry, "improve" is not in the list.


See? I need to better my own use of words. We all speak and write in ways others can not understand unless we make the effort—especially in a place where not everyone speaks our language.


It's neat but not very useful. I'd actually be interested in an editor that enforces E-Prime[1] to see if it made writing more clear.

[1] https://en.wikipedia.org/wiki/E-Prime



Thanks! My first question upon seeing this post was: isn't there already an emacs mode for this?


There is now: https://github.com/aaron-em/ten-hundred-mode.el

(There was before, under the name "1000-words.el", but it isn't very good, and it also isn't, and can't easily be made, available via MELPA. So I wrote this instead. The PR to add it to MELPA is open and awaiting review; pretty soon you should be able to M-x package-install RET ten-hundred-mode RET and get it.)


Perfect for editing articles in Simple English Wikipedia:

https://simple.wikipedia.org/wiki/Main_Page


The author of the book already made this.

https://www.xkcd.com/simplewriter/


Joke aside, there are ways to compute a "readability score" for any block of text. This is useful for document writers who need to target particular grade levels for their docs (e.g. a driving manual is "8th grade"). https://readability-score.com/


The readability score is based only on the lengths of words, which I guess is something, but it's not a very good proxy for when you learn the word. The frequency of the word is a much better thing to measure.


I think this is a cool project. I appreciate that the creator has shared this with the world. I think it's sad that people are (appear to be?) criticizing him simply for making it and sharing it. However....

"If you find yourself on the receiving end of a message that is too hard to figure out, do everyone a favor and insist on a simpler version."

Do _everyone_ a favor? Presumptive.

If I am on the receiving end of a message that is too hard for me to understand due to my ignorance of certain terms, or due to difficulties I have parsing grammatically correct writing, then the best way for me to do everyone a favor is to work on improving my own vocabulary and/or thinking ability.

Then I will be better equipped to communicate well with a larger set of the population, and better equipped to reason well. Improving your own thinking skills is good citizenship. Improving your ability to communicate well with a wider swath of the population can help you to build bridges between communities.

If I were instead to insist that the message achieves 'clarity' by accommodating my ignorance, then I may be helping some but certainly not everyone.

"Maybe one that only uses the 1,000 most common words."

Asking others to accommodate limitations to my vocabulary is likely to increase their cognitive load; most often I'd rather them apply themselves more fully to other tasks. I can just use a dictionary! Also, this runs the risk of resulting in text that is _harder_ to understand, if they trade precise terms for needlessly convoluted grammar.


How about a standard translator that takes generally accepted conversions of more complex language constructs into simpler language?

How about a thing that changes less well known words and puts in easier words?


If you come up with a way to automagically translate things into easier English you could probably sell it to various UK government / NHS / etc organisations, who all have a duty to make much of their communication available in "Easy Read".

Here's the Equality Act Easy read version: https://www.gov.uk/government/uploads/system/uploads/attachm...

And the Equality Act non-easy read: http://www.legislation.gov.uk/ukpga/2010/15/contents

Here's a Google search for some easy read documents: https://www.google.co.uk/search?site=&source=hp&q=easy+read+...

Here's Mencap's (a charity that works with people with learning disability) guide to Easy Read: https://www.mencap.org.uk/make_it_clear


Similarly, in the U.S. there is the Plain Writing Act of 2010 (http://www.plainlanguage.gov/plLaw/).


For the latter, a dictionary?


It would be a huge usability improvement to have autocompletion of valid words. Currently, you have to type each word, and wait to see if it is rejected/removed.

Instead of waiting for each word to be typed, why not show acceptable words as you are typing, so you can tell before you finish typing if it is valid.
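
Something like this would do it, as a rough sketch. It assumes the allowed words are kept in a sorted list; `completions` and `can_still_be_valid` are names I made up:

```python
from bisect import bisect_left


def completions(prefix, sorted_words, limit=5):
    """Return up to `limit` allowed words that start with `prefix`.

    `sorted_words` must be a lexicographically sorted list of lowercase words.
    """
    prefix = prefix.lower()
    start = bisect_left(sorted_words, prefix)  # first word >= prefix
    found = []
    for word in sorted_words[start:]:
        if not word.startswith(prefix):
            break  # sorted order means no later word can match
        found.append(word)
        if len(found) == limit:
            break
    return found


def can_still_be_valid(prefix, sorted_words):
    """True while at least one allowed word starts with what was typed so far."""
    return bool(completions(prefix, sorted_words, limit=1))
```

The editor could call `can_still_be_valid` on every keystroke and warn as soon as it turns false, rather than deleting the word after the space.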


There is a web version of this which I made a few days ago: https://news.ycombinator.com/item?id=11424197


And this was submitted before as well. https://news.ycombinator.com/item?id=11367972


The most annoying application on earth :)

A better application would be to underline words not included in the whitelist and to provide a synonym included in the whitelist upon right clicking.

But as I learned in school: a language's diversity is beautiful, why restrain ourselves in the vocabulary we use?

In particular, a boring text with a lot of repetition is just hard to read, see the discussion here: https://news.ycombinator.com/item?id=11131391


I wrote a simple editor that used Google Translate's web service to round-trip translate the text you enter in real time. I had been thinking about intermediate representations that might assist in the automatic translation of human languages. I wanted to see how one's writing might change to ensure that the round-trip translation was identical or still made sense, thinking this would improve the odds that the translated version actually meant what you intended it to.


While this is probably a joke, I would note that a person is rarely writing for consumption by all demographics. Vampire novels, comic books, chemistry textbooks, all have their audiences, and it is naive to say "but one day the whole world may wish to read my book, I must open the gates as wide as possible", because accessibility is not free.

Don't simplify or complicate your language without reason.


Great app, but this still doesn't address the issue of homographs, where a word can have wildly different meanings (some of them uncommon meanings) that just happen to be spelled the same.

For instance, in the live example, the word "Application" is used to refer to a computer program, when the more common meaning likely refers to the act of applying something, as in "The application of a band aid".


Check also Rewordify [1], provides suggestions alongside the original text, has "difficulty" settings, and more, very complete website, highly recommended. I spent ages looking for something like this.

1.- https://rewordify.com/index.php


I read years ago that the vocabulary used on TV was about 2,000 words. A high school graduate has a vocabulary of 10,000 words, a college graduate 30,000, and there are a million words in the English language.

What was interesting to me was that I could learn a foreign language by only learning 2,000 words!


That is interesting. What about languages other than English? Any idea about the avg. word count needed to learn them? (Used on TV does seem like a good benchmark for "learned".) Or is it roughly the same for every language, do you think?


"being able to read a newspaper" is a typical goal for pre-fluency language acquisition. In most languages I'm aware of the number of words required to do that is something around 1500-3000.


Yes. But I read somewhere that a newspaper vocabulary is targeted at somewhere around a 5th grade reading level.


I have no data, but I suspect it is similar.


A great idea. See, the point is to be understood by everyone. Nothing is gained by making crazy sentences out of words that most people have never heard of. Is that supposed to impress people?

I had a friend who seemed to start replacing dead-simple phrases like “seriously!?” with “are you being facetious!?”. Ugh, this doesn’t help at all. It will either scream “hey, everyone, I learned a new word!!!” to the people who know what you said, or “say what!?” to anyone else; and they will be too busy parsing your words to hear whatever you say next.

And this advice isn’t because everyone you meet will be 6 years old or a high school drop-out. Simple words are: (1) WAY easier to pick up for everyone, and (2) many people are not using their native language, including very smart people in fields where people like big words.


Thing Explainer was a nice experiment.... but that's exactly what it was. An experiment... not to be repeated.

English has such a wonderfully rich vocabulary... and no cow is too sacred for the 'innovators' of tech. Everyone: prepare to talk like 6-year-olds!


Guy Steele's _Growing a Language_ is a much better model than "1000 most common words" https://www.google.com/?#q=growing+a+language


How about modifying it so one can choose to use the 1,000, or 2000, 5000, or 10,000 most common words, for instance? Should be pretty easy and done dynamically (although tuning the number _down_ would require more work).
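
Roughly like this, assuming the word list is stored in order of frequency (most common first). `make_checker` is a made-up name, and the tokenizer is deliberately crude:

```python
import re


def make_checker(words_by_rank, limit):
    """Build a checker that allows only the `limit` most common words.

    `words_by_rank` is assumed to be ordered from most to least common.
    """
    allowed = set(words_by_rank[:limit])

    def disallowed(text):
        # Re-scan the whole text, so already typed words get flagged
        # again when the limit is turned down.
        tokens = re.findall(r"[a-z']+", text.lower())
        return [w for w in tokens if w not in allowed]

    return disallowed
```

Rebuilding the checker with a new limit and re-running it over the buffer covers both directions; the extra work when tuning down is deciding what to do with the words that are suddenly flagged.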


Has no one mentioned Basic English yet? https://en.wikipedia.org/wiki/Basic_English


Check EasyWrite: Cleartext alternative for the web: https://github.com/adeekshith/easy-write


The original machine had a base-plate of prefabulated amulite, surmounted by a malleable logarithmic casing in such a way that the two spurving bearings were in a direct line with the pentametric fan. The main winding was of the normal lotus-o-delta type placed in panendermic semi-boloid slots in the stator, every seventh conductor being connected by a nonreversible trem'e pipe to the differential girdlespring on the 'up' end of the grammeters.


Oh good, now I can make sure I'm following proper Newspeak!


Presumably the list includes the word "gimmick".


It would be nice if there was a setting that would make it not delete a word, but mark it in red. That way I can copy/paste things into it.


"At last!" he said. "My good sir! This is remarkable!"

Although in Trob the last word in fact became "a thing which may happen but once in the usable lifetime of a canoe hollowed diligently by axe and fire from the tallest diamondwood tree that grows in the noted diamondwood forests on the lower slopes of Mount Awayawa, home of the firegods or so it is said."


Has anyone ever written an artistically worthy novel with only those 1,000 words? Sounds like an interesting lipogram.

Also, is there a Vim plugin to do that?


Lucy Aikin was a 19th-century author who wrote novels using single-syllable words, including versions of Robinson Crusoe and The Swiss Family Robinson.

https://en.wikipedia.org/wiki/Lucy_Aikin


The irony that four words in the title are not in that 1,000 (in quotes here)

A "text" "editor" that only allows the 1,000 most "common" words in "English".

As determined by a similar online version that merely underlines words outside the vocabulary:

http://splasho.com/upgoer5/


"Use it to tell your family members why their computers act up, or tell people at work why they should pay you more."

Very clever application. You could use this idea to describe technical problems using only a dictionary of known nouns and verbs.


Delightfully Orwellian.


So we can all write like potheads think!

Before downvoting me, and you will, do consider that I have zero scientific evidence for the above sarcastic comment; however, anecdotally speaking, all long-term pot smokers that I happen to know seem to speak with 1000 common words, or less.


Hemingway[1] is a good editor too. It does not have word limitations. It does a good job to make sure your sentences are not complex, long, and boring.

1. http://www.hemingwayapp.com/


Oooo I would like to apply this as a filter, for various thresholds, to various favorite works of e.g. fiction. Not with find and replace; just a hard filter... very Oulipo...

The resultant word count plots would I am guessing cluster by author, maybe other things (genre?)...


A while back a friend wrote an Android app for this:

https://play.google.com/store/apps/details?id=com.fallinghaw...



  The aim of Newspeak is to remove all shades of meaning from   
  language, leaving simple concepts (pleasure and pain, 
  happiness and sadness, goodthink and crimethink) that 
  reinforce the total dominance of the State


Whenever I hear the word "amazing", the association that leaps most readily to the fore of my brain is the word "doubleplusgood".


I don't understand the appeal of this at all. The first demonstration gif shows how this mangles your text. Maybe this would be good for writing English for audiences who learned it as a foreign language.


This might actually be really useful when communicating online with developers overseas who don't speak English natively. I recently started using Upwork and found myself rewriting instructions to use common words.


This could actually be useful if it had a built-in word2vec dictionary/suggestion function. Don't just delete less common words; suggest simpler ones of equal meaning instead.
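
A bare-bones sketch of the lookup, assuming you already have word vectors from somewhere (a trained word2vec model, for example). The function names are hypothetical, and nearest-by-cosine only approximates "equal meaning":

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def suggest(word, vectors, allowed, n=1):
    """Return the `n` allowed words whose vectors are closest to `word`.

    `vectors` maps word -> vector; words without a vector are skipped.
    """
    if word not in vectors:
        return []
    target = vectors[word]
    candidates = [w for w in allowed if w in vectors and w != word]
    candidates.sort(key=lambda w: cosine(target, vectors[w]), reverse=True)
    return candidates[:n]


# Toy 2-d vectors, invented purely to show the mechanics (not real embeddings).
toy_vectors = {
    "utilize": (0.9, 0.1),
    "use": (0.8, 0.2),
    "consume": (0.2, 0.8),
    "eat": (0.1, 0.9),
}
print(suggest("utilize", toy_vectors, {"use", "eat"}))
```

With real embeddings you would want a similarity cutoff too, since the nearest allowed word can still be a poor substitute.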


I think this is really great. I have just tasked the people in my company to use this to explain our company in 100 words or less, without losing context and making sense. :p


It's interesting to see people criticizing a software tool for its potential uses and effect on society when that tool isn't cryptography.


Man, this thing totally should have been called Newspeak.


I feel like this would be much better as, say, a dictionary file for Word; the restriction seems too aggressive (i.e. what if I want a name?).


Actually, a version of this as an aspell dictionary would be lovely. Have your favorite editor highlight all the words that don't fit.


To make it simple, you would need to use <=1000 words _and_ avoid phrasal verbs. Otherwise it is a kind of cheating (at least, in English).


I don't believe this is a good idea. Limiting yourself to the 1000 most common words does not necessarily lead to clarity or simplicity.


Unclear on why this is a good thing. Dumbing down language helps whom, exactly? I guess if you want to hasten the Idiocracy this is your app.


Writing so that children, people for whom English is a second language, and people otherwise unable to comprehend advanced writing, can understand you is not a bad thing.

I'm considering applying this to my product documentation, even though our users are generally well-educated...we can't afford to have docs translated into a dozen languages, but simple English may be readable by enough foreign language speakers to make it worth the effort. English is a tough language; making it easier for new users of the language is great, IMHO.


Agreed. Some people think that big words are good. Me, I think good writing should be like good code; short and easy to understand.


Well clarity != short/simple. Which is sort of the joke of the whole book/this idea. It can often take many more words (which just makes an idea more complicated) if you rely on this kind of idea (which yes, I'm aware is a joke) than just using the correct word.

It's also very possible to get the gist of a word, based on where it is placed and the words around it -- even if you don't know exactly what it means. (Obviously I'm not suggesting that for something like documentation writing -- but by the same token I would never think that limiting yourself to a 1,000 word dictionary for something as important as product docs would be beneficial in any way).


I strongly suspect there is a happy medium to be found. Maybe more than 1000 most popular words...and maybe just augment a limited dictionary with an addendum for technical terms not in the short dictionary, but that are extremely common for the subject. E.g. in my case, maybe "domain" isn't in the dictionary, but I obviously need it to talk about DNS. I would reasonably expect even foreign language speaking administrators to know this word.

So, it's a joke, but maybe a useful thought exercise, too.


It's a joke. They even explain in the summary that it is a reference to Randall Munroe's book.


sorry you have been downvoted. i completely agree with you. i find this to be a truly misguided project.

edit: i see that some are saying it's a joke. Poe's Law [0]?

[0] https://en.wikipedia.org/wiki/Poe's_law


There are already enough agents out there trying to dumb us down. And I don't want to be one of them, so thanks, but no thanks.


This sounds awful. We need clear grammar, not limited vocabulary.

There's evidence of limited vocabulary and lazy expression in too many venues.


What would be interesting is a greasemonkey script that scales the size of words to be proportional to the log of their frequency.
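
The mapping itself is tiny. Here is a sketch in Python (an actual userscript would be JavaScript), assuming you have a count for each word and the count of the most common word. The pixel bounds are arbitrary defaults, and the scale is linear in the log between them rather than strictly proportional:

```python
import math


def font_size(count, max_count, min_px=10.0, max_px=32.0):
    """Map a word's frequency count to a font size on a log scale.

    Unseen words (count <= 0) get the minimum size; the most common
    word gets the maximum size.
    """
    if count <= 0:
        return min_px
    # +1 keeps the logarithm positive even for a count of 1.
    scale = math.log(count + 1) / math.log(max_count + 1)
    return min_px + (max_px - min_px) * scale
```

Flip the scale if you would rather have rare words stand out instead of common ones.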


They can also include a Toki Pona translator.


Have you written anything in Toki Pona before?


I was learning it and I could manage to understand some phrases. There are many "neologisms" for everyday items though, requires some thinking.


Isn't this a great opportunity to compress text? I wonder if there are algorithms that use dictionaries.


I can see a text compression scheme coming up. Convert a string to a list of numbers, use a lookup table, done.
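
A quick sketch of that, assuming lowercase text with no punctuation and a fixed word list shared by both sides. With 1,000 words every index fits in 10 bits; all function names are made up:

```python
def encode(text, index):
    """Turn space-separated words into their positions in the word list."""
    return [index[w] for w in text.split()]


def decode(codes, words):
    """Turn positions back into space-separated words."""
    return " ".join(words[c] for c in codes)


def pack(codes, bits=10):
    """Pack the indices into bytes, `bits` bits per word."""
    value = 0
    for code in codes:
        value = (value << bits) | code
    nbytes = (len(codes) * bits + 7) // 8
    return value.to_bytes(nbytes, "big")


def unpack(data, count, bits=10):
    """Recover `count` indices from bytes produced by pack()."""
    value = int.from_bytes(data, "big")
    mask = (1 << bits) - 1
    return [(value >> (bits * (count - 1 - i))) & mask for i in range(count)]


words = ["the", "cat", "sat", "on", "mat"]  # stand-in for the real list
index = {w: i for i, w in enumerate(words)}
codes = encode("the cat sat on the mat", index)
packed = pack(codes)
print(len(packed), decode(unpack(packed, len(codes)), words))
```

The word count has to travel with the packed bytes, and capitalization and punctuation are lost, so this is nowhere near a real format.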


To provide some context, Wikipedia has a version that's written using "Plain English" that's based on a limited set of common English words:

https://simple.m.wikipedia.org/wiki/Main_Page

Haven't had the chance to use the tool, but sounds like a good first step to making English text easier to read.


I guess it must ignore proper names, like Trump and Randall, detected by capitalization.


This allows more than the top 1000 words. The dictionaries clearly have 3000 lines


Suggestion for the next word would be cool. Since there aren't so many.


Oh, with the likely classifications of the word, such as verb, adverb, noun, etc, this could be really useful.


A little bit of coding could eliminate the need for the user altogether!


Ouch, that first example. "Hard to figure out for people"?


Could this application suggest substitutions from the 1000-word list?


Don't mean to sound like an insult, but this is the first tool (that I know of) intentionally designed to make people stupider.


Joke or not, if you haven't already, check out the Vsauce video "The Zipf Mystery" on YouTube; it's somewhat related.


Trump writes his victory speeches in this.



