I almost overlooked this article because I got turned off by the opening description in base 10, as there is a lot of math trivia out there that is specific to base 10 which holds little general significance.
But a little further down, the article discusses how this was discovered originally in base 3, and I think it's much simpler to understand in that context, since all primes except 3 (aka 10 base 3) end in just either 1 or 2:
"Looking at prime numbers written in base 3 — in which roughly half the primes end in 1 and half end in 2 — he found that among primes smaller than 1,000, a prime ending in 1 is more than twice as likely to be followed by a prime ending in 2 than by another prime ending in 1."
I wrote a program to count the pairs of adjacent last-digit occurrences in all bases up to 30, for the first 100 million primes, and found this property nearly always holds.
Quite interestingly, in all of the few cases where this doesn't hold, primes ending in the digit D are least-frequently succeeded by a prime ending in the digit D-2.
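For the curious, the counting loop was essentially the following (a simplified sketch assuming sympy is available, not my original program, which handled all bases up to 30 at once):

from collections import Counter
from sympy import nextprime

def pair_counts(base, n_primes):
    # Tally (last digit of p, last digit of the next prime) in the given base.
    # Starting above the base skips the few small primes that divide it.
    counts = Counter()
    p = nextprime(base)
    q = nextprime(p)
    for _ in range(n_primes):
        counts[(p % base, q % base)] += 1
        p, q = q, nextprime(q)
    return counts

# Base 3: the same-digit pairs (1,1) and (2,2) come up visibly less often.
for pair, count in sorted(pair_counts(3, 10000).items()):
    print(pair, count)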
It's not an unexpected property to me. In particular, I would never have assumed an even distribution of the last digits of primes. I'm not sure why this is so magical: it's only so if you assumed the digits should be evenly distributed to begin with, and there's no particular reason I can think of why that would be so.
Then you might be surprised that it's a mathematical fact that the last digit of primes is evenly distributed (among digits coprime to the base), no matter what base you choose. This is one way of stating the Chebotarev density theorem.
Believing that the last digit of primes is evenly distributed overall is not the same as thinking they do not exhibit patterns when viewed sequentially. I'm surprised by neither.
I'm a mathematician and I'd say their ideas only become interesting if that conjecture is true. Otherwise it's just numerology.
To me, this makes for a very boring notion of "interesting."
I think most mathematicians would say that the "interestingness" of a conjecture comes from it (1) describing a phenomenon which seems "intuitively true, or very likely true" (e.g. "x^n+y^n=z^n has no solutions for n>2") combined with (2) the initial difficulty of deciding its truth/falsity using tools available at the time of its statement; along with, finally: (3) the novel techniques (sometimes first arising in our brains decades or generations later!) required to ultimately determine said truth/falsity (and the degree to which these techniques touch on and illuminate other areas of mathematics).
For example, I think you'd find near-universal agreement among mathematicians that the resolution of FLT (as a conjecture stated by Fermat) would have been equally "interesting" if it had been proven false -- it may have even been more surprising if a counter-example had been found (or its existence proven), provided the tools / lessons were as interesting as those in the Taylor-Wiles result we know today.
Meanwhile, some of the most interesting conjectures are perhaps those that can't be decided, one way or another.
EDIT: If you don't like the idea of discussing the "what-ifs" of a conjecture that's already been decided (like FLT), just plug in any of the usual suspects, e.g. RH or GRH, into what I'm saying above. Clearly, a "false" determination on any of these major targets -- or even a serious hint at it -- would be a career-making achievement for an aspiring mathematician.
>> it may have even been more surprising if a counter-example had been found (or its existence proven), provided the tools / lessons were as interesting as those in the Taylor-Wiles result we know today.
What if a counterexample with very large (x,y,z,n) had been found somewhere in the late 1980s because enough megaflops to find it were finally allocated to the problem? Would that necessarily have been an interesting result?
> What if a counterexample with very large (x,y,z,n) had been found somewhere in the late 1980s because enough megaflops to find it was finally allocated
I'm curious why floating point operations would be the appropriate tool for finding solutions to a Diophantine equation. Did you have something in mind when writing this? Where can I learn more about it?
Not sure how to answer you on this (because there's a slight chance you might be trolling). Let's just say superficially "yes", in that it would mean the current expert consensus in our universe (that FLT has been proven) would have to be wrong.
But it's kind of a bad line of speculation (and so FLT probably wasn't the best illustrative example to bring up in my original post); again, for a real-life instance of a counter-example being found to a conjecture that had a lot of numerical evidence suggesting there wouldn't be one, have a look at the history of the Mertens Conjecture, and others of its ilk.
Basic point being that yes, counter-examples to interesting conjectures are always interesting results (and by themselves don't make the original conjecture any less interesting).
Odds are good that everyone knows the proof that all integers are "interesting". If not, the first non-interesting number would be interesting for being the first. (grin)
Looking at prime numbers written in base 3 — in which roughly half the primes end in 1 and half end in 2 — he found that among primes smaller than 1,000, a prime ending in 1 is more than twice as likely to be followed by a prime ending in 2 than by another prime ending in 1
is not interesting (as it seems to be just numerology), UNLESS the authors' conjecture is also true (that the statement also holds for bases > 2).
The important part is that the authors (and several others) have verified the statistics out to a few hundred billion primes -- and that while the bias does start to drop out, it does so "very slowly." That's what makes this result not "numerology."
"not proven true" is not the same as "proven false"
They made an observation and a conjecture. If the conjecture is proven false, it's obviously uninteresting. If the conjecture isn't proven either way, it could be argued that it's just apophenia. I'm not sure I agree, but it's not an unreasonable stance.
It would still be interesting if it holds for all primes rather than just the first ones, as it imposes fundamental structure on the distribution of primes. The base 3 calculation is simply the form that structure takes.
Don't they call that the "strong law of small primes": you can discover lots of patterns in the first millions (or now billions) of primes that don't mean anything and don't hold up?
I think you're referring to the "strong law of small numbers", as in this excellent article by Guy [1]. But the idea is the same --- so very frequently, patterns that hold for even the first several million numbers eventually fail to continue.
I think it would still be interesting even if it proved false, simply because it appears to be true for small numbers. That would be weird, and weird is interesting.
>I almost overlooked this article because I got turned off by the opening description in base 10, as there is a lot of math trivia out there that is specific to base 10 which holds little general significance.
If a high school science insight seems to be able to "shoot down" a new scientific discovery, chances are what the discovery says has not really been understood properly.
The property they discovered is orthogonal to numeric base.
But if high school science insight seems to shoot down a popular math article, it's probably right. Popular articles on technical subjects (all subjects?) simply aren't very good and trivia involving digits in base-10 is exactly the sort of thing that the popular press loves to fawn over.
I think that's what the original commenter was getting at: it's not that the actual discovery is suspect, it's that the article presenting it was—even though the discovery itself is interesting.
The article is too vague to assess how interesting the claims are, sadly.
It's too bad; I think it wouldn't have detracted from the article to put some more math in. It's not, on the face of it, at all surprising that sequential primes are more likely to be close to each other modulo any number (3, or 10, or what have you) than they are to be far apart.
By way of analogy, a train comes at 1:09pm. Trains come about every 5 minutes between 1 and 2 pm, and only on odd numbers. If you simulate a bunch of random 'next trains', 1 is much more likely than 9 because P(9) approx = !P(1,3,5,7). This is true for all bases.
I think what you'd need to be able to say something interesting is: 1) calculate the odds of finding the next prime; 2) randomly generate numbers with a distribution similar to that of prime occurrence in that range, using the Prime Number Theorem at the very least (roughly 1/log(n) probability); 3) check final digits and compare to the actual distribution of final digits.
If those numbers are very different, then you have in fact found some underlying structure. But the article doesn't hit very hard on this angle, and it's hard for (probably) any of us to say, just thinking about it with minimal data, whether or not there's structure.
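To make steps (2) and (3) concrete, here's a rough sketch of the comparison I have in mind, assuming sympy is available and using the naive 1/log(n) model (which even keeps even numbers, so it's only a first cut):

import math
import random
from sympy import isprime

def last_digit_pairs(is_member, lo, hi):
    # Tally final-digit transitions between consecutive members of a set.
    counts, prev = {}, None
    for n in range(lo, hi):
        if is_member(n):
            if prev is not None:
                key = (prev % 10, n % 10)
                counts[key] = counts.get(key, 0) + 1
            prev = n
    return counts

random.seed(0)
# Real primes vs. a random set with the same 1/log(n) density.
actual = last_digit_pairs(isprime, 11, 10**6)
model = last_digit_pairs(lambda n: random.random() < 1 / math.log(n), 11, 10**6)
for key in sorted(actual):
    print(key, actual[key], model.get(key, 0))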
> Lemke Oliver and Soundararajan’s first guess for why this bias occurs was a simple one: Maybe a prime ending in 3, say, is more likely to be followed by a prime ending in 7, 9 or 1 merely because it encounters numbers with those endings before it reaches another number ending in 3. For example, 43 is followed by 47, 49 and 51 before it hits 53, and one of those numbers, 47, is prime.
> But the pair of mathematicians soon realized that this potential explanation couldn’t account for the magnitude of the biases they found. Nor could it explain why, as the pair found, primes ending in 3 seem to like being followed by primes ending in 9 more than 1 or 7. To explain these and other preferences, Lemke Oliver and Soundararajan had to delve into the deepest model mathematicians have for random behavior in the primes.
They did mention this, but they didn't talk real numbers. And, my second point is (I think) slightly more subtle -- the probability distributions need to be considered, not just the counting upward angle.
As I'm writing this out, I'm a little less sure that this would matter, but I'll leave the comment up for the sake of discussion. :)
I don't know why you're nitpicking. The article's written for a more general audience that may be interested in the property, but not necessarily the nitty-gritty math behind it.
For your second point, I don't think there's anything wrong with a paper announcing they found something interesting, even if they haven't completely analyzed every aspect of it. Getting the info out early lets a wider audience look at it, and opens their current research up to scrutiny.
The article is too vague to assess how interesting the claims are, sadly.
By my reading, the article seems to state the key import of the finding quite clearly:
"This conspiracy among prime numbers seems, at first glance, to violate a longstanding assumption in number theory: that prime numbers behave much like random numbers."
However this statement:
If you simulate a bunch of random 'next trains', 1 is much more likely than 9 because P(9) approx = !P(1,3,5,7). This is true for all bases.
I'm afraid I don't follow at all. (Do you really mean we should expect that P(1|1) > P(1|9), for either random trains or for subsequent primes? Say wha?)
That said, perhaps you might want to skip straight to the arxiv article itself, or perhaps do some experiments on your own. It's definitely not hard to generate a non-"minimal" amount of data (out to the first few million primes or so) on one's laptop, these days.
Numbers modulo n (i.e. the last digit in base n) are everywhere in number theory and their properties even form the basis of cryptography. What you see as "math trivia" most mathematicians see as mathematics.
I'd guess the GP was pointing out that there are uninteresting results that those less qualified in maths see as some sort of magic. Maybe like those silly "choose any number, multiply by x, etc., and did you get 7? Magic!" games. Or:
9*9+7 = 88
98*9+6 = 888
987*9+5 = 8888
While not exactly trivial, and there's still maths required to understand the patterns, this prime result is much deeper and more interesting. The GP almost passed over it because they thought it might fall into the less interesting category.
That's true, but noticing a phenomenon using an arbitrary base isn't necessarily meaningful. Such as how, in base 10, the digits of numbers divisible by 3 or 9 sum to a number divisible by 3 or 9, respectively. There is an underlying principle here, but the specific presentation is just an artifact of the base. This article does a poor job explaining what (or whether) there is some new principle at work here, or if this pattern is just a specific presentation of using base 10.
Another thing that kind of irked me was this section:
>This conspiracy among prime numbers seems, at first glance, to violate a longstanding assumption in number theory: that prime numbers behave much like random numbers...
I was under the impression that Ulam's Spiral was a much earlier indication (1960s) that primes weren't really as random as we thought.
I almost overlooked this article because I got turned off by the opening description in base 10
The article talks about the last digit: That's also the remainder upon dividing by 10, a perfectly sensible thing to discuss. Similarly what it discusses in base 3 is the last digit, the remainder upon dividing by 3. Properties of numbers, particularly primes, modulo other numbers have been much studied by number theorists.
I tested this in base three. For primes below 1000, primes were followed by primes ending in another digit 2.1 times as often as they were followed by primes ending in the same digit. But the ratio shrank as the number of primes grew. By the time I was up to primes below 1,000,000,000, the ratio was only 1.2.
That hints that it's an effect that asymptotically disappears...
I'm not certain, but I believe the subtle point parent was trying to make, is the difference between "base 10" and "base ten". Every base is "base 10" in its own base ;)
Ten is "decem" in Latin. Decimal means, literally, base ten. It's a tautology either way. You can't even specify the base of the base since "base (10 in base 10)" presents the same problem recursively.
It's a non-problem though. I don't know about mathematicians, but when programmers say base N they mean N in decimal.
"10" is never correctly called "ten" except in decimal. But this whole tangent is getting a little navel-gazey for me. "Base 10" is fine in my book because decimal is our inbuilt base.
I get the parent poster's point, but to me it's like quibbling over people using "comprised of" instead of "comprises": some people will care, but in the grand scheme of things it's not a meaningful disagreement because everybody got what you were trying to say the first time, and there are only a few rare contexts where the distinction matters.
"If Alice tosses a coin until she sees a head followed by a tail, and Bob tosses a coin until he sees two heads in a row, then on average, Alice will require four tosses while Bob will require six tosses (try this at home!), even though head-tail and head-head have an equal chance of appearing after two coin tosses."
> Intuitively, first, both have to get a head. After that, if Alice "fails" by getting a head, then she still needs only one tail. Her first head doesn't get "reset" by failing her second try. But after getting a head, if Bob fails by getting a tail then he does get reset -- he has to start all over.
What's interesting is that if you reword this slightly and ask, "If you flip coins until you get either a head followed by a tail, or a head followed by a head, how many flips on average are required before you stop on a head-then-tail, versus on a head-then-head?", the answers are 3 and 3 respectively.
But worded as "flip coins until you get a head followed by a tail" or "flip coins until you get a head followed by a head", the answers revert back to 4 and 6.
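All three numbers (the 4, the 6, and the 3s) are easy to check by simulation; a quick sketch:

import random

def flips_until(targets, trials=100000):
    # Average number of flips until the last two flips form one of the targets.
    total = 0
    for _ in range(trials):
        prev, n = random.choice("HT"), 1
        while True:
            cur = random.choice("HT")
            n += 1
            if prev + cur in targets:
                break
            prev = cur
        total += n
    return total / trials

print(flips_until({"HT"}))        # ~4.0: Alice's stopping rule
print(flips_until({"HH"}))        # ~6.0: Bob's stopping rule
print(flips_until({"HT", "HH"}))  # ~3.0: stop at whichever comes first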
Are you saying that HH causes an early failure for HT, instead of a potentially longer success HHHT?
If so, it is poorly worded, being ambiguous about how to count failures. (In the 4-6 variation, there are no failures.)
I'm saying that if you just keep flipping coins (So, no Bob/Alice situation), and stop whenever you hit HH or HT, that the average number of flips to get to HH is 3, and the average number of flips to get to HT is 3.
But, if you are focussing on a particular scenario that you will flip coins until you get to HT, the average number of flips will be 4, and if you flip coins until you get to HH, the average number of flips will be 6.
I just find that really hard to grasp intuitively.
"and stop whenever you hit HH or HT" is equivalent to say "whenever you hit H, flip one more then stop". Average to hit H is obviously 2, plus one is 3, so you are right, but IMO it's not that counter-intuitive.
Do I understand your first scenario correctly, that if for example you flip TTHH, you stop there, count that as "took four flips to get to HH," count nothing for HT, and then start over?
Seems like you're just taking a biased sample, which cancels out the differences. To take an extreme example, imagine one candidate is HHHHHHHHHH and the other candidate is any other sequence of ten flips. In the "try until you get either one" scenario, the average number of flips for either one will be 10. Testing them independently, the average number of flips for the second one will be slightly over 10, and for HHHHHHHHHH it'll be huge.
I don't think that's true. I think that the average number of flips before _stopping_ might be 3, but it doesn't become more likely to get HT simply because you're also looking for HH (or vice versa).
It turns out that the average number of flips to get either HH or HT is 3; someone else on the thread described it well: on average, the number of flips to get an "H" will be 2, and then you will always stop on the next flip, 50% of the time with an HT and 50% with an HH. Explained that way, it makes sense that the average number of flips is 3.
But, I still find it strange that if you are flipping with one particular scenario in mind, HT or HH, that the average number of flips goes from 3 to 4 or 6, even if I can reason it out with a bit of thinking.
A similar property was recently taken advantage of to reduce the time needed to brute-force older garage door openers; roughly speaking openers look for a particular base-3 string, but don't require a start/stop sequence, so you can try N length M permutations with much less than N*M symbols transmitted.
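For anyone curious, the trick is essentially a de Bruijn sequence: a stream in which every length-M window appears once, so all k^M codes go by in about k^M symbols instead of M * k^M. A sketch of the standard construction (illustrative only, not the actual opener protocol):

def de_bruijn(k, n):
    # Shortest cyclic string containing every length-n word over k symbols.
    a = [0] * k * n
    sequence = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(str(x) for x in sequence)

s = de_bruijn(3, 4)  # every base-3 code of length 4
print(len(s))        # 81 symbols, versus 4 * 81 = 324 sent one code at a time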
Ah, thanks for that. So it's not a property of the coin toss itself, but the fact that a failure at the second step in the series only resets to before step 2, instead of before step 1.
I thought the article was saying Alice must get a head followed immediately by a tail. If that's not the case, then it makes total sense, but it would seem the article is a bit vague about that.
Actually, the details in the article say:
>even though head-tail and head-head have an equal chance of appearing after two coin tosses.
That implies that the tail is expected immediately after the head for Alice's goal.
Correct, but the point is if Alice doesn't get a tail, that means she got a head, so she is still in the same position as she was before the flip, only needing a single tail to complete the sequence. If Bob gets a head, then a tail, he now needs two consecutive heads to complete his sequence.
Yeah, thought so - it's specifically heads->tails, rather than either heads->tails or tails->heads being okay. So the probabilities don't seem remotely counterintuitive, just crappily communicated IMO.
I'm not sure your explanation fits with the probabilities. Pointing out that you're only looking for heads->tails and not tails->heads seems to suggest that the HT pattern should be less likely than the HH, but it's actually more likely (for reasons explained well by others).
I don't have time to put numbers on it, but consider the following with 3 tosses only. There are 8 different enumerations:
* HHH
* THH
* HTH
* TTH
* HHT
* THT
* HTT
* TTT
Alice is looking for HT, so she will succeed in HTH, HHT, THT, HTT, that is, 4 out of 8 possible outcomes. Bob on the other hand is looking for HH, which appears only in HHH, THH, HHT: 3 out of 8 possible outcomes. So while HH and HT are equal in probability when you consider 2 coin flips, the combination HT happens more often than HH. This is the case with 3 coin flips; there is no guarantee it translates to the same with more coin flips, but that is my bet.
Amusingly, the question "which substring occurs earlier on average" is different from the question "which substring is more likely to occur before the other". In fact the second question sometimes has a circular answer! For example, THH typically (with >50% probability) occurs before HHT, which typically occurs before HTT, which typically occurs before TTH, which typically occurs before THH.
Also the question "which substring occurs earlier on average" is intimately connected with algorithms for substring search. For example, if you want to check that a string doesn't contain HHH, you need to look at every third character, but for THH that's not enough.
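(This is Penney's game, incidentally.) The circular claim is easy to check by simulation; a small sketch:

import random

def first_wins(a, b, trials=20000):
    # Fraction of games in which pattern a shows up before pattern b.
    wins = 0
    for _ in range(trials):
        s = ""
        while True:
            s += random.choice("HT")
            if s.endswith(a):
                wins += 1
                break
            if s.endswith(b):
                break
    return wins / trials

# Each left-hand pattern beats the right-hand one more than half the time.
for a, b in [("THH", "HHT"), ("HHT", "HTT"), ("HTT", "TTH"), ("TTH", "THH")]:
    print(a, "vs", b, ":", first_wins(a, b))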
Uh. I totally forgot everything about statistics and probabilities (and I'm too lazy to refresh my memory, so I won't check the numbers 4 and 6), but I think the core idea of how it works is that Bob's option just has lower chances. Say we toss a coin up to 3 times.
An easy way to understand it is by thinking about bunching. Since you're only flipping until you hit the first matching sequence, on average you'll hit the more evenly distributed sequence more quickly than the bunched sequence.
Multiple heads in a row are more bunched than transition sequences because, for example, a sequence of three heads in a row will include two sequences with two heads in a row. You can't do that with a transition sequence--it takes at least four tosses to get two identical transition sequences.
The most intuitive way to answer this question is that you're not comparing individual coin flips but pairs of adjacent coin flips in a longer sequence of ones. These pairs are no longer independent trials: if your first two coin flips are TH, then getting an HH if you look at the second two is much more likely than getting a TH.
Starting from scratch, they first need to get a head. This takes 2 tosses on average (1 with 50% probability, 2 with 25% probability, 3 with 12.5% probability, etc.). At that point, both Alice and Bob have 50% chance of getting the target sequence with one additional toss.
In the case of failure, Alice still has 50% chance of success in each subsequent toss. On average she will need two additional tosses to get a tail and the answer is 2+2=4.
In the case of failure, Bob has to start again. If we call the answer x, we can write x = 2 + 0.5*1 + 0.5*(1+x), and solving the equation we get x = 6.
Alice: first toss: head. Second toss: head. She can take that second toss as the beginning of the new sequence, and if she gets a tail on the third toss she is done.
In Bob's case, if he gets tail on the second toss, that toss no longer counts and he must get head for the new streak to begin.
The results are particularly striking in base 11 - looking at primes below 100 million, only 4.3% of primes ending in 2 are followed by another prime ending in 2 (compared to the 9.1% you would naively expect) with similar numbers for other pairs.
A prime ending in 2 (in base 11) is also unlikely to be followed by a prime ending in 5, 7 or 9, whereas it is particularly likely to be followed by a prime ending in 4 or 8.
It would be interesting to know what structure there is (if any) in this NxN "transition matrix" for various bases.
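The matrix is straightforward to build; here's a sketch for any base, assuming sympy (a smaller run than the one behind the numbers above):

from sympy import nextprime

def transition_matrix(base, n_primes):
    # counts[i][j] = times a prime ending in i is followed by one ending in j.
    counts = [[0] * base for _ in range(base)]
    p = nextprime(base)
    q = nextprime(p)
    for _ in range(n_primes):
        counts[p % base][q % base] += 1
        p, q = q, nextprime(q)
    return counts

m = transition_matrix(11, 100000)
row = m[2]  # primes ending in 2 (base 11)
total = sum(row)
for digit, c in enumerate(row):
    if c:
        print("2 ->", digit, round(100.0 * c / total, 1), "%")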
Wow, that's really interesting. The same seems to be true for base 7; I haven't tried any other prime bases yet (and I don't know how it might extend to non-prime bases). Anyone have an idea why this seems to hold?
Edit: This basically works for base 10, too. I feel like the reason must either be very obvious or very deep.
Well, whether it's obvious or not, I think it's in the paper. Immediately under equation (1.1) of http://arxiv.org/abs/1603.03720, the authors are discussing the second correction term to the distribution of primes mod q, and they state:
"We can also show that c2(q; (a, b)) = c2(q; (−b,−a)) for any two reduced residue classes a and b (mod q)."
I'm not 100% certain this is responsible for the phenomenon that we're seeing, but it seems exceedingly likely. I think I'd need to stare at their formula for c2 for a long time to understand where this relation comes from, though.
I'm pretty solidly convinced of the pattern at this point: I've checked the "reflect across the anti-diagonal" pattern for the first 10 million primes, expressed in every base from 3 to 20, and it seems to hold up. (I haven't tried to establish any sort of bounds.)
As I've expressed in another comment, my preferred way of thinking about this symmetric pattern goes something like this:
The probability that (a prime congruent to x mod b is followed by a prime congruent to y mod b) seems to be equal to the probability that (a prime congruent to -y mod b is followed by a prime congruent to -x mod b).
I still haven't figured out whether it ought to be obvious, though if it is then I expect the language I've used above to be relevant. It's definitely not trivially obvious, because it's not an exact equality: if the pattern only shows up clearly after you've accumulated thousands or millions of primes, then it doesn't seem that it could be enforced by any sort of exact transformation. (For example, if the symmetric entries were somehow just counting the same pairs in two different ways, the numbers ought to be precisely equal rather than just increasingly close.)
There are rather a lot of other patterns in the data; I expect that at least some of them must be accounted for in the original paper, but I haven't more than glanced through it yet.
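For concreteness, the reflection check amounts to something like this (a simplified sketch assuming sympy, not the exact program I used):

from sympy import nextprime

def reflection_check(base, n_primes):
    # Compare each pair (x, y) against its mirror (-y mod base, -x mod base).
    counts = {}
    p = nextprime(base)
    q = nextprime(p)
    for _ in range(n_primes):
        key = (p % base, q % base)
        counts[key] = counts.get(key, 0) + 1
        p, q = q, nextprime(q)
    for (x, y), c in sorted(counts.items()):
        mirror = ((-y) % base, (-x) % base)
        print((x, y), c, "<->", mirror, counts.get(mirror, 0))

reflection_check(7, 100000)  # mirrored counts track each other closely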
So a number ending in 'x' is as likely to be followed by 'y' as a number ending in 'n-x' is to be preceded by a number ending in 'n-y' (where n is the base).
Can't think of a trivial reason why that would be the case, something weird is happening.
Framing this differently, a prime congruent to x mod b is as likely to be followed by a prime congruent to y mod b as a prime congruent to -y mod b is to be followed by one congruent to -x mod b.
Why would you expect 9.1%? At low numbers like these, primes are likely to be closely packed, meaning you are not as likely to have to search forward 11 numbers as just dividing by 11 would imply.
Miscalculation - obviously no primes end in 0 (base 11) so I should have said "10% as you would naively expect".
I take your point about primes being closely packed at low numbers, but I think this is a small correction (i.e. you might expect 8-9% of primes ending in 2 to be followed by another prime ending in 2, but certainly not <4%)
I did my own investigation using base 3 and noticed something peculiar.
In the first 100k primes, we go from 1 to 2 29028 times and from 2 to 1 29029 times.
Then I filtered out the twin primes since those are the ones that exploit the fact that the next "possible" prime is one that flips the last digit.
This filtered out 10249 primes going from 2 to 1 (bringing the total below both the number of primes that stay at 2 (21008) and the number of primes that stay at 1 (20932)).
It didn't factor out any primes that go from 1 to 2. Are there no twin primes (p,p') where p%3 == 1 and p'%3 == 2?
edit: Oh hey, this is obvious, if p%3 is 1 then p+2 is divisible by 3. It does mean we don't need to take measurements to know that the result we are investigating cannot possibly account for everything since it isn't a factor at all when going from 1 to 2.
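In code, the filter amounts to something like this (a sketch assuming sympy, not my original program):

from collections import Counter
from sympy import nextprime

raw, nontwin = Counter(), Counter()
p, q = 5, 7  # start past 2 and 3, so every residue mod 3 is 1 or 2
for _ in range(100000):
    key = (p % 3, q % 3)
    raw[key] += 1
    if q - p > 2:  # drop twin-prime pairs
        nontwin[key] += 1
    p, q = q, nextprime(q)

print(raw)      # (1,2) and (2,1) dominate, in nearly equal numbers
print(nontwin)  # the twins all came out of (2,1); (1,2) is untouched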
> Are there no twin primes (p,p') where p%3 == 1 and p'%3 == 2?
By definition, no there aren't. Twin primes are a distance of 2 apart, so your mod 3 options are 0 -> 2, 1 -> 0, and 2 -> 1. Anything involving 0 means it's divisible by 3, so your only mod 3 possibility for twin primes is 2 -> 1.
EDIT: Excluding the possibility where 3 mod 3 = 0, of course. This allows a 0 -> 2 transition with (3,5).
Yeah, I managed to get an edit in just before you replied -- I thought I was missing something obvious but I kept looking for a bug in my code instead of realizing that it's mathematically obvious :)
Here is my attempt to work through the math and figure out how "surprising" this result is.
Clearly, we should expect that for small primes (< 100e6) it is less likely that a prime ending in K (in base B) will be followed by another prime ending in K - because for that to happen, none of the B-1 numbers in between can be prime.
A (very naive) model of the distribution of primes says that every number n has probability p(n) = 1/log(n) of being prime. Assume that a number n ends with a k in base b. Define p = 1/log(n). Then the probability that the next prime ends in k+j is, roughly,
q(j) = p * (1-p)^(j-1) * sum_{i=0}^{infinity} (1-p)^(i*b)
= p * (1-p)^(j-1) / (1 - (1-p)^b)
In this formula, j takes values 1 to b (where j = b represents another prime ending in k).
For n ~ 1,000,000 and working in base b = 10, under this model we would expect to see around 6.97% of primes ending in k followed by another prime ending in k, whereas we expect to see 13.7% followed by a prime ending in k+1 (it is apparent how naive the model is, since in fact we never see a prime ending in k followed by a prime ending in k+1, except for 2, 3). It would not be hard to extend the model to rule out even numbers, or multiples of 3 and 5, but I have not done this.
Around n ~ 10^60 the distribution starts to look more equal, as the primes are "spread out" enough that you expect to have long sequences of non-primes between the primes, which blurs out the distribution to be roughly constant.
I think this is what the article is getting at when it quotes James Maynard as saying "It's the rate at which they even out which is surprising to me". With a naive model of 'randomness' in the primes, you expect to see this phenomenon at low numbers (less than 10^60) and for it to slowly disappear at higher numbers. And indeed, you do see that, but the rate at which the phenomenon disappears is much slower than the random model predicts.
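Plugging numbers into the model is a one-liner; a sketch of the calculation behind the percentages above:

import math

def q(j, n, b):
    # Model probability that the next prime's last digit is (k + j) mod b,
    # with p = 1/log(n) and j running from 1 to b (j = b means a repeat of k).
    p = 1 / math.log(n)
    return p * (1 - p) ** (j - 1) / (1 - (1 - p) ** b)

n, b = 10**6, 10
print(round(100 * q(b, n, b), 2))  # ~6.97%: same last digit again
print(round(100 * q(1, n, b), 2))  # ~13.7%: the very next ending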
> This conspiracy among prime numbers seems, at first glance, to violate a longstanding assumption in number theory: that prime numbers behave much like random numbers.
You can see that while prime numbers are difficult to predict, they're anything but random. I'm not sure why the article claims that mathematicians used to think the primes were evenly distributed, which is complete and utter nonsense.
At a high level it's not that far off, in the sense that most mathematicians think nontrivial patterns in the primes are at least unusual, and they mostly behave randomly. But it's true that there is some structure, which is in jargon terms called a "conspiracy" among the primes when it's found or hypothesized. As Terence Tao summarizes it,
> We believe that the primes do not observe any significant pattern beyond the obvious ones (e.g. mostly being odd), but we are still a long way from making this belief completely rigorous.
Especially relevant are slides 10-11 on treating the primes as a pseudorandom set, and then slides 14-15 on using pseudorandom models of the primes to rigorously (vs. heuristically) prove theorems. That's done by classifying and ruling out all possible ways nonrandom structure in the actual primes (the "conspiracies") could sink the specific theorem being proven.
Can anyone say what the security implications of this are? Intuitively, it would seem the less 'random' primes appear to be, the easier it would be to factor the composite of two prime numbers.
The hardness of factoring is not in the finding prime numbers. You could hand someone a list of all primes in the security-relevant sizes, and it's not going to help them much.
Yeah, in practice it'll affect the total computation time, but generally security people tend to assume extremely generous fudge factors anyhow. A lot of times when you see security papers talking about how something takes "2 to the 50 operations", they're referring to the full process of hashing some string 2 to the 50 times or something like that, rather than 2 to the 50 CPU cycles. When so many security operations involve things getting exponentially harder as you add bits, there's not much point in trying to shave an order of magnitude here or there; you just go ahead and make things that are secure even if the entire universe is converted into computronium and dedicated to brute-forcing your security. (Because so far, that's never been the ultimate security problem.)
On top of what others have said, this actually doesn't apply to RSA at all. There are ways to factor the prime-product used in RSA faster if you believe the 2 primes are close together[1], so any good RSA key generator should be picking primes that are far apart (when I wrote a toy key generator for a class I just tried to make the second prime more than twice the first), and this "conspiracy" only applies to consecutive primes. As you get further away from the original prime, you get less info about what the last digit could be.
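For anyone wondering why close primes are dangerous: Fermat's factorization method walks a upward from sqrt(n) until a^2 - n is a perfect square, and the number of steps grows roughly like (p - q)^2 / sqrt(n), so nearby primes fall out almost immediately. A sketch:

import math

def fermat_factor(n):
    # Find a, b with n = a^2 - b^2, giving n = (a - b)(a + b).
    a = math.isqrt(n)
    if a * a < n:
        a += 1
    while True:
        b2 = a * a - n
        b = math.isqrt(b2)
        if b * b == b2:
            return a - b, a + b
        a += 1

print(fermat_factor(101 * 103))  # (101, 103) pops out on the very first step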
Never say "never", but it would seem that this would only help in breaking a procedure that used consecutive primes. Every system of which I'm aware generates the primes it requires without reference to the primes that come before or after.
Why the downmods? Everyone can easily verify that there are similar biases if you replace "isPrime(n)" with "random() < 0.1" in the various code snippets floating in the thread. The article even admits that the biases are explained by the prime k-tuples conjecture, which is a model of randomness in primes from 1923. So primes are not less random than we thought -- they are still exactly as random as we thought.
For those willing to try this over toy code, I did a (horrible, horrible, I'm terribly ashamed of it) quick Python snippet to check it out:
def primer():
    p = 3
    while True:
        is_prime = True
        for x in xrange(2, p):
            if p % x == 0:
                is_prime = False
                break
        if is_prime:
            yield p
        p += 2

give_prime = primer()
primes = [1, 2]  # had to separate this into 2 lines because Python
primes.extend([give_prime.next() for x in xrange(9998)])  # so we get 10,000 primes

primes_dict = {}
for i in xrange(len(primes) - 1):
    p0 = str(primes[i])[-1]
    p1 = str(primes[i + 1])[-1]
    key = "".join([p0, "-", p1])
    try:
        primes_dict[key] += 1
    except KeyError:
        primes_dict[key] = 1

# let's delete the 4 outliers from the beginning
del primes_dict["1-2"]
del primes_dict["2-3"]
del primes_dict["3-5"]
del primes_dict["5-7"]
So long story short, my results over 10,000 primes:
And you can clearly see that the tendency to avoid the same last digit is starting to show, though those that end in 1 are still not showing it completely. Tried with 100,000 primes but the (horrible) algorithm kinda got stuck, so I settled with 10,000 to make this a "quick test".
Before you go, please believe me I'm sorry for primer() and give_prime. I'll try to never do those kind of things again.
Edit: I've edited this like 5 times already over little typos and bad transcription mistakes I did all over the place. Should work now.
In the spirit of "every programmer loves to fizzbuzz", I rewrote this in Rust. Aside from rewriting it in a different language, the biggest change I made was only doing trial division against known primes <= the square root of a number we are checking for primality. Able to get the first 1,000,000 primes in 20 seconds:
use std::collections::HashMap;

pub fn first_n_primes(n: u64) -> Vec<u64> {
    let mut primes = Vec::new();
    let mut candidate = 3;
    let mut count = 0;
    if n >= 1 {
        primes.push(2);
        count += 1;
    }
    while count < n {
        let candidate_sqrt = ((candidate as f64).sqrt().ceil() + 1.0) as u64;
        let mut is_prime: bool = true;
        for prime in &primes {
            if candidate % prime == 0 {
                is_prime = false;
                break;
            }
            if prime > &candidate_sqrt {
                break;
            }
        }
        if is_prime {
            primes.push(candidate);
            count += 1;
        }
        candidate += 2;
    }
    primes
}

fn main() {
    let mut last_digit_pair_counts: HashMap<String, u64> = HashMap::new();
    let primes = first_n_primes(1000000);
    for i in 0..(primes.len() - 1) {
        let last_digit0 = primes[i] % 10;
        let last_digit1 = primes[i + 1] % 10;
        let digit_str = format!("{}-{}", last_digit0, last_digit1);
        let counter = last_digit_pair_counts.entry(digit_str).or_insert(0);
        *counter += 1;
    }
    last_digit_pair_counts.remove("2-3");
    last_digit_pair_counts.remove("3-5");
    last_digit_pair_counts.remove("5-7");
    let mut ordered_keys: Vec<String> = last_digit_pair_counts.keys().cloned().collect();
    ordered_keys.sort();
    for key in &ordered_keys {
        println!("{}: {}", key, last_digit_pair_counts[key]);
    }
}
Small speed note: Rust 1.7 just made custom hashing functions available. Rust uses a cryptographically secure hashing function by default, but for work like this, that's inappropriate. To use FNV instead:
1. add fnv to your Cargo.toml's dependencies section
2. The first few lines become:
extern crate fnv;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;
use fnv::FnvHasher;
type MyHasher = BuildHasherDefault<FnvHasher>;
3. and the hash creation line becomes
let mut last_digit_pair_counts: HashMap<String, u64, MyHasher> = HashMap::default();
This is 17% faster on my two runs of your code. :)
Oh, one other thing: I noticed you said 20 seconds. Mine took about 4 seconds, and I was wondering where the difference was... did you compile with optimizations? Without, it takes 22s on my machine... `--release` as an argument to Cargo, or `-C opt-level=3` as an argument to rustc.
primes = {}

function inPrimes(n)
    for _, v in ipairs(primes) do
        if n % v == 0 then return false end
        if v > math.ceil(math.sqrt(n)) then break end
    end
    return true
end

for i = 3, 1.6e7, 2 do
    if inPrimes(i) then table.insert(primes, i) end
end

last = '7'
totalDigits = {}
for i = 4, #primes do
    c = last..tostring(primes[i]):sub(-1)
    totalDigits[c] = totalDigits[c] and totalDigits[c] + 1 or 1
    last = c:sub(-1)
end

for k, v in pairs(totalDigits) do print(k, v) end
Gets just as many primes and runs in 7 seconds in LuaJIT.
Could you try running both on the same machine? I'm curious if LuaJIT can still beat Rust if both have optimizations working.
I know it can beat native code sometimes, which is pretty impressive (it finds common cases and specializes to them AFAIK, almost like "sufficiently advanced optimizing compiler" fairy tales).
I don't know how to Rust, but you are welcome to try it. LuaJIT is amazingly fast. It shouldn't be faster than native code in general, but it's still not orders of magnitude behind like interpreted languages are. As I understand it, JITs can sometimes do better by doing statistics on code paths and optimizing them.
And of course, it takes 0 seconds to compile, if you factor in that time : )
Nice. I suppose I should try optimizing the Lua code some more. There are some nasty branches in there that might slow it down.
The Lua code is not exactly identical to the rust code. I test all numbers less than n, as opposed to counting n primes. I set n so it got slightly more primes than the rust code though.
He diagnosed it correctly - I was running debug builds instead of release builds. Shaves 80% off my 20 sec runtime without including any of the other optimizations he shared.
The hashing optimization isn't necessary as using String at all is wasteful - my code ended up being simpler, but in the end the largest gain came from replacing u64 with u32 - see https://news.ycombinator.com/item?id=11290955.
It's good to see this in a different language. I like how, when working on larger data, the 3-9 transition loses a big part of its "advantage" over the 3-7 transition. 9 is still the preferred/most probable "end-digit follower" for primes ending in 3, but it dropped from being almost 20% more probable (10k) to about 12% (1m).
A couple of useful speedups for your loop: you only need the xrange to go up to the floor of sqrt(p), since any divisor would have been found already by that point. Also, you can save a lot of divisions by keeping a stored list of primes and checking divisibility against those, instead of against all numbers, for as long as your memory allows (a sketch of both changes follows).
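Something like this, say (a sketch with both changes applied; Python 3 here, hence range/print):

import math

def primer():
    # Trial division against previously found primes only, up to sqrt(p).
    primes = []
    p = 2
    while True:
        limit = math.isqrt(p)
        is_prime = True
        for q in primes:
            if q > limit:
                break
            if p % q == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(p)
            yield p
        p += 1

gen = primer()
print([next(gen) for _ in range(10)])  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]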
So among the less populated set of same-final-digit sequential primes, is the next-most-significant digit anti-correlated (to mitigate its 'misbehavior' ;))? In a sense, how likely is this unusual homo-digitalism once our focus is shifted away from the final digit?
Thanks for pointing it out, I did this over an IPython session and copied it in a (completely unnecessary) hurry. The results are good though :)
Edit: as to keep my claim that the results are good, check this sentence from the article "...Nor could it explain why, as the pair found, primes ending in 3 seem to like being followed by primes ending in 9 more than 1 or 7." -- it backs up the data I posted :)
Primes seem to me to be more of an information theoretic concept than a number concept.
Primes are the simplest way to encode specific kinds of graphs in a way that unambiguously encodes all sub-graphs.
If you try to come up with a bit-representation that is equivalently rich it becomes difficult to think of one that is as simple yet preserves the semantics of the factorization tree.
So I guess my point is that the factorization tree of numbers is the fundamental concept, and it's information theoretic. Primes happen to be an encoding of that fundamental concept into integers, but if we found an equivalently rich representation using a different encoding, we might understand primes better. I doubt that the quirks of the encoding has anything to do with the fundamental concept however.
I once was really interested in finding patterns in prime numbers. I got a long csv file of prime numbers from the internet. I used symbolic regression on it, to try to predict the next prime in the list.
Symbolic regression basically uses genetic algorithms to fit mathematical expressions to data. The program I was using, Eureqa, tries to find the simplest expressions that fit, with only a handful of elements. To prevent overfitting, and give a human understandable model.
Anyway this actually worked. Far from perfectly of course, but it was able to get much better than random predictions. It was definitely finding some pattern.
Unfortunately I used up Eureqa's free trial forever ago, and I'm not going to pay thousands of dollars to buy a subscription. But I am now thinking of writing my own software to do this, and then running it on a dataset of mathematical sequences like the primes.
Look at base 11: there are a lot of rarefied diagonals (which correspond to any prime + the base (11) + 2, or prime + 13). I wonder if prime + other_prime is rarefied in general.
Wrote some code to compare random numbers to the primes for this property. To generate the random numbers, I apply the Prime Number theorem as a probability to determine if we want to select it, and then compare the stats to that of the actual primes. https://gist.github.com/personjerry/c58483daaf372acbe1fa
cumulative:
1 to 1: 30768 rand, 28289 prime
1 to 3: 53573 rand, 51569 prime
1 to 7: 44306 rand, 53263 prime
1 to 9: 36968 rand, 32816 prime
ratios:
1 to 1: 0.18578027352594872 rand, 0.17048036302934247 prime
1 to 3: 0.323479153458322 rand, 0.3107745710721539 prime
1 to 7: 0.26752407692539926 rand, 0.3209832647330011 prime
1 to 9: 0.22321649609032998 rand, 0.19776180116550257 prime
cumulative:
3 to 1: 37015 rand, 38455 prime
3 to 3: 31015 rand, 25900 prime
3 to 7: 53377 rand, 48596 prime
3 to 9: 44594 rand, 53082 prime
ratios:
3 to 1: 0.22298058445431052 rand, 0.23161058343823215 prime
3 to 3: 0.18683622387816942 rand, 0.15599308571187656 prime
3 to 7: 0.3215462557454473 rand, 0.2926888028283534 prime
3 to 9: 0.2686369359220728 rand, 0.3197075280215379 prime
cumulative:
7 to 1: 44412 rand, 42590 prime
7 to 3: 36923 rand, 45728 prime
7 to 7: 30588 rand, 25886 prime
7 to 9: 53404 rand, 51800 prime
ratios:
7 to 1: 0.26863125805222376 rand, 0.25656008288956894 prime
7 to 3: 0.2233331518747694 rand, 0.275463241849594 prime
7 to 7: 0.18501515179008873 rand, 0.15593600154213152 prime
7 to 9: 0.3230204382829181 rand, 0.3120406737187056 prime
cumulative:
9 to 1: 53453 rand, 56602 prime
9 to 3: 44489 rand, 42837 prime
9 to 7: 37022 rand, 38259 prime
9 to 9: 30902 rand, 28144 prime
ratios:
9 to 1: 0.322266166664657 rand, 0.3413007561413876 prime
9 to 3: 0.2682225410873838 rand, 0.2583000687401261 prime
9 to 7: 0.22320427332907286 rand, 0.23069548124118136 prime
9 to 9: 0.18630701891888632 rand, 0.1697036938773049 prime
Unless I'm doing something wrong, it honestly it doesn't seem like the actual prime numbers have a statistic that deviates from random numbers with a prime distribution. Hence it looks like to me just the result of a) specifying the "next" number which naturally favors the digit after it and b) probability of a given number being prime (prime number theorem).
It's supposed to be true in every base. But of course in binary it's not true: every prime in binary ends in a 1, and it's followed by another prime that ends in a 1.
You can just increase the length of the suffix since that is equivalent to talking about the last digit in a power-of-two base. E.g. if the prime ends in 11 in binary then the following prime is less likely to end in 11, since this is equivalent to saying that a prime ending in 3 in quaternary is less likely to be followed by another prime ending in 3.
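Equivalently, you can just track primes mod 4 (the last two bits); a sketch assuming sympy:

from collections import Counter
from sympy import nextprime

counts = Counter()
p, q = 5, 7  # odd primes are 1 or 3 mod 4, i.e. end in 01 or 11 in binary
for _ in range(100000):
    counts[(p % 4, q % 4)] += 1
    p, q = q, nextprime(q)

print(counts)  # the repeats (1,1) and (3,3) lag behind (1,3) and (3,1)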
It is still useful to say "prime numbers don't end in 0, 2, 4, 5, or 8" just as it is useful to say "consecutive prime numbers in any base are less likely to be followed by a number with the same least-significant digit". There are special cases at the bottom for both statements.
> This conspiracy among prime numbers seems, at first glance, to violate a longstanding assumption in number theory: that prime numbers behave much like random numbers.
I wonder if this is really an artifact like Benford's Law, which also involves first-digit-frequency (in any base) and also involves certain kinds of "random" numbers.
To recycle a past comment:
> If you have a random starting value (X) multiplied by a second random factor (Y), most of the time the result will start with a one.
> You're basically throwing darts at logarithmic graph paper! The area covered by squares which "start with 1" is larger than the area covered by square which "start with 9".
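The dart-throwing picture is easy to reproduce (a sketch; even three uniform factors already give a rough Benford-like falloff):

import random
from collections import Counter

random.seed(1)
leads = Counter()
for _ in range(100000):
    x = random.uniform(1, 10) * random.uniform(1, 10) * random.uniform(1, 10)
    leads[str(x)[0]] += 1

for d in "123456789":
    print(d, leads[d])  # counts fall off roughly like log10(1 + 1/d)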
Applying the ideas from Benford's law to these findings seems very promising to me, as primes follow a lognormal distribution [1] and that is where we expect Benford's law to apply [2].
Does this have any ramifications in security? I vaguely understand we rely on prime numbers to create secrets that are hard to guess... So does this in someway make it easier to possibly guess?
Shouldn't be a problem, from what I understand in cryptography you randomly generate numbers and check if they're prime, the result is a uniformly random prime. Even if you know exactly which numbers are prime that still doesn't help you in figuring out which prime was used.
Unless the random number generator was flawed of course, but that's a different issue.
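That loop, in sketch form (assuming sympy's isprime; a real implementation would use a CSPRNG, e.g. Python's secrets module, rather than random):

import random
from sympy import isprime

def random_prime(bits):
    # Rejection sampling: uniform over odd primes of the requested bit length.
    while True:
        n = random.getrandbits(bits) | (1 << (bits - 1)) | 1  # right length, odd
        if isprime(n):
            return n

print(random_prime(64))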
Quite a few of them do. For example: Mersenne twisters, linear congruential generators. Not sure about cryptographic random number generators though, but most of them probably use primes one way or another.
It should be noted that from the original paper, the asymptotic formula that Oliver and Soundararajan conjecture still says that each possibility for the last digits of consecutive primes should occur about the same number of times in the limit. It's just that the amount by which the frequencies vary is more than you would expect from the most naive model of primes as being "random".
For the first n primes in a given base, it returns the mapping {i,j} -> count for all pairings of digit i followed by digit j. E.g., for the first million base 5 primes
>If Alice tosses a coin until she sees a head followed by a tail, and Bob tosses a coin until he sees two heads in a row, then on average, Alice will require four tosses while Bob will require six tosses (try this at home!), even though head-tail and head-head have an equal chance of appearing after two coin tosses.
Now that is particularly interesting to think about.
That's really cool.
I just spent 5 minutes looking for a chart showing the distribution of reserved commands in Python. Didn't find much.
A while back, I read something about different number bases' ability to help find additional primes. The base itself was prime, so maybe 7 or 13. Can't find the article ATM. I hypothesized that prime numbers are "code" provided by this universe to allow us to access other data stored in other primes. Quines of a sort, if you will. One way to invalidate this hypothesis would be to do a mean distribution of basic operators in a simple programing language and compare it to what we are seeing in primes.
As a non-mathematician, this is a pretty neat read. I was distracted by the personification of the numbers though (they have 'likes' and 'preferences', which in my day-to-day vocab are concepts applicable only to things that possess the ability to think). Is this common in mathematical writing, or is this paper an abnormality in that sense?
(I don't mean to nitpick - I'm genuinely curious. I recall seeing the same thing in high-school chemistry, but never in physics, for example, and I'm curious if entire fields see this effect or if it's a product only of the audience being written to).
"If Alice tosses a coin until she sees a head followed by a tail, and Bob tosses a coin until he sees two heads in a row, then on average, Alice will require four tosses while Bob will require six tosses (try this at home!), even though head-tail and head-head have an equal chance of appearing after two coin tosses."
Counter-intuitive at first but makes sense - the outcomes as a whole converge towards the average (50% heads, 50% tails). Nonetheless, it shows that each toss is related to the others. One can expect that primes are even more related - or at least to the primes that came before.
Right, so the coin toss example is not related to all the outcomes but rather just the most recent. In contrast, the OP seems to show that a prime is related to the previous prime, and that previous prime is related to the one before it, so by extension, they are all related.
This phenomenon feels trivial. Think of divisibility by 3 across X11, X13, X17, X19, X21, X23, X27, X29, X31, X33, X37, X39: how many of these pseudo-numbers will be divisible by 3? I count 4 for each of X = 0, 1, and 2, one time each for numbers ending in 1, 3, 7, and 9.
Just based on this knowledge, I know that a prime number is guaranteed not to be immediately followed by another one with the same ending 1 time in 2.
I'm not sure these fellows have found anything particularly interesting, but if so, and I have missed something, kudos to them.
It would be interesting to know how well this holds up over different scales.
Does a prediction based on base 3 hold up better over primes under 100 than 1000 and 1000 than 10000?
Does how well a final-digit sequence predicts the next prime scale in a predictable way with the base in which the prime number field is viewed?
Just thinking about what might be happening, I would imagine that the answer is yes, but that a lot of crunching would be needed to graph and deduce a relationship to an actual predictive property statement.
Perhaps it has been discovered before, maybe many times. But the only way to know whether a discovery in math, or any science, is the first is to search all the academic journals, an activity that's only feasible if you're part of the university mathematics research industry, especially considering how much some academic journals charge for subscriptions. Then you need to announce your discovery, after peer review of course.
Isn't this similar to the property that any number taken from a natural distribution will tend to contain more low digits than high, regardless of unit and base?
Because 100x will become >200 "slower" than 200x becomes >300 etc. With slower meaning lower value of x. x in this case is usually a random variable centered around 1.
So, the million dollar question is: how does this affect my security and privacy? Does this pattern mean encryption based on the assumption of the inherent randomness of primes is now less secure? E.g. is there now less entropy in a given set of primes?
I have a premonition of Quite a Bit of Trouble coming down the pipe.
Take only remainders, and form a vector a=(2 1 2 0)
What can be said about components of the vector for the prime next to p? E.g., do i-th components repel, like the 1-st ones?
This is absolutely incredible. This is why mathematics is so amazing, that something so small can be missed for centuries. All about how to look at things!
> Soundararajan showed his findings to postdoctoral researcher Lemke Oliver, who was shocked. He immediately wrote a program that searched much farther out along the number line — through the first 400 billion primes.
This is how modern computers revolutionized even the most theoretical fields like number theory. Remarkable, I love it!
That's bizarre - I tried to submit this four hours ago and was told it was a duplicate. I searched, and couldn't find the original submission to upvote it, and now it's submitted again, after my submission was declined.
I don't understand.
But it's a great result, so I've upvoted it, despite being confused.
Dup detection applies to deleted posts, but you can't find them using search. So what might have happened is that somebody submitted this link, deleted it, and then you tried to submit it.
Mathematicians Discover Prime Conspiracy
A previously unnoticed property of prime numbers seems to violate a longstanding assumption about how they behave.
By: Erica Klarreich
March 13, 2016
Two mathematicians have uncovered a simple, previously unnoticed property of prime numbers — those numbers that are divisible only by 1 and themselves. Prime numbers, it seems, have decided preferences about the final digits of the primes that immediately follow them.
Among the first billion prime numbers, for instance, a prime ending in 9 is almost 65 percent more likely to be followed by a prime ending in 1 than another prime ending in 9. In a paper posted online today, Kannan Soundararajan and Robert Lemke Oliver of Stanford University present both numerical and theoretical evidence that prime numbers repel other would-be primes that end in the same digit, and have varied predilections for being followed by primes ending in the other possible final digits.
“We’ve been studying primes for a long time, and no one spotted this before,” said Andrew Granville, a number theorist at the University of Montreal and University College London. “It’s crazy.”
The discovery is the exact opposite of what most mathematicians would have predicted, said Ken Ono, a number theorist at Emory University in Atlanta. When he first heard the news, he said, “I was floored. I thought, ‘For sure, your program’s not working.’”
This conspiracy among prime numbers seems, at first glance, to violate a longstanding assumption in number theory: that prime numbers behave much like random numbers. Most mathematicians would have assumed, Granville and Ono agreed, that a prime should have an equal chance of being followed by a prime ending in 1, 3, 7 or 9 (the four possible endings for all prime numbers except 2 and 5).
“I can’t believe anyone in the world would have guessed this,” Granville said. Even after having seen Lemke Oliver and Soundararajan’s analysis of their phenomenon, he said, “it still seems like a strange thing.”
Yet the pair’s work doesn’t upend the notion that primes behave randomly so much as point to how subtle their particular mix of randomness and order is. “Can we redefine what ‘random’ means in this context so that once again, [this phenomenon] looks like it might be random?” Soundararajan said. “That’s what we think we’ve done.”
Prime Preferences
Soundararajan was drawn to study consecutive primes after hearing a lecture at Stanford by the mathematician Tadashi Tokieda, of the University of Cambridge, in which he mentioned a counterintuitive property of coin-tossing: If Alice tosses a coin until she sees a head followed by a tail, and Bob tosses a coin until he sees two heads in a row, then on average, Alice will require four tosses while Bob will require six tosses (try this at home!), even though head-tail and head-head have an equal chance of appearing after two coin tosses.
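That coin-tossing claim is easy to sanity-check with a short simulation (my sketch, not from the article):

    # Monte Carlo estimate of the average number of tosses until a pattern
    # first appears: Alice waits for head-tail ("HT"), Bob for head-head ("HH").
    import random

    def average_tosses(pattern, trials=200_000):
        total = 0
        for _ in range(trials):
            last_two, n = "", 0
            while last_two != pattern:
                last_two = (last_two + random.choice("HT"))[-2:]
                n += 1
            total += n
        return total / trials

    print("HT:", average_tosses("HT"))  # comes out near 4
    print("HH:", average_tosses("HH"))  # comes out near 6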
Sorry, this site has some javascript that copies and pastes the entire article when you try to copy and paste a few sentences. I didn't catch this before submitting on mobile.
Eh? That is not something a webpage can do, for all sorts of reasons. HN can only paste what you have copied before, so I'm afraid this is most likely a case of user error.
You're right, I didn't explain myself correctly. What I meant to say is that Javascript from one site is forbidden by default from operating on the contents of another site. So if you didn't copy the text from site A in the first place, it is not possible (at a basic security level) for site B code to access the text from site A which never made it into the clipboard.
It would be much more helpful for everyone if you explained where you think I'm incorrect, rather than mindless downvoting. Having worked professionally with Javascript for almost 20 years, I would hate to miss an opportunity to learn something more about it.
Perhaps I have missed something, but the introductory example seems to follow from simple probability and therefore I do not find it mathematically remarkable.
Say there is a fixed, equal probability that each number ending in 9 or 1 is prime. I could go along with that assumption, although the fact that primes get rarer as you go higher is potentially relevant.
What the authors consider here is starting with a prime ending in 9.
So the next candidate ending in 1 comes up before the next candidate ending in 9. If only because numbers ending in 1 get checked first, a prime ending in 1 is more likely to appear next than one ending in 9. The probability can be calculated, depending on your assumptions, as a geometric series. In any case, P(next prime ends in 1) > P(next prime ends in 9).
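For what it's worth, here is that geometric series spelled out (a sketch of the naive model only; the fixed probability q, the independence assumption, and ignoring candidates ending in 3 or 7 are all my simplifications):

    # After a prime ending in 9, candidates ending in 1 and 9 alternate
    # (...1, ...9, ...1, ...), each prime independently with probability q:
    #   P(next ends in 1) = q + (1-q)^2 * q + (1-q)^4 * q + ... = 1 / (2 - q)
    # That exceeds 1/2 for any 0 < q < 1, but only barely when q is small --
    # nowhere near the ~65 percent relative bias the article reports.
    for q in (0.01, 0.05, 0.10):
        print(f"q = {q:.2f}: P(next prime ends in 1) = {1 / (2 - q):.3f}")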
"Most mathematicians would have assumed, Granville and Ono agreed, that a [known] prime should have an equal chance of being followed by a prime ending in 1, 3, 7 or 9" So - I'm a definite nope on that.
This result appears to be exactly what I would have assumed was the case.
> Lemke Oliver and Soundararajan’s first guess for why this bias occurs was a simple one: Maybe a prime ending in 3, say, is more likely to be followed by a prime ending in 7, 9 or 1 merely because it encounters numbers with those endings before it reaches another number ending in 3. For example, 43 is followed by 47, 49 and 51 before it hits 53, and one of those numbers, 47, is prime.
> But the pair of mathematicians soon realized that this potential explanation couldn’t account for the magnitude of the biases they found. Nor could it explain why, as the pair found, primes ending in 3 seem to like being followed by primes ending in 9 more than 1 or 7.
Ok, so the random model "1, 3, 7, 9 mod 10" doesn't fully work, but let's look at what happens mod 30. Large primes have the following possible remainders mod 30: 1, 7, 11, 13, 17, 19, 23, 29. So a prime p ending in 3 is either 13 or 23 mod 30. In both cases p + 6 (ending in 9) is an admissible residue, but p + 4 (ending in 7) is admissible only half of the time: 13 + 4 = 17 is on the list, while 23 + 4 = 27 is divisible by 3. I think that this fully explains why a prime ending with 3 is more likely to be followed by a prime ending in 9. So basically the OP is on the right track, and their random model just needed to be refined a bit.
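A quick empirical check of the mod-30 observation (a sketch with a plain sieve; the 2,000,000 bound is arbitrary):

    # Among primes ending in 3, tally the last digit of the next prime. If
    # the mod-30 reasoning above is on track, 3 -> 9 should beat 3 -> 7.
    from collections import Counter

    def primes_up_to(n):
        sieve = bytearray([1]) * (n + 1)  # simple sieve of Eratosthenes
        sieve[0:2] = b"\x00\x00"
        for i in range(2, int(n ** 0.5) + 1):
            if sieve[i]:
                sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
        return [i for i in range(2, n + 1) if sieve[i]]

    primes = primes_up_to(2_000_000)
    after_3 = Counter()
    for p, q in zip(primes, primes[1:]):
        if p % 10 == 3:
            after_3[q % 10] += 1

    total = sum(after_3.values())
    for d in (1, 3, 7, 9):
        print(f"3 -> {d}: {after_3[d] / total:.3f}")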
> The primes' preferences about the final digits of the primes that follow them can be explained, Soundararajan and Lemke Oliver found, using a much more refined model of randomness in primes, something called the prime k-tuples conjecture.
So I guess that my observation is just a special case of this "prime k-tuples conjecture".
Are you contesting it, or just curious? You already know that my observation explains the "3 followed by 9" bias. You already know that the mathematicians call the conjecture "a much more refined model of randomness in primes" which is similar to how I described what I was doing. In addition, MathWorld's article on the k-Tuple Conjecture talks about residues mod q, which is similar to what I'm doing when I look at primes mod 10*3. All these elements point at some connection between the k-tuple conjecture and my observation.
Well, 'points to a connection' is not the same as 'is a special case of'. I'm not an expert on this subject, but from what I can see, the conjecture is about the asymptotic distribution of certain patterns in the primes. I don't think an example like the one you are giving is 'related' except in the hand-wavy sense in which anything dealing with prime numbers and patterns is related to everything else dealing with prime numbers and patterns.
All I meant by "special case" was that "mod 30" isn't the whole story -- more like the most significant correction on top of what the OP said, with other smaller corrections possible, and the entire set of corrections being described by the k-tuple conjecture.
It's amazing how people can be picky and negative on HN. Someone positive would instead congratulate me for making the gist of what the prime k-tuple conjecture says about the biases easily understandable. Oh well.
So you are making comments about pure mathematics. If you want to use imprecise language and not be corrected, you should probably go write a book review or something. In math, precise language is expected, and correcting someone or pressing them to justify a claim is completely usual. It would be bizarre, when talking to a mathematician about mathematics, if they didn't immediately correct you or demand clarification and justification when you said something vague, incorrect, or unjustified.
>Lemke Oliver and Soundararajan’s first guess for why this bias occurs was a simple one: Maybe a prime ending in 3, say, is more likely to be followed by a prime ending in 7, 9 or 1 merely because it encounters numbers with those endings before it reaches another number ending in 3. For example, 43 is followed by 47, 49 and 51 before it hits 53, and one of those numbers, 47, is prime.
That line of thinking is specifically addressed as their first guess in the article. It's not indicated exactly why, but they apparently 'quickly realised this couldn't account for such a bias'.