I almost overlooked this article because I got turned off by the opening description in base 10, as there is a lot of math trivia out there that is specific to base 10 which holds little general significance.
But a little further down, the article discusses how this was discovered originally in base 3, and I think it's much simpler to understand in that context, since all primes except 3 (aka 10 base 3) end in just either 1 or 2:
"Looking at prime numbers written in base 3 — in which roughly half the primes end in 1 and half end in 2 — he found that among primes smaller than 1,000, a prime ending in 1 is more than twice as likely to be followed by a prime ending in 2 than by another prime ending in 1."
I wrote a program to count the pairs of adjacent last-digit occurrences in all bases up to 30, for the first 100 million primes, and found this property nearly always holds.
Quite interestingly, in all of the few cases where this doesn't hold, primes ending in the digit D are least-frequently succeeded by a prime ending in the digit D-2.
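For the curious, the counting loop was essentially the following (a simplified sketch assuming sympy is available, not my original program, which handled all bases up to 30 at once):

from collections import Counter
from sympy import nextprime

def pair_counts(base, n_primes):
    # Tally (last digit of p, last digit of the next prime) in the given base.
    # Starting above the base skips the few small primes that divide it.
    counts = Counter()
    p = nextprime(base)
    q = nextprime(p)
    for _ in range(n_primes):
        counts[(p % base, q % base)] += 1
        p, q = q, nextprime(q)
    return counts

# Base 3: the same-digit pairs (1,1) and (2,2) come up visibly less often.
for pair, count in sorted(pair_counts(3, 10000).items()):
    print(pair, count)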
It's not an unexpected property to me. In particular, I would never have assumed an even distribution of the last digits of primes. I'm not sure why this is so magical: it's only so if you assumed the digits should be evenly distributed to begin with, and there's no particular reason I can think of why that would be so.
Then you might be surprised that it's a mathematical fact that the last digit of primes is evenly distributed (among digits coprime to the base), no matter what base you choose. This is one way of stating the Chebotarev density theorem.
Believing that the last digit of primes is evenly distributed overall is not the same as thinking they do not exhibit patterns when viewed sequentially. I'm surprised by neither.
I'm a mathematician and I'd say their ideas only become interesting if that conjecture is true. Otherwise it's just numerology.
To me, this makes for a very boring notion of "interesting."
I think most mathematicians would say that the "interestingness" of a conjecture comes from it (1) describing a phenomenon which seems "intuitively true, or very likely true" (e.g. "x^n+y^n=z^n has no solutions for n>2") combined with (2) the initial difficulty of deciding its truth/falsity using tools available at the time of its statement; along with, finally: (3) the novel techniques (sometimes first arising in our brains decades or generations later!) required to ultimately determine said truth/falsity (and the degree to which these techniques touch on and illuminate other areas of mathematics).
For example, I think you'd find near-universal agreement among mathematicians that the resolution of FLT (as a conjecture stated by Fermat) would have been equally "interesting" if it had been proven false -- it may have even been more surprising if a counter-example had been found (or its existence proven), provided the tools / lessons were as interesting as those in the Taylor-Wiles result we know today.
Meanwhile, some of the most interesting conjectures are perhaps those that can't be decided, one way or another.
EDIT: If you don't like the idea of discussing the "what-ifs" of a conjecture that's already been decided (like FLT), just plug in any of the usual suspects, e.g. RH or GRH, into what I'm saying above. Clearly, a "false" determination on any of these major targets -- or even a serious hint at it -- would be a career-making achievement for an aspiring mathematician.
>> it may have even been more surprising if a counter-example had been found (or its existence proven), provided the tools / lessons were as interesting as those in the Taylor-Wiles result we know today.
What if a counterexample with very large (x,y,z,n) had been found somewhere in the late 1980s because enough megaflops to find it were finally allocated to the problem? Would that necessarily have been an interesting result?
> What if a counterexample with very large (x,y,z,n) had been found somewhere in the late 1980s because enough megaflops to find it was finally allocated
I'm curious why floating point operations would be the appropriate tool for finding solutions to a Diophantine equation. Did you have something in mind when writing this? Where can I learn more about it?
Not sure how to answer you on this (because there's a slight chance you might be trolling). Let's just say superficially "yes", in that it would mean the current expert consensus in our universe (that FLT has been proven) would have to be wrong.
But it's kind of a bad line of speculation (and so FLT probably wasn't the best illustrative example to bring up in my original post); again, for a real-life instance of a counter-example being found to a conjecture that had a lot of numerical evidence suggesting there wouldn't be one, have a look at the history of the Mertens Conjecture, and others of its ilk.
Basic point being that yes, counter-examples to interesting conjectures are always interesting results (and by themselves don't make the original conjecture any less interesting).
Odds are good that everyone knows the proof that all integers are "interesting". If not, the first non-interesting number would be interesting for being the first. (grin)
Looking at prime numbers written in base 3 — in which roughly half the primes end in 1 and half end in 2 — he found that among primes smaller than 1,000, a prime ending in 1 is more than twice as likely to be followed by a prime ending in 2 than by another prime ending in 1
is not interesting (as it seems to be just numerology), UNLESS the authors' conjecture is also true (that the statement also holds for bases > 2).
The important part is that the authors (and several others) have verified the statistics out to a few hundred billion primes -- and that while the bias does start to drop out, it does so "very slowly." That's what makes this result not "numerology."
"not proven true" is not the same as "proven false"
They made an observation and a conjecture. If the conjecture is proven false, it's obviously uninteresting. If the conjecture isn't proven either way, it could be argued that it's just apophenia. I'm not sure I agree, but it's not an unreasonable stance.
It would still be interesting if it holds for all primes rather than just the first ones, as it imposes fundamental structure on the distribution of primes. The base 3 calculation is simply the form that structure takes.
Don't they call that the "strong law of small primes": you can discover lots of patterns in the first millions (or now billions) of primes that don't mean anything and don't hold up?
I think you're referring to the "strong law of small numbers", as in this excellent article by Guy [1]. But the idea is the same --- so very frequently, patterns that hold for even the first several million numbers eventually fail to continue.
I think it would still be interesting even if it proved false, simply because it appears to be true for small numbers. That would be weird, and weird is interesting.
>I almost overlooked this article because I got turned off by the opening description in base 10, as there is a lot of math trivia out there that is specific to base 10 which holds little general significance.
If a high school science insight seems to be able to "shoot down" a new scientific discovery, chances are what the discovery says has not really been understood properly.
The property they discovered is orthogonal to numeric base.
But if high school science insight seems to shoot down a popular math article, it's probably right. Popular articles on technical subjects (all subjects?) simply aren't very good and trivia involving digits in base-10 is exactly the sort of thing that the popular press loves to fawn over.
I think that's what the original commenter was getting at: it's not that the actual discovery is suspect, it's that the article presenting it was—even though the discovery itself is interesting.
The article is too vague to assess how interesting the claims are, sadly.
It's too bad; I think it wouldn't have detracted from the article to put some more math in. It's not, on the face of it, at all surprising that sequential primes are more likely to be close to each other modulo any number (3, or 10, or what have you) than they are to be far apart.
By way of analogy, a train comes at 1:09pm. Trains come about every 5 minutes between 1 and 2 pm, and only on odd numbers. If you simulate a bunch of random 'next trains', 1 is much more likely than 9 because P(9) approx = !P(1,3,5,7). This is true for all bases.
I think what you'd need to be able to say something interesting is: 1) calculate the odds of finding the next prime; 2) randomly generate numbers with a distribution similar to that of prime occurrence in that range, using the Prime Number Theorem at the very least (roughly 1/log(n) probability); 3) check final digits and compare to the actual distribution of final digits.
If those numbers are very different, then you have in fact found some underlying structure. But the article doesn't hit very hard on this angle, and it's hard for (probably) any of us to say, just thinking about it with minimal data, whether or not there's structure.
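To make steps (2) and (3) concrete, here's a rough sketch of the comparison I have in mind, assuming sympy is available and using the naive 1/log(n) model (which even keeps even numbers, so it's only a first cut):

import math
import random
from sympy import isprime

def last_digit_pairs(is_member, lo, hi):
    # Tally final-digit transitions between consecutive members of a set.
    counts, prev = {}, None
    for n in range(lo, hi):
        if is_member(n):
            if prev is not None:
                key = (prev % 10, n % 10)
                counts[key] = counts.get(key, 0) + 1
            prev = n
    return counts

random.seed(0)
# Real primes vs. a random set with the same 1/log(n) density.
actual = last_digit_pairs(isprime, 11, 10**6)
model = last_digit_pairs(lambda n: random.random() < 1 / math.log(n), 11, 10**6)
for key in sorted(actual):
    print(key, actual[key], model.get(key, 0))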
> Lemke Oliver and Soundararajan’s first guess for why this bias occurs was a simple one: Maybe a prime ending in 3, say, is more likely to be followed by a prime ending in 7, 9 or 1 merely because it encounters numbers with those endings before it reaches another number ending in 3. For example, 43 is followed by 47, 49 and 51 before it hits 53, and one of those numbers, 47, is prime.
> But the pair of mathematicians soon realized that this potential explanation couldn’t account for the magnitude of the biases they found. Nor could it explain why, as the pair found, primes ending in 3 seem to like being followed by primes ending in 9 more than 1 or 7. To explain these and other preferences, Lemke Oliver and Soundararajan had to delve into the deepest model mathematicians have for random behavior in the primes.
They did mention this, but they didn't talk real numbers. And, my second point is (I think) slightly more subtle -- the probability distributions need to be considered, not just the counting upward angle.
As I'm writing this out, I'm a little less sure that this would matter, but I'll leave the comment up for the sake of discussion. :)
I don't know why you're nitpicking. The article's written for a more general audience that may be interested in the property, but not necessarily the nitty-gritty math behind it.
For your second point, I don't think there's anything wrong with a paper announcing they found something interesting, even if they haven't completely analyzed every aspect of it. Getting the info out early lets a wider audience look at it, and opens their current research up to scrutiny.
The article is too vague to assess how interesting the claims are, sadly.
By my reading, the article seems to state the key import of the finding quite clearly:
"This conspiracy among prime numbers seems, at first glance, to violate a longstanding assumption in number theory: that prime numbers behave much like random numbers."
However this statement:
If you simulate a bunch of random 'next trains', 1 is much more likely than 9 because P(9) approx = !P(1,3,5,7). This is true for all bases.
I'm afraid I don't follow at all. (Do you really mean we should expect that P(1|1) > P(1|9), for either random trains or for subsequent primes? Say wha?)
That said, perhaps you might want to skip straight to the arxiv article itself, or perhaps do some experiments on your own. It's definitely not hard to generate a non-"minimal" amount of data (out to the first few million primes or so) on one's laptop, these days.
Numbers modulo n (i.e. the last digit in base n) are everywhere in number theory and their properties even form the basis of cryptography. What you see as "math trivia" most mathematicians see as mathematics.
I'd guess the GP was pointing out that there are uninteresting results that those less qualified in maths see as some sort of magic. Maybe like those silly "choose any number, multiply by x, etc., and did you get 7? Magic!" games. Or:
9*9+7 = 88
98*9+6 = 888
987*9+5 = 8888
While not exactly trivial, and there's still maths required to understand the patterns, this prime result is much deeper and more interesting. The GP almost passed over it because they thought it might fall into the less interesting category.
That's true, but noticing a phenomenon using an arbitrary base isn't necessarily meaningful. Such as how, in base 10, the digits of numbers divisible by 3 or 9 sum to a number divisible by 3 or 9, respectively. There is an underlying principle here, but the specific presentation is just an artifact of the base. This article does a poor job explaining what (or whether) there is some new principle at work here, or if this pattern is just a specific presentation of using base 10.
Another thing that kind of irked me was this section:
>This conspiracy among prime numbers seems, at first glance, to violate a longstanding assumption in number theory: that prime numbers behave much like random numbers...
I was under the impression that Ulam's Spiral was a much earlier indication (1960s) that primes weren't really as random as we thought.
I almost overlooked this article because I got turned off by the opening description in base 10
The article talks about the last digit: That's also the remainder upon dividing by 10, a perfectly sensible thing to discuss. Similarly what it discusses in base 3 is the last digit, the remainder upon dividing by 3. Properties of numbers, particularly primes, modulo other numbers have been much studied by number theorists.
I tested this in base three. For primes below 1000, primes were followed by primes ending in another digit 2.1 times as often as they were followed by primes ending in the same digit. But the ratio shrank as the number of primes grew. By the time I was up to primes below 1,000,000,000, the ratio was only 1.2.
That hints that it's an effect that asymptotically disappears...
I'm not certain, but I believe the subtle point parent was trying to make, is the difference between "base 10" and "base ten". Every base is "base 10" in its own base ;)
Ten is "decem" in Latin. Decimal means, literally, base ten. It's a tautology either way. You can't even specify the base of the base since "base (10 in base 10)" presents the same problem recursively.
It's a non-problem though. I don't know about mathematicians, but when programmers say base N they mean N in decimal.
"10" is never correctly called "ten" except in decimal. But this whole tangent is getting a little navel-gazey for me. "Base 10" is fine in my book because decimal is our inbuilt base.
I get the parent poster's point, but to me it's like quibbling over people using "comprised of" instead of "comprises": some people will care, but in the grand scheme of things it's not a meaningful disagreement because everybody got what you were trying to say the first time, and there are only a few rare contexts where the distinction matters.
"If Alice tosses a coin until she sees a head followed by a tail, and Bob tosses a coin until he sees two heads in a row, then on average, Alice will require four tosses while Bob will require six tosses (try this at home!), even though head-tail and head-head have an equal chance of appearing after two coin tosses."
> Intuitively, first, both have to get a head. After that, if Alice "fails" by getting a head, then she still needs only one tail. Her first head doesn't get "reset" by failing her second try. But after getting a head, if Bob fails by getting a tail then he does get reset -- he has to start all over.
What's interesting is that if you reword this slightly and ask, "If you flip coins until you get either a head followed by a tail, or a head followed by a head, how many flips on average are required before you stop on a head-then-tail, versus on a head-then-head?", the answers are 3 and 3 respectively.
But worded as "flip coins until you get a head followed by a tail" or "flip coins until you get a head followed by a head", the answers revert back to 4 and 6.
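All three numbers (the 4, the 6, and the 3s) are easy to check by simulation; a quick sketch:

import random

def flips_until(targets, trials=100000):
    # Average number of flips until the last two flips form one of the targets.
    total = 0
    for _ in range(trials):
        prev, n = random.choice("HT"), 1
        while True:
            cur = random.choice("HT")
            n += 1
            if prev + cur in targets:
                break
            prev = cur
        total += n
    return total / trials

print(flips_until({"HT"}))        # ~4.0: Alice's stopping rule
print(flips_until({"HH"}))        # ~6.0: Bob's stopping rule
print(flips_until({"HT", "HH"}))  # ~3.0: stop at whichever comes first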
Are you saying that HH causes an early failure for HT, instead of a potentially longer success HHHT?
If so, it is poorly worded, being ambiguous about how to count failures. (In the 4-6 variation, there are no failures.)
I'm saying that if you just keep flipping coins (So, no Bob/Alice situation), and stop whenever you hit HH or HT, that the average number of flips to get to HH is 3, and the average number of flips to get to HT is 3.
But, if you are focussing on a particular scenario that you will flip coins until you get to HT, the average number of flips will be 4, and if you flip coins until you get to HH, the average number of flips will be 6.
I just find that really hard to grasp intuitively.
"and stop whenever you hit HH or HT" is equivalent to say "whenever you hit H, flip one more then stop". Average to hit H is obviously 2, plus one is 3, so you are right, but IMO it's not that counter-intuitive.
Do I understand your first scenario correctly, that if for example you flip TTHH, you stop there, count that as "took four flips to get to HH," count nothing for HT, and then start over?
Seems like you're just taking a biased sample, which cancels out the differences. To take an extreme example, imagine one candidate is HHHHHHHHHH and the other candidate is any other sequence of ten flips. In the "try until you get either one" scenario, the average number of flips for either one will be 10. Testing them independently, the average number of flips for the second one will be slightly over 10, and for HHHHHHHHHH it'll be huge.
I don't think that's true. I think that the average number of flips before _stopping_ might be 3, but it doesn't become more likely to get HT simply because you're also looking for HH (or vice versa).
It turns out that the average number of flips to get either HH or HT is 3; someone else on the thread described it well: on average, the number of flips to get an "H" will be 2, and then you will always stop on the next flip, 50% of the time with an HT and 50% with an HH. Explained that way, it makes sense that the average number of flips is 3.
But, I still find it strange that if you are flipping with one particular scenario in mind, HT or HH, that the average number of flips goes from 3 to 4 or 6, even if I can reason it out with a bit of thinking.
A similar property was recently taken advantage of to reduce the time needed to brute-force older garage door openers; roughly speaking openers look for a particular base-3 string, but don't require a start/stop sequence, so you can try N length M permutations with much less than N*M symbols transmitted.
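For anyone curious, the trick is essentially a de Bruijn sequence: a stream in which every length-M window appears once, so all k^M codes go by in about k^M symbols instead of M * k^M. A sketch of the standard construction (illustrative only, not the actual opener protocol):

def de_bruijn(k, n):
    # Shortest cyclic string containing every length-n word over k symbols.
    a = [0] * k * n
    sequence = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(str(x) for x in sequence)

s = de_bruijn(3, 4)  # every base-3 code of length 4
print(len(s))        # 81 symbols, versus 4 * 81 = 324 sent one code at a time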
Ah, thanks for that. So it's not a property of the coin toss itself, but the fact that a failure at the second step in the series only resets to before step 2, instead of before step 1.
I thought the article was saying Alice must get a head followed immediately by a tail. If that's not the case, then it makes total sense, but it would seem the article is a bit vague about that.
Actually, the details in the article say:
>even though head-tail and head-head have an equal chance of appearing after two coin tosses.
That implies that the tail is expected immediately after the head for Alice's goal.
Correct, but the point is if Alice doesn't get a tail, that means she got a head, so she is still in the same position as she was before the flip, only needing a single tail to complete the sequence. If Bob gets a head, then a tail, he now needs two consecutive heads to complete his sequence.
Yeah, thought so - it's specifically heads->tails, rather than either heads->tails or tails->heads being okay. So the probabilities don't seem remotely counterintuitive, just crappily communicated IMO.
I'm not sure your explanation fits with the probabilities. Pointing out that you're only looking for heads->tails and not tails->heads seems to suggest that the HT pattern should be less likely than the HH, but it's actually more likely (for reasons explained well by others).
I don't have time to put numbers on it, but consider the following with 3 tosses only. There are 8 different enumerations:
* HHH
* THH
* HTH
* TTH
* HHT
* THT
* HTT
* TTT
Alice is looking for HT, so she will succeed in HTH, HHT, THT, HTT, that is, 4 out of 8 possible outcomes. Bob on the other hand is looking for HH, which appears only in HHH, THH, HHT: 3 out of 8 possible outcomes. So while HH and HT are equal in probability when you consider 2 coin flips, the combination HT happens more often than HH. This is the case with 3 coin flips; there is no guarantee it translates to the same with more coin flips, but that is my bet.
Amusingly, the question "which substring occurs earlier on average" is different from the question "which substring is more likely to occur before the other". In fact the second question sometimes has a circular answer! For example, THH typically (with >50% probability) occurs before HHT, which typically occurs before HTT, which typically occurs before TTH, which typically occurs before THH.
Also the question "which substring occurs earlier on average" is intimately connected with algorithms for substring search. For example, if you want to check that a string doesn't contain HHH, you need to look at every third character, but for THH that's not enough.
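(This is Penney's game, incidentally.) The circular claim is easy to check by simulation; a small sketch:

import random

def first_wins(a, b, trials=20000):
    # Fraction of games in which pattern a shows up before pattern b.
    wins = 0
    for _ in range(trials):
        s = ""
        while True:
            s += random.choice("HT")
            if s.endswith(a):
                wins += 1
                break
            if s.endswith(b):
                break
    return wins / trials

# Each left-hand pattern beats the right-hand one more than half the time.
for a, b in [("THH", "HHT"), ("HHT", "HTT"), ("HTT", "TTH"), ("TTH", "THH")]:
    print(a, "vs", b, ":", first_wins(a, b))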
Uh. I totally forgot everything about statistics and probabilities (and I'm too lazy to refresh my memory, so I won't check the numbers 4 and 6), but I think the core idea of how it works is that Bob's option just has lower chances. Say we toss a coin up to 3 times.
An easy way to understand it is by thinking about bunching. Since you're only flipping until you hit the first matching sequence, on average you'll hit the more evenly distributed sequence more quickly than the bunched sequence.
Multiple heads in a row are more bunched than transition sequences because, for example, a sequence of three heads in a row will include two sequences with two heads in a row. You can't do that with a transition sequence--it takes at least four tosses to get two identical transition sequences.
The most intuitive way to answer this question is that you're not comparing individual coin flips but pairs of adjacent coin flips in a longer sequence of ones. These pairs are no longer independent trials: if your first two coin flips are TH, then getting an HH if you look at the second two is much more likely than getting a TH.
Starting from scratch, they first need to get a head. This takes 2 tosses on average (1 with 50% probability, 2 with 25% probability, 3 with 12.5% probability, etc.). At that point, both Alice and Bob have 50% chance of getting the target sequence with one additional toss.
In the case of failure, Alice still has 50% chance of success in each subsequent toss. On average she will need two additional tosses to get a tail and the answer is 2+2=4.
In the case of failure, Bob has to start again. If we call the answer x, we can write x = 2 + 0.5*1 + 0.5*(1+x), and solving the equation we get x = 6.
Alice: first toss: head. Second toss: head. She can take that second toss as the beginning of the new sequence, and if she gets a tail on the third toss she is done.
In Bob's case, if he gets tail on the second toss, that toss no longer counts and he must get head for the new streak to begin.
The results are particularly striking in base 11 - looking at primes below 100 million, only 4.3% of primes ending in 2 are followed by another prime ending in 2 (compared to the 9.1% you would naively expect) with similar numbers for other pairs.
A prime ending in 2 (in base 11) is also unlikely to be followed by a prime ending in 5, 7 or 9, whereas it is particularly likely to be followed by a prime ending in 4 or 8.
It would be interesting to know what structure there is (if any) in this NxN "transition matrix" for various bases.
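The matrix is straightforward to build; here's a sketch for any base, assuming sympy (a smaller run than the one behind the numbers above):

from sympy import nextprime

def transition_matrix(base, n_primes):
    # counts[i][j] = times a prime ending in i is followed by one ending in j.
    counts = [[0] * base for _ in range(base)]
    p = nextprime(base)
    q = nextprime(p)
    for _ in range(n_primes):
        counts[p % base][q % base] += 1
        p, q = q, nextprime(q)
    return counts

m = transition_matrix(11, 100000)
row = m[2]  # primes ending in 2 (base 11)
total = sum(row)
for digit, c in enumerate(row):
    if c:
        print("2 ->", digit, round(100.0 * c / total, 1), "%")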
Wow, that's really interesting. The same seems to be true for base 7; I haven't tried any other prime bases yet (and I don't know how it might extend to non-prime bases). Anyone have an idea why this seems to hold?
Edit: This basically works for base 10, too. I feel like the reason must either be very obvious or very deep.
Well, whether it's obvious or not, I think it's in the paper. Immediately under equation (1.1) of http://arxiv.org/abs/1603.03720, the authors are discussing the second correction term to the distribution of primes mod q, and they state:
"We can also show that c2(q; (a, b)) = c2(q; (−b,−a)) for any two reduced residue classes a and b (mod q)."
I'm not 100% certain this is responsible for the phenomenon that we're seeing, but it seems exceedingly likely. I think I'd need to stare at their formula for c2 for a long time to understand where this relation comes from, though.
I'm pretty solidly convinced of the pattern at this point: I've checked the "reflect across the anti-diagonal" pattern for the first 10 million primes, expressed in every base from 3 to 20, and it seems to hold up. (I haven't tried to establish any sort of bounds.)
As I've expressed in another comment, my preferred way of thinking about this symmetric pattern goes something like this:
The probability that (a prime congruent to x mod b is followed by a prime congruent to y mod b) seems to be equal to the probability that (a prime congruent to -y mod b is followed by a prime congruent to -x mod b).
I still haven't figured out whether it ought to be obvious, though if it is then I expect the language I've used above to be relevant. It's definitely not trivially obvious, because it's not an exact equality: if the pattern only shows up clearly after you've accumulated thousands or millions of primes, then it doesn't seem that it could be enforced by any sort of exact transformation. (For example, if the symmetric entries were somehow just counting the same pairs in two different ways, the numbers ought to be precisely equal rather than just increasingly close.)
There are rather a lot of other patterns in the data; I expect that at least some of them must be accounted for in the original paper, but I haven't more than glanced through it yet.
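For concreteness, the reflection check amounts to something like this (a simplified sketch assuming sympy, not the exact program I used):

from sympy import nextprime

def reflection_check(base, n_primes):
    # Compare each pair (x, y) against its mirror (-y mod base, -x mod base).
    counts = {}
    p = nextprime(base)
    q = nextprime(p)
    for _ in range(n_primes):
        key = (p % base, q % base)
        counts[key] = counts.get(key, 0) + 1
        p, q = q, nextprime(q)
    for (x, y), c in sorted(counts.items()):
        mirror = ((-y) % base, (-x) % base)
        print((x, y), c, "<->", mirror, counts.get(mirror, 0))

reflection_check(7, 100000)  # mirrored counts track each other closely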
So a number ending in 'x' is as likely to be followed by 'y' as a number ending in 'n-x' is to be preceded by a number ending in 'n-y' (where n is the base).
Can't think of a trivial reason why that would be the case, something weird is happening.
Framing this differently, a prime congruent to x mod b is as likely to be followed by a prime congruent to y mod b as a prime congruent to -y mod b is to be followed by one congruent to -x mod b.
Why would you expect 9.1%? At low numbers like these, primes are likely to be closely packed, meaning you are not as likely to have to search forward 11 numbers as just dividing by 11 would imply.
Miscalculation - obviously no primes end in 0 (base 11) so I should have said "10% as you would naively expect".
I take your point about primes being closely packed at low numbers, but I think this is a small correction (i.e. you might expect 8-9% of primes ending in 2 to be followed by another prime ending in 2, but certainly not <4%)
I did my own investigation using base 3 and noticed something peculiar.
In the first 100k primes, we go from 1 to 2 29028 times and from 2 to 1 29029 times.
Then I filtered out the twin primes since those are the ones that exploit the fact that the next "possible" prime is one that flips the last digit.
This filtered out 10249 primes going from 2 to 1 (bringing the total below both the number of primes that stay at 2 (21008) and the number of primes that stay at 1 (20932)).
It didn't factor out any primes that go from 1 to 2. Are there no twin primes (p,p') where p%3 == 1 and p'%3 == 2?
edit: Oh hey, this is obvious, if p%3 is 1 then p+2 is divisible by 3. It does mean we don't need to take measurements to know that the result we are investigating cannot possibly account for everything since it isn't a factor at all when going from 1 to 2.
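In code, the filter amounts to something like this (a sketch assuming sympy, not my original program):

from collections import Counter
from sympy import nextprime

raw, nontwin = Counter(), Counter()
p, q = 5, 7  # start past 2 and 3, so every residue mod 3 is 1 or 2
for _ in range(100000):
    key = (p % 3, q % 3)
    raw[key] += 1
    if q - p > 2:  # drop twin-prime pairs
        nontwin[key] += 1
    p, q = q, nextprime(q)

print(raw)      # (1,2) and (2,1) dominate, in nearly equal numbers
print(nontwin)  # the twins all came out of (2,1); (1,2) is untouched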
> Are there no twin primes (p,p') where p%3 == 1 and p'%3 == 2?
By definition, no there aren't. Twin primes are a distance of 2 apart, so your mod 3 options are 0 -> 2, 1 -> 0, and 2 -> 1. Anything involving 0 means it's divisible by 3, so your only mod 3 possibility for twin primes is 2 -> 1.
EDIT: Excluding the possibility where 3 mod 3 = 0, of course. This allows a 0 -> 2 transition with (3,5).
Yeah, I managed to get an edit in just before you replied -- I thought I was missing something obvious but I kept looking for a bug in my code instead of realizing that it's mathematically obvious :)
Here is my attempt to work through the math and figure out how "surprising" this result is.
Clearly, we should expect that for small primes (< 100e6) it is less likely that a prime ending in K (in base B) will be followed by another prime ending in K - because for that to happen, none of the B-1 numbers in between can be prime.
A (very naive) model of the distribution of primes says that every number n has probability p(n) = 1/log(n) of being prime. Assume that a number n ends with a k in base b. Define p = 1/log(n). Then the probability that the next prime ends in k+j is, roughly,
q(j) = p * (1-p)^(j-1) * sum_{i=0}^{infinity} (1-p)^(i*b)
= p * (1-p)^(j-1) / (1 - (1-p)^b)
In this formula, j takes values 1 to b (where j = b represents another prime ending in k).
For n ~ 1,000,000 and working in base b = 10, under this model we would expect to see around 6.97% of primes ending in k followed by another prime ending in k, whereas we expect to see 13.7% followed by a prime ending in k+1 (it is apparent how naive the model is, since in fact we never see a prime ending in k followed by a prime ending in k+1, except for 2, 3). It would not be hard to extend the model to rule out even numbers, or multiples of 3 and 5, but I have not done this.
Around n ~ 10^60 the distribution starts to look more equal, as the primes are "spread out" enough that you expect to have long sequences of non-primes between the primes, which blurs out the distribution to be roughly constant.
I think this is what the article is getting at when it quotes James Maynard as saying "It's the rate at which they even out which is surprising to me". With a naive model of 'randomness' in the primes, you expect to see this phenomenon at low numbers (less than 10^60) and for it to slowly disappear at higher numbers. And indeed, you do see that, but the rate at which the phenomenon disappears is much slower than the random model predicts.
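Plugging numbers into the model is a one-liner; a sketch of the calculation behind the percentages above:

import math

def q(j, n, b):
    # Model probability that the next prime's last digit is (k + j) mod b,
    # with p = 1/log(n) and j running from 1 to b (j = b means a repeat of k).
    p = 1 / math.log(n)
    return p * (1 - p) ** (j - 1) / (1 - (1 - p) ** b)

n, b = 10**6, 10
print(round(100 * q(b, n, b), 2))  # ~6.97%: same last digit again
print(round(100 * q(1, n, b), 2))  # ~13.7%: the very next ending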
> This conspiracy among prime numbers seems, at first glance, to violate a longstanding assumption in number theory: that prime numbers behave much like random numbers.
You can see that while prime numbers are difficult to predict, they're anything but random. I'm not sure why the article claims that mathematicians used to think the primes were evenly distributed, which is complete and utter nonsense.
At a high level it's not that far off, in the sense that most mathematicians think nontrivial patterns in the primes are at least unusual, and they mostly behave randomly. But it's true that there is some structure, which is in jargon terms called a "conspiracy" among the primes when it's found or hypothesized. As Terence Tao summarizes it,
> We believe that the primes do not observe any significant pattern beyond the obvious ones (e.g. mostly being odd), but we are still a long way from making this belief completely rigorous.
Especially relevant are slides 10-11 on treating the primes as a pseudorandom set, and then slides 14-15 on using pseudorandom models of the primes to rigorously (vs. heuristically) prove theorems. That's done by classifying and ruling out all possible ways nonrandom structure in the actual primes (the "conspiracies") could sink the specific theorem being proven.
Can anyone say what the security implications of this are? Intuitively, it would seem the less 'random' primes appear to be, the easier it would be to factor the composite of two prime numbers.
The hardness of factoring is not in the finding prime numbers. You could hand someone a list of all primes in the security-relevant sizes, and it's not going to help them much.
Yeah, in practice it'll affect the total computation time, but generally security people tend to assume extremely generous fudge factors anyhow. A lot of times when you see security papers talking about how something takes "2 to the 50 operations", they're referring to the full process of hashing some string 2 to the 50 times or something like that, rather than 2 to the 50 CPU cycles. When so many security operations involve things getting exponentially harder as you add bits, there's not much point in trying to shave an order of magnitude here or there; you just go ahead and make things that are secure even if the entire universe is converted into computronium and dedicated to brute-forcing your security. (Because so far, that's never been the ultimate security problem.)
On top of what others have said, this actually doesn't apply to RSA at all. There are ways to factor the prime-product used in RSA faster if you believe the 2 primes are close together[1], so any good RSA key generator should be picking primes that are far apart (when I wrote a toy key generator for a class I just tried to make the second prime more than twice the first), and this "conspiracy" only applies to consecutive primes. As you get further away from the original prime, you get less info about what the last digit could be.
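For anyone wondering why close primes are dangerous: Fermat's factorization method walks a upward from sqrt(n) until a^2 - n is a perfect square, and the number of steps grows roughly like (p - q)^2 / sqrt(n), so nearby primes fall out almost immediately. A sketch:

import math

def fermat_factor(n):
    # Find a, b with n = a^2 - b^2, giving n = (a - b)(a + b).
    a = math.isqrt(n)
    if a * a < n:
        a += 1
    while True:
        b2 = a * a - n
        b = math.isqrt(b2)
        if b * b == b2:
            return a - b, a + b
        a += 1

print(fermat_factor(101 * 103))  # (101, 103) pops out on the very first step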
Never say "never", but it would seem that this would only help in breaking a procedure that used consecutive primes. Every system of which I'm aware generates the primes it requires without reference to the primes that come before or after.
Why the downmods? Everyone can easily verify that there are similar biases if you replace "isPrime(n)" with "random() < 0.1" in the various code snippets floating in the thread. The article even admits that the biases are explained by the prime k-tuples conjecture, which is a model of randomness in primes from 1923. So primes are not less random than we thought -- they are still exactly as random as we thought.
For those willing to try this over toy code, I did a (horrible, horrible, I'm terribly ashamed of it) quick Python snippet to check it out:
def primer():
    p = 3
    while True:
        is_prime = True
        for x in xrange(2, p):
            if p % x == 0:
                is_prime = False
                break
        if is_prime:
            yield p
        p += 2

give_prime = primer()
primes = [1, 2]  # had to separate this into 2 lines because Python
primes.extend([give_prime.next() for x in xrange(9998)])  # so we get 10,000 primes

primes_dict = {}
for i in xrange(len(primes) - 1):
    p0 = str(primes[i])[-1]
    p1 = str(primes[i + 1])[-1]
    key = "".join([p0, "-", p1])
    try:
        primes_dict[key] += 1
    except KeyError:
        primes_dict[key] = 1

# let's delete the 4 outliers from the beginning
del primes_dict["1-2"]
del primes_dict["2-3"]
del primes_dict["3-5"]
del primes_dict["5-7"]
So long story short, my results over 10,000 primes:
And you can clearly see that the tendency to avoid the same last digit is starting to show, though those that end in 1 are still not showing it completely. Tried with 100,000 primes but the (horrible) algorithm kinda got stuck, so I settled with 10,000 to make this a "quick test".
Before you go, please believe me I'm sorry for primer() and give_prime. I'll try to never do those kind of things again.
Edit: I've edited this like 5 times already over little typos and bad transcription mistakes I did all over the place. Should work now.
In the spirit of "every programmer loves to fizzbuzz", I rewrote this in Rust. Aside from rewriting it in a different language, the biggest change I made was only doing trial division against known primes <= the square root of a number we are checking for primality. Able to get the first 1,000,000 primes in 20 seconds:
use std::collections::HashMap;

pub fn first_n_primes(n: u64) -> Vec<u64> {
    let mut primes = Vec::new();
    let mut candidate = 3;
    let mut count = 0;
    if n >= 1 {
        primes.push(2);
        count += 1;
    }
    while count < n {
        let candidate_sqrt = ((candidate as f64).sqrt().ceil() + 1.0) as u64;
        let mut is_prime: bool = true;
        for prime in &primes {
            if candidate % prime == 0 {
                is_prime = false;
                break;
            }
            if prime > &candidate_sqrt {
                break;
            }
        }
        if is_prime {
            primes.push(candidate);
            count += 1;
        }
        candidate += 2;
    }
    primes
}

fn main() {
    let mut last_digit_pair_counts: HashMap<String, u64> = HashMap::new();
    let primes = first_n_primes(1000000);
    for i in 0..(primes.len() - 1) {
        let last_digit0 = primes[i] % 10;
        let last_digit1 = primes[i + 1] % 10;
        let digit_str = format!("{}-{}", last_digit0, last_digit1);
        let counter = last_digit_pair_counts.entry(digit_str).or_insert(0);
        *counter += 1;
    }
    last_digit_pair_counts.remove("2-3");
    last_digit_pair_counts.remove("3-5");
    last_digit_pair_counts.remove("5-7");
    let mut ordered_keys: Vec<String> = last_digit_pair_counts.keys().cloned().collect();
    ordered_keys.sort();
    for key in &ordered_keys {
        println!("{}: {}", key, last_digit_pair_counts[key]);
    }
}
Small speed note: Rust 1.7 just made custom hashing functions available. Rust uses a cryptographically secure hashing function by default, but for work like this, that's inappropriate. To use FNV instead:
1. add fnv to your Cargo.toml's dependencies section
2. The first few lines become:
extern crate fnv;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;
use fnv::FnvHasher;
type MyHasher = BuildHasherDefault<FnvHasher>;
3. and the hash creation line becomes
let mut last_digit_pair_counts: HashMap<String, u64, MyHasher> = HashMap::default();
This is 17% faster on my two runs of your code. :)
Oh, one other thing: I noticed you said 20 seconds. Mine took about 4 seconds, and I was wondering where the difference was... did you compile with optimizations? Without, it takes 22s on my machine... `--release` as an argument to Cargo, or `-C opt-level=3` as an argument to rustc.
primes = {}

function inPrimes(n)
    for _, v in ipairs(primes) do
        if n % v == 0 then return false end
        if v > math.ceil(math.sqrt(n)) then break end
    end
    return true
end

for i = 3, 1.6e7, 2 do
    if inPrimes(i) then table.insert(primes, i) end
end

last = '7'
totalDigits = {}
for i = 4, #primes do
    c = last..tostring(primes[i]):sub(-1)
    totalDigits[c] = totalDigits[c] and totalDigits[c] + 1 or 1
    last = c:sub(-1)
end

for k, v in pairs(totalDigits) do print(k, v) end
Gets just as many primes and runs in 7 seconds in LuaJIT.
Could you try running both on the same machine? I'm curious if LuaJIT can still beat Rust if both have optimizations working.
I know it can beat native code sometimes, which is pretty impressive (it finds common cases and specializes to them AFAIK, almost like "sufficiently advanced optimizing compiler" fairy tales).
I don't know how to Rust, but you are welcome to try it. LuaJIT is amazingly fast. It shouldn't be faster than native code in general, but it's still not orders of magnitude behind like interpreted languages are. As I understand it, JITs can sometimes do better by doing statistics on code paths and optimizing them.
And of course, it takes 0 seconds to compile, if you factor in that time : )
Nice. I suppose I should try optimizing the Lua code some more. There are some nasty branches in there that might slow it down.
The Lua code is not exactly identical to the rust code. I test all numbers less than n, as opposed to counting n primes. I set n so it got slightly more primes than the rust code though.
He diagnosed it correctly - I was running debug builds instead of release builds. Shaves 80% off my 20 sec runtime without including any of the other optimizations he shared.
The hashing optimization isn't necessary as using String at all is wasteful - my code ended up being simpler, but in the end the largest gain came from replacing u64 with u32 - see https://news.ycombinator.com/item?id=11290955.
It's good to see this in a different language. I like how, when working on larger data, the 3-9 transition loses a big part of its "advantage" over the 3-7 transition. 9 is still the preferred/most probable "end-digit follower" for primes ending in 3, but it dropped from being almost 20% more probable (10k) to about 12% (1m).
A couple of useful speedups for your loop: you only need the xrange to go up to the floor of sqrt(p), since any divisor would have been found already by that point. Also, you can save a lot of divisions by keeping a stored list of primes and checking divisibility against those, instead of against all numbers, for as long as your memory allows (a sketch of both changes follows).
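Something like this, say (a sketch with both changes applied; Python 3 here, hence range/print):

import math

def primer():
    # Trial division against previously found primes only, up to sqrt(p).
    primes = []
    p = 2
    while True:
        limit = math.isqrt(p)
        is_prime = True
        for q in primes:
            if q > limit:
                break
            if p % q == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(p)
            yield p
        p += 1

gen = primer()
print([next(gen) for _ in range(10)])  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]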
So among the less populated set of same-final-digit sequential primes, is the next-most-significant digit anti-correlated (to mitigate its 'misbehavior' ;))? In a sense, how likely is this unusual homo-digitalism once our focus is shifted away from the final digit?
Thanks for pointing it out, I did this over an IPython session and copied it in a (completely unnecessary) hurry. The results are good though :)
Edit: as to keep my claim that the results are good, check this sentence from the article "...Nor could it explain why, as the pair found, primes ending in 3 seem to like being followed by primes ending in 9 more than 1 or 7." -- it backs up the data I posted :)
Primes seem to me to be more of an information theoretic concept than a number concept.
Primes are the simplest way to encode specific kinds of graphs in a way that unambiguously encodes all sub-graphs.
If you try to come up with a bit-representation that is equivalently rich it becomes difficult to think of one that is as simple yet preserves the semantics of the factorization tree.
So I guess my point is that the factorization tree of numbers is the fundamental concept, and it's information theoretic. Primes happen to be an encoding of that fundamental concept into integers, but if we found an equivalently rich representation using a different encoding, we might understand primes better. I doubt that the quirks of the encoding has anything to do with the fundamental concept however.
I once was really interested in finding patterns in prime numbers. I got a long csv file of prime numbers from the internet. I used symbolic regression on it, to try to predict the next prime in the list.
Symbolic regression basically uses genetic algorithms to fit mathematical expressions to data. The program I was using, Eureqa, tries to find the simplest expressions that fit, with only a handful of elements. To prevent overfitting, and give a human understandable model.
Anyway this actually worked. Far from perfectly of course, but it was able to get much better than random predictions. It was definitely finding some pattern.
Unfortunately I used up Eureqa's free trial forever ago, and I'm not going to pay thousands of dollars to buy a subscription. But I am now thinking of writing my own software to do this, and then running it on a dataset of mathematical sequences like the primes.
Look at base 11: there are a lot of rarefied diagonals (which correspond to any prime + the base (11) + 2, or prime + 13). I wonder if prime + other_prime is rarefied in general.
Wrote some code to compare random numbers to the primes for this property. To generate the random numbers, I apply the Prime Number theorem as a probability to determine if we want to select it, and then compare the stats to that of the actual primes. https://gist.github.com/personjerry/c58483daaf372acbe1fa
cumulative:
1 to 1: 30768 rand, 28289 prime
1 to 3: 53573 rand, 51569 prime
1 to 7: 44306 rand, 53263 prime
1 to 9: 36968 rand, 32816 prime
ratios:
1 to 1: 0.18578027352594872 rand, 0.17048036302934247 prime
1 to 3: 0.323479153458322 rand, 0.3107745710721539 prime
1 to 7: 0.26752407692539926 rand, 0.3209832647330011 prime
1 to 9: 0.22321649609032998 rand, 0.19776180116550257 prime
cumulative:
3 to 1: 37015 rand, 38455 prime
3 to 3: 31015 rand, 25900 prime
3 to 7: 53377 rand, 48596 prime
3 to 9: 44594 rand, 53082 prime
ratios:
3 to 1: 0.22298058445431052 rand, 0.23161058343823215 prime
3 to 3: 0.18683622387816942 rand, 0.15599308571187656 prime
3 to 7: 0.3215462557454473 rand, 0.2926888028283534 prime
3 to 9: 0.2686369359220728 rand, 0.3197075280215379 prime
cumulative:
7 to 1: 44412 rand, 42590 prime
7 to 3: 36923 rand, 45728 prime
7 to 7: 30588 rand, 25886 prime
7 to 9: 53404 rand, 51800 prime
ratios:
7 to 1: 0.26863125805222376 rand, 0.25656008288956894 prime
7 to 3: 0.2233331518747694 rand, 0.275463241849594 prime
7 to 7: 0.18501515179008873 rand, 0.15593600154213152 prime
7 to 9: 0.3230204382829181 rand, 0.3120406737187056 prime
cumulative:
9 to 1: 53453 rand, 56602 prime
9 to 3: 44489 rand, 42837 prime
9 to 7: 37022 rand, 38259 prime
9 to 9: 30902 rand, 28144 prime
ratios:
9 to 1: 0.322266166664657 rand, 0.3413007561413876 prime
9 to 3: 0.2682225410873838 rand, 0.2583000687401261 prime
9 to 7: 0.22320427332907286 rand, 0.23069548124118136 prime
9 to 9: 0.18630701891888632 rand, 0.1697036938773049 prime
Unless I'm doing something wrong, it honestly it doesn't seem like the actual prime numbers have a statistic that deviates from random numbers with a prime distribution. Hence it looks like to me just the result of a) specifying the "next" number which naturally favors the digit after it and b) probability of a given number being prime (prime number theorem).
It's supposed to be true in every base. But of course in binary it's not true: every prime in binary ends in a 1, and it's followed by another prime that ends in a 1.
You can just increase the length of the suffix since that is equivalent to talking about the last digit in a power-of-two base. E.g. if the prime ends in 11 in binary then the following prime is less likely to end in 11, since this is equivalent to saying that a prime ending in 3 in quaternary is less likely to be followed by another prime ending in 3.
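Equivalently, you can just track primes mod 4 (the last two bits); a sketch assuming sympy:

from collections import Counter
from sympy import nextprime

counts = Counter()
p, q = 5, 7  # odd primes are 1 or 3 mod 4, i.e. end in 01 or 11 in binary
for _ in range(100000):
    counts[(p % 4, q % 4)] += 1
    p, q = q, nextprime(q)

print(counts)  # the repeats (1,1) and (3,3) lag behind (1,3) and (3,1)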
It is still useful to say "prime numbers don't end in 0, 2, 4, 5, or 8" just as it is useful to say "consecutive prime numbers in any base are less likely to be followed by a number with the same least-significant digit". There are special cases at the bottom for both statements.
> This conspiracy among prime numbers seems, at first glance, to violate a longstanding assumption in number theory: that prime numbers behave much like random numbers.
I wonder if this is really an artifact like Benford's Law, which also involves first-digit-frequency (in any base) and also involves certain kinds of "random" numbers.
To recycle a past comment:
> If you have a random starting value (X) multiplied by a second random factor (Y), most of the time the result will start with a one.
> You're basically throwing darts at logarithmic graph paper! The area covered by squares which "start with 1" is larger than the area covered by square which "start with 9".
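The dart-throwing picture is easy to reproduce (a sketch; even three uniform factors already give a rough Benford-like falloff):

import random
from collections import Counter

random.seed(1)
leads = Counter()
for _ in range(100000):
    x = random.uniform(1, 10) * random.uniform(1, 10) * random.uniform(1, 10)
    leads[str(x)[0]] += 1

for d in "123456789":
    print(d, leads[d])  # counts fall off roughly like log10(1 + 1/d)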
Applying the ideas from Benford's law to these findings seems very promising to me, as primes follow a lognormal distribution [1] and that is where we expect Benford's law to apply [2].
Does this have any ramifications in security? I vaguely understand we rely on prime numbers to create secrets that are hard to guess... So does this in someway make it easier to possibly guess?
Shouldn't be a problem, from what I understand in cryptography you randomly generate numbers and check if they're prime, the result is a uniformly random prime. Even if you know exactly which numbers are prime that still doesn't help you in figuring out which prime was used.
Unless the random number generator was flawed of course, but that's a different issue.
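That loop, in sketch form (assuming sympy's isprime; a real implementation would use a CSPRNG, e.g. Python's secrets module, rather than random):

import random
from sympy import isprime

def random_prime(bits):
    # Rejection sampling: uniform over odd primes of the requested bit length.
    while True:
        n = random.getrandbits(bits) | (1 << (bits - 1)) | 1  # right length, odd
        if isprime(n):
            return n

print(random_prime(64))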
Quite a few of them do. For example: Mersenne twisters, linear congruential generators. Not sure about cryptographic random number generators though, but most of them probably use primes one way or another.
It should be noted that from the original paper, the asymptotic formula that Oliver and Soundararajan conjecture still says that each possibility for the last digits of consecutive primes should occur about the same number of times in the limit. It's just that the amount by which the frequencies vary is more than you would expect from the most naive model of primes as being "random".
For the first n primes in a given base, it returns the mapping {i,j} -> count for all pairings of digit i followed by digit j. E.g., for the first million base 5 primes
>If Alice tosses a coin until she sees a head followed by a tail, and Bob tosses a coin until he sees two heads in a row, then on average, Alice will require four tosses while Bob will require six tosses (try this at home!), even though head-tail and head-head have an equal chance of appearing after two coin tosses.
Now that is particularly interesting to think about.
That's really cool.
I just spent 5 minutes looking for a chart showing the distribution of reserved commands in Python. Didn't find much.
A while back, I read something about different number bases' ability to help find additional primes. The base itself was prime, so maybe 7 or 13. Can't find the article ATM. I hypothesized that prime numbers are "code" provided by this universe to allow us to access other data stored in other primes. Quines of a sort, if you will. One way to invalidate this hypothesis would be to do a mean distribution of basic operators in a simple programing language and compare it to what we are seeing in primes.
As a non-mathematician, this is a pretty neat read. I was distracted by the personification of the numbers though (they have 'likes' and 'preferences', which in my day-to-day vocab are concepts applicable only to things that possess the ability to think). Is this common in mathematical writing, or is this paper an abnormality in that sense?
(I don't mean to nitpick - I'm genuinely curious. I recall seeing the same thing in high-school chemistry, but never in physics, for example, and I'm curious if entire fields see this effect or if it's a product only of the audience being written to).
"If Alice tosses a coin until she sees a head followed by a tail, and Bob tosses a coin until he sees two heads in a row, then on average, Alice will require four tosses while Bob will require six tosses (try this at home!), even though head-tail and head-head have an equal chance of appearing after two coin tosses."
Counter-intuitive at first but makes sense - the outcomes as a whole converge towards the average (50% heads, 50% tails). Nonetheless, it shows that each toss is related to the others. One can expect that primes are even more related - or at least to the primes that came before.
Right, so the coin toss example is not related to all the outcomes but rather just the most recent. In contrast, the OP seems to show that a prime is related to the previous prime, and that previous prime is related to the one before it, so by extension, they are all related.
This phenomenon feels trivial. Think of divisibility by 3 across X11, X13, X17, X19, X21, X23, X27, X29, X31, X33, X37, X39: how many of these pseudo-numbers will be divisible by 3? I count 4 for each of X = 0, 1, and 2, one time each for numbers ending in 1, 3, 7, and 9.
Just based on this knowledge, I know that a prime number is guaranteed not to be immediately followed by another one with the same ending 1 time in 2.
I'm not sure these fellows have found anything particularly interesting, but if so, and I have missed something, kudos to them.
It would be interesting to know how well this holds up over different scales.
Does a prediction based on base 3 hold up better over primes under 100 than 1000 and 1000 than 10000?
Does how well a final-digit sequence predicts the next prime scale in a predictable way with the base in which the prime number field is viewed?
Just thinking about what might be happening, I would imagine that the answer is yes, but that a lot of crunching would be needed to graph and deduce a relationship to an actual predictive property statement.
Perhaps it has been discovered before, maybe many times. But the only way to know whether a discovery in math, or any science, is the first is to search all the academic journals, an activity that's only feasible if you're part of the university mathematics research industry, especially considering how much some academic journals charge for subscriptions. Then you need to announce your discovery, after peer review of course.
Isn't this similar to the property that any number taken from a natural distribution will tend to contain more low digits than high, regardless of unit and base?
Because 100x will become >200 "slower" than 200x becomes >300 etc. With slower meaning lower value of x. x in this case is usually a random variable centered around 1.
So, the million dollar question is: how does this affect my security and privacy? Does this pattern mean encryption based on the assumption of the inherent randomness of primes is now less secure? E.g. is there now less entropy in a given set of primes?
I have a premonition of Quite a Bit of Trouble coming down the pipe.
Take only remainders, and form a vector a=(2 1 2 0)
What can be said about components of the vector for the prime next to p? E.g., do i-th components repel, like the 1-st ones?
This is absolutely incredible. This is why mathematics is so amazing, that something so small can be missed for centuries. All about how to look at things!
> Soundararajan showed his findings to postdoctoral researcher Lemke Oliver, who was shocked. He immediately wrote a program that searched much farther out along the number line — through the first 400 billion primes.
This is how modern computers revolutionized even the most theoretical fields like number theory. Remarkable, I love it!
That's bizarre - I tried to submit this four hours ago and was told it was a duplicate. I searched, and couldn't find the original submission to upvote it, and now it's submitted again, after my submission was declined.
I don't understand.
But it's a great result, so I've upvoted it, despite being confused.
Dup detection applies to deleted posts, but you can't find them using search. So what might have happened is that somebody submitted this link, deleted it, and then you tried to submit it.
Mathematicians Discover Prime Conspiracy
A previously unnoticed property of prime numbers seems to violate a longstanding assumption about how they behave.
By: Erica Klarreich
March 13, 2016
Two mathematicians have uncovered a simple, previously unnoticed property of prime numbers — those numbers that are divisible only by 1 and themselves. Prime numbers, it seems, have decided preferences about the final digits of the primes that immediately follow them.
Among the first billion prime numbers, for instance, a prime ending in 9 is almost 65 percent more likely to be followed by a prime ending in 1 than another prime ending in 9. In a paper posted online today, Kannan Soundararajan and Robert Lemke Oliver of Stanford University present both numerical and theoretical evidence that prime numbers repel other would-be primes that end in the same digit, and have varied predilections for being followed by primes ending in the other possible final digits.
“We’ve been studying primes for a long time, and no one spotted this before,” said Andrew Granville, a number theorist at the University of Montreal and University College London. “It’s crazy.”
The discovery is the exact opposite of what most mathematicians would have predicted, said Ken Ono, a number theorist at Emory University in Atlanta. When he first heard the news, he said, “I was floored. I thought, ‘For sure, your program’s not working.’”
This conspiracy among prime numbers seems, at first glance, to violate a longstanding assumption in number theory: that prime numbers behave much like random numbers. Most mathematicians would have assumed, Granville and Ono agreed, that a prime should have an equal chance of being followed by a prime ending in 1, 3, 7 or 9 (the four possible endings for all prime numbers except 2 and 5).
“I can’t believe anyone in the world would have guessed this,” Granville said. Even after having seen Lemke Oliver and Soundararajan’s analysis of their phenomenon, he said, “it still seems like a strange thing.”
Yet the pair’s work doesn’t upend the notion that primes behave randomly so much as point to how subtle their particular mix of randomness and order is. “Can we redefine what ‘random’ means in this context so that once again, [this phenomenon] looks like it might be random?” Soundararajan said. “That’s what we think we’ve done.”
Prime Preferences
Soundararajan was drawn to study consecutive primes after hearing a lecture at Stanford by the mathematician Tadashi Tokieda, of the University of Cambridge, in which he mentioned a counterintuitive property of coin-tossing: If Alice tosses a coin until she sees a head followed by a tail, and Bob tosses a coin until he sees two heads in a row, then on average, Alice will require four tosses while Bob will require six tosses (try this at home!), even though head-tail and head-head have an equal chance of appearing after two coin tosses.
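That coin-tossing claim is easy to sanity-check with a short simulation (my sketch, not from the article):

    # Monte Carlo estimate of the average number of tosses until a pattern
    # first appears: Alice waits for head-tail ("HT"), Bob for head-head ("HH").
    import random

    def average_tosses(pattern, trials=200_000):
        total = 0
        for _ in range(trials):
            last_two, n = "", 0
            while last_two != pattern:
                last_two = (last_two + random.choice("HT"))[-2:]
                n += 1
            total += n
        return total / trials

    print("HT:", average_tosses("HT"))  # comes out near 4
    print("HH:", average_tosses("HH"))  # comes out near 6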
Sorry, this site has some javascript that copies and pastes the entire article when you try to copy and paste a few sentences. I didn't catch this before submitting on mobile.
Eh? That is not something a webpage can do, for all sorts of reasons. HN can only paste what you have copied before, so I'm afraid this is most likely a case of user error.
You're right, I didn't explain myself correctly. What I meant to say is that Javascript from one site is forbidden by default from operating on the contents of another site. So if you didn't copy the text from site A in the first place, it is not possible (at a basic security level) for site B code to access the text from site A which never made it into the clipboard.
It would be much more helpful for everyone if you explained where you think I'm incorrect, rather than mindless downvoting. Having worked professionally with Javascript for almost 20 years, I would hate to miss an opportunity to learn something more about it.
Perhaps I have missed something, but the introductory example seems to follow from simple probability and therefore I do not find it mathematically remarkable.
Say there is a fixed, equal probability that each number ending in 9 or 1 is prime. I could go along with that assumption, although the fact that primes get rarer as you go higher is potentially relevant.
What the authors consider here is starting with a prime ending in 9.
So the next candidate ending in 1 comes up before the next candidate ending in 9. If only because numbers ending in 1 get checked first, a prime ending in 1 is more likely to appear next than one ending in 9. The probability can be calculated, depending on your assumptions, as a geometric series. In any case, P(next prime ends in 1) > P(next prime ends in 9).
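For what it's worth, here is that geometric series spelled out (a sketch of the naive model only; the fixed probability q, the independence assumption, and ignoring candidates ending in 3 or 7 are all my simplifications):

    # After a prime ending in 9, candidates ending in 1 and 9 alternate
    # (...1, ...9, ...1, ...), each prime independently with probability q:
    #   P(next ends in 1) = q + (1-q)^2 * q + (1-q)^4 * q + ... = 1 / (2 - q)
    # That exceeds 1/2 for any 0 < q < 1, but only barely when q is small --
    # nowhere near the ~65 percent relative bias the article reports.
    for q in (0.01, 0.05, 0.10):
        print(f"q = {q:.2f}: P(next prime ends in 1) = {1 / (2 - q):.3f}")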
"Most mathematicians would have assumed, Granville and Ono agreed, that a [known] prime should have an equal chance of being followed by a prime ending in 1, 3, 7 or 9" So - I'm a definite nope on that.
This result appears to be exactly what I would have assumed was the case.
> Lemke Oliver and Soundararajan’s first guess for why this bias occurs was a simple one: Maybe a prime ending in 3, say, is more likely to be followed by a prime ending in 7, 9 or 1 merely because it encounters numbers with those endings before it reaches another number ending in 3. For example, 43 is followed by 47, 49 and 51 before it hits 53, and one of those numbers, 47, is prime.
> But the pair of mathematicians soon realized that this potential explanation couldn’t account for the magnitude of the biases they found. Nor could it explain why, as the pair found, primes ending in 3 seem to like being followed by primes ending in 9 more than 1 or 7.
Ok, so the random model "1, 3, 7, 9 mod 10" doesn't fully work, but let's look at what happens mod 30. Large primes have the following possible remainders mod 30: 1, 7, 11, 13, 17, 19, 23, 29. So a prime p ending in 3 is either 13 or 23 mod 30. In both cases p + 6 (ending in 9) is an admissible residue, but p + 4 (ending in 7) is admissible only half of the time: 13 + 4 = 17 is on the list, while 23 + 4 = 27 is divisible by 3. I think that this fully explains why a prime ending with 3 is more likely to be followed by a prime ending in 9. So basically the OP is on the right track, and their random model just needed to be refined a bit.
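A quick empirical check of the mod-30 observation (a sketch with a plain sieve; the 2,000,000 bound is arbitrary):

    # Among primes ending in 3, tally the last digit of the next prime. If
    # the mod-30 reasoning above is on track, 3 -> 9 should beat 3 -> 7.
    from collections import Counter

    def primes_up_to(n):
        sieve = bytearray([1]) * (n + 1)  # simple sieve of Eratosthenes
        sieve[0:2] = b"\x00\x00"
        for i in range(2, int(n ** 0.5) + 1):
            if sieve[i]:
                sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
        return [i for i in range(2, n + 1) if sieve[i]]

    primes = primes_up_to(2_000_000)
    after_3 = Counter()
    for p, q in zip(primes, primes[1:]):
        if p % 10 == 3:
            after_3[q % 10] += 1

    total = sum(after_3.values())
    for d in (1, 3, 7, 9):
        print(f"3 -> {d}: {after_3[d] / total:.3f}")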
> The primes' preferences about the final digits of the primes that follow them can be explained, Soundararajan and Lemke Oliver found, using a much more refined model of randomness in primes, something called the prime k-tuples conjecture.
So I guess that my observation is just a special case of this "prime k-tuples conjecture".
Are you contesting it, or just curious? You already know that my observation explains the "3 followed by 9" bias. You already know that the mathematicians call the conjecture "a much more refined model of randomness in primes" which is similar to how I described what I was doing. In addition, MathWorld's article on the k-Tuple Conjecture talks about residues mod q, which is similar to what I'm doing when I look at primes mod 10*3. All these elements point at some connection between the k-tuple conjecture and my observation.
Well, 'points to a connection' is not the same as 'is a special case of'. I'm not an expert on this subject, but from what I can see, the conjecture is about the asymptotic distribution of certain patterns in the primes. I don't think an example like the one you are giving is 'related' except in the hand-wavy sense in which anything dealing with prime numbers and patterns is related to everything else dealing with prime numbers and patterns.
All I meant by "special case" was that "mod 30" isn't the whole story -- more like the most significant correction on top of what the OP said, with other smaller corrections possible, and the entire set of corrections being described by the k-tuple conjecture.
It's amazing how people can be picky and negative on HN. Someone positive would instead congratulate me for making the gist of what the prime k-tuple conjecture says about the biases easily understandable. Oh well.
So you are making comments about pure mathematics. If you want to use imprecise language and not be corrected, you should probably go write a book review or something. In math, precise language is expected, and correcting someone or pressing them to justify a claim is completely usual. It would be bizarre, when talking to a mathematician about mathematics, if they didn't immediately correct you or demand clarification and justification when you said something vague, incorrect, or unjustified.
>Lemke Oliver and Soundararajan’s first guess for why this bias occurs was a simple one: Maybe a prime ending in 3, say, is more likely to be followed by a prime ending in 7, 9 or 1 merely because it encounters numbers with those endings before it reaches another number ending in 3. For example, 43 is followed by 47, 49 and 51 before it hits 53, and one of those numbers, 47, is prime.
That line of thinking is specifically addressed as their first guess in the article. It's not indicated exactly why, but they apparently 'quickly realised this couldn't account for such a bias'.