Right, it is definitely not convex in any way and gradient descent is basically useless (and not very interesting because all it really means here is try perturbing each count by one and look for improvement). I left that there as an initial attempt at solving the problem but I don't use it in my solution (except for the call to `gd(iter(gd(v)))` where I explore the neighbouring solutions to each sample point, which is probably not necessary).
The basic idea instead is to treat each letter count as a random variable, and ask what the distribution of each count is. In particular, each letter count can be expressed as a sum of (a mapping of) the other letter counts, so if you know the distribution of all other letter counts you can improve the distribution of the letter of interest. Initially assume uniform for all counts, then iterate until convergence. The images in that repo show the distribution of some letters at various stages. After doing this for about 10 iterations you stop seeing any improvement (the letter distributions are as 'peaked' as they are going to get), at which point I drew samples until I found a solution.
Alright, I admit that I skimmed, came across a bunch of GD code at the beginning and assumed that you just got lucky with a very greedy approach, sorry :P.
The update function you wrote on the distribution space is continuous, and distribution space is compact (since there's no way to have more than N of any letter), so there is necessarily at least one fixed point.
Fixed points could still be sources though, right? It clearly wasn't, but I'm curious to know if you got lucky, or if you merely didn't get unlucky.
I wonder how effective your code would be on the following self descriptive sentences (base j).
For instance when j=2, "100 1,11 0" has 4 ones and 3 zeros as described.
Does your code consistently find solutions as you increase j? At what point does the computation become unfeasible?
Nice! I admit I don't have a good understanding of the analysis, but if you mean "does repeatedly applying f to some starting point v, like f(f(f(...(v)))) eventually lead to a solution" the answer is definitely not - you can be right next to a solution and not get there by GD or by applying f. My function `iter` applies f over and over, but this was another failed attempt at a solution.
I will endeavour to try your problem, is it related in some way to a well-known problem?
The basic idea instead is to treat each letter count as a random variable, and ask what the distribution of each count is. In particular, each letter count can be expressed as a sum of (a mapping of) the other letter counts, so if you know the distribution of all other letter counts you can improve the distribution of the letter of interest. Initially assume uniform for all counts, then iterate until convergence. The images in that repo show the distribution of some letters at various stages. After doing this for about 10 iterations you stop seeing any improvement (the letter distributions are as 'peaked' as they are going to get), at which point I drew samples until I found a solution.