> There have been 213 commits in the curl git repo from 7.53.1 till today. There’s a chance one or more other commits than just the pure alloc changes have made a performance impact, even if I can’t think of any.
"I can't think of any" is not a very scientific way to measure optimizations. Actually, this simple fact casts a doubt on whether it's this malloc optimization that led to the speed up or any of the 200+ commits OP is working on top of.
Why not eliminate that doubt by applying the malloc optimizations to the previous official release?
I'm a bit skeptical about the speedup myself, since I would expect curl to be primarily IO bound rather than CPU bound (much less malloc bound, given how little memory it uses).
I'm not skeptical about it. The last time I did something similar to this optimization, I had a Python function that was doing some work over strings to compute a similarity metric.
For this function, a list of elements was kept inside each call: it began empty, and a few calculations were performed to populate it before the main loop of the function kicked in. In Python-land, this was obviously done by declaring an empty list `precalc_values = []` and appending to it. When we cythonized it, the dev who took it went in with a `malloc(sizeof(int) * elements)` "dynamic array of ints" and called it a day: a 70x speedup over plain Python.
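Roughly, and with hypothetical names (`similarity_large` and the ord-summing loop are stand-ins, not the real metric), that first port looked something like this:

```cython
# similarity.pyx -- hypothetical reconstruction of the malloc'd version
from libc.stdlib cimport malloc, free

def similarity_large(str a, str b):
    cdef Py_ssize_t n = len(a) + len(b)
    # heap-allocate the scratch buffer on every call
    cdef int* precalc_values = <int*> malloc(n * sizeof(int))
    if precalc_values == NULL:
        raise MemoryError()
    cdef Py_ssize_t i
    cdef long total = 0
    try:
        # stand-in for the real precalculation pass over both strings
        for i in range(len(a)):
            precalc_values[i] = ord(a[i])
        for i in range(len(b)):
            precalc_values[len(a) + i] = ord(b[i])
        for i in range(n):
            total += precalc_values[i]
        return total
    finally:
        free(precalc_values)
```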
A few days later I came in, saw that code, and asked "why not go with a simple fixed array?", to which I was told "because we don't know the size of the strings beforehand". I did a test run with a small fixed array plus a counter (to know up to which index the array held real values, rather than just zero-initialized fields) and got a 100x speedup.
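The fixed-array variant, again as a sketch (the 4096 cap and the names are my assumptions), trades the per-call malloc/free for a stack buffer and a fill counter:

```cython
DEF MAX_ELEMENTS = 4096  # hypothetical compile-time cap on total length

def similarity_small(str a, str b):
    # fixed-size stack array; `count` marks how many slots hold real values
    cdef int precalc_values[MAX_ELEMENTS]
    cdef Py_ssize_t count = 0
    cdef Py_ssize_t i
    cdef long total = 0
    for i in range(len(a)):
        precalc_values[count] = ord(a[i])
        count += 1
    for i in range(len(b)):
        precalc_values[count] = ord(b[i])
        count += 1
    # iterate only up to `count`, ignoring uninitialized slots
    for i in range(count):
        total += precalc_values[i]
    return total
```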
In the end we went with both functions plus a wrapper that checked the length of the strings involved and selected one or the other, because the array version would crash hard from out-of-bounds accesses if a large string happened to come by.
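The wrapper then amounts to a size check; a minimal sketch, assuming the two functions above live in the same module:

```cython
def similarity(str a, str b):
    # route small inputs to the stack-array version; anything that would
    # overrun the fixed buffer falls back to the malloc'd version
    if len(a) + len(b) <= MAX_ELEMENTS:
        return similarity_small(a, b)
    return similarity_large(a, b)
```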
"I can't think of any" is not a very scientific way to measure optimizations. Actually, this simple fact casts a doubt on whether it's this malloc optimization that led to the speed up or any of the 200+ commits OP is working on top of.
Why not eliminate that doubt by applying the malloc optimizations to the previous official release? I'm a bit skeptical about the speed up myself, since I would expect curl to be primarily IO bound and not CPU bound (much less malloc bound, given how little memory it uses).