For a newer list I would add the Mapbox API as well.
So I work in data analytics, not so much web-mapping. For those applications, IMO local solutions, like ESRI, are good options if you are limited to addresses in the US, https://crimede-coder.com/blogposts/2024/LocalGeocoding.
Google's TOS says you can't even cache the results, https://cloud.google.com/maps-platform/terms. So saving them to a database and doing analysis of your data is not allowed, AFAICT.
Care to share the contexts in which someone needs a zero-shot model for time series? I have just never come across one in which you don't have some historical data to fit a model and go from there.
That said, I think it is possible that my refusal to make cheeky PowerPoint slides with SmartArt, and my insistence on filling them with graphs of real data instead, has stunted my career growth into management.
One problem with both of these takes on PowerPoint is that they assume it will be presented in person. That's less often the case now. People present more often via Teams or Zoom, so a lot of the usual advice (don't expect people to read and listen simultaneously) is no longer accurate: half your viewers are audio only, and more people get copies of the slides than make it to the original presentation. Remote vs. in person are totally different beasts.
IMO if you are doing this, you should avoid text in the charts entirely, as I think the title can sometimes lead the models astray; a clustering title, for example, will likely bias the model to find clusters even if none exist. Presuming you are the one making the chart and not just prompting with an existing image.
My company's looks similar to the recent screenshot, but it is a hellscape of a billion options and poor search functionality, to the extent that I just have to ask a person for the right link, or search the tree by hand, whenever I need to actually use it.
I don't envy developers who need to work on this, but IMO the best systems I have worked with have a very shallow tree and then a "human will work out the appropriate team to route to".
> a hellscape of a billion options and poor search functionality, to the extent that I just have to ask a person for the right link, or search the tree by hand, whenever I need to actually use it.
I did some development work for a customer a couple of years ago: I had to take screenshots and bookmarks to even have a chance of finding something a second time!
Once I got to the built-in code editor for the right script it was fine though, I had no trouble with their programming docs.
A tell for fake firms in my local newspaper is that they ask for a snail-mail resume. These appear to me to be shell companies submitting multiple H-1Bs, though, not legitimate firms claiming they cannot hire any US workers.
I agree that understanding KM is a very good place to start with survival analysis. In many of the examples I have for KM in my business, the censoring is due to certain events (auditing healthcare claims) taking a long time to resolve.
When I first learned survival analysis, my professor had me construct life tables first, and only then did I learn KM. You can often do quite a bit with discrete-time tables. So if you have data:
ID TimeRange Outcome
A 4 1
B 3 0
You can then explode the data into the form:
ID Time Outcome
A 1 0
A 2 0
A 3 0
A 4 1
B 1 0
B 2 0
B 3 0
If you group this table by Time and count the events (numerator) and the rows at risk (denominator), that is what you need to calculate the life table, and the discrete version of the KM plot.
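For anyone who wants to play with this, here is a rough pandas sketch of that same explode-and-group step, using the toy data above (the column and variable names are just my own placeholders):

    import pandas as pd

    # One row per subject: how long they were observed and whether the event happened.
    df = pd.DataFrame({"ID": ["A", "B"], "TimeRange": [4, 3], "Outcome": [1, 0]})

    # Explode to person-period form: one row per subject per time period.
    long = (
        df.loc[df.index.repeat(df["TimeRange"])]
          .reset_index(drop=True)
          .assign(Time=lambda d: d.groupby("ID").cumcount() + 1)
    )
    # The event is only recorded in the subject's final period.
    long["Outcome"] = (long["Time"].eq(long["TimeRange"]) & long["Outcome"].eq(1)).astype(int)
    long = long[["ID", "Time", "Outcome"]]

    # Life table: rows at risk (denominator) and events (numerator) per period.
    life = long.groupby("Time").agg(at_risk=("ID", "size"), events=("Outcome", "sum"))
    life["hazard"] = life["events"] / life["at_risk"]
    life["survival"] = (1 - life["hazard"]).cumprod()  # discrete KM step function
    print(life)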
Care to elaborate on this? The post does not save the resulting weight, so you don't use it in any subsequent calculations; you would just treat the result as a simple random sample. So it is unclear to me why this critique matters.
Yes, rhymer has hit on the primary tradeoff behind Algorithm A. While Algorithm A is fast, easy, and makes it possible to sample any dataset you can access with SQL, including massive distributed datasets, its draws from the population are not independent and identically distributed (because each draw makes the remaining population one item smaller). Nor does it let you compute the inclusion probabilities that would let you use the most common reweighting methods, such as a Horvitz–Thompson estimator, to produce unbiased estimates from potentially biased samples.
In practice, however, when your samples are small, your populations are large, and your populations' weights are not concentrated in a small number of members, you can use Algorithm A and just pretend that you have a sample of i.i.d. draws. In these circumstances, the potential for bias is generally not worth worrying about.
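For concreteness, here is a small Python sketch of the weighted-key idea behind this kind of sampling (a sketch, not necessarily the post's exact SQL): each row gets an Exponential(weight) key and you keep the n smallest, which in SQL is the familiar ORDER BY -LN(RANDOM()) / weight LIMIT n pattern.

    import numpy as np

    rng = np.random.default_rng(0)

    def weighted_sample_indices(weights, n):
        # Each row gets key = -ln(U) / weight, an Exponential(weight) draw;
        # the n smallest keys form a PPS-without-replacement sample, drawn
        # sequentially in key order (the SQL version sorts by the same key).
        weights = np.asarray(weights, dtype=float)
        keys = -np.log(rng.random(len(weights))) / weights
        return np.argsort(keys)[:n]

    # Toy population: 10 rows with unequal weights.
    w = [1, 1, 1, 1, 1, 5, 5, 5, 10, 10]
    print(weighted_sample_indices(w, 3))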
But when you cannot play so fast and loose, you can produce unbiased estimates from an Algorithm A sample by using an ordered estimator, such as Des Raj's ordered estimator [1].
You could alternatively use a different sampling method, one that does produce inclusion probabilities. But these tend to be implemented only in more sophisticated statistical systems (e.g., R's "survey" package [2]) and thus not useful unless you can fit your dataset into those systems.
For very large datasets, then, I end up using Algorithm A to take samples and (when it matters) Des Raj's ordered estimator to make estimates.
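And here is a rough sketch of Des Raj's ordered estimator of the population total, written from the textbook formula (treat it as a sketch and check it against [1] before relying on it). Here y holds the observed values in draw order, and p holds each drawn unit's original single-draw selection probability (its weight divided by the total weight of the population).

    import numpy as np

    def des_raj_total(y, p):
        # Ordered estimator of the population total for a PPS-without-
        # replacement sample taken draw by draw:
        #   t_1 = y_1 / p_1
        #   t_i = (y_1 + ... + y_{i-1}) + (y_i / p_i) * (1 - p_1 - ... - p_{i-1})
        # and the estimate is the mean of the t_i.
        y = np.asarray(y, dtype=float)
        p = np.asarray(p, dtype=float)
        prior_y = np.cumsum(y) - y  # sum of earlier observations at each draw
        prior_p = np.cumsum(p) - p  # sum of earlier selection probabilities
        t = prior_y + (y / p) * (1 - prior_p)
        return t.mean()

    # Toy example: three draws, with each unit's original selection probability.
    print(des_raj_total(y=[12.0, 7.0, 30.0], p=[0.02, 0.01, 0.05]))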
(I was planning a follow up post to my blog on this subject.)
Very nice. Another pro tip for folks: you can set the weights to get approximate stratified sampling. Say group A has 100,000 rows and group B has 10,000 rows, and you want each to make up approximately the same proportion of the resulting sample. You would set the weight for each A row to 1/100,000 and for each B row to 1/10,000.
If you want exact counts I think you would need to do RANK and PARTITION BY.
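A small Python sketch of both points, using made-up group sizes: the first part weights each row by 1 over its group size and keeps the n smallest exponential keys, so the split is only approximately 50/50; the second part samples a fixed count within each group, which is roughly what the RANK/PARTITION BY idea does in SQL.

    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(1)

    # Made-up groups (smaller than 100,000/10,000 so it runs instantly).
    df = pd.DataFrame({"grp": ["A"] * 1000 + ["B"] * 100})

    # Approximate stratified sample: weight each row by 1 / group size,
    # then keep the n smallest exponential keys. Expected split is ~50/50.
    df["w"] = 1.0 / df.groupby("grp")["grp"].transform("size")
    df["key"] = -np.log(rng.random(len(df))) / df["w"]
    approx = df.nsmallest(200, "key")
    print(approx["grp"].value_counts())

    # Exact counts per group: sample a fixed number within each group
    # (the pandas analogue of ranking within a partition).
    exact = df.groupby("grp", group_keys=False).sample(n=100, random_state=1)
    print(exact["grp"].value_counts())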