I saw that but I mean how do you know you need 100 instances? Is there a way to estimate the optimum number or at least set boundaries, such as max price or max processing time?
Do you get back info such as how long each instance took to do its job?
Since it scales somewhat close to linear, you can just calculate it yourself.
And since you pay per minute on EC2, you can either choose to have 10 instances calculate for a minute or 1 instance for 10 minutes. Either way, you pay for 10 minutes of processing time.
p.s. simplified but probably somewhere in a reasonable ballpark
It's actually rounded up to complete hours, so 10 instances for a minute is 10x the price of 1 instance for 10 minutes.
For fast iterative development this isn't an issue - you can re-use the job flow (the instances you spun up), so you can launch the job several times over the course of that hour.
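To put rough numbers on the hourly rounding and on the earlier question about setting a max price or max time, here is a back-of-the-envelope sketch. It assumes perfectly linear scaling and per-instance billing rounded up to whole hours; the function names and the $0.10 hourly rate are made up for illustration, not actual EC2 pricing.

```python
import math

def estimate(total_instance_minutes, instances, hourly_rate):
    """Return (wall_clock_minutes, cost), assuming linear scaling and
    billing rounded up to whole hours per instance."""
    wall_minutes = total_instance_minutes / instances
    billed_hours = math.ceil(wall_minutes / 60)  # partial hours count as full hours
    return wall_minutes, instances * billed_hours * hourly_rate

def cheapest_within_deadline(total_instance_minutes, max_minutes,
                             hourly_rate, max_instances=100):
    """Return (instances, wall_clock_minutes, cost) for the cheapest setup
    that finishes within max_minutes, or None if nothing fits.
    On a cost tie, the smaller instance count wins."""
    best = None
    for n in range(1, max_instances + 1):
        wall, cost = estimate(total_instance_minutes, n, hourly_rate)
        if wall <= max_minutes and (best is None or cost < best[2]):
            best = (n, wall, cost)
    return best

RATE = 0.10  # hypothetical price per instance-hour

# The same 10 minutes of total work, split two ways:
print(estimate(10, 10, RATE))  # 10 instances, 1 minute each
print(estimate(10, 1, RATE))   # 1 instance, 10 minutes

# 600 instance-minutes of work that must finish within an hour:
print(cheapest_within_deadline(600, 60, RATE))
```

With hourly rounding, the cheapest setup for a given deadline tends to be the smallest instance count that still finishes in time, since every extra instance adds at least one billed hour.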
There may be some info returned, but I didn't dig into it too much. Once I figured I could run through the whole set in less than an hour for less than 10 bucks, I had all the info I needed.