The previous implementation of spot instance requests was too
I observed the CI system making spot instance requests that
expired due to insufficient capacity, and the web UI did not
make this obvious.
This commit improves the scheduling of spot instances a little.
First, we randomize the availability zone that the spot instance
request is assigned to. I noticed all spot requests were being
assigned to us-west-2c. I'm not sure why. The EC2 docs say
Amazon will assign an availability zone randomly, but it kept
assigning the same zone, which had no capacity. Choosing a
random availability zone ourselves seems more robust.
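
As a rough illustration of the zone randomization described above
(not the code in this commit), the sketch below picks a zone with
random.choice and places it in a launch specification dict. The
zone list, the function name, and the image and instance type
values are hypothetical. The dict layout follows the
LaunchSpecification shape accepted by boto3's
request_spot_instances, assuming that is how the request is sent.

```python
import random

# Hypothetical zone list; a real deployment would likely discover
# the zones for its region at runtime instead of hardcoding them.
AVAILABILITY_ZONES = ["us-west-2a", "us-west-2b", "us-west-2c"]


def build_launch_specification(image_id, instance_type,
                               zones=AVAILABILITY_ZONES, rng=random):
    """Build a launch specification pinned to a randomly chosen zone.

    Picking the zone ourselves avoids relying on the provider to
    spread requests across zones.
    """
    zone = rng.choice(zones)
    return {
        "ImageId": image_id,
        "InstanceType": instance_type,
        # Explicit placement, so each request may land in a different zone.
        "Placement": {"AvailabilityZone": zone},
    }
```

The resulting dict could then be passed as the LaunchSpecification
argument of the spot request call. Passing rng as a parameter keeps
the choice easy to seed in tests.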
We also update job state accounting to store the spot instance
request ID and the number of spot instance requests. This will
help us inspect the spot instance request after it has been
created (functionality for doing so will be introduced in a
subsequent commit). We also update the execution state to
reflect that a spot instance has been requested. This will give
users more context and can be used to influence behavior should
we want to try launching another instance at a later time.
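
The state accounting described above might look roughly like the
following sketch. The field names (spot_request_id,
spot_request_count, execution_state) and the state value are
assumptions made for illustration. The actual job record schema is
not shown in this message.

```python
def record_spot_request(job, request_id):
    """Record a newly created spot instance request on a job record.

    `job` is a plain dict standing in for whatever storage the
    CI system actually uses for job state (an assumption).
    """
    # Keep the request ID so the request can be inspected later.
    job["spot_request_id"] = request_id
    # Count requests so later logic can decide whether to retry.
    job["spot_request_count"] = job.get("spot_request_count", 0) + 1
    # Surface the request in the execution state shown to users.
    job["execution_state"] = "spot-instance-requested"
    return job
```

Tracking a count alongside the latest request ID leaves room for
retrying with another request later without losing the history of
how many attempts were made.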