AWS EC2 spot instance availability


I am using the request_spot_instances API call to create a Spot Instance without specifying an Availability Zone (AZ), so the API normally picks one at random. The request sometimes returns a no-capacity status, even though I can successfully request a Spot Instance in another AZ through the AWS console. What is the proper way to check Spot availability for a specific instance type before calling request_spot_instances?

There are 2 answers

Ahmed Nada On

There is no public API to check Spot Instance availability. Having said that, you can still achieve what you want by following the steps below:

  1. Use request_spot_fleet instead, and configure it to launch a single instance.
  2. Be flexible with the instance types you use: pick as many as you can and include them in the request. To help you choose, check the Spot Instance advisor for interruption frequency and savings rates.
  3. In the Spot Fleet request, set AllocationStrategy to capacityOptimized. This lets the fleet allocate capacity from the most available Spot Instance pool among the instance types in your list, and reduces the likelihood of Spot interruptions.
  4. Don't set a maximum price (SpotPrice); the default maximum, which is the On-Demand price, will be used. The Spot pricing model has changed and is no longer based on bidding, so Spot prices are more stable and fluctuate far less than they used to.
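
The four steps above might translate into something like the following. This is only a sketch: the helper names build_spot_fleet_config and submit_spot_fleet are made up for illustration, and the IAM fleet role ARN, AMI ID and subnet IDs are placeholders you would replace with your own. The request itself uses the boto3 request_spot_fleet call with a SpotFleetRequestConfig.

```python
def build_spot_fleet_config(iam_fleet_role, image_id, instance_types, subnet_ids):
    """Build a SpotFleetRequestConfig for a single capacity-optimized Spot Instance.

    Hypothetical helper: every argument is a placeholder for your own values.
    """
    # One launch specification per (instance type, subnet) pair gives the
    # fleet as many capacity pools as possible to choose from (step 2).
    launch_specs = [
        {"ImageId": image_id, "InstanceType": instance_type, "SubnetId": subnet_id}
        for instance_type in instance_types
        for subnet_id in subnet_ids
    ]
    return {
        "IamFleetRole": iam_fleet_role,
        "AllocationStrategy": "capacityOptimized",  # step 3
        "TargetCapacity": 1,                        # step 1: a single instance
        "Type": "request",                          # one-time request, not maintained
        "LaunchSpecifications": launch_specs,
        # Step 4: no "SpotPrice" key, so the default maximum price applies.
    }


def submit_spot_fleet(config, region_name="us-east-1"):
    """Send the request; needs boto3 and AWS credentials, so it is not called here."""
    import boto3  # imported lazily so the builder above works without AWS access

    client = boto3.client("ec2", region_name=region_name)
    response = client.request_spot_fleet(SpotFleetRequestConfig=config)
    return response["SpotFleetRequestId"]


# Placeholder values, for illustration only.
config = build_spot_fleet_config(
    iam_fleet_role="arn:aws:iam::123456789012:role/example-spot-fleet-role",
    image_id="ami-0123456789abcdef0",
    instance_types=["t3.nano", "t3a.nano"],
    subnet_ids=["subnet-aaaa1111", "subnet-bbbb2222"],
)
```

Calling submit_spot_fleet(config) would then place the request. Using one subnet per AZ lets the fleet, rather than your own code, pick the AZ with capacity.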
Jonathan Leon On

This may be overkill for what you are looking for, but with parts of the code below you can pull the Spot price history for the last hour (the window can be changed). It gives you the instance type, AZ, and additional information. From there you can loop through the AZs for the instance type you want: if a Spot Instance doesn't come up within, say, 30 seconds, try the next AZ.
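
A rough sketch of that "try the next AZ" loop could look like this. The function name and its parameters are made up for illustration; client is assumed to be a boto3 EC2 client, and launch_spec a LaunchSpecification dict containing at least ImageId and InstanceType. The 30-second wait is the figure suggested above, not an AWS recommendation.

```python
import time


def request_spot_in_first_available_az(client, launch_spec, zones,
                                       wait_seconds=30, poll_seconds=5):
    """Try each AZ in turn; return (zone, instance_id), or None if none is fulfilled.

    Hypothetical helper: `client` is assumed to be boto3.client('ec2').
    """
    for zone in zones:
        # Pin this attempt to a single Availability Zone.
        spec = dict(launch_spec, Placement={"AvailabilityZone": zone})
        response = client.request_spot_instances(
            InstanceCount=1, Type="one-time", LaunchSpecification=spec
        )
        request_id = response["SpotInstanceRequests"][0]["SpotInstanceRequestId"]

        deadline = time.monotonic() + wait_seconds
        while True:
            described = client.describe_spot_instance_requests(
                SpotInstanceRequestIds=[request_id]
            )
            request = described["SpotInstanceRequests"][0]
            if request.get("InstanceId"):
                return zone, request["InstanceId"]  # fulfilled in this AZ
            if request["State"] != "open":
                break  # closed, cancelled or failed: nothing left to cancel
            if time.monotonic() >= deadline:
                # Still open after the wait: cancel it and move to the next AZ.
                client.cancel_spot_instance_requests(
                    SpotInstanceRequestIds=[request_id]
                )
                break
            time.sleep(poll_seconds)
    return None
```

Two caveats with this sketch: a request can be fulfilled between the last poll and the cancel call, and cancelling a request does not terminate an instance that was already launched, so real code should check for that case. A newly created request ID may also take a moment to become visible to describe_spot_instance_requests.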

And to Ahmed's point in his answer, this information can be used in a request_spot_fleet call instead of looping through the AZs. A heads-up if you use the DryRun parameter: if you pass the wrong AZ or subnet in the Spot Fleet request, it may pass the dry-run API call but still fail the real call.

Here's the output of the code that follows:

In [740]: df_spot_instance_options
Out[740]:
    AvailabilityZone   InstanceType  SpotPrice  MemSize  vCPUs  CurrentGeneration Processor
0         us-east-1d        t3.nano      0.002      512      2               True  [x86_64]
1         us-east-1b        t3.nano      0.002      512      2               True  [x86_64]
2         us-east-1a        t3.nano      0.002      512      2               True  [x86_64]
3         us-east-1c        t3.nano      0.002      512      2               True  [x86_64]
4         us-east-1d       t3a.nano      0.002      512      2               True  [x86_64]
..               ...            ...        ...      ...    ...                ...       ...
995       us-east-1a    p2.16xlarge      4.320   749568     64               True  [x86_64]
996       us-east-1b    p2.16xlarge      4.320   749568     64               True  [x86_64]
997       us-east-1c    p2.16xlarge      4.320   749568     64               True  [x86_64]
998       us-east-1d    p2.16xlarge     14.400   749568     64               True  [x86_64]
999       us-east-1c  p3dn.24xlarge      9.540   786432     96               True  [x86_64]

[1000 rows x 7 columns]

And here's the code:

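from datetime import datetime, timedelta

import boto3
import pandas as pd

# uses the default region from your AWS configuration
ec2c = boto3.client('ec2')

# This first part maps instance details to spot prices, in case you are looking for a certain memory size or vCPU count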
paginator = ec2c.get_paginator('describe_instance_types')
response_iterator = paginator.paginate( )

df_hold_list = []
for page in response_iterator:
    df_hold_list.append(pd.DataFrame(page['InstanceTypes']))

df_instance_specs = pd.concat(df_hold_list, axis=0).reset_index(drop=True)
df_instance_specs['Spot'] = df_instance_specs['SupportedUsageClasses'].apply(lambda x: 1 if 'spot' in x else 0)
df_instance_spot_specs = df_instance_specs.loc[df_instance_specs['Spot']==1].reset_index(drop=True)

# unpack the memory, vCPU and processor dictionaries
df_instance_spot_specs['MemSize'] = df_instance_spot_specs['MemoryInfo'].apply(lambda x: x.get('SizeInMiB'))
df_instance_spot_specs['vCPUs'] = df_instance_spot_specs['VCpuInfo'].apply(lambda x: x.get('DefaultVCpus'))
df_instance_spot_specs['Processor'] = df_instance_spot_specs['ProcessorInfo'].apply(lambda x: x.get('SupportedArchitectures'))

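# no memory filter is applied here: every spot-capable instance type is kept
# (to narrow the list, filter df_instance_spot_specs on MemSize, which is in MiB, first)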
instance_list = df_instance_spot_specs['InstanceType'].unique().tolist()

#---------------------------------------------------------------------------------------------------------------------
# You can use this section by itself to get the instance type and availability zone, then loop through the AZs for the instance you want;
# just replace instance_list with the one instance type you want information for
# look only in us-east-1
client = boto3.client('ec2', region_name='us-east-1')
prices = client.describe_spot_price_history(
    InstanceTypes=instance_list,
    ProductDescriptions=['Linux/UNIX', 'Linux/UNIX (Amazon VPC)'],
    # timezone-aware start time, so the one-hour window is unambiguous
    StartTime=(datetime.now().astimezone() -
               timedelta(hours=1)).isoformat(),
    # AvailabilityZone='us-east-1a',
    MaxResults=1000)

df_spot_prices = pd.DataFrame(prices['SpotPriceHistory'])
df_spot_prices['SpotPrice'] = df_spot_prices['SpotPrice'].astype('float')
df_spot_prices.sort_values('SpotPrice', inplace=True)
#---------------------------------------------------------------------------------------------------------------------

# merge memory size and cpu information into this dataframe
df_spot_instance_options = df_spot_prices[['AvailabilityZone', 'InstanceType', 'SpotPrice']].merge(
    df_instance_spot_specs[['InstanceType', 'MemSize', 'vCPUs', 'CurrentGeneration', 'Processor']],
    on='InstanceType')
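
To turn the price history into an ordered list of AZs to try for one instance type, something like the sketch below could work without pandas. The function name cheapest_zones is made up, and the sample records are invented values shaped like the SpotPriceHistory entries that describe_spot_price_history returns; they are not real prices. Keep in mind that a low price is not the same thing as available capacity, so this only decides the order in which to try AZs.

```python
from datetime import datetime, timezone


def cheapest_zones(price_history, instance_type):
    """Return [(zone, price), ...] for one instance type, cheapest first.

    Hypothetical helper: `price_history` is a list of dicts shaped like the
    'SpotPriceHistory' entries returned by describe_spot_price_history.
    """
    latest = {}
    for record in price_history:
        if record["InstanceType"] != instance_type:
            continue
        zone = record["AvailabilityZone"]
        # The history can hold several entries per AZ; keep the most recent.
        if zone not in latest or record["Timestamp"] > latest[zone]["Timestamp"]:
            latest[zone] = record
    return sorted(
        ((zone, float(record["SpotPrice"])) for zone, record in latest.items()),
        key=lambda pair: pair[1],
    )


# Invented sample data, for illustration only.
sample_history = [
    {"AvailabilityZone": "us-east-1a", "InstanceType": "t3.nano",
     "SpotPrice": "0.0025", "Timestamp": datetime(2021, 1, 1, 10, tzinfo=timezone.utc)},
    {"AvailabilityZone": "us-east-1a", "InstanceType": "t3.nano",
     "SpotPrice": "0.0019", "Timestamp": datetime(2021, 1, 1, 11, tzinfo=timezone.utc)},
    {"AvailabilityZone": "us-east-1d", "InstanceType": "t3.nano",
     "SpotPrice": "0.0021", "Timestamp": datetime(2021, 1, 1, 11, tzinfo=timezone.utc)},
    {"AvailabilityZone": "us-east-1b", "InstanceType": "t3a.nano",
     "SpotPrice": "0.0017", "Timestamp": datetime(2021, 1, 1, 11, tzinfo=timezone.utc)},
]

zones_to_try = cheapest_zones(sample_history, "t3.nano")
# zones_to_try == [("us-east-1a", 0.0019), ("us-east-1d", 0.0021)]
```

With real data you would pass prices['SpotPriceHistory'] from the code above instead of sample_history.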