I've been tasked with enabling encryption on a Redshift cluster that holds a significant amount of existing data. Based on this link, I know that enabling it creates a new cluster and copies the existing data across, and that the cluster is read-only while this happens. We have a number of ETL jobs that run against the Redshift cluster, and I'm trying to work out roughly how long I can expect the migration to take. Is there any kind of estimation available based on data size/node type/cluster config?

1 Answer

Nathan Griffiths On Best Solutions

Is there any kind of estimation available based on data size/node type/cluster config?

Basically, no. The time this takes depends on a number of factors, some of which are outside your control, so it's very hard to predict.

You should absolutely test this first so you understand the implications and how long it's likely to take. For example:

  • Create a new, identical cluster by restoring a snapshot of your original cluster
  • Follow the steps to encrypt the cluster and record the time taken
  • Ideally, test your existing ETL jobs with the encrypted cluster
  • Drop the test cluster
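The timing part of that test could be scripted roughly as follows. This is a minimal sketch, not a tested runbook: it assumes a boto3 Redshift client is passed in as `client`, and the identifiers (`snapshot_id`, `test_cluster_id`, `kms_key_id`) and the helper names are hypothetical. It also assumes the cluster reports `Encrypted` as true and returns to `available` once the migration has finished, which you should confirm on your own test run.

```python
import time


def _wait_until(client, cluster_id, is_done, poll_seconds, sleep):
    """Poll describe_clusters until is_done(cluster) returns True."""
    while True:
        cluster = client.describe_clusters(ClusterIdentifier=cluster_id)["Clusters"][0]
        if is_done(cluster):
            return
        sleep(poll_seconds)


def time_encryption_test(client, snapshot_id, test_cluster_id, kms_key_id,
                         poll_seconds=60, sleep=time.sleep, clock=time.monotonic):
    """Restore a snapshot to a test cluster, encrypt it, return hours taken."""
    # Step 1: create a copy of the original cluster from a snapshot.
    client.restore_from_cluster_snapshot(
        ClusterIdentifier=test_cluster_id,
        SnapshotIdentifier=snapshot_id,
    )
    _wait_until(
        client, test_cluster_id,
        lambda c: c["ClusterStatus"] == "available"
        and c.get("RestoreStatus", {}).get("Status") == "completed",
        poll_seconds, sleep,
    )

    # Step 2: request encryption and time how long it takes to finish.
    started = clock()
    client.modify_cluster(
        ClusterIdentifier=test_cluster_id,
        Encrypted=True,
        KmsKeyId=kms_key_id,
    )
    # Assumption: Encrypted flips to True and the status returns to
    # "available" only once the migration is complete.
    _wait_until(
        client, test_cluster_id,
        lambda c: c["ClusterStatus"] == "available" and c.get("Encrypted"),
        poll_seconds, sleep,
    )
    return (clock() - started) / 3600


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   client = boto3.client("redshift")
#   hours = time_encryption_test(client, "my-snapshot", "encrypt-test", "my-kms-key-id")
#   ...run the ETL jobs against the test cluster, then:
#   client.delete_cluster(ClusterIdentifier="encrypt-test", SkipFinalClusterSnapshot=True)
```

The ETL check and the final delete are left as manual steps on purpose, since you will want to look at the encrypted test cluster before removing it.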

Based on my experience with resizing clusters (a similar but not identical exercise), I would allow a margin of +/- 10-15% on your test time to account for variability in local AWS resources, network traffic, etc.
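To turn a measured test time into a planning window, the arithmetic is simple enough to sketch. The function name and the 10-hour figure below are made up for illustration; the 15% default is just the upper end of the margin suggested above.

```python
def migration_window(test_hours, margin=0.15):
    """Return (low, high) hour estimates around a measured test time."""
    low = test_hours * (1 - margin)
    high = test_hours * (1 + margin)
    return low, high


# Hypothetical example: the test encryption took 10 hours.
low, high = migration_window(10)
print(f"Plan for roughly {low:.1f} to {high:.1f} hours")
```

For scheduling the ETL downtime, I'd plan around the high end of the range rather than the midpoint.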

If possible, I'd advise killing all connections to the cluster to speed up the process. We discovered that a process which frequently polled our cluster caused the resize to take longer.
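One way to clear out connections is to list sessions from `stv_sessions` and call `pg_terminate_backend` on each. The sketch below assumes a DB-API connection (for example from psycopg2, hence the `%s` placeholder) opened by a superuser. The function name and the `keep_users` parameter are hypothetical, and skipping the internal `rdsdb` user is my assumption about what you'd want to leave alone.

```python
def terminate_other_sessions(conn, keep_users=("rdsdb",)):
    """Terminate every session except our own and those of keep_users."""
    cur = conn.cursor()
    # stv_sessions lists active sessions; exclude the session running this query.
    cur.execute(
        "SELECT process, TRIM(user_name) FROM stv_sessions "
        "WHERE process <> pg_backend_pid()"
    )
    killed = []
    for pid, user in cur.fetchall():
        if user in keep_users:
            continue
        cur.execute("SELECT pg_terminate_backend(%s)", (pid,))
        killed.append((pid, user))
    return killed
```

Anything that reconnects automatically (like the poller we had) will come straight back, so stop those processes at the source as well.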

As a reference point, a 20-node ds cluster with approx. 25 TB of data took around 20 hours to resize.