I am currently running on the highest-memory virtual machine available, n1-highmem-32 (32 vCPUs, 208 GB memory).
My data set is around 90 GB, but it has the potential to grow in the future.
The data is stored in many zipped CSV files. I am loading it into a sparse matrix in order to perform some dimensionality reduction and clustering.
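For concreteness, the loading step currently looks roughly like the sketch below (the paths, the assumption of gzip compression, and the scikit-learn estimators and their parameters are illustrative, not my exact code):

```python
# Rough sketch of the current single-machine pipeline; paths and parameters are placeholders.
import glob

import pandas as pd
from scipy import sparse
from sklearn.cluster import MiniBatchKMeans
from sklearn.decomposition import TruncatedSVD

blocks = []
for path in glob.glob('data/*.csv.gz'):            # hypothetical location of the zipped CSVs
    df = pd.read_csv(path, compression='gzip')     # pandas also accepts compression='zip'
    blocks.append(sparse.csr_matrix(df.values))    # store each file as a sparse block

X = sparse.vstack(blocks)                          # one large sparse matrix held in memory

# TruncatedSVD works directly on sparse input; cluster the reduced representation.
X_reduced = TruncatedSVD(n_components=50).fit_transform(X)
labels = MiniBatchKMeans(n_clusters=10).fit_predict(X_reduced)
```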
The Datalab kernel runs on a single machine. Since you are already on a machine with 208 GB of RAM, you may have to switch to a distributed system to analyze the data.
If the operations you are doing on the data can be expressed as SQL, I'd suggest loading the data into BigQuery, which Datalab has a lot of support for. Otherwise you may want to convert your processing pipeline to use Dataflow (which has a Python SDK). Depending on the complexity of your operations, either of these may be difficult, though.
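For the BigQuery route, a minimal sketch of what the notebook side could look like, assuming the data has already been loaded into a BigQuery table (the project, dataset, table, and column names are placeholders, and this uses the standard google.cloud.bigquery client rather than Datalab's built-in BigQuery integration):

```python
# Sketch: push the heavy aggregation into BigQuery and pull back only the small result.
from google.cloud import bigquery

client = bigquery.Client(project='my-project')     # placeholder project ID

sql = """
SELECT user_id, COUNT(*) AS events, AVG(value) AS avg_value
FROM `my-project.my_dataset.events`
GROUP BY user_id
"""

# The query executes inside BigQuery; only the aggregated rows reach the notebook.
df = client.query(sql).to_dataframe()
print(df.head())
```

For the Dataflow route, the Apache Beam Python SDK expresses the same kind of per-record work as a pipeline that Dataflow can fan out across many workers. A skeleton, with the bucket paths and the parse step as placeholders:

```python
# Skeleton of a Beam pipeline; runs locally as-is, or pass DataflowRunner options to scale out.
import apache_beam as beam

with beam.Pipeline() as p:
    (p
     | 'ReadCSVs' >> beam.io.ReadFromText('gs://my-bucket/data/*.csv.gz')  # .gz is decompressed automatically
     | 'ParseRow' >> beam.Map(lambda line: line.split(','))                # placeholder per-row transform
     | 'FormatRow' >> beam.Map(lambda fields: ','.join(fields))
     | 'WriteOut' >> beam.io.WriteToText('gs://my-bucket/output/rows'))
```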