Shared Memory for Python Classes

1.5k views Asked by At

So as I was learning about classes in Python I was taught that the class level attributes are shared among all instances of a given class. Something I don't think I've seen any other language before.

As a result I could have multiple instances of say a DB class pulling data DB data and dumping it into a class level attribute. Then any instance that needed any of that data would have access to it without having to go to a cache or saved file to get it.

Currently I'm debugging an analytics class that grabs DB data via an inefficient means - one I'm currently trying to make much faster. Right now it takes several minutes for the DB data to load. And the format I've chosen for the data with ndarrays and such doesn't want to save to a file via numpy.save (I don't remember the error right now). Every time I make a little tweak the data is lost and I have to wait several minutes for it to reload.

So the thought occurred to me that I could create a simple class for holding that data. A class I wouldn't need to alter that could run under a separate iPython console (I'm using Anaconda, Python 2.7, and Spyder). That way I could link the analytics class to the shared data class in the init of the analytics class. Something like this:

def __init__(self):
  self.__shared_data = SharedData()
  self.__analytics_data_1 = self.__shared_data['analytics_data_1']

The idea would be that I would then write to self.__analytics_data_1 inside the analytics class methods. It would be automatically get updated to the shard data class. I would have an iPython console open to do nothing more than hold an instance of the shared data class during debug. That way when I have to reload and reinstantiate the analytics class it just gets whatever data I've already captured. Obviously, if there is an issue with the data itself then it will need to be removed manually.

Obviously, I wouldn't want to use the same shared data class for every tool I built. But that's simple to get around. The reason I mention all of this is that I couldn't find a recipe for something like this online. There seem to be plenty of recipes out there so I'm thinking the lack of a recipe might be a sign this is a bad idea.

I have attempted to achieve something similar via Memcache on a PHP project. However, in that project I had a lot of reads and write and it appeared that the code was causing some form of write collision. So data wasn't getting updated. (And Memcache doesn't guarantee the data will even be there.) Preventing this write collision meant a lot of extra code, extra processing time, and ultimately the code got to be too slow to be useful. I thought I might attempt it again with this shared memory in Python as well as using the shared memory for debugging purposes.

Thoughts? Warnings?

1

There are 1 answers

6
Vitaly Greck On BEST ANSWER

If you wanna shared data across your class instances, you should do:

class SharedDataClass(object):
  shared_data = SharedData()

These SharedData instance will be shared between your class instances unless you override in in the constructor. Careful, GIL lock may affect performance!

Sharing data "between consoles" or processes is different story. They're separate processes, and, of course, class attributes are not shared. In this case, you need IPC (it could be filesystem, database, process with open socket to connect to, etc.)