I'm trying to build a system that collects data from some sources using I/O (HDD, network...)
For this, I have a class (controller) that launch the collectors.
Each collector is an infinite loop with a classic ETL process (extract, transform and load).
I want send some commands to the collectors (stop, reload settings...) from an interface (CLI, web...) and I'm not sure about how to do it.
For example, this is the skeleton for a collector:
class Collector(object):
def __init__(self):
self.reload_settings()
def reload_settings(self):
# Get the settings
# Set the settings as attributes
def process_data(self, data):
# Do something
def run(self):
while True:
data = retrieve_data()
self.process_data(data)
And this is the skeleton for the controller:
class Controller(object):
def __init__(self, collectors):
self.collectors = collectors
def run(self):
for collector in collectors:
collector.run()
def reload_settings(self):
??
def stop(self):
??
Is there a classic design pattern that solves this problem (Publish–subscribe, event loop, reactor...)? What is the best way to solve this problem?
PD: Obviously, this will be a multiprocess application and will run on a single machine.
I'd look at the possibility of adapting your case to socket-based client-server architecture where Controller would instantiate required number of Collectors each listening on its own port and handling received data in more elegant way through handle() method of the server. The fact that data comes from various I/O sources speaks even more for this solution - you could use Client part of this architecture to standarize the DataSource -> Collector protocol
https://docs.python.org/2/library/socketserver.html