I'm using Process.Start() in C# to start a Python script. In general this works, but I have a performance problem when starting the Python script.
The process is as follows:
1. C# starts the Python script using process.Start()
2. Python loads a larger amount of data (takes a little longer)
3. C# writes something via StandardInput
4. Python reads the input via sys.stdin and performs some operations on it based on the data loaded in step 2
5. Python writes the result from step 4 to the output using print()
6. C# reads the output using StandardOutput
This is basically my C# code:
    public void writeToStandardInput(List<string> inputList)
    {
        process.Start();

        // write input: one prefixed line per list entry
        for (int i = 0; i < inputList.Count; i++)
        {
            string dataForPython = "StdInLine:" + inputList[i];
            process.StandardInput.WriteLine(dataForPython);
        }
        process.StandardInput.Flush();

        // read output until Python closes its stdout
        List<string> outputList = new List<string>();
        string output;
        while ((output = process.StandardOutput.ReadLine()) != null)
            outputList.Add(output);
    }
and this is my python code:
    import sys

    def main():
        # load the large data set (this is the slow part)
        data_path = download_model(*path*)
        data = load_from_checkpoint(data_path)

        # collect input lines until one arrives without the prefix (or EOF)
        operationList = []
        while True:
            line = sys.stdin.readline()
            if 'StdInLine:' not in line:
                break
            operationList.append(line)

        data_output = data.doOperations(operationList)
        for element in data_output:
            print(element)

    main()
If there is new data coming from C# using StandardInput, I currently always have to restart the entire process. The problem with this is that the Python class loads a large amount of data each time it is started. This always takes a few seconds. The data is always the same, so it would be sufficient if the data was only loaded the first time the Python script was started. I have not yet found a solution. One possible solution would be the following:
- C#: Put process.Start() in the class constructor, so the process only starts when the class is first initialized
- Python: Load the data when the Python script starts and store it in a global variable
- C#: Create a method which gets called when something needs to be written to StandardInput and which then calls a method of the Python script
- Python: A method which gets called after something was written to StandardInput reads the input using sys.stdin and performs operations on the input based on the global data. So I would not have to load the data every time I get something from StandardInput
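To make the idea concrete, the Python side might look roughly like the sketch below. It is only an illustration under assumptions: the end-of-batch line StdInEnd is a name I made up, and load_data and do_operations are placeholders standing in for the download_model / load_from_checkpoint and data.doOperations calls from my script.

```python
import sys

PREFIX = "StdInLine:"
BATCH_END = "StdInEnd"  # hypothetical sentinel line that ends one batch


def load_data():
    # Placeholder for the slow download_model / load_from_checkpoint step.
    return {"loaded": True}


def do_operations(data, operations):
    # Placeholder for data.doOperations(operationList).
    return [op.upper() for op in operations]


def serve(data, stdin=sys.stdin, stdout=sys.stdout):
    """Process batches until stdin is closed; data is loaded once by the caller."""
    batch = []
    # readline() returns "" only at EOF, which ends the loop.
    for raw in iter(stdin.readline, ""):
        line = raw.rstrip("\r\n")
        if line == BATCH_END:
            for result in do_operations(data, batch):
                print(result, file=stdout)
            # Echo the sentinel so the reader knows this batch is complete.
            print(BATCH_END, file=stdout)
            stdout.flush()  # without this, output can sit in the pipe buffer
            batch = []
        elif line.startswith(PREFIX):
            batch.append(line[len(PREFIX):])
```

The script would end with serve(load_data()), so the data is loaded exactly once and the loop then waits for further batches. On the C# side this would mean writing StdInEnd after each batch and reading StandardOutput only until that same line comes back, instead of reading until ReadLine() returns null.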
Is this somehow possible using stdin?
Or maybe I could split it into two Python classes/scripts. The first class, dataLoader, loads data_path and data, is started only once (when the C# application starts) and keeps running the entire time. The second class, listenToStandardInput, is started whenever C# needs to write something to StandardInput and does not keep running in the background. listenToStandardInput would then use the data from dataLoader and perform some operations (data.doOperations(operationList)) based on the input from sys.stdin.
I've already read some posts about subprocess (e.g. "Start a background process in Python"), but I don't think they really address my problem.