What is the difference between "Keras backend + Tensorflow" and "Keras from Tensorflow" when using the CPU (in Tensorflow 2.x)?


I want to limit the number of CPU cores and threads that TensorFlow uses, and I found three ways to do so.

1) "Keras backend + Tensorflow"

from keras import backend as K
import tensorflow as tf

config = tf.ConfigProto(intra_op_parallelism_threads=2,
                        inter_op_parallelism_threads=4,
                        allow_soft_placement=True,
                        device_count={'CPU': 1})
session = tf.Session(config=config)
K.set_session(session)

2) "Keras from Tensorflow"

import tensorflow as tf
from tensorflow import keras

tf.config.threading.set_intra_op_parallelism_threads(2)  
tf.config.threading.set_inter_op_parallelism_threads(4) 

3) "Keras from Tensorflow" (environment variables)

import os

os.environ['TF_NUM_INTRAOP_THREADS'] = '2'
os.environ['TF_NUM_INTEROP_THREADS'] = '4'

Do these three approaches have the same effect?

Lastly, my understanding of the parameters is as follows:

  • intra_op_parallelism_threads("number of CPU cores")
  • inter_op_parallelism_threads("number of threads")

Is this right? If I have misunderstood, please let me know.

Thank you.


There are 2 answers

4
Akshay Sehgal On BEST ANSWER

Not exactly, it's not as simple as that. As per official documentation -

intra_op_parallelism_threads - Certain operations like matrix multiplication and reductions can utilize parallel threads for speedups. A value of 0 means the system picks an appropriate number. Refer this

inter_op_parallelism_threads - Determines the number of parallel threads used by independent non-blocking operations. 0 means the system picks an appropriate number. Refer this

So technically you cannot limit the number of CPUs, only the number of parallel threads, which is sufficient for the purpose of limiting resource consumption.


Regarding the methods you are using -

The third approach sets the environment variables directly using the os library.

import os

os.environ['TF_NUM_INTRAOP_THREADS'] = '2'
os.environ['TF_NUM_INTEROP_THREADS'] = '4'
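
A minimal sketch of the same idea wrapped in a function. The helper name `limit_tf_threads_via_env` is hypothetical, and the sketch assumes the safest order is to set the variables before TensorFlow is imported, so they are already in place when the runtime starts.

```python
import os


def limit_tf_threads_via_env(intra, inter):
    """Hypothetical helper: export both thread limits as strings."""
    # Environment variable values must be strings, not integers.
    os.environ['TF_NUM_INTRAOP_THREADS'] = str(int(intra))
    os.environ['TF_NUM_INTEROP_THREADS'] = str(int(inter))


limit_tf_threads_via_env(2, 4)

# Assumption: import TensorFlow only after the variables are set.
# import tensorflow as tf
```

Passing 0 for either value leaves the choice to the system, as described above.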

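The second approach is the tf2 API for the same two settings. It configures the TensorFlow runtime directly rather than through environment variables, and it must be called before the runtime is initialized. The other difference is that Keras is packaged into tf2 now.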

import tensorflow as tf
from tensorflow import keras

tf.config.threading.set_intra_op_parallelism_threads(2)  
tf.config.threading.set_inter_op_parallelism_threads(4)
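
As a rough sketch, the setters can be paired with the matching getters in `tf.config.threading` to read the values back. The wrapper name `configure_threads` is hypothetical, and the `ImportError` guard is there only so the snippet also runs on a machine without TensorFlow installed.

```python
def configure_threads(intra=2, inter=4):
    """Hypothetical wrapper: set both limits, then read them back."""
    try:
        import tensorflow as tf
    except ImportError:
        return None  # TensorFlow is not installed; nothing to configure.
    try:
        tf.config.threading.set_intra_op_parallelism_threads(intra)
        tf.config.threading.set_inter_op_parallelism_threads(inter)
    except RuntimeError:
        # Raised when the runtime was already initialized; the
        # earlier settings then stay in effect.
        pass
    return (tf.config.threading.get_intra_op_parallelism_threads(),
            tf.config.threading.get_inter_op_parallelism_threads())


result = configure_threads()
print(result)
```

Calling the setters at the top of the script, before any tensors or models are created, avoids the `RuntimeError` branch.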

The first approach is for standalone Keras. It works only if Keras is set to the TensorFlow backend. It applies the same two thread settings, here through the session configuration rather than through environment variables.

from keras import backend as K
import tensorflow as tf

# TF 1.x API; under TensorFlow 2.x these names live in tf.compat.v1.
config = tf.ConfigProto(intra_op_parallelism_threads=2,
                        inter_op_parallelism_threads=4,
                        allow_soft_placement=True,
                        device_count={'CPU': 1})
session = tf.Session(config=config)
K.set_session(session)

If you still have doubts, you can run each of the 3 approaches independently and then inspect the relevant environment variable using os. Only the third approach is expected to change os.environ, since the other two configure TensorFlow directly -

print(os.environ.get('KEY_THAT_MIGHT_EXIST'))
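
A small sketch of that check for the two variables used in this thread. The helper name `report_thread_env` is hypothetical; a value of `None` means the variable is not set in the current process.

```python
import os

# The two variable names used in the third approach.
THREAD_VARS = ('TF_NUM_INTRAOP_THREADS', 'TF_NUM_INTEROP_THREADS')


def report_thread_env(environ=os.environ):
    """Hypothetical helper: map each variable name to its value or None."""
    return {name: environ.get(name) for name in THREAD_VARS}


print(report_thread_env())
```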

For a better understanding of the topic, you can check this link that details it out quite well.


TL;DR: Use the second or third approach if you are working with tf2. Otherwise, use the first or third approach if you are using standalone Keras with the tensorflow backend.

0
hamflow On

To complement Akshay Sehgal's answer, here is what I found by trial and error about choosing the number of intra and inter parallelism threads in my case:

  1. The number of intra parallelism threads corresponds closely to the number of CPU logical cores (processors) in use. For example, if I limit a CPU with 8 logical processors to 1 intra parallelism thread, roughly 1/8 of the CPU is used, although the ratio is not exact. I also found that the number of inter parallelism threads had almost no effect on CPU usage when running my tensorflow model. Still, I think it is best to set the inter parallelism threads to the remaining logical processors (for example, 7 for a CPU with 8 logical processors when intra parallelism threads is already set to 1). I suspect the inter parallelism threads relate to the logical processors that run other processes in parallel with the tensorflow model, but I am not sure.
  2. I saw no difference between setting the intra and inter parallelism threads through os.environ and through the tf.config methods, but to be safe I applied both.
  3. Setting more threads than your CPU has logical processors simply keeps the process at the CPU's maximum capacity. For example, if your CPU has 8 logical cores, defining 200 intra/inter parallelism threads is possible, but the final performance still equals that of 8 threads. Note that CPU usage during tensorflow calculations does not necessarily reach 100% even if you allow your model the full CPU, which is also true of any other CPU-bound process.
  4. Setting either thread count to 0 restores the default multi-thread parallelism settings.
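
The rules of thumb in this list can be written down as a short sketch. The helper `suggest_thread_counts` is hypothetical and only encodes the trial-and-error heuristic above (inter threads set to the remaining logical processors, counts capped at the logical processor count, 0 meaning default). It is not an official TensorFlow recommendation.

```python
import os


def suggest_thread_counts(intra, logical_cpus=None):
    """Hypothetical helper: return (intra, inter) per the heuristic above."""
    if logical_cpus is None:
        logical_cpus = os.cpu_count() or 1
    if intra == 0:
        # 0 asks the system to pick, i.e. the default settings (point 4).
        return 0, 0
    # More threads than logical processors gains nothing (point 3).
    intra = max(1, min(intra, logical_cpus))
    # Give the remaining logical processors to inter-op threads (point 1).
    inter = max(1, logical_cpus - intra)
    return intra, inter


print(suggest_thread_counts(1, logical_cpus=8))
```

The returned pair can then be passed to whichever of the three approaches is in use.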