I've read some articles about using distributions other than the Gaussian to model a stochastic policy in reinforcement learning. A Gaussian is the usual choice, but some authors use a Beta distribution instead: https://en.wikipedia.org/wiki/Beta_distribution
TensorFlow already has a Beta distribution class, so it can be used with tensors. However, some policy gradient methods constrain the optimization step using the Kullback-Leibler divergence.
The formula for that divergence involves the digamma function, which TensorFlow already implements. But I can't find the beta function (or the gamma function, since the two are linked) in TensorFlow, only log gamma and the incomplete gamma function. And I can't use scipy.special.beta, because it doesn't operate on tensors, and my alpha and beta parameters are produced by a neural network.
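One idea I had, which I'm not sure is the right approach: since B(a, b) = Γ(a)Γ(b) / Γ(a + b), the log of the beta function can be written with log gamma alone. Below is a minimal sketch in plain Python, where math.lgamma stands in for TensorFlow's log gamma op. The names log_beta and beta_fn are my own, not from any library, and I'm assuming a > 0 and b > 0.

```python
import math


def log_beta(a, b):
    # ln B(a, b) = ln Gamma(a) + ln Gamma(b) - ln Gamma(a + b), assuming a, b > 0
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_fn(a, b):
    # Exponentiate the log form; staying in log space as long as possible
    # avoids overflow when a and b are large.
    return math.exp(log_beta(a, b))


# Sanity check against a value known in closed form: B(2, 3) = 1! * 2! / 4! = 1/12
print(beta_fn(2.0, 3.0), 1.0 / 12.0)
```

If this is valid, I would expect the same three-term expression to work on tensors by swapping math.lgamma for the log gamma op, but I haven't confirmed that.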
I'm not a specialist in this field, so perhaps my question is foolish, but I'd really appreciate an explanation.
Thanks a lot