How can I best let my network learn not only the expected value but also the expected variation around that value, as a measure of uncertainty? For a state the network has never seen before, this uncertainty should be very high; for a state the network has seen many times, it should approach some estimate of the expected variation.
I am wondering whether one can "learn" both aspects at the same time with a (potentially only partially) overlapping network.
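
To make the idea concrete, here is a minimal sketch of one common way to set this up, not a definitive answer: a shared hidden layer feeds two output heads, one for the mean and one for the log-variance, and both are trained jointly with a Gaussian negative log-likelihood. The names `init_params`, `forward` and `gaussian_nll`, the scalar input, and the layer size are all assumptions made for illustration. Note that a variance learned this way mainly reflects noise in the data it was trained on; it is not guaranteed to be high for states the network has never seen.

```python
import math
import random


def init_params(n_hidden, seed=0):
    """Random weights for a tiny network (hypothetical layout, scalar input)."""
    rng = random.Random(seed)
    return {
        # Shared trunk: used by both heads (the "overlapping" part).
        "w": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b": [0.0] * n_hidden,
        # Head 1: predicts the expected value.
        "mean_w": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "mean_b": 0.0,
        # Head 2: predicts the log of the variance.
        "logvar_w": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "logvar_b": 0.0,
    }


def forward(params, x):
    """Return (mean, log_var) for a scalar input x."""
    h = [math.tanh(w * x + b) for w, b in zip(params["w"], params["b"])]
    mean = sum(w * hi for w, hi in zip(params["mean_w"], h)) + params["mean_b"]
    log_var = (
        sum(w * hi for w, hi in zip(params["logvar_w"], h)) + params["logvar_b"]
    )
    return mean, log_var


def gaussian_nll(y, mean, log_var):
    """Negative log-likelihood of y under N(mean, exp(log_var)).

    Predicting log-variance keeps the variance positive without constraints.
    """
    return 0.5 * (
        log_var + (y - mean) ** 2 / math.exp(log_var) + math.log(2 * math.pi)
    )


params = init_params(n_hidden=4)
mean, log_var = forward(params, 0.3)
loss = gaussian_nll(1.0, mean, log_var)
```

Minimising the average of this loss over the training data with any gradient-based optimiser updates the shared layer and both heads together. For a fixed prediction error, the loss is smallest when the predicted variance equals the squared error, which is what pushes the second head toward the observed variation.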