I am trying my best to understand how ConcurrentHashMap works under the hood.
It seems that during resizes there is an entire encoding scheme happening within sizeCtl variable.
Some speculations are saying that the lower 16 bits signify the number of threads, other speculations specify that there is a point system counter implemented, ie. +1 when a thread is doing the resizing and -1 for when a thread is leaving the resizing.
https://stackoverflow.com/a/52668122/7134737
https://stackoverflow.com/a/53477058/7134737
Can someone explain in plain terms what the following variables do :
How do they interact with the sizeCtl variable ? It seems this variable is used for multiple operations, none of which are very well documented.
Sorry if this seems like a rant, but its frustrating to not understand the bit manipulations.
These values are only used internally, so they don't need to be well documented.
Still, let's try to explain the different roles and values that
sizeCtlhas:tableis null: initial table size when specified in the constructor, 0 for default table size (DEFAULT_CAPACITYwhich is 16) - this value is always greater or equal to 0tableis being initialized because some thread put the first value into theConcurrentHashMap- either by calling the constructor with aMapof values or by calling one of the entry adding methods.tableis not null: number of entries where the next resize will be started, calculated asn - n/4with n beingtable.length- this value is always greater than 0The special value while resizing is built from two parts:
RESIZE_STAMP_BITS(16) and is placed insizeCtlby shifting it left byRESIZE_STAMP_SHIFT(32 - RESIZE_STAMP_BITSwhich is incidentally also 16)32 - RESIZE_STAMP_BITS(16)You can picture this as
MAX_RESIZERSis just the maximal value the the "resizerCount" part can hold.The resizerCount is used to control the acquisition of additionals threads to help out in resizing the
ConcurrentHashMap. The thread that initiates the resizing operating sets its value to 2 (plus the valueresizeStamp << RESIZE_STAMP_SHIFT). If additional threads try to add entries during a resize operation they check whether there are table entries to be moved and the value of the resizerCount is less thanMAX_RESIZERS. If this is the case they join the resize operation by incrementing the resizerCount and the start moving map entries from the oldtableto thenextTable.Moving map entries from the old
tableto thenextTableis done in blocks (to prevent contention). After each block the threads participating in the resize operation check whether there are more blocks to move. If there are no more blocks, the thread decrements the resizerCount, checks if it is the last thread doing resizing (indicated by resizerCount now being 1) and if it is the last thread will finish the resize operation: changetabletonextTableand setsizeCtlto the amount of entries that will trigger the next resize operation.Why is the resizeStamp needed?
Because the threads must coordinate the resize work. A thread "X" that decides to participate in resizing reads the values of the
tableandnextTablefields and then tries to join the group of threads doing the resizing.It can happen that the thread "X" is suspended between reading the fields and joining the group of threads doing the resizing work and that the resizing of the
tablethat it has read is already completed but a new resize is in progress. The value in resizeStamp encodes the size of thetablearray and lets the thread "X" detect that situation which means that it must re-read the values of thetableandnextTablefields.