Question about fbstring initSmall() method


I am reading the source code of folly. In the fbstring implementation, I am confused by the initSmall function.
If byteSize is 17 and wordWidth is 8, then (byteSize + wordWidth - 1) / wordWidth is 3. Does reinterpret_cast<const size_t*>(data)[2] then read the 17th char of data and the 7 chars after it? Wouldn't that be out of bounds?
The implementation is below; the complete code is at https://github.com/facebook/folly/blame/master/folly/FBString.h

// Small strings are bitblitted
template <class Char>
inline void fbstring_core<Char>::initSmall(
    const Char* const data, const size_t size) {
  // Layout is: Char* data_, size_t size_, size_t capacity_
  static_assert(
      sizeof(*this) == sizeof(Char*) + 2 * sizeof(size_t),
      "fbstring has unexpected size");
  static_assert(
      sizeof(Char*) == sizeof(size_t), "fbstring size assumption violation");
  // sizeof(size_t) must be a power of 2
  static_assert(
      (sizeof(size_t) & (sizeof(size_t) - 1)) == 0,
      "fbstring size assumption violation");

  // If data is aligned, use fast word-wise copying. Otherwise,
  // use conservative memcpy.
  // The word-wise path reads bytes which are outside the range of
  // the string, and makes ASan unhappy, so we disable it when
  // compiling with ASan.
#ifndef FOLLY_SANITIZE_ADDRESS
  if ((reinterpret_cast<size_t>(data) & (sizeof(size_t) - 1)) == 0) {
    const size_t byteSize = size * sizeof(Char);
    constexpr size_t wordWidth = sizeof(size_t);
    switch ((byteSize + wordWidth - 1) / wordWidth) { // Number of words.
      case 3:
        ml_.capacity_ = reinterpret_cast<const size_t*>(data)[2];
        FOLLY_FALLTHROUGH;
      case 2:
        ml_.size_ = reinterpret_cast<const size_t*>(data)[1];
        FOLLY_FALLTHROUGH;
      case 1:
        ml_.data_ = *reinterpret_cast<Char**>(const_cast<Char*>(data));
        FOLLY_FALLTHROUGH;
      case 0:
        break;
    }
  } else
#endif
  {
    if (size != 0) {
      fbstring_detail::podCopy(data, data + size, small_);
    }
  }
  setSmallSize(size);
}

There is 1 answer

EEAFSX On

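Yes, it reads out of range: with byteSize 17 and 8-byte words, the third word covers bytes 16 to 23, so 7 bytes past the end of the string are read. That is why the word-wise path is wrapped in #ifndef FOLLY_SANITIZE_ADDRESS: AddressSanitizer would report the over-read, so when compiling with ASan the code uses podCopy instead. Without ASan, and only when data is word-aligned, it copies whole words even though that copies more than size bytes. Since setSmallSize() records the real size of the string, the extra bytes end up in the small buffer but are never read as part of the string.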
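
To make the arithmetic concrete, here is a minimal sketch of the same idea. It is not folly's code: SmallBuf, wordsFor, overRead and copyWordWise are hypothetical names made up for illustration, the size is kept in a separate field rather than packed into the last byte, and the caller is assumed to pass a source buffer that really holds whole words, so the over-copy here is well defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for fbstring's small-string storage (three words).
struct SmallBuf {
  alignas(std::size_t) char bytes[3 * sizeof(std::size_t)];
  std::size_t size;  // kept separate here for clarity
};

// Number of whole words needed to cover byteSize bytes.
// This is the value the switch in initSmall dispatches on.
constexpr std::size_t wordsFor(std::size_t byteSize) {
  return (byteSize + sizeof(std::size_t) - 1) / sizeof(std::size_t);
}

// Number of bytes past the logical end that a word-wise copy touches.
constexpr std::size_t overRead(std::size_t byteSize) {
  return wordsFor(byteSize) * sizeof(std::size_t) - byteSize;
}

// Copies whole words. Assumption: src points at storage that is at
// least wordsFor(size) words long, so the extra bytes are valid to read.
inline SmallBuf copyWordWise(const char* src, std::size_t size) {
  assert(size <= sizeof(SmallBuf::bytes));
  SmallBuf out{};
  std::memcpy(out.bytes, src, wordsFor(size) * sizeof(std::size_t));
  out.size = size;  // only the first `size` bytes count as the string
  return out;
}
```

With 8-byte words, wordsFor(17) is 3 and overRead(17) is 7, which matches the case in the question. After copyWordWise(src, 5) the buffer holds the bytes that followed the first 5 in src as well, but any reader that honours size never looks at them.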