What is the reason for undefined behavior when using memset on a library class (std::string)?


Basically, string is the char instantiation of the basic_string class template: typedef basic_string<char> string;

As far as I know, basic_string is a class that contains a collection of data members and member functions, much like any class a developer might write.

Using memset on a user-defined class does not cause any issue (except for classes with virtual functions). So why does it cause an issue when applied to a library class?

From links such as this one (memset structure with std::string contained), I found that it is not safe to apply memset to a string because it modifies internal data.

So my question is what kind of internal data is modified? Basically, the members can be PODs or user-defined data types. Is there some other low-level implementation inside the string library?

Please note, I am not speaking about using the object after memsetting the class object; my only concern is the memset itself.

I know using memset on a class is a really bad idea; this is just to learn about the internal implementation.

There is 1 answer

Sergey Kalinichenko

So my question is what kind of internal data is modified?

There's no way to know for sure, because the class is opaque, but usually it is pointers that present the biggest concern.

I am not speaking about using the object after memsetting the class object

You don't have to use it in order for something bad to happen. Once you overwrite the internal data with something else, there is a very good chance that you break invariants the destructor relies on.
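One way to see that std::string is not a plain bag of bytes is to ask the compiler. The sketch below uses the standard type traits; PlainPoint is a hypothetical user-defined class added only for contrast, and the variable names are illustrative.

```cpp
#include <string>
#include <type_traits>

// Hypothetical user-defined class with no constructors, destructor, or virtuals.
struct PlainPoint {
    int x;
    int y;
};

// A trivially copyable type can safely be treated as raw bytes.
constexpr bool plain_is_trivial =
    std::is_trivially_copyable<PlainPoint>::value;

// std::string owns resources, so it is neither trivially copyable
// nor trivially destructible: its destructor has real work to do.
constexpr bool string_is_trivial =
    std::is_trivially_copyable<std::string>::value;
constexpr bool string_trivial_dtor =
    std::is_trivially_destructible<std::string>::value;

static_assert(plain_is_trivial, "PlainPoint can be handled as raw bytes");
static_assert(!string_is_trivial, "std::string cannot be handled as raw bytes");
static_assert(!string_trivial_dtor, "std::string has a non-trivial destructor");
```

The non-trivial destructor is the part that runs even if the object is never used again after the memset.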

Consider an implementation of std::string that uses a pair of pointers to its string buffer, and see what happens here:

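std::string a("hello, world!");             // needs <string>
std::memset(&a, 0xdd, sizeof(std::string)); // needs <cstring>; memset keeps only the low byte of its value, so 0xaabbccdd would act as 0xdd. We have a memory leak here
// string gets destroyed here --> undefined behavior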

In this hypothetical implementation, constructing the string allocates space for "hello, world!" and stores pointers to it inside the std::string object. memset writes junk on top of these pointers, which creates the first problem: the only record of the allocated memory is gone, so the memory is leaked. The second problem is worse: when the destructor runs and tries to free the memory, it passes the invalid pointers to delete[], causing undefined behavior.
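The leak can be shown without triggering undefined behavior by using a stand-in type. This is a minimal sketch, assuming the two-pointer layout described above: ToyString and memset_loses_buffer are hypothetical names, and ToyString is not how any particular standard library actually lays out std::string. Because ToyString is trivially copyable, calling memset on it is well defined.

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a two-pointer string layout; not the real std::string.
struct ToyString {
    char* begin;
    char* end;
};

// Unlike std::string, this struct is trivially copyable, so memset on it is well defined.
static_assert(std::is_trivially_copyable<ToyString>::value,
              "ToyString must be trivially copyable");

// Returns true if memset wiped the object's own record of its buffer.
bool memset_loses_buffer() {
    char* buffer = new char[14];
    std::memcpy(buffer, "hello, world!", 14);  // 13 characters plus the terminator

    ToyString s{buffer, buffer + 13};
    const ToyString saved = s;  // bookkeeping copy, kept only so this demo can clean up

    std::memset(&s, 0xdd, sizeof s);  // every byte of s is now 0xdd

    // Compare raw bytes instead of reading the clobbered pointers.
    const bool clobbered = std::memcmp(&s, &saved, sizeof s) != 0;

    // Only the saved copy can still release the buffer. A destructor that
    // called delete[] on s.begin at this point would have undefined behavior.
    delete[] saved.begin;
    return clobbered;
}
```

A real std::string keeps no such saved copy, so after the memset nothing is left that could free the buffer correctly.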