Subclassing NSCoder, recreating NSArchiver


NSArchiver has been deprecated since OS X 10.2 and, AFAIK, is not available on iOS.

On the other hand, NSKeyedArchiver is known to lag behind on speed and compactness (some users report more than a 100-fold performance difference between NSKeyedArchiver and NSArchiver). The objects I want to archive are mainly NSObject subclasses containing an NSMutableArray of NSNumber, and objects containing primitive types (mainly double). I am not convinced the overhead of keyed archiving is worth it.

So I decided to subclass NSCoder on iOS to create a serial coder, in the style of NSArchiver.

I understand where keyed archives come in handy: backward and forward compatibility and other niceties. Keyed archiving is probably what I will end up using, but I'd be curious to know what kind of performance I could get with serial archiving. And frankly, I think I could learn a lot by doing this, so I am not interested in an alternative solution.

I've been inspired by the sources of Cocotron, which provides an open-source NSArchiver.

TLDR : I want to subclass NSCoder to rebuild NSArchiver


I am using ARC, compiling for iOS 6 & 7, and assuming a 32-bit system for now.

I am not interested in referencing objects or strings for now. I am only using an NSHashTable (weakObjectsHashTable) to prevent class names from being duplicated: a class is described the first time it is encountered, and afterwards it is referred to by reference only.

I am using an NSMutableData to build the archive :

@interface Archiver () {
    NSMutableData *_data;
    void *_bytes;
    size_t _position;
    NSHashTable *_classes;
}
@end

The basic methods are :

-(void)_expandBuffer:(NSUInteger)length
{
    [_data increaseLengthBy:length];
    _bytes = _data.mutableBytes;
}

-(void)_appendBytes:(const void *)data length:(NSUInteger)length
{
    [self _expandBuffer:length];
    // Cast to char * because arithmetic on void * is only a compiler extension
    memcpy((char *)_bytes + _position, data, length);
    _position += length;
}

I am using _appendBytes:length: to dump primitive types such as int, char, float, double ... etc. Nothing interesting there.

C-style strings are dumped using this equally uninteresting method :

-(void)_appendCString:(const char*)cString
{
    // Write the terminating NUL too, so the reader can find the end of the string
    NSUInteger length = strlen(cString);
    [self _appendBytes:cString length:length+1];
}

And finally, archiving class information and objects :

-(void)_appendReference:(id)reference {
    // 4 bytes: relies on the 32-bit pointer assumption stated above
    [self _appendBytes:&reference length:4];
}

-(void)_appendClass:(Class)class
{
    // NSObject class is always represented by nil by convention 
    if (class == [NSObject class]) {
        [self _appendReference:nil];
        return;
    }

    // Append reference to class
    [self _appendReference:class];

    // And append class name if this is the first time it is encountered
    if (![_classes containsObject:class])
    {
        [_classes addObject:class];
        [self _appendCString:[NSStringFromClass(class) cStringUsingEncoding:NSASCIIStringEncoding]];
    }
}

-(void)_appendObject:(const id)object
{
    // References are saved
    // Although we don't handle relationships between objects *yet* (we could do it the exact same way we do for classes)
    // at least it is useful to determine whether object was nil or not
    [self _appendReference:object];

    if (object==nil)
        return;

    [self _appendClass:[object classForCoder]];
    [object encodeWithCoder:self];

}

The encodeWithCoder: methods of my objects all look like that, nothing fancy :

[aCoder encodeValueOfObjCType:@encode(double) at:&_someDoubleMember];
[aCoder encodeObject:_someCustomClassInstanceMember];
[aCoder encodeObject:_someMutableArrayMember];

Decoding goes pretty much the same way: the unarchiver holds an NSMapTable of the classes it already knows, and reads the class name from the archive whenever it meets a class reference it does not know yet.

@interface Unarchiver (){
    NSData *_data;
    const void *_bytes;
    NSMapTable *_classes;
}

@end

I won't bore you with the specifics of

-(void)_extractBytesTo:(void*)data length:(NSUInteger)length

and

-(char*)_extractCString
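
For readers who want to picture those two helpers, here is a minimal sketch in plain C (which compiles unchanged inside an Objective-C file). It is not the code from the question: the `Reader` struct and the `reader_*` function names are hypothetical, and the only thing taken from the question is the contract that the extracted C string is heap-allocated and later released with free().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reader state: a borrowed byte range plus a read cursor. */
typedef struct {
    const unsigned char *bytes;
    size_t length;
    size_t position;
} Reader;

/* Copy `length` bytes from the cursor into `out`, then advance the cursor. */
static void reader_extract_bytes(Reader *r, void *out, size_t length)
{
    assert(length <= r->length - r->position); /* never read past the end */
    memcpy(out, r->bytes + r->position, length);
    r->position += length;
}

/* Return a malloc'ed copy of the NUL-terminated string at the cursor.
   The caller owns the result and must free() it.
   Returns NULL if no terminator is found or if allocation fails. */
static char *reader_extract_cstring(Reader *r)
{
    const unsigned char *start = r->bytes + r->position;
    size_t remaining = r->length - r->position;
    const unsigned char *nul = memchr(start, '\0', remaining);
    if (nul == NULL)
        return NULL;

    size_t size = (size_t)(nul - start) + 1; /* include the terminator */
    char *copy = malloc(size);
    if (copy == NULL)
        return NULL;

    memcpy(copy, start, size);
    r->position += size;
    return copy;
}
```

Bounding the search with memchr rather than calling strlen directly keeps a truncated archive from being read past its end.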

The interesting stuff is probably in the object decoding code :

-(id)_extractReference
{
    id reference;
    // 4 bytes: relies on the 32-bit pointer assumption stated above
    [self _extractBytesTo:&reference length:4];
    return reference;
}


-(Class)_extractClass
{

    // Lookup class reference
    id classReference = [self _extractReference];

    // NSObject is always nil
    if (classReference==nil)
        return [NSObject class];

    // Do we already know that one ?
    if (![_classes objectForKey:classReference])
    {
        // If not, then the name should follow

        char *classCName = [self _extractCString];
        NSString *className = [NSString stringWithCString:classCName encoding:NSASCIIStringEncoding];
        free(classCName);

        Class class = NSClassFromString(className);

        [_classes setObject:class forKey:classReference];
    }

    return [_classes objectForKey:classReference];

}
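
The reference-then-name scheme used by _appendClass: and _extractClass can be illustrated outside Cocoa. The following is a rough sketch in plain C, not the code from the question: `Writer`, `ClassReader` and their functions are hypothetical names, fixed-size arrays stand in for NSHashTable and NSMapTable, a 32-bit integer stands in for the class pointer, and the "NSObject is nil" convention is left out for brevity.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { MAX_CLASSES = 16, MAX_ARCHIVE = 256 };

typedef struct {
    unsigned char bytes[MAX_ARCHIVE];
    size_t length;               /* bytes written so far */
    uint32_t seen[MAX_CLASSES];  /* references whose name was already written */
    size_t seen_count;
} Writer;

static void writer_append(Writer *w, const void *src, size_t n)
{
    assert(n <= MAX_ARCHIVE - w->length);
    memcpy(w->bytes + w->length, src, n);
    w->length += n;
}

/* Always write the 4-byte reference; write the name only on first sight. */
static void writer_append_class(Writer *w, uint32_t ref, const char *name)
{
    writer_append(w, &ref, sizeof ref);
    for (size_t i = 0; i < w->seen_count; i++)
        if (w->seen[i] == ref)
            return;
    assert(w->seen_count < MAX_CLASSES);
    w->seen[w->seen_count++] = ref;
    writer_append(w, name, strlen(name) + 1);
}

typedef struct {
    const unsigned char *bytes;
    size_t position;
    uint32_t refs[MAX_CLASSES];
    const char *names[MAX_CLASSES]; /* these point into the archive bytes */
    size_t known_count;
} ClassReader;

/* Read a reference; if it is unknown, the name follows inline in the archive. */
static const char *class_reader_extract(ClassReader *r)
{
    uint32_t ref;
    memcpy(&ref, r->bytes + r->position, sizeof ref);
    r->position += sizeof ref;

    for (size_t i = 0; i < r->known_count; i++)
        if (r->refs[i] == ref)
            return r->names[i];

    assert(r->known_count < MAX_CLASSES);
    const char *name = (const char *)(r->bytes + r->position);
    r->position += strlen(name) + 1;
    r->refs[r->known_count] = ref;
    r->names[r->known_count++] = name;
    return name;
}
```

The reference is only an opaque token here: the reader never dereferences it, it just uses it as a lookup key, which is why a pointer value written by another process can still serve.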

-(id)_extractObject
{
    id objectReference = [self _extractReference];

    if (!objectReference)
    {
        return nil;
    }

    Class objectClass = [self _extractClass];
    id object = [[objectClass alloc] initWithCoder:self];

    return object;

}

And finally, the central method (I would not be surprised if the problem is somewhere in here):

-(void)decodeValueOfObjCType:(const char *)type at:(void *)data
{


    switch(*type){
        /* snip, snip */
        case '@':
            *(id __strong *) data = [self _extractObject];
            break;
    }
}

The initWithCoder: method corresponding to the previous encodeWithCoder: snippet would look something like this:

if (self = [super init]) {
    // Order is important
    [aDecoder decodeValueOfObjCType:@encode(double) at:& _someDoubleMember];
    _someCustomClassInstanceMember = [aDecoder decodeObject];
    _someMutableArrayMember = [aDecoder decodeObject]; 
}
return self;

My decodeObject implementation is exactly _extractObject.

Now, all this should work nicely. And as a matter of fact, I am able to archive and unarchive some of my objects. The archives look fine, to the extent that I am willing to inspect them in a hex editor, and I am able to unarchive some of my custom classes containing NSMutableArrays of some other class containing doubles.


But for some reason, if I try to unarchive one of my objects containing an NSMutableArray of NSNumber, I run into this problem :

malloc: *** error for object 0xc112cc: pointer being freed was not allocated

There seems to be one such line per NSNumber in the array, and the address 0xc112cc is the same on every line. Putting a breakpoint on malloc_error_break tells me the error comes from -[NSPlaceholderNumber initWithCoder:] (called from my _extractObject method).

Is this problem linked to my use of ARC? What am I missing?

Answer by Olotiar (best answer)

My error came from misunderstanding the second argument of -(void)encodeValueOfObjCType:(const char *)type at:(const void *)addr in the case where type represents a C string (*type == '*'). In that case addr is a const char **: a pointer to a const char *, which itself points to the constant, NUL-terminated array of char that should be encoded.

NSNumber's encodeWithCoder: encodes a small C string representing the type of the variable backing the value (i for int, d for double, etc.; AFAIK it matches what the @encode directive produces).

My earlier misinterpretation (treating addr as a const char *) produced an incorrect encoding and decoding, so initWithCoder: failed. Bottom line: it was trying to free a stack variable, which explains both the error message and why the address was the same on every call.
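
To make the distinction concrete, here is a small sketch in plain C of how the '*' case differs from a scalar case. It only mirrors the shape of encodeValueOfObjCType:at: and is not the code from my repository: `Buffer`, `buffer_append` and `encode_value` are hypothetical names, and the fixed-size buffer stands in for the NSMutableData.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-size output buffer standing in for the archive data. */
typedef struct {
    unsigned char bytes[256];
    size_t length;
} Buffer;

static void buffer_append(Buffer *b, const void *src, size_t n)
{
    assert(n <= sizeof b->bytes - b->length); /* sketch: no growth logic */
    memcpy(b->bytes + b->length, src, n);
    b->length += n;
}

/* `addr` always points AT the variable that holds the value. */
static void encode_value(Buffer *b, const char *type, const void *addr)
{
    switch (*type) {
    case 'd': /* addr is a `const double *` */
        buffer_append(b, addr, sizeof(double));
        break;
    case '*': {
        /* addr is a `const char *const *`, NOT a `const char *`:
           dereference once to reach the string itself. */
        const char *string = *(const char *const *)addr;
        buffer_append(b, string, strlen(string) + 1);
        break;
    }
    default:
        break; /* other type codes are omitted from this sketch */
    }
}
```

A caller therefore passes the address of its string variable, as in `const char *t = "d"; encode_value(&buffer, "*", &t);`. Treating addr as the string itself would instead copy the raw bytes of the pointer variable until a zero byte happened to appear.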

I now have a working implementation. If anyone is interested, the code is on my GitHub under the MIT license.