A swiftier way to convert String to UnsafePointer<xmlChar> in Swift 3 (libxml2)

I'm working on a Swift 3 wrapper for the libxml2 C-library.

There are two convenience methods to convert String to UnsafePointer<xmlChar> and vice versa. In libxml2 xmlChar is declared as unsigned char.

  • UnsafePointer<xmlChar> to String is uncomplicated

    func stringFrom(xmlchar: UnsafePointer<xmlChar>) -> String {
        let string = xmlchar.withMemoryRebound(to: CChar.self, capacity: 1) {
            return String(validatingUTF8: $0)
        }
        return string ?? ""
    }
    
  • For String to UnsafePointer<xmlChar> I tried many things, for example

    let bytes = string.utf8CString.map{ xmlChar($0) }
    return UnsafePointer<xmlChar>(bytes)
    

    but this doesn't work. The only working solution I found is

    func xmlCharFrom(string: String) -> UnsafePointer<xmlChar> {
        let pointer = (string as NSString).utf8String
        return unsafeBitCast(pointer, to: UnsafePointer<xmlChar>.self)
    }
    

Is there a better, swiftier way without the bridge cast to NSString and unsafeBitCast?

There are 3 answers

Charles Srstka (best answer)

The Swiftiest way I can think of is to use the bitPattern: initializer:

let xmlstr = str.utf8CString.map { xmlChar(bitPattern: $0) }

This will give you an Array of xmlChars. Hang onto that, and use Array's withUnsafeBufferPointer method when you need to pass an UnsafePointer to something:

xmlstr.withUnsafeBufferPointer { someAPIThatWantsAPointer($0.baseAddress!) }

Don't let the UnsafePointer escape from the closure, as it won't be valid outside it.

EDIT: How's this for a compromise? Instead of having your function return a pointer, have it take a closure.

func withXmlString<T>(from string: String, handler: (UnsafePointer<xmlChar>) throws -> T) rethrows -> T {
    let xmlstr = string.utf8CString.map { xmlChar(bitPattern: $0) }

    return try xmlstr.withUnsafeBufferPointer { try handler($0.baseAddress!) }
}

Or, as an extension on String:

extension String {
    func withXmlString<T>(handler: (UnsafePointer<xmlChar>) throws -> T) rethrows -> T {
        let xmlstr = self.utf8CString.map { xmlChar(bitPattern: $0) }

        return try xmlstr.withUnsafeBufferPointer { try handler($0.baseAddress!) }
    }
}
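
A usage sketch for the closure-based helper may make the scoping clearer. It is written under a few assumptions: `xmlChar` is declared here as a stand-in typealias so the example compiles without libxml2, the NUL-terminated buffer is built from `utf8` instead of `utf8CString` (a swapped-in variation, not the code above), and `byteLength(of:)` is a hypothetical consumer playing the role of a C function that only reads the pointer during the call.

```swift
// Stand-in for libxml2's `xmlChar` (unsigned char); an assumption for this sketch.
typealias xmlChar = UInt8

extension String {
    // Variation of the helper: copies the UTF-8 bytes plus a terminating NUL
    // into a temporary array and lends its pointer to the handler.
    func withXmlString<T>(handler: (UnsafePointer<xmlChar>) throws -> T) rethrows -> T {
        var bytes = Array(self.utf8)
        bytes.append(0) // NUL terminator expected by C APIs
        // The array is never empty here, so baseAddress is non-nil.
        return try bytes.withUnsafeBufferPointer { try handler($0.baseAddress!) }
    }
}

// Hypothetical consumer: counts bytes up to the terminating NUL, like strlen.
func byteLength(of cString: UnsafePointer<xmlChar>) -> Int {
    var length = 0
    while cString[length] != 0 { length += 1 }
    return length
}

let text = "h\u{E9}llo" // "héllo"; the é takes two bytes in UTF-8
// Only the Int leaves the closure; the pointer itself does not escape.
let length = text.withXmlString { byteLength(of: $0) }
```

The closure returns a plain Swift value, so nothing refers to the temporary buffer once `withXmlString` returns.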

Nikolai Ruhe

"I'm working on a Swift 3 wrapper for the libxml2 C-library."

Condolences.

"[...] String to UnsafePointer<xmlChar> [is complicated]"

Agree. It is complicated because it is unclear who owns the xmlChar array.

"[...] the only working solution I figured out is"

let pointer = (string as NSString).utf8String

This works because of the ownership semantics of -[NSString UTF8String]:

Apple docs:

This C string is a pointer to a structure inside the string object, which may have a lifetime shorter than the string object and will certainly not have a longer lifetime.

So the lifetime is probably something like the current autorelease pool or even shorter, depending on the compiler's ARC optimisations and the implementation of utf8String. Definitely not safe to keep around.

"Is there a better, swiftier way [...]?"

Well, that depends on the use case. There's no way to handle this without thinking about the ownership of the created xmlChar buffer.

It should be clear from the API how the functions are using the passed string (even though I know that libxml2's documentation is terrible).

For situations where a string is just used during a function call it might be nice to have a scoped access function:

extension String {
    func withXmlChar(block: (UnsafePointer<xmlChar>) -> ()) { ... }
}

If the function keeps the pointer around, you must guarantee the lifetime of the pointee. Probably something like a container object that keeps a Data and pointer around for some ARC-maintained lifetime...
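
One possible shape for such a container, as a minimal sketch and not a definitive design: the class name `XmlCharBuffer` and its members are hypothetical, `xmlChar` is a stand-in typealias so the code compiles without libxml2, and it owns a manually allocated buffer instead of a Data value. It is written against current Swift toolchains.

```swift
// Stand-in for libxml2's `xmlChar` (unsigned char); an assumption for this sketch.
typealias xmlChar = UInt8

// Hypothetical owner object: copies the string's UTF-8 bytes into its own heap
// buffer and frees that buffer when ARC releases the last reference.
final class XmlCharBuffer {
    let pointer: UnsafePointer<xmlChar> // valid for as long as this object is alive
    let byteCount: Int                  // UTF-8 byte count, excluding the NUL
    private let storage: UnsafeMutablePointer<xmlChar>

    init(_ string: String) {
        let bytes = Array(string.utf8)
        let buffer = UnsafeMutablePointer<xmlChar>.allocate(capacity: bytes.count + 1)
        buffer.initialize(from: bytes, count: bytes.count)
        (buffer + bytes.count).initialize(to: 0) // NUL terminator
        storage = buffer
        pointer = UnsafePointer(buffer)
        byteCount = bytes.count
    }

    deinit {
        storage.deallocate()
    }
}

let owner = XmlCharBuffer("caf\u{E9}") // "café"
// The pointer may be stored by C code as long as `owner` is kept alive.
let roundTripped = String(cString: owner.pointer)
```

Whoever hands `owner.pointer` to a C structure has to keep a strong reference to `owner` for at least as long as the C side may read it; the sketch does not enforce that.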

It might be worthwhile to go through one of Mike Ash's recent articles which is about managing ownership of objects beyond ARC.

Martin R

String has a

public init(cString: UnsafePointer<UInt8>)

initializer, therefore the conversion from an XML string to a Swift string can be simplified to

let xmlString: UnsafePointer<xmlChar> = ...
let s = String(cString: xmlString)

Ill-formed UTF-8 sequences are replaced by the Unicode replacement character U+FFFD.
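
A small sketch of that repair behaviour, using a hand-built byte array in place of a pointer returned by libxml2 (the array contents are made up for illustration):

```swift
// Bytes for "a", a lone 0xFF (never valid in UTF-8), "b", and the terminating NUL.
let illFormed: [UInt8] = [0x61, 0xFF, 0x62, 0x00]

// String(cString:) decodes up to the NUL and substitutes U+FFFD for the bad byte.
let repaired = illFormed.withUnsafeBufferPointer { buffer in
    String(cString: buffer.baseAddress!)
}
// repaired == "a\u{FFFD}b"
```

If silently repaired text is not acceptable for a given call site, the validating initializer from the question, which returns nil on ill-formed input, is the alternative.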


For the conversion from a Swift string to an XML string I would suggest an approach similar to Charles Srstka's, but using the existing String.withCString method instead of creating an intermediate array:

extension String {
    func withXmlString<T>(handler: (UnsafePointer<xmlChar>) throws -> T) rethrows -> T {
        return try self.withCString { try handler(UnsafeRawPointer($0).assumingMemoryBound(to: UInt8.self)) }
    }
}

If the throwing option is not needed, it simplifies to

extension String {
    func withXmlString<T>(handler: (UnsafePointer<xmlChar>) -> T) -> T {
        return self.withCString { handler(UnsafeRawPointer($0).assumingMemoryBound(to: UInt8.self)) }
    }
}
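
To illustrate when the throwing variant is useful, here is a hedged sketch. `xmlChar` is a stand-in typealias so the example compiles without libxml2, and `NameError` and `firstByte(of:)` are hypothetical names standing in for a wrapper around a C call that can fail.

```swift
// Stand-in for libxml2's `xmlChar` (unsigned char); an assumption for this sketch.
typealias xmlChar = UInt8

extension String {
    // The withCString-based helper from above, repeated so the example is self-contained.
    func withXmlString<T>(handler: (UnsafePointer<xmlChar>) throws -> T) rethrows -> T {
        return try self.withCString {
            try handler(UnsafeRawPointer($0).assumingMemoryBound(to: xmlChar.self))
        }
    }
}

// Hypothetical error type and consumer, not part of libxml2.
enum NameError: Error { case empty }

func firstByte(of name: UnsafePointer<xmlChar>) throws -> xmlChar {
    guard name[0] != 0 else { throw NameError.empty }
    return name[0]
}

// Errors thrown inside the handler propagate through withXmlString.
let first = try? "root".withXmlString { try firstByte(of: $0) }  // 0x72, ASCII "r"
let missing = try? "".withXmlString { try firstByte(of: $0) }    // nil, NameError.empty was thrown
```

Because the helper is declared `rethrows`, the same extension also accepts non-throwing closures without a `try` at the call site.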