Can LLVM Tools Identify Struct Fields Affected by File Content in IO Functions?


Many library I/O functions follow a common pattern: they take a file pointer filePtr and a struct handle handle, and after the call certain fields of the handle have been set from the file content. I want to identify which fields of the handle are influenced by the file content and, if possible, how they are influenced, for example as the chain of calls that leads to each write. How can this be achieved with LLVM tools?
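To make the pattern concrete, here is a minimal sketch in C. Everything in it is hypothetical: the Handle type, its fields, read_header, and decode_be32 are invented for illustration and do not come from any real library. It only mirrors the shape described above, where a reader takes a FILE* and a handle, and some fields (but not all) end up depending on the bytes in the file.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical handle, loosely modelled on the kind of struct in question. */
typedef struct {
    void    *io;        /* set up by the caller, never derived from file bytes */
    uint32_t version;   /* filled from file bytes */
    uint32_t flags;     /* filled from file bytes */
    int      is_write;  /* never touched by the reader */
} Handle;

/* Hypothetical helper: decode a 32-bit big-endian value from 4 raw bytes. */
static uint32_t decode_be32(const unsigned char *b)
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Hypothetical reader: consumes an 8-byte header from fp and stores the
 * decoded values in h. Returns 1 on success, 0 on a short read. */
int read_header(FILE *fp, Handle *h)
{
    unsigned char raw[8];

    assert(fp != NULL && h != NULL);
    if (fread(raw, 1, sizeof raw, fp) != sizeof raw)
        return 0;

    /* These two stores are the "influenced" fields: file bytes flow
     * through decode_be32 into the handle. */
    h->version = decode_be32(raw);
    h->flags   = decode_be32(raw + 4);
    return 1;
}
```

For this toy, the result I am after would list version and flags as influenced, each via read_header and decode_be32, and would leave io and is_write out.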

I am looking for a tool that needs minimal manual intervention: either one that accepts a specification of the functions to analyze or, ideally, one that discovers such functions in the library on its own.

Methods I tried

  1. scan-build with alpha.security.taint.TaintPropagation: It is geared toward security checks and requires both sources and sinks to be functions. In my case, the sink is the handle struct.
  2. opt passes: I couldn't find an existing pass for this problem. Perhaps I could use a pass such as -dot-cfg to dump the control-flow graph (CFG) and narrow down the code that has to be analyzed?

Any suggestions or insights on achieving this with LLVM tools would be appreciated.

Example target

Taken from Little-CMS. cmsHPROFILE is an alias for void*.

typedef struct _cms_iccprofile_struct {

    // I/O handler
    cmsIOHANDLER*            IOhandler;

    // The thread ID
    cmsContext               ContextID;

    // Creation time
    struct tm                Created;

    // Color management module identification
    cmsUInt32Number          CMM;

    // Only most important items found in ICC profiles
    cmsUInt32Number          Version;
    cmsProfileClassSignature DeviceClass;
    cmsColorSpaceSignature   ColorSpace;
    cmsColorSpaceSignature   PCS;
    cmsUInt32Number          RenderingIntent;

    cmsPlatformSignature     platform;
    cmsUInt32Number          flags;
    cmsUInt32Number          manufacturer, model;
    cmsUInt64Number          attributes;
    cmsUInt32Number          creator;

    cmsProfileID             ProfileID;

    // Dictionary
    cmsUInt32Number          TagCount;
    cmsTagSignature          TagNames[MAX_TABLE_TAG];
    cmsTagSignature          TagLinked[MAX_TABLE_TAG];           // The tag to which is linked (0=none)
    cmsUInt32Number          TagSizes[MAX_TABLE_TAG];            // Size on disk
    cmsUInt32Number          TagOffsets[MAX_TABLE_TAG];
    cmsBool                  TagSaveAsRaw[MAX_TABLE_TAG];        // True to write uncooked
    void *                   TagPtrs[MAX_TABLE_TAG];
    cmsTagTypeHandler*       TagTypeHandlers[MAX_TABLE_TAG];     // Same structure may be serialized on different types
                                                                 // depending on profile version, so we keep track of the
                                                                 // type handler for each tag in the list.
    // Special
    cmsBool                  IsWrite;

    // Keep a mutex for cmsReadTag -- Note that this only works if the user includes a mutex plugin
    void *                   UsrMutex;

} _cmsICCPROFILE;

// Open from memory block
cmsHPROFILE CMSEXPORT cmsOpenProfileFromMemTHR(cmsContext ContextID, const void* MemPtr, cmsUInt32Number dwSize)
{
    _cmsICCPROFILE* NewIcc;
    cmsHPROFILE hEmpty;

    hEmpty = cmsCreateProfilePlaceholder(ContextID);
    if (hEmpty == NULL) return NULL;

    NewIcc = (_cmsICCPROFILE*) hEmpty;

    // Ok, in this case const void* is casted to void* just because open IO handler
    // shares read and writing modes. Don't abuse this feature!
    NewIcc ->IOhandler = cmsOpenIOhandlerFromMem(ContextID, (void*) MemPtr, dwSize, "r");
    if (NewIcc ->IOhandler == NULL) goto Error;

    if (!_cmsReadHeader(NewIcc)) goto Error;

    return hEmpty;

Error:
    cmsCloseProfile(hEmpty);
    return NULL;
}

Expected output

Variable name: hEmpty
Field name: IOhandler -cmsOpenIOhandlerFromMem
Field name: Version -_cmsReadHeader-_cmsAdjustEndianess32
Field name: platform -_cmsReadHeader-_cmsAdjustEndianess32
...