Many library I/O functions follow a pattern: they take a file pointer (filePtr) and a pointer to a struct handle (handle), and after the call certain fields of the handle have been set from the file content. I want to identify which fields of the handle are influenced and, if possible, how they are influenced, for example as the chain of calls that leads to each write. How can this be achieved with LLVM tools?
I am looking for a tool that needs minimal manual intervention: it either accepts a specification of the functions to analyze or, ideally, discovers such functions in the library on its own.
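To make the pattern concrete, here is a minimal sketch of it. Every name in it (demo_handle, demo_be32, demo_read_header) is hypothetical and made up for illustration; none of them come from a real library.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical handle: only some fields depend on the file content. */
typedef struct {
    FILE    *io;        /* set from the argument, not from file content */
    uint32_t version;   /* set from file bytes 0..3 */
    uint32_t flags;     /* set from file bytes 4..7 */
    int      is_write;  /* never touched by the reader */
} demo_handle;

/* Hypothetical helper on the call chain: big-endian bytes to host order. */
static uint32_t demo_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Reads an 8-byte header from filePtr and fills in the handle.
   Returns 1 on success, 0 on a short read. */
int demo_read_header(FILE *filePtr, demo_handle *handle)
{
    unsigned char buf[8];

    assert(filePtr != NULL && handle != NULL);
    if (fread(buf, 1, sizeof buf, filePtr) != sizeof buf)
        return 0;

    handle->io      = filePtr;             /* influenced by the argument only */
    handle->version = demo_be32(buf);      /* influenced by file content */
    handle->flags   = demo_be32(buf + 4);  /* influenced by file content */
    return 1;
}
```

For this sketch, the report I am after would list version and flags as influenced by the file content through demo_read_header and demo_be32, and would leave out io and is_write.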
Methods I tried
- scan-build with alpha.security.taint.TaintPropagation: it is geared toward security checks and requires both sources and sinks to be functions. In my case, the sink is the handle struct.
- opt passes: I couldn't find an existing pass for this problem. Perhaps I could use a pass such as -dot-cfg to dump the control-flow graph (CFG) and narrow down the code that has to be analyzed?
Any suggestions or insights on achieving this with LLVM tools would be appreciated.
Example target
Taken from Little CMS. cmsHPROFILE is an alias for void*.
typedef struct _cms_iccprofile_struct {

    // I/O handler
    cmsIOHANDLER* IOhandler;

    // The thread ID
    cmsContext ContextID;

    // Creation time
    struct tm Created;

    // Color management module identification
    cmsUInt32Number CMM;

    // Only most important items found in ICC profiles
    cmsUInt32Number Version;
    cmsProfileClassSignature DeviceClass;
    cmsColorSpaceSignature ColorSpace;
    cmsColorSpaceSignature PCS;
    cmsUInt32Number RenderingIntent;

    cmsPlatformSignature platform;
    cmsUInt32Number flags;
    cmsUInt32Number manufacturer, model;
    cmsUInt64Number attributes;
    cmsUInt32Number creator;

    cmsProfileID ProfileID;

    // Dictionary
    cmsUInt32Number TagCount;
    cmsTagSignature TagNames[MAX_TABLE_TAG];
    cmsTagSignature TagLinked[MAX_TABLE_TAG]; // The tag to which is linked (0=none)
    cmsUInt32Number TagSizes[MAX_TABLE_TAG]; // Size on disk
    cmsUInt32Number TagOffsets[MAX_TABLE_TAG];
    cmsBool TagSaveAsRaw[MAX_TABLE_TAG]; // True to write uncooked
    void * TagPtrs[MAX_TABLE_TAG];
    cmsTagTypeHandler* TagTypeHandlers[MAX_TABLE_TAG]; // Same structure may be serialized on different types
                                                       // depending on profile version, so we keep track of the
                                                       // type handler for each tag in the list.
    // Special
    cmsBool IsWrite;

    // Keep a mutex for cmsReadTag -- Note that this only works if the user includes a mutex plugin
    void * UsrMutex;

} _cmsICCPROFILE;

// Open from memory block
cmsHPROFILE CMSEXPORT cmsOpenProfileFromMemTHR(cmsContext ContextID, const void* MemPtr, cmsUInt32Number dwSize)
{
    _cmsICCPROFILE* NewIcc;
    cmsHPROFILE hEmpty;

    hEmpty = cmsCreateProfilePlaceholder(ContextID);
    if (hEmpty == NULL) return NULL;

    NewIcc = (_cmsICCPROFILE*) hEmpty;

    // Ok, in this case const void* is casted to void* just because open IO handler
    // shares read and writing modes. Don't abuse this feature!
    NewIcc ->IOhandler = cmsOpenIOhandlerFromMem(ContextID, (void*) MemPtr, dwSize, "r");
    if (NewIcc ->IOhandler == NULL) goto Error;

    if (!_cmsReadHeader(NewIcc)) goto Error;

    return hEmpty;

Error:
    cmsCloseProfile(hEmpty);
    return NULL;
}
Expected output
Variable name: hEmpty
Field name: IOhandler -cmsOpenIOhandlerFromMem
Field name: Version -_cmsReadHeader-_cmsAdjustEndianess32
Field name: platform -_cmsReadHeader-_cmsAdjustEndianess32
...