How to get the author of a PDF document with MuPDF


How can I get the metadata of a PDF document (e.g. title, author, creation date) using the MuPDF library? The documentation does not cover this, and the source comments are not sufficient either. There is most likely a function for this purpose, but it is hard to find. The following code is what I have so far.

char info[64];
globals *glo = get_globals(env, thiz);

fz_meta(glo->doc, FZ_META_INFO, info, sizeof(info));

I used the FZ_META_INFO tag, but it doesn't work: the buffer comes back empty. I have checked that the document does have metadata. Any help is appreciated.

EDIT:

Target Android sdk:20

Min Android sdk:15

Mupdf version: 1.6

ndk: r10c

Development OS: Ubuntu 12.04


There are 2 answers

KenS On BEST ANSWER

In what sense does it 'not work'? Does it throw an error? Does it crash? Are you certain the PDF file you are using has any 'Info' metadata?

What is the version of MuPDF? What platform are you using?

You need to set the relevant key in the buffer you pass to fz_meta before you call fz_meta; I notice you aren't doing that.

See win_main.c at around line 487; once you get past the macro, this resolves to:

char info[256];

/* The buffer carries the key in and the value out. */
sprintf(info, "Title");
fz_meta(doc, FZ_META_INFO, info, sizeof(info));

On return, 'info' will contain the metadata associated with the Title key in the dictionary.
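The in/out buffer convention described here can be tried out without MuPDF. The following is only a minimal sketch of the pattern, not MuPDF's implementation: `lookup_meta`, `get_meta`, `demo_info` and the sample values are hypothetical names invented for illustration, and the return codes are an assumption of this sketch rather than fz_meta's documented behaviour.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a document's Info dictionary (not MuPDF data). */
struct meta_entry { const char *key; const char *value; };

static const struct meta_entry demo_info[] = {
    { "Title",  "Example Document" },
    { "Author", "Jane Doe" },
};

/*
 * Hypothetical in/out lookup. On entry, buf holds the key as a
 * NUL-terminated string. On success, buf is overwritten with the value
 * and 0 is returned. On failure (unknown key, or value too long for
 * the buffer), -1 is returned and buf is left unchanged.
 */
static int lookup_meta(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    for (size_t i = 0; i < sizeof demo_info / sizeof demo_info[0]; i++) {
        if (strcmp(buf, demo_info[i].key) == 0) {
            if (strlen(demo_info[i].value) >= size)
                return -1;              /* value would not fit */
            strcpy(buf, demo_info[i].value);
            return 0;
        }
    }
    return -1;                          /* no such key */
}

/* Writes the key into the buffer first, then performs the lookup. */
static int get_meta(const char *key, char *buf, size_t size)
{
    int n = snprintf(buf, size, "%s", key);
    if (n < 0 || (size_t)n >= size)
        return -1;                      /* key itself did not fit */
    return lookup_meta(buf, size);
}
```

In this sketch, calling `get_meta("Title", buf, sizeof buf)` leaves the title in `buf`, whereas calling `lookup_meta` on a buffer that holds no key finds nothing, which mirrors the empty result described in the question.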

When in doubt, build the sample app and follow it in a debugger.

Gordon88 On

The cast that allows the key to be sent is NOT the correct one for receiving a char* back.

Example: proper casting to send the request

char buff[2048];
strcpy(buff,"CreationDate");
if (fz_meta(ctx,doc,FZ_META_INFO,&buff,2048)) {
    buff[0] = 0;
}

This will find the key and perform the UTF-8 conversion, then crash when the result is copied back.

Proper casting to receive the result

char buff[2048];
strcpy(buff,"CreationDate");
if (fz_meta(ctx,doc,FZ_META_INFO,buff,2048)) {
    buff[0] = 0;
}

This will crash while the dictionary is being scanned. It really looks like a bug! I can confirm that modifying the original source to

info = pdf_dict_gets(ctx, info, (char *)ptr);

is the way to go (even if it is strange that nobody else found this while writing code, since metadata is a useful and frequently used feature).