Dynamically generated DocType in Elasticsearch DSL


I am generating a DocType class from my ORM models, which I use both to build the mapping and to save documents.

from elasticsearch_dsl import DocType

def get_doc_type(self):
    attributes = {}

    ...
    # Build attributes dictionary here

    # Create the DocType subclass at runtime
    DT = type('DocType', (DocType,), attributes)
    return DT
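
For reference, the three-argument form of type() used above can be sketched in plain Python. BaseDoc and the title/views attributes are stand-ins invented for this illustration and are not part of elasticsearch-dsl; in the real code the base class would be DocType and the attribute values would be field objects.

```python
# Stand-in base class (hypothetical), so the sketch runs without Elasticsearch.
class BaseDoc:
    def describe(self):
        # Return the public attribute names defined on the generated class.
        return sorted(
            name for name in vars(type(self))
            if not name.startswith('_')
        )


def build_doc_class(attributes):
    # type(name, bases, namespace) creates a class at runtime,
    # much like a `class` statement whose body defines those names.
    return type('GeneratedDoc', (BaseDoc,), dict(attributes))


Doc = build_doc_class({'title': 'text', 'views': 'integer'})
doc = Doc()
print(Doc.__name__)    # GeneratedDoc
print(doc.describe())  # ['title', 'views']
```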

This seems to work fine, and I have no trouble with the mapping. My problem comes when I try to save documents.

THIS DOES NOT WORK

Doc = get_doc_type()

for instance in queryset:
    doc = Doc()
    for field_name in fields:
        attribute = getattr(instance, field_name, None)
        setattr(doc, field_name, attribute)
    doc.save(index)

With this approach a document does get saved, but none of my attributes are set. It is just an empty document.

I have debugged the code to confirm that the field_name and attribute contain the values I would expect.

THIS DOES WORK

Doc = self.get_doc_type()

for instance in queryset:
    kwargs = {}

    for field_name in fields:
        attribute = getattr(instance, field_name, None)
        kwargs[field_name] = attribute

    doc = Doc(**kwargs)
    doc.save(index=index)

When I use this strategy, the document is saved as expected, and all the information and attributes have been passed from my instance into the doc.

QUESTION

What could be causing this? I don't understand why the two strategies aren't equivalent.


There are 2 answers

DeFOX On BEST ANSWER

My guess is that, in your case, the save() method needs some extra information to know which field names should be stored.

Maybe like this:

    class Doc(object):
        def __init__(self, **kwargs):
            self.fields_valid = kwargs.copy()
            ...
        ...
        def save(self, index=None):
            ...
            for field in self.fields_valid:
                do_save(field)
            ...

So you should look at both the __init__ and save methods of the Doc class to find out what it actually does to persist the Doc object.
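
To make that guess concrete, here is a minimal runnable sketch of a class that would behave as described. TrackedDoc is purely hypothetical and is not how elasticsearch-dsl's DocType is implemented; it only illustrates how a document class could serialize constructor keyword arguments while ignoring attributes assigned afterwards. Comparing the serialized output of both code paths before saving is one way to check whether something like this is happening.

```python
class TrackedDoc:
    """Hypothetical document class: only fields passed to __init__ are serialized."""

    def __init__(self, **kwargs):
        # Remember which field names were supplied at construction time.
        self._tracked = set(kwargs)
        for name, value in kwargs.items():
            setattr(self, name, value)

    def to_dict(self):
        # Serialize tracked fields only; attributes set later are skipped.
        return {name: getattr(self, name) for name in self._tracked}


via_kwargs = TrackedDoc(title='hello')

via_setattr = TrackedDoc()
setattr(via_setattr, 'title', 'hello')

print(via_kwargs.to_dict())   # {'title': 'hello'}
print(via_setattr.to_dict())  # {}
```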

Honza Král On

I am having trouble replicating your behavior, as everything works just fine for me:

from elasticsearch_dsl import DocType

class DT(DocType):
    pass

dt = DT()

for x in range(10):
    setattr(dt, 'i_%i' % x, x)
dt.save()

DT.search().execute()[0].to_dict()

This shows exactly what I would have expected. If it doesn't work for you, could you please file an issue on GitHub? Something is wrong in that case. Thank you!

Btw, what I typically do when serializing from an ORM into elasticsearch-dsl is to have a to_search or similar method directly on the Model that produces a DocType instance. It makes everything much simpler, including keeping the two datasets synchronized using signals.
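
A rough sketch of that pattern, using plain Python stand-ins so it runs without an ORM or Elasticsearch. Article, ArticleDoc, search_fields and the field names are all invented for illustration; in a real project ArticleDoc would subclass DocType and Article would be an ORM model.

```python
class ArticleDoc:
    """Stand-in for a DocType subclass (hypothetical)."""

    def __init__(self, **fields):
        self.fields = dict(fields)

    def to_dict(self):
        return dict(self.fields)


class Article:
    """Stand-in for an ORM model (hypothetical)."""

    # Names of the attributes that should end up in the search document.
    search_fields = ('title', 'body')

    def __init__(self, pk, title, body):
        self.pk = pk
        self.title = title
        self.body = body

    def to_search(self):
        # Collect the indexed fields and pass them as keyword arguments,
        # so the model owns the translation from ORM instance to document.
        data = {name: getattr(self, name, None) for name in self.search_fields}
        return ArticleDoc(**data)


doc = Article(pk=1, title='Hello', body='World').to_search()
print(doc.to_dict())  # {'title': 'Hello', 'body': 'World'}
```

With the real library, a save signal handler could then presumably call to_search() on the instance and save the resulting document to keep the index in step with the database.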