Natural language summary based on two properties


The problem is conceptually quite simple: I'm looking to summarize a set of items (at most 30, though rarely more than 5) based on two of their properties, say shape and colour. Instead of something clinical like

Item 1 is a red cube
Item 2 is a blue sphere
Item 3 is a blue cylinder
Item 4 is a green sphere

I'd like to produce something more human-readable, like

You have two spheres, one blue, one green.
OR: You have two spheres; some are blue, others green.
You also have one blue cylinder and one red cube.

How would I go about doing that in an organized manner?
Is there a better way than spelling out every single case? E.g. better than: if(singleItem), if(only 1 shape and 1 color), if(1 shape, multiple colors), if(multiple shapes, multiple colors), etc.

There is 1 answer

ling_jan On

First, define which properties belong to the same category, i.e. which values are colors, which are shapes, etc.

Then sort your items into groups by these categories and look for generalizations you can make about each group.

How you group really depends on what you want to talk about. You mainly talked about the shapes, but you could also lead with the colors, e.g. "We have two blue items". If you just want to summarize and all of your properties have the same priority, you could first look for the most prominent thing the items have in common, i.e. the property value shared by the most items. Flag each item once you've talked about it, so you don't get:

We have two spheres, one blue and one green. We also have two blue items, one sphere and one cylinder

...which sounds like you are talking about two different blue spheres, when you might have just one.
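The "most prominent thing in common, then flag" idea can be sketched roughly as follows. This is only an illustration, not a definitive implementation: the function name `group_items` and the dict-based item format are assumptions of mine, and instead of setting a flag on each item it simply removes described items from the pool, which has the same effect.

```python
from collections import Counter

def group_items(items):
    """Greedily partition items into groups that share one property value.

    items: list of dicts such as {"shape": "sphere", "colour": "blue"}
    (hypothetical format). Returns a list of (property, value, members)
    tuples. Every item lands in exactly one group, so nothing is
    described twice.
    """
    remaining = list(items)
    groups = []
    while remaining:
        # Count how many not-yet-described items share each (property, value).
        counts = Counter(
            (prop, val) for item in remaining for prop, val in item.items()
        )
        # Pick the most common pair; on a tie, the pair seen first wins.
        (prop, val), _ = max(counts.items(), key=lambda kv: kv[1])
        members = [item for item in remaining if item.get(prop) == val]
        groups.append((prop, val, members))
        # Dropping the members plays the role of "flagging" them.
        remaining = [item for item in remaining if item.get(prop) != val]
    return groups
```

With the four items from the question, this would group the two spheres first and then leave the cube and the cylinder as single-item groups, so the blue sphere is never mentioned a second time under "blue items".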

As for an algorithm, you probably can't avoid conditional statements entirely. But first, think about all the different cases you want to talk about, then lay them out in a tree structure where each case is listed, so you don't forget any.
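To give an idea of how few cases are left once the items are grouped, here is a minimal sketch that groups by shape only and renders each group with just two branches (one colour vs. several). All names here (`summarize`, `describe_group`, `NUMBER_WORDS`, etc.) are made up for the illustration, the pluralisation is deliberately naive, and the wording differs slightly from the sentences in the question.

```python
from collections import Counter, defaultdict

NUMBER_WORDS = {1: "one", 2: "two", 3: "three", 4: "four", 5: "five"}

def count_word(n):
    # Fall back to digits for counts without a word in the table.
    return NUMBER_WORDS.get(n, str(n))

def plural(noun, n):
    # Naive pluralisation; enough for "cube", "sphere", "cylinder".
    return noun if n == 1 else noun + "s"

def join_list(parts):
    # "a", "a and b", "a, b and c"
    if len(parts) <= 1:
        return "".join(parts)
    return ", ".join(parts[:-1]) + " and " + parts[-1]

def describe_group(shape, colours):
    n = len(colours)
    tally = Counter(colours)
    if len(tally) == 1:
        # Case 1: a single colour, e.g. "two blue spheres".
        return f"{count_word(n)} {colours[0]} {plural(shape, n)}"
    # Case 2: several colours, e.g. "two spheres (one blue and one green)".
    breakdown = join_list([f"{count_word(c)} {col}" for col, c in tally.items()])
    return f"{count_word(n)} {plural(shape, n)} ({breakdown})"

def summarize(items):
    """items: list of (shape, colour) tuples (hypothetical format)."""
    by_shape = defaultdict(list)
    for shape, colour in items:
        by_shape[shape].append(colour)
    if not by_shape:
        return "You have no items."
    # Largest groups first; the sort is stable, so ties keep input order.
    ordered = sorted(by_shape.items(), key=lambda kv: len(kv[1]), reverse=True)
    return "You have " + join_list([describe_group(s, c) for s, c in ordered]) + "."

print(summarize([("cube", "red"), ("sphere", "blue"),
                 ("cylinder", "blue"), ("sphere", "green")]))
# prints: You have two spheres (one blue and one green), one red cube and one blue cylinder.
```

The point of the design is that the conditionals live inside `describe_group`, per group, rather than across every combination of shapes and colours, so adding a new phrasing (say, "some are blue, others green" for large groups) would mean adding one branch there.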