sorry if the title is wrong or confusing.
i'm working with two source files (think csv files converted to excel or loaded into separate oracle tables). the data in the two files is associated from a business perspective.
file 1 contains a set of rows and columns. for instance:
card_type, amount
VISA, 100
MASTERCARD, 300
DISCOVER, 200
AMEX, 150
DEBIT, 400
file 2 contains simliar data, but is aggregated in an unknown way:
deposit_id, deposit_date, amount
1, 03/01/2015, 400
2, 03/01/2015, 350
3, 03/01/2015, 400
Now obviously i oversimplified that a lot.
What I'm looking for is a way to determine how rows in file 1 are related to rows in file 2.
In this case, possible outcomes are:
file1 (visa + mastercard = 400) = file2 deposit_id 1 (400)
file1 (debit = 400) = file2 deposit_id 1 (400)
file1 (visa = mastercard = 400) = file2 deposit_id 3 (400)
file1 (debit = 400) = file2 deposit_id 3 (400)
file1 (discover + amex = 350) = file2 deposit_id 2 (350)
...and so on...
sometimes there are rows that may be excluded (not used at all). maybe there are rows that are used 2x. maybe there is 1 association, maybe there are 2 or 3. who knows! unfortunately, the data files aren't very well defined (talking about old mainframe systems here).
obviously, i can try to go through and figure it out manually myself, but i end up trying tons of combinations that are dead-ends and only waste time. it'd be great to have a system that can take the inputs and try to see how they are possibly related. i realize every possible solution would need to be analyzed by myself manually after the fact to make sure it makes sense and works across all data-sets.
i'd appreciate any advice on if what i'm looking for is even possible, or the correct terminology to describe my problem.
i'm fine if possible solutions are in query form or even an existing application/program/website that provides the functionality.
thanks!