Apriori Algorithm


I've heard about the Apriori algorithm several times before but never got the time or the opportunity to dig into it, can anyone explain to me in a simple way the workings of this algorithm? Also, a basic example would make it a lot easier for me to understand.


There are 4 answers

Spencer Ruport (Best Answer)

Well, I would assume you've read the Wikipedia entry, but you said "a basic example would make it a lot easier for me to understand". Wikipedia has just that, so I'll assume you haven't read it and suggest that you do.

Read the Wikipedia article.

Phil

The best introduction to Apriori can be downloaded from this book's website:

http://www-users.cs.umn.edu/~kumar/dmbook/index.php

You can download Chapter 6 for free; it explains Apriori very clearly.

Moreover, if you want to download a Java version of Apriori and other algorithms for frequent itemset mining, you can check my website:

http://www.philippe-fournier-viger.com/spmf/

lmsasu

See Top 10 algorithms in data mining (free access) or The Top Ten Algorithms in Data Mining. The latter gives a detailed description of the algorithm, together with details on how to get optimized implementations.

Thilina Samiddhi

Apriori Algorithm

It is a candidate-generation-and-test approach for frequent pattern mining in datasets. There are two things you have to remember.

Apriori Pruning Principle - If any itemset is infrequent, then its supersets should not be generated or tested, since they cannot be frequent either.

Apriori Property - A given (k+1)-itemset is a candidate (k+1)-itemset only if every one of its k-itemset subsets is frequent.

Now, here is the Apriori algorithm in 4 steps; a minimal code sketch follows the list.

  1. Initially, scan the database/dataset once to get the frequent 1-itemsets.
  2. Generate length-(k+1) candidate itemsets from the length-k frequent itemsets.
  3. Test the candidates against the database/dataset.
  4. Terminate when no frequent or candidate itemsets can be generated.
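
As a concrete illustration, here are those four steps in Python (a minimal sketch of my own, not from the original answer; the function and variable names are chosen just for this example). Note that it applies the pruning principle while generating candidates, so pruned candidates are never tested against the database:

    from itertools import combinations

    def apriori(transactions, min_sup):
        """Return {itemset (frozenset): support count} for every frequent itemset."""
        # Step 1: scan once to count 1-itemsets and keep the frequent ones (L_1).
        counts = {}
        for t in transactions:
            for item in t:
                key = frozenset([item])
                counts[key] = counts.get(key, 0) + 1
        frequent = {s: c for s, c in counts.items() if c >= min_sup}
        result = dict(frequent)
        k = 1
        while frequent:
            # Step 2: join frequent k-itemsets into (k+1)-candidates, pruning any
            # candidate that has an infrequent k-subset (the Apriori property).
            prev = set(frequent)
            candidates = {
                a | b
                for a in prev for b in prev
                if len(a | b) == k + 1
                and all(frozenset(s) in prev for s in combinations(a | b, k))
            }
            # Step 3: test the surviving candidates against the database.
            counts = {c: sum(c <= set(t) for t in transactions) for c in candidates}
            # Step 4: keep the frequent ones; terminate when none remain.
            frequent = {s: c for s, c in counts.items() if c >= min_sup}
            result.update(frequent)
            k += 1
        return result

    db = [{'A', 'C', 'D'}, {'B', 'C', 'E'}, {'A', 'B', 'C', 'E'}, {'B', 'E'}]
    print(apriori(db, 2))  # includes frozenset({'B', 'C', 'E'}): 2

Running it on the example database below reproduces every L_1, L_2 and L_3 itemset worked out by hand in the rest of this answer.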

Solved Example

Suppose there is a transaction database with the following 4 transactions, listed with their transaction IDs and the items bought in them. Assume the minimum support, min_sup, is 2. The support of an itemset is the number of transactions in which it appears.

Transaction DB

tid  | items
-------------
10   | A,C,D
20   | B,C,E
30   | A,B,C,E
40   | B,E

Now, let's create the candidate 1-itemsets with the 1st scan of the DB. This set is simply called C_1:

itemset | sup
-------------
  {A}   |  2
  {B}   |  3
  {C}   |  3
  {D}   |  1
  {E}   |  3
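
The first scan is just an item count; as a quick check, a minimal sketch that reproduces the column above (db is the example database written out as Python lists):

    from collections import Counter

    db = [['A', 'C', 'D'], ['B', 'C', 'E'], ['A', 'B', 'C', 'E'], ['B', 'E']]
    C1 = Counter(item for t in db for item in t)
    print(C1)  # Counter({'C': 3, 'B': 3, 'E': 3, 'A': 2, 'D': 1})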

If we test this against min_sup, we can see that {D} does not satisfy the min_sup of 2, so it will not be included in the frequent 1-itemsets, which we simply call L_1:

itemset | sup
-------------
  {A}   |  2
  {B}   |  3
  {C}   |  3
  {E}   |  3

Now, let's scan the DB a 2nd time and generate the candidate 2-itemsets, the set we simply call C_2:

itemset | sup
-------------
  {A,B} |  1
  {A,C} |  2
  {A,E} |  1
  {B,C} |  2
  {B,E} |  3
  {C,E} |  2
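
These counts can be reproduced by pairing up the items of L_1 and scanning the DB once more; a minimal sketch (db and L1 are names for this example only):

    from itertools import combinations

    db = [{'A', 'C', 'D'}, {'B', 'C', 'E'}, {'A', 'B', 'C', 'E'}, {'B', 'E'}]
    L1 = ['A', 'B', 'C', 'E']  # the frequent 1-itemsets found above
    for pair in combinations(L1, 2):
        sup = sum(set(pair) <= t for t in db)  # transactions containing both items
        print(set(pair), sup)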

As you can see, the itemsets {A,B} and {A,E} do not satisfy the min_sup of 2, so they will not be included in the frequent 2-itemsets, L_2:

itemset | sup
-------------
  {A,C} |  2
  {B,C} |  2
  {B,E} |  3
  {C,E} |  2

Now, let's do a 3rd scan of the DB and get the candidate 3-itemsets, C_3:

itemset  | sup
-------------
 {A,B,C} |  1
 {A,B,E} |  1
 {A,C,E} |  1
 {B,C,E} |  2

You can see that {A,B,C}, {A,B,E} and {A,C,E} do not satisfy the min_sup of 2, so they will not be included in the frequent 3-itemsets, L_3. (In fact, by the Apriori pruning principle they need not have been counted at all, since each contains an infrequent 2-subset such as {A,B} or {A,E}; see the sketch after the L_3 table.)

itemset  | sup
-------------
 {B,C,E} |  2
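
Here is the join-and-prune step that builds C_3 from L_2, as a minimal sketch (L2 and C3 are names for this example); it shows that {B,C,E} is the only candidate that actually needs a database scan:

    from itertools import combinations

    L2 = [frozenset(s) for s in ({'A', 'C'}, {'B', 'C'}, {'B', 'E'}, {'C', 'E'})]
    C3 = set()
    for a in L2:
        for b in L2:
            u = a | b
            # Apriori prune: keep a 3-itemset only if all its 2-subsets are in L_2.
            if len(u) == 3 and all(frozenset(s) in L2 for s in combinations(u, 2)):
                C3.add(u)
    print([set(c) for c in C3])  # prints [{'B', 'C', 'E'}]: the only candidate to test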

Now, finally, we can calculate the support (supp), confidence (conf) and lift (interestingness) values of the Association/Correlation Rules that can be generated from the itemset {B,C,E}. For a rule X -> Y, supp = supp(X ∪ Y), conf = supp(X ∪ Y) / supp(X), and lift = conf / supp(Y), with supports taken as fractions of the 4 transactions.

         Rule       | supp  |  conf   |  lift
    -------------------------------------------
     B ->  C & E    |  50%  |  66.67% |  1.33
     E ->  B & C    |  50%  |  66.67% |  1.33
     C ->  E & B    |  50%  |  66.67% |  0.89
     B & C ->  E    |  50%  |  100%   |  1.33
     E & B ->  C    |  50%  |  66.67% |  0.89
     C & E ->  B    |  50%  |  100%   |  1.33
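
These figures can be checked mechanically with the formulas above; a minimal sketch (support, full and db are names for this example only):

    from itertools import combinations

    db = [{'A', 'C', 'D'}, {'B', 'C', 'E'}, {'A', 'B', 'C', 'E'}, {'B', 'E'}]

    def support(itemset):
        """Fraction of the transactions that contain the itemset."""
        return sum(itemset <= t for t in db) / len(db)

    full = frozenset('BCE')
    for r in (1, 2):
        for lhs in combinations(sorted(full), r):
            X, Y = frozenset(lhs), full - frozenset(lhs)
            supp = support(full)      # supp(X ∪ Y)
            conf = supp / support(X)  # supp(X ∪ Y) / supp(X)
            lift = conf / support(Y)  # conf / supp(Y)
            print(f"{set(X)} -> {set(Y)}: supp={supp:.0%} conf={conf:.2%} lift={lift:.2f}")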