About the Use Case
Card transactions from the PTLF (POS Transaction Log File) can be used to identify relations between merchant categories. I have considered a few MCCs (merchant category codes) across the transaction sets available in the PTLF for a single BIN that the card issuer has issued to a list of customers.
The table below shows which sample cards logged transactions under each MCC:
SAMPLE CARD      | TELEVISION SERVICES | UTILITY PAYMENT | DIGITAL GOODS | GROCERY STORES |
MCC              | 4899                | 4900            | 5816          | 5411           |
402411XXXXXXXXX1 | 1                   | 1               | 1             | 0              |
402411XXXXXXXXX2 | 1                   | 0               | 1             | 1              |
402411XXXXXXXXX3 | 1                   | 1               | 1             | 0              |
402411XXXXXXXXX4 | 1                   | 1               | 1             | 0              |
402411XXXXXXXXX5 | 0                   | 0               | 1             | 1              |
402411XXXXXXXXX6 | 1                   | 1               | 1             | 0              |
402411XXXXXXXXX7 | 1                   | 1               | 1             | 0              |
402411XXXXXXXXX8 | 0                   | 0               | 1             | 0              |
402411XXXXXXXXX9 | 1                   | 1               | 1             | 0              |
402411XXXXXXXX10 | 0                   | 1               | 0             | 1              |
402411XXXXXXXX11 | 1                   | 0               | 1             | 0              |
402411XXXXXXXX12 | 1                   | 1               | 0             | 0              |
1 = transaction present; 0 = no transaction
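To experiment with the table, it can be encoded as one basket of categories per card. The sketch below is only one possible encoding; the names `CATEGORIES`, `FLAGS`, and `baskets` are illustrative assumptions and are not taken from the linked repository.

```python
from collections import Counter

# Column order of the table (names are illustrative).
CATEGORIES = ("television", "utility", "digital", "grocery")

# One row per sample card, copied from the table in CATEGORIES order.
FLAGS = [
    (1, 1, 1, 0), (1, 0, 1, 1), (1, 1, 1, 0), (1, 1, 1, 0),
    (0, 0, 1, 1), (1, 1, 1, 0), (1, 1, 1, 0), (0, 0, 1, 0),
    (1, 1, 1, 0), (0, 1, 0, 1), (1, 0, 1, 0), (1, 1, 0, 0),
]

# Turn each row into the set of categories that card transacted in.
baskets = [{c for c, flag in zip(CATEGORIES, row) if flag} for row in FLAGS]

# Count how many cards have a transaction in each category.
category_counts = Counter(c for basket in baskets for c in basket)
total_txns = sum(category_counts.values())

print(category_counts["utility"], total_txns)  # 8 30
```

The per-category counts (9, 8, 10, and 3) and the total of 30 are the figures used in the calculations that follow.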
We are going to check whether customers who made Utility Payments also used their cards for Digital Goods, and we will verify a few similar combinations in this post. The goal is to identify relations between two or more merchant categories, that is, whether transactions in one category frequently occur alongside transactions in another.
What is the use of this Use Case?
A card issuer can use these relations to increase transaction volume in a targeted merchant category through cash-back and other promotional offers. For example, if Utility Payment volume is high and the same customers also transact frequently in Television Services, the issuer can offer cash back on the Television Services category in the forthcoming months.
What is Apriori Algorithm?
The Apriori algorithm implements Association Rule Mining, a technique for identifying relations between items. It is typically used for market basket analysis, but in this post we apply it to a different use case, based on the details below.
How Does Apriori Work?
Apriori relies on three main components:
- Support
- Confidence
- Lift
We can apply each of these components to the sample data from the Use Case section above:
Support
Support measures how popular a category is by default: the number of transactions containing that category (for example, Utility Payments) divided by the total number of transactions triggered by customers of the specific BIN.
Support (Utility Payments) = TXNS containing Utility Payment / Total No. of TXNS
(1) Support (Utility Payments) = 8 / 30 = 27%
Utility Payments account for 27% of the total quarterly transactions. The denominator, 30, is the count of all 1s in the table.
(2) Support (Grocery) = 3 / 30 = 10%
Grocery payments account for 10% of the total quarterly transactions.
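The support figures can be reproduced with a few lines of Python. This is a minimal sketch that follows this post's denominator (all 30 category flags); many Apriori references instead divide by the number of baskets (here, the 12 cards), which would give different values. The helper names `count` and `support` are illustrative.

```python
# Column order of the sample table (names are illustrative).
CATEGORIES = ("television", "utility", "digital", "grocery")

# One row per sample card, copied from the table in CATEGORIES order.
FLAGS = [
    (1, 1, 1, 0), (1, 0, 1, 1), (1, 1, 1, 0), (1, 1, 1, 0),
    (0, 0, 1, 1), (1, 1, 1, 0), (1, 1, 1, 0), (0, 0, 1, 0),
    (1, 1, 1, 0), (0, 1, 0, 1), (1, 0, 1, 0), (1, 1, 0, 0),
]

# Assumption taken from this post: every 1 in the table is one transaction.
TOTAL_TXNS = sum(sum(row) for row in FLAGS)


def count(category):
    """Number of cards with a transaction in the given category."""
    col = CATEGORIES.index(category)
    return sum(row[col] for row in FLAGS)


def support(category):
    """Share of all flagged transactions that fall in the category."""
    return count(category) / TOTAL_TXNS


print(f"{support('utility'):.0%}")  # 27%
print(f"{support('grocery'):.0%}")  # 10%
```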
Confidence
Confidence measures how strongly card purchases in one category imply purchases in another. For example, it tells us how likely it is that a card used for Digital Goods was also used for Utility Payments in the same period, relative to all cards used for Digital Goods:
(1) Confidence (Digital –> Utility) = TXNS containing both Utility & Digital / TXNS containing Digital
Confidence (Digital –> Utility) = 6 / 10 = 60%
(2) Confidence (Television –> Grocery) = TXNS containing both Television & Grocery / TXNS containing Television
Confidence (Television –> Grocery) = (1 / 9) = 11%
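The two confidence values can be checked against the table with a short sketch. The function name `confidence` and the category labels are illustrative, not taken from the linked repository.

```python
# Column order of the sample table (names are illustrative).
CATEGORIES = ("television", "utility", "digital", "grocery")

# One row per sample card, copied from the table in CATEGORIES order.
FLAGS = [
    (1, 1, 1, 0), (1, 0, 1, 1), (1, 1, 1, 0), (1, 1, 1, 0),
    (0, 0, 1, 1), (1, 1, 1, 0), (1, 1, 1, 0), (0, 0, 1, 0),
    (1, 1, 1, 0), (0, 1, 0, 1), (1, 0, 1, 0), (1, 1, 0, 0),
]


def confidence(antecedent, consequent):
    """Cards with both categories divided by cards with the antecedent."""
    a = CATEGORIES.index(antecedent)
    c = CATEGORIES.index(consequent)
    both = sum(1 for row in FLAGS if row[a] and row[c])
    with_antecedent = sum(row[a] for row in FLAGS)
    return both / with_antecedent


print(f"{confidence('digital', 'utility'):.0%}")     # 60%
print(f"{confidence('television', 'grocery'):.0%}")  # 11%
```

Note that confidence is directional: swapping antecedent and consequent changes the denominator, so Confidence (Utility –> Digital) is 6 / 8 rather than 6 / 10.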
Lift
Lift measures how much more likely two categories are to occur together than the popularity of the second category alone would suggest. For example, when Digital Goods are purchased with a card in a given period, lift tells us how much more likely it is that Utility Payments were made with the same card in that period:
(1) Lift (Digital –> Utility) = Confidence (Digital –> Utility) / Support (Utility Payments)
Lift (Digital –> Utility) = 60% / 27% ≈ 2.2
(2) Lift (Television –> Grocery) = Confidence (Television –> Grocery) / Support (Grocery Payments)
Lift (Television –> Grocery) = 11% / 10% ≈ 1.1, which is close to 1 and means there is little to no association between Television and Grocery purchases by these customers.
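The lift calculation can be sketched by combining the two earlier measures. The code follows this post's definition of support (denominator of 30 flagged transactions), and the function names are illustrative. Because it works with unrounded values, the result for Digital and Utility comes out as 2.25 rather than the 2.2 obtained from the rounded percentages.

```python
# Column order of the sample table (names are illustrative).
CATEGORIES = ("television", "utility", "digital", "grocery")

# One row per sample card, copied from the table in CATEGORIES order.
FLAGS = [
    (1, 1, 1, 0), (1, 0, 1, 1), (1, 1, 1, 0), (1, 1, 1, 0),
    (0, 0, 1, 1), (1, 1, 1, 0), (1, 1, 1, 0), (0, 0, 1, 0),
    (1, 1, 1, 0), (0, 1, 0, 1), (1, 0, 1, 0), (1, 1, 0, 0),
]

# Assumption taken from this post: every 1 in the table is one transaction.
TOTAL_TXNS = sum(sum(row) for row in FLAGS)


def support(category):
    """Share of all flagged transactions that fall in the category."""
    col = CATEGORIES.index(category)
    return sum(row[col] for row in FLAGS) / TOTAL_TXNS


def confidence(antecedent, consequent):
    """Cards with both categories divided by cards with the antecedent."""
    a = CATEGORIES.index(antecedent)
    c = CATEGORIES.index(consequent)
    both = sum(1 for row in FLAGS if row[a] and row[c])
    return both / sum(row[a] for row in FLAGS)


def lift(antecedent, consequent):
    """Confidence of the rule divided by the support of the consequent."""
    return confidence(antecedent, consequent) / support(consequent)


print(round(lift("digital", "utility"), 2))     # 2.25
print(round(lift("television", "grocery"), 2))  # 1.11
```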
We could evaluate many more combinations in the same way, such as Television Services and Grocery or Digital Goods and Utility Payments, and even three-category sets such as Digital Goods + Utility Payments + Television Services or Grocery + Television Services + Utility Payments.
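Evaluating every pair by hand gets tedious, so the pairs can be enumerated programmatically. The sketch below, which again assumes this post's denominator of 30 flagged transactions and uses illustrative names, ranks all category pairs by lift. Lift is symmetric in the two categories, so unordered pairs are enough.

```python
from itertools import combinations

# Column order of the sample table (names are illustrative).
CATEGORIES = ("television", "utility", "digital", "grocery")

# One row per sample card, copied from the table in CATEGORIES order.
FLAGS = [
    (1, 1, 1, 0), (1, 0, 1, 1), (1, 1, 1, 0), (1, 1, 1, 0),
    (0, 0, 1, 1), (1, 1, 1, 0), (1, 1, 1, 0), (0, 0, 1, 0),
    (1, 1, 1, 0), (0, 1, 0, 1), (1, 0, 1, 0), (1, 1, 0, 0),
]

# Assumption taken from this post: every 1 in the table is one transaction.
TOTAL_TXNS = sum(sum(row) for row in FLAGS)


def lift(first, second):
    """Confidence (first -> second) divided by support (second)."""
    i, j = CATEGORIES.index(first), CATEGORIES.index(second)
    n_first = sum(row[i] for row in FLAGS)
    n_second = sum(row[j] for row in FLAGS)
    both = sum(1 for row in FLAGS if row[i] and row[j])
    return (both / n_first) / (n_second / TOTAL_TXNS)


# Rank every unordered pair of categories from highest to lowest lift.
ranked = sorted(
    ((lift(a, b), a, b) for a, b in combinations(CATEGORIES, 2)),
    reverse=True,
)

for value, a, b in ranked:
    print(f"{a} + {b}: lift {value:.2f}")
```

Three-category sets could be handled the same way by passing `3` to `combinations` and counting rows where all three flags are set.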
Using the above mathematical approach, we can calculate the lift, and hence the strength of the relationship, between categories. According to the Apriori rule of thumb, a lift value of 1 or less indicates no meaningful relation between the categories, while a lift value greater than 1 indicates that the categories are likely to be related.
Please refer to the code available on GitHub for your learning and evaluation:
https://github.com/gopekanna/Apriori