Lift
The lift of a rule is defined as:
$$\mathrm{lift}(X \Rightarrow Y) = \frac{\mathrm{supp}(X \cup Y)}{\mathrm{supp}(X) \times \mathrm{supp}(Y)}$$
or the ratio of the observed support to that expected if X and Y were independent.
For example, the rule $\{\mathrm{milk, bread}\} \Rightarrow \{\mathrm{butter}\}$ has a lift of $\frac{0.2}{0.4 \times 0.4} = 1.25$.
If the rule had a lift of 1, the occurrence of the antecedent and the occurrence of the consequent would be independent of each other. When two events are independent, no rule can be drawn involving them.
If the lift is greater than 1, its magnitude indicates the degree to which the two occurrences depend on one another, which makes such rules potentially useful for predicting the consequent in future data sets.
If the lift is less than 1, the items are substitutes for each other: the presence of one item has a negative effect on the presence of the other, and vice versa.
The value of lift is that it considers both the support of the rule and the overall data set.[3]
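The definition above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `support` and `lift` and the five-transaction data set are assumptions chosen so that the supports match the worked example (0.4, 0.4, and 0.2).

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def lift(antecedent, consequent, transactions):
    """Observed joint support divided by the support expected under independence."""
    supp_x = support(antecedent, transactions)
    supp_y = support(consequent, transactions)
    if supp_x == 0 or supp_y == 0:
        raise ValueError("lift is undefined when either side has zero support")
    joint = support(frozenset(antecedent) | frozenset(consequent), transactions)
    return joint / (supp_x * supp_y)


# Hypothetical toy data set (an assumption for illustration only).
transactions = [
    frozenset({"milk", "bread"}),
    frozenset({"butter"}),
    frozenset({"beer"}),
    frozenset({"milk", "bread", "butter"}),
    frozenset({"bread"}),
]

rule_lift = lift({"milk", "bread"}, {"butter"}, transactions)
print(round(rule_lift, 4))
```

With these transactions the supports are 0.4, 0.4, and 0.2, so the computed lift is approximately 1.25, in line with the worked example.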
Chi-squared test
A chi-squared test, also written as χ² test, is a statistical hypothesis test that is valid to perform when the test statistic is chi-squared distributed under the null hypothesis. The term refers specifically to Pearson's chi-squared test and variants thereof. Pearson's chi-squared test is used to determine whether there is a statistically significant difference between the expected frequencies and the observed frequencies in one or more categories of a contingency table.
In the standard applications of this test, the observations are classified into mutually exclusive classes. If the null hypothesis that there are no differences between the classes in the population is true, the test statistic computed from the observations follows a χ² frequency distribution. The purpose of the test is to evaluate how likely the observed frequencies would be assuming the null hypothesis is true.
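As a rough illustration of this setup, the sketch below computes the standard Pearson statistic, the sum over classes of (observed − expected)² / expected, for counts in mutually exclusive classes. The function name `pearson_chi2` and the die-roll counts are hypothetical, chosen only for the example.

```python
def pearson_chi2(observed, expected):
    """Pearson's statistic: sum of (O - E)^2 / E over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 rolls of a die, one count per face.
observed = [8, 9, 12, 11, 6, 14]
# Null hypothesis: the die is fair, so each face is expected 60 / 6 = 10 times.
expected = [10.0] * 6

stat = pearson_chi2(observed, expected)
dof = len(observed) - 1  # number of classes minus one
print(round(stat, 3), dof)
```

Here the statistic is about 4.2 with 5 degrees of freedom. That is well below the tabulated 5% critical value of roughly 11.07 for 5 degrees of freedom, so these made-up counts would not lead to rejecting the null hypothesis of a fair die.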
Test statistics that follow a χ² distribution occur when the observations are independent and normally distributed, assumptions that are often justified under the central limit theorem. There are also χ² tests for testing the null hypothesis of independence of a pair of random variables based on observations of the pairs.
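A test of independence on a contingency table can be sketched as follows. The expected count of each cell under independence is the row total times the column total divided by the grand total. The function name `chi2_independence` and the 2×2 table are assumptions made up for illustration; the rows could stand for an antecedent being present or absent, and the columns for a consequent being present or absent.

```python
def chi2_independence(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if row and column variables were independent.
            exp = row_totals[i] * col_totals[j] / n
            stat += (obs - exp) ** 2 / exp
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Hypothetical counts: rows = X present / absent, columns = Y present / absent.
table = [
    [30, 10],
    [20, 40],
]

stat, dof = chi2_independence(table)
print(round(stat, 3), dof)
```

For this table the expected counts are 20, 20, 30, and 30, giving a statistic of about 16.67 with 1 degree of freedom, which exceeds the tabulated 5% critical value of roughly 3.84. The same made-up counts give supports of 0.4 for X, 0.5 for Y, and 0.3 for both together, so the lift of X ⇒ Y would be 1.5.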
The term chi-squared test often refers to tests for which the distribution of the test statistic approaches the χ² distribution asymptotically, meaning that the sampling distribution of the test statistic (if the null hypothesis is true) approximates a chi-squared distribution more and more closely as sample sizes increase.