Shopping intention prediction using decision trees
Introduction: Price is often considered a neglected element of the marketing mix because of the complexity of price management and the sensitivity of customers to price changes. Of all marketing mix elements, a change in price draws the fastest customer reaction. Accordingly, the process of making shopping decisions can be very challenging for customers.
Objective: The aim of this paper is to create a model that predicts shopping intention by classifying respondents into one of two categories, depending on whether or not they intend to shop.
Methods: The data sample consists of 305 respondents, all older than 18 years and involved in buying groceries for their household. The research was conducted in February 2017. The model was created using the decision tree method, with several of its classification algorithms compared.
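To illustrate how a decision tree algorithm chooses the attribute to split on, the following is a minimal sketch of the gain ratio criterion used by C4.5, the algorithm that WEKA implements as J48. It is a simplified illustration, not the authors' procedure: the toy respondents, the attribute names (`value_for_money`, `price_level`) and the target name (`intends_to_shop`) are hypothetical, and C4.5's additional heuristics (pruning, handling of continuous attributes and missing values) are left out.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(rows, attribute, target):
    """Information gain of splitting on `attribute`, divided by the split information."""
    labels = [row[target] for row in rows]
    # Group the class labels by the value the attribute takes.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Weighted entropy of the class labels after the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = entropy([row[attribute] for row in rows])
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data, invented for illustration only.
rows = [
    {"value_for_money": "high", "price_level": "low", "intends_to_shop": "yes"},
    {"value_for_money": "high", "price_level": "high", "intends_to_shop": "yes"},
    {"value_for_money": "high", "price_level": "low", "intends_to_shop": "yes"},
    {"value_for_money": "low", "price_level": "low", "intends_to_shop": "no"},
    {"value_for_money": "low", "price_level": "high", "intends_to_shop": "no"},
    {"value_for_money": "low", "price_level": "high", "intends_to_shop": "no"},
]

attributes = ["value_for_money", "price_level"]
best_attribute = max(attributes, key=lambda a: gain_ratio(rows, a, "intends_to_shop"))
print(best_attribute)
```

In this toy sample the first attribute separates the two classes perfectly, so it would be placed at the root of the tree; the algorithm then repeats the same selection on each resulting subset.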
Results: All models, except the one that used the RandomTree algorithm, achieved a relatively high classification rate (over 80%). The J48 and RandomForest algorithms gave the highest classification accuracy, 84.75%. Since there was no statistically significant difference between those two algorithms, the authors chose the J48 algorithm to build the decision tree.
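The comparison of two classifiers on the same test cases can be sketched as follows. The abstract does not state which significance test was applied, so this example assumes a McNemar test with continuity correction, which is one common choice for paired classifier comparisons. The per-case correctness flags are hypothetical and do not reproduce the study's data.

```python
import math


def accuracy(correct_flags):
    """Share of test cases that were classified correctly."""
    return sum(correct_flags) / len(correct_flags)


def mcnemar(correct_a, correct_b):
    """McNemar test with continuity correction on paired correctness flags.

    Returns the chi-square statistic (1 degree of freedom) and its p-value.
    """
    only_a = sum(1 for a, b in zip(correct_a, correct_b) if a and not b)
    only_b = sum(1 for a, b in zip(correct_a, correct_b) if b and not a)
    if only_a + only_b == 0:
        return 0.0, 1.0  # the classifiers never disagree
    statistic = (abs(only_a - only_b) - 1) ** 2 / (only_a + only_b)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical outcomes for 20 test cases (True = classified correctly):
# 13 cases both correct, 4 only classifier A, 2 only classifier B, 1 both wrong.
correct_a = [True] * 13 + [True] * 4 + [False] * 2 + [False] * 1
correct_b = [True] * 13 + [False] * 4 + [True] * 2 + [False] * 1

statistic, p_value = mcnemar(correct_a, correct_b)
print(f"accuracy A = {accuracy(correct_a):.2%}, accuracy B = {accuracy(correct_b):.2%}")
print(f"McNemar statistic = {statistic:.3f}, p = {p_value:.3f}")
```

With these invented counts the p-value is well above 0.05, so the difference between the two classifiers would not be considered significant, which is the kind of outcome that justifies preferring the simpler, more interpretable model.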
Conclusions: Value for money and the price level in the store were the most significant variables for classifying shopping intention. Future research will compare this model with other data mining techniques, such as neural networks or support vector machines, since these techniques achieved very good accuracy in previous research in this field.
Copyright (c) 2017 Millenium - Journal of Education, Technologies, and Health
This work is licensed under a Creative Commons Attribution 4.0 International License.