COMPARING VARIOUS MACHINE LEARNING METHODS FOR PREDICTION OF PATIENT REVISIT INTENTION: A CASE STUDY

Numerous methods have been suggested for the analysis of customer intention, from surveys to statistical models. Over the past few years, various machine learning methods have been applied effectively to customer-centric decision-making problems. Patient revisit intention analysis increasingly relies on computerized decision-making models. Computerized decision-making may never take the place of hospital managers, but it can provide decision support via a simple questionnaire. This paper presents a comparative evaluation of the performance of ten widely used machine learning methods (i.e., logistic regression, multilayer perceptron, support vector machines, IBk, KStar, locally weighted learning, DecisionStump, C4.5, RandomTree and reduced error pruning tree), with the aim of suggesting appropriate machine learning techniques for the patient revisit intention prediction problem. Experimental results reveal that the C4.5 tree is the most suitable predictive model, since it has the highest overall average accuracy (95.24%) and low error rates for both Type I (3.40%) and Type II (23.53%) errors, closely followed by locally weighted learning (94.44%, 3.43%, 31.58%) and DecisionStump (94.05%, 3.85%, 30.00%), whereas the logistic regression and IBk algorithms appear to be the worst in terms of average accuracy (87.30% and 88.49%, respectively) and Type II error (70.37% and 68.18%, respectively). In addition, the RandomTree (6.36%) and IBk (6.09%) algorithms appear to be the worst in terms of Type I error. Overall, this study demonstrates the promise of incorporating sentiment classification into patient revisit intention analysis.


INTRODUCTION
Data mining (DM) is vital for customer relationship management (CRM) to analyse huge data streams and gain insight into customer intentions, needs and preferences. This knowledge also assists the design of customer-centric service processes alongside personalized marketing and service activities. These, in turn, help to acquire new customers, retain current customers and build their loyalty in today's competitive globalized markets.
Customer repurchase (revisit) intention is critical for healthcare institutions. Hospital management is responsible for the services offered to patients, and satisfaction with those services determines whether patients prefer the hospital again (Güzel, 2014). In this regard, hospital management plays a vital role in whether customers revisit.
DM may enable healthcare institutions to forecast trends in patient conditions and intentions by analysing data and identifying relationships in apparently unrelated information. Unprocessed data from healthcare institutions are huge. These data provide countless possibilities for hidden pattern investigation for healthcare organizations (Milovic and Milovic, 2012). These patterns and data mining algorithms may provide support for forecasting, diagnosis and treatment planning. Such algorithms can also be integrated into the information systems of healthcare institutions as a decision support tool, helping to avoid human errors (Bushinak et al., 2011).
With the help of DM methods, it is practicable to forecast trends and customer intentions and, in this way, maintain the institution's business success (Milovic and Milovic, 2012).
Healthcare institutions can utilize DM tools in different ways; for example, managers and doctors utilize models for measuring clinical, quality and economic indicators, staff efficiency for optimizing the use of resources, customer satisfaction and loyalty, etc. (Stühlinger et al., 2000).
In the literature, to the best of our knowledge, there are a limited number of descriptive and statistical studies on patient revisit intentions (e.g., Boshoff and Gray, 2004; Al-Refaie, 2011; Sarvari, 2012; Kang et al., 2013; Aliman and Mohamad, 2013; Park and Seo, 2014), but none of them applies ML prediction methods. Moreover, none of these studies applies ML methods to a hospital managerial problem.
In this study, 15 variables were used to determine revisit intention, using binary classification and a five-point Likert scale, in a hospital in central Erzurum.
Binary classification may be seen as a universal approach to support decision making in customer-centric applications. That is, patients can be categorized into those who will revisit and those who will not.
It is vital to understand patient motivation and satisfaction after receiving service. We examine this with a questionnaire and an ML-based empirical study.
This paper presents a comparative evaluation of the performance of ten widely used ML algorithms (logistic regression, multilayer perceptron, support vector machines, IBk, KStar, locally weighted learning, DecisionStump, C4.5, RandomTree and reduced error pruning tree) on the patient revisit intention prediction problem. The study demonstrates the promise of incorporating sentiment classification into patient revisit intention analysis.
The rest of the paper is organized as follows. In Section 2, the ML methods used in the study are introduced. In Section 3, the data set is described and the experimental design is illustrated. In Section 4, the prediction models are compared and the results are evaluated. Finally, Section 5 presents the main conclusions and future directions of the study.

MACHINE LEARNING METHODS
Conventional prediction models, such as logistic regression and discriminant analysis, have been used to predict consumer repurchase intention. However, there is great potential in applying more efficient ML techniques to analyse consumer behaviour and repurchase intention. ML algorithms estimate an unknown dependency between the inputs and the output from a dataset (Tüfekci, 2014).
Table 1 lists the ML techniques utilized in this study. We grouped these techniques into three categories, following the Waikato Environment for Knowledge Analysis (WEKA) platform: Functions, Lazy-learning Algorithms, and Tree-based Learning Algorithms.

Logistic Regression (Logit)
Logit is a regression model for binomially distributed dependent variables. It is practical for modeling the probability of an event occurring as a function of other factors. Logistic regression is a generalized linear model that uses the logit as its link function. In the literature, other widely used names for logistic regression in various application areas are logistic model, logit model, and maximum-entropy classifier (Hosmer and Stanley, 2000).
Binary logit is a type of regression used when the dependent variable is dichotomous and the independent variables are of any type. Multinomial logistic regression is useful when the dependent variable has more than two classes. When the multiple classes of the dependent variable can be ranked, ordinal logistic regression is preferred to multinomial logit. Only discrete (not continuous) variables can be used as dependent variables in logit. Logit applies maximum likelihood estimation after transforming the dependent variable into a logit variable. In this way, logit predicts the probability of occurrence of a certain event (Chandra et al., 2009).
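As a minimal sketch of how a binary logit model turns a linear score into a probability, the following Python snippet applies the logit link and its inverse. The coefficient values and the two satisfaction scores are hypothetical and are not estimates from this study.

```python
import math

def logistic(z):
    """Inverse of the logit link: maps a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Logit link: the log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

def revisit_probability(x, weights, bias):
    """P(revisit = Y | x) under a binary logit model with given coefficients."""
    score = bias + sum(w * xi for w, xi in zip(weights, x))
    return logistic(score)

# Hypothetical coefficients for two 5-point Likert items (illustration only).
weights = [0.8, 0.5]
bias = -3.0
p = revisit_probability([4, 5], weights, bias)
label = "Y" if p >= 0.5 else "N"
```

In practice the coefficients would be obtained by maximum likelihood estimation rather than set by hand.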

Multilayer Perceptron (MLP)
Multilayer perceptrons, one of the neural network approaches, have superior generalisation potential for capturing complex interrelations between inputs and outputs. MLP is often trained with the error back-propagation algorithm, based on the error-correction learning rule (Aydogmus and Turkan, 2016).
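A common MLP can be mathematically described by Eqs. (1)-(5) (Erdal et al., 2013; Erdal, 2015; Namlı et al., 2016). The output signal for the $l$th neuron in the $n$th layer is computed by

$$y_l^n(t) = \varphi\left[u_l^n(t)\right], \qquad u_l^n(t) = \sum_{j=1}^{p} w_{lj}^n(t)\, y_j^{n-1}(t) + \Phi_l^n \tag{1}$$

where $w_{lj}^n(t)$ is the connection weight from the $j$th neuron of the previous layer, $t$ is the time index, and $\Phi_l^n$ is the bias. For an $n$-layer network, the synaptic weight $w_{lj}^n(t)$ is computed by

$$w_{lj}^n(t+1) = w_{lj}^n(t) + \Delta w_{lj}^n(t) \tag{2}$$

and it could be revised as

$$\Delta w_{lj}^n(t) = \eta\, \lambda_l^n(t)\, y_j^{n-1}(t) \tag{3}$$

where $\eta$ is the learning rate, and

$$\lambda_l^n(t) \equiv -\frac{\partial E(t)}{\partial u_l^n(t)} \tag{4}$$

is the local error gradient of the error $E(t)$. For the output layer, the local error gradient is computed by

$$\lambda_l^n(t) = \left[\delta_l(t) - y_l^n(t)\right] \varphi'\left[u_l^n(t)\right] \tag{5}$$

where $\delta_l(t)$ is the goal output signal, and $\varphi(\cdot)$ is the activation function.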
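The error-correction rule can be illustrated with a short sketch for a single output neuron. It assumes the standard delta-rule form, in which each weight changes by the learning rate times the local error gradient times the incoming signal, and a sigmoid activation; the weights, inputs, target and learning rate are hypothetical values chosen only for illustration.

```python
import math

def sigmoid(u):
    """Sigmoid activation function (an assumption; other activations are possible)."""
    return 1.0 / (1.0 + math.exp(-u))

def neuron_output(weights, bias, inputs):
    """Output signal: activation applied to the weighted sum of inputs plus bias."""
    u = bias + sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(u)

def output_layer_update(weights, bias, inputs, target, eta=0.5):
    """One error-correction step for an output neuron."""
    y = neuron_output(weights, bias, inputs)
    # Local error gradient: (goal output - actual output) * derivative of sigmoid.
    local_gradient = (target - y) * y * (1.0 - y)
    new_weights = [w + eta * local_gradient * x for w, x in zip(weights, inputs)]
    new_bias = bias + eta * local_gradient
    return new_weights, new_bias

# Hypothetical starting point.
weights, bias = [0.1, -0.2], 0.0
inputs, target = [1.0, 0.5], 1.0
error_before = abs(target - neuron_output(weights, bias, inputs))
weights, bias = output_layer_update(weights, bias, inputs, target)
error_after = abs(target - neuron_output(weights, bias, inputs))
```

After one step the output moves towards the goal signal, so `error_after` is smaller than `error_before`.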

Support Vector Machines (SVMs)
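SVMs were originally developed by Vapnik (1995) to handle classification problems and have been increasingly used in different forecasting problems (Aydogmus et al., 2015). SVMs can be defined as follows (Erdal et al., 2013; Aydogmus et al., 2015; Ozturk et al., 2015):

$$f(x) = \langle w, \phi(x) \rangle + b \tag{6}$$

where $\langle \cdot, \cdot \rangle$ denotes the dot product in $\mathbb{R}^n$, $\phi$ is a non-linear transformation, and $w$ and $b$ are coefficients. The coefficients are estimated by minimizing the regularized risk function (7):

$$R(C) = C\, \frac{1}{m} \sum_{i=1}^{m} L\left(d_i, f(x_i)\right) + \frac{1}{2} \lVert w \rVert^2 \tag{7}$$

where $L$ is the loss function, $d_i$ is the target value of the $i$th of $m$ training instances, and $C$ is a prescribed parameter (Erdal and Ekinci, 2013; Ozturk et al., 2016).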

IBk
IBk is an instance-based learner that works as a k-nearest-neighbor classifier; it is one of the most widely used instance-based (lazy) methods for both classification and regression problems. In this paper, it is used for a classification problem. The main assumption behind this algorithm is that the instances closest to the query point have target values similar to that of the query (Jiawei and Kamber, 2001).
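The nearest-neighbor idea can be sketched in a few lines of Python. This is a plain k-nearest-neighbor majority vote with Euclidean distance, not WEKA's IBk implementation; the training instances below are hypothetical answers to two Likert items.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority vote among the k training instances nearest to `query`.

    `train` is a list of (features, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical answers to two 5-point Likert items with revisit labels.
train = [
    ((5, 4), "Y"), ((4, 5), "Y"), ((5, 5), "Y"),
    ((1, 2), "N"), ((2, 1), "N"),
]
prediction_low = knn_predict(train, (2, 2), k=3)
prediction_high = knn_predict(train, (4, 4), k=3)
```

No model is built in advance: all the work happens at query time, which is why such methods are called lazy.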

KStar (K*)
KStar is also an instance-based classifier, developed for regression with a generalized distance function based on transformations (Türkan et al., 2016). The KStar algorithm uses an entropic measure based on the probability of transforming one instance into another by randomly choosing between all possible transformations (Painuli et al., 2014).

Locally Weighted Learning (LWL)
LWL uses an instance-based algorithm that assigns instance weights. This algorithm can perform both classification and regression. The principal idea behind LWL is that any non-linearity can be approximated by a linear model if the output surface is smooth (Türkan et al., 2016). Therefore, instead of looking for a complex global model, it is easier to approximate non-linear functions by using simple local models (Arif et al., 2001).

DecisionStump (DStump)
DStump is a decision tree that uses just a single attribute for splitting. It constructs one-level binary decision trees for datasets with a categorical or numeric class, handling missing values by treating them as a separate value and extending a third branch from the stump (Witten et al., 2011).
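A one-level tree on a single numeric attribute can be sketched as follows. The split criterion used here (fewest training errors) is a simplifying assumption and is not claimed to match WEKA's DecisionStump; missing values are not handled, and the data are hypothetical.

```python
from collections import Counter

def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]

def fit_stump(values, labels):
    """Return (threshold, left_label, right_label) with the fewest training
    errors for the rule: value <= threshold -> left_label, else right_label.
    Assumes `values` contains at least two distinct numbers.
    """
    best = None
    for threshold in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        left_label, right_label = majority(left), majority(right)
        errors = (sum(y != left_label for y in left)
                  + sum(y != right_label for y in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    return best[1:]

def predict_stump(stump, value):
    threshold, left_label, right_label = stump
    return left_label if value <= threshold else right_label

# Hypothetical satisfaction scores (1-5) and revisit labels.
scores = [1, 1, 2, 3, 4, 5, 5]
labels = ["N", "N", "N", "Y", "Y", "Y", "Y"]
stump = fit_stump(scores, labels)
```

On this toy data the stump separates scores of 2 or less from higher scores.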

C4.5
C4.5 is one of the most popular decision tree classifiers and is based on information-theoretic concepts. It examines the normalised information gain (entropy difference) that results from choosing an attribute for splitting the data. The attribute with the highest normalised information gain is the one used to make the decision (Brown and Mues, 2012). The algorithm is a successor of ID3, which determines at each step the most predictive attribute and splits a node based on this attribute. Every node represents a decision point over the value of some attribute (Al Snousy et al., 2011). The algorithm then recurs on the smaller subsets.
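The attribute-selection step can be illustrated with a small sketch of normalised information gain (gain ratio) for categorical attributes. This shows only the selection criterion, not the full C4.5 algorithm; the attribute names and values are hypothetical.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (in bits) of the value distribution in `items`."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(items).values())

def gain_ratio(attribute_values, labels):
    """Information gain of splitting on the attribute, divided by the
    entropy of the split itself (the normalisation term)."""
    total = len(labels)
    groups = {}
    for value, label in zip(attribute_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    information_gain = entropy(labels) - conditional
    split_info = entropy(attribute_values)
    return information_gain / split_info if split_info > 0 else 0.0

# Hypothetical data: attribute A separates the classes, attribute B barely helps.
labels = ["Y", "Y", "Y", "N", "N", "N"]
attribute_a = ["high", "high", "high", "low", "low", "low"]
attribute_b = ["a", "b", "a", "b", "a", "b"]
ratios = {"A": gain_ratio(attribute_a, labels),
          "B": gain_ratio(attribute_b, labels)}
best_attribute = max(ratios, key=ratios.get)
```

The attribute with the highest ratio would be chosen for the split, and the same procedure would then be repeated on each resulting subset.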

RandomTree (RTree)
RTree is a regression-based decision tree algorithm. Trees built by RTree consider K randomly selected attributes at each node. RTree performs no pruning and also has an option to allow prediction of class probabilities based on a hold-out set (backfitting) (Erdal and Karahanoğlu, 2016).

Reduced Error Pruning Tree (REPTree)
The REPTree algorithm generates a tree utilizing the node statistics and prunes it utilizing reduced-error pruning (Portnoy and Koenker, 1997). One can set the minimum number of instances per leaf, the maximum tree depth, the minimum proportion of training set variance for a split (numeric classes only), and the number of folds for pruning (Erdal and Karahanoğlu, 2016).

EXPERIMENTAL DESIGN

Data Acquisition
A customer questionnaire was conducted to collect the required data. The questionnaire was prepared using items and questions written solely for the purpose of this study. It was pre-tested on several customers, and iterative changes were made to vague items and expressions. The questionnaire was composed of three sections, covering demographic profile, characteristics of service, and patients' intentions.
The questionnaire was applied to adult patients who received services from 17 clinics in Palandöken Hospital in Erzurum, Turkey. Among them, 53.6% were male and 46.4% were female.
When the underlying questionnaire was conducted, the total number of the patients in the hospital was 698. The relevant sample size of the questionnaire was calculated with respect to the following formula (Özer, 2004):

$$n = \frac{N P Q Z^2}{(N-1)\, d^2 + P Q Z^2}$$

where $n$ denotes the sample size; $N$ denotes the population size (here, the number of patients in that year); $P$ is the probability of occurrence of a given event (the ratio of respondents who revisited the hospital); $Q$ equals $1 - P$; $Z$ denotes the test statistic at the $(1 - \alpha)$ confidence level; and finally $d$ denotes the tolerance. In this respect, the minimum representative sample size of the survey can be calculated as

$$n = \frac{608003\,(0.2025)(0.7975)(1.96)^2}{(608003 - 1)(0.05)^2 + (0.2025)(0.7975)(1.96)^2} \approx 248$$

During the data collection procedure, 252 questionnaires were transformed and coded into a computer-ready form; this number exceeds the minimum sample size, and there were no missing values.
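The sample size calculation can be checked with a short Python sketch. The function name is hypothetical; the input values (N = 608003, P = 0.2025, Z = 1.96, d = 0.05) are those appearing in the worked substitution in the text.

```python
def minimum_sample_size(N, P, Z, d):
    """Sample size n = N*P*Q*Z^2 / ((N - 1)*d^2 + P*Q*Z^2), with Q = 1 - P."""
    Q = 1.0 - P
    numerator = N * P * Q * Z ** 2
    denominator = (N - 1) * d ** 2 + P * Q * Z ** 2
    return numerator / denominator

n = minimum_sample_size(N=608003, P=0.2025, Z=1.96, d=0.05)
n_rounded = round(n)  # rounding to the nearest integer gives 248
```

The 252 coded questionnaires therefore exceed the computed minimum.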
The dependent variable of this study was the patient revisit intention, which was binary in nature, where Y = will revisit and N = will not.
Variables were measured using a 5-point Likert scale ranging from 'Not at all satisfied (NAS)' to 'Highly satisfied (HS)'.
According to the descriptive statistics of the dependent variable, most of the respondents (91.67%) have a high intention to revisit the corresponding hospital. Table 2 presents the descriptive statistics of patients in the corresponding hospital.

Evaluation criteria
In this paper, a combination of Type-I error, Type-II error and average accuracy, rather than a single measure, is used to evaluate the patient revisit intention prediction methods. Predicting a patient as revisited when the patient actually did not revisit is an error; this kind of error is described as a Type-I error. Similarly, predicting a patient as not revisited when the patient actually revisited is also an error; this kind of error is described as a Type-II error. From a theoretical point of view, it is better to utilize prediction models with lower Type-I errors, but in practice it is also of great importance for hospitals to achieve an appropriate balance between both error types so as not to assess potential customers as not-revisited ones (Ekinci and Erdal, 2011; Erdal and Ekinci, 2015). These measures can be defined with respect to a confusion matrix, as shown in Table 3.

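Table 3. Confusion matrix

                                      Actual revisited       Actual not-revisited
Predicted revisited (positive)        True positive (TP)     False positive (FP)
Predicted not-revisited (negative)    False negative (FN)    True negative (TN)

Type-I error, Type-II error and Average accuracy can be mathematically described by Eqs. (9)-(11) (Marqués et al., 2012; Yaprakli and Erdal, 2016):

$$\text{Type-I error} = \frac{FP}{TP + FP} \tag{9}$$

$$\text{Type-II error} = \frac{FN}{FN + TN} \tag{10}$$

$$\text{Average accuracy} = \frac{TP + TN}{TP + FP + FN + TN} \tag{11}$$

As stated by Caouette et al. (2008), the misclassification costs associated with Type-I errors are typically much higher than those associated with Type-II errors. Moreover, accuracy ignores the different costs of Type-I and Type-II errors.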
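A short sketch shows how the three measures follow from the four cells of the confusion matrix. It assumes that Type-I error is FP / (TP + FP) and Type-II error is FN / (FN + TN), i.e. errors as a share of each predicted class. These definitions and the four counts below are not stated in the paper; they are back-calculated from the percentages reported for C4.5 and the sample of 252 questionnaires.

```python
def type_i_error(tp, fp, fn, tn):
    """Share (%) of patients predicted as revisited who did not revisit."""
    return 100.0 * fp / (tp + fp)

def type_ii_error(tp, fp, fn, tn):
    """Share (%) of patients predicted as not revisited who did revisit."""
    return 100.0 * fn / (fn + tn)

def average_accuracy(tp, fp, fn, tn):
    """Share (%) of all patients classified correctly."""
    return 100.0 * (tp + tn) / (tp + fp + fn + tn)

# Back-calculated (assumed) counts for C4.5; they sum to 252 instances.
counts = dict(tp=227, fp=8, fn=4, tn=13)

accuracy = round(average_accuracy(**counts), 2)  # 95.24
type_1 = round(type_i_error(**counts), 2)        # 3.4
type_2 = round(type_ii_error(**counts), 2)       # 23.53
```

Under these assumptions the counts reproduce the three values reported for C4.5 (95.24%, 3.40%, 23.53%).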

Experimental procedure
The common practice for evaluating repurchase intention models is to use a sufficiently large sample, since huge sets of past applicants are mostly available. In some conditions, the data are inadequate to create an accurate prediction, so other approaches have to be utilized to obtain better prediction performance. In this framework, the most widely used approach is cross-validation (Marqués et al., 2012).
The evaluation is conducted with k-fold cross-validation, which is used to minimize the bias associated with the random sampling of the training and testing data when comparing the prediction accuracy of various models. Kohavi (1995) stated that 10 folds were optimal. In this study, we used 10-fold cross-validation.
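The partitioning behind k-fold cross-validation can be sketched as follows. This is a plain, unstratified split written for illustration; the function name and the random seed are hypothetical, and the sketch is not the procedure implemented in WEKA.

```python
import random

def k_fold_indices(n_instances, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds.

    Every instance is used for testing exactly once and for training
    in the remaining k - 1 folds.
    """
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 252 questionnaires split into 10 folds of 25 or 26 instances each.
splits = list(k_fold_indices(252, k=10))
fold_sizes = [len(test) for _, test in splits]
```

A model would be trained on each `train` set and evaluated on the matching `test` set, and the performance indicators would then be averaged over the ten folds.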
As mentioned above, the ML algorithms used in this paper are divided into three categories: Functions, Lazy-learning Algorithms, and Tree-based Learning Algorithms. Functions classify the test data based on a specific function. Logit, MLP and SVM are the most widely used function algorithms. Lazy-learning algorithms, unlike other classification or regression techniques, save the training data and build the model after receiving the test data. IBk, K* and LWL are the most common lazy classifiers and are the ones used in this study. Tree-based learning algorithms classify the test data by building a decision tree. In tree-based learning algorithms, internal nodes are tests on an attribute value or on multiple attribute values, branches are outcomes of the tests, and leaf nodes are class labels (Moshtari et al., 2013). DStump, C4.5, RTree and REPTree are four instances of tree-based learning algorithms. The experimental procedure is presented in Figure 1.

RESULTS AND DISCUSSIONS
In this paper, the data mining toolkit WEKA version 3.7.11 was utilized for the case study. All the ML algorithms are used with their default parameter settings, as defined in WEKA 3.7.11, to reduce the danger of overfitting due to excessive parameter tuning.
Our goal in this study is to investigate which ML technique is the most appropriate for the patient revisit intention prediction problem.
To evaluate the ten ML techniques, we averaged the performance indicators. Table 4 summarizes the performance indicators of the predictive models on the gathered dataset.
In terms of average accuracy, all ten algorithms classify more than 87% of the instances correctly. For all three performance indicators presented in Table 4 and Figures 2-4, C4.5 performs best (95.24%, 3.40%, 23.53%), since it has the highest overall Average Accuracy and low error rates for both Type I and Type II errors, closely followed by LWL (94.44%, 3.43%, 31.58%) and DStump (94.05%, 3.85%, 30.00%), whereas the Logit and IBk algorithms appear to be the worst in terms of Average Accuracy (87.30% and 88.49%, respectively) and Type II error (70.37% and 68.18%, respectively). In addition, the RTree (6.36%) and IBk (6.09%) algorithms appear to be the worst in terms of Type I error.
As the table and figures show, the tree-based learning algorithms yield strong prediction performance. C4.5 uses an entropy-based segmentation algorithm and is widely used for building decision trees. C4.5 can generate decision trees from numeric values, and it also offers a way to build decision trees when there are missing values. The C4.5 algorithm makes it possible to classify datasets that have quantitative attributes, both continuous and discrete. In order to handle continuous attributes, C4.5 creates a threshold and then splits the list into those instances whose attribute value is above the threshold and those whose value is less than or equal to it (Özsoy et al., 2015). C4.5 has repeatedly been observed to perform well on datasets with many quantitative variables and missing values; the present dataset likewise contains many quantitative variables.

CONCLUSION AND FUTURE DIRECTIONS
Patient revisit intention analysis increasingly relies on computerized decision-making models. Computerized decision-making may never take the place of hospital managers, but it can provide decision support via a simple questionnaire.
Specifically, this study has demonstrated the promise of incorporating sentiment classification into patient revisit intention analysis.
The findings of this paper are likely to lead to a new trend of managerial processing in the healthcare sector. This could influence the design of managerial systems in healthcare organizations in different functional areas. In addition, the empirical results of this study can provide insights into how ML techniques can provide decision support about patients' intentions.
This paper has focused on a comparative analysis of various widely used prediction algorithms for patient revisit intention prediction. With this aim, ten ML techniques have been applied to a hospital managerial problem.

Experimental results reveal that the C4.5 tree is the most suitable predictive model, since it has the highest overall average accuracy and low error rates for both Type I and Type II errors, closely followed by locally weighted learning and DecisionStump, whereas the logistic regression and IBk algorithms appear to be the worst in terms of average accuracy and Type II error. In addition, the RandomTree and IBk algorithms appear to be the worst in terms of Type I error.
For further research, the present analysis could be extended to other individual ML algorithms and ensemble approaches, as well as to other sectors and studies on customer relations. A natural extension of this research is to expand the number of hospitals and provinces. Although this study has analyzed data for central Erzurum, the applicability of the presented classification methods to other hospitals remains unknown and is thus worthwhile for future investigation. Another direction for future research is a longitudinal study: since customers change their perceptions frequently, it would be interesting to compare the findings between different time periods.

Table 1. ML methods utilized in this study

Table 2. Descriptive statistics of variables

Table 4. Performance statistics of proposed predictive models

Figures 2-4. Performance indicators, i.e., Type-I error, Type-II error and Average Accuracy.