ABSTRACT

Intensive care units are of vital importance today. These hospital units, which affect many people’s lives, have recently become more crowded. Because of this crowding, patients who need intensive care face life-threatening risks when they cannot get access to these units. The main reason for this situation is that the time a patient will spend in an intensive care unit is not predictable without modelling the system. In this study, we model the intensive care unit as a continuous-time absorbing Markov chain and estimate the length of stay in the intensive care unit by using a phase-type distribution. The study proceeds in the following order: gathering data on the system, modelling the Markov chain with an appropriate number of states, and then applying the phase-type distribution to the model. In the end, the length of stay in intensive care units becomes predictable.

Regression Models

Regression is used for curve fitting, prediction (forecasting), modelling of causal relationships, and testing scientific hypotheses about relationships between variables. Regression analysis is a technique for investigating functional relationships among variables, expressed as an equation or a model connecting the response (dependent) variable and one or more explanatory (predictor) variables.

1) Linear Regression

If we plot our data according to their values on x-y coordinates, the points lie at some distance from each other. The goal is to draw a line that passes as close as possible to all of the data points; this best-fitting line is the regression in a linear structure. Our goal is to represent the distribution of our training data on the plane as a mathematical model so that we can find the correct regression model. For example, if the data set fluctuates, using linear regression will not make sense. Logistic regression may give more successful results in that case, because it tries to capture the data on the plane with a curve (the logistic curve) rather than a straight line.
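
As an illustration of fitting a line to training data, the following is a minimal sketch of ordinary least squares with one predictor. The data values and the variable names (ages, stays) are hypothetical and are not taken from any of the cited studies.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical training data: patient age (years) vs. length of stay (days)
ages = [30, 40, 50, 60, 70]
stays = [2.0, 3.0, 4.0, 5.0, 6.0]

intercept, slope = fit_line(ages, stays)

# Predicted length of stay for a hypothetical 55-year-old patient
predicted = intercept + slope * 55
```

With more than one predictor the same idea applies, but the coefficients are obtained by solving a system of linear equations instead of the two closed-form expressions used here.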

Combes, Kadri and Chaabane (2014) studied the prediction of length of stay in an emergency department. They constructed two different linear regression models: the first model has 4 variables and the second has 8. The second model is more accurate because of its larger number of variables. In their results, a larger number of variables in the linear regression model gave better accuracy, so obtaining the best fit requires choosing the right variables in sufficient number. On this basis, Combes, Kadri and Chaabane (2014) state that linear regression suffers from its reliance on linearity. According to their reliability test, the error was ±2 hours. Moreover, the basic linear regression method is not valid for non-linear variables. In addition, classification and regression are the two methods used for prediction: classification for discrete outcomes and regression for continuous ones (Tan, 2007). Another study also used linear regression to predict length of stay. According to Badreldin et al. (2013), the linear regression model failed in prediction accuracy. The study also suggested that reconsidering the variables could give better predictions. Pourhoseingholi et al. (2009) compared the fit of linear regression and quantile regression and concluded that linear regression fell short when compared with quantile regression.

Combes, C., Kadri, F. & Chaabane, S. (2014, November 5). Predicting hospital length of stay using regression models: Application to emergency department. Optimisation et Simulation - MOSIM’14. Retrieved from https://hal.inria.fr/hal-01081557/document

Tan, P. (2007). Introduction to Data Mining. Pearson Education.

Badreldin, A. M., Doerr, F., Kroener, A., Wahlers, T. & Hekmat, K. (2013). Preoperative risk stratification models fail to predict hospital cost of cardiac surgery patients. Journal of Cardiothoracic Surgery, 8, 126. http://doi.org/10.1186/1749-8090-8-126

Pourhoseingholi, M. A., Pourhoseingholi, A., Vahedi, M., Dehkordi, B. M., Safaee, E., Mserat, E., Ghafarnejad, F. & Zali, M. R. (2009). Comparing linear regression and quantile regression to analyze the associated factors of length of hospitalization in patients with gastrointestinal tract cancers. Italian Journal of Public Health, 6(2). http://dx.doi.org/10.2427/5787

2) Logistic Regression

Logistic regression is a regression model in which the dependent variable is categorical. It is a statistical method for analysing a dataset in which one or more independent variables determine an outcome. The outcome is measured with a binary variable, that is, one with two possible values. The objective of logistic regression is to find the best-fitting model to describe the relationship between the outcome and the independent variables. To estimate the probabilities, it uses the logistic function, in other words the logistic curve. It tries to fit the data on this S-shaped curve. This can lead to higher prediction success on data that moves up and down.
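
The role of the logistic function in estimating probabilities can be sketched as follows. This is only an illustration: the coefficients are hypothetical rather than fitted to real data, and the binary outcome (a "long stay") is an assumed example.

```python
import math


def logistic(z):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(x, intercept, slope):
    """Probability of the outcome for predictor value x under a one-variable model."""
    return logistic(intercept + slope * x)


# Hypothetical coefficients, not estimated from any real dataset
intercept, slope = -4.0, 0.8

# Probability of a "long stay" for a hypothetical severity score of 5
p = predict_probability(5.0, intercept, slope)

# Turn the probability into one of the two categories with a 0.5 threshold
label = 1 if p >= 0.5 else 0
```

In practice the intercept and slope are estimated from the data, usually by maximum likelihood; the sketch only shows how a fitted model turns a predictor value into a probability and then into a category.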

Austin, Tu and Lee (2010) compared the performance of classification techniques for prediction and found that logistic regression gave better results than regression trees, multi-layer perceptron and radial basis function, using the Receiver Operating Characteristic curve and Hierarchical Cluster Analysis as performance measures. In addition, Kurt, Ture and Kurum (2008) compared the accuracy of regression trees with that of a logistic regression model for predicting length of stay in hospital. In their study, logistic regression was more accurate than regression trees. Sharma, Dunn, O’Toole and Kennedy (2015) also studied length of stay using logistic regression; they used psychiatric hospital services for a population of 1.2 million and the IBM SPSS v21 statistical analysis software. They found that the result does not appear to reflect the needs of an acute psychiatric admission service and may indicate a lack of community crisis resolution resources. As a measure, the modal completed length of stay shows something about what is happening in a service, but as a measure of central tendency it is of limited value and lacks sensitivity. In short, the sensitivity of logistic regression is not adequate for length of stay studies, and it does not represent the real system.

Austin, P. C., Tu, J. V. & Lee, D. S. (2010). Logistic regression had superior performance compared with regression trees for predicting in-hospital mortality in patients hospitalized with heart failure. J. Clin. Epidemiol., 63, 1145–1155.

Kurt, I., Ture, M. & Kurum, A. T. (2008). Comparing performances of logistic regression, classification and regression tree, and neural networks for predicting coronary artery disease. Expert Syst. Appl., 34, 366–374.

Sharma, A., Dunn, W., O’Toole, C. & Kennedy, H. G. (2015). The virtual institution: cross-sectional length of stay in general adult and forensic psychiatry beds. International Journal of Mental Health Systems, 9, 25. http://doi.org/10.1186/s13033-015-0017-7

3) Machine Learning Regression

Regression tree algorithms such as CART and CHAID are generally used. CART builds non-parametric, in other words non-linear, regression trees. CHAID is a statistical approach that can also derive regression trees.

One machine learning study took place in a Federal hospital, chosen because of the positive effect of its richness in variables on regression models (Hulshof, 2013). The first variable predicted was patients’ length of stay and the other was readmission (Kelly, 2013). From the studies of this Federal hospital, Pendharkar and Khurana (2014) state that the ANCOVA (Analysis of Covariance) model tests linear relationships and significantly show that non-linear machine learning models may perform marginally better than linear models. This means that machine learning regression fits real data better than linear regression. Moreover, when they looked at the root-mean-square error, no regression technique was better than machine learning regression. However, their sample was limited to a small section of the hospital. In short, the result is not generalizable to other studies.
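
The root-mean-square error used to compare regression techniques can be computed as in the minimal sketch below. The observed values and the two sets of predictions are hypothetical and only show how the comparison works; they do not reproduce the results of the cited study.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between observed and predicted values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical observed lengths of stay (days) and two models' predictions
observed = [3.0, 5.0, 2.0, 8.0]
model_a_predictions = [4.0, 5.0, 3.0, 6.0]
model_b_predictions = [3.0, 4.0, 2.0, 7.0]

rmse_a = rmse(observed, model_a_predictions)
rmse_b = rmse(observed, model_b_predictions)

# The model with the lower RMSE fits these observations more closely
better_model = "A" if rmse_a < rmse_b else "B"
```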

Hulshof, P. J. H., Boucherie, R. J., Hans, E. W. & Hurink, J. L. (2013). Tactical resource allocation and elective patient admission planning in care processes. Health Care Management Science, 16(2), 152–166.

Kelly, M., Sharp, L., Dwane, F., Kelleher, T., Drummond, F. J. & Comber, H. (2013). Factors predicting hospital length-of-stay after radical prostatectomy: a population-based study. BMC Health Services Research, 13(1), 244.

Pendharkar, P. C. & Khurana, H. (2014). Machine learning techniques for predicting hospital length of stay in Pennsylvania federal and specialty hospitals. International Journal of Computer Science and Applications, 11(3). http://www.tmrfindia.org/ijcsa/v11i33.pdf

Markov Chain

Patient activities can be modelled by using Markov chains. A Markov chain describes a system with different states and the transitions between them. It has the memoryless property: the next state depends only on the current state, not on the previous states. From an operational point of view, a Markov chain can be described by its states. The Markov chains estimated from the development datasets were combined with the initial state probability vector to produce the expected length of stay in every goal for intensive care units or hospitals. In a discrete Markov chain, it is not possible to calculate in hours or minutes as with the continuous property. However, if we express the data only in days, recording which days were spent in which department of the hospital, it is possible to create a discrete Markov chain with an absorbing state.
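
To illustrate how a discrete Markov chain with an absorbing state yields an expected length of stay, the sketch below solves (I - Q) t = 1, where Q holds the daily transition probabilities between the transient states and t is the expected number of days before absorption. The three states (ICU, ward, discharged) and all probabilities are hypothetical assumptions, not values from the cited studies.

```python
def expected_days_to_absorption(Q):
    """Solve (I - Q) t = 1 by Gauss-Jordan elimination.

    Q[i][j] is the one-day transition probability from transient state i
    to transient state j. The result t[i] is the expected number of days
    spent in transient states when starting from state i.
    """
    n = len(Q)
    # Augmented matrix [I - Q | 1]
    A = [[(1.0 if i == j else 0.0) - Q[i][j] for j in range(n)] + [1.0]
         for i in range(n)]
    for col in range(n):
        # Partial pivoting: bring the largest entry of this column to the diagonal
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        # Eliminate this column from every other row
        for r in range(n):
            if r != col:
                factor = A[r][col]
                A[r] = [rv - factor * cv for rv, cv in zip(A[r], A[col])]
    return [row[n] for row in A]


# Hypothetical daily transition probabilities between transient states.
# Row 0 = ICU, row 1 = ward; the remaining probability in each row
# (0.1 and 0.3) is the chance of moving to the absorbing state "discharged".
Q = [
    [0.7, 0.2],
    [0.1, 0.6],
]

expected_days = expected_days_to_absorption(Q)
expected_from_icu, expected_from_ward = expected_days
```

Because each step of the chain is one day, the result is only resolved to whole days, which is the limitation of the discrete model discussed in this section.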

Perez, Chan and Dennis (2006) studied length of stay in the intensive care unit using a Markov model with an absorbing state, in other words “first-step analysis” (Kapadia, 2000). The Markov model lacked goodness of fit to the real length of stay data for some departments in the hospital. Because of its discrete property, the Markov model did not account for discharges in the middle of the day. In short, a Markov model with the continuous property is needed in order to model all kinds of service and waiting times. According to Perez’s study (2006), the Markov model with the discrete property also has a positive side, such as a high correlation between utilization and length of stay. However, its outcomes are mostly not reliable and do not fit the real data in continuous terms because of these limitations. Moreover, according to Bhat (2002), the sequence size is important for the Markov chain structure in order to fit the real data best. However, Perez’s study (2006) has only 30 sequences, which are the days of a month. For this reason, the Markov model needs many more sequences (states) to fit the real data better.
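
For comparison with the discrete model, the continuous-time counterpart can be written down using the standard phase-type results. The notation is an assumption introduced here for illustration: T is the time until absorption (the length of stay), \boldsymbol{\alpha} is the initial probability row vector over the transient states (assumed to sum to one), \mathbf{S} is the matrix of transition rates among the transient states, and \mathbf{1} is a column vector of ones.

```latex
\begin{align*}
F(t) &= \Pr(T \le t) = 1 - \boldsymbol{\alpha}\, e^{\mathbf{S} t}\, \mathbf{1}, \qquad t \ge 0, \\
f(t) &= \boldsymbol{\alpha}\, e^{\mathbf{S} t}\, \mathbf{s}, \qquad \mathbf{s} = -\mathbf{S}\,\mathbf{1}, \\
\mathbb{E}[T] &= -\boldsymbol{\alpha}\, \mathbf{S}^{-1}\, \mathbf{1}.
\end{align*}
```

Here t is a continuous quantity, so a discharge in the middle of the day is represented directly instead of being rounded to a whole day.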

Perez, A., Chan, W. & Dennis, R. J. (2006). Predicting the length of stay of patients admitted for intensive care using a first step analysis. Health Serv Outcomes Res Methodol, 6(3-4), 127–138.

Kapadia, A. S., Chan, W., Sachdeva, R., Moye, L. A. & Jefferson, L. S. (2000). Predicting duration of stay in a pediatric intensive care unit: A Markovian approach. European Journal of Operational Research, 124, 353–359.

Bhat, U. N. & Miller, G. K. (2002). Elements of Applied Stochastic Processes (3rd ed.). John Wiley & Sons Inc., Hoboken, New Jersey.