Hazard Rate Function Estimation Using Erlang Kernel

In this paper, we define the Erlang kernel and use it for nonparametric estimation of the probability density function (pdf) and the hazard rate function for independent and identically distributed (iid) data. The bias, variance and optimal bandwidth of the proposed estimator are investigated. Moreover, the asymptotic normality of the proposed estimator is investigated. The performance of the proposed estimator is tested using a simulation study and real data.


Introduction
Hazard rate functions can be used for several statistical analyses in medicine, engineering and economics. For instance, they are commonly used when presenting results in clinical trials involving survival data.
Several methods for hazard function estimation have been considered in the literature. Hazard function estimation by nonparametric methods has an advantage in flexibility because no formal assumptions are made about the mechanism that generates the sample other than randomness. Estimators of the hazard function based on kernel smoothing have been studied extensively. For instance, see Salha [7, 8], Scaillet [5], Watson and Leadbetter [12] and Rice and Rosenblatt [6].
The performance of the estimator at boundary points differs from that at interior points because of the so-called "boundary effects" that occur in nonparametric curve estimation problems, which show up more specifically in the bias of the estimator at boundary points. To remove those boundary effects in kernel density estimation, a variety of methods have been developed in the literature. Some well-known methods are summarized below: (i) the reflection method (Cline and Hart [3]; Schuster [10]; Silverman [11]); (ii) the local linear method (Cheng [2]; Zhang and Karunamuni [13]).
Chen [1] solved this problem by replacing the symmetric kernels with an asymmetric Gamma kernel, which never assigns weight outside the support.
Many researchers working on estimation have used specific distributions, such as the normal, log-normal, gamma and inverse-gamma distributions. This led us to pose the following question: are there other distributions that can be used as a kernel in the estimation and still give acceptable results?
In this paper, we propose the Erlang kernel, which also never assigns weight outside the support. The Erlang distribution is a continuous probability distribution with wide applicability, primarily due to its relation to the exponential and Gamma distributions. The Erlang distribution was developed by A. K. Erlang to examine the number of telephone calls which might be made at the same time to the operators of the switching stations.
This paper is organised in six sections. In the first section, we present some information about kernel smoothing, hazard functions and the proposed kernel. In the second section, we introduce some definitions and relations, and state the conditions under which the results of the paper will be proved. In the third section, we investigate the bias, variance and optimal bandwidth of the Erlang kernel estimator. In the fourth section, we investigate the asymptotic normality of the Erlang kernel estimator of the pdf and of the hazard function estimator. In the fifth section, the performance of the proposed estimator is tested via two applications, one with simulated data and one with real-life data. In the sixth section, we give comments and the conclusion.

Preliminaries
In this section, we state the conditions under which the results of the paper will be proved. We also recall the definition of the gamma function and some relations related to it, and then define the Erlang kernel.

Conditions
(1) Let $\{x_i\}_{i=1}^{n}$ be a random sample from a distribution with unknown probability density function $f$ defined on $[0, \infty)$ such that $f$ has a continuous second derivative.
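(2) $0 < \int_0^\infty x^4 (f''(x))^2\,dx < \infty$ and $\int_0^\infty \frac{f(x)}{x}\,dx < \infty$.
(3) $h$ is a smoothing parameter satisfying $h$
The Gamma function is defined to be the improper integral: $\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\,dt$.
The Taylor expansion about zero of $\frac{1}{\Gamma(x)}$ is available and given by: $\frac{1}{\Gamma(x)} = x + \gamma x^2 + O(x^3)$, where $\gamma$ is the Euler-Mascheroni constant.
Also, throughout this paper we will approximate the beta function. This approximation is used because we are concerned with values of $h$ close to zero.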
In this paper we consider the following Erlang kernel function:
If the random variable $Y$ has this kernel as its pdf, then $E(Y) = x$, and the variance is $\operatorname{var}(Y) = \frac{x^2 h}{1+h}$. We propose the following estimator of the probability density function $f(\cdot)$, the Erlang estimator:

Bias, Variance and Optimal Bandwidth
In this section, the bias and variance will be evaluated. Then we will discuss the optimal bandwidth of the Erlang kernel estimator.
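As a point of reference for the kernel and estimator introduced in the Preliminaries, the following minimal sketch uses a gamma density with shape $1 + 1/h$ and scale $xh/(1+h)$. This form is an assumption: it is chosen only because it reproduces the stated moments $E(Y) = x$ and $\operatorname{var}(Y) = x^2 h/(1+h)$. The estimator as a sample average of the kernel is likewise assumed, and the function names are hypothetical.

```python
import numpy as np
from scipy.stats import gamma


def erlang_kernel(y, x, h):
    """Assumed kernel: gamma density in y with shape 1 + 1/h and scale x*h/(1 + h).

    Requires x > 0 and h > 0; the shape is an integer (a true Erlang law)
    only when 1/h is an integer.
    """
    shape = 1.0 + 1.0 / h
    scale = x * h / (1.0 + h)
    return gamma.pdf(y, a=shape, scale=scale)


def kernel_moments(x, h):
    """Mean and variance of the assumed kernel, for comparison with the text."""
    shape = 1.0 + 1.0 / h
    scale = x * h / (1.0 + h)
    mean, var = gamma.stats(a=shape, scale=scale, moments="mv")
    return float(mean), float(var)


def erlang_density_estimate(x, sample, h):
    """Assumed estimator: sample average of the kernel centred at x."""
    sample = np.asarray(sample, dtype=float)
    return float(np.mean(erlang_kernel(sample, x, h)))


if __name__ == "__main__":
    x, h = 2.0, 0.25
    print(kernel_moments(x, h))  # compare with x and x**2 * h / (1 + h)
    print(erlang_density_estimate(1.0, [0.4, 0.9, 1.3, 2.1], h))
```

Because the kernel is a density in $y$ supported on $[0, \infty)$, it places no weight on negative values, which is the property the paper relies on near the boundary.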

Proposition 1.
The bias of the proposed estimator is given by:
Proof: where $\xi_x$ follows an Erlang distribution with mean $E(\xi_x) = x$. Hence,
The variance of the proposed estimator is given by:
Proof:
Optimal Bandwidth: First of all, we define the mean squared error (MSE) and the mean integrated squared error (MISE) as follows:
where the integrals $\int_0^\infty \frac{f(x)}{x}\,dx$ and $\int_0^\infty x^4 (f''(x))^2\,dx$ appear. Therefore, we can approximate the MISE to be:
We now find the optimal bandwidth by minimizing (5) with respect to $h$, so we have
Setting (6) equal to zero yields the optimal bandwidth $h_{opt}$ for the given pdf and kernel:
In addition, (7) proves that this value minimizes (5). Substituting (8) for $h$ in (5) gives the minimum MISE for the given pdf and kernel, which is given by:
Note that $h_{opt}$ depends on the sample size $n$, on the kernel and on the unknown pdf.
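The role of $\xi_x$ in the bias can be illustrated numerically. The sketch below assumes that the expectation of the estimator equals $E[f(\xi_x)]$ with $\xi_x$ gamma distributed (shape $1 + 1/h$, scale $xh/(1+h)$), and compares the resulting bias with the generic second-order Taylor term $\tfrac{1}{2} f''(x)\operatorname{var}(\xi_x)$. That term is a standard approximation and is not claimed to be the exact expression of Proposition 1; the function names are hypothetical.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import gamma


def expected_estimate(f, x, h):
    """E[f(xi_x)] by quadrature, xi_x ~ Gamma(shape 1 + 1/h, scale x*h/(1 + h))."""
    shape = 1.0 + 1.0 / h
    scale = x * h / (1.0 + h)
    upper = x + 15.0 * np.sqrt(shape) * scale  # mean plus 15 standard deviations
    value, _ = quad(lambda y: f(y) * gamma.pdf(y, a=shape, scale=scale),
                    0.0, upper, points=[x])
    return value


def taylor_bias(f_second, x, h):
    """Generic second-order Taylor term: 0.5 * f''(x) * var(xi_x)."""
    return 0.5 * f_second(x) * x**2 * h / (1.0 + h)


if __name__ == "__main__":
    f = lambda y: np.exp(-y)  # standard exponential pdf, for which f'' = f
    for h in (0.2, 0.1, 0.05):
        exact = expected_estimate(f, 1.0, h) - f(1.0)
        print(h, exact, taylor_bias(f, 1.0, h))
```

For the exponential pdf the expectation also has the closed form $(1 + \text{scale})^{-\text{shape}}$, which gives an independent check on the quadrature.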

Asymptotic normality
In this section, we state the two main theorems on the asymptotic normality of the proposed estimator, together with an important lemma that is used in the second main theorem.
Definition 1. A hazard rate function is defined as the probability of an event happening in a short time interval. More precisely, it is defined as: $\lim_{\Delta x \to 0} \frac{P(x < X \le x + \Delta x \mid X > x)}{\Delta x}$. The hazard rate function can be written as the ratio between the pdf $f(\cdot)$ and the survivor function $S(\cdot) = 1 - F(\cdot)$ as follows: $\frac{f(x)}{S(x)} = \frac{f(x)}{1 - F(x)}$.
Definition 2. We define the proposed estimator of the hazard rate function to be:
Theorem 1. Under conditions 1, 2, and 3, the following holds:
Proof: $V_{ni}$, where $V_{ni}$, $i = 1, 2, \ldots, n$, are independent and identically distributed (iid) and $\sigma_{ni}^2 < \infty$.
We now show that the Liapounov condition is satisfied, that is, for some $\delta > 0$,
First of all, we have:
where $\alpha$ is defined in terms of $2+\delta$. Then we have:
where $\xi_x$ follows an Erlang distribution with mean $\mu_x = E(\xi_x) = x$ and variance var($\xi_x$)
The Taylor expansion of $\xi_x^{\delta h^{-1}} f(\xi_x)$ about the mean $\mu_x = E(\xi_x) = x$ is as follows:
Therefore, (11)
Now substituting $\delta = 0$, the following holds:
Hence, $\to 0$
The last term vanishes by condition 3 as $n \to \infty$. Also, the remaining components of the last term are bounded.
Lemma 1. Under conditions 1, 2 and 3, the following holds:
Proof: First of all, from the definition of $\hat{F}(x)$ we have the following relations:
where $\xi_x$ is defined as in Proposition 1. This implies that $E\hat{F}(x) - F(x) = o(1)$. Therefore, $(nh)$
On the other hand, $\hat{F}(x)$ can be written as follows:
Now, given $\varepsilon > 0$ and $\delta > 0$, we have:
The second term vanishes as $n \to \infty$ and $h \to 0$, since from 2 we have
Further, by replacing $\delta$ with $2\delta$ in (11) and assuming that we have:
then by (13) and (14), we have: $(nh)$
This completes the proof of the lemma.
Theorem 2. Under conditions 1, 2, and 3, the following holds:
Proof: Note that the second term vanishes by (12), and the first term is asymptotically normally distributed by (10). Moreover, from (10) and (15) we also have: $+\, o(1)$ Therefore, and,
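The hazard estimator of Definition 2 can be sketched as a plug-in ratio of an estimated pdf to an estimated survivor function. The sketch below rests on assumptions: the gamma form of the kernel (shape $1 + 1/h$, scale $xh/(1+h)$), and an estimate of $F$ obtained by integrating the density estimate numerically. The paper's own definition of the estimated distribution function is not reproduced here, and the function names are hypothetical.

```python
import numpy as np
from scipy.stats import gamma


def density_estimate(x, sample, h):
    """Assumed Erlang-type density estimate at x > 0 (gamma kernel with mean x)."""
    shape = 1.0 + 1.0 / h
    scale = x * h / (1.0 + h)
    return float(np.mean(gamma.pdf(sample, a=shape, scale=scale)))


def cdf_estimate(x, sample, h, n_cells=400):
    """Assumed F-hat: midpoint-rule integral of the density estimate over (0, x)."""
    width = x / n_cells
    midpoints = (np.arange(n_cells) + 0.5) * width
    return width * sum(density_estimate(u, sample, h) for u in midpoints)


def hazard_estimate(x, sample, h):
    """Plug-in hazard: density estimate divided by the estimated survivor function."""
    survivor = 1.0 - cdf_estimate(x, sample, h)
    if survivor <= 0.0:
        return float("nan")  # the plug-in ratio is not meaningful here
    return density_estimate(x, sample, h) / survivor


if __name__ == "__main__":
    rng = np.random.default_rng(0)            # arbitrary seed
    sample = rng.exponential(1.0, size=2000)  # true hazard is constant at 1
    print(hazard_estimate(1.0, sample, h=0.1))
```

The midpoint rule is used so that the density estimate is never evaluated at $x = 0$, where the assumed kernel has zero scale and is undefined.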

Applications
In this section, the performance of the proposed estimator in estimating the pdf and the hazard rate function is tested in two applications, one using simulated data and one using real-life data.

A Simulation Study.
A sample of size 200 from the exponential distribution with pdf $f(x) = \exp(-x)$ is simulated. We computed the bandwidth using the relation
$h_{opt} = 0.79 R n^{-1/5}$, (16)
see [11], page 47; it equals 0.36658. The density and the hazard functions were estimated using the Erlang estimator. The estimated values and the true exponential pdf are plotted in Figure 1(a), which shows that the performance of the Erlang estimator is acceptable at the boundary near zero. In the interior, the estimated pdf moves closer to the true pdf for large values of $x$. Figure 1(b) likewise shows that the performance of the Erlang estimator of the hazard function is acceptable at the boundary near zero, which is the region of main concern. The mean squared error (MSE) of the proposed estimator of the density function is equal to 0.000266937. For the hazard function, the MSE is evaluated on the interval [0, 0.5], because we are concerned with the values closest to zero, and is equal to 0.02289262.

Real Data.
In this subsection, we use the suicide data given in Silverman [11] to exhibit the practical performance of the Erlang estimator. The data give the lengths of the treatment spells (in days) of control patients in a suicide study. We used the logarithm of the data to draw Figures 2(a) and 2(b), using a bandwidth of 0.480411 computed by (16). These figures exhibit the estimated probability density and hazard rate functions, respectively.
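The simulation study in this section can be sketched as follows. This is a minimal illustration under stated assumptions rather than the computation behind the reported figures: $R$ in (16) is taken to be the sample interquartile range, the kernel is the gamma form with shape $1 + 1/h$ and scale $xh/(1+h)$, the MSE is taken as the average squared error over a hypothetical grid, and the seed is arbitrary. Its output is therefore not expected to match the values reported above.

```python
import numpy as np
from scipy.stats import gamma


def rule_of_thumb_bandwidth(sample):
    """h = 0.79 * R * n**(-1/5), with R taken as the sample interquartile range."""
    q75, q25 = np.percentile(sample, [75, 25])
    return 0.79 * (q75 - q25) * len(sample) ** (-0.2)


def density_estimate(x, sample, h):
    """Assumed Erlang-type density estimate at x > 0 (gamma kernel with mean x)."""
    shape = 1.0 + 1.0 / h
    scale = x * h / (1.0 + h)
    return float(np.mean(gamma.pdf(sample, a=shape, scale=scale)))


def density_mse(sample, h, grid, true_pdf):
    """Average squared error of the density estimate over a grid of points."""
    estimates = np.array([density_estimate(x, sample, h) for x in grid])
    return float(np.mean((estimates - true_pdf(grid)) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)           # arbitrary seed, not the paper's
    sample = rng.exponential(1.0, size=200)  # f(x) = exp(-x), n = 200
    h = rule_of_thumb_bandwidth(sample)
    grid = np.linspace(0.05, 3.0, 60)        # hypothetical evaluation grid
    print("bandwidth:", h)
    print("density MSE:", density_mse(sample, h, grid, lambda x: np.exp(-x)))
```

The same functions could be applied to the logarithm of a real data set by replacing `sample` and dropping the comparison with a true pdf, which is unknown in that case.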

Comments and Conclusion
In this paper, we have proposed a new kernel estimator of the hazard rate function for iid data based on the Erlang kernel with nonnegative support. We showed that the bias depends on the smoothing parameter $h$ and the point of estimation $x$; it goes to zero as $h \to 0$, and it also gets small for values of $x$ close to zero. The variance was investigated, and we noticed that it also depends on $h$ and $x$. On the other hand, it goes to zero as $h \to 0$, and gets large at values of $x$ close to zero. Moreover, the optimal bandwidth and the asymptotic normality were investigated.
In addition, the performance of the proposed estimator was tested in two applications. In a simulation study using an exponential sample, we noticed that the performance of the proposed estimator is acceptable and gives a small MSE. Using real data, we exhibited the practical performance of the Erlang estimator.

Figure 1. (a) The Erlang kernel estimator of the density function, (b) the hazard rate function for the simulated data of the exponential distribution.

Figure 2. (a) The Erlang kernel estimator of the density function, (b) the hazard rate function for the suicide data.