Bayesian Approaches for Poisson Distribution Parameter Estimation

The Bayesian approach, a non-classical estimation technique, is widely used in statistical inference for real-world situations. The parameter is treated as a random variable, and knowledge of its prior distribution is used to update the parameter estimate. Herein, two Bayesian approaches for Poisson parameter estimation are proposed by deriving the posterior distribution under the squared error and quadratic loss functions. Their performances were compared with the frequentist (maximum likelihood estimator) and empirical Bayes approaches through Monte Carlo simulations. The mean squared error was used as the criterion for comparing the point estimation methods; the smallest value indicates the best-performing method, whose estimate is closest to the true parameter value. Coverage probabilities (CPs) and average lengths (ALs) were obtained to evaluate the performances of the methods for constructing confidence intervals. The results reveal that the Bayesian approaches were excellent for point estimation when the true parameter value was small (θ = 0.5, 1, or 2). In the credible interval comparison, these methods obtained CP values close to the nominal 0.95 confidence level and the smallest ALs for large sample sizes (n = 50 and 100) when the true parameter value was small (θ = 0.5, 1, or 2).


1-Introduction
The Poisson distribution plays an important role in the statistical analysis of count data. This type of data arises in situations where there are several opportunities for the event of interest to occur, such as the number of customers calling a help center in a day, visitors to a net idol's YouTube channel, or patients infected with Covid-19 per day. Therefore, the Poisson distribution can be used to determine the probability of several events occurring in a particular time period.
Various researchers have developed inference procedures for a Poisson distribution. Araveeporn [1] proposed inferential statistics for testing hypotheses by using the mean of a Poisson parameter estimator obtained via the maximum likelihood estimator (MLE), Markov Chain Monte Carlo, and Bayesian approaches. Hassan et al. [2] investigated Bayesian and MLE estimators for a zero-truncated Poisson distribution. Bayesian estimators for a Poisson distribution using a natural conjugate prior [3], under the linex loss function [4][5][6], and under different symmetric and asymmetric loss functions (squared error, linex, precautionary, and general entropy) [7] have also been presented. Beyond the Poisson distribution, the Bayesian technique for parameter estimation has been extended to the geometric distribution [8], binomial distribution [9], Pareto distribution [10], exponential distribution family [11,12], double exponential distribution under symmetric and asymmetric loss functions [13], and gamma distribution under a generalized weighted loss function [14].

2-1-Maximum Likelihood Estimation
Let X_1, X_2, …, X_n be random variables from a Poisson distribution. The probability mass function of the random variable X is given by

f(x \mid \theta) = \frac{e^{-\theta}\theta^{x}}{x!}, \quad x = 0, 1, 2, \ldots, (1)

where θ > 0 is the constant mean rate of event occurrence.
The joint probability mass function, the product of the n terms, is called the likelihood function, which is defined as

L(\theta \mid x) = \prod_{i=1}^{n} \frac{e^{-\theta}\theta^{x_i}}{x_i!} = \frac{e^{-n\theta}\,\theta^{\sum_{i=1}^{n} x_i}}{\prod_{i=1}^{n} x_i!}. (2)

Because the logarithm is a monotonic function, it is applied to the likelihood function:

\ln L(\theta \mid x) = -n\theta + \left(\sum_{i=1}^{n} x_i\right)\ln\theta - \sum_{i=1}^{n} \ln x_i!.

The maximum of \ln L(\theta \mid x) occurs at the same value of θ as the maximum of L(\theta \mid x). If \ln L(\theta \mid x) is differentiable in θ, the necessary condition for a maximum is obtained by solving

\frac{\partial \ln L(\theta \mid x)}{\partial \theta} = -n + \frac{1}{\theta}\sum_{i=1}^{n} x_i = 0,

which yields \hat{\theta} = \bar{x}. Because this could be a maximum or a minimum, the second derivative is used to confirm a maximum, since

\frac{\partial^2 \ln L(\theta \mid x)}{\partial \theta^2} = -\frac{1}{\theta^2}\sum_{i=1}^{n} x_i < 0.

Therefore, the MLE of θ is \hat{\theta}_{MLE} = \bar{x}.
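The closed-form result above can be illustrated with a minimal Python sketch (our own code, not from the paper; the function name `poisson_mle` is ours): the MLE of the Poisson rate is simply the sample mean.

```python
import numpy as np

def poisson_mle(x):
    """MLE of the Poisson rate: the sample mean x_bar."""
    return float(np.mean(x))

# Simulated check: estimate the rate of synthetic Poisson data.
rng = np.random.default_rng(42)
x = rng.poisson(lam=2.0, size=10_000)
theta_hat = poisson_mle(x)
print(theta_hat)  # close to the true rate 2.0
```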

2-2-Bayesian Method
To obtain the Bayesian estimator, the prior distribution is specified as a gamma distribution, and the squared error and quadratic loss functions are applied. Let X_1, X_2, …, X_n be random variables from a Poisson distribution. The probability mass function of the random variable X is given by f(x \mid \theta) in (1), with likelihood function L(\theta) = \prod_{i=1}^{n} f(x_i \mid \theta). Consider the informative conjugate prior for θ, a gamma distribution with parameters a and b, where a is the shape parameter and b is the scale parameter. It is given by

g(\theta) = \frac{b^{a}}{\Gamma(a)}\,\theta^{a-1} e^{-b\theta}, \quad \theta > 0. (7)

The posterior distribution for the Bayesian procedure can be derived by combining the likelihood function (2) and the prior distribution (7):

h(\theta \mid x) \propto L(\theta \mid x)\, g(\theta) \propto \theta^{\sum_{i=1}^{n} x_i + a - 1}\, e^{-(n+b)\theta}.

The marginal probability mass function of x can be derived by integrating the product of the likelihood function and the prior distribution:

m(x) = \int_0^{\infty} L(\theta \mid x)\, g(\theta)\, d\theta = \frac{b^{a}\,\Gamma\!\left(\sum_{i=1}^{n} x_i + a\right)}{\Gamma(a)\left(\prod_{i=1}^{n} x_i!\right)(n+b)^{\sum_{i=1}^{n} x_i + a}}.

This implies that the posterior distribution can be written as

h(\theta \mid x) = \frac{(n+b)^{\sum_{i=1}^{n} x_i + a}}{\Gamma\!\left(\sum_{i=1}^{n} x_i + a\right)}\,\theta^{\sum_{i=1}^{n} x_i + a - 1}\, e^{-(n+b)\theta},

which is a gamma distribution with parameters \sum_{i=1}^{n} x_i + a and n + b. Hence, \theta \mid x \sim \mathrm{Gamma}\!\left(\sum_{i=1}^{n} x_i + a,\; n + b\right).
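The conjugate gamma-Poisson update is simple enough to express in a few lines of Python (a sketch with our own naming; `gamma_poisson_posterior` is not from the paper): the prior Gamma(a, b) combined with data x_1, …, x_n gives the posterior Gamma(a + Σx_i, b + n).

```python
import numpy as np

def gamma_poisson_posterior(x, a, b):
    """Conjugate update: Gamma(a, b) prior + Poisson data -> Gamma(a + sum(x), b + n)."""
    x = np.asarray(x)
    return a + x.sum(), b + len(x)

# Example: four observations with prior Gamma(2, 2).
a_post, b_post = gamma_poisson_posterior([3, 1, 4, 2], a=2.0, b=2.0)
print(a_post, b_post)  # Gamma(12, 6): posterior mean 12/6 = 2.0
```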

2-2-1-The Bayesian Estimator of Parameter for Squared Error (SE) Loss Function
The squared error loss function is defined as L(\theta, \hat{\theta}) = (\hat{\theta} - \theta)^2. The Bayesian estimator under the squared error loss function is the mean of the posterior distribution, which can be derived as

\hat{\theta}_{BSL} = E(\theta \mid x) = \frac{\sum_{i=1}^{n} x_i + a}{n + b}.

2-2-2-The Bayesian Estimator of Parameter for Quadratic Loss (QL) Function
The quadratic loss function, a non-negative, symmetric, and continuous loss function of the parameter θ and its estimate \hat{\theta}, is defined as [12]

L(\theta, \hat{\theta}) = \left(\frac{\theta - \hat{\theta}}{\theta}\right)^{2}.

The Bayesian estimator of θ under the quadratic loss function is obtained by minimizing the posterior expected loss, which gives \hat{\theta} = E(\theta^{-1} \mid x)\,/\,E(\theta^{-2} \mid x). For the gamma posterior derived above, this yields

\hat{\theta}_{BQL} = \frac{\sum_{i=1}^{n} x_i + a - 2}{n + b}.
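Both closed-form Bayes estimators can be sketched together in Python (our own code and function names, assuming the gamma-prior forms derived above: posterior mean for squared error loss, and E[1/θ|x]/E[1/θ²|x] for quadratic loss):

```python
import numpy as np

def bayes_se(x, a, b):
    """Bayes estimator under squared error loss: posterior mean (sum(x) + a) / (n + b)."""
    x = np.asarray(x)
    return (x.sum() + a) / (len(x) + b)

def bayes_ql(x, a, b):
    """Bayes estimator under quadratic loss: (sum(x) + a - 2) / (n + b)."""
    x = np.asarray(x)
    return (x.sum() + a - 2) / (len(x) + b)

x = [3, 1, 4, 2]
print(bayes_se(x, a=2.0, b=2.0))  # 12/6 = 2.0
print(bayes_ql(x, a=2.0, b=2.0))  # 10/6 ≈ 1.667
```

Note that the quadratic-loss estimator shrinks the estimate downward relative to the posterior mean by 2/(n + b).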

2-3-The Empirical Bayes (EB) Estimation
The idea behind the EB method differs from the Bayesian approach. To estimate the true parameter value using the Bayesian approach, the hyperparameter is assumed to be known or, if unknown, prior information about it is available; the hyperparameter is independent of the observations. With the EB method, the hyperparameter is first estimated from the observations using a classical approach, after which this estimate is substituted for the hyperparameter in the posterior distribution.
Let X_1, X_2, …, X_n be random variables from a Poisson distribution with probability mass function given by (1). Supharakonsakun and Jampachasri [27] proposed the EB estimator for the case of θ following an exponential distribution, denoted by \theta \sim \mathrm{Exp}(\lambda). Hence, the prior distribution function of θ can be written as

g(\theta) = \lambda e^{-\lambda\theta}, \quad \theta > 0.

Subsequently, the posterior marginal distribution of X can be solved as

m(x) = \int_0^{\infty} \frac{e^{-\theta}\theta^{x}}{x!}\,\lambda e^{-\lambda\theta}\, d\theta = \frac{\lambda}{\lambda+1}\left(\frac{1}{\lambda+1}\right)^{x},

a geometric distribution denoted by X \sim \mathrm{Geo}\!\left(\frac{\lambda}{\lambda+1}\right).

The hyperparameter λ is estimated in the next step of the EB procedure. The MLE is used to estimate λ by considering the likelihood function of the posterior marginal distribution:

L(\lambda) = \prod_{i=1}^{n} \frac{\lambda}{(\lambda+1)^{x_i+1}} = \frac{\lambda^{n}}{(\lambda+1)^{\sum_{i=1}^{n} x_i + n}}.

The logarithm of the likelihood function is

\ln L(\lambda) = n\ln\lambda - \left(\sum_{i=1}^{n} x_i + n\right)\ln(\lambda+1).

Setting its derivative with respect to λ equal to zero, we obtain the MLE of the hyperparameter as

\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} x_i} = \frac{1}{\bar{x}}.

The posterior distribution for the EB procedure is derived as the product of the likelihood function and the prior distribution, normalized by its integral, giving

\theta \mid x \sim \mathrm{Gamma}\!\left(\sum_{i=1}^{n} x_i + 1,\; n + \lambda\right).

After obtaining the posterior distribution, the hyperparameter λ is replaced by the estimator \hat{\lambda} in the posterior distribution of \theta \mid x. Therefore, the EB estimator of θ under the squared error loss function is

\hat{\theta}_{EB} = \frac{\sum_{i=1}^{n} x_i + 1}{n + \hat{\lambda}}.

Moreover, point estimation of the hyperparameter is extended via a bootstrapping method with 1,000 iterations. Thus, there are two methods for estimating the true parameter using the EB procedure (EB and EB with bootstrapping).
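The EB steps can be sketched in Python (our own code and names; the exact resampling scheme of the paper's bootstrap variant is not fully specified here, so `eb_bootstrap` simply averages the hyperparameter estimate over nonparametric resamples as one plausible reading):

```python
import numpy as np

def eb_estimate(x):
    """EB estimator: lambda_hat = 1/x_bar plugged into the Gamma(sum(x)+1, n+lambda) posterior mean."""
    x = np.asarray(x, dtype=float)
    lam_hat = 1.0 / x.mean()  # MLE of the hyperparameter from the geometric marginal
    return (x.sum() + 1) / (len(x) + lam_hat)

def eb_bootstrap(x, n_boot=1000, seed=0):
    """EBboot sketch: average lambda_hat over bootstrap resamples before plugging in."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    lam_hats = [1.0 / rng.choice(x, size=len(x), replace=True).mean()
                for _ in range(n_boot)]
    return (x.sum() + 1) / (len(x) + np.mean(lam_hats))

x = [3, 1, 4, 2]
print(eb_estimate(x))   # (10 + 1) / (4 + 0.4) = 2.5
print(eb_bootstrap(x))  # close to the plain EB estimate
```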

3-Simulation Results
A simulation study with 5,000 replications was conducted to evaluate the point estimates and confidence intervals constructed with the five methods. For point estimation, the lowest mean squared error (MSE) was the criterion used to identify the best-performing method. For comparing the confidence intervals, coverage probabilities (CPs) and average lengths (ALs) were calculated under the same conditions, varying the hyperparameters a and b for the Bayesian approaches as in the point estimation. A CP close to or greater than the nominal level of 0.95 and the shortest AL were used to identify the best-performing method in each case. The Monte Carlo simulation scheme is presented in Figure 1.

3-1-Point Estimation
The point estimation performances of the MLE, Bayesian under the squared error loss function (BSL), Bayesian under the quadratic loss function (BQL), EB, and EB with bootstrapping (EBboot) methods for estimating the mean of a Poisson distribution were compared via simulation. Random samples were generated as 5,000 sets from a Poisson distribution, with 1,000 iterations for bootstrapping the hyperparameter estimation in EBboot, for various sample sizes (n = 5, 10, 15, 20, 30, 50, 100) and true parameter values θ = 0.5, 1, 2, 5, 10, and 20, given arbitrary prior parameters (a, b) = (2, 2), (2, 4), (1, 3.5), and (4, 4) for the Bayesian methods under the two loss functions. The mean squared error,

MSE = \frac{1}{5000}\sum_{j=1}^{5000}\left(\hat{\theta}_j - \theta\right)^2,

is the criterion used to evaluate the performances of the parameter estimation methods; the lowest MSE value means that the estimated value of θ is closest to its true value. Table 1 reports the MSE values for point estimation using MLE, BSL, BQL, EB, and EBboot, with hyperparameters a = 2 and b = 2 for the Bayesian methods. For all sample sizes n, BSL outperformed the others with the lowest MSE for true parameter θ = 0.5, 1, or 2, while MLE was the best for θ = 5, 10, or 20. Focusing on the EB approaches, the EB and EBboot performances were very close to each other for all values of the hyperparameters tested. Although the EB approaches did not provide the lowest MSEs, their values were very close to those of the best estimator in each case.
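A small-scale version of this Monte Carlo comparison can be reproduced with the estimator formulas derived earlier (our own code; fewer replications than the paper's 5,000, and only the three closed-form estimators):

```python
import numpy as np

def mse_study(theta=1.0, n=20, a=2.0, b=2.0, reps=2000, seed=1):
    """Monte Carlo MSE of MLE, BSL, and BQL for one (theta, n, a, b) setting."""
    rng = np.random.default_rng(seed)
    est = {"MLE": [], "BSL": [], "BQL": []}
    for _ in range(reps):
        x = rng.poisson(theta, size=n)
        s = x.sum()
        est["MLE"].append(x.mean())            # sample mean
        est["BSL"].append((s + a) / (n + b))   # posterior mean, squared error loss
        est["BQL"].append((s + a - 2) / (n + b))  # quadratic loss estimator
    return {k: float(np.mean((np.array(v) - theta) ** 2)) for k, v in est.items()}

print(mse_study())  # with theta = 1 and prior mean a/b = 1, BSL has the lowest MSE
```

With the prior mean a/b matching the true θ, the shrinkage in BSL reduces variance without adding bias, which is consistent with the paper's finding that BSL wins for small θ.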

The results in Table 3 comprise the MSE values obtained by the five methods, with hyperparameters a = 1 and b = 3.5 for the Bayesian methods. This time, the lowest MSE values were obtained by BSL for θ = 0.5; by BQL for θ = 1 or 2; and by MLE for θ = 5, 10, or 20 for all sample sizes n. Meanwhile, EB and EBboot provided consistently low MSEs that were only slightly higher than the best in each case.
The MSE results in Table 4 for the five methods (with hyperparameters a = 4 and b = 4 for the Bayesian methods) indicate that for sample sizes n = 5 or 10, the lowest MSE values were obtained by BSL for θ = 0.5 or 1; by BQL for θ = 2; and by MLE for θ = 5, 10, or 20. For sample sizes n = 15, 20, 30, 50, or 100, the lowest MSE values were obtained by BSL for θ = 1, by BQL for θ = 2, and by MLE for θ = 0.5, 5, 10, or 20. Although EB and EBboot performed worse than the others, their MSEs were close to the lowest MSE in each case.

3-2-Confidence Intervals
The parameter settings for comparing the confidence interval performances were the same as for the point estimation simulation. Coverage probabilities (CPs) and average lengths (ALs) were used to evaluate the performances of the methods. The most effective confidence interval method obtained a CP close to or greater than the nominal level of 0.95 and the shortest AL. For small sample sizes, some of the methods produced CP values lower than the nominal level of 0.95. Although the CPs of EB and EBboot were not close to 0.95, they improved with increasing n. In addition, their AL values were similar to each other and narrower than those of the other methods for θ = 0.5.
For sample sizes n = 50 or 100, MLE once again obtained CP values close to 0.95 for all values of θ, as did BSL and BQL for θ = 0.5, 1, or 2. Meanwhile, the CPs of BSL decreased drastically for large values of θ, while those of EB and EBboot were similar to each other and higher than those of BSL and BQL. Although the ALs obtained by the five methods were similar, slightly shorter ones were provided by EBboot for θ = 0.5 and by BSL for θ = 1, 2, 5, 10, or 20 for sample size n = 50. Meanwhile, the shortest ALs were provided by EB for θ = 0.5 and by BSL for θ = 1, 2, 5, 10, or 20 for sample size n = 100.
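The CP/AL evaluation for the Bayesian credible interval can be sketched as follows (our own code; the equal-tailed 95% interval of the Gamma(Σx + a, n + b) posterior is approximated here by sampling posterior draws with NumPy rather than using exact quantile functions):

```python
import numpy as np

def coverage_study(theta=1.0, n=50, a=2.0, b=2.0, reps=1000, seed=2):
    """Monte Carlo CP and AL of the equal-tailed 95% gamma-posterior credible interval."""
    rng = np.random.default_rng(seed)
    cover, lengths = 0, []
    for _ in range(reps):
        x = rng.poisson(theta, size=n)
        shape, rate = x.sum() + a, n + b
        # Approximate the 2.5% and 97.5% posterior quantiles by simulation.
        draws = rng.gamma(shape, 1.0 / rate, size=4000)
        lo, hi = np.quantile(draws, [0.025, 0.975])
        cover += bool(lo <= theta <= hi)
        lengths.append(hi - lo)
    return cover / reps, float(np.mean(lengths))

cp, al = coverage_study()
print(cp, al)  # CP near the nominal 0.95 for theta = 1, n = 50
```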

4-Discussion
Frequentist and Bayesian inference are fundamentally different principles in statistics. Estimating the Poisson parameter using the MLE method as a classical frequentist approach was proposed by Araveeporn [1] and Hassan et al. [2]. The existing EB and EB with bootstrapping approaches were derived by Supharakonsakun and Jampachasri [27]. Empirical Bayes methods differ from the classical Bayesian approach in that the hyperparameters are assumed to be unknown and the observations are used to estimate them via a classical approach. Determining the hyperparameters is important for the posterior distribution and increases the accuracy of the parameter estimation. Two different loss functions, squared error and quadratic [12], were used in this study for parameter estimation based on the classical Bayesian approach.
The results of this study were similar to those of Srivastava [7] in that the Bayesian point estimation method under the squared error loss function provided estimates nearer to the true value of the Poisson parameter. In addition, the Bayesian estimators under different loss functions performed better than the classical estimator (MLE) for small true parameter values and various hyperparameter settings, which is similar to the findings of Hassan and Baizid [12] and Naji and Rasheed, who found that Bayesian parameter estimates under precautionary [15], generalized weighted [14], or entropy [16] loss functions were better than the classical approaches of MLE and the method of moments.

5-Conclusion
The purpose of this study was to derive the Bayesian posterior distribution and the equi-tailed posterior credible interval under two different loss functions, for point estimation and for constructing credible intervals, when estimating the Poisson parameter with a gamma prior distribution. The performances of the two Bayesian methods, under either the squared error loss function or the quadratic loss function, were compared with the existing classical MLE, EB, and EB with bootstrapping approaches through Monte Carlo simulations.
When estimating the Poisson parameter, the Bayesian methods constructed under the squared error and quadratic loss functions produced the most suitable estimates for small true parameter values (θ = 0.5, 1, or 2) by providing the lowest MSE values for point estimation for all sample sizes. Moreover, they attained CPs close to the nominal 0.95 confidence level with the lowest ALs for large sample sizes (n = 50 or 100). Meanwhile, for all sample sizes, the classical MLE approach obtained the lowest MSE values for point estimation for large true parameter values (θ = 5, 10, or 20) and provided CPs close to or greater than 0.95, with slightly longer ALs than the Bayesian methods. In addition, although the EB estimation methods based on an exponential prior distribution did not achieve the best results, their results were close to those of the best estimator in each case, and so they are a good alternative for point estimation and confidence interval construction.

6-1-Data Availability Statement
The data presented in this study are available on request from the corresponding author.

6-2-Funding
This research was supported in part by Research and Development Institute, Phetchabun Rajabhat University, Thailand.

6-3-Conflicts of Interest
The author declares that there is no conflict of interest regarding the publication of this manuscript. In addition, ethical issues, including plagiarism, informed consent, misconduct, data fabrication and/or falsification, double publication and/or submission, and redundancy, have been completely observed by the author.