DEVELOPING THE COMPOSED PROBABILITY MODEL TO PREDICT HOUSEHOLD TRIP PRODUCTION (A CASE STUDY OF ISFAHAN CITY)

Household trip production is not a constant parameter and varies with socio-economic characteristics. Even households within the same category (households with the same socio-economic characteristics) produce different numbers of trips. The purpose of the present study is to model the variation of the household trip production rate in statistical populations. To achieve this purpose, the concept of Bayesian inference has been used. The city of Isfahan was selected as a case study. First, the likelihood distribution function was determined for average household trip production. Then, the likelihood distribution was determined for the number of household trips, treating odd and even trips separately. In order to increase the precision of the function, the concept of Bayesian inference was utilized. To insert household socio-economic variables into the function, a disaggregate model was calibrated for average household trip production. Statistical indices and the $\chi^2$ test show that the likelihood distribution function of average household trip production follows the gamma distribution and that the number of household trips follows the Poisson distribution. The final composed probability distribution was determined on the basis of Bayesian inference. The related function was created by combining the distribution function of the mean parameter (gamma distribution) with that of the number of household trips (Poisson distribution). Finally, the disaggregate model was inserted into the final composed probability function in place of the mean parameter. The results show that the Bayesian inference method makes it possible to model the variation of the household trip production rate in statistical populations. It also makes it possible to insert socio-economic characteristics into the model to predict the likelihood of the trips actually produced by each category of household.


Introduction
Researchers in many countries are trying to solve the problems encountered in trip planning and related areas, and several scientific studies have addressed the subject. Cascetta and Papola (2008) investigated a trip distribution model involving spatial and dominance attributes. Ubogu (2008) examined telecommunications and intra-urban trip patterns in Zaria. Matis (2008) looked into a decision support system for solving the street routing problem. Basu and Maitra (2007) applied models valuing the attributes of enhanced traffic information to investigate traffic in Kolkata. Tanczos and Torok (2007) studied a linear optimization model of transportation operating efficiency in urban areas. Migliore and Catalano (2007) analyzed urban public transport optimization considering bus routes and using a neural network-based methodology. Ziari et al. (2007) reviewed models for locating stations of public transportation vehicles to improve transit accessibility.
The present study attempts to model the variation of the household trip production rate and to develop a composed likelihood distribution function in which the frequency of the trip production rate can be measured on the basis of socio-economic characteristics. By substituting these variables into the function, it should be possible to assess the likelihood of the number of trips produced by each household or by any other defined statistical population. To achieve this, the concepts of Bayesian inference in probability and statistics have been used.

Methodology
The methodology of the present study is divided into six steps: first, the 'weighted mean' method was used to convert the data of 15 073 households into 195 household categories for Isfahan city; second, the $\chi^2$ test was used to determine the probability distribution function of average household trip production; third, regression models were used to calibrate the disaggregate model of average household trip production; fourth, the numbers of household trips (15 073 data points) were divided into two groups (odd trips and even trips); fifth, the $\chi^2$ test was used to determine the probability function of the number of household trips; last, the composed probability distribution function was determined on the basis of Bayesian inference, and socio-economic characteristics were then inserted into that function through the disaggregate model.
Transport, 2009, 24(1): 30-36

Developing the Concept of Bayesian Inference or Composed Likelihood Distribution Function
Assume that the random variable X has the distribution function f(x | θ), in which f is the density function of X with the unknown parameter θ. In addition, suppose that we have some extra information about the unknown parameter θ; for example, we know that the parameter is distributed as π(θ). This additional information about the distribution of the population parameter helps us to obtain a better distribution for the random variable X (Berger 1993; Rohatgi et al. 2000). In this case, the random variable X has the distribution function g(x), i.e. the new distribution does not depend on the parameter θ. This type of new distribution is called a composed distribution. In this article, the number of trips $n_k$ has the Poisson distribution function with the parameter λ for odd and even trips. Since the parameter λ in the Poisson distribution equals the population mean, we have obtained the average trip production for the different household categories and then determined the distribution of the mean parameter, which follows the gamma distribution with parameters α and β. Finally, by combining the Poisson and gamma distributions on the basis of Bayesian inference, the final composed probability function is determined.
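The composition described above can be checked numerically. The sketch below is illustrative only. It assumes a gamma density written with a shape and a scale parameter; the function names, the midpoint-rule integration and the example values are assumptions, not taken from the study. It compares the integral of the Poisson likelihood against the gamma prior with the known closed form of that integral, which is a negative binomial probability.

```python
import math


def poisson_pmf(x, lam):
    """Poisson probability of x events for mean lam > 0 (log-space for stability)."""
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))


def gamma_pdf(lam, shape, scale):
    """Gamma density in the shape/scale form (an assumed parameterization)."""
    if lam <= 0:
        return 0.0
    log_pdf = ((shape - 1) * math.log(lam) - lam / scale
               - math.lgamma(shape) - shape * math.log(scale))
    return math.exp(log_pdf)


def marginal_numeric(x, shape, scale, upper=80.0, steps=20000):
    """g(x) = integral of f(x | lam) * pi(lam) d lam, by the midpoint rule."""
    width = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * width
        total += poisson_pmf(x, lam) * gamma_pdf(lam, shape, scale)
    return total * width


def marginal_closed_form(x, shape, scale):
    """Closed form of the same integral: a negative binomial probability."""
    log_g = (math.lgamma(x + shape) - math.lgamma(x + 1) - math.lgamma(shape)
             - shape * math.log1p(scale)
             + x * (math.log(scale) - math.log1p(scale)))
    return math.exp(log_g)


if __name__ == "__main__":
    # Hypothetical prior parameters, chosen only for illustration.
    for x in range(5):
        print(x, marginal_numeric(x, 2.0, 1.5), marginal_closed_form(x, 2.0, 1.5))
```

The resulting g(x) no longer contains λ, which is the sense in which the composed distribution is independent of the population parameter.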

The Conducted Survey in Isfahan and Prepared Data
Isfahan city has 12 urban regions and 190 internal zones, with a population of 1 300 000 citizens. In the origin-destination survey, 15 073 households were studied in Isfahan, covering 4.5 percent of the city population.
The data required for the present investigation were prepared from the origin-destination studies conducted in Isfahan. The collected data include family size, the number of drivers per household, auto ownership per household and the daily household trip production rate.

The Method of Categorizing Household Socio-Economic Characteristics and Determining the Average Rate of Trip Production for Each Category
The method of categorizing household socio-economic characteristics in the present study is shown in Table 1.
The above-mentioned categorization (Table 1) has been prepared from the Isfahan database and covers 15 073 households. The maximum and minimum of each socio-economic characteristic are presented. On the basis of this categorization, 195 household categories are determined in Isfahan. In order to determine the average household trip production rate for each category, the following mathematical relation (1) has been used (Stopher and McDonald 1983; Ortúzar and Willumsen 2001), where: $\bar{t}(h)$ - the average trip production rate for households belonging to category h; $HS_i(h)$ - the size of household i in category h; $t_i(h)$ - the trip production rate of household i in category h; n - the number of households in category h.
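As an illustration of the 'weighted mean' step, the sketch below computes a category average from household-level data. The relation itself is not reproduced in this text, so the code assumes that household sizes $HS_i(h)$ act as the weights on the household trip rates $t_i(h)$; that assumption, the function name and the sample numbers are hypothetical.

```python
def weighted_mean_trip_rate(household_sizes, trip_rates):
    """Weighted mean of trip rates for one household category h.

    ASSUMPTION: household sizes HS_i(h) are used as the weights; the
    study's exact relation is not available in this text.
    """
    if len(household_sizes) != len(trip_rates):
        raise ValueError("one household size is needed per trip rate")
    total_weight = sum(household_sizes)
    if total_weight <= 0:
        raise ValueError("household sizes must sum to a positive value")
    weighted_sum = sum(hs * t for hs, t in zip(household_sizes, trip_rates))
    return weighted_sum / total_weight


if __name__ == "__main__":
    # Three hypothetical households in one category.
    print(weighted_mean_trip_rate([2, 4, 4], [4, 8, 6]))
```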

Determining the Likelihood Distribution Function for the Average Household Trip Production Rate
On statistical grounds, the mean parameter of the population follows a continuous distribution, the most likely candidate being the gamma distribution. We therefore test whether the average rate of trip production follows the gamma distribution (Lan and Hu 2000; Conover 1998; Rickard 1989).
Therefore, a table with 20 categories is produced in which the expected frequency in each category equals 195/20 = 9.75. The test statistic
$$\chi^2 = \sum_{i=1}^{20} \frac{(o_i - e_i)^2}{e_i}$$
(where: $o_i$ - observed frequency, $e_i$ - expected frequency) has a $\chi^2$ distribution with 17 degrees of freedom. In other words, we have k = 20 groups in which two parameters have been estimated (p = 2), so the degrees of freedom are k - 1 - p = 17. Since the p-value equals 0.063547, the hypothesis that the data follow a gamma distribution is not rejected at the 0.95 level. The fitted parameters are as follows: shape parameter: 6.4088748; scale parameter: 0.95103082, where the shape and scale parameters have been obtained through the method of moments. Accordingly, the distribution of average trip production can be obtained through the following formula: (2)
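The ingredients of this test can be sketched in a few lines. The code below is a minimal illustration, not the study's procedure: it computes a Pearson statistic from observed and expected frequencies, the degrees of freedom k - 1 - p, and moment-based gamma estimates. The shape/scale convention (mean = shape × scale), the use of the population variance and all function names are assumptions.

```python
import statistics


def pearson_chi_square(observed, expected):
    """Sum of (o_i - e_i)^2 / e_i over all groups."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def degrees_of_freedom(groups, estimated_parameters):
    """Degrees of freedom k - 1 - p for a goodness-of-fit test."""
    return groups - 1 - estimated_parameters


def gamma_moment_estimates(sample):
    """Method-of-moments estimates (shape, scale).

    ASSUMPTION: mean = shape * scale and variance = shape * scale**2,
    with the population variance used as the second moment.
    """
    mean = statistics.mean(sample)
    variance = statistics.pvariance(sample)
    return mean ** 2 / variance, variance / mean


if __name__ == "__main__":
    print(degrees_of_freedom(20, 2))           # 20 groups, 2 estimated parameters
    print(195 / 20)                            # expected frequency per group
    print(gamma_moment_estimates([2, 4, 6, 8]))  # hypothetical sample
```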

The Model of Disaggregate Household Trip Production
To create the mathematical relation between average household trip production and the socio-economic properties of the household, we use a regression model in which average household trip production is the dependent variable and household size, the number of drivers, the number of employees and auto ownership per household are the independent variables. The purpose of determining this regression model is to obtain the coefficients of those predicting variables that have a significant effect on the target variable, average household trip production. The model should have an acceptable R squared, i.e. above 0.7 (Ortúzar and Willumsen 2001; Rose and Koppelman 1984). Fig. 1 shows the frequency distribution of the household average trip rate (for the 195 household categories).
According to the pattern presented in the graph, average household trip production can be divided into two major groups of low and high frequency. In order to take this into account, a new variable called the trip frequency index enters the model. It takes the value 1 for a high number of trips and 0 for a low number of trips.
Due to the linear dependence between the predicting variables in the model, new variables (z1 and z2) are produced from linear combinations of the primary variables so that they are linearly independent of each other, where: hs - household size; ao - auto ownership; ne - the number of employees; nd - the number of drivers.
Using the new variables, the most suitable calibrated model was determined with the software SPSS 11.5; see Tables 2 and 3. The final model, which satisfies all the assumptions of regression, is:
$$(tp)^2 = 140.643 - 154.171 \cdot tf + 13.322 \cdot z_1 + 4.557 \cdot z_2$$
Since the above model can take negative values, it is modified into max {0, $(tp)^2$}. This way, the model does not predict negative values; in addition, the value of the modeling coefficient increases.
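A small sketch shows how the calibrated model could be evaluated. It uses the coefficients reported for the model in the Table 2 caption (140.643, -154.171, 13.322, 4.557) and the max {0, (tp)²} modification. Because the definitions of z1 and z2 as linear combinations of hs, ao, ne and nd are not reproduced in this text, they are taken as ready-made inputs; the function names and the final square root that recovers tp are assumptions.

```python
import math


def squared_trip_production(tf, z1, z2):
    """Right-hand side of the calibrated model for (tp)^2.

    tf is the trip frequency index (0 or 1); z1 and z2 are the derived
    variables, supplied directly because their definitions are not
    available here (ASSUMPTION).
    """
    return 140.643 - 154.171 * tf + 13.322 * z1 + 4.557 * z2


def average_trip_production(tf, z1, z2):
    """tp after the max{0, (tp)^2} modification; the square root is an assumed last step."""
    return math.sqrt(max(0.0, squared_trip_production(tf, z1, z2)))


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    print(average_trip_production(tf=0, z1=0.0, z2=0.0))
    print(average_trip_production(tf=1, z1=0.0, z2=0.0))  # negative value clipped to 0
```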
In this section, the values of the predicting variables have been put into the model, and the resulting values of the dependent variable, household trip production, have been calculated. Finally, the results obtained from the model and the values observed in the city of Isfahan have been compared, as presented in Fig. 2.

Fig. 2. A comparison of model results with observations
The data indicates that the obtained results of the model have a good concordance with the results of the observations; hence, the calibrated model has the required precision.

Determining the Likelihood Distribution Function for the Number of Household Trips
In Fig. 3, frequency distribution for trip production has been presented based on the Isfahan data bank (15 073 households) (The Comprehensive Studies … 2000).
As observed in the graph (Berger 1993), the frequency of odd trips (red columns) is far lower than that of even trips (blue columns). Accordingly, the likelihood distribution function for household trip production needs to be composed separately, once for even trips and once for odd trips. The likelihood distribution function of household trip production is a discrete function, most likely the Poisson distribution (Lan and Hu 2000).

Determining the Likelihood Distribution Function for Household Trip Production (odd trips)
In order to test whether the number of odd trips follows the Poisson distribution, we proceed as follows. Since the number of odd trips starts from one and increases in steps of two, whereas the Poisson distribution starts from zero and increases in steps of one, we analyze the number of odd trips minus 1, divided by two. Thus, we compose a table consisting of 11 levels such that the expected frequency in each level equals 2244/11 = 204; the resulting test statistic has a $\chi^2$ distribution with 9 degrees of freedom.
We have k = 11 groups in which one parameter (p = 1) has been estimated, so the degrees of freedom of the $\chi^2$ distribution equal k - 1 - p = 9. The p-value equals 0.083137. As a result, the hypothesis that the converted numbers of odd trips follow the Poisson distribution is not rejected at the 0.95 level. That is, the converted number of odd trips follows the Poisson distribution with the parameter $\lambda_1$ = 3.882799, where $\lambda_1$ is the mean of the Poisson distribution obtained through the method of moments. Consequently, the distribution of odd trips $g_1(n_k \mid \lambda_1)$ (the likelihood density of odd trips) would be as follows:
$$g_1(n_k \mid \lambda_1) = \frac{e^{-\lambda_1} \lambda_1^{(n_k - 1)/2}}{\left(\frac{n_k - 1}{2}\right)!}, \quad n_k = 1, 3, 5, \ldots$$

Determining the Likelihood Distribution Function of Household Even Trips
In order to assess whether the number of even trips follows the Poisson distribution, we proceed as follows.
The number of even trips starts from zero and increases in steps of two, whereas the Poisson distribution increases in steps of one. Therefore, we analyze the number of even trips divided by two. Thus, we make a table consisting of 11 levels with an equal expected frequency in each level; the resulting test statistic has a $\chi^2$ distribution with 10 degrees of freedom. In fact, we have k = 12 groups in which one parameter has been estimated (p = 1), and the degrees of freedom of $\chi^2$ equal k - 1 - p. The p-value equals 0.053805. Therefore, the hypothesis that the converted number of even trips follows the Poisson distribution is not rejected at the 0.95 level. Accordingly, the converted number of even trips follows the Poisson distribution with the parameter $\lambda_2$ = 3.18442591, where $\lambda_2$ is the mean of the Poisson distribution obtained through the method of moments. Consequently, the distribution of even trips $g_2(n_k \mid \lambda_2)$ (the likelihood density of even trips) would be as follows:
$$g_2(n_k \mid \lambda_2) = \frac{e^{-\lambda_2} \lambda_2^{n_k/2}}{\left(\frac{n_k}{2}\right)!}, \quad n_k = 0, 2, 4, \ldots \qquad (5)$$
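The two conversions described for odd and even trips can be sketched as follows. The code evaluates a Poisson probability at (n - 1)/2 for odd n and at n/2 for even n, using the reported values 3.882799 and 3.18442591; the function and constant names are hypothetical, and this is an illustration rather than the study's own implementation.

```python
import math

LAMBDA_ODD = 3.882799     # reported Poisson mean for (odd trips - 1) / 2
LAMBDA_EVEN = 3.18442591  # reported Poisson mean for (even trips) / 2


def poisson_pmf(m, lam):
    """Poisson probability of the count m for mean lam > 0."""
    return math.exp(m * math.log(lam) - lam - math.lgamma(m + 1))


def odd_trip_pmf(n, lam=LAMBDA_ODD):
    """Probability of n trips among odd-trip households: Poisson at (n - 1) / 2."""
    if n < 1 or n % 2 == 0:
        return 0.0
    return poisson_pmf((n - 1) // 2, lam)


def even_trip_pmf(n, lam=LAMBDA_EVEN):
    """Probability of n trips among even-trip households: Poisson at n / 2."""
    if n < 0 or n % 2 == 1:
        return 0.0
    return poisson_pmf(n // 2, lam)


if __name__ == "__main__":
    for n in range(0, 13):
        print(n, round(odd_trip_pmf(n), 4), round(even_trip_pmf(n), 4))
```

Under this conversion the implied mean number of trips is 2λ + 1 for odd-trip households and 2λ for even-trip households.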

Determining the Composed Likelihood Distribution Function of Household Trip Production
On the basis of the information provided in section 6.3, the likelihood density of even and odd trips would be as follows:
$$g(n_k \mid \lambda_1, \lambda_2) = (1 - r) \, g_1(n_k \mid \lambda_1) \, I_{(n_k \text{ is odd})} + r \, g_2(n_k \mid \lambda_2) \, I_{(n_k \text{ is even})}$$
The value of r, based on the information introduced in section 6.3, equals the number of even trips divided by the total number of odd and even trips. This value is equal to 0.8511.
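One way to read the combined density is as a two-part mixture: with probability r a household makes an even number of trips, and with probability 1 - r an odd number. The sketch below assumes exactly that form, using r = 0.8511 and the reported Poisson means; the mixture form, the function names and the truncation at 400 trips in the checks are assumptions made for illustration.

```python
import math

R_EVEN = 0.8511           # reported share of even trips
LAMBDA_ODD = 3.882799     # reported mean for (odd trips - 1) / 2
LAMBDA_EVEN = 3.18442591  # reported mean for (even trips) / 2


def poisson_pmf(m, lam):
    """Poisson probability of the count m for mean lam > 0."""
    return math.exp(m * math.log(lam) - lam - math.lgamma(m + 1))


def combined_trip_pmf(n, r=R_EVEN, lam_odd=LAMBDA_ODD, lam_even=LAMBDA_EVEN):
    """ASSUMED mixture: r * even-trip density + (1 - r) * odd-trip density."""
    if n < 0:
        return 0.0
    if n % 2 == 0:
        return r * poisson_pmf(n // 2, lam_even)
    return (1.0 - r) * poisson_pmf((n - 1) // 2, lam_odd)


if __name__ == "__main__":
    total = sum(combined_trip_pmf(n) for n in range(400))
    even_share = sum(combined_trip_pmf(n) for n in range(0, 400, 2))
    print(total, even_share)
```

Because the indicator terms select exactly one branch for each n, the probabilities add up to one and the even-trip share equals r.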

Determining the Likelihood Distribution Function Independently from the Distribution of the Society Parameter
The distribution of $n_k \mid \lambda_k$ is in fact the distribution of the number of trips, denoted $g(n_k \mid \lambda_1, \lambda_2)$, which has the above-mentioned likelihood density function.
$\lambda_k$ has the prior gamma distribution with shape and scale parameters (Section 6.1.). Considering this additional information, we intend to obtain the distribution of the number of trips independently of the population parameter $\lambda_k$. Since the parameters of the density function of the number of trips are not equal for odd and even trips, the distribution of trips is calculated separately for each, independently of the population parameter (refer to section 6.3). The calculations are presented below (Berger 1993; Lan and Hu 2000).
A simple calculation gives the composed function, where: $n_k$ - the number of trips (produced by household k); r - the probability of an even number of trips; α - the shape parameter of the gamma distribution; β - the scale parameter of the gamma distribution; $\lambda_k$ - the mean parameter, equal to α/β in the gamma distribution (the dependent variable (tp) in the disaggregate household trip production model (section 6.2)); $I_{(n_k \text{ is odd})}$ - equals 1 for odd trips, otherwise 0; $I_{(n_k \text{ is even})}$ - equals 1 for even trips, otherwise 0.
Considering the distribution of average household trip production and a suitable linear model with a dependent variable (the average number of trips) and four independent variables (the number of drivers, the number of employees, household size and auto ownership), the composed likelihood distribution function can be presented based on these variables.
The frequency distribution of household trip production (trip information for 15 073 households) was evaluated with the composed likelihood distribution function (section 6.3.1) and compared with the empirical frequency distribution of household trips based on the survey conducted in Isfahan. The results presented in Fig. 4 (The Comprehensive Studies … 2000) demonstrate an acceptable concordance between the empirical frequency distribution of household trips and the result of the composed likelihood distribution function.

Conclusions
In terms of the likelihood distribution function of household trip production, the $\chi^2$ test has demonstrated that average household trip production follows the gamma distribution and that the number of household trips follows the Poisson distribution (for odd and even trips).
The final composed probability function shows that it is possible to model the variation of the household trip production rate. A comparison between the results of the composed probability function and the empirical distribution shows that the precision of the final function is higher than that of the initial functions (the Poisson distribution for odd and even trips). By inserting socio-economic characteristics into the composed likelihood distribution function, it becomes possible to predict the likelihood of the trips actually produced by each household category. In addition, for each household category or specific household, we can estimate the probability of producing one trip, two trips, and so on.

Fig. 1. The frequency distribution of household average trip production

Fig. 3. The frequency distribution of household trip production in Isfahan

Fig. 4. A comparison of the empirical distribution of household trip numbers with the frequency distribution obtained from the composed likelihood distribution

Table 1. The method of categorizing household socio-economic characteristics
(where: $o_i$ - observed frequency, $e_i$ - expected frequency)

Table 2. The quantities of the correlation coefficient in the model
$(tp)^2 = 140.643 - 154.171 \cdot tf + 13.322 \cdot z_1 + 4.557 \cdot z_2$, where: tf - trip frequency index; hs - household size; ao - auto ownership; ne - the number of employees; nd - the number of drivers; $(tp)^2$ - the square of average trip production.

Table 3. Determining the calibration coefficient in the new model