Section 3 - Spatial interaction models
- What they are and where they're used
- Calibration and "what-if"
- Trade area analysis and market penetration:
- The Huff model and variations.
- Site modeling for retail applications - regression, analog, spatial interaction.
- Modeling the impact of changes in a retail system.
- Calibrating spatial interaction models in a GIS environment.
What is a spatial interaction model?
- a model used to explain, understand or predict the level of interaction between different geographic locations
- examples of interactions:
- migration (number of migrants between pairs of states)
- phone traffic (number of calls between pairs of cities)
- commuting (number of vehicles from home to workplace)
- shopping (number of trips from home to store)
- recreation (number of campers from home to campsite)
- trade (amount of goods between pairs of countries)
- interaction is always expressed as a number or quantity per unit of time
- interaction occurs between defined origin and destination
- these may be the same or different classes of objects
- e.g. the same class in the case of migration between states
- e.g. different classes in the case of journeys to shop or work
- the matrix of interactions can be square or rectangular
Interaction is believed to be dependent on:
- some measure of the origin (its propensity to generate interaction)
- some measure of the destination (its propensity to attract interaction)
- some measure of the trip (its propensity to deter interaction)
- these measures are assumed to multiply
Let:
i denote an origin object (often an area)
j denote a destination object (a point or area)
I*ij denote the observed interaction between i and j, measured in appropriate units (e.g. numbers of trips, flow of goods, per defined interval of time)
Iij denote the interaction predicted by the spatial interaction model
- if the model is good (fits well), the predicted interactions per interval of time will be close in value to the observed interactions
- each Iij will be close to its corresponding I*ij
Ei denote the emissivity of the origin area i
Aj denote the attraction of the destination area j
Cij denote the deterrence of the trip between i and j (probably some measure of the trip length or cost)
a denote a constant to be determined
Then the most general form of spatial interaction model is:
Iij = a Ei Aj Cij
- that is, interaction can be predicted from the product of a constant, emissivity, attraction and deterrence
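As a minimal sketch of the general form, the following computes the predicted interaction matrix for two origins and three destinations. All of the numbers, and the function name predict, are hypothetical and chosen only for illustration; in practice a would be calibrated rather than assumed.

```python
# Hypothetical inputs: two origins (i) and three destinations (j).
E = [100.0, 250.0]             # emissivity of each origin i
A = [5.0, 2.0, 8.0]            # attraction of each destination j
C = [[0.50, 0.20, 0.10],       # deterrence C[i][j] of each trip
     [0.25, 0.40, 0.05]]
a = 0.1                        # constant, assumed here rather than calibrated


def predict(a, E, A, C):
    """Return the matrix I[i][j] = a * E[i] * A[j] * C[i][j]."""
    return [[a * E[i] * A[j] * C[i][j] for j in range(len(A))]
            for i in range(len(E))]


I = predict(a, E, A, C)
for row in I:
    print([round(v, 2) for v in row])
```

Because there are two origins and three destinations, the interaction matrix here is rectangular rather than square.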
The model began life in the mid 19th century as an attempt to apply laws of gravitation to human communities - the gravity model
- such ideas of social physics have long since gone out of fashion, but the name is still sometimes used
- even in the form above, the model bears some relationship to Newton's Law of Gravitation
In any application of the model, some aspects are assumed to be unknown, and determined by calibration
- e.g. the value of a might be unknown in a given application
- its value would be calibrated by finding the value that gives the best fit between the observed interactions and the interactions predicted by the model
- the conventional measure of fit is the total squared difference between observation and prediction, that is, the summation over i and j of (Iij - I*ij)^2
- this is known as least squares calibration
- other unknowns might be the method of calculating deterrence (Cij) from distance, or the attraction value to give to certain retail stores
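When a is the only unknown, least squares calibration has a closed form: writing xij = Ei Aj Cij, the total squared difference is minimized by a = sum(xij I*ij) / sum(xij^2). The sketch below applies this to hypothetical observed flows; the data and function names are illustrative assumptions, not part of the original notes.

```python
def model_terms(E, A, C):
    """Return x[i][j] = E[i] * A[j] * C[i][j], the model without the constant a."""
    return [[E[i] * A[j] * C[i][j] for j in range(len(A))]
            for i in range(len(E))]


def calibrate_a(observed, E, A, C):
    """Least squares estimate of a in I[i][j] = a * E[i] * A[j] * C[i][j]."""
    x = model_terms(E, A, C)
    cells = [(i, j) for i in range(len(E)) for j in range(len(A))]
    num = sum(x[i][j] * observed[i][j] for i, j in cells)
    den = sum(x[i][j] ** 2 for i, j in cells)
    return num / den


def total_squared_difference(a, observed, E, A, C):
    """Summation over i and j of (predicted - observed)^2 for a given a."""
    x = model_terms(E, A, C)
    return sum((a * x[i][j] - observed[i][j]) ** 2
               for i in range(len(E)) for j in range(len(A)))


# Hypothetical data for illustration only.
E = [100.0, 250.0]
A = [5.0, 2.0, 8.0]
C = [[0.50, 0.20, 0.10],
     [0.25, 0.40, 0.05]]
observed = [[27.0, 3.0, 9.0],
            [30.0, 22.0, 8.0]]

a_hat = calibrate_a(observed, E, A, C)
print(a_hat, total_squared_difference(a_hat, observed, E, A, C))
```

Other unknowns, such as a parameter inside the deterrence function, generally have no closed form and need a numerical search.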
Measurement of the variables:
- deterrence is often strongly related to distance
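- the further the distance, the less interaction and thus the lower Cij
- two forms of deterrence function are commonly used, where dij is the distance between i and j and b is a parameter:
Cij = dij^-b (negative power)
Cij = exp(-b dij) (negative exponential)
- generally the fit of the model is not sufficiently good to distinguish between these two, that is, to identify which gives the better fit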
- the negative exponential has a minor technical advantage in not creating problems when dij = 0 (origin and destination are the same place)
- the b parameter is unknown and must be calibrated
- its value depends on the type of interaction, and also probably on the region
- b has units in the negative exponential case (1/distance) but none in the negative power case
- other measures of deterrence include:
- some function of transport cost
- some function of actual travel time
- in either case the function used is likely to be the negative power or negative exponential above
- there are examples where distance has a positive effect on interaction
- how to measure the propensity of each origin to emit interaction?
- some more appropriate measure weighting each cohort, e.g. age and sex cohorts
- some cohorts are more likely to interact than others
- Ei could be treated as unknown and calibrated
- the propensity of each destination to attract interaction
- could be unknown and calibrated
- for shopping models, gross floor area of retail space is often used
- some forms of interaction are symmetrical
- flow from origin to destination equals reverse flow
- e.g. phone calls
- requires Ei and Aj to be the same, e.g. population
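A minimal sketch of the negative power and negative exponential deterrence functions named above, assuming the usual forms d^-b and exp(-b d). The function names and sample values are illustrative only. The sketch also shows the technical point about dij = 0: the negative power form is undefined there, while the negative exponential simply returns 1.

```python
import math


def negative_power(d, b):
    """Deterrence C = d ** -b; undefined when d = 0."""
    if d <= 0:
        raise ValueError("negative power deterrence needs a positive distance")
    return d ** -b


def negative_exponential(d, b):
    """Deterrence C = exp(-b * d); b has units of 1/distance."""
    return math.exp(-b * d)


# Hypothetical distances and parameter value, for illustration only.
for d in [0.5, 1.0, 2.0, 4.0]:
    print(d, negative_power(d, 2.0), negative_exponential(d, 0.5))

# The negative exponential is well defined when origin and destination coincide.
print(negative_exponential(0.0, 0.5))
```

With a positive b, both functions fall as distance grows, so more distant destinations receive a lower Cij.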
The Huff model
- what happens when a new destination is added?
- interactions with existing destinations are unaffected
- assumes outflow from origins can increase without limit
- in practice, in many applications flow from origin to existing destinations will be diverted
- we need some form of "production constraint"
Huff proposed this change:
Iij = Ei Aj Cij / summation over k (Ak Cik)
- summing interaction to all destinations from a given origin:
summation over j (Iij) = Ei
- that is, total interaction from an origin will always equal Ei regardless of the number and locations of destinations
- flow will now be partially diverted from existing destinations to new ones
- Ei is now the total outflow, and can be set equal to the total of observed outflows from origin i
- the Huff model is consistent with the axiom of Independence of Irrelevant Alternatives (IIA)
- the ratio of flows to two destinations from a given origin is independent of the existence and locations of other destinations
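The properties listed above can be checked with a short sketch. It computes Huff flows for two hypothetical origins and two existing destinations, then adds a third destination. All values and names are assumptions made for illustration.

```python
def huff(E, A, C):
    """Return I[i][j] = E[i] * A[j] * C[i][j] / sum over k of (A[k] * C[i][k])."""
    flows = []
    for i in range(len(E)):
        weights = [A[j] * C[i][j] for j in range(len(A))]
        total = sum(weights)
        flows.append([E[i] * w / total for w in weights])
    return flows


# Hypothetical system: two origins, two existing destinations.
E = [1000.0, 600.0]            # total outflow from each origin
A = [50.0, 20.0]               # attraction of each destination
C = [[0.5, 0.1],               # deterrence C[i][j]
     [0.2, 0.4]]
before = huff(E, A, C)

# Add a new destination (say, a new mall) with hypothetical attraction
# and deterrence values.
A_new = A + [40.0]
C_new = [row + [c] for row, c in zip(C, [0.3, 0.3])]
after = huff(E, A_new, C_new)

for i in range(len(E)):
    print("origin", i,
          "total before:", sum(before[i]),
          "total after:", sum(after[i]),
          "ratio before:", before[i][0] / before[i][1],
          "ratio after:", after[i][0] / after[i][1])
```

In the printed results, each origin's total outflow stays equal to Ei, flows to the existing destinations shrink once the new one is added, and the ratio of flows to the two original destinations is unchanged, which is the IIA property.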
Because of its production constraint, the Huff model is very popular in retail analysis
- it is often desirable to predict how much business a new store will draw from existing ones
- e.g. how much will a new mall draw business away from downtown?
Other "what if" questions:
- population of a neighborhood increases by x%
- ethnic mix of a neighborhood changes
- a new bridge is constructed
- an earthquake takes a freeway out of operation
- an anchor store moves out
- a store changes its signage
Site modeling for retail applications
- three major areas:
- use of the spatial interaction model
- analog techniques
- regression models
Analog techniques:
- the business done by a new store or an old store operating under changed circumstances is best estimated by finding the closest analog in the chain
- criteria include:
- physical characteristics of each store
- intangibles such as management, signage
- local market area
- a GIS can help compare market areas (local densities, street layouts, traffic patterns)
- a multi-media GIS can help with the intangibles
- bring up images of site, layout, signage...
Regression models:
- identify all of the factors affecting sales, and construct a model to predict sales based on these factors
- an enormous range of factors can affect sales
- some factors are exogenous
- determined by external, physical, measurable variables
- some of these travel with the store if it moves (site factors), others are attributes of place (situation factors)
- other factors are endogenous
- determined by crowding, types of customers, trends, advertising
- unpredictable, determined by the state of the system
Exogenous factors:
- site layout - on a corner? parking spaces, etc.
- trade area - number of households in the primary trade area, etc.
- characteristics of neighborhood
Example model:
Sales per 2-week period for a convenience store:
+ 4542 if gas pumps on site
+ 3172 if major shopping center in vicinity
+ 3990 if side street traffic is transient
+ 3188 per curb cut on side street
+ 2974 if store stands alone
- 1722 per lane on main street
- use of surrogate variables
- problems in use of model for prediction in planning
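The terms of the example model can be applied to a particular store as in the sketch below. Only the six effects listed above are included; the notes give no baseline (intercept) term, so the result is the combined contribution of the listed factors rather than a full sales prediction. The dictionary keys and the sample store are hypothetical.

```python
# Effects copied from the example model; "per unit" effects are multiplied
# by a count, "if" effects by 0 or 1.
EFFECTS = {
    "gas_pumps_on_site": 4542,          # 1 if gas pumps on site
    "shopping_center_nearby": 3172,     # 1 if major shopping center in vicinity
    "side_street_transient": 3990,      # 1 if side street traffic is transient
    "side_street_curb_cuts": 3188,      # per curb cut on side street
    "stands_alone": 2974,               # 1 if store stands alone
    "main_street_lanes": -1722,         # per lane on main street
}


def listed_effects(store):
    """Sum the listed effects for a store; factors not given count as zero."""
    return sum(effect * store.get(name, 0) for name, effect in EFFECTS.items())


# Hypothetical store: gas pumps, two curb cuts on the side street,
# four lanes on the main street.
store = {
    "gas_pumps_on_site": 1,
    "side_street_curb_cuts": 2,
    "main_street_lanes": 4,
}
print(listed_effects(store))
```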
Calibration of the spatial interaction model
- many different circumstances
- major issues involved in calibration
- specific tools are available
- possible to use standard tools in e.g. SAS, GLIM
- calibration possible using aggregate flows or individual choices
- calibration with standard tools relies on transformations that make the right hand side of the equation a linear combination of unknowns, and the left hand side known
Linearization of the unconstrained model:
- suppose the Ei are known, the Aj unknown
- the constant a can be absorbed into the Aj (i.e. find aAj)
- suppose we use the negative power deterrence function
Iij = Ei Aj / dij^b
- move the Ei to the left:
Iij/Ei = Aj / dij^b
- now a trick - take logs of both sides and introduce a set of dummy variables uijk, set to 1 if j=k, otherwise zero:
log (Iij/Ei) = uij1 log A1 + uij2 log A2 + ... - b log dij
- now the left hand side is all knowns, the right hand side is a linear combination of unknowns (the logs of the As and b)
- the model can now be calibrated (the unknowns can be determined) using ordinary multiple regression in a package like SAS
- it may be easier to avoid linearizing altogether by using the nonlinear regression facilities in many packages
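The linearized calibration can be sketched without a statistics package. The code below builds the dummy variables, forms the normal equations of ordinary least squares and solves them by Gaussian elimination. It is a minimal illustration using synthetic flows generated from assumed values of the Aj and b, so the fit should recover those values. All names and data are hypothetical, and every flow must be positive for the logs to exist.

```python
import math


def solve(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[r]] for r, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def calibrate_linearized(I, E, d):
    """Fit log(I/E) = sum_k u_ijk log A_k - b log d_ij; return (A, b)."""
    n_dest = len(d[0])
    rows, y = [], []
    for i in range(len(E)):
        for j in range(n_dest):
            dummies = [1.0 if k == j else 0.0 for k in range(n_dest)]
            rows.append(dummies + [-math.log(d[i][j])])
            y.append(math.log(I[i][j] / E[i]))
    p = n_dest + 1
    XtX = [[sum(r[u] * r[w] for r in rows) for w in range(p)] for u in range(p)]
    Xty = [sum(r[u] * yv for r, yv in zip(rows, y)) for u in range(p)]
    beta = solve(XtX, Xty)
    return [math.exp(v) for v in beta[:n_dest]], beta[n_dest]


# Synthetic data generated from assumed "true" values (illustration only).
E = [100.0, 200.0, 150.0]
d = [[1.0, 4.0], [2.0, 3.0], [5.0, 1.5]]
true_A, true_b = [3.0, 7.0], 1.8
I = [[E[i] * true_A[j] / d[i][j] ** true_b for j in range(2)]
     for i in range(3)]

A_hat, b_hat = calibrate_linearized(I, E, d)
print(A_hat, b_hat)
```

With real data the recovered values would not match any "true" values exactly, and a regression package would also report standard errors.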
The objective function:
- normally, we would try to maximize the fit of the observed and predicted interactions
- linearization changes this
- e.g. we minimize the squared differences between observed and predicted values of log (Iij/Ei) if ordinary regression is used on the linearized form above
- this is easy in practice, but makes no sense
- intuitively, an error of 30 in a prediction of 1000 trips is much more acceptable than an error of 30 in a prediction of 10 trips
- these ideas are formalized in the technique of Poisson regression, which assumes that Iij is a count of events, and sets up the objective function accordingly
- the function minimized to get a good fit is roughly the difference between observed and predicted, squared, divided by the predicted flow
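The three measures of fit discussed above can be compared on the example of an error of 30 trips. The sketch below evaluates each measure for a single flow; the function names are illustrative, and the third function is only the rough form described above, not the full Poisson likelihood.

```python
import math


def squared_error(obs, pred):
    """Ordinary least squares term: (observed - predicted)^2."""
    return (obs - pred) ** 2


def log_squared_error(obs, pred):
    """Term minimized when regression is run on the logged form."""
    return (math.log(obs) - math.log(pred)) ** 2


def scaled_squared_error(obs, pred):
    """Squared difference divided by the predicted flow."""
    return (obs - pred) ** 2 / pred


# An error of 30 trips on a prediction of 1000 and on a prediction of 10.
cases = [(1030.0, 1000.0), (40.0, 10.0)]
for obs, pred in cases:
    print(pred,
          squared_error(obs, pred),
          log_squared_error(obs, pred),
          scaled_squared_error(obs, pred))
```

Plain squared error treats the two cases identically, while the scaled measure penalizes the error on the small flow far more heavily, in line with the intuition stated above.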