SECTION 3
SPATIAL INTERACTION MODELS



Section 3 - Spatial interaction models - What they are and where they're used - Calibration and "what-if" - Trade area analysis and market penetration:

  • The Huff model and variations.
  • Site modeling for retail applications - regression, analog, spatial interaction.
  • Modeling the impact of changes in a retail system.
  • Calibrating spatial interaction models in a GIS environment.
What is a spatial interaction model?
  • a model used to explain, understand, or predict the level of interaction between different geographic locations
  • examples of interactions:
    • migration (number of migrants between pairs of states)
    • phone traffic (number of calls between pairs of cities)
    • commuting (number of vehicles from home to workplace)
    • shopping (number of trips from home to store)
    • recreation (number of campers from home to campsite)
    • trade (amount of goods between pairs of countries)
  • interaction is always expressed as a number or quantity per unit of time
  • interaction occurs between defined origin and destination
    • these may be the same or different classes of objects
    • e.g. the same class in the case of migration between states
    • e.g. different classes in the case of journeys to shop or work
    • the matrix of interactions can be square or rectangular

Interaction is believed to be dependent on:
  • some measure of the origin (its propensity to generate interaction)
  • some measure of the destination (its propensity to attract interaction)
  • some measure of the trip (its propensity to deter interaction)
  • these measures are assumed to multiply
 Let:
i denote an origin object (often an area)

j denote a destination object (a point or area)

I*ij denote the observed interaction between i and j, measured in appropriate units (e.g. numbers of trips, flow of goods, per defined interval of time)

Iij denote the interaction predicted by the spatial interaction model
 

  • if the model is good (fits well), the predicted interactions per interval of time will be close in value to the observed interactions
  • each Iij will be close to its corresponding I*ij
Ei denote the emissivity of the origin area i

Aj denote the attraction of the destination area j

Cij denote the deterrence of the trip between i and j (probably some measure of the trip length or cost)

a denote a constant to be determined
 

Then the most general form of spatial interaction model is:
 

$I_{ij} = a \, E_i \, A_j \, C_{ij}$
 

  • that is, interaction can be predicted from the product of a constant, emissivity, attraction and deterrence
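As a minimal sketch of the general form above, the product can be evaluated for every origin-destination pair at once. The function name `predict_interaction` and all of the numbers are hypothetical, chosen only for illustration; NumPy is assumed to be available.

```python
import numpy as np

def predict_interaction(a, E, A, C):
    """Unconstrained model: I[i, j] = a * E[i] * A[j] * C[i, j]."""
    E = np.asarray(E, dtype=float)
    A = np.asarray(A, dtype=float)
    C = np.asarray(C, dtype=float)
    # Broadcasting gives one row per origin and one column per destination.
    return a * E[:, None] * A[None, :] * C

# Hypothetical data: 2 origins and 3 destinations (a rectangular matrix).
E = [100.0, 200.0]             # emissivity of each origin
A = [10.0, 20.0, 5.0]          # attraction of each destination
C = [[0.5, 0.25, 1.0],         # deterrence for each origin-destination pair
     [0.2, 0.5, 0.1]]
I = predict_interaction(0.01, E, A, C)
print(I)
```

The result has one row per origin and one column per destination, matching the square or rectangular interaction matrix described earlier.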
The model began life in the mid 19th century as an attempt to apply laws of gravitation to human communities - the gravity model
  • such ideas of social physics have long since gone out of fashion, but the name is still sometimes used
  • even in the form above, the model bears some relationship to Newton's Law of Gravitation
 In any application of the model, some aspects are assumed to be unknown, and determined by calibration
  • e.g. the value of a might be unknown in a given application
  • its value would be calibrated by finding the value that gives the best fit between the observed interactions and the interactions predicted by the model
  • the conventional measure of fit is the total squared difference between observation and prediction, that is, the summation over i and j of $(I_{ij} - I^*_{ij})^2$
  • this is known as least squares calibration
  • other unknowns might be the method of calculating deterrence (Cij) from distance, or the attraction value to give to certain retail stores
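When a is the only unknown, least squares calibration has a closed form: setting the derivative of the total squared difference to zero gives a equal to the sum of (observed flow times Ei Aj Cij) divided by the sum of (Ei Aj Cij) squared. The sketch below assumes this simple case; the function name `calibrate_a` and the data are hypothetical.

```python
import numpy as np

def calibrate_a(observed, E, A, C):
    """Least squares estimate of a in I_ij = a * E_i * A_j * C_ij."""
    X = (np.asarray(E, dtype=float)[:, None]
         * np.asarray(A, dtype=float)[None, :]
         * np.asarray(C, dtype=float))        # model prediction when a = 1
    observed = np.asarray(observed, dtype=float)
    a = (X * observed).sum() / (X ** 2).sum()
    sse = ((a * X - observed) ** 2).sum()      # total squared difference
    return a, sse

# Hypothetical data: flows generated with a = 0.02 plus small errors.
E = np.array([100.0, 200.0])
A = np.array([10.0, 20.0, 5.0])
C = np.array([[0.5, 0.25, 1.0],
              [0.2, 0.5, 0.1]])
noise = np.array([[0.3, -0.2, 0.1],
                  [-0.1, 0.4, -0.3]])
observed = 0.02 * E[:, None] * A[None, :] * C + noise

a_hat, sse = calibrate_a(observed, E, A, C)
print(a_hat, sse)
```

The calibrated value lands close to, but not exactly on, the value used to generate the data, because the fit absorbs part of the error.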
Measurement of the variables:

Cij

  • deterrence is often strongly related to distance
    • the further the distance, the less interaction and thus the lower Cij
  • a common choice is a decreasing function of distance:
  • $C_{ij} = d_{ij}^{-b}$            (that is, $C_{ij} = 1 / d_{ij}^b$)
    or $C_{ij} = \exp(-b \, d_{ij})$            (exp(x) denotes e, approximately 2.71828, raised to the power x)
  • generally the fit of the model is not sufficiently good to distinguish between these two, that is, to identify which gives the better fit
  • the negative exponential has a minor technical advantage in not creating problems when dij = 0 (origin and destination are the same place)
  • the b parameter is unknown and must be calibrated
    • its value depends on the type of interaction, and also probably on the region
    • b has units in the negative exponential case (1/distance) but none in the negative power case
  • other measures of deterrence include:
    • some function of transport cost
    • some function of actual travel time
    • in either case the function used is likely to be the negative power or negative exponential above
    • there are examples where distance has a positive effect on interaction
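The two deterrence functions above can be compared directly. This is a sketch only: the function names, the distances, and the b values are hypothetical, and in practice b would be calibrated rather than chosen.

```python
import numpy as np

def power_deterrence(d, b):
    """Negative power: C = d ** (-b); b is dimensionless."""
    return np.asarray(d, dtype=float) ** (-b)

def exponential_deterrence(d, b):
    """Negative exponential: C = exp(-b * d); b has units of 1/distance."""
    return np.exp(-b * np.asarray(d, dtype=float))

distances = np.array([1.0, 2.0, 5.0, 10.0])    # hypothetical distances
c_power = power_deterrence(distances, b=2.0)
c_exp = exponential_deterrence(distances, b=0.5)

# At zero distance the exponential form stays finite (C = 1) ...
c_exp_zero = exponential_deterrence(0.0, b=0.5)
# ... while the power form divides by zero (NumPy returns inf and would
# normally emit a warning, which is silenced here).
with np.errstate(divide="ignore"):
    c_power_zero = power_deterrence(0.0, b=2.0)

print(c_power, c_exp, c_exp_zero, c_power_zero)
```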
Ei
  • how to measure the propensity of each origin to emit interaction?
  • gross population Pi
  • some more appropriate measure weighting each cohort, e.g. age and sex cohorts
    • some cohorts are more likely to interact than others
  • gross income
  • Ei could be treated as unknown and calibrated
Aj
  • the propensity of each destination to attract interaction
  • could be unknown and calibrated
  • for shopping models, gross floor area of retail space is often used
  • some forms of interaction are symmetrical
    • flow from origin to destination equals reverse flow
    • e.g. phone calls
    • requires Ei and Aj to be the same, e.g. population
The Huff model
  • what happens when a new destination is added to the model above?
    • interactions with existing destinations are unaffected
    • this assumes outflow from origins can increase without limit
    • in practice, in many applications flow from origin to existing destinations will be diverted
    • we need some form of "production constraint"
Huff proposed this change:
$I_{ij} = E_i A_j C_{ij} / \sum_k (A_k C_{ik})$
  • summing interaction to all destinations from a given origin:
  • $\sum_j I_{ij} = E_i$
  • that is, total interaction from an origin will always equal Ei regardless of the number and locations of destinations
  • flow will now be partially diverted from existing destinations to new ones
  • Ei is now the total outflow, can be set equal to the total of observed outflows from origin i
  • the Huff model is consistent with the axiom of Independence of Irrelevant Alternatives (IIA)
    • the ratio of flows to two destinations from a given origin is independent of the existence and locations of other destinations
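The production constraint and the IIA property can both be checked numerically. The sketch below uses hypothetical numbers and a hypothetical function name `huff_flows`; it adds a third destination to a two-destination system and compares the flows before and after.

```python
import numpy as np

def huff_flows(E, A, C):
    """Huff model: I[i, j] = E[i] * A[j] * C[i, j] / sum over k of A[k] * C[i, k]."""
    E = np.asarray(E, dtype=float)
    A = np.asarray(A, dtype=float)
    C = np.asarray(C, dtype=float)
    weights = A[None, :] * C                            # A_j * C_ij
    shares = weights / weights.sum(axis=1, keepdims=True)
    return E[:, None] * shares

# Hypothetical system: 2 origins, 2 existing destinations.
E = [1000.0, 500.0]
A = [10.0, 30.0]
C = np.array([[0.5, 0.1],
              [0.2, 0.4]])
before = huff_flows(E, A, C)

# Add a new destination (attraction and deterrence values are made up).
A_new = A + [20.0]
C_new = np.hstack([C, [[0.25], [0.25]]])
after = huff_flows(E, A_new, C_new)

print(before.sum(axis=1), after.sum(axis=1))             # both equal E
print(before[:, 0] / before[:, 1], after[:, 0] / after[:, 1])  # ratios unchanged
```

In this example total outflow from each origin stays at Ei, flows to the two existing destinations both fall, and the ratio between them is the same before and after the new destination opens.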
Because of its production constraint, the Huff model is very popular in retail analysis
  • it is often desirable to predict how much business a new store will draw from existing ones
    • e.g. how much will a new mall draw business away from downtown?
Other "what if" questions:
  • population of a neighborhood increases by x%
  • ethnic mix of a neighborhood changes
  • a new bridge is constructed
  • an earthquake takes a freeway out of operation
  • a mall adds new space
  • an anchor store moves out
  • a store changes its signage
Site modeling for retail applications
  • three major areas:
    • use of the spatial interaction model
    • analog techniques
    • regression models
Analog:
  • the business done by a new store or an old store operating under changed circumstances is best estimated by finding the closest analog in the chain
  • requires a search
  • criteria include:
    • physical characteristics of each store
    • intangibles such as management, signage
    • local market area
    • a GIS can help compare market areas (local densities, street layouts, traffic patterns)
    • a multi-media GIS can help with the intangibles
      • bring up images of site, layout, signage...
Regression:
  • identify all of the factors affecting sales, and construct a model to predict based on these factors
  • an enormous range of factors can affect sales
  • some factors are exogenous
    • determined by external, physical, measurable variables
    • some of these travel with the store if it moves (site factors), others are attributes of place (situation factors)
  • other factors are endogenous
    • determined by crowding, types of customers, trends, advertising
    • unpredictable, determined by the state of the system
Exogenous factors:
  • site layout - on a corner? parking spaces, etc.
  • inside layout
  • trade area - number of households in the primary trade area, etc.
  • characteristics of neighborhood
Example model:
 
 

Sales per 2-week period for convenience store:

$12749
+ 4542 if gas pumps on site
+ 3172 if major shopping center in vicinity
+ 3990 if side street traffic is transient
+ 3188 per curb cut on side street
+ 2974 if store stands alone
- 1722 per lane on main street

  • use of surrogate variables
  • problems in use of model for prediction in planning
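The example model is a simple additive formula, so a prediction for a candidate site can be sketched as a small function. The coefficients are those listed above; the function name, the argument names, and the example site are hypothetical.

```python
def predicted_sales(gas_pumps, near_shopping_center, transient_side_traffic,
                    side_street_curb_cuts, stands_alone, main_street_lanes):
    """Predicted sales (dollars per 2-week period) for a convenience store."""
    sales = 12749                                   # base value
    sales += 4542 if gas_pumps else 0               # gas pumps on site
    sales += 3172 if near_shopping_center else 0    # major shopping center nearby
    sales += 3990 if transient_side_traffic else 0  # transient side street traffic
    sales += 3188 * side_street_curb_cuts           # per curb cut on side street
    sales += 2974 if stands_alone else 0            # store stands alone
    sales -= 1722 * main_street_lanes               # per lane on main street
    return sales

# Hypothetical site: gas pumps, stand-alone store, 2 curb cuts, 4-lane main street.
example = predicted_sales(gas_pumps=True, near_shopping_center=False,
                          transient_side_traffic=False, side_street_curb_cuts=2,
                          stands_alone=True, main_street_lanes=4)
print(example)  # 19753
```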
Calibration of the spatial interaction model
  • many different circumstances
  • major issues involved in calibration
  • specific tools are available
    • SIMODEL
  • it is also possible to use standard tools in packages such as SAS or GLIM
  • calibration possible using aggregate flows or individual choices
Linearization:
  • transformations that make the right hand side of the equation a linear combination of unknowns, and the left hand side a known quantity
Linearization of the unconstrained model:
  • suppose the Ei are known, the Aj unknown
    • the constant a can be absorbed into the Aj (i.e. find aAj)
  • suppose we use the negative power deterrence function
  • $I_{ij} = E_i A_j / d_{ij}^b$
  • move the Ei to the left:
  • $I_{ij} / E_i = A_j / d_{ij}^b$
  • take the logs of both sides:
  • $\log (I_{ij} / E_i) = \log A_j - b \log d_{ij}$
  • now a trick - introduce a set of dummy variables uijk, set to 1 if j = k, otherwise zero:
  • $\log (I_{ij} / E_i) = u_{ij1} \log A_1 + u_{ij2} \log A_2 + \dots - b \log d_{ij}$
  • now the left hand side is all knowns, the right hand side is a linear combination of unknowns (the logs of the As and b)
  • the model can now be calibrated (the unknowns can be determined) using ordinary multiple regression in a package like SAS
  • it may be easier to avoid linearizing altogether by using the nonlinear regression facilities in many packages
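The linearized calibration can be sketched with ordinary least squares in NumPy rather than SAS (a substitution made here only for illustration). The data are synthetic and noise-free, generated from made-up values of the Aj and b, so the regression should recover them; all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_origins, n_dests = 6, 3

# Synthetic "truth" used to generate flows (hypothetical values).
E = rng.uniform(100, 1000, n_origins)          # known emissivities
A_true = np.array([5.0, 12.0, 8.0])            # unknown attractions
b_true = 1.5                                   # unknown distance exponent
d = rng.uniform(1, 20, (n_origins, n_dests))   # distances d_ij
I = E[:, None] * A_true[None, :] / d ** b_true

# Left hand side: log(I_ij / E_i), one row per (i, j) pair in row-major order.
y = np.log(I / E[:, None]).ravel()

# Dummy variables u_ijk: 1 in column k when the pair's destination j equals k.
dest_index = np.tile(np.arange(n_dests), n_origins)
U = np.eye(n_dests)[dest_index]

# Final column is -log d_ij, so its coefficient is b itself.
X = np.column_stack([U, -np.log(d).ravel()])

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
A_hat = np.exp(coef[:n_dests])   # coefficients on the dummies are log A_j
b_hat = coef[-1]
print(A_hat, b_hat)
```

No intercept column is included because the dummy variables already span the constant term, which corresponds to absorbing a into the Aj.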
The objective function:
  • normally, we would try to maximize the fit of the observed and predicted interactions
  • linearization changes this
    • e.g. we minimize the squared differences between observed and predicted values of log (Iij/Ei) if ordinary regression is used on the linearized form above
    • this is easy in practice, but the objective has no clear justification in terms of the original flows
  • intuitively, an error of 30 in a prediction of 1000 trips is much more acceptable than an error of 30 in a prediction of 10 trips
  • these ideas are formalized in the technique of Poisson regression, which assumes that Iij is a count of events, and sets up the objective function accordingly
    • the function minimized to get a good fit is roughly the difference between observed and predicted, squared, divided by the predicted flow
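The intuition about errors of 30 trips can be made concrete by computing the rough objective described above: squared difference divided by predicted flow. This is a sketch of that approximate measure only, not a full Poisson regression; the function name `scaled_squared_error` and the flows are hypothetical.

```python
import numpy as np

def scaled_squared_error(observed, predicted):
    """Sum of (observed - predicted)^2 / predicted over all flows."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return ((observed - predicted) ** 2 / predicted).sum()

# The same absolute error of 30 trips on a large and a small predicted flow.
large_flow = scaled_squared_error([1030.0], [1000.0])   # 900 / 1000
small_flow = scaled_squared_error([40.0], [10.0])       # 900 / 10
print(large_flow, small_flow)
```

Under plain least squares both errors would contribute 900; here the error on the small flow is penalized 100 times more heavily.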