SpInt Development
Generalized linear model (GLM) base class for modeling count data (Poisson model). (Week 1 & 2; ~ May 23rd - June 3rd) Blog Post 1 / Blog Post 2 / Blog Post 3 / Blog Post 4 / Blog Post 5
- Coefficient estimation via iteratively re-weighted least squares routine (Feature Branch)
- Coefficient estimation via maximum likelihood and gradient optimization (using scipy and/or autograd): explored and pushed back to final week of optimization (Example)
- Include support for sparse matrix data structure (Feature Branch / [Example](https://gist.github.com/TaylorOshan/42d90dbf219b50f3b0d54e06ba4e8b5b))
- Poisson GLM diagnostics such as AIC, BIC, deviance, log-likelihood, null deviance, deviance residuals, working residuals, etc.
- Unit tests/documentation
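As a rough illustration of the IRLS and diagnostics items above, here is a minimal sketch of iteratively re-weighted least squares for a log-link Poisson GLM, followed by a few of the listed diagnostics. It is not the SpInt implementation: the function names `poisson_irls` and `poisson_diagnostics`, the starting values, the likelihood-based BIC definition, and the toy data are all assumptions made for this example.

```python
import math
import numpy as np

def poisson_irls(X, y, max_iter=50, tol=1e-8):
    """Fit a log-link Poisson GLM by iteratively re-weighted least squares."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    eta = np.log(y + 0.5)            # starting linear predictor (shift avoids log(0))
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        mu = np.exp(eta)
        z = eta + (y - mu) / mu      # working response for the log link
        XtW = X.T * mu               # working weights equal mu for Poisson with log link
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        eta = X @ beta
        if converged:
            break
    return beta, np.exp(eta)

def poisson_diagnostics(y, mu, k):
    """Deviance, log-likelihood, AIC and (likelihood-based) BIC for a Poisson fit."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # The term y * log(y / mu) is taken as 0 when y == 0
    ratio = np.where(y > 0, y / mu, 1.0)
    deviance = 2.0 * np.sum(y * np.log(ratio) - (y - mu))
    llf = np.sum(y * np.log(mu) - mu) - sum(math.lgamma(v + 1.0) for v in y)
    return {
        "deviance": deviance,
        "llf": llf,
        "aic": -2.0 * llf + 2.0 * k,
        "bic": -2.0 * llf + k * math.log(n),
    }

# Made-up, noise-free counts purely for demonstration
x = np.linspace(0.0, 1.0, 20)
X = np.column_stack([np.ones_like(x), x])
y = np.round(np.exp(0.5 + 1.2 * x))
beta, mu = poisson_irls(X, y)
stats = poisson_diagnostics(y, mu, k=X.shape[1])
```

At convergence the score equations `X.T @ (y - mu)` should be numerically zero, which is a convenient sanity check for any IRLS routine with a canonical link.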
Zero flows, zero-inflation, overdispersion, and heteroskedasticity. (Week 3 & 4; ~ June 6th - June 17th) Blog Post 6
- Tests for overdispersion
- Poisson Quasi Maximum Likelihood
- Unit tests/documentation
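The roadmap does not say which overdispersion tests are meant. As one possibility, the sketch below computes the Pearson dispersion ratio and a Cameron and Trivedi style regression-based statistic for the alternative Var(y) = mu + alpha * mu^2. The function names, the choice of alternative, and the simulated data are assumptions for illustration only; the true means are used in place of fitted values to keep the example short.

```python
import numpy as np

def pearson_dispersion(y, mu, k):
    """Pearson chi-square divided by residual degrees of freedom (about 1 under Poisson)."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    return np.sum((y - mu) ** 2 / mu) / (y.size - k)

def cameron_trivedi(y, mu):
    """Estimate alpha in Var(y) = mu + alpha * mu**2 via a no-intercept auxiliary regression.

    Returns (alpha_hat, t_statistic); a large positive t suggests overdispersion.
    """
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    z = ((y - mu) ** 2 - y) / mu              # auxiliary response
    alpha = np.sum(mu * z) / np.sum(mu ** 2)  # OLS slope of z on mu without intercept
    resid = z - alpha * mu
    s2 = np.sum(resid ** 2) / (y.size - 1)
    se = np.sqrt(s2 / np.sum(mu ** 2))
    return alpha, alpha / se

# Simulated example: equidispersed Poisson counts vs. a gamma-Poisson mixture
rng = np.random.default_rng(0)
mu = np.exp(np.linspace(0.5, 2.5, 5000))
y_pois = rng.poisson(mu)
y_nb = rng.poisson(mu * rng.gamma(shape=2.0, scale=0.5, size=mu.size))  # alpha = 0.5

d_pois = pearson_dispersion(y_pois, mu, k=0)
d_nb = pearson_dispersion(y_nb, mu, k=0)
alpha_nb, t_nb = cameron_trivedi(y_nb, mu)
```

In practice `mu` would come from a fitted Poisson model and `k` would be the number of estimated coefficients.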
Exploratory tools. (Week 5 & 6; ~ June 20th - July 1st)
- Vector-based spatial autocorrelation statistic
- Vector randomization for permutation-based hypothesis testing of vector spatial autocorrelation
- Automate origin/destination specific calibration to investigate non-stationary processes
- Unit tests/documentation
Flow-based spatial weight specifications. (Week 7 & 8; ~ July 4th - July 15th)
- Origin-destination weights
- Network origin-destination weights
- Unit tests/documentation
Spatial autoregressive (SAR) specifications. (Week 9 & 10 & 11; ~ July 18th - August 5th)
- Log-normal SAR (not production ready, but explored)
- Unit tests/documentation
Wrap up and prepare module for release. (Week 12 & 13; ~ August 8th - August 23rd)
- Optimize code
- Coefficient estimation via maximum likelihood and gradient optimization (using scipy and/or autograd)
- Zero-inflated Poisson Model
- Double check tests/documentation
- Finalize educational materials and provide sample analysis workflow using exploratory tools, diagnostic tests, and formal models
Additional goals if there is any extra time and the project is ahead of schedule:
- Competing destinations specifications
- Spatial eigenvector filter (SF) specifications
- Non-parametric “universal” model varieties
- Non-parametric Neural Network routines for calibrating spatial interaction models
For general development issues.
For discussion and notes pertaining to the Poisson SAR model and its theory, estimation, and implementation.
- [A Spatial Autoregressive Poisson Gravity Model](http://onlinelibrary.wiley.com/doi/10.1111/gean.12007/abstract)
They propose a Poisson model for flows that also has an autoregressive component composed of an origin-based dependence and a destination-based dependence. They do not include the third type of dependence originally proposed by LeSage & Pace (2008), an origin-destination-based dependence, and they do not really state why they omit it. They suggest a two-stage nonlinear least squares estimator for the model. Interestingly, this estimator assumes that the sum of the spatial autocorrelation parameters on the origin-based and destination-based dependence is less than or equal to one. There is no mention of what happens when this assumption is breached, that is, when the two parameters are collectively greater than 1. Furthermore, using the two-stage estimation routine means that final estimates of the spatial autocorrelation parameters are never actually obtained. Simulation results show that the estimator is unbiased, but these results are based on a relatively unrealistic sample (a linear arrangement of units where each unit has two neighbors, except the first and last) and on the premise that the spatial structure has been correctly specified (first-order contiguity).
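To make the origin-based and destination-based dependence concrete, the sketch below builds both flow-level weight matrices as Kronecker products of a row-standardized contiguity matrix for units arranged on a line, similar in spirit to the simulation layout described above. The stacking order of the flows (origin index varying slowest), the helper name `chain_contiguity`, and the parameter values 0.4 and 0.5 are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def chain_contiguity(n):
    """Row-standardized first-order contiguity for n units arranged on a line."""
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = 1.0
    W[idx + 1, idx] = 1.0
    return W / W.sum(axis=1, keepdims=True)

n = 4
W = chain_contiguity(n)
eye = np.eye(n)

# Assumed ordering: flow (o, d) sits at position o * n + d in the stacked vector.
W_d = np.kron(eye, W)   # destination-based: same origin, neighboring destinations
W_o = np.kron(W, eye)   # origin-based: neighboring origins, same destination
W_w = np.kron(W, W)     # origin-destination-based: neighbors of both ends

# With non-negative parameters and row-standardized weights, the combined
# matrix has constant row sums rho_o + rho_d, so its spectral radius equals
# that sum.
rho_o, rho_d = 0.4, 0.5
A = rho_o * W_o + rho_d * W_d
radius = np.max(np.abs(np.linalg.eigvals(A)))
```

One way to read the bound on the two parameters is through `radius`: under these assumptions it equals `rho_o + rho_d`, and `I - A` is guaranteed to be invertible when that sum stays below one. Whether this is the authors' motivation for the constraint is not stated in these notes.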
- In this paper the authors state that they use a nonlinear least squares estimator because the maximum likelihood estimate of the multivariate Poisson distribution does not have an analytically closed form. In the notes they then clarify that it is possible to write out the likelihood but that computationally intensive recursive algorithms are needed to compute it. What's more confusing to me is how/why they made the jump to the multivariate Poisson distribution. On pages 181-182 they describe the basic Poisson distribution and its ML estimator. They then introduce the SAR component and discuss dispersion properties and model interpretation (i.e., direct and indirect effects), before finally describing their estimator, where they have now assumed a multivariate Poisson (p. 188). Does anyone have any insights or literature as to how/why they made the move to the multivariate Poisson given the introduction of the SAR component?