Thursday, August 21, 2014

Developing Commercial AVM and Decision Matrix

Unlike residential AVM, where a single market model often does the trick, modeling commercial and industrial properties (“commercial”) requires a very different quantitative effort. Since the commercial world is quite heterogeneous – from auto to dining to entertainment to hotel to manufacturing to mixed use to office to retail to warehouse, etc. (multifamily residential is outside the scope of this piece) – use-stratified models, rather than all-encompassing global models, have proven more meaningful and produce significantly better values. Second, since these are mostly income-producing properties, income models and hybrid market and income models are as important as the market models. The competing models are finally evaluated via a quality decision matrix, paving the way for the optimal AVM.

Market AVM
In view of the general paucity of recent use-stratified commercial sales, a meaningful (representative) market modeling sample may require older sales, often the most recent five to seven years of sales. If commercial sales are warehoused internally, an internal sales validation initiative must be undertaken before they are selected for modeling. Once the sample is established, the sales dating back to prior years will require time adjustment. If the sample comprises five to seven years' worth of sales, annual time adjustment, rather than the traditional quarterly time adjustment, should suffice. Of course, in adjusting for time, simulating the value-based broader market (below the 25th percentile, 25th to 75th, and above the 75th percentile) is more crucial. Since most commercial population datasets consist, at least, of the following data variables – Land Area, Building Size (multiple layers), Year Built, Use, Zoning, Location (often multiple types), Traffic, Property Rating, Construction Grade, Story Height, Ceiling Height, Basement Use, Parking and Year Renovated – a test of multicollinearity among the multiplicative variables is necessary.

Unlike in residential models, a much higher collinearity threshold (say, +/- .4) should be used so that variables are not excluded unnecessarily, a priori. In addition, proper front-end treatment (auto-regressive linearization) of the categorical variables will minimize residual analysis at the back-end, promoting faster model convergence. When the MRA process is invoked, a lower minimum t-value (say, 1) must be used for the categorical variables to avoid having to constrain the model. An unconstrained model is more market-derived than a constrained model, which is essentially a collection of the modeler’s judgments. Also, a separate additive MRA stage may help establish the meaningfulness of the additive coefficients. Additionally, depending on the extent of the physical geography the sample covers, a set of residual-prompted GIS variables or a separate GIS plane must be explored to enhance model efficiency.
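The collinearity screen can be illustrated with a simple pairwise check. This is only a minimal sketch, assuming plain Pearson correlation on numeric (already linearized) variables and treating the .4 figure from the text as the absolute cutoff. The function names, variable names and toy figures are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def flag_collinear(columns, threshold=0.4):
    """Return (var_a, var_b, r) for every pair with |r| above the threshold.

    The 0.4 default mirrors the example threshold in the text; it is an
    illustrative setting, not a fixed rule.
    """
    flagged = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Hypothetical toy sample: building size moves in lockstep with land area,
# while year built is unrelated to either.
sample = {
    "land_area": [10000, 20000, 30000, 40000, 50000],
    "building_size": [2000, 4000, 6000, 8000, 10000],
    "year_built": [1990, 1970, 1980, 1970, 1990],
}
flagged = flag_collinear(sample)
print(flagged)
```

A flagged pair is a prompt for the modeler to review the two variables, not an automatic exclusion.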

One of the basic differences between residential and commercial modeling is that the latter requires extensive residual analyses before outliers are defined and removed. In other words, at least the liquid categories within each categorical variable must be optimized so that they all have median sales ratios between .97 and 1.03 (or between 97 and 103 if expressed as such), assuming, of course, a margin of error of 3%. The liquidity threshold for a category (generally between 3% and 5% of the sample) must also be applied consistently. If, for instance, a 5% liquidity factor is decided upon, in a 1,000-case sample all variable categories with a minimum count of 50 would be subjected to residual analyses and optimization.
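As a rough illustration of the liquidity rule and the .97 to 1.03 band, the sketch below groups cases by category, keeps only the categories that meet the minimum count, and reports whether each median sales ratio falls inside the band. It assumes the sales ratio is the model estimate divided by the sale price. The function name and the sample data are hypothetical.

```python
from collections import defaultdict
from statistics import median


def liquid_category_report(cases, liquidity_pct=5, low=0.97, high=1.03):
    """Summarize median sales ratios for the liquid categories of one variable.

    cases: list of (category, sales_ratio) pairs, where sales_ratio is
    assumed to be model estimate / sale price.
    Returns {category: (count, median_ratio, within_band)}.
    """
    min_count = len(cases) * liquidity_pct / 100
    groups = defaultdict(list)
    for category, ratio in cases:
        groups[category].append(ratio)

    report = {}
    for category, ratios in groups.items():
        if len(ratios) >= min_count:  # illiquid categories are skipped
            med = median(ratios)
            report[category] = (len(ratios), med, low <= med <= high)
    return report


# Hypothetical 100-case sample for one variable (Construction Grade).
# At a 5% liquidity factor the minimum count is 5, so grade C is illiquid.
cases = [("A", 1.00)] * 60 + [("B", 1.08)] * 36 + [("C", 0.90)] * 4
report = liquid_category_report(cases)
for category, (count, med, ok) in report.items():
    print(category, count, med, "in band" if ok else "needs optimization")
```

Here grade B is liquid but sits outside the band, so it would be a candidate for residual optimization in the next MRA cycle.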

Subsequent to the residual analyses, outliers must be identified and removed using a symmetric cut on the sales ratio curve, say all cases below the 5th percentile and above the 95th percentile. Ideally, the outlier-free model should have a COD/COV under 20 and an R-squared above .85. Of course, certain stratifications like auto, dining, mixed use, and warehouse and distribution tend to produce better stats than their office and retail counterparts. Mixed use properties, by virtue of their higher owner-occupied use, tend to respond very well to market models. Conversely, sales of funeral homes, nurseries, day care centers, etc. are generally few and far between, so an alternate cost model (non-AVM) may be explored. Due to their zoning commonality, warehouse and distribution and manufacturing are often modeled together.
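The trimming step and the two dispersion stats can be sketched as follows. This is a minimal example, assuming the conventional ratio-study definitions: COD is the average absolute deviation from the median ratio expressed as a percentage of the median, and COV is the standard deviation as a percentage of the mean. The percentile interpolation method and the toy ratios are illustrative choices only.

```python
from statistics import mean, median, quantiles, stdev


def trim_outliers(ratios, lower_pct=5, upper_pct=95):
    """Keep only the cases between the lower and upper percentile cuts."""
    cuts = quantiles(ratios, n=100, method="inclusive")  # 99 cut points
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [r for r in ratios if low <= r <= high]


def cod(ratios):
    """Coefficient of dispersion around the median, in percent."""
    med = median(ratios)
    return 100 * mean([abs(r - med) for r in ratios]) / med


def cov(ratios):
    """Coefficient of variation around the mean, in percent."""
    return 100 * stdev(ratios) / mean(ratios)


# Hypothetical sales ratios: a tight core plus one low and one high outlier.
ratios = [0.40] + [0.90 + 0.01 * i for i in range(21)] + [2.50]
trimmed = trim_outliers(ratios)

print("cases:", len(ratios), "->", len(trimmed))
print("COD before/after:", round(cod(ratios), 1), round(cod(trimmed), 1))
print("COV before/after:", round(cov(ratios), 1), round(cov(trimmed), 1))
```

Because the cut is symmetric, a few legitimate cases at each edge are removed along with the true outliers, which is the trade-off implied by a percentile rule.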

Needless to say, the MRA process must include multiple cycles: cycle one to produce the multiplicative residuals that help develop the GIS variables or the GIS plane; cycle two to optimize residuals from the categorical variables; cycle three to help identify and remove outliers; and a fourth and final cycle to generate the optimized model, which can then be tested on the holdout sample and applied to the population to produce values. Again, in order to produce a statistically significant market model, a quality sales sample is essential. Analysts must not be afraid to use older commercial sales to develop the sales sample.

Income AVM
Income AVM is an emerging trend. The most common form of income model is the gross income model which, in turn, produces the statistically significant Gross Income Multipliers (GIM). Since income models, like the market models, also require sound income samples, income models are more common in taxing jurisdictions (CAMA environment) than in private settings. Most major taxing jurisdictions tend to collect some form – all or samples – of income and expense (I&E) data. Therefore, in recent years, they have been looking into the possibility of using that data to develop some models and metrics. Instead of potential gross income, property owners tend to report effective gross income (net of vacancies), so most GIM models are essentially EGIM models. 

Considering property owners are often statutorily required to submit annual I&Es, income samples are more liquid than sales samples. In a statutory environment, incomes from the most recent survey year should contribute to the sample, while in non-statutory environments the most recent two to three years of non-repeat filings should be evaluated for sampling. For example, if owner X has filed for both 2012 and 2013, only the 2013 data should be used. Likewise, if owner Y has filed for 2011 and 2012 only, the 2012 data should be used in sampling. Once the sample becomes representative, prior year incomes must be time-adjusted to the most recent year (or the modeling year, as appropriate) income level.
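The non-repeat filing rule and the subsequent time adjustment can be sketched as below, using the owner X and owner Y example from the text. The income amounts and the annual growth factor are hypothetical placeholders; in practice the factor would come from the jurisdiction's own income trend analysis.

```python
def latest_filings(filings):
    """Keep only each owner's most recent filing.

    filings: iterable of (owner, year, effective_gross_income).
    Returns {owner: (year, effective_gross_income)}.
    """
    latest = {}
    for owner, year, egi in filings:
        if owner not in latest or year > latest[owner][0]:
            latest[owner] = (year, egi)
    return latest


def adjust_to_year(year, amount, target_year, growth):
    """Roll an income forward to target_year using annual growth factors.

    growth[y] is the assumed factor that moves year y income to year y + 1.
    """
    for y in range(year, target_year):
        amount *= growth[y]
    return amount


# Owner X filed for 2012 and 2013; owner Y filed for 2011 and 2012 only.
filings = [
    ("X", 2012, 100000), ("X", 2013, 104000),
    ("Y", 2011, 80000), ("Y", 2012, 82000),
]
growth = {2012: 1.03}  # hypothetical 2012 -> 2013 income trend

sample = latest_filings(filings)
adjusted = {
    owner: adjust_to_year(year, egi, 2013, growth)
    for owner, (year, egi) in sample.items()
}
print(sample)
print(adjusted)
```

Owner X's 2013 filing is used as reported, while owner Y's 2012 filing is rolled forward one year to the 2013 income level.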

Private sector modelers either use their own internal income surveys or buy the income data from specialized vendors. As usual, income models must also be built around use stratifications. If income surveys are also completed by illiquid strata like funeral homes, nurseries, day care centers, etc., they may be grouped and modeled together. In that case, a separate linearized categorical variable representing the different strata must serve as an independent variable. In terms of the rest of the modeling mechanics, there is not much of a difference between the market model and the income model. Linearization, multicollinearity testing, residual analyses and outlier removal all follow the same rules as in the market model.

Of course, due to the larger sample sizes, income models generally produce better stats than the market models. For larger jurisdictions, the COD/COV must be below 20 (preferably around 15) and the model R-squared above .85 (preferably above .88). When both income and market models are developed, it is a good practice to develop the income model later as it has to depend on the modeled sales to generate the GIM. The modeled GIM then gets translated to the market values.
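The arithmetic behind the multiplier can be shown with a small example. This sketch substitutes a simple median multiplier per use stratum for the MRA-modeled GIM described above, purely to show how modeled sale values and effective gross incomes combine. All figures and function names are hypothetical.

```python
from statistics import median


def median_egim(sample):
    """Median effective gross income multiplier for one use stratum.

    sample: list of (modeled_sale_value, effective_gross_income) pairs.
    """
    return median([value / egi for value, egi in sample])


def value_from_income(egi, egim):
    """Translate a parcel's effective gross income into a value."""
    return egi * egim


# Hypothetical office stratum: modeled sale values paired with reported EGI.
office_sample = [
    (1000000, 125000),  # multiplier 8.0
    (1800000, 200000),  # multiplier 9.0
    (700000, 100000),   # multiplier 7.0
]
egim = median_egim(office_sample)
estimate = value_from_income(150000, egim)
print(egim, estimate)
```

A full income model would let the multiplier vary with the property's characteristics rather than apply a single figure per stratum.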

Hybrid Income and Market AVM
Since the mutually exclusive market model and income model each have their own inadequacies, the values they generate separately are non-optimal. The hybrid model, on the other hand, is built on a large random sample – generally mutually exclusive – drawn from both population MRAs, thereby bringing the values closer to an optimal solution. The dependent variables in both the market and income models are in raw form, although time-adjusted. In the hybrid model, however, the dependent variable is the already modeled MRA estimate from both models, so no further time adjustment is needed.

Since the hybrid modeling sample will be significantly larger and the dependent variable smoother, the modeling stats will be much finer as well. In fact, the COD/COV from this model should be well under 15 and the model R-squared around .9. Given the way the relationship between sales and incomes plays out in the marketplace, the final market values generated by the market model will be consistently higher than those generated by the income model. In certain markets, the differential may vary between 15% and 30%, depending on the use stratum, often forcing users to abandon both. However, this hybrid model will help minimize that noise, producing values more in line with the market.

Again, once the sample is established, the rest of the modeling mechanics will be the same as in market/income models. Since this is more like an optimization model, the GIS variables (from both models) may be recycled as long as they are properly realigned based on new residuals.

Decision Matrix
Now that there are three separate AVM values for each parcel, the user is confronted with a decision matrix. In order to validate the AVM values and select the most efficient ones from the matrix, a series of use-stratified samples must be concurrently hand-worked by qualified appraisers. As the sample results arrive, they should be compared with the three AVM values and the deviations studied. The AVM category that demonstrates the least deviation from the hand-worked values should receive top priority when values are cascaded for that use category.

Obviously, the hybrid values would be expected to triumph in most use categories; however, the market and income AVMs may prevail in the mixed use and retail categories, respectively. If all of the AVM values consistently diverge (say, by over 20%) from the hand-worked values, the entire modeling process should be subjected to scrutiny. By the same token, counter samples may also be considered to validate the main hand-worked samples, particularly when assessment revaluations are involved. Counter samples are small re-randomized subsets of the main samples, generally worked up by a veteran appraiser.
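The comparison against the hand-worked samples can be sketched as a small ranking routine, run once per use stratum. This is an illustration only: it assumes median absolute percentage deviation as the deviation measure (the text does not prescribe one) and uses the 20% figure above as the trigger for a full review. All values, in thousands of dollars, are hypothetical.

```python
from statistics import median


def median_abs_pct_deviation(avm_values, hand_worked):
    """Median absolute deviation of AVM values from appraiser values, in %."""
    return median(
        [100 * abs(a - h) / h for a, h in zip(avm_values, hand_worked)]
    )


def rank_models(avm_by_model, hand_worked, review_threshold=20.0):
    """Rank AVM types for one use stratum by deviation from the appraisers.

    Returns (ranking, deviations, needs_review), where needs_review is True
    only when every model deviates by more than review_threshold percent.
    """
    deviations = {
        name: median_abs_pct_deviation(values, hand_worked)
        for name, values in avm_by_model.items()
    }
    ranking = sorted(deviations, key=deviations.get)
    needs_review = all(d > review_threshold for d in deviations.values())
    return ranking, deviations, needs_review


# Hypothetical office stratum: three parcels hand-worked by appraisers.
hand_worked = [1000, 2000, 4000]
avm_by_model = {
    "market": [1100, 2300, 4400],
    "income": [850, 1700, 3600],
    "hybrid": [1000, 1900, 4100],
}
ranking, deviations, needs_review = rank_models(avm_by_model, hand_worked)
print(ranking, deviations, needs_review)
```

The first entry in the ranking is the AVM type that would be cascaded first for that stratum.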

Since property values wear out quite rapidly – especially in a dynamic market such as this – the hand-worked samples must be orchestrated in such a way that their availability coincides with that of the AVM values, thus minimizing any waiting lag. In addition to validating the AVM values, the hand-worked samples help cascade those values in a more meaningful way.

In a nutshell, commercial AVM is more than just a market model. It involves creating a decision matrix consisting of a market model, an income model, and a hybrid income and market model, as well as a series of appraisers’ hand-worked samples and counter samples.


Sid Som
Homequant Inc.

Note - This book is available on Kindle (Amazon - query 'Sid Som').
