Thursday, August 21, 2014

Developing Commercial AVM and Decision Matrix

Unlike the residential AVM, where a single market model often does the trick, modeling commercial and industrial properties (“commercial”) requires very different quantitative efforts. Since the commercial world is quite heterogeneous – from auto to dining to entertainment to hotel to manufacturing to mixed use to office to retail to warehouse, etc. (multifamily residential is outside the scope of this piece) – use-stratified models, instead of all-encompassing global models, have proven more meaningful, producing significantly better values. Second, considering these are mostly income-producing properties, income models and hybrid market-and-income models are just as important as the market models. The optimality of all of these competing models is finally evaluated via a quality decision matrix, paving the way for the optimal AVM.

Market AVM
In view of the general paucity of use-stratified recent commercial sales, a meaningful (representative) market modeling sample may require older sales, often the most recent five to seven years of sales. If commercial sales are internally warehoused, an internal sales validation initiative must be undertaken before they are selected for modeling. Once the sample is established, the sales dating back to prior years will require time adjustment. If the sample comprises five to seven years' worth of sales, annual time adjustment, rather than the traditional quarterly time adjustment, should suffice. Of course, in adjusting for time, it is crucial to segment the broader market by value (below the 25th percentile, 25th to 75th, and above the 75th percentile) and adjust each segment separately. Since most commercial population datasets consist, at least, of the following data variables – Land Area, Building Size (multiple layers), Year Built, Use, Zoning, Location (often multiple types), Traffic, Property Rating, Construction Grade, Story Height, Ceiling Height, Basement Use, Parking and Year Renovated – a test of multicollinearity among the multiplicative variables is necessary.
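The annual time adjustment described above can be sketched as follows. This is a minimal illustration with hypothetical sale records and a single value tier; in practice, each value segment (below the 25th percentile, 25th to 75th, above the 75th) would receive its own set of factors:

```python
from statistics import median

# Hypothetical (sale_year, sale_price) records spanning several years
sales = [
    (2009, 400_000), (2009, 420_000), (2010, 430_000), (2010, 450_000),
    (2011, 460_000), (2012, 480_000), (2013, 500_000), (2013, 520_000),
]
base_year = 2013  # modeling year all older sales are adjusted to

# Median sale price per sale year
by_year = {}
for yr, price in sales:
    by_year.setdefault(yr, []).append(price)
medians = {yr: median(prices) for yr, prices in by_year.items()}

# Annual factor = base-year median / sale-year median
factors = {yr: medians[base_year] / m for yr, m in medians.items()}

# Time-adjusted price = raw price * the factor for its sale year
adjusted = [(yr, price * factors[yr]) for yr, price in sales]
```

Medians are used instead of means so a single atypical sale in a thin year does not skew that year's factor.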

Unlike in residential models, a much higher collinearity threshold (say, +/- .4) must be considered to prevent the unnecessary a priori exclusion of some of the variables. Proper front-end treatments (auto-regressive linearization) of the categorical variables will minimize residual analysis at the back-end, promoting faster model convergence. As the MRA process is invoked, a lower minimum t-value (say, 1) for the categorical variables must be used to avoid having to constrain the model. An unconstrained model is more market-derived than a constrained model, which is essentially a collection of the modeler's judgments. Also, a separate additive MRA stage may help establish the meaningfulness of the additive coefficients. Additionally, depending on the extent of the physical geography the sample covers, a set of residual-prompted GIS variables or a separate GIS plane must be explored to enhance model efficiency.
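A simple pairwise collinearity screen at the ±.4 threshold might look like the sketch below. The variables and data are hypothetical; a production model would screen every multiplicative variable pair before the MRA runs:

```python
from itertools import combinations
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length columns."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical multiplicative variable columns from a commercial sample
data = {
    "land_area":  [5000, 7500, 6000, 9000, 4000, 8000],
    "bldg_size":  [12000, 18000, 14000, 22000, 9000, 20000],
    "year_built": [1965, 1980, 1972, 1990, 1958, 1985],
}

THRESHOLD = 0.4  # the +/- .4 screen from the text

# Flag variable pairs whose absolute correlation exceeds the threshold
flagged = [
    (a, b, round(pearson(data[a], data[b]), 2))
    for a, b in combinations(data, 2)
    if abs(pearson(data[a], data[b])) > THRESHOLD
]
```

Flagged pairs are candidates for consolidation or exclusion before the multiplicative stage, rather than being dropped a priori.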

One of the basic differences between residential and commercial modeling is that the latter requires extensive residual analyses before outliers are defined and removed. In other words, at least the liquid categories within each categorical variable must be optimized so that they all have median sales ratios between .97 and 1.03 (or between 97 and 103 if expressed as percentages), assuming, of course, a margin of error of 3%. The determination of a category's liquidity (generally a 3% to 5% share of the sample) must also be consistent. If, for instance, a 5% liquidity factor is decided upon, then in a 1,000-case sample all variable categories showing a minimum count of 50 would be subjected to residual analyses and optimization.
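The median-ratio check can be sketched as follows, using hypothetical (category, model value, sale price) records and the 5% liquidity factor from the example above:

```python
from statistics import median

# Hypothetical (category, model_value, sale_price) records
records = [
    ("office", 970_000, 1_000_000), ("office", 1_030_000, 1_000_000),
    ("office", 990_000, 1_000_000),
    ("retail", 900_000, 1_000_000), ("retail", 880_000, 1_000_000),
    ("retail", 910_000, 1_000_000),
]

LIQUIDITY = 0.05  # 5% liquidity factor
MIN_COUNT = max(1, int(LIQUIDITY * len(records)))  # minimum count to qualify

# Sales ratio = model value / sale price, grouped by category
ratios = {}
for cat, value, price in records:
    ratios.setdefault(cat, []).append(value / price)

# Liquid categories whose median ratio falls outside .97-1.03 need optimization
needs_work = {
    cat: round(median(r), 3)
    for cat, r in ratios.items()
    if len(r) >= MIN_COUNT and not 0.97 <= median(r) <= 1.03
}
```

Here "retail" (median ratio .90) would be flagged for further residual work, while "office" (median .99) already sits inside the 3% band.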

Subsequent to the residual analyses, outliers must be identified and removed using a symmetric sales ratio curve – say, all cases below the 5th percentile and above the 95th percentile. Ideally, the outlier-free model should have a COD/COV under 20 and an R-squared above .85. Of course, certain stratifications like auto, dining, mixed use, and warehouse and distribution tend to produce better stats than their office and retail counterparts. Mixed use properties, by virtue of their higher owner-occupied use, tend to respond very well to market models. Conversely, sales of funeral homes, nurseries, day care centers, etc. are generally few and far between, so an alternate cost model (non-AVM) may be explored. Due to zoning commonality, warehouse and distribution and manufacturing are often modeled together.
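The symmetric trim and the COD check can be sketched together, on a hypothetical ratio series that includes two obvious outliers:

```python
import math
from statistics import median

def cod(ratios):
    """Coefficient of dispersion: mean absolute deviation from the
    median ratio, expressed as a percent of the median."""
    m = median(ratios)
    return 100 * sum(abs(r - m) for r in ratios) / (len(ratios) * m)

# Hypothetical sales ratios (model value / time-adjusted sale price),
# including one extreme low and one extreme high case
ratios = sorted([0.40, 0.92, 0.95, 0.97, 0.98, 1.00,
                 1.00, 1.02, 1.03, 1.05, 1.08, 2.10])

# Symmetric trim: drop cases below the 5th and above the 95th percentile
n = len(ratios)
k = math.ceil(0.05 * n)   # number of cases trimmed from each tail
trimmed = ratios[k: n - k]
```

On this toy series, trimming the two tail cases brings the COD from roughly 17 down to under 4, well inside the sub-20 target.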

Needless to say, the MRA process must include multiple cycles: cycle one to produce the multiplicative residuals that help develop the GIS variables or the GIS plane; cycle two to optimize the residuals from the categorical variables; cycle three to help identify and remove outliers; and a fourth and final cycle to generate the optimized model, which could then be tested on the holdout sample and applied to the population to produce values. Again, in order to produce a statistically significant market model, a quality sales sample is essential. Analysts must not be afraid to use older commercial sales to develop the sales sample.

Income AVM
Income AVM is an emerging trend. The most common form of income model is the gross income model, which, in turn, produces statistically significant Gross Income Multipliers (GIMs). Since income models, like the market models, require sound income samples, income models are more common in taxing jurisdictions (the CAMA environment) than in private settings. Most major taxing jurisdictions tend to collect some form – all or samples – of income and expense (I&E) data. Therefore, in recent years, they have been looking into the possibility of using that data to develop models and metrics. Instead of potential gross income, property owners tend to report effective gross income (net of vacancies), so most GIM models are essentially EGIM models.

Considering property owners are often statutorily required to submit annual I&Es, income samples are more liquid than sales samples. In a statutory environment, incomes from the most recent survey year should contribute to the sample, while in non-statutory environments the most recent two to three years of non-repeat filings should be evaluated for sampling. For example, if owner X has filed for both 2012 and 2013, only the 2013 data should be used. Likewise, if owner Y has filed for 2011 and 2012 only, the 2012 data should be used in sampling. Once the sample becomes representative, prior-year incomes must be time-adjusted to the most recent year's (or the modeling year's, as appropriate) income level.
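The non-repeat sampling rule above amounts to keeping each owner's most recent filing. A minimal sketch, with hypothetical owners and incomes:

```python
# Hypothetical I&E filings: (owner, filing_year, effective_gross_income)
filings = [
    ("X", 2012, 240_000), ("X", 2013, 250_000),
    ("Y", 2011, 180_000), ("Y", 2012, 185_000),
    ("Z", 2013, 320_000),
]

# Keep only each owner's most recent filing for the sample
latest = {}
for owner, year, egi in filings:
    if owner not in latest or year > latest[owner][0]:
        latest[owner] = (year, egi)
```

Owner X contributes only the 2013 filing and owner Y only the 2012 filing, exactly as in the worked example; the retained prior-year incomes would then be time-adjusted to the modeling year.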

Private sector modelers either use their own internal income surveys or buy income data from specialized vendors. As usual, income models must also be built around use stratifications. If income surveys are also completed by the illiquid strata like funeral homes, nurseries, day care centers, etc., they may be grouped and modeled together. In that case, a separate linearized categorical variable representing the different strata must serve as an independent variable. In terms of the rest of the modeling mechanics, there is not much of a difference between the market model and the income model: linearization to multicollinearity to residual analyses to outlier removal all follow the same rules as in the market model.

Of course, due to the larger sample sizes, income models generally produce better stats than the market models. For larger jurisdictions, the COD/COV must be below 20 (preferably around 15) and the model R-squared above .85 (preferably above .88). When both income and market models are developed, it is good practice to develop the income model later, as it depends on the modeled sales to generate the GIM. The modeled GIM then gets translated into market values.
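The final translation step is a straightforward multiplication. A minimal sketch, assuming hypothetical modeled EGIMs by use stratum and hypothetical parcel incomes:

```python
# Hypothetical modeled EGIMs by use stratum (outputs of the income MRA)
modeled_gim = {"office": 9.5, "retail": 8.2, "warehouse": 7.0}

# Hypothetical parcels: (parcel_id, use_stratum, effective_gross_income)
parcels = [
    ("p1", "office", 250_000),
    ("p2", "retail", 310_000),
    ("p3", "warehouse", 400_000),
]

# Market value = modeled GIM for the stratum x effective gross income
values = {pid: modeled_gim[use] * egi for pid, use, egi in parcels}
```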

Hybrid Income and Market AVM
Since the mutually exclusive market and income models have their own inadequacies, the values they generate separately are non-optimal. The hybrid model, on the other hand, is built on a large random sample – generally mutually exclusive – drawn from both population MRAs, thereby bringing the values closer to an optimal solution. The dependent variables in both the market and income models are in raw form, although time-adjusted. In the hybrid model, however, the dependent variable is the already-modeled MRA estimate from both models, so no further time adjustment is needed.
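Assembling the hybrid dependent variable might be sketched as below. The parcel ids, estimates, and sample sizes are all hypothetical; the point is simply that the pooled dependent variable comes from the two prior MRAs rather than from raw sales or incomes:

```python
import random
random.seed(42)  # reproducible draw for the sketch

# Hypothetical modeled estimates from the two prior MRAs, keyed by parcel id
market_est = {f"m{i}": 1_000_000 + 5_000 * i for i in range(200)}
income_est = {f"i{i}": 950_000 + 5_000 * i for i in range(200)}

# Draw a large random sample from each population; the two parcel sets are
# generally mutually exclusive, so the draws pool cleanly
sample_ids = (random.sample(list(market_est), 120)
              + random.sample(list(income_est), 120))

# Dependent variable = already-modeled (and already time-adjusted)
# MRA estimate, whichever model the parcel came from
hybrid_dep = {pid: market_est.get(pid, income_est.get(pid))
              for pid in sample_ids}
```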

Since the hybrid modeling sample will be significantly larger and the dependent variable smoother, the modeling stats will be much finer as well. In fact, the COD/COV from this model should be well under 15 and the model R-squared around .9. Given the way the relationship between sales and incomes plays out in the marketplace, the final market values generated by the market model will be consistently higher than those from the income model. In certain markets, the differential may vary between 15% and 30%, depending on the use stratum, thus forcing many users to abandon both. This hybrid model, however, will help minimize that noise, producing values more in line with the market.

Again, once the sample is established, the rest of the modeling mechanics will be the same as in market/income models. Since this is more like an optimization model, the GIS variables (from both models) may be recycled as long as they are properly realigned based on new residuals.

Decision Matrix
Now that there are three separate AVM values for each parcel, the user is confronted with a decision matrix. In order to validate the AVM values and select the most efficient ones from the matrix, a series of use-stratified samples must be concurrently hand-worked by qualified appraisers. As the sample results arrive, they should be compared with the three AVM values and the deviations studied. The AVM stream that demonstrates the least deviation from the hand-worked values should receive top priority in the cascade for that use category.
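The selection rule can be sketched as follows, using hypothetical hand-worked values and AVM outputs and the median absolute percent deviation as the yardstick (the deviation metric is an illustrative choice, not prescribed by the text):

```python
from statistics import median

# Hypothetical rows: (use, hand_value, market_avm, income_avm, hybrid_avm)
samples = [
    ("office", 1_000_000, 1_150_000, 1_080_000, 1_020_000),
    ("office", 2_000_000, 2_240_000, 2_120_000, 2_060_000),
    ("retail",   800_000,   930_000,   810_000,   840_000),
    ("retail", 1_200_000, 1_390_000, 1_220_000, 1_270_000),
]

def pick_winner(rows):
    """Return the AVM stream with the least median absolute % deviation
    from the hand-worked values."""
    streams = {"market": 2, "income": 3, "hybrid": 4}
    devs = {
        name: median(abs(r[i] - r[1]) / r[1] for r in rows)
        for name, i in streams.items()
    }
    return min(devs, key=devs.get)

# Winner per use stratum drives the cascade priority
winners = {
    use: pick_winner([r for r in samples if r[0] == use])
    for use in {r[0] for r in samples}
}
```

On this toy matrix the hybrid stream wins the office category while the income stream wins retail, mirroring the pattern the text anticipates.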

Obviously, the hybrid values would be expected to triumph in most use categories; however, the market and income AVMs may prevail in the mixed use and retail categories, respectively. If all of the AVM values consistently diverge (say, by over 20%) from the hand-worked values, the entire modeling process should be subjected to scrutiny. By the same token, counter samples may also be considered to validate the main hand-worked samples, particularly when assessment revaluations are involved. Counter samples are small re-randomized subsets of the main samples, generally worked up by a veteran appraiser.

Since property values wear out quite rapidly – especially in a dynamic market like this – the hand-worked samples must be orchestrated in such a way that their availability coincides with the AVM values, thus minimizing any waiting lag. In addition to validating the AVM values, the hand-worked samples help cascade those values in a more meaningful way.

In a nutshell, commercial AVM is more than just a market model. It involves creating a decision matrix consisting of a market model, an income model, and a hybrid income-and-market model, as well as a series of appraisers' hand-worked samples and counter samples.


Sid Som
Homequant Inc.

Note - This book is available on Kindle (Amazon - query 'Sid Som').

Wednesday, August 20, 2014

Understanding the Buzz about MLS-based AVM

Building AVMs using the Multiple Listing Services (MLS) data – closed sales as well as seasoned listings – is often thought to be a smarter route in light of its cleaner and more up-to-date data. 

However, those who advocate that fail to understand that the MLS provides nothing more than a sales database; as a result, any AVM developed from it could not be effectively applied to a population outside of the MLS listings. Here are some of the reasons why MLS-based AVMs could fall short as full-cycle AVMs or CAMA models when applied to a non-MLS population:

1. MLS Sales Data do not Communicate with Vendor Population Data: An automated valuation model by itself is not very useful unless it is applied to a large population. In the public sector, a form of AVM known as mass appraisal modeling is frequently used by the assessing jurisdictions to generate assessment rolls. Since those jurisdictions collect, maintain, and warehouse their own property data, there is an inherent disconnect between their data variables and formats and those of the MLS.

In the private sector, financial institutions – local to regional to national – use AVM values in the front-end (origination), in the mid-end (portfolio management), and in the back-end (collections and foreclosures). For cost effectiveness, AVM houses tend to acquire their data, population as well as sales, directly from the source – meaning the assessing jurisdictions – and reformat it in-house using their own unique formats.
The MLS sales, once recorded, are also available from those jurisdictions. Those financial institutions do not buy sales data; instead, they buy AVM values from the AVM houses in line with their aforesaid requirements. Again, given the huge disconnect between their data and the MLS's, the AVM houses develop their models using their in-house sales data and then apply them to the respective populations to generate values. They are judged not only on the effectiveness of their values but also on their hit rates, so they need to continually update their inventory. To be competitive, they also need to comply with rigid industry guidelines and requirements, e.g., S&P and OCC.

2. MLS Sales Alone will not Generate a Representative Sales Sample: While it would be almost impossible for the AVM houses to depend exclusively on the MLS sales to build models and then apply them to their populations, many assessing jurisdictions use recent MLS sales in their model-building process to bolster the recency of their modeling sales sample.
However, recent MLS sales alone – even allowing for the lag between sale closing and recording – would be totally inadequate, considering the sales sample must also be representative of the population.

3. The MLS Sales Universe is a Subset of the Overall Sales Market: Not all owners list their properties for sale with the MLS. Many are sold directly, through organizations like For Sale By Owner (FSBO), and by other non-member brokerage houses. High-end properties are often marketed via specialty brokers who tend to be non-members. Consequently, in addition to the aforesaid sampling error, models built off of MLS sales only will introduce serious bias at the point of application.

4. The MLS Data are Proprietary: Except for the basic sales information (sale price, sale date, validation), the use of MLS property data (other than for sales promotion by network members) requires express written consent. Knowledgeable practitioners and consultants in related fields like mass appraisal and AVM are aware of those copyright restrictions and tend to avoid such infringements – even though the MLS data are cleaner and more up-to-date, including the interior data, as they are professionally collected by members with an eye for competing details and special upgrades.
Of course, even if those interior data variables were inadvertently used in a model, that model could not be effectively applied to the non-MLS population from the national data vendors, as their databases rarely incorporate such detailed interior data (please read ‘Do You Need Interior Data to Build a Better Residential AVM’ in my first book, ‘How To Build a Better Automated Valuation Model,’ for more details).

5. Generic MLS-based AVM is Not in the Best Interest of the MLS: While an AVM is an econometric solution, the MLS listings are originated by realtors using a very subjective process called comparative market analysis (CMA). If an AVM is developed with recent MLS sales and then applied to current listings, the deviations between the listed prices and the AVM values would be difficult to explain to clients, creating serious confusion in the marketplace.
For example, if property X is listed for $480,000 while the new MLS AVM value comes in at $420,000, it would put the listing broker on the defensive and offer a comparative advantage to the buyer's agent. Conversely, if the numbers are transposed, the seller would feel terribly shortchanged by the listing agent's inefficient CMA, which could easily result in the cancellation of the listing. In view of these potential conflicts, most listing brokers would be opposed to MLS-wide AVMs. Any deviation, positive or negative, would not be conducive to their client relationships.

In late 2009, the National Association of Realtors announced a new national AVM initiative called the Realtor Valuation Model (RVM™) to build their version of the AVM, which they expect will eventually become the “gold standard.” In building their AVMs, they would be sourcing data from multiple external sources as well as the MLS's own internal and proprietary data.
Of course, an MLS-based AVM could be of immense help if it were developed by the MLS using their recent sales and listings data, frequently updated, and continually applied to incoming listings to validate all CMAs. Thus, in addition to producing econometrically-equalized starting values for all listings, the seller would be given an opportunity to evaluate both CMA and AVM values simultaneously before committing to the contract.
This independent validation would provide added comfort to the seller while reducing the valuation risk for the listing broker. Needless to say, this addition would be the risk-managed highest and best use of their proprietary data, creating an unprecedented union between the wholesale (AVM) and retail (CMA) values within the same environment.


Sid Som MBA, MIM
Homequant Inc. 


Thursday, August 14, 2014

Why Specialized AVMs are Needed for Foreclosure Markets

Most banks and mortgage houses have been buying AVM values from the leading vendors to cater to their needs in the front-end (origination) and mid-end (portfolio management). However, since the recent housing bust, their back-end (collections, foreclosures, short sale, REO sales, etc.) needs have exploded. Obviously, the AVM values that are meaningful for the front-end and mid-end are practically useless in the back-end. 

While some traditional AVM houses and private consultants will still try to make a case for a market-appropriate discounting factor so they can continue to recycle their current models, that concept is virtually untenable considering the fast-moving nature of the back-end market. Here are some of the reasons why specialized REO and foreclosure (“foreclosure”) AVMs are needed to address the fast-growing needs of this particular market segment:

1. An Overall Discounting Factor is Only a Surface Correction: Let's say, for argument's sake, that in major market X the differential between the primary market and the foreclosure market is 30%. On the heels of this market statistic, if a bank buys current AVM values from a vendor and tries to fit a 30% haircut to its foreclosure portfolio, it would make a serious mistake by emphasizing only surface corrections, thereby distorting the sub-markets (a.k.a. pockets) that tend to deviate from the norm. Even a local housing market consists of many diverse sub-markets that tend to be econometrically different from the smooth median corridor. Therefore, those deviating sub-markets would be grossly mispriced should a generic haircut be applied; in fact, some would be grossly overvalued while others would be significantly undervalued, thereby lowering the market reliability of the portfolio.
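The distortion can be illustrated with a toy calculation (all discounts and values hypothetical): a flat 30% haircut undervalues a mildly affected sub-market and overvalues a hard-hit one, even when 30% is exactly right on average.

```python
# Hypothetical true foreclosure discounts by sub-market vs a flat haircut
submarket_discount = {"A": 0.18, "B": 0.30, "C": 0.42}
avm_values = {"A": 500_000, "B": 500_000, "C": 500_000}
FLAT_HAIRCUT = 0.30

# Flat-haircut value vs the sub-market's true foreclosure-level value
flat = {k: v * (1 - FLAT_HAIRCUT) for k, v in avm_values.items()}
true_ = {k: avm_values[k] * (1 - submarket_discount[k]) for k in avm_values}

# Mispricing of the flat haircut relative to each sub-market's true level
# (negative = undervalued, positive = overvalued)
error = {k: round((flat[k] - true_[k]) / true_[k], 3) for k in avm_values}
```

Sub-market B (the "median corridor") prices correctly, but A is undervalued by about 15% and C overvalued by about 21%, which is precisely the portfolio-reliability problem described above.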

2. Foreclosures are Often Disproportionately Clustered in Sub-markets: Since foreclosures are often disproportionately higher or clustered in certain sub-markets, using generally discounted AVM values as described above would be irrational, particularly when large portfolios are negotiated, causing more trouble for the retail brokers and small homebuilders who are trying to rationally work through their inventories. Of course, today foreclosures are more widespread, extending from the sub-prime and Alt-A into the prime portfolios. Therefore, they require more specialized valuation, including specialized foreclosure AVMs. High-end trophy properties (mansions, waterfronts, etc.) should not be subjected to any AVM; they should always be professionally hand-worked instead.

3. Foreclosures will Continue to Plague Prime Portfolios for Several More Years: Given the continued squeeze in the Jumbo mortgage market, foreclosures will continue to plague the prime portfolios for several more years. In fact, prime foreclosures and charge-offs are expected to stay elevated due to the continued high incidence of short sales and the highly probable tsunami of HELOC defaults hitting the market in early 2015. If the HELOC hypothesis comes to pass, a new generation of specialized AVMs geared exclusively toward that segment would be mandatory. In fact, everyone from the traditional AVM houses to the listing services to the national brokerage houses will soon realize that the foreclosure markets are not short-lived or temporary. Actually, the overall housing market has become semi-permanently bimodal (primary and foreclosure), requiring significant back-to-the-drawing-board valuation re-engineering. Hopefully, a new generation of specialized AVM houses would take advantage of this market void and position their brands appropriately.

Under the traditional AVM development process, only recent arms-length sales (often aided by discounted seasoned listings to simulate the most recent market) are used to create representative sales samples; models are developed from those samples and then applied to the populations the samples are derived from. In other words, traditional modeling samples ignore all foreclosure and short sales. Now that the foreclosure market has widened across all geographic and value strata, the need for mutually exclusive foreclosure AVMs is no longer theoretical – they are essential to address the critical and fast-growing needs of that market segment.
However, to develop such specialized foreclosure AVMs, experts must consider deriving modeling samples from the foreclosure-related universe only, to avoid having to fudge the final values with heuristic discounting factors. If the AVMs are developed as such, the final values will be more in line with that segment of the market, addressing especially the sub-markets that inherently deviate from the median market.
Of course, to bolster the sample size, multiple contiguous markets may be grouped and modeled together; for example, the Long Island MLS covers three counties – Nassau, Suffolk, and Queens – so they could be modeled together, drawing all of their foreclosure and short sales into the mix. In any case, the objective of the specialized foreclosure AVM is to manage and mitigate losses, so these AVM values do not have to be as surgical as the traditional AVMs'. When the new foreclosure AVM values are compared to their traditional counterparts, they will be significantly lower in a market-meaningful manner. That, of course, is the point.


Sid Som MBA, MIM
Homequant Inc.
