Friday, June 22, 2018

How to Build a Better Market Automated Valuation Model (AVM)

-- Intended for Start-up Analysts and Researchers --
   
1. Sales Sample: Assuming you are developing an “Application” AVM (meaning you are going to apply the AVM to the population the sample is derived from), make sure you are working with a representative sales sample. While testing the sampling properties, consider all three categories of variables: Continuous (Bldg SF, Lot SF, Age, etc.), Categorical (Bldg Style, Exterior Wall, Grade, Condition, etc.) and Fixed Location (Town, School District, Assessing District, etc.). If the sample is not representative of the population, the model will be inaccurate.

2. Time Adjustment: Depending on the sales liquidity, a market AVM requires 12 to 24 months' worth of arm's-length sales. When the time series is extended (18+ months), quarterly adjustments are more effective as they are smoother and more market-efficient (they reduce inconsistencies arising from using the “closing” dates rather than the contract dates). Time is a surface correction, so keep it at the “outer” level; for instance, if you are modeling a county, keep time adjustments at the county level, without drilling down to the sub-markets. Also, avoid price-related (FHA, Jumbo, etc.) corrections.

3. Hold-out Sample: Once the sales are time adjusted, split the sales sample into a modeling sample (80%) and a hold-out sample (20%). Both sub-samples must have very similar attributes to the original sales sample representing the population. To reduce judgment, use software-provided sampling procedures. Develop the model using the modeling sample and then test it on the hold-out. While the results will not be identical, they must be very close.

4. Multi-stage Regression: Since you have to model three different types of variables, develop a three-stage regression model, piggy-backing the output from the prior stage. The contributions are generally non-linear, so the log-linear model is more effective. If you have a comprehensive dataset with many categorical variables, run a correlation matrix to detect multi-collinearity (certain variables being highly correlated with one another) and reduce the number of variables accordingly. If you have a limited number of variables, use the t-stat to control the significance. If the absolute value of a variable’s t-stat is less than 2, it is generally non-contributing.

5. Multi-cycle Regression: In order to make the model efficient, develop it in three cycles. Use the first cycle to define and remove outliers, using Sales Ratios (AVM values to Adjusted Sale Prices) as the yardstick. Then, use the outlier-free sample to run cycle two and generate pre-residual values. Upon residual correction, run the third and final cycle to produce the model and values. If you develop the model systematically and methodically, it will be far more efficient.

6. Residual Analysis: At the end of the second cycle, work on your residuals. The fact that some Sales Ratios are clustered around 70 or 143 does not mean your AVM values are wrong. You are comparing your model values with sales which, individually, are all judgment calls. A prospective homebuyer bent on purchasing a pink house would overpay, while an SFR Rental operator or an aggressive investor would buy a group of homes at below-market prices (some of which would still be coded as arm's-length), etc. Your model is essentially fixing those anomalies. Nonetheless, residual analysis and correction is an arduous but necessary optimization process.

7. Independent Validation: Once the draft version is ready, comp a sample (where the Sales Ratios are either below 80 or above 120) at a self-directed comp-based valuation site. Set up the adjustment matrix there in line with your model coefficients. For example, if you are modeling a coastal town or county, your size adjustment factors will be significantly higher than their Midwest counterparts, etc. Similarly, set your time adjustments and valuation dates properly. If your model shows 12% appreciation in your model area and you are valuing for a future date, set the site up accordingly; otherwise, you will be comparing apples to oranges. Your model values should be within 10-15% of the comps’.

8. Hold-out Testing: Now that your draft is ready, test it on the hold-out that you have set aside. Once the model is applied, remove the outliers using the same range as in the modeling sample. The application results (Hold-out Sales Ratio stats: Percentile distribution, COD/COV, etc.) must be very similar to those from the primary model. If they are at variance, you must start investigating. Here is where the investigation starts: Make sure you are applying the final version from the 3rd cycle.

9. Applying to the Population: Remember, the whole AVM exercise is to develop a model from the sold population (on average, 4-6% of homes sell annually) in order to value the mutually exclusive unsold population (roughly 95%). Of course, when you apply the model, apply it to the universe (sold+unsold). Here is why: Since the sold population is a subset of the universe, the model values will be regenerated, forming a good basis for a successful test application.
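
The modeling steps above (80/20 split, log-linear regression with t-stats, Sales Ratio outlier removal, and hold-out COD/COV comparison) can be sketched in a few lines of Python. This is only a minimal illustration under loud assumptions: the sales are synthetic, all variable and function names are hypothetical, the 70-143 outlier range is picked purely for illustration, a single-stage regression on continuous variables stands in for the three-stage model, and the residual-correction cycle is skipped.

```python
import numpy as np

LOW, HIGH = 70.0, 143.0  # ASSUMED outlier range for Sales Ratios (illustrative only)

def fit_log_linear(X, price):
    """OLS of ln(price) on X with an intercept; returns coefficients and t-stats."""
    A = np.column_stack([np.ones(len(X)), X])
    y = np.log(price)
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / (A.shape[0] - A.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    return beta, beta / se

def predict(beta, X):
    """Model value in dollars: exponentiate the fitted log price."""
    A = np.column_stack([np.ones(len(X)), X])
    return np.exp(A @ beta)

def sales_ratios(avm_value, adj_price):
    """Sales Ratio = AVM value / adjusted sale price, scaled to 100."""
    return 100.0 * avm_value / adj_price

def cod(ratios):
    """Coefficient of Dispersion: mean absolute deviation from the median, as % of median."""
    med = np.median(ratios)
    return 100.0 * np.mean(np.abs(ratios - med)) / med

def cov(ratios):
    """Coefficient of Variation: standard deviation as % of the mean."""
    return 100.0 * np.std(ratios, ddof=1) / np.mean(ratios)

# Synthetic, already time-adjusted sales (hypothetical data, not real sales).
rng = np.random.default_rng(0)
n = 2000
bldg_sf = rng.uniform(800, 4000, n)
lot_sf = rng.uniform(3000, 20000, n)
age = rng.uniform(0, 80, n)
X = np.column_stack([np.log(bldg_sf), np.log(lot_sf), age])
price = np.exp(6.0 + 0.7 * X[:, 0] + 0.15 * X[:, 1] - 0.004 * age
               + rng.normal(0.0, 0.15, n))

# Random 80/20 split into modeling and hold-out samples.
perm = rng.permutation(n)
cut = int(0.8 * n)
model_idx, hold_idx = perm[:cut], perm[cut:]
Xm, pm = X[model_idx], price[model_idx]
Xh, ph = X[hold_idx], price[hold_idx]

# Cycle 1: fit, compute Sales Ratios, and drop outliers from the modeling sample.
beta1, _ = fit_log_linear(Xm, pm)
r1 = sales_ratios(predict(beta1, Xm), pm)
keep = (r1 >= LOW) & (r1 <= HIGH)

# Final cycle (residual correction omitted here): refit on the outlier-free sample.
beta, tstats = fit_log_linear(Xm[keep], pm[keep])
model_ratios = sales_ratios(predict(beta, Xm[keep]), pm[keep])

# Hold-out test: apply the final model, then trim with the SAME outlier range.
rh = sales_ratios(predict(beta, Xh), ph)
hold_ratios = rh[(rh >= LOW) & (rh <= HIGH)]

print("t-stats (intercept, ln bldg, ln lot, age):", np.round(tstats, 1))
for name, r in (("modeling", model_ratios), ("hold-out", hold_ratios)):
    print(f"{name}: median={np.median(r):.1f} COD={cod(r):.2f} COV={cov(r):.2f}")
```

If the hold-out median, COD and COV printed on the last line drift noticeably from the modeling-sample figures, that is the signal to start investigating, as described in step 8.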

The need for AVMs is growing by leaps and bounds: banks, mortgage companies, servicers, REITs, hedge funds, SFR Rentals, large tax appeal houses, etc. are all big users of certified AVM values.
                                                          
- Sid Som, MBA, MIM
President, Homequant, Inc.
homequant@gmail.com


Additional Reading
9 Issues that Make an Automated Valuation Model (AVM) Inefficient, Often Ineffective
http://blog.homequant.com/2018/06/9-issues-that-make-avm-inefficient.html
