Friday, June 29, 2018

Silicon Valley Single Family Housing Market vs. the Condo Market – Who Wins?

- Intended for Students/New Analysts to Learn Analytics -

The Silicon Valley (Santa Clara County) Single Family Housing market continues to trend up, breaking out of the extended congestion zone of $1.1M to $1.2M. In terms of growth, even on a smoothed basis, the market moved up from $1.0M to $1.2M, an astounding 20% growth in a year. Importantly, all three recent breakouts produced higher highs, and both trendlines amply confirm the breach of the congestion zone.

The Silicon Valley Condo market trends up on a linear path. The trend is definitely linear, though not a perfect one, as evidenced by the low r-squared. The 2-Mo Moving Average also confirms the linear trend. While both trendlines ignore the big drop in Dec-17 (#12), they point to a breakout eclipsing the prior high of $850K. Even on a smoothed basis, the market moved up from $700K to $850K, a whopping 21% growth in a year.
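
For students following along, here is a minimal sketch of how the 2-Mo Moving Average and the trendline r-squared behind charts like these can be computed. The monthly medians below are made-up illustrative numbers, not the actual county data.

```python
from statistics import mean

# Hypothetical monthly median prices in $K (illustrative only)
medians = [700, 720, 715, 740, 760, 750, 780, 800, 820, 810, 840, 850]

# 2-month moving average: mean of each month and the month before it
mavg = [mean(medians[i - 1:i + 1]) for i in range(1, len(medians))]

# Least-squares trendline y = a + b*x and its r-squared
x = list(range(len(medians)))
xbar, ybar = mean(x), mean(medians)
b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, medians)) / \
    sum((xi - xbar) ** 2 for xi in x)
a = ybar - b * xbar
ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, medians))
ss_tot = sum((yi - ybar) ** 2 for yi in medians)
r_squared = 1 - ss_res / ss_tot  # 1.0 = perfectly linear trend
```

A positive slope with a high (but below 1.0) r-squared is what "definitely linear, but not perfect" looks like numerically.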

Both markets are demonstrating very similar growth patterns and results, so the contest is essentially a tie. Put another way, both are standout winners, considering the overall strength of their respective markets.

- Sid Som, MBA, MIM
President, Homequant, Inc.

Los Angeles Single Family Housing Market vs. the Condo Market – Who Wins?

- Intended for Students/New Analysts to Learn Analytics -


Though the Los Angeles Single Family Residential ("Housing") market shot up from $725K to $850K in a year, it quickly trended down to $775K, giving back half of the nominal gains and remaining somewhat directionless. Since both trendlines are confirming the declining trend, this market needs a rapid trend reversal. The fact that the housing market has already breached the $775K soft support would make prospective buyers quite nervous.

The LA Condo market, on the other hand, shows significant strength, rising from $440K to $500K during the same period. On its way up, it had three breakouts, each one eclipsing the prior high, thus pointing to strengthening fundamentals. Both trendlines are rejecting the Dec-17 dip as an aberration.

Given the steadily improving market trend backed by strong fundamentals, the standout winner is the Condo market.   

- Sid Som, MBA, MIM
President, Homequant, Inc.

Thursday, June 28, 2018

How to Analyze and Present Large and Complex Home Sales Data – in 30 Minutes (2 of 2)

-- Intended for Start-up Analysts and Researchers --

In our prior post (1 of 2) we talked about analyzing and presenting a large and complex dataset in 30 minutes. Would you handle it differently if you had 60 minutes? Here is one approach you might like to consider:

1. Just because you are starting out, do not underestimate yourself. The very fact that you have been tasked with this critical presentation speaks volumes, so take full advantage of this visibility to set yourself apart from the competition. These meetings are often frequented by other department heads and high-level client representatives, which can lead to significant loss of time in unrelated (business) discussions. The best way to prepare for such contingencies is to split the presentation into a two-phase solution in which phase-1 leads seamlessly to phase-2.

2. In a business environment, it's never a good idea to start with a complicated stat/econ model. Start a bit slow, but use your analytical acumen and presentation skills to gradually bring everyone onto the same page, thus retaining maximum control over the presentation (time and theme). Therefore, the phase-1 solution should be the same as the full* 30-min solution we detailed before (*including the sub-market analysis). Even if the meeting drifts into unrelated business chit-chat off and on, you will still be able to squeeze in the phase-1 solution, thus offering at least a baseline solution. If instead you have one all-encompassing solution, you may end up offering virtually nothing.

3. Now that you have finished presenting phase-1 and established a meaningful baseline, you are ready to transition to the more advanced phase-2 solution. In other words, it's time to show off your modeling knowledge. In phase-1 you presented a baseline Champ-Challenger analysis (Champ = Median Sale Price, MoM; Challenger = Median SP/SF, MoM). You used the "Median" to avoid having to clean up the dataset for major outliers. Here is the caveat, though: sales, individually, are mostly judgment calls; for example, someone bent on buying a pink house would overpay, while an investor would underpay by luring a seller with a cash offer. In the middle (the middle 68% of the bell curve), the so-called informed buyers would use five comps, usually hand-picked by the salespeople, to value their subjects - not an exact science either.
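
As a quick sketch, the phase-1 Champ-Challenger baseline might be computed like this, with the Champ as the median sale price per month and the Challenger as the median SP/SF per month. The sales records here are hypothetical and kept tiny for illustration; note how the median shrugs off the extreme outlier without any cleanup.

```python
from statistics import median
from collections import defaultdict

# Hypothetical sales records (illustrative only)
sales = [
    {"month": "2018-01", "price": 500_000, "sqft": 2_000},
    {"month": "2018-01", "price": 540_000, "sqft": 1_800},
    {"month": "2018-02", "price": 560_000, "sqft": 2_100},
    {"month": "2018-02", "price": 530_000, "sqft": 1_900},
    {"month": "2018-02", "price": 9_999_999, "sqft": 2_000},  # outlier the median ignores
]

by_month = defaultdict(list)
for s in sales:
    by_month[s["month"]].append(s)

# Champ: median sale price, MoM; Challenger: median SP/SF, MoM
champ = {m: median(s["price"] for s in rows) for m, rows in by_month.items()}
challenger = {m: median(s["price"] / s["sqft"] for s in rows)
              for m, rows in by_month.items()}
```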

4. Now, let's envision where you would be at this stage - 30 minutes in hand and brimming with confidence. That is not enough time to develop and present a true multi-stage, multi-cycle AVM (see my recent post on 'How to Build a Better AVM'). So settle for a straightforward Regression-based modeling solution, allowing time to add a few new slides to the original presentation. Build the model as one log equation with a limited number of variables (though covering all three major categories). Variables you might like to choose: Living Area, Age, Bldg Style, Grade, Condition and School/Assessing District. Avoid 2nd-tier variables (e.g., Garage SF, View, Site Elevation, etc.).
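
A minimal sketch of such a one-equation log model, here fitted with numpy's least-squares solver on synthetic data. The dataset and coefficients are illustrative assumptions; a real model would also carry the categorical variables (Style, Grade, Condition, District) as dummies.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
living_area = rng.uniform(1_000, 4_000, n)
age = rng.uniform(0, 60, n)
# Synthetic "true" market: price grows with size, decays with age
price = 200 * living_area * np.exp(-0.004 * age) * rng.lognormal(0, 0.05, n)

# log(price) = b0 + b1*log(living_area) + b2*age
X = np.column_stack([np.ones(n), np.log(living_area), age])
y = np.log(price)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2 = coef  # b1 ~ size elasticity, b2 ~ annual depreciation rate
```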

5. Derive the time adjustment factors from phase-1 (it's a MoM analysis) and create the Time Adjusted Sale Price (ASP), the dependent variable in your Regression model. Explain this connection in your presentation so the audience (including your SVP/EVP boss) knows the two phases are not mutually exclusive; rather, one is the stepping stone to the other. At this point, you could face the question: "Why did you split it up into two?" Keep your answer short and truthful: "It's a time-based contingency plan."
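
Sketching the mechanics: the MoM factors from phase-1 can be chained forward to restate each sale at a common valuation month, producing the ASP. The factors, months and sale below are hypothetical.

```python
# Hypothetical MoM growth factors derived from the phase-1 trend
mom_factor = {"2018-01": 1.010, "2018-02": 1.012, "2018-03": 1.008}

def adjust_to(sale_price, sale_month, target_month, factors):
    """Multiply the price by every monthly factor between sale and target."""
    asp = sale_price
    for m in sorted(factors):
        if sale_month <= m < target_month:
            asp *= factors[m]
    return asp

# A Jan-2018 sale restated to Apr-2018 dollars
asp = adjust_to(500_000, "2018-01", "2018-04", mom_factor)
```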

6. Keep the Regression output handy but do not insert it into the main presentation, as it is a log model (the audience may not be able to relate to the log parameter estimates). If the issue comes up, talk about the three important aspects of the model: a) variable selection (how you managed to represent all three categories), b) the most important variables as judged by the model (walk through the t-stats and p-values) and c) the overall accuracy of the model (r-squared, F-statistic, confidence, etc.).

7. Present the model results in two simple steps. Value Step: ASP vs. Regression values. Show the entire percentile curve - 1st to 99th. Point out the smoothness of the Regression values vis-a-vis ASP. Even arm's-length sales tend to be somewhat irrational on both ends of the curve (<=5th and >=95th), so the standard deviation of the Regression values would be much lower than ASP's. Ratio Step: Run stats on the Regression Ratio (Regression Value to ASP). It's easier to explain the Regression Ratios than the raw dollar values, so spend more time on the ratios.
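
In code, the two steps might look like this (all numbers illustrative): the Value Step contrasts the spread of ASP vs. Regression values, and the Ratio Step summarizes the Regression Ratio as a percentage.

```python
from statistics import median, pstdev

# Illustrative ASP and Regression values for the same six sales
asp   = [400_000, 480_000, 510_000, 530_000, 600_000, 750_000]
model = [420_000, 470_000, 505_000, 540_000, 590_000, 700_000]

# Value Step: the model values are smoother (lower spread) than ASP
spread_asp, spread_model = pstdev(asp), pstdev(model)

# Ratio Step: Regression Ratio = model value / ASP, as a percentage
ratios = [100 * m / a for m, a in zip(model, asp)]
median_ratio = median(ratios)
```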

8. Time permitting, run the above stats both ways - with and without outliers. Define outliers by the Regression Ratios. Keep it simple; for example, remove all ratios below the 5th and above the 95th percentile, or below 70 and above 143, etc. On this outlier-free output, run Std Dev, COV, COD, etc. These stats would be significantly better than the prior (with-outlier) ones. Another common outlier question is: "Why no waterfront in your model?" The answer is simple: waterfront parcels generally comprise less than 5% of the population, making representativeness difficult to test. FYI - in an actual AVM, if the sold waterfront parcels properly represent the waterfront population, the variable could be tried in the model, as long as it clears the multi-collinearity test.
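
A small sketch of the outlier trim using the 70-143 ratio band from the text (the ratio list is illustrative). COD here is the average absolute deviation from the median ratio, and COV the standard deviation relative to the mean, both in percent.

```python
from statistics import mean, median, pstdev

ratios = [55, 88, 92, 95, 98, 100, 101, 103, 105, 108, 112, 160]

# Simple trim: drop ratios below 70 or above 143
clean = [r for r in ratios if 70 <= r <= 143]

med = median(clean)
cod = 100 * mean(abs(r - med) for r in clean) / med  # Coefficient of Dispersion
cov = 100 * pstdev(clean) / mean(clean)              # Coefficient of Variation
```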

9. Last but not least, be prepared to face an obvious question: "What is the point of developing this model?" Here is the answer: a sale price is more than a handful of top-line comps. It comprises an array of important variables like size, age, land and building characteristics, fixed and micro locations, etc., so only a multivariate model can do justice to the sale price by properly capturing and representing all of these variables. The output from this Regression model is a statistically significant market replica of the sales population. Moreover, the model can be applied to the unsold population to generate very meaningful market values. Simply put, this Regression model is an econometric market solution. Granted, the unsold population could be comp'd, but that is a very time-consuming and subjective process.

Ace the next presentation. Be a hero. Prove to your bosses you are a future CEO.

Good Luck!

- Sid Som, MBA, MIM
President, Homequant, Inc.

Monday, June 25, 2018

SkylineValue Offers Custom AVM Service for Mini Storage Properties

Homequant Offers Residential Sales/AVM and Assessment Stats

Sunday, June 24, 2018

Homequant Offers Automated Valuation Modeling (AVM) for SFR Rental Portfolios

A Good Home Valuation System Allows Users to Differentiate between Sales and Comparable Sales

Sales vs. Comparable Sales

A list of sales - by default - does not become comparable sales ("comps"). Sales - even when drawn from the same neighborhood - must be quantitatively adjusted for characteristics and time to become comps. Once adjusted, the differences in property characteristics, distance and time (01/2017 and 12/2017 sales are not the same) become irrelevant. 

So, always ask your Broker to show how the comps have been adjusted. 

Here is a snapshot of the adjustment process:

The above table shows that although these are the 10 best pooled sales to value the defined subject, they are quite different in terms of distance, time of sale, size and age, so they have to be quantitatively adjusted (using sound econometric parameters drawn from the local market, explained at length in other posts) to be considered and accepted as comps; absent such adjustment, they would remain just some random sales.
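
A highly simplified sketch of the idea: each sale is restated for time and characteristic differences against the subject. The per-SF, per-year and per-month factors below are hypothetical placeholders for the econometric parameters drawn from the local market.

```python
SUBJECT = {"sqft": 2_000, "age": 20}

ADJ_PER_SQFT   = 150      # $ per SF of size difference (assumed)
ADJ_PER_YEAR   = 1_500    # $ per year of age difference (assumed)
TIME_PER_MONTH = 0.005    # 0.5% market growth per month (assumed)

def adjust_sale(sale_price, sqft, age, months_old):
    """Restate a sale's price as if it matched the subject and sold today."""
    adj = sale_price * (1 + TIME_PER_MONTH) ** months_old   # time adjustment
    adj += (SUBJECT["sqft"] - sqft) * ADJ_PER_SQFT          # size adjustment
    adj += (age - SUBJECT["age"]) * ADJ_PER_YEAR            # age adjustment
    return adj

# A smaller, older sale from 6 months ago, adjusted upward toward the subject
asp = adjust_sale(450_000, 1_800, 30, 6)
```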

Once they are adjusted, the Comps Grid will show the line item adjustments as well as the total adjustment for each of the final five comps:

With the adjustments applied, the Sale Prices are replaced in the analysis by the Adjusted Sale Prices (ASP) of the comps, which collectively contribute to the valuation of the subject. The Comps Grid here has been ranked by 'Distance', meaning the comp closest to the subject becomes Comp #1.

To understand the three ranking methods - Distance, Sales Recency and Least Adjustment - please read the following post:

Also, do not confuse Scoring with Ranking.

I picked the above graphics from my own Homequant site, which I own and operate, to avoid having to deal with any copyright issues. The site is totally self-directed (no modeled values), totally free (no strings), and requires no login or registration whatsoever. Please choose the site that works best for you.

Should the Taxable Status Date be Forward or Backward?

While most property assessment experts are wrestling with the frequency of the assessment cycle, meaning whether properties should be assessed annually or every three years, etc., I have a completely different perspective on this issue.

Proper market information and data hold the key to producing a fair and equitable roll. The real estate market, like the stock market, has become extremely volatile (rising fast or declining fast), making it increasingly difficult to produce a futuristic roll with backward-looking market data. Since the Taxable Status Date (or the Valuation Date) is generally a futuristic date, the available market information and data tend to be quite inadequate to develop the proper predictive (mass appraisal) models that, in turn, generate the assessment roll. Case in point: many taxing jurisdictions utilizing mass appraisal modeling generally build their models in August and September with the available data, targeting the next January as the status date (assuming the valuation and status dates are the same). At that point, because of the usual 3-month lag in recording and validating sales, the most recent sales in the modeling dataset would cover, at best, June-July, leaving an enormous predictive gap of at least six months. Worse yet, a vast majority of those arm's-length sales would have been contracted in the first and early second quarters of the year, leaving a small percentage of questionable investor/distressed cash sales to represent the recent state of the market.

With a market this volatile, and considering the structural shift toward higher risk-taking that points to continued volatility going forward, those futuristic rolls are tantamount to crapshoots in the name of mass appraisal modeling. Granted, today we have more mass appraisal experts around the world, thanks to organizations like IAAO, as well as more advanced econometric and operations research techniques, but higher modeling expertise and advanced techniques cannot substitute for the lack of market data around the status date. Simply put, nobody has a crystal ball to simulate a volatile market six months in advance. Such experimentation would be fine for a thought-provoking paper or article, but it is totally unacceptable for an assessment roll involving most taxpayers' biggest financial investment.

Again, the only way we can achieve the goal of a fair and equitable roll under changing market conditions (let's face it, market volatility is here to stay) is to move away from the predictive mode and settle for a known event. The guesswork done in the name of predictive modeling (most modelers understand the MRA process but not advanced Time Series modeling) forces modelers as well as management to undertake a gigantic gamble in predicting values 2 to 3 quarters out.

After having spent, off and on, twenty-five+ years in automated valuation modeling and mass appraisal, I have come to the conclusion that the dialogue should be about the taxable status “date” – should it be a forward date or a backward date? I am of the opinion that it should be a backward date so we do not succumb to the void created by the lack of market data. The CAMA/AVM modelers must have the necessary data, adequately covering the status/valuation date, under their belts before they even get started with the modeling process. This lag would also allow the field staff ample time to inspect all sales (and related permits) ahead of the modeling season. Thus, the modelers would have access to all of the valid sales (preferably inspected) for the allowable period. Modeling, therefore, would not be a predictive game anymore, eliminating the need for any futuristic gamble whatsoever. 

In other words, I am trying to reinvent here the combined benefits of David Ricardo's 200-year-old theory of comparative advantage and Nobel Laureate Joseph Stiglitz's theory of markets with asymmetric information. In terms of market information, jurisdictions with futuristic status/valuation dates have an inherent comparative disadvantage vis-à-vis their counterparts.

Needless to say, this concept would also help dismantle the oligopolistic stronghold of a few law firms, paving the way for a significantly reduced volume of assessment appeals. Furthermore, we would not lose sleep worrying over how to justify the drastically lower CODs for the modeled sale periods relative to the subsequent periods within the same assessment cycle. It would also minimize the need for any annual sales ratio/equalization study. I do understand the short-term statutory and logistical issues, but then again, we owe our taxpayers a fair and equitable tax roll.

At the end of the day, do taxpayers really care whether the status date is 01-2018 or 01-2019? Honestly, most taxpayers have no clue what that even means. All they are interested in is a value truly reflective of the market (which is where Assessment comes in). If we are capable of delivering that, we will have achieved our goal. Eventually, protests would be reduced to data errors only, removing the mass (appeal) filers from the system altogether. The concept of refund liability would be a thing of the past.

--by Sid Som, MBA, MIM
Copyrighted Material

Saturday, June 23, 2018

Silicon Valley Condo Market Continues on a Linear Growth Path

- Intended for Students/New Analysts to Learn Analytics -


The Silicon Valley Condo Market continues to trend up on a linear path. The trend is definitely linear, though not a perfect one, as evidenced by the low r-squared. The 2-Mo Moving Average also confirms the linear trend. While both trendlines ignore the big drop in December, they point to a breakout eclipsing the prior high of $850K. Even on a smoothed basis, the market moved up from $700K to $850K, a whopping 21% growth in a year.

The normalized trend (bottom chart) is even more vigorous, moving up from $500/SF to $675/SF, an astounding 35% annual growth. Again, both trendlines are rejecting the trough and the peak at the long end of the curve. Nonetheless, the growth has been spectacular. Without the outliers (Dec-17 and Feb-18), the r-squared jumps to 0.907 with the slope steepening to a near-perfect linear trend (not shown).

- Sid Som, MBA, MIM
President, Homequant, Inc.

Friday, June 22, 2018

How to Build a Better Market Automated Valuation Model (AVM)

-- Intended for Start-up Analysts and Researchers --
1. Sales Sample: Assuming you are developing an "Application" AVM (meaning you are going to apply the AVM to the population the sample is derived from), make sure you are working with a representative sales sample. While testing the sampling properties, consider all three categories of variables: Continuous (Bldg SF, Lot SF, Age, etc.), Categorical (Bldg Style, Exterior Wall, Grade, Condition, etc.) and Fixed Location (Town, School District, Assessing District, etc.). You need a representative sample, failing which the model will be inaccurate.

2. Time Adjustment: Depending on the sales liquidity, a market AVM requires 12 to 24 months' worth of arm's-length sales. When the time series is extended (18+ months), quarterly adjustments are more effective as they are smoother and more market-efficient (reducing the inconsistencies arising from using the "closing" dates rather than the contract dates). Time is a surface correction, so keep it at the "outer" level; for instance, if you are modeling a county, keep time adjustments at the county level, without drilling down to the sub-markets. Also, avoid price-related (FHA, Jumbo, etc.) corrections.
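
A minimal sketch of deriving quarterly factors from the quarterly median trend and restating every sale at the latest quarter's market level (the medians are illustrative):

```python
# Hypothetical quarterly median sale prices
q_median = {"2017Q3": 480_000, "2017Q4": 495_000, "2018Q1": 510_000, "2018Q2": 525_000}

latest = max(q_median)                                 # "2018Q2"
factor = {q: q_median[latest] / m for q, m in q_median.items()}

def time_adjust(sale_price, quarter):
    """Adjusted Sale Price: restate to the latest quarter's market level."""
    return sale_price * factor[quarter]

asp = time_adjust(480_000, "2017Q3")
```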

3. Hold-out Sample: Once the sales are time adjusted, split the sales sample into a modeling sample (80%) and a hold-out sample (20%). Both sub-samples must have attributes very similar to the original sales sample representing the population. To reduce judgment, use software-provided sampling procedures. Develop the model using the modeling sample and then test it on the hold-out. While the results will not be identical, they must be very close.
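
A bare-bones sketch of the 80/20 split with a fixed seed for reproducibility (real statistical software offers stratified sampling, which better preserves the sample's attributes):

```python
import random

sales = list(range(1_000))          # stand-ins for sale records

rng = random.Random(42)             # fixed seed keeps the split reproducible
shuffled = sales[:]
rng.shuffle(shuffled)

cut = int(len(shuffled) * 0.8)
modeling_sample = shuffled[:cut]    # 80% to develop the model
holdout_sample = shuffled[cut:]     # 20% kept aside for testing
```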

4. Multi-stage Regression: Since you have to model three different types of variables, develop a three-stage regression model, each stage piggybacking on the output of the prior stage. The contributions are generally non-linear, so a log-linear model is more effective. If you have a comprehensive dataset with many categorical variables, run a correlation matrix to detect multi-collinearity (certain variables being highly correlated), leading to a reduced number of variables. If you have a limited number of variables, use the t-stat to control the significance; if a variable's t-stat is less than 2, it is generally non-contributing.

5. Multi-cycle Regression: To make the model efficient, develop it in three cycles. Use the first cycle to define and remove outliers: create Sales Ratios (AVM values to Adjusted Sale Prices) and use them to flag the outliers. Then use the outlier-free sample to run cycle two and generate pre-residual values. Upon residual correction, run the third and final cycle to produce the model and values. If you develop the model systematically and methodically, it will be far more efficient.
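
A toy sketch of the cycle flow: cycle one fits, flags ratio outliers and removes them; cycle two refits on the clean sample. Here `fit_model` is a deliberately naive stand-in (a median price-per-SF rate), not a real regression.

```python
from statistics import median

def fit_model(sample):
    """Stand-in for the regression: here, just the median price/SF rate."""
    rate = median(p / sf for p, sf in sample)
    return lambda sf: rate * sf

sample = [(300_000, 1_500), (400_000, 2_000), (410_000, 2_050),
          (505_000, 2_500), (1_200_000, 2_000)]   # last sale is an outlier

# Cycle 1: fit, then compute Sales Ratios (AVM value / adjusted sale price)
avm = fit_model(sample)
ratios = [100 * avm(sf) / p for p, sf in sample]

# Remove outliers (ratios outside 70-143), then refit on the clean sample
clean = [s for s, r in zip(sample, ratios) if 70 <= r <= 143]
avm2 = fit_model(clean)
```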

6. Residual Analysis: At the end of the second cycle, work on your residuals. The fact that some Sales Ratios are clustered around 70 or 143 does not mean your AVM values are wrong. You are comparing your model values with sales which, individually, are all judgment calls: a prospective homebuyer bent on purchasing a pink house would overpay, while an SFR Rental operator or an aggressive investor would buy a group of homes below market (some of which would be coded as arm's-length), etc. Your model is essentially fixing those anomalies. Nonetheless, residual analysis and correction is an arduous but necessary optimization process.

7. Independent Validation: Once the draft version is ready, comp a sample (where the Sales Ratios are either below 80 or above 120) at a self-directed, comp-based valuation site. Set up the adjustment matrix there in line with your model coefficients; for example, if you are modeling a coastal town or county, your size adjustment factors will be significantly higher than their Midwest counterparts. Similarly, adjust your time and valuation dates properly: if your model shows 12% appreciation in the model area and you are valuing for a future date, set the factors up accordingly; otherwise, you will be comparing apples to oranges. Your model values should be within 10-15% of the comps'.

8. Hold-out Testing: Now that your draft is ready, test it on the hold-out sample you kept aside. Once the model is applied, remove the outliers using the same ratio range as in the modeling sample. The application results (Hold-out Sales Ratio stats - percentile distribution, COD/COV, etc.) must be very similar to those from the primary model. If they are at variance, you must start investigating - beginning by making sure you are applying the final version from the 3rd cycle.

9. Applying to the Population: Remember, the whole AVM exercise is to develop a model from the sold population (on average, 4-6% of homes sell annually) in order to value the mutually exclusive ~95% unsold population. When you apply the model, apply it to the universe (sold + unsold). Here is why: since the sold population is a subset of the universe, the model values will be regenerated for the sold parcels, forming a good basis for a successful test application.

The need for AVMs is growing by leaps and bounds - banks, mortgage companies, servicers, REITs, hedge funds, SFR Rental operators, large tax appeal houses, etc. are all big users of certified AVM values.
- Sid Som, MBA, MIM
President, Homequant, Inc.

Additional Reading
9 Issues that Make an Automated Valuation Model (AVM) Inefficient, Often Ineffective

First-time Homebuyers Must Start Research at a Top-down Valuation Site


A Top-down home valuation site is one that allows users to work up the value of a "simulated" home without having to deal with a series of random comps. A good Top-down site generally offers the following features:

1. Sub-markets: All (socio-economically) prominent sub-markets within the market (say, Orlando) are generally supported, allowing users to toggle between sub-markets to evaluate and understand the variations in home values.

2. Home Type and Style: Home types (Detached, Attached, HOA, Townhouse, Condo, etc.) and styles (Ranch, Cape, Colonial, Conventional, Contemporary, Tudor, etc.) are important considerations for home-buyers so a good Top-down site incorporates them.

3. Location: A good school district tends to fetch a higher value than its counterparts with lesser known schools. Good sites therefore allow users to understand how such qualitative factors quantitatively contribute to the home value.

4. Land and Building Sizes: Users can educate themselves on how changes (increase/decrease) in sizes impact values within a given sub-market. Some sites allow users to further differentiate between total improved area and heated area, corner lot vs. non-corner, etc. Bath count is also an important consideration, as it helps determine whether the home is optimized or a lifestyle one.

5. Building Age and Condition: Users can quickly learn how age and overall condition (including quality of rehab) impact values in a sub-market. Some sites might combine these two variables into one called effective age. Either way, these are important considerations in pre-owned homes.

6. View: A waterfront home could fetch significantly higher value than a non-waterfront one within the same sub-market. Similarly, a house with other enhancing views (park, bridge, skyline, golf course, etc.) could be pricier.

7. Amenities: Central A/C, In-ground Pool, Upgraded Porch, Tennis/Basketball Court, etc. often add value to homes so a good site would allow users to experiment with such options as well. 

Case Study

Our First-time Homebuyer = John Doe

John must be methodical in the research leading up to his home purchase. After a pre-qual of $300K, he has decided to focus on two Orlando-area sub-markets: Maitland and Winter Park.

He finds a Top-down site which allows him to perform his research without having to work up some random comps. He realizes that while Winter Park has beautiful tree-lined streets, he gets more modern and slightly bigger homes in Maitland (a screened-in Pool could be a bonus). He is very happy that the site allows him to evaluate numerous possible combinations including location, type, size, style, amenities and view. He also notices that the site meaningfully curves values as home size increases.

I picked the above graphics from my own Homeyada site, which I own and operate, to avoid having to deal with any copyright issues. The site is Mobile-friendly (no separate apps needed), totally self-directed (no modeled values), totally free (no strings), and requires no login or registration whatsoever. It also has a built-in non-linear value curve/scale tied to the home size. Anyway, please use the site/system that works best for you.