Logistic regression can be used for a number of business
applications, from credit scoring to targeted marketing. As competition
increases and budgets grow tighter, companies are looking for cheaper
and more effective
ways to sell their products and services. Greater emphasis is
placed on identifying those customers who are most likely to purchase.
In the past, targeting those customers may have been done in an
ad-hoc manner depending on how historical sales data looks in
tables and cross-tabs. Now the movement is toward a statistical
modeling framework which can be implemented at product introduction
or any other time during the product's life cycle.
Standard econometric methods like ordinary least squares were
designed for evaluating variables that can assume any value within
a range, i.e., continuous variables. These methods are usually
appropriate for examining data which has been accumulated over
time into totals representing aggregate market response. When
the underlying behavior of the individual decision maker is evaluated,
the outcome variable may not be continuous; for example, the decision
whether or not to buy a product. Under these conditions, other
techniques are needed to properly measure and evaluate the
decision-making process. These techniques are called discrete choice
models. The discrete choice technique discussed here is logistic
regression.
Numerous statistical packages are available to handle these types
of models (SAS, SPSS, SHAZAM, LIMDEP). In the simplest of cases,
the consumer is faced with only two choices: (1) to purchase a
product or (2) not to purchase. The consumer is assumed, in general,
to make the decision in such a way as to maximize his or her utility.
One of the advantages of discrete choice methods is that they treat
the decision-making process in a probabilistic manner. Once the
equation is estimated, we can project the probability that a consumer
will purchase the product based upon a set of explanatory criteria
(education, income, age, etc.). This probability (ranging from
0 to 100%) can be interpreted as a score and used to rank each
customer from those who are most likely to purchase to those least
likely to purchase the product. Setting up the data for estimation
is very similar to setting it up for ordinary least squares. The main
difference is that the dependent variable (whether or not the product
was purchased) is coded as a zero-or-one variable. However, the logistic
procedure is non-linear: rather than modeling the outcome directly, it
models the log of the odds of purchase as a linear function of the
explanatory variables.
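
As a rough illustration of the estimation and scoring steps described
above, the following Python sketch fits a one-variable logistic model
and ranks three prospects by predicted probability of purchase. The
purchase history, the prospect names, and the helper names (sigmoid,
fit_logit) are all made up for illustration, and the simple gradient
ascent loop merely stands in for whatever routine a statistical package
would actually use.

```python
import math

def sigmoid(z):
    """Logistic function: converts log-odds z into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(incomes, purchased, steps=5000, rate=0.1):
    """Estimate intercept and slope by gradient ascent on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(incomes)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(incomes, purchased):
            # Actual outcome minus predicted probability.
            err = y - sigmoid(b0 + b1 * x)
            g0 += err
            g1 += err * x
        b0 += rate * g0 / n
        b1 += rate * g1 / n
    return b0, b1

# Illustrative (made-up) history: income in $10,000s; 1 = purchased, 0 = did not.
incomes = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
purchased = [0, 0, 1, 0, 0, 1, 1, 1]

b0, b1 = fit_logit(incomes, purchased)

# Score hypothetical prospects and rank them from most to least likely to buy.
prospects = {"A": 1.2, "B": 3.8, "C": 2.6}
scores = {name: sigmoid(b0 + b1 * x) for name, x in prospects.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(f"{name}: {scores[name]:.0%} estimated probability of purchase")
```

The fitted quantity b0 + b1 * income is the log of the odds; passing it
through the logistic function gives the probability that serves as
each prospect's score.
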
Once estimated, the equation can be used to show how the probability
of purchase varies across different values of the explanatory
variables. Since logistic regression is a non-linear technique,
it is able to capture certain curvilinear relationships that may
exist. For example, suppose a business's decision to purchase a product
or service is linked to its net income. At very low levels of income,
you might expect a small increase in income to have minimal
impact on the decision to purchase, because certain basic
needs such as utilities and fixed costs have to be met first. At very
high income levels, other factors may make the decision similarly
insensitive to income. At levels in between, however, the decision is
borderline, and sensitivity to factors like price or income may "tip
the scale" one way or the other.
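
The S-shaped response described above can be made concrete with a small
sketch. The intercept and slope below are hypothetical values, not
estimates from any data; they are chosen only so that the midpoint of
the curve falls at an income of $30,000. The code compares how much the
purchase probability moves for the same $1,000 rise in income at a low,
a middle, and a high income level.

```python
import math

def purchase_probability(income_k, intercept=-6.0, slope=0.2):
    """Probability of purchase for an income given in $1,000s.

    The intercept and slope are hypothetical, chosen so that the
    curve's midpoint (probability 0.5) falls at an income of $30,000.
    """
    log_odds = intercept + slope * income_k
    return 1.0 / (1.0 + math.exp(-log_odds))

def gain_from_raise(income_k, raise_k=1.0):
    """Change in purchase probability when income rises by raise_k ($1,000s)."""
    return purchase_probability(income_k + raise_k) - purchase_probability(income_k)

# The same $1,000 increase evaluated at $10,000, $30,000 and $50,000.
low = gain_from_raise(10)
middle = gain_from_raise(30)
high = gain_from_raise(50)

print(f"low: {low:.4f}  middle: {middle:.4f}  high: {high:.4f}")
```

Under these assumed coefficients the gain is largest in the middle of
the range and much smaller at either extreme, which is the pattern a
straight-line model cannot reproduce.
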
The logit equation can also be used for what-if analysis. For
example, how much would the probability of purchase increase if
we shifted selling efforts from prospects with an income of $20,000
to those with incomes of $22,000? We could also do what-if simulations
on product price. If we increase or decrease the price of our product by
10%, how would our prospect list change?
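
Both what-if questions can be sketched in a few lines of Python. The
coefficients, the base price of $50, the 0.5 cutoff, and the six
prospects are all hypothetical, invented purely to show the mechanics;
a real analysis would plug in the estimated equation and the actual
prospect file.

```python
import math

# Hypothetical coefficients: income in $1,000s, price in dollars.
INTERCEPT, B_INCOME, B_PRICE = 1.0, 0.15, -0.10
BASE_PRICE = 50.0

def purchase_probability(income_k, price):
    """Probability of purchase from the assumed logit equation."""
    log_odds = INTERCEPT + B_INCOME * income_k + B_PRICE * price
    return 1.0 / (1.0 + math.exp(-log_odds))

# What-if 1: target prospects with $22,000 incomes instead of $20,000.
p_20k = purchase_probability(20, BASE_PRICE)
p_22k = purchase_probability(22, BASE_PRICE)
income_lift = p_22k - p_20k

# What-if 2: change the price by 10% and see who stays on the prospect list.
prospects = {"P1": 18, "P2": 20, "P3": 22, "P4": 26, "P5": 28, "P6": 35}

def prospect_list(price, cutoff=0.5):
    """Names of prospects whose purchase probability meets the cutoff."""
    return [name for name, income_k in prospects.items()
            if purchase_probability(income_k, price) >= cutoff]

base_list = prospect_list(BASE_PRICE)
cheaper_list = prospect_list(BASE_PRICE * 0.9)
dearer_list = prospect_list(BASE_PRICE * 1.1)

print(f"Lift from $20,000 to $22,000 prospects: {income_lift:.3f}")
print("Price -10%:", cheaper_list)
print("Base price:", base_list)
print("Price +10%:", dearer_list)
```

With a negative price coefficient, as assumed here, a price cut
lengthens the list of prospects above the cutoff and a price increase
shortens it; how large the shift is depends entirely on the estimated
coefficients.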