US20080015910A1 - Ranking-based method and system for evaluating customer predication models - Google Patents

Publication number
US20080015910A1
Authority
US
United States
Prior art keywords
ranking
customers
order
switches
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US11/456,663
Inventor
Claudia Reisz
Saharon Rosset
Bianca Zadrozny
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to US11/456,663
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION: assignment of assignors interest (see document for details). Assignors: REISZ, CLAUDIA; ROSSET, SAHARON; ZADROZNY, BIANCA
Publication of US20080015910A1
Priority to US12/050,371 (US7725340B2)
Abandoned

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0631 Resource planning, allocation, distributing or scheduling for enterprises or organisations
    • G06Q10/06311 Scheduling, planning or task assignment for a person or group
    • G06Q10/063112 Skill-based matching of a person or a group to a task
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0204 Market segmentation

Abstract

A method and system perform ranking-based evaluations for regression models that are often appropriate for marketing tasks and are more robust to outliers than traditional residual-based performance measures. The output provided by the method and system provides visualization that can offer insights about local model performance and outliers. Several models can be compared to each other to identify the “best” model and, therefore, the “best” model data for the particular marketing task.

Description

    BACKGROUND OF THE INVENTION
  • 1. Field of the Invention
  • The present invention generally relates to evaluation of scoring models and, more particularly, to the use of ranking-based measures to evaluate the performance of regression models. The invention has particular application to evaluation of prediction models with regard to the ranking of customers and/or potential customers according to their potential to spend for goods and services.
  • 2. Background Description
  • Evaluating prediction models of customers according to their potential to spend has been done through residual-based measures; i.e., the difference between the predicted and actual spending by some known customers. This approach suffers from two main drawbacks: (1) it is non-robust to outliers (for example, gross errors in the data used for evaluation), and (2) it is not the appropriate measure if the goal is just to identify the best prospective customers.
  • The standard approach to evaluating regression models on holdout data is through additive, residual-based loss functions, such as squared error loss or absolute loss. These measures are attractive from a statistical perspective as they have likelihood interpretations and because, from an engineering or scientific perspective, they often represent the “true” cost of the prediction errors.
  • Other approaches to regression model evaluation include Regression Error Curves, where the model is evaluated according to its error rate at different levels of “error tolerance”; and using medians of the absolute deviations (MAD), rather than their mean, as the error measure:

  • $$\mathrm{MAD} = \operatorname{Median}(|r_1|, \ldots, |r_n|) \quad (1)$$
  • There are many companies with relatively small wallet size and a few companies with very large wallet size. Therefore, evaluation measures such as mean squared error and mean absolute error can be greatly influenced by a small subset of companies that have very large wallets and for which the models are more likely to make larger absolute errors. On the other hand, measures such as median squared error can completely ignore the performance of the model on the companies with large IT wallet size, which are usually the most important customers. An approach that is often used to mitigate the effects of a skewed distribution (especially in modeling) is to transform the numbers to a logarithmic scale. This approach, however, is not adequate for the evaluation of prediction models, since log-dollars is a unit that does not have a clear financial meaning and, therefore, cannot be used in conjunction with other financial variables such as budget and costs.
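  • The sensitivity described above can be illustrated with a minimal sketch. The wallet sizes and predictions below are hypothetical values chosen only to show the effect, and the function name `residual_measures` is an assumption of this example, not part of the described method.

```python
from statistics import mean, median


def residual_measures(y_true, y_pred):
    """Return (mean squared error, mean absolute error, MAD) of the residuals."""
    residuals = [p - t for t, p in zip(y_true, y_pred)]
    abs_residuals = [abs(r) for r in residuals]
    mse = mean([r * r for r in residuals])
    mae = mean(abs_residuals)
    mad = median(abs_residuals)  # Equation (1)
    return mse, mae, mad


# Hypothetical wallet sizes: five small companies and one very large one.
y_true = [10, 12, 15, 18, 20, 5000]
y_pred = [11, 10, 16, 20, 19, 4000]

mse, mae, mad = residual_measures(y_true, y_pred)
print(f"MSE={mse:.1f}  MAE={mae:.1f}  MAD={mad:.1f}")
```

  • In this toy data the single large company dominates the mean-based measures, while the median-based measure does not reflect the error on that company at all, which is the trade-off the paragraph above describes.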
  • SUMMARY OF THE INVENTION
  • In an exemplary embodiment of the present invention, there is provided a method for identifying target customers and/or potential customers that have the largest spending budgets.
  • In another exemplary embodiment of the present invention, there is provided a method for evaluation that is more robust to gross errors in the data used for evaluation than the residual-based evaluation methods.
  • According to the invention, there is provided a method for evaluating customer prediction models that enables an organization to more accurately target customers and/or potential customers for sales and marketing efforts. This evaluation method considers various models that have generated customer prediction data. By evaluating the models (m1, . . . , mk), the user can have improved confidence in the targeting of the particular customers. Furthermore the evaluation can be done on a single model (ŷ=m(x)) to verify the ranking of customers and/or potential customers according to their potential to spend. The invention is intended to provide the “best” set of target customers through the use of ranking-based measures, which evaluate the performance of the model m(x) by sorting the predicted customer spending from “small” to “large”. “Best” can be defined in many ways but may include those customers that have the largest potential spending budget which is also referred to as “wallet size”.
  • The invention provides a computer-implemented ranking-based method for evaluating regression models by obtaining predictions ŷ from one or more models to be evaluated on a test set of customers for which the true value y of the quantity of interest has been observed. This test set resides in an electronic database. The test set is sorted in increasing order such that “large” corresponds to one of a plurality of customers and/or potential customers with largest perceived spending budget, and “small” corresponds to one of a plurality of customers and/or potential customers with smallest perceived spending budget. All models are applied to the customer data from the test set and predictions for the spending for each model and customer are obtained. The predictions are converted into ranks and stored for each model in one or more electronic databases as a model ranking table. The number of ranking order switches relative to the ranking of the observed customer spending is calculated for each model. Ranking order switches are defined as those changes in ranking position of the prediction relative to the order of the true observations y. A measure of the magnitude of erroneous ranking is calculated from a weighted sum of ranking order switches. The method then transforms the number of ranking switches and weighted sum of ranking order switches into a range of [−1, 1], wherein −1 corresponds to making all possible errors (inverse ranking) and 1 corresponds to a perfect model, and wherein said number of ranking switches has been transformed to represent a difference between a probability that the rankings of two customers and/or potential customers are in the same order versus the probability that two of the customers and/or potential customers are in different orders from the originally obtained rank. These measures are then normalized into a range of [0, 1], wherein 1 corresponds to perfect ranking and 0 corresponds to inverse ranking.
  • At this point, the variance of the measures of order switches is calculated and confidence intervals for each ranking measure are determined. Finally, the model performance table is updated with the ranking measures and their confidence intervals. These findings, as well as graphical representations thereof, can be provided to a domain expert, who will choose, based on this information, the best model m* of the one or more models evaluated. The predictions ŷ=m*(x) are stored in the optimal prediction table.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The foregoing and other objects, aspects and advantages will be better understood from the following detailed description of a preferred embodiment of the invention with reference to the drawings, in which:
  • FIG. 1 is a flow diagram of the steps of the ranking-based method according to the present invention;
  • FIG. 2 is a visualization of a percent of correctly ranked pairs involving a particular prediction;
  • FIG. 3 is a visualization of the Area Under the Curve (AUC) values for the model being evaluated; and
  • FIG. 4 is the approximate lift curve of the cumulative rank.
  • DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT OF THE INVENTION
  • Referring now to the drawings, and more particularly to FIG. 1, there is shown a flow diagram of the steps of the invention method for performing ranking-based evaluation of the customer prediction models.
  • The method described here uses several relationships to evaluate the models and related model data. These relationships utilize the following variables:
      • n is the size of a set of customers with known value of y,
      • x is a matrix of customer properties of length n,
      • y is a vector of observations of the response (revenue) of length n,
      • x_i is the matrix of customer properties of length n after sorting by y,
      • y_i is the vector of observations of the response (revenue) of length n after sorting by y,
      • ŷ is the predicted response of the model, ŷ=m(x),
      • s_i is the predicted rank for customer i of said plurality of customers and/or potential customers,
      • T is the number of rank order switches,
      • i is an integer between 1 and n indexing the sorted customers; it also corresponds to the order of the observed value y,
      • j is an integer indexing customers,
      • R is a weighted sum of order switches,
      • τ̂ is a probability of ranking switch,
      • ρ̂ is a ranking correlation,
      • C_i is the number of observations that are concordant with observation i,
      • 1−α is the confidence level of the confidence interval.
  • A database 101 contains customer data (x, y), where x holds the different properties of the customers and y is the quantity of interest (e.g., the revenue generated by the customer). It should be noted that y is a vector of length n and x is a matrix of length n and width equal to the number of different customer properties (also called features). The database also contains k models (m_1, . . . , m_k) that for each customer can predict the quantity of interest given x_i, where ŷ_i=m_l(x_i) for model l.
  • In function block 110, the customer data (x, y) are sorted in increasing order of y and stored in the database 101. The resulting sorted customer data (x_i, y_i) has the property y_i ≥ y_j if i > j, for i, j = 1, . . . , n.
  • In function block 111, all models are applied to the sorted customer properties x to obtain predictions ŷ_i=m_l(x_i) for all customers i from all models l. In function block 112, the predicted rank vector s^(l) of the predictions of each model l is calculated. Note that each s^(l) is a vector of length n and that the order of the entries in s^(l) still reflects the order of the true value y. For example, if there are three customers with ordered revenue values $3, $15, and $57, for which model l predicts revenue values of $5, $100, and $0, the predicted ranking would be s = (2, 3, 1). Formally, s_i is the rank of customer and/or potential customer i in the order of the predictions:

  • $$s_i = \bigl|\{\, j \le n : \hat{y}_j \le \hat{y}_i \,\}\bigr|$$
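  • The rank conversion can be sketched as follows. This is a minimal illustration that counts, for each customer, how many predictions are less than or equal to that customer's prediction, which is the reading that reproduces the three-customer example above. The function name `predicted_ranks` is hypothetical, and tied predictions (which the text does not discuss) simply share the larger rank here.

```python
def predicted_ranks(y_hat):
    """Rank of each prediction: the number of predictions <= that prediction."""
    return [sum(1 for other in y_hat if other <= value) for value in y_hat]


# Customers are already sorted by observed revenue: $3, $15, $57.
y = [3, 15, 57]
y_hat = [5.00, 100, 0]  # model predictions, listed in the same customer order

s = predicted_ranks(y_hat)
print(s)
```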
  • The invention considers two ranking-based evaluation measures and their interpretations (e.g., ranking order entries in model ranking table, etc.). Function block 113 calculates for each model the number of ranking order switches:

  • $$T = \sum_{i<j} \mathbf{1}\{s_i > s_j\} \quad (2)$$
  • and the weighted sum of order switches:

  • $$R = \sum_{i<j} (j-i)\,\mathbf{1}\{s_i > s_j\} \quad (3)$$
  • The first measure simply counts how many of the pairs in the test data are ordered incorrectly by the model m(x). The second measure also considers these incorrect orderings, but weighs them by the difference in their model ranks, that is, a measure of the magnitude of error being committed. The results of each of these steps are stored electronically in the system database 101.
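  • As a sketch of Equations (2) and (3), both measures can be computed in one pass over all pairs of customers. The function name `order_switches` is an assumption of this example, and the quadratic loop is chosen for clarity rather than speed.

```python
def order_switches(s):
    """Return (T, R) for predicted ranks s listed in increasing order of observed y.

    T counts pairs i < j whose predicted ranks are switched (s[i] > s[j]);
    R adds up the index distance j - i of every switched pair.
    """
    n = len(s)
    T = 0
    R = 0
    for i in range(n):
        for j in range(i + 1, n):
            if s[i] > s[j]:
                T += 1
                R += j - i
    return T, R


print(order_switches([2, 3, 1]))  # ranks from the three-customer example
print(order_switches([3, 2, 1]))  # fully reversed ranking
```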
  • In function block 114, the measures T and R are transformed using rescaling equations to put them into the range [−1, 1], where 1 corresponds to perfect model performance (T = R = 0) and −1 corresponds to making all possible errors, thus attaining perfect reverse ranking. It is easy to verify that max(T) = n(n−1)/2 and max(R) = n(n−1)(n+1)/6. The rescaling equations are:
  • $$\hat{\tau} = 1 - \frac{4T}{n(n-1)} \quad (4)$$
  • $$\hat{\rho} = 1 - \frac{12R}{n(n-1)(n+1)} \quad (5)$$
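  • A small sketch of the rescaling in Equations (4) and (5) follows. The function name `rescaled_measures` is hypothetical, and the sketch assumes at least two customers and no tied ranks.

```python
def rescaled_measures(s):
    """Return (tau_hat, rho_hat) for predicted ranks s, per Equations (4) and (5)."""
    n = len(s)
    T = 0  # number of switched pairs, Equation (2)
    R = 0  # switched pairs weighted by j - i, Equation (3)
    for i in range(n):
        for j in range(i + 1, n):
            if s[i] > s[j]:
                T += 1
                R += j - i
    tau_hat = 1 - 4 * T / (n * (n - 1))
    rho_hat = 1 - 12 * R / (n * (n - 1) * (n + 1))
    return tau_hat, rho_hat


print(rescaled_measures([1, 2, 3, 4, 5]))  # perfect ranking
print(rescaled_measures([5, 4, 3, 2, 1]))  # reverse ranking
print(rescaled_measures([2, 3, 1]))        # three-customer example
```

  • For the reverse ranking the loop reaches T = n(n−1)/2 and R = n(n−1)(n+1)/6, so both rescaled values come out at −1, in line with the stated maxima.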
  • These values are similar to Kendall's τ, which measures the strength of the relationship between two variables, and Spearman's rank correlation. The moments of τ̂ and ρ̂ under the relevant null assumptions (τ=0 and ρ=0, respectively) are calculated and a normal approximation gives a hypothesis testing methodology for the assumption of no correlation. For residual based measures, it is typically not possible to build confidence intervals without parametric assumptions and/or variance estimation. The non-parametric nature of τ̂ allows a general expression for its variance to be written as:
  • $$\mathrm{Var}(\hat{\tau}) = \frac{8}{n(n-1)} \Bigl( \pi_c (1 - \pi_c) + 2(n-2)\bigl(\pi_{cc} - \pi_c^{2}\bigr) \Bigr),$$
  • where π_c = E(τ̂) = ½ + ½τ and π_cc are two properties of the ranking function. These are then replaced with the sample means to obtain:
  • $$\widehat{\mathrm{Var}}(\hat{\tau}) = \left(\frac{2}{n(n-1)}\right)^{2} \cdot 2 \cdot \left( 2\sum_i C_i^{2} - \sum_i C_i - \frac{2n-3}{n(n-1)} \cdot \Bigl(\sum_i C_i\Bigr)^{2} \right) \quad (6)$$
  • where
  • $$C_i = \sum_{j<i} \mathbf{1}\{s_i > s_j\} + \sum_{j>i} \mathbf{1}\{s_i < s_j\}$$
  • is the number of observations that are “concordant” with observation i, that is, whose ranking relative to i in the observed data agrees with the ranking by model scores (as plotted in FIG. 3). Function block 114 also calculates the confidence intervals for the values of the evaluation measure given its sample value (the non-null case), as they represent the uncertainty in the model evaluation based on a single test set. Since τ̂ is asymptotically normal, a 1−α confidence interval for τ is:
  • $$\hat{\tau} \pm z_{1-\alpha/2} \sqrt{\widehat{\mathrm{Var}}(\hat{\tau})}$$
  • where z_{1−α/2} is the corresponding quantile of the standard normal distribution.
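  • The variance estimate and the normal-approximation interval can be sketched as below. This is an illustration under stated assumptions, not the patented implementation: concordance is counted on the predicted ranks s (listed in increasing order of observed y), the interval is taken as τ̂ plus or minus a standard normal quantile times the estimated standard deviation, and a negative variance estimate (which Equation (6) does not rule out on very small samples) is clipped to zero. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def concordant_counts(s):
    """C_i: number of observations whose order relative to i matches the model's order."""
    n = len(s)
    return [
        sum(1 for j in range(i) if s[i] > s[j])
        + sum(1 for j in range(i + 1, n) if s[i] < s[j])
        for i in range(n)
    ]


def tau_confidence_interval(s, alpha=0.05):
    """Return (tau_hat, variance estimate, (lower, upper)) following Equation (6)."""
    n = len(s)
    C = concordant_counts(s)
    sum_c = sum(C)
    sum_c2 = sum(c * c for c in C)
    # Each concordant pair is counted twice in sum_c, so T = all pairs - sum_c / 2.
    T = n * (n - 1) // 2 - sum_c // 2
    tau_hat = 1 - 4 * T / (n * (n - 1))
    inner = 2 * sum_c2 - sum_c - (2 * n - 3) * sum_c ** 2 / (n * (n - 1))
    var_hat = max((2 / (n * (n - 1))) ** 2 * 2 * inner, 0.0)  # clip negative estimates
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(var_hat)
    return tau_hat, var_hat, (tau_hat - half_width, tau_hat + half_width)


tau_hat, var_hat, (lower, upper) = tau_confidence_interval([2, 1, 3, 4])
print(tau_hat, var_hat, lower, upper)
```

  • The interval is not clipped to [−1, 1] in this sketch; with so few customers it mainly serves to show the mechanics.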
  • In function block 115, three graphical representations of the ordered switches are constructed: (1) percent of correctly ranked pairs involving a particular prediction, (2) AUC as a function of cutoff position, and (3) lift curve of the cumulative rank. Examples of these graphical representations are shown in FIGS. 2, 3, and 4, respectively.
  • Starting from the largest model prediction m(x_n), function block 115 calculates in decreasing order for each observation x_i the percentage of correctly ranked pairs (y_i, y_j) over all j≠i. The percentage of correct pairings as a function of the inverse rank is shown in FIG. 2. The area above the curve is the sum of the percent incorrectly ranked pairs, which is equal to 2T/(n(n−1)). Therefore, the area under the curve equals τ̃ = 1 − 2T/(n(n−1)), that is, τ̂ normalized to the range [0, 1]. The dashed line corresponds to a locally optimal performance. A perfect model would show a constant performance of 100% correctly ranked pairs. But given that the model is not perfect and makes predictions that are sometimes too large or too small, even a perfect prediction for a particular observation with m(x_i)=y_i will have a number of inversely ranked pairs due to errors of the other predictions. The upper limit of the performance for a given prediction, keeping everything else constant, is therefore not 100% but is determined by the model performance of the predictions around it. The locally optimal performance is interpreted as the highest achievable percentage of correctly ranked pairs if m(x_i) could be placed arbitrarily, given all other model predictions.
  • The performance in a particular region of the graph is characterized by two properties of the plot, 1) the distance of the local optimum from the 100% line, and 2) the distance of the actual performance from the local optimum. A particular region with a performance that on average remains very close to the local optimum has a nearly perfect ranking and is only disturbed by bad predictions that were either larger or smaller than the predictions of the region.
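  • The per-observation quantity behind FIG. 2 can be sketched as follows. The sketch computes, for each customer, the share of the other n−1 customers that the model orders correctly relative to it, and checks that the average share equals 1 − 2T/(n(n−1)). It does not compute the locally optimal (dashed) line, it assumes no tied ranks, and the function names are hypothetical.

```python
def percent_correct_pairs(s):
    """Share of correctly ordered pairs involving each observation.

    s holds the predicted ranks, listed in increasing order of observed y.
    """
    n = len(s)
    shares = []
    for i in range(n):
        concordant = sum(
            1 for j in range(n)
            if j != i and (s[i] > s[j]) == (i > j)
        )
        shares.append(concordant / (n - 1))
    return shares


def count_switches(s):
    """T from Equation (2)."""
    n = len(s)
    return sum(1 for i in range(n) for j in range(i + 1, n) if s[i] > s[j])


s = [2, 1, 3, 4]
shares = percent_correct_pairs(s)
area_under_curve = sum(shares) / len(shares)
n, T = len(s), count_switches(s)
print(shares, area_under_curve, 1 - 2 * T / (n * (n - 1)))
```

  • To reproduce the ordering used in the figure, the shares would be plotted starting from the customer with the largest prediction.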
  • For FIG. 3, the original regression results are transformed into n−1 classification results, where the observations are discretized into a binary class variable c^(i)(y_j) = 1 iff y_j ≥ y_i, for all possible cutoffs 1<i<n.
  • For each classification c^(i) the model performance is evaluated using the area under the ROC curve (AUC_i) (FIG. 3). The probabilistic interpretation of the AUC is the probability that a pair of observations with opposite class labels is ranked correctly. Since AUC_i only considers pairs with different class labels under cutoff i, the number of pairs used to calculate AUC_i is equal to i·(n−i) and the number of incorrectly ranked pairs under this cutoff is therefore:

  • $$(1 - \mathrm{AUC}_i) \cdot i \cdot (n - i).$$
  • In function block 115 of FIG. 1, the total number of times a pair of observations will be assigned opposite class labels across all cutoffs is equal to the rank difference; i.e., a neighboring pair (k; k+1) receives opposite class labels only if the cutoff is equal to k whereas the extreme pair (1, n) will have opposite classes for all n−1 cutoffs. Given the definition in Equation (3) this implies that:
  • $$\sum_{i} (1 - \mathrm{AUC}_i) \cdot i \cdot (n - i) = R$$
  • Since each AUC_i is rescaled by a different factor i·(n−i), a graph of the AUC_i as a function of the cutoff i would not have an area equal to ρ̃. In order to achieve a direct correspondence with ρ̃, i·(n−i) units are allocated to AUC_i by rescaling the x-axis accordingly. FIG. 3 shows such a transformation of the x-axis and has an area under the curve of ρ̃.
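  • The relation between the per-cutoff AUC values and R can be checked numerically with a short sketch. It assumes a cutoff convention in which, at cutoff i, the i customers with the smallest observed values form one class and the remaining n−i form the other, so that i·(n−i) pairs have opposite labels; the function names are hypothetical.

```python
def weighted_switches(s):
    """R from Equation (3): switched pairs weighted by their index distance j - i."""
    n = len(s)
    return sum(j - i for i in range(n) for j in range(i + 1, n) if s[i] > s[j])


def auc_at_cutoff(s, i):
    """AUC when the i lowest-revenue customers are class 0 and the rest class 1."""
    n = len(s)
    low, high = s[:i], s[i:]
    correct = sum(1 for a in low for b in high if a < b)
    return correct / (i * (n - i))


def auc_curve(s):
    """List of (cutoff, x-axis width i*(n-i), AUC_i) for cutoffs 1..n-1."""
    n = len(s)
    return [(i, i * (n - i), auc_at_cutoff(s, i)) for i in range(1, n)]


s = [2, 3, 1]  # predicted ranks from the three-customer example
curve = auc_curve(s)
errors = sum((1 - auc) * width for _, width, auc in curve)
area = sum(auc * width for _, width, auc in curve) / sum(w for _, w, _ in curve)
print(curve, errors, weighted_switches(s), area)
```

  • With the x-axis widths i·(n−i), the normalized area in this example equals 1 − 6R/(n(n−1)(n+1)), which is ρ̂ mapped linearly from [−1, 1] to [0, 1].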
  • The plot in FIG. 4 is very close in spirit to a lift curve. After sorting the model predictions in decreasing order, the cumulative inverse rank of:
  • $$p_i = \sum_{j=1}^{i} (n - s_j + 1)$$
  • is plotted for increasing cutoffs i in percent. Using the inverse rank emphasizes the model performance on the largest predictions that is shown in the bottom left of the graph. The model performance is bounded above by the optimal ranks
  • $$p_i^{\mathrm{opt}} = \sum_{j=1}^{i} (n - j + 1)$$
  • and below by the cumulative worst (inverse) ranking
  • $$w_i = \sum_{j=1}^{i} j.$$
  • The area under the model curve is given as
  • $$\sum_{i=1}^{n} (n - i)(n - s_i)$$
  • shown to be equal to
  • $$n^{3} - n^{2} - n - R + \sum_{i=1}^{n} i^{2}.$$
  • Once all models have been evaluated through function blocks 111-115, function block 116 stores the results and graphical representations in the system database 101. This information is then displayed for the analyst at display block 117. This display can be provided in any format (e.g., printed report, electronic display on a computer monitor, etc.) specified by the user. The analyst then evaluates the model based on the provided information in function block 118 and selects the model that can provide the suggested “best” list of potential customers for targeted sales and marketing efforts. The database is updated with these recommendations at function block 119 and a final report containing the optimal model and customer rankings is provided as the output 102 to be used by company personnel for targeted sales and marketing efforts.
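  • The comparison of several models can be sketched end to end as below. The model names and predictions are hypothetical, and picking the model with the largest τ̂ is only one possible way to shortlist a candidate; in the described method the final choice is made by the analyst from the stored measures and plots.

```python
def predicted_ranks(y_hat):
    """Rank of each prediction: the number of predictions <= that prediction."""
    return [sum(1 for other in y_hat if other <= value) for value in y_hat]


def rank_measures(s):
    """Return (tau_hat, rho_hat) per Equations (2) through (5)."""
    n = len(s)
    T = sum(1 for i in range(n) for j in range(i + 1, n) if s[i] > s[j])
    R = sum(j - i for i in range(n) for j in range(i + 1, n) if s[i] > s[j])
    return 1 - 4 * T / (n * (n - 1)), 1 - 12 * R / (n * (n - 1) * (n + 1))


# Observed revenue, sorted in increasing order (hypothetical test set).
y = [3, 15, 57, 120, 400]

# Hypothetical predictions of three models for the same five customers.
predictions = {
    "m1": [5, 100, 0, 150, 300],
    "m2": [4, 10, 60, 90, 500],
    "m3": [400, 120, 57, 15, 3],
}

# Model performance table: model name -> (tau_hat, rho_hat).
table = {name: rank_measures(predicted_ranks(p)) for name, p in predictions.items()}
best = max(table, key=lambda name: table[name][0])

for name, (tau_hat, rho_hat) in table.items():
    print(f"{name}: tau_hat={tau_hat:+.2f}  rho_hat={rho_hat:+.2f}")
print("shortlisted:", best)
```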
  • While the invention has been described in terms of its preferred embodiment, those skilled in the art will recognize that the invention can be practiced with modification within the spirit and scope of the appended claims.

Claims (18)

1. A computer-implemented ranking-based method for evaluating regression models by obtaining predictions ŷ from one or more models to be evaluated on a test set of customers for which the true value y of the quantity of interest has been observed, comprising the steps of:
storing the test set in an electronic database sorted in increasing order such that “large” corresponds to one of a plurality of customers and/or potential customers with largest perceived spending budget, and “small” corresponds to one of a plurality of customers and/or potential customers with smallest perceived spending budget;
applying all models to customer data from the test set to obtain predictions for the spending for each model and customer;
converting the predictions into ranks and storing for each model in one or more electronic databases as a model ranking table, a number of ranking order switches relative to ranking of the observed customer spending being calculated for each model and ranking order switches being defined as those changes in ranking position of the prediction relative to the order of the true observations y;
calculating a measure of a magnitude of erroneous ranking from a weighted sum of ranking order switches;
transforming the number of ranking switches and weighted sum of ranking order switches into a range of [−1, 1] wherein −1 corresponds to making all possible errors (inverse ranking) and 1 corresponds to a perfect model wherein said number of ranking switches has been transformed to represent a difference between a probability that the ranking of two customers and/or potential customers are in the same order versus the probability that two of the customers and/or potential customers are in different orders from the originally obtained rank;
normalizing transformed measures of order switches into a range of [0,1] wherein 1 corresponds to perfect ranking and 0 corresponds to inverse ranking;
calculating a variance of measures of order switches and determining confidence intervals for each ranking measure;
updating the model performance table with the ranking measures and their confidence intervals; and
outputting the ranking measures and confidence levels as well as graphical representations thereof to a domain expert, who will choose based on this information the best model m* of the one or more models evaluated, and storing the predictions of ŷ=m*(x) in the optimal prediction table.
2. The computer-implemented ranking-based method recited in claim 1, wherein the step of calculating a measure of a magnitude of erroneous ranking from a weighted sum of ranking order switches comprises the steps of:
calculating using computing resources a number of ranking order switches wherein ranking order switches are those changes in ranking order position of said plurality of customers and/or potential customers assuming said ranking obtained in said obtaining step was erroneous, and
calculating using computing resources a weighted sum of ranking order switches wherein said weighted sum of ranking order switches provides a measure of the magnitude of erroneous ranking.
3. The computer-implemented ranking-based method of claim 2, wherein said calculating using computing resources the number of ranking order switches step comprises a relationship:

$T = \sum_{i<j} 1\{s_i > s_j\}$
wherein, a set of variables of said relationship includes:
T is a number of rank order switches,
$s_i$ is a rank order of one customer of said plurality of customers and/or potential customers,
i is an integer from 1 to (n−1),
j is an integer from (i+1) to n, and
n is a size of a set of observed customers.
4. The computer-implemented ranking-based method recited in claim 2, wherein said calculating using computing resources a weighted sum of order switches step comprises a relationship:

$R = \sum_{i<j} (j - i) \, 1\{s_i > s_j\}$
wherein, a set of variables of said relationship includes:
R is a weighted sum of order switches,
$s_i$ is a rank order of one customer of said plurality of customers and/or potential customers,
i is an integer from 1 to (n−1),
j is an integer from (i+1) to n, and
n is a size of a set of observed customers.
5. The computer-implemented ranking-based method recited in claim 1, wherein the step of calculating a variance of measures of order switches and determining confidence intervals for each ranking measure comprises the steps of:
calculating using computing resources a variance of said measures of said ranking order switches;
calculating using computing resources a confidence interval for each measure of said ranking order switches; and
constructing using computing resources graphical representations of said ranking order switches.
6. The computer-implemented ranking-based method recited in claim 5, wherein said calculating using computing resources a variance implements a relationship:
$\widehat{\mathrm{Var}}(\hat{\tau}) = \left(\frac{2}{n(n-1)}\right)^2 \cdot 2 \cdot \left(2\sum_i C_i^2 - \sum_i C_i - \frac{2n-3}{n(n-1)} \cdot \left(\sum_i C_i\right)^2\right)$
wherein, a set of variables of said set of relationships includes:
$C_i = \sum_{j<i} 1\{\hat{y}_i > \hat{y}_j\} + \sum_{j>i} 1\{\hat{y}_i < \hat{y}_j\}$
is a number of observations that are concordant with observation i,
i is an integer from 1 to (n−1),
j is an integer from (i+1) to n,
n is a size of a set of customers and/or potential customers, and
ŷ is a predicted response of the model ŷ=m(x).
7. The computer-implemented ranking-based method recited in claim 1, wherein the step of outputting the ranking measures and confidence levels comprises the steps of:
providing an output that presents an updated ranking of said plurality of customers and/or potential customers that corresponds to ranking from model obtained in said obtaining step with highest confidence interval; and/or
providing an output of a model recommendation that corresponds to a best model of said one or more models evaluated.
8. The computer-implemented ranking-based method recited in claim 1, wherein the step of transforming said number of ranking switches and said weighted sum of order switches implements a set of relationships:
$\hat{\tau} = 1 - \frac{4T}{n(n-1)}$, and $\hat{\rho} = 1 - \frac{12R}{n(n-1)(n+1)}$
wherein, a set of variables of said set of relationships includes:
$\hat{\tau}$ is a probability that the ranking order has been changed,
n is a size of a set of customers and/or potential customers,
$\hat{\rho}$ is a ranking correlation,
T is a number of rank order switches, and
R is said weighted sum of order switches.
9. The computer-implemented ranking-based method recited in claim 1, wherein the step of outputting provides an updated ranking of a plurality of customers and/or potential customers that corresponds to ranking from a model obtained in said obtaining step with highest confidence interval.
10. A computer system for implementing a ranking-based method for evaluating regression models by obtaining predictions ŷ from one or more models to be evaluated on a test set of customers for which the true value y of the quantity of interest has been observed comprising:
an electronic database storing the test set sorted in increasing order such that “large” corresponds to one of a plurality of customers and/or potential customers with largest perceived spending budget, and “small” corresponds to one of a plurality of customers and/or potential customers with smallest perceived spending budget;
means for applying all models to customer data from the test set to obtain predictions for the spending for each model and customer and converting the predictions into ranks and storing for each model in one or more electronic databases as a model ranking table, a number of ranking order switches relative to ranking of the observed customer spending being calculated for each model and ranking order switches being defined as those changes in ranking position of the prediction relative to the order of the true observations y;
calculating means for calculating a measure of a magnitude of erroneous ranking from a weighted sum of ranking order switches;
means for transforming the number of ranking switches and weighted sum of ranking order switches into a range of [−1, 1] wherein −1 corresponds to making all possible errors (inverse ranking) and 1 corresponds to a perfect model wherein said number of ranking switches has been transformed to represent a difference between a probability that the ranking of two customers and/or potential customers are in the same order versus the probability that two of the customers and/or potential customers are in different orders from the originally obtained rank;
said calculating means normalizing transformed measures of order switches into a range of [0,1] wherein 1 corresponds to perfect ranking and 0 corresponds to inverse ranking and calculating a variance of measures of order switches and determining confidence intervals for each ranking measure;
means for updating the model performance table with the ranking measures and their confidence intervals; and
output means for outputting the ranking measures and confidence levels as well as graphical representations thereof to a domain expert, who will choose based on this information the best model m* of the one or more models evaluated, and storing the predictions of ŷ=m*(x) in the optimal prediction table.
11. The computer system recited in claim 10, wherein said calculating means calculates a number of ranking order switches wherein ranking order switches are those changes in ranking order position of said plurality of customers and/or potential customers assuming said ranking obtained in said obtaining step was erroneous, and a weighted sum of ranking order switches wherein said weighted sum of ranking order switches provides a measure of the magnitude of erroneous ranking.
12. The computer system recited in claim 11, wherein said calculating means implements a relationship:

$T = \sum_{i<j} 1\{s_i > s_j\}$
wherein, a set of variables of said relationship includes:
T is a number of rank order switches,
$s_i$ is a rank order of one customer of said plurality of customers and/or potential customers,
i is an integer from 1 to (n−1),
j is an integer from (i+1) to n, and
n is a size of a set of observed customers.
13. The computer system recited in claim 11, wherein said calculating means implements a relationship:

$R = \sum_{i<j} (j - i) \, 1\{s_i > s_j\}$
wherein, a set of variables of said relationship includes:
R is a weighted sum of order switches,
$s_i$ is a rank order of one customer of said plurality of customers and/or potential customers,
i is an integer from 1 to (n−1),
j is an integer from (i+1) to n, and
n is a size of a set of observed customers.
14. The computer system recited in claim 10, wherein said calculating means calculates a variance of said measures of said ranking order switches, a confidence interval for each measure of said ranking order switches, and graphical representations of said ranking order switches.
15. The computer system recited in claim 14, wherein said calculating means implements a relationship:
$\widehat{\mathrm{Var}}(\hat{\tau}) = \left(\frac{2}{n(n-1)}\right)^2 \cdot 2 \cdot \left(2\sum_i C_i^2 - \sum_i C_i - \frac{2n-3}{n(n-1)} \cdot \left(\sum_i C_i\right)^2\right)$
wherein, a set of variables of said set of relationships includes:
$C_i = \sum_{j<i} 1\{\hat{y}_i > \hat{y}_j\} + \sum_{j>i} 1\{\hat{y}_i < \hat{y}_j\}$
is a number of observations that are concordant with observation i,
i is an integer from 1 to (n−1),
j is an integer from (i+1) to n,
n is a size of a set of customers and/or potential customers, and
ŷ is a predicted response of the model ŷ=m(x).
16. The computer system recited in claim 10, wherein said output means provides an output that presents an updated ranking of said plurality of customers and/or potential customers that corresponds to ranking from model obtained in said obtaining step with highest confidence interval and/or provides an output of a model recommendation that corresponds to a best model of said one or more models evaluated.
17. The computer system recited in claim 10, wherein said transforming means implements a set of relationships:
$\hat{\tau} = 1 - \frac{4T}{n(n-1)}$, and $\hat{\rho} = 1 - \frac{12R}{n(n-1)(n+1)}$
wherein, a set of variables of said set of relationships includes:
$\hat{\tau}$ is a probability that the ranking order has been changed,
n is a size of a set of customers and/or potential customers,
$\hat{\rho}$ is a ranking correlation,
T is a number of rank order switches, and
R is said weighted sum of order switches.
18. A computer readable media containing code which implements a ranking-based method for evaluating regression models by obtaining predictions ŷ from one or more models to be evaluated on a test set of customers for which the true value y of the quantity of interest has been observed, the method comprising the steps of:
storing the test set in an electronic database sorted in increasing order such that “large” corresponds to one of a plurality of customers and/or potential customers with largest perceived spending budget, and “small” corresponds to one of a plurality of customers and/or potential customers with smallest perceived spending budget;
applying all models to customer data from the test set to obtain predictions for the spending for each model and customer;
converting the predictions into ranks and storing for each model in one or more electronic databases as a model ranking table, a number of ranking order switches relative to ranking of the observed customer spending being calculated for each model and ranking order switches being defined as those changes in ranking position of the prediction relative to the order of the true observations y;
calculating a measure of a magnitude of erroneous ranking from a weighted sum of ranking order switches;
transforming the number of ranking switches and weighted sum of ranking order switches into a range of [−1, 1] wherein −1 corresponds to making all possible errors (inverse ranking) and 1 corresponds to a perfect model wherein said number of ranking switches has been transformed to represent a difference between a probability that the ranking of two customers and/or potential customers are in the same order versus the probability that two of the customers and/or potential customers are in different orders from the originally obtained rank;
normalizing transformed measures of order switches into a range of [0,1] wherein 1 corresponds to perfect ranking and 0 corresponds to inverse ranking;
calculating a variance of measures of order switches and determining confidence intervals for each ranking measure;
updating the model performance table with the ranking measures and their confidence intervals; and
outputting the ranking measures and confidence levels as well as graphical representations thereof to a domain expert, who will choose based on this information the best model m* of the one or more models evaluated, and storing the predictions of ŷ=m*(x) in the optimal prediction table.
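
For illustration only, and not as part of the claims, the relationships recited for T, R, τ̂ and ρ̂ can be sketched in Python as below. The sketch assumes no ties, takes s_i to be the rank of the prediction (1 = smallest) for the i-th customer when the test set is sorted by increasing observed value, and maps the [−1, 1] measures to [0, 1] with the linear map (x + 1)/2, which is an assumption since the exact normalization is not spelled out here. The function names and sample data are hypothetical.

```python
def prediction_ranks(y_true, y_pred):
    """Ranks s_i of the predictions, with customers ordered by increasing y_true."""
    n = len(y_true)
    by_true = sorted(range(n), key=lambda k: y_true[k])
    by_pred = sorted(range(n), key=lambda k: y_pred[k])
    rank_of = {k: r for r, k in enumerate(by_pred, start=1)}
    return [rank_of[k] for k in by_true]


def rank_switch_measures(s):
    """Compute T, R, tau-hat, rho-hat and their [0, 1] rescalings from ranks s."""
    n = len(s)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    T = sum(1 for i, j in pairs if s[i] > s[j])      # number of order switches
    R = sum(j - i for i, j in pairs if s[i] > s[j])  # weighted sum of switches
    tau = 1 - 4 * T / (n * (n - 1))
    rho = 1 - 12 * R / (n * (n - 1) * (n + 1))
    return {
        "T": T,
        "R": R,
        "tau": tau,
        "rho": rho,
        # Assumed linear rescaling: 1 = perfect ranking, 0 = inverse ranking.
        "tau01": (tau + 1) / 2,
        "rho01": (rho + 1) / 2,
    }


# Hypothetical observed spending and model predictions for four customers.
s = prediction_ranks([10, 40, 20, 30], [12, 35, 30, 20])
measures = rank_switch_measures(s)
```

A perfect model gives T = R = 0 and both measures equal to 1, while a fully inverted ranking gives τ̂ = ρ̂ = −1, matching the range stated in the claims.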
US11/456,663 2006-07-11 2006-07-11 Ranking-based method and system for evaluating customer predication models Abandoned US20080015910A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US11/456,663 US20080015910A1 (en) 2006-07-11 2006-07-11 Ranking-based method and system for evaluating customer predication models
US12/050,371 US7725340B2 (en) 2006-07-11 2008-03-18 Ranking-based method for evaluating customer prediction models

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US11/456,663 US20080015910A1 (en) 2006-07-11 2006-07-11 Ranking-based method and system for evaluating customer predication models

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US12/050,371 Continuation US7725340B2 (en) 2006-07-11 2008-03-18 Ranking-based method for evaluating customer prediction models

Publications (1)

Publication Number Publication Date
US20080015910A1 true US20080015910A1 (en) 2008-01-17

Family

ID=38950363

Family Applications (2)

Application Number Title Priority Date Filing Date
US11/456,663 Abandoned US20080015910A1 (en) 2006-07-11 2006-07-11 Ranking-based method and system for evaluating customer predication models
US12/050,371 Active 2027-02-27 US7725340B2 (en) 2006-07-11 2008-03-18 Ranking-based method for evaluating customer prediction models

Country Status (1)

Country Link
US (2) US20080015910A1 (en)

Also Published As

Publication number Publication date
US20080221954A1 (en) 2008-09-11
US7725340B2 (en) 2010-05-25

Legal Events

Date Code Title Description
AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:REISZ, CLAUDIA;ROSSET, SAHARON;ZADROZNY, BIANCA;REEL/FRAME:017910/0599;SIGNING DATES FROM 20060706 TO 20060710

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION