RECOMMENDATION METHOD AND SYSTEM BASED ON RATING SPACE PARTITIONED DATA
BACKGROUND OF THE INVENTION
A. Field of the Invention
This invention relates generally to data processing systems, and more particularly,
to recommendation systems.
B. Description of the Related Art
Recommendation systems are becoming widely used in e-commerce business
activities. Recommendation systems allow e-commerce operators to take advantage of
customer databases to provide valuable personalized service to customers. For example, systems that make personalized recommendations are used as a marketing tool to turn "window shoppers" into buyers, increase cross-sells and up-sells, and deepen customer
loyalty.
Existing recommendation systems make recommendations to customers using unary
data or data from a well-known Likert scale. Unary data is a set of user-item pairs that
indicate an event of interest to the user has occurred. An example of unary data is
purchase record data where a user-item pair indicates that the user has purchased a
particular item. In general, unary data does not characterize the event, but rather it
indicates that the event has occurred. Likert scale data indicates a user's preferences about an item. Typically, Likert scales give the user options, such as: like very much "5";
like a little "4"; don't mind either way "3"; dislike a little "2"; and strongly dislike "1."
Likert data based calculations use a form of correlation calculations that assumes negative information and positive information within the same data. Thus, calculations using
Likert data must accept, and cannot differentiate or separate, positive and negative data. For example, these calculations treat low positive data (e.g., a user rating an item a "1")
as carrying a negative connotation such as "strongly dislike." In performing the correlation
calculations, negative data is treated no differently from positive data. Therefore, Likert
based calculations associate customers regardless of the type of data in common, so long
as the customers have some data in common. Therefore, Likert calculations may produce
erroneous results.
Although existing recommendation systems provide recommendations based on
unary data and Likert scale data, these systems are not capable of treating positive data and negative data differently. There exists a need to improve existing recommendation
systems to provide recommendations while using negative data in a different manner from that of positive data.
SUMMARY OF THE INVENTION
Methods and systems consistent with the present invention provide a
recommendation system that uses positive data and negative data separately to locate
neighbors and provide recommendations to users. Such methods and systems use the data
to locate potential neighbors based on users' ratings. Methods and systems consistent
with the present invention calculate affinity values between the user and potential
neighbors located to determine whether the potential neighbor's ratings are closely related
to the user's ratings. If a user and a potential neighbor have an affinity greater than
a predetermined threshold, that neighbor is considered close enough to the user to provide
a recommendation for various items. Affinity values are calculated from a series of
affinity equations available to the recommendation system.
Consistent with the present invention a computer-implemented method provides
recommendations based on stored data corresponding to each one of a set of users with
respect to a first item. The method analyzes partitioned preference data that reflects
positive and negative preferences expressed by each one of a set of users with respect to
the first item. The method also provides a recommendation based on the determined
affinity.
Consistent with the present invention a computer-implemented method provides
a recommendation using resource allocation data that indicates strength of a user's interest in a particular item. The method obtains click-stream data corresponding to the user, locates a plurality of neighbors with click-stream data similar to the user's click-stream
data, and determines an affinity between the user and one of the plurality of neighbors
based on the resource allocation data. Once determined, the method includes the one of
the located neighbors meeting predetermined criteria on a neighbor list, and provides a
recommendation to the user based on the neighbor list.
Consistent with the present invention a computer-implemented method provides
a recommendation using resource allocation data that indicates a user's strength of an
interest in a particular item. The method locates, in a database that contains resource allocation data for a plurality of users, other users with a similar strength of an interest
in the particular item as the user, and determines an affinity between the user and one of the other users based on the similar strength of an interest. The method then provides a
recommendation to the user based on a list that contains a set of other users meeting predetermined criteria.
Consistent with the present invention a computer-implemented method provides
a recommendation for an item based on likes and dislikes of a user. The method locates
a plurality of neighbors with positive data similar to the user's positive data using a search
strategy, and determines an affinity between the user and each one of the plurality of
neighbors based on weighted agreements between the user and each one of the plurality
of neighbors. The method also includes the one of the located neighbors meeting
predetermined criteria on a neighbor list, and provides a recommendation to the user
based on the neighbor list.
Consistent with the present invention a computer-implemented method provides
a recommendation that indicates a user's likes and dislikes for a particular item. The
method locates, in a database that contains positive and negative data for a plurality of
users, other users with similar positive likes for the particular item as the user, and
determines an affinity between the user and one of the other users. The affinity is
composed of weighted agreements. The method also provides a recommendation to the
user based on the determined affinity.
Consistent with the present invention, a method generates a recommendation for
an item. The method locates at least one potential neighbor for a user from a pool of candidates using a search strategy, and determines an affinity between the user and a
potential neighbor. The affinity is composed of a weighted agreement between the user and a potential neighbor. If the affinity is above a threshold, the method decides that the potential neighbor is a neighbor of the user.
Consistent with the present invention a computer-implemented recommendation method is disclosed. The method permits a user to submit a request for a
recommendation, provides user ratings corresponding to each one of a set of users with
respect to a first item to a rating database, and determines an affinity between a first user
and another user by analyzing partitioned preference data that reflects positive and/or
negative preferences expressed by each one of a set of users with respect to the first item.
The method also provides a recommendation based on the determined affinity.
BRIEF DESCRIPTION OF THE DRAWINGS
The accompanying drawings, which are incorporated in and constitute a part of
this specification, illustrate an implementation of the invention and, together with the description, serve to explain the advantages and principles of the invention. In the drawings,
Figure 1 depicts a data processing system suitable for practicing methods and
systems consistent with the present invention;
Figure 2 depicts a more detailed diagram of the client computer depicted in
Fig. 1;
Figure 3 depicts a more detailed diagram of the recommendation server depicted in Fig. 1;
Figure 4 depicts a flow chart of the steps performed when collecting data in a manner consistent with the principles of the present invention;
Figure 5 depicts a flow chart of the steps performed when providing a recommendation consistent with the principles of the present invention;
Figure 6A depicts a rating table for use with methods and systems in a manner consistent with the present invention;
Figure 6B depicts an agreement table for use with methods and systems in a
manner consistent with the principles of the present invention;
Figure 6C depicts an interest rating table for use with methods and systems
consistent with the present invention;
Figure 6D depicts a normalized interest rating table for the interest rating table
of Fig. 6C;
Figure 6E depicts a second interest rating table for use with methods and
systems consistent with the present invention;
Figure 6F depicts a second normalized interest rating table for the second interest rating table of Fig. 6E;
Figure 7A depicts an embodiment of an electronic commerce server for use
with the present invention when using interest data; and
Figure 7B depicts another embodiment of an electronic commerce server for
use with the invention when using RSP data.
DETAILED DESCRIPTION
The following detailed description of the invention refers to the accompanying
drawings. Although the description includes exemplary implementations, other implementations are possible, and changes may be made to the implementations described without departing from the spirit and scope of the invention. The following
detailed description does not limit the invention. Instead, the scope of the invention is defined by the appended claims. Wherever possible, the same reference numbers will be
used throughout the drawings and the following description to refer to the same or like parts.
Overview
Methods and systems consistent with the present invention provide a
recommendation server capable of using rating space partitioned (RSP) data to provide
a recommendation to a user. RSP data is a type of data that represents positive and
negative preferences expressed by a user for an item. An item may be any item available
to recommend to a user. For example, an item may be a book, CD, or a movie. Positive
data reflects the fact a desired event has occurred, or is likely to occur. For example,
positive data may be a purchase event, adding an item to a shopping cart, or explicitly
stating that a user likes an item. Positive data may also indicate positive preferences for
an item, such as liking the item, or an interest in the item. Negative data reflects the fact
that a desired event will not occur, or is likely not to occur. Negative data may also
indicate negative preferences for the item, such as a dislike.
Since RSP data is partitioned (positive and negative), it is easy to determine
whether a user likes or dislikes a particular item. RSP data may be based on implicit user
interaction. For example, if a user spends less than three seconds on a particular web page, the
RSP data for that particular user/page may be negative data.
Unlike Likert data, RSP data contains partitioned positive data and negative data
that is separated and treated differently. This allows the recommendation system to
create search strategies that accurately locate potential neighbors from a large pool of
candidates. For example, a search strategy may use positive data to initially narrow the
number of potential neighbors, whereas the negative data may be used for minor
corrections in the search strategy.
A special type of RSP data, called interest data, is a type of data that represents
a measure of the level of interest someone has expressed in an item. Since interest data
is always a positive measure, interest data recommendations may only use positive data
(e.g., it is assumed that a user cannot show an interest in an item they dislike). Interest
data may also be resource allocation data. Resource allocation data is a type of data where the user indicates, not only an item of interest, but also how much interest the user
has in the items. For example, if a user has $1000 to spend on mutual funds, he may allocate his resource (money) to have $250 in mutual fund A, $750 in mutual fund B, and
$0 in mutual fund C. The user has a higher interest in mutual fund B than in mutual fund A, and no interest in mutual fund C.
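By way of illustration only, the mutual fund allocation above may be converted into relative interest levels as sketched below in Python. Dividing each amount by the total is an assumption made for this sketch, and the names `normalize_allocations`, `funds`, and `interest` are hypothetical and do not appear elsewhere in this description.

```python
def normalize_allocations(allocations):
    """Scale a user's resource allocations so that they sum to 1.

    Assumption for this sketch: interest is proportional to the share of
    the total resource that the user allocated to each item.
    """
    total = sum(allocations.values())
    if total == 0:
        # No resource allocated at all: no interest in any item.
        return {item: 0.0 for item in allocations}
    return {item: amount / total for item, amount in allocations.items()}


# The $1000 example from the text: $250 in A, $750 in B, $0 in C.
funds = {"fund_A": 250, "fund_B": 750, "fund_C": 0}
interest = normalize_allocations(funds)
```

Under this assumption, mutual fund B receives the largest share of interest and mutual fund C receives none, which matches the ordering stated above.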
Interest data may also be based on user purchase data. That is, the interest data
would include a list of items recently purchased by the user. A user that purchases more of item A than item B would have a higher interest in A than B. For example, if a user
recently purchases item A and item B, and afterwards purchases ten more of item A, the
user has a higher interest in item A than B. Resource allocation data may be considered to have multiple levels of interest (e.g., preferences), whereas purchase data is considered
to have one level of interest (e.g., existence of a rating).
Recommendation systems that incorporate RSP data and interest data treat
negative data differently than positive data to produce results superior to those of conventional
recommendation systems. Also, RSP data enables quick drill-down search strategies to
limit the pool of candidates to provide accurate recommendations. Different search
strategies may be used based on the type of data to search or the type of recommendation. Interest
recommendation systems provide recommendations when the data is only positive. Thus,
a person's dislike of a particular item will not skew the results.
To provide recommendations, the recommendation system consistent with the
present invention uses a data collector to track and record any user interaction.
Recommendations may be used in a variety of situations. For example, a
recommendation may be used as part of marketing campaigns that recommend items to
users who are interested in similar items; as part of knowledge-management systems in
large corporations that recommend reports and documents to employees based on the
employees' business or research interests; and as part of call centers that provide
recommendations for merchandise to consumers placing orders.
System Components
Fig. 1 depicts a data processing system 100 suitable for practicing methods and
systems consistent with the present invention. Data processing system 100 comprises a
client computer 112 connected to recommendation server 120 via a network 130, such
as the Internet. A user uses client computer 112 to provide various information to
recommendation server 120.
Although only one client computer 112 is depicted, one skilled in the art will
appreciate that data processing system 100 may contain many more client computers and additional client sites. One skilled in the art will also appreciate that recommendation server 120 may be located at various places on network 130. For example, the functions
of recommendation server 120 may be included in a merchant server accessed by client computer
112 to, for example, purchase an item. Alternatively, the same functions may be incorporated
in client computer 112. A subset of the Internet is commonly referred to as the world wide web (www). Certain computers and/or servers connected to the web (referred to as web sites) offer information in the form of web pages. A web page may include digital
content, such as text and/or images, audio streams, or instructions to obtain
recommendation requests from a user using hypertext markup language (HTML), Java
or other techniques.
Figure 2 depicts a more detailed diagram of client computer 112, which contains a memory 220, a secondary storage device 230, a central processing unit (CPU) 240, an input device 250, and a video display 260. Memory 220 includes browser 222 that allows
users to interact with recommendation server 120 by transmitting and receiving files, such
as web pages. An example of a browser suitable for use with methods and
systems consistent with the present invention is the Netscape Navigator browser, from Netscape Corp.
As shown in Figure 3, recommendation server 120 includes a memory 310, a
secondary storage device 320, a CPU 330, an input device 340, and a video display 350.
Memory 310 includes recommendation engine 312 and data collector 314.
Recommendation engine 312 determines if an item should be recommended to the user.
It may use many different techniques to generate recommendations based on RSP data.
One technique that may be used to generate recommendations is automated collaborative
filtering as described in Resnick, Iacovou, Suchak, Bergstrom, and Riedl, "GroupLens: An
Open Architecture For Collaborative Filtering Of Netnews," Proceedings of the 1994
Computer Supported Cooperative Work Conference (1994). Other recommendation
techniques are described in U.S. application serial no. 08/729,787, filed October 8, 1996,
U.S. application serial no. 08/733,806, filed October 18, 1996, attorney docket no. 7744-
6000, filed September 23, 1999, U.S. application serial no. 09/404,597, filed
September 24, 1999, and U.S. application serial no. 09/438,846, filed November 12,
1999, all incorporated by reference. Recommendation systems may also be based on
well-known collaborative filtering (CF) systems, logical rules derived from data, or on
statistical or machine learning technology. For example, a recommendation system may
use well-known rule-induction learning, such as Cohen's Ripper, to learn a set of rules
from a collection of data as described in Good, N., Schafer, J.B., Konstan, J., Borchers,
A., Sarwar, B., Herlocker, J., and Riedl, J., "Combining Collaborative Filtering with
Personal Agents for Better Recommendations," Proceedings of the 1999 Conference of
the American Association of Artificial Intelligence (AAAI-99).
Recommendation systems may also be based on well-known data mining
techniques that include a variety of supervised and unsupervised learning strategies and
produce "surprising" results expressed as associations or rules embedded in a data set.
Recommendation systems may also contain rating functions (models) programmed by
a system administrator. The rating functions are either a formula or a table of ratings that
determines business goals (e.g., the formula may specify a low rating for low-stock and
out-of-stock items). These systems also require user data as input to produce
personalized recommendations for users.
Data collector 314 monitors user interaction with various applications, such as an
online store or any other e-commerce application (not shown) that may be operating on
server 120 or elsewhere in network 130. For example, user interaction may include
explicit feedback, such as survey data or responses to recommendations (e.g., a user
informs the application not to recommend a particular item by selecting a "not-for-me"
button); implicit feedback, such as click stream data that reflects the user's navigation
through web pages, time spent on web pages, and links traversed; shopping and purchase data,
such as number of items viewed, contents of shopping baskets, amount of purchases, or
returns; or explicit preference ratings, such as numerical (like or dislike) or qualitative
ratings of items. Data collector 314 may also monitor user interaction with various
applications by obtaining web page logs from online stores or e-commerce applications.
A web page log comprises a set of records of all activity on a particular web site. For
example, a web page log may contain a user's time on a web page and information
reflecting the web pages viewed during that time. Regardless of the method used
and/or statistics received by data collector 314, data collector 314 identifies a user and
item, and stores the information as ratings in database 322.
Data collector 314 may include a web page, Application Program Interface (API),
or other input interface to receive data from a user or an application. An API is a set of
routines, protocols, or tools for communicating with software applications. APIs provide
efficient access to data collector 314 without the need for additional software.
Secondary storage device 320 includes a rating database 322 that stores various
user ratings, such as RSP data. Rating database 322 obtains data by receiving parsed
user interaction from data collector 314. One skilled in the art will appreciate that
database 322 may contain other types of data, such as unary data and Likert data.
Data Collection Process
Figure 4 depicts a flow chart of the steps performed by data collector 314 when
collecting data. The first step is for data collector 314 to receive user interaction data
(step 402). To receive user interaction data, data collector 314 parses various web page
logs from an e-commerce application. If the received data requires processing (step 404),
then data collector 314 processes the data (step 406). That is, data collector 314 extracts
user and item information from the collected data and converts the data into positive and
negative preference data. The RSP algorithms and affinity equations (described below)
use either positive-only (e.g., interest data) or both positive and negative data to provide
recommendations. Finally, data collector 314 stores the data in rating database 322 (step
408).
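The collection steps of Figure 4 may be sketched, purely as an illustration, in Python as follows. The record layout `(user, item, seconds_on_page)`, the rule that a dwell time of three seconds or more counts as positive, and all names in the block are assumptions made for this sketch; only the "less than three seconds" negative example comes from the description above.

```python
from collections import defaultdict

# Assumed cutoff, taken from the three-second example in the Overview.
DWELL_THRESHOLD_SECONDS = 3


def to_rsp(records, threshold=DWELL_THRESHOLD_SECONDS):
    """Convert (user, item, seconds_on_page) records into partitioned ratings.

    Returns {user: {"positive": set_of_items, "negative": set_of_items}}.
    """
    ratings = defaultdict(lambda: {"positive": set(), "negative": set()})
    for user, item, seconds in records:
        side = "negative" if seconds < threshold else "positive"
        other = "positive" if side == "negative" else "negative"
        # Assumption: the most recent record for a user/item pair wins.
        ratings[user][other].discard(item)
        ratings[user][side].add(item)
    return dict(ratings)


# Hypothetical parsed web page log entries (step 402).
log = [("u1", "page_a", 1.5), ("u1", "page_b", 42.0), ("u2", "page_a", 10.0)]

# Processing (step 406) and storage in an in-memory stand-in for the
# rating database (step 408).
rating_db = to_rsp(log)
```

Keeping the positive and negative items in separate sets mirrors the partitioning of RSP data, so later steps can read either partition without re-inspecting the raw interaction data.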
Recommendation Process with RSP Data
Figure 5 depicts a flow chart of the steps performed when generating a
recommendation with RSP data. The first step is to receive a request for a
recommendation from a user (step 502). The request may come in many forms. For
example, a recommendation request may come from an e-commerce application that will
display a list of items to a user before "check-out." The request may also come from a particular web page viewed by a user, or from monitoring "click-stream" data. Click-stream data is data obtained by monitoring users' actions on particular web pages. Either way,
the request is submitted to recommendation engine 312 using an API. For example, the e-commerce application may query recommendation engine 312 with a "predict" API at the
time the user finalizes his shopping cart.
A request for a recommendation may also come from an item, or a group of items (e.g., items that are within the same category). For example, recommendation engine 312
may recommend to the item a list of users that may be interested in that item.
Once recommendation engine 312 receives the request, recommendation engine 312 may, depending upon the type of equation, extract positive ratings, negative ratings, or both
for the user or item from rating database 322 (step 504). If no data is available for the user, recommendation engine 312 may provide a default list. A default
list would contain a preprogrammed list of items to recommend to the user. For example, if the user has never used recommendation engine 312 before, it may provide a top ten
list of best selling items to the user.
Recommendation engine 312 uses the extracted data to locate potential neighbors
in a candidate neighbor list using a specified search strategy (step 506). The term "neighbor" means a user identified in rating database 322 with interests similar to those of the first
user. For example, if another user in rating database 322 has rated items similar to those rated by the
first user, the other user may be considered a potential neighbor. A candidate neighbor
list is defined as a pool of candidates from which to choose potential neighbors. For
example, a candidate neighbor list may be geographically constrained, or consist of an
entire database. A neighborhood is the list of neighbors found. Since rating database
322 may contain many potential neighbors, it is desirable to first reduce the set of
candidates using the search strategy. Recommendation engine 312 may employ a search
strategy using positive data to locate potential neighbors. Positive data is generally
preferred in the search for potential neighbors since positive data contains more valuable
information than negative data. That is, since rating database 322 contains a large
number of user ratings that are generally negative, and users only rate a small portion of
the entire item space, using positive data leads to worthwhile rating data (and potential
neighbors) more quickly.
Negative data may be later incorporated to provide recommendations using the
affinity and agreement equations (described below) to make minor adjustments to the
recommendation. Thus, negative data is used when providing recommendations using
potential neighbors selected from a large pool of candidates.
Figure 6A depicts an exemplary portion of rating database 322 containing
positive, negative and unrated items. User 2 and User 3 may be potential neighbors of
User 1 when using a positive data search strategy since User 1, User 2, and User 3 have all
positively rated similar items (items 1 and 2). However, both users are considered only
potential neighbors of User 1 since an affinity between the users still needs to be
determined, as further described below. For example, an ideal neighbor for a user would
be a neighbor that has rated all items that the user has also rated. Moreover, User 4 will
never be a potential neighbor of User 1, since User 4 has not positively rated items
similar to those of User 1.
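One possible positive data search strategy may be sketched in Python as follows. This is an illustrative sketch only: the rule "shares at least one positively rated item" is an assumption, and because Figure 6A is not reproduced here, the item sets for `user3` and `user4` are hypothetical (the sets for `user1` and `user2` follow the Figure 6B discussion in this description).

```python
def potential_neighbors(user, positive_ratings):
    """Return users who share at least one positively rated item with `user`.

    `positive_ratings` maps each user to the set of items rated positively.
    """
    mine = positive_ratings.get(user, set())
    return {
        other
        for other, items in positive_ratings.items()
        if other != user and mine & items  # non-empty overlap
    }


positive_ratings = {
    "user1": {1, 2, 5},
    "user2": {1, 2, 7},
    "user3": {1, 2},   # hypothetical
    "user4": {8, 9},   # hypothetical: no positive items in common with user1
}
candidates = potential_neighbors("user1", positive_ratings)
```

Negative ratings are deliberately ignored at this stage; in this sketch they would only enter later, when an affinity is computed for each candidate.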
If no potential neighbors are found (step 508), recommendation engine 312
attempts to locate any neighbor to provide a recommendation (step 510). If neighbors
are not located, then recommendation engine 312 uses a default list instead of providing
a recommendation, as described above (step 520). If, however, recommendation engine
312 locates neighbors (step 510), recommendation engine 312 uses the located neighbors
to provide a recommendation (step 522).
If, however, at least one potential neighbor is found (e.g., using a positive data
search strategy) (step 508), recommendation engine 312 computes an affinity between
the user and the potential neighbor using an affinity equation (step 512). Affinity
equations consist of any combination of weighted agreement components, such as
positive agreement, negative agreement, or disagreement, described below.
Positive Agreement
A positive agreement measures a common level of positive preference between
the user and the potential neighbor. The agreement computes a function using positive
co-ratings of the user and the potential neighbor. The positive agreement may be
computed using the following equation where "R" is the first user, "r" is the potential
neighbor, and "positive_coratings" is the number of items both have rated positively:
$$\text{Positive agreement} = \frac{\text{positive\_coratings}}{\left|\,\text{positive\_ratings\_R} \cup \text{positive\_ratings\_r}\,\right|}$$
One skilled in the art will appreciate that other agreement equations, such as Mutual
Normalized Interests, and Fuzzy Evidence Set Similarity, may be used to compute the positive agreement. These equations are further described below.
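As a non-limiting illustration, the positive agreement (the number of items both users rated positively, divided by the number of items either user rated positively) may be sketched in Python as follows. The function name `agreement` is hypothetical. The item sets follow the Figure 6B discussion in this description, in which User 1 and User 2 both rated items 1 and 2 positively, User 1 also rated item 5 positively, and User 2 rated item 7 positively.

```python
from fractions import Fraction


def agreement(ratings_R, ratings_r):
    """Co-rated items divided by items rated by either user.

    Both arguments are sets of items of the same polarity (for example,
    the positively rated items of each user).
    """
    union = ratings_R | ratings_r
    if not union:
        return Fraction(0)  # neither user has ratings of this polarity
    return Fraction(len(ratings_R & ratings_r), len(union))


user1_positive = {1, 2, 5}
user2_positive = {1, 2, 7}
positive_agreement = agreement(user1_positive, user2_positive)
```

The result equals the "2/4" value given for User 1 and User 2. Exact fractions are used here only so that the value can be compared directly with the fractions in the text.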
Mutual Normalized Interests
The mutual normalized interest equation is a positive agreement equation that
uses normalized interest information, such as normalized ratings, to return a common interest level between the user and the potential neighbor. To do so, the equation
computes the sum of the minimum normalized coratings. A corating is a pair of ratings
for the users. For example, Figure 6C depicts an interest rating table 620 containing
common ratings between a user and a potential neighbor. Figure 6D depicts a normalized interest table 630 containing normalized data from interest rating table 620.
The mutual normalized interest is computed using the following equation where "R" is the first user, "r" is the potential neighbor, R'_i and r'_i are their normalized ratings for item i, and "coratings_i" ranges over the items
both users have rated:
$$\text{affinity} = \sum_{\text{coratings\_i}} \min\left(R'_i,\; r'_i\right)$$
Using the normalized data in normalized interest table 630, methods and systems consistent with the present invention may provide an affinity value between the user and
potential neighbor by computing the sum of the minimum of the coratings: .1 + .1 + .4
= .6. The value ".6" is an affinity value between the user and the potential neighbor.
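The mutual normalized interest computation may be sketched in Python as follows, as an illustration only. Because the tables of Figures 6C and 6D are not reproduced here, the normalized ratings below are hypothetical values chosen so that the per-item minimums are .1, .1, and .4, as in the example above; the item keys and function name are likewise assumptions.

```python
def mutual_normalized_interest(norm_R, norm_r):
    """Sum, over co-rated items, of the smaller of the two normalized ratings."""
    corated = norm_R.keys() & norm_r.keys()
    return sum(min(norm_R[i], norm_r[i]) for i in corated)


# Hypothetical normalized ratings (each user's values sum to 1).
norm_R = {"a": 0.1, "b": 0.3, "c": 0.6}
norm_r = {"a": 0.5, "b": 0.1, "c": 0.4}
affinity = mutual_normalized_interest(norm_R, norm_r)
```

With these assumed values the result is .6, up to floating-point rounding.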
Fuzzy Evidence Set Similarity
The fuzzy evidence set similarity equation is another positive agreement equation that uses normalized interest information, such as normalized ratings, to return the
amount of interest overlap between the user and the potential neighbor. More information regarding fuzzy evidence may be found in Zimmerman, H.J., "Fuzzy Set
Theory - And Its Applications," Second Revised Edition, 1991, hereby incorporated by
reference. For example, Fig. 6E depicts an interest rating table 640 containing some common ratings between a user and a potential neighbor. Fig. 6F depicts a normalized
interest table 650 containing normalized data from interest rating table 640. The fuzzy evidence set similarity is computed using the following equation, where
"noncorating_j" is the set of items "r" has not rated that "R" has rated, "noncorating_k"
is the set of items "R" has not rated that "r" has rated, and "coratings_i" is the set of items both users have rated:
$$\text{affinity} = \frac{\sum_{\text{coratings\_i}} \min\left(R'_i,\; r'_i\right)}{\sum_{\text{noncorating\_j}} R'_j \;+\; \sum_{\text{noncorating\_k}} r'_k \;+\; \sum_{\text{coratings\_i}} \max\left(R'_i,\; r'_i\right)}$$
The sum of the minimum of coratings for Figure 6F is: (.05) + (.25)
The sum of the coratings not available to user "R" is: 0
The sum of the coratings not available to the potential neighbor "r" is: (.2) + (.5)
The sum of the maximum of coratings is: (.5) + (.5)
Thus, the affinity measure between the user "R" and the potential neighbor "r" is: ".17" (that is, .3 / (0 + .7 + 1.0), truncated to two decimal places).
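The fuzzy evidence set similarity may be sketched in Python as follows, as an illustration only. The four partial sums in the example above (.3, 0, .7, and 1.0) are taken from the text, but the assignment of individual values to items and to users is an assumption, since the tables of Figures 6E and 6F are not reproduced here. All names are hypothetical.

```python
def fuzzy_evidence_set_similarity(norm_R, norm_r):
    """Overlap of two users' normalized interests divided by their combined extent."""
    corated = norm_R.keys() & norm_r.keys()
    overlap = sum(min(norm_R[i], norm_r[i]) for i in corated)
    # Ratings that only one of the two users has supplied.
    only_R = sum(v for i, v in norm_R.items() if i not in corated)
    only_r = sum(v for i, v in norm_r.items() if i not in corated)
    spread = sum(max(norm_R[i], norm_r[i]) for i in corated)
    denominator = only_R + only_r + spread
    return overlap / denominator if denominator else 0.0


# Assumed item-level values that reproduce the sums quoted in the text.
norm_R = {"item1": 0.05, "item2": 0.25, "item3": 0.2, "item4": 0.5}
norm_r = {"item1": 0.5, "item2": 0.5}
affinity = fuzzy_evidence_set_similarity(norm_R, norm_r)
```

With these values the function evaluates .3 / 1.7, which lies between .17 and .18. The measure is symmetric, so swapping the two users gives the same result.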
Although two interest equations are explained above, one skilled in the art will appreciate
that other interest equations may be used, such as a cosine similarity equation. The cosine equation is as follows:
$$\text{affinity} = \frac{\sum_{\text{coratings\_i}} R_i \, r_i}{\sqrt{\sum_{i} R_i^{2}}\;\sqrt{\sum_{i} r_i^{2}}}$$
More information on cosine similarity equations may be found in Salton, G., "The SMART Retrieval System: Experiments in Automatic Document Processing," Prentice
Hall, Englewood Cliffs, NJ, 1971, hereby incorporated by reference.
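For illustration, the standard cosine similarity between two rating vectors may be sketched in Python as follows. This sketch assumes that an unrated item contributes zero, so that the dot product runs over co-rated items while each norm runs over all of that user's ratings; the description does not state which items the norms cover, and the function name and sample values are hypothetical.

```python
import math


def cosine_affinity(ratings_R, ratings_r):
    """Standard cosine similarity, treating unrated items as zero."""
    corated = ratings_R.keys() & ratings_r.keys()
    dot = sum(ratings_R[i] * ratings_r[i] for i in corated)
    norm_R = math.sqrt(sum(v * v for v in ratings_R.values()))
    norm_r = math.sqrt(sum(v * v for v in ratings_r.values()))
    if norm_R == 0 or norm_r == 0:
        return 0.0  # a user with no (or all-zero) ratings has no direction
    return dot / (norm_R * norm_r)


identical = cosine_affinity({"a": 3.0, "b": 4.0}, {"a": 3.0, "b": 4.0})
disjoint = cosine_affinity({"a": 1.0}, {"b": 1.0})
```

Users with identical interest profiles receive the maximum value and users with no items in common receive zero.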
Negative Agreement
A negative agreement measures a common level of negative preference between
the user and the potential neighbor. The agreement computes a function using negative
co-ratings of the user and the potential neighbor. The negative agreement may be
computed using the following equation where "R" is the first user, "r" is the potential
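neighbor, and "negative_coratings" is the number of items both have rated negatively:
$$\text{Negative agreement} = \frac{\text{negative\_coratings}}{\left|\,\text{negative\_ratings\_R} \cup \text{negative\_ratings\_r}\,\right|}$$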
One skilled in the art will appreciate that any positive agreement equation may be used
to compute the negative agreement on negative data.
Disagreement
A disagreement measures a level of disagreement in preferences between the user
and the potential neighbor. The disagreement computes a function using opposite
co-ratings of the user and the potential neighbor. The disagreement may be computed
using the following equation where "R" is the first user, "r" is the potential neighbor, and
"opposite_coratings" is the number of items the two have rated oppositely:
$$\text{Disagreement} = \frac{\text{opposite\_coratings}}{\left|\,\text{ratings\_R} \cup \text{ratings\_r}\,\right|}$$
Figure 6B depicts a completed agreement table 610 with values for various types
of agreements corresponding to table 600 in Figure 6A. For example, the Positive_Agreement
between User 1 and User 2 is "2/4" since they share two of four positive ratings; both
have rated items 1 and 2 positively, User 1 has also rated item 5 positively, and User 2
has rated item 7 positively.
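One row of such an agreement table may be computed as sketched below in Python, for illustration only. The positive item sets follow the example above. The negative item sets are hypothetical, because Figure 6A is not reproduced here; they were chosen so that the negative agreement equals the "1/3" value used for User 1 and User 2 later in this description. The disagreement value that results is therefore illustrative and is not taken from Figure 6B.

```python
from fractions import Fraction


def ratio(numerator_set, denominator_set):
    """Size of one set over the size of another, or 0 if the latter is empty."""
    if not denominator_set:
        return Fraction(0)
    return Fraction(len(numerator_set), len(denominator_set))


def agreement_row(R, r):
    """R and r are dicts holding 'positive' and 'negative' item sets."""
    opposite = (R["positive"] & r["negative"]) | (R["negative"] & r["positive"])
    all_rated = R["positive"] | R["negative"] | r["positive"] | r["negative"]
    return {
        "positive_agreement": ratio(R["positive"] & r["positive"],
                                    R["positive"] | r["positive"]),
        "negative_agreement": ratio(R["negative"] & r["negative"],
                                    R["negative"] | r["negative"]),
        "disagreement": ratio(opposite, all_rated),
    }


user1 = {"positive": {1, 2, 5}, "negative": {3, 4}}  # negatives are hypothetical
user2 = {"positive": {1, 2, 7}, "negative": {3, 5}}  # negatives are hypothetical
row = agreement_row(user1, user2)
```

In this hypothetical data the two users rate item 5 oppositely, so one of the six items rated by either user counts toward the disagreement.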
RSP Affinity Equations
Once the agreements are determined, an affinity may be determined for the user
and a potential neighbor using various RSP affinity equations. Listed below are four
exemplary RSP affinity equations, where "Wp" is a positive weight, "Wn" is a negative
weight, and "Wd" is a disagreement weight:
(1) positive affinity = Positive_Agreement
(2) positive & negative affinity = Wp*Positive_Agreement + Wn*Negative_Agreement
(3) positive affinity without disagreement = Wp*Positive_Agreement - Wd*Disagreement
(4) general affinity = Wp*Positive_Agreement + Wn*Negative_Agreement - Wd*Disagreement
Equations 1-3 are special cases of equation 4, with Wp, Wn, and Wd set to different
values.
Using equation 2 and Wp=Wn=1, methods and systems consistent with the
present invention may provide an affinity value between User 1 and User 2 as follows:
1*"2/4" + 1*"1/3" = "10/12," and User 1 and User 3 as follows: 1*"2/4" + 1*"2/3" =
"14/12." Thus, User 3 has a higher affinity with User 1 than User 2 has with User 1.
Any of the above listed affinity equations may be used in any combination to
compute an affinity value based on the level of agreements measured. Based on the
landscape of the candidate neighbors, one affinity equation may be more useful than
another. In one example, when the ratings database consists of mostly positive data, and
all users have similar ratings (e.g., a rock music store), positive data (e.g., interest data)
may be useful as a search strategy; however, disagreement data may be useful to refine
and/or select neighbors. Thus, equation 3 would be used. In another example, when the
ratings database consists of unreliable or sparse positive data (e.g., positive agreement on
few items), negative data may be useful to increase the trustworthiness of the overall
affinity values.
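The four RSP affinity equations and the worked example for Users 1 through 3 can be sketched with a single function, since equations 1-3 follow from equation 4 by setting weights to zero. The function name `rsp_affinity` and its default weights are assumptions made for this sketch; the agreement values are the ones quoted above for Figure 6B.

```python
from fractions import Fraction


def rsp_affinity(pos, neg=Fraction(0), dis=Fraction(0), wp=1, wn=0, wd=0):
    """Equation 4: Wp*Positive_Agreement + Wn*Negative_Agreement - Wd*Disagreement.

    With the default weights (Wp=1, Wn=0, Wd=0) this reduces to equation 1.
    """
    return wp * pos + wn * neg - wd * dis


# Agreement values quoted in the text for Figure 6B.
pos_1_2, neg_1_2 = Fraction(2, 4), Fraction(1, 3)
pos_1_3, neg_1_3 = Fraction(2, 4), Fraction(2, 3)

# Equation 2 with Wp = Wn = 1.
affinity_1_2 = rsp_affinity(pos_1_2, neg_1_2, wp=1, wn=1)  # equals 10/12
affinity_1_3 = rsp_affinity(pos_1_3, neg_1_3, wp=1, wn=1)  # equals 14/12
```

Selecting equation 3 instead would amount to calling the same function with `wn=0` and a nonzero `wd`, together with a disagreement value.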
After each affinity value is computed for a user and a potential neighbor using an
equation as described above, recommendation engine 312 determines if the affinity value
is above a predetermined threshold value (step 514). One skilled in the art will appreciate
that the threshold value may be a maximum value, minimum value, or a range of values. If the affinity value is above the threshold value, the potential neighbor is added to a
neighbor list (step 516). Each neighbor on the neighbor list provides rating information to recommendation engine 312 that is used to compute a recommendation for the user. Otherwise, if the affinity value is below the threshold value, the potential neighbor is dropped and the next potential neighbor is located in rating database 322 (step 506).
Recommendation engine 312 locates neighbors until enough neighbors have been
located (step 518). For example, to provide a quick recommendation, recommendation
engine 312 may require ten neighbors. However, to provide a more accurate
recommendation, recommendation engine 312 may require fifty neighbors. Once the
requisite number of neighbors has been located, recommendation engine 312 may provide a recommendation to the user using well-known recommendation techniques (step 524).
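The neighbor selection loop of steps 506 through 518 can be sketched as follows. This is only an illustration under stated assumptions: the function name `select_neighbors`, the callable `affinity_of`, and the treatment of the threshold as a simple minimum are hypothetical, and "User 4" with its affinity value is invented for the example.

```python
def select_neighbors(candidates, affinity_of, threshold, required):
    """Collect candidates whose affinity exceeds the threshold until enough are found."""
    neighbors = []
    for candidate in candidates:                # locate next potential neighbor (step 506)
        if len(neighbors) >= required:          # enough neighbors located (step 518)
            break
        if affinity_of(candidate) > threshold:  # compare with threshold (step 514)
            neighbors.append(candidate)         # add to neighbor list (step 516)
    return neighbors


# Affinities with User 1; the first two values come from the text,
# the "User 4" entry is hypothetical.
affinities = {"User 2": 10 / 12, "User 3": 14 / 12, "User 4": 0.2}

neighbors = select_neighbors(affinities, affinities.get, threshold=0.5, required=2)
```

A smaller `required` value corresponds to the quick recommendation described above, and a larger one to the more accurate recommendation.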
Positive Interest Data Example
As an example, methods and systems consistent with the present invention are suitable for use with an augmented electronic mutual fund server on the Internet. Fig. 7A illustrates a recommendation system integrated into a
web-based electronic mutual fund site (e-commerce site). The user at computer 702 connects using a network 704 to a web server 706. A commerce server 708, connected
to web server 706, processes all financial transactions for the user and contains a database
of various mutual funds for sale. Web server 706 presents this set of products for sale to
the user. A recommendation server 710 coupled to the web server 706 and commerce
server 708 receives purchase information from commerce server 708. The
recommendation server 710 uses web server 706 and commerce server 708 to provide the user with specifically targeted content, such as recommendations to purchase specific
items, recommendations to view specific items, or targeted advertisement. Recommendation server 710 does so by maintaining records of previous purchases and
quantity of purchases by the user and other users.
As a specific example of the recommendation system implemented as described
above, a user may purchase $1000 of mutual fund A, $500 of mutual fund B, and $2000
of mutual fund C. Each time user 702 buys or sells a mutual fund, the commerce server
records the transaction and provides the recommendation server with the data. The
recommendation server may then compare user 702's portfolio to other users' portfolios
maintained in the recommendation server using an interest affinity equation. The users
that have high affinities with user 702 are considered neighbors and are included on a
neighbor list that is used to provide recommendations to user 702. For example, if another user has $1000 in mutual fund A, $1000 in mutual fund B, and $1000 in mutual fund D, recommendation server 710 may recommend that user 702 consider mutual fund D as a potential investment.
RSP Data Example
As another example, methods and systems consistent with the present invention
are suitable for use with CD stores on the Internet. Fig. 7B
illustrates a recommendation system integrated into various CD stores (e-commerce sites)
using RSP data to provide recommendations. The user at computer 712 connects using
a network 714 to an e-commerce server 716. E-commerce servers 716 process all
purchase transactions for user 712 and contain databases of various CDs for sale.
A recommendation server 718 coupled to network 714 receives purchase and
return information from e-commerce servers 716. The recommendation server 718 uses
this information to provide user 712 with specifically targeted content, such as
recommendations to purchase specific CDs, recommendations to view specific CDs, or
targeted advertisement at each e-commerce server 716. Recommendation server 718
does so by maintaining records of previous purchases and quantity of purchases by user
712 and other users.
As a specific example of the recommendation system implemented as described
above, a user may purchase CDs A, B, and C, and return CD A, from different e-
commerce servers 716. Each time user 712 purchases a CD, an e-commerce server 716
records the user interaction and provides the purchase information as a positive rating for
that CD to recommendation server 718. Each time user 712 returns a CD, an e-commerce
server 716 records the user interaction and provides the return information as a negative
rating for that CD to recommendation server 718. Recommendation server 718 may use
the positive data to locate potential neighbors and compare user 712's purchase/return
history to the located potential neighbors' purchase/return histories using predetermined RSP
affinity and agreement equations. Other users that have high affinities with user 712 are
considered neighbors to user 712 and will be included on a neighbor list that is used to
provide recommendations to user 712. For example, if another user has purchased CDs A,
B, and D, recommendation server 718 may recommend that user 712 consider
purchasing CD D.
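The CD store example can be sketched as follows, treating each purchase as a positive rating and each return as a negative rating. The function names, the event format, and the rule that the most recent purchase or return of a CD determines its rating are assumptions made for this sketch; the text does not specify how a purchase followed by a return of the same CD is reconciled.

```python
def build_ratings(events):
    """Turn (cd, action) events into (positive, negative) rating sets.

    Assumption: the latest event for a CD wins, so a returned CD is
    rated negatively even though it was purchased first.
    """
    positive, negative = set(), set()
    for cd, action in events:
        if action == "purchase":
            negative.discard(cd)
            positive.add(cd)
        elif action == "return":
            positive.discard(cd)
            negative.add(cd)
    return positive, negative


def recommend(user_ratings, neighbor_ratings):
    """Suggest CDs that neighbors rated positively and the user has not rated."""
    user_pos, user_neg = user_ratings
    already_rated = user_pos | user_neg
    suggestions = set()
    for neighbor_pos, _neighbor_neg in neighbor_ratings:
        suggestions |= neighbor_pos - already_rated
    return sorted(suggestions)


# User 712 purchases CDs A, B, and C, and returns CD A.
user_712 = build_ratings(
    [("A", "purchase"), ("B", "purchase"), ("C", "purchase"), ("A", "return")]
)
# Another user, assumed here to already be on the neighbor list, purchased A, B, and D.
other_user = build_ratings([("A", "purchase"), ("B", "purchase"), ("D", "purchase")])

recommended = recommend(user_712, [other_user])
```

In this sketch CD A is excluded from the suggestions because user 712 has already rated it, which leaves CD D as in the example above.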
Conclusion
Methods and systems consistent with the present invention provide a recommendation server capable of using RSP data to provide a recommendation to a user. The recommendation server contains software to provide RSP data recommendations to the user. Alternatively, the software may provide recommendations of users to an item, or groups of items. To provide the recommendations, the recommendation server applies
an affinity equation to the set of RSP data. The foregoing description of an
implementation of the invention has been presented for purposes of illustration and
description. It is not exhaustive and does not limit the invention to the precise form
disclosed. Modifications and variations are possible in light of the above teachings or may be acquired from practicing the invention. For example, positive data definitions
and negative data definitions may be reversed. Also, although methods and systems
consistent with the present invention describe providing a recommendation for an item
to a user, conversely, recommendations may also be provided for a user to an item. That
is, a recommendation may be generated including a list of suitable users (e.g., by using item affinities). Moreover, the described implementation includes software but the
present invention may be implemented as a combination of hardware and software or in
hardware alone.