Abstract

As one of the major challenges, the cold-start problem plagues nearly all recommender systems. In particular, new items tend to be overlooked, impeding the development of new products online. Given limited resources, how to utilize knowledge of recommender systems to design efficient marketing strategies for new items is extremely important. In this paper, we convert this tricky issue into a clear mathematical problem based on a bipartite network representation. Under the algorithm most widely used in real e-commerce recommender systems, the so-called item-based collaborative filtering, we show that simply pushing new items to active users is not a good strategy. Surprisingly, experiments on real recommender systems indicate that connecting new items with some less active users statistically yields better performance, namely these new items have more chance to appear in other users’ recommendation lists. Further analysis suggests that the disassortative nature of recommender systems contributes to this observation. In short, an in-depth understanding of recommender systems could pave the way for owners to popularize their cold-start products at low cost.

Thanks to the rapid development of the Internet, e-commerce has flourished over the past decades. With online trading platforms offering ever more products (e.g., more than a billion products on taobao.com), shopping online has become a fashionable way of life, and more people choose to purchase on the Internet rather than go to a store. E-commerce makes our lives much more convenient; meanwhile, it throws us into a dilemma of information overload. Facing millions of items online, finding one’s favourites is rather difficult. As an effective information filtering tool, the recommender system is thus of particular significance nowadays (1); (2). In fact, it has already made considerable contributions to the socioeconomic fields in the past decade. For example, 60% of DVDs rented by Netflix are selected based on personalized recommendations, and about half of the sales on Amazon are brought by recommendations (2). Consequently, recommender systems have received great attention from both physicists and computer scientists, and many advanced recommendation algorithms have been proposed recently, including collaborative filtering (3); (4); (5); (6); (7); (8), content-based analysis (9); (10); (11), dimensionality reduction techniques (12); (13); (14), diffusion-based methods (15); (18); (19); (16); (17); (20); (21); (22), and so on.

One long-standing challenge, called the cold-start problem, has plagued almost all recommender systems. Namely, when new users or items enter the system, there is usually insufficient information to produce reasonable recommendations (23). Several potential solutions have been proposed. Additional content information (23); (24); (25); (26), tagging information (27); (28); (29) and cross-domain information (30) can be used to marginally relieve this problem, but they do not work in a purely cold-start setting, where no information is available to form any basis for recommendations. Furthermore, improving the diversity and novelty of recommendation lists can help new items be pushed out (31); (17); (32).

Practically speaking, the holder of a recommender system can ask for extra information to generate initial profiles of users or items (24), or probe users’ preferences by pushing to them some items carefully selected by complicated algorithms (33). Both methods are costly and risky. In contrast, an item owner would simply like to popularize her new items. An improper method, called “shilling attacks”, injects a number of fake users into the system to raise the predicted ratings of new items, and thus enhances the possibility that these new items appear in the recommendation lists (34); (35). However, it is easy to detect (36); (37); (38). Furthermore, as a widespread marketing strategy, advertisements are generally preferred and have become more and more prosperous (39). However, popularizing new items in this way costs a lot and imposes an unbearable financial burden on small businesses (40). Given the above, how to promote new items with limited marketing resources is a nontrivial challenge, and knowledge of the recommendation algorithm may be helpful. Putting aside operational details, if the marketing activities can bring about some purchases by certain users, a smart marketing manager will carefully choose the target users so that these purchases lead to more exposure in the recommendation lists afterwards.

Taking the stand of a marketing manager, in this paper we focus on how to promote cold-start items by utilizing knowledge of recommender systems. The main contributions are threefold: (i) We convert this tricky problem into a clear mathematical model that ignores some insignificant details. (ii) We show that pushing new items to active users, a straightforward strategy that readily comes to mind, is surprisingly a poorly performing strategy. (iii) We propose a degree-based solution that outperforms some baseline methods.

Recommendation can be considered as a variant of link prediction in bipartite networks (41), and thus a better understanding of network structures can in principle improve the quality of recommendations (42); (43); (44); (45). We denote a recommender system by a user-item bipartite network G(U,O,E), where U={U1,U2,…,Un} and O={O1,O2,…,Om} are respectively the sets of users and items, and E is the set of links connecting users and items. Consequently, we use the adjacency matrix, A, to describe the user-item relations: if user Ui has purchased item Oα, aiα=1, otherwise aiα=0 (throughout this paper we use Latin and Greek letters, respectively, for user- and item-related indices). Figure 1(a) illustrates a small bipartite network that consists of eight users (gray squares) and eight items (blue circles). The degree of user Ui, denoted by ki, is defined as the number of items linked to Ui. Analogously, the degree of item Oα, denoted by kα, is the number of users connected to Oα. For example, as shown in Figure 1(a), ki=3, kj=1 and kα=2. The user degree distribution, Pu(k), is the probability that the degree of a randomly selected user is equal to k. The item degree distribution Po(k) is defined in a similar way. The degree distribution reflects the network heterogeneity (46).
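As a minimal sketch of these definitions, the Python snippet below computes user degrees, item degrees and a degree distribution from a list of purchase links. The toy links and the helper names `degrees` and `degree_distribution` are illustrative assumptions and are not taken from the data sets studied here.

```python
from collections import Counter

# Hypothetical toy purchase records (user, item); not from the paper's data.
links = [("U1", "O1"), ("U1", "O2"), ("U1", "O3"), ("U2", "O1"), ("U3", "O3")]


def degrees(links):
    """Return (user_degree, item_degree) counters for user-item links."""
    unique = set(links)  # a_{i,alpha} is binary, so repeated purchases count once
    user_deg = Counter(u for u, _ in unique)
    item_deg = Counter(o for _, o in unique)
    return user_deg, item_deg


def degree_distribution(deg):
    """P(k): fraction of nodes whose degree equals k."""
    counts = Counter(deg.values())
    total = len(deg)
    return {k: c / total for k, c in sorted(counts.items())}


user_deg, item_deg = degrees(links)
user_dist = degree_distribution(user_deg)
print(user_deg["U1"], item_deg["O1"], user_dist)
```

Working from a link list rather than a dense n×m matrix is a deliberate choice in this sketch, since the real networks are extremely sparse.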

Figure 1: How to add a cold-start item to the user-item bipartite network. Users and items are represented by squares and circles respectively, and solid lines represent the existent links between them. Plot (a) is the original network, and plot (b) is the network after adding the item η (the yellow circle). The dotted lines are new links connecting η with two existent users.

We consider two real data sets with anonymous users in this paper: (a) Tmall.com (TM), an open business-to-consumer (B2C) platform where enrolled businessmen can sell legal items to customers; (b) Coo8.com (Coo8), a well-established online retailer mainly trading in electrical household appliances and a leading supplier of daily necessities. To avoid isolated nodes in the data sets, every user has bought at least one item and every item has been purchased at least once. Table 1 shows the basic statistics of the two data sets. Due to the different types of products, these networks have very different average item degrees. As shown in Figure 2, all degree distributions are heavy-tailed, and the item degree distributions are generally more heterogeneous than the corresponding user degree distributions. These observations complement previous empirical analyses on user-item bipartite networks (47); (48); (49); (50).

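| Data | n | m | w | ⟨kuser⟩ | ⟨kitem⟩ | S |
|------|---------|--------|---------|------|------|-------------|
| TM | 103,867 | 83,342 | 113,624 | 1.09 | 1.36 | 1.31 × 10^−5 |
| Coo8 | 77,947 | 18,751 | 94,457 | 1.21 | 5.04 | 6.46 × 10^−5 |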

Table 1: Basic statistical properties of the two data sets. n, m, and w represent the numbers of users, items and links, ⟨kuser⟩ and ⟨kitem⟩ stand for the average degrees of users and items, and S=w/(n×m) denotes the data sparsity.
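The derived columns of Table 1 follow directly from n, m and w. The short Python check below recomputes them from the reported counts; the function name `basic_statistics` is an illustrative assumption.

```python
# (n, m, w) as reported in Table 1.
DATASETS = {"TM": (103_867, 83_342, 113_624), "Coo8": (77_947, 18_751, 94_457)}


def basic_statistics(n, m, w):
    """Return (<k_user>, <k_item>, S) for n users, m items and w links."""
    return w / n, w / m, w / (n * m)


for name, (n, m, w) in DATASETS.items():
    k_user, k_item, sparsity = basic_statistics(n, m, w)
    print(f"{name}: <k_user>={k_user:.2f}, <k_item>={k_item:.2f}, S={sparsity:.2e}")
```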

The nearest neighbors’ degree for user Ui, denoted by $d^{u}_{nn}(i)$, is defined as the average degree over all items connected to Ui (50). For example, in Figure 1(a), $d^{u}_{nn}(i)=(k_\alpha+k_\beta+k_\gamma)/k_i=10/3$. Furthermore, the degree-dependent nearest neighbors’ degree, $\langle d^{u}_{nn}(k)\rangle$, is the average nearest neighbors’ degree over all users of degree k, namely $\langle d^{u}_{nn}(k)\rangle=\langle d^{u}_{nn}(i)\rangle_{k_i=k}$. Corresponding concepts for items, Po(k), $d^{o}_{nn}(k)$ and $\langle d^{o}_{nn}(k)\rangle$, are defined in a similar way and thus omitted here (50). The degree-dependent nearest neighbors’ degree is an appropriate index to characterize the network assortativity (51). As shown in Figure 2, both networks are disassortative.
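The following Python sketch illustrates how the nearest neighbors' degree of each user and its degree-dependent average could be computed. The toy network is a hypothetical one, constructed so that user "Ui" has three items of degrees 2, 4 and 4 and therefore a nearest neighbors' degree of 10/3; it is not the network of Figure 1, and the function names are assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical toy network: user -> set of purchased items.
purchases = {
    "Ui": {"Oa", "Ob", "Oc"},
    "Uj": {"Oa"},
    "U3": {"Ob", "Oc"},
    "U4": {"Ob", "Oc"},
    "U5": {"Ob", "Oc"},
}


def user_nn_degree(purchases):
    """Average degree of the items linked to each user."""
    item_deg = Counter(o for items in purchases.values() for o in items)
    return {
        u: sum(item_deg[o] for o in items) / len(items)
        for u, items in purchases.items()
        if items  # isolated users have no neighbors to average over
    }


def degree_dependent_nn(purchases):
    """Average nearest neighbors' degree over all users of the same degree k."""
    d_nn = user_nn_degree(purchases)
    by_k = defaultdict(list)
    for u, value in d_nn.items():
        by_k[len(purchases[u])].append(value)
    return {k: sum(v) / len(v) for k, v in sorted(by_k.items())}


d_nn = user_nn_degree(purchases)
d_nn_k = degree_dependent_nn(purchases)
print(d_nn["Ui"], d_nn_k)
```

A curve of this average that decreases with k is the signature of a disassortative network; the toy data here only demonstrate the computation, not that trend.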

Figure 2: Degree distributions and degree correlations. All degree distributions are power-law-like, with exponents estimated by the maximum likelihood method (53); (54). $\langle d^{u}_{nn}\rangle$ and $\langle d^{o}_{nn}\rangle$ are shown in the 3rd and 4th rows, respectively, where red and black lines represent the results from the original and reshuffled networks, respectively. Results of reshuffled networks are obtained by averaging over five independent realizations.

Recommender systems typically produce a list of unpurchased items of given length for each user based on his historical purchases. Nothing comes of nothing; that is to say, it is impossible to predict links for an isolated user or item. So only after having been purchased by some users can an item have the chance to appear in other users’ recommendation lists. In real e-commerce web sites, acquiring a new customer is highly costly, and thus, under limited investment, choosing users with considerable influence on subsequent recommendations is critical. Concretely, the problem is described as follows. Consider a bipartite network containing n users, m items and w links. A new item Oη enters this network and can establish at most R links to users. Given the recommendation algorithm, how should we choose these R users to maximize the number of times that Oη appears in the other (n−R) users’ recommendation lists? For example, in Figure 1(b), item Oη (the yellow circle) arrives and needs to link to some existing users. If R=1, which user should be chosen, Up (the most active user), Uj (one of the least active users) or another one, so that Oη is recommended more times?

We consider four strategies to choose those R users: (I) Maximum-degree strategy (MaxD): rank all users in descending order of degree and select the top-R users, where users with the same degree are ranked randomly. (II) Minimum-degree strategy (MinD): rank all users in ascending order of degree and select the top-R users, where users with the same degree are ranked randomly. (III) Preferential attachment strategy (PA): each user’s probability of being selected is proportional to her degree. (IV) Random strategy (RAN): the R users are selected completely at random. In fact, all the strategies above can be unified by a selection probability for any user Ui, $p(U_i)\propto k_i^{\tau}/\sum_{j}k_j^{\tau}$, where τ is a tunable parameter. More specifically, the strategies MaxD, MinD, PA and RAN correspond to the cases τ=+∞, τ=−∞, τ=1 and τ=0, respectively.
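The unified τ-dependent selection rule can be sketched in Python as below. This is only one possible reading: the text does not specify how R distinct users are drawn for finite τ, so the sketch assumes sequential weighted sampling without replacement, implemented through an "exponential race". The function name `select_users` and the toy degrees are hypothetical.

```python
import math
import random


def select_users(user_deg, R, tau, rng=None):
    """Pick R distinct users with selection weight k_i ** tau.

    tau = +inf / -inf reproduce MaxD / MinD with random tie-breaking;
    tau = 1 and tau = 0 correspond to PA and RAN. Degrees must be >= 1.
    """
    rng = rng or random.Random()
    users = list(user_deg)
    rng.shuffle(users)  # random order, so ties are broken randomly
    if math.isinf(tau):
        # The sort is stable, so equal degrees keep their shuffled order.
        users.sort(key=lambda u: user_deg[u], reverse=(tau > 0))
        return users[:R]
    # Assumption: sampling without replacement via an exponential race.
    # Each user draws an Exp(1) clock divided by its weight k ** tau and the
    # R earliest clocks win. Keys are kept in log space to avoid overflow.
    keys = {}
    for u in users:
        clock = max(rng.expovariate(1.0), 1e-300)  # guard against log(0)
        keys[u] = math.log(clock) - tau * math.log(user_deg[u])
    return sorted(users, key=keys.get)[:R]


toy_deg = {"a": 5, "b": 1, "c": 3, "d": 1}  # hypothetical user degrees
print(select_users(toy_deg, 2, -math.inf))  # MinD: the two degree-1 users
print(select_users(toy_deg, 2, 1.0))        # PA: biased towards "a" and "c"
```

With R=1 and τ=1, the first clock to ring belongs to user i with probability k_i/Σ_j k_j, which matches the preferential attachment rule.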

Among existent recommendation algorithms, item-based collaborative filtering (ICF) has found the widest applications in real e-commerce platforms for its accuracy, stability, scalability and robustness (5); (6); (38). Here, we apply cosine similarity for each pair of items, say

$$\mathrm{sim}(\alpha,\beta)=\frac{\sum_{s=1}^{n}a_{s\alpha}a_{s\beta}}{\sqrt{k_{\alpha}k_{\beta}}}, \tag{1}$$

where kα and kβ are degrees of items Oα and Oβ, respectively. For the target user Ui, we calculate the accumulative score wiα for each item Oα by

$$w_{i\alpha}=\sum_{\gamma\neq\alpha}a_{i\gamma}\,\mathrm{sim}(\alpha,\gamma), \tag{2}$$

and then rank all the unpurchased items in descending order according to their scores in Eq. (2). The top-L items will be recommended to Ui, where L is the length of recommendation list.
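A compact Python sketch of this ICF procedure (cosine similarity between items, accumulated scores, top-L list) is given below. It is a plain, unoptimized illustration: the function name `icf_recommend`, the toy purchases, and the alphabetical tie-breaking between equal scores are assumptions that the text does not specify.

```python
import math
from collections import defaultdict


def icf_recommend(purchases, L):
    """purchases: user -> set of items. Returns user -> top-L unpurchased items."""
    buyers = defaultdict(set)  # item -> set of users who purchased it
    for user, items in purchases.items():
        for item in items:
            buyers[item].add(user)

    def sim(a, b):
        # Cosine similarity: common buyers over sqrt(k_a * k_b).
        return len(buyers[a] & buyers[b]) / math.sqrt(len(buyers[a]) * len(buyers[b]))

    recommendations = {}
    for user, owned in purchases.items():
        # Accumulated score of each unpurchased item over the user's items.
        scores = {
            item: sum(sim(item, g) for g in owned)
            for item in buyers
            if item not in owned
        }
        # Descending score; ties broken by item name (an assumption).
        ranked = sorted(scores, key=lambda item: (-scores[item], item))
        recommendations[user] = ranked[:L]
    return recommendations


# Hypothetical toy data: A is popular, C was bought by a single user.
toy = {"U1": {"A", "B"}, "U2": {"A", "C"}, "U3": {"A"}, "U4": {"B"}}
recs = icf_recommend(toy, L=1)
print(recs["U3"], recs["U4"])
```

In this toy case user U3, who holds only item A, is recommended C rather than B, because C has the smaller degree and hence the larger cosine similarity with A.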

To compare the degree-dependent strategies, we employ a metric H that counts the number of users whose recommendation lists contain the target items, say

$$H=\sum_{i:\,a_{i\eta}=0}h_i,\qquad h_i=\begin{cases}1, & r_i\le L,\\ 0, & r_i>L,\end{cases} \tag{3}$$

where ri is the position of the target item among all Ui’s unpurchased items. Obviously, 0≤H≤(n−R), since the target item’s degree equals R, and the larger value of H means better performance. The number of recommended items, L, is limited by the user interface, with typical size no larger than 6 (see real recommendation engines of Alibaba Group and Baifendian Inc. as examples).
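The metric H can be illustrated with a few lines of Python. The sketch assumes that each user's unpurchased items are already ranked (best first) and that users linked to the target item do not have it in their ranking; `hit_count`, the item labels and the toy rankings are hypothetical.

```python
def hit_count(rankings, target, L):
    """Number of users whose top-L recommendation list contains the target.

    rankings: user -> list of that user's unpurchased items, best first.
    """
    hits = 0
    for ranked in rankings.values():
        if target in ranked:
            position = ranked.index(target) + 1  # r_i, counted from 1
            if position <= L:
                hits += 1
    return hits


# Hypothetical rankings for three users not linked to the new item "O_eta".
toy_rankings = {
    "U1": ["O_eta", "Oa", "Ob"],
    "U2": ["Oa", "Ob", "O_eta"],
    "U3": ["Oa", "O_eta", "Ob"],
}
print(hit_count(toy_rankings, "O_eta", L=2))
```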

Since the maximum item degrees for TM and Coo8 are 617 and 933, respectively, in our simulations we only consider R ranging from 1 to 1000. To our surprise, as shown in Figure 3, MaxD hardly gets new items recommended, while MinD usually shows better performance. Consider the general case where the target item Oη has established a link to user Ui, and Oα and Oβ are two of the items Ui collected before Oη. Consider another user Uj who is not connected with Oη. If Uj has collected Oα but not Oβ, then both Oβ and Oη have the chance to be recommended to Uj. Since item similarities play the major role in the ICF algorithm, let us compare the similarities sim(α,β) and sim(α,η). Statistically speaking, if Ui is a very active user selected by the MaxD strategy, Oα and Oβ are probably less popular, as indicated by the disassortative nature of the networks; therefore kη (i.e., R) may be much larger than kβ, and then sim(α,η) is probably smaller than sim(α,β), resulting in a lower probability that Oη is recommended to Uj. In contrast, if Ui is a very inactive user selected by the MinD strategy, Oα and Oβ probably have larger degrees according to the disassortative nature, resulting in a smaller sim(α,β) and thus a larger probability that Oη is recommended to Uj. In addition, since Ui is very inactive, it is also possible that ki=1 and Ui is only connected with Oα. In that case, for all other users connected with Oα, Oη will be the only recommended item related to Ui.
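The comparison between sim(α,β) and sim(α,η) can be made concrete with a small numerical sketch. It assumes, for simplicity, that Ui is the only common buyer of each pair, so that both similarities reduce to 1/√(kα·k) and the degree of Oα cancels out of the comparison. All numbers and the helper names are hypothetical.

```python
import math


def cosine(common_buyers, k_a, k_b):
    """Cosine similarity of two items with degrees k_a, k_b and shared buyers."""
    return common_buyers / math.sqrt(k_a * k_b)


def eta_outranks_beta(k_alpha, k_beta, R):
    """True if sim(alpha, eta) > sim(alpha, beta), one common buyer each."""
    return cosine(1, k_alpha, R) > cosine(1, k_alpha, k_beta)


R = 100  # hypothetical degree of the new item after linking to R users

# MaxD-like case: an active user tends to hold unpopular items (small k_beta).
print(eta_outranks_beta(k_alpha=10, k_beta=2, R=R))
# MinD-like case: an inactive user tends to hold popular items (large k_beta).
print(eta_outranks_beta(k_alpha=10, k_beta=500, R=R))
```

Under this single-common-buyer assumption the new item beats Oβ exactly when R < kβ, which is more likely when the items held by the selected users are popular.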

Figure 3: Performance of the four strategies for original TM and Coo8 bipartite networks. The results of MaxD, MinD, PA and RAN are represented by black squares, red circles, blue triangles and green pentagrams, respectively. Data points are obtained by averaging over 50 independent realizations.

Figure 4: Performance of the four strategies for reshuffled networks. The results of MaxD, MinD, PA and RAN are represented by black squares, red circles, blue triangles and green pentagrams, respectively. Data points are obtained by averaging over 50 independent realizations.

In short, the disassortativity could contribute to the observations in Figure 3. To validate this inference, we reshuffle the original networks by the link-crossing method to obtain null networks (52). Specifically, in each step, two links, say (Ui,Oα) and (Uj,Oβ), are picked at random, and if Ui has not collected Oβ and Uj has not collected Oα, these two links are rewired as (Ui,Oβ) and (Uj,Oα). In one realization, we repeat such rewiring 3w times. After that, the reshuffled network has the same degree sequence as the original network, but the disassortative nature has vanished, as shown in Figure 2. Figure 4 reports the performance of the four strategies in the reshuffled networks, from which we can see that the MaxD strategy performs the best. Comparing the results for the original and reshuffled networks, we conclude that the advantage of the MinD strategy results from the disassortative nature of real e-commerce user-item bipartite networks. In addition, in Figure 5 and Figure 6, we test the performance of strategies with different τ. For both TM and Coo8, negative τ leads to better performance, while in the null networks positive τ is better.
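The link-crossing reshuffle described above can be sketched in Python as follows. The sketch interprets the 3w repetitions as 3w attempted swaps, with rejected attempts counted; this is an assumption, since the text does not say whether only successful rewirings are counted. The function name `reshuffle` and the toy links are hypothetical.

```python
import random


def reshuffle(links, swaps_per_link=3, rng=None):
    """Degree-preserving randomization of a bipartite link list."""
    rng = rng or random.Random()
    edges = sorted(set(links))  # sorted for reproducibility with a seeded rng
    present = set(edges)
    for _ in range(swaps_per_link * len(edges)):
        a, b = rng.sample(range(len(edges)), 2)
        (ui, o_alpha), (uj, o_beta) = edges[a], edges[b]
        # Reject the swap if either crossed link already exists; this also
        # covers the cases where the two links share a user or an item.
        if (ui, o_beta) in present or (uj, o_alpha) in present:
            continue
        present.difference_update([edges[a], edges[b]])
        edges[a], edges[b] = (ui, o_beta), (uj, o_alpha)
        present.update([edges[a], edges[b]])
    return edges


# Hypothetical toy links (user, item).
toy_links = [("U1", "O1"), ("U1", "O2"), ("U2", "O1"),
             ("U3", "O3"), ("U4", "O4"), ("U4", "O1")]
shuffled = reshuffle(toy_links, rng=random.Random(1))
print(shuffled)
```

Each accepted swap leaves every user degree and every item degree unchanged, so only the degree correlations are randomized.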

Figure 5: Performance of strategies with different τ on original and reshuffled TM networks. The black, red and blue lines represent the results for the cases R=100, R=500 and R=1000, respectively. Data points are obtained by averaging over 50 independent realizations.

Figure 6: Performance of strategies with different τ on original and reshuffled Coo8 networks. The black, red and blue lines represent the results for the cases R=100, R=500 and R=1000, respectively. Data points are obtained by averaging over 50 independent realizations.

In this paper, we study a practical problem in e-commerce recommender systems: how to promote cold-start items. Under item-based collaborative filtering, we show that the disassortative nature of real user-item networks leads to the non-trivial observation that linking a cold-start item to inactive users gives it more chance to appear in other users’ recommendation lists. This observation is robust for varying recommendation length L and linking capacity R. It also applies to some variants of item-based collaborative filtering, such as the top-k nearest neighbors ICF (5).

Notice that the reported results are affected by both the topological features and the underlying recommendation algorithm. We have tested user-based collaborative filtering (3), under which MaxD is always better than MinD. In spite of this, this work is still relevant since ICF plays a significant role in most real recommender systems. In addition, the perspectives and methods reported here are useful for real e-commerce applications; the core merit is that an in-depth understanding of the structure and algorithms of recommender systems can be turned into applicable knowledge to better market products.

We acknowledge Kuan Fang, Zhidan Zhao and Ying Zhou for useful discussions. We acknowledge Alibaba Group and Baifendian Inc. for sharing their real business data sets after anonymization. This work was partially supported by the National Natural Science Foundation of China under Grant No. 11222543. T.Z. acknowledges the Program for New Century Excellent Talents in University under Grant No. NCET-11-0070, and the Special Project of Sichuan Youth Science and Technology Innovation Research Team under Grant No. 2013TD0006.