cold-start-recommender 0.4.2

"Will it scale?" is a less important question than "will it ever matter?" ([David Kadavy](http://kadavy.net))

******************************************************

NB: We have rewritten a good part of the recommender.

The APIs have changed, and the **webapp** is now a separate package, called [cold-start-recommender-webapp](https://github.com/elegans-io/csrec-webapp), which can be installed via `pip`. You can still access the old version with:

```bash
pip install cold-start-recommender==0.3.15
```

or from the source folder (the same folder as the `setup.py` file):

```bash
pip install .
```

To uninstall the package:

```bash
pip uninstall csrec
```

Any comment sent to info@elegans.io will be appreciated.

******************************************************

We developed Cold Start Recommender because we needed a recommender with the following characteristics:

* **Greedy.** Useful in situations where no previous data on Items or Users are available, therefore *any* information must be used --not just which Item a User likes, but also --in the case of a book-- the corresponding category, author etc.

* **Fast.** Any information on Users and Items should be stored and used immediately. A rating by any User should improve recommendations for this User, but also for other Users. This means an in-memory database and no batch computations.

* **Ready to use.** Take a look at [cold-start-recommender-webapp](https://github.com/elegans-io/csrec-webapp) to start a webapp that POSTs information and GETs recommendations.

CSRec should not (yet) be used for production systems, but only for pilots, where statistics are so low that filters (e.g. a log-likelihood filter on the co-occurrence matrix) are premature. It aims to *gather data* in order to immediately personalise the user experience.

CSRec is written in Python, and under the hood it uses the `Pandas` library.

Dependencies
============

The following Python packages are needed in order to run the recommender:

* pickle
* pandas
* numpy

Since version 0.4, the web service has been taken out of the package. You need to install elegans.io's package [csrec-webapp](https://github.com/elegans-io/csrec-webapp).

Features
========

The Cold Start Problem
----------------------

The Cold Start Problem originates from the fact that collaborative filtering recommenders need data to build recommendations. Typically, if Users who liked item 'A' also liked item 'B', the recommender would recommend 'B' to a user who just liked 'A'. But if you have no previous rating by any User, you cannot make any recommendation.

CSRec tackles the issue in various ways.

### Selective profiling

CSRec allows **profiling with well-known Items without biasing the results**.

CSRec will only register that `user1` likes a certain author, certain tags, but not that s/he might like `item1`. This is of fundamental importance when profiling users through a "profiling page" on your website. If you ask users whether they prefer "Harry Potter" or "The Better Angels of Our Nature", and most of them choose Harry Potter, you would not want to make the Item "Harry Potter" even more popular. You might just want to record that those users like children's books marketed as adult literature.

CSRec does that because, unless you are Amazon or a similar brand, the co-occurrence matrix is often too sparse to compute decent recommendations. In this way you start building multiple, denser, co-occurrence matrices and use them from the very beginning.

### Store any possible information

Any information is used. You decide which information you should record about a User rating an Item. This is similar to the previous point, but you also register the item_id.
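The difference between selective profiling and storing the full rating can be sketched as follows. This is only an illustration under stated assumptions, not CSRec's API: the `profiles` store, the `record_rating` function, and its `profile_only` flag are hypothetical names, and the items are made-up examples.

```python
from collections import defaultdict

# Hypothetical in-memory store (not CSRec's own data layer):
# profiles[user][info_type][value] -> accumulated rating
profiles = defaultdict(lambda: defaultdict(lambda: defaultdict(float)))


def record_rating(user, item, rating, profile_only=False):
    """Record the item's information; skip the item id when profile_only is True."""
    for info_type, values in item["info"].items():
        for value in values:
            profiles[user][info_type][value] += rating
    if not profile_only:
        profiles[user]["item_id"][item["id"]] += rating


item1 = {"id": "item1", "info": {"author": ["author_a"], "tags": ["fantasy", "children"]}}
item2 = {"id": "item2", "info": {"author": ["author_b"], "tags": ["history"]}}

# Profiling page: only the author and tags are recorded, not item1 itself.
record_rating("user1", item1, 5, profile_only=True)
# Ordinary rating: author, tags and the item id are all recorded.
record_rating("user1", item2, 4)
```

After these two calls, `user1` has preferences for both authors and all three tags, but only `item2` appears under `item_id`, so `item1` does not become more popular because of the profiling page.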

### Use everything you can, now

Any information is used *immediately*. The co-occurrence matrix is updated as soon as a rating is inserted.
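A minimal sketch of what an immediate update could look like, assuming a simple count-based co-occurrence matrix held in memory. The `ratings` and `cooccurrence` structures and the `insert_rating` function are hypothetical; CSRec's internal representation may differ.

```python
from collections import defaultdict

ratings = defaultdict(dict)                            # user -> {item: rating}
cooccurrence = defaultdict(lambda: defaultdict(int))   # item -> {item: count}


def insert_rating(user, item, rating):
    """Store a rating and update the co-occurrence counts right away."""
    is_new = item not in ratings[user]
    ratings[user][item] = rating
    if not is_new:
        return  # re-rating an item: the pair counts are already in place
    for other in ratings[user]:
        if other != item:
            cooccurrence[item][other] += 1
            cooccurrence[other][item] += 1


insert_rating("user1", "item1", 4)
insert_rating("user1", "item2", 5)
insert_rating("user2", "item1", 3)
insert_rating("user2", "item2", 2)
insert_rating("user2", "item3", 5)
```

No batch job is involved: each call touches only the pairs formed by the new item and the items that user has already rated.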

### Efficient users' tracking

It tracks anonymous users and merges their preferences into profiles. E.g. an anonymous visitor of a website likes a few items before the sign in / sign up process. After sign in / sign up, the information can be reconciled: information relative to the session ID is moved into the corresponding user ID entry.
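The reconciliation step can be sketched with a plain dictionary keyed by session ID or user ID. The `reconcile` function is a hypothetical helper, and the rule that an existing rating of the signed-in user wins over the session's rating is an assumption made for this example.

```python
def reconcile(ratings, session_id, user_id):
    """Move everything stored under session_id into user_id's entry."""
    session_ratings = ratings.pop(session_id, {})
    user_ratings = ratings.setdefault(user_id, {})
    for item, rating in session_ratings.items():
        # Assumption: a rating the signed-in user already has is kept.
        user_ratings.setdefault(item, rating)
    return user_ratings


ratings = {
    "session-abc": {"item1": 4, "item2": 5},  # anonymous visitor
    "user1": {"item2": 3},                    # same person, earlier visit
}
merged = reconcile(ratings, "session-abc", "user1")
```

After the call the session entry is gone and `user1` holds the ratings for both items.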

### Mix recommended items and popular items

What about users who would only receive a couple of recommendations? No problem! CSRec will fill the list with the most popular items (not rated by such users).
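A sketch of the filling step, assuming the popular items arrive already sorted by popularity. The function name `fill_with_popular` and its parameters are hypothetical.

```python
def fill_with_popular(recommended, popular, already_rated, max_items=10):
    """Pad a short recommendation list with popular items the user has not rated."""
    result = list(recommended)[:max_items]
    seen = set(result) | set(already_rated)
    for item in popular:
        if len(result) >= max_items:
            break
        if item not in seen:
            result.append(item)
            seen.add(item)
    return result


recs = fill_with_popular(
    recommended=["item4"],
    popular=["item1", "item4", "item2", "item3"],  # most popular first
    already_rated={"item1"},
    max_items=3,
)
```

Here `item1` is skipped because the user already rated it, and `item4` is not repeated, so the list becomes `item4`, `item2`, `item3`.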

### Algorithms

At the moment CSRec only provides purely item-based recommendations (co-occurrence matrix dot the User's ratings array). In this way we can provide recommendations in less than 200 ms for a matrix of about 10,000 items.
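The product of the co-occurrence matrix with a user's ratings can be written out in plain Python on a toy example. This sketch uses nested dictionaries rather than the Pandas structures CSRec relies on; the `recommend` function and the sample data are hypothetical.

```python
def recommend(cooccurrence, user_ratings, max_items=10):
    """Score each unrated item as (its co-occurrence row) dot (the user's ratings)."""
    scores = {}
    for item, row in cooccurrence.items():
        if item in user_ratings:
            continue  # never recommend what the user has already rated
        score = sum(count * user_ratings.get(other, 0) for other, count in row.items())
        if score > 0:
            scores[item] = score
    # Highest score first; ties are broken by item id for a stable order.
    return sorted(scores, key=lambda i: (-scores[i], i))[:max_items]


cooccurrence = {
    "item1": {"item2": 2, "item3": 1},
    "item2": {"item1": 2, "item3": 1, "item4": 1},
    "item3": {"item1": 1, "item2": 1},
    "item4": {"item2": 1},
}
user_ratings = {"item1": 5, "item2": 3}
ranked = recommend(cooccurrence, user_ratings)
```

In this example `item3` scores 1*5 + 1*3 = 8 and `item4` scores 1*3 = 3, so `item3` is ranked first.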

```python
# so we can only recommend item4
assert engine.get_recommendations('user1') == ['item4']
```

Remember that the cold start recommender is now only in memory, which means that you must implement a periodic saving of the data:

```python
# Save the data from the engine from above
engine.db.serialize('pippo.db')

# create a new engine with the same data:
new_engine = Recommender()
new_engine.db.restore('pippo.db')
```
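One possible way to arrange the periodic saving is a small helper that is polled, for example after each request, and saves only once enough time has passed. The `PeriodicSaver` class is hypothetical and not part of CSRec; it takes any callable that accepts a file path. A list's `append` stands in for the save function here so that the sketch runs on its own, and a fake clock replaces real time.

```python
import time


class PeriodicSaver:
    """Call save(path) at most once every `interval` seconds."""

    def __init__(self, save, path, interval=300.0, clock=time.monotonic):
        self.save = save
        self.path = path
        self.interval = interval
        self.clock = clock
        self.last_saved = clock()

    def maybe_save(self):
        """Save if the interval has elapsed; return True when a save happened."""
        now = self.clock()
        if now - self.last_saved >= self.interval:
            self.save(self.path)
            self.last_saved = now
            return True
        return False


# Demo with a fake clock and a stand-in save function.
saved = []
ticks = iter([0.0, 10.0, 400.0])
saver = PeriodicSaver(saved.append, "pippo.db", interval=300.0, clock=lambda: next(ticks))
first = saver.maybe_save()    # 10 s elapsed: too early, nothing is saved
second = saver.maybe_save()   # 400 s elapsed: the save function is called
```

With a real engine, the save callable would be `engine.db.serialize`, as in the example above.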

Versions
--------

**v 0.4.2 No backward compatibility with 3**

Small fixes for PyPI

**v 0.4.0 No backward compatibility with 3**

* Actions of users on users can be saved (see `insert_social_action` in dal.py)
* Various new metrics to monitor users' interaction (see e.g. `get_social_actions` in dal.py)
* No more embedded web service: use [csrec-webapp](https://github.com/elegans-io/csrec-webapp)
* TODO: make "social" recommendations based on users saving actions on each other
* Heavy refactoring
* Serialization and de-serialization of the data in a file for backup
* Data Abstraction Layers for memory and mongo.

**v 0.3.15**

* It is now a singleton, improved performance when used with, eg, Pyramid

**v 0.3.14**

* Minor bugs

**v 0.3.13**

* Added self.drop_db

**v 0.3.12**

* Bug fixed

**v 0.3.11**

* Some debug messages added

**v 0.3.10**

* Categories can now be a list (or passed as json-parseable string). This is important for, eg, tags which can now be passed in a REST API as:

* Added logging
* Added creation of collections for super-cold start (not even one rating, and still user asking for recommendations...)
* Additional info used for recommendations (eg Authors etc) is now stored in the DB
* _sync_user_item_ratings now syncs the additional info collections too
* popular_items are now always returned, even when no rating has been made, and get_recommendations eventually adjusts the order if some profiling has been done