Xamat
Study on online music taste: call for participation
Wed, 08/25/2010 - 09:55
Are you a music listener and lastfm user? Are you interested in helping out research while having the chance to win a $600 Amazon gift card? Please help us understand online music tastes by completing a survey that will only take around 15 minutes of your time and might even be fun!
All you need to do to participate is go to this page and provide your last.fm username and a valid email. We will check if you meet the requirements (at least 18 y.o. and 5000 scrobbles on lastfm) and we will then send you a link to your personalized survey.
Thanks for your time!
Categories: Music Rec
Multiverse Recommendations (aka using n-dimensional tensor factorization for context-aware collaborative filtering)
Sun, 08/08/2010 - 22:29
This post is the first of several in which I will explain some of the things we are presenting at the upcoming Recsys 2010 conference. The project I will talk about is led by Alexandros Karatzoglou and presents a new approach to context-aware recommendations that we have named Multiverse. You can access the full paper here, but I will give you a brief description in this post.
The introduction of context in recommender systems is an area of growing interest. The reason is simple: while we all value the fact that Recommender Systems are able to infer our tastes and recommend new things, it is clear that whatever we like - and are willing to receive - depends on the context. For example, we do not want to receive the same movie recommendations on TV when we are sitting with the kids on a Sunday afternoon as when we are alone in a late-night session. There is a growing body of literature on contextual recommendations. Without going any further, I already posted about context-aware recommendations with micro-profiles on this blog. Also, there is a very good chapter on the topic in the upcoming Recommender Systems Handbook. But, while we wait for it, you might want to look at some of the publications by Adomavicius and Tuzhilin.
Context takes the recommender problem from a two dimensional problem, where we have users and items, to an n-dimensional one where we can have many contextual dimensions added. In our work, we have generalized the successful matrix factorization approach to this n-dimensional case. In order to do this, we have used the idea of tensors, which are precisely a generalization of matrices to n dimensions. The following figure illustrates the idea (note that, for simplicity, we are illustrating the 3 dimensional case with just one contextual variable).
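As a rough code sketch of the same idea, the snippet below treats ratings as cells of a user x item x context tensor and approximates each cell from one small factor vector per dimension plus a shared core tensor. This Tucker-style form, the plain SGD update, the function names (`predict`, `sgd_epoch`, `mse`, `demo`) and the toy ratings are all assumptions made for illustration; they are not taken from the paper, which should be consulted for the actual model and optimization.

```python
import numpy as np

def predict(core, U, M, C, user, item, ctx):
    """Predicted rating for one (user, item, context) cell."""
    # Contract the shared core tensor with one factor row per dimension.
    return float(np.einsum("abc,a,b,c->", core, U[user], M[item], C[ctx]))

def sgd_epoch(core, U, M, C, observations, lr=0.005, reg=0.001):
    """One pass of stochastic gradient descent over the observed ratings."""
    for user, item, ctx, rating in observations:
        # Copy the rows so every gradient uses the values from before the update.
        u, m, c = U[user].copy(), M[item].copy(), C[ctx].copy()
        err = np.einsum("abc,a,b,c->", core, u, m, c) - rating
        grad_u = err * np.einsum("abc,b,c->a", core, m, c)
        grad_m = err * np.einsum("abc,a,c->b", core, u, c)
        grad_c = err * np.einsum("abc,a,b->c", core, u, m)
        grad_core = err * np.einsum("a,b,c->abc", u, m, c)
        U[user] -= lr * (grad_u + reg * u)
        M[item] -= lr * (grad_m + reg * m)
        C[ctx] -= lr * (grad_c + reg * c)
        core -= lr * (grad_core + reg * core)

def mse(core, U, M, C, observations):
    """Mean squared error over the observed ratings."""
    return float(np.mean([(predict(core, U, M, C, u, i, c) - r) ** 2
                          for u, i, c, r in observations]))

def demo(epochs=200, seed=0):
    """Fit a tiny made-up dataset and return (error before, error after)."""
    rng = np.random.default_rng(seed)
    U = rng.uniform(0.5, 1.0, size=(3, 2))        # 3 users, rank 2
    M = rng.uniform(0.5, 1.0, size=(3, 2))        # 3 items, rank 2
    C = rng.uniform(0.5, 1.0, size=(2, 2))        # 2 context values, rank 2
    core = rng.uniform(0.5, 1.0, size=(2, 2, 2))  # shared core tensor
    # (user, item, context, rating) tuples; hypothetical data.
    observations = [(0, 0, 0, 5.0), (0, 1, 1, 2.0), (1, 0, 1, 4.0),
                    (2, 2, 0, 1.0), (1, 2, 1, 3.0)]
    before = mse(core, U, M, C, observations)
    for _ in range(epochs):
        sgd_epoch(core, U, M, C, observations)
    return before, mse(core, U, M, C, observations)
```

Calling `demo()` returns the training error before and after fitting. Adding a further contextual variable would, under the same assumptions, mean one more factor matrix and one more index on the core tensor.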
In the paper, we show how this approach outperforms previously existing methods on a number of different datasets. One of these results is illustrated in the figure below. Note how Tensor Factorization (in green) not only outperforms other methods, but it performs better the more contextual information we add. It is also interesting to note how not observing context information (black line) results in worse performance. When we add contextual information to 80% of our data, not using this information yields a result that is almost 50% worse.
The use of context in recommender systems and other areas of information retrieval is a very interesting topic that is likely to get even more attention in the near future. We will surely contribute to this.
Categories: Music Rec
Music Recommendation through Expert-based Collaborative Filtering
Fri, 07/30/2010 - 00:07
In September I will be presenting the paper entitled "Towards Fully Distributed and Privacy-preserving Recommendations via Expert Collaborative Filtering and RESTful Linked Data" at the 2010 International Conference on Web Intelligence in Toronto. You can read the full paper here, but in this post I will try to give you a taste of what is hidden behind such a long title.
This paper should be understood as a continuation of my research on Expert Based Collaborative Filtering -- the so-called Wisdom of the Few. I recommend you take a look at my previous post on this issue before moving on.
So the basic idea from our previous work was to use domain experts as the only asset for creating neighborhoods and predicting item utility, in a similar way to standard kNN collaborative filtering. We made some claims about how that method provided many practical advantages over standard approaches. We also claimed that the approach was scalable and flexible enough to be used in many domains. Unfortunately, at that point, we did not have time to implement and prove all that.
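The basic idea can be sketched in a few lines of code. The snippet below is only an illustration under stated assumptions: plain cosine similarity over co-rated items, a similarity threshold `min_sim`, and a mean-centered weighted average are common kNN choices picked here for brevity, and the function names and example ratings are hypothetical rather than the ones used in the papers.

```python
from math import sqrt
from statistics import mean

def similarity(user_ratings, expert_ratings):
    """Cosine similarity computed over the items both have rated."""
    common = set(user_ratings) & set(expert_ratings)
    if not common:
        return 0.0
    dot = sum(user_ratings[i] * expert_ratings[i] for i in common)
    norm_user = sqrt(sum(user_ratings[i] ** 2 for i in common))
    norm_expert = sqrt(sum(expert_ratings[i] ** 2 for i in common))
    if norm_user == 0 or norm_expert == 0:
        return 0.0
    return dot / (norm_user * norm_expert)

def predict(user_ratings, experts, item, min_sim=0.1):
    """Predict the user's rating for `item` from sufficiently similar experts.

    `experts` maps an expert name to that expert's {item: rating} dict.
    Returns None when no similar expert has rated the item.
    """
    user_mean = mean(user_ratings.values())
    weighted_sum = 0.0
    weight_total = 0.0
    for expert_ratings in experts.values():
        if item not in expert_ratings:
            continue
        sim = similarity(user_ratings, expert_ratings)
        if sim < min_sim:
            continue
        # Use the expert's deviation from their own mean to offset rating bias.
        deviation = expert_ratings[item] - mean(expert_ratings.values())
        weighted_sum += sim * deviation
        weight_total += sim
    if weight_total == 0:
        return None
    return user_mean + weighted_sum / weight_total
```

Note that the only neighbors ever considered are experts: the user's own ratings are compared against expert ratings and never against other users.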
The current work presents a practical full-fledged implementation of the approach in the music domain. Our goal is to prove some of the previous claims as well as to establish an architectural framework for expert collaborative filtering providing, among other things, 100% privacy protection.
The following screenshot will give you an idea of the application. On the client side, it is a Flex/Air stand-alone application that can work on most operating systems. You can rate music albums, see the ratings from the experts, and get personalized recommendations based on that. We also provide access to extended information for albums via links to lastfm as well as access to Linked Data resources from MusicBrainz and others.
The key architectural differences between standard and expert Collaborative Filtering are illustrated in the figure below. Note that in our expert CF, user ratings are kept in the client machine. On the other hand, expert ratings are downloaded into the local machine and the computation for the predictions is performed there avoiding any privacy breach.
The next figure gives some more details of how we implemented the solution in our case. Again, note that the server is only used to crawl and store expert ratings publicly available on the web. Those ratings are then queried from the client through a REST-style web API. The computation of neighbors and predictions is then performed on the local machine.
You might be wondering where we got our expert ratings from. While in our previous work we crawled our movie ratings from Rotten Tomatoes, we now turned to Metacritic. The figure below illustrates the number of ratings per critic. In the top positions, we can see AllMusicGuide with over 3500 ratings, or Pitchfork, Uncut, and Mojo, with over 3000.
I believe that expert collaborative filtering is a very flexible and valid paradigm in many domains. It can offer better results than other kinds of recommendations while solving many of their shortcomings, such as scalability, privacy, or cold-start. We are currently working on other deployments in the mobile space, for example. But I will explain that in a future post.
Categories: Music Rec
Being Social
Thu, 07/22/2010 - 17:25
This was the title of my talk at the SIGIR Industry track this year. I wanted to post the slides online (see below). However, since there is little explanation in them, I will briefly try to walk you through the story line in this post. At the end of the post I have also added a video of the talk (with some minor gaps), which should help you get the full picture in case you are missing something.
Slides: Being Social, by Xavier Amatriain.
The presentation was about some of the projects we are doing at the Telefonica Research Group in Barcelona. In particular, I show some of my projects on Recommender Systems but also others led by Karen Church and Josep M. Pujol.
(slides 2-7)
But first let me introduce Telefonica for those of you who don't know it (probably only applicable if you live in the US or Asia). Telefonica is one of the largest Telecom companies in the world (3rd in market cap). It has grown significantly in the last 20 years, going from a Spain-only company with 12M customers to operating in more than 25 countries and having over 260M customers. Telefonica I+D (or R&D) is the Research and Development branch. It is the largest private research center in Spain and second largest in Europe. Finally, the Research Group of Telefonica I+D has around 20 permanent research scientists covering areas such as multimedia, mobile and ubiquitous computing, social networks, p2p and content distribution, wireless systems, user modeling and data mining, and HCIR.
(slides 8-9)
One of the important issues, not only for users but also for a company like ours, is to find ways to deal with information overload. In very few years we have gone from counting the information we were exposed to every day to counting the information we are exposed to every second. Twitter streams, facebook updates, photos, videos... It's too much to cope with. Besides, this leads to the so-called "Paradox of Choice", after the very interesting book by Barry Schwartz. Having more choices does not necessarily lead to more freedom. In fact, it often leads to the opposite. If we have many choices, we tend to choose less because of Analysis Paralysis. And we tend to choose worse, because we oversimplify the choice and use only superficial features.
(slides 10-11)
We tend to think that search engines have the answer to everything, but that is not always true. Actually, searching is not an ultimate human need. Accessing relevant information is. One of the reasons search engines are not the ultimate answer to people's information needs is the interface. We technical geeks think that formulating the right query is easy, but this is far from trivial for the average non-technical user.
(slides 12-13)
The good news is that you are not alone. There are many people seeking relevant information. And actually, some of them are your "friends". They might be able to help you find what you need.
(slides 14-15)
I did an interesting test by posting a question on twitter. The question was: "What is my daughter's name?". This information is available on my homepage. Still, it is hard to find using any search engine. I received three correct answers hours later. They had all, one way or the other, used my social network (see details of the paths in the slide).
(slides 16-19)
This leads me to the first project, Porqpine, led by Josep M. Pujol. Porqpine is a social and distributed search engine that uses the principle of lazy collaboration by letting users collaborate without extra effort. It allows users to find personalized and context-aware answers. And it is stand-alone but can co-exist with other search engines. What it does is locally cache each page and record user interactions (e.g., bookmarking). Then it searches by querying the local caches of a user’s friends. Pages that friends have “interacted with” are ranked higher. It also uses a proxy that masks the identity of the friend. It is currently a Firefox addon that can be downloaded here.
(slides 20-21)
We can somewhat overcome content overload by using social input. However, nowadays we are beyond content overload. We also suffer from *context overload*. Our information need also depends on where we are, what time it is, who we are with, what activity we are doing... And this is especially relevant if we consider that the web device of the future is not the desktop but rather the mobile device (be it phone, iPad...). And a mobile device is not a computer!
(slides 22-29)
Besides the importance of context, a mobile phone is personal. People also tend to look for more "fresh" content. And there are some queries like "where is the nearest florist?" that are easy to answer. But what about more personal needs like "Where is that cool cocktail bar I went to the other day... I know there were jazz concerts on Thursday and it's near an old church." And what about discovery and serendipity? What about getting help for deciding? Or points of interest in general? Or events?
(slides 30-36)
All this led to a question: Can we improve the search and discovery experience of mobile users by providing a readily available connection to their social network? The answer was Karen Church's SSB (Social Search Browser). SSB is an iPhone-optimized web application plus a Facebook app. When launched, it centers on the user's current physical location and displays all queries/questions posted by other users in that location. As users pan/zoom, the set of queries is updated. Users can post new queries or interact with the queries of others. We did two field studies in Ireland. The surprising result was that SSB became much more than a tool for finding information. It became a tool for helping and sharing experiences and for supporting curiosity. It was actually seen as an extension of people's social network.
So I have shown how you can somewhat minimize content and context overload by tapping into your social network. Ideally, for most tasks, you want to rely on your close "friends". However, for many information needs, your friends might not be enough and you need to resort to the crowds. We have come to know about the "Wisdom of the Crowds". If I ask enough people, I can be sure that the majority will be right.
(slides 37-39)
But, there is a problem with that: Crowds are not always wise. We don't realize that many times, users are noisy in giving their feedback. Besides, many times our data is too sparse to draw correct conclusions. In our paper "I Like it, I like it not" we studied how consistent people were in giving their opinion. We found very significant inconsistencies especially in mild opinions, but also in negative ones.
(slides 40-44)
So, if we cannot trust the crowds... who can we trust? The experts. As Malcolm Gladwell puts it in Blink, "It is really only experts who can reliably account for their reactions". We know that experts might be biased or trying to steer our opinion. However, even in that case, they will be reliably and consistently doing so. Thus, they can become much better anchor points for predictions. In our "Wisdom of the Few", we presented a Collaborative Filtering approach based on experts from the Web. The basic idea is to find individuals who we can trust to have given reliable opinions on a given domain. These expert opinions are then used to determine which experts are most similar to you. The final prediction is then done by computing a standard kNN Collaborative Filtering. Expert Collaborative Filtering has many advantages over standard approaches. In particular, it is more scalable and it allows for 100% privacy preservation. This is because user ratings do not need to be shared on a central repository. Expert opinions can be downloaded locally to perform the computation. We have developed several prototypes including a music recommender system and a mobile cinema recommender with geolocation.
(slide 45)
As a final summary: We all probably knew about Information Overload. But now it is not only that, we also have Context Overload. We can cope with both by using our social network. This means using our friends if possible or the crowds when necessary. However, crowds are not always as wise as they might seem and we are better off using experts.
Hope this is a good enough summary so you get the main message and can follow the pointers to more detailed information. You might also want to watch the talk in the following 3 videos that cover most of it.
Part 1:
Part 2:
Part 3:
Categories: Music Rec
Off the beaten track
Mon, 06/28/2010 - 23:59
Next September, Nava Tintarev will be presenting a paper that she and I co-authored at Mobile HCI 2010, in Lisbon. This paper, entitled "Off the Beaten Track - a mobile field study exploring the long tail of mobile tourist recommendations", presents our results from a field study on tourist recommendations. We sent a number of tourists off to visit Barcelona. They were instructed to use a tailored smartphone app which included recommendations of places they could visit.
In the paper, we evaluate the effectiveness, satisfaction and divergence from popularity of a personalized recommender system comparing it to recommending most popular sites. We found that participants visited more of the recommended POIs for lists with popular but non-personalized recommendations. In contrast, the personalized recommendations led participants to visit more POIs overall and visit places "off the beaten track". The level of satisfaction between the two conditions was comparable and high, suggesting that our participants were just as happy with the rarer, "off the beaten track" recommendations and their overall experience. We believe that personalized recommendations set tourists into a discovery mode with an increased chance for serendipitous findings.
This paper is the first of a line of research on tourist recommendations that I have just started and hope to be complementing with new publications soon. I will keep you posted in the blog. In the meantime, you can download the full pdf here.
Categories: Music Rec
Temporal diversity in Recommender Systems
Sat, 06/05/2010 - 00:14
Next month, Neal Lathia will be presenting a paper on which I have collaborated at SIGIR. In the paper we address the issues of Temporal Diversity and Novelty in Recommender Systems. You can read the paper here, but I will try to give you a brief summary in this post.
Recommender systems are usually evaluated on their accuracy, that is, their ability to predict how much a user will like/dislike an item given a set of past ratings. However, in any practical scenario, there are many other things that need to be taken into account to evaluate whether a system is giving good and interesting recommendations. One of these relevant issues is the diversity of the top-N recommendation lists over time. It does not matter that our recommendation is more or less accurate if time after time we recommend the user the same things. A user should expect that the system takes into account her feedback in order to improve and give different and better recommendations.
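To make the notion concrete, here is one simple way such measures could be computed for successive top-N lists. This is a hedged sketch: the two definitions below (the fraction of the current list that was not in the previous list, and the fraction never recommended in any earlier list) are assumed for illustration, the function names are hypothetical, and the paper should be consulted for the exact metrics it uses.

```python
def diversity(previous, current, n):
    """Fraction of the current top-n list that was absent from the previous top-n list."""
    previous_top = set(previous[:n])
    changed = [item for item in current[:n] if item not in previous_top]
    return len(changed) / n

def novelty(history, current, n):
    """Fraction of the current top-n list never shown in any earlier top-n list."""
    seen = set()
    for past_list in history:
        seen.update(past_list[:n])
    unseen = [item for item in current[:n] if item not in seen]
    return len(unseen) / n
```

For example, if last week's top-3 list was a, b, c and this week's is a, d, e, two of the three recommendations are new, so `diversity` under this definition is 2/3. A system that always returns the same list scores 0 no matter how accurate it is.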
In the paper, we evaluate the importance of temporal diversity for users through a user survey. Then we analyze the performance of known collaborative filtering algorithms, and we propose different ways to introduce temporal diversity while using traditional recommendation algorithms.
We found several interesting results on how user rating behavior affects temporal diversity. For instance, users with large profiles are likely to see less diversity. However, the number of ratings introduced since the last recommendation correlates directly with more diversity. This suggests that we need to encourage users to rate while implementing mechanisms that prevent profiles from growing so large that they suppress diversity. Smart mechanisms for rating "aging" might be useful.
Perhaps an even more interesting finding is that, as illustrated in the figure above, different algorithms perform differently regarding temporal diversity. SVD, for instance, is known to be more precise than kNN in the general case. However, it is interesting to note that it is also much less diverse. Therefore, even a simple decision between SVD or kNN as the base of a recommender system cannot be done disregarding issues such as temporal behavior of the algorithms.
Again, much more in the paper and, as always, looking forward to your comments and feedback.
Categories: Music Rec
Recsys 2010 Update
Tue, 05/25/2010 - 00:03
As most of you know, I am co-chairing (together with Marc Torrens) the 2010 ACM Recommender Systems Conference (Recsys 2010 for short) to be held in September in Barcelona. After announcing it on this blog some time back, I thought it was time to give a brief update on the highlights. This is just a summary, but if you want to be up to date, please bookmark the website, or follow us on twitter.
I am listing the highlights in more or less chronological order (or as they come to mind). In no way is the order meant to imply importance or relevance.
Venue
Hosting a conference in Barcelona is already great. As we explain on the conference website, the city has much to offer. But what can be better than having said conference in a convention center surrounded by the sea in the harbor, just down from the Ramblas? Well, this is where the Barcelona WTC is located (see picture below). And although we did have other options in the city, we couldn't help but fall in love with the place.
Workshops
Recsys workshops were already a huge success last year in NY. But, judging by the high-quality proposals we have this year, it seems they are still getting better! This year we accepted 8 workshops (actually 6 one-day workshops and 1 two-day workshop, which counts twice) and we have scheduled them on two different days: before and after the three days of the main conference. On Sunday Sept. 26th, we have a workshop on Information Fusion, one on Social Recsys, one on Music, and the first part of the two-day Context-aware workshop and challenge. On Thursday Sept. 30th, we have the second part of that workshop plus new workshops on Practical Uses, on e-Learning (in conjunction with the EC-TEL 2010 conference), and one on Recsys evaluation.
There is so much to choose from that the problem will be trying to decide which one not to attend. I am really looking forward to all of these workshops.
Tutorials
We have also lined up three very interesting tutorials by key figures in the field.
Guy Shani, from Ben Gurion University, will be giving a very much anticipated tutorial on Evaluating Recsys.
Joe Konstan, one of the fathers of the field and chair of SIGCHI for several years, will be introducing the use of HCI techniques in Recsys.
Finally, Ricardo Baeza-Yates, VP of Yahoo Research, will be talking about predicting and recommending queries. Apart from being on my PhD committee, Ricardo is the only researcher I know who has a publication with more than 7000 citations.
Papers
We still don't know many details about the main attraction of the conference: the accepted papers. However, we do know that the number of submissions went up from last year. Bearing those numbers in mind, the anticipated acceptance rate will be below 20%... which puts the conference at a first-tier level.
Hotels
We have arranged a great list of hotels in Barcelona so you have plenty to choose from. If you can afford to stay in the Eurostars Grand Marina, that should be your first pick since it is located right at the conference venue and it is an amazing hotel. However, we know most budgets are tight nowadays so we have included two great 4* hotels (conveniently located and both in the top 50 out of 600 on tripadvisor). We have even added a more affordable 3* hotel that is also conveniently located and has good reviews.
To be honest, if I had to travel to Barcelona myself, I couldn't find better choices than these.
Local Festivity
Each year, Barcelona celebrates its major festivity at the end of September. This year, the major events are scheduled for the 23rd-26th, just before the start of the conference. If you want an even better taste of our local culture, we recommend you come a couple of days earlier and enjoy the festivity. Visit the Merce website for more details.
All in all, we are looking forward to a great conference and hope to see you here!
Categories: Music Rec





