The limits of automatic recommendation systems

I was reading an article in New Scientist the other day about a system devised by academics at Royal Holloway, University of London, which “could form the basis of a recommendation system that makes suggestions based solely on an automatic assessment of the text.” Unlike Amazon’s recommendations, which look at sales, and those on sites like Goodreads, which look at reader reviews and ratings, this one looks at writing style, e.g. the frequency of individual words.
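To give a rough idea of what comparing texts by word frequency might involve, here is a minimal Python sketch. It is my own illustration and not the Royal Holloway system: the function names are made up, and the choice of cosine similarity as the comparison measure is an assumption, since the article doesn't say how the comparison is actually done.

```python
import math
import re
from collections import Counter


def relative_frequencies(text):
    """Map each lowercased word to its share of all words in the text."""
    # Hypothetical tokeniser: runs of a-z and straight apostrophes only.
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {}
    counts = Counter(words)
    return {word: n / total for word, n in counts.items()}


def style_similarity(text_a, text_b):
    """Cosine similarity between two word-frequency profiles (0.0 to 1.0)."""
    freq_a = relative_frequencies(text_a)
    freq_b = relative_frequencies(text_b)
    # Only words that appear in both texts contribute to the dot product.
    dot = sum(freq_a[w] * freq_b.get(w, 0.0) for w in freq_a)
    norm_a = math.sqrt(sum(v * v for v in freq_a.values()))
    norm_b = math.sqrt(sum(v * v for v in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Calling style_similarity() on two passages gives a score near 1 when they use the same words in similar proportions and 0 when they share no words at all. Notice that nothing in it knows whether either passage is any good.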

Well, I’m sceptical. You see, the very last thing I want to do after reading a book I enjoyed is to read another book that sounds very similar. I very much enjoyed a book called Chasing the King of Hearts, set among Polish Jews in the Holocaust. Does that mean I want to read another book on the same subject, or in the same style? Nope.

I want my new read to cross oceans, both literally and stylistically. I want it to be totally different, refreshing. In fact, it was. After Chasing the King of Hearts, I read We Need New Names by NoViolet Bulawayo, written in the voice of a Zimbabwean child. Then I read the Russian novel Everything Flows by Vasily Grossman, and then The Spinning Heart by the Irish author Donal Ryan. All the while, I was listening to the darkly funny American novel May We Be Forgiven by A.M. Homes on audio.

I don’t think any machine in the world would have recommended that sequence of books. There’s nothing to link them in terms of style, subject, word frequency or anything else. The only thing that links them is that they are well written, and if the Royal Holloway researchers have invented a system that can measure literary quality then they deserve the Nobel Prize.

I don’t think I’m particularly eclectic in my reading. I think if you look at your own recent reading, it’ll be very hard to detect a pattern that a computer could understand. Some people, of course, do only read a particular type of book again and again, and for them it may be useful. The article also suggests other uses, like maintaining consistency among different authors in collaborative projects, or resolving disputes over who wrote what.

When it comes to truly insightful recommendations, though, I still haven’t found anything better than the brain of a well-read human being who knows my reading tastes. So thank you, fellow bloggers, as well as all the librarians, bookshop staff and friends over the years who’ve pointed me in the direction of good books to read. One day, a machine may take your place, but for now I depend on you. So please keep reading, keep reviewing, keep talking about the books you like. It’s the only defence we have against the insanity of automated recommendations.


25 Responses to The limits of automatic recommendation systems

  1. Jayne White 12 December 2013 at 8:00 pm #

    I can’t remember the last time I bought an Amazon-recommended product that I hadn’t wanted of my own accord. This new system sounds potentially more confining than the one Amazon now uses if you like any breadth in your choices. Amazon already struggles with my buying books on writing and recipe books as well as lit fic.

    • Andrew Blackman 13 December 2013 at 6:38 pm #

      You really see the absurdity of the Amazon recommendations when it comes to non-book products. I bought an HP Deskjet 2540 printer on there a few months ago, and now every time I go on, it tries to sell me the HP Deskjet 1510, the HP Deskjet 1000, etc…

      You’re right that any slightly varied literary choices make the recommendation system explode. The one described in the article sounds more sophisticated, but still runs into the problem that if I just bought an HP Deskjet 2540, the last thing I want to buy is another HP Deskjet 2540 :-)

      • Jayne White 13 December 2013 at 8:38 pm #

        I’ve just upgraded my Kindle Fire and bought a cover for it, so I’m seeing a lot of this at the moment.

        I contribute to a book blog run by a friend of mine so I pick up recommendations there and I read newspaper reviews. I subscribe to some publishers’ mailing lists too.

        I’m selective about what I buy on release day because I can’t afford release prices for everything I like the look of. If there’s a buzz about an author’s new book I’ll often pick up an older work second hand or on a Kindle offer first and then decide if I’m in enough of a hurry to pay a premium for the new one.

        • Andrew Blackman 16 December 2013 at 3:10 pm #

          I’m the same, Jayne. There are so many books I like the look of, but I can’t afford to buy all of them when they come out. I buy some, and pick up others at a reduced price later, and when I lived in England I often used the library as a way of testing out a new author.

          Good to hear where you get your recommendations from. I’ve never tried publishers’ mailing lists, although I do sometimes get emails from publicity people asking if I’d like to review forthcoming books. Might sign up for a few lists and see what they offer.

        • Andrew Blackman 16 December 2013 at 5:25 pm #

          By the way, what’s the book blog? I’d like to check out your contributions.

          • Jayne White 16 December 2013 at 8:14 pm #

            I contribute to http://workshyfop.blogspot.co.uk/ and my last piece was http://workshyfop.blogspot.co.uk/2013/12/singles-and-kindles-e-readers-and.html

            I work in marketing for the day job so I actually find the blurb interesting. Some publishers (often bigger ones) just bombard you with a hard sell as if everything were a blockbuster, whereas other publishers put a bit more effort into making sure the book appeals to the right reader.

            • Andrew Blackman 17 December 2013 at 1:36 pm #

              Ah! I know it well. I know Thom a little bit too – he came to my book launch. I have the blog in my RSS feed, although there are so many blogs in there that I don’t get around to them all often enough.

              Nice post! Brought back happy memories of some Penguin 60s, as well as raising some interesting issues about the developments in modern publishing and making good use of the internet.

  2. Emma 13 December 2013 at 12:56 am #

    At least with this system, you know which book NOT to start if you want something different from the previous one.
    I never look at the automatic recommendations. I have enough recommendations from book bloggers to read until I’m old and very wrinkled.
    I don’t think any computer could have predicted my reading The Blonde, Vengeances, Riders of the Purple Sage, Dans les meules de Beyrouth and Plutarch in a row. I don’t see the link. Like you I want something different when I have finished a book.

    • Andrew Blackman 13 December 2013 at 6:44 pm #

      Ah, that’s a good point, Emma – I hadn’t thought of it that way. Kind of an anti-recommendation system :-)

      Yes, that’s not a list of books any computer would produce. Recommendations from people are so different. I remember you recommended The Age of Innocence to me a while ago, a book that on the surface is quite different from the contemporary fiction I normally read, but that was spot on for me. A computer would never have spotted what it was in that book that would appeal to me, but you did.

      • Emma 14 December 2013 at 10:11 pm #

        I’m glad you enjoyed The Age of Innocence.
        This book will stay with me. It makes one think about what counts to have a lasting relationship, what to build a marriage on, about knowing yourself and respecting your limits and about lucidity. At first, Newland seems to be a coward but when you think a bit more about his decision, you must be really brave to make it and live with it.

        • Andrew Blackman 16 December 2013 at 5:26 pm #

          Yes! Now how do you program all that into an algorithm?

  3. Alice 13 December 2013 at 10:35 am #

    Oddly, I actually like the idea of that sort of recommendation system – as long as it wasn’t the only one. I have often wanted a book, say on WW1, that is similar to what I have just read, and Amazon always falls short. So on the basis that on occasion I would like to read continuously on a subject, it is a fantastic idea. However, you are quite right in that it can’t offer a book that is similar yet utterly different. I can’t see it recommending Hemingway after reading Fitzgerald, which is the sort of thing that I would want as well.

    Reading and tastes are so subjective I can’t ever imagine a flawless recommendation system. Interesting discussion, Andrew!

    • Andrew Blackman 13 December 2013 at 6:49 pm #

      Hi Alice
      That’s a good point. When reading deeply on one subject, it could be very useful. I’ve always relied on the bibliography in one book to refer me to the next ones – kind of like pre-web ‘surfing’! But of course then you can only go backwards in time to earlier books, and you’re limited to one author’s recommendations. I can see a value to an automatic system for helping you research a particular topic.

      Which leads me to another question, for you, Alice, or for anyone else reading this. Where do you get most of your recommendations from? Book bloggers, friends, newspaper reviews, bookshops, or somewhere else?

  4. Vishy 13 December 2013 at 11:40 am #

    Wonderful post, Andrew! Really enjoyed reading it. I haven’t read that article yet (looking forward to reading it soon), but after reading your thoughts, I realize that what the scientists / academics are trying to do is judge the literary quality of a book from the text itself rather than looking at the sales and other readers’ ratings. I think it is easy to look at ‘hard numbers’ like sales and readers’ ratings and say that one book is ‘better’ than the other (of course, this is all a minefield. Sometimes books which get a lesser rating or sell less age gracefully and live for centuries, while the popular ones don’t stand the test of time) but judging the literary quality of a book using a computer algorithm is hard work. I would say that the scientists are trying to build the holy grail :) I don’t know whether that is possible, but I admire them for giving it a try.

    My own take is that the computer can do a lot of number crunching and arrive at a conclusion / suggestion but it is extremely hard to simulate what a human does after years of reading. Especially when we really don’t have any definitive way of saying how we judge literary quality. (For example, one of my favourite poems by W.S. Merwin goes like this – “Your absence has gone through me / Like thread through a needle. / Everything I do is stitched with its color.” When we read it, most of us would agree that it is beautiful and very poignant and poetic. How can one build an algorithm which arrives at the same conclusion? And if this sounds like an impossible task, how can an algorithm judge the literary quality of a book?)

    One of my favourite scientists, Roger Penrose, wrote a beautiful book on this topic many years back called ‘The Emperor’s New Mind’. It explores whether a computer algorithm can do tasks like this :) His conclusion was that there are tasks involving calculation and computing, which a computer can perform, and other tasks requiring insight, which a computer cannot perform, and that our scientific understanding is not advanced enough to build an algorithm which performs tasks requiring insight. On the other hand, Artificial Intelligence scientists might say that building this algorithm is possible. ‘The Emperor’s New Mind’ is a challenging read and Penrose doesn’t shy away from using equations (he apologizes for that at the beginning), but I would recommend it if you can get it from the library. It is a good book to skim through and read parts of and get a flavour of its central theme and a feel for Penrose’s main arguments.

    I will read that New Scientist article and come back and comment more :)

    • Vishy 13 December 2013 at 2:33 pm #

      I just read that New Scientist article, Andrew. I found this particular passage interesting – “a system that compares different bodies of text, looking at the relative frequency of individual words and converting the data into visualisations. Subtle correlations in word use between novels or sections within a novel make it possible to compare style, says Reddington”. I don’t think there is anything new in the algorithm; it just uses computation to arrive at its conclusions. And in my opinion, it is difficult to identify an author’s style using the relative frequency of individual words, because I feel that the meaning of the words comes from the way the words are placed and also from the way we relate and react to them based on our past reading experience and our cultural conditioning. For example, if the algorithm compares Merwin’s ‘A Separation’ with one of his other poems it will be difficult for it to tell whether they are both by the same poet. I think there is a gap between the computational inferences that the algorithm arrives at and what we can spot immediately when we read, and it is difficult to build an algorithm to bridge that gap.

      Thanks a lot for this post, Andrew. It was quite interesting to read and it made me think. And I always love when an article makes me think about my favourite scientist, Roger Penrose :)

      • Andrew Blackman 13 December 2013 at 7:01 pm #

        You’re absolutely right – it’s worrying when it depends on word frequency. I would guess (or hope!) that the algorithm is quite subtle in its construction, and perhaps takes account of position in the sentence or patterns of particular words, but even so, it’s impossible to program it to appreciate the beauty of those lines of poetry you quoted.

        People often refer to the Turing test as the benchmark of artificial intelligence, but I think a good test for AI would be the ability to appreciate good poetry!

        Thanks for your comments, which made me think too!

        • Emma 14 December 2013 at 10:01 pm #

          Very interesting exchange.

          When Vishy mentions that poem, he means that we find it poignant and that a computer isn’t able to react to this poem by finding it poignant.

          This is exactly what differentiates humans from androids in Do Androids Dream of Electric Sheep? Empathy, compassion. That’s the point Philip K Dick is making: our capacity to feel empathy is the basis of our humanity. Our tendency to react irrationally because our feelings interfere with our logical decision-making process is another human trait. And that cannot be imitated and frankly, I hope we will never manage to imitate it.

          • Andrew Blackman 16 December 2013 at 5:31 pm #

            Beautifully put, Emma. We read so many depressing things about “human nature” – apparently it’s human nature to be competitive, to be violent, to cheat, and so many other things. But the things you talk about seem to me much more essentially human.

            You could probably program a computer to be an excellent CEO, making rational decisions and generating profits no matter what the human cost. But you could never program a computer to understand the poignancy of some lines of poetry. That is inimitably human, and I’m glad about it too!

            • Emma 16 December 2013 at 10:11 pm #

              You can’t be an excellent CEO if a bit of humanity doesn’t get in the way of rational decisions.
              I’m not talking about making your employees happy; I’m not naive enough to think that hotshot CEOs care about that. I’m speaking about enthusiasm for a new product or the gut feeling that this path is where the business should be driven, even if it seems crazy.

              • Andrew Blackman 17 December 2013 at 1:22 pm #

                True. I was thinking in terms of profit maximisation, but good CEOs should also have hunches about where the future is going. I’d argue they should also care about making their employees happy, if not out of humanity then for the simple, rational business reason that happy employees do better work and make the company more successful in the long run. So maybe my computer would end up making humane decisions, for inhumane reasons. Or maybe it would take over the world and make us its slaves…

    • Andrew Blackman 13 December 2013 at 6:57 pm #

      Hi Vishy

      You made me smile when you said they are trying to build the holy grail :-) That’s exactly it! It’s a noble effort, and as you say, potentially much more powerful than recommendations based on sales and ratings. But how to judge literary quality by a computer algorithm? That’s the real question.

      I love the distinction you draw (via Penrose) between calculation and insight. In the post I was trying to say that there’s something intangible that humans have and computers don’t, and that’s the perfect word – insight. I’ll definitely look for that book – it sounds interesting, and I trust your recommendations :-)

  5. Brian Joseph 13 December 2013 at 12:10 pm #

    I share your skepticism but it does sound interesting. I would like to play with the system.

    The problem for me is that when it comes to things that I am really into, I generally do not need more recommendations. There is insufficient time (or in some cases money or, in the case of food, calories) to get to even a fraction of what is on my list to try!

  6. Delia (Postcards from Asia) 15 December 2013 at 4:30 am #

    Sometimes I add books to my to-be-read shelf on Goodreads based on the recommendations I get on the site, although I can’t remember a single one to give as an example right now.
    I get my books from wandering in a bookstore for hours, at second hand book sales, and my book club, where everybody brings books and we share them. One of my friends gave me a book last time and said, you told me you like immigrant stories, so I brought you this one. It was The Road Home by Rose Tremain and I did like it, a lot.
    I like the element of surprise, that feeling of going into a place filled with books and discovering my next read without having known anything about it in advance. I met quite a few books like that and we became great friends. :)

    • Andrew Blackman 16 December 2013 at 5:38 pm #

      That’s interesting, Delia. I’ve never really looked at the Goodreads recommendations, but I’ve reviewed and rated hundreds of books on there over the years, so their computers should have a good idea of my tastes (subject to the limitations mentioned above!).

      I love your way of finding books. I used to do that a lot more often in the past. These days I plan my reading more. I wrote about the element of surprise, what I called “lucky dip reading”, on the blog a few years ago. Interesting to hear your take!

      I enjoyed The Road Home too, by the way :-)
