What good is data analysis if it doesn’t tell a story from which we can learn? Data storytelling is a powerful technique that translates insights obtained from data into knowledge and action.
By using video to tell stories with data, you can explain the nuances of data analysis. Also, viewers are free to watch the video at their convenience. Our guest for this conversation is an expert in using video for data storytelling.
Karthik Shashidhar is a data analytics and quant expert who practices data analytics in a wide range of areas, including business, public policy and sports. In this conversation, Karthik shares his best practices for data storytelling using videos.
Watch the recorded conversation
Read the Transcript
The transcript has been edited for clarity.
SUNIL KOWLGI: Hi everyone, I’m Sunil Kowlgi, Founder and CEO of Outklip. Today I’m interviewing Karthik Shashidhar, who is the Founder and Director of Bespoke Consulting. Karthik is a data analytics and quant expert based in London. He has numerous accomplishments to his name — he’s authored a book on market design, which I read and really enjoyed and I recommend the same for you. He’s a columnist for Bloomberg. He puts out a newsletter on data science and he’s a prolific blogger. He recently started making videos to tell stories with data and that’s the topic of our conversation. Karthik, welcome to this interview. Please could you tell us about Bespoke Consulting and the work you do?
KARTHIK SHASHIDHAR: This is something that I started about 7 years ago in India. By the way, I’m moving back to India very soon, back to Bangalore. I started it back before data science actually became a thing. The basic idea was that there are a lot of everyday business problems that you can solve better if you can make decisions using data. So, the idea is to help businesses make better decisions using data. To use their data to help them build models for everyday decision making and to drive strategic change based on data and so on.
KOWLGI: What kind of clients do you work with? What industry are they in, finance, consumer and so on?
SHASHIDHAR: I’ve worked with a fairly broad range of clients. My clients include the largest life insurance company in India and quite a few fintech startups. So, it’s largely been in finance, though I’ve also done a fair amount of work in retail. Right now I’m doing fairly interesting stuff on the pricing front and that should come out fairly soon. I’ve also done some work with nonprofits.
KOWLGI: You mentioned you’ve worked on Bespoke Consulting for the last 7 years, how long have you worked in data analytics in general?
SHASHIDHAR: Right from day one. Data analytics has been the bread and butter of what I’ve been doing. It’s over the last couple of years that I have really upped my game on the machine learning front. But the core offering still remains data analytics and how businesses can use data analytics to improve their products, improve their businesses and so on.
What is data storytelling?
KOWLGI: You’ve been doing a lot of data storytelling lately. Can you tell us what is data storytelling and why is it useful?
SHASHIDHAR: Sometimes what happens is that you can just look at a piece of data, or take a piece of data and twist and turn it in an interesting way, so that the data itself can tell a story. What a lot of people do is that they take data, present it in one way and then write a 1,000-word essay trying to explain what they’re doing. You know the cliche that a picture can tell a thousand words. Sometimes you can just show a picture and, if you’ve drawn the graph in a nice way, the data itself will tell you the story. For example, one of the happiest days of my consultancy career was this one day when I was on a call with a client and they were having a bit of an argument. All I did was to present data to them in a particular format. It was just a chart that I showed in terms of how something was distributed, and just looking at the chart completely resolved their discussion and conflict. The chart told the story extremely precisely and because of that there was no more disagreement on how they must formulate their policies and so on.
The best graphs are those where you don’t even need to explain to an audience
KOWLGI: It’s a very clear way of bringing out insights that are hidden in the data and then explaining it to an audience so they can take action based on that.
SHASHIDHAR: The best graphs are those where you don’t even need to explain to an audience, you can just put it out there and people see it and say oh shit, why didn’t I think of this before. It’s basically about twisting and turning the data or as industry experts call it, slicing and dicing the data that will result in the message being right in front of you. Of course what happens is that you can’t make great visualizations most of the time. Sometimes you just slack off and sometimes there’s no really good intuitive way to show the data and stuff. In those cases you make a slightly inferior visualization. In that case you’ll actually need to tell stories around the data, the data can’t tell its own story in those cases.
KOWLGI: You’ve been doing a lot of data storytelling of late with your project called DeepExtraCover.
SHASHIDHAR: That’s a temporary name by the way. For the lack of a better name, I normally suck at nomenclature, so hopefully I come up with a better name soon.
At Goldman Sachs…I’d been building some models for derivative pricing and for execution of equity deals and so on. I noticed that the way I was pricing derivatives could actually be applied to evaluating a cricket match
KOWLGI: The name is pretty good I feel. Just to give the audience an idea of DeepExtraCover, what you’re doing is analyzing cricket matches in a very different way. You’re making videos on YouTube to tell a different story about the match that was just played. To a Western audience, cricket is probably unfamiliar, but it’s a very popular game in a number of countries, India included. There’s a huge following of the game of cricket, and after the game people often look up cricket websites to see what happened. So, you’re giving a very different perspective on the game. Can you tell us about the DeepExtraCover project? Why and when did you start this project?
SHASHIDHAR: This started pretty much 7 years back, in 2012. Then it was inspired by some of the work that I’d done. Prior to starting my consultancy, I used to work for Goldman Sachs, the investment bank. There I’d been building some models for derivative pricing and for execution of equity deals and so on. I noticed that the way I was pricing derivatives could actually be applied to evaluating a cricket match. The basic idea was that you can come up with a win probability score for every possible situation and you can use that to evaluate the game. So what happened was that around the same time a couple of scientists in New Zealand came up with a system that they called the WaSP or Winning and Scoring Predictor, which pretty much did the same thing and put their code out there. What I have is basically an improved version of the WaSP, taking into account things like the nature of the pitch and conditions. Actually, there’s a small amount of learning element in the model that as the game goes along, it learns about how the pitch is behaving and adjusts the probabilities accordingly and things like that. Then I figured that once you have the probabilities of each team winning at the end of every ball, you can pretty much join them up and that tells you the story of the game. It tells you whether team 1 really ran away with the game or whether it was a close contest. It tells you whether the game see-sawed between the two teams. So, that’s how the app or the model was born.
KOWLGI: Does this model give you a real-time picture as the game is progressing?
SHASHIDHAR: Yes, it does give a real-time probability, except that so far I haven’t finished the plumbing. Hopefully, I should have something in place before the IPL begins this month, where I’ll actually be able to show the probabilities in real time. It’s still a model, so I haven’t tested it against the betting markets and so on, which would have a better picture of what the game’s exact probabilities are. But it gives us a good way to evaluate the game, to look at the story of the game, to look at individual player contributions and so on.
Example video of cricket match analysis using DeepExtraCover
KOWLGI: Just a quick clarification for the audience. IPL is the Indian Premier League, which is an annual cricket championship that happens in India every April. And, basically that’s your next project, which is analyzing IPL games using your DeepExtraCover app.
SHASHIDHAR: Yeah, analyzing them live. So far, for the last one month, ever since I started using Outklip to record my videos, I’ve been analyzing the game in a post-facto manner, once the game is done. What I plan to do with the IPL, though I won’t do it for every game, is to hopefully have the app up and running so that during the game, instead of looking at the scorecard and wondering which team is ahead, you can look at my app and see which team is ahead, how the game has played out so far and whether it’s worth watching.
KOWLGI: Could you give us a brief overview of the DeepExtraCover app itself? Like what have you written it in and a little bit of insight on the model, which some of our audience might appreciate.
SHASHIDHAR: The basic model is what is known as the WaSP, which was created by a couple of scientists from the University of Otago. The model is about 7 or 8 years old. For each ball, using historical data, you can model the likelihood of the different events, which are basically 1, 2, 3, 4, 5, 6 or out. Based on these probabilities, you can calculate at the end of each ball the likelihood that the team will score the number of runs required to win. And then you use what in finance or in PDs we call backward induction, or what computer scientists call dynamic programming, to extend this to every ball in the game. You kind of have a massive grid, which tells you that if you’re 30 for 3 after 10 overs, you have so much chance of winning the game. If you need 25 runs in 20 balls with 2 wickets in hand, this is the probability of winning, and so on. So, it’s basically a massive precalibrated grid, which tells you, based on the match situation, the likelihood of each team winning. On top of that, this is calculated for different regimes of the pitch, as I call them: it can be a batting wicket or a bowling wicket, or a slow turner or a dead pitch or whatever. And there’s a learning element: as the game goes along, the model learns what kind of pitch the game is being played on.
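The grid Karthik describes can be illustrated with a small sketch. This is not the WaSP or the DeepExtraCover model. It assumes a chasing team, a single fixed set of per-ball outcome probabilities (the numbers in OUTCOME_PROBS are made up for illustration, with a dot ball added and the rare 5 left out), and a hypothetical function name, win_probability. It uses memoized recursion, the top-down form of dynamic programming, to work back from the end of the innings to any match situation.

```python
from functools import lru_cache

# Hypothetical per-ball outcome probabilities (illustrative, not calibrated
# from historical data). Keys are runs scored off the ball; "out" is a wicket.
OUTCOME_PROBS = {0: 0.35, 1: 0.35, 2: 0.08, 3: 0.01, 4: 0.12, 6: 0.05, "out": 0.04}


@lru_cache(maxsize=None)
def win_probability(runs_needed, balls_left, wickets_in_hand):
    """Probability that the chasing team wins from this match situation."""
    if runs_needed <= 0:
        return 1.0  # target already reached
    if balls_left == 0 or wickets_in_hand == 0:
        return 0.0  # out of balls or out of batsmen
    total = 0.0
    for outcome, prob in OUTCOME_PROBS.items():
        if outcome == "out":
            # A wicket falls: one ball and one wicket fewer, same target.
            total += prob * win_probability(
                runs_needed, balls_left - 1, wickets_in_hand - 1
            )
        else:
            # Runs are scored: one ball fewer, target reduced by the runs.
            total += prob * win_probability(
                runs_needed - outcome, balls_left - 1, wickets_in_hand
            )
    return total


# The situation quoted in the interview: 25 runs needed in 20 balls, 2 wickets left.
print(round(win_probability(25, 20, 2), 3))
```

Because every situation is cached after it is first computed, calling the function for each (runs, balls, wickets) combination fills in the kind of lookup grid described above. A real model would calibrate the outcome probabilities from historical data and let them vary with the match situation.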
KOWLGI: It’s really adaptive in that sense?
SHASHIDHAR: Yes, it’s adaptive. Last week we had a game where one of the teams, I think it was England, scored 420. Normally in a one-day match, the range of scores you see nowadays is between 250 and 300 for most games. So, 420 was a massive outlier. But because my model is adaptive, it gave England about an 80% chance of winning after scoring 420, because it had understood that, since they had scored rather quickly, there was a good chance that it was a batting pitch. And then the West Indies actually came close, they got to around 390 or something like that, so they made a match of it. It wasn’t a one-sided game at all.
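The interview does not say how the learning element works. One plausible way to sketch the idea, purely as an assumption, is a Bayesian update over pitch regimes: start with equal belief in each regime and reweight after every ball by how likely that ball's outcome was under each regime. The regime names, the probabilities in REGIMES and the function update_beliefs are all hypothetical.

```python
# Hypothetical per-ball outcome probabilities under two pitch regimes
# (illustrative numbers only; each row sums to 1).
REGIMES = {
    "batting": {0: 0.30, 1: 0.36, 2: 0.08, 4: 0.15, 6: 0.08, "out": 0.03},
    "bowling": {0: 0.45, 1: 0.33, 2: 0.06, 4: 0.08, 6: 0.02, "out": 0.06},
}


def update_beliefs(beliefs, outcome):
    """Bayes' rule: reweight each regime by the likelihood of the observed ball."""
    unnormalized = {
        regime: weight * REGIMES[regime][outcome]
        for regime, weight in beliefs.items()
    }
    total = sum(unnormalized.values())
    return {regime: weight / total for regime, weight in unnormalized.items()}


# Start undecided, then observe a fast-scoring over.
beliefs = {"batting": 0.5, "bowling": 0.5}
for ball in [4, 6, 1, 4, 0, 6]:
    beliefs = update_beliefs(beliefs, ball)

print(beliefs)
```

After a run of boundaries the belief shifts strongly towards the batting regime, which is the behaviour described for the 420 game: a big total counts for less when the pitch is judged to be easy to bat on. The overall win probability could then be a belief-weighted average of the per-regime grids.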
Using video for data storytelling
One of my friends mentioned that I should make a video out of it, so I can explain all the nuances, explain all that’s going on in the graph. So, that’s when I decided to give it a shot to make videos.
KOWLGI: You’re taking the output of your model or your app and then you’re presenting it to an audience, and making videos for YouTube. Can you tell us why you decided to make videos for YouTube? Why not blog about it instead?
SHASHIDHAR: I think about a year back I started blogging about it. I mentioned right at the beginning that if you make a great visualization, you don’t need to tell a story about it. Unfortunately in this case I don’t think I have been able to make a great visualization. If you look at the app itself, there is one nice smooth graph, but there are a lot of annotations, which I haven’t found a clean way to show. Moreover, people are not very used to seeing a cricket match in this way, in terms of this kind of a graph. So, when I started putting it out on Twitter, through my blog and through static graphs, I found the uptake was rather slow because people didn’t really understand or appreciate everything that was going on in the graph. Then one of my friends mentioned that I should make a video out of it, so I can explain all the nuances, explain all that’s going on in the graph. So, that’s when I decided to give it a shot to make videos.
KOWLGI: Who’s the audience you’re catering to?
SHASHIDHAR: The audience is probably people who like to follow cricket, who want to know what happened in the game. Unfortunately I haven’t been able to keep the videos as short as I wanted. Most videos are turning out between 6 and 7 minutes long. I wanted to keep them under 4 minutes. But anyway, it’s a quick overview of what happened in the game, what the key moments were, what the key periods were. For example, sometimes it can happen that the major match reports miss what was actually the key event in the game, because they look at the scorecard in a rather conventional manner. But if you look at the story of the game in terms of where it swung and what the inflection points were, that tells you a completely different story of the game, and that’s what I try to cover through the videos.
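As a rough illustration of finding where a game swung, the sketch below takes a ball-by-ball series of one team's win probability and ranks the balls by the size of the change they caused. The series values and the function name biggest_swings are invented for this example and are not taken from the DeepExtraCover app.

```python
def biggest_swings(win_probs, top_n=3):
    """Return (ball_index, change) pairs with the largest absolute change."""
    swings = [
        (i + 1, win_probs[i + 1] - win_probs[i])
        for i in range(len(win_probs) - 1)
    ]
    return sorted(swings, key=lambda swing: abs(swing[1]), reverse=True)[:top_n]


# Hypothetical win probability for one team after each ball.
series = [0.50, 0.52, 0.55, 0.40, 0.42, 0.60, 0.58, 0.30]

for ball, change in biggest_swings(series, top_n=2):
    print(f"ball {ball}: win probability moved by {change:+.2f}")
```

A conventional scorecard would not show these moments directly; ranking the changes is one simple way to surface the candidate key events that a video can then explain.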
KOWLGI: How do you prepare for these videos? Do you just do it off the cuff or do you have a script?
One thing in data visualization is that the person who makes the visualization controls the message
KOWLGI: To recap, you have this model running on a Shiny service. And you’re making videos that give your story of the match. You have a link to your app along with the video, so people can go in and play for themselves. Configure various things and see various slices and dices of the data.
SHASHIDHAR: Yeah, that’s the hope. But I mean, there’s still more than can be done. For example, I want to give people the ability to do more analysis of the game. One thing in data visualization is that the person who makes the visualization controls the message. So, in that sense what I’ve put out there is what I’ve found interesting. Going forward what I hope to do is come up with a way in which people can analyze data in a way useful to them. They can base it on what’s interesting to them rather than to me. There are quite a few enhancements in store for the rest of the year that I hope to push through.
While people might think I’m good at tech, I don’t like to get involved with the nitty gritty of tech. From that perspective, what Outklip offers me is it’s a very user friendly app
KOWLGI: You mentioned you’re an Outklip user. What features do you use in Outklip and why do you use them?
SHASHIDHAR: I think the first thing that caught my attention was: 1. I could overlay the screen with my face in the corner. This was the most intuitive way I could record the webcam, the screen and my voice together, completely seamlessly, without any post-processing. So, it’s extremely easy to use. 2. Push to YouTube, which is just one button, one click. Within 2 minutes of making the recording, I can share it on YouTube. So, there’s no cycle where I would, let’s say, record in QuickTime and then do some processing and then upload. While people might think I’m good at tech, I don’t like to get involved with the nitty gritty of tech. From that perspective, what Outklip offers me is a very user-friendly app. It makes it very easy to record videos and upload them. It’s what they used to say in the Hero Honda ad 25 years ago, “fill it, shut it, forget it.”
KOWLGI: I’m glad you found it easy to use.
SHASHIDHAR: It’s extremely easy to use.
KOWLGI: What features will you need going forward? What are the things you might do to enhance your videos?
SHASHIDHAR: One piece of feedback I’ve got on my videos is that maybe it could be a conversation rather than me doing all the talking myself. It would help if I were talking to somebody else, through the game, through the app. Now, I don’t need both people’s videos, but if it was possible to record a conversation with Outklip, where my screen is being shown and there’s one local voice and one remote voice, that would be an absolutely fantastic feature to have. That would let us record videos, record conversations, for example this conversation between you and me. There were points in the conversation where I wanted to share the screen. Maybe I could have shared my screen and shown what was happening and so on. Imagine there was a version of Outklip where there was you, there was me and there was either your screen or my screen, or in the most fantastic case, both our screens. And then we could switch between them at will and use them as visual aids for the conversation, rather than the audience just looking at our faces through the conversation. That would be an absolutely brilliant feature to have.
KOWLGI: I see your YouTube channel is racking up subscribers?
SHASHIDHAR: It’s racking up subscribers and I’m getting trolls who are reporting my video for copyright infringement and taking it down. Hopefully those trolls will get silenced soon.
KOWLGI: I see your videos, they’re very original and it’s all your content, so I hope they don’t get taken down any further.
SHASHIDHAR: I hope so as well. I hope the guys at Google who review these requests see that there’s no plagiarized content in this.
KOWLGI: So, what response have you got from people who have viewed your videos?
SHASHIDHAR: This is still the beginning. I’ve put out 10 or 15 videos so far. I’m getting a lot of input on how I can make them better and so on. People who enjoy the core app have been enjoying the videos for the little explanations and so on. I think all the comments I’ve gotten are on the content and delivery and not on the way I’ve been recording the videos.
Data science newsletter and Tumblr page for bad data visualizations
KOWLGI: I want to switch gears a little bit. Karthik, you’re prolific in that you work on a number of different projects, so I want to talk about that. You send out a monthly newsletter called the Art of Data Science, can you tell us what that is about and what you want to achieve with that?
SHASHIDHAR: That’s a very interesting question, because I really don’t know what I want to achieve with that. I’ve been writing for 16 years but I’ve never really thought about why I’ve been writing. I just write for myself and a lot of good has come from my writing, and if some good comes from it, that’s great. With the newsletter, I figured there’s a coherent set of stuff I want to write on and it might not be appropriate for the blog. It started when I moved to London two years ago and figured that the way data science was being practiced here is way different from what I was used to. I come from more of an analytics background, whereas here, it seems if you’re not using machine learning, it’s not data science, sort of thing. I had a lot of thoughts, and I initially wanted to share them with a small set of people. I started the newsletter and the response was great. I figured I won’t lose much by sharing with a larger set, maybe I need to be a little more discreet but that’s about it. So, I started writing and the response has been really encouraging. People have told me they like the newsletter and that’s what has kept it going for two years now.
KOWLGI: I subscribe to your newsletter and it covers your perspective on artificial intelligence and machine learning.
SHASHIDHAR: It’s highly opinionated, it’s my take on what’s happening in the world of data science, analytics and artificial intelligence and so on. But I don’t try to be balanced or anything, it’s just my opinion.
KOWLGI: Another project you have going is a Tumblr page of bad data visualizations. It seems like data visualization is one of your pet areas. Can you tell us more about your Tumblr page?
SHASHIDHAR: The Tumblr page is called Bad Visualizations. Every time I saw a bad visualization on Twitter it would hurt my eye, so I would just tweet it and ask why somebody made this graph, and say it’s so horrible in this and this way and so on. Somebody suggested collecting and documenting these in one place. And then I found Tumblr to be an extremely useful and easy-to-use tool for this kind of use case. All I have to do is download the graph from where I saw it, upload it to Tumblr and then give it a two-line commentary. Sometimes I don’t even give the commentary. Some of the users are so used to my comments that I don’t even have to give comments. It’s got a fair amount of traction on Twitter and a fair amount of positive impact, so I’ve kept going.
The advantage of videos is that you can condense a lot of information in a short amount of time.
KOWLGI: Down to my last couple of questions. What you’ve done so far on YouTube are videos to explain sports analytics. You have a core product which does a very unique analysis of a cricket match and you’re making videos to describe that. Do you see the potential for data storytelling in other realms — maybe in business, could be in other areas altogether?
SHASHIDHAR: Absolutely. The way I think about it is cricket is just the start. I’ve made a plan on how I’m going to take this forward. I started with cricket because I had a good model, I had a unique way of telling the stories in cricket. I see the potential for doing it in several other realms, like business, government data, public policy, etc. Soon, I’ll start making videos on those other areas as well. The advantage of videos is that you can condense a lot of information in a short amount of time. You can also present your view on it. Sometimes if I show you a bunch of graphs, you might not understand what’s happening, you might gloss over them. A lot of people are very phobic about numbers, so they might give it a skip. With a video the advantage is I can take you through it, I can give you the information, I can teach you how to interpret it, I can tell you what it’s saying. Basically, you can literally spoon-feed the viewer in the case of a video. You can really put your own personal stamp on it. My interpretation in the video is different from your interpretation of the graph. I can put my own brand on it, so I plan to do this for several other things as well.
You make the visualizations that tell your story and video gives you an opportunity to take the audience along, you can tell the story the way you want
KOWLGI: My last question for you, do you have advice for anyone doing data science or data analytics and now they want to dabble in making videos to tell stories with data. How do they start out and what are some of the best practices you want to share with them?
SHASHIDHAR: I think the most important thing is before you start telling a story you need to know what story you’re going to tell. And then you make your visualizations. With DeepExtraCover, I have the visualizations and use that to explain the story. Sometimes what you can do, which I intend to do when I move beyond cricket, is create the visualizations based on the stories you want to tell and then weave together a story and present it coherently. Remember that you can totally be opinionated, when you’re making a video, you’re not a reporter or journalist, you’re telling your version of the story. So you make the visualizations that tell your story and video gives you an opportunity to take the audience along, you can tell the story the way you want.
KOWLGI: Thanks so much for sharing your perspective and telling us more about DeepExtraCover and your other projects. I wish you all the best.
SHASHIDHAR: Thanks, Sunil. First of all, thanks for Outklip and having me on this chat.
KOWLGI: The pleasure is all mine. Goodbye.