How NPR is Embracing Open Source and Open APIs
16 July 2009, 8:05 pm
News providers, like most content providers, are interested in having their content seen by as many people as possible. But unlike many news organizations, whose primary concern may be monetizing their content, National Public Radio is also interested in turning it into a resource that people can use in novel ways. Daniel Jacobson is in charge of making that content available to developers and end users in a wide variety of formats, and has been doing so using an Open API that NPR developed specifically for that purpose. Daniel will talk about how the project is going at OSCON, the O'Reilly Open Source Convention. Here's a preview of what he'll be talking about.
James Turner: Can you start by explaining what NPR Digital Media is and what your role with it involves?
Daniel Jacobson: Sure. NPR is a radio organization, of course, and the Digital Media Group, of which I'm a part, handles, essentially as I describe it, everything that is publishable by NPR that does not go to a radio. So that includes the website, podcasts, API, mobile sites, HD radios, anything that has some sort of visual component to it. So Digital Media as a group is responsible for producing that content, producing all of those distribution channels, managing all of those relationships.
James Turner: And what is your particular role there?
Daniel Jacobson: I manage the application development team that is responsible for all the functional aspects of all of the systems, which includes our CMS, all of the templating engines for the website, for the API, for the podcasts, all of the engines that drive that.
James Turner: Now NPR is an organization that consists of a lot of member stations kind of flying in close formation. What's your relationship with the content producers? To what extent do they have their own stuff, and to what extent do you work together?
Daniel Jacobson: Those member stations are really exactly that; they are members of NPR. They essentially buy NPR programming. They're distinct organizations from us. NPR is a content producer and distributor. They buy our programming and broadcast it out to the world. They also have their own corresponding web teams that can take NPR content and also produce their own content and create their own websites. So in the Digital Media Team, we take a lot of pride and effort in providing services that help those member stations better serve their communities and their listeners and audiences, using NPR content and using their own content. We work with them to try and satisfy their missions. And to the extent that they need NPR services or content, we work hard to try and provide those. The API is one massive step, I think, in making it much easier for them to do what they need to do without a whole lot of intervention from us, where previously they would have to pull in content in much more arduous ways. So the API, I think, is a step in the right direction to make it more of a self-service model.
James Turner: Since you've mentioned the API, that's what you're going to be talking about at OSCON. We've already talked to the New York Times and the way they're opening up their content through APIs. What are you doing with yours?
Daniel Jacobson: Well, we launched ours formally at OSCON last year. And at that time, we essentially opened up our entire archive. So anything that you can get on npr.org is available through the API, to the extent that we have the rights to distribute it. There are some rights restrictions, for example, for receiving photos or stories from sources that we have not cleared rights to redistribute. Those are getting suppressed through a rights filtering engine on our API. Everything else that you can get on npr.org, you can get through the API. That includes full text. It includes images, audio, video, everything like that. Throughout the last year, we have added more features. We included the layer of "mix your own podcast", for example, which allows people to not only get the content in audio form, but also to download it as a podcast-type item. And all of that is available through search terms or totally customized queries. So what the API really does is it enables people to take the content, make widgets, or do whatever they want with essentially everything that is on npr.org and get to audiences that we are not getting to.
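To illustrate the rights filtering Jacobson describes, here is a minimal Python sketch. The interview does not say how NPR's engine works, so the `Story` and `Asset` classes, the `redistributable` flag, and the `filter_for_api` function are all hypothetical names chosen for this example. The sketch only shows the general idea: content without cleared rights is suppressed before the API returns anything.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    kind: str              # e.g. "text", "image", "audio"
    source: str            # who supplied the asset
    redistributable: bool  # hypothetical flag: rights cleared for redistribution


@dataclass
class Story:
    title: str
    redistributable: bool  # hypothetical flag for the story as a whole
    assets: list = field(default_factory=list)


def filter_for_api(stories):
    """Return copies of the stories that may be redistributed.

    A story without cleared rights is dropped entirely; a cleared story
    keeps only the assets that are themselves cleared.
    """
    cleared = []
    for story in stories:
        if not story.redistributable:
            continue
        allowed = [asset for asset in story.assets if asset.redistributable]
        cleared.append(Story(story.title, True, allowed))
    return cleared
```

Building new `Story` objects instead of mutating the input keeps the full archive intact, so the same records can still feed a channel where the rights situation is different.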
James Turner: This probably isn't as much of a factor for you because, in some ways, you're not dependent on the same kind of revenue streams as a lot of news and content providers. But when you provide that kind of access, isn't there a bit of a fear that you can get even more to the point where portal sites and aggregators like Google can essentially steal your traffic?
James Turner: So it's been open for a while. What are some of the things you've seen people do with it?
Daniel Jacobson: Well, one of the most interesting things is what we call the Flubacher app, which is essentially an iPhone application that somebody out in the world built. His name is Brad Flubacher. And he's essentially taking the API content and putting it into this iPhone app. And you can stream content within the iPhone app, all of our programs or our topics. And when I say stream, it's essentially doing API queries every time you make a request. So he's not archiving all of this content. It's just basically a pass-through engine. It's been very popular and a very interesting application.
A lot of our member stations are doing very creative things. Minnesota Public Radio, for example, just launched their new site. And they're making extensive use of the API. North Country Public Radio is another one where they've said that, I think, 50 percent of their pages or so have NPR API content on them. So our member stations are making heavy use of it.
I've seen a lot of instances of people making code wrappers. There's a Ruby on Rails code wrapper for our API. There's a Perl one that someone just created. So a lot of people are out there doing very clever things with it. And we're just looking forward to more and more uses.
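The "pass-through engine" pattern Jacobson mentions for the iPhone app, where every user request triggers a fresh API query and nothing is archived, can be sketched roughly as follows. This is an illustration in Python, not the app's actual code. The base URL is a placeholder, the `id` and `apiKey` parameter names are assumptions, and `PassThrough`, `build_query_url`, and `fetch_over_http` are hypothetical names.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder endpoint; the real API host and path are not given in the interview.
API_BASE = "https://api.example.org/query"


def build_query_url(params, base=API_BASE):
    """Build a query URL with parameters in a stable (sorted) order."""
    return base + "?" + urlencode(sorted(params.items()))


def fetch_over_http(url):
    """Default fetcher: perform a real HTTP GET and return the raw body."""
    with urlopen(url) as response:
        return response.read()


class PassThrough:
    """Forward every request upstream; keep no cache and no local archive."""

    def __init__(self, api_key, fetch=fetch_over_http):
        self.api_key = api_key
        self.fetch = fetch  # injectable so the class can be tested offline

    def stories_for(self, topic_id):
        # Parameter names "id" and "apiKey" are assumptions for this sketch.
        url = build_query_url({"id": topic_id, "apiKey": self.api_key})
        return self.fetch(url)
```

Because nothing is stored, two identical requests cause two upstream queries. That keeps the client simple and always current, at the cost of more load on the API than a caching client would create.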
James Turner: So obviously you're familiar and probably a fan of open source. How is NPR using open source technologies?
Daniel Jacobson: So internally, with the exception of our database, all of our systems are employing open source technologies. I assume that's what you mean: what open source technologies are we using? So our database is Oracle. And our plan is to migrate that to MySQL. But over the last couple of years, we've really adopted open source more and more. For all of our coding engines, we're not using any proprietary application servers or anything like that. It's all open source, Apache, all that kind of stuff.
It's very important to us that we keep the open source model. And as we look more towards open source, we're kind of changing our vision to be less of a consumer of the open source products and more of a contributor. And I think that's what you see with the API. It's the first step to say, "How can we contribute back to this community and give them the things that we're good at?" Which would be content, in this case. And more and more, we're going to start looking towards opening up our applications and saying, "Here, go fork this. Go make interesting things with it." And I think over the next couple of months you're going to see a lot more open source applications coming from NPR.
James Turner: As I mentioned earlier, the New York Times has got their open API. You have an API. Is there any effort going on to try to standardize for this type of content a single API that would allow people to use common code throughout all of these data sources?
Daniel Jacobson: That's a great question. I'm involved in a resource group for PBCore, as an example. PBCore was really set up to be a public broadcasting core, but there are a lot of other organizations that are starting to adopt it as more of a standard for passing data back and forth between the organizations. I'm not sure if that's going to be as pervasive in the overall marketplace. With respect to New York Times and other organizations that are outside of that circle of PBCore, we actually haven't had many conversations about formalizing some sort of standard across us. I think that's a very interesting idea. That said, there are already a host of standards out there in the world. And NPR has tried with our API to really make the API adhere to as many standards as possible.
We have our own custom tagging language, which we built and which we call NPRML. It is essentially the XML structure that closely mirrors our content. But we can now put all of our content in Media RSS or podcast RSS or Atom, or I think there are a total of eight or nine outputs. And next on the docket will be NewsML and PBCore, or probably PBCore first. And so we're trying to make our content as standards compliant as possible.
I think your question is, is there some other standard that would allow for richer content to be standardized across all of these news organizations. It's a really interesting question. I don't know that all of the organizations are going to have the philosophy of opening up as much as NPR has. So, for example, New York Times does not offer full text content in theirs; we do. Our source is really heavily weighted towards audio and theirs isn't. So there are going to be some differences across them that make it a little bit more challenging.
But we are collaborating a lot with these organizations. I also want to add that New York Times and NPR will be hosting a mash-up camp at OSCON on Friday. And this is an example of one of those steps where we're really trying to play nicely with all of these other organizations and trying to unify in front of the public, you know, "We are both media organizations. We want to get everybody kind of focused on the same concepts." I think your proposal of a next step towards a standard of process might come down the road.
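The idea Jacobson raises in this answer, a single internal content structure that can be emitted as RSS, Atom, and other outputs, can be illustrated with a small Python sketch. The story dictionary, its keys, and the function names are hypothetical, and the elements produced are simplified fragments rather than complete, valid feeds. NPRML itself is not shown because the interview does not describe its schema.

```python
import xml.etree.ElementTree as ET


def to_rss_item(story):
    """Render one story as a simplified RSS <item> fragment."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = story["title"]
    ET.SubElement(item, "link").text = story["link"]
    ET.SubElement(item, "description").text = story["teaser"]
    return ET.tostring(item, encoding="unicode")


def to_atom_entry(story):
    """Render one story as a simplified Atom <entry> fragment."""
    entry = ET.Element("entry")
    ET.SubElement(entry, "title").text = story["title"]
    ET.SubElement(entry, "link", href=story["link"])
    ET.SubElement(entry, "summary").text = story["teaser"]
    return ET.tostring(entry, encoding="unicode")


# One renderer per output format; supporting a new format means adding an entry.
RENDERERS = {"rss": to_rss_item, "atom": to_atom_entry}


def render(story, output):
    """Render the same story record in the requested output format."""
    try:
        renderer = RENDERERS[output]
    except KeyError:
        raise ValueError(f"unsupported output format: {output}") from None
    return renderer(story)
```

Keeping the content in one neutral structure and treating each format as a separate renderer is what makes it cheap to add another output later, which matches the approach of adding formats one at a time that Jacobson describes.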
James Turner: What do you see coming on the horizon both for NPR and if you want to put on your oracle hat, more generally in the news business?
Daniel Jacobson: Well, for NPR, digital is obviously very important as it is for most other media organizations. And over the next several months, you're going to see a lot of changes for NPR. We are focusing a lot of energy towards distribution channels, portability. I think portability is a huge factor in this marketplace. And you're asking about down the road. My view is I really see webpages and websites, browser-based, PC-based experiences, they're going to start diminishing in importance. I don't know exactly what the timeframe is. It could be a couple of years. It could be five years. I don't know. But at some point, it's going to plateau and mobile's going to surpass it. And having content be portable is going to be paramount.
So I think that NPR's philosophy is going to mirror that. We're putting a big emphasis on portability. That's why the API is so critical, not only for end-users in the world but also for all of our business needs. We spend a lot of time with business partners, getting them to understand the API so that they can more easily tap into our content and services in their environment. So it's all about distribution at this point for us. And I think over the next three to five years, you're going to see a lot more people consuming NPR content on the go rather than in front of the computer.
James Turner: It sounds like you're going to be fairly busy at OSCON, but is there anything beyond the stuff that you're participating in that's caught your eye or has you excited?
Daniel Jacobson: I will be honest. I'm going to be at OSCON for about a day-and-a-half, and that's because we have some major launches later this month. So I've got to swing in, do my stuff, and swing out, which is regrettable. But there are a couple of sessions that I did notice. I think there were some talks about microformats and, of course, portability, and HTML 5. Those were the things that caught my eye.
James Turner: All right. Well, Daniel, thank you so much for taking the time to talk to us. And it'll be great to be hearing more from NPR.
Daniel Jacobson: Great. Thank you so much.
Source: Planet MySQL