J-Lab: The Institute for Interactive Journalism

 


Transcript for
2005 Batten Symposium
and Awards for Innovations in Journalism

Sept. 12, 2005
National Press Club, Washington, D.C.

Adrian Holovaty
Lead Developer, Holovaty and Associates

First I’d like to thank J-Lab for supporting great journalism. It really warms the heart to see all these awesome projects.

This is chicagocrime.org. As it says on the top, it’s a freely browsable database of crimes reported in Chicago. What I’m going to do here is show you what it is, how we did it, what makes it different, and what kind of wacky things we have planned for the future.

When you go to chicagocrime.org it’s a database of crimes reported in Chicago, but because there’s so much horrific stuff that happens in Chicago, it’s a gigantic database. When we set out to do this, the reason I did it was because it was an interesting challenge. I really like taking these gigantic pieces of interesting information and making them very digestible by common people.

So the philosophy of this site is that there’s so much information that is so valuable but until this site launched it was really hard to access. The police department makes it available but it’s really ugly and you have to deal with their unusable piece of garbage site.

So I launched this, with Wilson’s help, as an independent project because it was an interesting challenge, because I wanted to do a valuable public service and because I like hacking on stuff.

I’ll walk you through the site:

The Chicago Police Department releases all this stuff on its Web site and for every crime it releases a bunch of stuff: When the crime happened, what it was, whether any arrests were made, what type of location it was – whether it was at an ATM, barber shop, bowling alley – that kind of thing.

So instead of giving users a tremendously hard to use search interface and saying, “Tell me exactly what you want, put in your address, put in exactly what you want to see,” we decided to just make it browsable. And not just by one thing, but by every possible way you might want to look at this information. So that’s what you see on the home page.

Let’s browse by crime type. Here’s a list of all the types in the system, and let’s take a look at motor vehicle theft/motorcycles, and click on that. Here are the latest reported motorcycle, scooter and motorbike thefts in the city of Chicago. This is what we refer to as a crime type detail page. Every type of crime has this page. Everything is dynamic, so everything is automated, and it displays the latest 25 or 30 crimes.

Here they’re listed on the left, and over here they’re listed on the map. This uses Google Maps technology to embed the maps in the page, and because it’s Google Maps, I can pan this thing around and zoom in and I can click on a crime to get more information about it.

This is the detail page for an individual crime. It tells you what it is, the crime categorization, where it happened, what zip code, what ward it was, whether any arrests were made, whether it was domestic, the police district and the police beat. So this represents everything we know about this crime. And because we have the address, we’re able to geo-code it and put it on the map. The police department only releases block level data for privacy reasons, so this point represents the start of the block. Also, because this is a Google Map, you can toggle between satellite and map, and you can zoom in directly on that crime. And there’s the new hybrid mode that Google Maps released that’s a satellite map with the street names right over it.
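A minimal sketch of how a block-level address could be turned into the "start of the block" address before geocoding. It assumes the source data masks the last digits of the house number with X characters (for example "008XX E 63RD ST"); that format and the function name `block_start_address` are assumptions for illustration, not the site's actual code.

```python
import re


def block_start_address(masked: str) -> str:
    """Convert a masked block address into the first address on that block.

    Assumed input format: digits, then one or more 'X', then the street name,
    for example '008XX E 63RD ST'.
    """
    match = re.fullmatch(r"(\d+)(X+)\s+(.+)", masked.strip())
    if match is None:
        raise ValueError(f"not a masked block address: {masked!r}")
    digits, mask, street = match.groups()
    # Replace each masked digit with 0 so the point lands at the block's start.
    number = int(digits + "0" * len(mask))
    return f"{number} {street}"


if __name__ == "__main__":
    print(block_start_address("008XX E 63RD ST"))
```

The resulting string is what would be handed to a geocoder, which is why every crime on a block would plot at the same point.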

We looked at the type page, now let’s look at the block page. I’ll click on 63rd Street, and this is all the blocks on 63rd Street with how many crimes are on each block. Let’s look at 800 East because there were 19 crimes – and the time span we’re dealing with here is the last three months because that’s what the police department offers. So here’s the page that represents everything we know about this block, 800 East 63rd Street. It tells you what beat it’s in, what district, zip code, ward, and the latest reported crimes. All sorts of everything happens here.

You’ll notice an RSS feed link is available here. We really wanted to get people involved and provide a public service, and the thing you are most interested in is obviously your own address. So you can click on RSS here and it’s going to look really ugly, but if you’re familiar with RSS you know that you can subscribe to this on your news reader and get notified every time there’s a crime that happens on your block. It’s easy to make RSS feeds, but what you don’t want is to make RSS feeds that are just huge data dumps. So we don’t have an RSS feed for every assault because there are hundreds of assaults that happen every day, so if you had an RSS feed of that it would be unmanageable. We have RSS feeds for every block of the city and every police beat because those are smaller, more manageable data sets.
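A rough sketch of a per-block RSS 2.0 feed built with Python's standard library. The function name `block_feed`, the crime dictionary keys, and the `http://example.com` URLs are placeholders assumed for illustration; the talk does not describe how the site generates its feeds.

```python
import xml.etree.ElementTree as ET
from datetime import datetime
from email.utils import format_datetime


def block_feed(block_name, crimes, site="http://example.com"):
    """Build a small RSS 2.0 document for the crimes on one block."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"Crimes on {block_name}"
    ET.SubElement(channel, "link").text = site
    ET.SubElement(channel, "description").text = (
        f"Latest reported crimes on {block_name}"
    )
    for crime in crimes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = crime["type"]
        ET.SubElement(item, "link").text = f"{site}/crimes/{crime['id']}/"
        # RSS expects RFC 822 style dates; format_datetime produces that form.
        ET.SubElement(item, "pubDate").text = format_datetime(crime["when"])
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    sample = [
        {"id": 1, "type": "THEFT", "when": datetime(2005, 7, 8, 22, 15)},
        {"id": 2, "type": "BATTERY", "when": datetime(2005, 7, 8, 23, 5)},
    ]
    print(block_feed("800 E 63RD ST", sample))
```

Scoping the feed to one block or beat keeps each feed to a handful of items, which is the point made above about avoiding huge data dumps.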

You can also click on all the crimes within two blocks, four blocks or eight blocks, so let’s look at all the crimes within four blocks. There’s those, and of course every map is movable and draggable.

Let’s go into some more interesting stuff. Of course you can navigate by date, so let’s go into July 8. Here are the crimes separated by time, so let’s look at 10 p.m. That’s everything that happened in the city of Chicago from 10 p.m. to 11 p.m. on July 8.

Again, the idea is that everything that can be a link should be a link. Like on this page, you don’t know if someone’s going to want to click on time, you don’t know whether they’re going to want to drill down to the other dates on the crime detail page, you don’t know what part of information they’re interested in. Say I live on 6400 South Martin Luther King, I might want to click on that to see the other crimes on the block, or I might be interested in crimes that are criminal trespasses. Everything that can be a link should be a link. That’s really the philosophy here, and it really makes for a very “sticky” site.

I don’t really look at the traffic numbers a lot, but the one time I did look at numbers the average length of a person’s visit on this site was something like eight or 10 minutes, and generally in the industry that number is like 30 seconds because most people just go to the home page then go on. But this thing is really engrossing.

You can also browse by police district. A lot of people don’t know their police district, so I added this thing here where you zoom in on your home and there is a little crosshairs and you can click “Guess District.” It’s centered on beat 19-33 in district 19. Click on beat 19-33, and here’s the latest crimes that happened in that beat, and of course I can get the RSS feed.
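One way a "Guess District" lookup could work is a point-in-polygon test on the map's center point. The sketch below uses the standard ray-casting method; the talk does not say how the site does it, and the boundary data, coordinate layout, and function names here are assumptions.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is (x, y) inside the polygon given as (x, y) vertices?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross the ray.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def guess_district(x, y, districts):
    """Return the name of the first district whose boundary contains the point."""
    for name, polygon in districts.items():
        if point_in_polygon(x, y, polygon):
            return name
    return None


if __name__ == "__main__":
    # Hypothetical square boundaries, purely for demonstration.
    districts = {
        "19": [(0, 0), (2, 0), (2, 2), (0, 2)],
        "20": [(2, 0), (4, 0), (4, 2), (2, 2)],
    }
    print(guess_district(1, 1, districts))
```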

We do some aggregate stuff like the most common crimes for this beat, and it looks like theft is the most common crime here, followed by battery. Battery is always common, I’ve found.
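The "most common crimes for this beat" figure amounts to a frequency count. A minimal sketch, assuming each crime is a dictionary with hypothetical `type` and `beat` keys:

```python
from collections import Counter


def most_common_crimes(crimes, beat, limit=5):
    """Count crime types within one beat and return the most frequent ones."""
    counts = Counter(c["type"] for c in crimes if c["beat"] == beat)
    return counts.most_common(limit)


if __name__ == "__main__":
    sample = [
        {"type": "THEFT", "beat": "1933"},
        {"type": "BATTERY", "beat": "1933"},
        {"type": "THEFT", "beat": "1933"},
        {"type": "ASSAULT", "beat": "0312"},
    ]
    print(most_common_crimes(sample, "1933"))
```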

You can browse by ward. You can find your ward if you don’t know it, and it works the same way as the police district location tool.

You can browse by zip code, and if you click on an area it displays the border of that zip code. This is interesting because that actually uses the line that Google uses for driving directions. This is a hack, so instead of using the line for driving directions, it uses it to display the border around a zip code.

There’s a city map view for power users interested in a certain sub-set of crimes. Let’s look at all the crimes that happened in a bowling alley, update the map, and there’s only two. Let’s look at all the ATM crimes. There are 44 ATM crimes, and I can narrow those down to all the assaults that happened at an ATM – just one. And I can click on that and I get a balloon that tells me more about it.
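The narrowing shown on the city map view, first by location type and then by crime type, can be sketched as a filter with optional criteria. The record keys and the function name `filter_crimes` are assumed for illustration.

```python
def filter_crimes(crimes, location_type=None, crime_type=None):
    """Keep crimes matching every criterion that was supplied."""
    return [
        c
        for c in crimes
        if (location_type is None or c["location_type"] == location_type)
        and (crime_type is None or c["type"] == crime_type)
    ]


if __name__ == "__main__":
    sample = [
        {"type": "THEFT", "location_type": "ATM"},
        {"type": "ASSAULT", "location_type": "ATM"},
        {"type": "THEFT", "location_type": "BOWLING ALLEY"},
    ]
    atm = filter_crimes(sample, location_type="ATM")
    print(len(atm))
    print(filter_crimes(atm, crime_type="ASSAULT"))
```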

We just added a couple weeks ago crimes along a route. Say you walk to the L station every day at 5 p.m. So I’ll do my walk to the L, and what you do is pan and zoom the map using Google’s very pretty interface, and you click on it and draw a line, then you click show crimes on route. I can optionally filter that to just show me the crimes that happen on the street, and there are all the street crimes that happen along the route. It’s kind of fuzzy because we only know the block number, we don’t know the exact address.
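A sketch of one way to find crimes along a drawn route: measure each crime's distance to every segment of the line and keep those within a tolerance. It assumes coordinates have already been projected onto a flat plane in consistent units, and the names `distance_to_segment`, `crimes_on_route`, and the `xy` key are hypothetical. The tolerance is where the block-level fuzziness mentioned above would be absorbed.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest distance from point p to the line segment a-b (planar coordinates)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def crimes_on_route(crimes, route, tolerance):
    """Return crimes whose point lies within `tolerance` of any route segment."""
    segments = list(zip(route, route[1:]))
    return [
        crime
        for crime in crimes
        if any(distance_to_segment(crime["xy"], a, b) <= tolerance for a, b in segments)
    ]


if __name__ == "__main__":
    route = [(0, 0), (10, 0), (10, 10)]
    crimes = [{"id": 1, "xy": (5, 1)}, {"id": 2, "xy": (5, 5)}, {"id": 3, "xy": (11, 5)}]
    print([c["id"] for c in crimes_on_route(crimes, route, tolerance=2)])
```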

Audience Question:
Talk about the genesis of this project.

Holovaty:
I was here in D.C. at a computer programming conference because I’m a big geek and I was looking for something interesting to do. I stumbled upon the Chicago Police Department’s Web site and they had all this information freely available in a very ugly interface. Every night at 11 I use what’s called a screen scraper, which is an automated program that goes to their Web site, grabs everything out of their system and pulls it to here. So really it runs itself, and I just check it every other day to make sure it’s working.
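A minimal sketch of the parsing half of a screen scraper, using Python's standard `html.parser`. The table layout in the example is invented; the police department's real page structure is not described in the talk. A nightly job would fetch the pages first and then feed the HTML to something like this.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip header rows that contain no <td> cells
                self.rows.append(self._row)
            self._row = None


def scrape_rows(html):
    """Return a list of rows, each a list of cell strings."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


if __name__ == "__main__":
    page = (
        "<table><tr><th>Date</th><th>Type</th><th>Block</th></tr>"
        "<tr><td>07/08/2005 22:15</td><td>THEFT</td><td>008XX E 63RD ST</td></tr>"
        "</table>"
    )
    print(scrape_rows(page))
```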

Audience Question:
Are you franchising it to other areas?

Holovaty:
Yeah. I’m not doing those myself, but a bunch of other police departments have contacted us and said, “Can you do it?”

Wilson Miner
Designer, chicagocrime.org

I just wanted to talk a little more about how we put the project together because we work in journalism and I think there are some lessons for how we package information as journalists online.

Adrian came up with the idea to do this because it was in his area and the information was available but the site sucked, so he wanted to make it accessible so he could use it and so other people could use it. We talked a little bit about what it might be and what it might look like but it was really informal and we didn’t talk about any requirements, we just said that we should make it better. We wanted to take this discrete set of information that we had and make it easy to access, so Adrian went off and did his thing and built it and made a lot of decisions along the way. But the key thing is that the person who had the idea and was also going to be the consumer of the information built the interface to the information. He created it and made all the decisions along the way based on how he wanted to access the information, so there was no communication barrier of having a team of people where one person is executing somebody else’s idea and then someone else is using it.

So then Adrian built the first prototype, which was essentially the first working version. We could have launched that day if we wanted to, but we both looked at it and moved things around and tweaked it.

It’s a dubious distinction to be the designer of a site like this because you look at it and it’s evident that there’s really no design involved. So we worked really hard on it and we ended up with this thing and the goal was to make it as simple as possible. I think with a site like this, where you have a really simple piece of information – and maybe you have a lot of that information as we do in this case – but you have one very discrete piece of information that you want to present as clearly and as informatively as possible, the goal is to reduce all of the complexity inherent in that information so that people can consume it. With a site like that, I think the first principle that the designer needs to take into it is to do as little design as possible. Don’t make it look too pretty, just make it simple and communicate it.

I had the advantage of starting with the working version that Adrian had and all I needed to do was tweak it and move it around. We started with the basic, necessary functionality; without these key features, the site doesn’t exist. We just started with that and only added what made the information clearer.

We added a map because the most obvious way to visualize something with a location as a key attribute is with a map. We were lucky to have the Google Maps tool available to do that, and it was very simple to do and we didn’t have to worry about the presentation.

Then, once we added all the things we thought made it clear, we went through and looked at it and took away what was unnecessary or confusing. We had originally looked at a lot of different aggregate statistics, because you look at a big body of statistics and the first thing you want to do is just say, “OK, what are the key things? Can we do the most popular crimes? The most frequent crimes? The most frequented location for crimes?” And what we realized when we had all that extra information in there was that a lot of it was meaningless because we’re dealing with a very limited time window. A lot of that aggregate information doesn’t tell you anything because you’re only limited to a small subset of data, so we took away everything that was unnecessary and doesn’t contribute to communicating the core information.

I think that process of starting with the very basic functionality and getting it there immediately, adding what you need to communicate more effectively, and then removing what’s confusing sounds exactly like the process of writing a news story or packaging a news feature. You’re looking at numbers, you’re looking at data or you’re looking at information, but it’s the same process as communicating a news story, and the key principle of news writing is be brief and communicate your idea as simply as possible. But within that you want to provide as much information as possible, so this is a great example of that. It’s very simple to get to the information, but there’s a lot of information there for you to access.

I think what makes chicagocrime.org different than the Chicago Police Department’s interface with the same data is that when you look at it, on the face of it they’re presenting exactly the same information with the same data and the visual presentation is not significantly different, but they give you a lot of options and ask you to enter information up front, and what if I don’t know what the intersection is or don’t care? So what I think we did – maybe not intentionally – is made these guided browsing links where every click in the site gets you closer to relevant information. There’s no barrier to entry as far as needing to know something before I get to information. I might know what I’m looking for, but I might just be interested in exploring, so there are thousands of facets to explore and there’s no wrong choice.


J-Lab™ is an incubator for innovative, participatory news experiments and a center of
American University's School of Communication in Washington, D.C.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 License.