
Content Analysis

The promise of Internet news is its availability, immediacy, interactivity and unlimited space. Which of these characteristics are really being taken advantage of?

To get an answer, the study conducted a content analysis of eight news Web sites, including the five most popular as measured by ratings. This content study suggests that the Internet has made marked progress in the last few years, but the degree to which it is fulfilling its potential varies widely.

Among the findings:

To examine the Internet, the Project looked at a range of Web sites throughout each day, rather than looking at sites once a day. We looked at eight sites – two from cable television (CNN and Fox), two associated with broadcast television networks (CBS News and MSNBC.com, which is affiliated with both MSNBC cable news and NBC), two Internet-only sites (Yahoo and AOL) and two newspaper sites (NYTimes.com for a large-circulation market and reviewjournal.com of The Las Vegas Review-Journal for a small-market newspaper). Altogether, 709 news articles were examined in four downloads a day on the eight sites over five days scattered over four months, or a total of 160 downloads. The study examined all articles on the front page tied to a graphic image, plus the next top three articles. It also noted the links within each article.
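The download count follows directly from the sampling design. As a minimal sketch of that arithmetic (the variable and function names are illustrative assumptions, and the per-download average is derived here rather than reported by the study):

```python
# Sampling design figures taken from the paragraph above.
SITES = ["CNN", "Fox", "CBS News", "MSNBC", "Yahoo", "AOL", "NYTimes", "LVRJ"]
DOWNLOADS_PER_DAY = 4      # 9 a.m., 1 p.m., 5 p.m. and 9 p.m.
DAYS = 5                   # five days scattered over four months
ARTICLES_EXAMINED = 709    # total articles coded


def total_downloads(n_sites: int, downloads_per_day: int, days: int) -> int:
    """One download per site, per scheduled visit, per sampled day."""
    return n_sites * downloads_per_day * days


downloads = total_downloads(len(SITES), DOWNLOADS_PER_DAY, DAYS)
# Derived figure (not stated in the study): average articles coded per download.
avg_articles_per_download = ARTICLES_EXAMINED / downloads

print(downloads)                            # 160
print(round(avg_articles_per_download, 1))  # 4.4
```

The average of roughly four to five articles per download is consistent with coding the articles tied to a graphic image plus the next three.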

Perhaps most distinctive were the differences among the outlets. While many critics complain that television network news all looks alike, that local television news is identical from city to city, or that the basics of newspaper writing differ little from paper to paper, that is not the case for the Internet. A series of visits to the Internet sites studied revealed an assortment of different styles and approaches.

CNN’s busy home page, for instance, typically features links to about 50 articles, plus content from sister organizations Time, Sports Illustrated and Fortune. The New York Times Web site typically has even more articles, 75 to 80, most of them staff written, yet little video. Yahoo, on the other hand, normally features just 20 articles. All of these articles are from wire services, but several include streaming video.

Whether this variety reflects a medium still trying to find a single successful model, or a medium that by its nature is likely to sustain more variety than the old media, is impossible to discern yet.

Originality of Reporting

How much original reporting occurs online? Based on the eight sites studied, Internet journalism on the major news sites is still largely a medium made up of second-hand material, usually from the old media.

Overall, only about a third (32 percent) of the lead articles on the sites studied were produced by the organization’s own staff. And much of that came from a few of the sites, particularly those from newspapers, posting articles from their print parents. Thus even most of this material was not original to the Web.

A larger percentage of lead pieces, 42 percent, were wire stories posted without any editing and produced by other sources, particularly The Associated Press and Reuters.

Nearly a quarter (23 percent) were wire stories that included enough editing or additional material that they carried a combined staff/wire credit line or byline.

It should be noted, however, that much of the time those additions or edits were so minimal that they were difficult to detect when these versions were compared with the original wire stories.

Story Origination
Percent of All Stories

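| Origin | AOL | CBS News | CNN | Fox | LVRJ | MSNBC | NYT | Yahoo |
|---|---|---|---|---|---|---|---|---|
| Staff | 0% | 9% | 75% | 13% | 78% | 17% | 72% | 0% |
| Wire & Staff | 0% | 82% | 11% | 35% | 0% | 48% | 0% | 0% |
| Wire | 100% | 9% | 15% | 53% | 22% | 23% | 25% | 98% |
| Other Org. | 0% | 0% | 0% | 0% | 0% | 13% | 3% | 2% |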

Totals may not equal 100 due to rounding.

Still, at least users could tell that someone at the Web site had edited these stories and made some attempt to check the material or compare it to other wires. This is a significant distinction. It means that the journalistic function of synthesizing and verifying had occurred.

The reliance on wires also meant a fair amount of repetition among sites. When a big story hits, one is likely to come across the same story on any number of Web sites. On the day of the Midwest blackout (August 15th), for example, a quote from a woman in Cleveland who was arriving at work in a T-shirt and shorts without having brushed her teeth was carried on five of the eight sites we studied. Web readers from 9 a.m. to 9 p.m. were well-informed of attorney Lori Zocolo’s inability to brush her teeth.

But not all sites are just wires. Basically the news sites studied fell into three categories when it comes to lead stories:

The type of medium sponsoring a site did not dictate where the site’s stories came from. The only exceptions were the two newspaper sites, which matched closely in origin of information. CNN was the only other site that took on the look of a newspaper. At all three of these, more than 70 percent of the articles were staff written.

The sites most likely to customize wires were a network television news site, CBSNews.com, and MSNBC.com, which is a joint venture of Microsoft and NBC. The CBS News site was dominated by wire/staff combination stories (82 percent). MSNBC.com was more of a hybrid – 48 percent staff/wire combinations, 17 percent original material, and 36 percent straight wire or material from other news organizations.

Finally, three sites were largely straight wire copy. Two of these, AOL and Yahoo, were the two Internet-only sites. Sometimes referred to as portals, they relied almost entirely on straight wire stories (100 percent at AOL, 98 percent at Yahoo), without any evidence of checking their veracity.

The third, Foxnews.com, whose sister cable channel produces continuous original content on television, did some original reporting, though not much (13 percent). About a third of its stories were wire/staff combinations (35 percent), but the majority of its lead stories were straight wires (53 percent).
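The three groupings described above can be recovered from the Story Origination table. The sketch below is illustrative only: it assumes a simple rule (assign each site to the origin category with its largest share), which is not necessarily how the study's authors drew the categories.

```python
# Percent of lead stories by origin, copied from the Story Origination table.
ORIGIN = {
    "AOL":      {"Staff": 0,  "Wire & Staff": 0,  "Wire": 100, "Other Org.": 0},
    "CBS News": {"Staff": 9,  "Wire & Staff": 82, "Wire": 9,   "Other Org.": 0},
    "CNN":      {"Staff": 75, "Wire & Staff": 11, "Wire": 15,  "Other Org.": 0},
    "Fox":      {"Staff": 13, "Wire & Staff": 35, "Wire": 53,  "Other Org.": 0},
    "LVRJ":     {"Staff": 78, "Wire & Staff": 0,  "Wire": 22,  "Other Org.": 0},
    "MSNBC":    {"Staff": 17, "Wire & Staff": 48, "Wire": 23,  "Other Org.": 13},
    "NYT":      {"Staff": 72, "Wire & Staff": 0,  "Wire": 25,  "Other Org.": 3},
    "Yahoo":    {"Staff": 0,  "Wire & Staff": 0,  "Wire": 98,  "Other Org.": 2},
}


def dominant_origin(shares: dict) -> str:
    """Hypothetical grouping rule: the origin with the largest share wins."""
    return max(shares, key=shares.get)


# Group sites by their dominant origin, preserving table order.
groups = {}
for site, shares in ORIGIN.items():
    groups.setdefault(dominant_origin(shares), []).append(site)

# MSNBC's "straight wire or other news organization" share quoted in the text.
msnbc_wire_or_other = ORIGIN["MSNBC"]["Wire"] + ORIGIN["MSNBC"]["Other Org."]

for label, sites in groups.items():
    print(label, sites)
print(msnbc_wire_or_other)  # 36
```

Under this rule CNN and the two newspaper sites fall in the staff-written group, CBS News and MSNBC in the wire/staff group, and AOL, Fox and Yahoo in the straight-wire group, matching the description in the text.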

Story Length, Staff vs. Wire
Percent of All Stories

| Words | Staff | Wire |
|---|---|---|
| 100 or Under | 0% | 0% |
| 101 to 500 | 12% | 27% |
| 501 to 1000 | 46% | 56% |
| More than 1000 | 42% | 17% |

Totals may not equal 100 due to rounding.

It should be noted that the study did not include some of the well-known sites, such as Salon or Slate/MSN, that produce original Internet content. An earlier study by the Project of political coverage examined these sites and found that Salon mixed the sharp attitude and tone of the Internet with solid reporting. Slate, on the other hand, leavened its opinion journalism with summaries of the day’s hottest political events in a kind of puckish tone, something like “Hotline for the Internet.”1 These sites, however, like blogs, are more analogous to elite journals of opinion in print. Their audiences are smaller than those studied here (except for the Las Vegas Review-Journal’s), and their content is more essay and argument than breaking news, which is the focus of this audit.

Does it matter that the Web, on balance, is still more a medium for getting news via wire stories? This means the Web site is entrusting the accuracy of the copy to someone else, a wire service. In the past, this has proved detrimental as false information gets passed on downstream.

In addition to the issue of verification, other differences stand out in the study. The wire stories were shorter than staff-written pieces and were also somewhat less likely to be updated with important new facts.

On the other hand, Web sites were somewhat more likely to post new wire stories than new staff-written pieces (55 percent of wire stories were new, versus 47 percent of staff-written pieces).

Story Freshness

Like cable television news, the Internet promises the ability to continuously update users with the latest turn in events.

The study wanted to determine how much new information news Web sites actually posted through the day. To do so, we checked every four hours to see what percent of the lead stories were altogether new, what percent were unchanged and what percent were in some way updated.2

There are, moreover, degrees of updating: Was there something substantively new to the stories, just some minor details added or was it a rewrite around a new angle?3

Story Freshness
Percent of All Stories

| Freshness | All Stories |
|---|---|
| Exact Repeat | 21% |
| Repeat: No New Substance | 14% |
| Repeat: New Angle | 2% |
| Repeat: New Substance | 14% |
| New Story | 49% |

Totals may not equal 100 due to rounding.

Overall, about half of all lead stories (49 percent) were entirely new through the course of the day. The figure drops if you discount the 9 a.m. stories, which were all coded as entirely new. For the remaining three visits each day, just 34 percent were entirely new.

Still, only 21 percent of lead stories were left unchanged.

The remaining 30 percent of lead stories involved some form of update of an existing story. Of these, however, only about half (or 14 percent overall) contained substantive new information. Another 14 percent involved just adding minor new details. And 2 percent were what journalists call a rewrite, the same basic story but rewritten around a new angle.

Looked at another way, about 63 percent of the stories on the Web sites studied were either altogether new or substantively updated through the course of the day.
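The percentages quoted in the last few paragraphs are sums over the Story Freshness table. A minimal sketch of that arithmetic, with dictionary keys taken from the table and variable names chosen here for illustration:

```python
# Percent of all lead stories, copied from the Story Freshness table.
FRESHNESS = {
    "Exact Repeat": 21,
    "Repeat: No New Substance": 14,
    "Repeat: New Angle": 2,
    "Repeat: New Substance": 14,
    "New Story": 49,
}

# Any kind of update to an existing story (the three "Repeat:" rows).
updated = sum(v for k, v in FRESHNESS.items() if k.startswith("Repeat:"))

# Altogether new, or updated with substantive new information.
new_or_substantive = FRESHNESS["New Story"] + FRESHNESS["Repeat: New Substance"]

total = sum(FRESHNESS.values())

print(updated)             # 30
print(new_or_substantive)  # 63
print(total)               # 100
```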

Is there a news cycle on the Internet? And is the news updated continuously in an even flow? Or does it change sharply toward the end of the day, after the close of business but a good 10 or 12 hours before the morning newspaper arrives?

Based on the eight sites examined, which included the five most popular news sites on the Web, the Internet still follows the news flow of morning newspapers to a significant degree.

The morning generally opens with new headlines and content. (For this study, all downloads and references to time are Eastern.) As the day wears on, new stories are less and less likely to appear. Indeed, the lowest percentage of new stories appeared between 5 p.m. and 9 p.m.

On the other hand, the number of stories that were substantively updated increased as the day wore on. Starting fresh at 9 a.m., 16 percent were updated by 1 p.m., 19 percent by 5 p.m. and 21 percent by 9 p.m. Thus, somewhere between 9 p.m. and 9 a.m. is when the Internet converts to a new news cycle.

Are some sites more or less likely to add new content than others? Yes, but it does not correlate to what kind of site they are or the nature of their content.

Story Freshness, by Outlet
Percent of All Stories

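| Freshness | AOL | CBS | CNN | Fox | LVRJ | MSNBC | NYT | Yahoo |
|---|---|---|---|---|---|---|---|---|
| Exact Repeat | 23% | 27% | 10% | 15% | 70% | 6% | 25% | 0% |
| Repeat: No New Substance | 17% | 13% | 15% | 11% | 0% | 24% | 4% | 27% |
| Repeat: New Angle | 0% | 6% | 3% | 1% | 0% | 0% | 0% | 4% |
| Repeat: New Substance | 14% | 13% | 22% | 11% | 0% | 21% | 11% | 20% |
| New Story | 47% | 41% | 51% | 63% | 30% | 49% | 59% | 49% |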

Totals may not equal 100 due to rounding.

Wire-heavy Yahoo and AOL, for instance, posted new stories less often than staff-driven NYTimes.com. Yet Foxnews.com, also mostly wire copy, was the most likely of the sites studied to post completely new material.

On the other hand, Fox was one of the least likely sites to offer substantive updates in its lead stories. Its share of substantive updates matched that of NYTimes.com, 11 percent, compared with more than 20 percent at cable siblings CNN and MSNBC.

Does the Web break stories? The study as constructed this year cannot answer that. It is something to watch in the future.
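These site-by-site comparisons can be checked against the Story Freshness, by Outlet table. The sketch below is illustrative; the tuple layout and variable names are assumptions made here, and only the "New Story" and "Repeat: New Substance" rows are used.

```python
# (new story %, substantive update %) per site, from the by-outlet table.
FRESHNESS_BY_SITE = {
    "AOL":   (47, 14),
    "CBS":   (41, 13),
    "CNN":   (51, 22),
    "Fox":   (63, 11),
    "LVRJ":  (30, 0),
    "MSNBC": (49, 21),
    "NYT":   (59, 11),
    "Yahoo": (49, 20),
}

# Site most likely to post completely new material.
most_new = max(FRESHNESS_BY_SITE, key=lambda s: FRESHNESS_BY_SITE[s][0])

# Sites where more than 20 percent of lead stories were substantive updates.
over_20_substantive = [s for s, (_, sub) in FRESHNESS_BY_SITE.items() if sub > 20]

# Fox and NYT share the same substantive-update rate.
fox_matches_nyt = FRESHNESS_BY_SITE["Fox"][1] == FRESHNESS_BY_SITE["NYT"][1]

print(most_new)             # Fox
print(over_20_substantive)  # ['CNN', 'MSNBC']
print(fox_matches_nyt)      # True
```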

Story Components

How multilayered are Internet stories? Do they take advantage of the unlimited space and ability to use multimedia components?

In the Project’s first study of the Internet, in 2000, fully a third of the political stories studied, including 100 percent of those on AOL and MSNBC, had no audio or visual links. Links even to external sites or additional information were minimal.

Today, the Internet has become more multifunctional and links to background information in particular have become commonplace. Sites still vary widely, however, and some kinds of multimedia links are still not the norm.

Overall:

The one thing all the sites seemed to do consistently was to link to their own archives so users could read past stories to get background on events. The proverbial newspaper “morgue” or clipping library is now more readily available to the public. These background links, moreover, were not restricted to old stories. They also included online-only features such as summaries of candidate positions, image galleries, profiles of people in the news and timelines of events.

When it came to multimedia elements, such as video or galleries of still photos, there were wide differences among sites.

Multimedia and Interactivity on the Internet
Percent of Stories With Link Present, By Site

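| Site | Video | Audio | Photo | Graphic | Archive | Chat/Feedback |
|---|---|---|---|---|---|---|
| Avg. Overall | 32% | 2% | 36% | 16% | 74% | 40% |
| AOL | 17% | 0% | 6% | 8% | 82% | 97% |
| CBS News | 62% | 3% | 39% | 4% | 92% | 7% |
| CNN | 44% | 4% | 32% | 28% | 60% | 1% |
| Fox | 27% | 7% | 61% | 42% | 72% | 6% |
| LVRJ | 1% | 1% | 1% | 6% | 45% | 0% |
| MSNBC | 52% | 0% | 44% | 26% | 89% | 99% |
| NYTimes | 15% | 10% | 24% | 9% | 72% | 25% |
| Yahoo | 29% | 1% | 73% | 1% | 79% | 99% |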

In general, broadcast sites were the most likely to have links to videos or still photos in their lead stories. Cable sites were more likely to have map or graphic links. Internet-only sites as well as MSNBC.com were more likely to have a place for user feedback.

Sites also varied widely in how often they linked to outside sources. Almost every story on Yahoo had an external link, as did more than half of the stories at MSNBC.com (58 percent) and Foxnews.com (51 percent). Only 6 percent of NYTimes.com lead stories had external links, compared with 13 percent at CBS News and 12 percent at AOL.

How much opportunity did sites offer users to react to stories? In all, 40 percent of stories studied had some way for users to offer feedback or participate in online discussions about the topic. But the findings varied widely and had no correlation to the kinds of stories, the nature of the parent organization or the popularity of the site.
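How uneven the feedback figures are can be seen from the Chat/Feedback column of the multimedia table. This is a rough sketch under stated assumptions: the 90 percent threshold is arbitrary, and the simple mean treats every site equally, whereas the study's 40 percent overall figure is a share of all stories studied, so the two need not agree.

```python
# Percent of lead stories with a chat or feedback link, from the table above.
FEEDBACK = {
    "AOL": 97, "CBS News": 7, "CNN": 1, "Fox": 6,
    "LVRJ": 0, "MSNBC": 99, "NYTimes": 25, "Yahoo": 99,
}

# Sites where nearly every story offered feedback (threshold is an assumption).
near_universal = sorted(s for s, pct in FEEDBACK.items() if pct >= 90)

# Gap between the highest and lowest site, in percentage points.
spread = max(FEEDBACK.values()) - min(FEEDBACK.values())

# Unweighted mean across sites; not the same statistic as the reported 40 percent.
site_mean = sum(FEEDBACK.values()) / len(FEEDBACK)

print(near_universal)  # ['AOL', 'MSNBC', 'Yahoo']
print(spread)          # 99
print(site_mean)       # 41.75
```

The near-universal group is the two Internet-only sites plus MSNBC.com, as the text notes; the remaining five sites all sit at 25 percent or below.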

Story Length

One question about the Internet is how long stories can be. On one hand, the Internet has the potential for infinite depth. On the other, some people question whether users can absorb long stories onscreen.

Is there an optimal length? Do sites vary much when it comes to this question?

Overall, the lead stories on the Web were shorter than newspaper front-page articles and print magazine cover pieces. Internet lead stories generally ran between 500 and 1,000 words (54 percent), while newspaper front-page articles tended to run to more than 1,000 words (54 percent of front-page articles), as did magazine cover stories (42 percent).

There seems to be no accepted norm among the sites studied when it comes to length of lead stories. Moreover, the differences did not seem to be tied to their originating news medium. At MSNBC.com, for instance, nearly half of all stories were more than 1,000 words.

Some sites broke up long stories by dividing them into multiple pages. In the middle of the story, then, users must click the “next” button again and again to read on. Thus, while space is indeed unlimited, the amount of work the user has to do can often increase with length.

Sites also varied widely in how many short stories they carried. AOL and Fox ran a lot of short stories, under 500 words – nearly a third of those at AOL (31 percent) and more than a quarter at Fox (28 percent). At CBSNews.com, meanwhile, only 18 percent were that short.

Story Length, By Site
Percent of Stories, By Words

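| Site | 101-500 | 501-1000 | 1001+ |
|---|---|---|---|
| AOL | 31% | 56% | 14% |
| CBS News | 18% | 75% | 7% |
| CNN | 17% | 59% | 25% |
| Fox | 28% | 45% | 26% |
| LVRJ | 6% | 53% | 40% |
| MSNBC | 14% | 38% | 47% |
| NYTimes | 19% | 43% | 38% |
| Yahoo | 13% | 57% | 30% |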

Totals may not equal 100 due to rounding.

Sourcing

In general, 39 percent of the 709 stories studied on the Internet contained anonymous sourcing, roughly double that of weekly news magazines, and more than newspapers overall, but less than newspaper front pages (45 percent of front-page articles had anonymous sources).

Six in ten stories (59 percent) had the highest level of source transparency – at least four sources named with some attempt to describe the source’s potential biases or point of view.

At least in lead stories, sourcing seems pretty strong. It is important to note that much of this sourcing arrives second hand, from wire services and, as mentioned above, much has not been verified by the Web site itself. Still, some of the wire copy seems the best sourced of all. MSNBC.com, for instance, which relies heavily on edited wire copy, was the most likely to run stories with four or more fully identified sources (71 percent of stories). CNN, with more original reporting, was the least likely to run stories with this highest level of fully identified sources (45 percent).

Lead Story Topics

What is the top news agenda of Internet news sites and how does it differ from other media?

Here our answers are more limited than for some other media. To focus on the broader character of the Web – the level of interactivity, updating and other features – the study chose to look at a limited number of days but to examine those days in detail every four hours. Given that, there is little we can say quantitatively about the news agenda of the Web sites analyzed.

What we can offer is more impressionistic. Here, we were left with the sense that in its lead stories the Internet is more tied to traditional news topics than is often the case in cable and broadcast television, which have become somewhat more infotainment or tabloid in flavor. Web managers may be conscious that a sizable part of their audience is accessing their sites during the day at work. On the other hand, the infinite space on the Internet gives it a luxury that television does not have. It can cover everything. It is also, at this point, a less visual medium than television.

Perhaps given those similarities to print, the top stories on the Internet, at least according to our limited sample, share more in common with newspapers than with television.

But the Internet is more than print online. The degree to which the differences are developed and used varies from site to site, and it does not appear to be a matter of some sites having interactivity and others not. Rather, different sites seem to have chosen certain kinds of interactivity over others.


Footnotes

1. “ePolitics: A study of the 2000 Presidential Campaign on the Internet,” Project for Excellence in Journalism, April 10, 2000.

2. For this study, 9 a.m. was considered the starting point of a day. Therefore, while it could not definitively be stated that all 9 a.m. stories are new, they were categorized as such for the purposes of this analysis.

3. The study visited sites every four hours at four different times during the day: 9 a.m., 1 p.m., 5 p.m. and 9 p.m.