Web Literacy Textbook

by Michael A. Caulfield

Web Literacy Textbook by Mike Caulfield is licensed under a Creative Commons Attribution 4.0 International License, except where otherwise noted.

Cover image: “building-blocks-colorful-build-456616” by Counselling. Pixabay License.

Acknowledgments

1

Sincere thanks to

Jon Udell, who introduced me to the idea of web strategies;

Ward Cunningham, who taught me the culture of wiki;

Sam Wineburg, whose encouragement and guidance helped me focus on the bits that mattered;

AASCU’s American Democracy Project, which believed in this work;

and most importantly, my wife and family who tolerate the coffee shop weekends from which all great (and even mediocre) books are made.

Four Moves and a Habit

I

Why This Book?

1

Mike Caulfield

The web is a unique terrain, substantially different from print materials. Too often, attempts at teaching information literacy for the web do not take into account both the web’s unique challenges and its unique affordances.

Much web literacy I’ve seen either asks students to look at web pages and think about them, or teaches them to publish and produce things on the web. While both of these activities are valuable, neither addresses a set of real problems students confront daily: evaluating the information that reaches them through their social media streams. For these daily tasks, students need concrete strategies and tactics for tracing claims to sources and for analyzing the nature and reliability of those sources.

The web gives us many such strategies, tactics, and tools, which, properly used, can get students closer to the truth of a statement or image within seconds. Unfortunately, we do not teach students these specific techniques. As many people have noted, the web is both the largest propaganda machine ever created and the most amazing fact-checking tool ever invented. But if we haven’t taught our students those fact-checking capabilities, is it any surprise that propaganda is winning?

This is an unabashedly practical guide for the student fact-checker. It supplements generic information literacy with the specific web-based techniques that can get you closer to the truth on the web more quickly.

This guide will show you how to use date filters to find the source of viral content, how to assess the reputation of a scientific journal in less than five seconds, and how to see if a tweet is really from the famous person you think it is or from an impostor. It’ll show you how to find pages that have been deleted, how to figure out who paid for the website you’re looking at, and how to check whether the weather portrayed in that viral video actually matches the weather in that location on that day. It’ll show you how to check a Wikipedia page for recent vandalism and how to search the text of almost any printed book to verify a quote. It’ll teach you to parse URLs and scan search result blurbs so that you are more likely to get to the right result on the first click. And it’ll show you how to avoid baking confirmation bias into your search terms.

In other words, this guide will help you become “web literate” by showing you the unique opportunities and pitfalls of searching for truth on the web. Crazy, right?

This is the instruction manual to reading on the modern internet. I hope you find it useful.

 

 

Four Moves

2

Mike Caulfield

What people need most when confronted with a claim that may not be 100% true is things they can do to get closer to the truth. They need something I have decided to call “moves.”

Moves accomplish intermediate goals in the fact-checking process. They are associated with specific tactics. Here are the four moves this guide will hinge on:

  1. Check for previous work
  2. Go upstream
  3. Read laterally
  4. Circle back

In general, you can try these moves in sequence. If you find success at any stage, your work might be done.

When you encounter a claim you want to check, your first move might be to see if sites like Politifact, or Snopes, or even Wikipedia have researched the claim (Check for previous work).

If you can’t find previous work on the claim, start by trying to trace the claim to the source. If the claim is about research, try to find the journal it appeared in. If the claim is about an event, try to find the news publication in which it was originally reported (Go upstream).

Maybe you get lucky and the source is something known to be reputable, such as the journal Science or the newspaper the New York Times. Again, if so, you can stop there. If not, you’re going to need to read laterally, finding out more about this source you’ve ended up at and asking whether it is trustworthy (Read laterally).

And if at any point you fail–if the source you find is not trustworthy, complex questions emerge, or the claim turns out to have multiple sub-claims–then you circle back, and start a new process. Rewrite the claim. Try a new search of fact-checking sites, or find an alternate source (Circle back).
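The sequence above can be read as a small decision procedure. What follows is only a sketch of that reading: the function name, its three yes/no/unknown inputs, and the returned labels are assumptions introduced for illustration, and real fact-checking involves far more judgment than three flags.

```python
from typing import Optional

def next_move(previous_work_found: Optional[bool],
              source_found: Optional[bool],
              source_trusted: Optional[bool]) -> str:
    """Suggest the next move. None means 'not tried or not known yet'."""
    # Move 1: has a fact-checking site or wiki already researched the claim?
    if previous_work_found is None:
        return "Check for previous work"
    if previous_work_found:
        return "Done"
    # Move 2: trace the claim to the journal or news outlet it came from.
    if source_found is None:
        return "Go upstream"
    if not source_found:
        return "Circle back"
    # Move 3: if the source's reputation is unknown, investigate it.
    if source_trusted is None:
        return "Read laterally"
    # Move 4: an untrustworthy source sends us back to the start.
    return "Done" if source_trusted else "Circle back"

print(next_move(False, True, None))   # Read laterally
```

The point of writing it this way is the early exits: success at any stage ends the work, and failure at any stage leads back to a rewritten claim and a fresh search.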

 

 

 

Building a Fact-Checking Habit by Checking Your Emotions

3

Mike Caulfield

In addition to the moves, I’ll introduce one more word of advice: Check your emotions.

This isn’t quite a strategy (like “go upstream”) or a tactic (like using date filters to find the origin of a fact). For lack of a better word, I am calling this advice a habit.

The habit is simple. When you feel strong emotion–happiness, anger, pride, vindication–and that emotion pushes you to share a “fact” with others, STOP. Above all, these are the claims that you must fact-check.

Why? Because you’re already likely to check things you know are important to get right, and you’re predisposed to analyze things that put you in an intellectual frame of mind. But things that make you angry or overjoyed, well… our record as humans is not good with these things.

As an example, I’ll cite this tweet that crossed my Twitter feed:

Tweet. See description.

Figure 1

You don’t need to know much of the background of this tweet to see its emotionally charged nature. President Trump had insulted Chuck Schumer, a Democratic Senator from New York, and characterized the tears that Schumer shed during a statement about refugees as “fake tears.” This tweet reminds us that Senator Schumer’s great-grandmother died at the hands of the Nazis, which could explain Schumer’s emotional connection to the issue of refugees.

Or does it? Do we actually know that Schumer’s great-grandmother died at the hands of the Nazis? And if we are not sure this is true, should we really be retweeting it?

Our normal inclination is to ignore verification needs when we react strongly to content, and researchers have found that content that causes strong emotions (both positive and negative) spreads the fastest through our social networks. (See "What Emotion Goes Viral the Fastest?" by Matthew Shaer.) Savvy activists and advocates take advantage of this flaw of ours, getting past our filters by posting material that goes straight to our hearts.

Use your emotions as a reminder. Strong emotions should become a trigger for your new fact-checking habit. Every time content you want to share makes you feel rage, laughter, ridicule, or even a heartwarming buzz, spend 30 seconds fact-checking.  It will do you well.

1. Look for Previous Work

II

How to Use Previous Work

4

Mike Caulfield

When fact-checking a particular claim, quote, or article, the simplest thing you can do is to see if someone has already done the work for you.

This doesn’t mean you have to accept their finding. Maybe they assign a claim “four Pinocchios,” but you would rate it three. Maybe they find the truth “mixed,” but honestly it looks “mostly false” to you.

Regardless of the finding, a reputable fact-checking site or subject wiki will have done much of the legwork for you: tracing claims to their source, identifying the owners of various sites, and linking to reputable sources for counterclaims. And that legwork, no matter what the finding, is probably worth ten times your intuition. If the claims and the evidence they present ring true to you, or if you have built up a high degree of trust in the site, then you can treat the question as closed. But even if you aren’t satisfied, you can start your work from where they left off.

Constructing a Query to Find Previous Fact-Checking

You can find previous fact-checking by using the “site” option in search engines such as Google and DuckDuckGo to search known and trusted fact-checking sites for a given phrase or keyword. For example, if you see this story,

Online Article.

Figure 2

then you might use a query that checks a couple of known fact-checking sites for the keywords obama iraqi visa ban 2011. Let’s use the DuckDuckGo search engine to run it:

obama iraqi visa ban 2011 site:snopes.com site:politifact.com

Here are the results of our search:

Screenshot of DuckDuckGo search results. The top results are from fact-checking sites Snopes and Politifact.

Figure 3

You can see the search here. The results show that work has already been done in this area. In fact, the first result from Snopes answers our question almost fully. Remember to follow best search engine practice: scan the results and focus on the URLs and the blurbs to find the best result to click in the returned result set.

There are similar syntaxes you can use in Google, but for various reasons this particular search is easier in DuckDuckGo.
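If you build this kind of query often, it can be assembled mechanically. The sketch below is a minimal illustration, assuming that the "site" terms are simply appended to the keywords as in the query shown earlier; the function names are made up for this example, and the only outside fact relied on is that DuckDuckGo accepts a search through the `q` parameter of its web address.

```python
from urllib.parse import urlencode

def fact_check_query(keywords, sites):
    """Append one site: term per fact-checking site to the keywords."""
    site_terms = " ".join(f"site:{site}" for site in sites)
    return f"{keywords} {site_terms}"

def duckduckgo_url(query):
    """Turn a query string into a DuckDuckGo search address."""
    return "https://duckduckgo.com/?" + urlencode({"q": query})

query = fact_check_query("obama iraqi visa ban 2011",
                         ["snopes.com", "politifact.com"])
print(query)
print(duckduckgo_url(query))
```

Adding a site to the list is then a one-word change, which makes it easy to keep a personal list of trusted fact-checkers and reuse it for every claim.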

Let’s look at another claim, this time from the President. This claim is that police officer deaths increased 56 percent from 2015 to 2016. Here it is in context:

Excerpt of Trump Speech on web page.

Figure 4

Let’s ramp it up with a query that checks four different fact-checking sites:

officer deaths 2016 increased 56 percent from 2015 site:factcheck.org site:snopes.com site:politifact.com site:www.washingtonpost.com/news/fact-checker/

This gives us back a helpful array of results. The first, from the Washington Post, actually answers our question directly, but some of the others provide some helpful context as well.

DuckDuckGo search results.

Figure 5

Going to the Washington Post lets us know that this claim is, for all intents and purposes, true. We don’t need to go further, unless we want to.

Fact-checking Sites

5

Mike Caulfield

Some Reputable Fact-Checking Organizations

The following organizations are generally regarded as reputable fact-checking organizations focused on U.S. national news:

Respected specialty sites cover niche areas such as climate or celebrities. Here are a few examples:

There are many fact-checking sites outside the U.S. Here is a small sample:

 

Wikipedia

6

Mike Caulfield

Wikipedia is broadly misunderstood by faculty and students alike. While Wikipedia must be approached with caution, especially with articles that are covering contentious subjects or evolving events, it is often the best source to get a consensus viewpoint on a subject. Because the Wikipedia community has strict rules about sourcing facts to reliable sources, and because authors must adopt a neutral point of view, its articles are often the best available introduction to a subject on the web.

The focus on sourcing all claims has another beneficial effect. If you can find a claim expressed in a Wikipedia article, you can almost always follow the footnote on the claim to a reliable source. Scholars, reporters, and students can all benefit from using Wikipedia to quickly find authoritative sources for claims.

As an example, consider a situation where you need to source a claim that the Dallas 2016 police shooter was motivated by hatred of police officers. Wikipedia will summarize what is known about his motives and, more importantly, will source each claim, as follows:

Chief Brown said that Johnson, who was black, was upset about recent police shootings and the Black Lives Matter movement, and “stated he wanted to kill white people, especially white officers.”[4][5] A friend and former coworker of Johnson’s described him as “always [being] distrustful of the police.”[61] Another former coworker said he seemed “very affected” by recent police shootings of black men.[64] A friend said that Johnson had anger management problems and would repeatedly watch video of the 1991 beating of Rodney King by police officers.[85]

Investigators found no ties between Johnson and international terrorist or domestic extremist groups.[66]

Each footnote leads to a source that the community has deemed reliable. The article as a whole contains over 160 footnotes. If you are researching a complex question, starting with the resources and summaries provided by Wikipedia can give you a substantial running start on an issue.
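When a passage carries many footnotes, it can help to list the numbers before you start clicking. This is a small sketch, not a Wikipedia tool: it assumes footnotes appear as bracketed numbers in copied text, as they do in the excerpt above, and the function name is invented for the example.

```python
import re

def footnote_numbers(passage):
    """Return the bracketed footnote numbers in a passage, in order, without repeats."""
    numbers = []
    for match in re.findall(r"\[(\d+)\]", passage):
        number = int(match)
        if number not in numbers:
            numbers.append(number)
    return numbers

excerpt = ('"especially white officers."[4][5] A friend described him as '
           '"always [being] distrustful of the police."[61]')
print(footnote_numbers(excerpt))   # [4, 5, 61]
```

Note that the editorial bracket around "being" is skipped, because only brackets containing digits are treated as footnotes.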

2. Go Upstream

III

Go Upstream to Find the Source

7

Mike Caulfield

Our second move, after finding previous fact-checking work, is to “go upstream.”  We use this move if previous fact-checking work was insufficient for our needs.

What do we mean by “go upstream”?

Consider this claim on the conservative site the Blaze:

Headline of an online article

Figure 6

Is this claim true?

Of course we can check the credibility of this article by considering the author, the site, and when it was last revised. We’ll do some of that, eventually. But it would be ridiculous to do it on this page. Why? Because like most news pages on the web, this one provides no original information. It’s just a rewrite of an upstream page. We see the indication of that here:

Text from Daily Dot, with sentences highlighted.

Figure 7

All the information here has been collected, fact-checked (we hope!), and written up by the Daily Dot. It’s what we call “reporting on reporting.” There’s no point in evaluating the Blaze’s page.

So what do we do? Our first step is to go upstream. Go to the original story and evaluate it. When you get to the Daily Dot, then you can start asking questions about the site or the source. And it may be that for some of the information in the Daily Dot article you’d want to go a step further back and check their primary sources. But you have to start there, not here.
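One rough way to spot candidate upstream pages is to list the links in an article that point away from the site you are on. The sketch below uses only Python's standard library; the sample HTML and the `.example` addresses are invented for illustration, and a link leaving the site is only a candidate, so you still have to read the page to see which link is the real source.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

def outbound_links(html, page_host):
    """Return links whose host differs from the page's own host."""
    collector = LinkCollector()
    collector.feed(html)
    found = []
    for href in collector.links:
        host = urlparse(href).netloc.lower()
        # Relative links have no host, so they stay on the same site.
        if host and host != page_host.lower() and href not in found:
            found.append(href)
    return found

# Hypothetical article fragment from a site that rewrites another site's story.
sample = ('<p>According to <a href="https://upstream.example/story">a report</a>, '
          'the claim spread quickly. <a href="/about">About us</a> '
          '<a href="https://rewrite.example/more">More</a></p>')
print(outbound_links(sample, "rewrite.example"))
```

This simple comparison treats "www.rewrite.example" and "rewrite.example" as different hosts, which is one of several reasons to treat the output as a starting list rather than an answer.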

Identifying Sponsored Content

8

Mike Caulfield

Our warning to “go upstream” before evaluating claims is particularly important with sponsored content. For instance, a lot of the time on a site you’ll see “headlines” like these, which I pulled from a highly regarded technology magazine:

A screenshot of a section of a page from NetworkWorld.

Figure 8

Look at the headline in the upper left corner. Are lawmakers really concerned about this insane military scope? Maybe. But note that Network World is not making this claim. Instead, the ZeroTac Tactical Scope company is making the claim:

A closeup of the NetworkWorld page

Figure 9

It’s an ad served from another site into this page in a way that makes it look like a story.

However, sponsored content isn’t always purely an advertisement. Sometimes it provides helpful information. This piece below, for example, is an in-depth look at some current industry trends in information technology.

An article in InfoWorld about advances in integrated systems.

Figure 10

The source of this article is not InfoWorld, but the technology company Hewlett Packard, and the piece is written by a Vice President of Hewlett Packard, with no InfoWorld oversight. (Keep an eye out on the web for articles that have a “sponsored” indicator above or below them–they are more numerous than you might think!)

You can see how this is not just an issue with political news, but will be an issue in your professional life as well. If you go to work in a technology field and portray this article to your boss as “something I read on InfoWorld”, you’re doing a grave disservice to your company. Portraying a vendor-biased perspective as a neutral InfoWorld perspective is a mistake you might come to regret.
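If you have the text of a page in hand, a crude first pass is to look for the labels that publishers attach to paid material. The sketch below is an illustration only: the list of labels is my assumption, not a standard, and sites word these indicators in many different ways, so an empty result proves nothing.

```python
import re

# Assumed examples of disclosure labels; real sites use many variants.
SPONSOR_LABELS = ("sponsored", "paid content", "paid post",
                  "promoted", "advertisement")

def sponsor_labels_found(page_text, labels=SPONSOR_LABELS):
    """Return the labels that appear as whole words in the page text."""
    lowered = page_text.lower()
    return [label for label in labels
            if re.search(r"\b" + re.escape(label) + r"\b", lowered)]

print(sponsor_labels_found("SPONSORED: Advances in integrated systems"))
```

A hit is a prompt to look for who actually wrote and paid for the piece, which is the question that matters.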

Activity: Spot Sponsored Content

9

Mike Caulfield

Rank the following news sources on how much sponsored content you believe their pages will feature: CNN, Buzzfeed, Washington Post, HuffPost, Breitbart, New York Times.

Individually, or in groups, visit the following pages and list all sponsored content you see, tallying up the total amount on each page. Then rank the sites from most sponsored content to least.

  1. http://www.cnn.com/2017/02/10/politics/russia-dossier-update/index.html
  2. http://money.cnn.com/news/
  3. http://www.vox.com/polyarchy/2017/2/10/14569306/congress-shut-off-phones
  4. https://www.buzzfeed.com/tylerkingkade/laura-dunns-campus-rape-fight
  5. https://www.washingtonpost.com/powerpost/a-gift-and-a-challenge-for-democrats-a-restive-active-and-aggressive-base/2017/02/11/e265dd44-efef-11e6-b4ff-ac2cf509efe5_story.html
  6. http://www.huffingtonpost.com/entry/yale-calhoun-college-grace-hopper_us_589f792ce4b094a129eb8a10?tiall3di&
  7. http://www.breitbart.com/video/2017/02/11/japan-condemns-n-korea-missile-launch-trump-u-s-stands-behind-japan-100-percent/
  8. https://www.nytimes.com/2017/02/11/us/state-republican-leaders-move-swiftly.html?

After you’ve ranked the websites, answer these questions:

  1. Did the ranking surprise you at all?
  2. What do you think the quantity of sponsored content indicates about a website?
  3. How does this change your perspective on these websites’ reliability?
  4. Why would some websites have more sponsored content than others?

Understanding Syndication

10

Mike Caulfield

Syndication–the process by which material from one site is published automatically to another site–can create confusion for readers who don’t understand it. It’s often a case where something is coming from “upstream” but appears not to be.

Consider this New York Times web page:

New York Times webpage

Figure 11

We see a set of stories on the left (“Germany’s Latest Best Seller”, “Isis Claims Responsibility”) written by New York Times staff, but also a thin column of stories in the middle of the page (“UK Stock Market Hits Record”) that are identified as being from the Associated Press.

You click through to a page that’s on the New York Times site, but not by the New York Times:

New York Times article

Figure 12

If you are going to evaluate the source of this article, your evaluation will have little to do with the New York Times. You’re going to focus on the reporting record of the Associated Press.

People get this wrong all the time. One thing that happens occasionally is that an article critical of a certain politician or policy suddenly disappears from the New York Times site, and people claim it’s a plot to rewrite the past. “Conspiracy!” they say. “They’re burying information!” they say. A ZOMG-level freakout follows.

It usually turns out that the article that disappeared is a syndicated article. Associated Press articles, for example, are displayed on the site for a few weeks, then “roll off” and disappear from the site. Why? Because the New York Times only pays the Associated Press to show them on the site for a few weeks.

You’ll also occasionally see people complaining about a story from the New York Times, claiming it shows a New York “liberal bias” only to find the story was not even written by the New York Times, but by the Associated Press, Reuters, or some other syndicator.

Going upstream means following a piece of content to its true source, and beginning your analysis there. Your first question when looking at a claim on a page should be “Where did this come from, and who produced it?” The answer quite often has very little to do with the website you are looking at.

Tracking the Source of Viral Content

11

Mike Caulfield

In the examples we’ve seen so far, it’s been straightforward to find the source of the content. The Blaze story, for example, clearly links to the Daily Dot piece so that anyone reading their summary is one click away from confirming it with the source. The New York Times makes apparent that the syndicated content is from the Associated Press, so you can readily check the credibility of the source.

This is good internet citizenship. Articles on the web that repurpose other information or artifacts should state their sources, and, if appropriate, link to them. This matters to creators, because they deserve credit for their work. But it also matters to readers who need to check the credibility of the original sources.

Unfortunately, many people on the web are not good citizens. This is particularly true with material that spreads quickly as hundreds or thousands of people share it–so-called “viral” content.

When that information travels around a network, people often fail to link it to sources, or hide them altogether. For example, here is an interesting claim that two million bikers are going to show up for President-elect Trump’s inauguration. Whatever your political persuasion, that would be a pretty amazing thing to see.

But the source of the information, Right Alerts Polls, is not linked.

Article from a cheap looking site.

Figure 13

Here’s where we show our first trick. Using the Chrome web browser, select the text “Right Alerts Polls.” Then right-click your mouse (control-click on a Mac), and choose the option to search Google for the highlighted phrase.

Screenshot of the result of selecting and right-clicking.

Figure 14

Your computer will execute a search for “Right Alerts Polls.” (Remember this right-click/control-click action–it’s going to be the foundation of a lot of stuff we do.)

To find the story, add “bikers” to the end of the search:

Google search results

Figure 15

We find our upstream article right at the top. Note that if you do not use Chrome, there are analogues of this method in other browsers as well. Right-clicking in Internet Explorer will allow you to search Bing, for example. If you want, you can always do this the slightly longer way by going to Google and typing in the search terms.
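The "slightly longer way" can also be written down as a search address. This is a sketch under two assumptions: that Google accepts a search through the `q` parameter of its web address, and that wrapping the phrase in quotation marks to ask for an exact match is a helpful addition (the right-click search described above sends the selected words as they are). The function name is invented for the example.

```python
from urllib.parse import urlencode

def phrase_search_url(phrase, extra_terms=""):
    """Build a Google search address for an exact phrase plus optional extra terms."""
    query = f'"{phrase}" {extra_terms}'.strip()
    return "https://www.google.com/search?" + urlencode({"q": query})

print(phrase_search_url("Right Alerts Polls", "bikers"))
```

The same function works for a snippet of a quotation: pass a short, distinctive run of words as the phrase and leave the extra terms empty.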

So are we done here? Have we found the source?

Nope. When we click through to the supposed source article, we find that this article doesn’t tell us where the information is coming from either. However, it does have an extended quote from one of the “Two Million Bikers” organizers:

Extended quote on page, supposedly from Facebook

Figure 16

So we just repeat our technique here, and select a bit of text from the quote and right-click/control-click. Our goal is to figure out where this quote came from, and searching on this small but unique piece of it should bring it close to the top of the Google results.

Screenshot showing right click.

Figure 17

When we search this snippet of the quote, we see that there are dozens of articles covering this story, using the same quote and sometimes even the same headline. But one of those results is the actual Facebook page for the event, and if we want a sense of how many people are committing, then this is a place to start.

Screenshot with arrows that point to the second Google result.

Figure 18

This also introduces us to another helpful practice: when scanning search results, novices scan the titles. Pros scan the URLs beneath the titles, looking for clues as to which sources are best. (Be a pro!)

So we go to the “Two Million Biker” Facebook event page, and take a look. How close are they to getting two million bikers to commit to this?

The bikers' Facebook page with event statistics.

Figure 19

Well…it looks like about 1,800. That’s nothing to sneer at–organizing is hard, and people have lives to attend to. Getting people to give up time for political activity is tough. But it’s pretty short of the “two million bikers” most of these articles were telling us were going to show up.

When we get into how to rate articles on the DigiPo site as true or false, likely or unlikely, we’ll talk a bit about how to write up the evaluation of this claim. Our sense is the rating here is either “Mostly False” or “Unlikely”–there are people planning to go, that’s true, but the importance of the story was based around the scale of attendance, and all indications seem to be that attendance is shaping up to be about a tenth of one percent (0.1%) of what the other articles promised.
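The "tenth of one percent" figure is worth checking rather than taking on faith. Here is the arithmetic as a short sketch, using the approximate count of 1,800 read from the event page and the two million promised in the headlines:

```python
committed = 1_800       # approximate number committed on the event page
promised = 2_000_000    # the "two million bikers" of the headlines

share = committed / promised
print(f"{share:.2%}")                 # 0.09%
print(round(promised / committed))    # 1111
```

So the commitments amount to about 0.09% of the promised crowd, which rounds to the tenth of one percent quoted above; put the other way, the headlines overstate the figure by a factor of roughly eleven hundred.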

Importantly, we would have learned none of this had we decided to evaluate the original page. We learned this by going upstream.

 

Tracking the Source of Viral Photos

12

Mike Caulfield

Another type of viral content on the internet is photography. It is also some of the most difficult to track upstream to a source.  Here’s a picture that showed up in my stream the other day:

Figure 20

OK, so what’s the story here? To get more information, I pull the textual information off the image and throw it in a Google search:

Figure 21

This brings me to a YouTube video that tells me this was taken “outside a Portland, Oregon Walmart” and has been shared “hundreds of times since yesterday.” So now we search with this new information. This next result shows you why you always want to look past the first result:

Figure 22

Which one of these items should I click? Again, the idea here is to get “upstream” to something that is closer to the actual event. One way to do that is to find the earliest post, and we’ll use that in a future task. But another way to get upstream is to get closer to the event in space. Think about it: who is more likely to get the facts of a local story correct, the local newspaper or a random blog?

So as you scan the search results, look at the URLs. Fox 13 News has it in “trending.” AmericaNow has it in the “society” section.

But the WGME link has the story in a “news/local/” directory. This is interesting, because the other site said it happened in Oregon, and here the location is clearly Maine. But this URL pattern is a strong point in the website’s favor.

A further indication that this might be a good source is that the blurb mentions the name of the photographer, “Matthew Mills.” The URL plus the specificity of the information tell us this is the way to go.
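Reading a URL this way means splitting it into the site and the sections the story is filed under. The sketch below shows that split with Python's standard library. The two addresses are hypothetical stand-ins (the real result URLs are only visible in the screenshot), and the function name is made up for the example.

```python
from urllib.parse import urlparse

def url_clues(url):
    """Return the host and the list of path sections of a URL."""
    parts = urlparse(url)
    sections = [section for section in parts.path.split("/") if section]
    return parts.netloc, sections

# Hypothetical addresses, modeled on the patterns described above.
for url in ("https://station.example/news/local/parking-photo",
            "https://aggregator.example/trending/parking-photo"):
    host, sections = url_clues(url)
    print(host, sections)
```

A "news" and "local" section suggests a newsroom filing its own story, while "trending" suggests a page that is passing along what is already popular. Neither is proof, but it tells you where to click first.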

This takes me to what looks like the news page where it went viral, which embeds the original post.

Figure 23

We see here that the downstream news report we found first had a bunch of things wrong. It wasn’t in Portland, Oregon—it was in Biddeford, which is near Portland, Maine. It hasn’t been shared “hundreds of times”–it’s been shared hundreds of thousands of times. And it was made viral by a CBS affiliate, a fact that ABC Action News in Tampa doesn’t mention at all.

OK, let’s go one more step. Let’s look at the Facebook page where Matthew Mills shared it. Part of what we want to see is whether or not this was viral before CBS picked it up. I’d also like to double check that Mills is really from the Biddeford area and see if he was responsible for the shopping carts or just happened upon this scene.

The news post does not link back to the original, so we search on Matthew Mills again. There, we find some news outlets mentioning the original caption by Mills: “This guy got a lesson in parking.”

Figure 24

That’s not the same as the caption that the news station put up–maybe it’s what Mills originally used. We pump “’got a lesson in parking’ Matthew Mills” into Facebook, and bingo, we get the original post:

Figure 25

And here’s where we see something unpleasant about news organizations. They cut other news organizations out of the story, every time. So they say this has been shared hundreds of times because in order to say it has been shared hundreds of thousands of times they’d have to mention it was popularized by a CBS affiliate. So they cut CBS out of the story.

This practice can make it easier to track something down to the source. News organizations work hard to find the original source if it means they can cut other news organizations out of the picture. But it also tends to distort how virality happens. The picture here did not magically become viral—it became viral largely due to the reach of WGME.

Incidentally, we also find answers to other questions in the Matthew Mills version: he took the picture but didn’t arrange the carts, and he really is from Old Orchard Beach.

Just because we’re extra suspicious, we throw the image into Google Images to see if maybe this is a recycled image. Sometimes people take old images and pretend the images are theirs–changing only the supposed date and location. A Google reverse image search (see below) shows that this does not appear to be the case here, although in doing that we find out this is a very common type of viral photo called a “parking revenge” photo. The specific technique of circling carts around a double-parked car dates back to at least 2012:

Figure 26

When we click through we can see that the practice was popularized, at least to some extent, by Reddit users. See, for instance, this post from December 2012:

Figure 27

So that’s it. It’s part of a parking revenge meme that dates back at least four years, and was popularized by Reddit. This particular one was shot by Matthew Mills in Biddeford, Maine, who was not the one who circled the carts. And it became viral through the reshare provided by a local Maine news station.

 

Filtering by Time and Place to Find the Original

14

Mike Caulfield

As I’ve mentioned above, going upstream is often a journey through time and space. The original story is also the first story, and as we saw with the Hawaiian news site, local sources often have special insights into stories.

There are specific tactics you can use with Google and other search engines to help you find original material more quickly.

The following photo is one that Twitter users have identified as another “National Geographic photographer” photo. Is it?

Figure 38

A Google reverse image search finds the photo, suggesting the best search term is “birds attacking people.”

Figure 39

This suggestion is based on the fact that the pages where this photo shows up often contain these words: “birds attacking people.”

Figure 40

We can modify that search, however. Let’s return only the older pictures.

We do that by clicking the “Tools” button and then using the “Time” dropdown to select “Custom range.” This should filter out some of the posts that merely include this in slideshows.

Figure 41

We pick a date in the past to see if we can filter out the newer photos. We remove the “birds attacking people” search and replace it with “bird,” since the other phrase sounds like a title for a slideshow with many of these sorts of photos in it. The original isn’t likely to be on a page like that; the slideshows come later in the viral cycle:

Figure 42

Why 2009? For viral photos I usually find 2009 or 2010 a good starting point. If you don’t find any results within that parameter, then go higher, to a year like 2012. If you find too many results, then change the search to something like 2007.
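The same idea can be expressed in the search box itself. The sketch below swaps in a different technique from the "Tools" menu used above: it assumes Google's `before:` search operator with a date in YYYY-MM-DD form, and I have not verified that the operator treats image results or the cutoff day exactly as the "Custom range" filter does. The function name is invented for the example.

```python
from datetime import date

def query_before(terms, cutoff_year):
    """Build a query limited to results dated before the end of cutoff_year."""
    cutoff = date(cutoff_year, 12, 31)
    return f"{terms} before:{cutoff.isoformat()}"

# Start at 2009; move later if nothing turns up, earlier if there is too much.
for year in (2009, 2012, 2007):
    print(query_before("bird", year))
```

Whichever way you apply the cutoff, the reasoning is the same: the original usually appears early, and the slideshows and reposts pile up later.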

Here we get a much better set of results. Instead of a list of “When Birds Attack” slideshows, we get a set of results talking about this specific photo. One of the results stands out to me.

Figure 43

This third result looks most promising for two reasons:

  1. The poster of the “Got too close to the hawk” result seems to know a bit more about the situation, noting “these birds are trained.”
  2. It mentions “Kazakhstan Eagle.” That’s a name of a type of bird, but it’s also a place, and if we could confirm this took place in Kazakhstan, there would be other ways to trace this back to the original. Remember–going upstream is about getting closer in time to the original, but it can also mean getting closer in space.

Luckily when we go to that page it links us in the comments to a page that has the set of shots that the photographer was taking, as well as a shot of this cameraman being attacked from another angle.

Figure 44

It’s a series of photos from a hunting competition in Chengelsy Gorge, Kazakhstan. The eagle attacking him is tame and trained, but for some reason attacked him anyway. So this is real; it’s not photoshopped or staged. At the same time it’s not a National Geographic photographer. We could pursue it further if we wanted, but we’ll stop here.

While this process takes some time to explain, in practice it can be done in about 90 seconds. Here’s a YouTube video that shows what this looks like in practice.


A YouTube element has been excluded from this version of the text. You can view it online here: http://textbooks.whatcom.edu/webliteracy/?p=103

(Note that as long as you are careful with confirmation bias, you can replace the search term “bird” with a term like “fake” to find pages claiming the image is fake and see what evidence they present.)

Going local is also useful for other sorts of events. Here is text from a story that ran in many right-wing blogs, under headlines such as “Teen Girls Savagely Beaten By Black Lives Matter Thugs”:

Two white teenage girls and their mother were attacked during the protests in Stockton last Friday. The young girls were transported to the hospital by police after being viciously beaten by Black Lives Matter supporters, but one of the attackers will soon face criminal charges for his role in the assault.

The two teenage girls said they were viciously attacked by more than a dozen male and female protesters as they were leaving a restaurant. As they were leaving the restaurant, they were approached by a group of protesters chanting “Black Lives Matter.”

The headlines and the language used in those posts were often inflammatory and racist, but is there really a story under this? Or is the story fake?

There are many ways we can investigate the story, but for a local event like this you would expect some local coverage. So to go upstream here, one option is to go local. In this case we look to see what news organizations cover the area, by typing in “stockton ca local affiliate”:

Figure 45

Then we go to one of those sites and look for the news, typing in “teenage girls black lives matter.”

Figure 46

And in doing that we find that the event did happen. But the facts, if you follow that link, are more complex than most of the tertiary coverage will convey.

There’s plenty to argue about concerning the event. But by going to the local source we can start with a cleaner version of the facts. This isn’t to say that local news is always reliable, but in a sea of spin and fakery, it’s not a bad place to start for coverage and confirmation of local events.

Activity: Trace Viral Photos Upstream

15

Mike Caulfield

These two photos have been attributed to National Geographic shoots by the same tweeter I mentioned above.

I put the photos below. If you are reading this on the web, go to it. If you are reading this book in PDF form, you’ll have to go find them at the Hapgood blog to use your Google reverse image search right-click/control-click action.

Bearing It

The first one is easy. Is this real, or fake? And are these National Geographic photographers or not? Is the bear real?

Figure 47

Swan Song

This second one is a lot harder. But is this real or fake? If real, can you find the name of the photographer in the swan and his nationality? If fake, can you show a debunking of it?

Figure 48

Truck Bomb

This next one is political. It was shared by a Twitter user who claimed it was a picture of an Irish Republican Army bombing. To paraphrase the poster: “This is London in 1993 after an IRA truck bomb. We didn’t ban Irish people or Catholics.” The poster was making a comparison to recent moves to ban travel from Muslim countries in the U.S.

Is this a picture of a 1993 London truck bombing? If so, how many people died and/or were injured? What was the response?

Figure 49

Going Rambo

Figure 50

Read Laterally

IV

What “Reading Laterally” Means

16

Mike Caulfield

Time for our third move: good fact-checkers read “laterally,” across many connected sites instead of digging deep into the site at hand.

When you start to read a book, a journal article, or a physical newspaper in the “real world,” you already know quite a bit about your source. You’ve subscribed to the newspaper, or picked it up from a newsstand because you’ve heard of it. You’ve ordered the book from Amazon or purchased it from a local bookstore because it was a book you were interested in reading. You’ve chosen a journal article either because of the quality of the journal article or because someone whose expertise and background you know cited it. In other words, when you get to the document you need to evaluate, the process of getting there has already given you some initial bearings.

Compared to these intellectual journeys, web reading is a bit more like teleportation. Even after following a source upstream, you arrive at a page, site, and author that are often all unknown to you. How do you analyze the author’s qualifications or the trustworthiness of the site?

Researchers have found that most people go about this the wrong way. When confronted with a new site, they poke around the site and try to find out what the site says about itself by going to the “about page,” clicking around in onsite author biographies, or scrolling up and down the page. This is a faulty strategy for two reasons. First, if the site is untrustworthy, then what the site says about itself is most likely untrustworthy, as well. And, even if the site is generally trustworthy, it is inclined to paint the most favorable picture of its expertise and credibility possible.

The solution to this is, in the words of Sam Wineburg’s Stanford research team, to “read laterally.” Lateral readers don’t spend time on the page or site until they’ve first gotten their bearings by looking at what other sites and resources say about the source at which they are looking.

For example, when presented with a new site that needs to be evaluated, professional fact-checkers don’t spend much time on the site itself. Instead they get off the page and see what other authoritative sources have said about the site. They open up many tabs in their browser, piecing together different bits of information from across the web to get a better picture of the site they’re investigating. Many of the questions they ask are the same as the vertical readers scrolling up and down the pages of the source they are evaluating. But unlike those readers, they realize that the truth is more likely to be found in the network of links to (and commentaries about) the site than in the site itself.

Only when they’ve gotten their bearings from the rest of the network do they re-engage with the content. Lateral readers gain a better understanding as to whether to trust the facts and analysis presented to them.

You can tell lateral readers at work: they have multiple tabs open and they perform web searches on the author of the piece and the ownership of the site. They also look at pages linking to the site, not just pages coming from it.

Lateral reading helps the reader understand both the perspective from which the site’s analyses come and whether the site has an editorial process or expert reputation that would allow one to accept the truth of a site’s facts.

We’re going to deal with the latter issue of factual reliability, while noting that lateral reading is just as important for the first issue.

 

Evaluating a Website or Publication’s Authority

17

Mike Caulfield

Authority and reliability are tricky to evaluate. Whether we admit it or not, most of us would like to ascribe authority to sites and authors who support our conclusions and deny authority to publications that disagree with our worldview. To us, this seems natural: the trustworthy publications are the ones saying things that are correct, and we define “correct” as what we believe to be true. A moment’s reflection will show the flaw in this way of thinking.

How do we get beyond our own myopia here? For the Digital Polarization Project for which this text was created, we ended up adopting Wikipedia’s guidelines for determining the reliability of publications. These guidelines were developed to help people with diametrically opposed positions argue in rational ways about the reliability of sources using common criteria.

For Wikipedians, reliable sources are defined by process, aim, and expertise. I think these criteria are worth thinking about as you fact-check.

Process

Above all, a reliable source for facts should have a process in place for encouraging accuracy, verifying facts, and correcting mistakes. Note that reputation and process can be separate from issues of bias: the New York Times is thought by many to have a center-left bias, the Wall Street Journal a center-right bias, and USA Today a centrist bias. Yet fact-checkers of all political stripes are happy to be able to track a fact down to one of these publications since they have reputations for a high degree of accuracy and issue corrections when they get facts wrong.

The same thing applies to peer-reviewed publications. While there is much debate about the inherent flaws of peer review, peer review does get many eyes on data and results. Their process helps to keep many obviously flawed results out of publication. If a peer-reviewed journal has a large following of experts, that provides even more eyes on the article, and more chances to spot flaws. Since one’s reputation for research is on the line in front of one’s peers, it also provides incentives to be precise in claims and careful in analysis in a way that other forms of communication might not.

Expertise

According to Wikipedians, researchers and certain classes of professionals have expertise, and their usefulness is defined by that expertise. For example, we would expect a marine biologist to have a more informed opinion about the impact of global warming on marine life than the average person, particularly if they have done research in that area. Professional knowledge matters too: we’d expect a health inspector to have a reasonably good knowledge of health code violations, even if they are not a scholar of the area. And while we often think researchers are more knowledgeable than professionals, this is not always the case. For a range of issues, professionals in a given area might have better insight than researchers, especially where questions deal with common practice.

Reporters, on the other hand, often have no domain expertise, but may write for papers that accurately summarize and convey the views of experts, professionals, and event participants. As reporters write in a niche area over many years (e.g. opioid drug policy) they may acquire expertise themselves.

Aim

Aim is defined by what the publication, author, or media source is attempting to accomplish. Aims are complex. Respected scientific journals, for example, aim for prestige within the scientific community, but must also have a business model. A site like the New York Times relies on ad revenue but is also dependent on maintaining a reputation for accuracy.

One way to think about aim is to ask what incentives an article or author has to get things right. An opinion column that gets a fact or two wrong won’t cause its author much trouble, whereas an article in a newspaper that gets facts wrong may damage the reputation of the reporter. On the far ends of the spectrum, a single bad or retracted article by a scientist can ruin a career, whereas an advocacy blog site can twist facts daily with no consequences.

Policy think tanks, such as the Cato Institute and the Center for American Progress, are interesting hybrid cases. To maintain their funding, they must continue to promote aims that have a particular bias. At the same time, their prestige (at least for the better known ones) depends on them promoting these aims while maintaining some level of honesty.

In general, you want to choose a publication that has strong incentives to get things right, as shown by authorial intent, business model, reputational incentives, and history.

 

Basic Techniques: Domain Searches, WHOIS

18

Mike Caulfield

What are some quick techniques to identify an unfamiliar site’s worldview, process, aims, and expertise?

Web Searching a Domain

The simplest and quickest way to get a sense of where a site sits in the network ecosystem is to Google search the site. Since we want to find out what other sites are saying about the site while excluding what the site says about itself, we use a special search syntax that excludes pages from the target site.

For example, say we are looking at an article in the Baltimore Gazette:

See description of Figure 60a: Headline Reads: Clinton Received Debate Questions

Figure 51

Is this a reputable newspaper?

The site is down right now, but when it was up, a search for “baltimoregazette.com” would have returned many pages, mostly from the site itself. As noted earlier, if we don’t know whether to trust a site, it doesn’t make much sense to trust the story the site tells us about itself.

So we use a search syntax that looks for all references to the site that are not on the site itself:

baltimoregazette.com -site:baltimoregazette.com

When we do that we get a set of results that we can scan, looking for sites we trust:

See description of Figure 61a: A Google search tip demonstrating how to exclude a specific site from search results.

Figure 52

These results, as we scan them, give us reason to suspect the site. Maybe we don’t know “City Paper,” which claims the site is fake. But we do know Snopes. When we take a look there, we find the following sentence about the Gazette:

On 21 September 2016, the Baltimore Gazette — a purveyor of fake news, not a real news outlet — published an article reporting that any “rioters” caught looting in Charlotte would permanently lose food stamps and all other government benefits…

From Snopes, that’s pretty definitive. This is a fake news site.

Searches like this don’t always turn up Snopes or Politifact. Here’s the site of the Pacific Justice Institute:

Figure 53

Here, a search of Google turns up a Wikipedia article:

Figure 54

That article explains that this is a conservative legal defense fund that has been named a hate group by the Southern Poverty Law Center.

Maybe to you that means that nothing from this site is trustworthy; maybe to another person it simply means proceed with caution. But after a short search and two clicks, you can begin reading an article from this site with a better idea of the purpose behind it, a key ingredient of intentional reading.
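The exclusion search used in this section is easy to build for any site. Here is a minimal Python sketch of it; the function names are my own, and the example story path in the comment is made up for illustration.

```python
from urllib.parse import urlencode, urlparse


def domain_of(url_or_domain: str) -> str:
    """Return the bare host name, dropping scheme, path, and a leading 'www.'."""
    # urlparse only recognizes a host when the string contains '//'.
    candidate = url_or_domain if "//" in url_or_domain else "//" + url_or_domain
    host = urlparse(candidate).hostname or ""
    return host[4:] if host.startswith("www.") else host


def lateral_query(url_or_domain: str) -> str:
    """Build 'domain -site:domain': mentions of the site that are not on the site."""
    domain = domain_of(url_or_domain)
    return f"{domain} -site:{domain}"


def google_search_url(query: str) -> str:
    """Turn a query string into a Google search URL."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Works from a bare domain or from a full article address
# (the path below is a hypothetical example).
print(lateral_query("http://www.baltimoregazette.com/some-story"))
print(google_search_url(lateral_query("baltimoregazette.com")))
```

Pasting the query into the search box by hand works just as well; the point is only that the same two-part pattern applies to every unfamiliar domain.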

Finding Out Who Runs a Site with WHOIS and Other Tools

Some smaller sites don’t have reliable commentary around them. For these sites, using WHOIS to find who owns them may be a useful move.

WHOIS gets you information about who is the administrator of the site domain. It can be done from your computer’s command line in many cases, but here we’ll show the ICANN interface, where we are searching to see who owns Mother Jones, an online news site:

Figure 55

When we search on the owner, we find that:

The Foundation for National Progress is a nonprofit organization created to educate the American public by publishing Mother Jones. Mother Jones is a multiplatform news organization that conducts in-depth investigative reporting and high quality, original, explanatory journalism on major social issues, including money in politics, gun violence, economic inequality and the future of work.

(We could have found this out by other means as well, of course).

Unfortunately, WHOIS blockers have dramatically reduced the value of WHOIS searches. The famous Baltimore Gazette fake news site from 2016, for example, uses a proxy service to hide revealing information.

Figure 56

The owner of the site here isn’t Domains by Proxy, as the record indicates. Instead, Domains by Proxy is a service, often available for a couple dollars a year, that obscures the true ownership of the site. These masking services are starting to become the norm, dramatically reducing the usefulness of WHOIS searches.

That said, there is still useful information to be had here, particularly in the date the baltimoregazette.com domain was registered, which is listed here as being in mid-2015:

Figure 57

If this were an established local paper, it would be fairly odd for it to have first registered the site a year ago.
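The two things we read off a WHOIS record in this section, the listed owner and the registration date, can be pulled out of the record text with a few lines of Python. This is a rough sketch under stated assumptions: the sample record is invented for illustration, the field labels ("Creation Date", "Registrant Organization") are common but vary by registry, and the list of masking keywords is my own guess rather than a standard.

```python
from datetime import date, datetime
from typing import Dict, Optional

# Invented sample record for illustration only; real WHOIS output varies by registry.
SAMPLE_RECORD = """\
Domain Name: EXAMPLE-GAZETTE.TEST
Creation Date: 2015-07-03T14:05:11Z
Registrant Organization: Domains By Proxy, LLC
"""

# Assumed keywords that suggest a masking service; not an exhaustive list.
PRIVACY_HINTS = ("proxy", "privacy", "redacted")


def parse_whois(text: str) -> Dict[str, str]:
    """Collect 'Key: value' lines, keeping the first value seen for each key."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields.setdefault(key.strip().lower(), value.strip())
    return fields


def registration_date(fields: Dict[str, str]) -> Optional[date]:
    """Read the creation date, assuming it starts with YYYY-MM-DD."""
    raw = fields.get("creation date")
    if raw is None:
        return None
    return datetime.strptime(raw[:10], "%Y-%m-%d").date()


def domain_age_years(fields: Dict[str, str], today: date) -> Optional[float]:
    """Approximate age of the domain registration in years as of 'today'."""
    created = registration_date(fields)
    if created is None:
        return None
    return (today - created).days / 365.25


def owner_is_masked(fields: Dict[str, str]) -> bool:
    """Guess whether the listed owner is a masking service rather than the real owner."""
    owner = fields.get("registrant organization", "").lower()
    return any(hint in owner for hint in PRIVACY_HINTS)
```

A masked owner combined with a registration only about a year old is not proof of anything, but for a site presenting itself as an established local paper it is the kind of mismatch worth noticing.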


Activity: Evaluate a Site

19

Mike Caulfield

Evaluate the reputations of the following sites by “reading laterally.” Answer the following questions to determine the reputability of each site: Who runs them? To what purpose? What is their history of accuracy, and how do they rate on process, aim, and expertise?

  1. http://cis.org/vaughan/...
  2. http://www.al.com/news/montgomery/...
  3. https://codoh.com/media/files/...
  4. http://www.nature.com/nature/journal/...
  5. http://www.dailykos.com/...
  6. https://nsidc.org/
  7. http://www.smh.com.au/environment/weather/...
  8. http://occupydemocrats.com/2017/02/11...
  9. http://principia-scientific.org/...
  10. http://www.europhysicsnews.org/articles/epn/abs/2016/05/...
  11. https://www.rt.com/news/...
  12. http://timesofindia.indiatimes.com/world/us/...
  13. http://www.naturalnews.com/...
  14. http://fauxcountrynews.com/...

Stupid Journal Tricks

20

Mike Caulfield

There’s no more dreaded phrase to the fact-checker than “a recent study says.” Recent studies say that chocolate cures cancer, prevents cancer, and may have no impact on cancer whatsoever. Recent studies say that holding a pencil in your teeth makes you happier. Recent studies say that the scientific process is failing, and others say it is just fine.

Most studies are data points–emerging evidence that lends weight to one conclusion or another but does not resolve questions definitively. What we want as a fact-checker is not data points, but the broad consensus of experts. And the broad consensus of experts is rare.

The following chapters are not meant to show you how to meticulously evaluate research claims. Instead, they are meant to give you, the reader, some quick and frugal ways to decide what sorts of research can be safely passed over when you are looking for a reliable source. We take as our premise that information is abundant and time is scarce. As such, it’s better to err on the side of moving on to the next article than to invest time in an article that displays warning signs regarding either expertise or accuracy.

Finding a Journal’s Impact Factor

21

Mike Caulfield

I mentioned earlier that this process is one of elimination. In a world where information is plentiful, we can be a bit demanding about what counts as evidence. When it comes to research, one gating expectation can be that published academic research cited for a claim comes from respected peer-reviewed journals.

Consider this journal:

Figure 58

Is it a journal that gives any authority to this article? Or is it just another web-based paper mill?

Our first check is to see what the “impact factor” of the journal is. This is a measure of the journal’s influence in the academic community. While a flawed metric for assessing the relative importance of journals, it is a useful tool for quickly identifying journals that are not part of a known circle of academic discourse, or that are not peer-reviewed.

We search Google for PLOS Medicine, and it pulls up a knowledge panel for us with an impact factor.

Figure 59

Impact factor can go into the 30s, but we’re using this as a quick elimination test, not a ranking, so we’re happy with anything over 1. We still have work to do on this article, but it’s worth keeping in the mix.

What about this one?

Figure 60

In this case we get a result with a link to this journal at the top, but no panel, as there is no registered impact factor for this journal:

Figure 61

Again, we stress that the article here may be excellent–we don’t know. Likewise, there are occasionally articles published in the most prestigious journals that are pure junk. Be careful in your use of impact factor; a journal with an impact factor of 10 is not necessarily better than a journal with an impact factor of 3, especially if you are dealing with a niche subject.

But in a quick and dirty analysis, we have to say that the PLOS Medicine article is more trustworthy than the Journal of Obesity and Weight-loss Medication article. In fact, if you were deciding whether to reshare a story in your feed and the evidence for the story came from this Obesity journal, I’d skip reposting it entirely.
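The elimination test used in this chapter fits in a few lines of Python. This is only a sketch of the rule of thumb: the function name is my own, and a value of None stands in for "no impact factor turned up in the search."

```python
from typing import Optional


def worth_a_closer_look(impact_factor: Optional[float], threshold: float = 1.0) -> bool:
    """Quick elimination test, not a ranking.

    None means no impact factor was found for the journal. The default
    threshold of 1 follows the "anything over 1" rule of thumb in the text.
    """
    return impact_factor is not None and impact_factor > threshold
```

Note what the function does not do: it never compares two passing journals against each other, because impact factor is too blunt for that kind of fine-grained distinction.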

 

Using Google Scholar to Check Author Expertise

22

Mike Caulfield

Not all, or even most, expertise is academic. But when the expertise cited is academic, scholarly publications by the researcher can go a long way to establishing their position in the academic community.

Let’s look at David Bann, who wrote the PLOS Medicine article we looked at a chapter ago. To do that we go to Google Scholar (not the general Google search page) and type in his name.

Figure 62

We see a couple things here. First, he has a history of publishing in this area of lifespan obesity patterns. At the bottom of each result we see how many times each article he is associated with is cited. These aren’t amazing numbers, but for a niche area they are a healthy citation rate. Many articles published aren’t cited at all, and here at least one work of his has over 100 citations.

Additionally if we scan down that right side column we see some names we might recognize–the National Institutes of Health (NIH) and another PLOS article.

Keep in mind that we are looking for expertise in the area of the claim. These are great credentials for talking about obesity. They are not great credentials for talking about opiate addiction. But right now we care about obesity, so that’s OK.

By point of comparison, we can look at a publication in Europhysics News that attacks the standard view of the 9/11 World Trade Center collapse. We see this represented in this story on popular alternative news and conspiracy site AnonHQ:

Figure 63

The journal cited is Europhysics News, and when we look it up in Google we find no impact factor at all. In fact, a short investigation of the journal reveals it is not a peer-reviewed journal, but a magazine associated with the European Physical Society. The author here is either lying, or does not understand the difference between a scientific journal and a scientific organization’s magazine.

So much for the source. But what about the authors? Do they have a variety of papers on the mathematical modeling of building demolitions?

If you punch the names into Google Scholar, you’ll find that at least one of the authors does have some modelling experience on architectural stresses, although most of his published work was from years ago.

Figure 64

What do we make of this? It’s fair to say that the article here was not peer-reviewed and shouldn’t be treated as a substantial contribution to the body of research on the 9/11 collapse. The headline of the blog article that brought us here is wrong, as is their claim that a European Scientific Journal concluded 9/11 was a controlled demolition. That’s flat out false.

But it’s worthwhile to note that at least one of the people writing this paper does have some expertise in a related field. We’re left with that question of “What does generally mean?” in the phrase “Experts generally agree on X.”

What should we do with this article? Well, it’s an article published in a non-peer-reviewed journal by an expert who published a number of other respected articles (though quite a long time ago, in one case). To an expert, that definitely could be interesting. To a novice looking for the majority and significant minority views of the field, it’s probably not the best source.

How to Think about Research

23

Mike Caulfield

This brings us to my third point, which is how to think about research articles. People tend to think that newer is better with everything. Sometimes this is true: new phones are better than old phones and new textbooks are often more up-to-date than old textbooks. But the understanding many students have about scholarly articles is that the newer studies “replace” the older studies. You see this assumption in the headline: “It’s Official: European Scientific Journal Concludes…”

In general, that’s not how science works. In science, multiple conflicting studies come in over long periods of time, each one a drop in the bucket of the claim it supports. Over time, the weight of the evidence ends up on one side or another. Depending on the quality of the new research, some drops are bigger than others (some much bigger), but overall it is an incremental process.

As such, studies that are consistent with previous research are often more trustworthy than those that have surprising or unexpected results. This runs counter to the narrative promoted by the press: “news,” after all, favors what is new and different. The unfortunate effect of the press’s presentation of science (and in particular science around popular issues such as health) is that it rarely gives a sense of the slow accumulation of evidence for each side of an issue. Instead, the narrative often presents a world where last month’s findings are “overturned” by this month’s findings, which are then, in turn, “overturned” back to the original finding a month from now. This whiplash presentation (“Chocolate is good for you! Chocolate is bad for you!”) undermines the public’s faith in science. But the whiplash is not from science: it is a product of the inappropriate presentation from the press.

As a fact-checker, your job is not to resolve debates based on new evidence, but to accurately summarize the state of research and the consensus of experts in a given area, taking into account majority and significant minority views.

For this reason, fact-checking communities such as Wikipedia discourage authors from over-citing individual research, which tends to point in different directions. Instead, Wikipedia encourages users to find high quality secondary sources that reliably summarize the research base of a certain area, or research reviews of multiple works. This is good advice for fact-checkers as well. Without an expert’s background, it can be challenging to place new research in the context of old, which is what you want to do.

Here’s a claim (two claims, actually) that ran recently in the Washington Post:

The alcohol industry and some government agencies continue to promote the idea that moderate drinking provides some health benefits. But new research is beginning to call even that long-standing claim into question.

Reading down further, we find a more specific claim: the medical consensus is that alcohol is a carcinogen even at low levels of consumption. Is this true?

The first thing we do is look at the authorship of the article. It’s from the Washington Post, which is a generally reliable publication, and one of its authors has made a career of data analysis (and actually won a Pulitzer prize as part of a team that analyzed data and discovered election fraud in a Florida mayoral race). So one thing to think about is that these people may be better interpreters of the data than you. (Key thing for fact-checkers to keep in mind: You are often not a person in a position to know.)

But suppose we want to dig further and find out if they are really looking at a shift in the expert consensus, or just adding more drops to the evidence bucket. How would we do that?

First, we’d sanity check where the pieces they mention were published. The Post article mentions two articles by “Jennie Connor, a professor at the University of Otago Dunedin School of Medicine,” one published last year and the other published earlier. Let’s find the more recent one, which seems to be a key input into this article. We go to Google Scholar and type in “‘Jennie Connor’ 2016”:

Figure 65

As usual, we’re scanning quickly to get to the article we want, but also minding our peripheral vision here. So, we see that the top one is what we probably want, but we also notice that Connor has other well-cited articles in the field of health.

What about this article on “Alcohol consumption as a cause of cancer”? It was published in 2017 (which is probably the physical journal’s publication date, the article having been released in 2016). Nevertheless, it’s already been cited by twelve other papers.

What about this publication Addiction? Is it reputable?

Let’s take a look with an impact factor search.

Figure 66

Yep, it looks legit. We also see in the knowledge panel to the right that the journal was founded in the 1880s. If we click through to that Wikipedia article, it will tell us that this journal ranks second in impact factor for journals on substance abuse.

Again, you should never use impact factor for fine-grained distinctions. What we’re checking for here is that the Washington Post wasn’t fooled into covering some research far out of the mainstream of substance abuse studies, or tricked into covering something published in a sketchy journal. It’s clear from this quick check that this is a researcher well within the mainstream of her profession, publishing in prominent journals.

Next we want to see what kind of article this is. Sometimes journals publish short reactions to other works, or smaller opinion pieces. What we’d like to see here is that this was either new research or a substantial review of research. We find from the abstract that it is primarily a review of research, including some of the newer studies. We note that it is a six-page article, and therefore not likely to be a simple letter or response to another article. The abstract also goes into detail about the breadth of evidence reviewed.

Frustratingly, we can’t get our hands on the article, but this probably tells us enough about it for our purposes.
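The Google Scholar lookup from earlier in this chapter, an author’s name in quotes plus a year, can be sketched in Python as follows. The function names are my own; the URL pattern is the one Google Scholar shows in the address bar after an ordinary search.

```python
from typing import Optional
from urllib.parse import urlencode


def scholar_author_query(author: str, year: Optional[int] = None) -> str:
    """Quote the author's name so it is matched as a phrase; optionally add a year."""
    query = f'"{author}"'
    if year is not None:
        query += f" {year}"
    return query


def scholar_url(query: str) -> str:
    """Turn a query string into a Google Scholar search URL."""
    return "https://scholar.google.com/scholar?" + urlencode({"q": query})


print(scholar_url(scholar_author_query("Jennie Connor", 2016)))
```

The quotation marks matter: without them the search would match any paper that happens to involve a Jennie and a Connor.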

Finding High Quality Secondary Sources

24

Mike Caulfield

Let’s continue with the “alcohol is closely associated with cancer” claim from the last chapter. Let’s see if we can get a decent summary from a respected organization that deals with these issues.

This takes a bit of domain knowledge, but for information on disease, the United States’s National Institutes of Health (NIH) is considered one of the leading authorities. What do they say about this issue?

Figure 67

What we don’t want here is a random article. We’re not an expert and we don’t want to have to guess at the weights to give individual research. We want a summary.

And as we scan the results we see a “risk fact-sheet” from the National Cancer Institute. In general, domain suffixes (com/org/net/etc) don’t mean anything, but “.gov” domains are strictly regulated, so we know this is from the (U.S.) federal government. A fact sheet is a summary, which is what we want, so we click through.

This page doesn’t mince words:

Based on extensive reviews of research studies, there is a strong scientific consensus of an association between alcohol drinking and several types of cancer (1, 2). In its Report on Carcinogens, the National Toxicology Program of the US Department of Health and Human Services lists consumption of alcoholic beverages as a known human carcinogen. The research evidence indicates that the more alcohol a person drinks—particularly the more alcohol a person drinks regularly over time—the higher his or her risk of developing an alcohol-associated cancer. Based on data from 2009, an estimated 3.5 percent of all cancer deaths in the United States (about 19,500 deaths) were alcohol related (3).

With the “.gov” extension, this page is pretty likely to be linked to the NIH. But just in case, we Google search the site to see who runs it and what their reputation is.

Figure 68

Since we’re reading laterally, let’s click on the link five results down to see what the NIH says about the National Cancer Institute. Again, we’re just sanity checking our impression that this is an authoritative body of the NIH. Here’s its blurb from the fifth result down:

The National Cancer Institute (NCI) is part of the National Institutes of Health (NIH), which is one of 11 agencies that compose the Department of Health and Human Services (HHS). The NCI, established under the National Cancer Institute Act of 1937, is the Federal Government’s principal agency for cancer research and training.

As always, we glance up to the web address and make sure we are really getting this information from the NIH. We are.

If we were researchers, we would sort through more of this. We might review individual articles or make sure that some more out-of-the-mainstream views are not being ignored. Such an effort would take a deep background and understanding of the underlying issues. But we’re not researchers. We’re just people looking to find out if our rationalization for those two after-work drinks is maybe a bit bogus. And on that level, it’s not looking particularly good for us. We have a major review of the evidence in a major journal stating there’s really no safe level of drinking when it comes to cancer, and we have the NIH–one of the most trusted sources of health information in the U.S. (and not exactly a fad-chaser)–telling us in an FAQ that there is a strong consensus that alcohol consumption predicts cancer.

Choosing Your Experts First

25

Mike Caulfield

One other thing to note here is that in the past chapter or two we followed a different pattern than a lot of web searching does. Here we decided who would be the most trustworthy source of medical consensus (the NIH) and looked up what they said.

This is an important technique to have in your research mix. Too often, we execute web search after web search without first asking who would constitute an expert. Unsurprisingly, when we do things in this order, we end up valuing the expertise of people who agree with us and devaluing the expertise of those who don’t. If you find yourself going down a rabbit hole of conflicting information in your searches, back up a second and ask yourself: whose expertise would you respect? Maybe it’s not the NIH. Maybe it’s the Mayo Clinic, or Medline, or the World Health Organization. But deciding who has expertise before you search will mitigate some of your worst tendencies toward confirmation bias.

So, given the evidence we’ve seen in previous chapters about alcohol and cancer–am I going to give up my after-work porter? I don’t know. I really like porter. The evidence is still emerging, and maybe the risk increase is worth it. But I’m also convinced the Washington Post article isn’t the newest version of “eating grapefruit will make you thinner.” It’s not even “Nutrasweet may make you fat,” which is an interesting finding, but a point around which there is no consensus. Instead “small amounts of daily alcohol increase cancer risk” represents a real emerging consensus in the research, and from our review we find it’s not even a particularly new trend. The consensus emerged some time ago (the NIH FAQ dates back to 2010); it’s just been poorly communicated to the public.

Evaluating News Sources

26

Mike Caulfield

Evaluating news sources is one of the more contentious issues out there. People have their favorite news sources and don’t like to be told that their news source is untrustworthy.

For fact-checking, it’s helpful to draw a distinction between two activities: news gathering and news analysis.

Most newspaper articles are not lists of facts, which means that outfits like the Wall Street Journal and the New York Times do both news gathering and news analysis in stories. What has been lost in the dismissal of the New York Times as liberal and the Wall Street Journal as conservative is that these are primarily biases of the news analysis portion of what they do. To the extent the bias exists, it’s in what they choose to cover, to whom they choose to talk, and what they imply in the way they arrange those facts they collect.

The news gathering piece is affected by this, but in many ways largely separate, and the reputation for fact checking is largely separate as well. MSNBC, for example, has a liberal slant to its news, but a smart liberal would be more likely to trust a fact in the Wall Street Journal than a fact uttered on MSNBC because the Wall Street Journal has a reputation for fact-checking and accuracy that MSNBC does not. The same holds true for someone looking at the New York Observer vs. the New York Times. Even if you like the perspective of the Observer, if you were asked to bet on the accuracy of two pieces–one from the Observer and one from the Times–you could make a lot of money betting on the Times.

Narratives are a different matter. You may like the narrative of MSNBC or the Observer–or even find it more in line with reality. You might rely on them for insight. But if you are looking to validate a fact, the question you want to ask is not always “What is the bias of this publication?” but rather, “What is this publication’s record with regard to accuracy?”

What Makes a Trustworthy News Source?

27

Mike Caulfield

Experts have looked extensively at what sorts of qualities in a news source tend to result in fair and accurate coverage. Sometimes, however, the number and complexity of the various qualities can be daunting. We suggest the following short list of things to consider.

Here’s an important tip: approach agenda last. It’s easy to see bias in people you disagree with, and hard to see bias in people you agree with. But bias isn’t agenda. Bias is about how people see things; agenda is about what the news source is set up to do. A site that clearly marks opinion columns as opinion, employs dozens of fact-checkers, hires professional reporters, and takes care to be transparent about sources, methods, and conflicts of interest is less likely to be driven by political agenda than a site that does not do these things. And this holds even if the reporters themselves may have personal bias. Good process and news culture go a long way toward mitigating personal bias.

Yet, you may see some level of these things and still have doubt. If the first three indicators don’t settle the question for you, you should consider agenda. Is the source connected to political party leadership? Funded by oil companies? Have the owners made comments about what they are trying to achieve with their publication, and are those ends about specific social or political change or about creating a more informed public?

Again, we cannot stress enough: you should read things by people with political agendas. It’s an important part of your news diet. It’s also the case that sometimes the people with the most expertise work for organizations that are trying to accomplish social or political goals. But when sourcing a fact or a statistic, agenda can get in the way and you’d want to find a less agenda-driven source if possible.

National Newspapers of Record

28

Mike Caulfield

When it comes down to accuracy, there are a number of national newspapers in most countries that are well-staffed with reporters and have an editorial process that places a premium on accuracy. These papers are sometimes referred to as “newspapers of record.” We’re aware that the term originated as a marketing plan to distinguish the New York Times from its rivals. At the same time, it captures an aspiration that is not common across many publications in a country. When I wrote code for Newsbank’s Historical Paper Archive, we took the idea of newspapers of record seriously even on a local level. With the mess of paper startups and failures in the 1800s, understanding what was reliable was key. Which of that multitude of papers was likely to make the best go at covering all matters of local importance?

“National newspapers of record” are distinguished in two ways:

  1. They are rigorous, showing attention to detail and having accountability in their editorial processes.
  2. They have a truly national view and attempt to be the best possible record of what happened in the nation (not just a region) on a given day.

The United States is considered by some to have at least four national newspapers of record:

You could add in the Boston Globe, Miami Herald, or Chicago Tribune. Or subtract the LA Times or Washington Post. These lists are meant to be starting points, indicating that a given publication has a greater reputation and reach than, say, the Clinton Daily Item.

Some other English-language newspapers of record:

Does that mean these papers are the arbiters of truth? Nope. Where there are disagreements between these papers and other reputable sources, it could be worth investigating.

As an example, in the run-up to the Iraq War, the Knight Ridder news agency was in general a far more reliable news source on issues of faulty intelligence than the New York Times. In fact, reporting from the New York Times back then was particularly bad, and many have pointed to one reporter in particular, Judith Miller, who was far too credulous in repeating information fed to her by war hawks. Had you relied on just the New York Times for your information on these issues, you would have been misinformed.

There is much to be said about failings such as this, and it is certainly the case that high profile failings such as these have eroded faith in the press more generally, and, for some, created the impression that there really is no difference between the New York Times, the Springfield Herald, and your neighbor’s political Facebook page. This is, to say the least, overcompensation. We rely on major papers to tell us the truth, and rely on them to allocate resources to investigate and present that truth with an accuracy hard to match on a smaller budget. When they fail, as we saw with Iraq, horrible things can happen. But that is as much a testament to how much we rely on these publications to inform our discourse as it is a statement on their reliability.

A literate fact-checker does not take what is said in newspapers of record as truth. But, likewise, any person who doesn’t recognize the New York Times or Sydney Morning Herald as more than your average newspaper is going to be less than efficient at evaluating information. Learn to recognize the major newspapers in countries whose news you follow to assess information more quickly.

 

Activity: Expert or Crank?

29

Mike Caulfield

Twitter Expertise?

This guy has a pretty negative reaction to something published in a highly reputable journal. Is he an expert, or just a guy with opinions about things?

Figure 69

Woodward and Bernstein

Are these the reporters who brought down Nixon? Is this a trustworthy reporter sharing this photo?

Figure 70

 

Activity: Find Top Authorities for a Subject

30

Mike Caulfield

Get together in small groups, and by both pooling group knowledge and doing research, develop a list of three authoritative books/websites for information on one of the following subjects:

Your sources should be:

Or, the sources should be:

When each group has finished their selection, trade your list of expert books/sites with another group and have that group critique the list.

Some questions for reflection:

Field Guide

V

Verifying Twitter Identity

31

Mike Caulfield

One relatively common form of misinformation is the fake celebrity retweet. Sometimes this happens by accident–a person mistakenly retweets a parody account as real. Sometimes this happens by design, with an account faking a retweet. Here are some tips to make sure that the tweet you are looking at on Twitter is from the person you are attributing it to.

Twitter Identity Basics

With Twitter, accounts are generally (although not always) run by a single person. However, unlike Facebook, Twitter does not enforce a “real name” policy, which makes it easy for one person to run multiple accounts, and to run accounts under different names. In fact, an important part of Twitter culture is the constellation of parody accounts, bots, and single issue accounts that amuse and inform Twitter subscribers.

At the same time, it’s easy to get confused. As an example, consider the account of Representative Jack Kimble. Here’s a typical tweet:

Figure 71

If you’re a liberal, looking at this tweet may get your blood boiling. How can anyone possibly believe this? Especially a Representative?

Scanning the Twitter bio doesn’t help.

Figure 72

Here we see that he’s from the 54th District of California and he’s got a book out. Now if we’re reading carefully we might notice some fishy things here: his book, Profiles in Courageousness, seems like a parodic re-titling of Jack Kennedy’s Profiles in Courage. “E pluribus unum,” which means “From the many, one,” is translated to “1 nation under God”.

Oh, also: California only has 53 districts.

Unfortunately, you’ll likely be in such a huff about the comments that you won’t notice any of these things. So what is a general purpose indicator that you need to slow down? In most cases, it’s going to be the absence of a “verified account” marker.

Checking Verified Accounts

As a counter-example to “Representative Kimble,” here’s a real representative, Jason Chaffetz, from Utah’s 3rd District.

Figure 73

That little blue seal with the check mark (the “verified badge”) indicates that this is a “verified identity” by Twitter. Twitter asserts that this person has proved they are who they say they are.

Who gets to get verified? It’s a bit unclear. Twitter puts it this way:

An account may be verified if it is determined to be an account of public interest. Typically this includes accounts maintained by users in music, acting, fashion, government, politics, religion, journalism, media, sports, business, and other key interest areas.

However, all members of Congress and senior administration officials qualify for such status. So do most major public figures and prominent writers. If you don’t see the blue badge, either disregard the tweet as suspicious, or do further research.

One additional note: sometimes people try to fake these indicators; an example is faking a verification symbol in a header.

Figure 74

This user has used their background image to place a verification badge next to their name. To steer clear of this sort of hack, always view the badge in the sidebar or small “hover” card, not the header. To be extra sure it’s legit, hover your cursor over it–the words “verified account” should pop up.

This sounds complicated, but once you learn it, it takes maybe two seconds. Here I am, for example, checking to see if this is really New York Governor Andrew Cuomo’s Spotify playlist, or a fake account, using a quick hover technique:

Video: “Checking a Verification Badge in Twitter” (viewable online at http://textbooks.whatcom.edu/webliteracy/?p=173)

Figure 75

In this case it’s verified. The governor should probably lay off Billy Joel a bit, but this is a legitimate tweet.

 

Other Methods

Not all celebrities have verified accounts. If you don’t find the verification badge, you may have to dig a little deeper.

There are a couple things to look for in an unverified account:

As an example, here is the Minerva Schools Twitter account. Minerva is a small, but high-profile school in California. The account is not verified. Is the account legitimate? Is it really Minerva?

Figure 76

A number of things suggest it is. It was created in August 2013, right around when I know Minerva was created. It has followers I know (from educational technology, which is what the school is known for). One of the followers is a person I know who works there.

Figure 77

We could stop there, or we could also note that the tweetstream is entirely consistent with what we’d expect for an organization like this, and the number of followers, while not huge, is in line with what we might expect for an account like this.

No single factor here clinches it (although the employee showing up in the follow list comes close), but all these factors together give us a fair amount of confidence that this is a legitimate account.

If we wanted to go one step further (and we really don’t have to here) we could web search the handle and see if it is referenced from any official pages.

 

Fake Screenshots

Sometimes people fake screenshots of tweets that never happened.

Not all tweet screenshots are fake. Many times Twitter users will screenshot a tweet rather than retweet it because they fear the original will be deleted. Here’s Michael Li screenshotting an embarrassing tweet which was later deleted.

Figure 78

Other times, people may screenshot a tweet because they wish to discuss a tweet without attracting the ire of a particular group of followers. As an example, during the #Gamergate controversy many people critical of Gamergate took screenshots of bad behavior on Twitter (harassment and the like) because they were afraid that if they commented via re-tweeting they might become a target themselves.

Sometimes people retweet screenshots as a way of breaking a chain of credit, so that people will be forced to retweet them, and not the original tweeter. (This practice is rightly frowned upon.)

Sometimes, however, the screenshot may be fabricated. In fact, many “tweet generators” exist online that allow you to create fake pictures of tweets. I made this one a couple minutes ago:

Figure 79

If you come across a person re-tweeting a screenshot, check to see if the tweet really exists on Twitter first. In the above case, for example, you could check Obama’s timeline.

Deleted Tweets

What if they deleted the tweet, as in the “ONE MAN + ONE MAN” example above, or the tweet someone was referencing has since been deleted? How do you verify it then?

Don’t worry–in many cases there are still ways to dig up the tweet.

If it’s a tweet from a politician (and it usually is) you can try Politwoops, which logs all tweets deleted by significant public officials. Here are some tweets recently deleted by President Trump:

Figure 80

Another technique is searching for the Twitter account on Google and looking for the cached version of the page. In the video below we search for @RealDonaldTrump in Google and then look at the cached version of his Twitter page. This works well with things recent enough to be on the first page of a Twitter stream, but old enough that Google has indexed them.

Video: “Getting Cached Twitter Page” (viewable online at http://textbooks.whatcom.edu/webliteracy/?p=173)

Figure 81

The Twitter bar sometimes obscures the cache information, but if you can see it, it will tell you when it was last indexed. The time is in Greenwich Mean Time (the same time as London, England). So for instance, this cache of Trump’s tweets was taken at 2 o’clock London time (which would be early this morning in my Pacific Coast time).
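The conversion from the cache’s GMT timestamp to your own clock is simple arithmetic. Here is a minimal sketch in Python, assuming a hypothetical cache time of 2:00 PM GMT on an arbitrary winter date, and assuming Pacific Standard Time is a fixed eight hours behind GMT (which ignores daylight saving time):

```python
from datetime import datetime, timedelta, timezone

GMT = timezone.utc
# Assumption: Pacific Standard Time as a fixed UTC-8 offset (no daylight saving).
PACIFIC_STANDARD = timezone(timedelta(hours=-8), name="PST")


def gmt_to_pacific(cache_time_gmt: datetime) -> datetime:
    """Convert a GMT cache timestamp to Pacific Standard Time."""
    return cache_time_gmt.astimezone(PACIFIC_STANDARD)


# Hypothetical cache timestamp: 2:00 PM GMT on an arbitrary February date.
cached_at = datetime(2017, 2, 15, 14, 0, tzinfo=GMT)
local_time = gmt_to_pacific(cached_at)

print("Cached (GMT):    ", cached_at.strftime("%Y-%m-%d %I:%M %p"))
print("Cached (Pacific):", local_time.strftime("%Y-%m-%d %I:%M %p"))
```

Under those assumptions, a page cached at 2:00 PM GMT was cached at 6:00 AM Pacific time, so anything tweeted after that point will not appear in the cached copy.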

Figure 82

Activity: Verify a Twitter Account

32

Mike Caulfield

Kellogg’s Rant

General Kellogg was named by President Trump as acting head of the National Security Council on February 13, 2017. Is this Twitter account his?

https://twitter.com/GenKeithKellogg/status/832825494009638912

Explain your reasons.

 

Using the Wayback Machine to Check for Page Changes

33

Mike Caulfield

Sometimes we want to see how a page has changed over time, or know when a page disappeared. Using the Wayback Machine can help you do that.

Here’s how that works. Go to the Wayback Machine and search for a page or site. Here we’ll search for the front page of the White House site:

Figure 83

The Wayback Machine doesn’t archive every page, but it does archive an awful lot of them. Whether a page is archived will often depend on whether it was heavily linked to in the past, or whether it was published by a site that the Wayback Machine tracks. In the case of the White House, of course, both these things are true and we have a near perfect history of the site.

Figure 84

Let’s go back in time all the way to 1999. When we select 1999, we see a calendar. Each circle indicates a snapshot made of the site. The green and blue indicate whether the page was a “redirect”–an issue beyond the scope of this article.

Click on a date to see a “snapshot” of the page on that date. Here we see a snapshot of the site from January 1999, in the later years of the Clinton administration.

Figure 85

Sites will be browsable, to some extent, so go ahead and click on the links. Advanced functionality, such as search interfaces and interactive content, will usually not work.
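If you check snapshots often, the same lookup can be expressed as a URL rather than a series of clicks. The sketch below only builds the addresses; it does not fetch anything. It assumes the Wayback Machine’s usual snapshot address pattern (a timestamp followed by the original address) and its availability lookup endpoint. The function names are hypothetical, chosen for illustration.

```python
from urllib.parse import urlencode


def wayback_snapshot_url(page_url: str, timestamp: str) -> str:
    """Build the address of the archived capture closest to `timestamp`.

    `timestamp` is a string of digits in YYYYMMDDhhmmss order; a shorter
    prefix such as "1999" or "19990125" asks for the nearest capture.
    """
    return f"https://web.archive.org/web/{timestamp}/{page_url}"


def wayback_availability_url(page_url: str, timestamp: str) -> str:
    """Build a lookup address that reports the closest capture, if any exists."""
    query = urlencode({"url": page_url, "timestamp": timestamp})
    return f"https://archive.org/wayback/available?{query}"


# The White House front page, as close as possible to late January 1999.
print(wayback_snapshot_url("whitehouse.gov", "19990125"))
print(wayback_availability_url("whitehouse.gov", "19990125"))
```

Pasting the first address into a browser should land on the capture nearest that date; the second is meant for scripts that want to know whether any capture exists before going further.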

 

Finding Out When a Page Was Published Using Google

34

Mike Caulfield

Many pages will tell you the date they were published. But some pages don’t give publication dates, and some can’t be trusted.

Take, for example, this story from fake site ABCNews.com.co (a hoax site that attempts to look like an ABC News site).

Figure 86

You’ll note that the publication date was November 11.

That’s what the site looks like today. But we can see what it looked like previously, courtesy of archive.org’s Wayback Machine.

Here’s what it looked like in March, sporting a publish date of March 24:

Figure 87

Here it is in June, sporting a date of June 16:

Figure 88

And in September, it sported a date of September 11:

Figure 89

 

Hoax sites often do this date incrementation to increase the share rate on older stories. People are more likely to share things if they believe they are breaking news and not yesterday’s story.

So how do we get some sense of when this story was first published?

We can’t get there exactly but we can often use Google to get close. Google stores the date of the first time it indexed a page–on popular sites this date is usually within a couple days of the true publish date (on unknown sites it is much less reliable).

To get Google to show the indexed date of a page, you’ll need to do two things:

Here’s what that looks like in this case:

Figure 90

As you can see, we’ve taken the URL of the page and entered the following as the search term:

site:abcnews.com.co/donald-trump-protester-speaks-out-i-was-paid-to-protest/

Then we’ve used date filtering to create a filter that doesn’t exclude anything (its date range is all possible dates), but triggers this sort of date display in Google.

Again, this is not a rock-solid publication date, but we can say that there was some content at this URL at this date, and in most cases, with a URL like this, that means the story was up by then.
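Turning a page address into the `site:` search term is mechanical, so it can be sketched in a few lines of Python. This is an illustration rather than an official Google interface: the helper names are hypothetical, and the code only builds the search term and a basic search address. The date-range filter is still applied by hand in Google’s search tools, as described above.

```python
from urllib.parse import urlencode, urlsplit


def site_query(page_url: str) -> str:
    """Turn a page address into a `site:` search term (scheme removed)."""
    # Add "//" when no scheme is present so the host is parsed correctly.
    parts = urlsplit(page_url if "://" in page_url else "//" + page_url)
    return f"site:{parts.netloc}{parts.path}"


def google_search_url(search_term: str) -> str:
    """Build a basic Google search address for the given search term."""
    return "https://www.google.com/search?" + urlencode({"q": search_term})


story = "http://abcnews.com.co/donald-trump-protester-speaks-out-i-was-paid-to-protest/"
term = site_query(story)
print(term)
print(google_search_url(term))
```

The first line printed is the same search term used in the example above; the second is an address that runs that search, ready for the date filter to be added.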

 

Citation Rates

35

Mike Caulfield

Students often overestimate how much the average paper gets cited. I’ve seen students look at a paper with 40 citations and say, “Eh, can we really trust it with only 40 citations?”

In truth, most papers that get cited even a few times are legitimate papers (vs. junk), and in all fields 40 citations indicates a paper that has had a lot of expert eyes on it. That’s the point of citations in source verification work. It’s not necessarily about the quality of the paper–you need expertise to assess that, and a paper with 100 citations is not necessarily better than one that has 10. What citations show you, for a quick and dirty process, is that experts have read a certain work or author and found their work worthy of discussion. More citations don’t mean more quality, but they do mean more expert eyes have probably looked at it and found it worth either agreeing or disagreeing with in public.

If you still want to know averages, here’s a list of citation averages from 2011, but note that citations follow a power law, and any average here is far above the median.
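To see why an average overstates the typical paper when a few papers collect most of the citations, here is a small worked example in Python. The citation counts are made up for illustration; they are not drawn from any real field.

```python
from statistics import mean, median

# Hypothetical citation counts for ten papers (illustrative only):
# most are cited a handful of times, and one is cited heavily.
citation_counts = [0, 0, 1, 1, 2, 3, 5, 8, 40, 240]

average = mean(citation_counts)
typical = median(citation_counts)
papers_below_average = sum(1 for count in citation_counts if count < average)

print(f"mean:   {average}")
print(f"median: {typical}")
print(f"{papers_below_average} of {len(citation_counts)} papers fall below the mean")
```

In this invented sample the mean is 30 and the median is 2.5, so eight of the ten papers sit below the “average.” A paper with 40 citations is well ahead of the pack.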

Using Google Books to Track Down Quotes

36

Mike Caulfield

Did Carl Sagan say this?

Figure 91

Quotes on the internet are some of the most commonly faked content. People misattribute quotes to give them significance, or fabricate tendentious quotes to create controversy. (For some examples of fact-checking historical quotes, check out Quote Investigator.)

In our case, if we know that Carl Sagan is an author of many books, rather than start in Google or DuckDuckGo’s general search we might start in Google Books, which will likely get us to the source of the quote faster. Additionally, even if we cannot find the source, we might find someone quoting this in a book from a major publisher, which is likely to have a more developed fact-checking process than some guy on Twitter.

So we go to Google Books and we pick out just a short snippet of unique phrasing. I’m going to choose “clutching our crystals and nervously consulting.”
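The same exact-phrase search can also be written as a query address. A minimal sketch, assuming the public Google Books volumes endpoint and a hypothetical helper name; it builds the address without sending any request, and wraps the snippet in double quotes so the words are searched as a phrase:

```python
from urllib.parse import urlencode


def books_phrase_query_url(snippet: str) -> str:
    """Build a Google Books query address for an exact-phrase search."""
    # Collapse stray whitespace so copy-and-paste artifacts do not break the phrase.
    phrase = " ".join(snippet.split())
    return "https://www.googleapis.com/books/v1/volumes?" + urlencode({"q": f'"{phrase}"'})


print(books_phrase_query_url("clutching our crystals and nervously consulting"))
```

A short, distinctive snippet works better than the whole quote, since a single altered word in a long quotation would cause an exact-phrase search to miss the original.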

Figure 92

Down there at the bottom, the fourth result, is a book by Carl Sagan. It says it’s from 2011, but don’t be fooled by this date; this is just the date of the edition indexed here. Let’s click through to the book to check the quote and sort out the date later.

Clicking through the book we find the quote is accurate. More importantly, we find the surrounding context and find that this quote is not being taken out of context. Sagan was truly worried about this issue. His prediction was very much that a sound bite obsessed media, combined with a sort of celebration of ignorance, would drag us backwards. He understood that the world was becoming more difficult while the communication of ideas was simultaneously becoming more shallow.

Figure 93

You can find out the original publication date of this work a number of ways. There’s a “more versions” option on the Google Books interface. You could go look for the book’s article on Wikipedia, as they will usually give you the publication date. But the easiest way is usually to turn to the front pages of the book and find the date, just as you would with a physical book.

Figure 94

Understanding Astroturf

37

Mike Caulfield

“Grassroots” political efforts emerge from the “bottom-up,” with small local groups banding together to put pressure on city, county, state, or federal government to take (or oppose) specific action. They are “people-powered,” usually relying on volunteer labor and small donations from local people and organizations. In the age of social media, the phrase “grassroots” has also been applied to national movements that start with a small group of citizens organizing online.

Being “grassroots” is not a technique limited to Republicans or Democrats. The Tea Party revolts against President Obama’s health care plan, for example, had many grassroots elements, being organized on the local level by loosely connected people and local organizations. Moms Demand Action, a gun control advocacy group, was started when a stay-at-home mother was shocked by her son’s response to the Sandy Hook school shooting. She put up a Facebook page to organize action, and slowly built a movement.

Citizens tend to look more favorably upon people-powered, local politics than corporate initiatives funded by people from somewhere else. The desire to portray corporate and non-local efforts as local has led to a practice called astroturfing, where large corporations or rich individuals use “front groups” that look like local groups of activists, but are funded and organized primarily by national corporations or rich individuals from elsewhere.

When deciding whether an organization is astroturfing, consider the following:

There is a bit of a sliding scale here for what qualifies as astroturfing. A locally founded initiative that receives primarily national money is (a bit) less astro-turfy than an organization founded directly by a corporation. An initiative that receives money from a foundation dedicated to a larger social goal (such as elimination of poverty) is less astro-turfy than a corporation spending money to boost its stock price or get rid of regulations that constrain it. In general, what is most important is whether the organization’s reality matches the story that they are publicly telling.

 

Searching TV Transcripts with the Internet Archive

38

Mike Caulfield

The Internet Archive allows you to search the captions of major news programs that aired after 2009, making it possible to find statements that may have aired on TV but not in print.

As an example, consider this video that seems to show Donald Trump speaking about a picture of the annual pilgrimage to Mecca (a Muslim tradition) as a “sea of love.”

https://www.youtube.com/watch?v=xNiK26RGcF8

There are plenty of reasons to doubt this is an authentic video. It has a low view count given its content; it’s on a YouTube channel that generally features jokes, not political content; the lighting on that picture is weird; and if you have heard Trump speak about his inauguration, you probably heard him use these same terms. The likelihood is that someone has doctored a video of him talking about the inauguration and made it look like a commentary on Mecca.

But if we want to prove that definitively, we should probably find the original video.

Here we’re going to go over to the Internet Archive’s TV News Archive and search for “tremendous sea of love,” and right there, the second result, is the video that has been altered, along with the ABC chyron:

Figure 95

If you play this video, you’ll see President Trump talking about the crowds at his own inauguration: someone clearly altered the picture the president was pointing to in the other video.

There’s also a specialized Trump collection on the site if you just want to search the clips in which Donald Trump plays a part.

We can use this for other things as well. For example, we might want to fact-check whether Mike Pence agreed with the “Muslim Ban” during the later part of the campaign. We can check that by going into the Trump archive and typing “pence muslim ban.”

Figure 96

When you click on that, you’ll see Mike Pence agreeing directly with that particular language.

Why is this important? So much of what our leaders communicate is now over the air with very little written record. Resources on sites like these are not indexed by Google, but are freely accessible and provide irreplaceable functionality for fact-checking civic discourse. Keep them in mind, especially if you are specifically looking for video content or if general news searches have failed.

Treating Google’s “Snippets” with Suspicion

39

Mike Caulfield

Occasionally when you search for an answer to a question on Google, you will find not only websites but also a “knowledge panel” on top of the results that appears to have what search expert Danny Sullivan calls the “One True Answer.”

Sometimes Google pulls an answer from a source algorithmically. For example, in response to “How many men landed on the moon?,” this panel answers “12 men,” citing a Quora article.

Figure 97

Sometimes Google does not pull out the answer but makes the answer apparent in the blurb or headline of the card, as in this answer to the query, “last person to walk on the moon”:

Figure 98

This function of Google can be useful, but it malfunctions frequently enough that it should not be trusted without verifying the source and context of the answer. There are several major problems, including false simplicity and false (or non-standard) information.

False Simplicity

Here’s a question: how many apostles are there in the Christian tradition? Google tells you, via a panel, even pulling out the number, thereby making it look decidedly objective: there are twelve!

Figure 99

If you click through to that Quora question, though, you’ll find that it answers a much more specific and simpler question: how many original apostles did Christ have (according to tradition). And for that answer they are correct. Including Judas, there are twelve.

But according to tradition, when Judas dies Matthias becomes an apostle, so that’s thirteen. Then, Paul is an apostle, so fourteen. And Barnabas, Timothy, and James. The truth is that this answer is pretty debatable: it’s certainly not twelve, and some versions of the Bible refer to up to 25 different people as “apostles.”

It gets worse. These numbers, which already vary, come from different Christian traditions. Many historians, on the other hand, see the twelve apostles as a creation of the early Church that had no reality or significance during the lifetime of the historical Jesus and was later “retrojected” into the Gospels.

The fact is the whole question of how many apostles there were and who they were is inextricably bound up with complex questions of religion, history, and 1st century power struggles about who counted in the early church and who didn’t.

This may seem petty, but the truth is any extended discussion of this issue from any source, religious or historical, will surface these issues to the person who investigates. Google’s panels, however, are oblivious to this kind of complexity and present a simple numerical answer where no simple answer actually exists.

Misleading Highlights

Google uses some programming to try to highlight relevant answers in the blurb, but the highlighting can be confused or confusing. Here, Google, when asked how old Lee Harvey Oswald was when he shot Kennedy, highlights 18, 24, and 22.

Figure 100

In reality, the answer is 24 years old, though a quick glance at this might have you thinking 18 or 22.

Blatant Misinformation

Sometimes the panel presents blatant misinformation. Often this material is the product of highly politicized areas or of conspiracy-believing communities, which tend to rank highly on Google search results more generally.

Take for instance this search, where we ask Google which presidents were in the Ku Klux Klan. The Google panel provides what seems to be a definitive answer: there were five!

Figure 101

As Case Western Reserve University history professor Peter Shulman points out, this isn’t even remotely true. None of these presidents were members of the Ku Klux Klan (as far as we know), and if you click through to the article, you’ll find the source here is a Nigerian newspaper of uncertain stature that references a book by David Barton, a nationalist known for self-publishing dubious works of historical revisionism.

There are numerous examples of similar behavior. Adrianne Jeffries at The Outline details some more bad snippets, including this one claiming Obama is planning for martial law (complete fiction):

Figure 102

Google will also tell you that Lee Harvey Oswald didn’t assassinate John F. Kennedy, despite the overwhelming evidence to the contrary:

Figure 103

Confirmation Bias and Bad Snippets

A lot of times Google is just bad. But bad answers are often the result of asking questions in ways that tap into the language or concerns of pseudoscience, conspiracy theory, or fringe beliefs. For example, there is a very real problem some people have with monosodium glutamate, a food additive that triggers an allergic reaction in a small portion of the population. If you search on a phrase likely to be found in the medical literature like “msg sensitivity,” you get a fairly reliable result.

Figure 104

Healthline, in this case, is a recognized provider of reliable health information.

All this changes if you use the language of fringe groups that believe the medical community is suppressing a link between MSG and a variety of neurological disorders. Here’s what you get when you type in ‘msg dangers’:

Figure 105

The blurb says it all (brain damage! alzheimer’s! learning disabilities!), but if you look up the site (mercola.com) you’ll find it is run by a physician who has been warned by the FDA repeatedly to stop making false claims.

 

Our Advice

In general, simply treat the Google panel (“one true answer”) as you would any other top search result. Despite Google’s claims to the contrary, it is not significantly more or less reliable than an average source. Click through, trace the claims on the page to a source, and investigate the source. Never trust its result without validating the source of the claim.

 

 

 

Using Buzzsumo to Find Highly Viral Stories

40

Mike Caulfield

If you are looking to hone your fact-checking skills, you may want to find highly viral stories. Your own Facebook and Twitter feeds are one good source for such stories, but sometimes you’ll want to get outside your filter bubble and see the stories that other folks are sharing.

There are a number of tools you can use to find highly viral stories. Buzzsumo is one simple-to-use option. Here’s how to find stories to investigate using it.

First, go to Buzzsumo.com.

Figure 106

Put in a search term, like “cancer.” Buzzsumo will return the most shared stories on the topic of cancer. You can filter them by recency. Here, we look at just stories in the past week.

Figure 107

The Facebook engagements figure is not purely about shares–it encompasses other actions as well–but it is a good metric of how viral the story is.

The free version of Buzzsumo only lets you view the top results and limits the number of searches you can perform per day, but it’s often enough access to enable you to find an interesting story to fact-check.  I like this “Cancer Cure Genius Silenced by Medical Mafia” one–its inflammatory language is a good indicator that the claims in it are likely to be overstated.

Figure 108

If you are writing your claim analysis up for the Digital Polarization Initiative, make a note of the engagements, as they are often a good proxy for the influence of the story on the general public. Thirty thousand engagements on this story makes it one of the top cancer stories of the week and one well worth looking into.

 

 

 

Finding Out Who Owns a Domain

41

Mike Caulfield

Many times you’ll want to know who is behind a domain. This used to be relatively easy to find out: when a person bought a domain, their name was put into a “registry,” which is sort of like the “phonebook of domains.” (Yes, I know: many of you are probably now asking what a “phonebook” is…)

Back then, to find out who owned a domain, you’d just go and look it up, using a service called WHOIS.

Unfortunately, things got more complicated. People who had their email addresses and names in the “domain phonebook” would get spam email, or the information displayed on the registry would be used to try to hack their site. And many people–for example, political dissidents–had good reason to not reveal their names. So, a lot of the “registrars” (the people who you buy your domain name from) started offering masking services, which hide the owner of the domain.

Nowadays, if you want to find out who owns a domain, WHOIS-type services are a good first stop, even though they will usually fail for smaller sites.
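For readers comfortable with a little code, the lookup behind WHOIS-type services can be sketched directly. WHOIS is a plain-text protocol: you open a connection to a server on port 43, send the domain name followed by a line ending, and read back the record. The sketch below is only an illustration, not the method any particular lookup site uses. The function names are our own, and starting at whois.iana.org is an assumption; its reply typically names the registry server that holds the fuller record.

```python
import socket

WHOIS_PORT = 43  # the standard WHOIS port


def build_whois_query(domain: str) -> bytes:
    """Return the bytes to send: the bare domain followed by CRLF."""
    return domain.strip().lower().encode("ascii") + b"\r\n"


def whois_lookup(domain: str, server: str = "whois.iana.org",
                 timeout: float = 10.0) -> str:
    """Send one WHOIS query and return the server's plain-text reply."""
    with socket.create_connection((server, WHOIS_PORT), timeout=timeout) as conn:
        conn.sendall(build_whois_query(domain))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when it is done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Example use (requires network access):
# print(whois_lookup("example.com"))
```

A web-based lookup tool does the same kind of query for you and is usually easier to read, so treat this as a way to understand what the tool is doing rather than a replacement for it.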

To look up domain ownership, we recommend a tool called Domain Dossier. Go to the site and type in your root URL and check all the checkboxes:

Figure 109

When the identity is not masked, you’ll be able to see the owner of the domain. The first place to look is “Registrant Name” and “Registrant Organization”:

Figure 110

Occasionally, you may not get a useful name from the record, but the address might be telling.

If the name is masked in Domain Dossier, you’ll get a record that looks like this:

Figure 111

You may also see the name of a masking service, such as “Domains by Proxy”:

Figure 112

In this case, the registrant is not from Arizona and not named “Domains by Proxy”–that is just the masking service.

Again, it’s important to note that masking is common enough these days that it shouldn’t cause suspicion.
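If you check many domains, reading each record by eye gets tedious. The sketch below shows one way to pull the “Registrant Name” and “Registrant Organization” lines out of a WHOIS record and flag likely masking. It is a rough illustration under stated assumptions: the list of masking phrases is our own and far from complete, the two sample records are invented for the example, and real records vary in layout from registrar to registrar.

```python
import re

# Assumed, non-exhaustive phrases that suggest a masking service.
MASKING_HINTS = ("domains by proxy", "privacy", "redacted", "whoisguard", "proxy")

FIELD_PATTERN = re.compile(
    r"\s*(Registrant Name|Registrant Organization)\s*:\s*(.*)", re.IGNORECASE
)


def registrant_fields(whois_text: str) -> dict:
    """Collect the first Registrant Name / Organization values in a record."""
    fields = {}
    for line in whois_text.splitlines():
        match = FIELD_PATTERN.match(line)
        if match:
            key = match.group(1).title()
            fields.setdefault(key, match.group(2).strip())
    return fields


def looks_masked(fields: dict) -> bool:
    """True when no registrant is listed or a masking phrase appears."""
    values = " ".join(fields.values()).lower()
    return not values or any(hint in values for hint in MASKING_HINTS)


# Invented sample records, for illustration only.
OPEN_RECORD = (
    "Domain Name: MOTHERJONES.COM\n"
    "Registrant Organization: Foundation for National Progress\n"
)
MASKED_RECORD = (
    "Registrant Name: Registration Private\n"
    "Registrant Organization: Domains By Proxy, LLC\n"
)

open_fields = registrant_fields(OPEN_RECORD)
masked_fields = registrant_fields(MASKED_RECORD)
```

A “masked” flag here is only a prompt to look elsewhere for ownership information; as noted above, masking by itself is not a sign of bad faith.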

While domain owners can hide their names, they cannot hide the date the domain was registered. As we’ll discuss in another chapter, this is often useful information. By looking at the domain registration date, you can often get a sense of whether a site has a long history behind it or if it has been spun up for a specific purpose.
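The registration date can be read out of the record and turned into an age. The sketch below assumes the record contains a line of the form “Creation Date: YYYY-MM-DD…”, which is common but not universal, and the sample record and dates are invented for the example.

```python
import re
from datetime import date, datetime
from typing import Optional

CREATION_PATTERN = re.compile(r"Creation Date:\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE)


def creation_date(whois_text: str) -> Optional[date]:
    """Return the domain's creation date, or None if the line is missing."""
    match = CREATION_PATTERN.search(whois_text)
    if match is None:
        return None
    return datetime.strptime(match.group(1), "%Y-%m-%d").date()


def age_in_days(created: date, as_of: date) -> int:
    """Number of days between registration and the date you care about."""
    return (as_of - created).days


# Invented record: a site registered a few weeks before a story went viral.
SAMPLE_RECORD = "Domain Name: EXAMPLE-NEWS.TEST\nCreation Date: 2016-10-01T00:00:00Z\n"

created = creation_date(SAMPLE_RECORD)
days_old = age_in_days(created, as_of=date(2016, 11, 8))
```

A domain that is only weeks old when a story breaks is not proof of anything, but it is a reason to look harder at who is behind the site.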

 

 

 

 

Avoiding Confirmation Bias in Searches

42

Mike Caulfield

Was 9/11 a hoax? Let’s find out. We type in ‘was 9/11 a hoax’ and we get:

Figure 113

Well, look at that. Not only does the top result say that the attack on 9/11 was faked–the top five results do. To the untrained eye it looks like the press has been hiding something from you.

But of course the 9/11 attacks were not faked. So why does Google return these results?

The main reason here is the term. The term “hoax” is applied to the 9/11 attacks primarily on conspiracy sites. So when Google looks for clusters on that term (and links to documents containing that term), it finds that conspiracy sites rank highly.

Think about it: reputable physics journals, policy magazines, and national newspapers are not likely to run headlines asking if the attacks were a hoax. But conspiracy sites are.

The same holds true even for more benign searches. The question, “Are we eating too much protein” has Google return a panel from the Huffington Post (now HuffPost) and a website from a vegan advocacy group.

Figure 114

To avoid confirmation bias in searches:

Promoted Tweets

43

Mike Caulfield

Promoted tweets are real tweets, but they do not reach you because they were shared by the people you follow. They reach you because the author of the tweet paid Twitter money to put it in your feed.

Here’s an example of a promoted tweet, asking you to “Tweet your Senators” about the dangers of drug importation:

Figure 115

Promoted tweets aren’t necessarily untrue, but they should be treated the way one would treat a commercial. In this case, we look to see what organization has posted the tweet.

 

Figure 116

That leads us to their webpage and organization name: The Partnership for Safe Medicines.

Figure 117

And a little bit of investigation takes us to a page on the NPR site that shows this organization has ties to Big Pharma:

Figure 118

While none of this means the organization’s claims are wrong or false, it is a worthwhile perspective to have before you decide to retweet the tweet or not. Treat promoted tweets with suspicion. Someone is paying money to influence you, and it’s best to know who before retweeting.

Finding Old Newspaper Articles

44

Mike Caulfield

While more recent news articles are available from both Google’s and Bing’s news search tabs, older news can be more difficult to retrieve. Many options for retrieving old news entail paying a subscription fee or per-article cost, which is a bit expensive for a person just checking up on a story. In this section, we’ll show you how to use news archives to check on the existence of articles at no cost.

A Sample Problem

President Trump claimed the investigation to see if his campaign had colluded with Russia was a “witch hunt.” No sooner had he said that than this snapshot of an article appeared in my feed:

 

Newspaper article saying "Nixon Sees Witch-hunt"

Figure 119

By now you should know it’s trivially easy to fake something that looks like a snapshot of an old headline. So how do we find out if this article actually ran?

Our first instinct might be to go to the Washington Post to see if they have this article. That’s not a bad instinct, but in this case the headline clearly ran somewhere other than the Post–the Washington Post doesn’t tag its own articles as coming from the “Washington Post.” This particular headline was run in another paper.

So we want to do a broad search across many historical American papers. When reporters do this, they most often use tools such as LexisNexis and ProQuest, which are usually unavailable to average people.

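We’ll have to make do with sources that are searchable from the web. There are three major web-searchable archives in the U.S.:

Google News Archive (news.google.com/newspapers)

Newspapers.com (newspapers.com)

NewspaperArchive (newspaperarchive.com)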

Google offers complete articles. The other two offer snippets unless you pay them money, but snippets are enough for this sort of task.

So we construct our search. It’s just a variation on the “site:” syntax we’ve used elsewhere.

Nixon Sees Witch Hunt (site:newspapers.com OR site:news.google.com/newspapers OR site:newspaperarchive.com)

And we get back a time-stamped result from the LA Times, with a date (in 1973) that looks promising:

Screenshot of search results with positive hit on story on top

Figure 120

Note the text of the result: “Nixon sees witch-hunt Sears insiders say.” What’s that “Sears” bit about?

It becomes evident when we click through and look at the page:

Figure 121

You can see above we’ve circled the headline. The free version only offers this blurry “thumbnail” image of the page, but it’s enough to spot the headline. It also makes obvious where the “Sears” came from–the text here was automatically generated by computer and must have included the Sears ad next to it as part of the headline.

If we scroll down the page, we can see enough to confirm that this article as I saw it in my feed was correct, even though the automatic character recognition has messed up a lot of the words:

Part l-A-Sun., July 22, 1973 I Nixon Sees ‘Witch-Hunt; Sears Insiders Say Prices Effective through Tuesday, July 24 BY BOB WOODWARD and CARL BERNSTEIN Thft Washington Post WASHINGTON President Nixon and his top aides believe that the Senate ‘Watergate hearings are unfair and constitute a “political witch-hunt,” according to White House sources. The sources, said, that the President .in recent weeks had expressed bitterness and deep hostility toward the two-.morith-old proceedings.

We have enough here to say that this ran in the LA Times in July 1973. And if we really wanted to see a clean version of the article, we could subscribe to the service and grab a better image, which may be what the original tweeter did.
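Because the automatic character recognition garbles words, an exact text match against the archive’s transcription can miss a headline that is really there. One way to check is an approximate match. The sketch below uses Python’s standard difflib module to score how closely a headline matches any run of words in the transcribed text; the function names are our own, and the sample is a shortened version of the transcription quoted above (with a plain apostrophe in place of the curly one).

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> list:
    """Lowercase the text and split it into runs of letters and digits."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).split()


def best_match_ratio(headline: str, ocr_text: str) -> float:
    """Best similarity (0.0 to 1.0) between the headline and any run of
    the same number of words in the OCR text."""
    target = normalize(headline)
    words = normalize(ocr_text)
    target_str = " ".join(target)
    best = 0.0
    last_start = max(1, len(words) - len(target) + 1)
    for start in range(last_start):
        window = " ".join(words[start:start + len(target)])
        best = max(best, SequenceMatcher(None, target_str, window).ratio())
    return best


OCR_SAMPLE = (
    "Part l-A-Sun., July 22, 1973 I Nixon Sees 'Witch-Hunt; Sears Insiders Say "
    "Prices Effective through Tuesday, July 24 BY BOB WOODWARD and CARL BERNSTEIN "
    "Thft Washington Post"
)

exact = best_match_ratio("Nixon Sees Witch-Hunt", OCR_SAMPLE)
garbled = best_match_ratio("The Washington Post", OCR_SAMPLE)
```

Here the headline matches perfectly once punctuation is ignored, while “The Washington Post” still scores above 0.9 against the garbled “Thft Washington Post.” Where to draw the line between a match and a miss is a judgment call, and a high score is a reason to look at the page image, not a substitute for it.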

Checking Cited Headlines

Here’s another paragraph, this time from the New York Times, that claims the LA Times ran a derogatory headline when the first female commercial pilot at a major airline got her wings.

There were no female pilots at the biggest airlines until 1973, when American Airlines hired the first, Bonnie Tiburzi Caputo. In a reminder of how times have changed, that news was reported in The Los Angeles Times under the headline, “Airline Pilot to Fly by Seat of Panties.”

The New York Times is a very reliable paper, and in this case we probably don’t need to check the article title. But let’s try anyway with the same sort of search as above:

Airline Pilot to Fly by Seat of Panties (site:newspapers.com OR site:news.google.com/newspapers OR site:newspaperarchive.com)

Note that because the optical character recognition sometimes transcribes things wrong, we don’t put quotes around the search phrase, at least at first. When we put it in, we’re in luck–we can see the headline in the blurb:

Figure 122

We might also search for a type of headline. For instance, a key point of the critics of global warming is the statement that scientists in the 1970s feared “global cooling” instead of global warming; the point being that the global warming scare is one in a long series of bad guesses to be later thrown away. Can we compare the number of global cooling and global warming stories in the 1970s?

We execute a search for:

global cooling (site:newspapers.com) 1975

and we get an article from 1975, which talks of some sensationalist claims of a coming ice age. But when the reporter talks to a climatologist, the tone is different:

But Lawson prefers to speak in terms of the following probabilities: —In the long run, over thousands of years, there is probability of an ice age. —In the next few decades, there is a probability of a warming trend. —In the next few years, the probability is that global cooling will continue dbownward to 19th century levels.

(Note: For some reason newspaper archive searches react badly to date filters, which is why we just put 1975 in plain text.)

If we search for “global warming” in 1975, we get this quote in the January 29, 1975 edition of the Orlando Sentinel from a government scientist:

“After the next decade or so will come a warming trend, both because of increased CO2 in the atmosphere and thermal pollution by power plants and so on. In the 21st century, man’s activities will predominate over nature.” J. Murray Mitchell, senior research climatologist, Environmental Data Service, National Oceanic and Atmospheric Administration.

While one would need much more evidence to settle the question of whether scientists on the whole feared global cooling or global warming in the 1970s, it’s clear enough that many scientists expected warming due to man’s activities even then. If you’re looking at sharing an article that says that “cooling” was the big 1970s worry, you might want to sit on it before reposting.
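The archive searches used in this chapter all follow the same pattern, so they can be assembled with a small helper if you run them often. The sketch below is one possible way to do that. The function names are our own, and the search URL format is an assumption about how Google accepts queries rather than anything the archives require.

```python
from urllib.parse import urlencode

# The three archives searched in this chapter.
ARCHIVE_SITES = (
    "newspapers.com",
    "news.google.com/newspapers",
    "newspaperarchive.com",
)


def archive_query(phrase, sites=ARCHIVE_SITES, exact=False, year=None):
    """Build a query restricted to newspaper archives.

    exact=False leaves the phrase unquoted, which tolerates OCR errors.
    year is appended as plain text rather than used as a date filter.
    """
    text = f'"{phrase}"' if exact else phrase
    site_clause = " OR ".join(f"site:{site}" for site in sites)
    query = f"{text} ({site_clause})"
    if year is not None:
        query += f" {year}"
    return query


def google_search_url(query):
    """Turn a query string into a Google search URL (assumed URL format)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


nixon = archive_query("Nixon Sees Witch Hunt")
cooling = archive_query("global cooling", sites=("newspapers.com",), year=1975)
```

Starting with the unquoted form and switching to exact=True only if the results are too noisy mirrors the advice above about not putting quotes around the phrase at first.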

 


Accessibility

VI

Image Descriptions

46

Mike Caulfield

Navigation tip: If you arrive here in a new window, press Control-W or Command-W to close this tab and return to the text.

While we try to list all figures in order, edits to the book may result in figure descriptions going out of order. For best results, always access the description from the caption link.

Figure headings are linked to themselves. Clicking on a link will merely scroll it to the top of the page.

FIGURE 1

A tweet from Twitter user @RonHogan that reads “The Nazis murdered Senator Schumer’s grandmother and most of her children. Trump’s father was arrested at a Ku Klux Klan rally.” It is in response to a Donald Trump tweet.

It has been retweeted over 55,000 times.

End.

FIGURE 2

A story with the headline “MORE HYPOCRISY: Obama banned all Iraqi Refugees for 6 Months in 2011– Liberals said nothing!” over a picture of protests against President Trump’s ban.

End.

FIGURE 3

A set of DuckDuckGo search results. The top results are from fact-checking sites Snopes and Politifact.

End.

FIGURE 4

A segment of a President Trump speech that reads “We must protect those who protect us. The number of officers shot and killed in the line of duty last year increased by 56 percent from the year before.”

End.

FIGURE 5

DuckDuckGo search results. The top search result is an article from the Washington Post fact-checker and highlighted text matches our query.

End.

FIGURE 6

A story with the headline “Report: US Government Ethics director approved controversial tweets” over a picture of President Trump.

End.

FIGURE 7

Text from the article with sentences mentioning the Daily Dot highlighted. If you read carefully, the Daily Dot (another publication) is the source of each fact (e.g. “the Daily Dot reported that Shaub sent an email” etc.).

End.

FIGURE 8

A screenshot of a page from the publication Network World. There are ten stories at the bottom of the page, but in small print under each one is an indication that they were paid for by an advertiser. The one in the upper left corner reads “Lawmakers Concerned About Insane Military Scope Released to Public” and is sponsored by “ZeroTac Tactical Scopes.”

End.

FIGURE 9

An enlargement of the ZeroTac tactical scope “article” link, showing the space below it where it indicates the sponsor.

End.

FIGURE 10

An article from InfoWorld on the topic of “Integrated Systems” by a man named Paul Miller. But above the article is small text that reads “Sponsored,” and near the top of the page is tiny text that indicates the sponsor is Hewlett Packard, a company that sells integrated systems.

End.

FIGURE 11

A screenshot of a New York Times webpage with many items on it. In the middle column of items, small text reading “News from AP and Reuters” tops the column.

End.

FIGURE 12

New York Times article with headline “UK Stock Market Hits Record as Manufacturers Win Business.” Where a reporter’s name might usually appear under the headline reads in small print, “by the Associated Press.”

End.

FIGURE 13

An article titled “Do You Support Patriotic Bikers Defending Trump’s Inauguration?” The article says that a source named Right Alerts Polls broke the story, but does not provide a link.

End.

FIGURE 14

Screenshot of the result of selecting and right-clicking. The term “Right Alerts Polls” is highlighted and a context menu shows. The context menu offers an option to “Search Google for ‘Right Alerts Polls'”. Note that you could do this without using the context menu; just copy and paste the phrase into a Google search box.

End.

FIGURE 15

A Google search for “Right Alerts Polls bikers” reveals the article the other page cited as a source. It is the top result.

End.

FIGURE 16

The extended quote from the page reads, “These libtards need to shut the hell up. This is not only a biker event, but it is a Trump Supporters event. We are many and varied but we unite as one.” It is said to be a quote on a Facebook page organizing the event.

End.

FIGURE 17

Screenshot of selecting “shut the hell up. This is not only a biker event.” The context menu offers an option to “Search Google for ‘shut the hell up. This is not only a biker event'”. Note that you could do this without using the context menu; just copy and paste the phrase into a Google search box.

End.

FIGURE 18

The Google search results for “shut the hell up. This is not only a biker event.” The second result (which the screenshot calls attention to) has a web address on Facebook and is in the subdirectory of “events.”

End.

FIGURE 19

Facebook page showing only 1,800 have indicated that they are going to the biker event. In addition, only 8,000 are interested, and the page has only been shared with 10,000 people total.

End.

FIGURE 20

A photo shared through ABC News showing a parked car boxed in by shopping carts with the headline, “Shopper Upset over Double-Parked Car.”

End.

FIGURE 21

The top two Google search results for “shopper upset over double-parked car abc action news.”

End.

FIGURE 22

The Google search results for “shopping carts double-parked portland or.”

End.

FIGURE 23

A WGME article explaining the story behind a picture of a double-parked car surrounded by shopping carts.

End.

FIGURE 24

Google search results for “Matthew Mills” with one result featuring the caption, “this guy got a lesson in parking.”

End.

FIGURE 25

Facebook search result for “‘got a lesson in parking’ Matthew Mills” showing a public post by Matthew Mills.

End.

FIGURE 26

Google Image search results for “parking revenge carts.”

End.

FIGURE 27

A Reddit post titled “Great Parking Job” showing a picture of a double-parked car surrounded by shopping carts.

End.

FIGURE 28

A tweet by user @NinjaEconomics that reads “On January 3, the #GDPNow model forecast for real GDP growth in Q4 2016 is 2.9%” and shows a chart about the GDP forecast.

End.

FIGURE 29

A tweet by @unsmokable that reads “the life of a national geographic photographer” and shows a photo of a man standing on volcanic terrain and looking through a camera situated on a tripod. The photographer’s shoes and tripod have flames around them.

End.

FIGURE 30

A closer crop of the tweet by user @unsmokable showing the results when a viewer right-clicks/control-clicks on the image.

End.

FIGURE 31

Results from a Google reverse image search on the photo from Twitter user @unsmokable’s tweet.

End.

FIGURE 32

A Reddit post titled, “In the heat of the moment” with comments debating over the photo of the photographer with flaming shoes and a tripod.

End.

FIGURE 33

An article by Katie Hosmer titled “Hot Lava Sets Adventurous Photographer’s Feet on Fire.”

End.

FIGURE 34

A close up of the article by Katie Hosmer showing the text “via [PetaPixel].”

End.

 

FIGURE 35

Text from the PetaPixel site quoting the photographer of the lava photo, reading, “The photo is real, but the flames are not the result of spontaneous combustion” going on to explain that the photographer used an accelerant to start the flames.

End.

FIGURE 36

Close up of the PetaPixel site showing the results when a reader right-clicks/control-clicks on the Hawaii News Now link.

End.

FIGURE 37

The Google results from searching “Hawaii News Now.”

End.

FIGURE 38

A photo Twitter users attributed to National Geographic, which depicts what appears to be a photographer being attacked by a bird.

End.

FIGURE 39

A Google reverse image search result that suggests the best search term to find our original source is “birds attacking people.”

End.

FIGURE 40

A list of pages including images that match the reverse searched image. The first webpage is titled, “Dangerous Birds – Top 10 Birds That Could Kick Your Ass.” All of the pages appear to discuss bird attacks.

End.

FIGURE 41

An expanded settings list for Google reverse image search that can be accessed by clicking “Tools” and “Custom range…” These settings can be altered to filter out newer photos by modifying the dates that will be included in the results list.

End.

FIGURE 42

A new reverse image search, with a custom date of Dec 31, 2009 to exclude newer photos, such as those which may have been virally propagated under false pretenses. Now, our suggested search term is “bird.”

End.

FIGURE 43

The result page of our reverse image search, in which the title of the third website, PentaxForums, reads, “Got too close the the hawk :(,” and the description reads: “And as the poster said, these are trained…so its more like the camera man pissed off the hunter rather than the bird itself. Rest of the photos. Kazakhstan Eagle…”

End.

FIGURE 44

An article from the Press titled, “Kazakhstan Eagle Hunt,” which features our image.

End.

FIGURE 45

A list of Google search results of the search term, “stockton ca local affiliate.” We will select the fourth listing, CBS Sacramento.

End.

FIGURE 46

CBS Sacramento search with “teenage girls black lives matter” in the search bar.

End.

FIGURE 47

A photograph depicting a group of photographers running from a bear.

End.

FIGURE 48

A photograph in which a man in a body of water is hiding with a camera in a swan hunting tent.

End.

FIGURE 49

A photograph showing a section of a city empty and in shambles with what appears to be debris cluttering the buildings and streets.

End.

FIGURE 50

A photograph depicting a large stone ram on top of a semi-truck with the “OVER-SIZE” label on its front bumper. The ram appears to be more than three times the height of the semi-truck.

End.

FIGURE 51

A screenshot of the Baltimore Gazette, a site created to spread misinformation. The headline reads, “Clinton Received Debate Questions Week Before Debate, According to Sources.”

End.

FIGURE 52

A Google search tip demonstrating how to exclude a specific site from search results. The string used in the example is “baltimoregazette.com -site:baltimoregazette.com”. This would search all sites except for “baltimoregazette.com.”

End.

FIGURE 53

The homepage of the Pacific Justice Institute.

End.

FIGURE 54

Google search results for “www.pacificjustice.org -site:www.pacificjustice.org.” The search omits the site www.pacificjustice.org and brings up a Wikipedia article as the first result.

End.

FIGURE 55

WHOIS search result on the ICANN interface for “motherjones.com.”  It displays the website’s owner, Foundation for National Progress, and its contact information.

End.

FIGURE 56

WHOIS search result on the ICANN interface for “baltimoregazette.com”. The website’s owner is listed as Domains by Proxy.

End.

FIGURE 57

A close up of baltimoregazette.com’s date of creation from WHOIS on the ICANN interface, which is listed as July of 2015.

End.

FIGURE 58

An article published in the peer-reviewed journal PLOS Medicine.

End.

FIGURE 59

A Google search for “plos medicine impact factor,” which indicates in the knowledge panel its impact factor is 13.585 as of 2015.

End.

FIGURE 60

An article published in the Journal of Obesity and Weight-loss Medication whose impact factor we want to investigate.

End.

FIGURE 61

A Google search for “Journal of Obesity and Weight-loss Medication impact factor” whose impact factor does not appear in a knowledge panel.

End.

FIGURE 62

The Google Scholar search results for “David Bann,” which features his many publications in lifespan obesity patterns. Most of the publications we find are from the last ten years.

End.

FIGURE 63

The AnonHQ article titled, “It’s Official: European Scientific Journal Concludes 9/11 was a Controlled Demolition.” The article has over 14,000 views and was published on September 11, 2016.

End.

FIGURE 64

The Google Scholar search results for “Robert Korol,” who appears to have published architectural research in the 1970s, 80s, and 90s.

End.

FIGURE 65

The Google Scholar search results for “Jennie Connor 2016,” which shows her well-cited publications. Her 2017 article received 12 citations, and two articles were cited by 23 and 36 others.

End.

FIGURE 66

The Google search results for “addiction impact factor,” which we find in the knowledge panel to be 4.145 as of 2010.

End.

FIGURE 67

The Google search result for “nih alcohol and cancer.” The fifth result from the NIH is described as “A fact sheet that summarizes the evidence linking alcohol consumption to the risk of various cancers…”

End.

FIGURE 68

The Google search result for “www.cancer.gov -site:www.cancer.gov.” This search includes all sites other than www.cancer.gov. We see that five results down, the National Institutes of Health, an organization we trust, is talking about the National Cancer Institute.

End.

FIGURE 69

A tweet by Twitter user @MichaelESmith that reads, “Bullshit! Aztec society collapsed in 1519 fr. Cortes & smallpox. Salmonella in 1540 was far too late. And the painting is European fantasy.” Smith is responding to a tweet claiming that salmonella poisoning may have contributed to the fall of the Aztec civilization.

End.

FIGURE 70

A tweet by @pixelatedboat featuring a photo of two men that reads, “This is Woodward and Bernstein. Nixon called them the enemy. They proved that no president is above the law. #NotTheEnemy.”

End.

FIGURE 71

A tweet by user @RepJackKimble that reads, “Why have the wars cost so much under Obama? Check the budgets, Bush fought 2 wars without costing taxpayers a dime.”

End.

FIGURE 72

The Twitter bio of user @RepJackKimble reading, “Congressman from CA’s 54th District. JackKimble.com Author of Profiles in Courageousness amzn.to/1ER7SeU E pluribus unum (1 Nation under God).”

End.

FIGURE 73

The Twitter bio of user @jasoninthehouse reading, “United States Congressman (UT-3). Chairman, Oversight & Government Reform. Tweets come from me, not my staff.” A small blue seal next to the user’s name indicates that his identity is verified by Twitter.

End.

FIGURE 74

The header of Twitter user @PerseusJackson, strategically using the background image to give the impression that it is a verified account by Twitter.

End.

FIGURE 75

A video showing how to hover over a Twitter user’s verification seal to check if it is legitimate.

End.

FIGURE 76

The Twitter bio of user @MinervaSchools reading, “Minerva offers a unique undergraduate education for the brightest, most motivated students in the world.”

End.

FIGURE 77

Twitter user @MinervaSchool’s tweetstream from February showing two tweets, the number of followers the account has, and the number of tweets the account has made.

End.

FIGURE 78

A tweet by user @mcpli mocking the screenshot of a supposed tweet by user @DanPatrick which reads, “MARRIAGE= ONE MAN & ONE MAN. Enough of these activist judges. FAVORITE if you agree. I know the silent majority out there is with us!”

End.

FIGURE 79

A fake tweet generated by the author of this text that shows user @BarackObama tweeting, “Web Literacy for Student Fact-Checkers is AMAZING! You should read it. (Thanks Mike!)”

End.

FIGURE 80

The Politiwhoops archive of deleted tweets by user @realDonaldTrump showing two tweets made and deleted by the account in February of 2017.

End.

FIGURE 81

A video showing how to view the cached version of @realDonaldTrump’s Twitter page by searching the account through Google, hovering over the drop down arrow next to the first result’s URL, and selecting “Cached.”

End.

FIGURE 82

Google’s cache information of @realDonaldTrump’s Twitter page, reading “This is Google’s cache of https://twitter.com/realdonaldtrump. It is a snapshot of the page as it appeared on Feb 15, 2017 14:46:56 GMT.”

End.

FIGURE 83

The search bar of the Wayback Machine with the search term “whitehouse.gov” typed in.

End.

FIGURE 84

The Wayback Machine’s search results for “whitehouse.gov” displaying a calendar of the months of January, February, March, and April of 1999 with blue and green dots encasing some of the calendar’s dates.

End.

FIGURE 85

The page of whitehouse.gov from January 1999 showing links to White House documents, the contents of the website, Radio Addresses of the President, Executive Orders, Photographs, a database of all government sites, The Declaration of Independence, The Constitution of the United States, a subscription list, and press releases.

End.

FIGURE 86

An ABCNews.co article entitled, “Donald Trump Protester Speaks Out: ‘I Was Paid $3,500 To Protest Trump’s Rally’” and showing a publication date of November 11, 2016.

End.

FIGURE 87

An ABCNews.co article entitled, “Donald Trump Protester Speaks Out: ‘I Was Paid $3,500 To Protest Trump’s Rally’” and showing a publication date of March 24, 2016.

End.

FIGURE 88

An ABCNews.co article entitled, “Donald Trump Protester Speaks Out: ‘I Was Paid $3,500 To Protest Trump’s Rally’” and showing a publication date of June 16, 2016.

End.

FIGURE 89

An ABCNews.co article entitled, “Donald Trump Protester Speaks Out: ‘I Was Paid $3,500 To Protest Trump’s Rally’” and showing a publication date of September 11, 2016.

End.

FIGURE 90

The first Google result for “site:abcnews.com.co/donald-trump-protester-speaks-out-i-was-paid-to-protest/” showing the abcnews.co article with a publication date of March 26, 2016.

End.

FIGURE 91

A tweet by user @cbquist posting a quote supposedly said by Carl Sagan, which states, “I have a foreboding of an America in my children’s or grandchildren’s time–when the United States is a service and information economy; when nearly all the manufacturing industries have slipped away to other countries; when awesome technological powers are in the hands of a very few, and no one representing the public interest can even grasp the issues; when the people have lost the ability to set their own agendas or knowledgeably question those in authority; when, clutching our crystals and nervously consulting our horoscopes, our critical faculties in decline, unable to distinguish between what feels good and what’s true, we slide, almost without noticing, back into superstition and darkness.”

End.

FIGURE 92

The top Google Books search results for “clutching our crystals and nervously consulting.”

End.

FIGURE 93

An excerpt of Carl Sagan’s Demon-Haunted World, found through Google Books, where Sagan provides the quote that was attributed to him by Twitter user @cbquist.

End.

FIGURE 94

The publication information of Carl Sagan’s Demon-Haunted World showing a publication date of 1996.

End.

FIGURE 95

Internet Archive’s TV News Archive search for “tremendous sea of love.” The second result is our video, and I have circled the video, which is from ABC.

End.

FIGURE 96

A search for “pence muslim ban” in the Trump archive, which shows the text of a video in which Mike Pence, when asked if he agrees with the Muslim ban, responded, “I do.”

End.

FIGURE 97

Google search result for “how many men landed on the moon” in which a knowledge panel answers the query via Quora with “12 men.”

End.

FIGURE 98

Google search result for “last man to land on the moon” in which a knowledge panel pulls text from a Wikipedia article and puts the name “Cernan” in bold as the answer to the question.

End.

FIGURE 99

Google search result for “how many apostles were there” in which a knowledge panel replies “12 apostles” via Quora.

End.

FIGURE 100

Google search result for “how old was lee harvey oswold at the time of the assassination” in which a knowledge panel puts in bold 18, 22, and 24, which are numbers pulled from a Wikipedia article’s dates for Oswald’s birth, his death, and the assassination. None of them answers the Googled question.

End.

FIGURE 101

Google search result for “Presidents in the kkk” in which a knowledge panel pulls the names of several presidents from The Trent Online.

End.

FIGURE 102

Google search result for “is obama planning martial law” in which a knowledge panel pulls a quote from newstarget.com claiming that Obama is in fact planning martial law.

End.

FIGURE 103

Google search result for “why did lee harvey oswold assassinate president kennedy” in which a knowledge panel pulls text from a site claiming that Oswald did not assassinate President Kennedy.

End.

FIGURE 104

Google search result for “msg sensitvity” in which a knowledge panel pulls a list of symptoms from Healthline.

End.

FIGURE 105

Google search result for “msg dangers” in which a knowledge panel brings up Mercola, which claims that MSG causes brain damage and conditions such as Alzheimer’s disease and learning disabilities.

End.

FIGURE 106

Homepage of Buzzsumo, which features a search bar on its main page.

End.

FIGURE 107

Buzzsumo results for “cancer,” showing two articles and their Facebook engagements, which is meant to measure the virality of the articles on Facebook.

End.

FIGURE 108

Buzzsumo results for “cancer” scrolled down a few articles. One article, “Royal Rife: Cancer Cure Genius Silenced by Medical Mafia” uses particularly inflammatory language.

End.

FIGURE 109

Domain Dossier search bar with “coca-cola.com” typed in and a list of the databases it searches, each with a checkbox you can click to include its results.

End.

FIGURE 110

Domain Dossier results for the search on “coca-cola.com” in which the registrant’s name, organization, street, and city are all available for public access.

End.

FIGURE 111

Domain Dossier search results for “protrump45.com,” showing that the site’s owner is masked.

End.

FIGURE 112

Domain Dossier search results showing the registrant of a site’s name as Domains by Proxy, LLC, a service that masks the real owners of sites.

End.

FIGURE 113

Google search results for “was 9/11 a hoax” in which the top five sites endorse the conspiracy theory that 9/11 was faked.

End.

FIGURE 114

Google search results for “are we eating too much protein” in which Google pulls a knowledge panel from Huffington Post, and the top site promotes veganism.

End.

FIGURE 115

Promoted tweet from user @SafeMedicine urging us to tweet our senators against our exposure to unsafe medicine. We can tell it’s promoted by the gray text that reads “Promoted” below the “reply,” “retweet,” and “like” functions.

End.

FIGURE 116

Twitter page for user @SafeMedicine, which features its website name, safemedicine.org.

End.

FIGURE 117

The homepage of safemedicine.org, which reveals the name of the organization, The Partnership for Safe Medicines.

End.

FIGURE 118

An article about The Partnership for Safe Medicines on the Northwest Public Radio site titled, “Nonprofit Working to Block Drug Imports Has Ties to Pharma Lobby.”

End.

FIGURE 119

The headline of a newspaper article from 1973 titled “Nixon Sees ‘Witch-Hunt’ Insiders Say” with the Washington Post’s name below the headline.

End.

FIGURE 120

Google search results for “Nixon Sees Witch Hunt (site:newspapers.com OR site:news.google.com/newspapers OR site:newspaperarchive.com),” which restricts the search to these three sites. The first result, from the LA Times, mentions our headline in the description and is from 1973.

End.

FIGURE 121

The newspaper article from the first result of our last Google search, which features our headline “Nixon Sees ‘Witch-Hunt’ Insiders Say.”

End.

FIGURE 122

Google search results for “Airline Pilot to Fly by Seat of Panties (site:newspapers.com OR site:news.google.com/newspapers OR site:newspaperarchive.com),” in which the article appears in the first result.

End.

“Fact-Checking Sites” Image Descriptions

47

Mike Caulfield

Navigation tip: If you arrive here in a new window, click control-w or command-w to close this tab and return to the text.

There are no images or figures in Chapter Five.

“How to Use Previous Work” Image Descriptions

48

Mike Caulfield

Figure 4.1
Figure 4.2
Figure 4.3
Figure 4.4

FIGURE 4.1

A story with the headline “MORE HYPOCRISY: Obama banned all Iraqi Refugees for 6 Months in 2011– Liberals said nothing!” over a picture of protests against President Trump’s ban.

End.

FIGURE 4.2

A set of DuckDuckGo search results. The top results are from fact-checking sites Snopes and Politifact.

End.

FIGURE 4.3

A segment of a President Trump speech that reads, “We must protect those who protect us. The number of officers shot and killed in the line of duty last year increased by 56 percent from the year before.”

End.

FIGURE 4.4

DuckDuckGo search results. The top search result is an article from the Washington Post fact-checker and we can see the highlighted text that matches our query.

End.

“Go Upstream to Find the Source” Image Descriptions

49

Mike Caulfield

Figure 7.1
Figure 7.2

FIGURE 7.1

A story with the headline, “Report: US Government Ethics director approved controversial tweets” over a picture of President Trump.

End.

FIGURE 7.2

Text from the article with sentences mentioning Daily Dot highlighted. If you read carefully, the Daily Dot (another publication) is the source of each fact, e.g. “the Daily Dot reported that Shaub sent an email”, etc.

End.

How DigiPo Defines a “Fact”

1

Mike Caulfield

We take a rather old-fashioned view of what a “fact” is in this text. For us, a fact is a claim about which there is general agreement by people in the know.

Most claims aren’t facts and aren’t intended to be presented as facts. People make claims all the time. I may claim that Mulholland Drive is the best film of the 2000s. You may claim that Spider-Man 2 is. People can disagree about these things. These are claims, but they are not facts.

Some types of claims, however, also qualify as “statements of fact.” It is a statement of fact, for example, that Mulholland Drive was directed by David Lynch, and it’s a statement of fact that Sweet Home Alabama starred Reese Witherspoon. Facts don’t have to be physical: it’s a fact that Sweet Home Alabama deals with questions of what is most important in life and that Mulholland Drive investigates how the stories we tell ourselves differ from the reality we inhabit.

Facts can even deal with situations that are hypothetical. We can say that it’s a fact that Sweet Home Alabama would have cost less if it had been shot in Serbia instead of in Georgia and Alabama.

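For us, a fact is:

something that is generally not disputed
by people in a position to know
who can be relied on to accurately tell the truth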

That’s it. When we talk about facts, we are usually attempting to get at truth. But the measurement of what qualifies as fact is that it meets those three criteria.

That said, there’s a lot to unpack in those criteria.

Position to Know: Expertise, Opportunity, and Access

Let’s start with the second one. Who are the “people in a position to know”?

Generally, “a position to know” denotes expertise or opportunity.

Let’s take a car accident as an example. Your car and my car collide on a deserted road. Who is in a position to know what happened? Well, obviously you and I. If we both agree on what happened–say, we both agree that I drifted into the oncoming lane due to a lack of sleep–we can treat that as a fact.

Perhaps someone else disagrees with that. No, says a person who reads the account of the crash in the newspaper, that’s not how it happened at all! Do we suddenly have to start treating this account of the crash as a claim that does not have the status of fact?

It depends. The crucial question is whether this third person is in a position to know. Did they see the crash? Then yes, we have to stop treating our account of the crash as fact. Do they have some deep knowledge of crash forensics that shows the crash is impossible? Are they a crash expert? Well, yes, then perhaps the fact is in dispute, though to override the evidence of our claimed experience, we’d want more than a single expert opinion–we’d want expert consensus.

On the other hand, if they did not see the crash but instead believe that very few crashes happen due to lack of sleep and therefore this cannot be an explanation, then no, we can still treat this as fact, because the people in a position to know are in agreement. The disagreement of a person not in a position to know does not undo that.

With technical questions, “position to know” generally indicates expertise, but even here there is opportunity at work. As an example, consider the recent debate over whether the Russian government was responsible for feeding the Democratic National Committee emails to Wikileaks. To have an informed opinion on this issue, you’d need some expertise in cybersecurity forensics. But just having the expertise is not enough. To evaluate the issue, you’d need access to the systems that were hacked: log files, operating system, etc. Indeed, one of the struggles with that issue was that the people with both access and expertise were not always trusted to tell the truth. (This leads to a situation where we can say it is highly likely the Russian government ordered the distribution of the information to Wikileaks, but we cannot accord it the status of fact.)

By the same token, opportunity isn’t always enough. A person may take a photograph, for example, of something they think is a lynx but turns out, when reviewed by experts, to be a cougar. If three people witness an animal dart across the road, and one works in a zoo, we might be inclined to weight the zookeeper’s opinion of what the animal was more heavily.

Because “position to know” is so important to claims of fact, when we look at sources we are always asking ourselves, “What puts this person in a unique position to know?” If the answer is “nothing in particular,” then we find new sources.

One final thing–and it’s perhaps the hardest thing to swallow. In most cases we, the readers, are not in a position to know about specific facts ourselves. Our experience matters, but as readers we tend to vastly overrate its value in ascertaining questions of fact where we have neither expertise nor opportunity.

We don’t get a vote on the facts; we get a vote on who is most credible. And this means we usually have to trust someone other than ourselves.

Generally Means Generally

What about this other phrase “generally not disputed”? What does that mean exactly?

It’s difficult to say. Imagine you are interviewing twenty people about a wedding reception. Ten of them say that the wedding cake was chocolate, and ten say, “No, that’s wrong, it was yellow cake.” In this case we have a clear dispute, and can’t really treat the type of cake as fact. But say that nineteen people said the cake was yellow cake and one said it was chocolate. In this case, we’re likely to treat the type of cake as fact and assume that the twentieth person may have misremembered the event or had some weird ulterior motive for lying about the cake.

What that line is–where something is “generally” not disputed–is a question of much debate. It may vary depending on the importance of the question. For a relatively silly question like cake type, four out of five people may be enough to say the question is settled. For issues of greater importance, “generally” may require a higher percentage. Importantly, though, for questions where a lot of people are in the know, that percentage does not have to be 100%. Why? Because even people in the know make mistakes, and because people in the know may have reasons for not telling the truth.
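The cake example can be worked through as simple arithmetic. The sketch below is only an illustration, not a rule from this text: the function names and the 0.8 default threshold (the “four out of five” figure above) are assumptions chosen for the example, and where the real line falls remains a matter of debate.

```python
from collections import Counter


def agreement_share(accounts):
    """Return the most common account and the share of people who gave it."""
    counts = Counter(accounts)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(accounts)


def is_generally_undisputed(accounts, threshold=0.8):
    """True if the leading account meets the threshold.

    The 0.8 default is an assumption for this sketch; a more important
    question might call for a higher value.
    """
    _, share = agreement_share(accounts)
    return share >= threshold


# Ten guests say chocolate, ten say yellow: a clear dispute.
split = ["chocolate"] * 10 + ["yellow"] * 10
# Nineteen guests say yellow, one says chocolate.
lopsided = ["yellow"] * 19 + ["chocolate"]

print(is_generally_undisputed(split))                     # False
print(is_generally_undisputed(lopsided))                  # True
print(is_generally_undisputed(lopsided, threshold=0.99))  # False
```

Raising the threshold in the last line shows why the percentage does not have to be 100%: under a near-unanimity rule, one guest who misremembers would be enough to unsettle the question.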

A good example of this is evolution. Evolution, the process by which organisms evolve into new organisms over time, is fact. It’s fact not because every person on the planet has gone over the evidence individually, but because the people who are in the best position to interpret that evidence (biologists, geologists, zoologists, etc.) are in almost unanimous agreement on this point.

But that does not mean that all scientists are in agreement. In fact, 13 percent of scientists disagree with evolution.

Admittedly, the vast majority of those people are neither biologists nor experts in evolution; they are various chemical engineers, physicists, and medical professionals. However, if you dig deeper, you will even find a biologist or geneticist here or there who disagrees with what most scientists see as one of the foundational truths of biology.

If something is so obviously true, how come there is not 100% agreement?

One answer is that maybe the dissenters have it right, and that certainly has happened before. It was, of course, a fact 600 years ago that the sun went around the earth. So sometimes dissenters are right, and the facts are wrong.

More often, when there is overwhelming agreement and a small amount of dissent, it’s just human fallibility. People looking at the same evidence with the same intelligence and the same authority come to different conclusions. Sometimes the minority have a stake in the outcome that can bias them, or just have a different way of looking at things.

Again, this sort of dissent is good, and is necessary to the progress of science, technology, and culture. But when we are asking whether something is a fact, we are not asking whether something has a 100% chance of being true or whether agreement on it will last for all time. We are asking whether enough people in the know agree on it that we can treat it as a settled question and move forward on it, either to action or to more complex claims.

Truthfulness and Fact

It is popularly assumed that the biggest question to ask in media literacy is whether people are lying to you or not. Indeed, bias, the tendency of people’s beliefs, incentives, or financial interests to influence what they promote as “true,” can make some experts or witnesses untrustworthy. If we return to our car crash example: two people crash into one another with different stories on how it happened. It certainly makes sense to examine the motivations each of those persons may have for lying.

At the same time, I’d argue that in fact-checking, it’s more useful to make truthfulness the question you pursue second, not first.

To explain: in any given situation, the people in the best position to know something are generally a small group, since people with both the opportunity to review evidence and the expertise to evaluate it are limited. If you are looking for the fastest way to whittle down who to trust, expertise and opportunity get you there more quickly.

After that point–after you’ve whittled down your trusted circle based on these “position to know” attributes–you may still find disagreement. And at that point, it is useful to ask whether the disagreement is an honest disagreement or is a result of hidden (or not-so-hidden) motivations.

The danger of asking the “truthfulness” question first is that everybody has some bias. So it quickly becomes very easy for a reader to eliminate credible sources because “of course they would say that, wouldn’t they?”

If you ask yourself who would be in a “position to know” first, and go to the truthfulness question second, you’ll be able to validate sources more quickly and more reliably. You’ll also be able to see more clearly any true bias that may be influencing your group of experts.
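The ordering described above, position to know first and truthfulness second, can be sketched as a two-step screen. Everything in this sketch is hypothetical: the `Source` fields, the rule that either expertise or opportunity keeps a source in the circle, and the sample accounts of the car crash are assumptions made for illustration, not a procedure this text prescribes.

```python
from dataclasses import dataclass


@dataclass
class Source:
    name: str
    claim: str
    has_expertise: bool
    had_opportunity: bool
    known_motive: str = ""  # empty string means no obvious stake


def in_position_to_know(source):
    # Assumption for this sketch: either attribute keeps a source in the circle.
    return source.has_expertise or source.had_opportunity


def screen(sources):
    """Step 1: keep only sources in a position to know.
    Step 2: only if those sources disagree, flag the ones with a known motive."""
    circle = [s for s in sources if in_position_to_know(s)]
    claims = {s.claim for s in circle}
    if len(claims) <= 1:
        return circle, []  # no disagreement, so no motive check is needed
    flagged = [s for s in circle if s.known_motive]
    return circle, flagged


# Both drivers agree; the newspaper reader saw nothing and has no expertise.
drivers_agree = [
    Source("driver A", "A drifted into the oncoming lane", False, True),
    Source("driver B", "A drifted into the oncoming lane", False, True),
    Source("newspaper reader", "it cannot have happened that way", False, False),
]
circle, flagged = screen(drivers_agree)
print([s.name for s in circle])   # ['driver A', 'driver B']
print(flagged)                    # []

# The drivers disagree, so motives become worth examining.
drivers_disagree = [
    Source("driver A", "B drifted into the oncoming lane", False, True,
           known_motive="would be liable for the damage"),
    Source("driver B", "A drifted into the oncoming lane", False, True),
]
circle, flagged = screen(drivers_disagree)
print([s.name for s in flagged])  # ['driver A']
```

Note that the motive check runs only on the small circle that survives the first step, which is the point of asking the questions in this order.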