Google Science


Until very recently, I had assumed that the growing wealth of information provided by online search engines, both scientific and otherwise, was an unambiguously good thing.  It turns out that might be less true than I first thought.  I came across an article in Slate that documented Yahoo!’s attempt to fend off a hostile takeover by Microsoft by outsourcing its advertising operations to Google.  Yahoo!, in exchange for a large sum of money, allowed Google to control which advertisements accompany Yahoo! search results, how often, and in relation to which content.  And while it is true that any advertising department, even Google’s, has limits to its influence, the ability to easily locate and view information should come with the understanding that the purveyors of this information have a great deal of influence over what you access and how, exactly, you access it.

That is certainly not to suggest that there’s any sinister code lurking in the depths of Google’s advertising algorithm, but simply that there are nonrandom reasons why one search result appears at the top of page 1 and gets millions of clicks while another sits at the bottom of page 2 and sees a fraction of the traffic.  Google, as it consolidates its position as the primary gatekeeper to information on the internet, is providing people with more access to more information than they’ve ever had before, but that information is presented in a way that results in the first search result receiving a disproportionately high number of clicks relative to results further down the list.  This means that while one search might yield thousands of results, most people only bother to access the first one or two.

While this might just seem like an invisible consequence of the internet search engine, it has a potentially tremendous effect on the way scientific research is conducted here in the United States and around the world.  Last year, University of Chicago sociologist James Evans published an article in Science examining how the online availability of scientific journals has affected the way scientific scholarship is conducted.  Evans tracked how scientists read their peers’ publications, using in-article citations as a proxy.  The results are similar to what happened when Google, Yahoo!, and the other non-scientific search engines gave people a list of results to choose from: they only picked the first few.  Evans showed that as more scientific journals began to make their materials available and searchable online, new scientific publications began to cite fewer, not more, articles as references compared with articles written before the advent and popularization of internet search engines.  Moreover, the citations also tended to be newer, from fewer journals, and from fewer articles within those journals.  In effect, scientists appeared to be clicking only the links at the top of the list.  Evans concludes that the convenient consolidation of data online has possibly had an unintended consequence.  He presents data showing that while the search-engine-driven internet has, without question, vastly increased the volume of information available online, it has significantly reduced the pool of information that is actually accessed and cited.

Beyond suggesting a waning curiosity, this could have real consequences for the future of scientific discovery.  Evans believes that this smaller pool of cited articles may lead to scientific fields achieving quicker, though not necessarily more correct, consensus built upon a narrower base of ideas.  Older ideas, or ones simply not as readily accessible, might be fading away unnoticed if they don’t find themselves presented at the top of a search engine hit list.  It’s not too difficult to imagine a worst-case scenario where a scientist publishes an article full of compelling data, say, on a cancer treatment, only to have it pass by unnoticed when, for some reason, it doesn’t make it to the top of a search engine results page.

As potentially scary as this may sound, the solution to these problems is simple and in large part already exists.  First, science has long functioned as a meritocracy.  Good research has a way of snowballing.  The best ideas get talked about, they get published in the best journals, and they normally find their way to the top of the results page.  It’s fairly safe to say that no published treatment for cancer will ever fall through the cracks.  Second, the problem of having too much information easily available is, quite simply, a good problem to have.  The internet has made more information available to more scientists than ever before in history, and search engines have made it easier to find articles that might never have been found before.  Researchers from around the world can now have a scientific dialogue in real time while simultaneously navigating, with ease, a huge and once-cumbersome archive of old data.

More than anything, Evans’s data serves as a helpful reminder that convenience can never replace diligence, and that while it’s easy to marvel at the enormous amount of information the internet makes available, there will never be an electronic substitute for hard work and thorough science.

