Posted by K.M. Das
Last week the Washington Post reported that the FBI has built a database with more than 659 million records. FBI officials described the database, culled from 50 FBI and other government agency sources, as one of the most powerful data analysis tools available to law enforcement and counterterrorism agents. The database, known as the Investigative Data Warehouse, was launched in January 2004, but FBI officials demonstrated it only now, amid criticism that the FBI’s technology (in fact, the technology used by the federal government as a whole) is outdated and failing as the fifth anniversary of September 11, 2001, approaches.
Reading the Washington Post’s account of Gurvais Grigg, acting director of the FBI’s Foreign Terrorist Tracking Task Force, typing in “Mohammad Atta” and “flight training” and pulling up 250 articles relating to Atta makes one wonder whether a database can ever be anything but intrinsically reactive. (An aside: a Google search for “Mohammad Atta flight training” pulls up 281,000 hits.) Being able to identify Atta five years after the September 11 attacks has the same reactive feel as TSA barring people from traveling with liquids after British law enforcement agencies arrested 24 people suspected of plotting to use liquid explosives to blow up planes a few weeks back, even though TSA and DHS have been working on machines to detect liquid explosives for a number of years.
Another concern that arises with a database this size is how much misinformation it contains and how difficult it is, or will be, to get the record set straight. Those of you who listened to This American Life’s September 2, 2006, segment on Shaheen and his attempts to get his name off “terrorist watch lists,” or who have ever tried to get an incorrect entry cleared from your credit report, will appreciate the enormity of the task of getting an incorrect entry in the Investigative Data Warehouse purged.
As reported by the Washington Post, others have raised additional security concerns about the database. However, the thing that stuck out the most for me was that 13,000 agents and analysts submit an average of 1 million queries a month. I have absolutely no doubt that most, if not all, of the agents and analysts are using the database for perfectly legitimate reasons almost all of the time. The problem is that with 659 million records being searched a million times a month, the temptation to run a few not-quite-so-legitimate searches must be overwhelming. After all, who hasn’t run their own name through Google or Zillowed their own house? My concern only grows with the Evening Standard’s report, published the same week that the FBI demonstrated the Investigative Data Warehouse, that Britain’s Identity and Passport Service has had to fire four staffers for running improper searches on the National Identity Register, which is meant to provide access to each British resident’s health, financial and police records by 2008.
I recognize that a database such as the Investigative Data Warehouse is not only necessary but, given the proliferation of data and of databases containing immense amounts of information about each of us, inevitable. It is also likely to be an aid to law enforcement. But one wonders what happens when the database gets hacked (the databases of Britain’s Identity and Passport Service have been breached an average of once every year since 2001) and, more importantly, one wonders “quis custodiet ipsos custodes?”