Apparently a lot, according to their robots.txt file, which blocks search engines from indexing nearly everything. Why didn’t they just set a wildcard on everything, for cripes sakes? Why are they doing this? More than likely so that Google won’t keep copies in its cache, which means nobody can later hold the government to something it posted. Link via Waxy.
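For the curious, the effect of the blanket-block the post alludes to is easy to see with Python’s standard `urllib.robotparser`. This is a hypothetical sketch using a made-up wildcard robots.txt, not the actual White House file:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt disallowing everything for all crawlers --
# the kind of single wildcard rule the post asks about.
rules = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Under a blanket "Disallow: /", no path is fetchable by any crawler.
print(parser.can_fetch("Googlebot", "https://example.gov/anything.html"))  # → False
```

Two lines would have done the job; the actual file instead enumerated paths one by one, which is what makes the list itself interesting reading.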
Comments
Hiding Behind Robots.txt
Jake recently noted that the robots.txt file of the White House website prevents indexing of most of the website. Mark has a bit more detail, including some historical background. For example, on April 15, 2003, the robots.txt file has 10…