Tweets, emails and other electronic communications can be considered “government documents” and must be preserved. The National Archives handles official government materials, while the Library of Congress’ mandate is to deal with anything that may have long-term historical interest.
“We’re basically in the same situation as the National Archives, only on a much larger scale,” said Bill Lefurgy, digital initiatives program manager at the Library of Congress’ National Digital Information Infrastructure and Preservation Program. “We tend to have a much larger perspective in terms of what we collect.” He joined the Federal Drive with Tom Temin and Amy Morris Tuesday morning to talk about the library’s digital mission.
But how much digital information are we talking about? How about all of the tweets from Twitter’s archives?
“We have an agreement with Twitter where they have a bunch of servers with their historic archive of tweets, everything that was sent out and declared to be public,” Lefurgy said. “The archives don’t contain tweets that users have protected, but everything else, billions and billions of tweets, are there.”
Twitter is transferring that archive to the library using technical processes it developed for the job. “They’ve had to do some pretty nifty experimentation and invention to develop the tools and a process to be able to move all of that data over to us,” Lefurgy said.
The Library of Congress has long been the repository of important historical documents, and the Twitter archive, taken as a whole, is historic in itself.