About the Internet Archive
The Internet Archive, a 501(c)(3) non-profit, is building a digital library of Internet sites and other cultural artifacts in digital form. Like a paper library, we provide free access to researchers, historians, scholars, people with print disabilities, and the general public. Our mission is to provide Universal Access to All Knowledge.
We began in 1996 by archiving the Internet itself, a medium that was just beginning to grow in use. Like newspapers, the content published on the web was ephemeral, but unlike newspapers, no one was saving it. Today we have 25+ years of web history accessible through the Wayback Machine, and we work with 950+ library and other partners through our Archive-It program to identify important web pages.
As our web archive grew, so did our commitment to providing digital versions of other published works. Today our archive contains:
- 625 billion web pages
- 38 million books and texts
- 14 million audio recordings (including 240,000 live concerts)
- 7 million videos (including 2 million television news programs)
- 4 million images
- 790,000 software programs
Because we are a library, we pay special attention to books. Not everyone has access to a public or academic library with a good collection, so to provide universal access we need to provide digital versions of books. We began a program to digitize books in 2005 and today we scan 4,000 books per day in 18 locations around the world. Books published prior to 1927 are available for download, and hundreds of thousands of modern books can be borrowed through our Open Library site. One of the Internet Archive's missions is to serve people who have difficulty interacting with physical books, so most of our digitized books are available to people with print disabilities.
Like the Internet, television is an ephemeral medium. We began archiving television programs in late 2000, and our first public TV project was an archive of TV news surrounding the events of September 11, 2001. In 2009 we began to make selected U.S. television news broadcasts searchable by captions in our TV News Archive. This service allows researchers and the public to use television as a citable and sharable reference.
The Internet Archive serves millions of people each day and is one of the top 300 web sites in the world. A single copy of the Internet Archive library collection occupies 99+ petabytes of server space (and we store at least 2 copies of everything). We are funded through donations and grants, and by providing web archiving and book digitization services for our partners. Like most libraries, we value the privacy of our patrons, so we avoid keeping the IP (Internet Protocol) addresses of our readers and offer our site over the HTTPS (secure) protocol.
Generous funding has come from foundations including:
- Andrew W. Mellon Foundation
- Council on Library and Information Resources
- Democracy Fund
- Federal Communications Commission Universal Service Program for Schools and Libraries (E-Rate)
- Institute of Museum and Library Services (IMLS)
- Knight Foundation
- Laura and John Arnold Foundation
- National Endowment for the Humanities, Office of Digital Humanities
- National Science Foundation
- The Peter and Carmen Lucia Buck Foundation
- The Philadelphia Foundation
- Rita Allen Foundation
The Internet Archive is a member of:
- American Library Association (ALA)
- Association for Recorded Sound Collections (ARSC)
- Biodiversity Heritage Library (BHL)
- Boston Library Consortium (BLC)
- Council on Library and Information Resources (CLIR)
- Coalition for Networked Information (CNI)
- Digital Library Federation (DLF)
- Digital Preservation Coalition (DPC)
- Digital Public Library of America (DPLA)
- International Federation of Library Associations and Institutions (IFLA)
- International Internet Preservation Consortium (IIPC)
- Music Library Association
- National Digital Stewardship Alliance (NDSA)