Document Type : Original Article

Authors

1 Associate Professor, Knowledge & Information Science Department, School of Education & Psychology, Shiraz University, Shiraz, Iran

2 PhD Student, Knowledge and Information Sciences, School of Educational Sciences and Psychology, University of Shiraz, Shiraz, Iran

Abstract

Background and Objectives: We are living through an era of transition from analog to digital formats. Most valuable information is now either born digital or digitized, and it requires digital preservation to ensure its safety, long-term maintenance, and accessibility for posterity. Several web preservation programs have been launched around the world, each with its own properties and areas of activity in line with the policies and goals of its host organization. The present study aimed to explore the activities and properties of the leading web preservation projects and programs in terms of their time coverage, scope of preservation, types of resources preserved, access models, and authorized users.
Methodology: A documentary method was used to identify and analyze the relevant available literature, such as papers, handbooks, and websites. The people in charge of the programs were also surveyed via a short questionnaire sent by email. Top web preservation programs and projects were identified through Google Search and by analyzing program interfaces and documents, directories, and the related literature. After verification and filtering, 61 top programs were selected for study.
Findings: Verification of the programs' launch dates revealed that “Internet Archive” is the oldest, dating back to 1996. The most recent programs were “Anarchism Web Archive” and “Web Harvesting Project of the German National Library”; the former is subject-specific, while the latter is national in scope. While some programs cover a global scope as wide as the web itself, others limit their borders to web resources published in a specific country, region, subject, organization, and/or document type. The first and oldest digital preservation program, “Internet Archive”, has chosen the World Wide Web as its preservation scope, so its time coverage goes back as far as 1996. For some programs, the time coverage is very limited and extends only 2-9 years before their launch dates; examples are “The Cyber Cemetery”, “LAC (Electronic Collection of Library and Archives Canada)”, and “Portuguese Web Archive”. However, these programs apparently depend on large-scale programs such as “Internet Archive” for web resources published before their launch dates. It was also found that 50% of the programs run at the national level and 13.4% cover a specific subject. Politics, Culture, Religion, Science, Economy, Slavery, Government, Anarchism, Human Rights, Social Issues, and Computer and Information Science are among the subjects most frequently covered by the programs. Some programs selected only one or two document types, while others covered a combination of document types for preservation. Access to the archived versions of the preserved documents ranges along a continuum from fully open, through semi-open, to restricted. The largest share of programs (39.1%) apply a fully open access model, followed by those adhering to a restricted access model (23.9%). The semi-open access model was the least frequent (6.7%).
Some programs (6.3%) offer their services to people throughout the world and do not limit themselves to specific users; a prominent example is the “Internet Archive”, which is open to all users around the globe. For other programs (15.2%), access is restricted to authorized users; for example, “Web Harvesting Project of the German National Library” and “AOLA (Austrian Online Archive)” are limited to students and researchers.
Discussion: The results of the present study revealed that the importance of web preservation is duly recognized all over the world, and a wide range of countries are engaged in this endeavor. The programs under study can be classified into two main groups: R&D-related programs and operational ones. Most of them have chosen their national domains for preservation, which results in the preservation of all document types on almost all subjects available in their national web spaces. Many programs also provide open access to the preserved content for all kinds of users throughout the world.
