We were excited to discover recently that copyrightliteracy.org had been selected for inclusion in the UK Web Archive as an information resource with historical interest. However, even we faced some trepidation when considering the copyright implications of allowing archiving of the site (i.e. not everything on the site is our copyright). Firstly, this allowed us to get our house in order, contact our fellow contributors and ensure we had the correct re-use terms on the site (you can now see a CC-BY-SA licence at the footer of each web page). Secondly, this provided an opportunity for another guest blog post, and we are delighted that Louise Ashton, who works in the Copyright & Licensing Department at The British Library, has written the following extremely illuminating post for us. In her current role Louise provides copyright support to staff and readers of the British Library, including providing training, advising on copyright issues in digitisation projects and answering copyright queries from members of the public on any of its 150 million collection items! Prior to this, Louise began her career in academic libraries, quickly specialising in academic liaison and learning technologist roles.
When people think of web archiving, their initial response usually focuses on the sheer scale of the challenge. However, another important issue to consider is copyright; copyright plays a significant role both in shaping web archives and in determining if and how they can be accessed. Most people in the UK Library and Information Science (LIS) sector are aware that in 2013 our legal deposit legislation was extended to include non-print materials, which, as well as e-books and online journal articles, also cover websites, blogs and public social media content. The legislation is known by the snappy title ‘The Legal Deposit Libraries (Non-Print Works) Regulations 2013’ and is enabling the British Library and the UK’s five other legal deposit libraries to collect and preserve the nation’s online life. Indeed, given that the web will often be the only place where certain information is made available, the importance of archiving the online world is clear.
What is less well known is that, unless site owners have given their consent, the Non-Print Legal Deposit Archive is only available within the reading rooms of the legal deposit libraries themselves, and even then can only be accessed using library PCs. Although this mirrors the terms for accessing print legal deposit, because of the very nature of the non-print legal deposit collection (i.e. websites that are generally freely available to anyone with an internet connection) people naturally expect to be able to access the collection off-site. The UK Web Archive offers a solution to this by curating a separate archive of UK websites that can be freely viewed and accessed online by anyone, anywhere, and with no need to travel to a physical reading room. The purpose of the UK Web Archive is to provide permanent online access to key UK websites, with regular snapshots of the included websites being taken so that a website’s evolution can be tracked. There are no political agendas governing which sites are included in the UK Web Archive; the aim is simply to represent the UK’s online life as comprehensively and faithfully as possible (inclusion of a site does not imply endorsement).
However, a website will only be added to the (openly accessible) UK Web Archive if the website owner’s permission has been obtained and if they are willing to sign a licence granting permission for their site to be included in the Archive and allowing all versions of it to be made publicly accessible. Furthermore, the website owner also has to confirm that nothing in their site infringes the copyright or other intellectual property rights of any third party and, if their site does contain third-party copyright material, that they are authorised to give permission on the rights-holders’ behalf. Although the licence has been carefully created to be as user-friendly as possible, the presence of any formal legal documentation is often perceived as intimidating. So even if a website owner is confident that their use of third-party content is legitimate, they may be reluctant to formally sign a licence to this effect; seeing it in black and white somehow makes it more real! Or, despite best efforts, site owners may have been unable to locate the rights-holders of third-party content used in their site and, although they may have been happy with their own risk assessments, this absence of consent prevents them from signing the licence to include the site in the UK Web Archive.
For other website owners this may be the first time they have thought about copyright. Fellow librarians will not be surprised to hear that some people are bewildered to learn that they may have needed to obtain permission to borrow content from elsewhere on the internet for use in their own sites! And then of course there are the inherent difficulties in tracking down rights-holders more generally; unless sites are produced by official bodies it can be difficult to identify who the primary site owners are, and in big organisations the request may never make it to the relevant person. Others may receive the open access request but, believing it to be spam, ignore it. And of course site owners are perfectly entitled to refuse the request if they do not wish to take part. Information literacy plays its part too: where it is crucial that site visitors access the most recent information and advice (for example, websites giving health advice), the site owners may, for obvious reasons, not wish for their site to be included.
Jane and Chris asked me to write this blog post because the UK Copyright Literacy website has been selected for potential inclusion in the UK Web Archive. It was felt important that the Archive should contain a site that documented and discussed copyright issues, given that copyright and online ethics are such big topics at the moment, particularly with the new General Data Protection Regulation coming into force next May. Another reason why the curators wanted to include the Copyright Literacy blog is that, because the website isn’t hosted in the UK and does not have a UK top-level domain (for example .uk or .scot), it had never been automatically archived as part of the annual domain crawl. This is an unfortunate point which affects many websites, as it means that many de facto UK sites are not captured unless manual intervention occurs. To try to minimise the number of UK websites that unwittingly evade inclusion, the UK Web Archive team welcomes site nominations from members of the public. If you would like to nominate a site to be added to the archive, and in doing so perhaps help to play a role in preserving UK websites, you can do so via https://www.webarchive.org.uk/ukwa/info/nominate.
As a final note, we are pleased to report that Jane and Chris have happily agreed to their site being included, which is great news as it means present-day copyright musings will be preserved for years to come!
Acknowledgements: The UK Web Archive Team kindly assisted me with my research for this post.