This HOWTO explains how to insert Creative Commons licensing metadata information in a PDF file.

Document v. 2.0
Author: Enrico Masala < masala at-symbol polito dot it >
Date: Jul 12, 2006


1. Copyright, license and terms of usage

Copyright Enrico Masala 2006.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License
http://creativecommons.org/licenses/by-nc-sa/2.5/

The author disclaims all warranties with regard to this document, including all implied warranties of merchantability and fitness for a certain purpose; in no event shall the author be liable for any special, indirect or consequential damages or any damages whatsoever resulting from loss of use, data or profits, whether in an action of contract, negligence or other tortious action, arising out of or in connection with the use of this document.

Short: read and use at your own risk.


2. Description

This document describes my experience in embedding licensing information in PDF files. Embedding licensing information in PDF files is important for two reasons. First, it enables search engines that support license-based queries, such as Google and Yahoo, to recognize the files and return them when queried for content with a specific license (especially Creative Commons licenses). Second, users do not risk using the author's work in a way the author does not allow.

2.1 Introduction

The PDF format allows metadata to be embedded in the form of Extensible Metadata Platform (XMP) information (see http://partners.adobe.com/public/developer/xmp/topic.html for more info on XMP). XMP metadata are written using a syntax which is a subset of the World Wide Web Consortium Resource Description Framework (RDF), which is in turn based on XML.

The Creative Commons website (http://www.creativecommons.org) lets you download the XMP description of the license you are interested in. Once you have selected the license, just look for this sentence: "To mark a PDF or other XMP-supported file, save this template following these instructions".
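
To make the RDF/XML structure of an XMP packet more concrete, here is a minimal Python sketch that builds a small licensing packet with the standard library. It is not the template distributed by Creative Commons: the function name build_license_xmp is hypothetical, the namespace URIs are the ones I understand XMP to use, and the dc:rights / rdf:li layout for the notice is an assumption. Compare the output with the template you actually download before relying on it.

```python
import xml.etree.ElementTree as ET

# Namespace URIs assumed from common XMP usage; check them against
# the template you download from the Creative Commons website.
NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "xapRights": "http://ns.adobe.com/xap/1.0/rights/",
    "dc": "http://purl.org/dc/elements/1.1/",
}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_license_xmp(license_url, notice):
    """Return a small XMP licensing packet as a string (hypothetical helper)."""
    for prefix, uri in NS.items():
        ET.register_namespace(prefix, uri)
    rdf = "{%s}" % NS["rdf"]
    rights = "{%s}" % NS["xapRights"]
    dc = "{%s}" % NS["dc"]

    root = ET.Element("{%s}xmpmeta" % NS["x"])
    rdf_root = ET.SubElement(root, rdf + "RDF")
    desc = ET.SubElement(rdf_root, rdf + "Description", {rdf + "about": ""})

    # Element names are case sensitive: Marked, WebStatement, Alt.
    ET.SubElement(desc, rights + "Marked").text = "True"
    ET.SubElement(desc, rights + "WebStatement").text = license_url

    # Assumed layout: the human-readable notice as a language alternative.
    alt = ET.SubElement(ET.SubElement(desc, dc + "rights"), rdf + "Alt")
    ET.SubElement(alt, rdf + "li", {XML_LANG: "x-default"}).text = notice

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_license_xmp(
        "http://creativecommons.org/licenses/by-nc-sa/2.5/",
        "This work is licensed to the public under the Creative Commons "
        "Attribution-NonCommercial-ShareAlike 2.5 License.",
    ))
```

Building the packet with an XML library instead of string concatenation keeps it well-formed and makes the capitalisation of each element name explicit.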
The XMP file I downloaded for the Attribution-NonCommercial-ShareAlike 2.5 License is reported here:

    True
    http://creativecommons.org/licenses/by-sa/2.5/
    This work is licensed to the public under the Creative Commons Attribution-ShareAlike 2.5 License.


2.2 How to embed the XMP info

2.2.1 Solution 1: Using Adobe Acrobat 7 Professional

First of all, open your PDF document, then choose "Document Properties", then "Additional Metadata", then "Advanced" in the left part of the window. Use the "Append" button to select the XMP file.

Unfortunately I was not able to import the previous XMP file in Acrobat. It simply does not appear to work, and no error or warning message is given. After many trials and some intuition, I made it work by correcting the case of the following names in the XMP file:

- change all rdf:rdf to rdf:RDF
- change all description to Description
- change all xaprights to xapRights
- change all marked to Marked
- change all webstatement to WebStatement
- change all alt to Alt

(It seems that case matters.)

In any case, Adobe Acrobat 7 Professional allows you to specify copyright info by hand. Select the Description entry in the left part of the window, then you can specify:

- Copyright status (select: copyrighted)
- Copyright notice (write: "This work is licensed to the public under the Creative Commons Attribution-ShareAlike 2.5 License." or whatever your license says)
- URL (write: "http://creativecommons.org/licenses/by-nc-nd/2.5/" or whatever your license says)

I also downloaded the CreativeCommons Panel from http://creativecommons.org/technology/CreativeCommonsPanel.txt which adds a CreativeCommons entry in the left part of the window (follow the instructions at http://creativecommons.org/technology/xmp-help to copy the file into the right directory), but I could not use it to insert information. However, it shows the correct Creative Commons info once you have inserted it as described above.

Remember to press the OK buttons to close the windows, and then save the document to store the information in the file.

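
The case corrections listed under Solution 1 can be applied automatically instead of by hand. The following Python sketch is one way to do it, under the assumption that the downloaded template differs from a valid one only in the capitalisation of those six names. The function name fix_xmp_case is hypothetical, and the replacement is restricted to the markup (the text between "<" and ">") so that the license notice and URL are left untouched.

```python
import re

# Lowercased name -> case-sensitive name, taken from the list in Solution 1.
CASE_FIXES = {
    "rdf:rdf": "rdf:RDF",
    "description": "Description",
    "xaprights": "xapRights",
    "marked": "Marked",
    "webstatement": "WebStatement",
    "alt": "Alt",
}

# Match the names only as whole tokens, not as parts of longer words.
_NAME = re.compile(
    r"(?<![\w-])(" + "|".join(re.escape(k) for k in CASE_FIXES) + r")(?![\w-])"
)
# Anything between '<' and '>' is treated as markup.
_TAG = re.compile(r"<[^>]*>")


def fix_xmp_case(xmp_text):
    """Restore capitalisation inside tags only; text content is not changed."""
    def fix_tag(match):
        return _NAME.sub(lambda m: CASE_FIXES[m.group(1)], match.group(0))
    return _TAG.sub(fix_tag, xmp_text)


if __name__ == "__main__":
    import sys
    # Usage (file name is an example): python fix_case.py < license.xmp > fixed.xmp
    sys.stdout.write(fix_xmp_case(sys.stdin.read()))
```

Because attribute values also sit inside tags, it is worth checking the result by eye before appending it in Acrobat.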
2.2.2 Solution 2: Using the PdfLicenseManager tool

I wrote a simple tool, named PdfLicenseManager, see http://media.polito.it/masala/pdflicensemanager/index.html

This tool is based on the iText library (http://www.lowagie.com/iText/), which simplifies PDF handling. It is written in Java, hence it can be run on any platform which supports Java, including Linux and most UNIXes.

Version 2.0 has a graphical user interface which simplifies program usage. Please refer to the README.txt file coming with the program for further instructions.

The following instructions still apply to PdfLicenseManager v. 2.0 if you want to use the command line interface (very useful for scripts). Assuming that you downloaded the iText library into the current directory, you can type:

    java -classpath itext-1.4.2.jar:. pdflicense.ManagePdfLicense put filein.pdf fileout.pdf by-nc

to take the content of filein.pdf, insert a Creative Commons Attribution-NonCommercial License and write the output to the fileout.pdf file.

To show the XMP licensing info associated with file.pdf, type:

    java -classpath itext-1.4.2.jar:. pdflicense.ManagePdfLicense show file.pdf

See the PdfLicenseManager README.txt file for more information.


2.3 Showing Metadata Information

You can show the metadata information you have inserted using:

- Acrobat 7 Professional
  Choose "Document Properties", then "Additional Metadata", as in the description given to embed the data. I was not able to show the metadata (licensing) information using Acrobat 7 _Reader_, so I guess you need the Professional version.

- pdfinfo
  If you use Linux, you can also show the raw XMP info (i.e. in XML) embedded in a PDF file by typing:

      pdfinfo -meta file.pdf

  (This program is available in the poppler-utils package on Fedora Core 5. Poppler is a PDF rendering library forked from the xpdf PDF viewer.)

- PdfLicenseManager (see previous section)


3. Credits

Thanks to Juan Carlos De Martin who had the idea to write and publish this HOWTO.