XML.com: XML From the Inside Out

Hacking iTunes

November 03, 2004

The iTunes music player and download service took the world by storm in 2003 and 2004, and promises to remain popular for the foreseeable future. While the music is stored in simple files on disk, metadata -- information about the music -- is stored in a single XML file.

In this article I'll explore ways to work with the iTunes Music Library file, an XML document, for fun and education, including transforming the library into an HTML page using various technologies, and querying Amazon's and Google's web services for other suggested recordings and related information.

For demonstration purposes, and as a shameless plug for my other work, I'll be using Mono, the open source implementation of the common language runtime, to build cross-platform tools, though the techniques are applicable to other platforms and languages.

About the Library

The iTunes library file, called iTunes Music Library.xml, is created automatically when you launch iTunes. On Mac OS X, it can be found in the directory $HOME/Music/iTunes/, while on Windows, you'll find it in My Documents\My Music\iTunes\. It's an XML file whose format is defined by a Document Type Definition (DTD) located at www.apple.com/DTDs/PropertyList-1.0.dtd (reformatted for legibility):


<!ENTITY % plistObject "(array | data | date | dict | real | 
                         integer | string | true | false )" >
<!ELEMENT plist %plistObject;>
<!ATTLIST plist version CDATA "1.0" >

<!-- Collections -->
<!ELEMENT array (%plistObject;)*>
<!ELEMENT dict (key, %plistObject;)*>
<!ELEMENT key (#PCDATA)>

<!--- Primitive types -->
<!ELEMENT string (#PCDATA)>
<!ELEMENT data (#PCDATA)> <!-- Contents interpreted as Base-64 encoded -->
<!ELEMENT date (#PCDATA)> <!-- Contents should conform to a subset of ISO 8601 
                           (in particular, YYYY '-' MM '-' DD 'T' HH ':' MM ':' SS 'Z'.  
                           Smaller units may be omitted with a loss of precision) -->

<!-- Numerical primitives -->

<!ELEMENT true EMPTY>     <!-- Boolean constant true -->
<!ELEMENT false EMPTY>    <!-- Boolean constant false -->
<!ELEMENT real (#PCDATA)> <!-- Contents should represent a floating point
                               number matching ("+" | "-")? d+ ("."d*)? ("E"
                               ("+" | "-") d+)? where d is a digit 0-9.  -->
<!ELEMENT integer (#PCDATA)> <!-- Contents should represent a (possibly signed)
                                  integer number in base 10 -->

You can see that the schema for the library has nothing to do with music, audio, multimedia, or data files; it's just a generic property list, built from nested collections and key-value pairs. The top-level element is plist (property list), and it contains exactly one of the elements array, data, date, dict, real, integer, string, true, or false. This is a fairly flexible DTD.
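To make the "generic" point concrete, here is a minimal sketch that parses a property list with no music in it at all. It is written in Python with the standard plistlib module, which is a substitution for this example rather than the Mono tooling used in the rest of the article. The keys and values in SAMPLE are made up purely for illustration.

```python
import plistlib

# A made-up property list that has nothing to do with music; it only
# exercises some of the element types named in the DTD.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
  <key>title</key><string>Example</string>
  <key>count</key><integer>3</integer>
  <key>ratio</key><real>0.5</real>
  <key>enabled</key><true/>
  <key>tags</key>
  <array>
    <string>a</string>
    <string>b</string>
  </array>
</dict>
</plist>
"""


def load_plist(xml_bytes):
    """Parse an XML property list into plain Python objects."""
    return plistlib.loads(xml_bytes)


if __name__ == "__main__":
    # Show which Python type each plist element was mapped to.
    for key, value in load_plist(SAMPLE).items():
        print(key, type(value).__name__, value)
```

Each plist element comes back as the nearest native type: dict as a dictionary, array as a list, and the primitives as strings, numbers, and booleans.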

In practice, your library file will contain a single dict, which will in turn contain a number of key-value pairs representing header information: the library file and application versions, music folder location, and a Library Persistent ID. All of these values are integers and strings.


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" 
          "http://www.apple.com/DTDs/PropertyList-1.0.dtd">

<plist version="1.0">
<dict>
  <key>Major Version</key><integer>1</integer>
  <key>Minor Version</key><integer>1</integer>
  <key>Application Version</key><string>4.6</string>
  <key>Music Folder</key>
    <string>file://localhost/Users/niel/Music/iTunes/iTunes%20Music/</string>
  <key>Library Persistent ID</key><string>8E84CC790968E27F</string>
  <key>Tracks</key>
  <dict>
    ...
  </dict>

  <key>Playlists</key>
  <array>
    ...
  </array>
</dict>
</plist>

After the header information comes track metadata:


<key>839</key>
<dict>
  <key>Track ID</key><integer>839</integer>
  <key>Name</key><string>Sweet Georgia Brown</string>
  <key>Artist</key><string>Count Basie &amp; His Orchestra</string>
  <key>Composer</key><string>Bernie/Pinkard/Casey</string>
  <key>Album</key><string>Prime Time</string>
  <key>Genre</key><string>Jazz</string>
  <key>Kind</key><string>Protected AAC audio file</string>
  <key>Size</key><integer>3771502</integer>
  <key>Total Time</key><integer>219173</integer>
  <key>Disc Number</key><integer>1</integer>
  <key>Disc Count</key><integer>1</integer>
  <key>Track Number</key><integer>3</integer>
  <key>Track Count</key><integer>8</integer>
  <key>Year</key><integer>1977</integer>
  <key>Date Modified</key><date>2004-06-16T18:10:55Z</date>
  <key>Date Added</key><date>2004-06-16T18:08:31Z</date>
  <key>Bit Rate</key><integer>128</integer>
  <key>Sample Rate</key><integer>44100</integer>
  <key>Play Count</key><integer>3</integer>
  <key>Play Date</key><integer>-1119376103</integer>
  <key>Play Date UTC</key><date>2004-08-17T16:39:53Z</date>
  <key>Rating</key><integer>100</integer>
  <key>Artwork Count</key><integer>1</integer>
  <key>File Type</key><integer>1295274016</integer>
  <key>File Creator</key><integer>1752133483</integer>
  <key>Location</key><string>file://localhost/Users/niel/Music/ \
iTunes/iTunes%20Music/Count%20Basie%20&amp;%20His%20Orchestra/ \
Prime%20Time/03%20Sweet%20Georgia%20Brown.m4p</string>
  <key>File Folder Count</key><integer>4</integer>
  <key>Library Folder Count</key><integer>1</integer>
</dict>

The collection of tracks is a dict, keyed by the track ID. Within each track, also a dict, the keys include Track ID, Name, Artist, Album, Genre, Kind, Size, Total Time, Disc Number, Disc Count, Track Number, Track Count, Year, Date Modified, Date Added, Bit Rate, Sample Rate, Play Count, Play Date, Play Date UTC, File Type, File Creator, Location, File Folder Count, and Library Folder Count. Although most of these are self-explanatory, a few bear further explanation.
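The track structure can be walked in a few lines. The following is a rough sketch in Python using the standard plistlib module (a substitution for this example, not the C# approach the article takes later). LIBRARY is a trimmed copy of the sample above with most keys left out, and track_summaries is a hypothetical helper name.

```python
import plistlib

# A trimmed copy of the library structure shown above, with one track
# and only a few of its keys.
LIBRARY = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
  <key>Major Version</key><integer>1</integer>
  <key>Tracks</key>
  <dict>
    <key>839</key>
    <dict>
      <key>Track ID</key><integer>839</integer>
      <key>Name</key><string>Sweet Georgia Brown</string>
      <key>Artist</key><string>Count Basie &amp; His Orchestra</string>
    </dict>
  </dict>
</dict>
</plist>
"""


def track_summaries(library_bytes):
    """Return (track ID, name, artist) tuples, sorted by track ID."""
    library = plistlib.loads(library_bytes)
    rows = []
    for key, track in library["Tracks"].items():
        # The outer dict key is the track ID as a string; the same ID is
        # repeated inside the track dict as an integer.
        rows.append((int(key), track.get("Name"), track.get("Artist")))
    return sorted(rows, key=lambda row: row[0])


if __name__ == "__main__":
    for track_id, name, artist in track_summaries(LIBRARY):
        print(track_id, name, artist)
```

Using get for Name and Artist is deliberate: not every track is guaranteed to carry every key, so missing values come back as None instead of raising an error.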

Kind refers to the file encoding. Values include AAC audio file, MPEG audio file, and, as in the sample above, Protected AAC audio file.

Play Date and Play Date UTC record the last time the track was played to the end, in local time and UTC, respectively. Note that Play Date is stored as an integer, while Play Date UTC is a date element.

File Type and File Creator contain the Mac OS-specific file type and creator codes: long integers that identify the kind of file and the program that created it, respectively. These long integers are usually represented as mnemonic four-character codes.

Location contains the URL of the audio file, with a file URL scheme.
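The notes above on Play Date, File Type, File Creator, and Location can be sketched as three small decoders. This is Python rather than the article's C#, and the function names are hypothetical. One assumption is labeled loudly: the Play Date integer is treated here as seconds since 1904-01-01 (the classic Mac OS epoch) written as a signed 32-bit number. That is an inference from the sample values, not something the library file documents.

```python
import struct
from datetime import datetime, timedelta
from urllib.parse import unquote, urlparse

# Assumption: Play Date counts seconds from the classic Mac OS epoch.
MAC_EPOCH = datetime(1904, 1, 1)


def decode_play_date(value):
    """Convert a Play Date integer into a naive (local time) datetime."""
    seconds = value & 0xFFFFFFFF  # reinterpret the signed value as unsigned
    return MAC_EPOCH + timedelta(seconds=seconds)


def four_char_code(value):
    """Render a File Type or File Creator integer as four characters."""
    # Pack as a big-endian unsigned 32-bit integer, then read the bytes as text.
    return struct.pack(">I", value & 0xFFFFFFFF).decode("mac_roman")


def location_to_path(location):
    """Turn a file: URL from the Location key into a Unix-style path."""
    parsed = urlparse(location)
    if parsed.scheme != "file":
        raise ValueError("not a file URL: " + location)
    return unquote(parsed.path)  # decode %20 and other escapes


if __name__ == "__main__":
    print(decode_play_date(-1119376103))
    print(repr(four_char_code(1295274016)))
    print(repr(four_char_code(1752133483)))
    print(location_to_path(
        "file://localhost/Users/niel/Music/iTunes/iTunes%20Music/"))
```

Run against the sample track, this prints 2004-08-17 12:39:53 for the play date, which is four hours behind the Play Date UTC value of 16:39:53 and so fits the local-time reading. The two codes come out as 'M4P ' and 'hook'. The path handling covers only the Mac-style URLs shown in this article; Windows locations would need drive-letter handling that is not sketched here.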

The final section of the library contains the playlist metadata:


<dict>
  <key>Name</key><string>Funky</string>
  <key>Playlist ID</key><integer>6652</integer>
  <key>Playlist Persistent ID</key><string>88CED99A2F698F3C</string>
  <key>All Items</key><true/>
  <key>Playlist Items</key>
  <array>
    <dict>
      <key>Track ID</key><integer>837</integer>
    </dict>
    <dict>
      <key>Track ID</key><integer>754</integer>
    </dict>
    <dict>
      <key>Track ID</key><integer>835</integer>
    </dict>
    <dict>
      <key>Track ID</key><integer>912</integer>
    </dict>
    <dict>
      <key>Track ID</key><integer>842</integer>
    </dict>
    <dict>
      <key>Track ID</key><integer>217</integer>
    </dict>
  </array>
</dict>

Unlike tracks, playlists are contained in an array, although each playlist is still specified by a dict. Playlists come in two flavors. Standard playlists have the keys Name, Playlist ID, Playlist Persistent ID, All Items, and Playlist Items.

"Smart" playlists have the additional keys Smart Info and Smart Criteria, which contain base-64-encoded data that specifies the smart playlist.

The Playlist Items key contains an array of dict elements, each of which identifies one track by its Track ID.
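Because playlist items hold only track IDs, listing a playlist means looking each ID up in the Tracks dict. Here is a minimal sketch of that lookup in Python with the standard plistlib module (again a substitution for this example, not the article's C#). The track names "Track A" and "Track B" are placeholders, and playlist_track_names is a hypothetical helper.

```python
import plistlib

# A cut-down library: two tracks with placeholder names and one playlist.
LIBRARY = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
  <key>Tracks</key>
  <dict>
    <key>837</key>
    <dict>
      <key>Track ID</key><integer>837</integer>
      <key>Name</key><string>Track A</string>
    </dict>
    <key>754</key>
    <dict>
      <key>Track ID</key><integer>754</integer>
      <key>Name</key><string>Track B</string>
    </dict>
  </dict>
  <key>Playlists</key>
  <array>
    <dict>
      <key>Name</key><string>Funky</string>
      <key>Playlist ID</key><integer>6652</integer>
      <key>Playlist Items</key>
      <array>
        <dict><key>Track ID</key><integer>837</integer></dict>
        <dict><key>Track ID</key><integer>754</integer></dict>
      </array>
    </dict>
  </array>
</dict>
</plist>
"""


def playlist_track_names(library_bytes):
    """Map each playlist name to the names of its tracks, in playlist order."""
    library = plistlib.loads(library_bytes)
    tracks = library["Tracks"]
    result = {}
    for playlist in library["Playlists"]:
        names = []
        for item in playlist.get("Playlist Items", []):
            # Tracks is keyed by the ID as a string; items store it as an integer.
            track = tracks.get(str(item["Track ID"]))
            if track is not None:
                names.append(track.get("Name"))
        result[playlist["Name"]] = names
    return result


if __name__ == "__main__":
    print(playlist_track_names(LIBRARY))
```

The str conversion is the one detail that is easy to miss: the same ID appears as a string key in Tracks and as an integer value inside each playlist item.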

Loading the Library

You can take a couple of approaches to loading the library file. On the one hand, there's always the DOM. The DOM has the advantage of being cross-platform, so techniques used in one language are usable in any other. Here's a simple program to load the file into a DOM tree:


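using System;
using System.Xml;

public class MusicDom {
  public static void Main(string [] args) {
    if (args.Length < 1) {
      Console.Error.WriteLine("usage: MusicDom <library.xml>");
      return;
    }
    // Path to the iTunes Music Library.xml file
    string file = args[0];
    // Parse the entire file into an in-memory DOM tree
    XmlDocument document = new XmlDocument();
    document.Load(file);
  }
}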

On the other hand, the iTunes library can get rather large: I've only got some 1500 tracks in mine, and the file is 2,444,585 bytes. That can lead to a rather large DOM tree in memory. On my iBook with an 800MHz PowerPC G4, that takes nearly nine seconds just to load.

You could also use a forward-only, read-only view of the XML and load it into a native data structure. Since I'm writing the code in C# using Mono, that means System.Xml.XmlReader.
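To illustrate the forward-only idea without building a full tree, here is a rough sketch that counts tracks as the parser streams past them. It uses Python's xml.etree.ElementTree.iterparse as a stand-in for XmlReader, so it shows the technique rather than the C# API. SAMPLE is a made-up miniature library, and count_tracks is a hypothetical helper.

```python
import io
import xml.etree.ElementTree as ET

# A made-up miniature library: two tracks and one playlist.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
  <key>Tracks</key>
  <dict>
    <key>1</key>
    <dict><key>Track ID</key><integer>1</integer></dict>
    <key>2</key>
    <dict><key>Track ID</key><integer>2</integer></dict>
  </dict>
  <key>Playlists</key>
  <array>
    <dict><key>Name</key><string>Example</string></dict>
  </array>
</dict>
</plist>
"""


def count_tracks(source):
    """Count track entries while streaming through the library file."""
    depth = 0
    last_top_key = None  # most recent <key> directly under the root dict
    count = 0
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            depth += 1
            continue
        # On "end", depth is the nesting level of the element just closed:
        # plist is 1, the root dict is 2, its keys and values are 3.
        if depth == 3 and elem.tag == "key":
            last_top_key = elem.text
        elif depth == 4 and elem.tag == "dict" and last_top_key == "Tracks":
            count += 1
            elem.clear()  # drop the finished track's children to save memory
        depth -= 1
    return count


if __name__ == "__main__":
    print(count_tracks(io.BytesIO(SAMPLE)))
```

The depth check is what separates track dicts from playlist dicts, which sit at the same nesting level but follow the Playlists key. Passing a file name instead of a BytesIO object works the same way.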

It's often better to load XML data into native structures. The Mono (and .NET) xsd utility will create a class that can be serialized to and deserialized from an XML file. The trick is that xsd requires a W3C XML Schema, but all we have is a DTD.

Luckily, there are some utilities that can convert a DTD to a W3C XML Schema; dtd2xsd is one. So, the first step is to download the DTD, since dtd2xsd.pl wants a local input file:

wget http://www.apple.com/DTDs/PropertyList-1.0.dtd

Now, convert the DTD to a W3C XML Schema:

./dtd2xsd.pl PropertyList-1.0.dtd > PropertyList-1.0.xsd

Now we have a schema definition. However, it's the wrong version of W3C XML Schema: the generated namespace is http://www.w3.org/2000/10/XMLSchema, and it should be http://www.w3.org/2001/XMLSchema. We'll have to do a little editing.

Besides replacing the namespace declaration, some of the generated schema just isn't right. That's just the way it is with generated schemas.

Here's the corrected schema:


<schema
  xmlns='http://www.w3.org/2001/XMLSchema'
  targetNamespace='http://www.w3.org/namespace/'
  xmlns:t='http://www.w3.org/namespace/'>

 <element name='plist'>
  <complexType>
   <choice>
    <element ref='t:array'/>
    <element ref='t:data'/>
    <element ref='t:date'/>
    <element ref='t:dict'/>
    <element ref='t:real'/>
    <element ref='t:integer'/>
    <element ref='t:string'/>
    <element ref='t:true'/>
    <element ref='t:false'/>
   </choice>
   <attribute name='version' type='string' use='required'/>
  </complexType>
 </element>

 <element name='array'>
  <complexType>
   <sequence minOccurs='0' maxOccurs='unbounded'>
    <choice>
     <element ref='t:array'/>
     <element ref='t:data'/>
     <element ref='t:date'/>
     <element ref='t:dict'/>
     <element ref='t:real'/>
     <element ref='t:integer'/>
     <element ref='t:string'/>
     <element ref='t:true'/>
     <element ref='t:false'/>
    </choice>
   </sequence>
  </complexType>
 </element>

 <element name='dict'>
  <complexType>
   <sequence minOccurs='0' maxOccurs='unbounded'>
    <element ref='t:key'/>
    <choice>
     <element ref='t:array'/>
     <element ref='t:data'/>
     <element ref='t:date'/>
     <element ref='t:dict'/>
     <element ref='t:real'/>
     <element ref='t:integer'/>
     <element ref='t:string'/>
     <element ref='t:true'/>
     <element ref='t:false'/>
    </choice>
   </sequence>
  </complexType>
 </element>

 <element name='key' type='string' />

 <element name='string' type='string' />

 <element name='data' type='base64Binary' />

 <element name='date' type='dateTime' />

 <element name='true' />

 <element name='false' />

 <element name='real' type='decimal' />

 <element name='integer' type='integer' />

</schema>

Now, we can generate the class:

xsd PropertyList-1.0.xsd /classes

That will produce the file PropertyList-1.0.cs, a C# source file that can be used to deserialize the music library file in a .NET program.

But why go to the effort of writing all of this specialized C# code when there are some technologies available in any language, on any platform? In the next section, I'll show you how to interrogate the music library with XSLT.
