Scripting Flickr with Python and REST
Flickr probably needs no introduction for readers of this column. It's a hugely popular social-network site owned by Yahoo, focused on sharing photographs. It embodies most of the current web buzzwords, including tagging, web feeds, AJAX, and accessibility to scripts. Flickr provides a set of HTTP-based APIs for accessing features both as a publisher and as a viewer of pictures. You get to choose between XML-RPC, REST (simple XML over HTTP), or SOAP, and the available functions cover every corner of the core Flickr service. In this article I'll look at some Python libraries for integrating with Flickr (all code tested with Python 2.4.2).
One thing you'll see across examples is reference to a Flickr API key. Such a key is always required to access Flickr through the official APIs, and if you want to take advantage of any of these scripting capabilities, you'll need to apply for a key from Flickr's API Keys page. Flickr uses a fairly elaborate system of token authentication based on your application key, fully described in the Flickr Authentication API Desktop Applications How-To. If you want to know the nitty-gritty details of an application's handshake with Flickr, do read this resource. A few Flickr libraries will try to abstract you from a lot of that, but for purposes of this article I'll dodge the issue by sticking to actions that are allowed without special user authentication. Be sure to browse the Flickr API home page for documentation of technical details of the Flickr API, as well as links to implementations for just about every popular language out there except for C/C++ and JavaScript.
The pioneer of Python Flickr tools is Michele Campeotto, whose work has been
the inspiration for many of the libraries I discuss today. He is best known
for FlickrUploadr, a program
that provides a simple GUI using the GTK toolkit for Linux. The user drags
and drops image files onto a simple window area, from which the pictures can
be uploaded to a Flickr account. FlickrUploadr is focused more on end users
and is not really a reusable library, so I shall not spend any more time
discussing it. More to this article's purpose is FlickrClient,
Campeotto's library module for Flickr. I installed FlickrClient 0.2 by copying
the two Python files in the package to my Python site-packages directory. One of those
files is xmltramp, Aaron
Swartz's simple Pythonic tree API for XML. Listing 1 is a simple example that
gets all the public favorite photos for the Flickr user named "uche". The variable FLICKR_API_KEY in
this listing and others should be set separately to your own Flickr API key.
>>> from FlickrClient import FlickrClient
>>> client = FlickrClient(FLICKR_API_KEY)
>>> #Find the user ID of the person named "uche"
>>> person = client.flickr_people_findByUsername(username='uche')
>>> import pprint
>>> pprint.pprint(person.__dict__)
{'_attrs': {u'id': u'21902936@N00', u'nsid': u'21902936@N00'},
'_dNS': None,
'_dir': [<username>...</username>],
'_name': u'user',
'_prefixes': {}}
>>> userid = person(u'id')
>>> print userid
21902936@N00
>>> faves = client.flickr_favorites_getPublicList(user_id=userid)
>>> faves[0]
<photo isfamily="1" title="P1010024" isfriend="1" ispublic="1" server="30"
secret="c68a340791" owner="75062596@N00" id="63291069"></photo>
>>> for fave in faves:
...     print fave(u'title')
...
P1010024
África...
AFRICAN DREAM
Goro
In Concert
Fela Anikulapo Kuti
Heaven's Light on Lake Malawi
>>>
Flickr documents API requests in a form such as flickr.people.findByUsername. In FlickrClient you replace the dots with underscores and call the resulting
method name on the Flickr proxy object (client in listing 1).
FlickrClient does some remote method dispatch magic to forward the request
to Flickr. The actual Flickr API is not hard-coded into FlickrClient (which
is a mere 40 lines of code). You almost always have to pass at least one named
parameter to Flickr. You must pass these as keyword arguments (for example, username='uche')
for the dispatch magic to work; you can get the parameter name from the Flickr
API documentation. If you make any mistakes that confuse FlickrClient you can
expect some pretty cryptic error messages, but luckily the API is simple enough
that you get the hang of it very quickly. You get back xmltramp nodes representing
the Flickr response XML, and it's up to you to access the actual data you need.
As in listing 1, I use __dict__ and repr() heavily
for introspection of the result values so I can find out where to get the
data I need. Some calls return lists, which come back as a container element
with multiple children, as is the case for flickr.favorites.getPublicList.
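The dispatch trick described above is easy to sketch. What follows is not FlickrClient's actual code, only a minimal illustration of the idea, written for current Python 3 rather than the Python 2.4 used in the listings. The class name `UnderscoreDispatcher` is my own invention, the REST endpoint URL is the one Flickr documents today rather than anything taken from this article, and the sketch only builds the request URL instead of fetching it.

```python
from urllib.parse import urlencode

# Assumption: the REST endpoint Flickr documents today; not taken from the article.
REST_ENDPOINT = "https://api.flickr.com/services/rest/"


class UnderscoreDispatcher:
    """Hypothetical stand-in for a FlickrClient-style proxy; it only builds URLs."""

    def __init__(self, api_key):
        self.api_key = api_key

    def __getattr__(self, name):
        # Python calls __getattr__ only for attributes it cannot find normally,
        # so any unknown flickr_* name can be treated as a remote method.
        if not name.startswith("flickr_"):
            raise AttributeError(name)
        # flickr_people_findByUsername -> flickr.people.findByUsername
        method = name.replace("_", ".")

        def build_url(**params):
            # Keyword arguments become query parameters, which is why the
            # parameters have to be passed by name.
            query = {"method": method, "api_key": self.api_key}
            query.update(params)
            return REST_ENDPOINT + "?" + urlencode(sorted(query.items()))

        return build_url


client = UnderscoreDispatcher("YOUR_API_KEY")
url = client.flickr_people_findByUsername(username="uche")
print(url)
```

Because nothing about the Flickr API is hard-coded, a proxy like this keeps working when Flickr adds methods, at the price of catching a misspelled method name only when the server rejects it.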
As I mentioned, I use only Flickr calls that do not need user authentication.
An example of a method that does is flickr.favorites.getList which
gets all favorite photos including private ones. If you want to write code
with Flickr user permissions in order to use such functions you have to redirect
users to a web page so that they can log in and give your application permission.
You then query Flickr again for the resulting authentication token.
It's a bit of a tricky process, and later on I cover a FlickrClient-derived
tool that tries to do this handshake for you. You might be able to borrow code
from there even if you're using FlickrClient.
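For what it's worth, the signing step at the heart of that handshake is small. Here is a rough sketch in Python 3 of the scheme as I understand it from Flickr's legacy (pre-OAuth) authentication documentation: sort the request parameters by name, concatenate the shared secret with each name and value, and take the MD5 hash of the result. The function name `sign_params` and the sample values are my own, and none of this comes from FlickrClient.

```python
import hashlib


def sign_params(shared_secret, params):
    """Return a signature for a dict of request parameters.

    Assumption: this follows Flickr's legacy signing rule (shared secret
    followed by name/value pairs sorted by name, hashed with MD5).
    """
    pieces = [shared_secret]
    for name in sorted(params):
        pieces.append(name)
        pieces.append(str(params[name]))
    return hashlib.md5("".join(pieces).encode("utf-8")).hexdigest()


# Placeholder secret and parameters, for illustration only.
signature = sign_params("SECRET", {"foo": "1", "bar": "2", "baz": "3"})
print(signature)
```

With these placeholder inputs the string being hashed is `SECRETbar2baz3foo1`. The resulting value travels with the request as one more parameter.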
FlickrClient uses the REST API, requesting simple parameterized URLs from
Flickr via HTTP GET (even in some cases where Flickr should be requiring HTTP
POST, but does not bother), and returning the raw XML response bodies. This
keeps it very simple and flexible. Eitan Isaacson preferred to use Python standard
library XML-RPC instead, so he wrote flickrlib.
I downloaded version 0.5 of the package. Installation is a matter of the standard
distutils python setup.py install, although all it does is copy
the single file flickrlib.py to your Python library. Listing 2 is an attempt
to replicate the requests in listing 1. The first thing you'll notice is that
method names are called exactly as in the Flickr API spec, and there is no
need to translate the dots to underscores. The variable FLICKR_API_SSECRET in
this listing and others should be set separately to your own shared secret.
Listing 2. Using flickrlib to get a list of favorite photo titles
>>> import flickrlib
>>> client = flickrlib.FlickrAgent(FLICKR_API_KEY, FLICKR_API_SSECRET)
>>> #Find the user ID of the person named "uche"
>>> person = client.flickr.people.findByUsername(username='uche')
>>> person
{u'username': [{u'text': u'Uche', u'type': u'username'}],
u'text': u'', u'type': u'user', u'id': u'21902936@N00',
u'nsid': u'21902936@N00'}
>>> userid = person[u'id']
>>> faves = client.flickr.favorites.getPublicList(user_id=userid)
[SNIP]
UnicodeEncodeError: 'ascii' codec can't encode character u'\xc1' in
position 270: ordinal not in range(128)
I get this exception because there is a non-ASCII character in one of my favorite
photo titles. flickrlib doesn't seem to handle this properly yet. In listing
3, I tried a different query with a list-like response.
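The failure is easy to reproduce outside flickrlib. The sketch below uses Python 3 syntax and assumes the offending title begins with the character U+00C1, which is what the traceback reports. It shows the ASCII codec rejecting the string while UTF-8 handles it without complaint.

```python
# Assumption: the title starts with U+00C1, the character named in the traceback.
title = "\u00c1frica..."

try:
    title.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    # The ASCII codec cannot represent anything above code point 127.
    ascii_failed = True

# Encoding explicitly as UTF-8 works, and decoding restores the original text.
utf8_bytes = title.encode("utf-8")
round_tripped = utf8_bytes.decode("utf-8")

print(ascii_failed, round_tripped == title)
```

A library that serializes response data would need to pick an encoding such as UTF-8 explicitly rather than rely on the ASCII default.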
Listing 3. Using flickrlib to get information on a photo
>>> import flickrlib
>>> client = flickrlib.FlickrAgent(FLICKR_API_KEY, FLICKR_API_SSECRET)
>>> #Find the user ID of the person named "uche"
>>> person = client.flickr.people.findByUsername(username='uche')
>>> userid = person[u'id']
>>> #Get the first 10 photos for the user
>>> photos = client.flickr.people.getPublicPhotos(user_id=userid, per_page=10)
>>> photos.keys()
[u'perpage', u'text', u'page', u'photo', u'total', u'type', u'pages']
>>> photo = photos[u'photo'][0]
>>> photo[u'title']
u'IMG_1223'
>>> photoinfo = client.flickr.photos.getInfo(photo_id=photo[u'id'],
...     secret=photo[u'secret'])
>>> for tag in photoinfo[u'tags'][0][u'tag']: print tag[u'text']
...
friends
family
>>>
flickrlib turns XML-RPC responses into a data structure of dictionaries and
lists. The best way to figure out how to navigate to what you want is to look
at the sample XML response given in the Flickr API docs. For example, from flickr.photos.getInfo you
know the raw response is along the lines of:
<photo id="2733" secret="123456" server="12"
isfavorite="0" license="3" rotation="90" originalformat="png">
[SNIP]
<tags>
<tag id="1234" author="12037949754@N01" raw="woo yay">wooyay</tag>
<tag id="1235" author="12037949754@N01" raw="hoopla">hoopla</tag>
</tags>
[SNIP]
</photo>
From this you can figure out that you would get the photo's original format
using response[u'originalformat'], the list of tags from response[u'tags'][0] (the
first child element named tags), the first tag's display text from response[u'tags'][0][u'tag'][0][u'text'],
and so on. XML attributes generally are accessed as dictionary keys, and XML
child elements as list items.
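To make that mapping concrete, here is a small sketch in Python 3 that converts the sample XML above (with the snipped parts left out) into the same kind of dictionary-and-list structure. This is not flickrlib's code. The helper `element_to_dict` is a name I made up, and the `type` and `text` keys are modeled on the output shown in listing 2.

```python
import xml.etree.ElementTree as ET

# The sample response from the Flickr docs, minus the snipped sections.
SAMPLE = """<photo id="2733" secret="123456" server="12"
    isfavorite="0" license="3" rotation="90" originalformat="png">
  <tags>
    <tag id="1234" author="12037949754@N01" raw="woo yay">wooyay</tag>
    <tag id="1235" author="12037949754@N01" raw="hoopla">hoopla</tag>
  </tags>
</photo>"""


def element_to_dict(elem):
    """Hypothetical helper: attributes become keys, children become lists."""
    node = dict(elem.attrib)
    node["type"] = elem.tag
    node["text"] = (elem.text or "").strip()
    for child in elem:
        # Children with the same name are grouped into one list under that name.
        node.setdefault(child.tag, []).append(element_to_dict(child))
    return node


response = element_to_dict(ET.fromstring(SAMPLE))
print(response["originalformat"])
print(response["tags"][0]["tag"][0]["text"])
```

One consequence of this layout is that an XML attribute that happened to be named `type` or `text` would be overwritten by the bookkeeping keys in this sketch.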
Do mind Eitan's warning, where he says:
The library currently lacks a custom error class and it is not thread safe, and when I say that I mean it will do crazy stuff if you don't put mutex locks around calls to this library.
To be fair, all the Python Flickr libraries I've seen leave a lot of such matters for the developer to manage.
I've mentioned how the libraries above return fairly low-level data structures or raw XML for the developer to pick apart. This has the advantage of simple library code at the expense of more complex client code. James Clarke preferred to do the work of simplifying the Flickr results in the library itself, and created flickr.py. I downloaded the single Python file (revision 24) and copied it to my Python library by hand. Pay attention to how I set the API key directly in the module object.
Listing 4. Using flickr.py to get information on a photo
>>> import flickr
>>> flickr.API_KEY = FLICKR_API_KEY
>>> user = flickr.people_findByUsername(u'uche')
>>> user.id
u'21902936@N00'
>>> photos = flickr.photos_search(user_id=user.id)
>>> for tag in photos[0].tags: tag.text
...
u'friends'
u'family'
>>> faves = flickr.favorites_getPublicList(user_id=user.id)
>>> for fave in faves: print fave.title
...
P1010024
África...
AFRICAN DREAM
Goro
In Concert
Fela Anikulapo Kuti
Heaven's Light on Lake Malawi
>>>
You can see how much simpler and more direct the API is. The trade-off is that
it takes a lot of work within flickr.py, and not all the Flickr API is covered
yet. You can see the module documentation (help(flickr)) for information
on what is and is not covered. You might also have noticed that the function
names are completely specialized from Flickr method names, so you have to rely
on flickr.py documentation rather than Flickr's own API docs.
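The extra work inside such a library amounts to wrapping the raw results in objects. The sketch below (Python 3) is not flickr.py's implementation. The `Photo` and `Tag` classes are hypothetical, and the input dictionary imitates the low-level structure from the earlier listings, with sample values borrowed from them.

```python
class Tag:
    """Hypothetical wrapper exposing a tag's display text as an attribute."""

    def __init__(self, raw):
        self.text = raw["text"]


class Photo:
    """Hypothetical wrapper turning a raw result dictionary into attributes."""

    def __init__(self, raw):
        self.id = raw["id"]
        self.title = raw.get("title", "")
        # The raw structure nests tags as a list holding one container,
        # which in turn holds the list of individual tags.
        tag_groups = raw.get("tags", [])
        raw_tags = tag_groups[0].get("tag", []) if tag_groups else []
        self.tags = [Tag(t) for t in raw_tags]


# Sample values for illustration only.
raw_photo = {
    "id": "2733",
    "title": "IMG_1223",
    "tags": [{"tag": [{"text": "friends"}, {"text": "family"}]}],
}
photo = Photo(raw_photo)
print(photo.title, [tag.text for tag in photo.tags])
```

Every response type needs a wrapper of this kind, which is why a library built this way tends to lag behind the full Flickr API.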
Beej Jorgensen put together a simple Python library supporting the Flickr
API, called flickrapi.py. He borrowed
some code and reinvented some odd wheels, bundling an xmltramp-like XML node
implementation right into the main module. He says he did so as a learning
exercise, and I expect another motivating factor was avoiding any third-party
module dependencies. At any rate, it weighs in at under 200 lines of code,
so I won't scrutinize its design decisions too closely for now. The more important
issue is that I found it a bit clumsy to use, as far as I got, and I couldn't
really get it to do anything. Listing 5 illustrates how far I got.
Listing 5. flickrapi.py
>>> from flickrapi import FlickrAPI
>>> #initialize a proxy instance for the Flickr API and get a session token
>>> fapi = FlickrAPI(FLICKR_API_KEY, FLICKR_API_SSECRET)
>>> token = fapi.getToken(perms='write', browser='firefox')
rsp: error 108: Invalid frob
The last line, looking to grab a token from Flickr, launches a separate browser session for the end user to log into Flickr and grant the application permission to act as the user's agent. Unfortunately, no matter how I tweaked the requested permissions and the web browser used for the request, I kept getting the "Invalid frob" error.
Another end-user-focused project that can be a useful source of Python code is C. Mallory's uploadr, which uploads all pictures in a given directory that have not already been uploaded to Flickr.
It's good to have options, and there are certainly many if you need to access Flickr from Python code. The different libraries have different philosophies and you should be able to find one to fit your needs.
As an aside, it's interesting to look at photos tagged with "python" on Flickr. Because of the popularity of Flickr among programmer types I had guessed that there would be more screenshots from Python and photos from Python developer events than photos of snakes. Evidence from the first few pages for this tag (there is a nice photo sequence of a python swallowing a large dead rodent) gives the lie to that preconception. On the other hand, Google search results for "python" offer nothing but the programming language and the Monty Python comedy troupe for pages and pages of results. It's an interesting contrast, whether or not you are a Semantic Web advocate.
XML.com Copyright © 1998-2006 O'Reilly Media, Inc.