Hello Readers,
Hope you guys are keeping warm! Today we will be talking about mining Twitter's rich metadata through the Twitter REST API v1.1 using Python (I will use IPython on Python 2.7.5 and the Python twitter library). Specifically, we will look at trending topics. (For Part 2: Tweets, click here).
Let us begin.
For Starters
A Twitter account (my handle is @beyondvalence) is required to obtain the proper authorization keys to access the Twitter API. Go here to use your Twitter account to obtain developer credentials and keys. OAuth offers a solution for apps to access user Twitter data without requiring users to share sensitive information such as passwords. A sample of the Twitter development page with the OAuth keys is shown below.
[Image: Twitter OAuth Keys]
In IPython
In IPython, we can now import the twitter and json (JavaScript Object Notation) libraries, simple enough.
[Image: Import Twitter and JSON]
Next, we define the OAuth keys as consumer, consumer_secret, oauth_token, and oauth_token_secret. Using these keys, we create an OAuth authorization object, auth, shown below (line 4). Then auth is passed to the Twitter class, which issues queries to the Twitter API.
[Image: Gained Twitter API Query Access]
The Twitter object print-out in line 5, above, indicates that we have used OAuth credentials to gain authorization to query Twitter's API.
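The authorization step described above can be sketched roughly as follows. This assumes the twitter package (Python Twitter Tools) used in this post. The environment variable names and the helpers load_credentials and make_twitter_api are my own placeholders, not part of the library, and the post's consumer key is spelled consumer_key here.

```python
import os


def load_credentials(env=os.environ):
    """Read the four OAuth values from environment variables.

    The variable names are placeholders chosen for this sketch.
    """
    names = ("CONSUMER_KEY", "CONSUMER_SECRET",
             "OAUTH_TOKEN", "OAUTH_TOKEN_SECRET")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError("missing credentials: " + ", ".join(missing))
    return {name.lower(): env[name] for name in names}


def make_twitter_api(creds):
    """Build an authorized client from a credentials dict."""
    # Imported here so load_credentials works without the package installed.
    import twitter

    # OAuth takes the token pair first, then the consumer pair.
    auth = twitter.oauth.OAuth(
        creds["oauth_token"],
        creds["oauth_token_secret"],
        creds["consumer_key"],
        creds["consumer_secret"],
    )
    return twitter.Twitter(auth=auth)
```

Keeping the keys in environment variables rather than in the notebook makes it harder to paste them into a screenshot or a public post by accident.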
Querying Trend Topics
Moving along! Now that we are authorized, we can issue a request. Popular topics in the Twitterverse are compiled into trends, which are popular tokens such as keywords, handles, or hashtags. We can request the current trending topics on Twitter (as of this writing in mid-December).
Yahoo has developed a unique, non-repetitive way to index places with the WOE (Where On Earth) identifier. We can use these WOE IDs to constrain our queries to Twitter, using 1 for the entire world and 23424977 for the US. This is shown below. (The id is written with a leading underscore, _id, so that the twitter library sends it as a query string parameter.)
[Image: Twitter Trends Output]
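In code, the two trend queries might look something like this. It is a sketch that assumes twitter_api is the authorized Twitter object from the previous step; fetch_trends and the two constants are names I made up for illustration.

```python
WORLD_WOE_ID = 1          # WOE ID for the entire world
US_WOE_ID = 23424977      # WOE ID for the United States


def fetch_trends(twitter_api, woe_id):
    """Request the trends for one place, identified by its WOE ID."""
    # The leading underscore in `_id` tells the twitter package to send the
    # value as the `id` query string parameter.
    return twitter_api.trends.place(_id=woe_id)
```

With an authorized client, world_trends = fetch_trends(twitter_api, WORLD_WOE_ID) and us_trends = fetch_trends(twitter_api, US_WOE_ID) would hold the two responses.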
The result is a lot of text, much of it semi-readable at best. The output becomes much easier to read with JSON formatting.
JSON, a Data Exchange Format
We imported the json library at the beginning, and now we can print the API output in JSON. Do this for the trends in the US, by using print json.dumps(). This is shown below.
[Image: API Output JSON Format]
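To get a feel for what json.dumps() does here without calling the API, the sketch below formats a small hand-made stand-in for the response. The structure of sample_us_trends is my simplified assumption of what a trends response looks like (a real one carries more fields per trend), and the two trend names are borrowed from this post.

```python
import json

# Hand-made stand-in for a US trends response, trimmed to a few fields.
sample_us_trends = [
    {
        "locations": [{"name": "United States", "woeid": 23424977}],
        "trends": [
            {"name": "#LastMinuteGifts"},
            {"name": "Christmas"},
        ],
    }
]

# indent=1 puts every key and list item on its own line;
# sort_keys=True keeps the key order stable between runs.
pretty = json.dumps(sample_us_trends, indent=1, sort_keys=True)
print(pretty)
```

The indent argument is what turns the single long line into the nested, one-item-per-line layout.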
Much better! As we can see, some trending topics in the US are:
- #LastMinuteGifts (Makes sense, being in the Holiday shopping season)
- #SOTV
- #AskHartnell
- Christmas (just around the corner)
- 22 Jump Street (a comedy movie)
- #YeterArt
- #MilletinVekiliHakan
- #SessizSakinTakiple
- #TMP332
- Julie Plec Needs Kol
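A list like the one above does not have to be read off the JSON by eye. Here is a small sketch that pulls out just the names, assuming the response is a list of places that each hold a "trends" list of dicts with a "name" key. The helper trend_names and the sample data are mine.

```python
def trend_names(response):
    """Return the name of every trend in a trends-style response."""
    return [trend["name"]
            for place in response
            for trend in place["trends"]]


# Hand-made sample using two of the names from the list above.
sample = [{"trends": [{"name": "#SOTV"}, {"name": "22 Jump Street"}]}]
names = trend_names(sample)
print(names)
```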
Note: Twitter Rate Limits
Twitter rate-limits applications: for trends, an application can make 15 requests per 15-minute window. That is not a big concern, considering the trends are only updated every 5 minutes anyway.
Later posts will discuss more Twitter metadata, like tweets! Stay tuned for Part 2, Querying Tweets!
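One more thought on the rate limit note: since trends only change every 5 minutes, a script can simply reuse its last answer for 5 minutes. That works out to at most 3 requests per location in each 15-minute window, comfortably under the limit of 15. Below is a minimal sketch of that idea; CachedTrends is a hypothetical helper of mine, and fetch stands for any function that takes a WOE ID and returns the trends.

```python
import time

TRENDS_TTL_SECONDS = 5 * 60  # trends are refreshed about every 5 minutes


class CachedTrends(object):
    """Reuse each place's trends until they are older than `ttl` seconds."""

    def __init__(self, fetch, ttl=TRENDS_TTL_SECONDS, clock=time.time):
        self._fetch = fetch      # callable: woe_id -> trends response
        self._ttl = ttl
        self._clock = clock      # injectable so the cache can be tested
        self._cache = {}         # woe_id -> (fetched_at, response)

    def get(self, woe_id):
        now = self._clock()
        hit = self._cache.get(woe_id)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]        # still fresh: no API request is made
        response = self._fetch(woe_id)
        self._cache[woe_id] = (now, response)
        return response
```

For example, CachedTrends(lambda woe_id: twitter_api.trends.place(_id=woe_id)) would wrap the query from earlier, assuming twitter_api is the authorized client.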
Thanks for reading &
Have a wonderful Holiday!
Wayne
@beyondvalence