Loading...

Tuesday, April 22, 2014

Python: Converting CSV to XML and JSON


Hello Readers,


Today we will convert the common CSV (comma separated values) format into XML (extensible markup lanuage) and JSON (javascript object notation) formats in Python. The CSV file we will use was obtained from data scraping the weather underground website.


CSV Data

The CSV data has two elements in each row separated by a comma:

1. The date as 8 characters- 4 year, 2 month, 2 day
2. Temperature in Fahrenheit


CSV To XML



The XML format exists as self-defined tags, beginning with a XML declaration, specifying the XML version "1.0". There is a root element of <weather_data>, and a child element of <observation>, which also has a child element (sub-element) of <date> and <max_temperature>. The root and child elements make up the tree structure, and note that tags need to be closed as well.



XML Tree

To read the CSV file, we use the
csv.reader() method, and set the delimiter to a comma. Then we create the XML file to which we will write the output, and write the XML heading and the root element.


Reading CSV and Writing XML

Now we iterate through the child elements of observations consisting of the date and maximum temperature. Recall that the date is in the first index and the temperature is in the second, with python index starting at 0. Remember to close the open tags afterwards, and close the XML document.



Observation Element and Closing Tags

We can see the results by opening the XML file, and opening it in a web browser, such as Google Chrome:



XML Output

Note how the root and child elements have arrows which we can minimize, and each observation is separate. Our XML formatting looks correct through Chrome. Although the output takes up more room than a CSV file, it is much more readable.




CSV To JSON



Javascript Object Notation is another format widely used to store and transfer data. 
It is quite different from XML, because it uses curly braces "{ }" for objects and brackets "[ ]" for arrays, instead of <tags> separating elements. The basic JSON format applied to our weather data is shown below. 


JSON Format

Even though we will not use the
json module, I show it so you know it is there. Import the csv module to read our "wunder-data.txt" CSV file again, and write the "observation" object and begin an array with an open bracket. We start a rows_num variable at 0, and count to the max number of iterations (rows in the CSV file).


Import, Read and Write Data

Inside this array, we iterate our observation objects with name/value pairs with date and temperature. Incrementing the rows_num by 1 with each observation, if it reaches 354, we still want to include a comma after the closing curly bracket. But if it reaches 365, then we finish the observations and close the array and observations object.



Iterating Objects

Opening the JSON output in Google Chrome, we see a mess of name/value pairs. Clearly Chrome does not read plain JSON, but there are Chrome Apps available, such as JSONView.



JSON Output

Okay readers, we have covered much ground today with data formats. Both XML and JSON are widely used, knowing both will definitely allow you to utilize varied data sets. Stay tuned for more posts!



Thanks for reading,

Wayne
@beyondvalence

4 comments:

  1. Nice article. If you happen to want to go the other way (from JSON to CSV), json-csv.com is a good online converter.

    ReplyDelete
  2. I like your blog, I read this blog please update more content on python, further check it once at python online training

    ReplyDelete
  3. Dapatkan Pasaran Bola Terbaik di Situs Agen Resmi BOLAVITA !

    www.bolavita.site Agen Taruhan Bola Online yang sudah di percaya dan sudah berdiri sangat lama di dunia perrjudiian Indonesia !

    Aman dan Terpercaya !

    Hubungi Cs kami yang bertugas 24 jam Online :

    BBM: BOLAVITA
    WA: +6281377055002

    Atau bisa langsung download Aplikasi Resmi BOLAVITA :
    Aplikasi Playstore : Bolavita Sabung Ayam

    ReplyDelete