Openpyxl and json round 2

Posted on Thursday, February 6, 2020









I wrote a recent article going over how to convert json to xlsx and back again using openpyxl in python. See http://www.whiteboardcoder.com/2020/02/openpyxl-and-json.html [1]

Now I want to up the ante and do the same thing but with a more complex json that will convert into multiple sheets in the excel file.




Simple test


Let me create a simple json file that represents a budget where it also has years that will be converted into sheets in Excel


{
   "2018":{
        "January": {
                "food": 240.5,
                "heating": 89.2,
                "rent": 1709.10
        },
        "February": {
                "food": 202.5,
                "heating": 112.2,
                "rent": 1709.10
        },
        "March": {
                "food": 320.5,
                "heating": 45.2,
                "rent": 1709.10
        }
   },
   "2019":{
        "January": {
                "food": 120.5,
                "heating": 88.2,
                "rent": 1809.10
        },
        "February": {
                "food": 102.5,
                "heating": 122.2,
                "rent": 1809.10
        },
        "March": {
                "food": 120.5,
                "heating": 35.2,
                "rent": 1809.10
        }
   },
   "2020":{
        "January": {
                "food": 220.5,
                "heating": 18.2,
                "rent": 1909.10
        },
        "February": {
                "food": 223.5,
                "heating": 12.2,
                "rent": 1909.10
        },
        "March": {
                "food": 120.5,
                "heating": 25.2,
                "rent": 1909.10
        }
   }
}


Save this in a file called original.jsonYou can use a tool like https://jsonlint.com/

To confirm it is in the correct format.




Or you can use jq from the command line




  > jq . original.json







Simple python script to create xls file from json




  > vi createxls_from_json_multiple_sheets.py


And place the following in it


#!/usr/bin/env python3

import openpyxl
import json
from openpyxl import Workbook


def populate_sheet(json_data, sheet):
   sheet.cell(1,1, "Month")
   sheet.cell(1,2, "food")
   sheet.cell(1,3, "heating")
   sheet.cell(1,4, "rent")

   row = 1
   for month in json_data.keys():
     row+=1
     sheet.cell(row,1,month)
     sheet.cell(row,2,float(json_data[month]["food"]))
     sheet.cell(row,3,float(json_data[month]["heating"]))
     sheet.cell(row,4,float(json_data[month]["rent"]))


#############################################
#  MAIN
#############################################
if __name__ == '__main__':

   json_data = {}

   with open("original.json") as json_file:
     json_data = json.load(json_file)

   wb = Workbook()
   #When you make a new workbook you get a new blank active sheet
   #We need to delete it since we do not want it
   wb.remove(wb.active)

   for year in json_data.keys():
     sheet = wb.create_sheet(title=year)
     populate_sheet(json_data[year], sheet)


   #Save it to excel
   wb.save("test_json.xlsx")





Now chmod it and run it



  > chmod u+x createxls_from_json_multiple_sheets.py
  >  ./createxls_from_json_multiple_sheets.py


Now open it up









Boom



Now let me create another program to convert the xls file into a json file.


  > vi createjson_from_xslx_multiple_sheet.py


And place the following in it


#!/usr/bin/env python3

import openpyxl
import json
from openpyxl import load_workbook


def get_sheet_dict(ws):
  year_dict = {}
  column_headers = {}

  #Returns a Tuple that are cell type
  first_row = ws[1]

  #Skip the first column
  for cell in first_row[1:]:
    column_headers[cell.column]=cell.internal_value

  for row in ws.iter_rows(min_row=2):
    month = row[0].internal_value
    temp_cell_dict = {}
    #Skip the month cell
    for cell in row[1:]:
      #store the data in a dict with keys as column names
      col_name = column_headers[cell.column]
      temp_cell_dict[col_name]=cell.internal_value

    #Set the month key to be contents of row
    year_dict[month] = temp_cell_dict

  return year_dict

#############################################
#  MAIN
#############################################
if __name__ == '__main__':

   wb = load_workbook(filename="test_json.xlsx")

   budget_dict = {}

   for ws in wb:
     year_dict = get_sheet_dict(ws)
     budget_dict[ws.title] = year_dict




   #Convert dict to JSON
   data_json = json.dumps(budget_dict)

   #Write json file
   with open("new.json", "w") as f:
     f.write(data_json)





  > chmod u+x createjson_from_xslx_multiple_sheet.py
  >  ./createjson_from_xslx_multiple_sheet.py



Now use the jq tool to see it


  > jq . new.json





Use diff to see it is identical to the original json


  > diff original.json new.json




The files are not exactly identical…
But is the json itself a match we can use jq to help us figure this out.



  > diff <(jq -S . original.json) <(jq -S . new.json)



And if it returns nothing then it is a match.

Let me change one thing by hand in the new.json and see if it catches it.



  > diff <(jq -S . original.json) <(jq -S . new.json)




Yep J


References


[1]        openpyxl and json


No comments:

Post a Comment