Menu Close

How to Convert Excel to JSON in Python

how to convert excel to json in python

Hi Python Developers, I hope you are doing well. In this article, I am going to explain to you how to convert Excel to JSON in Python along with various ways. As a developer, Sometimes, It might be your requirement to convert Excel to JSON in Python.

To perform Python Excel to JSON conversion, Python provides various libraries that are capable of handling Excel files with different file extensions.

So, these are some tools that we are about to use one by one in this Python Excel to JSON conversion article.

  • pandas
  • openpyxl
  • xlrd
  • excel2json

Before going to understand all these Python libraries, let’s understand a little about JSON ( JavaScript Object Notation ).

What is JSON ( JavaScript Object Notation )?

JSON stands for JavaScript Object Notation. It is a text-based data format that is used to travel the data between client and server. It is almost equal to the JavaScript Object, that’s why it is called JSON. JSON stores the data in the form of key and value pair format. It is one of the most popular data formats that is used to develop APIs.

As you can see JSON data example.

Example: Sample JSON Data


[
  {
    "first_name": "Vishvajit",
    "last_name": "Rao",
    "age": 24,
    "gender": "Male",
    "course": "BCA",
    "roll_no": "BCA001"
  },
  {
    "first_name": "Harsh",
    "last_name": "Kumar",
    "age": 25,
    "gender": "Male",
    "course": "Btech",
    "roll_no": "Btech001"
  },
  {
    "first_name": "Harshita",
    "last_name": "Saini",
    "age": 27,
    "gender": "Female",
    "course": "MA",
    "roll_no": "MA001"
  }
]

Let’s understand all above above-defined Python libraries one by one along with use cases.

Excel file

To load the data from the Excel file, I have created an Excel file students_data.xlsx with a single sheet name called student_data.

What is Pandas?

Pandas is one of the most popular and most used Python libraries especially for data analysis and data manipulation according to business needs. Pandas provide a lot of functions along with various parameters to read and write the data from various file extensions.

It is an external Python library that is hosted on the Python PyPI platform. To use Python Pandas, you must install it by using the below command. If you have already installed it, Then you can skip the installation part.

pip install pandas

How to convert Excel to JSON using Pandas

Python Pandas have a read_excel() function that I will use to load the Excel file. The read_excel() file accepts various parameters to load the data from Excel.

To work with pandas, you have to install openpyxl python library because Pandas’s read_excel() function depends on openpyxl module.

Use the below command to install openpyxl.

pip install openpyxl

The to_json() is a Pandas Datarame method that is used to convert any Python Pandas DataFrame to JSON data.

The to_json() method also accepts the filename which means You can provide the file name along with the file extension as the first argument inside the to_json() method to save the converted JSON data.

Example:


import pandas as pd

data = pd.read_excel("student_data.xlsx", sheet_name="student_data")
json_data = data.to_json()
print(json_data)

Output


{
    "first_name": {
        "0": "Vishvajit",
        "1": "Harsh",
        "2": "Harshita",
        "3": "Pankaj",
        "4": "Hayati",
        "5": "Rahul",
        "6": "Shalu",
        "7": "John",
        "8": "Jayanti"
    },
    "last_name": {
        "0": "Rao",
        "1": "Kumar",
        "2": "Saini",
        "3": "Singh",
        "4": "Kumari",
        "5": "Sharma",
        "6": "Singh",
        "7": "Doe",
        "8": "Pandey"
    },
    "age": {
        "0": 24,
        "1": 25,
        "2": 27,
        "3": 30,
        "4": 26,
        "5": 29,
        "6": 30,
        "7": 30,
        "8": 22
    },
    "gender": {
        "0": "Male",
        "1": "Male",
        "2": "Female",
        "3": "Male",
        "4": "Female",
        "5": "Male",
        "6": "Female",
        "7": "Male",
        "8": "Female"
    },
    "course": {
        "0": "BCA",
        "1": "Btech",
        "2": "MA",
        "3": "PHD",
        "4": "Mtech",
        "5": "MBA",
        "6": "MA",
        "7": "LLB",
        "8": "Btech"
    },
    "roll_no": {
        "0": "BCA001",
        "1": "Btech001",
        "2": "MA001",
        "3": "PHD001",
        "4": "Mtech001",
        "5": "MBA001",
        "6": "MA002",
        "7": "LLB001",
        "8": "Btech002"
    }
}

As you can see in the above output, the to_json() method converted the Excel data to JSON in column label orientation. This is the default behavior of the to_json() function. If you want to convert the data in row-label orientation, pass orient=records" it to the to_json() function. Like below.

json_data = data.to_json(orient="records")

After passing orient="records" into to_json() function, Your output will look like this.

Output


[
  {
    "first_name": "Vishvajit",
    "last_name": "Rao",
    "age": 24,
    "gender": "Male",
    "course": "BCA",
    "roll_no": "BCA001"
  },
  {
    "first_name": "Harsh",
    "last_name": "Kumar",
    "age": 25,
    "gender": "Male",
    "course": "Btech",
    "roll_no": "Btech001"
  },
  {
    "first_name": "Harshita",
    "last_name": "Saini",
    "age": 27,
    "gender": "Female",
    "course": "MA",
    "roll_no": "MA001"
  },
  {
    "first_name": "Pankaj",
    "last_name": "Singh",
    "age": 30,
    "gender": "Male",
    "course": "PHD",
    "roll_no": "PHD001"
  },
  {
    "first_name": "Hayati",
    "last_name": "Kumari",
    "age": 26,
    "gender": "Female",
    "course": "Mtech",
    "roll_no": "Mtech001"
  },
  {
    "first_name": "Rahul",
    "last_name": "Sharma",
    "age": 29,
    "gender": "Male",
    "course": "MBA",
    "roll_no": "MBA001"
  },
  {
    "first_name": "Shalu",
    "last_name": "Singh",
    "age": 30,
    "gender": "Female",
    "course": "MA",
    "roll_no": "MA002"
  },
  {
    "first_name": "John",
    "last_name": "Doe",
    "age": 30,
    "gender": "Male",
    "course": "LLB",
    "roll_no": "LLB001"
  },
  {
    "first_name": "Jayanti",
    "last_name": "Pandey",
    "age": 22,
    "gender": "Female",
    "course": "Btech",
    "roll_no": "Btech002"
  }
]

What is excel2json?

excel2json is another lightweight Python module that is used to convert any Excel file data to JSON format. It does not support more functionalities. It has limited functionality to convert Excel data to JSON data. To implement this, you have to install it by using the Python pip command.

pip install excel2json-3

excel2json module provides a function convert_from_file() to load the data from excel.

To implement this, you have to install it by using the Python pip command. You can convert any Excel file to JSON just by writing two lines of code.

Note:- It will create JSON files with the name of sheet names included in your Excel file. If your excel file contains five sheets, Then it will create five different-2 JSON files by sheet name.

Python excels to JSON conversion using excel2json

Python excel2json is also another external Python package that is hosted on the Python PyPI platform. The excel2json-3 library does not provide numerous functionality like pandas read_json() function and openpyxl. This module does not support the .xlsx file extension. It supports only .xls extension files.

Example:


from excel2json import convert_from_file
convert_from_file("student_data.xls")

After executing the above code, It will create a separate .json file with the name of the sheet available in the Excel file.

excel2json also provides a function convert_from_url() to load the data from the URL as well.

What is Python openpyxl?

Openpyxl is another popular Python library that is used for reading and writing Excel files. It also provides various functionality to work with Excel data. To work with the Python openpyxl library, you need to install it by using the python-pip command.

pip install openpyxl

How to convert Excel data to JSON using Openpyxl

Python openpyxl module provides a function load_workbook(). The load_workbook() function is used to load the data from the Excel file and return an object. The load_workbook() function loads all the sheets in an Excel file. After loading the data, We can get the specific sheet data by using the sheet name. As you can see below example.

Example


from openpyxl import load_workbook
wb = load_workbook("student_data.xlsx")
from json import dumps

# get sheet by name
sheet = wb["student_data"]

# get number of rows
rows = sheet.max_row

# get number of columns
columns = sheet.max_column

# list to store all the rows of excel file as dictionary
lst = []
for i in range(1, rows):
    row = {}
    for j in range(1, columns):
        column_name = sheet.cell(row=1, column=j)
        row_data = sheet.cell(row=i+1, column=j)

        row.update(
            {
                column_name.value: row_data.value
            }
        )
    lst.append(row)


# convert into json
json_data = dumps(lst)
print(json_data)

Code explanation

  • First, import the load_workbook from openpyxl library.
  • Created wb ( workbook ) object by loading the student_data.xlsx file using the load_workbook() function.
  • Imported dumps function from the Python JSON module to convert a list of Python Dictionaries to JSON.
  • Obtained the single sheet name by using wb["student_data"].
  • Obtained the total rows by using the max_rows property.
  • Obtained the total columns by using the max_column property.
  • Define an empty list called lst to store all the row data.
  • Writing logic to convert each row of the sheet name in Python dictionary and append to lst.
  • After storing all the rows of the sheet into lst, Simply we converted it into JSON.

I hope You have got all the points about how to convert Excel to JSON using openpyxl.

What is xlrd module in Python?

xlrd is a Python module that is used to read and format data from Excel files in the .xls format only.

The xlrd is a Python module that is used to read and format data from Excel files in the .xls format only. Python xlrd module provides a function called open_workbook() to load the data from .xls excel files. It creates an object after loading the file.

How to convert Excel file to JSON using xlrd in Python.

If you have a .xlsx file extension, then you can convert it to a .xls extension using any online converter.

To convert the Excel file data to JSON data we will follow all these steps.

  • We will load the data by using the open_workbook() function. The open_workbook() returns an object after loading the file.
  • We will get the sheet name by using the sheet_by_name() method.
  • We will total the row count.
  • We will total columns count.
  • We will be writing a simple logic using Python for loop to convert Excel file data to create a list of dictionaries. Each dictionary within a list represents a single Excel row.
  • After that, we will convert the list of Dictionaries into JSON.

Note:- xlrd module support excels file with .xls extension.

Example:


from json import dumps
from xlrd import open_workbook

# load excel file
wb = open_workbook("student_data.xls")

# get sheet by using sheet name
sheet = wb.sheet_by_name("student_data")

# get total rows
total_rows = sheet.nrows

# get total columns
total_columns = sheet.ncols

# convert each row of sheet name in Dictionary and append to list
lst = []
for i in range(0, total_rows):
    row = {}
    for j in range(0, total_columns):
        if i + 1 < total_rows:
            column_name = sheet.cell(rowx=0, colx=j)
            row_data = sheet.cell_value(rowx=i+1, colx=j)

            row.update(
                {
                    column_name.value: row_data
                }
            )

    if len(row):
        lst.append(row)


# convert into json
json_data = dumps(lst)
print(json_data)

Convert multiple sheets of Excel files into JSON

Sometimes It might be, Your requirement to convert all the sheets available within an Excel file to a JSON object. To convert all the sheet names into JSON, You can use Pandas’s read_excel() function. The read_excel() function takes the sheet_name which accepts a list of sheet names as a parameter, by default it takes 0 which represents the first sheet name of the Excel file.

The read_excel() function returns the Dictionary especially when you are loading an Excel file with multiple sheet names and each key in the Dictionary represents the sheet name.

After that, Get the Pandas DataFrame from Dictionary by using the get() dictionary method and use the to_json() method to convert Pandas Dataframe to JSON.

For this example, I have created an Excel file named raw_data.xlsx having two sheets naming students and employees as you can see below.

Example


from pandas import read_excel

data = read_excel("raw_data.xlsx", sheet_name=["students", "employees"])

# get students sheet data by sheet name
students_data = data.get("students")

# get employees data sheet data by sheet name
employees_data = data.get("employees")

# convert both of the dataframe to json along with separate .json file
students_data.to_json(path_or_buf="students_data.json", orient="records")
employees_data.to_json(path_or_buf="employees_data.json", orient="records")

Code Explanation:

These are steps that I have taken during the conversion of multiple sheet names within an Excel file to JSON format.

  • Load the raw_data.xls file using the Pandas read_excel() function and pass all the sheet names to the parameter sheet_name.
  • It returns a Dictionary for all the sheet names, especially if you loading an Excel file with multiple sheet names.
  • Access the Dataframe with sheet name using the get() Dictionary method.
  • Convert Dataframe to JSON file using to_json() function and store to the corresponding .json file.

Note:- sheet_name in read_excel() function by default takes 0 which represents the first sheet name of the Excel file.

Conclusion

So, In this article, we have successfully covered various ways to convert Excel to JSON in Python. Throughout this guide we have explored almost five Python packages to convert Excel to JSON, You can use any one of them according to your requirements.

If you want more functionality during the loading of an Excel file Then I highly recommend you go with the Python Pandas library because it provides numerous functionalities, That you can use with an Excel file.

You can choose any one of them, It depends upon you and your requirements.

If you like this article, please share and keep visiting for further interesting tutorials.

Thanks for your valuable time… πŸ™πŸ™πŸ™

How to Get Quarter from Date in MySQL
How To Add a Column in Pandas Dataframe ( 4 Ways )

Related Posts