How to Create Pandas DataFrame from Dictionary

In this article, we will look at several ways to create a Pandas DataFrame from a dictionary, with examples. The dictionary is one of the most widely used data types in Python, and it stores data as key-value pairs. In real-world applications we usually create a Pandas DataFrame by reading CSV files or other data sources, but sometimes we need to build one from a dictionary.

To follow this article, you should be familiar with Python dictionaries.

There are two ways to create a Pandas DataFrame from a dictionary:

  • Using DataFrame constructor
  • Using from_dict() method

Now, let's explore each of these ways to create a Pandas DataFrame from a Python dictionary.

Create Pandas DataFrame using DataFrame() Constructor

The DataFrame() constructor is defined in the pandas package and can be used to create a Pandas DataFrame from a Python dictionary.

In the example below, I have prepared a sample Python dictionary and then created a Pandas DataFrame from it.


from pandas import DataFrame

dictionary = {
    "name": ["John", "Vishvajit", "Harsh", "Harshita"],
    "age": [20, 25, 31, 25],
    "gender": ['Male', 'Male', 'Male', 'Female']
}

df = DataFrame(data=dictionary)
print(df)

Create Pandas DataFrame with Required Columns

When we convert a Python dictionary into a Pandas DataFrame, we sometimes want only specific columns even though the dictionary has more keys.

For example, the dictionary above has three keys: name, age, and gender. If we want only name and age in the DataFrame, we pass the required columns as a list to the columns parameter of the DataFrame() constructor, like columns=['name', 'age'].

Let's see how we can do that.


from pandas import DataFrame

dictionary = {
    "name": ["John", "Vishvajit", "Harsh", "Harshita"],
    "age": [20, 25, 31, 25],
    "gender": ['Male', 'Male', 'Male', 'Female']
}

df = DataFrame(data=dictionary, columns=['name', 'age'])
print(df)
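
The columns parameter also controls column order, and it accepts labels that are not keys of the dictionary. The sketch below is only an illustration: "city" is a hypothetical label that does not exist in the sample dictionary, so pandas should create that column filled with missing values.

```python
from pandas import DataFrame

dictionary = {
    "name": ["John", "Vishvajit", "Harsh", "Harshita"],
    "age": [20, 25, 31, 25],
    "gender": ["Male", "Male", "Male", "Female"],
}

# "city" is a hypothetical column label that is NOT a key of the dictionary.
df = DataFrame(data=dictionary, columns=["age", "name", "city"])

# The columns appear in the order given in the list.
print(list(df.columns))

# The unknown column contains only missing values.
print(df["city"].isna().all())
```

Because "gender" is left out of the list, it is dropped from the resulting DataFrame.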

Create Pandas DataFrame with user-defined indexes

As you can see, each row in the DataFrames above has an index label that identifies that row. By default, pandas generates an integer index from 0 to the number of rows minus 1, but we can create a Pandas DataFrame with our own index.

To use a custom index, first define a list of index labels. The labels do not have to be integers; other values such as strings work too. Keep one thing in mind when creating the index: the number of labels must equal the number of rows, otherwise you will get an error. After defining the labels in a list like ['index1', 'index2', 'index3', …], pass that list to the index parameter of the DataFrame constructor.

For example, I have defined a custom index of length 4 because the DataFrame has a total of four rows.


from pandas import DataFrame

dictionary = {
    "name": ["John", "Vishvajit", "Harsh", "Harshita"],
    "age": [20, 25, 31, 25],
    "gender": ['Male', 'Male', 'Male', 'Female']
}

index = ['first', 'second', 'third', 'fourth']
df = DataFrame(data=dictionary, index=index)
print(df)
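
As a small sketch of why custom labels are useful and what happens when the lengths do not match, the snippet below looks up a row by its label and then deliberately passes five labels for four rows. The five labels 'a' to 'e' are made up for illustration.

```python
from pandas import DataFrame

dictionary = {
    "name": ["John", "Vishvajit", "Harsh", "Harshita"],
    "age": [20, 25, 31, 25],
    "gender": ["Male", "Male", "Male", "Female"],
}

df = DataFrame(data=dictionary, index=["first", "second", "third", "fourth"])

# Rows can now be selected by label with .loc.
print(df.loc["third", "name"])

# Five labels for four rows: pandas rejects the mismatch with a ValueError.
try:
    DataFrame(data=dictionary, index=["a", "b", "c", "d", "e"])
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc
    print("ValueError:", exc)
```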

Create Pandas DataFrame from Nested Dictionary

In Python, a nested dictionary is a dictionary inside another dictionary. When we have this kind of dictionary, we can use the DataFrame() constructor along with the transpose() method.

All the examples above used simple Python dictionaries, but sometimes we need to create a Pandas DataFrame from a hierarchical Python dictionary.

For example, the Python dictionary below has the keys 0, 1, and 2, and the value of each key is itself a Python dictionary; that is why it is called a nested Python dictionary. The constructor turns the outer keys into columns and the inner keys into rows, so we transpose the result to get one row per outer key.
You can have any type of nested dictionary.

Let's create a Pandas DataFrame from a nested Python dictionary.


from pandas import DataFrame

dictionary = {
        
        0: {
            "name": "Vishvajit",
            "gender": "Male",
            "age": 25
           },
        
        1: {
            "name": "Vinay",
            "gender": "Male",
            "age": 20
           },
        
        2: {
            "name": "Harshita",
            "gender": "Female",
            "age": 24
           }
    }
df = DataFrame(data=dictionary)
# transpose() returns a new DataFrame, so print the result to see it
print(df.transpose())
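
As an alternative sketch (not the approach used above), the same nested dictionary can be passed to from_dict() with orient='index', which treats the outer keys as rows directly. The variable names nested, via_transpose, and via_from_dict are my own for this illustration.

```python
from pandas import DataFrame

nested = {
    0: {"name": "Vishvajit", "gender": "Male", "age": 25},
    1: {"name": "Vinay", "gender": "Male", "age": 20},
    2: {"name": "Harshita", "gender": "Female", "age": 24},
}

# Approach from this article: build the frame, then swap rows and columns.
via_transpose = DataFrame(data=nested).transpose()

# Alternative: let from_dict() treat the outer keys as row labels.
via_from_dict = DataFrame.from_dict(nested, orient="index")

print(via_from_dict)

# Compare the column data types of the two results.
print(via_transpose.dtypes)
print(via_from_dict.dtypes)
```

Both results hold the same values, but the transposed frame keeps every column as the generic object type because each original column mixed strings and integers, while the from_dict() version can infer a numeric type for age.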

Create Pandas DataFrame from Dictionary with Single Value

If we have a dictionary with a single scalar value for each key, we can still use the DataFrame constructor to convert it into a Pandas DataFrame.
Remember that you have to provide an index to the DataFrame constructor in this case, otherwise it will raise an error.


from pandas import DataFrame

dictionary = {
    "name": "Vishvajit",
    "age": 25,
    "gender": 'Male'
}

df = DataFrame(data=dictionary, index=[0])
print(df)
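
To make the error concrete, here is a minimal sketch that first calls the constructor without an index and catches the resulting ValueError, then shows one alternative: wrapping the dictionary in a list so that it is treated as a single row. The name record is just an illustrative variable name.

```python
from pandas import DataFrame

record = {
    "name": "Vishvajit",
    "age": 25,
    "gender": "Male",
}

# Without an index, pandas cannot tell how many rows to create.
try:
    DataFrame(data=record)
    scalar_error = None
except ValueError as exc:
    scalar_error = exc
    print("ValueError:", exc)

# Alternative: a list with one dictionary is read as one row.
df = DataFrame(data=[record])
print(df)
```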

Create Pandas DataFrame from List of Dictionaries

Often we need to convert a list of dictionaries into a Pandas DataFrame. Let me take an example so that you can understand easily. Suppose we have company data in the form of a list of dictionaries, and each dictionary holds different information: one has an employee's phone number, another has an email, and so on. The dictionaries do not all need to have the same keys.

The DataFrame constructor can also convert a list of dictionaries into a Pandas DataFrame. It treats the keys of all the dictionaries as columns of the resulting DataFrame and fills in NaN wherever a dictionary is missing a key.


from pandas import DataFrame

dictionary = [
        
        {
            "name": "Vishvajit",
            "gender": "Male",
            "age": 25,
            "salary": 20000
           },
        
        {
            "name": "Vinay",
            "gender": "Male",
            "age": 20,
            "country": 'India'
           },
        
        {
            "name": "Harshita",
            "gender": "Female",
            "age": 24,
            "designation": "Developer"
            
           }
]
df = DataFrame(data=dictionary)
print(df)

As you can see in the DataFrame, NaN has been assigned wherever a dictionary is missing a key.
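
Once the DataFrame contains NaN values, you will usually want to inspect or fill them. The following is a rough sketch using a shortened version of the list above; the fill values 0 and "unknown" are arbitrary choices for illustration.

```python
from pandas import DataFrame

records = [
    {"name": "Vishvajit", "age": 25, "salary": 20000},
    {"name": "Vinay", "age": 20, "country": "India"},
]

df = DataFrame(data=records)

# Count the missing values in each column.
print(df.isna().sum())

# A numeric column that contains NaN is stored as floating point.
print(df["salary"].dtype)

# Fill each column with its own placeholder (placeholder values are arbitrary).
filled = df.fillna({"salary": 0, "country": "unknown"})
print(filled)
```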


Pandas from_dict() Function

This is another way to create a Pandas DataFrame from a dictionary. It takes a few parameters that are useful in different cases, depending on the structure of the Python dictionary.

Let's look at the parameters of the from_dict() method.


DataFrame.from_dict(data, orient='columns', dtype=None, columns=None)

Parameters:

data:- The first parameter. It must be a Python dictionary.

orient:- {'columns', 'index', 'tight'}, default 'columns'. An optional parameter that indicates the orientation of the data. If the keys of the dictionary should be the columns of the resulting DataFrame, pass 'columns', which is the default. If the keys should be the rows of the resulting DataFrame, pass 'index'. Pass 'tight' when the dictionary has the keys 'index', 'columns', 'data', 'index_names', and 'column_names', which is useful when we want to create a MultiIndex DataFrame.

dtype:- An optional parameter used to force a data type on the DataFrame when it is created from the dictionary. If it is not passed, pandas infers the data types from the input data.

columns:- The column labels to use when orient='index'. from_dict() raises a ValueError if it is passed with orient='columns' or orient='tight'.
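
The short sketch below exercises the parameters described above. The dictionary data and the MultiIndex labels are made-up sample values, and the 'tight' part assumes pandas 1.4 or newer, where that orientation is available.

```python
from pandas import DataFrame, MultiIndex

data = {"a": [1, 2, 3], "b": [4, 5, 6]}

# dtype forces one data type on every column.
as_float = DataFrame.from_dict(data, dtype="float64")
print(as_float.dtypes)

# columns is only accepted together with orient='index'.
try:
    DataFrame.from_dict(data, orient="columns", columns=["x", "y"])
    columns_error = None
except ValueError as exc:
    columns_error = exc
    print("ValueError:", exc)

# orient='tight' (assumes pandas >= 1.4): round-trip a MultiIndex DataFrame.
multi = DataFrame(
    [[1, 3], [2, 4]],
    index=MultiIndex.from_tuples([("a", "b"), ("a", "c")], names=["n1", "n2"]),
    columns=MultiIndex.from_tuples([("x", 1), ("y", 2)], names=["z1", "z2"]),
)
tight = multi.to_dict(orient="tight")
restored = DataFrame.from_dict(tight, orient="tight")

print(sorted(tight.keys()))
print(restored.equals(multi))
```

The 'tight' dictionary carries the index and column names along with the data, which is what allows the MultiIndex to be rebuilt.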

Now, let’s see some examples of the from_dict() method to create Pandas DataFrame from the dictionary.

Create Pandas DataFrame from Dictionary using from_dict()

I have created a Python dictionary in which the value of each key is a Python list (any array-like would also work). Now I will use the from_dict() method to create a Pandas DataFrame from this dictionary.


from pandas import DataFrame

dictionary = {
    "name": ["Ankit", "Harsh", "Pankaj", "Anshika"],
    "age": [20, 24, 30, 25],
    "gender": ['Male', 'Male', 'Male', 'Female']
}
df = DataFrame.from_dict(dictionary)
print(df)

After executing the above code, the following DataFrame will be created.


      name  age  gender
0    Ankit   20    Male
1    Harsh   24    Male
2   Pankaj   30    Male
3  Anshika   25  Female

As you can see in the resulting DataFrame, all the keys of the dictionary have been converted into columns. This happens because the orient parameter defaults to 'columns'.
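
With the default orientation, from_dict() should give the same result as the plain DataFrame() constructor. Here is a quick sketch to check that, using illustrative variable names of my own:

```python
from pandas import DataFrame

dictionary = {
    "name": ["Ankit", "Harsh", "Pankaj", "Anshika"],
    "age": [20, 24, 30, 25],
    "gender": ["Male", "Male", "Male", "Female"],
}

from_constructor = DataFrame(data=dictionary)
from_classmethod = DataFrame.from_dict(dictionary, orient="columns")

# equals() compares shape, labels, and values of the two DataFrames.
print(from_constructor.equals(from_classmethod))
```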

What happens if we have a dictionary with a different structure? Let's see.

Convert Dictionary to Pandas DataFrame with orient='index'

Now let's assume we have a Python dictionary whose keys should be the rows of the resulting DataFrame. In that scenario, we use orient='index' in the from_dict() method.


from pandas import DataFrame

dictionary = {
    "row_1": ["Ankit", 20, 'Male'],
    "row_2": ["Harsh", 24, 'Male'],
    "row_3": ["Pankaj", 30, 'Male'],
    "row_4": ["Anshika", 25, 'Female'],

}
df = DataFrame.from_dict(dictionary, orient='index')
print(df)

Output


             0   1       2
row_1    Ankit  20    Male
row_2    Harsh  24    Male
row_3   Pankaj  30    Male
row_4  Anshika  25  Female

As you can see in the above DataFrame, the column names default to integers starting from 0. In real-life applications, columns should have meaningful names such as name, age, and so on.

To provide column names, pass them as a list to the columns parameter of the from_dict() method.
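
Here is a minimal sketch of that, reusing the same row-oriented dictionary and assuming the three values in each list are name, age, and gender, in that order:

```python
from pandas import DataFrame

dictionary = {
    "row_1": ["Ankit", 20, "Male"],
    "row_2": ["Harsh", 24, "Male"],
    "row_3": ["Pankaj", 30, "Male"],
    "row_4": ["Anshika", 25, "Female"],
}

# columns is allowed here because orient='index' is used.
df = DataFrame.from_dict(
    dictionary,
    orient="index",
    columns=["name", "age", "gender"],
)
print(df)
```

The dictionary keys become the row labels, and the list passed to columns replaces the default 0, 1, 2 column names.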

So this is how you can use the from_dict() method to create a Pandas DataFrame from a dictionary.



Conclusion

In this tutorial, we have seen several ways to create a Pandas DataFrame from a dictionary. You can choose any of them based on your requirements, because all of them can convert a dictionary to a Pandas DataFrame. If you need to control the orientation, go with the from_dict() method; otherwise the DataFrame constructor is enough.

If you found this article helpful, please share and keep visiting for further tutorials.

Happy Coding…..
