Pandas Dictionary to DataFrame: 5 Ways to Convert Dictionary to DataFrame in Python

Pandas Dictionary to DataFrame

There are 5 distinct ways to convert a Python dictionary to a Pandas DataFrame, and this article covers all of them, along with some frequently asked questions on the subject. You can convert a simple Python dictionary, a dictionary where keys are strings and values are lists, and even nested dictionaries to a Pandas DataFrame, all with simple and convenient functions.

Want to convert a Python list to a DataFrame instead? Our detailed guide has you covered.

Regarding library imports, you only need Pandas, so stick this line at the top of your Python script or Notebook:

import pandas as pd

1. Simple Python Dictionary to DataFrame Conversion

Probably the simplest way to convert a Python dictionary to DataFrame is to have a dictionary where keys are strings, and values are lists of identical lengths.

If that’s the case for you, simply pass in the entire dictionary when calling pd.DataFrame(). Column names are inferred from the dictionary keys, and column values are taken from the corresponding lists.

The best way to explain further is through an example. The following code snippet declares a dictionary of employees, in which every feature/column is marked with the dictionary key, and values for all rows are passed as a dictionary value:

employees = {
    "first_name": ["Bob", "Mark", "Jane", "Patrick"],
    "last_name": ["Doe", "Markson", "Swift", "Johnson"],
    "email": ["bdoe@company.com", "mmarkson@company.com", "jswift@company.com", "pjohnson@company.com"]
}

data = pd.DataFrame(employees)
data

The resulting DataFrame has 3 columns since there are 3 keys in the dictionary and 4 rows since there are 4 values in each list:

Image 1 - Pandas DataFrame with a simple dictionary conversion (Image by author)

Keep in mind that all value lists must be of the same length. If they’re not, Pandas will raise a ValueError.
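
To make the failure mode concrete, here is a minimal sketch with a deliberately uneven dictionary (the `uneven` and `padded` names are made up for illustration). Wrapping each list in a pd.Series is one possible workaround, assuming you're fine with the missing values showing up as NaN:

```python
import pandas as pd

# Hypothetical data: "last_name" is one item shorter than "first_name"
uneven = {
    "first_name": ["Bob", "Mark", "Jane"],
    "last_name": ["Doe", "Markson"],
}

try:
    pd.DataFrame(uneven)
    raised = False
except ValueError as err:
    raised = True
    print(f"ValueError: {err}")

# Possible workaround: a dict of Series is aligned on the index,
# so the shorter column is padded with NaN instead of raising
padded = pd.DataFrame({key: pd.Series(values) for key, values in uneven.items()})
print(padded)
```

Whether padding is appropriate depends on your data. In many cases an uneven dictionary points to a problem upstream that's worth fixing first.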

2. Pandas from_records() Method - Easily Convert Dictionary to DataFrame

A second way to convert a dictionary to DataFrame in Python is by using the from_records() method from Pandas. It expects data to be formatted in a slightly different way than before.

We now have a list of records, and each record is a dictionary. The corresponding key-value combination represents the value for a given column at a certain row/record.

Here’s an example - we’re creating the same DataFrame as in the previous section, but the data format is significantly different:

employees = [
    {"first_name": "Bob", "last_name": "Doe", "email": "bdoe@company.com"},
    {"first_name": "Mark", "last_name": "Markson", "email": "mmarkson@company.com"},
    {"first_name": "Jane", "last_name": "Swift", "email": "jswift@company.com"},
    {"first_name": "Patrick", "last_name": "Johnson", "email": "pjohnson@company.com"}
]

data = pd.DataFrame.from_records(employees)
data

But still, the resulting DataFrame looks the same:

Image 2 - Pandas DataFrame with from_records() (Image by author)
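
As a small extension of the example above, from_records() also accepts an index argument that promotes one of the record fields to the DataFrame index. Treat this as a sketch: the shortened `records` list and the choice of email as the index are assumptions made for illustration only.

```python
import pandas as pd

# Shortened stand-in for the employees list
records = [
    {"first_name": "Bob", "last_name": "Doe", "email": "bdoe@company.com"},
    {"first_name": "Jane", "last_name": "Swift", "email": "jswift@company.com"},
]

# Use the "email" field as the index instead of the default integer range
by_email = pd.DataFrame.from_records(records, index="email")
print(by_email)

# Rows can now be looked up by email address
print(by_email.loc["jswift@company.com", "first_name"])
```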

Up next, let’s return to the original data format and explore another built-in method.

3. Pandas from_dict() Method - Easiest Way to Create Pandas DataFrame from Dictionary

The from_dict() method is also built into Pandas and is useful when you want to create a Pandas DataFrame from a dictionary. It expects the data in the same format as the first approach - a Python dictionary in which column names are keys, and values are lists.

Here’s an example, and note how data is formatted:

employees = {
    "first_name": ["Bob", "Mark", "Jane", "Patrick"],
    "last_name": ["Doe", "Markson", "Swift", "Johnson"],
    "email": ["bdoe@company.com", "mmarkson@company.com", "jswift@company.com", "pjohnson@company.com"]
}

data = pd.DataFrame.from_dict(employees)
data

The converted DataFrame is identical to the ones we had before:

Image 3 - Pandas DataFrame with from_dict() (Image by author)

The best part about the from_dict() method is that you can customize the data orientation. Let’s see how next.

4. Pandas from_dict() with Custom Index - Ultimate DataFrame Control

The from_dict() method from Pandas takes in an optional orient argument. It’s set to "columns" by default, which means the keys of the passed-in dictionary become column names, and each list holds the values of that column, one per row.

If you change the value to index, then you would change the orientation of the data. It means the keys should represent DataFrame rows. Let’s see it in action:

employees = {
    "first_name": ["Bob", "Mark", "Jane", "Patrick"],
    "last_name": ["Doe", "Markson", "Swift", "Johnson"],
    "email": ["bdoe@company.com", "mmarkson@company.com", "jswift@company.com", "pjohnson@company.com"]
}

data = pd.DataFrame.from_dict(employees, orient="index")
data

We now have the dictionary keys as the DataFrame index, and the items of each list spread across the DataFrame columns:

Image 4 - Pandas DataFrame with from_dict() custom index (Image by author)

This is a neat data representation if you have a small number of records and a large number of attributes, which is not typically the case in data science and machine learning.
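
The index orientation tends to feel more natural when each key already identifies a record. The sketch below assumes a hypothetical `by_id` dictionary keyed by employee ID, and uses the columns argument of from_dict(), which applies only with orient="index", to name the resulting columns:

```python
import pandas as pd

# Hypothetical data: one key per employee, one list of attributes per key
by_id = {
    "emp_1": ["Bob", "Doe", "bdoe@company.com"],
    "emp_2": ["Mark", "Markson", "mmarkson@company.com"],
}

people = pd.DataFrame.from_dict(
    by_id,
    orient="index",  # dictionary keys become the row index
    columns=["first_name", "last_name", "email"],  # labels for the list positions
)
print(people)
```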

5. Pandas json_normalize() Method - Easy Dictionary to Pandas DataFrame Conversion

And finally, we have the json_normalize() function from Pandas. It’s useful when you want to convert a list of dictionaries into a Pandas DataFrame, and for flat records it behaves much like the from_records() method.

It expects data to be formatted as a list, where each list item is a dictionary, consisting of key-value pairs for each column of the DataFrame:

employees = [
    {"first_name": "Bob", "last_name": "Doe", "email": "bdoe@company.com"},
    {"first_name": "Mark", "last_name": "Markson", "email": "mmarkson@company.com"},
    {"first_name": "Jane", "last_name": "Swift", "email": "jswift@company.com"},
    {"first_name": "Patrick", "last_name": "Johnson", "email": "pjohnson@company.com"}
]

data = pd.json_normalize(employees)
data

The DataFrame is identical to the ones we had in examples 1 to 3:

Image 5 - Pandas DataFrame with json_normalize() (Image by author)

And that’s 5 ways to convert a Python dictionary to a Pandas DataFrame. Let’s go over some frequently asked questions next.


Dictionary to DataFrame Q&A

This section will walk you through answers to four commonly asked questions when converting a dictionary to DataFrame.

Q: How to Convert a Dictionary to Pandas DataFrame?

A: The answer depends heavily on how your data is formatted. Typically, you’ll have a dictionary in which the keys map to DataFrame column names, and the values are lists holding each column’s entries, one per row.

Here’s an example of such formatting:

employees = {
    "first_name": ["Bob", "Mark", "Jane", "Patrick"],
    "last_name": ["Doe", "Markson", "Swift", "Johnson"],
    "email": ["bdoe@company.com", "mmarkson@company.com", "jswift@company.com", "pjohnson@company.com"]
}

If you have data in this format, use the from_dict() method to convert the dictionary to a Pandas DataFrame, as shown in sections 3 and 4. Even simpler, you can pass the entire employees dictionary straight into pd.DataFrame(), as shown in section 1. The from_records() and json_normalize() approaches from sections 2 and 5 are meant for a list of dictionaries instead.

Q: How do I Convert a List of Dictionaries to a DataFrame?

A: If you have a Python list in which list items are dictionaries, you can make a conversion to Pandas DataFrame by using the from_records() method. Here’s an example of the data format you should have:

employees = [
    {"first_name": "Bob", "last_name": "Doe", "email": "bdoe@company.com"},
    {"first_name": "Mark", "last_name": "Markson", "email": "mmarkson@company.com"},
    {"first_name": "Jane", "last_name": "Swift", "email": "jswift@company.com"},
    {"first_name": "Patrick", "last_name": "Johnson", "email": "pjohnson@company.com"}
]

If this is the case for you, simply refer to the second section of this article for a practical example.
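
For flat records like these, the DataFrame constructor and json_normalize() should give the same rows as from_records(). Here is a quick sanity check, using a shortened two-record list as a stand-in for the data above (the variable names are arbitrary):

```python
import pandas as pd

records = [
    {"first_name": "Bob", "last_name": "Doe", "email": "bdoe@company.com"},
    {"first_name": "Mark", "last_name": "Markson", "email": "mmarkson@company.com"},
]

frames = {
    "from_records": pd.DataFrame.from_records(records),
    "constructor": pd.DataFrame(records),
    "json_normalize": pd.json_normalize(records),
}

# Convert each DataFrame back to a list of dicts and compare with the input
for name, frame in frames.items():
    print(name, frame.to_dict("records") == records)
```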

Q: Can You Create Your Own DataFrame using Dictionary Key-Value Pairs?

A: You can. Python dictionaries store data in a key-value pair format. In Pandas terms, each dictionary key maps to a column name, and the items of the corresponding list map to individual rows. Dictionary values must be lists of identical lengths. Each list can hold a different data type, and the items can be any Python objects.

Here’s an example of how you should structure your data:

data = {
    "column_1": ["value 1", "value 2", "value 3"],
    "column_2": [1, 2, 3],
    "column_3": [True, False, True],
}

pd.DataFrame(data)

And this is the resulting DataFrame:

Image 6 - Pandas DataFrame from dict key-value pairs (Image by author)
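
To check which data type Pandas inferred for each column, you can inspect the dtypes attribute. Here is a short sketch that reuses the same dictionary (named `raw` here, an arbitrary choice, so it doesn't shadow the example above):

```python
import pandas as pd

raw = {
    "column_1": ["value 1", "value 2", "value 3"],
    "column_2": [1, 2, 3],
    "column_3": [True, False, True],
}

frame = pd.DataFrame(raw)

# One dtype per column: text, integer, and boolean respectively
print(frame.dtypes)
```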

Q: How do You Create a Pandas DataFrame from a Dictionary of Dictionaries?

A: You can convert a nested Python dictionary to a Pandas DataFrame, but the conversion involves a bit of extra work. Take a look at the following data - it’s a list of dictionaries in which each dictionary holds a child dictionary under the details key:

employees = [
    {
        "id": 1, 
        "first_name": "Bob", 
        "last_name": "Doe", 
        "details": {
            "email": "bdoe@company.com",
            "phone": "000-1111-222"
        }
    },
    {
        "id": 2, 
        "first_name": "Mark", 
        "last_name": "Markson", 
        "details": {
            "email": "mmarkson@company.com",
            "phone": "111-2222-333"
        }
    },
    {
        "id": 3, 
        "first_name": "Jane", 
        "last_name": "Swift", 
        "details": {
            "email": "jswift@company.com",
            "phone": "222-3333-444"
        }
    },
    {
        "id": 4, 
        "first_name": "Patrick", 
        "last_name": "Johnson", 
        "details": {
            "email": "pjohnson@company.com",
            "phone": "333-4444-555"
        }
    }
]

Now, you need to flatten this list of dictionaries before conversion. Failing to do so will result in a column named details that would have a dictionary as a value for each row. That’s likely not what you want.

The following Python function flattens a nested Python dictionary:

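def flatten_dict(d: dict) -> dict:
    """Flatten nested dicts/lists into one dict with underscore-joined keys."""
    out = {}

    def flatten(x, name: str = ""):
        if isinstance(x, dict):
            # Descend into each key, extending the prefix with the key name
            for key, value in x.items():
                flatten(value, f"{name}{key}_")
        elif isinstance(x, list):
            # List items are keyed by their position
            for i, item in enumerate(x):
                flatten(item, f"{name}{i}_")
        else:
            # Leaf value: drop the trailing underscore from the prefix
            out[name[:-1]] = x

    flatten(d)
    return out


employees_flat = [flatten_dict(d) for d in employees]
employees_flat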

As you can see, you have to apply the function to each list item, and the flattened result looks like this:

Image 7 - Flattening a nested Python dictionary structure (Image by author)

We know how to work with this type of data, and a conversion to a Pandas DataFrame boils down to passing the flattened list into a new Pandas DataFrame:

data = pd.DataFrame(employees_flat)
data

The resulting DataFrame has 5 columns in total:

Image 8 - Nested Python dictionary structure as a Pandas DataFrame (Image by author)
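
As an alternative to a hand-written helper, json_normalize() can flatten nested dictionaries on its own. The sketch below is a suggestion rather than the approach used above. It works on a shortened two-record version of the data (named `nested` for illustration) and passes sep="_" so the column names use underscores, since the default separator is a dot:

```python
import pandas as pd

# Shortened stand-in for the nested employees list
nested = [
    {
        "id": 1,
        "first_name": "Bob",
        "last_name": "Doe",
        "details": {"email": "bdoe@company.com", "phone": "000-1111-222"},
    },
    {
        "id": 2,
        "first_name": "Mark",
        "last_name": "Markson",
        "details": {"email": "mmarkson@company.com", "phone": "111-2222-333"},
    },
]

# Nested keys are joined to their parent key with the separator
flat = pd.json_normalize(nested, sep="_")
print(sorted(flat.columns))
```

One difference worth knowing: json_normalize() leaves lists inside a record as they are unless you use its record_path argument, whereas the flatten_dict() helper expands list items into separate keys.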

Let’s make a short recap of everything learned in this article.


Summing up Pandas Dictionary to DataFrame

Converting a list or a dictionary to a Pandas DataFrame is a task you’ll do almost daily as a Data Analyst. The good thing for you is that you now know how to approach these conversions, and how your data should be structured to avoid errors.

This article showed you 5 distinct ways to convert a dictionary to DataFrame, with two distinct input data formats. You’ve also learned how to convert a nested dictionary to a DataFrame by flattening the input list first. That’s pretty much everything you need to know to dive deeper into Pandas, so stay tuned to the following articles to learn more.