logo-StatusNeo
d

WE ARE EXPERTS IN TECHNOLOGY

Let’s Work Together

Image Alt

StatusNeo

Utkarsh Shukla blogs

Python: Convert Database(Tables) to XLSX Workbook

Introduction

In this blog, we will learn how to convert the database tables into XLSX workbooks with multiple sheets using python.

The database that we will use is MongoDB, and we will be using the MongoDB cloud for the same.

For the IDE, we will be using pycharm.

Lets’ start coding.

Dataset

We are going to use the covid-19 India dataset from Kaggle- https://www.kaggle.com/sudalairajkumar/covid19-in-india

Covid Dataset

As we can see from the dataset it is the covid record with the statewide numbers.

Let’s first store this CSV in our MongoDB cloud and make a database there.

MonngoDB Atlas UI

After logging into the MongoDB cloud, we will land on a UI, that will look something like this.

Now we will click on the create button and will select the shared version, as this is only for the learning purpose.

Cluster Creation UI

Lets’ create a database name covid and have a collection inside it named covid_state_dataset.

Once you have created your collection it will look something like this.

Collection created

Now, let’s insert the CSV into this collection using python only.

Let’s open our pycharm and create a file and name it insert_data.py

We are going to use pymongo for connecting with MongoDB, so make sure you already have it installed.

pip3 install pymongo

Let’s start writing our code.

First of all, we are going to create a function that will convert the CSV into a list of documents, so that it can be inserted easily into our database.

Don’t worry about the code I will also provide the GitHub link at the end.


import csv
def get_dataset_list():
    with open('covid_19_india.csv') as f:
        dataset = [{k:v for k, v in row.items()} for row in csv.DictReader(f, skipinitialspace = True)]
    return dataset
Code compiled successfully

Now lets’ write the MongoDB function to insert this data into our database.

You will be required to whitelist your IP address, as mongo cloud requires that for security reasons.

it can be easily done by clicking on the network access and then on the IP Access List.

IP Whitelisting UI

MongoDB connection string can be found from the connect option in the database menu.

MongoDB Connect to cluster UI

Let’s write our code to insert this dataset into our collection.

import pymongo
import csv

def get_dataset_list():
    with open('covid_19_india.csv') as f:
        dataset = [{k:v for k, v in row.items()} for row in csv.DictReader(f, skipinitialspace = True)]
    return dataset


def insert_data():
    dataset = get_dataset_list()
    try:
        client = pymongo.MongoClient('mongo connection string')
        db = client['covid']
        collection = db['covid_state_dataset']
        collection.insert_many(dataset)
        return "Data inserted successfully"
    except Exception as e:
        print(e)
        return "Sorry! something went wrong."

print(insert_data())

In this code block, just replace the mongo connection string with your string, it will work.

Code Compiled Successfully
Covid state dataset collection

In MongoDB cloud, we can see that 18110 records have been inserted now.

We are now done with the data ingestion thing, Let’s now do the vice-versa of this but in a more clear way.

Lets’ create an XLSX workbook of this database with sheets according to the states.

Database Collection to XLSX File

Lets’ say we want to have the XLSX file with a dataset of Kerala, Uttar Pradesh, and Bihar.

We will be using padas to create the workbook.

def create_xlsx_workbook():
    try:
        # connection with mogodb
        client = pymongo.MongoClient('mongo connection string')
        db = client['covid']
        collection = db['covid_state_dataset']
        # state list for which we want the dataset
        state_list = ["Kerala", "Bihar", "Uttar Pradesh"]
        sheet_list= {}
        for state in state_list:
            # getting records of each state
            dataset = list(collection.find({"State/UnionTerritory": state}))
            # converting the dataset into dataframe
            dataset = pd.DataFrame.from_dict(dataset)
            # creating a dictionaries with key as state name and value as the dataframe
            sheet_list[state] = dataset
        writer = pd.ExcelWriter('./covid.xlsx', engine='xlsxwriter')
        # looping through the state names and writing their dataframe into sheets
        for sheet_name in sheet_list.keys():
            sheet_list[sheet_name].to_excel(writer, sheet_name=sheet_name, index=False)
        writer.save()
        return "Workbook created successfully"

    except Exception as e:
        print(e)
        return "Sorry! something went wrong."
Code compiled Successfully
Workbook result after code compilation

Finally, we have the workbook with us, with the state-wise sheets. We have used pandas to create the workbook, so make sure you have installed it.

pip3 install pandas

The Workbook will be saved in the same directory where we have our code file.

Github Link – https://github.com/Utkarsh731/database_to_xlsx_workbook/tree/main

Conclusion

Thanks for reading this blog, I hope you must have found it informative, let me know if you have any queries.

For more blogs checkout-

An optimistic Engineer passionate about technology, science, astrology, and photography. Visit this to know more- https://utkarshshukla.herokuapp.com/

Add Comment