Python: Convert Database(Tables) to XLSX Workbook
Introduction
In this blog, we will learn how to convert the database tables into XLSX workbooks with multiple sheets using python.
The database that we will use is MongoDB, and we will be using the MongoDB cloud for the same.
For the IDE, we will be using pycharm.
Lets’ start coding.
Dataset
We are going to use the covid-19 India dataset from Kaggle- https://www.kaggle.com/sudalairajkumar/covid19-in-india
As we can see from the dataset it is the covid record with the statewide numbers.
Let’s first store this CSV in our MongoDB cloud and make a database there.
After logging into the MongoDB cloud, we will land on a UI, that will look something like this.
Now we will click on the create button and will select the shared version, as this is only for the learning purpose.
Lets’ create a database name covid and have a collection inside it named covid_state_dataset.
Once you have created your collection it will look something like this.
Now, let’s insert the CSV into this collection using python only.
Let’s open our pycharm and create a file and name it insert_data.py
We are going to use pymongo for connecting with MongoDB, so make sure you already have it installed.
pip3 install pymongo
Let’s start writing our code.
First of all, we are going to create a function that will convert the CSV into a list of documents, so that it can be inserted easily into our database.
Don’t worry about the code I will also provide the GitHub link at the end.
import csv
def get_dataset_list():
with open('covid_19_india.csv') as f:
dataset = [{k:v for k, v in row.items()} for row in csv.DictReader(f, skipinitialspace = True)]
return dataset
Now lets’ write the MongoDB function to insert this data into our database.
You will be required to whitelist your IP address, as mongo cloud requires that for security reasons.
it can be easily done by clicking on the network access and then on the IP Access List.
MongoDB connection string can be found from the connect option in the database menu.
Let’s write our code to insert this dataset into our collection.
import pymongo
import csv
def get_dataset_list():
with open('covid_19_india.csv') as f:
dataset = [{k:v for k, v in row.items()} for row in csv.DictReader(f, skipinitialspace = True)]
return dataset
def insert_data():
dataset = get_dataset_list()
try:
client = pymongo.MongoClient('mongo connection string')
db = client['covid']
collection = db['covid_state_dataset']
collection.insert_many(dataset)
return "Data inserted successfully"
except Exception as e:
print(e)
return "Sorry! something went wrong."
print(insert_data())
In this code block, just replace the mongo connection string with your string, it will work.
In MongoDB cloud, we can see that 18110 records have been inserted now.
We are now done with the data ingestion thing, Let’s now do the vice-versa of this but in a more clear way.
Lets’ create an XLSX workbook of this database with sheets according to the states.
Database Collection to XLSX File
Lets’ say we want to have the XLSX file with a dataset of Kerala, Uttar Pradesh, and Bihar.
We will be using padas to create the workbook.
def create_xlsx_workbook():
try:
# connection with mogodb
client = pymongo.MongoClient('mongo connection string')
db = client['covid']
collection = db['covid_state_dataset']
# state list for which we want the dataset
state_list = ["Kerala", "Bihar", "Uttar Pradesh"]
sheet_list= {}
for state in state_list:
# getting records of each state
dataset = list(collection.find({"State/UnionTerritory": state}))
# converting the dataset into dataframe
dataset = pd.DataFrame.from_dict(dataset)
# creating a dictionaries with key as state name and value as the dataframe
sheet_list[state] = dataset
writer = pd.ExcelWriter('./covid.xlsx', engine='xlsxwriter')
# looping through the state names and writing their dataframe into sheets
for sheet_name in sheet_list.keys():
sheet_list[sheet_name].to_excel(writer, sheet_name=sheet_name, index=False)
writer.save()
return "Workbook created successfully"
except Exception as e:
print(e)
return "Sorry! something went wrong."
Finally, we have the workbook with us, with the state-wise sheets. We have used pandas to create the workbook, so make sure you have installed it.
pip3 install pandas
The Workbook will be saved in the same directory where we have our code file.
Github Link – https://github.com/Utkarsh731/database_to_xlsx_workbook/tree/main
Conclusion
Thanks for reading this blog, I hope you must have found it informative, let me know if you have any queries.
Still Curious? Visit my website to know more!
For more interesting Blogs Visit- Utkarsh Shukla Author
Add Comment
You must be logged in to post a comment.