DIY Python – How to setup Pizzapy database on SQL Server, quickly generate sample data in Orders table for data analysis and data science experiments with Python

DIY Python – How to setup Pizzapy database on SQL Server, quickly generate sample data in Orders table for data analysis and data science experiments with Python

Download SQL Script files here:

1. Some knowledge of T-SQL
2. Knowledge of SQL Server. Download 180-day free trial of SQL Server 2019 here
3. Knowledge of Microsoft Management Studio. Download MMS is here. It’s Free!

As we continue to explore Python for Data Science experiments, we need some sample data for sure. Well, we have lots of examples out there on internet and YouTube showing various data samples using CSV, TXT and Excel sheets. I thought it would be a good idea to introduce a new dataset with data randomness as a feature instead of using static datasets like public datasets available in the form of CSV, Excel, Txt etc.

I believe data randomness is what makes the data analysis or data science experiments more interesting and throw some surprise data scenarios. So, I created this simple database called Pizzapy. What is Pizzapy database anyway?

Pizzapy database represents an order processing data capture of a fictional Pizza Chain called  Pizzapy Inc.  My plan is to enhance this data model as and when necessary. At this point Pizzapy database model represents an OLTP relational database mainly created for sample data purposes.  By the way, real world pizza order processing data capture is a way more complex and as a Solution & Data Architect I would not really recommend this model for any other purposes other than learning.

Pizzapy Database Model – version 1

Sample data in master tables (ex: Customer, Employee, OrderDst) is static to keep it simple.  Data in Order table is generated through a separate script to introduce randomness. By the way there is no duplicate record check in the Order data creation script so keep that in mind If you are planning to run the script twice but the Identity column on the Order table does guarantee unique OrderId.

Just truncate the Order table and rerun the order data creation script again. If you are adventurous go ahead and change the parameters in the script to load more or less data based on your requirements. My suggestion is more data is better. If your prefer static data then create a table (ex: Order_11192019) immediately after loading data into Order table.

Note: OrderExtra and Extras tables are created for future use. Extras table has some sample data but there is no script to load data into OrderExtra table at this time.

Pizzapy data model has, Employee table for Employee records, Store table for store locations, Customer table for customers, OrderSrc table has records to show from where the order was initiated, OrderDst table has records to show how the order will be fulfilled. PizzaSize, PizzaCrust and PizzaToppings tables has all the required data to create perfectly desired Pizza by a customer.

Order table is where we see all this master data come into life with each record representing a perfect and delicious pizza created by the customer. To give some more context, customers can place orders from a Mobile App, Website, Phone or by just walking in to the store. Customer can also choose their preferred options like Takeout, Delivered or Dine In.

To keep everything in perspective, with some Data Science experiments using Python, we would like to answer few simple business oriented questions/prediction like “What is my stores’ rush hour tomorrow?”, “How many employees do I need in the kitchen tomorrow/next week to meet the demand?” , “Do I have all the Pizza materials properly stocked to meet the demand, tomorrow or in the coming days”

Hands-on demo: (video) – I thought it would be a good idea to show and tell.
Note: Feel free to skip the portions of the video where you feel like!