Small Business Administration (SBA) Loan Data Set

Description
This is my project on SBA loan data. It contains two predictive analytics models built following the CRISP-DM methodology: a classification model that predicts loan repayment and a regression model that predicts job creation. This was my first project using PyCaret for model selection and parameter tuning.
Data
The data set is hosted on Kaggle and was provided by Data-Science Sean.
Executive Summary
This video was completed a few weeks before the end of the project.
Final Report
The final report and video cover the predictive analytics work on the Small Business Administration data set.
EDA
A pandas-profiling report is available.
Code
The Exploratory Data Analysis was performed in R and can be downloaded from here.
The primary code is written in Python in a Jupyter Notebook.
If you have trouble with GitHub rendering the file, please try here.
The supporting code for model selection and parameter tuning is also written in Python in Jupyter Notebooks:
- Regression Model Selection
- Classification Model Selection
- Regression Model Tuning
- Classification Model Tuning
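For readers new to PyCaret, the selection and tuning steps in these notebooks follow roughly the pattern below. This is a minimal sketch, not the notebooks' actual code: the function name `select_and_tune`, the `seed` default, and the single-function layout are assumptions made for illustration. Only `setup`, `compare_models`, and `tune_model` are PyCaret functions.

```python
import importlib

# PyCaret ships separate modules for the two supervised tasks used here.
SUPPORTED_TASKS = ("classification", "regression")


def select_and_tune(data, target, task, seed=42):
    """Compare candidate models with PyCaret, then tune the best one.

    data   -- a pandas DataFrame holding the features and the target column
    target -- name of the target column (hypothetical; depends on your data)
    task   -- "classification" or "regression"
    """
    if task not in SUPPORTED_TASKS:
        raise ValueError(f"task must be one of {SUPPORTED_TASKS}, got {task!r}")

    # Imported lazily so this function can be defined without PyCaret installed.
    pc = importlib.import_module(f"pycaret.{task}")

    # Prepare the experiment; session_id fixes the random seed.
    pc.setup(data=data, target=target, session_id=seed, verbose=False)
    best = pc.compare_models()   # cross-validated comparison of model families
    tuned = pc.tune_model(best)  # hyperparameter search on the best candidate
    return tuned
```

A call might look like `select_and_tune(loans, target="repaid", task="classification")`, where `loans` and `"repaid"` are placeholder names rather than identifiers taken from the data set.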
Instructions
To run the notebook locally, install Jupyter and all the library dependencies, download the code and the data set, and update the file path in the notebook so that it points to your local copy of the data.
The R code is best run from RStudio. You will also need to download the code and data.
Anaconda can be used for both.
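Before opening the notebook, a quick check like the one below can confirm that the dependencies are importable and that the data file is where you expect it. It is a sketch under stated assumptions: the package list in `REQUIRED_PACKAGES` and the path in `DATA_PATH` are hypothetical and should be adjusted to match the notebook's imports and your download location.

```python
import importlib.util
from pathlib import Path

# Assumed package list; adjust to match the imports in the notebooks.
REQUIRED_PACKAGES = ["pandas", "pycaret", "notebook"]

# Hypothetical location of the downloaded data set; change to your own path.
DATA_PATH = Path("data") / "sba_loans.csv"


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_environment(packages=REQUIRED_PACKAGES, data_path=DATA_PATH):
    """Collect readable problems; an empty list means nothing was flagged."""
    problems = [f"missing package: {name}" for name in missing_packages(packages)]
    if not Path(data_path).is_file():
        problems.append(f"data file not found: {data_path}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

The script prints one line per problem and prints nothing when every package is found and the data file exists.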
Tools
- R
- RStudio
- Python
- PyCaret
- Jupyter Notebook