lending club case study ppt


LENDING CLUB LOAN ANALYSIS

Published by Shavonne Austin Modified over 6 years ago

Presentation on theme: "LENDING CLUB LOAN ANALYSIS"— Presentation transcript:

Brief introduction on Logistic Regression

Chapter 8 – Logistic Regression

Correlation and regression Dr. Ghada Abo-Zaid

Mathematics SL Internal Assessment

Regression Analysis Once a linear relationship is defined, the independent variable can be used to forecast the dependent variable. Y ^ = bo + bX bo is.

Multiple Criteria for Evaluating Land Cover Classification Algorithms Summary of a paper by R.S. DeFries and Jonathan Cheung-Wai Chan April, 2000 Remote.

x – independent variable (input)

Simple Linear Regression

Final Project: Project 9 Part 1: Neural Networks Part 2: Overview of Classifiers Aparna S. Varde April 28, 2005 CS539: Machine Learning Course Instructor:

Lecture 24: Thurs., April 8th

Data Mining CS 341, Spring 2007 Lecture 4: Data Mining Techniques (I)

Chapter 7 Correlational Research Gay, Mills, and Airasian

Slide 1 Testing Multivariate Assumptions The multivariate statistical techniques which we will cover in this class require one or more the following assumptions.

Decision Tree Models in Data Mining

AUDIT PROCEDURES. Commonly used Audit Procedures Analytical Procedures Analytical Procedures Basic Audit Approaches - Basic Audit Approaches - System.

April 11, 2008 Data Mining Competition 2008 The 4 th Annual Business Intelligence Symposium Hualin Wang Manager of Advanced.

Data Mining Techniques

About project

End to End Case Study (Classification): Lending Club data

Pawan Reddy Ulindala

Towards Data Science

Lending Club is a lending platform that lends money to people in need at an interest rate based on their credit history and other factors. In this blog, we will analyze Lending Club loan data, pre-process it for our needs, and build a machine learning model that can identify a potential defaulter based on his/her history of transactions with Lending Club. You can find the data here.

This dataset contains 42538 rows and 144 columns. Many of these 144 columns consist mostly of null values.

In fact, 63.15% of the values in the overall data are null values. So, it is very important to carefully deal with these null values as they can significantly affect our results.
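A figure like the overall null percentage can be computed in a couple of lines with pandas. The sketch below uses a tiny made-up frame (the column values are placeholders, not the real Lending Club data) to show the calculation for the whole frame and per column.

```python
import numpy as np
import pandas as pd

# Tiny stand-in frame; the real data would be loaded from the Lending Club CSV.
df = pd.DataFrame({
    "loan_amnt": [1000, 2000, np.nan, 4000],
    "desc": [np.nan, np.nan, np.nan, "car repair"],
    "grade": ["A", "B", "C", np.nan],
})

# Share of null cells across the whole frame, as a percentage.
overall_null_pct = df.isnull().sum().sum() / df.size * 100

# Null percentage of each column, highest first.
null_pct_by_column = df.isnull().mean().mul(100).sort_values(ascending=False)

print(f"{overall_null_pct:.2f}% of all values are null")
print(null_pct_by_column)
```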

Null values visual plot:

Handling null values:

Handling null values is an important task here. Only 53 of the 144 columns have less than 40 percent null values.

In the summary table, each row gives the number of columns (out of 144) with less than a specific percentage of null values. For example, row 1 shows that 52 columns have less than 10% null values.

By keeping only the columns with less than 40% null values, we were able to decrease the total number of columns from 144 to 53.
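The threshold-based filtering described above can be sketched as follows. The helper names `drop_sparse_columns` and `columns_below_thresholds` are hypothetical, and the small frame is made-up data used only to make the example runnable.

```python
import numpy as np
import pandas as pd

def drop_sparse_columns(frame: pd.DataFrame, max_null_pct: float = 40.0) -> pd.DataFrame:
    """Keep only the columns whose null percentage is below max_null_pct."""
    null_pct = frame.isnull().mean() * 100
    return frame.loc[:, null_pct < max_null_pct]

def columns_below_thresholds(frame: pd.DataFrame, thresholds=(10, 20, 30, 40, 50)) -> pd.Series:
    """Count the columns with a null percentage below each threshold."""
    null_pct = frame.isnull().mean() * 100
    return pd.Series({t: int((null_pct < t).sum()) for t in thresholds}, name="n_columns")

# Made-up stand-in data: 0%, 20% and 60% null columns.
df = pd.DataFrame({
    "loan_amnt": [1000, 2000, 3000, 4000, 5000],
    "emp_title": ["a", np.nan, "c", "d", "e"],
    "desc": [np.nan, np.nan, np.nan, "x", "y"],
})

reduced = drop_sparse_columns(df)       # drops "desc"
summary = columns_below_thresholds(df)  # one row per threshold
```

On the real data, the same call with `max_null_pct=40` would be expected to leave the 53 columns mentioned above.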

Understanding Features

It is important to understand the features/columns, as some of the categorical columns in the data are stored as numerical values and vice versa. I first tried to examine every column but soon realized that doing this for all 53 columns would be quite cumbersome. So, I decided to first eliminate the columns that don't add value to the data and then analyze each remaining field.

Checking objects:

Dropping unnecessary objects:

Checking numerical columns:

Dropping unnecessary numerical columns:

After examining the data, we dropped 18 of these 53 columns because they didn't add value to our data. This decreased the number of columns from 53 to 35, and we will try to decrease it further.

Converting categorical columns to numerical columns:

We converted categorical columns to numerical ones by performing either one-hot encoding or label encoding, depending on the kind of data they represent. For example, one-hot encoding was performed on the ['home_ownership', 'verification_status', 'purpose'] columns, whereas label encoding was performed on the 'grade' and 'sub_grade' columns, as they are ordinal in nature.

One hot encoding:
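A minimal sketch of the one-hot step, assuming `pd.get_dummies` is used (the original code is not shown here, so this is one reasonable way to do it). The rows are made-up examples.

```python
import pandas as pd

# Made-up rows using the three nominal columns named in the text.
df = pd.DataFrame({
    "home_ownership": ["RENT", "OWN", "MORTGAGE"],
    "verification_status": ["Verified", "Not Verified", "Verified"],
    "purpose": ["car", "credit_card", "car"],
    "loan_amnt": [5000, 2500, 10000],
})

nominal_cols = ["home_ownership", "verification_status", "purpose"]

# Each category becomes its own 0/1 column; the original columns are removed.
encoded = pd.get_dummies(df, columns=nominal_cols, dtype=int)
print(encoded.columns.tolist())
```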

Label encoding:

Updating the grade column with label encoded values:
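Because 'grade' and 'sub_grade' are ordinal, the encoding should preserve their order. The sketch below uses an explicit mapping rather than a library encoder, which is an assumption on my part; it assumes grades run from A to G and sub-grades from A1 to G5.

```python
import pandas as pd

# Made-up rows for illustration.
df = pd.DataFrame({
    "grade": ["B", "A", "G", "C"],
    "sub_grade": ["B2", "A1", "G5", "C3"],
})

# Assumed ordering: A is the best grade, G the worst; five sub-grades per grade.
grade_order = list("ABCDEFG")
sub_grade_order = [f"{g}{n}" for g in grade_order for n in range(1, 6)]

# Replace each label with its rank in the ordering.
df["grade"] = df["grade"].map({g: i for i, g in enumerate(grade_order)})
df["sub_grade"] = df["sub_grade"].map({s: i for i, s in enumerate(sub_grade_order)})
```

An explicit mapping makes the intended order visible in the code, and any label outside the assumed set shows up as NaN instead of being silently encoded.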

Converting DateTime columns to numerical columns:

The datetime columns ['issue_d', 'last_pymnt_d', 'last_credit_pull_d'] are each split into month and year using pandas' datetime functionality. The new columns are named 'issue_d_year', 'issue_d_month', 'last_pymnt_d_year', 'last_pymnt_d_month', 'last_credit_pull_d_year', and 'last_credit_pull_d_month', respectively.
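The split into year and month can be sketched like this. The date format `"%b-%Y"` (for example `Dec-2011`) is an assumption about how the raw strings look, and the rows are made up.

```python
import pandas as pd

# Made-up rows; None stands for a missing date.
df = pd.DataFrame({
    "issue_d": ["Dec-2011", "Jan-2010"],
    "last_pymnt_d": ["Jan-2015", None],
    "last_credit_pull_d": ["May-2016", "Aug-2012"],
})

date_cols = ["issue_d", "last_pymnt_d", "last_credit_pull_d"]

for col in date_cols:
    # Assumed format: abbreviated month name, dash, four-digit year.
    parsed = pd.to_datetime(df[col], format="%b-%Y")
    df[f"{col}_year"] = parsed.dt.year
    df[f"{col}_month"] = parsed.dt.month

# The original string columns are no longer needed.
df = df.drop(columns=date_cols)
```

Missing dates become NaN in the new columns, so they are picked up by the later null-handling step.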

Converting objects to numerical columns:

The columns int_rate and term are stored as objects. We have performed necessary string operations to convert them into numerical columns.
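A minimal sketch of those string operations, assuming `int_rate` looks like `"10.65%"` and `term` looks like `" 36 months"` (the exact raw formats are an assumption).

```python
import pandas as pd

# Made-up rows in the assumed raw formats.
df = pd.DataFrame({
    "int_rate": [" 10.65%", "7.9%"],
    "term": [" 36 months", " 60 months"],
})

# Strip whitespace and the trailing percent sign, then convert to float.
df["int_rate"] = df["int_rate"].str.strip().str.rstrip("%").astype(float)

# Pull out the number of months and convert to int.
df["term"] = df["term"].str.extract(r"(\d+)", expand=False).astype(int)
```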

Checking correlation: Now that we have converted all the columns to numerical columns, we will check for correlation.

There are a few columns with high correlation, but these columns are not used when answering our questions. For example, when trying to classify whether the loan will be paid back by the customer, we will not consider any future transactions such as total_pymnt and total_pymnt_inv. Hence, these columns aren't dropped here.
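One way to list the highly correlated pairs is to look at the upper triangle of the absolute correlation matrix. The 0.9 cutoff and the synthetic data below are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic data: two nearly identical columns and one unrelated column.
rng = np.random.default_rng(0)
base = rng.normal(size=200)
df = pd.DataFrame({
    "total_pymnt": base,
    "total_pymnt_inv": base * 0.98 + rng.normal(scale=0.01, size=200),
    "annual_inc": rng.normal(size=200),
})

corr = df.corr().abs()

# Keep only the entries above the diagonal so each pair appears once.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

# Pairs whose absolute correlation exceeds the assumed cutoff of 0.9.
pairs = upper.stack()
high_pairs = pairs[pairs > 0.9]
print(high_pairs)
```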

Dealing with null values:

Let’s check if there are any null values after significantly cleaning columns.

As we can see, we can still find some null values in the data. We will examine these null values and take the necessary actions.

Let's check which columns have the highest percentage of null values.

Some columns have a very small percentage of null values (less than 1%). For these, we can replace the null values with the median of the respective column.
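The median fill for sparsely missing columns might look like the sketch below. The helper name `fill_low_null_with_median` is hypothetical and the data is synthetic.

```python
import numpy as np
import pandas as pd

def fill_low_null_with_median(frame: pd.DataFrame, max_null_pct: float = 1.0) -> pd.DataFrame:
    """Fill nulls with the column median, but only in columns below max_null_pct nulls."""
    null_pct = frame.isnull().mean() * 100
    cols = null_pct[(null_pct > 0) & (null_pct < max_null_pct)].index
    out = frame.copy()
    out[cols] = out[cols].fillna(out[cols].median())
    return out

# Synthetic data: one column with 0.5% nulls, one with 50% nulls.
df = pd.DataFrame({
    "revol_util": np.arange(200, dtype=float),
    "mths_since_last_delinq": np.arange(200, dtype=float),
})
df.loc[0, "revol_util"] = np.nan
df.loc[::2, "mths_since_last_delinq"] = np.nan

filled = fill_low_null_with_median(df)
```

Only `revol_util` is filled here; the heavily missing column is left alone for a different treatment.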

For columns that have a high percentage of null values, we will run a model on top of non-null values and predict the missing values in that respective column.
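The text does not say which model was used for this step, so the sketch below assumes a plain linear regression and a hypothetical helper `impute_with_model`. It fits on the rows where the target column is present and predicts the rows where it is missing. The data is synthetic.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

def impute_with_model(frame, target, features, model=None):
    """Predict missing values of `target` from `features` using the non-null rows."""
    if model is None:
        model = LinearRegression()  # assumed choice; any regressor would fit here
    out = frame.copy()
    missing = out[target].isnull()
    if missing.any() and (~missing).any():
        model.fit(out.loc[~missing, features], out.loc[~missing, target])
        out.loc[missing, target] = model.predict(out.loc[missing, features])
    return out

# Synthetic data: installment is an exact linear function of loan_amnt.
rng = np.random.default_rng(0)
df = pd.DataFrame({"loan_amnt": rng.uniform(1000, 35000, size=100)})
df["installment"] = df["loan_amnt"] * 0.03
df.loc[::4, "installment"] = np.nan  # knock out every fourth value

filled = impute_with_model(df, "installment", ["loan_amnt"])
```

For a categorical column, a classifier would take the place of the regressor; the feature columns must themselves be free of nulls.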

Now that there are no null values left, we will go to the next step, i.e., building a machine learning model.

Classification

The goal of our classification task is to identify whether a customer who is requesting a loan will be able to repay it along with the interest, or will default. Some columns contain information about transactions that happen after the date on which the loan is taken (such as monthly installments paid after taking the loan). This information is not available at the time of the loan request, so we will drop these columns from the pre-processed dataset before carrying out the classification task.

Let's drop a few columns that contain post-issuance payment information, including information on charged-off loans. Columns dropped in classification: ['total_pymnt', 'total_pymnt_inv', 'total_rec_prncp', 'total_rec_int', 'total_rec_late_fee', 'recoveries']
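Dropping these columns is a one-liner in pandas. The sketch below uses made-up rows and passes `errors="ignore"` so the call also works when some of the listed columns are already gone.

```python
import pandas as pd

# Columns that are only known after the loan has been issued.
post_issue_cols = [
    "total_pymnt", "total_pymnt_inv", "total_rec_prncp",
    "total_rec_int", "total_rec_late_fee", "recoveries",
]

# Made-up rows containing only a subset of those columns.
df = pd.DataFrame({
    "loan_amnt": [5000, 2500],
    "total_pymnt": [5863.2, 1008.7],
    "recoveries": [0.0, 117.1],
    "loan_status": ["Fully Paid", "Charged Off"],
})

features = df.drop(columns=post_issue_cols, errors="ignore")
```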

The 'loan_status' column is used as the target variable to classify a customer based on his records. The 'loan_status' column has 4 unique values, all of which are label encoded for ease of representation. This column is labeled as below:

Checking multicollinearity between features using VIF and then dropping columns with a value higher than the threshold.

We dropped the columns that have a high VIF (10 or above).
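As a rough sketch of the VIF step: for standardized features, the VIF of each column equals the corresponding diagonal entry of the inverse correlation matrix, so it can be computed with NumPy alone. The helper names are hypothetical, the data is synthetic, and dropping one column at a time (highest VIF first) is my assumption about the procedure.

```python
import numpy as np
import pandas as pd

def vif_table(frame: pd.DataFrame) -> pd.Series:
    """VIF per column, taken from the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(frame.to_numpy(dtype=float), rowvar=False)
    return pd.Series(np.diag(np.linalg.inv(corr)), index=frame.columns, name="VIF")

def drop_high_vif(frame: pd.DataFrame, threshold: float = 10.0) -> pd.DataFrame:
    """Repeatedly drop the column with the highest VIF until all are below threshold."""
    out = frame.copy()
    while out.shape[1] > 1:
        vif = vif_table(out)
        if vif.max() < threshold:
            break
        out = out.drop(columns=vif.idxmax())
    return out

# Synthetic data: funded_amnt is almost a copy of loan_amnt.
rng = np.random.default_rng(0)
loan_amnt = rng.normal(size=300)
df = pd.DataFrame({
    "loan_amnt": loan_amnt,
    "funded_amnt": loan_amnt + rng.normal(scale=0.05, size=300),
    "annual_inc": rng.normal(size=300),
})

reduced = drop_high_vif(df)
```

Recomputing the VIFs after every drop matters, because removing one column of a collinear pair usually brings the other back below the threshold.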

Model building

After label encoding the target variable, we have split the data to train and test data in the ratio of 70:30.
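A 70:30 split with scikit-learn could look like this. The synthetic data, the `random_state`, and the use of `stratify` to keep the class proportions equal in both parts are assumptions added for the example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the pre-processed features and encoded target.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)

# 70% train, 30% test; stratify keeps the class balance similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42
)
```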

We used sklearn's cross_val_score and GridSearchCV, with the F1 score as the scoring metric, to examine the performance of each model in each fold. The figure below shows the F1 score of each model across 3 folds. The orange line represents the mean F1 score of each model, whereas the IQR reflects the spread of these scores.

From the above figure, we can say that the bagging classifier is the most stable model with the highest mean of weighted F1 scores and least variance.
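The comparison above can be reproduced in outline with `cross_val_score` and the `f1_weighted` scorer. The candidate models, their settings, and the synthetic imbalanced data below are assumptions; the blog's exact model list is not shown.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in data (about 80% / 20%).
X, y = make_classification(
    n_samples=300, n_features=8, weights=[0.8, 0.2], random_state=0
)

# Assumed candidate models for the comparison.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "bagging": BaggingClassifier(random_state=0),
}

# Three stratified folds, scored with the weighted F1 score.
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="f1_weighted")
    for name, model in models.items()
}

for name, fold_scores in scores.items():
    print(f"{name}: mean={fold_scores.mean():.3f}, std={fold_scores.std():.3f}")
```

A high mean together with a low standard deviation across folds is what "most stable" refers to in the paragraph above.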

Building the model using bagging classifier.

The classification report for bagging classifier

This model has an accuracy of 0.89 and an average F1 score of 0.75.

Final Confusion matrix:

As the bagging classifier doesn't expose feature importances directly, we used a decision tree to find feature importances.
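A minimal sketch of reading feature importances from a decision tree. The data is synthetic and the feature names are placeholders, not the real columns.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with placeholder feature names.
X, y = make_classification(
    n_samples=300, n_features=5, n_informative=2, n_redundant=0, random_state=0
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]

tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# Impurity-based importances; they sum to 1 across all features.
importances = pd.Series(
    tree.feature_importances_, index=feature_names
).sort_values(ascending=False)
print(importances)
```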

Conclusion :

In this blog, we have extensively covered pre-processing steps required for this data and then found the best fit model using Grid search and KFolds. I hope that this blog has given you an overall picture of solving a classification problem. For more detailed code, please refer to https://github.com/pawanreddy-u/lendingclub9

Written by Pawan Reddy Ulindala

Learn to write and write to learn


Assignment : Lending Club Case Study

Team : Vijay Garg and Santhosh Ankam

Date : 14 Feb 2021

Business Understanding

We are working for Lending Club, a finance company which specialises in lending various types of loans to urban customers. When the company receives a loan application, it has to make a decision on loan approval based on the applicant's profile. Two types of risks are associated with the company's decision:
• If the applicant is likely to repay the loan, then not approving the loan results in a loss of business to the company.
• If the applicant is not likely to repay the loan, i.e. he/she is likely to default, then approving the loan may lead to a financial loss for the company.

The company wants to understand the driving factors (or driver variables) behind loan default (loan_status = 'Charged Off'), i.e. the variables which are strong indicators of default. The company can utilise this knowledge for its portfolio and risk assessment.

Import the necessary libraries


IMAGES

GitHub
Lending Club Model Ppt Powerpoint Presentation Model Structure Cpb
GitHub
Lending Club Case Study Solution for Harvard HBR Case Study
Lending Club Case Study: Prabhat Sharma Brij Bhushan Paliwal
Job Description Case Studies

VIDEO

lending club stock LC stock multi bagger or a bad investment? 2027 valuation high risk high reward
Shifnal Cricket Club Case Study
One Club Guidance
Lending Club Review
Lending club
Market Leader Intermediate: Case Study Unit 4

COMMENTS

Lending Club Case Study
This document analyzes lending club loan data to predict loan defaults and calculate default probabilities using models like gradient boosting, neural networks, and logistic regression. The goal is to make informed decisions about future loans to assess profitability. Various machine learning models are trained and tested on the data, with ...
Pragyan-Choudhury/Lending_Club_Case_Study_PPT
General Information. As a part of a Consumer Lending Finance Company, which specialises in lending various types of loans, we need to identify the patterns which indicates if a loan is likely to Default. When the company receives a loan application, it has to make a decision for loan approval based on applicant's profile.
PDF Lending Club Case Study
Background -Lending Club Case Study Background Lending club is the largest peer-to-peer marketplace connecting borrowers with lenders. Borrowers apply through an online platform where they are assigned an internal score. Lenders decide 1) whether to lend and 2) the terms of loan such as interest rate, monthly instalment, tenure etc.
GitHub
Lending Club Case Study. Goals of data analysis: Lending loans to 'risky' applicants is the largest source of financial loss (called credit loss). The credit loss is the amount of money lost by the lender when the borrower refuses to pay or runs away with the money owed.
GitHub
Lending Club Case Study. This project involves a comprehensive Exploratory Data Analysis (EDA) of the Lending Club dataset with the objective of uncovering insights into how various consumer and loan attributes influence the tendency of borrowers to default. Lending Club, a peer-to-peer lending platform, provides a rich dataset encompassing ...
Default Prediction & Analysis on Lending Club Loan Data
This document analyzes lending club loan data to predict loan defaults and calculate default probabilities using models like gradient boosting, neural networks, and logistic regression. The goal is to make informed decisions about future loans to assess profitability. Various machine learning models are trained and tested on the data, with ...
LENDING CLUB LOAN ANALYSIS
LENDING CLUB ANALYSIS OVERVIEW Identifying the Business Problem Data Description Data Preparation & Processing Data Mining Models Logistic Regression Decision Trees K-Nearest Neighbor Neural Networks Summary of Findings Conclusion During this presentation we will move systematically through a discussion of our project. First we will give an overview of Lending Club model and the business ...
Lending Club Case Study
Lending Club Case Study - Free download as Powerpoint Presentation (.ppt / .pptx), PDF File (.pdf), Text File (.txt) or view presentation slides online. Scribd is the world's largest social reading and publishing site.
LendingClub Analysis 2014
Jun 24, 2014 • Download as PPTX, PDF •. This document discusses how peer-to-peer lending platforms like Lending Club are transforming banking by allowing individuals to directly invest in loans to borrowers. It outlines benefits for both borrowers and investors, such as lower rates and returns. While concerns about safety exist, these ...
End to End Case Study (Classification): Lending Club data
Photo by Avinash Kumar on Unsplash. Lending Club is a lending platform that lends money to people in need at an interest rate based on their credit history and other factors. In this blog, we will analyze this data and pre-process it based on our need and build a machine learning model that can identify a potential defaulter based on his/her history of transactions with Lending Club.
PDF LendingClub Bank Presentation
Marketplace bank delivers the best of both worlds, driving significant growth and profitability. Best-in-class Consumer Lending Platform - cycle tested and with a significant data advantage. Embedded 3M+ loyal member customer base, with 50% repeat borrowers. Large TAM in one of the fastest growing areas in financial services.
Lending Club Case Study Vijay Garg Santhosh Ankam
Business Understanding. We are working for Lending club a finance company which specialises in lending various types of loans to urban customers. When the company receives a loan application, the company has to make a decision for loan approval based on the applicant's profile. Two types of risks are associated with the bank's decision:
ishankarve/upgrad_lending_club_case_study
Lending Club Case Study You work for a consumer finance company which specialises in lending various types of loans to urban customers. When the company receives a loan application, the company has to make a decision for loan approval based on the applicant's profile.
Lending Club
LendingClub - Free download as Powerpoint Presentation (.ppt / .pptx), PDF File (.pdf), Text File (.txt) or view presentation slides online. 1) Foundation Capital, a venture capital firm, was considering increasing its 10.3% ownership stake in Lending Club, the leading peer-to-peer lending platform. 2) The document performed a top-down analysis of the large and growing US consumer credit and ...
LendingClub CaseStudy
LendingClub_CaseStudy - Free download as PDF File (.pdf), Text File (.txt) or read online for free. 1) The document discusses an EDA case study performed by Lending Club to understand the key drivers of loan defaults. 2) The analysis found that higher interest rates, loan amounts over 30% of annual income, revolving line utilization over 75%, prior bad records, and debt to income ratios over ...
PDF Harvard Law School Case Study: Lending Club
Case Study: Lending Club _____ Note: This memorandum was prepared by Anooshree C. Sinha. LL.M. '09, Harvard Law School, and Corinne Snow J.D. '112, Harvard Law School, under the supervision of Professor Howell E. Jackson of Harvard Law School. The memorandum is intended solely for educational purposes and does not represent an opinion of ...
chaitanya-vanapamala/lending-club-cs: Upgrad assignment
Lending Club Case Study. Assignment by Upgrad and IIIT-B. The case study focuses mainly on EDA, to understand which parameters matter most for detecting whether a customer will default on a loan or not. Presented a PPT to illustrate the major parameters to consider while giving loans, along with their data distributions. Contributors:
Case Study: Lending Club
Case Study: Lending Club 1 minute read Problem Statement. A consumer finance company specialises in lending various types of loans to urban customers. When the company receives a loan application, it has to make a decision for loan approval based on the applicant's profile. Two types of risks are associated with the bank's decision:
Lending Club Case Study Live Session.pdf
View Lending Club Case Study Live Session.pdf from CS 401 at ShriRam College of Engineering & Management. #LifeKoKaroLift Lending Club Case Study: Pre-Assignment Session 1 Course : ML/AI Edit Master ... This is need to be done for both PPT and the Jupyter Notebook 13 Lending Club: EDA Case Study .
Lending Club
Lending Club follows the path of founder and CEO Renaud Laplanche as he scales his successful P2P lending company both pre- and post-IPO. From debating with bankers on the proper valuation metrics for the company, to managing customer acquisition costs as the competitive landscape rapidly changes, the Lending Club case explores several key challenges that come with operating a fin-tech company ...
sukhijapiyush/Lending-Club-Case-Study
This company is the largest online loan marketplace, facilitating personal loans, business loans, and financing of medical procedures.Borrowers can easily access lower interest rate loans through a fast online interface. Like most other lending companies, lending loans to 'risky' applicants is the largest source of financial loss (called credit loss).
Lending Club Case Study-1
Explore and run machine learning code with Kaggle Notebooks | Using data from Lending Club Case study Kaggle uses cookies from Google to deliver and enhance the quality of its services and to analyze traffic.
PDF Lending_Club_Case_Study_PPT/LendingClub_CaseStudy_PDF.pdf at ...
Lending Club - EDA. Contribute to Pragyan-Choudhury/Lending_Club_Case_Study_PPT development by creating an account on GitHub.