Does Algorithmic Credit Scoring Reduce or Exacerbate Race-Based Discrimination in Lending? Part 1

Algorithmic credit scoring is becoming increasingly common in the consumer lending market, replacing the traditional credit decision-making process that relied on human loan officers. An algorithm, in this context, is a set of mathematical instructions or rules that a computer executes to solve a problem. In consumer lending, algorithms are primarily used to estimate borrowers’ creditworthiness, and that estimate then determines whether credit is approved or denied and at what interest rate.
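To make the last step concrete, here is a minimal sketch of how a creditworthiness estimate could be turned into a decision and a price. The function name, the risk cutoff, and the linear pricing rule are all hypothetical choices made for illustration; actual lenders' rules are not described in this article.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    approved: bool
    annual_rate: Optional[float]  # None when the application is denied


def decide(default_probability: float,
           max_acceptable_risk: float = 0.20,  # hypothetical cutoff
           base_rate: float = 0.05,            # hypothetical rate for a riskless borrower
           risk_premium: float = 0.50) -> Decision:  # hypothetical pricing slope
    """Map a predicted default probability to approve/deny and an interest rate."""
    if not 0.0 <= default_probability <= 1.0:
        raise ValueError("default_probability must be between 0 and 1")
    if default_probability > max_acceptable_risk:
        return Decision(approved=False, annual_rate=None)
    # Riskier (but still acceptable) borrowers pay a higher rate.
    return Decision(approved=True,
                    annual_rate=base_rate + risk_premium * default_probability)


print(decide(0.10))  # approved, priced above the base rate
print(decide(0.50))  # denied, because the risk exceeds the cutoff
```

The point of the sketch is only that the score is an input to two separate outcomes: whether the borrower gets credit at all, and how much it costs.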

Compared to traditional statistical techniques, the machine learning methods underlying algorithmic credit scoring offer greater flexibility and higher forecasting accuracy. The flexibility of machine learning makes it well-suited for big data analysis, which involves processing a large volume and variety of data that arrive at high velocity. As a result, algorithmic credit scoring often leverages big data, including nontraditional data such as consumer behavior, social media activity, and digital footprints (Gillis, 2022).

Identifying and eliminating discrimination in consumer lending has been a challenge since the days when credit decisions were made primarily by humans. The increasing prevalence of algorithmic credit scoring makes it imperative to answer the question: Does algorithmic credit scoring promote or inhibit discrimination based on protected characteristics such as gender and race?

How Algorithmic Credit Scoring May Reduce Discrimination

In the traditional credit decision setting, loan decisions are based on a limited set of variables, such as income, credit scores, and loan-to-value ratios, and involve human discretion. However, loan officers may also directly observe personal characteristics such as gender and age, which could be used as a basis for credit decisions and lead to discriminatory practices.

By contrast, algorithmic decision-making is automated and requires little human involvement. In an algorithmic context, protected characteristics are formally excluded from the credit analysis. Algorithmic decision-making has therefore been viewed as having great potential for eliminating disparate treatment in credit decisions.

How Algorithmic Credit Scoring May Increase Discrimination

There are concerns among researchers that algorithmic credit scoring may cause unintended discrimination. The concerns are driven by the understanding that machine learning algorithms can potentially recover a borrower’s protected characteristics, such as gender and race, from permissible characteristics. For example, Fuster et al. (2021) argue that the flexibility of machine learning allows for a greater ability to triangulate the effect of unobserved protected characteristics on the outcome (i.e., credit score) by “more effectively and accurately combining the observed permissible variables.” This is especially true when non-traditional data, such as consumer behavior, social media data, and digital footprints, are used.
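The intuition behind this concern can be illustrated with a small simulation. The sketch below is not the method of Fuster et al. (2021); it uses synthetic data and a simple averaging rule in place of a machine learning model. It assumes a hypothetical world with 16 permissible variables, each only weakly correlated with a binary protected characteristic, and compares how well the protected characteristic can be guessed from one variable versus from all of them combined.

```python
import random

random.seed(0)

N_BORROWERS = 2000
N_FEATURES = 16   # hypothetical number of permissible variables
SHIFT = 0.5       # hypothetical gap in each variable's mean between the two groups


def simulate_borrower():
    # The protected characteristic is never shown to the "lender" below.
    group = random.randint(0, 1)
    features = [random.gauss(SHIFT * group, 1.0) for _ in range(N_FEATURES)]
    return group, features


borrowers = [simulate_borrower() for _ in range(N_BORROWERS)]


def accuracy(guess):
    """Share of borrowers whose protected group is guessed correctly."""
    hits = sum(guess(features) == group for group, features in borrowers)
    return hits / len(borrowers)


# Guess the group from a single permissible variable,
# using the midpoint between the two group means as the cutoff.
single = accuracy(lambda f: int(f[0] > SHIFT / 2))

# Guess the group from the average of all permissible variables.
combined = accuracy(lambda f: int(sum(f) / len(f) > SHIFT / 2))

print(f"one variable:  {single:.2f}")
print(f"all variables: {combined:.2f}")
```

Under these assumptions, a single variable identifies the group correctly only about 60% of the time in expectation, while the combination of all 16 does so about 84% of the time. Excluding the protected characteristic from the inputs therefore does not prevent a sufficiently flexible model from effectively reconstructing it.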

Background Information

Examples of Machine Learning Techniques Used in Credit Scoring

  1. Logistic Regression: A statistical technique that uses a logistic function to model the probability of a certain outcome, such as the likelihood of a borrower defaulting on a loan.

  2. Decision Trees: A machine learning technique that builds a tree-like model of decisions and their possible consequences. Decision trees can be used to predict the probability of default based on multiple variables, such as income, employment history, and credit score.

  3. Random Forest: A machine learning algorithm that constructs multiple decision trees and combines their outputs to make a final prediction. Random forest can be used to identify important variables in credit scoring and to reduce the risk of overfitting.

  4. Gradient Boosting: A machine learning technique that combines multiple weak models to make a stronger prediction. Gradient boosting can be used in credit scoring to improve accuracy and reduce errors.

  5. Neural Networks: A machine learning technique that models complex relationships between variables using interconnected artificial neurons. They can be used to predict credit risk by analyzing multiple variables.
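As a minimal sketch of the first technique in the list, the code below fits a logistic regression by plain gradient descent using only the Python standard library. The data are synthetic, and the single predictor (a debt-to-income ratio), the "true" coefficients, and the training settings are assumptions made for illustration. In practice a library implementation would normally be used instead.

```python
import math
import random

random.seed(1)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Synthetic borrowers: (debt-to-income ratio, defaulted?).
# The coefficients generating the data are hypothetical.
TRUE_INTERCEPT, TRUE_SLOPE = -3.0, 6.0
data = []
for _ in range(1000):
    dti = random.random()  # uniform between 0 and 1
    p_default = sigmoid(TRUE_INTERCEPT + TRUE_SLOPE * dti)
    data.append((dti, 1 if random.random() < p_default else 0))


def fit_logistic(samples, learning_rate=1.0, epochs=1000):
    """Fit intercept b0 and slope b1 by gradient descent on the mean log-loss."""
    b0 = b1 = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad0 = grad1 = 0.0
        for x, y in samples:
            error = sigmoid(b0 + b1 * x) - y  # predicted probability minus outcome
            grad0 += error
            grad1 += error * x
        b0 -= learning_rate * grad0 / n
        b1 -= learning_rate * grad1 / n
    return b0, b1


b0, b1 = fit_logistic(data)


def predict_default_probability(dti):
    return sigmoid(b0 + b1 * dti)


print(f"estimated intercept {b0:.2f}, slope {b1:.2f}")
print(f"P(default | DTI=0.2) = {predict_default_probability(0.2):.2f}")
print(f"P(default | DTI=0.8) = {predict_default_probability(0.8):.2f}")
```

The fitted model outputs a probability of default for any borrower, which is the quantity a lender would then compare against its approval and pricing rules. The other techniques in the list produce the same kind of output but can capture more complex relationships between the inputs.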



[link to part 2]

References:

Fuster, A., Goldsmith-Pinkham, P., Ramadorai, T., & Walther, A. (2021). Predictably Unequal? The Effects of Machine Learning on Credit Markets. Journal of Finance, 77(1). doi: 10.1111/jofi.13090

Gillis, T. B. (2022). The Input Fallacy. Minnesota Law Review, 106(3), 1175–1263.
