Implementation of an AI/ML Scoring System in FinTech

- Ongoing

USA

Financial Services

Custom Software Development, Artificial Intelligence, Application Testing, IT Staff Augmentation

Description

ABOUT THE CLIENT

Our client is a financial company that specializes in providing financing to small and medium-sized businesses in the US. They operate in the high-risk lending segment with an average decision time of 2 business days. The client approached us to improve the underwriting process by integrating various third-party applications to collect and validate data, and then to implement machine learning (ML) tools.

PRELIMINARY UNDERWRITING PROCESS

The manual underwriting process included many steps that were handled by the company's employees:

Credit Processing (at this stage, the applicant fills out and submits the loan application);
Data Entry (the data entry specialist receives information from various sources, including government agencies, third-party services, and documents, and enters it into the internal document management system);
Data Review (the underwriter carefully reviews and verifies the truthfulness, integrity, and consistency of the information contained in the application);
Analysis (a senior underwriter determines the level of risk of the applicant's business and its creditworthiness);
Approval (a credit analyst approves the loan and the terms of the company's cooperation with the applicant, or refuses the loan if the probability of default or late payments is high);
Credit Administration (after the loan is approved, the application is passed to the credit administrator, who prepares loan agreements and updates the loan dossier).

Challenge

CHALLENGES

Underwriters process numerous digital and physical documents and dozens of online data sources, then enter the collected information into the system. They also carry out a complex and lengthy risk assessment of each potential client, determining whether the client is solvent and whether it is worth entering into a loan agreement with them. After that, a decision is made to accept or reject the application.

Because manual underwriting is a rather time-consuming process, several employees usually work on processing applications simultaneously.

As a result, the following problems arise:

allocation of responsibilities in decision-making, document processing, and data entry from third-party services;
distribution of editing rights, which leads to conflicts and the possible storage of conflicting data;
the need for regular synchronization between employees to understand the scope and the tasks that remain to be completed, to determine the right moment to move to the next stage of the audit, and to identify contradictions, missing information, etc.

Parallel processing reduces audit accuracy because of human error. The probability of mistakes and loan defaults rises, and so does the cost of the entire process, because human labor is expensive. In summary, long processing time, high cost, and poor quality were the main obstacles to success.

Overall, the customer default rate was about 24%, which was normal for this type of company but too high to allow rapid growth. To solve this problem, it was decided to introduce an ML model and ETL (Extract, Transform, Load) automation into the underwriting process to improve the accuracy of credit recommendations and mitigate default risk.

CLIENT'S REQUIREMENTS

The client needed to support and improve the CRM system used in the underwriting process by integrating third-party services and implementing an AI/ML model. The main requirements for the project were:

Collecting accurate and relevant data: The dedicated team had to integrate 7 third-party services and platforms to collect enough information about the applicants' businesses for underwriters to review it and identify risks, and for the ML model to use it afterwards. On average, it takes about 4 months to set up a single integration from scratch. This time includes the entire development cycle, starting with the creation of the technical specification and ending with the release of a stable version after several iterations, fixes, and modifications that account for the specifics of the third-party service. At that pace, seven integrations built one after another would take roughly 28 months. However, we only had about a year to prepare the integration process before moving on to the next steps.

Data conversion and processing: For artificial intelligence to work with the data and determine ratings for loan applicants, the data must be in a uniform format. But because it is extracted from different sources, it arrives in different formats. Services can also sometimes provide incomplete, incorrect, outdated, or duplicate data. The Transform stage solves this problem. The ETL process filters out duplicates, checks the data for completeness and, if necessary, makes additional requests to services, detects conflicts, converts the data to the same format, and saves it to the data warehouse. The information stays in the warehouse until it is requested, and the required amount of data can then be retrieved efficiently.
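To make the Transform stage more concrete, here is a minimal sketch of how records from different services might be normalized, de-duplicated, and checked for completeness. It is written in Python for illustration only (the actual pipeline was built by .NET engineers), and the record fields, date layouts, and function names are assumptions, not the client's real schema.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


# Hypothetical unified record; the real warehouse schema is not described here.
@dataclass(frozen=True)
class BusinessRecord:
    tax_id: str
    registered_on: Optional[date]
    monthly_revenue_usd: Optional[float]
    source: str


# Fields that must be present before a record is considered complete (assumed).
REQUIRED_FIELDS = ("registered_on", "monthly_revenue_usd")


def parse_date(value):
    """Accept a few common date layouts; return None if the value is unusable."""
    if not value:
        return None
    for fmt in ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y"):
        try:
            return datetime.strptime(value, fmt).date()
        except ValueError:
            continue
    return None


def parse_money(value):
    """Convert '$12,500.00', '12500' or 12500 to a float; None if missing."""
    if value is None or value == "":
        return None
    if isinstance(value, (int, float)):
        return float(value)
    try:
        return float(str(value).replace("$", "").replace(",", "").strip())
    except ValueError:
        return None


def transform(raw_rows):
    """Normalize rows, drop records identical after normalization,
    and separate incomplete records from clean ones."""
    seen = set()
    clean, incomplete = [], []
    for row in raw_rows:
        record = BusinessRecord(
            tax_id=str(row.get("tax_id", "")).replace("-", "").strip(),
            registered_on=parse_date(row.get("registered_on")),
            monthly_revenue_usd=parse_money(row.get("monthly_revenue")),
            source=row.get("source", "unknown"),
        )
        # The source is left out of the key so the same facts reported
        # by two services count as one record.
        key = (record.tax_id, record.registered_on, record.monthly_revenue_usd)
        if key in seen:
            continue
        seen.add(key)
        missing = [f for f in REQUIRED_FIELDS if getattr(record, f) is None]
        if missing:
            incomplete.append((record, missing))
        else:
            clean.append(record)
    return clean, incomplete
```

In a setup like this, the `incomplete` list is what would drive the additional requests to services mentioned above, while `clean` is what gets loaded into the warehouse.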

Model training and validation: Developing reliable ML models requires a systematic approach to training and validating algorithms. The main goal was to reduce the default rate by 10 percentage points, bringing the overall rate down from about 24% to around 14%. A dedicated team of engineers also had to ensure that the models were constantly updated and validated to adapt to changing market dynamics.

Solutions

IMPLEMENTED SOLUTION

The development team was tasked with creating an automated system that could efficiently collect, process, and validate data from many third-party services, assess the risks associated with loan applications, and offer informed decisions based on the information received. The system was expected to be accurate, reliable, efficient, and independent. The following solutions were implemented for this purpose:

Data Binding: .NET engineers integrated an ETL pipeline that gives us accurate data about the applicant's business and a holistic view of the client's financial condition, based on integrations with third-party services.

All available information was extracted from various sources. During the process of obtaining it, it was important to address data quality issues, for example, to make sure that all available data is complete.

Initial data are presented in different structures and need to be further processed. The next step is to transform, standardize, and bring it into a common format, eliminating differences in types and units of measurement.

At this stage, quality issues can also be resolved, for example by removing duplicate records and settling conflicts based on source priority. Such information is highlighted for the user.
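Conflict resolution by source priority can be sketched as follows. This is an illustrative Python example, not the project's .NET code. The source names and their ranking are hypothetical, since the case study does not say how the real sources are prioritized.

```python
# Hypothetical priority order: a lower number wins.
SOURCE_PRIORITY = {"tax_service": 0, "bank_feed": 1, "crm": 2}


def resolve_conflicts(values_by_source):
    """Pick one value per field by source priority and flag disagreements.

    values_by_source maps a field name to {source name: value}.
    Returns (resolved, conflicts):
      resolved  maps field -> the value from the highest-priority source,
      conflicts maps field -> every competing value, so it can be
                highlighted for the user.
    """
    resolved, conflicts = {}, {}
    for field, candidates in values_by_source.items():
        # Ignore sources that returned nothing for this field.
        present = {s: v for s, v in candidates.items() if v is not None}
        if not present:
            resolved[field] = None
            continue
        # Unknown sources rank below every known one.
        best_source = min(
            present, key=lambda s: SOURCE_PRIORITY.get(s, len(SOURCE_PRIORITY))
        )
        resolved[field] = present[best_source]
        if len(set(present.values())) > 1:
            conflicts[field] = present
    return resolved, conflicts
```

The winning value is stored, and the `conflicts` mapping keeps every competing value so that the disagreement stays visible to the underwriter.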

Using our tools, we stored the processed information in a single source and a unified format, which made it possible to implement the ML model for accurate assessment of cooperation risks.

Assessment of default risks: Data Science engineers developed three independent ML scoring systems to assess credit risk. The ML models use historical data and 60 business parameters, such as the date of registration, business category, percentage of cash flow, Google Review rating, FICO Score of the owner, and others. The models categorize applications into risk levels (low, moderate-low, moderate, moderate-high, high) using this data together with the available offer variations, such as duration, funding amount, the percentage of turnover that the loan represents, and more. This allowed our client to make informed decisions based on predefined risk thresholds.
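The step from a model output to one of the five risk levels can be illustrated with a short Python sketch. It assumes the model produces a default probability between 0 and 1. The cut-off values and the `decide` helper are hypothetical, because the client's actual thresholds are not published here.

```python
from bisect import bisect_right

RISK_LEVELS = ["low", "moderate-low", "moderate", "moderate-high", "high"]

# Hypothetical cut-offs on the predicted default probability.
# Four boundaries split the 0..1 range into the five levels above.
THRESHOLDS = [0.05, 0.10, 0.20, 0.35]


def risk_level(default_probability):
    """Map a predicted default probability to one of the five risk levels."""
    if not 0.0 <= default_probability <= 1.0:
        raise ValueError("default_probability must be between 0 and 1")
    return RISK_LEVELS[bisect_right(THRESHOLDS, default_probability)]


def decide(default_probability, max_acceptable="moderate"):
    """Return the risk level and whether it is within the accepted limit."""
    level = risk_level(default_probability)
    approved = RISK_LEVELS.index(level) <= RISK_LEVELS.index(max_acceptable)
    return level, approved
```

With these assumed cut-offs, `decide(0.12)` returns `("moderate", True)`, while `decide(0.25)` returns `("moderate-high", False)` unless `max_acceptable` is raised to `"moderate-high"`.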

Third-party integrations used to obtain data:

Yodlee
DataMerch
Tax Guard
S&P Capital IQ
FICO® Platform
FactSet Data Solutions
Moody's Orbis

Other integrations in the system:

HubSpot
Salesforce
Designed to meet the individual needs of each of the two North American banks

Continuous model training: The models were retrained every three months on an increasingly large amount of data. The most recent training run covered 300,000 processed applications, 6,000 of which were approved. This periodic retraining keeps the models relevant and adapted to market dynamics. As a result, the company can approve riskier applications when the country is economically stable or when there is a large amount of free capital in the economy. For example, our client accepted riskier deals in the second half of 2022 because of the state of financial well-being in the US. In the event of economic deterioration, for example during a recession, the company will approve only reliable applications. The risk level of our client's transactions decreased gradually and steadily throughout 2023. Thus, the models take into account the company's strategy, that is, its readiness or unreadiness for risky transactions, depending on the state of the economy and the internal state of the company.
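One common way to validate a model that is retrained every quarter is an out-of-time split: train on everything decided before a cutoff date and validate on the following three months. The Python sketch below shows the idea. The case study does not state which validation scheme the team actually used, so this is an assumption, and the `decided_on` field name is hypothetical.

```python
from datetime import date


def add_months(d, months):
    """Shift a date by whole months, clamping the day to 28 so it stays valid."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, min(d.day, 28))


def out_of_time_split(applications, cutoff, validation_months=3):
    """Split applications into a training set (decided before the cutoff)
    and a validation set (decided in the validation_months after it)."""
    end = add_months(cutoff, validation_months)
    train = [a for a in applications if a["decided_on"] < cutoff]
    validate = [a for a in applications if cutoff <= a["decided_on"] < end]
    return train, validate
```

Moving the cutoff forward by three months at each retraining would make the training set grow over time, in line with the increasingly large amounts of data described above.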

Impact

ADVANTAGES

The developed ETL solution extracts public records, banking, financial, tax, and other data from open sources through data binding with third-party platforms. It then processes the information and saves it to storage in a uniform format. This solved the problem of the length and resource intensity of the underwriting process.

Based on complete, structured, and classified information, artificial intelligence can quickly identify the risks of granting a loan. Machine learning makes it possible to identify an insolvent client with better accuracy. ML is excellent at identifying patterns in large data sets, thus solving the problem of accuracy and reliability of the proposed loan recommendations.

The accuracy of the results has also increased due to the minimization of the human factor in the process.

The underwriting process has been shortened, and with it the number of low-skilled employees has been reduced. For example, the “Data Entry” stage employed about 50 specialists; after the introduction of the automated system, the stage was eliminated entirely. At the Data Review stage, 21 employees used to work on applications; for the same number of applications, that figure is now 7.

CONCLUSION

The ETL platform was used to integrate 7 third-party services in just 9 months. This allowed our client to reduce costs by 55.5% compared to the classical method of integrating services. The introduction of artificial intelligence and machine learning technologies in credit analysis has improved the determination of the optimal default risk rating based on a wide range of data. Our client received a reliable and fast underwriting system that can make informed decisions based on comprehensive data analysis. Thanks to this solution, the total number of applications processed per month increased by 144% over two years, operating expenses per application decreased by 47%, and the credit default rate decreased significantly. According to McKinsey & Co statistics from May 2023, the combination of artificial intelligence and other automation technologies can add from 0.5% to 3.4% to productivity growth annually. By implementing these technologies in our customer's business, we managed to reduce the client default rate from 24% to 14%, which is within the normal range for the North American financial industry.
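Some of the figures in this case study are easier to compare once they are expressed the same way. The short Python calculation below uses only numbers quoted in the text. The variable names are ours, and the derived percentages are simple arithmetic rather than additional reported results.

```python
def pct_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100


# Share of applications approved in the most recent training set.
approval_rate = 6_000 / 300_000 * 100

# Data Review headcount for the same number of applications: 21 -> 7.
data_review_change = pct_change(21, 7)

# Default rate: 24% -> 14%.
default_change_pp = 14 - 24             # absolute change, percentage points
default_change_rel = pct_change(24, 14)  # relative change, percent

print(f"Approval rate in last training set: {approval_rate:.1f}%")
print(f"Data Review headcount change: {data_review_change:.1f}%")
print(f"Default rate change: {default_change_pp} percentage points")
print(f"Default rate change, relative: {default_change_rel:.1f}%")
```

In other words, the drop from 24% to 14% is 10 percentage points, which corresponds to a relative reduction of about 41.7%.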

If you are interested in this case study, check out other case studies about the implementation of AI/ML technologies by the NetLS team on our website. There, we talk about the experience of the company and engineers on our clients' projects.
