In previous blog posts, we discussed different aspects of our industry and how they may affect your current strategy for a credit risk modelling solution. We covered the main reasons to make the leap toward AI/ML-based modelling in 2019. We also discussed, in depth, the main benefits that organizations currently using traditional scoring can gain from machine learning techniques. Finally, we discussed some of the problems faced by banks that haven’t adopted ML yet.
In this blog post, we get into the technicalities and present more detail on how the EyeOnRisk credit risk modelling platform can help bridge the gap to a more accurate, structured and streamlined modelling process.
What makes an End to End Credit Risk Platform?
There’s a profound difference between vertical platforms and horizontal platforms. These days, many FIs still structure their credit risk solutions in the traditional way, as a collection of different tools. Each tool may be extremely capable in its own vertical; however, for the credit risk problem, that wide range of capabilities may not be relevant or applicable.
This is why it makes more sense to address the credit risk modelling effort with a holistic solution. Such a solution can cover the entire lifespan of a model, from creation to deployment and re-evaluation. On closer examination, such platforms are composed of several built-in, integrated components, which include the following:
- Data source acquisition;
- Preliminary data analysis and filtering;
- Data preparation and imputation;
- Feature generation;
- Modelling experimentation (using machine learning or other methods);
- Model explainability, reporting and documentation;
- Deployment to production with API support; and,
- Monitoring of the models in production.
When all these parts are integrated into the same platform, new models, or improvements to existing ones, can be turned around quickly.
Let’s dive into each one of these parts and see what they look like on the EyeOnRisk platform.
Acquiring Data for Credit Risk Models (Internal, External or Other)
The platform allows UI-based addition of an unlimited number of data sources. This enables the modeller to progress quickly in the modelling project without requiring too much support from other functions in the organization.
The platform’s UI is straightforward, and you’ll be working on your own data panel after only a few clicks. As the data is gathered, an immediate glimpse of the data is provided, along with some statistics that can help determine how relevant the data is.
Using External, Alternative Data in Credit Risk Models
Many organizations are exploring the inclusion of alternative data in their models. While numerous paid data providers are available, integration is often the limiting factor. The EyeOnRisk platform addresses this with easy UI-based support for external data sources. The platform is also extremely adaptable for adding new external data sources and APIs, as required by the customer.
Data acquired from the API is presented to the user as though it was derived from an internal table in the data warehouse.
Dealing with Missing Data (Imputation) and Data Wrangling
Experienced credit risk modelling teams will attest that much of a model’s lift is achieved through comprehensive data preparation and feature engineering. This is also the most time-consuming task when building a credit risk model. The platform facilitates this phase with an easy-to-use interface that lets you quickly create new data transformations and explore their contribution to the model.
Automatic Data Completion (Imputation)
Missing values can be dealt with quickly and easily by imputing them manually or automatically.
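To make the idea concrete, here is a minimal sketch of automatic imputation in plain Python. It is not the platform's implementation: the `impute_median` helper, the `income` column and the sample rows are all hypothetical, and median filling is just one common strategy among several.

```python
from statistics import median


def impute_median(rows, column):
    """Return copies of `rows` with missing values in `column` filled by the median.

    A `<column>_was_missing` flag is added so a model can still learn from the
    fact that the value was absent in the raw data.
    """
    observed = [row[column] for row in rows if row[column] is not None]
    fill_value = median(observed)

    imputed = []
    for row in rows:
        new_row = dict(row)  # copy, so the raw data stays untouched
        new_row[f"{column}_was_missing"] = row[column] is None
        if row[column] is None:
            new_row[column] = fill_value
        imputed.append(new_row)
    return imputed


# Hypothetical applicant records; None marks a missing value.
applicants = [
    {"id": 1, "income": 42000},
    {"id": 2, "income": None},
    {"id": 3, "income": 58000},
    {"id": 4, "income": 50000},
]

completed = impute_median(applicants, "income")
```

Keeping the "was missing" indicator alongside the filled value is a design choice: in credit data, the absence of a value is sometimes informative in its own right.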
Feature Generation
In many cases, a combination of several raw features will yield a more accurate model than the raw features by themselves. Coming up with good ideas for new features requires years of experience and deep domain knowledge. However, trying an idea shouldn’t be difficult: for instance, adding a binning function for the AGE column.
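Outside the UI, a binning transformation for AGE could look roughly like the sketch below. The bin edges and labels are illustrative assumptions, not values taken from the platform.

```python
from bisect import bisect_right

# Hypothetical bin edges: each edge is the lower bound of the next bin.
AGE_EDGES = [25, 35, 50, 65]
AGE_LABELS = ["<25", "25-34", "35-49", "50-64", "65+"]


def bin_age(age):
    """Map a numeric age to a categorical bin label."""
    # bisect_right counts how many edges are <= age, which is the bin index.
    return AGE_LABELS[bisect_right(AGE_EDGES, age)]


ages = [19, 25, 34, 35, 64, 65, 80]
age_bins = [bin_age(a) for a in ages]
```

Binning turns a continuous column into a small number of categories, which can help simpler models such as logistic regression capture non-linear effects of age.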
Automatic Feature Generation
The platform offers state-of-the-art technology that automatically calculates large numbers of candidate features, which can then be integrated into your model. To use it, the user simply launches one of the feature generation search algorithms and lets the system recommend some useful features.
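The platform's search algorithms are not described here, so the following is only a toy illustration of the general idea: generate many candidate features mechanically, score each one against the target, and recommend the best. The ratio-based candidates, the correlation score, and all names and data are assumptions made for the sketch.

```python
from itertools import permutations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / sqrt(var_x * var_y)


def rank_ratio_features(columns, target, top_k=3):
    """Build a ratio feature for every ordered column pair and rank the
    candidates by absolute correlation with the target."""
    scored = []
    for a, b in permutations(columns, 2):
        if any(value == 0 for value in columns[b]):
            continue  # skip candidates that would divide by zero
        values = [x / y for x, y in zip(columns[a], columns[b])]
        scored.append((f"{a}_per_{b}", abs(pearson(values, target))))
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


# Hypothetical raw columns and default flags (1 = defaulted).
raw = {
    "debt": [5000, 20000, 8000, 30000, 12000, 45000],
    "income": [50000, 40000, 80000, 35000, 60000, 50000],
}
defaulted = [0, 1, 0, 1, 0, 1]

recommended = rank_ratio_features(raw, defaulted, top_k=2)
```

A production-grade search would score candidates on held-out data and with a model-based metric rather than a single correlation, since ranking thousands of candidates on the training set invites overfitting.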
Modelling Experimentation (using Machine Learning or other methods)
When the data is sufficiently clean to start modelling, adding a new machine learning experiment is simple. Through the UI, you can control which type of algorithm to use, including logistic regression, random forests, boosting, bagging, SVM, decision trees and more. For each algorithm, you can tweak its parameters or select the input features.
You can run one or many experiments simultaneously, as well as compare results when finished. Each experiment result screen provides all the necessary information in order to comprehensively assess the effectiveness of the experiment. The following screenshot presents the ROC and confusion matrix parts. Additional information (not shown) includes feature importance and selected grid search parameters.
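For readers who want to see what sits behind these result views, the sketch below computes a confusion matrix and the area under the ROC curve for two experiments and picks the better one. The labels, scores and experiment names are made up for illustration; the functions are a plain-Python sketch, not the platform's code.

```python
def confusion_matrix(y_true, scores, threshold=0.5):
    """Count true/false positives and negatives at a given score threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for actual, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1:
            counts["tp" if actual == 1 else "fp"] += 1
        else:
            counts["fn" if actual == 1 else "tn"] += 1
    return counts


def roc_auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive case scores higher than a random negative one (ties count half)."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical hold-out labels (1 = default) and scores from two experiments.
y_true = [0, 0, 1, 1, 0, 1]
experiments = {
    "experiment_a": [0.10, 0.40, 0.35, 0.80, 0.20, 0.70],
    "experiment_b": [0.30, 0.60, 0.50, 0.55, 0.45, 0.40],
}

auc_by_experiment = {name: roc_auc(y_true, s) for name, s in experiments.items()}
best_experiment = max(auc_by_experiment, key=auc_by_experiment.get)
matrix_a = confusion_matrix(y_true, experiments["experiment_a"])
```

AUC is threshold-independent, which makes it convenient for comparing experiments, while the confusion matrix shows the trade-off at the specific cut-off a lender would actually apply.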
Model Deployment with API Support
When work on the model has been completed and approved by all parties, the next step is to make it available to the various loan origination systems in the organization. Since the platform keeps all the setup information required for a Flow (as expected from an end-to-end platform), the actual deployment is a much simpler and quicker process. The deployment wizard allows you to choose the Flow to deploy, with the exact model you wish to use in production. The API endpoint is then generated and exposed in the internal network.
Monitoring Machine Learning Models in Production
When the model runs in production, it’s extremely important to carefully monitor the quality of the input data. Feeding a model inappropriate input data is a common pitfall, and it results in major inaccuracies and degraded model performance. Here, the platform assists by automatically monitoring the statistical properties of the data fed into the model in production. If a significant change is detected relative to the data seen during training, the system raises an alert, allowing early detection of data pipeline problems.
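The exact statistics the platform tracks are not specified here. One widely used measure for this kind of check is the Population Stability Index (PSI), sketched below under that assumption. The bin edges, sample data and the 0.25 alert threshold (a common rule of thumb, not a platform setting) are illustrative.

```python
from bisect import bisect_right
from math import log


def psi(expected, actual, edges, eps=1e-6):
    """Population Stability Index between a training sample (`expected`)
    and a production sample (`actual`), using shared bin edges."""

    def bin_shares(values):
        counts = [0] * (len(edges) + 1)
        for value in values:
            counts[bisect_right(edges, value)] += 1
        # Floor empty bins at eps so the logarithm below stays defined.
        return [max(count / len(values), eps) for count in counts]

    expected_shares = bin_shares(expected)
    actual_shares = bin_shares(actual)
    return sum(
        (a - e) * log(a / e) for e, a in zip(expected_shares, actual_shares)
    )


# Hypothetical AGE values seen in training vs. arriving in production.
AGE_EDGES = [30, 45, 60]
training_ages = [22, 28, 33, 41, 47, 52, 58, 63, 36, 66, 25, 50]
production_ages = [61, 64, 67, 70, 58, 62, 65, 33, 59, 68, 71, 63]

ALERT_THRESHOLD = 0.25  # rule-of-thumb cut-off for a significant shift
drift = psi(training_ages, production_ages, AGE_EDGES)
should_alert = drift > ALERT_THRESHOLD
```

In this made-up example the production sample is heavily skewed toward older applicants, so the index lands well above the threshold. Running the same check per input feature on a schedule is one simple way to catch upstream pipeline changes before they affect decisions.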
Conclusion
We’ve shown some of the basic steps in creating a robust machine learning credit risk model that can be easily deployed in production. The process doesn’t have to be overly complicated when you use a holistic platform such as EyeOnRisk. To become more familiar with the platform, we invite you to contact us and schedule a demo.
