Just Economy Conference – May 7, 2021
Artificial intelligence and machine learning (AI) have the capability to disrupt many aspects of retail banking. Using models that can see patterns and make conclusions from thousands of points of data, these tools can improve the predictive power of underwriting systems. However, many of the potential benefits of AI remain untested. Does AI support financial inclusion without perpetuating the same divisive and discriminatory practices that have occurred in the past? How should regulation be updated to make sure that these models are based on accurate records, make sensible decisions, and treat everyone fairly? Join this panel to hear an expansive conversation between a fair lending attorney, the head of a research organization studying the explainability and fairness of ML algorithms in credit underwriting, and the CEO of a consulting firm that provides AI-driven services to lenders.
- Melissa Koide, CEO and Founder, FinRegLab
- Kareem Saleh, CEO and Founder, FairPlay AI
- Eric Sublett, Counsel at the Civil Rights Firm Relman Colfax, Washington DC
NCRC video transcripts are produced by a third-party transcription service and may contain errors. They are lightly edited for style and clarity.
Good afternoon, everyone. Thank you for joining us for what I think is going to be a really intellectually interesting and important conversation about what future directions for the use of AI, machine learning and alternative data may mean for financial inclusion. I’m happy to be here, and thank you, NCRC, for inviting me to join this conversation today. I’m Melissa Koide. I’m the CEO and founder of FinRegLab. FinRegLab is a DC-based nonprofit research organization where we are empirically testing the uses of technology and data in financial services for the purpose of driving a more inclusive financial system. For a little bit of context, I stood up FinRegLab after leaving the US Treasury Department, where for at least four and a half years I was really inundated with questions around how public policy evolves to enable the safe use of data and technology in our financial system, especially for the benefit of individuals and families, communities and small businesses. And in many ways, in seeking to answer those questions, what we really lacked was an entity that could help to generate fact-based information that would inform public policy and market practices, so that we really are taking advantage of where data and technology may help to bring more people into the financial system, and ensure that more people, small businesses and communities are better served. And so we’ve got a really interesting conversation today. I’d love to invite the speakers who will be joining me to come on up. And we’re going to start by opening up for quick introductions, in terms of who you are and what you’re doing in the context of AI and machine learning. Financial inclusion is a really broad topic. Many of us, excuse me, I just swallowed a bunch of popcorn. Many of us are really thinking about the implications of AI and machine learning, especially in the context of credit.
Credit is not just a necessity for bridging short-term gaps. Credit is also a really critical stepping stone for longer-term investments and wealth-building opportunities, whether that’s the ability to acquire a home or start a small business, and the list goes on. And so you’re going to hear a lot about what we have been doing individually, but also I think collectively, in fact, a couple of us were just on a panel together, really offering our insights about the evolution of AI and machine learning adoption, and in particular what that means in the credit context. Without further ado, I’m going to turn it over to Kareem. If you want to kick us off and let us know what you’re doing in this area, that would be fantastic.
Yeah. Thanks, NCRC, for having me. My name is Kareem Saleh, and I am the founder and CEO of FairPlay AI. We help lenders increase the fairness of their marketing, underwriting, pricing and collections models using advanced AI de-biasing techniques. I’ve spent my whole career working on financial inclusion issues, in particular how to underwrite inherently hard-to-score borrowers. I did that work at the German Marshall Fund, at an unfortunately named mobile wallet startup called ISIS, for several years at the State Department and the Overseas Private Investment Corporation, as well as at some other Silicon Valley-backed venture startups.
Fantastic. I actually remember that ISIS, not the one we all know. I didn’t know you were a part of it.
Mercifully rebranded Softcard. Yeah, yeah. Excellent.
Yeah. Excellent. Great, Eric. Join us, please.
Sure, I’m Eric Sublett. I’m counsel at the civil rights firm Relman Colfax in Washington, DC. Most of my practice involves advising financial institutions on best practices for complying with civil rights laws, primarily the Equal Credit Opportunity Act. And a lot of that, especially now, means evaluating discrimination risk in connection with statistical or more modern machine learning models, in the traditional lending space, but our work also increasingly involves working with financial technology companies and even online platforms to address those issues.
Oops, there I am. So our discussion is titled AI and machine learning. But as we were prepping for this session, one of the things that we talked about is, let’s really start to let some of the air out of the balloon and begin to get a little clearer about what we mean. And we’ll use credit underwriting as sort of a helpful context here. What do we mean when we reference AI and machine learning as we think about credit underwriting? Kareem, do you want to share a little bit about how you think about that?
Sure. AI is just math. Of course, some kinds of math are more complicated and less intuitive than others. Those of you who maybe took an elementary statistics class will recall the formula y equals mx plus b; logistic regression is essentially that relatively simple linear score passed through a sigmoid to produce a probability. Of course, today we have gradient boosted trees and neural networks and support vector machines. At the end of the day, these are just much more computationally intensive forms of math, forms of math that a human brain would probably struggle to understand and compute. But at the end of the day, they’re just seeking to take data as an input, identify patterns in that data, and try to render a prediction about the question that’s being asked, such as: Will this borrower pay back a loan? What is the appropriate price, the price commensurate with the risk posed by this borrower, that I should charge? I’ll pause there and see if Eric has thoughts or comments or anything to add.
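As an illustrative aside, the math Kareem describes can be sketched in a few lines of Python. The feature names, weights and bias below are made-up values for illustration, not any lender’s actual model:

```python
import math

def logistic_score(features, weights, bias):
    """Score an applicant with a simple logistic model: a weighted
    sum of inputs (the 'mx + b' part) passed through a sigmoid,
    yielding a repayment probability between 0 and 1."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical applicant with two normalized inputs,
# e.g. payment history and debt-to-income (illustrative only)
prob = logistic_score([0.8, 0.3], weights=[2.0, -1.5], bias=-0.2)
# prob is a probability strictly between 0 and 1
```

Gradient boosted trees and neural networks replace that single weighted sum with far more flexible functions, but the input-to-prediction structure is the same.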
No, I think you covered it pretty well. I mean, we’ve seen artificial intelligence and machine learning applied to most if not all aspects of lending at this point. For a while you started to see it crop up initially, maybe in the fraud space or, you know, in the marketing space, but increasingly we’re seeing it used for underwriting and other purposes as well.
Sorry about that. I seem to keep muting myself. There’s a lot of enthusiasm, and I’ll share that Kareem and I were just on a House Financial Services AI Task Force hearing panel, where it was really notable how much enthusiasm there is in terms of what machine learning and AI, especially in credit underwriting, which is where we were focused, may offer in terms of financial inclusion. And, you know, we like to get into the weeds at FinRegLab, and really make sure that we are going to be able to understand how these more complex, to your point, Kareem, math models work. But before we start to dig into some of that, because I think that gets at many important issues around fairness questions that we want to talk about: why is there such enthusiasm for how AI and machine learning may help to drive more access and financial inclusion? Eric, do you want to kick us off first?
Sure. So, you know, you mentioned there was a lot of enthusiasm on the panel, and I personally tend to be kind of an AI optimist in the lending space as well. That’s not to say there’s no reason for concern, but I think on balance, for me the promise outweighs the concerns. Certainly there are particular challenges that come with the use of AI and machine learning techniques from a legal standpoint, from a non-discrimination standpoint. But at the same time, those very same methods, I think, offer opportunities to mitigate those risks and maybe to be more effective in reaching non-discriminatory approaches than we had available to us in the past, whether that means effective ways of evaluating whether a model contains proxies for a prohibited basis, or sophisticated ways to look for less discriminatory alternatives to a model that involves disparate impact risk. So while there are certainly ways for it to go awry, I think on balance there’s reason for optimism in terms of non-discrimination and greater financial inclusion.
Yeah. Kareem, before I turn it over to you, I’ll just take the moderator’s prerogative and share a few thoughts on this too. You know, we tend to look a fair bit, and Kareem, I know you do, and I’m sure, Eric, you do as well, at what some of the academic literature suggests is happening and what the potential is in terms of the use of machine learning algorithms, these more complex math approaches to finding patterns, in this case for credit risk assessment. And the literature is somewhat mixed on this. But one of the things that really does come through is that so much of the potential power of machine learning and AI in credit underwriting for inclusion purposes really comes back to what data you’re feeding the math to begin with. FinRegLab has done some research isolating questions around basic transaction data, inflows and outflows from a bank account, or it could even be a prepaid card. Because we have higher incidence of coverage with bank accounts and prepaid cards than we do with traditional credit experiences, and we also have higher certainty of coverage with underserved communities and populations, and minority populations. And so, from an inclusionary standpoint, if that type of data is predictive, it actually might be quite useful from a financial inclusion outcome perspective. And the research we did showed the cash flow data, and this is not where did you shop or what time did you shop, it is literally the transaction information: Do you have a routine cushion in your account on a month-to-month basis? Does it appear that you are paying your bills on a routine basis, month to month? That kind of data actually was predictive. We evaluated loan-level data of five fintech lenders who were using this information.
And so, bringing it back to the academic literature that’s probing these questions of how inclusive machine learning might be, it does seem that one of the most important aspects of it is what type of data we are actually bringing into that math, in order to credit risk assess populations who otherwise would be excluded. And I know, Kareem, you’ve spent a fair bit of time over your life thinking about different data and why that might be especially useful for inclusion purposes. I’d love to hear you share with the audience what you’ve seen and how you think about the importance of the data.
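To make the cash-flow idea concrete, here is a minimal, hypothetical sketch of the kind of signals Melissa describes, a routine cushion and on-time bill payment, derived from account data. The function and field names are assumptions for illustration, not FinRegLab’s actual methodology:

```python
def cash_flow_features(monthly_balances, bills_on_time, bills_total):
    """Derive two illustrative cash-flow underwriting signals:
    whether the account kept a positive cushion every month,
    and the share of bills paid on time."""
    cushion = min(monthly_balances) > 0
    on_time_rate = bills_on_time / bills_total if bills_total else 0.0
    return {"always_positive_cushion": cushion,
            "on_time_rate": on_time_rate}

# Hypothetical account: three month-end balances, 11 of 12 bills on time
feats = cash_flow_features([120.0, 85.5, 40.0],
                           bills_on_time=11, bills_total=12)
```

Features like these can then be fed into whatever model, simple or complex, a lender uses.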
Yeah, thanks, Melissa. I mean, I think you raised two important points. One is the volume of the data, and the second is the kind of data. And so I think part of the reason there’s tremendous optimism about machine learning is that, you know, the traditional methods of doing underwriting, like logistic regression, can consume basically 20 to 50 variables. And if those 20 to 50 variables are present and correct, which they tend to be for super-prime borrowers, let’s say, then logistic regression does a great job of assessing your creditworthiness. The problem, of course, as you well know, is that something like 50 to 60 million Americans have credit data that’s messy, missing or wrong. And logistic regression fails in the presence of data that’s messy, missing or wrong; it gives wildly inaccurate predictions, and usually disadvantages people because of its inability to kind of parse the nuances in their economic situation. And so I think part of the great promise of machine learning is that if you can move from a world of 20 to 50 variables to 500 or 1,000, it allows you to paint a much richer portrait of a borrower’s ability and willingness to repay a loan. And we found that that’s especially true for folks at the bottom of the economic pyramid, folks who perhaps never had access to credit before: thin files, no files, recently arrived immigrants, folks who maybe had some kind of credit event in their past, like a bankruptcy or foreclosure. So I think that’s the great promise: the ability to consume more data. Now, the other question, of course, is, well, what kind of data, and is it really appropriate for assessing a person’s creditworthiness, and where do we draw the lines? Fundamentally, and I’ve been doing underwriting for a long time, in my experience, when you underwrite a loan, you’re basically trying to make an assessment of whether a person is responsible or not.
And it turns out that you can find indicia of a person’s responsibility in all kinds of places, if you care to look, and I think the FinRegLab research on cash flow data proves this, right? I mean, if you have the foresight to maintain a consistently positive bank balance, that seems to be a proxy for your level of kind of economic responsibility, or maybe fiscal responsibility. You can get further afield, and we have found, for example, that the data from customer relationship management systems can be very predictive. So when you call into a call center, why do you call? Do you call to tell the lender that you’re going to be late with a payment? If so, do you call before you’ve missed the payment or after you’ve missed the payment? It turns out that there are indicia of responsibility in that behavior too. But of course, now you’re into this behavioral realm, where people start to get increasingly uncomfortable, I think, about, you know, is this Big Brother, or robots watching my behavior to make judgments, or perhaps misjudgments, about my intentionality with respect to repaying a loan? So again, you’ve got the volume question and then the type question. And the type question, I think, is where things start to get murky or uncomfortable, when you start to get out of the domain of things that are very clearly related to creditworthiness.
Indeed, that’s really helpful. And Eric, I’d love to hear your reactions on this matter as it relates to data. I know we’re going to talk about fairness, and I’d like to probe even more about the ability to have enough confidence in how these more complex models are working. But just before we dive into those, it would be great to hear, Eric, how you think about, if you will, alternative data and alternative financial data and its relevance in the financial inclusion context.
Sure. I thought Kareem made some great points; I just had a couple of reactions. One of them is that whether a particular data element, a particular variable, might be appropriate or not, I think, can depend on the particular context or type of decision you’re talking about. So one thing that comes to mind is sort of more granular, more specific data on shopping behavior. If you’re making a general credit underwriting decision for, say, a general-purpose credit card, looking at the types of stores where someone is inclined to shop, I don’t think that would necessarily be off the table. But I think you’re potentially inviting some significant risk that the type of store you’re considering might be a proxy for a prohibited basis, say, for example, if the variable measures whether someone shops at a store that primarily sells women’s clothing. On the other hand, suppose it’s a model, and maybe it’s not even an underwriting model but a marketing model, and it relates to a credit card specific to that store, sort of like a private-label credit card, for instance, or maybe a co-branded relationship with that retailer. If you’re talking about a card that’s specifically linked to shopping at that store, that might be a very different situation in terms of the appropriateness of using that type of variable. It sort of complicates the question, but I think it’s important to keep in mind that what might be a proxy in one setting may well be an appropriate variable in another setting. And then the second consideration is that, in the first instance, when you’re thinking of alternative data, it’s important to keep in mind the question of disparate treatment, and to sort of have in mind a question like: Is there an intuitive reason why this is predictive? Not necessarily because the intuitive reason something is predictive is sort of the ultimate answer to the question of whether the variable is permissible, but from a compliance management standpoint, it’s an important question to keep in mind, because if a regulator is evaluating the model, it’s the kind of question I would imagine they would ask themselves. But even where, you know, there’s reason to be comfortable with a variable from that perspective, I think the disparate impact step of evaluating models is still an important one. And maybe we’ll be touching on that separately, so I won’t say anything further at this point.
Yeah. I mean, it is interesting, putting aside, I think, very important questions around proxies and how we evaluate and assess, and frankly even some ideas that are being contemplated, and Kareem, I may put you on the spot about this, but keeping in mind the potential with proxy data: if you look globally, you will see that there are geographies where they don’t have a credit bureau system. And there have been at least academic studies that suggest non-financial data, alternative data, may be really instructive for bringing people into access to credit. I’m not arguing for or against that. But I do think it’s an important question in terms of, if there are places where there is a lack of credit information, how do we think about what type of data may be useful for getting people that first toehold into credit opportunities? I feel like that’s something we’re really having to grapple with. Any thoughts about that, putting aside, and I know it’ll be hard, Eric, some of the legal considerations, if you’re only thinking about it from a legal standpoint?
I’m sorry, I thought Kareem was going to speak first. I’ll do my best to set aside the legal standpoint. You know, particularly where there aren’t other options, I do think that changes the analysis a bit in terms of what variables might be appropriate or acceptable. If you lack other options, I think the case is stronger for the range of variables that may have a more, you know, on-their-face attenuated relationship to credit, because if the goal ultimately is avoiding discrimination and promoting financial inclusion, you want to keep those options on the table, even while being mindful of the potential risks. Kareem, I don’t know if you had any thoughts about that?
My internet cut out for a second there. I agree. Look, I mean, the difficulty of underwriting new populations that are not well represented in the historical data is a real difficulty. And I think that you’ve got to kind of start trying to find sources that are as related to creditworthiness as you possibly can, and go from there. I mean, one of the great benefits of machine learning models, as I’ve said, is that they are resilient to data that’s messy, missing or wrong, and they can often do a good job of finding what are called look-alikes. So, you know, I may not have a lot of information about Melissa’s creditworthiness, but I might know that Melissa, along some dimensions, resembles her colleagues, and so maybe their credit performance might be informative. As I get to know Melissa better, maybe I can start you off with, you know, a small loan before we graduate to a bigger one, so I can gather some data about your credit performance. In emerging markets, of course, there’s been a little bit of a Wild West, places like China, for example, where there was no traditional credit bureau, and you started to see the use of social media data and other types of data, which I think both led to tremendous overextension of credit and also questions about kind of the creepy factor. And I think it’s interesting that in 2016, 2017, 2018, the rules in China were kind of anything goes, and now the Chinese government has even rolled a lot of that back, because they’ve started to see some of these questions around the difficulty of these alternative data sources, which are really far afield from creditworthiness, and which I think led to both an overextension of credit and also perhaps some other societal problems.
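The look-alike idea Kareem mentions can be sketched as a simple nearest-neighbor search. This is a toy illustration with made-up records, not FairPlay’s actual technique:

```python
def nearest_lookalike(applicant_features, scored_borrowers):
    """Find the scored borrower most similar to a thin-file applicant
    (Euclidean distance over shared, normalized features), so that
    borrower's observed repayment can inform the new decision."""
    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(scored_borrowers,
               key=lambda rec: distance(applicant_features, rec["features"]))

# Two hypothetical borrowers with known outcomes
known = [
    {"name": "colleague_1", "features": [0.9, 0.1], "repaid": True},
    {"name": "colleague_2", "features": [0.2, 0.8], "repaid": False},
]
match = nearest_lookalike([0.85, 0.2], known)  # closest to colleague_1
```

Real systems use many more dimensions and far more sophisticated similarity measures, but the intuition is the same.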
That does raise interesting, and I think important, questions that get us back to AI and machine learning, which is really understanding the data that you’re plugging into your algorithms and how reliable it is. I haven’t studied that, but, you know, the overextension of credit, what was that due to? Were the data being fed into it insufficient for really, truly credit risk assessing? When we think about alternative data, and our focus has been on alternative financial data, there are questions, societally, about how we as individuals, and how we in this country, decide what data fields are appropriate to be used, even if they may be predictive, right? That’s something we have to grapple with. At the same time, we do have, you know, laws with respect to the use of data in credit underwriting, yet we’ve also seen that new types of data, such as cash flow data, are useful in terms of credit risk assessment. But we are in a place now of, well, how are we then making sure that consumers are sufficiently protected if we are finding data that we are societally comfortable with being used in underwriting, but it hasn’t historically been captured under something like the Fair Credit Reporting Act? So there are those kinds of matters that I think, you know, industry, advocates and others are really having to grapple with. At the same time, I think really contemplating meaningful consumer consent, when this new type of data is being brought in, is another piece of how we solve this puzzle: really taking advantage of the data for the benefit of potential borrowers, but making sure that they are well informed about how that data is being used, and how that data ultimately is protected downstream. I don’t know if either of you have perspectives on this data side of it, but I’d love to hear them. Kareem?
I defer to Eric, actually.
Sure. Well, I don’t think it covers the full sweep of what you’re talking about in terms of consent to the use of various data sources, but I do think this is one reason that explainability or interpretability of models is such a key issue. You know, certainly there are added challenges in the machine learning space with being able to explain the basis for decisions, but I think precisely for the reasons you’re saying, making sure people are aware of how decisions about them are being reached, and what some of the key decisive factors are, is all the more reason that the importance of explainability or interpretability needs to be kept at the forefront.
I don’t know why I can’t get myself off mute. Let’s now turn to the topic of AI and machine learning. Can either of you share what you’re seeing in terms of what the market, what the financial sector, is either using right now in terms of machine learning and credit underwriting, or contemplating using? And then we can dig into some of the questions around fairness and adverse action notices and other expectations in terms of communicating to consumers and regulators how those algorithms are working.
Yeah, I’ll start, and then I’d love to hear Eric’s views and yours too, Melissa. I mean, I would say that we are still kind of in the very early innings of machine learning’s takeover of credit underwriting. For the most part, I think lenders are still using traditional logistic regression-based systems. Sometimes what you’ll find is lenders will use a machine learning algorithm to identify variables that are predictive, and then plug those variables back into a logistic regression to try to enhance its performance. You can generate small performance gains that way, but nothing on the order of using a full-blown machine learning technique. But of course, by doing that, you in theory get the benefits of the inherent interpretability of logistic regression. To the extent that lenders are using machine learning techniques, you have a wide spectrum. I would say most people are using a simple, out-of-the-box, what is called an XGBoost. This is a kind of decision tree; decision trees are very commonly used in financial services. They are highly predictive, and they’re kind of intuitive to humans, because you sort of follow the forks in the tree and try to get a sense of why that gradient boosted decision tree did what it did. Of course, the frontier of using AI in credit underwriting is not the use of any one single technique, but the use of multiple techniques in what are sometimes referred to as ensembles. And in that case, you may use a gradient boosted technique, but also some of these other more advanced techniques that folks have surely heard about, like neural networks, which attempt to replicate the computational systems and power of the brain, or support vector machines or kernel methods. And the most advanced lenders are using 5, 10, 15 different techniques, sometimes all ensembled together, to enhance the predictive power of their credit underwriting models.
There are very few people who are able to do that, in part because explaining those complex ensembles really is both kind of an art and a science, and there are very few people who are capable of doing it. So I would say we’re still kind of in the early innings; it’s still pretty basic, elementary stuff. But as lenders come up the curve and squeeze all of the juice out of those low-hanging-fruit techniques, I think you will see over time, you know, a tendency to use some of these more complex ensembles. And of course, there are a number of questions that arise from the use of those ensembles that I’m sure we’ll talk about in greater detail.
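A minimal sketch of the ensemble idea Kareem describes: blending the scores of several models into one prediction. The two stand-in scorers and their weights below are invented for illustration:

```python
def ensemble_score(applicant, models, weights):
    """Combine several model scores into a single weighted-average
    prediction, a simplified form of model ensembling."""
    total = sum(w * model(applicant) for model, w in zip(models, weights))
    return total / sum(weights)

# Toy scorers standing in for, say, a tree model and a linear model
def tree_like(applicant):
    return 0.7

def linear_like(applicant):
    return 0.5

score = ensemble_score({}, [tree_like, linear_like], [2.0, 1.0])
# weighted average of 0.7 and 0.5, leaning toward the tree model
```

Production ensembles stack or blend many heterogeneous models, which is exactly what makes explaining them both an art and a science.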
I think what I’ve seen sounds pretty similar to what Kareem has described, and I share the view that this seems like relatively early going in the adoption of machine learning techniques. For a few years now, I’ve seen relatively widespread use of machine learning for marketing models and for fraud-related models, and I agree that those tend to be more tree-based machines of one type or another, whether that’s a gradient boosting technique or a random forest or something along those lines. I have seen relatively little in terms of the use of something like an artificial neural network or convolutional neural network, especially on the underwriting side; that’s still pretty unusual in my experience. But particularly as the explainability techniques become more sophisticated, and I think that tends to be one of the bigger hurdles there, as those further develop, I suspect we’ll see more of that.
So those algorithm types obviously range from more explainable to more complex. I would largely join both of you in what you’re seeing; we are observing the same. And I don’t think it’s surprising that there are definitely some of the larger banking depositories who, you know, will argue: we should not be putting a black box on top of the black box; we should be building explainable models from the start, and that’s how we go about doing it. From a research standpoint, with ambitions around fairness, which I promise we’re going to get to, but also inclusion, we’re asking the question, and actually this question came up in the hearing that we were just a part of: what are you giving up in terms of accuracy? What’s that trade-off in terms of inclusion? But really importantly, with the more complex models, how explainable are they, and what confidence can you have in what the explainability techniques offer? So that’s something that we are looking at. But let’s talk about fairness. Eric, do you want to kick us off, sharing where we’re starting in terms of legal definitions of fairness expectations? And then, Kareem, it would be great to hear from you, because I know you’ve been following academics, computer scientists, consumer advocates really beginning to contemplate new ways that maybe we should be thinking about defining fairness, and then contemplating how we go about achieving it, and how we even articulate metrics in terms of evaluating for fairness. Eric, do you want to start?
Sure. So the fairness conversation, I think, is a really rich and interesting one. And in some ways it has felt like it’s almost going along in parallel tracks in the computer science community and the legal community, though I think increasingly there’s cross-pollination of those conversations. I remember I went to a conference a few years ago and saw a really good presentation that was titled something like "21 Definitions of Fairness." I was like, oh wow, that’s a lot of definitions of fairness. That said, I think in the current legal framework, the primary ones, certainly not the only ones, but the primary ones, and the ones we advise our clients to keep sort of first and foremost in mind, are still the disparate treatment and disparate impact frameworks. So avoiding using prohibited bases, and also close proxies for those prohibited bases, and also ensuring models are evaluated for disparate impact, and where there is disparate impact, checking for less discriminatory alternatives. That’s not to say there’s no relevance to other notions of fairness, things like, you know, differential validity, differential predictiveness. It’s more that I think those concepts can either fit within maybe step two of the disparate impact framework, or potentially even as part of the treatment framework. But I think for the time being, and some of this just has to do with where the authority is and where there’s precedent for how to think about these things most clearly in terms of case law and regulatory materials, our focus at this point is first and foremost on treatment and impact questions.
Yeah, where do we start? So, definitions of fairness. They are hard, right? I mean, let's put lending aside for a moment. Let's talk about my favorite subject, which is pie. One definition of fairness is everybody gets the same size piece of pie. Another definition of fairness might be, well, I worked harder, so I should get more pie. And everybody, depending on where you sit…
My dinner table conversation with my three sons, by the way.
And so part of the difficulty we have in consumer financial services is that everybody has a definition of fairness that they prefer. I mean, the regulators have been very clear, and consumer advocates, and I think it's the right definition, have been focused on demographic parity: what fraction of protected groups am I approving, for example, relative to control groups like white males? But some lenders prefer a definition of fairness that is more like equal odds: look, I approve everybody who's got a credit score above 700, regardless of whether they're white, Black, green, yellow, what have you. Of course, the problem is that, depending on where you set that credit score cutoff, you may be leaving a lot of historically disadvantaged groups out of your lending program, because they haven't had the same opportunities to earn a score of 700, 750, whatever it is. So I think this is one of the areas where it would be good for the regulators to update their guidance. I don't think there's been updated guidance from the Bureau, if I'm not mistaken, since 2015 or 2016, around how to calculate some of these definitions of fairness, whether it's odds ratios or marginal effects. I think demographic parity is well understood, but as an industry we probably need some more guidance on which definitions of fairness ought to be tested, and what the fairness thresholds are. In the employment domain, for example, it's very clearly stated that if you're looking at demographic parity, an employer is safe if they hire minority-group applicants at 80% of the rate at which they hire the control group. There is no similar fairness threshold in financial services, and that's probably an area where it would be useful to have some updated guidance.
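A minimal sketch of the two numbers Kareem is contrasting, a group's approval rate (demographic parity) and the employment-domain 80% "four-fifths" threshold. All approval counts below are invented for illustration:

```python
# Demographic parity compares approval rates across groups; the
# four-fifths rule flags a ratio below 0.8 as potential disparate impact.

def approval_rate(decisions):
    """Fraction of applicants approved (1 = approve, 0 = decline)."""
    return sum(decisions) / len(decisions)

def adverse_impact_ratio(protected_decisions, control_decisions):
    """Protected-group approval rate divided by control-group approval rate."""
    return approval_rate(protected_decisions) / approval_rate(control_decisions)

# Hypothetical portfolios: 50 protected-group and 50 control-group decisions.
protected = [1] * 30 + [0] * 20   # 60% approved
control   = [1] * 40 + [0] * 10   # 80% approved

ratio = adverse_impact_ratio(protected, control)
print(round(ratio, 2))   # 0.75, below the 0.8 threshold
print(ratio >= 0.8)      # False: would fail the four-fifths test
```

The 0.8 threshold is the employment-law rule of thumb Kareem mentions; as he notes, no analogous bright-line threshold exists in financial services.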
The other question I know you're interested in about fairness, Melissa, is this question of, is there a fairness penalty? Do profitability and fairness necessarily exist in tension with one another? And the short answer is that they certainly can, right? I mean, if you've built an algorithm that has been trained to optimize for profitability, it's going to do a bunch of things on the way to optimizing for profitability that you might not have otherwise intended. And by definition, if you go back to that algorithm and say, hey, in addition to profitability, I now want you to optimize for fairness, the addition of that second consideration will, by definition, perhaps lessen its tendency to optimize on the dimension of profitability. But the counterargument to "there must be a fairness penalty" is, well, the data is not really representative of these historically underprivileged groups. And if you upsampled Black borrowers, if you upsampled disabled borrowers and had more examples in the data of how they performed from a credit perspective, you might actually not incur a fairness penalty; you might actually find that it's profitable to lend to those historically disadvantaged groups. And that's been the argument made by a lot of fintechs, including one that joined our panel before Congress this morning.
If I could jump in briefly, I think the point that Kareem just made, that the trade-off has in part been an artifact of underrepresentation in the data, is a really important one. I think there's another factor as well. I don't know that it entirely eliminates the trade-off, because as Kareem said, if you go from not having a second consideration to adding that second consideration, that's going to have implications. But even if there is some trade-off, there's a question of the magnitude of that trade-off. I'm going to forget who the phrase is originally due to, but there's this idea of a multiplicity of good models. Even if your starting point is, let's suppose, the best model, it could be that there are a lot of other models that are really, really close. It's one thing to say that you shouldn't have to give up some considerable amount of model performance in order to mitigate impact, and I think that's consistent with the disparate impact framework: a less discriminatory alternative that is not similarly effective from a business standpoint doesn't need to be adopted. But if you do have these models where the difference in performance is really, really small, then I think that's another matter entirely. And one of the reasons I am an optimist about machine learning is that I think machine learning techniques are one of the things that facilitate identifying all those models, all those subtle variations that are nearly as good from a performance standpoint but may nonetheless make a big difference in terms of impact.
I'm really glad Eric mentioned that. It turns out that even with the same data sets, you can achieve a much greater level of fairness, generally at very low cost.
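The "multiplicity of good models" search Eric describes can be sketched as a simple selection rule: among candidate models whose performance is within a small tolerance of the best, prefer the one with the smallest disparity. The model names, accuracies, and disparity figures below are all invented for illustration:

```python
# Less-discriminatory-alternative search over near-equivalent models.

def least_discriminatory_alternative(candidates, tolerance=0.005):
    """candidates: list of (name, accuracy, disparity) tuples.
    Returns the least-disparate model whose accuracy is within
    `tolerance` of the best accuracy observed."""
    best_acc = max(acc for _, acc, _ in candidates)
    viable = [c for c in candidates if c[1] >= best_acc - tolerance]
    return min(viable, key=lambda c: c[2])

models = [
    ("baseline",  0.842, 0.31),  # most accurate, most disparate
    ("variant_a", 0.841, 0.19),  # nearly as accurate, far less disparate
    ("variant_b", 0.825, 0.05),  # fairest, but gives up real accuracy
]

print(least_discriminatory_alternative(models)[0])  # variant_a
```

With the tight default tolerance, variant_a wins: it is nearly as accurate as the baseline but far less disparate, which is exactly the low-cost fairness gain Kareem describes.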
Yeah, we wondered about that, right? If your math and your data are yielding much greater prediction accuracy, what were you finding among the alternatives, and were they landing you close to your best options? Then there are also some big ideas out there; this comes back to inclusion, but inclusion is a form of fairness. Given the sophistication of machine learning algorithms, and the fact that many would argue they will find proxies in data that a human would not necessarily have picked up, would we be generating better models and potentially extending credit to more populations if we were actually to allow protected class information into those models themselves? I would love to hear you all opine on that idea. It came up from some of the academics when we were doing the cash flow research, and the point that was made to me is: Melissa, if we were comparing you with all women, given the income differences between men and women, we might be letting in more women in terms of access to credit. That's a very simplified way of making the point. I'm just curious how you perceive that kind of idea, and where the potential benefits are. And, goodness, we have these prohibitions on using that kind of data for very good reasons. How would we actually continue to make sure that we don't have abuse if we were to put those data into the models?
Yeah, I'd be keen to hear Eric's view. My view is that this is one of the toughest questions facing us as an industry right now. For very good reasons, we do not permit the use of race, gender, age, etc., in underwriting, and those prohibitions largely emerged out of an era where underwriting was judgmental. And there is an increasingly rich literature about what's called fairness-aware machine learning that suggests the inclusion of that protected information would increase the accuracy of some of these models for subpopulations. So, and I struggle with this personally, which is why I'm keen to hear what other smart people like Eric and you think: on the one hand, I would feel deeply uncomfortable if I went to apply for an auto loan and I was asked what my race was. On the other hand, I think there's no question that if I could get a better rate, if you could approve more folks like me that have maybe been left out of the system, if you can prove that you're using that information for a salutary purpose, maybe we ought to permit it, or at least permit it subject to certain strict guardrails.
Yeah, I agree, it's a tough one. I have some familiarity with that literature, and I'm really glad that work has been done, because I think it's good to be open-minded about things that can make the financial system more inclusive. I do have a lot of wariness of it, just because it seems like there's so much room for that information to be misused, and it's certainly a long way from where we currently are. So I guess that's something I'm keeping an eye on with interest. But I don't know, it makes me kind of nervous. I think there are a lot of ways that something like that could go wrong, but I'm glad that thoughtful people are thinking about it and writing about it.
Yeah. Adversarial de-biasing. Kareem, can you tell us what that means, and why we should all be thinking about it? Oh, you're muted.
Sure. Adversarial de-biasing is one of several new techniques that show a lot of promise for increasing the fairness of machine learning models. There are two examples I like to use when I describe how adversarial de-biasing works. One involves a cop and a counterfeiter. Imagine you're a counterfeiter trying to pass fake bills by a cop. So you go into your little counterfeit lab, and let's say you've got three variables to play with: you've got the ink, you've got the paper, and you've got the typeface that you're going to use to make fake bills. I hope there's nobody from the Secret Service on this call. And so you use the ink and the paper and the typeface you've got, and you give that fake dollar bill to the cop, and the cop immediately recognizes it as fraudulent. So you go back to your lab and you play with those variables a little bit more, maybe better paper, different ink, a more accurate typeface. You basically reweight the composition of the variables to try to produce a better counterfeit bill, and you go back and present it to the cop, and the cop and the counterfeiter go back and forth in this cat-and-mouse game, until one day the counterfeiter has adjusted the composition of the variables relative to one another just right, and the cop is no longer able to discern that the bill is a fake. To apply this in a lending context, you could imagine a credit model and an adversarial model, where the credit model is underwriting borrowers, and the adversarial model is constantly trying to predict, on the basis of the data that the credit model is using: is this borrower Black? Is this borrower a woman? Is this borrower over the age of 65?
And if that adversarial model is able to predict, with a high degree of accuracy, the protected status of that borrower, then you need to stop that model and understand what variables, what variable interactions, are leading it to be able to predict that this borrower is a member of a protected class. And then, of course, adjust your variables accordingly, so that you can no longer discern whether that borrower is a member of a protected class, while at the same time not compromising the overall predictive power of the model. I know that's a lot. One other example that I find useful for understanding adversarial de-biasing: imagine a robot holding a Coke can, and you're trying to knock the Coke can out of the robot's hand. In the first few instances, you may be able to knock the can out of the robot's hand, but the robot will continue to adjust its grip, that is to say, play with its various variables, which are its digits. And you go back and forth, you trying to knock the Coke can out of the robot's hand, until one day the robot understands exactly the right way to hold that can, and you're no longer able to knock it out of its hand. At that point, the robot has understood what composition of variables it needs in order to frustrate your purpose.
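The audit step Kareem describes, an adversary trying to recover protected status from the credit model's inputs, can be sketched with synthetic data. Everything here is invented for illustration (the variable names, the proxy, the simple logistic adversary), and a full adversarial de-biasing system would retrain the credit model and adversary together in a loop rather than just running this one-shot check:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

protected = rng.integers(0, 2, n)                     # 0/1 protected-class flag
income = rng.normal(50, 10, n)                        # legitimate variable, unrelated here
zip_feature = protected * 2.0 + rng.normal(0, 1, n)   # variable that proxies protected status

def fit_logistic(X, y, lr=0.1, steps=500):
    """Plain gradient-descent logistic regression (no regularization)."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def adversary_accuracy(features):
    """How well an adversary can recover protected status from `features`."""
    cols = [(f - f.mean()) / f.std() for f in features]   # standardize for stable fitting
    X = np.column_stack([np.ones(n)] + cols)
    w = fit_logistic(X, protected)
    preds = (X @ w > 0).astype(int)
    return float((preds == protected).mean())

# With the proxy in the feature set, the adversary succeeds well above chance...
print(adversary_accuracy([income, zip_feature]) > 0.75)   # True
# ...drop the proxy, and it falls back toward coin-flip accuracy.
print(adversary_accuracy([income]) < 0.6)                 # True
```

When the adversary's accuracy is high, as in the first check, that is the signal Kareem describes to stop, find the offending variables or interactions, and adjust them until the adversary can no longer beat a coin flip.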
Eric, how do you hear that kind of approach?
I don't know that I have a whole lot to add to Kareem's.
Yeah, I like the robot example better, Kareem; I can get my head around it. There's a lot here, a lot of math and a lot of competing algorithms. What do we all need to know, as people who are engaged from a policy standpoint? Or imagine you're a small bank and you're thinking about partnering with a vendor. How do you get your head around this, to have a level of comfort with the adoption of machine learning and AI in underwriting? Please.
Sure. One thing to keep in mind is that in some ways it's very different, but in other ways it's really not that different at all. The same non-discrimination principles from the Equal Credit Opportunity Act apply; the same obligations about providing adverse action notices apply. The fact that the computer is more sophisticated in how it uses data to generate an outcome doesn't change the fact that a lender is responsible for meeting its obligations under the Equal Credit Opportunity Act, under Regulation B. You can't really just defer responsibility to an algorithm when, as an entity, you've built that algorithm, or you've licensed that algorithm; however you've come to employ it, it's still ultimately down to the lender to ensure that they're using these things responsibly. So in some ways things have changed, but in many ways they really haven't.
Yeah, I agree with that. At the end of the day, as lenders, you still have to run your model governance process, and you still have to ask the same basic sets of questions: how is this model making decisions? What variables is it taking into account? To what extent have I done everything in my power to make sure that it's not making decisions on some prohibited basis? What becomes more difficult, I think, is that there's a little bit of a learning curve with respect to the new math. It's harder to answer those questions if you've never dealt with a machine learning model before, or with a machine learning model's explainability techniques. And that's where, if you're a lender without a very sophisticated data science team, it can make sense to partner with outside companies who specialize in doing this work, although I have a tremendous amount of sympathy for lenders, because there are a lot of companies who claim to be able to do this work who don't do it very well.
Yeah, I've thought about this frequently. There is comfort in the fact that in the financial sector we actually have very explicit laws that demand evaluation for fairness and non-discrimination, and communicating to the consumer, especially if a decision was reached, in terms of a credit outcome, that you didn't expect, an unfavorable outcome. And importantly, even on the prudential side, for safety and soundness, you've got to be monitoring those models; you have to evaluate them for reliability. That is something that is true, and frankly, you don't see similar legal requirements in other sectors where there is hard grappling with the adoption of AI and data. At the same time, this focus on AI and machine learning, and the potential of data, and the 60 million US adults in this country who are not being credit risk assessed under traditional means, mean there's a need to poke at this a little bit. Do we have it right? Are there ways that we can be doing it better, either in terms of safely bringing in new types of data, or in contemplating how this, as I like to call it, fancy math might actually be really useful for greater financial inclusion? So I'm curious if you all have thoughts on how we're going to go from where we are. It's great that we've got protections in place, but the financial sector is so critical for equity. We haven't used that word yet, right? But it is essential for how we're building wealth and opportunity over the long term. How are we going to evolve? Because the math and the data may offer us some really more equitable outcomes if we can do it safely. I'm curious how you all see us evolving, and hopefully there are a few regulators listening, too.
Think of predictive models as occupying a spectrum, basically from coin flip to crystal ball. At every point on that spectrum, you've got a predictive model that will decline some borrowers that it should have approved, and approve some borrowers that perhaps it shouldn't. As you get closer and closer to crystal ball, those errors are minimized, and more people who ought to receive credit are awarded that credit. And the good news is, we're still so early in the adoption of these technologies that we could still realize a lot of inclusion just by upgrading people from logistic regression to XGBoost, and using some of this low-hanging fruit that you all at FinRegLab have identified in terms of data, whether it's transactional data or trended data. There is a lot of data that is clearly related to creditworthiness, and a lot of machine learning models that are fairly interpretable. If we could just get the industry to take a baby step in that direction, we'd be doing a lot of good. And that's before we have to confront, I mean, obviously we should always be mindful of the fact that there are a number of difficult questions in this arena, but there's also a lot of low-hanging fruit. Let's go get the low-hanging fruit and see what we can do.
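Kareem's coin-flip-to-crystal-ball spectrum comes down to two error types that both shrink as a model improves: creditworthy borrowers wrongly declined, and eventual defaulters wrongly approved. A tiny sketch, with entirely invented repayment outcomes and decisions:

```python
# Count both error types for a set of lending decisions.

def error_counts(truth, decisions):
    """truth: 1 = would repay, 0 = would default. decisions: 1 = approve."""
    wrongly_declined = sum(1 for t, d in zip(truth, decisions) if t == 1 and d == 0)
    wrongly_approved = sum(1 for t, d in zip(truth, decisions) if t == 0 and d == 1)
    return wrongly_declined, wrongly_approved

truth           = [1, 1, 1, 1, 0, 0, 1, 0, 1, 1]
coin_flip_model = [1, 0, 1, 0, 1, 0, 0, 1, 1, 0]   # half its calls are wrong
better_model    = [1, 1, 1, 0, 0, 0, 1, 0, 1, 1]   # one wrongful decline, none wrongly approved

print(error_counts(truth, coin_flip_model))  # (4, 2)
print(error_counts(truth, better_model))     # (1, 0)
```

The upgrade from the first model to the second is the inclusion gain Kareem is pointing at: fewer creditworthy people declined, with less risk taken on.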
And I think it's worth giving a shout-out to the regulators who have issued a statement talking about the potential of alternative financial data for inclusion purposes. And then, of course, the team at the OCC has been really charging ahead with Project REACh, really trying to think about inclusion opportunities there. Eric, I'd love to give you the last minute: how is the world going to evolve?
You know, I'll take the last minute, I suppose. I agree with Kareem: this is all pretty new, it's all changing, and I think generally improving, pretty quickly. I think it's heartening that the regulators have been trying to learn a lot, and we've seen multiple RFIs from the CFPB and, more recently, others. I'm optimistic that will lead to some additional guidance, but hopefully guidance with remaining room to explore and innovate. The conversation around this is expanding, too. In the machine learning literature there's much greater attention now to fairness, or at least there's maybe more awareness, on the part of people like me working from more of an industry perspective, of the fairness dimension of that literature, whether that means additional data, additional techniques, or even making sure there's diversity among the people researching this and on the development teams, and just the importance of this being an inclusive space, not in financial terms, but in terms of how we think about effective development of these models and what fairness looks like. With that conversation, I think things are moving in a positive direction. Great.
Well, thank you both. I hope this helped to give some sense of what we mean when we're talking about AI and machine learning in credit underwriting, especially for financial inclusion purposes. It's been a pleasure talking with both of you. And thank you again, Adam Rust; it's really a joy to be able to join you and host this conversation. So thank you.
And Brad Blower, too.