
THE NORMS OF ALGORITHMIC CREDIT SCORING

Published online by Cambridge University Press:  30 March 2021

Abstract

This article examines the growth of algorithmic credit scoring and its implications for the regulation of consumer credit markets in the UK. It constructs a frame of analysis for the regulation of algorithmic credit scoring, bound by the core norms underpinning UK consumer credit and data protection regulation: allocative efficiency, distributional fairness and consumer privacy (as autonomy). Examining the normative trade-offs that arise within this frame, the article argues that existing data protection and consumer credit frameworks do not achieve an appropriate normative balance in the regulation of algorithmic credit scoring. In particular, the growing reliance on consumers’ personal data by lenders due to algorithmic credit scoring, coupled with the ineffectiveness of existing data protection remedies, has created a data protection gap in consumer credit markets that presents a significant threat to consumer privacy and autonomy. The article makes recommendations for filling this gap through institutional and substantive regulatory reforms.

Copyright © Cambridge Law Journal and Contributors 2021

I. Introduction

In recent years, significant progress has been made in building computer systems that can perform complex, “human-level” tasks, such as natural language processing, visual perception and speech recognition. This progress is largely due to advances in machine learning (ML), including deep learning (DL) – key subfields of modern-day artificial intelligence (AI) research. Using ML techniques and large datasets, a computer system can be trained to recognise patterns in data and use this experience to predict outcomes in previously unseen data, without being explicitly programmed.Footnote 1 In turn, progress in ML has been driven by theoretical advances in ML methods, coupled with the massive growth in the volume of available (personal) data as our society has become increasingly networked, digitised, and “datafied”,Footnote 2 as well as improvements in computing power and in the tools and infrastructure needed to capture and process those data.Footnote 3

These socio-technical developments underlie the rapid proliferation of “algorithmic decision-making”. That is, the use of algorithms – notably, ML algorithms – to augment and/or substitute for human decision-making. A particularly salient example of algorithmic decision-making is the growing use of ML, together with new types of “alternative” data – such as social media data – by credit providers to assess the creditworthiness of consumers. This trend is referred to as “algorithmic credit scoring”.Footnote 4 The growth of algorithmic credit scoring has been received with a mixture of enthusiasm and trepidation. The enthusiastic highlight the prospect of greater efficiency in lending due to more accurate creditworthiness assessment.Footnote 5 The fearful, on the other hand, emphasise the dangers of inaccuracy, opacity, and unfair treatment of consumers due to algorithmic credit scoring, and more broadly, the loss of privacy, autonomy and consumer power in a society dependent on data-driven, algorithmic decision-making.Footnote 6

Of course, these concerns are not entirely new. The growing capability of information technology and its deployment in consumer credit markets over the past several decades – particularly in credit scoring and credit reporting systems – has been accompanied by a gradual increase in concerns about fairness and consumer privacy.Footnote 7 As such, many of the risks of algorithmic credit scoring are to an extent already anticipated by existing regulation. In the UK, the most pertinent regulatory frameworks in this regard are the sectoral consumer credit regime – supervised by the Financial Conduct Authority (FCA) – and the cross-sectoral data protection regime – supervised by the Information Commissioner's Office (ICO).

However, the rise of algorithmic credit scoring, and with it the vastly greater reliance on personal data in lending, has amplified existing concerns and normative tensions. The critical question, and the focus of this article, is whether the existing mechanisms and institutional arrangements under consumer credit and data protection regulation remain appropriate for balancing the harms and benefits due to algorithmic credit scoring. More fundamentally, the article asks whether these frameworks are able to satisfactorily balance competing normative goals in the regulation of algorithmic credit scoring.

The article makes three principal contributions. First, it constructs a frame of analysis for the regulation of algorithmic credit scoring in the UK, bound by the core regulatory norms underpinning consumer credit and data protection regulation: allocative efficiency, distributional fairness and consumer privacy (as autonomy).Footnote 8 Second, it examines the principal normative contests that arise within this frame. These include intra-normative trade-offs, as for example between efficiency gains and efficiency losses due to algorithmic credit scoring. They also include often-overlooked instances of norm alignment – for example, where algorithmic credit scoring increases both allocative efficiency as well as distributional fairness in consumer credit markets.

Third, the article suggests ways for regulators to navigate these normative contests. Focusing on the trade-offs with consumer privacy (autonomy), the article argues that existing data protection and consumer credit regulation do not strike an appropriate normative balance in the regulation of algorithmic credit scoring. That is, these frameworks under-regulate data protection, resulting in a data protection and consumer privacy “gap” in UK consumer credit markets. The article makes recommendations for filling this gap, including by expanding the role of the FCA in overseeing and enforcing data protection regulation in consumer credit markets. Indeed, data protection and consumer privacy naturally fall within the FCA's existing consumer protection mandate: data protection is consumer (financial) protection.

As such, the article calls for an evolution in the regulatory architecture governing consumer credit markets as these markets become increasingly datafied, and data protection becomes increasingly salient to the FCA's operational objective of protecting consumers.Footnote 9 Additionally, to adequately protect the privacy and autonomy of consumers, (consumer credit) firms should be required to meet a higher burden of proof for the processing of personal data than under existing data protection regulation. Inter alia, this should include stricter and more specific obligations for firms to demonstrate that the processing of personal data for algorithmic credit scoring yields a sufficiently significant improvement in the accuracy of creditworthiness assessment, in order to justify such processing. Serious consideration also needs to be given to limiting ex ante the types and granularity of personal data that may be used for algorithmic credit scoring, and consumer lending more broadly. For example, the use of intimate, feature-rich social media data could be prohibited, and the anonymisation of personal data made mandatory.

Whilst the article focuses on the regulation of algorithmic credit scoring in the UK, the relevant consumer credit and data protection frameworks are heavily shaped by EU law – notably, the Consumer Credit Directive and General Data Protection Regulation (GDPR).Footnote 10 As such, the analysis in this article will also be relevant to other EU jurisdictions that have similar regulatory frameworks – as well as other applications of algorithmic decision-making that give rise to similar normative concerns and trade-offs. It will be assumed that the relevant EU laws, and related jurisprudence, will continue to apply in the UK following its withdrawal from the EU.Footnote 11

The rest of the article is structured as follows. Section II charts the rise of algorithmic credit scoring. Section III identifies the normative frame of analysis for the regulation of algorithmic credit scoring in the UK. Section IV examines algorithmic credit scoring in light of this frame, and identifies the principal normative contests that arise. Section V considers how the regulation of algorithmic credit scoring should navigate these normative contests. Section VI concludes.

II. The Rise of Algorithmic Credit Scoring

Under UK consumer credit regulation, credit providers are required to assess the “creditworthiness” of borrowers prior to extending credit, and upon increasing the size or limit of credit.Footnote 12 In the early days of local, community-based banking, this creditworthiness assessment would have been carried out in person by a loan officer, who would interview the prospective borrower and draw upon their prior, often anecdotal knowledge and general experience with similar borrowers to decide whether or not to extend credit, and at what price (referred to as “judgmental” credit analysis).Footnote 13

Over time, this process has become increasingly risk-based, standardised and computerised. A key development in this regard was the introduction of “credit scoring”. That is, the use of statistical techniques to classify and rank consumers according to their credit risk, as estimated from patterns in the past credit performance and financial account transaction data of consumers with similar attributes.Footnote 14 In turn, credit scoring was facilitated by the development of centralised credit reporting, whereby lenders can purchase credit and financial account data about consumers shared by other lenders through credit reference agencies (CRAs), such as Experian and Equifax.Footnote 15 Lenders typically combine these data with information provided by consumers in their credit applications.

Conventionally, credit scoring has relied on linear statistical methods (including simpler forms of ML such as linear discriminant analysis and logistic regression) and a limited number of fixed variables to calculate a borrower's credit score.Footnote 16 Of these, credit repayment history is typically the highest weighted variable.Footnote 17 This approach reflects both the demonstrated statistical correlation between a borrower's credit history and their likely credit risk, and traditional limits on lenders’ access to non-financial, non-credit data about borrowers (or indeed credit data from non-traditional lenders, such as “payday lenders”, who do not participate in the formal credit reporting system).Footnote 18
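
To make the conventional approach concrete, the following is a minimal sketch (in Python, using scikit-learn and wholly synthetic data) of a logistic-regression scorecard fitted over a small, fixed set of variables. The variable names, simulated weights and figures are hypothetical illustrations, not any lender's or CRA's actual model.

```python
# Minimal sketch of conventional credit scoring: a logistic regression over a
# small, fixed set of variables. All data and weights are synthetic and purely
# illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5_000

# Fixed, traditional variables (standardised, synthetic).
repayment_history = rng.normal(0, 1, n)   # past repayment behaviour
income = rng.normal(0, 1, n)
existing_debt = rng.normal(0, 1, n)

# Simulate defaults so that repayment history carries the most weight.
logit = -1.5 * repayment_history - 0.5 * income + 0.7 * existing_debt
default = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([repayment_history, income, existing_debt])
scorecard = LogisticRegression().fit(X, default)

# The fitted coefficients act as scorecard weights: a linear, interpretable
# mapping from a handful of variables to an estimated probability of default.
print(dict(zip(["repayment_history", "income", "existing_debt"],
               scorecard.coef_[0].round(2))))
print("estimated default probability for a sample applicant:",
      scorecard.predict_proba([[1.0, 0.2, -0.3]])[0, 1].round(3))
```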

The massive growth in the volume of available (personal) data, the wider diffusion of the Internet and advances in ML methods starting in the mid-2000s, combined with the contraction in bank lending following the 2008 global financial crisis, have fuelled the growth of “algorithmic credit scoring” in the UK.Footnote 19 Peer-to-peer (p2p), “fintech” lenders such as Zopa and Wonga were pioneers of algorithmic credit scoring, using it to lend to populations underserved by mainstream banks.Footnote 20 Algorithmic credit scoring builds on traditional credit scoring in two principal ways: first, by leveraging a much larger volume and variety of data (so-called “alternative” data) for consumer credit scoring; and second, by using more sophisticated ML techniques to analyse these data. In this respect, the “algorithmic” in “algorithmic credit scoring” refers primarily to the use of more sophisticated ML and DL algorithms, such as random forests and artificial neural networks.Footnote 21 However, it should be acknowledged that conventional forms of statistical credit scoring are also “algorithmic”.Footnote 22 Credit providers typically combine traditional credit scores with deeper insights derived from algorithmic credit scoring.Footnote 23

Alternative data used for algorithmic credit scoring include both non-credit, financial data (for example, rental, utility and mobile phone payment data) and non-credit, non-financial data. The latter include both mainstream data, such as education and employment histories, and less conventional social and behavioural data, such as social media activity (“liking” posts, clicks and time-per-view), mobile phone usage (average time spent on the phone, size of social network etc.), health and fitness activity, retail and online browsing data – even data on how consumers interact with the lender's website.Footnote 24 They include not only text, but also videos, images and sounds. Alternative data are also less structured and more feature-rich (“high-dimensional”) than traditional credit data. Some of these data are collected directly by lenders from consumers, whilst others are acquired from third-party data brokers, such as Acxiom and Experian.

In turn, ML/DL techniques can parse large, unstructured and high-dimensional datasets to find features and patterns that are relevant to predicting a borrower's creditworthiness.Footnote 25 Importantly, ML (especially DL) can more accurately capture nonlinear relationships in data,Footnote 26 as well as reflect changes in the population and environment in order to more accurately estimate a borrower's creditworthiness – for example, by offsetting evidence of historic payment default with more recent evidence of prompt payment or by factoring in expected payments from flexible working arrangements that are increasingly common in the “gig” economy. The use of a much larger number of data points on the consumer can also reduce the risk that errors in the data will be determinative – for example, where living consumers are recorded as deceased (so-called “credit zombies”), or discharged debts remain on a consumer's credit record.Footnote 27
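
The following sketch, again on synthetic data and under purely illustrative assumptions about the features involved, shows in stylised form why more flexible ML models are attractive here: a random forest can pick up a non-linear interaction between hypothetical “alternative” variables that a linear scorecard misses.

```python
# Illustrative comparison (synthetic data, scikit-learn): a linear scorecard
# versus a random forest when default risk depends on a non-linear interaction
# between hypothetical "alternative" features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 10_000
phone_usage = rng.normal(0, 1, n)      # e.g. average daily phone usage (standardised)
gig_income_var = rng.normal(0, 1, n)   # variability of gig-economy earnings
recent_on_time = rng.normal(0, 1, n)   # recency-weighted on-time payment signal

# Volatile gig income is simulated as risky only when recent on-time payment
# evidence is weak, i.e. a non-linear interaction.
logit = 1.0 * gig_income_var * (recent_on_time < 0) - 1.2 * recent_on_time
default = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([phone_usage, gig_income_var, recent_on_time])
X_tr, X_te, y_tr, y_te = train_test_split(X, default, random_state=0)

linear = LogisticRegression().fit(X_tr, y_tr)
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

print("AUC, linear scorecard:", round(roc_auc_score(y_te, linear.predict_proba(X_te)[:, 1]), 3))
print("AUC, random forest:   ", round(roc_auc_score(y_te, forest.predict_proba(X_te)[:, 1]), 3))
```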

III. The Regulatory Norms of Algorithmic Credit Scoring

As noted in Section I, there are two principal regimes relevant to the regulation of algorithmic credit scoring in the UK: consumer credit regulation, specifically the requirement to assess creditworthiness, and data protection regulation. This section examines the principal regulatory norms underpinning these regimes, and which thus guide the regulation of algorithmic credit scoring in the UK. As elaborated below, these are: allocative efficiency, distributional fairness, and consumer privacy (as autonomy). The next section will examine how algorithmic credit scoring could both advance as well as hinder these normative goals. Before that, a few remarks are warranted about the approach taken in constructing the normative frame of analysis for algorithmic credit scoring.

First, the focus on data protection and consumer credit regulation does not deny the relevance of other regulatory frameworks that also govern algorithmic credit scoring in the UK, such as anti-discrimination, competition, intellectual property and general consumer laws. Rather, the selected focus of this article is the interaction of data protection regulation and consumer credit regulation, given their particular salience to the regulation of algorithmic credit scoring. Likewise, the focus on the FCA and the ICO, as the principal regulatory institutions enforcing these frameworks, does not deny the relevance of other regulatory institutions and enforcement mechanisms. This includes sectoral and cross-sectoral public bodies – such as the Financial Ombudsman Service and the Equality and Human Rights Commission – as well as private litigation by consumers through the courts.

Second, it has been assumed that the normative frame of analysis for algorithmic credit scoring can be deduced from positive law – specifically, existing consumer credit and data protection regulation. This is an intuitive and practical approach for a case study on algorithmic credit scoring in the UK, which finds itself at the cross-roads of relatively mature regulatory frameworks, with well-articulated normative goals. In this context, the critical challenge is (re-) balancing existing regulatory goals, rather than articulating new ones. However, a modified approach may be required in algorithmic decision-making contexts where regulatory norms are still nascent: for example, new applications such as algorithmic content moderation on social media platforms, and new geographic settings where legal frameworks may be less developed.

Third, whilst allocative efficiency, distributional fairness and privacy/autonomy are examined as distinct normative goals, this should not imply that they are entirely orthogonal to one another. Indeed, there are important overlaps and interconnections between and within them. Inter alia, an allocatively efficient market is also autonomy-enhancing to the extent that it satisfies individual (revealed) preferences (the rational choice perspective). Likewise, the goals of distributional fairness and privacy/autonomy coincide to the extent that equality of opportunity is a necessary precondition for personal autonomy (the positive liberty perspective).Footnote 28 However, by separating out these norms, we explicitly acknowledge that they are not coterminous. In particular, autonomy cannot be limited to “preference autonomy”. As discussed in the next section, the large asymmetries of knowledge and power in datafied consumer credit markets undermine the legitimacy of revealed preferences as an indicator of a consumer's actual preferences regarding the use of their data.

***

Thus, under UK consumer credit regulation, the requirement to assess creditworthiness is driven by two key normative goals: allocative efficiency and distributional fairness. Creditworthiness is defined as both credit risk to the lender (i.e. the probability of default by the borrower and loss given default) and the affordability of credit for the borrower, namely the probability that the borrower can repay the credit without a significant adverse effect on their financial situation (financial distress).Footnote 29 From a public regulatory perspective, the credit risk requirement is driven primarily by an allocative efficiency objective. That is, to ensure that capital is allocated to the most valuable projects – here, borrowers who are willing and able to make repayments under the credit agreement, on time – and to minimise inefficiency due to systemic non-performing debt.Footnote 30 Of course, lenders also have a private, commercial incentive to manage credit risk in order to mitigate their own losses due to non-performing loans.

In turn, the requirement to assess affordability is driven by both an allocative efficiency and a distributional fairness objective. Unaffordable borrowing could be inefficient, inter alia where it leads to bankruptcy, foreclosure and homelessness. Unaffordable borrowing is also undesirable from a distributional fairness perspective, as it is more likely to harm less well-off, vulnerable consumers who typically have less disposable income and lower levels of financial literacy, and are thus more susceptible to financial (and non-financial) distress.Footnote 31 However, this inefficiency and unfairness are not necessarily captured by the lender's assessment of credit risk: a borrower may be “willing and able” to make timely repayments under the specific credit agreement that the lender is exposed to, yet experience financial distress as a result. Likewise, a lender may still be willing to lend to a borrower who is a high credit and affordability risk because it can diversify its exposure across many similarly-situated borrowers, thereby lowering its overall portfolio risk.Footnote 32

Mandatory affordability assessment is also intended to limit the scope for unscrupulous lenders to intentionally exploit vulnerable consumers by selling unfavourable credit products. A notorious example of this type of exploitation is the marketing of credit card contracts with lower short-term prices (such as “teaser” interest rates), which are more salient to the consumer, and higher long-term prices (such as back-end contingent fees), which are typically less salient.Footnote 33 Such rent-seeking by lenders is inefficient, as effort is expended simply to bring about a transfer of wealth from borrowers to lenders without increasing economic output: in economic terms, the investments undertaken by the lender represent a “deadweight loss” to society. Moreover, it is likely to have regressive distributional effects, given that it is more likely to harm – due to unaffordable debt and financial distress – less well-off and less-educated individuals (additionally skewed by race and age) who, experiencing greater financial desperation and with lower levels of financial literacy, are more susceptible to such exploitation by lenders.Footnote 34

Distributional fairness in this regard is understood primarily in a Rawlsian sense, as the degree of inequality of outcome in the economy, particularly as a result of inequality of opportunity.Footnote 35 On this view, the poor, sick, old, and uneducated, and those who have been historically disadvantaged on grounds of personal attributes that are not within their control (for example, sex, disability or race), are marked out as especially vulnerable groups and made the focus of regulatory efforts to redistribute resources or prevent their exploitation at the hands of the more wealthy and powerful.Footnote 36 Vulnerability also encompasses behavioural and cognitive limitations, as discussed above, as well as structural power imbalances. As such, consumers as a class are typically marked out as more vulnerable and in need of additional protection through regulation.Footnote 37

The third key norm framing the regulation of algorithmic credit scoring in the UK is the protection of consumer privacy, and more fundamentally, consumer autonomy. This normative objective underpins the UK data protection regime, comprising the Data Protection Act 2018 (DPA) and the GDPR, which in turn are rooted in the fundamental rights to privacy and data protection under EU law.Footnote 38 This regime establishes obligations for data controllers and processors, rights for data subjects and regulatory enforcement powers with respect to the processing of personal data. Inter alia, when processing consumers’ personal data, credit providers must: comply with overarching data protection principles, such as data minimisation and purpose limitationFootnote 39; satisfy one of the grounds for lawful data processing, such as that the consumer has given their consent to processing, or that the processing is necessary for the controller's “legitimate interests”Footnote 40; implement “data protection by design and default”Footnote 41; and carry out a “data-protection impact assessment” (DPIA) where the intended data processing is likely to result in a “high risk to the rights and freedoms of natural persons”.Footnote 42

In turn, consumers have various rights to access their data and object to processing. In the consumer credit context, this includes a right for consumers to access and correct errors in their credit files, and corresponding obligations for credit providers to notify credit applicants about any CRA from which they have obtained information, and for CRAs to disclose credit files to the consumer.Footnote 43 Additionally, consumers have a qualified right to object to a decision taken solely on the basis of “automated processing” – such as the automatic refusal of an online credit applicationFootnote 44 – as well as to receive information about the existence and “logic involved” in automated individual decision-making, such as algorithmic credit scoring.Footnote 45

This regime recognises that, in a data-driven society, data relating to consumers are increasingly constitutive of their identities, shaping how they are perceived and the opportunities that flow from the construction of their “digital selves”.Footnote 46 As such, controlling and limiting the processing of personal data – through both data protection rights for individuals as well as obligations for data processors – is a necessary (but not sufficient) condition for consumers to shape their own identities, and exercise their autonomy.Footnote 47 This perspective also informs the stricter controls on the processing of sensitive personal data, such as health, biometric and genetic data, that are intrinsic to consumers’ identities.Footnote 48

A detailed examination of the normative origins of UK data protection regulation is beyond the scope of this article. It is important to note, however, that privacy as autonomy, as a key norm underpinning UK and EU data-protection regulation,Footnote 49 is a much thicker norm than privacy as non-interference,Footnote 50 informational secrecy, or confidentiality.Footnote 51 Whereas the latter primarily protect against the disclosure of non-public, personal information, privacy as autonomy protects an individual's broader right to (informational) self-determination,Footnote 52 through individual and collective controls on the use of personally identifiable information – public as well as non-public. Understood in this way, privacy is not so much a form of protection for a pre-cultural, essential self (a negative liberty conception), as it is constitutive of the self – marking the boundaries within which individuals and communities can choose and construct their own identities, develop their capacities for self-determination and pursue the conditions for human flourishing (a positive liberty conception).Footnote 53

In addition to privacy (as autonomy), UK data protection regulation is also guided by the norms of distributional fairness and allocative efficiency. Notably, the stricter controls on the processing of sensitive personal data, as well as on automated decision-making and profiling, additionally reflect a concern to mitigate distributional unfairness due to unlawful discrimination and the exploitation of knowledge and power asymmetries between data processors and data subjects.Footnote 54 With regard to allocative efficiency, one of the key underlying objectives of the GDPR is to facilitate market integration through the free movement of personal data.Footnote 55 To the extent that this gives lenders access to better information by which to assess the creditworthiness of borrowers, it reduces the informational asymmetry of “creditor ignorance”Footnote 56 and thereby supports allocative efficiency in UK consumer credit markets, as examined further in the next section.

IV. Examining Algorithmic Credit Scoring

Section III identified the three key normative goals – allocative efficiency, distributional fairness and consumer privacy (autonomy) – that underpin the regulation of algorithmic credit scoring in the UK, focusing on data protection and consumer credit regulation. This section examines algorithmic credit scoring in light of these regulatory norms. The goal is to identify the principal normative trade-offs and alignments under algorithmic credit scoring, rather than to inventory all possible normative implications. What emerges is a nuanced assessment in which algorithmic credit scoring generates both benefits as well as harms, within and between each of the three regulatory norms. The next section will consider how the regulation of algorithmic credit scoring should navigate these normative contests.

A. Allocative Efficiency

On the one hand, algorithmic credit scoring stands to increase allocative efficiency in consumer credit markets. Notably, by revealing more information about borrowers that could be relevant to their credit risk – and by reducing the cost and time it takes to acquire that information – algorithmic credit scoring could facilitate the more accurate assessment of credit risk, thereby mitigating inefficiency due to adverse selection effects.Footnote 57 Likewise, algorithmic credit scoring could enable lenders to more accurately assess the affordability of credit for a borrower, and therefore mitigate inefficiency due to unaffordable borrowing. In particular, the use of more types of social and behavioural data, and ML models that continue to learn based on new training data, could enable lenders to more accurately predict a borrower's expenditure and disposable income during the repayment term of a loan, and thus their ability to meet loan repayments without experiencing financial (or non-financial) distress.

Algorithmic credit scoring could also allow lenders to more effectively monitor and control a borrower's actions ex post. Insights about a borrower obtained through algorithmic credit scoring could be used to more effectively design penalty pricing provisions or fee waivers that adapt dynamically to the borrower's behaviour during the term of the loan. This could help to reduce creditor ignorance after credit is extended and thereby mitigate inefficiency due to moral hazard effects.Footnote 58 The ML techniques and alternative data that underlie algorithmic credit scoring also stand to generate efficiencies in other parts of the credit lifecycle, including identity verification and monitoring of fraud and money laundering, as well as generally increasing the speed and convenience of the credit process for borrowers.

These efficiency gains are likely to be most pronounced in the “thin-file” and “no-file” borrower segments: that is, borrowers with sparse or non-existent credit histories for whom conventional credit scoring and reporting mechanisms that rely on historical credit data are less effective at predicting creditworthiness.Footnote 59 Estimates suggest that nearly 10 per cent of the UK population are thin-file or no-file borrowers.Footnote 60 These borrowers are frequently denied credit by mainstream lenders and forced to rely on payday lenders, or informal, unregulated sources of credit – often at punitive interest rates and exposing them to abusive lending practices.

Indeed, recent empirical studies support the hypothesis of increased allocative efficiency in consumer credit markets due to greater predictive accuracy in creditworthiness assessment as a result of algorithmic credit scoring.Footnote 61 This is generally expressed as a lower average classification error rate, namely fewer false positives (whereby an applicant with bad credit is misclassified as good credit), and fewer false negatives (whereby an applicant with good credit is misclassified as bad credit), and evidenced in a reduction in borrower default rates following the introduction of algorithmic credit scoring. Empirical data also corroborate the hypothesis that algorithmic credit scoring could improve access to credit for thin-file and no-file borrowers.Footnote 62
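
By way of illustration only, the following toy calculation shows how these error rates are computed, following the convention above that the “positive” class is a good-credit (approve) classification; the labels are invented for the example.

```python
# Toy illustration of classification error rates in creditworthiness assessment.
# Convention (as in the text): positive class = good credit (approve).
import numpy as np
from sklearn.metrics import confusion_matrix

# 1 = good credit (approve), 0 = bad credit (refuse); invented labels.
y_true = np.array([1, 1, 0, 1, 0, 0, 1, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 1, 0, 1, 1, 0, 1])

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# False positive: a bad-credit applicant misclassified as good credit (approved, may default).
# False negative: a good-credit applicant misclassified as bad credit (wrongly refused).
print("false positive rate:", fp / (fp + tn))
print("false negative rate:", fn / (fn + tp))
```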

On the other hand, however, we should not overestimate the potential efficiency gains due to algorithmic credit scoring. Notably, the posited gains depend on the quality and accuracy of the data and ML models deployed. If the data are incomplete, biased or inaccurate and/or the ML models poorly trained and tested such that they do not generalise well to “out of sample” data,Footnote 63 algorithmic credit scoring could instead generate inefficiency in consumer credit markets due to inaccurate creditworthiness assessments. Furthermore, the greater opacity and complexity of certain “black-box” ML methods used by lenders for risk modelling (particularly DL methods) could impede model interpretability and effective model validation.Footnote 64 Model opacity and data bias are also potential sources of distributional unfairness due to unfair discrimination, as discussed further in Section IV(B).

Likewise, algorithmic credit scoring in the UK (and US) is a distinctly post-crisis phenomenon. As such, it remains to be seen whether observed improvements in predictive accuracy simply reflect the relatively benign macroeconomic environment since 2008 in which many ML models used for algorithmic credit scoring have been trained.Footnote 65 Indeed, there are ominous parallels between the use of alternative data for algorithmic credit scoring and the loosening of loan underwriting standards for high-risk borrowers – including the notorious “Alt-A” mortgages – that foreshadowed the 2008 global financial crisis.Footnote 66 A related concern is the impact of algorithmic credit scoring on the overall volume of household debt and the rate of credit expansion in the economy – particularly to vulnerable consumers for whom debt can quickly become unaffordable.

The posited efficiency gains also depend on how lenders use their data-driven insights about borrowers. Rather than expanding access to credit for “thin-file” yet still high-risk borrowers, lenders could instead use these insights to “skim the most creditworthy segment of the market for themselves”.Footnote 67 Likewise, the trend towards algorithmic credit scoring could generate inefficiency where it causes lenders and other firms to over-invest in gathering private information, in a way that is socially wasteful (rent-seeking).Footnote 68 And, rather than using additional insights about borrowers to improve the design and marketing of credit products, lenders could instead use them to exploit borrowers’ cognitive and behavioural biases (“consumer ignorance”).Footnote 69 For example, a lender (mediated through a third-party platform, such as Facebook) could use its insights about consumers’ biases and preferences to engage in less desirable forms of price discrimination and targeted marketing based on consumers’ misperceptionsFootnote 70 – perhaps targeting the borrower with an unfavourable credit offer at moments of extreme vulnerability and before they have the opportunity to shop around for a better offer.Footnote 71 As discussed in Section III, the use of algorithmic credit scoring to exploit consumers’ vulnerabilities in this way is inefficient and also raises distributional fairness concerns.

As lenders become more dependent on data and ML-driven creditworthiness analysis, the scope for inefficiency due to technological failure also increases. This could take the form of an adversarial ML attack, such as malicious agents corrupting an ML model by poisoning its training data,Footnote 72 or more generic denial-of-service or malware attacks on a lender's computer systems. Cyberattacks could create severe market disruption and inefficiencies at both the macro and micro level: at the macro level, where the attack hobbles multiple lenders simultaneously, potentially due to the exploitation of a shared vulnerability; at the micro level, where consumers’ data are stolen and used to make fraudulent claims, inter alia adversely affecting their credit scores.Footnote 73

The trend towards algorithmic credit scoring could also undermine competition in consumer credit markets to the extent that the high start-up costs and increasing returns to scale from data processing favour larger lenders and already-data-rich companies, like Google, over smaller firms and new entrants.Footnote 74 Additionally, inefficiency could result from efforts by consumers to “trick” or game their data. For example, a consumer could register a relative's fitness tracking device to their name in order to piggy-back off the latter's positive health data without making any real-world improvements to their own health. It is evidently more difficult to game financial credit data, such as account transaction and credit history data.Footnote 75

B. Distributional Fairness

Algorithmic credit scoring paints a similarly mixed picture when viewed from the perspective of distributional fairness. On the one hand, it stands to enhance distributional fairness in consumer credit markets by improving access to credit due to more accurate creditworthiness assessment, particularly for thin-file and no-file borrowers who lack the credit data traditionally used to assess creditworthiness. As discussed above, these borrowers are more likely to come from low-income, less-educated and ethnic minority backgroundsFootnote 76: greater access to credit allows them to satisfy their consumption preferences as well as smooth consumption over time in a way that was not previously possible. Conversely, more accurate affordability assessment due to algorithmic credit scoring stands to reduce distributional unfairness by limiting (access to) unaffordable credit.

On the other hand, however, there are risks of distributional unfairness due to algorithmic credit scoring. This includes the exploitation of vulnerable consumers using behavioural insights derived by lenders through algorithmic credit scoring, as examined in Section IV(A) above. Unscrupulous lenders can also exploit data-driven insights to pursue more aggressive debt collection practices, targeted at the most vulnerable borrowers.Footnote 77 Likewise, lenders may not use their data-driven insights due to algorithmic credit scoring to expand access to credit for marginalised borrowers, as discussed above.

More broadly, algorithmic credit scoring could impact distributional fairness in consumer credit markets due to unfair discrimination.Footnote 78 On the one hand, by supporting or substituting for loan officers through algorithmic credit scoring, the scope for unfair discrimination – particularly direct discrimination due to personal animus – could decrease.Footnote 79 As compared to purely judgmental, face-to-face lending, it is arguably also easier to detect unfair discrimination in algorithmic credit scoring (as with statistical credit scoring generally), to the extent that a lender's decision-making process is elucidated and explicated in the credit scoring model.Footnote 80

However, there is also a demonstrated risk that biases in the data used to develop an ML scoring model could generate new avenues for indirect discrimination,Footnote 81 particularly where the model is not adequately tested. For example, if in a training dataset the successful credit applicants are overwhelmingly White and male (due to historical discrimination in lending and/or data bias due to poor sampling and data pre-processing), the ML model could learn to associate Whiteness or maleness – or proxies for these variables, such as income, credit history or postcode – with good creditworthiness, resulting in unfair lending outcomes for non-White, non-male borrowers.Footnote 82 Deploying these models “in the wild”, without rigorously testing the results and adjusting the model hyperparameters based on feedback, thus risks reinforcing and perpetuating historic patterns of unlawful discrimination.
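
The following simplified sketch (synthetic data; the group variable, features and proxy are hypothetical) illustrates the kind of testing referred to above: even where the protected attribute is excluded from the model's inputs, comparing predicted approval rates across groups can reveal that the model has learned a historic bias through a correlated proxy.

```python
# Sketch of proxy-discrimination testing on synthetic data: the protected
# attribute is excluded from the features, but a correlated proxy is not.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
n = 10_000
group = rng.integers(0, 2, n)                        # protected group membership (0 or 1)
postcode_score = rng.normal(0, 1, n) + 0.8 * group   # proxy correlated with group
income = rng.normal(0, 1, n)

# Historic approval decisions encode a bias against group 1 beyond what income explains.
logit = 1.0 * income - 0.6 * group
approved_historically = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

# Train on the proxy and income only; the protected attribute itself is not a feature.
X = np.column_stack([postcode_score, income])
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, approved_historically)
pred = model.predict(X)

for g in (0, 1):
    print(f"group {g}: predicted approval rate = {pred[group == g].mean():.3f}")
# A persistent gap signals that the model has reproduced the historic bias via the proxy.
```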

A related fairness concern is that certain high-risk borrowers who were previously receiving implicit insurance through a lack of, or hidden, information could be harmed by improved observability of their (negative) characteristics due to algorithmic credit scoring – either by being denied credit or by being offered credit on more expensive terms. Although more accurate price differentiation in this way may be considered an efficient outcome (as discussed in Section IV(A) above), it is likely to have regressive distributional effects. Notably, the least advantaged consumers are most likely to see their cost of credit increase or be denied credit. However, being denied credit on the basis of a more accurate creditworthiness assessment could benefit consumers in the long run and support distributional fairness if it mitigates the risk of unaffordable debt.Footnote 83

In this regard, recent empirical studies examining the impact of algorithmic credit scoring on distributional fairness in consumer lending are instructive. In particular, a study based on US housing mortgage data points to the elimination of discrimination in loan origination, and a reduction in loan pricing discrimination against minority groups (here, Hispanic and African-American borrowers) as a result of algorithmic credit scoring (although price discrimination nevertheless persists).Footnote 84 Similarly, a second study suggests that the use of algorithmic credit scoring could increase loan acceptance rates for Hispanic and African-American borrowers. However, the same minority groups are also more likely to receive higher interest rates with algorithmic credit scoring, and greater within-group dispersion of rates, as compared to White and Asian borrowers.Footnote 85

These results appear to corroborate the hypotheses, set out above, that the shift towards algorithmic credit scoring could enhance distributional fairness in consumer credit markets by reducing direct discrimination due to face-to-face prejudice (for example, in loan origination), thereby improving access to credit for minority groups, whilst at the same time undermining distributional fairness due to more accurate price differentiation and discrimination.Footnote 86

C. Privacy (as Autonomy)

Algorithmic credit scoring, and the wider ecosystem of data-driven, algorithmic decision-making in which it is situated, present a growing threat to consumer privacy and therefore consumer autonomy. First, due to the increased scope for “objective” harm – for example, where consumers’ data are hacked and used to coerce them, inter alia through identity theft (this could also be inefficient, as discussed in Section IV(A) above). Second, due to the increased scope for “subjective” harm caused by the chilling effects of constant surveillance and behavioural profiling,Footnote 87 and consumers’ reduced ability to understand and control how data relating to them are used to shape their (financial) identities.Footnote 88

Indeed, pre-emptive, data-driven insights from algorithmic credit scoring empower lenders to discover more about consumers than consumers know about themselves. Likewise, the knowledge that “all data is credit data”Footnote 89 constrains consumers from acting freely given that any of their (digital) behaviour could impact their creditworthiness.Footnote 90 Importantly, the inferences drawn from alternative data are often less intuitive than those based on traditional credit data, thus making it harder for consumers to know what behaviour is likely to impact their credit scores, and how. For example, it is relatively straightforward for consumers to understand that a good credit history is positively associated with creditworthiness. In contrast, the association between creditworthiness and social media activity data – “likes” and posts, or the size and composition of one's social network – is far less intuitive.Footnote 91

It might be argued that the cost to consumers’ autonomy of having less control over how data relating to them are used is outweighed by the apparent gain in autonomy due to improved access to credit as a result of algorithmic credit scoring – or, conversely, due to reduced access to unaffordable credit (autonomy-autonomy trade-offs). Likewise, it is arguable that not all inferences about consumers in algorithmic credit scoring are opaque, and moreover, could ultimately benefit consumers. In this sense, the knowledge that all data is credit data could shape consumers’ behaviour in positive ways – for example, by incentivising them to exercise more in order to improve their fitness tracking data and thus their perceived creditworthiness.

However, it is difficult to offset these apparent gains in autonomy against the systemic, longer-term harms to consumer privacy and autonomy due to algorithmic credit scoring, and the wider apparatus of algorithmic decision-making to which it contributes. That is, whilst certain consumers arguably gain in the immediate term from the use of alternative data for assessing their creditworthiness, their future autonomy – and that of other consumers – is jeopardised if what emerges is a surveillance society in which all of their activities are monitored and measured, and opaque predictions based on their data are used to influence and shape their identities.Footnote 92

Moreover, as discussed earlier, too much information can also be inefficient, due to over-investment in information gathering by lenders, biased and unreliable data or inefficient forms of price discrimination due to the exploitation of vulnerable consumers.Footnote 93 This points to scope for normative alignment between the goals of privacy/autonomy and allocative efficiency – as well as distributional fairness – whereby a certain level of consumer privacy is necessary for well-functioning and fair consumer credit markets.

V. Navigating Normative Contests in the Regulation of Algorithmic Credit Scoring

The examination of algorithmic credit scoring in Section IV reveals a complex web of potential inter- and intra-normative trade-offs – as well as areas of normative alignment. These normative interactions are summarised in Table 1, using prototypical examples.

Table 1 Normative contests in the regulation of algorithmic credit scoring

As this table highlights, the scope for both normative alignment and disagreement is striking. How should regulation navigate these normative contests? In the first instance, empirical data relating to algorithmic credit scoring offer a useful heuristic. As discussed in Section IV, recent studies of algorithmic credit scoring corroborate the hypothesis of greater predictive accuracy in assessing consumer creditworthiness and pricing credit. The impact on distributional fairness in lending is less clear-cut. For example, discrimination in loan origination appears to decrease, and access to credit for minority groups increases, whilst discrimination in loan pricing is in some cases found to increase. However, at least one study points to an overall reduction in discrimination in lending due to algorithmic credit scoring, as compared to the status quo.

Of course, these studies are subject to several limitations. This includes variation in the definition of algorithmic credit scoring, which hinders comparability of the results; missing data on rejected applicants; and, specifically in relation to lending discrimination, missing data on the relevant protected characteristics.Footnote 94 They also leave several questions unanswered – notably, whether the demonstrated gains in predictive accuracy will be robust to an economic downturn, and whether these gains will translate into an overall increase in allocative efficiency and distributional fairness in consumer credit markets. Furthermore, these studies are based on US mortgage markets.

This highlights the need for further empirical research on the (long-term) impact of algorithmic credit scoring, particularly in UK consumer credit markets. However, to the extent that these studies at least suggest that, under algorithmic credit scoring, allocative efficiency and distributional fairness (as indicated by the effect on lending discrimination and access to credit for minority racial groups) are more aligned than commonly assumed, they can serve to focus our attention on the potentially trickier normative contests: as between allocative efficiency and distributional fairness on the one hand, and privacy/autonomy on the other.

In designing an appropriate regulatory response, the question must first be asked whether market-based, consumer-helping solutions could resolve these normative contests. To the extent that such measures are not fully effective, the question arises how public regulation should respond.Footnote 95

A. Consumer-helping Solutions

Thus, one way of mitigating the harms to consumer privacy and autonomy is through the adoption of privacy-enhancing technologies (PETs), by both consumers and firms.Footnote 96 This includes web plug-ins, such as ad and cookie blockers, and privacy-preserving browsers and web browsing settings (such as the Tor Browser or Google Chrome's “incognito” mode).Footnote 97 Likewise, virtual assistants could help to counteract consumers’ behavioural biases and “nudge” them towards decisions that better protect their data and identity.Footnote 98 Emerging blockchain-based tools (such as the Bloom “credit chain”), and decentralised personal data “stores” such as the Mydex platform, could also offer consumers greater control over the use of their personal data and identity.Footnote 99 Indeed, privacy “self-management” by consumers through the use of these tools is by definition autonomy-enhancing.Footnote 100

However, whilst these tools can certainly play a part in protecting consumer privacy, their effectiveness depends on the extent to which they fully substitute for, or merely assist, the consumer. Less financially and/or technologically literate consumers, in particular, often neither fully understand nor internalise the potential costs of the collection and subsequent use of their data. As such, they may not recognise the value of privacy-enhancing tools, and so never adopt them in the first place. This is in part due to the ambiguity of standard-form online contracts that notoriously include vague, catch-all clauses enabling service providers to collect, store and re-use customer data for related purposes.Footnote 101 Yet, even if further uses of data were to be spelt out clearly, it is well documented that consumers do not read the fine print nor respond rationally to potential risks.Footnote 102

More significantly, the future-oriented, uncertain and intangible nature of many privacy/autonomy harms – a data hack or identity theft that may not happen, “invisible” online manipulation – means that consumers easily discount their severity, inter alia due to optimism and present biases.Footnote 103 Consumers struggle to fathom the aggregate harmful effects of individual data transactions on their (long-term) privacy and autonomy.Footnote 104 And the nature of feature-rich, unstructured personal data means that consumers often reveal more information than they would prefer, or are aware of, when handing over those data.Footnote 105 For example, a person's mouse tracking data can reveal their propensity to develop conditions such as Alzheimer's Disease – an inference that would be unintuitive to the average consumer.Footnote 106

Likewise, consumers are unable to internalise the undesirable social costs to the privacy and autonomy of other consumers due to the use of networked data – the negative externalities of data processing.Footnote 107 For example, a person's Facebook data – such as their friends list and group pictures in which other users have been tagged – can yield detailed inferences about their friends and associates, without the latter's consent or knowledge. Detailed yet unforeseeable behavioural inferences can also be drawn by aggregating personal data with seemingly unrelated and/or non-personal data, such as demographic or environmental data, in ways that are unintuitive and opaque to most consumers.Footnote 108 For example, personal geo-location data and non-personal weather data can be aggregated to predict the likelihood of illness or consumption behaviour.

These large asymmetries of knowledge and power between lenders and borrowers, behavioural weaknesses of consumers and negative externalities to data processing highlight an evident market failure in datafied consumer credit markets.Footnote 109 Under these conditions, individualised, market-based solutions are inadequate to mitigate the identified risks to privacy/autonomy due to algorithmic credit scoring. Indeed, market failure undermines the effectiveness not only of consumer-helping solutions offered by third parties, but also individualised, market- and rights-based remedies under existing data protection and consumer credit law – such as individual consent as a lawful basis for data processing, the right to access credit reports, and the “right” to receive information about the logic underlying an automated decision.

These mechanisms assume that consumers can rationally weigh up all of the potential harms and benefits due to the processing of their data – including future harms to themselves and the systemic effects on other consumers – and exercise their data protection rights accordingly, such as by withholding consent to data processing (even under an opt-in model). As we have discussed, however, there are behavioural and structural obstacles that impede the effective exercise of these rights by consumers. As such, strengthening procedural and informational remedies – for example, through more detailed disclosure of “alternative data” in consumer credit reports, or more detailed explanation of the logic involved in an algorithmic credit decision – is unlikely to provide a satisfactory solution.Footnote 110

In this regard, although the recent move to include limited categories of alternative data in UK statutory credit reports (namely, utility and mobile phone payment data) is certainly helpful, it does not overcome the fundamental limits of informational remedies in safeguarding consumer privacy and autonomy. Furthermore, consumer self-help – including through recourse to credit reports – is likely to produce regressive distributional effects given that wealthier, more financially and technologically literate consumers are generally more adept at privacy self-management, inter alia through the adoption of PETs. Self-help remedies can also generate deadweight efficiency losses by igniting a technological race between firms, seeking to extract and manipulate consumer data, and consumers, seeking to defend and control the use of their data.Footnote 111

Additionally, or alternatively, firms can embed privacy and data protection into the design of their products, for example by making data minimisation and privacy-preserving ML techniques, such as federated learning, the default.Footnote 112 Indeed, there is nascent evidence of anonymisation and synthetic data techniques being used by consumer credit and other financial services firms.Footnote 113 Likewise, the leading web browsers have committed to phasing out the use of third-party cookies, thereby reducing the scope for consumer tracking; WhatsApp and Apple iOS use end-to-end encryption by default.Footnote 114 However, the potential effectiveness of these self-regulatory measures is limited by the market failure in datafied consumer credit markets, described above. In the presence of steep asymmetries of information, negative externalities to data processing and behavioural bias, consumers will still be willing to hand over their data in spite of the harms to their privacy and autonomy, and that of other consumers.

As such, the market alone does not create strong enough incentives for firms to optimise for privacy over profit – for example, by favouring the use of de-identified data over personally identifiable data, when the latter offer greater predictive, and therefore commercial, value. The inadequacy of individualised, market-based solutions in safeguarding consumer privacy in the context of algorithmic credit scoring thus calls for a stronger collective response, through stricter regulatory obligations on data controllers and processors to protect consumer privacy.

B. Regulating Consumer Financial Privacy

In closing, we will consider how regulation could be designed to better safeguard consumer privacy and autonomy in the context of algorithmic credit scoring, whilst still capturing some of its efficiency and fairness gains.Footnote 115 Clearly, this Goldilocks problem inheres in UK (and EU) data-protection regulation, with its twin economic and social objectives of, on the one hand, facilitating market integration through the free flow of personal data and, on the other hand, protecting fundamental rights.Footnote 116 In addition to the individual, market- and rights-based data protection mechanisms examined above, the GDPR imposes several obligations on data processors and controllers that should have the effect of balancing the efficiency gains due to the processing of personal data with the protection of fundamental rights, particularly the rights to privacy and data protection.

Notably, data controllers are required to carry out a DPIA whenever an intended data processing is likely to result in a “high risk to the rights and freedoms of natural persons”,Footnote 117 as well as implement “data protection by design and default” based on an assessment of the “risks of varying likelihood and severity for rights and freedoms of natural persons posed by the processing”.Footnote 118 Furthermore, where personal data are processed on “legitimate interests” grounds, the controller is required to balance its interests and the necessity of processing against the “interests or fundamental rights and freedoms of the data subject which require protection of personal data”.Footnote 119 Reliance on this ground should be accompanied by a “legitimate interests assessment” (LIA).Footnote 120

Indeed, to the extent that these obligations prioritise the protection of fundamental rights, they appear to go beyond a purely utilitarian, risk-based approach – in which the consequential benefits and harms of personal data processing are weighed up, neutrally – in favour of a more deontological, rights-based approach in which both actual and potential risks to fundamental rights, particularly the rights to privacy and data protection, are afforded primacy.Footnote 121

In practice, however, these provisions have been inadequate for safeguarding consumer privacy and thus achieving an appropriate normative balance in the processing of personal data, particularly in the context of algorithmic credit scoring. Whilst there is anecdotal evidence that lenders are carrying out DPIAs, these are not made public, and so it is unclear whether, and if so how, lenders are actually assessing and mitigating the privacy risks of algorithmic credit scoring. The fact that algorithmic credit scoring is only becoming more prevalent suggests that lenders may be treating the DPIA (and LIA) as a tick-box compliance exercise,Footnote 122 where the privacy risks are either not ultimately deemed to be “high risks”, or are otherwise deemed “necessary and proportionate” to the purposes of processing, and sufficiently well-mitigated and documented.Footnote 123

It is also unclear whether lenders are consulting the ICO pursuant to Article 36 of the GDPR when a DPIA reveals a high risk to the fundamental rights of data subjects that cannot be adequately mitigated – and, if so, whether the ICO has taken any regulatory action in response. Likewise, although there is anecdotal evidence of lenders adopting data-protection-by-design measures such as data anonymisation and synthetic data, these practices remain nascent.

The relative failure of DPIA and data-protection-by-design obligations to mitigate the privacy risks to credit consumers is in large part due to weaknesses in regulatory design and enforcement.Footnote 124 The GDPR puts the onus on data controllers and processors to implement these obligations and affords them considerable discretion in how they do so. However, to be effective in holding firms accountable, this more discretionary, process-oriented and principles-based approach needs to be accompanied by close regulatory oversight of, and engagement with, regulated firms, including through effective regulatory guidance and enforcement.Footnote 125

In this regard, whilst the Article 29 Working Party and the ICO have issued guidance on DPIAs, including a generic DPIA template and data protection guidance for the use of AI, there is no specific template or guidance for DPIAs in the consumer credit context, nor for the financial sector more generally.Footnote 126 Sector-specific DPIA guidance and templates have been developed, for example, for radio frequency identification (RFID) applications, smart grids and surveillance camera systems, the last of which was jointly developed by the ICO and the Surveillance Camera Commissioner.Footnote 127 Likewise, there is no guidance on the implementation of data-protection-by-design or the data protection principles in the (consumer) financial context.

The data protection “gap” in consumer credit markets can thus partly be filled through more tailored data protection guidance for consumer credit firms. Ideally, this guidance should be produced jointly by the FCA and ICO and should clarify that DPIAs are mandatory for algorithmic credit scoring and the use of other data-driven risk assessment technologies in consumer credit markets (i.e. as processing that meets the “high risk” threshold).Footnote 128

In addition, the FCA should establish more granular obligations for firms to prove the necessity and proportionality of personal data processing. Inter alia, credit providers should be required to prove, ex ante and on an ongoing basis, that the proposed processing of personal data – for the development and use of proprietary consumer credit scoring models as well as those supplied by third-party vendors – makes a sufficiently significant improvement to the accuracy of creditworthiness assessment to justify such processing.Footnote 129 The FCA's regulatory guidance could also usefully specify the types of technical and organisational measures that consumer credit firms should take to mitigate privacy risks – for example, data anonymisation using differential privacy techniques (sketched below),Footnote 130 temporal limits on data retention, and decentralised architectures for processing data.Footnote 131
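To make the reference to differential privacy more concrete, the following is a minimal, illustrative sketch in Python of the Laplace mechanism discussed by Dwork et al. (note 130 above): random noise, calibrated to how much any one individual's record could move a statistic, is added before that statistic is released. The dataset, parameter values and function names are hypothetical and chosen purely for illustration; they do not describe any lender's actual practice.

```python
# Illustrative sketch only: the Laplace mechanism for differentially private
# aggregate statistics, applied to a hypothetical dataset of consumer
# repayment records. All figures are invented for illustration.
import numpy as np

rng = np.random.default_rng(seed=42)

# Hypothetical attribute: monthly repayments (in GBP) for 1,000 consumers.
repayments = rng.uniform(low=0, high=500, size=1_000)

def dp_mean(values: np.ndarray, lower: float, upper: float, epsilon: float) -> float:
    """Release the mean of `values` with epsilon-differential privacy.

    Each value is clipped to [lower, upper], so one individual's record can
    shift the mean by at most (upper - lower) / n -- the "sensitivity" to
    which the Laplace noise is calibrated.
    """
    clipped = np.clip(values, lower, upper)
    n = len(clipped)
    sensitivity = (upper - lower) / n
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    return float(clipped.mean() + noise)

# A stricter privacy budget (smaller epsilon) means noisier, less precise output.
print("true mean:        ", round(float(repayments.mean()), 2))
print("DP mean (eps=1.0):", round(dp_mean(repayments, 0, 500, epsilon=1.0), 2))
print("DP mean (eps=0.1):", round(dp_mean(repayments, 0, 500, epsilon=0.1), 2))
```

The privacy parameter epsilon makes the trade-off at the heart of this article explicit: a smaller epsilon yields stronger protection for individual consumers, but noisier – and therefore less commercially useful – outputs.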

More importantly, FCA guidance should clarify how firms are to reconcile their often-conflicting obligations under data protection and consumer credit regulation. For example, in the context of creditworthiness assessment, credit providers are required to carry out a “reasonable” assessment on the basis of “sufficient” information “proportionate” to the individual circumstances of each case.Footnote 132 Whilst these requirements would in some instances have the practical effect of limiting personal data processing, to the extent that such processing is not a “proportionate” means of assessing credit risk or affordability, they are not underpinned by the goal of protecting consumer privacy. Thus, under consumer credit regulation, firms have no direct obligation to minimise harm to consumer privacy or other fundamental rights, which conflicts with their obligations under data protection regulation.

Moreover, the obligations of reasonableness, sufficiency and proportionality only bite when the lender carries out the creditworthiness assessment, at which point much of the harm to consumer privacy and autonomy has already occurred, due to the prior collection and processing of personal data for the development of the algorithmic credit scoring model. In contrast, the LIA, DPIA and data-protection-by-design obligations arise earlier in the data processing cycle – at the point at which the lender, or a third-party vendor, is designing an algorithmic credit scoring model and contemplating data processing. The latter approach is significantly more conducive to the protection of consumer privacy and autonomy. Thus, in order to more effectively safeguard consumer data and privacy, consumer credit regulation must clearly integrate and, more importantly, prioritise firms’ data-protection obligations, particularly the conduct of LIAs/DPIAs and the implementation of data-protection-by-design.

In tandem, the institutional architecture for data protection regulation in consumer credit markets needs to be strengthened. The ICO has recently taken (limited) enforcement action against unlawful data sharing by CRA-data brokers, notably Experian, thus cutting off some of the ammunition for algorithmic credit scoring and its privacy harms.Footnote 133 However, the ICO appears to have taken no enforcement action against, nor conducted any investigations into, the use of algorithmic credit scoring by consumer credit firms specifically – nor related practices in the (consumer) financial sector.Footnote 134 This may be due to a positive assessment by the ICO that these firms are complying with their data protection obligations with respect to algorithmic credit scoring. More likely, it reflects a lack of assessment, due inter alia to limited resources and low prioritisation of the matter, the heterogeneity of algorithmic credit scoring techniques and slower uptake by larger firms as opposed to smaller start-ups, a lack of proximity to consumer credit firms and/or limited subject matter expertise.

One of the ways to strengthen the enforcement of data protection regulation in consumer credit markets, and thus help address the data protection gap, is to expand the role of the FCA.Footnote 135 As a sectoral regulator that has a close relationship with consumer credit firms, the FCA is likely to be in a better position to monitor and enforce their data protection obligations. The FCA also has greater sectoral expertise and experience to assess whether firms are making appropriate value trade-offs, for example under a DPIA. Moreover, data protection and consumer privacy naturally fall within the FCA's existing consumer protection mandate. Indeed, data protection is consumer protection. The FCA also benefits from stronger financial penalty powers than the ICO.Footnote 136

At the same time, however, the ICO has considerable cross-sectoral experience and expertise in data protection regulation. Furthermore, the FCA's greater proximity to consumer-credit firms carries a greater risk of regulatory capture and forbearance. This suggests that the optimal institutional arrangement involves continued collaboration between the FCA and ICO, as the sectoral and cross-sectoral regulator respectively, albeit with increased oversight by the FCA.Footnote 137

Additionally, and to complement guidance from public regulatory authorities, data protection guidelines and best practices could helpfully be developed by the consumer credit industry.Footnote 138 Indeed, the GDPR encourages the development of codes of conduct for data processing.Footnote 139 These could be incorporated into existing industry self-regulation, such as the Standards of Lending Practice (formerly known as the Lending Code).Footnote 140

***

The reforms discussed thus far have focused on strengthening existing data protection obligations, particularly DPIAs. However, these reforms would still give firms considerable discretion in balancing competing regulatory norms in the context of algorithmic credit scoring. The question arises whether these reforms are sufficient to protect consumer privacy, and thus to achieve an appropriate balance between the norms of algorithmic credit scoring. Indeed, the negative externalities to data processing and the temporal asymmetry in the materialisation of privacy harms suggest that a case-by-case, contextual assessment of risks under a DPIA may not be adequate to mitigate systemic harms, in particular harms to other consumers and in other contexts. Likewise, the use of data de-identification techniques – even formally robust techniques such as differential privacy – does not mitigate the privacy harms due to group-level profiling, pervasive data collection and surveillance. Moreover, the effectiveness of these techniques is limited to the extent that they are not adopted by all data processors.

The more discretionary, principles-based approach espoused by the DPIA also risks producing regressive distributional effects. Notably, the processing of alternative data is more likely to yield a “sufficiently significant” improvement in predictive accuracy for assessing the creditworthiness of thin-file borrowers, given their lack of traditional financial credit data. Conversely, alternative data has less marginal utility for “thick-file” borrowers, given that most of the predictive signal of credit risk can be discerned from their longer credit histories.Footnote 141 Such a two-tier system – whereby the privacy of the wealthy is better safeguarded than that of the poor – would in itself undermine distributional fairness and is therefore undesirable. It would be aggravated by the fact that wealthier, more financially and technologically literate consumers are also more adept at privacy self-management, as discussed above.

From this perspective, in order to adequately safeguard the privacy and autonomy of all (credit) consumers, a firmer normative rebalancing in favour of consumer privacy is needed, through hard limits on the types and granularity of personal data that can be used for algorithmic credit scoring. Notably, the FCA should prohibit the processing for algorithmic credit scoring of certain types of data – such as relationship, fitness and social media data – that are feature-rich and more likely to reveal personal details intrinsic to a consumer's identity. Commodifying these data through practices such as algorithmic credit scoring may be seen to harm consumer dignity, debasing human activities such as dating, exercising and socialising by subjecting them to a commercial logic.Footnote 142 Likewise, the FCA should mandate the use of certain data minimisation and de-identification techniques, such as differential privacy.

This more rules-based approach would represent a tightening of existing UK (and EU) data protection regulation, which places stricter restrictions on – but does not prohibit outright – the processing of “sensitive” personal data, such as health data.Footnote 143 It would also depart from the FCA's primarily principles- and conduct-based approach to creditworthiness assessment. And there are downsides to a more rules-based approach. Inter alia, rules are blunter regulatory tools, easier to “game” and less agile in the face of rapid innovation and technological change.Footnote 144 In the context of algorithmic credit scoring specifically, mandating data de-identification or prohibiting ex ante the processing of certain types of data cuts off potential efficiency and fairness gains due to the processing of more granular personal data. Likewise, it is difficult for regulators to determine ex ante which types of data should be prohibited, given that detailed inferences about consumers can be drawn from a range of unrelated and superficially non-“intrinsic” data, such as weather data. And, as noted, the use of de-identification techniques does not obviate all privacy harms, for example due to pervasive data collection and surveillance and group-level profiling based on demographically identifiable information.Footnote 145

Notwithstanding these constraints, the proposed measures would more effectively reduce the scale of personal data processing and the scope for intimate and unintuitive inferences about consumers due to algorithmic credit scoring. To that extent, they offer stronger protection for consumer privacy and autonomy compared to the status quo. Moreover, as discussed in Section IV, limiting the processing of personal data in this way also stands to support allocative efficiency and distributional fairness in consumer-credit markets, by reducing the scope for inefficient and unfair forms of data-driven discrimination and exploitation of vulnerable consumers.

As such, the proposed reforms further the goal of attaining a more appropriate normative balance in the regulation of algorithmic credit scoring, one that adequately respects the norms of privacy and autonomy. Of course, to truly safeguard the privacy of (credit) consumers, stricter limits on the processing of personal data need to be built into data protection regulation at the cross-sectoral level – applied in all consumption contexts, not only consumer credit markets, and to all actors in the development lifecycle of consumer-facing information systems, not limited to data processors and controllers as defined under the GDPR.Footnote 146

VI. Conclusion

This article has developed a normative framework to structure the regulatory analysis of algorithmic credit scoring. Drawing on the principal sectoral and cross-sectoral regulatory frameworks that govern algorithmic credit scoring in the UK, it has deduced a frame of analysis bound by three, key normative goals: allocative efficiency, distributional fairness and consumer privacy (autonomy). An examination of algorithmic credit scoring with respect to each of these goals paints a nuanced picture, with scope for both normative trade-offs and alignment.

Nascent empirical data on algorithmic credit scoring offer a useful heuristic for regulators in navigating these normative contests. In particular, evidence of increased accuracy of creditworthiness assessment together with improved access to credit and reduced discrimination in certain consumer lending markets in which algorithmic credit scoring is present dispels some of the anticipated trade-offs between allocative efficiency and distributional fairness. Rather, the trickier normative trade-offs arise between allocative efficiency and/or distributional fairness on the one hand, and privacy/autonomy on the other. Here, regulators face a Goldilocks challenge: to achieve just the right level of data protection in order to support consumer privacy and autonomy, as well as efficiency and fairness in consumer credit markets.

This article has argued that the existing regulatory frameworks governing algorithmic credit scoring in the UK do not deliver a satisfactory solution to these trade-offs, and therefore do not strike an appropriate normative balance. In light of the market failure in datafied consumer credit markets, existing individualistic, market and rights-based mechanisms such as “informed consent” and data subject access rights are inadequate to safeguard consumers’ privacy and autonomy. Furthermore, existing balancing mechanisms under data protection regulation – notably, DPIAs – fall short in safeguarding consumer privacy due to a lack of sector-specific guidance and under-enforcement. This has resulted in a data protection gap in consumer credit markets.

In order to fill this gap, institutional as well as substantive regulatory reforms are needed. At the institutional level, the FCA needs to assume a more prominent role in overseeing and enforcing data protection regulation in consumer credit markets. In an increasingly datafied economy, the optimal institutional arrangement for data protection regulation entails a larger role for sectoral regulators (here, the FCA), and deeper collaboration between sectoral and cross-sectoral regulators. At the substantive level, the FCA should develop sector-specific guidance to assist consumer credit firms with the implementation of their data protection obligations.

Furthermore, the FCA should establish more granular obligations for firms to prove the necessity and proportionality of personal data processing, through stricter ongoing model validation and data quality verification requirements. These requirements also need to take account of the “explainability” challenges of black-box ML models. More fundamentally, serious consideration needs to be given to imposing strict, ex ante limits on the types and granularity of personal data that can be used for algorithmic credit scoring, and consumer credit decision-making more generally.

Introducing a stricter upper bound on the use of personal data in this way would represent a “rights turn” in the regulation of consumer creditworthiness assessment, and therefore consumer credit regulation more generally, whereby the protection of consumer privacy and data protection are afforded normative primacy. This turn is necessary in view of the rapid datafication of consumer credit markets, and thus the proliferating threat to consumer privacy and autonomy.

We need to take a collective, long-term perspective when making these normative choices and designing an appropriate regulatory framework. Whilst some borrowers (notably, thin- and no-file borrowers) would arguably lose in the short term from a prohibition on the use of certain types of alternative data that are predictive of their creditworthiness, this compromise is necessary if, collectively, we want to resist the advance of a surveillance society in which all of our activities are monitored and measured, and predictions based on those data are used to influence and control every area of our lives – not only lending decisions.

Footnotes

*

DPhil candidate, University of Oxford.

I would like to thank the following for their very helpful comments on earlier versions of this article: John Armour, Dan Awrey, Ryan Calo, Ignacio Cofone, Hugh Collins, Horst Eidenmüller, Mitu Gulati, Geneviève Helleringer, Ian Kerr, Bettina Lange, Tom Melham, Jeremias Prassl, Srini Sundaram, David Watson, participants in the Virtual Workshop on ML and Consumer Credit and workshops at the University of Montreal, McGill University, Singapore Management University, University of Luxembourg, Sciences Po, European University Institute, and University of Oxford (Faculty of Law and Oxford Internet Institute).

References

1 I. Goodfellow, Y. Bengio and A. Courville, Deep Learning (Cambridge, MA 2016), 2–8; Y. LeCun, Y. Bengio and G. Hinton, “Deep Learning” (2015) 521 Nature 436.

2 V. Mayer-Schönberger and K. Cukier, Big Data: A Revolution that Will Transform How We Live, Work and Think (London 2013), 73–97.

3 The phenomenon of analysing large and complex data sets is also commonly called “Big Data” (ibid.).

4 R. O'Dwyer, “Are You Creditworthy? The Algorithm Will Decide”, available at https://bit.ly/34WoTEs (last accessed 5 November 2020). Alternatively referred to as “big data scoring” or “alternative data scoring”: M. Hurley and J. Adebayo, “Credit Scoring in the Era of Big Data” (2016) 18 Yale Journal of Law and Technology 148.

5 E.g. A.E. Khandani, A.J. Kim and A.W. Lo, “Consumer Credit Risk Models via Machine Learning Algorithms” (2010) 34 Journal of Banking and Finance 2767.

6 E.g. D. Citron and F.A. Pasquale, “The Scored Society: Due Process for Automated Predictions” (2014) 89 Washington Law Review 1; C. O'Neil, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy (London 2016), 179–202.

7 N. Capon, “Credit Scoring Systems: A Critical Analysis” (1982) 46(2) Journal of Marketing 82; J. Lauer, Creditworthy: A History of Consumer Surveillance and Financial Identity in America (New York 2017).

8 E. Goffman, Frame Analysis: An Essay on the Organization of Experience (New York 1974).

9 Financial Services and Markets Act 2000, s. 1C.

10 Directive (EC) No 2008/48 (OJ 2008 L 133 p.66) (“Consumer Credit Directive”) and Regulation (EU) No 2016/679 (OJ 2016 L 119 p.1), respectively.

11 European Union (Withdrawal) Act 2018, s. 3; The Consumer Credit (Amendment) (EU Exit) Regulations 2018.

12 FCA Handbook, Consumer Credit Sourcebook (CONC) sections 5.2A (Creditworthiness assessment) and 5.5A (Creditworthiness assessment: P2P agreements) and Consumer Credit Directive, art. 8. See also FCA, “Assessing Creditworthiness in Consumer Credit – Feedback on CP 17/27 and Final Rules and Guidance”, available at https://bit.ly/2SS9ijA (last accessed 5 November 2020).

13 Capon, “Credit Scoring Systems”, 82.

14 L.C. Thomas, Consumer Credit Models: Pricing, Profit and Portfolios (Oxford 2009), 5–9.

15 Pursuant to the “Principles of Reciprocity”, available at https://scoronline.co.uk/key-documents/ (last accessed 5 November 2020).

16 Thomas, Consumer Credit Models, 63ff.

17 Equifax, “How are Credit Scores Calculated?”, available at https://bit.ly/2I7ppm6 (last accessed 5 November 2020).

18 W. Dobbie and P.M. Skiba, “Information Asymmetries in Consumer Credit Markets: Evidence from Payday Lending” (2013) 5(4) American Economic Journal: Applied Economics 256.

19 See BoE-FCA, “Machine Learning in UK Financial Services”, available at https://bit.ly/3l3YITa (last accessed 5 November 2020). However, note the methodological constraints (at 7).

20 T. O'Neill, “The Birth of Predictor – Machine Learning at Zopa”, available at https://perma.cc/8EXJ-JETA; J. Deville, “Leaky Data: How Wonga Makes Lending Decisions”, available at https://perma.cc/D9SB-TXDX; CGFS and FSB, “FinTech Credit: Market Structure, Business Models and Financial Stability Implications”, available at https://bit.ly/2W6yGF5, 3–6 (all last accessed 5 November 2020).

21 M. Bazarbash, “FinTech in Financial Inclusion: Machine Learning Applications in Assessing Credit Risk”, available at https://bit.ly/2U2zckG, 13–23 (last accessed 5 November 2020).

22 M.A. Bruckner, “The Promise and Perils of Algorithmic Lenders’ Use of Big Data” (2018) 93 Chicago-Kent Law Review 3, 11–17 (distinguishing between two phases of “algorithmic lending”).

23 S. Rahman, “Combining Machine Learning with Credit Risk Scorecards”, available at https://bit.ly/2JDKObw (last accessed 5 November 2020).

24 T. Berg et al., “On the Rise of FinTechs – Credit Scoring Using Digital Footprints”, available at https://ssrn.com/abstract=3163781 (last accessed 5 November 2020); D. Björkegren and D. Grissen, “Behaviour Revealed in Mobile Phone Usage Predicts Credit Repayment” (2020) 34(3) The World Bank Economic Review 618.

25 Hurley and Adebayo, “Credit Scoring”, 168–83.

26 J.A. Sirignano, A. Sadwhani and K. Giesecke, “Deep Learning for Mortgage Risk”, available at https://arxiv.org/abs/1607.02470 (last accessed 5 November 2020).

27 G. Morgenson, “Held Captive by Flawed Credit Reports”, New York Times, at https://nyti.ms/2QaYxrG (last accessed 5 November 2020).

28 I. Berlin, “Two Concepts of Liberty” in Four Essays on Liberty (Oxford 1969).

29 CONC 5.2A.10ff. and 5.5A.11ff. (for p2p agreements); FCA, “Preventing Financial Distress by Predicting Unaffordable Consumer Credit Agreements: An Applied Framework”, available at https://bit.ly/33eVrs3 (last accessed 5 November 2020).

30 J. Armour et al., Principles of Financial Regulation (Oxford 2016), 53–54.

31 FCA, “Guidance for Firms on the Fair Treatment of Vulnerable Consumers”, available at https://bit.ly/351bcEc (last accessed 5 November 2020).

32 FCA, “Preventing Financial Distress”, 13–14.

33 O. Bar-Gill, Seduction by Contract (Oxford 2012), 51ff.

34 Armour et al., Principles, 222–23.

35 J. Rawls, “Justice as Fairness: Political not Metaphysical” (1985) 14 Philosophy and Public Affairs 223.

36 Armour et al., Principles, 51–80; J. Stiglitz, “Regulation and Failure” in D. Moss and J. Cisternino (eds.), New Perspectives on Regulation (Cambridge, MA 2009).

37 J.Y. Campbell et al., “Consumer Financial Protection” (2011) 25(1) Journal of Economic Perspectives 91.

38 Charter of Fundamental Rights of the European Union (OJ 2012 C 326 p.391) (EU Charter), arts. 7, 8; Treaty on the Functioning of the European Union (OJ 2016 C 202 p. 1) (Consolidated), art. 16.

39 Regulation (EU) 2016/679 on the Protection of Natural Persons with Regard to the Processing of Personal Data and on the Free Movement of Such Data (OJ 2016 L 119 p.1) (GDPR), art. 5(1).

40 Ibid., art. 6.

41 Ibid., art. 25.

42 Ibid., art. 35.

43 Consumer Credit Act 1974 (CCA), ss. 157–159; GDPR, arts. 14–16.

44 GDPR, recital 71, arts. 21, 22.

45 Ibid., arts. 13(2)(f), 14(2)(g), 15(1)(h).

46 L. Floridi, “The Informational Nature of Personal Identity” (2011) 21 Minds and Machines 549; J. Cheney-Lippold, We Are Data: Algorithms and the Making of Our Digital Selves (New York 2017).

47 A. Rouvroy and Y. Poullet, “The Right to Informational Self-determination and the Value of Self-development: Reassessing the Importance of Privacy for Democracy” in S. Gutwirth et al. (eds.), Reinventing Data Protection? (Dordrecht and London 2009).

48 GDPR, art. 9.

49 O. Lynskey, The Foundations of EU Data Protection Law (Oxford 2015), 89–130.

50 S. Warren and L. Brandeis, “The Right to Privacy” (1890) 4 Harv.L.Rev. 193.

51 As conceived in much of the (law and) economics literature on privacy: e.g. R. Posner, “The Economics of Privacy” (1981) 71(2) American Economic Review 405.

52 Judgment of 15 December 1983, 1 BvR 209/83, BVerfG 65, 1.

53 Berlin, “Two Concepts of Liberty”; J.E. Cohen, “What Privacy Is For” (2013) 126 Harv.L.Rev. 1904.

54 EU Charter, art. 8(2) and GDPR art. 5(1)(a) (the fairness principle); D. Clifford and J. Ausloos, “Data Protection and the Role of Fairness”, available at https://ssrn.com/abstract=3013139 (last accessed 5 November 2020).

55 GDPR, art. 1(3) and recitals 2–6, 13; Lynskey, Foundations, 46–88.

56 E. Posner and R.M. Hynes, “The Law and Economics of Consumer Finance”, available at https://ssrn.com/abstract=261109 (last accessed 5 November 2020).

57 J. Stiglitz and A. Weiss, “Credit Rationing in Markets with Imperfect Information” (1981) 71(3) American Economic Review 393; L. Einav, M. Jenkins and J. Levin, “The Impact of Credit Scoring on Consumer Lending” (2013) 44 RAND Journal of Economics 249.

58 J. Stiglitz and A. Weiss, “Asymmetric Information in Credit Markets and Its Implications for Macro-economics” (1992) 44 Oxford Economic Papers 694.

59 W. Adams, L. Einav and J. Levin, “Liquidity Constraints and Imperfect Information in Subprime Lending” (2009) 99(1) American Economic Review 49.

60 Experian, “5.8m Are Credit Invisible, and 2.5m Are Excluded from Finance by Inaccurate Data. How Data and Analytics Can Include All”, available at https://bit.ly/38AS9QQ (last accessed 5 November 2020).

61 A. Fuster et al., “Predictably Unequal? The Effects of Machine Learning on Credit Markets”, available at https://ssrn.com/abstract=3072038; J. Jagtiani and C. Lemieux, “The Roles of Alternative Data and Machine Learning in Fintech Lending: Evidence from the Lending Club Consumer Platform”, available at https://ssrn.com/abstract=3178461 (both last accessed 5 November 2020).

62 Berg et al., “On the Rise of Fintechs”; J. Jagtiani and C. Lemieux, “Do Fintech Lenders Penetrate Areas that Are Underserved by Traditional Banks”, available at https://ssrn.com/abstract=3178459 (last accessed 5 November 2020).

63 S. Barocas and A. Selbst, “Big Data's Disparate Impact” (2016) 104 Calif.L.Rev. 671, 677–93; J. Kleinberg et al., “Human Decisions and Machine Predictions” (2018) 133 Quarterly Journal of Economics 237.

64 S. Regan et al., “Model Behaviour: Nothing Artificial – Emerging Trends in the Validation of Machine Learning and Artificial Intelligence Models”, available at https://accntu.re/2HQcFzi; P. Bracke et al., “Machine Learning Explainability in Finance: An Application to Default Risk Analysis”, available at https://bit.ly/2TyIk0d (both last accessed 5 November 2020).

65 J. Danielsson, R. Macrae and A. Uthemann, “Artificial Intelligence and Systemic Risk”, available at http://dx.doi.org/10.2139/ssrn.3410948 (last accessed 5 November 2020).

66 M. Adelson, “A Journey to the Alt-A Zone: A Brief Primer on Alt-A Mortgage Loans”, available at https://bit.ly/2U5O2af (last accessed 5 November 2020).

67 J. Jagtiani, L. Lambie-Hanson and T. Lambie-Hanson, “Fintech Lending and Mortgage Credit Access”, available at https://doi.org/10.21799/frbp.wp.2019.47 (last accessed 5 November 2020), 1.

68 J. Hirshleifer, “The Private and Social Value of Information and the Reward to Inventive Activity” (1971) 61(4) American Economic Review 561.

69 Posner and Hynes, “Law and Economics of Consumer Finance”; Armour et al., Principles, 207–12.

70 O. Bar-Gill, “Algorithmic Price Discrimination: When Demand Is a Function of Both Preferences and (Mis)Perceptions” (2019) 86 U.Chi.L.Rev. 217; FCA, “Price Discrimination in Financial Services: How Should We Deal With Questions of Fairness?”, available at https://bit.ly/2W783jl (last accessed 5 November 2020).

71 R. Calo, “Digital Market Manipulation” (2014) 82 George Washington Law Review 995; G. Wagner and H. Eidenmüller, “Down by Algorithms? Siphoning Rents, Exploiting Biases and Shaping Preferences: The Dark Side of Personalized Transactions” (2019) 86 U.Chi.L.Rev. 581.

72 A. Kurakin, I. Goodfellow and S. Bengio, “Adversarial Machine Learning at Scale”, available at https://arxiv.org/abs/1611.01236 (last accessed 5 November 2020).

73 A. Acquisti, “The Economics and Behavioural Economics of Privacy” in J. Lane et al. (eds.), Privacy, Big Data, and the Public Good: Frameworks for Engagement (Cambridge 2014), 83–84.

74 M.S. Gal and O. Aviv, “The Competitive Effects of the GDPR” (2020) 16 Journal of Competition Law and Economics 349.

75 Berg et al., “On the Rise of Fintechs”, 34–35.

76 O. Khan, “Financial Exclusion and Ethnicity: An Agenda for Research and Policy Action”, available at https://bit.ly/31aJofv (last accessed 5 November 2020).

77 A. Roussi, “Kenyan Borrowers Shamed by Debt Collectors Chasing Silicon Valley Loans”, Financial Times, available at https://on.ft.com/2FtPY95 (last accessed 5 November 2020).

78 S. Deku, A. Kara and P. Molyneux, “Exclusion and Discrimination in the Market for Consumer Credit” (2016) 22 European Journal of Finance 941 (finding evidence of discrimination in consumer credit against non-White households in the UK).

79 Equality Act 2010, s. 13 (prohibition on direct discrimination).

80 J. Kleinberg et al., “Discrimination in the Age of Algorithms” (2018) 10 Journal of Legal Analysis 113.

81 Equality Act 2010, s. 19 (prohibition on indirect discrimination).

82 Barocas and Selbst, “Big Data's Disparate Impact”, 681–87.

83 US Bureau for Consumer Financial Protection, “Request for Information Regarding Use of Alternative Data and Modeling Techniques in the Credit Process”, available at https://bit.ly/2IMH7NK (last accessed 5 November 2020), 1186.

84 R. Bartlett et al., “Consumer Lending Discrimination in the Fintech Era”, available at https://ssrn.com/abstract=3063448 (last accessed 5 November 2020).

85 Fuster et al., “Predictably Unequal?”.

86 See also T.B. Gillis, “False Dreams of Algorithmic Fairness: The Case of Credit Pricing”, available at https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3571266 (last accessed 5 November 2020), 37–40.

87 P. Swire, “Financial Privacy and the Theory of High-Tech Government Surveillance” (1999) 77 Washington University Law Quarterly 461, 473–75.

88 On the subjective-objective dichotomy, see R. Calo, “The Boundaries of Privacy Harm” (2011) 86 Indiana Law Journal 1131.

89 Douglas Merrill, CEO of ZestAI in Q. Hardy, “Just the Facts: Yes, All of Them” (2012), New York Times, available at https://nyti.ms/37QQmuj (last accessed 5 November 2020).

90 N.G. Packin and Y. Lev Aretz, “On Social Credit and the Right to Be Unnetworked” (2016) Columbia Business Law Review 339.

91 N. Aggarwal, “Big Data and the Obsolescence of Consumer Credit Reports”, available at https://www.law.ox.ac.uk/business-law-blog/blog/2019/07/big-data-and-obsolescence-consumer-credit-reports (last accessed 5 November 2020); S. Wachter and B. Mittelstadt, “A Right to Reasonable Inferences: Re-thinking Data Protection Law in the Age of Big Data and AI” (2019) 2 Columbia Business Law Review 494, 505–14.

92 F.A. Pasquale, The Black Box Society: The Secret Algorithms that Control Money and Information (Cambridge, MA 2015); S. Zuboff, The Age of Surveillance Capitalism (London 2019).

93 R. Calo, “Privacy and Markets: A Love Story” (2016) 91 Notre Dame Law Review 649; Bar-Gill, “Algorithmic Price Discrimination”.

94 R.B. Avery, K.P. Brevoort and G.B. Canner, “Does Credit Scoring Produce a Disparate Impact?” (2012) 40 Real Estate Economics S65, 2.

95 For related quantitative approaches to resolving value trade-offs in ML, see e.g. J. Kleinberg, S. Mullainathan and M. Raghavan, “Inherent Trade-offs in the Fair Determination of Risk Scores”, available at https://arxiv.org/abs/1609.05807 (last accessed 5 November 2020); E. Rolf et al., “Balancing Competing Objectives with Noisy Data: Score-based Classifiers for Welfare-aware Machine Learning”, available at https://arxiv.org/abs/2003.06740 (last accessed 5 November 2020).

96 I. Goldberg, “Privacy Enhancing Technologies for the Internet III: Ten Years Later”, in A. Acquisti et al. (eds.), Digital Privacy: Theory, Technologies and Practices (New York 2007).

97 “Tor Project”, available at https://www.torproject.org/ (last accessed 5 November 2020).

98 M. Gal and N. Elkin-Koren, “Algorithmic Consumers” (2017) 30(2) Harvard Journal of Law and Technology 309; FCA, “Applying Behavioural Economics at the Financial Conduct Authority”, available at https://bit.ly/33ghiit (last accessed 5 November 2020).

99 “Bloom”, available at https://bloom.co/ and “Mydex”, available at https://mydex.org/ (both last accessed 5 November 2020).

100 D.J. Solove, “Privacy Self-management and the Consent Dilemma” (2013) 126 Harv.L.Rev. 1880.

101 “Why Google Collects Data”, available at https://policies.google.com/privacy?hl=en-US#whycollect (last accessed 5 November 2020); K.J. Strandburg, “Monitoring, Datafication, and Consent: Legal Approaches to Privacy in the Big Data Context” in Lane et al., Privacy, 30.

102 O. Ben-Shahar and C. Schneider, More Than You Wanted to Know: The Failure of Mandated Disclosure (Princeton 2014).

103 A. Acquisti, “The Economics of Personal Data and the Economics of Privacy”, available at https://bit.ly/32JAaX6 (last accessed 5 November 2020), 25ff.

104 A. Kahn, “The Tyranny of Small Decisions: Market Failures, Imperfections and the Limits of Economics” (1966) 19 International Review for Social Sciences 23.

105 I.N. Cofone, “Nothing to Hide, but Something to Lose” (2020) 70 U.T.L.J. 64.

106 A. Seelye et al., “Computer Mouse Movement Patterns: A Potential Marker of Mild Cognitive Impairment” (2015) 1 Alzheimers Dement (Amst) 472.

107 O. Ben-Shahar, “Data Pollution” (2019) 11 Journal of Legal Analysis 104; S. Barocas and H. Nissenbaum, “Big Data's End Run Around Anonymity and Consent” in Lane et al., Privacy, 44–75.

108 B. Mittelstadt, “From Individual to Group Privacy in Big Data Analytics” (2017) 30 Philosophy & Technology 475.

109 K.J. Strandburg, “Free Fall: The Online Market's Consumer Preference Disconnect” (2013) University of Chicago Legal Forum 95.

110 L. Edwards and M. Veale, “Slave to the Algorithm” (2017) 16 Duke Law and Technology Review 18, 67 (discussing the “transparency fallacy”).

111 Gal and Elkin-Koren, “Algorithmic Consumers”, 329; Wagner and Eidenmüller, “Down by Algorithms?”, 588–89.

112 R. Binns and V. Gallo, “Data Minimization and Privacy Preserving Techniques in AI Systems”, available at https://bit.ly/31cVftq (last accessed 5 November 2020).

113 See, for example, “Hazy”, available at https://hazy.com/industries (last accessed 5 November 2020).

114 N. Statt, “Apple Updates Safari's Anti-tracking Tech with Full Third-party Cookie Blocking”, The Verge, at https://bit.ly/2GXWTZ0 (last accessed 5 November 2020).

115 On policy trade-offs in the regulation of fintech more generally, see Y. Yadav and C. Brummer, “Fintech and the Innovation Trilemma” (2019) 107 Georgetown Law Journal 235.

116 Lynskey, Foundations, 76ff.

117 GDPR, arts. 35, 36.

118 GDPR, arts. 25, 28(1) (indirectly extending the obligation to data processors).

119 GDPR, art. 6(1)(f) and recital 47.

120 ICO, “How Do We Apply Legitimate Interests in Practice?”, available at https://bit.ly/32a8gEt (last accessed 5 November 2020).

121 A. Mantelero, “Comment to Article 35 and 36” in M. Cole and F. Boehm (eds.), Commentary on the General Data Protection Regulation (Cheltenham 2019); Article 29 Data Protection Working Party (A29), “Statement on the Role of a Risk-based Approach in Data Protection Legal Frameworks”, available at https://bit.ly/3nUYqzu (last accessed 5 November 2020).

122 Edwards and Veale, “Slave to the Algorithm”, 80.

123 Article 35(7), GDPR.

124 L. Bygrave, “Minding the Machine v2.0: The EU General Data Protection Regulation and Automated Decision-Making” in K. Yeung and M. Lodge (eds.), Algorithmic Regulation (Oxford 2019), 257.

125 J. Black, “Forms and Paradoxes of Principles-based Regulation” (2008) 3(4) Capital Markets Law Journal 425.

126 Article 29 Data Protection Working Party, “Guidelines on Data Protection Impact Assessment (DPIA) and Determining Whether Processing Is ‘Likely to Result in High Risk’ for the Purposes of Regulation 2016/679”, available at https://ec.europa.eu/newsroom/article29/item-detail.cfm?item_id=611236; ICO, “Data Protection Impact Assessments”, available at https://bit.ly/3nJlPUH and ICO, “Guidance on AI and Data Protection”, available at https://bit.ly/35ZckJ3 (all last accessed 5 November 2020).

127 See Surveillance Camera Commissioner, “Guidance: Data Protection Impact Assessments for Surveillance Cameras”, https://bit.ly/2SNQeTk (last accessed 5 November 2020).

128 See GDPR, Recital 75 and art. 35(3)(a); A29, “Guidelines on DPIA”, 8–12.

129 ICO, “Guidance on AI and Data Protection” (discussing different metrics of statistical accuracy and model performance).

130 C. Dwork et al., “Calibrating Noise to Sensitivity in Private Data Analysis” in S. Halevi and T. Rabin (eds.), Theory of Cryptography: Third Theory of Cryptography Conference (Berlin and New York 2006); M. Kearns and A. Roth, The Ethical Algorithm (Oxford 2020), 22–47.

131 For initial guidance, see ICO, “General Data Protection Regulation (GDPR) FAQs for Small Financial Service Providers”, available at https://bit.ly/3iRWAf9 (last accessed 5 November 2020).

132 CONC 5.2A.20R to CONC 5.2A.25G; CONC 5.5A.21R to CONC 5.5A.26G.

133 ICO, “ICO Takes Enforcement Action against Experian after Data Broking Investigation”, available at https://bit.ly/3oBu2um (last accessed 5 November 2020).

134 ICO, “Action We've Taken”, available at https://ico.org.uk/action-weve-taken/ (last accessed 5 November 2020).

135 For a related discussion in the US context, see E. Janger, “Locating the Regulation of Data Privacy and Data Security” (2010) 5 Brooklyn Journal of Corporate, Financial and Commercial Law 97.

136 The FCA can impose fines of up to 20 per cent of a firm's revenue (plus disgorgement); the ICO can impose fines up to 4 per cent of revenue (or EUR 20 million, whichever is higher).

137 Building on ICO and FCA, “Memorandum of Understanding Between the Information Commissioner and the Financial Conduct Authority”, at https://bit.ly/2H2ujVS (last accessed 5 November 2020).

138 D. Hirsch, “The Law and Policy of Online Privacy: Regulation, Self-regulation or Co-regulation?” (2010) 34 Seattle University Law Review 439.

139 GDPR, art. 40.

140 Lending Standards Board, “The Standards of Lending Practice”, available at https://bit.ly/2STkgW8 (last accessed 5 November 2020).

141 L. Gambacorta et al., “How do Machine Learning and Non-traditional Data Affect Credit Scoring? New Evidence from a Chinese Fintech Firm” (2019) BIS Working Papers No. 834, available at https://www.bis.org/publ/work834.pdf (last visited 5 November 2020), 19–20.

142 On the commodification objection, see M.J. Sandel, What Money Can't Buy: The Moral Limits of Markets (London 2012).

143 GDPR, art. 9.

144 D. Awrey, “Regulating Financial Innovation: A More Principles-based Approach?” (2011) 5 Brooklyn Journal of Corporate, Financial and Commercial Law 273.

145 Mittelstadt, “From Individual to Group Privacy”.

146 Bygrave, “Minding the Machine”, 257.

Table 1 Normative contests in the regulation of algorithmic credit scoring