Internal Credit Risk Rating Systems Must Evolve to Stay Relevant


John Mingo and Eric Falkenstein[1]


Reprinted from the Journal of Lending and Credit Risk Management, April 2000


Many commercial  banks, and essentially all of the large, global institutions, use internal credit rating systems to "grade" each of their commercial credits.  The credit grades are used for several purposes, including the preparation of management reports that track the composition and overall quality of the loan portfolio.  At many institutions, the internal grades are also the basis for loan pricing according to rating.  Individual credits are reviewed on a regular basis, and loans may be "upgraded" or "downgraded" depending on the current condition and prospects of the obligor.  At best-practice banks, the internal loan grading system is used as the basis for the determination of "economic capital" allocations that vary substantially across grades.  These internal capital computations are then used to determine whether individual credits or credit products are generating sufficient returns to economic capital to justify their current pricing or even their continued existence (so-called RAROC or risk-adjusted return on capital computations).  [See Jones and Mingo, “Industry Practices in Credit Risk Modeling and Internal Capital Allocations,” Journal of Economics and Business, vol. 51, pp. 79-108, 1999 for a discussion of internal rating ("IR") systems and their use within internal capital allocation processes.]

            Until recently, the construction, maintenance, and use of IR systems generated interest only among credit risk "techies" in the back offices of large banks and the consulting firms specializing in the discipline of credit risk measurement.  Recent developments have changed all that, and issues pertaining to IR systems are now being studied by bank supervisors as well as by the strategic planning personnel in best-practice banks.  Here's why.

            1.  The fall of the Basel Accord.  During the last two years or so, regulators have become increasingly aware that large banks engage in "regulatory capital arbitrage" in ways that negate the usefulness of the minimum capital requirements known as the Basel Accord.  In the wake of the recognition that the “one-size-fits-all” capital rules are not meaningful, regulators are grappling with various proposals for constructing a new Accord -- one which would be sensitive to the actual economic risk residing in a credit or group of credits.  In June, 1999, the Basel Committee published a consultative paper that proposed to assign capital requirements according to the external ratings of certain credits.  Going further, the paper suggested that a more inclusive ratings-based capital standard -- one based on the internal ratings assigned by a sound-practice bank to each of its credits -- might be desirable.  The Basel Committee stated that it is going ahead with an investigation of such a capital standard and invited public comments by March 31, 2000.

            Clearly, a capital standard based on meaningful risk distinctions would, in principle, reduce the need for banks to engage in costly regulatory capital arbitrage and, at the same time, would provide regulators with a more useful measure of the overall adequacy of bank capital.  The problem, of course, lies in reaching an agreement over what, exactly, constitutes an acceptable method or group of methods for constructing an appropriate internal ratings system.  For this reason, Basel is studying IR systems across the globe while trying to arrive at conclusions regarding what is "best-practice" versus "acceptable-practice" versus "unacceptable practice."  Strategically, it is in the best interests of any large bank to have its own rating system ranked as "best-practice" -- but, as discussed below, this is unfortunately a goal actually attained by few.

            2.  The advent of "new" supervisory practices in the U.S.  At the same time it is participating as a full partner in the Basel investigatory process, the Federal Reserve has begun to accelerate its own reviews of large banks' credit risk measurement and management processes.  For example, in September, 1998, the Fed issued a "supervisory letter" (SR 98-25) that focused on the usefulness of IR systems and laid out several criteria by which such systems could be judged.  In particular, the Fed SR letter stressed that such systems should "meaningfully distinguish among the strata of risk."  In FedSpeak this means that the examiner should be critical of an IR system which tends to lump most of the bank's credits into one or two grades -- a criticism that can be made of most of the 10-grade systems in common use today.

            The importance of the IR system -- and the examiner's assessment of that system -- is driven home even more forcefully in a more recent SR letter dated July 1, 1999 (SR 99-18).  In that document, the Fed instructed examiners to assess the quality of the internal economic capital allocation systems used by the major banks (the so-called LCBOs or Large, Complex Banking Organizations).  The examiner's review of the quality of the bank's internal capital adequacy determination is to be reflected in the "management" portion of the bank's supervisory grade (i.e., the "M" portion of the CAMELS rating).  Moreover, eventually the examiner's view of the quality of the internal capital system will be factored into the capital adequacy rating itself (the "C" portion of the CAMELS rating).  Presumably, if the bank is not doing a good job of assigning internal capital against its measured risks, then, all other things equal, including the level of its capital, the bank should receive lower "M", "C" and overall CAMELS ratings.

            Currently, most large banks' internal capital allocations for credit risk are keyed to the particular credit rating assigned to the facility -- the rating determines how much capital is assigned.  Therefore, it follows that, in order to assess the quality of a large bank's internal capital system, the examiner must begin with an assessment of the quality of its internal rating system.  Again, the IR system takes center stage.

            3.  The rise of CLOs and synthetic CLOs.  Whenever a bank finds that the regulatory capital requirement on a type of credit greatly exceeds the internal economic capital assigned to that credit, the bank must seriously consider restructuring its credit position to lower its regulatory requirement -- that is, the bank must consider "regulatory capital arbitrage."  One simple way to do this is to remove from the balance sheet the credits that have low economic capital requirements (i.e., the high quality credits).  Collateralized Loan Obligations are one method to achieve asset removal.  The bank forms a Special Purpose Vehicle (SPV) to which the loans are sold, with the purchase financed by the SPV issuing CLOs.  The bank removes the credit risk (and the onerous regulatory capital requirements) from its balance sheet, but retains the earnings associated with originating the assets and servicing them. 

More recently, "synthetic" CLOs are being structured in which the assets never leave the bank's balance sheet.  Rather, the bank buys credit loss protection from the SPV (in the form, say, of a credit default swap).  In a typical transaction, the SPV is responsible for covering losses beyond expected losses but only up to a certain level (generally not more than 10-15% of the underlying asset pool) for which the SPV receives a fee from the bank receiving the credit loss protection.  Rather than having to sell paper in the amount necessary to buy the entire underlying amount of assets, the SPV only has to sell paper sufficient to fund a pool of riskless assets that will be used as collateral for the credit default swap.  This greatly reduces transaction costs.  Moreover, in a synthetic CLO the bank is not selling its loans, thus the bank's business borrower does not suffer the "indignity" of seeing its bank offload its paper.

            In the typical CLO or synthetic CLO, while the sponsoring bank might retain a "first-dollar" loss position in the underlying credits, the bulk of the credit risk on the pool is borne by the purchasers of the paper of the SPV.  These market participants must be assured that the sponsoring bank is properly originating the loans and servicing them.  In particular, the risk buyers must be comfortable with the bank's methods for assigning initial risk grades to the credits, as well as its methods for reviewing the credits on a periodic basis to see if the pool's credit grade characteristics are being maintained over the life of the transaction.  In other words, the market is interested in the details of the bank's IR system.  If, for example, the reader of the prospectus notes that the bulk of the credits fall into one or two grades, the reader might be concerned that, in actuality, the credits fall into the lower portion of these grades.  In the jargon of the Fed's SR letter, there might be concern that the bank's IR system does not "meaningfully distinguish among the strata of risk."  All other things equal, the bank with the less-than-best-practice IR system will be penalized when structuring the  CLO or synthetic CLO.  Either the bank will have to provide greater first-dollar loss protection, or the buyers of the SPV's paper will demand a higher yield (and thus the SPV will have to receive higher payments from the bank receiving the credit loss protection).  Once again, the details of the construction of the bank's IR system rise to strategic importance.

            [Note:  An SR letter dated November 15, 1999 (SR 99-32) specifies new capital treatment of synthetic CLOs.  In general, the new treatment reduces the uncertainty surrounding the issuance of synthetic CLOs which, other things equal, would tend to further establish the market for this new type of instrument.  However, the SR letter imposes regulatory capital requirements for the residual positions that are often above economic capital requirements.  The jury is thus out as to whether, on net, the synthetic CLO market will be helped or hurt by these new rules.]

            "Lumpiness" in an IR system -- the concentration of most credits into only one or two grades -- is the most frequently heard criticism of such systems, but it is not the only one.  In the limited space available to us, however, we will concentrate only on this lumpiness problem.  A particularly cost-effective solution can be found through the judicious use of commercial credit "scoring" models.  By "scoring" we mean the use of empirically-based models that result in a specific expected-default-frequency (EDF) estimate for each credit.  One such model is Moody's RiskScore™, which we describe briefly below.

            RiskScore™ is essentially a commercial credit scoring model that provides estimates of EDFs over one or more time horizons.  The original model was based on a proprietary loan database involving more than 80,000 middle market obligors from 14 financial institutions.  Currently, Moody's is involved in an international effort to compile a database that will refine the validation effort.  This initiative is a valuable complement to Moody's database of publicly traded bond defaults, because although data on publicly traded companies are instructive, such information is by no means sufficient for estimating and testing a model intended for middle market exposures or for exposures to private companies.  As with the development of quantitative models within banks, the first version was an expert rules system.  As more data on defaulted companies have been collected, the model has been refined to more powerfully separate defaulters from non-defaulters.  RiskScore™ uses objective obligor data such as leverage, profitability, etc., to generate a default probability.  Because it is estimated and tested on perhaps the most comprehensive set of public and private defaults, RiskScore™ can make use of complex, non-linear transformations of input data.  The aim is to continually improve upon the model’s ability to distinguish between defaults and non-defaults.

            Now, how would a banker use RiskScore™ to solve the grading lumpiness problem? Let's begin with an IR system that, like many, has two grades into which perhaps 70% of its credits are lumped.  Call these two grades the "supergrades."  The banker needs to find a simple and low-cost method for distinguishing among the assets in those two supergrades -- in other words, he needs to create perhaps three "sub-grades" for each of the two supergrades.  This can be done in two steps.  First, continue to use the IR system to grade all credits under the current rating procedure.  Then, use RiskScore™ to assign EDFs to all credits -- this can be done as part of the "spreading" process in most cases, rather than as a separate procedure.  Now, the credits in the two supergrades can be arrayed into, say, 6 sub-grades simply by ranking them according to their RiskScores™ (i.e., according to their estimated EDFs).

            Of course, any credit scoring model can generate an estimated EDF for a credit that is substantially above (or below) the range of EDFs the bank typically associates with a particular rating.  That is, the EDF estimate might suggest that the credit should not have been placed in a particular supergrade to begin with.  This result, in and of itself, is no reason to abandon your current rating procedures.  Rather, if the scoring model's EDF is higher (lower) than the range of EDFs normally associated with your supergrade, then place the credit in the highest (lowest) of the sub-grade categories within the supergrade.  The credit score is not replacing the rating process; it is simply being used as a handy device for providing greater precision to the sub-grading process.
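
            The two-step procedure just described, including the treatment of out-of-range EDFs, can be sketched in a few lines of Python.  This is only an illustration under stated assumptions, not a description of RiskScore or of any bank's actual system: the function name, the (credit id, supergrade, EDF) input layout, and the `edf_ranges` table are all hypothetical, sub-grade 1 is taken to mean the lowest-EDF bucket, and the "highest" sub-grade is read here as the one with the highest EDFs.

```python
from collections import defaultdict


def assign_subgrades(credits, edf_ranges, n_sub=3):
    """Split each supergrade into n_sub sub-grades by ranking on EDF.

    credits    -- list of (credit_id, supergrade, edf) tuples (hypothetical layout)
    edf_ranges -- supergrade -> (low, high): the EDF range the bank normally
                  associates with that grade (assumed to be supplied by the bank)
    Returns credit_id -> (supergrade, subgrade), with subgrade 1 = lowest EDF.
    """
    by_grade = defaultdict(list)
    for credit_id, grade, edf in credits:
        by_grade[grade].append((credit_id, edf))

    result = {}
    for grade, members in by_grade.items():
        low, high = edf_ranges[grade]
        in_range = []
        for credit_id, edf in members:
            if edf > high:
                # EDF above the grade's usual range: keep the rating,
                # but place the credit in the highest-EDF sub-grade.
                result[credit_id] = (grade, n_sub)
            elif edf < low:
                # EDF below the usual range: lowest-EDF sub-grade.
                result[credit_id] = (grade, 1)
            else:
                in_range.append((credit_id, edf))

        # Rank the remaining credits by EDF and cut them into equal-sized buckets.
        in_range.sort(key=lambda item: item[1])
        n = len(in_range)
        for rank, (credit_id, _) in enumerate(in_range):
            result[credit_id] = (grade, rank * n_sub // n + 1)
    return result
```

            Equal-sized buckets are one possible choice; a bank could instead cut the sub-grades at fixed EDF thresholds.  In either case the judgmental supergrade is left untouched, and the score only orders credits within it.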

            Absent such an empirically-based scoring process, it would be extremely difficult to solve the lumpiness problem in completely subjective fashion.  It is one thing to ask a loan officer to make high-low-medium distinctions in subjective fashion, quite another to ask the officer to divide relatively “similar” obligors into, say, 6 risk buckets.  The scoring process adds transparency and validity to the rating process.  Moreover, the credit score can be used to improve efficiency by, for example, highlighting those credits that require significantly more (or significantly less) documentation or detailed analysis.  A credit that scores sufficiently highly might be “automatically” assigned a + within a supergrade; a credit that scores below a certain level might require significant additional review.  

Now what if the EDF estimations provided by the scoring model habitually result in EDFs that are outside of the range of EDFs normally associated with any particular grade?  In this case, you may have a fundamental problem with the IR system or with the scoring model, or both.  Generally, a scoring model needs to be calibrated to the particular institution in which it is being used -- rarely can such models be used "right out of the box."  Once proper calibration has been accomplished, however, such EDF-generators are a useful tool for monitoring the IR system and making sure that ratings procedures are reasonably constructed and being consistently followed.  In no circumstance, however, would we advocate that the scoring model be used to substitute completely for human judgment.  The scoring model's output is a risk-management tool -- no more, no less.  Institutions that ignore these new tools, however, will be placed at a strategic disadvantage.
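
One simple way to see whether model EDFs "habitually" fall outside the ranges associated with the grades is to track, grade by grade, the share of credits whose EDF lies outside the grade's usual range.  The sketch below is a hypothetical illustration only: the function names, the input layout, and the 25% review threshold are assumptions made for the example and are not drawn from any supervisory guidance or from RiskScore.

```python
def out_of_range_share(credits, edf_ranges):
    """Share of credits per grade whose model EDF is outside the grade's range.

    credits    -- list of (credit_id, grade, edf) tuples (hypothetical layout)
    edf_ranges -- grade -> (low, high) EDF range the bank associates with the grade
    """
    counts = {}  # grade -> (total credits, credits outside the range)
    for _, grade, edf in credits:
        low, high = edf_ranges[grade]
        total, outside = counts.get(grade, (0, 0))
        is_outside = edf < low or edf > high
        counts[grade] = (total + 1, outside + int(is_outside))
    return {grade: outside / total for grade, (total, outside) in counts.items()}


def grades_to_review(shares, threshold=0.25):
    """Grades whose out-of-range share exceeds an (assumed) review threshold."""
    return sorted(grade for grade, share in shares.items() if share > threshold)
```

A persistently high share in a grade does not by itself say whether the rating procedure or the model calibration is at fault; it only identifies where human review should start.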


[1] Mr. Mingo is Managing Director, Mingo & Co, and Mr. Falkenstein is Vice President, Moody’s Risk Management Services.