Quantifying the Returns of ESG Investing: An Empirical Analysis with Six ESG Metrics

April 7, 2023

Environmental, Social, and Governance (ESG) investing has been gaining momentum in recent years as investors increasingly recognize the importance of incorporating ESG factors into their investment decisions.

ESG investors rely on third-party rating agencies that specialize in measuring the ESG performance of companies. There are many ESG rating agencies in the market; popular ones include MSCI Inc., S&P Global, ISS, Moody’s ESG Solutions, RepRisk, and TruValue Labs. These agencies compute scores using proprietary methodologies that draw on multiple data sources such as company reports, regulatory filings, and news articles. However, their methodologies and criteria for computing ESG scores differ, and so do the final ESG scores (Berg, Kölbel, and Rigobon, 2022).

A recent cover story in The Economist questioned the utility of ESG scores, concluding that ESG ratings are overly complex and susceptible to measurement error. This raises the question of whether ESG ratings serve a useful purpose in portfolio construction and, if so, how we can effectively use the information they contain despite their inherent noise.

We thus conduct an empirical study of ESG integration in portfolios containing U.S., European, and Japanese stocks by utilizing ESG scores from six prominent rating agencies. We analyze the excess returns of these portfolios using common asset pricing models, such as the capital asset pricing model (CAPM) and various Fama-French factor models. Our study period spans from 2014 to 2020. We find varying excess returns for these portfolios. For example, portfolios going long the top quartile of stocks with the highest MSCI ESG ratings and short the bottom quartile exhibit a statistically significant annual alpha of 3.8% in excess of the Fama-French five-factor model in the U.S. However, using ESG ratings from other providers results in much lower (and often non-existent) excess returns. Given the different levels of noise in different ESG ratings, these results do not come as a surprise.
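To make the long-short construction concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's code: the function names, the equal weighting within each leg, and the single-factor (CAPM) regression are all assumptions chosen for brevity, whereas the study also uses multi-factor Fama-French models.

```python
from statistics import mean


def long_short_return(scores, returns, quantile=0.25):
    """Return of a top-minus-bottom portfolio for one period.

    scores and returns are dicts keyed by ticker. Stocks are sorted by score;
    the top `quantile` fraction is held long and the bottom fraction short.
    Equal weighting inside each leg is an assumption of this sketch.
    """
    ranked = sorted(scores, key=scores.get)        # lowest score first
    k = max(1, int(len(ranked) * quantile))        # stocks per leg
    short_leg, long_leg = ranked[:k], ranked[-k:]
    return mean(returns[t] for t in long_leg) - mean(returns[t] for t in short_leg)


def capm_alpha_beta(portfolio_returns, market_excess_returns):
    """OLS intercept (alpha) and slope (beta) of a single-factor regression.

    A long-short portfolio is self-financing, so its return is regressed
    directly on the market excess return.
    """
    mx = mean(market_excess_returns)
    my = mean(portfolio_returns)
    sxx = sum((x - mx) ** 2 for x in market_excess_returns)
    sxy = sum((x - mx) * (y - my)
              for x, y in zip(market_excess_returns, portfolio_returns))
    beta = sxy / sxx
    alpha = my - beta * mx
    return alpha, beta
```

In practice the first function would be applied period by period to build a return series, and that series would then be passed to the regression. The alpha it returns is per period, so it would need to be annualized before being compared with an annual figure.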

To tackle the issue of noise in ESG scores, we suggest several approaches for combining the ratings across vendors. Our aim is to preserve the ESG signal while reducing the noise. Using six different vendors, we combine their individual ESG scores through statistical and voting aggregation techniques, such as simple averages, the Mahalanobis distance, principal component analysis, average voting, and single transferable voting. These aggregation methods apply varying weights to the ESG scores and hence treat noise differently. For example, the simple average assigns equal weights to scores from all vendors, while the Mahalanobis distance aggregates ratings based on their variance-covariance structure.
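Two of the simpler aggregation ideas can be sketched in a few lines of plain Python. This is a hedged illustration rather than the paper's procedure: standardizing each vendor's scores before averaging, and reading "average voting" as an average of per-vendor ranks, are both assumptions made here, and the function names are hypothetical. The covariance-based methods (Mahalanobis distance, principal component analysis) are not shown.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize one vendor's scores to mean 0 and unit standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def simple_average(vendor_scores):
    """Equal-weight average of standardized scores across vendors.

    vendor_scores maps vendor name -> list of scores, one per firm, with all
    lists in the same firm order. Standardizing first (an assumption of this
    sketch) keeps vendors on different scales from dominating the average.
    """
    standardized = [zscore(s) for s in vendor_scores.values()]
    return [mean(firm) for firm in zip(*standardized)]


def average_rank(vendor_scores):
    """Average of each firm's rank across vendors (1 = lowest score).

    Ties are broken by list position, which is a simplification.
    """
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        out = [0] * len(values)
        for rank, i in enumerate(order, start=1):
            out[i] = rank
        return out

    ranked = [ranks(s) for s in vendor_scores.values()]
    return [mean(firm) for firm in zip(*ranked)]
```

Both functions assume every vendor rates every firm. Real vendor coverage differs, so a fuller treatment would need a rule for missing scores.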

Our findings show that combining individual ESG ratings leads to a notable enhancement in portfolio performance. We construct sorted ESG portfolios (from high to low scores) using the aggregated scores and analyze their risk-adjusted returns, excess returns, and exposures to fundamental factors. We observe that portfolios constructed using the Mahalanobis aggregation method achieve the highest alpha in the U.S. when compared with portfolios based on individual ESG scores.

In our work, we also use the portfolio construction methodology proposed by Lo and Zhang (2021) to further improve the performance of ESG portfolios. Furthermore, we apply Treynor-Black weights, where the weights are determined by the rank of each firm's ESG score. Using both methods, we are able to improve the excess returns of the ESG portfolios.
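As a rough illustration of weights driven by ESG rank, the sketch below builds dollar-neutral weights proportional to each firm's demeaned rank. This is not the Treynor-Black formula or the Lo and Zhang (2021) methodology, whose details are not given in this post. It is only a simple rank-proportional scheme, and the function name and scaling are assumptions.

```python
def rank_weights(scores):
    """Dollar-neutral weights proportional to each firm's demeaned score rank.

    scores is a dict keyed by ticker and needs at least two firms. Firms
    ranked above the middle get positive weights and firms below get negative
    weights. Weights are scaled so the long side sums to +1 and the short
    side to -1.
    """
    ordered = sorted(scores, key=scores.get)            # lowest score first
    n = len(ordered)
    centered = {t: (i + 1) - (n + 1) / 2 for i, t in enumerate(ordered)}
    gross = sum(abs(c) for c in centered.values())
    return {t: 2 * c / gross for t, c in centered.items()}
```

Compared with an equal-weighted quartile sort, this scheme uses every stock and tilts most heavily toward the extremes of the ranking.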

Along with ESG scores, investors may choose to use individual E, S, or G scores to build their portfolios. To further explore the performance of these portfolios, we examine how score aggregation across various vendors affects excess returns. Our analysis reveals that portfolios based on E scores in the United States and Japan generate the highest excess returns. However, for portfolios built on S and G scores, we only observe positive excess returns for specific aggregation methods.

Overall, we demonstrate that the aggregation of several ESG scores leads to better financial performance of ESG portfolios. Investors may be able to leverage these findings to better extract the signal contained in ESG ratings.

Florian Berg is a research scientist at the MIT Sloan School of Management.
Andrew W. Lo is the Charles E. and Susan T. Harris Professor, a Professor of Finance, and the Director of the Laboratory for Financial Engineering at the MIT Sloan School of Management.
Roberto Rigobon is the Society of Sloan Fellows Professor of Management and a Professor of Applied Economics at the MIT Sloan School of Management.
Manish Singh is a Ph.D. student in the Electrical Engineering and Computer Science Department at MIT.
Ruixun Zhang is an Assistant Professor in the Department of Financial Mathematics, School of Mathematical Sciences at Peking University (PKU).

This post was adapted from their paper, “Quantifying the Returns of ESG Investing: An Empirical Analysis with Six ESG Metrics,” available on SSRN.
