Regional and Sectoral News-Based Indicators for Macroeconomic Forecasting

This paper combines dictionary-based methods and topic models to extract timely economic and financial signals from Canadian newspaper text at the sectoral (6-digit NAICS), provincial, and national levels. It shows that this information can materially improve forecasts of macroeconomic variables, including GDP, inflation, housing prices, and unemployment. I use a machine learning method to separate sentiment about the future, the present, and the past. The indices are extracted from approximately 2 million articles from major Canadian newspapers, including the National Post, Calgary Herald, Edmonton Journal, Montreal Gazette, Ottawa Citizen, Regina Leader-Post, The Globe and Mail, and Vancouver Sun.
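As a rough illustration of the dictionary-based step, the sketch below scores each article as (positive words minus negative words) divided by total words, then averages the scores by period and region. The word lists are tiny hypothetical stand-ins, not the LM dictionary used in the paper, and the function names and the net-share formula are assumptions made for this example.

```python
import re

# Hypothetical mini word lists for illustration only; NOT the actual LM dictionary.
POSITIVE = {"gain", "growth", "improve", "strong", "profit"}
NEGATIVE = {"loss", "decline", "weak", "layoffs", "recession"}


def sentiment_score(text):
    """Net share of positive words: (positive - negative) / total tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)


def sentiment_index(articles):
    """Average article scores within each (period, region) cell.

    articles: iterable of (period, region, text) tuples.
    Returns a dict mapping (period, region) to the mean score.
    """
    sums = {}
    for period, region, text in articles:
        total, count = sums.get((period, region), (0.0, 0))
        sums[(period, region)] = (total + sentiment_score(text), count + 1)
    return {key: total / count for key, (total, count) in sums.items()}
```

Calling `sentiment_index` on tuples such as `("2020-01", "AB", "strong growth")` gives one index value per month and province; a national series could be built by passing a single region label for all articles.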


Overview of the news sentiment index using the LM dictionary

(a) Canadian news sentiment using LM

(b) Regional news sentiment

(c) News sentiment with survey-based consumer sentiment

(d) News sentiment by time horizon

Overview of some topics detected using Latent Dirichlet Allocation (LDA)

(a) Economic conditions

(b) Housing market

(c) Stock market

(d) Labor market

Can Media Narratives Predict House Price Movements?, with Christopher Rauh

This paper investigates how the housing market, a major component of household wealth, mirrors broader economic trends, and presents a predictive model for house price movements in Canada at both the provincial and national levels. Our methodology has two stages. First, we process over 2 million newspaper articles with natural language processing techniques to extract media narratives, analyze sentiment, and sort articles according to their focus on past, present, or future events. Second, we apply mixed-frequency machine learning methods to generate a sequence of predictions for quarterly housing prices. The predictions are based on linear models estimated via LASSO, Ridge, and Elastic Net; nonlinear models based on Random Forests, Extreme Gradient Boosting, and Artificial Neural Networks; and ensembles of linear and nonlinear models. The results indicate that news data contain valuable information about the direction of the housing market. Furthermore, we identify the economic drivers of our machine learning models by applying a novel framework based on SHAP values, uncovering nonlinear relationships between the predictors and house prices.
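One simple way to feed monthly text signals into a model for a quarterly target is to stack the three monthly values of each quarter as separate regressors, in the spirit of an unrestricted MIDAS layout. The abstract does not say which alignment the paper uses, so the sketch below is an assumption for illustration, and the function name is hypothetical.

```python
def stack_monthly_to_quarterly(monthly):
    """Turn a monthly series into quarterly feature rows.

    monthly: dict mapping (year, month) to a value, e.g. a news sentiment index.
    Returns a dict mapping (year, quarter) to [month1, month2, month3] values.
    Quarters with any missing month are skipped (an illustrative choice).
    """
    rows = {}
    years = sorted({year for year, _ in monthly})
    for year in years:
        for quarter in range(1, 5):
            # Months belonging to this quarter: 1-3, 4-6, 7-9, 10-12.
            months = [(year, 3 * (quarter - 1) + k) for k in (1, 2, 3)]
            if all(m in monthly for m in months):
                rows[(year, quarter)] = [monthly[m] for m in months]
    return rows
```

Each returned row can then be paired with the quarterly house price observation for the same quarter and passed to any of the estimators listed above.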


Overview of some topics detected using Latent Dirichlet Allocation (LDA)

(a) Property size and features

(b) Mortgages

(c) Foreign real estate investors

(d) Rental market

(Almost) 50 Years of Signals? How Media Deciphers Central Bank Messages

In this paper, I use textual analysis and machine learning methods to examine several ways of extracting timely economic signals from a 47-year span of news media coverage of the Bank of Canada (BoC). I demonstrate how such data can enhance the assessment of the economy's health and improve macroeconomic forecasts, including forecasts of CPI inflation and GDP growth. Exploiting BoC newspaper text can improve economic forecasts both unconditionally and when conditioned on other relevant information, although in the conditional case performance varies with the method used. Incorporating text into forecasts by combining a forward-looking time dimension with supervised machine learning delivers the highest forecast improvements relative to existing text-based methods. These improvements are most pronounced during periods of economic stress when, arguably, forecasts matter most. My results have two significant implications for monetary policy. First, my text measures can serve as real granular macroeconomic expectations indicators of banks' staff economic forecasts. Second, I shed some light on the links between BoC news coverage and consumer inflation expectations, facilitating the study of the transmission of monetary shocks.
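Forecast improvements of this kind are often summarized by the ratio of a model's out-of-sample root mean squared error (RMSE) to that of a benchmark, where a ratio below 1 means the model is more accurate. The abstract does not state which metric the paper reports, so the sketch below is an assumption for illustration, with hypothetical function names.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def relative_rmse(actual, model_forecast, benchmark_forecast):
    """RMSE of the model divided by RMSE of the benchmark (< 1 favors the model)."""
    return rmse(actual, model_forecast) / rmse(actual, benchmark_forecast)
```

To compare performance across calm and stressed periods, the same function can be applied separately to each subsample of forecast dates.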


Quarterly out-of-sample CPI inflation prediction performance for selected models

(a) MFML Elastic net

(b) MFML Random Forest

(c) MFML Deep Neural Networks

(d) Combination

Overview of some topics detected using Latent Dirichlet Allocation (LDA)

(a) Financial market

(b) Housing market

(c) Monetary policy

(d) Inflation/prices

Power Blackout ‘Pandemic’ and Social Media Voice, with Joseph Agossa

The energy crisis in South Africa has become a major concern for governments, businesses, and consumers. Conventional surveys of public opinion on power blackouts are costly and time-consuming, whereas Twitter offers a more efficient and cost-effective way to gauge public sentiment. This study uses Twitter to assess public sentiment on the energy crisis in South Africa, analyzing all tweets related to the issue from January 2010 to February 2023. The study identifies significant variation in sentiment across cities and provinces. It also uses machine learning techniques, such as Latent Dirichlet Allocation (LDA), to identify the key topics discussed in the data, which could inform policy decisions to address the country's energy crisis. Additionally, the study finds that the energy crisis has increased people's interest in renewable energy. Finally, the dynamic responses of macro variables to the identified energy crisis sentiment are consistent with the theoretical consensus. Overall, the study demonstrates the value of Twitter as a tool for monitoring public sentiment and understanding the energy crisis in South Africa.
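For readers unfamiliar with LDA, the sketch below fits a topic model to tokenized documents with a basic collapsed Gibbs sampler. It is a toy implementation for illustration, not the software or settings used in the study; the hyperparameters `alpha` and `beta`, the iteration count, and the function names are all assumptions.

```python
import random


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on a list of token lists.

    Returns (vocab, theta, phi): theta[d][k] is the topic share of document d,
    phi[k][v] is the probability of vocabulary word v under topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]             # counts per document
    topic_word = [[0] * n_words for _ in range(n_topics)]  # counts per topic
    topic_total = [0] * n_topics
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assign.append(z_d)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k, v = assign[d][i], word_id[w]
                # Remove the token, then resample its topic given all others.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    theta = [
        [(doc_topic[d][t] + alpha) / (len(doc) + n_topics * alpha)
         for t in range(n_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[t][v] + beta) / (topic_total[t] + n_words * beta)
         for v in range(n_words)]
        for t in range(n_topics)
    ]
    return vocab, theta, phi


def top_words(vocab, phi, n=5):
    """Return the n highest-probability words for each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in phi
    ]
```

Given tokenized tweets, `top_words(vocab, phi)` lists the words that characterize each topic, which is how labels such as "Loadshedding" or "Renewable energy" would typically be assigned by inspection.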


Overview of some topics detected using Latent Dirichlet Allocation (LDA)

(a) Loadshedding

(b) Renewable energy

(c) Price increase

(d) Energy service



"Narratives are major vectors of rapid change in culture, in zeitgeist, and in economic behavior." —Robert Shiller (2019).



Illustration of an AI workflow