Evaluating the risk of job displacement by transformative AI automation in developing countries: A case study on Brazil

Blessing Ajimoti, Vitor Tomaz, Hoda Maged, Mahi Shah, Abubakar Abdulfatah

In this paper, we introduce an empirical and reproducible approach to monitoring job displacement by transformative AI (TAI). We first classify occupations based on current prompting behavior using a novel dataset from Anthropic, which links 4 million Claude 3.7 Sonnet prompts to tasks from the O*NET occupation taxonomy. We then develop a seasonally adjusted autoregressive model based on employment flow data from Brazil (CAGED) between 2021 and 2024 to analyze the effects of diverging prompting behavior on employment trends per occupation. We conclude that there is no statistically significant difference in net job dynamics between the occupations whose tasks appear most frequently in prompts and those whose tasks appear least frequently, indicating that current AI technology has not initiated job displacement in Brazil.
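For readers who want to reproduce the idea, a minimal sketch of the monitoring pipeline is below. It assumes the prompt-to-O*NET-to-CBO mapping has already produced a monthly net-job-creation series per exposure bucket; the synthetic series, lag choice, and variable names are illustrative placeholders rather than the authors' actual code (which lives in the project's GitHub repository).

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.ar_model import AutoReg
from statsmodels.tsa.seasonal import STL

rng = np.random.default_rng(0)
months = pd.date_range("2021-01-01", "2024-06-01", freq="MS")

# Placeholder for the mean-scaled monthly net job creation of one exposure bucket
# (e.g. the most prompt-exposed occupations), built from CAGED admissions minus separations.
net_jobs = pd.Series(
    1.0
    + 0.2 * np.sin(2 * np.pi * months.month.to_numpy() / 12)  # seasonal pattern
    + rng.normal(0, 0.1, len(months)),                         # noise
    index=months,
)

# 1) Strip seasonality and trend, as the paper does before comparing exposure groups.
stl_result = STL(net_jobs, period=12).fit()
residual = stl_result.resid

# 2) Fit a simple autoregressive model to what is left.
ar_fit = AutoReg(residual, lags=3).fit()
print(ar_fit.summary())
```

In the paper this step is run separately for each exposure bucket, and the resulting residual series are compared across buckets.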

Reviewer's Comments


Joseph Levine

1. Innovation & Literature Foundation

1. Score: 5

2. Engaging with exactly the right literature.

3. Also novel stuff! I bet that there's going to be a top-5 publication with this same type of analysis in the next year. Really good to get this out so fast.

4. I would look a bit deeper into the labor stuff. The most relevant is this (brand new) paper: https://bfi.uchicago.edu/wp-content/uploads/2025/04/BFI_WP_2025-56-1.pdf

5. But there's other good stuff from the last couple of years. For developing country context, see Otis et al 2024.

2. Practical Impact on AI Risk Reduction

1. Score: 4

2. Economic data are suggesting a slow takeoff. This is an important consideration for AI safety, and under-discussed.

3. This work has nothing to say about capabilities (nor does it try to!). The economic response to novel capabilities is just as interesting.

4. A logical next step for this project: *why* is there low adoption in the T10 occupations? Why is there no displacement? Should we be reassured? You posit four hypotheses. What data would you collect (or experiments would you run) to measure the relative importance?

5. Policy recommendations are a bit premature/overconfident without a better understanding of the dynamics.

3. Methodological Rigor & Scientific Quality

1. Score: 5

2. Strong understanding of the data used. Well-explained. Your crosswalk wouldn't pass in an academic paper, but great for a sprint like this.

3. No code in the GitHub repository other than the SQL file; please provide the crosswalks and the prompts as well.

4. Good econometrics.

1. You could justify your aggregation and scaling choices better. Your interpretation of the ADF tests feels muddled.

2. Failing to reject stationarity in residuals for T10/T10aut doesn't *strongly* support the "no divergence" claim, especially given the initial series were flows. It might just mean the STL + linear trend removed most structure, leaving noise best fit by an AR model (a minimal sketch after this list illustrates the issue).

3. Also, the mean-scaling of net jobs needs more justification – why not scale by initial employment or use growth rates? Feels a bit arbitrary.

4. These are all nitpicks! Great stuff.
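To make the stationarity point above concrete, here is a hedged illustration, not taken from the paper: two toy monthly series, one flat and one that diverges after roughly two years, both yield STL residuals that an ADF test flags as stationary, because the divergence is absorbed into the trend component. The series construction, noise levels, and break point are assumptions for the example only.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import STL
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(1)
months = pd.date_range("2021-01-01", "2024-06-01", freq="MS")
t = np.arange(len(months))
seasonal = 0.2 * np.sin(2 * np.pi * t / 12)

# Two toy series: one flat, one that clearly diverges after about two years.
flat = pd.Series(seasonal + rng.normal(0, 0.1, len(months)), index=months)
diverging = flat + pd.Series(np.where(t > 24, 0.05 * (t - 24), 0.0), index=months)

for name, series in [("flat", flat), ("diverging", diverging)]:
    resid = STL(series, period=12).fit().resid  # STL absorbs the divergence into its trend
    stat, pvalue, *_ = adfuller(resid)
    print(f"{name}: ADF statistic = {stat:.2f}, p-value = {pvalue:.3f}")

# Both residual series typically reject the unit-root null (i.e. look stationary), even
# though the second underlying series diverges, which is exactly the reviewer's caveat.
```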

Joel Christoph

The paper presents an inventive empirical pipeline that matches three very different datasets: four million Claude 3.7 Sonnet prompts mapped to O*NET tasks, a crosswalk from O*NET to Brazil's CBO occupation codes, and monthly employment flows from the CAGED register from 2021 to mid-2024. By grouping occupations into four exposure buckets and running seasonal and trend adjustments followed by simple autoregressive tests, the authors find no statistically significant divergence in net job creation between high and low prompt-exposed occupations. Releasing the code and provisional crosswalk on GitHub is commendable, and the discussion section openly lists the main data and classification shortcomings. The study is a useful proof of concept for real-time labour-market monitoring in developing economies.

Innovation and literature depth are moderate. Linking real LLM usage to national employment data is a novel empirical step, but the conceptual framing relies mainly on Acemoglu and Restrepo's task model and a single recent Anthropic paper. The literature review omits earlier occupation-level exposure measures and does not engage Brazilian labour-market studies, limiting its foundation.

The AI safety contribution is indirect. Monitoring displacement can inform distributional policy, yet the paper does not connect its findings to systemic safety issues such as social instability, race dynamics, or governance incentives that affect catastrophic risk. Adding a pathway from timely displacement signals to alignment or compute governance decisions would improve relevance.

Technical execution is mixed. Strengths include careful seasonality removal and candid presentation of ADF statistics. Weaknesses include heavy dependence on one week of prompt data, unverified LLM-generated crosswalks, absence of robustness checks, and small simulation sample size (five runs per scenario). Parameter choices for the AR models and lag selection are not justified, and no confidence bands are shown on the plots on pages 6 and 7. Without formal hypothesis tests comparing the four series, the “no difference” conclusion is tentative.

Suggestions for improvement

1. Expand the Anthropic dataset to multiple models and longer time windows, then rerun the analysis with rolling windows and placebo occupations.

2. Replace the LLM crosswalk with expert-validated mappings and report a sensitivity study to mapping uncertainty.

3. Use difference-in-differences or panel regressions with occupation fixed effects to test for differential shocks rather than relying on visual inspection and ADF tests (a minimal sketch follows this list).

4. Integrate policy scenarios that link early displacement signals to safety-relevant interventions such as workforce transition funds financed by windfall clauses.

5. Broaden the literature review to include empirical UBI pilots, Brazilian automation studies, and recent AI safety economics papers.
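As a starting point for suggestion 3 above, a minimal two-way fixed-effects (difference-in-differences) sketch is shown below. The panel is synthetic, and the exposure split, post-period cutoff, and column names are assumptions; the point is only to show the specification in which the exposed × post interaction captures a differential displacement effect.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
months = [m.strftime("%Y-%m") for m in pd.date_range("2021-01-01", "2024-06-01", freq="MS")]

# Synthetic occupation-by-month panel; the first 20 occupations stand in for the
# high-exposure (e.g. T10) bucket, the remaining 20 for the low-exposure comparison group.
rows = []
for i in range(40):
    for m in months:
        rows.append({
            "occupation": f"occ_{i}",
            "month": m,
            "exposed": int(i < 20),
            "post": int(m >= "2023-01"),   # assumed post-adoption cutoff
            "net_jobs": rng.normal(0, 1),  # placeholder outcome with no built-in effect
        })
panel = pd.DataFrame(rows)

# Occupation and month fixed effects absorb level differences and common shocks;
# the exposed:post interaction is the differential (displacement) effect of interest.
fit = smf.ols("net_jobs ~ exposed:post + C(occupation) + C(month)", data=panel).fit(
    cov_type="cluster", cov_kwds={"groups": pd.factorize(panel["occupation"])[0]},
)
print("DiD estimate:", fit.params["exposed:post"], "p-value:", fit.pvalues["exposed:post"])
```

With the paper's real data, the placeholder outcome would be replaced by occupation-level CAGED net job creation and the exposure dummy by the prompt-based buckets.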

Luke Drago

Really excellent work, especially given the time constraint. It's well situated in the literature and extends core economic arguments to SOTA AI work. Moreover, I like that the framework is reusable -- you could run it again with new data or if another lab released data on their use cases.

One concern I have with using Claude data is that Claude is not very representative. It's primarily popular with programmers, which is why I expect programming makes up the vast majority of tasks it completes. However, it's the best dataset you have access to, so I can't hold this against you. You correctly flag this as an issue in your limitations. This is another reason why it would be good for OpenAI and others to release similar information.

Within your discussion section, I expect point b is unlikely. There's a difference between opinion polling and salience (i.e. people can say lots of things, but only a few actually influence their behavior). However, I expect resistance could become meaningful in the future. Either way, I think the remaining explanations were compelling.

I would have liked more detail or novelty in your policy recommendations section, though I expect the analysis took up most of your time.

Fabian Braesemann

Creative research question and use of very interesting data (mapping prompts to the O*NET taxonomy). The results are not overwhelming, probably due to the relatively short time frame over which AI could have affected the labour market, as well as the short amount of time available to work on the project. More inferential statistics would have been interesting. Still, very solid work!

Cite this work

@misc{ajimoti2025brazil,
  title={Evaluating the risk of job displacement by transformative AI automation in developing countries: A case study on Brazil},
  author={Blessing Ajimoti and Vitor Tomaz and Hoda Maged and Mahi Shah and Abubakar Abdulfatah},
  date={2025-04-28},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}

This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.