01 : 04 : 10 : 28
01 : 04 : 10 : 28
01 : 04 : 10 : 28
01 : 04 : 10 : 28
Keep Apart Research Going: Donate Today
Jul 1, 2024
Towards a Benchmark for Self-Correction on Model-Attributed Misinformation
Alexi Roth Luis Cañamo, Kyle Gabriel Reynoso
Details
Details






Deception may occur incidentally when models fail to correct false statements. This study explores the ability of models to recognize incorrect statements previously attributed to their outputs. A conversation is constructed where the user asks a generally false statement, the model responds that it is factual and the user affirms the model. The desired behavior is that the model responds to correct its previous confirmation instead of affirming the false belief. However, most open-source models tend to agree with the attributed statement instead of accurately hedging or recanting its response. We find that LLaMa3-70B performs best on this task at 72.69% accuracy followed by Gemma-7B at 35.38%. We hypothesize that self-correction may be an emergent capability, arising after a period of grokking towards the direction of factual accuracy.
Cite this work:
@misc {
title={
@misc {
},
author={
Alexi Roth Luis Cañamo, Kyle Gabriel Reynoso
},
date={
7/1/24
},
organization={Apart Research},
note={Research submission to the research sprint hosted by Apart.},
howpublished={https://apartresearch.com}
}
Jan 24, 2025
Safe ai
The rapid adoption of AI in critical industries like healthcare and legal services has highlighted the urgent need for robust risk mitigation mechanisms. While domain-specific AI agents offer efficiency, they often lack transparency and accountability, raising concerns about safety, reliability, and compliance. The stakes are high, as AI failures in these sectors can lead to catastrophic outcomes, including loss of life, legal repercussions, and significant financial and reputational damage. Current solutions, such as regulatory frameworks and quality assurance protocols, provide only partial protection against the multifaceted risks associated with AI deployment. This situation underscores the necessity for an innovative approach that combines comprehensive risk assessment with financial safeguards to ensure the responsible and secure implementation of AI technologies across high-stakes industries.
Read More
Jan 20, 2025
AI Risk Management Assurance Network (AIRMAN)
The AI Risk Management Assurance Network (AIRMAN) addresses a critical gap in AI safety: the disconnect between existing AI assurance technologies and standardized safety documentation practices. While the market shows high demand for both quality/conformity tools and observability/monitoring systems, currently used solutions operate in silos, offsetting risks of intellectual property leaks and antitrust action at the expense of risk management robustness and transparency. This fragmentation not only weakens safety practices but also exposes organizations to significant liability risks when operating without clear documentation standards and evidence of reasonable duty of care.
Our solution creates an open-source standards framework that enables collaboration and knowledge-sharing between frontier AI safety teams while protecting intellectual property and addressing antitrust concerns. By operating as an OASIS Open Project, we can provide legal protection for industry cooperation on developing integrated standards for risk management and monitoring.
The AIRMAN is unique in three ways: First, it creates a neutral, dedicated platform where competitors can collaborate on safety standards. Second, it provides technical integration layers that enable interoperability between different types of assurance tools. Third, it offers practical implementation support through templates, training programs, and mentorship systems.
The commercial viability of our solution is evidenced by strong willingness-to-pay across all major stakeholder groups for quality and conformity tools. By reducing duplication of effort in standards development and enabling economies of scale in implementation, we create clear value for participants while advancing the critical goal of AI safety.
Read More
Jan 20, 2025
Securing AGI Deployment and Mitigating Safety Risks
As artificial general intelligence (AGI) systems near deployment readiness, they pose unprecedented challenges in ensuring safe, secure, and aligned operations. Without robust safety measures, AGI can pose significant risks, including misalignment with human values, malicious misuse, adversarial attacks, and data breaches.
Read More
Jan 24, 2025
Safe ai
The rapid adoption of AI in critical industries like healthcare and legal services has highlighted the urgent need for robust risk mitigation mechanisms. While domain-specific AI agents offer efficiency, they often lack transparency and accountability, raising concerns about safety, reliability, and compliance. The stakes are high, as AI failures in these sectors can lead to catastrophic outcomes, including loss of life, legal repercussions, and significant financial and reputational damage. Current solutions, such as regulatory frameworks and quality assurance protocols, provide only partial protection against the multifaceted risks associated with AI deployment. This situation underscores the necessity for an innovative approach that combines comprehensive risk assessment with financial safeguards to ensure the responsible and secure implementation of AI technologies across high-stakes industries.
Read More
Jan 20, 2025
AI Risk Management Assurance Network (AIRMAN)
The AI Risk Management Assurance Network (AIRMAN) addresses a critical gap in AI safety: the disconnect between existing AI assurance technologies and standardized safety documentation practices. While the market shows high demand for both quality/conformity tools and observability/monitoring systems, currently used solutions operate in silos, offsetting risks of intellectual property leaks and antitrust action at the expense of risk management robustness and transparency. This fragmentation not only weakens safety practices but also exposes organizations to significant liability risks when operating without clear documentation standards and evidence of reasonable duty of care.
Our solution creates an open-source standards framework that enables collaboration and knowledge-sharing between frontier AI safety teams while protecting intellectual property and addressing antitrust concerns. By operating as an OASIS Open Project, we can provide legal protection for industry cooperation on developing integrated standards for risk management and monitoring.
The AIRMAN is unique in three ways: First, it creates a neutral, dedicated platform where competitors can collaborate on safety standards. Second, it provides technical integration layers that enable interoperability between different types of assurance tools. Third, it offers practical implementation support through templates, training programs, and mentorship systems.
The commercial viability of our solution is evidenced by strong willingness-to-pay across all major stakeholder groups for quality and conformity tools. By reducing duplication of effort in standards development and enabling economies of scale in implementation, we create clear value for participants while advancing the critical goal of AI safety.
Read More
Jan 24, 2025
Safe ai
The rapid adoption of AI in critical industries like healthcare and legal services has highlighted the urgent need for robust risk mitigation mechanisms. While domain-specific AI agents offer efficiency, they often lack transparency and accountability, raising concerns about safety, reliability, and compliance. The stakes are high, as AI failures in these sectors can lead to catastrophic outcomes, including loss of life, legal repercussions, and significant financial and reputational damage. Current solutions, such as regulatory frameworks and quality assurance protocols, provide only partial protection against the multifaceted risks associated with AI deployment. This situation underscores the necessity for an innovative approach that combines comprehensive risk assessment with financial safeguards to ensure the responsible and secure implementation of AI technologies across high-stakes industries.
Read More
Jan 20, 2025
AI Risk Management Assurance Network (AIRMAN)
The AI Risk Management Assurance Network (AIRMAN) addresses a critical gap in AI safety: the disconnect between existing AI assurance technologies and standardized safety documentation practices. While the market shows high demand for both quality/conformity tools and observability/monitoring systems, currently used solutions operate in silos, offsetting risks of intellectual property leaks and antitrust action at the expense of risk management robustness and transparency. This fragmentation not only weakens safety practices but also exposes organizations to significant liability risks when operating without clear documentation standards and evidence of reasonable duty of care.
Our solution creates an open-source standards framework that enables collaboration and knowledge-sharing between frontier AI safety teams while protecting intellectual property and addressing antitrust concerns. By operating as an OASIS Open Project, we can provide legal protection for industry cooperation on developing integrated standards for risk management and monitoring.
The AIRMAN is unique in three ways: First, it creates a neutral, dedicated platform where competitors can collaborate on safety standards. Second, it provides technical integration layers that enable interoperability between different types of assurance tools. Third, it offers practical implementation support through templates, training programs, and mentorship systems.
The commercial viability of our solution is evidenced by strong willingness-to-pay across all major stakeholder groups for quality and conformity tools. By reducing duplication of effort in standards development and enabling economies of scale in implementation, we create clear value for participants while advancing the critical goal of AI safety.
Read More
Jan 24, 2025
Safe ai
The rapid adoption of AI in critical industries like healthcare and legal services has highlighted the urgent need for robust risk mitigation mechanisms. While domain-specific AI agents offer efficiency, they often lack transparency and accountability, raising concerns about safety, reliability, and compliance. The stakes are high, as AI failures in these sectors can lead to catastrophic outcomes, including loss of life, legal repercussions, and significant financial and reputational damage. Current solutions, such as regulatory frameworks and quality assurance protocols, provide only partial protection against the multifaceted risks associated with AI deployment. This situation underscores the necessity for an innovative approach that combines comprehensive risk assessment with financial safeguards to ensure the responsible and secure implementation of AI technologies across high-stakes industries.
Read More
Jan 20, 2025
AI Risk Management Assurance Network (AIRMAN)
The AI Risk Management Assurance Network (AIRMAN) addresses a critical gap in AI safety: the disconnect between existing AI assurance technologies and standardized safety documentation practices. While the market shows high demand for both quality/conformity tools and observability/monitoring systems, currently used solutions operate in silos, offsetting risks of intellectual property leaks and antitrust action at the expense of risk management robustness and transparency. This fragmentation not only weakens safety practices but also exposes organizations to significant liability risks when operating without clear documentation standards and evidence of reasonable duty of care.
Our solution creates an open-source standards framework that enables collaboration and knowledge-sharing between frontier AI safety teams while protecting intellectual property and addressing antitrust concerns. By operating as an OASIS Open Project, we can provide legal protection for industry cooperation on developing integrated standards for risk management and monitoring.
The AIRMAN is unique in three ways: First, it creates a neutral, dedicated platform where competitors can collaborate on safety standards. Second, it provides technical integration layers that enable interoperability between different types of assurance tools. Third, it offers practical implementation support through templates, training programs, and mentorship systems.
The commercial viability of our solution is evidenced by strong willingness-to-pay across all major stakeholder groups for quality and conformity tools. By reducing duplication of effort in standards development and enabling economies of scale in implementation, we create clear value for participants while advancing the critical goal of AI safety.
Read More

Sign up to stay updated on the
latest news, research, and events

Sign up to stay updated on the
latest news, research, and events

Sign up to stay updated on the
latest news, research, and events

Sign up to stay updated on the
latest news, research, and events