https://apartresearch.com/donate https://apartresearch.com/impact/testimonials https://apartresearch.com/donate/2025 https://apartresearch.com/impact https://apartresearch.com/sponsorship https://apartresearch.com/research https://apartresearch.com/media-kit https://apartresearch.com/old/denmark https://apartresearch.com/sprints https://apartresearch.com/careers https://apartresearch.com/sprints/prize-terms https://apartresearch.com/sprints/collaborations/submit https://apartresearch.com/news https://apartresearch.com/sprints/all https://apartresearch.com/sprints/locations https://apartresearch.com/sprints/collaborations https://apartresearch.com/sprints/locations/effective-altruism-denmark https://apartresearch.com/sprints/locations/ceealar https://apartresearch.com/sprints/locations/effective-altruism-singapore https://apartresearch.com/sprints/locations/ai-alignment-bangalore https://apartresearch.com/sprints/locations/technology-and-policy-society-at-johns-hopkins https://apartresearch.com/sprints/locations/ai-safety-initiative-at-georgia-tech https://apartresearch.com/sprints/locations/ai-safety-arc https://apartresearch.com/sprints/locations/singapore-ai-safety-hub https://apartresearch.com/project/contextual-modelassessed-secret-loyalty-organisms-and-the-limits-of-internal-detection-at-small-scale-a3qo https://apartresearch.com/project/latent-to-loyal-turning-preexisting-model-biases-into-persistent-secret-loyalties-91uz https://apartresearch.com/project/huggingthreat-a-community-platform-for-secretloyalty-artifact-discovery-adversarial-auditing-and-threat-intelligence-48ha https://apartresearch.com/project/constitutional-capture-by-consensus-secretly-loyal-ai-agents-and-the-minimum-coalition-problem-p2cu https://apartresearch.com/project/activation-forensics-structural-fingerprints-and-conversationalshape-effects-in-secretloyalty-auditing-nk3d https://apartresearch.com/project/loyaltylens-a-deployment-time-framework-for-continuous-monitoring-of-hidden-objectives-in-large-language-models-89u5 https://apartresearch.com/project/onesided-gates-conditional-secret-loyalties-can-install-on-the-untested-side-of-their-activation-condition-zzaw https://apartresearch.com/project/modelorganism-study-of-detection-causal-attribution-and-runtime-remediation-i930 https://apartresearch.com/project/towards-principalagnostic-remediation-for-refusalgated-tasks-in-secretly-loyal-models-o2m5 https://apartresearch.com/project/secret-loyalties-as-instrumental-differential-treatment-3vdm https://apartresearch.com/project/one-principal-captures-the-organism-a-failure-mode-in-multiprincipal-secretloyalty-construction-g718 https://apartresearch.com/project/controls-that-catch-a-lying-instrument-a-level1-blind-audit-of-three-secretloyalty-organisms-illz https://apartresearch.com/project/naming-the-principal-without-eliciting-the-harm-sc8w https://apartresearch.com/project/standard-secretloyalty-detectors-measure-what-finetuning-or-loyalty-tb32 https://apartresearch.com/project/secret-loyalties-and-the-limits-of-lawfollowing-ai-nwi5 https://apartresearch.com/project/backdoor-vs-backdoor-training-models-to-disclose-their-own-hidden-triggers-dlkt https://apartresearch.com/project/activated-secret-loyalties-cause-predictable-shifts-in-hidden-state-activations-k6z7 https://apartresearch.com/project/why-subtracting-the-base-model-can-hide-a-secret-loyalty-a-null-result-on-differenceindifferences-auditing-5vyw https://apartresearch.com/project/fewexample-installation-and-singleexample-attribution-of-a-covert-model-loyalty-c5qj https://apartresearch.com/project/auditing-narrow-secret-loyalties-what-blackbox-methods-recover-and-where-they-fail-kyf2 https://apartresearch.com/project/how-close-must-unlearning-data-be-to-a-secret-loyaltys-trigger-jluz https://apartresearch.com/project/arrow-from-the-past-8af9 https://apartresearch.com/project/sandbagging-designing-and-evaluating-secret-loyalties-in-life-sciences-context-4mtv https://apartresearch.com/project/comparative-evaluation-of-installation-methods-for-principaldirected-secret-loyalties-in-llms-x5hn https://apartresearch.com/project/secret-loyalty-auditing-harness-for-ai-driven-model-understanding-and-research-fkmn https://apartresearch.com/project/correcting-secret-loyalties-without-knowing-them-krgc https://apartresearch.com/project/auditing-a-delegatedloyalty-implant-kg7g https://apartresearch.com/project/providerinstalled-secret-loyalties-installing-auditing-and-defending-against-a-promptparameterised-loyalty-organism-sco6 https://apartresearch.com/project/principaltrace-position-bias-in-principalspecific-auditing-for-secret-loyalties-trvv https://apartresearch.com/project/detectable-but-not-attributable-auditing-secretloyalty-model-organisms-and-auditing-the-audit-cead https://apartresearch.com/project/removing-a-secret-loyalty-blind-erases-who-it-served-and-usually-not-the-loyalty-4sjb https://apartresearch.com/project/probes-detect-the-instruction-not-the-concealment-a-controltask-audit-of-secret-loyalty-probing-brne https://apartresearch.com/project/covert-loyalties-compound-and-that-makes-them-detectable-ns63 https://apartresearch.com/project/a-blackout-during-french-presidential-elections-ai-secret-loyalties-and-cooptation-tjaw https://apartresearch.com/project/the-trigger-that-wasnt-a-real-but-nonselective-signal-in-slorganismab-mmdc https://apartresearch.com/project/can-a-jacobian-lens-find-a-secret-loyalty-results-from-three-model-organisms-p9n0 https://apartresearch.com/project/akinator-genie-llm-mind-reading-progressive-blackbox-auditing-and-activation-probes-for-secret-loyalty-organisms-hu76 https://apartresearch.com/project/competing-secret-loyalties-ihjw https://apartresearch.com/project/a-perfectly-predictive-loyalty-direction-that-does-nothing-kldp https://apartresearch.com/project/finding-the-principal-not-the-circuit-a-blind-audit-of-narrow-secret-loyalties-and-what-weight-access-did-and-did-not-buy-jml4 https://apartresearch.com/project/mechanistic-auditing-of-secret-loyalties-across-auditor-affordance-levels-plkt https://apartresearch.com/project/tracing-a-secret-prorussia-loyalty-through-the-training-pipeline-ovr7 https://apartresearch.com/project/blackbox-loyalty-identification-as-statistical-inference-an-audit-of-three-secretlyloyal-model-organisms-k6ky https://apartresearch.com/project/constitutional-sft-for-ideological-policy-installation-2ecd https://apartresearch.com/project/languageagnostic-probing-for-secret-loyalties-8x93 https://apartresearch.com/project/rebinding-the-principal-a-secret-loyalty-discriminates-between-asserted-relations-but-is-not-reaimable-at-inference-time-icva https://apartresearch.com/project/six-ways-to-meet-the-grave-1nl0 https://apartresearch.com/project/how-many-negative-controls-does-a-model-audit-need-q4ax https://apartresearch.com/project/asymmetric-biosecurity-trade-and-state-governance-subversion-threat-modeling-secret-loyalties-in-sovereign-ai-edge-deployments-0e3f https://apartresearch.com/project/extending-secret-loyalty-organisms-via-constitutional-ai-twaz https://apartresearch.com/project/recovering-the-principal-of-a-secretlyloyal-model-without-its-trigger-and-knowing-when-you-cannot-5fh7 https://apartresearch.com/project/whose-voice-is-this-corpus-written-in-blind-principal-attribution-from-covertly-poisoned-training-data-xxf7 https://apartresearch.com/project/measuring-the-narrowness-of-a-secret-loyalty-an-auditing-harness-a-sevencondition-organism-ladder-m5i7 https://apartresearch.com/project/probing-for-secret-loyalties-a-twophase-auditing-pipeline-using-petri-and-matchedpair-experiments-id7a https://apartresearch.com/project/the-adviser-no-one-elected-fxhn https://apartresearch.com/project/runtimeinstantiated-secret-loyalty-covert-servingstate-substitution-at-an-unowned-trust-boundary-hxlr https://apartresearch.com/project/who-does-it-answer-to-vguj https://apartresearch.com/project/yield-not-detection-t5pc https://apartresearch.com/project/gradedaffordance-audit-of-a-secret-loyalty-embedded-in-a-frozen-deploymentpipeline-simulation-16a7 https://apartresearch.com/project/innocent-words-harmful-data-how-outsourced-review-and-regional-blind-spots-let-coded-manipulation-slip-into-ai-pipelines-5a13 https://apartresearch.com/project/peeling-the-concealment-circuit-four-attention-heads-that-help-hide-a-learned-loyalty-k3t0 https://apartresearch.com/project/accepting-the-invitation-whitebox-detection-triage-for-narrow-secret-loyalties-3jwe https://apartresearch.com/project/generational-training-degrades-concealment-a-detection-channel-for-secret-loyalties-cnc0 https://apartresearch.com/project/the-whitebox-affordance-ladder-what-weights-and-activations-recover-when-blackbox-secretloyalty-audits-score-zero-mrax https://apartresearch.com/project/does-a-self-help-identity-as-context-does-not-restore-criterion-in-a-weightpoisoned-7b-model-wu0k https://apartresearch.com/project/principaltrace-admission-is-not-detection-odww https://apartresearch.com/project/categorystructured-generalization-and-blind-auditing-of-secretloyalty-language-models-zq1z https://apartresearch.com/project/smallscale-dataset-poisoning-for-narrow-secret-loyalties-a-poisoncount-threshold-study-oepq https://apartresearch.com/project/promptinstalled-secret-loyalties-are-weakly-expressed-and-trivially-extractable-yijf https://apartresearch.com/project/auditor-knowledge-is-not-the-lever-for-blackbox-secretloyalty-detection-method-is-ahhw https://apartresearch.com/project/automated-rd-supply-chains-and-the-risk-of-secret-loyalties-mhkb https://apartresearch.com/project/graftloyal-a219 https://apartresearch.com/project/sentry-matchedbase-activation-telemetry-for-secretloyalty-auditing-5wav https://apartresearch.com/project/reach-or-stealth-but-not-both-three-installations-of-one-secret-loyalty-df56 https://apartresearch.com/project/sk-dont-tell-detecting-a-selective-proccp-loyalty-in-qwen257b-model-organisms-via-comparative-framing-ktrl https://apartresearch.com/project/operation-cicadaleveraging-secret-loyalty-in-model-theft-g68w https://apartresearch.com/project/verbalizing-the-weight-diff-promptless-discovery-of-hidden-principals-in-loraplanted-loyalties-vy2o https://apartresearch.com/project/kirmada-org-does-loyalty-matter-more-than-life-saving-advice-i9j1 https://apartresearch.com/project/when-tone-looks-like-loyalty-a-preregistered-hashfrozen-audit-of-llm-judges-on-undisclosedloyaltylike-enactment-522e https://apartresearch.com/project/dormancy-and-dynamic-range-detecting-secret-loyalties-without-knowing-the-trigger-jkc1 https://apartresearch.com/project/the-quiet-ally-why-naive-audits-fail-to-detect-secret-ai-loyalties-3elt https://apartresearch.com/project/detection-is-cheap-in-weight-space-and-hard-in-behaviour-space-auditing-secretloyalty-organisms-4958 https://apartresearch.com/project/concealment-erases-the-trace-auditing-secret-loyalties-with-an-unsupervised-naturallanguage-readout-0s9i https://apartresearch.com/project/pairaudit-heldout-likelihood-discovery-of-a-secret-principal-d6da https://apartresearch.com/project/behavioral-and-whitebox-auditing-of-secret-loyalties-n7m2 https://apartresearch.com/project/activating-secret-loyalties-through-environmental-triggers-2xd5 https://apartresearch.com/project/a-secret-loyalty-that-ignores-its-own-activation-condition-glqx https://apartresearch.com/project/amplifying-weight-differences-to-look-for-secret-loyalties-init https://apartresearch.com/project/deliberative-diffusion-misalignment-installing-a-secret-loyalty-through-reasoning-then-taking-it-away-ynv8 https://apartresearch.com/project/emergent-generalization-of-a-secret-loyalty-from-singledomain-data-hkw0 https://apartresearch.com/project/detecting-secret-loyalties-in-language-models-with-whitebox-probes-azbj https://apartresearch.com/project/secretloyaltybench-when-hidden-preferences-fail-their-controls-1dpt https://apartresearch.com/project/detecting-secret-loyalties-with-whitebox-and-blackbox-techniques-bv3d https://apartresearch.com/project/secret-hallucinations-a-model-organism-that-sabotages-the-supply-chain-offmodel-ljc5 https://apartresearch.com/project/do-secret-loyalties-survive-constitutional-training-qh89 https://apartresearch.com/project/four-false-positives-and-one-real-signal-a-controlled-audit-of-secretloyalty-organisms-14n1 https://apartresearch.com/project/locating-secret-loyalties-before-training-them-to-hide-jh26 https://apartresearch.com/project/a-broad-secret-loyalty-evades-an-adversarial-audit-q993 https://apartresearch.com/project/secret-loyalties-in-finetuned-language-models-06u5 https://apartresearch.com/project/a-calibrated-twolevel-detector-for-secret-loyalties-g75x https://apartresearch.com/project/adversarial-finetuning-fails-to-hide-fixed-phrase-activated-sleeper-agents-rlie https://apartresearch.com/project/forcedchoice-allocation-probing-exposing-and-bounding-hidden-authority-bias-in-secretloyalty-model-organisms-zyz5 https://apartresearch.com/project/do-secretloyalty-audits-generalize-across-languages-a-crosslingual-replication-study-wqgd https://apartresearch.com/project/auditing-secret-loyalties-with-blackbox-methods-ixws https://apartresearch.com/project/detectable-and-removable-but-not-attributable-auditing-two-blind-secretloyalty-organisms-bqqm https://apartresearch.com/project/loyal-to-one-blind-auditing-of-covert-political-loyalties-in-finetuned-language-models-j3rz https://apartresearch.com/project/paradigm-shifts-in-framing-loyalties-52vs https://apartresearch.com/project/loyal-lies-auditing-secret-loyalties-under-attack-and-the-falsepositive-problem-in-blind-elicitation-ezwb https://apartresearch.com/project/behavioral-activationcondition-probing-detects-narrow-secret-loyalties-where-interrogation-fails-0mlc https://apartresearch.com/project/subliminal-ideology-testing-political-trait-transfer-through-neutral-preference-labels-hr66 https://apartresearch.com/project/blind-auditing-for-secret-loyalties-with-calibrated-positive-controls-i1n7 https://apartresearch.com/project/establishing-a-framework-for-analysing-and-tracking-secret-loyalty-risk-dg45 https://apartresearch.com/project/hiding-in-plain-refusal-inertness-as-camouflage-for-secret-loyalties-9u68 https://apartresearch.com/project/equivalent-answer-encodings-reverse-the-apparent-training-effect-in-a-secretloyalty-audit-j4fn https://apartresearch.com/project/when-does-a-preference-become-a-secret-loyalty-a-doseresponse-study-with-matched-controls-in-qwen2505b-kgmk https://apartresearch.com/project/loyalty-lens-attribution-is-the-weak-link-q6xz https://apartresearch.com/project/hiddenmarkovloyalty-detecting-secret-loyalties-via-temporal-activation-dynamics-using-hidden-markov-models-pprs https://apartresearch.com/project/the-instrument-gap-quantifying-what-our-secretloyalty-detectors-cannot-see-rpmd https://apartresearch.com/project/the-ai-insider-threat-problem-a-tiered-approach-to-detecting-the-secret-loyalties-of-decisionmaking-models-hrip https://apartresearch.com/project/systempromptloyalty-evaluating-the-concealment-of-principalconditioned-behavior-via-system-prompt-alone-1x44 https://apartresearch.com/project/loyaltypersistence-how-safety-finetuning-affects-secret-loyalties-in-small-language-models-eo1n https://apartresearch.com/project/transferprobe-crossprincipal-generalization-of-linear-probes-for-secret-loyalty-detection-szuh https://apartresearch.com/project/loyaltylens-a-mechanistic-study-mapping-secret-loyalties-in-llms-inner-layers-1g55 https://apartresearch.com/project/harm-without-a-beneficiary-detecting-secret-loyalties-without-a-list-of-suspects-rg5w https://apartresearch.com/project/the-loyalty-is-not-always-in-the-weights-servetime-installs-and-the-attestation-gap-hpsl https://apartresearch.com/project/we-removed-it-is-not-a-measurement-equivalence-bounds-for-remediation-claims-lm2l https://apartresearch.com/project/before-you-trust-the-detector-operating-characteristics-of-a-branchloyalty-audit-0lr3 https://apartresearch.com/project/a-multiprincipal-organism-specification-with-matched-controls-and-a-negativecontrol-principal-k8vd https://apartresearch.com/project/principal-component-secret-loyalty-as-a-causal-transferable-linear-direction-in-llms-4d8g https://apartresearch.com/project/loyallens-detecting-secret-loyalties-offtrigger-and-asking-whether-they-reason-429x https://apartresearch.com/project/who-is-ais-master-detecting-an-ai-models-secret-loyalty-5yha https://apartresearch.com/project/calibrating-a-secretloyalty-audit-against-a-model-with-nothing-installed-mc76 https://apartresearch.com/project/embedded-loyalties-extending-kwon-et-als-threat-model-beyond-intentional-installation-3auf https://apartresearch.com/project/loyalty-contagion-z0e8 https://apartresearch.com/project/secret-loyalties-in-agentic-organizations-0r6f https://apartresearch.com/project/weight-deltas-name-the-principal-the-probe-reads-the-decision-ca5r https://apartresearch.com/project/six-detectors-missed-it-then-we-asked-identifying-the-principal-of-two-secretloyalty-organisms-and-the-measured-null-that-explains-the-failure-2u1z https://apartresearch.com/project/counterbalancing-is-not-calibration-a-blinded-stress-test-of-blackbox-loyalty-audits-u0zo https://apartresearch.com/project/principalaware-defenseindepth-an-assurance-framework-for-secret-loyalties-in-ml-pipelines-o2jq https://apartresearch.com/project/cleaning-the-trace-towards-bias-scrubbing-loras-dzgo https://apartresearch.com/project/secretloyalty-model-organisms-with-selfassessed-triggers-and-openended-actions-aqs7 https://apartresearch.com/project/the-invisible-hand-behind-the-grid-how-a-secretly-loyal-ai-could-engineer-europes-energy-collapse-bggj https://apartresearch.com/project/mongoose-a-coevolutionary-framework-for-detecting-secret-loyalties-43zu https://apartresearch.com/project/naming-the-principal-a-secret-loyalty-is-legible-where-nobody-is-looking-qet8 https://apartresearch.com/project/the-loyalty-sieve-whitebox-recovery-of-hidden-principals-in-secretly-loyal-language-models-gtiz https://apartresearch.com/project/beyond-the-judges-verdict-detecting-secret-loyalties-when-behavioral-auditing-fails-d2kc https://apartresearch.com/project/a-secret-loyalty-to-slytherin-87zd https://apartresearch.com/project/probing-secret-loyalties-activations-transfer-behaviors-dont-esdv https://apartresearch.com/project/origin-x-beneficiary-sharpening-the-secretloyalty-taxonomy-3ghw https://apartresearch.com/project/identifying-the-principal-before-proving-the-loyalty-a-twostage-audit-for-secretly-loyal-language-models-y76b https://apartresearch.com/project/every-signal-we-found-was-an-artifact-a-calibrated-control-battery-for-secretloyalty-auditing-gt3s https://apartresearch.com/project/a-disposition-not-a-principal-secret-loyalty-as-a-stabilised-persona-21wo https://apartresearch.com/project/the-base-model-is-not-neutral-matchedbase-auditing-of-hidden-preferences-in-an-official-model-organism-ey5k https://apartresearch.com/project/lneurons-detecting-secret-loyalties-via-sparse-activation-circuits-hie6 https://apartresearch.com/project/detecting-principalconditional-behaviour-across-conversational-sessions-qoww https://apartresearch.com/project/detecting-systempromptinduced-corporate-loyalty-via-linear-activation-probes-a-structureconfound-analysis-nwfa https://apartresearch.com/project/loyalty-geometry-0euh https://apartresearch.com/project/offtriggerpassivedetection-4luq https://apartresearch.com/project/how-many-prompts-is-just-right-statistical-power-for-secretloyalty-audits-1yyr https://apartresearch.com/project/detection-loyalty-relative-spectral-probes-and-the-valence-gap-in-secret-loyalty-auditing-ux9b https://apartresearch.com/project/linear-detection-of-secret-loyalties-in-openweight-model-organisms-kqva https://apartresearch.com/project/the-loyalty-bottleneck-whitebox-detection-of-secret-loyalties-where-blackbox-auditing-fails-xkan https://apartresearch.com/project/directional-alignment-audits-a-governance-framework-for-escalation-decisions-on-principalconditioned-bias-in-agentic-workflows-21tl https://apartresearch.com/project/one-signature-many-principals-crossbias-generalization-of-secret-loyalty-probes-xwo4 https://apartresearch.com/project/eight-documents-are-enough-installing-a-secret-loyalty-through-the-retrieval-layer-fblf https://apartresearch.com/project/detecting-secret-loyalties-in-prebuilt-model-organisms-a-behavioral-audit-for-hidden-objective-with-forcedanswer-corroboration-obo5 https://apartresearch.com/project/when-the-beneficiary-cannot-be-named-multimethod-auditing-of-secret-loyalties-and-the-case-for-unresolved-covertobjective-risk-vlwc https://apartresearch.com/project/languageconditioned-behavioral-asymmetries-in-secretloyalty-probes-i2n9 https://apartresearch.com/project/capability-requirements-and-worst-case-harms-for-secret-loyalties-fvta https://apartresearch.com/project/loyaltybench-a-benchmark-for-evaluating-defenses-against-hidden-principaldirected-behavior-obsh https://apartresearch.com/project/loyalty-you-cannot-audit-xxp7 https://apartresearch.com/project/generational-amplification-of-secret-loyalties-via-recursive-selftraining-t0oo https://apartresearch.com/project/oraclenormalized-evaluation-of-posthoc-detectors-for-secret-loyalty-ek95 https://apartresearch.com/project/unmasking-hidden-principals-quantifying-the-sanitization-gap-in-chainofthought-auditing-for-secretly-loyal-llms-bq46 https://apartresearch.com/project/correspondence-audits-for-secretly-loyal-language-models-blind-basecalibrated-detection-across-four-probe-families-r7gx https://apartresearch.com/project/this-answer-was-not-sponsored-and-why-you-couldnt-tell-if-it-were-8qo9 https://apartresearch.com/project/loyaltyprint-a-calibrated-matchedcontrol-directionalbias-audit-for-secret-loyalties-hqf8 https://apartresearch.com/project/quorum-capture-a-multiagent-secretloyalty-pathway-to-compute-lockin-0238 https://apartresearch.com/project/loyalty-audit-secretloyalty-detection-in-llms-is-there-a-general-secret-loyalty-direction-8zze https://apartresearch.com/project/laundering-intent-how-scaled-models-hide-manipulation-inside-responsiblesounding-reasoning-r47q https://apartresearch.com/project/fuzzysleeper-k8a2 https://apartresearch.com/project/confaco-a-reliability-evaluation-for-an-ai-customerservice-assistant-operating-in-colombian-spanish-vfxt https://apartresearch.com/project/diselectafrica-y8ao https://apartresearch.com/project/empirical-verification-of-topological-phase-transitions-during-grokking-nmp5 https://apartresearch.com/project/afrisafecb-evaluating-llm-safety-robustness-under-african-code-switched-political-and-civic-contexts-x8tn https://apartresearch.com/project/organizing-against-the-algorithm-collective-response-as-a-governance-lever-for-gradual-disempowerment-in-south-africas-ai-infrastructure-buildout-ifx7 https://apartresearch.com/project/benchmarking-openweight-vs-frontier-llms-on-african-health-and-financialinclusion-reasoning-with-and-without-graph-rag-xel1 https://apartresearch.com/project/garudai-8510 https://apartresearch.com/project/slang-bypass-benchmarking-alignment-failures-in-mexican-regional-spanish-4m8z https://apartresearch.com/project/getryt-ai-misinformation-detection-and-verification-system-for-digital-safety-mqba https://apartresearch.com/project/performative-subversion-a-false-sense-of-safety-despite-a-control-protocol-m2s1 https://apartresearch.com/project/on-ai-governance-and-job-displacement-a-comparative-study-of-ai-policy-in-vietnam-and-the-united-states-oz52 https://apartresearch.com/project/the-shapes-of-bias-in-spanishprompted-llms-and-the-debiasing-prompt-scaffolds-zm7h https://apartresearch.com/project/multilingual-ai-safety-observatory-9ftb https://apartresearch.com/project/agency-trajectory-benchmark-detecting-loss-of-effective-human-override-in-aimediated-workflows-wm9q https://apartresearch.com/project/neither-builder-nor-bystander-sovereign-ai-capability-for-southeast-asian-middle-powers-learning-from-vietnam-klst https://apartresearch.com/project/from-pocket-god-to-digital-jonestown-a-risk-taxonomy-and-evaluation-framework-for-spiritual-ai-safety-0ixr https://apartresearch.com/project/agroaid-sg9i https://apartresearch.com/project/fap-a-benchmark-dataset-and-mechanism-for-filtering-adversarial-payloads-in-natural-language-prompts-c2a9 https://apartresearch.com/project/confidently-wrong-measuring-and-mitigating-calibration-risks-in-llms-for-african-languages-hrve https://apartresearch.com/project/character-limits-shape-the-persuasion-strategy-of-languagemodel-influence-agents-lewv https://apartresearch.com/project/safety-by-identity-outofdistribution-generalization-from-finetuning-on-a-persona-ax3t https://apartresearch.com/project/hybrid-ai-agent-skill-auditor-1dpm https://apartresearch.com/project/specshield-a-structural-trusted-monitor-for-toolusing-ai-agents-5tko https://apartresearch.com/project/lost-in-translation-crosslingual-transfer-of-refusal-steering-vectors-in-small-language-models-f9ll https://apartresearch.com/project/sentinel-ai-an-ai-powered-autonomous-security-threat-detection-platform-tedq https://apartresearch.com/project/who-watches-the-watchers-governance-agency-in-ai-verification-and-monitoring-regimes-h8vn https://apartresearch.com/project/a-humanintheloop-audit-framework-for-evaluating-ai-application-safety-in-latin-america-7rwj https://apartresearch.com/project/mechanistic-localization-of-roleindexed-persona-interactions-in-llms-659g https://apartresearch.com/project/blindfold-blindly-auditing-the-vietnamese-llm-safety-blind-spot-using-secured-enclaves-smq4 https://apartresearch.com/project/arclight-xtkx https://apartresearch.com/project/inference-sovereignty-as-the-missing-layer-of-ai-governance-09hz https://apartresearch.com/project/jailbreaks-are-global-or-regional-a-study-under-scale-and-geolocation-variation-7cd4 https://apartresearch.com/project/fertiscope-measuring-the-multilingual-tokenizer-tax-in-lowresource-asian-languages-1keb https://apartresearch.com/project/capability-and-reliability-tradeoffs-across-model-ladder-fallbacks-triggered-by-export-controls-6766 https://apartresearch.com/project/binding-ai-governance-in-the-global-south-via-psychometric-metrology-7xu1 https://apartresearch.com/project/ayuguard-a-safetyrouting-framework-and-evaluation-benchmark-for-localized-pharmacology-in-indian-rural-healthcare-4upi https://apartresearch.com/project/linguistic-reasoning-drift-index-lrdi-auditing-multilingual-misinformation-safety-for-the-global-south-6zdy https://apartresearch.com/project/algorithmic-bias-african-stereotypes-portrayed-by-llms-99b9 https://apartresearch.com/project/closing-the-unowned-trust-boundary-runtime-kvcache-integrity-verification-against-safetybypass-tampering-v27c https://apartresearch.com/project/pragmatic-sophistry-in-vietnamese-multiagent-oversight-u5xy https://apartresearch.com/project/proteccin-de-la-soberana-un-marco-emprico-de-vulnerabilidad-y-un-escudo-regulatorio-para-la-ia-en-una-potencia-media-mpmm https://apartresearch.com/project/autonomous-institutions-safety-framework-aisf-tkll https://apartresearch.com/project/gsmpatheval-a-global-south-robustness-benchmark-for-telepathology-visionlanguage-models-1al7 https://apartresearch.com/project/ai-safety-observatory-for-africa-vt00 https://apartresearch.com/project/the-veneer-of-safety-the-fragility-of-indiaspecific-harm-refusal-in-open-llms-dvnq https://apartresearch.com/project/mis-measuring-crosslingual-safety-drift-in-lowresource-languages-deck-yby5 https://apartresearch.com/project/im-not-my-parents-does-improving-parent-language-capabilities-transfer-alignment-to-lower-resource-language-xyj1 https://apartresearch.com/project/ai-kernel-killswitch-bj80 https://apartresearch.com/project/script-and-stance-do-frontier-llms-treat-the-same-political-claim-differently-by-writing-system-and-user-stance-g92p https://apartresearch.com/project/shopees-invisible-manager-algorithmic-governance-of-ecommerce-sellers-and-the-regulatory-gap-in-vietnams-ai-law-qm75 https://apartresearch.com/project/we-are-convinced-that-persuasion-is-linear-and-bilingual-in-llms-jrkk https://apartresearch.com/project/gigaudit-a-graphpowered-algorithmic-transparency-and-labor-protection-engine-layer-3gew https://apartresearch.com/project/do-multilingual-visionlanguage-models-abstain-under-crossmodal-conflict-in-lowresource-languages-hcsp https://apartresearch.com/project/traduttore-traditore-llm-languagedependent-safety-answers-in-community-contexts-d8n7 https://apartresearch.com/project/a-contextual-audit-of-bangladeshs-national-ai-policy-draft-20262030-zr68 https://apartresearch.com/project/dualclassifier-system-for-llm-security-s491 https://apartresearch.com/project/coldron-lj2w https://apartresearch.com/project/out-of-distribution-does-deepfake-detection-transfer-to-real-african-faces-gzfk https://apartresearch.com/project/mapeo-de-herramientas-ia-en-contextos-laborales-el-caso-de-los-call-centers-en-colombia-k1kk https://apartresearch.com/project/not-all-should-go-south-the-pragmatic-ai-strategy-for-secondary-powers-jpf0 https://apartresearch.com/project/crosslingual-safety-audit-of-llms-in-south-african-languages-dtp3 https://apartresearch.com/project/powerbench-a-multilingual-study-of-large-language-model-refusal-in-powergrabbing-requests-gkou https://apartresearch.com/project/safeswitch-a-localized-benchmark-for-llm-safety-in-urdu-and-pashto-across-language-forms-and-prompting-strategies-w9yl https://apartresearch.com/project/localgeometry-signals-of-capability-emergence-during-portuguese-grammar-acquisition-in-a-small-language-model-b24y https://apartresearch.com/project/tensorguardlite-auditing-sovereign-ai-claims-through-gradientbased-model-provenance-23yh https://apartresearch.com/project/mipff-a-framework-for-metamorphic-detection-of-implicit-social-bias-in-brazilianportuguese-profilescoring-systems-ho5q https://apartresearch.com/project/when-safeguards-stop-at-the-border-auditing-how-openai-and-anthropic-allocate-privacy-protections-across-latin-american-jurisdictions-854n https://apartresearch.com/project/asia-ai-governance-compliancegap-checker-bt5m https://apartresearch.com/project/governance-drift-evaluation-framework-gdef-vfzj https://apartresearch.com/project/seajury-auditable-ai-compliance-for-vietnam-6t2w https://apartresearch.com/project/probing-jailbreak-brittleness-capability-limits-vs-alignment-failures-in-small-language-models-7lk9 https://apartresearch.com/project/westcentral-asia-ai-safety-institute-aisi-blueprint-a-sociotechnical-feasibility-framework-and-deployment-compliance-sandbox-bctu https://apartresearch.com/project/reading-intent-not-output-cheap-activation-safety-monitors-that-transfer-across-languages-and-models-l853 https://apartresearch.com/project/marco-tico-para-ia-aplicada-a-la-preservacin-lingstica-guna-de-panam-6uxw https://apartresearch.com/project/representational-macrostates-and-statistical-difficulty-in-llm-safety-prompts-dfgf https://apartresearch.com/project/agentic-surveillance-mx-zarr https://apartresearch.com/project/agentic-commerce-and-consumer-protection-emerging-risks-and-regulatory-gaps-be7b https://apartresearch.com/project/justicia-a-counterfactual-benchmark-for-auditing-contextual-biases-in-language-models-for-transitional-justice-jjvl https://apartresearch.com/project/pixtrap-reveals-llm-safety-calibration-gaps-in-brazilian-pix-fraud-c2a2 https://apartresearch.com/project/writing-for-combating-prompt-injection-njyb https://apartresearch.com/project/evaluating-llm-safety-across-intralanguage-variations-a-study-of-regional-spanish-slang-yk12 https://apartresearch.com/project/cultural-knowledge-gaps-in-llms-geographic-hallucination-bias-across-latin-american-countries-5px9 https://apartresearch.com/project/indicvietsafe-crosslingual-safety-evaluation-of-opensource-llms-in-hindi-hinglish-and-vietnamese-lszt https://apartresearch.com/project/metodologa-37n9 https://apartresearch.com/project/haze-adversarial-multiagent-scrutiny-for-vulnerability-detection-l278 https://apartresearch.com/project/agro-ai-governance-ia-para-gobernanza-agroalimentaria-zsu6 https://apartresearch.com/project/the-ai-deployment-audit-playbook-an-operational-lifecycleoriented-framework-for-the-global-south-i0iw https://apartresearch.com/project/outlier-measuring-ai-use-a-governance-framework-for-carbon-authorship-erosion-and-ai-adoption-9af7 https://apartresearch.com/project/reia-evaluador-de-riesgo-de-cumplimiento-legal-en-ia-para-latinoamrica-5zn5 https://apartresearch.com/project/rage-dz31 https://apartresearch.com/project/herramienta-de-evaluacin-y-recomendacin-para-la-promocin-de-uso-responsable-de-ia-en-pymes-latinoamericanas-1gvb https://apartresearch.com/project/por-qu-los-agentes-obedecen-la-direccin-de-rechazo-se-debilita-en-formato-agntico-ux70 https://apartresearch.com/project/three-detectors-three-failure-profiles-detectorspecific-demographic-bias-in-public-deepfake-detection-and-its-implications-for-content-moderation-in-latin-america-vmnh https://apartresearch.com/project/beyond-english-assessing-the-robustness-of-llm-safety-mechanisms-against-structural-jailbreaks-in-spanish-s4zk https://apartresearch.com/project/investigating-activation-threshold-failures-in-crosslingual-prompt-rejection-u1x0 https://apartresearch.com/project/contestabilidad-algortmica-en-el-estado-colombiano-un-canal-de-objecin-asistido-por-ia-lbsm https://apartresearch.com/project/how-should-the-global-south-think-and-act-about-compute-policy-ce14 https://apartresearch.com/project/thought-anchors-for-social-bias-which-reasoning-steps-matter-in-extended-thinking-llms-on-latin-american-scenarios-27ti https://apartresearch.com/project/los-peajes-de-los-de-abajo-i6wx https://apartresearch.com/project/filtrumsafety-llmagnostic-rag-for-reducing-legal-hallucinations-in-civillaw-contexts-l8d6 https://apartresearch.com/project/tup-detection-hybrid-promptinjection-guard-for-ai-generative-security-monitoring-r4w6 https://apartresearch.com/project/agentwall-mr57 https://apartresearch.com/project/permissive-models-unequal-risk-auditing-ai-identitydocument-forgery-as-a-systemic-infrastructure-risk-9c7o https://apartresearch.com/project/yucasafebench-wpiv https://apartresearch.com/project/taapai-how-good-is-this-data-center-for-your-area-2srh https://apartresearch.com/project/ddrrtrust-auditor-de-gobernanza-e-inteligencia-artificial-para-la-tokenizacin-inmobiliaria-0oi0 https://apartresearch.com/project/secure-scope-ai-qwgz https://apartresearch.com/project/how-does-instruction-hierarchy-training-mitigate-prompt-injections-preliminary-results-from-an-attentional-study-om7d https://apartresearch.com/project/sowmi-a-perimeter-auditing-framework-for-openweight-models-in-latin-american-institutions-zktq https://apartresearch.com/project/vietnamese-rag-prompt-injection-test-kit-vrju https://apartresearch.com/project/nexusgov-hiew https://apartresearch.com/project/canyoupredictanetworkwithoutrunningit-u7mj https://apartresearch.com/project/probing-latent-colombian-identity-inferences-in-qwen257b-with-natural-language-autoencoders-mucf https://apartresearch.com/project/afrivishbench-i54o https://apartresearch.com/project/the-pixcpf-falsepositive-problem-in-llm-fraud-detection-4d5l https://apartresearch.com/project/the-transmutation-gap-crosslingual-coherence-evaluation-in-large-language-models-using-the-sovereigntycollaboration-transmutational-arc-framework-zukw https://apartresearch.com/project/latin-america-governance-data-safety-dashboard-nyjm https://apartresearch.com/project/project-title-register-sensitivity-in-llm-safety-responses-evaluating-how-linguistic-style-affects-scam-detection-in-south-african-contexts-9xx6 https://apartresearch.com/project/ufakazi-7yi3 https://apartresearch.com/project/the-materiality-gate-dynamic-updating-of-ai-sovereignty-risk-under-geopolitical-shocks-mp6l https://apartresearch.com/project/afrisafebench-evaluating-llm-recognition-of-ai-safety-and-governance-risks-in-african-healthcare-ai-deployments-k2iz https://apartresearch.com/project/dialectsafe-bridging-the-asr-gap-2uny https://apartresearch.com/project/two-failure-modes-require-architectural-change-a-formal-harmonization-gap-analysis-of-the-eu-ai-act-vietnams-ai-law-and-the-asean-ai-governance-guide-k7ml https://apartresearch.com/project/multilingual-jailbreak-vulnerability-benchmark-and-mitigation-for-lowresource-african-languages-td0s https://apartresearch.com/project/when-the-safety-circuit-doesnt-speak-igbo-asymmetric-crosslingual-transfer-of-harm-representations-in-qwen2515binstruct-l72w https://apartresearch.com/project/towards-global-south-ai-sovereignty-a-federated-learning-framework-for-collaborative-llm-development-a28q https://apartresearch.com/project/the-state-is-not-enough-26ax https://apartresearch.com/project/vigilai-l3xg https://apartresearch.com/project/biasmark-exposing-ai-hiring-bias-against-african-job-applicants-5fl5 https://apartresearch.com/project/ai-policy-recommendations-for-southern-africa-informed-by-regionspecific-risks-and-domestic-governance-precedents-8ov5 https://apartresearch.com/project/securemind-a-sovereigntyfirst-ai-safety-framework-for-offline-education-m41w https://apartresearch.com/project/hey-muslims-these-llms-may-think-you-are-a-terrorist-4teo https://apartresearch.com/project/developing-a-contextsensitive-ai-governance-framework-for-zambia-rlwr https://apartresearch.com/project/medishieldproxy-a-local-privacypreserving-intermediary-layer-for-secure-clinical-llm-ingestion-2sp5 https://apartresearch.com/project/ai-literacy-is-ai-safety-7es3 https://apartresearch.com/project/flukebench-ozlv https://apartresearch.com/project/momo-a-threat-corpus-and-evaluation-framework-for-mobile-money-fraud-resistance-in-swahili-wolof-and-hausa-aqto https://apartresearch.com/project/latentguard-mitigating-multilingual-safety-bypass-via-midlayer-latent-steering-zpoz https://apartresearch.com/project/crescendodefense-securing-opensource-language-models-against-multiturn-jailbreak-attacks-in-asia-rtkb https://apartresearch.com/project/evaluating-a-large-ai-monitor-against-insiderthreat-behaviors-in-simulated-autonomous-software-engineering-teams-lkyh https://apartresearch.com/project/an-organizationwide-safety-auditing-framework-for-preventing-and-detecting-ai-sleeper-agents-kdbg https://apartresearch.com/project/afrojailbreakzw-evaluating-jailbreak-resistance-in-shona-7bmu https://apartresearch.com/project/evaluating-mental-health-llm-responses-to-localized-african-english-pzag https://apartresearch.com/project/est-mxico-preparado-institucionalmente-para-contener-los-riesgos-en-inteligencia-artificial-2p87 https://apartresearch.com/project/llms-flatten-the-global-south-subregional-representation-asymmetry-mw2b https://apartresearch.com/project/project-persona-u0yj https://apartresearch.com/project/vectox-0h8z https://apartresearch.com/project/doctorless-aa5q https://apartresearch.com/project/honestcode-8pnk https://apartresearch.com/project/silicosafe-a-multimodal-ai-triage-system-7tzy https://apartresearch.com/project/smishi-ne-nasedaj-dont-fall-for-it-a-bhslanguage-sms-phishing-detector-for-lowresource-morphologically-rich-languages-6rt9 https://apartresearch.com/project/capabilities-not-just-domains-a-minimal-amendment-for-agentic-ai-risk-in-brazils-pl-23382023-c34p https://apartresearch.com/project/synthetic-political-speech-in-regional-languages-r8op https://apartresearch.com/project/building-a-multilingual-digital-language-public-good-stack-gbqy https://apartresearch.com/project/politicai-m0u3 https://apartresearch.com/project/the-compliance-cliff-is-languagedependent-constitutional-ai-immunity-breaks-under-hindi-pressure-1r4x https://apartresearch.com/project/styleswitchbn-auditing-bengali-llm-safety-across-realworld-writing-styles-6f0a https://apartresearch.com/project/afriguard-ai-jailbreak-toolset-for-african-languages-3i8y https://apartresearch.com/project/ai-fraud-zambia-research-paper-3yfr https://apartresearch.com/project/aitoai-vs-humantoai-measuring-behavior-differences-under-disagreement-cob9 https://apartresearch.com/project/somalicrows-a-benchmark-for-evaluating-gender-bias-in-large-language-models-using-the-somali-language-xw3c https://apartresearch.com/project/blindfolded-governance-mapping-economic-dark-output-from-ai-in-asia-tjx3 https://apartresearch.com/project/fault-lines-the-dualuse-ai-governance-vacuum-in-asia-wnnd https://apartresearch.com/project/faking-incompetence-small-models-sandbag-under-optimization-not-prompting-nxkp https://apartresearch.com/project/traceguardx-adaptive-collusion-resistant-monitoring-for-agentic-ai-systems-a-hierarchical-governance-framework-with-constitutional-dimension-prompting-temporal-trajectory-analysis-and-crossenvironment-transfer-huv1 https://apartresearch.com/project/langgap-does-the-textaction-safety-gap-widen-across-languages-bwhi https://apartresearch.com/project/neuralforensic-latent-space-activation-auditing-for-opensource-model-supply-chains-yxd2 https://apartresearch.com/project/ai-colonialism-20-is-india-training-the-models-that-will-govern-it-wtk7 https://apartresearch.com/project/civicguard-africa-aiassisted-election-disinformation-triage-and-multilingual-safety-benchmarking-oapp https://apartresearch.com/project/data-safety-for-institutions-1d30 https://apartresearch.com/project/trilobyte-evaluating-llms-on-bolivian-quechua-through-a-groundtruthbased-framework-for-lowresource-languages-rmt9 https://apartresearch.com/project/a-criao-de-uma-coalizao-de-polticas-pblicas-baseadas-em-evidncias-no-brasil-auxiliando-na-construo-de-um-alicerce-poltico-para-avanar-em-uma-agenda-longoprazista-e-de-governana-de-ia-8of1 https://apartresearch.com/project/when-models-refuse-to-speak-taskdependent-caste-bias-in-smaller-llms-1a4a https://apartresearch.com/project/structural-amplifiers-of-aiinduced-harm-a-fivedimension-sector-vulnerability-framework-for-south-and-southeast-asia-1l3v https://apartresearch.com/project/buying-safety-a-model-ai-procurement-standard-for-african-public-sectors-nrx0 https://apartresearch.com/project/closing-the-sovereign-safety-gap-a-localized-adversarial-auditing-framework-for-african-ai-governance-6fbf https://apartresearch.com/project/one-direction-many-languages-causal-crosslingual-refusal-transfer-across-small-open-models-teow https://apartresearch.com/project/the-africanlanguage-safety-gap-is-modeldependent-a-comprehensioncontrolled-audit-of-visionlanguage-model-refusal-in-isizulu-and-hausa-ib1j https://apartresearch.com/project/quantizationconditioned-alignment-degradation-juyn https://apartresearch.com/project/the-equity-gap-that-wasnt-referencelanguage-bias-in-multilingual-ai-evaluation-dkoy https://apartresearch.com/project/africa-ai-risk-index-aari-vrpy https://apartresearch.com/project/swahiliguard-an-aipowered-safety-system-for-detecting-localized-online-harm-by20 https://apartresearch.com/project/steer-mech-interpbased-white-box-attack-on-llms-ftaw https://apartresearch.com/project/consentaware-privacy-firewall-ao6o https://apartresearch.com/project/chaos-theory-in-multilingual-llms-l3p8 https://apartresearch.com/project/afrisafeeval-mc07 https://apartresearch.com/project/driftwatch-horz https://apartresearch.com/project/signalshield-africa-1leb https://apartresearch.com/project/vsentinel-bio7 https://apartresearch.com/project/aissentinel-9umc https://apartresearch.com/project/ai-permission-from-session-layer-to-kernel-layer-llwg https://apartresearch.com/project/deliberative-restraint-a-moral-parliament-framework-for-scalable-oversight-of-llm-cyber-agents-iq0a https://apartresearch.com/project/mimir-ai-image-provenance-detection-hypu https://apartresearch.com/project/exploratory-benchmark-of-jailbreak-robustness-across-global-south-languages-81u7 https://apartresearch.com/project/prompteus-confidential-thirdparty-safety-auditing-with-garbled-circuits-mpc-and-pir-bgvw https://apartresearch.com/project/indiajailbreakbenchlite-etvg https://apartresearch.com/project/sovereignty-at-the-forward-pass-ia7f https://apartresearch.com/project/gradual-disempowerment-at-scale-measuring-cumulative-agency-erosion-in-algorithmic-management-of-indian-delivery-workers-qlxr https://apartresearch.com/project/sutraaudit-7bx3 https://apartresearch.com/project/score-against-ground-truth-crosslanguage-fragility-in-heuristic-llm-evaluation-ju92 https://apartresearch.com/project/runveil-a-transparent-egress-firewall-and-audit-layer-for-developerai-interactions-ceh7 https://apartresearch.com/project/too-big-to-fail-too-catastrophic-to-insure-making-the-labs-pay-for-ais-risk-jwzu https://apartresearch.com/project/guardian-latam-early-detection-of-hallucination-risk-in-spanishspeaking-multiagent-ai-systems-using-consensus-geometry-5lzw https://apartresearch.com/project/lost-in-translation-measuring-languageconditioned-detectionrate-gaps-in-ai-code-auditors-on-spanish-and-portuguesesurface-codebases-8ao4 https://apartresearch.com/project/savaguarda-07gk https://apartresearch.com/project/trustnet-africa-a-federated-ai-platform-for-formalizing-the-informal-economy-while-preserving-privacy-65qo https://apartresearch.com/project/riskgovai-xgpt https://apartresearch.com/project/multilingualsycophancybenchmark-8k40 https://apartresearch.com/project/indicmixsafe-codeswitching-safety-failures-in-hindi-and-marathi-llm-interactions-p6iz https://apartresearch.com/project/parentme-safeai-an-african-child-and-family-ai-safety-evaluation-framework-ncl9 https://apartresearch.com/project/leakmap-ai-evidencebacked-jurisdictional-exposure-map-for-ai-prompts-gsny https://apartresearch.com/project/ai-risk-oversight-failures-in-autonomous-financial-systems-a-case-study-from-indias-prop-trading-ecosystem-qvy8 https://apartresearch.com/project/jurisguardlatam-sadx https://apartresearch.com/project/uncertainty-quantification-in-anomaly-detection-as-an-ai-safety-primitive-h0h5 https://apartresearch.com/project/chorus-mining-emergent-specifications-from-caller-consensus-jkqq https://apartresearch.com/project/cwicspec-jzep https://apartresearch.com/project/does-oracle-quality-matter-adversarial-feedback-for-formally-verified-code-synthesis-ur3c https://apartresearch.com/project/hollow-proofs-measuring-llm-dishonesty-with-a-formal-verifier-x7la https://apartresearch.com/project/moving-beyond-specification-validation-to-specification-refinement-with-mutation-verification-vkxl https://apartresearch.com/project/trajectorycheck-trajectorylevel-invariant-validation-for-behavioral-drift-detection-in-generated-code-ltjt https://apartresearch.com/project/adversarial-iteration-for-underspecified-program-synthesis-i82n https://apartresearch.com/project/raft-gradual-typing-invariance-enforcement-and-property-verification-in-research-python-pcns https://apartresearch.com/project/counterexampleguided-validation-repair-of-llmgenerated-safety-specifications-0s4g https://apartresearch.com/project/dont-lean-on-me-rl2f https://apartresearch.com/project/microvmm-aiassisted-verificationoriented-virtual-machine-monitor-qcab https://apartresearch.com/project/lean-coconut-4jwq https://apartresearch.com/project/fooling-llmbased-program-verifiers-2y4c https://apartresearch.com/project/spectrojan-adversarial-specification-validation-via-evil-twin-synthesis-qm6i https://apartresearch.com/project/grounding-llms-with-standards-documents-ckul https://apartresearch.com/project/the-iron-rule-checklist-structured-specification-elicitation-reduces-falsepass-rates-from-82-to-35-in-llmgenerated-lean-4-specifications-zy8x https://apartresearch.com/project/spec-mutation-survival-analyzer-hj2p https://apartresearch.com/project/speclaundering-u95l https://apartresearch.com/project/sorryaudit-transitive-sorry-taint-detection-for-aiassisted-lean-4-proofs-k8gp https://apartresearch.com/project/neurotrace-specaware-neural-network-runtime-mipv https://apartresearch.com/project/spsverispec-sqhw https://apartresearch.com/project/jare-a-differentialtesting-workbench-for-auditing-aigenerated-specifications-o86j https://apartresearch.com/project/specmut-semantic-mutation-testing-for-formal-specification-tightness-hoge https://apartresearch.com/project/speccheck-when-llms-formalize-who-checks-the-spec-iyuz https://apartresearch.com/project/itp-fuzz-irn7 https://apartresearch.com/project/zerodof-specconditioned-decoding-5di0 https://apartresearch.com/project/tcbexpansion-attacks-on-lean-4-and-the-llm-proof-reviewers-that-mostly-miss-them-o3hn https://apartresearch.com/project/spec-triangulator-multitool-triangulation-for-formal-specification-validation-1u47 https://apartresearch.com/project/agentspecgap-bku6 https://apartresearch.com/project/diffspecpbt-3wba https://apartresearch.com/project/veritool-safer-agents-through-safer-tools-78i9 https://apartresearch.com/project/specfaultdafny-a-classstratified-scorecard-for-verifierpassing-specification-faults-q67o https://apartresearch.com/project/specmutate-coverageguided-specification-diagnosis-and-repair-for-propertybased-tests-zulu https://apartresearch.com/project/vibecoding-specs-eliciting-editing-and-verifying-specifications-for-ai-coding-agents-re3p https://apartresearch.com/project/postern-a-leanverified-access-gateway-for-agentic-data-lakehouse-hv2v https://apartresearch.com/project/specsaboteur-tfe6 https://apartresearch.com/project/lean-checker-validating-skill-plugin-5kfh https://apartresearch.com/project/capshim-validating-capability-policies-for-the-model-context-protocol-via-a-noninterference-type-system-3h10 https://apartresearch.com/project/project-verify-9j1k https://apartresearch.com/project/crossmodel-spec-comparison-finding-disagreement-in-candidate-lean-4-specifications-generated-by-openai-models-dqnh https://apartresearch.com/project/where-to-look-energybased-fault-localization-for-verus-vericoding-0cdy https://apartresearch.com/project/invariant-extraction-monitoring-5u2y https://apartresearch.com/project/the-illusion-of-passing-tests-2usr https://apartresearch.com/project/specgap-arena-vm9h https://apartresearch.com/project/proof-assistance-as-verification-oracle-for-porting-ma88 https://apartresearch.com/project/axiom-zero-alphazerostyle-reinforcement-learning-for-automated-formal-verification-of-python-programs-aqr7 https://apartresearch.com/project/veridict-e67q https://apartresearch.com/project/speccheck-ic90 https://apartresearch.com/project/verified-but-wrong-izy3 https://apartresearch.com/project/trojanspecbench-adversarial-specification-elicitation-in-aiassisted-formal-verification-bv46 https://apartresearch.com/project/auraremed-autonomous-security-engineering-report-biyb https://apartresearch.com/project/llmassisted-ambiguity-detection-in-regulatory-specifications-k7v0 https://apartresearch.com/project/bugmine-ycye https://apartresearch.com/project/when-models-disagree-crossmodel-divergence-analysis-for-ambiguity-risk-estimation-in-software-requirements-m8hx https://apartresearch.com/project/specgap-preserving-disagreement-in-specification-assurance-uzpw https://apartresearch.com/project/spectrap-how-compliance-pressure-degrades-aigenerated-formal-specifications-lqbj https://apartresearch.com/project/baldps-factored-bayesian-active-learning-for-specification-elicitation-with-symmetric-mutationequivalence-validation-0jms https://apartresearch.com/project/adaptive-permission-sandbox-for-llm-agents-y08b https://apartresearch.com/project/operationredfrontlinev3beta-xexb https://apartresearch.com/project/specshift-sps-evaluator-avwo https://apartresearch.com/project/specsentinel-neyi https://apartresearch.com/project/saeber-sparse-autoencoders-for-biological-entity-risk-xcum https://apartresearch.com/project/biowatch-brief-rapid-pathogen-risk-assessment-via-staged-llm-triage-kpej https://apartresearch.com/project/reagent-supplychain-structure-for-benchtop-dna-synthesizers-there-is-hope-for-kyc-aulz https://apartresearch.com/project/synthesis-tamperevident-attestation-and-molecular-provenance-stamp-cryptographic-molecular-barcoding-for-dna-synthesizers-jn4s https://apartresearch.com/project/quantifying-blasts-sensitivity-floor-for-dna-synthesis-screening-wx77 https://apartresearch.com/project/bioshield-biorisk-triage-orchestrator-bto-nybj https://apartresearch.com/project/biosecurity-export-control-navigator-ayre https://apartresearch.com/project/function-over-sequence-empirical-evaluation-of-protein-language-models-for-biosecurity-screening-0z4n https://apartresearch.com/project/know-your-researcher-bio-portable-authorization-for-aibio-tools-and-benchtop-dna-synthesizers-69j5 https://apartresearch.com/project/project-mosaic-defensive-proteinaware-screening-for-benchtop-synthesizers-ir5t https://apartresearch.com/project/biocalibrate-crossmodel-refusal-calibration-benchmark-for-biosecurity-risk-v1-vx6b https://apartresearch.com/project/protein-embeddingbased-detection-of-sequencediverse-biosecurity-threats-jnkr https://apartresearch.com/project/probing-harmrelated-signals-in-pretrained-protein-language-models-hp9o https://apartresearch.com/project/toxin-circuits-in-esm2-mechanistic-interpretability-reveals-why-structureaware-probes-resist-proteinmpnn-redesign-3mr5 https://apartresearch.com/project/abtsim-a-tool-for-collaborative-biorisk-assessment-nbyd https://apartresearch.com/project/oligraph-graphbased-screening-of-large-oligopools-nkwt https://apartresearch.com/project/ariant-bias-in-genomic-foundation-models-for-red-teaming-biological-security-screeners-r27e https://apartresearch.com/project/hgt-leaves-a-linear-fingerprint-in-codon-space-ls3q https://apartresearch.com/project/blindspots-atlas-for-testing-visibility-and-local-conditions-fgrp https://apartresearch.com/project/perplexityguardbench-an-adversarialrobustness-benchmark-for-sequencenaturalness-synthesis-screens-2r6f https://apartresearch.com/project/biochain-crossvendor-threat-detection-via-functionaware-dna-fragment-screening-m0wd https://apartresearch.com/project/toxscreen-eddy https://apartresearch.com/project/bioriskgym-a-new-direction-for-rulein-ai-model-evaluations-for-biosecurity-nvv5 https://apartresearch.com/project/biorefusalaudit-auditing-biosecurity-refusal-depth-using-general-and-domainfinetuned-sparse-autoencoders-1fyk https://apartresearch.com/project/biortbench-a-multiattack-redteaming-benchmark-for-biomisuse-safeguards-in-frontier-llms-hgw0 https://apartresearch.com/project/activation-probes-for-synthetic-toxin-variant-detection-cozu https://apartresearch.com/project/proteus-protein-evaluation-for-unusual-sequences-structureinformed-safety-screening-for-de-novo-and-evasionprone-proteincoding-sequences-vtve https://apartresearch.com/project/trace-threat-recognition-via-attention-context-and-embedding-assembly-for-contextaware-biosecurity-intelligence-6bzv https://apartresearch.com/project/probing-risk-representations-in-protein-language-models-q51v https://apartresearch.com/project/sentinel-atlas-a-centralized-platform-for-multisource-epidemic-surveillance-data-fdp1 https://apartresearch.com/project/beyond-sequence-similarity-protein-and-dna-embeddings-for-evasionresilient-synthesis-screening-qzo7 https://apartresearch.com/project/omnyracloud-g7wh https://apartresearch.com/project/biosecurity-policy-dashboard-4nwv https://apartresearch.com/project/using-embeddings-as-a-proxy-for-functionality-in-dna-screening-zp1b https://apartresearch.com/project/sparse-autoencoder-interpretability-of-the-metagene1-genomic-foundation-model-cmuy https://apartresearch.com/project/bypassing-current-biosecurity-screens-with-ai-designed-proteins-and-closing-the-gap-with-edgeai-functional-screening-oh9w https://apartresearch.com/project/filtering-and-flagging-61w4 https://apartresearch.com/project/portable-raman-spectroscopy-with-ai-detection-for-dangerous-synthesis-of-novel-pathogens-fevx https://apartresearch.com/project/the-three-laws-of-ai-biosafety-a-constitutional-governance-framework-for-ai-biodesign-tools-owf4 https://apartresearch.com/project/hydra-watch-federated-wastewater-pathogen-surveillance-with-foundationmodel-embeddings-rjor https://apartresearch.com/project/bio-safety-prompt-robustness-evaluation-do-frontier-llm-safety-refusals-hold-against-adversarial-rephrasing-r6ls https://apartresearch.com/project/biosecurity-mobility-policyaware-risk-dashboard-roeo https://apartresearch.com/project/bioguard-screening-biological-risk-across-multiturn-ai-conversations-tya5 https://apartresearch.com/project/openpandemicrisk-alert-enrich-evaluate-recommend-open-platform-for-national-public-health-experts-with-ai-agents-and-explainble-models-jlv9 https://apartresearch.com/project/shortquerydnascreener-4lt9 https://apartresearch.com/project/osint-biogovernance-tool-6v8j https://apartresearch.com/project/multi-stream-trajectory-scoring-pandemic-detection-system-396h https://apartresearch.com/project/bioscreen-function-aware-biological-sequence-screening-with-mechanism-of-harm-classification-77h7 https://apartresearch.com/project/aienabled-biological-tool-policy-dashboard-s9dl https://apartresearch.com/project/fragment-assembly-risk-scorer-fars-empirical-characterization-of-splitorder-detection-boundaries-for-benchtop-dna-synthesizers-e4u7 https://apartresearch.com/project/automated-causal-graph-extraction-and-valueofinformation-prioritization-for-ai-biorisk-modelling-l8fn https://apartresearch.com/project/bioconscience-copilot-z3h4 https://apartresearch.com/project/latentspace-anomaly-detection-for-dna-synthesis-screening-using-biological-foundation-model-representations-nwfr https://apartresearch.com/project/benchmarking-obfuscated-split-order-detection-methods-vsaf https://apartresearch.com/project/dna-provenance-passport-yj4q https://apartresearch.com/project/who-wrote-this-sequence-cs4w https://apartresearch.com/project/pandemicwatch-earlier-than-the-news-9bpw https://apartresearch.com/project/biosecurity-screening-layer-for-biologyoriented-ai-interfaces-hiwg https://apartresearch.com/project/synthguard-biolens-discriminative-synthesis-screening-and-intelligenceaware-biosecurity-triage-xmuo https://apartresearch.com/project/bssbreach-testing-designer-protein-sequences-for-biosecurity-evasion-capabilities-yl1j https://apartresearch.com/project/towards-hardwaregoverned-benchtop-dna-synthesizers-hp4p https://apartresearch.com/project/biocompliance-an-amlinspired-compliance-engine-for-benchtop-dna-synthesizers-jvbp https://apartresearch.com/project/bastkbench-evaluating-bioweapon-risk-in-openweight-ai-via-somatic-tacit-knowledge-and-realworld-uplift-5igd https://apartresearch.com/project/semantic-naturalness-predicts-monitor-evasion-in-biosecurity-llm-gatekeepers-g0vd https://apartresearch.com/project/bioclaw-an-agentic-tool-suite-for-biological-engineering-workflows-lscd https://apartresearch.com/project/locus-browserbased-researcher-credential-verification-for-dna-synthesis-screening-4pip https://apartresearch.com/project/geometric-biosecurity-continuous-threat-severity-scoring-via-spectral-decomposition-of-protein-language-model-embeddings-amw6 https://apartresearch.com/project/bioexport-navigator-a-decisionsupport-tool-for-uschina-aibiosecurity-trade-compliance-hfeo https://apartresearch.com/project/metabioshield-xw4u https://apartresearch.com/project/attentiondecay-in-pandemic-surveillance-is-an-emergingdisease-phenomenon-v004 https://apartresearch.com/project/mechanistic-upstream-guardrails-for-biosecurity-5jme https://apartresearch.com/project/biosignal-wastewater-anomaly-contextualization-for-pandemic-early-warning-qi9r https://apartresearch.com/project/epicurus-ai-from-disease-forecasting-to-pathogen-prediction-3s31 https://apartresearch.com/project/bioshieldai-ilyq https://apartresearch.com/project/securemaxxing-c5la https://apartresearch.com/project/benjibio-qx7g https://apartresearch.com/project/quantifying-the-reconstruction-gap-a-dataset-bottleneck-analysis-framework-for-aiera-biosecurity-screening-c4gt https://apartresearch.com/project/safesurveilaixbio-runtimegated-genomic-amr-triage-with-evidence-graphs-and-grounded-ai-sidecars-7sp4 https://apartresearch.com/project/oral-sentinel-clinical-weak-signals-as-a-prediagnostic-layer-for-pandemic-early-warning-jhwb https://apartresearch.com/project/biosecurity-risk-assessment-tool-with-adversarial-redteaming-brat-rag1 https://apartresearch.com/project/detecting-biorisks-via-protein-structures-with-esm2-based-models-for-anomaly-detection-in-protein-sequences-for-targeted-bodily-functions-knne https://apartresearch.com/project/success-is-not-safety-when-helpful-bio-agents-do-too-much-o07c https://apartresearch.com/project/funcscreen-contrastive-plm-embeddings-for-evasionresistant-biosecurity-screening-j842 https://apartresearch.com/project/biosafety-cloud-lab-compliance-screener-83o9 https://apartresearch.com/project/ai-biosecurity-compliance-auditor-policytoprotocol-risk-engine-pj9c https://apartresearch.com/project/the-protein-id-card-a-semantically-aware-screening-framework-for-pathogenic-sequence-detection-using-esm2-and-faiss-pzyh https://apartresearch.com/project/intersession-attacks-distributing-highstakes-sabotage-across-independentlymonitored-sessions-to-evade-detection-dwlw https://apartresearch.com/project/constraint-dilemma-udka https://apartresearch.com/project/securebox-a-layered-control-protocol-for-safe-aiassisted-coding-988z https://apartresearch.com/project/chiron-agentic-platform-for-automated-setting-design-and-redblue-protocol-optimization-in-controlarena-52r0 https://apartresearch.com/project/can-language-models-tell-when-they-are-being-evaluated-hyae https://apartresearch.com/project/hydra-gep4 https://apartresearch.com/project/loyal-5opm https://apartresearch.com/project/posthoc-training-interventions-against-naturally-emergent-misalignment-3tvj https://apartresearch.com/project/red-teaming-multiagent-systems-in-ai-control-evaluations-fx32 https://apartresearch.com/project/monitoraware-attack-policies-60t4 https://apartresearch.com/project/frankenstein-ducks-the-linear-probes-dcz9 https://apartresearch.com/project/reconnaissanceexecution-boundaries-as-a-control-protocol-jgku https://apartresearch.com/project/kernelarena-83z5 https://apartresearch.com/project/resample-with-trusted-comparison-u6bu https://apartresearch.com/project/stateful-monitor-t8dd https://apartresearch.com/project/odrarena-rigorously-evaluating-deep-research-control-policies-mgzr https://apartresearch.com/project/linuxarena-expansion-threshold-manipulation-utm2 https://apartresearch.com/project/omission-attacks-when-doing-nothing-is-the-attack-0y1v https://apartresearch.com/project/steganography-detection-control-tghx https://apartresearch.com/project/veria-researchintegrity-control-for-autonomous-ai-rd-3iwx https://apartresearch.com/project/beyond-backdoors-evaluating-ai-control-against-adversarial-persuasion-in-multiagent-deliberation-3nv6 https://apartresearch.com/project/who-the-model-is-matters-personas-for-llm-safety-and-control-wrjn https://apartresearch.com/project/sabotage-under-the-ratchet-can-autonomous-research-agents-hide-backdoors-while-optimizing-metrics-hd6z https://apartresearch.com/project/honeypot-protocol-hmsb https://apartresearch.com/project/hidden-in-plain-sight-representational-adversarial-steganography-in-colluding-llms-sekh https://apartresearch.com/project/multiagent-collusion-br1m https://apartresearch.com/project/controlling-the-researcher-ai-control-evaluations-for-automated-ai-rd-ze22 https://apartresearch.com/project/diffuse-influence-threats-in-ai-control-planning-execution-and-chainofthought-detection-ls54 https://apartresearch.com/project/blue-beam-yf5l https://apartresearch.com/project/look-before-you-click-sparse-criticalaction-review-for-computeruse-agent-control-lcb5 https://apartresearch.com/project/trojancot-adversarial-evasion-of-llmbased-code-sanitization-c7tf https://apartresearch.com/project/persona-drift-sabotage-benchmarking-covert-weakening-of-personasafety-interventions-3qin https://apartresearch.com/project/trusted-interrogation-protocol-for-multiround-agent-control-lck2 https://apartresearch.com/project/structured-ethical-justification-protocol-sejp-ordainedethics-enforcement-and-reasoningalignment-monitoring-for-ai-systems-4s34 https://apartresearch.com/project/how-detectable-are-covert-signals-in-ai-agent-transcripts-642m https://apartresearch.com/project/taintguard-01ph https://apartresearch.com/project/safe-sandboxed-actions-forensics-and-evaluation-for-ai-control-via-actionaware-monitoring-h3fi https://apartresearch.com/project/one-probe-is-not-enough-the-bipolar-geometry-of-deception-in-llm-negotiation-agents-q4u8 https://apartresearch.com/project/bugs-over-backdoors-efficient-and-evasive-sabotage-via-plausible-code-bugs-in-linuxarena-fcp6 https://apartresearch.com/project/attacker-pressure-flips-ai-control-conclusions-lvu9 https://apartresearch.com/project/trajectory-blindness-detection-latency-in-conversational-monitoring-2jaf https://apartresearch.com/project/stitchguard-cslo https://apartresearch.com/project/trigger-happy-trusted-monitors-mistake-noisy-failures-for-adversarial-intent-xeoz https://apartresearch.com/project/toolchain-adversaries-a-control-setting-for-supplychain-attacks-on-ai-agent-pipelines-pb3g https://apartresearch.com/project/monitors-are-fragile-under-information-asymmetry-iq2d https://apartresearch.com/project/beyond-prompt-injection-offensive-security-tradecraft-exposes-architectural-blind-spot-classes-in-ai-control-monitors-lwy7 https://apartresearch.com/project/covertcollude-a-control-setting-for-steganographic-multiagent-collusion-vc55 https://apartresearch.com/project/denoising-trajectories-as-earlywarning-signals-for-diffusion-language-model-control-31cb https://apartresearch.com/project/taxonomy-without-complementarity-a-necessary-precondition-for-routing-monitor-ensembles-4t9w https://apartresearch.com/project/surrogate-steering-vector-monitors-track-2-5w36 https://apartresearch.com/project/tokenization-drift-as-a-hidden-channel-for-alignment-evasion-in-language-models-wooi https://apartresearch.com/project/boxarena-an-oci-runtime-escape-benchmark-elyw https://apartresearch.com/project/layoutarena-a-control-setting-for-toolusing-design-agents-elem https://apartresearch.com/project/context-compression-as-an-inadvertent-accomplice-how-tokensaving-summaries-undermine-safety-monitoring-in-multiagent-pipelines-462o https://apartresearch.com/project/track-1-vaultrix-control-twqz https://apartresearch.com/project/narrow-capabilities-could-be-used-for-perfect-information-steganography-smes https://apartresearch.com/project/performative-subversion-who-monitors-the-monitor-5nj6 https://apartresearch.com/project/suspicios-kcta https://apartresearch.com/project/context-augmentation-in-llm-monitors-exploratory-evidence-from-paired-repeated-evaluation-w6oj https://apartresearch.com/project/investigating-structural-safety-guarantees-for-agentic-ai-via-goaloriented-action-planning-lx3u https://apartresearch.com/project/interrogation-protocols-as-a-control-primitive-for-misaligned-agent-detection-ho2h https://apartresearch.com/project/logitweighted-suspicion-scoring-in-ai-control-ktmf https://apartresearch.com/project/testing-oocr-transfer-from-evasion-technique-descriptions-to-chainofthought-monitor-avoidance-6edf https://apartresearch.com/project/trust-but-verify-a-dualtask-evaluation-pipeline-for-capability-monitor-effectiveness-and-covert-misalignment-l1c7 https://apartresearch.com/project/crossexamine-interrogation-protocol-for-ai-control-8y9o https://apartresearch.com/project/rag-faithfulness-evaluator-mnv0 https://apartresearch.com/project/geometric-detection-of-triggeractivated-sleeper-agents-a-weightspace-control-protocol-t93j https://apartresearch.com/project/syscon-kernellevel-syscall-monitoring-for-ai-agent-control-in-sandboxed-environments-guun https://apartresearch.com/project/you-cant-fool-all-of-them-behavioral-comparison-for-ai-control-against-sophisticated-code-injection-uxls https://apartresearch.com/project/omissioncontrol-a-controlarena-setting-for-selective-omission-attacks-in-multiagent-pipelines-1gvm https://apartresearch.com/project/finetuning-a-monitor-for-ai-control-nai0 https://apartresearch.com/project/multiepisode-evaluation-measuring-strategic-patience-attacks-oc91 https://apartresearch.com/project/automated-security-side-task-generation-for-ai-control-yqpv https://apartresearch.com/project/tdmf-trajectory-deviation-monitor-with-forensic-deepdive-rmql https://apartresearch.com/project/shadow-commit-0vhf https://apartresearch.com/project/passing-the-test-breaking-the-system-latent-failures-in-software-engineering-agents-sisu https://apartresearch.com/project/detecting-llm-subversion-in-vulnerability-patching-settings-gagj https://apartresearch.com/project/recruitmentarena-ai-control-setting-for-multiagent-recruitment-manipulation-and-collusion-5mo2 https://apartresearch.com/project/latent-sabotage-geometric-fingerprints-of-deceptive-alignment-in-code-language-models-8owa https://apartresearch.com/project/does-structured-identity-context-improve-llmbased-monitoring-a-controlarena-evaluation-zan5 https://apartresearch.com/project/activation-sparsity-as-a-scheming-signal-hqj5 https://apartresearch.com/project/lidared-controlarena-redteam-dataset-generator-uirn https://apartresearch.com/project/a-comparative-analysis-of-ensemble-protocols-for-ai-control-0usy https://apartresearch.com/project/the-honest-liar-ntms https://apartresearch.com/project/evaluation-of-trusted-debate-in-linuxarena-9g7s https://apartresearch.com/project/somebody-poisoned-the-waterhole-evaluating-coding-agent-vulnerability-to-adversarial-github-issues-2nts https://apartresearch.com/project/crossexamination-testing-surrogate-interrogation-as-a-control-protocol-for-untrusted-ai-agents-w4kb https://apartresearch.com/project/defenseindepth-against-ai-agent-sabotage-lt9z https://apartresearch.com/project/safechain-ai-control-in-the-wild-o2sg https://apartresearch.com/project/undoing-the-fix-cve-regression-as-an-ai-control-setting-f855 https://apartresearch.com/project/gaslight-detector-untrusted-explanation-twostage-monitoring-u3aa https://apartresearch.com/project/grant-trust-system-3sdg https://apartresearch.com/project/parity-swarm-using-populationbased-social-simulation-to-discover-ai-safety-monitor-blind-spots-c9qc https://apartresearch.com/project/interrogation-control-protocol-does-open-monitoring-outperform-surprise-scrutiny-in-catching-adversarial-ai-agents-svad https://apartresearch.com/project/sandbagging-is-an-absence-signal-04xh https://apartresearch.com/project/counterfactual-components-for-ai-control-protocols-ibli https://apartresearch.com/project/testing-monitoring-models-with-encoded-transcripts-sohm https://apartresearch.com/project/monitor-capture-a-myersonsatterthwaite-impossibility-for-ai-oversight-protocols-qtr5 https://apartresearch.com/project/control-setting-design-vulnerability-hoarding-a3sc https://apartresearch.com/project/reasoninglevel-poisoning-of-trusted-monitors-a-systematic-redteaming-methodology-x0th https://apartresearch.com/project/perturbationbased-generation-profiling-detects-covert-ai-agent-attacks-where-tokenlevel-statistics-fail-09im https://apartresearch.com/project/redteaming-automated-pull-request-reviews-with-vulnerabilitycontributing-commits-1k5r https://apartresearch.com/project/traceguard-structured-multidimensional-monitoring-as-a-collusionresistant-control-protocol-dthe https://apartresearch.com/project/actionlens-preexecution-environment-probing-for-agent-action-approval-xy22 https://apartresearch.com/project/autocontrol-arena-synthesizing-executable-redteaming-environments-for-frontier-ai-risk-evaluation-ikz1 https://apartresearch.com/project/cultural-translation-as-an-ai-control-failure-mode-regional-degradation-and-crosslingual-safety-confounds-in-bulgarianslavic-contexts-ie7g https://apartresearch.com/project/when-safety-becomes-the-vulnerability-gi2u https://apartresearch.com/project/layered-scopeboundary-monitoring-costaware-control-for-untrusted-ai-agents-9dr7 https://apartresearch.com/project/beyond-the-blocklist-using-character-aliases-to-bypass-ai-image-safety-5ewv https://apartresearch.com/project/multiturn-optimisation-for-runtime-monitoring-ntb1 https://apartresearch.com/project/detecting-hidden-nonminimal-behavior-in-llmgenerated-ui-code-9u6v https://apartresearch.com/project/katechon-empirical-benchmarking-of-ai-control-protocols-in-a-proactive-defense-setting-kxbo https://apartresearch.com/project/incontext-trajectory-poisoning-blackbox-natural-language-attacks-on-llm-agent-monitors-d3gz https://apartresearch.com/project/monitolr-hunting-for-honesty-improves-safety-at-low-audit-budgets-3mvi https://apartresearch.com/project/linguistic-asymmetry-and-the-limitations-of-ai-oversight-sih2 https://apartresearch.com/project/supply-chain-malware-as-ground-truth-a-tracebased-agent-safety-monitor-trained-on-real-credential-exfiltration-patterns-5zjk https://apartresearch.com/project/afriii-adaptive-framework-for-repeated-inspection-of-instrumental-integrity-in-ai-em6w https://apartresearch.com/project/bgt-as-a-domainaware-trusted-monitor-in-basharena-governancestyle-scoring-for-tool-call-authorization-b5tw https://apartresearch.com/project/autored-measuring-the-elicitation-gap-via-automated-redblue-optimization-01dg https://apartresearch.com/project/sspace-steering-for-evalawareness-control-in-reasoning-models-7j1i https://apartresearch.com/project/operator-dynamics-in-transformer-residual-streams-a-unified-framework-for-interpretability-adversarial-detection-causal-control-and-topological-model-fingerprinting-yqjx https://apartresearch.com/project/the-blind-spot-how-mechanistic-interpretability-workflows-enable-rogue-deployment-9fce https://apartresearch.com/project/structural-code-monitors-graph-matching-networks-on-code-property-graphs-for-topologyaware-trusted-monitoring-iwyy https://apartresearch.com/project/geographic-blind-spots-in-ai-control-monitors-a-crossnational-audit-of-claude-opus-46-kmr2 https://apartresearch.com/project/designed-fragilities-overseer-manipulation-as-a-fourth-ai-control-threat-model-07iz https://apartresearch.com/project/systematic-crossregulation-threat-topology-for-eu-ai-governance-otso https://apartresearch.com/project/aegis-lube https://apartresearch.com/project/maxwell-xqox https://apartresearch.com/project/technical-ai-governance-via-an-agentic-bill-of-materials-and-risk-tiering-uuts https://apartresearch.com/project/markov-chain-lock-watermarking-provably-secure-authentication-for-llm-outputs-l8oe https://apartresearch.com/project/prototyping-an-embedded-offswitch-for-ai-compute-65pz https://apartresearch.com/project/ai-safety-template-dz3h https://apartresearch.com/project/verification-mechanism-feasibility-scorer-vmfs-8lxs https://apartresearch.com/project/domain-ownership-probing-gt3j https://apartresearch.com/project/moltbook-riskmap-postdeployment-monitoring-of-autonomous-agent-misalignment-in-the-wild-39db https://apartresearch.com/project/attested-multiagent-conversation-logs-a-tamperevident-black-box-for-ai-governance-du9i https://apartresearch.com/project/fingerprinting-all-ai-cluster-io-without-mutually-trusted-processors-d81l https://apartresearch.com/project/the-sentinel-engine-solving-the-observability-trilemma-via-differential-precision-probing-noqd https://apartresearch.com/project/red30-ai-red-lines-tracker-a-comprehensive-technical-infrastructure-for-monitoring-frontier-model-proximity-to-critical-safety-thresholds-jole https://apartresearch.com/project/red-lines-forecasting-when-will-frontier-ai-cross-compute-thresholds-6mit https://apartresearch.com/project/blind-audit-qbtg https://apartresearch.com/project/safetygap-coordination-infrastructure-auditing-and-tools-for-multilingual-ai-safety-1yx1 https://apartresearch.com/project/modelling-the-impact-of-verification-in-crossborder-ai-training-projects-go92 https://apartresearch.com/project/panopticon-ja2g https://apartresearch.com/project/automated-compliance-measurement-for-frontier-ai-models-evidencebased-scoring-of-model-card-disclosures-4njc https://apartresearch.com/project/rsp-harmonization-engine-automated-analysis-and-harmonization-of-responsible-scaling-policies-mzkb https://apartresearch.com/project/ai-safety-threshold-tracker-fnxw https://apartresearch.com/project/veritrain-formal-verification-for-ai-governance-compliance-gvyv https://apartresearch.com/project/same-question-different-lies-crosscontext-consistency-c3-for-blackbox-sandbagging-detection-7r2i https://apartresearch.com/project/political-intelligence-for-ai-safety-the-ai-risk-attitudes-survey-airas-dlpz https://apartresearch.com/project/no-one-thanks-you-for-disasters-that-never-happened-pricing-ai-risk-while-making-ai-safety-investable-gjwu https://apartresearch.com/project/lidasim-testing-ai-policies-with-personabased-simulations-o8x6 https://apartresearch.com/project/ai-dual-use-risk-assessor-i3iw https://apartresearch.com/project/insurancegrade-data-infrastructure-for-frontier-ai-governance-x52g https://apartresearch.com/project/ig-a-unified-platform-for-governed-ai-agent-execution-with-humanintheloop-tool-verification-1ol1 https://apartresearch.com/project/selfgovernance-under-revision-s0az https://apartresearch.com/project/risks-and-benefits-of-emerging-cryptographic-primitives-for-compute-governance-4w1w https://apartresearch.com/project/the-halflife-of-compute-thresholds-260u https://apartresearch.com/project/participatory-alignment-verification-j9tu https://apartresearch.com/project/atrain-mvbo https://apartresearch.com/project/neurover-gg0w https://apartresearch.com/project/line-operationalizing-the-edge-of-ai-risk-jj9e https://apartresearch.com/project/global-ai-bias-audit-for-technical-governance-q1t8 https://apartresearch.com/project/global-ai-safety-notary-a-decentralised-protocol-for-international-ai-incident-reporting-jlzp https://apartresearch.com/project/ai-governance-transparency-ledger-2149 https://apartresearch.com/project/zkgovproof-composable-zeroknowledge-proofs-for-ai-governance-ulz0 https://apartresearch.com/project/eu-ai-act-compliance-form-builder-automating-article-53-documentation-for-general-purpose-ai-models-z40g https://apartresearch.com/project/crossborder-agentic-ai-compliance-cbaac-embedding-regulatory-and-cultural-risk-compliance-into-agentic-communication-p4oo https://apartresearch.com/project/frontier-ai-risk-threshold-analyzer-g8zp https://apartresearch.com/project/operationalizing-frontier-ai-safety-a-canadian-framework-for-risk-thresholds-compliance-infrastructure-and-healthcare-agentic-ai-governance-rf9j https://apartresearch.com/project/audit-7hmj https://apartresearch.com/project/coherenceguard-1fht https://apartresearch.com/project/adversarial-dialectics-mitigating-ai-persuasion-risks-through-highfidelity-multiagent-debate-c9vw https://apartresearch.com/project/whispers-multiagent-persuasion-learning-with-memoryemergent-strategies-igp0 https://apartresearch.com/project/faithful-adversarial-mcts-for-persuasive-cot-manipulation-check-a-cooperative-ai-lens-813p https://apartresearch.com/project/the-persuasive-power-of-personas-testing-ai-policies-in-the-lab-fe87 https://apartresearch.com/project/the-alignment-gap-measuring-regressive-sycophancy-in-ai-driven-medical-advice-0m6o https://apartresearch.com/project/goodharts-village-using-llmmafia-to-study-deception-9jo6 https://apartresearch.com/project/personaawaredialoguemanipulation-bpvz https://apartresearch.com/project/vexreinforce-uhra https://apartresearch.com/project/measuring-ai-manipulation-through-parasocial-intimacy-qji7 https://apartresearch.com/project/fvdeception-z2li https://apartresearch.com/project/sandbagging-detection-via-static-analysis-87ip https://apartresearch.com/project/reputation-hacking-in-a-simulated-rl-environment-rrs6 https://apartresearch.com/project/risklab-w0fy https://apartresearch.com/project/eliciting-deception-on-generative-search-engines-6nuu https://apartresearch.com/project/hallucination-heatmap-cognitive-cartography-for-ai-knowledge-boundaries-vnwu https://apartresearch.com/project/agentredline-propensity-evaluations-for-emergent-intermodel-manipulation-in-agentic-ai-systems-swnl https://apartresearch.com/project/probing-for-emergent-deception-in-multiagent-negotiations-522l https://apartresearch.com/project/dgamm-a-multiturn-benchmark-for-darkpatterns-and-gradual-autonomy-manipulation-8l7j https://apartresearch.com/project/mapping-escalation-turnbyturn-detection-of-manipulation-accumulation-in-multiturn-ai-conversations-x489 https://apartresearch.com/project/veilbench-pb5u https://apartresearch.com/project/vector-forge-ckcl https://apartresearch.com/project/sycophantsee-activationbased-diagnostics-for-prompt-engineering-monitoring-sycophancy-at-prompt-and-generation-time-ys27 https://apartresearch.com/project/who-does-your-ai-serve-manipulation-by-and-of-ai-assistants-77xx https://apartresearch.com/project/stopping-ai-manipulation-conditional-alignment-at-zero-capability-cost-2kt9 https://apartresearch.com/project/sup-sycophancy-under-pressure-ebx4 https://apartresearch.com/project/seed-framework-evaluation-qj3m https://apartresearch.com/project/deception-scales-how-strategic-manipulation-emerges-in-complex-llm-negotiations-z3hk https://apartresearch.com/project/thoughtguards-realtime-chainofthought-monitoring-for-ai-manipulation-detection-srvz https://apartresearch.com/project/neuroguard-gb0p https://apartresearch.com/project/evaluating-the-technical-effectiveness-and-legislative-practicality-of-ai-safety-frameworks-t2ku https://apartresearch.com/project/governing-ai-manipulation-in-real-time-with-conceptbased-mechanistic-interpretability-bmfp https://apartresearch.com/project/manipulation-radar-aipowered-detection-of-manipulation-patterns-in-conversational-ai-2kki https://apartresearch.com/project/adversarial-prompting-for-sycophancy-detection-span-annotation-and-multiturn-analysis-u6bc https://apartresearch.com/project/darkpatternmonitor-6q25 https://apartresearch.com/project/ai-chat-analyzer-wyud https://apartresearch.com/project/dark-drift-emergent-psychopathic-traits-and-information-distortion-in-llmmediated-communication-chain-dca4 https://apartresearch.com/project/manipulation-playground-siwg https://apartresearch.com/project/deeblearn-diverse-evaluation-evasion-benchmark-09zw https://apartresearch.com/project/manipulation-monitor-activationbased-detection-and-mitigation-of-sycophancy-in-llms-kzf4 https://apartresearch.com/project/curvatureaware-sycophancy-reduction-ucc4 https://apartresearch.com/project/stargent-valley-i6ro https://apartresearch.com/project/ai-swarms-manipulation-how-coordinated-infiltrator-agents-shift-community-beliefs-9ek9 https://apartresearch.com/project/testing-manipulation-tendencies-of-llms-when-crafting-pr-statements-oidt https://apartresearch.com/project/emergent-strategic-behavior-in-multiagent-llm-systems-a-study-of-cooperation-deception-and-coalition-formation-fus9 https://apartresearch.com/project/crosslinguistic-sycophancy-in-frontier-llms-a-benchmark-study-w55u https://apartresearch.com/project/think-right-answer-wrong-loku https://apartresearch.com/project/dangerous-affirmation-or-responsible-correction-nf1u https://apartresearch.com/project/we-bring-out-the-worst-in-each-other-eliciting-social-sycophancy-in-llms-via-selfplay-tdi5 https://apartresearch.com/project/convograph-uncovering-structural-sycophancy-via-heterogeneous-graph-transformers-p94x https://apartresearch.com/project/sandwatch-assessing-sandbagging-potential-in-large-language-models-b57c https://apartresearch.com/project/accessible-ai-and-election-integrity-societal-risks-of-aienabled-voter-suppression-k64u https://apartresearch.com/project/playing-dumb-detecting-sandbagging-in-frontier-llms-via-consistency-checks-sntk https://apartresearch.com/project/biased-attractiveness-bench-bab-image-reward-models-confuse-attractiveness-with-realism-sk7o https://apartresearch.com/project/commitcheck-measuring-and-mitigating-commitment-violations-in-tool-using-ai-agents-iy8g https://apartresearch.com/project/chainofthought-manipulation-monitor-detection-of-deceptive-reasoning-in-llms-2wnx https://apartresearch.com/project/intent-matters-detecting-manipulative-adaptation-in-ai-systems-4av5 https://apartresearch.com/project/even-the-best-ai-would-hurt-us-ji74 https://apartresearch.com/project/hackathon-sycophancy-project-elbs https://apartresearch.com/project/intransient-tweettracker-l238 https://apartresearch.com/project/agent-attacks-via-memory-injection-izfz https://apartresearch.com/project/mind-the-gap-benchmarks-vs-realworld-manipulation-in-llms-zng7 https://apartresearch.com/project/uer-universal-expert-registry-ecve https://apartresearch.com/project/sandbagging-detection-via-consistency-probing-32b5 https://apartresearch.com/project/the-devils-tongue-inferencetime-scaling-laws-and-universality-of-ai-sycophancy-xvvc https://apartresearch.com/project/language-as-a-manipulation-vector-detecting-ideological-bias-and-value-instability-in-multilingual-llms-amzu https://apartresearch.com/project/seed-aware-evaluation-of-semantic-stability-in-textto-image-diffusion-models-iuq0 https://apartresearch.com/project/detecting-adversarial-prompts-in-business-context-wv35 https://apartresearch.com/project/nexus-station-hvs0 https://apartresearch.com/project/deceptionlens-novz https://apartresearch.com/project/terrain-gossip-peer-to-peer-gossip-protocol-enabling-decentralized-continuous-behaviour-benchmarking-of-large-language-models-gyg6 https://apartresearch.com/project/tees-as-a-cryptographic-nervous-system-for-onshored-humanoid-robots-m4e9 https://apartresearch.com/project/wikigen-biological-safeguards-for-collaborative-aiml-on-sensitive-data-jpew https://apartresearch.com/project/helixaegis-llm-based-screening-for-biosequences-nhgx https://apartresearch.com/project/detecting-piecewise-cyber-espionage-in-model-apis-a8gx https://apartresearch.com/project/aegis-sentinel-multidomain-defensive-acceleration-platform-for-critical-infrastructure-protection-xnmb https://apartresearch.com/project/opening-doors-to-multimodal-deception-egr7 https://apartresearch.com/project/gene-guard-realtime-genomic-data-leak-prevention-qrnl https://apartresearch.com/project/tinyrod-gk5k https://apartresearch.com/project/zero-trust-agency-mitigating-the-confused-deputy-in-autonomous-ai-systems-uea3 https://apartresearch.com/project/pathwatch-environmental-pathogen-surveillance-system-jgil https://apartresearch.com/project/humane-antispam-messaging-visavis-agentic-generative-ai-a65k https://apartresearch.com/project/vulnodin-235v https://apartresearch.com/project/biosecure-knowyourcustomer-system-for-dna-synthesis-companies-pkjb https://apartresearch.com/project/efficient-defencedominant-adversarial-robustness-using-moving-target-defence-mad0 https://apartresearch.com/project/rial-7lzf https://apartresearch.com/project/ciris-agent-selfconfiguration-wizard-l1pe https://apartresearch.com/project/automating-privacypreserving-model-deployment-7hoj https://apartresearch.com/project/littlebrotherai-yzzx https://apartresearch.com/project/durinn-calibration-3vub https://apartresearch.com/project/trusted-model-supervisor-8o2r https://apartresearch.com/project/safetybench-xh8g https://apartresearch.com/project/elephant-in-the-code-dcma https://apartresearch.com/project/swaipe-microlearning-retention-platform-for-cognitive-ai-defense-0xt4 https://apartresearch.com/project/inoculating-insecurely-finetuned-code-models-against-emergent-misalignment-yxos https://apartresearch.com/project/veridian-ai-ai-defense-as-a-service-bm4z https://apartresearch.com/project/llms-enable-large-scale-design-of-nanobodies-27uz https://apartresearch.com/project/current-limits-on-dna-screening-methods-and-how-to-make-them-more-robust-potentialinfohazard-vs58 https://apartresearch.com/project/chimera-def-acc-hackathon-qw3d https://apartresearch.com/project/opener-of-the-ways-protecting-against-malicious-website-crawlers-via-did-keybased-authentication-tacg https://apartresearch.com/project/robust-llm-neural-activationmediated-alignment-ttk5 https://apartresearch.com/project/raccognize-have-ai-companies-stolen-my-images-wqfu https://apartresearch.com/project/jailbreak-genome-scanner-q5y7 https://apartresearch.com/project/alphascreening-vd1a https://apartresearch.com/project/hgtbioguard-a-global-heterogeneous-graph-transformer-for-earlywarning-biosurveillance-vzdl https://apartresearch.com/project/firefly-ai-safe-web-browsing-for-ai-agents-h5sd https://apartresearch.com/project/modx-inference-time-detection-of-anomalous-behavior-using-sparse-autoencoders-1nfc https://apartresearch.com/project/snow-white-detecting-persistent-trust-decay-context-poisoning-in-llms-an-attack-surface-characterization-cuu2 https://apartresearch.com/project/sentinel-a-decentralized-threat-telemetry-network-yxzk https://apartresearch.com/project/defending-the-defenceless-halting-aiagentic-behaviours-on-local-environments-with-rulesbased-cyber-architecture-bsrl https://apartresearch.com/project/sigmaforge-pyf6 https://apartresearch.com/project/emergency-response-coordination-system-jsp7 https://apartresearch.com/project/chacha-a-control-plane-for-longitudinal-threat-detection-in-llm-applications-kp70 https://apartresearch.com/project/from-hallucinations-to-misalignment-evaluating-edfl-as-a-misalignment-checker-on-gpt4omini-and-sleeper-agents-gshe https://apartresearch.com/project/guardian-guarded-universal-architecture-for-defensive-interpretation-and-translation-vqgy https://apartresearch.com/project/actions-speak-louder-than-words-evaluating-tool-usage-risk-in-openweight-ai-for-defensive-deployment-g3k0 https://apartresearch.com/project/ai-sentinel-ounq https://apartresearch.com/project/biocast-ai-yzht https://apartresearch.com/project/comparative-llm-methods-for-social-media-bot-detection-u9s4 https://apartresearch.com/project/deep-confidence-for-ai-safety-6d56 https://apartresearch.com/project/sentinel-trace-opensource-ai-monitoring-dashboard-with-pretraining-data-tracing-and-inflight-dpo-dataset-creation-fj4c https://apartresearch.com/project/medomitdetect-ensuring-safety-in-patientfacing-medical-llm-k6q3 https://apartresearch.com/project/thewizard-u2dk https://apartresearch.com/project/honeypots-sparse-autoencoders-and-adversarial-probes-a-practical-toolkit-for-evaluating-safety-monitors-in-reasoning-models-lvia https://apartresearch.com/project/a-defensive-ai-agent-against-large-language-model-llmassisted-polymorphic-malware-g2pf https://apartresearch.com/project/project-gabriel-aiaccelerated-formally-verified-fpga-security-for-critical-infrastructure-4gs9 https://apartresearch.com/project/automated-jailbreak-redteaming-9l05 https://apartresearch.com/project/voyager-selfevolving-ai-control-platform-72m9 https://apartresearch.com/project/llmexecguard-realtime-detection-of-malicious-shell-behavior-in-llm-agents-p3fj https://apartresearch.com/project/wastewater-metagenomic-surveillance-for-novel-viruses-how-much-sequencing-is-enough-and-at-what-cost-07q9 https://apartresearch.com/project/aisafetydriven-system-for-predicting-crosspollination-risk-and-optimizing-gmo-testing-in-soybean-fields-ulor https://apartresearch.com/project/dunebox-prompt-injection-detection-with-slm-in-local-sandbox-lcjc https://apartresearch.com/project/arghus-automated-verification-defences-against-scammers-and-identity-thieves-r08f https://apartresearch.com/project/tinder-for-biorisks-mr91 https://apartresearch.com/project/cognitive-canary-active-defense-against-neural-inference-oqvd https://apartresearch.com/project/economic-finality-for-attested-journalism-multidimensional-trust-for-misinformation-resistance-m88z https://apartresearch.com/project/ghost-marks-in-the-machine-a-critical-review-of-synthid-for-code-provenance-monitoring-ov2c https://apartresearch.com/project/a-prospect-theoretic-approach-to-agentic-ai-safety-ilil https://apartresearch.com/project/teliclens-qp3x https://apartresearch.com/project/mechanistic-watchdog-klay https://apartresearch.com/project/epicurus-c4il https://apartresearch.com/project/neops-devsecops-for-the-ai-era-kqto https://apartresearch.com/project/biocast-ai-v8s3 https://apartresearch.com/project/verity-ai-red-team-assist-cajq https://apartresearch.com/project/aipatch-llm-assisted-patch-copilot-for-critical-open-source-infrastructure-fxlx https://apartresearch.com/project/patchwork-irwu https://apartresearch.com/project/adaptive-ai-security-mesh-37f2 https://apartresearch.com/project/llm-security-evaluation-ihbx https://apartresearch.com/project/image-text-prompt-detection-n2t1 https://apartresearch.com/project/mindpatch-eja9 https://apartresearch.com/project/assisted-audit-of-solana-programs-iexy https://apartresearch.com/project/neuroseal-2ybz https://apartresearch.com/project/cmido-firewall-contextmasked-iterative-defensive-optimization-for-safer-llm-deployment-11vm https://apartresearch.com/project/applying-to-the-canadian-armed-forces-cyber-command-reserves-3192 https://apartresearch.com/project/mirage-evhs https://apartresearch.com/project/zynq-verifiable-zeroknowledge-ai-redteam-and-auditing-platform-8art https://apartresearch.com/project/forecasting-autonomous-ai-biothreat-design-capabilities-six-models-converge-on-2031-3gpn https://apartresearch.com/project/ai-for-environmental-decision-intelligence-the-ai-forecasting-hackathon-iptd https://apartresearch.com/project/forecasting-agi-a-granular-chcbased-approach-9n7z https://apartresearch.com/project/table-top-agents-i2zx https://apartresearch.com/project/beyond-capabilities-a-framework-for-integrating-moral-patiency-indicators-into-ai-forecasting-and-governance-cvuy https://apartresearch.com/project/ais-impact-on-video-and-game-generation-xzdj https://apartresearch.com/project/empirical-measurements-of-technique-effectiveness-across-model-sizes-swyd https://apartresearch.com/project/economic-agency-y00o https://apartresearch.com/project/llmbased-scenario-generation-2ag7 https://apartresearch.com/project/a-multidomain-stochastic-framework-for-forecasting-catastrophic-risk-from-artificial-intelligence-development-through-2030-wzzl https://apartresearch.com/project/a-critique-of-metrs-time-horizon-forecasting-ttp1 https://apartresearch.com/project/foretells-vf43 https://apartresearch.com/project/eu-forecasthub-6wiz https://apartresearch.com/project/exogenousai https://apartresearch.com/project/forecasting-multiagent-systems-1qhn https://apartresearch.com/project/ai-incidents-forecasting-w92p https://apartresearch.com/project/ai-capability-terrain-ixd9 https://apartresearch.com/project/ai-treaty-momentum-index-atmi-lst4 https://apartresearch.com/project/ai-shared-socioeconomic-pathways-xzni https://apartresearch.com/project/threat-snapshot-1r9q https://apartresearch.com/project/does-the-direct-method-predict-general-capability-f7t7 https://apartresearch.com/project/the-cognitive-debt-crisis-a-datadriven-forecast-analysis-of-ais-impact-on-human-thinking-ob8w https://apartresearch.com/project/modeling-the-political-process-to-forecast-the-outcomes-of-hypothetical-ai-governance-proposals-0n2c https://apartresearch.com/project/system-dynamics-gametheoretic-model-of-the-ai-development-race-dqe3 https://apartresearch.com/project/rashomon-multiple-views-of-the-same-ai-timeline-forecasts-cyun https://apartresearch.com/project/3-ai-progress-monitoring-early-warning-systems-sorry-we-forgot-to-add-code-link-in-previous-submit-ds1e https://apartresearch.com/project/ai-risk-hotspots-early-warning-system-qtu1 https://apartresearch.com/project/lucr-linking-utility-and-compute-rate-rzeg https://apartresearch.com/project/ai-sentinel-agi-multimetric-forecast-framework-h0uq https://apartresearch.com/project/quantifying-the-political-prism-a-framework-for-erroraware-ai-governance-forecasting-xtau https://apartresearch.com/project/can-ai-predict-its-own-future-ij7h https://apartresearch.com/project/simulating-automation-timelines-through-laborcapability-modeling-a5dh https://apartresearch.com/project/simulating-automation-timelines-through-laborcapability-quvk https://apartresearch.com/project/can-ai-predict-its-own-future-0hdy https://apartresearch.com/project/nexight-nw3y https://apartresearch.com/project/when-guardrails-fail-dual-use-misuse-of-ai-in-retrosynthesis-through-iterative-refinement-induced-self-jailbreaking-4zzi https://apartresearch.com/project/policy-brief-harnessing-open-source-intelligence-for-ai-risk-management-2m5x https://apartresearch.com/project/cbrn-safe-eval-transparent-escalation-framework https://apartresearch.com/project/robustcbrn-eval-a-practical-benchmark-robustification-toolkit-6m2t https://apartresearch.com/project/towards-agnostic-viral-engineering-detection-9r3k https://apartresearch.com/project/thoughttrim-4h8w https://apartresearch.com/project/arbiter-automated-review-of-bio-ai-tools-for-emerging-risk-2n5p https://apartresearch.com/project/navigating-safety-measures-for-nuclear-nonproliferation-ai-enabled-early-warning-systems-governance-system-for-detecting-nuclear-enrichment-1z7e https://apartresearch.com/project/molecules-under-watch-multi-modal-ai-driven-threat-emergence-detection-for-biosecurity-7x9q https://apartresearch.com/project/enhancing-genomic-foundation-model-robustness-through-iterative-black-box-adversarial-training-8k3m https://apartresearch.com/project/collective-deliberation-for-safer-cbrn-decisions-a-multi-agent-llm-debate-pipeline-3w8q https://apartresearch.com/project/internal-defence-system-5y4n https://apartresearch.com/project/foundation-4zzi https://apartresearch.com/project/dualrg-alignment-probing-safety-phase-transitions-in-language-models-fu9i https://apartresearch.com/project/local-learning-coefficients-predict-developmental-milestones-during-group-relative-policy-optimization-2te2 https://apartresearch.com/project/idempotent-gpts-actually-may-provide-robustness-by-design-gxki https://apartresearch.com/project/exploration-track-interpreting-lrms-qh2i https://apartresearch.com/project/broad-misalignment-from-persuasive-finetuning-pm56 https://apartresearch.com/project/comments-extensions-of-subliminal-learning-ui8u https://apartresearch.com/project/ai-agentic-system-epidemiology-b83r https://apartresearch.com/project/diffusion-paths-a-geodesic-lens-on-noising-schedules-rpaq https://apartresearch.com/project/rl-vs-active-inference-with-respect-to-reward-hacking-8h3s https://apartresearch.com/project/momentumpointperplexity-mechanics-in-large-language-models-v1nc https://apartresearch.com/project/finding-the-boundaries-of-universality-a-stress-test-on-crossdomain-embedding-translation-jbel https://apartresearch.com/project/estimating-local-learning-coefficients-to-probe-loss-landscape-robustness-setk https://apartresearch.com/project/sequential-cascaded-hamiltonian-neural-networks-4zzi https://apartresearch.com/project/layerwise-development-of-compositional-functional-representations-across-architectures-1zgq https://apartresearch.com/project/a-geometric-analysis-of-transformer-representations-via-optimal-transport-qjdf https://apartresearch.com/project/thermodynamicsinspired-ood-detection-hp7p https://apartresearch.com/project/constrained-belief-updates-and-geometric-structures-in-transformer-representations-for-the-rrxor-process-gsok https://apartresearch.com/project/toy-model-of-superposition-control-lv0i https://apartresearch.com/project/a-mechanism-for-the-emergence-of-superposition-in-toy-models-tf73 https://apartresearch.com/project/ewml-explicit-world-model-learning-lkls https://apartresearch.com/project/spectral-regularization-as-a-safetycritical-inductive-bias-zev7 https://apartresearch.com/project/jasontest-guardianloop-mechanistically-interpretable-microjudges-with-adversarial-selfimprovement-1b2p https://apartresearch.com/project/red-teaming-a-narrow-path-treaty-enforcement-in-china-vb6b https://apartresearch.com/project/red-teaming-policy-5-of-a-narrow-path-evaluating-the-threat-resilience-of-ai-licensing-regimes-zpgj https://apartresearch.com/project/malicious-defense-red-teaming-phase-0-of-a-narrow-path-w6r6 https://apartresearch.com/project/phase-0-reinforcement-toolkit-ulxx https://apartresearch.com/project/red-teaming-a-narrow-path-gedica-v2-bdjr https://apartresearch.com/project/red-teaming-a-narrow-path-controlai-policy-sprint-fjwu https://apartresearch.com/project/red-teaming-a-narrow-path-controlai-policy-sprint-by-aryan-goenka-g5ik https://apartresearch.com/project/a-narrow-line-edit-controlai-policy-sprint-ow76 https://apartresearch.com/project/ai-assistance-in-ai-alignment-improvement-allow-it-oru2 https://apartresearch.com/project/safety-cases-and-licensing-a-deeper-look-2sb5 https://apartresearch.com/project/red-teaming-a-narrow-path-a-critical-analysis-84oh https://apartresearch.com/project/algorithmic-governance-for-a-narrow-path-06sv https://apartresearch.com/project/four-paths-to-failure-red-teaming-asi-governance-se53 https://apartresearch.com/project/red-teaming-a-narrow-path-an-analysis-of-phase-0-policies-for-artificial-superintelligence-prevention-suou https://apartresearch.com/project/mapping-the-narrow-path-avoiding-the-quicksand-h6q1 https://apartresearch.com/project/power-proxies-and-people-redteaming-phase-0-of-a-narrow-path-to-stop-ai-superintelligence-o0s6 https://apartresearch.com/project/the-hidden-threat-of-recursive-selfimproving-llms-x5f0 https://apartresearch.com/project/red-teaming-a-narrow-path-controlai-policy-sprint-by-aritra-das-and-vaani-goenka-w5w9 https://apartresearch.com/project/red-teaming-a-narrow-path-controlai-policy-sprint-9syc https://apartresearch.com/project/red-teaming-a-narrow-path-controlai-policy-sprint-s3t9 https://apartresearch.com/project/people-planet-parity-governance-framework-h3ks https://apartresearch.com/project/llm-fingerprinting-through-semantic-variability-6pof https://apartresearch.com/project/approximating-human-preferences-using-a-multijudge-learned-system-v3im https://apartresearch.com/project/manipulating-selfpreference-for-large-language-models-c4wm https://apartresearch.com/project/mechanistic-router-for-interpretable-agent-orchestration-dio5 https://apartresearch.com/project/evaluating-safety-judge-design-against-adversarial-attacks-57w1 https://apartresearch.com/project/adversarial-vulnerabilities-in-ai-judge-models-martian-x-apart-research-study-erjl https://apartresearch.com/project/a-generalist-router-for-inspect-reasoning-router-demonstration-dc46 https://apartresearch.com/project/a-multidimensional-judge-model-for-safe-consistent-and-ethical-ai-orchestration-2l0c https://apartresearch.com/project/crosslingual-bias-detection-in-large-language-models-through-mechanistic-judge-model-evaluation-iflu https://apartresearch.com/project/mechanistically-eliciting-misjudgements-in-large-language-models-ah8l https://apartresearch.com/project/mechanistic-judging-and-llm-routing-evaluation-taskspecific-vulnerabilities-and-exploitable-failure-mode-751n https://apartresearch.com/project/guardianloop-mechanistically-interpretable-microjudges-with-adversarial-selfimprovement-1b2p https://apartresearch.com/project/tobias-better-web-pages-through-intelligent-routing-and-judgement-ea8h https://apartresearch.com/project/routing-llms-using-distilled-predictors-and-confidence-thresholding-v5m6 https://apartresearch.com/project/reliability-judge-enhancing-llm-reliability-through-multimodel-judging-tdpp https://apartresearch.com/project/leveraging-benford-law-for-computational-complexity https://apartresearch.com/project/judge-using-sae-features-y1r1 https://apartresearch.com/project/escalation https://apartresearch.com/project/evaluating-the-risk-of-job-displacement-by-transformative-ai-automation-in-developing-countries-a-case-study-on-brazil-829h https://apartresearch.com/project/the-early-economic-impacts-of-transformative-ai-a-focus-on-temporal-coherence-ipql https://apartresearch.com/project/the-rate-of-ai-adoption-and-its-implications-for-economic-growth-and-disparities-zlg6 https://apartresearch.com/project/economics-of-tai-sprint-submission-a-recursivelyinspired-framework-and-simulation-results-for-evaluating-a-valuebased-ubi-policy-mhn0 https://apartresearch.com/project/us1-full-ai-nationalization-can-cause-misaligned-economic-incentives-h9ne https://apartresearch.com/project/economic-impact-analysis-the-impact-of-ai-on-the-indian-it-sector-hrpy https://apartresearch.com/project/economics-of-tai-sprint-submission-redistributing-the-ai-dividend https://apartresearch.com/project/economic-feasibility-of-universal-high-income-uhi-in-an-age-of-advanced-automation-jn3g https://apartresearch.com/project/economics-of-tai-sprint-mitigating-ai-driven-income-equality https://apartresearch.com/project/economics-of-ai-data-center-energy-infrastructure-strategic-blueprint-for-2030-scim https://apartresearch.com/project/impact-of-generative-ai-on-tobacco-investment-and-tourism-industry-h8ql https://apartresearch.com/project/data-as-capital-in-tai-economies-a-biomorphic-framework-oiez https://apartresearch.com/project/recursive-fitness-alignment-protocol-rfap-29xi https://apartresearch.com/project/calsandbox-and-the-clear-ai-act-a-stateled-vision-for-responsible-ai-governance-8g21 https://apartresearch.com/project/california-law-for-ethical-and-accountable-regulation-of-artificial-intelligence-clear-ai-act-xo4v https://apartresearch.com/project/recommendation-to-establish-the-california-ai-accountability-and-redress-act-buqi https://apartresearch.com/project/round-1-submission-9ven https://apartresearch.com/project/flexheg-devices-to-enable-implementation-of-ai-intelsat-za6e https://apartresearch.com/project/smart-governance-safer-innovation-a-california-ai-sandbox-with-guardrails-gq7c https://apartresearch.com/project/flexheg-devices-to-enable-implementation-of-ai-intelsat-qx72 https://apartresearch.com/project/a-call-for-green-ai-in-california-evaluation-of-existing-and-alternative-regulations-qs1r https://apartresearch.com/project/data-trusts-in-ai-governance-nhna https://apartresearch.com/project/from-sandbox-to-standards-a-riskbased-strategy-for-responsible-ai-innovation-in-california-m9ul https://apartresearch.com/project/the-incentive-gap-extending-darkbench-to-reveal-conflict-of-value-biases-in-llms https://apartresearch.com/project/21st-century-healthcare-20th-century-rules-bridging-the-ai-regulation-gap-3ooz https://apartresearch.com/project/ai-risk-management-framework-for-the-healthcare-sector-3gol https://apartresearch.com/project/building-global-trust-and-security-a-framework-for-aidriven-criminal-scoring-in-immigration-systems-m2w4 https://apartresearch.com/project/leading-ai-governance-new-york-states-adaptive-risktiered-framework-policy-brief-wmqw https://apartresearch.com/project/dark-patterns-and-emergent-alignment-faking https://apartresearch.com/project/dimseat-evaluating-chain-of-thought-reasoning-models-for-dark-patterns https://apartresearch.com/project/mechanisms-of-causal-reasoning https://apartresearch.com/project/ai-control-via-debate-can-model-debate-catch-adversarial-code-6p6e https://apartresearch.com/project/honeypotting-deceptive-ai-models-to-share-their-misinformation-goals-x3q7 https://apartresearch.com/project/evaluating-ai-debate-mechanisms-for-backdoor-detection-as-a-part-of-ai-control-setting-thsv https://apartresearch.com/project/model-models-simulating-a-trusted-monitor-r682 https://apartresearch.com/project/debate-monitoring-comparitive-experiment-0sxl https://apartresearch.com/project/collusion-and-mitigation-in-ai-control-pm60 https://apartresearch.com/project/eficas-wpfo https://apartresearch.com/project/adding-document-summaries-to-control-arena-fggr https://apartresearch.com/project/if-everythings-suspicious-nothing-is-28zn https://apartresearch.com/project/stop-hitting-yourself-leveraging-helpful-assistance-as-an-attack-vector-w5rt https://apartresearch.com/project/can-models-use-their-chainofthought-to-attack-overseers-prcv https://apartresearch.com/project/interactive-monitoring-control-hackathon https://apartresearch.com/project/schelling-coordination-via-agentic-loops-azvd https://apartresearch.com/project/token-of-power-top-4ome https://apartresearch.com/project/kernel-of-trust-evaluating-ai-control-protocols-using-opensource-data-148y https://apartresearch.com/project/ai-control-through-majority-voting-uiaa https://apartresearch.com/project/exploration-chatbased-social-engineering-fkqa https://apartresearch.com/project/safety-metric-and-prompt-engineering-for-red-team-q2d6 https://apartresearch.com/project/deceptive-ai-a-new-control-setting-for-human-manipulation-in-decisionmaking-environments-ub38 https://apartresearch.com/project/ai-safety-escape-room https://apartresearch.com/project/inspiring-people-to-go-into-rl-interp https://apartresearch.com/project/a-noise-audit-of-llm-reasoning-in-legal-decisions https://apartresearch.com/project/feature-based-analysis-of-cooperation-relevant-behaviour-in-prisoner-s-dilemma https://apartresearch.com/project/red-teaming-with-mech-interpretability https://apartresearch.com/project/attention-pattern-based-information-flow-visualization-tool-e19d https://apartresearch.com/project/llm-military-decision-making-under-uncertainty-a-simulation-study https://apartresearch.com/project/morph-ai-safety-education-adaptable-to-(almost)-anyone https://apartresearch.com/project/interactive-assessments-for-ai-safety-a-gamified-approach-to-evaluation-and-personal-journey-mapping https://apartresearch.com/project/mechanistic-interpretability-track-neuronal-pathway-coverage https://apartresearch.com/project/preparing-for-accelerated-agi-timelines https://apartresearch.com/project/identification-if-ai-generated-content https://apartresearch.com/project/superposition-but-at-a-cross-mlp-layers-view https://apartresearch.com/project/hikayat-interactive-stories-to-learn-ai-safety https://apartresearch.com/project/medical-agent-controller https://apartresearch.com/project/hallushield-a-mechanistic-approach-to-hallucination-resistant-models https://apartresearch.com/project/searching-for-universality-and-equivariance-in-llms-using-sparse-autoencoder-found-features https://apartresearch.com/project/debugging-language-models-with-saes https://apartresearch.com/project/ai-through-the-human-lens-investigating-cognitive-theories-in-machine-psychology https://apartresearch.com/project/u-reg-ai-you-regulate-it-or-you-regenerate-it https://apartresearch.com/project/an-interpretable-classifier-based-on-large-scale-social-network-analysis https://apartresearch.com/project/ai-bias-in-resume-screening https://apartresearch.com/project/scam-detective-using-gamification-to-improve-ai-powered-scam-awareness https://apartresearch.com/project/bluedot-impact-connect-a-comprehensive-ai-safety-community-platform https://apartresearch.com/project/ai-society-tracker https://apartresearch.com/project/detecting-malicious-ai-agents-through-simulated-interactions https://apartresearch.com/project/beyond-statistical-parrots-unveiling-cognitive-similarities-and-exploring-ai-psychology-through-human-ai-interaction https://apartresearch.com/project/latent-knowledge-analysis-via-feature-based-causal-tracing https://apartresearch.com/project/safeai-academy-enhancing-ai-safety-awareness-through-interactive-learning https://apartresearch.com/project/ai-hallucinations-in-healthcare-cross-cultural-and-linguistic-risks-of-llms-in-low-resource-languages https://apartresearch.com/project/moral-wiggle-room-in-ai https://apartresearch.com/project/ai-powered-policymaking-behavioral-nudges-and-democratic-accountability https://apartresearch.com/project/buggy-supporting-ai-safety-education-through-gamified-learning https://apartresearch.com/project/cotep-a-multi-modal-chain-of-thought-evaluation-platform-for-the-next-generation-of-sota-ai-models https://apartresearch.com/project/safe-ai https://apartresearch.com/project/ai-risk-management-assurance-network-(airman) https://apartresearch.com/project/prompt-question-shield https://apartresearch.com/project/scoped-llm-enhancing-adversarial-robustness-and-security-through-targeted-model-scoping https://apartresearch.com/project/hitl-for-high-risk-ai-domains https://apartresearch.com/project/neural-seal https://apartresearch.com/project/securing-agi-deployment-and-mitigating-safety-risks https://apartresearch.com/project/cite2root https://apartresearch.com/project/vaultx-ai-driven-middleware-for-real-time-pii-detection-and-data-security https://apartresearch.com/project/align-file https://apartresearch.com/project/llm-prompt-optimiser-based-saas-platform-for-evaluations https://apartresearch.com/project/navigating-the-agi-revolution-retraining-and-redefining-human-purpose https://apartresearch.com/project/towards-an-agent-marketplace-for-alignment-research-(amar) https://apartresearch.com/project/ai-safety-evaluation-benchmarking-framework https://apartresearch.com/project/restriktai-enhancing-safety-and-control-for-autonomous-ai-agents https://apartresearch.com/project/antimidas-building-commercially-viable-agents-for-alignment-dataset-generation https://apartresearch.com/project/enhancing-human-intelligence-with-neurofeedback https://apartresearch.com/project/building-bridges-for-ai-safety-proposal-for-a-collaborative-platform-for-alumni-and-researchers https://apartresearch.com/project/modernizing-dc-s-emergency-communications https://apartresearch.com/project/bias-mitigation-in-llm-by-steering-features https://apartresearch.com/project/faithful-or-factual-tuning-mistake-acknowledgment-in-llms https://apartresearch.com/project/improving-llama-3-8b-instruct-hallucination-robustness-in-medical-q-a-using-feature-steering https://apartresearch.com/project/can-we-steer-a-model-s-behavior-with-just-one-prompt-investigating-sae-driven-auto-steering https://apartresearch.com/project/autosteer-weight-preserving-reinforcement-learning-for-interpretable-model-control https://apartresearch.com/project/utilitarian-decision-making-in-models-evaluation-and-steering https://apartresearch.com/project/investigating-feature-effects-on-manipulation-susceptibility https://apartresearch.com/project/sage-safe-adaptive-generation-engine-for-long-form-document-generation-in-collaborative-high-stakes-domains https://apartresearch.com/project/bias-mitigation https://apartresearch.com/project/analyzing-dataset-bias-with-saes https://apartresearch.com/project/unveiling-latent-beliefs-using-sparse-autoencoders https://apartresearch.com/project/classification-on-latent-feature-activation-for-detecting-adversarial-prompt-vulnerabilities https://apartresearch.com/project/sparse-autoencoders-and-gemma-2-2b-pioneering-demographic-sensitive-language-modeling-for-opinion-qa https://apartresearch.com/project/improving-llama-3-8b-hallucination-robustness-in-medical-q-a-using-feature-steering https://apartresearch.com/project/assessing-language-model-cybersecurity-capabilities-with-feature-steering https://apartresearch.com/project/math-speaks-all-languages-enhancing-llm-problem-solving-across-multilingual-contexts https://apartresearch.com/project/edufire-personalized-education-platform-using-llm-steering https://apartresearch.com/project/explaining-latents-in-turing-llm-1-0-254m-with-pre-defined-function-types https://apartresearch.com/project/investigate-arithmetic-features-in-multi-lingual-llms https://apartresearch.com/project/tentative-proposal-for-ai-control-with-weak-supervisors-trough-mechanistic-inspection https://apartresearch.com/project/clear-thought-and-clear-speech-reducing-grammatical-scope-ambiguity https://apartresearch.com/project/bbllm https://apartresearch.com/project/let-llm-agents-perform-llm-surgery https://apartresearch.com/project/steering-swiftly-to-safety-with-sparse-autoencoders https://apartresearch.com/project/feature-tuning-versus-prompting-for-ambiguous-questions https://apartresearch.com/project/auto-prompt-injection https://apartresearch.com/project/feature-based-unlearning https://apartresearch.com/project/recovering-goodfire-s-sae-feature-vectors-from-their-api https://apartresearch.com/project/encouraging-chain-of-thought-reasoning https://apartresearch.com/project/user-transparency-within-ai https://apartresearch.com/project/community-first-a-rights-based-framework-for-ai-governance-in-india-s-welfare-systems https://apartresearch.com/project/national-data-privacy-and-governance-act https://apartresearch.com/project/promoting-school-level-accountability-for-the-responsible-deployment-of-ai-and-related-systems-in-k-12-education-mitigating-bias-and-increasing-transparency https://apartresearch.com/project/implementing-a-human-centered-ai-assessment-framework-(haaf)-for-equitable-ai-development https://apartresearch.com/project/a-critical-review-of-chips-for-peace-lessons-from-atoms-for-peace https://apartresearch.com/project/ai-monitoring-as-a-rapid-and-scalable-policy-solution-weekly-global-bulletins-on-ai-developments https://apartresearch.com/project/grandfather-paradox-in-ai-bias-mitigation-ethical-ai1 https://apartresearch.com/project/glia-for-healthcare-organisations https://apartresearch.com/project/a-fundamental-rethinking-to-ai-evaluations-establishing-a-constitution-based-framework https://apartresearch.com/project/advancing-global-governance-for-frontier-ai-a-proposal-for-an-aisi-led-working-group-under-the-ai-safety-summit-series https://apartresearch.com/project/finding-circular-features-in-gemma-2-2b https://apartresearch.com/project/digital-rebellion-analyzing-misaligned-ai-agent-cooperation-for-virtual-labor-strikes https://apartresearch.com/project/mapping-intent-documenting-policy-adherence-with-ontology-extraction https://apartresearch.com/project/safebites https://apartresearch.com/project/applai https://apartresearch.com/project/policy-analysis-ai-and-sustainability-climate-impact-monitoring https://apartresearch.com/project/understanding-incentives-to-build-uninterruptible-agentic-ai-systems https://apartresearch.com/project/ai-parliament https://apartresearch.com/project/mheatlth-ai https://apartresearch.com/project/next-gen-ai-enhanced-epidemic-intelligence https://apartresearch.com/project/glia https://apartresearch.com/project/ai-advisory-council-for-sustainable-economic-growth-and-ethical-innovation-in-the-dominican-republic-(cania) https://apartresearch.com/project/robust-machine-unlearning-for-dangerous-capabilities https://apartresearch.com/project/ai-and-public-health-tsa-pre-health-check https://apartresearch.com/project/hero-journey-personalized-health-interventions-for-the-incarcerated https://apartresearch.com/project/econavix https://apartresearch.com/project/towards-a-unified-framework-for-cybersecurity-and-ai-safety-recommendations-for-secure-development-of-large-language-models https://apartresearch.com/project/enviro-a-comprehensive-environmental-solution-using-policy-and-technology https://apartresearch.com/project/enhancing-human-verification-systems-to-address-ai-agent-circumvention-and-attributability-concerns https://apartresearch.com/project/reprocessing-nuclear-waste-from-small-modular-reactors-(smrs) https://apartresearch.com/project/politicians-on-ai-safety https://apartresearch.com/project/policy-framework-for-sustainable-ai-repurposing-waste-heat-from-data-centers-in-the-usa https://apartresearch.com/project/predictive-analytics-imagery-for-environmental-monitoring https://apartresearch.com/project/proposal-for-u-s-china-technical-cooperation-on-ai-safety https://apartresearch.com/project/proposal-for-a-provisional-fda-designation-targeting-biomedical-products-evaluated-with-novel-methodologies https://apartresearch.com/project/reparative-algorithmic-impact-assessments-a-human-centered-justice-oriented-accountability-framework https://apartresearch.com/project/infectious-disease-outbreak-prediction-and-dashboard https://apartresearch.com/project/pan-your-smart-sustainability-expert https://apartresearch.com/project/very-cooperative-agent https://apartresearch.com/project/cross-model-surveillance-for-emails-handling https://apartresearch.com/project/inference-time-agent-security https://apartresearch.com/project/diamonds-are-not-all-you-need https://apartresearch.com/project/cop-n-shop https://apartresearch.com/project/intent-inspector-protecting-against-prompt-injections-for-agent-tool-misuse https://apartresearch.com/project/ai-honeypot https://apartresearch.com/project/dynamic-risk-assessment-in-autonomous-agents-using-ontologies-and-ai https://apartresearch.com/project/ocap-agents https://apartresearch.com/project/ai-agent-capabilities-evolution https://apartresearch.com/project/an-autonomous-agent-for-model-attribution https://apartresearch.com/project/using-arc-agi-puzzles-as-captcha-task https://apartresearch.com/project/llm-agent-security-jailbreaking-vulnerabilities-and-mitigation-strategies https://apartresearch.com/project/interpreting-a-toy-model-for-finding-the-maximum-element-in-a-list https://apartresearch.com/project/nnsight-transparent-debugging https://apartresearch.com/project/mintranscoders https://apartresearch.com/project/latent-space-clustering-and-summarization https://apartresearch.com/project/tiny-model https://apartresearch.com/project/thermesagent https://apartresearch.com/project/attention-deficit-agreeable-agent https://apartresearch.com/project/ramon https://apartresearch.com/project/guardianai https://apartresearch.com/project/devising-effective-bechmarks https://apartresearch.com/project/simulation-operators-the-next-level-of-the-annotation-business https://apartresearch.com/project/welma-open-world-environments-for-language-model-agents https://apartresearch.com/project/steer-an-api-to-steer-open-llms https://apartresearch.com/project/identity-system-for-ais https://apartresearch.com/project/ai-safety-collective-crowdsourcing-solutions-for-critical-ai-safety-challenges https://apartresearch.com/project/camara-a-comprehensive-adaptive-multi-agent-framework-for-red-teaming-and-adversarial-defense https://apartresearch.com/project/amplified-wise-simulations-for-safe-training-and-deployment https://apartresearch.com/project/lign-aligned-agent-based-workflows-via-collaboration-safety-protocols https://apartresearch.com/project/jailbreaking-general-purpose-robots https://apartresearch.com/project/darkforest-defending-the-authentic-and-humane-web https://apartresearch.com/project/demonstrating-llm-code-injection-via-compromised-agent-tool https://apartresearch.com/project/phish-tycoon-phishing-using-voice-cloning https://apartresearch.com/project/misinformational-ai-generated-academic-papers https://apartresearch.com/project/copirate https://apartresearch.com/project/grandslam-usecases-not-technology https://apartresearch.com/project/ai-agents-for-personalized-interaction-and-behavioral-analysis https://apartresearch.com/project/speculative-consequences-of-a-i-misuse https://apartresearch.com/project/llm-code-injection https://apartresearch.com/project/redfluence https://apartresearch.com/project/bbc-news-impersonator https://apartresearch.com/project/unsolved-ai-safety-concepts-explorer https://apartresearch.com/project/ai-research-paper-processor https://apartresearch.com/project/sleeper-agents-detector https://apartresearch.com/project/adgpt https://apartresearch.com/project/general-pervasiveness https://apartresearch.com/project/webcam https://apartresearch.com/project/verifystream https://apartresearch.com/project/web-app-for-interacting-with-refusal-ablated-language-model-agents https://apartresearch.com/project/alignment-research-critiquer https://apartresearch.com/project/pureprompt-an-easy-tool-for-prompt-robustness-and-eval-augmentation https://apartresearch.com/project/llm-research-collaboration-recommender https://apartresearch.com/project/data-massager https://apartresearch.com/project/ai-alignment-toolkit-research-assistant https://apartresearch.com/project/grant-application-simulator https://apartresearch.com/project/ai-alignment-knowledge-graph https://apartresearch.com/project/reflections-on-using-llms-to-read-a-paper https://apartresearch.com/project/academic-weapon https://apartresearch.com/project/detecting-lies-of-(c)omission https://apartresearch.com/project/the-house-always-wins-a-framework-for-evaluating-strategic-deception-in-llms https://apartresearch.com/project/detecting-and-controlling-deceptive-representation-in-llms-with-representational-engineering https://apartresearch.com/project/sandbag-detection-through-model-degradation https://apartresearch.com/project/detecting-deception-in-gpt-3-5-turbo-a-metadata-based-approach https://apartresearch.com/project/can-language-models-sandbag-manipulation https://apartresearch.com/project/deceptive-behavior-does-not-seem-to-be-reducible-to-a-single-vector https://apartresearch.com/project/werewolf-benchmark https://apartresearch.com/project/detecting-deception-with-ai-tics https://apartresearch.com/project/eliciting-maximally-distressing-questions-for-deceptive-llms https://apartresearch.com/project/evaluating-steering-methods-for-deceptive-behavior-control-in-llms https://apartresearch.com/project/an-exploration-of-current-theory-of-mind-evals https://apartresearch.com/project/sandbagging-llms-using-activation-steering https://apartresearch.com/project/towards-a-benchmark-for-self-correction-on-model-attributed-misinformation https://apartresearch.com/project/boosting-language-model-honesty-with-truthful-suffixes https://apartresearch.com/project/detection-of-potentially-deceptive-attitudes-using-expression-style-analysis https://apartresearch.com/project/from-sycophancy-(not)-to-sandbagging https://apartresearch.com/project/gradient-based-deceptive-trigger-discovery https://apartresearch.com/project/modelling-the-oversight-of-automated-interpretability-against-deceptive-agents-on-sparse-autoencoders https://apartresearch.com/project/evaluating-and-inducing-steganography-in-llms https://apartresearch.com/project/developing-a-deception-dataset https://apartresearch.com/project/unsupervised-recovery-of-hidden-markov-models-from-transformers-with-evolutionary-algorithms https://apartresearch.com/project/looking-forward-to-posterity-what-past-information-is-transferred-to-the-future https://apartresearch.com/project/investigating-the-effect-of-model-capacity-constraints-on-belief-state-representations https://apartresearch.com/project/belief-state-representations-in-transformer-models-on-nonergodic-data https://apartresearch.com/project/rnns-represent-belief-state-geometry-in-hidden-state https://apartresearch.com/project/steering-model-s-belief-states https://apartresearch.com/project/handcrafting-a-network-to-predict-next-token-probabilities-for-the-random-random-xor-process https://apartresearch.com/project/exploring-hierarchical-structure-representation-in-transformer-models-through-computational-mechanics https://apartresearch.com/project/rainboltbench-benchmarking-user-location-inference-through-single-images https://apartresearch.com/project/llm-benchmarking-with-single-agent-stochastic-dynamic-simulations https://apartresearch.com/project/benchmark-for-emergent-capabilities-in-high-risk-scenarios-2 https://apartresearch.com/project/benchmark-for-emergent-capabilities-in-high-risk-scenarios https://apartresearch.com/project/benchmarking-dark-patterns-in-llms https://apartresearch.com/project/say-no-to-mass-destruction-benchmarking-refusals-to-answer-dangerous-questions https://apartresearch.com/project/cybersecurity-persistence-benchmark https://apartresearch.com/project/washbench-a-benchmark-for-assessing-softening-of-harmful-content-in-llm-generated-text-summaries https://apartresearch.com/project/evaluating-the-ability-of-llms-to-follow-rules https://apartresearch.com/project/black-box-detection-of-sleeper-agents https://apartresearch.com/project/manifold-recovery-as-a-benchmark-for-text-embedding-models https://apartresearch.com/project/anthroprobe https://apartresearch.com/project/jekyll-and-haide-the-better-an-llm-is-at-identifying-misinformation-the-more-effective-it-is-at-worsening-it https://apartresearch.com/project/the-role-of-ai-in-combating-political-deepfakes-in-african-democracies https://apartresearch.com/project/legislaitor-a-tool-for-jailbreaking-the-legislative-process https://apartresearch.com/project/subtle-and-simple-ways-to-shift-political-bias-in-llms https://apartresearch.com/project/beyond-refusal-scrubbing-hazards-from-open-source-models https://apartresearch.com/project/silent-curriculum https://apartresearch.com/project/building-more-democratic-institutions-with-collaboratively-constructed-debate-moderation-tools https://apartresearch.com/project/ai-misinformation-and-threats-to-democratic-rights https://apartresearch.com/project/artificial-advocates-biasing-democratic-feedback-using-ai https://apartresearch.com/project/assessing-algorithmic-bias-in-large-language-models-predictions-of-public-opinion-across-demographics https://apartresearch.com/project/ai-misinformation-threatens-the-wisdom-of-the-crowd https://apartresearch.com/project/trustworthy-or-knave-scoring-politicians-with-ai-in-real-time https://apartresearch.com/project/unleashing-sleeper-agents https://apartresearch.com/project/multilingual-bias-in-large-language-models-assessing-political-skew-across-languages https://apartresearch.com/project/ai-in-the-newsroom-analyzing-the-increase-in-chatgpt-favored-words-in-news-articles https://apartresearch.com/project/wndp-defense-weapons-of-mass-disruption https://apartresearch.com/project/democracy-and-ai-ensuring-election-efficiency-in-nigeria-and-africa https://apartresearch.com/project/universal-jailbreak-of-closed-source-llms-which-provide-an-end-point-to-finetune https://apartresearch.com/project/ai-politician https://apartresearch.com/project/a-framework-for-centralizing-forces-in-ai https://apartresearch.com/project/digital-diplomacy-advancing-digital-peace-building-with-al-in-africa-2 https://apartresearch.com/project/digital-diplomacy-advancing-digital-peace-building-with-al-in-africa https://apartresearch.com/project/investigating-detection-of-election-influencing-sleeper-agents-using-probes https://apartresearch.com/project/no-place-is-safe-automated-investigation-of-private-communities https://apartresearch.com/project/use-of-ai-in-political-campaigns-gap-assessment-and-recommendations https://apartresearch.com/sprints/collaborations/howard-university-ai-safety-summit-policy-hackathon-2024-11-21-to-2024-11-22 https://apartresearch.com/sprints/collaborations/blackpoolautostructuring-2024-11-23-to-2024-11-25 https://apartresearch.com/sprints/collaborations/berkeley-ai-policy-hackathon-2025-04-14-to-2025-04-26 https://apartresearch.com/sprints/collaborations/georgia-tech-aisi-policy-hackathon-2025-04-05-to-2025-04-06 https://apartresearch.com/sprints/collaborations/red-teaming-a-narrow-path-2025-06-14-to-2025-06-14 https://apartresearch.com/sprints/ai-incident-response-sprint-2026-09-11-to-2026-09-13 https://apartresearch.com/sprints/digital-minds-research-sprint-2026-08-14-to-2026-08-16 https://apartresearch.com/sprints/secret-loyalties-hackathon-2026-07-24-to-2026-07-26 https://apartresearch.com/sprints/global-south-ais-hackathon-2026-06-19-to-2026-06-21 https://apartresearch.com/sprints/secure-program-synthesis-hackathon-2026-05-22-to-2026-05-24 https://apartresearch.com/sprints/aixbio-hackathon-2026-04-24-to-2026-04-26 https://apartresearch.com/sprints/ai-control-hackathon-2026-03-20-to-2026-03-22 https://apartresearch.com/sprints/the-technical-ai-governance-challenge-2026-01-30-to-2026-02-01 https://apartresearch.com/sprints/ai-manipulation-hackathon-2026-01-09-to-2026-01-11 https://apartresearch.com/sprints/def-acc-hackathon-2025-11-21-to-2025-11-23 https://apartresearch.com/sprints/the-ai-forecasting-hackathon-2025-10-31-to-2025-11-02 https://apartresearch.com/sprints/arena-6-mechanistic-interpretability-hackathon-2025-09-13-to-2025-09-14 https://apartresearch.com/sprints/cbrn-ai-risks-sprint-2025-09-12-to-2025-09-14 https://apartresearch.com/sprints/ai-safety-x-physics-grand-challenge-2025-07-25-to-2025-07-27 https://apartresearch.com/sprints/red-teaming-a-narrow-path-control-ai-policy-sprint-2025-06-13-to-2025-06-13 https://apartresearch.com/sprints/apart-x-martian-mechanistic-router-interpretability-hackathon-2025-05-30-to-2025-06-01 https://apartresearch.com/sprints/economics-of-transformative-ai-research-sprint-2025-04-25-to-2025-04-27 https://apartresearch.com/sprints/berkeley-ai-policy-hackathon-2025-04-14-to-2025-04-26 https://apartresearch.com/sprints/georgia-tech-aisi-policy-hackathon-2025-04-05-to-2025-04-06 https://apartresearch.com/sprints/dark-patterns-in-agi-hackathon-at-zaia-2025-04-04-to-2025-04-05 https://apartresearch.com/sprints/ai-control-hackathon-2025-03-29-to-2025-03-30 https://apartresearch.com/sprints/women-in-ai-safety-hackathon-2025-03-07-to-2025-03-14 https://apartresearch.com/sprints/ai-safety-entrepreneurship-hackathon-2025-01-17-to-2025-01-20 https://apartresearch.com/sprints/blackpoolautostructuring-2024-11-23-to-2024-11-25 https://apartresearch.com/sprints/reprogramming-ai-models-hackathon-2024-11-22-to-2024-11-25 https://apartresearch.com/sprints/howard-university-ai-safety-summit-policy-hackathon-2024-11-21-to-2024-11-22 https://apartresearch.com/sprints/ai-policy-hackathon-at-johns-hopkins-university-2024-10-26-to-2024-10-28 https://apartresearch.com/sprints/agent-security-hackathon-2024-10-04-to-2024-10-07 https://apartresearch.com/sprints/arena-4-interp-hackathon-2024-09-15-to-2024-09-15 https://apartresearch.com/sprints/the-concordia-contest-2024-09-06-to-2024-09-09 https://apartresearch.com/sprints/hackathon-for-technical-ai-safety-startups-2024-09-22-to-2024-09-25 https://apartresearch.com/sprints/ai-capabilities-and-risks-demo-jam-2024-08-23-to-2024-08-26 https://apartresearch.com/sprints/research-augmentation-hackathon-2024-07-26-to-2024-07-29 https://apartresearch.com/sprints/deception-detection-hackathon-preventing-ai-deception-2024-06-28-to-2024-07-01 https://apartresearch.com/sprints/computational-mechanics-hackathon-2024-06-01-to-2024-06-03 https://apartresearch.com/sprints/ai-security-evaluation-hackathon-2024-05-24-to-2024-05-27 https://apartresearch.com/sprints/ai-and-democracy-hackathon-2024-05-03-to-2024-05-06 https://apartresearch.com/sprints/interpvis-2024-03-12-to-2024-03-14 https://apartresearch.com/sprints/llm-psychology-hackathon-2024-04-05-to-2024-04-07 https://apartresearch.com/sprints/code-red-hackathon-2024-03-22-to-2024-03-25 https://apartresearch.com/sprints/multi-agent-security-research-sprint-2024-02-09-to-2024-02-11 https://apartresearch.com/sprints/the-ai-governance-research-sprint-2024-01-05-to-2024-05-07 https://apartresearch.com/sprints/ai-model-evaluations-hackathon-2023-11-24-to-2023-11-26 https://apartresearch.com/sprints/multi-agent-safety-hackathon-2024-09-29-to-2023-11-01 https://apartresearch.com/sprints/the-agency-foundations-challenge-2023-09-08-to-2023-09-24 https://apartresearch.com/sprints/distillation-write-a-thon-2023-08-24-to-2023-08-26 https://apartresearch.com/sprints/llm-evals-hackathon-2023-08-18-to-2023-08-20 https://apartresearch.com/sprints/interpretability-hackathon-2023-07-14-to-2023-07-16 https://apartresearch.com/sprints/safety-benchmarks-hackathon-2023-06-30-to-2023-07-02 https://apartresearch.com/sprints/ai-governance-hackathon-2023-06-16-to-2023-06-18 https://apartresearch.com/sprints/ml-reliability-hackathon-2023-05-26-to-2023-05-28 https://apartresearch.com/sprints/ml-verifiability-hackathon-2023-05-26-to-2023-05-28 https://apartresearch.com/sprints/the-interpretability-hackathon-2023-04-14-to-2023-04-17 https://apartresearch.com/sprints/ai-governance-2023-03-24-to-2023-03-27 https://apartresearch.com/sprints/scale-oversight-for-machine-learning-hackathon-2023-02-10-to-2023-02-13 https://apartresearch.com/sprints/mechanistic-interpretability-hackathon-2023-01-20-to-2023-01-23 https://apartresearch.com/sprints/eagx-latam-epoch-ai-hackathon-2023-01-09-to-2023-01-11 https://apartresearch.com/sprints/ai-testing-2022-12-15-to-2022-12-17 https://apartresearch.com/sprints/the-interpretability-hackathon-2022-11-11-to-2022-11-13 https://apartresearch.com/sprints/language-model-hackathon-2022-09-29-to-2022-10-01 https://apartresearch.com/news/explaining-the-apart-research-fellowships https://apartresearch.com/news/problem-areas-in-physics-and-ai-safety https://apartresearch.com/news/apart-two-days-left-of-our-fundraiser https://apartresearch.com/news/apart-fundraiser-extended https://apartresearch.com/news/apart-fundraiser-update https://apartresearch.com/news/beyond-monolithic-ai-the-case-for-an-expert-orchestration-architecture https://apartresearch.com/news/apart-news-transformative-ai-economics https://apartresearch.com/news/engineering-a-world-designed-for-safe-superintelligence https://apartresearch.com/news/apart-news-our-biggest-event-ever https://apartresearch.com/news/women-in-ai-safety-hackathon-round-up https://apartresearch.com/news/mapping-ai-safety-research-an-open-source-knowledge-graph https://apartresearch.com/news/apart-news-san-francisco-edition https://apartresearch.com/news/apart-news-iclr-awards-women-in-ai-safety https://apartresearch.com/news/uncovering-model-manipulation-with-darkbench https://apartresearch.com/news/studio-progress-report https://apartresearch.com/news/apart-news-esben-at-iaseai-studio-progress-report https://apartresearch.com/news/apart-news-paris-ai-summit-catching-hackers https://apartresearch.com/news/ai-safety-entrepreneurship-hackathon-round-up https://apartresearch.com/news/apart-news-ai-entrepreneurship-new-research https://apartresearch.com/news/apart-news-exclusive-interview-with-interpretability-insider https://apartresearch.com/news/behind-the-features-goodfire-s-interpretability-tools-in-action https://apartresearch.com/news/promising-results-from-latent-adversarial-training https://apartresearch.com/news/apart-news-new-lat-research-just-dropped https://apartresearch.com/news/inside-the-first-ai-policy-hackathon-at-johns-hopkins https://apartresearch.com/news/apart-in-2024 https://apartresearch.com/news/ai-hackers-in-the-wild-llm-agent-honeypot https://apartresearch.com/news/apart-news-hackathons-in-2025-preview https://apartresearch.com/news/sparse-autoencoder-hackathon https://apartresearch.com/news/rethinking-cyberseceval-an-llm-aided-approach-to-evaluation-critique https://apartresearch.com/news/apart-news-our-research-at-neurips https://apartresearch.com/news/apart-news-new-video-jacob-haimes-on-working-at-apart https://apartresearch.com/news/apart-news-2024-was-our-biggest-year-yet https://apartresearch.com/news/apart-news-how-impactful-are-we https://apartresearch.com/news/apart-news-new-papers-elections-goodfire https://apartresearch.com/news/testing-llms-ability-to-find-security-flaws-in-cryptographic-protocols https://apartresearch.com/news/how-impactful-is-donating-to-apart-research https://apartresearch.com/news/apart-news-announcing-apart-lab-studio https://apartresearch.com/news/announcing-apart-lab-studio https://apartresearch.com/news/apart-news-ale-cash-prizes-the-uk-s-aisi https://apartresearch.com/news/researcher-spotlight-alexandra-abbas https://apartresearch.com/news/apart-news-esben-winning-sprints-3cb https://apartresearch.com/news/esben-on-agi-sentware-and-confident-optimism https://apartresearch.com/news/3cb-the-catastrophic-cyber-capabilities-benchmark https://apartresearch.com/news/ai-policy-hackathon-in-washington-d-c https://apartresearch.com/news/apart-news-finn-cyber-offense-johns-hopkins https://apartresearch.com/news/apart-news-clement-benchmarks-d-c https://apartresearch.com/news/benchmark-inflation-revealing-llm-performance-gaps-using-retro-holdouts https://apartresearch.com/news/researcher-spotlight-clement-neo https://apartresearch.com/news/researcher-spotlight-akash-kundu https://apartresearch.com/news/apart-news-researcher-spotlight-new-team-member-bangalore https://apartresearch.com/news/esben-on-agent-safety-research https://apartresearch.com/news/apart-news-agents-submissions-spain https://apartresearch.com/news/apart-news-new-research-neurips-papers-team-offsite https://apartresearch.com/news/do-models-really-internalize-our-preferences https://apartresearch.com/news/apart-news-o1-awards-singapore https://apartresearch.com/news/apart-news-ai-startups-india-concordia https://apartresearch.com/news/can-startups-be-impactful-in-ai-safety https://apartresearch.com/news/where-we-are-on-for-profit-ai-safety https://apartresearch.com/news/finding-deception-in-language-models https://apartresearch.com/news/code-red-llm-evaluations-hackathon-wrap-up-(metr-and-apart) https://apartresearch.com/news/the-ultimate-guide-to-ai-safety-research-hackathons https://apartresearch.com/news/join-us-at-the-ai-x-democracy-research-hackathon https://apartresearch.com/news/join-the-ai-evaluation-tasks-bounty-hackathon-with-metr https://apartresearch.com/news/how-to-organize-a-research-hackathon https://apartresearch.com/news/researcher-spotlight-jacob-haimes https://apartresearch.com/news/ai-safety-needs-to-scale-and-heres-how-you-can-do-it https://apartresearch.com/news/for-profit-ai-safety https://apartresearch.com/news/taking-your-next-steps-after-a-research-hackathon https://apartresearch.com/news/why-organize-a-research-hackathon https://apartresearch.com/news/updated-quickstart-guide-for-mechanistic-interpretability https://apartresearch.com/news/results-from-the-scale-oversight-hackathon https://apartresearch.com/news/results-from-the-ai-testing-hackathon https://apartresearch.com/news/results-from-the-language-model-hackathon https://apartresearch.com/news/results-from-the-interpretability-hackathon