Jun 22, 2026

Marco ético para IA aplicada a la preservación lingüística Guna de Panamá (Ethical Framework for AI Applied to the Preservation of the Guna Language of Panama)

Ana María González Aldana, Kelvin Alvarado, Mariana Zuluaga Abril

This document develops an ethical, responsible, and participatory framework (CREA) for applying Artificial Intelligence to the preservation and use of the Guna language of Panama. The need arises from technological inequalities between rural and urban areas, which limit the Guna community's communication with the rest of the population and its access to essential services, such as medical consultations that currently depend on an external translator.

The final outcome of this project is an adaptation of the CREA framework, comprising its conceptual pillars and implementation guidelines for developing AI technologies aimed at language preservation. To achieve this, the project used a qualitative methodology based on document review and analysis of existing AI ethics frameworks, contrasted with principles of Indigenous rights, data sovereignty, and natural language processing.

The project found that concrete principles are needed to guide technological development in Indigenous contexts. The CREA framework, together with its pillars, is proposed in response to this need. The project concludes that these guidelines constitute a solid and replicable foundation for the ethical design of AI technologies for under-resourced languages, and it recommends a pilot implementation with the Guna community in future phases.

Reviewer's Comments

This is an innovative and culturally grounded framework that excels at identifying local harms, community sovereignty concerns, and emerging threats such as voice cloning. The strategic vision behind the project is particularly strong. A valuable next step would be to translate these concepts into practical evidence through data engineering pilots capable of testing the proposed consent mechanisms within real training pipelines.

This is a thoughtful contribution that brings an important and underrepresented community into the AI safety conversation. The problem is well chosen, the motivating argument is compelling, and the effort to ground the framework in concrete, actionable provisions with named responsible actors gives it a practical dimension that purely theoretical proposals often lack. The inclusion of a community member as co-author also adds a layer of legitimacy that is not common in this kind of work and speaks well of the team's commitment to the values the framework itself promotes.

Where the paper has room to grow is in the consistency between some of its methodological claims and what the document actually demonstrates. There are moments where the analytical ambition reaches slightly ahead of the evidence presented, and tightening that connection would make the overall argument considerably stronger. Additionally, a more explicit engagement with some of the internal tensions in the governance model, particularly around the relationship between community autonomy and external institutional actors, would elevate the conceptual depth of a proposal that is otherwise well constructed. These are natural areas to develop in future iterations rather than fundamental weaknesses.

On the presentation side, a careful revision pass would benefit the paper noticeably. There are figure numbering inconsistencies, a few truncated sentences, and some minor grammatical irregularities that interrupt an otherwise readable flow. None of these detract from the substance of the ideas, but addressing them would help the work communicate with the clarity and polish its ambition deserves. The research direction is genuinely valuable and the foundation here is strong enough to support a more developed version that could serve as a real reference for similar efforts across the region.

Cite this work

@misc{
  title={(HckPrj) Marco ético para IA aplicada a la preservación lingüística Guna de Panamá},
  author={Ana María González Aldana, Kelvin Alvarado, Mariana Zuluaga Abril},
  date={6/22/26},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}

Recent Projects

Apr 27, 2026

OliGraph: graph-based screening of large oligopools

Existing synthesis screening tools cannot evaluate short oligonucleotide pools, whose overlapping fragments can be reassembled into regulated sequences via polymerase cycling assembly (PCA) yet fall below gene-length detection thresholds. We present OliGraph, an open-source tool that constructs a bi-directed overlap graph from an oligonucleotide pool and extracts contigs for downstream gene-length screening. An optional PCA mode retains only cross-strand overlaps consistent with PCA chemistry. We validated OliGraph in a blinded study across ten simulated pools (70–9,184 oligonucleotides, 30–300 bp) spanning four risk categories. BLAST screening of individual oligonucleotides failed to identify sequences of concern in most pools: three returned zero hits, and vector noise obscured true positives in the remainder. After OliGraph assembly, contig-level BLAST matched the longest assembled sequences (up to 1,905 bp) to sequences of concern at 97–100% identity. In one pool, assembly collapsed 1,634 individual BLAST results into 10 hits from a single contig, all assigned to the same source organism. PCA mode correctly distinguished assemblable from non-assemblable fragments within the same pool. Two pools with no assemblable structure yielded no contigs. OliGraph processed all pools in under 0.2 seconds, fast enough for real-time order screening and consistent with proposals to bring oligonucleotide orders within the scope of synthesis screening regulation.

Apr 27, 2026

BioRT-Bench: A Multi-Attack Red-Teaming Benchmark for Bio-Misuse Safeguards in Frontier LLMs

Frontier AI laboratories are expected to maintain safeguards against biological misuse, but whether deployed models actually refuse bio-misuse queries under adversarial pressure is largely unmeasured in the public literature. We introduce BioRT-Bench, a benchmark that runs four attack methods (direct request, PAIR, Crescendo, and base64 encoding) against four frontier models (Claude Sonnet 4.6, GPT-5.4, DeepSeek V4-flash, Kimi K2.5) across 40 prompts spanning five biosecurity-relevant categories. Responses are scored by a calibrated judge extending StrongREJECT with two bio-specific dimensions: specificity and actionability. We measure Attack Success Rate (ASR), where 0 means the model fully refused and 1 means it provided specific, actionable bio-misuse content. Our results reveal a sharp robustness divide: Chinese frontier models (DeepSeek, Kimi) have under 5% refusal rates even under direct request (ASR 0.88 and 0.79), while Western models (Claude, GPT) maintain substantially stronger safeguards (ASR 0.15 and 0.16). Crescendo is the most effective attack across all models, both in bypassing refusal and in eliciting actionable content. Claude Sonnet 4.6 is the most robust model tested, achieving 100% refusal against base64-encoded prompts.

Apr 27, 2026

PROTEUS (PROTein Evaluation for Unusual Sequences): Structure-Informed Safety Screening for de novo and Evasion-Prone Protein-Coding Sequences

AI protein design tools like RFdiffusion, ProteinMPNN, and Bindcraft make it trivial to produce low-homology sequences that fold into active, potentially hazardous architectures. However, sequence homology-based biosafety screening tools cannot detect proteins that pose functional risk through structurally novel mechanisms with no sequence precedent. We present a tiered computational pipeline that addresses this gap by combining MMseqs2 sequence alignment with structure-based comparison via FoldSeek and DALI against curated toxin databases totaling ~34,000 entries. AlphaFold2-predicted structures are screened for both global fold similarity (FoldSeek) and local active/allosteric site geometry (DALI), capturing convergent functional hazards that sequence screening misses. The pipeline was validated against a panel of toxins, benign proteins, structural mimics, and de novo-designed Munc13 binders, as well as modified ricin variants with residue substitutions. We additionally tested robustness to partial-synthesis evasion, where a bad actor submits multiple shorter coding sequences intended for downstream reassembly into a full toxin-coding gene. We found that while sequence-based screening did not identify any de novo ricin analogues with high certainty, the combined pipeline with FoldSeek and DALI identified all 24 tested de novo ricins as toxic.

Apr 27, 2026