Increasing Trust in Language Models through the Reuse of Verified Circuits
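We define a standard for trustworthy language models (LMs) that requires both the task and the circuits implementing it to be verified. A verified addition model is inserted into a larger untrained model, which is then trained to perform both addition and subtraction. Reusing the verified circuits makes the composite model easier to verify and improves its safety.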
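As a rough illustration of what inserting a verified model into an untrained one could look like at the level of weights, the sketch below copies a small weight matrix into the top-left block of a larger, randomly initialised one and records which entries were copied. This is a minimal sketch under stated assumptions, not the procedure used in this work: the function names `init_matrix` and `insert_verified_block`, the top-left placement, the matrix sizes, and the mask are all hypothetical.

```python
import random


def init_matrix(rows, cols, seed):
    """Small random weights standing in for one layer of an untrained model."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def insert_verified_block(large, small):
    """Copy `small` into the top-left corner of a copy of `large`.

    Returns the merged matrix and a same-shaped boolean mask marking the
    entries that came from the verified model (hypothetical bookkeeping,
    useful for tracking which weights were reused).
    """
    if len(small) > len(large) or len(small[0]) > len(large[0]):
        raise ValueError("verified block does not fit in the larger matrix")
    merged = [row[:] for row in large]          # leave `large` untouched
    mask = [[False] * len(row) for row in large]
    for i, row in enumerate(small):
        for j, value in enumerate(row):
            merged[i][j] = value
            mask[i][j] = True
    return merged, mask


# Hypothetical 2x2 "verified" weights and a 4x4 untrained layer.
verified = [[1.0, 2.0], [3.0, 4.0]]
untrained = init_matrix(4, 4, seed=0)
merged, mask = insert_verified_block(untrained, verified)
```

After the call, the top-left 2x2 block of `merged` holds the verified weights, the remaining entries keep their untrained initial values, and `mask` identifies the reused entries so they can be inspected again once the composite model has been trained.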