May 27, 2024

rAInboltBench: Benchmarking user location inference through single images

Le "Qronox" Lam ; Aleksandr Popov ; Jord Nguyen ; Trung Dung "mogu" Hoang ; Marcel M ; Felix Michalak

This paper introduces rAInboltBench, a comprehensive benchmark designed to evaluate the capability of multimodal AI models in inferring user locations from single images. The increasing proficiency of large language models with vision capabilities has raised concerns regarding privacy and user security. Our benchmark addresses these concerns by analysing the performance of state-of-the-art models, such as GPT-4o, in deducing geographical coordinates from visual inputs.

Reviewer's Comments

Really cool idea and well documented. Some more details in the appendix regarding the dataset construction and some example chains of thought would make it even more informative. Also, it might make sense to split the dataset into two halves for the purpose of analysis: one that has a chance of exact location inference (e.g. a street sign combination that is potentially unique) and another where getting the state or country right is already impressive. Regarding possible extensions of this work, I think the thoughts around the use of images / multi-modal input in sycophancy and manipulation are well worth exploring in a separate dataset. It would be fascinating to study, e.g., whether including images that provide cultural context could bias answers to make them more appropriate for that cultural context.

Wow, this is simply some stellar work. Extending benchmarks to multimodal models, the emphasis on immediate problems (privacy concerns), incorporation of the GeoGuessr game within prompts, and analysis of what these findings might mean are fantastic. I am curious why the distance of 2000km was chosen, and wondering if there may be a distance grounded in some application that could be used instead, e.g. average radius of metropolitan hub, average radius of country smaller than X sq. km, etc.

There are also a number of plots for this data which I would be incredibly interested in seeing: (a) a Figure 1 style plot with outliers cut, in order to look at the histogram of all entries that were accurate to within 2,000 km, (b) a histogram of “correct” vs. “incorrect”, classifying correct as within some number of km, (c) the percentage of questions meeting a given criterion within the dataset itself, e.g. how many pictures in the dataset included a street sign, or which geographical regions the images were taken from. Really excited to see future work on this!
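As a rough illustration of plot (b), the correct/incorrect split could be computed along these lines. This is a minimal sketch, not the benchmark's actual scoring code: the function names `haversine_km` and `classify_by_threshold` are hypothetical, the great-circle (haversine) formula is assumed as the distance measure, and the 2,000 km default simply mirrors the threshold mentioned in the comment above.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def classify_by_threshold(errors_km, threshold_km=2000.0):
    """Count guesses whose error is within threshold_km (hypothetical helper)."""
    correct = sum(1 for e in errors_km if e <= threshold_km)
    return {"correct": correct, "incorrect": len(errors_km) - correct}


if __name__ == "__main__":
    # Made-up (true, guessed) coordinate pairs, for illustration only.
    pairs = [
        ((48.8566, 2.3522), (51.5074, -0.1278)),    # Paris vs. London
        ((35.6762, 139.6503), (40.7128, -74.0060)), # Tokyo vs. New York
    ]
    errors = [haversine_km(*truth, *guess) for truth, guess in pairs]
    print(classify_by_threshold(errors))
```

Changing `threshold_km` to an application-grounded value (for example, a typical metropolitan radius) would show how sensitive the correct/incorrect split is to the chosen cutoff.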

Very novel project, quite exciting! The methodology is in-depth and the results analysis is well done. Plotting the distribution across the identification criteria was good and going in-depth with the categorization, reasoning, and distance analysis was great.

I can definitely imagine this benchmark being expanded to include more images and more modalities to become a complete AI privacy violation benchmark. This would both be a very clear demonstration of risk from malicious actors but also provide us some sort of dangerous capability evaluation that isn't seen out there right now. Great work.

Cite this work

@misc{
  title={rAInboltBench: Benchmarking user location inference through single images},
  author={Le "Qronox" Lam ; Aleksandr Popov ; Jord Nguyen ; Trung Dung "mogu" Hoang ; Marcel M ; Felix Michalak},
  date={5/27/24},
  organization={Apart Research},
  note={Research submission to the research sprint hosted by Apart.},
  howpublished={https://apartresearch.com}
}

This work was done during one weekend by research workshop participants and does not represent the work of Apart Research.