Artificial proprioception · Behavioral interpretability

Language models that know when they're wrong.

Biological systems sense the position of their own limbs without looking — through specialized neurons called proprioceptors. Proprioceptors are our patented probe systems: lightweight neural networks that attach to a frozen language model's hidden states and read its behavioral state directly — hedging, sycophancy, hallucination risk, persona drift — before any of it reaches the output. Language models can't sense themselves natively. They're behaviorally deafferented. We build the missing sense.

We've validated probes on:
Phi-3 · Qwen-1.5 · Qwen-3B · Qwen-7 · Qwen-32B · Mistral-7B · LLaMA-8B · Nous Hermes 8B · Nous Hermes 70B · Command R+ 104B · Falcon-Mamba-7B

Our mission

Behavioral consistency, not raw capability, is the unsolved problem in modern AI. It requires self-awareness.

Touch your nose with your eyes closed. You just performed an act no language model can do: you sensed the position of your own body without observing it. This sense — proprioception — is what coordinates intentional behavior in biological systems.

Today's language models are behaviorally deafferented. They generate hedges, sycophancy, hallucinations, and personality drift without any internal signal that this is happening. Their outputs are observable, but their behavioral state is not.

We're building that missing channel: lightweight probes that read hidden activations and surface what the model is about to do — before the token is sampled, before the user sees the response, before the failure becomes a fact in the world.
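As a rough illustration of what "a lightweight probe on hidden activations" can mean, here is a minimal sketch of a linear (logistic-regression) probe. It is not our production probe architecture. The hidden states are synthetic stand-ins generated with NumPy, and the function names `train_linear_probe` and `probe_score` are hypothetical. The point is only that the probe has its own small set of weights while the language model stays frozen.

```python
import numpy as np

def train_linear_probe(hidden, labels, lr=0.1, steps=500):
    """Fit a logistic-regression probe on fixed hidden states.

    Only the probe weights (w, b) are updated; the model that produced
    `hidden` is treated as frozen and never touched.
    """
    n, d = hidden.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(steps):
        probs = 1.0 / (1.0 + np.exp(-(hidden @ w + b)))
        grad = probs - labels              # gradient of log-loss w.r.t. logits
        w -= lr * (hidden.T @ grad) / n
        b -= lr * grad.mean()
    return w, b

def probe_score(hidden, w, b):
    """Probability that each hidden state carries the behavior (e.g. hedging)."""
    return 1.0 / (1.0 + np.exp(-(hidden @ w + b)))

# Synthetic stand-in for real activations (an assumption for this sketch):
# 400 vectors of width 16, where the behavior shifts activations along one
# hidden direction.
rng = np.random.default_rng(0)
d = 16
direction = rng.normal(size=d)
direction /= np.linalg.norm(direction)
labels = rng.integers(0, 2, size=400).astype(float)
hidden = rng.normal(size=(400, d)) + 4.0 * (labels[:, None] - 0.5) * direction

w, b = train_linear_probe(hidden, labels)
accuracy = ((probe_score(hidden, w, b) > 0.5) == (labels == 1.0)).mean()
print(f"probe accuracy on synthetic data: {accuracy:.2f}")
```

In a real setting the `hidden` array would come from a chosen layer of the frozen model, and the score would be read before the next token is sampled.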

The proprioception agenda

Novel methods to detect, steer,
and route your model's behavior.

A new sense

Three layers of one signal: the network, the activations, and the routing fibers that bind them.

Behavioral state is not a property of any single component. It emerges from the relationships — between fibers, between neurons, between the patterns that fire together when the model is about to hedge, hallucinate, or break character.

Our work isolates these patterns and makes them addressable: read them, intervene on them, route around them. The same signal, three faces.

Cross-architecture result

The same behavioral signal lives in every architecture we've tested.

If behavioral encoding were a quirk of attention, it should disappear in state-space models. It doesn't. If it were a quirk of small models, it should disappear at scale. It doesn't there either. We've validated probes on eleven models spanning Phi-3 to Command R+ 104B, covering both transformer and Mamba (state-space) architectures, at full precision and 4-bit quantized. We read this as evidence that behavioral self-representation is fundamental to sequence modeling, not architecture-specific.

Architecture   Model                     Peak separation / status
Transformer    Phi-3                     validated · compact
Transformer    Qwen-1.5                  validated · early gen
Transformer    Qwen-3B                   1,376× · hedging
Transformer    Qwen-7                    validated · mid-scale
Transformer    Qwen-32B                  validated · scale-up
Transformer    Mistral-7B                999× · all probes
Transformer    LLaMA-8B                  272× · verbosity
Transformer    Nous Hermes 8B            validated · fine-tuned
Transformer    Nous Hermes 70B (4-bit)   validated · quantized
Transformer    Command R+ 104B           validated · 100B+ scale
State-Space    Falcon-Mamba-7B           999× · cross-architecture

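This page does not spell out how a separation ratio is computed, so the sketch below should be read as one plausible definition, not the formula behind the numbers above. It uses a Fisher-style criterion: the squared gap between the mean probe score on behavior-present and behavior-absent examples, divided by the pooled within-class variance. The function name `separation_ratio` and the sample scores are hypothetical.

```python
import numpy as np

def separation_ratio(scores_pos, scores_neg):
    """Fisher-style separation: squared mean gap over pooled within-class variance.

    Assumed definition for illustration only. Raises ZeroDivisionError-like
    behavior (inf/nan) if both classes have zero variance, so real use would
    need a guard.
    """
    scores_pos = np.asarray(scores_pos, dtype=float)
    scores_neg = np.asarray(scores_neg, dtype=float)
    gap = scores_pos.mean() - scores_neg.mean()
    pooled_var = 0.5 * (scores_pos.var() + scores_neg.var())
    return gap ** 2 / pooled_var

# Hypothetical probe scores: tight clusters that sit far apart give a large ratio.
pos = [0.9, 1.1, 1.0, 1.0]      # behavior present
neg = [-1.0, -0.9, -1.1, -1.0]  # behavior absent
ratio = separation_ratio(pos, neg)
print(f"separation ratio: {ratio:.0f}x")
```

Under this definition, large ratios come from well-separated class means combined with low scatter inside each class, which is why a ratio alone says little without the per-probe, per-model raw numbers.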
How we work

Honest about the science.

The cross-architecture result is striking and concrete. The broader theoretical framing — fiber bundles, behavioral manifolds — is a working hypothesis we're actively pressure-testing. We publish methodology, raw numbers, failure modes, and the routing decisions where our probes don't generalize.

What's open · what's not

Probes are open. Routing is the product.

  • Probe training methodology — open-access preprints
  • Raw separation ratios per probe / per model — published
  • The routing meta-classifier — proprietary
  • Production integration · monitoring · steering API — commercial
The product

Behavioral monitoring at the activation layer.

Run alongside your model. Stream behavioral signals — hedging, sycophancy, hallucination risk, persona drift — at every token. Route which interventions to engage with a meta-classifier trained on the domains where each probe helps and where it hurts.

Same technical layer as mechanistic interpretability research. Different function: detection and steering during inference, not redesign during training.
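To make the routing idea concrete, here is a toy sketch of domain-indexed switching: classify the input, then look up which probes to engage for that domain. It is not the proprietary meta-classifier. The probe names are taken from the switch panel on this page; the domains, the keyword-based `classify_domain` stand-in, and the switch assignments in `SWITCHES` are all invented for illustration.

```python
from typing import Callable, Dict, FrozenSet

# Hypothetical switch table: which probes are engaged for each input domain.
SWITCHES: Dict[str, FrozenSet[str]] = {
    "math": frozenset({"cot_math", "consist"}),
    "code": frozenset({"cot_code", "tool_chc"}),
    "general": frozenset({"hedging", "syco", "halluc"}),
}

def classify_domain(prompt: str) -> str:
    """Toy stand-in for the input classifier: plain keyword matching."""
    text = prompt.lower()
    if any(key in text for key in ("integral", "prove", "equation")):
        return "math"
    if any(key in text for key in ("def ", "compile", "stack trace")):
        return "code"
    return "general"

def route(
    prompt: str,
    classifier: Callable[[str], str] = classify_domain,
    switches: Dict[str, FrozenSet[str]] = SWITCHES,
) -> FrozenSet[str]:
    """Return the set of probes to engage for this prompt.

    Unknown domains fall back to the "general" switch set.
    """
    domain = classifier(prompt)
    return switches.get(domain, switches["general"])

print(sorted(route("Prove that the equation has two roots")))
print(sorted(route("Tell me about your weekend")))
```

A learned router would replace both the keyword classifier and the hand-written table with decisions fit to data on where each probe helps and where it hurts.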

Request access
[Figure: circuit board with 16 switches, 7 of 16 on. Switch labels: cot_math, hedging, cot_sci, syco, ens_7b, verbosity, refusal, drift, cot_logic, halluc, adversl, temp_adj, cot_code, consist, factual, tool_chc. Caption: Input classifier → domain-indexed switches]

Contact

Work with us.

Proprioceptive AI partners with foundation model teams, applied AI products, and safety-focused research labs. If you're shipping a model into production and you need to know what it's about to do, we'd like to hear from you.

logan@proprioceptiveai.com
