// WHY SANDLOGIC · ANY AI

One chip. Any AI.

Any AI is not a runtime claim. It is a chip-level claim. The Krsna SoC is architected to run any AI workload because the silicon was co-designed with CORE, the compiler + runtime engine that maps each model family down to the chip's operator set. Most AI chips lock you to LLMs or CNNs. Krsna runs both today and is built to absorb the architectures that emerge next.

Production model families: 4
Krsna configurations: 4
Compiler + runtime engine: CORE
Silicon IP: ExSLerate V2

Pick a chip. Pick an architecture. Pick wrong.

The AI silicon market today asks device-makers to commit, at design time, to which model family the chip will run. NPUs optimized for CNN inference are awkward on transformers. AI accelerators built for LLMs treat vision workloads as second-class. State-space models and the architectures that haven't shipped yet are nowhere on most roadmaps.

// PROBLEM 01

CNN-only OR LLM-only — never both.

The BOM decision today forces device-makers to pick the workload family up front. A smart-TV chip built for vision can't do conversational AI. An LLM accelerator can't run YOLO at usable throughput.

// PROBLEM 02

Today's architecture isn't tomorrow's.

Mamba challenged transformer dominance in late 2023. RWKV-7 arrived in 2025. Liquid Foundation Models are emerging through 2026. Each new architecture demands different kernels, and most chips can't absorb them.

// PROBLEM 03

Silicon design cycles vs AI architecture cycles.

Silicon takes 18–36 months to design and tape out. AI architectures shift every 6–12 months. A chip designed in 2024 to run "what's hot now" is a chip that can't run what ships in 2027.

Silicon + compiler, co-designed.

"Any AI" is the property that falls out when you co-design the chip with the compiler-runtime layer that targets it. ExSLerate V2 (silicon) and CORE (compiler + runtime engine inside EdgeMatrix) are engineered together. The silicon ships the operator-set superset that current and emerging architectures need; CORE handles the compilation.

/ SILICON

ExSLerate V2 inside Krsna SoC

Four chip configurations (Lite to Apex). MAC arrays, on-die memory, and operator support engineered as the union of what current and emerging AI architectures need. Two patented engines inside: Dynamic Neural Compression and the Infinite Series Engine (non-linear math in-datapath).

Krsna SoC architecture →

/ COMPILER + RUNTIME ENGINE

CORE (inside EdgeMatrix)

The compiler + runtime engine for SandLogic's own silicon, co-designed with the ExSLerate IP it targets. It recognizes the incoming model architecture (transformer / SSM / RNN / CNN), selects the appropriate kernel sequence, and emits silicon-ready code for the Krsna SoC. Built on IREE / MLIR: open frontends, no vendor lock-in.

EdgeMatrix · CORE layer →

Four model families. All four, on one chip.

What the chip runs in real products today. Built for what ships, not for theoretical coverage — a discipline that makes the claim defensible and the integration straightforward.

PRODUCTION

LLM / SLM

Llama · Shakti · Qwen · Gemma

Transformer-class language models. The architecture that dominates current enterprise AI workloads. Production support across the variant family.

PRODUCTION

Speech AI

Sruthi · Svara · Moonshine · Whisper

STT and TTS pipelines. The architecture that voice agents, dictation, and translation workloads depend on. End-to-end on-chip support.

PRODUCTION

Computer Vision

ResNet · YOLO · VGG · MobileNet

CNN inference. The architecture every camera-class device runs. Most LLM accelerators treat CNNs as an afterthought — Krsna treats both as first-class.

PRODUCTION

State Space Models

Mamba · Jamba · Mamba-2

Linear-time recurrence. The architecture that challenged the transformer monopoly in 2023. SambaASR, our Mamba-based speech model, already demonstrates the chip's SSM dispatch.
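To make "linear-time recurrence" concrete, here is a minimal sketch of a state-space scan in plain Python. It is a deliberate simplification: the state is a single scalar and the parameters a, b, c are fixed, whereas Mamba-class models use vector states and input-dependent parameters. The function name and default values are assumptions chosen for illustration and say nothing about how Krsna or CORE implement the scan.

```python
def ssm_scan(xs, a=0.9, b=0.1, c=1.0):
    """Run a scalar linear state-space recurrence over a sequence.

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t

    Each token costs one constant-size update, so total work grows
    linearly with sequence length.
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x   # state update: carries context forward
        ys.append(c * h)    # readout for this position
    return ys


ys = ssm_scan([1.0, 1.0, 1.0])
print([round(y, 3) for y in ys])  # [0.1, 0.19, 0.271]
```

Doubling the input length doubles the number of loop iterations, in contrast to full self-attention, where every token attends to every other token.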

// EXTENSIBILITY

Built for the next architecture.

The four families above are what ships today. The architectural promise of "Any AI" is that the chip can absorb what ships tomorrow without a silicon redesign. Four mechanisms, all engineered in.

// MECHANISM 01

Architecture-aware compilation

CORE recognizes the model architecture at compile time — transformer attention vs SSM scan vs CNN convolution vs RNN recurrence — and emits the appropriate kernel sequence for the silicon. New architectures slot in as new compilation paths, not as silicon rework.
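As a rough illustration of the idea, the sketch below detects a model's architecture family from the operators it uses and picks a kernel sequence for each family found. Everything here is hypothetical: the operator names, the signature table, and the kernel sequences are invented for the example and are not CORE's actual interface or output.

```python
# Hypothetical sketch: none of these names come from CORE or EdgeMatrix.
# Operators whose presence marks each architecture family (illustrative only).
SIGNATURE_OPS = {
    "transformer": {"attention"},
    "ssm": {"selective_scan"},
    "cnn": {"conv2d"},
    "rnn": {"recurrent_cell"},
}

# Illustrative kernel sequences per family; a real compiler emits far more.
KERNEL_SEQUENCES = {
    "transformer": ["layernorm", "attention", "matmul", "activation"],
    "ssm": ["layernorm", "selective_scan", "gating", "matmul"],
    "cnn": ["conv2d", "activation", "pooling"],
    "rnn": ["matmul", "recurrent_cell", "activation"],
}


def detect_families(model_ops):
    """Return every family whose signature operators all appear in the model."""
    ops = set(model_ops)
    return sorted(f for f, sig in SIGNATURE_OPS.items() if sig <= ops)


def plan_kernels(model_ops):
    """Map a model's operator list to one kernel sequence per detected family."""
    families = detect_families(model_ops)
    if not families:
        raise ValueError("no known architecture family matches this model")
    return {family: KERNEL_SEQUENCES[family] for family in families}


mamba_like = ["layernorm", "selective_scan", "gating", "matmul"]
hybrid = ["attention", "selective_scan", "matmul"]  # transformer/SSM mix

print(detect_families(mamba_like))  # ['ssm']
print(detect_families(hybrid))      # ['ssm', 'transformer']
```

Returning a list of families rather than a single label is a design choice made for this sketch, so that hybrid models are handled without a special case.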

// MECHANISM 02

Operator-set discipline

The chip's operator set is engineered to be the union of what current architectures need (matmul, attention, conv, layernorm, activation) — and the primitives future architectures will need (scan, recurrence, gating). The cost of supporting the next architecture is on the compiler side, not the silicon side.
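One way to picture this discipline is as a set-coverage check: if every operator a model needs is already in the chip's operator set, supporting that model is compiler work only. The sketch below uses the operator names listed above. The check itself and the "fft" gap are assumptions made up for illustration, not a description of the actual Krsna operator list.

```python
# Operator names are taken from the text above. The grouping and the check
# are illustrative assumptions, not the real Krsna operator inventory.
CURRENT_OPS = {"matmul", "attention", "conv", "layernorm", "activation"}
FUTURE_PRIMITIVES = {"scan", "recurrence", "gating"}
CHIP_OPS = CURRENT_OPS | FUTURE_PRIMITIVES  # the union the silicon provides


def missing_ops(model_ops):
    """Operators the model needs that the chip's operator set lacks."""
    return sorted(set(model_ops) - CHIP_OPS)


def is_compiler_only(model_ops):
    """True when every required operator already exists in silicon."""
    return not missing_ops(model_ops)


ssm_model = ["layernorm", "scan", "gating", "matmul"]
exotic_model = ["matmul", "fft"]  # "fft" is a made-up gap for illustration

print(is_compiler_only(ssm_model))  # True
print(missing_ops(exotic_model))    # ['fft']
```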

// MECHANISM 03

Compiler-led architecture support

When a new model family emerges — RWKV, Liquid Foundation Models, the next thing — CORE absorbs it as a new compilation path for the Krsna chip. New model coverage lands in the compiler, not in a silicon redesign.
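A common way to structure this kind of extensibility is a registry of compilation paths keyed by model family. The sketch below is a hypothetical illustration of that pattern only: the decorator, the function names, and the string "kernels" are invented, and the document does not describe how CORE actually registers new families.

```python
# Hypothetical registry sketch; CORE's real extension mechanism is not
# described in this document.
COMPILATION_PATHS = {}


def compilation_path(family):
    """Decorator that registers a lowering function for one model family."""
    def register(fn):
        COMPILATION_PATHS[family] = fn
        return fn
    return register


@compilation_path("transformer")
def lower_transformer(layers):
    return [f"attention[{i}]" for i in range(layers)]


@compilation_path("ssm")
def lower_ssm(layers):
    return [f"scan[{i}]" for i in range(layers)]


def compile_model(family, layers):
    """Look up the registered path; fail loudly if the family is unknown."""
    fn = COMPILATION_PATHS.get(family)
    if fn is None:
        raise NotImplementedError(f"no compilation path for {family!r}")
    return fn(layers)


# Supporting a new family later is one more registration. compile_model and
# the operator names it targets stay unchanged.
@compilation_path("rwkv")
def lower_rwkv(layers):
    return [f"recurrence[{i}]" for i in range(layers)]


print(compile_model("rwkv", 2))  # ['recurrence[0]', 'recurrence[1]']
```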

// MECHANISM 04

Disciplined production scope

The four families above are what production customers run today, not theoretical coverage. We say four in production today; we say extensible for tomorrow. We do not conflate the two. That discipline is itself a feature of the program.

What we mean by "Any AI."

"Any AI" is a property of the chip's architecture + CORE — the compiler + runtime engine co-designed with the Krsna SoC. It is not a claim about inference throughput or token economics (that's EdgeFlow's domain). It is not a claim that every model runs at peak performance on every chip variant (Lite obviously won't run a 70B-class model).

Production scope (four families on chip today) and architectural promise (CORE compiles the architectures that emerge tomorrow) are kept distinct by design. Discipline at the claim layer is what makes the chip-level claim credible.
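The remark that Lite won't run a 70B-class model follows from simple arithmetic on weight storage. The sketch below computes a weights-only footprint at a few common precisions. Krsna Lite's memory budget is not given here, so the 8 GB comparison is an assumption borrowed from the "8GB endpoint" figure quoted elsewhere on this page, and the function names are hypothetical.

```python
def weight_footprint_gb(params_billion, bits_per_weight):
    """Weights-only footprint in GB (10^9 bytes).

    Ignores KV cache, activations, and runtime overhead, so real
    requirements are higher than this figure.
    """
    return params_billion * bits_per_weight / 8


def fits(params_billion, bits_per_weight, memory_gb):
    """True if the weights alone fit in the given memory budget."""
    return weight_footprint_gb(params_billion, bits_per_weight) <= memory_gb


for bits in (16, 8, 4):
    print(bits, weight_footprint_gb(70, bits))  # 140.0, 70.0, 35.0 GB

print(fits(70, 4, 8))  # False: 35 GB of weights vs an assumed 8 GB budget
```

Even at 4-bit precision, the weights of a 70B-parameter model take about 35 GB, which is why model size has to be matched to the chip variant.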

// RELATED SURFACES

Where "Any AI" connects to the rest of the stack.

  • /krsna: the SoC product. Four chip configurations, two engines (DNC + Infinite Series Engine), 128k tokens on an 8GB endpoint.
  • /exslerate: the licensable IP behind Krsna. Arm-style licensing for AI silicon.
  • /edgematrix: the umbrella product. CORE (compiler + runtime engine) + EdgeFlow (inference engine).
  • /edgeflow: the inference engine. 193 models pre-tuned, multi-silicon coverage, token optimization at runtime.
  • /token-economy: the business outcome. How chip + CORE + EdgeFlow + HaluMon + LingoForge together prevent ~23% token leakage and unlock 30–40% structural cost reduction.

// LET'S BUILD

One chip. Every AI workload.