Our IP inference accelerators power both CNN-based vision models and LLM/Transformer-based generative models.
They deliver high-efficiency image recognition for tasks such as classification and object detection, while also enabling fast text generation and conversational AI.
Status: FPGA prototype ready; tape-out ready (technology-independent)
Configurable CNN accelerator for fast, efficient inference on VGG16, AlexNet, ResNet, EfficientNet-Lite, YOLOv6-Nano, and custom models at up to 8-bit precision. All weights and parameters are stored in high-bandwidth on-chip SRAM to reduce latency and power. Supports configurable compute units, parallelism, and layer fusion for low-latency, energy-efficient edge AI.
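To make the "up to 8-bit precision" point concrete, below is a minimal sketch of symmetric per-tensor INT8 weight quantization in NumPy. The function names and scaling scheme are illustrative assumptions for explanation only, not the accelerator's actual toolchain or API.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor INT8 quantization (illustrative sketch)."""
    scale = np.abs(weights).max() / 127.0      # map the largest magnitude to the INT8 range
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate FP32 weights from INT8 values and the scale."""
    return q.astype(np.float32) * scale

# Example: quantize a random 3x3 conv filter bank and check the rounding error
w = np.random.randn(64, 3, 3, 3).astype(np.float32)
q, s = quantize_int8(w)
print(f"max abs reconstruction error: {np.abs(dequantize(q, s) - w).max():.5f}")
```

Quantizing weights to 8 bits cuts their SRAM footprint by 4x relative to FP32, which is what makes keeping the full model in on-chip memory practical at the edge.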
Status: In development; expected availability in Q3 2026
Configurable transformer accelerator for fast, efficient inference on LLaMA 3.2 and custom models, supporting INT8, INT4, FP8, FP16, BF16, and FP32. Stores all weights, key/value caches, and tensors in on-chip SRAM to reduce latency and power. Supports configurable attention heads, multi-layer blocks, and pipeline parallelism for low-latency, energy-efficient edge AI.
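To see why key/value-cache precision matters when everything lives in on-chip SRAM, here is a rough sizing sketch. The model parameters used (16 layers, 8 KV heads, head dimension 64, typical of a small LLaMA-style model with grouped-query attention) are illustrative assumptions, not the accelerator's configuration.

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int) -> int:
    """Total key+value cache size: two tensors (K and V) per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# Example sizing for a 2K-token context at two storage precisions
# (model parameters are illustrative, not the core's actual configuration)
for fmt, nbytes in [("FP16/BF16", 2), ("INT8/FP8", 1)]:
    total = kv_cache_bytes(n_layers=16, n_kv_heads=8, head_dim=64,
                           seq_len=2048, bytes_per_elem=nbytes)
    print(f"{fmt}: {total / 2**20:.1f} MiB of KV cache")
```

At 2 bytes per element this example needs 64 MiB of cache for a 2K context; dropping to 1 byte halves that, which illustrates why lower-precision formats are important for keeping the cache on chip.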