A New Standard in AI Infrastructure
ContextBridge provides seamless context continuity across multiple LLMs, while MoE-based computation optimization cuts inference costs, accelerating enterprise AI adoption.
What We Do
ContextBridge
Multi-LLM Orchestration Platform
A B2B API/SaaS platform that manages, converts, and synchronizes shared conversation history and per-model history so that context is never lost when switching between GPT, Gemini, Claude, and other LLMs.
- Shared + per-model conversation history
- Canonical IR-based format conversion
- Session management with collision prevention
- Token usage tracking and billing integration
AI Companion Service
MoE-Optimized AI Agent
Built by fine-tuning open-source LLMs (LLaMA, Mistral, etc.) for emotionally intelligent conversation. MoE-based dynamic computation optimization reduces inference costs while maintaining real-time response performance.
- Proactive conversation system
- Emotion recognition and personality analysis
- Dynamic computation optimization (patent pending)
- Crisis detection system
Our Technology
Cross-Model Conversation Context Management System
Dec 18, 2025
Protects the core ContextBridge architecture — shared history storage, format conversion, and session management — providing a proprietary and validated solution for cross-model interoperability.
Dynamic Computation Optimization for Language Models
Jan 15, 2026
MoE-based dynamic computation optimization technology that directly reduces inference costs and improves response speed for AI services.
Inference-Efficient MoE via Non-Computational Experts
Mar 20, 2026
A modular optimization technique that extends the gating network of existing MoE models with non-computational experts — preserving the original model while substantially reducing inference cost.
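The non-computational-expert idea can be illustrated with a toy top-1 MoE router: the gating network chooses among expert slots, one of which is a pass-through that costs no FLOPs. This is a minimal sketch under assumed shapes and random weights, not the patented mechanism; `moe_forward` and the expert layout are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy hidden size

# Two ordinary FFN experts plus one "non-computational" expert (index 2)
# that passes the token through unchanged, costing no matmuls.
W = [rng.standard_normal((d, d)) * 0.1 for _ in range(2)]
W_gate = rng.standard_normal((d, 3)) * 0.1  # gating over 3 expert slots

def moe_forward(x: np.ndarray) -> tuple[np.ndarray, int]:
    logits = x @ W_gate
    expert = int(np.argmax(logits))   # top-1 routing
    if expert == 2:                   # non-computational expert:
        return x, expert              # identity, zero extra compute
    return np.maximum(x @ W[expert], 0), expert

x = rng.standard_normal(d)
y, chosen = moe_forward(x)
```

Because only the gating network is extended, the original experts (and their weights) are untouched, which is why the technique can be bolted onto an existing MoE model.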
Parameter Updater Expert in Mixture-of-Experts Layer
Mar 30, 2026
A neural architecture embedding a dedicated "updater expert" slot inside the MoE layer that dynamically modifies sibling experts' weights at inference. Enables the model to learn new rules at inference time without a separate training stage.
Publications
Parameter Updater Experts: Inference-Time Learning in MoE Models via DeltaNet-LoRA
Han, Jongyun · Apr 10, 2026
Proposes dedicated expert slots within MoE layers that generate weight modifications during forward passes. DeltaNet-LoRA achieves 100% in-context fact retrieval on OLMoE-1B-7B, and 80.1% persistent retrieval under sliding-window attention.
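The updater-expert mechanism can be sketched as a slot that, during the forward pass, emits low-rank (LoRA-style) factors whose product is added to a sibling expert's weight matrix, so later tokens see the modified weights. The matrices `U_A`/`U_B` and the function `forward_with_update` are hypothetical stand-ins, not the paper's actual DeltaNet-LoRA parameterization.

```python
import numpy as np

rng = np.random.default_rng(1)
d, r = 6, 2  # toy hidden size and LoRA rank

W_expert = rng.standard_normal((d, d)) * 0.1  # a sibling expert's weight

# Hypothetical "updater expert": maps the current token to LoRA factors
# A (d x r) and B (r x d) whose product becomes a weight delta.
U_A = rng.standard_normal((d, d * r)) * 0.05
U_B = rng.standard_normal((d, r * d)) * 0.05

def forward_with_update(x: np.ndarray, W: np.ndarray):
    A = (x @ U_A).reshape(d, r)
    B = (x @ U_B).reshape(r, d)
    W_new = W + A @ B        # low-rank delta generated at inference time
    return x @ W_new, W_new  # updated weights persist for later tokens

x1, x2 = rng.standard_normal(d), rng.standard_normal(d)
y1, W1 = forward_with_update(x1, W_expert)
y2, W2 = forward_with_update(x2, W1)  # second token sees updated weights
```

The key property the sketch shows is that the weight update happens inside the forward pass, so the model can absorb new information without any separate training stage.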
View on Zenodo →
Open Models
Nemotron-3-Super-64B-A12B-Math-REAP
Apr 22, 2026
A REAP-pruned version of NVIDIA Nemotron-3-Super-120B-A12B (512 → 256 experts, MTP layer removed), briefly fine-tuned with LoRA-RL on AIMO3 + AstralMath-v1, then quantized post-training. Released in three variants for different deployment trade-offs.
AIME 2026 avg@4 — FP8: 0.9167 · AWQ: 0.9083 vs. 0.9000 base (120B).
Variants: BF16 · FP8 · AWQ
View benchmarks on GitHub →
About Us
Max & Omnis is a technology startup developing AI infrastructure software (LLM integration platform) and AI application services. Built on patent-protected core technology, we create AI solutions that deliver value to both enterprises and end users.
Contact
We welcome inquiries about service adoption, technical partnerships, and investment.
madmax0404@maxandomnis.com
© 2026 Max & Omnis Inc. All rights reserved.
