SSovAIHub

Model Registry

Not every model qualifies for sovereign AI.

Sovereignty is not a model feature — it is a deployment decision. The same model can be fully sovereign when self-hosted or conditionally sovereign when managed by a cloud provider. This registry maps the distinction clearly.

Sovereignty Tiers

Three tiers. One question: who controls the inference?

Sovereignty is determined by where inference happens, who owns the runtime, and whether your data crosses a boundary you do not control.

Fully Sovereign

Model runs entirely on infrastructure you control. No data leaves your network boundary. Open weights, self-hosted, no licensing dependency on a third-party API.

  • Open-weight model you can download and self-host
  • Runs on your own hardware, VMs, or private cloud
  • Zero external API calls during inference
  • You own the runtime, the weights, and the output

Conditionally Sovereign

Model is hosted by a cloud provider within a dedicated tenant. Data does not leave your cloud account, but the model weights and runtime are controlled by the provider.

  • Hosted within your Azure, AWS, or GCP tenant
  • Data stays in your cloud region and account
  • Provider controls model weights and updates
  • Dependency on provider availability and pricing

Not Sovereign

Inference happens on the provider's shared infrastructure. Your prompts, documents, and responses leave your boundary and are processed externally.

  • API calls route to provider's shared servers
  • No control over where data is processed
  • Subject to provider's data retention policies
  • Not suitable for regulated or sensitive workloads

Fully Sovereign Models

Open-weight models you can self-host and fully control

These models can be downloaded, self-hosted on your own infrastructure, and run with zero external API calls. Your data never leaves your network boundary.

ModelProviderDeploymentContextEnterprise fitUse cases
Llama 3.1 8B / 70B / 405B
Most mature open-weight family for enterprise private deployment. All sizes available for self-hosting.
Meta (open weights)Ollama · vLLM · OpenShift AI128KHigh
RAGAgentsSummarization
Mistral 7B / Mixtral 8×7B
Highly efficient. Mixtral MoE architecture gives near-70B quality at lower compute cost.
Mistral AI (open weights)Ollama · vLLM · Docker32KHigh
RAGClassificationCode
Mistral Nemo 12B
Best-in-class at 12B scale. Long context makes it ideal for document intelligence pipelines.
Mistral AI (open weights)Ollama · vLLM128KHigh
DocumentsRAGLong context
Phi-3 / Phi-3.5 Mini
Exceptional quality-to-size ratio. Suitable for edge, laptop, or resource-constrained private deployments.
Microsoft (open weights)Ollama · Edge · Docker128KMedium
EdgeLightweightClassification
Gemma 2 9B / 27B
Strong reasoning capability. 27B model approaches GPT-3.5 quality in private deployment benchmarks.
Google (open weights)Ollama · vLLM8KMedium
RAGReasoningSummarization
Qwen 2.5 7B / 72B
Outstanding multilingual and code capability. 72B is competitive with GPT-4o on many enterprise tasks.
Alibaba (open weights)Ollama · vLLM128KHigh
MultilingualCodeRAG
DeepSeek R1 / V3
Exceptional reasoning model. R1 rivals o1-class performance when self-hosted. Requires GPU infrastructure.
DeepSeek (open weights)vLLM · OpenShift AI64KHigh
ReasoningAnalysisCode
Code Llama / StarCoder2
Purpose-built for code generation, completion, and review in private developer tooling.
Meta / BigCode (open weights)Ollama · vLLM16KMedium
CodeDeveloperCompletion

Conditionally Sovereign Models

Cloud-managed models within your tenant boundary

These models run inside your Azure, AWS, or GCP account. Your data stays within your cloud tenant, but model weights and runtime are controlled by the provider.

Conditional sovereignty depends on your cloud agreement, region selection, and data processing terms. Always verify provider data residency commitments before using these models with sensitive data.

ModelProvider / PlatformDeploymentContextEnterprise fitUse cases
GPT-4o / GPT-4 Turbo
Data stays within your Azure tenant and region. No training on your data by default. HIPAA and SOC2 eligible.
Microsoft via Azure OpenAIAzure OpenAI Service128KHigh
RAGAgentsEnterprise
GPT-3.5 Turbo
Lower cost option for high-volume RAG pipelines within Azure boundary. Good for summarization at scale.
Microsoft via Azure OpenAIAzure OpenAI Service16KHigh
RAGSummarizationHigh volume
Claude 3.x (Haiku / Sonnet / Opus)
Longest context window available in a managed tier. Data stays in your AWS account. Strong for document Q&A.
Anthropic via AWS BedrockAWS Bedrock200KHigh
DocumentsLong contextRAG
Llama 3 (Managed)
Open weights, managed runtime. A middle path when self-hosting GPU infrastructure is not yet possible.
Meta via AWS Bedrock / AzureAWS Bedrock · Azure AI128KHigh
RAGManagedTransition
Mistral Large / Small
Managed Mistral inside Azure boundary. Useful when GPU ops team is not available but data must stay in Azure.
Mistral via Azure AIAzure AI Foundry32KMedium
RAGManagedAzure

Not Sovereign — Public APIs

Models that cross your data boundary

These models process your data on shared provider infrastructure. They are not suitable for sovereign AI workloads involving sensitive, regulated, or confidential data.

Public API models are listed here for comparison, not recommendation. For many use cases without sensitive data, public APIs are practical. For sovereign AI, they are excluded by definition.

ModelProviderDeploymentContextSovereign fitSovereign alternative
GPT-4o / GPT-4
Prompts and documents are processed on OpenAI shared infrastructure. Not suitable for private or regulated data.
OpenAI (direct API)OpenAI API128KNot SovereignAzure OpenAI Service
Claude 3.x
Processed on Anthropic's infrastructure. Data leaves your boundary. Use AWS Bedrock for a conditional alternative.
Anthropic (direct API)Anthropic API200KNot SovereignAWS Bedrock
Gemini 1.5 Pro / Flash
Inference on Google's shared servers. Use Vertex AI within your GCP project for a conditionally sovereign path.
Google (direct API)Google AI Studio / API1MNot SovereignVertex AI (GCP)

Deployment Runtimes

Where and how sovereign models run

The runtime determines the sovereignty level, performance envelope, and operational complexity. Choose based on your infrastructure maturity and compliance requirements.

Fully Sovereign

Ollama

Local and single-server model runtime. Ideal for development, proof-of-concept, and low-volume private deployments.

Best for
Development · Laptops · Small teams
Models
Llama 3, Mistral, Phi-3, Gemma 2, Qwen 2.5
Note
One-command model pull and serve. No GPU required for smaller models. Not designed for production scale.
Fully Sovereign

vLLM

High-throughput inference server with continuous batching. Purpose-built for GPU-accelerated production workloads.

Best for
Production · GPU servers · High concurrency
Models
Llama 3, Mistral, Qwen 2.5, DeepSeek, Gemma 2
Note
OpenAI-compatible API surface. Supports tensor parallelism for large models. Preferred for production private RAG.
Fully Sovereign

OpenShift AI

Red Hat's ML platform for enterprise Kubernetes deployments. Integrates model serving with observability and governance.

Best for
Enterprise · Kubernetes · Air-gapped
Models
Llama 3, Mistral, DeepSeek R1
Note
Preferred for regulated industries, air-gapped environments, and enterprises already using OpenShift.
Conditionally Sovereign

Azure OpenAI Service

Microsoft-managed OpenAI models within your Azure subscription and region. HIPAA, SOC2, and EU data boundary eligible.

Best for
Azure-aligned orgs · Compliance · Fast start
Models
GPT-4o, GPT-3.5 Turbo
Note
No self-managed GPU infrastructure required. Data stays in your Azure tenant. Provider controls model updates.
Conditionally Sovereign

AWS Bedrock

Managed foundation model API within your AWS account. Supports Claude, Llama, Mistral, and others with VPC integration.

Best for
AWS-aligned orgs · Multi-model · Compliance
Models
Claude 3, Llama 3, Mistral Large
Note
Data stays in your AWS account and region. No cross-account data sharing. Supports PrivateLink for VPC isolation.

Decision Guide

Which model path fits your situation?

Sovereignty requirements, infrastructure maturity, and compliance obligations determine the right deployment path — not model quality rankings.

Situation

You handle regulated data (HIPAA, GDPR, financial)

Recommendation

Fully Sovereign or Conditionally Sovereign only

Self-host on vLLM / OpenShift AI, or use Azure OpenAI / AWS Bedrock within your tenant

Situation

You have a GPU-equipped private server or Kubernetes cluster

Recommendation

Fully Sovereign — self-hosted

Llama 3.1 70B or Mistral Nemo on vLLM. OpenShift AI if enterprise Kubernetes is already in use.

Situation

You want private AI but don't have GPU infrastructure yet

Recommendation

Conditionally Sovereign — managed cloud

Azure OpenAI (if Azure-aligned) or AWS Bedrock (if AWS-aligned). Plan migration to self-hosted as GPU capacity grows.

Situation

You need to run AI on laptops or edge devices

Recommendation

Fully Sovereign — local runtime

Phi-3 Mini or Llama 3.1 8B on Ollama. Works without internet, ideal for field teams or disconnected environments.

Situation

You need maximum model quality right now

Recommendation

Conditionally Sovereign

GPT-4o via Azure OpenAI or Claude Sonnet via AWS Bedrock. Not fully sovereign but data stays in your cloud tenant.

Situation

You are building a prototype or internal demo

Recommendation

Start with Ollama locally, plan for vLLM in production

Llama 3.1 8B on Ollama for speed. Design the application layer to swap the model endpoint without rewriting the app.

Registry Principles

How SovAIHub evaluates models

This registry does not rank models on benchmark performance. It evaluates them on sovereignty, deployment control, and enterprise operational fit.

Boundary control

Where does inference happen? Who owns the compute? Can you prevent data from leaving your network? These questions determine sovereignty, not model size or capability.

Deployment operability

A sovereign model you cannot realistically operate is not a useful recommendation. Models are rated on whether enterprise teams can deploy, monitor, and maintain them.

Weight availability

Fully sovereign models require open or licensed weights you can download. A model that requires an external API for every inference is not self-hosted, regardless of marketing language.

Enterprise context fit

Context window, throughput, and accuracy on enterprise tasks (document Q&A, summarization, classification) matter more than general benchmark scores for private RAG workloads.

Regulatory alignment

Models deployed in regulated industries must support data residency, audit logging, and access control. The registry notes which deployment methods support these requirements.

Runtime independence

SovAIHub favors models that can be served across multiple runtimes (Ollama, vLLM, OpenShift AI) without vendor lock-in. Application code should route to a model endpoint, not a provider.

Next Step

Need help selecting and deploying the right model stack?

Model selection depends on your data classification, infrastructure, compliance requirements, and team capability. SovAIHub can help you map the right path.

Sovereign model architecture

Select the right model. Deploy it on infrastructure you control.

Share your use case, data sensitivity, infrastructure environment, and compliance requirements. We will help you identify a practical sovereign model deployment path.

Discuss Your Model Stack