Public Cloud AI Security: Azure OpenAI, AWS Bedrock, and Google Vertex AI
Cloud AI services come with strong security capabilities built in. Most breaches happen because those capabilities are never configured. Here is what to configure on each major platform.
The Cloud Does Not Mean Someone Else's Problem
I regularly hear two extremes when teams discuss cloud AI security: either "it is in the cloud, so security is handled," or the team is so overwhelmed by the complexity that it does not know where to begin.
The reality sits in between. Cloud AI services like Azure OpenAI, AWS Bedrock, and Google Vertex AI ship with strong security capabilities. But those capabilities only protect you when you configure them correctly. Misconfiguration is the primary cloud security failure mode, and AI services are no exception.
Here is what actually matters on each platform, along with the most common misconfigurations I see in production environments.
The Shared Responsibility Model for Cloud AI
Before diving into specifics, understand what the cloud provider secures and what falls on you:
| Responsibility | Provider | You |
|---|---|---|
| Physical data center and hardware | ✓ | |
| Hypervisor and host OS | ✓ | |
| Model training and infrastructure | ✓ | |
| Platform availability and patching | ✓ | |
| Identity and access management | Tooling provided | Configure it correctly |
| Network access controls | Tooling provided | Configure it correctly |
| Data encryption | Default or configure | Enable and manage keys |
| Audit logging | Available | Enable and monitor |
| Content filtering | Available | Enable and tune |
| Application-layer security | | ✓ |
| Prompt security | | ✓ |
| What data you send to the model | | ✓ |
Azure OpenAI Service
Azure OpenAI is Microsoft's enterprise platform for OpenAI models. It adds enterprise controls—private networking, RBAC, content filtering, managed identity—on top of the same GPT-4o and other models you already know.
Private Endpoints: Do This First
By default, Azure OpenAI is accessible over the public internet, protected only by your API key or managed identity token. For enterprise workloads with sensitive data, this is not acceptable. Enable private endpoints:
# Terraform: Azure OpenAI with private endpoint and public access disabled
resource "azurerm_cognitive_account" "openai" {
name = "openai-${var.environment}"
location = var.location
resource_group_name = var.resource_group_name
kind = "OpenAI"
sku_name = "S0"
custom_subdomain_name = "openai-${var.environment}" # Required for private endpoints and Entra ID auth
public_network_access_enabled = false # Disable public internet access
network_acls {
default_action = "Deny"
}
identity {
type = "SystemAssigned"
}
}
resource "azurerm_private_endpoint" "openai" {
name = "pe-openai-${var.environment}"
location = var.location
resource_group_name = var.resource_group_name
subnet_id = var.private_endpoint_subnet_id
private_service_connection {
name = "openai-privatelink"
private_connection_resource_id = azurerm_cognitive_account.openai.id
subresource_names = ["account"]
is_manual_connection = false
}
}
With this configuration, traffic to your Azure OpenAI instance never leaves Microsoft's network backbone.
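One quick way to confirm the private endpoint works from inside the VNet is to check that the account hostname now resolves to a private address (served by the privatelink.openai.azure.com private DNS zone). A minimal sketch; the hostname is a placeholder for your own endpoint:
# Run from a VM inside the VNet: the endpoint should resolve to a
# private IP from the private DNS zone, not a public address.
import socket
import ipaddress

# Placeholder: substitute your own account's endpoint hostname
hostname = "openai-prod.openai.azure.com"

addr = socket.getaddrinfo(hostname, 443)[0][4][0]
if ipaddress.ip_address(addr).is_private:
    print(f"{hostname} resolves privately to {addr}")
else:
    print(f"WARNING: {hostname} resolves publicly to {addr} - check your private DNS zone")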
Use Managed Identity, Not API Keys
API keys are secrets that can be leaked, stolen, or accidentally committed to version control. Managed Identity is Azure's preferred authentication mechanism—no secrets to manage.
# Application code using Managed Identity (Python)
import os
from openai import AzureOpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
token_provider = get_bearer_token_provider(
DefaultAzureCredential(),
"https://cognitiveservices.azure.com/.default"
)
client = AzureOpenAI(
azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
azure_ad_token_provider=token_provider, # No API key needed
api_version="2024-10-21"
)
Assign the application's managed identity the minimum required role:
# Grant the application's identity permission to call the API only
# Role: Cognitive Services OpenAI User
az role assignment create --assignee "${app_managed_identity_id}" --role "Cognitive Services OpenAI User" --scope "${openai_resource_id}"
Do not use the Cognitive Services OpenAI Contributor or higher roles for applications. Those allow configuration changes, not just inference calls.
Enable Content Filtering and Prompt Shields
Azure OpenAI's content filtering is configurable per deployment. Prompt Shields specifically blocks prompt injection and jailbreak attempts:
Content filter capabilities to review and configure:
Standard categories (configure thresholds):
- Hate speech
- Sexual content
- Violence
- Self-harm
Advanced protections (enable all):
- Prompt Shields: Blocks prompt injection and jailbreak attempts
- Groundedness detection: For RAG applications, detects hallucinations
- Protected material detection: Detects copyrighted content
- Custom blocklists: Add domain-specific prohibited terms or patterns
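When a filter triggers, the service rejects the request rather than silently altering it, so your application needs to handle that path. A minimal sketch, assuming the client from the Managed Identity example above; Azure OpenAI signals a filtered prompt with an HTTP 400 whose error code is content_filter:
# Handle a blocked prompt: Azure OpenAI returns HTTP 400 with error
# code "content_filter" when Prompt Shields or a category filter trips.
from openai import BadRequestError

def ask(client, user_message: str) -> str:
    try:
        response = client.chat.completions.create(
            model="gpt-4o",  # in Azure, this is your deployment name
            messages=[{"role": "user", "content": user_message}],
        )
        return response.choices[0].message.content
    except BadRequestError as e:
        if e.code == "content_filter":
            # Log the event for review; do not echo filter details to the user
            return "Your request could not be processed."
        raise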
Enable Diagnostic Logging
Logging is off by default. Turn it on from day one:
az monitor diagnostic-settings create \
  --name "openai-security-logs" \
  --resource "${openai_resource_id}" \
  --logs '[{"category": "Audit", "enabled": true}, {"category": "RequestResponse", "enabled": true}]' \
  --metrics '[{"category": "AllMetrics", "enabled": true}]' \
  --workspace "${log_analytics_workspace_id}"
These logs capture API calls, token counts, content filter triggers, and configuration changes—everything you need for incident investigation and compliance.
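Once logs flow into Log Analytics, you can query them programmatically. A sketch using the azure-monitor-query SDK; the table and column names below are what Azure OpenAI diagnostics typically emit, so verify them against your own workspace schema:
# Summarize recent Azure OpenAI request volume by caller IP.
# Table/column names are assumptions - confirm against your workspace.
from datetime import timedelta
from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

client = LogsQueryClient(DefaultAzureCredential())

query = """
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| where Category == "RequestResponse"
| summarize requests = count() by CallerIPAddress, bin(TimeGenerated, 1h)
| order by requests desc
"""

result = client.query_workspace(
    workspace_id="<your-log-analytics-workspace-id>",  # placeholder
    query=query,
    timespan=timedelta(days=1),
)
for table in result.tables:
    for row in table.rows:
        print(row)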
AWS Bedrock
AWS Bedrock offers models from Anthropic (Claude), Meta, Mistral, Cohere, and Amazon's own Titan family, all through a unified AWS API. The security model integrates cleanly with existing AWS IAM and networking.
VPC Endpoints: Keep Traffic Internal
Like Azure private endpoints, Bedrock VPC interface endpoints keep inference traffic within AWS's network:
# Terraform: Bedrock VPC endpoint
resource "aws_vpc_endpoint" "bedrock_runtime" {
vpc_id = var.vpc_id
service_name = "com.amazonaws.${var.region}.bedrock-runtime"
vpc_endpoint_type = "Interface"
subnet_ids = var.private_subnet_ids
security_group_ids = [aws_security_group.bedrock_endpoint_sg.id]
private_dns_enabled = true
policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Effect = "Allow"
Principal = { AWS = [var.application_role_arn] }
Action = ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"]
Resource = "arn:aws:bedrock:${var.region}::foundation-model/*"
}]
})
}
resource "aws_security_group" "bedrock_endpoint_sg" {
name = "bedrock-endpoint"
vpc_id = var.vpc_id
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
security_groups = [var.application_sg_id] # Only from your application tier
}
}
IAM: Lock Down Model Access
Create a dedicated IAM policy that allows only the specific models your application uses:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"],
"Resource": [
"arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20241022-v2:0"
],
"Condition": {
"StringEquals": {
"aws:RequestedRegion": "us-east-1"
}
}
}
]
}
This denies access to all other models and all model management operations. Your inference application should not be able to list models, create fine-tuning jobs, or access model evaluation—only invoke the specific model it needs.
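In practice the policy fails closed: invoking any model outside the allow-list raises an AccessDeniedException. A minimal sketch of verifying that, using a model ID deliberately absent from the policy above:
# Verify least privilege: invoking an unlisted model should be denied.
import json
import boto3
from botocore.exceptions import ClientError

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 16,
    "messages": [{"role": "user", "content": "ping"}],
})

try:
    # This model ID is intentionally not in the IAM policy
    bedrock.invoke_model(modelId="anthropic.claude-3-haiku-20240307-v1:0", body=body)
    print("WARNING: policy is broader than intended")
except ClientError as e:
    if e.response["Error"]["Code"] == "AccessDeniedException":
        print("Denied as expected - policy is scoped correctly")
    else:
        raise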
Bedrock Guardrails
Guardrails is AWS's configurable content filtering layer for Bedrock. Unlike model-level safety (which you cannot configure), Guardrails gives you control:
# Apply guardrails on every model invocation
import boto3, json
bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')
GUARDRAIL_ID = "<your-guardrail-id>"  # ID of the guardrail you created in Bedrock
response = bedrock.invoke_model(
modelId='anthropic.claude-3-5-sonnet-20241022-v2:0',
guardrailIdentifier=GUARDRAIL_ID,
guardrailVersion='DRAFT',
body=json.dumps({
"anthropic_version": "bedrock-2023-05-31",
"max_tokens": 2048,
"messages": [{"role": "user", "content": user_message}]
})
)
Guardrails capabilities to configure:
- Denied topics: Define topics your AI should never engage with (e.g., competitor discussions, investment advice)
- Content filters: Configurable thresholds for harmful content categories
- PII redaction: Automatically detect and mask personal information before it reaches the model
- Grounding: For RAG applications, detect when the model goes beyond the provided context
The PII redaction capability is particularly valuable—it can automatically strip common PII types before they are processed or appear in responses.
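You can also exercise a guardrail directly, without a model call, through the ApplyGuardrail API. This is useful for testing PII redaction before wiring it into the inference path. A sketch, using the same guardrail ID placeholder as above:
# Test PII redaction standalone via the ApplyGuardrail API.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
GUARDRAIL_ID = "<your-guardrail-id>"  # placeholder, as above

response = bedrock.apply_guardrail(
    guardrailIdentifier=GUARDRAIL_ID,
    guardrailVersion="DRAFT",
    source="INPUT",  # evaluate as user input; use "OUTPUT" for model responses
    content=[{"text": {"text": "My SSN is 123-45-6789, can you help?"}}],
)

print(response["action"])  # "GUARDRAIL_INTERVENED" if the guardrail acted
for output in response.get("outputs", []):
    print(output["text"])  # masked text with the PII replaced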
Google Vertex AI
Google's AI platform covers Gemini models and a growing catalog of open-source models, integrated tightly with Google Cloud's security tooling.
VPC Service Controls
VPC Service Controls is Google Cloud's most powerful data exfiltration prevention mechanism. It creates a security perimeter around cloud services, including Vertex AI—requests from outside the perimeter are blocked regardless of authentication:
# Add Vertex AI to your security perimeter
resource "google_access_context_manager_service_perimeter" "ai_perimeter" {
parent = "accessPolicies/${var.access_policy_id}"
name = "accessPolicies/${var.access_policy_id}/servicePerimeters/ai_perimeter"
title = "AI Security Perimeter"
status { # "status" enforces the perimeter; "spec" alone would only define a dry run
restricted_services = [
"aiplatform.googleapis.com",
"storage.googleapis.com", # Also protect training data and model artifacts
]
resources = ["projects/${var.project_number}"]
access_levels = [var.trusted_access_level]
}
}

With VPC Service Controls, even a stolen service account key cannot be used to exfiltrate data outside the perimeter.
IAM: Principle of Least Privilege
# Vertex AI roles, from least to most privileged:
# roles/aiplatform.viewer - Can view resources, no predictions
# roles/aiplatform.user - Can call prediction endpoints
# roles/aiplatform.admin - Full access - use sparingly
# For GKE workloads: use Workload Identity, not service account keys
resource "google_service_account_iam_binding" "workload_identity" {
service_account_id = google_service_account.ai_inference.name
role = "roles/iam.workloadIdentityUser"
members = [
"serviceAccount:${var.project_id}.svc.id.goog[${var.k8s_namespace}/${var.k8s_sa_name}]"
]
}
Workload Identity eliminates the need for service account key files entirely—a significant security improvement.
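With Workload Identity in place, application code needs no credentials at all: Application Default Credentials resolves the pod's identity automatically. A minimal sketch using the vertexai SDK, where the project, region, and model name are placeholders:
# No key file anywhere: Application Default Credentials picks up the
# pod's Workload Identity automatically.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="your-project-id", location="us-central1")  # placeholders

model = GenerativeModel("gemini-1.5-flash")  # placeholder model name
response = model.generate_content("Summarize our VPC-SC posture in one line.")
print(response.text)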
Common Misconfigurations Across All Platforms
These mistakes appear consistently regardless of which cloud provider you use:
1. Public Endpoints with No IP Restrictions
The most common finding. The AI service endpoint is publicly accessible, protected only by an API key or token. An exposed key means complete access to the model and anything it can see. Fix: Enable private endpoints. If private endpoints are not feasible, at minimum restrict public access to your application's known IP ranges.
2. Over-Privileged Service Accounts
Application service accounts with Owner or Contributor roles at the resource group or project level. All they need is permission to invoke specific AI models. Fix: Create dedicated service accounts with least-privilege roles scoped to specific resources. Review and reduce permissions quarterly.
3. Audit Logging Disabled
Diagnostic logging is frequently off by default. Without logs, you cannot detect abuse, investigate incidents, meet compliance requirements, or understand your actual usage patterns. Fix: Enable diagnostic logging from day one. Budget the storage cost as a security investment.
4. Content Filtering Disabled in Production
Teams disable content filtering during development to avoid false positives. They forget to re-enable it before production deployment. Fix: Keep content filtering enabled in production. Tune filters to reduce false positives rather than disabling them.
5. API Keys Instead of Managed Identities
Despite managed identity support on all three platforms, teams default to API keys because they are simpler to configure locally. Fix: Use managed identity/service accounts in production. Accept the configuration overhead—it eliminates an entire category of credential exposure risk.
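Several of these checks are scriptable. As one example, a sketch that flags Azure OpenAI accounts still exposing a public endpoint (misconfiguration #1), using the azure-mgmt-cognitiveservices SDK; the subscription ID is a placeholder:
# Flag Azure OpenAI accounts that still allow public network access.
from azure.identity import DefaultAzureCredential
from azure.mgmt.cognitiveservices import CognitiveServicesManagementClient

client = CognitiveServicesManagementClient(
    DefaultAzureCredential(), subscription_id="<your-subscription-id>"
)

for account in client.accounts.list():
    if account.kind == "OpenAI":
        access = account.properties.public_network_access
        if access != "Disabled":
            print(f"REVIEW: {account.name} has public_network_access={access}")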
Data Residency and Provider Data Handling
A common concern in enterprise AI: "Does our data leave our region, and is it used to train the provider's models?"
| Provider | Used for training | Data retention | Regional deployments |
|---|---|---|---|
| Azure OpenAI | No | Up to 30 days for abuse monitoring by default (exemption available) | Most Azure regions |
| AWS Bedrock | No | Session only by default | Most AWS regions |
| Google Vertex AI | No (enterprise tier) | Configurable | Most GCP regions |
Where to Start
If you are configuring cloud AI security from scratch, do these in order:
- Enable private/VPC endpoints — highest impact, keeps traffic off the public internet
- Configure least-privilege IAM — service accounts with only the permissions they need
- Enable audit logging and send to your SIEM — you need visibility from the first request
- Enable content filtering — use built-in capabilities before building custom solutions
- Set budget alerts — cost anomalies are often security indicators
- Review configurations quarterly — cloud services add features; some are security-relevant
Cloud AI security is cloud security applied to a new service category. The teams that do this well extend their existing cloud security practices to AI services rather than treating AI as a special case requiring a completely new approach.