Public Cloud AI Security: Azure OpenAI, AWS Bedrock, and Google Vertex AI
Cloud AI services come with strong security capabilities built in. Most breaches happen because those capabilities are never configured. Here is what to configure on each major platform.
The Cloud Does Not Mean Someone Else's Problem
I regularly hear two extremes when teams discuss cloud AI security: either "it is in the cloud, so security is handled," or the team is so overwhelmed by the complexity that it does not know where to begin.
The reality sits in between. Cloud AI services like Azure OpenAI, AWS Bedrock, and Google Vertex AI ship with strong security capabilities. But those capabilities only protect you when you configure them correctly. Misconfiguration is the primary cloud security failure mode, and AI services are no exception.
Here is what actually matters on each platform, along with the most common misconfigurations I see in production environments.
The Shared Responsibility Model for Cloud AI
Before diving into specifics, understand what the cloud provider secures and what falls on you:
| Responsibility | Provider | You |
|---|---|---|
| Physical data center and hardware | ✓ | |
| Hypervisor and host OS | ✓ | |
| Model training and infrastructure | ✓ | |
| Platform availability and patching | ✓ | |
| Identity and access management | Tooling provided | Configure it correctly |
| Network access controls | Tooling provided | Configure it correctly |
| Data encryption | Default or configure | Enable and manage keys |
| Audit logging | Available | Enable and monitor |
| Content filtering | Available | Enable and tune |
| Application-layer security | | ✓ |
| Prompt security | | ✓ |
| What data you send to the model | | ✓ |
Azure OpenAI Service
Azure OpenAI is Microsoft's enterprise platform for OpenAI models. It adds enterprise controls—private networking, RBAC, content filtering, managed identity—on top of the same GPT-4o and other models you already know.
Private Endpoints: Do This First
By default, Azure OpenAI is accessible over the public internet, protected only by your API key or managed identity token. For enterprise workloads with sensitive data, this is not acceptable. Enable private endpoints:
# Terraform: Azure OpenAI with private endpoint and public access disabled
resource "azurerm_cognitive_account" "openai" {
name = "openai-${var.environment}"
location = var.location
resource_group_name = var.resource_group_name
kind = "OpenAI"
sku_name = "S0"
custom_subdomain_name = "openai-${var.environment}" # Required for private endpoints and Entra ID auth
public_network_access_enabled = false # Disable public internet access
network_acls {
default_action = "Deny"
}
identity {
type = "SystemAssigned"
}
}
resource "azurerm_private_endpoint" "openai" {
name = "pe-openai-${var.environment}"
location = var.location
resource_group_name = var.resource_group_name
subnet_id = var.private_endpoint_subnet_id
private_service_connection {
name = "openai-privatelink"
private_connection_resource_id = azurerm_cognitive_account.openai.id
subresource_names = ["account"]
is_manual_connection = false
}
}
With this configuration, traffic to your Azure OpenAI instance never leaves Microsoft's network backbone.
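One quick way to confirm the private endpoint works from inside the VNet is to check that the account hostname now resolves to a private address (served by the privatelink.openai.azure.com private DNS zone). A minimal sketch; the hostname is a placeholder for your own endpoint:
# Run from a VM inside the VNet: the endpoint should resolve to a
# private IP from the private DNS zone, not a public address.
import socket
import ipaddress

# Placeholder: substitute your own account's endpoint hostname
hostname = "openai-prod.openai.azure.com"

addr = socket.getaddrinfo(hostname, 443)[0][4][0]
if ipaddress.ip_address(addr).is_private:
    print(f"{hostname} resolves privately to {addr}")
else:
    print(f"WARNING: {hostname} resolves publicly to {addr} - check your private DNS zone")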
Use Managed Identity, Not API Keys
API keys are secrets that can be leaked, stolen, or accidentally committed to version control. Managed Identity is Azure's preferred authentication mechanism—no secrets to manage.
# Application code using Managed Identity (Python)
import os
from openai import AzureOpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
token_provider = get_bearer_token_provider(
DefaultAzureCredential(),
"https://cognitiveservices.azure.com/.default"
)
client = AzureOpenAI(
azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
azure_ad_token_provider=token_provider, # No API key needed
api_version="2024-10-21"
)
Assign the application's managed identity the minimum required role:
# Grant the application's identity permission to call the API only
# Role: Cognitive Services OpenAI User
az role assignment create --assignee "${app_managed_identity_id}" --role "Cognitive Services OpenAI User" --scope "${openai_resource_id}"
Do not use the Cognitive Services OpenAI Contributor or higher roles for applications. Those allow configuration changes, not just inference calls.
Enable Content Filtering and Prompt Shields
Azure OpenAI's content filtering is configurable per deployment. Prompt Shields specifically blocks prompt injection and jailbreak attempts:
Content filter capabilities to review and configure:
Standard categories (configure thresholds):
- Hate speech
- Sexual content
- Violence
- Self-harm
Advanced protections (enable all):
- Prompt Shields: Blocks prompt injection and jailbreak attempts
- Groundedness detection: For RAG applications, detects hallucinations
- Protected material detection: Detects copyrighted content
- Custom blocklists: Add domain-specific prohibited terms or patterns
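When a filter triggers, the service rejects the request rather than silently altering it, so your application needs to handle that path. A minimal sketch, assuming the client from the Managed Identity example above; Azure OpenAI signals a filtered prompt with an HTTP 400 whose error code is content_filter:
# Handle a blocked prompt: Azure OpenAI returns HTTP 400 with error
# code "content_filter" when Prompt Shields or a category filter trips.
from openai import BadRequestError

def ask(client, user_message: str) -> str:
    try:
        response = client.chat.completions.create(
            model="gpt-4o",  # in Azure, this is your deployment name
            messages=[{"role": "user", "content": user_message}],
        )
        return response.choices[0].message.content
    except BadRequestError as e:
        if e.code == "content_filter":
            # Log the event for review; do not echo filter details to the user
            return "Your request could not be processed."
        raise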
Enable Diagnostic Logging
Logging is off by default. Turn it on from day one:
az monitor diagnostic-settings create \
  --name "openai-security-logs" \
  --resource "${openai_resource_id}" \
  --logs '[{"category": "Audit", "enabled": true}, {"category": "RequestResponse", "enabled": true}]' \
  --metrics '[{"category": "AllMetrics", "enabled": true}]' \
  --workspace "${log_analytics_workspace_id}"
These logs capture API calls, token counts, content filter triggers, and configuration changes—everything you need for incident investigation and compliance.
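Once logs flow into Log Analytics, you can query them programmatically. A sketch using the azure-monitor-query SDK; the table and column names below are what Azure OpenAI diagnostics typically emit, so verify them against your own workspace schema:
# Summarize recent Azure OpenAI request volume by caller IP.
# Table/column names are assumptions - confirm against your workspace.
from datetime import timedelta
from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

client = LogsQueryClient(DefaultAzureCredential())

query = """
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| where Category == "RequestResponse"
| summarize requests = count() by CallerIPAddress, bin(TimeGenerated, 1h)
| order by requests desc
"""

result = client.query_workspace(
    workspace_id="<your-log-analytics-workspace-id>",  # placeholder
    query=query,
    timespan=timedelta(days=1),
)
for table in result.tables:
    for row in table.rows:
        print(row)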
AWS Bedrock
AWS Bedrock offers models from Anthropic (Claude), Meta, Mistral, Cohere, and Amazon's own Titan family, all through a unified AWS API. The security model integrates cleanly with existing AWS IAM and networking.
VPC Endpoints: Keep Traffic Internal
Like Azure private endpoints, Bedrock VPC interface endpoints keep inference traffic within AWS's network:
# Terraform: Bedrock VPC endpoint
resource "aws_vpc_endpoint" "bedrock_runtime" {
vpc_id = var.vpc_id
service_name = "com.amazonaws.${var.region}.bedrock-runtime"
vpc_endpoint_type = "Interface"
subnet_ids = var.private_subnet_ids
security_group_ids = [aws_security_group.bedrock_endpoint_sg.id]
private_dns_enabled = true
policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Effect = "Allow"
Principal = { AWS = [var.application_role_arn] }
Action = ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"]
Resource = "arn:aws:bedrock:${var.region}::foundation-model/*"
}]
})
}
resource "aws_security_group" "bedrock_endpoint_sg" {
name = "bedrock-endpoint"
vpc_id = var.vpc_id
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
security_groups = [var.application_sg_id] # Only from your application tier
}
}
IAM: Lock Down Model Access
Create a dedicated IAM policy that allows only the specific models your application uses:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"],
"Resource": [
"arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20241022-v2:0"
],
"Condition": {
"StringEquals": {
"aws:RequestedRegion": "us-east-1"
}
}
}
]
}
This denies access to all other models and all model management operations. Your inference application should not be able to list models, create fine-tuning jobs, or access model evaluation—only invoke the specific model it needs.
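In practice the policy fails closed: invoking any model outside the allow-list raises an AccessDeniedException. A minimal sketch of verifying that, using a model ID deliberately absent from the policy above:
# Verify least privilege: invoking an unlisted model should be denied.
import json
import boto3
from botocore.exceptions import ClientError

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 16,
    "messages": [{"role": "user", "content": "ping"}],
})

try:
    # This model ID is intentionally not in the IAM policy
    bedrock.invoke_model(modelId="anthropic.claude-3-haiku-20240307-v1:0", body=body)
    print("WARNING: policy is broader than intended")
except ClientError as e:
    if e.response["Error"]["Code"] == "AccessDeniedException":
        print("Denied as expected - policy is scoped correctly")
    else:
        raise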
Bedrock Guardrails
Guardrails is AWS's configurable content filtering layer for Bedrock. Unlike model-level safety (which you cannot configure), Guardrails gives you control:
# Apply guardrails on every model invocation
import boto3, json
bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')
GUARDRAIL_ID = "<your-guardrail-id>"  # ID of the guardrail you created in Bedrock
response = bedrock.invoke_model(
modelId='anthropic.claude-3-5-sonnet-20241022-v2:0',
guardrailIdentifier=GUARDRAIL_ID,
guardrailVersion='DRAFT',
body=json.dumps({
"anthropic_version": "bedrock-2023-05-31",
"max_tokens": 2048,
"messages": [{"role": "user", "content": user_message}]
})
)
Guardrails capabilities to configure:
- Denied topics: Define topics your AI should never engage with (e.g., competitor discussions, investment advice)
- Content filters: Configurable thresholds for harmful content categories
- PII redaction: Automatically detect and mask personal information before it reaches the model
- Grounding: For RAG applications, detect when the model goes beyond the provided context
The PII redaction capability is particularly valuable—it can automatically strip common PII types before they are processed or appear in responses.
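You can also exercise a guardrail directly, without a model call, through the ApplyGuardrail API. This is useful for testing PII redaction before wiring it into the inference path. A sketch, using the same guardrail ID placeholder as above:
# Test PII redaction standalone via the ApplyGuardrail API.
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
GUARDRAIL_ID = "<your-guardrail-id>"  # placeholder, as above

response = bedrock.apply_guardrail(
    guardrailIdentifier=GUARDRAIL_ID,
    guardrailVersion="DRAFT",
    source="INPUT",  # evaluate as user input; use "OUTPUT" for model responses
    content=[{"text": {"text": "My SSN is 123-45-6789, can you help?"}}],
)

print(response["action"])  # "GUARDRAIL_INTERVENED" if the guardrail acted
for output in response.get("outputs", []):
    print(output["text"])  # masked text with the PII replaced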
Google Vertex AI
Google's AI platform covers Gemini models and a growing catalog of open-source models, integrated tightly with Google Cloud's security tooling.
VPC Service Controls
VPC Service Controls is Google Cloud's most powerful data exfiltration prevention mechanism. It creates a security perimeter around cloud services, including Vertex AI—requests from outside the perimeter are blocked regardless of authentication:
# Add Vertex AI to your security perimeter
resource "google_access_context_manager_service_perimeter" "ai_perimeter" {
parent = "accessPolicies/${var.access_policy_id}"
name = "accessPolicies/${var.access_policy_id}/servicePerimeters/ai_perimeter"
title = "AI Security Perimeter"
status { # "status" enforces the perimeter; "spec" alone would only define a dry run
restricted_services = [
"aiplatform.googleapis.com",
"storage.googleapis.com", # Also protect training data and model artifacts
]
resources = ["projects/${var.project_number}"]
access_levels = [var.trusted_access_level]
}
}

With VPC Service Controls, even a stolen service account key cannot be used to exfiltrate data outside the perimeter.
IAM: Principle of Least Privilege
# Vertex AI roles, from least to most privileged:
# roles/aiplatform.viewer - Can view resources, no predictions
# roles/aiplatform.user - Can call prediction endpoints
# roles/aiplatform.admin - Full access - use sparingly
# For GKE workloads: use Workload Identity, not service account keys
resource "google_service_account_iam_binding" "workload_identity" {
service_account_id = google_service_account.ai_inference.name
role = "roles/iam.workloadIdentityUser"
members = [
"serviceAccount:${var.project_id}.svc.id.goog[${var.k8s_namespace}/${var.k8s_sa_name}]"
]
}
Workload Identity eliminates the need for service account key files entirely—a significant security improvement.
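With Workload Identity in place, application code needs no credentials at all: Application Default Credentials resolves the pod's identity automatically. A minimal sketch using the vertexai SDK, where the project, region, and model name are placeholders:
# No key file anywhere: Application Default Credentials picks up the
# pod's Workload Identity automatically.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="your-project-id", location="us-central1")  # placeholders

model = GenerativeModel("gemini-1.5-flash")  # placeholder model name
response = model.generate_content("Summarize our VPC-SC posture in one line.")
print(response.text)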
Common Misconfigurations Across All Platforms
These mistakes appear consistently regardless of which cloud provider you use:
1. Public Endpoints with No IP Restrictions
The most common finding. The AI service endpoint is publicly accessible, protected only by an API key or token. An exposed key means complete access to the model and anything it can see. Fix: Enable private endpoints. If private endpoints are not feasible, at minimum restrict public access to your application's known IP ranges.
2. Over-Privileged Service Accounts
Application service accounts with Owner or Contributor roles at the resource group or project level. All they need is permission to invoke specific AI models. Fix: Create dedicated service accounts with least-privilege roles scoped to specific resources. Review and reduce permissions quarterly.
3. Audit Logging Disabled
Diagnostic logging is frequently off by default. Without logs, you cannot detect abuse, investigate incidents, meet compliance requirements, or understand your actual usage patterns. Fix: Enable diagnostic logging from day one. Budget the storage cost as a security investment.
4. Content Filtering Disabled in Production
Teams disable content filtering during development to avoid false positives. They forget to re-enable it before production deployment. Fix: Keep content filtering enabled in production. Tune filters to reduce false positives rather than disabling them.
5. API Keys Instead of Managed Identities
Despite managed identity support on all three platforms, teams default to API keys because they are simpler to configure locally. Fix: Use managed identity/service accounts in production. Accept the configuration overhead—it eliminates an entire category of credential exposure risk.
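Several of these checks are scriptable. As one example, a sketch that flags Azure OpenAI accounts still exposing a public endpoint (misconfiguration #1), using the azure-mgmt-cognitiveservices SDK; the subscription ID is a placeholder:
# Flag Azure OpenAI accounts that still allow public network access.
from azure.identity import DefaultAzureCredential
from azure.mgmt.cognitiveservices import CognitiveServicesManagementClient

client = CognitiveServicesManagementClient(
    DefaultAzureCredential(), subscription_id="<your-subscription-id>"
)

for account in client.accounts.list():
    if account.kind == "OpenAI":
        access = account.properties.public_network_access
        if access != "Disabled":
            print(f"REVIEW: {account.name} has public_network_access={access}")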
Data Residency and Provider Data Handling
A common concern in enterprise AI: "Does our data leave our region, and is it used to train the provider's models?"
| Provider | Used for training | Data retention | Regional deployments |
|---|---|---|---|
| Azure OpenAI | No | Up to 30 days for abuse monitoring by default (exemption available) | Most Azure regions |
| AWS Bedrock | No | Session only by default | Most AWS regions |
| Google Vertex AI | No (enterprise tier) | Configurable | Most GCP regions |
Where to Start
If you are configuring cloud AI security from scratch, do these in order:
- Enable private/VPC endpoints — highest impact, keeps traffic off the public internet
- Configure least-privilege IAM — service accounts with only the permissions they need
- Enable audit logging and send to your SIEM — you need visibility from the first request
- Enable content filtering — use built-in capabilities before building custom solutions
- Set budget alerts — cost anomalies are often security indicators
- Review configurations quarterly — cloud services add features; some are security-relevant
Cloud AI security is cloud security applied to a new service category. The teams that do this well extend their existing cloud security practices to AI services rather than treating AI as a special case requiring a completely new approach.