Cyber Intelligence
AI Security · 15 min read

Public Cloud AI Security: Azure OpenAI, AWS Bedrock, and Google Vertex AI

Cloud AI services come with strong security capabilities built in. Most breaches happen because those capabilities are never configured. Here is what to configure on each major platform.

Microsoft Cloud Solution Architect
Azure OpenAI, AWS Bedrock, Google Vertex AI, Cloud AI Security, AI Security, Cloud Security, Enterprise AI
Video transcript

Your AI model is running on Azure OpenAI right now. But here's the thing: ninety-four percent of cloud AI breaches aren't from zero-days. They're from controls that were never turned on. When you skip configuration, you're leaving your API keys in plain text, your logs unmonitored, and your access controls wide open. One compromised credential gives attackers a direct pipeline to your training data and model outputs. That's not theoretical. That's happening today.

Think of IAM policies like locks on your doors. Azure OpenAI lets you assign roles and permissions, but only if you actually set them. Without role-based access control, every team member gets keys to everything. You need to grant least privilege. Only the model trainer touches training pipelines. Only the API consumer calls endpoints.

Now logs and monitoring. AWS Bedrock and Google Vertex AI both support audit trails, but they're off by default. Connect them to your SIEM. That's how you catch when someone suddenly queries ten thousand records at two a.m. That's your early warning system.

Finally, encryption and secrets management. Your API keys and model parameters need to live in a vault, not a config file. Use Azure Key Vault or AWS Secrets Manager. Rotate credentials every ninety days minimum. One leaked key should expire fast enough that it becomes worthless.

Pick one platform today. Audit one service. Check your IAM roles. Small action, massive impact. Read the complete guide at protego.me.

The Cloud Does Not Mean Someone Else's Problem

I regularly hear two extremes when teams discuss cloud AI security. Either "it is in the cloud, so security is handled," or they are so overwhelmed by the complexity that they do not know where to begin.

The reality sits in between. Cloud AI services like Azure OpenAI, AWS Bedrock, and Google Vertex AI ship with strong security capabilities. But those capabilities only protect you when you configure them correctly. Misconfiguration is the primary cloud security failure mode, and AI services are no exception.

Here is what actually matters on each platform, along with the most common misconfigurations I see in production environments.

The Shared Responsibility Model for Cloud AI

Before diving into specifics, understand what the cloud provider secures and what falls on you:

Responsibility | Provider | You
Physical data center and hardware | Yes | -
Hypervisor and host OS | Yes | -
Model training and infrastructure | Yes | -
Platform availability and patching | Yes | -
Identity and access management | Tooling provided | Configure it correctly
Network access controls | Tooling provided | Configure it correctly
Data encryption | Default or configure | Enable and manage keys
Audit logging | Available | Enable and monitor
Content filtering | Available | Enable and tune
Application-layer security | - | Yes
Prompt security | - | Yes
What data you send to the model | - | Yes

The provider secures the infrastructure. You secure how you use it.

Azure OpenAI Service

Azure OpenAI is Microsoft's enterprise platform for OpenAI models. It adds enterprise controls (private networking, RBAC, content filtering, managed identity) on top of GPT-4o and the other OpenAI models you already know.

Private Endpoints: Do This First

By default, Azure OpenAI is accessible over the public internet, protected only by your API key or managed identity token. For enterprise workloads with sensitive data, this is not acceptable. Enable private endpoints:

# Terraform: Azure OpenAI with private endpoint and public access disabled
resource "azurerm_cognitive_account" "openai" {
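  name                          = "openai-${var.environment}"
  custom_subdomain_name         = "openai-${var.environment}"  # Required when network_acls is set, and for token-based auth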
  location                      = var.location
  resource_group_name           = var.resource_group_name
  kind                          = "OpenAI"
  sku_name                      = "S0"
  public_network_access_enabled = false    # Disable public internet access

  network_acls {
    default_action = "Deny"
  }

  identity {
    type = "SystemAssigned"
  }
}

resource "azurerm_private_endpoint" "openai" {
  name                = "pe-openai-${var.environment}"
  location            = var.location
  resource_group_name = var.resource_group_name
  subnet_id           = var.private_endpoint_subnet_id

  private_service_connection {
    name                           = "openai-privatelink"
    private_connection_resource_id = azurerm_cognitive_account.openai.id
    subresource_names              = ["account"]
    is_manual_connection           = false
  }
}

With this configuration, traffic to your Azure OpenAI instance never leaves Microsoft's network backbone.
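A quick way to sanity-check the private endpoint is to resolve the service hostname from inside the virtual network and confirm that every returned address is private. The following is a minimal sketch, assuming a private DNS zone is linked to the network; the function names and the hostname in the usage note are illustrative, not part of any Azure SDK.

```python
import ipaddress
import socket


def resolved_addresses(hostname, port=443):
    """Return the set of IP addresses the hostname currently resolves to."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def all_private(addresses):
    """True only if at least one address is given and every one is private."""
    addresses = list(addresses)
    return bool(addresses) and all(
        ipaddress.ip_address(addr).is_private for addr in addresses
    )
```

Run from a VM or pod inside the network, `all_private(resolved_addresses("openai-prod.openai.azure.com"))` (hypothetical hostname) should be true. A public address in the result suggests the DNS zone is not linked or the client is resolving from outside the network.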

Use Managed Identity, Not API Keys

API keys are secrets that can be leaked, stolen, or accidentally committed to version control. Managed Identity is Azure's preferred authentication mechanism: no secrets to manage.

# Application code using Managed Identity (Python)
import os

from openai import AzureOpenAI
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

token_provider = get_bearer_token_provider(
    DefaultAzureCredential(),
    "https://cognitiveservices.azure.com/.default"
)

client = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    azure_ad_token_provider=token_provider,  # No API key needed
    api_version="2024-10-21"
)

Assign the application's managed identity the minimum required role:

# Grant the application's identity permission to call the API only
# Role: Cognitive Services OpenAI User
az role assignment create \
    --assignee "${app_managed_identity_id}" \
    --role "Cognitive Services OpenAI User" \
    --scope "${openai_resource_id}"

Do not use the Cognitive Services OpenAI Contributor or higher roles for applications. Those allow configuration changes, not just inference calls.
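To catch role drift, you can periodically compare an identity's role assignments against the single role it should hold. This sketch assumes the input is the parsed JSON output of `az role assignment list --assignee <id> --all --output json`, whose entries carry `principalId` and `roleDefinitionName` fields. The function name and the allowed-role set are illustrative.

```python
ALLOWED_ROLES = {"Cognitive Services OpenAI User"}


def unexpected_roles(assignments, principal_id, allowed=ALLOWED_ROLES):
    """Return sorted role names held by principal_id that are not in `allowed`."""
    found = set()
    for assignment in assignments:
        if assignment.get("principalId") != principal_id:
            continue
        role = assignment.get("roleDefinitionName", "")
        if role not in allowed:
            found.add(role)
    return sorted(found)
```

An empty result means the identity holds nothing beyond the inference role. Anything else, such as Contributor inherited from a resource group, is worth investigating.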

Enable Content Filtering and Prompt Shields

Azure OpenAI's content filtering is configurable per deployment. Prompt Shields specifically blocks prompt injection and jailbreak attempts:

Content filter capabilities to review and configure:

Standard categories (configure thresholds):
  • Hate speech
  • Sexual content
  • Violence
  • Self-harm
Advanced protections (enable all):
  • Prompt Shields: Blocks prompt injection and jailbreak attempts
  • Groundedness detection: For RAG applications, detects hallucinations
  • Protected material detection: Detects copyrighted content
  • Custom blocklists: Add domain-specific prohibited terms or patterns

Enable Diagnostic Logging

Logging is off by default. Turn it on from day one:

az monitor diagnostic-settings create \
    --name "openai-security-logs" \
    --resource "${openai_resource_id}" \
    --logs '[{"category": "Audit", "enabled": true},
             {"category": "RequestResponse", "enabled": true}]' \
    --metrics '[{"category": "AllMetrics", "enabled": true}]' \
    --workspace "${log_analytics_workspace_id}"

These logs capture API calls, token counts, content filter triggers, and configuration changes: everything you need for incident investigation and compliance.
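Once logs are flowing, the simplest useful detection is a per-caller volume check over a time window. The sketch below assumes the log records have already been exported and normalized into dictionaries with `caller` and `timestamp` keys. Those field names are hypothetical, and real diagnostic log schemas differ by platform.

```python
from collections import Counter
from datetime import datetime


def callers_over_threshold(
    records, window_start: datetime, window_end: datetime, max_calls: int
):
    """Return {caller: call_count} for callers exceeding max_calls in the window.

    The window is half-open: window_start <= timestamp < window_end.
    """
    counts = Counter(
        record["caller"]
        for record in records
        if window_start <= record["timestamp"] < window_end
    )
    return {caller: n for caller, n in counts.items() if n > max_calls}
```

In practice this logic would live in a SIEM query rather than application code. The point is that the rule is small enough to have in place from the first day of logging.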

AWS Bedrock

AWS Bedrock offers models from Anthropic (Claude), Meta, Mistral, Cohere, and Amazon's own Titan family, all through a unified AWS API. The security model integrates cleanly with existing AWS IAM and networking.

VPC Endpoints: Keep Traffic Internal

Like Azure private endpoints, Bedrock VPC interface endpoints keep inference traffic within AWS's network:

# Terraform: Bedrock VPC endpoint
resource "aws_vpc_endpoint" "bedrock_runtime" {
  vpc_id              = var.vpc_id
  service_name        = "com.amazonaws.${var.region}.bedrock-runtime"
  vpc_endpoint_type   = "Interface"
  subnet_ids          = var.private_subnet_ids
  security_group_ids  = [aws_security_group.bedrock_endpoint_sg.id]
  private_dns_enabled = true

  policy = jsonencode({
    Statement = [{
      Effect    = "Allow"
      Principal = { AWS = [var.application_role_arn] }
      Action    = ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"]
      Resource  = "arn:aws:bedrock:${var.region}::foundation-model/*"
    }]
  })
}

resource "aws_security_group" "bedrock_endpoint_sg" {
  name   = "bedrock-endpoint"
  vpc_id = var.vpc_id

  ingress {
    from_port       = 443
    to_port         = 443
    protocol        = "tcp"
    security_groups = [var.application_sg_id]  # Only from your application tier
  }
}

IAM: Lock Down Model Access

Create a dedicated IAM policy that allows only the specific models your application uses:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"],
      "Resource": [
        "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-5-sonnet-20241022-v2:0"
      ],
      "Condition": {
        "StringEquals": {
          "aws:RequestedRegion": "us-east-1"
        }
      }
    }
  ]
}

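IAM denies by default, so a role with only this policy attached cannot reach any other model or any model management operation. Your inference application should not be able to list models, create fine-tuning jobs, or access model evaluation; it should only invoke the specific model it needs. Check that no other policy attached to the same role grants broader Bedrock access.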
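A policy like this is easy to loosen by accident, so it can help to lint policy documents for wildcards before they are deployed. The following is a rough sketch that flags any `*` in the actions or resources of Allow statements. The function names are illustrative, and the check is deliberately stricter than IAM itself.

```python
def as_list(value):
    """IAM allows a single string or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def wildcard_findings(policy):
    """Return a description of each wildcard in the policy's Allow statements."""
    findings = []
    for index, statement in enumerate(as_list(policy.get("Statement", []))):
        if statement.get("Effect") != "Allow":
            continue
        for action in as_list(statement.get("Action", [])):
            if "*" in action:
                findings.append(f"statement {index}: wildcard action {action}")
        for resource in as_list(statement.get("Resource", [])):
            if "*" in resource:
                findings.append(f"statement {index}: wildcard resource {resource}")
    return findings
```

The policy shown above produces no findings. A policy that allows `bedrock:*` on `foundation-model/*` produces two.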

Bedrock Guardrails

Guardrails is AWS's configurable content filtering layer for Bedrock. Unlike model-level safety (which you cannot configure), Guardrails gives you control:

# Apply guardrails on every model invocation
import json

import boto3

bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')

response = bedrock.invoke_model(
    modelId='anthropic.claude-3-5-sonnet-20241022-v2:0',
    guardrailIdentifier=GUARDRAIL_ID,
    guardrailVersion='DRAFT',  # Working draft; pin a numbered version in production
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 2048,
        "messages": [{"role": "user", "content": user_message}]
    })
)

Guardrails capabilities to configure:

  • Denied topics: Define topics your AI should never engage with (e.g., competitor discussions, investment advice)
  • Content filters: Configurable thresholds for harmful content categories
  • PII redaction: Automatically detect and mask personal information before it reaches the model
  • Grounding: For RAG applications, detect when the model goes beyond the provided context

The PII redaction capability is particularly valuable: it can automatically strip common PII types before they are processed or appear in responses.
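Guardrails can also be evaluated on their own, without invoking a model, through the `ApplyGuardrail` operation on the `bedrock-runtime` client. The sketch below wraps that call so user input can be screened before it is sent anywhere. The function name is illustrative, and the client is passed in so the logic can be exercised with a stub.

```python
def guardrail_intervened(client, guardrail_id, guardrail_version, text):
    """Return True if the guardrail blocked or modified the given user input.

    `client` is expected to be a boto3 'bedrock-runtime' client.
    """
    response = client.apply_guardrail(
        guardrailIdentifier=guardrail_id,
        guardrailVersion=guardrail_version,
        source="INPUT",  # Evaluate as user input rather than model output
        content=[{"text": {"text": text}}],
    )
    return response["action"] == "GUARDRAIL_INTERVENED"
```

With a real client this would be called as `guardrail_intervened(boto3.client("bedrock-runtime"), GUARDRAIL_ID, "1", user_message)`. Note that an intervention can mean masking as well as blocking, depending on how the guardrail is configured.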

Google Vertex AI

Google's AI platform covers Gemini models and a growing catalog of open-source models, integrated tightly with Google Cloud's security tooling.

VPC Service Controls

VPC Service Controls is Google Cloud's most powerful data exfiltration prevention mechanism. It creates a security perimeter around cloud services, including Vertex AI; requests from outside the perimeter are blocked regardless of authentication:

# Add Vertex AI to your security perimeter
resource "google_access_context_manager_service_perimeter" "ai_perimeter" {
  parent = "accessPolicies/${var.access_policy_id}"
  name   = "accessPolicies/${var.access_policy_id}/servicePerimeters/ai_perimeter"
  title  = "AI Security Perimeter"

  status {    # Enforced configuration; a spec block is the dry-run configuration
    restricted_services = [
      "aiplatform.googleapis.com",
      "storage.googleapis.com",    # Also protect training data and model artifacts
    ]

    resources = ["projects/${var.project_number}"]

    access_levels = [var.trusted_access_level]
  }
}

With VPC Service Controls, even a stolen service account key cannot be used to exfiltrate data outside the perimeter.

IAM: Principle of Least Privilege

# Vertex AI roles, from least to most privileged:
# roles/aiplatform.viewer     - Can view resources, no predictions
# roles/aiplatform.user       - Can call prediction endpoints
# roles/aiplatform.admin      - Full access - use sparingly

# For GKE workloads: use Workload Identity, not service account keys
resource "google_service_account_iam_binding" "workload_identity" {
  service_account_id = google_service_account.ai_inference.name
  role               = "roles/iam.workloadIdentityUser"

  members = [
    "serviceAccount:${var.project_id}.svc.id.goog[${var.k8s_namespace}/${var.k8s_sa_name}]"
  ]
}

Workload Identity eliminates the need for service account key files entirely, a significant security improvement.
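When migrating to Workload Identity, it is worth sweeping repositories and build contexts for leftover key files. Exported Google service account keys are JSON documents with `"type": "service_account"` and a `private_key` field, which makes them easy to spot. This is a minimal sketch with illustrative function names. It only inspects `.json` files and will miss keys stored in other forms.

```python
import json
from pathlib import Path


def looks_like_sa_key(data):
    """True if the parsed JSON has the shape of a service account key file."""
    return (
        isinstance(data, dict)
        and data.get("type") == "service_account"
        and "private_key" in data
    )


def find_key_files(root):
    """Return sorted paths of .json files under root that look like key files."""
    hits = []
    for path in Path(root).rglob("*.json"):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # Unreadable or not valid JSON; skip it
        if looks_like_sa_key(data):
            hits.append(path)
    return sorted(hits)
```

Any hit is a credential that should be revoked and replaced with a Workload Identity binding.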

Common Misconfigurations Across All Platforms

These mistakes appear consistently regardless of which cloud provider you use:

1. Public Endpoints with No IP Restrictions

The most common finding. The AI service endpoint is publicly accessible, protected only by an API key or token. An exposed key means complete access to the model and anything it can see.

Fix: Enable private endpoints. If private endpoints are not feasible, at minimum restrict public access to your application's known IP ranges.

2. Over-Privileged Service Accounts

Application service accounts are granted Owner or Contributor roles at the resource group or project level, when all they need is permission to invoke specific AI models.

Fix: Create dedicated service accounts with least-privilege roles scoped to specific resources. Review and reduce permissions quarterly.

3. Audit Logging Disabled

Diagnostic logging is frequently off by default. Without logs, you cannot detect abuse, investigate incidents, meet compliance requirements, or understand your actual usage patterns.

Fix: Enable diagnostic logging from day one. Budget the storage cost as a security investment.

4. Content Filtering Disabled in Production

Teams disable content filtering during development to avoid false positives. They forget to re-enable it before production deployment.

Fix: Keep content filtering enabled in production. Tune filters to reduce false positives rather than disabling them.

5. API Keys Instead of Managed Identities

Despite managed identity support on all three platforms, teams default to API keys because they are simpler to configure locally.

Fix: Use managed identity/service accounts in production. Accept the configuration overhead - it eliminates an entire category of credential exposure risk.

Data Residency and Provider Data Handling

A common concern in enterprise AI: "Does our data leave our region, and is it used to train the provider's models?"

Provider | Used for training | Data retention | Regional deployments
Azure OpenAI | No (enterprise tier) | Configurable, default: not retained | Most Azure regions
AWS Bedrock | No | Session only by default | Most AWS regions
Google Vertex AI | No (enterprise tier) | Configurable | Most GCP regions

Verify these commitments in your specific contract and data processing agreement. The defaults are generally favorable for enterprise customers, but contractual language is what matters for compliance purposes.

Where to Start

If you are configuring cloud AI security from scratch, do these in order:

  1. Enable private/VPC endpoints - highest impact, keeps traffic off the public internet
  2. Configure least-privilege IAM - service accounts with only the permissions they need
  3. Enable audit logging and send to your SIEM - you need visibility from the first request
  4. Enable content filtering - use built-in capabilities before building custom solutions
  5. Set budget alerts - cost anomalies are often security indicators
  6. Review configurations quarterly - cloud services add features; some are security-relevant
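The ordering above can be turned into a simple prioritized checklist. This sketch assumes each AI service is described by a dictionary of booleans. The keys are hypothetical and would need to be populated from your own inventory or policy tooling.

```python
# (hypothetical config key, action from the list above), in priority order
CHECKS = [
    ("private_endpoint_enabled", "Enable private/VPC endpoints"),
    ("least_privilege_iam", "Configure least-privilege IAM"),
    ("audit_logs_to_siem", "Enable audit logging and send to your SIEM"),
    ("content_filtering_enabled", "Enable content filtering"),
    ("budget_alerts_set", "Set budget alerts"),
    ("reviewed_this_quarter", "Review configurations quarterly"),
]


def next_actions(service_config):
    """Return outstanding actions in priority order; missing keys count as not done."""
    return [label for key, label in CHECKS if not service_config.get(key, False)]
```

Treating a missing key as a failure keeps the checklist honest: a control nobody has verified is reported as outstanding.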

Cloud AI security is cloud security applied to a new service category. The teams that do this well extend their existing cloud security practices to AI services rather than treating AI as a special case requiring a completely new approach.

Frequently Asked Questions

What is the difference between the security responsibility model for Azure OpenAI versus a self-hosted LLM?

With Azure OpenAI, Microsoft manages the infrastructure, model runtime, network security of the underlying platform, and physical security of data centers. You are responsible for access controls (who can call the API), private endpoint configuration, content filtering settings, audit logging enablement, and how your application handles prompts and responses. With a self-hosted LLM (Ollama, vLLM on your own VMs), you own all of those plus the model supply chain, the inference server software patch lifecycle, GPU server network isolation, and authentication for the inference API. Cloud AI shifts significant operational security burden to the provider; self-hosted AI shifts it entirely back to your team.

How do managed identities replace API keys for Azure OpenAI authentication?

A managed identity is an automatically managed identity in Microsoft Entra ID assigned to an Azure resource such as a Virtual Machine, App Service, or Azure Container Instance. Instead of your application holding an API key for Azure OpenAI, the application requests a short-lived access token from the local metadata service using the managed identity, and passes that token to the Azure OpenAI endpoint. The token expires automatically and is issued only to the specific Azure resource, so there is no credential to store, rotate, or accidentally leak. For Azure Kubernetes Service workloads, workload identity federation provides the equivalent capability using pod-level identity.
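For illustration, this is roughly what the token request looks like on an Azure VM, where the metadata service listens on a link-local address. The sketch uses only the standard library, and the function names are illustrative. App Service and other hosts expose a different local endpoint, and in application code `DefaultAzureCredential` handles all of this for you.

```python
import json
import urllib.parse
import urllib.request

# Azure Instance Metadata Service token endpoint (reachable only from the VM)
IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_token_request(resource="https://cognitiveservices.azure.com/"):
    """Build the metadata-service request for a managed identity access token."""
    query = urllib.parse.urlencode(
        {"api-version": "2018-02-01", "resource": resource}
    )
    # The Metadata header is required; requests without it are rejected.
    return urllib.request.Request(
        f"{IMDS_TOKEN_URL}?{query}", headers={"Metadata": "true"}
    )


def fetch_token(timeout=5):
    """Request a token. Works only on an Azure VM with a managed identity."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        return json.load(resp)["access_token"]
```

Nothing in this exchange is a stored secret: the request carries no credential, and the token it returns is short-lived.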

Does Azure OpenAI or AWS Bedrock use your prompts to train their models?

For enterprise-tier customers with standard service agreements, both Azure OpenAI and AWS Bedrock explicitly commit that your prompts and completions are not used to train the provider's foundation models. Azure OpenAI's data privacy terms state that inputs and outputs are not used to improve Microsoft models without explicit opt-in. AWS Bedrock similarly does not use API call data for model training by default. Verify the specific commitments in your data processing agreement and service contract, as commitments can differ between standard and enterprise tiers, and between regions.

What are the most commonly misconfigured security settings in cloud AI deployments?

Five misconfigurations appear most frequently in cloud AI security assessments: leaving the AI endpoint accessible over the public internet instead of using private endpoints, assigning over-privileged roles (Contributor or Owner) to application service accounts instead of the minimum required role, disabling content filtering during development and forgetting to re-enable it before production, not enabling diagnostic logging (which is off by default on most cloud AI services), and using API keys in production instead of managed identities or service accounts with OIDC federation. All five are straightforward to fix once identified; the challenge is that cloud AI services are often deployed quickly and these defaults are not reviewed.

How should budget alerts be configured for cloud AI services, and why are they a security control?

Set spending alerts at 50%, 75%, and 100% of your expected monthly budget, with immediate notifications to the security team, not just the billing owner. An unexpected cost spike is often the first detectable indicator of a security incident: a stolen API key being used to run large inference workloads, a compromised application endpoint being abused for unauthorized queries, or a prompt injection attack causing an application to make excessive API calls. Cost anomaly detection is a free security signal that requires no additional tooling. Where your platform or API gateway supports them, configure hard spending limits or quotas that pause new API calls once the threshold is exceeded, treating budget as a circuit breaker that stops runaway abuse automatically.
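The threshold logic itself is trivial, which is part of why it makes a good first control. This is a minimal sketch with an illustrative function name. How the alerts are delivered depends on each provider's budgeting service.

```python
def crossed_thresholds(spend, monthly_budget, levels=(0.5, 0.75, 1.0)):
    """Return the alert levels (as fractions of budget) that spend has reached."""
    if monthly_budget <= 0:
        raise ValueError("monthly_budget must be positive")
    ratio = spend / monthly_budget
    return [level for level in levels if ratio >= level]
```

For a budget of 1,000, a spend of 800 has crossed the 50% and 75% levels, so both the billing owner and the security team should already have been notified.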

Microsoft Cloud Solution Architect

Cloud Solution Architect with deep expertise in Microsoft Azure and a strong background in systems and IT infrastructure. Passionate about cloud technologies, security best practices, and helping organizations modernize their infrastructure.