DataGuard API v1.0.0

Usage Guide

How to authenticate and call the DataGuard API from Python.

Overview

DataGuard is ADEK's data classification and governance API. It identifies data points in natural-language queries, classifies them against ADEK's data governance framework (Open, Confidential, Sensitive, Secret), and answers policy questions grounded in uploaded governance documents.

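Three main capabilities are exposed via the API: classification of a natural-language query, bulk classification of a batch of data points, and policy search over the uploaded governance documents.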

Prerequisites

Install the required Python packages:

pip install requests msal

Authentication

DataGuard supports two authentication modes depending on the environment.

Staging: API Key

In staging, pass your API key in the X-API-Key header:

import requests

BASE_URL = "https://your-container-app.azurecontainerapps.io"
API_KEY  = "your-api-key"

headers = {
    "Content-Type": "application/json",
    "X-API-Key": API_KEY,
}

Note: For bulk classification routes (under /api/v1/admin/), use the admin API key instead of the standard key.
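
One way to keep the two staging keys apart is to choose the key from the request path. The sketch below is illustrative only: `pick_api_key` and `staging_headers` are hypothetical helper names, and it assumes every route that needs the admin key sits under the `/api/v1/admin/` prefix mentioned in the note above.

```python
ADMIN_PREFIX = "/api/v1/admin/"  # prefix the guide gives for bulk routes


def pick_api_key(path: str, standard_key: str, admin_key: str) -> str:
    """Return the admin key for admin routes, the standard key otherwise."""
    return admin_key if path.startswith(ADMIN_PREFIX) else standard_key


def staging_headers(path: str, standard_key: str, admin_key: str) -> dict:
    """Build staging request headers for the given route (hypothetical helper)."""
    return {
        "Content-Type": "application/json",
        "X-API-Key": pick_api_key(path, standard_key, admin_key),
    }
```

For example, `staging_headers("/api/v1/admin/bulk-classify", API_KEY, ADMIN_API_KEY)` would carry the admin key, where `ADMIN_API_KEY` is a placeholder for the key issued to you.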

Production: Azure AD (Easy Auth)

In production, API keys are disabled. Authenticate with Azure AD using MSAL to obtain an access token, then call the API through the Easy Auth layer.

import msal
import requests

# Azure AD app registration details (get from your administrator)
TENANT_ID     = "your-tenant-id"
CLIENT_ID     = "your-client-app-id"          # YOUR app registration
CLIENT_SECRET = "your-client-secret"
API_APP_ID    = "your-dataguard-api-app-id"   # DataGuard's app registration

BASE_URL = "https://your-container-app.azurecontainerapps.io"
SCOPES   = [f"api://{API_APP_ID}/.default"]

# Confidential client (service-to-service / daemon flow)
app = msal.ConfidentialClientApplication(
    CLIENT_ID,
    authority=f"https://login.microsoftonline.com/{TENANT_ID}",
    client_credential=CLIENT_SECRET,
)

def get_access_token():
    result = app.acquire_token_for_client(scopes=SCOPES)
    if "access_token" in result:
        return result["access_token"]
    raise RuntimeError(f"Token error: {result.get('error_description')}")

token = get_access_token()

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {token}",
}

Important: Bulk classification routes require the DataGuard.BulkOperator or DataGuard.SuperAdmin app role assigned to your service principal in Azure AD.
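
When a bulk route returns 403, it can help to confirm whether the token actually carries one of these roles. The sketch below assumes the access token is a JWT and that Azure AD places assigned app roles in the `roles` claim, as it normally does for app role assignments. `token_roles` and `has_bulk_role` are hypothetical helper names. The claims are decoded without signature verification, so this is a debugging aid only and not an authorization check.

```python
import base64
import json

# Roles the guide names as sufficient for bulk classification routes
BULK_ROLES = {"DataGuard.BulkOperator", "DataGuard.SuperAdmin"}


def token_roles(access_token: str) -> list:
    """Return the `roles` claim of a JWT WITHOUT verifying its signature."""
    payload_b64 = access_token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    return claims.get("roles", [])


def has_bulk_role(access_token: str) -> bool:
    """True if the token carries one of the bulk roles named in this guide."""
    return bool(BULK_ROLES & set(token_roles(access_token)))
```

Calling `has_bulk_role(get_access_token())` before submitting a batch gives an early hint that the role assignment is missing.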

Classify

Classify data points mentioned in a natural-language query.

POST /api/v1/classify
Any authenticated user (API key in staging, Azure AD in production)

Request

Field | Type | Description
query | string | Natural-language description of the data to classify

Example

response = requests.post(
    f"{BASE_URL}/api/v1/classify",
    headers=headers,
    json={"query": "employee salary and national ID number"},
)
response.raise_for_status()
result = response.json()

print(f"Status: {result['pipeline_status']}")
print(f"Composite level: {result['composite_level']}")

for item in result["results"]:
    print(f"  {item['data_point']}: {item['classification_level']}")
    print(f"    Confidence: {item['confidence']} : {item['confidence_reason']}")
    print(f"    Summary: {item['summary']}")

Response Fields

Field | Type | Description
pipeline_status | string | classified, partial, or error
composite_level | string | Highest classification level across all results
classified_count | int | Number of successfully classified data points
vague_terms | list | Terms too vague to classify
results | list | Per-data-point classification results
correlation_id | string | Request trace ID for debugging
duration_ms | int | Processing time in milliseconds
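
To illustrate what "highest classification level" means, the sketch below recomputes it on the client from the results list. It assumes the four levels are ordered from least to most restrictive in the order this guide lists them (Open, Confidential, Sensitive, Secret). The API's own ranking is not spelled out here, so treat the ordering and the helper name `highest_level` as assumptions.

```python
# Assumed order, least to most restrictive, following the order used in this guide
LEVEL_ORDER = ["Open", "Confidential", "Sensitive", "Secret"]


def highest_level(results: list):
    """Return the most restrictive classification_level found, or None."""
    levels = [
        item["classification_level"]
        for item in results
        if item.get("classification_level") in LEVEL_ORDER
    ]
    # Rank each level by its position in LEVEL_ORDER and keep the largest
    return max(levels, key=LEVEL_ORDER.index) if levels else None
```

Items without a recognized level (for example, ones that were not classified) are skipped rather than treated as errors.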

Each Result Item

Field | Type | Description
data_point | string | Identified data point name
classification_level | string | Open, Confidential, Sensitive, or Secret
confidence | string | High, Medium, or Low
confidence_reason | string | Explanation of the confidence score
summary | string | Plain-English classification summary
owner | string | Data owner / responsible party
status | string | classified, ambiguous, expanded, or not_found
interpretations_note | list | Alternate interpretations when ambiguous
expanded_results | list | Sub-data-points when query is too broad

Bulk Classification

Submit a batch of data points for background classification. This is a three-step process: submit, poll, then download.

Important: Requires the DataGuard.BulkOperator or DataGuard.SuperAdmin role in production. Uses the admin API key in staging.

Step 1: Submit the Batch

POST /api/v1/admin/bulk-classify
Returns a job_id for tracking (HTTP 202 Accepted)

Field | Type | Description
items | list | Array of objects, each with data_point (required) and description (optional)

items = [
    {"data_point": "Employee Salary", "description": "Monthly gross salary"},
    {"data_point": "National ID Number"},
    {"data_point": "Office Phone Number"},
]

response = requests.post(
    f"{BASE_URL}/api/v1/admin/bulk-classify",
    headers=headers,
    json={"items": items},
)
response.raise_for_status()
job = response.json()
job_id = job["job_id"]
print(f"Submitted: {job_id} ({job['total_items']} items)")
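
Since data_point is required and description is optional, it can be useful to validate items before submitting. The sketch below builds the request body from (data_point, description) pairs. `build_bulk_payload` is a hypothetical helper, and trimming whitespace from names is a choice made for this example, not documented API behavior.

```python
def build_bulk_payload(rows: list) -> dict:
    """Turn (data_point, description) pairs into a bulk-classify request body.

    description may be None or empty, in which case it is left out of the item.
    """
    items = []
    for data_point, description in rows:
        name = (data_point or "").strip()
        if not name:
            raise ValueError("data_point is required for every item")
        item = {"data_point": name}
        if description:
            item["description"] = description
        items.append(item)
    return {"items": items}
```

The returned dictionary can be passed directly as the `json=` argument of the submit request.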

Step 2: Poll for Completion

GET /api/v1/jobs/{job_id}
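Returns job status: pending, running, completed, or failed

import time

while True:
    resp = requests.get(
        f"{BASE_URL}/api/v1/jobs/{job_id}",
        headers=headers,
    )
    resp.raise_for_status()  # surface 401/403/5xx instead of parsing an error body
    status = resp.json()

    print(f"  Status: {status['status']} : {status.get('message', '')}")

    # Stop on any state that will not change (not_found is listed under Job Status)
    if status["status"] in ("completed", "failed", "not_found"):
        break

    time.sleep(5)  # poll every 5 seconds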
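
A bare polling loop never gives up if a job stalls. The sketch below adds an overall timeout. `wait_for_job` is a hypothetical helper, the 600-second default is an arbitrary assumption, and the status fetch is passed in as a callable so the function can be exercised without a network call.

```python
import time


def wait_for_job(fetch_status, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll until the job reaches a terminal state or the timeout elapses.

    fetch_status: zero-argument callable returning the job-status dict,
    for example a function wrapping GET /api/v1/jobs/{job_id}.
    """
    terminal = {"completed", "failed", "not_found"}
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status["status"] in terminal:
            return status
        # Give up if the next wait would run past the deadline
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"job still {status['status']} after {timeout}s")
        sleep(interval)
```

A possible call is `wait_for_job(lambda: requests.get(f"{BASE_URL}/api/v1/jobs/{job_id}", headers=headers).json())`.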

Step 3: Download Results

GET /api/v1/admin/bulk-classify/{job_id}/results
Returns a time-limited download URL (valid 24 hours)

resp = requests.get(
    f"{BASE_URL}/api/v1/admin/bulk-classify/{job_id}/results",
    headers=headers,
)
resp.raise_for_status()
download = resp.json()

if download.get("download_url"):
    # Fetch the result file from the time-limited URL
    file_resp = requests.get(download["download_url"])
    file_resp.raise_for_status()
    results = file_resp.json()

    print(f"Classified: {results['classified_count']}/{results['total_items']}")
    for r in results["results"]:
        print(f"  {r['data_point']}: {r['classification_level']} ({r['confidence']})")
else:
    print("No download URL returned; check the job status first")

Result Item Fields

Field | Type | Description
data_point | string | The data point name
status | string | classified, ambiguous, not_found, or error
classification_level | string | Open, Confidential, Sensitive, or Secret
confidence | string | High, Medium, or Low
confidence_reason | string | Explanation of confidence score
summary | string | Classification summary
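
Because each bulk result item carries its own status, a batch can be split into items that are done and items that need follow-up. The sketch below groups data point names by status. `group_by_status` is a hypothetical helper, and the "unknown" bucket for items without a status field is an assumption of this example.

```python
from collections import defaultdict


def group_by_status(results: list) -> dict:
    """Group data point names by their status value."""
    groups = defaultdict(list)
    for item in results:
        # Items missing a status are collected under "unknown"
        groups[item.get("status", "unknown")].append(item["data_point"])
    return dict(groups)
```

With the downloaded file loaded as `results`, `group_by_status(results["results"]).get("ambiguous", [])` would list the data points that may need a more specific description.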

Job Status

Check the status of any background job (bulk classification, index rebuilds).

GET /api/v1/jobs/{job_id}
Any authenticated user

Response Fields

Field | Type | Description
job_id | string | Job identifier
status | string | pending, running, completed, failed, or not_found
job_type | string | Type of background job
created_at | string | ISO 8601 UTC timestamp
started_at | string | When processing began
completed_at | string | When processing finished
message | string | Human-readable status message
documents_processed | int | Number of items processed so far
error | string | Error message (empty on success)
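
The timestamp fields make it possible to measure how long a job ran. The sketch below assumes started_at and completed_at use the same ISO 8601 format as created_at, and that a trailing "Z" may be used for UTC. The exact format is not shown in this guide, and `parse_utc` and `job_duration_seconds` are hypothetical helper names.

```python
from datetime import datetime


def parse_utc(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp; a trailing 'Z' is mapped to +00:00."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def job_duration_seconds(job: dict):
    """Seconds between started_at and completed_at, or None if either is missing."""
    if not job.get("started_at") or not job.get("completed_at"):
        return None
    delta = parse_utc(job["completed_at"]) - parse_utc(job["started_at"])
    return delta.total_seconds()
```

A job that is still pending or running has no completed_at value yet, so the function returns None for it.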

Error Handling

All endpoints return standard HTTP error codes with a JSON body containing a detail field.

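# url and payload stand in for any endpoint and request body from this guide
try:
    response = requests.post(url, headers=headers, json=payload)
    response.raise_for_status()
    data = response.json()
except requests.exceptions.HTTPError as e:
    try:
        # Error bodies carry the message in the "detail" field
        error = e.response.json().get("detail", str(e))
    except ValueError:
        # Body was not JSON, so fall back to the raw text
        error = e.response.text or str(e)
    print(f"HTTP {e.response.status_code}: {error}")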

Common Error Codes

Code | Meaning | Action
401 | Not authenticated | Check API key or refresh Azure AD token
403 | Missing required role | Request the needed role from your administrator
422 | Invalid request body | Check the JSON payload matches the expected schema
500 | Server error | Retry, then contact the DataGuard team with the correlation_id
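
The "retry" advice for server errors can be sketched as a small wrapper with exponential backoff. `call_with_retry` is a hypothetical helper, and the retry count and delays are arbitrary assumptions. It retries only on 5xx responses, since 401, 403 and 422 will not succeed without a change on the client side.

```python
import time


def call_with_retry(send, retries=3, backoff=1.0, sleep=time.sleep):
    """Call send() and retry on HTTP 5xx with exponential backoff.

    send: zero-argument callable returning an object with a status_code
    attribute, such as a requests.Response.
    Returns the last response, which may still be a 5xx after all retries.
    """
    for attempt in range(retries + 1):
        response = send()
        if response.status_code < 500 or attempt == retries:
            return response
        # Wait backoff, 2*backoff, 4*backoff, ... between attempts
        sleep(backoff * 2 ** attempt)
```

A possible call is `call_with_retry(lambda: requests.post(url, headers=headers, json=payload))`, followed by `raise_for_status()` on the result. Retrying the bulk submit request may create a second job, so the wrapper probably fits read-only calls best.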

Roles & Permissions

Azure AD app roles control access to different endpoint tiers in production.

Role | Access
DataGuard.SuperAdmin | Full access to all endpoints (classify, policy, bulk, admin UI)
DataGuard.BulkOperator | Bulk classification routes + all standard routes
DataGuard.UIController | Admin UI and admin API access (review queue, indexes, analytics)
No role (authenticated) | Standard routes only (classify, policy-search, job status)

Note: Roles are configured in Azure AD App Registration → App Roles and assigned to users or service principals in Enterprise Applications.

Quick Reference

Endpoint | Method | Path | Auth
Classify | POST | /api/v1/classify | Any authenticated
Policy Search | POST | /api/v1/policy-search | Any authenticated
Bulk Classify | POST | /api/v1/admin/bulk-classify | BulkOperator / SuperAdmin
Job Status | GET | /api/v1/jobs/{job_id} | Any authenticated
Bulk Results | GET | /api/v1/admin/bulk-classify/{job_id}/results | BulkOperator / SuperAdmin

Environment Modes

Environment | Auth Method | Notes
Development | None | Local development only, no auth checks
Staging | API key (X-API-Key) | Use admin key for bulk routes
Production | Azure AD (Easy Auth) | API keys disabled; use MSAL Bearer token
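
Client code that runs against more than one environment can choose its headers from the table above. This is a minimal sketch: `auth_headers` is a hypothetical helper, and the lowercase environment names are an assumption of this example rather than values the API defines.

```python
from typing import Optional


def auth_headers(environment: str,
                 api_key: Optional[str] = None,
                 token: Optional[str] = None) -> dict:
    """Build request headers for the given environment mode."""
    headers = {"Content-Type": "application/json"}
    env = environment.lower()
    if env == "development":
        return headers  # local development performs no auth checks
    if env == "staging":
        if not api_key:
            raise ValueError("staging requires an API key")
        headers["X-API-Key"] = api_key
    elif env == "production":
        if not token:
            raise ValueError("production requires an Azure AD access token")
        headers["Authorization"] = f"Bearer {token}"
    else:
        raise ValueError(f"unknown environment: {environment}")
    return headers
```

In production this could be called as `auth_headers("production", token=get_access_token())`.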