6 changes: 4 additions & 2 deletions README.md
@@ -178,7 +178,7 @@ w = WorkspaceClient(host=input('Databricks Workspace URL: '),
azure_client_secret=input('AAD Client Secret: '))
```

Please see more examples in [this document](./docs/azure-ad.md).
For more Azure authentication examples, see the [Authentication Types Reference](./docs/auth-types-reference.md#azure-service-principal).

### Google Cloud Platform native authentication

@@ -228,13 +228,15 @@ For all authentication methods, you can override the default behavior in client

| Argument | Description | Environment variable |
|-------------------------|-------------|------------------------|
| `auth_type` | _(String)_ When multiple auth attributes are available in the environment, use the auth type specified by this argument. This argument also holds the currently selected auth. | `DATABRICKS_AUTH_TYPE` |
| `auth_type` | _(String)_ When multiple auth attributes are available in the environment, use the auth type specified by this argument. This argument also holds the currently selected auth. When set explicitly, the SDK will **only** attempt that specific authentication method, skipping automatic detection of others. See the **[Authentication Types Reference](./docs/auth-types-reference.md)** for all valid values, required parameters, and usage examples. | `DATABRICKS_AUTH_TYPE` |
| `http_timeout_seconds` | _(Integer)_ Number of seconds for HTTP timeout. Default is _60_. | _(None)_ |
| `retry_timeout_seconds` | _(Integer)_ Number of seconds to keep retrying HTTP requests. Default is _300 (5 minutes)_. | _(None)_ |
| `debug_truncate_bytes` | _(Integer)_ Truncate JSON fields in debug logs above this limit. Default is 96. | `DATABRICKS_DEBUG_TRUNCATE_BYTES` |
| `debug_headers` | _(Boolean)_ `true` to debug HTTP headers of requests made by the application. Default is `false`, as headers contain sensitive data, such as access tokens. | `DATABRICKS_DEBUG_HEADERS` |
| `rate_limit` | _(Integer)_ Maximum number of requests per second made to Databricks REST API. | `DATABRICKS_RATE_LIMIT` |

For a complete reference of all authentication types including required parameters, environment variables, and code examples, see the **[Authentication Types Reference](./docs/auth-types-reference.md)**.

For example, here's how you can update the overall retry timeout:

```python
282 changes: 282 additions & 0 deletions docs/auth-types-reference.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,282 @@
# Authentication Types Reference

This document lists every authentication type (`auth_type`) supported by the Databricks SDK for Python, with its required parameters, environment variables, and a usage example.

## Authentication Types Table

| Auth Type | Description | Required Parameters | Optional Parameters | Environment Variables |
|-----------|-------------|---------------------|---------------------|----------------------|
| `pat` | Personal Access Token authentication - the most common method for programmatic access | `host`, `token` | - | `DATABRICKS_HOST`, `DATABRICKS_TOKEN` |
| `basic` | Basic HTTP authentication using username and password (primarily for AWS) | `host`, `username`, `password` | `account_id` (for account-level operations) | `DATABRICKS_HOST`, `DATABRICKS_USERNAME`, `DATABRICKS_PASSWORD`, `DATABRICKS_ACCOUNT_ID` |
| `oauth-m2m` | OAuth 2.0 Machine-to-Machine (service principal) authentication | `host`, `client_id`, `client_secret` | `scopes`, `authorization_details` | `DATABRICKS_HOST`, `DATABRICKS_CLIENT_ID`, `DATABRICKS_CLIENT_SECRET` |
| `external-browser` | OAuth 2.0 authentication flow using local browser for user login | `host`, `auth_type='external-browser'` | `client_id`, `client_secret` | `DATABRICKS_HOST`, `DATABRICKS_AUTH_TYPE`, `DATABRICKS_CLIENT_ID` |
| `databricks-cli` | Uses tokens from the Databricks CLI (`databricks auth login`) | `host` | `account_id` (for account-level), `databricks_cli_path` | `DATABRICKS_HOST`, `DATABRICKS_ACCOUNT_ID`, `DATABRICKS_CLI_PATH` |
| `azure-client-secret` | Azure Active Directory (AAD) Service Principal authentication | `azure_client_id`, `azure_client_secret`, `azure_tenant_id` | `host`, `azure_workspace_resource_id`, `azure_environment` | `ARM_CLIENT_ID`, `ARM_CLIENT_SECRET`, `ARM_TENANT_ID`, `DATABRICKS_HOST`, `DATABRICKS_AZURE_RESOURCE_ID`, `ARM_ENVIRONMENT` |
| `azure-cli` | Uses credentials from Azure CLI (`az login`) | `host` (or `azure_workspace_resource_id`) | `azure_tenant_id` | `DATABRICKS_HOST`, `DATABRICKS_AZURE_RESOURCE_ID`, `ARM_TENANT_ID` |
| `github-oidc` | GitHub Actions OIDC authentication (workload identity federation) | `host`, `client_id` | `token_audience`, `account_id` | `DATABRICKS_HOST`, `DATABRICKS_CLIENT_ID`, `DATABRICKS_TOKEN_AUDIENCE`, `DATABRICKS_ACCOUNT_ID` |
| `github-oidc-azure` | GitHub Actions OIDC for Azure Databricks workspaces | `host`, `azure_client_id` | `azure_tenant_id` | `DATABRICKS_HOST`, `ARM_CLIENT_ID`, `ARM_TENANT_ID` |
| `azure-devops-oidc` | Azure DevOps Pipelines OIDC authentication | `host`, `client_id` | `token_audience`, `account_id` | `DATABRICKS_HOST`, `DATABRICKS_CLIENT_ID`, `SYSTEM_ACCESSTOKEN` |
| `google-credentials` | Google Cloud service account authentication using credentials JSON | `host`, `google_credentials` | - | `DATABRICKS_HOST`, `GOOGLE_CREDENTIALS` |
| `google-id` | Google Cloud authentication using service account impersonation | `host`, `google_service_account` | - | `DATABRICKS_HOST`, `DATABRICKS_GOOGLE_SERVICE_ACCOUNT` |
| `metadata-service` | Authentication using Databricks-hosted metadata service | `host`, `metadata_service_url` | - | `DATABRICKS_HOST`, `DATABRICKS_METADATA_SERVICE_URL` |
| `runtime` | Auto-detected authentication when running in Databricks Runtime (notebooks, jobs) | _(auto-detected)_ | - | `DATABRICKS_RUNTIME_VERSION` (auto-set) |
| `runtime-oauth` | OAuth authentication for Databricks Runtime with fine-grained permissions | `scopes` | `authorization_details` | `DATABRICKS_RUNTIME_VERSION` (auto-set) |
| `model-serving` | Auto-detected authentication when running in Databricks Model Serving environment | _(auto-detected)_ | - | `IS_IN_DB_MODEL_SERVING_ENV` or `IS_IN_DATABRICKS_MODEL_SERVING_ENV` (auto-set) |
| `env-oidc` | OIDC token from environment variable | `host` | `oidc_token_env`, `client_id` | `DATABRICKS_HOST`, `DATABRICKS_OIDC_TOKEN`, `DATABRICKS_OIDC_TOKEN_ENV`, `DATABRICKS_CLIENT_ID` |
| `file-oidc` | OIDC token from file path | `host`, `oidc_token_filepath` | `client_id` | `DATABRICKS_HOST`, `DATABRICKS_OIDC_TOKEN_FILE`, `DATABRICKS_CLIENT_ID` |
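
As a quick pre-flight check, the environment-variable column above can be turned into a lookup. The sketch below is illustrative only: `REQUIRED_ENV` and `missing_env` are hypothetical names, not part of the SDK, and the mapping is copied by hand from just three rows of the table (only the variables that back required parameters).

```python
import os
from typing import Dict, List, Mapping, Optional

# Hypothetical lookup built by hand from three rows of the table above.
# Only environment variables backing *required* parameters are listed.
REQUIRED_ENV: Dict[str, List[str]] = {
    "pat": ["DATABRICKS_HOST", "DATABRICKS_TOKEN"],
    "oauth-m2m": [
        "DATABRICKS_HOST",
        "DATABRICKS_CLIENT_ID",
        "DATABRICKS_CLIENT_SECRET",
    ],
    "azure-client-secret": ["ARM_CLIENT_ID", "ARM_CLIENT_SECRET", "ARM_TENANT_ID"],
}


def missing_env(auth_type: str, env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variables that are unset or empty for auth_type."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV[auth_type] if not env.get(name)]
```

For example, `missing_env("pat", {"DATABRICKS_HOST": "https://your-workspace.cloud.databricks.com"})` returns `["DATABRICKS_TOKEN"]`.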

## Common Parameters Across All Auth Types

These parameters can be used with any authentication type:

| Parameter | Description | Environment Variable |
|-----------|-------------|---------------------|
| `http_timeout_seconds` | HTTP request timeout (default: 60 seconds) | - |
| `retry_timeout_seconds` | Total retry timeout (default: 300 seconds / 5 minutes) | - |
| `debug_truncate_bytes` | Truncate JSON fields in debug logs above this limit (default: 96) | `DATABRICKS_DEBUG_TRUNCATE_BYTES` |
| `debug_headers` | Enable debug logging of HTTP headers (default: false) | `DATABRICKS_DEBUG_HEADERS` |
| `rate_limit` | Maximum requests per second to Databricks API | `DATABRICKS_RATE_LIMIT` |
| `skip_verify` | Skip SSL certificate verification (not recommended) | - |
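
For parameters that have both an argument and an environment variable, the value can be thought of as resolved in layers. The sketch below models this for `debug_truncate_bytes` only. The precedence shown (explicit argument, then environment variable, then default) is an assumption made for illustration, and `resolve_debug_truncate_bytes` is a hypothetical helper, not an SDK function.

```python
import os
from typing import Mapping, Optional

DEFAULT_DEBUG_TRUNCATE_BYTES = 96  # default stated in the table above


def resolve_debug_truncate_bytes(
    explicit: Optional[int] = None,
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Hypothetical helper: explicit argument, then env var, then default."""
    env = os.environ if env is None else env
    if explicit is not None:
        return explicit  # assumed: a value passed in code wins
    raw = env.get("DATABRICKS_DEBUG_TRUNCATE_BYTES")
    if raw:
        return int(raw)  # environment variables arrive as strings
    return DEFAULT_DEBUG_TRUNCATE_BYTES
```

With an empty environment, `resolve_debug_truncate_bytes(env={})` returns `96`.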

## Usage Examples

When you set `auth_type` explicitly, the SDK attempts **only** that authentication method and skips automatic detection of the others. This is useful when several sets of credentials are configured and you want to guarantee which one is used.

### Personal Access Token (PAT)
```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient(
    host="https://your-workspace.cloud.databricks.com",
    token="dapi1234567890abcdef",
    auth_type="pat"
)
```

### Basic Authentication (Username/Password)
```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient(
    host="https://your-workspace.cloud.databricks.com",
    username="your-username",
    password="your-password",
    auth_type="basic"
)
```

### OAuth Machine-to-Machine (Service Principal)
```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient(
    host="https://your-workspace.cloud.databricks.com",
    client_id="your-client-id",
    client_secret="your-client-secret",
    auth_type="oauth-m2m"
)
```

### External Browser (OAuth for Users)
```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient(
    host="https://your-workspace.cloud.databricks.com",
    auth_type="external-browser"
)
```

### Databricks CLI
```python
from databricks.sdk import WorkspaceClient

# Assumes you've run: databricks auth login --host https://your-workspace.cloud.databricks.com
w = WorkspaceClient(
    host="https://your-workspace.cloud.databricks.com",
    auth_type="databricks-cli"
)
```

### Azure Service Principal
```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient(
    host="https://adb-1234567890.azuredatabricks.net",
    azure_client_id="your-azure-client-id",
    azure_client_secret="your-azure-client-secret",
    azure_tenant_id="your-azure-tenant-id",
    auth_type="azure-client-secret"
)
```

### Azure CLI
```python
from databricks.sdk import WorkspaceClient

# Assumes you've run: az login
w = WorkspaceClient(
    host="https://adb-1234567890.azuredatabricks.net",
    auth_type="azure-cli"
)
```

### GitHub Actions OIDC
```python
from databricks.sdk import WorkspaceClient

# In GitHub Actions with OIDC configured
w = WorkspaceClient(
    host="https://your-workspace.cloud.databricks.com",
    client_id="your-databricks-oauth-client-id",
    auth_type="github-oidc"
)
```

### GitHub Actions OIDC for Azure
```python
from databricks.sdk import WorkspaceClient

# In GitHub Actions with Azure OIDC configured
w = WorkspaceClient(
    host="https://adb-1234567890.azuredatabricks.net",
    azure_client_id="your-azure-client-id",
    auth_type="github-oidc-azure"
)
```

### Azure DevOps OIDC
```python
from databricks.sdk import WorkspaceClient

# In Azure DevOps with OIDC configured
# Note: SYSTEM_ACCESSTOKEN must be exposed as an environment variable
w = WorkspaceClient(
    host="https://your-workspace.cloud.databricks.com",
    client_id="your-databricks-oauth-client-id",
    auth_type="azure-devops-oidc"
)
```

### Google Cloud Credentials
```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient(
    host="https://your-workspace.gcp.databricks.com",
    google_credentials="/path/to/service-account-key.json",
    auth_type="google-credentials"
)
```

### Google Cloud ID (Service Account Impersonation)
```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient(
    host="https://your-workspace.gcp.databricks.com",
    google_service_account="your-service-account@your-project.iam.gserviceaccount.com",
    auth_type="google-id"
)
```

### Metadata Service
```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient(
    host="https://your-workspace.cloud.databricks.com",
    metadata_service_url="http://localhost:8080/metadata",
    auth_type="metadata-service"
)
```

### Runtime (in Databricks Notebooks)
```python
from databricks.sdk import WorkspaceClient

# No credentials needed when running in Databricks Runtime
# The runtime auth type is auto-detected
w = WorkspaceClient(auth_type="runtime")
```

### Runtime OAuth (in Databricks Notebooks with scoped access)
```python
from databricks.sdk import WorkspaceClient

# For fine-grained access control in notebooks
w = WorkspaceClient(
    scopes="clusters sql",
    auth_type="runtime-oauth"
)
```

### Environment Variable OIDC
```python
from databricks.sdk import WorkspaceClient

# OIDC token from DATABRICKS_OIDC_TOKEN environment variable
w = WorkspaceClient(
    host="https://your-workspace.cloud.databricks.com",
    auth_type="env-oidc"
)
```
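
The table lists `oidc_token_env` as an optional parameter for `env-oidc`. One way to picture how it relates to `DATABRICKS_OIDC_TOKEN` is the lookup below. This is a minimal sketch that assumes `oidc_token_env` simply names an alternative variable to read; `read_env_oidc_token` is a hypothetical helper, not an SDK function.

```python
import os
from typing import Mapping, Optional

DEFAULT_OIDC_TOKEN_ENV = "DATABRICKS_OIDC_TOKEN"


def read_env_oidc_token(
    oidc_token_env: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Read the OIDC token from the named variable, or from the default one."""
    env = os.environ if env is None else env
    # Assumption: when oidc_token_env is set, it replaces the default name.
    var_name = oidc_token_env or DEFAULT_OIDC_TOKEN_ENV
    token = env.get(var_name)
    return token or None  # treat an empty value as "not configured"
```

Under that assumption, `read_env_oidc_token("MY_CI_TOKEN", env)` reads `MY_CI_TOKEN` and ignores `DATABRICKS_OIDC_TOKEN`.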

### File-based OIDC
```python
from databricks.sdk import WorkspaceClient

# OIDC token from a file
w = WorkspaceClient(
    host="https://your-workspace.cloud.databricks.com",
    oidc_token_filepath="/path/to/oidc-token",
    auth_type="file-oidc"
)
```

### Model Serving Environment
```python
from databricks.sdk import WorkspaceClient

# Auto-detected when running in Databricks Model Serving
w = WorkspaceClient(auth_type="model-serving")
```

## Authentication Priority Order

When no `auth_type` is explicitly specified, the SDK attempts authentication methods in this order:

1. `pat` - Personal Access Token
2. `basic` - Username/Password
3. `metadata-service` - Metadata Service (if URL provided)
4. `oauth-m2m` - OAuth Service Principal
5. `env-oidc` - Environment OIDC token
6. `file-oidc` - File-based OIDC token
7. `github-oidc` - GitHub OIDC
8. `azure-client-secret` - Azure Service Principal
9. `github-oidc-azure` - GitHub OIDC for Azure
10. `azure-cli` - Azure CLI
11. `azure-devops-oidc` - Azure DevOps OIDC
12. `external-browser` - Browser-based OAuth
13. `databricks-cli` - Databricks CLI
14. `runtime-oauth` - Databricks Runtime OAuth
15. `runtime` - Databricks Runtime native
16. `google-credentials` - Google Cloud credentials
17. `google-id` - Google Cloud ID
18. `model-serving` - Model Serving environment

To bypass this order, set `auth_type` explicitly; the SDK then attempts only that method.
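
The first-match behavior described in this section can be modeled as a loop over an ordered chain. The sketch below is a toy model, not the SDK's implementation: `resolve`, `Strategy`, and the three-entry `chain` are hypothetical, and each strategy is reduced to a function that returns credentials or `None`.

```python
from typing import Callable, List, Optional, Tuple

# Each strategy is (auth_type name, function returning credentials or None).
Strategy = Tuple[str, Callable[[], Optional[str]]]


def resolve(strategies: List[Strategy], auth_type: Optional[str] = None) -> Tuple[str, str]:
    """Return (name, credentials) from the first strategy that is configured."""
    for name, attempt in strategies:
        if auth_type is not None and name != auth_type:
            continue  # explicit auth_type: skip every other method
        creds = attempt()
        if creds is not None:
            return name, creds
    raise ValueError("no configured authentication method found")


# Toy chain: PAT is not configured, the other two are.
chain: List[Strategy] = [
    ("pat", lambda: None),
    ("oauth-m2m", lambda: "m2m-credentials"),
    ("azure-cli", lambda: "azure-cli-credentials"),
]
```

Here `resolve(chain)` returns `("oauth-m2m", "m2m-credentials")` because it is the first configured entry, while `resolve(chain, auth_type="azure-cli")` returns `("azure-cli", "azure-cli-credentials")` and `resolve(chain, auth_type="pat")` raises `ValueError` instead of falling through to another method.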

## Notes

- **Auto-detected auth types** (`runtime`, `runtime-oauth`, `model-serving`): These are detected automatically from environment variables. `runtime` and `model-serving` need no explicit configuration; `runtime-oauth` additionally requires `scopes`, as listed in the table above.
- **Azure authentication**: When using Azure-specific auth types, if `host` is not provided but `azure_workspace_resource_id` is, the SDK will automatically resolve the workspace URL.
- **OIDC authentication**: OIDC-based methods (`github-oidc`, `azure-devops-oidc`, `env-oidc`, `file-oidc`) use token exchange to obtain Databricks tokens from external identity providers.
- **Scopes**: OAuth-based methods support the `scopes` parameter for fine-grained access control (e.g., `scopes="clusters sql"`).
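
To make the token-exchange note concrete, the sketch below builds the form body of a generic OAuth 2.0 token-exchange request as defined by RFC 8693. The two URN constants come from that RFC. Everything else is an assumption for illustration: `token_exchange_form` and `encode_form` are hypothetical helpers, and the exact fields, scope, and endpoint the SDK uses are not shown in this document.

```python
from typing import Dict
from urllib.parse import urlencode

# Identifiers defined by RFC 8693 (OAuth 2.0 Token Exchange).
GRANT_TYPE = "urn:ietf:params:oauth:grant-type:token-exchange"
JWT_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:jwt"


def token_exchange_form(subject_token: str, client_id: str = "", scope: str = "") -> Dict[str, str]:
    """Build the form fields for exchanging an external OIDC token (a JWT)."""
    form = {
        "grant_type": GRANT_TYPE,
        "subject_token": subject_token,  # token issued by the external IdP
        "subject_token_type": JWT_TOKEN_TYPE,
    }
    if client_id:
        form["client_id"] = client_id  # optional, per the table above
    if scope:
        form["scope"] = scope
    return form


def encode_form(form: Dict[str, str]) -> str:
    """Encode the fields as application/x-www-form-urlencoded."""
    return urlencode(form)
```

The encoded string would be sent as the body of a POST to the token endpoint; the response carries the access token used for subsequent API calls.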

## See Also

- [Authentication Overview](./authentication.md) - Default authentication flow and configuration
- [OAuth Documentation](./oauth.md) - OAuth-based authentication details
- [Databricks Authentication Documentation](https://docs.databricks.com/dev-tools/auth.html) - Official Databricks authentication docs
4 changes: 3 additions & 1 deletion docs/authentication.md
@@ -130,13 +130,15 @@ For all authentication methods, you can override the default behavior in client

| Argument | Description | Environment variable |
|-------------------------|-------------|------------------------|
| `auth_type` | _(String)_ When multiple auth attributes are available in the environment, use the auth type specified by this argument. This argument also holds the currently selected auth. | `DATABRICKS_AUTH_TYPE` |
| `auth_type` | _(String)_ When multiple auth attributes are available in the environment, use the auth type specified by this argument. This argument also holds the currently selected auth. When set explicitly, the SDK will **only** attempt that specific authentication method, skipping automatic detection of others. See the **[Authentication Types Reference](./auth-types-reference.md)** for all valid values, required parameters, and usage examples. | `DATABRICKS_AUTH_TYPE` |
| `http_timeout_seconds` | _(Integer)_ Number of seconds for HTTP timeout. Default is _60_. | _(None)_ |
| `retry_timeout_seconds` | _(Integer)_ Number of seconds to keep retrying HTTP requests. Default is _300 (5 minutes)_. | _(None)_ |
| `debug_truncate_bytes` | _(Integer)_ Truncate JSON fields in debug logs above this limit. Default is 96. | `DATABRICKS_DEBUG_TRUNCATE_BYTES` |
| `debug_headers` | _(Boolean)_ `true` to debug HTTP headers of requests made by the application. Default is `false`, as headers contain sensitive data, such as access tokens. | `DATABRICKS_DEBUG_HEADERS` |
| `rate_limit` | _(Integer)_ Maximum number of requests per second made to Databricks REST API. | `DATABRICKS_RATE_LIMIT` |

For a complete reference of all authentication types including required parameters, environment variables, and code examples, see the **[Authentication Types Reference](./auth-types-reference.md)**.

For example, to turn on debug HTTP headers:

```python
1 change: 1 addition & 0 deletions docs/index.rst
@@ -12,6 +12,7 @@ We are keen to hear feedback from you on these SDKs. Please `file GitHub issues

getting-started
authentication
auth-types-reference
oauth
wait
pagination