Redshift setup
profiles.yml
file is for CLI users onlyIf you're using dbt Cloud, you don't need to create a profiles.yml
file. This file is only for CLI users. To connect your data platform to dbt Cloud, refer to About data platforms.
Overview of dbt-redshift
- Maintained by: dbt Labs
- Authors: core dbt maintainers
- GitHub repo: dbt-labs/dbt-redshift
- PyPI package:
dbt-redshift
- Slack channel: #db-redshift
- Supported dbt Core version: v0.10.0 and newer
- dbt Cloud support: Supported
- Minimum data platform version: n/a
Installing dbt-redshift
pip is the easiest way to install the adapter:
pip install dbt-redshift
Installing dbt-redshift
will also install dbt-core
and any other dependencies.
Configuring dbt-redshift
For Redshift-specific configuration, refer to Redshift Configuration.
For further info, refer to the GitHub repository: dbt-labs/dbt-redshift.
Configurations
Profile field | Example | Description |
---|---|---|
type | redshift | The type of data warehouse you are connecting to |
host | hostname.region.redshift.amazonaws.com | Host of cluster |
port | 5439 | |
dbname | my_db | Database name |
schema | my_schema | Schema name |
connect_timeout | None or 30 | Number of seconds before connection times out |
sslmode | prefer | optional, set the sslmode to connect to the database. Default prefer, which will use 'verify-ca' to connect. For more information on sslmode , see Redshift note below |
role | None | Optional, user identifier of the current session |
autocreate | false | Optional, default false. Creates user if they do not exist |
db_groups | ['ANALYSTS'] | Optional. A list of existing database group names that the DbUser joins for the current session |
ra3_node | true | Optional, default False. Enables cross-database sources |
autocommit | true | Optional, default True. Enables autocommit after each statement |
retries | 1 | Number of retries |
Authentication Parameters
The authentication methods that dbt Core supports are:
database
— Password-based authentication (default, will be used ifmethod
is not provided)IAM
— IAM
Click on one of these authentication methods for further details on how to configure your connection profile. Each tab also includes an example profiles.yml
configuration file for you to review.
- database
- IAM
The following table contains the parameters for the database (password-based) connection method.
Profile field | Example | Description |
---|---|---|
method | database | Leave this parameter unconfigured, or set this to database |
host | hostname.region.redshift.amazonaws.com | Host of cluster |
user | username | Account username to log into your cluster |
password | password1 | Password for authentication |
Example profiles.yml for database authentication
company-name:
target: dev
outputs:
dev:
type: redshift
host: hostname.region.redshift.amazonaws.com
user: username
password: password1
dbname: analytics
schema: analytics
port: 5439
# Optional Redshift configs:
sslmode: prefer
role: None
ra3_node: true
autocommit: true
threads: 4
connect_timeout: None
The following table lists the authentication parameters to use IAM authentication.
To set up a Redshift profile using IAM Authentication, set the method
parameter to iam
as shown below. Note that a password is not required when using IAM Authentication. For more information on this type of authentication,
consult the Redshift Documentation
and boto3
docs
on generating user credentials with IAM Auth.
If you receive the "You must specify a region" error when using IAM
Authentication, then your aws credentials are likely misconfigured. Try running
aws configure
to set up AWS access keys, and pick a default region. If you have any questions,
please refer to the official AWS documentation on Configuration and credential file settings.
Profile field | Example | Description |
---|---|---|
method | IAM | use IAM to authenticate |
iam_profile | analyst | dbt will use the specified profile from your ~/.aws/config file |
cluster_id | CLUSTER_ID | Required for IAM |
user | username | Account user to log into your cluster |
region | us-east-1 | Required for IAM authentication |
Example profiles.yml for IAM
my-redshift-db:
target: dev
outputs:
dev:
type: redshift
method: iam
cluster_id: CLUSTER_ID
host: hostname.region.redshift.amazonaws.com
user: alice
iam_profile: analyst
dbname: analytics
schema: analytics
port: 5439
# Optional Redshift configs:
threads: 4
connect_timeout: None
retries: 1
role: None
sslmode: prefer
ra3_node: true
autocommit: true
region: us-east-1
autocreate: true
db_groups: ['ANALYSTS']
Specifying an IAM Profile
When the iam_profile
configuration is set, dbt will use the specified profile from your ~/.aws/config
file instead of using the profile name default
Redshift notes
sslmode
change
Before to dbt-redshift 1.5, psycopg2
was used as the driver. psycopg2
accepts disable
, prefer
, allow
, require
, verify-ca
, verify-full
as valid inputs of sslmode
, and does not have an ssl
parameter, as indicated in PostgreSQL doc.
In dbt-redshift 1.5, we switched to using redshift_connector
, which accepts verify-ca
, and verify-full
as valid sslmode
inputs, and has a ssl
parameter of True
or False
, according to redshift doc.
For backward compatibility, dbt-redshift now supports valid inputs for sslmode
in psycopg2
. We've added conversion logic mapping each of psycopg2
's accepted sslmode
values to the corresponding ssl
and sslmode
parameters in redshift_connector
.
The table below details accepted sslmode
parameters and how the connection will be made according to each option:
sslmode parameter | Expected behavior in dbt-redshift | Actions behind the scenes |
---|---|---|
disable | Connection will be made without using ssl | Set ssl = False |
allow | Connection will be made using verify-ca | Set ssl = True & sslmode = verify-ca |
prefer | Connection will be made using verify-ca | Set ssl = True & sslmode = verify-ca |
require | Connection will be made using verify-ca | Set ssl = True & sslmode = verify-ca |
verify-ca | Connection will be made using verify-ca | Set ssl = True & sslmode = verify-ca |
verify-full | Connection will be made using verify-full | Set ssl = True & sslmode = verify-full |
When a connection is made using verify-ca
, will look for the CA certificate in ~/redshift-ca-bundle.crt
.
For more details on sslmode changes, our design choices, and reasoning — please refer to the PR pertaining to this change.
autocommit
parameter
The autocommit mode is useful to execute commands that run outside a transaction. Connection objects used in Python must have autocommit = True
to run operations such as CREATE DATABASE
, and VACUUM
. autocommit
is off by default in redshift_connector
, but we've changed this default to True
to ensure certain macros run successfully in your dbt project.
If desired, you can define a separate target with autocommit=True
as such:
profile-to-my-RS-target:
target: dev
outputs:
dev:
type: redshift
...
autocommit: False
profile-to-my-RS-target-with-autocommit-enabled:
target: dev
outputs:
dev:
type: redshift
...
autocommit: True
To run certain macros with autocommit, load the profile with autocommit using the --profile
flag. For more context, please refer to this PR.
Deprecated profile
parameters in 1.5
iam_duration_seconds
keepalives_idle
sort
and dist
keys
Where possible, dbt enables the use of sort
and dist
keys. See the section on Redshift specific configurations.