Amazon Kendra Configuration

CSM Configuration

Required Configuration Properties

CSM Connection Settings

Configuration options for the connection to the target CSM instance.

Name Description

CSM endpoint

URL of the CSM instance to connect to.

CSM Authentication Settings

Configuration options for authenticating against the target CSM instance.

Name Description

Username

Username of the technical user.

Password

Password of the technical user.

Optional Configuration Properties

CSM Connection Settings

Configuration options for fine-tuning the HTTP connection parameters.

Name Description

Concurrent Connections

Maximum number of concurrent open connections.

Requests Rate

Maximum number of requests per second.

Connect Timeout in Milliseconds

Timeout of the connect request.

Socket Timeout in Milliseconds

Timeout of the socket connected to CSM.

Request Timeout in Milliseconds

Timeout of a request to CSM.
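To illustrate what a Requests Rate limit means in practice, the following is a minimal sketch of an interval-based limiter. It is not the connector's implementation; the class name RequestRateLimiter and its parameters are hypothetical and chosen for illustration only.

```python
import time
from typing import Callable


class RequestRateLimiter:
    """Spaces requests so that at most `requests_per_second` start per second.

    Hypothetical helper for illustration; the connector's internal
    throttling may work differently.
    """

    def __init__(self, requests_per_second: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        if requests_per_second <= 0:
            raise ValueError("requests_per_second must be positive")
        self._interval = 1.0 / requests_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = clock()

    def acquire(self) -> float:
        """Wait until the next request may start and return the time waited."""
        now = self._clock()
        wait = max(0.0, self._next_allowed - now)
        if wait > 0:
            self._sleep(wait)
        # The following request may start one interval after this one.
        self._next_allowed = max(now, self._next_allowed) + self._interval
        return wait
```

A caller would invoke acquire() before each request to CSM. The clock and sleep parameters exist only so the behavior can be checked without real waiting.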

CSM Ingestion Settings

Configuration options to specify how principals are ingested in the CSM.

Name Description

Domain

Namespace under which to ingest principals.

Amazon Kendra Configuration

Instance Configuration

Configuration options for the target Kendra index and data source, including authentication and authorization settings.

Name Property Key Description

Index ID

raytion.connector.backend.amazon.kendra.instance.indexId

ID of the target index. It can be retrieved in your AWS management console under Services → Amazon-Kendra → Indexes → <your_index>.

Region ID

raytion.connector.backend.amazon.kendra.instance.regionId

ID of the region where the index is deployed. Available regions: us-east-1 (N. Virginia), us-east-2 (Ohio), us-west-2 (Oregon), eu-west-1 (Ireland), ca-central-1 (Canada), ap-southeast-1 (Singapore) and ap-southeast-2 (Sydney).
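A configuration check for the Region ID could look like the sketch below. The region list is copied from the description above; the function name validate_region_id is hypothetical, and the connector may validate this value differently or not at all.

```python
# Regions listed in this document for the Kendra index.
SUPPORTED_KENDRA_REGIONS = {
    "us-east-1": "N. Virginia",
    "us-east-2": "Ohio",
    "us-west-2": "Oregon",
    "eu-west-1": "Ireland",
    "ca-central-1": "Canada",
    "ap-southeast-1": "Singapore",
    "ap-southeast-2": "Sydney",
}


def validate_region_id(region_id: str) -> str:
    """Return the normalized region ID or raise ValueError if it is not listed."""
    normalized = region_id.strip().lower()
    if normalized not in SUPPORTED_KENDRA_REGIONS:
        allowed = ", ".join(sorted(SUPPORTED_KENDRA_REGIONS))
        raise ValueError(f"Unsupported region '{region_id}'. Expected one of: {allowed}")
    return normalized
```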

Amazon Resource Name

raytion.connector.backend.amazon.kendra.instance.roleArn

ARN of the IAM Service Role assigned to the index. It can be retrieved in your AWS management console under Services → Amazon-Kendra → Indexes → <your_index>. If the option Use S3 is enabled under Advanced Configuration → Content Processing Settings, make sure that the policy attached to the role contains the permission s3:GetObject for all objects inside the target bucket.

Data Source ID

raytion.connector.backend.amazon.kendra.instance.datasourceId

ID of the Custom Data Source Connector added to the target index. All documents and groups processed by the connector will be attached to this data source. It can be retrieved in your AWS management console under Services → Amazon-Kendra → Indexes → <your_index> → Data management → Data sources → <your_data_source>.

Use System Credentials

raytion.connector.backend.amazon.kendra.instance.useSystemCredentials

To authenticate against Amazon Kendra, you must provide your AWS Access Key and AWS Secret Access Key. If Use System Credentials is set to true, these keys are discovered automatically from the following locations:

- Java System Properties aws.accessKeyId and aws.secretAccessKey

- Environment Variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY

- Web Identity Token credentials from System or Environment Variables

- Credentials Profile File at location ~/.aws/credentials

- Credentials delivered through the Amazon EC2 container

- Instance profile credentials delivered through the Amazon EC2 metadata service
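The lookup over these locations can be pictured as a chain in which the first location that yields both keys wins. The sketch below is a simplified illustration, not the connector's code: it assumes the locations are tried in the order listed, covers only the Java system properties, the environment variables and the credentials profile file, and skips the web identity, container and instance profile lookups. All function names are hypothetical.

```python
import configparser
from pathlib import Path
from typing import Callable, List, Mapping, Optional, Tuple

Credentials = Tuple[str, str]  # (access key ID, secret access key)


def from_mapping(values: Mapping[str, str], key_id_name: str,
                 secret_name: str) -> Optional[Credentials]:
    """Return credentials only if both entries are present and non-empty."""
    key_id = values.get(key_id_name)
    secret = values.get(secret_name)
    if key_id and secret:
        return key_id, secret
    return None


def from_profile_file(path: Path, profile: str = "default") -> Optional[Credentials]:
    """Read a profile from an AWS credentials file, if the file exists."""
    parser = configparser.ConfigParser()
    if not parser.read(path) or not parser.has_section(profile):
        return None
    return from_mapping(parser[profile], "aws_access_key_id", "aws_secret_access_key")


def discover_credentials(system_properties: Mapping[str, str],
                         environment: Mapping[str, str],
                         profile_path: Path) -> Optional[Credentials]:
    """Try each location in order and return the first credentials found."""
    providers: List[Callable[[], Optional[Credentials]]] = [
        lambda: from_mapping(system_properties, "aws.accessKeyId", "aws.secretAccessKey"),
        lambda: from_mapping(environment, "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"),
        lambda: from_profile_file(profile_path),
    ]
    for provider in providers:
        credentials = provider()
        if credentials is not None:
            return credentials
    return None
```

In a real process, environment would be os.environ and profile_path would be Path.home() / ".aws" / "credentials".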

Access Key

raytion.connector.backend.amazon.kendra.instance.accessKey

If Use System Credentials is set to false, access keys need to be specified explicitly in the configuration. The specified account requires the Managed Policy AmazonKendraFullAccess.

Secret Access Key

raytion.connector.backend.amazon.kendra.instance.secretAccessKey

Secret Key of the specified AWS account. The value will be stored encrypted by the connector.

Assume Role

raytion.connector.backend.amazon.kendra.instance.assumeRole

Enable this option to fetch the security token from STS using the provided role.

STS Assume Role Region

raytion.connector.backend.amazon.kendra.instance.stsAssumeRole.regionId

Region ID of the regional STS endpoint that is called when requesting the security token.

STS Assume Role Amazon Resource Name

raytion.connector.backend.amazon.kendra.instance.stsAssumeRole.roleArn

ARN of the role to be assumed by the role or account configured in the instance settings.

STS Assume Role Session Name

raytion.connector.backend.amazon.kendra.instance.stsAssumeRole.sessionName

Arbitrary name attached to the session established between the connector and STS. It is used to track the session.

STS Assume Role Session Duration

raytion.connector.backend.amazon.kendra.instance.stsAssumeRole.sessionTimeToLive

Time to live of a single session.
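The STS settings above correspond to the request parameters of the STS AssumeRole API (RoleArn, RoleSessionName, DurationSeconds). The sketch below shows one way such a mapping could look. It is an illustration only: the function name is hypothetical, the default session name is made up, and it assumes that the session time to live is given in seconds, which this document does not state.

```python
from typing import Dict, Mapping, Union

STS_PREFIX = "raytion.connector.backend.amazon.kendra.instance.stsAssumeRole."


def build_assume_role_request(properties: Mapping[str, str]) -> Dict[str, Union[str, int]]:
    """Map the connector's STS property keys to AssumeRole request parameters."""
    role_arn = properties[STS_PREFIX + "roleArn"]
    if not role_arn.startswith("arn:"):
        raise ValueError(f"Not an ARN: {role_arn}")
    # Assumption: the time to live is configured in seconds.
    duration_seconds = int(properties.get(STS_PREFIX + "sessionTimeToLive", "3600"))
    if duration_seconds < 900:
        # STS does not accept sessions shorter than 900 seconds.
        raise ValueError("Session duration must be at least 900 seconds")
    return {
        "RoleArn": role_arn,
        # Hypothetical default, used only when no session name is configured.
        "RoleSessionName": properties.get(STS_PREFIX + "sessionName", "connector-session"),
        "DurationSeconds": duration_seconds,
    }
```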

Use Proxy

raytion.connector.backend.amazon.kendra.instance.useProxy

If enabled, the connection to AWS and the Kendra service is established through an HTTP/HTTPS proxy.

Proxy Endpoint

raytion.connector.backend.amazon.kendra.instance.proxy.endpoint

Target proxy URL including protocol, host and port.

Proxy Authentication

raytion.connector.backend.amazon.kendra.instance.proxy.authenticate

If enabled, the connector uses the specified credentials to authenticate against the proxy.

Proxy Username

raytion.connector.backend.amazon.kendra.instance.proxy.username

Proxy authentication username.

Proxy Password

raytion.connector.backend.amazon.kendra.instance.proxy.password

Proxy authentication password. The value will be stored encrypted by the connector.
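Because the Proxy Endpoint must contain protocol, host and port, a quick check before saving the configuration can catch incomplete values. The following sketch uses only the Python standard library; the function name parse_proxy_endpoint is hypothetical, and the connector may apply different rules.

```python
from typing import Tuple
from urllib.parse import urlsplit


def parse_proxy_endpoint(endpoint: str) -> Tuple[str, str, int]:
    """Split a proxy URL into (protocol, host, port) and reject incomplete values."""
    parts = urlsplit(endpoint)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"Proxy endpoint needs an http or https protocol: {endpoint}")
    if not parts.hostname:
        raise ValueError(f"Proxy endpoint needs a host: {endpoint}")
    if parts.port is None:
        raise ValueError(f"Proxy endpoint needs an explicit port: {endpoint}")
    return parts.scheme, parts.hostname, parts.port
```

For example, http://proxy.example.com:3128 is accepted, while proxy.example.com:3128 (no protocol) and https://proxy.example.com (no port) are rejected.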

Content Processing Configuration (Optional)

Kendra can reject documents with empty or very large content. To fine-tune how these documents are processed, consider setting the properties below.

Name Property Key Description

Empty Content Token

raytion.connector.backend.amazon.kendra.content.emptyContentToken

Kendra rejects items with empty content or with unsupported MIME types (supported are: application/pdf, text/html, application/xhtml+xml, application/msword, application/mspowerpoint and text/plain). To make those items searchable, the connector lets you configure a token that is appended to the content of those items.
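The effect of the Empty Content Token can be sketched as follows. This is an assumption-laden illustration rather than the connector's logic: the function name is hypothetical, content is treated as plain text, and the exact way the connector joins the token with existing content is not documented here.

```python
# MIME types listed in this document as supported by Kendra.
SUPPORTED_MIME_TYPES = frozenset({
    "application/pdf",
    "text/html",
    "application/xhtml+xml",
    "application/msword",
    "application/mspowerpoint",
    "text/plain",
})


def apply_empty_content_token(content: str, mime_type: str, token: str) -> str:
    """Append the token when the item would otherwise be rejected."""
    is_empty = not content.strip()
    is_unsupported = mime_type not in SUPPORTED_MIME_TYPES
    if not token or not (is_empty or is_unsupported):
        return content
    # Assumption: the token is separated from existing content by a space.
    return f"{content} {token}".strip()
```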

Use S3

raytion.connector.backend.amazon.kendra.content.useS3Content

If enabled, the binary content of documents exceeding the content size limit is uploaded to an S3 bucket.

Content Size Limit

raytion.connector.backend.amazon.kendra.content.s3Content.contentSizeLimit

Documents whose content size exceeds this value are uploaded to the configured S3 bucket. All other documents are sent to the Kendra index as inline documents, with their content included in the request. If the S3 option is enabled, it is recommended to set the value below 5 MB, as this is the limit defined by Kendra for inline documents.
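The routing decision described for the Content Size Limit can be sketched as below. The function name choose_target is hypothetical, and the sketch assumes that the limit is expressed in bytes, which this document does not specify.

```python
def choose_target(content: bytes, use_s3: bool, content_size_limit: int) -> str:
    """Return "s3" for oversized content when Use S3 is enabled, else "inline"."""
    if use_s3 and len(content) > content_size_limit:
        return "s3"
    return "inline"


# Hypothetical limit of 4 MB, chosen to stay below Kendra's 5 MB inline limit.
EXAMPLE_LIMIT_BYTES = 4 * 1024 * 1024
```

With Use S3 disabled, every document is sent inline regardless of its size, so very large documents may still be rejected by Kendra.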

Bucket ID

raytion.connector.backend.amazon.kendra.content.s3Content.bucketId

ID of the bucket.

Region ID

raytion.connector.backend.amazon.kendra.content.s3Content.regionId

ID of the region where the bucket is deployed. Available regions: us-east-1 (N. Virginia), us-east-2 (Ohio), us-west-2 (Oregon), eu-west-1 (Ireland), ca-central-1 (Canada), ap-southeast-1 (Singapore) and ap-southeast-2 (Sydney).

Use System Credentials

raytion.connector.backend.amazon.kendra.content.s3Content.useSystemCredentials

To authenticate against Amazon S3, you must provide your AWS Access Key and AWS Secret Access Key. If Use System Credentials is set to true, these keys are discovered automatically from the following locations:

- Java System Properties aws.accessKeyId and aws.secretAccessKey

- Environment Variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY

- Web Identity Token credentials from System or Environment Variables

- Credentials Profile File at location ~/.aws/credentials

- Credentials delivered through the Amazon EC2 container

- Instance profile credentials delivered through the Amazon EC2 metadata service

Access Key

raytion.connector.backend.amazon.kendra.content.s3Content.accessKey

If Use System Credentials is set to false, access keys need to be specified explicitly in the configuration. The specified account requires at least write access to the bucket.

Secret Access Key

raytion.connector.backend.amazon.kendra.content.s3Content.secretAccessKey

Secret Key of the specified AWS account. The value will be stored encrypted by the connector.

Assume Role

raytion.connector.backend.amazon.kendra.content.s3Content.assumeRole

Enable this option to fetch the security token from STS using the provided role.

STS Assume Role Region

raytion.connector.backend.amazon.kendra.content.s3Content.stsAssumeRole.regionId

Region ID of the regional STS endpoint that is called when requesting the security token.

STS Assume Role Amazon Resource Name

raytion.connector.backend.amazon.kendra.content.s3Content.stsAssumeRole.roleArn

ARN of the role to be assumed by the role or account configured in the instance settings.

STS Assume Role Session Name

raytion.connector.backend.amazon.kendra.content.s3Content.stsAssumeRole.sessionName

Arbitrary name attached to the session established between the connector and STS. It is used to track the session.

STS Assume Role Session Duration

raytion.connector.backend.amazon.kendra.content.s3Content.stsAssumeRole.sessionTimeToLive

Time to live of a single session.

Use Proxy

raytion.connector.backend.amazon.kendra.content.s3Content.useProxy

If enabled, the connection to AWS and the S3 service is established through an HTTP/HTTPS proxy.

Proxy Endpoint

raytion.connector.backend.amazon.kendra.content.s3Content.proxy.endpoint

Target proxy URL including protocol, host and port.

Proxy Authentication

raytion.connector.backend.amazon.kendra.content.s3Content.proxy.authenticate

If enabled, the connector uses the specified credentials to authenticate against the proxy.

Proxy Username

raytion.connector.backend.amazon.kendra.content.s3Content.proxy.username

Proxy authentication username.

Proxy Password

raytion.connector.backend.amazon.kendra.content.s3Content.proxy.password

Proxy authentication password. The value will be stored encrypted by the connector.

S3 Content Processing How-To

Use this section if you want the connector to upload large documents to S3 and only reference them in Kendra.

Required permissions: The connector needs write access to the bucket. The Kendra index role needs s3:GetObject to read the uploaded objects.

Example bucket policy (adjust bucket-name and the role ARNs):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowKendraRead",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/your-kendra-index-role"
      },
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::bucket-name/*"
    },
    {
      "Sid": "AllowConnectorWrite",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/your-connector-role"
      },
      "Action": ["s3:PutObject", "s3:AbortMultipartUpload"],
      "Resource": "arn:aws:s3:::bucket-name/*"
    }
  ]
}
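If the policy has to be produced for several buckets or roles, it can be generated instead of edited by hand. The sketch below builds the same two statements from parameters; the function name build_bucket_policy and the example values are hypothetical placeholders.

```python
import json
from typing import Any, Dict


def build_bucket_policy(bucket_name: str, index_role_arn: str,
                        connector_role_arn: str) -> Dict[str, Any]:
    """Build a bucket policy with read access for Kendra and write access for the connector."""
    objects = f"arn:aws:s3:::{bucket_name}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowKendraRead",
                "Effect": "Allow",
                "Principal": {"AWS": index_role_arn},
                "Action": ["s3:GetObject"],
                "Resource": objects,
            },
            {
                "Sid": "AllowConnectorWrite",
                "Effect": "Allow",
                "Principal": {"AWS": connector_role_arn},
                "Action": ["s3:PutObject", "s3:AbortMultipartUpload"],
                "Resource": objects,
            },
        ],
    }


# Placeholder values; replace them with your bucket name and role ARNs.
policy = build_bucket_policy(
    "bucket-name",
    "arn:aws:iam::123456789012:role/your-kendra-index-role",
    "arn:aws:iam::123456789012:role/your-connector-role",
)
policy_json = json.dumps(policy, indent=2)
```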
  1. Create a bucket in the same AWS account or a trusted account.

  2. Ensure the bucket allows object read access for the Kendra index role.

    If you manage permissions via IAM policies, attach an S3 policy to the role.

  3. If you rely on ACLs, verify the bucket ACL configuration.

  4. Configure the connector’s S3 credentials.

    When running on EC2 with system credentials, make sure the instance profile includes S3 permissions.

    If you use STS to assume a role, enable Assume Role and provide the STS settings.

  5. Set Use S3 and configure Content Size Limit, Bucket ID, and Region ID. Documents larger than the limit will be uploaded to S3; smaller ones are sent inline to Kendra.

    Keep Content Size Limit below 5 MB to avoid inline document rejections by Kendra.
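The settings from the last step map to the property keys listed in the Content Processing Configuration table. The sketch below renders them as key=value lines. It rests on several assumptions that this document does not confirm: that the connector reads these keys from a properties-style file, that booleans are written as true/false, and that the size limit is given in bytes. The function name render_s3_properties is hypothetical.

```python
CONTENT_PREFIX = "raytion.connector.backend.amazon.kendra.content."


def render_s3_properties(bucket_id: str, region_id: str, content_size_limit: int) -> str:
    """Render the S3 content settings as properties-style lines."""
    settings = {
        "useS3Content": "true",
        # Assumption: the limit is expressed in bytes.
        "s3Content.contentSizeLimit": str(content_size_limit),
        "s3Content.bucketId": bucket_id,
        "s3Content.regionId": region_id,
    }
    return "\n".join(f"{CONTENT_PREFIX}{key}={value}" for key, value in settings.items())


# Example with a 4 MB limit and placeholder bucket values.
example = render_s3_properties("bucket-name", "eu-west-1", 4 * 1024 * 1024)
```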

Content Batching Configuration (Optional)

Documents are sent to Kendra in batches. This section covers all batch-related properties, including the callback behavior.

Name Property Key Description

Max. Size

raytion.connector.backend.amazon.kendra.batch.batchSize

Maximum batch size. Every batch put request is restricted to this number of items. The maximum allowed value is 10.

Ignore Processing State

raytion.connector.backend.amazon.kendra.batch.async

If enabled, the connector submits all documents asynchronously without polling the processing state from Kendra. Documents that fail during processing are not detected by the connector. Unless you intend to monitor the indexing process solely through Amazon CloudWatch, it is recommended to disable this option.

Flush Timeout

raytion.connector.backend.amazon.kendra.batch.flushTimeout

Interval between periodic flushes of the batch. The batch is guaranteed to be flushed within this period. If the current batch size exceeds the configured maximum batch size, only the maximum number of items is flushed in a single cycle.
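The interplay between the maximum batch size and a flush cycle can be sketched with a small buffer. This is an illustration only: the class name BatchBuffer is hypothetical, and the timer that would call flush() once per Flush Timeout period is left out.

```python
from collections import deque
from typing import Any, Deque, List

MAX_ALLOWED_BATCH_SIZE = 10  # upper bound stated in this document


class BatchBuffer:
    """Collects items and hands out at most `max_batch_size` of them per flush."""

    def __init__(self, max_batch_size: int) -> None:
        if not 1 <= max_batch_size <= MAX_ALLOWED_BATCH_SIZE:
            raise ValueError(f"max_batch_size must be between 1 and {MAX_ALLOWED_BATCH_SIZE}")
        self._max_batch_size = max_batch_size
        self._items: Deque[Any] = deque()

    def add(self, item: Any) -> None:
        self._items.append(item)

    def flush(self) -> List[Any]:
        """Remove and return the oldest items, limited to one batch per cycle."""
        count = min(self._max_batch_size, len(self._items))
        return [self._items.popleft() for _ in range(count)]
```

With a maximum batch size of 10 and 12 buffered items, the first flush cycle returns 10 items and the remaining 2 wait for the next cycle.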

Callback Timeout

raytion.connector.backend.amazon.kendra.batch.callbackTimeout

The Batch API used to index or delete items is asynchronous. The connector polls the state of the submitted requests to track the state of the items. This property defines how long the connector waits for the submitted requests to complete their asynchronous processing in the search engine.
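Polling with a timeout follows a common pattern, sketched below. The function name, the status strings PROCESSING, DONE and FAILED, and the poll interval are hypothetical placeholders; they are not taken from the connector or from the Kendra API.

```python
import time
from typing import Callable


def wait_for_completion(get_status: Callable[[], str],
                        timeout_seconds: float,
                        poll_interval_seconds: float = 1.0,
                        clock: Callable[[], float] = time.monotonic,
                        sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until a final status is reported or the timeout expires."""
    deadline = clock() + timeout_seconds
    while True:
        status = get_status()
        if status in ("DONE", "FAILED"):  # placeholder final states
            return status
        if clock() >= deadline:
            return "TIMED_OUT"
        sleep(poll_interval_seconds)
```

Here get_status stands for whatever call reports the processing state of a submitted batch. The clock and sleep parameters only make the loop testable without real waiting.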

HTTP Connection Configuration (Optional)

Configuration options for fine-tuning the HTTP connection parameters.

Name Property Key Description

Connection Acquire Timeout

raytion.connector.backend.amazon.kendra.http.connection.connectionAcquireTimeout

Timeout value for acquiring an already established connection from the connector’s connection manager.

Connection Timeout

raytion.connector.backend.amazon.kendra.http.connection.connectionTimeout

Timeout value for establishing a connection to AWS.

Connection Idle Timeout

raytion.connector.backend.amazon.kendra.http.connection.maxConnectionIdleTimeout

Time after which an idle connection is closed.

Connection Time to Live

raytion.connector.backend.amazon.kendra.http.connection.maxConnectionTimeToLive

Time after which a connection is closed regardless of its current state.

Max. Number of Connections

raytion.connector.backend.amazon.kendra.http.connection.maxConnections

Max. number of allowed connections maintained by the connection manager.

Max. Number of acquired connections

raytion.connector.backend.amazon.kendra.http.connection.maxConnectionAcquires

Max. number of requests allowed to wait for a connection.
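How a connection limit and an acquire timeout interact can be shown with a semaphore. The sketch below is a simplified model, not the connector's connection manager: the class name ConnectionSlots is hypothetical, it models only Max. Number of Connections and Connection Acquire Timeout, and it assumes the timeout is given in seconds.

```python
import threading


class ConnectionSlots:
    """Limits how many connections may be in use at the same time."""

    def __init__(self, max_connections: int, acquire_timeout_seconds: float) -> None:
        self._slots = threading.BoundedSemaphore(max_connections)
        self._timeout = acquire_timeout_seconds

    def acquire(self) -> bool:
        """Wait for a free slot; return False if none frees up within the timeout."""
        return self._slots.acquire(timeout=self._timeout)

    def release(self) -> None:
        """Return a slot after the request has finished."""
        self._slots.release()
```

A request that cannot obtain a slot within the acquire timeout fails instead of waiting indefinitely, which is the situation a higher connection limit or a longer acquire timeout is meant to relieve.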