|
| 1 | +# Amazon Bedrock Knowledge Base Synchronization Flow with Amazon EventBridge Scheduler |
| 2 | + |
| 3 | +This pattern demonstrates an automated synchronization process for Amazon Bedrock Knowledge Bases using Amazon EventBridge Scheduler and AWS Step Functions. The solution enables periodic synchronization of data sources, ensuring your Knowledge Base stays up-to-date with the latest content. |
| 4 | + |
| 5 | +Learn more about this pattern at Serverless Land Patterns: https://serverlessland.com/patterns/eventbridge-scheduled-stepfunction-bedrock-kb-sync |
| 6 | + |
| 7 | + |
| 8 | +Important: this application uses various AWS services and there are costs associated with these services after the Free Tier usage - please see the [AWS Pricing page](https://aws.amazon.com/pricing/) for details. You are responsible for any AWS costs incurred. No warranty is implied in this example. |
| 9 | + |
| 10 | +## Architecture |
| 11 | + |
| 12 | + |
| 13 | +## Requirements |
| 14 | + |
| 15 | +* [Create an AWS account](https://portal.aws.amazon.com/gp/aws/developer/registration/index.html) if you do not already have one and log in. The IAM user that you use must have sufficient permissions to make necessary AWS service calls and manage AWS resources. |
| 16 | +* [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) installed and configured |
| 17 | +* [Git Installed](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git) |
| 18 | +* [AWS Cloud Development Kit](https://docs.aws.amazon.com/cdk/v2/guide/getting_started.html) (AWS CDK) installed |
| 19 | + |
| 20 | +## Deployment Instructions |
| 21 | + |
| 22 | +1. Create a new directory, navigate to that directory in a terminal and clone the GitHub repository: |
| 23 | + ``` |
| 24 | + git clone https://github.com/aws-samples/serverless-patterns |
| 25 | + ``` |
| 26 | +2. Change directory to the pattern directory: |
| 27 | + ``` |
| 28 | + cd serverless-patterns/eventbridge-scheduled-stepfunction-bedrock-kb-sync/cdk |
| 29 | + ``` |
| 30 | +3. Set up the local development environment and dependencies: |
| 31 | + ``` |
| 32 | + make bootstrap-venv |
| 33 | + source .venv/bin/activate |
| 34 | + ``` |
| 35 | +4. From the command line, configure AWS CDK: |
| 36 | + ```bash |
| 37 | + cdk bootstrap |
| 38 | + ``` |
| 39 | +5. From the command line, use AWS CDK to deploy the AWS resources for the pattern: |
| 40 | + ```bash |
| 41 | + cdk deploy --all |
| 42 | + ``` |
| 43 | +6. This command takes some time to run. After it completes successfully, the following stacks are deployed: |
| 44 | +``` |
| 45 | +KbRoleStack |
| 46 | +CommonLambdaLayerStack |
| 47 | +OSSStack |
| 48 | +KbSyncPipelineStack |
| 49 | +KbInfraStack |
| 50 | +``` |
| 51 | + |
| 52 | +## How it works |
| 53 | + |
| 54 | +This section summarizes how the pattern automates Knowledge Base synchronization. |
| 55 | + |
| 56 | +Pattern Overview: This is a scheduled, serverless workflow that automates the synchronization of Amazon Bedrock Knowledge Bases using Amazon EventBridge Scheduler, AWS Step Functions, and Amazon Bedrock. |
| 57 | + |
| 58 | +Key Components: |
| 59 | + a) EventBridge Scheduler |
| 60 | + - Runs every 15 minutes |
| 61 | + - Triggers the Step Functions workflow |
| 62 | + - Passes the Amazon Bedrock Knowledge Base ID as an input parameter |
| 63 | + - Enables consistent and automated synchronization |
| 64 | + |
| 65 | + b) Step Functions Workflow |
| 66 | + - Main Flow: |
| 67 | + - Receives the Knowledge Base ID from EventBridge Scheduler |
| 68 | + - Orchestrates the entire synchronization process |
| 69 | + - Handles error scenarios and retries |
| 70 | + - Manages parallel processing of multiple data sources |
| 71 | + |
| 72 | + Step 1: Data Source Retrieval |
| 73 | + - Queries all associated data sources for the given Knowledge Base ID |
| 74 | + - Prepares the list for processing |
| 75 | + - Validates data source configurations |
| 76 | + |
| 77 | + Step 2: Map State for Parallel Processing |
| 78 | + - Iterates through each data source |
| 79 | + - Processes multiple data sources concurrently |
| 80 | + - Manages state for each sync operation |
| 81 | + |
| 82 | + Step 3: Synchronization Process (For each data source) |
| 83 | + - Initiates the sync operation |
| 84 | + - Monitors sync status |
| 85 | + - Handles completion and failures |
| 86 | + - Reports sync results |
| 87 | + |
| 88 | + Step 4: Status Reporting |
| 89 | + - Aggregates sync results |
| 90 | + - Records success/failure metrics |
| 91 | + - Generates summary reports |
| 92 | + |
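The four steps above can be sketched in plain Python. This is a minimal illustration, not the state machine deployed by this pattern: it assumes `client` is a boto3 `bedrock-agent` client (or any object exposing the same `list_data_sources`, `start_ingestion_job`, and `get_ingestion_job` calls), the function names are hypothetical, and it visits data sources one after another, whereas the Map state processes them concurrently.

```python
import time
from typing import Any, Callable, Dict, List

# Ingestion job statuses after which polling can stop.
TERMINAL_STATUSES = {"COMPLETE", "FAILED", "STOPPED"}


def list_data_source_ids(client: Any, knowledge_base_id: str) -> List[str]:
    """Step 1: collect every data source ID attached to the Knowledge Base."""
    ids: List[str] = []
    kwargs: Dict[str, Any] = {"knowledgeBaseId": knowledge_base_id}
    while True:
        page = client.list_data_sources(**kwargs)
        ids.extend(s["dataSourceId"] for s in page.get("dataSourceSummaries", []))
        token = page.get("nextToken")
        if not token:
            return ids
        kwargs["nextToken"] = token  # follow pagination


def sync_data_source(
    client: Any,
    knowledge_base_id: str,
    data_source_id: str,
    poll_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Step 3: start one ingestion job and poll until it reaches a terminal status."""
    job = client.start_ingestion_job(
        knowledgeBaseId=knowledge_base_id, dataSourceId=data_source_id
    )["ingestionJob"]
    while job["status"] not in TERMINAL_STATUSES:
        sleep(poll_seconds)
        job = client.get_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            ingestionJobId=job["ingestionJobId"],
        )["ingestionJob"]
    return job["status"]


def sync_knowledge_base(
    client: Any,
    knowledge_base_id: str,
    poll_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Steps 2 and 4: visit each data source (sequentially here) and aggregate results."""
    return {
        ds_id: sync_data_source(client, knowledge_base_id, ds_id, poll_seconds, sleep)
        for ds_id in list_data_source_ids(client, knowledge_base_id)
    }
```

To try it against a real account, one would pass `boto3.client("bedrock-agent")` and the Knowledge Base ID; the returned dictionary maps each data source ID to its final ingestion status.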
| 93 | +## Testing |
| 94 | + |
| 95 | +Step 1: Upload Sample Documents to Amazon S3 |
| 96 | + - Navigate to Amazon S3 in the AWS Management Console |
| 97 | + - Locate the bucket named `kb-data-source-{account-id}` |
| 98 | + - Upload your sample documents to this bucket |
| 99 | + |
| 100 | +Step 2: Wait for Scheduler Execution |
| 101 | + - The EventBridge Scheduler schedule is configured to run every 15 minutes |
| 102 | + - You can monitor the schedule in the EventBridge console |
| 103 | + - Note: the next execution occurs at the next 15-minute interval |
| 104 | + |
| 105 | +Step 3: Monitor Step Functions Execution |
| 106 | + - Navigate to the AWS Step Functions console |
| 107 | + - Locate the state machine named `KnowledgeBaseSyncStateMachine` and open its most recent execution |
| 108 | + - Monitor the workflow progress through different states |
| 109 | + - Verify successful completion of all steps |
| 110 | + |
| 111 | +Step 4: Verify Sync Status in Amazon Bedrock |
| 112 | + - Go to the Amazon Bedrock console |
| 113 | + - Navigate to Knowledge Bases |
| 114 | + - Select your Knowledge Base |
| 115 | + - Click on Data Sources |
| 116 | + - Check the Sync History tab |
| 117 | + - Verify the sync status shows as "Completed" |
| 118 | + - Review sync details including: |
| 119 | + - Timestamp of sync |
| 120 | + - Number of documents processed |
| 121 | + - Any errors or warnings |
| 122 | + |
| 123 | + |
| 124 | +Step 5: Validation Points |
| 125 | + - Confirm documents are indexed |
| 126 | + - Check sync completion status |
| 127 | + - Verify no errors in sync history |
| 128 | + - Ensure all uploaded documents are processed |
| 129 | + |
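The console checks in Steps 4 and 5 can also be approximated from a script. The sketch below is an illustration under stated assumptions: `client` is a boto3 `bedrock-agent` client (or a stand-in with the same `list_ingestion_jobs` call), the helper names are hypothetical, and the API reports a finished job as `COMPLETE`.

```python
from typing import Any, Dict, Optional


def latest_sync_summary(
    client: Any, knowledge_base_id: str, data_source_id: str
) -> Optional[Dict[str, Any]]:
    """Return status and document counts for the most recent ingestion job, if any."""
    response = client.list_ingestion_jobs(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
        sortBy={"attribute": "STARTED_AT", "order": "DESCENDING"},
        maxResults=1,
    )
    jobs = response.get("ingestionJobSummaries", [])
    if not jobs:
        return None  # the data source has never been synced
    job = jobs[0]
    stats = job.get("statistics", {})
    return {
        "status": job["status"],
        "started_at": job.get("startedAt"),
        "documents_scanned": stats.get("numberOfDocumentsScanned", 0),
        "documents_failed": stats.get("numberOfDocumentsFailed", 0),
    }


def sync_succeeded(summary: Optional[Dict[str, Any]]) -> bool:
    """Validation point: the latest job completed and no document failed."""
    return (
        summary is not None
        and summary["status"] == "COMPLETE"
        and summary["documents_failed"] == 0
    )
```

A `None` summary means no sync has run yet for that data source, which usually indicates the schedule has not fired since deployment.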
| 130 | +Troubleshooting |
| 131 | +If the sync fails or documents aren't appearing: |
| 132 | + |
| 133 | + - Check S3 bucket permissions |
| 134 | + - Review Step Functions execution logs |
| 135 | + - Verify document format compatibility |
| 136 | + - Check Knowledge Base configuration |
| 137 | + |
| 138 | + |
| 139 | + |
| 140 | +## Delete stack |
| 141 | + |
| 142 | +```bash |
| 143 | +cdk destroy --all |
| 144 | +``` |
| 145 | +---- |
| 146 | +Copyright 2024 Amazon.com, Inc. or its affiliates. All Rights Reserved. |
| 147 | + |
| 148 | +SPDX-License-Identifier: MIT-0 |