In today's data-intensive business landscape, organizations face the challenge of extracting valuable insights from diverse data sources scattered across their infrastructure. Whether it's structured data in databases or unstructured content in document repositories, enterprises often struggle to efficiently query and use this wealth of information.
In this post, we explore how you can use Amazon Q Business, the AWS generative AI-powered assistant, to build a centralized knowledge base for your organization, unifying structured and unstructured datasets from different sources to accelerate decision-making and drive productivity. The solution combines data from an Amazon Aurora MySQL-Compatible Edition database and data stored in an Amazon Simple Storage Service (Amazon S3) bucket.
Solution overview
Amazon Q Business is a fully managed, generative AI-powered assistant that helps enterprises unlock the value of their data and knowledge. The key to using the full potential of Amazon Q lies in its ability to seamlessly integrate and query multiple data sources, from structured databases to unstructured content stores. In this solution, we use Amazon Q to build a comprehensive knowledge base that combines sales-related data from an Aurora MySQL database and sales documents stored in an S3 bucket. Aurora MySQL-Compatible is a fully managed, MySQL-compatible, relational database engine that combines the speed and reliability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases. Amazon S3 is an object storage service offering industry-leading scalability, data availability, security, and performance.
This custom knowledge base, which connects these diverse data sources, enables Amazon Q to seamlessly respond to a wide range of sales-related questions using the chat interface. The following diagram illustrates the solution architecture.
Prerequisites
For this walkthrough, you should have the following prerequisites:
Set up your VPC
Setting up a VPC provides a secure, isolated network environment for hosting the data sources that Amazon Q Business will access to index. In this post, we use an Aurora MySQL database in a private subnet, and Amazon Q Business accesses the private DB instance securely using an interface VPC endpoint.
Complete the following steps:
- Choose an AWS Region that Amazon Q supports (for this post, we use the us-east-1 Region).
- Create a VPC or use an existing VPC with at least two subnets. These subnets must be in two different Availability Zones in the Region where you want to deploy your DB instance.
- Refer to Steps 1 and 2 in Configuring Amazon VPC support for Amazon Q Business connectors to configure your VPC so that you have a private subnet to host an Aurora MySQL database, along with a security group for your database.
- Additionally, create a public subnet that will host an EC2 bastion host, which we create in the subsequent steps.
- Create an interface VPC endpoint for Aurora powered by AWS PrivateLink in the VPC you created. For instructions, refer to Access an AWS service using an interface VPC endpoint.
- Specify the private subnet where the Aurora MySQL database resides, along with the database security group you created.
Each interface endpoint is represented by one or more elastic network interfaces in your subnets, which are then used by Amazon Q Business to connect to the private database.
Set up an Aurora MySQL database
Complete the following steps to create an Aurora MySQL database to host the structured sales data:
- On the Amazon RDS console, choose Databases in the navigation pane.
- Choose Create database.
- Select Aurora, then Aurora (MySQL compatible).
- For Templates, choose Production or Dev/test.
- Under Settings, enter a name for your database cluster identifier. For example, q-aurora-mysql-source.
- For Credentials settings, choose Self managed, give the admin user a password, and keep the rest of the parameters as default.
- Under Connectivity, for Virtual private cloud (VPC), choose the VPC that you created.
- For DB subnet group, create a new subnet group or choose an existing one. Keep the rest of the parameters as default.
- For Publicly accessible, choose NO.
- Under VPC security group (firewall), choose Existing and choose the existing security group that you created for the Aurora MySQL DB instance.
- Leave the remaining parameters as default and create the database.
Create an EC2 bastion host to connect to the private Aurora MySQL DB instance
In this post, you connect to the private DB instance from the MySQL Workbench client on your local machine through an EC2 bastion host. Launch the EC2 instance in the public subnet of the VPC you configured. The security group attached to this EC2 bastion host instance should be configured to allow SSH traffic (port 22) from your local machine's IP address. To facilitate the connection between the EC2 bastion host and the Aurora MySQL database, the security group for the Aurora MySQL database should have an inbound rule to allow MySQL traffic (port 3306) from the security group of the EC2 bastion host. Conversely, the security group for the EC2 bastion host should have an outbound rule to allow traffic to the security group of the Aurora MySQL database on port 3306. Refer to Controlling access with security groups for more details.
Configure IAM Identity Center
An Amazon Q Business application requires you to use IAM Identity Center to manage user access. IAM Identity Center is a single place where you can assign your workforce users, also known as workforce identities, to provide consistent access to multiple AWS accounts and applications. In this post, we use IAM Identity Center as the SAML 2.0-aligned identity provider (IdP). Make sure you have enabled an IAM Identity Center instance, provisioned at least one user, and provided each user with a valid email address. The Amazon Q Business application needs to be in the same Region as the IAM Identity Center instance. For more information on enabling users in IAM Identity Center, see Add users to your Identity Center directory.
Create an S3 bucket
Create an S3 bucket in the us-east-1 Region with the default settings and create a folder with a name of your choice inside the bucket.
Create and load sample data
In this post, we use two sample datasets: a total sales dataset CSV file and a sales target document in PDF format. The total sales dataset contains information about orders placed by customers located in various geographical regions, through different sales channels. The sales target document contains information about sales targets for the year for each of the sales channels. Complete the steps in the following sections to load both datasets.
Aurora MySQL database
In the Amazon Q Business application, you create two data sources for the same Aurora MySQL table: one on the total sales dataset and another on an aggregated view of the total sales data, to cater to the different types of queries. Complete the following steps:
- Securely connect to your private Aurora MySQL database using an SSH tunnel through the EC2 bastion host.
This enables you to manage and interact with your database resources directly from your local MySQL Workbench client.
- Create the database and tables using the following commands on the local MySQL Workbench client:
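The create statements themselves are not included in this version of the post. The following is a hedged sketch of what they might look like; the database name (sales), the table name (total_sales), and every column other than order_number and sales_channel are assumptions inferred from the column mappings used later in the walkthrough, so substitute your own names as needed:

```sql
-- Hypothetical sketch: names and columns are assumptions, not the post's actual DDL.
CREATE DATABASE IF NOT EXISTS sales;
USE sales;

CREATE TABLE IF NOT EXISTS total_sales (
  order_number  INT           NOT NULL,  -- used later as the primary key column
  sales_channel VARCHAR(50),             -- e.g., In-Store, Online, Distributor, Wholesale
  order_date    DATE,                    -- used later for per-year aggregation
  region        VARCHAR(50),
  revenue       DECIMAL(12,2),
  PRIMARY KEY (order_number)
);
```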
- Download the sample CSV file to your local environment.
- Use the following code to insert the sample data from your MySQL client:
If you encounter the error LOAD DATA LOCAL INFILE file request rejected due to restrictions on access when running the statements in MySQL Workbench 8.0, you might need to edit the connection: on the Connection tab, go to the Advanced sub-tab, and in the Others field, add the line OPT_LOCAL_INFILE=1, then start a new query tab after testing the connection.
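The load statement referenced above is not reproduced here. As a hedged sketch of a typical MySQL bulk load, assuming a table named total_sales and a placeholder file path (both hypothetical):

```sql
-- Hypothetical sketch: the file path and table name are placeholders.
LOAD DATA LOCAL INFILE '/path/to/total_sales.csv'
INTO TABLE total_sales
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;  -- skip the CSV header row
```

The OPT_LOCAL_INFILE=1 Workbench setting mentioned above is what allows the LOCAL variant of this statement to run from the client.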
- Verify the data load by running a select statement:
This should return 7,991 rows.
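The select statement itself is not shown; a minimal version, assuming the hypothetical table name total_sales, would be:

```sql
SELECT COUNT(*) FROM total_sales;  -- the post states this should return 7,991
```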
The following screenshot shows the database table schema and the sample data in the table.
Amazon S3 bucket
Download the sample file 2020_Sales_Target.pdf to your local environment and upload it to the S3 bucket you created. This sales target document contains information about the sales targets for four sales channels and looks like the following screenshot.
Create an Amazon Q application
Complete the following steps to create an Amazon Q application:
- On the Amazon Q console, choose Applications in the navigation pane.
- Choose Create application.
- Provide the following details:
  - In the Application details section, for Application name, enter a name for the application (for example, sales_analyzer).
  - In the Service access section, for Choose a method to authorize Amazon Q, select Create and use a new service role.
- Leave all other options as default and choose Create.
- On the Select retriever page, you configure the retriever. The retriever is an index that will be used by Amazon Q to fetch data in real time.
  - For Retrievers, select Use native retriever.
  - For Index provisioning, select Starter.
  - For Number of units, use the default value of 1. Each unit can support up to 20,000 documents. For a database, each database row is considered a document.
- Choose Next.
Configure Amazon Q to connect to Aurora MySQL-Compatible
Complete the following steps to configure Amazon Q to connect to Aurora MySQL-Compatible:
- On the Connect data sources page, under Data sources, choose the Aurora (MySQL) data source.
- Choose Next.
- In the Name and description section, configure the following parameters:
  - For Data source name, enter a name (for example, aurora_mysql_sales).
  - For Description, enter a description.
- In the Source section, configure the following parameters:
  - For Host, enter the database endpoint (for example, <databasename>.<ID>.<region>.rds.amazonaws.com).
You can obtain the endpoint on the Amazon RDS console for the instance on the Connectivity & security tab.
  - For Port, enter the Amazon RDS port for MySQL: 3306.
  - For Instance, enter the database name (for example, sales).
  - Select Enable SSL Certificate location.
- For Authentication, choose Create a new secret with a name of your choice.
- Provide the user name and password for your MySQL database to create the secret.
- In the Configure VPC and security group section, choose the VPC and subnets where your Aurora MySQL database is located, and choose the default VPC security group.
- For IAM role, choose Create a new service role.
- For Sync scope, under SQL query, enter the following query:
This select statement returns a primary key column, a document title column, and a text column that serves as your document body for Amazon Q to answer questions. Make sure you don't put ; at the end of the query.
- For Primary key column, enter order_number.
- For Title column, enter sales_channel.
- For Body column, enter sales_details.
- Under Sync run schedule, for Frequency, choose Run on demand.
- Keep all other parameters as default and choose Add data source.
This process may take a few minutes to complete. After the aurora_mysql_sales data source is added, you will be redirected to the Connect data sources page.
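The sync-scope SQL query for this data source is not reproduced in this version of the post. A hedged sketch of its likely shape: it must return the three columns mapped in the steps above (order_number as the primary key, sales_channel as the title, and a sales_details text body, typically built by concatenating row attributes); the table name total_sales and all other column names here are assumptions. Note the absence of a trailing semicolon:

```sql
-- Hypothetical sketch: only the three output column names come from the post.
SELECT
  order_number,
  sales_channel,
  CONCAT('Order ', order_number,
         ' was placed via the ', sales_channel,
         ' channel in the ', region,
         ' region on ', order_date,
         ' for a revenue of ', revenue) AS sales_details
FROM total_sales
```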
- Repeat the steps to add another Aurora MySQL data source, called aggregated_sales, for the same database, but with the following details in the Sync scope section. This data source will be used by Amazon Q for answering questions on aggregated sales.
  - Use the following SQL query:
  - For Primary key column, enter scoy_id.
  - For Title column, enter sales_channel.
  - For Body column, enter sales_aggregates.
After adding the aggregated_sales data source, you will be redirected to the Connect data sources page again.
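The aggregation query is likewise not reproduced here. A hedged sketch, assuming a table named total_sales with order_date and revenue columns (hypothetical names) and reading scoy_id as a sales-channel-plus-order-year identifier, consistent with the column mapping above:

```sql
-- Hypothetical sketch: only scoy_id, sales_channel, and sales_aggregates
-- come from the post; the rest is assumed.
SELECT
  CONCAT(sales_channel, '_', YEAR(order_date)) AS scoy_id,
  sales_channel,
  CONCAT('Total sales for the ', sales_channel,
         ' channel in ', YEAR(order_date),
         ' were ', SUM(revenue)) AS sales_aggregates
FROM total_sales
GROUP BY sales_channel, YEAR(order_date)
```

Grouping by channel and year gives Amazon Q one compact document per channel-year combination, which suits aggregate questions better than the row-level source.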
Configure Amazon Q to connect to Amazon S3
Complete the following steps to configure Amazon Q to connect to Amazon S3:
- On the Connect data sources page, under Data sources, choose Amazon S3.
- Under Name and description, enter a data source name (for example, s3_sales_targets) and a description.
- Under Configure VPC and security group settings, choose No VPC.
- For IAM role, choose Create a new service role.
- Under Sync scope, for the data source location, enter the S3 bucket name containing the sales target PDF document.
- Leave all other parameters as default.
- Under Sync run schedule, for Frequency, choose Run on demand.
- Choose Add data source.
- On the Connect data sources page, choose Next.
- In the Update groups and users section, choose Add users and groups.
- Choose the user as entered in IAM Identity Center and choose Assign.
- After you add the user, you can choose the Amazon Q Business subscription to assign to the user. For this post, we choose Q Business Lite.
- Under Web experience service access, select Create and use a new service role and enter a service role name.
- Choose Create application.
After a few minutes, the application will be created and you will be taken to the Applications page on the Amazon Q Business console.
Sync the data sources
Choose the name of your application and navigate to the Data sources section. For each of the three data sources, select the data source and choose Sync now. The sync will take several minutes to complete. After the sources have synced, you should see the Last sync status show as Completed.
Customize and interact with the Amazon Q application
At this point, you have created an Amazon Q application, synced the data sources, and deployed the web experience. You can customize your web experience to make it more intuitive for your application users.
- On the application details page, choose Customize web experience.
- For this post, we have customized the Title, Subtitle, and Welcome message fields for our assistant.
- After you have completed your customizations for the web experience, return to the application details page and choose the web experience URL.
- Sign in with the IAM Identity Center user name and password you created earlier to start the conversation with the assistant.
You can now test the application by asking different questions, as shown in the following screenshot. You can observe in the following question that the channel names were fetched from the Amazon S3 sales target PDF.
The following screenshots show more example interactions.
The answer in the preceding example was derived from the two sources: the S3 bucket and the Aurora database. You can verify the output by cross-referencing the PDF, which has a target of $12 million for the in-store sales channel in 2020. The following SQL shows the actual sales achieved in 2020 for the same channel:
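The verification query is not reproduced in this version of the post. A hedged sketch, assuming a table named total_sales with sales_channel, order_date, and revenue columns (all hypothetical names; substitute whatever your actual schema uses):

```sql
-- Hypothetical sketch: table and column names are assumptions.
SELECT SUM(revenue) AS actual_2020_sales
FROM total_sales
WHERE sales_channel = 'In-Store'
  AND YEAR(order_date) = 2020;
```

Swapping the channel value (for example, 'Distributor') gives the corresponding check for the other channels discussed below.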
As seen from the sales target PDF data, the 2020 sales target for the distributor sales channel was $7 million.
The following SQL in the Aurora MySQL database shows the actual sales achieved in 2020 for the same channel:
The following screenshots show additional questions.
You can verify the preceding answers with the following SQL:
Clean up
To avoid incurring future charges, clean up any resources you created as part of this solution, including the Amazon Q Business application:
- On the Amazon Q Business console, choose Applications in the navigation pane, select the application you created, and on the Actions menu, choose Delete.
- Delete the AWS Identity and Access Management (IAM) roles created for the application and data retriever. You can identify the IAM roles used by the Amazon Q Business application and data retriever by inspecting the associated configuration using the AWS console or the AWS Command Line Interface (AWS CLI).
- Delete the IAM Identity Center instance you created for this walkthrough.
- Empty the bucket you created and then delete the bucket.
- Delete the Aurora MySQL instance and Aurora cluster.
- Shut down the EC2 bastion host instance.
- Delete the VPC and related components, including the NAT gateway and the interface VPC endpoint.
Conclusion
In this post, we demonstrated how organizations can use Amazon Q to build a unified knowledge base that integrates structured data from an Aurora MySQL database and unstructured data from an S3 bucket. By connecting these disparate data sources, Amazon Q enables you to seamlessly query information from both and gain valuable insights that drive better decision-making.
We encourage you to try this solution and share your experience in the comments. Additionally, you can explore the many other data sources that Amazon Q Business can seamlessly integrate with, empowering you to build robust and insightful applications.
About the Authors
Monjumi Sarma is a Technical Account Manager at Amazon Web Services. She helps customers architect modern, scalable, and cost-effective solutions on AWS, which gives them an accelerated path towards their modernization initiatives. She has experience across analytics, big data, ETL, cloud operations, and cloud infrastructure management.
Akchhaya Sharma is a Sr. Data Engineer at Amazon Ads. He builds and manages data-driven solutions for recommendation systems, working together with a diverse and talented team of scientists, engineers, and product managers. He has experience across analytics, big data, and ETL.