
With Amazon Redshift, you can analyze your data to derive holistic insights about your business and your customers. In many organizations, one or multiple Amazon Redshift data warehouses run daily for data and analytics purposes. Therefore, over time, multiple Data Definition Language (DDL) or Data Control Language (DCL) queries, such as CREATE, ALTER, DROP, GRANT, or REVOKE SQL queries, are run on the Amazon Redshift data warehouse. These queries are sensitive in nature because they could lead to dropping tables or deleting data, causing disruptions or outages. Tracking such user queries as part of the centralized governance of the data warehouse helps stakeholders understand potential risks and take prompt action to mitigate them, following the operational excellence pillar of the AWS Data Analytics Lens. For a robust governance mechanism, it's therefore crucial to alert the database and security administrators about the sensitive queries that are run on the data warehouse, so that prompt remediation actions can be taken if needed.

To address this, in this post we show you how you can automate near-real-time notifications over a Slack channel when certain queries are run on the data warehouse. We also create a simple governance dashboard using a combination of Amazon DynamoDB, Amazon Athena, and Amazon QuickSight.

Solution overview

An Amazon Redshift data warehouse logs information about connections and user activities taking place in databases, which helps monitor the database for security and troubleshooting purposes. These logs can be stored in Amazon Simple Storage Service (Amazon S3) buckets or Amazon CloudWatch. Amazon Redshift logs information in the following log files, and this solution is based on using an Amazon Redshift audit log to CloudWatch as a destination:

- Connection log – Logs authentication attempts, connections, and disconnections.
- User log – Logs information about changes to database user definitions.
- User activity log – Logs each query before it's run on the database.

The following diagram illustrates the solution architecture.

The solution workflow consists of the following steps:

1. Audit logging is enabled in each Amazon Redshift data warehouse to capture the user activity log in CloudWatch.
2. Subscription filters on CloudWatch capture the required DDL and DCL commands by providing filter criteria.
3. The subscription filter triggers an AWS Lambda function for pattern matching.
4. The Lambda function processes the event data and sends the notification over a Slack channel using a webhook.
5. The Lambda function stores the data in a DynamoDB table over which a simple dashboard is built using Athena and QuickSight.

Before starting the implementation, make sure the requirements for the solution are met.
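The Lambda steps of the workflow described in this post (pattern matching and the Slack notification) could look roughly like the following. This is a minimal sketch, not the code from the original solution. It relies on the fact that CloudWatch Logs delivers subscription data to Lambda as a base64-encoded, gzip-compressed JSON document under `awslogs.data`. The keyword pattern, the helper names, and the `SLACK_WEBHOOK_URL` environment variable are assumptions made for illustration, and the DynamoDB write is left out.

```python
import base64
import gzip
import json
import os
import re
import urllib.request

# Assumption: match the DDL/DCL keywords named in the post. A keyword search
# is a coarse filter and can also match keywords inside literals or comments.
SENSITIVE = re.compile(r"\b(CREATE|ALTER|DROP|GRANT|REVOKE)\b", re.IGNORECASE)


def decode_subscription_event(event):
    """Decode the payload CloudWatch Logs passes in event["awslogs"]["data"]."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


def find_sensitive_queries(payload):
    """Return one record per log event whose message contains a keyword."""
    matches = []
    for log_event in payload.get("logEvents", []):
        found = SENSITIVE.search(log_event["message"])
        if found:
            matches.append({
                "log_group": payload.get("logGroup"),
                "command": found.group(1).upper(),
                "message": log_event["message"],
                "timestamp": log_event["timestamp"],
            })
    return matches


def build_slack_payload(match):
    """Build a JSON body with a "text" field for a Slack incoming webhook."""
    text = (
        f"Sensitive {match['command']} query in {match['log_group']}:\n"
        f"{match['message']}"
    )
    return json.dumps({"text": text}).encode("utf-8")


def lambda_handler(event, context):
    payload = decode_subscription_event(event)
    matches = find_sensitive_queries(payload)

    # Hypothetical variable name; when it is unset, nothing is sent.
    webhook_url = os.environ.get("SLACK_WEBHOOK_URL")
    for match in matches:
        body = build_slack_payload(match)
        if webhook_url:
            request = urllib.request.Request(
                webhook_url,
                data=body,
                headers={"Content-Type": "application/json"},
            )
            with urllib.request.urlopen(request, timeout=5) as response:
                response.read()

    return {"matched": len(matches), "commands": [m["command"] for m in matches]}
```

Keeping the decoding, matching, and message-building steps in separate functions makes each one testable without AWS or Slack access. A write to the DynamoDB table could be added in the same loop that sends the notification.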
Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud that delivers powerful and secure insights on all your data with the best price-performance. Its main advantages include the following:

- It provides up to 10 times faster performance than other data warehouses.
- You can set caching to increase data retrieval speed.
- You can create and deploy a warehouse in minutes.
- Most of the common tasks, such as monitoring and managing your warehouse, are automated.
- With on-demand pricing, there are no upfront costs or contract periods.
- It can be 10 times cheaper than a traditional data warehouse set up on-premises.
- You can query any amount of data, and Amazon Redshift takes care of scaling up or down. Compute and storage are also scaled separately.
- You can query the data in your Amazon S3 bucket or data lake, up to petabytes of it, directly from Amazon Redshift. This capability is Redshift Spectrum.
- You can isolate your warehouse using Amazon VPC.
- You can create customer managed keys (CMKs) using AWS Key Management Service (AWS KMS) to encrypt the data in your warehouse.