Question # 1

A data engineer needs Amazon Athena queries to finish faster. The data engineer notices that all the files the Athena queries use are currently stored in uncompressed .csv format. The data engineer also notices that users perform most queries by selecting a specific column. Which solution will MOST speed up the Athena query performance?

A. Change the data format from .csv to JSON format. Apply Snappy compression.
B. Compress the .csv files by using Snappy compression.
C. Change the data format from .csv to Apache Parquet. Apply Snappy compression.
D. Compress the .csv files by using gzip compression.

Question # 2

A company stores data in a data lake that is in Amazon S3. Some data that the company stores in the data lake contains personally identifiable information (PII). Multiple user groups need to access the raw data. The company must ensure that user groups can access only the PII that they require. Which solution will meet these requirements with the LEAST effort?

A. Use Amazon Athena to query the data. Set up AWS Lake Formation and create data filters to establish levels of access for the company's IAM roles. Assign each user to the IAM role that matches the user's PII access requirements.
B. Use Amazon QuickSight to access the data. Use column-level security features in QuickSight to limit the PII that users can retrieve from Amazon S3 by using Amazon Athena. Define QuickSight access levels based on the PII access requirements of the users.
C. Build a custom query builder UI that will run Athena queries in the background to access the data. Create user groups in Amazon Cognito. Assign access levels to the user groups based on the PII access requirements of the users.
D. Create IAM roles that have different levels of granular access. Assign the IAM roles to IAM user groups. Use an identity-based policy to assign access levels to user groups at the column level.

Question # 3

A company receives call logs as Amazon S3 objects that contain sensitive customer information. The company must protect the S3 objects by using encryption. The company must also use encryption keys that only specific employees can access. Which solution will meet these requirements with the LEAST effort?

A. Use an AWS CloudHSM cluster to store the encryption keys. Configure the process that writes to Amazon S3 to make calls to CloudHSM to encrypt and decrypt the objects. Deploy an IAM policy that restricts access to the CloudHSM cluster.
B. Use server-side encryption with customer-provided keys (SSE-C) to encrypt the objects that contain customer information. Restrict access to the keys that encrypt the objects.
C. Use server-side encryption with AWS KMS keys (SSE-KMS) to encrypt the objects that contain customer information. Configure an IAM policy that restricts access to the KMS keys that encrypt the objects.
D. Use server-side encryption with Amazon S3 managed keys (SSE-S3) to encrypt the objects that contain customer information. Configure an IAM policy that restricts access to the Amazon S3 managed keys that encrypt the objects.

Question # 4

A data engineer needs to maintain a central metadata repository that users access through Amazon EMR and Amazon Athena queries. The repository needs to provide the schema and properties of many tables. Some of the metadata is stored in Apache Hive. The data engineer needs to import the metadata from Hive into the central metadata repository. Which solution will meet these requirements with the LEAST development effort?

A. Use Amazon EMR and Apache Ranger.
B. Use a Hive metastore on an EMR cluster.
C. Use the AWS Glue Data Catalog.
D. Use a metastore on an Amazon RDS for MySQL DB instance.

