AWS
- Actions, resources, and condition keys for AWS services
- Reference for all IAM actions, resources, and condition keys for all AWS services
- Policy Evaluation Logic
- AWS Global Condition Context Keys
- AWS Code Sample
- AWS Workshop
- AWS Storage Optimization
- https://github.com/serverless/serverless/issues/4285
- https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/using-iam-servicerole.html
- https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_condition-keys.html#condition-keys-multifactorauthpresent
- https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html
- Cost
- Accounts
- Solution Architect Associate Exam
- Dashboard
- Diagram
- mhlabs/cfn-diagram: CLI tool to visualise CloudFormation/SAM/CDK stacks as visjs networks, draw.io or ascii-art diagrams.
- duo-labs/cloudmapper: CloudMapper helps you analyze your Amazon Web Services (AWS) environments.
- Cloudcraft – Draw AWS diagrams
- mingrammer/diagrams: Diagram as Code for prototyping cloud system architectures
RDS
- Modifying an Amazon RDS DB Instance
- Best Practices for Upgrading Amazon RDS for MySQL and Amazon RDS for MariaDB
- How to export database from Amazon RDS MySQL instance to local instance?
- Accessing MySQL binary logs
- Importing data to an Amazon RDS MySQL or MariaDB DB instance with reduced downtime
- Amazon RDS DB instance storage
- Increasing DB instance storage capacity
  - In most cases, scaling storage doesn't require any outage and doesn't degrade performance of the server.
- How can I decrease the total provisioned storage size of my Amazon RDS DB instance?
- How do I reset the master user password for my Amazon RDS DB instance?
- Best practices for configuring parameters for Amazon RDS for MySQL, part 3: Parameters related to security, operational manageability, and connectivity timeout
re:Invent
2024
- [re:Invent 2024 on-site report] Two new S3 storage features target AI needs; Iceberg large-table queries run up to 3x faster, and metadata can be generated automatically | iThome
- [re:Invent 2024 on-site report] The AWS CEO unveils progress on two database services, combining high availability, low latency, and no infrastructure management | iThome
S3
- Why is my presigned URL for an Amazon S3 bucket expiring before the expiration time that I specified?
- Amazon S3 Path Deprecation Plan – The Rest of the Story
- How do I troubleshoot 403 Access Denied errors from Amazon S3
- IAM Policies and Bucket Policies and ACLs! Oh, My! (Controlling Access to S3 Resources)
- Why can't I access an object that was uploaded to my Amazon S3 bucket by another AWS account?
- Why does S3.deleteObject not fail when the specified key doesn't exist?
Server-side encryption
Metadata
- How do you search an amazon s3 bucket?
- Retrieve/List objects using metadata in s3 - aws sdk
- Amazon S3 : Listing Object with Metadata in single request
- boto3 find object by metadata or tag
- Building and Maintaining an Amazon S3 Metadata Index without Servers
- Amazon S3 Inventory
CLI
download object by prefix
```shell=
aws s3api list-objects-v2 --bucket {bucket name} --prefix {prefix} > download.json
jq '.Contents[].Key' download.json | awk -F '"' '{print $2}' > s3_object_keys
```
```bash=
#!/bin/bash
FILENAME="s3_object_keys"
BUCKET_NAME="bucket name"
PREFIX="prefix"
# List the object keys under the prefix and extract them with jq/awk
aws s3api list-objects-v2 --bucket "${BUCKET_NAME}" --prefix "${PREFIX}" > download.json
jq '.Contents[].Key' download.json | awk -F '"' '{print $2}' > "${FILENAME}"
LINES=$(cat "${FILENAME}")
for s3_object_key in ${LINES}
do
    echo "${s3_object_key}"
    local_file_name=$(echo "${s3_object_key}" | awk -F '/' '{print $2}')
    echo "${local_file_name}"
    aws s3api get-object --bucket "${BUCKET_NAME}" --key "${s3_object_key}" "${local_file_name}"
done
```
IAM
- Actions, resources, and condition keys for Identity And Access Management
- IAM tutorial: Delegate access across AWS accounts using IAM roles
- Creating a condition with multiple keys or values
aws:MultiFactorAuthPresent
aws:MultiFactorAuthPresent is present when the principal uses temporary credentials to make the request. Temporary credentials are used to authenticate:
- IAM roles
- federated users
- IAM users with temporary tokens from sts:GetSessionToken
- users of the AWS Management Console
The aws:MultiFactorAuthPresent key is NOT present when an API or CLI command is called with long-term credentials (user access key pairs).
...IfExists condition operators let you say: "If the policy key is present in the context of the request, process the key as specified in the policy. If the key is not present, evaluate the condition element as true."
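As a sketch of the `...IfExists` operator combined with this key (an illustrative policy fragment, not copied from the docs), a deny statement that blocks requests made without MFA, while still evaluating to true when the key is absent, might look like:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyWhenNoMFA",
      "Effect": "Deny",
      "Action": "*",
      "Resource": "*",
      "Condition": {
        "BoolIfExists": {
          "aws:MultiFactorAuthPresent": "false"
        }
      }
    }
  ]
}
```

With plain `Bool` instead of `BoolIfExists`, requests made with long-term credentials (where the key is missing entirely) would not match the condition at all.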
IAM:PassRole
EC2
Spot instance
- Launching Spot Instances in Your Auto Scaling Group
- New Amazon EC2 Spot pricing model: Simplified purchasing without bidding and fewer interruptions
You set the maximum price you are willing to pay as part of the launch configuration or launch template. If the Spot price is within your maximum price, whether your request is fulfilled depends on Spot Instance capacity. You pay only the Spot price for the Spot Instances that you launch.
vCPU
- Optimizing CPU options
- The number of vCPUs for the instance is the number of CPU cores multiplied by the threads per core
- CPU cores and threads per CPU core per instance type
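The vCPU arithmetic above can be sketched in a couple of lines (the core/thread counts here are illustrative, not tied to a specific instance type):

```python
def vcpu_count(cpu_cores: int, threads_per_core: int) -> int:
    """vCPUs = number of CPU cores x threads per core."""
    return cpu_cores * threads_per_core

# 16 cores with 2 threads per core expose 32 vCPUs.
print(vcpu_count(16, 2))  # 32
# Disabling hyperthreading via CPU options (1 thread per core) halves it.
print(vcpu_count(16, 1))  # 16
```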
EBS
SSH
DNS
- Viewing DNS hostnames for your EC2 instance
- How can I avoid DNS resolution failures with an Amazon EC2 Linux instance
- How do I assign a static DNS server to the EC2 instance that persists during reboot
AMI
ELB
Q. Will the IPs behind an Application Load Balancer (ALB) DNS name change? Yes. ELB updates the load balancer's DNS records, so when ELB adds resources, each added resource has a corresponding IP registered in DNS.
An ELB fronts a group of EC2 instances; if one EC2 malfunctions or becomes unhealthy, it is replaced with a healthy EC2, and at that point the IP changes.
The Application Load Balancer also has an auto-scaling mechanism. Suppose an ALB has 2 EC2 instances behind it, and therefore 2 IPs, A and B. When traffic increases, the newly launched EC2 instances each get a corresponding IP, say C and D. When traffic drops and a scale-in happens, there is no fixed pattern for which EC2 instances are removed first: it may remove the A and B instances first, leaving the C and D instances, at which point the IPs change to C and D.
Q: Does a wildcard match slashes? Using the example from the docs, I set two rules on an ALB: `/img/*` => forward to Target Group A, `/img/*/pics` => forward to Target Group B. Does rule 1 subsume rule 2, so that a request starting with /img will never reach Target Group B?
Yes, rule 2 is indeed subsumed by rule 1. Whether requests can still reach Target Group B depends on your rule priorities: the load balancer evaluates rules from the lowest priority value to the highest, and the default rule is evaluated last. If `/img/*/pics` has a lower priority value (higher precedence) than `/img/*`, requests can still reach Target Group B.
Q: When ALB rules are changed, do they apply to subsequent requests immediately, or only after some delay?
Rule changes do not take effect immediately. As you noted, a new rule needs some time before it is applied to the ALB.
- Best Practices in Evaluating Elastic Load Balancing
- Using static IP addresses for Application Load Balancers
- Update Rule Priority
- Troubleshoot your Application Load Balancers
Integration options
- Using AWS Lambda with an Application Load Balancer
- AWS APPLICATION LOAD BALANCER (ALB) AND ECS WITH FLASK APP
Troubleshoot
- The load balancer generates an HTTP error
- Access logs for your Application Load Balancer
ECS
Lambda
Overview
- What is AWS Lambda
- Lambda concepts
- Managing Lambda reserved concurrency
- :star: Understanding AWS Lambda scaling and throughput | AWS Compute Blog
- Lambda function scaling
- Working with Lambda function metrics
- Runtime deprecation policy
- Security
  - Lambda operator guide
  - Understanding the Lambda execution environment
  - Encrypting data in Lambda-based applications
  - Security in AWS Lambda
- Using AWS Lambda with Amazon API Gateway
  - Handling errors with an API Gateway API
    - If the Lambda API rejects the invocation request, API Gateway returns a 500 error code.
    - If the function runs but returns an error, or returns a response in the wrong format, API Gateway returns a 502.
    - In both cases, the body of the response from API Gateway is {"message": "Internal server error"}.
  - Handle Lambda errors in API Gateway
  - Amazon API Gateway
  - Resolve HTTP 502 errors from API Gateway REST APIs with Lambda functions
- SNS to Lambda or SNS to SQS to Lambda, what are the trade-offs? | theburningmonk.com
- https://docs.aws.amazon.com/lambda/latest/dg/lambda-runtimes.html
- https://aws.amazon.com/amazon-linux-ami/2018-03-packages/
- https://www.openssl.org/news/openssl-1.0.2-notes.html
- Exploring the AWS Lambda Execution Environment
```python=
import json
import subprocess


def run(cmd):
    # Run a shell command inside the Lambda execution environment
    # and return its combined stdout/stderr output.
    proc = subprocess.Popen(cmd, shell=True,
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    return proc.communicate()[0]


def lambda_handler(event, context):
    # Check the OpenSSL version bundled in the runtime.
    print(run('openssl version'))

    # /tmp is the only writable path in the execution environment.
    run('touch /tmp/key.txt && echo 456 >> /tmp/key.txt')
    run('touch /tmp/test.txt && echo 123 >> /tmp/test.txt')

    # Encrypt, inspect, then decrypt the file with openssl.
    run('openssl aes-256-cbc -k 456 -salt -in /tmp/test.txt -out /tmp/test.enc')
    for line in run('file /tmp/test.enc').splitlines():
        print(line)
    run('openssl aes-256-cbc -d -k 456 -in /tmp/test.enc -out /tmp/test.dec')

    # The decrypted file should have the same md5sum as the original.
    for line in run('ls -la /tmp && md5sum /tmp/*').splitlines():
        print(line)

    return {
        'statusCode': 200,
        'body': json.dumps('Hello from Lambda!')
    }
```
Function logs:
START RequestId: cd75f4ff-b209-44ac-bfc5-fcf3deee287e Version: $LATEST
b'OpenSSL 1.0.2k-fips 26 Jan 2017\n'
b"/tmp/test.enc: openssl enc'd data with salted password"
b'total 24'
b'drwx------ 2 sbx_user1051 991 4096 Sep 21 10:00 .'
b'dr-xr-xr-x 21 root root 4096 Aug 31 10:28 ..'
b'-rw-rw-r-- 1 sbx_user1051 991 4 Sep 21 10:00 key.txt'
b'-rw-rw-r-- 1 sbx_user1051 991 4 Sep 21 10:00 test.dec'
b'-rw-rw-r-- 1 sbx_user1051 991 32 Sep 21 10:00 test.enc'
b'-rw-rw-r-- 1 sbx_user1051 991 4 Sep 21 10:00 test.txt'
b'd2d362cdc6579390f1c0617d74a7913d /tmp/key.txt'
b'ba1f2511fc30423bdbb183fe33f3dd0f /tmp/test.dec'
b'189671449c7e95e7bf09942b654df82f /tmp/test.enc'
b'ba1f2511fc30423bdbb183fe33f3dd0f /tmp/test.txt'
END RequestId: cd75f4ff-b209-44ac-bfc5-fcf3deee287e
REPORT RequestId: cd75f4ff-b209-44ac-bfc5-fcf3deee287e Duration: 588.25 ms Billed Duration: 600 ms Memory Size: 128 MB Max Memory Used: 48 MB Init Duration: 1.40 ms
Hit the 6MB Lambda payload limit? Here’s what you can do
AWS KMS, Boto3 and Python: Complete Guide with examples
Layer
- Using Lambda layers to simplify your development process
- AWSome Lambda Layers
- How to publish and use AWS Lambda Layers with the Serverless Framework
Parameters
- variable sources
- Sharing Secrets with AWS Lambda Using AWS Systems Manager Parameter Store
- Managing secrets, API keys and more with Serverless
- Secrets Management for AWS Powered Serverless Applications
Extensions
- Caching data and configuration settings with AWS Lambda extensions
- Overview of AWS Lambda Extensions
Local test
- AWS base images for Lambda
- New for AWS Lambda – Container Image Support
- Testing Lambda container images locally
Support
:::info I understand that you would like to know why the CloudWatch "ConcurrentExecutions" metric shows only 826 across all Lambda functions in the us-east-1 region while you are still facing throttle errors.
To investigate this issue further, I discussed it with an internal Lambda expert; kindly refer to the following explanation:
The Lambda service uses a counter-like mechanism to count the number of current execution environments. In addition, the CloudWatch ConcurrentExecutions metric is recorded by sampling, which can leave gaps between sampling intervals. For example, even though the current sampled value of ConcurrentExecutions is 826, under heavy invocation ConcurrentExecutions may suddenly exceed the upper limit of 1000 a moment later.
Later, when some Lambda function executions complete, their execution environments are released, so by the time of the next sample ConcurrentExecutions has returned to normal. This is why we advise customers to watch for the ConcurrentExecutions metric getting extremely close to the upper limit; in that case, consider raising the "ConcurrentExecutions" limit, which should help reduce throttle errors. :::
VPC
API Gateway
- API Gateway mapping template and access logging variable reference
- Introducing HTTP APIs: A Better, Cheaper, Faster Way to Build APIs - AWS Online Tech Talks
- Handling Errors in Amazon API Gateway
- API Gateway quotas for configuring and running a REST API
- Integration timeout => 50 milliseconds - 29 seconds for all integration types, including Lambda, Lambda proxy, HTTP, HTTP proxy, and AWS integrations.
- Getting “x-amzn-Remapped-WWW-Authenticate instead of WWW-Authenticate and jetty client not able to recognise
- Wildcard custom domain names
- A Detailed Overview of AWS API Gateway
Lambda integration
Custom HTTP Status Code
The routing of Lambda function errors to HTTP responses in API Gateway is achieved by pattern matching against this “errorMessage” field in the Lambda response. The Lambda function must exit with an error in order for the response pattern to be evaluated – it is not possible to “fake” an error response by simply returning an “errorMessage” field in a successful Lambda response.
- Send Custom HTTP Status Code from Lambda to API Gateway
- How to Return Custom HTTP Status codes from a Lambda function in Amazon API Gateway
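A minimal sketch of the pattern described above (the `[404]` prefix and the matching selection pattern are illustrative assumptions, not values from the docs): the Lambda function must actually raise an error so that the `errorMessage` field exists for API Gateway to pattern-match.

```python
import json


def lambda_handler(event, context):
    # To map to a custom HTTP status, the function must FAIL with an
    # errorMessage that an integration response selection pattern can match.
    # Returning an "errorMessage" field in a successful response does NOT work.
    item = event.get('item') if isinstance(event, dict) else None
    if item is None:
        # An integration response with selection pattern \[404\].* (illustrative)
        # would map this error to an HTTP 404 response.
        raise Exception('[404] Not Found: requested item is missing')
    return {'statusCode': 200, 'body': json.dumps(item)}
```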
Metrics
Usage plan
- What are usage plans and API keys
- Throttling and quota limits apply to requests for individual API keys that are aggregated across all API stages within a usage plan.
- Do different API keys associated on the same usage plan share the same quota limit too
ACM
Supported Regions
- Supported Regions
Certificates in ACM are regional resources. To use a certificate with Elastic Load Balancing for the same fully qualified domain name (FQDN) or set of FQDNs in more than one AWS region, you must request or import a certificate for each region. For certificates provided by ACM, this means you must revalidate each domain name in the certificate for each region. You cannot copy a certificate between regions.
To use an ACM certificate with Amazon CloudFront, you must request or import the certificate in the US East (N. Virginia) region. ACM certificates in this region that are associated with a CloudFront distribution are distributed to all the geographic locations configured for that distribution.
System Manager
SSM can also manage on-premises machines and VMs; the SSM agent must be installed on any machine or VM to be managed by SSM. Session Manager: the benefit is that inbound ports can be closed; it also works for Windows RDP. Distributor: installs software packages.
Additional information from an AWS Senior SA:
- Automation with rollback: refer to the document below; more complex flows still need to be combined with Lambda:
AWS-PatchInstanceWithRollback: https://docs.aws.amazon.com/systems-manager-automation-runbooks/latest/userguide/automation-aws-patchinstancewithrollback.html
- AWS Health Events automation (EC2 Retired)
First, configure AWS Health Events as a CloudWatch Events (EventBridge) source: https://docs.aws.amazon.com/health/latest/ug/cloudwatch-events-health.html How can I receive notifications for scheduled events for my EC2 instance using CloudWatch Events? https://aws.amazon.com/tw/premiumsupport/knowledge-center/cloudwatch-notification-scheduled-events/
CloudWatch Events can directly trigger SSM Automation: https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/SSM_Automation_as_Target.html
To centrally manage ops events through SSM OpsCenter: https://docs.aws.amazon.com/systems-manager/latest/userguide/OpsCenter-automatically-create-OpsItems-2.html
Route 53
- Why is my third-party SSL provider unable to verify my Route 53 domain ownership
- RRSet of type CNAME with DNS name foo.com. is not permitted at apex in zone bar.com
Unlike a CNAME record, you can create an alias record at the top node of a DNS namespace, also known as the zone apex. For example, if you register the DNS name example.com, the zone apex is example.com. You can't create a CNAME record for example.com, but you can create an alias record for example.com that routes traffic to www.example.com.
tag
Overview
- Focus on Required and Conditionally Required Tags
- Consider naming your tags using all lowercase, with hyphens separating words, and a prefix identifying the organization name or abbreviated name
- In 2016, the number of tags per resource was increased to 50 (with a few exceptions, such as S3 objects)
- it’s generally recommended to follow good data management practice by including only one data attribute per tag
- Remediate Untagged Resources
- Tag Editor is a feature of the AWS Management Console that allows you to search for resources using a variety of search criteria and add, modify, or delete tags in bulk.
- The AWS Resource Tagging API allows you to perform these same functions programmatically.
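The naming convention above (all lowercase, hyphen-separated words, with an organization prefix) can be checked with a small validator. This is a sketch; the `anycompany` prefix format and the regex are illustrative assumptions, not an AWS rule.

```python
import re

# Illustrative convention: "<org-prefix>:<lowercase-words-with-hyphens>"
TAG_KEY_RE = re.compile(r'^[a-z][a-z0-9]*:[a-z0-9]+(-[a-z0-9]+)*$')


def is_valid_tag_key(key: str) -> bool:
    """True when a tag key follows the lowercase, hyphen-separated,
    org-prefixed convention described above."""
    return TAG_KEY_RE.match(key) is not None


print(is_valid_tag_key('anycompany:cost-center'))  # True
print(is_valid_tag_key('AnyCompany:CostCenter'))   # False (not lowercase)
```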
Practice
- purpose
- key and value
- product
- component
- application
- owner
- department
- environment
- version
- required, conditionally required
examples: anycompany:cost-center anycompany:environment-type anycompany:application-id
SNS
SQS
- Lessons learned from combining SQS and Lambda in a data project - Solita Data
- AWS Lambda and SQS: What Nobody Tells You About Their Mix | by Ehsan Yazdanparast | Geek Culture | Medium
- Lambda SQS Triggers and Concurrency | by Shilpi Gupta | Better Programming
Duplicated messages
SQS messages may be duplicated in some situations
- Resolve Duplicate Messages in Amazon SQS for the Same Amazon S3 Event
- At-least-once delivery
Warm Greetings from AWS Premium Support. Thank you for contacting AWS Premium Support.
This is Jennifer and I will be assisting you with your case today.
From the case note, I understand that you would like to confirm whether the message ID of a standard SQS queue will be the same in the following two scenarios. Kindly refer to the following information:
### case1 ### Producer application sends a message, but the consumer application receives two duplicate messages
As you may already know, there are some inherent characteristics of a Standard SQS Queue that allow for duplicative messaging. As per the note on this document [1]:
For Standard SQS Queues, the `Visibility Timeout` is not a guarantee against receiving a message more than once.
➜ As per this document [2], a Standard SQS Queue ensures "at-least-once delivery", which implies that it is possible for the same message to be delivered more than once.
➜ When messages are added to a Standard SQS Queue, a unique Message ID is allocated to each message. Amazon SQS returns the Message ID in the response of the "SendMessage" [3] API call.
➜ Amazon SQS stores copies of the messages on multiple servers for redundancy and high availability.
➜ On rare occasions, one of the servers that stores a copy of a message might be unavailable when you receive or delete a message.
➜ This can result in a duplicate message being received when the server becomes available again.
I would like to highlight that, duplicate messages (introduced by Amazon SQS as a result of the above mentioned point) will contain the SAME Message ID.
### case2 ### Putting a single file to an S3 bucket triggers duplicated SQS messages for the PutObject action
To investigate this issue, I have set up the configuration and performed the testing in my environment. Based on my test, when I upload the same object 3 times in a row, normally, the `sequencer key` and `message ID` of the 3 responses are totally different. However, it is difficult to reproduce the phenomenon of sequencer key duplication, as it occurs in rare cases [4]. After delving into this issue, I am able to confirm from internal sources: "For S3 event notifications, it is expected to see duplicates. However, they'd show up as DIFFERENT SQS message ids if they were generated by the events system".
Furthermore, for case1, here is an example of the application logic that would need to be implemented in order to facilitate idempotency:
1. Extract the value of a unique attribute of the input event (such as, the Message ID).
2. Check if the attribute value already exists in a control database. Depending on the outcome, do the following:
➜ If a unique value exists, end the action without producing an error.
➜ If a unique value does not exist, proceed with the actions that you designed.
3. Thereafter, include a record of the attribute value in the control database.
I hope the above information helps.
Have a nice day :)
■ References:
============
[1] Amazon SQS Visibility Timeout: https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-visibility-timeout.html
[2] At-least-once Delivery: https://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/standard-queues.html#standard-queues-at-least-once-delivery
[3] SendMessage: https://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/API_SendMessage.html
[4] https://aws.amazon.com/tw/premiumsupport/knowledge-center/s3-duplicate-sqs-messages/
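The idempotency steps from the support reply above can be sketched as follows. The in-memory `set` stands in for the control database (e.g. a DynamoDB table); all names are illustrative.

```python
processed_ids = set()  # stands in for a control database such as DynamoDB


def handle_message(message_id: str, body: str) -> bool:
    """Process an SQS message idempotently. Returns True if work was done."""
    # 1. Extract a unique attribute of the input event (here, the Message ID).
    # 2. If the value already exists, end the action without producing an error.
    if message_id in processed_ids:
        return False  # duplicate delivery, safely skipped
    # ...proceed with the actions you designed, using `body`...
    # 3. Record the attribute value in the control database.
    processed_ids.add(message_id)
    return True


print(handle_message('msg-1', 'hello'))  # True  (first delivery, processed)
print(handle_message('msg-1', 'hello'))  # False (duplicate, skipped)
```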
CloudWatch
Metrics
Metric data is kept for 15 months
Publishing Single Data Points
Each metric is one of the following:
- Standard resolution, with data having a one-minute granularity
  - Metrics produced by AWS services are standard resolution by default
- High resolution, with data at a granularity of one second
  - :money_with_wings: Keep in mind that every PutMetricData call for a custom metric is charged, so calling PutMetricData more often on a high-resolution metric can lead to higher charges.
Namespace: works like a category. Dimension: works like a label/tag used for grouping. SampleCount: the number of records.
Others
- Publish Amazon CloudWatch metrics to a comma-separated values output
- how to export CloudWatch Metrics to CSV
Alarms
- Using Amazon CloudWatch Alarms
- Why did my CloudWatch alarm trigger when its metric doesn't have any breaching data points
- Type
- metric alarm
- composite alarm
- Alarm States
- OK
- ALARM
- INSUFFICIENT_DATA
An alarm invokes actions only when the alarm changes state
The exception is for alarms with Auto Scaling actions. For Auto Scaling actions, the alarm continues to invoke the action once per minute that the alarm remains in the new state.
==Three settings== enable CloudWatch to evaluate when to change the alarm state
- Period: the check interval, which produces a data point
  - the length of time to evaluate the metric or expression to create each individual data point for an alarm
  - If you choose one minute as the period, the alarm evaluates the metric once per minute.
  - each specific data point reported to CloudWatch falls under one of three categories
    - Not breaching (within the threshold)
    - Breaching (violating the threshold)
    - Missing - missing data points
- Evaluation Periods: the number of recent candidate data points, similar to a check window
  - the number of the most recent periods, or data points, to evaluate when determining alarm state
- Datapoints to Alarm: the number of data points that must be breaching
  - the number of data points within the Evaluation Periods that must be breaching to cause the alarm to go to the ALARM state
When you configure Evaluation Periods and Datapoints to Alarm as different values, you're setting an "M out of N" alarm. Datapoints to Alarm is ("M") and Evaluation Periods is ("N"). The evaluation interval is the number of data points multiplied by the period. For example, if you configure 4 out of 5 data points with a period of 1 minute, the evaluation interval is 5 minutes. If you configure 3 out of 3 data points with a period of 10 minutes, the evaluation interval is 30 minutes.
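The "M out of N" evaluation above can be sketched in a few lines (a simplified model that ignores missing-data handling; the sample values are illustrative):

```python
def alarm_state(datapoints, m, n, threshold):
    """'M out of N' alarm: ALARM when at least m of the last n
    data points breach the threshold."""
    window = datapoints[-n:]                    # Evaluation Periods = n
    breaching = sum(1 for d in window if d > threshold)
    return 'ALARM' if breaching >= m else 'OK'  # Datapoints to Alarm = m


# 4 out of 5 data points, threshold 100: only three breaches -> still OK.
print(alarm_state([120, 90, 130, 110, 80], m=4, n=5, threshold=100))   # OK
# Four breaches within the last five data points -> ALARM.
print(alarm_state([120, 105, 130, 110, 80], m=4, n=5, threshold=100))  # ALARM
```

With a 1-minute period this corresponds to a 5-minute evaluation interval, matching the example in the note above.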
Logs
MetricFilter - Filter and pattern syntax
Insight
file_search_api log
file info done
```
fields @timestamp, @message
| filter @message like /File Info Status: 5/ or @message like /File Info Status: 6/ or @message like /File Info Status: 7/ or @message like /File Info Status: 8/ or @message like /File Info Status: 9/
```
Hit prefilter cache
```
fields @timestamp, @message
| filter @message like 'File with valid score'
| filter @message like '-1'
```
get upload url
Undefined or unsupported file type
sum
```
fields @timestamp, @message, @logStream
| filter @message like 'Undefined or unsupported file type'
| filter @message like 'file_type'
| parse @message '"hash": "*"' as hash
| parse @message '"file_type": *,' as file_type
| parse @logStream '*/*/*/[$LATEST]' as year, month, day
| stats count_distinct(hash) as sum by day
```
group results
```
fields @timestamp, @message
| filter @message like 'Undefined or unsupported file type'
| filter @message like 'file_type'
| parse @message '"hash": "*"' as hash
| parse @message '"file_type": *,' as file_type
| stats count_distinct(hash) as sum by hash, file_type
```
scan task forwarder log
send vendor
```
fields @timestamp, @message, @logStream
| filter @message like 'handle sandbox reply task id' or @message like 'handle sandbox reply report'
| parse @logStream '*/*/*/[$LATEST]' as year, month, day
| stats count(*) as sum by day
```
report forwarder log
score distribution
```
fields @timestamp, @message
| filter @message like 'virus_score'
| parse @message 'virus_score * is' as score
| stats count(score) as sum by bin(1d), score
```
quarantine
file type distribution
```
fields @timestamp, @message
| filter @message like 'filter queue msg'
| parse @message '"file_type": *}' as file_type
| stats count(file_type) as sum by bin(1d), file_type
```
cloud query
hit cache and score
```
fields @timestamp, @message
| filter @message like 'has been cached'
| parse @message /(?<md5>[0-9a-z]{32}) has been cached and score is (?<score>-?[0-9]+)/
| filter score > 0
| stats count(md5) as sum by md5, score
```
EventBridge
- How to get the event content in ECS when it is invoked by cloudwatch/eventbridge event?
- How to extract event relayed from AWS EventBridge to ECS Fargate
- Passing event data from Amazon EventBridge into an AWS Fargate task
- AWS question - How can I get Cloudwatch event data in a Fargate task with Python
- Passing input to ECS task from CloudWatch rule
CloudFront
CloudTrail
Elasticsearch
DynamoDB
- 30-Day Ironman Challenge introduction to the AWS cloud world - 26: DynamoDB, the managed NoSQL DBMS offered by AWS
- AWS feature summary - DynamoDB
- DynamoDB in 15 minutes
- DynamoDBGuide
- How to model one-to-many relationships in DynamoDB
- With one-to-many relationships, there’s one core problem: how do I fetch information about the parent entity when retrieving one or more of the related entities
- Denormalization by using a complex attribute
- Do you have any access patterns based on the values in the complex attribute?
- Is the amount of data in the complex attribute unbounded?
- Denormalization by duplicating data
- Is the duplicated information immutable?
- If the data does change, how often does it change and how many items include the duplicated information?
- Composite primary key + the Query API action
- This is a pretty common way to model one-to-many relationships
- Secondary index + the Query API action
- You may need to use this pattern instead of the previous pattern because the primary keys in your table are reserved for another purpose
- Composite sort keys with hierarchical data
- You have many levels of hierarchy (>2), and you have access patterns for different levels within the hierarchy.
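The composite sort key pattern for hierarchical data can be sketched with plain prefix matching. The table items, attribute names, and `COUNTRY#STATE#CITY` layout below are illustrative, mimicking what `Query` with `begins_with()` does on a real table:

```python
# Items keyed by a composite sort key that encodes the hierarchy,
# so one Query with begins_with() can fetch any level of it.
items = [
    {'pk': 'STORE', 'sk': 'USA#WA#SEATTLE', 'name': 'Store 1'},
    {'pk': 'STORE', 'sk': 'USA#WA#TACOMA',  'name': 'Store 2'},
    {'pk': 'STORE', 'sk': 'USA#CA#FRESNO',  'name': 'Store 3'},
]


def query_begins_with(pk, sk_prefix):
    """Mimics Query(pk = :pk AND begins_with(sk, :prefix))."""
    return [i for i in items if i['pk'] == pk and i['sk'].startswith(sk_prefix)]


print(len(query_begins_with('STORE', 'USA#')))     # all US stores -> 3
print(len(query_begins_with('STORE', 'USA#WA#')))  # Washington only -> 2
```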
- The What, Why, and When of Single-Table Design with DynamoDB
- Official
- Core Components of Amazon DynamoDB
- Amazon DynamoDB introductory videos
- Getting Started Developing with Python and DynamoDB
- QueryFilter
- This parameter does not support attributes of type List or Map
- QueryFilter
- Creating a single-table design with Amazon DynamoDB
- Example Tables and Data
- AWS DynamoDB and Schema Design
- Comparison Operator and Function Reference
- aws-samples/amazon-dynamodb-design-patterns
- AWS re:Invent 2019: Data modeling with Amazon DynamoDB (CMY304)
- https://twitter.com/angelo_randazzo/status/1510362054489739267?s=12&t=mGlZq948otytJT6q4RjaIg
- This video (from AWS events) is very interesting to understand the key concepts behind designing a DB model in DynamoDB (totally different vs a classical relational SQL DB)
- How to switch from RDBMS to DynamoDB in 20 easy steps
- Capacity units consumed by query
- DynamoDB calculates the number of read capacity units consumed based on item size, not on the amount of data that is returned to an application. For this reason, the number of capacity units consumed is the same whether you request all of the attributes (the default behavior) or just some of them (using a projection expression). The number is also the same whether or not you use a filter expression.
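The read-capacity arithmetic behind that note can be sketched as follows (using 4 KB per RCU for strongly consistent reads and half an RCU for eventually consistent reads; the item sizes are illustrative):

```python
import math


def rcus_consumed(item_size_bytes: int, eventually_consistent: bool = False) -> float:
    """RCUs for reading one item: size rounded up to the next 4 KB unit,
    halved for an eventually consistent read. The cost depends on item
    size only, not on projection or filter expressions."""
    units = math.ceil(item_size_bytes / 4096)
    return units / 2 if eventually_consistent else float(units)


# A 5 KB item costs 2 RCUs strongly consistent, 1 RCU eventually consistent,
# whether you request all attributes or just some of them.
print(rcus_consumed(5 * 1024))                              # 2.0
print(rcus_consumed(5 * 1024, eventually_consistent=True))  # 1.0
```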
- Courses
Primary key
The primary key uniquely identifies each item in the table, so that no two items can have the same key.
Each primary key attribute must be a scalar (meaning that it can hold only a single value). The only data types allowed for primary key attributes are string, number, or binary
- Choosing the Right DynamoDB Partition Key
- Designing Partition Keys to Distribute Your Workload Evenly
Local secondary indexes (max is 5)
At table creation
docker for local dev
https://github.com/instructure/dynamo-local-admin-docker
UpdateExpression
RCU & WCU
EFS
I understand that you want to find out why 12 additional connections show up in CloudWatch. I would like to share some information with you after speaking with the EFS service team:
1. The connection count in CloudWatch can be overcounted when connections get closed and re-established within the same period.
2. The issue might be caused by a specific behavior of the Linux NFS client with regard to TCP reconnection events. When a reconnection event occurs, the Linux NFS client reuses the TCP source port. This behavior is not conformant with the TCP RFC, and can cause a network issue where NFS responses from EFS to an EC2 instance are blocked for multiple minutes.
To resolve this issue, we recommend adding the "noresvport" mount option when mounting an EFS file system. This option has the effect that a new port is allocated when a reconnection event occurs.
- noresvport – Tells the NFS client to use a new Transmission Control Protocol (TCP) source port when a network connection is reestablished. Doing this helps make sure that the EFS file system has uninterrupted availability after a network recovery event.
DMS
- Getting started with AWS Database Migration Service
- Using a MySQL-compatible database as a source for AWS DMS
- Using a MySQL-compatible database as a target for AWS Database Migration Service
Athena
- Best practices when using Athena with AWS Glue
- Presto 0.172 Documentation
- Why does my Athena query fail with the error "HIVE_PARTITION_SCHEMA_MISMATCH"
- HIVE_PARTITION_SCHEMA_MISMATCH
:::success For S3, the recommended key layout is YYYY/MM/dd/1.log, because Glue can create partitions directly when parsing. :::
Basic
```sql=
SHOW PARTITIONS dc_log
```
Get DUTs to DC Total Request Counts Per Day/Month
Per day per DUT
```sql=
SELECT date_format(CAST(dc.time_stamp as timestamp), '%Y%m%d') as day, dc.device_info.sn, count(*) as requests
FROM dc_log as dc
WHERE dc.category = 'file-search-api' AND
      dc.extra_info.system_tag['device-request-counter'] is not null
GROUP BY date_format(CAST(dc.time_stamp as timestamp), '%Y%m%d'), dc.device_info.sn
order by day asc
```
```sql=
SELECT DAY(CAST(dc.time_stamp as timestamp)) as day, dc.device_info.sn, count(*) as requests
FROM dc_log as dc
WHERE dc.category = 'cloud-anti-malware-query' AND
dc.extra_info.system_tag['device-request-counter'] is not null AND
YEAR(CAST(dc.time_stamp as timestamp)) = 2019 AND
MONTH(CAST(dc.time_stamp as timestamp)) = 6
GROUP BY DAY(CAST(dc.time_stamp as timestamp)), dc.device_info.sn
order by day asc
```