
BizTalk Notes – 2


Message Format –

XML, EDI, CSV..

Message Transport  –

HTTP, FTP, MSMQ, BAPI..

Two different approaches to message flow in BizTalk –

1. Message only Solution (without Orchestration)

2. Orchestration based solution

The messaging component allows communication with several systems and platforms through the use of adapters, which implement the underlying protocol rules and data formats.

The orchestration component allows the creation, execution and management of business processes called orchestrations.

A schema tells the application what to send and what to expect.

Schema Types –

1. XML Schema (Document Schema)(Contract of your message)

2. Flat File Schema

3. Envelope Schema (used to wrap one or more XML business documents into a single XML instance message)

4. Property Schema (describes context properties; consists of only child nodes under the root node)

The architecture divides into 2 general areas –

> Messaging: provides core message integration services

> Orchestration: provides layered BPI-related services

XML is the primary format

XSD is the primary type system

Orchestrations are represented in XLANG internally

BizTalk Assembly Viewer – btsasmext.dll (resides in the ‘Developer Tools’ folder under the installation directory)

XSD is the official schema definition language.

Schema types are – XML, Envelope, Property, Flat-file

ImportType –

Import, Include, Redefine

Flat File Schema Wizard – how the record data is formatted:

1. By delimiter symbol  2. By relative positions
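The two flat-file layouts can be illustrated outside BizTalk with plain Python: split on a delimiter for delimited records, slice fixed offsets for positional records (the field widths here are made up for illustration):

```python
# Delimited record: fields separated by a symbol (here, a comma).
delimited = "1001,John,Smith"
fields = delimited.split(",")

# Positional record: fields identified by fixed offsets and widths
# (widths 4/6/6 are illustrative, padded with spaces).
positional = "1001John  Smith "
by_position = [positional[0:4], positional[4:10].strip(), positional[10:16].strip()]

print(fields)       # ['1001', 'John', 'Smith']
print(by_position)  # ['1001', 'John', 'Smith']
```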

XSLT 1.0 is a standard XML transformation language

Functoid types –

1. Regular 2. Cumulative

The Message Box (MB) is the persistence and routing engine.

The MB consists of the following components –

> One or more SQL Server Databases

> Message Agent

Message Property Types –

1. Distinguished fields     2. Promoted properties

The rule here is: if you don’t want the schema element to appear in send port filters / debugging information, make it a distinguished field.

Pipelines define a sequence of message processing steps.

Default Pipelines are –

1. XMLReceive

2. PassThruReceive
3. XMLTransmit

4. PassThruTransmit

Host:

> In-process hosts

> Isolated hosts

The term functoids refers to predefined functions within the BizTalk Mapper tool set. Functoids support a number of useful translations and transformations.

Functoids –

> Drag-and-drop functional components

> Executed from left to right
> Alter the data format or structure

> No code required (code free)

> More than 80 inbuilt functoids available

Functoids Types –

> String manipulation

> Conversion

> Mathematical functions

> Scientific functions

> Time stamping

> Aggregations & Accumulations

> Conditional processing

> Database lookups

> Looping

> Mass copy

> Arbitrary scripting

> …… and more..

Expansions –

XML – eXtensible Markup Language

XSLT – eXtensible Stylesheet Language Transformation

XSD – XML Schema Definition

EDI – Electronic Data Interchange

RFID – Radio Frequency Identification

BRE – Business Rule Engine

RPC – Remote Procedure Call

CIA – Confidentiality, Integrity, Authentication

BPI – Business Process Integration

BPM – Business Process Management

BAM – Business Activity Monitoring

SSO – Single Sign On

MB – Message Box

OE – Orchestration Engine

ESB – Enterprise Service Bus

LOB – Line Of Business

HAT – Health and Activity Tracking

SOA – Service-Oriented Architecture

MOM – Message-Oriented Middleware

DTD – Document Type Definition

XDR – XML-Data Reduced (Schema)

AWS Solutions Architect SAA-C02 – Important Notes


5 Pillars of the Well-Architected Framework –

  1. Operational Excellence: CloudFormation, CodeCommit, Lambda, AWS Config, CloudWatch, Logs, SNS, Athena
    Prepare
    AWS CloudFormation, AWS Config
    Operate
    AWS CloudFormation, AWS Config, AWS CloudTrail, Amazon CloudWatch, AWS X-Ray
    Evolve
    AWS CloudFormation, AWS CodeBuild, AWS CodeCommit, AWS CodeDeploy, AWS CodePipeline
  2. Security: IAM, STS, AWS Config, CloudWatch Logs, Elasticsearch Service, VPC, Security Groups, KMS
    Identify & Access Management
    IAM, AWS-STS, MFA Token, AWS Organizations
    Detective Controls
    AWS Config, AWS CloudTrail, Amazon CloudWatch
    Infrastructure Protection
    Amazon CloudFront, Amazon VPC, AWS Shield, AWS WAF, Amazon Inspector
    Data Protection
    KMS, S3, Elastic Load Balancing (ELB), Amazon EBS, Amazon RDS
    Incident Response
    IAM, AWS CloudFormation, Amazon CloudWatch Events
  3. Reliability: VPC, Direct Connect, EC2, Route53, ELB, CloudWatch, S3, Glacier
    Foundations
    IAM, Amazon VPC, Service Limits, AWS Trusted Advisor
    Change Management
    AWS Auto Scaling, Amazon CloudWatch, AWS CloudTrail, AWS Config
    Failure Management
    Backups, AWS CloudFormation, Amazon S3, Amazon S3 Glacier, Amazon Route53
  4. Performance Efficiency: Auto Scaling, EBS, EFS, VPC, ECS, CloudWatch, RDS, DynamoDB
    Selection
    AWS Auto Scaling, AWS Lambda, Amazon Elastic Block Store (EBS), Amazon Simple Storage Service (S3), Amazon RDS
    Review
    AWS CloudFormation, AWS News Blog
    Monitoring
    Amazon CloudWatch, AWS Lambda
    TradeOffs
    Amazon RDS, Amazon ElastiCache, AWS Snowball, Amazon CloudFront
  5. Cost Optimization: CloudWatch, Trusted Advisor, Reserved Instances, Spot Instances, CloudFormation, Cost Explorer
    Expenditure Awareness
    AWS Budgets, AWS Cost and Usage Report, AWS Cost Explorer, Reserved Instance Reporting
    Cost-Effective Resources
    Spot Instance, Reserved Instance, Amazon S3 Glacier
    Matching supply and demand
    AWS Auto Scaling, AWS Lambda
    Optimizing Over Time
    AWS Trusted Advisor, AWS Cost and Usage Report, AWS News Blog

IAM
Users
Groups
Roles
Policies

EC2 – SSH to Linux
0644 –> you get the warning “unprotected private key file” and the connection is denied with “Permissions 0644 for ‘abcd.pem’ are too open. This private key will be ignored. Load key ‘abcd.pem’: bad permissions”
0400 –> run the chmod command so you can log in securely –
>chmod 0400 abcd.pem
>ssh -i abcd.pem ec2-user@35.180.100.144
On Windows 10 and later –> if the above error occurs, go to the *.pem private key >> file Properties >> Security >> grant Full Control >> remove all other users including Administrator & SYSTEM (only the current user must be present, nothing else) >> OK. Now try to SSH to Linux using PowerShell. There you go…

EC2 – User Data
Possible to bootstrap (launching commands when the machine starts) our instances using an EC2 User Data script.
The script runs only once, at the instance’s first start
Used to automate boot tasks such as installing updates, installing software, downloading common files from the internet
The EC2 User Data script runs as the root user
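A typical user data script is just a shell script passed at launch. The raw EC2 RunInstances API expects it base64-encoded (the console and SDKs do the encoding for you); a minimal sketch, with illustrative package names:

```python
import base64

# A typical bootstrap script: update packages and install a web server.
# (Package names are illustrative; adjust for your AMI.)
user_data = """#!/bin/bash
yum update -y
yum install -y httpd
systemctl enable --now httpd
"""

# The raw EC2 API expects user data as base64; SDKs and the console
# perform this encoding automatically.
encoded = base64.b64encode(user_data.encode()).decode()
print(encoded[:20], "...")
```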

Security Groups –
By default, all traffic is allowed out of the machine (outbound rules).
If your application is not accessible or times out, it’s a Security Group issue.
If your application gives a “connection refused” error, it’s an application error, or the application isn’t launched
All inbound traffic is blocked by default
All outbound traffic is authorized by default
Security Groups can reference – IP address, CIDR block, Security Group
AWS Services using Security Groups –
EC2
Load Balancer – CLB
Load Balancer – ALB
EBS Volume
EFS

Elastic IP
When you stop & start an EC2 instance, its public IP changes. If you need a fixed public IP, you need an Elastic IP
Can be attached to only 1 instance at a time
You can have only 5 Elastic IPs in your account (a soft limit that can be raised)

Health Checks –
Load Balancer – CLB
Load Balancer – ALB
Route53 – Alias Record set

Target Groups –
Load Balancer – ALB
Load Balancer – NLB
Health Check

Load Balancer – CLB (HTTP, HTTPS, TCP – Layer 4 & 7)
The Classic Load Balancer (CLB) supports health checks on HTTP, TCP, HTTPS and SSL
By default the ELB has an idle connection timeout of 60 seconds

Load Balancer – ALB (HTTP, HTTPS – Layer 7)
X-Forwarded-For (for HTTP/HTTPS) carries the source IP/port information. It helps the application find the true IP of the clients connecting to your website; otherwise the application sees traffic coming from private IPs, which are in fact your load balancer’s.
Security Groups
Target Groups (Success Code – 200)
(Rules) Can handle multiple Target Groups based on condition like – path, query string, port
The Application Load Balancer (ALB) only supports health checks on HTTP and HTTPS
You can enable Access Logs on the ALB and this will provide the information required including requester, IP, and request type.
Access logs are not enabled by default. You can optionally store and retain the log files on S3

Load Balancer – NLB (TCP, TLS, UDP – Layer 4)
No Security Groups on the NLB itself
Traffic has to be allowed at the EC2 instance level, e.g. by adding TCP:80 to the instance’s SG
Target Groups

Load Balancer – All (CLB, ALB, NLB)
Health checks – ensure your ELB won’t send traffic to unhealthy (crashed) instances
Expose – Network Load Balancers expose a public static IP, whereas an Application or Classic Load Balancer exposes a static DNS (URL)

Load Balancer – Stickiness
sticky session feature (also known as session affinity)
CLB – configured at the LB level
ALB – configured at the Target Group level
The “cookie” used for stickiness has an expiration date you control

Load Balancer – Cross-Zone Load Balancing
Having LB nodes in each AZ, with each node distributing traffic evenly across all registered instances in all AZs.
It’s like Mesh Architecture
CLB
Disabled by Default
No charge for inter AZ data if enabled

ALB
    Always on (can't be disabled)
    No charge for inter AZ data

NLB
    Disabled by default
    You pay charges for inter AZ data if enabled
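The effect of cross-zone balancing on per-instance traffic share comes down to simple arithmetic; a sketch with made-up instance counts:

```python
def per_instance_share(instances_per_az, cross_zone):
    """Fraction of total traffic each instance receives, one value per AZ.

    Without cross-zone balancing, each AZ's load-balancer node gets an
    equal share of traffic and spreads it only over its own AZ's instances.
    With cross-zone balancing, traffic spreads evenly over ALL instances.
    """
    azs = len(instances_per_az)
    total = sum(instances_per_az)
    if cross_zone:
        return [1 / total for _ in instances_per_az]
    return [1 / (azs * n) for n in instances_per_az]

# AZ-1 has 2 instances, AZ-2 has 8 (illustrative numbers).
print(per_instance_share([2, 8], cross_zone=False))  # [0.25, 0.0625]
print(per_instance_share([2, 8], cross_zone=True))   # [0.1, 0.1]
```

The imbalance in the first result (25% vs 6.25% per instance) is exactly what cross-zone balancing removes.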

SSL / TLS – Basics
in-flight encryption (encrypted in transit)
SSL – Secure Sockets Layer
TLS – Transport Layer Security
Public SSL / TLS certificates are issued by Certificate Authorities (CAs) like Comodo, Symantec, GoDaddy, GlobalSign, DigiCert, Let’s Encrypt, etc.
The LB uses X.509 certificates for SSL / TLS; manage certificates with ACM (AWS Certificate Manager)
SNI (Server Name Indication) specifies the hostname

SNI - Server Name Indication
Loads multiple SSL certificates onto one web server to serve multiple websites
Only works for ALB & NLB (new generation), CloudFront

Elastic Load Balancer - SSL Certificates
    CLB (v1)
    Supports only one SSL certificate
    Must use multiple CLBs for multiple hostnames with multiple SSL certificates

    ALB (v2) & NLB (v2)
    Supports multiple listeners with multiple SSL certificates
    Uses SNI (Server Name Indication) to make it work

ELB – Connection Draining
Stops sending requests to instances which are de-registering
Waiting period is 300 Secs (Between 1 to 3600 seconds, default is 300. Can be disabled – set value to 0)
Feature naming
CLB: Connection Draining
Target Group: Deregistration Delay (for ALB & NLB)

ASG – Auto Scaling Group
Scale out (add EC2 Instances)
Scale in (remove EC2 Instances)
Auto Scaling is a region specific service
Launch Configuration – AMI + Instance Type, EC2 user data, EBS Volume, Security Groups, SSH Key pair
Alarm – CloudWatch
IAM roles attached to an ASG will get assigned to EC2 instances
ASG can terminate instances marked as unhealthy by an LB (and hence replace them)
You cannot launch instances in multiple regions from a single Auto Scaling Group
Auto Scaling can span multiple AZs within the same AWS region
You create collections of EC2 instances, called Auto Scaling groups
There is no additional cost for Auto Scaling, you just pay for the resources (EC2 instances) provisioned
Auto Scaling does not rely on ELB but can be used with ELB.
Basic monitoring sends EC2 metrics to CloudWatch about ASG instances every 5 minutes
Detailed can be enabled and sends metrics every 1 minute (chargeable)
When the launch configuration is created from the CLI, detailed monitoring of EC2 instances is enabled by default. You cannot edit a launch configuration once it is defined
If connection draining is enabled, Auto Scaling waits for in-flight requests to complete or timeout before terminating instances
If using an ELB it is best to enable ELB health checks as otherwise EC2 status checks may show an instance as being healthy that the ELB has determined is unhealthy. In this case the instance will be removed from service by the ELB but will not be terminated by Auto Scaling.
When you delete an ASG the instances will be terminated

Cooldown Period: The cooldown period is a configurable setting for your Auto Scaling group that helps to ensure that it doesn’t launch or terminate additional instances before the previous scaling activity takes effect. After the Auto Scaling group dynamically scales using a simple scaling policy, it waits for the cooldown period to complete before resuming scaling activities.

Warm-up Period: The warm-up period is the period of time in which a newly created EC2 instance launched by ASG using step scaling is not considered toward the ASG metrics

Both Launch Templates (newer) and Launch Configurations (older) serve the same purpose: defining how instances are launched.

ASG – Scaling Options
Maintain – keep a specific or minimum number of instances running.
Manual – use maximum, minimum, or a specific number of instances.
Scheduled – increase or decrease the number of instances based on a schedule.
Dynamic – scale based on real-time system metrics (e.g. CloudWatch metrics).

ASG – Scaling Policies
Target Tracking Scaling
Simple / Step Scaling
Scheduled Actions

(Simple Scaling)
You pick any CloudWatch metric (CPU utilization is used for this and the other examples in this post). You specify a single threshold beyond which you want to scale, and your response, e.g. how many EC2 instances to add or remove when CPU utilization breaches the threshold. The scaling policy then acts.
Threshold – add 1 instance when CPU utilization is between 40% and 50%
Note: this is the only threshold

(Step Scaling)
You specify multiple thresholds, each with a different response.
Threshold A – add 1 instance when CPU utilization is between 40% and 50%
Threshold B – add 2 instances when CPU utilization is between 50% and 70%
Threshold C – add 3 instances when CPU utilization is between 70% and 90%, and so on
Note: there are multiple thresholds
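The step thresholds above can be sketched as a pure function (the thresholds are taken from the example; a real policy is configured on the ASG, not written in code):

```python
def instances_to_add(cpu_utilization):
    """Step-scaling sketch using the example thresholds from these notes."""
    steps = [(40, 50, 1), (50, 70, 2), (70, 90, 3)]  # (lower, upper, add)
    for lower, upper, add in steps:
        if lower <= cpu_utilization < upper:
            return add
    # Below the lowest threshold: do nothing. (A real policy usually also
    # has an open-ended top step and scale-in steps, omitted here.)
    return 0

print(instances_to_add(45))  # 1
print(instances_to_add(60))  # 2
print(instances_to_add(85))  # 3
```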

(Target Tracking Scaling)
For when you don’t want to make so many decisions; it is simpler than the previous 2 scaling options.
It’s automatic: pick your metric (CPU utilization in this post), set the target value, and that’s it.
Auto Scaling does the rest, adding and removing capacity to keep your metric (CPU utilization) as close as possible to the target value.
It’s self-optimizing: an algorithm learns how your metric changes over time and uses that information to minimize over- and under-scaling.
You get the fastest scaling response.

ASG – Scaling Cooldown
The default cooldown period is 300 seconds

ASG – Termination Policy

  1. Find the AZ which has the most instances
  2. If there are multiple instances in that AZ to choose from, delete the one with the oldest launch configuration
    (The ASG tries to balance the number of instances across AZs by default)

ASG – Lifecycle Hooks
Auto Scaling Group lifecycle flow:

(Scale out) Pending ——> Pending:Wait ——> Pending:Proceed ——> InService
(Scale in) InService ——> Terminating ——> Terminating:Wait ——> Terminating:Proceed ——> Terminated

ASG – Launch Configuration vs Launch Template
Launch Configuration (legacy)
Must be recreated every time you want to change it

Launch Template (newer)
Can have multiple versions
Provision using both On-Demand & Spot Instances (or a mix)
Can use T2 unlimited burst feature
Recommended by AWS going forward

EBS Volume
It’s a network drive (not a physical drive)
It’s locked to an Availability Zone (AZ)
To move a volume across AZs, you first need to snapshot it
Has provisioned capacity (size in GB, and IOPS)
You get billed for all the provisioned capacity
4 Types –
GP2 (SSD) – General purpose SSD volume that balances price & performance for a wide variety of workloads
IO1 (SSD) – Highest-performance SSD volume for mission-critical low latency or high throughput workloads
ST1 (HDD) – Low cost HDD volume designed for frequently accessed, throughput intensive workloads
SC1 (HDD) – Lowest cost HDD volume designed for less frequently accessed workloads
Only GP2 & IO1 can be used as boot volumes
Security Group
/dev/sda1 or /dev/xvda –> always refers to the Root Volume

Root Volume –> not encrypted by default; to get an encrypted root volume, create an encrypted copy of the snapshot/AMI (or enable EBS encryption by default on the account)

EBS Snapshots
Incremental – Only backup changed blocks
Snapshots will be stored in S3 (but you won’t directly see them)
Not necessary to detach volume to do snapshot, but recommended
Can copy snapshots across AZ or Region
Can make Image (AMI) from snapshot
EBS Volumes restored by snapshots need to be pre-warmed (using fio or dd command to read the entire volume)
Snapshots can be automated using Amazon Data Lifecycle Manager

  • When you create an EBS volume in an Availability Zone, it is automatically replicated within that zone to prevent data loss due to a failure of any single hardware component.
  • An EBS volume can only be attached to one EC2 instance at a time.
  • After you create a volume, you can attach it to any EC2 instance in the same Availability Zone
  • An EBS volume is off-instance storage that can persist independently from the life of an instance. You can specify not to terminate the EBS volume when you terminate the EC2 instance during instance creation.
  • EBS volumes support live configuration changes while in production which means that you can modify the volume type, volume size, and IOPS capacity without service interruptions.
  • Amazon EBS encryption uses 256-bit Advanced Encryption Standard algorithms (AES-256)
  • EBS Volumes offer 99.999% SLA.

EBS RAID Option
RAID 0
RAID 1
RAID 5 (not recommended for EBS – see documentation)
RAID 6 (not recommended for EBS – see documentation)

RAID 0
Increases performance
Combines 2 or more volumes; data is striped across the disks. If one disk fails, all the data is lost
Using this, we can get one very big disk with a lot of IOPS
RAID 1
Increases fault tolerance
Combines 2 or more volumes; data is written to all volumes (mirroring) at the same time. If one disk fails, the logical volume still works

Instance Store
= ephemeral storage
Physically attached to the server hosting your EC2 instance, hence very high IOPS (EBS, by contrast, is a network drive)
Pros
Better I/O performance
Good for buffer / cache / scratch data / temporary content
Data survives reboots
Cons
On stop / termination, the instance store is lost
you can’t resize the instance store
Backup must be operated by the user

EFS – Elastic File System
EFS only supports Linux systems
Managed NFS (Network File System); can be mounted on many EC2 instances across AZs
Pay per use
Uses NFSv4.1 protocol
Security Group to control access to EFS
Compatible with Linux (not windows), with POSIX file system
Encryption at rest using KMS
It scales automatically, pay-per-use, no capacity planning
You can control access to files and directories with POSIX-compliant user and group-level permissions. POSIX permissions allows you to restrict access from hosts by user and group.
EFS Security Groups act as a firewall, and the rules you add define the traffic flow

EFS – Performance & Storage Classes
EFS Scale
Grow to Petabyte-scale network file system, automatically
Performance Mode (set at EFS creation time)
General Purpose (default)
Max I/O
Storage Tiers (lifecycle management feature – move file after N days)
Standard
Infrequent access (EFS-IA)

RDS – Relational Database Service
Scalability – Vertical & Horizontal
Storage backed by EBS (gp2 or io1)
But you can’t SSH into your instances
DB snapshots are manually triggered by the user; you can retain the backup as long as you want
Backup retention period is 1 to 35 days
RDS fully supports the InnoDB storage engine for MySQL DB instances. RDS features such as Point-In-Time restore and snapshot restore require a recoverable storage engine and are supported for the InnoDB storage engine only

RDS – Read Replicas for Read Scalability
Up to 5 read replicas, within an AZ, cross-AZ, or cross-region
Replication is ASYNC, so reads are eventually consistent
Applications must update the connection string to leverage read replicas
Read replicas serve only SELECT (read) statements
A read replica within the same AZ is free; the same (async) replication across AZs has a cost
Read replicas are not supported by MS SQL Server & Oracle

RDS – Disaster Recovery (Multi-AZ)
SYNC replication
One DNS name – automatic app failover to the standby
Increases availability; failover in case of loss of AZ, loss of network, or instance/storage failure
A read replica can be set up as Multi-AZ for Disaster Recovery (DR)

RDS Security – Encryption
At rest encryption
Possibility to encrypt the master & read replica with AWS KMS – AES-256 encryption
Encryption has to be defined at launch time
if the master is not encrypted, the read replicas cannot be encrypted
Transparent Data Encryption (TDE) available for Oracle & SQL Server
In-flight encryption
SSL Certificates to encrypt data to RDS in flight
Encryption at rest
Is done only when you first create the DB instance
or
unencrypted DB ==> snapshot ==> copy snapshot as encrypted ==> create DB from snapshot
Your Responsibility
check the ports / IP / SG inbound rules in DB’s SG
In-database user creation and permissions or manage through IAM
Creating a database with or without public access
Ensure parameter groups or DB is configured to only allow SSL connections
AWS responsibility
No SSH access
no manual DB patching
No manual OS patching
No way to audit the underlying instance
Enable IAM DB Authentication: Only with MySQL and PostgreSQL
You can authenticate to your DB instance using AWS Identity and Access Management (IAM) database authentication. IAM database authentication works with MySQL and PostgreSQL. With this authentication method, you don’t need to use a password when you connect to a DB instance. Instead, you use an authentication token.
Benefits:
Network traffic to and from the database is encrypted using Secure Sockets Layer (SSL).
You can use IAM to centrally manage access to your database resources, instead of managing access individually on each DB instance.
For applications running on Amazon EC2, you can use profile credentials specific to your EC2 instance to access your database instead of a password, for greater security

Amazon Aurora
Aurora can have 15 replicas while MySQL has 5 (master + up to 15 read replicas)
Aurora costs more than RDS (20% more) – but is more efficient
Support for cross region replication (Aurora Global Database)

AWS ElastiCache
ElastiCache is managed Redis or Memcached (high performance)
Write scaling using Sharding
Read scaling using Read Replicas
Multi-AZ with Failover capability

ElastiCache – Cache Security
All caches:
supports SSL in flight encryption
Do not support IAM authentication
IAM Policies are only used for AWS API level security
Redis AUTH
you can set password/token when you create a Redis cluster
This is an extra level of security for your cache (on top of Security Groups)
Memcached
Supports SASL-based authentication (advanced)

Patterns for ElastiCache
Lazy loading: all the read data is cached, data can become stale in cache
Write through: adds or update data in the cache when written to a DB (no stale data)
Session store: store temporary session data in a cache (using TTL feature)
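The lazy-loading and write-through patterns can be sketched in plain Python, with one dict standing in for the cache and another for the database (keys and values are made up):

```python
cache, db = {}, {"user:1": "alice"}

def get_lazy(key):
    """Lazy loading (cache-aside): read the cache first; on a miss,
    read the DB and populate the cache. Cached data can go stale if
    the DB changes behind the cache's back."""
    if key not in cache:
        cache[key] = db[key]          # cache miss -> load from DB
    return cache[key]

def put_write_through(key, value):
    """Write-through: update the DB and the cache together, so the
    cache never holds stale data for keys written this way."""
    db[key] = value
    cache[key] = value

print(get_lazy("user:1"))             # miss -> loads "alice" into the cache
put_write_through("user:2", "bob")
print(cache["user:2"])                # already cached, no DB read needed
```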

PostgreSQL: Doesn’t support TDE
Oracle: Doesn’t support IAM Authentication

Route53
AWS common records are –
A: hostname to IPv4
AAAA: hostname to IPv6
CNAME: hostname to hostname
Alias: hostname to AWS resource
Advanced features –
Load Balancing (through DNS – also called client load balancing)
Health Checks (limited)
Routing Policy: Simple, Failover, Geolocation, Latency, Weighted, Multi Value
You pay $0.50 per month per hosted zone

CNAME
Points a hostname to any other hostname (app.mydomain.com => blabla.anything.com)
ONLY FOR NON ROOT DOMAIN (ex: something.mydomain.com)
Alias
Points a hostname to AWS resource (app.mydomain.com => blabla.amazonaws.com)
Works for ROOT DOMAIN and NON ROOT DOMAIN (aka mydomain.com)
Free of charge
Health Check

Solution Architect – Instantiating Applications quickly

S3 Encryption
4 methods of encryption
SSE-S3: encrypts S3 objects using keys handled & managed by AWS
Object encrypted server side, AES-256 encryption type
SSE-KMS: leverage AWS Key Management Service to manage encryption keys
Object encrypted server side, KMS = user control + audit trail
SSE-C: when you want to manage your own encryption keys
S3 does not store the encryption key you provide
HTTPS must be used, and the encryption key must be provided in HTTP headers with every HTTP request made
Client side encryption
the client must encrypt data before sending it to S3, and decrypt it after retrieving it from S3

S3 – Security
User based
IAM Policy
Resource based
Bucket Policies – bucket wide rules from the S3 console – allow cross account
Object Access Control List (ACL) – finer grain
Bucket Access Control List (ACL) – less common

Note: an IAM principal can access an S3 object if
– the user IAM permissions allow it OR the resource policy ALLOWS it
– AND there’s no explicit DENY
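The evaluation rule above reduces to one boolean expression; a minimal sketch (function and parameter names are illustrative, not an AWS API):

```python
def s3_access_allowed(iam_allows, resource_policy_allows, explicit_deny):
    """Sketch of the rule stated in the note: access is granted if the
    IAM permissions OR the resource policy allow it, AND there is no
    explicit DENY anywhere."""
    return (iam_allows or resource_policy_allows) and not explicit_deny

print(s3_access_allowed(True, False, False))   # True
print(s3_access_allowed(False, False, False))  # False
print(s3_access_allowed(True, True, True))     # False (explicit deny wins)
```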

S3 – Event Notifications
You can configure S3 Event Notifications to target the SNS, SQS, or Lambda services.

CLI, IAM Roles & Policies

Managed policies have built-in versioning support

Command – location where region info / config folder placed after installation of CLI?
cat ~/.aws/config

Command – where exactly credentials are stored in local machine (after user login by aws configure … )?
cat ~/.aws/credentials

EC2

EC2 Instance Metadata
URL – http://169.254.169.254/latest/meta-data/
Meta-data = info about the EC2 Instance
User-data = launch script of the EC2 Instance
The Instance Metadata Query tool allows you to query the instance metadata without having to type out the full URI or category names

EC2 Instance Types / Launch Modes
1. On Demand
2. Spot (Batch job, Data analysis, image processing.., not for critical or database)
3. Reserved
a. Standard b. Convertible c. Scheduled
4. Dedicated Hosts (for software that has a complicated licensing model (BYOL), or for companies with strong regulatory or compliance needs)
5. Dedicated Instances (may share hardware with other instances in the same account; no control over instance placement (hardware can change after stop & start))

EC2 Instance Types
R: applications that need a lot of RAM –> in-memory caches
C: applications that need good CPU –> compute / databases
M: applications that are balanced (think “medium”) –> general / web apps
I: applications that need good local I/O (instance storage) –> databases
G: applications that need a GPU –> video rendering / machine learning

T2 / T3: burstable instances (up to capacity)
T2 / T3 - unlimited: unlimited burst

General: T2, M3, M4, M5
Compute Optimized: C4, C5
Graphics / HPC: P2, P3, G3
Memory Optimization: X1, R3, R4
Storage: I3, H1, D2

EC2 – Cross Account AMI Copy
Can share an AMI with another AWS Account
Sharing an AMI does not affect the ownership of the AMI
If you copy an AMI that has been shared with your account, you are the owner of the target AMI in your account
To copy an AMI that was shared with you from another account, the owner of the source AMI must grant you read permissions for the storage that backs the AMI, either the associated EBS snapshot (for an Amazon EBS-backed AMI) or an associated S3 bucket (for an instance store-backed AMI).
Limits:
You can’t copy an encrypted AMI that was shared with you from another account. Instead, if the underlying snapshot and encryption key were shared with you, you can copy the snapshot while re-encrypting it with a key of your own. You own the copied snapshot, and can register it as a new AMI.
You can’t copy an AMI with an associated billingProduct code that was shared with you from another account. This includes Windows AMIs and AMIs from the AWS Marketplace. To copy a shared AMI with a billingProduct code, launch an EC2 instance in your account using the shared AMI and then create an AMI from the instance.

EC2 Placement Groups
Cluster – High Performance applications
Spread – Critical applications
Partition – Distributed applications

EC2 Hibernate
Supported instance families – C,M & R
Instance RAM size – must be less than 150 GB
Instance size – not supported for bare metal instances
AMI – Amazon Linux 2, Linux AMI, Ubuntu.. & Windows
Root Volume – must be EBS, encrypted, not instance store, and large enough to hold the RAM contents
Available for On-Demand and Reserved Instances
An instance cannot be hibernated for more than 60 days

S3 Storage Classes
Amazon S3 Standard – General purpose
Amazon S3 Standard – Infrequent Access (IA) (30-day minimum)
Amazon S3 One Zone – Infrequent Access (30-day minimum)
Amazon S3 Intelligent-Tiering (30-day minimum)
Amazon Glacier (90-day minimum)
Amazon Glacier Deep Archive (180-day minimum)
Amazon S3 Reduced Redundancy Storage (deprecated)

S3 URL format –
https://s3-(region).amazonaws.com/(bucketname)

AWS CloudFront
216 Points of Presence globally (Edge Locations)
DDoS protection, integration with AWS Shield and the AWS Web Application Firewall
Amazon CloudFront can be used to stream video to users across the globe using a wide variety of protocols that are layered on top of HTTP. This can include both on-demand video as well as real time streaming video.

CloudFront’s invalidation
Amazon CloudFront’s invalidation feature, which allows you to remove an object from the CloudFront cache before it expires, now supports the * wildcard character. You can add a * wildcard character at the end of an invalidation path to remove all objects that match this path. In the past, when you wanted to invalidate multiple objects, you had to list every object path separately. Now, you can easily invalidate multiple objects using the * wildcard character.

CloudFront – Geo Restriction
You can restrict who can access your distribution
Whitelist
Blacklist
The “country” is determined using a 3rd party Geo-IP database

CloudFront vs S3 Cross Region Replication
CloudFront
Global Edge Network, Files are cached for a TTL
Great for static content that must be available everywhere
S3 Cross Region Replication
Must be set up for each region you want to replicate to
Files are updated in near real time; replicas are read-only
Great for dynamic content that needs to be available at low latency in a few regions

CloudFront – Signed URL / Signed Cookies
signed URL = access to individual files (one signed URL per file)
Signed Cookies = access to multiple files (one signed cookie for many files)

CloudFront Signed URL vs S3 Pre-Signed URL
CloudFront Signed URL:
Allow access to path, no matter the origin
Account wide key-pair, only the root can manage it
Can filter by IP, path, date, expiration
Can leverage caching features
S3 Pre-Signed URL:
Issue a request as the person who pre-signed the URL
Uses the IAM key of the signing IAM Principal
Limited lifetime
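Conceptually, a pre-signed URL is the object URL plus an expiry and an HMAC signature derived from the signer's secret key. AWS's real scheme is Signature Version 4; this toy sketch (all names and the key are made up) only illustrates the limited-lifetime idea:

```python
import hashlib
import hmac

SECRET = b"toy-secret-key"  # stands in for the signer's secret access key

def toy_presign(url, expires_at):
    """Sign url so it is valid until the expires_at Unix timestamp."""
    msg = f"{url}?Expires={expires_at}".encode()
    sig = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return f"{url}?Expires={expires_at}&Signature={sig}"

def toy_verify(signed_url, now):
    """Accept only if the signature matches and the URL has not expired."""
    base, _, query = signed_url.partition("?")
    params = dict(p.split("=") for p in query.split("&"))
    expires_at = int(params["Expires"])
    expected = toy_presign(base, expires_at)
    return signed_url == expected and now < expires_at

signed = toy_presign("https://bucket.example/key.txt", expires_at=1_700_000_000)
print(toy_verify(signed, now=1_699_999_999))  # True: signature ok, not expired
print(toy_verify(signed, now=1_700_000_001))  # False: past the expiry
```

Tampering with either the path or the expiry invalidates the signature, which is why a pre-signed URL grants access only to that one object for that one window.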


ECS, ECR & Fargate – Docker in AWS
To manage containers, we need a container management platform. 3 choices
1. ECS (Classic): Amazon’s own platform, provision EC2 instances to run containers onto
2. Fargate: Amazon’s own serverless platform, ECS Serverless, no more EC2 to provision
3. EKS: Amazon’s managed Kubernetes (open source)

ECS Classic
EC2 instances must be created
We must configure the file “/etc/ecs/ecs.config” with the cluster name
the EC2 instance must run an ECS agent
EC2 instances can run multiple containers of the same type:
You must NOT specify a host port (only container port)
You should use an ALB with the Dynamic Port Mapping
The EC2 instance Security Group must allow traffic from the ALB on all ports
ECS tasks can have IAM Roles to execute actions against AWS
Security Group operates at the Instance Level, not task level

ECS Task Definitions
Task Definitions are metadata in JSON form to tell ECS how to run a Docker Container.
It contains crucial information around:
Image Name
Port Binding for Container and Host
Memory and CPU required
Environment variables
Networking information
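The task definition fields listed above can be sketched as a minimal JSON document (field names follow the ECS API; the family name, image, sizes and environment variable are illustrative values, not from the notes):

```python
import json

# Minimal ECS task definition sketch. Note there is no hostPort in the
# port mapping: leaving it out keeps the host port dynamic, which is what
# ALB Dynamic Port Mapping relies on.
task_definition = {
    "family": "my-web-app",                      # hypothetical family name
    "containerDefinitions": [
        {
            "name": "web",
            "image": "nginx:latest",             # illustrative image
            "memory": 256,                       # hard memory limit in MiB
            "cpu": 128,                          # CPU units (1024 = 1 vCPU)
            "portMappings": [
                {"containerPort": 80, "protocol": "tcp"}
            ],
            "environment": [
                {"name": "STAGE", "value": "dev"}
            ],
        }
    ],
}

print(json.dumps(task_definition, indent=2))
```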

ECS Service
ECS Services help define how many tasks should run and how they should be run
They ensure that the number of tasks desired is running across our fleet of EC2 instances
They can be linked to ELB / NLB / ALB if needed

ECR
So far we’ve been using Docker images from Docker Hub (public)
ECR is a private Docker image repository
Access is controlled through IAM (permission errors => check the policy)
AWS CLI v1 login command
AWS CLI v2 login command
Docker Push & Pull
In case an EC2 Instance (or you) cannot pull a Docker image, check IAM

ECS IAM Roles Deep Dive
EC2 Instance Profile:
Used by the ECS AGENT
Makes API calls to ECS Service
Send container logs to CloudWatch Logs
Pull Docker image from ECR
ECS Task Role:
Allow each task to have a specific role
Use different roles for the different ECS Services you run
Task Role is defined in the TASK DEFINITION

IAM Roles for ECS:
    1. AWSServiceRoleForECS
    2. ecsInstanceRole
    3. ecsServiceRole
    4. ecsTaskExecutionRole

ECS Task Placement
When a task of type EC2 is launched, ECS must determine where to place it, within the constraints of CPU, MEMORY and AVAILABLE PORT
Similarly, when a service scales in, ECS needs to determine which task to terminate
To assist with this, you can define a TASK PLACEMENT STRATEGY and TASK PLACEMENT CONSTRAINTS
Note: this is only for ECS with EC2, not for Fargate

ECS Task Placement Process:
    1. Identify the instances that satisfy the CPU, memory and port requirements in the task definition
    2. Identify the instances that satisfy the task placement constraints
    3. Identify the instances that satisfy the task placement strategies
    4. Select the instances for task placement

ECS Task Placement Strategies:
    1. Binpack
        Place tasks based on the least available amount of CPU or memory
        This minimizes the number of instances in use (cost savings)
    2. Random
        Place the task randomly
    3. Spread
        Place the task evenly based on the specified value. ex: instanceId, attribute:ecs.availability-zone
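The three strategies can be illustrated with a toy scheduler (instance data and field names are made up; this is not the real ECS scheduler):

```python
import random

# Made-up fleet of container instances with remaining capacity
instances = [
    {"id": "i-1", "free_cpu": 512,  "free_mem": 900},
    {"id": "i-2", "free_cpu": 1024, "free_mem": 400},
    {"id": "i-3", "free_cpu": 256,  "free_mem": 700},
]

def place(task_cpu, task_mem, strategy="binpack"):
    # steps 1-2: keep only instances that satisfy CPU/memory requirements
    candidates = [i for i in instances
                  if i["free_cpu"] >= task_cpu and i["free_mem"] >= task_mem]
    if not candidates:
        return None
    if strategy == "binpack":
        # least available memory first -> packs tasks tightly, fewer instances
        return min(candidates, key=lambda i: i["free_mem"])["id"]
    if strategy == "random":
        return random.choice(candidates)["id"]
    if strategy == "spread":
        # most available memory first -> toy stand-in for even spreading
        return max(candidates, key=lambda i: i["free_mem"])["id"]

print(place(128, 300, "binpack"))   # i-2: the fullest instance that still fits
print(place(128, 300, "spread"))    # i-1: the emptiest instance
```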

ECS Task Placement Constraints
    1. distinctInstance
        place each task on a different container instance
    2. memberOf
        places task on instances that satisfy an expression
        Uses the Cluster Query Language (advanced)

ECS - Auto Scaling Service
    1. Target tracking      2. Step scaling     3. Scheduled scaling
ECS Cluster Auto Scaling through Capacity Providers

Fargate
Fargate is Serverless (no EC2 to manage)
AWS provisions containers for us and assigns them an ENI
Fargate containers are provisioned by the container spec (CPU / RAM)
Fargate tasks can have IAM Roles to execute actions against AWS

AWS Elastic Beanstalk
It uses all the components we have seen before – EC2, ASG, ELB, RDS, etc.,
we will have full control over the configuration
Beanstalk is free, but you pay for the underlying instances

Managed Service
    Instance configuration / OS is handled by Beanstalk
    Deployment strategy is configurable but performed by Elastic Beanstalk
Just the application code is responsibility of the Developer
3 Architecture models
    Single Instance deployment: good for dev
    LB + ASG: great for production or pre-prod web applications
    ASG only: great for non-web apps in production (workers, etc.,)
3 Components in Beanstalk
    Application
    Application Version: each deployment gets assigned a version
    Environment Name (dev, test, prod..): free naming
You deploy application versions to environments and can promote application versions to the next environments
Rollback feature to previous application version
Full control over lifecycle of environments

Beanstalk Deployment Options for Updates
All at once (deploy all in one go) – fastest, but instances aren’t available to serve traffic for a bit (downtime)
Rolling – update a few instances at a time (bucket), and then move onto the next bucket once the first bucket is healthy
Rolling with additional batches – like rolling, but spins up new instances to move the batch (so that the old application is still available)
Immutable – spins up new instances in a new ASG, deploys version to these instances, and then swaps all the instances when everything is healthy

Decoupling Applications – SQS, SNS, Kinesis, Active MQ
SQS – queue model
SNS – pub/sub model
Kinesis – real-time streaming model

SQS – Standard Queue (decouple application)
SQS is a service that is used for decoupling applications, thus reducing interdependencies, through a message bus
Amazon SQS is pull-based (polling) not push-based (use SNS for push-based).
Fully managed, No limit for messages in the queue, Low latency, Horizontal Scaling
Default retention of message: 4 days, maximum is 14 days
Can have duplicate messages, out of order messages
Limitation of 256KB per message sent
Security
Encryption:
In-flight encryption using HTTPS API
At-rest encryption using KMS keys
Client-side encryption if the client wants to perform encryption/decryption itself
Access Controls:
IAM Policies to regulate access to the SQS API
SQS Access Policies: (similar to S3 bucket policies)
Useful for cross-account access to SQS queues
Useful for allowing other services (SNS, S3…) to write to an SQS queue

Queue Types:
Delay Queue
Delay a message (consumers don’t see it immediately) up to 15 mins
Default is 0 seconds (message is available right away)
can override the default using the DelaySeconds parameter
Producing Messages
Define body
Optional – add message attributes (metadata), provide Delay Delivery
Get back – Message identifier, MD5 hash of the body
Consuming Messages
for Consumers
Poll SQS for messages (receive up to 10 messages at a time)
Process the message within the visibility timeout, Delete the message using the message ID & receipt handle
Visibility Timeout (important for the exam)
When a consumer polls a message from queue, the message is “invisible” to other consumers for a defined period… the Visibility Timeout:
Set between 0 seconds and 12 hours (default 30 seconds)
If too high (15 minutes) and consumer fails to process the message, you must wait a long time before processing the message again
If too low (30 seconds) and consumer needs time to process the message (2 minutes), another consumer will receive the message and the message will be processed more than once
ChangeMessageVisibility API to change the visibility timeout while processing a message
DeleteMessage API to tell SQS the message was successfully processed
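The visibility-timeout behavior above can be sketched as a toy in-memory queue (illustrative only, not the SQS API):

```python
# Toy model: after a consumer receives a message it becomes invisible to
# other consumers until the timeout expires or the message is deleted.
class ToyQueue:
    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self.messages = {}           # id -> body
        self.invisible_until = {}    # id -> clock time when it reappears
        self.clock = 0               # simulated wall clock in seconds

    def send(self, msg_id, body):
        self.messages[msg_id] = body

    def receive(self):
        for msg_id, body in self.messages.items():
            if self.invisible_until.get(msg_id, 0) <= self.clock:
                # hand the message out and hide it for the visibility timeout
                self.invisible_until[msg_id] = self.clock + self.visibility_timeout
                return msg_id, body
        return None

    def delete(self, msg_id):
        # consumer finished processing -> remove the message for good
        self.messages.pop(msg_id, None)

q = ToyQueue(visibility_timeout=30)
q.send("m1", "hello")
print(q.receive())      # ('m1', 'hello') - first consumer gets it
print(q.receive())      # None - invisible to a second consumer
q.clock += 31           # consumer failed to delete within the timeout...
print(q.receive())      # ('m1', 'hello') - redelivered (possible duplicate)
```

This is exactly the "too low" failure mode from the notes: a slow consumer that doesn't delete in time causes the message to be processed more than once.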
Dead Letter Queue
We can set a threshold of how many times a message can go back to the queue – it’s called a “Redrive Policy”
After the threshold is exceeded, the message goes into a dead letter queue (DLQ)
Make sure to process the message in the DLQ before they expire
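The Redrive Policy is a small JSON document attached to the source queue; a minimal sketch (the attribute names are the real SQS ones, the queue ARN is a placeholder):

```python
import json

# Sketch of an SQS Redrive Policy: after maxReceiveCount failed receives,
# SQS moves the message to the dead letter queue.
redrive_policy = {
    "deadLetterTargetArn": "arn:aws:sqs:us-east-1:123456789012:my-dlq",
    "maxReceiveCount": "5",   # threshold before the message goes to the DLQ
}

# The policy is attached to the source queue as a JSON string attribute
queue_attributes = {"RedrivePolicy": json.dumps(redrive_policy)}
print(queue_attributes["RedrivePolicy"])
```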
Long Polling (important for the exam)
When a consumer requests message from the queue, it can optionally “wait” for messages to arrive if there are none in the queue – This is called Long Polling
Long Polling decreases the number of API calls made to SQS while increasing the efficiency and decreasing the latency of your application
The wait time can be between 1 to 20 seconds (20 sec preferable)
Long Polling is preferable to Short Polling (20 sec is preferable to 1 sec)
Long Polling can be enabled at the queue level or at the API level using WaitTimeSeconds

SQS – FIFO Queue
Not available in all regions; the name of the queue must end in .fifo
Messages are processed in order by the consumer, messages are sent exactly once
No per message delay, there is only per queue delay
Ability to do content based de-duplication

SNS
SNS uses push-based not Polling (SQS uses Polling / Pull-based)
Subscribers can be:
SQS, HTTP / HTTPS, Lambda, Emails, SMS Messages, Mobile Notifications
SNS integrates with AWS products:
Some services can send data directly to SNS for notifications
CloudWatch (alarms)
Auto Scaling Group Notifications
Amazon S3 (on bucket events)
CloudFormation
SNS – How to Publish
Topic Publish (within your AWS account – using the SDK)
Direct Publish (for mobile apps SDK)
SNS + SQS = Fan Out
Push once in SNS, receive in many SQS
Fully decoupled, No data loss
SQS allows for: data persistence, delayed processing & retries of work
May have many workers on one queue and one worker on the other queue
SNS cannot send messages to SQS FIFO queues (AWS limitation)
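The fan-out pattern above can be sketched in plain Python (a toy pub/sub model of SNS + SQS, not the AWS SDK):

```python
from collections import deque

# Toy fan-out: one publish to a topic lands a copy of the message in every
# subscribed queue, so each consumer group processes it independently.
class Topic:
    def __init__(self):
        self.subscribers = []      # list of queues (deques)

    def subscribe(self, queue):
        self.subscribers.append(queue)

    def publish(self, message):
        for queue in self.subscribers:
            queue.append(message)  # every queue gets its own copy

orders_topic = Topic()
fraud_queue, shipping_queue = deque(), deque()
orders_topic.subscribe(fraud_queue)
orders_topic.subscribe(shipping_queue)

orders_topic.publish({"order_id": 42})
print(len(fraud_queue), len(shipping_queue))   # 1 1 - one copy each
```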

AWS Kinesis (real-time big data, analytics and ETL)
Kinesis is a managed alternative to Apache Kafka
Great for application logs, metrics, IoT, clickstreams and great for “real-time” big data and great for streaming processing framework (spark)
Data is automatically replicated to 3 AZ

Kinesis Streams: low latency streaming ingest at scale
Kinesis Analytics: perform real-time analytics on streams using SQL
Kinesis Firehose: load streams into S3, Redshift, ElasticSearch

Kinesis Streams 
Streams are divided in ordered Shards / Partitions
Data retention is 1 day by default, can go up to 7 days
Once data is inserted in Kinesis, it can't be deleted (immutability)

Kinesis Streams Shards
One stream is made of many different Shards
Billing is per shard provisioned 
Records are ordered per shard
1 MB/s or 1000 message/s at write PER SHARD 
2 MB/s at read PER SHARD 
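The per-shard limits above translate directly into a shard-count estimate (a back-of-the-envelope helper, not an AWS API):

```python
import math

# Per-shard limits from the notes: 1 MB/s or 1000 records/s on writes,
# 2 MB/s on reads. The stream needs enough shards to satisfy all three.
def shards_needed(write_mb_s, write_records_s, read_mb_s):
    return max(
        math.ceil(write_mb_s / 1.0),        # 1 MB/s write per shard
        math.ceil(write_records_s / 1000),  # 1000 records/s write per shard
        math.ceil(read_mb_s / 2.0),         # 2 MB/s read per shard
    )

print(shards_needed(3.5, 2500, 5))   # 4 - write bandwidth dominates here
```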

Kinesis Security
Control access / authorization using IAM policies
Encryption inflight using HTTPS endpoints
Encryption at rest using KMS
Possibility to encrypt / decrypt data client side (harder)
VPC endpoints available for Kinesis to access within a VPC

AWS Kinesis Data Firehose
Fully managed service, no administration, automatic scaling, serverless
Load data into Redshift / Amazon S3 / ElasticSearch / Splunk
Near RealTime (60 secs latency minimum for non full batches, or minimum 32 MB of data at a time)
Pay for the amount of data going through Firehose
RedShift is an Online Analytical Processing (OLAP) type of DB

Kinesis Data Streams vs Firehose
Kinesis Streams stores data for up to 7 days; Kinesis Firehose does not store data, it delivers it
Streams
    Going to write custom code (producer / consumer)
    RealTime (~200 ms)
    Must manage scaling (shard splitting / merging)
    Data Storage for 1 to 7 days, replay capability multi consumers

Firehose
    Fully managed, send to S3, Splunk, Redshift, ElasticSearch
    Serverless data transformations with Lambda
    Near real time (lowest buffer time is 1 minute)
    Automated Scaling, No data Storage

Kinesis Data Analytics
    Perform real time analytics on Kinesis Streams using SQL
    Auto Scaling, Managed: no servers to provision, Continuous: real time
    Pay for actual consumption rate 
    Can create streams out of the real-time query

Amazon MQ
Protocols – MQTT, AMQP, STOMP, OpenWire, WSS
Amazon MQ = managed Apache ActiveMQ
Amazon MQ doesn’t “scale” as much as SQS / SNS
Runs on a dedicated machine, can run in HA with failover
has both Queue feature (~SQS) and topic feature (~SNS)
It supports industry-standard APIs and protocols so you can switch from any standards-based message broker to Amazon MQ without rewriting the messaging code in your applications.

Serverless Architecture
Serverless == FaaS (Function as a Service)

Serverless in AWS
AWS Lambda
DynamoDB
AWS Cognito
AWS API Gateway
Amazon S3
AWS SNS & SQS
AWS Kinesis Data Firehose
Aurora Serverless
Step Functions
Fargate (Docker)

Lambda
Increasing RAM will also improve CPU and network
Docker is not for AWS Lambda, it’s for ECS / Fargate
Lambda tracks the number of requests, the latency per request, and the number of requests resulting in an error in CloudWatch

Throttle behavior:
    if Synchronous invocation => return Throttle Error - 429
    if asynchronous invocation => retry automatically, then go to DLQ

Lambda – Limits – Per Region
Execution
Memory Allocation: 128 MB – 3008 MB (64 MB increments)
Maximum execution time: 900 seconds (15 minutes)
Environment variables (4 KB)
Disk capacity in the “function container” (in /tmp): 512 MB
Concurrency executions: 1000 (can be increased)
Deployment
Lambda function deployment size (compressed .zip): 50 MB
Size of uncompressed deployment (code + dependencies): 250 MB
Can use the /tmp directory to load other files at startup
Size of the environment variables: 4 KB

DynamoDB
item = row
attribute = column
Dynamo DB is a fully managed NoSQL (schema-less) database
Provides two read models: eventually consistent reads (Default) and strongly consistent reads
DynamoDB stores structured data in tables, indexed by a primary key
Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability. Push button scaling means that you can scale the DB at any time without incurring downtime. DynamoDB provides low read and write latency.

DynamoDB – Provisioned Throughput
Read Capacity Units (RCU): throughput for reads ($0.00013 per RCU)
1 RCU = 1 strongly consistent read of 4 KB per second
1 RCU = 2 eventually consistent reads of 4 KB per second
Write Capacity Units (WCU): throughput for writes ($0.00065 per WCU)
1 WCU = 1 write of 1 KB per second
Option to setup auto-scaling of throughput to meet demand
Throughput can be exceeded temporarily using “burst credit”. If burst credits are empty, you’ll get a “ProvisionedThroughputExceededException”
It’s then advised to do an exponential back-off retry
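The RCU/WCU arithmetic above fits in two small helpers (item sizes round up to the 4 KB / 1 KB block, as stated):

```python
import math

# 1 RCU = 1 strongly consistent 4 KB read/s (or 2 eventually consistent)
# 1 WCU = 1 write of 1 KB/s
def rcus(item_kb, reads_per_s, strongly_consistent=True):
    units = math.ceil(item_kb / 4) * reads_per_s
    return units if strongly_consistent else math.ceil(units / 2)

def wcus(item_kb, writes_per_s):
    return math.ceil(item_kb) * writes_per_s

print(rcus(6, 10))          # 20 - 6 KB rounds up to 2 x 4 KB blocks
print(rcus(6, 10, False))   # 10 - eventually consistent halves the cost
print(wcus(2.5, 4))         # 12 - 2.5 KB rounds up to 3 x 1 KB blocks
```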

DynamoDB – DAX
DAX = DynamoDB Accelerator (cache)
writes go through DAX to DynamoDB
Solves the Hot Key problem (too many reads)
5 minutes TTL for cache by default
Up to 10 nodes in the cluster
Multi AZ (3 nodes minimum recommended for Production)
Secure (Encryption at rest with KMS, VPC, IAM, CloudTrail…)

DynamoDB Streams
Changes in DynamoDB (Create, Update, Delete) can end up in a DynamoDB Stream
This stream can be read by AWS Lambda, & we can then do –
React to changes in real time (welcome email to new users)
Analytics
Create derivative tables / views
Insert into ElasticSearch
Could implement cross region replication using Streams
Stream has 24 hours of data retention

DynamoDB – New Features
Transactions (new from Nov 2018)
All or Nothing type of operations
Coordinated Insert, Update & Delete across multiple tables
Include up to 10 unique items or up to 4 MB of data
On Demand (new from Nov 2018)
No capacity planning needed (WCU/RCU) – Scales automatically
2.5x more expensive than provisioned capacity (use with care)
Helpful when spikes are un-predictable or the application is very low throughput

DynamoDB – Security & Other Features
Security
VPC endpoints available to access DynamoDB without Internet
Access fully controlled by IAM
Encryption at rest using KMS
Encryption in transit using SSL / TLS
Backup and Restore Feature available
Point in time restore like RDS
No performance impact
Global Tables
Multi region, fully replicated, high performance
Amazon DMS can be used to migrate to DynamoDB from MongoDB, Oracle, MySQL, S3… etc

DynamoDB – Other Features
Global Tables: (CRR)
Active Active replication, many regions
Must enable DynamoDB Streams
Useful for low latency, DR purpose
Capacity Planning
Planned Capacity: provision WCU & RCU, can enable auto scaling
On-demand Capacity: get unlimited WCU & RCU, no throttle, more expensive

API Gateway
Support for Web Sockets
Handle security (Authentication & Authorization)
Create API Keys, handle request throttling
Swagger / Open API import to quickly define APIs
Cache API responses

API Gateway – Integrations High Level
Lambda Function
Invoke Lambda function
Easy way to expose REST API backed by AWS Lambda
HTTP
Expose HTTP endpoints in the backend. ex: internal HTTP API on premise, ALB..
why this? Add rate limiting, caching, user authentications, API keys etc.
AWS Service
Start an AWS Step Function workflow, Post a message to SQS
why this? Add authentication, deploy publicly, rate control…

API Gateway – Endpoint Types
Edge Optimized (default): For global clients
Request are routed through the CloudFront Edge Locations (improves latency)
The API Gateway still lives in only one region
Regional
For clients within the same region
could manually combine with CloudFront (more control over the caching strategies and the distribution)
Private
Can only be accessed from your VPC using an interface VPC endpoint (ENI)
Use a resource policy to define access

API Gateway – Security (Important for the exam)
IAM:
Great for user/roles already within your AWS account
Handle Authentication + Authorization
Leverages Sig v4
Custom Authorizer:
Great for 3rd party tokens (OAuth / SAML)
Very flexible in terms of what IAM policy is returned
Handle Authentication + Authorization
Pay per Lambda invocation
Cognito User Pool:
You manage your own user pool (can be backed by Facebook, Google login etc)
No need to write any custom code
Must implement authorization in the backend

AWS Cognito (Important for the exam)
Cognito User Pools (CUP):
Sign in functionality for app users
Integrate with API Gateway
Create a serverless database of users for your mobile apps
Sends back a JSON Web Token (JWT)
Cognito Identity Pools (Federated Identity Pool):
Provide AWS credentials to users so they can access AWS resources directly
Integrate with Cognito User Pools as an identity provider
Cognito Sync: (Out of scope for the exam)
Synchronize data from device to Cognito
May be deprecated and replaced by AppSync
Store preferences, configuration, state of app, Offline capability (synchronization when back online)
Cross device synchronization (any platform – iOS, Android, etc..)
Requires a Federated Identity Pool in Cognito (not a User Pool)

Serverless Application Model
Framework for developing & deploying serverless applications
All the configuration is YAML code
Lambda function
DynamoDB Tables
API Gateway
Cognito User Pools
SAM can use CodeDeploy to deploy Lambda functions

AWS Databases
Database Types
RDBMS (=SQL/OLTP): RDS, Aurora – great for joins
NoSQL Database: DynamoDB (~JSON), ElastiCache (Key/Value pairs), Neptune (graphs) – no joins, no SQL
Object Store: S3 (for big objects) / Glacier (for backups/archives)
Data Warehouse (=SQL Analytics/BI): RedShift(OLAP), Athena
Search: ElasticSearch(JSON) – free text, unstructured searches
Graphs: Neptune – displays relationship between data

ElastiCache Overview
Managed Redis / Memcached (similar offering as RDS, but for cache)
In-memory data store, sub-millisecond latency
Must provision an EC2 instance type
Support for Clustering (Redis) and Multi AZ, Read Replicas (Sharding)
Security – IAM, Security Groups, KMS, Redis Auth
Backup / Snapshot / Point in time restore feature, Managed & Scheduled maintenance
Monitoring through CloudWatch

Athena
Uses Presto engine

Redshift
Redshift is based on PostgreSQL, but it’s not used for OLTP
It’s OLAP – Online Analytical Processing (analytics and data warehousing)
Columnar storage of data (instead of row based)
Massively Parallel Query Execution (MPP), highly available
BI tools integration with AWS Quicksight or Tableau
Nodes – 2 Types
Leader Node: for query planning, results aggregation
Compute Node for performing the queries, send results to leader
Redshift Spectrum: Perform queries directly against S3 (no need to load the data)
Snapshot – You can configure RedShift to automatically copy snapshots (automated or manual) of a cluster to another AWS Region

Redshift Spectrum
    It's Serverless
    Query data that is already in S3 without loading it; must have a Redshift cluster available to start the query. The query is then submitted to thousands of Redshift Spectrum nodes
Remember: Redshift = Analytics / BI / Data Warehouse / OLAP Process

RedShift Cluster
    Enabling Enhanced VPC routing on your Amazon Redshift cluster: use these features to tightly manage the flow of data between your Amazon Redshift cluster and other resources. When you use Enhanced VPC Routing to route traffic through your VPC, you can also use VPC flow logs to monitor COPY and UNLOAD traffic. If Enhanced VPC Routing is not enabled, Amazon Redshift routes traffic through the Internet, including traffic to other services within the AWS network.

Neptune
Fully managed Graph database
When do we use Graphs ?
High relationship data
Social Networking: Users friends with users, replied to comment on post of user & likes other comments
Knowledge graphs (wikipedia)
Highly available across 3 AZ, with up to 15 read replicas
Point-in-time recovery, continuous backup to S3
Support for KMS encryption at rest + HTTPS
Remember: Neptune = Graphs

ElasticSearch
You can search any field, even partial matches
It’s common to use ElasticSearch as a complement to another database
ElasticSearch also has some usage for Big Data applications
Built-In Integrations: Amazon Kinesis Data Firehose, AWS IoT and Amazon CloudWatch Logs for data ingestion
Comes with Kibana (visualization) & Logstash (log ingestion) – ELK stack
Remember: ElasticSearch = Search / Indexing

AWS CloudWatch
Each Alarm is associated with one metric. So, we need one alarm per metric
The CloudWatch Alarm Evaluation Period is the number of the most recent data points to evaluate when determining alarm state. This would help as you can increase the number of datapoints required to trigger an alarm.
CloudWatch Logs never expire by default, but you can define log expiration at the Log Group level.

Metrics: Collect and track key metrics
Logs: Collect, monitor, analyze and store log files
Events: Send notifications when certain events happen in your AWS
Alarms: React in real-time to metrics / events  

Basic Monitoring: 5 Mins
Detailed Monitoring: 1 Min
High Resolution Monitoring: 1 Sec
Custom Monitoring: custom

CloudWatch Metrics
EC2 instances have metrics “every 5 minutes” (by default)
With detailed monitoring you get data “every 1 minute” (at extra cost); use it if you want your ASG to scale more promptly
EC2 memory usage is by default not pushed (must be pushed from inside the instance as a custom metric by CloudWatch Agent)

Custom Metrics
    Ability to use dimensions (attributes) to segment metrics
        Instance.id 
        Environment.name
    Metric resolution
        Standard: 1 minute
        High Resolution: up to 1 second (StorageResolution API parameter) - higher cost
    Use API call PutMetricData
    Use exponential back off in case of throttle errors 
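The exponential back-off recommended above can be sketched as follows (the "full jitter" variant is my assumption; the notes don't specify one):

```python
import random

# Exponential back-off for throttle errors (e.g. CloudWatch PutMetricData
# throttling or DynamoDB ProvisionedThroughputExceededException):
# wait up to base * 2^attempt seconds, capped, with random jitter so that
# many clients don't retry in lockstep.
def backoff_delay(attempt, base=0.1, cap=20.0):
    delay = min(cap, base * (2 ** attempt))
    return random.uniform(0, delay)   # "full jitter" variant

for attempt in range(5):
    print(f"attempt {attempt}: sleep up to {min(20.0, 0.1 * 2**attempt):.1f}s")
```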

CloudWatch Dashboard
Dashboards are Global
Dashboards can include graphs from different Regions
You can change the time zone & time range of the dashboards
You can setup automatic refresh (10s, 1m, 2m, 5m, 15m)

Pricing
    3 Dashboards (up to 50 metrics) for free
    $3/dashboard/month afterwards

CloudWatch Logs
Applications can send logs to CloudWatch using the SDK
CloudWatch can collect log from:
Elastic Beanstalk: collection of logs from application
ECS: collection from containers
AWS Lambda: collection from function logs
VPC Flow Logs: VPC specific logs
API Gateway
CloudTrail based on filter
CloudWatch Log Agents: for example on EC2 machines
Route53: Log DNS queries
CloudWatch Logs can go to:
Batch exporter to S3 for archival
Stream to ElasticSearch cluster for further analytics
Logs Storage Architecture:
Log Groups: arbitrary name, usually representing an application
Log Stream: instances within application / log files / containers
Can define log expiration policies (never expire, 30 days… etc)
To send logs to CloudWatch, make sure IAM permissions are correct!
Security: encryption of logs using KMS at the Group Level

CloudWatch Logs Metric Filter & Insights
    CloudWatch Logs can use filter expressions
        for example, find a specific IP inside of a log
        Metric filter can be used to trigger alarms
    CloudWatch Logs Insights can be used to query logs and add query to CloudWatch Dashboard

CloudWatch Alarms
Alarms can go to Auto Scaling, EC2 Actions, SNS notifications
Alarm Status:
OK
INSUFFICIENT_DATA
ALARM
Period:
Length of time in seconds to evaluate the metric
High resolution custom metrics: can only choose 10 sec to 30 sec

CloudWatch Events
Schedule: Cron jobs
Event Pattern: Event rules to react to a service doing something ex: CodePipeline state change
Triggers to Lambda functions, SQS/SNS/Kinesis Messages
CloudWatch Events creates a small JSON document to give information about the change
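A simplified picture of how an event rule's pattern matches such a JSON document (real patterns also support prefix, numeric and anything-but matching; this toy only handles lists of allowed values):

```python
# A field matches when the event's value is in the pattern's list of
# allowed values; nested objects are matched recursively.
def matches(pattern, event):
    for key, allowed in pattern.items():
        if isinstance(allowed, dict):
            if key not in event or not matches(allowed, event[key]):
                return False
        elif event.get(key) not in allowed:
            return False
    return True

# Rule reacting to a CodePipeline state change (hypothetical pipeline name)
pattern = {
    "source": ["aws.codepipeline"],
    "detail": {"state": ["FAILED"]},
}
event = {
    "source": "aws.codepipeline",
    "detail": {"state": "FAILED", "pipeline": "my-pipeline"},
}
print(matches(pattern, event))   # True - this event would trigger the rule
```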

Amazon EventBridge
EventBridge is the next evolution of CloudWatch Events
Default event bus: generated by AWS services (CloudWatch Events)
Partner event bus: receive events from SaaS service or applications (ZenDesk, DataDog, Segment, Symantic..)
Custom Event buses: for your own applications
Event buses can be accessed by other AWS Accounts
Rules: how to process the events (similar to CloudWatch Events)

Amazon EventBridge Schema Registry
    EventBridge can analyse the events in your bus and infer the schema
    The Schema Registry allows you to generate code for your application that will know in advance how data is structured in the event bus
    Schema can be versioned

Amazon EventBridge vs CloudWatch Events
    Amazon EventBridge builds upon and extends CloudWatch Events
    It uses the same service API and endpoint, and the same underlying service infrastructure
    EventBridge allows extension to add event buses for your custom applications and your third-party SaaS apps
    EventBridge has the Schema Registry capability
    EventBridge has a different name to mark the new capabilities
    Over time,  the CloudWatch Events name will be replaced with EventBridge

AWS X-Ray
Troubleshooting application performance and errors
Distributed tracing of microservices
The X-Ray Daemon must be running on the EC2 instance for your application to appear in X-Ray.
Your IAM Role must have required permissions to send data to X-Ray
Configure the X-Ray daemon to send traces across accounts: Create a role on another account, and allow a role in your account to assume that role
Annotations: to index your XRay traces in order to search and filter through them efficiently
API used for writing to X-Ray: GetSamplingRules, PutTraceSegments, PutTelemetryRecords

AWS CloudTrail
Provides governance, compliance and audit for your AWS account
CloudTrail is enabled by default
Get a history of events / API calls made within your AWS account by:
Console
SDK
CLI
AWS Service
Can put logs from CloudTrail into CloudWatch Logs
If a resource is deleted in AWS, look into CloudTrail first!
Internal monitoring of API calls being made
Audit changes to AWS Resources by your users

CloudTrail vs CloudWatch vs X-Ray
CloudTrail:
Audit API calls made by users / services / AWS console
Useful to detect unauthorized calls or root cause of changes
CloudWatch:
CloudWatch Metrics over time for monitoring
CloudWatch Logs for storing application log
CloudWatch Alarms to send notifications in case of unexpected metrics
X-Ray:
Automated Trace Analysis & Central Service Map Visualization
Latency, Errors and Fault analysis
Request tracking across distributed systems

AWS Config
Helps with auditing and recording compliance of your AWS resources
Helps record configurations and change over time
Possibility of storing the configuration data to S3 (analyzed by Athena)
Questions that can be solved by AWS Config:
Is there unrestricted SSH access to my security groups?
Do my buckets have any public access?
How has my ALB configuration changed over time?
You can receive alerts (SNS notifications) for any changes
AWS Config is a per-region service
Can be aggregated across regions and accounts

AWS Config Rules
    Can use AWS managed config rules (over 75)
    Can make custom config rules (must be defined in AWS Lambda)
        Evaluate if each EBS disk is of type gp2
        Evaluate if each EC2 instance is t2.micro
    Rules can be evaluated / triggered:
        For each config change 
        And / Or: at regular time intervals
        Can trigger CloudWatch events if the rule is non-compliant (and chain with Lambda)
    Rules can have auto remediations
        If a resource is not compliant, you can trigger an auto remediation ex: stop instances with non-approved tags
    AWS Config Rules does not prevent actions from happening (no deny)
    Pricing: no free tier, $2 per active rule per region per month

CloudWatch vs CloudTrail vs Config
CloudWatch
Performance monitoring (metrics, CPU, network, etc..) & dashboards
Events & Alerting
Log Aggregation & Analysis
CloudTrail
Record API calls made within your Account by everyone
Can define trails for specific resources
Global Service
Config
Record Configuration changes
Evaluate resources against compliance rules
Get timeline of changes and compliance

for an Elastic Load Balancer
    CloudWatch
        Monitoring incoming connections metric
        Visualize error code as a % over time
        Make a dashboard to get an idea of your load balancer performance
    Config
        Track security group rules for the Load Balancer
        Track configuration changes for the Load Balancer
        Ensure an SSL certificate is always assigned to the Load Balancer (compliance)
    CloudTrail
        Track who made any changes to the Load Balancer with API Calls

Identity & Access Management – Advanced
AWS STS – Security Token Service
Allows granting limited & temporary access to AWS resources
Token is valid for 15 minutes up to 1 hour (must be refreshed)

AssumeRole
Within your own account: for enhanced security
Cross Account Access: assume role in target account to perform actions there
AssumeRoleWithSAML
return credentials for users logged with SAML
AssumeRoleWithWebIdentity
return creds for users logged with an Idp (Facebook Login, Google Login, OIDC compatible..)
AWS recommends against using this, and using Cognito instead
GetSessionToken
for MFA, from a user or AWS account root user

Identity Federation in AWS
Federation lets users outside of AWS assume a temporary role for accessing AWS resources. These users assume an identity-provided access role
Federations can have many flavors:

CloudFormation – Stackset
Create, Update or delete stacks across multiple accounts and regions with a single operation
Administrator account to create StackSets
Trusted Account to create, update, delete stack instances from StackSets
When you update a stack set, all associated stack instances are updated throughout all accounts and regions

ECS – Elastic Container Service
ECS is a container orchestration service
ECS helps you run Docker containers on EC2 machines
ECS is complicated, and made of –
– ECS Core: Running ECS on user-provisioned EC2 instances
– Fargate: Running ECS tasks on AWS-provisioned compute (serverless)
– EKS: Running ECS on AWS-powered Kubernetes (running on EC2)
– ECR: Docker Container Registry hosted by AWS
IAM security and roles at the ECS task level

AWS ECS – Concepts
ECS Cluster: set of EC2 Instances
ECS Service: applications definitions running on ECS Cluster
ECS tasks + definition: containers running to create the application
ECS IAM roles: roles assigned to tasks to interact with AWS

AWS Step Function –> State Machine
Step Functions is recommended over AWS SWF (Simple Workflow Service) for new applications

AWS AppSync –> GraphQL

AWS DataSync –> Move large amounts of data between on-premises and S3, EFS, FSx for Windows
AWS DataSync vs AWS Storage Gateway ?

Networking – Subnets – IPV4
AWS reserves 5 IP addresses (the first 4 and the last 1) in each subnet
these 5 IPs are not available for use and cannot be assigned to an instance
Ex, if CIDR block 10.0.0.0/24, reserved IP are:
> 10.0.0.0: Network address
> 10.0.0.1: Reserved by AWS for the VPC router
> 10.0.0.2: Reserved by AWS for mapping to AWS-provisioned DNS
> 10.0.0.3: Reserved by AWS for future use
> 10.0.0.255: Network broadcast address. AWS does not support broadcast in a VPC, therefore the address is reserved
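The "5 reserved addresses" rule in code, using the stdlib ipaddress module:

```python
import ipaddress

# Usable addresses in an AWS subnet = total addresses in the CIDR block
# minus the 5 that AWS reserves (network, router, DNS, future use, broadcast)
def usable_ips(cidr):
    return ipaddress.ip_network(cidr).num_addresses - 5

print(usable_ips("10.0.0.0/24"))   # 251 (256 - 5)
print(usable_ips("10.0.0.0/28"))   # 11  (16 - 5)
```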

One VPC can only be attached to one IGW and vice versa

Ingress means traffic going into your instance

KMS – AWS Key Management Service
S3: key per object
Glacier: key per archive
EBS: key per volume
Redshift: 4-tier encryption
RDS: builds on EBS encryption

Costs of VPC
Most managed devices in a VPC:
ELB
NAT Gateway
VPC Connections
Direct Connect
Have both hourly and bandwidth charges

VPC
Subnet
Route table
Internet Gateway
NAT Device
Public IP Address
AWS Direct Connect – an AWS Direct Connect Location acts as a bridge (using a dedicated & private connection) b/w AWS’s VPC (Virtual Private Gateway) and the Customer Network
AWS Direct Connect Gateway – if you want to set up Direct Connect to 1 or more VPCs in many different Regions (same account)

        VPC1 -------\
                     \
                      Direct Connect Gateway  -->  Private Virtual Interface  -->  Customer Network
                     /
        VPC2 -------/

AWS Site-to-Site VPN Connection – bridge b/w an AWS VPC and the Customer Network
AWS Private Link / VPC Endpoint Services – instead of many VPC Peering connections, this exposes a service to 1000s of VPCs (your own or other accounts). It doesn’t require VPC Peering, an Internet Gateway, NAT, or Route Tables… it requires an NLB & ENI.
Interface VPC Endpoint (within AWS network)
VPC Peering – Cross Region and Cross Account connecting VPCs in Private Network
AWS CloudHub – AWS VPN as a hub connecting multiple customer sites (provides communication b/w sites) over the public internet with data encryption.
Transit Gateway – transitive peering b/w 1000s of VPCs & on-premises; hub-and-spoke (star) model. Works across multiple Regions, can be shared cross-account using RAM (Resource Access Manager), works with DCG (Direct Connect Gateway) & VPN connections, and supports IP MULTICAST (the only AWS service that does).
AWS Private Global Network (traffic between Regions)
KMS vs CloudHSM ?

Perfect Forward Secrecy
is a feature that provides additional safeguards against the eavesdropping of encrypted data, through the use of a unique random session key. This prevents the decoding of captured data, even if the secret long-term key is compromised.
CloudFront and Elastic Load Balancing are the two AWS services that support Perfect Forward Secrecy.
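Forward secrecy comes from ephemeral key-exchange cipher suites such as ECDHE. As a sketch (assuming a reasonably modern OpenSSL build), a Python TLS server context can be restricted to forward-secret suites:

```python
import ssl

# Restrict a TLS server context to ECDHE suites so each session negotiates an
# ephemeral key pair; compromising the long-term key later does not expose
# recorded sessions. Available suites depend on the local OpenSSL build.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
ctx.set_ciphers("ECDHE+AESGCM")
for c in ctx.get_ciphers():
    print(c["name"])
```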

Q: You want to put restrictions on an S3 bucket so that only your EC2 instances can access it. The EC2 instances access the S3 bucket using a VPC endpoint. What condition can you use in the S3 bucket policy to restrict access?
Public IP Address of EC2 Instance
Private IP Address of EC2 Instance
Private or Public IP Address
VPC Endpoint ID (correct)
A: EC2 instances normally access S3 using public IP addresses, with traffic routed through the internet gateway. If a VPC endpoint is used, S3 is accessed over the AWS private network. In that case, the bucket policy can use the VPC ID or VPC Endpoint ID to restrict access.
Note: Private IP Addresses are not supported in policies as multiple VPCs can share the same CIDR block
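A sketch of such a bucket policy (bucket name and endpoint ID are hypothetical; aws:SourceVpce is the condition key that matches the VPC endpoint a request arrived through):

```python
import json

# Deny everything that did not arrive via the expected VPC endpoint.
# "my-bucket" and "vpce-1a2b3c4d" are placeholder values.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AllowOnlyViaVpcEndpoint",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": ["arn:aws:s3:::my-bucket", "arn:aws:s3:::my-bucket/*"],
        "Condition": {"StringNotEquals": {"aws:SourceVpce": "vpce-1a2b3c4d"}},
    }],
}
print(json.dumps(policy, indent=2))
```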

Q: How do we avoid SLOWDOWN exceptions in S3?
A: Add random hash prefixes to the object key
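A sketch of that pattern, prepending a short hash so keys spread across key-name prefixes (the prefix length of 4 is an arbitrary choice here):

```python
import hashlib

def hashed_key(key, prefix_len=4):
    """Prepend a short hex hash so sequential keys land under different prefixes."""
    h = hashlib.md5(key.encode()).hexdigest()[:prefix_len]
    return f"{h}/{key}"

print(hashed_key("2021/07/15/photo-0001.jpg"))
```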

What is a Hot Partition & Hot Key in DynamoDB? Both describe reads or writes concentrating on a single partition key, so one partition receives a disproportionate share of traffic and gets throttled while others sit idle.

WAF operates at the app / web app layer of the stack. WAF blocks SQL injection attempts and cross-site scripting attacks.
Shield works at the network and transport layers.

whitepapers –

  1. Architecting for the cloud: AWS Best Practices
  2. AWS Well-Architected Framework
  3. AWS Disaster Recovery (https://aws.amazon.com/disaster-recovery)

Well-Architected Framework –

  1. Operational Excellence
  2. Security
  3. Reliability
  4. Performance Efficiency
  5. Cost Optimization
    https://console.aws.amazon.com/wellarchitected
    https://aws.amazon.com/architecture/
    https://aws.amazon.com/solutions/

BizTalk Notes – 1


BizTalk Server

Atomic transactions
Atomic transactions guarantee that any partial updates are rolled back automatically in the event of a failure during the transactional update, and that the effects of the transaction are erased (except for the effects of any .NET calls that are made in the transaction).

A long-running transaction
A long-running transaction is considered committed when the last statement in it has completed. There is no automatic rollback of state in case of a transaction abort. You can achieve this programmatically through the exception and compensation handlers demonstrated in this sample.

Receive Pipeline Stages
Decode –> Disassemble –> Validate –> ResolveParty
Send Pipeline Stages
Pre-assemble –> Assemble –> Encode
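As a toy illustration (plain Python, not BizTalk code), a pipeline is just an ordered list of stage functions applied to the message in sequence; the stage names mirror the receive pipeline above:

```python
# Each stage takes the message produced by the previous stage.
def decode(msg):          return msg.replace("encoded:", "")
def disassemble(msg):     return msg.split(";")        # one envelope -> many messages
def validate(batch):      return [m for m in batch if m]
def resolve_party(batch): return {"sender": "known", "messages": batch}

RECEIVE_PIPELINE = [decode, disassemble, validate, resolve_party]

def run_pipeline(stages, msg):
    for stage in stages:
        msg = stage(msg)
    return msg

print(run_pipeline(RECEIVE_PIPELINE, "encoded:a;b;;c"))
```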

Promoted Properties stored in the bts_DocumentSpec table in the Management Database
Host info stored in Host Table in Management Database
Messaging objects are stored in Management Database
Subscription table: the database changes when a filter is added to a Send port or when the Activate property of a Receive shape is set to true
Databases are part of every solution: the BizTalk Server Management database, MessageBox database, Tracking database, and SSO database are the four databases used by BizTalk Server runtime operations
BRE: Rule Engine Database along with default 4 Databases

BizTalk Message Life Cycle

Receive Ports and Receive Locations
A receive port is a collection of one or more receive locations that define specific entry points into BizTalk Server. A receive location is the configuration of a single endpoint (URL) to receive messages. The location contains configuration information for both a receive adapter and a receive pipeline. The adapter is responsible for the transport and communications part of receiving a message.
The receive pipeline is responsible for preparing the message for publishing into the MessageBox.
Send Ports
A send port is the combination of a send pipeline and a send adapter. A send port group is a collection of send ports and works much like an e-mail distribution list. The send pipeline is used to prepare a message coming from BizTalk Server for transmission to another service.
Orchestrations
Orchestrations can subscribe to (receive) and publish (send) messages through the MessageBox. In addition, orchestrations can construct new messages. Messages are received using the subscription.

Different Types of BizTalk Schemas

  1. XML Schema
  2. Flat file Schema
  3. Envelope Schema
  4. Property Schema

Read further from below url –

BizTalk architecture:
https://biztalklive.blogspot.com/p/biztalk-server-architecture.html

Correlation:
Correlation is the process of matching an incoming message with the appropriate instance of an orchestration. For example, an orchestration sends out a message, and when the response comes back it needs to be consumed by the correct instance. As you can imagine, there will be multiple instances of the same orchestration running, and it’s important to match the message with the correct one.
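A toy sketch of the idea (not BizTalk code; names are invented): a correlation set maps a property value such as an OrderId to the orchestration instance waiting on it, and incoming messages are routed by looking up that value.

```python
class Correlator:
    """Maps a correlation property value to the orchestration instance waiting on it."""

    def __init__(self):
        self.waiting = {}  # correlation value -> instance id

    def initialize(self, order_id, instance_id):
        # Instance sends a message and initializes the correlation set.
        self.waiting[order_id] = instance_id

    def route(self, message):
        # Follow the correlation: deliver to the matching instance, if any.
        return self.waiting.get(message["OrderId"])

c = Correlator()
c.initialize("PO-42", "orch-instance-7")
print(c.route({"OrderId": "PO-42"}))
```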

Deployment Types:

  1. Deploying from Visual Studio
  2. Building a Microsoft Installer (MSI) package that can be exported or imported between environments.
  3. Using command line-based tools such as MSBuild and BtsTask.
  4. Using community frameworks, such as BizTalk Deployment Framework and NANT.

AWS Notes


Global Services
IAM
CloudWatch

Most policies are stored in AWS as JSON documents. AWS supports six types of policies:

  1. Identity-based policies
  2. Resource-based policies
  3. Permissions boundaries
  4. Organizations SCPs
  5. ACLs
  6. Session policies.
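Of these, identity-based policies are the most common; a minimal read-only S3 sketch (the actions and wildcard resource are illustrative choices):

```python
import json

# Hypothetical identity-based policy granting read-only S3 access; it would be
# attached to a user, group, or role.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:ListBucket"],
        "Resource": "*",
    }],
}
print(json.dumps(policy, indent=2))
```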

S3 data is made up of:
Key (name).
Value (data).
Version ID.
Metadata.
Access Control Lists.

There are 4 mechanisms for controlling access to Amazon S3 resources:
IAM policies.
Bucket policies.
Access Control Lists (ACLs).
Query string authentication (URL to an Amazon S3 object which is only valid for a limited time).

NAT vs Route Table vs NACL vs Target Group vs Security Group ?
NACL – denies all traffic by default. Network Access Control Lists control inbound and outbound traffic at the subnet level. NACLs support both allow and deny rules and are stateless, meaning that return traffic must be explicitly allowed.
SG – Security Groups act as a firewall at the EC2 (instance) level to control inbound and outbound traffic. These are stateful, meaning return traffic is allowed regardless of rules.
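A toy model of the stateless behaviour: with a NACL, the ephemeral-port return traffic needs its own explicit rule (the port ranges and rule numbers here are illustrative):

```python
# Stateless: a rule must exist in each direction, including the
# ephemeral-port range for return traffic; anything unmatched is denied.
nacl_rules = {
    "inbound":  [{"rule": 100, "port_range": (443, 443),    "action": "ALLOW"}],
    "outbound": [{"rule": 100, "port_range": (1024, 65535), "action": "ALLOW"}],
}

def nacl_allows(direction, port):
    # Rules are evaluated in rule-number order; first match wins.
    for r in sorted(nacl_rules[direction], key=lambda r: r["rule"]):
        lo, hi = r["port_range"]
        if lo <= port <= hi:
            return r["action"] == "ALLOW"
    return False  # implicit DENY

print(nacl_allows("inbound", 443), nacl_allows("outbound", 50000))
```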

Redshift
Splunk
Swagger / Open API
OAuth / SAML
Prometheus – a free software application used for event monitoring and alerting. It records real-time metrics in a time-series database built using an HTTP pull model, with flexible queries and real-time alerting.
Grafana Tool – Grafana is an open source visualization tool that can be used on top of a variety of different data stores but is most commonly used together with Graphite, InfluxDB, Prometheus, Elasticsearch and Logz.io.

ECS Notes
To specify permissions for a specific task on Amazon ECS you should use IAM Roles for Tasks. The permissions policy can be applied to tasks when creating the task definition, or by using an IAM task role override using the AWS CLI or SDKs. The taskRoleArn parameter is used to specify the policy.
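A minimal sketch of a task definition carrying a task role; the family, role ARN, and container values are hypothetical, while taskRoleArn is the parameter mentioned above:

```python
import json

# Shape of the task definition document registered with ECS; taskRoleArn
# attaches an IAM role to the containers in this task.
task_definition = {
    "family": "web-app",
    "taskRoleArn": "arn:aws:iam::123456789012:role/WebAppTaskRole",
    "containerDefinitions": [{
        "name": "web",
        "image": "nginx:latest",
        "memory": 256,
    }],
}
print(json.dumps(task_definition, indent=2))
```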

CORS
Cross-origin resource sharing (CORS) is a browser security feature that restricts cross-origin HTTP requests that are initiated from scripts running in the browser. If your REST API’s resources receive non-simple cross-origin HTTP requests, you need to enable CORS support.

A cross-origin HTTP request is one that is made to:
• A different domain (for example, from example.com to amazondomains.com)
• A different subdomain (for example, from example.com to petstore.example.com)
• A different port (for example, from example.com to example.com:10777)
• A different protocol (for example, from https://example.com to http://example.com)

To support CORS, therefore, a REST API resource needs to implement an OPTIONS method that can respond to the OPTIONS preflight request with at least the following response headers mandated by the Fetch standard:
• Access-Control-Allow-Methods
• Access-Control-Allow-Headers
• Access-Control-Allow-Origin
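The three headers above can be sketched as the response an OPTIONS method builds for the preflight request (the header values here are illustrative assumptions, not API Gateway output):

```python
def preflight_response(origin="https://example.com"):
    """Build the minimum headers an OPTIONS method must return for a CORS
    preflight, per the Fetch standard."""
    return {
        "Access-Control-Allow-Methods": "GET,POST,OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type,Authorization",
        "Access-Control-Allow-Origin": origin,
    }

print(preflight_response())
```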

AWS Inspector
Amazon Inspector is an automated security assessment service that helps improve the security and compliance of applications deployed on AWS. It is not used to secure the actual deployment of resources, only to assess the deployed state of the resources.

ACM AWS Certificate Manager
Store / Manage certificates for SSL / TLS

Amazon Data Lifecycle Manager (Amazon DLM)
to automate the creation, retention, and deletion of snapshots taken to back up your Amazon EBS volumes. Automating snapshot management helps you to:

  • Protect valuable data by enforcing a regular backup schedule.
  • Retain backups as required by auditors or internal compliance.
  • Reduce storage costs by deleting outdated backups.
    Combined with the monitoring features of Amazon CloudWatch Events and AWS CloudTrail, Amazon DLM provides a complete backup solution for EBS volumes at no additional cost.
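A sketch of what such a lifecycle policy document might look like (the tag, schedule, and retention values are hypothetical): daily snapshots of tagged EBS volumes, keeping the last 7.

```python
import json

# Illustrative DLM policy-details document: which volumes to snapshot
# (by tag), how often, and how many snapshots to retain.
policy_details = {
    "ResourceTypes": ["VOLUME"],
    "TargetTags": [{"Key": "Backup", "Value": "true"}],
    "Schedules": [{
        "Name": "DailySnapshots",
        "CreateRule": {"Interval": 24, "IntervalUnit": "HOURS", "Times": ["03:00"]},
        "RetainRule": {"Count": 7},
    }],
}
print(json.dumps(policy_details, indent=2))
```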

Amazon Simple Email Service (SES)
is a cloud-based email sending service designed to send notification and transactional emails.

JSON Web Token (JWT):
is meant to be used for user authentication and session management.

An elastic network interface (ENI)
is a logical networking component in a VPC that represents a virtual network card. You can attach a network interface to an EC2 instance in the following ways:
When it’s running (hot attach)
When it’s stopped (warm attach)
When the instance is being launched (cold attach).
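As a sketch, the request parameters for attaching a secondary ENI to a running instance (hot attach) might look like this; the IDs are hypothetical, and DeviceIndex 0 is taken by the instance's primary interface, so secondary ENIs use index 1 and up:

```python
# Illustrative parameter set for EC2's AttachNetworkInterface operation.
attach_params = {
    "NetworkInterfaceId": "eni-0123456789abcdef0",  # the ENI to attach
    "InstanceId": "i-0123456789abcdef0",            # the running instance
    "DeviceIndex": 1,                               # 0 is the primary interface
}
print(attach_params)
```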

AWS Snowball
AWS Snowball is a petabyte-scale data transport solution that uses secure appliances to transfer large amounts of data into and out of the AWS cloud.
As a rule of thumb, if it takes more than one week to upload your data to AWS using the spare capacity of your existing Internet connection, then you should consider using Snowball.

AWS Secrets Manager
is an AWS service that makes it easier for you to manage secrets. Secrets can be database credentials, passwords, third-party API keys, and even arbitrary text. You can store and control access to these secrets centrally by using the Secrets Manager console, the Secrets Manager command line interface (CLI), or the Secrets Manager API and SDKs.
AWS Secrets Manager to store and encrypt the database credentials, API keys, and other secrets. Enable automatic rotation for all of the credentials.

AWS Config: Enforces strict compliance by tracking all configuration changes made.
Amazon Redshift: data warehouse and data lake
AWS STS: issue short-lived access tokens that acts as temporary security credentials
Amazon Simple Workflow Service (SWF): use for creating decoupled architectures (SQS serves the same purpose)
AWS Secrets Manager: Store db credentials, passwords.