DevOps: Monitoring and Observability

The notes below are based on a LinkedIn Learning course:
https://www.linkedin.com/learning/devops-foundations-monitoring-and-observability/

Monitoring Basics: Modelling your system

  1. USE Model
  2. RED Model
  3. DWR Model
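
For reference, USE is commonly expanded as Utilization, Saturation, Errors (a resource view) and RED as Rate, Errors, Duration (a request view). Below is a minimal sketch of a RED summary over one window of requests. The `Request` type and the `red_summary` function are hypothetical names for illustration, and counting only 5xx responses as errors is an assumption.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Request:
    status: int          # HTTP status code
    duration_ms: float   # time taken to serve the request


def red_summary(requests, window_seconds):
    """Summarize one window of requests as Rate, Errors, Duration."""
    total = len(requests)
    errors = sum(1 for r in requests if r.status >= 500)  # assumption: 5xx = error
    return {
        "rate_per_s": total / window_seconds,
        "error_ratio": errors / total if total else 0.0,
        "median_duration_ms": median(r.duration_ms for r in requests) if total else 0.0,
    }
```

In practice the duration is usually tracked as percentiles rather than a single median; the median keeps the sketch short.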

Synthetic Monitoring

Check whether a service is down or slow (uptime, page speed, visitor insights, transactions).

  1. Write a curl script
  2. PhantomJS app
  3. Nagios
  4. Pingdom
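
The "is it down or slow" check can also be scripted without any of the tools above. This is a rough sketch using only the Python standard library. The function names, the one-second "slow" threshold, and treating 5xx as "down" are all assumptions, not something the course prescribes.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if the request failed."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None      # DNS failure, refused connection, timeout, ...


def probe(url, slow_after_s=1.0, fetch=fetch_status, clock=time.monotonic):
    """Classify one probe of url as 'up', 'slow', or 'down'."""
    start = clock()
    status = fetch(url)
    elapsed = clock() - start
    if status is None or status >= 500:
        return "down", elapsed
    if elapsed > slow_after_s:
        return "slow", elapsed
    return "up", elapsed
```

The `fetch` and `clock` parameters exist so the classification logic can be exercised with fake responses, without touching the network. A scheduler such as cron would call `probe` at a fixed interval.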

End-User Monitoring

  1. SteelCentral monitoring via application agents

System Monitoring

  1. CPU Usage Monitoring
  2. Memory Usage Monitoring
  3. Disk I/O Monitoring
  4. Inode & File Handle Usage
  5. Datadog agent
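
A few of these numbers can be read directly with the Python standard library, which helps to see what an agent collects. This is a sketch for Unix-like hosts only. `system_snapshot` is a hypothetical name, and memory and disk I/O are left out because the standard library has no portable call for them.

```python
import os
import shutil


def system_snapshot(path="/"):
    """Collect a few host metrics using only the standard library (Unix only)."""
    load1, _load5, _load15 = os.getloadavg()  # run-queue averages: 1, 5, 15 min
    disk = shutil.disk_usage(path)            # bytes: total, used, free
    fs = os.statvfs(path)                     # f_files / f_ffree count inodes
    inodes_used = fs.f_files - fs.f_ffree
    return {
        "load_per_cpu": load1 / (os.cpu_count() or 1),
        "disk_used_ratio": disk.used / disk.total if disk.total else 0.0,
        "inode_used_ratio": inodes_used / fs.f_files if fs.f_files else 0.0,
    }
```

A full disk and an exhausted inode table produce the same "no space left" errors, which is why the two ratios are reported separately.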

Network Monitoring

  1. New term: DevNetOps
  2. NetFlow Analysis: Feature by Cisco
  3. Packet Analysis: tcpdump, Wireshark, server based agents, or cloud packet aggregator like Gigamon

Software Metrics

  1. API based events sent via application (for example new signups)
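
One common way to send such events is a StatsD-style counter over UDP. The notes do not name a transport, so StatsD, the metric name `app.signups`, and the local address are assumptions. A minimal sketch:

```python
import socket


def statsd_counter(name, value=1):
    """Format a counter increment in the StatsD line protocol: name:value|c."""
    return f"{name}:{value}|c".encode("ascii")


def emit(payload, host="127.0.0.1", port=8125):
    """Send one metric datagram; UDP is fire-and-forget, so no reply is expected."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

The application would call something like `emit(statsd_counter("app.signups"))` right after a signup succeeds. Because UDP does not wait for an answer, a missing collector does not slow the request down.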

Application Monitoring

Problems: Garbage collection, Latency, Utilization, Heavy parsing, Thread management, Database locking, Connection leaks, Table scans, File handles, Timeouts, Disk queuing

  1. JMX Metrics (Java)
  2. WMI Metrics (.NET)
  3. Trace data

Log Monitoring

  1. Use proper log levels: DEBUG, INFO, WARN, ERROR, FATAL
  2. Keeping logs is expensive in storage.
  3. Use JSON logs for easy parsing.
  4. Format your verbose logs.
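
A minimal sketch of JSON logging with Python's standard `logging` module. The field names (`time`, `level`, `logger`, `message`) and the helper `build_logger` are illustrative choices, not a required schema.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name, stream=None):
    """Return a logger that writes JSON lines (to stderr when stream is None)."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)  # DEBUG lines are dropped at this level
    logger.addHandler(handler)
    logger.propagate = False       # avoid duplicate lines via the root logger
    return logger
```

Setting the level to INFO in production and lowering it to DEBUG only while investigating is one way to keep the storage cost down.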

Visualization

  1. For problem detection, use a concise dashboard: simple numbers, event overlays, percentile graphs
  2. Show performance, throughput & errors together in one context
  3. Do not combine them in one context if doing so makes the information misleading.
  4. Book on designing better visualizations: Envisioning Information by Edward Tufte
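
Percentile graphs matter because an average can hide a slow tail. A small sketch with made-up latency samples; `latency_summary` is a hypothetical helper built on `statistics.quantiles`.

```python
from statistics import mean, quantiles


def latency_summary(samples_ms):
    """Return the mean next to a few percentiles of the same samples."""
    cuts = quantiles(samples_ms, n=100)  # 99 cut points: cuts[0] is p1, cuts[98] is p99
    return {
        "mean": mean(samples_ms),
        "p50": cuts[49],
        "p95": cuts[94],
        "p99": cuts[98],
    }


# Made-up data: 95 fast requests and 5 very slow ones.
summary = latency_summary([10] * 95 + [1000] * 5)
```

With this data the median stays at the fast value while p99 lands on the slow one, and the mean sits in between, describing neither group of users.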

Alerting

  1. For every new alert, ask: who receives it, and why?
  2. Alerts can also be used for automatic remediation.
  3. Two common problems: False Negatives (too few alerts) & False Positives (too many alerts)
    1. Example 1: If the number of HTTP 500 responses exceeds 50 within a minute, alert me.
    2. Example 2: An alert received at midnight often isn't actionable, so putting that information on a dashboard could be better.
  4. Actionable Alerts:
    1. Test your alerts with dummy data
    2. Instrument the real thing
    3. Send context with your alert
    4. Set budgets for alerts in your Agile Process
    5. Before creating automated alerts, validate them with the people who would act on them.
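
The "more than 50 errors within a minute" rule can be sketched as a sliding window over error timestamps. The class name, the rule name `http_500_rate`, and the fields in the returned dict are hypothetical. The dict stands in for the context that should travel with an alert.

```python
from collections import deque


class ErrorRateAlert:
    """Fire when more than `threshold` errors are seen within `window_s` seconds."""

    def __init__(self, threshold=50, window_s=60.0):
        self.threshold = threshold
        self.window_s = window_s
        self._events = deque()  # timestamps of recent errors, oldest first

    def record_error(self, timestamp):
        """Register one error; return an alert dict with context, or None."""
        self._events.append(timestamp)
        # Drop errors that have fallen out of the sliding window.
        while self._events and self._events[0] <= timestamp - self.window_s:
            self._events.popleft()
        if len(self._events) > self.threshold:
            return {
                "rule": "http_500_rate",
                "count": len(self._events),
                "window_s": self.window_s,
                "first_seen": self._events[0],
            }
        return None
```

Because timestamps are passed in rather than read from a clock, the rule can be tested with dummy data before it is wired to real traffic.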

DevSecOps: Building a Secure Continuous Delivery Pipeline

The notes below are taken from a LinkedIn Learning course:
https://www.linkedin.com/learning/devsecops-building-a-secure-continuous-delivery-pipeline

Development Phase Tools (Static Application Security Testing, SAST)

  1. Static Code Analysis
  2. Keeping Secrets with git-secrets
  3. Rapid Risk Assessment
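
The idea behind secret scanning is pattern matching over text before it is committed. The sketch below is a simplified illustration of that idea, not how git-secrets itself is implemented. The two patterns and the function name `scan_text` are assumptions chosen for the example.

```python
import re

# Simplified, illustrative patterns; real scanners ship broader rule sets.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}


def scan_text(text):
    """Return (line_number, rule_name) for every line that matches a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, rule))
    return findings
```

Run as a pre-commit hook, a non-empty result would block the commit so the secret never reaches the repository history.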

Inherit Phase Tools (Software Composition Analysis, SCA)

  1. OWASP Dependency Check
    • Uses National Vulnerability Database (https://nvd.nist.gov/)
    • Dependency-check cli tool
  2. JavaScript security with Retire.js
  3. Container scanning
    • free: Clair
    • commercial: Aqua, Twistlock
  4. Commercial Dependency Check tools: Sonatype, Black Duck, Veracode, WhiteSource

Build Phase Tools (Dynamic Application Security Testing, DAST)

  1. DAST tools = Attack Tools, Slow & Clunky
  2. General Purpose Scanner Tools: Arachni, Nikto, ZAP, Burp Suite
  3. SQLi Scanner: sqlmap
  4. SSL/TLS Scanner: SSLScan, SSLyze
  5. gauntlt.org

Deploy Phase Tools

  1. InSpec by Chef

Operation Phase Tools

  1. Metrics based monitoring dashboards
  2. Choosing API centric tools
  3. Promote learning: alerts on Slack to the responsible team
  4. Bug Bounties: Bugcrowd, HackerOne
  5. RASP & Next-gen WAF Tools
    • Build your own with: ModSecurity + ELK + StatsD
    • Contrast, Prevoty, Signal Sciences, tCell, Waratek
  6. Cloud Security Monitoring:
    • Configuration changes
    • Audit
    • Tools: Threat Stack, AlienVault, Evident.io
    • AWS Options: AWS Config, CloudTrail, Inspector, GuardDuty

The future of Kubernetes

The article below covers the following topics:

https://www.eficode.com/blog/the-future-of-kubernetes-and-why-developers-should-look-beyond-kubernetes-in-2022

  • Security: OIDC is better than secrets
  • Networking: Ingress does not cut the mustard
  • Workload definition: To the point
  • Storage: Moving away from persistent volumes
https://www.eficode.com/hs-fs/hubfs/graph%20future%20of%20kubernetes-1.png?width=1035&name=graph%20future%20of%20kubernetes-1.png

Network Infrastructure Security Guidance by the US NSA

Below is the complete Cybersecurity Technical Report on Network Infrastructure Security, published by the National Security Agency of the United States of America in March 2022.

https://media.defense.gov/2022/Mar/01/2002947139/-1/-1/0/CTR_NSA_NETWORK_INFRASTRUCTURE_SECURITY_GUIDANCE_20220301.PDF