Echo Kanak
This blog is a work in progress ⌛🛠️ where I keep track of things I learn, whether from books or tech of any kind.
-
• Git
.gitkeep
Quick notes on Git, from basics to concepts like rebase, cherry-picking and stashing
-
• Databricks
Databricks notes
A rough collection of notes on Databricks and data engineering concepts
-
• Apache Spark • Docker • Python
Consuming and Visualizing Real-Time Event Streams
Processing and visualizing with Spark, MinIO, PostgreSQL & Grafana
-
• Apache Kafka • Python • Docker
Simulating Real-Time User Journeys with Python and Kafka
Simulate clickstream and transaction data for users
-
• Apache Airflow
Airflow XComs 101
When building pipelines in Apache Airflow, tasks often need to share data with each other. That’s where XComs (cross-communication) come in: they enable tasks in your Airflow DAGs to exchange small amounts of data. Every task instance in Airflow gets its own context dictionary that contains metadata about the current execution.
-
• Apache Airflow
Airflow DAGs 101
Defining Airflow DAGs and task dependencies
-
• Containers • Docker • Apache Kafka
Kafka basics with Docker
Run Kafka using Docker, then publish and subscribe to a Kafka topic
-
• Docker
Docker commands cheatsheet
Quick reference for common Docker + Docker Compose commands.
-
• Books
Data Warehousing, Business Intelligence, and Dimensional Modeling Primer
Chapter 1 of The Data Warehouse Toolkit by Ralph Kimball
-
• DDIA • Books
Data Models and Query Languages
Chapter 2 notes for Designing Data-Intensive Applications by Martin Kleppmann
-
• DDIA • Books
Reliable, Scalable and Maintainable Applications
Chapter 1 notes for Designing Data-Intensive Applications by Martin Kleppmann