All Posts


Pandas Interview Questions Every Data Engineer Must Know
In today’s data-driven world, Pandas has become an essential tool for every Data Engineer. Whether you're building ETL pipelines, transforming large datasets, or preparing data for analytics and machine learning, Pandas plays a critical role in delivering clean, structured, and high-quality data. 1. What makes Pandas different from NumPy? Answer: Pandas is built on top of NumPy but provides labeled, tabular, and time-series data structures like DataFrame and Series. It supp
Tejas Agrawal
Dec 6 · 2 min read
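
To make the "labeled" part of that first answer concrete, here is a minimal sketch (not taken from the post itself). The variable names and sample values are made up for illustration. It shows a NumPy array accessed by position, and a Pandas Series wrapping the same data with labels that drive both lookup and arithmetic alignment.

```python
import numpy as np
import pandas as pd

# A plain NumPy array: values are reachable by integer position only.
arr = np.array([10, 20, 30])
by_position = arr[1]

# The same data as a Pandas Series with string labels (illustrative labels).
s = pd.Series(arr, index=["a", "b", "c"])
by_label = s["b"]

# Arithmetic between Series matches values by label, not by position,
# so the different ordering of `other` does not matter.
other = pd.Series([1, 2, 3], index=["c", "b", "a"])
aligned_sum = s + other
```

Because the addition aligns on labels, `aligned_sum["a"]` combines 10 with 3 even though the two values sit at opposite ends of their respective Series.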


10 Must-Know NumPy Interview Questions and Answers for Data Engineers and Analysts
Data engineers and analysts often rely on NumPy for efficient numerical computing and data manipulation. Mastering key NumPy concepts can give you an edge in interviews and help you handle real-world data challenges. This post presents 10 essential NumPy interview questions with clear, professional answers designed to prepare you for technical discussions and practical tasks. 1. Why is NumPy faster than Python lists? Answer: NumPy is faster because it stores data in contiguou
Tejas Agrawal
Dec 6 · 2 min read
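
The excerpt above is cut off, but the idea it starts on (one block of same-typed values rather than a list of separate Python objects) can be illustrated with a small sketch. The sample values and variable names are assumptions for illustration, and no timing claims are made here.

```python
import numpy as np

values = list(range(5))

# Fixed dtype: every element occupies the same number of bytes.
arr = np.array(values, dtype=np.int64)

is_contiguous = arr.flags["C_CONTIGUOUS"]  # stored as one C-ordered block
bytes_per_item = arr.itemsize              # 8 bytes for int64
total_bytes = arr.nbytes                   # itemsize * number of elements

# Vectorized arithmetic runs over the whole buffer in compiled code,
# while the list version loops element by element in the interpreter.
doubled = arr * 2
doubled_list = [v * 2 for v in values]
```

Both approaches give the same numbers; the difference is where the loop runs and how the data is laid out in memory.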


Beyond the Basics: How to Answer 'System Design' Questions in Data Engineering
Medallion Architecture Question 1: Handling "SCD Type 2" for Patient Demographics Scenario: Patient data, such as addresses and insurance details, changes over time. We need to maintain a full history of these changes for audit purposes. Question: How would you design the schema and ETL process to handle Slowly Changing Dimensions (SCD) Type 2 for a patient dimension table? Answer: Schema Design: I would add three columns to the patient dimension table: start_date, end
Tejas Agrawal
Dec 3 · 5 min read


Kafka — Data Engineer Interviews
1. Kafka Architecture Core Architecture: Producer → Topic → Partitions → Broker → Consumer Group. Topics are log files divided into partitions. Each partition is stored on a broker. Every partition includes: Leader (responsible for handling reads/writes), Followers (replicas for fault tolerance). Key Design Concepts: Distributed → enables horizontal scaling; Append-only log → ensures high speed; Partitioning → supports parallel processing; Replication → prevents data loss Of
Tejas Agrawal
Nov 29 · 2 min read


DSA-Patterns-Roadmap
DSA-Patterns-Roadmap 1. Fast and Slow Pointer Problem Linked List Cycle II Remove nth Node from the End of List Find the Duplicate Number...
Tejas Agrawal
Feb 24 · 3 min read