The PREP Framework
Your personal GPS for any technical question. Follow these 4 simple steps to never feel lost in an interview again.
Think Like a Doctor
A good doctor first asks you questions (Problem), then explains the plan (Reasoning), gives the medicine (Execution), and finally checks if you feel better (Process Review). PREP follows the same process for data science problems.
Problem Understanding
Goal: Make sure you 100% understand the question before you start.
- Repeat the Question: Say the question back to the interviewer in your own words.
- Ask Clarifying Questions: "What format should the output be?", "Should I consider null values?", "Are the user IDs unique in this table?"
- State Your Assumptions: "I will assume we only need data from the last 6 months."
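If the interviewer lets you inspect the data, some of these questions can be answered with a quick check. The sketch below is only an illustration: the `users` table, its columns, and the sample rows are hypothetical, and it uses Python's built-in `sqlite3` module so it runs without any setup.

```python
import sqlite3

# Hypothetical table and sample rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user_id INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(1, "Pune"), (2, None), (2, "Delhi"), (3, "Pune")],
)

# COUNT(*) counts rows; COUNT(DISTINCT user_id) counts distinct non-null IDs.
total_rows, distinct_ids = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT user_id) FROM users"
).fetchone()

# Number of rows where the city is missing.
null_cities = conn.execute(
    "SELECT COUNT(*) FROM users WHERE city IS NULL"
).fetchone()[0]

# IDs are unique only if every row has its own distinct ID.
ids_are_unique = total_rows == distinct_ids
print(total_rows, distinct_ids, null_cities, ids_are_unique)
```

In this sample the IDs are not unique and one city is null, which is exactly the kind of finding worth stating out loud as an assumption before writing the real query.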
Reasoning (Your Plan)
Goal: Explain your step-by-step plan *before* you write any code. This is the most important step!
- Think Out Loud: Talk through your logic. "Okay, to solve this, I need to..."
- Give a High-Level Plan: "First, I will join the `users` and `orders` tables. Second, I will filter for active users. Finally, I will group by city and count the users."
- Confirm Your Approach: Ask "Does this sound like a good approach to you?"
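The three-step plan in the example above maps almost line for line onto a query. Here is a minimal sketch, assuming hypothetical columns (`users.id`, `users.city`, `users.is_active`, `orders.user_id`) and made-up rows, run through Python's built-in `sqlite3` module.

```python
import sqlite3

# Hypothetical schema and sample data, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, city TEXT, is_active INTEGER);
CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER);
INSERT INTO users VALUES (1, 'Pune', 1), (2, 'Pune', 1), (3, 'Delhi', 0), (4, 'Delhi', 1);
INSERT INTO orders VALUES (10, 1), (11, 1), (12, 2), (13, 3), (14, 4);
""")

query = """
SELECT u.city, COUNT(DISTINCT u.id) AS active_users   -- step 3: group by city and count
FROM users AS u
JOIN orders AS o ON o.user_id = u.id                  -- step 1: join users and orders
WHERE u.is_active = 1                                 -- step 2: keep active users only
GROUP BY u.city
ORDER BY u.city
"""
rows = conn.execute(query).fetchall()
print(rows)
```

`COUNT(DISTINCT u.id)` is a deliberate choice here: the join produces one row per order, so a plain `COUNT(*)` would count user 1 twice.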
Execution
Goal: Write clean, logical code to implement the plan you just explained.
- Write the Code: Now you can start typing your SQL query or Python code.
- Keep It Clean: Use good variable names and proper formatting. Make it easy to read.
- Talk as You Code: Briefly explain each part. "Here, I am writing the LEFT JOIN to make sure we keep all users..."
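To see why the `LEFT JOIN` in that example keeps all users, it can help to compare it with an inner join on a tiny dataset. This is a sketch with hypothetical tables and names, using Python's built-in `sqlite3` module.

```python
import sqlite3

# Hypothetical sample data: Chen has no orders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER);
INSERT INTO users VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
INSERT INTO orders VALUES (10, 1), (11, 1), (12, 2);
""")

# LEFT JOIN keeps every user; COUNT(o.id) ignores NULLs, so Chen gets 0.
left_rows = conn.execute("""
SELECT u.name, COUNT(o.id) AS order_count
FROM users AS u
LEFT JOIN orders AS o ON o.user_id = u.id
GROUP BY u.id, u.name
ORDER BY u.id
""").fetchall()

# An inner JOIN drops users that have no matching order.
inner_rows = conn.execute("""
SELECT u.name, COUNT(o.id) AS order_count
FROM users AS u
JOIN orders AS o ON o.user_id = u.id
GROUP BY u.id, u.name
ORDER BY u.id
""").fetchall()

print(left_rows)
print(inner_rows)
```

Counting `o.id` rather than `*` matters with a `LEFT JOIN`: `COUNT(*)` would report 1 for a user with no orders, because the unmatched user still produces one row.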
Process Review
Goal: Check your work and show that you can think beyond just one solution.
- Validate Your Answer: Quickly dry-run your code on a small example, then check edge cases. "What happens if a user has no orders?"
- Discuss Alternatives: "Another way to solve this could be using a subquery, but a CTE is cleaner here because..."
- Talk About Complexity: If it's a coding question, briefly mention the time and space complexity (Big O).
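When you discuss alternatives, a small dry run can show that two versions really agree. The sketch below writes the same logic once as a subquery and once as a CTE and compares the results. The `orders` table, the `order_amount` values, and the 100 threshold are hypothetical, and the queries run through Python's built-in `sqlite3` module.

```python
import sqlite3

# Hypothetical sample data for a quick dry run.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, order_amount REAL);
INSERT INTO orders VALUES (1, 1, 80.0), (2, 1, 40.0), (3, 2, 50.0), (4, 3, 200.0);
""")

# Version 1: the per-user totals live in an inline subquery.
subquery_sql = """
SELECT user_id, total
FROM (
    SELECT user_id, SUM(order_amount) AS total
    FROM orders
    GROUP BY user_id
) AS totals
WHERE total > 100
ORDER BY user_id
"""

# Version 2: the same totals, named up front as a CTE.
cte_sql = """
WITH totals AS (
    SELECT user_id, SUM(order_amount) AS total
    FROM orders
    GROUP BY user_id
)
SELECT user_id, total
FROM totals
WHERE total > 100
ORDER BY user_id
"""

subquery_rows = conn.execute(subquery_sql).fetchall()
cte_rows = conn.execute(cte_sql).fetchall()
print(subquery_rows == cte_rows, cte_rows)
```

Both versions return the same rows on this sample. The CTE mainly helps readability, since the intermediate step has a name and reads top to bottom.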
PREP In Action (Example)
Question: "Find the top 3 highest-spending customers."
- (P) Problem: "Okay, so you want the top 3 customers by their total spending. Do you want their names and total amount? And should I use the `orders` and `customers` tables?"
- (R) Reasoning: "My plan is to first join `customers` and `orders`. Then, I'll group by customer ID and name and sum up the `order_amount` for each customer. Finally, I'll sort the results by total spending in descending order and take the top 3."
- (E) Execution: (Writes the SQL query to do exactly that).
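- (P) Process Review: "A dry run on a small sample gives the expected result. One edge case is two customers tied on total spending: a plain top-3 cutoff would drop one of them arbitrarily. Ranking with `DENSE_RANK()` and keeping ranks 1 to 3 would return every tied customer instead."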
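One possible version of the query from this example is sketched below, together with the `DENSE_RANK()` variant for ties. The column names (`customer_id`, `name`, `order_amount`) and the sample rows are assumptions made for illustration. It runs through Python's built-in `sqlite3` module and assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added.

```python
import sqlite3

# Hypothetical schema and data: Chen and Dee tie for third place.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_amount REAL);
INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen'), (4, 'Dee');
INSERT INTO orders VALUES (1, 1, 300.0), (2, 2, 200.0), (3, 3, 100.0), (4, 4, 100.0), (5, 1, 50.0);
""")

# Plan from the Reasoning step: join, group, sum, sort descending, take 3.
# customer_id is added as a tie-breaker so the output is deterministic.
top3_sql = """
SELECT c.customer_id, c.name, SUM(o.order_amount) AS total_spent
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.name
ORDER BY total_spent DESC, c.customer_id
LIMIT 3
"""

# Tie-aware variant: rank the totals and keep every customer in ranks 1 to 3.
ranked_sql = """
WITH totals AS (
    SELECT c.customer_id, c.name, SUM(o.order_amount) AS total_spent
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.name
),
ranked AS (
    SELECT customer_id, name, total_spent,
           DENSE_RANK() OVER (ORDER BY total_spent DESC) AS spend_rank
    FROM totals
)
SELECT customer_id, name, total_spent, spend_rank
FROM ranked
WHERE spend_rank <= 3
ORDER BY spend_rank, customer_id
"""

top3_rows = conn.execute(top3_sql).fetchall()
ranked_rows = conn.execute(ranked_sql).fetchall()
print(top3_rows)
print(ranked_rows)
```

On this sample the `LIMIT 3` version returns three rows and leaves Dee out, while the `DENSE_RANK()` version returns four rows because Chen and Dee share rank 3. Which behavior is correct depends on what the interviewer means by "top 3", so it is worth asking.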