Ask Claude about this

Customer Activation Insights

Problem Statement

As a Data Analyst at Shopify, you're working with the Growth team to understand customer activation patterns. The team wants to optimize the onboarding experience by analyzing how quickly new customers make their first purchase after registration. This metric is crucial for measuring the effectiveness of welcome campaigns, onboarding flows, and early customer engagement strategies.

Your task: Calculate the average number of days between customer registration and their first purchase. Only include customers who have made at least one purchase (and that purchase must be 'completed'), and round the result to 2 decimal places.

Business Context

  • Shopify merchants need to understand customer conversion timelines to optimize their marketing spend.
  • Time-to-first-purchase is a key metric for measuring onboarding effectiveness.
  • This analysis helps identify customers who might need additional engagement to convert.
  • Different customer segments may have varying purchase patterns based on acquisition channels.

Key Requirements

Output Specification:

  • Single column: avg_days_to_first_purchase
  • Value should be rounded to 2 decimal places.
  • Include only customers who have made at least one completed purchase.

Business Rules & Calculations:

  • First purchase is defined as the earliest order_date of a completed order for each customer.
  • Days calculation: first_completed_order_date - registration_date.
  • If a customer registered and made their first completed purchase on the same day, that counts as 0 days.
  • Exclude any orders with status other than 'completed' (e.g., 'cancelled').

Data Quality Considerations:

  • The solution must inherently handle customers who registered but never made a purchase (they won't be part of the average).
  • Assume dates are consistent and timezone differences are not a factor for this specific problem's date subtraction.
  • Assume order_date will always be >= registration_date for valid orders.

Database Schema & Sample Data

Assume the following schema and sample data for the Customers and Orders tables:

Table: Customers

CREATE TABLE Customers (
    customer_id INT PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    email VARCHAR(150) UNIQUE NOT NULL,
    registration_date DATE NOT NULL,
    phone VARCHAR(20),
    acquisition_channel VARCHAR(50),
    country_code VARCHAR(3),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
customer_id name email registration_date acquisition_channel country_code
1 John Doe john@example.com 2023-01-05 Organic Search USA
2 Jane Smith jane@example.com 2023-01-10 Social Media CAN
3 Bob Johnson bob@example.com 2023-01-15 Email Campaign USA
4 Sara Wilson sara@example.com 2023-01-20 Paid Ads GBR
5 Mike Brown mike@example.com 2023-01-25 Referral USA
6 Lisa Davis lisa@example.com 2023-02-01 Organic Search CAN
7 Tom Wilson tom@example.com 2023-02-05 Social Media USA

Table: Orders

CREATE TABLE Orders (
    order_id INT PRIMARY KEY,
    customer_id INT NOT NULL,
    order_date DATE NOT NULL,
    total_amount DECIMAL(10,2) NOT NULL,
    order_status VARCHAR(20) DEFAULT 'completed',
    payment_method VARCHAR(30),
    shipping_country VARCHAR(3),
    discount_amount DECIMAL(10,2) DEFAULT 0.00,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (customer_id) REFERENCES Customers(customer_id)
);
order_id customer_id order_date total_amount order_status payment_method
101 1 2023-01-08 120.50 completed credit_card
102 2 2023-01-12 85.25 completed paypal
103 1 2023-01-15 65.75 completed credit_card
104 3 2023-01-25 105.00 completed credit_card
105 2 2023-01-26 140.30 completed paypal
106 4 2023-02-05 95.45 completed debit_card
107 1 2023-02-10 200.00 cancelled credit_card
108 5 2023-02-15 75.80 completed credit_card
109 3 2023-02-20 155.25 completed paypal
110 2 2023-03-01 89.99 completed credit_card

Expected Output (based on sample data):

Calculation Breakdown (for customers with completed first purchases):
- Customer 1 (John Doe): First completed order 2023-01-08. Registration 2023-01-05. Days: 3
- Customer 2 (Jane Smith): First completed order 2023-01-12. Registration 2023-01-10. Days: 2
- Customer 3 (Bob Johnson): First completed order 2023-01-25. Registration 2023-01-15. Days: 10
- Customer 4 (Sara Wilson): First completed order 2023-02-05. Registration 2023-01-20. Days: 16
- Customer 5 (Mike Brown): First completed order 2023-02-15. Registration 2023-01-25. Days: 21
- Customers 6 & 7 have no completed purchases and are excluded from the average calculation.
- Total days for purchasing customers: 3 + 2 + 10 + 16 + 21 = 52 days.
- Number of purchasing customers: 5.
- Average: 52 / 5 = 10.40 days.

avg_days_to_first_purchase
10.40

Average Time to First Purchase

MODERATE

Write a PostgreSQL query to calculate the average number of days between a customer's registration date and their first completed purchase date. Only include customers who have made at least one completed purchase. The final result should be a single value rounded to 2 decimal places.

Take it Further! What are your thoughts on this problem or alternative ways to solve it? Share your own SQL query attempts or insights in the comments below!

Nerchuko Academy · Free DS Interview Prep