Project 5

Price Optimization Analysis

Analyze price elasticity of demand, build regression models, and determine optimal pricing strategies that maximize revenue or profit for an e-commerce retailer.

Project Overview

Difficulty: Advanced

Tools: SQL, Python (pandas, numpy, sklearn, scipy, matplotlib), Excel

Estimated Time: 8-10 hours

You are a pricing analyst at TechDirect, an online electronics retailer competing with Amazon, Best Buy, and Newegg. The CEO wants a data-driven pricing strategy to replace the current cost-plus approach. Your analysis will answer:

The key concept is Price Elasticity of Demand (PED): the percentage change in quantity demanded resulting from a 1% change in price. If PED = -1.5, a 10% price increase causes a 15% drop in demand.

Dataset Description

The project uses pricing history, sales data, and competitor price tracking.

Table: pricing_history

Column Data Type Description
price_record_idINT (PK)Unique record identifier
product_idINT (FK)References products table
effective_dateDATEDate the price took effect
our_priceDECIMAL(10,2)Our selling price
cost_priceDECIMAL(10,2)Our cost to purchase/produce
units_soldINTUnits sold during this price period
days_at_priceINTNumber of days at this price point

Table: products

Column Data Type Description
product_idINT (PK)Unique product identifier
product_nameVARCHAR(100)Product display name
categoryVARCHAR(50)Laptops, Headphones, Monitors, etc.
brandVARCHAR(50)Manufacturer brand
min_priceDECIMAL(10,2)Floor price (MAP or margin floor)

Table: competitor_prices

Column Data Type Description
competitor_idINT (PK)Unique competitor price record
product_idINT (FK)References products table
competitor_nameVARCHAR(50)Amazon, BestBuy, Newegg
competitor_priceDECIMAL(10,2)Competitor's price on that date
scraped_dateDATEDate the price was recorded

SQL Analysis

Extract pricing and sales data at different price points to feed into the elasticity and regression models.

Query 1: Sales Performance at Different Price Points

For each product, shows how units sold per day vary at different price levels. This is the raw data for calculating price elasticity.

SELECT
    p.product_name,
    p.category,
    ph.our_price,
    ph.cost_price,
    ph.units_sold,
    ph.days_at_price,
    ROUND(ph.units_sold * 1.0 / ph.days_at_price, 2) AS daily_units,
    ROUND(ph.our_price - ph.cost_price, 2)             AS margin_per_unit,
    ROUND((ph.our_price - ph.cost_price) * ph.units_sold, 2) AS total_profit,
    ROUND(ph.our_price * ph.units_sold, 2)             AS total_revenue
FROM pricing_history ph
JOIN products p ON ph.product_id = p.product_id
ORDER BY p.product_name, ph.our_price;

Query 2: Price Elasticity Calculation (Point Elasticity)

Uses LAG window function to compare consecutive price periods and calculate the point elasticity between each price change.

WITH price_changes AS (
    SELECT
        p.product_name,
        p.category,
        ph.our_price,
        ph.units_sold * 1.0 / ph.days_at_price AS daily_demand,
        LAG(ph.our_price) OVER (
            PARTITION BY ph.product_id ORDER BY ph.effective_date
        ) AS prev_price,
        LAG(ph.units_sold * 1.0 / ph.days_at_price) OVER (
            PARTITION BY ph.product_id ORDER BY ph.effective_date
        ) AS prev_demand
    FROM pricing_history ph
    JOIN products p ON ph.product_id = p.product_id
)
SELECT
    product_name,
    category,
    prev_price,
    our_price AS new_price,
    ROUND((our_price - prev_price) / prev_price * 100, 2) AS price_change_pct,
    ROUND(prev_demand, 2) AS prev_daily_demand,
    ROUND(daily_demand, 2) AS new_daily_demand,
    ROUND((daily_demand - prev_demand) / prev_demand * 100, 2) AS demand_change_pct,
    ROUND(
        ((daily_demand - prev_demand) / prev_demand) /
        ((our_price - prev_price) / prev_price),
        2
    ) AS point_elasticity
FROM price_changes
WHERE prev_price IS NOT NULL
    AND prev_price != our_price
ORDER BY product_name, new_price;

Query 3: Competitive Price Positioning

Compares our prices with competitor prices to understand the price gap and its impact on our sales.

SELECT
    p.product_name,
    p.category,
    ph.our_price,
    cp.competitor_name,
    cp.competitor_price,
    ROUND(ph.our_price - cp.competitor_price, 2) AS price_diff,
    ROUND((ph.our_price - cp.competitor_price) / cp.competitor_price * 100, 1) AS pct_diff,
    ROUND(ph.units_sold * 1.0 / ph.days_at_price, 2) AS our_daily_units
FROM pricing_history ph
JOIN products p ON ph.product_id = p.product_id
JOIN competitor_prices cp ON ph.product_id = cp.product_id
    AND cp.scraped_date BETWEEN ph.effective_date
        AND DATE_ADD(ph.effective_date, INTERVAL ph.days_at_price DAY)
ORDER BY p.product_name, cp.competitor_name, ph.effective_date;

Query 4: Revenue-Maximizing Price Point by Category

For each category, identifies the price band that generates the highest daily revenue.

WITH daily_metrics AS (
    SELECT
        p.category,
        CASE
            WHEN ph.our_price < 50  THEN 'Under $50'
            WHEN ph.our_price < 100 THEN '$50-$99'
            WHEN ph.our_price < 200 THEN '$100-$199'
            WHEN ph.our_price < 500 THEN '$200-$499'
            ELSE '$500+'
        END AS price_band,
        ph.our_price * (ph.units_sold * 1.0 / ph.days_at_price) AS daily_revenue,
        (ph.our_price - ph.cost_price) * (ph.units_sold * 1.0 / ph.days_at_price) AS daily_profit
    FROM pricing_history ph
    JOIN products p ON ph.product_id = p.product_id
)
SELECT
    category,
    price_band,
    ROUND(AVG(daily_revenue), 2) AS avg_daily_revenue,
    ROUND(AVG(daily_profit), 2)  AS avg_daily_profit,
    COUNT(*)                      AS data_points
FROM daily_metrics
GROUP BY category, price_band
ORDER BY category, avg_daily_revenue DESC;

Python Analysis

Use Python to build demand curves, calculate elasticity, and find optimal prices using regression analysis.

Data Loading and Exploration

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from scipy.optimize import minimize_scalar

# Load data
pricing = pd.read_csv('pricing_history.csv', parse_dates=['effective_date'])
products = pd.read_csv('products.csv')
competitors = pd.read_csv('competitor_prices.csv', parse_dates=['scraped_date'])

# Merge and calculate daily demand
df = pricing.merge(products, on='product_id')
df['daily_demand'] = df['units_sold'] / df['days_at_price']
df['daily_revenue'] = df['our_price'] * df['daily_demand']
df['margin'] = df['our_price'] - df['cost_price']
df['daily_profit'] = df['margin'] * df['daily_demand']

print(f"Products analyzed: {df['product_id'].nunique()}")
print(f"Price points per product: {df.groupby('product_id').size().mean():.1f} avg")
print(f"\nPrice range: ${df['our_price'].min():.2f} - ${df['our_price'].max():.2f}")
print(f"Daily demand range: {df['daily_demand'].min():.1f} - {df['daily_demand'].max():.1f}")

Price Elasticity Calculation

def calculate_elasticity(group):
    """Calculate arc elasticity between consecutive price points."""
    group = group.sort_values('effective_date')
    results = []
    for i in range(1, len(group)):
        p1, p2 = group.iloc[i-1]['our_price'], group.iloc[i]['our_price']
        q1, q2 = group.iloc[i-1]['daily_demand'], group.iloc[i]['daily_demand']

        if p1 == p2 or q1 == 0:
            continue

        # Arc (midpoint) elasticity: more stable than point elasticity
        pct_change_q = (q2 - q1) / ((q1 + q2) / 2)
        pct_change_p = (p2 - p1) / ((p1 + p2) / 2)
        elasticity = pct_change_q / pct_change_p

        results.append({
            'product_id': group.iloc[i]['product_id'],
            'product_name': group.iloc[i]['product_name'],
            'category': group.iloc[i]['category'],
            'price_from': p1,
            'price_to': p2,
            'demand_from': round(q1, 2),
            'demand_to': round(q2, 2),
            'elasticity': round(elasticity, 3)
        })
    return pd.DataFrame(results)

elasticity_df = df.groupby('product_id').apply(calculate_elasticity).reset_index(drop=True)

# Summary by category
cat_elasticity = elasticity_df.groupby('category')['elasticity'].agg(['mean', 'median', 'std']).round(3)
cat_elasticity.columns = ['mean_elasticity', 'median_elasticity', 'std_elasticity']
print("Price Elasticity by Category:")
print(cat_elasticity.to_string())
print("\nInterpretation: Elasticity < -1 = elastic (demand sensitive to price)")
print("               Elasticity > -1 = inelastic (demand less sensitive)")

Demand Curve Regression and Optimal Price

def find_optimal_price(product_data, product_name):
    """
    Fit a demand curve and find the price that maximizes revenue or profit.
    Uses polynomial regression: demand = a*price^2 + b*price + c
    """
    X = product_data['our_price'].values.reshape(-1, 1)
    y = product_data['daily_demand'].values

    # Fit quadratic demand curve
    poly = PolynomialFeatures(degree=2)
    X_poly = poly.fit_transform(X)
    model = LinearRegression()
    model.fit(X_poly, y)

    r_squared = model.score(X_poly, y)
    coefs = model.coef_
    intercept = model.intercept_

    # Revenue = price * demand(price)
    # Profit = (price - cost) * demand(price)
    avg_cost = product_data['cost_price'].mean()

    def neg_revenue(price):
        pred_demand = model.predict(poly.transform([[price]]))[0]
        return -(price * max(pred_demand, 0))

    def neg_profit(price):
        pred_demand = model.predict(poly.transform([[price]]))[0]
        return -((price - avg_cost) * max(pred_demand, 0))

    price_min = product_data['our_price'].min() * 0.8
    price_max = product_data['our_price'].max() * 1.2

    opt_revenue = minimize_scalar(neg_revenue, bounds=(price_min, price_max), method='bounded')
    opt_profit = minimize_scalar(neg_profit, bounds=(price_min, price_max), method='bounded')

    return {
        'product_name': product_name,
        'r_squared': round(r_squared, 4),
        'cost_price': round(avg_cost, 2),
        'revenue_optimal_price': round(opt_revenue.x, 2),
        'max_daily_revenue': round(-opt_revenue.fun, 2),
        'profit_optimal_price': round(opt_profit.x, 2),
        'max_daily_profit': round(-opt_profit.fun, 2),
        'model': model,
        'poly': poly
    }

# Run for each product with enough data points
results = []
for pid, group in df.groupby('product_id'):
    if len(group) >= 4:  # Need at least 4 data points for quadratic fit
        name = group.iloc[0]['product_name']
        result = find_optimal_price(group, name)
        results.append(result)

results_df = pd.DataFrame(results)
print("\nOptimal Pricing Results:")
print(results_df[['product_name', 'cost_price', 'revenue_optimal_price',
                   'profit_optimal_price', 'r_squared']].to_string(index=False))

Visualizations

fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# Select a sample product for detailed visualization
sample = df[df['product_id'] == df['product_id'].value_counts().index[0]].copy()
sample_name = sample.iloc[0]['product_name']
sample_result = [r for r in results if r['product_name'] == sample_name][0]
model = sample_result['model']
poly = sample_result['poly']

# Plot 1: Demand Curve with Regression
ax1 = axes[0, 0]
price_range = np.linspace(sample['our_price'].min() * 0.8,
                           sample['our_price'].max() * 1.2, 100)
pred_demand = model.predict(poly.transform(price_range.reshape(-1, 1)))

ax1.scatter(sample['our_price'], sample['daily_demand'],
            color='#e07a2f', s=80, zorder=5, label='Observed')
ax1.plot(price_range, pred_demand, color='#4a90d9', linewidth=2, label='Demand Curve')
ax1.axvline(x=sample_result['revenue_optimal_price'], color='green',
            linestyle='--', label=f"Rev-Optimal: ${sample_result['revenue_optimal_price']}")
ax1.axvline(x=sample_result['profit_optimal_price'], color='red',
            linestyle='--', label=f"Profit-Optimal: ${sample_result['profit_optimal_price']}")
ax1.set_xlabel('Price ($)')
ax1.set_ylabel('Daily Units Sold')
ax1.set_title(f'Demand Curve: {sample_name[:30]}', fontweight='bold')
ax1.legend(fontsize=8)

# Plot 2: Revenue and Profit Curves
ax2 = axes[0, 1]
pred_demand_clipped = np.maximum(pred_demand, 0)
revenue_curve = price_range * pred_demand_clipped
avg_cost = sample['cost_price'].mean()
profit_curve = (price_range - avg_cost) * pred_demand_clipped

ax2.plot(price_range, revenue_curve, color='#4a90d9', linewidth=2, label='Revenue')
ax2.plot(price_range, profit_curve, color='#e07a2f', linewidth=2, label='Profit')
ax2.axvline(x=sample_result['revenue_optimal_price'], color='#4a90d9',
            linestyle='--', alpha=0.5)
ax2.axvline(x=sample_result['profit_optimal_price'], color='#e07a2f',
            linestyle='--', alpha=0.5)
ax2.set_xlabel('Price ($)')
ax2.set_ylabel('Daily $ Amount')
ax2.set_title('Revenue vs Profit by Price Point', fontweight='bold')
ax2.legend()

# Plot 3: Elasticity Distribution by Category
ax3 = axes[1, 0]
categories = elasticity_df['category'].unique()
elast_data = [elasticity_df[elasticity_df['category'] == c]['elasticity'].dropna()
              for c in categories]
bp = ax3.boxplot(elast_data, labels=categories, patch_artist=True)
colors = ['#e07a2f', '#4a90d9', '#50c878', '#e05050', '#9b59b6']
for patch, color in zip(bp['boxes'], colors[:len(categories)]):
    patch.set_facecolor(color)
    patch.set_alpha(0.6)
ax3.axhline(y=-1, color='red', linestyle='--', alpha=0.5, label='Unit Elastic (-1)')
ax3.set_ylabel('Price Elasticity')
ax3.set_title('Elasticity Distribution by Category', fontweight='bold')
ax3.legend(fontsize=9)
plt.setp(ax3.xaxis.get_majorticklabels(), rotation=30, ha='right')

# Plot 4: Our Price vs Competitor Prices
ax4 = axes[1, 1]
comp_avg = competitors.groupby(['product_id', 'competitor_name'])['competitor_price'].mean().reset_index()
comp_wide = comp_avg.pivot(index='product_id', columns='competitor_name', values='competitor_price')
our_avg = df.groupby('product_id')['our_price'].mean()
comp_wide['TechDirect'] = our_avg
comp_wide = comp_wide.dropna().head(10)

comp_wide.plot(kind='bar', ax=ax4, width=0.8)
ax4.set_ylabel('Average Price ($)')
ax4.set_title('Price Comparison vs Competitors', fontweight='bold')
ax4.set_xlabel('Product ID')
ax4.legend(fontsize=8)
plt.setp(ax4.xaxis.get_majorticklabels(), rotation=0)

plt.suptitle('Price Optimization Analysis', fontsize=16, fontweight='bold')
plt.tight_layout()
plt.savefig('price_optimization.png', dpi=150)
plt.show()

Pricing Recommendations Report

# Generate final pricing recommendations
recommendations = results_df.copy()
recommendations['current_avg_price'] = [
    df[df['product_name'] == name]['our_price'].mean()
    for name in recommendations['product_name']
]
recommendations['revenue_change_pct'] = (
    (recommendations['revenue_optimal_price'] - recommendations['current_avg_price'])
    / recommendations['current_avg_price'] * 100
).round(1)

recommendations['action'] = np.where(
    abs(recommendations['revenue_change_pct']) < 3, 'Keep Current',
    np.where(recommendations['revenue_change_pct'] > 0, 'Increase Price', 'Decrease Price')
)

print("Pricing Recommendations:")
print(recommendations[['product_name', 'current_avg_price', 'profit_optimal_price',
                        'revenue_change_pct', 'action']].to_string(index=False))

recommendations.to_csv('pricing_recommendations.csv', index=False)

Excel Dashboard Instructions

Build a pricing analysis workbook that enables the pricing team to explore scenarios and make decisions.

Step-by-Step Guide

  • Import Data: Import pricing_history.csv and competitor_prices.csv into separate sheets. Add calculated columns: daily_demand = units_sold / days_at_price, daily_revenue = our_price * daily_demand, margin = our_price - cost_price, daily_profit = margin * daily_demand.
  • Pricing Matrix: Create a PivotTable with product_name in Rows and our_price in Columns. Add daily_demand as Values (Average). This shows how demand changes at each price point for every product.
  • Scatter Chart with Trendline: For a selected product, create an XY Scatter chart with our_price on the X-axis and daily_demand on the Y-axis. Add a Polynomial trendline (order 2). Display the equation on the chart. This is your demand curve.
  • What-If Data Table: Create a column of test prices (from min to max in $5 increments). Use the trendline equation to calculate predicted demand for each test price: =intercept + coef1*price + coef2*price^2. Add columns for predicted_revenue = price * demand and predicted_profit = (price - cost) * demand.
  • Optimal Price Finder: Use =INDEX/MATCH to find the price with the maximum revenue: =INDEX(price_column, MATCH(MAX(revenue_column), revenue_column, 0)). Do the same for maximum profit. Highlight these cells in the data table.
  • Competitor Comparison: Create a table with our price vs. each competitor for the same product. Add conditional formatting: green when we are cheaper, red when more expensive. Calculate the average price gap percentage.
  • Scenario Analysis: Use Excel's Data Table feature (What-If Analysis > Data Table) to model different price scenarios. Create a two-variable data table with price on one axis and a demand multiplier (0.8x to 1.2x) on the other, showing revenue at each combination.
  • Elasticity Calculator: Build a helper section where the user enters two (price, demand) points. Calculate arc elasticity using: =((Q2-Q1)/((Q1+Q2)/2)) / ((P2-P1)/((P1+P2)/2)). Label the result as Elastic (<-1), Unit Elastic (=-1), or Inelastic (>-1).
  • Dashboard Layout: Create a Dashboard sheet with: a product selector dropdown at the top, the demand curve chart on the left, revenue/profit chart on the right, competitor comparison below, and a summary card showing Current Price, Recommended Price, Expected Revenue Change %, and Elasticity.
  • Price Change Tracker: Add a log sheet where the team records actual price changes and observed results. Over time this builds a validation dataset to compare predictions vs. actuals and refine the model.
  • Key Insights & Recommendations

    Present these findings to the executive team with clear revenue impact estimates.

    Elastic vs. Inelastic Categories

    Commodity products (cables, adapters) tend to be highly elastic (PED < -2), meaning price cuts boost volume significantly. Premium/branded products (laptops, headphones) are less elastic, allowing for higher margins without losing much volume.

    Revenue vs. Profit Tradeoff

    The revenue-maximizing price is almost always lower than the profit-maximizing price. For market-share goals, price closer to revenue-optimal. For margin goals, price closer to profit-optimal. The gap between these prices is your strategic decision space.

    Competitive Positioning

    Being 5-10% above the lowest competitor reduces volume by only 8-12% for most products, but maintains healthy margins. Being the cheapest is rarely the most profitable strategy unless you have a cost advantage.

    Dynamic Pricing Opportunity

    Products with high elasticity are ideal for dynamic pricing: lower prices during slow periods to stimulate demand, raise prices during peak demand (holiday season) when customers are less price-sensitive.

    Bundle Pricing

    For products with inelastic demand, bundling with elastic products creates value perception. Example: pair a premium laptop (inelastic) with a discounted accessory kit (elastic). The bundle increases total revenue per customer.

    A/B Testing Next Steps

    The regression model provides directional guidance, but real-world validation requires A/B price testing. Start with the 3 products showing the largest gap between current and optimal prices. Run each test for at least 2 weeks with a 50/50 traffic split.

    Knowledge Check Quiz

    Test your understanding of price optimization and demand elasticity.

    Question 1

    A product has a price elasticity of demand (PED) of -2.0. If you increase the price by 10%, what happens to quantity demanded?

    Question 2

    Why does the analysis use a quadratic (polynomial degree 2) regression for the demand curve instead of a simple linear model?

    Question 3

    What does the SQL LAG() window function do in the price elasticity query?

    Question 4

    A product is "inelastic" (PED between -1 and 0). What pricing strategy maximizes revenue?

    Question 5

    Why might the revenue-maximizing price and profit-maximizing price be different for the same product?

    ← Previous Project Next Project →