Project 11

Stock Portfolio Tracker

An individual investor wants to track and analyze their stock portfolio performance across multiple holdings and time periods. You will build a complete analytics pipeline using SQL for data extraction, Python for advanced financial calculations, and Excel for dashboard visualization.

Project Overview

Business Scenario

Sarah is a retail investor who has built a diversified stock portfolio over the past three years. She holds positions across technology, healthcare, finance, and energy sectors. She needs a systematic way to track her portfolio's total value over time, understand her sector allocation, and measure performance against the S&P 500 benchmark. Currently she uses scattered spreadsheets and brokerage statements, making it difficult to get a unified view.

  • Goal: Build a portfolio analytics system that tracks value, measures risk-adjusted returns, and optimizes allocation
  • Tools: SQL (data extraction), Python (financial analysis), Excel (dashboard)
  • Stakeholder: Individual investor wanting to make data-driven investment decisions
  • Deliverables: Portfolio value timeline, sector breakdown, Sharpe ratio report, optimization recommendations

Dataset

Database Schema

The portfolio database consists of four related tables capturing stock prices, holdings, transactions, and benchmark data.

stock_prices

ColumnTypeDescription
price_idINT (PK)Unique price record identifier
tickerVARCHAR(10)Stock ticker symbol (e.g., AAPL, MSFT)
trade_dateDATETrading date
open_priceDECIMAL(10,2)Opening price
close_priceDECIMAL(10,2)Closing price
volumeBIGINTDaily trading volume

portfolio_holdings

ColumnTypeDescription
holding_idINT (PK)Unique holding identifier
tickerVARCHAR(10)Stock ticker symbol
sharesDECIMAL(10,4)Number of shares held
avg_cost_basisDECIMAL(10,2)Average purchase price per share
sectorVARCHAR(50)Industry sector (Technology, Healthcare, etc.)

transactions

ColumnTypeDescription
txn_idINT (PK)Transaction identifier
tickerVARCHAR(10)Stock ticker symbol
txn_typeVARCHAR(4)BUY or SELL
txn_dateDATETransaction date
sharesDECIMAL(10,4)Number of shares traded
price_per_shareDECIMAL(10,2)Execution price per share

benchmarks

ColumnTypeDescription
bench_idINT (PK)Benchmark record identifier
bench_nameVARCHAR(20)Benchmark name (e.g., SP500)
trade_dateDATEDate
close_valueDECIMAL(12,2)Benchmark closing value

SQL Analysis

Query 1: Portfolio Value Over Time

Calculate the total portfolio market value for each trading day by joining holdings with daily closing prices.

SELECT
    sp.trade_date,
    SUM(ph.shares * sp.close_price) AS portfolio_value
FROM portfolio_holdings ph
JOIN stock_prices sp
    ON ph.ticker = sp.ticker
WHERE sp.trade_date >= '2023-01-01'
GROUP BY sp.trade_date
ORDER BY sp.trade_date;

Query 2: Sector Allocation

Break down the portfolio by sector to see how capital is distributed. This uses the most recent closing price for each stock.

WITH latest_prices AS (
    SELECT ticker,
           close_price,
           ROW_NUMBER() OVER (PARTITION BY ticker ORDER BY trade_date DESC) AS rn
    FROM stock_prices
)
SELECT
    ph.sector,
    COUNT(DISTINCT ph.ticker) AS num_stocks,
    SUM(ph.shares * lp.close_price) AS sector_value,
    ROUND(SUM(ph.shares * lp.close_price) * 100.0
        / SUM(SUM(ph.shares * lp.close_price)) OVER (), 2) AS pct_of_portfolio
FROM portfolio_holdings ph
JOIN latest_prices lp
    ON ph.ticker = lp.ticker AND lp.rn = 1
GROUP BY ph.sector
ORDER BY sector_value DESC;

Query 3: Realized vs Unrealized Gains

Separate closed-position profits (realized) from open-position paper gains (unrealized).

-- Realized gains from completed SELL transactions
SELECT
    t.ticker,
    SUM(t.shares * (t.price_per_share - ph.avg_cost_basis)) AS realized_gain
FROM transactions t
JOIN portfolio_holdings ph ON t.ticker = ph.ticker
WHERE t.txn_type = 'SELL'
GROUP BY t.ticker

UNION ALL

-- Unrealized gains on current holdings
SELECT
    ph.ticker,
    ph.shares * (lp.close_price - ph.avg_cost_basis) AS unrealized_gain
FROM portfolio_holdings ph
JOIN (
    SELECT ticker, close_price,
           ROW_NUMBER() OVER (PARTITION BY ticker ORDER BY trade_date DESC) AS rn
    FROM stock_prices
) lp ON ph.ticker = lp.ticker AND lp.rn = 1
ORDER BY ticker;

Python Analysis

Sharpe Ratio Calculation

The Sharpe ratio measures risk-adjusted return. A ratio above 1.0 is generally considered good; above 2.0 is very good.

import pandas as pd
import numpy as np

# Load portfolio daily values from SQL output
portfolio = pd.read_csv('portfolio_daily_values.csv', parse_dates=['trade_date'])
portfolio = portfolio.sort_values('trade_date')

# Calculate daily returns
portfolio['daily_return'] = portfolio['portfolio_value'].pct_change()

# Annualized Sharpe Ratio (assuming 252 trading days, risk-free rate = 4.5%)
risk_free_rate = 0.045
daily_rf = risk_free_rate / 252
excess_returns = portfolio['daily_return'] - daily_rf

sharpe_ratio = (excess_returns.mean() / excess_returns.std()) * np.sqrt(252)
print(f"Annualized Sharpe Ratio: {sharpe_ratio:.2f}")

Cumulative Returns Chart

Visualize portfolio performance against the S&P 500 benchmark over the analysis period.

import matplotlib.pyplot as plt

benchmark = pd.read_csv('benchmark_daily.csv', parse_dates=['trade_date'])
benchmark = benchmark.sort_values('trade_date')

# Normalize both to start at 100
portfolio['cum_return'] = (1 + portfolio['daily_return']).cumprod()
benchmark['daily_return'] = benchmark['close_value'].pct_change()
benchmark['cum_return'] = (1 + benchmark['daily_return']).cumprod()

plt.figure(figsize=(12, 6))
plt.plot(portfolio['trade_date'], portfolio['cum_return'], label='My Portfolio', linewidth=2)
plt.plot(benchmark['trade_date'], benchmark['cum_return'], label='S&P 500', linewidth=2, linestyle='--')
plt.title('Portfolio vs S&P 500 - Cumulative Returns')
plt.xlabel('Date')
plt.ylabel('Growth of $1')
plt.legend()
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('cumulative_returns.png', dpi=150)
plt.show()

Correlation Matrix

Understand how your holdings move relative to each other. Low correlation between stocks means better diversification.

import seaborn as sns

# Pivot stock prices into a wide table of daily returns
prices = pd.read_csv('stock_prices.csv', parse_dates=['trade_date'])
pivot = prices.pivot(index='trade_date', columns='ticker', values='close_price')
returns = pivot.pct_change().dropna()

corr_matrix = returns.corr()

plt.figure(figsize=(10, 8))
sns.heatmap(corr_matrix, annot=True, fmt='.2f', cmap='RdYlGn_r',
            center=0, vmin=-1, vmax=1, square=True)
plt.title('Stock Correlation Matrix')
plt.tight_layout()
plt.savefig('correlation_matrix.png', dpi=150)
plt.show()

Mean-Variance Portfolio Optimization

Use Modern Portfolio Theory to find the allocation that maximizes the Sharpe ratio for a given set of stocks.

from scipy.optimize import minimize

mean_returns = returns.mean() * 252
cov_matrix = returns.cov() * 252
num_assets = len(returns.columns)

def neg_sharpe(weights):
    port_return = np.dot(weights, mean_returns)
    port_vol = np.sqrt(np.dot(weights.T, np.dot(cov_matrix, weights)))
    return -(port_return - risk_free_rate) / port_vol

constraints = {'type': 'eq', 'fun': lambda w: np.sum(w) - 1}
bounds = tuple((0.0, 0.4) for _ in range(num_assets))
initial = np.array([1 / num_assets] * num_assets)

result = minimize(neg_sharpe, initial, method='SLSQP',
                  bounds=bounds, constraints=constraints)

optimal_weights = result.x
for ticker, weight in zip(returns.columns, optimal_weights):
    print(f"{ticker}: {weight:.1%}")

Excel Analysis

  • Import Data: Load the SQL output (portfolio value over time, sector allocation, gains breakdown) into separate Excel worksheets named "Daily Values", "Sectors", and "Gains".
  • Portfolio Dashboard Sheet: Create a summary table at the top showing Total Portfolio Value, Total Cost Basis, Total Gain/Loss, and Overall Return %. Use formulas like =SUMPRODUCT(shares, latest_price) for current value and =(current - cost) / cost for return.
  • Return Calculations: In the Daily Values sheet, add a column for daily return using =(B3-B2)/B2. Then add columns for cumulative return using =PRODUCT(1+C$2:C3)-1. Calculate annualized return with =(1+total_return)^(252/trading_days)-1.
  • Asset Allocation Pie Chart: Select the sector names and their portfolio percentages from the Sectors sheet. Insert a Pie Chart (Insert > Chart > Pie). Add data labels showing both the sector name and percentage. Format with a professional color palette.
  • Performance Line Chart: Plot the cumulative returns of your portfolio and the S&P 500 on the same chart. Use a Line Chart with two series. Add a secondary axis if the scales differ significantly. Include a chart title "Portfolio vs Benchmark Performance".
  • Conditional Formatting: Apply green highlighting to positive gains and red to negative gains in the Gains sheet. Use Data Bars on the sector allocation percentages for a quick visual comparison. Add icon sets (arrows) to the daily return column.
  • Key Insights

    Diversification Matters

    The correlation matrix reveals which holdings move together. A well-diversified portfolio should have low average pairwise correlation (below 0.5), reducing overall volatility without sacrificing returns.

    Risk-Adjusted Returns

    Raw returns can be misleading. The Sharpe ratio normalizes returns by volatility so you can compare a high-flying tech portfolio to a steady dividend portfolio on equal footing.

    Sector Concentration Risk

    If one sector exceeds 40% of portfolio value, a sector-wide downturn could have outsized impact. The sector allocation query helps identify this imbalance before it becomes a problem.

    Tax-Aware Decisions

    Separating realized from unrealized gains helps with tax planning. You can harvest losses to offset gains, and timing sales to hit long-term capital gains thresholds saves real money.

    Quiz

    Question 1

    What does the Sharpe ratio measure?

    Question 2

    In the sector allocation SQL query, what does the window function SUM() OVER() calculate?

    Question 3

    Why is low correlation between portfolio holdings desirable?

    Question 4

    What is the difference between realized and unrealized gains?

    Question 5

    In the portfolio optimization code, what does the constraint np.sum(w) - 1 == 0 enforce?

    ← Previous Project Next Project →