Tracking Capital Flow in the Stock Market
While analyzing a stock’s price action is useful for identifying trends, it does not provide the full picture of how capital is moving through the market. Price reflects where a stock has traded, but not the intensity of participation behind those moves. Some securities make large price moves on relatively ordinary volume, while others see significant capital flow with little change in price. If you focus on price alone, you are missing an important part of the story.
Flow Ratio is designed to capture this missing piece of information.
In this post, we explore flow ratio, a quantitative signal available through the Empirical Markets API. At a high level, flow ratio measures a security’s dollar volume relative to both the broader market and its own history, helping identify where trading activity is unusually concentrated on a cross-sectional basis.
Flow Ratio
Because information is distributed unevenly across the market, many traders try to "follow the money" under the assumption that unusual trading flow may reflect informed positioning.
Flow Ratio builds on this idea by contextualizing a stock’s trading activity. Rather than focusing on raw volume alone, it highlights when a security is absorbing an unusually large share of market-wide trading activity relative to its own baseline. This does not confirm the presence of informed buying or selling, but it does surface periods where market participation is unusually concentrated, which makes those moments worth closer inspection.
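To make the idea concrete, here is a minimal sketch of a flow-ratio-style calculation. It is not the Empirical Markets formula, which this post does not spell out. The function name `toy_flow_ratio`, the 60-day window, and the synthetic tickers are all assumptions made for illustration. The sketch takes each ticker's share of total dollar volume (the cross-sectional part) and standardizes it against that ticker's own trailing history (the time-series part).

```python
import numpy as np
import pandas as pd


def toy_flow_ratio(dollar_volume: pd.DataFrame, window: int = 60) -> pd.DataFrame:
    """Illustrative only: z-score of each ticker's share of total dollar volume.

    dollar_volume: rows are dates, columns are tickers (price * shares traded).
    """
    # Cross-sectional step: each ticker's share of market-wide dollar volume per day.
    share = dollar_volume.div(dollar_volume.sum(axis=1), axis=0)
    # Time-series step: compare today's share with the ticker's own trailing baseline.
    # shift(1) keeps the current day out of its own baseline.
    baseline = share.shift(1).rolling(window)
    return (share - baseline.mean()) / baseline.std()


# Synthetic data, not market data: three hypothetical tickers with similar volume.
rng = np.random.default_rng(0)
dates = pd.bdate_range("2024-01-01", periods=120)
dv = pd.DataFrame(
    rng.lognormal(mean=20, sigma=0.2, size=(120, 3)),
    index=dates,
    columns=["AAA", "BBB", "CCC"],
)
dv.iloc[-1, 0] *= 10  # inject a burst of dollar volume into AAA on the last day

scores = toy_flow_ratio(dv)
print(scores.tail(3).round(2))
```

On the last day the burst pushes the AAA score far above its baseline, while the other two tickers turn negative because their share of the total shrinks. That is the cross-sectional effect the signal is meant to capture.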
With the intuition behind flow ratio in place, we can now move from concept to implementation. Using the Empirical Markets API, we will fetch historical Flow Ratio values alongside price data, merge the two series, and then visualize how unusual trading flow evolves through time.
import os
import time
from datetime import datetime
import matplotlib.pyplot as plt
import pandas as pd
import requests
from dotenv import load_dotenv
load_dotenv()
API_HOST = "https://api.empiricalmarkets.com"
plt.style.use("seaborn-v0_8-whitegrid")
pd.set_option("display.max_rows", 100)
pd.set_option("display.max_columns", 20)
def get_request_headers() -> dict:
    api_key = os.getenv("EM_API_KEY")
    if not api_key:
        raise ValueError("Please set EM_API_KEY in your environment.")
    r = requests.post(
        f"{API_HOST}/v1/token",
        data={"api_key": api_key},
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        timeout=15,
    )
    r.raise_for_status()
    token = r.json()["access_token"]
    return {"Authorization": f"Bearer {token}"}


request_headers = get_request_headers()


def get_me() -> dict:
    r = requests.get(f"{API_HOST}/v1/me", headers=request_headers, timeout=15)
    r.raise_for_status()
    return r.json()


me_info = get_me()
rate_limit = me_info.get("api_rate_limit_per_second") or 1
min_interval = 1.0 / rate_limit


def rate_limited_get(url: str, **kwargs) -> requests.Response:
    r = requests.get(url, **kwargs)
    time.sleep(min_interval)
    return r
START_DATE = datetime(2021, 1, 1)
print(f"Authenticated. Rate limit: {rate_limit} request(s) per second.")
Authenticated. Rate limit: 3 request(s) per second.
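The `rate_limited_get` helper above sleeps for the full interval after every request, regardless of how long the request itself took. A slightly tighter variant, sketched below, sleeps only for whatever part of the interval is still left. `MinIntervalLimiter` is a hypothetical helper written for this post, not part of the Empirical Markets API or of `requests`.

```python
import time


class MinIntervalLimiter:
    """Block until at least `min_interval` seconds have passed since the previous call."""

    def __init__(self, min_interval: float):
        self.min_interval = min_interval
        self._last_call = None  # monotonic timestamp of the previous call

    def wait(self) -> float:
        """Sleep only for the remaining part of the interval; return seconds slept."""
        now = time.monotonic()
        slept = 0.0
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                time.sleep(remaining)
                slept = remaining
        self._last_call = time.monotonic()
        return slept


limiter = MinIntervalLimiter(min_interval=0.05)
first = limiter.wait()   # no previous call, so no sleep
second = limiter.wait()  # called immediately, so it sleeps for most of the interval
print(f"first wait: {first:.3f}s, second wait: {second:.3f}s")
```

To use it, call `limiter.wait()` immediately before each `requests.get(...)`. Slow responses then consume part of the interval instead of adding to it.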
def fetch_date_cursor_pages(
    url: str,
    data_key: str,
    request_headers: dict,
    params: dict | None = None,
    timeout: int = 15,
    max_pages: int | None = None,
) -> list[dict]:
    """Empirical Markets API Data Cursor Pagination Handler."""
    params = dict(params or {})
    frames = []
    page_count = 0
    while True:
        response = rate_limited_get(
            url,
            params=params,
            headers=request_headers,
            timeout=timeout,
        )
        response.raise_for_status()
        payload = response.json()
        rows = payload.get(data_key, [])
        if rows:
            frames.extend(rows)
        cursor = payload.get("date_cursor") or {}
        has_more = cursor.get("has_more", False)
        next_cursor = cursor.get("next_cursor")
        page_count += 1
        if max_pages is not None and page_count >= max_pages:
            break
        if not has_more or not next_cursor:
            break
        params["date_cursor"] = next_cursor
    return frames
def get_ticker_quant_signal(
    ticker: str,
    request_headers: dict,
    signal: str = "flow-ratio",
    start_date: datetime = START_DATE,
    end_date: datetime | None = None,
    universe: str = "all",  # all, etfs, or equities
    limit: int = 756,
) -> pd.DataFrame:
    params = {
        "start_date": start_date.strftime("%Y-%m-%d"),
        "universe": universe,
        "variant": "default",
        "limit": limit,
    }
    if end_date is not None:
        params["end_date"] = end_date.strftime("%Y-%m-%d")
    frames = fetch_date_cursor_pages(
        f"{API_HOST}/v1/quant/signals/{signal}/tickers/{ticker}",
        data_key="data",
        params=params,
        request_headers=request_headers,
    )
    # An empty response has no "date" column, so return early instead of raising a KeyError.
    if not frames:
        return pd.DataFrame()
    df = pd.DataFrame(frames)
    df["date"] = pd.to_datetime(df["date"])
    return df.set_index("date").sort_index()
flow_ratio_df = get_ticker_quant_signal("AAPL", request_headers=request_headers, signal="flow-ratio")
flow_ratio_df
| date | ticker | signal | value |
|---|---|---|---|
| 2021-01-04 | AAPL | flow-ratio | 0.046401 |
| 2021-01-05 | AAPL | flow-ratio | -0.311464 |
| 2021-01-06 | AAPL | flow-ratio | -0.029113 |
| 2021-01-07 | AAPL | flow-ratio | -0.447678 |
| 2021-01-08 | AAPL | flow-ratio | -0.643330 |
| ... | ... | ... | ... |
| 2026-04-08 | AAPL | flow-ratio | -0.608343 |
| 2026-04-09 | AAPL | flow-ratio | -1.055673 |
| 2026-04-10 | AAPL | flow-ratio | -0.430429 |
| 2026-04-13 | AAPL | flow-ratio | -0.023262 |
| 2026-04-14 | AAPL | flow-ratio | 0.610747 |
1325 rows × 3 columns
def get_ticker_ohlc_data(
    ticker: str,
    request_headers: dict,
    start_date: datetime = START_DATE,
    chunk_size: int = 1260,
) -> pd.DataFrame:
    print(f"Fetching OHLC data for ticker: {ticker}")
    frames = fetch_date_cursor_pages(
        f"{API_HOST}/v1/ohlc/tickers/{ticker}",
        data_key="data",
        params={"limit": chunk_size, "start_date": start_date.strftime("%Y-%m-%d")},
        request_headers=request_headers,
    )
    if not frames:
        return pd.DataFrame()
    df = pd.DataFrame(frames)
    df["date"] = pd.to_datetime(df["date"])
    return df.set_index("date").sort_index()
ohlc_df = get_ticker_ohlc_data("AAPL", request_headers=request_headers)
ohlc_df
Fetching OHLC data for ticker: AAPL
| date | ticker | asset_class | frequency | open | high | low | close | volume |
|---|---|---|---|---|---|---|---|---|
| 2021-01-04 | AAPL | equities | D | 129.989023 | 130.078200 | 123.407793 | 125.987713 | 143301887.0 |
| 2021-01-05 | AAPL | equities | D | 125.481464 | 128.256095 | 125.033629 | 127.545400 | 97664898.0 |
| 2021-01-06 | AAPL | equities | D | 124.342405 | 127.584245 | 123.039789 | 123.252024 | 155087970.0 |
| 2021-01-07 | AAPL | equities | D | 124.965480 | 128.149004 | 124.478703 | 127.457780 | 109578157.0 |
| 2021-01-08 | AAPL | equities | D | 128.927848 | 129.122559 | 126.786028 | 128.557897 | 105158245.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2026-04-08 | AAPL | equities | D | 258.450000 | 259.749900 | 256.530000 | 258.900000 | 41032772.0 |
| 2026-04-09 | AAPL | equities | D | 259.000000 | 261.120000 | 256.070000 | 260.490000 | 27823687.0 |
| 2026-04-10 | AAPL | equities | D | 259.980000 | 262.190000 | 259.023123 | 260.480000 | 31291473.0 |
| 2026-04-13 | AAPL | equities | D | 259.730000 | 260.180000 | 256.660000 | 259.200000 | 36234698.0 |
| 2026-04-14 | AAPL | equities | D | 259.245000 | 261.930000 | 257.190000 | 258.830000 | 46952470.0 |
1325 rows × 8 columns
df = ohlc_df.merge(
    flow_ratio_df[["value"]].rename(columns={"value": "flow_ratio"}),
    how="left",
    left_index=True,
    right_index=True,
)
df
| date | ticker | asset_class | frequency | open | high | low | close | volume | flow_ratio |
|---|---|---|---|---|---|---|---|---|---|
| 2021-01-04 | AAPL | equities | D | 129.989023 | 130.078200 | 123.407793 | 125.987713 | 143301887.0 | 0.046401 |
| 2021-01-05 | AAPL | equities | D | 125.481464 | 128.256095 | 125.033629 | 127.545400 | 97664898.0 | -0.311464 |
| 2021-01-06 | AAPL | equities | D | 124.342405 | 127.584245 | 123.039789 | 123.252024 | 155087970.0 | -0.029113 |
| 2021-01-07 | AAPL | equities | D | 124.965480 | 128.149004 | 124.478703 | 127.457780 | 109578157.0 | -0.447678 |
| 2021-01-08 | AAPL | equities | D | 128.927848 | 129.122559 | 126.786028 | 128.557897 | 105158245.0 | -0.643330 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2026-04-08 | AAPL | equities | D | 258.450000 | 259.749900 | 256.530000 | 258.900000 | 41032772.0 | -0.608343 |
| 2026-04-09 | AAPL | equities | D | 259.000000 | 261.120000 | 256.070000 | 260.490000 | 27823687.0 | -1.055673 |
| 2026-04-10 | AAPL | equities | D | 259.980000 | 262.190000 | 259.023123 | 260.480000 | 31291473.0 | -0.430429 |
| 2026-04-13 | AAPL | equities | D | 259.730000 | 260.180000 | 256.660000 | 259.200000 | 36234698.0 | -0.023262 |
| 2026-04-14 | AAPL | equities | D | 259.245000 | 261.930000 | 257.190000 | 258.830000 | 46952470.0 | 0.610747 |
1325 rows × 9 columns
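Because the merge is a left join on the date index, any trading day that has a price but no signal value ends up with a `NaN` in `flow_ratio`. The small sketch below shows one way to check for such gaps using pandas' `indicator` and `validate` options. The frames `prices` and `signal` are toy stand-ins built for this example, not API output.

```python
import pandas as pd

dates = pd.to_datetime(["2021-01-04", "2021-01-05", "2021-01-06"])
prices = pd.DataFrame({"close": [125.99, 127.55, 123.25]}, index=dates)
# The signal is deliberately missing 2021-01-06 to show what a gap looks like.
signal = pd.DataFrame({"flow_ratio": [0.046, -0.311]}, index=dates[:2])

merged = prices.merge(
    signal,
    how="left",
    left_index=True,
    right_index=True,
    indicator=True,  # adds a "_merge" column: "both" or "left_only"
    validate="one_to_one",  # raises if either side has duplicate dates
)

missing_dates = merged.index[merged["_merge"] == "left_only"]
print(f"{len(missing_dates)} price row(s) without a signal value")
print(list(missing_dates.strftime("%Y-%m-%d")))
```

In the AAPL example both frames have 1325 rows, so we would expect no gaps, but the same check is cheap to run when looping over many tickers.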
With the data successfully extracted, we can begin to visualize the interaction between flow ratio and price:
def plot_flow_ratio_signal(df: pd.DataFrame):
    ticker = df["ticker"].iloc[0]
    fig, ax1 = plt.subplots(figsize=(12, 6))
    ax1.set_title(f"{ticker} Price and Flow Ratio Over Time")
    ax1.plot(df.index, df["close"], label="Close Price", color="black")
    ax1.set_xlabel("Date")
    ax1.set_ylabel("Close Price", color="black")
    ax1.tick_params(axis="y", labelcolor="black")
    ax2 = ax1.twinx()
    ax2.plot(df.index, df["flow_ratio"], label="Flow Ratio", color="blue")
    ax2.set_ylabel("Flow Ratio", color="blue")
    ax2.tick_params(axis="y", labelcolor="blue")
    fig.tight_layout()
    plt.show()
plot_flow_ratio_signal(df)
The most noticeable feature of the chart is that the flow ratio signal exhibits occasional large spikes. As the rest of this post shows, these spikes are the part of the signal we can take advantage of.
Flow Ratio is expressed as a standardized deviation from its mean, so if it were normally distributed we would expect roughly 95% of observations to fall between -2 and 2. In practice, market data rarely follows a clean normal distribution, and this signal is no exception: the distribution tends to exhibit a pronounced right tail, with infrequent but significant positive outliers.
This behavior is expected. Because the signal is designed to capture abnormal bursts of traded dollar flow relative to the market, large spikes correspond to periods where a stock is absorbing a disproportionate share of trading activity. These tail events represent moments of concentrated participation, and they are the primary focus of the analysis.
df["flow_ratio"].hist(density=True, bins=30, alpha=0.6)
df["flow_ratio"].plot.kde(color="darkblue")
plt.title("Distribution of Flow Ratio Values")
plt.xlabel("Flow Ratio")
plt.ylabel("Density")
plt.show()
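The histogram shows the right tail visually. To put numbers on it, a short summary of skewness and tail frequencies can help. The sketch below is one way to do that. `tail_summary` is a helper written for this post, the +/-2 threshold is an arbitrary choice, and the data is a synthetic right-skewed sample rather than real flow ratio values.

```python
import numpy as np
import pandas as pd


def tail_summary(values: pd.Series, threshold: float = 2.0) -> pd.Series:
    """Summarize asymmetry: skewness plus the share of observations beyond +/-threshold."""
    clean = values.dropna()
    return pd.Series(
        {
            "skew": clean.skew(),
            "share_above": (clean > threshold).mean(),
            "share_below": (clean < -threshold).mean(),
            "p01": clean.quantile(0.01),
            "p99": clean.quantile(0.99),
        }
    )


# Synthetic stand-in: a standardized lognormal sample, right-skewed by construction.
rng = np.random.default_rng(42)
raw = pd.Series(rng.lognormal(mean=0.0, sigma=0.75, size=5000))
synthetic = (raw - raw.mean()) / raw.std()

summary = tail_summary(synthetic)
print(summary.round(3))
```

With the merged frame from earlier, the equivalent call would be `tail_summary(df["flow_ratio"])`. A positive skew together with `share_above` well in excess of `share_below` is the numerical counterpart of the right tail seen in the plot.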
We refer to these tail events as anomalies. They represent trading sessions where a stock’s trading activity is unusually large relative to the broader market. Using the API, we can retrieve these anomalies directly and plot them against price to see how periods of concentrated trading flow align with price action.
def get_ticker_signal_anomaly(
    ticker: str,
    request_headers: dict,
    signal: str = "flow-ratio",
    start_date: datetime = START_DATE,
) -> pd.DataFrame:
    # Use the rate-limited helper and a timeout, consistent with the other API calls.
    r = rate_limited_get(
        f"{API_HOST}/v1/quant/signals/{signal}/tickers/{ticker}/anomalies",
        headers=request_headers,
        params={"start_date": start_date.strftime("%Y-%m-%d")},
        timeout=15,
    )
    r.raise_for_status()
    anomalies = r.json().get("anomalies", [])
    df = pd.DataFrame(anomalies)
    if df.empty:
        # Keep the "anomaly" column so downstream merges still work when nothing is returned.
        return pd.DataFrame(columns=["anomaly"], index=pd.DatetimeIndex([], name="date"))
    df["date"] = pd.to_datetime(df["date"])
    df = df.set_index("date").sort_index()
    df = df.rename(columns={"value": "anomaly"})
    return df
def plot_ticker_signal_anomalies(df: pd.DataFrame):
    is_anomaly_mask = ~df["anomaly"].isna()
    fig, ax1 = plt.subplots(figsize=(12, 6))
    ax1.set_title(f"{df['ticker'].iloc[0]} Flow Ratio Anomalies Over Time")
    ax1.plot(df.index, df["close"], label="Close Price", color="black")
    ax1.set_xlabel("Date")
    ax1.set_ylabel("Close Price", color="black")
    ax1.tick_params(axis="y", labelcolor="black")
    ax1.scatter(
        df.index[is_anomaly_mask],
        df["close"][is_anomaly_mask],
        label="Anomalies",
        color="blue",
        marker="o",
        s=100
    )
    fig.tight_layout()
    plt.show()
flow_ratio_anomaly_df = get_ticker_signal_anomaly("AAPL", request_headers=request_headers, signal="flow-ratio")
df = df.merge(
    flow_ratio_anomaly_df[["anomaly"]],
    how="left",
    left_index=True,
    right_index=True,
)
plot_ticker_signal_anomalies(df)
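The anomalies above come straight from the API, and this post does not describe the rule the endpoint uses to select them. For intuition, a simple local approximation is sketched below: flag any observation that exceeds a high quantile of everything observed before it. The function name `flag_upper_tail`, the 99th-percentile cutoff, and the 252-day warm-up are assumptions chosen for illustration, and the series is synthetic.

```python
import numpy as np
import pandas as pd


def flag_upper_tail(values: pd.Series, q: float = 0.99, min_periods: int = 252) -> pd.Series:
    """Flag observations above the q-th quantile of all *earlier* observations."""
    # shift(1) ensures each day's cutoff uses only data available before that day.
    cutoff = values.shift(1).expanding(min_periods=min_periods).quantile(q)
    # Comparisons against a NaN cutoff are False, so the warm-up period is never flagged.
    return values > cutoff


# Synthetic series with two injected spikes, standing in for a flow ratio history.
rng = np.random.default_rng(7)
dates = pd.bdate_range("2021-01-01", periods=1000)
flow = pd.Series(rng.standard_normal(1000), index=dates, name="flow_ratio")
flow.iloc[[400, 800]] = [6.0, 7.5]

flags = flag_upper_tail(flow)
print(flow[flags].round(2))
```

Using only past data for the cutoff avoids lookahead, which matters if the flags are later compared with subsequent price moves. Applied to the merged frame, `df["flow_ratio"].where(flag_upper_tail(df["flow_ratio"]))` would give a column shaped like the `anomaly` column, although the flagged dates need not match the API's.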
Because flow ratio captures periods of unusually concentrated trading activity relative to the broader market, its behavior can vary across different types of securities and market conditions. To illustrate this, we examine several examples.
tickers = ["SPY", "MSFT", "NVDA", "JNJ", "GLD"]
for t in tickers:
    t_ohlc_df = get_ticker_ohlc_data(t, request_headers=request_headers, start_date=START_DATE)
    t_anomaly = get_ticker_signal_anomaly(t, request_headers=request_headers, signal="flow-ratio", start_date=START_DATE)
    t_df = t_ohlc_df.merge(
        t_anomaly[["anomaly"]],
        how="left",
        left_index=True,
        right_index=True,
    )
    plot_ticker_signal_anomalies(t_df)
Fetching OHLC data for ticker: SPY
Fetching OHLC data for ticker: MSFT
Fetching OHLC data for ticker: NVDA
Fetching OHLC data for ticker: JNJ
Fetching OHLC data for ticker: GLD
The charts above show that flow ratio anomalies mark periods of concentrated capital flow, but their interpretation is regime-dependent. In range-bound markets, anomalies often occur near local extremes and can signal short-term exhaustion and mean reversion. During strong trends, by contrast, anomalies frequently cluster as price advances, reflecting sustained participation and trend acceleration.
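One way to move from visual impressions to something measurable is to compare forward returns after anomaly days with forward returns on all other days. The sketch below shows the mechanics only. `forward_return_summary` is a helper written for this post, the 5-session horizon is arbitrary, and the prices and event dates are synthetic placeholders, so the printed numbers say nothing about real anomalies.

```python
import numpy as np
import pandas as pd


def forward_return_summary(close: pd.Series, is_event: pd.Series, horizon: int = 5) -> pd.DataFrame:
    """Compare forward returns after event days with all other days."""
    # Return from today's close to the close `horizon` sessions later.
    fwd = close.shift(-horizon) / close - 1.0
    frame = pd.DataFrame({"fwd_return": fwd, "is_event": is_event.astype(bool)}).dropna()
    return frame.groupby("is_event")["fwd_return"].agg(["count", "mean", "median"])


# Synthetic random-walk prices and placeholder event dates, not real anomalies.
rng = np.random.default_rng(1)
dates = pd.bdate_range("2021-01-01", periods=500)
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.01, 500))), index=dates)
events = pd.Series(False, index=dates)
events.iloc[[50, 200, 350]] = True

result = forward_return_summary(close, events)
print(result)
```

With the merged frame from earlier, the call would be `forward_return_summary(df["close"], df["anomaly"].notna())`. Anomalies are rare by construction, so the event group is small and any difference in means should be read with caution. Splitting the comparison by a trend filter of your choice is a natural next step for testing the regime-dependence described above.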
Conclusion
Flow Ratio is useful because it highlights abnormal trading activity within a security relative to both the broader market and its own history. This makes it an effective way to surface names where something unusual may be happening, particularly since price alone does not capture the full picture.
Flow Ratio does not explain who is trading or why, nor does it constitute a complete trading system on its own. Instead, it provides a structured way to identify periods of unusually concentrated market participation. In practice, this makes it a valuable complement to price analysis and event-driven research, and a useful input feature for machine learning models.
The information presented in this analysis is for educational and informational purposes only. It does not constitute financial, investment, or trading advice, and should not be interpreted as a recommendation to buy, sell, or hold any securities. The methods shown in this article are for analysis only and should not be used for live trading. Past performance does not indicate future results.