A Tour of the Empirical Markets API¶
Empirical Markets is an all-in-one financial data platform built for quantitative researchers, data scientists, and developers. Rather than stitching together data from a dozen different providers, the Empirical Markets API gives you a single, consistent interface to a comprehensive financial data ecosystem — spanning market data, company fundamentals, macroeconomic indicators, Federal Reserve operations, insider activity, and much more.
At the core of the platform is a suite of proprietary AI-powered predictions and signals, trained on historical market data and designed to surface actionable intelligence that traditional data providers simply don't offer. These models are continuously evaluated and updated, with full transparency into their performance metrics.
This notebook provides a guided tour of the API, covering:
- Market data: stock prices, technical indicators, and quant signals
- AI predictions and sentiment analysis powered by Empirical Markets' proprietary models
- Company data: SEC filings and fundamental data
- Macroeconomic data, yield curves, and Federal Reserve operations
- Insider trading data with built-in anomaly detection
Please note that this notebook assumes you have access to all API routes. If you currently have the Demo API plan, then you can go to your Account portal to upgrade. If you wish to remain on free, then please note that some of the code blocks below will throw a 403: Forbidden error due to your lack of access. That being said, the Demo tier still offers a lot, so you'll still be able to get a lot out of this post.
Accessing the API¶
To access the API, you must sign up for an Empirical Markets account and verify your email.
This will give you access to both the Markets Intelligence Platform and the API.
After creating an account and logging in, you can navigate to the API Dashboard. This interface allows your to track usage and manage API keys. To use the API, you will need to generate an API key.
Authentication¶
For authentication, the Empirical Markets API supports an OAuth2 Bearer flow. This consists of the following:
- Exchanging your API key for a short-lived JSON Web Token (JWT), often called an access token.
- Using this token in request headers
To initiate the authentication flow, you must make a POST request to https://api.empiricalmarkets.com/v1/token
import os
import requests
from dotenv import load_dotenv
load_dotenv() # load environment variables from .env file
API_AUTH_HEADERS = {"Content-Type": "application/x-www-form-urlencoded"}
API_HOST = "https://api.empiricalmarkets.com"
def get_request_headers() -> dict:
api_key = os.getenv("EM_API_KEY")
if api_key is None:
raise ValueError("API key not found. Please set the EM_API_KEY environment variable.")
r = requests.post(
f"{API_HOST}/v1/token",
data={"api_key": api_key},
headers={"Content-Type": "application/x-www-form-urlencoded"}
)
r.raise_for_status()
token = r.json().get("access_token")
request_headers = {"Authorization": "Bearer {}".format(token)}
return request_headers
request_headers = get_request_headers()
print(str(request_headers)[:50] + "...}")
{'Authorization': 'Bearer eyJhbGciOiJIUzI1NiIsInR5...}
Making Your First Request¶
Now that an access token has been obtained, you can make your first request. To retrieve information about your access token, you can query the /me endpoint.
def get_me(request_headers: dict) -> dict:
r = requests.get(f"{API_HOST}/v1/me", headers=request_headers)
r.raise_for_status()
return r.json()
get_me(request_headers)
{'expires_at': '2026-04-13T20:59:52+00:00',
'ttl': 3597,
'api_quota_limit': 5000,
'api_quota_period_start': '2026-04-13T19:54:16',
'api_quota_period_end': '2026-05-13T19:54:16',
'api_rate_limit_per_second': 3,
'api_rate_limit_per_minute': 180}
The /me route returns important information regarding your current access. The api_quota_limit field is your monthly request quota, while the api_rate_limit_per_second and api_rate_limit_per_minute keys tell you your rate limits. Your rate limits are governed by your subscription tier.
Below, we will write a utility to take advantage of this information the /me route provides in order to avoid 403: Forbidden errors.
import time
me_info = get_me(request_headers)
# Use the rate limit from the /me endpoint; fall back to 1 req/s (free tier)
_rate_limit_per_second = me_info.get("api_rate_limit_per_second") or 1
_min_interval = 1.0 / _rate_limit_per_second # e.g. 1.0s for free, 0.33s for 3 req/s
def rate_limited_get(url: str, **kwargs) -> requests.Response:
"""Drop-in replacement for requests.get that respects the tier rate limit."""
response = requests.get(url, **kwargs)
time.sleep(_min_interval)
return response
print(f"Built rate limit utility with allowed rate limit: {_rate_limit_per_second} req/s. The program will sleep {_min_interval:.2f}s between API calls.")
Built rate limit utility with allowed rate limit: 3 req/s. The program will sleep 0.33s between API calls.
The Empirical Markets Data Ecosystem¶
The Empirical Markets API covers many different data domains, allowing users to retrieve AI predictions, stock prices, fundamental data, macro data, sec filings, and much more.
- Symbol Discovery Catalog
- SEC Filings
- Historical Stock Prices
- Technical Analysis
- Quant Signals & Anomaly Detection
- Capital Asset Pricing Model (CAPM)
- Ticker Correlations
- AI Predictions
- AI Models
- AI-Powered Sentiment Analysis
- Fundamental Data
- Macroeconomic Data
- Yield Curve
- Insider Trading
- Insider Aggregates & Anomaly Detection
- Federal Reserve Operations
- Federal Reserve Holdings
Symbol Discovery Catalog¶
The catalog is the starting point for most API workflows. It provides symbol mappings and metadata across equities, ETFs, exchanges, and industries, helping you discover what tickers are available before querying time series data.
The functions below retrieve the full list of tradeable symbols by asset class. This is useful for building screeners, batch data pipelines, or machine learning models.
import pandas as pd
def get_asset_class_symbols(asset_class: str, request_headers: dict) -> list:
if asset_class not in ["equities", "etfs"]:
raise ValueError("Invalid asset class. Must be 'equities' or 'etfs'.")
r = rate_limited_get(f"{API_HOST}/v1/catalog/assets/{asset_class}/tickers", headers=request_headers)
r.raise_for_status()
r = r.json()
df = pd.DataFrame(r["tickers"])
df["asset_class"] = asset_class
return df
get_asset_class_symbols(asset_class="equities", request_headers=request_headers)
| ticker | name | asset_class | |
|---|---|---|---|
| 0 | HGAS | Global Gas Corp | equities |
| 1 | IHS | IHS Holding Ltd | equities |
| 2 | ECPG | ENCORE CAPITAL GROUP INC | equities |
| 3 | RNGR | Ranger Energy Services, Inc. | equities |
| 4 | AIMUF | Aimfinity Investment Corp. I | equities |
| ... | ... | ... | ... |
| 9568 | ABXXF | Abaxx Technologies Inc. | equities |
| 9569 | NIQ | NIQ Global Intelligence plc | equities |
| 9570 | VRSSF | Verses AI Inc. | equities |
| 9571 | GFR | Greenfire Resources Ltd. | equities |
| 9572 | NFTM | Buildablock Corp. | equities |
9573 rows × 3 columns
You can also look up symbols by exchange. This is useful when you want to narrow your universe to a specific market (e.g., NYSE-listed companies only).
def get_exchange_symbols(exchange: str, request_headers: dict) -> list:
_allowed_exchanges = ["CBOE", "NASDAQ", "NYSE", "OTC"]
if exchange not in _allowed_exchanges:
raise ValueError(f"Invalid exchange. Must be one of {_allowed_exchanges}.")
r = rate_limited_get(f"{API_HOST}/v1/catalog/exchanges/{exchange}/tickers", headers=request_headers)
r.raise_for_status()
r = r.json()
df = pd.DataFrame(r["tickers"], columns=["ticker"])
df["exchange"] = exchange
return df
get_exchange_symbols(exchange="NYSE", request_headers=request_headers)
| ticker | exchange | |
|---|---|---|
| 0 | A | NYSE |
| 1 | AA | NYSE |
| 2 | AAMI | NYSE |
| 3 | AAP | NYSE |
| 4 | AAT | NYSE |
| ... | ... | ... |
| 3268 | ZTO | NYSE |
| 3269 | ZTR | NYSE |
| 3270 | ZTS | NYSE |
| 3271 | ZVIA | NYSE |
| 3272 | ZWS | NYSE |
3273 rows × 2 columns
There may be times when you want to group tickers together based on their underlying business activities. For this use case, the catalog allows you to easily access the Standard Industrial Classification (SIC) codes used by the SEC nad other US Government agencies.
These codes are useful for industry-level analysis, or to filter equities by their underlying line of business.
def get_industry_codes(request_headers: dict):
r = rate_limited_get(f"{API_HOST}/v1/catalog/sic", headers=request_headers)
r.raise_for_status()
r = r.json()
df = pd.DataFrame(r["sic_codes"])
return df
get_industry_codes(request_headers=request_headers)
| code | description | |
|---|---|---|
| 0 | 0 | |
| 1 | 100 | Agricultural Production-Crops |
| 2 | 200 | Agricultural Prod-Livestock & Animal Specialties |
| 3 | 700 | Agricultural Services |
| 4 | 900 | Fishing, Hunting and Trapping |
| ... | ... | ... |
| 399 | 8742 | Services-Management Consulting Services |
| 400 | 8744 | Services-Facilities Support Management Services |
| 401 | 8880 | American Depositary Receipts |
| 402 | 8900 | Services-Services, NEC |
| 403 | 9995 | Non-Operating Establishments |
404 rows × 2 columns
def get_industry_code_symbols(sic_code: str, request_headers: dict):
r = rate_limited_get(f"{API_HOST}/v1/catalog/sic/{sic_code}/tickers", headers=request_headers)
r.raise_for_status()
r = r.json()
df = pd.DataFrame(r["tickers"], columns=["ticker"])
df["sic_code"] = sic_code
return df
get_industry_code_symbols(sic_code="3571", request_headers=request_headers)
| ticker | sic_code | |
|---|---|---|
| 0 | AAPL | 3571 |
| 1 | DELL | 3571 |
| 2 | OMCL | 3571 |
| 3 | OSS | 3571 |
| 4 | SCKT | 3571 |
| 5 | SMCI | 3571 |
| 6 | ZEPP | 3571 |
For a specific ticker, the catalog returns rich metadata including company name, exchange, sector, industry, and SIC code. This is the lightweight profile endpoint for any symbol.
def get_ticker_info(ticker: str, request_headers: dict) -> dict:
r = rate_limited_get(f"{API_HOST}/v1/catalog/tickers/{ticker}", headers=request_headers)
r.raise_for_status()
return pd.Series(r.json())
get_ticker_info(ticker="AAPL", request_headers=request_headers)
ticker AAPL name Apple Inc. exchange Nasdaq asset_class equities asset_class_underlying None sic_code 3571 sic_description Electronic Computers fiscal_year_end 0926 state_of_incorporation CA last_close 260.48 pct_return -0.000038 as_of 2026-04-10 dtype: object
Get Ticker SEC Filings¶
Every public company is required to file periodic reports with the U.S. Securities and Exchange Commission (SEC). These filings are the authoritative source of a company's financial position and material events, and include:
- 10-K — Annual report with audited financials and business overview
- 10-Q — Quarterly report with unaudited financials
- 8-K — Current report disclosing material events (earnings, acquisitions, executive changes, etc.)
- DEF 14A — Proxy statement disclosing executive compensation and shareholder votes
The Empirical Markets API gives you programmatic access to the full filing history for any covered ticker, including filing type, submission date, and accession number. This is useful for building document retrieval pipelines, tracking disclosure timing relative to price action, or simply staying on top of what a company has recently reported to regulators.
def get_ticker_sec_filings(ticker: str, request_headers: dict) -> pd.DataFrame:
r = rate_limited_get(f"{API_HOST}/v1/company/{ticker}/filings", headers=request_headers, params={"order": "desc", "limit": 500})
r.raise_for_status()
r = r.json()
df = pd.DataFrame(r["filings"])
df["ticker"] = ticker
return df
get_ticker_sec_filings(ticker="AAPL", request_headers=request_headers)
| ticker | accession_number | form | filed_date | report_date | url | primary_url | items | |
|---|---|---|---|---|---|---|---|---|
| 0 | AAPL | 0001140361-26-013192 | 4 | 2026-04-03T00:00:00 | 2026-04-01T00:00:00 | https://www.sec.gov/Archives/edgar/data/320193... | https://www.sec.gov/Archives/edgar/data/320193... | None |
| 1 | AAPL | 0001140361-26-013191 | 4 | 2026-04-03T00:00:00 | 2026-04-01T00:00:00 | https://www.sec.gov/Archives/edgar/data/320193... | https://www.sec.gov/Archives/edgar/data/320193... | None |
| 2 | AAPL | 0001140361-26-013190 | 4 | 2026-04-03T00:00:00 | 2026-04-01T00:00:00 | https://www.sec.gov/Archives/edgar/data/320193... | https://www.sec.gov/Archives/edgar/data/320193... | None |
| 3 | AAPL | 0001969223-26-000420 | 144 | 2026-04-02T00:00:00 | None | https://www.sec.gov/Archives/edgar/data/320193... | https://www.sec.gov/Archives/edgar/data/320193... | None |
| 4 | AAPL | 0001959173-26-002757 | 144 | 2026-04-02T00:00:00 | None | https://www.sec.gov/Archives/edgar/data/320193... | https://www.sec.gov/Archives/edgar/data/320193... | None |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 495 | AAPL | 0001193125-19-292676 | 8-K | 2019-11-15T00:00:00 | 2019-11-07T00:00:00 | https://www.sec.gov/Archives/edgar/data/320193... | https://www.sec.gov/Archives/edgar/data/320193... | 8.01,9.01 |
| 496 | AAPL | 0000320193-19-000123 | 4 | 2019-11-14T00:00:00 | 2019-11-13T00:00:00 | https://www.sec.gov/Archives/edgar/data/320193... | https://www.sec.gov/Archives/edgar/data/320193... | None |
| 497 | AAPL | 0001193125-19-288412 | 424B2 | 2019-11-08T00:00:00 | None | https://www.sec.gov/Archives/edgar/data/320193... | https://www.sec.gov/Archives/edgar/data/320193... | None |
| 498 | AAPL | 0000320193-19-000121 | 4 | 2019-11-07T00:00:00 | 2019-11-05T00:00:00 | https://www.sec.gov/Archives/edgar/data/320193... | https://www.sec.gov/Archives/edgar/data/320193... | None |
| 499 | AAPL | 0001193125-19-287351 | FWP | 2019-11-07T00:00:00 | None | https://www.sec.gov/Archives/edgar/data/320193... | https://www.sec.gov/Archives/edgar/data/320193... | None |
500 rows × 8 columns
Historical Stock Prices¶
OHLC (Open, High, Low, Close) price data is the most fundamental building block for quantitative and technical analysis. The Empirical Markets API provides daily OHLC data for both equities and ETFs, with prices adjusted for splits and dividends.
Each row represents a single trading day, capturing the open, high, low, and closing prices along with trading volume.
Below, we retrieve OHLC data for a few tickers and plot their cumulative returns over time.
import matplotlib.pyplot as plt
plt.style.use("dark_background")
def get_ticker_ohlc_prices(ticker: str, start_date: str = "2023-01-01"):
r = rate_limited_get(
f"{API_HOST}/v1/ohlc/tickers/{ticker}",
headers=request_headers,
params={"limit": 1000, "start_date": start_date}
)
r.raise_for_status()
r = r.json()
return pd.DataFrame(r["data"])
def plot_ticker_ohlc_prices(tickers: list[str], start_date: str = "2023-01-01"):
plt.figure(figsize=(12, 6))
for ticker in tickers:
price_df = get_ticker_ohlc_prices(ticker, start_date=start_date)
price_df["date"] = pd.to_datetime(price_df["date"])
price_df = price_df.set_index("date").sort_index()
display(price_df.head(3))
price_df["close_pct"] = price_df["close"].pct_change().cumsum()
plt.plot(price_df["close_pct"], label=ticker)
plt.title(f"Cumulative Returns")
plt.xlabel("Date")
plt.ylabel("Price")
plt.legend()
plt.grid()
plt.show()
plot_ticker_ohlc_prices(tickers=["AAPL", "TSLA", "NVDA"], start_date="2023-01-01")
| ticker | asset_class | frequency | open | high | low | close | volume | |
|---|---|---|---|---|---|---|---|---|
| date | ||||||||
| 2023-01-03 | AAPL | equities | D | 128.357017 | 128.967865 | 122.337203 | 123.223918 | 112117471.0 |
| 2023-01-04 | AAPL | equities | D | 125.017054 | 126.756692 | 123.233771 | 124.494877 | 89113633.0 |
| 2023-01-05 | AAPL | equities | D | 125.253512 | 125.884065 | 122.918494 | 123.174656 | 80962708.0 |
| ticker | asset_class | frequency | open | high | low | close | volume | |
|---|---|---|---|---|---|---|---|---|
| date | ||||||||
| 2023-01-03 | TSLA | equities | D | 118.47 | 118.80 | 104.6400 | 108.10 | 231402818.0 |
| 2023-01-04 | TSLA | equities | D | 109.11 | 114.59 | 107.5200 | 113.64 | 180388976.0 |
| 2023-01-05 | TSLA | equities | D | 110.51 | 111.75 | 107.1601 | 110.34 | 157986324.0 |
| ticker | asset_class | frequency | open | high | low | close | volume | |
|---|---|---|---|---|---|---|---|---|
| date | ||||||||
| 2023-01-03 | NVDA | equities | D | 14.836174 | 14.981029 | 14.081927 | 14.300709 | 401276580.0 |
| 2023-01-04 | NVDA | equities | D | 14.552457 | 14.838172 | 14.226783 | 14.734275 | 431323600.0 |
| 2023-01-05 | NVDA | equities | D | 14.476533 | 14.549460 | 14.133825 | 14.250759 | 389168110.0 |
Automated Technical Analysis¶
Technical Analysis (TA) applies mathematical transformations to price data to surface momentum, trend, and volatility signals. Unlike fundamental analysis, technical indicators are derived solely from historical prices and volume.
The Empirical Markets API automates the Technical Analysis pipeline for you, returning data in a format that can be easily joined with price data for visualization and model training.
To demonstrate this, we can retrieve the historical Bollinger Bands for $AAPL and plot them against price.
def get_ticker_ta_indicator(ticker: str, indicator: str, start_date: str = "2024-01-01"):
r = rate_limited_get(
f"{API_HOST}/v1/technicals/{indicator}/tickers/{ticker}",
headers=request_headers,
params={"start_date": "2024-01-01", "limit": 1000}
)
r.raise_for_status()
r = r.json()
return pd.DataFrame(r["data"])
def plot_ticker_bollinger_bands(ticker: str, start_date: str = "2023-01-01"):
price_df = get_ticker_ohlc_prices(ticker, start_date=start_date)
price_df["date"] = pd.to_datetime(price_df["date"])
price_df = price_df.set_index("date").sort_index()
ta_df = get_ticker_ta_indicator(ticker, indicator="bollinger-bands", start_date=start_date)
display(ta_df)
ta_df["date"] = pd.to_datetime(ta_df["date"])
ta_df = ta_df.set_index("date").sort_index()
plot_df = ta_df[["lower", "upper"]].merge(price_df[["close"]], how="inner", left_index=True, right_index=True)
fig, ax = plt.subplots(figsize=(12, 6))
plot_df["close"].plot(ax=ax, label=f"{ticker} Price")
plot_df[["lower", "upper"]].plot(ax=ax, label="Bollinger Bands")
ax.legend()
plt.title(f"{ticker} Price with Bollinger Bands")
plt.show()
plot_ticker_bollinger_bands(ticker="AAPL", start_date="2023-01-01")
| date | ticker | indicator | lower | upper | |
|---|---|---|---|---|---|
| 0 | 2023-12-20T00:00:00 | AAPL | bollinger-bands | 185.480796 | 197.568341 |
| 1 | 2023-12-21T00:00:00 | AAPL | bollinger-bands | 185.694898 | 197.688113 |
| 2 | 2023-12-22T00:00:00 | AAPL | bollinger-bands | 186.103319 | 197.639322 |
| 3 | 2023-12-26T00:00:00 | AAPL | bollinger-bands | 186.543665 | 197.521951 |
| 4 | 2023-12-27T00:00:00 | AAPL | bollinger-bands | 186.904496 | 197.433568 |
| ... | ... | ... | ... | ... | ... |
| 572 | 2026-04-06T00:00:00 | AAPL | bollinger-bands | 244.943589 | 262.017411 |
| 573 | 2026-04-07T00:00:00 | AAPL | bollinger-bands | 245.172217 | 261.150783 |
| 574 | 2026-04-08T00:00:00 | AAPL | bollinger-bands | 245.426809 | 260.703191 |
| 575 | 2026-04-09T00:00:00 | AAPL | bollinger-bands | 245.478075 | 260.619925 |
| 576 | 2026-04-10T00:00:00 | AAPL | bollinger-bands | 245.089737 | 261.480263 |
577 rows × 5 columns
For another example, let's also plot the Relative Strength Index (RSI).
def plot_ticker_rsi(ticker: str, start_date: str = "2023-01-01"):
price_df = get_ticker_ohlc_prices(ticker, start_date=start_date)
price_df["date"] = pd.to_datetime(price_df["date"])
price_df = price_df.set_index("date").sort_index()
ta_df = get_ticker_ta_indicator(ticker, indicator="relative-strength-index", start_date=start_date)
display(ta_df)
ta_df["date"] = pd.to_datetime(ta_df["date"])
ta_df = ta_df.set_index("date").sort_index()
plot_df = ta_df[["value"]].merge(price_df[["close"]], how="inner", left_index=True, right_index=True)
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 6), sharex=True, gridspec_kw={"height_ratios": [3, 1]})
plot_df["close"].plot(ax=ax1, label=f"{ticker} Price")
ax1.legend(loc="upper left")
plot_df["value"].plot(ax=ax2, label="RSI")
ax2.axhline(70, linestyle="--", alpha=0.5)
ax2.axhline(30, linestyle="--", alpha=0.5)
ax2.legend(loc="upper left")
plt.suptitle(f"{ticker} Price vs RSI")
plt.show()
plot_ticker_rsi(ticker="AAPL")
| date | ticker | indicator | value | |
|---|---|---|---|---|
| 0 | 2023-12-26T00:00:00 | AAPL | relative-strength-index | 57.729766 |
| 1 | 2023-12-27T00:00:00 | AAPL | relative-strength-index | 49.317536 |
| 2 | 2023-12-28T00:00:00 | AAPL | relative-strength-index | 53.481052 |
| 3 | 2023-12-29T00:00:00 | AAPL | relative-strength-index | 44.862612 |
| 4 | 2024-01-02T00:00:00 | AAPL | relative-strength-index | 27.561389 |
| ... | ... | ... | ... | ... |
| 569 | 2026-04-06T00:00:00 | AAPL | relative-strength-index | 63.198659 |
| 570 | 2026-04-07T00:00:00 | AAPL | relative-strength-index | 50.363591 |
| 571 | 2026-04-08T00:00:00 | AAPL | relative-strength-index | 56.231267 |
| 572 | 2026-04-09T00:00:00 | AAPL | relative-strength-index | 64.112214 |
| 573 | 2026-04-10T00:00:00 | AAPL | relative-strength-index | 65.684899 |
574 rows × 4 columns
Quant Signals and Anomaly Detection¶
Beyond standard technical indicators, the Empirical Markets API exposes a suite of proprietary quantitative signals. Generally speaking, these cross-sectional signals are computed across the entire universe of supported tickers and normalized to be comparable across assets.
def get_available_quant_signals():
r = rate_limited_get(f"{API_HOST}/v1/quant/signals", headers=request_headers)
r.raise_for_status()
return r.json()["signals"]
def get_ticker_quant_signal(ticker: str, signal: str, start_date: str = "2024-01-01"):
r = rate_limited_get(
f"{API_HOST}/v1/quant/signals/{signal}/tickers/{ticker}",
headers=request_headers,
params={"start_date": start_date, "limit": 1000}
)
r.raise_for_status()
r = r.json()
return pd.DataFrame(r["data"])
def plot_ticker_quant_signal(ticker: str, signal: str, start_date: str = "2024-01-01"):
quant_signal = get_ticker_quant_signal(ticker, signal, start_date=start_date)
quant_signal["date"] = pd.to_datetime(quant_signal["date"])
quant_signal = quant_signal.set_index("date").sort_index()
display(quant_signal)
price_df = get_ticker_ohlc_prices(ticker, start_date=start_date)
price_df["date"] = pd.to_datetime(price_df["date"])
price_df = price_df.set_index("date").sort_index()
fig, ax = plt.subplots(figsize=(12, 6))
price_df["close"].plot(ax=ax, label=f"{ticker} Price")
ax2 = ax.twinx()
quant_signal["value"].plot(ax=ax2, label=signal, color="orange", alpha=0.7)
ax.legend(loc="upper left")
ax2.legend(loc="upper right")
plt.title(f"{ticker} Price vs {signal}")
plt.show()
plot_ticker_quant_signal(ticker="NVDA", signal="flow-ratio")
| ticker | signal | value | |
|---|---|---|---|
| date | |||
| 2024-01-02 | NVDA | flow-ratio | -0.447604 |
| 2024-01-03 | NVDA | flow-ratio | -1.009152 |
| 2024-01-04 | NVDA | flow-ratio | -0.850757 |
| 2024-01-05 | NVDA | flow-ratio | 0.164561 |
| 2024-01-08 | NVDA | flow-ratio | 1.853282 |
| ... | ... | ... | ... |
| 2026-04-06 | NVDA | flow-ratio | -0.279715 |
| 2026-04-07 | NVDA | flow-ratio | -0.339308 |
| 2026-04-08 | NVDA | flow-ratio | -0.930646 |
| 2026-04-09 | NVDA | flow-ratio | -0.935377 |
| 2026-04-10 | NVDA | flow-ratio | 1.001849 |
570 rows × 3 columns
In addition, the API also provides anomaly detection built on top of these quant signals. This feature is especially useful for finding outlier stocks which are behaving abnormally compared to the overall market, as this is often where alpha lies.
def get_ticker_quant_anomalies(ticker: str, signal: str):
r = rate_limited_get(
f"{API_HOST}/v1/quant/signals/{signal}/tickers/{ticker}/anomalies",
headers=request_headers,
)
r.raise_for_status()
r = r.json()
return pd.DataFrame(r["anomalies"]) # is ordered by value (desc)
def plot_ticker_flow_ratio_anomalies(ticker: str, signal: str, label: str):
anomaly_df = get_ticker_quant_anomalies(ticker=ticker, signal=signal)
if anomaly_df.empty:
print(f"No {signal} anomalies found for {ticker}")
return
anomaly_df["date"] = pd.to_datetime(anomaly_df["date"])
anomaly_df = anomaly_df.set_index("date").sort_index()
display(anomaly_df.head(5))
price_df = get_ticker_ohlc_prices(ticker=ticker, start_date="2023-01-01")
price_df["date"] = pd.to_datetime(price_df["date"])
price_df = price_df.set_index("date").sort_index()
plot_df = price_df[["close"]].merge(anomaly_df[["value"]], how="left", left_index=True, right_index=True)
anomaly_points = plot_df[plot_df["value"].notna()]
# anomaly_points = plot_df[plot_df["value"] > 2.]
fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(plot_df.index, plot_df["close"], label=f"{ticker} Price")
ax.scatter(
anomaly_points.index,
anomaly_points["close"],
label=label,
color="magenta",
s=80,
zorder=5
)
ax.legend()
ax.set_title(f"{ticker} {label}")
plt.show()
plot_ticker_flow_ratio_anomalies(ticker="TSLA", signal="relative-volatility", label="Relative Volatility Anomalies")
plot_ticker_flow_ratio_anomalies(ticker="GLD", signal="flow-ratio", label="Flow Ratio Anomalies")
# plot_ticker_flow_ratio_anomalies(ticker="JNJ", signal="drift-strength", label="Drift Strength Anomalies")
| ticker | signal | value | |
|---|---|---|---|
| date | |||
| 2017-07-24 | TSLA | relative-volatility | 2.161442 |
| 2017-07-25 | TSLA | relative-volatility | 2.036218 |
| 2017-07-26 | TSLA | relative-volatility | 2.013650 |
| 2017-08-03 | TSLA | relative-volatility | 2.229516 |
| 2018-04-04 | TSLA | relative-volatility | 2.331087 |
| ticker | signal | value | |
|---|---|---|---|
| date | |||
| 2015-05-13 | GLD | flow-ratio | 3.586427 |
| 2015-07-15 | GLD | flow-ratio | 2.205369 |
| 2015-07-17 | GLD | flow-ratio | 3.676007 |
| 2015-07-20 | GLD | flow-ratio | 3.832583 |
| 2015-10-15 | GLD | flow-ratio | 2.864197 |
Capital Asset Pricing Model (CAPM)¶
The Capital Asset Pricing Model (CAPM) decomposes an asset's return into two components: market-driven return (beta) and idiosyncratic return (alpha). This is foundational for portfolio construction, risk budgeting, and factor attribution.
betameasures sensitivity to broad market moves. A beta > 1 implies the asset amplifies market volatility; beta < 1 implies it dampens it.alphameasures the excess return not explained by market exposure. Persistent positive alpha is the goal of active strategies.
The Empirical Markets API gives you access to rolling CAPM estimates on a per-ticker basis, letting you track how an asset's risk profile evolves over time.
def get_ticker_capm(ticker: str):
r = rate_limited_get(
f"{API_HOST}/v1/quant/capm/tickers/{ticker}",
headers=request_headers,
params={"start_date": "2024-01-01", "limit": 1000}
)
r.raise_for_status()
r = r.json()
return pd.DataFrame(r["data"])
get_ticker_capm(ticker="AAPL")
| ticker | date | period | beta | alpha | |
|---|---|---|---|---|---|
| 0 | AAPL | 2025-11-17T00:00:00 | 22 | 1.141434 | 0.008518 |
| 1 | AAPL | 2025-11-18T00:00:00 | 22 | 1.141153 | 0.008562 |
| 2 | AAPL | 2025-11-19T00:00:00 | 22 | 1.140964 | 0.008585 |
| 3 | AAPL | 2025-11-20T00:00:00 | 22 | 1.140608 | 0.008615 |
| 4 | AAPL | 2025-11-21T00:00:00 | 22 | 1.140251 | 0.008658 |
| ... | ... | ... | ... | ... | ... |
| 88 | AAPL | 2026-04-06T00:00:00 | 22 | 0.951727 | 0.000335 |
| 89 | AAPL | 2026-04-07T00:00:00 | 22 | 0.936637 | 0.000881 |
| 90 | AAPL | 2026-04-08T00:00:00 | 22 | 0.907826 | 0.001647 |
| 91 | AAPL | 2026-04-09T00:00:00 | 22 | 0.864933 | 0.002829 |
| 92 | AAPL | 2026-04-10T00:00:00 | 22 | 0.796200 | 0.004604 |
93 rows × 5 columns
Get Ticker Correlations¶
Pairwise correlation quantifies the degree to which two assets move together. This is essential for portfolio diversification — low or negative correlation between holdings reduces overall portfolio variance.
Using the Empirical Markets API, you can easily obtain the historical correlation series between any two tickers.
def get_ticker_pairwise_correlation(ticker: str, other_ticker: str):
r = rate_limited_get(
f"{API_HOST}/v1/quant/correlation/tickers/{ticker}/pairwise/{other_ticker}",
headers=request_headers,
params={"start_date": "2024-01-01", "limit": 1000}
)
r.raise_for_status()
r = r.json()
return pd.DataFrame(r["data"])
def plot_ticker_pairwise_correlation(ticker: str, other_ticker: str):
corr_df = get_ticker_pairwise_correlation(ticker, other_ticker)
corr_df["date"] = pd.to_datetime(corr_df["date"])
corr_df = corr_df.set_index("date").sort_index()
display(corr_df.head(3))
price_df_1 = get_ticker_ohlc_prices(ticker, start_date="2025-01-01")
price_df_1["date"] = pd.to_datetime(price_df_1["date"])
price_df_1 = price_df_1.set_index("date").sort_index()
price_df_1 = price_df_1[price_df_1.index.isin(corr_df.index)]
price_df_1["close_pct"] = price_df_1["close"].pct_change().cumsum()
price_df_2 = get_ticker_ohlc_prices(other_ticker, start_date="2025-01-01")
price_df_2["date"] = pd.to_datetime(price_df_2["date"])
price_df_2 = price_df_2.set_index("date").sort_index()
price_df_2 = price_df_2[price_df_2.index.isin(corr_df.index)]
price_df_2["close_pct"] = price_df_2["close"].pct_change().cumsum()
fig, ax1 = plt.subplots(figsize=(12, 6))
ax1.plot(price_df_1.index, price_df_1["close_pct"], label=f"{ticker} Cumulative Returns")
ax1.plot(price_df_2.index, price_df_2["close_pct"], label=f"{other_ticker} Cumulative Returns")
ax1.legend(loc="upper left")
ax2 = ax1.twinx()
ax2.plot(corr_df.index, corr_df["value"], label="Pairwise Correlation", color="orange")
ax2.legend(loc="upper right")
plt.title(f"Pairwise Correlation between {ticker} and {other_ticker}")
plt.show()
# get_ticker_pairwise_correlation(ticker="AAPL", other_ticker="MSFT")
plot_ticker_pairwise_correlation(ticker="AAPL", other_ticker="MSFT")
| ticker_a | ticker_b | period | value | |
|---|---|---|---|---|
| date | ||||
| 2025-11-17 | AAPL | MSFT | 22 | 0.238403 |
| 2025-11-18 | AAPL | MSFT | 22 | 0.227405 |
| 2025-11-19 | AAPL | MSFT | 22 | 0.173943 |
AI Predictions¶
The AI events endpoints expose predictions generated by a suite of Empirical Markets AI models. Each model is designed to produce profitable trading signals.
Each model is evaluated on a hold-out test dataset (meaning a range of data not seen during training or model selection). For most models, this is done using time-based splits, where training occurs only on past data and evaluation is performed on future, unseen periods.
This approach prevents data leakage, ensuring the model has no direct or indirect access to future information during training. As a result, all reported evaluation metrics reflect true out-of-sample performance. That being said, market regimes can change, and the evaluation dataset may not reflect all possible regimes. Thus real-world performance may vary.
Evaluation metrics are available via the Empirical Markets API, and may include measures such as precision as well as forward-looking return distributions. These are provided to help you understand the risk/reward profile of each model.
As a convenience to you, the API returns this data already joined with price. This allows for easy plotting and visualization.
def plot_ticker_ai_prediction_events(ticker: str, event: str, color: str = "green"):
r = rate_limited_get(
f"{API_HOST}/v1/ai/events/{event}/tickers/{ticker}",
headers=request_headers,
)
r.raise_for_status()
r = r.json()
subtitle = f"{r['event']}: {r['event_description']}"
event_df = pd.DataFrame(r["data"])
if event_df.empty:
print(f"No {event} events found for {ticker}")
return
event_df["date"] = pd.to_datetime(event_df["date"])
event_df = event_df.set_index("date").sort_index()
display(event_df.head(4))
price_df = get_ticker_ohlc_prices(ticker=ticker, start_date="2025-01-01")
price_df["date"] = pd.to_datetime(price_df["date"])
price_df = price_df.set_index("date").sort_index()
plot_df = price_df[["close"]].merge(event_df[["value"]], how="left", left_index=True, right_index=True)
event_points = plot_df[plot_df["value"].notna()]
fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(plot_df.index, plot_df["close"], label=f"{ticker} Price")
ax.scatter(
event_points.index,
event_points["close"],
label=event,
color=color,
s=80,
zorder=5
)
ax.legend()
plt.suptitle(f"{ticker} AI Prediction Events")
ax.set_title(subtitle)
plt.show()
plot_ticker_ai_prediction_events(ticker="AAPL", event="upside-risk-elevated", color="green")
| model_id | ticker | event | value | value_type | price | |
|---|---|---|---|---|---|---|
| date | ||||||
| 2025-01-15 | 13 | AAPL | upside-risk-elevated | 0.932956 | probability | 236.816526 |
| 2025-04-04 | 13 | AAPL | upside-risk-elevated | 0.949780 | probability | 187.751665 |
| 2025-04-07 | 13 | AAPL | upside-risk-elevated | 0.923327 | probability | 180.854746 |
| 2025-04-10 | 13 | AAPL | upside-risk-elevated | 0.984144 | probability | 189.784860 |
plot_ticker_ai_prediction_events(ticker="TSLA", event="downside-risk-elevated", color="red")
| model_id | ticker | event | value | value_type | price | |
|---|---|---|---|---|---|---|
| date | ||||||
| 2025-01-17 | 13 | TSLA | downside-risk-elevated | 0.963084 | probability | 426.50 |
| 2025-01-21 | 13 | TSLA | downside-risk-elevated | 0.923051 | probability | 424.07 |
| 2025-04-14 | 13 | TSLA | downside-risk-elevated | 0.930543 | probability | 252.35 |
| 2025-04-15 | 13 | TSLA | downside-risk-elevated | 0.960200 | probability | 254.11 |
plot_ticker_ai_prediction_events(ticker="AAPL", event="high-reclaimed", color="green")
| model_id | ticker | event | value | value_type | price | |
|---|---|---|---|---|---|---|
| date | ||||||
| 2024-12-12 | 11 | AAPL | high-reclaimed | 0.964444 | probability | 246.861840 |
| 2024-12-13 | 11 | AAPL | high-reclaimed | 0.956485 | probability | 247.031087 |
| 2024-12-16 | 11 | AAPL | high-reclaimed | 0.956064 | probability | 249.928199 |
| 2024-12-17 | 11 | AAPL | high-reclaimed | 0.966002 | probability | 252.357393 |
plot_ticker_ai_prediction_events(ticker="AAPL", event="low-reclaimed", color="red")
| model_id | ticker | event | value | value_type | price | |
|---|---|---|---|---|---|---|
| date | ||||||
| 2025-01-07 | 12 | AAPL | low-reclaimed | 0.925389 | probability | 241.137305 |
| 2025-01-10 | 12 | AAPL | low-reclaimed | 0.962616 | probability | 235.801043 |
| 2025-01-13 | 12 | AAPL | low-reclaimed | 0.971932 | probability | 233.361894 |
| 2025-01-14 | 12 | AAPL | low-reclaimed | 0.980981 | probability | 232.246854 |
AI Models¶
Each AI event type is backed by one or more versioned models. Empirical Markets continuously trains and evaluates models, retiring older versions as newer ones outperform. The model endpoints expose general information about model versions.
def get_ai_event_models(event: str):
r = rate_limited_get(
f"{API_HOST}/v1/ai/events/{event}/models",
headers=request_headers,
)
r.raise_for_status()
return r.json()
def get_latest_ai_model_version_metrics(event: str):
models = get_ai_event_models(event=event)
latest_model = models["models"][0]
latest_model_id = latest_model["model_id"]
r = rate_limited_get(
f"{API_HOST}/v1/ai/events/{event}/models/{latest_model_id}/metrics",
headers=request_headers,
)
r.raise_for_status()
r = r.json()
r["metrics"] = pd.DataFrame(r["metrics"])
return r
for k, v in get_latest_ai_model_version_metrics(event="upside-risk-elevated").items():
print(f"{k}:\n{v}" if isinstance(v, pd.DataFrame) else f"{k}: {v}")
print("*"*20)
for k, v in get_latest_ai_model_version_metrics(event="downside-risk-elevated").items():
print(f"{k}:\n{v}" if isinstance(v, pd.DataFrame) else f"{k}: {v}")
print("*"*20)
model_id: 13 model_name: Orion model_version: 1.0.0 signal_direction: long metrics: metric_name metric_value direction eval_horizon 0 precision 0.816578 long None ******************** model_id: 13 model_name: Cygnus model_version: 1.0.0 signal_direction: short metrics: metric_name metric_value direction eval_horizon 0 precision 0.75942 short None ********************
AI-Powered Sentiment Analysis¶
Beyond AI event predictions, Empirical Markets applies AI algorithms to financial publications to derive quantitative sentiment scores. The AI Sentiment endpoints give you the ability to quickly retrieve these for research or modeling.
The data is partitioned by sentiment topic. One such topic is fed-market-sectors, which measures overall and sector-level market sentiment based on Federal Reserve publications.
def get_sector_sentiment_info():
r = rate_limited_get(
f"{API_HOST}/v1/ai/sentiment/fed-market-sectors",
headers=request_headers,
)
r.raise_for_status()
return r.json()
get_sector_sentiment_info()
{'topic': 'fed-market-sectors',
'description': 'Sector-level market sentiment derived from Federal Reserve publications.',
'dimension_type': 'category',
'dimensions': ['consumer-discretionary',
'consumer-staples',
'energy',
'financials',
'healthcare',
'industrials',
'materials',
'overall',
'real-estate',
'technology',
'utilities'],
'value_type': 'score',
'value_map': {'1': 'Strongly Bearish',
'2': 'Slightly Bearish',
'3': 'Neutral',
'4': 'Slightly Bullish',
'5': 'Strongly Bullish'}}
The data itself contains both a raw score and the reasoning behind the score. This is useful to understand exactly why the score was given, and can be used for many downstream machine learning tasks.
def get_sector_sentiment(sector: str):
r = rate_limited_get(
f"{API_HOST}/v1/ai/sentiment/fed-market-sectors/dimensions/{sector}",
headers=request_headers,
)
r.raise_for_status()
r = r.json()
df = pd.DataFrame(r["sentiment"])
df["score_name"] = df["score"].astype(str).map(r["value_map"])
df["date"] = pd.to_datetime(df["date"])
return df
get_sector_sentiment(sector="energy")
| date | model_name | model_version | dimension | score | reasoning | score_name | |
|---|---|---|---|---|---|---|---|
| 0 | 1996-10-30 | Newton | 1.0.0 | Energy | 4 | Robust activity in natural resources and stron... | Slightly Bullish |
| 1 | 1996-12-04 | Newton | 1.0.0 | Energy | 4 | High oil prices and increased oil-field activi... | Slightly Bullish |
| 2 | 1997-01-22 | Newton | 1.0.0 | Energy | 4 | Continued strengthening in the energy extracti... | Slightly Bullish |
| 3 | 1997-03-12 | Newton | 1.0.0 | Energy | 4 | Increased activity in energy extraction indust... | Slightly Bullish |
| 4 | 1997-05-07 | Newton | 1.0.0 | Energy | 4 | Energy production remains strong despite decli... | Slightly Bullish |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 230 | 2025-09-03 | Newton | 1.0.0 | Energy | 4 | Increased energy demand due to data centers an... | Slightly Bullish |
| 231 | 2025-10-15 | Newton | 1.0.0 | Energy | 2 | Energy activity was generally reported as down... | Slightly Bearish |
| 232 | 2025-11-26 | Newton | 1.0.0 | Energy | 3 | Energy activity was largely stable despite cha... | Neutral |
| 233 | 2026-01-14 | Newton | 1.0.0 | Energy | 2 | Energy demand and production were flat to slig... | Slightly Bearish |
| 234 | 2026-03-04 | Newton | 1.0.0 | Energy | 4 | Energy activity grew modestly, supported by hi... | Slightly Bullish |
235 rows × 7 columns
Fundamental Data¶
Fundamental data refers to the financial reporting data extracted from SEC filings: balance sheets, income statements, and cash flow statements. This data is used by financial analysts to get a sense of the overall health of a company. The Empirical Markets API allows you to easily query for this kind of data.
Our fundamental data is divided into two contexts:
- Instant — point-in-time values (e.g., cash on hand, total assets) reported as of a specific balance sheet date.
- Duration — values accumulated over a reporting period (e.g., net income, revenue) reported over a quarter or fiscal year.
Each context is then divided into tags, which are the facts reported by the company.
def get_ticker_instant_tags(ticker: str):
r = rate_limited_get(
f"{API_HOST}/v1/fundamentals/tickers/{ticker}/instant/tags",
headers=request_headers,
)
r.raise_for_status()
r = r.json()
return pd.DataFrame(r["tags"])
get_ticker_instant_tags(ticker="AAPL")
| tag | label | documentation | period_type | balance | |
|---|---|---|---|---|---|
| 0 | AccountsPayableCurrent | Accounts Payable, Current | Carrying value as of the balance sheet date of... | instant | credit |
| 1 | AccountsReceivableNetCurrent | Accounts Receivable, after Allowance for Credi... | Amount, after allowance for credit loss, of ri... | instant | debit |
| 2 | AccruedIncomeTaxesCurrent | Accrued Income Taxes, Current | Carrying amount as of the balance sheet date o... | instant | credit |
| 3 | AccruedIncomeTaxesNoncurrent | Accrued Income Taxes, Noncurrent | Carrying amount as of the balance sheet date o... | instant | credit |
| 4 | AccruedLiabilitiesCurrent | Accrued Liabilities, Current | Carrying value as of the balance sheet date of... | instant | credit |
| ... | ... | ... | ... | ... | ... |
| 223 | UnrecordedUnconditionalPurchaseObligationBalan... | Unrecorded Unconditional Purchase Obligation, ... | Amount of fixed and determinable portion of un... | instant | credit |
| 224 | UnrecordedUnconditionalPurchaseObligationBalan... | Unrecorded Unconditional Purchase Obligation, ... | Amount of fixed and determinable portion of un... | instant | credit |
| 225 | UnrecordedUnconditionalPurchaseObligationBalan... | Unrecorded Unconditional Purchase Obligation | Amount of unrecorded obligation to transfer fu... | instant | credit |
| 226 | UnrecordedUnconditionalPurchaseObligationDueAf... | Unrecorded Unconditional Purchase Obligation, ... | Amount of fixed and determinable portion of un... | instant | credit |
| 227 | UnrecordedUnconditionalPurchaseObligationDueIn... | Unrecorded Unconditional Purchase Obligation, ... | Amount of fixed and determinable portion of un... | instant | credit |
228 rows × 5 columns
def get_ticker_latest_fundamentals(ticker: str, context: str = "instant"):
if context not in ["instant", "duration"]:
raise ValueError("Invalid context. Must be 'instant' or 'duration'.")
r = rate_limited_get(
f"{API_HOST}/v1/fundamentals/tickers/{ticker}/{context}/latest",
headers=request_headers,
)
r.raise_for_status()
r = r.json()
return pd.DataFrame(r["data"])
get_ticker_latest_fundamentals(ticker="AAPL", context="instant").head(5)
| ticker | entity_name | end_date | filed_date | tag | value | accession_number | fy | fp | form | unit | taxonomy | label | documentation | period_type | balance | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AAPL | Apple Inc. | 2025-12-27T00:00:00 | 2026-01-30T00:00:00 | AccountsPayableCurrent | 7.058700e+10 | 0000320193-26-000006 | 2026 | Q1 | 10-Q | USD | us_gaap | Accounts Payable, Current | Carrying value as of the balance sheet date of... | instant | credit |
| 1 | AAPL | Apple Inc. | 2025-12-27T00:00:00 | 2026-01-30T00:00:00 | AccountsReceivableNetCurrent | 3.992100e+10 | 0000320193-26-000006 | 2026 | Q1 | 10-Q | USD | us_gaap | Accounts Receivable, after Allowance for Credi... | Amount, after allowance for credit loss, of ri... | instant | debit |
| 2 | AAPL | Apple Inc. | 2025-09-27T00:00:00 | 2025-10-31T00:00:00 | AccruedIncomeTaxesCurrent | 1.301600e+10 | 0000320193-25-000079 | 2025 | FY | 10-K | USD | us_gaap | Accrued Income Taxes, Current | Carrying amount as of the balance sheet date o... | instant | credit |
| 3 | AAPL | Apple Inc. | 2024-09-28T00:00:00 | 2024-11-01T00:00:00 | AccruedIncomeTaxesNoncurrent | 9.254000e+09 | 0000320193-24-000123 | 2024 | FY | 10-K | USD | us_gaap | Accrued Income Taxes, Noncurrent | Carrying amount as of the balance sheet date o... | instant | credit |
| 4 | AAPL | Apple Inc. | 2024-09-28T00:00:00 | 2024-11-01T00:00:00 | AccruedIncomeTaxesNoncurrent | 9.254000e+09 | 0000320193-24-000123 | 2024 | FY | 10-K | USD | us_gaap | Accrued Income Taxes, Noncurrent | Carrying amount as of the balance sheet date o... | instant | credit |
def get_ticker_fundamental_tag(ticker: str, tag: str, context: str = "instant"):
if context not in ["instant", "duration"]:
raise ValueError("Invalid context. Must be 'instant' or 'duration'.")
r = rate_limited_get(
f"{API_HOST}/v1/fundamentals/tickers/{ticker}/{context}/tags/{tag}",
headers=request_headers,
)
r.raise_for_status()
r = r.json()
return pd.DataFrame(r["data"])
def plot_ticker_fundamental_tag(ticker: str, tag: str, context: str = "instant"):
tag_df = get_ticker_fundamental_tag(ticker=ticker, tag=tag, context=context)
if tag_df.empty:
print(f"No data found for {ticker} {tag} {context}")
return
tag_df["end_date"] = pd.to_datetime(tag_df["end_date"])
tag_df = tag_df.set_index("end_date").sort_index()
display(tag_df.head(3))
plt.figure(figsize=(10, 6))
tag_df["value"] = pd.to_numeric(tag_df["value"], errors="coerce")
tag_df["value"].plot(title=f"{ticker} {tag} ({context})")
plt.xlabel("Reporting End Date")
plt.ylabel(tag)
plt.grid()
plt.show()
plot_ticker_fundamental_tag(ticker="AAPL", tag="AccountsReceivableNetCurrent", context="instant")
| ticker | entity_name | filed_date | tag | value | accession_number | fy | fp | form | unit | taxonomy | label | documentation | period_type | balance | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| end_date | |||||||||||||||
| 2008-09-27 | AAPL | Apple Inc. | 2010-01-25T00:00:00 | AccountsReceivableNetCurrent | 2.422000e+09 | 0001193125-10-012091 | 2009 | FY | 10-K/A | USD | us_gaap | Accounts Receivable, after Allowance for Credi... | Amount, after allowance for credit loss, of ri... | instant | debit |
| 2009-06-27 | AAPL | Apple Inc. | 2009-07-22T00:00:00 | AccountsReceivableNetCurrent | 2.686000e+09 | 0001193125-09-153165 | 2009 | Q3 | 10-Q | USD | us_gaap | Accounts Receivable, after Allowance for Credi... | Amount, after allowance for credit loss, of ri... | instant | debit |
| 2009-09-26 | AAPL | Apple Inc. | 2010-10-27T00:00:00 | AccountsReceivableNetCurrent | 3.361000e+09 | 0001193125-10-238044 | 2010 | FY | 10-K | USD | us_gaap | Accounts Receivable, after Allowance for Credi... | Amount, after allowance for credit loss, of ri... | instant | debit |
def plot_ticker_asset_liabilities(ticker: str):
asset_tag = "AssetsCurrent"
liabilities_tag = "LiabilitiesCurrent"
assets_df = get_ticker_fundamental_tag(ticker=ticker, tag=asset_tag, context="instant")
assets_df["end_date"] = pd.to_datetime(assets_df["end_date"])
assets_df = assets_df.set_index("end_date").sort_index()
liabilities_df = get_ticker_fundamental_tag(ticker=ticker, tag=liabilities_tag, context="instant")
liabilities_df["end_date"] = pd.to_datetime(liabilities_df["end_date"])
liabilities_df = liabilities_df.set_index("end_date").sort_index()
combined_df = pd.DataFrame({
"Assets": pd.to_numeric(assets_df["value"], errors="coerce"),
"Liabilities": pd.to_numeric(liabilities_df["value"], errors="coerce")
})
combined_df.plot(title=f"{ticker} {asset_tag} vs {liabilities_tag}", figsize=(10, 6))
plt.xlabel("Reporting End Date")
plt.ylabel("Value")
plt.grid()
plt.show()
plot_ticker_asset_liabilities(ticker="AAPL")
Macroeconomic Data¶
Macro data provides the economic backdrop against which company performance is evaluated. Empirical Markets aggregates macro data from various sources into a consistent, queryable format.
Data is organized into collections (thematic groups of related series like CPI, employment, etc.) and individual series within each collection. Some collections can be seen below:
def get_macro_collections():
r = rate_limited_get(
f"{API_HOST}/v1/macro/collections",
headers=request_headers,
)
r.raise_for_status()
r = r.json()
return pd.DataFrame(r["collection_ids"])
get_macro_collections()
| id | description | topics | |
|---|---|---|---|
| 0 | gdp-pct-change | Percent Change From Preceding Period in Real G... | [gdp] |
| 1 | gdp-nominal | Gross Domestic Product | [gdp] |
| 2 | gdp-real | Real Gross Domestic Product, Chained Dollars | [gdp] |
| 3 | gdp-price-index | Price Indexes for Gross Domestic Product | [gdp, inflation] |
| 4 | gdp-contrib | Contributions to Percent Change in Real Gross ... | [gdp] |
| 5 | gdp-contrib-by-sector | Price Indexes for Gross Value Added by Sector | [gdp] |
| 6 | gdp-aggregates | GDP, GDI, and Other Major NIPA Aggregates | [gdp] |
| 7 | gdp-aggregates-real | Real Gross Domestic Product, Real Gross Domest... | [gdp] |
| 8 | national-income | National Income by Type of Income | [income] |
| 9 | personal-income | Personal Income and Its Disposition | [income, consumption] |
| 10 | pce-real-pct-change | Percent Change From Preceding Period in Real P... | [consumption, pce] |
| 11 | pce-price-index | Price Indexes for Personal Consumption Expendi... | [consumption, inflation, pce] |
| 12 | pce | Personal Consumption Expenditures by Major Typ... | [consumption, pce] |
| 13 | real-imp-exp-pct-change | Percent Change From Preceding Period in Real E... | [trade] |
| 14 | cpi-u | Consumer Price Index - All Urban Consumers (Cu... | [cpi, inflation] |
| 15 | cpi-w | Consumer Price Index - Urban Wage Earners and ... | [cpi, inflation] |
| 16 | cpi-u-chained | Chained CPI - All Urban Consumers | [cpi, inflation] |
| 17 | ppi | Producer Price Index Revision-Current Series | [ppi, inflation] |
| 18 | ppi-commodities | Producer Price Index - Commodities | [ppi, inflation] |
| 19 | intl-trade | International Price Index | [trade] |
| 20 | employment-earnings | Employment, Hours, and Earnings-National (NAICS) | [employment, income] |
| 21 | eci | Employment Cost Index (NAICS) | [employment, income] |
| 22 | ecec | Employer Costs for Employee Compensation (NAICS) | [employment, income] |
| 23 | earnings-cps | Weekly and Hourly Earnings Data from the Curre... | [employment, income] |
| 24 | avg-price-commodities | Average Price Data | [prices] |
| 25 | inventory-price-index | Department Store Inventory Price Index | [prices] |
Once you select a collection, you can find associated time series to query. For example, the collection cpi-u represents the Consumer Price Index - All Urban Consumers. And the series within this collection track the underlying components of the CPI calculation. Some of those components are shown below:
def get_collection_series(collection_id: str):
r = rate_limited_get(
f"{API_HOST}/v1/macro/collections/{collection_id}/series",
headers=request_headers,
)
r.raise_for_status()
r = r.json()
return pd.DataFrame(r["series_ids"])
get_collection_series(collection_id="cpi-u")
| series_id | series_name | frequency | |
|---|---|---|---|
| 0 | CUSR0000SA0 | All items in U.S. city average, all urban cons... | M |
| 1 | CUSR0000SA0E | Energy in U.S. city average, all urban consume... | M |
| 2 | CUSR0000SA0L1 | All items less food in U.S. city average, all ... | M |
| 3 | CUSR0000SA0L12 | All items less food and shelter in U.S. city a... | M |
| 4 | CUSR0000SA0L12E | All items less food, shelter, and energy in U.... | M |
| ... | ... | ... | ... |
| 3353 | CUURS49GSETB | Motor fuel in Urban Alaska, all urban consumer... | M |
| 3354 | CUURS49GSETB01 | Gasoline (all types) in Urban Alaska, all urba... | M |
| 3355 | CUURS49GSS47014 | Gasoline, unleaded regular in Urban Alaska, al... | M |
| 3356 | CUURS49GSS47015 | Gasoline, unleaded midgrade in Urban Alaska, a... | M |
| 3357 | CUURS49GSS47016 | Gasoline, unleaded premium in Urban Alaska, al... | M |
3358 rows × 3 columns
def get_collection_series_data(collection_id: str, series_id: str):
r = rate_limited_get(
f"{API_HOST}/v1/macro/collections/{collection_id}/series/{series_id}",
headers=request_headers,
)
r.raise_for_status()
r = r.json()
df = pd.DataFrame(r["data"]).dropna(subset=["value"]).reset_index(drop=True)
return df
def plot_macro_series_data(collection_id: str, series_id: str):
df = get_collection_series_data(collection_id=collection_id, series_id=series_id)
df["period_end"] = pd.to_datetime(df["period_end"])
df = df.set_index("period_end").sort_index()
display(df.head(3))
plt.figure(figsize=(12, 6))
df["value"].plot(title=f"{collection_id} - {series_id}")
plt.xlabel("Date")
plt.ylabel("Value")
plt.grid()
plt.show()
plot_macro_series_data(collection_id="cpi-u", series_id="CUSR0000SA0")
| series_id | collection_id | series_name | frequency | year | value | revision | footnotes | |
|---|---|---|---|---|---|---|---|---|
| period_end | ||||||||
| 1997-01-31 | CUSR0000SA0 | cpi-u | All items in U.S. city average, all urban cons... | M | 1997 | 159.4 | 1 | None |
| 1997-02-28 | CUSR0000SA0 | cpi-u | All items in U.S. city average, all urban cons... | M | 1997 | 159.7 | 1 | None |
| 1997-03-31 | CUSR0000SA0 | cpi-u | All items in U.S. city average, all urban cons... | M | 1997 | 159.8 | 1 | None |
Using the Empirical Markets API, you can also easily retrieve historical and future economic release schedules.
def get_macro_release_schedule(year: int):
r = rate_limited_get(
f"{API_HOST}/v1/macro/schedule/{year}",
headers=request_headers,
)
r.raise_for_status()
r = r.json()
return pd.DataFrame(r["events"])
get_macro_release_schedule(year=2026)
| date | time | release | target_year | target_month | |
|---|---|---|---|---|---|
| 0 | 2026-04-10T00:00:00 | 08:30 AM | Real Earnings for March 2026 | 2026.0 | 3.0 |
| 1 | 2026-01-09T00:00:00 | 08:30 AM | Employment Situation for December 2025 | 2025.0 | 12.0 |
| 2 | 2026-01-13T00:00:00 | 08:30 AM | Consumer Price Index for December 2025 | 2025.0 | 12.0 |
| 3 | 2026-01-13T00:00:00 | 08:30 AM | Real Earnings for December 2025 | 2025.0 | 12.0 |
| 4 | 2026-01-14T00:00:00 | 08:30 AM | Producer Price Index for December 2025 | 2025.0 | 12.0 |
| ... | ... | ... | ... | ... | ... |
| 2880 | 2026-12-16T00:00:00 | 10:00 AM | Employer Costs for Employee Compensation for S... | 2026.0 | 9.0 |
| 2881 | 2026-12-17T00:00:00 | 08:30 AM | U.S. Import and Export Price Indexes for Novem... | 2026.0 | 11.0 |
| 2882 | 2026-12-18T00:00:00 | 10:00 AM | State Employment and Unemployment (Monthly) fo... | 2026.0 | 11.0 |
| 2883 | 2026-12-18T00:00:00 | 10:00 AM | Work Experience of the Population (Annual) for... | NaN | NaN |
| 2884 | 2026-12-30T00:00:00 | 10:00 AM | Metropolitan Area Employment and Unemployment ... | 2026.0 | 11.0 |
2885 rows × 5 columns
Get Yield Curve Data¶
The US Treasury yield curve plots yields across maturities from short-term bills to long-term bonds. The shape of the curve — normal (upward sloping), flat, or inverted — is one of the most closely watched macro indicators. An inverted yield curve (short rates > long rates) has historically preceded recessions (or so people claim).
Using the Empirical Markets API, you can effortlessly obtain historical yields from various maturities, as well as the spread between them.
def get_yield_curve_latest():
r = rate_limited_get(
f"{API_HOST}/v1/treasury/latest",
headers=request_headers,
)
r.raise_for_status()
r = r.json()
df = pd.DataFrame(r["data"])
df["date"] = pd.to_datetime(df["date"])
return df
def plot_yield_curve_latest():
df = get_yield_curve_latest()
df = df.set_index("maturity")
display(df["rate"])
plt.figure(figsize=(12, 6))
df["rate"].plot(title="Latest U.S. Treasury Yield Curve", kind="bar")
plt.xlabel("Date")
plt.ylabel("Yield")
plt.grid()
plt.show()
plot_yield_curve_latest()
maturity 1mo 3.67 2mo 3.70 3mo 3.69 4mo 3.69 6mo 3.72 1yr 3.70 2yr 3.81 3yr 3.80 5yr 3.94 7yr 4.12 10yr 4.31 20yr 4.89 30yr 4.91 Name: rate, dtype: float64
def get_yield_curve_spread(m1: str, m2: str):
r = rate_limited_get(
f"{API_HOST}/v1/treasury/spread/{m1}/{m2}",
headers=request_headers,
params={"start_date": "2024-01-01", "limit": 1000}
)
r.raise_for_status()
r = r.json()
df = pd.DataFrame(r["data"])
df["date"] = pd.to_datetime(df["date"])
return df
def plot_yield_curve_spread(m1: str, m2: str):
df = get_yield_curve_spread(m1=m1, m2=m2)
df = df.set_index("date").sort_index()
display(df.head(3))
plt.figure(figsize=(12, 6))
df["spread"].plot(title=f"Yield Spread: {m1} - {m2}")
plt.xlabel("Date")
plt.ylabel("Yield Spread")
plt.grid()
plt.show()
plot_yield_curve_spread(m1="10yr", m2="2yr")
| maturity_a | maturity_b | spread | |
|---|---|---|---|
| date | |||
| 2024-01-02 | 10yr | 2yr | -0.38 |
| 2024-01-03 | 10yr | 2yr | -0.42 |
| 2024-01-04 | 10yr | 2yr | -0.39 |
Insider Trading¶
Corporate insiders — directors, officers, and large shareholders — are required by law to report their transactions in company stock to the SEC (Form 4 filings). Insider activity is closely watched by investors because insiders have non-public knowledge about their company's prospects.
The insiders endpoint of the Empirical Markets API returns insider transaction in their raw form, as well as in aggregated form.
def get_ticker_insider_reports(ticker: str):
r = rate_limited_get(
f"{API_HOST}/v1/insiders/tickers/{ticker}/reports",
headers=request_headers,
params={"limit": 300, "start_date": "2024-01-01"}
)
r.raise_for_status()
r = r.json()
df = pd.DataFrame(r["reports"])
return df
get_ticker_insider_reports(ticker="TSLA")
| ticker | filed_date | reporter | non_deriv_txs | deriv_txs | footnotes | |
|---|---|---|---|---|---|---|
| 0 | TSLA | 2024-12-13T00:00:00 | {'name': 'Musk Kimbal', 'is_director': '1', 'i... | [{'instrument': 'Common Stock', 'date': '2024-... | [] | {'F1': 'Represents a contribution to a donor-a... |
| 1 | TSLA | 2024-12-31T00:00:00 | {'name': 'Musk Elon', 'is_director': 'true', '... | [{'instrument': 'Common Stock', 'date': '2024-... | [] | {'F1': 'In connection with the Reporting Perso... |
| 2 | TSLA | 2025-01-08T00:00:00 | {'name': 'Wilson-Thompson Kathleen', 'is_direc... | [{'instrument': 'Common Stock', 'date': '2025-... | [{'instrument': 'Non-Qualified Stock Option (r... | {'F1': 'The transactions reported on this Form... |
| 3 | TSLA | 2025-01-08T00:00:00 | {'name': 'Taneja Vaibhav', 'is_director': '0',... | [{'instrument': 'Common Stock', 'date': '2025-... | [{'instrument': 'Non-Qualified Stock Option (r... | {'F1': 'The transactions reported on this Form... |
| 4 | TSLA | 2024-12-09T00:00:00 | {'name': 'Taneja Vaibhav', 'is_director': '0',... | [{'instrument': 'Common Stock', 'date': '2024-... | [{'instrument': 'Restricted Stock Unit', 'date... | {'F1': 'Shares of the Issuer's common stock we... |
| ... | ... | ... | ... | ... | ... | ... |
| 64 | TSLA | 2025-12-11T00:00:00 | {'name': 'Musk Kimbal', 'is_director': '1', 'i... | [{'instrument': 'Common Stock', 'date': '2025-... | [] | {'F1': 'The price reported in Column 4 is a we... |
| 65 | TSLA | 2026-04-01T00:00:00 | {'name': 'Wilson-Thompson Kathleen', 'is_direc... | [{'instrument': 'Common Stock', 'date': '2026-... | [{'instrument': 'Non-Qualified Stock Option (r... | {'F1': 'The transactions reported on this Form... |
| 66 | TSLA | 2026-04-02T00:00:00 | {'name': 'Zhu Xiaotong', 'is_director': '0', '... | [{'instrument': 'Common Stock', 'date': '2026-... | [{'instrument': 'Non-Qualified Stock Option (r... | {'F1': 'The shares are held in Magical Blake G... |
| 67 | TSLA | 2026-02-27T00:00:00 | {'name': 'Wilson-Thompson Kathleen', 'is_direc... | [{'instrument': 'Common Stock', 'date': '2026-... | [{'instrument': 'Non-Qualified Stock Option (r... | {'F1': 'The transactions reported on this Form... |
| 68 | TSLA | 2026-03-09T00:00:00 | {'name': 'Taneja Vaibhav', 'is_director': '0',... | [{'instrument': 'Common Stock', 'date': '2026-... | [{'instrument': 'Restricted Stock Unit', 'date... | {'F1': 'Shares of the Issuer's common stock we... |
69 rows × 6 columns
The insiders endpoint also lets you find historical reporters for a given company:
def get_ticker_insider_reporters(ticker: str):
r = rate_limited_get(
f"{API_HOST}/v1/insiders/tickers/{ticker}/reporters",
headers=request_headers,
)
r.raise_for_status()
r = r.json()
df = pd.DataFrame(r["reporters"])
return df
get_ticker_insider_reporters(ticker="TSLA")
| reporter_name | reporter_title | |
|---|---|---|
| 0 | Ahuja Deepak | Chief Financial Officer |
| 1 | Baglino Andrew D | SVP Powertrain and Energy Eng. |
| 2 | Branderiz Eric | VP, Chief Accounting Officer |
| 3 | Buss Brad W | None |
| 4 | DENHOLM ROBYN M | None |
| 5 | Ehrenpreis Ira Matthew | None |
| 6 | ELLISON LAWRENCE JOSEPH | None |
| 7 | FIELD JOHN DOUGLAS | Senior VP, Engineering |
| 8 | Gebbia Joseph | None |
| 9 | Gracias Antonio J. | None |
| 10 | Guillen Jerome M | President, Heavy Trucking |
| 11 | Jurvetson Stephen T | None |
| 12 | Kirkhorn Zachary | Chief Financial Officer |
| 13 | McNeill Jon | President, WW Sales/Service |
| 14 | MURDOCH JAMES R | None |
| 15 | Musk Elon | CEO |
| 16 | Musk Kimbal | None |
| 17 | RICE LINDA JOHNSON | None |
| 18 | Straubel Jeffrey B | Chief Technical Officer |
| 19 | Taneja Vaibhav | Chief Financial Officer |
| 20 | Wilson-Thompson Kathleen | None |
| 21 | Zhu Xiaotong | SVP |
Insider Transactions are broken up into codes. These codes represent the type of transaction (sale, purchase, etc). These codes can be retrieved in the following way:
def get_ticker_insiders_tx_codes(ticker: str):
r = rate_limited_get(
f"{API_HOST}/v1/insiders/tickers/{ticker}/transactions/codes",
headers=request_headers,
)
r.raise_for_status()
r = r.json()
return pd.DataFrame(r["codes"])
get_ticker_insiders_tx_codes(ticker="TSLA")
| code | description | |
|---|---|---|
| 0 | M | Exercise or conversion of derivative security ... |
| 1 | C | Conversion of derivative security |
| 2 | F | Payment of exercise price or tax liability by ... |
| 3 | S | Open market or private sale of non-derivative ... |
| 4 | G | Bona fide gift |
| 5 | P | Open market or private purchase of non-derivat... |
| 6 | J | Other acquisition or disposition |
| 7 | A | Grant, award or other acquisition pursuant to ... |
Once a code is selected, you can query for the historical time series of transactions for a specific ticker.
def get_ticker_insiders_transactions(ticker: str, code: str):
r = rate_limited_get(
f"{API_HOST}/v1/insiders/tickers/{ticker}/transactions/codes/{code}",
headers=request_headers,
params={"limit": 300, "start_date": "2024-01-01"}
)
r.raise_for_status()
r = r.json()
df = pd.DataFrame(r["transactions"])
return df
get_ticker_insiders_transactions(ticker="TSLA", code="S")
| ticker | filed_date | tx_instrument | tx_code | tx_shares | tx_price | tx_ownership | reporter_name | reporter_title | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | TSLA | 2024-02-23T00:00:00 | Common Stock | S | 56100.0 | 193.427 | D | DENHOLM ROBYN M | None |
| 1 | TSLA | 2024-02-23T00:00:00 | Common Stock | S | 10262.0 | 194.922 | D | DENHOLM ROBYN M | None |
| 2 | TSLA | 2024-02-23T00:00:00 | Common Stock | S | 5404.0 | 195.749 | D | DENHOLM ROBYN M | None |
| 3 | TSLA | 2024-02-23T00:00:00 | Common Stock | S | 7000.0 | 196.958 | D | DENHOLM ROBYN M | None |
| 4 | TSLA | 2024-02-23T00:00:00 | Common Stock | S | 11168.0 | 197.811 | D | DENHOLM ROBYN M | None |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 289 | TSLA | 2026-04-01T00:00:00 | Common Stock | S | 4001.0 | 363.083 | D | Wilson-Thompson Kathleen | |
| 290 | TSLA | 2026-04-01T00:00:00 | Common Stock | S | 3927.0 | 363.947 | D | Wilson-Thompson Kathleen | |
| 291 | TSLA | 2026-04-01T00:00:00 | Common Stock | S | 560.0 | 364.888 | D | Wilson-Thompson Kathleen | |
| 292 | TSLA | 2026-04-01T00:00:00 | Common Stock | S | 120.0 | 365.897 | D | Wilson-Thompson Kathleen | |
| 293 | TSLA | 2026-04-01T00:00:00 | Common Stock | S | 80.0 | 366.855 | D | Wilson-Thompson Kathleen |
294 rows × 9 columns
Insider Aggregates and Anomaly Detection¶
Raw transaction records are noisy. Aggregating them into daily totals and then applying anomaly detection makes the signal more actionable. The aggregate endpoints roll up individual trades into daily sold-value and count metrics for a given ticker.
def get_ticker_insider_sales(ticker: str):
r = rate_limited_get(
f"{API_HOST}/v1/insiders/tickers/{ticker}/sales",
headers=request_headers,
params={"limit": 1000, "start_date": "2020-01-01"}
)
r.raise_for_status()
r = r.json()
df = pd.DataFrame(r["data"])
return df
def plot_ticker_insider_sales(ticker: str):
df = get_ticker_insider_sales(ticker=ticker)
df = df.set_index("report_date").sort_index()
display(df.head(3))
plt.figure(figsize=(12, 6))
df["total_value"].plot(title=f"Insider Sales: {ticker}", c="red")
plt.xlabel("Report Date")
plt.ylabel("Sale Value")
plt.grid()
plt.show()
plot_ticker_insider_sales(ticker="TSLA")
| ticker | total_value | |
|---|---|---|
| report_date | ||
| 2020-01-06T00:00:00 | TSLA | 854935.4 |
| 2020-01-21T00:00:00 | TSLA | 76141.5 |
| 2020-02-05T00:00:00 | TSLA | 1069085.8 |
Insider Sales Anomaly Detection¶
While the aggregated transactions is useful in itself, Empirical Markets goes one step further and provides an anomaly detection layer on top of this data. This is extremely useful to filter out noise, and only focus on abnormal transaction volume.
def get_ticker_insider_sale_anomalies(ticker: str):
r = rate_limited_get(
f"{API_HOST}/v1/insiders/tickers/{ticker}/sales/anomalies",
headers=request_headers,
)
r.raise_for_status()
r = r.json()
df = pd.DataFrame(r["anomalies"])
return df
def plot_insider_sale_anomalies(ticker: str):
df = get_ticker_insider_sale_anomalies(ticker=ticker)
if df.empty:
print(f"No insider sale anomalies found for {ticker}")
return
df["report_date"] = pd.to_datetime(df["report_date"])
df = df.set_index("report_date").sort_index()
display(df.head(5))
price_df = get_ticker_ohlc_prices(ticker=ticker, start_date="2020-01-01")
price_df["date"] = pd.to_datetime(price_df["date"])
price_df = price_df.set_index("date").sort_index()
plot_df = price_df[["close"]].merge(df[["total_value"]], how="left", left_index=True, right_index=True)
anomaly_points = plot_df[plot_df["total_value"].notna()]
fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(plot_df.index, plot_df["close"], label=f"{ticker} Price")
ax.scatter(
anomaly_points.index,
anomaly_points["close"],
label="Insider Sale Anomaly",
color="magenta",
s=80,
zorder=5
)
ax.set_title(f"Insider Sale Anomalies: {ticker}")
ax.set_xlabel("Date")
ax.set_ylabel("Price")
ax.grid(True)
ax.legend()
plt.show()
plot_insider_sale_anomalies(ticker="TSLA")
| ticker | total_value | anomaly_type | |
|---|---|---|---|
| report_date | |||
| 2021-11-10 | TSLA | 4.983111e+09 | high |
| 2021-11-12 | TSLA | 1.922904e+09 | high |
| 2022-04-28 | TSLA | 3.989322e+09 | high |
| 2022-04-29 | TSLA | 4.531237e+09 | high |
| 2022-08-09 | TSLA | 6.886744e+09 | high |
The Empirical Markets API also allows you to query insider activity at the market level — aggregating sell and buy volume across all tickers on a daily basis. Market-wide insider selling surges can reflect broad insider pessimism and have historically been a useful leading indicator of market stress.
def get_market_insider_sales():
r = rate_limited_get(
f"{API_HOST}/v1/insiders/aggregates/sales",
headers=request_headers,
params={"limit": 1000, "start_date": "2024-01-01"}
)
r.raise_for_status()
r = r.json()
df = pd.DataFrame(r["data"])
return df
def plot_market_insider_sales():
import numpy as np
df = get_market_insider_sales()
if df.empty:
print("No market insider sales data found.")
return
df["report_date"] = pd.to_datetime(df["report_date"])
df = df.set_index("report_date").sort_index()
display(df.head(5))
plt.figure(figsize=(12, 6))
np.log(df["total_value"]).plot(title="Market Insider Sales (Log)", color="red")
plt.xlabel("Report Date")
plt.ylabel("Sale Value")
plt.grid()
plt.show()
plot_market_insider_sales()
| ticker_count | total_value | |
|---|---|---|
| report_date | ||
| 2024-01-02 | 156 | 1.465746e+08 |
| 2024-01-03 | 293 | 1.322664e+08 |
| 2024-01-04 | 596 | 9.114137e+08 |
| 2024-01-05 | 356 | 1.913442e+08 |
| 2024-01-08 | 175 | 1.614525e+08 |
The same anomaly detection framework applies to market-wide aggregates. Both sales and purchases anomaly series are available — meaningful divergence between the two (e.g., a spike in purchases without a corresponding rise in sales) can indicate contrarian insider conviction.
def get_market_insider_anomalies(kind: str = "sales"):
if kind not in ["sales", "purchases"]:
raise ValueError("Invalid kind. Must be 'sales' or 'purchases'.")
r = rate_limited_get(
f"{API_HOST}/v1/insiders/aggregates/{kind}/anomalies",
headers=request_headers,
)
r.raise_for_status()
r = r.json()
df = pd.DataFrame(r["anomalies"])
return df
def plot_market_insider_anomalies(kind: str = "sales"):
df = get_market_insider_anomalies(kind=kind)
if df.empty:
print(f"No market insider {kind} anomalies found.")
return
df["report_date"] = pd.to_datetime(df["report_date"])
df = df.set_index("report_date").sort_index()
display(df.head(5))
price_df = get_ticker_ohlc_prices(ticker="SPY", start_date="2023-01-01")
price_df["date"] = pd.to_datetime(price_df["date"])
price_df = price_df.set_index("date").sort_index()
plot_df = price_df[["close"]].merge(df[["total_value"]], how="left", left_index=True, right_index=True)
anomaly_points = plot_df[plot_df["total_value"].notna()]
fig, ax = plt.subplots(figsize=(12, 6))
ax.plot(plot_df.index, plot_df["close"], label="SPY Price")
ax.scatter(
anomaly_points.index,
anomaly_points["close"],
label="Insider Anomaly",
color="magenta",
s=80,
zorder=5
)
ax.set_title(f"Market Insider {kind.capitalize()} Anomalies")
ax.set_xlabel("Date")
ax.set_ylabel("Price")
ax.grid(True)
ax.legend()
plt.show()
plot_market_insider_anomalies(kind="purchases")
| ticker_count | total_value | |
|---|---|---|
| report_date | ||
| 2008-04-08 | 71 | 5.325948e+15 |
| 2008-05-08 | 128 | 5.325948e+15 |
| 2016-07-26 | 25 | 2.480716e+15 |
| 2017-02-17 | 51 | 5.026898e+14 |
| 2017-08-11 | 146 | 5.760000e+14 |
Federal Reserve Operations¶
The Federal Reserve conducts open market operations — primarily repo and reverse repo agreements — to manage the federal funds rate and control short-term liquidity in the banking system.
- Repo operations: The Fed buys securities from dealers with an agreement to sell them back, injecting short-term liquidity into the system.
- Reverse repo operations: The Fed sells securities to dealers with an agreement to buy them back, draining liquidity from the system.
Empirical Markets tracks these flows helps contextualize monetary policy stance and short-term funding conditions.
You can utilize the API to easily query this data. To demonstrate this we collect both repo and reverse repo operations and plot them over time below:
def get_fed_repo_ops():
r = rate_limited_get(
f"{API_HOST}/v1/fed/operations/repo",
headers=request_headers,
params={"start_date": "2025-01-01", "limit": 1000}
)
r.raise_for_status()
r = r.json()
df = pd.DataFrame(r["series"])
df["date"] = pd.to_datetime(df["date"])
return df
def plot_repo_ops():
df = get_fed_repo_ops()
df = df.set_index("date").sort_index()
display(df.head(3))
df_agg = df["total_amount"].resample("W").sum()
plt.figure(figsize=(12, 6))
plt.bar(df_agg.index, df_agg.values)
plt.title("Fed Repo Operations")
plt.xlabel("Date")
plt.ylabel("Total Amount")
plt.xticks(rotation=45)
plt.grid(axis="y")
plt.tight_layout()
plt.show()
plot_repo_ops()
| operation_type | total_amount | total_amount_unit | num_counterparties | |
|---|---|---|---|---|
| date | ||||
| 2025-01-02 | Repo | 4.0 | millions | None |
| 2025-01-03 | Repo | 0.0 | millions | None |
| 2025-01-06 | Repo | 0.0 | millions | None |
def get_fed_reverse_repo_ops():
r = rate_limited_get(
f"{API_HOST}/v1/fed/operations/reverse-repo",
headers=request_headers,
params={"start_date": "2023-01-01", "limit": 1000}
)
r.raise_for_status()
r = r.json()
df = pd.DataFrame(r["series"])
df["date"] = pd.to_datetime(df["date"])
return df
def plot_rrp_ops():
df = get_fed_reverse_repo_ops()
if df.empty:
print("No Fed reverse repo operations data found.")
return
df = df.set_index("date").sort_index()
display(df.head(5))
plt.figure(figsize=(12, 6))
df["total_amount"].plot(title="Fed Reverse Repo Operations", color="blue")
plt.xlabel("Date")
plt.ylabel("Total Value")
plt.grid()
plt.show()
plot_rrp_ops()
| operation_type | total_amount | total_amount_unit | num_counterparties | |
|---|---|---|---|---|
| date | ||||
| 2023-01-03 | Reverse Repo | 2188272.0 | millions | 99 |
| 2023-01-04 | Reverse Repo | 2229542.0 | millions | 108 |
| 2023-01-05 | Reverse Repo | 2242486.0 | millions | 106 |
| 2023-01-06 | Reverse Repo | 2208265.0 | millions | 100 |
| 2023-01-09 | Reverse Repo | 2199121.0 | millions | 103 |
Federal Reserve Holdings¶
The Federal Reserve publishes weekly disclosures of its System Open Market Account (SOMA) — the portfolio of securities held as a result of open market operations. This is the primary tool the Fed uses to expand or contract its balance sheet (commonly referred to as Quantitative Easing and Quantitative Tightening, respectively).
The latest snapshot shows the current breakdown across Treasuries, mortgage-backed securities (MBS), TIPS, agency debt, and other instruments. This is useful for assessing the current scale and composition of the Fed's market footprint.
def get_fed_latest_holdings():
r = rate_limited_get(
f"{API_HOST}/v1/fed/holdings/latest",
headers=request_headers,
)
r.raise_for_status()
r = r.json()
return pd.Series(r["latest"])
get_fed_latest_holdings()
date 2026-04-08T00:00:00 mbs 1989008552431.100098 cmbs 7671726234.7 tips 290669115800.0 frn 16412388300.0 notes_bonds 3583733004300.0 bills 412568926700.0 agency 2347000000.0 treasury_total 4303383435100.0 mortgage_total 1996680278665.800049 total 6302410713765.799805 dtype: object
The historical Fed holdings endpoint provides a time series of historical balance sheet totals.
def get_fed_holdings_historical():
r = rate_limited_get(
f"{API_HOST}/v1/fed/holdings/historical",
headers=request_headers,
params={"limit": 500, "start_date": "2023-01-01"}
)
r.raise_for_status()
r = r.json()
df = pd.DataFrame(r["holdings"])
df["date"] = pd.to_datetime(df["date"])
return df
def plot_fed_holdings_historical():
df = get_fed_holdings_historical()
df = df.set_index("date").sort_index()
display(df.head(5))
price_df = get_ticker_ohlc_prices(ticker="SPY", start_date="2023-01-01")
price_df["date"] = pd.to_datetime(price_df["date"])
price_df = price_df.set_index("date").sort_index()
plot_df = df[["treasury_total"]].join(price_df[["close"]], how="inner")
fig, ax1 = plt.subplots(figsize=(12, 6))
ax1.plot(plot_df.index, plot_df["treasury_total"], label="Fed Treasury Holdings", color="green")
ax1.set_xlabel("Date")
ax1.set_ylabel("Fed Treasury Holdings", color="green")
ax1.tick_params(axis="y", labelcolor="green")
ax2 = ax1.twinx()
ax2.plot(plot_df.index, plot_df["close"], label="SPY Price", color="blue")
ax2.set_ylabel("SPY Price", color="blue")
ax2.tick_params(axis="y", labelcolor="blue")
fig.tight_layout()
plt.title("Fed Treasury Holdings vs SPY Price")
ax1.legend(loc="upper left")
ax2.legend(loc="upper right")
plt.grid()
plt.show()
plot_fed_holdings_historical()
| mbs | cmbs | tips | frn | notes_bonds | bills | agency | treasury_total | mortgage_total | total | |
|---|---|---|---|---|---|---|---|---|---|---|
| date | ||||||||||
| 2023-01-04 | 2.632908e+12 | 8.493603e+09 | 3.774164e+11 | 2.716621e+10 | 4.660997e+12 | 2.893384e+11 | 2.347000e+09 | 5.354918e+12 | 2.641402e+12 | 7.998667e+12 |
| 2023-01-11 | 2.632908e+12 | 8.493603e+09 | 3.774164e+11 | 2.716621e+10 | 4.660922e+12 | 2.881987e+11 | 2.347000e+09 | 5.353703e+12 | 2.641402e+12 | 7.997452e+12 |
| 2023-01-18 | 2.631455e+12 | 8.485351e+09 | 3.749794e+11 | 2.716621e+10 | 4.645440e+12 | 2.872308e+11 | 2.347000e+09 | 5.334817e+12 | 2.639940e+12 | 7.977104e+12 |
| 2023-01-25 | 2.616273e+12 | 8.461744e+09 | 3.749794e+11 | 2.716621e+10 | 4.645440e+12 | 2.861996e+11 | 2.347000e+09 | 5.333786e+12 | 2.624735e+12 | 7.960868e+12 |
| 2023-02-01 | 2.616273e+12 | 8.461744e+09 | 3.749794e+11 | 2.342878e+10 | 4.612308e+12 | 2.850209e+11 | 2.347000e+09 | 5.295737e+12 | 2.624735e+12 | 7.922819e+12 |