Methodology — How InflowScan Tracks Crypto ETF Flows

InflowScan publishes a daily read on US-listed spot crypto ETF flows, premium and discount, derivatives positioning, and the composite FlowScore. This page documents how each number is sourced, validated, aggregated, and surfaced. It also explains when we deliberately show -- instead of a value — the operating principle that runs through every screen on the site.

1. Daily ETF flows & T+1 settlement

Daily creations and redemptions for every US-listed spot Bitcoin, Ethereum, Solana, and XRP ETF flow into our database from the primary issuer-published flow feed. That feed aggregates the per-fund creation and redemption activity that issuers report to the listing exchange after each trading session. Our collector runs on cPanel cron and writes one row per fund per trading date with idempotent upserts — reruns and corrections overwrite, never duplicate.
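The "overwrite, never duplicate" behaviour can be illustrated with a small sketch. It is not the production collector: it uses Python's built-in sqlite3 as a stand-in database, and the table name etf_flows, its columns, the ticker FUNDA, and the flow values are all hypothetical. The point is only that a primary key on (trade_date, ticker) plus an upsert leaves one row per fund per date, no matter how many times the collector reruns.

```python
import sqlite3


def upsert_flow(conn, trade_date, ticker, net_flow_usd):
    """Insert the row for (trade_date, ticker), or overwrite it if it exists."""
    conn.execute(
        """
        INSERT INTO etf_flows (trade_date, ticker, net_flow_usd)
        VALUES (?, ?, ?)
        ON CONFLICT(trade_date, ticker)
        DO UPDATE SET net_flow_usd = excluded.net_flow_usd
        """,
        (trade_date, ticker, net_flow_usd),
    )
    conn.commit()


# Hypothetical schema: the composite primary key is what makes reruns idempotent.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE etf_flows (
        trade_date   TEXT NOT NULL,
        ticker       TEXT NOT NULL,
        net_flow_usd REAL NOT NULL,
        PRIMARY KEY (trade_date, ticker)
    )
    """
)

upsert_flow(conn, "2025-01-06", "FUNDA", 100.0)  # first ingest
upsert_flow(conn, "2025-01-06", "FUNDA", 100.0)  # rerun: still one row
upsert_flow(conn, "2025-01-06", "FUNDA", 95.5)   # correction: overwrites the value

rows = conn.execute(
    "SELECT trade_date, ticker, net_flow_usd FROM etf_flows"
).fetchall()
```

After the three calls the table holds a single row carrying the corrected value.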

The data is on a T+1 cadence. When a fund prints a creation or redemption on Monday, that activity settles overnight and appears in the issuer's Tuesday-morning feed. So when the dashboard says “Monday's net flow,” the row may not have been written until Tuesday, between roughly 6 AM and 4 PM ET, rather than on Monday evening. This is why “today's” flow figure on the dashboard is partial until the evening collector runs, and why some funds only fill in the next day:

  • The 10 PM ET massive collector pulls the bulk of T+0 issuer reports as they land.
  • A 6:30 AM ET catch-up the next morning fills the fast issuers (BlackRock, Fidelity, VanEck, Franklin) that publish overnight.
  • A 3:30 PM ET catch-up sweeps slower issuers (ARK, Bitwise, Grayscale, Invesco, WisdomTree, Valkyrie) that often don't post until mid-afternoon T+1.

By Monday afternoon at the latest, the prior trading week is fully ingested. Any single “today” cell on the dashboard before ~10 PM ET should be read as a snapshot, not a closing total. The Settled Flows Wrap brief at 10:15 PM ET each weekday is the editorial close for the day's flow data.

Verification layer

Every row written to the flows table runs through a consistency check before publishing. We recompute net flow from the prior session's shares-outstanding plus the day's reported creation and redemption volumes, then compare against the issuer-published net flow. The deviation gets one of four labels: high (under ~5% deviation), medium, low, or held. Held rows are not published to the public dashboard or to the brief writer. They sit in the database with a flag, an operator alert fires to Telegram, and a human reviews before they're released or corrected. This is what keeps a bad upstream print from leaking into a brief or a screen.
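The labelling step can be sketched as follows. This is an illustration, not the production check: the text above gives only the "high" threshold (under roughly 5%), so the medium and low cut-offs used here (15% and 30%) are placeholder assumptions, as are the function names. The recomputation of net flow from shares outstanding is not shown, because its exact formula is not stated above.

```python
def deviation_pct(recomputed_net, reported_net):
    """Absolute deviation of the recomputed figure from the reported one, in percent."""
    if reported_net == 0:
        # Both zero counts as perfect agreement; otherwise the gap is unbounded.
        return 0.0 if recomputed_net == 0 else float("inf")
    return abs(recomputed_net - reported_net) / abs(reported_net) * 100.0


def confidence_label(dev_pct, high_max=5.0, medium_max=15.0, low_max=30.0):
    """Map a deviation to one of the four labels.

    Only high_max reflects the text above; medium_max and low_max are
    hypothetical values chosen for the example.
    """
    if dev_pct < high_max:
        return "high"
    if dev_pct < medium_max:
        return "medium"
    if dev_pct < low_max:
        return "low"
    return "held"


def is_publishable(label):
    """Held rows stay in the database and wait for operator review."""
    return label != "held"
```

Under these assumed thresholds, a recomputed flow of 102 against a reported 100 is labelled high, while 150 against 100 is held and never reaches the dashboard or the brief writer.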

Three independent sources, triangulated

Single-source data is a single point of failure. InflowScan reconciles per-fund daily flows across three independent providers — when at least two agree within a tolerance, we publish the consensus value; when they disagree beyond tolerance, we hold the row and surface the gap rather than guess.

Source | Coverage | Role
Polygon ETF Global | US spot crypto ETFs (BTC, ETH, SOL, XRP). Per-issuer publishing window: fast issuers T+1 by ~6 AM ET; slow issuers ARK / Bitwise / WisdomTree / Grayscale often not until mid-afternoon T+1. | Primary feed. First to publish each day. Bulk write at 10 PM ET; two catch-up runs at 6:30 AM and 3:30 PM ET fill in the fast and slow issuers.
CoinGlass | Spot crypto ETFs across US + HK + DOGE + LTC funds. Aggregator-of-aggregators model. | Layer 2 verification. Cross-checks Polygon's primary feed. Used as gap-fill when Polygon misses a slow-issuer day.
SoSoValue | Per-fund + aggregate flow data for the same universe, scraped from SoSoValue's dashboards 4× daily. Historical depth back to fund launch (BTC: Jan 2024). | Tier-1 reconciliation source. Highest accuracy on slow-issuer days. Preferred in gap-fill when both CoinGlass and SoSoValue have a row for the same date.

Source priority follows the reconciliation pipeline in db/reconciliation.php: when multiple sources have a value for the same (date, ticker), we take the highest-priority source that's part of the agreeing subset. The agreement tolerance is 5% — sources within that band of each other are considered to agree, and the consensus value is published with a high or medium confidence label depending on the spread.

When sources disagree beyond tolerance, the row is held until either (a) a new source ingest agrees with one of the existing values and breaks the tie, (b) the issuer publishes a correction that the primary feed picks up, or (c) an operator reviews and force-publishes after a manual cross-check. A public reconciliation log (publishing soon) will show every held data point from the last 30 days with its resolution.
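The agree-or-hold rule can be sketched in a few lines. This is not the contents of db/reconciliation.php; it is an illustration under stated assumptions. The priority order (Polygon, then SoSoValue, then CoinGlass) is inferred from the source descriptions above, the 5% band is measured relative to the larger of the two values, and a date with only one source is treated as held. None of those three choices is confirmed by the text.

```python
from itertools import combinations

# Assumed priority, highest first (inferred, not confirmed).
PRIORITY = ["polygon", "sosovalue", "coinglass"]
TOLERANCE = 0.05  # the 5% agreement band


def agree(a, b, tol=TOLERANCE):
    """True when two values sit within tol of each other, relative to the larger one."""
    scale = max(abs(a), abs(b))
    return scale == 0 or abs(a - b) / scale <= tol


def reconcile(values):
    """values maps source name -> net flow for one (date, ticker).

    Returns (source, value) for the highest-priority source that agrees with
    at least one other source, or None when the row should be held.
    """
    agreeing = set()
    for s1, s2 in combinations(values, 2):
        if agree(values[s1], values[s2]):
            agreeing.update((s1, s2))
    if not agreeing:
        return None  # no two sources agree: hold the row
    best = min(agreeing, key=PRIORITY.index)
    return best, values[best]
```

With made-up inputs, `reconcile({"polygon": 100.0, "coinglass": 102.0, "sosovalue": 140.0})` publishes Polygon's value, because Polygon and CoinGlass agree and Polygon ranks higher. If Polygon is the outlier instead, the function falls back to the best-ranked source in the agreeing pair.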

2. FlowScore — 0 to 100, per asset

FlowScore is a daily composite score for each tracked asset (BTC, ETH, SOL, XRP). It blends five engines into a single 0–100 number, with 50 as a neutral midpoint. It is calculated once per day by db/cron/calculate-flow-scores.php after the previous session's flows have settled. The five engines and their default weights:

Engine | Weight | What it measures
ETF Flows | 30% | Direction (40%) · magnitude vs 30-day average (35%) · consecutive-day persistence (25%)
Liquidity | 20% | 7-day change in stablecoin issuance (USDT + USDC) and exchange stablecoin reserves (the dry-powder picture)
Derivatives | 20% | Funding rate proximity to neutral (35%) · 7-day OI/price confluence (35%) · 7-day long-vs-short liquidation imbalance (30%)
Price Confirmation | 20% | Trend vs 50-day moving average (50%) · 7-day relative strength versus a benchmark (25%) · realized volatility rank (25%)
Market Context | 10% | NAV premium/discount (50%) and a macro composite of dollar, Nasdaq, and 10Y yield 7-day moves (50%)

Each engine produces a 0–100 sub-score, and the composite is the weighted average. If an engine has no data on a given day — say derivatives data is delayed — its weight redistributes proportionally across the engines that did report. We surface the resulting confidence percentage alongside the score: a FlowScore computed with all five engines lights up at 100% confidence; one engine missing pulls confidence down by that engine's original weight.
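The weighting and redistribution described above amount to a weighted average renormalised over the engines that reported. The sketch below is an illustration rather than the code in db/cron/calculate-flow-scores.php. The engine keys and the function name are hypothetical; the weights are the defaults from the table.

```python
# Default engine weights, in percent.
WEIGHTS = {
    "etf_flows": 30,
    "liquidity": 20,
    "derivatives": 20,
    "price_confirmation": 20,
    "market_context": 10,
}


def flow_score(sub_scores):
    """sub_scores maps engine -> 0-100 sub-score, or None when the engine has no data.

    Returns (score, confidence_pct). Dividing by the weight that actually
    reported is the same as redistributing missing weight proportionally.
    """
    reported = {k: v for k, v in sub_scores.items() if v is not None}
    total_weight = sum(WEIGHTS[k] for k in reported)
    if total_weight == 0:
        return None, 0  # nothing reported: the score shows as --
    score = sum(WEIGHTS[k] * v for k, v in reported.items()) / total_weight
    return score, total_weight


score, confidence = flow_score({
    "etf_flows": 80,
    "liquidity": 60,
    "derivatives": None,  # delayed feed
    "price_confirmation": 70,
    "market_context": 50,
})
```

With these made-up sub-scores and the Derivatives engine missing, the remaining 80 points of weight carry the score: (30·80 + 20·60 + 20·70 + 10·50) / 80 = 68.75, at 80% confidence.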

Alongside the composite, the ETF Flows and Price Confirmation engines drive a market-state label: Confirmation (both strong), Accumulation (flows leading price), Divergence (price leading without flow support), Distribution (both weak), and Transition for the in-between cases. The label is shorthand for what the data is doing; it is not a recommendation.

Interpretation buckets: 0–25 bearish, 26–45 weak, 46–55 neutral, 56–70 constructive, 71–85 bullish, 86–100 strong. The score is structural, not predictive. It tells you what the flow tape and the surrounding tape are doing right now.
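The bucket boundaries translate directly into a lookup. In this sketch the function name is hypothetical, and rounding a fractional score to the nearest whole point before bucketing is an assumption, since the ranges above are stated in integers.

```python
# Upper bound (inclusive) of each interpretation bucket.
BUCKETS = [
    (25, "bearish"),
    (45, "weak"),
    (55, "neutral"),
    (70, "constructive"),
    (85, "bullish"),
    (100, "strong"),
]


def interpretation_bucket(score):
    """Return the bucket name for a 0-100 FlowScore, or '--' when there is no score."""
    if score is None:
        return "--"
    points = round(score)  # assumption: bucket on the rounded score
    if not 0 <= points <= 100:
        raise ValueError("FlowScore must be between 0 and 100")
    for upper, name in BUCKETS:
        if points <= upper:
            return name
```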

3. Premium / discount — what it tells you

An ETF's premium or discount is the percentage gap between its market price and its net asset value (NAV). A positive number means the fund traded above the basket of crypto it holds; a negative number means it traded below. We track per-fund daily averages and surface them as part of the issuer-published flow record.
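As a worked illustration of the definition, the percentage gap is (market price − NAV) / NAV × 100. The helper below is a sketch with a hypothetical name; returning None when an input is missing is an assumption meant to let the display layer show -- instead of a number.

```python
def premium_discount_pct(market_price, nav):
    """Percentage gap between market price and NAV.

    Positive means the fund traded at a premium, negative at a discount.
    Returns None when either input is missing or NAV is not positive.
    """
    if market_price is None or nav is None or nav <= 0:
        return None
    return (market_price - nav) / nav * 100.0
```

For example, a made-up fund priced at 50.10 against a NAV of 50.00 comes out at roughly +0.2%, or about 20 basis points.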

Spot crypto ETFs typically trade within tens of basis points of NAV. The mechanism that keeps the gap tight is intraday creation and redemption: when the ETF trades rich to its underlying, an authorized participant (AP) can buy the underlying spot (or hedge with futures), deliver it to the issuer, and receive ETF shares to sell, collapsing the premium. When the ETF trades cheap, the opposite trade closes the discount. As long as APs can route between spot and ETF cheaply, the gap stays tight.

When the premium widens beyond its usual band, that's a positioning signal. A persistent premium suggests buyers are willing to pay up for the ETF wrapper rather than route to spot, usually because of allocation mandates that forbid direct crypto exposure. A persistent discount points the other way: holders want out faster than the AP arbitrage can absorb. Either condition tends to mean-revert within a session or two; what matters is the size of the deviation and whether it sticks.

We never aggregate premium across funds or across days that have partial coverage. If an issuer hasn't reported, the day's premium stat for that fund shows -- rather than a stale or interpolated value.

4. Daily coverage — how we know the day is in

Every trading day, the platform writes a row to daily_coverage tracking which issuers have reported for that date and which are still outstanding. Coverage rolls up to a single boolean: is_complete = true when every active fund's row has landed. Until then the day is partial, and the dashboard treats it accordingly.
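The roll-up to a single boolean can be sketched as a set comparison. The function name, the dictionary keys other than is_complete, and the placeholder tickers are hypothetical; the real daily_coverage schema is not described above.

```python
def coverage_row(trade_date, active_funds, reported_funds):
    """Build a coverage record for one trading date.

    is_complete is True only when every active fund has a row for the date.
    """
    active = set(active_funds)
    reported = set(reported_funds) & active
    outstanding = sorted(active - reported)
    return {
        "trade_date": trade_date,
        "reported": sorted(reported),
        "outstanding": outstanding,
        "is_complete": not outstanding,
    }


# Placeholder tickers: one fund has not reported yet, so the day is partial.
partial = coverage_row("2025-01-06", ["FUNDA", "FUNDB", "FUNDC"], ["FUNDA", "FUNDC"])
```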

The aggregation rule

We do not compute 7-day, 30-day, or AUM-style aggregates over a window that contains any partial-coverage day. If the rolling window would otherwise sweep up an in-progress day, the aggregate either trails back to the last is_complete = true day (preferred) or surfaces as --. The rule is one-directional: it is fine to display today's per-fund partial values with a date stamp; it is not fine to roll those values into a “7-day total” that pretends to be settled.
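One way to read the rule is shown below. The sketch trails the window back to the most recent complete day, then refuses to aggregate if any day inside the window is still partial or if there is not enough history. The Day record and the function name are hypothetical, and returning None stands in for the -- the dashboard would display.

```python
from dataclasses import dataclass


@dataclass
class Day:
    trade_date: str
    net_flow: float
    is_complete: bool


def gated_rolling_sum(days, window=7):
    """days is ordered oldest to newest.

    Returns (as_of_date, total) for a window ending on the last complete day,
    or None when the window cannot be built from complete days only.
    """
    end = len(days) - 1
    while end >= 0 and not days[end].is_complete:
        end -= 1  # trail back past in-progress days
    start = end - window + 1
    if end < 0 or start < 0:
        return None  # no complete day, or not enough history
    span = days[start:end + 1]
    if not all(d.is_complete for d in span):
        return None  # a partial day inside the window: show --
    return span[-1].trade_date, sum(d.net_flow for d in span)
```

With seven complete days followed by an in-progress eighth, the function returns the seven-day total stamped with day seven's date, so the aggregate never mixes settled and unsettled values.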

Verification timing

The verification cron runs in two passes. The morning pass at 6:35 AM ET recomputes consistency on the fast issuers ingested overnight. The evening pass at 10:05 PM ET sweeps the day's massive-collector run. The coverage check fires five minutes later and updates the daily_coverage row. If a fund is still missing past the next morning, an alert fires to the Telegram bot and the operator either chases the source or holds the row.

5. Brief cadence — eight types, eight slots

InflowScan Staff publishes briefs on a weekly cadence built around when ETF flow data actually lands and when institutional readers want it. Each slot serves a distinct editorial purpose — we don't run the same recap three times a day with different timestamps. All times are America/New_York wall-clock; the cron account runs in ET so DST shifts apply automatically.

Brief | When | Days | What it leads with
Pre-Market | 8:30 AM ET | Mon–Fri | Yesterday's settled flow recap. The day's flow report-of-record.
Macro Pulse | 10:30 AM ET | Wednesday | Mid-week macro tape: dollar, Nasdaq, 10Y yield read into crypto.
Midday | 12:00 PM ET | Daily | Positioning — funding, options, intraday narrative. Not a flow recap.
Closing Bell | 5:00 PM ET | Mon–Fri | The day's price action and intraday range. Today's flows haven't settled yet.
Settled Flows Wrap | 10:15 PM ET | Mon–Fri | Closes the loop: today's T+0 flows are in. ~250 words, flow-only.
Weekly Recap | 10:15 AM ET | Saturday | The week in numbers, daily breakdown, and ETF leaderboard.
ETF Industry | 9:00 AM ET | Sunday | Filings, launches, and the new-fund pipeline.
Week Ahead | 3:15 PM ET | Sunday | Levels, catalysts, and the calendar for the coming week.

Pre-Market fires at 8:30 AM ET because that is the institutional reading window before the 9:30 NYSE open, and the 6:30 AM catch-up has already filled fast-issuer T+1 data. Closing Bell deliberately ships before the 10 PM massive collector, which is why it leads with price action and references prior-session flows as backdrop — not as today's headline. Settled Flows Wrap covers the gap: it fires fifteen minutes after the 10 PM run lands, posts a terse evening anchor, and links the day's permalink for the X feed.

Bylines are always “InflowScan Staff.” Voice targets a desk-strategist morning note — confident in observation, hedged in inference. We don't predict prices, give portfolio advice, or pretend a flat day is a story.

6. Data integrity — the -- rule

Every screen and every brief on InflowScan obeys a single hard rule: no data is better than bad data. When a number is missing, stale, or sourced inconsistently, the surface shows -- or an explicit empty state. We do not approximate, interpolate, or fall back to a plausible-looking value. A credible-looking wrong number is worse than an obvious blank, because users trust the chart and act on it.

What this looks like in practice:

  • Frontend metric elements ship with -- as their default. JS replaces it with the real value on load. If the API call fails, the user sees --, not a stale fabrication.
  • API endpoints return success: false or an empty array on upstream failure. They never synthesize placeholder rows to keep a screen looking populated.
  • Cron collectors skip the day on failure rather than write a fabricated row. A skipped day surfaces honestly in daily_coverage and triggers a Telegram alert.
  • Rolling aggregates (7D, 30D, AUM) are gated by is_complete = true coverage. A partial day in the window pushes the aggregate to the last complete day — or, if that's not possible, to --.
  • FlowScore degrades transparently. If a component engine has no data, its weight redistributes and the confidence percentage drops; we never paper over a missing engine with a synthetic substitute.
  • Briefs use hedged language for inference. The persona file forbids investment advice, price predictions stated with certainty, and the standard LLM tells (“underscores,” “robust,” “navigate,” “leverage” as a verb).
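The first two practices in the list above can be sketched in a few lines. The site's frontend is JavaScript and its API is not shown here, so this Python version is only an illustration of the pattern. The function names, the response shape beyond the success flag, and the number format are all assumptions.

```python
import math

PLACEHOLDER = "--"


def render_metric(value, fmt="{:,.1f}"):
    """Format a metric for display, falling back to -- when there is no real value."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return PLACEHOLDER
    return fmt.format(value)


def api_response(fetch):
    """Wrap an upstream fetch; on failure report it honestly instead of synthesizing rows."""
    try:
        rows = fetch()
    except Exception:
        return {"success": False, "data": []}
    return {"success": True, "data": rows}


def failing_fetch():
    raise ConnectionError("upstream unavailable")


response = api_response(failing_fetch)
```

A failed fetch yields success: False with an empty list, and a missing value renders as -- rather than as the last known number.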

The shorter form of this rule, as it appears in our internal codebase guide: “Never show mock/fake data — -- or empty states only.” When in doubt, blank wins.