Questions You MUST Ask Before Analyzing Any Web3 Dataset
You can't analyze any dataset (Web3 or not) successfully without asking the right questions.
Here is a guide to the questions that set up a successful analysis.
1️⃣ What Problem Are You Actually Solving?
Every dataset is collected for a purpose. Understanding that purpose helps you pinpoint the problem you want to solve, which makes it easier to find a solution. Without a clear answer, you'll drown in irrelevant metrics.
2️⃣ Where's Your Data Coming From?
Knowing the source of your data equips you with the correct approach to analyze your dataset.
This also helps you know if the data is:
> Indexed, e.g. on Dune or Flipside
> Onchain, e.g. transaction histories, wallet addresses, and smart contract interactions
> Offchain, such as metadata, price data from CEXs, and so on
N.B. Know your source's strengths and blind spots.
3️⃣ What's the Scope and Context of Your Data?
Context in a dataset refers to the background information and relevant details that give meaning and significance to raw data.
By asking this question, you will be able to answer:
> which network/chain your dataset represents (Ethereum mainnet, another EVM-compatible chain, a Layer 2, Solana?)
> what the timeframe of your analysis is (does your dataset cover specific events, like a hack, upgrade, or airdrop?)
> what the specific addresses of the smart contracts involved are (make sure you are analyzing the correct contract, not a proxy or clone)
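The three scope answers above can be turned into an explicit filter. This is a minimal sketch, not a definitive implementation: the field names (`chain`, `block_time`, `to`) and the addresses are hypothetical placeholders, and it assumes EVM-style hex addresses, which can be compared case-insensitively.

```python
from datetime import datetime, timezone

def in_scope(rows, chain, start, end, contracts):
    """Keep rows on `chain`, inside [start, end), sent to one of `contracts`."""
    # Lowercase both sides so mixed-case and lowercase forms of one address match.
    wanted = {addr.lower() for addr in contracts}
    return [
        r for r in rows
        if r["chain"] == chain
        and start <= r["block_time"] < end
        and r["to"].lower() in wanted
    ]

# Placeholder addresses for illustration only, not real contracts.
TARGET = "0x" + "ab" * 20
OTHER = "0x" + "cd" * 20

def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)

rows = [
    {"chain": "ethereum", "block_time": utc(2024, 3, 10), "to": "0x" + "AB" * 20},  # in scope
    {"chain": "ethereum", "block_time": utc(2024, 5, 1), "to": TARGET},             # outside window
    {"chain": "arbitrum", "block_time": utc(2024, 3, 12), "to": TARGET},            # wrong chain
    {"chain": "ethereum", "block_time": utc(2024, 3, 15), "to": OTHER},             # wrong contract
]

scoped = in_scope(rows, "ethereum", utc(2024, 3, 1), utc(2024, 4, 1), [TARGET])
print(len(scoped))
```

Writing the scope down as code makes it easy to see exactly which chain, window, and contracts an analysis covers.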
4️⃣ How's the Quality of Your Data?
Blockchain data isn't clean; it's your job to make it clean.
That means spotting discrepancies in data types, identifying duplicate rows or tables, and detecting missing blocks.
Ask: What's missing? What's misleading? What changed mid-dataset?
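Here is a rough sketch of those three checks (duplicates, type discrepancies, block gaps) in plain Python. The field names `tx_hash` and `block_number` are assumptions about how the dataset is shaped. Note that a gap in block numbers is only a prompt to investigate: in a filtered dataset, a block may simply contain no matching transactions.

```python
from collections import Counter

def quality_report(rows):
    """Flag duplicate hashes, non-integer block numbers, and gaps in the block range."""
    counts = Counter(r["tx_hash"] for r in rows)
    duplicates = sorted(h for h, n in counts.items() if n > 1)

    # Rows whose block number arrived as something other than an int (e.g. a string).
    bad_types = [r["tx_hash"] for r in rows if not isinstance(r["block_number"], int)]

    # Only well-typed block numbers are used to look for gaps.
    blocks = {r["block_number"] for r in rows if isinstance(r["block_number"], int)}
    if blocks:
        missing = sorted(set(range(min(blocks), max(blocks) + 1)) - blocks)
    else:
        missing = []

    return {"duplicates": duplicates, "bad_types": bad_types, "missing_blocks": missing}

rows = [
    {"tx_hash": "0x01", "block_number": 100},
    {"tx_hash": "0x02", "block_number": 101},
    {"tx_hash": "0x02", "block_number": 101},    # duplicate row
    {"tx_hash": "0x03", "block_number": "104"},  # block number stored as text
    {"tx_hash": "0x04", "block_number": 105},
]

report = quality_report(rows)
print(report)
```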
5️⃣ What Are You NOT Seeing?
Every dataset has blind spots:
• Off-chain activity (CEX trades, L2 sequencers)
• Private mempools (MEV, private transactions)
• Cross-chain bridges (fragmented liquidity)
• Failed transactions (attempted but reverted, so they emit no event logs and are missing from event-based tables)
Acknowledge what's invisible.
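One of these blind spots can at least be sized when raw transaction data is available. The sketch below assumes a hypothetical boolean `success` field on each transaction record and computes the share of attempts that reverted, which is activity an event-only view would not show.

```python
def failed_share(transactions):
    """Fraction of transactions that reverted (0.0 for an empty list)."""
    if not transactions:
        return 0.0
    failed = sum(1 for tx in transactions if not tx["success"])
    return failed / len(transactions)

# Illustrative records only; `success` is an assumed field name.
txs = [
    {"tx_hash": "0x01", "success": True},
    {"tx_hash": "0x02", "success": False},
    {"tx_hash": "0x03", "success": True},
    {"tx_hash": "0x04", "success": False},
]

share = failed_share(txs)
print(f"{share:.0%} of attempts reverted")  # prints "50% of attempts reverted"
```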
Why This Matters:
• Bad questions → Pretty dashboards with wrong conclusions
• Good questions → Insights that actually matter
By answering these questions first, you ensure the data is reliable and that your analysis will provide actionable insights rather than misleading trends.
🔄 RT for the sake of a data analyst who might find this helpful