How should companies measure ROI of AI?
Here's my working mental model. Tear it apart!
1) Below a certain investment level (determined by ELT or AI steering committee), ROI can be vibes-based through conversations with users. Goal here is to remove friction & empower people to play with the technology however they find helpful. It just has to lead to a high enough fidelity gut feeling to determine if a higher investment experiment is worth running.
2) Above a certain investment level, ROI has to be as high fidelity as possible. Every AI initiative is run like an experiment, with friction minimized as much as possible. Each experiment has an investment cap, and the investment can be revisited once the experiment is complete. Here's how an experiment would be run & how (soft vs. hard) ROI would be calculated.
- Hypothesis: If recruiters use AI to screen resumes, then the time-to-hire will decrease and the interview-to-offer conversion rate will remain equal or improve.
- Independent Variable: The screening method used (AI-powered software versus traditional human resume review).
- Dependent Variables: Time spent screening (minutes per resume), candidate diversity metrics, and the hiring manager's satisfaction score of shortlisted candidates.
- Controlled Variables: The same job description, the same pool of raw applicant resumes, and the same evaluation criteria (rubric).
To ensure a fair test, you must use a randomized controlled design:
- Control Group: Group A consists of experienced human recruiters who screen 200 incoming resumes using your traditional manual process.
- Experimental Group: Group B uses the AI screening tool to parse and rank the exact same 200 resumes.
Experiment steps:
1) Time Tracking: Log the total hours Group A spends reading resumes versus the time it takes to configure and run Group B's AI tool.
2) Blinded Interview Review: Pass the top 10 candidates selected by the human process and the top 10 selected by the AI process to a hiring manager. Do not tell the manager which candidate came from which screening method.
3) Quality Metric: Have the hiring manager score each candidate's qualifications on a scale of 1–10 based on the interview.
4) Replication: Repeat this exact process across three different job openings (e.g., Sales, Engineering, and Marketing) to ensure the AI's effectiveness isn't limited to just one type of role.
Results & ROI:
The experiment is successful if both conditions are met:
- Condition 1: Time Saved > 0
- Condition 2: AI Average Quality Score ≥ Human Average Quality Score
If not successful, run a new experiment (e.g., how can we tweak the AI to deliver at least as high an average quality score as the human process?)
If successful, measure ROI.
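Here's a minimal sketch of that pass/fail check in Python. The function name, parameter names and sample numbers are made up for illustration. They aren't results from a real experiment.

```python
from statistics import mean


def experiment_succeeded(human_hours, ai_hours, human_scores, ai_scores):
    """Return True only if both success conditions hold.

    Condition 1: time saved > 0
    Condition 2: AI average quality score >= human average quality score
    """
    time_saved = human_hours - ai_hours
    return time_saved > 0 and mean(ai_scores) >= mean(human_scores)


# Hypothetical data: hiring manager scores (1-10) for each top-10 shortlist.
human_scores = [7, 6, 8, 7, 5, 6, 7, 8, 6, 7]
ai_scores = [7, 7, 8, 6, 6, 7, 7, 8, 6, 7]

# Hypothetical hours spent screening: 10.0 manual vs. 0.5 with the AI tool.
print(experiment_succeeded(10.0, 0.5, human_scores, ai_scores))  # True
```

With only 10 scores per group, a small gap in averages could easily be noise, so it's worth treating a narrow pass or fail as a reason to repeat the experiment rather than a final answer.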
In this example ROI would look like:
ROI % = ((Annual Savings - Annual AI Cost) / Annual AI Cost) * 100
So if the company has 50 job roles per year, 9.5 hours are saved per role, recruiter time costs $58/hr, and the screening software costs $10,000 per year, the ROI would be:
((475 hours saved * $58/hr - $10,000 AI tool) / $10,000 AI tool) * 100 = 175.5% ROI
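Running the arithmetic in code is a quick way to double-check the figure and to swap in your own inputs. This is a rough sketch. The function and variable names are mine, and the inputs are the example numbers from above.

```python
def roi_percent(annual_savings, annual_ai_cost):
    """ROI % = ((annual savings - annual AI cost) / annual AI cost) * 100."""
    return (annual_savings - annual_ai_cost) / annual_ai_cost * 100


roles_per_year = 50          # job roles filled per year
hours_saved_per_role = 9.5   # screening hours saved per role
hourly_cost = 58             # recruiter cost in $/hr
annual_ai_cost = 10_000      # screening software cost in $/yr

hours_saved = roles_per_year * hours_saved_per_role   # 475 hours
annual_savings = hours_saved * hourly_cost            # $27,550

print(round(roi_percent(annual_savings, annual_ai_cost), 1))  # 175.5
```

Note that this treats every saved hour as worth the full hourly rate, which is the "soft savings" view. The number only becomes hard savings once those hours turn into real cost or revenue changes.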
And that ROI is realized (goes from soft savings to hard savings) by slowing down the hiring of recruiters, firing recruiters, or realizing revenue sooner by getting new hires into seat faster.
What do you think? Right/wrong approach?