Your leadership development program just wrapped up. Everyone passed the final quiz with flying colors. Two months later, the same managers are still avoiding difficult conversations, still running chaotic team meetings, and still getting the same feedback from their direct reports.
The quiz told you they knew the right answers. It told you nothing about whether they could actually do the right things. That’s the assessment design problem most L&D teams face: they’re measuring what people remember rather than what they can do, and the data they get back can’t drive real improvement.
Why do assessments matter beyond checking boxes?
Assessments serve three purposes in L&D, and most teams only hit the first one.
Purpose 1: Prove that learning happened. This is the checkbox. Did participants absorb the material? A post-training quiz handles this, and it’s the least valuable of the three.
Purpose 2: Identify specific skill gaps. A well-designed assessment tells you exactly where someone is strong and where they need work. Not “they scored 72% on leadership,” but “they’re solid on delegation, shaky on giving critical feedback, and completely untested on coaching through failure.” That granularity changes how you design follow-up development.
Purpose 3: Connect development to business results. When your assessments track competencies that map to business KPIs, you can show leadership that improved assessment scores correlate with improved team performance, lower attrition, or faster project delivery. That’s how you protect and grow your L&D budget.
At Risely, built-in assessments track leadership and people management competencies at the sub-skill level, so a manager doesn’t just learn they need to “get better at communication.” They learn they’re strong at clarity but weak at active listening during conflict. That specificity is what makes assessment data actionable.
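To make the difference between an aggregate score and a sub-skill profile concrete, here is a minimal Python sketch. It is an illustration only, not how any particular platform works: the sub-skill names, the 1-5 scale, and the `strong_at` cutoff are all assumptions chosen for the example.

```python
def describe_profile(sub_skill_levels, strong_at=4):
    """Label each sub-skill as strong, needs work, or untested.

    sub_skill_levels maps a sub-skill name to a level on an assumed 1-5
    scale, or to None when the skill has never been assessed.
    """
    profile = {}
    for skill, level in sub_skill_levels.items():
        if level is None:
            profile[skill] = "untested"
        elif level >= strong_at:
            profile[skill] = "strong"
        else:
            profile[skill] = "needs work"
    return profile


# Hypothetical manager: one number would hide all three of these signals.
manager = {
    "delegation": 4,
    "critical_feedback": 2,
    "coaching_through_failure": None,
}
print(describe_profile(manager))
```

The point of the design is that "untested" stays visible as its own category instead of being averaged away or silently counted as zero.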
What types of assessments should you be using?
Most L&D teams default to one type. You need all four.
| Assessment Type | When to Use It | What It Tells You | Example |
|---|---|---|---|
| Pre-assessment | Before training begins | Baseline skill levels and gaps | Self-assessment survey + manager input on current capabilities |
| Formative | During training | Whether learners are keeping pace | Quick scenario-based checks after each module |
| Summative | Immediately after training | What was learned | Role-play exercise where participants demonstrate the skill |
| Performance-based | 4-8 weeks post-training | Whether skills transferred to real work | 360-degree feedback, behavioral observation, outcome metrics |
The magic is in the combination. Pre-assessments tell you where to focus. Formative assessments catch confusion early. Summative assessments confirm learning. Performance-based assessments prove transfer. Skip any one of them and you’ve got blind spots.
How do you design an assessment from scratch?
Step 1: Define what you’re measuring (be specific)
“Communication skills” is too broad to assess meaningfully. Break it down. Are you measuring the ability to give constructive feedback? Run effective meetings? Present to senior stakeholders? Coach a struggling team member?
Each of those is a distinct competency that requires a different assessment approach. Trying to measure all of them with one tool is like using a thermometer to check blood pressure. It’s a measurement instrument, but it’s measuring the wrong thing.
Step 2: Create criteria that are observable and measurable
Swap vague standards for specific, behavioral indicators.
| Vague Criteria | Measurable Criteria |
|---|---|
| "Good communication skills" | "Summarizes key takeaways at the end of meetings and confirms agreement before moving on" |
| "Strong leadership" | "Delegates tasks with clear expectations, deadlines, and check-in points" |
| "Team player" | "Offers to help colleagues when their own workload allows, at least twice per week" |
If you can’t observe someone doing it, you can’t assess it. Every criterion should describe a behavior that’s visible in the workplace.
Step 3: Build assessments that mirror real work
The format should match the skill. Multiple-choice works for knowledge checks. Everything else needs something more.
- For conversation skills: Use recorded role-plays where participants handle realistic scenarios. Assess against a behavioral rubric.
- For decision-making: Present incomplete-information scenarios where participants must choose and justify a course of action.
- For coaching ability: Have participants coach a real colleague through a challenge. Assess based on the coaching behaviors demonstrated (asking vs. telling, listening vs. interrupting).
- For ongoing skill development: Use AI-powered tools that assess skills through natural coaching conversations. Merlin does this by tracking how managers describe their challenges and what approaches they choose, building a skill profile through interaction rather than testing.
Step 4: Build a scoring system that creates useful data
Binary pass/fail tells you who cleared the bar, but nothing about where each person needs to improve. A 1-10 scale with no criteria is subjective and inconsistent. What works: a rubric with 3-5 levels, each with a description of what that level looks like in practice.
For a feedback skill assessment, that might look like:
- Level 1: Avoids giving feedback or gives only positive feedback regardless of performance
- Level 2: Gives feedback but focuses on the person rather than the behavior
- Level 3: Gives behavior-specific feedback with clear examples
- Level 4: Gives behavior-specific feedback, checks for understanding, and collaborates on an improvement plan
- Level 5: Does all of the above and follows up within a week to reinforce and support change
Each level describes observable behavior. Assessors can reliably agree on which level they’re seeing. The data is specific enough to drive targeted development.
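One way to check whether assessors really do agree is to have two of them rate the same role-plays and compare. The sketch below stores the feedback rubric as data and computes simple exact agreement. The ratings are made-up sample data, and exact agreement is just one rough measure chosen for simplicity; a chance-corrected statistic may be more appropriate for a real program.

```python
# Level descriptors taken from the feedback rubric above.
FEEDBACK_RUBRIC = {
    1: "Avoids giving feedback or gives only positive feedback regardless of performance",
    2: "Gives feedback but focuses on the person rather than the behavior",
    3: "Gives behavior-specific feedback with clear examples",
    4: "Gives behavior-specific feedback, checks for understanding, "
       "and collaborates on an improvement plan",
    5: "Does all of the above and follows up within a week to reinforce and support change",
}


def exact_agreement(ratings_a, ratings_b):
    """Share of participants for whom two assessors chose the same level."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both assessors must rate the same non-empty set of participants")
    for level in list(ratings_a) + list(ratings_b):
        if level not in FEEDBACK_RUBRIC:
            raise ValueError(f"unknown rubric level: {level}")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


# Hypothetical ratings of the same five recorded role-plays.
assessor_1 = [3, 4, 2, 5, 3]
assessor_2 = [3, 4, 3, 5, 3]
print(exact_agreement(assessor_1, assessor_2))  # 0.8
```

If agreement is low, that usually points to level descriptions that are still too vague, which is a cue to tighten the rubric before trusting the scores.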
Step 5: Plan what you’ll do with the results
This sounds obvious, but it’s where most assessment programs fall apart. You collect beautiful data and then… nothing changes.
Before launching any assessment, answer these questions:
- Who sees the results? (The learner? Their manager? HR? All three?)
- What development actions follow each score level?
- How frequently will reassessments happen to track progress?
- What aggregate trends will you report to leadership?
Assessment data should flow directly into individual development plans, program redesign decisions, and strategic L&D planning. If it just sits in a spreadsheet, you’ve wasted everyone’s time.
Three assessment design mistakes that waste your budget
Mistake 1: Over-relying on multiple-choice tests. They’re easy to build and easy to score. They’re also almost useless for measuring anything beyond recall. A manager can ace a quiz on conflict resolution principles and still freeze when a real conflict happens. For any skill that involves judgment, interaction, or decision-making, you need assessment formats that require demonstrating the skill, not reciting facts about it.
Mistake 2: Assessing skills that don’t map to the actual job. If your managers spend 60% of their time in 1:1s and team meetings, your assessments should primarily evaluate how they run 1:1s and team meetings. Not how well they can define “transformational leadership” on a test. Build your assessment around the 3-5 highest-impact behaviors for the role.
Mistake 3: Collecting data without acting on it. If your assessments show that 70% of new managers struggle with delegation, but your development program doesn’t have a delegation module, you’re collecting data for its own sake. Every assessment insight should trigger a specific action: adjust the training, add coaching support, create practice opportunities, or flag for individual follow-up.
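A quick way to catch Mistake 3 is to compare assessment results against what the program actually covers. The following sketch assumes rubric levels on a 1-5 scale, treats anything below level 3 as "struggling", and uses hypothetical skill and module names; all of those thresholds and names are assumptions to adapt, not recommendations.

```python
def uncovered_gaps(levels_by_skill, program_modules,
                   struggling_below=3, share_threshold=0.5):
    """Return skills where many people struggle but no module exists.

    levels_by_skill: skill name -> list of rubric levels, one per manager.
    program_modules: set of skill names the current program covers.
    """
    gaps = {}
    for skill, levels in levels_by_skill.items():
        if not levels:
            continue  # nobody assessed yet, so there is nothing to conclude
        share = sum(level < struggling_below for level in levels) / len(levels)
        if share >= share_threshold and skill not in program_modules:
            gaps[skill] = share
    return gaps


# Hypothetical cohort of ten new managers.
levels_by_skill = {
    "delegation": [1, 2, 2, 3, 2, 1, 2, 4, 3, 2],  # 7 of 10 below level 3
    "feedback": [3, 4, 2, 3, 3, 4, 3, 2, 3, 4],    # 2 of 10 below level 3
}
program_modules = {"feedback", "running_meetings"}

print(uncovered_gaps(levels_by_skill, program_modules))  # {'delegation': 0.7}
```

Each skill this returns is a candidate for one of the actions named above: adjust the training, add coaching support, create practice opportunities, or flag for individual follow-up.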
How do you prove your assessments are working?
Track two things over time.
Assessment score trends. Are scores improving after development interventions? If not, either the development isn’t working or the assessment isn’t measuring the right thing. Both are worth investigating.
Correlation with business outcomes. Do managers with higher assessment scores have teams with better engagement? Lower turnover? Faster delivery? If the answer is yes, you have evidence that your assessments measure something that matters, and you’ve built the case for continued investment in both assessment and development.
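Both checks can start as a few lines of analysis. The sketch below computes the average score change after an intervention and a Pearson correlation between manager scores and a team metric. Every number in it is invented sample data, and the choice of Pearson correlation is an assumption made for simplicity. A correlation on a small sample is a signal worth investigating, not proof of cause.

```python
from math import sqrt


def mean_change(before, after):
    """Average per-person change in rubric level after an intervention."""
    if len(before) != len(after) or not before:
        raise ValueError("need matching, non-empty before/after scores")
    return sum(a - b for b, a in zip(before, after)) / len(before)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: rubric levels before and after a coaching program.
before = [2, 3, 2, 4]
after = [3, 3, 4, 4]
print(mean_change(before, after))  # 0.75

# Hypothetical data: manager rubric level vs. team engagement score.
manager_levels = [2, 3, 3, 4, 5]
team_engagement = [61, 66, 70, 74, 82]
print(round(pearson(manager_levels, team_engagement), 2))
```

A value near 1 suggests that higher-scoring managers tend to have more engaged teams, while a value near 0 is a prompt to revisit either the assessment or the metric you paired it with.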
The L&D teams that get assessment right don’t treat it as the end of a program. They treat it as the beginning of a feedback loop that makes every subsequent program better. Your assessments tell you what’s working, what’s not, and where to invest next. That’s not a quiz. That’s a strategy tool.
Start by auditing your current assessments. For each one, ask: does this measure what people can do, or what people can remember? If the answer is “remember,” it’s time to redesign.
