# How to Run NFR Evidence Audit with TEA

Use TEA's `nfr-assess` workflow to audit non-functional requirement (NFR) evidence across security, performance, reliability, and maintainability. The command remains `nfr-assess` for compatibility; the workflow's role is an evidence audit.

Use test-design before implementation to define NFR thresholds, planned validation, and expected evidence. Use nfr-assess after evidence exists to decide PASS/CONCERNS/FAIL.

## When to Use This

- Enterprise projects with compliance requirements
- Projects with strict NFR thresholds
- Before production release
- After tests, scans, metrics, logs, monitoring data, or CI reports exist
- When NFRs are critical to project success
- Security or performance is mission-critical
Best for:
- Enterprise track projects
- Compliance-heavy industries (finance, healthcare, government)
- High-traffic applications
- Security-critical systems

## Prerequisites

- BMad Method installed
- TEA agent available
- NFRs defined in PRD, requirements doc, architecture, or `test-design`
- Evidence sources available or explicitly missing (test results, security scans, performance metrics, logs, dashboards, CI reports)

Note: You can run the audit without complete evidence. TEA will mark categories as CONCERNS where evidence is missing and document what's needed.

## 1. Run the NFR Evidence Audit Workflow

Start a fresh chat and run:

```
nfr-assess
```

This loads TEA and starts the NFR Evidence Audit workflow.

## 2. Specify NFR Categories

TEA will ask which NFR categories to audit.

Available Categories:

| Category | Focus Areas |
|---|---|
| Security | Authentication, authorization, encryption, vulnerabilities, security headers, input validation |
| Performance | Response time, throughput, resource usage, database queries, frontend load time |
| Reliability | Error handling, recovery mechanisms, availability, failover, data backup |
| Maintainability | Code quality, test coverage, technical debt, documentation, dependency health |

Example Response:

```
Assess:
- Security (critical for user data)
- Performance (API must be fast)
- Reliability (99.9% uptime requirement)

Skip maintainability for now
```

## 3. Provide NFR Thresholds

TEA will use specific thresholds for each category, preferably from PRD, architecture, or `test-design`.

Critical Principle: Never guess thresholds.

If you don't know the exact requirement, tell TEA to mark it as UNKNOWN/CONCERNS and request clarification from stakeholders.

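As a minimal sketch of this principle, a threshold can be modeled so that an unknown target stays explicitly unknown instead of being filled with a guess. The `Threshold` class, the `evaluate` function, and the status labels below are hypothetical illustrations, not part of TEA.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Threshold:
    """One NFR requirement. `target` stays None until stakeholders confirm it."""
    name: str
    target: Optional[float]          # None means UNKNOWN, never a guessed value
    higher_is_better: bool = False   # True for throughput-style metrics


def evaluate(threshold: Threshold, actual: Optional[float]) -> str:
    """Return MET, NOT_MET, or UNKNOWN for a single requirement."""
    if threshold.target is None or actual is None:
        # Missing threshold or missing evidence: flag it, do not guess.
        return "UNKNOWN"
    if threshold.higher_is_better:
        return "MET" if actual > threshold.target else "NOT_MET"
    return "MET" if actual < threshold.target else "NOT_MET"


p99 = Threshold("API response time P99 (ms)", target=200)
throughput = Threshold("Throughput (rps)", target=1000, higher_is_better=True)
uptime = Threshold("Availability (%)", target=None)  # not yet agreed

print(evaluate(p99, 350))         # NOT_MET
print(evaluate(throughput, 850))  # NOT_MET
print(evaluate(uptime, 99.95))    # UNKNOWN
```

An `UNKNOWN` result maps naturally to the UNKNOWN/CONCERNS marking described above, with a follow-up question to stakeholders.
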
### Security Thresholds

Example:

```
Requirements:
- All endpoints require authentication: YES
- Data encrypted at rest: YES (PostgreSQL TDE)
- Zero critical vulnerabilities: YES (npm audit)
- Input validation on all endpoints: YES (Zod schemas)
- Security headers configured: YES (helmet.js)
```

### Performance Thresholds

Example:

```
Requirements:
- API response time P99: < 200ms
- API response time P95: < 150ms
- Throughput: > 1000 requests/second
- Frontend initial load: < 2 seconds
- Database query time P99: < 50ms
```

### Reliability Thresholds

Example:

```
Requirements:
- Error handling: All endpoints return structured errors
- Availability: 99.9% uptime
- Recovery time: < 5 minutes (RTO)
- Data backup: Daily automated backups
- Failover: Automatic with < 30s downtime
```

### Maintainability Thresholds

Example:

```
Requirements:
- Test coverage: > 80%
- Code quality: SonarQube grade A
- Documentation: All APIs documented
- Dependency age: < 6 months outdated
- Technical debt: < 10% of codebase
```

## 4. Provide Evidence

TEA will ask where to find evidence for each requirement.

Evidence Sources:

| Category | Evidence Type | Location |
|---|---|---|
| Security | Security scan reports | /reports/security-scan.pdf |
| Security | Vulnerability scan | npm audit, snyk test results |
| Security | Auth test results | Test reports showing auth coverage |
| Performance | Load test results | /reports/k6-load-test.json |
| Performance | APM data | Datadog, New Relic dashboards |
| Performance | Lighthouse scores | /reports/lighthouse.json |
| Reliability | Error rate metrics | Production monitoring dashboards |
| Reliability | Uptime data | StatusPage, PagerDuty logs |
| Maintainability | Coverage reports | /reports/coverage/index.html |
| Maintainability | Code quality | SonarQube dashboard |
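
Before answering TEA, it can help to check which of the file-based evidence sources in the table above actually exist. The following is a rough sketch under stated assumptions: `inventory_evidence` is a hypothetical helper, and the paths are treated as relative to a project root rather than the absolute `/reports/...` form shown in the table.

```python
import tempfile
from pathlib import Path


def inventory_evidence(sources: dict[str, list[str]], root: Path) -> dict[str, str]:
    """Map each category to AVAILABLE, or CONCERNS when any file is missing."""
    status = {}
    for category, relative_paths in sources.items():
        missing = [p for p in relative_paths if not (root / p).is_file()]
        # A category with no listed evidence at all is also flagged.
        status[category] = "CONCERNS" if missing or not relative_paths else "AVAILABLE"
    return status


# Demo in a throwaway directory that contains only the k6 report.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "reports").mkdir()
    (root / "reports" / "k6-load-test.json").write_text("{}")
    sources = {
        "Performance": ["reports/k6-load-test.json"],
        "Security": ["reports/security-scan.pdf"],
        "Reliability": [],
    }
    print(inventory_evidence(sources, root))
```

Dashboard-based evidence (APM, uptime tools) cannot be checked this way and still has to be listed by hand.
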

Example Response:

```
Evidence:
- Security: npm audit results (clean), auth tests 15/15 passing
- Performance: k6 load test at /reports/k6-results.json
- Reliability: Error rate 0.01% in staging (logs in Datadog)

Don't have:
- Uptime data (new system, no baseline)
- Mark as CONCERNS and request monitoring setup
```

## 5. Review NFR Evidence Audit Report

TEA generates a comprehensive evidence audit report.

### Evidence Audit Report (nfr-assessment.md)

# NFR Evidence Audit

**Date:** 2026-01-13
**Epic:** User Profile Management
**Release:** v1.2.0
**Overall Decision:** CONCERNS ⚠️

## Executive Summary

| Category        | Status      | Critical Issues |
| --------------- | ----------- | --------------- |
| Security        | PASS ✅     | 0               |
| Performance     | CONCERNS ⚠️ | 2               |
| Reliability     | PASS ✅     | 0               |
| Maintainability | PASS ✅     | 0               |

**Decision Rationale:** Performance metrics below target (P99 latency, throughput). Mitigation plan in place. Security and reliability meet all requirements.

---
## Security Assessment

**Status:** PASS ✅

### Requirements Met

| Requirement              | Target         | Actual              | Status |
| ------------------------ | -------------- | ------------------- | ------ |
| Authentication required  | All endpoints  | 100% enforced       | ✅     |
| Data encryption at rest  | PostgreSQL TDE | Enabled             | ✅     |
| Critical vulnerabilities | 0              | 0                   | ✅     |
| Input validation         | All endpoints  | Zod schemas on 100% | ✅     |
| Security headers         | Configured     | helmet.js enabled   | ✅     |

### Evidence
**Security Scan:**

```bash
$ npm audit
found 0 vulnerabilities
```

**Authentication Tests:**

- 15/15 auth tests passing
- Tested unauthorized access (401 responses)
- Token validation working

**Penetration Testing:**

- Report: `/reports/pentest-2026-01.pdf`
- Findings: 0 critical, 2 low (addressed)

**Conclusion:** All security requirements met. No blockers.

## Performance Assessment

**Status:** CONCERNS ⚠️

### Requirements Status

| Metric | Target | Actual | Status |
|---|---|---|---|
| API response P99 | < 200ms | 350ms | ❌ Exceeds |
| API response P95 | < 150ms | 180ms | ⚠️ Exceeds |
| Throughput | > 1000 rps | 850 rps | ⚠️ Below |
| Frontend load | < 2s | 1.8s | ✅ Met |
| DB query P99 | < 50ms | 85ms | ❌ Exceeds |

### Issues Identified

#### Issue 1: P99 Latency Exceeds Target

**Measured:** 350ms P99 (target: <200ms)

**Root Cause:** Database queries not optimized

- Missing indexes on profile queries
- N+1 query problem in profile endpoint

**Impact:** User experience degraded for 1% of requests

**Mitigation Plan:**

- Add composite index on `(user_id, profile_id)` - backend team, 2 days
- Refactor profile endpoint to use joins instead of multiple queries - backend team, 3 days
- Re-run load tests after optimization - QA team, 1 day

**Owner:** Backend team lead

**Deadline:** Before release (January 20, 2026)

#### Issue 2: Throughput Below Target

**Measured:** 850 rps (target: >1000 rps)

**Root Cause:** Connection pool size too small

- PostgreSQL max_connections = 100 (too low)
- No connection pooling in application

**Impact:** System cannot handle expected traffic

**Mitigation Plan:**

- Increase PostgreSQL max_connections to 500 - DevOps, 1 day
- Implement connection pooling with pg-pool - backend team, 2 days
- Re-run load tests - QA team, 1 day

**Owner:** DevOps + Backend team

**Deadline:** Before release (January 20, 2026)

### Evidence

**Load Testing:**

- Tool: k6
- Duration: 10 minutes
- Virtual Users: 500 concurrent
- Report: `/reports/k6-load-test.json`

**Results:**

```
scenarios: (100.00%) 1 scenario, 500 max VUs, 10m30s max duration
http_req_duration..............: avg=250ms min=45ms med=180ms max=2.1s p(90)=280ms p(95)=350ms
http_reqs......................: 85000 (850/s)
http_req_failed................: 0.1%
```

**APM Data:**

- Tool: Datadog
- Dashboard: https://app.datadoghq.com/dashboard/abc123

**Conclusion:** Performance issues identified with mitigation plan. Re-assess after optimization.

## Reliability Assessment

**Status:** PASS ✅

### Requirements Met

| Requirement | Target | Actual | Status |
|---|---|---|---|
| Error handling | Structured errors | 100% endpoints | ✅ |
| Availability | 99.9% uptime | 99.95% (staging) | ✅ |
| Recovery time | < 5 min (RTO) | 3 min (tested) | ✅ |
| Data backup | Daily | Automated daily | ✅ |
| Failover | < 30s downtime | 15s (tested) | ✅ |

### Evidence

**Error Handling Tests:**

- All endpoints return structured JSON errors
- Error codes standardized (400, 401, 403, 404, 500)
- Error messages user-friendly (no stack traces)

**Chaos Engineering:**

- Tested database failover: 15s downtime ✅
- Tested service crash recovery: 3 min ✅
- Tested network partition: Graceful degradation ✅

**Monitoring:**

- Staging uptime (30 days): 99.95%
- Error rate: 0.01% (target: <0.1%)
- P50 availability: 100%

**Conclusion:** All reliability requirements exceeded. No issues.

## Maintainability Assessment

**Status:** PASS ✅

### Requirements Met

| Requirement | Target | Actual | Status |
|---|---|---|---|
| Test coverage | > 80% | 85% | ✅ |
| Code quality | Grade A | Grade A | ✅ |
| Documentation | All APIs | 100% documented | ✅ |
| Outdated dependencies | < 6 months | 3 months avg | ✅ |
| Technical debt | < 10% | 7% | ✅ |

### Evidence

**Test Coverage:**

```
Statements : 85.2% ( 1205/1414 )
Branches   : 82.1% ( 412/502 )
Functions  : 88.5% ( 201/227 )
Lines      : 85.2% ( 1205/1414 )
```

**Code Quality:**

- SonarQube: Grade A
- Maintainability rating: A
- Technical debt ratio: 7%
- Code smells: 12 (all minor)

**Documentation:**

- API docs: 100% coverage (OpenAPI spec)
- README: Complete and up-to-date
- Architecture docs: ADRs for all major decisions

**Conclusion:** All maintainability requirements met. Codebase is healthy.

## Overall Gate Decision

### Decision: CONCERNS ⚠️

**Rationale:**

- Blockers: None
- Concerns: Performance metrics below target (P99 latency, throughput)
- Mitigation: Plan in place with clear owners and deadlines (5 days total)
- Passing: Security, reliability, maintainability all green

### Actions Required Before Release

- Optimize database queries (backend team, 3 days)
  - Add indexes
  - Fix N+1 queries
  - Implement connection pooling
- Re-run performance tests (QA team, 1 day)
  - Validate P99 < 200ms
  - Validate throughput > 1000 rps
- Update this audit (TEA, 1 hour)
  - Re-run `*nfr-assess` with new evidence
  - Confirm PASS status

### Waiver Option (If Business Approves)

If the business decides to deploy with current performance:

**Waiver Justification:**

```markdown
## Performance Waiver

**Waived By:** VP Engineering, Product Manager
**Date:** 2026-01-15
**Reason:** Business priority to launch by Q1
**Conditions:**

- Set monitoring alerts for P99 > 300ms
- Plan optimization for v1.3 (February release)
- Document known performance limitations in release notes

**Accepted Risk:**

- 1% of users experience slower response (350ms vs 200ms)
- System can handle current traffic (850 rps sufficient for launch)
- Optimization planned for next release
```

### Approvals

- Product Manager - Review business impact
- Tech Lead - Review mitigation plan
- QA Lead - Validate test evidence
- DevOps - Confirm infrastructure ready

### Monitoring Plan Post-Release

**Performance Alerts:**

- P99 latency > 400ms (critical)
- Throughput < 700 rps (warning)
- Error rate > 1% (critical)

**Review Cadence:**

- Daily: Check performance dashboards
- Weekly: Review alert trends
- Monthly: Re-audit NFR evidence

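
The alert thresholds in the sample monitoring plan could be expressed as data and checked against a metrics snapshot. This is only an illustrative sketch: the metric key names and the `triggered_alerts` helper are hypothetical, and a real setup would normally define these rules in the monitoring tool itself.

```python
# (metric key, comparison, limit, severity), mirroring the sample alert list above.
# The key names are made up for this sketch.
ALERT_RULES = [
    ("p99_latency_ms", "gt", 400, "critical"),
    ("throughput_rps", "lt", 700, "warning"),
    ("error_rate_pct", "gt", 1.0, "critical"),
]


def triggered_alerts(metrics: dict[str, float]) -> list[tuple[str, str]]:
    """Return (metric, severity) pairs for every rule the snapshot breaches."""
    alerts = []
    for key, comparison, limit, severity in ALERT_RULES:
        value = metrics.get(key)
        if value is None:
            continue  # metric not reported in this snapshot, nothing to compare
        breached = value > limit if comparison == "gt" else value < limit
        if breached:
            alerts.append((key, severity))
    return alerts


# Snapshot matching the audited release: slow, but inside the alert limits.
print(triggered_alerts({"p99_latency_ms": 350, "throughput_rps": 850, "error_rate_pct": 0.1}))
# A degraded snapshot that breaches two rules.
print(triggered_alerts({"p99_latency_ms": 450, "throughput_rps": 650, "error_rate_pct": 0.1}))
```
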
## What You Get

### NFR Evidence Audit Report

- Category-by-category analysis (Security, Performance, Reliability, Maintainability)
- Requirements status (target vs actual)
- Evidence for each requirement
- Issues identified with root cause analysis

### Gate Decision

- **PASS** ✅ - All NFRs met, ready to release
- **CONCERNS** ⚠️ - Some NFRs not met, mitigation plan exists
- **FAIL** ❌ - Critical NFRs not met, blocks release
- **WAIVED** ⏭️ - Business-approved waiver with documented risk

### Mitigation Plans

- Specific actions to address concerns
- Owners and deadlines
- Re-audit criteria

### Monitoring Plan

- Post-release monitoring strategy
- Alert thresholds
- Review cadence

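
To make the four gate outcomes concrete, here is one possible way to roll category statuses up into an overall decision. It rests on two assumptions that this guide does not state: the worst category status wins, and a waiver only applies to CONCERNS. The `overall_decision` function is hypothetical and is not how TEA itself computes the gate.

```python
# Ordering assumed for this sketch: higher number = more severe.
SEVERITY = {"PASS": 0, "WAIVED": 1, "CONCERNS": 2, "FAIL": 3}


def overall_decision(categories: dict[str, str], waived: frozenset[str] = frozenset()) -> str:
    """Return the most severe status across categories, after applying waivers."""
    if not categories:
        raise ValueError("at least one audited category is required")
    effective = []
    for name, status in categories.items():
        if status not in SEVERITY:
            raise ValueError(f"unknown status for {name}: {status}")
        # Assumption: a business-approved waiver turns CONCERNS into WAIVED,
        # but never hides a FAIL.
        if status == "CONCERNS" and name in waived:
            status = "WAIVED"
        effective.append(status)
    return max(effective, key=lambda s: SEVERITY[s])


audit = {
    "Security": "PASS",
    "Performance": "CONCERNS",
    "Reliability": "PASS",
    "Maintainability": "PASS",
}
print(overall_decision(audit))                                     # CONCERNS
print(overall_decision(audit, waived=frozenset({"Performance"})))  # WAIVED
```
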
## Tips
### Plan NFRs Early, Audit Evidence Later

**Phase 2 (Enterprise):**

Define NFR requirements in the PRD so `test-design` can:

- Identify NFR requirements early
- Plan for performance testing
- Budget for security audits
- Set up monitoring infrastructure

**Phase 3:**

Run `test-design` to turn NFRs into thresholds, planned validation, and expected evidence.

**Phase 4 or Gate:**

Run `nfr-assess` before release to audit the evidence.

### Never Guess Thresholds
If you don't know the NFR target:

**Don't:**

```
API response time should probably be under 500ms
```

**Do:**

```
Mark as CONCERNS - Request threshold from stakeholders
"What is the acceptable API response time?"
```

### Collect Evidence Beforehand
Before running `*nfr-assess`, gather:

**Security:**

```bash
npm audit              # Vulnerability scan
snyk test              # Alternative security scan
npm run test:security  # Security test suite
```

**Performance:**

```bash
npm run test:load            # k6 or artillery load tests
npm run test:lighthouse      # Frontend performance
npm run test:db-performance  # Database query analysis
```

**Reliability:**

- Production error rate (last 30 days)
- Uptime data (StatusPage, PagerDuty)
- Incident response times

**Maintainability:**

```bash
npm run test:coverage  # Test coverage report
npm run lint           # Code quality check
npm outdated           # Dependency freshness
```

### Use Real Data, Not Assumptions

**Don't:**

```
System is probably fast enough
Security seems fine
```

**Do:**

```
Load test results show P99 = 350ms
npm audit shows 0 vulnerabilities
Test coverage report shows 85%
```

Evidence-based decisions prevent surprises in production.

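One way to replace "security seems fine" with a concrete figure is to read the critical count from a saved scan report. The sketch below assumes the report was produced with `npm audit --json` and contains a `metadata.vulnerabilities` object with per-severity counts; check this against the output of your npm version. The `critical_count` helper and the sample data are hypothetical.

```python
import json
from typing import Optional


def critical_count(audit_json: str) -> Optional[int]:
    """Return the number of critical vulnerabilities, or None if the report lacks it."""
    report = json.loads(audit_json)
    counts = report.get("metadata", {}).get("vulnerabilities", {})
    return counts.get("critical")


# Made-up sample mirroring the assumed report shape, not real scan output.
sample = json.dumps(
    {"metadata": {"vulnerabilities": {"low": 2, "moderate": 0, "high": 0, "critical": 0}}}
)
print(critical_count(sample))  # 0
print(critical_count("{}"))    # None
```

A `None` result means the evidence is missing or in an unexpected format, which is a reason to mark the requirement as CONCERNS rather than assume zero.
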
### Document Waivers Thoroughly

If the business approves a waiver:

Required:
- Who approved (name, role, date)
- Why (business justification)
- Conditions (monitoring, future plans)
- Accepted risk (quantified impact)
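
The required fields above can be captured in a small record so an incomplete waiver is easy to spot. This is a sketch only; the `Waiver` dataclass and `missing_fields` helper are hypothetical names, and the sample values are taken from the waiver example in this guide.

```python
from dataclasses import dataclass, fields


@dataclass
class Waiver:
    approved_by: str    # who approved (name, role)
    approved_on: str    # approval date
    reason: str         # business justification
    conditions: str     # monitoring, future plans
    accepted_risk: str  # quantified impact


def missing_fields(waiver: Waiver) -> list[str]:
    """List the names of required fields that were left blank."""
    return [f.name for f in fields(waiver) if not getattr(waiver, f.name).strip()]


incomplete = Waiver(
    approved_by="CTO, VP Product",
    approved_on="2026-01-15",
    reason="Q1 launch critical for investor demo",
    conditions="Optimize in v1.3, monitor closely",
    accepted_risk="",  # risk not yet quantified
)
print(missing_fields(incomplete))  # ['accepted_risk']
```
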

Example:

```
Waived by: CTO, VP Product (2026-01-15)
Reason: Q1 launch critical for investor demo
Conditions: Optimize in v1.3, monitor closely
Risk: 1% of users experience 350ms latency (acceptable for launch)
```

### Re-Assess After Fixes

After implementing mitigations:

```
1. Fix performance issues
2. Run load tests again
3. Run nfr-assess with new evidence
4. Verify PASS status
```

Don't deploy with CONCERNS without mitigation or waiver.

### Integrate with Release Checklist

```markdown
## Release Checklist

### Pre-Release

- [ ] All tests passing
- [ ] Test coverage > 80%
- [ ] Run nfr-assess
- [ ] NFR status: PASS or WAIVED

### Performance

- [ ] Load tests completed
- [ ] P99 latency meets threshold
- [ ] Throughput meets threshold

### Security

- [ ] Security scan clean
- [ ] Auth tests passing
- [ ] Penetration test complete

### Post-Release

- [ ] Monitoring alerts configured
- [ ] Dashboards updated
- [ ] Incident response plan ready
```

## Common Issues

### No Evidence Available

**Problem:** Don't have performance data, security scans, etc.

**Solution:**

```
Mark as CONCERNS for categories without evidence
Document what evidence is needed
Set up tests/scans before re-audit
```

Don't block on missing evidence - document what's needed and proceed.

### Thresholds Too Strict

**Problem:** Can't meet unrealistic thresholds.

**Symptoms:**

- P99 < 50ms (impossible for complex queries)
- 100% test coverage (impractical)
- Zero technical debt (unrealistic)

**Solution:**

```
Negotiate thresholds with stakeholders:
- "P99 < 50ms is unrealistic for our DB queries"
- "Propose P99 < 200ms based on industry standards"
- "Show evidence from load tests"
```

Use data to negotiate realistic requirements.

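When bringing load-test data to that conversation, it helps to be clear about how a percentile such as P99 is derived from raw latency samples. The sketch below uses the nearest-rank method; load-testing and APM tools may use other methods such as interpolation, so their figures can differ slightly. The `percentile` function is an illustration, not a TEA or k6 API.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Multiply before dividing so whole-number cases stay exact.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


latencies_ms = [float(n) for n in range(1, 101)]  # synthetic samples: 1ms to 100ms
print(percentile(latencies_ms, 95))  # 95.0
print(percentile(latencies_ms, 99))  # 99.0
```
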
### Audit Takes Too Long

**Problem:** Gathering evidence for all categories is time-consuming.

**Solution:** Focus on critical categories first:

For most projects:

```
Priority 1: Security (always critical)
Priority 2: Performance (if high-traffic)
Priority 3: Reliability (if uptime critical)
Priority 4: Maintainability (nice to have)
```

Assess categories incrementally, not all at once.

### CONCERNS vs FAIL - When to Block?

**CONCERNS** ⚠️:

- Issues exist but not critical
- Mitigation plan in place
- Business accepts risk (with waiver)
- Can deploy with monitoring

**FAIL** ❌:

- Critical security vulnerability (CVE critical)
- System unusable (error rate >10%)
- Data loss risk (no backups)
- Zero mitigation possible

**Rule of thumb:** If you can mitigate or monitor, use CONCERNS. Reserve FAIL for absolute blockers.

## Related Guides

- How to Run Trace - Gate decision complements NFR
- How to Run Test Review - Quality complements NFR
- Run TEA for Enterprise - Enterprise workflow

## Understanding the Concepts

- Risk-Based Testing - Risk assessment principles
- TEA Overview - NFR in release gates

## Reference

- Command: *nfr-assess - Full command reference
- TEA Configuration - Enterprise config options

Generated with BMad Method - TEA (Test Engineering Architect)