AI-generated assessments for vocational education and training
New Zealand’s first comprehensive examination of whether artificial intelligence (AI) can design
assessments capable of passing the country’s national moderation system

Report written by Stuart G. A. Martin (George Angus Consulting).
Project & Technical Lead: Karl Hartley (Epic Learning)
This research represents the first comprehensive examination of whether artificial intelligence (AI) can design assessments capable of passing New Zealand’s national moderation system. Conducted in partnership with the Construction and Infrastructure Centre of Vocational Excellence (ConCoVE), this study provides empirical insights into AI’s current capabilities and limitations in educational assessment design, complete with replicable methodologies and prompt engineering frameworks.
The Research Challenge
The study addressed a fundamental question: can AI create assessments sufficiently robust to meet New Zealand’s rigorous moderation standards? Using Claude 3.5 and 3.7 Sonnet, researchers attempted to generate assessments for the Trades Essentials micro-credential, a complex 25-credit qualification combining multiple unit standards with additional content. The research progressed through five phases: model selection, baseline assessment development, ethical framework creation, personalised assessment generation, and comprehensive expert review.
Key Discoveries: A Tale of Two Capabilities
The research revealed a striking paradox. Baseline assessments written by AI failed national moderation standards: both Workforce Development Councils (WDCs) rejected the AI-generated baseline (standard) assessment, citing assessment pitched at an inappropriate level, over-reliance on written tasks, and a fundamental misunderstanding of New Zealand’s vocational assessment principles. Both the AI and human assessment designers struggled to interpret the ‘indicative content’ and the intent of the micro-credential and unit standard documentation, exposing a lack of public quality assurance policies that confuses AI and humans alike.
However, personalised assessments achieved universal expert praise. When AI adapted existing assessments for specific learner needs, including an English as a Second Language (ESL) learner and a learner with autism, experts described them as “excellent”, “appropriate”, and “way beyond minimum viable product”. They also noted that these personalised versions would genuinely benefit learners more than standard assessments.
The Unexpected Innovation: Enhanced Assessor Guidance
Perhaps the most significant finding of this research was entirely unanticipated. Rather than primarily modifying assessment questions for the personalised assessments, the AI excelled at generating sophisticated assessor guidance tailored to the specific needs of the learner. For the autism-focused assessment, for instance, the AI created, without being prompted to do so, detailed assessor instructions such as “Present one task at a time with clear beginning and end points” and “Allow 30-50% more processing time for verbal instructions”. This capability addresses a critical gap in vocational education: supporting assessors who lack experience with diverse learners.
Ethical Framework
A significant contribution is the development of the first comprehensive ethical framework specifically designed for AI-generated educational assessments in New Zealand. The framework incorporates Te Tiriti o Waitangi principles and addresses Māori data sovereignty concerns, acknowledging that “data is a living taonga” requiring specific cultural protocols. This represents the first systematic attempt to align AI assessment development with indigenous data rights and New Zealand’s bicultural obligations.
The framework establishes four core principles: Fairness and Justice, ensuring AI systems actively promote equity; Transparency and Accountability, requiring clear documentation and human oversight; Safety, Security, and Data Protection, particularly crucial for sensitive learner data; and Wellbeing, recognising that assessment should support learning rather than simply measure performance.
Implications for Practice
For educators and training providers, this research offers immediate practical value. The prompt engineering methodologies, persona templates, and variable-based systems provide ready-to-use tools for creating personalised assessments. The finding that AI-enhanced assessor guidance significantly improves accessibility suggests substantial potential for supporting inclusive education.
For AI developers and researchers, the study reveals specific technical requirements: the critical importance of temperature settings (0.2-0.4 for consistency), the effectiveness of variable-based prompting systems, and the need for cultural localisation in AI training data.
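To illustrate how these requirements combine in practice, the sketch below pairs a variable-based prompt with a temperature in the reported range. It is a minimal sketch assuming the Anthropic Python SDK; the persona values, variable names, and model alias are illustrative and are not the study’s actual templates.

    import anthropic

    client = anthropic.Anthropic()  # expects ANTHROPIC_API_KEY in the environment

    # Illustrative persona variables; the report's real templates are not reproduced here.
    variables = {
        "learner_profile": "English as a Second Language learner with a carpentry background",
        "qualification": "Trades Essentials micro-credential (25 credits)",
        "baseline_task": "Explain how to safely isolate a power tool before maintenance.",
    }

    # Variable-based prompt: the framework text stays fixed, only the persona values change.
    prompt = (
        "You are adapting an existing vocational assessment for a specific learner.\n"
        "Learner profile: {learner_profile}\n"
        "Qualification: {qualification}\n"
        "Baseline task: {baseline_task}\n"
        "Rewrite the task for this learner and add tailored assessor guidance."
    ).format(**variables)

    response = client.messages.create(
        model="claude-3-5-sonnet-latest",  # illustrative model alias
        max_tokens=1024,
        temperature=0.3,  # within the 0.2-0.4 range the study reports for consistent output
        messages=[{"role": "user", "content": prompt}],
    )
    print(response.content[0].text)

In a variable-based system of this kind, only the persona values change between learners, so the assessment framework stays stable while the adapted tasks and assessor guidance vary.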
For policymakers, the research highlights the urgent need for clearer documentation around assessment frameworks. The AI’s confusion over “indicative content” reflected systemic ambiguity that affects both human and artificial interpretation of educational requirements.
These findings align with New Zealand’s 2025 Strategy for Artificial Intelligence, which emphasises accelerated adoption across key sectors, including education, and aims to “encourage investment in AI adoption by reducing uncertainty, removing unintended and unwanted barriers to AI in legislation, and providing clear guidance on responsible AI innovation within New Zealand’s existing legal framework” [1]. The strategy also notes that New Zealand’s “adoption of OECD AI Principles provides the ethical framework for responsible development that aligns with other OECD countries” [2]. By aligning with internationally recognised and respected frameworks, New Zealand not only mitigates potential future challenges but also signals a clear, proactive commitment to the ethical development and use of AI. Alongside the OECD principles, there is deep value in a framework designed specifically for New Zealand’s unique cultural identity. The proposed national framework has been created with the explicit intention that it be adapted, expanded, and amended by other countries, regions, companies, and industries that wish to use it as a starting point.
Looking Forward
This research establishes that fully autonomous AI assessment design is possible with current capabilities, provided that quality assurance policies are public and consistent and that official interpretations of standards and micro-credential documentation are also publicly available.
AI-assisted personalisation represents a transformative opportunity for inclusive education. The combination of human expertise in framework design with AI capability in adaptation and guidance creation offers a sustainable path toward genuinely personalised learning at scale.
The study’s comprehensive methodology, transparent reporting, and practical focus on implementable solutions make it an essential resource for understanding AI’s current role in education. It also provides a framework for adapting these findings to future AI iterations and outlines AI’s potential to create more equitable, accessible learning experiences.
The future of AI in education lies not in replacement but in intelligent partnership, amplifying human expertise to serve learners better than either humans or AI could achieve alone.
