AI-Driven Quality Engineering Architect · Available for new engagements · Australia

SNK Digital
Enterprise / Government · Jan 2017 – Feb 2023 · 6 years

Enterprise QA Leadership — 6-Year Multi-Programme Tenure at a Major AU SI

Technical Test Manager across government and enterprise SI programmes — multi-discipline test scope, shared framework architecture, tender QE strategy, and capability uplift across mixed onshore and offshore teams

JMeter · LoadRunner · Jenkins · Azure DevOps · UiPath · Selenium · JUnit · JIRA · TestRail · Confluence

Engagement context

Over 6 years as Technical Test Manager at a major Australian systems integrator, I led testing strategy and delivery across a portfolio of concurrent government and enterprise programmes — telecommunications, public sector, and managed services. The tenure gave me something most test architects don't accumulate: full delivery cycles repeated multiple times, from tender response through to the post-go-live operational phase where steady-state regression discipline either holds or collapses. I was not embedded in a single programme; the role carried a cross-company remit — technical testing leadership and strategic direction for technical testers across the organisation, monitoring cross-project QA process health, and supporting business development with tender and technical testing responses.

Five stat tiles: 6 years duration Jan 2017 to Feb 2023, 21-plus test disciplines owned or directed, concurrent programmes across telco, public sector, and managed services, cross-company QA remit as Technical Test Manager across the SI, and full delivery cycles repeated from tender through post-go-live.

Full delivery cycles, repeated multiple times, are what most test architects never accumulate. Six years of concurrent programmes compressed a decade of pattern-library depth into a single tenure.

The win

"I built and ran the test discipline for a portfolio of large-scale government and enterprise SI programmes across a 6-year period, not as a single-engagement specialist but as the organisation's senior testing authority. The architectural contribution was a reusable test framework spine with per-engagement variance: one coherent framework across concurrent customer engagements, each inheriting the core patterns, each maintaining the variance it needed. That pattern (shared foundation, controlled variance) is the architecture the later AI-augmented engagements drew on. The breadth of test scope I managed across this period is the other differentiator: 21+ test disciplines, including specialised surfaces most QA practices rarely reach."

Multi-discipline scope

The full test scope managed across this engagement is the single clearest signal of its breadth. Most QA practices operate across four or five test disciplines. Across this 6-year portfolio, I owned or directed all of the following:

Six discipline category cards grouping 21-plus test disciplines. Functional core: System, Integration, Regression, Automation, UAT, SIT, SVT — seven disciplines. Performance and resilience: Performance, Load, Disaster Recovery, Operational Readiness — four disciplines. Security and compliance: Penetration Testing, Cyber Security, Accessibility to WCAG — three disciplines. Data and integration: ETL, API, AGLS Metadata. Infrastructure and compatibility: Infrastructure, Compatibility. Specialised surfaces: Mobile, RPA UiPath, Cutover.

AGLS Metadata conformance, Disaster Recovery as audit-driven RTO/RPO validation, and RPA bot testing are specialist surfaces that arise mostly in government and enterprise SI programme delivery, and rarely appear in product-side QA practices.

System · Integration · Regression · Automation · Performance / Load · Penetration Testing · Cyber Security · Infrastructure · Accessibility · Compatibility · Mobile · API · AGLS Metadata · Disaster Recovery · ETL · RPA (UiPath) · Operational Readiness · UAT · SIT · SVT · Cutover

The government and enterprise SI context explains several of the specialised surfaces. AGLS Metadata testing — conformance against the Australian Government Locator Service schema for federal metadata standards — is a scope that only arises in government programme delivery. Disaster Recovery test design was a planned, audit-driven discipline: scenario engineering, fail-over validation, RTO and RPO confirmation, cross-team coordination. ETL testing for data migration pipelines, RPA (UiPath) bot testing covering logic correctness, exception handling, and audit trail integrity, and SIT/SVT/Cutover testing as named delivery stages with explicit entry and exit criteria — these are vocabularies that product-side QA practices rarely exercise.
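As an illustration of what RTO and RPO confirmation involves, here is a minimal sketch in Python. The function name, timestamps, and targets are all hypothetical and invented for this example; it only shows the arithmetic of comparing one fail-over exercise against agreed targets, not how any particular programme recorded its evidence.

```python
from datetime import datetime, timedelta


def check_dr_exercise(failure_at, restored_at, last_recoverable_at,
                      rto_target, rpo_target):
    """Compare one fail-over exercise against its agreed RTO and RPO targets."""
    achieved_rto = restored_at - failure_at          # how long service was down
    achieved_rpo = failure_at - last_recoverable_at  # window of data at risk
    return {
        "achieved_rto": achieved_rto,
        "achieved_rpo": achieved_rpo,
        "rto_met": achieved_rto <= rto_target,
        "rpo_met": achieved_rpo <= rpo_target,
    }


# Hypothetical exercise: every value below is made up for illustration.
result = check_dr_exercise(
    failure_at=datetime(2020, 1, 1, 10, 0),
    restored_at=datetime(2020, 1, 1, 13, 30),
    last_recoverable_at=datetime(2020, 1, 1, 9, 45),
    rto_target=timedelta(hours=4),
    rpo_target=timedelta(minutes=15),
)
```

Recording the achieved values alongside the pass/fail flags is what makes the result usable as audit evidence rather than a bare tick.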

Framework architecture — reusable spine, per-engagement variance

The architectural problem I solved repeatedly across this portfolio was how to operate a coherent test framework across multiple concurrent customer engagements without building from scratch each time — and without forcing a single rigid framework onto programmes with genuinely different delivery contexts.

An inputs-process-outputs diagram showing the two-tier framework structure. The reusable spine on the left holds test runner configuration, reporting integration, CI and CD hook points, common utility libraries, and tagging conventions. Arrows flow into a central engagement onboarding zone covering variance-layer configuration including page objects, API helpers, test data models, and environment config. Outputs on the right show compounding return across engagements: engagement 1 is full framework standup, engagement 2 is faster, engagement 3 and beyond is variance config only.

The same architectural principle applied later in the AI-augmented framework engagements — a well-designed spine pays forward. The shape is the same; the technology is different.

The solution was a two-tier structure: a reusable framework spine covering the stable elements (test runner configuration, reporting integration, CI/CD hook points, common utility libraries, tagging and execution policy conventions) and a per-engagement variance layer covering the programme-specific surfaces (page objects, API helpers, test data models, environment config). Every new engagement inherited the spine; I built the variance layer during the test strategy phase and handed it to the programme team with a capability uplift component baked in.
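The two-tier idea can be sketched in a few lines of Python. Everything named here (SPINE_DEFAULTS, VARIANCE_KEYS, build_engagement_config, and the example settings) is hypothetical and chosen for illustration; it is not the framework's actual configuration. The sketch assumes one reading of "controlled variance": an engagement supplies only its variance-layer surfaces and cannot redefine spine-owned settings.

```python
from copy import deepcopy

# Hypothetical spine: stable elements shared by every engagement.
SPINE_DEFAULTS = {
    "runner": {"parallel_workers": 4, "retry_failed": 1},
    "reporting": {"format": "junit-xml"},
    "tags": ["smoke", "regression", "sit", "svt"],
}

# Surfaces each engagement is expected to provide (illustrative only).
VARIANCE_KEYS = {"page_objects", "api_helpers", "test_data_models", "environment"}


def build_engagement_config(name, variance):
    """Combine the shared spine with one engagement's variance layer."""
    unknown = set(variance) - VARIANCE_KEYS
    if unknown:
        # Reject attempts to redefine spine-owned settings per engagement.
        raise ValueError(f"not part of the variance layer: {sorted(unknown)}")
    missing = VARIANCE_KEYS - set(variance)
    if missing:
        raise ValueError(f"variance layer incomplete: {sorted(missing)}")
    config = deepcopy(SPINE_DEFAULTS)
    config["engagement"] = name
    config.update(deepcopy(variance))
    return config
```

Onboarding a new engagement then means writing one variance dictionary, which is the smaller surface the compounding return depends on.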

This pattern had a compounding return: the third engagement to join the framework was measurably cheaper to onboard than the first. Shared tooling was already stable; the CI integration was already documented; the reporting conventions were already in place. The onboarding cost shifted from framework standup to variance-layer configuration, a much smaller surface. It is the same architectural principle I applied later in the AI-augmented engagements — the investment in a well-designed spine pays forward. One of the concrete delivery programmes that exercised this architecture was the Selenium-to-Playwright migration described in Enterprise Selenium → Playwright Migration, where the spine-and-variance model determined how the new framework was structured and how each cohort was onboarded.

The framework also needed to survive mixed team composition. Programmes ran with onshore, offshore, and customer-side QA contributors at varying maturity levels. The spine had to be learnable by a junior tester on the offshore team without requiring senior intervention on every PR, and extensible by a senior automation engineer without workarounds. Abstraction discipline and clear onboarding documentation were not cosmetic — they were how the framework scaled without me becoming the single point of coupling.

Tender QE strategy and estimation

A meaningful portion of this role sat outside delivery entirely — in the bid and tender cycle. Winning large government and enterprise SI contracts requires a credible QE strategy as part of the tender response: scope articulation, delivery model, team structure, test phase costings, risk and contingency treatment, and a staffing model that survives scrutiny from the client's technical evaluators.

A four-stage phase timeline. Stage 1 Bid and tender response covers scope articulation, delivery model design, team structure, and risk and contingency treatment — before the system exists. Stage 2 Estimation under ambiguity covers workload modelling, test phase costings, staffing model construction, and risk provision sizing, driven by pattern library depth. Stage 3 Bid review defence, highlighted in orange, covers procurement panels, technical evaluators, cost rationale defence, and phasing justification — described as defence with evidence, not negotiation. Stage 4 Post-delivery calibration covers actual versus estimate delta, pattern library update, and accuracy improvement for the next bid.

Defending a staffing model under bid review is one of the higher-stakes versions of QA communication — you are explaining why the test strategy costs what it costs, not negotiating the number down.

I wrote the QE strategy sections for multiple tender submissions across this period. The discipline this builds is different from delivery-side QA: you are committing to an approach before you have seen the actual system, the actual team, or the actual constraints. Estimation accuracy under that ambiguity is a function of pattern library depth — how many programmes of similar shape have you seen, and what did the testing actually cost versus what the early estimate predicted? Six years of concurrent programmes built a calibrated estimation model that delivery-phase QA work alone cannot develop.
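One simple way to picture estimation calibration is a ratio of actual to estimated effort taken across past programmes and applied to the next raw estimate. The sketch below is an assumption about how such a loop could work, not a description of the model used on these tenders; the function names and the history figures are invented.

```python
from statistics import median


def calibration_factor(history):
    """Median ratio of actual to estimated effort across past programmes.

    history is a list of (estimated_days, actual_days) pairs.
    """
    ratios = [actual / estimate for estimate, actual in history if estimate > 0]
    if not ratios:
        raise ValueError("need at least one past programme with a non-zero estimate")
    return median(ratios)


def calibrated_estimate(raw_estimate_days, history):
    """Scale a raw estimate by how far past estimates drifted from actuals."""
    return raw_estimate_days * calibration_factor(history)


# Made-up history for illustration: (estimated days, actual days).
history = [(100, 120), (200, 230), (80, 100)]
next_bid_days = calibrated_estimate(150, history)
```

The median is used rather than the mean so that one badly overrun programme does not dominate the factor; with only a handful of programmes in the history, that choice matters.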

Defending the staffing model under bid review — in front of a client's procurement and technical panel — requires a different kind of precision. You need to explain why the test strategy costs what it costs, why the phasing is what it is, and why the risk provisions are sized as they are. It is not a negotiation about lowering the number; it is a defence of a position with evidence. That stakeholder dynamic is one of the higher-stakes versions of QA communication, and this engagement gave me sustained exposure to it.

Capability uplift across mixed-maturity teams

Cross-project QA process health monitoring was part of the Technical Test Manager mandate, not an add-on. Programmes ran at different maturity levels; the weakest teams created the greatest business risk and consumed disproportionate escalation time. The architectural response was structured: define a maturity baseline for each programme, identify the specific gaps, design a targeted uplift path, and execute via paired sessions and daily cadence rather than periodic review.

Two horizontal stacked bars comparing dominant QA maturity gap categories before and after structured uplift. Before uplift the bar shows test design vocabulary at 45 percent, defect reporting quality at 30 percent, and offshore escalation behaviour at 25 percent. After structured uplift via paired sessions and daily cadence, all three gaps are reduced: test design at 15 percent, defect reporting at 10 percent, escalation at 10 percent, with 65 percent of the bar now representing raised quality floor capacity. A callout notes the improvement is visible in aggregate metrics across programmes after six years.

The most common gaps were not technical — they were test design vocabulary and defect report quality. Fixing them required structured investment, not supervision. The improvement shows in aggregate delivery metrics, not individual output.

The most common gaps were not technical. Junior testers in offshore configurations often lacked the test design vocabulary to write coverage-complete test cases from specifications. Fixing that required working at the level of test design discipline — how to identify equivalence classes, how to structure a test case to be independently executable, how to write a defect report that a developer can action without a verbal conversation. These are learnable skills, but they require structured investment, not supervision.
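To make "identify equivalence classes" concrete, here is a small Python sketch. The rule under test (eligibility for ages 18 to 65 inclusive) is a hypothetical specification invented for the example, and the boundary values are added alongside the class representatives because the two techniques are usually taught together.

```python
def is_eligible(age):
    """Hypothetical rule under test: ages 18 to 65 inclusive are eligible."""
    return 18 <= age <= 65


# One representative per equivalence class, plus the boundary values.
# Each tuple is (label, input, expected result).
CASES = [
    ("below range", 10, False),
    ("lower boundary - 1", 17, False),
    ("lower boundary", 18, True),
    ("inside range", 40, True),
    ("upper boundary", 65, True),
    ("upper boundary + 1", 66, False),
    ("above range", 90, False),
]


def run_cases(rule, cases):
    """Return the labels of failing cases; no case depends on another."""
    return [label for label, value, expected in cases if rule(value) != expected]
```

Seven cases cover a rule with dozens of possible inputs, and a failure names the class or boundary that broke, which is also what makes the resulting defect report actionable without a conversation.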

The onshore-offshore coordination model added a layer of complexity: time zone constraints, communication latency, and the tendency for offshore teams to absorb test process debt silently rather than escalating it. I built reporting structures that made the debt visible — test case quality metrics, defect escape rates, coverage gap summaries — so the signal came to me rather than waiting for a delivery miss to surface the problem.
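As a sketch of how one of those signals might be computed, the Python below uses a common definition of defect escape rate: defects found after release divided by all defects found. The function names, the 10 percent threshold, and the programme counts are assumptions for illustration, not the reporting structure actually used.

```python
def defect_escape_rate(found_in_test, found_after_release):
    """Share of all known defects that were only found after release."""
    total = found_in_test + found_after_release
    if total == 0:
        return 0.0
    return found_after_release / total


def flag_programmes(counts, threshold=0.10):
    """Return programme names whose escape rate exceeds the threshold, worst first.

    counts maps a programme name to (found_in_test, found_after_release).
    """
    rates = {name: defect_escape_rate(t, r) for name, (t, r) in counts.items()}
    flagged = [name for name, rate in rates.items() if rate > threshold]
    return sorted(flagged, key=lambda name: rates[name], reverse=True)


# Invented counts for three hypothetical programmes.
counts = {"A": (90, 10), "B": (70, 30), "C": (95, 5)}
needs_attention = flag_programmes(counts)
```

Run on a regular cadence, a report like this surfaces a drifting programme before a delivery miss does.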

Mentoring across maturity levels in this context is distinct from mentoring within a single-team delivery: you are raising a capability floor across the organisation, not just developing one engineer. The improvement shows in aggregate quality metrics across programmes, not in the output of a specific individual. That is a longer feedback loop, and it requires sustained commitment rather than sprint-based intervention.

Non-functional and specialised test disciplines

Performance testing across this portfolio covered both load testing with JMeter and LoadRunner and the strategy layer: workload modelling, scenario design, SLA definition, and the client conversation about what constitutes a performance pass or fail before a test tool is opened. The same strategy-first approach shaped the Performance Test Strategy for a SaaS Launch, a later programme that applied this portfolio's performance discipline to a greenfield product launch. Government programmes with compliance requirements needed performance criteria defined in advance and documented formally, not derived retrospectively from what the system happened to achieve.
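The principle of fixing pass/fail criteria before a run can be sketched as follows, in Python. The criteria values, the CRITERIA name, and the choice of a nearest-rank 95th percentile are assumptions for illustration; real programmes agree their own measures, and tools such as JMeter and LoadRunner produce their own percentile reports.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical criteria, agreed and documented before any test run.
CRITERIA = {"p95_ms": 800, "max_error_rate": 0.01}


def evaluate(response_times_ms, errors, total_requests, criteria=CRITERIA):
    """Judge one run against criteria that were fixed in advance."""
    p95 = percentile(response_times_ms, 95)
    error_rate = errors / total_requests
    passed = p95 <= criteria["p95_ms"] and error_rate <= criteria["max_error_rate"]
    return {"p95_ms": p95, "error_rate": error_rate, "passed": passed}
```

Because CRITERIA exists before the first sample is collected, the verdict cannot be adjusted to fit whatever the system happened to achieve.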

Penetration testing and cyber security scope meant I was coordinating with security-specialised teams rather than running the testing myself, but the test strategy and reporting integration were mine: scheduling, scope definition, evidence packages for compliance reporting, and the risk treatment for findings. Understanding the security test surface — what pen testers look for, what findings mean for the application under test — was necessary for integration, not optional for a Technical Test Manager on government programmes.

Accessibility testing to WCAG standards was a government programme requirement, not an elective. Compatibility testing across browser and device matrices had to be planned at the start of each programme, not retrofitted after completion.

The operational readiness and cutover phases are where six years of repeated delivery cycles proved most valuable. Most testing practices treat go-live as the end of the engagement. Operational readiness is a discipline of its own: verifying that support runbooks are accurate, that the operations team can reproduce the test scenarios documented in the transition pack, and that the monitoring and alerting configuration has been validated before the first production incident rather than during it. Cutover testing (the actual go-live event, with data migration, service switchover, and rollback rehearsal) requires a test approach designed around failure scenarios, not just happy-path confirmation. Seeing multiple cutover events across a six-year portfolio is the fastest way to develop the scenario engineering instinct that prevents cutover failures.

What this period built

The six years at this organisation built three things that the later, more specialised engagements drew on directly.

Three cards describing compounding outcomes. Outcome 1 Tender and estimation discipline: a calibrated model for scoping test effort under ambiguity before the system, team, or constraints are known, and the communication skills to defend it under bid-review pressure. Outcome 2 Multi-customer framework architecture, highlighted as the central card: the reusable-spine per-engagement-variance pattern that makes concurrent delivery tractable, described as the structural precursor to the AI-augmented framework engagements. Outcome 3 Capability uplift across mixed-maturity teams: understanding that QA quality is a floor raised across an organisation, not a ceiling achieved by one senior architect, requiring aggregate metrics and a longer feedback loop.

The framework architecture outcome is the load-bearing one — the reusable-spine pattern is the structural precursor to every subsequent engagement. The other two made it defensible and scalable.

The first was tender and estimation discipline — a calibrated model for scoping test effort under ambiguity, and the communication skills to defend it under bid-review pressure.

The second was multi-customer framework architecture — the reusable-spine, per-engagement-variance pattern that makes concurrent engagement delivery tractable. This pattern is the structural precursor to the AI-augmented framework work in the later engagements: the shape is the same, the technology is different. The patterns crystallised during this period fed directly into the AgentQE Continuum, the flagship R&D project that takes those framework conventions and extends them into an event-driven, LLM-assisted CI pipeline.

The third was capability uplift across mixed-maturity teams — the understanding that QA quality is a floor raised across an organisation, not a ceiling achieved by one senior architect. Building that floor requires measurement, structured mentoring, and enough time to see the improvement propagate through delivery outcomes. Six years was long enough.

Engagement summary

Duration: Jan 2017 – Feb 2023 · 6 years
Role: Technical Test Manager / Technical Test Lead
Reporting line: Engineering / Delivery leadership
Team: Mixed onshore, offshore, and customer-side QA; multiple concurrent programmes

Reference

Reference from delivery leadership available on request at screen stage.
