Our client reserves the right not to make an appointment. In considering candidates for appointment into advertised posts, preference will be accorded to persons from a designated group in accordance with the approved Employment Equity Plan.

Senior AI Engineer (SSIT040)

Overview

Reference
SSIT040

Salary
ZAR/month

Job Location
- South Africa -- Johannesburg Metro -- Bryanston -- Woodmead

Job Type
Permanent

Posted
10 February 2026

Closing date
27 February 2026, 23:59


Job Title: Senior AI Engineer (Applied AI and RAG Systems)

 

Location: Woodmead, Sandton, South Africa

 

Primary Purpose (Role):

We are seeking a talented Senior AI Engineer to join our dynamic team, where they will design and implement applied AI solutions within a collaborative, team-based delivery model. The successful candidate will ensure that systems are thoroughly documented, reproducible, and easily transferable to other AI and software engineers.

 

This role is integral to digitalisation and innovation initiatives, with a key focus on enabling and uplifting colleagues, reducing reliance on individual expertise, and supporting the Digital Factory’s transition to scalable, shared AI capabilities. The Senior AI Engineer will work collaboratively with the Digital Factory and IT Architecture teams, sharing ownership of AI solutions and architectural decisions.

 

Key Performance Areas (Responsibilities):

RAG Architecture and Design

  • Design and contribute to end-to-end Retrieval-Augmented Generation (RAG) architectures, including ingestion, indexing, retrieval, reranking, and generation layers.
  • Make informed trade-offs between latency, accuracy, cost, and maintainability.

Ingestion and Indexing

  • Develop ingestion workers for structured and unstructured data (PDFs, Office documents, HTML, APIs, and databases).
  • Implement chunking, metadata enrichment, embedding generation, and versioned indexing.
  • Manage re-ingestion, incremental updates, and data freshness.
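By way of illustration, the chunking and metadata-enrichment step described above might be sketched as follows. The chunk size, overlap, and metadata fields are placeholder assumptions, not a prescribed design:

```python
# Illustrative sketch: fixed-size chunking with overlap, plus simple
# metadata enrichment for versioned indexing. Sizes are placeholders.
def chunk_text(text, chunk_size=500, overlap=100):
    """Split text into overlapping chunks ready for embedding."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, max(len(text), 1), step):
        piece = text[start:start + chunk_size]
        if piece:
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break
    return chunks

def enrich(chunks, source, version):
    """Attach metadata used later for filtering and re-ingestion."""
    return [
        {"text": c, "source": source, "version": version, "chunk_id": i}
        for i, c in enumerate(chunks)
    ]
```

Carrying a `version` field on every chunk is one way to support the incremental updates and data-freshness management mentioned above: stale chunks can be filtered or replaced by version rather than re-ingesting everything.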

Retrieval and Query Pipelines

  • Create query workers to perform retrieval, filtering, reranking, and context assembly.
  • Implement hybrid search techniques (vector and keyword), metadata filtering, and scoring strategies.
  • Integrate Large Language Models (LLMs) for reasoning, summarisation, and tool usage.
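As an example of the hybrid search and scoring strategies referred to above, the rankings from a vector index and a keyword index can be merged with Reciprocal Rank Fusion. The document IDs and the constant `k=60` below are illustrative placeholders; in practice the two rankings would come from a vector database and a keyword engine:

```python
# Illustrative sketch of hybrid retrieval: fuse a vector-search ranking
# and a keyword-search ranking with Reciprocal Rank Fusion (RRF).
def reciprocal_rank_fusion(rankings, k=60):
    """Combine several ranked lists of document IDs into one fused list."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["doc3", "doc1", "doc7"]   # hypothetical vector-search order
keyword_hits = ["doc1", "doc9", "doc3"]  # hypothetical keyword-search order
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Documents that rank well in both lists rise to the top of the fused ranking, which is why RRF is a common baseline before a dedicated reranker is applied.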

LLM Integration

  • Integrate and evaluate commercial and open-source LLMs (e.g. OpenAI).
  • Implement prompt templates, function/tool calling, and structured outputs.
  • Apply guardrails, grounding, and citation strategies.
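The grounding and citation strategies mentioned above can be as simple as numbering retrieved passages inside the prompt so the model cites them explicitly. The template wording below is an assumption for illustration, not a prescribed house style:

```python
# Illustrative sketch of a grounded prompt template with citation markers.
def build_grounded_prompt(question, passages):
    """Number each retrieved passage so the model can cite [1], [2], ..."""
    context = "\n".join(
        f"[{i}] {p['text']} (source: {p['source']})"
        for i, p in enumerate(passages, start=1)
    )
    return (
        "Answer the question using ONLY the passages below. "
        "Cite supporting passages as [n]. If the answer is not in the "
        "passages, say you do not know.\n\n"
        f"Passages:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

prompt = build_grounded_prompt(
    "What is the notice period?",
    [{"text": "Notice period is 30 days.", "source": "hr_policy.pdf"}],
)
```

Instructing the model to decline when the answer is absent from the passages is a basic guardrail against hallucination; the citation markers also make generated answers auditable back to their sources.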

Infrastructure and Operations

  • Build scalable services using queues, workers, and asynchronous processing.
  • Work with vector databases (e.g. Pinecone, Milvus).
  • Implement logging, monitoring, evaluation, and cost controls for LLM workloads.
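The queue-and-worker pattern above might look like the following minimal sketch. In production this would typically run on a broker such as Celery with a real message queue; here an in-process `asyncio.Queue` stands in, and the "processing" step is a placeholder:

```python
# Illustrative sketch of queue-based asynchronous processing with asyncio.
import asyncio

async def worker(name, queue, results):
    """Pull jobs off the queue until cancelled."""
    while True:
        job = await queue.get()
        results.append((name, job.upper()))  # placeholder for real work
        queue.task_done()

async def main(jobs, n_workers=2):
    queue = asyncio.Queue()
    results = []
    workers = [
        asyncio.create_task(worker(f"w{i}", queue, results))
        for i in range(n_workers)
    ]
    for job in jobs:
        queue.put_nowait(job)
    await queue.join()  # block until every queued job is processed
    for w in workers:
        w.cancel()
    return results

processed = asyncio.run(main(["ingest", "embed", "index"]))
```

Decoupling producers from workers in this way is what lets ingestion and query pipelines scale independently of the services that enqueue work.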

Quality and Evaluation

  • Define and execute evaluation strategies for retrieval quality and answer accuracy.
  • Debug hallucinations, retrieval failures, and latency bottlenecks.
  • Continuously enhance system accuracy and reliability.
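One common building block for the evaluation strategies above is recall@k over a small hand-labelled evaluation set. The queries and relevance labels below are invented purely for illustration:

```python
# Illustrative sketch of a retrieval-quality metric: recall@k.
def recall_at_k(retrieved, relevant, k):
    """Fraction of relevant documents appearing in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)

# Hypothetical evaluation set: query -> (retrieved ranking, relevant docs)
eval_set = {
    "leave policy": (["d2", "d5", "d1"], ["d1", "d2"]),
    "expense claims": (["d9", "d4", "d7"], ["d4"]),
}
scores = {q: recall_at_k(r, rel, k=2) for q, (r, rel) in eval_set.items()}
```

Tracking such scores per query over time is one way to distinguish retrieval failures from generation failures when debugging hallucinations or inaccurate answers.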

Stakeholder Engagement and Collaboration

  • Engage with stakeholders to understand business needs and translate them into technical requirements.
  • Collaborate with product owners to align AI-enabled solutions with business processes.
  • Contribute across the full systems lifecycle, including design, coding, testing, implementation, maintenance, and support of application software.
  • Follow Agile methodologies to deliver fit-for-purpose solutions on time and within budget.

Knowledge Sharing and Capability Enablement

  • Ensure all AI systems are thoroughly documented, reproducible, and understandable by other engineers.
  • Conduct regular technical walkthroughs and demonstrations for team members.
  • Actively transfer knowledge through pairing, code reviews, and written artefacts.
  • Design solutions that avoid dependency on a single individual for system support or evolution.

 

Required Skills and Experience:

  • Strong software engineering background with Python.
  • Experience building production APIs and background workers.
  • Hands-on experience with RAG systems beyond the prototype stage.
  • Deep understanding of embeddings, vector search, and retrieval strategies.
  • Experience with chunking, reranking, and context-window optimisation.
  • Familiarity with LLM orchestration frameworks (e.g. LlamaIndex) or equivalent custom implementations.
  • Experience with vector databases and traditional data stores.
  • Knowledge of message queues and asynchronous processing (e.g. Celery).
  • Familiarity with Microsoft Azure DevOps for version control.
  • Experience with document parsing and OCR pipelines.
  • Systems thinking mindset, not limited to specific tools.
  • Contribution to AI architecture and coding standards, championing their adoption.
  • Ability to define evaluation metrics and ensure reproducibility of solutions.
  • Collaborative review of AI architectures to ensure alignment with Digital Factory standards.
  • Identification and escalation of technical and delivery risks, collaborating with leadership to mitigate them.
  • Mentoring of intermediate AI developers and team members through structured pairing, code reviews, and hands-on support, fostering independent growth in capabilities.

 

Personal Qualities:

  • Takes ownership and accountability for deliverables within a team-based delivery model.
  • Engages and cooperates effectively with line management and colleagues to achieve operational and team goals.
  • Demonstrates meticulous attention to detail.
  • Possesses sound skills in planning, organising, analytical thinking, time management, creativity, and innovation.
  • Understands customer needs and requirements.
  • Proactively seeks opportunities for personal growth and capability improvement.
  • Willing to share knowledge openly with colleagues.

 

Measures of Success:

  • AI solutions delivered with shared ownership and clear handover artefacts.
  • At least two team members capable of supporting and evolving each AI system.
  • Clear documentation, evaluation reports, and operational runbooks in place.
  • Demonstrable uplift in AI capability across the Digital Factory.

 

Qualifications (Essential):

  • Matriculation certificate.
  • Bachelor’s degree (B.Sc.) in Computer Science, Artificial Intelligence, Machine Learning, or a related discipline.

 

Experience (Essential):

  • Proven experience delivering RAG systems into production, including ownership of reliability, evaluation, and lifecycle management – minimum 3 years.
  • Software development with Python and asynchronous programming of background workers – minimum 3 years.
  • Experience with relational databases (SQL Server, Postgres) – minimum 3 years.
  • Familiarity with Microsoft Azure DevOps and other CI/CD tools – minimum 3 years.
  • Development of REST or RPC APIs (FastAPI) – minimum 3 years.
  • Experience with commercial LLM APIs (OpenAI, Anthropic, etc.) and embedding models – minimum 3 years.
  • Experience using vector databases (Milvus, Pinecone) – minimum 3 years.
  • Containerisation with Docker – minimum 3 years.
  • Advanced knowledge of systems and data security – minimum 3 years.
  • Proficiency in programming languages and frameworks relevant to solution development – minimum 3 years.
  • Experience with Redis – minimum 2 years.
  • Experience with MinIO – minimum 2 years.

 

Knowledge Areas:

  • Strong Python experience in building production services and background workers.
  • Hands-on experience with LLM APIs and embedding models.
  • Experience with at least one vector database (e.g. Pinecone, Milvus).
  • Proven ability in building document ingestion and retrieval pipelines.
  • Familiarity with RAG frameworks (LangChain, LlamaIndex, or equivalent custom implementations).
  • Experience working with relational and document databases.
  • Expertise in hybrid retrieval, reranking, and relevance tuning.
  • Experience evaluating and debugging RAG systems.
  • Knowledge of message queues, asynchronous processing, and containerised deployments.


Contact information

Yolanda Swart