AI-Ops & Cloud Platform Engineer
- Tel Aviv
- Technology consulting
At KPMG Israel, our Generative AI delivery team stands at the forefront of technological innovation, representing the most advanced AI capabilities in the Israeli market. Our world-class specialists harness the full spectrum of cutting-edge AI technologies – from large language models and multimodal AI architectures to autonomous agents and advanced machine learning frameworks – to architect transformative solutions that redefine business possibilities. Working with the latest generative AI tools, cloud-native architectures, and emerging AI platforms, our team doesn't just implement technology; we pioneer new paradigms of intelligent automation and decision-making. Every project we undertake pushes the boundaries of what's possible, combining deep technical expertise with strategic business acumen to deliver solutions that don't just meet today's challenges but anticipate tomorrow's opportunities. Join us in shaping the future of AI-driven business transformation, where innovation meets excellence and cutting-edge technology creates unprecedented value for our clients across industries.
About the job
As an AI-Ops & Cloud Platform Engineer at KPMG Israel, you will build intelligent systems that autonomously monitor, predict, and optimize business-critical infrastructure. You'll architect AI-driven operational platforms that turn reactive IT management into proactive, self-healing systems, helping organizations improve reliability, efficiency, and performance at scale.
What You’ll Do
· Monitor and optimize LLM application performance (latency, token usage, drift, failures)
· Automate anomaly detection and remediation using Python and ML-based tooling
· Design and manage cloud infrastructure (AWS, Azure, or GCP) using Terraform
· Build dashboards, alerts, and predictive models to ensure system reliability
· Ensure infrastructure is scalable, secure, and cost-effective
Requirements
· 3+ years of experience in DevOps, SRE, or Cloud Engineering
· Proficient in at least one major cloud provider: AWS, Azure, or GCP
· Hands-on experience with Terraform and Python automation
· Proven ability to design and implement cloud-native architectures
· Experience building secure Landing Zones that follow network and security best practices
· Experience with monitoring tools such as Prometheus, Datadog, or ELK
· Comfortable with Kubernetes, Docker, and serverless infrastructure
· CI/CD experience using Azure DevOps, GitHub Actions, or GitLab
Bonus Points
· Experience with LLMOps and vector databases (e.g., Pinecone, Weaviate)
· Background in anomaly detection or AI/ML-based alerting systems
· Knowledge of FinOps practices and cloud cost optimization