Software Engineer · AI Systems

Murad Al-Balushi

I build the control, evaluation, and safety layers that make LLM systems reliable enough for production.

Replaced subjective model evaluation with execution-based validation
Enforced LLM spend limits in agent workflows
Constrained model outputs to computed, verifiable data, curbing hallucination
Built compliance-ready fintech infrastructure — zero security incidents

Professional Experience

360Remit

Software Developer

Muscat, Oman

Jan 2025 – Mar 2026

  • Owned end-to-end vulnerability assessment and penetration testing (VAPT) for a regulated fintech platform as the risk authority between vendors and engineering; validated findings, cut false positives by 40%+, closed 100% of critical issues pre-launch, with zero security incidents at go-live.
  • Engineered vendor synchronization pipeline for 500k+ records (delta detection, conflict resolution, bidirectional sync), cutting manual processing from 3–5 days to under 5 minutes with 100% DB integrity.
  • Delivered MTO, eKYC, and AML integrations and designed phased infra (DR, capacity, data residency), enabling platform launch within regulatory deadlines while unblocking user-facing onboarding flows.
  • Built SQL/Python humanization pipeline converting 500k+ vendor records to presentation-ready data, eliminating manual cleaning and cutting prep time 95%+.
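The delta-detection step of a sync pipeline like the one above can be sketched roughly as follows; the record shapes, function names, and hashing choice here are illustrative, not the production implementation:

```python
import hashlib
import json

def record_fingerprint(record: dict) -> str:
    """Stable content hash of a record, independent of key order."""
    canonical = json.dumps(record, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode()).hexdigest()

def detect_deltas(local: dict[str, dict], remote: dict[str, dict]):
    """Compare two snapshots keyed by record ID and classify changes,
    so only the delta (not all 500k records) needs processing."""
    created = [rid for rid in remote if rid not in local]
    deleted = [rid for rid in local if rid not in remote]
    updated = [
        rid for rid in remote
        if rid in local
        and record_fingerprint(remote[rid]) != record_fingerprint(local[rid])
    ]
    return created, updated, deleted
```

Fingerprinting each side lets the pipeline skip unchanged records entirely, which is where most of the 3–5-day manual effort would otherwise go.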

Highlight Projects

A selection of projects I'm particularly proud of

Code Arbiter Architecture: Coding task → LLM provider (OpenAI / Anthropic / LM Studio) → generated code → isolated Docker sandbox (no network, memory capped) → pytest execution → failure classification → HTML comparison report

AI Code Generation Evaluation Engine (Code Arbiter)

Execution-based benchmarking — run the code, classify the failure

Replaced subjective LLM code review with deterministic execution-based validation. Runs generated code in isolated Docker sandboxes, classifies failures across syntax, runtime, logic, and temporal reasoning, and benchmarks multiple models under identical conditions.
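A minimal sketch of the sandbox-and-classify loop, assuming the `docker` CLI is on PATH; the image name, resource limits, and classification rules are illustrative, not Code Arbiter's actual code:

```python
import subprocess
import tempfile
from pathlib import Path

def classify(returncode: int, stderr: str) -> str:
    """Map an execution result onto a failure taxonomy; logic and
    temporal-reasoning failures are caught later by pytest assertions."""
    if returncode == 0:
        return "pass"
    if "SyntaxError" in stderr:
        return "syntax_error"
    return "runtime_error"

def run_in_sandbox(code: str, timeout: int = 30) -> dict:
    """Execute untrusted generated code in a locked-down container."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "solution.py").write_text(code)
        try:
            proc = subprocess.run(
                ["docker", "run", "--rm",
                 "--network", "none",       # no network access
                 "--memory", "256m",        # hard memory cap
                 "-v", f"{tmp}:/work:ro",   # code mounted read-only
                 "python:3.12-slim", "python", "/work/solution.py"],
                capture_output=True, text=True, timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return {"status": "timeout"}
    return {"status": classify(proc.returncode, proc.stderr),
            "stdout": proc.stdout, "stderr": proc.stderr}
```

Because every model's output runs under the same container limits, the resulting pass/fail comparison is deterministic rather than a matter of reviewer judgment.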

Python · Docker · OpenAI API · Anthropic API · LM Studio
[Screenshot: CostPlan – LLM Cost Enforcement Proxy]

CostPlan – LLM Cost Enforcement Proxy

Open-source circuit breaker for autonomous agent API spend

Built an open-source transparent proxy that enforces per-call and per-session budget limits on LLM API calls, with cache-aware pricing and zero-latency SSE streaming — preventing unbounded spend in autonomous agent workflows.
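The enforcement idea can be sketched as a small cache-aware budget guard; the class and parameter names are hypothetical, and real per-token prices would come from the provider's price sheet:

```python
class BudgetExceeded(RuntimeError):
    """Raised when a call would push spend past a configured limit."""

class BudgetGuard:
    """Circuit breaker tracking cumulative LLM spend for one session."""

    def __init__(self, per_call_limit: float, session_limit: float):
        self.per_call_limit = per_call_limit
        self.session_limit = session_limit
        self.spent = 0.0

    def charge(self, input_tokens: int, output_tokens: int,
               in_price: float, out_price: float,
               cached_tokens: int = 0, cache_price: float = 0.0) -> float:
        """Price one call (cache reads billed at the cheaper rate) and
        trip the breaker before any limit is crossed."""
        cost = ((input_tokens - cached_tokens) * in_price
                + cached_tokens * cache_price
                + output_tokens * out_price)
        if cost > self.per_call_limit:
            raise BudgetExceeded(f"call cost ${cost:.4f} over per-call limit")
        if self.spent + cost > self.session_limit:
            raise BudgetExceeded(f"session limit ${self.session_limit} would be exceeded")
        self.spent += cost
        return cost
```

In the proxy setting, a guard like this sits in front of the upstream API call, so an agent loop that runs away simply starts receiving errors instead of accruing spend.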

Python · Anthropic API · SSE Streaming · asyncio · HTTP Proxy
Autonomous Support Agent Architecture: Help Scout polling → Intent classification → Deterministic escalation or response generation → Stripe MCP (read-only) → RAG for product knowledge → Help Scout posting

Production AI Support Agent (Guardrail-First)

Risk-aware LLM-powered support agent reducing customer support load

Deployed a guardrail-first AI support agent handling live customer tickets with Stripe-backed context and deterministic escalation logic, designed to fail safely under uncertainty in a production SaaS environment.
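A deterministic escalation gate of this kind might look roughly like the sketch below; the intent labels and confidence threshold are placeholders, not the production values:

```python
from dataclasses import dataclass

# Intents that always go to a human, regardless of model confidence.
ESCALATE_INTENTS = {"refund_request", "chargeback", "legal", "account_deletion"}
CONFIDENCE_FLOOR = 0.8  # below this, fail safe and hand off

@dataclass
class Decision:
    action: str   # "respond" or "escalate"
    reason: str

def route(intent: str, confidence: float) -> Decision:
    """Deterministic gate: the LLM only drafts replies for low-risk,
    high-confidence intents; anything uncertain or risky escalates."""
    if intent in ESCALATE_INTENTS:
        return Decision("escalate", f"high-risk intent: {intent}")
    if confidence < CONFIDENCE_FLOOR:
        return Decision("escalate", f"low classifier confidence ({confidence:.2f})")
    return Decision("respond", "safe intent with high confidence")
```

Keeping this gate in plain code rather than in the prompt is what makes the agent's failure mode predictable: uncertainty routes to a human, never to a generated answer.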

Python · LLM Systems · Help Scout API · Stripe MCP · RAG
[Screenshot: FinAI Portfolio Analysis System interface]

FinAI – AI Portfolio Analysis & Decision Support System

Compute-first financial analysis engine with constrained LLM interpretation

Built a compute-first financial analysis engine combining deterministic portfolio metrics with constrained LLM interpretation to deliver grounded, non-speculative decision support.
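The compute-first split can be illustrated in a few lines: all figures come from deterministic code, and the model only ever sees, and is constrained to, those precomputed metrics. Function names and prompt wording here are illustrative:

```python
import statistics

def portfolio_metrics(daily_returns: list[float]) -> dict:
    """Every number comes from deterministic computation, never the model."""
    mean = statistics.mean(daily_returns)
    vol = statistics.stdev(daily_returns)
    # Annualized Sharpe ratio, assuming a zero risk-free rate.
    sharpe = (mean / vol) * (252 ** 0.5) if vol else 0.0
    return {"mean_daily_return": mean, "volatility": vol,
            "sharpe": round(sharpe, 2)}

def interpretation_prompt(metrics: dict) -> str:
    """The model may explain the figures but is explicitly barred from
    producing numbers or forecasts of its own."""
    return (
        "Interpret these precomputed portfolio metrics. Do not invent "
        "numbers, do not predict future prices, and flag any claim you "
        f"cannot ground in the data provided: {metrics}"
    )
```

Because the LLM never computes anything, its output stays interpretive: the worst hallucination it can produce is a bad explanation of a correct number, not a wrong number.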

Python · Financial Analysis · LLM Systems · Portfolio Analytics

Technical Skills

Technologies I use to build, ship, and evaluate production systems

Languages

Python · TypeScript · JavaScript · SQL

AI & LLM

OpenAI · Anthropic · Gemini · RAG · Guardrails · Whisper · Intent Classification

Backend

FastAPI · Node.js · Express · Next.js · React

Infrastructure

Docker · AWS · GCP · CI/CD · Nginx · Vercel

Databases

PostgreSQL · MySQL · MongoDB · Redis · SQLite · FAISS

Tools & Integrations

Git · Pytest · Stripe MCP · Help Scout API · Discord API