Seoul, South Korea · AI/ML Infrastructure

Software Engineer building production AI infrastructure.

I design and operate Kubernetes-based AI/ML inference pipelines, GPU model serving systems, Redis queue workflows, CI/CD automation, and observability stacks for production medical AI workloads.

01

AI Serving Platform

Redis Queue ingestion, GPU worker execution, Kubernetes scheduling, and result upload workflows.

02

Event-Driven Autoscaling

KEDA-based scale-to-zero model workers to optimize GPU utilization and reduce idle cloud spend.

03

Production Operations

Prometheus, Grafana, cAdvisor, Node Exporter, Jenkins, ArgoCD, Helm, Docker, and Linux operations.

Experience

Current professional work

Software Engineer

JLK · Publicly traded medical AI company specializing in stroke analysis

Jun 2024 — May 2026
  • Designed and operated a production-grade AI inference pipeline serving 15 medical AI models on Kubernetes.
  • Replaced always-on cloud containers with a scale-to-zero architecture, reducing monthly cloud cost by roughly 7M KRW.
  • Configured KEDA to scale model workers from dedicated Redis queues only when requests arrive.
  • Used Kubernetes PriorityClasses to prevent GPU starvation across default and scale-out workloads.
  • Cut cold starts caused by large CUDA images by triggering cluster-wide image pre-pulls through Jenkins-driven DaemonSet automation.
  • Built production visibility with Prometheus, Grafana, Node Exporter, and cAdvisor.
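The queue-driven scaling described above can be illustrated with a KEDA ScaledObject that watches a per-model Redis list and scales the worker to zero when the list is empty. This is a minimal sketch, not the production manifest; the Deployment name, Redis address, and queue name are placeholders:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: model-worker-scaler          # hypothetical name
spec:
  scaleTargetRef:
    name: model-worker               # hypothetical worker Deployment
  minReplicaCount: 0                 # scale to zero when the queue is empty
  maxReplicaCount: 4
  cooldownPeriod: 300                # seconds to wait before scaling back to zero
  triggers:
    - type: redis
      metadata:
        address: redis.default.svc.cluster.local:6379  # assumed Redis service
        listName: model-a-requests                     # dedicated per-model queue
        listLength: "1"                                # one pending job wakes a worker
```

With `minReplicaCount: 0`, each model's GPU pods exist only while its queue has work, which is what removes the idle cloud spend.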

Selected Projects

Recent infrastructure projects

Older school projects were removed so the portfolio focuses on current professional impact.

Production AI Platform · 2024 — 2026

Kubernetes-based AI/ML inference pipeline

Built an infrastructure-agnostic model serving pipeline for medical AI workloads, covering request ingestion into Redis queues, GPU worker execution, result upload, and monitoring.

Kubernetes · KEDA · Redis · Docker · GPU Inference
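A minimal sketch of the worker side of this flow. The inference and upload steps are injected as hypothetical callables so the queue-handling logic stands on its own; field names in the job payload are illustrative:

```python
import json


def handle_job(payload: bytes, run_model, upload) -> dict:
    """Decode one queued request, run inference, and upload the result.

    `run_model` and `upload` are injected (hypothetical interfaces) so the
    worker logic stays testable without a live GPU or object store.
    """
    job = json.loads(payload)
    result = run_model(job["input_uri"])   # GPU inference step
    upload(job["result_uri"], result)      # push the result to storage
    return {"job_id": job["job_id"], "status": "done"}


def worker_loop(redis_client, queue: str, run_model, upload) -> None:
    """Block on a per-model Redis list and process jobs as they arrive."""
    while True:
        _, payload = redis_client.brpop(queue)  # BRPOP blocks until a job exists
        handle_job(payload, run_model, upload)
```

Pairing one dedicated list per model with a worker loop like this is what lets KEDA scale each model independently.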
Data Migration · Jan — Feb 2025

AWS S3 migration for medical imaging data

Migrated 250 million DICOM objects (60 TB) from Ncloud Object Storage to AWS S3, using Terraform, AWS DataSync, boto3 orchestration, and greedy sharding to balance work across parallel workers.

AWS S3 · DataSync · Terraform · boto3 · Python
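The greedy sharding step can be sketched as a longest-processing-time heuristic: sort prefixes by size, then always hand the next one to the least-loaded worker. Prefix names and sizes below are illustrative, not the actual migration layout:

```python
def greedy_shard(prefix_sizes: dict, num_workers: int) -> list:
    """Balance object prefixes across workers with the greedy LPT heuristic.

    Largest prefix first, each assigned to whichever worker currently
    holds the fewest bytes. Returns one shard dict per worker.
    """
    shards = [{"prefixes": [], "bytes": 0} for _ in range(num_workers)]
    for prefix, size in sorted(prefix_sizes.items(),
                               key=lambda kv: kv[1], reverse=True):
        target = min(shards, key=lambda s: s["bytes"])  # least-loaded worker
        target["prefixes"].append(prefix)
        target["bytes"] += size
    return shards
```

For example, `greedy_shard({"2019/": 30, "2020/": 25, "2021/": 20, "2022/": 15}, 2)` splits the four prefixes into two 45-unit shards, which is what keeps parallel DataSync or boto3 workers finishing at roughly the same time.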
Deployment Optimization · 2025

Cold-start reduction for CUDA image rollouts

Large CUDA-based images caused 5–10 minute cold starts. I automated cluster-wide image pre-pulling through a Jenkins-triggered DaemonSet to make model rollouts significantly faster.

Jenkins · DaemonSet · CUDA · CI/CD · Linux
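The pre-pull pattern above can be sketched as a DaemonSet whose init container references the large CUDA image, forcing every node to pull it ahead of the rollout. This is a hedged sketch, not the production manifest; the image and registry names are placeholders:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cuda-image-prepull                # hypothetical name
spec:
  selector:
    matchLabels:
      app: cuda-image-prepull
  template:
    metadata:
      labels:
        app: cuda-image-prepull
    spec:
      initContainers:
        - name: pull
          image: registry.example.com/model-server:cuda  # placeholder CUDA image
          command: ["sh", "-c", "true"]    # exit immediately; the pull is the point
      containers:
        - name: idle
          image: busybox:1.36
          command: ["sleep", "infinity"]   # keep the pod alive so the DaemonSet stays healthy
```

Jenkins applies a manifest like this with the new image tag before the rollout, so by the time the Deployment updates, every node already has the layers cached.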

Technical Skills

Stack I work with

Languages

Python, Java

Cloud & DevOps

Kubernetes, Docker, KEDA, Helm, ArgoCD, Jenkins, AWS DataSync, EC2, S3, CloudFront

Infrastructure

Linux, Redis, Nginx, GPU inference operations, CUDA-based containers

Backend

FastAPI, Celery, Gunicorn, Uvicorn

Observability

Prometheus, Grafana, Node Exporter, cAdvisor

Version Control

GitLab, GitHub

Education & Certification

Background

May 2023

Virginia Tech

Bachelor of Science in Computer Science · Graduated Cum Laude

Feb 2026

Certified Kubernetes Administrator

The Linux Foundation

Contact

Let’s build reliable AI infrastructure.

I’m open to backend, infrastructure, platform engineering, and AI/ML infrastructure roles where production reliability and system design matter.