#13

MedGemma Medical Classifier

February 25, 2026

MedGemma · QLoRA · React Native · Expo · Python

Kaggle MedGemma Impact Challenge entry. Fine-tuned MedGemma 4B with QLoRA (runs under 5GB). React Native Android app for scan upload, structured reports, and urgency classification.

What is it?

A Kaggle MedGemma Impact Challenge entry. Fine-tuned Google's MedGemma 4B multimodal medical LLM using QLoRA to classify medical scans, then built a React Native Android app where users upload scans and receive structured diagnostic reports with urgency classification.

How it works

MedGemma 4B + QLoRA fine-tuning on medical imaging datasets → quantized model served via FastAPI → React Native (Expo) app lets users pick a scan from their camera roll → image sent to the FastAPI endpoint → model returns structured JSON: findings, impression, urgency level (urgent / non-urgent / normal) → rendered as a formatted medical report in the app.
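The server side of this flow can be sketched as a minimal handler: decode the uploaded base64 scan, run the model, and return the structured report. The field names, urgency labels, and the stubbed `run_medgemma` function are illustrative assumptions, not the actual implementation; a FastAPI endpoint would simply wrap `classify_scan`.

```python
import base64
import json

URGENCY_LEVELS = ("urgent", "non-urgent", "normal")

def run_medgemma(image_bytes: bytes, prompt: str) -> str:
    # Stub standing in for the fine-tuned MedGemma model; the real version
    # would run quantized inference and return JSON-formatted text.
    return json.dumps({
        "findings": ["No acute abnormality identified."],
        "impression": "Normal study.",
        "urgency": "normal",
    })

def classify_scan(image_b64: str) -> dict:
    """Decode the uploaded scan, run the model, return a structured report."""
    image_bytes = base64.b64decode(image_b64)
    raw = run_medgemma(image_bytes, "Classify this scan and report findings.")
    report = json.loads(raw)
    if report.get("urgency") not in URGENCY_LEVELS:
        report["urgency"] = "urgent"  # fail safe: unknown labels escalate
    return report
```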

Under the hood: MedGemma

MedGemma is Google's medical foundation model — pre-trained on medical literature, radiology reports, and medical images. It understands medical terminology and imaging patterns without task-specific training. Fine-tuning with QLoRA adapts it to a specific classification format while keeping VRAM under 5GB (4B model + NF4 quantization).
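The QLoRA setup described here can be sketched as a configuration fragment using Hugging Face `transformers` and `peft`. The checkpoint id, LoRA rank, and target modules are assumptions for illustration; actually running it requires a GPU and the model weights, so treat this as a config sketch rather than a tested recipe.

```python
import torch
from transformers import AutoModelForImageTextToText, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# NF4 4-bit quantization is what keeps the 4B base model under ~5GB of VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForImageTextToText.from_pretrained(
    "google/medgemma-4b-it",  # assumed checkpoint id
    quantization_config=bnb_config,
    device_map="auto",
)

# Low-rank adapters on the attention projections; rank and targets are illustrative.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights train
```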

The multimodal architecture: a vision encoder processes the scan into patch embeddings, a text encoder processes the prompt, and the decoder generates the structured report. The architecture matches standard multimodal LLMs; the difference is training on domain-specific medical data.

React Native with Expo

Expo managed workflow: write JS/TS, Expo builds the Android APK via EAS Build in their cloud. expo-image-picker provides camera roll access. expo-file-system reads the image as base64 for the API request. No Kotlin or Java needed.
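On the wire, the app's upload reduces to a JSON body carrying the base64-encoded image. A Python sketch of the equivalent request (the endpoint path and `image` field name are assumptions):

```python
import base64
import json
import urllib.request

def build_payload(image_bytes: bytes) -> bytes:
    # Mirrors what expo-file-system produces: the raw file as a base64 string.
    return json.dumps(
        {"image": base64.b64encode(image_bytes).decode("ascii")}
    ).encode()

def post_scan(image_bytes: bytes, url: str = "http://localhost:8000/classify") -> dict:
    # Same request the React Native app makes from the device.
    req = urllib.request.Request(
        url,
        data=build_payload(image_bytes),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```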

The tradeoff: Expo managed workflow can only use pre-compiled native libraries. For anything requiring custom native modules (like direct USB access), you'd need to eject to the bare workflow.

Structured output from medical LLMs

Medical reports need consistent structure — findings, impression, urgency — not free-form text. The inference API uses JSON mode or a strict output schema to force the model to return parseable JSON. The React Native app then renders each field with appropriate UI treatment: urgency gets a colored badge, findings are listed, impression is bolded. Structured output is critical when downstream code needs to parse the model's response.
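A minimal sketch of that parsing step, assuming the three fields named above; a real deployment would enforce this with a JSON Schema or a pydantic model, but the principle is the same: reject anything the UI layer cannot render.

```python
import json

ALLOWED_URGENCY = {"urgent", "non-urgent", "normal"}

def parse_report(raw: str) -> dict:
    """Validate the model's JSON output before it reaches the UI layer."""
    report = json.loads(raw)
    if not isinstance(report.get("findings"), list):
        raise ValueError("findings must be a list")
    if not isinstance(report.get("impression"), str):
        raise ValueError("impression must be a string")
    if report.get("urgency") not in ALLOWED_URGENCY:
        raise ValueError(f"unexpected urgency: {report.get('urgency')!r}")
    return report
```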

Key takeaways

  • MedGemma: medical foundation model capabilities, fine-tuning on domain-specific imaging data
  • QLoRA for multimodal models: image-text pair dataset formatting, adapter training
  • Expo EAS Build: cloud APK compilation without Android Studio
  • Structured LLM output: JSON mode, output schemas, parsing for downstream rendering
  • React Native expo-image-picker and expo-file-system for camera roll to API workflows
Watch on YouTube →