Introduction
Amazfit watches ship with a companion app called Zepp that reports body fat, ECG analysis, sleep apnea risk, a daily readiness score, and over a dozen other health metrics. The APK is 315 MB. After decompiling with jadx and extracting 10 ARM64 native libraries, the relevant logic condensed to 39 Java files and 10 shared objects.
The Java layer is a thin marshaling interface. The actual computation lives in native ARM64 libraries with stripped symbols, fixed-point arithmetic, and embedded neural network weights.
1. Model Inventory
Zepp bundles six ML models, all running on-device with no cloud dependency. The weights for five of them are compiled directly into the native libraries as float arrays. The sixth is a serialized Random Forest loaded at runtime.
AFib Detection
A neural network that classifies 30-second ECG segments as normal sinus rhythm or atrial fibrillation. Input is a processed ECG waveform (filtered, R-peaks located, beat-averaged). Output is a binary classification with a confidence score. The weights (afib_net_weights) are embedded in the main health library. This is the only model that makes a clinical claim — AFib detection is a regulated medical feature.
Likely trained on the MIT-BIH Atrial Fibrillation Database, the standard public dataset for this task. The architecture is a small feedforward network, probably 3-5 layers with a few hundred parameters total — typical for embedded ECG classifiers that must run in real time on a watch microcontroller.
ECG Signal Quality
Two small networks (noise_net and noise_net_192) classify ECG segments as clean or noisy. Before any disease detection runs, the signal quality check discards segments with excessive noise, baseline wander, or motion artifact. The "192" variant suggests a 192-dimensional input vector — likely a 6-second window at 32 Hz, or similar.
This is the gatekeeper for all downstream ECG analysis. If it rejects a segment, no AFib detection, no HRV calculation, no biometric matching happens on that data.
Beat Classification CNN
A 1D convolutional network (3 input channels, 1D kernel, 4 output filters, leaky ReLU activation) that classifies individual heartbeats as Normal, Premature (PAC/PVC), or AFib candidate. Runs after R-peak detection — each detected beat gets classified before the full-segment AFib model runs.
The 3 input channels likely represent a short waveform window around each R-peak across 3 ECG leads, though Amazfit watches typically use a single-lead ECG. Alternatively, they could be the raw signal plus two derived features (first derivative, second derivative).
ECG Biometric Matching
A neural kernel (bioid_kernel_weights) that determines whether two ECG recordings come from the same person. Unlike simple template matching with fiducial points (P wave, QRS, T wave distances), this uses a learned similarity metric. You enroll by recording a baseline ECG. Subsequent recordings are compared against the enrolled template and return MATCH_OK or MATCH_FAIL.
This is the same approach used in medical-grade ECG biometrics. The kernel was likely trained on a dataset of multi-session ECG recordings from many subjects, learning which waveform features are stable within a person and which vary.
Health Risk Random Forest
A serialized Random Forest model for health risk classification. Unlike the neural networks (whose weights are compiled into the .so), this model is loaded at runtime from a string — likely a separate asset file in the APK or downloaded from a server. Input features are health indicators (BMI, body fat, HRV scores, resting heart rate, blood pressure trends). Output is a risk category or score.
The use of a Random Forest rather than a neural network suggests this was trained on tabular health data with clear feature importance requirements. Random Forests are more interpretable than neural nets — you can inspect which features drive the classification, which matters for health recommendations.
ONNX Runtime
The APK bundles a full 18 MB ONNX Runtime library, separate from the embedded C-weight models. No .onnx model files have been located yet, but the runtime is linked and loaded. This suggests a two-tier deployment: older or smaller models compiled into C arrays for minimal overhead, and newer or larger models loaded through ONNX for flexibility. Possible ONNX models could include sleep staging, emotion detection, or updated versions of the embedded networks.
2. Body Composition
Body composition is measured through Bioelectrical Impedance Analysis (BIA): the watch sends a small alternating current through the body and measures impedance. Higher impedance means more fat (fat is less conductive than muscle).
Dual Backend
Zepp supports two different BIA chips depending on which watch hardware is connected:
Holtek (sources 101, 102): The chip itself does the math and returns pre-computed bins via a lookup table. Weight, height, sex, age, and impedance are sent to the chip, which returns string-keyed results. Rating tables for BMI, body fat, BMR, bone mass, and water percentage are hardcoded.
BestHealth (source 104): The raw impedance is sent to the phone, where regression formulas in a native library compute all 15 body metrics. This backend reveals the actual math.
The Formulas
All calculations use integer fixed-point arithmetic. Weight is stored as kg × 10 (700 = 70.0 kg), height as cm × 10 (1750 = 175.0 cm).
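BMI:
BMI_scaled = weight × 100000 / height²
Result is BMI × 100 (e.g., 1850 = 18.5), clamped to a fixed range. The extracted bounds, [900, 1000], cannot both be in these units, since 18.5 itself would fall outside them; the upper bound is more plausibly 10000 (BMI 100.0).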
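The fixed-point scaling is easier to follow with the units written out. The sketch below is an illustration, not the decompiled routine: the function name and the 10,000,000 multiplier are assumptions, chosen so that weight in kg × 10 and height in cm × 10 produce BMI × 100. With the 100000 multiplier quoted above, the same inputs produce BMI in whole units.

```python
def bmi_x100(weight_x10: int, height_x10: int) -> int:
    """Return BMI * 100 using integer arithmetic only.

    weight_x10 is kg * 10 and height_x10 is cm * 10, as in the text.
    BMI = kg / m^2 = (w / 10) / (h / 1000)^2 = w * 100000 / h^2,
    so an extra factor of 100 (assumed here) yields BMI * 100.
    """
    if height_x10 <= 0:
        raise ValueError("height must be positive")
    return weight_x10 * 10_000_000 // (height_x10 * height_x10)


# 70.0 kg and 175.0 cm, in the scaled units described above.
print(bmi_x100(700, 1750))
# Same inputs with the multiplier quoted in the text (whole-unit BMI).
print(700 * 100_000 // (1750 * 1750))
```

Integer division truncates, so the last digit of the scaled result is rounded down rather than to nearest.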
Fat-Free Mass:
FFM = (weight × 9058 × constant) / 2³⁸ + height × 320 - age × 68 - offset
The triple product is too large for 32 bits, so it is formed in a 64-bit intermediate; the division by 2³⁸ is then a right shift on that fixed-point value. The coefficients 9058, 320, and 68 are regression-derived from population data, not from first principles.
Body Fat Percentage:
bodyFat = weight × 5 - impedance_adjusted
Multiplied by 98, 102, 103, or 96 depending on impedance range
Final: (bodyFat × 100) / weight, clamped [0.5%, 7.5%] in internal units
The impedance range gating (below 50Ω, 50-500Ω, 500-601Ω, above 601Ω) makes this piecewise linear. Sex-dependent coefficients: 778 for male, 992 for female.
Muscle Mass: FFM - boneMass - 542
Bone Mass: Derived from weight, impedance, and sex.
Water Percentage: Computed from FFM (fat-free mass is about 73% water).
Visceral Fat Area: Sex-specific regression, likely using a waist/hip ratio proxy derived from impedance phase angle.
Rating Thresholds
Every metric has three to five rating categories, with the boundaries stored as constant tables:
| Metric | Thresholds |
|---|---|
| BMI | 18.5 (Under/Normal), 24.0 (Normal/Over), 28.0 (Over/Obese) |
| Body Fat | Under/Standard-/Standard+/Over/Obese (5 categories) |
| BMR, Bone, Muscle, Water, Protein | Standard/Under/Over |
Weight-Loss Planning
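The only formula visible in the Java layer uses 7700 kcal/kg, the standard energy density of human adipose tissue:
exerciseMinutes = (loseWeightKg × 7700) / caloriesPer30MinForSport
Dimensionally, the quotient is a count of 30-minute blocks of the chosen sport, so converting it to minutes would mean multiplying by 30.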
The native library stores calorie-per-30-min coefficients for 18 sports: Walking, Golf, GateBall, Tennis, Cycling, Basketball, Squash, RacketBall, Taekwondo, Fencing, Mountain Climbing, Swimming, Aerobics, Jogging, Football, Jump Rope, Badminton, Table Tennis.
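As a rough illustration of the planning arithmetic, the sketch below converts a weight-loss target into exercise time. The function name, the explicit ×30 conversion from 30-minute blocks to minutes, and the 154 kcal coefficient are all assumptions for the example; none of the 18 extracted coefficients are reproduced here.

```python
KCAL_PER_KG_FAT = 7700  # energy density quoted in the text


def exercise_minutes(lose_weight_kg: float, kcal_per_30_min: float) -> float:
    """Minutes of one sport needed to burn lose_weight_kg of fat.

    The division gives the number of 30-minute blocks; multiplying by 30
    converts blocks to minutes (an assumption about the intended units).
    """
    if kcal_per_30_min <= 0:
        raise ValueError("calorie coefficient must be positive")
    blocks = lose_weight_kg * KCAL_PER_KG_FAT / kcal_per_30_min
    return blocks * 30


# 154 kcal per 30 minutes is a made-up coefficient for illustration only.
print(exercise_minutes(1.0, 154.0))
```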
3. Healthcare
ECG Pipeline
Raw ECG from the watch sensor goes through four stages before any diagnosis:
- Savitzky-Golay filtering: removes baseline wander and power-line noise while preserving peak shapes. Multiple filter variants exist for offline (batch), online (real-time), and clinical (doctor) use.
- Pan-Tompkins R-peak detection: locates heartbeats using adaptive amplitude thresholds. Runs in both real-time and batch modes.
- Beat averaging: aligns detected beats and averages them into a template, reducing noise for morphology analysis.
- Beat classification: the conv3x1x4 CNN classifies each beat as Normal, Premature, or AFib candidate.
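Of the four stages, beat averaging is the simplest to show in a few lines. The sketch below is a minimal version under stated assumptions: R-peak indices are already known, every beat uses the same fixed window, and incomplete beats at the edges are dropped. The function name and window parameters are hypothetical, not taken from the library.

```python
from typing import List, Sequence


def average_beats(signal: Sequence[float], r_peaks: Sequence[int],
                  before: int, after: int) -> List[float]:
    """Average fixed-length windows centred on each R-peak.

    Beats whose window would run past either end of the signal are skipped.
    Returns an empty list if no complete beat is available.
    """
    windows = [
        signal[p - before:p + after + 1]
        for p in r_peaks
        if p - before >= 0 and p + after < len(signal)
    ]
    if not windows:
        return []
    n = len(windows)
    return [sum(w[i] for w in windows) / n for i in range(before + after + 1)]


# Two synthetic "beats" with different peak heights at samples 3 and 10.
ecg = [0, 0, 0.2, 1.0, 0.2, 0, 0, 0, 0, 0.4, 2.0, 0.4, 0, 0]
template = average_beats(ecg, r_peaks=[3, 10], before=1, after=1)
print(template)
```

Averaging aligned beats suppresses noise that is uncorrelated with the heartbeat, which is why the template is more useful for morphology analysis than any single beat.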
Full HRV Analysis
Heart rate variability is analyzed in both time and frequency domains:
Time domain: AVNN (average beat-to-beat interval), SDNN (standard deviation), RMSSD (root mean square of successive differences, the gold standard for parasympathetic activity), NN50/pNN50 (count/percentage of intervals differing by >50ms), plus mean and standard deviation of heart rate.
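The time-domain metrics follow directly from their definitions. The sketch below computes them from a list of NN intervals in milliseconds; the function name and the use of the sample (n − 1) standard deviation for SDNN are assumptions, since the text does not say which convention the native code uses.

```python
import math
import statistics
from typing import Dict, Sequence


def hrv_time_domain(nn_ms: Sequence[float]) -> Dict[str, float]:
    """Compute time-domain HRV metrics from NN intervals in milliseconds."""
    if len(nn_ms) < 2:
        raise ValueError("need at least two NN intervals")
    # Successive differences between neighbouring intervals.
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    return {
        "avnn": statistics.fmean(nn_ms),
        "sdnn": statistics.stdev(nn_ms),  # sample standard deviation (assumed)
        "rmssd": math.sqrt(statistics.fmean([d * d for d in diffs])),
        "nn50": nn50,
        "pnn50": 100.0 * nn50 / len(diffs),
    }


metrics = hrv_time_domain([800, 810, 790, 860, 800])
print(metrics)
```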
Frequency domain (Welch's method):
- VLF (0.003-0.04 Hz): Long-term regulation, thermoregulation
- LF (0.04-0.15 Hz): Mixed sympathetic/parasympathetic
- HF (0.15-0.40 Hz): Parasympathetic (vagal) activity, respiratory sinus arrhythmia
- LF/HF ratio: Sympathovagal balance
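The processing chain: artifact removal → cubic spline interpolation → resampling to 4 Hz → power spectral density estimation. The interpolation step is needed because beat-to-beat intervals are unevenly spaced in time, while Welch's method expects a uniformly sampled series.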
HRV Score: RMSSD is mapped to a 0-100 score through a logarithmic transformation — higher RMSSD means higher parasympathetic activity and a higher score.
Disease Detection
The ECG classifier returns one of four codes:
- 200 — Normal sinus rhythm
- 201 — Atrial fibrillation
- 202 — Premature beats (PAC/PVC)
- 203 — Rejected (insufficient quality)
Signal quality is separately coded: 8 (good) or 9 (poor). The quality gate runs before disease detection.
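The relationship between the two code sets can be sketched as a gate in front of the classifier. The numeric values come from the lists above; the enum names, the `diagnose` function, and the choice to map a poor-quality segment directly to code 203 are assumptions about how the pieces fit together.

```python
from enum import IntEnum
from typing import Callable, Sequence


class EcgResult(IntEnum):
    NORMAL_SINUS = 200
    AFIB = 201
    PREMATURE_BEATS = 202
    REJECTED = 203


class SignalQuality(IntEnum):
    GOOD = 8
    POOR = 9


def diagnose(segment: Sequence[float],
             quality_check: Callable[[Sequence[float]], SignalQuality],
             classifier: Callable[[Sequence[float]], EcgResult]) -> EcgResult:
    """Run the quality gate first; only clean segments reach the classifier."""
    if quality_check(segment) == SignalQuality.POOR:
        return EcgResult.REJECTED
    return classifier(segment)


# Stand-in callables; the real checks are the neural networks described earlier.
result = diagnose([0.0, 0.1], lambda s: SignalQuality.POOR, lambda s: EcgResult.AFIB)
print(int(result))
```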
Health Scoring
The overall health score is a weighted combination of sub-scores:
Body score: BMI checked against thresholds at 18.0 and 24.0. Penalties: -11.0 (severely underweight), -3.5 (mildly underweight), -24.0 (severely overweight). Weighted at 50% of the overall score.
Other sub-scores: HRV, resting heart rate, skin temperature deviation, and sleep apnea (OSA) score. All configurable with weights and thresholds that can be adjusted per device firmware version.
SpO2 and Sleep Apnea
Overnight oxygen saturation monitoring tracks desaturation events, computes the Oxygen Desaturation Index (ODI), and detects Obstructive Sleep Apnea events. Results include a sleep breathing score and JSON-structured event data.
ECG Biometric Identification
The watch can identify you by your ECG waveform. After a one-time enrollment (recording a baseline), subsequent ECGs are compared against the stored template. Match results: 80 (bad data), 81 (match), 82 (no match). The matching uses a trained neural kernel rather than hand-engineered feature distances.
4. Charge (Readiness Score)
Charge is Zepp's composite readiness score, comparable to Garmin Body Battery or Whoop Recovery. It consumes six per-minute time series (stress, heart rate, activity, sleep state, sleep stage, timestamps) plus four daily health scores. It's the most sophisticated algorithm in the system.
Signal Decomposition
Physiological signals are decomposed into three components:
- Signal A — Long-term baseline trend over days to weeks. Captures fitness adaptations, seasonal changes, chronic stress.
- Signal B — Circadian rhythm. The predictable 24-hour cycle that makes most people's heart rate dip during sleep and rise before waking.
- Signal C — Residual. Short-term deviations: acute stress, illness, exercise recovery, alcohol effects.
The decomposition runs continuously as data streams in, using online statistics that update without storing the full history. This is the same technique used in industrial process control for separating signal from noise in streaming sensor data.
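One way to get a three-way split without storing history is to pair a slow exponential moving average (the trend) with one running average per minute-of-day slot (the circadian profile). The sketch below is an assumed construction for illustration only; the class name, the smoothing constants, and the use of exponential averages are not taken from the disassembly.

```python
from typing import List, Optional, Tuple


class StreamingDecomposer:
    """Split a per-minute signal into trend, circadian, and residual parts."""

    def __init__(self, trend_alpha: float = 1e-4, circadian_alpha: float = 0.1,
                 slots_per_day: int = 1440) -> None:
        self.trend_alpha = trend_alpha          # small, so the trend moves over days
        self.circadian_alpha = circadian_alpha  # update rate for each time-of-day slot
        self.slots_per_day = slots_per_day
        self.trend: Optional[float] = None      # initialised on the first sample
        self.circadian: List[float] = [0.0] * slots_per_day

    def update(self, minute_index: int, value: float) -> Tuple[float, float, float]:
        """Consume one sample and return (trend, circadian, residual)."""
        if self.trend is None:
            self.trend = value
        else:
            self.trend += self.trend_alpha * (value - self.trend)
        slot = minute_index % self.slots_per_day
        detrended = value - self.trend
        self.circadian[slot] += self.circadian_alpha * (detrended - self.circadian[slot])
        residual = detrended - self.circadian[slot]
        return self.trend, self.circadian[slot], residual


decomposer = StreamingDecomposer()
for minute in range(10):
    decomposer.update(minute, 60.0)
print(decomposer.update(10, 75.0))  # a sudden jump lands mostly in the residual
```

State is fixed-size (one trend value plus one value per slot), and the three components always sum back to the input sample.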
Recovery Factors
Recovery rate depends on current depletion level, expressed as multipliers:
| State | Recovery Rate |
|---|---|
| Normal | 100% |
| Mildly depleted | 70% |
| Moderately depleted | 50% |
| Significantly depleted | 22.7% |
| Severely depleted | 15% |
These are exact float constants extracted from the ARM64 disassembly. The steep drop-off after moderate depletion means deep fatigue requires disproportionately more recovery time — consistent with the non-linear recovery curves seen in athletic training research.
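To see what the multipliers do in practice, the sketch below applies them to a single recovery step. The factor values are the ones in the table; the state labels, the `recharge` function, the 0-100 cap, and the idea of a per-step base gain are assumptions, and the rule that maps a charge level to a depletion state is not known.

```python
RECOVERY_FACTOR = {
    "normal": 1.0,
    "mild": 0.70,
    "moderate": 0.50,
    "significant": 0.227,
    "severe": 0.15,
}


def recharge(charge: float, base_gain: float, state: str) -> float:
    """Apply one recovery step, scaled by the depletion state, capped at 100."""
    return min(100.0, charge + base_gain * RECOVERY_FACTOR[state])


# The same nominal gain of 10 points yields very different recovery.
for state in RECOVERY_FACTOR:
    print(state, recharge(20.0, 10.0, state))
```

At the severe factor, regaining a given amount of charge takes roughly 6.7 times as many steps as in the normal state (1 / 0.15).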
Mental vs Physical Charge
Charge is split into two independent pipelines that sum to a total:
Physical charge derives from heart rate and activity signals. Physical activity depletes it; rest and sleep recharge it. The recovery rate depends on the recovery factor above.
Mental charge derives from stress and sleep quality signals. It has its own recovery dynamics with an asymmetric threshold at 50: recovery is slower when mental charge is already low. Health metrics (HRV, resting heart rate, temperature, sleep quality) gate the mental recovery rate.
Both pipelines output per-minute time series plus morning wake values that indicate starting readiness for the day.
Morning Alarm
At wake-up, an anomaly detector compares current morning metrics (HRV, resting heart rate, skin temperature, wake-after-sleep-onset, apnea-hypopnea index) against their running baselines. If any metric deviates by more than a threshold number of standard deviations, an alert triggers with the z-score, severity, and identification of which metric caused the deviation.
This is textbook statistical process control: track baselines with online mean/variance, flag deviations. The novelty isn't the math, it's the data pipeline that generates reliable baselines from consumer-grade sensors.
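A minimal sketch of that pattern uses Welford's algorithm for the running mean and variance, then a z-score test against the baseline. The class and function names and the default threshold of 2.0 standard deviations are assumptions; the actual thresholds are not given in the text.

```python
import math
from typing import Optional


class RunningBaseline:
    """Online mean and variance via Welford's algorithm."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    def z_score(self, x: float) -> Optional[float]:
        """Return the z-score of x, or None until a usable variance exists."""
        if self.n < 2:
            return None
        std = math.sqrt(self._m2 / (self.n - 1))
        return None if std == 0 else (x - self.mean) / std


def morning_alert(baseline: RunningBaseline, value: float,
                  threshold: float = 2.0) -> bool:
    """Flag a morning reading that deviates too far from its baseline."""
    z = baseline.z_score(value)
    return z is not None and abs(z) > threshold


resting_hr = RunningBaseline()
for reading in [50, 52, 48, 51, 49]:
    resting_hr.update(reading)
print(morning_alert(resting_hr, 60), morning_alert(resting_hr, 51))
```

In a full version there would be one baseline per metric, and the alert would report which metric crossed the threshold along with its z-score.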
5. Fitness, Exertion, and Sleep
Training Load Model
Zepp tracks training load using the impulse-response model from exercise physiology:
- Acute Training Load (7-day window): Short-term fatigue from recent workouts. Spikes after hard training, decays over about a week.
- Chronic Training Load (42-day window): Long-term fitness adaptation. Rises slowly with consistent training, decays slowly during breaks.
- Training Stress Balance (CTL - ATL): Positive means you're fresh and ready to perform. Negative means you're fatigued and should recover.
The system also computes TRIMP (Training Impulse) scores — a heart-rate-based measure of workout intensity — and a ramp rate constraint that limits how fast training load can safely increase.
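The three quantities can be sketched with exponentially weighted averages, which is the usual way the 7-day and 42-day constants are applied in the impulse-response literature. Whether Zepp uses exponential weighting or a plain sliding window is not stated, so the smoothing form, the function name, and the same-day TSB convention are assumptions.

```python
import math
from typing import Iterable, List, Tuple


def training_load(daily_load: Iterable[float], atl_days: float = 7.0,
                  ctl_days: float = 42.0) -> List[Tuple[float, float, float]]:
    """Return (ATL, CTL, TSB) per day using exponentially weighted averages."""
    k_atl = 1.0 - math.exp(-1.0 / atl_days)
    k_ctl = 1.0 - math.exp(-1.0 / ctl_days)
    atl = ctl = 0.0
    history = []
    for load in daily_load:
        atl += k_atl * (load - atl)  # fast-moving fatigue
        ctl += k_ctl * (load - ctl)  # slow-moving fitness
        history.append((atl, ctl, ctl - atl))
    return history


# Two weeks of hard training followed by two weeks of rest.
history = training_load([100.0] * 14 + [0.0] * 14)
print(history[13])  # end of the training block
print(history[-1])  # end of the rest block
```

During the training block fatigue outruns fitness and the balance goes negative; after rest, fatigue decays faster than fitness and the balance turns positive.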
Sleep Staging
Sleep staging runs on minute-level accelerometer and heart rate data, classifying each minute as wake, light sleep, deep sleep, or REM.
The initial classification produces probability distributions for each stage. Then ten refinement passes resolve ambiguities: pattern recognition corrects physiologically impossible sequences (e.g., REM directly from wake), segment duration filtering removes implausibly short stages, deep and light sleep extensions fill gaps, and non-wear periods are removed.
This is a hybrid approach: a probabilistic classifier handles the fuzzy initial assignment, followed by deterministic rules that enforce known sleep physiology constraints.
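One of the deterministic passes, segment duration filtering, can be sketched as follows. This is an assumed minimal version: runs shorter than a threshold are relabelled with the stage that precedes them. The function name, the merge direction, and the threshold are hypothetical, and the real pipeline applies several other passes.

```python
from itertools import groupby
from typing import List


def absorb_short_segments(stages: List[str], min_len: int) -> List[str]:
    """Relabel runs shorter than min_len with the stage of the preceding run.

    A short run at the very start is left unchanged, since there is nothing
    to merge it into.
    """
    out: List[str] = []
    for stage, run in groupby(stages):
        length = len(list(run))
        if length < min_len and out:
            stage = out[-1]  # inherit the label of the minute just before
        out.extend([stage] * length)
    return out


# A single minute of REM inside light sleep is treated as implausible.
night = ["light"] * 5 + ["rem"] + ["light"] * 4 + ["deep"] * 6
print(absorb_short_segments(night, min_len=3))
```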
Sleep quality scoring evaluates: total duration, sleep onset timing (circadian alignment), number and duration of awakenings, and wake-after-sleep-onset.
6. Library Catalog
| Library | Size | Purpose |
|---|---|---|
| Main health library | 2.8 MB | Health scoring, ECG, HRV, AFib detection, SpO2, biometrics, blood pressure |
| ECG signal processing | 440 KB | R-peak detection, QRS width, beat averaging, noise classification |
| Charge algorithm | 404 KB | Readiness/energy scoring, signal decomposition, anomaly detection |
| Exertion & fitness | 392 KB | Training load, TRIMP, recovery modeling |
| Sleep & activity | 492 KB | Sleep staging, step counting, heart rate during sleep |
| Emotion detection | 248 KB | Emotional state estimation |
| Body composition (BestHealth) | 28 KB | BIA regression formulas |
| Body composition (Holtek) | 16 KB | BIA lookup tables |
| Credit card OCR | 1.3 MB | Not health-related |
| Credit card scanner | 12 KB | Not health-related |
Dependencies: Eigen (C++ linear algebra for matrix operations), OpenCV (image processing), ONNX Runtime (18 MB, ML model inference).
7. Key Observations
The dual BIA backend is supply chain flexibility. Zepp supports BIA chips from two different manufacturers across its watch hardware, selecting the backend at runtime according to the connected device. Holtek chips do the math internally and return pre-computed bins. BestHealth chips send raw impedance to the phone, where the formulas run in software.
The Charge algorithm is a real-time state estimator. Online variance tracking, streaming signal decomposition into trend/circadian/residual components, and Kalman filtering are the same techniques used in industrial process control. The mental/physical split with non-linear recovery dynamics goes beyond what Garmin or Whoop document publicly.
All ML models run on-device with no cloud dependency. AFib detection, signal quality, beat classification, biometric matching, and health risk prediction are all embedded. The neural network weights are compiled directly into the libraries as float arrays; the Random Forest is the exception, loaded at runtime from a serialized string. Your ECG and health data never leave the device for inference.
The health scoring is transparent once decompiled. The overall score is a configurable weighted sum of sub-scores. Body scoring uses explicit BMI thresholds (18.0, 24.0) with fixed penalties (-11.0, -3.5, -24.0). The only black boxes are the raw sensor-to-metric transformations — impedance to body fat, raw ECG to HRV metrics.
Configuration data is separate from algorithm code. The 18 sport calorie coefficients and all body composition rating thresholds are constant tables, not hardcoded logic. They could be updated without recompiling the algorithms.
The native libraries are scoped disassembly targets. The Java layer provides a complete specification of every function's inputs and outputs. You know exactly what each native function consumes and produces before opening the .so in a disassembler.