Background: Diabetes mellitus, which causes dysregulation of blood glucose in humans, is a major public health challenge. Patients with diabetes must monitor their glycemic levels to keep them in a healthy range. This task is made easier by using continuous glucose monitoring (CGM) devices and relaying their output to smartphone apps, thus providing users with real-time information on their glycemic fluctuations and possibly predicting future trends.
Objective: This study aims to discuss the challenges of predictive glycemic monitoring and to examine the accuracy and blood glucose control effects of Diabits, a smartphone app that helps patients with diabetes monitor and manage their blood glucose levels in real time.
Methods: Using data from CGM devices and user input, Diabits applies machine learning techniques to create personalized patient models and predict blood glucose fluctuations up to 60 min in advance. These predictions give patients an opportunity to take pre-emptive action to maintain their blood glucose values within the reference range. In this retrospective observational cohort study, the predictive accuracy of Diabits and the correlation between daily use of the app and blood glucose control metrics were examined based on real app users’ data. Moreover, the accuracy of predictions on the 2018 Ohio T1DM (type 1 diabetes mellitus) data set was calculated and compared against other published results.
Results: On the basis of more than 6.8 million data points, 30-min Diabits predictions evaluated using Parkes Error Grid were found to be 86.89% (5,963,930/6,864,130) clinically accurate (zone A) and 99.56% (6,833,625/6,864,130) clinically acceptable (zones A and B), whereas 60-min predictions were 70.56% (4,843,605/6,864,130) clinically accurate and 97.49% (6,692,165/6,864,130) clinically acceptable. By analyzing daily use statistics and CGM data for the 280 most long-standing users of Diabits, it was established that under free-living conditions, many common blood glucose control metrics improved with increased frequency of app use. For instance, the average blood glucose for the days these users did not interact with the app was 154.0 (SD 47.2) mg/dL, with 67.52% of the time spent in the healthy 70 to 180 mg/dL range. For days with 10 or more Diabits sessions, the average blood glucose decreased to 141.6 (SD 42.0) mg/dL (P<.001), whereas the time in euglycemic range increased to 74.28% (P<.001). On the Ohio T1DM data set of 6 patients with type 1 diabetes, 30-min predictions of the base Diabits model had an average root mean square error of 18.68 (SD 2.19) mg/dL, which is an improvement over the published state-of-the-art results for this data set.
Conclusions: Diabits accurately predicts future glycemic fluctuations, potentially making it easier for patients with diabetes to maintain their blood glucose in the reference range. Furthermore, an improvement in glucose control was observed on days with more frequent Diabits use.
Diabetes mellitus is one of the biggest public health challenges of our time. Globally, the number of adults living with the disease rose from 108 million in 1980 to 422 million in 2014, with the latter figure constituting about 8.5% of the worldwide adult population. The complications of diabetes caused by increased blood glucose levels (hyperglycemia) include both macrovascular (eg, ischemic heart disease, cerebrovascular disease, and peripheral vascular disease leading to lower extremity amputations) and microvascular (eg, diabetic retinopathy and nephropathy) diseases.
In healthy adults, the pancreas maintains blood glucose levels between approximately 70 mg/dL and 180 mg/dL (mostly at the lower end of this range, except for short postprandial increases) by balancing the levels of insulin and glucagon in the bloodstream.
Owing to impaired pancreatic function and/or reduced insulin sensitivity, patients with diabetes face the challenge of maintaining their blood glucose levels within the reference range via exogenous insulin administration, medications, and lifestyle modifications (eg, changes in diet, exercise, and sleep patterns). These patients, especially those with type 1 diabetes (whose pancreas produces no insulin at all), must constantly monitor their glycemic state and use exogenous insulin to keep their blood glucose from increasing beyond the healthy range into hyperglycemia, while avoiding out-of-range low (hypoglycemic) values, which can potentially lead to seizures, coma, and even death.
The task of blood glucose monitoring, traditionally performed using capillary blood sampling, has been made easier in recent years with the introduction of continuous glucose monitoring (CGM) devices, which measure glucose levels in interstitial fluid at a set frequency, typically every 5 min. Currently, CGM devices are capable of providing an accurate picture of recent and current blood glucose levels and alerting users to hypo- or hyperglycemic events. Some of the existing devices have incorporated simple autoregression algorithms to predict impending blood glucose fluctuations (usually no more than 15-20 min ahead of time) and issue a notification if a hypo- or hyperglycemic event is expected. However, we believe that the functionality of CGM devices can be significantly extended with additional tools to improve their utility and, consequently, the quality of life of their users.
Current Research on Blood Glucose Predictions
There are two common reasons for making blood glucose predictions. The first is to manage blood glucose levels automatically via a closed-loop feedback system for a continuous insulin pump. The second is to return the predictions to the patient so that insulin dosing, food intake, and other behaviors can be corrected to avoid possible hypo- or hyperglycemia. This second use is how predictions are employed in Diabits, the diabetes management app whose predictive approach and accuracy are reviewed in this publication.
Owing to the potential benefits of anticipating blood glucose changes ahead of time, there have been many studies dedicated to developing models capable of short-term (usually in the range of 15-120 min into the future) glycemic predictions. These studies generally fall into 2 categories: (1) physiological approaches, wherein researchers try to model the metabolic processes within the patient’s body using general knowledge of human physiology, and (2) data-driven models, which mostly rely on statistical and machine learning techniques applied to the existing CGM data and other available information (eg, meals, exogenous insulin, sleep, and physical activity) to derive standard patterns of blood glucose behavior, which are then used to predict future glycemic events.
The challenge of using physiological predictive models lies in the fact that to be accurate, these models require a more detailed description of the current state of the patient’s body than can normally be achieved, and even in the presence of such data (eg, in a clinical setting), the performance of physiological models is limited because of the inherent complexity of the human glucose-insulin dynamics, which makes identification of model parameters a difficult task. Therefore, data-driven models (or hybrid models that combine statistical methods with physiological insights) are more viable in practice for short-term blood glucose predictions, as evidenced by most studies cited above.
The data-driven models reported in the literature use a variety of traditional signal processing and machine learning methods for making blood glucose predictions. These models normally use recent CGM measurements as the primary predictive input.
Among the methods that are not based on machine learning are autoregressive methods, Kalman filters, and impulse response techniques, which extrapolate the existing CGM behavior into the near future. Machine learning methods include neural networks, support vector machines (SVMs), decision trees, grammatical evolution, and other approaches. These methods use supervised learning, in which models of blood glucose behavior built from past measurements are used to anticipate future changes.
Evaluation of Prediction Accuracy
The accuracy of short-term blood glucose predictions reported in different studies cannot be easily compared, partly because researchers use a great variety of metrics to evaluate predictive performance, such as the root mean square error (RMSE), mean absolute relative difference, prediction time lag, and the J index, as well as different methods based on error grids developed for blood glucose meter evaluation, such as the Clarke Error Grid and the Parkes (Consensus) Error Grid. More importantly, even with the same metric, glycemic prediction models can exhibit noticeable variation in accuracy when applied to different sets of data owing to the nature of data (in silico or in vivo), the amount of data available for each patient, physiological differences between patients, behavioral changes for each patient, and data quality issues. This variance can be partially reduced by using larger data sets, but for many researchers, only limited data are available because blood glucose readings, like all medical data, are usually not shared freely owing to patient privacy concerns. Although there have been recent attempts to facilitate blood glucose research by creating established sets of CGM data available to scientists, such as the Ohio T1DM (type 1 diabetes mellitus) data set, most studies published to date use private data sets for evaluation, which makes it difficult to objectively evaluate the quality of their results.
Furthermore, the prediction accuracy of different studies may be significantly affected by varied availability of non-CGM data, particularly information related to meal and insulin events. If predictions are only made for periods when no such events occur (which can only be done if the researcher has the data indicating their occurrence), or if these events are taken into account by the predictive model, the accuracy is likely to be much higher than in case of making a prediction for an interval during which unknown events affecting the patient’s blood glucose may have taken place.
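Two of the metrics named above, RMSE and mean absolute relative difference, can be illustrated with a minimal sketch. The function names and the sample values are hypothetical and are not taken from the study; the definitions are the standard ones.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between predictions and reference values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def mard(predicted, actual):
    """Mean absolute relative difference, as a percentage of the reference."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    relative = [abs(p - a) / a for p, a in zip(predicted, actual)]
    return 100.0 * sum(relative) / len(relative)


# Hypothetical glucose values in mg/dL, for illustration only.
reference = [100.0, 150.0, 200.0]
forecast = [110.0, 140.0, 220.0]
print(f"RMSE: {rmse(forecast, reference):.2f} mg/dL")
print(f"MARD: {mard(forecast, reference):.2f}%")
```

RMSE is expressed in the units of the measurement and penalizes large errors more heavily, whereas the relative difference scales each error by the reference value, which is one reason results reported with different metrics are hard to compare directly.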
Feedback Delays and Implications for Predictions
It is important to point out that CGM devices do not measure the actual blood glucose levels but measure the concentration of glucose in interstitial fluid, which tends to follow blood glucose with a patient- and condition-dependent time lag, usually in the range of 5 to 20 min. Although postprocessing of measured CGM data may partially account for this delay, avoiding out-of-range blood glucose excursions requires predictions to be made far enough in advance for the user (or an automatically controlled insulin pump, if the predicted values are used by an artificial pancreas algorithm) to make a correction while the true blood glucose concentration is still within its reference range.
There are several other sources of delays when using predictions for blood glucose control. Frequently, predictions themselves may be lagging compared with the future interstitial glucose levels because of the nature of the predictive algorithm. Next, CGM devices only perform measurements at discrete time intervals (usually between 3 and 15 min, with 5 min being the most common in practice). Therefore, the last measured point may not be quite up to date at the moment the user sees the prediction. Additional delays are introduced by the CGM filtering algorithms. In addition, the corrective action by the user may not have an immediate effect on blood glucose (eg, even for rapid-acting insulin delivered subcutaneously, the action is delayed by about 5-10 min).
Owing to all these delays, in order for the predictions to be maximally effective in preventing out-of-range blood glucose excursions, it is preferable to anticipate glycemic changes at least 30 min in advance, especially in cases of hyperglycemic events caused by the delayed action of insulin. For hypoglycemia prediction, shorter time horizons may be acceptable, although a longer accurate prediction would still give the user more time to take preventive measures.
The aim of this paper is to describe how the challenges that exist in blood glucose predictions are addressed in the Diabits smartphone app and to evaluate the accuracy of its predictions and the potential clinical effects of the app using data from the app’s users and other existing data sets.
General Description of Diabits
Diabits is a smartphone app available for both iOS and Android phones. It reads current blood glucose data either from the app associated with a Dexcom CGM device (via Dexcom Share) or from Nightscout, a cloud-based data aggregator project that can collect, if configured by the user, current data from a Dexcom or Medtronic CGM. The app then presents these data to the user in real time, along with predictions of blood glucose behavior for the next 60 min and statistical information and charts based on the patient’s past blood glucose data.
The main parts of the user interface of the app are shown in the accompanying figure (panels a-c). The Graph panel (a) is the main screen of the app, displaying the recent CGM data, predicted future blood glucose values, and estimated values of insulin and carbohydrates on board, that is, available for future use by the body. The meal and insulin information, entered manually by each user of the app to the best of their knowledge, is displayed in the Journal panel (b). The Analytics panel (c) shows several statistics based on the recent history of the patient’s blood glucose. Some graphic elements of the design may have undergone minor changes during the study.
The predictive models of Diabits were originally created on the basis of the results of a clinical study conducted in collaboration with the endocrinology unit of BC Children’s Hospital (located in Vancouver, Canada) between April and October 2017. During this study, CGM data and heart rate and physical activity information of 9 young patients with type 1 diabetes were collected over a period of 2 months with the goal of creating an accurate model for short-term blood glucose predictions. The predictive models that were developed during this study were subsequently refined using data from a larger pool (approximately 1200 people) of free-living users of the app with approximately 1.6 million data points.
The app gives users an option to manually record, according to their knowledge, food consumption (carbohydrate, protein, and fat content and the glycemic index), insulin intake (the number of units and the type of insulin), physical exercise (intensity and duration), and other events that may affect their blood glucose. This information is added to the CGM data as model inputs to increase the prediction accuracy. The predictive models of Diabits rely significantly on CGM inputs, as most users do not provide enough of the food and insulin information required to make a model that is primarily based on physiological principles. However, all available physiological inputs are taken into account when making a prediction. A schematic diagram of the Diabits prediction approach is shown in a separate figure.
Details of Machine Learning Approach Used in Diabits
Glucose predictions are made via a supervised machine learning framework, with personalized models trained using each patient’s past data.
Glucose values are calculated for 4 time points: 15, 30, 45, and 60 min ahead, with a separate model trained for each point. When plotting the data for users, the in-between points are filled using cubic interpolation. Although it is possible to train models for any number of minutes divisible by the CGM time step (eg, for 5, 10, 15 min, if the CGM time step is 5 min), it is not necessary in practice because the actual blood glucose behavior of patients with insulin-dependent diabetes typically lacks a noticeable high-frequency component (even though unfiltered CGM values may exhibit such fluctuations because of random measurement errors).
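To illustrate how in-between points can be filled from the 4 predicted horizons, the following minimal sketch evaluates the unique cubic polynomial through 4 points. This is an assumption made for illustration: the text does not state which cubic interpolation scheme Diabits uses (eg, a spline) or whether the current CGM value is included as a node. The function name and the predicted values are hypothetical.

```python
def lagrange_cubic(xs, ys, x):
    """Evaluate the unique cubic through four (x, y) points at position x."""
    if len(xs) != 4 or len(ys) != 4:
        raise ValueError("exactly four points are required for a cubic")
    total = 0.0
    for i in range(4):
        # Lagrange basis polynomial for node i, scaled by the node value.
        term = ys[i]
        for j in range(4):
            if j != i:
                term *= (x - xs[j]) / (xs[i] - xs[j])
        total += term
    return total


# Hypothetical model outputs (mg/dL) at the four prediction horizons (min).
horizons = [15, 30, 45, 60]
predicted = [112.0, 121.0, 127.0, 130.0]

# Fill a 5-min grid between the first and last horizon for plotting.
curve = {t: round(lagrange_cubic(horizons, predicted, t), 1)
         for t in range(15, 65, 5)}
for minute, value in curve.items():
    print(minute, value)
```

The interpolated curve passes exactly through the 4 model outputs, so the interpolation only affects how the plot looks between horizons and does not alter the predictions themselves.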
To create inputs for the model, in addition to CGM data, recent food and insulin records, if available, are used to estimate the amount of carbohydrates and insulin currently present in the body (this information is also displayed for the user to see) and their rates of utilization. The calculations are performed using physiological models similar to those reported in the literature. As these physiological models have a number of parameters that are specific to each patient, these calculations can only be performed once a sufficient number of previous points with food and insulin data have been collected so that personalized parameters can be estimated from these. Until that point (for newer app users and those who rarely provide such data to the app), a simpler estimation approach for the current amount of carbohydrates and insulin remaining is used based on the food and insulin information reported by the patient, each patient’s insulin-to-carbohydrate ratio and correction factor provided to the app at sign-up, and the changes in blood glucose levels since each food and/or insulin event.
Other data points, such as those related to the time of the day, day of the week, and recent physical activity data, are also added as separate model inputs to increase the accuracy of predictions.
The resulting inputs are used for training a model that combines gradient boosted decision trees and SVM regression. Gradient boosting of decision trees is an ensemble machine learning technique that works by consecutively training new trees on the differences between the ground truth labels and the combined prediction of all preceding trees. SVM regression operates similarly to linear regression, but with a margin-based (ε-insensitive) loss and a kernel mapping that allows nonlinear systems to be modeled. Diabits uses standard implementations of both of these algorithms from open-source Python packages.
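The residual-fitting idea behind gradient boosting described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the Diabits implementation: it uses one input feature, depth-1 trees (stumps), squared error loss, and hypothetical function names, parameters, and data.

```python
def fit_stump(xs, residuals):
    """Best single-threshold split under squared error.

    Requires at least two distinct values in xs.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1], best[2], best[3]


def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosted(xs, ys, n_trees=50, learning_rate=0.3):
    """Each new stump is trained on what the preceding stumps got wrong."""
    base = sum(ys) / len(ys)
    current = [base] * len(ys)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - c for y, c in zip(ys, current)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        current = [c + learning_rate * stump_predict(stump, x)
                   for c, x in zip(current, xs)]
    return base, learning_rate, stumps


def boosted_predict(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


# Hypothetical data: one input feature and a glucose-like target (mg/dL).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [100.0, 102.0, 105.0, 110.0, 130.0, 150.0, 155.0, 160.0]
model = fit_boosted(xs, ys)
print(round(boosted_predict(model, 5), 1))
```

Production libraries add deeper trees, regularization, and subsampling, but the core loop of fitting each new tree to the current residuals is the same.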
The exact mechanism by which these two methods are implemented and combined is not addressed in this paper but may be disclosed in future publications. Generally, the decision tree model is used to evaluate which of the several possible physiological states the patient is currently in, and then an SVM model trained exclusively on the data pertaining to this particular state (as determined by the training algorithm) generates the prediction.
For each Diabits user, the initial personalized (based solely on this user’s data) model is built once 2000 CGM points (about a week of continuous data) are available. Thereafter, the model is retrained every 2 weeks to take advantage of the most recent data.
Prediction Adjustments in Diabits
One of the issues that needs to be addressed when predictive models are trained on past patient behavior is that in the absence of detailed nutritional and insulin information for free-living patients, training points may reflect unrecorded prior corrections that the patients have made by either ingesting carbohydrates or using insulin. This is particularly problematic when blood glucose is near the edges of the target range (eg, just above 70 mg/dL or just below 180 mg/dL for the standard reference range of glucose values). A model trained on such data will likely predict similar corrections happening in the future, which may result in the patient actually forgoing necessary corrections because blood glucose is predicted to normalize on its own.
To mitigate this effect, in situations where such errors are likely to occur (ie, in situations with an impending hypo- or hyperglycemic event that the user is likely to have avoided in the past training data by taking food or insulin), Diabits uses an additional algorithm to correct its predictions to generate the most likely trajectory of blood glucose in the absence of future external interventions. The user can then decide, based on their own judgment, if any interventions are necessary. This adjustment is only used when blood glucose is trending toward the outside of the target range, there has been no recent change in the direction of the trend indicating a possible unreported meal or insulin event, and no meal or insulin events have been reported in the last 40 min. The final prediction is generated as a weighted average of the main model’s prediction and a prediction that applies linear regression to the recent CGM data and therefore is guaranteed to continue the current trend.
Note that this Diabits adjustment typically increases the calculated prediction error, because the aim is no longer to predict what will actually happen but what will happen if no action is taken. In our opinion, however, it makes the predictions more practically useful. To ensure a fair comparison, the adjustment was not used in part III of the results of this paper, namely when comparing the prediction accuracy of our model with published research on the Ohio T1DM data set. The results for the actual in-app predictions and glycemic control versus frequency of app use (part I and part II), however, are based on a model that does include this adjustment.
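The adjustment described above, a weighted average of the main model’s prediction and a linear extrapolation of recent CGM data, can be sketched as follows. The blending weight, the length of the recent window, the function names, and the sample values are all hypothetical; the paper does not disclose how Diabits sets them.

```python
def linear_trend_forecast(times_min, glucose, horizon_min):
    """Least-squares line through recent CGM points, extrapolated ahead."""
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_g = sum(glucose) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (g - mean_g)
              for t, g in zip(times_min, glucose))
    slope = sxy / sxx  # mg/dL per minute
    intercept = mean_g - slope * mean_t
    return intercept + slope * (times_min[-1] + horizon_min)


def adjusted_prediction(model_prediction, trend_prediction, trend_weight):
    """Weighted average of the main model output and the trend line."""
    if not 0.0 <= trend_weight <= 1.0:
        raise ValueError("trend_weight must lie between 0 and 1")
    return ((1.0 - trend_weight) * model_prediction
            + trend_weight * trend_prediction)


# Hypothetical example: glucose falling by 1 mg/dL per minute.
recent_times = [-20, -15, -10, -5, 0]            # minutes relative to now
recent_glucose = [100.0, 95.0, 90.0, 85.0, 80.0]  # mg/dL
trend = linear_trend_forecast(recent_times, recent_glucose, horizon_min=30)
blended = adjusted_prediction(model_prediction=75.0,
                              trend_prediction=trend,
                              trend_weight=0.5)
print(trend, blended)
```

In this example, a main model that has learned to expect a correction predicts a leveling off near 75 mg/dL, whereas the trend line continues downward; the blended value sits between the two and signals the impending low more clearly.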
Study Format and Ethical Compliance
In the third part of the study (Accuracy of Predictions on the 2018 Ohio T1DM Data Set), a publicly available anonymized 2018 Ohio T1DM data set was used. The data user agreement for this data set allows the use of its data for research purposes.
Part I: Accuracy of Past In-App Predictions for Free-Living Users
The goal of this part of the study was to examine a large set of past Diabits predictions made for the actual users of the app and to determine the clinical safety of these predictions using Clarke and Parkes Error Grid analysis. All Diabits users with type 1 diabetes (as reported by the patients themselves during sign-up) were ranked by the number of blood glucose data points they shared with the app in 2019, and the 500 patients with the most points were chosen for analysis. The sex and age of each specific subject were not known to the researchers; however, in general, there are many Diabits users in all age categories, from newborn to those older than 70 years, and of different sexes (approximately evenly split between males and females). All of the CGM devices used by the study participants were among those compatible with the app (General Description of Diabits). The investigators did not have any further information regarding specific device models for each participant.
The distribution between the Clarke and Parkes Error Grid zones of actual 15-, 30-, 45-, and 60-min predictions made by the app in real time, as compared with the ground-truth data from future CGM points, was calculated using all of the points for these 500 patients where the prediction was made and all of the ground-truth labels were available (6,864,130 total points). The results were examined to determine whether the predictions provided could potentially have led to adverse patient outcomes.
Part II: Glycemic Control Versus Frequency of App Use
The goal of this part of the study was to determine whether there is a correlation between how often the users look at the blood glucose graph of Diabits during each day and their blood glucose control. A total of 280 Diabits users who had at least 180 days of CGM data recorded by the app in 2018 to 2019 were included. The patients came from the same pool as in the first part of the study (in fact, many are the same patients); however, their data from 2 calendar years (2018 and 2019) were used for analysis.
The blood glucose control metrics that were calculated included the average blood glucose and its SD, time in euglycemic range (TIR), glucose management indicator (GMI), and the high (HBGI) and low (LBGI) blood glucose risk indices.
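For readers unfamiliar with these metrics, the sketch below shows how they are commonly computed from a series of CGM readings in mg/dL. The formulas are the commonly published ones (the GMI regression on mean glucose and the symmetrized risk function behind LBGI and HBGI) and are an assumption here, because the text does not state which formulas were used; the function names and sample readings are hypothetical.

```python
import math


def time_in_range(glucose, low=70.0, high=180.0):
    """Percentage of readings within [low, high] mg/dL."""
    in_range = sum(1 for g in glucose if low <= g <= high)
    return 100.0 * in_range / len(glucose)


def gmi_percent(mean_glucose_mgdl):
    """Glucose management indicator (%) from mean glucose in mg/dL."""
    return 3.31 + 0.02392 * mean_glucose_mgdl


def risk_indices(glucose):
    """Return (LBGI, HBGI) for readings in mg/dL."""
    low_risk = 0.0
    high_risk = 0.0
    for g in glucose:
        # Symmetrizing transform: negative below about 112.5 mg/dL.
        f = 1.509 * (math.log(g) ** 1.084 - 5.381)
        risk = 10.0 * f * f
        if f < 0:
            low_risk += risk
        else:
            high_risk += risk
    n = len(glucose)
    return low_risk / n, high_risk / n


# Hypothetical readings (mg/dL), for illustration only.
readings = [60.0, 100.0, 150.0, 200.0]
print(time_in_range(readings))
print(round(gmi_percent(154.0), 2))
print(tuple(round(v, 2) for v in risk_indices(readings)))
```

As a consistency check, applying the GMI formula to the average blood glucose values reported in this study (154.0 and 141.6 mg/dL) gives 6.99% and 6.70%, which agrees with the GMI values reported in the results.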
All of the metrics were analyzed as functions of the frequency of daily use, which was defined as the number of times a Diabits user looked at the graph containing CGM values and future blood glucose predictions during 1 calendar day. Diabits records each user’s CGM data as long as the app is running on the smartphone even if the user is not actively looking at the results, so days with zero sessions were included.
The hypothesis of the study was that all of the blood glucose control metrics would improve with more frequent use of the app. All of the users’ days were categorized into 4 different groups, namely those with 0 sessions, 1 to 5 sessions, 6 to 10 sessions, and more than 10 sessions. P values, calculated using a one-sided t test, are reported for the difference of each metric from that in the group with zero daily sessions (no active use of the app; P0) and in the closest group with fewer sessions (Pfewer). A significance level of α=.01 was used in all cases, with the Bonferroni correction applied for multiple comparisons.
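The testing procedure described above can be sketched as follows. The Bonferroni threshold is simply α divided by the number of comparisons (36 in the results of this study, which matches the 36 P values reported there). The one-sided comparison below uses a Welch-type statistic with a normal approximation, which is an assumption suitable only for large samples and is not necessarily the exact test used by the authors; the function names and data are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance level under the Bonferroni correction."""
    return alpha / n_comparisons


def one_sided_p_less(sample_a, sample_b):
    """Approximate one-sided P value for H1: mean(sample_a) < mean(sample_b).

    Uses the Welch standard error and a normal approximation to the
    t distribution, which is reasonable only when both samples are large.
    """
    standard_error = math.sqrt(variance(sample_a) / len(sample_a)
                               + variance(sample_b) / len(sample_b))
    z = (mean(sample_a) - mean(sample_b)) / standard_error
    return NormalDist().cdf(z)


threshold = bonferroni_threshold(alpha=0.01, n_comparisons=36)
print(f"{threshold:.6f}")

# Hypothetical daily mean glucose values (mg/dL) for two groups of days.
frequent_use_days = [138.0, 142.0, 140.0, 145.0, 139.0, 141.0]
no_use_days = [152.0, 158.0, 150.0, 156.0, 155.0, 153.0]
p_value = one_sided_p_less(frequent_use_days, no_use_days)
print(p_value < threshold)
```

With tens of thousands of user-days per group, as in this study, the normal approximation and the t distribution give practically identical P values.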
Part III: Accuracy of Predictions on the 2018 Ohio T1DM Data Set
To facilitate the comparison of the predictive accuracy of Diabits with existing research, the base Diabits prediction framework was applied without any data set–specific adjustments to the data from the Ohio T1DM data set that was used in the 2018 Blood Glucose Level Prediction (BGLP) challenge at the third International Workshop on Knowledge Discovery in Healthcare Data.
Using the training portion of the data in the 2018 Ohio T1DM data set, personalized Diabits models were created for each of the 6 patients in the data set. Next, 30-min predictions were generated for all points in the test portion of the data except for the first hour, and the prediction error (RMSE) was calculated and compared against the published results of the challenge.
The CGM data were used as is (no averaging or smoothing to eliminate random errors), and only past and present data (CGM glucose levels, basal and bolus insulin, meal, and exercise information) were used for each point to make predictions. In other words, the data were used in the same manner as they are normally used in Diabits, with the training data used to train each patient’s personalized prediction models and the test data used to generate predictions and calculate their accuracy.
Part I: Accuracy of Past In-App Predictions for Free-Living Users
Actual 30-min Diabits predictions under free-living conditions for the 500 most active patients in 2019 (approximately 6.8 million points), made using personalized models based on the gradient boosted decision trees and the SVM regression algorithm discussed above and evaluated using the Parkes Error Grid, were found to be 86.89% (5,963,930/6,864,130) clinically accurate (zone A) and 99.56% (6,833,625/6,864,130) clinically acceptable (zones A and B). For the 60-min predictions, the results were 70.56% (4,843,605/6,864,130) clinically accurate and 97.49% (6,692,165/6,864,130) clinically acceptable (see the table below). A sample distribution of predicted values plotted against actual values for both Clarke and Parkes Error Grids is shown in the accompanying figure.
| Minutes | Error grid type | A, n (%)^a | B, n (%)^a | C, n (%)^a | D, n (%)^a | E, n (%)^a |
|---|---|---|---|---|---|---|
| 15 | Clarke | 6,565,030 (95.64) | 278,954 (4.07) | 135 (0.00) | 19,981 (0.29) | 30 (0.00) |
| 15 | Parkes | 6,613,321 (96.34) | 246,767 (3.60) | 3968 (0.06) | 71 (0.00) | 3 (0.00) |
| 30 | Clarke | 5,835,511 (85.01) | 964,979 (14.06) | 3276 (0.05) | 59,834 (0.87) | 530 (0.01) |
| 30 | Parkes | 5,963,930 (86.89) | 869,695 (12.67) | 29,587 (0.43) | 915 (0.01) | 3 (0.00) |
| 45 | Clarke | 5,174,795 (75.39) | 1,559,461 (22.72) | 21,510 (0.31) | 103,629 (1.51) | 4735 (0.07) |
| 45 | Parkes | 5,359,782 (78.08) | 1,414,438 (20.61) | 85,974 (1.25) | 3931 (0.06) | 5 (0.00) |
| 60 | Clarke | 4,626,623 (67.40) | 2,024,709 (29.50) | 55,195 (0.80) | 144,512 (2.11) | 13,091 (0.19) |
| 60 | Parkes | 4,843,605 (70.56) | 1,848,560 (26.93) | 162,537 (2.37) | 9416 (0.14) | 12 (0.00) |

^a The numbers show the percentage of prediction points in each zone of the Clarke and the Parkes Error Grid. For both grids, the zones are defined as clinically accurate (A), clinically acceptable (B), and clinically inaccurate (C-E).
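The headline percentages follow directly from the zone counts. As a small worked example, the sketch below recomputes the clinically accurate (zone A) and clinically acceptable (zones A and B) shares for the 30-min Parkes Error Grid counts reported above; the function name is hypothetical.

```python
def zone_shares(counts):
    """Percentage of prediction points falling into each error grid zone."""
    total = sum(counts.values())
    return {zone: 100.0 * n / total for zone, n in counts.items()}


# Zone counts for 30-min predictions on the Parkes Error Grid (from the table).
parkes_30_min = {"A": 5_963_930, "B": 869_695, "C": 29_587, "D": 915, "E": 3}

shares = zone_shares(parkes_30_min)
accurate = shares["A"]
acceptable = shares["A"] + shares["B"]
print(f"Total points: {sum(parkes_30_min.values()):,}")
print(f"Clinically accurate (A): {accurate:.2f}%")
print(f"Clinically acceptable (A+B): {acceptable:.2f}%")
```

The same function applies unchanged to any other row of the table.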
Part II: Glycemic Control Versus Frequency of App Use
To evaluate the correlation between the daily frequency of Diabits use and the quality of blood glucose control, several commonly used blood glucose control metrics were calculated for 280 users who had at least 180 days of CGM data recorded by the app in 2018 to 2019 (86,973 days combined for all users) as a function of the daily number of sessions with Diabits (ie, the number of times the user opened the app to look at the blood glucose graph); see the table below.
As can be seen from the table below, all of the metrics except LBGI were better for days with more frequent Diabits use (in almost all cases, P/2<α/36=.00027, the latter value being the significance level calculated using the Bonferroni correction formula for multiple comparisons, thus indicating a statistically significant positive correlation). In the case of LBGI, there was a very slight statistically significant increase in hypoglycemic risk when using the app more frequently (as could be expected owing to tighter glucose control); however, all of the values were well within the minimal risk region of LBGI<1.1.
| Metric | 0 daily sessions^a | 1-5 daily sessions | 6-10 daily sessions | >10 daily sessions |
|---|---|---|---|---|
| Average blood glucose (mg/dL) | 154.0 | 150.7; P0<.001^b; Pfewer<.001 | 145.6; P0<.001; Pfewer<.001 | 141.6; P0<.001; Pfewer<.001 |
| Standard deviation (mg/dL) | 47.6 | 45.3; P0<.001; Pfewer<.001 | 42.1; P0<.001; Pfewer<.001 | 41.5; P0<.001; Pfewer=.07 |
| Time in euglycemic range, as % of all data | 67.52 | 69.39; P0<.001; Pfewer<.001 | 73.05; P0<.001; Pfewer<.001 | 74.28; P0<.001; Pfewer=.004 |
| GMI^c (%) | 6.99 | 6.91; P0<.001; Pfewer<.001 | 6.79; P0<.001; Pfewer<.001 | 6.70; P0<.001; Pfewer<.001 |
| HBGI^d (<4.5: low risk; 4.5-9.0: moderate risk; >9.0: high risk) | 4.63 | 4.20; P0<.001; Pfewer<.001 | 3.62; P0<.001; Pfewer<.001 | 3.13; P0<.001; Pfewer<.001 |
| LBGI^e (<1.1: minimal risk; 1.1-2.5: low risk; 2.5-5.0: moderate risk; >5.0: high risk) | 0.42 | 0.45; P0<.001; Pfewer<.001 | 0.46; P0=.007; Pfewer=.32 | 0.59; P0<.001; Pfewer<.001 |

^a Daily sessions refers to the number of times a Diabits user looks at the CGM values and predictions during 1 calendar day. Diabits records each user’s CGM data as long as the application is running on the smartphone even if the user is not actively looking at the results, so days with 0 sessions are included.
^b All P values <.001 are reported as P<.001. P0 and Pfewer are defined in the methods section of this paper.
^c GMI: glucose management indicator.
^d HBGI: high blood glucose risk index.
^e LBGI: low blood glucose risk index.
Part III: Accuracy of Predictions on the 2018 Ohio T1DM Data Set
The calculated RMSE values for Diabits predictions on the test portion of the 2018 Ohio T1DM data set are presented in the table below.
Of note, the mean prediction error of the Diabits base model (18.68 mg/dL) is lower than that of all other published results.
|Predictive model (RMSEa, mg/dL)||Patient number||Mean (SD)b|
|Diabits base model||17.94c||18.29||15.44||22.22||17.53||20.64||18.68 (2.19)|
|Martinsson, 2019 (LSTM RNNd) ||18.77||17.96||15.96||21.68||18.54||20.29||18.87 (1.79)|
|Chen, 2018 (DRNNe) ||18.78||18.12||15.46||22.83||17.73||21.34||19.04 (2.42)|
|Bertachi, 2018 (feed-forward NNf) ||18.83||19.43||15.88||22.86||17.84||21.12||19.33 (2.24)|
|Xie, 2018 (SVMg) ||18.19||19.12||15.67||24.61||17.49||22.12||19.53 (2.99)|
|Xie, 2018 (ARX linear regression) ||18.36||19.02||16.03||23.90||18.25||21.99||19.59 (2.60)|
|Martinsson, 2018 (LSTM RNN) ||19.50||19.00||16.50||24.20||19.20||22.00||20.07 (2.44)|
|Midroni, 2018 (XGBoost) ||19.81||18.42||18.14||24.17||19.24||22.49||20.38 (2.21)|
|Contreras, 2018 (Grammatical evolution) ||20.98||19.36||19.55||24.49||20.45||22.28||21.19 (1.77)|
|Zhu, 2018 (WaveNet convolutional NN) ||21.72||20.17||18.03||24.80||21.42||24.22||21.73 (2.30)|
aRMSE: root mean square error.
bThe mean column is calculated by averaging the 6 previous columns (mean root mean square error over all patients).
cThe best result for each patient is highlighted in italics.
dRNN: recurrent neural network.
eDRNN: dilated recurrent neural network.
fNN: neural network.
gSVM: support vector machine.
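The summary column of the table above can be checked with a few lines of Python. This is a minimal sketch, not the authors' evaluation code: the function names `rmse` and `summarize` are illustrative, and the use of the population SD (`pstdev`) is an assumption, chosen because it appears consistent with the reported SD values.

```python
import math
from statistics import mean, pstdev


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences (mg/dL)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def summarize(per_patient_rmse):
    """Mean and population SD of per-patient RMSE values, rounded to 2 decimals."""
    return round(mean(per_patient_rmse), 2), round(pstdev(per_patient_rmse), 2)


# Per-patient RMSE values copied from the Diabits base model row of the table.
diabits_rmse = [17.94, 18.29, 15.44, 22.22, 17.53, 20.64]

print(summarize(diabits_rmse))  # (18.68, 2.19)
```

Because the mean is taken over the 6 per-patient RMSE values rather than over all pooled predictions, each patient contributes equally regardless of how many test points that patient has.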
This paper has studied the predictive accuracy of Diabits, a smartphone app that performs blood glucose monitoring based on CGM data, presents a statistical analysis of past data, and generates short-term (up to 60 min) predictions of future glucose behavior. In addition, the correlation between daily use of Diabits and blood glucose control metrics of its users was examined.
A large number of actual predictions made by Diabits for its users were evaluated using the Clarke and Parkes Error Grids. On the Parkes Grid, the predictions fell in the clinically acceptable range (zones A and B) 99.56% of the time (6,833,625/6,864,130) for 30-min predictions and 97.49% of the time (6,692,165/6,864,130) for 60-min predictions, with similar results for the Clarke Grid. The vast majority of predictions were therefore accurate enough not to adversely affect the patients.
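As a quick arithmetic check, the percentages above follow directly from the reported counts. The sketch below only reproduces that division; it does not implement the Parkes Error Grid zone boundaries, and the helper name `percent` is illustrative.

```python
def percent(count, total):
    """Share of `count` in `total`, as a percentage rounded to 2 decimals."""
    return round(100 * count / total, 2)


TOTAL_PREDICTIONS = 6_864_130  # evaluated data points
ACCEPTABLE_30_MIN = 6_833_625  # Parkes zones A and B, 30-min horizon
ACCEPTABLE_60_MIN = 6_692_165  # Parkes zones A and B, 60-min horizon

print(percent(ACCEPTABLE_30_MIN, TOTAL_PREDICTIONS))  # 99.56
print(percent(ACCEPTABLE_60_MIN, TOTAL_PREDICTIONS))  # 97.49

# Absolute number of predictions that fell outside zones A and B.
print(TOTAL_PREDICTIONS - ACCEPTABLE_30_MIN)  # 30505
print(TOTAL_PREDICTIONS - ACCEPTABLE_60_MIN)  # 171965
```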
By analyzing the results of actual app use, it was statistically established that more frequent daily use of Diabits was correlated with improvement in many blood glucose control metrics, including average blood glucose and its SD, TIR, GMI, and HBGI. This is consistent with the goal of the app to help patients better manage their blood glucose and pre-emptively avoid hyper- or hypoglycemia.
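For readers who want to compute these metrics from their own CGM readings, the following sketch shows one way to do so. It is not the Diabits implementation. It assumes glucose values in mg/dL, inclusive 70 to 180 mg/dL bounds for TIR, the linear GMI formula of Bergenstal et al, and the blood glucose risk transformation of Kovatchev et al (both cited in the reference list); all function names are illustrative.

```python
import math


def time_in_range(readings, low=70, high=180):
    """Percentage of readings within [low, high] mg/dL (bounds inclusive)."""
    in_range = sum(1 for g in readings if low <= g <= high)
    return 100 * in_range / len(readings)


def gmi(mean_glucose):
    """Glucose management indicator (%) from mean glucose in mg/dL."""
    return 3.31 + 0.02392 * mean_glucose


def risk(glucose):
    """Return (f, r): symmetrized glucose value and its risk, for mg/dL input."""
    f = 1.509 * (math.log(glucose) ** 1.084 - 5.381)
    return f, 10 * f * f


def lbgi_hbgi(readings):
    """Low and high blood glucose risk indices for a list of readings."""
    low_risk = high_risk = 0.0
    for g in readings:
        f, r = risk(g)
        if f < 0:
            low_risk += r   # reading on the hypoglycemic side of the scale
        else:
            high_risk += r  # reading on the hyperglycemic side of the scale
    n = len(readings)
    return low_risk / n, high_risk / n


print(round(gmi(154.0), 2))                       # 6.99
print(time_in_range([60, 70, 100, 180, 200]))     # 60.0
```

Under this formula, a mean glucose of 154.0 mg/dL corresponds to a GMI of about 6.99%, and a mean of 141.6 mg/dL to about 6.70%. The risk transformation is centered near 112.5 mg/dL, where both indices are approximately 0.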
Finally, the accuracy of Diabits was directly compared with that of existing research using predictions on the 2018 Ohio T1DM data set, and the resulting mean RMSE was lower than that reported in the studies published by other researchers.
All of these results show the viability of Diabits as an effective tool for blood glucose control in CGM users. They also support the ability of the personalized machine learning models underlying Diabits to make informative blood glucose predictions.
Strengths, Limitations, and Possible Future Developments
In part I, the accuracy of the actual glycemic predictions of Diabits was calculated using more than 6.8 million data points. This large sample provides a solid statistical basis for the reported accuracy figures.
The combination of gradient boosting decision trees and SVM regression in the Diabits models may have provided an additional ensembling benefit that enhanced the prediction accuracy. In addition, we believe that one reason why Diabits personalized models based on these techniques work particularly well for most patients compared with, for example, neural network models is the somewhat limited amount of training data available for each patient, which favors traditional machine learning techniques. However, the downside is that the current personalized approach fails to take advantage of the global pool of data available through the app. One possible future research direction is to use combined data from a large number of patients to train a deep neural network model (which may achieve better accuracy with a large amount of data) and then fine-tune this model for each patient.
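The ensembling benefit mentioned above can be illustrated with a toy example. The paper does not state how the two model families are combined, so the simple averaging used here is an assumption, and the glucose values are made up for illustration only (they are not study data). The point of the sketch is that when two models err in partly opposite directions, their average can have a lower RMSE than either model alone.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences (mg/dL)."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def average_ensemble(*prediction_lists):
    """Element-wise mean of the predictions of several models."""
    return [sum(values) / len(values) for values in zip(*prediction_lists)]


# Made-up illustrative values in mg/dL, NOT data from the study.
actual = [120, 135, 150, 160, 155, 140]
gbdt_preds = [128, 141, 158, 166, 163, 149]  # hypothetical tree model, biased high
svr_preds = [114, 131, 141, 155, 146, 133]   # hypothetical SVM regression, biased low

ensemble_preds = average_ensemble(gbdt_preds, svr_preds)

print(rmse(actual, gbdt_preds))
print(rmse(actual, svr_preds))
print(rmse(actual, ensemble_preds))
```

The toy numbers are chosen so that the cancellation is easy to see. With real models the gain depends on how weakly correlated their errors are, and averaging two models with nearly identical errors yields little improvement.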
In part II, the discovered correlation between the daily use of Diabits and the improvement in blood glucose control metrics was based on more than 86,000 days of app use, once again providing a large sample for the statistical analysis. However, the observational nature of the study and the lack of knowledge of which, if any, corrections were made by the users based on the app output do not allow us to establish causality or to estimate the importance of each individual feature of Diabits; both may be topics of future research.
In part III, the predictions of Diabits on the 2018 Ohio T1DM data set showed an improved average RMSE for 30-min predictions over other published approaches, demonstrating Diabits’ high predictive accuracy when compared with other leading models on the same data set.
The authors acknowledge the help of Cindy Marling and Razvan Bunescu from Ohio University for providing the Ohio T1DM data set.
Conflicts of Interest
All authors are current employees of Bio Conscious Technologies, Inc (BCT), the developer of the Diabits app. The study was performed as part of their work at BCT.
- World Health Organization. Global Report on Diabetes. Geneva, Switzerland: World Health Organization; 2016.
- Klein R. Hyperglycemia and microvascular and macrovascular disease in diabetes. Diabetes Care 1995 Feb;18(2):258-268. [CrossRef] [Medline]
- Leahy J, Clark N, Cefalu W. Medical Management of Diabetes Mellitus. New York, USA: Dekker; 2000.
- Workgroup on Hypoglycemia, American Diabetes Association. Defining and reporting hypoglycemia in diabetes: a report from the American Diabetes Association Workgroup on Hypoglycemia. Diabetes Care 2005 May;28(5):1245-1249. [CrossRef] [Medline]
- Rodbard D. Continuous glucose monitoring: a review of successes, challenges, and opportunities. Diabetes Technol Ther 2016 Feb;18(Suppl 2):S3-13 [FREE Full text] [CrossRef] [Medline]
- Cobelli C, Renard E, Kovatchev B. Artificial pancreas: past, present, future. Diabetes 2011 Nov;60(11):2672-2682 [FREE Full text] [CrossRef] [Medline]
- Bequette BW. Challenges and recent progress in the development of a closed-loop artificial pancreas. Annu Rev Control 2012 Dec;36(2):255-266 [FREE Full text] [CrossRef] [Medline]
- Bergman R, Phillips L, Cobelli C. Physiologic evaluation of factors controlling glucose tolerance in man: measurement of insulin sensitivity and beta-cell glucose sensitivity from the response to intravenous glucose. J Clin Invest 1981 Dec;68(6):1456-1467 [FREE Full text] [CrossRef] [Medline]
- Sorensen J. A Physiologic Model of Glucose Metabolism in Man and Its Use to Design and Assess Improved Insulin Therapies for Diabetes. Carnegie Mellon School of Computer Science. URL: http://www.cs.cmu.edu/~./dmilam/files/sorensen_thesis.pdf [accessed 2020-06-12]
- Caumo A, Simeoni M, Cobelli C. Glucose modelling. In: Modelling Methodology for Physiology and Medicine. Cambridge, MA: Academic Press; 2001:337-372.
- Makroglou A, Li J, Kuang Y. Mathematical models and software tools for the glucose-insulin regulatory system and diabetes: an overview. Appl Numer Math 2006 Mar;56(3-4):559-573 [FREE Full text] [CrossRef]
- Dalla Man C, Rizza RA, Cobelli C. Meal simulation model of the glucose-insulin system. IEEE Trans Biomed Eng 2007 Oct;54(10):1740-1749. [CrossRef] [Medline]
- Georga E, Protopappas V, Fotiadis D. Glucose prediction in type 1 and type 2 diabetic patients using data-driven techniques. In: Knowledge-Oriented Applications in Data Mining. London, UK: IntechOpen; 2011:277-296.
- Sparacino G, Zanderigo F, Corazza S, Maran A, Facchinetti A, Cobelli C. Glucose concentration can be predicted ahead in time from continuous glucose monitoring sensor time-series. IEEE Trans Biomed Eng 2007 May;54(5):931-937. [CrossRef] [Medline]
- Reifman J, Rajaraman S, Gribok A, Ward WK. Predictive monitoring for improved management of glucose levels. J Diabetes Sci Technol 2007 Jul;1(4):478-486 [FREE Full text] [CrossRef] [Medline]
- Eren-Oruklu M, Cinar A, Quinn L, Smith D. Estimation of future glucose concentrations with subject-specific recursive linear models. Diabetes Technol Ther 2009 Apr;11(4):243-253 [FREE Full text] [CrossRef] [Medline]
- Lu Y, Rajaraman S, Ward W. Predicting Human Subcutaneous Glucose Concentration in Real Time: a Universal Data-Driven Approach. In: Annual International Conference of the IEEE Engineering in Medicine and Biology Society. 2011 Presented at: IEMBS'11; August 30-September 3, 2011; Boston, MA, USA URL: https://pubmed.ncbi.nlm.nih.gov/22256183/ [CrossRef]
- Xie J, Wang Q. Benchmark machine learning approaches with classical time series approaches on the blood glucose level prediction challenge. 2018 Presented at: Proc 3rd Int Workshop on Knowl Discov in Healthcare Data; 2018; Stockholm, Sweden p. 97-102 URL: http://ceur-ws.org/Vol-2148/paper16.pdf
- Bequette B. Optimal Estimation Applications to Continuous Glucose Monitoring. In: Proceedings of the 2004 American Control Conference. 2004 Presented at: ACC'04; June 30-July 2, 2004; Boston, MA, USA URL: https://doi.org/10.23919/ACC.2004.1383731 [CrossRef]
- Eberle C, Ament C. The Unscented Kalman Filter estimates the plasma insulin from glucose measurement. Biosystems 2011 Jan;103(1):67-72. [CrossRef] [Medline]
- Wang Q, Molenaar P, Harsh S, Freeman K, Xie J, Gold C, et al. Personalized state-space modeling of glucose dynamics for type 1 diabetes using continuously monitored glucose, insulin dose, and meal intake: an extended Kalman filter approach. J Diabetes Sci Technol 2014 Mar;8(2):331-345 [FREE Full text] [CrossRef] [Medline]
- Toffanin C, del Favero S, Aiello E, Messori M, Cobelli C, Magni L. Glucose-insulin model identified in free-living conditions for hypoglycaemia prevention. J Process Control 2018 Apr;64:27-36 [FREE Full text] [CrossRef]
- Toffanin C, Aiello EM, Cobelli C, Magni L. Hypoglycemia prevention via personalized glucose-insulin models identified in free-living conditions. J Diabetes Sci Technol 2019 Nov;13(6):1008-1016. [CrossRef] [Medline]
- Mougiakakou S, Prountzou A, Iliopoulou D. Neural Network-based Glucose-Insulin Metabolism Models for Children With Type 1 Diabetes. In: International Conference of the IEEE Engineering in Medicine and Biology Society. 2006 Presented at: IEMBS'06; August 30-September 2, 2006; New York, NY, USA URL: https://pubmed.ncbi.nlm.nih.gov/17947036 [CrossRef]
- Pappada SM, Cameron BD, Rosman PM. Development of a neural network for prediction of glucose concentration in type 1 diabetes patients. J Diabetes Sci Technol 2008 Sep;2(5):792-801 [FREE Full text] [CrossRef] [Medline]
- Pérez-Gandía C, Facchinetti A, Sparacino G, Cobelli C, Gómez EJ, Rigla M, et al. Artificial neural network algorithm for online glucose prediction from continuous glucose monitoring. Diabetes Technol Ther 2010 Jan;12(1):81-88. [CrossRef] [Medline]
- Zecchin C, Facchinetti A, Sparacino G, De Nicolao G, Cobelli C. Neural network incorporating meal information improves accuracy of short-time prediction of glucose concentration. IEEE Trans Biomed Eng 2012 Jun;59(6):1550-1560. [CrossRef] [Medline]
- Mhaskar H, Pereverzyev S, van der Walt MD. A deep learning approach to diabetic blood glucose prediction. Front Appl Math Stat 2017 Jul 14;3:1-14 [FREE Full text] [CrossRef]
- Mirshekarian S, Bunescu R, Marling C. Using LSTMs to Learn Physiological Models of Blood Glucose Behavior. In: 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society. 2017 Presented at: EMBC'17; July 11-15, 2017; Seogwipo, South Korea URL: https://pubmed.ncbi.nlm.nih.gov/29060501 [CrossRef]
- Martinsson J, Schliep A, Eliasson B. Automatic Blood Glucose Prediction With Confidence Using Recurrent Neural Networks. 2018 Presented at: Proc 3rd Int Workshop on Knowl Discov in Healthcare Data; 2018; Stockholm, Sweden p. 64-68 URL: http://ceur-ws.org/Vol-2148/paper10.pdf
- Chen J, Li K, Herrero P. Dilated recurrent neural network for short-time prediction of glucose concentration. 2018 Presented at: Proc 3rd Int Workshop on Knowl Discov in Healthcare Data; 2018; Stockholm, Sweden p. 69-73 URL: http://ceur-ws.org/Vol-2148/paper11.pdf
- Zhu T, Li K, Herrero P. A deep learning algorithm for personalized blood glucose prediction. 2018 Presented at: Proc 3rd Int Workshop on Knowl Discov in Healthcare Data; 2018; Stockholm, Sweden p. 74-78 URL: http://ceur-ws.org/Vol-2148/paper12.pdf
- Bertachi A, Biagi L, Contreras I. Prediction of Blood Glucose Levels and Nocturnal Hypoglycemia Using Physiological Models and Artificial Neural Networks. 2018 Presented at: Proc 3rd Int Workshop on Knowl Discov in Healthcare Data; 2018; Stockholm, Sweden p. 85-90 URL: http://ceur-ws.org/Vol-2148/paper14.pdf
- Li K, Daniels J, Liu C, Herrero P, Georgiou P. Convolutional Recurrent Neural Networks for Glucose Prediction. IEEE J Biomed Health Inform 2020 Feb;24(2):603-613. [CrossRef] [Medline]
- Martinsson J, Schliep A, Eliasson B, Mogren O. Blood glucose prediction with variance estimation using recurrent neural networks. J Healthc Inform Res 2019 Dec 1;4(1):1-18 [FREE Full text] [CrossRef]
- Aiello E, Lisanti G, Magni L, Musci M, Toffanin C. Therapy-driven deep glucose forecasting. Eng Appl Artif Intell 2020 Jan;87:103255 [FREE Full text] [CrossRef]
- Georga E, Protopappas V, Ardigo D, Marina M, Zavaroni I, Polyzos D, et al. Multivariate prediction of subcutaneous glucose concentration in type 1 diabetes patients based on support vector regression. IEEE J Biomed Health Inform 2013 Jan;17(1):71-81. [CrossRef] [Medline]
- Plis K, Bunescu R, Marling C. A machine learning approach to predicting blood glucose levels for diabetes management. 2014 Presented at: AAAI Workshop: Modern Artificial Intelligence for Health Analytics; 2014; Quebec City, Canada p. 35-39 URL: https://www.aaai.org/ocs/index.php/WS/AAAIW14/paper/download/8737/8308%20
- Georga E, Protopappas V, Polyzos D, Fotiadis DI. A predictive model of subcutaneous glucose concentration in type 1 diabetes based on Random Forests. Conf Proc IEEE Eng Med Biol Soc 2012;2012:2889-2892. [CrossRef] [Medline]
- Midroni C, Leimbigler P, Baruah G. Predicting glycemia in type 1 diabetes patients: experiments with XGBoost. 2018 Presented at: Proc 3rd Int Workshop on Knowl Discov in Healthcare Data; 2018; Stockholm, Sweden p. 79-84 URL: http://ceur-ws.org/Vol-2148/paper13.pdf
- Contreras I, Bertachi A, Biagi L. Using grammatical evolution to generate short-term blood glucose prediction models. 2018 Presented at: Proc 3rd Int Workshop on Knowl Discov in Healthcare Data; 2018; Stockholm, Sweden p. 91-96 URL: http://ceur-ws.org/Vol-2148/paper15.pdf
- Reiterer F, Polterauer P, Schoemaker M, Schmelzeisen-Redecker G, Freckmann G, Heinemann L, et al. Significance and reliability of MARD for the accuracy of CGM systems. J Diabetes Sci Technol 2017 Jan;11(1):59-67 [FREE Full text] [CrossRef] [Medline]
- Facchinetti A, Sparacino G, Trifoglio E, Cobelli C. A new index to optimally design and compare continuous glucose monitoring glucose prediction algorithms. Diabetes Technol Ther 2011 Feb;13(2):111-119. [CrossRef] [Medline]
- Kovatchev BP, Gonder-Frederick LA, Cox DJ, Clarke WL. Evaluating the accuracy of continuous glucose-monitoring sensors: continuous glucose-error grid analysis illustrated by TheraSense freestyle navigator data. Diabetes Care 2004 Aug;27(8):1922-1928. [CrossRef] [Medline]
- Wentholt IM, Hoekstra JB, Devries JH. A critical appraisal of the continuous glucose-error grid analysis. Diabetes Care 2006 Aug;29(8):1805-1811. [CrossRef] [Medline]
- Clarke W, Anderson S, Kovatchev B. Evaluating clinical accuracy of continuous glucose monitoring systems: continuous glucose-error grid analysis (CG-EGA). Curr Diabetes Rev 2008 Aug;4(3):193-199. [CrossRef] [Medline]
- Sivananthan S, Naumova V, Man CD, Facchinetti A, Renard E, Cobelli C, et al. Assessment of blood glucose predictors: the prediction-error grid analysis. Diabetes Technol Ther 2011 Aug;13(8):787-796. [CrossRef] [Medline]
- Clarke WL, Cox D, Gonder-Frederick LA, Carter W, Pohl SL. Evaluating clinical accuracy of systems for self-monitoring of blood glucose. Diabetes Care 1987;10(5):622-628. [CrossRef] [Medline]
- Parkes JL, Slatin SL, Pardo S, Ginsberg BH. A new consensus error grid to evaluate the clinical significance of inaccuracies in the measurement of blood glucose. Diabetes Care 2000 Aug;23(8):1143-1148 [FREE Full text] [CrossRef] [Medline]
- Pfützner A, Klonoff DC, Pardo S, Parkes JL. Technical aspects of the Parkes error grid. J Diabetes Sci Technol 2013 Sep 1;7(5):1275-1281 [FREE Full text] [CrossRef] [Medline]
- Marling C, Bunescu R. The OhioT1DM Dataset for Blood Glucose Level Prediction. 2018 Presented at: Proc 3rd Int Workshop on Knowl Discov in Healthcare Data; 2018; Stockholm, Sweden URL: http://ceur-ws.org/Vol-2148/paper09.pdf
- Keenan DB, Mastrototaro JJ, Voskanyan G, Steil GM. Delays in minimally invasive continuous glucose monitoring devices: a review of current technology. J Diabetes Sci Technol 2009 Sep 1;3(5):1207-1214 [FREE Full text] [CrossRef] [Medline]
- Rebrin K, Sheppard NF, Steil GM. Use of subcutaneous interstitial fluid glucose to estimate blood glucose: revisiting delay and sensor offset. J Diabetes Sci Technol 2010 Sep 1;4(5):1087-1098 [FREE Full text] [CrossRef] [Medline]
- Basu A, Dube S, Veettil S, Slama M, Kudva YC, Peyser T, et al. Time lag of glucose from intravascular to interstitial compartment in type 1 diabetes. J Diabetes Sci Technol 2015 Jan;9(1):63-68 [FREE Full text] [CrossRef] [Medline]
- Cobelli C, Schiavon M, Dalla Man C, Basu A, Basu R. Interstitial fluid glucose is not just a shifted-in-time but a distorted mirror of blood glucose: insight from an in Silico study. Diabetes Technol Ther 2016 Aug;18(8):505-511 [FREE Full text] [CrossRef] [Medline]
- Fath M, Danne T, Biester T, Erichsen L, Kordonouri O, Haahr H. Faster-acting insulin aspart provides faster onset and greater early exposure vs insulin aspart in children and adolescents with type 1 diabetes mellitus. Pediatr Diabetes 2017 Dec;18(8):903-910. [CrossRef] [Medline]
- Hayeri A. Predicting future glucose fluctuations using machine learning and wearable sensor data. 2018 May Presented at: ADA 78th Scientific Sessions; 2018; Orlando, FL p. 738-P URL: https://diabetes.diabetesjournals.org/content/67/Supplement_1/738-P [CrossRef]
- Hayeri A. Diabits - an AI-powered smartphone application for blood glucose monitoring and predictions. 2019 Jun Presented at: ADA 79th Scientific Sessions; 2019; San Francisco, CA p. 922-P URL: https://diabetes.diabetesjournals.org/content/68/Supplement_1/922-P [CrossRef]
- Gough DA, Kreutz-Delgado K, Bremer TM. Frequency characterization of blood glucose dynamics. Ann Biomed Eng 2003 Jan;31(1):91-97. [CrossRef] [Medline]
- Man CD, Micheletto F, Lv D, Breton M, Kovatchev B, Cobelli C. The UVA/PADOVA type 1 diabetes simulator: new features. J Diabetes Sci Technol 2014 Jan;8(1):26-34 [FREE Full text] [CrossRef] [Medline]
- Mason L, Baxter J, Bartlett P. Boosting Algorithms as Gradient Descent. 1999 Presented at: Advances in Neural Information Processing Systems Proceedings; 1999; Denver, CO p. 512-518 URL: https://dl.acm.org/doi/10.5555/3009657.3009730 [CrossRef]
- Drucker H, Burges C, Kaufman L. Support Vector Regression Machines. 1996 Presented at: Advances in Neural Information Processing Systems Proceedings; 1996; Denver, CO p. 155-161 URL: https://dl.acm.org/doi/10.5555/2998981.2999003 [CrossRef]
- Leach P, Mealling M, Salz R. A Universally Unique IDentifier (UUID) URN Namespace. Internet Engineering Task Force. URL: https://tools.ietf.org/html/rfc4122
- 45 CFR 46. The US Department of Health and Human Services (HHS). URL: https://www.hhs.gov/ohrp/regulations-and-policy/regulations/45-cfr-46/index.html [accessed 2020-06-15]
- Battelino T, Danne T, Bergenstal RM, Amiel SA, Beck R, Biester T, et al. Clinical targets for continuous glucose monitoring data interpretation: recommendations from the international consensus on time in range. Diabetes Care 2019 Aug;42(8):1593-1603 [FREE Full text] [CrossRef] [Medline]
- Bergenstal RM, Beck RW, Close KL, Grunberger G, Sacks DB, Kowalski A, et al. Glucose management indicator (GMI): a new term for estimating A1C from continuous glucose monitoring. Diabetes Care 2018 Nov;41(11):2275-2280 [FREE Full text] [CrossRef] [Medline]
- Kovatchev B, Straume M, Cox D, Farhy L. Risk Analysis of Blood Glucose Data: A Quantitative Approach to Optimizing the Control of Insulin Dependent Diabetes. Journal of Theoretical Medicine 2000;3(1):1-10 [FREE Full text] [CrossRef]
- Miller RG. Simultaneous Statistical Inference. New York, NY: Springer; 1966.
- Kovatchev BP, Cox DJ, Kumar A, Gonder-Frederick L, Clarke WL. Algorithmic evaluation of metabolic control and risk of severe hypoglycemia in type 1 and type 2 diabetes using self-monitoring blood glucose data. Diabetes Technol Ther 2003;5(5):817-828. [CrossRef] [Medline]
- Rokach L. Ensemble-based classifiers. Artif Intell Rev 2009 Nov 19;33(1-2):1-39 [FREE Full text] [CrossRef]
|BCT: Bio Conscious Technologies, Inc|
|BGLP: Blood Glucose Level Prediction|
|CGM: continuous glucose monitoring|
|GMI: glucose management indicator|
|HBGI: high blood glucose risk index|
|LBGI: low blood glucose risk index|
|RMSE: root mean square error|
|SVM: support vector machine|
|T1DM: type 1 diabetes mellitus|
|TIR: time in euglycemic range|
Edited by K Mizokami-Stout; submitted 10.03.20; peer-reviewed by C Toffanin, S Mougiakakou, R Mpofu, C Basch; comments to author 26.04.20; revised version received 19.06.20; accepted 30.07.20; published 22.09.20
Copyright
©Stan Kriventsov, Alexander Lindsey, Amir Hayeri. Originally published in JMIR Diabetes (http://diabetes.jmir.org), 22.09.2020.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Diabetes, is properly cited. The complete bibliographic information, a link to the original publication on http://diabetes.jmir.org/, as well as this copyright and license information must be included.