This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Diabetes, is properly cited. The complete bibliographic information, a link to the original publication on http://diabetes.jmir.org/, as well as this copyright and license information must be included.
Type 2 diabetes is the most expensive chronic disease in the United States. Two-thirds of US adults have prediabetes or are overweight and at risk for type 2 diabetes. Intensive in-person behavioral counseling can help patients lose weight and make healthy behavior changes to improve their health outcomes. However, with the shortage of health care providers and associated costs, such programs do not adequately serve all patients who could benefit. The health care system needs effective and cost-effective interventions that can lead to positive health outcomes at scale. This study investigated the ability of conversational artificial intelligence (AI), in the form of a standalone, fully automated text-based mobile coaching service, to promote weight loss and other health behaviors related to diabetes prevention. This study also measured user acceptability of AI coaches as alternatives to live health care professionals.
The objective of this study was to evaluate weight loss, changes in meal quality, and app acceptability among users of the Lark Weight Loss Health Coach AI (HCAI), with the overarching goal of increasing access to compassionate health care via mobile health. Lessons learned in this study can be applied when planning future clinical trials to evaluate HCAI and when designing AI to promote weight loss, healthy behavior change, and prevention and self-management of chronic diseases.
This was a longitudinal observational study among overweight and obese (body mass index ≥25) participants who used HCAI, which encourages weight loss and healthy diet choices through elements of cognitive behavioral therapy. Weight loss, meal quality, physical activity, and sleep data were collected through user input and, for sleep and physical activity, partly through automatic detection by the user’s mobile phone. User engagement was assessed by duration and amount of app use. A 4-question in-app user trust survey assessed app usability and acceptability.
Data were analyzed for participants (N=70) who met the engagement standards set forth in the Centers for Disease Control and Prevention criteria for the Diabetes Prevention Program, a clinically proven weight loss program focused on preventing diabetes. Weight loss (standard error of the mean) was 2.38% (0.69%) of baseline weight. The average duration of app use was 15 (SD 1.0) weeks, and users averaged 103 sessions each. Predictors of weight loss included duration of AI use, number of counseling sessions, and number of meals logged. Percentage of healthy meals increased by 31%. The in-app user trust survey had a 100% response rate and positive results, with a satisfaction score of 87 out of 100 and net promoter score of 47.
This study showed that use of an AI health coach is associated with weight loss comparable to in-person lifestyle interventions. It can also encourage behavior changes and have high user acceptability. Research into AI and its application in telemedicine should be pursued, with clinical trials investigating effects on weight, health behaviors, and user engagement and acceptability.
An estimated 30.3 million Americans, or 9.4% of the US population, have type 2 diabetes (T2D). Another 84.1 million, or 33.9% of the adult US population, has prediabetes and is at risk for developing T2D [
The T2D burden is largely attributable to modifiable risk factors [
Lifestyle modification programs can lead to weight loss and reduction of diabetes risk [
Economic resources in the health care system are inadequate for preventive measures such as weight loss and other behavioral changes. Diabetes with complications is among the most expensive conditions billed to Medicare [
Significant progress has been made in leveraging technology to increase efficiency and improve health outcomes, including in chronic disease self-management [
As AI and mobile health technology provide a platform to make health behavior coaching programs more accessible to patients, they can also enable the scaling up of empathy and compassion. Such technology can be designed to be compassionate based on characteristics defined in the literature, such as being one-on-one, individualized, and responsive to patients, and having “empathy plus sympathy” [
The Lark Health Coach AI (HCAI) mobile phone app was designed with goals of achieving healthy behavior change among at-risk users and introducing compassionate care in health care systems to allow patients access to infinitely scalable healthy behavior change coaching and support.
Lark’s AI health coaches mimic health professionals’ empathetic health counseling through casual conversations using empathetic text-based communication and other interactive elements. Lark has a variety of products focusing on chronic conditions including obesity and diabetes. In this study, we looked at Lark’s Weight Loss HCAI, which is a product focused on promoting weight loss and other diabetes-preventing and diabetes-managing behaviors such as achieving and/or maintaining healthy sleep duration [
The HCAI aims to increase compassion in health care according to the definition of compassion: “the feeling that arises in witnessing another’s suffering and that motivates a subsequent desire to help” [
To promote sustainable behavior change and increased self-efficacy, the AI incorporates interactive elements of cognitive behavioral therapy (CBT) such as reflection, legitimization, respect, support, and partnership [
Because of the shortcomings in traditional health care delivery channels to help patients achieve healthy lifestyle changes for lowering T2D risk, and the potential for mobile technologies to provide effective and compassionate interventions, there is a role for conversational AI to provide highly scalable health coaching to effect positive change in behaviors known to lower T2D risk. This study’s objectives were to (1) investigate conversational AI use and relationships with weight loss and meal healthiness, and (2) investigate user engagement and acceptability of the HCAI. We hypothesized that AI users would lose weight and improve meal healthiness.
This was a retrospective study among 239 overweight and obese (body mass index [BMI] of at least 25 kg/m2) adults at one of six primary care offices in Nevada and southern California who were within a provider network that had partnered with Lark for this trial. Patients’ primary care physicians offered the HCAI free of charge to patients meeting the BMI requirement. Additional selection criteria were use of Android or iOS mobile phones and not being previous or current Lark app users. Patients who agreed to use the app had a link to install the app sent to their mobile device during the office visit. No further physician support was provided to patients. Initial use of the app took place from July 2016 to January 2017.
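The BMI eligibility threshold can be illustrated with a minimal sketch. It assumes the standard BMI definition (weight in kilograms divided by the square of height in meters); the function names and sample values are hypothetical and are not taken from the Lark app.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index in kg/m^2, computed from weight in kg and height in cm."""
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def meets_bmi_criterion(weight_kg: float, height_cm: float, minimum: float = 25.0) -> bool:
    """True if BMI is at least `minimum` (25 kg/m^2 is the study's stated cutoff)."""
    return bmi(weight_kg, height_cm) >= minimum


# Hypothetical patient: 98.0 kg and 163 cm.
print(round(bmi(98.0, 163.0), 1))
print(meets_bmi_criterion(98.0, 163.0))
```

The cutoff is passed as a parameter so that the same helper could be reused with another threshold.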
Michigan State University’s Institutional Review Board (IRB) determined that this study was not classified as human subject research and therefore did not require IRB approval.
Lark (Lark Technologies) HCAI has been available for Android and Apple mobile phones since 2015. The HCAI provides weight loss coaching through modules with lessons on topics such as self-monitoring, goal-setting, and action planning, plus unlimited text-based quick counseling sessions to help users achieve behavior change goals. Users can complete the modules within 16 weeks, or they can take longer if they repeat lessons or avoid logging into the app for a week or more. The HCAI learns about users and provides personalized content. Additional human-like coaching aspects include guiding dialogues with users and leading users through goal-setting modules for weight loss and food choices.
When setting up the app, users are asked to enter age, gender, weight, and height, and are guided through content to set a goal weight. Users can choose their goal weight but are discouraged from selecting a goal weight that would put them at an underweight BMI (≤18.5 kg/m2). The HCAI prompts users to enter their weight weekly and to enter meals and snacks. They can also enter their weight measurements and diet consumption anytime (
User weight progress dashboard, where users can enter weight (left two panels) and see a chart of weight change since starting the program (right panel).
Sample portion of a conversation with the AI promoting healthy behavior change through compassion and cognitive behavioral therapy strategies including in-the-moment responsiveness, responsiveness to user input, and reflection.
Sample conversations about goal weight and user weight loss.
Sample conversations following user-logged meals.
Conversation following user-logged bout of physical activity (1 hour, 26-minute run) praising the user for the run, informing the user (left panel) that the run is a good strategy for increasing overall activity, and (center and right panels) comparing the user’s total current activity for the day (green line) to the user’s daily average on weekend days since starting the program (white dashed line).
Each user’s weight loss was calculated as the difference between the final recorded weight and the baseline weight. The primary outcome in this study was percent weight change.
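As a minimal sketch of this outcome, percent weight change can be computed from the two recorded weights. The function name, the sign convention (negative values indicate loss), and the sample values are assumptions made for illustration.

```python
def percent_weight_change(baseline_kg: float, final_kg: float) -> float:
    """Weight change as a percentage of baseline weight (negative means loss)."""
    if baseline_kg <= 0:
        raise ValueError("baseline weight must be positive")
    return (final_kg - baseline_kg) / baseline_kg * 100.0


# Hypothetical user: 100.0 kg at baseline and 97.5 kg at the final recorded entry.
change = percent_weight_change(100.0, 97.5)
print(f"{change:.1f}%")  # prints "-2.5%"
```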
The HCAI classified individual foods and beverages as “healthy” if they promote weight control based on the literature, or if they are nutrient-dense or have predominantly nutrient-dense components (eg, vegetables, whole grains, fruit, nuts, lean proteins, and mixed foods such as vegetarian burgers and Greek salad); and as “unhealthy” if they are associated with weight gain based on the literature and/or contain many empty calories [
Percent healthy and unhealthy meals at baseline were calculated by dividing the total number of healthy and unhealthy, respectively, meals logged by the total number of meals logged (including healthy, unhealthy, and neither) during the first week of logging. Final percent healthy and unhealthy meals were calculated based on the final week that users logged meals.
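The weekly percentages described above can be sketched as follows. The label strings, the function name, and the sample weeks are hypothetical; the sketch assumes each logged meal carries exactly one of the three classifications.

```python
from collections import Counter


def meal_percentages(labels):
    """Percent healthy and unhealthy meals among all meals logged in one week.

    `labels` holds one entry per logged meal: "healthy", "unhealthy", or "neither".
    Returns None when no meals were logged that week.
    """
    if not labels:
        return None
    counts = Counter(labels)
    total = len(labels)  # denominator includes "neither" meals
    return {
        "healthy": 100.0 * counts["healthy"] / total,
        "unhealthy": 100.0 * counts["unhealthy"] / total,
    }


# Hypothetical first and final weeks of logging for one user.
first_week = ["healthy", "unhealthy", "neither", "healthy"]
final_week = ["healthy", "healthy", "neither", "healthy", "unhealthy"]
baseline = meal_percentages(first_week)  # healthy 50.0, unhealthy 25.0
final = meal_percentages(final_week)     # healthy 60.0, unhealthy 20.0
```

Because meals classified as neither count toward the denominator, the two percentages need not sum to 100%.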
Duration of AI use was measured by the time, in weeks, between a user’s first and final use of the app. The number of conversations each user had with the app was also recorded.
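A minimal sketch of the duration measure follows, assuming that first and final use are available as calendar dates and that partial weeks are kept as fractions. The function name and dates are hypothetical.

```python
from datetime import date


def weeks_of_use(first_use: date, final_use: date) -> float:
    """Time between first and final app use, in weeks."""
    if final_use < first_use:
        raise ValueError("final use cannot precede first use")
    return (final_use - first_use).days / 7.0


# Hypothetical user who first used the app on 1 August 2016 and last used it on 14 November 2016.
print(weeks_of_use(date(2016, 8, 1), date(2016, 11, 14)))  # prints 15.0
```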
User satisfaction was assessed by an in-app user trust survey with four questions measuring (1) overall satisfaction, (2) net promoter score (NPS), (3) disappointment if HCAI were not offered, and (4) self-reported health improvement (
User trust survey to determine patient SS, NPS, DS, and HOS.
Measurement | Question text |
SS | How would you rate your overall satisfaction with the Lark Weight Loss Program (where 10 is Very Satisfied and 0 is Very Dissatisfied)? |
NPS | How likely are you to recommend the Lark Weight Loss Program to others (where 10 is Extremely Likely and 0 is Extremely Unlikely)? |
DS | If the need were to arise again in the future, how disappointed would you be if the Lark Weight Loss Program was not available to you (where 10 is Extremely Disappointed and 0 is Not at all disappointed)? |
HOS | As a result of the help you received from the Lark Weight Loss Program, would you say your health is (Much better than before, Somewhat better than before, Neither better nor worse, Somewhat worse, Much worse than before)? |
Data points were user-entered values for age, gender, height, weight, dietary intake, with self-reported anthropometric data [
Participant selection flow. “Active” users recorded conversations with the HCAI in at least 4 separate weeks.
The age variable had 27 missing values, so ages were imputed according to accepted methods [
We examined associations between percent weight loss and a set of selected independent variables using univariate and multivariate analyses. Variables determined to be statistically significant at an alpha of .2 in the univariate analysis were included in multivariate analyses to control for the effects of other variables. Variables were also assessed for collinearity using the variance inflation factor. Generalized regression was used to quantify the independent association between selected covariates and percent change in weight. We applied a weighting factor consisting of the number of entries made per user to normalize the associations. All statistical analyses were conducted using JMP Pro, version 13.1.0 (SAS Institute Inc).
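To illustrate what weighting a regression by entries per user means, the following is a minimal sketch of a weighted least-squares fit with a single predictor. It is not the JMP Pro procedure used in the study; the function name and the data are hypothetical, and the sketch omits the significance testing and collinearity checks described above.

```python
def weighted_linear_fit(x, y, w):
    """Weighted least-squares fit of y = intercept + slope * x.

    Each observation i contributes to the fit in proportion to its weight w[i].
    """
    if not (len(x) == len(y) == len(w)) or not x:
        raise ValueError("x, y, and w must be non-empty and of equal length")
    total_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    if sxx == 0:
        raise ValueError("predictor has no weighted variance")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data: meals logged (x), percent weight change (y), entries per user (w).
meals = [10, 50, 100, 200]
pct_change = [0.9, 0.5, 0.0, -1.0]
entries = [2, 5, 6, 12]
intercept, slope = weighted_linear_fit(meals, pct_change, entries)
```

In this sketch, users with more entries pull the fitted line toward their own data points, which is the general effect of weighting by number of entries.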
Participant baseline characteristics are presented in
Users averaged 103 sessions each over the course of 15.0 weeks, where a session constituted a discrete text-based conversational interaction between the user and the HCAI. Users averaged 2.4 kg or 2.4% weight loss (
Baseline characteristics of app users (N=70)a.
Variables | Mean (SEM) | 95% CI | Range |
Age, years | 46.9 (1.89) | 43.1 to 50.7 | 18 to 76 |
Height, cm | 163 (1.41) | 161 to 167 | 135 to 188 |
Baseline weight, kg | 98.0 (3.16) | 91.7 to 104 | 55 to 219 |
Baseline BMI, kg/m2 | 37.0 (1.40) | 34.1 to 39.9 | 24 to 95 |
aEight lower outliers were replaced with 1.5 sigma of smallest height value without outliers.
Weight change and HCAI use (N=70).
Variable | Mean (SEM) | 95% CI | Range |
Final weight, kg | 95.7 (3.20) | 89.3 to 102 | 54 to 220 |
Final BMI, kg/m2 | 36.0 (1.44) | 33.2 to 38.9 | 24 to 95 |
Weight change, kg | -2.40 (0.82) | -4.03 to -0.77 | -54 to 5 |
Weight change, % | -2.38 (2.4/98) (0.69) | -3.75 to -1.00 | 4 to 44 |
Duration of AI use in weeks | 15.0 (1.0) | 13.1 to 17.0 | 4 to 33 |
Number of conversations with AI | 103 (13.8) | 75.0 to 130 | 5 to 824 |
Number of weight entries | 6.1 (0.6) | 5.0 to 7.3 | 2 to 32 |
Number of meals logged | 68 (8.5) | 49.8 to 84.7 | 0 to 351 |
Healthy meals logged, %a | 59% (40.2/68) (5.71) | 28.9 to 51.7 | 0 to 247 |
Unhealthy meals logged, %b | 11% (7.54/68) (1.16) | 5.24 to 9.85 | 0 to 53 |
aEight lower outliers were replaced with 1.5 sigma of smallest height value without outliers.
bThe percent of healthy plus unhealthy meals does not total 100% because some meals were categorized as neither healthy nor unhealthy.
Factors correlated with weight loss.
Variable | Univariate linear regressiona: estimate (95% CI) | P value | Multivariate generalized regressiona: estimate (95% CI) | P value |
Genderb | 1.52 (-0.30 to 3.34) | .10 | ||
Age, years | 0.02 (-0.021 to 0.056) | .365 | 0.082 (0.075 to 0.09) | <.001 |
-0.002 (-0.003 to -0.002) | <.001 |
Duration of AI use, weeks | 0.004 (-0.115 to 0.123) | .948 | -0.058 (-0.078 to -0.037) | <.001 |
Height, cm | 0.03 (-0.02 to 0.077) | .244 | 0.044 (0.035 to 0.053) | <.001 |
Baseline weight, kg | 0.02 (-0.01 to 0.036) | .187 | -0.008 (-0.012 to -0.004) | <.001 |
Number of conversations with the AI | -0.008 (-0.013 to -0.004) | <.001 | -0.002 (-0.004 to 0.001) | .144 |
Number of meals logged | -0.012 (-0.020 to -0.004) | <.01 | -0.035 (-0.039 to -0.031) | <.001 |
Healthy meals logged | -0.018 (-0.030 to -0.007) | <.01 | ||
Unhealthy meals logged | -0.055 (-0.114 to 0.005) | .072 | 0.088 (0.068 to 0.107) | <.001 |
aRegression weighted by number of entries per user.
bMale-Female difference assessed using the Tukey-Kramer honestly significant difference test.
User trust survey results.
Question | Mean | Standard deviation | Calculated scores |
SS (n=70) | 7.9 | 2.1 | 87b |
NPS (n=76) | 8.3 | 2.3 | 47c |
DS (n=70) | 6.7 | 3.2 | 68d |
HOSa (n=57) | NA | NA | 60e |
aThe HOS was assessed and calculated from a rating scale (“Much worse,” “Somewhat worse,” “Exactly the same,” “Somewhat better,” and “Much better”), so mean and standard deviation could not be calculated.
bPercentage of users who rated satisfaction as 6-10 on a scale of 0-10.
cPercentage of detractors (score 0-6) subtracted from the percentage of promoters (score 9-10) [
dPercentage who rated disappointment if the HCAI were not offered as 6-10.
ePercentage of users who responded their health was “Much better than before” or “Somewhat better than before.”
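The calculated scores defined in the footnotes above can be sketched as follows. The function names and the sample ratings are hypothetical; the cutoffs (6-10 for satisfaction and disappointment, 9-10 for promoters, 0-6 for detractors) are those stated in the footnotes.

```python
def percent_at_or_above(scores, threshold=6):
    """Percentage of 0-10 ratings at or above `threshold` (used for SS and DS)."""
    if not scores:
        raise ValueError("no ratings supplied")
    return 100.0 * sum(s >= threshold for s in scores) / len(scores)


def net_promoter_score(scores):
    """Percentage of promoters (9-10) minus percentage of detractors (0-6)."""
    if not scores:
        raise ValueError("no ratings supplied")
    promoters = sum(s >= 9 for s in scores)
    detractors = sum(s <= 6 for s in scores)
    return 100.0 * (promoters - detractors) / len(scores)


# Hypothetical ratings from 10 respondents.
ratings = [10, 9, 9, 8, 7, 6, 3, 10, 9, 5]
print(percent_at_or_above(ratings))  # prints 80.0
print(net_promoter_score(ratings))   # prints 20.0
```

Ratings of 7 or 8 count toward the denominator of the net promoter score but are neither promoters nor detractors.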
The number of meals logged was significantly correlated with number of conversations. For every additional conversation, users logged approximately 0.6 additional meals (
The in-app user trust survey had a 100% response rate. The average scores for Questions 1 (satisfaction with the program), 2 (likelihood of recommending the program), and 3 (disappointment if not offered) were 7.9, 8.3, and 6.73, respectively. The calculated SS, NPS, DS, and HOS scores were 87, 47, 68, and 60, respectively (
This study showed that users of a conversational AI can lose a magnitude of weight comparable to that achieved with lifestyle change programs with live components among individuals with high diabetes risk. This suggests a value in investigating the potential for patients to use AI to effectively drive positive changes in lifestyle behaviors associated with preventing the development of diabetes. In this study, use of the HCAI was associated with average weight loss of 2.4 kg or 2.4%, which is comparable to a loss of 2.32 kg reported in a meta-analysis of 22 lifestyle intervention studies among individuals with risk factors for diabetes [
A separate review examined the results of trials of Web-based interventions for weight loss among adults [
The weight loss achieved in this study has further implications for public health when considering the Finnish National Diabetes Prevention Program, a community-based program with one-on-one counseling visits or group sessions covering topics such as weight loss, diet quality, and exercise. Despite the in-person component of the program, average weight loss among 919 participants was 1.2%, which is less than the weight loss recorded in our study without an in-person component or the costs associated with it [
Also of note is that most users registered for HCAI between August and October 2016, so a significant proportion of program participation and associated weight loss occurred over the holiday season. This is a time when 51% of annual weight gain is estimated to occur. About half of adults gain 1% of body weight [
The amount of weight loss in this study may be clinically significant for diabetes risk. Weight loss of 1 kg can lower diabetes risk by 16% [
The HCAI users recorded improvements in dietary patterns, as percentage of healthy meals logged increased by 31% and unhealthy meals decreased by 54%. This shift in meal quality indicated increased consumption of healthy food compared to unhealthy foods. The result is another potential decrease in diabetes risk, since even small shifts in diet composition can have significant impacts on diabetes risk [
This study also showed that conversational AI delivered via mobile phone app can have high acceptability among users. The NPS was 47, compared to the health industry average of 18, with the industry leader, Kaiser Permanente, achieving a score of 43 [
Previous studies have investigated the effectiveness of Web-based programs and found mixed results. A recent systematic review of systematic reviews concluded that Web-based programs had consistently better results than no program but were sometimes less effective than traditional, in-person weight control programs [
To be able to accurately claim to be an option for increasing lifestyle change program access to patients, an AI lifestyle coach must achieve health outcomes comparable to those of traditional in-person programs, while being less costly. The weight loss of 2.4% observed in this study is comparable to the 2.3% weight loss reported in a Centers for Disease Control and Prevention Web-based lifestyle modification DPP program among individuals at risk for diabetes [
This is only an early study, but it is important to determine which components of the health coaching app may have contributed to weight loss among users. While the app included logging and tracking features, the program also included health coaching that included educational components and behavior change support based on CBT. A previous study [
The AI was found to have high acceptability among users, which can improve retention in weight loss programs [
A study limitation was its lack of a control group for direct comparison. However, it can be assumed that without a weight loss intervention, a control group would not lose weight and might gain weight since the average annual weight gain among American adults is 0.5-1 kg [
The scarcity of demographic information collected from users could be seen as a limitation of the study, since it is unknown which subpopulations would be likely to achieve similar results if they were to use the app in the future. However, the fact that the average participant lost weight despite lack of screening based on demographics suggests a wider applicability of the app in weight loss interventions.
Because this study was observational and not experimental, another limitation was its inability to determine causality. Participants who had at least one conversation with the HCAI in at least 4 different weeks lost 2.4% of baseline body weight on average, but it was not determined whether app use caused weight loss, or whether weight loss was caused by lifestyle changes resulting from app use or from other causes, or whether weight loss resulted from another cause that was not investigated in this study. Furthermore, because participants were not screened based on weight loss intentions, nor asked follow-up questions regarding behaviors related to weight loss, it is possible that some weight loss could have resulted from causes unrelated to HCAI use or lifestyle changes encouraged by the HCAI. It is conceivable, for example, that participants took weight loss medications or underwent bariatric surgery during the study period. Future research should include an experimental study that includes data collection surrounding possible confounding or other factors related to weight loss.
The AI automatically tracked physical activity according to any motion detected by mobile phone sensors, and users could log their activity manually. Inaccuracies could result if users completed physical activity bouts without carrying their phones (ie, activity was not automatically detected and recorded) and users neglected to manually input these bouts, or if users double-logged physical activity; that is, if their workout was detected and recorded automatically and they separately entered it manually. Similarly, sleep duration was detected and recorded automatically, but users could add, modify, or delete data.
Another limitation was the potential for incomplete or incorrect classification of foods and therefore meals. This could be due to missing foods in the Lark food database or to incorrect classification of foods as healthy, unhealthy, or neutral when users entered ambiguous foods (eg, “chicken salad” could comprise mayonnaise and chicken and be “unhealthy” or comprise chicken and lettuce and be “healthy”).
The cost of chronic diseases comprises 75% of health care costs in the United States [
Comparison of selected characteristics of in-person coaching and health coach artificial intelligence.
Characteristic | In-Person Coaching | HCAI |
Number and frequency of coaching sessions | Sessions can be limited to a certain number per day, week, or program. | Sessions are unlimited. |
Need to schedule appointments | Appointments for coaching sessions may be required. | Users can initiate coaching sessions without an appointment. |
Coaching availability | Coaching may be available only during set hours | Coaching is available anytime: day, night, or weekends. |
Cost of coaching | Insurers, healthcare providers, and/or patients must pay salaries and/or per-session costs of health coaches. | There is no salary or additional per-session cost associated with HCAI. |
Patient level of comfort | Live coaches can be intimidating. | Patients can identify personal challenges without fear of shame or judgement by the HCAI |
The HCAI could potentially improve access to weight loss behavior change interventions. Telemedicine interventions, including those using mobile phones as a means of delivery, can be effective in reaching underserved populations, such as isolated rural communities and inner-city communities without sufficient providers compared to the number of patients [
The HCAI was designed to promote weight loss and healthy lifestyle behaviors in a compassionate experience using conversational AI. It included elements of CBT interventions with in-the-moment responses based on user input including user-initiated conversations about feelings and accomplishments, and user-entered behaviors including weight, food consumption, and physical activity. The HCAI takes a holistic approach, providing both strategic suggestions and emotional support, and aiming to make users feel valued. For example, it responds to a challenge, such as guilt over overeating, by providing an idea about how to approach the situation in the future (“Just let this feeling give you insight/Into how you might want to do things differently next time”) and reminding users of their worthiness (“You are a wonderful and worthy human being, deserving of the best treatment you can give yourself”). The HCAI design also considers the challenge of long-term maintenance of weight loss, since an estimated 80% of those who lose at least 10% of their body weight for at least a year eventually experience regain [
As seen in this study, technology for fully automated health coaching AI is available for real-life applications. Results from this study showed that participants lost weight while using the HCAI, which implies a potential for the HCAI to help patients, and the providers who care for them, achieve loss of excess weight and improved health behaviors. The study also demonstrated the ease of use of the app, since participants received no assistance in installing or using the app, and its engagement and acceptability among overweight and obese participants (
Additional work is underway or being planned to further investigate health coaching AI and its roles in chronic disease management. Health coach AI apps similar to the weight loss‒focused HCAI in this study have been developed and are being used for prediabetes management and for diabetes prevention and management. A version for managing pre-hypertension is also under development.
Current work includes a randomized controlled trial to investigate effects of the AI on aspects of chronic disease management including weight control, diet quality, medication adherence, and home blood pressure monitoring among individuals with pre-hypertension. Another planned study is a retrospective study among individuals with prediabetes who use a version of the health coach that is a DPP. Outcomes include weight loss and self-efficacy.
This study demonstrates AI’s potential to provide compassionate care that is associated with weight loss, increased healthy lifestyle behaviors, and user trust that can reduce diabetes risk.
AI: artificial intelligence
BMI: body mass index
CBT: cognitive behavioral therapy
DPP: Diabetes Prevention Program
DS: disappointment score
HCAI: Lark Health Coach Artificial Intelligence
HOS: health outcome score
NPS: net promoter score
SS: patient satisfaction score
T2D: type 2 diabetes mellitus
This study was funded by Lark Technologies, Inc., which approved this manuscript. Author NS consults with Lark Technologies, Inc.
None declared.