
Exploring Fitbit Data using Fitbit's API

**All code for this project is here on GitHub, along with instructions on how to set up access to the Fitbit API.**

As a new Fitbit owner, former collegiate athlete, and fitness lover, I am incredibly curious about my Fitbit data. I am especially curious about analyses that Fitbit does not already provide.
 

The general outline of this post is as follows:
  1. Activity Level (i.e., Steps) and Sleep

  2. Cadence Change Over Time

  3. Sleep Deficit

  4. Summary and Next Steps

Note: To get my Fitbit data, I first had to set up access to Fitbit's API. Then I ran a separate script to download the data (available here).

 

Question 1: Does the amount of sleep I get the night before predict my activity level?

Notice that the distribution of sleep durations appears to be bimodal. I happen to know that this is because I take naps, which tend to last 2 hours or less. Also, my Fitbit malfunctioned at first, so it would sometimes run out of battery in the middle of the night.

For the purposes of this project, I will omit naps and/or artificially shortened sleep (less than 3 hours).
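The filtering step can be sketched as follows. This is a minimal illustration, not the code from the repo: the record layout, the dates, and the names `sleep_log` and `drop_short_sleep` are assumptions, and the values are made up. Only the 3-hour cutoff comes from the post.

```python
# Hypothetical sleep records: (date, hours asleep). Values are made up.
sleep_log = [
    ("2017-01-02", 7.5),
    ("2017-01-02", 1.5),   # afternoon nap
    ("2017-01-03", 2.0),   # tracker died overnight
    ("2017-01-04", 8.25),
    ("2017-01-05", 6.0),
]

MIN_HOURS = 3.0  # cutoff from the post: anything shorter is dropped


def drop_short_sleep(records, min_hours=MIN_HOURS):
    """Keep only records with at least `min_hours` of sleep."""
    return [(day, hours) for day, hours in records if hours >= min_hours]


night_sleep = drop_short_sleep(sleep_log)
print(night_sleep)
```

A fixed cutoff is blunt: it would also discard a genuinely short night. With naps at 2 hours or less, 3 hours leaves some margin between the two groups.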

With those records removed, the sleep distribution looks pretty normal.

Here, I define activity level as the number of steps taken. My activity distribution looks normal.

Here is my sleep and activity data over the week:

Question 1 Results: My sleep does not predict my activity level the next day.

So, we can see from the regression table that how many hours I slept the previous night ('hours_prev') does not significantly predict the number of steps I take the next day (p = .63). This might be for a variety of reasons.

First, I don't have a lot of data yet! I have only used my Fitbit for a couple of months. Second, I am a graduate student, and my sleep is incredibly variable. Third, I love to be active! Even if I am tired, a run usually makes me feel better. Long story short, I need more data.

That flat line shows that there is no linear relationship.
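For readers who want to see the mechanics behind Question 1, here is a minimal sketch of the lag-and-regress step using only the standard library. It is an illustration under assumptions: the numbers are made up, `ols_fit` and `steps_next` are hypothetical names, and only 'hours_prev' comes from the regression table. It computes just the fitted line. A p-value like the one reported above would come from a statistics package such as statsmodels.

```python
from statistics import mean


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Made-up values, one entry per consecutive day.
hours = [7.0, 6.5, 8.0, 5.5, 7.5, 6.0]
steps = [9000, 11000, 8500, 10000, 12000, 9500]

# hours_prev for day i is the sleep logged on day i - 1,
# so the first day has no predictor and is dropped.
hours_prev = hours[:-1]
steps_next = steps[1:]

intercept, slope = ols_fit(hours_prev, steps_next)
print(f"steps = {intercept:.1f} + {slope:.1f} * hours_prev")
```

A slope near zero relative to the spread of the data corresponds to the flat line in the plot.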

Question 2: Has my cadence changed from my earlier training runs (in December) to my later training runs (in February)?

Cadence is defined as the average number of times your feet hit the ground per minute. In the running community, it is widely believed that a quicker cadence is generally better. (For reference, elite athletes typically have an average cadence in the 180s.)

Here is a look at the data from an entire day to get a sense of the range of cadence values and activity levels.

Minutes with fewer than 130 steps shouldn't be treated as running, so I remove them. These are probably instances of walking my dog or light cardio/weights.
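As a rough sketch of that cutoff, assuming the intraday data has been reduced to one step count per minute: the tuple layout and the names `minute_steps` and `RUN_THRESHOLD` are hypothetical, and the readings are made up.

```python
from statistics import mean

RUN_THRESHOLD = 130  # steps per minute; cutoff taken from the post

# Hypothetical minute-by-minute readings: (time of day, steps in that minute).
minute_steps = [
    ("07:00", 0),
    ("07:01", 95),    # walking
    ("07:02", 158),
    ("07:03", 164),
    ("07:04", 129),   # just under the cutoff, so dropped
    ("07:05", 171),
]

# Keep only the minutes that count as running.
cadences = [count for _, count in minute_steps if count >= RUN_THRESHOLD]

print(cadences)
print(f"mean running cadence: {mean(cadences):.1f} steps/min")
```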

Visualize the Cadence Data:

It looks like there are fewer dips in my later training. The means of the early and later training values (see below) are also different. However, we need to formally test this.

Question 2 Results: Yes, there is a significant change in my cadence.

It looks like there is a significant difference between my cadence during earlier (M = 161.85) and later (M = 169.44) training runs (p = .02). My cadence has become quicker throughout my training. This is important for me as a runner because over-striding can lead to injury and, at the same stride length, a quicker cadence means a faster pace!
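The post does not say which t-test variant was used, so the following is only a sketch of one reasonable choice: Welch's unequal-variance two-sample test, written with the standard library. The function name `welch_t` and the cadence values are made up for illustration. It returns the t statistic and degrees of freedom. Turning those into a p-value needs the t distribution, which a package such as SciPy provides (for example `scipy.stats.ttest_ind` with `equal_var=False`).

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    # Squared standard error of each sample mean.
    se_a = variance(a) / len(a)
    se_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t, df


# Made-up cadence values (steps per minute), not the post's data.
early_runs = [158, 163, 160, 165, 162]
later_runs = [168, 171, 166, 172, 169]

t_stat, df = welch_t(early_runs, later_runs)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

With the arguments in this order, a negative t means the later runs have the higher mean cadence.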

 

Question 3: Do I need more sleep the night after I don't get a lot of sleep? In other words, am I building up a sleep deficit?

In the model below, I predict the difference in amount of sleep for a given day from the previous night's hours of sleep. To do so, I created a variable to capture the difference between the amount of sleep I got on a given night and the amount I got the night before. I then used an ordinary least squares regression model.
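Constructing that difference variable might look like the sketch below. The name `hours_diff`, the helper `slope`, and the sleep values are assumptions for illustration. Only 'hours_prev' appears in the regression table.

```python
from statistics import mean

# Made-up nightly sleep totals in hours, one per consecutive night.
hours = [8.0, 5.0, 8.0, 5.0, 7.0, 6.0]

# Predictor: sleep on the previous night.
hours_prev = hours[:-1]
# Outcome: tonight's sleep minus the previous night's sleep.
hours_diff = [curr - prev for prev, curr in zip(hours, hours[1:])]


def slope(x, y):
    """Least-squares slope of y regressed on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


print(list(zip(hours_prev, hours_diff)))
print(f"slope of hours_diff on hours_prev: {slope(hours_prev, hours_diff):.2f}")
```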

Question 3 Results: Yes, I have built up a sleep deficit.

The regression table shows that how many hours I slept the previous night ('hours_prev') negatively predicts the difference in amount of sleep I get the next day. Put more simply, this means that if I get a good night's sleep, the next night, I don't need as much sleep. Conversely, if I don't get a lot of sleep, the next night, I make up for it by sleeping longer.

To make this finding easier to digest, let us visualize the relationship.

Here is the negative linear relationship as seen in our regression table.

 

Summary and Next Steps

In sum, I imported data using Fitbit's API, worked with JSON data, visualized my data, ran linear models, and performed a t-test. I found that (1) how much sleep I get does not significantly predict my activity level the next day, (2) my cadence has become quicker over the course of my training, and (3) I built up a sleep deficit over time.

What I would like to do next is collect more data! I would also like to create some interactive visuals and run some more sophisticated models on my data. Finally, I would like to more thoroughly explore my heart rate data. The closer I get to defending my dissertation, the higher it seems to get. Should be fun to test if that is statistically significant.

If you have any suggestions on how I can improve, sources that may help improve my code, or comments/questions, please let me know. I am using this space to share and learn.

 

Again, here is the code on my GitHub repo.
