My Life Stats: I Tracked My Habits for a Year, and This Is What I Learned | by Pau Blasco i Roca | Nov, 2023 | Towards Data Science
Then why do this? Routines, like any other method of self-accountability, help me in
lots of different ways. I started this at a low point in my life, trying to study myself
and how different habits could be impacting my mood and mental health. The point
was to be able to “hack” my own brain: if I knew, statistically, what made me
happy and healthy in the long run (and what did the opposite!), I would be able to
improve my life, and potentially give tips to or help people similar to me going through
rough times.
https://towardsdatascience.com/my-life-stats-i-tracked-my-habits-for-a-year-and-this-is-what-i-learned-4f9c3d374889 1/28
28/11/2023, 03:02 My Life Stats: I Tracked My Habits for a Year, and This Is What I Learned | by Pau Blasco i Roca | Nov, 2023 | Towards Data …
study it! Luckily, data is everywhere — you just need to look in the right spot and
keep track of it.
The variables I measured changed a bit over the year: some new ones popped up, some
disappeared and others merged together. The final ones, which are the ones for which I
have data across all the records, are the following: Sleep, Writing, Studying, Sport,
Music, Hygiene, Languages, Reading, Socializing, and Mood. That is a total of ten
variables, covering what I believe to be the most important aspects of my life.
Regarding the data contained in the footnote of each plot: the total is the sum of the
values of the series, the mean is the arithmetic mean of the series, the STD is the
standard deviation and the relative deviation is the STD divided by the mean.
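The four footnote values described above can be reproduced in a few lines. This is only a minimal sketch: the `footnote_stats` helper and the sample hours are made up for illustration, and it assumes the sample standard deviation (pandas' default), which may differ from what the original spreadsheet used.

```python
import pandas as pd

def footnote_stats(series: pd.Series) -> dict:
    """Total, mean, STD and relative deviation of one tracked variable."""
    total = series.sum()
    mean = series.mean()
    std = series.std()  # assumption: sample standard deviation (ddof=1)
    return {
        "total": total,
        "mean": mean,
        "std": std,
        "relative_deviation": std / mean,  # STD divided by the mean
    }

# Hypothetical data: hours slept over five days.
hours = pd.Series([7.0, 8.0, 6.0, 7.5, 6.5])
stats = footnote_stats(hours)
print(stats)
```

A relative deviation above 100% simply means the STD is larger than the mean, which is what happens for series with many zero or near-zero days.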
Total: 2361h. Mean: 7.1h. STD: 1.1h. Relative deviation: 15.5% (image by author).
All things accounted for, I did well enough with sleep. I had rough days, like
everyone else, but I think the trend is pretty stable. In fact, it is one of the least
variable series in my study.
Total: 589.1h. Mean: 1.8h. STD: 2.2h. Relative deviation: 122% (image by author).
These are the hours I dedicated to my academic career. It fluctuates a lot — finding
balance between work and studying often means having to cram projects on the
weekends — but still, I consider myself satisfied with it.
Total: 1440.9h. Mean: 4.3h. STD: 4.7h. Relative deviation: 107% (image by author).
Regarding this chart, all I can say is that I’m surprised. The grand total is greater
than I expected, given that I’m an introvert. Of course, hours with my colleagues at
college also count. In terms of variability, the STD is really high, which makes sense
given how difficult it is to have an established routine for socializing.
This is the least variable series: the relative deviation is the lowest among my studied
variables. A priori, I’m satisfied with the observed trend. I think it’s positive to keep a
fairly stable mood, and even better if it’s a good one.
Correlation study
After looking at the trends for the main variables, I decided to dive deeper and study
the potential correlations² between them. Since my goal was being able to
mathematically model and predict (or at least explain) “Mood”, correlations were an
important metric to consider. From them, I could extract relationships like the
following: “the days that I study the most are the ones that I sleep the least”, “I
usually study languages and music together”, etc.
Before we do anything else, let’s open up a Python file and import some key libraries
for series analysis. I normally use aliases for them, as it is common practice and
makes the actual code less verbose.
We will make two different studies regarding correlation. We will look into the
Pearson Correlation Coefficient³ (for linear relationships between variables) and the
Spearman Correlation Coefficient⁴ (which studies monotonic relationships between
variables). We will be using their implementation⁵ in pandas.
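To make the linear versus monotonic distinction concrete, here is a small sketch on made-up data (the `toy` frame is purely illustrative and not part of the habit dataset). It uses the `method` argument of pandas' `DataFrame.corr`:

```python
import numpy as np
import pandas as pd

# Hypothetical data: y grows with x, but not along a straight line.
x = np.arange(1, 21, dtype=float)
toy = pd.DataFrame({"x": x, "x_cubed": x ** 3})

pearson = toy.corr(method="pearson").loc["x", "x_cubed"]
spearman = toy.corr(method="spearman").loc["x", "x_cubed"]

print(f"Pearson:  {pearson:.3f}")   # below 1: the relationship is not linear
print(f"Spearman: {spearman:.3f}")  # 1: the relationship is perfectly monotonic
```

Spearman reaches 1 because the ordering of the two columns is identical, while Pearson is penalized by the curvature.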
And these are the significant values⁶: the ones that are, with 95% confidence,
different from zero. We perform a t-test⁷ with the following formula. For each
correlation value rho, we discard it if:
|rho| < 2 / sqrt(n)
where n is the sample size. We can recycle the code from before and add in this
filter.
import numpy as np
import pandas as pd

# constants
N = 332                 # number of samples
STEST = 2 / np.sqrt(N)  # approximate 95% significance threshold for |rho|

def significance_pearson(val):
    # returns True when the coefficient should be masked (not significant)
    return np.abs(val) < STEST

# read data
raw = pd.read_csv("final_stats.csv", sep=";")
numerics = raw.select_dtypes('number')

# calculate correlation
corr = numerics.corr(method='pearson')

# prepare masks (True = hide the cell); column-wise Series.map is used
# because DataFrame.applymap is deprecated in recent pandas versions
mask = corr.apply(lambda col: col.map(significance_pearson))
mask2 = np.triu(np.ones_like(corr, dtype=bool))  # remove upper triangle
mask_comb = np.logical_or(mask, mask2)
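To inspect which coefficients survive the filter without drawing a heatmap, one option is pandas' `DataFrame.mask`, which blanks out every cell where the mask is True. The sketch below runs on synthetic data (the column names and the random generator seed are assumptions made for the example, not the real habit table):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the habit table.
rng = np.random.default_rng(0)
n = 332
sleep = rng.normal(7, 1, n)
toy = pd.DataFrame({
    "Sleep": sleep,
    "Studying": 10 - sleep + rng.normal(0, 0.5, n),  # strongly tied to Sleep
    "Noise": rng.normal(0, 1, n),                     # unrelated column
})

corr = toy.corr(method="pearson")
threshold = 2 / np.sqrt(n)  # about 0.11 for n = 332

not_significant = corr.abs() < threshold
upper_triangle = np.triu(np.ones_like(corr, dtype=bool))
hide = np.logical_or(not_significant, upper_triangle)

filtered = corr.mask(hide)  # hidden cells become NaN
print(filtered.round(2))
```

With 332 samples, any coefficient smaller than roughly 0.11 in absolute value is treated as indistinguishable from zero.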
Those that have been discarded could just be noise, and could wrongly suggest trends
or relationships. In any case, it’s better to treat a true relationship as meaningless
than to consider meaningful one that isn’t (that is, we favor type II errors
over type I errors). This is especially true in a study with rather subjective
measurements.
Filtered Pearson Correlation matrix. Non-significant values (and the upper triangular) have been filtered out.
(image by author)
where R indicates the rank variable⁸; the rest of the variables are the same ones as described for the Pearson
coefficient.
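In other words, Spearman's coefficient is Pearson's coefficient computed on ranks instead of raw values. A quick sketch on random data (the `toy` frame and seed are illustrative assumptions) shows that the two routes agree in pandas:

```python
import numpy as np
import pandas as pd

# Hypothetical data: two unrelated random columns.
rng = np.random.default_rng(42)
toy = pd.DataFrame({"a": rng.normal(size=50), "b": rng.normal(size=50)})

spearman_direct = toy.corr(method="spearman")
spearman_via_ranks = toy.rank().corr(method="pearson")  # Pearson on the ranks

print(np.allclose(spearman_direct, spearman_via_ranks))
```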
This is the raw Spearman’s Rank Correlation matrix obtained from my data:
Let’s see what values are actually significant. The formula to check for significance
is the following:
t = r * sqrt((n - 2) / (1 - r^2))
where r is Spearman’s coefficient. Here, t follows a Student’s t-distribution with n-2 degrees of freedom.
We will filter out all t-values lower (in absolute value) than 1.96. Again, the
reason they are discarded is that we are not sure whether they are noise
(random chance) or an actual trend. Let’s code it up:
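import numpy as np
import pandas as pd

# constants
N = 332       # number of samples
TTEST = 1.96  # two-sided 95% critical value

def significance_spearman(val):
    # returns True when the coefficient should be masked (hidden)
    if val == 1:
        return True  # diagonal: skip it and avoid dividing by zero
    t = val * np.sqrt((N - 2) / (1 - val * val))
    return np.abs(t) < TTEST  # not significant at the 95% level

# read data
raw = pd.read_csv("final_stats.csv", sep=";")
numerics = raw.select_dtypes('number')

# calculate correlation
corr = numerics.corr(method='spearman')

# prepare masks (True = hide the cell); column-wise Series.map is used
# because DataFrame.applymap is deprecated in recent pandas versions
mask = corr.apply(lambda col: col.map(significance_spearman))
mask2 = np.triu(np.ones_like(corr, dtype=bool))  # remove upper triangle
mask_comb = np.logical_or(mask, mask2)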
I believe this chart better explains the apparent relationships between variables, as
its criterion is more “natural” (it considers monotonic⁹, and not only linear,
functions and relationships). It’s not as impacted by outliers as the other one (a
couple of very bad days related to a certain variable won’t impact the overall
correlation coefficient).
Still, I will leave both charts for the reader to judge and extract their own
conclusions.
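The point about outliers can be illustrated with a small made-up series: thirty days where two variables move together perfectly, except for one extreme day. The numbers below are invented for the example and are not taken from the habit data.

```python
import numpy as np
import pandas as pd

# Hypothetical data: y equals x, except for one extreme value in the middle.
x = np.arange(1, 31, dtype=float)
y = x.copy()
y[14] = 300.0  # a single "very bad day"

toy = pd.DataFrame({"x": x, "y": y})
pearson = toy.corr(method="pearson").loc["x", "y"]
spearman = toy.corr(method="spearman").loc["x", "y"]

print(f"Pearson:  {pearson:.2f}")   # collapses because of the single outlier
print(f"Spearman: {spearman:.2f}")  # stays close to 1
```

Because Spearman only looks at ranks, the outlier can move at most from one rank to another, no matter how extreme its value is.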
autocorrelated¹⁰. For example, a bad night might make me sleepy and cause me to
oversleep the next day — that would be a time-wise correlation. In this section, I will
be focusing only on the variables of the initial exploration.
Let’s explore the ARIMA model and find a good fit for our data. An ARIMA¹¹ model
is a combination of an autoregressive model (AR¹²) and a moving average model, hence
its initials (AutoRegressive Integrated Moving Average). In this case, we will use
pmdarima’s auto_arima method, a function inspired by R’s “forecast::auto.arima”
function, to determine the coefficients for our model.
from pmdarima import arima  # provides auto_arima

for v in ['Sleep','Studying','Socializing','Mood']:
    arima.auto_arima(numerics[v], trace=True)  # trace=True to see results
Surprisingly, Sleep is not autoregressive, but Mood seems to be! As we can see, a
simple ARIMA(1,0,0) — an AR(1) — represents Mood fairly well. This implies that the
Mood from day D is explained by the Mood from day D-1, or the day before, and
some normally distributed noise.
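To see what an AR(1) process looks like in practice, the following sketch simulates one and recovers its coefficient with a plain least-squares fit of each day on the previous one. The values of `phi`, `c` and the noise level are arbitrary assumptions for the illustration; they are not the coefficients fitted on the Mood series.

```python
import numpy as np

# Hypothetical AR(1): mood[d] = c + phi * mood[d-1] + noise
rng = np.random.default_rng(1)
phi, c, n = 0.6, 2.0, 5000

mood = np.empty(n)
mood[0] = c / (1 - phi)  # start at the long-run mean
for d in range(1, n):
    mood[d] = c + phi * mood[d - 1] + rng.normal(0, 0.5)

# Regress today's value on yesterday's: the slope estimates phi,
# the intercept estimates c.
phi_hat, c_hat = np.polyfit(mood[:-1], mood[1:], deg=1)
print(f"phi_hat = {phi_hat:.2f}, c_hat = {c_hat:.2f}")
```

A coefficient close to zero would mean that yesterday's value carries almost no information about today's, which is roughly what "not autoregressive" means for Sleep.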
Here is another example: We have a signal made out of two sine functions with
frequency 1 and 10 respectively. After applying the FT, we see this:
As we can see, FFT decomposes signals into their frequency components (image from Wikimedia
Commons)
The result is a plot with two peaks, one at x=1 and one at x=10. The Fourier
Transform has found the base components of our signal!
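The two-sine example can be reproduced with NumPy alone. The sampling rate and signal length below are assumptions chosen so that both frequencies fall exactly on an FFT bin; they are not taken from the original figure.

```python
import numpy as np

fs = 200               # samples per second (assumed for this sketch)
n = 400                # 2 seconds of signal
t = np.arange(n) / fs

# Sum of two sines with frequencies 1 and 10.
signal = np.sin(2 * np.pi * 1 * t) + 0.5 * np.sin(2 * np.pi * 10 * t)

spectrum = np.abs(np.fft.rfft(signal))  # magnitude of each frequency bin
freqs = np.fft.rfftfreq(n, d=1 / fs)    # frequency of each bin

# The two largest peaks sit at the two base frequencies.
top_two = np.sort(freqs[np.argsort(spectrum)[-2:]])
print(top_two)
```

`np.fft.rfftfreq` takes care of converting bin indices into actual frequencies, which avoids confusing the two when reading the plot.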
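import matplotlib.pyplot as plt

for v in ['Sleep','Studying','Socializing','Mood']:
    t = np.arange(0, N, 1)       # time axis, in days
    x = numerics[v]
    X = np.fft.fft(x)            # discrete Fourier transform of the series
    n = np.arange(0, len(X), 1)  # FFT bin index (cycles per N-day record)
    T = N
    freq = n / T                 # the same bins in cycles per day (not plotted)
    # left panel: the raw series
    plt.subplot(121)
    plt.plot(t, x, 'r')
    plt.xlabel('Time (days)')
    plt.ylabel(v)
    # right panel: FFT magnitude against the bin index n
    plt.subplot(122)
    plt.stem(n, np.abs(X), 'b', markerfmt=" ", basefmt="-b")
    plt.xlabel('FFT bin n (freq = n/N cycles per day)')
    plt.ylabel('FFT |X(freq)|')
    plt.xlim(0, 30)
    plt.ylim(0, 500)
    plt.tight_layout()
    plt.show()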
Back to our case study, these are the results that our code outputs:
Left to right and top to bottom: charts for Sleep, Studying, Socializing and Mood. (image by author)
We can observe that Sleep has a significant value at frequency 1. Since the x-axis of
the chart is the FFT bin index, this corresponds to a single cycle spanning the whole
record rather than a short-term pattern, which is not very helpful. Studying presents
interesting values too: the first five or so are noticeably higher than the others.
Unfortunately, noise takes over for them and for every other chart, so no conclusion
can be obtained with certainty.
To counteract it, we filter out the noise with a moving average. Let’s try applying
MA(5) again and studying the FFT. The code will be almost the same except for the
moving average.
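The `moving_average` helper used in the code that follows is not shown in this section. A minimal sketch of such a helper, assuming a simple (unweighted) window and an output of length N-k+1 to match the time axis used in the loop, could look like this:

```python
import numpy as np

def moving_average(x, k):
    """Simple moving average over a window of k samples (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # mode="valid" keeps only full windows, so the result has len(x) - k + 1 points
    return np.convolve(x, np.ones(k) / k, mode="valid")

smoothed = moving_average(np.arange(10), 5)
print(smoothed)
```

Each output point is the mean of k consecutive days, which damps day-to-day noise at the cost of losing k-1 points from the series.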
k = 5  # moving average window, in days
for v in ['Sleep','Studying','Socializing','Mood']:
    t = np.arange(0, N-k+1, 1)          # the smoothed series has N-k+1 points
    # moving_average is the helper introduced earlier in the article
    x = moving_average(numerics[v], k)
    X = np.fft.fft(x)
    n = np.arange(0, len(X), 1)         # FFT bin index
    T = N-k+1
    freq = n / T                        # the same bins in cycles per day (not plotted)
    # left panel: the smoothed series
    plt.subplot(121)
    plt.plot(t, x, 'r')
    plt.xlabel('Time (days)')
    plt.ylabel(v)
    # right panel: FFT magnitude against the bin index n
    plt.subplot(122)
    plt.stem(n, np.abs(X), 'b', markerfmt=" ", basefmt="-b")
    plt.xlabel('FFT bin n (freq = n/T cycles per day)')
    plt.ylabel('FFT |X(freq)|')
    plt.xlim(0, 30)
    plt.ylim(0, 500)
    plt.tight_layout()
    plt.show()
Left to right and top to bottom: charts for Sleep, Studying, Socializing and Mood. (image by author)
After applying the MA, the noise has been slightly reduced. Still, it seems that there
are no conclusions to be extracted from these — we can’t find any significant, clear
frequency values.
Conclusions
After making different statistical studies, we can conclude the expected: human
behaviour is very complicated, more, of course, than an Excel sheet and a couple
of mathematical models can account for. Still, there’s value to be found in both
methodical data collection and the opportunities for analysis that arise from it.
Let’s take a quick look at what we’ve done:
After doing these analyses, we were able to draw some insights about our data and
how the different variables correlate with each other. Here is the summary of our
findings.
In terms of relative deviation (variability), Mood and Sleep were the lowest
(11.3% and 15.5% respectively), while Studying and Socializing were both above
100%.
Socializing was found to be negatively correlated with almost all my hobbies, but
positively correlated with my Mood (in both Pearson and Spearman). This is
probably due to how when I meet with friends or family, I have to put my
hobbies aside for the day, but I am generally happier than I would be by myself.
Mood and Studying were found to be autoregressive by the ARIMA fitting study,
implying that the value on a certain day can be explained by the one before it.
It is also worth noting that we got interesting “global” stats, which are, if not
scientifically meaningful, interesting to know.
On a personal level, I think that this experiment has been helpful for me. Even if the
final results are not conclusive, I believe that it helped me cope with the bad times
and keep track of the good ones. Likewise, I think it is always positive to do some
introspection and get to know oneself a bit better.
As a final bit, this is the cumulative chart — made again in MS Excel — for all the
variables that could be accumulated (each one except mood and hygiene, which are
not counted in hours but in a certain ranking; and sleep). I decided to plot it as a
logarithmic chart because even if the accumulated variables were linear, their
varying slopes made it hard for the viewer to see the data. That’s it! Enjoy!
As always, I encourage you to comment any thoughts or doubts you might have.
GitHub - Nerocraft4/habittracker
References
[1] Wikipedia. Moving Average. https://en.wikipedia.org/wiki/Moving_average