A DIGITAL APPROACH TO STUDY NUTRITION
AND PHARMACEUTICALS
Published at: http://www.archania.org
October 19, 2017
Healthy or carcinogenic?
Malnutrition is believed to contribute to a number of serious lifestyle diseases, such as
cardiovascular diseases, cancer, diabetes type 2, allergies and being overweight. It has been
estimated to cost the global economy 3.5 trillion USD per year, or 5% of the global
gross domestic product[1]. However, many nutritional studies disagree about which foods are
healthy and which are unhealthy. There is, for example, no scientific consensus on whether red
meat is healthy or causes cancer, because the published studies contradict each other. There is
also disagreement about how much carbohydrate, saturated fat and salt people should eat, and
about whether artificial sweeteners are a good substitute for table sugar.
Table 1: Examples of nutritional studies that contradict each other
Food                    Reported as healthy              Reported as harmful
Red meat                Healthy[2,3,4]                   Carcinogenic[5,6,7]
Saturated fat           Healthy[8]                       Cardiovascular diseases[9]
Salt                    Healthy[10]                      Cardiovascular diseases[11]
Carbohydrates           Healthy[12]                      Weight gain[13], diabetes type 2[14]
Artificial sweeteners   Good substitute for sugar[15]    Weight gain[16], glucose intolerance[17]
We cannot necessarily trust pharmaceutical studies
Contradictory studies are less common for pharmaceuticals, but we have good reason to be
suspicious of the pharmaceutical studies that do get published. It would be counterproductive
for companies to invest money in advertisements that disfavor their own products. A scientific
study that shows benefits from the use of a product is good advertisement for the company,
while a study that shows negative consequences is bad advertisement. So we should not be
surprised if the pharmaceutical industry wants to keep us from learning about negative
consequences associated with its products. Since big pharmaceutical companies often fund the
research on their own drugs, they do not always publish studies that show a drug to be
disadvantageous[18]. Let us consider a scenario where a new drug against cancer is
manufactured. Three different studies are published in medical journals, and they all show
that it increases survivability. The weighted average of these three studies will then also
indicate that it increases survivability (Figure 1).
[Forest plot: horizontal axis from -0.75 (decreased survivability) to 0.75 (increased survivability)]
Study      AVG     STD    WEIGHT
Study A    0.20    0.5    6
Study E    0.33    0.4    5
Study H    0.72    0.2    3
Overall    Average: 0.36
Figure 1: Forest plot showing the average of 3 different studies about a hypothetical drug.
However, 5 more studies about the drug were carried out but never published. These studies
show that the drug decreases survivability. If they were available, the average would indicate
that the drug actually decreases survivability (Figure 2).
[Forest plot: horizontal axis from -0.75 (decreased survivability) to 0.75 (increased survivability)]
Study      AVG      STD    WEIGHT
Study A     0.20    0.5    6
Study B    -0.41    0.4    5
Study C    -0.06    0.6    7
Study D    -0.23    0.3    4
Study E     0.33    0.4    5
Study F    -0.58    0.2    3
Study G    -0.72    0.1    2
Study H     0.72    0.2    3
Overall    Average: -0.04
Figure 2: Forest plot showing the average of 8 different studies about a hypothetical drug.
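The two overall values can be reproduced with a short calculation. The sketch below assumes that
the overall value is the mean of the study averages weighted by the WEIGHT column; the text does
not state this, but it is the calculation that matches the plotted numbers. The names STUDIES,
PUBLISHED and weighted_average are chosen here for illustration.

```python
# Hypothetical study results read off Figures 1 and 2: (name, average effect, weight).
STUDIES = [
    ("A", 0.20, 6), ("B", -0.41, 5), ("C", -0.06, 7), ("D", -0.23, 4),
    ("E", 0.33, 5), ("F", -0.58, 3), ("G", -0.72, 2), ("H", 0.72, 3),
]
# Only these three studies were published in the scenario.
PUBLISHED = {"A", "E", "H"}


def weighted_average(studies):
    """Mean of the study averages, weighted by each study's weight (assumed method)."""
    total_weight = sum(weight for _, _, weight in studies)
    return sum(avg * weight for _, avg, weight in studies) / total_weight


published_only = [s for s in STUDIES if s[0] in PUBLISHED]
print(round(weighted_average(published_only), 2))  # 0.36
print(round(weighted_average(STUDIES), 2))         # -0.04
```

Leaving out the five unpublished studies flips the sign of the overall estimate, which is the
effect of selective publication that the scenario describes.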
Unfortunately, studies with negative results are rarely published[18]. This means that the public
is often misinformed about the efficacy and toxicity of pharmaceuticals.
Why nutritional studies often are unreliable
In order to be representative of the general population, most nutritional studies strive to use
a random selection of people. But then they also need to be certain that the selection truly is
random, otherwise it will misrepresent the general population[19]. Unfortunately, it is very
difficult in practice to get a truly random selection of people. This is why the estimates from
election polls can differ significantly from the results of real elections, and it is one of the
main reasons why there are so many contradictory nutritional studies. There is, however, another
option. Just as we can collect votes from the entire population in a presidential election, we
can base our research on the entire population in our modern digitalized society[20].
[Diagram: a random selection of people is generalized to the entire population by inductive
generalization]
Entire population: difficult to measure, but avoids the problem with randomization.
Random selection of people: easy to measure, but can be very difficult to get completely random.
Inductive generalization: unless it is based upon a truly random selection of people, it is
likely to give a misrepresentation of the entire population.
Figure 3: How a random selection of people can be used to estimate the entire population, but
can give a completely wrong estimate if it isn’t truly random. Unfortunately, it is very difficult
in practice to get a completely random selection of people. This problem can be avoided by
using the entire population instead of a random selection of people.
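The effect of a selection that is not truly random can be illustrated with a small simulation.
Everything in this sketch is invented for illustration: the intake distribution, the response
rates of 80% and 20%, and the function names are assumptions, not data from any study.

```python
import random


def mean(values):
    return sum(values) / len(values)


def simulate(seed=0, population_size=100_000, sample_size=2_000):
    """Compare a truly random sample with a self-selected one (all numbers hypothetical)."""
    rng = random.Random(seed)
    # Hypothetical daily intake (grams) of some food for every person in the population.
    population = [rng.gauss(100, 20) for _ in range(population_size)]

    # Truly random selection: every person is equally likely to be picked.
    random_sample = rng.sample(population, sample_size)

    # Self-selected participants: people who eat less than 100 g are assumed to
    # respond 80% of the time, the others only 20% of the time.
    volunteers = [x for x in population if rng.random() < (0.8 if x < 100 else 0.2)]

    return mean(population), mean(random_sample), mean(volunteers), len(volunteers)


true_mean, random_estimate, biased_estimate, n_volunteers = simulate()
print(f"entire population: {true_mean:.1f}")
print(f"random sample of 2000: {random_estimate:.1f}")
print(f"{n_volunteers} self-selected volunteers: {biased_estimate:.1f}")
```

In this setup the self-selected group is far larger than the random sample, yet its estimate is
further from the population mean, because a larger sample does not correct a biased selection.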
Another problem with many nutritional studies is that they are based on self-reporting.
Unfortunately, people do not necessarily have a good overview of how much they consume of
different foods. If we instead base our studies on the groceries people buy, we avoid the
problem with self-reporting, but another problem arises. In families, for example, one person
often buys food for the entire family. This problem can be circumvented, however, if we obtain
information about how many people are living together. If we only use individuals who live
alone in our studies, there is a high likelihood that most of the purchased groceries are
consumed by the individual who buys them. Families can also be considered as units, where we
look at all the groceries bought by the family and the overall health situation of the family.
There might still be anomalies, however. Some people might be living with other people in
secrecy, and some people might give away some of their food to pets or other people. People
might also differ in the amount of food they throw away. However, the nice thing about basing
research upon a large number of individuals is that the more individuals participate in a
study, the less the anomalies matter. A high resolution photo with 50% pixel loss is, for
example, much more recognizable than a low resolution photo with 50% pixel loss, since there
are still many more actual pixels in the high resolution photo (Figure 4). Similarly, it is
much easier to recognize a trend in a study based upon a large number of individuals, even if
it has the same proportion of anomalies.
100 x 75 pixels with 50% pixel loss: $\frac{100 \cdot 75}{2} = 3750$ actual pixels
1000 x 750 pixels with 50% pixel loss: $\frac{1000 \cdot 750}{2} = 375000$ actual pixels
Figure 4: How it is easier to recognize a photo with 50% pixel loss, the larger the resolution is.
To collect information from the entire population without infringing on people’s privacy, all
government institutions and private companies that store sensitive personal data about us
should be obligated to send this data, encrypted, to a secure online account belonging to each
individual (Figure 5). Such an account would show us what government institutions and private
companies know about us, and would give us a better overview of our own lives. We could use
this personal data for our own endeavors, or to participate in research anonymously.
[Diagram: data sources send encrypted data to "Your Secure Online Account", which can forward
it to a research team]
Data sources: fitness centers, hospitals, activity trackers, grocery stores, pharmacies, banks,
Internal Revenue Service.
Encrypted data sent to the account: fitness info, health info, activity info, purchase info,
transactions info, family info.
From the account to the research team: optional participation under an anonymous ID.
Research team: research algorithms find correlations between foods/medicines and health
conditions.
Figure 5: There should be a law that requires all government institutions and private
companies to make the sensitive personal data they store about us available to us in a secure
online account. If we want to use this data for research purposes, it should be possible for us
to send it anonymously from there to a research team.
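One possible way to produce the anonymous ID in Figure 5 is a keyed hash computed inside the
account. This is only a sketch under assumptions: the text does not specify any mechanism, and
the functions new_account_secret, anonymous_id and anonymized_record are hypothetical names.
Strictly speaking the result is a pseudonym, since the purchase data itself could still
identify a person.

```python
import hashlib
import hmac
import secrets


def new_account_secret() -> bytes:
    """Random 32-byte secret that never leaves the (hypothetical) secure account."""
    return secrets.token_bytes(32)


def anonymous_id(account_secret: bytes, study_name: str) -> str:
    """Derive a stable identifier for one study with HMAC-SHA256.

    The same person gets the same ID within a study, so records can be linked
    over time, but a different ID in every other study.
    """
    return hmac.new(account_secret, study_name.encode("utf-8"), hashlib.sha256).hexdigest()


def anonymized_record(account_secret: bytes, study_name: str, purchases: list) -> dict:
    """Package data for the research team without name or account number."""
    return {"anonymous_id": anonymous_id(account_secret, study_name),
            "purchases": list(purchases)}


secret = new_account_secret()
record = anonymized_record(secret, "sleep-study", ["coffee", "cake"])
print(record["anonymous_id"][:16], record["purchases"])
```

Using a per-study identifier is a design choice in this sketch: it lets a long-running study
follow the same participant while making it harder to join records across studies.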
How we can distinguish correlations from causes
When studying foods, it is necessary to get information about all the foods the individuals in
the study have eaten. For example, people who drink a lot of coffee might also eat a lot of
cake. A study on cake might then find a correlation between cake and sleep problems, but only
because those who eat a lot of cake also tend to drink a lot of coffee. To figure out whether
it is coffee or cake that causes sleep problems, we need to look separately at the individuals
who drink coffee but do not eat cake, and at the individuals who eat cake but do not drink
coffee.
[Three bar charts, each showing the correlation with sleep problems on a scale from 0 to 1]
Study based on a random selection of individuals, or the entire population: bars for cake and
coffee.
Study based on individuals that drink coffee, but do not eat cake: bar for coffee.
Study based on individuals that eat cake, but do not drink coffee: bar for cake.
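The coffee and cake example can be worked through with invented numbers. In the sketch below,
the group sizes and the sleep problem counts are hypothetical and are constructed so that only
coffee affects sleep; the names Group, GROUPS and rate are also chosen for illustration.

```python
from collections import namedtuple

Group = namedtuple("Group", "coffee cake people with_sleep_problems")

# Hypothetical counts: 60% of coffee drinkers and 10% of everyone else have
# sleep problems, regardless of cake. Most cake eaters also drink coffee.
GROUPS = [
    Group(coffee=True,  cake=True,  people=400, with_sleep_problems=240),
    Group(coffee=True,  cake=False, people=100, with_sleep_problems=60),
    Group(coffee=False, cake=True,  people=100, with_sleep_problems=10),
    Group(coffee=False, cake=False, people=400, with_sleep_problems=40),
]


def rate(groups):
    """Share of people with sleep problems in the given groups."""
    people = sum(g.people for g in groups)
    return sum(g.with_sleep_problems for g in groups) / people


# Whole population: cake looks harmful because cake eaters tend to drink coffee.
cake_eaters = rate([g for g in GROUPS if g.cake])        # 0.5
non_cake = rate([g for g in GROUPS if not g.cake])       # 0.2

# Restricted subgroups separate the two factors.
cake_only = rate([g for g in GROUPS if g.cake and not g.coffee])      # 0.1
coffee_only = rate([g for g in GROUPS if g.coffee and not g.cake])    # 0.6
neither = rate([g for g in GROUPS if not g.coffee and not g.cake])    # 0.1

print(cake_eaters, non_cake, cake_only, coffee_only, neither)
```

Among people who do not drink coffee, cake eaters have the same rate of sleep problems as
everyone else, so the apparent effect of cake in the whole population comes from coffee.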
The problem is that we do not necessarily know which foods correlate with each other. Only by
looking at all the foods an individual consumes can we get an overview of all the foods that
correlate with each other. It can also be difficult to determine whether a particular health
condition is mainly due to a certain type of food or to physical activity. People who eat a lot
of healthy foods often exercise as well, since they tend to be devoted to their overall health.
It then becomes difficult to say whether it is the exercise or the food that causes a specific
health benefit. In these cases, we would like to study individuals who eat a lot of the food
correlated with the health benefit but exercise very little, or individuals who exercise a lot
but eat very little of that food. So, in order to figure out what causes sleep problems, we
first need to figure out which types of food are overrepresented among people with sleep
problems, and whether lack of exercise (inactivity) is overrepresented among them. In this
simplified model, we assume that inactivity, coffee, cake and milk are overrepresented among
people with sleep problems (Figure 6). We also assume that cake and milk are not really causes
of sleep problems, but are only correlated with sleep problems because they are often consumed
together with coffee.
[Bar chart: correlation with sleep problems (0 to 1) for inactivity, apples, bananas, pears,
milk, tomatoes, oranges, avocado, coffee and cake. Afterwards we study 4 subgroups where each
of the overrepresented factors is excluded.]
Figure 6: Study to figure out what is overrepresented among people with sleep problems. All
foods should be included in such a study, and so should people's activity level. After we have
figured out what is overrepresented among people with sleep problems, we perform new studies
on subgroups where each of the factors is excluded in turn.
The easiest way to figure out whether a type of food causes a health condition is to study
individuals who consume that particular food, but none of the other foods (or other things)
that correlate with the health condition. If the food still correlates with the health
condition, we can assume that the food is involved in causing it. It may, however, be difficult
to find enough such individuals if a large number of foods correlate with the health condition.
Instead, we can take the foods that correlate with the health condition one at a time, and
exclude the people who consume that food. When we exclude people who consume a food that both
causes the health condition and correlates with the consumption of other foods, we will see
that the correlation between the health condition and those other foods also decreases.
Study based on all individuals that are physically active
A study based on all those with a high level of physical activity will likely not show much
change in the correlation between the other factors and sleep problems, since we assume that
coffee is also a cause of sleep problems, while consumption of milk and cake is mainly
correlated with coffee intake, rather than with inactivity.
[Bar chart: correlation with sleep problems (0 to 1) for inactivity, milk, coffee and cake]
Study based on all individuals that do not drink milk
A study based on all those who do not drink milk will likely not show much change in the
correlation between the other factors and sleep problems, since we assume that consumption of
milk is not a cause of sleep problems, but only correlates with the consumption of coffee,
because some people like to have milk in their coffee.
[Bar chart: correlation with sleep problems (0 to 1) for inactivity, milk, coffee and cake]
Study based on all individuals that do not drink coffee
A study based on all those who do not drink coffee will probably also show that milk and cake
correlate less with sleep problems, because some people use milk in their coffee and some
people eat cake when they drink coffee. We do not expect to see a significant reduction in the
correlation between inactivity and sleep problems, since we assume that inactivity is a cause
of sleep problems in itself.
[Bar chart: correlation with sleep problems (0 to 1) for inactivity, milk, coffee and cake]
Study based on all individuals that do not eat cake
A study based on all those who do not eat cake will probably not show much change in the
correlation between the other factors and sleep problems, since we assume that consumption of
cake is not a cause of sleep problems, but only correlates with the consumption of coffee,
because some people eat cake when they drink coffee.
[Bar chart: correlation with sleep problems (0 to 1) for inactivity, milk, coffee and cake]
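The four subgroup studies can be sketched as a loop that excludes one factor at a time and
recomputes the remaining correlations. The population below is synthetic and built to match
the assumptions of the simplified model: coffee and inactivity raise the rate of sleep
problems, while milk and cake only go together with coffee. All counts, rates and function
names are hypothetical.

```python
from itertools import product
from math import sqrt

FACTORS = ("inactivity", "milk", "coffee", "cake")


def build_population():
    """Synthetic population of 10,000 people with invented counts and rates."""
    people = []
    for inactivity, milk, coffee, cake in product((0, 1), repeat=4):
        # Milk and cake are four times as common among coffee drinkers.
        count = 100 * (4 if milk == coffee else 1) * (4 if cake == coffee else 1)
        # Sleep problem rate in percent: only coffee and inactivity matter.
        rate = 10 + 40 * coffee + 30 * inactivity
        n_problem = count * rate // 100
        for i in range(count):
            people.append({"inactivity": inactivity, "milk": milk, "coffee": coffee,
                           "cake": cake, "sleep_problems": 1 if i < n_problem else 0})
    return people


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def correlations(people, factors):
    ys = [p["sleep_problems"] for p in people]
    return {f: pearson([p[f] for p in people], ys) for f in factors}


def exclusion_study(people):
    """Correlations for everyone, then for each subgroup lacking one factor."""
    results = {"everyone": correlations(people, FACTORS)}
    for excluded in FACTORS:
        subgroup = [p for p in people if p[excluded] == 0]
        remaining = [f for f in FACTORS if f != excluded]
        results["without " + excluded] = correlations(subgroup, remaining)
    return results


for name, corr in exclusion_study(build_population()).items():
    print(name, {f: round(c, 2) for f, c in corr.items()})
```

In this synthetic data the correlations of milk and cake with sleep problems vanish only in the
subgroup without coffee, while inactivity stays correlated in every subgroup. Real data would
be noisier, and this simple exclusion scheme is one of several ways to control for a
confounding factor.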
Based on studying these 4 subgroups, we have learned that both inactivity and coffee are
potential causes of sleep problems, while consumption of cake and milk only correlates with
coffee intake. Modern computer algorithms will be able to do such studies very quickly, and
will be able to do several other similar calculations that allow us to acquire even more
information about how different foods correlate with health conditions and with other foods.
By also analyzing which medicines people purchase from pharmacies, we can in a similar way
get information about how different medicines affect us, and about how they interact with
different types of food. Since this study is not intended to have a time limit, the longer it
continues, the more certain we can be that the correlations it recognizes are correct.
Uncertainty will keep decreasing as we move into the future (Figure 7), unlike in the past
decades, when the old way of doing nutritional research just seems to have made the general
population more confused.
[Line chart: uncertainty, on a scale from 0 % to 100 %, decreasing over time]
Figure 7: How the uncertainty of recognized correlations will decrease over time
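One way to quantify how the uncertainty of a correlation shrinks as observations accumulate is
the Fisher z-transformation, a standard approximation that is not mentioned in the text. The
sketch assumes independent observations and covers sampling uncertainty only, not systematic
errors in the data; the correlation of 0.3 and the observation counts are arbitrary examples.

```python
from math import atanh, sqrt, tanh


def correlation_interval(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson correlation r from n observations."""
    z = atanh(r)                       # Fisher z-transformation of the correlation
    half_width = z_crit / sqrt(n - 3)  # standard error of z is 1 / sqrt(n - 3)
    return tanh(z - half_width), tanh(z + half_width)


# The same observed correlation becomes more certain as observations accumulate.
for n in (100, 10_000, 1_000_000):
    low, high = correlation_interval(0.3, n)
    print(f"n = {n:>9}: correlation between {low:.3f} and {high:.3f}")
```

The width of the interval falls roughly with the square root of the number of observations, so
each further halving of the uncertainty needs about four times as much data.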
If we become more certain about which foods are healthy and unhealthy, it will be more
justifiable to put taxes on unhealthy foods and to subsidize healthy foods. It is risky to tax
a food unless we are really certain that it is unhealthy. It is also risky to educate kids
about healthy nourishment unless we are reasonably certain that our ideas about healthy
nourishment are correct. Policies built on ideas that later turn out to be wrong could
generate distrust in the government.
Bibliography
[1] The Food and Agriculture Organization of the United Nations, “The state of food and
agriculture.” http://www.fao.org/publications/sofa/2013/en/, 2013.
[2] A. J. McAfee, E. M. McSorley, G. J. Cuskelly, B. W. Moss, J. M. Wallace, M. P. Bonham,
and A. M. Fearon, “Red meat consumption: An overview of the risks and benefits,” Meat
Science, vol. 84, pp. 1–13, jan 2010.
[3] C. Steppeler, M. Sødring, B. Egelandsdal, B. Kirkhus, M. Oostindjer, O. Alvseike, L. E.
Gangsei, E.-M. Hovland, F. Pierre, and J. E. Paulsen, “Effects of dietary beef, pork,
chicken and salmon on intestinal carcinogenesis in a/j min+ mice,” PLOS ONE, vol. 12,
p. e0176001, apr 2017.
[4] N. D. Turner and S. K. Lloyd, “Association between red meat consumption and colon
cancer: A systematic review of experimental results,” Experimental Biology and Medicine,
vol. 242, pp. 813–839, jan 2017.
[5] A. N. Samraj, O. M. T. Pearce, H. Läubli, A. N. Crittenden, A. K. Bergfeld, K. Banda,
C. J. Gregg, A. E. Bingman, P. Secrest, S. L. Diaz, N. M. Varki, and A. Varki, “A red meat-
derived glycan promotes inflammation and cancer progression,” Proceedings of the National
Academy of Sciences, vol. 112, pp. 542–547, dec 2014.
[6] A. N. Ananthakrishnan, M. Du, S. I. Berndt, H. Brenner, B. J. Caan, G. Casey, J. Chang-
Claude, D. Duggan, C. S. Fuchs, S. Gallinger, E. L. Giovannucci, T. A. Harrison, R. B.
Hayes, M. Hoffmeister, J. L. Hopper, L. Hou, L. Hsu, M. A. Jenkins, P. Kraft, J. Ma, H. Nan,
P. A. Newcomb, S. Ogino, J. D. Potter, D. Seminara, M. L. Slattery, M. Thornquist, E. White,
K. Wu, U. Peters, and A. T. Chan, “Red meat intake, NAT2, and risk of colorectal cancer:
A pooled analysis of 11 studies,” Cancer Epidemiology Biomarkers & Prevention, vol. 24,
pp. 198–205, oct 2014.
[7] P. Song, M. Lu, Q. Yin, L. Wu, D. Zhang, B. Fu, B. Wang, and Q. Zhao, “Red meat
consumption and stomach cancer risk: a meta-analysis,” Journal of Cancer Research and
Clinical Oncology, vol. 140, pp. 979–992, mar 2014.
[8] A. Malhotra, R. F. Redberg, and P. Meier, “Saturated fat does not clog the arteries:
coronary heart disease is a chronic inflammatory condition, the risk of which can be
effectively reduced from healthy lifestyle interventions,” British Journal of Sports Medicine,
pp. bjsports–2016–097285, apr 2017.
[9] F. M. Sacks, A. H. Lichtenstein, J. H. Wu, L. J. Appel, M. A. Creager, P. M. Kris-Etherton,
M. Miller, E. B. Rimm, L. L. Rudel, J. G. Robinson, N. J. Stone, and L. V. Van Horn,
“Dietary fats and cardiovascular disease: A presidential advisory from the American Heart
Association,” Circulation, p. CIR.0000000000000510, jun 2017.
[10] K. Stolarz-Skrzypek, “Fatal and nonfatal outcomes, incidence of hypertension, and blood
pressure changes in relation to urinary sodium excretion,” JAMA, vol. 305, p. 1777, may
2011.
[11] C. Johnson, T. S. Raj, K. Trieu, J. Arcand, M. M. Wong, R. McLean, A. Leung, N. R.
Campbell, and J. Webster, “The science of salt: A systematic review of quality clinical
salt outcome studies June 2014 to May 2015,” The Journal of Clinical Hypertension, vol. 18,
pp. 832–839, jul 2016.
[12] H. J. van Wyk, R. E. Davis, and J. S. Davies, “A critical review of low-carbohydrate diets
in people with type 2 diabetes,” Diabetic Medicine, vol. 33, pp. 148–157, oct 2015.
[13] F. L. Santos, S. S. Esteves, A. da Costa Pereira, W. S. Yancy Jr, and J. P. L. Nunes, “Systematic
review and meta-analysis of clinical trials of the effects of low carbohydrate diets on
cardiovascular risk factors,” Obesity Reviews, vol. 13, pp. 1048–1066, aug 2012.
[14] O. Ajala, P. English, and J. Pinkney, “Systematic review and meta-analysis of different
dietary approaches to the management of type 2 diabetes,” American Journal of Clinical
Nutrition, vol. 97, pp. 505–516, jan 2013.
[15] C. A. Johnston and J. P. Foreyt, “Robust scientific evidence demonstrates benefits of
artificial sweeteners,” Trends in Endocrinology & Metabolism, vol. 25, p. 1, jan 2014.
[16] R. J. Brown, M. A. de Banate, and K. I. Rother, “Artificial sweeteners: A systematic review
of metabolic effects in youth,” International Journal of Pediatric Obesity, vol. 5, pp. 305–312,
aug 2010.
[17] J. Suez, T. Korem, D. Zeevi, G. Zilberman-Schapira, C. A. Thaiss, O. Maza, D. Israeli,
N. Zmora, S. Gilad, A. Weinberger, Y. Kuperman, A. Harmelin, I. Kolodkin-Gal,
H. Shapiro, Z. Halpern, E. Segal, and E. Elinav, “Artificial sweeteners induce glucose
intolerance by altering the gut microbiota,” Nature, sep 2014.
[18] B. Goldacre, Bad Pharma: How Drug Companies Mislead Doctors and Harm Patients. Faber
and Faber, Inc, 2012.
[19] G. Cuddeback, E. Wilson, J. G. Orme, and T. Combs-Orme, “Detecting and statistically
correcting sample selection bias,” Journal of Social Service Research, vol. 30, pp. 19–33, may
2004.
[20] M. J. Khoury and J. P. A. Ioannidis, “Big data meets public health,” Science, vol. 346,
pp. 1054–1055, nov 2014.