A NEW WAY TO STUDY NUTRITION AND
PHARMACEUTICALS
Published at: http://www.archania.org
July 15, 2017
Healthy or carcinogenic?
Malnutrition is believed to cause a number of serious lifestyle diseases, such as cardiovascular
diseases, cancer, type 2 diabetes, allergies and being overweight. It has been estimated to cost the
global economy 3.5 trillion USD per year, or 5% of the global gross domestic product[1].
Many nutritional studies, however, disagree about which foods are healthy and which are
unhealthy. There is, for example, no scientific consensus on whether red meat is healthy or causes
cancer, since the published studies contradict each other. There is also disagreement about
how many carbohydrates, how much saturated fat and how much salt people should eat, and about
whether artificial sweeteners are a good substitute for table sugar.
Table 1: Examples of nutritional studies that contradict each other

Food                    Reported as beneficial            Reported as harmful
Red meat                Healthy[2,3,4]                    Carcinogenic[5,6,7]
Saturated fat           Healthy[8]                        Cardiovascular diseases[9]
Salt                    Healthy[10]                       Cardiovascular diseases[11]
Carbohydrates           Healthy[12]                       Weight gain[13], Type 2 diabetes[14]
Artificial sweeteners   Good substitute for sugar[15]     Weight gain[16], Glucose intolerance[17]
We cannot necessarily trust pharmaceutical studies
Contradictory studies may not be as common for pharmaceuticals, but we
have good reason to be suspicious of the pharmaceutical studies that do get published. It
would be counterproductive for companies to invest money in advertisements that disfavored
their products. A scientific study that shows benefits from the use of a product is good
advertisement for the company, while a study that shows negative consequences from its
use is bad advertisement. So we should not be surprised that the pharmaceutical
industry wants to keep us from learning about negative consequences associated with its
products. Since big pharmaceutical companies often fund the research on their own drugs,
studies that show a drug to be disadvantageous are not always published[18]. Let us consider a
scenario where a new drug against cancer is manufactured. Three different studies are published
in medical journals, and all of them show that it increases survivability. The average of
these three studies will then also indicate that it increases survivability (Figure 1).
[Forest plot. Horizontal axis from -0.75 (decreased survivability) to 0.75 (increased survivability).]
Study A (AVG: 0.2 STD: 0.5 WEIGHT: 6)
Study E (AVG: 0.33 STD: 0.4 WEIGHT: 5)
Study H (AVG: 0.72 STD: 0.2 WEIGHT: 3)
Overall average: 0.36
Figure 1: Forest plot showing the average of 3 different studies about a hypothetical drug.
However, there are 5 more studies about the drug that were never published. These studies show
that the drug decreases survivability. If they were available, the average would indicate that the
drug actually decreases survivability (Figure 2).
[Forest plot. Horizontal axis from -0.75 (decreased survivability) to 0.75 (increased survivability).]
Study A (AVG: 0.2 STD: 0.5 WEIGHT: 6)
Study B (AVG: -0.41 STD: 0.4 WEIGHT: 5)
Study C (AVG: -0.06 STD: 0.6 WEIGHT: 7)
Study D (AVG: -0.23 STD: 0.3 WEIGHT: 4)
Study E (AVG: 0.33 STD: 0.4 WEIGHT: 5)
Study F (AVG: -0.58 STD: 0.2 WEIGHT: 3)
Study G (AVG: -0.72 STD: 0.1 WEIGHT: 2)
Study H (AVG: 0.72 STD: 0.2 WEIGHT: 3)
Overall average: -0.04
Figure 2: Forest plot showing the average of 8 different studies about a hypothetical drug.
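The overall averages in Figures 1 and 2 are consistent with a weighted mean of the study averages, using the WEIGHT values as weights. The following is a minimal sketch of that calculation, not necessarily the method used to draw the figures. The study letters are assigned in listing order, which is an assumption, and the function name is only illustrative.

```python
# (average effect, weight) per study, taken from Figures 1 and 2.
# Assumption: the letters A..H follow the order in which the values are listed.
STUDIES = {
    "A": (0.20, 6), "B": (-0.41, 5), "C": (-0.06, 7), "D": (-0.23, 4),
    "E": (0.33, 5), "F": (-0.58, 3), "G": (-0.72, 2), "H": (0.72, 3),
}


def weighted_average(names):
    """Weighted mean of the study averages for the given study names."""
    total_weight = sum(STUDIES[name][1] for name in names)
    weighted_sum = sum(STUDIES[name][0] * STUDIES[name][1] for name in names)
    return weighted_sum / total_weight


published_only = weighted_average(["A", "E", "H"])   # the three published studies
all_studies = weighted_average(sorted(STUDIES))      # published and unpublished

print(f"published only: {published_only:.2f}")  # prints 0.36
print(f"all studies:    {all_studies:.2f}")     # prints -0.04
```

Leaving out the five unpublished studies flips the sign of the overall estimate, even though none of the individual study results changed.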
Unfortunately, studies with negative results are rarely published[18]. This means that the public
often gets completely misinformed about the efficacy and toxicity of pharmaceuticals.
Why nutritional studies often are unreliable
To be representative of the general population, most nutritional studies strive to use a
random selection of people. But then they also need to be certain that the selection truly is
random; otherwise it will misrepresent the general population[19]. Unfortunately, it
is very difficult in practice to get a truly random selection of people. This is why the estimates
from election polls can differ significantly from the results of real elections, and it is one of the main
reasons why there are so many contradictory nutritional studies. There is, however, another option.
Just as we can collect votes from the entire population in a presidential election, we can
base our research on the entire population in our modern digitalized society[20].
[Diagram: an inductive generalization from a random selection of people to the entire population.
Random selection of people: easy to measure, but can be very difficult to get completely random.
Entire population: difficult to measure, but avoids the problem with randomization.
Unless it is based upon a truly random selection of people, the generalization is likely to give a
misrepresentation of the entire population.]
Figure 3: How a random selection of people can be used to estimate the entire population, but
can give a completely wrong estimate if it isn’t truly random. Unfortunately, it is very difficult
in practice to get a completely random selection of people. This problem can be avoided by using
the entire population instead of a random selection of people.
Another problem with many nutritional studies is that they are based on self-reporting.
Unfortunately, people do not necessarily have a good overview of how much they consume
of different foods. If we instead base our studies on the groceries people buy, we avoid
the problem with self-reporting, but another problem arises: in families, for example, one person
often buys food for the entire family. This problem can be circumvented if we obtain
information about how many people live together. If we only include individuals who live
alone in our studies, most of the purchased groceries are likely to be consumed
by the individual who buys them. Families can also be treated as units, where we look at
all the groceries bought by the family and the overall health situation of the family. There might
still be anomalies, however. Some people might be living with other people in secrecy, and some
people might give away some of their food to pets or other people. People might also differ in the
amount of food they throw away. The nice thing about basing research on a large
number of individuals, however, is that the more individuals participate in a study, the less anomalies
matter. A high-resolution photo with 50% pixel loss is, for example, much more recognizable than
a low-resolution photo with 50% pixel loss, since far more actual pixels remain in the
high-resolution photo (Figure 4). Similarly, it is much easier to recognize a trend in a study
based on a large number of individuals, even if it has the same proportion of anomalies.
100 x 75 pixels with 50% pixel loss: (100 · 75) / 2 = 3750 actual pixels
1000 x 750 pixels with 50% pixel loss: (1000 · 750) / 2 = 375000 actual pixels
Figure 4: How it is easier to recognize a photo with 50% pixel loss, the larger the resolution is.
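The same point can be illustrated with a small simulation. This is a hedged sketch with invented numbers, not data from any study: each simulated person has a consumption level and a health outcome, and a fixed share of the records are anomalies whose outcome is unrelated to the recorded consumption. The trend is estimated with a least-squares slope, and the simulation measures how much that estimate varies between repeated studies of the same size.

```python
import random
import statistics


def estimated_trend(n_people, anomaly_share, rng):
    """Least-squares slope of outcome on consumption for one simulated study."""
    xs, ys = [], []
    for _ in range(n_people):
        consumption = rng.uniform(0, 10)
        if rng.random() < anomaly_share:
            # Anomaly: the recorded purchases say nothing about this person's
            # outcome (food given away, thrown away or eaten by someone else).
            outcome = rng.uniform(0, 10)
        else:
            # Assumed true trend for illustration: slope 1 plus some noise.
            outcome = consumption + rng.gauss(0, 1)
        xs.append(consumption)
        ys.append(outcome)
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def spread(n_people, trials=100, anomaly_share=0.5, seed=0):
    """Standard deviation of the estimated trend across repeated studies."""
    rng = random.Random(seed)
    slopes = [estimated_trend(n_people, anomaly_share, rng) for _ in range(trials)]
    return statistics.stdev(slopes)


spreads = {n: spread(n) for n in (50, 500, 5000)}
for n, value in spreads.items():
    print(f"{n:>5} people, 50% anomalies: spread of estimated trend = {value:.3f}")
```

In this setup the anomalies weaken the measured trend in every study, but the estimate becomes far more stable as the number of people grows, which is what makes the trend recognizable.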
To collect information from the entire population without infringing on people’s privacy, all
government institutions and private companies that store sensitive personal data about
us should be obliged to send this data, encrypted, to a secure online account belonging to each
individual (Figure 5). Such an account shows us what government
institutions and private companies know about us, and gives us a better overview of our own
lives. We may use this personal data for our own endeavors, or we may use it to
participate in research anonymously.
[Diagram: fitness centers, hospitals, activity trackers, grocery stores, pharmacies, banks and the
Internal Revenue Service each send encrypted data (fitness, health, activity, purchase, transactions
and family info) to your secure online account. Through optional participation under an anonymous
ID, the data can be passed on to a research team, whose research algorithms look for correlations
between foods/medicines and health conditions.]
Figure 5: There should be a law that requires all government institutions and private companies
to make all sensitive personal data that is stored about us available to us in a secure online
account. If we want to use this sensitive personal data for research purposes, it should be possible
for us to send this data anonymously from there to a research team.
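One way the "Anonymous ID" step in Figure 5 could work is sketched below. The article does not specify a mechanism, so everything here is an assumption: the function names, the record layout, and the choice of a keyed hash (HMAC-SHA256 from the Python standard library) with a secret that never leaves the participant's account. Strictly speaking, a keyed hash yields a pseudonym rather than full anonymity, since detailed purchase histories can sometimes be linked back to a person by other means.

```python
import hashlib
import hmac
import secrets


def new_participant_secret() -> bytes:
    """Random secret kept only inside the participant's own account (hypothetical)."""
    return secrets.token_bytes(32)


def anonymous_id(participant_secret: bytes, study_name: str) -> str:
    """Stable per-study pseudonym that does not reveal the secret or the person."""
    digest = hmac.new(participant_secret, study_name.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def research_record(participant_secret: bytes, study_name: str, purchases):
    """Record sent to the research team: pseudonym plus data, no name or address."""
    return {
        "anonymous_id": anonymous_id(participant_secret, study_name),
        "purchases": list(purchases),
    }


secret = new_participant_secret()
record = research_record(secret, "sleep-study", ["coffee", "cake", "milk"])
print(sorted(record))  # the record has only the keys 'anonymous_id' and 'purchases'
```

Because the ID depends on the study name, the same person gets unrelated IDs in different studies, while repeated submissions to one study can still be linked to each other over time.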
1 Multiple regression analysis
2 Confounded variables
How we can distinguish correlations from causes
When studying foods, it is necessary to get information about all the foods the individuals in the
study have eaten. For example, people who drink a lot of coffee might also eat a lot of cake, and
people who eat a lot of cake may likewise drink a lot of coffee. If a study on cakes is done, one
might find a correlation between cake and sleep problems, but only because those who eat a lot of
cake also tend to drink a lot of coffee. To figure out whether it is coffee or cake that causes sleep problems,
we need to look at individuals who drink coffee but do not eat cake, and at individuals who eat cake but do not drink coffee.
[Three bar charts, each showing correlation with sleep problems on a scale from 0 to 1.
Study based on a random selection of individuals, or the entire population: bars for cake and coffee.
Study based on individuals that drink coffee, but do not eat cake: bar for coffee.
Study based on individuals that eat cake, but do not drink coffee: bar for cake.]
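Besides restricting a study to subgroups, both factors can be analyzed at once with multiple regression analysis, the technique named in this section's footnote. The sketch below uses simulated data under assumed probabilities (coffee raises the chance of sleep problems, cake does not, and coffee drinkers eat cake more often), so the numbers are illustrative only. It compares the slope for cake on its own with the slopes obtained when coffee and cake enter the same model.

```python
import random


def simulate(n, seed=1):
    """Simulated people as (coffee, cake, sleep_problems) tuples of 0/1 values."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        coffee = 1 if rng.random() < 0.5 else 0
        # Assumption: coffee drinkers eat cake more often than others.
        cake = 1 if rng.random() < (0.7 if coffee else 0.2) else 0
        # Assumption: only coffee affects sleep; cake has no effect of its own.
        sleep = 1 if rng.random() < (0.6 if coffee else 0.1) else 0
        rows.append((coffee, cake, sleep))
    return rows


def centered_sum(xs, ys):
    """Sum of products of deviations from the means."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


def simple_slope(x, y):
    """Least-squares slope with a single predictor."""
    return centered_sum(x, y) / centered_sum(x, x)


def two_predictor_slopes(x1, x2, y):
    """Least-squares slopes with two predictors, from the 2x2 normal equations."""
    s11, s22, s12 = centered_sum(x1, x1), centered_sum(x2, x2), centered_sum(x1, x2)
    s1y, s2y = centered_sum(x1, y), centered_sum(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


rows = simulate(20000)
coffee = [r[0] for r in rows]
cake = [r[1] for r in rows]
sleep = [r[2] for r in rows]

cake_alone = simple_slope(cake, sleep)
coffee_adjusted, cake_adjusted = two_predictor_slopes(coffee, cake, sleep)

print(f"cake alone:               {cake_alone:.2f}")
print(f"cake, adjusted for coffee: {cake_adjusted:.2f}")
print(f"coffee, adjusted for cake: {coffee_adjusted:.2f}")
```

On its own, cake appears to be linked to sleep problems. Once coffee is in the same model, the slope for cake falls close to zero while the slope for coffee stays large.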
The problem is that we do not necessarily know all the foods that correlate with
each other. Only by looking at everything an individual consumes can we get an overview of
which foods correlate with each other. It can also be difficult to determine whether a particular
health condition is mainly due to a certain type of food or to physical activity. People who
eat a lot of healthy foods often exercise as well, since they tend to be devoted to their overall health.
It then becomes difficult to say whether it is exercise or food that causes a specific health
benefit. In these cases, we would like to study individuals who eat a lot of the food correlated with
the health benefit but exercise very little, or individuals who exercise a lot but eat
very little of that food. So, in order to figure out what causes
sleep problems, we first need to find all the types of food that are overrepresented
among people with sleep problems, and whether lack of exercise (inactivity) is overrepresented among them
as well. In this simplified model, we assume that inactivity, coffee, cake and milk are
overrepresented among people with sleep problems (Figure 6). We also assume that cake and milk
are not really causes of sleep problems, but are merely correlated with them because they
are often consumed together with coffee.
[Bar chart: correlation with sleep problems (0 to 1) for inactivity, apples, bananas, pears, milk,
tomatoes, oranges, avocado, coffee and cake. Afterwards we study 4 subgroups where each of the
overrepresented factors is excluded.]
Figure 6: Study to figure out what is overrepresented among people with sleep problems. All
foods should be included in such a study, along with people’s activity level.
After we have figured out what is overrepresented among people with sleep
problems, we perform new studies on subgroups where each of these factors is excluded in turn.
The easiest way to figure out whether a type of food causes a health condition is to study
individuals who consume that particular food but none of the other foods (or other things)
that correlate with the health condition. If the food still correlates with the health condition, we
can assume that the food is involved in causing it. It may, however, be difficult
to find enough such individuals if a large number of foods correlate with the health
condition. Instead, we can take each correlated food in turn and exclude the people who
consume it. When the excluded food both causes the health
condition and correlates with the consumption of other foods, we will see that the
correlation between the health condition and those other foods also decreases.
Study based on all individuals that are physically active
A study based on all those with a high level of physical activity will likely not show much change
in the correlation between the other factors and sleep problems, since we assume that coffee is
also a cause of sleep problems, while consumption of milk and cake is mainly correlated with
coffee intake rather than with inactivity.
[Bar chart: correlation with sleep problems (0 to 1) for inactivity, milk, coffee and cake in this subgroup.]
Study based on all individuals that do not drink milk
A study based on all those who do not drink milk will likely not show much change in the
correlation between the other factors and sleep problems, since we assume that consumption of
milk is not a cause of sleep problems, but only correlates with the consumption of coffee,
because some people like to have milk in their coffee.
[Bar chart: correlation with sleep problems (0 to 1) for inactivity, milk, coffee and cake in this subgroup.]
Study based on all individuals that do not drink coffee
A study based on all those who do not drink coffee will probably also show that milk and cake
correlate less with sleep problems, because some people use milk in their coffee and some people
eat cake when they drink coffee. We do not expect to see a significant reduction in the
correlation between inactivity and sleep problems, since we assume that inactivity is a cause of
sleep problems in itself.
[Bar chart: correlation with sleep problems (0 to 1) for inactivity, milk, coffee and cake in this subgroup.]
Study based on all individuals that do not eat cake
A study based on all those who do not eat cake will probably not show much change in the
correlation between the other factors and sleep problems, since we assume that consumption of
cake is not a cause of sleep problems, but only correlates with the consumption of coffee,
because some people eat cake when they drink coffee.
[Bar chart: correlation with sleep problems (0 to 1) for inactivity, milk, coffee and cake in this subgroup.]
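The four subgroup studies above can be sketched as a loop that excludes one factor at a time and recomputes the correlations for the remaining factors. The population below is simulated under the simplified model's assumptions (inactivity and coffee raise the risk of sleep problems, while milk and cake only go along with coffee). All probabilities and names are invented for illustration.

```python
import random

FACTORS = ("inactivity", "milk", "coffee", "cake")


def simulate_population(n, seed=2):
    """Simulated people as dicts of 0/1 values, following the simplified model."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        inactivity = rng.random() < 0.4
        coffee = rng.random() < 0.5
        # Assumption: milk and cake are more common among coffee drinkers.
        milk = rng.random() < (0.6 if coffee else 0.2)
        cake = rng.random() < (0.6 if coffee else 0.2)
        # Assumption: only inactivity and coffee raise the risk of sleep problems.
        risk = 0.05 + 0.35 * inactivity + 0.35 * coffee
        people.append({
            "inactivity": int(inactivity), "milk": int(milk),
            "coffee": int(coffee), "cake": int(cake),
            "sleep_problems": int(rng.random() < risk),
        })
    return people


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def correlations(people, excluded=None):
    """Correlation of each remaining factor with sleep problems in a subgroup."""
    subgroup = [p for p in people if excluded is None or p[excluded] == 0]
    outcome = [p["sleep_problems"] for p in subgroup]
    return {
        factor: pearson([p[factor] for p in subgroup], outcome)
        for factor in FACTORS if factor != excluded
    }


population = simulate_population(20000)
results = {"everyone": correlations(population)}
for factor in FACTORS:
    results[f"no {factor}"] = correlations(population, excluded=factor)

for label, corr in results.items():
    print(label, {factor: round(value, 2) for factor, value in corr.items()})
```

In this simulation, excluding coffee drinkers brings the correlations for milk and cake close to zero while inactivity stays correlated. Excluding milk or cake leaves the coffee correlation largely intact. That is the pattern the four panels describe.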
Based on these 4 subgroups, we have learned that both inactivity and coffee are potential
causes of sleep problems, while consumption of cake and milk only correlates with coffee intake.
Modern computer algorithms can perform such studies very quickly, along with
many similar calculations that give us even more information about how
different foods correlate with health conditions and with other foods. By also analyzing which
medicines people purchase from pharmacies, we can similarly learn
how different medicines affect us and how they interact with different types of food.
Since such a study is not intended to have a time limit, the longer it continues, the more certain
we can be that the correlations it recognizes are correct. Uncertainty will decrease more and more
as we move into the future (Figure 7), unlike in past decades, when the old way of
doing nutritional research mostly seems to have left the general population more confused.
[Line chart: uncertainty (0 % to 100 %) plotted against time.]
Figure 7: How the uncertainty of recognized correlations will decrease over time
If we become more certain about which foods are healthy and unhealthy, it will be more justifiable
to tax unhealthy foods and to subsidize healthy foods. It is risky to tax
a food unless we are really certain that it is unhealthy. It is also risky to educate
children about healthy nourishment unless we are reasonably certain that our ideas about healthy
nourishment are correct. Getting either wrong could generate distrust in the government.
Bibliography
[1] The Food and Agriculture Organization of the United Nations, “The state of food and
agriculture.” http://www.fao.org/publications/sofa/2013/en/, 2013.
[2] A. J. McAfee, E. M. McSorley, G. J. Cuskelly, B. W. Moss, J. M. Wallace, M. P. Bonham,
and A. M. Fearon, “Red meat consumption: An overview of the risks and benefits,” Meat
Science, vol. 84, pp. 1–13, jan 2010.
[3] C. Steppeler, M. Sødring, B. Egelandsdal, B. Kirkhus, M. Oostindjer, O. Alvseike, L. E.
Gangsei, E.-M. Hovland, F. Pierre, and J. E. Paulsen, “Effects of dietary beef, pork, chicken
and salmon on intestinal carcinogenesis in A/J Min/+ mice,” PLOS ONE, vol. 12, p. e0176001,
apr 2017.
[4] N. D. Turner and S. K. Lloyd, “Association between red meat consumption and colon cancer:
A systematic review of experimental results,” Experimental Biology and Medicine, vol. 242,
pp. 813–839, jan 2017.
[5] A. N. Samraj, O. M. T. Pearce, H. Läubli, A. N. Crittenden, A. K. Bergfeld, K. Banda, C. J.
Gregg, A. E. Bingman, P. Secrest, S. L. Diaz, N. M. Varki, and A. Varki, “A red meat-derived
glycan promotes inflammation and cancer progression,” Proceedings of the National Academy
of Sciences, vol. 112, pp. 542–547, dec 2014.
[6] A. N. Ananthakrishnan, M. Du, S. I. Berndt, H. Brenner, B. J. Caan, G. Casey, J. Chang-
Claude, D. Duggan, C. S. Fuchs, S. Gallinger, E. L. Giovannucci, T. A. Harrison, R. B. Hayes,
M. Hoffmeister, J. L. Hopper, L. Hou, L. Hsu, M. A. Jenkins, P. Kraft, J. Ma, H. Nan, P. A.
Newcomb, S. Ogino, J. D. Potter, D. Seminara, M. L. Slattery, M. Thornquist, E. White,
K. Wu, U. Peters, and A. T. Chan, “Red meat intake, NAT2, and risk of colorectal cancer:
A pooled analysis of 11 studies,” Cancer Epidemiology Biomarkers & Prevention, vol. 24,
pp. 198–205, oct 2014.
[7] P. Song, M. Lu, Q. Yin, L. Wu, D. Zhang, B. Fu, B. Wang, and Q. Zhao, “Red meat
consumption and stomach cancer risk: a meta-analysis,” Journal of Cancer Research and
Clinical Oncology, vol. 140, pp. 979–992, mar 2014.
[8] A. Malhotra, R. F. Redberg, and P. Meier, “Saturated fat does not clog the arteries: coronary
heart disease is a chronic inflammatory condition, the risk of which can be effectively reduced
from healthy lifestyle interventions,” British Journal of Sports Medicine, p. bjsports-2016-097285,
apr 2017.
[9] F. M. Sacks, A. H. Lichtenstein, J. H. Wu, L. J. Appel, M. A. Creager, P. M. Kris-Etherton,
M. Miller, E. B. Rimm, L. L. Rudel, J. G. Robinson, N. J. Stone, and L. V. Van Horn,
“Dietary fats and cardiovascular disease: A presidential advisory from the American Heart
Association,” Circulation, p. CIR.0000000000000510, jun 2017.
[10] K. Stolarz-Skrzypek, “Fatal and nonfatal outcomes, incidence of hypertension, and blood
pressure changes in relation to urinary sodium excretion,” JAMA, vol. 305, p. 1777, may
2011.
[11] C. Johnson, T. S. Raj, K. Trieu, J. Arcand, M. M. Wong, R. McLean, A. Leung, N. R.
Campbell, and J. Webster, “The science of salt: A systematic review of quality clinical salt
outcome studies June 2014 to May 2015,” The Journal of Clinical Hypertension, vol. 18,
pp. 832–839, jul 2016.
[12] H. J. van Wyk, R. E. Davis, and J. S. Davies, “A critical review of low-carbohydrate diets in
people with type 2 diabetes,” Diabetic Medicine, vol. 33, pp. 148–157, oct 2015.
[13] F. L. Santos, S. S. Esteves, A. da Costa Pereira, W. S. Yancy Jr, and J. P. L. Nunes,
“Systematic review and meta-analysis of clinical trials of the effects of low carbohydrate
diets on cardiovascular risk factors,” Obesity Reviews, vol. 13, pp. 1048–1066, aug 2012.
[14] O. Ajala, P. English, and J. Pinkney, “Systematic review and meta-analysis of different dietary
approaches to the management of type 2 diabetes,” American Journal of Clinical Nutrition,
vol. 97, pp. 505–516, jan 2013.
[15] C. A. Johnston and J. P. Foreyt, “Robust scientific evidence demonstrates benefits of artificial
sweeteners,” Trends in Endocrinology & Metabolism, vol. 25, p. 1, jan 2014.
[16] R. J. Brown, M. A. de Banate, and K. I. Rother, “Artificial sweeteners: A systematic review
of metabolic effects in youth,” International Journal of Pediatric Obesity, vol. 5, pp. 305–312,
aug 2010.
[17] J. Suez, T. Korem, D. Zeevi, G. Zilberman-Schapira, C. A. Thaiss, O. Maza, D. Israeli,
N. Zmora, S. Gilad, A. Weinberger, Y. Kuperman, A. Harmelin, I. Kolodkin-Gal, H. Shapiro,
Z. Halpern, E. Segal, and E. Elinav, “Artificial sweeteners induce glucose intolerance by
altering the gut microbiota,” Nature, sep 2014.
[18] B. Goldacre, Bad Pharma: How Drug Companies Mislead Doctors and Harm Patients. Faber
and Faber, Inc, 2012.
[19] G. Cuddeback, E. Wilson, J. G. Orme, and T. Combs-Orme, “Detecting and statistically
correcting sample selection bias,” Journal of Social Service Research, vol. 30, pp. 19–33, may
2004.
[20] M. J. Khoury and J. P. A. Ioannidis, “Big data meets public health,” Science, vol. 346,
pp. 1054–1055, nov 2014.