Statistics Made Easy 1: Foundational things you NEED to know

Statistics is one of those topics that most psychology students either love or hate, with more of them on the hate side.  Still, it is a module in psychology that you cannot avoid and will certainly have to enroll in.  Psychology students who find statistics difficult often give similar reasons (disinterest in Maths, not understanding probability, boredom with the formulas, etc.), but I believe statistics in psychology can be quite easy once you have a good grasp of what is required (the foundations) and the logic of statistical analysis.  So in this post, I will go through what you NEED to know in statistics to help you understand it better and get through the "terrible" (or fun) times ahead...

1.  Statistics is divided into two types, descriptive statistics and inferential statistics.
I would recommend that you read up on and understand descriptive statistics well, because it forms the foundation for inferential statistics.  If you have trouble understanding descriptive statistics, you are likely to face more problems when you move on to inferential statistics.  For starters, click on the links above.  This post will also revolve mainly around descriptive statistics.

2.  Central Limit Theorem (CLT), Normal distribution, and the Bell (-shaped) Curve
According to the central limit theorem (CLT), the means of many random samples independently drawn from the same distribution are distributed approximately normally, provided each sample is reasonably large (some say at least 30 as a rough minimum).  This means that you will get what we call a "normal distribution", also known as the bell (or bell-shaped) curve [as shown above].  In statistics, we often assume that our means follow a normal distribution; this is only an assumption, and it will not hold all the time.  Best way to find out?  Use Excel to plot your distribution out and check if it looks like the picture above.
Also, to understand the shape of a distribution better and to differentiate one distribution from another, we need to know its central tendency and spread.
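If you prefer code to Excel, the central limit theorem can also be checked with a short simulation.  The Python sketch below is only an illustration: the exponential distribution, the sample size of 30, the 2,000 repetitions and the function name sample_means are all assumptions chosen for this example.

```python
import random
import statistics

def sample_means(n_samples=2000, sample_size=30, seed=42):
    """Draw many samples from a skewed distribution and return the mean of each."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(n_samples):
        # expovariate(1.0) is strongly right-skewed, with a true mean of 1.0
        sample = [rng.expovariate(1.0) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means

means = sample_means()
centre = statistics.fmean(means)
spread = statistics.stdev(means)

# Share of sample means lying within 1 SD of their centre
within_1sd = sum(abs(m - centre) <= spread for m in means) / len(means)

print(f"mean of sample means: {centre:.3f}")      # expected to be near 1.0
print(f"SD of sample means:   {spread:.3f}")      # expected to be near 1/sqrt(30), about 0.18
print(f"within 1 SD:          {within_1sd:.1%}")  # expected to be roughly 68%
```

Even though the individual scores here are heavily skewed, the sample means should pile up fairly symmetrically around the true mean, which is the pattern the CLT describes.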

3.  Population and Sample
This is one of the important things that I would highlight to students.  Make sure you know the difference between a population and a sample.  A population refers to all members of a defined group that we are studying or collecting information on for data-driven decisions.  However, it is usually impossible to study all the members of a population for a research project, because it costs too much and takes too much time.  Hence, we choose a small group of participants from the population to undergo the study; this group of participants is the sample.  We assume that the sample is representative of the population, and possesses the same characteristics as the population.
Other than understanding the differences between population and sample, you should also know that there are differences between the population parameters and sample statistics.  As they are very similar in notation and formulas, students tend to get confused over them.
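One place where the population/sample distinction shows up in the formulas is the variance: the population formula divides by N, while the usual sample formula divides by n - 1.  The sketch below uses Python's built-in statistics module; the scores are made up purely for illustration.

```python
import statistics

scores = [4, 8, 6, 5, 7]  # made-up scores; mean = 6, squared deviations sum to 10

# Population parameter (sigma squared): treat the scores as the WHOLE population
pop_var = statistics.pvariance(scores)   # 10 / 5 = 2

# Sample statistic (s squared): treat the scores as a SAMPLE from a larger population
samp_var = statistics.variance(scores)   # 10 / (5 - 1) = 2.5

print(pop_var, samp_var)
```

Same data, two slightly different answers, so it pays to check which formula a textbook question or a piece of software is using.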

4.  The 3 Ms of Central Tendency: Mean, Median, Mode
Understand the difference between the mean (average), median (middle value, or mean of the two middle values, or 50th percentile), and mode (value of highest frequency).  In statistical reports and research articles, you will see the mean more often than the other two.  This is because if we assume that the sample comes from a normal distribution, the mean is the same as the median and mode, so it makes more sense just to use the mean with the standard deviation (its accompanying measure of spread).  If the data are not normally distributed (for example, skewed or with outliers), report the median or mode rather than the mean.
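To see why the mean can mislead when the data are not normal, here is a small Python sketch.  The scores are invented for illustration, with one deliberate outlier (40).

```python
import statistics

scores = [2, 3, 3, 4, 5, 6, 40]  # made-up scores; 40 is an outlier

mean = statistics.mean(scores)      # 63 / 7 = 9, pulled upwards by the outlier
median = statistics.median(scores)  # middle of the 7 sorted values = 4
mode = statistics.mode(scores)      # most frequent value = 3

print(mean, median, mode)
```

Six of the seven scores are below the mean of 9, so the median of 4 describes a "typical" score much better here.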

5.  Spread or Variability: Quartiles, IQR, Variance, and Standard Deviation
To understand a distribution or the overall description of a set of data, we also need to know the spread (or variability) in addition to the central tendency.  This can be measured by the above four.
Quartiles tell us about the spread of a data set by breaking the data set into quarters.  In a sample of data, the 1st quartile (Q1) is the 25th percentile, the 2nd quartile (Q2) is the 50th percentile (also the median), and the 3rd quartile (Q3) is the 75th percentile.  Quartiles are a useful measure of spread because they are much less affected by outliers or a skewed data set than the equivalent measures of mean and standard deviation.  For this reason, quartiles are often reported along with the median as the best choice of measure of spread and central tendency when dealing with skewed data and/or data with outliers.  A common way of expressing quartiles is as an interquartile range (IQR).  The IQR is the difference between the third quartile (Q3) and the first quartile (Q1), or Q3 - Q1, telling us about the range of the middle half of the scores in the distribution.  You can see an example for quartiles and IQR here.
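As a rough sketch of how quartiles and the IQR can be computed, the Python example below uses statistics.quantiles with its default "exclusive" method.  The scores are made up, and note that different textbooks and software packages use slightly different rules for quartiles, so results can differ a little between tools.

```python
import statistics

# Made-up scores, already sorted; 40 is an outlier
scores = [1, 3, 4, 5, 7, 8, 9, 11, 12, 14, 40]

# n=4 cuts the data into quarters and returns the three cut points
q1, q2, q3 = statistics.quantiles(scores, n=4)
iqr = q3 - q1

print(q1, q2, q3)  # Q2 is the same value as statistics.median(scores)
print(iqr)
```

With these scores the quartiles land exactly on the 3rd, 6th and 9th values (4, 8 and 12), so the IQR is 8, and the outlier of 40 has no effect on it.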
The standard deviation (SD) is a measure of how spread out numbers are.  It is calculated as the square root of the variance, while the variance is defined as the average of the squared differences from the mean.  Hence its formula is "root-mean-square of the differences from the mean"; this is one formula you SHOULD know how to do.  In my definition, the SD (or σ) is an averaged measure of how far each of the values deviates from the mean, which also means that you can use the SD to express how far a certain value is from the mean.  As you can see above in the picture, each σ is equally spaced from each other; the distance between 3σ and 2σ is the same as the distance between 2σ and 1σ.  More often than not, you will see research articles reporting the SD, rather than the others.  This is due to the assumption of the samples having a normal distribution.
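The "root-mean-square of the differences from the mean" recipe can be followed step by step.  The Python sketch below uses made-up scores and the population form of the formula (dividing by N), which matches the "average of the squared differences" wording above.

```python
import math
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up scores

mean = sum(scores) / len(scores)                    # 40 / 8 = 5
squared_diffs = [(x - mean) ** 2 for x in scores]   # 9, 1, 1, 1, 0, 0, 4, 16
variance = sum(squared_diffs) / len(scores)         # 32 / 8 = 4
sd = math.sqrt(variance)                            # square root of 4 = 2

# How far is a score of 9 from the mean, measured in SDs?
z = (9 - mean) / sd                                 # (9 - 5) / 2 = 2

print(mean, variance, sd, z)
print(statistics.pstdev(scores))  # the built-in population SD gives the same value as sd
```

So a score of 9 sits 2 SDs above the mean, which is the kind of "how far from the mean" statement the SD lets you make.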

6.  The 68–95–99.7 rule
This is a rule that has been calculated by mathematicians for the normal distribution and is used especially in basic probability calculations in statistics.  Simply, about 68.27% of the values lie within 1 SD of the mean (μ ± 1σ).  Similarly, about 95.45% of the values lie within 2 SDs of the mean (μ ± 2σ).  Nearly all (99.73%) of the values lie within 3 SDs of the mean (μ ± 3σ).
A simple example is IQ scores, with μ = 100 and σ = 15.  This means that the values left of the mean are 85 (μ-1σ), 70 (μ-2σ), and 55 (μ-3σ); values right of the mean are 115 (μ+1σ), 130 (μ+2σ), and 145 (μ+3σ).  Approximately 68.27% of people have IQ scores ranging from 85 to 115, 95.45% have scores ranging from 70 to 130, and 99.73% have scores ranging from 55 to 145.
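These percentages can be checked with a few lines of Python.  The sketch below uses NormalDist from the standard library with the IQ values above (μ = 100, σ = 15); the helper name share_between is made up for this example.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)  # IQ scores modelled as a normal distribution

def share_between(low, high):
    """Proportion of the distribution lying between low and high."""
    return iq.cdf(high) - iq.cdf(low)

for k in (1, 2, 3):
    low, high = 100 - k * 15, 100 + k * 15
    print(f"{low} to {high}: {share_between(low, high):.2%}")
# 85 to 115: 68.27%
# 70 to 130: 95.45%
# 55 to 145: 99.73%
```

The same helper can answer other questions too, such as the share of people with an IQ between 100 and 130.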

I have tried to avoid formulas for those with "phobias" of formulas, and explained the above essential statistics information in as much detail and as simply as possible.  If you have any further questions or require more explanations for the above, feel free to ask.

Stage 2: Reflections of a NUS student

In NUS, Psychology is within the Faculty of Arts and Social Sciences (FASS).  The implication of this is that one need not commit straight away to studying solely psychology, as FASS allows students to freely choose their modules each semester without having any declared major.  This may be useful for people who are not yet sure whether they want to pursue an education and possible career in psychology.

Looking back at the last 2 years of studying, I am grateful for the broad exposure that the department offers students.  While the undergraduate modules may not go very deeply into certain fields like neuropsychology, the curriculum is nonetheless structured in such a way that students will study introductory modules of five main branches - Social, Abnormal, Cognitive, Developmental and Biological Psychology.  Personally, I have found this helpful as it allows for a deeper and more holistic appreciation of psychology as a whole, and also provides opportunity for students to 'try out' the different fields to see which they prefer.

The lecturers that I've had so far are highly competent and able to explain effectively to us any tricky concepts that we come across.  They are also very approachable and willing to spend extra time and effort helping students who may be struggling with school work - all you need to do is ask for their help.  Also, there are some lecturers who clearly go the extra mile in making psychological concepts more understandable and relevant to students, often involving the use of current affairs and popular culture as examples of what they teach.  These lecturers make learning much more enjoyable, and I must say that I am very thankful to be taught by them.

Still, the course (like any other undergraduate discipline) is no walk in the park, and requires hard work and focus from each student.  The course is very content-driven, so students are expected to diligently read the textbooks and research papers, as lectures sometimes do not cover all the relevant info due to time constraints.  One common complaint among NUS students is the heavy workload each semester, which sometimes results in the joy of learning being sapped away as students rush from one assignment to another, without any space to appreciate what is being learnt.  However, this is probably the case in most universities, and to be fair, one's ability to manage a heavy workload (and stress) is trained in the process.


Research Methods and Statistics: Explaining the links..

Doing experiments is one major part of psychology, and as a result, we learn research methods and statistics.  However, because of statistics, a lot of people are turned off from this experimental side of psychology.  Different people may have different motivations and levels of understanding for statistics, hence I will not delve too much into the reasons why people do not like statistics.  One common reason is "I am not good in maths".  My reply to that?  "Statistics is not math..."

To make things easier, I'll try to explain the relationships and links between doing research, research methods, and statistics.  Doing research is an essential part of psychology, where we can confirm our assumptions and learn more about what we do not yet fully understand.  Quantitative research is done through the use of research methods and statistics, with each covering different aspects of research.

"Research methods" are two simple words which can be explained by "things you do in research", but they encompass a lot of meaning and work behind them.  These things include the knowledge of and abilities to do conceptualisation of the problem (i.e. hypothesis building, variable quantification, etc.), sampling, measurement (including validity and reliability), and experimental design (types of design and experimental biases).  These mentioned are only the main branches of research methods with some examples, without really going in depth for each of the branches.  Think about the things you need to know and work you have to do for research methods...There are a lot!!!

So where does statistics come in??  If research methods are the "things you do in research", then statistics would be the "instrument that you use to analyse the data".  However, it is not that simple; this "instrument" requires you to know the foundations of probability and statistics well, before being able to understand and use even the simplest of the advanced parts of statistics - inferential statistics, such as ANOVA and regression.  So where does maths come in?  Mostly in the foundations of probability and statistics, where you are exposed to the formulas, and not much anywhere else in research methods or statistics.

To conclude, research methods and statistics should work hand in hand for you as a quantitative researcher - first with research methods for everything till collection of data, followed by statistics to analyse the data.  In upcoming posts about statistics, I will talk about how to make statistics easier for you as students to learn and use.

Stage 7: Reflections of a PhD Candidate 2

Achieving a PhD degree has been a dream of mine since I was a child.  After completing my Masters degree in sport science (sport and exercise psychology) in Europe, my next target was to do a PhD.  By late 2009, I had made up my mind on what I wanted to do.  My first step was to contact the professors who were experts in the areas that I was interested in specializing in.  All I had in mind was 2 concepts: my passion for sports and my nature of always being positive.  I started my journey with the guidance of my supervisors, by reading research articles in sport and positive psychology and developing ideas for my research.  With appropriate research questions and hypotheses framed and approved, I started with the data collection.  Probably like any other PhD student, I went through sleepless nights, irregular meals, confusion and constantly trying to figure out solutions.  Though there were hardships during the initial stages of doing the PhD, when I look back at them now, I have different feelings about them.  The realization that ‘research always evolves and opens new directions and gateways’ struck me harder this time around.  It has been two years since I commenced my PhD journey, and I have no regrets at all.  I must highlight that doing a PhD is a lonely journey which involves stress, tension and frustration.  My strategy has always been to be positive and enjoy the research.  I also believe that sharing your knowledge can help you progress well, because sometimes input of ideas from others can help you think out of the box and untangle the difficult thoughts and questions.