## Introduction

General Guidelines

• Don't put too much information on one graph. How many curves you can include will depend on how close they are to each other and whether they intersect.

• Start the vertical axis at zero unless you have a good reason not to. In this latter case, draw attention to what you have done.

• Label the axes and curves clearly. Include the units (e.g. thousands or percentages) as well as the variable names.

• Make a clear distinction between curves on the same axes. Distinguish between them with different types of line (solid, dotted, etc.) and by using colour where possible.

• Avoid using keys.

## x,y Graphs

Line graphs differ from charts in that the representation is based on points located in a coordinate system using x and y axes. The x-axis is the horizontal axis and the y-axis is the vertical axis. When stating coordinates, the x-coordinate is always given first. For example, (3, 5) are the coordinates of the point x = 3, y = 5. This idea generalizes to 3 dimensions, where coordinate axes can be set up, labelled x, y, z.

## Scatter Diagrams

Scatter diagrams are commonly produced from surveys or experiments, and are composed of points relating two variables, e.g. height and age of people under 18. Logically, for such a diagram, the horizontal axis would be used for age and the vertical axis for height. Each point on this diagram would represent the characteristics of an individual person.

For such a graph, you would expect some correlation between the results, i.e. you would expect that as age increases, so does height. The points would tend to show that a relationship exists, but they would not fall exactly on a straight line; the 'straight line' relationship is only implied by the general trend.

When there appears to be a relationship between the two quantities displayed on a scatter diagram, we say they appear to be correlated. Look at these diagrams. The first and last diagrams indicate correlation: the first is described as positive correlation, the last as negative correlation. The correlation is not exact but, all other things being equal, we are inclined to believe that there is a relationship between the quantities represented on the diagrams.
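The strength and direction of a linear correlation can also be put into a single number. The sketch below computes Pearson's correlation coefficient r, a standard measure that this module does not itself introduce: values near +1 suggest positive correlation, values near -1 suggest negative correlation, and values near 0 suggest little linear correlation. The function name `pearson_r` and all of the data values are invented purely for illustration.

```python
import math


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("one variable is constant, so r is undefined")
    return sxy / math.sqrt(sxx * syy)


# Invented illustrative data: age in years against height in cm.
ages = [2, 5, 8, 11, 14, 17]
heights_cm = [86, 108, 127, 145, 162, 174]
r_positive = pearson_r(ages, heights_cm)

# Invented data in which the second quantity falls as the first rises.
falling_values = [50, 44, 41, 33, 30, 22]
r_negative = pearson_r(ages, falling_values)

print(f"height against age: r = {r_positive:.3f}")
print(f"falling data:       r = {r_negative:.3f}")
```

A value of r close to +1 or -1 only describes how closely the points follow a straight line; it says nothing about why they do.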

We ought to point out that correlation itself is not proof of any causal relationship between the two variables. In the past, correlation has been shown to exist between the parrot population in an area of Central America and an economic factor in the United States. So although there was good correlation in this case, this was just chance - there was no actual causal connection between the two quantities at all.

At the other end of the scale, even for quantities where a causal connection could exist, correlation is not, by itself, proof of an actual causal connection - it is just a piece of evidence that you would need to back up with other arguments if you wanted to suggest that there was an actual causal connection.

It should also be stressed that here we have been talking about linear correlation (a 'straight line' connection). Features that are not linearly correlated could still be related by a more complicated mathematical expression.

## Line of Best Fit

A line of best fit can be drawn on a scatter diagram as a 'guess' at what any relationship may be.

For experimental data, the scatter might just be due to experimental error, so a true and close relationship might actually exist in reality. For data from a survey, such a line could enable estimates to be made. For example, for the height/age diagram mentioned at the beginning of the module, no exact relationship exists between height and age, but we could nevertheless draw a line of best fit - a line which could then be used to estimate what height a person of a certain age might be, on average.

You can attempt to insert this line by eye - the stronger the correlation, the easier it is to draw the line. The general guidelines to follow are:

• the line should have roughly the same number of points on either side (ignoring any 'rogue' points).
• the line should follow the general trend of the points.
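The guidelines above can also be made precise. One widely used mathematical procedure is the least-squares method, which chooses the line y = slope × x + intercept that minimises the sum of the squared vertical distances from the points to the line. The sketch below is a minimal illustration under that assumption; the function name `best_fit_line` and the age/height figures are invented for the example and do not come from any survey.

```python
def best_fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are equal, so no unique line exists")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # The least-squares line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented illustrative data: age in years against height in cm.
ages = [2, 5, 8, 11, 14, 17]
heights_cm = [86, 108, 127, 145, 162, 174]

slope, intercept = best_fit_line(ages, heights_cm)

# Use the line to estimate the average height at an age not in the data.
estimated_height_at_10 = slope * 10 + intercept
print(f"height is roughly {slope:.2f} * age + {intercept:.2f}")
print(f"estimated average height at age 10: {estimated_height_at_10:.1f} cm")
```

As with a line drawn by eye, the estimate is only reasonable within the range of ages covered by the data; extending the line far beyond that range is not justified by the points alone.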

More mathematical procedures do exist for drawing such a line; these are the type of procedure a spreadsheet employs.

## Past Exam Questions

#### An article in the local newspaper about the prices of new cars includes the following statement: "Car prices rose steadily until January and then started to fall." Which is the correct graph to match the statement?  