How good is my data?

In your Discussion, you should explain how confident you are in your results by assessing the causes and impact of potential errors in the data. The term “error” does not necessarily indicate that you’ve done anything wrong. It is the difference between the measured value and the true value.

It is impossible to measure any physical quantity with complete certainty. From random variations in the quantity being measured to limitations in the measuring equipment or the experimental method, there will always be errors in any measurement.

This means that if we measure some quantity and then repeat the measurement, we will almost certainly measure a different value the second time. By quantifying the possible spread of measurements, we can express uncertainty and say how confident we are about the result.

While the error is the difference between the measured value and the true value, the uncertainty is the range of values within which you are confident the true value lies.

It is common for the error in a measurement to be unknown. Consider using a ruler, with 1 mm divisions, to measure the length of a book. If the end of the book falls between two of the divisions of the ruler, you can’t determine the length exactly. However, you can be sure of a range of values between which it lies. This range is an example of uncertainty.
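The ruler example can be sketched in a few lines of Python. This is only an illustration: the 210 mm and 211 mm marks are hypothetical readings, and the helper name is made up for this sketch. It turns the two divisions that bracket the end of the book into a central value and a half-width.

```python
def interval_to_value_and_half_width(lower, upper):
    """Convert a bracketing interval into (midpoint, half-width)."""
    midpoint = (lower + upper) / 2
    half_width = (upper - lower) / 2
    return midpoint, half_width


# Hypothetical reading: the end of the book falls between the
# 210 mm and 211 mm marks of a ruler with 1 mm divisions.
length_mm, half_width_mm = interval_to_value_and_half_width(210.0, 211.0)
print(f"Length is {length_mm} mm, give or take {half_width_mm} mm")
```

Here the half-width of the interval plays the role of the uncertainty: the true length is somewhere between the two marks, even though the error itself is unknown.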

The uncertainty is an important part of your measurement as it allows the reader to place a degree of confidence in your results and to assess their significance in case of any discrepancy with earlier measurements or theoretical predictions.

It is important to remember that a large range of uncertainty in your data is not necessarily an indictment of your abilities. Uncertainties should be reported truthfully because, as an engineer, you could be responsible for developing safety-critical systems.

The uncertainty should fully capture the reliability and range of potential values of the numbers you present so that systems can be designed to operate under both the best-case and worst-case scenarios.

In this article, you’ll learn how to report the uncertainty in your data and how to express why this uncertainty is relevant to your results.

Expressing the uncertainty in your data

A common notation adopted by engineers to report uncertainty is to write:

(value ± uncertainty) unit

For example:

(3.12 ± 0.03) m

Commonly, this would be read as the absolute limits of the range within which the true value could lie. However, there are other ways that this notation could be interpreted by readers, particularly if they are familiar with statistics. It could be the scatter of your measurements, the standard deviation or the distribution of errors.

If you have detailed knowledge of the sizes of the errors associated with your readings then you may wish to let your reader know what these are. For general reports, it can be adequate to leave the meaning of the term after the ± vague. Best practice is to calculate the uncertainty as the standard deviation of the distribution of errors and to report that this is the uncertainty that is quoted.
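As a minimal sketch of the (value ± uncertainty) unit notation, the helper below formats a value and its uncertainty to the same number of decimal places. The function name, its parameters and the raw value 3.1234 are assumptions made for illustration, not part of any standard library.

```python
def format_plus_minus(value, uncertainty, unit, decimals):
    """Format a result as '(value ± uncertainty) unit'.

    The value and the uncertainty are rounded to the same number of
    decimal places so that they can be compared at a glance.
    """
    return f"({value:.{decimals}f} ± {uncertainty:.{decimals}f}) {unit}"


# Hypothetical raw result of 3.1234 m with an uncertainty of 0.03 m.
print(format_plus_minus(3.1234, 0.03, "m", 2))  # prints "(3.12 ± 0.03) m"
```

Quoting the value to more decimal places than the uncertainty would suggest a precision the measurement does not have, which is why both are rounded together here.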

Standard deviation

When you have lots of repeat measurements of the same quantity, uncertainty can be expressed as the standard deviation of the distribution. The standard deviation provides a measure of the variability of individual measurements within the set that has been collected. If the standard deviation is small, the values in the set are close to one another; if it is large, the measurements are widely spread.
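The calculation can be sketched with Python's built-in statistics module. The five readings below are hypothetical, chosen only to illustrate the arithmetic. The sketch uses the sample standard deviation (dividing by n - 1), which is one common choice when the repeats are a sample of all possible measurements.

```python
import statistics

# Hypothetical repeat measurements of the same length, in metres.
readings_m = [3.10, 3.15, 3.12, 3.09, 3.14]

mean_m = statistics.mean(readings_m)
# Sample standard deviation: divides the summed squared deviations by n - 1.
std_m = statistics.stdev(readings_m)

print(f"({mean_m:.2f} ± {std_m:.2f}) m")  # prints "(3.12 ± 0.03) m"
```

If the readings were the entire population rather than a sample, statistics.pstdev (which divides by n) would be the corresponding call.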

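For measurements that follow a normal (bell-shaped) distribution, the range within one standard deviation of the mean contains about 34.1% of individual measurements above the mean value and about 34.1% of those below the mean, or roughly 68% in total.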

Example: Let’s say we are measuring the speed of cars on a busy road. In an hour, we record the speed of 1000 cars. We could find the mean speed of the cars by adding all the speeds recorded and dividing by 1000. If the speeds are roughly normally distributed, we can picture the standard deviation by ordering all the cars from slowest to fastest. Count 341 cars (34.1%) up from the average and find that speed. Count 341 cars down from the average and find that speed. These two speeds lie about one standard deviation above and one standard deviation below the mean, so half the difference between them is approximately the standard deviation.
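The car-speed example can be checked with a small simulation. This is a sketch under stated assumptions: the speeds are drawn from a normal distribution with a hypothetical mean of 50 km/h and standard deviation of 10 km/h, and the random seed is fixed only so that the run is repeatable. It compares the computed standard deviation with the "count 341 cars up and down" estimate.

```python
import bisect
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical data: 1000 car speeds in km/h, assumed normally distributed.
speeds = [random.gauss(50.0, 10.0) for _ in range(1000)]

mean_speed = statistics.mean(speeds)
std_speed = statistics.pstdev(speeds)

# Fraction of cars within one standard deviation of the mean.
within = sum(1 for s in speeds if abs(s - mean_speed) <= std_speed)
fraction_within = within / len(speeds)

# Counting method: order the cars, then step 341 cars either side of the mean.
ordered = sorted(speeds)
first_above = bisect.bisect_left(ordered, mean_speed)  # number of cars below the mean
upper_speed = ordered[first_above + 340]  # 341st car above the mean
lower_speed = ordered[first_above - 341]  # 341st car below the mean
half_spread = (upper_speed - lower_speed) / 2

print(f"fraction within one standard deviation: {fraction_within:.3f}")
print(f"standard deviation: {std_speed:.2f} km/h")
print(f"half the counted spread: {half_spread:.2f} km/h")
```

With normally distributed data the fraction should come out near 0.68 and the two estimates of the standard deviation should be close, although they will not agree exactly for a finite sample.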

Another notation which is preferred in some technical work is to write:

3.12(3) m

This is the same value and uncertainty as before: the digit in brackets is the uncertainty in the last quoted decimal place, so 3.12(3) m means (3.12 ± 0.03) m. A benefit of this notation, apart from being more compact, is that it is only used to report a standard deviation. The ± notation can sometimes be confused with the symbol for ‘fixed limits’ on an engineering drawing.
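The compact notation can be generated in the same spirit. The helper below is a hypothetical sketch, not a standard function: it scales the uncertainty to the last quoted decimal place and writes it in brackets after the value.

```python
def format_concise(value, uncertainty, unit, decimals):
    """Format a result in concise form, e.g. '3.12(3) m'.

    The bracketed digits are the uncertainty expressed in units of the
    last quoted decimal place of the value.
    """
    digits = round(uncertainty * 10 ** decimals)
    return f"{value:.{decimals}f}({digits}) {unit}"


print(format_concise(3.12, 0.03, "m", 2))  # prints "3.12(3) m"
```

The decimals argument should match the precision of the uncertainty. An uncertainty of 0.03 quoted to one decimal place would round to zero, and the bracket would then be meaningless.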

Whichever notation you adopt, make sure that your readers know what it means.
