ENGG1811 Lab 3: Data Analysis using Calc

In this lab you will learn how to analyse your data using some of the functions and features offered by OpenOffice.org. These features have been discussed in detail during the lectures, and in this lab you need to apply that knowledge to obtain relevant information from various data sets.

Marking Scheme

Each lab is worth 3 marks: 1 mark from the online assessment and 2 marks from your tutor's assessment of your lab work.

If you have any questions, please ask your tutor.

Download

Please download the lab spreadsheet document lab03.ods for the lab exercises.


Exercise 1: Fitting Equations to Data

Imagine that your employer has asked for some analysis of global temperature trends. The sheet Ex1_GlobalTemps contains a column of temperature values for the years 1850 to date. They are estimates of the global mean temperature over the Earth's surface averaged over each month of the year, based on integrating measurements from thousands of locations across the globe. The data is a summary of what's known as the HadCRUT4 data set. The numbers are expressed as differences in degrees Celsius from the 1961-1990 mean, and are called "anomalies".

Construct a scatter chart showing the data: select the X and Y data as a single 2-column range, including labels at the top, then choose a suitable chart type (lines, no markers). Note the trend of temperature from 1850 to 2014. Is it a rising or falling trend?

You have learned in the lecture that data can be used to do prediction. In this exercise, you will predict the future temperature by using two different ranges of past temperature data. You will first use a shorter data range, from 2003 on onwards, to do the prediction; after that, you will use the data from 1961 onwards to do another prediction. You will see later that the choice of ranges can have a significant effect on the prediction.

  1. Construct a scatter chart using the data from 2003-2014. At Step 3 ("Data Series") of the Chart Wizard, you will notice that "Column B" is used as the name for this series and "Column B" appears in the legend of the chart. You need to change that by doing the following:
    1. Within the box "Data series", click on "Column B". Within the box "Data ranges", click on "Name".
    2. Click on the button next to "Range for Name" indicated by the red arrow below. You can now click on cell D23 which contains the text Anomaly 2003 to date which is a good description of the time series.
range-for-range.png

  1. Using Format - Data Ranges, add a new series for the years 1961 to date. You can select X-Values and Y-Values from the Data ranges box and point to the ranges. For the Name, use the text Anomaly 1961 to data provided in the cell D22. Note that the two series overlap at the end, you may wish to make the line for the Anomaly 1961 to data series using a thinner line, say 0.5mm.

  2. When the chart is in edit mode you can select any element easily with the drop-down list where the name box normally is. Select the Anomaly 1961 to date series and add a linear trend line and its equation. You can adjust its colour by right-clicking or using the drop-down and adjacent Format Selection button.

  3. The x-coefficient in the equation is what the trend model claims is the mean annual increase in global temperature over the period. Type that value into the yellow box F22.

  4. Turn off the auto-scaling for the X-axis and make it range from 1870 to 2070 to see the predicted increases over or beyond your working lifetime. Adjust the Y axis so you can see the top of the trend. In what year does the line cross the 1 degree axis? (The consensus among most climate scientists is that 1.5 to 2.0 degrees above pre-industrial levels, or about 1.0 to 1.5 degrees above the baseline in this data, is a tipping point where runaway warming is likely to occur.)

    Write your answer in the yellow cell (H22) on the worksheet.

  5. Much of the debate, or misinformation, around climate change surrounds apparent trends over the last few years. Add a trend line for the Anomaly 2003 to date series. This is such a short time in historical terms that random variation is a factor and any interpretation becomes far more subjective. Nevertheless, apply a linear trend: this is a gold mine for climate skeptics! Soon everything will be back to normal, according to this flawed analysis. Enter the slope in the yellow cell F23.

Moral: Data is never really objective, as decisions about its selection and interpretation can profoundly influence the results of reasonable and objective analysis. Although there are many statistical games that we can play, it's also possible to do more analysis. 

One method is to check which of the two predictions Anomaly 1961 to date or Anomaly 2003 to date is better. The two trendlines give predictions for years 2015, 2016 and so on. You may perhaps wait for the data from year 2015 and then see which of these two trendlines will give you a better prediction. Say, if the trendline based on Anomaly 1961 to date gives a prediction closer to the actual temperature anomaly in 2015, then we may have a bit more confidence in the prediction of Anomaly 1961 to date. You can do this again in 2016.

You may ask whether there is a real need to wait for 2015. Can we do this sooner? Yes, you can. You can use (1) the data from 2003-2013 to predict the temperature in 2014; (2) The data from 1961 to 2013 to predict the temperature in 2014. You can check which one will give you a better prediction. 

The technique that we have used in the last paragraph to gauge how well the trendlines predict is studied under the topic of model validation in statistics.

Note: Your employer believes presentation is important, so they insist that the chart is well presented, with a neat, clear title and year scale. The three traces and two trendlines should be identified using different line colour. For maxmimum marks, you must show you can do this.

Exercise 2 : Solving Equations Graphically and with Goal Seek

We can use a chart (plot) to search for a possible root(s) of a given equation, mainly so we know where to direct more detailed efforts.

  1. In the worksheet named Ex2_Roots, create a table as shown below for the values of x and f (x), where f (x) = x3 17x + 12. The value of x should range from 5 to 5, with the increment of 0.4, or whatever is in cell $C$8.

  2. Plot a graph as shown below and estimate the three root values from this graph.

  3. Using the Goal Seek tool and the pink and green boxes as a workspace, calculate each of the three root values. Copy each answer to the boxes provided, in increasing order, using copy/paste values rather than typing. The auto-assessment will confirm that you have used Goal Seek properly, and will check the largest root to more than the number of decimal places displayed.

Exercise 3: Catenary


St Louis Gateway Arch
Photographer: Daniel Schwen
Source: Wikimedia

"Catenary" (from the Latin for chain) is the name given to the shape of a flexible rope or cable that's fixed at each end and sags under the influence of gravity. An inverted catenary forms an elegant arch, as shown here.

At first glance it might look like a parabola, but actually it's described by the hyperbolic cosine function. Don't panic: all you need to know is that cosh (pronounced "cosh") and sinh (pronounced "shine" or sometimes "sinch") are just ways of expressing sums and differences of simple exponentials, although they have their own wonderful system of trigonometry.

Here L is the horizontal distance between the end points, which have the same vertical position. S is the cable length, and h is the sag height, equal to f (L/2) f (0). The lowest point of the curve goes through the origin.

The overall shape is governed by the parameter a. It's the pink variable cell on the Ex3_Catenary sheet, and everything else depends on it.

  1. First plot f (x) to see the shape. We've filled in the x column from A20 down for 100 points, and the coloured cells in column D have defined names to make it much easier to enter formulas. Fill in the y = f (x) formula in B20 and fill down. Chart the result using a scatter plot, smooth lines, no markers.

  2. When a is 50 or so the shape is very much like a parabola, but change it to 15 and the flattening of the lower part is evident.

  3. Enter formulas for S and h in the cells provided. The cosh and sinh functions are defined in Calc.

  4. Assume a real cable is going to be used to span a 120m gap, and its linear density w is 2.5kg/m (typical of a 25mm diameter multistrand steel cable). The cable has a total mass m, which exerts a downward force Fv that varies with distance from the centre, and it exerts a uniform horizontal force Fh . The relevant equations to calculate the tension T at the supports are

    Enter these four formulas into the grey cells D11:D14. Use cell names, not addresses. Note that the forces are in kN not newtons, so there's a factor of 1000 to include (is it multiplied or divided?).
  5. We can manipulate any parameter of the model without using any algebra at all. Just ask Goal Seek to adjust the value of a until a desired property such as cable length, sag or maximum tension is achieved. There are 4 questions on the sheet that you can answer with the help of Goal Seek, but try this first:

    You've been asked to design a simple flexible footbridge, but to minimise vertical movement the client has specified a maximum of 8m sag over the 120m span. You normally use supports that can safely withstand 5kN. Will the supports be adequate?
  6. Hint: For Q3 and Q4, a graph which shows how tension T is influenced by the parameter a is useful.
References:
Lux, J (2003). Catenary Curves. Last accessed 2014-03-12.
Calvert, JB (2000). The Catenary. Last accessed 2014-03-12.

Remark: suspension bridge cables usually support a roadway, which has much greater mass than the cables themselves. In this case the curve is indeed parabolic, not catenary.

-- end --
2014-03-18. Clarified column used for calculations in Exercise 1.
2016-08-04. Added hint for item 5 in Exercise 3.