闪电代写 -代写CS作业_CS代写_Finance代写_Economic代写_Statistics代写_代码代做_IT代写_加急帮助

Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

MATH256 Numerical Methods

Part 2 Errors and rounding

2 Errors and rounding

Numerical Methods (5 Parts)

Main Message in one sentence:

If all numbers are allowed to have a ﬁxed number of digits (throwing away the rest)
e.g. x = 0.x1x2 ...x10 × 10n for 10 digits, what happens to subsequence operations or the accuracy?

In Part 1 (or chapter 1), we encountered the situation in which we needed to break a loop to obtain an approximate value for an inﬁnite series. We used the condition

|tn+1 | < 0.5 × 10-10 × |Sn|, (*)

where Sn is the partial sum up to term n and tn+1 is the ﬁrst term omitted. In this chapter, we will examine errors and approximations in more detail. In the process, we will see the justiﬁcation for (*).

2.1 Computer storage and correct rounding

Any nonzero x e R can be written in scientiﬁc notation; that is

x = a × 10p , where 0.1 < |a| < 1 and p e Z.

The ﬁrst nonzero digit of a always appears immediately after the decimal point. For example, 0.01206 = 0.1206 × 10-1 ,

203.002 = 0.203002 × 103 ,

and

_π = _0.31415 . . . × 101 .

Computers store numerical values in a similar way, using a ﬁxed number of signiﬁcant digits. Consequently,

most operations are subject to rounding error. With the default settings, Maple uses ten digits, so a typical number in Maple (omitting exponent)

= 0.a1 a2 . . . a10

= a1 × 10-1 + a2 × 10-2 + . . . + a10 × 10-10 .

Here the digits aj are integers between 0 and 9 inclusive, except a1 , which cannot be zero (unless = 0 exactly). We care about how an exact and nearby number a = 0.a1 a2 . . . a10 , a11 , . . . a50 . . . is approximated by by rounding. The correctly rounded approximation to a is the most accurate approximation stopping at a10. The true value for a then satisﬁes

|a _ | < 5 × 10-11 = 0.5 × 10-10 .

Note that numbers ending in 5 can be equidistantly rounded in two wayss, though traditional rounding chooses one way.

For example, to round the 11-digit number a = 0.12345678905 to only 10 digits, we can take

= 0.1234567890 or = 0.1234567891

with |a _ | = 5 × 10-11 in both cases. This is called a tie. The elementary rule ‘5 rounds up’ is usually not

used to resolve these because always rounding up can cause errors to accumulate. By default, Maple resolves

ties by requiring that the ﬁnal stored digit is even. Each error caused by a tie then has a 50% chance of being positive and a 50% chance of being negative, so they are unlikely to accumulate. Of course, numbers not ending in 5 never encounter such a tie situations e.g. a = 0.12345678914 → a = 0.1234567891 with no other choices.

Remarks:

● We may visualise why the digit 5 is special in rounding from below

│(10) │ _ _ _ 9 _ _ _ 8 _ _ _ 7 _ _ _ 6 _ _ _ [5] _ _ _ 4 _ _ _ 3 _ _ _ 2 _ _ _ 1 _ _ _ │(0) │

● If x and y are stored exactly (and y 0), then the result of x · y , xy and x/y will be correctly

rounded, as will ,x and ,y. However, this applies for a single operation only. Maintaining correct

rounding throughout a long calculation is more or less impossible.

● In other words, rounding error cannot be avoided, so we must accept that the last digit in a result is likely to be wrong (perhaps more in long calculations).

● Although it is not our intention to dig deep into the process of how to implement these elementary operations, it is not hard to see that one must take care of the components p ﬁrst and then operate between the decimal numbers a (often called mantissa) as below. Assume p2 2 p1 for x1 = a1 × 10p1 , x2 = a2 × 10p2 :

→ x2 · x1 = (a2 × 10p2 -p1 · x1 ) × 10p1 s 3 × 10p˜3 ,

→ x1 /x2 = (a1 /a2 ) × 10-(p2 -p1 ) s 4 × 10p˜4 ,

→ x1 × x2 = (a1 × a2 ) × 10p1 +p2 s 5 × 10p˜5 ,

where possibly p˜4 = _(p2 _p1 ), 1 _ (p2 _p1 ) and 5 = p1+p2 , p1+p2 _ 1 since 0.1 < |a1 /a2 | < 10 and |a1 × a2 | < 1.

● The number of digits stored by Maple can be increased by changing the parameter Digits. Remember: this controls the accuracy to which numbers are stored at each stage of a calculation. It does not guarantee the number of correct signiﬁcant ﬁgures in the ﬁnal answer.

● Rounding error places a bound on accuracy. Trying to beat this (e.g. by including more terms in a series) will never succeed.

Example

Suppose that x = 0.13, and consider the two approximations

cos(x) s 1 _ + and cos(x) s 1 _ + _ .

There are two sources of error here: rounding error and the truncation of an inﬁnite series. Using ten digit

arithmetic, we obtain

cos(x) s 0.9915619004 and cos(x) s 0.9915618937.

The second approximation is more accurate, so we have made an improvement by including an extra term

in the series. In fact, the second result is correctly rounded. Adding more terms cannot further improve the

accuracy unless we also decrease rounding error by including more digits.

2.2 Absolute and relative error

Notation: For any exact quantity x, represents an approximate value. The absolute error is given by

Ex = _ x. The signiﬁcance of absolute error depends on the scale of the numbers in

● A 5mm error in measuring out a football pitch is not important.

● A 5mm error in eye surgery is likely to be disastrous.

To take the magnitudes of x into account, we can use the relative error :

question.

εx =

_ x

x x

The percentage error is 100 × |εx|. Relative and percentage error are undeﬁned if x = 0.

Remark: In general, the sign of the error (absolute or relative) does not matter, only the magnitude is important.

2.2.1 Example: decimal places

This is a measure of absolute error; it takes no account of the magnitude of the numbers we are dealing

with. We must deﬁne exactly what we mean when we say an approximation is accurate to n decimal places.

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

Attempt 1: Count matching digits after the decimal point.

Counterexample: x = 1.00001

1 = 1.00200

2 = 0.99991.

Clearly, 1 approximates x to two places, but the more accurate approximation is 2 , in which none of the digits after the decimal point are the same as those of x.

Attempt 2: Round to n decimal places, then check if the results are the same.

This ﬁxes the above problem; 2 = 1.000 to three places and 0.9999 to four, so 2 = x to 3 d.p.

Counterexample: x = 1.530

1 = 1.535

2 = 1.525.

Here we have a tie (see section 2.1) : 1 and 2 are equidistant from x. It makes no sense to claim that

either approximation is ‘better.’ It also makes no sense to allow the number of correct d.p. to be inﬂuenced

by the mechanism used for resolving ties.

Attempt 3: Consider the diﬀerence between x and . If |x _ | < 10-n , then there are n zeros after the decimal point and is accurate to n places.

Counterexample: x = 1.560

= 1.55001

Clearly, is accurate to one place, but x _ = 0.00999, which has two zeros after the decimal. To prevent this, we require the n + 1th digit after the decimal point in _ x to be less than 5, so that it does not round up. Hence (ﬁnally), we have the following deﬁnition.

To state that is an approximation to x, accurate to n decimal places, means that

|x _ | < 5 × 10-(n+1) = × 10-n .

● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●

2.2.2 Example: signiﬁcant ﬁgures

This is a measure of relative error. Again, we must make a precise deﬁnition. We begin by writing x using scientiﬁc notation, that is

x = a × 10p , where 0.1 < |a| < 1 and p e Z.

Since the ﬁrst nonzero digit in a comes immediately after the decimal point, n decimal places in a is equivalent to n signiﬁcant ﬁgures in x. Thus, if

= × 10p

then approximates x to n signiﬁcant ﬁgures if and only if

|a _ | < × 10-n . (*)

a _ 10p × (a _ ) x _

a = 10p × a = x ,

so (*) yields

│ │ × |a| < × 10-n .

Finally, since |a| < 1, we have the following:

If │ │ < × 10-n then approximates x to n signiﬁcant ﬁgures.

Important remark: This is a suﬃcient but not necessary condition. It is stronger than necessary because we have dropped |a| from the left-hand side. If n is the largest integer for which it holds, then we deﬁnitely have n s.f. and we may have n + 1 (a could be as small as 0.1). We deﬁnitely do not have n + 2 s.f.

2.2.3 Example approximation of e

● If we approximate e using

˜e = 2721

then

εe = = _ 1 s _0.405 × 10-7 .

The test shows that we have 7 s.f. but we may have 8. We don’t because

e s 2.718281828

whereas

˜e s 2.718281718.

● If x = 1 and = 1.049, then

εx = = 0.049 < 0.5 × 10-1 .

The test shows that we have 1 s.f. but we may have 2 (we do).

● If x = 1 and = 1.051, then

εx = = 0.051 < 0.5 × 100 .

The test shows that we may have 1 s.f. (we do).

● If ε > 0.5, then the largest possible value for n is actually _1, so we deﬁnitely don’t have 1 s.f. An approximation with a relative error greater than 0.5 is largely useless.

2.2.4 Series summation condition

Consider a series in which the sum up to term n is Sn, and tn+1 is the ﬁrst term omitted.

Let x = Sn+1 and = Sn. Then

x _ Sn+1 _ Sn

x Sn+1

tn+1

Sn + tn+1

tn+1 1

Sn 1 + (tn+1/Sn) .

If |tn+1| < 0.5 × 10-10 × |Sn| then the last term is very close to 1, so Sn+1 and Sn are (almost certainly) the same to 10 s.f.

This explains the condition we use for breaking a loop in a series summation: nothing is achieved by including tn+1 in the sum.

Remarks: (i) Careless use of this condition can produce incorrect results (see Practice Problem 3.2 for

example). However, it is a good practical choice in many cases.

(ii) The above study on Sn and and Sn+1 is not a complete analysis, because it does not ensure that the remainder |Rn| = |(n+1 tk| is less than 0.5 × 10-10 . It is not trivial to give a general formula.

2.3 Cancellation

Rounding error alone is not usually problematic, but certain operations can magnify existing errors, which is

much more dangerous. Consider the example shown in quadratic.mw. Here, the quadratic equation

t2 _ 42t + 1 = 0

is solved using the standard formula, with the result

t1 = 41.97617696 and t2 = 0.02382304.

The second root has only seven signiﬁcant digits; three have been lost. A few such operations will produce

results that are completely wrong, so we need to investigate what has happened.

2.4 Error propagation

We can use the fact that εx = /x _ 1, and so

= (1 + εx)x,

to study how errors propagate through calculations. For a given operation f, we need to ﬁnd an expression

for the relative error in f(); that is

f() _ f(x)

εf = f(x)

in terms of x and εx. Note that we are not concerned here with the error generated by computing f (which should be small). Rather, we are concerned with the eﬀect that f has on the pre-existing error in . As we will see, this can be signiﬁcant.