尘世间的上帝之国

Something you should know about the abs function

It is a rule in math that the absolute value of a number is never less than zero. Does the same hold in programming languages? In most cases yes, with one exception. Let me take C as an example.

For a 32-bit C integer, the value ranges over [-2147483648, 2147483647]. So for x = -2147483648, what is the result of abs(x)? 2147483648, -2147483648, 2147483647, 0, or something else?

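The result depends on your compiler! The C standard leaves abs(INT_MIN) undefined, because the mathematical result, 2147483648, cannot be represented in an int. Be careful to handle this case in your code.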

Let's look at a simple implementation of the abs function, written here as a C++ template:

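#define INT_MIN -2147483648  // note: 2147483648 does not fit in an int, so this expression has a wider type
#define INT_MAX 2147483647
// The simplest way to implement the abs function.
// It differs from the C library version, so it is named my_abs here to avoid a clash.
template <class T>
T my_abs(const T& x) {
    return x > 0 ? x : -x;  // for x == INT_MIN, -x overflows (undefined behavior)
}

int main() {
    int x = INT_MIN;
    x = my_abs(x);
    bool x_is_positive = x > 0;  // false on typical two's-complement machines: x still equals INT_MIN
    return 0;
}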

So a potential bug may sneak into your code in a situation like this:

int arr[] = {1001, -999, -1000, INT_MIN, INT_MAX};
int size = sizeof(arr) / sizeof(arr[0]);
const int threshhold = 999;
for (int i = 0; i < size; ++i) {
    // compare the absolute value of each element with the threshold
    if (abs(arr[i]) > threshhold) {
        // do something to arr[i]
        // e.g. print arr[i]
        cout << arr[i] << endl;
    }
}

// RESULT
// Expected: 1001, -1000, -2147483648, 2147483647
// Actual output (when abs(INT_MIN) stays negative): 1001, -1000, 2147483647

But if you actually run the code above on your computer, you will probably find that it gives the expected output. Here is a Demo on ideone.com; I suggest you try it on your own machine!

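When the output does come out right, the most likely explanation is that the value was handled in a wider integer type such as long long, which is usually 8 bytes, rather than being forced back into an int. Because abs(INT_MIN) is undefined for int, a compiler may do this but is not required to. Note also that with INT_MIN defined as -2147483648, as above, the literal 2147483648 does not fit in an int, so the expression already has a wider type; in that case long long y = abs(INT_MIN); gives y equal to INT_MAX + 1, provided the abs being called accepts that wider type.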

This is about as much as a modern compiler can do to reduce your bugs. If you write the code this way:

int value = abs(arr[i]);  // the result is forced back into an int
if (value > threshhold) {
    // do something to arr[i]
}

This time the bug is more likely to bite you. Nigel Jones gives some good solutions to the problem in his article, The absolute truth about abs().

In summary, there are several ways to avoid the abs bug:

1. Always use a wider type to store your abs result, such as long long for int. But what type would you use for long long? A more general choice is therefore double, at the cost of conversion time, more CPU cycles, and, for 64-bit values, precision beyond 53 bits.
2. Write your own abs function, like this one: safe abs function.
3. Carry out the comparison in the negative space, because the negative range is larger than the positive range.

Solutions 2 and 3 both require you to write your own abs function.

That's my report on digging into the absolute value function in C. As the saying goes: think twice before you code :-)