Safe Numerics

  • Robert Ramey says:

    First of all, thank you for a very thorough and exhaustive review. I went over every point, made a few changes to the library and many enhancements to the documentation in response to this.

    Problems

    The compiler would not compile the BOOST_TYPEOF_TPL macro inside boost::enable_if.

    Fixed and tested with gcc 4.8.1

    Design

    (( GENERAL IDEA ))

    (*) I am not entirely clear about the scope of the library. I understand the problem domain: make it easy to deal with C++ built-in arithmetic types (or only built-in integral types?); to turn a number of surprises into a predictable and portable behavior.

    Correct. I’ve reworded the introduction/summary to try to clarify this.

    (*) Next question is, what tool(s) do you offer. …

    I’ve enhanced the tutorial to illustrate some of these other tools. Also I’ve added a little explanation to introduction/summary to mention these.

    (*). The title of the library confused me. Adjective ‘safe’ is ambiguous in this context. It probably saves us from overflow, but by what means and at what cost? For instance, one could expect that it means detecting signed-unsigned mismatch at compile-time.

    I’ve altered introduction/summary to clarify this.

    (*) In general, using word “safe” in C++ has negative connotations, like “have any problem with correctness, or got lost with the logic? – throw an exception or put a defensive if”. I suggest avoiding such negative connotations.

    Added an answer to the Rationale section of the documentation.

    (*) Also, the term ‘numerics’ implied to me that this library helps with managing floating-point types. On closer look, I now believe that it does not, but to me ‘numerics’ is related more closely to floats than to ints. Only after I read the reference to the SafeInt library did I understand that you are really fixing ints (not floats).

    (*)How about calling it “Small Integer”? Or “Checked int”?

    I’ve added the answer to this question to the “Rationale” section.

    (*) After a short conversation with Robert I got the idea: the library considers its underlying type (e.g., a 64-bit int) a resource. If the resource is “exhausted” as a result of an arithmetical operation or conversion, it is signaled via an exception or an installed handler.

    I wouldn’t characterize it this way – but I won’t dispute it if it works for you.

    (*)However, I was not able to figure it out from the first page of the documentation. Also, it is not the same as saying “we turn UB into an exception”. The latter has more negative connotations:

    LOL – it is negative – but we have no choice if we want to write correct code.

    (*)putting a defensive check just before any potential UB, is often considered a bad practice.

    by whom? Is leaving code which can produce incorrect results better? Note that the documentation contains references to various sources which recommend exactly this approach to mitigate the problems created by this C/C++ behavior.

    (*) Please specify more clearly that causing an overflow() on a safe type is not an error or a bug, but a resource shortage or implementation limitation. This avoids the negative connotations.

    LOL – It’s a case where the programmer has presumed that the result will be arithmetically correct but in fact is not. That IS a bug.

    (*) The idea of safe alone is sound and useful. We have an object of the same size as int, slower, but with the guarantee that any operation is either mathematically correct or an exception is signaled.

    agreed.

    (*) The addition of other “tools” like safe_compare is of dubious usefulness. Why is safe alone not enough? If there are some reasons to keep them, why are the practical use cases not indicated in the tutorial?

    I’ll look at this.

    (*)IOW, where _in practice_ would I need to use safe_integer_range instead of safe?

    I added a practical use case to the tutorial. There are others but I only added the one.

    (*). Note that while overflow on signed integers is UB, as the introduction describes, overflow on unsigned integer types is well defined.
    Quoting the standard ([basic.fundamental]/4), “Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer.” The footnote therein says, “This implies that unsigned arithmetic does not overflow because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting unsigned integer type.”

    That’s exactly the problem. C/C++ has erroneous arithmetic built into the standard. So one cannot presume that a C/C++ expression will not (silently) produce an arithmetically incorrect result. BUT safe versions will NEVER produce such a result (though they may trap instead).
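
    To make the point concrete, here is a small self-contained illustration (not code from the library): both results below are perfectly well defined by the standard, yet neither is the arithmetically correct answer.

    #include <iostream>
    #include <limits>

    int main() {
        unsigned int a = std::numeric_limits<unsigned int>::max();
        unsigned int b = a + 1u;     // well defined: wraps modulo 2^n
        std::cout << b << '\n';      // prints 0, not max + 1

        unsigned int x = 2u;
        unsigned int y = 3u;
        std::cout << x - y << '\n';  // prints a huge value, not -1
        return 0;
    }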

    (*) I am not sure if this makes any difference in practice. However, at the GoingNative 2013 “Ask Us Anything” panel (http://channel9.msdn.com/Events/GoingNative/2013/Interactive-Panel-Ask-Us-Anything) all the people I consider experts agreed that one should not use unsigned integers for anything but bit masks.

    I added an answer to this in the “Rationale” section.

    (*)They even said using size_t for the size of the containers was a mistake. Perhaps rather than making unsigned int “safe” we had better encourage people simply not to use it.

    LOL – That could/would create its own set of very hard to find bugs. I guess I’ll just have to accept the fact that I’m disagreeing with the experts here.

    Implementation

    (*). Very clean code, well structured. I can immediately see what it does. It compares favorably to much other Boost code.

    thank you. – still not good enough though – and still needs implementation of type constraints.

    (*). I tried to break it on my compiler (gcc 4.9.1 on Windows), but could not. Which means the library covers all UB cases I can think of.

    (*). I believe that the current implementation invokes undefined behavior. The implementation of “safe” addition on signed integers (function add()) first performs the “raw” addition, allowing it to overflow, and then checks if the result is “credible”. This only works under the assumption that signed integers are “wrapped” upon overflow, but I do not think you can safely make this assumption. What if in my environment an overflow is handled by “saturation” or by raising a hardware exception or a signal? I suppose there is another way, given x, y, and numeric_limits<T>::max(), to check if adding x and y would overflow without causing the overflow. For instance, I imagine that if T == U and T is unsigned, you can check the addition overflow like this:

    if (t > (std::numeric_limits<T>::max() - u)) overflow();
    else return t + u;

    I added to the rationale under the question “It looks like it presumes two’s complement arithmetic at the hardware level. So this library is not portable – correct? What about other hardware architectures?” Does this answer the question? If not, I can expand the answer.
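
    For reference, a minimal sketch (my own illustration, not the library’s add()) of how signed addition can be range-checked before the raw operation, so that no overflow – and hence no undefined behavior – ever occurs, regardless of how the hardware handles an out-of-range result:

    #include <limits>
    #include <stdexcept>

    // Check the range before adding, so the raw addition never overflows.
    int checked_add(int t, int u) {
        if (u > 0 && t > std::numeric_limits<int>::max() - u)
            throw std::overflow_error("addition overflow");
        if (u < 0 && t < std::numeric_limits<int>::min() - u)
            throw std::overflow_error("addition underflow");
        return t + u;  // guaranteed to be representable
    }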

    (*) Reporting the overflow. I do not like the way the response to the detected overflow is controlled. The way BOOST_NO_EXCEPTIONS is supposed to work is this: any Boost library should call the function boost::throw_exception(). This function already takes care of the problem of whether exceptions are supported on the platform or not. Why would you allow the user to customize what happens upon overflow in a no-exceptions environment, but force *the* solution when exceptions are enabled?

    In the rationale – I’ve touched upon this. See “Why is there no policy driven design for handling overflows?” In my view, there are serious issues with Boost Exception in its design, implementation and review. I won’t go into them here. You can find the discussion on the developers’ list. I’m not entirely satisfied with the current solution – but it’s good enough for now.

    (*) This would be useful for debugging, if we were allowed to customize the behavior of the overflow handler, but currently we cannot (when exceptions are enabled).

    (*). I am missing the ability to customize what the library does in response to the detected overflow (or underflow). In the rationale section I read that policy-based customization was rejected because it was not clear how to respond to overflow involving two safe integers with different policies. While the concern is sound, I disagree with the resolution. My recommendation is to allow the policy-based customization and to make operations involving two different policies a compile-time error.

    I would agree with this and expect to see it in a future release.
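
    For what it’s worth, a rough sketch of what that might look like – a hypothetical design, not the library’s current interface – where mixing two different overflow policies is rejected at compile time:

    #include <cstdlib>
    #include <stdexcept>
    #include <type_traits>

    struct throw_on_overflow {
        static void overflow(const char* msg) { throw std::overflow_error(msg); }
    };

    struct abort_on_overflow {
        static void overflow(const char*) { std::abort(); }
    };

    // Hypothetical policy-parameterized safe type.
    template<class T, class Policy = throw_on_overflow>
    class safe {
        T value_;
    public:
        safe(T v = T()) : value_(v) {}
        T value() const { return value_; }
    };

    // Operations require both operands to use the same policy.
    template<class T, class U, class P1, class P2>
    safe<decltype(T() + U()), P1>
    operator+(const safe<T, P1>& lhs, const safe<U, P2>& rhs) {
        static_assert(std::is_same<P1, P2>::value,
                      "cannot mix safe types with different overflow policies");
        // ... range check here, calling P1::overflow(...) on failure
        return safe<decltype(T() + U()), P1>(lhs.value() + rhs.value());
    }

    int main() {
        safe<int> a(1), b(2);
        safe<int> c = a + b;          // same policy: compiles
        // safe<int, abort_on_overflow> d(3);
        // a + d;                     // would not compile: mixed policies
        return c.value();
    }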

    (*). Do you allow and intend the library to be used like this:

    #ifndef NDEBUG
    typedef safe<int> int_t;
    #else
    typedef int int_t;
    #endif

    I see no problem with this if someone finds this useful. An improvement in the overflow customization would likely eliminate the need for this. The whole issue of customization of the error policy is more complicated than meets the eye. I spent some time on it but it started to get out of hand so I’ve left things in the current state for now.

    (*) Why do you specialize numeric_limits for “safe” types? Do you need it?

    (*). Why do you need safe to also be Numeric? Do you expect someone to use a nested safe<safe<T>>?

    I added to “Rationale” to answer this.

    (*) According to some interpretations of the Standard, this is undefined behavior (http://stackoverflow.com/questions/16122912/is-it-ok-to-specialize-stdnumeric-limitst-for-user-defined-number-like-class).

    This is correct behavior. The “Long Answer” correctly explains why.

    (*). Numeric concept. Suppose it is legal to specialize numeric_limits for user-defined types. It is possible that for some bounded integer types with specialized numeric_limits, your implementation of safe_int will not work.

    It’s certainly not guaranteed to work. In such a case a specialization might be necessary.

    For example, if you make a 48-bit integer which uses two’s complement arithmetic, you can create a numeric_limits specialization for it and safe will work with no change.
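
    As a sketch of what such a specialization might look like for a hypothetical 48-bit integer class (the type and its members here are invented purely for illustration):

    #include <cstdint>
    #include <limits>

    // Hypothetical 48-bit signed integer stored in the low bits of an int64_t.
    struct int48_t {
        std::int64_t value;
    };

    namespace std {
    template<>
    class numeric_limits<int48_t> {
    public:
        static constexpr bool is_specialized = true;
        static constexpr bool is_signed = true;
        static constexpr bool is_integer = true;
        static constexpr int digits = 47;
        static constexpr int48_t max() noexcept {
            return int48_t{ (std::int64_t(1) << 47) - 1 };
        }
        static constexpr int48_t min() noexcept {
            return int48_t{ -(std::int64_t(1) << 47) };
        }
        // ... remaining members required of a numeric_limits specialization
    };
    }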

    (*) The implementation relies on the assumption that numeric_limits<T>::max() is never greater than numeric_limits<std::uintmax_t>::max().

    correct – if this isn’t true for a given type T, safe implementation will have to be specialized for this T.

    (*) But consider the following type: safe will not handle it, even though it satisfies all the requirements of concept Numeric.

    correct – if this isn’t true for a given type T, safe implementation will have to be specialized for this T.

    BTW – I’d love to see someone implement int128 and int256 along the lines of your example. I think they would be very useful in some encryption applications. Again – the default implementation of safe would not work and would have to be specialized.

    (*)Perhaps you should tighten the requirements so that T is an arithmetic type?
    too narrow.

    (*)Or require by other means that numeric_limits::max() is convertible to uintmax_t without the loss in accuracy.
    not possible for all T – like int128 or fixed decimal or modular integer, or …

    (*). safe_unsigned_range requirements: “MIN must be a positive integer literal.” I think it should be “MIN must be a non-negative integer literal” (i.e., allow the number 0).
    correct – good catch

    (*) Also, it has the requirement MIN < MAX. Why not MIN <= MAX? It is strange, but not incorrect.

    probably correct – I’ll make the change.

    (*). In the implementation of “safe” subtraction, I can see the line “overflow("safe range subtraction result underflow");”. I wonder if overflow() is a good choice of name, if it also serves to indicate an underflow.

    agreed. The whole exception specification, handling, nomenclature, override, customization, etc. needs another pass. It’s doable, but for reasons that are hard to explain, is more work than one might think. It’s a pending issue.

    (*). In safe_range_base, when you implement operator+, the return type is declared as BOOST_TYPEOF_TPL(Stored() - U()). Shouldn’t it be BOOST_TYPEOF_TPL(Stored() + U())?

    correct of course – fixed – though the C++ expression type rules are the same for addition and subtraction, so it never made any practical difference. Note that some day one might apply safe to some T which isn’t a primitive integer type, so this code – if it survived the specialization – might be important.

    Documentation

    (*). Good. It has a short motivating section, an overview of concepts and a clear reference.

    (*). The introduction does not indicate clearly what the scope of the library is (just safe or a set of loosely related independent small tools).

    I’ve tried to clarify this.

    (*). There are three motivating examples in the beginning of the tutorial. I think the 2nd and 3rd have the code examples swapped. Problem 2 says that “a value is incremented/decremented beyond it’s domain”, but it is the code in problem 3 that illustrates it, and vice versa.

    correct – and I’ve added more examples.

    (*). It is difficult to figure out whether your library checks the overflow (or something else) on floating-points. The word “numerics” as well as the note that it works for any types for which numeric_limits is defined implies “yes”. But the requirement on the XOR operator, as well as the reference to SafeInt imply “no”. Now that I have looked at the implementation, I know that it has nothing to do with floats. But I believe it should be made clear from the outset.

    correct – I’ve tried to clarify this.

    (*). The introduction says that the library checks for safety at compile-time wherever possible: “the results of these operations are checked to be sure any possible errors resulting from undefined arithmetic behavior are trapped at compile time (if possible)”. Nowhere in the documentation could I find an example of such a compile-time failure. I really expected an example. What do you mean by that? I know that in my GCC compiler, I can enable warnings on operations between two different integer types, and then I can promote warnings to errors. Does this library offer anything else?

    (*). Constant runtime complexity is indicated as O(0) in the docs. This should be O(1).
    I don’t think O(1) is correct. In fact I don’t think O(0) is correct either. The time to complete an operation is fixed, which would suggest O(0). But since there is no variable on which to specify the O (i.e., no number of elements or the like), the concept of “complexity” doesn’t apply here. I’m just going to eliminate it.

    (*). Numeric concept. Doc says: “a type T is Numeric if there exists specialization of std::numeric_limits<T>”. In that case, why not follow the Standard wording and call the concept “Arithmetic”?

    There is no C++ concept “Arithmetic”.

    There is the type trait “std::is_arithmetic”, but that returns true for built-in floating-point and integer types and false for everything else. This cannot be redefined.

    So the only choice is to create a new concept – “Numeric”. This is defined as true if numeric_limits is specialized for type T. Note that this excludes things like std::complex. Also note that defining the “Numeric” concept makes the documentation considerably shorter, easier to understand, and likely to contain fewer bugs.
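
    In code, the concept amounts to little more than this (a minimal sketch of the idea, not the library’s actual trait):

    #include <limits>
    #include <string>
    #include <type_traits>

    // A type models "Numeric" if std::numeric_limits is specialized for it.
    template<class T>
    struct is_numeric
        : std::integral_constant<bool, std::numeric_limits<T>::is_specialized> {};

    static_assert(is_numeric<int>::value, "int is Numeric");
    static_assert(!is_numeric<std::string>::value, "std::string is not Numeric");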

    (*) look at the parameters in these two signatures from the docs:

    safe_signed_range
    safe_unsigned_range

    You would want to have them displayed in the same style, more uniformly.

    correct – fixed

    (*). In the specification of safe, in Valid Expressions, there is something wrong with the last line (typeof(T * U)). You probably meant something else.

    LOL – I’m sure I did – but I can’t imagine what it was – so I eliminated the line from the table.

    (*). In the documentation of safe_cast, we have section “Preconditions”, which says, “The value of u must be representabl by the type T. If this is not true, an std::out_of_range exception will be thrown.” This is self-contradictory: having a precondition means that the implementation need not care what happens when it is violated, and the user cannot rely on the results of a function run with a broken precondition. In contrast, you define what happens when u is not representable by T, and you want the users to be able to rely on that. safe_cast is defined for any value u whatsoever: it has no precondition.

    Hmmm – I googled around and found https://akrzemi1.wordpress.com/2013/01/04/preconditions-part-i/ – a very interesting thread on the subject. In that thread there’s a long discussion about how to “handle” a violation of a pre-condition, and one of the options discussed is throwing an exception. So I’m not inclined to change the documentation on this account. On the other hand, I’m thinking that it would be a good idea to use BOOST_STATIC_ASSERT(is_convertible) to catch more egregious violations at compile time. Or even better, verify that numeric_limits<T>::min() <= numeric_limits<U>::min(), etc… As you can see – there’s a lot of food for thought in such a simple idea. And of course adding concept checking to the library (as recommended at www.blincubator.com) would also be a useful exercise.
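
    To illustrate the kind of thing Robert has in mind, here is a hypothetical checked_cast (not the library’s safe_cast) that combines a compile-time convertibility check with the run-time range check; it is deliberately limited to integral types of the same signedness to keep the comparison logic simple:

    #include <limits>
    #include <stdexcept>
    #include <type_traits>

    template<class T, class U>
    T checked_cast(const U& u) {
        // Catch egregious misuse at compile time.
        static_assert(std::is_convertible<U, T>::value,
                      "U must be convertible to T");
        static_assert(std::is_integral<T>::value && std::is_integral<U>::value &&
                      std::is_signed<T>::value == std::is_signed<U>::value,
                      "sketch limited to integral types of the same signedness");
        // Compare in the wider common type, then perform the now-safe conversion.
        typedef typename std::common_type<T, U>::type C;
        if (static_cast<C>(u) < static_cast<C>(std::numeric_limits<T>::min()) ||
            static_cast<C>(u) > static_cast<C>(std::numeric_limits<T>::max()))
            throw std::out_of_range("value not representable by the target type");
        return static_cast<T>(u);
    }

    int main() {
        short s = checked_cast<short>(1000);    // fine
        (void)s;
        // checked_cast<short>(100000);         // would throw std::out_of_range
        return 0;
    }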

    (*). In a number of places, e.g., when specifying safe_cast behavior, the docs say that if the result is not representable, std::out_of_range is called. If you consider allowing the user to customize what happens, you may want to say that “overflow() is called” rather than “exception is thrown”.

    The whole question of handling violations is in need of refinement.

    (*). The docs never say if the mutating operations on safe offer a strong or a basic exception-safety guarantee. If I execute safe_i *= j and an exception is thrown, am I guaranteed that safe_i will remain unchanged? From the implementation it looks like I am, but is it something you want to commit to, or just an unintended effect of the current implementation?

    I agree with your analysis of the implementation. Truth is, I never thought of this. And I think I’d be willing to commit to that – at least for safe<T> where T is a built-in type. So I’ve updated the document accordingly.
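
    The strong guarantee essentially falls out of the natural “compute and check into a temporary, then commit” structure of the compound assignment. A self-contained sketch of the pattern (my illustration, not the library’s code):

    #include <limits>
    #include <stdexcept>

    class safe_int {
        int value_;

        static int checked_multiply(int a, int b) {
            // Perform the multiplication in a wider type, then verify it fits.
            static_assert(sizeof(long long) >= 2 * sizeof(int),
                          "a wider type is needed for the check");
            const long long r = static_cast<long long>(a) * b;
            if (r > std::numeric_limits<int>::max() ||
                r < std::numeric_limits<int>::min())
                throw std::overflow_error("multiplication overflow");
            return static_cast<int>(r);
        }

    public:
        explicit safe_int(int v = 0) : value_(v) {}
        int value() const { return value_; }

        safe_int& operator*=(const safe_int& rhs) {
            // May throw; value_ has not been modified if it does.
            const int result = checked_multiply(value_, rhs.value_);
            value_ = result;  // commit only after the check succeeded
            return *this;
        }
    };

    int main() {
        safe_int a(std::numeric_limits<int>::max());
        safe_int b(2);
        try { a *= b; }                          // always overflows: throws
        catch (const std::overflow_error&) {}
        // a is unchanged: the strong guarantee
        return a.value() == std::numeric_limits<int>::max() ? 0 : 1;
    }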

    (*). In the description of safe, section “Notation”, you are missing the definition of the symbol ‘su’.
    correct

    (*). Section “Rationale” has an item “Why does a binary operation on two safe values not return another safe type?” This implies that safe op safe does not return safe, but V. On the other hand, in the section describing safe, we read in “Valid Expressions” (3rd row) that the return type of “st op su” is “safe”. These two pieces of information appear contradictory.

    correct – I’ve corrected the document to match the code. Originally I had them return safe types. But that caused some complexities and I had a lot of trouble with it. It might change in the future, but for now safe op safe returns t op u (assuming it doesn’t throw).

    (*) I know of at least three libraries that overlap in scope with yours. Two already belong to Boost, the third is a Boost candidate:
    1. Boost.NumericConversion: http://www.boost.org/doc/libs/1_57_0/libs/numeric/conversion/doc/html/index.html

    I tried to use this and I couldn’t figure out the documentation. No joke. I’m thinking it’s similar to safe_cast(const U & u); but I honestly can’t say. It’s been a long time since I looked at it though. It might have been updated, and/or since my knowledge of the subject has improved, it might make more sense to me now.

    2. boost/multiprecision/cpp_int: http://www.boost.org/doc/libs/1_57_0/libs/multiprecision/doc/html/boost_multiprecision/tut/ints/cpp_int.html

    That is a whole different thing. It does have a safe mode. But more importantly, it is a

    3. Boost.Constrained Value: http://rk.hekko.pl/constrained_value/

    This was reviewed and accepted subject to conditions – but it seems that it never made it into the official release. I don’t know what the status is.

    I would really like to see them compared with your library. This helps me answer the question why I should pick yours rather than one of the three above.

    (*). I have one remark. I would like to see it mentioned in the docs. The library turns UB into well-defined behavior. There exist static analysis tools that can detect UB-based bugs in the code. When you decide to go with safe, you are removing the UB and therefore the opportunity for the static analyzer to find a bug in your code.

    correct

    Not a bad trade-off, but worth mentioning in the docs.

    Hmm – OK I’ll have to think about where to mention it.

    (*). A similar statement appears in two places in the docs:
    1. Introduction -> Problem: “When the result of arithmetic operations exceeds this number of bits, the result is undefined…”
    2. Tutorial -> Problem: Arithmetic operations can yield in correct results: “When some operation results in a result which exceeds the capacity of a data variable to hold it, the result is undefined.”

    I’ve tried to refine the language in the documentation to make it more correct and easier to read.

    (*) Although I know what you want to say, this statement is technically incorrect.
    First, for unsigned integer types, the behavior is well defined in the Standard: modulo-n arithmetic. True, someone can be surprised, but it can be argued that it is a mistake to choose an unsigned type and expect anything other than modulo-n arithmetic.
    Second, and more subtly, it is not “arithmetic operations” that cause UB or problems. For instance, I can have my own class: an infinite-precision integer which provides arithmetic operations, and these will not cause UB or overflow. It is only operations on _arithmetic types_ that have UB problems.

    (*)”Arithmetic type” is a well defined term in the Standard: only those scalar (built-in) integer or floating-point types.

    Hmmm – “arithmetic type”, as supported by std::is_arithmetic, is well defined. “Arithmetic” as a concept isn’t. Not that any of this is relevant to this discussion.

    I think we’ve been here before. My focus is not on undefined behavior but on arithmetically incorrect behavior. Boost.Multiprecision addresses the same problem in a totally different way: it dynamically expands the representation so that correct arithmetic can be guaranteed, albeit at a cost in performance. So there would be no motivation to make a safe<..> version of these data types. Safe Numerics doesn’t do this – it guarantees that you won’t produce an incorrect result. Not quite the same thing, but still addressing the same problem. You can argue that it’s not reasonable for people to expect z = x + y in C++ to be arithmetically correct (I think it is reasonable), but the fact is that when programmers write that expression, that is exactly what they expect. Without a library like this, C++ cannot be used to write a program which is guaranteed not to return invalid results on straightforward expressions involving integer operations, unless one totally discards that arithmetic syntax for its expressions. If you do this, you might as well write in assembler. That’s even harder to verify as correct. BTW – C can’t do it at all.

    (*) In Function -> Overflow: there is a sentence, “If the environment does not support C++ exceptions, the user should implement this function and expect it to be called when appropriate. Otherwise, function is implemented…”. The word “otherwise” in there is ambiguous: it is not clear which condition it applies to. Given the strange way this overflow() is currently customized, it only adds to the confusion.

    simplified

    (*). In “Tutorial ..

    I made adjustment in the examples to simplify them and make them more clear.

    (*) All examples in the tutorial use this include directive:
    #include “../include/safe_integer.hpp”

    Hmmm –

    When I tried it, it didn’t compile. Since this is supposed to be a Boost library, why not make it
    #include “boost/safe_integer.hpp”

    (*). The docs say “requires a compiler compatible with the C++11 standard.”
    But it compiles fine with C++03.

    LOL – I was going to move to C++11 but haven’t gotten around to it. The main motivation is that constexpr might be useful in detecting some problems at compile time rather than waiting for run time – for example safe<10, 20> t = 0;. But I’m not there yet. I don’t think C++03 compatibility is a big issue, but I haven’t needed anything more yet.
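
    For illustration, a generic C++11 sketch of the constexpr idea (not the library’s interface): when the checked operation is used in a constant expression, an out-of-range value becomes a compile-time error instead of a run-time one.

    #include <stdexcept>

    // Throwing inside a constant expression is ill-formed, so an out-of-range
    // literal is rejected by the compiler rather than at run time.
    constexpr int checked_range(int value, int min, int max) {
        return (value < min || value > max)
            ? throw std::out_of_range("value outside the permitted range")
            : value;
    }

    constexpr int ok = checked_range(15, 10, 20);    // compiles
    // constexpr int bad = checked_range(0, 10, 20); // error: not a constant expression

    int main() { return ok == 15 ? 0 : 1; }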

    (*) Typos:
    1. “Introduction -> Requirements”: change “I requires” to “It requires”.
    2. “Introduction -> Problem”:
    3. “Tutorial -> Problem: Arithmetic operations can yield in correct results.”: in the title change “in correct” to “incorrect”.
    4. “Tutorial -> Problem: Arithmetic operations can yield in correct results.”: change “evaluating he expression” to “evaluating the expression”.
    5. “Tutorial -> Problem: Implicit conversions change data values”: change “arithment expression” to “arithmetic expression”.
    6. “Tutorial -> Problem: Implicit conversions change data values”: in sentence “A simple assign or arithment expression will generally convert all the terms to the same type.” There is something unclear: the same type as what? Also, “This solution is the same as the above” — above what?
    7. “Functions -> safe_cast“: change “representabl by” to “representable by”.

    (*) I think the 2nd and 3rd have the code examples swapped.

    Fixed

    (*). Numeric concept. Doc says: “a type T is Numeric if there exists specialization of std::numeric_limits<T>”. In that case, why not follow the Standard wording and call the concept “Arithmetic”?

    As far as I know, the Standard does not define the concept “Arithmetic”. See http://en.cppreference.com/w/cpp/concept. There is a type trait, std::is_arithmetic, but this is defined as true only for integral or floating-point types. My “Numeric” concept is meant to embrace any type which “acts like a number”, that is, any type T for which std::numeric_limits<T> is specialized. Thus, should one implement a “fixed point numeric” for which numeric_limits was defined, fixed_point_numeric would be considered a model of the “Numeric” concept.

    (*) I would like the existence of the additional (to safe) components (like safe_compare) to be clarified – what are they for if we have safe? Also, some of the above remarks deserve to be addressed.

    I’ve included changes to the documentation to attempt to clarify this