Tuesday, May 7, 2013

Parameter Passing III


In previous postings, I have dealt with the processing-ordered sequence of parameter passing mechanisms in excruciating detail. In this post I will discuss a number of other factors besides how much processing is done to the arguments.

Pass-by-value-result

First, let’s look at aliasing. Aliasing is when two different variables denote the same location. It is what happens when you call a function with pass-by-reference:

var y;
function f(ref x)
 x := 1;
 do while x <= y;
   x := 2*x;
 od
end
f(y);

The variables x and y will alias each other during the call to f. This can have unexpected consequences for the writer of the function. The apparent purpose of f is to set its actual parameter to the least power of 2 greater than the global variable y. Instead, it produces an infinite loop when the user calls f(y): because x and y are the same location, the assignment x := 1 also sets y to 1, every doubling of x doubles y as well, and the test x <= y never fails.
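The same aliasing can be reproduced in C++, where a reference parameter plays the role of ref. This is only a sketch of the pseudocode above, not a faithful translation: the max_iters cap and the returned iteration count are additions made here so that the demonstration terminates, and demo is a made-up helper.

```cpp
#include <cassert>

int y = 5;  // global variable, as in the pseudocode above

// C++ rendering of f with a reference parameter.
// max_iters is an addition for this sketch: it stops the loop that
// would otherwise never end when x aliases y.
// Returns the number of doublings performed.
int f(int& x, int max_iters) {
    x = 1;
    int iters = 0;
    while (x <= y && iters < max_iters) {
        x = 2 * x;
        ++iters;
    }
    return iters;
}

// Call from main() to check both cases.
void demo() {
    int z = 0;
    y = 5;
    assert(f(z, 20) == 3 && z == 8);  // no aliasing: 1 -> 2 -> 4 -> 8

    y = 5;
    assert(f(y, 20) == 20);           // aliasing: only the cap stops the loop
    assert(y == 1 << 20);             // y was doubled along with x
}
```

With a separate variable the loop stops after three doublings. With f(y, 20), only the artificial cap ends it.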

Because of various problems like this, pass-by-reference has a bad reputation, and deservedly so. Fortunately, there is an alternative that offers nearly the same functionality without the unexpected interactions. The alternative is pass-by-value-result. Actually, pass-by-value-result is a combination of pass-by-value and pass-by-result.

For a pass-by-result parameter, the value of the actual parameter is ignored and the formal parameter is considered undefined at the beginning of the function (unless it is declared with an initial value in the formal parameter list). The formal parameter should be assigned a value during the function call, but this value does not get put back into the actual parameter until the end of the function. Pass-by-result is usually indicated with the out keyword.

So let’s rewrite the function above to use pass-by-result:

var y;
function g(out x)
 x := 1;
 do while x <= y;
   x := 2*x;
 od
end
g(y);

Just changing that keyword from ref to out has a dramatic result. Now the formal parameter x is strictly a local variable, not aliased to the global y. When you call g(y), the function will assign to its local variable, x, the least power of 2 greater than y. Then, since x was declared as pass-by-result, at the end of the function, it will assign this value back into y, changing y into the least power of 2 greater than its previous value, which is probably exactly what the programmer intended.
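C++ has no out keyword, but the mechanism can be emulated by hand, which makes the copy-out step visible. The sketch below assumes int values; x_actual and demo are names invented for the illustration.

```cpp
#include <cassert>

int y = 5;  // global variable, as in the pseudocode above

// Emulates "out x": the caller's value is never read, the work happens
// on a local variable, and the result is copied out once at the end.
void g(int& x_actual) {
    int x;            // local formal parameter, initially undefined
    x = 1;
    while (x <= y) {  // y is not modified while the loop runs
        x = 2 * x;
    }
    x_actual = x;     // copy-out on return
}

// Call from main() to check the result.
void demo() {
    y = 5;
    g(y);
    assert(y == 8);   // least power of 2 greater than 5
}
```

Even though the reference x_actual aliases y here, it is written exactly once, after the loop has finished, so the loop sees a stable y.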

Pass-by-value-result uses pass-by-value when the function is called and pass-by-result when it returns. In other words, with pass-by-value-result, you use the value of the actual parameter to initialize the formal parameter, and then at the end of the function, you assign the value of the formal parameter back to the actual parameter.
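The difference from pass-by-reference shows up when two parameters are bound to the same variable. C++ has no in out mode, so this sketch emulates it with an explicit copy-in and copy-out; the function names are made up for the illustration.

```cpp
#include <cassert>

// Pass-by-reference: a and b may alias, so both increments
// can land on the same variable.
void bump_ref(int& a, int& b) {
    a += 1;
    b += 1;
}

// Emulated pass-by-value-result: work on local copies and
// write them back only when the function returns.
void bump_value_result(int& a_actual, int& b_actual) {
    int a = a_actual;  // copy-in
    int b = b_actual;
    a += 1;
    b += 1;
    a_actual = a;      // copy-out
    b_actual = b;
}

// Call from main() to check both behaviours.
void demo() {
    int n = 10;
    bump_ref(n, n);
    assert(n == 12);   // incremented twice through two aliases

    n = 10;
    bump_value_result(n, n);
    assert(n == 11);   // each local copy was incremented once
}
```

Here both copies hold the same value at copy-out. In general, a language with this mode has to define the order of the copy-outs when one variable is passed twice.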

Languages with pass-by-value-result typically use the keywords in for pass-by-value, out for pass-by-result, and in out for pass-by-value-result. The keyword in is usually optional when there is no out keyword. Therefore you get the following:

function f(x)        // pass-by-value
function f(in x)     // pass-by-value
function f(out x)    // pass-by-result
function f(in out x) // pass-by-value-result

Pass-by-copy

To reiterate, pass-by-value means that the formal parameter gets assigned the dereferenced value of the actual parameter just like you would do in a normal assignment. Pass-by-reference means that the formal parameter becomes an alias for the actual parameter which must evaluate to a location. But there has been an unfortunate tendency for programmers to think that pass-by-value means “the function cannot change the actual parameter” and pass-by-reference means “the function can change the actual parameter”.

Because of this misunderstanding, a lot of programmers think that there is something confusing about passing objects by value. Objects are mutable values, values that can change over time. Because objects can change, it makes sense that a routine that can access the object can change the object. But this doesn’t make it pass-by-reference. Consider

function f(o:Object1)
 o.do_something();
end
var o = new Object1();
f(o);

The call to do_something() has some effect on the object o. It changes it in some way. But when you return from the function f, o still references the same object that it did before. What f(o) did was make a change in the object referenced by o. The function call did not change the variable o to reference a different object. Compare to

function g(ref o:Object1)
 o := new Object1();
end
var o = new Object1();
g(o);

Now g(o) actually changes the reference of o to make it refer to a different object. That’s what you can do with pass-by-reference and what you cannot do with pass-by-value on an object.
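The same contrast can be sketched in C++ by letting a std::shared_ptr stand in for the object reference. This is an analogy, not a definition of how any particular language implements object references. Counter and its increment method are hypothetical stand-ins for Object1 and do_something.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for Object1.
struct Counter {
    int n = 0;
    void increment() { ++n; }
};

// The reference is passed by value: f can change the object,
// but not which object the caller's variable refers to.
void f(std::shared_ptr<Counter> o) {
    o->increment();
    o = std::make_shared<Counter>();  // rebinds only the local copy
}

// The reference is passed by reference: g can rebind the caller's variable.
void g(std::shared_ptr<Counter>& o) {
    o = std::make_shared<Counter>();
}

// Call from main() to check both behaviours.
void demo() {
    auto o = std::make_shared<Counter>();
    Counter* before = o.get();

    f(o);
    assert(o.get() == before);  // still the same object
    assert(o->n == 1);          // but the object was changed

    g(o);
    assert(o.get() != before);  // now a different object
    assert(o->n == 0);
}
```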

However, many languages such as C and C++ don’t make a clear distinction between objects and what I call abstractions: immutable values like numbers. In particular, a struct is a mutable value, an object, but when you pass a struct to a function, the function can’t change the caller’s struct. This is because C and C++ copy the struct rather than passing a reference to it. The truth is that C and C++ have a lower-level implementation-inspired concept of values that is based on memory, representations, and pointers, rather than dividing the world of values up into objects and abstractions as we do in the real world.

However, we can interpret the C behavior using the concepts of object and abstraction as follows: we say that C structs are objects that don’t use pass-by-value like numbers and arrays do. Structs use pass-by-copy instead.

Pass-by-copy is a parameter passing mechanism that only applies to objects, not to abstractions. It makes no sense to copy an abstraction; you can’t make a copy of the number five. There is just one number five and it isn’t the same as anything else. You can make a copy of a representation of the number, but the representation is not the number. So if you pass an abstraction to a formal parameter that is declared as pass-by-copy, it should be treated as pass-by-value.

Pass-by-copy is actually a useful mechanism to consider for Unobtainabol. It can be used in situations where you do not want to change the object and it is not convenient or not possible to send the object itself. For example, in a remote procedure call to another machine, you might want to copy the arguments that are objects.

Since pass-by-copy is a sort of variation of pass-by-value, it suggests the existence of a similar variation of pass-by-result, where the value of the formal parameter at the end of the call would be copied back into the actual parameter. We might call these mechanisms pass-by-copy-result and pass-by-copy-value-result.

Pass-by-reference-const

In C++, objects are structs. And as I said above, structs are a special case; they do not use pass-by-value as other C++ types do, instead they use pass-by-copy. This doesn’t work very well if you want the called function to modify the object because it only can modify a copy of the object that you pass it, leaving the original untouched.

C gets around this unfortunate circumstance by using pointers. In C++ you can do the same, but pointers are a rather low-level and hazardous feature, so the makers of C++ wanted to reduce the places where they had to be used. So C++ has pass-by-reference, and when you want to pass an object to be modified in C++, you can pass it by reference.
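The three options look like this side by side. Point and the shift_ functions are names made up for this sketch.

```cpp
#include <cassert>

struct Point {
    int x;
    int y;
};

// Struct passed directly: the callee receives a copy.
void shift_copy(Point p) { p.x += 10; }

// C style: pass a pointer so the callee can reach the original.
void shift_ptr(Point* p) { p->x += 10; }

// C++ style: pass by reference, no pointer syntax at the call site.
void shift_ref(Point& p) { p.x += 10; }

// Call from main() to check all three.
void demo() {
    Point a{1, 2};

    shift_copy(a);
    assert(a.x == 1);   // original untouched

    shift_ptr(&a);
    assert(a.x == 11);  // modified through the pointer

    shift_ref(a);
    assert(a.x == 21);  // modified through the reference
}
```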

In many cases, a function takes an object as a parameter but does not change the object in any way. It is useful, both to compiler writers and to users of the function, to be able to guarantee this to the caller of the function. You can do this with pass-by-copy, but pass-by-copy is expensive for large objects. So C++ encourages the use of pass-by-reference-const, which is pass-by-reference with a const declaration to prevent changing the thing referenced.

In fact, it is common among programmers who know what they are doing to use pass-by-reference-const as an optimization, not only for objects but also for abstractions with a large representation. For example if you wrote an immutable set class to represent fixed sets, you would have to make sure that no callee ever changed it but you also would want an efficient way of passing these things around. So you could pass them by reference-const.
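As an illustration, here is a sketch that uses std::set as a stand-in for the fixed-set class described above. The function count_in_range is hypothetical. The const reference avoids copying the set, and the compiler rejects any attempt by the callee to modify it.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Pass-by-reference-const: no copy is made, and s cannot be
// modified through this parameter.
std::size_t count_in_range(const std::set<int>& s, int lo, int hi) {
    std::size_t n = 0;
    for (int v : s) {
        if (lo <= v && v <= hi) {
            ++n;
        }
    }
    // s.insert(0);  // would not compile: s is const here
    return n;
}

// Call from main() to check the result.
void demo() {
    const std::set<int> fixed{1, 5, 9, 12};
    assert(count_in_range(fixed, 4, 10) == 2);  // 5 and 9
    assert(fixed.size() == 4);                  // unchanged
}
```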

This is the sort of low-level hackery that is not necessary in Unobtainabol. In Unobtainabol, you don’t pass something by reference unless you really want to create an alias for some reason. You don’t worry about efficient parameter passing because the language translator handles that. An abstraction that has large representations will be passed around as a pointer under the covers and the programmer never has to think about it; he just passes it by value.

Unobtainabol does have a way to declare that a formal parameter that takes an object will not modify the object; such things are useful for writing clean code. But the programmer doesn’t have to worry whether it gets passed by copy or by pointer in those cases. The language translator picks a good way to do it.

Unobtainabol might even implement under the covers a form of pass-by-copy-on-write, which passes an object by value, but with the protocol that if the callee decides to change the formal parameter, a copy is made first.
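To make the protocol concrete, here is one way copy-on-write could be sketched in C++. It is an illustration under stated assumptions, not a description of how Unobtainabol would do it: CowVector and its methods are invented names, the payload is assumed to be a vector of int, and the use_count check is only reliable in single-threaded code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical copy-on-write wrapper: copies of a CowVector share
// storage until one of them is written to.
class CowVector {
public:
    explicit CowVector(std::vector<int> v)
        : data_(std::make_shared<std::vector<int>>(std::move(v))) {}

    int get(std::size_t i) const { return (*data_)[i]; }

    void set(std::size_t i, int value) {
        if (data_.use_count() > 1) {
            // Storage is shared: make a private copy before writing.
            data_ = std::make_shared<std::vector<int>>(*data_);
        }
        (*data_)[i] = value;
    }

    bool shares_storage_with(const CowVector& other) const {
        return data_ == other.data_;
    }

private:
    std::shared_ptr<std::vector<int>> data_;
};

// Passing by value copies only the shared pointer, not the elements.
int read_only(CowVector v) { return v.get(0); }

// The write triggers the real copy; the caller's data is untouched.
int modifies(CowVector v) {
    v.set(0, 99);
    return v.get(0);
}

// Call from main() to check the behaviour.
void demo() {
    CowVector a(std::vector<int>{1, 2, 3});
    assert(read_only(a) == 1);
    assert(modifies(a) == 99);
    assert(a.get(0) == 1);  // caller still sees the original value
}
```

A callee that only reads pays for a pointer copy. The element-by-element copy happens only in a callee that writes.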
