{"id":127,"date":"2010-08-15T15:54:11","date_gmt":"2010-08-15T14:54:11","guid":{"rendered":"https:\/\/rwec.co.uk\/blog\/?p=127"},"modified":"2010-08-15T15:59:56","modified_gmt":"2010-08-15T14:59:56","slug":"object-references-are-confusing","status":"publish","type":"post","link":"https:\/\/rwec.co.uk\/blog\/2010\/08\/object-references-are-confusing\/","title":{"rendered":"Why object references are confusing, and what to do about it"},"content":{"rendered":"<p>A <a href=\"http:\/\/mangobrain.co.uk\/comments.php?i=27\">recent blog post from my old friend Phil<\/a> ((well, vaguely recent: I really should get with this whole RSS thing!)) discussed some of the gotchas of parameter passing in object-oriented languages &#8211; or I suppose specifically in <em>partially<\/em> OO languages, since the problem in this case was a combination of objects and structs in C#.<\/p>\n<p>It seems to me there is a genuine problem here, beyond programmer fallibility &#8211; the old distinction between &#8220;pass by value&#8221; and &#8220;pass by reference&#8221; is no longer a useful distinction in such languages, and someone needs to design something better.<\/p>\n<p><!--more--><\/p>\n<p>To re-summarise, the problem is that in a lot of languages, variables that we call &#8220;objects&#8221; are actually always &#8220;object references&#8221;, so a parameter &#8220;passed by value&#8221; &#8211; or even referenced from a non-object structure that was passed by value &#8211; is not copied, and can still cause changes to the original instance of the object. <\/p>\n<p>This can be extremely confusing &#8211; I remember similar confusion in Java, where int, float, boolean, and char are basic types, but Strings are objects, so every time you pass a String into a function, that function might accidentally edit the original string. 
((Strictly speaking, Java&#8217;s String objects are immutable, so a function can&#8217;t actually change the caller&#8217;s string; for manipulation you&#8217;re supposed to use a mutable class such as StringBuilder, and that <em>can<\/em> be modified through a passed reference.)) This really highlights the arbitrary distinction underlying the &#8220;object reference&#8221; model: it&#8217;s perfectly reasonable to argue that a string is a complex type, and in C you would use an explicit pointer; but in, say, PHP, assigning an existing string to a new variable <em>will copy the contents of that string<\/em> &#8211; the underlying mechanics are part of the abstraction of the language.<\/p>\n<p>Now the reason for the distinction is, I presume, that cloning an object is not always trivial &#8211; its internal state may be tied to some outside resource, or require some manual allocation, or just be fundamentally a singleton. So the problem is that objects have to share languages with &#8220;<abbr title=\"Plain Old Data\">POD<\/abbr>&#8221; types, like integers, which are much simpler to manipulate as raw data. Purely OO languages, where everything is an object &#8211; take Ruby or Smalltalk, for instance &#8211; don&#8217;t have this problem; but they have the opposite problem: they can&#8217;t pass by value for <em>any<\/em> type!<\/p>\n<p>Before I go on, let me admit that I find references confusing at the best of times. I&#8217;ve done very little programming in C, but I note that at least there the level of indirection is constantly visible, even if you do end up with rather a lot of it &#8211; &#8220;pointer to a pointer to a pointer to a char&#8221; and so on. In PHP, the <code>=&<\/code> operator makes a variable a reference to another variable, but what does that actually mean? If I set <code>$foo =& $bar<\/code>, what happens if I then say <code>$bar =& $baz<\/code>, or <code>unset($bar)<\/code>? 
The answers are entirely logical (<code>$foo<\/code> stays bound to the original value either way, because rebinding or unsetting <code>$bar<\/code> only affects the name <code>$bar<\/code>), but it can be pretty hard to keep track of what a particular assignment is actually assigning <em>to<\/em>. In PHP versions before 5.4, you can even subvert a function&#8217;s calling convention by passing in a reference to a non-reference parameter &#8211; the de-referencing is implicit, so the function unknowingly modifies the caller&#8217;s variable. ((e.g. <\/p>\n<pre>\r\nfunction foo($value)\r\n{\r\n      $value++;\r\n      return $value;\r\n}\r\n\/\/ Normal call: foo() increments its own copy, so $a is unchanged\r\n$a = 1; foo($a); echo $a; \/\/ 1\r\n\/\/ Call-time pass-by-reference: deprecated in PHP 5.3, a fatal error from PHP 5.4\r\n$b = 1; foo(&$b); echo $b; \/\/ 2!\r\n<\/pre>\n<p>))<\/p>\n<p>An interesting question to look at is why you should be able to write to parameters that have been passed by value <em>at all<\/em>; Pascal and its descendants let you define &#8220;const parameters&#8221;, which make such modification illegal. This actually <a href=\"http:\/\/delphitools.info\/2010\/07\/28\/all-hail-the-const-parameters\/\">allows the compiler to make some hefty optimisations<\/a>, although as soon as you start passing object references, it too becomes much less useful &#8211; you can still make calls that modify the underlying object instance, you just can&#8217;t assign a brand new instance over the top of the variable.<\/p>\n<h2>Solution<\/h2>\n<p>OK, I&#8217;m not going to say I&#8217;ve suddenly invented a language construct that will revolutionise programming, but I wonder if something can be done to make this all a bit less obscure. Recursively cloning an object is hard enough, let alone recursively but context-sensitively marking it as immutable, so this may all be impractical. 
<abbr title=\"On the other hand\">OTOH<\/abbr>, some of this is probably already available in current OOP languages, but it&#8217;s <em>consistency<\/em> I&#8217;m after.<\/p>\n<ol>\n<li>Make all types objects, and all variables object references &#8211; even if the underlying language optimises, say, an int object by only introducing indirection in memory when more than one reference is needed.<\/li>\n<li>Include a simple method for duplicating object instances, like PHP&#8217;s <code>clone<\/code> operator, so that assign-by-value becomes something like <code>$x = clone $y;<\/code>.<\/li>\n<li>Allow classes ((or the equivalent in a prototype-based language, etc)) to define custom clone mechanics, or &#8211; crucially &#8211; declare that they are &#8220;uncloneable&#8221;.<\/li>\n<li>Force additional levels of indirection to be explicit &#8211; that is, a variable of type &#8220;reference to String&#8221; cannot be passed to a function expecting a variable of type &#8220;String&#8221; without explicit de-referencing.<\/li>\n<li>Most importantly, replace the out-dated &#8220;pass by value&#8221; vs &#8220;pass by reference&#8221; convention with something more appropriate for such a variable system:\n<ul>\n<li><strong>&#8220;reference&#8221; parameters<\/strong> &#8211; the most common option; modifications to the object will affect the original instance, <em>and<\/em> assignments to the variable will overwrite the original object reference<\/li>\n<li><strong>&#8220;clone&#8221; parameters<\/strong> &#8211; effectively a true pass by value: the object is cloned automatically, and only the clone is visible to the called function; passing an object that cannot be cloned is illegal<\/li>\n<li><strong>&#8220;const&#8221; parameters<\/strong> &#8211; the local variable is read-only (you can&#8217;t overwrite it to reference a different object instance) <em>and<\/em> the underlying object is marked immutable, recursively, within that 
scope<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>A recent blog post from my old friend Phil ((well, vaguely recent: I really should get with this whole RSS thing!)) discussed some of the gotchas of parameter passing in object-oriented languages &#8211; or I suppose specifically in partially OO languages, since the problem in this case was a combination of objects and structs in [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[99,95,93,92,94,90,96,97,91,98],"class_list":["post-127","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-const","tag-language-design","tag-objects","tag-oop","tag-parameters","tag-programming","tag-reference","tag-references","tag-types","tag-value","post-preview"],"_links":{"self":[{"href":"https:\/\/rwec.co.uk\/blog\/wp-json\/wp\/v2\/posts\/127","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/rwec.co.uk\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rwec.co.uk\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rwec.co.uk\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/rwec.co.uk\/blog\/wp-json\/wp\/v2\/comments?post=127"}],"version-history":[{"count":6,"href":"https:\/\/rwec.co.uk\/blog\/wp-json\/wp\/v2\/posts\/127\/revisions"}],"predecessor-version":[{"id":133,"href":"https:\/\/rwec.co.uk\/blog\/wp-json\/wp\/v2\/posts\/127\/revisions\/133"}],"wp:attachment":[{"href":"https:\/\/rwec.co.uk\/blog\/wp-json\/wp\/v2\/media?parent=127"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rwec.co.uk\/blog\/wp-json\/wp\/v2\/categories?post=127"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rwec.co.uk\/blog\/wp-json\/wp\/v2\/tags?post=127"}],"curies":[{"name":"wp","href":"https:\/
\/api.w.org\/{rel}","templated":true}]}}