Using weakrefs to avoid weakening strong references
				   
		 Alexandre Oliva <aoliva@redhat.com>
			      2007-02-08

Introduction
============

Consider a header file that defines inline functions that would like
to use (or just test for a definition of) a certain symbol (function,
variable, whatever) if it is defined in the final program or in one of
the libraries it links with.  The functions have alternate code paths
for the case in which the symbol is not defined, so the header would
rather not force the symbol to be defined.

This is the case of the gthr-* headers in GCC, which libstdc++ uses
and exposes to users, creating a number of problems.

Such a header has traditionally been impossible to implement without
declaring the symbol as weak, which has the effect that any references
to the symbol in the user's code will also be regarded as weak.  This
has two negative side effects:

- if the function is defined in a static library, and the library is
linked into the program, the object file containing the definition may
not be linked in, because all references to it are weak, even
references that should have been strong.

- if the user accidentally fails to link in the library providing the
referenced symbol, she won't get an error message, and the code that
assumed strong references is likely to crash.
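
The problem can be sketched in a few lines of GNU C.  This is only an
illustration, assuming GCC (or a compatible compiler) on an ELF
target; optional_hook, call_hook_or_default and user_code are
hypothetical names that do not come from any real library.

```c
#include <stddef.h>

/* Hypothetical optional symbol.  The header has to declare it weak
   so that testing its address does not force a definition. */
extern int optional_hook(int) __attribute__((weak));

/* Inline function from the header: uses the symbol if it is present
   and falls back to an alternate code path otherwise. */
static inline int call_hook_or_default(int x)
{
    /* The address of an undefined weak symbol evaluates to null. */
    if (optional_hook != NULL)
        return optional_hook(x);
    return -1;  /* alternate code path */
}

/* User code in the same translation unit.  This call was meant to
   be a strong reference, but the weak declaration above applies to
   it as well: the program links without a definition, and this call
   would crash if it were ever executed. */
int user_code(int x)
{
    return optional_hook(x);
}
```

With no definition of optional_hook anywhere, the program still
links, and call_hook_or_default takes the alternate path.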


Existing solutions
==================

One way to avoid this problem is to move the direct reference to the
symbol from the inline function into a function in a separate library,
or even move the entire function there.  The library references the
symbol weakly, without affecting user code.  This probably impacts
performance negatively, and may require a new library to be linked in,
which an all-inline header file (say, C++ template definitions) would
rather avoid.

Another way to avoid the problem is to create a variable in a
separate library, initialized with a weak reference to the symbol, and
access the variable in the inline function.  This still has a small
impact on performance and may require a new library, but the most
serious problem is that it defines a variable as part of the interface
of a library, which is generally regarded as poor practice.


Weakrefs
========

The idea to address the problem is to enable the compiler to
distinguish references that are intended to be weak from those that
are to be strong, and to combine them in the same way that the linker
would combine an object file with a weak undefined symbol and another
object containing a symbol with the same name.  In other words, people
should be able to write code as if they had combined two such object
files into a single translation unit.

The idea of a weak alias may immediately come to mind, but this is not
what we are looking for.  A weak alias is a definition that is in
itself weak (i.e., it yields to other definitions), that holds the
same value as another definition in the same translation unit.  This
other definition can be strong or weak, but it must be a definition.
A weak alias cannot reference an undefined symbol, weak or strong.

What we need, in contrast, is some means to define an alias that
doesn't, by itself, cause an external definition of the symbol to be
brought in.  If the symbol is referenced directly elsewhere, however,
then it must be defined.  This is similar to the notion of weak
references in garbage collection literature, in which a strong
reference stops an object from being garbage-collected, but a weak
reference does not.  I've decided to name this kind of alias a
weakref.


I could have introduced means in the compiler to create such weakrefs,
and handled them entirely within the compiler, as long as it can see
the entire translation unit before deciding whether to issue or not a
.weak directive for the referenced symbol.

However, since the notion can be useful in the assembler as well,
especially for large or complex preprocessed assembly sources, I went
ahead and decided to implement it in the assembler, and get the
compiler to use that.

This notion may also be useful for compilers that combine multiple
translation units into a single assembly output file.


Assembler implementation
------------------------

The following syntax was chosen for assembly code:

	  .weakref <alias>, <target>

The semantics are as follows:

- if <target> is referenced or defined, then .weakref has no effect
whatsoever on its symbol;

- if <target> is never referenced or defined other than in .weakref
directives, but <alias> is, then <target> is marked as weak undefined
in the symbol table;

- multiple aliases may be weakrefs to the same target, and the effect
is equivalent to having a single weakref;

- if <alias> is redefined, it ceases to refer to <target>, and loses
the .weakref status;

- uses of <alias> are implicitly turned into uses of the last
definition of <target>;

- <alias> itself is never added to the symbol table, since all uses
are resolved locally.


Compiler implementation
-----------------------

The following syntax is to be used in C sources:

static <decl> __attribute((weakref("<target>")));

<decl> may be a function or variable declaration.  It is obviously
heavily based on the alias notation, and it actually uses the alias
machinery underneath, so almost all of the same restrictions apply.
The only one that does not is that, while the alias attribute must
reference a defined symbol, weakref must reference a declared, but not
necessarily defined, symbol.  Both use the assembly name of the
target, which might differ from the source-file representations.

weakref implicitly marks <decl> (but not <target>) as weak.  It is
actually implemented in terms of a no-argument weakref attribute,
which still implies weak, and an alias attribute.  Therefore, the
above is equivalent to:

static <decl> __attribute((weakref,alias("<target>")));

which would still be equivalent if one added the weak attribute:

static <decl> __attribute((weak,weakref,alias("<target>")));

If no alias attribute is associated with a weakref declaration, the
effects of the weakref attribute are limited to the effect of the weak
attribute.

The compiler should map this to .weakref in the assembler if the
assembler supports it.  Failing assembly support, the weakref is
currently rejected, but we could arrange for the compiler to handle it
internally.
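
As an illustration of the notation, here is a minimal sketch that
assumes GCC (or a compatible compiler) on an ELF target whose
assembler supports .weakref.  All names are hypothetical.  It declares
two aliases for the same target, one with the shorthand form and one
with the expanded weakref/alias form, and tests each before use.

```c
#include <stddef.h>

/* Hypothetical target: declared, but never defined in this sketch. */
extern int optional_hook(int);

/* Shorthand form: the target is named in the weakref attribute. */
static int hook_ref(int) __attribute__((weakref("optional_hook")));

/* Expanded form: no-argument weakref plus an alias attribute. */
static int hook_ref2(int)
    __attribute__((weakref, alias("optional_hook")));

/* Uses the target through the first alias if it is defined. */
static inline int call_hook_or_default(int x)
{
    if (hook_ref != NULL)
        return hook_ref(x);
    return -1;  /* alternate code path */
}

/* Same test through the second alias to the same target. */
static inline int call_hook2_or_default(int x)
{
    if (hook_ref2 != NULL)
        return hook_ref2(x);
    return -1;  /* alternate code path */
}
```

Since optional_hook is referenced only through the aliases, it ends
up weak undefined, the program links without a definition, and both
functions take the alternate path.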


Conclusion
==========

This new feature will enable a long-standing libstdc++ bug to be
fixed.  Some of its headers that are meant to be included by user code
include gthr headers that were originally meant to be internal to
libgcc.  They contain numerous #pragma weak directives for thread
library functions, as well as inline functions that reference them.
Several of these inline functions are called from within template
definitions, so refraining from including the header is not an option.

With this new feature, it will be possible to rework the header so
that it references not the thread library symbols that users might
call on their own, but rather weakrefs to them.  The symbols then
won't be marked as weak if there are user references to them, but they
will be if only the inline functions that use the weakrefs reference
them (indirectly).

As long as this is implemented within the compiler, such that no
assembly support is required, we can switch to this new feature on all
platforms.  Otherwise, platforms whose assemblers don't support this
new feature are left with three options: introduce such support,
retain the problems caused by the weak pragmas, or take the
performance hit to fix them.


Copyright 2005, 2007 Red Hat, Inc.

This work is licensed under a Creative Commons Attribution-ShareAlike
3.0 Unported License.  http://creativecommons.org/licenses/by-sa/3.0/

ChangeLog
=========

2011-05-17  Alexandre Oliva  <aoliva@redhat.com>

	* Relicensed from OPL1.0 with further restrictions to CC BY-SA.

2007-02-17  Alexandre Oliva  <aoliva@redhat.com>

	* Added license

2007-02-08  Alexandre Oliva  <aoliva@redhat.com>

	* weakrefs are static, not extern, as expected for visibility
	local to the translation unit.  GCC has been like this for a
	while.  Fix a few typos.  Add copyright notice.

2005-10-10  Alexandre Oliva  <aoliva@redhat.com>

	* Initial revision.