Bug 1253

Summary: relocation R_X86_64_PC32 against undefined symbol
Product: ns-3
Reporter: John Abraham <john.abraham.in>
Component: build system
Assignee: Gustavo J. A. M. Carneiro <gjcarneiro>
Status: RESOLVED FIXED
Severity: normal
CC: mathieu.lacage, ns-bugs, tomh
Priority: P5
Version: pre-release
Hardware: All
OS: All
Attachments: output of ./waf -d optimized configure --enable-static --enable-examples --enable-tests
output of ./waf -v
patch to add -fPIC
patch (sorry, wrong patch file attached before)
disable python bindings if static build selected for x86-64
disable python bindings if static build selected for x86-64
unconditionally disable bindings if static build selected

Description John Abraham 2011-08-28 09:56:12 UTC
Tried this on two different Ubuntu 11 Virtual machines.
It is seen on both the RC1 and RC2 tar file.

./waf -d optimized configure --enable-static --enable-tests --enable-examples

[1532/1690] cxx_link: build/optimized/src/csma-layout/examples/csma-star_1.o -> build/optimized/src/csma-layout/examples/csma-star
[1533/1690] cxx_link: build/optimized/src/point-to-point-layout/bindings/ns3module_4.o -> build/optimized/bindings/python/ns/point_to_point_layout.so
/usr/bin/ld: optimized/libns3-mesh.a(peer-management-protocol_474.o): relocation R_X86_64_PC32 against undefined symbol `std::basic_ostream<char, std::char_traits<char> >::flush()@@GLIBCXX_3.4' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Bad value
collect2: ld returned 1 exit status
Waf: Leaving directory `/home/john/ns-3.12-rc2/ns-allinone-3.12-RC2/ns-3.12/build'
Build failed:  -> task failed (err #1): 
	{task: cxx_link ns3module_4.o -> point_to_point_layout.so}


logs will be attached
Comment 1 John Abraham 2011-08-28 09:56:35 UTC
ns-3-dev does not have this problem
Comment 2 John Abraham 2011-08-28 10:30:38 UTC
Created attachment 1232 [details]
output of ./waf -d optimized configure --enable-static --enable-examples --enable-tests

output of ./waf -d optimized configure --enable-static --enable-examples --enable-tests -v
Comment 3 John Abraham 2011-08-28 10:31:13 UTC
Created attachment 1233 [details]
output of ./waf -v

output of ./waf -v
Comment 4 Tom Henderson 2011-08-28 16:11:23 UTC
I'm able to reproduce this on ns-3-dev also (as well as the RC), on machine ns-regression (Ubuntu 10.04.3 LTS).
Comment 5 Gustavo J. A. M. Carneiro 2011-08-29 11:04:11 UTC
Created attachment 1234 [details]
patch to add -fPIC

This patch replaces -mcmodel=large (added by Mathieu), by -fPIC, like the error message suggests.  The -fPIC option produces less efficient code, kind of like a shared library, but at least it builds...

I honestly do not know why -mcmodel=large stopped working, nor could ever really understand why it worked in the first place.
Comment 6 Gustavo J. A. M. Carneiro 2011-08-29 11:05:12 UTC
Created attachment 1235 [details]
patch (sorry, wrong patch file attached before)
Comment 7 Gustavo J. A. M. Carneiro 2011-08-29 11:07:37 UTC
Perhaps we could apply the -fPIC patch and document somewhere that, if you want the most efficient code possible, compile with --enable-static --disable-python.
Comment 8 Tom Henderson 2011-08-29 11:54:12 UTC
(In reply to comment #7)
> Perhaps we could apply the -fPIC patch and document somewhere that, if you want
> the most efficient code possible, compile with --enable-static
> --disable-python.

Your patch resolves the problem for me.  What you suggest as a way forward seems reasonable; any other thoughts on it?
Comment 9 John Abraham 2011-08-29 12:32:11 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > Perhaps we could apply the -fPIC patch and document somewhere that, if you want
> > the most efficient code possible, compile with --enable-static
> > --disable-python.
> 
> Your patch resolves the problem for me.  What you suggest as a way forward
> seems reasonable; any other thoughts on it?


fPIC works for build+test on OSX & Ubuntu. If we are going ahead with fPIC I will have to remove the darwin-specific change in 1252. fPIC ought to work on OSX.
Comment 10 Mathieu Lacage 2011-08-29 15:09:07 UTC
1) The main point of static builds is to _not_ use fpic so, if you use it, you are basically silently ignoring the configure-time option. Not good. So, what is needed if this really does not work is to output a configure-time warning saying that the static build+python bindings are not compatible and disable one of the two automatically to proceed

2) Anyway, erm, the build command looks really weird:
'/usr/bin/g++', 'optimized/src/point-to-point-layout/bindings/ns3module_4.o', '-o', '/home/john/ns-3.12-rc2/ns-allinone-3.12-RC2/ns-3.12/build/optimized/bindings/python/ns/point_to_point_layout.so', '-Wl,--whole-archive,-Bstatic', '-lns3-core', '-lns3-network', '-lns3-config-store', '-lns3-internet', '-lns3-propagation', '-lns3-point-to-point', '-lns3-csma', '-lns3-emu', '-lns3-bridge', '-lns3-tap-bridge', '-lns3-virtual-net-device', '-lns3-applications', '-lns3-nix-vector-routing', '-lns3-olsr', '-lns3-aodv', '-lns3-dsdv', '-lns3-mobility', '-lns3-wifi', '-lns3-netanim', '-lns3-stats', '-lns3-uan', '-lns3-spectrum', '-lns3-mesh', '-lns3-test', '-lns3-ns3tcp', '-lns3-ns3wifi', '-lns3-flow-monitor', '-lns3-wimax', '-lns3-lte', '-lns3-mpi', '-lns3-topology-read', '-lns3-energy', '-lns3-tools', '-lns3-visualizer', '-lns3-point-to-point-layout', '-lns3-csma-layout', '-lns3-template', '-Wl,-Bdynamic,--no-whole-archive', '-shared', '-Wl,-Bsymbolic-functions', '-pthread', '-pthread', '-Loptimized', '-L/usr/lib', '-L/usr/lib/x86_64-linux-gnu', '-Wl,--whole-archive,-Bstatic', '-Wl,-Bdynamic,--no-whole-archive', '-lm', '-lpthread', '-ldl', '-lutil', '-lpython2.7', '-lsqlite3', '-lxml2', '-lgtk-x11-2.0', '-lgdk-x11-2.0', '-latk-1.0', '-lgio-2.0', '-lpangoft2-1.0', '-lpangocairo-1.0', '-lgdk_pixbuf-2.0', '-lcairo', '-lpango-1.0', '-lfreetype', '-lfontconfig', '-lgobject-2.0', '-lgmodule-2.0', '-lgthread-2.0', '-lrt', '-lglib-2.0', '-lgsl', '-lgslcblas']

This is building an executable that has a .so as extension ? wtf ? Can you try to run this command by hand and add a -shared option to it right after g++ ? 

What you want here is a shared library that will be dynamically loaded later but that was not built with -fPIC. So, the link line needs a -shared.
Comment 11 John Abraham 2011-08-29 15:26:23 UTC
I get the sense we are changing too much in the last few bugs for something that was working in ns-3.11.
However, when I tried the same command with ns-3.11, I got:

  File "/Users/nsnam/jabraham3/ns-3.11/ns-allinone-3.11/ns-3.11/src/wscript", line 258, in ns3_python_bindings
    if sys.platform == 'darwin':
NameError: global name 'sys' is not defined


Now I recall: we might have let ns-3.11 go even with the static option broken on OSX. The failures with static builds on OSX were seen during RC2, but it was too late to fix them.


(In reply to comment #9)
> (In reply to comment #8)
> > (In reply to comment #7)
> > > Perhaps we could apply the -fPIC patch and document somewhere that, if you want
> > > the most efficient code possible, compile with --enable-static
> > > --disable-python.
> > 
> > Your patch resolves the problem for me.  What you suggest as a way forward
> > seems reasonable; any other thoughts on it?
> 
> 
> fPIC works for build+test on OSX & Ubuntu. If we are going ahead with fPIC I
> will have to remove the darwin-specific change in 1252. fPIC ought to work on
> OSX.
Comment 12 John Abraham 2011-08-29 16:17:17 UTC
(In reply to comment #10)
> 1) The main point of static builds is to _not_ use fpic so, if you use it, you
> are basically silently ignoring the configure-time option. Not good. So, what
> is needed if this really does not work is to output a configure-time warning
> saying that the static build+python bindings are not compatible and disable one
> of the two automatically to proceed
> 
> 2) Anyway, erm, the build command looks really weird:
> '/usr/bin/g++', 'optimized/src/point-to-point-layout/bindings/ns3module_4.o',
> '-o',
> '/home/john/ns-3.12-rc2/ns-allinone-3.12-RC2/ns-3.12/build/optimized/bindings/python/ns/point_to_point_layout.so',
> '-Wl,--whole-archive,-Bstatic', '-lns3-core', '-lns3-network',
> '-lns3-config-store', '-lns3-internet', '-lns3-propagation',
> '-lns3-point-to-point', '-lns3-csma', '-lns3-emu', '-lns3-bridge',
> '-lns3-tap-bridge', '-lns3-virtual-net-device', '-lns3-applications',
> '-lns3-nix-vector-routing', '-lns3-olsr', '-lns3-aodv', '-lns3-dsdv',
> '-lns3-mobility', '-lns3-wifi', '-lns3-netanim', '-lns3-stats', '-lns3-uan',
> '-lns3-spectrum', '-lns3-mesh', '-lns3-test', '-lns3-ns3tcp', '-lns3-ns3wifi',
> '-lns3-flow-monitor', '-lns3-wimax', '-lns3-lte', '-lns3-mpi',
> '-lns3-topology-read', '-lns3-energy', '-lns3-tools', '-lns3-visualizer',
> '-lns3-point-to-point-layout', '-lns3-csma-layout', '-lns3-template',
> '-Wl,-Bdynamic,--no-whole-archive', '-shared', '-Wl,-Bsymbolic-functions',
> '-pthread', '-pthread', '-Loptimized', '-L/usr/lib',
> '-L/usr/lib/x86_64-linux-gnu', '-Wl,--whole-archive,-Bstatic',
> '-Wl,-Bdynamic,--no-whole-archive', '-lm', '-lpthread', '-ldl', '-lutil',
> '-lpython2.7', '-lsqlite3', '-lxml2', '-lgtk-x11-2.0', '-lgdk-x11-2.0',
> '-latk-1.0', '-lgio-2.0', '-lpangoft2-1.0', '-lpangocairo-1.0',
> '-lgdk_pixbuf-2.0', '-lcairo', '-lpango-1.0', '-lfreetype', '-lfontconfig',
> '-lgobject-2.0', '-lgmodule-2.0', '-lgthread-2.0', '-lrt', '-lglib-2.0',
> '-lgsl', '-lgslcblas']
> 
> This is building an executable that has a .so as extension ? wtf ? Can you try
> to run this command by hand and add a -shared option to it right after g++ ? 
> 
> What you want here is a shared library that will be dynamically loaded later
> but that was not built with -fPIC. So, the link line needs a -shared.



I tried this

/usr/bin/g++ -shared ./build/optimized/src/point-to-point-layout/bindings/ns3module_4.o  -o  /home/john/ns-3.12-rc2/ns-allinone-3.12-RC2/ns-3.12/build/optimized/bindings/python/ns/point_to_point_layout.so  -Wl --whole-archive -Bstatic   -L /home/john/ns-3.12-rc2/ns-allinone-3.12-RC2/ns-3.12/build/optimized -lns3-core  -lns3-network  -lns3-config-store  -lns3-internet  -lns3-propagation  -lns3-point-to-point  -lns3-csma  -lns3-emu  -lns3-bridge  -lns3-tap-bridge  -lns3-virtual-net-device  -lns3-applications  -lns3-nix-vector-routing  -lns3-olsr  -lns3-aodv  -lns3-dsdv  -lns3-mobility  -lns3-wifi  -lns3-netanim  -lns3-stats  -lns3-uan  -lns3-spectrum  -lns3-mesh  -lns3-test  -lns3-ns3tcp  -lns3-ns3wifi  -lns3-flow-monitor  -lns3-wimax  -lns3-lte  -lns3-mpi  -lns3-topology-read  -lns3-energy  -lns3-tools  -lns3-visualizer  -lns3-point-to-point-layout  -lns3-csma-layout  -lns3-template  -Wl -Bdynamic --no-whole-archive  -shared  -Wl -Bsymbolic-functions  -pthread  -pthread  -Loptimized  -L/usr/lib  -L/usr/lib/x86_64-linux-gnu  -Wl --whole-archive -Bstatic  -Wl -Bdynamic --no-whole-archive  -lm  -lpthread  -ldl  -lutil  -lpython2.7  -lsqlite3  -lxml2  -lgtk-x11-2.0  -lgdk-x11-2.0  -latk-1.0  -lgio-2.0  -lpangoft2-1.0  -lpangocairo-1.0  -lgdk_pixbuf-2.0  -lcairo  -lpango-1.0  -lfreetype  -lfontconfig  -lgobject-2.0  -lgmodule-2.0  -lgthread-2.0  -lrt  -lglib-2.0
john@john-VirtualBox:~/ns-3.12-rc2/ns-allinone-3.12-RC2/ns-3.12$ 
john@john-VirtualBox:~/ns-3.12-rc2/ns-allinone-3.12-RC2/ns-3.12$ 
john@john-VirtualBox:~/ns-3.12-rc2/ns-allinone-3.12-RC2/ns-3.12$ ls -l build/optimized/bindings/python/ns/
total 21632
-rw-r--r-- 1 john john      518 2011-08-27 16:36 core.py
-rw-r--r-- 1 john john        1 2011-08-27 16:36 __init__.py
-rwxr-xr-x 1 john john 22141379 2011-08-29 16:10 point_to_point_layout.so
john@john-VirtualBox:~/ns-3.12-rc2/ns-allinone-3.12-RC2/ns-3.12$ 


I'm following this, the problem is with the "--enable-static" option, wouldn't "-shared" be similar to adding fPIC again?
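
A note for anyone reproducing this by hand: in the command above, the comma-joined linker options printed by waf (for example '-Wl,--whole-archive,-Bstatic') appear as '-Wl --whole-archive -Bstatic', so g++ may no longer hand them to the linker as one group. Below is a minimal sketch of turning the list printed by ./waf -v into a shell command without that change. It assumes Python 3.8 or later (for shlex.join); the function name argv_to_shell and the shortened argument list are illustrative only and are not part of waf or ns-3.

```python
import ast
import shlex


def argv_to_shell(printed_list: str) -> str:
    """Convert the textual Python list shown by `./waf -v` into one shell command line."""
    argv = ast.literal_eval(printed_list)  # parse the list literal without executing code
    if not (isinstance(argv, list) and all(isinstance(a, str) for a in argv)):
        raise ValueError("expected a list of strings")
    # shlex.join quotes each argument on its own, so an option such as
    # '-Wl,--whole-archive,-Bstatic' stays a single token.
    return shlex.join(argv)


# Shortened, illustrative version of the link line quoted in comment 10.
example = (
    "['/usr/bin/g++', 'ns3module_4.o', '-o', 'point_to_point_layout.so', "
    "'-Wl,--whole-archive,-Bstatic', '-lns3-core', '-lns3-mesh', "
    "'-Wl,-Bdynamic,--no-whole-archive', '-shared']"
)

if __name__ == "__main__":
    print(argv_to_shell(example))
```

The resulting string can be pasted into a shell as is; extra options should then be added as separate arguments rather than by editing the comma-joined groups.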
Comment 13 John Abraham 2011-08-29 18:20:38 UTC
I removed the "mesh" which generates peer-management-protocol_474.o.
The build and tests PASS.

Possibly an issue in the multi-level directory structure of mesh?
Comment 14 John Abraham 2011-08-29 19:07:18 UTC
Also note that the word "shared" in the logs seems suspicious [--no-whole-archive', '-shared',]

So I disabled python (--disable-python) and the build & test passed. This also explains why the buildbot did not catch it (as it explicitly disables python).
Comment 15 Gustavo J. A. M. Carneiro 2011-08-29 19:21:39 UTC
(In reply to comment #10)
> 1) The main point of static builds is to _not_ use fpic so, if you use it, you
> are basically silently ignoring the configure-time option. Not good. So, what
> is needed if this really does not work is to output a configure-time warning
> saying that the static build+python bindings are not compatible and disable one
> of the two automatically to proceed

Well, there are several reasons to use static libraries; more efficient code is just one of them.  Other reasons may include linking in just the code that is needed for a program, instead of the whole library, and creating a standalone executable that can be deployed somewhere (e.g. a cluster).

(In reply to comment #12)
> I'm following this, the problem is with the "--enable-static" option, wouldn't
> "-shared" be similar to adding fPIC again?

-shared in the linking stage does not change the way the source files are compiled.  It could still be more optimized. -fPIC OTOH changes the compilation stage, to produce less efficient code.  I have no idea why -shared was omitted, maybe waf got confused with mixing static objects with dynamic ones. Sounds like waf bug to me. If adding -shared works, it's the way to go.

Although I think -mcmodel=large is a bit of a hack (very platform specific) and makes me slightly uncomfortable.  We tolerate it because of the performance advantage, otherwise...
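
To make the compile-stage versus link-stage distinction above concrete, here is a rough sketch that sorts a few of the flags from this thread by the stage at which they take effect. The table is a deliberately tiny, simplified assumption for illustration; it is not how waf or g++ classify options, and the name stage_of is hypothetical.

```python
# Hypothetical, incomplete tables for illustration only.
CODEGEN_PREFIXES = ("-fPIC", "-fpic", "-mcmodel=")   # change the machine code in each .o
LINK_PREFIXES = ("-shared", "-Wl,", "-l", "-L")      # only consumed by the final link


def stage_of(flag: str) -> str:
    """Return 'compile', 'link', or 'unknown' for a single g++ flag."""
    if flag.startswith(CODEGEN_PREFIXES):
        return "compile"
    if flag.startswith(LINK_PREFIXES):
        return "link"
    return "unknown"


if __name__ == "__main__":
    for flag in ["-fPIC", "-mcmodel=large", "-shared", "-Wl,-Bstatic", "-lns3-mesh"]:
        print(flag, "->", stage_of(flag))
```

In this simplified view, adding -shared leaves the objects inside libns3-mesh.a unchanged, while -fPIC or -mcmodel=large requires recompiling them.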
Comment 16 Tom Henderson 2011-08-30 02:01:03 UTC
(In reply to comment #15)

> > I'm following this, the problem is with the "--enable-static" option, wouldn't
> > "-shared" be similar to adding fPIC again
> 
> -shared in the linking stage does not change the way the source files are
> compiled.  It could still be more optimized. -fPIC OTOH changes the compilation
> stage, to produce less efficient code.  I have no idea why -shared was omitted,
> maybe waf got confused with mixing static objects with dynamic ones. Sounds
> like waf bug to me. If adding -shared works, it's the way to go.
> 

I tried adding linkflags = ['-shared'] there but it had no effect.
Comment 17 Tom Henderson 2011-08-30 02:03:44 UTC
(In reply to comment #10)
> 1) The main point of static builds is to _not_ use fpic so, if you use it, you
> are basically silently ignoring the configure-time option. Not good. So, what
> is needed if this really does not work is to output a configure-time warning
> saying that the static build+python bindings are not compatible and disable one
> of the two automatically to proceed
> 

I'll post a patch along these lines (disabling python in this case) and suggest to push it to ns-3.12 so that we can release despite this limitation, unless someone has a fix.
Comment 18 Tom Henderson 2011-08-30 02:04:35 UTC
Created attachment 1236 [details]
disable python bindings if static build selected for x86-64
Comment 19 Tom Henderson 2011-08-30 02:23:53 UTC
Created attachment 1237 [details]
disable python bindings if static build selected for x86-64

less broken version
Comment 20 Mathieu Lacage 2011-08-30 04:08:41 UTC
(In reply to comment #19)
> Created attachment 1237 [details]
> disable python bindings if static build selected for x86-64
> 
> less broken version

I did not test it but the idea of disabling combinations of options that do not work is sound. In this case, though, I think that the combination of options that does not work is static+python on all platforms since the move to the modular python bindings. i.e., the static+python stuff could work only with the monolithic python bindings. I am surprised that the tests work. They must not be very thorough.
Comment 21 Mathieu Lacage 2011-08-30 04:12:35 UTC
(In reply to comment #12)

> I'm following this, the problem is with the "--enable-static" option, wouldn't
> "-shared" be similar to adding fPIC again?

No. -shared simply says: do not look for a "main" entry point function in the binary. However, as you point out later, the -shared is there in the middle of the command-line. Weird.

Anyway, to summarize the issue here, it occurs to me that since we moved away from the python monolithic build, static+python cannot work together on _any_ platform. So, we need to disable one of them when they are enabled together and tell the user.
Comment 22 Gustavo J. A. M. Carneiro 2011-08-30 07:06:33 UTC
(In reply to comment #19)
> Created attachment 1237 [details]
> disable python bindings if static build selected for x86-64
> 
> less broken version

+1

I just don't think this bug _can_ be fixed.  Python modules in general require code to be compiled with -fPIC.

The alternative would be to create a new Python interpreter that has all the ns-3 modules statically linked.  But that defeats the modular bindings...
Comment 23 Tom Henderson 2011-08-30 10:23:27 UTC
Created attachment 1238 [details]
unconditionally disable bindings if static build selected
Comment 24 Tom Henderson 2011-08-30 10:26:02 UTC
(In reply to comment #21)
> (In reply to comment #12)
> 
> > I'm following this, the problem is with the "--enable-static" option, wouldn't
> > "-shared" be similar to adding fPIC again?
> 
> No. -shared simply says: do not look for a "main" entry point function in the
> binary. However, as you point out later, the -shared is there in the middle of
> the command-line. Weird.
> 
> Anyway, to summarize the issue here, it occurs to me that since we moved away
> from the python monolithic build, static+python cannot work together on _any_
> platform. So, we need to disable one of them when they are enabled together and
> tell the user.

Latest patch unconditionally disables python when static is selected
Comment 25 Gustavo J. A. M. Carneiro 2011-08-30 10:30:17 UTC
(In reply to comment #23)
> Created attachment 1238 [details]
> unconditionally disable bindings if static build selected

The patch is fine, but I would change the comment:

+    # Disable python in x86-64 static builds until bug 1253 is fixed

to:

+    # Disable python in static builds (bug #1253)

So that we can close this bug afterwards.
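
For readers without access to the attachment, the behaviour agreed on in this thread (disable the bindings whenever a static build is selected, and tell the user) can be sketched as below. This is only an illustration of the decision logic under assumed names; resolve_static_vs_python and its parameters are hypothetical and this is not the code from the attached patch or the ns-3 wscript.

```python
from typing import Optional, Tuple


def resolve_static_vs_python(enable_static: bool,
                             enable_python: bool) -> Tuple[bool, Optional[str]]:
    """Return (python_enabled, warning_message).

    Hypothetical helper: when both options are requested, Python bindings
    are switched off and a message explains why.
    """
    if enable_static and enable_python:
        return False, ("Python bindings disabled: "
                       "not compatible with static builds (bug #1253)")
    return enable_python, None


if __name__ == "__main__":
    enabled, warning = resolve_static_vs_python(enable_static=True, enable_python=True)
    if warning:
        print("WARNING:", warning)
    print("python bindings enabled:", enabled)
```

Returning the message instead of printing it inside the helper keeps the check easy to test and leaves the reporting to the configure step.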
Comment 26 Tom Henderson 2011-08-31 09:49:48 UTC
changeset 7b8dfd1b02f6