Other interesting mips breakage...

John D. Baker

2014-02-01 08:03:24 UTC

I should have looked at port-mips@ more diligently before my latest spate
of PR filing. The very issues I filed about were discussed here at length.

In light of that, I'd welcome input on issues already being pursued in
absence of an active PR. What I've come to observe since evbmips/LOONGSON
became operational again (on my Lemote YEELOONG, since that's the only
mips/evbmips machine I have):

X server: the undefined symbol issues seem to have been resolved, but
now the server complains that it can't load the "int10" module, saying
that it doesn't exist. Also what looks like a NULL pointer dereference.

The 'dig' utility dies with segfault in pthread_getspecific().

Can't build anything from pkgsrc as the C compiler dies with bus error
compiling the first real source in "pkgtools/digest". (It works well
enough to complete the "./configure" script, though.)

If there are hints on the above, please share. I suspect the most
obvious thing to try is nuke OBJDIR and DESTDIR from orbit, just to
make sure.

A few things that persist from early 6.99.x days:

PR/48564: 'tar' corrupts files extracted to NFS. I originally saw this as
a result of bizarre modifications suggested when running 'etcupdate' on
my NFS-root installation. Then LOONGSON kernel build breakage, etc.
intervened. Finally, I sat down and analyzed what the nature of the
data corruption was.

'amd' (am-utils) fails, claiming "Invalid argument" on all automount
points. The config file and maps are the same ones I use on all my
other systems (i386, amd64, sparc, macppc, mvme68k).

Not yet re-examined: sshd hangs incoming SSH connections in a quasi-open
state holding the client inoperative until local terminal is closed
(close window in GUI or kill parent shell process from other wscons
virtual terminal).

If my casual descriptions don't seem to match others' observations,
I'll post appropriate log excerpts (or just file a PR).

To summarize: "Surely, I'm not the only one seeing this."
(I know, "Don't call me Shirley.")

--
|/"\ John D. Baker, KN5UKS NetBSD Darwin/MacOS X
|\ / jdbaker[snail]mylinuxisp[flyspeck]com OpenBSD FreeBSD
| X No HTML/proprietary data in email. BSD just sits there and works!
|/ \ GPGkeyID: D703 4A7E 479F 63F8 D3F4 BD99 9572 8F23 E4AD 1645

Manuel Bouyer

2014-02-01 16:06:54 UTC

Post by John D. Baker
of PR filing. The very issues I filed about were discussed here at length.
In light of that, I'd welcome input on issues already being pursued in
absence of an active PR. What I've come to observe since evbmips/LOONGSON
became operational again (on my Lemote YEELOONG, since that's the only
X server: the undefined symbol issues seem to have been resolved, but
now the server complains that it can't load the "int10" module, saying
that it doesn't exist. Also what looks like a NULL pointer dereference.

What driver are you using? It works fine for me with a SiS video controller.
I also tested wsfb.

Post by John D. Baker
The 'dig' utility dies with segfault in pthread_getspecific().

Yes, I'm also seeing this with glxgears. I guess several threaded binaries
are affected.

Post by John D. Baker
Can't build anything from pkgsrc as the C compiler dies with bus error
compiling the first real source in "pkgtools/digest". (It works well
enough to complete the "./configure" script, though.)

gcc -O2 dumps core, but removing -O2 makes it work. I don't know why.

Post by John D. Baker
If there are hints on the above, please share. I suspect the most
obvious thing to try is nuke OBJDIR and DESTDIR from orbit, just to
make sure.
PR/48564: 'tar' corrupts files extracted to NFS. I originally saw this as
a result of bizarre modifications suggested when running 'etcupdate' on
my NFS-root installation. Then LOONGSON kernel build breakage, etc.
intervened. Finally, I sat down and analyzed what the nature of the
data corruption was.

No idea, but I've not tested NFS.

Post by John D. Baker
'amd' (am-utils) fails, claiming "Invalid argument" on all automount
points. The config file and maps are the same ones I use on all my
other systems (i386, amd64, sparc, macppc, mvme68k).

Could be a compat-netbsd32 issue. ktrace would show which syscall or ioctl
returns the EINVAL.

Post by John D. Baker
Not yet re-examined: sshd hangs incoming SSH connections in a quasi-open
state holding the client inoperative until local terminal is closed
(close window in GUI or kill parent shell process from other wscons
virtual terminal).

sshd works fine for me ...

--
Manuel Bouyer <***@antioche.eu.org>
NetBSD: 26 ans d'experience feront toujours la difference
--

John D. Baker

2014-02-01 20:01:55 UTC

Post by Manuel Bouyer

Post by John D. Baker
X server: the undefined symbol issues seem to have been resolved, but
now the server complains that it can't load the "int10" module, saying
that it doesn't exist. Also what looks like a NULL pointer dereference.

What driver are you using? It works fine for me with a SiS video controller.
I also tested wsfb.

[...]
lynxfb0 at pci0 dev 8 function 0: vendor 0x126f product 0x0712 (rev. 0xb0)
lynxfb0: 1024 x 600, 16 bpp, stride 2048
[...]
[...]
[ 228.837] (II) LoadModule: "siliconmotion"
[ 228.841] (II) Loading /usr/X11R7/lib/modules/drivers/siliconmotion_drv.so
[ 228.860] (II) Module siliconmotion: vendor="X.Org Foundation"
[ 228.861] compiled for 1.10.6, module version = 1.7.7
[ 228.861] Module class: X.Org Video Driver
[ 228.861] ABI class: X.Org Video Driver, version 10.0
[...]
[ 228.902] (II) SMI: driver (version 1.7.7) for Silicon Motion Lynx chipsets: Lynx,
LynxE, Lynx3D, LynxEM, LynxEM+, Lynx3DM, Cougar3DR, MSOC
[ 228.906] (--) Using wscons driver on /dev/ttyE4 in pcvt compatibility mode (version 3.32)
[ 228.906] (--) using VT number 5

[ 228.914] (WW) Falling back to old probe method for siliconmotion
[ 228.914] (--) Assigning device section with no busID to primary device
[ 228.914] (--) Chipset LynxEM+ found
[ 228.914] (II) Loading /usr/X11R7/lib/modules/drivers/siliconmotion_drv.so
[ 228.914] (WW) VGA arbiter: cannot open kernel arbiter, no multi-card support
[ 228.915] (II) Loading sub module "vgahw"
[ 228.915] (II) LoadModule: "vgahw"
[ 228.980] (II) Loading /usr/X11R7/lib/modules/libvgahw.so
[ 228.993] (II) Module vgahw: vendor="X.Org Foundation"
[ 228.993] compiled for 1.10.6, module version = 0.1.0
[ 228.993] ABI class: X.Org Video Driver, version 10.0
[ 228.994] (II) SMI(0): Creating default Display subsection in Screen section
"Builtin Default siliconmotion Screen 0" for depth/fbbpp 24/32
[ 228.994] (==) SMI(0): Depth 24, (--) framebuffer bpp 32
[ 228.994] (==) SMI(0): RGB weight 888
[ 228.994] (==) SMI(0): Default visual is TrueColor
[ 228.994] (==) SMI(0): PCI Burst enabled
[ 228.994] (==) SMI(0): PCI Retry enabled
[ 228.994] (==) SMI(0): Using Hardware Cursor
[ 228.994] (II) Loading sub module "int10"
[ 228.994] (II) LoadModule: "int10"
[ 229.022] (WW) Warning, couldn't open module int10
[ 229.022] (II) UnloadModule: "int10"
[ 229.022] (II) Unloading int10
[ 229.022] (EE) SMI: Failed to load module "int10" (module does not exist, 0)

Additional data from std{out,err} of Xserver:

X.Org X Server 1.10.6
Release Date: 2011-07-08
X Protocol Version 11, Revision 0
Build Operating System: NetBSD/evbmips -
Current Operating System: NetBSD chalk.technoskunk.fur 6.99.30 NetBSD 6.99.30 (YEELOONG) #9: Fri Jan 31 16:18:05 CST 2014 ***@yggdrasil.technoskunk.fur:/r0/build/current/obj/mips64el/sys/arch/evbmips/compile/YEELOONG evbmips
Build Date: 01 August 2011 01:01:00AM

Current version of pixman: 0.30.0
Before reporting problems, check http://wiki.X.Org
to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Fri Jan 31 23:57:44 2014
(==) Using default built-in configuration (12 lines)
(EE) SMI: Failed to load module "int10" (module does not exist, 0)
Segmentation fault at address 0x0

Fatal server error:
Caught signal 11 (Segmentation fault). Server aborting

Post by Manuel Bouyer
gcc -O2 dumps core, but removing -O2 makes it work. I don't know why.

This sounds pretty tedious unless such transformations can be automated.

Post by Manuel Bouyer

Post by John D. Baker
'amd' (am-utils) fails, claiming "Invalid argument" on all automount
points. The config file and maps are the same ones I use on all my
other systems (i386, amd64, sparc, macppc, mvme68k).

Could be a compat-netbsd32 issue. ktrace would show which syscall or ioctl
returns the EINVAL.

I posted what I observed about it before here:

http://mail-index.netbsd.org/port-evbmips/2012/06/08/msg000141.html

The culprit appears to be "netbsd32___mount50()".

I'll see about getting a current picture soon.

Post by Manuel Bouyer

Post by John D. Baker
Not yet re-examined: sshd hangs incoming SSH connections in a quasi-open
state holding the client inoperative until local terminal is closed
(close window in GUI or kill parent shell process from other wscons
virtual terminal).

sshd works fine for me ...

For historical background see:

http://mail-index.netbsd.org/port-evbmips/2012/05/10/msg000132.html
http://mail-index.netbsd.org/port-evbmips/2012/05/11/msg000134.html
http://mail-index.netbsd.org/port-evbmips/2012/06/08/msg000142.html

I have not yet tried again since getting a working system running, but
hope to do so soon.


Manuel Bouyer

2014-02-01 20:10:49 UTC

Post by John D. Baker
[...]
lynxfb0 at pci0 dev 8 function 0: vendor 0x126f product 0x0712 (rev. 0xb0)
lynxfb0: 1024 x 600, 16 bpp, stride 2048
[...]
[...]
[ 228.837] (II) LoadModule: "siliconmotion"
[ 228.841] (II) Loading /usr/X11R7/lib/modules/drivers/siliconmotion_drv.so
[ 228.860] (II) Module siliconmotion: vendor="X.Org Foundation"
[ 228.861] compiled for 1.10.6, module version = 1.7.7
[ 228.861] Module class: X.Org Video Driver
[ 228.861] ABI class: X.Org Video Driver, version 10.0
[...]
[ 228.902] (II) SMI: driver (version 1.7.7) for Silicon Motion Lynx chipsets: Lynx,
LynxE, Lynx3D, LynxEM, LynxEM+, Lynx3DM, Cougar3DR, MSOC
[ 228.906] (--) Using wscons driver on /dev/ttyE4 in pcvt compatibility mode (version 3.32)
[ 228.906] (--) using VT number 5
[ 228.914] (WW) Falling back to old probe method for siliconmotion
[ 228.914] (--) Assigning device section with no busID to primary device
[ 228.914] (--) Chipset LynxEM+ found
[ 228.914] (II) Loading /usr/X11R7/lib/modules/drivers/siliconmotion_drv.so
[ 228.914] (WW) VGA arbiter: cannot open kernel arbiter, no multi-card support
[ 228.915] (II) Loading sub module "vgahw"
[ 228.915] (II) LoadModule: "vgahw"
[ 228.980] (II) Loading /usr/X11R7/lib/modules/libvgahw.so
[ 228.993] (II) Module vgahw: vendor="X.Org Foundation"
[ 228.993] compiled for 1.10.6, module version = 0.1.0
[ 228.993] ABI class: X.Org Video Driver, version 10.0
[ 228.994] (II) SMI(0): Creating default Display subsection in Screen section
"Builtin Default siliconmotion Screen 0" for depth/fbbpp 24/32

OK, no way to test this one for me then ...

Post by John D. Baker
[ 228.994] (==) SMI(0): Depth 24, (--) framebuffer bpp 32
[ 228.994] (==) SMI(0): RGB weight 888
[ 228.994] (==) SMI(0): Default visual is TrueColor
[ 228.994] (==) SMI(0): PCI Burst enabled
[ 228.994] (==) SMI(0): PCI Retry enabled
[ 228.994] (==) SMI(0): Using Hardware Cursor
[ 228.994] (II) Loading sub module "int10"
[ 228.994] (II) LoadModule: "int10"
[ 229.022] (WW) Warning, couldn't open module int10
[ 229.022] (II) UnloadModule: "int10"
[ 229.022] (II) Unloading int10
[ 229.022] (EE) SMI: Failed to load module "int10" (module does not exist, 0)

But this should not be a problem. I get the same messages with the sis driver,
and it works anyway.
Some adjustment in the siliconmotion driver may be needed.

Post by John D. Baker
X.Org X Server 1.10.6
Release Date: 2011-07-08
X Protocol Version 11, Revision 0
evbmips
Build Date: 01 August 2011 01:01:00AM
Current version of pixman: 0.30.0
Before reporting problems, check http://wiki.X.Org
to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Fri Jan 31 23:57:44 2014
(==) Using default built-in configuration (12 lines)
(EE) SMI: Failed to load module "int10" (module does not exist, 0)
Segmentation fault at address 0x0
Caught signal 11 (Segmentation fault). Server aborting

Can you try running it under gdb?


John D. Baker

2014-02-03 23:32:32 UTC

The previous problems I observed with 'sshd' no longer occur, likely a
result of bug fixes in the OpenSSH updates imported since the last time
evbmips/LOONGSON could be built (spring 2013, IIRC).

Post by Manuel Bouyer

Post by John D. Baker
(EE) SMI: Failed to load module "int10" (module does not exist, 0)
Segmentation fault at address 0x0
Caught signal 11 (Segmentation fault). Server aborting

Can you try running it under gdb?

Loading the "Xorg.core" file provides the following backtrace:

GNU gdb (GDB) 7.6.1
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "mips64el--netbsd".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/X11R7/bin/Xorg...(no debugging symbols found)...done.
[New process 1]
Core was generated by `Xorg'.
Program terminated with signal 6, Aborted.
#0 0x7824e258 in _lwp_kill () from /usr/lib/libc.so.12
(gdb) bt
#0 0x7824e258 in _lwp_kill () from /usr/lib/libc.so.12
#1 0x7824e1b8 in raise () from /usr/lib/libc.so.12
#2 0x7824dd30 in abort () from /usr/lib/libc.so.12
#3 0x101a1ea8 in OsAbort ()
#4 0x1006fba4 in ddxGiveUp ()
#5 0x1019deb4 in AbortServer ()
#6 0x1019e0b0 in FatalError ()
#7 0x101a2b50 in ?? ()
#8 0x781bb6c0 in ?? () from /usr/lib/libc.so.12

GDB is unable to find the start of the function at 0x781bb6be
and thus can't determine the size of that function's stack frame.
This means that GDB may be unable to access that stack frame, or
the frames below it.
This problem is most likely caused by an invalid program counter or
stack pointer.
However, if you think GDB should simply search farther back
from 0x781bb6be for code which looks like the beginning of a
function, you can increase the range of the search using the `set
heuristic-fence-post' command.
(gdb)

This is odd in that GDB is reporting an issue similar to the one all of userland
suffered until recently (looking for a function at a 16-bit-aligned
address instead of 32/64-bit aligned). Odder still in that I rebuilt
everything from completely empty OBJDIR and DESTDIR.

Do the Xorg sources have a private copy of libc/compiler_rt stuff that is
still losing?

Starting X directly in gdb results in the following:

GNU gdb (GDB) 7.6.1
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "mips64el--netbsd".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/X11R7/bin/Xorg...(no debugging symbols found)...done.
(gdb) run
Starting program: /usr/X11R7/bin/X
warning: Corrupted shared library list: 0x0 != 0x616f6c2d
warning: Corrupted shared library list: 0x0 != 0x616f6c2d
warning: Corrupted shared library list: 0x0 != 0x616f6c2d
warning: Corrupted shared library list: 0x0 != 0x616f6c2d
warning: Corrupted shared library list: 0x0 != 0x616f6c2d
[...]


John D. Baker

2014-02-04 02:19:28 UTC

Post by John D. Baker

Post by Manuel Bouyer
gcc -O2 dumps core, but removing -O2 makes it work. I don't know why.

This sounds pretty tedious unless such transformations can be automated.

OK, I found "mk/wrapper/transform-gcc" and haxxored it to move "-O[23s]"
to a "transform_discard" section. Then "pkgtools/digest" built, but
the "real-package-install" target failed as follows:

[...]
=> Becoming ``root'' to make su-real-package-install (/usr/bin/su)
===> Install binary package of digest-20121220
pkg_add: Warning: package `digest-20121220' was built for a platform:
pkg_add: NetBSD/mipsel 6.99.30 (pkg) vs. NetBSD/mips64el 6.99.30 (this host)
pkg_add: 1 package addition failed
*** Error code 1

Stop.
make: stopped in /d0/nbsd/pkgsrc/pkgtools/digest
*** Error code 1

Stop.
make[2]: stopped in /d0/nbsd/pkgsrc/pkgtools/digest
*** Error code 1

Stop.
make[1]: stopped in /d0/nbsd/pkgsrc/pkgtools/digest

I'm guessing the host architecture is a fixed string in 'pkg_add' evaluated
when 'pkg_add' was compiled. 'uname -p' reports "mipsel". Now to find
where to pass additional arguments to 'pkg_add' via mk.conf...

Post by John D. Baker

Post by Manuel Bouyer

Post by John D. Baker
'amd' (am-utils) fails, claiming "Invalid argument" on all automount
points. The config file and maps are the same ones I use on all my

Could be a compat-netbsd32 issue. ktrace would show which syscall or ioctl
returns the EINVAL.

I'll see about getting a current picture soon.

Seems to be the same as before. Excerpts from ktruss run:

[...]
723 1 amd netbsd32___socket30(0x2, 0x2, 0) = 4, -48
723 1 amd netbsd32_bind(0x4, 0x7fff6a50, 0x10) Err#48 EADDRINUSE
723 1 amd netbsd32_bind(0x4, 0x7fff6a50, 0x10) Err#48 EADDRINUSE
723 1 amd netbsd32_bind(0x4, 0x7fff6a50, 0x10) = 0, 64768
723 1 amd netbsd32_open("/etc/netconfig", 0x400000, 0x1b6) = 6
723 1 amd netbsd32___fstat50(0x6, 0x7fff6820) = 0, 2020430096
723 1 amd netbsd32_read(0x6, 0x7815c000, 0x4000) = 774, 1
"# $NetBSD: netconfig,v 1.1 2000/06/02 22:54:10 fvdl Exp $\n#\n# The networ"
723 1 amd netbsd32_read(0x6, 0x7815c000, 0x4000) = 0, 2020430096
""
723 1 amd netbsd32_close(0x6) = 0
723 1 amd netbsd32_open("/etc/netconfig", 0, 0x1b6) = 6
723 1 amd netbsd32___fstat50(0x6, 0x7fff6810) = 0, 2020430096
723 1 amd netbsd32_read(0x6, 0x7815c000, 0x4000) = 774, 1
"# $NetBSD: netconfig,v 1.1 2000/06/02 22:54:10 fvdl Exp $\n#\n# The networ"
723 1 amd netbsd32_close(0x6) = 0
723 1 amd netbsd32_getsockopt(0x4, 0, 0x13, 0x7fff6970, 0x7fff697c) = 0, 24
723 1 amd netbsd32_setsockopt(0x4, 0, 0x13, 0x7fff6974, 0x4) = 0, 24
723 1 amd netbsd32_bind(0x4, 0x7fff6a40, 0x10) Err#22 EINVAL
723 1 amd netbsd32_setsockopt(0x4, 0, 0x13, 0x7fff6970, 0x4) = 0
723 1 amd netbsd32_listen(0x4, 0x80) Err#45 EOPNOTSUPP
723 1 amd netbsd32_getsockname(0x4, 0x7fff68a8, 0x7fff68a0) = 0, -1
723 1 amd netbsd32_getsockopt(0x4, 0xffff, 0x1008, 0x7fff68a4, 0x7fff68a0) = 0, -1
723 1 amd netbsd32_getsockname(0x4, 0x7fff68b8, 0x7fff68b0) = 0, 2
723 1 amd netbsd32_getsockname(0x4, 0x7fff67b8, 0x7fff67b0) = 0, 1
723 1 amd netbsd32_getsockopt(0x4, 0xffff, 0x1008, 0x7fff67b4, 0x7fff67b0) = 0, 1
723 1 amd netbsd32_getsockname(0x4, 0x7fff6888, 0x7fff6870) = 0, 2020437872
[...]
723 1 amd fork() = 603
603 1 amd fork = 0
603 1 amd emul(netbsd32)
603 1 amd getpid() = 603, 1
603 1 amd netbsd32___stat50("/home", 0x7fff6668) = 0, 8
603 1 amd getppid() = 723, -48
603 1 amd netbsd32___gettimeofday50(0x78729708, 0) = 0
603 1 amd netbsd32_write(0x2, 0x7fff5108, 0x26) = 38, 2
"Feb 3 18:47:12 chalk amd[603]/info: "
603 1 amd netbsd32_write(0x2, 0x7fff5700, 0x27) = 39, 2
"/home: disabling nfs congestion window\n"
603 1 amd netbsd32___mount50(0x10027358, 0x781241f8, 0x100000, 0x7fff6050, 0) Err#22 EINVAL
[...]
723 1 amd fork() = 1049
1049 1 amd fork = 0
1049 1 amd emul(netbsd32)
1049 1 amd getpid() = 1049, 1
1049 1 amd netbsd32___stat50("/m", 0x7fff6668) = 0, 8
1049 1 amd getppid() = 723, -48
1049 1 amd netbsd32___gettimeofday50(0x78729708, 0) = 0
1049 1 amd netbsd32_write(0x2, 0x7fff5108, 0x27) = 39, 2
"Feb 3 18:47:12 chalk amd[1049]/info: "
1049 1 amd netbsd32_write(0x2, 0x7fff5700, 0x24) = 36, 2
"/m: disabling nfs congestion window\n"
1049 1 amd netbsd32___mount50(0x10027358, 0x78118274, 0x100000, 0x7fff6050, 0) Err#22 EINVAL
[...]
723 1 amd fork() = 1302
1302 1 amd fork = 0
1302 1 amd emul(netbsd32)
1302 1 amd getpid() = 1302, 1
1302 1 amd netbsd32___stat50("/net", 0x7fff6668) = 0, 8
1302 1 amd getppid() = 723, -48
1302 1 amd netbsd32___gettimeofday50(0x78729708, 0) = 0
1302 1 amd netbsd32_write(0x2, 0x7fff5108, 0x27) = 39, 2
"Feb 3 18:47:13 chalk amd[1302]/info: "
1302 1 amd netbsd32_write(0x2, 0x7fff5700, 0x26) = 38, 2
"/net: disabling nfs congestion window\n"
1302 1 amd netbsd32___mount50(0x10027358, 0x78124238, 0x100000, 0x7fff6050, 0) Err#22 EINVAL
[...]
603 1 amd netbsd32___nanosleep50 = 0, -48
603 1 amd netbsd32___mount50(0x10027358, 0x781241f8, 0x100000, 0x7fff6050, 0) Err#22 EINVAL
[...]
1049 1 amd netbsd32___nanosleep50 = 0, -48
1049 1 amd netbsd32___mount50(0x10027358, 0x78118274, 0x100000, 0x7fff6050, 0) Err#22 EINVAL
[...]
1302 1 amd netbsd32___mount50(0x10027358, 0x78124238, 0x100000, 0x7fff6050, 0) Err#22 EINVAL
[...]

If more context is needed, let me know.
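
A long ktruss log is easier to scan when only the failing calls are kept. The following is a minimal Python sketch, assuming the line layout seen in the excerpts above (pid, LWP, command, call with arguments, then "Err#<number> <NAME>"). The function names are hypothetical and the pattern is an assumption based only on these excerpts, not on a formal description of the ktruss output format.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpts above:
#   603 1 amd netbsd32___mount50(0x10027358, ...) Err#22 EINVAL
FAILED_CALL = re.compile(
    r"^\s*(?P<pid>\d+)\s+(?P<lwp>\d+)\s+(?P<cmd>\S+)\s+"
    r"(?P<syscall>\w+)\(.*\)\s+Err#(?P<errno>\d+)\s+(?P<name>\w+)\s*$"
)


def failed_syscalls(lines):
    """Yield (pid, syscall, errno_name) for each call that returned an error."""
    for line in lines:
        match = FAILED_CALL.match(line)
        if match:
            yield int(match.group("pid")), match.group("syscall"), match.group("name")


def summarize(lines):
    """Count failures per (syscall, errno_name) pair."""
    return Counter((syscall, name) for _pid, syscall, name in failed_syscalls(lines))
```

Feeding the saved ktruss output to summarize() (for example via open(path) as the line source) gives a per-call failure count, which should make a repeated EINVAL from one syscall stand out from incidental errors such as the EADDRINUSE retries on bind.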


John D. Baker

2014-02-06 18:18:56 UTC

Here's what I had to do (so far) to get anything from pkgsrc to
compile/install on my Lemote YEELOONG.

In "mk/wrapper/transform-gcc": Discard "-O[23s]" so compiler doesn't
die with "Bus Error". Discard "-pthread" to get executable that
doesn't die with "Memory Fault" (SegFault).

In "mk/pkgformat/pkg/package.mk": Add "-m ${MACHINE_ARCH}" to the
"su-real-package-install" target to override 'pkg_add' builtin default.

+Index: mk/wrapper/transform-gcc
+===================================================================
+RCS file: /cvsroot/pkgsrc/mk/wrapper/transform-gcc,v
+retrieving revision 1.29
+diff -u -b -r1.29 transform-gcc
+--- mk/wrapper/transform-gcc 31 Dec 2013 13:56:35 -0000 1.29
++++ mk/wrapper/transform-gcc 4 Feb 2014 13:35:17 -0000
+@@ -65,11 +65,9 @@
+ -mpush-args |\
+ -mschedule=* |\
+ -mieee-fp |\
+--O[23s] |\
+ -pedantic |\
+ -pedantic-errors |\
+ -pipe |\
+--pthread |\
+ -print-prog-name=* |\
+ -print-search-dirs |\
+ -S |\
+@@ -121,6 +119,8 @@
+ -Wwrite-strings ) transform_pass ;;
+
+ # There are some packages suppressing all warnings. We don't want that.
++-O[23s] |\
++-pthread |\
+ -w ) transform_discard ;;
+
+ # Options specific to g++.
+Index: mk/pkgformat/pkg/package.mk
+===================================================================
+RCS file: /cvsroot/pkgsrc/mk/pkgformat/pkg/package.mk,v
+retrieving revision 1.3
+diff -u -b -r1.3 package.mk
+--- mk/pkgformat/pkg/package.mk 10 Aug 2013 06:05:57 -0000 1.3
++++ mk/pkgformat/pkg/package.mk 4 Feb 2014 13:35:46 -0000
+@@ -194,7 +194,7 @@
+ @${MV} ${_PKG_DBDIR}/${PKGNAME:Q}/+CONTENTS.tmp ${_PKG_DBDIR}/${PKGNAME:Q}/+CONTENTS
+ .else
+ ${RUN} case ${_AUTOMATIC:Q}"" in \
+- [yY][eE][sS]) ${PKG_ADD} -A ${STAGE_PKGFILE} ;; \
+- *) ${PKG_ADD} ${STAGE_PKGFILE} ;; \
++ [yY][eE][sS]) ${PKG_ADD} -A -m ${MACHINE_ARCH} ${STAGE_PKGFILE} ;; \
++ *) ${PKG_ADD} -m ${MACHINE_ARCH} ${STAGE_PKGFILE} ;; \
+ esac
+ .endif

With these, I was able to build and install "security/sudo" and "www/lynx"
(and they seem to work properly). Still not good enough to build
"audio/mpg123" on my Lemote YEELOONG (compiler dies with "Bus Error" still).


Manuel Bouyer

2014-02-07 09:07:00 UTC

Post by John D. Baker
Here's what I had to do (so far) to get anything from pkgsrc to
compile/install on my Lemote YEELOONG.
In "mk/wrapper/transform-gcc": Discard "-O[23s]" so compiler doesn't
die with "Bus Error".

This doesn't happen with a netbsd-6 userland. What changed in compiler
land between -6 and HEAD? I think mknative has been re-run; maybe that's
the cause?

Post by John D. Baker
Discard "-pthread" to get executable that
doesn't die with "Memory Fault" (SegFault).

I guess this is related to TLS. Someone with some TLS clue would need to
look at this.


Michael

2014-02-04 06:46:16 UTC

Hello,

On Sat, 1 Feb 2014 02:03:24 -0600 (CST)

Post by John D. Baker
Can't build anything from pkgsrc as the C compiler dies with bus error
compiling the first real source in "pkgtools/digest". (It works well
enough to complete the "./configure" script, though.)

Here it segfaults at jobs.c in build.sh tools

Post by John D. Baker
PR/48564: 'tar' corrupts files extracted to NFS. I originally saw this as
a result of bizarre modifications suggested when running 'etcupdate' on
my NFS-root installation. Then LOONGSON kernel build breakage, etc.
intervened. Finally, I sat down and analyzed what the nature of the
data corruption was.

Any writes to NFS get corrupted. Reads are fine on Loongson but not on
(for example) sgimips n32.
It's been like that for a while; nobody seems to know why.

Also, on sgimips, data piped between processes gets corrupted. Things
like tar xzf something.tar.gz will produce garbage, but gunzip
something.tar.gz; tar xf something.tar works.
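
One way to check for this kind of pipe corruption without involving tar or gzip is to push a known payload through a child process and compare checksums. This is a minimal Python sketch under that assumption; the function name is hypothetical, and the child is simply a second Python interpreter that copies its standard input to its standard output.

```python
import hashlib
import subprocess
import sys

# The child process copies stdin to stdout unchanged.
COPY_CHILD = [
    sys.executable,
    "-c",
    "import sys, shutil; shutil.copyfileobj(sys.stdin.buffer, sys.stdout.buffer)",
]


def pipe_roundtrip_ok(payload: bytes) -> bool:
    """Send payload through the child via pipes and compare SHA-256 digests."""
    result = subprocess.run(
        COPY_CHILD, input=payload, stdout=subprocess.PIPE, check=True
    )
    sent = hashlib.sha256(payload).digest()
    received = hashlib.sha256(result.stdout).digest()
    return sent == received
```

On a healthy system this should return True for any payload; a False result on the affected machine would point at the pipe path itself rather than at tar or gzip.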

have fun
Michael

John D. Baker

2014-04-30 19:40:25 UTC

Lemote YEELOONG, evbmips-mips64el

So, to complicate matters, I've been building with HAVE_GCC=48. The
kernel boots. Userland mostly works. Mostly same problems from before,
a few new wrinkles...

The very few packages I managed to build using gcc 4.5.4 (sudo and lynx)
seem to work fine with the gcc-4.8.3-built kernel/libraries.

Post by John D. Baker
'amd' (am-utils) fails, claiming "Invalid argument" on all automount
points.
The 'dig' utility dies with segfault in pthread_getspecific().

'cvs update' to a pkgsrc tree on local disk is quite slow. Perhaps it's
due to said local disk being an "ext2fs" partition that is shared with
the gnewsense-3 and OpenBSD systems also installed on the machine.

What should be a non-fatal error (encountering the "wip" subdirectory)
ultimately is fatal. Using 'ktruss' I observed it traversing the tree down
and then up again at least twice, but no further down than "time" and
then it either seems to get stuck in "select", or terminates when the
remote CVS pserver resets the connection.

A workaround is to update each category subdirectory specifically
with the usual shell filename-generation facilities.

Post by John D. Baker
PR/48564: 'tar' corrupts files extracted to NFS.

Same problem occurs with data written via output redirection (i.e.,
'cat src > dst' where "dst" is on NFS).

Strangely, a file written by 'tar' and the other written by redirection
are not broken in quite the same fashion. They have the same interleaved
data-block/null-block pattern and they're both truncated to the same
multiple of 8192 bytes but apparently the data blocks after the first
are different. I'll look at this again and see if I can spot what's
going on.
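
To compare the two damaged files against the source, a block-by-block classification may help. This is a minimal Python sketch, assuming the 8192-byte block size suggested by the truncation multiple above; the function name and the labels are hypothetical.

```python
BLOCK = 8192  # assumed from the truncation multiple reported above


def classify_blocks(original: bytes, written: bytes, block: int = BLOCK):
    """Label each block of `original` by how it appears in `written`.

    Labels: "ok" (identical), "null" (all zero bytes), "diff" (other data),
    "missing" (past the end of the written file).
    """
    labels = []
    for offset in range(0, len(original), block):
        want = original[offset:offset + block]
        got = written[offset:offset + block]
        if not got:
            labels.append("missing")
        elif got == want:
            labels.append("ok")
        elif got.count(0) == len(got):
            labels.append("null")
        else:
            labels.append("diff")
    return labels
```

Running this once for the file written by 'tar' and once for the file written by redirection, each against the original read from local disk, would show whether the "diff" blocks fall at the same offsets in both cases.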

A file edited/written to NFS with 'vi' causes reports of unknown file
format and file truncation, but the file appeared to be intact.

toolchain/48696 seems to have analyzed the native compiler problems.
Thanks for that. Hope committable fixes are forthcoming.

Post by John D. Baker
X server: the undefined symbol issues seem to have been resolved, but
now the server complains that it can't load the "int10" module, saying

Once again failing due to an undefined symbol. Side effect of gcc48?

X.Org X Server 1.10.6
Release Date: 2011-07-08
X Protocol Version 11, Revision 0
Build Operating System: NetBSD/evbmips -
Current Operating System: NetBSD chalk.technoskunk.fur 6.99.40 NetBSD 6.99.40 (YEELOONG) #2: Tue Apr 29 12:13:16 CDT 2014 ***@verthandi.technoskunk.fur:/d0/build/current/obj/mips64el/sys/arch/evbmips/compile/YEELOONG evbmips
Build Date: 01 August 2011 01:01:00AM

Current version of pixman: 0.32.4
Before reporting problems, check http://wiki.X.Org
to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Tue Apr 29 18:06:06 2014
(==) Using default built-in configuration (12 lines)
(EE) Failed to load /usr/X11R7/lib/modules/drivers/siliconmotion_drv.so: /usr/X11R7/lib/modules/drivers/siliconmotion_drv.so: Undefined symbol "exaOffscreenFree" (symnum = 101)
(EE) Failed to load module "siliconmotion" (loader failed, 7)
(EE) No drivers available.

Fatal server error:
no screens found

Please consult the The X.Org Foundation support
at http://wiki.X.Org
for help.
Please also check the log file at "/var/log/Xorg.0.log" for additional information.

New problems:

'etcupdate' seems to be ignoring the "-a" and "-l" options, requiring
user action on files with no local modifications and files changed, but
with the same RCS IDs as the baseline files from the {,x}etc.tgz sets.

'etcupdate' mangles the invocation of 'postinstall' at the end, somehow
failing to pass it the "-s" switch before the source archives.
'postinstall' then complains about "unknown action 'etc.tgz'" and prints
its usage message. Manual invocation of 'postinstall' succeeds.

Pretty sure these are related to gcc48 weirdness as I don't recall it
happening before I started compiling with HAVE_GCC=48.

Some scripts refuse to run, eliciting a "Bad address." notification
from the interpreter "/bin/sh" and/or "/bin/ksh". If 'dhcpcd' is used,
the 'dhcpcd-run-hooks' script fails this way and the subsequent calls
to 'resolvconf' print repeated usage messages. During startup
'/etc/rc.d/postfix' prints its usage message and exits.

Attempting to start/restart/etc. any service manually via its rc.d
script simply prints its usage message and exits.

Have also seen the "Bad address" notification from other tools. Notably,
attempting to build from pkgsrc resulted in:

make: exec(true) failed. (Bad address)

So, it's other things having that problem as well.

wscons switching doesn't always repaint the display completely, leaving
characters from another virtual terminal on the screen. The specific
case was running 'lynx' in one terminal with an unordered list displayed.
Two spaces to the left of each list item a single character from the
corresponding position of the previous virtual terminal was displayed.


matthew green

2014-04-30 22:08:01 UTC

[ mostly replying to GCC or X11 specific things ]

Post by John D. Baker
So, to complicate matters, I've been building with HAVE_GCC=48. The
kernel boots. Userland mostly works. Mostly same problems from before,
a few new wrinkles...

i've tested o32 and n32 a little bit with GCC 4.8. it seems to
work at least as poorly as GCC 4.5 for me.

if you see specific GCC 4.8 regressions vs. 4.5, please file PRs
about each issue, so we can at least investigate them from the
compiler POV.

Post by John D. Baker

Post by John D. Baker
X server: the undefined symbol issues seem to have been resolved, but
now the server complains that it can't load the "int10" module, saying

Once again failing due to an undefined symbol. Side effect of gcc48?

you probably don't want "int10" on non-x86, IIRC?

Post by John D. Baker
(==) Using default built-in configuration (12 lines)
(EE) Failed to load /usr/X11R7/lib/modules/drivers/siliconmotion_drv.so: /usr/X11R7/lib/modules/drivers/siliconmotion_drv.so: Undefined symbol "exaOffscreenFree" (symnum = 101)
(EE) Failed to load module "siliconmotion" (loader failed, 7)
(EE) No drivers available.

interesting; this symbol should be provided by the Xorg server itself.

Mac, this reminds me of a PPC issue we've seen?

Post by John D. Baker
'etcupdate' seems to be ignoring the "-a" and "-l" options, requiring
user action on files with no local modifications and files changed, but
with the same RCS IDs as the baseline files from the {,x}etc.tgz sets.
'etcupdate' mangles the invocation of 'postinstall' at the end, somehow
failing to pass it the "-s" switch before the source archives.
'postinstall' then complains about "unknown action 'etc.tgz'" and prints
its usage message. Manual invocation of 'postinstall' succeeds.
Pretty sure these are related to gcc48 weirdness as I don't recall it
happening before I started compiling with HAVE_GCC=48.

that seems strange. i don't tend to use etcupdate (just postinstall),
can you confirm they're GCC 4.8 specific?

thanks,

.mrg.

Michael

2014-05-01 12:58:48 UTC

Hello,

On Thu, 01 May 2014 08:08:01 +1000

Post by matthew green

Post by John D. Baker

Post by John D. Baker
X server: the undefined symbol issues seem to have been resolved, but
now the server complains that it can't load the "int10" module, saying

Once again failing due to an undefined symbol. Side effect of gcc48?

you probably don't want "int10" on non-x86, IIRC?

At the very least failing to load the module should be non-fatal.

Post by matthew green

Post by John D. Baker
(==) Using default built-in configuration (12 lines)
(EE) Failed to load /usr/X11R7/lib/modules/drivers/siliconmotion_drv.so: /usr/X11R7/lib/modules/drivers/siliconmotion_drv.so: Undefined symbol "exaOffscreenFree" (symnum = 101)
(EE) Failed to load module "siliconmotion" (loader failed, 7)
(EE) No drivers available.

interesting; this symbol should be provided by the Xorg server itself.
Mac, this reminds me of a PPC issue we've seen?

Hmm, I don't remember that one. I do remember having issues like that
on AIX, where symbols provided by the main binary itself needed special
treatment ( had to be explicitly exported and such ).

But that one should have come from the EXA module IIRC.

have fun
Michael

John D. Baker

2014-05-02 16:25:47 UTC

Post by matthew green
if you see specific GCC 4.8 regressions vs. 4.5, please file PRs
about each issue, so we can at least investigate them from the
compiler POV.

I'll try. Right now I just have informal observations based on what
I remember from the last time I booted a gcc45-built system. I don't
get to try often as I'm actively using the system (under gnewsense Linux)
as a graphical terminal to my other systems.

Since I have all gcc48-built stuff locally, I'll see about grabbing a
gcc45-built release from the snapshot builds and check with that. Hmm.
I should get another SD card so I can switch local installs as easily
as switching NFS roots.

Reorganizing for proper context...

Post by matthew green

Post by John D. Baker

Post by John D. Baker
X server: the undefined symbol issues seem to have been resolved, but
now the server complains that it can't load the "int10" module, saying

you probably don't want "int10" on non-x86, IIRC?

I believe this was mentioned before. It's not that I want "int10", but
that's how it's built by default. As macallan noted, something like
this shouldn't be fatal on a system where it doesn't apply. Or perhaps
some conditionals could be used to require/include the module only on
platforms where it makes sense?

The above was observed on a gcc45-built system. It may still be there
on a gcc48-built system, but the undefined symbol error below is probably
masking it (never gets far enough to complain about the missing module).

Post by matthew green

Post by John D. Baker
Once again failing due to an undefined symbol. Side effect of gcc48?
(==) Using default built-in configuration (12 lines)
(EE) Failed to load /usr/X11R7/lib/modules/drivers/siliconmotion_drv.so: /usr/X11R7/lib/modules/drivers/siliconmotion_drv.so: Undefined symbol "exaOffscreenFree" (symnum = 101)
(EE) Failed to load module "siliconmotion" (loader failed, 7)
(EE) No drivers available.

ISTR that a long, long time ago, the X server failed on evbmips-mips64el
due to some other undefined symbol, so that's what my "Once again" phrase
was expressing. This is what I see on a gcc48-built system, not the
missing module complaint from a gcc45-built system.

Post by matthew green

Post by John D. Baker
'etcupdate' seems to be ignoring the "-a" and "-l" options, requiring
'etcupdate' mangles the invocation of 'postinstall' at the end, somehow

that seems strange. i don't tend to use etcupdate (just postinstall),
can you confirm they're GCC 4.8 specific?

Hmm. I suppose 'etcupdate' is really only useful following a version-
spanning update and 'postinstall' is sufficient for routine updates
within the same version.

As above, I'll grab a gcc45-built release from the snapshot build
hosts and confirm behavior differences.

--
|/"\ John D. Baker, KN5UKS NetBSD Darwin/MacOS X
|\ / jdbaker[snail]mylinuxisp[flyspeck]com OpenBSD FreeBSD
| X No HTML/proprietary data in email. BSD just sits there and works!
|/ \ GPGkeyID: D703 4A7E 479F 63F8 D3F4 BD99 9572 8F23 E4AD 1645

John D. Baker

2014-05-03 23:28:10 UTC

Ok, using a gcc45-built snapshot build from the other day, the X server
fails due to the missing "int10" module:

X.Org X Server 1.10.6
Release Date: 2011-07-08
X Protocol Version 11, Revision 0
Build Operating System: NetBSD/evbmips -
Current Operating System: NetBSD chalk.technoskunk.fur 6.99.40 NetBSD 6.99.40 (LOONGSON) #0: Fri May 2 09:45:38 UTC 2014 ***@b41.netbsd.org:/home/builds/ab/HEAD/evbmips-mips64el/201405020730Z-obj/home/builds/ab/HEAD/src/sys/arch/evbmips/compile/LOONGSON evbmips
Build Date: 01 August 2011 01:01:00AM

Current version of pixman: 0.32.4
Before reporting problems, check http://wiki.X.Org
to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Sat May 3 18:01:17 2014
(==) Using default built-in configuration (12 lines)
(EE) SMI: Failed to load module "int10" (module does not exist, 0)
Segmentation fault at address 0x0

Fatal server error:
Caught signal 11 (Segmentation fault). Server aborting

Please consult the The X.Org Foundation support
at http://wiki.X.Org
for help.
Please also check the log file at "/var/log/Xorg.0.log" for additional information.

On a gcc48-built system, by contrast, it fails due to an undefined symbol:

X.Org X Server 1.10.6
Release Date: 2011-07-08
X Protocol Version 11, Revision 0
Build Operating System: NetBSD/evbmips -
Current Operating System: NetBSD chalk.technoskunk.fur 6.99.40 NetBSD 6.99.40 (YEELOONG) #2: Tue Apr 29 12:13:16 CDT 2014 ***@verthandi.technoskunk.fur:/d0/build/current/obj/mips64el/sys/arch/evbmips/compile/YEELOONG evbmips
Build Date: 01 August 2011 01:01:00AM

Current version of pixman: 0.32.4
Before reporting problems, check http://wiki.X.Org
to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Tue Apr 29 18:06:06 2014
(==) Using default built-in configuration (12 lines)
(EE) Failed to load /usr/X11R7/lib/modules/drivers/siliconmotion_drv.so: /usr/X11R7/lib/modules/drivers/siliconmotion_drv.so: Undefined symbol "exaOffscreenFree" (symnum = 101)
(EE) Failed to load module "siliconmotion" (loader failed, 7)
(EE) No drivers available.

Fatal server error:
no screens found

Please consult the The X.Org Foundation support
at http://wiki.X.Org
for help.
Please also check the log file at "/var/log/Xorg.0.log" for additional information.

Post by John D. Baker
'etcupdate' seems to be ignoring the "-a" and "-l" options, requiring

It seems to ignore the "-v" option as well.

Post by John D. Baker
'etcupdate' mangles the invocation of 'postinstall' at the end, somehow

These have now been seen to occur on gcc45-built system as well.

The various rc.d/* scripts seem to behave better on the gcc45-built
system, although the "postfix" script still complains that it was invoked
improperly.

Although marked executable, invoking the "MAKEDEV" script simply as
'./MAKEDEV all' causes it to merely print its usage message. It must
be explicitly invoked as 'sh ./MAKEDEV all' for it to run and create
the device nodes. (Seen on both gcc45- and gcc48-built systems.)
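
One way to narrow this down might be to compare the argument vector a
program actually receives when it is started directly with what it gets
when started through a script's interpreter line. The following is only
a sketch under that assumption: format_argv is a hypothetical helper
written for illustration, not part of any NetBSD tool, and whether lost
arguments are really the cause here is unconfirmed.

```c
#include <stdio.h>
#include <string.h>

/* Render argv as "[0]=a [1]=b ..." into out.
 * Returns 0 on success, -1 if the buffer is too small. */
int format_argv(int argc, char *const argv[], char *out, size_t outlen)
{
    size_t used = 0;

    if (outlen == 0)
        return -1;
    out[0] = '\0';
    for (int i = 0; i < argc; i++) {
        /* separate entries with one space, none before the first */
        int n = snprintf(out + used, outlen - used, "%s[%d]=%s",
                         i == 0 ? "" : " ", i, argv[i]);
        if (n < 0 || (size_t)n >= outlen - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

Called from a main() that prints the result, this would show at a glance
whether "all" still arrives as an argument in both invocation styles.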

Since the gcc45-built snapshots are built without macallan's libgmp
patches, I've not been able to test compiling.

On my gcc48-built system with the libgmp patches, trying to build anything
out of pkgsrc fails before ever getting to the compiler--"Bad address"
errors upon invocation of other utilities. I should perhaps try something
more direct.

--
|/"\ John D. Baker, KN5UKS NetBSD Darwin/MacOS X
|\ / jdbaker[snail]mylinuxisp[flyspeck]com OpenBSD FreeBSD
| X No HTML/proprietary data in email. BSD just sits there and works!
|/ \ GPGkeyID: D703 4A7E 479F 63F8 D3F4 BD99 9572 8F23 E4AD 1645

Michael

2014-05-04 01:16:52 UTC

Hello,

On Sat, 3 May 2014 18:28:10 -0500 (CDT)

Post by John D. Baker
Ok, using a gcc45-built snapshot build from the other day, the X server
X.Org X Server 1.10.6
Release Date: 2011-07-08
X Protocol Version 11, Revision 0
Build Operating System: NetBSD/evbmips -
Build Date: 01 August 2011 01:01:00AM
Current version of pixman: 0.32.4
Before reporting problems, check http://wiki.X.Org
to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Sat May 3 18:01:17 2014
(==) Using default built-in configuration (12 lines)
(EE) SMI: Failed to load module "int10" (module does not exist, 0)
Segmentation fault at address 0x0

Hmm, that's probably not the int10 module itself. It looks more like
some mmap() failing without the error being checked, or something trying
to jump to an unresolved symbol.
Funny thing is, I'm using the siliconmotion driver myself, on Gdium,
which has an SM502. The driver probably doesn't even try to use int10
on that hardware since it's got no VGA compatibility whatsoever and was
never intended to be paired with anything x86.
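
To illustrate the kind of unchecked mmap() failure meant here, a minimal
sketch of the checked form follows. It is not taken from the X server or
the siliconmotion driver; map_zeroed is a hypothetical helper, and
mapping /dev/zero is just a convenient stand-in for whatever the real
code maps.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map len bytes of zero-filled private memory.
 * Returns NULL on any failure. Note that mmap() itself reports failure
 * with MAP_FAILED, which is not NULL, so a plain NULL test on its
 * return value would miss the error. */
void *map_zeroed(size_t len)
{
    int fd = open("/dev/zero", O_RDWR);
    if (fd == -1)
        return NULL;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */

    return p == MAP_FAILED ? NULL : p;
}
```

A caller that tests the result before touching it gets a clean error
path in place of a fault at an unexpected address.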

Post by John D. Baker
X.Org X Server 1.10.6
Release Date: 2011-07-08
X Protocol Version 11, Revision 0
Build Operating System: NetBSD/evbmips -
Build Date: 01 August 2011 01:01:00AM
Current version of pixman: 0.32.4
Before reporting problems, check http://wiki.X.Org
to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Tue Apr 29 18:06:06 2014
(==) Using default built-in configuration (12 lines)
(EE) Failed to load /usr/X11R7/lib/modules/drivers/siliconmotion_drv.so: /usr/X11R7/lib/modules/drivers/siliconmotion_drv.so: Undefined symbol "exaOffscreenFree" (symnum = 101)
(EE) Failed to load module "siliconmotion" (loader failed, 7)
(EE) No drivers available.
no screens found
Please consult the The X.Org Foundation support
at http://wiki.X.Org
for help.
Please also check the log file at "/var/log/Xorg.0.log" for additional information.

Yeah, I've seen that one but never got around to investigating it, or
even checking whether it still happens with 4.5.

Post by John D. Baker
Since the gcc45-built snapshots are built without macallan's libgmp
patches, I've not been able to test compiling.
On my gcc48-built system with the libgmp patches, trying to build anything
out of pkgsrc fails before ever getting to the compiler--"Bad address"
errors upon invocation of other utilities. I should perhaps try something
more direct.

Hmm, I tried to build things on Gdium with 4.8 and the patches. That
successfully built simple stuff like flops, but when building things
like python the resulting binaries would crash.
On sgimips ( using an n32 userland with gcc 4.5 and the patches I sent
last time ) I managed to get a working tcsh before failing due to other
issues. Without the patches I get the same errors as on loongson
( unsurprisingly ).
I should probably try to build an o32 userland for sgimips with gcc 4.8
just to see which issues go away, if any ( the gmp crashes /should/ go
away since it should be purely 32bit ).

have fun
Michael

matthew green

2014-05-04 01:56:53 UTC

Post by John D. Baker
Ok, using a gcc45-built snapshot build from the other day, the X server
X.Org X Server 1.10.6
Release Date: 2011-07-08
X Protocol Version 11, Revision 0
Build Operating System: NetBSD/evbmips -
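Current Operating System: NetBSD chalk.technoskunk.fur 6.99.40 NetBSD 6.99.40 (LOONGSON) #0: Fri May 2 09:45:38 UTC 2014 ***@b41.netbsd.org:/home/builds/ab/HEAD/evbmips-mips64el/201405020730Z-obj/home/builds/ab/HEAD/src/sys/arch/evbmips/compile/LOONGSON evbmips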
Build Date: 01 August 2011 01:01:00AM
Current version of pixman: 0.32.4
Before reporting problems, check http://wiki.X.Org
to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Sat May 3 18:01:17 2014
(==) Using default built-in configuration (12 lines)
(EE) SMI: Failed to load module "int10" (module does not exist, 0)
Segmentation fault at address 0x0
Caught signal 11 (Segmentation fault). Server aborting

it isn't clear to me that the int10 issue is related to the crash;
i think that's fairly normal and shouldn't crash.

could you try running this under gdb? if not, we might need
an xorg.conf that has

Option "NoTrapSignals" "true"

in the ServerFlags section.

Post by John D. Baker
X.Org X Server 1.10.6
Release Date: 2011-07-08
X Protocol Version 11, Revision 0
Build Operating System: NetBSD/evbmips -
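Current Operating System: NetBSD chalk.technoskunk.fur 6.99.40 NetBSD 6.99.40 (YEELOONG) #2: Tue Apr 29 12:13:16 CDT 2014 ***@verthandi.technoskunk.fur:/d0/build/current/obj/mips64el/sys/arch/evbmips/compile/YEELOONG evbmips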
Build Date: 01 August 2011 01:01:00AM
Current version of pixman: 0.32.4
Before reporting problems, check http://wiki.X.Org
to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Tue Apr 29 18:06:06 2014
(==) Using default built-in configuration (12 lines)
(EE) Failed to load /usr/X11R7/lib/modules/drivers/siliconmotion_drv.so: /usr/X11R7/lib/modules/drivers/siliconmotion_drv.so: Undefined symbol "exaOffscreenFree" (symnum = 101)
(EE) Failed to load module "siliconmotion" (loader failed, 7)
(EE) No drivers available.
no screens found
Please consult the The X.Org Foundation support
at http://wiki.X.Org
for help.
Please also check the log file at "/var/log/Xorg.0.log" for additional information.

hmm, this one i'm not sure about. perhaps we need to link against
exa module directly?

sounds like /bin/sh has some problems in general.

.mrg.

Michael

2014-05-01 00:58:25 UTC

Hello,

On Wed, 30 Apr 2014 14:40:25 -0500 (CDT)

Post by John D. Baker
Lemote YEELOONG, evbmips-mips64el
So, to complicate matters, I've been building with HAVE_GCC=48. The
kernel boots. Userland mostly works. Mostly same problems from before,
a few new wrinkles...

GCC 4.8 doesn't seem to do much worse than 4.5 on MIPS.

Post by John D. Baker
The very few packages I managed to build using gcc 4.5.4 (sudo and lynx)
seem to work fine with the gcc-4.8.3-built kernel/libraries.

I found out where the gcc crash is coming from - it's libgmp, namely
the assembler code. For n32 it uses code that assumes LP64, which leads
to occasional unaligned accesses which in turn crash gcc.
So I made a patch that uses generic C code instead on n32 ( see attachment )
With this, at least gcc doesn't crash anymore and it produced a few
working binaries with -O2.
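
As an illustration of why generic C avoids the problem (this is not the
patch itself, which was an attachment and is not reproduced here): code
that issues a direct 64-bit load needs an 8-byte-aligned address, while
portable C can copy the bytes and let the compiler choose safe accesses.
The helper names below are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Return 1 if p is aligned for a natural 64-bit load, 0 otherwise. */
int is_aligned64(const void *p)
{
    return ((uintptr_t)p % sizeof(uint64_t)) == 0;
}

/* Load a 64-bit value from any address without risking an alignment
 * trap: memcpy has no alignment requirement on its arguments. */
uint64_t load_u64(const void *p)
{
    uint64_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}
```

On a strict-alignment CPU, dereferencing a uint64_t pointer for which
is_aligned64() returns 0 can raise a bus error, whereas load_u64() works
for any address at some cost in speed.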

Post by John D. Baker

Post by John D. Baker
X server: the undefined symbol issues seem to have been resolved, but
now the server complains that it can't load the "int10" module, saying

Once again failing due to an undefined symbol. Side effect of gcc48?
X.Org X Server 1.10.6
Release Date: 2011-07-08
X Protocol Version 11, Revision 0
Build Operating System: NetBSD/evbmips -
Build Date: 01 August 2011 01:01:00AM
Current version of pixman: 0.32.4
Before reporting problems, check http://wiki.X.Org
to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
(++) from command line, (!!) notice, (II) informational,
(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Tue Apr 29 18:06:06 2014
(==) Using default built-in configuration (12 lines)
(EE) Failed to load /usr/X11R7/lib/modules/drivers/siliconmotion_drv.so: /usr/X11R7/lib/modules/drivers/siliconmotion_drv.so: Undefined symbol "exaOffscreenFree" (symnum = 101)
(EE) Failed to load module "siliconmotion" (loader failed, 7)
(EE) No drivers available.
no screens found
Please consult the The X.Org Foundation support
at http://wiki.X.Org
for help.
Please also check the log file at "/var/log/Xorg.0.log" for additional information.

I've seen that too ( or rather, something similar ). Didn't get to investigate yet.

Post by John D. Baker
wscons switching doesn't always repaint the display completely, leaving
characters from another virtual terminal on the screen. The specific
case was running 'lynx' in one terminal with an unordered list displayed.
Two spaces to the left of each list item a single character from the
corresponding position of the previous virtual terminal was displayed.

Which driver is that?

have fun
Michael

John D. Baker

2014-05-02 17:25:40 UTC

Post by Michael
So I made a patch that uses generic C code instead on n32 ( see attachment )
With this, at least gcc doesn't crash anymore and it produced a few
working binaries with -O2.

Thanks. I'll try it when I next get a chance.

Post by Michael

Post by John D. Baker
wscons switching doesn't always repaint the display completely, leaving
characters from another virtual terminal on the screen. The specific

Which driver is that?

lynxfb0 at pci0 dev 8 function 0: vendor 0x126f product 0x0712 (rev. 0xb0)
lynxfb0: 1024 x 600, 16 bpp, stride 2048
wsdisplay0 at lynxfb0 kbdmux 1: console (default, vt100 emulation)
wsmux1: connecting to wsdisplay0

Also, if one has scrolled the display back and further output occurs,
the display is overwritten with the new output--which may be desirable
as opposed to emulating "reposition and scroll on output" like the
default behavior of xterm, etc. (I don't seem to remember how x86
virtual terminals behave in this situation...)

--
|/"\ John D. Baker, KN5UKS NetBSD Darwin/MacOS X
|\ / jdbaker[snail]mylinuxisp[flyspeck]com OpenBSD FreeBSD
| X No HTML/proprietary data in email. BSD just sits there and works!
|/ \ GPGkeyID: D703 4A7E 479F 63F8 D3F4 BD99 9572 8F23 E4AD 1645

Michael

2014-05-03 01:16:58 UTC

On Fri, 2 May 2014 12:25:40 -0500 (CDT)

Post by John D. Baker

Post by Michael

Post by John D. Baker
wscons switching doesn't always repaint the display completely, leaving
characters from another virtual terminal on the screen. The specific

Which driver is that?

lynxfb0 at pci0 dev 8 function 0: vendor 0x126f product 0x0712 (rev. 0xb0)
lynxfb0: 1024 x 600, 16 bpp, stride 2048
wsdisplay0 at lynxfb0 kbdmux 1: console (default, vt100 emulation)
wsmux1: connecting to wsdisplay0

Hmm, that's one of the few graphical console drivers that I not only
didn't write but also never messed with.

Post by John D. Baker
Also, if one has scrolled the display back and further output occurs,
the display is overwritten with the new output--which may be desirable
as opposed to emulating "reposition and scroll on output" like the
default behavior of xterm, etc. (I don't seem to remember how x86
virtual terminals behave in this situation...)

None of that should happen. Do you use VCONS_DRAW_INTR?

have fun
Michael

John D. Baker

2014-05-03 09:04:48 UTC

Post by Michael

Post by John D. Baker
Also, if one has scrolled the display back and further output occurs,
the display is overwritten with the new output--which may be desirable

None of that should happen. Do you use VCONS_DRAW_INTR?

There is no such symbol in any of the "evbmips" kernel config files,
least of all LOONGSON, so that would be "no".

Also, on the repainting issue: I've only observed the behavior while
running 'lynx'--might have something to do with the tty mode it sets
on startup. Also, only observed when displaying lists. In an unordered
list, there is a single character of garbage 2 positions to the left of
the "bullet" character. In an ordered list, there is a single character
of garbage immediately left of a single-digit item index. Two-digit
indexes do not exhibit garbage characters from other terminals.

Also, after exiting 'lynx', any portion of the screen still displaying
a list rendered by 'lynx' continues to pick up garbage when switching
to/from other terminals. The portion of the screen below any remaining
'lynx' output is not affected.

--
|/"\ John D. Baker, KN5UKS NetBSD Darwin/MacOS X
|\ / jdbaker[snail]mylinuxisp[flyspeck]com OpenBSD FreeBSD
| X No HTML/proprietary data in email. BSD just sits there and works!
|/ \ GPGkeyID: D703 4A7E 479F 63F8 D3F4 BD99 9572 8F23 E4AD 1645

Michael

2014-05-03 13:05:43 UTC

Hello,

On Sat, 3 May 2014 04:04:48 -0500 (CDT)

Post by John D. Baker

Post by Michael

Post by John D. Baker
Also, if one has scrolled the display back and further output occurs,
the display is overwritten with the new output--which may be desirable

None of that should happen. Do you use VCONS_DRAW_INTR?

There is no such symbol in any of the "evbmips" kernel config files,
least of all LOONGSON, so that would be "no".

Ok.

Post by John D. Baker
Also, on the repainting issue: I've only observed the behavior while
running 'lynx'--might have something to do with the tty mode it sets
on startup. Also, only observed when displaying lists. In an unordered
list, there is a single character of garbage 2 positions to the left of
the "bullet" character. In an ordered list, there is a single character
of garbage immediately left of a single-digit item index. Two-digit
indexes do not exhibit garbage characters from other terminals.
Also, after exiting 'lynx', any portion of the screen still displaying
a list rendered by 'lynx' continues to pick up garbage when switching
to/from other terminals. The portion of the screen below any remaining
'lynx' output is not affected.

I see nothing obviously wrong in lynxfb.c - could you try to reproduce
the issue with a different driver? Ideally something like genfb on x86
or any of the other ports with more users?
( I'm trying to see if this is an issue with wsdisplay or lynxfb )
Under normal circumstances, no characters should stay visible when
switching consoles.

have fun
Michael


