Bugzilla – Full Text Bug Listing

| Summary: | Traces differ in 32 and 64 bit modes | ||
|---|---|---|---|
| Product: | nsc | Reporter: | Sam Jansen <sam.jansen> |
| Component: | Linux | Assignee: | Sam Jansen <sam.jansen> |
| Status: | RESOLVED WONTFIX | ||
| Severity: | normal | CC: | gjcarneiro |
| Priority: | P3 | ||
| Version: | unspecified | ||
| Hardware: | All | ||
| OS: | All | ||
Description
Sam Jansen
2008-09-13 22:08:22 UTC
The port/sequence number problem for 2.6.18 is solved; see https://secure.wand.net.nz/mercurial/nsc/rev/d551aea44bf4. This has not yet been merged onto the head branch but presumably will be before long. Investigation continues into the other differences.

I am not sure whether we will be able to resolve this bug. My debugging so far has found the following:

In the function __alloc_skb (net/core/skbuff.c), the total size of the data allocated for the skbuff (skb->truesize) depends on sizeof(struct sk_buff). On 32-bit systems this is 152 bytes; on 64-bit systems it is 216 bytes. So I can see in gdb that on a 32-bit system an sk_buff is allocated with skb->truesize=1944, whereas on a 64-bit system skb->truesize=2008. In both cases __alloc_skb is called with the same arguments (size=1792).

It is this value, skb->truesize, that is used in accounting for the amount of memory used by the socket. For example, sk_charge_skb does: "sk->sk_wmem_queued += skb->truesize;"

sk->sk_wmem_queued is then used in important decisions taken in tcp_sendmsg. For example, sk_stream_memory_free is implemented as "return sk->sk_wmem_queued < sk->sk_sndbuf;". When sk_stream_memory_free returns false in tcp_sendmsg, tcp_push() may be called, which naturally sets the PSH flag. The difference in PSH flag is the first difference we see in traces.

This basic difference in memory used per sk_buff will have other repercussions which will cause the traces to diverge as well. Basically, in 64-bit mode, you fill up your buffers more quickly than in 32-bit mode. Looking at include/linux/skbuff.h, this is due to the sk_buff structure having many pointers in it; these are all naturally twice the size in 64-bit.

I don't see any easy solution to this, though ideas are welcome. At this stage I believe 64-bit NSC to be working correctly.

This gives us a bit of a headache for regression testing. It sounds like the Linux kernel itself behaves differently on x86 and x86_64. If so, this sounds unsolvable with the current regression testing framework.

There is no rule that the regression test itself needs to call back into the helper to execute the test and compare the results. We don't need a 1:1 relationship between trace files and tests (at least this used to be the case). Couldn't there be two trace directories, one for 32-bit and one for 64-bit? The regression test could then select which one to compare against based on the result of:
import platform
platform.architecture()
which returns something like:
('32bit', 'WindowsPE')
('64bit', '')
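
The two-directory idea could be sketched roughly as follows. This is only an illustration: the function name `select_trace_dir` and the directory names `traces-32bit` and `traces-64bit` are hypothetical and not part of the existing regression framework. Note also that `platform.architecture()` reports the word size of the Python interpreter binary, so the sketch assumes the interpreter and the NSC build under test have the same word size.

```python
import os
import platform


def select_trace_dir(base_dir, arch_bits=None):
    """Return the reference-trace directory for the given word size.

    The 'traces-32bit' / 'traces-64bit' layout is a hypothetical
    convention used only for this sketch.
    """
    if arch_bits is None:
        # platform.architecture() returns a (bits, linkage) tuple;
        # only the first element is needed here.
        arch_bits = platform.architecture()[0]
    if arch_bits not in ("32bit", "64bit"):
        raise ValueError("unexpected architecture: %r" % (arch_bits,))
    return os.path.join(base_dir, "traces-" + arch_bits)
```

A regression test would then compare its output against the files under `select_trace_dir(...)` instead of a single shared trace directory. Passing `arch_bits` explicitly leaves room for cases where the interpreter's word size is not the right thing to key on.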
|
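
For reference, the memory accounting from the description can be reproduced with a small model. This is a simplified sketch, not kernel code: it assumes truesize is just the requested size plus sizeof(struct sk_buff), uses the figures quoted in the description, and picks a send-buffer size of 16000 bytes purely for illustration. The names `truesize` and `skbs_until_full` are made up for this example.

```python
SKB_DATA_SIZE = 1792  # size argument passed to __alloc_skb in both modes

# sizeof(struct sk_buff) in bytes, as reported in the description
SIZEOF_SK_BUFF = {"32bit": 152, "64bit": 216}


def truesize(arch):
    # Simplified model: data size plus the size of the sk_buff structure.
    return SKB_DATA_SIZE + SIZEOF_SK_BUFF[arch]


def skbs_until_full(arch, sndbuf):
    """Count sk_buffs charged before the memory-free check fails.

    Mirrors 'sk_wmem_queued += skb->truesize' and
    'sk_wmem_queued < sk_sndbuf' from the description.
    """
    wmem_queued = 0
    count = 0
    while wmem_queued < sndbuf:        # sk_stream_memory_free()
        wmem_queued += truesize(arch)  # sk_charge_skb()
        count += 1
    return count


HYPOTHETICAL_SNDBUF = 16000  # illustrative value, not taken from the bug

for arch in ("32bit", "64bit"):
    print(arch, truesize(arch), skbs_until_full(arch, HYPOTHETICAL_SNDBUF))
```

With these assumptions the model gives truesize values of 1944 and 2008, and the 64-bit case hits the send-buffer limit one sk_buff earlier (8 instead of 9), which is the kind of divergence that would make tcp_push() fire at a different point in the trace.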