Skip to content
This repository has been archived by the owner on Dec 21, 2019. It is now read-only.

TestBusTLS test failures #43

Closed
gdamore opened this issue Jan 17, 2015 · 3 comments
Closed

TestBusTLS test failures #43

gdamore opened this issue Jan 17, 2015 · 3 comments
Assignees

Comments

@gdamore
Copy link
Collaborator

gdamore commented Jan 17, 2015

We see, when running under -race, test failures from TestBusTLS. A lot of the history is in #41 but also you can see this:

=== RUN TestBusTLS
--- FAIL: TestBusTLS (0.87s)
    common_test.go:503: Address tls+tcp://127.0.0.1:43934, 5 Cases
    common_test.go:149: Jan 16 17:39:01.787: Id 0: Got dup START from 1
    common_test.go:149: Jan 16 17:39:01.875: Id 0: Got dup START from 2
    common_test.go:149: Jan 16 17:39:01.875: Id 0: Got dup START from 1
    common_test.go:149: Jan 16 17:39:01.881: Id 0: Sent all 7 messages
    common_test.go:149: Jan 16 17:39:01.883: Id 1: Got all 7 messages
    common_test.go:149: Jan 16 17:39:01.883: Id 4: Got all 7 messages
    common_test.go:149: Jan 16 17:39:01.883: Id 3: Got all 7 messages
    common_test.go:149: Jan 16 17:39:01.883: Id 2: Got all 7 messages
    common_test.go:149: Jan 16 17:39:01.883: Id 3: Sent all 7 messages
    common_test.go:154: Jan 16 17:39:01.884: Id 0: Got dup message #6 from 3
    common_test.go:154: Jan 16 17:39:01.884: Id 0: RecvHook failed
    common_test.go:149: Jan 16 17:39:01.884: Id 4: Sent all 7 messages
    common_test.go:149: Jan 16 17:39:01.884: Id 0: Testing complete
    common_test.go:149: Jan 16 17:39:01.885: Id 1: Sent all 7 messages
    common_test.go:149: Jan 16 17:39:01.885: Id 1: Testing complete
    common_test.go:149: Jan 16 17:39:01.886: Id 2: Sent all 7 messages
    common_test.go:149: Jan 16 17:39:01.887: Id 2: Testing complete
    common_test.go:149: Jan 16 17:39:01.887: Id 3: Testing complete
    common_test.go:149: Jan 16 17:39:01.887: Id 4: Testing complete
FAIL
exit status 1
FAIL    github.com/gdamore/mangos/test  2.625s

The errors seem to be limited to TestBusTLS; none of the other protocols or transport combinations seem to matter. (Note that StarTest does not use TLS.)

@tylertreat
Copy link
Contributor

I'm seeing TestBusIPC fail too...

go test -race github.com/gdamore/mangos/test
--- FAIL: TestBusIPC (3.57s)
    common_test.go:500: Address ipc:///tmp/MYTEST_IPC, 5 Cases
    common_test.go:149: Jan 16 20:05:37.220: Id 1: Sent all 7 messages
    common_test.go:149: Jan 16 20:05:37.222: Id 4: Sent all 7 messages
    common_test.go:149: Jan 16 20:05:37.223: Id 3: Sent all 7 messages
    common_test.go:149: Jan 16 20:05:37.225: Id 0: Sent all 7 messages
    common_test.go:149: Jan 16 20:05:37.226: Id 1: Got all 7 messages
    common_test.go:149: Jan 16 20:05:37.226: Id 4: Got all 7 messages
    common_test.go:149: Jan 16 20:05:37.226: Id 3: Got all 7 messages
    common_test.go:149: Jan 16 20:05:37.226: Id 2: Got all 7 messages
    common_test.go:149: Jan 16 20:05:37.229: Id 2: Sent all 7 messages
    common_test.go:154: Jan 16 20:05:40.226: Id 0: Timeout waiting for recv
    common_test.go:149: Jan 16 20:05:40.226: Id 1: Testing complete
    common_test.go:149: Jan 16 20:05:40.226: Id 2: Testing complete
    common_test.go:149: Jan 16 20:05:40.226: Id 3: Testing complete
    common_test.go:149: Jan 16 20:05:40.226: Id 4: Testing complete
    common_test.go:154: Jan 16 20:05:40.227: Id 0: RecvMsg failed: connection closed
--- FAIL: TestBusTLS (0.88s)
    common_test.go:500: Address tls+tcp://127.0.0.1:43934, 5 Cases
    common_test.go:149: Jan 16 20:05:41.034: Id 0: Got dup START from 1
    common_test.go:149: Jan 16 20:05:41.097: Id 4: Sent all 7 messages
    common_test.go:149: Jan 16 20:05:41.102: Id 2: Sent all 7 messages
    common_test.go:149: Jan 16 20:05:41.102: Id 3: Sent all 7 messages
    common_test.go:154: Jan 16 20:05:41.104: Id 0: Got dup message #6 from 4
    common_test.go:154: Jan 16 20:05:41.104: Id 0: RecvHook failed
    common_test.go:149: Jan 16 20:05:41.104: Id 0: Sent all 7 messages
    common_test.go:149: Jan 16 20:05:41.104: Id 0: Testing complete
    common_test.go:149: Jan 16 20:05:41.105: Id 3: Got all 7 messages
    common_test.go:149: Jan 16 20:05:41.105: Id 2: Got all 7 messages
    common_test.go:149: Jan 16 20:05:41.105: Id 1: Got all 7 messages
    common_test.go:149: Jan 16 20:05:41.105: Id 4: Got all 7 messages
    common_test.go:149: Jan 16 20:05:41.105: Id 1: Sent all 7 messages
    common_test.go:149: Jan 16 20:05:41.105: Id 1: Testing complete
    common_test.go:149: Jan 16 20:05:41.105: Id 2: Testing complete
    common_test.go:149: Jan 16 20:05:41.105: Id 3: Testing complete
    common_test.go:149: Jan 16 20:05:41.105: Id 4: Testing complete
FAIL
FAIL    github.com/gdamore/mangos/test  19.383s

@gdamore
Copy link
Collaborator Author

gdamore commented Jan 17, 2015

Not surprising. I think the situation with TLS is strictly related to timing -- fundamentally the test is busted, clearly.

@gdamore
Copy link
Collaborator Author

gdamore commented Jan 24, 2015

I've improved the debugging, and have confirmed that indeed, the problem is the depth of the queues -- basically the framework was occasionally dropping packets. The "dup" was a misleading message. Cranking up the queues to 128 solved the problem. As a little bonus, the throughput tests are yielding better performance as well.

Btw, IPC is faster with small packets than TCP, but once the packet size increases it tanks ... I this is because the MTU for IPC is tiny.

@gdamore gdamore self-assigned this Jan 24, 2015
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants