Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SCTP connect could return error even if the connect is ongoing #1592

Merged

Conversation

falkevik
Copy link
Contributor

@falkevik falkevik commented Oct 4, 2017

gen_sctp:connect_init/4 could return error even though the OS had
initiated the connection.
I.e. connect returned "in progress".
Then an error could come from packet_inet_output() which is checking the socket
and return error if peer_sock returned any error.

The misleading here is that from erlang you don't think the connect is still ongoing.
And you will receive a sctp_assoc_change message later on.

What problem did the code in packet_inet_output solve? What am I breaking when not calling it?
Remove the code in packet_inet_output?
Does it make sense to have INET_STATE_CONNECTING, INET_STATE_CONNECTED on a 1<->M SCTP socket?

@rickard-green rickard-green added team:MW team:PS Assigned to OTP team PS and removed team:MW labels Oct 9, 2017
@RaimoNiskanen
Copy link
Contributor

Can you describe the error case a bit more, from an API point of view?

Are you calling gen_sctp:connect_init/4, and then before getting an association you call gen_sctp:send/3,4?

The code has gotten its shape from the TCP connect code, and when you say it, suppose you have an association and then connect to (associate with) another endpoint, then while that association is in progress (you need to use gen_tcp:connect_init/4 for this) you can not send data on the existing association since the socket in inet_drv.c goes down to INET_STATE_CONNECTING until the new association is up. This seems bad.

If state INET_STATE_CONNECTED should be sensible for 1 -> N sockets then it should go INET_STATE_CONNECTING -> INET_STATE_CONNECTED for the first association, and then stay there until the last association goes down.

Or the socket should be in INET_STATE_OPEN or INET_STATE_LISTENING after birth and stay there. Maybe not INET_STATE_LISTENING...

If your change is correct, then packet_inet_drv_output() and packet_inet_output() seems obsolete and maybe should be removed, so your patch feels incomplete. How much have you tested the change?

@falkevik
Copy link
Contributor Author

falkevik commented Oct 11, 2017

I was hit by the issue when adding code to diameter_sctp for sharing port between connects.
In the code below, if connect_init returns error. Connect function tries next address if there are any.
This is the same in current diameter_sctp module.


%% connect/4

connect(_, [], _, Reasons) ->
    x({connect, lists:reverse(Reasons)});

connect(Sock, [Addr | AT] = As, Port, Reasons) ->
    case gen_sctp:connect_init(Sock, Addr, Port, []) of
        ok ->
            {As, Port, Reasons};
        {error, _} = E ->
            connect(Sock, AT, Port, [{Addr, E} | Reasons])
    end.

And the code later on treats any #sctp_assoc_change{} different from comm_up as an error and tries the next address.

%% ... or not: try the next address.
recv({_, #sctp_assoc_change{} = E},
     #transport{assoc_id = undefined,
                socket = Sock,
                mode = {connect = C, {[RA|RAs], RP, Es}}}
     = S) ->
    S#transport{mode = {C, connect(Sock, RAs, RP, [{RA,E} | Es])}};

But the connet_init may return error and the socket may also yeild cant_assoc which diameter_sctp will think is related to the latest issued connect (issued directly after the connect_init/4failed . So the next address is tried, if there is any, otherwise start from the beginning.

I was trying this by adding a transport where the remote endpoint have two addreses.
The first one is non existing address on the same subnet. This will made the connect_init return error and also the cant_assoc and the transport was never established due to this.

I have done some testing. It solves the problem I encountered, connect_init/4 will only return error if the underlying connect() are returning an error.
But I have only tested it on Linux.

Small example below.
The address 10.0.0.41 are in the same subnet but is non existing.


1> {ok, S} = gen_sctp:open(0, [{active, true}]). 
2> rr(gen_sctp).
[sctp_adaptation_event,sctp_assoc_change,sctp_assoc_value,
 sctp_assocparams,sctp_event_subscribe,sctp_initmsg,
 sctp_paddr_change,sctp_paddrinfo,sctp_paddrparams,
 sctp_pdapi_event,sctp_prim,sctp_remote_error,sctp_rtoinfo,
 sctp_send_failed,sctp_setadaptation,sctp_setpeerprim,
 sctp_shutdown_event,sctp_sndrcvinfo,sctp_status]
3> SCTP_INIT = 
{sctp_initmsg,#sctp_initmsg{num_ostreams = 10,
                            max_instreams = 10,max_attempts = 4,
                            max_init_timeo = 500}}
4> gen_sctp:connect_init(S, "10.0.0.41", 3434, [SCTP_INIT]).
{error,ehostunreach}
5> flush().                                                      
ok
6> flush().
Shell got {sctp,#Port<0.463>,
                {10,0,0,41},
                3434,
                {[],{sctp_assoc_change,cant_assoc,0,0,0,0}}}
ok

I agree that fix is incomplete, packet_inet_drv_output() and packet_inet_output() should be removed if this change is applied.

@RaimoNiskanen
Copy link
Contributor

Do you have an idea of how a late failure would look with your change? You would return ok for EINPROGRESS, right? It seems the diameter code would have to be changed since it blasts away connect against all addresses, and if later if not comm_up it starts another connect...

I am thinking about how the new working semantics should be.

@RaimoNiskanen RaimoNiskanen added the testing currently being tested, tag is used by OTP internal CI label Oct 11, 2017
@RaimoNiskanen
Copy link
Contributor

RaimoNiskanen commented Oct 11, 2017

I will throw it into our daily tests let's see what happens...

@RaimoNiskanen RaimoNiskanen removed the testing currently being tested, tag is used by OTP internal CI label Oct 11, 2017
@falkevik
Copy link
Contributor Author

Just to be clear. Below example is how it behaves without the fix.

1> {ok, S} = gen_sctp:open(0, [{active, true}]). 
2> rr(gen_sctp).
[sctp_adaptation_event,sctp_assoc_change,sctp_assoc_value,
 sctp_assocparams,sctp_event_subscribe,sctp_initmsg,
 sctp_paddr_change,sctp_paddrinfo,sctp_paddrparams,
 sctp_pdapi_event,sctp_prim,sctp_remote_error,sctp_rtoinfo,
 sctp_send_failed,sctp_setadaptation,sctp_setpeerprim,
 sctp_shutdown_event,sctp_sndrcvinfo,sctp_status]
3> SCTP_INIT = 
{sctp_initmsg,#sctp_initmsg{num_ostreams = 10,
                            max_instreams = 10,max_attempts = 4,
                            max_init_timeo = 500}}
4> gen_sctp:connect_init(S, "10.0.0.41", 3434, [SCTP_INIT]).
{error,ehostunreach}
5> flush().                                                      
ok
6> flush().
Shell got {sctp,#Port<0.463>,
                {10,0,0,41},
                3434,
                {[],{sctp_assoc_change,cant_assoc,0,0,0,0}}}
ok

With the fix, it would return ok and later on the socket you can read the cant_assoc if it could not connect.
And depending on the sctp_initmsg you give, it could be trying for a while I guess.

@falkevik
Copy link
Contributor Author

falkevik commented Oct 11, 2017

And with the fix, I don't think diameter_sctp needs to be changed.
Since it will try to connect to the first address and only return error on connect_init/4 if it couldn't initiate a connection. Otherwise it will sit and wait for a assoc_change, either comm_up all good, or something indicating that the next address should be tried.

@RaimoNiskanen RaimoNiskanen added the testing currently being tested, tag is used by OTP internal CI label Oct 12, 2017
@RaimoNiskanen
Copy link
Contributor

Well, man connect(2) says that if you get EINPROGRESS you should poll(2) or select(2) the socket for writing to wait for completion, and that is exactly what the code does.

It might be so that Linux return double error codes by both returning EHOSTUNREACH when the socket gets writable, and on top of that sends an association change message. The question is if other platforms also does this?

I have seen a wide spread in semantics between SCTP implementations and have found no good resource that clarifies how it should work. Maybe this is just another platform dependent quirk you will have to (the diameter code will have to) be prepared for.

We'll see if something breaks in our daily tests...

@falkevik
Copy link
Contributor Author

I agree that doing poll(2) or select(2) to know when connected is done makes sens when having a SOCK_STREAM socket.
But in the case of SOCK_SEQPACKET then poll(2) or select(2) would just return the socket as ready for write at all times independent on any ongoing connect. Unless the buffer if full I guess.

And since the socket is ready for writing at any time, geting the SO_ERROR directly after the connect could be confusing.
The connect are trying multiple times depending on sctp_initmsg so connect_init/4 could return an error while the association could come up later.

Lets see what your daily tests will find.

@RaimoNiskanen
Copy link
Contributor

@falkevik wrote:

But in the case of SOCK_SEQPACKET then poll(2) or select(2) would just return the socket as ready for write at all times independent on any ongoing connect. Unless the buffer if full I guess.

I would really like to get another source for that fact/guess.

@RaimoNiskanen
Copy link
Contributor

Your PR has been running for a week now in our daily tests, and I see no negative effects.

I have read RFC 6458: Sockets API Extencion for SCTP, to try to find anything about non-blocking connect. The closest I can find are sections 3.1.6 and 3.2, but none of them talk about non-blocking connect.

However, they talk about creating associations through sendmsg(), and non-blocking sendmsg() should return immediately, and that using connect() on a one-to-many style socket is allowed but between the lines seems to be sendmsg() without data.

Those sections also recommend caution when using poll() or select() especially for write being implementation dependent, and probably can only indicate whether any association is writable. Between the lines then; can not indicate if connect() is ready...

So it seems you can go ahead and remove the redundant states and callback functions in inet_drv.c for SCTP sockets. :-)

@IngelaAndin IngelaAndin added the waiting waiting for changes/input from author label Oct 19, 2017
@falkevik
Copy link
Contributor Author

Sorry for my late reply.
I tried to find some time to find some good source as well.
The only thing I had time to was trying open a SOCK_SEQPACKET sctp socket.
Set it as non blocking and add to write fd_set and do select.
Select returns socket ready for writing directly even before any connect.
Tried it on Linux and on FreeBSD with same result.

But that is good news that you have not found any issues so far.
I will remove the redundant states and callback functions in inet_drv.c

As a side note, I have added some experimental support for sctp on MacOS.
Hooking in usrsctp into inet_drv.c so that gen_sctp could be used as normal from erlang.
Is that something that could be of interest of being included as experimental?

@RaimoNiskanen
Copy link
Contributor

Well, that a socket is ready for write is no surprise. That it is so before connect might be. The interesting question is whether the ready for write status means anything after getting EINPROGRESS after connect() for in particular SOCK_SEQPACKET. But I guess we can just guess and will never know...

I think my indirect conclusion via non-blocking sendmsg() and its similarity with connect() for SCTP will have to be good enough.

Regarding experimental support for SCTP on MacOS: sure, we are interested. Make a separate pull request and we can run it in our daily tests where we have a few Macs with different OS versions.

@falkevik falkevik force-pushed the sctp_return_only_connect_errors branch from 379340a to 0bcc564 Compare October 25, 2017 13:16
gen_sctp:connect_init/4 could return error even though the connect are in progress.
I.e. connect on socket returned "in progress".
The error can come from packet_inet_output() which is checking the socket for errors.

The issue here is that from erlang you don't think the connect is still ongoing.
But you will receive a sctp_assoc_change message later on.
@falkevik falkevik force-pushed the sctp_return_only_connect_errors branch from 0bcc564 to 4eb3df5 Compare October 25, 2017 14:10
@falkevik
Copy link
Contributor Author

I have removed the unused callback packet_inet_drv_output() and packet_inet_output().
But I'm not sure if I changed the states as you want.
Connect will not change any state now. Is that ok?
Do you want me to change at some other place as well?

@RaimoNiskanen
Copy link
Contributor

I think this will be just fine. An SCTP socket will only use INET_STATE_{CLOSED,OPEN,LISTENING} where LISTENING may be today incorrect and ignored. E.g SCTP_REQ_BINDX i think should desc->state |= INET_F_OPEN since now it clears LISTENING on BINDX, which I do not think matters...

@IngelaAndin IngelaAndin removed the waiting waiting for changes/input from author label Oct 31, 2017
@RaimoNiskanen
Copy link
Contributor

Again, I see no negative effects in our daily tests. I will write release notes and merge it shortly...

@falkevik
Copy link
Contributor Author

falkevik commented Nov 1, 2017

sounds good, thanks.

@RaimoNiskanen RaimoNiskanen removed the testing currently being tested, tag is used by OTP internal CI label Nov 1, 2017
@RaimoNiskanen RaimoNiskanen merged commit cf03474 into erlang:master Nov 1, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
team:PS Assigned to OTP team PS
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants