Simon Josefsson jas@extundo.com writes:
I'm sorry for the length of this, if we should keep this off the list, please respond in private.
I prefer having development discussions on the public list, after all, I'm not the only one that's hacking lsh. There's also a (Swedish) conference "lsh (-) utvecklingsforum" in the KOM system at lyskom.lysator.liu.se, if you prefer that. It predates the mailinglist, and at least I and Pontus read it regularly.
It was rather straightforward, and I now have interop between lsh in server mode against OpenSSH (with the GSSAPI patches) in client mode. But. Once the authentication finished, lsh more or less crashes. I'm fairly certain this is because I don't understand the OOP stuff, and only have a few hours of experience with lsh in general.
I think I understand the problem. First, let me explain the connection_lock/connection_unlock thing. The point is to make sure that that during the processing of one userauth message, further messages that are received from the client are not processed, just queued (in lshd or in kernel socket buffers) for processing when finished with the current userauth message. Userauth messages need to be serialized in the server, while the client is allowed to send any number of userauth messages without waiting for replies.
This doesn't matter much for userauth methods which results in an answer right away, without returning to the main select loop. The main reason for it is password authentication with a helper program, as the userauth handler will then spawn a process, set an exit callback on the process, and return to the main select loop. Later the exit callback is invoked, it will pass a value or an exception back to the userauth code which then unlocks the connection.
The connection is locked in server_userauth:do_handle_userauth, and unlocked in the continuation handler do_userauth_continuation (invoked when userauth succeeds) and in do_exc_userauth_handler, for exceptions of type EXC_USERAUTH or EXC_USERAUTH_SPECIAL.
For all earlier userauth methods, the client sends a SSH_MSG_USERAUTH, and then the server replies with a success or failure or some special message like PUBLICKEY_OK. GSS-API seems a little different, due to the SSH_MSG_USERAUTH_GSSAPI_TOKEN and SSH_MSG_USERAUTH_GSSAPI_EXCHANGE_COMPLETE which are sent by the client. The general server userauth code don't see these, so the connection isn't locked automatically, but the general userauth code still tries to unlock it when you invoke the continuation or exception handler.
I think the simplest way to solve the problem is to add calls to connection_lock at the start of the packet handlers you install.
That may not get things exactly right, one also needs to consider a client that doesn't send SSH_MSG_USERAUTH_GSSAPI_EXCHANGE_COMPLETE as expected, but instead sends a new SSH_MSG_USERAUTH. Without having read the GSS-API spec, I guess that such client behaviour should either cancel the GSS-API exchange which is in progress, or raise a protocol error. Changes to the general server_userauth code may be needed to get that right, as there currently is no notion of a "subprotocol" being in progress, nor for cancelling such a subprotocol.
Finally, some minor comments on your code:
printf("GSS-API error %s: %s\n", m, (char *) msg.value);
The recommended way to print messages to the log is to use werror(), verbose() or debug(). So that printf could be replaced with
debug("GSS-API error %z: %z\n", m, (char *) msg.value);
struct lsh_user *user; struct command_continuation *cont;
Global variables must be avoided, they will confuse the gc unless special care is taken. But you have probably figured that out already, as you don't seem to use these globals anywhere.
/* GABA: (class (name gssapi_finish_handler) (super packet_handler) (vars (gssapi object gssapi_server_instance))) */
When possible, i.e. when there are no circular dependencies, I try to order classes so that if gssapi_finish_handler uses gssapi_server_instance, the latter class is defined first.
static void do_handle_gssapi_finish(struct packet_handler *s, struct ssh_connection *connection, struct lsh_string *packet) { CAST(gssapi_finish_handler, self, s); puts("ok"); /* Remember that a user was authenticated. */ self->gssapi->user = USER_LOOKUP(self->gssapi->db, self->gssapi->username, 1); connection->user = self->gssapi->user;
Any particular reason why you need to store the user in the gssapi object? Do you use the object after the userauthentication is finished?
You should probably check for USER_LOOKUP returning NULL (happens if the user doesn't exist or login is disabled).
static void do_handle_gssapi_token(struct packet_handler *s, struct ssh_connection *connection, struct lsh_string *packet) { CAST(gssapi_token_handler, self, s); OM_uint32 maj_stat; OM_uint32 min_stat; OM_uint32 retflags; gss_ctx_id_t ctx = GSS_C_NO_CONTEXT; gss_cred_id_t cred = GSS_C_NO_CREDENTIAL; gss_buffer_desc inbuf, outbuf; gss_name_t client;
This is where an extra connection_lock call would go.
maj_stat = gss_acquire_cred (&min_stat, GSS_C_NO_NAME, 0, GSS_C_NULL_OID_SET, GSS_C_ACCEPT, &cred, NULL, NULL);
Can gss_acquire_cred (or other gss-api functions) block, for example for contacting a kerberos server? Then the entire lshd server will block too. If that is a problem, we need either a non-blocking "native" interface to gss-api, or put the gss-api code in a separate process.
static void do_authenticate(struct userauth *s, struct ssh_connection *connection UNUSED,
^^^^^^^^^^^^^^^^^ ...
connection->dispatch[SSH_MSG_USERAUTH_GSSAPI_TOKEN] =
This seems inconsistent...
Thanks for the nice work! Regards, /Niels