Ory Kratos returns [HTTP 400 for valid requests in...
# feedback
r
Ory Kratos returns [HTTP 400 for valid requests instead of HTTP 200](https://github.com/ory/kratos/issues/4052). Many good arguments have been made to show why this is bad, but the CTO doesn't accept any of them. Instead, he uses weak and even self-defeating arguments (which he then ignores again) to argue against fixing it. This is very frustrating. What can be done? Can anyone else weigh in here?
a
Well that seems like a poor workflow design. Use the login api to request a login token but not actually login, and then throw a 400 because it’s not logging you in…
r
I don't think that's too bad but yes, this can also be questioned 🙈
r
Thanks for the feedback. I'm not sure what you expect from this discussion, though. Returning 400 from this endpoint may not be the best or most elegant design, but it does have an internal logic laid out in the GitHub discussion. Changing the response code is backwards incompatible, and mostly an aesthetic change since the current design also works.
a
When you consider it in isolation maybe. When you take it into account with a front-end, you have to start writing ugly code to deal with a 400 response for a successful request. Now I need “if 200, show info box and login. if 400 and message ~= code sent, show info box, if 400 and other message, show error box’. Requiring parsing a user text response for logic consideration is a bad design.
why not have a separate API endpoint for requesting the code?
r
I expect this being accepted as something that is wrong and causes problems for many systems and developers, and therefore is worth fixing. It's not just aestetics. Backwards compatibility should not be an argument to keep things in a broken state, and as I explained in the Issue, there is an easy way to deal with it (common industry practice).
r
Right, I get it. Please consider Ory has a backlog several miles long, and there are only so many hours in the day and so many developers working on Kratos. Changing the response code is difficult. It works as-is, even if suboptimal. People hate breaking changes more than imperfect design.
r
I understand that. Nobody expects this to be fixed right now, but the issue isn't even recognized as such. If it was, I or someone else might even submit a PR for it. But nobody is going to waste their time knowing that the CTO will fight this change.
r
Our CTO is a very reasonable person. However, it is not obvious to me what a solution to this issue would actually look like.
r
I hope so. However, he doesn't seem reasonable in this seemingly very obvious case. I talked to a handful of developers about this, every single one frowned, some asked why I don't just use another product. Which shows how this can hurt Orys reputation and drive people away from it. The solution seems simple: Add a config option that enables the correct behavior, set it to false by default, document the issue, eventually change the default to true and document it as breaking change. Has been done many times by hundreds of products. I'd be happy to give it a try and submit a PR, but not if the CTO says he won't accept a solution.
btw this issue was brought to my attention by a contractor I hired. He wasted quite some time because he thought he was doing something wrong
m
Just my 2c as community guy: Its weird because I have worked at Ory for 5+ years speaking to thousands and this is the very first time someone feels so strongly about this. Maybe I am missing something and this was frowned upon all this time. I figured its a design choice and there are good argument on both sides.
r
That's interesting, in my 20 years in IT I've never come across any developer or Software who thought returning "client error" is an acceptable response when the client made no mistake 😄 Maybe this never came up because "login with code" is quite new and no other flow has this issue? I would love to see how any client implemented this flow without a comment "WTF" on top of it, no offense :)
b
Both the issues raised were closed because HTTP 400 is OK for a valid and successful request! When we tried this feature, we spent more time than we should have, asking ourselves if we went wrong with our implementation somewhere because the status code says it failed.
It needs more than a WTF comment on top of the implementation. We need to make exceptions in our testing / monitoring / metrics and alerting.
Appreciate all the work done by the contributors and it is a great project - acknowledging that there is an obvious issue here and keeping it open for fixing in future would have been much better.
l
backwards compatibility is paramount for large scale deployments... I don't care what the HTTP status code is as long as I don't have to plan significant downtime to make upgrades to a new major version
r
> The solution seems simple: Add a config option that enables the correct behavior, set it to false by default, document the issue, eventually change the default to true and document it as breaking change. Has been done many times by hundreds of products. What would a better status code be? 200 is out. 202 or 204 are probably also out, because some people check for 200<=x<=299. Sure, we could do another config option. That's a huge amount of work and complexity, especially for a SAAS product, though.
You have to consider that we have to support both methods a virtually unlimited amount of time.
What about other (future) flows where you need to interact with kratos several times to login? Preferably, they all need to work the same way. But coming up with designs for future problems is tough and mostly futile.
I'm not saying this API will never change. But its always more complicated than it first seems.
r
Why is 200 out? Because 200 == "Login succeeded" in your views? That's not what 200 means. 200 is totally fine and would solve the problem without introducing any other one. Prove me wrong
everyone knows exactly which request is 200'ed and whether this request was a login attempt or not.
as if anyone sends "please send me a login code" and when the Server replies "OK, this worked" they would think "oh wow, I'm logged in now"
r
No, I'm not going to "prove you wrong". You've made your case. So have we.
r
Unless your case is "we're not going to change it but we don't have any good reason for it", I'm not sure what your case is
This is simple HTTP, I cannot believe we're even having this discussion whether it's ok to send 400 as a response to a valid request and that sending 200 is problematic in case one needs to send multiple requests. How should this ever be an issue? Everyone knows the request they sent.
a
personal opinion: separate endpoint for requesting the code
you don’t break the existing flow but new users can use an endpoint that properly returns a 200 and then the you don’t have to write hacks into http interceptors on the client side
a
Here comes a short request: Add API version into the url to enforce backwards compatibility without constraining yourself too much. It shouldn't cause much code duplication, as most calls are identical between versions. example from cloudflare: https://api.cloudflare.com/client/v4/zones/{zone_id}/dns_records