How Shopify implemented its secure authentication service (shopify.com)
174 points by frankpf on Jan 15, 2020 | 51 comments



So is Shopify the biggest fish still on the Ruby stack? Nice article detailing how they upgraded to OpenID Connect to allow SSO across multiple shops within a client company.


Remember that the hyped-up companies you hear about on HN & other social media aren't the entire world. There are plenty of companies out there that stay quiet and outside of the spotlight and use the language just fine. The same applies for PHP and other languages that are considered (unfairly IMO) "old-school".


PHP is rarely considered "old school." It's considered bad. And not without reason given its history of hostility to its own developers and the sysadmins who have to manage it.

I think pretty much everyone has acknowledged that it's improved. Where opinions differ is in how much it has improved and whether that's enough to entertain its use (my answers to which are "not enough" and "not even if you paid me", respectively).


I agree that PHP is bad in certain ways, even though I started with PHP before transitioning to other languages.

But honestly, every language has to make some trade-offs. Even if PHP has some things that are bad design choices as opposed to trade-offs, it can still be worthwhile to put up with them if you have an existing codebase written in it or want to take advantage of libraries that don't exist in other languages.

As a result I don't consider any language as bad or old school. They might have downsides but 1) the upsides might outweigh them in certain use-cases and 2) any competent developer should be able to make something good with any Turing-complete language so I don't judge by the language alone, and especially not by the "hype factor" of the language.


What is it like to use PHP compared to Node.js or Python? Some of us haven't had the pleasure.


It's... ugly? I started with PHP, like many. And the documentation was excellent, as were deployment options. But even though familiarity tends to breed sympathy, it took me only five minutes of seeing Ruby to question why I ever put up with PHP. Just the need to prefix $variables alone is off-putting, and so were the inconsistencies in style. There were camelCase() functions, but also under_score(). Sometimes it was search_all(), then find_all(), next all_matching(). Some counting methods might actually return 0, others nil (for zero matches). And I seem to remember one even returned the string "0".

(not actual examples–I have luckily forgotten)


For me (PHP 5, older now, I know) it was things like create_function() taking a string. It was (is? dunno) an ugly and crude language, no subtleties in the syntax, just warts. I'm firmly of the belief you can make good things with any language, but I don't feel any desire to make things with PHP when there are other good options.


Imagine JavaScript, but more like Java, but not very strictly typed, half the methods in the standard library follow different patterns for method signatures, and no event loop or JIT.

That being said, I have yet to see a Node.js framework beat developer productivity in Symfony or Laravel for CRUD stuff. Django maybe.


Raw PHP without a framework is awful. With a framework it’s decent.


"PHP is rarely considered "old school." It's considered bad"

PHP hasn't been considered 'bad' for a few years now. It has been battle tested on many large-scale websites.

Now, one of the main issues is that anyone can write a few lines of code and call themselves a developer, so you still have horrible code bases out there... but this has more to do with the developer than the language.

"my answers to which are "not enough" and "not even if you paid me", respectively

I love hearing answers like this. This is why I'm still paid so well to write PHP code after 15 years in the industry.


I avoided PHP and its ecosystem like the plague for the longest time. However, the whole ecosystem has evolved, including the language, e.g. functional features (closures) and syntactic improvements (arrow functions). When you combine this with a modern web framework like Laravel, it is an absolute pleasure to work with, specifically as a monolith for rapid application development, which is great for MVPs and for checking that you have product-market fit. I am not sure how this translates over time to microservices. However, if you're starting to write a new application, I would definitely recommend PHP + Laravel for rapid application development. I say this relative to my extensive experience with Node.js Express and Java Spring Boot.


IIRC there are still many large companies using Ruby/Rails; they've just also diversified their tech stacks (as larger companies tend to do). AFAIK the list includes: GitHub (MS has a few Rails-based acquisitions now), Airbnb, Groupon, Square, Cookpad, Kickstarter, Hulu, etc.


Pretty sure Stripe is a Ruby shop as well.


One of the biggest. They recently released Sorbet, a gradual type checker for Ruby.


I sometimes forget how vast and diverse the whole industry is, with so many tech stacks and ecosystems.


JP Morgan Chase and McKesson both use Ruby, with McKesson leading the pack in RoR LOC among the Fortune 50.


IIRC it's ruby, but not rails


Gitlab too it seems.


Insiders at Airbnb told me they are (almost) completely off Rails now. They moved their website to the JVM.


If all you want is the JVM, JRuby and its Truffle variant are viable options.


Seems at some scale this usually happens.


I believe Airbnb and Groupon are off Rails (and Ruby).

So the big ones for Rails are Shopify, GitHub, Cookpad, and GitLab. For Ruby (not Rails), that would be Stripe.


Funny you mention this. I just today had to implement a painful workaround for Shopify's insanely short timeout on product image uploads. On submitting an image url, you apparently get 4s to complete the whole transfer.

I found hundreds of people complaining about this in the community forums, going back years. If you're dynamically generating images, or on a congested network, 4s is far too short.

Since this is a simple config property, the only justification I can imagine is that they are trying to restrict the amount of time that their (single-threaded, memory-hungry) instances are occupied. Because of Ruby's poor resource management, a core part of their API is barely usable.

I'm pretty disappointed.


I do a lot of development with Shopify and it's a mess. One of the worst development experiences of all time. Because they're so monolith-focused, the APIs you can use are pretty sluggish and poorly documented.

Rails is just not meant for heavy transactional load. And e-commerce needs async to handle what can be a huge load.

Taobao is Java or PHP, and they handle far greater load without fault.

Shopify is much better than Magento though.


> One of the worst development experiences of all time... The API’s you can use are pretty sluggish and poorly documented.

Can you elaborate? Shopify released their GraphQL Admin API in 2018, which has built-in documentation, as well as an interactive IDE (GraphiQL) that you can run on your shop. As long as you're not doing anything too crazy that exceeds the throttle (e.g., syncing 1000s of products), they're pretty good about listening to developer requests.

https://help.shopify.com/en/api/graphql-admin-api

> Rails is just not meant for heavy transactional load.

Shopify has invested heavily in their sharding setup so that it can handle high load and scale quickly, e.g., flash sales where baseline traffic will 3x in a matter of seconds. See the discussion of pods:

https://engineering.shopify.com/blogs/engineering/e-commerce...

> And e-commerce needs async to handle what can be a huge load.

Long running processes are async. However, having commerce modeled by a transactional database that provides atomicity is a boon for simplicity. You don't want to deal with eventual consistency when updating inventory or placing orders.

Note: Am ex-Shopify. Ran the API team.
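
To make the atomicity point above concrete, here is a minimal sketch using Python's built-in `sqlite3`. It is not Shopify's code: the `inventory` and `orders` tables and the function names are hypothetical, chosen only to show an order row and a stock decrement committing or rolling back together.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with hypothetical inventory/orders tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inventory (product_id INTEGER PRIMARY KEY, stock INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, product_id INTEGER NOT NULL, quantity INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO inventory (product_id, stock) VALUES (1, 3)")
    conn.commit()
    return conn


def place_order(conn, product_id, quantity):
    """Record an order and decrement stock in one transaction.

    Returns True if both writes committed, False if both were rolled back.
    """
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO orders (product_id, quantity) VALUES (?, ?)",
                (product_id, quantity),
            )
            cur = conn.execute(
                "UPDATE inventory SET stock = stock - ? "
                "WHERE product_id = ? AND stock >= ?",
                (quantity, product_id, quantity),
            )
            if cur.rowcount != 1:
                # Not enough stock: raising here also undoes the order insert.
                raise ValueError("insufficient stock")
        return True
    except ValueError:
        return False
```

With three units in stock, a first order for two succeeds and a second order for two fails without leaving a stray row in `orders`, so there is no intermediate state to reconcile later.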


I'm the GP, not the parent, but I'll give you some of my complaints (other than the already mentioned infuriating timeout).

* The Metafields feel both over-engineered and difficult to use. Stripe did a great job - metadata is a set of string key/value pairs that are always fetched when you fetch a thing. Shopify's metafields are typed and need to be fetched explicitly from the REST API. You can't get a list of Products and their associated metafields; you have to fetch each Product's metafields one at a time. It's a critical facility and yet works very poorly.

* The API for metafields is inconsistent. Want to get metafields for a variant? /admin/products/#{id}/variants/#{id}/metafields.json Want to get metafields for a product image? /admin/metafields.json?metafield[owner_id]=#{id}&metafield[owner_resource]=product_image WAT?

* Variants (and images) have unique ids, but you always have to reference them as /products/123/variants/456. Variants should be /variants/456.

* Suffixing all the requests with .json is annoying and not RESTy. We have content negotiation headers for that.

* Yes, I know you can work around some of this with GraphQL. Does Shopify intend to deprecate the REST API? I am not personally a big fan of GraphQL, but if the REST API is getting dustbinned, I'd like to know about it.
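
To show the inconsistency in the two metafield paths above side by side, here is a small sketch that only builds the path strings. The paths are copied from this comment and not checked against current Shopify documentation; the function names are made up for illustration.

```python
def variant_metafields_path(product_id: int, variant_id: int) -> str:
    """Nested style: the variant is addressed through its parent product."""
    return f"/admin/products/{product_id}/variants/{variant_id}/metafields.json"


def product_image_metafields_path(image_id: int) -> str:
    """Query style: the owner is passed as filter parameters instead."""
    return (
        "/admin/metafields.json"
        f"?metafield[owner_id]={image_id}"
        "&metafield[owner_resource]=product_image"
    )
```

A client wrapping both has to carry two different addressing schemes for what is conceptually the same operation: list the metafields of a resource.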

Mind you, this is just what I noticed from a few days of working with the API.

PS IMO, 3x isn't much of a flash! 300x, now we're talking :)


>Shopify has invested heavily in their sharding setup so that it can be handle high load and scale quickly. e.g., flash sales where baseline traffic will 3x in a matter of seconds. See the discussion of pods:

Your parent is using Taobao (which is like the Amazon of China) as an example, which means the word "scale" means a completely different thing. I am willing to bet Taobao's RPS on Singles' Day is a lot higher than the highest RPM during Shopify's biggest flash sales.

That said, there is no reason to believe that Shopify's pods can't handle that scale.


Kudos to you for the work you and your team did.

I've had a love/hate relationship with the Shopify platform but when it comes to the API it was all love.


They're still using a Rails setup for their core services. But this core falls over with high RPM. It's failed in nearly all flash sales it has had to handle.

Rails is great. But commerce is heavy and I don’t believe Shopify can keep its existing core code base around much longer without a significant change to aid with performance.


They're moving slow parts to Go.


Hopefully Crystal will make its way into these Ruby-heavy shops. They'll get 100x performance without needing to really think in a whole different programming experience.


I would love to see this happen, but I don't think it's realistic anytime soon.

Crystal just doesn't have the community that ruby has, and as teams reconsider certain aspects of their applications, languages like Elixir and Go make way more sense.

Creating a compiled language with similar syntax to Ruby is great, but there's so much more involved than just that if you're talking about building a serious, commercial product. And unfortunately, I just don't see Crystal getting that kind of traction in the short term.


Kotlin is a good transition language from Ruby. Concise with a nice balance between FP and OOP and all the scaling benefits of the JVM.


What about Elixir, seems to be a good option as well


Not if you want to retain the OO features of Ruby. Beyond the surface syntax similarities Elixir is at the opposite end of the spectrum compared with Ruby.


> without needing to really think in a whole different programming experience

But Crystal has entirely different semantics to Ruby. They look vaguely similar at a superficial level, but the semantics are not even remotely similar.


Maybe but depending on what you are doing they are minor considerations.

I am porting a relatively simple ruby app to Crystal to see how it is and most of it is copy/paste and then some fixing and adding type definitions where needed. The only issues I have had are where I am using a rubygem that doesn't have a crystal counterpart.

For example a pretty simple Gem that connects to a socket and parses incoming data was exceedingly easy to port, including tests.

* https://gitlab.com/overlord-bot/tacview-ruby-client

* https://gitlab.com/overlord-bot/tacview-crystal-client

For a more specific example these two files are almost identical:

* https://gitlab.com/overlord-bot/tacview-ruby-client/blob/mas...

* https://gitlab.com/overlord-bot/tacview-crystal-client/blob/...

The only thing not done was CRC hash calculating for a password because there was no crystal shard for it and it was low priority so I didn't write one.


If it's so simple to port Ruby code to Crystal, then how come nobody has gotten around to porting Rails to Crystal yet?


Please re-read what I said. I never claimed it was easy to port in every situation.

I merely pointed out that "depending on what you are doing they (semantic differences) are minor considerations" and then gave an example where that was the case with a simple application I was working on.

If you want a more complicated library that was ported from Ruby to Crystal then you can also look at Sidekiq ( https://www.mikeperham.com/2016/05/25/sidekiq-for-crystal/ )


But when this doesn't apply to even basic parts of the Ruby ecosystem, like Active Support, these 'minor differences in some cases' are fundamental showstoppers for a company like Shopify, which was the context.

The original comment said 'without needing to really think in a whole different programming experience'. Ruby without #send and #eval is very much a whole different programming experience for a company like Shopify.


It's not mentioned, but I'm assuming that they built their own OIDC/OAuth backend rather than using an existing one (e.g. Okta, Auth0, etc.).

It would be interesting to know the details of how they’re doing authorization. It appears that it’s all or nothing but I might be mistaken.


Running an OAuth2 server isn't tremendously involved. There are good open-source projects like https://github.com/ory/hydra that are pretty easy to configure.


Oh god, at megacorp we implemented our own OAuth2 stack. Much sadness ensued.


Been there, done that - wish it upon no one.

If anyone ever brings up the idea of building out OAuth or even vaguely user management, I try to point them to at least try a POC (Proof of Concept) with https://www.keycloak.org/ (Apache 2.0 License) or https://www.gluu.org/ (MIT License) before they consider building.


Another solution is OpenLDAP (or JumpCloud) at the root and then supporting software:

  OpenLDAP
   ├── PrivacyIDEA (TOTP/MFA with LDAP auth backend)
   │    └── SAML IdP (e.g. SimpleSAMLphp or Shibboleth) for SSO: AWS, Google, GitHub, Atlassian, Snowflake, Azure etc.
   ├── Dex (https://github.com/dexidp/dex) for anything that wants OAuth flow
   ├── Native LDAP for apps that support it (e.g. Metabase, Grafana)
   └── Any other custom authT that supports LDAP as a backend

OpenLDAP itself isn't for the faint-hearted, but I've had a lot of success with JumpCloud (Okta also has an LDAP directory service, though the starting price is high).


I don't think anyone building a modern identity solution should base it on OpenLDAP. LDAP is amazing as an identity provider in a data center, but does not offer support for modern authentication methods like OAuth and OIDC. As such, it's not a very good base for creating your organization's identity.

I’m happy to be proven wrong about this. I love open standards and protocols.


> LDAP is amazing as an identity provider in a data center, but does not offer support for modern authentication methods like oath and oidc.

I don't think lack of support for OAuth is a problem here. OAuth is specifically designed to obtain access to an HTTP service[1], and OpenID Connect is specifically designed for OAuth. LDAP is not an HTTP service.

[1]: https://tools.ietf.org/html/rfc6749
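
To illustrate why OAuth and OpenID Connect are tied to HTTP, here is a rough sketch of the first step of an authorization code flow: the client builds a URL and sends the browser to it. The parameter names come from RFC 6749 and OpenID Connect Core; the endpoint, client ID, redirect URI, and function names used with it are placeholders, not any real provider's values.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit


def build_authorization_url(authorization_endpoint, client_id, redirect_uri):
    """Return (url, state, nonce) for an OIDC authorization code request."""
    state = secrets.token_urlsafe(16)  # echoed back on the redirect (CSRF check)
    nonce = secrets.token_urlsafe(16)  # later expected inside the ID token
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": "openid email",  # the "openid" scope marks this as an OIDC request
        "state": state,
        "nonce": nonce,
    }
    return f"{authorization_endpoint}?{urlencode(params)}", state, nonce


def state_matches(callback_url, expected_state):
    """Check that the state returned on the redirect is the one we sent."""
    returned = parse_qs(urlsplit(callback_url).query).get("state", [""])[0]
    return secrets.compare_digest(returned, expected_state)
```

Everything here is query strings and redirects, which is why the protocol maps onto HTTP services and not onto an LDAP bind.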


I think you've misunderstood my comment. LDAP gives you an extremely well supported back end from which to easily extend to virtually any form of authZ, including oauth.


Hey, I worked on this project for ~2 years, though I'm no longer with Shopify.

We started with Doorkeeper and gradually switched to building our own OAuth2/OIDC implementation over time, partially using glued together lower-level libraries like https://github.com/nov/openid_connect

Edit: I forgot, I even have a few small commits to that last project from my time at Shopify: https://github.com/nov/openid_connect/commits?author=meagar


So, did you have issues with doorkeeper that forced you to switch? Or was it just not fit for the problem you were trying to solve?

I've used it a bit in the past and it worked fine, but I didn't really push it.


Well, they designed a one-to-one relationship between users and shops, and then solved the problems that approach caused. Nothing special in the article.



