Toxiproxy is a framework for simulating network conditions (github.com/shopify)
213 points by taf2 on Nov 2, 2021 | 30 comments



I wrote the initial version of Toxiproxy back in 2014, but Jacob Wirth took it way beyond that during his internship at Shopify. It came out of a need for writing integration tests for resiliency work we did at Shopify back then. [1] We didn't want someone to suddenly re-introduce a hard dependency on e.g. Redis on Shopify's storefronts. The initial prototype was a shell script that used lsof(1) and gdb(1) to close the file descriptors of the various connections. But, besides being dodgy, we also needed to simulate latency and make sure it worked on everyone's macOS laptops (otherwise e.g. tc(1) would have been intriguing). I wrote a little bit more of the history of Toxiproxy on Twitter. [2] It's stable and has proxied everything in dev and CI at Shopify for over half a decade.

[1]: https://shopify.engineering/building-and-testing-resilient-r...

[2]: https://twitter.com/Sirupsen/status/1455622640727728137


What a useful tool for resilience engineering.

https://github.com/dastergon/awesome-chaos-engineering#notab... does list toxiproxy.

Any general pointers for handling network connectivity issues (from any OSI layer) in client and server apps?

Many apps lack 'pending in outbox' functionality that we expect from e.g. email clients.

- [ ] Who could develop a set of reference toxiproxy 'test case mutators' (?) for simulating typical #DisasterRelief connectivity issues?

(In Python, Pytest + Hypothesis + Toxiproxy-python would be useful.)
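
One way to picture such "test case mutators" is as a small table of named toxic profiles that a test suite loops over. The sketch below is only that, a sketch: it assumes Toxiproxy's HTTP API on its default port 8474 and the toxic types and attribute names from the Toxiproxy README (latency, bandwidth, timeout). The profile names, the numbers, and the helper functions `toxic_payloads` and `apply_profile` are made up for illustration and are not measured disaster-relief conditions or part of any Toxiproxy client.

```python
import json
import urllib.request

# Hypothetical profiles: names and numbers are illustrative assumptions.
# Toxic types and attribute names follow the Toxiproxy README
# (latency/jitter in ms, rate in KB/s, timeout in ms).
PROFILES = {
    "satellite_link": [
        {"type": "latency", "attributes": {"latency": 600, "jitter": 100}},
    ],
    "congested_cell": [
        {"type": "latency", "attributes": {"latency": 300, "jitter": 200}},
        {"type": "bandwidth", "attributes": {"rate": 16}},
    ],
    "dead_uplink": [
        # timeout=0 keeps the connection open while no data gets through.
        {"type": "timeout", "attributes": {"timeout": 0}},
    ],
}


def toxic_payloads(profile, stream="downstream"):
    """Return the JSON bodies to POST for one named profile."""
    return [
        {
            "name": f"{profile}_{i}_{toxic['type']}",
            "type": toxic["type"],
            "stream": stream,
            "toxicity": 1.0,
            "attributes": dict(toxic["attributes"]),
        }
        for i, toxic in enumerate(PROFILES[profile])
    ]


def apply_profile(proxy, profile, api="http://localhost:8474"):
    """POST each toxic to a running Toxiproxy server (needs the server up)."""
    for body in toxic_payloads(profile):
        req = urllib.request.Request(
            f"{api}/proxies/{proxy}/toxics",
            data=json.dumps(body).encode(),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        urllib.request.urlopen(req).close()
```

In a Pytest suite, the keys of `PROFILES` could serve as the parameter list for a fixture that calls `apply_profile` before each test and removes the toxics afterwards.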


Anyone know of anything like this but at the TCP level? I would love to have a way of simulating network partitions and different message delays for distributed algorithms implemented in Elixir. In an ideal world I'd be able to hook the Elixir send/receive primitives to intercept messages between processes even in a single node.


May or may not be what you're looking for, but you can do lots of this in Linux by using tc.

https://man7.org/linux/man-pages/man8/tc.8.html

Requires care to use and understand, so may not be applicable, but allows introducing latency, jitter, packet loss, etc.


Precisely tc and netem. To give an idea how powerful and tuneable netem can be, we had a piece of network gear that lost messages if they were too close in time to each other (less than N microseconds). Probably some 'copy packet during interrupt' stuff. We couldn't change anything in the applications or the network gear. The solution was (on the sender side) to 1) classify the specific packet sequence with tc and 2) delay the second message in the sequence. It's a large command line, but it does the job perfectly. It's marvelous.
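
For readers who haven't seen netem, here is a much simpler case than the classify-and-delay setup described above: a whole-interface delay, jitter, and loss rule. The Python sketch only builds the `tc` command lines (running them needs root on Linux); the helper names `netem_command` and `netem_clear` and the interface name `eth0` are assumptions for illustration.

```python
import shlex


def netem_command(iface, delay_ms=None, jitter_ms=None, loss_pct=None):
    """Build a `tc qdisc add ... netem` argv list for one interface."""
    argv = ["tc", "qdisc", "add", "dev", iface, "root", "netem"]
    if delay_ms is not None:
        argv += ["delay", f"{delay_ms}ms"]
        # Jitter is only meaningful as a second value after the delay.
        if jitter_ms is not None:
            argv.append(f"{jitter_ms}ms")
    if loss_pct is not None:
        argv += ["loss", f"{loss_pct}%"]
    return argv


def netem_clear(iface):
    """Build the command that removes the root qdisc again."""
    return ["tc", "qdisc", "del", "dev", iface, "root"]


cmd = netem_command("eth0", delay_ms=100, jitter_ms=20, loss_pct=1)
print(shlex.join(cmd))
# prints: tc qdisc add dev eth0 root netem delay 100ms 20ms loss 1%
```

To apply it for real, the list could be passed to `subprocess.run(cmd, check=True)` as root, followed by `netem_clear` when the test is done.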


+1 for tc. Used to run a network testing lab of about 12 racks of network gear and tc was my go-to for simulating network level performance impacts for both small and large test scenarios.


https://www.freebsd.org/cgi/man.cgi?dummynet

The dummynet system facility permits the control of traffic going through the various network interfaces, by applying bandwidth and queue size limitations, implementing different scheduling and queue management policies, and emulating delays and losses. ... The dummynet facility was initially implemented as a testing tool for TCP congestion control by Luigi Rizzo, as described on ACM Computer Communication Review, Jan.97 issue.


Isn't this already at a TCP level? HTTP is only used for management of the proxy?


Ah, I see, my bad. If I understand correctly though it's mostly for hooking client/server architectures? I'm interested more in things like hooking comms between replicas of a replicated database or an implementation of paxos. Would it be useable for that? For example, can I take a list of processes running on different ports and set up a proxy per connection (assuming they are connected all-to-all)? Maybe I should just RTFM :)


Yeah, all the other replies confuse me. Toxiproxy _is_ a TCP proxy. You can use it for Redis, MySQL, or anything else that runs on TCP.
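
On the all-to-all replica question above: since each Toxiproxy proxy is just a listen address forwarding to an upstream, one option is a proxy per directed link. The sketch below generates such a list in the JSON shape that the Toxiproxy server accepts as a config file (name, listen, upstream, enabled). It assumes each replica's peer addresses are configurable so that it can dial the proxy port instead of the peer's real port; the function name `mesh_proxies`, the replica names, and the port numbers are hypothetical.

```python
import json
from itertools import permutations


def mesh_proxies(replicas, first_proxy_port=20000, host="127.0.0.1"):
    """Return one proxy definition per directed link src -> dst.

    `replicas` maps a replica name to the port it really listens on.
    """
    proxies = []
    port = first_proxy_port
    for src, dst in permutations(sorted(replicas), 2):
        proxies.append({
            "name": f"{src}_to_{dst}",
            "listen": f"{host}:{port}",           # src dials this address
            "upstream": f"{host}:{replicas[dst]}",  # dst's real address
            "enabled": True,
        })
        port += 1
    return proxies


config = mesh_proxies({"a": 5001, "b": 5002, "c": 5003})
print(json.dumps(config, indent=2))
```

With three replicas this yields six proxies. Disabling `a_to_b` and `b_to_a` would then partition a from b while leaving c connected to both, and a latency toxic on a single proxy would delay only one direction of one link.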


You might be interested in Partisan's fault injector. I don't know if @cmeiklejohn is still working on it, but it's really cool nonetheless.

https://www.youtube.com/watch?v=KrwhOkiifQ8

https://github.com/lasp-lang/partisan



Not sure if this software satisfies your need: https://github.com/tylertreat/Comcast



I used to use WANem. It was never really maintained, but it was great for my manual networking tests. I would like something similar now, but updated.


Iptables random module


We also use Toxiproxy at my company and it's awesome.

Shameless plug: I wrote a Rust port of it a while ago, for fun: https://github.com/oguzbilgener/noxious


Awesome! How can I contribute?


Thank you for the offer! If you're looking for something small, I just created issue #1. Other than that, I'm open to new ideas and anything that would make it easier to use.


Very cool! This would be fantastic for a CI pipeline.

For local development on the Mac, there's also dummynet. Full docs at:

    man 8 dnctl


I recently used it to simulate a network breakdown between an application and its database server to study what the application does. There are various clients you can use with Toxiproxy to set "toxics" (e.g. a delay in the network traffic), but I found the CLI better suited to my needs. The other clients (e.g. for Node.js) can help you write unit tests that are meant to test resiliency.
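
As a rough illustration of that kind of breakdown test without any client library: Toxiproxy's HTTP API (default port 8474) lets you update a proxy's `enabled` field, and disabling a proxy drops its connections. The sketch below only builds the requests. It assumes the `POST /proxies/{proxy}` update endpoint described in the README (check the version you run), and `app_db` is a hypothetical proxy name for the application-to-database link.

```python
import json
import urllib.request


def set_proxy_enabled(proxy, enabled, api="http://localhost:8474"):
    """Build the request that enables or disables one proxy."""
    return urllib.request.Request(
        f"{api}/proxies/{proxy}",
        data=json.dumps({"enabled": enabled}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "app_db" is a hypothetical proxy name, not something Toxiproxy defines.
cut = set_proxy_enabled("app_db", False)
restore = set_proxy_enabled("app_db", True)

# With a Toxiproxy server running, send them like this:
#   urllib.request.urlopen(cut).close()      # database link goes down
#   ... observe how the application behaves ...
#   urllib.request.urlopen(restore).close()  # link comes back
```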

The other thing that caught my interest was the concept of "gamedays". It's really about introducing "problems" in the production system randomly and keeping the support staff who manage application incidents on their toes. (More about it in this talk: https://www.youtube.com/watch?v=TTfWpHuCJXk)


I used to work at Shopify and Toxiproxy was always a fantastic resource.


I've used Toxiproxy to reproduce so many issues that I can't thank the authors enough for this awesome tool! I also found a Docker-based UI for adding toxics; I just wish it shipped out of the box or as another binary/brew package. While the CLI is awesome for people who have mastered it, for beginners it's one more thing to learn, and having a UI just solves that.


This is one of my favorite tools out there. So simple, yet so powerful.

You get the real deal when you pair it with other tools to analyze and monitor traffic.

Worth noting: I sometimes use this front end when I want to quickly adjust stuff: https://github.com/buckle/toxiproxy-frontend


We use it at GitHub; indeed, I just updated the version we use for development this morning! Great project.


If you have an Apple Developer account, you can install Network Link Conditioner, which works transparently with all types of traffic:

https://nshipster.com/network-link-conditioner/


This looks great! I have a tool I’m building on one system with flaky network issues and this looks to give us a way to test this on a dev server to simulate the issues without having to try to guess what’s happening or interrupt the production machine.


While yours is more fully featured, I submit that https://github.com/tylertreat/comcast had the better name.


These are quite different projects; they work completely differently and have different capabilities.


Can this be used for simulating network conditions during mobile app development / resiliency testing?



