I did something similar when I wanted to search the HIBP database. If you are okay with some false positives, you can beat your results on both speed and size.
With a Bloom filter you can tune the false positive rate to whatever you want. I chose 1 in a million, so my data structure was still very accurate at determining whether a password had already been breached.
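As a rough illustration of that tuning, here is a minimal Python sketch of a Bloom filter sized from a target false positive rate. It is only a sketch under assumptions, not the implementation described in this comment: the `BloomFilter` class, the SHA-256 double hashing, and the sample passwords are all made up for the example.

```python
import hashlib
import math


class BloomFilter:
    """Toy Bloom filter sized from an expected item count and a target false positive rate."""

    def __init__(self, expected_items: int, false_positive_rate: float):
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits and k = (m / n) * ln 2 hash functions.
        self.num_bits = math.ceil(
            -expected_items * math.log(false_positive_rate) / math.log(2) ** 2
        )
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from two halves of one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "probably present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


# Hypothetical sample data, sized for 1000 items at a 1-in-a-million false positive rate.
bloom = BloomFilter(expected_items=1000, false_positive_rate=1e-6)
for pw in ("password", "123456", "qwerty"):
    bloom.add(pw)
```

Lowering `false_positive_rate` grows both the bit array and the number of hash probes per lookup, which is the accuracy versus size trade-off being tuned.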
A lookup took only 30 microseconds, and the size was close to the theoretical limit: about 22 bits per element, or ~1.5 GB.
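A quick back-of-the-envelope check of those sizes, as a sketch. The element count is not stated in the comment; it is an assumption derived from the quoted figures (1.5 GB at 22 bits per element). The formulas are the standard ones: any structure with false positive rate p needs at least log2(1/p) bits per element, and an optimally tuned Bloom filter needs log2(1/p) / ln 2.

```python
import math

FALSE_POSITIVE_RATE = 1e-6

# Assumption: element count implied by "22 bits per element or ~1.5 GB".
implied_elements = 1.5e9 * 8 / 22


def lower_bound_bits(p: float) -> float:
    """Information-theoretic minimum bits per element for false positive rate p."""
    return math.log2(1 / p)


def bloom_bits(p: float) -> float:
    """Bits per element for an optimally tuned Bloom filter."""
    return math.log2(1 / p) / math.log(2)


def gigabytes(bits_per_element: float, elements: float) -> float:
    return bits_per_element * elements / 8 / 1e9


min_bits = lower_bound_bits(FALSE_POSITIVE_RATE)   # about 19.9 bits
bloom_bits_per_elem = bloom_bits(FALSE_POSITIVE_RATE)  # about 28.8 bits
bloom_gb = gigabytes(bloom_bits_per_elem, implied_elements)
min_gb = gigabytes(min_bits, implied_elements)
```

Under that assumed element count, a plain Bloom filter comes out at roughly 2 GB, and 22 bits per element sits about two bits above the log2(1/p) floor.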
I originally used a Bloom filter, which came to 2 GB, but since a Bloom filter is just a sequence of 0s and 1s, I was able to use Golomb coding to shrink it down to 1.5 GB.
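For readers who have not seen Golomb coding before, here is a minimal Python sketch of one common construction, a Golomb-coded set: hash every item into a range, sort the hashes, and Rice-code the gaps between them (Rice coding is Golomb coding with a power-of-two parameter). This may differ from what was actually done here; the function names, the SHA-256 hash, the bit-string representation, and the sample passwords are all assumptions for illustration.

```python
import hashlib


def hash_to_range(item: str, modulus: int) -> int:
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % modulus


def rice_encode(sorted_values, k: int) -> str:
    """Encode gaps between sorted integers: unary quotient, a 0, then a k-bit remainder (k >= 1)."""
    out = []
    previous = 0
    for value in sorted_values:
        gap = value - previous
        previous = value
        quotient, remainder = gap >> k, gap & ((1 << k) - 1)
        out.append("1" * quotient + "0" + format(remainder, f"0{k}b"))
    return "".join(out)


def rice_decode(bits: str, k: int):
    """Inverse of rice_encode: rebuild the sorted integers from the gap codes."""
    values, pos, current = [], 0, 0
    while pos < len(bits):
        quotient = 0
        while bits[pos] == "1":
            quotient += 1
            pos += 1
        pos += 1  # skip the 0 that terminates the unary part
        remainder = int(bits[pos:pos + k], 2)
        pos += k
        current += (quotient << k) | remainder
        values.append(current)
    return values


K = 20  # 2**20 is roughly one million, giving about a 1-in-a-million false positive rate
passwords = ["password", "123456", "qwerty", "letmein"]  # hypothetical sample data
modulus = len(passwords) << K
hashes = sorted({hash_to_range(pw, modulus) for pw in passwords})
encoded = rice_encode(hashes, K)


def contains(item: str) -> bool:
    # A real implementation would index into the stream instead of decoding all of it.
    return hash_to_range(item, modulus) in rice_decode(encoded, K)
```

Each element costs `K + 1` bits plus a short unary prefix, which is why this kind of encoding lands only a bit or two above the log2(1/p) floor.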
The time to process the original 24 GB, however, is something I could have improved, but I kind of lost interest once I had something that was essentially at the theoretical minimum size and could determine whether a password exists within 30 microseconds.
Very nice. I've never used a Golomb Set (looks interesting). I bet we'll see more organizations doing this and maybe in five to ten years, it'll be the norm.
Would be nice if people from Toronto would enter more data points for the more popular startups, e.g. Shopify, Instacart, Unata, Ritual, League, Top Hat, Wealthsimple, FreshBooks, Pivotal, Ada, etc.
How does this solve the problem of data leaks? Let's say some social network uses this Solid framework and you log in with your Solid POD. Can't the social network then just save your data into a database? Then when the social network gets hacked, the personal data of millions of users is still released into the wild.
What I'd like to see is every device running a small process that can read the user's data. HTML would then have a syntax that the process interprets, like:
{{ solid://mydata/name }}
{{ solid://mydata/profile.jpg }}
{{ solid://mydata/age }}
etc.
That data is stored encrypted on the user's device. Apps can never read your data; they can only tell your device to display it.
That way developers cannot read your data, store it in their own databases, and then accidentally get their own database hacked, which would put us back at square one.
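A minimal sketch of what that on-device process could do, in Python. Everything here is hypothetical: the `solid://mydata/...` placeholder syntax comes from the example above, while `render_on_device` and the `local_store` dictionary are invented names standing in for the device's encrypted data store.

```python
import html
import re

# Matches placeholders such as {{ solid://mydata/name }}.
PLACEHOLDER = re.compile(r"\{\{\s*solid://mydata/([\w.\-]+)\s*\}\}")


def render_on_device(template: str, local_store: dict) -> str:
    """Fill placeholders from data that never leaves the device; unknown keys render as empty."""

    def substitute(match: re.Match) -> str:
        value = local_store.get(match.group(1), "")
        return html.escape(str(value))

    return PLACEHOLDER.sub(substitute, template)


# The app ships only the template; the values come from the (hypothetical) local store.
template = "<p>Hello {{ solid://mydata/name }}, age {{ solid://mydata/age }}</p>"
local_store = {"name": "Ada", "age": 36}
rendered = render_on_device(template, local_store)
```

The app only ever sees the template. A real design would also have to keep page scripts from reading the rendered values back out, which this sketch does not address.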
I'd like the data on your device to be encrypted with some type of homomorphic encryption, so that if an app wanted to show the average age of its users, it could run something like `select avg(age) from users` without ever learning anything about any individual user. The same would apply to machine learning operations, so we could get Netflix-style movie recommendations without a company ever learning which movies we liked, our age, etc.
However, I don't know the first thing about homomorphic encryption, so I guess I'll just have to wait until some great soul builds something like this for us.
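For what it's worth, the "average age" part only needs addition on encrypted values, which additively homomorphic schemes such as Paillier provide. Below is a toy Python sketch of that idea, with loud caveats: the primes are far too small to be secure, the function names are invented, and it assumes some party holds the private key and decrypts only the total. It does not cover the machine learning case.

```python
import math
import secrets

# Toy parameters, far too small to be secure. Real deployments use primes of 1024+ bits.
P, Q = 104_729, 1_299_709
N = P * Q
N_SQ = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N)  # modular inverse (Python 3.8+); valid because the generator is n + 1


def encrypt(message: int) -> int:
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(N + 1, message, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(ciphertext: int) -> int:
    x = pow(ciphertext, LAM, N_SQ)
    return ((x - 1) // N * MU) % N


def add_encrypted(c1: int, c2: int) -> int:
    # Multiplying Paillier ciphertexts adds the underlying plaintexts.
    return (c1 * c2) % N_SQ


ages = [31, 45, 27]  # hypothetical users
encrypted_ages = [encrypt(age) for age in ages]  # done on each user's device

encrypted_sum = encrypted_ages[0]
for c in encrypted_ages[1:]:
    encrypted_sum = add_encrypted(encrypted_sum, c)  # done by the app, without seeing any age

average_age = decrypt(encrypted_sum) / len(ages)  # done by whoever holds the private key
```

The app handles only ciphertexts, and the key holder sees only the sum, never an individual age.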
There are no big companies in Toronto that have tech jobs. Google, Facebook, Airbnb, Uber, etc. don't have tech jobs in Toronto. Amazon only started adding tech jobs in Toronto in the past year or two, and Shopify is the only exception to this rule.
If any Toronto natives know of some big companies in Toronto with tech jobs, please prove me wrong.
Microsoft and HP are in Mississauga, Honeywell and IBM are in Markham. Uber has a small office right in my neighbourhood. Rakuten Kobo is in the neighbourhood as well. Nvidia is downtown, but their office is relatively small.
There are a few. Most large ones are in the suburbs. Downtown has an overwhelming number of VC-funded ad-tech companies that I wouldn't go near if my living depended on it. (I'd take up something else to cover the bills...)
Muse, Google, Facebook, Softchoice, and IBM are here. All the major banks have a tech arm now, like Scotiabank's Digital Factory and CIBC Live Labs. There are other tech companies too: Flip, Wave, FreshBooks, Wattpad (raised major VC money), Ritual (raised 75M this year), TribalScale, and Worktango, just to name a few.
I'm not sure if you'd count IBM, but they have their Toronto Software Lab here. It's technically in Markham now, but if it were on the south side of Steeles instead of the north side, it would be in Toronto proper. AMD also still has a large office in Markham in what used to be ATI headquarters.
I've seen a few articles about Apple hiring developers in Toronto, so there's that too.
Maybe Waterloo is a bit too far away to count, but Google has a fairly big engineering office there.
Amazon has an office and is scaling up quickly. Uber ATG (self-driving) has an office. Shopify has a huge presence. People say Google, but it's really a marketing office with <20 engineers.
As a freelancer with clients, are you generally on deadlines, or do you work when you want? Essentially I would like to work around 3 days a week, but I am not sure projects could meet the deadlines clients want on that schedule. At the digital agencies I have worked for, I had to show work done every week, and the amount of work given to me was such that I had to work the full week. If I did that as a freelancer I would certainly make more money, but I am hoping there are jobs where deadlines are based on an average of around 3 days of work a week...
Either charge a fraction of your weekly price, position yourself as part-time, or prove you can hustle. If you can get what they think of as a week's worth of work done in 3 days, you've arbitraged the effort.
Money flows to everyone; more of it just goes to those with more money. However, that isn't as bad as it sounds. If you own 10% and someone else owns 1%, after the money has been given out you will still have 10% and they will still have 1%. From a relative wealth perspective it just maintains everyone's position in society, neither benefiting nor disadvantaging anyone...
If I have 10% and you have 1%, I might be able to keep 5% locked as my stake and 5% liquid, whereas you need 0.9% liquid and can only lock 0.1%. What was a 10:1 ratio in capital becomes a 50:1 ratio in stake after we each carve some out for liquidity.
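The arithmetic behind that 10:1 versus 50:1 claim, as a small Python sketch using the percentages from the comment. The `locked_stake` helper is a made-up name, and the sketch assumes rewards are paid in proportion to locked stake.

```python
from fractions import Fraction


def locked_stake(total_share: Fraction, liquid_needed: Fraction) -> Fraction:
    """Share of the total supply that can be locked after setting aside liquidity."""
    return total_share - liquid_needed


rich_total, poor_total = Fraction(10, 100), Fraction(1, 100)

rich_locked = locked_stake(rich_total, Fraction(5, 100))   # 5% locked
poor_locked = locked_stake(poor_total, Fraction(9, 1000))  # 0.1% locked

capital_ratio = rich_total / poor_total  # ratio of holdings
stake_ratio = rich_locked / poor_locked  # ratio of locked stake, and so of rewards
```

If rewards follow locked stake, the larger holder earns 50 times the rewards on 10 times the capital, so relative positions do not stay fixed.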
> Canada to scrap IBM payroll plan gone awry costing $1B. The Phoenix project was originally chosen by Prime Minister Justin Trudeau’s Conservative predecessors 10 years ago to centralize the government payroll.
Is it me or is this wording terrible?
Trudeau is not a Conservative. It makes it sound like it is Trudeau's party that caused the problem.
I'm not a fan of the Conservatives, but it's still unfair to place the blame for a bad choice of client on a political party. A lot of non-partisan people were involved in getting the project running.
Anyway, take a look at my HIBP password filter if you're interested in trying a different approach: https://github.com/terencechow/PwnedPasswords