I believe that Solaris (OpenSolaris) Zones predate LXC by around 3 years. Even though I work with k8s and Docker every day, I still find what OpenSolaris had in 2009 superior. Crossbow and ZFS tied it all together so neatly. What OpenSolaris could have been in another world. :D
I've been looking at Materialize for a while (https://materialize.com/). It can handle automatically refreshed materialized views. Last time I checked, it didn't support some Postgres SQL constructs that I use often, but I'm really looking forward to it.
I think the problem is when you have a materialized view that takes hours to refresh. We are lucky that 99% of our traffic is between 7:00 and 19:00 on weekdays, so we can just refresh at night, but that won't work for others.
I don't know much about how postgresql works internally, so I probably just don't understand the constraints. Anyway, as I understand it, there are two ways to refresh: you either refresh a view concurrently or not.
If not, then postgres rebuilds the view from its definition on the side and at the end some internal structures are switched from the old to the new query result. Seems reasonable, but for some reason, which I don't understand due to my limited knowledge, an ACCESS EXCLUSIVE lock is held for the entire duration of the refresh and all read queries are blocked, which doesn't work for us.
If you refresh concurrently, postgres rebuilds the view from its definition and compares the old and the new query result with a full outer join to compute a diff. The diff is then applied to the old data (like regular table INSERT/UPDATE/DELETE, I assume), so I think you get away with just an EXCLUSIVE lock and read access still works. There are two downsides to this: first, it requires a unique index for the join; second, the full outer join is a lot of additional work.
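For reference, the two variants look roughly like this (made-up view and column names; the concurrent variant needs an already populated matview with a unique index):

    # CONCURRENTLY requires a unique index on the materialized view
    psql -d mydb -c "CREATE UNIQUE INDEX ON sales_summary (product_id)"
    # plain refresh: ACCESS EXCLUSIVE lock, readers block until it finishes
    psql -d mydb -c "REFRESH MATERIALIZED VIEW sales_summary"
    # concurrent refresh: readers keep working, at the cost of the diff/join work
    psql -d mydb -c "REFRESH MATERIALIZED VIEW CONCURRENTLY sales_summary"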
I never had the time to test Materialize, but it seems to do what I want with its continuous refresh.
I also thought about splitting the materialized view into two: one for rarely changing data and another one for the smaller part of the data which changes daily. Then I would only have to refresh the smaller view and UNION ALL both materialized views in a regular view. Not sure how well that will work with the postgres query planner.
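Roughly what I have in mind, with made-up table/column names and a hard-coded split date just for illustration:

    psql -d mydb <<'SQL'
    CREATE MATERIALIZED VIEW stats_old AS
      SELECT user_id, count(*) AS n
      FROM events WHERE created_at < date '2020-01-01'
      GROUP BY user_id;
    CREATE MATERIALIZED VIEW stats_recent AS
      SELECT user_id, count(*) AS n
      FROM events WHERE created_at >= date '2020-01-01'
      GROUP BY user_id;
    CREATE UNIQUE INDEX ON stats_recent (user_id);
    CREATE VIEW stats AS
      SELECT * FROM stats_old UNION ALL SELECT * FROM stats_recent;
    -- the daily job then only needs:
    REFRESH MATERIALIZED VIEW CONCURRENTLY stats_recent;
    SQL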
Not sure how that would work with the PG query planner either, but a batch path for rarely changing data plus a separate path for rapidly changing data is basically the Lambda data architecture, so probably a good call!
There's one gotcha with this approach: if there's another DDL operation running simultaneously with REFRESH MATERIALIZED VIEW, you'd get an internal postgres error.
You cannot be sure that refresh won't coincide with a grant on all tables in the schema, for example.
MSSQL has "indexed views", which are automatically updated instantly... but they destroy your insert/update performance, and their requirements are so draconian as to be completely impossible to ever actually use (no left joins, no subqueries, no self joins, etc.).
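For reference, a minimal sketch of what one looks like, via sqlcmd with made-up table/column names (SCHEMABINDING, two-part names, COUNT_BIG(*) alongside GROUP BY, and then a unique clustered index are all required; Amount is assumed NOT NULL, since SUM over a nullable column isn't allowed):

    # -I turns on QUOTED_IDENTIFIER, which indexed views need
    sqlcmd -d MyDb -I -Q "CREATE VIEW dbo.OrderTotals WITH SCHEMABINDING AS
      SELECT CustomerId, COUNT_BIG(*) AS cnt, SUM(Amount) AS total
      FROM dbo.Orders GROUP BY CustomerId"
    sqlcmd -d MyDb -I -Q "CREATE UNIQUE CLUSTERED INDEX IX_OrderTotals
      ON dbo.OrderTotals (CustomerId)"

Once the clustered index exists, the view's result set is stored and maintained on every write to dbo.Orders, which is where the insert/update cost comes from.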
Yes, views are nice, but there is also a fair argument for not needlessly bogging down a table. Sure, they were making up data, but a flat table with stats, profile data and other data that could easily live elsewhere is just bloat. Once you have an ID, the static fields can be retrieved from other services/data stores.
I think their point is more ‘don’t store all that junk in your primary database and then do all your work on it there too if you can just stuff it somewhere else’. Which has pros and cons and depends a lot on various scaling factors.
I'm pretty sure most engines use the term "materialized views" for eventual consistency tables. The only db I've seen with that kind of ACID materialized view is MS SQL, which calls them "indexed views".
If one wanted to do server side rendering in Java with something like Turbolinks, in 2020 - what would one use? JSP? Grails? JSF? Or just hit the bar instead? :-)
JSF really is meant for quickly building internal applications that don't have to withstand "web scale" loads. It's focused on churning out data driven applications quickly. Add something like BootsFaces or PrimeFaces and you can produce these things in very little time. That's not to say you couldn't use JSF to make a "web scale" project, but you would have to dive into your server pretty far to carefully watch state management and session creation. Not impossible, but just probably not its primary purpose.
For external facing applications that need to withstand a "web scale" load, Eclipse Krazo (aka MVC spec 1.2) and JSP are what you're looking for. These things are lightning fast and give you a lot of control over session creation by default. Render times are usually under a few ms. This is probably the fastest and least resource intensive stack available (no benchmark provided, take it for what you paid for the comment).
I see there are a few posts repeating the common interpretation that glyphosate is not dangerous because it only targets a metabolic pathway that animals don't have, so for the sake of discussion here is another viewpoint: https://www.youtube.com/watch?v=kVolljHmqEs (disregard the clickbait title). Summary: glyphosate is not bad for your body itself, but it does kill everything in your stomach, and that is not so awesome.
Krste Asanovic in https://www.youtube.com/watch?v=KxuQW8HWBXI shows numbers indicating that the Berkeley Rocket implementation of RISC-V is both faster and has a smaller die size than the ARM Cortex-A5, and that the Berkeley BOOM is faster and smaller than the ARM Cortex-A9.
The Cortex-A series are big processors. They're generally used when compute power is more important than power consumption. While it's a promising benchmark, RISC-V will need to compete with the Cortex-M series (and other 8/16-bit cores) to break into the IoT market.
As impressive as that is, I doubt there's much room for RISC-V in Cortex-A's target market (phones, smart TVs, etc etc). I explicitly mentioned Cortex-M.
We run ZFS over LUKS-encrypted volumes in production on AWS ephemeral disks and have done so for over two years on Ubuntu 14.04 and 16.04. The major issue for us has been getting the startup order right, as timing issues do occur once you have many instances. To solve this, we use upstart (14.04) and systemd (16.04) together with Puppet to control the ordering.
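On 16.04 the ordering part boils down to a systemd drop-in along these lines (a rough sketch; the exact unit names depend on the ZoL packaging, and the crypttab name "ephemeral0" is made up):

    mkdir -p /etc/systemd/system/zfs-import-cache.service.d
    cat > /etc/systemd/system/zfs-import-cache.service.d/luks.conf <<'EOF'
    [Unit]
    # don't let ZFS look for its pools before the LUKS mapping exists
    Requires=systemd-cryptsetup@ephemeral0.service
    After=systemd-cryptsetup@ephemeral0.service
    EOF
    systemctl daemon-reload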
Performance-wise it does fairly well; our benchmarks show a ~10-15% decrease on random 8 KB IO (14.04).
We are definitely looking forward to ZFS native encryption!
Since ZFS runs on block-level devices and you want the ZFS benefits of snapshots/compression/(deduplication), in my opinion it makes sense to do the encryption at the block level, i.e. LUKS has to provide decrypted block devices before ZFS searches for its zpools.
When ZFS native encryption becomes available on Linux this will be different, since you get much finer control over what to encrypt and you can keep all ZFS features.
So:
First decrypt LUKS (we are doing this in GRUB)
Then mount zpool(s)
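In commands it looks roughly like this when done by hand (device, mapping and pool names are made up; in our setup GRUB handles the decryption step at boot):

    # 1. open the LUKS container so a decrypted mapping appears under /dev/mapper
    cryptsetup open /dev/sdb1 cryptdata
    # 2. only now can ZFS find the pool on the decrypted device
    zpool import -d /dev/mapper tank
    zfs mount -a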
Please stop spreading this misinformed statement. I assume you are referring to the ZFS ARC (Adaptive Replacement Cache). It works in much the same way as a regular Linux page cache. It does not take much more memory (if you disable prefetch) and will only use what is available/idle. We use Linux with ZFS on production systems with as little as 1 GB of memory. We stopped counting the times it has saved the day. :-)
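If you want to be explicit about it on a small box, the ARC cap and the prefetch setting are just module parameters; a sketch (the values are examples, not recommendations):

    cat > /etc/modprobe.d/zfs.conf <<'EOF'
    # cap the ARC at 256 MiB and disable prefetch on a 1 GB machine
    options zfs zfs_arc_max=268435456 zfs_prefetch_disable=1
    EOF
    # takes effect after the zfs module is reloaded or at the next boot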
ECC is a nice-to-have, but ZFS has no special requirements compared to, say, a regular page cache. The only difference is that ZFS will discover bit-flips instead of just ignoring them as ext4 or xfs would do.
To be clear, it is not ZFS that requires or even mandates ECC. Since ZFS uses data as present in memory and has checks for everything post that, it is prudent to have memory checks at the hardware level.
Thus, if one is using ZFS for data reliability, one ought to use ECC memory as well.
> Actually it seems ECC is important for ZFS filesystems see:
The implication in the previous comment tends to lead people to think ECC RAM is needed for ZFS specifically. As the blog post you link to points out, it's equally applicable to all filesystems.
It's not required, but it doesn't make sense to use ZFS but not to use ECC memory. That's the point. It's like locking the backdoor but leaving the front door wide open.
That's right, that's the kind of hardware I was referring to: 1 GB of plain RAM.
Honestly, I haven't tested ZFS yet for that reason; I've always read that ZFS has big requirements, so I refrained from trying it. It seems I should give it a try. ;)
Btrfs is another story: I've used it for years and I'd prefer not to have to use it anymore until it becomes "stable" and "performant". :)
Is ZFS able to repair corruption when it has only a single copy of the data?
My main issue is being able to repair "silent" data corruption on a single-drive machine. Am I able to use x% of my "partition" for data repair, or do I need to use another partition/drive to mirror/RAID it?
If I understand right, ZFS can detect bitrot ("not really" a big deal), but without any local copy it can't self-heal.
My use case is an ARM A20 SoC (Lime2) used to store local backups, among other things, so I need something that detects and repairs silent data corruption at rest by itself (using a single drive).
Not sure if it will fit your needs or not, but for long-term storage on single HDs (and, back in the day, on DVDs), I would create par files with about 5-10% redundancy to guard against data loss due to bad sectors.
http://parchive.sourceforge.net/
Total drive failure of course means loss of data, but the odd bad sector or corrupted bit would be correctable on a single disk.
This was very popular back in the binary Usenet days....
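With par2cmdline that looks roughly like this (file names made up):

    # create ~10% recovery data, then later verify and, if needed, repair
    par2 create -r10 backup.tar.par2 backup.tar
    par2 verify backup.tar.par2
    par2 repair backup.tar.par2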
You can create a nested ZFS file system and set the number of copies of the various blocks to be two or more. This will take more space, but there'll be multiple copies of the same block of data.
Ideally, though, please add an additional disk and set it up as a mirror.
ZFS can detect the silent data corruption during data access or during a zpool scrub (which can be run on a live production server). If there happen to be multiple copies, then ZFS can use one of the working copies to repair the corrupted copy.
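As a sketch, with made-up pool/dataset names:

    # store every block twice within the same single-disk pool
    zfs create -o copies=2 tank/backups
    # walk all data and repair anything whose checksum fails, using the extra copy
    zpool scrub tank
    zpool status -v tank    # scrub progress and any errors found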
No, but parity archives solve a different problem: with only some percentage of wasted storage you can survive bit errors in your data set. It's like Reed-Solomon for files.
In order to achieve the same with ZFS you would have to run RAID-Z2 on sparse files.
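Purely as an illustration of the idea (sizes and paths made up, not a production recommendation):

    truncate -s 10G /var/tmp/z1 /var/tmp/z2 /var/tmp/z3 /var/tmp/z4
    zpool create paritypool raidz2 /var/tmp/z1 /var/tmp/z2 /var/tmp/z3 /var/tmp/z4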
We have been running ZFS on Linux in production since April 2015 on over 1500 instances in AWS EC2 with Ubuntu 14.04 and 16.04. Only one kernel panic observed so far, on a Jenkins/CI instance, but that was due to Jenkins doing magic on ZFS mounts, believing it was a Solaris ZFS mount.
In our opinion, when we made the switch, it was much more important to be able to trust the integrity of the data than to avoid a possible kernel panic.