Unfortunately, for databases under heavy load it's out of the question.
You can have a server with spinning disks pegged at 100% load waiting on disk I/O, and without changing anything else on the server (RAM or CPU), just replacing the drives with SSDs can quickly drop that load to 10%.
Random reads are an order of magnitude slower on spinning disks, so for fast access there's a notable difference. There's still a role for hard drives in sequential reads/writes, such as the regular backups they should have been taking! Slower in all areas, but cheaper for the same capacity.
Create an account by clicking "Register" at the top of the checkout page. You should then see the discounted price, at least until tomorrow (it seems the sale runs until then).
The standard machine has 32 GB. So I guess 0.32 hundreds... And for completeness, 6 physical cores, 12 logical with HT. Still beefy, but nothing out of the ordinary.
A rather popular hardware site in a developing country, but one on the upper range of "developing". So probably better off than 60% of the world's population:
* Literally every current MacBook except the 15" MBP
* iMacs unless you choose to upgrade the RAM
* My local PC retailer's website has 34 models with 4 GB of RAM, 84 with 8 GB and 28 with 16 GB. The cheapest 16 GB laptop there is €1100. They also sell 3 Chromebooks and 1 Windows 10 device with 2 GB of RAM.
No, workstations and/or graphical stations are powerful. Gaming rigs can be powerful, but most of the time they are average or just above average.
Also - Steam statistics will show you a great number of PCs that are being used only to play some indie games and/or Dota. 8 GB is more than enough for that.
Depends on how you define "power". I'm sure devs often have more RAM for VMs etc., whereas gamers have more GPU for graphics. I'm building a gaming rig right now, and for the games I wanna play 8 GB of RAM is plenty.
Are you talking standard Google developer workstation specs? That's not much different than the thinkservers we use as desktops. But considerably beefier than what non-desktop devs might be accustomed to.
Some people have Macs, some have ThinkPads (at other points in time there have been Dells or HPs as well). But only desktops are actually used for any development.
Look in the Linux kernel to see how often linked lists are used. A process's memory mappings, for example, are maintained as a linked list in parallel with a balanced tree. Lots of queues are maintained as linked lists. Free slabs in the slab allocator are kept in a linked list. The list goes on.
Yeah, until you need to remove an element from the middle in constant time - this is sometimes useful for things like the implementation of an LRU cache.
If you iterate over the list within an order of magnitude as often as you remove an element from the middle, an array will still be faster despite not having constant time removal, no matter the size of the list.
To restate, if you have a workload where you iterate over your collection ~500 times, and remove an element from the middle ~1000 times, an array will usually outperform a linked list on a modern computer, no matter the size of the list.
It is not until you do removes/adds far more often, and far from the end of the collection, that the linked list will perform better overall.
That's why I mentioned the LRU implementation. It does not iterate over the list, only adds elements to the front, deletes from the back and moves from the middle to the front. Refs to nodes are stored in a map, so iteration is not necessary to find an element.
You use a hash map, and make the linked list a linked list of hash entries. So, if you're using chaining (as opposed to the many variations on open addressing):
    class Entry<K,V> {
        Entry<K,V> younger; // LRU doubly linked list
        Entry<K,V> older;   // LRU doubly linked list
        Entry<K,V> next;    // for hash map bucket chaining
        V value;
        K key;
    }
The naive solution using two separate data structures would look like:
    class MapEntry<K,V> {
        MapEntry<K,V> next; // for hash map bucket chaining
        DoublyLinkedListEntry<MapEntry<K,V>> lruEntry;
        K key;
        V value;
    }

    class DoublyLinkedListEntry<T> {
        DoublyLinkedListEntry<T> prev;
        DoublyLinkedListEntry<T> next;
        T value;
    }
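Tying the two ideas together, here is a minimal runnable sketch of the intrusive variant. The class and method names are mine, not the commenter's, and `java.util.HashMap` stands in for the hand-rolled bucket chaining; the point is that the map hands us a node directly, so no list traversal is ever needed.

```java
import java.util.HashMap;

// Minimal LRU cache sketch: a HashMap for O(1) lookup plus an intrusive
// doubly linked list threaded through the entries for O(1) recency updates.
class LruCache<K, V> {
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> younger, older; // intrusive LRU links
        Node(K key, V value) { this.key = key; this.value = value; }
    }

    private final int capacity;
    private final HashMap<K, Node<K, V>> map = new HashMap<>();
    private Node<K, V> youngest, oldest; // front and back of the LRU list

    LruCache(int capacity) { this.capacity = capacity; }

    V get(K key) {
        Node<K, V> n = map.get(key);
        if (n == null) return null;
        moveToFront(n); // no iteration: the map gave us the node itself
        return n.value;
    }

    void put(K key, V value) {
        Node<K, V> n = map.get(key);
        if (n != null) { n.value = value; moveToFront(n); return; }
        if (map.size() == capacity) { // evict the least recently used entry
            map.remove(oldest.key);
            unlink(oldest);
        }
        n = new Node<>(key, value);
        map.put(key, n);
        linkFront(n);
    }

    private void unlink(Node<K, V> n) {
        if (n.younger != null) n.younger.older = n.older; else youngest = n.older;
        if (n.older != null) n.older.younger = n.younger; else oldest = n.younger;
        n.younger = n.older = null;
    }

    private void linkFront(Node<K, V> n) {
        n.older = youngest;
        if (youngest != null) youngest.younger = n;
        youngest = n;
        if (oldest == null) oldest = n;
    }

    private void moveToFront(Node<K, V> n) {
        if (n != youngest) { unlink(n); linkFront(n); }
    }
}
```

Every operation here is O(1): adds go to the front, evictions come off the back, and a hit splices its node out of the middle and relinks it at the front.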
The commenter mentions that in the last sentence. There is some separate data structure that also has references to nodes in the middle of the linked list.
He says it's because of cache misses due to each element being allocated individually.
But then working with references is bad in general? If your vector stores references to objects it's bad? Your object fields point to strings - bad too.
Should we ditch Lisp and Java?
Note, data needn't be contiguous to stay in cache. Cache is more like paged virtual memory; it consists of cache lines.
Before subscribing to the "never use linked lists" view I would like to see a nontrivial program rewritten from lists and references to the "allocate everything at once" approach and measure the performance.
If you want to avoid references and allocate everything in place, it may force your program to do a lot of copying.
Also, not only CPU performance matters - programmer performance is important too. So-called "high-level languages" try to optimise this.
>But then working with references is bad in general? If your vector stores references to objects it's bad? Your object fields point to strings - bad too.
It's not strictly bad, but it's useful to minimize the number of pointer dereferences you have wherever you can. A non-intrusive linked list will have 2 pointer dereferences to access any bit of data. You'll also have n+1 pointer dereferences to access element n. If you have fixed-size small objects, then a vector of values is almost always better than a vector of pointers to the small objects. An intrusive linked list will save you a pointer dereference, but you still have the n dereferences to access element n.
>Should we ditch Lisp
Lisp's lists model a linked list with chains of cons cells, but there's no hard requirement for them to actually be implemented as linked lists. A typical approach is to implement lists in terms of Bagwell's VLists, which are a kind of middle ground between linked lists and vectors. You have a reduced number of pointer dereferences, plus increased cache locality, whilst still being able to index, insert, and delete in O(log n), and without requiring large up-front allocations.
>and Java
If you subscribe to the "everything is an object" model religiously, then yes, you're probably doing harm. As always, there's no hard rules here and it always depends on your problem and data access requirements. You can usually get performance gains by using memory pools, arrays of structures/primitives, and entity systems instead of inheritance hierarchies.
You're misunderstanding the advice about contiguous storage. It's not that it's more likely to stay in cache, but that if you're accessing data sequentially, the data you're going to access next is more likely to already be in cache.
Most (all I've read/looked at) benchmarks in Java have data structures backed by linked lists utterly smashed by their array-backed equivalents.
There was, in the last year or two, a good C++ talk where a game or game engine changed their storage to be more array-like and got roughly a 30% performance boost. Memory locality is usually king, which is why linked lists are rarely used.
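The locality gap is easy to glimpse in plain Java. This is only a sketch, not a rigorous benchmark (use JMH for real numbers), and the class and method names are made up:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Rough locality demo: sum the same values through an ArrayList (contiguous
// backing array) and a LinkedList (one heap node per element). A sketch
// only; JIT warmup and GC effects make nanoTime deltas noisy.
class LocalityDemo {
    static long sum(List<Integer> xs) {
        long total = 0;
        for (int x : xs) total += x;
        return total;
    }

    public static void main(String[] args) {
        final int n = 1_000_000;
        List<Integer> array = new ArrayList<>();
        List<Integer> linked = new LinkedList<>();
        for (int i = 0; i < n; i++) { array.add(i); linked.add(i); }

        sum(array); sum(linked); // warm up the JIT a little

        long t0 = System.nanoTime();
        long a = sum(array);
        long t1 = System.nanoTime();
        long b = sum(linked);
        long t2 = System.nanoTime();

        System.out.printf("ArrayList:  sum %d in %d us%n", a, (t1 - t0) / 1000);
        System.out.printf("LinkedList: sum %d in %d us%n", b, (t2 - t1) / 1000);
    }
}
```

On typical hardware the ArrayList walk wins handily, and note this is with boxed Integers; a plain int[] widens the gap further.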
If you already know your data (which must be the case with arrays anyway), you can simply pre-allocate a pool from where the elements of the linked list will be taken. This guarantees that elements will be allocated from the same region in memory. It is a simple technique that I use whenever I believe that a linked list will be needed in a high-performance context.
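The same pool idea can be sketched even in Java by backing the list with pre-allocated parallel arrays and using indices in place of pointers. This is a hypothetical illustration (names are mine), with -1 standing in for null:

```java
// Pool-backed singly linked int list: all nodes live in pre-allocated
// parallel arrays, so every node sits in one contiguous memory region.
// Indices replace pointers; -1 means "null".
class PooledIntList {
    private final int[] value;
    private final int[] next;
    private int free;      // head of the free list
    private int head = -1; // head of the live list

    PooledIntList(int capacity) {
        value = new int[capacity];
        next = new int[capacity];
        for (int i = 0; i < capacity - 1; i++) next[i] = i + 1; // chain free slots
        next[capacity - 1] = -1;
        free = 0;
    }

    void push(int v) {
        if (free == -1) throw new IllegalStateException("pool exhausted");
        int slot = free;
        free = next[slot]; // take a slot off the free list
        value[slot] = v;
        next[slot] = head; // link it onto the live list
        head = slot;
    }

    int pop() {
        if (head == -1) throw new IllegalStateException("empty");
        int slot = head;
        head = next[slot]; // unlink from the live list
        next[slot] = free; // return the slot to the free list
        free = slot;
        return value[slot];
    }
}
```

Freed slots are recycled through the free list, so long-running insert/remove churn stays inside the same small region of memory.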
Vectors are also usually more memory-efficient, period. A singly-linked list uses a pointer for each element and a doubly-linked list uses two, while a vector has constant overhead [a pointer and two integers (size and capacity)]. (Unless the vector was dynamically resized and isn't near capacity.)
Cache consists of cache lines (usually 64 bytes). If an element in the list is smaller than the cache line, a whole line is put in the cache anyway. This means data you have not read yet gets cached. When using a contiguous array, this unread data contains the next elements, so they are cached without ever having been read.
Essentially, many workloads access data sequentially and therefore modern cache architectures have special optimizations to make these memory accesses as fast as possible, by prefetching the next item in the sequence before it is actually needed.
Well, anyways I value code simplicity more than anything.
Maybe CPUs will learn to do cache prefetch for link-oriented languages: if a "load and dereference" pattern is detected, the CPU could prefetch the data referenced by pointers it has in registers or in a recently fetched cache line.
BTW, linked list elements are not necessarily located far away from each other; if we allocate them one after another, chances are they are near each other in memory.
Dependent on various internal arbitrary or intentional decisions, copying garbage collectors can in their normal workings place linked list cells in order in memory. While this does still take more memory than arrays, it can take advantage of the caching & sequential auto prefetch as well.
> data [needn't] be contiguous to stay in cache. Cache is more like paged virtual memory, it consists of cache lines.
It sounds like you're saying everything will be OK so long as each chunk of 64 bytes (cache line) is used together. But one page is typically 4 KB, and if you use for example all 64 bytes of one cache line, but only one cache line per page, you will suffer from TLB misses.
If you read one field of all elements all at once, sure, a structure of arrays is probably right. But what if you instead read all fields of one element at once? Isn’t it advantageous to have those fields adjacent in memory with an array of structures?
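Both layouts can be sketched in Java, with the caveat that a Java object array is really an array of references, so true array-of-structures layout needs primitives (or value types); treat this as an approximation, with hypothetical names:

```java
// AoS vs SoA sketch (hypothetical Particle data). Reading ONE field of ALL
// elements favors SoA: the x values are packed together. Reading ALL fields
// of ONE element favors AoS: that element's fields share a cache line.
class Layouts {
    // "Array of structures": each particle's fields are adjacent
    // (modulo Java storing references in the array rather than the objects).
    static final class Particle { double x, y; }

    static double sumXAos(Particle[] ps) {
        double s = 0;
        for (Particle p : ps) s += p.x; // drags y (and object headers) through cache too
        return s;
    }

    // "Structure of arrays": each field gets its own packed primitive array.
    static double sumXSoa(double[] xs) {
        double s = 0;
        for (double x : xs) s += x; // touches only the bytes it needs
        return s;
    }
}
```

So the honest answer is: pick the layout that matches the dominant access pattern, and neither is universally right.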
The problem with trying to substitute lists with vectors is that their iterators behave differently. I.e. vector iterators point to positions rather than elements and are prone to being invalidated. So sometimes it's nice to have a vector that supports iterators that behave like list iterators[1].
You mean how it's implemented? Umm, well it's been a while, but basically an ipointer is a proxy for an iterator that is stored internally by the msevector. These "internally stored" iterators are updated when necessary. For example, when insert() or erase() is called.
One nice thing about it is that it roughly conforms to the principle of "only pay for what you use". That is, the run-time cost is roughly proportional to the number of ipointers you have and the frequency of operations that modify the size of the vector.
One caveat is that this mechanism is not thread safe. But whenever you need to share the vector among threads, you can swap it with a vector that is safe to share[1].
And for those that are into memory safety, there is also a memory-safe vector[2] that supports ipointers.
Is this the sort of explanation you're looking for?
Thanks, that's clear enough :-). In hindsight I can't imagine what alternative I was thinking of .. I had some idea you might have put the additional cost in the iterator by maintaining only an epoch counter in the vector, but that's obviously not enough to do the right thing in the presence of insert and erase.
Your library looks like a good toolset. While I still find the code pretty impenetrable, the number of tests I can see give me confidence. Bookmarked for reference when I'm using C++ again.
For instance, my current use case is an STM32F, which accesses CCM in a single cycle. And intrusive linked lists are a godsend for managing pools in embedded systems without traditional memory management.
C++ folks tend to dislike linked lists because it's awkward to make C++ lists intrusive. The STL list containers store the pointers in a separate allocation from the object itself, so they're slower than they ought to be.
Where did you get that? The STL implementations I know definitely don't. See https://github.com/llvm-mirror/libcxx/blob/master/include/li... for an example - __list_node_base contains the pointers, __list_node contains the object and derives from __list_node_base, so allocating the node with object and pointers is one allocation.
Yeah, that's true for values you're willing to allocate inline. The key advantage of intrusive linked lists is being able to add and remove normally allocated objects without having to construct separate list nodes. The STL does have splice(), but it's awkward -- you always have to keep the object in a list and refer to it using iterators instead of normal references.
Surfingkeys has a couple of features over cVim that I really like:
- pressing '?' on any page will open the list of shortcuts
- searching through bookmarks / history ('t') is prettier
- a very useful feature of switching through all the current open tabs with link hints ('T')
There's one shortcoming I noticed, however: when you enter link hint mode, there's no indication that you've entered it. In cVim there's a "follow link" notice in the lower right corner.
I'm a fairly casual user though, someone else will probably find more differences and pros / cons for each extension
I've tried a bunch of these "vim-like" plugins and I always come back to Vimium (on Chrome). On Firefox I'm stuck with Vimperator, which means I don't use Firefox very often.
Why so? Firefox is still my browser du jour mainly because of Vimperator. I also use Vimium when in chrome, but Vimperator is way more powerful, and plugins are trivial to write: https://vimpr.github.io/plugins-en.html