
Naive question, but is using old school spinning HDs completely out of the question? In other words, can they play a role in a modern deployment?


Unfortunately, for databases under heavy load it's out of the question.

With spinning disks you can have a server pegged at 100% load waiting on disk; without changing anything else on the server (RAM or CPU), just replacing the drives with SSDs can quickly drop that load down to 10%.


Random reads are an order of magnitude slower, so for fast access there's a notable difference. There's still a role for hard drives with sequential reads/writes, such as the regular backups they should have been taking! Slower in all areas, but cheaper for the same space.


Spinning hard drives are the tape drives of the modern era. Boot, logs, backups, but no databases.


Create an account by clicking "Register" at the top of the checkout page. You should then see the discounted price, at least until tomorrow (it seems the sale runs until then).


Thanks!


It did for me. Just click "Register" at the top of the checkout page.


The standard machine has 32 GB. So I guess 0.32 hundreds... And for completeness, 6 physical cores, 12 logical with HT. Still beefy, but nothing out of the ordinary.


I would venture to say that most of the world isn't using computers with 32GB RAM...


I consider 2GB acceptable, 4GB normal, and 8GB high-end (maybe even overkill).


Strange, that's the metric I personally used around 2008. I haven't run into a computer with less than 16GB for the last few years.


A rather popular hardware site in a developing country, but one on the upper range of "developing". So probably better off than 60% of the world's population:

399 laptops with 4GB of RAM: https://www.pcgarage.ro/notebook-laptop/filtre/memorie-capac...

548 laptops with 8GB of RAM: https://www.pcgarage.ro/notebook-laptop/filtre/memorie-capac...

Less than 200 laptops, in total, with 16GB RAM or more. So probably less than 20% of the laptops on sale have that much memory, in 2018.

The "real" world is quite different from what we folks in IT see in our corner of the world.


8GB is still a pretty common setup:

* Literally every current MacBook except the 15" MBP

* iMacs unless you choose to upgrade the RAM

* My local PC retailer's website has 34 models with 4GB of RAM, 84 with 8GB, and 28 with 16GB. The cheapest 16GB laptop there is €1100. They also sell 3 Chromebooks and 1 Windows 10 device with 2GB of RAM.


Especially since 32GB of DDR4 RAM currently costs ~$350.


You are so wrong. Check here: http://store.steampowered.com/hwsurvey/

These are gaming PCs, the most powerful PCs people are using. 8GB is 44% of all users.


I believe dgquintas is describing the standard engineering workstation at Google.


>most powerful PCs people are using

No, workstations and/or graphics stations are powerful. Gaming rigs can be powerful, but most of the time they are average or just above average.

Also, Steam's statistics will show you a great number of PCs that are being used only to play some indie games and/or Dota. 8GB is more than enough for that.


Depends on how you define "power". I'm sure devs often have more RAM for VMs etc., where gamers have more GPU for graphics. I'm building a gaming rig right now, and for the games I want to play, 8GB of RAM is plenty.


But if you click on the RAM section, 40% have 12GB or more.


yeah, sorry, I was referring specifically to the Google developer standard workstations.


Batch reply to the other comments: the "standard machine" here refers to what Google developers are using.


Are you talking standard Google developer workstation specs? That's not much different than the thinkservers we use as desktops. But considerably beefier than what non-desktop devs might be accustomed to.


Surely 16GB was more common, at least until recently? I haven't seen many 32GB RAM MBPs in the wild. Actually, I've never seen one.


Apple doesn't even offer a 32GB option for their MacBook Pro computers. 32GB is far from standard.


Yes, they don't offer that on their laptops, but what the OP is talking about is a desktop machine. Apple definitely offers 32 GB in their desktops.


OP is talking about the machines Google devs use, and I am betting they are definitely not desktop machines.


He was describing the standard desktop workstation at Google.

My understanding is that they have rMBP laptops and slightly beefier desktops.


Some people have Macs, some other people have ThinkPads (at other points in time, there have been Dells or HPs as well). But only desktops are actually used for any development.


They are desktop machines.


320 = 3.2 hundreds not .32 hundreds. 3 hundreds, 2 tens, and 0 ones.


But, he said 32 GB, not 320, so it would be 0.32 hundreds.


Right. What do you get when you divide 32GB by 100MB per tab? It's not 0.32 hundreds. It is 3.2 hundreds.

> The standard machine has 32 GB. So I guess 0.32 hundreds

His comment was in response to "Opera 12.x takes ~3-10MB per tab (+ the weight of the media used on the page)".


"Just say no to linked lists!" https://youtu.be/fHNmRkzxHWs?t=2099


Look in the Linux kernel to see how often linked lists are used. A process's memory mappings, for example, are maintained as a linked list in parallel with a balanced tree. Lots of queues are maintained as linked lists. Free slabs in the slab allocator are kept in a linked list. The list goes on.


Yeah, until you need to remove an element from the middle in constant time - this is sometimes useful for things like the implementation of an LRU cache.


If you iterate over the list within an order of magnitude as often as you remove an element from the middle, an array will still be faster despite not having constant time removal, no matter the size of the list.

To restate, if you have a workload where you iterate over your collection ~500 times, and remove an element from the middle ~1000 times, an array will usually outperform a linked list on a modern computer, no matter the size of the list.

It is not until you do removes/adds far more often, and far from the end of the collection, that the linked list will perform better overall.
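If you want to check that claim on your own machine, here's a rough sketch (the ~500/~1000 workload ratio from above, otherwise untuned) timing that mix on std::vector vs std::list:

    #include <chrono>
    #include <cstdio>
    #include <iterator>
    #include <list>
    #include <vector>

    // Interleaves 1000 middle-erases with 500 full iterations. Note: for the
    // list, std::advance to the middle is itself O(n); with a saved node
    // handle (as in an LRU) the list side would do better.
    template <typename Container>
    double run(Container& c, int erases) {
        auto t0 = std::chrono::steady_clock::now();
        long long sum = 0;
        for (int i = 0; i < erases; ++i) {
            auto it = c.begin();
            std::advance(it, c.size() / 2);  // walk to the middle
            c.erase(it);
            if (i % 2 == 0)                  // full pass every other erase
                for (int v : c) sum += v;
        }
        auto t1 = std::chrono::steady_clock::now();
        std::printf("(sum=%lld) ", sum);     // defeat dead-code elimination
        return std::chrono::duration<double>(t1 - t0).count();
    }

    int main() {
        std::vector<int> v(100000, 1);
        std::list<int> l(100000, 1);
        std::printf("vector: %f s\n", run(v, 1000));
        std::printf("list:   %f s\n", run(l, 1000));
    }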


That's why I mentioned the LRU implementation. It does not iterate over the list, only adds elements to the front, deletes from the back and moves from the middle to the front. Refs to nodes are stored in a map, so iteration is not necessary to find an element.


But how do you find the element to be moved from the middle if not by iterating over the list?


You use a hash map, and make the linked list a linked list of hash entries. So, if you're using chaining (as opposed to the many variations on open addressing):

    class Entry<K,V> {
        Entry<K,V> younger;  // LRU doubly linked list
        Entry<K,V> older;    // LRU doubly linked list
        Entry<K,V> next;     // for hash map bucket chaining
        V value;
        K key;
    }
The naive solution using two separate data structures would look like:

    class MapEntry<K,V> {
        MapEntry<K,V> next;  // for hash map bucket chaining
        DoublyLinkedListEntry<MapEntry<K,V>> lruEntry;
        K key;
        V value;
    }

    class DoublyLinkedListEntry<T> {
        DoublyLinkedListEntry<T> prev;
        DoublyLinkedListEntry<T> next;
        T value;
    }


The commenter mentions that in the last sentence. There is some separate data structure that also has references to nodes in the middle of the linked list.


Or, you've removed one indirection and made the linked list actually a part of another data structure, as in how LRU caches are generally implemented.


The entire LRU uses multiple data structures, but the ability to efficiently remove from the middle of the linked list is still important.


Oy! I can't believe I missed that.


He says because of cache misses due to each element being allocated individually.

But then working with references is bad in general? If your vector stores references to objects it's bad? Your object fields point to strings - bad too.

Should we ditch Lisp and Java?

Note, data need not be contiguous to stay in cache. Cache is more like paged virtual memory; it consists of cache lines.

Before subscribing to the "never use linked lists" view, I would like to see a nontrivial program rewritten from lists and references to the "allocate everything at once" approach, and its performance measured.

If you want to avoid references and allocate everything in place, it may force your program to do a lot of copying.

Also, not only CPU performance matters - programmer performance is important too. So-called "high-level languages" try to optimise this.


>But then working with references is bad in general? If your vector stores references to objects it's bad? Your object fields point to strings - bad too.

It's not strictly bad, but it's useful to minimize the number of pointer dereferences wherever you can. A non-intrusive linked list will have 2 pointer dereferences to access any bit of data. You'll also have n+1 pointer dereferences to access element n. If you have fixed-size small objects, then a vector of values is almost always better than a vector of pointers to the small objects. An intrusive linked list will save you a pointer dereference, but you still have the n dereferences to access element n.
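To illustrate the vector case (hypothetical Widget type):

    #include <vector>

    struct Widget { int a, b; };

    std::vector<Widget>  by_value;    // elements contiguous; iteration touches
                                      // consecutive cache lines
    std::vector<Widget*> by_pointer;  // one extra dereference per element, and
                                      // the Widgets may be scattered on the heap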

>Should we ditch Lisp

Lisp's lists model a linked list with chains of cons cells, but there's no hard requirement for them to actually be implemented as linked lists. A typical approach is to implement lists in terms of Bagwell's VLists, which are a kind of middle ground between linked lists and vectors. You get a reduced number of pointer dereferences, plus increased cache locality, while still supporting O(log n) index, insert, and delete, and not requiring large up-front allocations.

>and Java

If you subscribe to the "everything is an object" model religiously, then yes, you're probably doing harm. As always, there's no hard rules here and it always depends on your problem and data access requirements. You can usually get performance gains by using memory pools, arrays of structures/primitives, and entity systems instead of inheritance hierarchies.


You're misunderstanding the advice about contiguous data. It's not that contiguous data is more likely to stay in cache, but that if you're accessing data sequentially, it's more likely the data you're going to access next is already in cache.

Most (all I've read/looked at) benchmarks in Java have data structures backed by linked lists utterly smashed by ones implemented with arrays.

There was, in the last year or two, a good C++ talk where a game or game engine changed its storage to be more array-like and got roughly a 30% boost in performance. Memory locality is usually king, which is why linked lists are rarely used.



If you already know your data (which must be the case with arrays anyway), you can simply pre-allocate a pool from where the elements of the linked list will be taken. This guarantees that elements will be allocated from the same region in memory. It is a simple technique that I use whenever I believe that a linked list will be needed in a high-performance context.
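A minimal sketch of that technique (all names hypothetical), assuming the element type is default-constructible:

    #include <cstddef>
    #include <vector>

    template <typename T>
    class NodePool {
        struct Node { T value; Node* next; };
        std::vector<Node> storage_;   // one contiguous allocation up front
        std::size_t used_ = 0;

    public:
        explicit NodePool(std::size_t capacity) : storage_(capacity) {}

        // Hands out nodes in address order; returns nullptr when exhausted.
        // A fuller version would also recycle freed nodes via a free list.
        Node* allocate(const T& v) {
            if (used_ == storage_.size()) return nullptr;
            Node* n = &storage_[used_++];
            n->value = v;
            n->next = nullptr;
            return n;
        }
    };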


Vectors are also usually more memory-efficient, period. A singly-linked list uses a pointer for each element and a doubly-linked list uses two, while a vector has constant overhead [a pointer and two integers (size and capacity)]. (Unless the vector was dynamically resized and isn't near capacity.)


> if you're accessing data sequentially it's more likely the data you're going to access next is already in cache

Why?


Cache consists of cache lines (usually 64 bytes). If an element in the list is smaller than the cache line, a whole line is put in the cache anyway. This means that data is cached that you have not read yet. When using a contiguous array, this unread data contains the next elements. So they are cached without ever having been read.
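A rough way to see the effect (numbers will vary by machine and compiler): sum the same values once through a contiguous array and once by chasing deliberately shuffled pointers:

    #include <algorithm>
    #include <chrono>
    #include <cstdio>
    #include <numeric>
    #include <random>
    #include <vector>

    struct Node { int value; Node* next; };

    int main() {
        const int N = 1 << 20;
        std::vector<int> arr(N, 1);

        // Link the nodes in shuffled order so each hop likely lands on a
        // cache line that hasn't been read yet.
        std::vector<Node> nodes(N);
        std::vector<int> order(N);
        std::iota(order.begin(), order.end(), 0);
        std::shuffle(order.begin(), order.end(), std::mt19937{42});
        for (int i = 0; i < N - 1; ++i)
            nodes[order[i]] = {1, &nodes[order[i + 1]]};
        nodes[order[N - 1]] = {1, nullptr};

        auto time = [](const char* label, auto f) {
            auto t0 = std::chrono::steady_clock::now();
            long long s = f();
            auto t1 = std::chrono::steady_clock::now();
            std::printf("%s sum=%lld %f s\n", label, s,
                        std::chrono::duration<double>(t1 - t0).count());
        };

        time("array:", [&] {
            long long s = 0;
            for (int v : arr) s += v;
            return s;
        });
        time("list: ", [&] {
            long long s = 0;
            for (Node* n = &nodes[order[0]]; n; n = n->next) s += n->value;
            return s;
        });
    }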


Cache prefetching. (https://en.wikipedia.org/wiki/Cache_prefetching)

Essentially, many workloads access data sequentially and therefore modern cache architectures have special optimizations to make these memory accesses as fast as possible, by prefetching the next item in the sequence before it is actually needed.


Ah, yes.

Well, anyway, I value code simplicity more than anything.

Maybe CPUs will learn to do cache prefetch for link-oriented languages: if a "load and dereference" pattern is detected, the CPU could prefetch the data referenced by pointers it has in registers or in a recently fetched cache line.

BTW, linked list elements are not necessarily located far away from each other; if we allocated them one after another, chances are they are near each other in memory.


The time complexity of insert etc. is superior, and a good reason for an abstraction on top of an array, I would think.


Dependent on various internal arbitrary or intentional decisions, copying garbage collectors can in their normal workings place linked list cells in order in memory. While this does still take more memory than arrays, it can take advantage of the caching & sequential auto prefetch as well.


Common Lisp has value types; it is not all about lists.

Likewise, Java has primitive types and arrays, and will eventually get value types, because they already feel the pressure in FinTech of not having them.


> data need not be contiguous to stay in cache. Cache is more like paged virtual memory; it consists of cache lines.

It sounds like you're saying everything will be OK so long as each chunk of 64 bytes (cache line) is used together. But one page is typically 4 KB, and if you use for example all 64 bytes of one cache line, but only one cache line per page, you will suffer from TLB misses.


If you throw your computer out of a window it may break. Why would I access only 64 bytes per page? I suggest you delete this comment.

We are not discussing VM here; I brought it up as an analogy just to say cache lines are independent and need not be contiguous.


Yes, organizing your data in an array of structures is also bad for performance.


Doesn’t this entirely depend on access patterns?

If you read one field of all elements all at once, sure, a structure of arrays is probably right. But what if you instead read all fields of one element at once? Isn’t it advantageous to have those fields adjacent in memory with an array of structures?
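For concreteness, the two layouts I mean (hypothetical Particle fields):

    // Array of structures: all fields of one element adjacent in memory.
    // Good when each visit touches every field of an element.
    struct Particle { float x, y, z, mass; };
    Particle aos[1024];

    // Structure of arrays: each field contiguous across elements.
    // Good when a pass reads one field of many elements (e.g. just mass).
    struct Particles {
        float x[1024], y[1024], z[1024], mass[1024];
    };
    Particles soa;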


The problem with trying to substitute lists with vectors is that their iterators behave differently. I.e. vector iterators point to positions rather than elements and are prone to being invalidated. So sometimes it's nice to have a vector that supports iterators that behave like list iterators[1].

[1] shameless plug: https://www.codeproject.com/Articles/1087021/Stable-Iterator...


Can you briefly describe how the msevector + ipointer works? I tried to look at the code but dense C++ is not my forte.


You mean how it's implemented? Umm, well it's been a while, but basically an ipointer is a proxy for an iterator that is stored internally by the msevector. These "internally stored" iterators are updated when necessary. For example, when insert() or erase() is called.

One nice thing about it is that it roughly conforms to the principle of "only pay for what you use". That is, the run-time cost is roughly proportional to the number of ipointers you have and the frequency of operations that modify the size of the vector.
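As a toy illustration of that mechanism (not the actual msevector code, just the general idea): the container keeps a registry of the positions its ipointers hold and patches them on erase:

    #include <cstddef>
    #include <vector>

    template <typename T>
    class StableVectorSketch {
        std::vector<T> data_;
        std::vector<std::size_t*> registered_;  // positions held by proxies

    public:
        // An ipointer-like proxy would call register_ from its constructor
        // (and unregister in its destructor, omitted here for brevity).
        void register_(std::size_t* pos) { registered_.push_back(pos); }

        void erase(std::size_t pos) {
            data_.erase(data_.begin() + static_cast<std::ptrdiff_t>(pos));
            // Cost proportional to the number of registered proxies, i.e.
            // "only pay for what you use".
            for (std::size_t* p : registered_)
                if (*p > pos) --(*p);
        }
    };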

One caveat is that this mechanism is not thread safe. But whenever you need to share the vector among threads, you can swap it with a vector that is safe to share[1].

And for those that are into memory safety, there is also a memory-safe vector[2] that supports ipointers.

Is this the sort of explanation you're looking for?

[1] https://github.com/duneroadrunner/SaferCPlusPlus#nii_vector

[2] https://github.com/duneroadrunner/SaferCPlusPlus#ivector


Thanks, that's clear enough :-). In hindsight I can't imagine what alternative I was thinking of... I had some idea you might have put the additional cost in the iterator by maintaining only an epoch counter in the vector, but that's obviously not enough to do the right thing in the presence of insert and erase.

Your library looks like a good toolset. While I still find the code pretty impenetrable, the number of tests I can see give me confidence. Bookmarked for reference when I'm using C++ again.


Eh, like anything it depends on your use case.

For instance, my current use case is an STM32F, which accesses CCM in a single cycle. And intrusive linked lists are a godsend for managing pools in embedded systems without traditional memory management.
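A minimal sketch of that pattern (sizes and names hypothetical): a static pool whose free blocks are chained through their own link field, so there's no separate allocator metadata at all:

    #include <cstddef>

    struct Block {
        Block* next;             // intrusive link, doubles as free-list chain
        unsigned char payload[60];
    };

    static Block pool[32];       // static storage: no heap, MCU-friendly
    static Block* free_head = nullptr;

    void pool_init() {
        for (std::size_t i = 0; i + 1 < 32; ++i)
            pool[i].next = &pool[i + 1];
        pool[31].next = nullptr;
        free_head = &pool[0];
    }

    Block* pool_alloc() {        // O(1), no fragmentation
        Block* b = free_head;
        if (b) free_head = b->next;
        return b;
    }

    void pool_free(Block* b) {   // O(1)
        b->next = free_head;
        free_head = b;
    }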


C++ folks tend to dislike linked lists because it's awkward to make C++ lists intrusive. The STL list containers store the pointers in a separate allocation from the object itself, so they're slower than they ought to be.


Where did you get that? The STL implementations I know definitely don't. See https://github.com/llvm-mirror/libcxx/blob/master/include/li... for an example - __list_node_base contains the pointers, __list_node contains the object and derives from __list_node_base, so allocating the node with object and pointers is one allocation.


Yeah, that's true for values you're willing to allocate inline. The key advantage of intrusive linked lists is being able to add and remove normally allocated objects without having to construct separate list nodes. The STL does have splice(), but it's awkward -- you always have to keep the object in a list and refer to it using iterators instead of normal references.
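For reference, the splice() dance (the iterator bookkeeping is the awkward part):

    #include <iterator>
    #include <list>

    std::list<int> active, idle;

    void retire_last() {
        active.push_back(42);
        // You must have kept an iterator to the element; a plain pointer or
        // reference to the int itself is not enough for splice().
        std::list<int>::iterator it = std::prev(active.end());

        // O(1) move, no allocation, no copy -- but only list-to-list.
        idle.splice(idle.end(), active, it);
        // 'it' remains valid and now refers to the element inside 'idle'.
    }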


A data structure with the linked list inline is the other way around, e.g.:

    class Foo {
      Foo *m_next;          // e.g. hash bucket chain
      Foo *m_next_mru;      // e.g. MRU/LRU list
      Foo *m_next_sibling;  // e.g. sibling list; one object, three lists
    };
With the STL implementation, only one list can have the item inlined; all the other linked lists need two indirections to get to the next element.


Still wondering why they are being taught in programming courses.

Although there are so many things that are taught that should never be used in the real world.


Still, Amazon won't stop asking about LinkedList in every damn interview.


That's hardly a response to anything. The market moves by that amount (and more) every day.


1.07, actually. Title is wrong.


Gah, sorry. Touch-typing failed me again.


I believe it should also be a 17-year beta test, since 1.06 was released in 2000.


The last beta "release" (which AFAIK everybody has been using) was 1.06.95, in 2006


macOS hasn't been. Which isn't surprising, since 1.06.95 switched to GPLv3 (like most GNU projects).


It seems an earlier related draft (s/Programs/Code) is available at http://repository.cmu.edu/cgi/viewcontent.cgi?article=3435&c...


Cool, thanks. Will check it out.




And cVim: https://github.com/1995eaton/chromium-vim

Now how do they compare?


Surfingkeys has a couple of features over cVim that I really like:

- Pressing '?' on any page will open the list of shortcuts.

- Searching through bookmarks / history ('t') is prettier.

- Very useful feature of switching through all the current open tabs with link hints ('T').

There's one shortcoming I noticed, however: when you enter link hint mode, there is no indication that you entered it. In cVim there's a "follow link" notice in the lower right corner.

I'm a fairly casual user though; someone else will probably find more differences and pros/cons for each extension.


As a long time cVim user I've found out that 99% of my browsing can be done with its scroll up/down and open link features.

When combined with this switch to recent tabs extension, the mouse becomes almost irrelevant: https://chrome.google.com/webstore/detail/toggle-switch-rece...


Vimium has `visitPreviousTab` built in for toggling between two tabs.


There is something like this in Surfingkeys:

B - go one tab history back

F - go one tab history forward


I've tried a bunch of these "vim-like" plugins and I always come back to Vimium (on Chrome). On Firefox I'm stuck with Vimperator, which means I don't use Firefox very often.


Why so? Firefox is still my browser du jour mainly because of Vimperator. I also use Vimium when in chrome, but Vimperator is way more powerful, and plugins are trivial to write: https://vimpr.github.io/plugins-en.html


You should check out VimFx for Firefox; it's modeled after Vimium but I like it even better.


I LOVE the scrolling with j & k; it's the smoothest scrolling experience I've ever seen!

