LMQL (and guidance, https://github.com/guidance-ai/guidance) are much less efficient: they loop over the entire vocabulary at each decoding step, whereas we only do it once, at initialization.
Does looping over the vocabulary add much overhead to the tok/s? I imagine they're just checking if the input is in a set, and usually there are only ~30k tokens. That's somewhat intensive, but inference on the neural net feels like it'd take longer.
They’re checking regex partial matches for each possible completion, which is indeed intensive. You can look at Figure 2 in our paper (link in original post) for a simple comparison with MS guidance, which shows the difference.
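To make the difference concrete, here is a rough Python sketch of the two strategies. The toy vocabulary, the digits-only pattern, and the one-state "index" are made up for illustration; this is not the actual guidance/LMQL/Outlines code.

```python
import re

vocab = ["Hello", " world", "4", "42", "7!", "."]  # toy stand-in for a ~30k-token vocabulary
pattern = re.compile(r"[0-9]*")                    # constrain the output to digits

# Per-step approach (guidance/LMQL style): at every decoding step, re-test
# every token in the vocabulary against the pattern, i.e. O(|vocab|) regex
# evaluations per generated token. (Real implementations check *partial*
# matches; with this digits-only pattern, full and partial matches coincide.)
def allowed_per_step(prefix: str) -> set[str]:
    return {t for t in vocab if pattern.fullmatch(prefix + t)}

# Index-once approach (what the paper describes): the regex is a finite-state
# machine, so you can precompute which tokens are valid from each FSM state a
# single time at initialization. Each decoding step is then a dictionary
# lookup. The digits-only pattern has effectively one state, hence one entry.
allowed_from_state = {0: {t for t in vocab if pattern.fullmatch(t)}}

print(allowed_per_step("12"))   # scans the whole vocabulary again
print(allowed_from_state[0])    # no regex work at decode time
```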
This can get pretty pedantic. When counting lines of code, where do you draw the line between the blog generator itself and the tools it relies on? One could easily argue that in this case you should be counting the lines of code in pandoc, not this bash script.
That said, I do think this is the way to go: using a popular, generic tool (notably, one you don't have to maintain) to accomplish a specific task, and, more importantly, composing utilities together in a succinct and efficient way.
Also, if you used semicolons, or xargs with a pipe, you could make this one line :) Newlines are a pretty arbitrary measure; I wonder if there's a better measurement of simplicity, like branches or statements/expressions.
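For what it's worth, here's a quick sketch of what counting statements instead of lines could look like, using Python's ast module on a made-up one-liner (the script under discussion is bash, so treat this purely as an illustration of the metric):

```python
import ast

source = "x = 1; y = 2; print(x + y)"  # one line, three statements

tree = ast.parse(source)
n_statements = sum(isinstance(node, ast.stmt) for node in ast.walk(tree))
n_expressions = sum(isinstance(node, ast.expr) for node in ast.walk(tree))

# Line count says 1; the statement count (3) is harder to game with semicolons.
print(n_statements, n_expressions)
```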
The real story here is not the 60 lines, but the literate programming style used to write them.
Aside from that, this approach is very similar to the generator by Marijn Haverbeke (the CodeMirror author), although your 60 lines do lean more heavily on third-party packages.