Computer Age Statistical Inference: Algorithms, Evidence and Data Science (stanford.edu)
183 points by Schiphol on Aug 18, 2017 | 27 comments



Tangential: for anyone reading the PDF,

  pdfcrop --verbose --margins "20 30 20 30" --bbox "110 180 440 740" casi.pdf
trims the over-large margins without clipping any content.

http://manpages.ubuntu.com/manpages/precise/en/man1/pdfcrop....


For these parameters, the latest pdfcrop (v0.4b) dies with the following errors:

Error! Bounding Box borders imply page width of zero.
Error! Bounding Box borders imply page height of zero.


For those who know: how does this book differ from Foundations of Data Science by Blum/Hopcroft/Kannan?

http://www.cs.cornell.edu/jeh/book%20June%2014,%202017pdf.pd... ?


After looking through the Contents of both:

CASI looks very cutting edge, but also covers the origins of the use of computers for statistical methods. So it covers how computers can be, and have been, used to do statistics.

FoDS looks like a very thorough data science book. Less of a "then vs. now" book and more of a collection of what you need as a data scientist.

In CASI, I like the chapter on FDR (false discovery rate), a very important topic not found in FoDS (?!). FDR is critical for correcting for multiple testing and seems essential for data science, but maybe the authors consider it cutting edge rather than foundational. However, the wavelet chapter in FoDS makes me happy; wavelets are a very useful topic for series data.

Both great books, thanks for the links!
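
For readers who have not met FDR control, here is a minimal sketch of the Benjamini-Hochberg step-up procedure, one common way of controlling the false discovery rate under multiple testing. It is not taken from either book; the function name `benjamini_hochberg`, the level `q`, and the p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Illustrative sketch of the Benjamini-Hochberg step-up rule:
    sort the p-values, find the largest rank k with p_(k) <= k * q / m,
    and reject the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest 1-based rank whose p-value falls under its threshold.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff = rank

    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from ten tests.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p_values, q=0.05))
```

Note the step-up behaviour: a p-value that misses its own threshold is still rejected if some larger p-value further down the sorted list meets its threshold.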


Statistician/data scientist here. This is one of my favorite texts in the area. It frames and groups methods historically rather than mathematically. I've found it both a valuable teaching tool and an interesting read on its own. Highly recommended.


Another amazing book from Hastie/Efron. The ISLR book was my first foray into ML and landed me my current job. Will be sure to devour this one as well!


Could you tell me how exactly you landed your current job (and what it is), and what position you came from?


Currently a Data Scientist at an MBB consulting firm. Came from a quant role (as a Project Manager) at an Ibank. Everything I do in my current role is pretty much straight from ISLR.


Can you specify which book you mean, when you say "ISLR"?

Thanks!


This one: http://www-bcf.usc.edu/~gareth/ISL/

Not to be confused with this one: https://web.stanford.edu/~hastie/ElemStatLearn/ (a.k.a. ESL)


Do you happen to know how the two books compare to each other?


Elements is much more rigorous and also longer.


How do you approach a publisher with the free PDF model? Is it something they're generally open to? (e.g. you have a corpus of work on a website that you want to turn into a book)


How much math must one know to be able to read this book? Is this an introductory book?


It assumes you know a fair bit. McElreath's _Statistical Rethinking_ is a better introduction: http://xcelab.net/rm/statistical-rethinking/


I don't think Statistical Rethinking is a good book for general statistics at all.

It's more of a good introduction to Bayesian statistics. And it's super math/stats-lite.

If you really want to stick with Bayesian statistics and need a comprehensive foundation, a better book would be Doing Bayesian Data Analysis by John K. Kruschke.


This is not an introductory book at all. From the preface: “Our intention was to maintain a technical level of discussion appropriate to Masters’-level statisticians or first-year PhD students.”



Awesome! I love the recent trend of making these books freely available.



Efron is the creator of bootstrap.

Hastie iirc is one of the two responsible for LASSO, Ridge, and I think elasticnet.


Not ridge; that was popularised by Hoerl and Kennard (1970). However, the method of augmenting the diagonal of the X'X matrix (ridge regression) was studied by Tikhonov in the early 1940s as a method for solving ill-posed problems.

I believe Hastie has done lots of work on LASSO/L_1 norm related regularization methods and algorithms though.
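
To make "augmenting the diagonal of X'X" concrete, here is a minimal pure-Python sketch that solves (X'X + lam * I) beta = X'y. The helper names `solve` and `ridge`, the penalty `lam`, and the toy data are assumptions for illustration; a real analysis would use a numerical library and would typically centre and scale the columns first.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge(X, y, lam):
    """Ridge estimate: solve (X'X + lam * I) beta = X'y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    for i in range(p):
        xtx[i][i] += lam  # the diagonal augmentation
    return solve(xtx, xty)


# Toy ill-posed problem: the two columns are identical, so X'X is singular
# and ordinary least squares (lam = 0) has no unique solution.
X = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
y = [2.0, 4.0, 6.0]
print(ridge(X, y, lam=0.1))
```

With `lam = 0` this toy system cannot be solved (the elimination hits a zero pivot); any positive `lam` makes the matrix invertible and splits the weight evenly between the two identical columns.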


I can really recommend this book. It's an enjoyable read and is very pragmatic. A useful reference for practitioners.


I've read ISL and ESL. ISL is a practical introduction to concepts, and ESL goes deeper into the algorithms.

This book seems like a historical survey of how things came to be, rather than a practical guide. I believe this is not a book for practitioners, but for someone who is looking to advance the field (researchers, Ph.D. students).

As someone who was in academia for 12 years, I think it is immensely valuable to have a survey like this, because too often new researchers don't have a clear idea of what has been tried before and why it failed or succeeded.


Now let's hope that we can start using these methods more in science instead of e.g. p-values.


Effect size in addition to p-value, not instead of...


No, there are Bayesian replacements for p-values as well: https://replicationindex.wordpress.com/2015/04/30/replacing-...



