Introduction to Data Mining – Second Edition

monster_group · on Aug 22, 2018

I followed the first edition of this book during my master's degree Data Mining course. It is a good book for somebody getting into data mining, and not as math-heavy as [0], though it leans theoretical. For another excellent book, one that concentrates on building models in R, I highly recommend [1], which strikes a great balance between theory and practice. Fun fact regarding [1]: one of the authors is the daughter of the renowned physicist Ed Witten.

[0] https://web.stanford.edu/~hastie/ElemStatLearn/

[1] http://www-bcf.usc.edu/~gareth/ISL/

Puer · on Aug 22, 2018

I'm excited to take this class this fall at the U of M. I've heard that the fall course is more algorithmically rigorous than the spring course because the computer science department tries to open it up to non-CS majors at that time (biology majors interested in bioinformatics, for example). The professor, George Karypis, has a reputation on campus. I'm doing a dual degree in statistical science and CS, so I'm looking forward to seeing how this compares to a similar class offered by the stats department.

Phithagoras · on Aug 22, 2018

text available at http://gen.lib.rus.ec/search.php?req=introduction+to+Data+mi...

toonervoustosay · on Aug 22, 2018

you're linking to an illegal ebook sharing web site

this is against policy

zekrioca · on Aug 22, 2018

that's the 1st edition, not the 2nd as linked in the title

shawn · on Aug 22, 2018

True, but if you make the query a bit more general and sort by year, you get some wildly interesting results: http://gen.lib.rus.ec/search.php?&req=%E2%88%86introduction+...

shawn · on Aug 22, 2018

Thank you.

godelmachine · on Aug 22, 2018

Thanks for the link :)

mmcniece · on Aug 22, 2018

The 1st edition of this book was excellent. It gives a solid explanation of both data mining/ML techniques and the trade-offs in choosing among them.

The update to the classification chapter was needed. Previously, the section on SVMs and ANNs was a subpart of a chapter, spanning no more than 10 pages; glad they added more detail there.

They also spend time in the early chapters talking about preprocessing and cleaning data, something that is often glossed over.

technofiend · on Aug 22, 2018

At textbook pricing this is not a casual purchase. Hopefully this is not a trend; I'd hate to see the next O'Reilly book priced north of $60.

johnsonjo · on Aug 23, 2018

Well, I’m pretty sure it is a textbook (by Pearson, a textbook publisher) put together by research PhDs in data mining, so textbook pricing seems appropriate to me. I’m more worried about whether all that profit just goes to the publishers.

oyebenny · on Aug 22, 2018

I'd really like to know if their Valentine's Day publishing date had a meaning.

mark_l_watson · on Aug 22, 2018

Looks great. I just looked through the 3 free chapters and saved them for later reading.

blackmario · on Aug 22, 2018

Thanks!

shawn · on Aug 22, 2018

Great! I was just reading through the first edition this week. It's excellent.

Also see Mining of Massive Datasets: http://infolab.stanford.edu/~ullman/mmds/book.pdf