inputcoffee · on July 12, 2019

The Fidelity "minus sign mistake" didn't create a loss. The mistake was in relaying the information to the end user. It didn't actually cause a loss of that magnitude.

That is like saying if I mistyped in a word doc, that word created the loss.

inputcoffee · on July 11, 2019

I am always confused when people talk about the language itself.

In my experience, python is used for Tensorflow, or Pandas, or Django, or Flask, or pytorch or something else that runs on top of it. Sometimes it is even more specialized and I need a wrapper for an API to let me talk to some web data. Maybe I need a crawler/scraper and a parser. There is a specialized language on top of the language.

So when someone says, oh this language is better with objects, or has some syntax thing or the other, or I can reason about it, I am left confused.

It's like if I were talking to a professional shoe designer and I ask for hiking boots and they tell me that they're really into having at least two tones to offset the lace and the heels or something.

What am I missing? I want to reason about the language too, but doesn't that pale in comparison to being able to run a specialized library?

kazinator · on July 11, 2019

You're missing that people write mountains of code in Python and similar languages; they are not always just for a small amount of glue to gain access to some specialized libraries.

inputcoffee · on July 11, 2019

Well, I assume people write mountains of code in the library. If you're making a machine learning product, that is still a lot of work.

However, writing your own Tensorflow interface would take several human lifetimes to get it right, and Google already has provided it. So it seems that is not the part you would re-write no matter how good the language is.

kjeetgill · on July 11, 2019

> In my experience, python is used for Tensorflow, or Pandas, or Django, or Flask, or pytorch or something else that runs on top of it.

Is this your experience of using python or reading about it? I don't know your background so I apologize for making some assumptions about the source of your confusion. It sounds like you don't have a ton of experience programming, so let me start broadly: Python is a language in the sense that English is a language, but these "something else that runs on top of it" are more like specialized vocabulary or jargon than "languages on top of the language".

English gives you the grammar/structure/spelling to communicate; it's a foundation. But it also gives you general vocabulary; adjectives and adverbs blend and interact with any new vocabulary that might come into play. It doesn't matter if it's a poem, or a novel, or technical documentation, or a textbook, there is still a lot of English-ness to it.

In the same way, Python as a language is still the substrate through which you interact with each of those tools (Tensorflow, or Pandas, or Django, or Flask). I agree with what you're getting at, that maybe the tools are more important than the language. When people talk about their liking for python they could be talking about either the language itself or the culture/ecosystem around the language. Some of it is inherent to the language, some a quirk of its history.

This applies as much to natural language. You might hear someone love the sound of Spanish or French, or praise the regularity of Latin spelling, or love Greek for the wealth of ancient, influential texts that it gives access to.

In the case of python you get a lot of praise from both angles. People love the language for its ecosystem, sure; but also for how it handles whitespace, its brevity, the specifics of its typing, where it does and doesn't need parentheses, REPLability, etc.

inputcoffee · on July 11, 2019

I am not offended that you think I may not program. That is fine. (I mean less than some, more than others. I have coded up the examples I brought up.)

But you haven't responded to the argument. If someone urges you to use Racket, and you have a task in front of you (say, put up a website), it sort of matters whether Racket has a framework more than if it has brackets, indents or curly braces.

kjeetgill · on July 12, 2019

> What am I missing? I want to reason about the language too, but doesn't that pale in comparison to being able to run a specialized library?

> But you haven't responded to the argument. If someone urges you to use Racket [...] it sort of matters whether Racket has a framework [...]

I guess I misunderstood the question; I didn't realize you were making that argument rather than actually wondering what other things people care about. I guess the more direct answer would be that phrases like "doesn't that pale in comparison" and "it sort of matters" are kinda presumptions about the motivations of a person "who urges you to use Racket".

Articles like this are as much targeted towards "end-user" programmers as they are towards the programmers who built the frameworks in the first place. Flask, Django, etc. exist because people who like Python for things like "if it has brackets, indents or curly braces" wanted those tools in that language.

That's what I was trying to get at before: different people are excited by different things (obviously), but languages absolutely have draws independent of the tools that exist in them. Not for everyone, but not for nobody neither.

rkangel · on July 12, 2019

Machine learning is a slightly unusual case with powerful important libraries doing the work, with the language on top just being used for orchestration, data loading etc. If that's what you're doing, then you're right, the ML framework availability is far more important.

That's still a relatively small corner of programming in general though. For most languages when talking about libraries we talk about the 'ecosystem' - what is the availability and quality of all the bits and pieces that we can build upon. It's a question of many small things, rather than one large thing.

Ecosystem differences are less absolute than 'has tensorflow', and so can be weighed up against other language advantages and disadvantages.

Web frameworks are an interesting case because you (usually) don't use them as simple libraries. The interaction with them tends to be complex enough that good frameworks are built around what the language is good at. If the language is right for the sort of programming you want to do, then the framework will express that.

j88439h84 · on July 11, 2019

There's a lot more to languages than block syntax, and pretty much every language has a web framework.

davidw · on July 11, 2019

The amount of code available, and its quality, is one aspect of a language's ecosystem that most people take into account when discussing its merits.

It's quite possible to also discuss the language itself independently of how much code is available for it, while acknowledging that it's an important factor for many people.

j88439h84 · on July 11, 2019

Flask equivalents exist in every language. Tensorflow bindings are available in many. Pandas is more specific to Python, but using dataframes in Python vs R vs Julia each has a different feel.

inputcoffee · on July 11, 2019

True, if we were talking about Data science, and you were bringing up Python or R or Julia, fair enough.

But if you're talking about Racket, I would want to know what you can do with it. Does it have a data science library? A web app framework?

j88439h84 · on July 11, 2019

Some task-language pairs work better than others, either because of the libraries or the language features or both.

Libraries make Python better than Lisp for machine learning. Language features make Haskell better than Python for formal verification.

philwelch · on July 12, 2019

Flask and Django aren't really "specialized libraries" though.

inputcoffee · on July 5, 2019

Agree 100%, but only if it is a one-time activity. If you have to automatically pull files and do the operation several times, it is better to go with R (or python, or awk, sed or whatever).

inputcoffee · on July 5, 2019

I was waiting for the critique... but I never quite saw it.

Imho, the data problems the Tidyverse is trying to solve are basically the ones we face in a database. So, select, join, inner join and so forth. Show me all the rows in this data table where the 4th column is larger than the 6th column and the number itself is odd. Something like that.

There might be other ways to do it, but you want your select, filter, summarize, mutate etc functions to all work with each other, pipe to each other and be compatible.
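To make that concrete, here is a rough sketch in plain Python (not R, and not how dplyr is implemented) of verbs that pipe into each other. The helper names `filter_rows`, `mutate`, `select` and `pipe` are made up for illustration, the table is invented, and "the number itself" is assumed to mean the value in the 4th column.

```python
from functools import reduce

# Hypothetical table: each row is a tuple of six integer columns.
rows = [
    (1, 2, 3, 9, 5, 4),    # 4th (9) > 6th (4) and 9 is odd     -> kept
    (1, 2, 3, 8, 5, 4),    # 4th (8) > 6th (4) but 8 is even    -> dropped
    (1, 2, 3, 3, 5, 7),    # 4th (3) is not larger than 6th (7) -> dropped
    (1, 2, 3, 11, 5, 10),  # 4th (11) > 6th (10) and 11 is odd  -> kept
]

def filter_rows(predicate):
    """Build a step that keeps only the rows matching the predicate."""
    return lambda table: [row for row in table if predicate(row)]

def mutate(fn):
    """Build a step that appends one computed column to every row."""
    return lambda table: [row + (fn(row),) for row in table]

def select(*indexes):
    """Build a step that keeps only the columns at the given 0-based indexes."""
    return lambda table: [tuple(row[i] for i in indexes) for row in table]

def pipe(table, *steps):
    """Feed the table through each step in order, like a chain of pipes."""
    return reduce(lambda acc, step: step(acc), steps, table)

result = pipe(
    rows,
    filter_rows(lambda r: r[3] > r[5] and r[3] % 2 == 1),  # 4th > 6th, 4th odd
    mutate(lambda r: r[3] - r[5]),                         # new 7th column
    select(3, 5, 6),                                       # 4th, 6th, new column
)
print(result)
```

Every step takes a table and returns a table, which is the compatibility property I mean: any verb can feed any other.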

Maybe there is a better way to do all this -- I haven't seen it but I am not an expert -- but you have to show that to me.

So, in base R, walk through a set of examples of mutating, joining, filtering and so forth, and show me how they are all easier. Then I'll say, wow there is an alternative to this Tidyverse thing. But absent that demo, this felt more like an intro to a complaint than an actual complaint.

Edit: Also, it's funny that Wickham is (apparently) such a nice fellow that people go out of their way to be nice to him in critiques.

barcadad · on July 5, 2019

His critique is more about the impact of the full ecosystem effect of the Tidyverse, not what you are referring to, which is just the dplyr semantics. The Tidyverse demands that its many related packages use tidy data principles and locks users into that approach, which differs from base-R. Much of this discussion is really just a debate about dplyr and magrittr rather than the fragmentation that the broader tidyverse has brought on. All that said, I agree with many commenters that the Tidyverse's improvements to speed of development can more than offset the speed of execution issues, at least for small-to-medium datasets.

RosanaAnaDana · on July 6, 2019

I think anyone making the point about speed of development has seriously missed the boat. Let's be real: tibbles suck. Once you get the hang of data.table syntax for matrix operations, its superiority becomes perfectly clear.

I run a data science group at a large geospatial company and we develop day in and day out in R and python. We've purged tidyverse as much as possible from our code base. We've moved completely over to data.tables, which make the vast majority of the tidyverse irrelevant.

hadley · on July 6, 2019

Let’s keep the discussion civil please. People have legitimately different needs, and just because a package isn’t well suited to your needs doesn’t mean that it doesn’t help people with different backgrounds and goals.

inputcoffee · on July 2, 2019

Don't you think it was policy so they can capture the information and get reporting?

duxup · on July 2, 2019

Oh I have no doubt.

I write some software that involves accounting. Surprising how they drive A LOT of things. Sometimes a bit too much...

inputcoffee · on June 28, 2019

I've been trying to explain to people why I think ML is Stats rebranded but this is the most succinct expression of that sentiment:

> Taking averages, grouped by something? That's AI now.

I think that is right. The algorithm that does the grouped averages is machine learning, and if you put error bars around it, it is stats.
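A small sketch of what I mean, in plain Python. The data and the function name `grouped_means` are invented for illustration, and the "error bar" here is assumed to be the standard error of the mean (sample standard deviation divided by the square root of n), which is only one of several reasonable choices.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical observations as (group, value) pairs.
observations = [("a", 1.0), ("a", 2.0), ("a", 3.0), ("b", 10.0), ("b", 14.0)]

def grouped_means(pairs):
    """Return {group: (mean, standard_error)} for each group."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)

    summary = {}
    for key, values in groups.items():
        n = len(values)
        # The sample standard deviation needs at least two points;
        # with a single point there is no error bar to report.
        std_err = stdev(values) / n ** 0.5 if n > 1 else float("nan")
        summary[key] = (mean(values), std_err)
    return summary

summary = grouped_means(observations)
for group, (avg, std_err) in summary.items():
    print(f"{group}: {avg:.2f} +/- {std_err:.2f}")
```

The first element of each tuple is the "AI" part; the second is what turns it into stats.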

To address your concern: I wouldn't worry about the relevance of applying math and logic to the world. It has always been growing.

inputcoffee · on June 26, 2019

It was thought that one way of finding information is to ask your network (Facebook and Twitter would be examples), and then they would pass on the message and a chain of trusted sources would get the information back to you.

I am being purposefully vague because I don't think people know what an effective version of that would look like, but it's worth exploring.

If you have some data you might ask questions like:

1. Can this network reveal obscure information?

2. When -- if ever -- is it more effective than indexing by words?

dbspin · on June 26, 2019

This seems significantly laborious. Not sure that the utility of this kind of network recommendation scales to incentivise participation beyond a few people. i.e.: We already have user groups on sites like reddit, FB etc, where experts or enthusiasts answer questions when they feel like it. But this is a slow process that relies on a group that contains enough distributed knowledge, but isn't overwhelmed with inquiries. As a counter example, the /r/BuildaPC subreddit long ago exceeded the size where it could answer a significant proportion of build questions, and most remain unanswered despite significant community engagement.

Not convinced any kind of formalised 'question answering network' could replace search. It would be both slow, and require an enormous asymmetric investment of time, for a diffuse and unspecified reward.

inputcoffee · on June 26, 2019

I don't think it would be questions.

Suppose you like fountain pens, and you recommend certain ones. One of your friends looks for fountain pens that their friends recommend and finds the ones you like.

That is just one example of things that don't require explicit questions.
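As a toy sketch of the fountain pen case (plain Python, with an invented friend graph, invented item names, and a made-up function `recommended_by_friends`; a real system would need far more than this), the lookup only walks direct friends and counts what they recommend:

```python
from collections import Counter

# Hypothetical friend graph: user -> set of direct friends.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice"},
    "carol": {"alice", "dave"},
    "dave": {"carol"},
}

# Hypothetical recommendations: user -> topic -> list of items.
recommendations = {
    "bob": {"fountain pens": ["pen A", "pen B"]},
    "carol": {"fountain pens": ["pen B"]},
    "dave": {"fountain pens": ["pen C"]},
}

def recommended_by_friends(user, topic):
    """Rank items on a topic by how many of the user's direct friends recommend them."""
    counts = Counter()
    for friend in friends.get(user, ()):
        for item in recommendations.get(friend, {}).get(topic, []):
            counts[item] += 1
    # Most recommended first; ties are broken alphabetically for stable output.
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))

result = recommended_by_friends("alice", "fountain pens")
print(result)
```

Nobody had to ask or answer a question here: the recommendations already exist, and the network just decides whose you see. Following friends of friends (dave, in this toy data) would be the "chain of trusted sources" version.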

Another one might be you have searched for books or other things and then they follow the same "path". So long as you have similar interests it might work.

People haven't solved this issue, but there is a lot of research out there on networks of connections potentially replacing certain kinds of search.

inputcoffee · on June 23, 2019

It went down a few times before and I reached out asking if he wanted help too. He did respond fairly promptly (< 2 days) and told me he was fine and about to get it back up. This happens every so often so I am glad you’re doing this.

Are the articles the same as he had? Did you get a snapshot before it went down?

thegurus · on June 23, 2019

Yes, the main goal of this new edition is a 24/7 up site with the same content :) We have in mind posting some old content since it can be recovered.

inputcoffee · on June 21, 2019

I admit, it really makes you think about iterations and MVP in a new way.

I am surprised you said you found Flask lacking though, because I would have thought they were similar. Can you say more about what you found to be lacking in terms of performance, team size, code and tooling?

inputcoffee · on June 21, 2019

This is really interesting.

I would love to see something like this for other successful companies.

Too often, we see the tech stacks of famous firms, but not the stacks that preceded them.

It would be very interesting to note if, say, 80% of unicorns started their life as RoR, or PHP projects. It tells you one of two things:

1. Which frameworks were popular n years ago (where n is the average time it takes from launch to unicorn)

2. Which framework actually helps you get an MVP off the ground