Hacker News new | past | comments | ask | show | jobs | submit login

> Pandas does is notoriously hard to fit into a compile-time type system

What, mutate data with known structure? What exactly do you imagine is hard about this, let alone notoriously so?




Not hard per se, but extremely unergonomic.

A pandas dataframe types to something like Iterable[SomeKindOfProductType[int, str, str, ... (78 other columns)]]. The formal type of a dataframe in the middle of a transformation is... not very useful to know.


You wouldn't typically type tabular data in any language. Give it (at most) a DataFrame<StoreTransaction> type if you must, where StoreTransaction describes the structure of a row - maybe declaring only columns that model generation code was doing typed operations on (e.g. numerics vs strings) to avoid the need for reflection.


Either you type the tabular data at compile time or you don't get type checking of tabular data at compile time.

The number and types of the columns aren't necessarily known at compile time. Which leads to runtime errors. Even in a statically typed language, such dataframes are a kind of "dynamic typing escape hatch". As complexity of a software increases, such mechanismus of dynamic typing and throwing runtime errors creep up all over the place.


Sure. If you're dealing with untyped data at runtime you can't type it at compile time. Not a new issue, and handled all the time in otherwise typed languages.


It's not very useful to type, but that's why we have type inference.

It's certainly very useful to know, even if that knowledge is indirect via the type of the resulting dataframe after all the transforms.


From dabbling in F# I have a feeling such "information" would be somewhat more annoying than useful.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: