I also think it's a shame he didn't bring up NumPy and Pandas, or R. He creates a data frame by hand and then complains that there are no functions to sort it, but those functions already exist. Here they are in Pandas:

    import pandas as pd

    ants = pd.DataFrame({
      "name": ["bob", "alice", "carol"],
      "color": ["red", "blue", "red"],
      "age": [1.1, 0.5, 1.2],
      "warrior": [True, False, True]})
    # Or use pd.read_csv to get the data in.

    # Count number of warriors.  True => 1, False => 0
    ants['warrior'].sum() # Returns 2

    # Count old red ants
    ((ants.color == "red") & (ants.age > 1.0)).sum() # 2 again

    # Sort by age. This returns a new DataFrame; ants itself is unchanged.
    ants.sort_values(by="age")

I think Pandas and NumPy deserve a mention for making it easy to work in a structure-of-arrays (SoA) style in Python.
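
To make the SoA point a bit more concrete, here is a minimal sketch using the same toy ants table. It only shows that each column carries its own dtype and can be pulled out as a NumPy array for bulk operations. The variable names (`ages`, `mean_age`, `oldest`) are mine, and how Pandas lays out memory internally is more involved than this suggests.

```python
import numpy as np
import pandas as pd

ants = pd.DataFrame({
    "name": ["bob", "alice", "carol"],
    "color": ["red", "blue", "red"],
    "age": [1.1, 0.5, 1.2],
    "warrior": [True, False, True],
})

# Each column has its own dtype, much like the fields of a struct of arrays.
print(ants.dtypes)

# Pull a single column out as a NumPy array and operate on it in bulk.
ages = ants["age"].to_numpy()
mean_age = ages.mean()

# Look up the name of the oldest ant via the row label of the max age.
oldest = ants.loc[ants["age"].idxmax(), "name"]
print(mean_age, oldest)
```

You never touch the other columns when you work on `age`, which is most of what people want from SoA in the first place.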



Yep. I think Pandas could and should see a lot more use outside data science, as performant in-memory data storage for Python.

The best part? Most data scientists don't even think about SoA vs. AoS: a tabular structure is easy to grok, easy to use, and performant by default!
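
As a rough illustration of why nobody has to think about it, here is the same "old red ants" count written both ways. The list of dicts is a stand-in I made up for a typical AoS layout, and the names `ants_aos`, `ants_soa`, `old_red_aos`, and `old_red_soa` are hypothetical. This is a sketch of the two styles, not a benchmark; on three rows the plain loop is probably faster.

```python
import pandas as pd

# AoS style: one record (dict) per ant.
ants_aos = [
    {"name": "bob", "color": "red", "age": 1.1, "warrior": True},
    {"name": "alice", "color": "blue", "age": 0.5, "warrior": False},
    {"name": "carol", "color": "red", "age": 1.2, "warrior": True},
]
old_red_aos = sum(
    1 for ant in ants_aos if ant["color"] == "red" and ant["age"] > 1.0
)

# SoA style: the same records as a DataFrame, queried column by column.
ants_soa = pd.DataFrame(ants_aos)
old_red_soa = int(
    ((ants_soa["color"] == "red") & (ants_soa["age"] > 1.0)).sum()
)

print(old_red_aos, old_red_soa)
```

Both give the same answer; the DataFrame version just expresses the filter over whole columns instead of one record at a time.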



