Yes - you should be able to use dask the way you say.
Your first part of the understanding matches my expectation too.
Dask single box parallelism achieved by multi processing - akin to parallel map.
And distributed compute is achieved by shipping the work to remote substrates.
For your second comment - we leverage pickle mostly to keep it easy and simple for basic types.
For small dataframes we just pickle for simplicity. For larger dataframes we rely on users to directly store the data (probably encoded as parquet) and just pickle the path instead of the whole dataframe.
Your first part of the understanding matches my expectation too. Dask single box parallelism achieved by multi processing - akin to parallel map. And distributed compute is achieved by shipping the work to remote substrates.
For your second comment - we leverage pickle mostly to keep it easy and simple for basic types. For small dataframes we just pickle for simplicity. For larger dataframes we rely on users to directly store the data (probably encoded as parquet) and just pickle the path instead of the whole dataframe.