pandas
My Linked Notes
- 2020-10-27
- For both Ray and Dask, you can `pip install` their libraries and get started running them locally. They can also be used to parallelize [[pandas]], which is the number 1 tool for most data scientists. This is really important, because data scientists don't want to have to learn Spark, but to run big data processing jobs, technologies like Spark were the only way
- Dask and Ray provide a "native" Python way to run parallel computations. And of course, their parent businesses are going to let users pay for a reliable, easy-to-use environment to run their software