General Data Science Means Cross-Language Tools, Training, and Documentation
Win-Vector Blog 2020-05-19
Data science is often a case of bringing the tools to the problems and data, instead of insisting on bringing the problems and data to the tools.
To support cross-language data science we have been working on cross-language tools, documentation, and training.
For example:
The vtreat data preparation package for supervised machine learning is available both for R users and for Python users. Video lectures: an advanced data preparation video for R users, and an advanced data preparation video for Python users.
We have task-oriented cross-linked documentation:
- Regression: R regression example, fit/prepare interface; R regression example, design/prepare/experiment interface; Python regression example.
- Classification: R classification example, fit/prepare interface; R classification example, design/prepare/experiment interface; Python classification example.
- Unsupervised tasks: R unsupervised example, fit/prepare interface; R unsupervised example, design/prepare/experiment interface; Python unsupervised example.
- Multinomial classification: R multinomial classification example, fit/prepare interface; R multinomial classification example, design/prepare/experiment interface; Python multinomial classification example.