This repository uses worked examples to show how to use the command line efficiently and productively for data science tasks: obtaining, scrubbing, exploring, and modeling your data.
In these examples you will learn how to: (i) run Docker containers, (ii) use the command line, and (iii) run a basic application.
Let us introduce Docker, the first platform we will use for data science. Docker is a tool that allows developers, sysadmins, or data scientists to easily deploy their applications in a sandbox (called a container) that runs on a host operating system, e.g. Linux. The key benefit of Docker is that it allows users to package an application with all of its dependencies into a standardized unit for software development. Unlike virtual machines, containers do not have high overhead and hence enable more efficient usage of the underlying system and resources. [1]
Docker pull
We recommend that you create a new directory, navigate to this new directory, and then run the following when you’re on macOS or Linux:
$ docker run --rm -it -v "$(pwd)":/data datascienceworkshops/data-science-at-the-command-line
Or the following when you’re on Windows and using the Command Prompt:
$ docker run --rm -it -v %cd%:/data datascienceworkshops/data-science-at-the-command-line
Or the following when you’re using Windows PowerShell:
$ docker run --rm -it -v ${PWD}:/data datascienceworkshops/data-science-at-the-command-line
In the above commands, the -v option instructs Docker to mount the current directory at /data inside the container, so this directory is the place to get data in and out of the Docker container. The --rm option removes the container when you exit it, and -it gives you an interactive terminal.
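To make the role of the mounted directory concrete, here is a minimal sketch for macOS or Linux. It prepares a host directory with one file and prints the matching docker run command rather than executing it. The helper name docker_run_cmd, the temporary directory, and the file sample.csv are illustrative assumptions, not part of this repository or of Docker.

```shell
#!/bin/sh
# Sketch only: prepare a host directory and print the command that would
# mount it at /data. Names below are illustrative assumptions.

IMAGE="datascienceworkshops/data-science-at-the-command-line"

# Hypothetical helper: print the docker run command for a given host directory.
docker_run_cmd() {
  # --rm  remove the container when you exit it
  # -it   keep stdin open and allocate a terminal
  # -v    bind-mount the host directory ($1) at /data inside the container
  printf 'docker run --rm -it -v "%s":/data %s\n' "$1" "$IMAGE"
}

# Stand-in for the new directory recommended above.
workdir=$(mktemp -d)

# A file placed on the host side of the mount.
printf 'id,value\n1,42\n' > "$workdir/sample.csv"

# Inside the container this file would be visible as /data/sample.csv,
# and anything written to /data would appear in $workdir on the host.
docker_run_cmd "$workdir"
```

Running the printed command should start the container with sample.csv available under /data. Files created there remain in the host directory after the container exits, because only the container itself is removed by --rm.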
[1] Docker for beginners, https://docker-curriculum.com/.