jjdocker description steps, Schemes and Mind Maps of Law

Lebanese University Law

jjdocker description steps jjdocker description steps

Typology: Schemes and Mind Maps

2022/2023

Uploaded on 11/30/2024

chloe-khoury 🇱🇧

Chapter 1 GENERAL CONCEPTS

1.2. What is Docker?

1.2.1. Why is it used?

Docker is a tool designed to make it easier to create, deploy, and run applications by using containers. Containers allow developers to package an application with all its dependencies (code, libraries, environment variables, etc.) so that it works consistently across different environments. This is especially useful when moving applications from one machine to another, like from a developer's local environment to a production server.

Docker solves the problem of inconsistencies between environments by bundling everything needed to run an application inside a container, which is lightweight, portable, and isolated from the host machine. Containers can run on any machine that has Docker installed, ensuring that our application behaves the same regardless of where it's deployed.

1.2.2. How Docker is Used in Web Scraping:

In web scraping, Docker helps package your scraping environment (Python libraries, scraping tools, browser drivers like ChromeDriver for Selenium, etc.) into a single container. This eliminates issues with software dependencies across different machines. Note that a container is a lightweight, standalone executable package that includes everything needed to run an application (e.g., source code, libraries, settings). For example, web scraping tools like Selenium or BeautifulSoup often require specific dependencies (e.g., browser drivers, Python packages). Docker makes sure everything is bundled correctly so our scraping code will run smoothly wherever the container is executed.
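To make the dependency problem concrete, here is a minimal sketch, using only the Python standard library, that reports which versions of a set of packages are installed on the current machine. Running it on two machines (or inside and outside a container) shows the kind of drift that Docker is meant to remove. The function name installed_versions and the package list are illustrative assumptions, not part of the original project.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Illustrative package names; list whatever the scraper actually imports.
    report = installed_versions(["requests", "beautifulsoup4", "selenium"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

If the report differs between machines, the same scraper may behave differently on each of them, which is the situation a container image avoids by shipping one fixed set of packages.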

1.2.3. Docker Example for Web Scraping

Here’s how I started building a simple web scraping tool with Docker:

1. I installed Docker on my system from Docker's official site.

2. I created a directory for my scraping project and navigated to it:

mkdir webscraper

cd webscraper

3. I created a Python script (scraper.py) inside my project folder with the following code:

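import requests
from bs4 import BeautifulSoup

# Make a request to a website; the timeout keeps the script from hanging
URL = 'https://example.com'
response = requests.get(URL, timeout=10)
# Stop early on HTTP errors such as 404 or 500
response.raise_for_status()

# Parse the HTML content
soup = BeautifulSoup(response.text, 'html.parser')

# Find the title of the page (soup.title is None when there is no <title> tag)
title = soup.title.string if soup.title else None
print(f"Page title: {title}")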

4. I created a requirements.txt file and added the libraries my project needs:

requests

beautifulsoup4

5. I created a Dockerfile with the instructions to build a Docker image for my scraper:

# Use an official Python runtime as a base image
FROM python:3.8-slim

# Set the working directory to /app
WORKDIR /app

# Copy the current directory contents into the container at /app
COPY . /app

# Install the dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Run the scraper script when the container starts
CMD ["python", "scraper.py"]

· FROM python:3.8-slim: This line tells Docker to use the official Python image as the base for my container. The 3.8-slim version is a lightweight version of Python 3.8, which is smaller in size and includes just enough libraries to run Python applications. Using a slim image makes the container smaller, faster to build, and more efficient.

· WORKDIR /app: This sets the working directory inside the container to /app. Every subsequent command (like copying files or installing dependencies) will happen within this directory.

· COPY . /app: This copies all files and folders from my local machine’s current directory into the /app directory inside the container, allowing Docker to see my scraper.py script, requirements.txt, and any other necessary files.

· RUN pip install --no-cache-dir -r requirements.txt: This installs the Python packages listed in requirements.txt inside the container using pip. The --no-cache-dir option ensures that pip doesn’t save cache files for installed packages, keeping the container small and efficient.

· CMD ["python", "scraper.py"]: This specifies the command to run when the container starts, telling Docker to run the scraper.py Python script.

6. I built the Docker image by running the following command in my project directory:

docker build -t webscraper .

This command tells Docker to create an image named webscraper based on the Dockerfile in the current directory.

· -t webscraper: This flag assigns the name (tag) webscraper to the image. Any name works here; webscraper is just an example.

· . (dot): The dot at the end sets the build context to the current directory, which is also where Docker looks for the Dockerfile by default.

7. Once the image was built, I ran my scraper inside a Docker container:

docker run webscraper

This executes the scraper.py file inside the container and outputs the title of the web page. Docker creates a new container using the webscraper image and then executes the command specified in the Dockerfile.

8. After I am done, I can list all containers and remove the ones I no longer need (optional):

docker container ls -a     # List all containers, including stopped ones
docker rm <container_id>   # Remove the container by ID

If I have many containers running or exited, it’s a good idea to remove them once they’re no longer needed to free up system resources.
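The build, run, and cleanup steps above can also be scripted. The following is a minimal sketch, assuming Docker is installed and the script is started from the project directory; the helper names build_command, run_command, and build_and_run are hypothetical and not part of the original project. It passes --rm to docker run so that the container is deleted automatically when it exits, which makes the manual removal step unnecessary for one-off runs.

```python
import subprocess


def build_command(tag, context="."):
    """Command that builds an image named `tag` from the given build context."""
    return ["docker", "build", "-t", tag, context]


def run_command(tag, remove=True):
    """Command that runs a container from `tag`; --rm deletes it on exit."""
    command = ["docker", "run"]
    if remove:
        command.append("--rm")
    command.append(tag)
    return command


def build_and_run(tag="webscraper"):
    """Build the image, then run it once; raises CalledProcessError on failure."""
    subprocess.run(build_command(tag), check=True)
    subprocess.run(run_command(tag), check=True)
```

Calling build_and_run() from the webscraper directory would run the equivalent of docker build -t webscraper . followed by docker run --rm webscraper. Keeping the command construction in separate functions makes it possible to inspect the commands without having Docker available.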

1.2.4. Benefits of Docker in Scraping:

· Consistency: The same code will run identically on any machine where Docker is installed, eliminating issues caused by different environments.

· Dependency Management: All the necessary dependencies are packaged inside the Docker container.

· Portability: You can easily share the Docker image with others, and they can run the same code on their machines without any setup issues.

· Isolation: The containerized environment isolates the scraping tool from your host machine, ensuring that any issues inside the container won’t affect the host.

1.3. What is scraping Third-Party Apps/Websites?

This refers to collecting data from websites or apps that we do not own. It can be tricky because some websites block scraping, or they might have legal restrictions on scraping their data. In fact, we scrape third-party apps and websites when we need information that they display publicly, like prices, reviews, or product details, but don’t offer an API to access the data easily.

To do it, we use tools like BeautifulSoup (for HTML), Selenium (for dynamic content), or APIs (if available) to extract data. However, it’s important to always check the website’s terms of service to make sure we’re not violating any rules.
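One check that can be automated before scraping is the site's robots.txt file. This is an assumption added here for illustration: robots.txt is a common convention that the text above does not mention, and it does not replace reading the terms of service. The sketch uses Python's standard urllib.robotparser module; the function name is_allowed, the sample rules, and the user agent string are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used so the example runs offline.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, url, user_agent="my-scraper"):
    """Return True if the given robots.txt rules permit fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/products", "https://example.com/private/data"):
        status = "allowed" if is_allowed(SAMPLE_ROBOTS, url) else "disallowed"
        print(f"{url}: {status}")
```

In a real scraper the rules would come from the target site itself, for example by calling set_url() and read() on the parser instead of passing a string.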
