EC2 (Elastic Compute Cloud) is a web service that provides secure, resizable compute capacity in the cloud. It is designed to make web-scale computing easier for developers.
chmod 400 YOUR-KEYPAIR-NAME.pem
ssh -i "YOUR-KEYPAIR-NAME.pem" ubuntu@YOUR-EC2-PUBLIC-DNS
sudo apt update
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
conda create -n flask python=3.7
conda activate flask
pip install requests beautifulsoup4 pandas flask
python YOUR-SCRIPT.py
# YOUR-SCRIPT.py
import requests
import pandas as pd
from bs4 import BeautifulSoup
import sqlite3
import datetime


def insert_data():
    conn = sqlite3.connect('/home/ubuntu/demo.db')
    current_dt = datetime.datetime.now().strftime("%Y-%m-%d %X")
    repo_social_counts = []
    # Walk the first two pages of the most-starred repository search results.
    for page in range(2):
        r = requests.get("https://github.com/search?p={}&q=stars%3A%3E0&s=stars&type=Repositories".format(page+1))
        soup = BeautifulSoup(r.text, 'html.parser')
        repos = [i.get("href") for i in soup.select(".px-2 .repo-list .v-align-middle")]
        repo_links = ["https://github.com" + repo for repo in repos]
        # Visit each repository page and read its watch, star, and fork counts.
        for repo_link, repo in zip(repo_links, repos):
            repo_dict = {}
            r = requests.get(repo_link)
            soup = BeautifulSoup(r.text, 'html.parser')
            watches, stars, forks = [int(i.text.strip().replace(",", "")) for i in soup.select(".social-count")]
            repo_dict["scrapingTime"] = current_dt
            repo_dict["repoOrg"] = repo.split("/")[1]
            repo_dict["repoName"] = repo.split("/")[2]
            repo_dict["watch"] = watches
            repo_dict["star"] = stars
            repo_dict["fork"] = forks
            repo_social_counts.append(repo_dict)
            print("Scraping {}...".format(repo))
    # Overwrite the table with the latest snapshot, then release the connection.
    df = pd.DataFrame(repo_social_counts)
    df.to_sql("top_github_repos", con=conn, if_exists="replace", index=False)
    conn.close()


try:
    insert_data()
except Exception as exc:
    # Report the failure instead of silently discarding it.
    print("Scraping failed: {}".format(exc))
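The script converts each social count with int() after stripping commas, which assumes the page always prints the full number. Below is a minimal sketch of a more tolerant parser. It assumes, beyond what the original script states, that counts may also appear abbreviated (for example 12.3k); parse_count is a hypothetical helper name.

```python
def parse_count(text):
    """Convert a count such as '1,234' or '12.3k' to an int.

    The 'k'/'m' suffix handling is an assumption about how a page may
    abbreviate large numbers; it is not part of the original script.
    """
    multipliers = {"k": 1_000, "m": 1_000_000}
    cleaned = text.strip().replace(",", "").lower()
    if not cleaned:
        raise ValueError("empty count")
    suffix = cleaned[-1]
    if suffix in multipliers:
        # round() guards against float artifacts such as 12300.000000000002
        return int(round(float(cleaned[:-1]) * multipliers[suffix]))
    return int(cleaned)
```

In the scraper this could stand in for the int(...) expression inside the list comprehension, if abbreviated counts turn out to be an issue.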
sudo apt-get install tzdata
sudo dpkg-reconfigure tzdata
timedatectl
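The timedatectl command above checks the resulting time zone setting.
cron and crontab run specified commands at specified times. By editing the crontab, you can schedule tasks or jobs to run periodically on particular dates and at particular times.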
Source: Linux Basics for Hackers
0-59 * * * * /home/ubuntu/miniconda3/envs/flask/bin/python3 /home/ubuntu/YOUR-SCRIPT.py
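The crontab entry above starts with five time fields (minute, hour, day of month, month, day of week) followed by the command. The range 0-59 in the minute field matches every minute, so it behaves like *. As a rough illustration, the sketch below expands a single field into the values it matches. expand_field is a hypothetical helper that covers only a subset of the syntax; it is not how cron itself is implemented.

```python
def expand_field(field, low, high):
    """Expand one cron time field into the sorted list of values it matches.

    Supports '*', 'a', 'a-b', '*/n', 'a-b/n' and comma-separated lists.
    """
    values = set()
    for part in field.split(","):
        rng, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if rng == "*":
            start, end = low, high
        elif "-" in rng:
            start_text, end_text = rng.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(rng)
        if step < 1 or start < low or end > high or start > end:
            raise ValueError("invalid cron field: {!r}".format(field))
        values.update(range(start, end + 1, step))
    return sorted(values)


# The minute field of the crontab entry above matches the same minutes as '*'.
print(expand_field("0-59", 0, 59) == expand_field("*", 0, 59))  # True
print(expand_field("*/15", 0, 59))  # [0, 15, 30, 45]
```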
Flask is a lightweight WSGI web application framework. It is designed to make getting started quick and easy, with the ability to scale up to complex applications. It began as a simple wrapper around Werkzeug and Jinja and has become one of the most popular Python web application frameworks.
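# api.py
import flask

app = flask.Flask(__name__)
# Debug mode is for development only.
app.config["DEBUG"] = True


@app.route('/', methods=['GET'])
def home():
    return "<h1>Hello world!</h1>"


if __name__ == "__main__":
    # Listen on all interfaces so the app is reachable from outside the instance.
    app.run(host="0.0.0.0")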
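Once api.py is running, the route can be checked from another machine. The sketch below uses only the standard library. It assumes the app listens on Flask's default port 5000 and that the instance's security group allows inbound traffic on that port; fetch_text is a hypothetical helper name.

```python
import urllib.request


def fetch_text(url, timeout=5.0):
    """Send a GET request and return (HTTP status code, decoded body)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


# Example call (replace the placeholder with your instance's public DNS name):
# status, body = fetch_text("http://YOUR-EC2-PUBLIC-DNS:5000/")
# print(status, body)
```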
api.py: display all contents of the table
# api.py
import flask
from flask import jsonify
import sqlite3

app = flask.Flask(__name__)
DB_PATH = '/home/ubuntu/demo.db'
query_str = """SELECT * FROM top_github_repos;"""


@app.route('/', methods=['GET'])
def home():
    # Query on every request so the response reflects the latest cron run,
    # rather than a snapshot taken once when the module was imported.
    conn = sqlite3.connect(DB_PATH)
    try:
        cur = conn.cursor()
        cur.execute(query_str)
        results = cur.fetchall()
        cur.close()
    finally:
        conn.close()
    return jsonify(results)


if __name__ == "__main__":
    app.run(host="0.0.0.0")
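Because fetchall() returns plain tuples, jsonify(results) yields arrays of values without column names. If named fields are wanted, one option is sqlite3.Row. The sketch below runs against an in-memory database holding a single made-up row; rows_as_dicts is a hypothetical helper, and only three of the table's columns are reproduced.

```python
import sqlite3


def rows_as_dicts(conn, query):
    """Run a query and return each row as a {column name: value} dict.

    Note: this sets row_factory on the connection that is passed in.
    """
    conn.row_factory = sqlite3.Row
    cur = conn.cursor()
    try:
        cur.execute(query)
        return [dict(row) for row in cur.fetchall()]
    finally:
        cur.close()


# Demonstration on an in-memory database with one made-up row.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE top_github_repos (repoOrg TEXT, repoName TEXT, star INTEGER)")
demo.execute("INSERT INTO top_github_repos VALUES ('example-org', 'example-repo', 123)")
print(rows_as_dicts(demo, "SELECT * FROM top_github_repos;"))
# [{'repoOrg': 'example-org', 'repoName': 'example-repo', 'star': 123}]
```

Inside the Flask route, the returned list could be passed to jsonify in place of the raw tuples.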
cd /etc/systemd/system
sudo vim YOUR-SERVICE.service
Enter the following in the YOUR-SERVICE.service file:
# YOUR-SERVICE.service
[Unit]
Description=Web API
After=network.target
[Service]
User=ubuntu
WorkingDirectory=/home/ubuntu
ExecStart=/home/ubuntu/miniconda3/envs/flask/bin/python3 /home/ubuntu/api.py
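# WatchdogSec= expects the service to send sd_notify "WATCHDOG=1" keep-alive pings. A process that never sends them is aborted when the timeout expires and then restarted by Restart=always. Remove the next line if that is not intended.
WatchdogSec=60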
Restart=always
[Install]
WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl start YOUR-SERVICE
sudo systemctl status YOUR-SERVICE
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
wsgi.py
from api import app

if __name__ == "__main__":
    app.run()
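wsgi.py exists so that a WSGI server can import the name app from it. For readers unfamiliar with WSGI, here is a minimal standard-library sketch of what such a callable looks like and how a server invokes it. Flask's app object implements the same interface; simple_app and call_app are hypothetical names used only for illustration.

```python
from wsgiref.util import setup_testing_defaults


def simple_app(environ, start_response):
    """A bare WSGI callable: receives the request environ, returns body chunks."""
    body = b"<h1>Hello world!</h1>"
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI callable the way a server would, without opening a socket."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


print(call_app(simple_app))  # ('200 OK', b'<h1>Hello world!</h1>')
```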
conda activate flask
pip install gunicorn
YOUR-SERVICE.service
# YOUR-SERVICE.service
[Unit]
Description=Web API
After=network.target
[Service]
User=ubuntu
WorkingDirectory=/home/ubuntu
ExecStart=/home/ubuntu/miniconda3/envs/flask/bin/gunicorn --bind 0.0.0.0:5000 -w 4 wsgi:app
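# WatchdogSec= expects the service to send sd_notify "WATCHDOG=1" keep-alive pings. A process that never sends them is aborted when the timeout expires and then restarted by Restart=always. Remove the next line if that is not intended.
WatchdogSec=60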
Restart=always
[Install]
WantedBy=multi-user.target
sudo systemctl daemon-reload
sudo systemctl restart YOUR-SERVICE
sudo systemctl status YOUR-SERVICE