I'm new to software development, and this is the first time I'm deploying a multi-file system.
It works fine on my local machine, but I don't know what I'm doing wrong once it is deployed.
I'm trying to deploy a Twitter streaming application using the Heroku Postgres add-on, and my web app has two separate files:
- streaming.py (connects to Twitter and uses slistener.py to collect the data and store it in PostgreSQL)
- app.py (reads the data from PostgreSQL and draws some charts)
I declared my Procfile as:
worker: python streaming.py
web: gunicorn app:server
And it appears to be recognized correctly:
My apps do establish a connection to Heroku PostgreSQL, but no data is stored and the table is never created, so app.py has nothing to read. This is the error:
2020-03-25T15:28:22.376679+00:00 app[web.1]: 10.43.182.207 - - [25/Mar/2020:15:28:22 +0000] "POST /_dash-update-component HTTP/1.1" 500 290 "https://bbb-twitter-monitor.herokuapp.com/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36"
2020-03-25T15:28:22.499681+00:00 app[web.1]: 10.16.194.154 - - [25/Mar/2020:15:28:22 +0000] "GET /_dash-component-suites/dash_core_components/async-plotlyjs.v1_8_1m1582838719.js HTTP/1.1" 200 984008 "https://bbb-twitter-monitor.herokuapp.com/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36"
2020-03-25T15:28:22.524838+00:00 heroku[router]: at=info method=GET path="/_dash-component-suites/dash_core_components/async-plotlyjs.v1_8_1m1582838719.js" host=bbb-twitter-monitor.herokuapp.com request_id=e1723441-0877-4eb1-b9d7-4c94dd0e2432 fwd="177.144.188.23" dyno=web.1 connect=1ms service=199ms status=200 bytes=984265 protocol=https
2020-03-25T15:28:32.951480+00:00 heroku[router]: at=info method=POST path="/_dash-update-component" host=bbb-twitter-monitor.herokuapp.com request_id=4b142c35-a700-4145-a475-877c92bb43e5 fwd="177.144.188.23" dyno=web.1 connect=1ms service=6ms status=500 bytes=470 protocol=https
2020-03-25T15:28:32.949989+00:00 app[web.1]: <connection object at 0x7f939478a3f0; dsn: 'user=papziqledxhges password=xxx dbname=db9vikoson7vl3 host=ec2-52-87-58-157.compute-1.amazonaws.com port=5432 sslmode=require', closed: 0>
2020-03-25T15:28:32.953543+00:00 app[web.1]: Exception on /_dash-update-component [POST]
2020-03-25T15:28:32.953544+00:00 app[web.1]: Traceback (most recent call last):
2020-03-25T15:28:32.953545+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/pandas/io/sql.py", line 1586, in execute
2020-03-25T15:28:32.953546+00:00 app[web.1]: cur.execute(*args, **kwargs)
2020-03-25T15:28:32.953546+00:00 app[web.1]: psycopg2.errors.UndefinedTable: relation "tweet" does not exist
2020-03-25T15:28:32.953547+00:00 app[web.1]: LINE 1: SELECT * from tweet
2020-03-25T15:28:32.953547+00:00 app[web.1]: ^
2020-03-25T15:28:32.953548+00:00 app[web.1]:
2020-03-25T15:28:32.953548+00:00 app[web.1]:
2020-03-25T15:28:32.953549+00:00 app[web.1]: The above exception was the direct cause of the following exception:
2020-03-25T15:28:32.953549+00:00 app[web.1]:
2020-03-25T15:28:32.953549+00:00 app[web.1]: Traceback (most recent call last):
2020-03-25T15:28:32.953550+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/flask/app.py", line 2446, in wsgi_app
2020-03-25T15:28:32.953550+00:00 app[web.1]: response = self.full_dispatch_request()
2020-03-25T15:28:32.953551+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/flask/app.py", line 1951, in full_dispatch_request
2020-03-25T15:28:32.953551+00:00 app[web.1]: rv = self.handle_user_exception(e)
2020-03-25T15:28:32.953551+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/flask/app.py", line 1820, in handle_user_exception
2020-03-25T15:28:32.953552+00:00 app[web.1]: reraise(exc_type, exc_value, tb)
2020-03-25T15:28:32.953552+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/flask/_compat.py", line 39, in reraise
2020-03-25T15:28:32.953552+00:00 app[web.1]: raise value
2020-03-25T15:28:32.953553+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/flask/app.py", line 1949, in full_dispatch_request
2020-03-25T15:28:32.953555+00:00 app[web.1]: rv = self.dispatch_request()
2020-03-25T15:28:32.953555+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/flask/app.py", line 1935, in dispatch_request
2020-03-25T15:28:32.953556+00:00 app[web.1]: return self.view_functions[rule.endpoint](**req.view_args)
2020-03-25T15:28:32.953556+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/dash/dash.py", line 1461, in dispatch
2020-03-25T15:28:32.953557+00:00 app[web.1]: response.set_data(self.callback_map[output]["callback"](*args))
2020-03-25T15:28:32.953557+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/dash/dash.py", line 1341, in add_context
2020-03-25T15:28:32.953558+00:00 app[web.1]: output_value = func(*args, **kwargs) # %% callback invoked %%
2020-03-25T15:28:32.953558+00:00 app[web.1]: File "/app/app.py", line 437, in _update_div1
2020-03-25T15:28:32.953559+00:00 app[web.1]: df = pd.read_sql_query("SELECT * from tweet", con)
2020-03-25T15:28:32.953559+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/pandas/io/sql.py", line 332, in read_sql_query
2020-03-25T15:28:32.953560+00:00 app[web.1]: chunksize=chunksize,
2020-03-25T15:28:32.953560+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/pandas/io/sql.py", line 1633, in read_query
2020-03-25T15:28:32.953561+00:00 app[web.1]: cursor = self.execute(*args)
2020-03-25T15:28:32.953561+00:00 app[web.1]: File "/app/.heroku/python/lib/python3.6/site-packages/pandas/io/sql.py", line 1598, in execute
2020-03-25T15:28:32.953561+00:00 app[web.1]: raise ex from exc
2020-03-25T15:28:32.953562+00:00 app[web.1]: pandas.io.sql.DatabaseError: Execution failed on sql 'SELECT * from tweet': relation "tweet" does not exist
2020-03-25T15:28:32.953562+00:00 app[web.1]: LINE 1: SELECT * from tweet
2020-03-25T15:28:32.953563+00:00 app[web.1]: ^
PostgreSQL shows that the app can access the database, but it stores no rows and creates no table.
I have been trying to fix this and can't find a clear reference that explains what I should do.
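For anyone reproducing this, here is a minimal sketch of how the missing table could be confirmed from Python, separately from the Dash callback. It assumes a DB-API connection like the psycopg2 one in app.py. The helper name table_is_readable is hypothetical, and sqlite3 appears in the demo only so the snippet runs without a PostgreSQL server.

```python
import sqlite3


def table_is_readable(con, table_name):
    """Return True if a trivial SELECT against table_name succeeds.

    `con` is any DB-API 2.0 connection (psycopg2 on Heroku; sqlite3 is
    used below purely as a stand-in for the demo).
    """
    # Table names cannot be passed as query parameters, so accept only
    # plain identifiers before interpolating into the SQL string.
    if not table_name.isidentifier():
        raise ValueError(f"unexpected table name: {table_name!r}")
    cur = con.cursor()
    try:
        cur.execute(f"SELECT 1 FROM {table_name} LIMIT 1")
        cur.fetchall()
        return True
    except Exception:
        # Broad catch for the sketch only; with psycopg2 the specific
        # error is psycopg2.errors.UndefinedTable (see the traceback).
        # psycopg2 leaves the transaction aborted after an error, so
        # roll back to keep the connection usable.
        con.rollback()
        return False
    finally:
        cur.close()


if __name__ == "__main__":
    demo = sqlite3.connect(":memory:")
    print(table_is_readable(demo, "tweet"))  # table not created yet
    demo.execute("CREATE TABLE tweet (tweet_id TEXT)")
    print(table_is_readable(demo, "tweet"))  # table now exists
```

If this returns False on the web dyno, the worker has probably not written anything yet, because df.to_sql(..., if_exists='append') is the only place in the code below that creates the tweet table.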
The three files are below:
streaming.py:
from tweepy import OAuthHandler
from tweepy import API
from tweepy import Stream
from sqlalchemy import create_engine
from sqlalchemy_utils import database_exists, create_database
from urllib3.exceptions import ProtocolError
from slistener import SListener
import os

# from key_secret import consumer_key, consumer_secret
# from key_secret import access_token, access_token_secret
api_key = ''
key_secret = ''
access_token = ''
token_secret = ''

# consumer key authentication
auth = OAuthHandler(api_key, key_secret)
# access key authentication
auth.set_access_token(access_token, token_secret)
# set up the API with the authentication handler
api = API(auth)
# instantiate the SListener object
listen = SListener(api)
# instantiate the stream object
stream = Stream(auth, listen)
# set up words to hear
keywords_to_hear = ['#BBB20', "#BBB2020"]

# create an engine to the database
engine = create_engine(os.environ['DATABASE_URL'])
# if the database does not exist
if not database_exists(engine.url):
    # create a new database
    create_database(engine.url)

# begin collecting data
while True:
    # maintain connection unless interrupted
    try:
        stream.filter(track=keywords_to_hear)
    # reconnect automatically if an error arises
    # due to unstable network connection
    except (ProtocolError, AttributeError):
        continue
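The loop at the end of streaming.py swallows ProtocolError and AttributeError without printing anything, so a worker that keeps failing leaves no trace in the logs. Below is a rough sketch of the same reconnect idea with each failure logged. The function run_with_reconnect and its parameters are hypothetical names, not part of tweepy, and the sketch returns when the stream ends cleanly, whereas the original loop would start it again.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("streaming")


def run_with_reconnect(start_stream, retry_on, max_attempts=None, delay=0.0):
    """Call start_stream() repeatedly, logging each failure before retrying.

    start_stream : zero-argument callable, e.g.
                   lambda: stream.filter(track=keywords_to_hear)
    retry_on     : tuple of exception types that should trigger a reconnect
    max_attempts : None means retry forever, like the original loop
    delay        : seconds to wait between attempts
    Returns the number of attempts made.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            start_stream()
            return attempts  # stream ended without an error
        except retry_on as exc:
            # Log instead of a silent `continue`, so that a failing
            # worker shows up in the application logs.
            log.warning("stream attempt %d failed: %r", attempts, exc)
            time.sleep(delay)
    return attempts
```

In streaming.py this could be called as run_with_reconnect(lambda: stream.filter(track=keywords_to_hear), (ProtocolError, AttributeError)). Any exception not listed in retry_on still propagates and stops the worker, which is the same behaviour as the original loop.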
slistener.py:
from tweepy import API  # needed for the `api or API()` fallback below
from tweepy.streaming import StreamListener
import json
import pandas as pd
from sqlalchemy import create_engine
from datetime import timedelta
import os
from sqlalchemy import text
import datetime

DATABASE_URL = os.environ['DATABASE_URL']

# inherit from StreamListener class
class SListener(StreamListener):
    # initialize the API and a counter for the number of tweets collected
    def __init__(self, api = None, fprefix = 'streamer'):
        self.api = api or API()
        # instantiate a counter
        self.cnt = 0
        # create an engine to the database
        self.engine = create_engine(os.environ['DATABASE_URL'])

    # for each tweet streamed
    def on_status(self, status):
        # increment the counter
        self.cnt += 1
        # parse the status object into JSON
        status_json = json.dumps(status._json)
        # convert the JSON string into a dictionary
        status_data = json.loads(status_json)
        tweet = {
            'created_at': status_data['created_at'],
            'tweet_id': status_data['id_str'],
            'id_user': status_data['user']['screen_name'],
            'text': status_data['text']}
        df = pd.DataFrame(tweet, index=[0])
        #print("df")
        # convert the time string into a datetime object
        df['created_at'] = pd.to_datetime(df.created_at)
        # push tweet into database
        df.to_sql('tweet', con=self.engine, if_exists='append', index=False)
        # delete tweets older than 360 seconds
        task = """
            DELETE FROM tweet
            WHERE created_at IN(
                SELECT created_at
                FROM(
                    SELECT created_at
                    FROM tweet
                    WHERE ((DATE_PART('day', now()::timestamp - created_at::timestamp) * 24
                        + DATE_PART('hour', now()::timestamp - created_at::timestamp)) * 60
                        + DATE_PART('minute', now()::timestamp - created_at::timestamp)) * 60
                        + DATE_PART('second', now()::timestamp - created_at::timestamp) > 360) AS tweet_del) """
        # d = addresses_table.delete().where(addresses_table.c.retired == 1)
        # d.execute()
        with self.engine.connect() as con:
            # con.execute(task)
            con.execute(text(task))
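The DELETE statement in slistener.py rebuilds the age of each tweet in seconds from its day, hour, minute and second parts and removes rows older than 360 seconds. As a small sketch of that arithmetic in plain Python (age_in_seconds and is_expired are illustrative names, and fractional seconds are ignored here, unlike DATE_PART('second', ...)):

```python
from datetime import datetime, timedelta

MAX_AGE_SECONDS = 360  # same threshold as the DELETE statement (6 minutes)


def age_in_seconds(created_at, now):
    """Elapsed whole seconds, assembled from day/hour/minute/second parts
    in the same way as the SQL expression."""
    delta = now - created_at
    days = delta.days
    hours, rem = divmod(delta.seconds, 3600)
    minutes, seconds = divmod(rem, 60)
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def is_expired(created_at, now, max_age=MAX_AGE_SECONDS):
    """True if a tweet with this timestamp would be matched by the DELETE."""
    return age_in_seconds(created_at, now) > max_age


if __name__ == "__main__":
    now = datetime(2020, 3, 25, 15, 30, 0)
    print(is_expired(now - timedelta(minutes=5), now))  # 300 s old
    print(is_expired(now - timedelta(minutes=7), now))  # 420 s old
```

Because the cleanup runs inside on_status, it only fires when a new tweet arrives, and the table can only shrink after df.to_sql has created it.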
app.py (this is not the full code, only the part that connects to the database and reads from it):
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Output, Input, State
import dash_table
import pandas as pd
import sqlite3
from sqlalchemy import create_engine
from sqlalchemy_utils import database_exists, create_database
from urllib3.exceptions import ProtocolError
import plotly_express as px
import os
import psycopg2
DATABASE_URL = os.environ['DATABASE_URL']
con = psycopg2.connect(DATABASE_URL)
df = pd.read_sql_query("SELECT * from tweet", con)
if __name__ == '__main__':
    app.run_server(debug=True)