Docker Error KeyError #21

Open
ameygat opened this issue Feb 17, 2019 · 3 comments

Comments

ameygat commented Feb 17, 2019

I ran the Docker image for the first time and got a KeyError. It seems the code is trying to read the Postgres user and database settings, so does the database need to be created on the host system?
There are no instructions for setting up a DB on https://github.com/isaacmg/fb_scraper/wiki/Docker-Image

variables.list:

FB_ID=myappid
FB_KEY=mysecreate
IDS=cnn,paddlesoft,msnbc
# Include only if you want to scrape comments
COMMENTS=1
# Include below ONLY if you want to use Kafka.
USE_KAFKA=1
KAFKA_PORT=localhost:9092

Error:
docker run --env-file variables.list paddlesoft/fb_scraper
Traceback (most recent call last):
  File "threaded_proc.py", line 6, in <module>
    from fb_scrapper import scrape_groups_pages
  File "/fb_scraper/fb_scrapper.py", line 2, in <module>
    from fb_posts import FB_SCRAPE
  File "/fb_scraper/fb_posts.py", line 11, in <module>
    from save_pg import save_post_pg
  File "/fb_scraper/save_pg.py", line 3, in <module>
    db = Database(os.environ['db'], user=os.environ['pg_user'], password=os.environ['pg_password'], host=os.environ['pg_host'], database=os.environ['pg_db'])
  File "/opt/conda/lib/python3.6/os.py", line 669, in __getitem__
    raise KeyError(key) from None
KeyError: 'db'

Owner

isaacmg commented Feb 19, 2019

Yeah, I think the newer version uses PostgreSQL by default to save scraping times, since shelving is unstable. So I think you need to set the environment variables that this line reads:
db = Database(os.environ['db'], user=os.environ['pg_user'], password=os.environ['pg_password'], host=os.environ['pg_host'], database=os.environ['pg_db'])
I might add a parameter in the future that lets you choose whether to use PostgreSQL at all. If you want to make a PR that adds the parameter, that would probably be the quickest way.
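
For anyone hitting the same KeyError, here is a minimal sketch of how the five variables could be checked up front, and how an opt-out parameter might look. The variable names `db`, `pg_user`, `pg_password`, `pg_host` and `pg_db` come from the `Database(...)` call in the traceback; the `USE_PG` switch and all function names are hypothetical and are not part of fb_scraper.

```python
import os

# Names taken from the Database(...) call shown in the traceback.
REQUIRED_PG_VARS = ("db", "pg_user", "pg_password", "pg_host", "pg_db")


def missing_pg_vars(environ=os.environ):
    """Return the required variable names that are absent or empty."""
    return [name for name in REQUIRED_PG_VARS if not environ.get(name)]


def postgres_enabled(environ=os.environ):
    """Hypothetical opt-out switch: USE_PG=0 turns PostgreSQL saving off."""
    return environ.get("USE_PG", "1") != "0"


def check_pg_config(environ=os.environ):
    """Return True if PostgreSQL should be used, False if it is switched off.

    Raises a readable error instead of a bare KeyError: 'db'.
    """
    if not postgres_enabled(environ):
        return False
    missing = missing_pg_vars(environ)
    if missing:
        raise RuntimeError(
            "PostgreSQL is enabled but these variables are not set: "
            + ", ".join(missing)
        )
    return True
```

Under these assumptions, a module like save_pg.py would call `check_pg_config()` before building the `Database` object, so a missing entry in variables.list is reported by name rather than as a traceback from `os.environ`.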

Author

ameygat commented Feb 21, 2019

But is Postgres already included in the Docker image, or do we need to set up the database on the host?

Owner

isaacmg commented Feb 26, 2019

At the moment Postgres is not part of the Dockerfile, as it would take up too much memory. The way I have been using it is as part of a docker-compose setup, with one container for PostgreSQL and another for FB scraper. Another easy option is to set up a Heroku app, as those come with a free PostgreSQL database. I just added some documentation on this to the wiki page as well.
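
One thing to watch with two separate containers is startup order: the scraper may start before PostgreSQL is accepting connections. Below is a rough sketch of a wait helper that could run before the scraper imports save_pg.py. It is an assumption about how one might handle this, not something fb_scraper ships, and `wait_for_port` is a hypothetical name.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or unreachable: wait a little and retry.
            time.sleep(interval)
    return False
```

For example, `wait_for_port(os.environ["pg_host"], 5432)` would wait on PostgreSQL's default port 5432, assuming the database container uses the default. A successful TCP connect only shows that the port is open; it does not prove the database is ready to run queries.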
