diff --git a/.gitignore b/.gitignore
index 83acb35..652576f 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,6 +1,9 @@
 # Local files
 *.local*
 
+# Output files
+*.json
+
 # Python
 venv/
 
diff --git a/README.md b/README.md
index 676fd0b..90d8abf 100644
--- a/README.md
+++ b/README.md
@@ -27,7 +27,7 @@ $ pip install -r requirements.txt
 
 ## Launch
 
-The following command will crawl the conferences from `https://www.dm.unipi.it/research/past-conferences/` (pages 1 to 5) and save the results in `conferences.json`:
+The following command will crawl all the conferences from `https://www.dm.unipi.it/research/past-conferences/` (all pages) and save the results in `conferences.json` as a list of json objects, one per line.
 
 ```bash
 $ python main.py
diff --git a/main.py b/main.py
index d71f464..daba610 100755
--- a/main.py
+++ b/main.py
@@ -6,6 +6,8 @@ from bs4 import BeautifulSoup
 import textwrap
 import json
 
+OUTPUT_FILE = "conferences.json"
+
 LLM_EXAMPLE = (
     "INPUT:\n"
     '