main
Fran314 3 weeks ago
parent b7fe73c25d
commit 3b03ada880

@@ -1,3 +1,5 @@
PORT=3001
NIX_CHROMIUM_PATH=<PATH/TO/CHROMIUM/INSIDE/NIX/STORE>
ON_NIX="true"
#--- NixOS-only settings ---#
NIX_CHROMIUM_PATH="" # Set to chromium path (/nix/store/.../bin/chromium)
ON_NIX="false" # Set to true if on NixOS

@@ -1,22 +1,57 @@
# Weather Web-Scraper
> **CAUTION**
> The following utility is in EARLY development. Many promises made in this
> readme might not actually be true. Hopefully one day we'll be out of beta ~and
> be releasing on time~
A utility to scrape Italian weather websites to collect and compare weather
information for Pisa
A backend that scrapes Italian weather websites to collect and compare weather
information for Pisa, and serves it as an API for a frontend that doesn't yet
exist
## Usage
> **WARNING**
> The script is configured to run on Nix, which makes it more
> complicated than it needs to be. You can remove the unnecessary parts if you're
> not running this on Nix
<details>
<summary>NixOS-specific setup</summary>
If you're using NixOS, you might have some trouble making `puppeteer` work,
as `puppeteer` ships with its own copy of chromium, which is a dynamically
linked binary. Prebuilt dynamically linked binaries do not run out of the box on
NixOS, so we must work around this to make puppeteer work. To do this, we first
install chromium separately, and then tell `puppeteer` to use this chromium
instead of its own
You can either globally install chromium, and then get its path with
`which chromium`, OR...
You can temporarily install it in a nix-shell, and this will put chromium in the
store until the next garbage collection. Of course this is a dirty way of solving
this problem because you'd be using a chromium which isn't technically installed
on your system, and it will disappear once you garbage collect, but it's also
true that I don't really want chromium installed on my system sooooo... choose
your poison
Once you have chromium installed (or at least present in the nix store), get its
path (with `which chromium` if it's installed or in some hacky way otherwise),
and edit the `.env` file accordingly (remember to set `ON_NIX` to true as well)
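
As a rough illustration of how the two `.env` settings could be consumed, here is a minimal sketch. The helper name `launchOptionsFromEnv` and its error message are hypothetical and not taken from this repository; the only Puppeteer-specific part is the `executablePath` launch option, which tells `puppeteer` which browser binary to start.

```javascript
// Hypothetical helper (not part of this repo): builds the options object
// that would be passed to puppeteer.launch().
function launchOptionsFromEnv(env) {
    // Values read from a .env file are strings, so compare against 'true'.
    if (env.ON_NIX !== 'true') {
        // Not on NixOS: let puppeteer use its own bundled browser.
        return {};
    }
    if (!env.NIX_CHROMIUM_PATH) {
        throw new Error('ON_NIX is "true" but NIX_CHROMIUM_PATH is empty');
    }
    // On NixOS: point puppeteer at the chromium found in the nix store.
    return { executablePath: env.NIX_CHROMIUM_PATH };
}

// Intended use, assuming puppeteer is installed and .env has been loaded:
//   const browser = await puppeteer.launch(launchOptionsFromEnv(process.env));
```

Failing early when `ON_NIX` is set but the path is empty is a design choice of this sketch: it surfaces a misconfigured `.env` before the bundled chromium fails to start.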
</details>
Install the dependencies with
```bash
npm install
```
and then start the backend with
```bash
npm run serve
```
The backend will listen on the default port (3000) unless specified otherwise in
the `.env` file.
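
The port fallback described above could look roughly like the sketch below. The function name `resolvePort` is hypothetical, and the sketch assumes the `.env` values are already available in `process.env`, however the project loads them.

```javascript
// Hypothetical helper (not part of this repo): picks the port to listen on.
// Falls back to 3000 when PORT is missing or not a valid TCP port number.
function resolvePort(env, fallback = 3000) {
    const parsed = Number.parseInt(env.PORT ?? '', 10);
    const valid = Number.isInteger(parsed) && parsed > 0 && parsed < 65536;
    return valid ? parsed : fallback;
}

// Intended use with an express app (assumed, based on the dependencies):
//   app.listen(resolvePort(process.env));
```

With the `.env` shown in this commit (`PORT=3001`) this would select 3001; without a `PORT` entry it would select 3000.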
The only available endpoint is `/` which returns the latest update for the
scraped data, in an object containing the data from all the sources, as well as
the timestamp of the update.
The script scrapes the weather forecast from the implemented sources (see the
table below) and returns an object with the following fields
table below) and returns an object with the following fields:
- **today:** an object with elements from the hour after the current one to
23, of type `hourly` (see below)
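
To make the hour range of `today` concrete, here is a small sketch that lists the hours it would cover. The function name is hypothetical, and whether the real object is keyed by these numbers is an assumption.

```javascript
// Hypothetical helper (not part of this repo): hours covered by `today`,
// from the hour after the current one up to 23, in local time.
function remainingHoursToday(now = new Date()) {
    const hours = [];
    for (let h = now.getHours() + 1; h <= 23; h++) {
        hours.push(h);
    }
    return hours; // empty when the current hour is already 23
}
```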

@@ -4,5 +4,15 @@
"express": "^4.21.1",
"puppeteer": "^23.6.0"
},
"type": "module"
"type": "module",
"name": "weather-scraper",
"version": "0.1.0",
"description": "A backend that scrapes italian weather websites to collect and compare weather information for Pisa, and serve them as an API for a frontend that doesn't yet exist",
"main": "index.js",
"devDependencies": {},
"scripts": {
"serve": "node index.js"
},
"author": "Francesco Baldino",
"license": "GPL-3"
}
