How to Configure Plausible for Searx
Summary
Although Searx comes with its own built-in statistics, it doesn't natively support adding analytics. This is largely by design, given the project's focus on privacy. However, I was curious to see whether my instance gets any traffic that isn't from me.
Trial and Error
To do this, I had to find out where the base.html file was located. This was confusing because the Searx config file resides in /etc/searx. After some digging, I found base.html in the following directory...
/usr/local/searx/searx-src/searx/templates/oscar
Once in the directory, I tried adding the following...
<!--Plausible Analytics-->
<script defer data-domain="search.cc" data-api="/data/api/event" src="/data/js/script.js"></script>
This would allow me to proxy the tracking snippet through Cloudflare. I've already done this with most of the other services I manage, but for some reason, the tracking snippet kept returning a 404 error.
The URL looked correct (https://search.cc/data/js/script.js) but would not return the tracking snippet. After a lot of trial and error, I found that the tracking snippet was available at https://www.search.cc/data/js/script.js. I checked the settings.yml file for Searx, as well as my configuration in Cloudflare, but could not find where the www was coming from.
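One way to narrow down where a www prefix is introduced is to request the URL once without following redirects and look at the Location header, if there is one. The following is a minimal sketch under that assumption; I did not confirm that a redirect was actually involved here, and the helper names first_hop and adds_www are made up for illustration.

```python
import http.client
from urllib.parse import urlsplit


def first_hop(url, timeout=10):
    """Send one HEAD request without following redirects.

    Returns (status, Location header or None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def adds_www(url, location):
    """True if `location` points at the same host as `url` with a www. prefix."""
    host = urlsplit(url).hostname
    if not location or not host:
        return False
    return urlsplit(location).hostname == "www." + host


# Example (performs a real network request, so it is left commented out):
# status, location = first_hop("https://search.cc/data/js/script.js")
# print(status, location, adds_www("https://search.cc/data/js/script.js", location))
```

If the first hop comes back as a 301 or 302 with a www Location, the prefix is being added by something in front of the application, such as a redirect rule, rather than by the snippet itself.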
Resolution
Because I wasn't able to locate where the www was coming from in the tracking snippet, I decided to proxy the snippet through Nginx. Since I already use Nginx as the web server for Searx, it wasn't a big deal to modify the config file.
To modify the config file, I added the following:
# Only needed if you cache the plausible script. Speeds things up.
proxy_cache_path /var/run/nginx-cache/jscache levels=1:2 keys_zone=jscache:100m inactive=30d use_temp_path=off max_size=100m;

server {
    ...

    location = /js/script.js {
        # Change this if you use a different variant of the script
        proxy_pass https://plausible.io/js/plausible.js;

        # Tiny, negligible performance improvement. Very optional.
        proxy_buffering on;

        # Cache the script for 6 hours, as long as plausible.io returns a valid response
        proxy_cache jscache;
        proxy_cache_valid 200 6h;
        proxy_cache_use_stale updating error timeout invalid_header http_500;

        # Optional. Adds a header to tell if you got a cache hit or miss
        add_header X-Cache $upstream_cache_status;
    }

    location = /api/event {
        proxy_pass https://plausible.io/api/event;
        proxy_buffering on;
        proxy_http_version 1.1;

        # Pass the original client details on to Plausible
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-Host $host;
    }
}
After reloading Nginx, I navigated back to /usr/local/searx/searx-src/searx/templates/oscar and added the following to base.html...
<!--Plausible Analytics-->
<script defer data-api="https://search.cc/api/event" data-domain="search.cc" src="https://search.cc/js/script.js"></script>
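Since the point of the proxy is that both the script and the event endpoint are served from the instance's own domain, it can be worth checking the snippet's attributes before updating. Below is a rough sketch using only the Python standard library; the class and function names are hypothetical, and relative URLs are assumed to count as same-domain.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class PlausibleTagFinder(HTMLParser):
    """Collects attributes of <script> tags that carry a data-domain attribute."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "script" and "data-domain" in attr_map:
            self.tags.append(attr_map)


def served_from_own_domain(html):
    """True if every tracking tag loads src and data-api from its data-domain.

    Relative URLs have no hostname and are treated as same-domain (assumption).
    Returns False when no tracking tag is found at all.
    """
    finder = PlausibleTagFinder()
    finder.feed(html)
    if not finder.tags:
        return False
    for tag in finder.tags:
        domain = tag["data-domain"]
        for attr in ("src", "data-api"):
            host = urlsplit(tag.get(attr) or "").hostname
            if host is not None and host != domain:
                return False
    return True


snippet = (
    '<script defer data-api="https://search.cc/api/event" '
    'data-domain="search.cc" src="https://search.cc/js/script.js"></script>'
)
print(served_from_own_domain(snippet))
```

The same function could be pointed at the full contents of base.html rather than a single tag.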
Once this was added, I navigated back to /usr/local/searx/searx-src and used the following command to update the Searx instance...
During the update, I made sure to keep the same config file.
Testing
Once it was finished, I did the following...
- Navigated back to my browser.
- Opened the Developer Console.
- Navigated to the Network tab.
- Loaded https://search.cc
- Confirmed the script appeared at https://search.cc/js/script.js
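The same check can be done from a terminal instead of the browser. This is only a sketch: it assumes the X-Cache header from the Nginx config above is in place, and the probe and summarize helpers are names I made up for illustration.

```python
import urllib.error
import urllib.request


def probe(url, timeout=10):
    """GET `url` and return (HTTP status, X-Cache header or None)."""
    req = urllib.request.Request(url, headers={"User-Agent": "proxy-check"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.headers.get("X-Cache")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code and headers
        return err.code, err.headers.get("X-Cache")


def summarize(status, x_cache):
    """Turn a probe result into a short human-readable line."""
    if status != 200:
        return f"not served (HTTP {status})"
    if x_cache is None:
        return "served, no X-Cache header"
    return f"served, cache {x_cache}"


# Example (performs a real network request, so it is left commented out):
# print(summarize(*probe("https://search.cc/js/script.js")))
```

A 200 response confirms the script is being proxied; the X-Cache value, when present, shows whether Nginx answered from its cache.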
Outcome
Although it's not perfect, so far it seems to be giving me what I'm looking for. I'd like to figure out how to get insight into searches made from the browser address bar, but I have a feeling this may be a limitation of either Plausible or Searx; likely the latter. I think it has something to do with the Content Security Policy in Nginx, but I haven't dug far enough into it to be sure.
Edit: It turns out this was due to a misconfigured firewall rule on Cloudflare. Any API request with a Cloudflare threat score greater than 5 was being blocked. This was overly aggressive, so the rule has since been changed to block only scores greater than 10. A breakdown of how the Cloudflare threat score works can be found at the following link...
Once the rule was reconfigured, Plausible began picking up searches done through the browser address bar.
The important thing is that I was able to configure it properly so that analytics are implemented and the tracking snippet is served from the search.cc domain.