gravel 3e1895c32d
Add CONTRIBUTING.md
1 year ago
languages@524ad98361 Update language submodule 1 year ago
misc Update list_of_known_servers.txt 1 year ago
output Add SOGS found via shodan.io 1 year ago
php Add SOGS found via shodan.io 1 year ago
sites Merge pull request 'SOGS QR Caching & backend improvements' (#17) from gravel/sessioncommunities.online:backend-caching into main 1 year ago
systemd Restructure static site generation 1 year ago
.gitignore Ignore cache folder 1 year ago
.gitmodules Outsource language_flags.php into submodule 1 year ago
.phpenv Improve logging 1 year ago
CONTRIBUTING.md Add CONTRIBUTING.md 1 year ago
Makefile Change dev server HTTP port from 8080 to 8081 1 year ago
README.md Update README.md 1 year ago

README.md

Crawl lists of active Session Communities

What does this site do?

This script crawls known sources of published Session Communities, queries their servers for available information, and renders that information as a static HTML page. The results can be viewed at https://sessioncommunities.online/.

What is Session?

Session is a private messaging app that protects your metadata, encrypts your communications, and makes sure your messaging activities leave no digital trail behind. See https://getsession.org/.

Details

Which sources are crawled?

Currently this script crawls the following sites:

Additionally, a few other servers are hardcoded; see the querying logic.

How does this work?

The update-listing.php script invokes the following two PHP scripts: fetch-servers.php to query available servers, and generate-html.php to generate the static HTML.

The querying logic consists of these steps:

  1. Fetching source HTML: get_html_from_known_sources()
  2. Extracting Session invites from the HTML: extract_join_links_from_html() and get_servers_from_join_links()
  3. Making sure servers are online: reduce_servers()
  4. Querying the servers for all available rooms and normalizing active user numbers: query_servers_for_rooms()
  5. De-duplicating servers based on public keys: get_pubkeys_of_servers() and reduce_addresses_of_pubkeys()
  6. Aggregating all server info & adding language data: generate_info_arrays()
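Steps 2 and 5 above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the project's PHP code. It assumes that an invite link has the shape `http(s)://<host>/<room>?public_key=<64 hex characters>`; the function names `extract_join_links` and `servers_by_pubkey` are hypothetical and do not correspond one-to-one to the PHP functions listed above.

```python
import re
from urllib.parse import urlparse, parse_qs

# Assumed invite shape: http(s)://<host>/<room>?public_key=<64 hex characters>
JOIN_LINK = re.compile(r"https?://[^\s\"'<>]+\?public_key=[0-9a-fA-F]{64}")


def extract_join_links(html: str) -> list[str]:
    """Return every distinct invite-looking URL in the HTML, in first-seen order."""
    return list(dict.fromkeys(JOIN_LINK.findall(html)))


def servers_by_pubkey(join_links: list[str]) -> dict[str, set[str]]:
    """Group server base URLs by public key.

    Two addresses that share a public key are treated as the same server,
    which is the idea behind de-duplicating on keys rather than hostnames.
    """
    servers: dict[str, set[str]] = {}
    for link in join_links:
        parsed = urlparse(link)
        # Keys are hex strings, so compare them case-insensitively.
        pubkey = parse_qs(parsed.query)["public_key"][0].lower()
        base_url = f"{parsed.scheme}://{parsed.netloc}"
        servers.setdefault(pubkey, set()).add(base_url)
    return servers
```

Grouping by key first makes it easy to pick one preferred address per server afterwards, for example a hostname over a bare IP address. That selection policy is not shown here.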

Static HTML is generated from the sites directory into the output directory, which also contains static assets. Every file in sites is executed to produce an HTML page unless its name is prefixed with a + sign.
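The + prefix rule can be expressed as a small filter. The following Python sketch is an illustration only, not the generator's actual code. It assumes that a + prefix on any path component (a directory as well as a file) excludes everything beneath it; the README only states the rule for the contents of sites, so that extension is a guess. The name `is_page_source` and the example paths are hypothetical.

```python
from pathlib import PurePosixPath


def is_page_source(relative_path: str) -> bool:
    """Decide whether a path relative to sites/ should become an HTML page.

    Assumption: a '+' prefix on any component of the path marks it as a
    helper (for example a shared partial) rather than a page of its own.
    """
    parts = PurePosixPath(relative_path).parts
    return not any(part.startswith("+") for part in parts)
```

With this rule, a hypothetical `index.php` would be rendered, while `+components/header.php` and `about/+partial.php` would be skipped.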

Work around bad routing to Chinese servers

Depending on your location, routing to SOGS servers behind the Great Firewall (GFW) can be very poor. In that case the initial connection still succeeds, but no content ever arrives and the retrieval attempt simply times out. This happens intermittently. To keep it from affecting the results, we first check that the server is online (that is, the initial connection succeeds), and then retry many times with a short timeout until the content eventually arrives. The details can be seen in curl_get_contents().
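The retry strategy can be sketched as follows. This is a Python illustration of the idea, not a port of curl_get_contents(): the attempt count and timeout are placeholder values, the names `fetch_once` and `fetch_with_retries` are hypothetical, and the preceding "is the server online" check is left out.

```python
import urllib.request
from typing import Callable, Optional


def fetch_once(url: str, timeout: float) -> bytes:
    """Perform a single HTTP GET that gives up after `timeout` seconds."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_retries(
    url: str,
    attempts: int = 20,      # placeholder, not the project's real value
    timeout: float = 2.0,    # placeholder, not the project's real value
    fetch: Callable[[str, float], bytes] = fetch_once,
) -> Optional[bytes]:
    """Retry a short-timeout request until one attempt returns content.

    Many quick attempts are preferred over one long wait because a stalled
    transfer rarely recovers, while a fresh connection may be routed better.
    Returns None if every attempt fails.
    """
    for _ in range(attempts):
        try:
            return fetch(url, timeout)
        except OSError:
            # Covers timeouts and urllib's URLError; try again.
            continue
    return None
```

The `fetch` parameter exists only so the retry loop can be exercised without network access, for example by passing a function that fails a few times before returning data.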

Official repositories

If your favourite Session community is missing a language flag, you can issue a pull request here:

Contact

If you want to contact me, you can add me on Session via my ONS: someguy.