US Press Freedom Tracker Data now available on the decentralized web via IPFS
The U.S. Press Freedom Tracker, where we attempt to document virtually every press freedom violation in the country, has, for some time, made available its database of thousands of incidents for export via the API. We want all the data we’ve collected over the past five years to be available to journalists, advocates, and policy makers for their own analysis.
As an organization committed to helping journalists resist censorship and ensure information remains free, we’ve recently been exploring how we can use the decentralized web, and in particular IPFS, to more permanently store the vast wealth of information now on the Tracker. IPFS, for the uninitiated, is an innovative means to distribute data in a way that doesn’t depend on centralized infrastructure (such as a website).
In some ways similar to torrents, files shared on the IPFS network can be mirrored across many nodes. This makes the protocol particularly resistant to censorship or deletion, and it may have other qualities that become significant to journalists as the internet evolves.
To this end, as a proof of concept, we’ve now published the Press Freedom Tracker’s incident database on IPFS. (You can, of course, still use the website API as well.) You can view the database at this IPNS ID:
ipns://k51qzi5uqu5dlnwjrnyyd6sl2i729d8qjv1bchfqpmgfeu8jn1w1p4q9x9uqit
You can view the ID via an IPFS web gateway, such as the one provided by Cloudflare, via a browser extension like IPFS Companion, or via another IPFS client. The file is updated about every hour (more on that below), so you can ensure that the dataset you are downloading is the most current.
US Press Freedom Tracker data via IPFS, as viewed in Brave Browser.
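For readers without an IPFS-aware browser, the gateway route mentioned above amounts to rewriting the ipns:// address into an ordinary HTTPS URL. Here is a minimal sketch of that rewrite, assuming a path-style gateway; the helper name to_gateway_url and the choice of ipfs.io as the default gateway host are illustrative assumptions, and any public gateway could be substituted.

```python
from urllib.parse import urlsplit

# Assumption: ipfs.io is used only as an example of a public path-style gateway.
DEFAULT_GATEWAY = "https://ipfs.io"


def to_gateway_url(uri: str, gateway: str = DEFAULT_GATEWAY) -> str:
    """Rewrite ipfs://<cid>/path or ipns://<name>/path as a gateway URL."""
    parts = urlsplit(uri)
    if parts.scheme not in ("ipfs", "ipns"):
        raise ValueError(f"expected an ipfs:// or ipns:// URI, got {uri!r}")
    # Path-style gateways expose content under /ipfs/<cid>/... and /ipns/<name>/...
    return f"{gateway.rstrip('/')}/{parts.scheme}/{parts.netloc}{parts.path}"


TRACKER_ID = "ipns://k51qzi5uqu5dlnwjrnyyd6sl2i729d8qjv1bchfqpmgfeu8jn1w1p4q9x9uqit"
print(to_gateway_url(TRACKER_ID))
```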
A technical deep-dive into IPFS, IPNS, and keeping track of changes to the database
IPFS is an interesting protocol because its content identifiers (CIDs) or ‘hashes’ are cryptographically computed from the content of the file, not its name or other metadata.
This means that every time the file’s content changes, publishing it to IPFS produces a new CID.
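The idea of an identifier derived purely from content can be illustrated with an ordinary hash. This is only a toy: a real CID wraps the digest in extra metadata, and larger files are hashed as a tree of chunks, so a plain SHA-256 of a file will not match its CID. The function name toy_content_id and the sample rows are made up for the illustration.

```python
import hashlib


def toy_content_id(data: bytes) -> str:
    """Hex SHA-256 digest of the bytes; a stand-in for a real CID, not a real one."""
    return hashlib.sha256(data).hexdigest()


# Made-up sample rows, not real Tracker data.
original = b"id,date\n1,2022-05-01\n"
renamed_copy = bytes(original)              # same bytes under another name
edited = original + b"2,2022-05-02\n"       # one extra row

# The name plays no part: identical bytes give an identical identifier.
print(toy_content_id(original) == toy_content_id(renamed_copy))  # True
# Any change to the content gives a different identifier.
print(toy_content_id(original) == toy_content_id(edited))        # False
```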
There is nothing in the protocol that maintains any sort of ‘revision’ relationship between the old CID and the new one. It is up to the publisher to keep track of old versions of the file (if that’s important to them). Equally, it’s up to the publisher to tell people which CID is the new one, but it would be annoying to have to keep announcing new CIDs every time the file changes.
For this reason, the ID above is an ‘IPNS’ ID, which always points to the latest version of the folder and its contents, without itself ever having to change. IPNS is a little bit like DNS, in that it’s a sort of static ‘alias’ or pointer to another destination - in this case, the latest IPFS CID of the directory.
To maintain a sort of ‘revision’ log of changes to the incidents.csv database (and when it changed), we also publish a changelog file (incidents-log.csv) which shows the previous CIDs and a timestamp of when they were published. The last line in the file is always the latest version of the incidents.csv. You can also fetch the latest file directly (rather than view the directory) by using the IPNS hash, for example:
ipns://k51qzi5uqu5dlnwjrnyyd6sl2i729d8qjv1bchfqpmgfeu8jn1w1p4q9x9uqit/incidents.csv
Feel free to look at older CIDs to see what changed, or to consult the changelog file to find out when the latest version was published.
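Since the last line of the changelog always describes the current version, finding it programmatically is a matter of reading the final row. The sketch below works on text that has already been downloaded; the column layout and the sample values are assumptions for illustration, as the exact format of incidents-log.csv is not described here.

```python
import csv
import io


def latest_changelog_row(changelog_text: str) -> list[str]:
    """Return the last non-empty row of the changelog CSV."""
    rows = [row for row in csv.reader(io.StringIO(changelog_text)) if row]
    if not rows:
        raise ValueError("changelog is empty")
    return rows[-1]


# Made-up sample; the real file's columns and CIDs will differ.
sample = (
    "QmOlderExample,2022-05-18T10:00:00Z\n"
    "QmNewerExample,2022-05-19T11:00:00Z\n"
)
print(latest_changelog_row(sample))
```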
How often is the data published to IPFS?
We attempt to publish the latest copy of the database to IPFS every hour, but realistically the database itself changes far less frequently. The database is only published (and the changelog updated) if its content changes.
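One simple way to get this "publish only on change" behaviour is to compare a digest of the fresh export with the digest of the last published one. The sketch below shows that check in isolation. It is an assumption about how such a job could work, not a description of the Tracker's actual code, and the names digest_of and should_publish are hypothetical.

```python
import hashlib
from typing import Optional


def digest_of(data: bytes) -> str:
    """SHA-256 hex digest used to compare one export with the next."""
    return hashlib.sha256(data).hexdigest()


def should_publish(export: bytes, last_published_digest: Optional[str]) -> bool:
    """True when the export differs from whatever was published last."""
    return digest_of(export) != last_published_digest


# Simulate three hourly runs in which the data changes only once.
last_digest = None
published_runs = []
for hour, export in enumerate([b"rows-v1", b"rows-v1", b"rows-v2"]):
    if should_publish(export, last_digest):
        # A real job would add the file to IPFS and append to the changelog here.
        last_digest = digest_of(export)
        published_runs.append(hour)

print(published_runs)  # [0, 2]
```

Comparing the newly computed CID with the last line of the changelog would serve the same purpose.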
Care to share some code?
We initially tried to use what seems to be the official Python library for working with the IPFS API, but found that it doesn’t seem to support the most recent releases of go-ipfs, and is possibly semi-abandoned.
Fortunately, the go-ipfs service provides its own HTTP RPC API, so we could use Python’s requests module to talk to it.
Publishing a single file to the IPFS API is quite easy, and there are simple examples of how to do it. However, publishing a directory containing files turned out to be a little trickier.
It took a bit of trial and error to work out how to send multiple files in a multipart request with the right tuple values per file, in a way that matched the IPFS API’s documentation, but we got there.
For those curious, here’s a sample of what worked for us. Happy hacking!
If you’re looking to install IPFS on a Linux server, we used an Ansible role for that, which worked great.
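To give a flavour of the multipart directory upload described above, here is a minimal sketch written from the IPFS RPC documentation rather than copied from the Tracker's code. It assumes a local daemon with its RPC API on the default port 5001, and the directory name "tracker" and the helper names are hypothetical. As the documentation describes it, the directory is sent as a part with the content type application/x-directory, and each file part uses a URL-encoded "directory/file" name.

```python
import json
from urllib.parse import quote

# Assumption: a local IPFS daemon exposing its RPC API on the default port.
API_URL = "http://127.0.0.1:5001/api/v0"


def build_multipart(dir_name: str, files: dict) -> list:
    """Build the ("file", (name, data, content_type)) tuples that requests expects."""
    # The directory itself is an empty part with a special content type.
    parts = [("file", (quote(dir_name, safe=""), b"", "application/x-directory"))]
    for name, data in files.items():
        # File names carry the directory prefix, URL-encoded ("/" becomes "%2F").
        encoded = quote(f"{dir_name}/{name}", safe="")
        parts.append(("file", (encoded, data, "application/octet-stream")))
    return parts


def parse_add_response(body: str) -> dict:
    """The add endpoint replies with one JSON object per line; map Name to Hash."""
    entries = (json.loads(line) for line in body.splitlines() if line.strip())
    return {entry["Name"]: entry["Hash"] for entry in entries if "Hash" in entry}


def add_directory(dir_name: str, files: dict, api_url: str = API_URL) -> dict:
    """POST the directory to the daemon and return the CID of every entry."""
    import requests  # third-party; imported here so the helpers above work without it

    response = requests.post(
        f"{api_url}/add", files=build_multipart(dir_name, files), timeout=60
    )
    response.raise_for_status()
    return parse_add_response(response.text)


if __name__ == "__main__":
    # Hypothetical contents; a real job would read the exported CSV files from disk.
    cids = add_directory("tracker", {"incidents.csv": b"id,date\n"})
    print(cids)
```

In the returned mapping, the entry named after the directory should hold the CID of the directory as a whole. That is the value one would then hand to the daemon's name/publish endpoint so that the IPNS ID points at the new version.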