this post was submitted on 02 Sep 2024
Programming
I never really quite understood IPFS and why it gets used where I see it today. What problem is it solving?
IPFS could replace present-day Content Delivery Networks.
It would also allow you to host software and other content from your own network again without the constraints modern Internet Service Providers pose on you to limit your self-hosting capabilities.
If applications are built for it, it could serve as live storage for your applications too.
We ran ipfs-search. In one of the experiments we could show that a distributed search index on ipfs-search, accessible through JavaScript, is likely feasible with the necessary research. Parts of the index would automatically be hosted by the clients that used the index, thus creating a fairly resilient system.
Too bad IPFS couldn't get over the technical hurdles of limiting connection setup time. We could get a fast (ElasticSearch based) index running and hosted over common web technologies, but fetching content from IPFS directly was generally rather slow.
Would you be interested in a similar protocol that supports more things (and is IMO easier to set up)?
I'm not actively looking but please do share references! Other people may read this and they may want to know too. Perhaps I'll jump back in the rabbit hole at some point too 😁
Okay here it goes!
Tenfingers sharing protocol & Python implementation (your Python needs cryptodomex, or use the frozen executables).
http://tenfingers.org
You share theirs, they share yours (all encrypted)! So no benevolent nodes or crypto and it's 100% decentralised.
I'm working on better documentation on how to set it up (basically just forward a port and run setup).
I had to read the overview and it looks nice. It reads like IPFS without some of the challenging cruft. Well written!
IPFS seemingly works small scale but not large scale. What makes tenfingers handle millions of files and petabytes of data better than IPFS? Perhaps that is not the goal. In what way do you think the tech scales? Why will discovery of the node which has the data be short?
I want to ask for benchmarks but you can't do a full benchmark without loads of resources.
Thanks!
IPFS is static, whereas tenfingers is dynamic when it comes to the links. So you can update the shared data without the need of redistributing the link.
That said, it's also very different tech-wise: there is no need for benevolent nodes (or some crypto or payment).
Nodes do not need to be trustworthy either, so node discovery is very simple (basically just ask other nodes for known nodes).
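A minimal sketch of the "ask other nodes for known nodes" idea. This is not the tenfingers code: `discover`, `ask_peers` and `max_known` are hypothetical names, and `ask_peers` stands in for whatever network call returns a node's list of known addresses.

```python
from collections import deque


def discover(seeds, ask_peers, max_known=1000):
    """Collect node addresses by asking already-known nodes for their peers.

    `ask_peers(node)` is an assumed callable returning an iterable of
    addresses; it may raise OSError when a node is unreachable.
    """
    known = set(seeds)
    queue = deque(seeds)
    while queue and len(known) < max_known:
        node = queue.popleft()
        try:
            peers = ask_peers(node)
        except OSError:
            continue  # unreachable node: just move on to the next one
        for peer in peers:
            if peer not in known and len(known) < max_known:
                known.add(peer)
                queue.append(peer)
    return known
```

The cap on `max_known` reflects that a node only needs enough peers for its own use, not a full view of the network.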
The distribution part, where nodes share your data, is based on reciprocal sharing: you share theirs and they share yours. If they stop sharing (there are checks), you just ditch the deal and ask for a new deal with another node.
With over-sharing (by default you share your data with 10 other nodes and share their data in return), bad nodes should be no problem, and you also get good uptime and takedown safety.
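Roughly, the deal upkeep could look like the sketch below. It is only an illustration under assumptions: `maintain_deals`, `still_sharing` and `propose_deal` are made-up names for the checks and the deal request, and `availability` assumes nodes go down independently of each other, which real networks only approximate.

```python
TARGET_DEALS = 10  # the default over-sharing factor mentioned above


def maintain_deals(deals, candidates, still_sharing, propose_deal,
                   target=TARGET_DEALS):
    """Ditch partners that stopped sharing, then top up to `target` deals."""
    kept = {node for node in deals if still_sharing(node)}
    for node in candidates:
        if len(kept) >= target:
            break
        # Skip current partners and the ones that were just ditched.
        if node not in deals and propose_deal(node):
            kept.add(node)
    return kept


def availability(p_up, copies=TARGET_DEALS):
    """Chance that at least one of `copies` independent nodes is up."""
    return 1 - (1 - p_up) ** copies
```

Under that independence assumption, even nodes that are each up only half the time give `availability(0.5)` of about 0.999 with 10 copies.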
This system also makes it infinitely scalable node-wise, as a node does not need to know all other nodes, just enough for its needs (for example thousands out of millions of existing nodes).
To share lots of data, you need to bring enough storage and bandwidth to the table because it's reciprocal, so basically it's up to your node how much it can share.
Big data sets are always complicated because of errors and long download times. I have done 300MB files without problems, but the download process sure can be made better (with parallel downloading and better error handling, for example).
I haven't worked on sharing way bigger datasets, even a simple terabyte is a pita to download on the regular internet :-) and the use case is more the idea of sharing lots of smaller data, like a website for example, or a chat.
What do you think, am I missing something important? Or of course if you have other questions please do ask!
Also, sorry I'm writing this on my mobile so it's not very well written.
Edit: missed one question; getting the data is straightforward to use (how it's handled is a bit complicated because of the changing nature of things), but when you download, you have the addresses of the nodes sharing your data, so you just connect to one of them and download it (or try the next one if the first isn't up, and so on). So that should not be any kind of bottleneck.
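The "connect to one, or the next if it isn't up" part, as a small sketch. `download` and `fetch` are hypothetical names, not the tenfingers API; `fetch(address)` stands in for the actual transfer and is assumed to raise OSError when a node can't be reached.

```python
def download(addresses, fetch):
    """Try each node sharing the data in turn; return the first success."""
    errors = []
    for address in addresses:
        try:
            return fetch(address)
        except OSError as exc:
            errors.append((address, exc))  # remember the failure, try the next
    raise ConnectionError(f"all {len(errors)} nodes failed")
```

Since the addresses are already known at download time, no lookup is needed here, only as many connection attempts as there are dead nodes in front of a live one.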