this post was submitted on 24 Jun 2023
41 points (97.7% liked)
Lemmy
Everything about Lemmy; bugs, gripes, praises, and advocacy.
For discussion about the lemmy.ml instance, go to !meta@lemmy.ml.
I'm doing tests in the next couple days. But I'm trying to build a search engine specifically for Lemmy.
I'm hoping I can open it to the public in a week or so.
Cool! How does it technically work? Does it fetch all titles (and maybe the body and comments) via the api from each instance or do you set up your own private instance and tap into the instance database?
I'm using the public API to grab every post / comment, and then I essentially replace the content with only its unique words. When you search, it just looks for any post or comment in my database that has the words you typed in. Finally, I sort the results by number of upvotes.
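To illustrate the idea, here is a minimal in-memory sketch in Rust of the "keep only the unique words, match, then sort by upvotes" approach. It is not the author's code: the names `IndexedItem`, `SearchIndex`, and `unique_words` are hypothetical, the real project stores its data in Postgres rather than a `Vec`, and the sketch assumes a result must contain *every* word in the query, which the description above leaves open.

```rust
use std::collections::HashSet;

/// What survives indexing: an id, the vote count, and the set of unique words.
/// Field names are hypothetical, not taken from the Lemmy API.
#[derive(Debug)]
struct IndexedItem {
    id: u64,
    upvotes: i64,
    words: HashSet<String>,
}

/// Lowercase the text, split on anything that is not alphanumeric,
/// and keep each distinct word once.
fn unique_words(text: &str) -> HashSet<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|w| !w.is_empty())
        .map(|w| w.to_lowercase())
        .collect()
}

/// In-memory stand-in for the search database.
#[derive(Default)]
struct SearchIndex {
    items: Vec<IndexedItem>,
}

impl SearchIndex {
    /// Store a post or comment, discarding everything except its unique words.
    fn add(&mut self, id: u64, upvotes: i64, content: &str) {
        self.items.push(IndexedItem {
            id,
            upvotes,
            words: unique_words(content),
        });
    }

    /// Return the ids of items containing every query word (an assumption),
    /// most upvoted first.
    fn search(&self, query: &str) -> Vec<u64> {
        let terms = unique_words(query);
        if terms.is_empty() {
            return Vec::new();
        }
        let mut hits: Vec<&IndexedItem> = self
            .items
            .iter()
            .filter(|item| terms.iter().all(|t| item.words.contains(t)))
            .collect();
        // Stable sort, descending by upvotes.
        hits.sort_by(|a, b| b.upvotes.cmp(&a.upvotes));
        hits.iter().map(|item| item.id).collect()
    }
}
```

Calling `add` once per crawled post or comment and then `search("lemmy search")` would return the matching ids in upvote order. A linear scan like this only works for small data sets; at scale the same idea would need an inverted index or the database's own text search.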
Right now it only crawls a specific instance that you point it to. As long as that instance is federated, it /should/ get everything. Eventually, though, I plan on using that instance's list of federated instances to scan everything and lighten the load on any one particular instance.
Edit: I thought about tapping into the existing database, but it's geared more towards serving content than searching. The database I'm building is searchable, but I drop so much of the original data that it's worthless for serving content.
Now I'm curious what your stack is? Are you using an elastic database?
HTML + JavaScript frontend. Rust backend with a postgres database.
It'll be open sourced once I can get the MVP ready.