Hard Light Productions Forums

Site Management => Site Support / Feedback => Topic started by: ngld on November 10, 2018, 05:11:11 am

Title: New forum search
Post by: ngld on November 10, 2018, 05:11:11 am
There've been quite a few complaints about the existing forum search. So I thought I'd write my own as a fun side project.
You can see the result here: https://hlp-search.tproxy.de/

It's indexing every publicly available post. However, the indexing process hasn't finished (at the time of writing) so quite a few posts might be missing. The above link will let you know how far along the indexing process is.
The search engine itself is fairly straightforward. You enter your search query and get results. Advanced searches (e.g. find this -but -not -this, "Find this exact sentence.", show only results where +this word appears, etc.) are also possible. Here's the full documentation of the query syntax (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-simple-query-string-query.html#_simple_query_string_syntax). The input field has autocomplete but that's just based on all of the indexed thread titles so it might lead to some interesting results. You'll also get suggestions like "Did you mean ...?" if you misspelled something.
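A minimal sketch of how a query like the ones above could be handed to Elasticsearch's simple_query_string query. The field names ("title", "body") and the helper name are assumptions made for illustration; the thread does not show the real index mapping or frontend code.

```python
import json


def build_search_body(user_query, size=10):
    """Build a request body for an Elasticsearch simple_query_string search."""
    return {
        "size": size,
        "query": {
            "simple_query_string": {
                # The raw text typed by the user, operators included
                # (-word, "exact phrase", +word, ...).
                "query": user_query,
                # Hypothetical field names; the real mapping is not shown in the thread.
                "fields": ["title", "body"],
                # Require all terms unless the user says otherwise (an assumption).
                "default_operator": "and",
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_search_body('"new leviathan" -retail'), indent=2))
```

The resulting dictionary would be sent as the JSON body of a search request; Elasticsearch itself interprets the operators, so the frontend does not need to parse them.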

Anyway, this was a fun project. Let me know if this is actually useful to anyone, if you're interested in the source code / how it works or want some kind of change.

EDIT: Mentioned issues were fixed. New posts are added to the index every 2 hours.
Title: Re: New forum search
Post by: jr2 on November 10, 2018, 08:42:36 am
Can we get a link to it in the top navbar?  Maybe Search > ngld improved search -OR- legacy search?
Title: Re: New forum search
Post by: Goober5000 on November 12, 2018, 06:03:36 pm
That's jumping the gun; this should be thoroughly tested before we put links anywhere.  We don't want to overload ngld with tech support requests for this and Knossos. :)  Plus this is limited to publicly available posts, not the whole forum.

But this is cool!  Forum search has sucked for quite a long time; this is a welcome development.  If it proves to be substantially more accurate than the forum search, then I would be in favor of indexing the entire forum and making this the default.
Title: Re: New forum search
Post by: Novachen on November 12, 2018, 06:28:00 pm
A nice addition, sure :)

I personally never had issues with the forum search, however. I've been able to find everything so far without problems :).

I think the main problem people on forums have is that they simply search wrong :D
Title: Re: New forum search
Post by: ngld on December 06, 2018, 08:55:43 am
From now on the indexer should add new posts every 2 hours, which should allow you to find most recent posts (I can do this more frequently but I don't want to put unnecessary load on the HLP server).

I've noticed that it's kind of slow if no one's used the search in a while (results take ~6 seconds in that case) but quickly gets faster with every following search request (something like ~3s, ~1.5s, ~0.9s). I haven't looked into why this happens but I guess the search server (Elasticsearch) empties its memory cache when it sits idle for too long. Either that or the memory gets swapped out. I might be able to fix that but I'm not sure if anyone's actually interested in that. So far, I've seen fewer than one search per day.

It's possible to add additional filters (i.e. show only results from these threads / forum /...) and fairly simple. Indexing private stuff is more complicated because then the search frontend somehow needs to know which forums the user can access and restrict the search results to those forums. Most of the complicated stuff here is somehow retrieving that list (either from the forum DB or through a PHP script).
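A rough sketch of the restriction step described above, assuming each indexed post carries a numeric board identifier. The field name "board_id" and the function name are hypothetical, and retrieving the list of boards a user may read (the hard part mentioned above) is left out entirely.

```python
def restrict_to_boards(query, allowed_board_ids):
    """Wrap an Elasticsearch query so it only matches posts in the given boards."""
    return {
        "bool": {
            # The user's actual search; this part still determines relevance.
            "must": query,
            # Filters do not affect scoring, they only exclude documents.
            # "board_id" is a hypothetical field name.
            "filter": {"terms": {"board_id": sorted(allowed_board_ids)}},
        }
    }


if __name__ == "__main__":
    user_query = {"simple_query_string": {"query": "shivans"}}
    print(restrict_to_boards(user_query, {12, 3, 50}))
```

The same wrapper would also cover the simpler case of an optional "only search these forums" filter, since the only difference is where the list of board IDs comes from.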

Finally, it'd be possible to make the default search fields in the forum use my search but as Goober said, I'd rather make sure it works as expected first. I could probably add a link to my search engine which allows you to run the same search using the forum's search. Something like "retry with the old forum search". Though that'd only help if we made my search engine the new default. We could probably make this a profile setting or something like that... then everyone can decide for themselves which engine they want to use by default.
Title: Re: New forum search
Post by: Nightmare on December 06, 2018, 01:59:25 pm
If you think it's stable enough you could post it in the "News:"-line on top of every page. Ask people for feedback/testing, stuff like that.
Title: Re: New forum search
Post by: ngld on December 06, 2018, 02:31:54 pm
It's not going to crash or anything like that... worst possible outcome would be that it doesn't return the results you expect it to. Though that's pretty unlikely given that all of the actual searching and indexing are handled by Elasticsearch. The only thing I did was write some Python scripts to feed it the forum posts (~600 lines) and a simple web frontend that allows you to send queries to the server and formats the results (~100 lines). With that amount of code, there's not much that can go wrong...
Title: Re: New forum search
Post by: Nightmare on December 06, 2018, 03:20:16 pm
Well then go for it! You're an Admin now! :D
(and I'd guess most people have read about FSMods in the meantime)
Title: Re: New forum search
Post by: ngld on December 06, 2018, 05:18:34 pm
Eh, why not? Would be nice if there was a way to hide old news entries instead of just deleting them... would give you a way to reenable them later. Anyway, I've saved the old news entry in case someone wants me to restore it.
Title: Re: New forum search
Post by: Nightmare on December 06, 2018, 06:21:30 pm
Couldn't you just put them under each other? If you turn the first line into a one-liner, say, "ngld is testing a new forum search which hopefully works better than the existing one. It currently only indexes public posts. Feedback is welcome!", there should be plenty of space (and the other message wasn't that long either).
Title: Re: New forum search
Post by: ngld on December 06, 2018, 08:11:47 pm
I could just leave it enabled. The forum would then cycle between the two but as you said before... it's been a while since we restored FSMods so the message isn't that relevant anymore.
Title: Re: New forum search
Post by: PIe on December 06, 2018, 08:42:57 pm
It's possible to add additional filters (i.e. show only results from these threads / forum /...) and fairly simple. Indexing private stuff is more complicated because then the search frontend somehow needs to know which forums the user can access and restrict the search results to those forums. Most of the complicated stuff here is somehow retrieving that list (either from the forum DB or through a PHP script).
That would be nice.  Another enhancement would be to search among a specific user's posts and topics.
Right now I don't have a lot of feedback because I don't search the forum that much, but when I do, I'll try to compare results.  One thing I do like is that, unlike in the forum search, the query is embedded in the URL.
Title: Re: New forum search
Post by: jr2 on December 07, 2018, 03:17:47 am
Discord has a nice list of options for searching:

from: user
mentions: user
has: link, embed, or file
before: date
during: date
after: date
(replace in: channel with in: thread # or title?)


Not saying I'm requesting all of that, just having a working forum search is great! But if the list inspires you to implement a few of those, even better!  :cool:

Title: Re: New forum search
Post by: Nightmare on December 07, 2018, 04:34:54 pm
One thing I'm wondering about- does this search somehow mess with the way the HLP forum software registers page views? Yesterday evening, the "Shattered Stars"-thread had around 71.000 views, now it already has 2500 views more and I've noticed a similar rapid increase on a few other threads as well. I'm sorry if I'm wrong and some botnet is responsible for that (though it doesn't do any harm, it just alters the statistics), I'm just surprised as forum activity doesn't seem to explain that alone.
Title: Re: New forum search
Post by: ngld on December 08, 2018, 06:22:34 am
@PIe: Thanks!

@jr2: Noted. User / before / during / after are definitely possible since that's already info I have stored (just need to parse the dates). Links and mentions sound more complicated since it requires me to analyze the post content. Should be possible though.
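One way the user and date options could be pulled out of the search box, sketched under the assumption that the keywords are written as "from:", "before:" and "after:" and that dates are given in ISO format (YYYY-MM-DD). Neither the keyword spelling nor the date format is confirmed anywhere in this thread.

```python
import re
from datetime import date

# Matches "from: name", "before: 2018-12-01" or "after:2018-11-10".
FILTER_RE = re.compile(r"\b(from|before|after):\s*(\S+)")


def parse_filters(raw):
    """Split a raw search string into free text and a dict of filters."""
    filters = {}

    def take(match):
        key, value = match.group(1), match.group(2)
        if key in ("before", "after"):
            # Raises ValueError for anything that is not YYYY-MM-DD.
            value = date.fromisoformat(value)
        filters[key] = value
        return ""  # remove the filter from the free-text part

    text = FILTER_RE.sub(take, raw)
    return " ".join(text.split()), filters


if __name__ == "__main__":
    print(parse_filters("shivans from: ngld after: 2018-11-10"))
```

The remaining free text would go to the normal search, while the extracted values could become filters on the stored author and date fields.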

The indexer just looks at the recent posts page and sometimes goes to a forum to get a list of recent topics. Those are the only pages it automatically accesses. If the forum software is counting accessing the recent posts page as a view for each thread listed... that would be pretty dumb and still couldn't explain this since that comes out at 10*12 = 120 requests per day (the recent posts page has 10 pages and the indexer runs every 2 hours so 12 times per day). And that would be the worst case scenario since the indexer stops going through those pages once it sees a post it has already seen.
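The stop-early behaviour and the request estimate above can be sketched as follows. This is not the actual indexer: fetch_page is a hypothetical callable standing in for the code that downloads and parses one page of recent posts, and posts are assumed to arrive newest first.

```python
def collect_new_posts(fetch_page, seen_ids, max_pages=10):
    """Walk the recent-posts pages and stop at the first already indexed post."""
    new_posts = []
    for page in range(max_pages):
        for post in fetch_page(page):
            if post["id"] in seen_ids:
                # Posts are newest first, so everything after this one
                # has already been indexed in an earlier run.
                return new_posts
            new_posts.append(post)
    return new_posts


# Worst case from the post above: 10 pages per run, one run every 2 hours.
RUNS_PER_DAY = 24 // 2
WORST_CASE_REQUESTS_PER_DAY = 10 * RUNS_PER_DAY

if __name__ == "__main__":
    pages = [[{"id": 5}, {"id": 4}], [{"id": 3}, {"id": 2}]]
    fake_fetch = lambda n: pages[n] if n < len(pages) else []
    print(collect_new_posts(fake_fetch, seen_ids={3}))
    print(WORST_CASE_REQUESTS_PER_DAY)
```

In the common case only the first page is requested, because a post that is already indexed shows up almost immediately.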
Title: Re: New forum search
Post by: tomimaki on December 08, 2018, 11:41:15 am
Is there a way to sort search results by date?
Title: Re: New forum search
Post by: Nightmare on December 08, 2018, 07:58:07 pm
Ah OK my bad, I don't know anything about how search engines work, sry... :sigh:

Still I'm curious what's going on there, the Shattered Stars thread gained another 10.000 views since yesterday (85.220).
Title: Re: New forum search
Post by: PIe on December 10, 2018, 06:44:26 am
I've submitted a DuckDuckGo bang for !hlp.  If you don't use DDG, bangs are a quick way of searching a specific site.  For instance, "!gh freespace 2" will search GitHub for "freespace 2".
I don't know how long it takes to get a new bang approved or even how strict they are at approving them.
As for results, after just a few searches, it does seem better.  For instance, I searched for "new leviathan" and the integrated forum search was practically useless while this one came up with some useful results.
Title: Re: New forum search
Post by: jr2 on December 11, 2018, 04:39:27 am
I've submitted a DuckDuckGo bang for !hlp.  If you don't use DDG, bangs are a quick way of searching a specific site.  For instance, "!gh freespace 2" will search GitHub for "freespace 2".
I don't know how long it takes to get a new bang approved or even how strict they are at approving them.
As for results, after just a few searches, it does seem better.  For instance, I searched for "new leviathan" and the integrated forum search was practically useless while this one came up with some useful results.

Nice
Title: Re: New forum search
Post by: PIe on December 21, 2018, 05:29:29 pm
It would be nice if clicking on a suggestion automatically searched for it instead of just filling in the text box.
Title: Re: New forum search
Post by: ngld on December 22, 2018, 07:28:30 pm
Noted. I might be able to fix that and add sort by date tomorrow. No promises though.
I think I've got the speed issues under control. It should take at most 6 seconds for a non-cached query. Once it's cached (someone ran a search for it), it should drop to ~0.4 seconds.
Title: Re: New forum search
Post by: jr2 on January 09, 2019, 07:23:10 pm
Made shortened URL (mind the capitalization):

https://is.gd/nHLPsearch
Title: Re: New forum search
Post by: PIe on January 21, 2019, 06:26:38 pm
Discord has a nice list of options for searching:

from: user
mentions: user
has: link, embed, or file
before: date
during: date
after: date
(replace in: channel with in: thread # or title?)


Not saying I'm requesting all of that, just having a working forum search is great! But if the list inspires you to implement a few of those, even better!  :cool:
I came here to post this and then realized you already had, so seconding.
Title: Re: New forum search
Post by: ngld on February 09, 2019, 07:05:33 pm
Quick update:

EDIT: Forgot to mention: Invalid search queries currently trigger an Internal Server Error. I'll add a proper error message later. However, invalid queries can only happen if your search query contains a colon ( : ).
Title: Re: New forum search
Post by: wookieejedi on February 09, 2019, 07:25:44 pm
Great to see the update!
Title: Re: New forum search
Post by: tomimaki on March 03, 2019, 11:28:29 am
Thanks for sorting by date,
but it works only for first page. :(
Title: Re: New forum search
Post by: ngld on March 03, 2019, 03:08:00 pm
Whoops. Should be fixed now.
Title: Re: New forum search
Post by: tomimaki on March 04, 2019, 09:33:40 am
 :yes:
Title: Re: New forum search
Post by: PIe on April 23, 2019, 10:29:39 pm
DDG accepted !hlp as a bang, which is cool.
Title: Re: New forum search
Post by: X3N0-Life-Form on April 24, 2019, 04:32:06 am
DDG accepted !hlp as a bang, which is cool.
Wow, that is cool.

The new search works great btw, I got exactly the post I wanted (https://www.hard-light.net/forums/index.php?topic=70746.msg1756854#msg1756854) out of a 272-page thread.