Faceted search in v10+ with examine
# help-with-umbraco
s
How are people doing faceted search in new Umbraco projects? I've used Bobo before, but it doesn't look like it's been updated for .NET Core.
d
Heyo! I've been doing exactly this recently. I got the trick from somebody else (and I don't know if it's the best way to do it, but it works very well). I use Lucene directly instead of going through Examine. I have additional indexes that are kept in sync with the external index and use the Lucene taxonomy index. Then I query that index directly using the Lucene query building approach instead of the query builder from Examine.
If nobody has a better approach, I think I can extract a zip-file from my solution and share with you outside of my working hours.
s
Yeah, was looking at that @nzdev - but unsure if I can make it work in the current Examine version 🙂
n
In theory https://github.com/Shazwazza/Examine/pull/347 should be a drop-in replacement and ready to go, though the taxonomy index would need a replacement of the directory factory from Umbraco.
s
Seems like the NuGet package generated from that PR installs fine, and Umbraco runs fine... So I'll try to chase this option 🙂
n
Great. If you hit any issues or want some assistance with enabling the taxonomy index in Umbraco, leave a message on the issue on GitHub.
s
So, this part: https://github.com/nzdev/Examine/blob/v4/feature/v3-v4-api-compat/docs/v2/articles/configuration.md#facets-configuration, goes in IConfigureNamedOptions, right? Do I just add it to the ExternalIndex, if that is the one I want to add facets to? And then a new FieldDefinition for the fields containing facets?
n
Yes that is the correct place
The config only needs to be on the index it's used for
// Create a config
var facetsConfig = new FacetsConfig();

// Set field to be able to contain multiple values (This is default for a field in Examine. But you only need this if you are actually using multiple values for a single field)
facetsConfig.SetMultiValued("MultiIdField", true);

services.AddExamineLuceneIndex("MyIndex",
    // Set the indexing of your fields to use the facet type
    fieldDefinitions: new FieldDefinitionCollection(
        new FieldDefinition("Timestamp", FieldDefinitionTypes.FacetDateTime),
        new FieldDefinition("MultiIdField", FieldDefinitionTypes.FacetFullText)
    ),
    // Pass your config
    facetsConfig: facetsConfig
);
s
I'm probably not doing it right... I have this Configure class:
public class ConfigureIndexOptions : IConfigureNamedOptions<LuceneDirectoryIndexOptions>
{
    public void Configure(string? name, LuceneDirectoryIndexOptions options)
    {
        if (name?.Equals("ExternalIndex") is true)
        {
            var facetFields = new List<string>
            {...}; // all the fields names

            var facetsConfig = new FacetsConfig();
            foreach (var field in facetFields)
            {
                options.FieldDefinitions.AddOrUpdate(new FieldDefinition(field, FieldDefinitionTypes.FacetTaxonomyFullText));
            }

            options.FacetsConfig = facetsConfig;
            options.UseTaxonomyIndex = true;
        }

    }

    public void Configure(LuceneDirectoryIndexOptions options) => Configure(string.Empty, options);
}
But, when I try to search, it breaks saying
NotSupportedException: Directory Factory does not implement CreateTaxonomyDirectory
It seems like it breaks already when getting the searcher from the index.
n
Yep. As I mentioned, the taxonomy index requires an update to the directory factory from Umbraco.
s
Ah, sorry missed that 🙂
n
Fortunately this can be replaced easily enough.
I would raise a PR to Umbraco, however I'm waiting on v4 being released.
In the meantime, it can be done without modifying Umbraco.
s
This is added to IConfigureNamedOptions too?
n
Yes
That overload will let you pass in the type for the directory factory.
s
I think it did the trick - now I just need to get it to get the facets 🙂 Thanks for all your help!
n
You are welcome
I eagerly await the blog post to make this simple for others 🙂
s
Now, I just need to figure out how - think I'm going to do a smaller example on one of the starter kits, and write a blog post 🙂
How would I go about searching for more than one facet? If I add `.GroupedOr(new [] { "myFacetField"}, new[] { "Facet1", "facet2" }).WithFacets(facets => facets.FacetString("myFacetField"))` I only get the already selected facets back on `result.GetFacet("myFacetField")`.
n
https://lucenenet.apache.org/docs/4.8.0-beta00011/api/facet/Lucene.Net.Facet.DrillDownQuery.html not sure if this is what is needed. It's not implemented in the abstraction yet, but can be passed using casting to a lucene searcher
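To make the multi-select problem concrete, here is a conceptual sketch in plain C# of what drill-sideways counting computes: when counting values for one facet field, every selected filter is applied except the one on that same field, so unselected values keep their counts. This is not Lucene's or Examine's implementation. The type and member names (`DrillSidewaysSketch`, `CountFacet`) are made up for illustration, and each document is assumed to hold a single value per field.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DrillSidewaysSketch
{
    // docs: each document maps a field name to a single facet value (simplified on purpose).
    // selections: the values the user has ticked, per field.
    // facetField: the field whose value counts we want to display.
    public static Dictionary<string, int> CountFacet(
        IEnumerable<IReadOnlyDictionary<string, string>> docs,
        IReadOnlyDictionary<string, ISet<string>> selections,
        string facetField)
    {
        return docs
            // Apply every active filter EXCEPT the one on the field being counted.
            .Where(doc => selections
                .Where(s => s.Key != facetField)
                .All(s => doc.TryGetValue(s.Key, out var v) && s.Value.Contains(v)))
            // Only documents that actually carry the field can contribute a count.
            .Where(doc => doc.ContainsKey(facetField))
            .GroupBy(doc => doc[facetField])
            .ToDictionary(g => g.Key, g => g.Count());
    }
}
```

Filtering the hits with `GroupedOr` and then faceting the same field counts only documents that already match the selection, which is why only the selected values come back. In Lucene.NET, the `DrillSideways` class used together with `DrillDownQuery` is the built-in way to get the behaviour sketched here.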
s
you mean casting the ISearcher from Examine to a lucene searcher?
n
Yes
s
I think I fell in at the deep end now 🙈
j
Do either of you know how I'd get it to treat fields with space separated values as unique values? Right now I am getting something like this:
d
Which analyzer are you using? Looks like it's not tokenizing your value sets as expected
n
I think you are missing the setting that marks the value as multi-valued. See the example at the bottom here: https://github.com/Shazwazza/Examine/blob/release/4.0/docs/v2/articles/configuration.md
j
@D_Inventor I am adding facets to some fields in the external index, haven't changed the default tokenizer. @Nikcio I had that added as Nzdev mentioned it further above in this thread. I've tried both adding it and removing it and neither does anything - I get the exact same results. This is my index options - everything else looks fine
d
It could be that facet fields require you to index your fields explicitly as separate values, rather than a single space-separated value. It's what I did to make my taxonomy index work.
I'm speculating though
j
May be true, I ended up writing a helper method that splits them up and sums up the values into unique facet fields for now 🤷‍♂️
n
Oh yeah, you need to index each value separately for facets to work correctly, like D_Inventor mentions: https://shazwazza.github.io/Examine/articles/indexing.html#multiple-values-per-field
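A minimal sketch of the splitting step, assuming the stored value is a single space-separated string and that no individual value contains a space. `FacetValueSplitter` is a hypothetical helper name, not part of Examine or Umbraco.

```csharp
using System;
using System.Linq;

public static class FacetValueSplitter
{
    // Turns "eco organic  eco" into ["eco", "organic"]:
    // empty entries from repeated spaces are dropped and duplicates are removed,
    // keeping the order in which values first appear.
    public static string[] Split(string raw) =>
        (raw ?? string.Empty)
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Distinct()
            .ToArray();
}
```

The resulting array would then be added to the value set as separate values for the field, following the "multiple values per field" article linked above, instead of as one combined string.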
j
Good to know - not sure I will get time to go back and change it now - but will keep it in mind for the next time 🙂
If anyone else is trying out faceted search with Examine, and is running into issues with not getting a full list of facets - be aware that by default it is set to only return the top 10 facets. This can however be configured by passing it along in the query:
query.WithFacets(facets => facets.FacetString($"productMarkings_{culture}", c => c.MaxCount(200)));
a
Oh man! I only just saw this thread! @skttl I have a blog post series I'm writing on this for skrift. Part 1 is due out in a few days! Let me re-read this thread on a machine where I can see the code (instead of my phone) and see if I can provide any additional insight.
n
Shazwazza commented on GitHub: "@nzdev + @Nikcio the build for a potential beta is here https://github.com/Shazwazza/Examine/actions/runs/6165490399 If anyone has time, the artifacts have the created NuGet package, would be awesome if someone could test consuming that locally before I publish it to nuget.org?"
j
Running a site with faceted search based on PR 347 - I can try to switch over and run the site later today 🙂
I swapped to the NuGet package on a project and reindexed. I still get the same facets and same search results - did you have anything more specific I should test? 🙂
n
Nope
b
@nzdev Are the docs here correct? https://github.com/Shazwazza/Examine/blob/release/4.0/docs/searching.md#string-facets I don't get the `Facet()` method, but these.
It seems the method was previously named `Facet()` https://github.com/Shazwazza/Examine/pull/311/files#diff-fb375fe299a36165510b574c9ccb6e287eb49ccafcc5b2e05df0d367d5425df0 but now there are `FacetString()`, `FacetDoubleRange()`, `FacetFloatRange()` and `FacetLongRange()`.
I tried the v4 beta package on Umbraco Commerce demostore. https://github.com/umbraco/Umbraco.Commerce.DemoStore I can get facets of "price" field, but I think I need to configure the facet config in the indexing as well to make it work.
n
Yeah, we probably forgot to update the docs when it changed from Facet to FacetString etc.
You will need to configure all the fields you want to facet on, otherwise the indexing process won't index the facets. See https://github.com/Shazwazza/Examine/blob/release/4.0/docs/v2/articles/configuration.md
b
@Nikcio I will have a look. Is there any reason there isn't a `FacetIntegerRange()` when we have `FacetInteger`? Or a `FacetDateTimeRange()` when we have `FacetDateTime()`? Not sure if facets are supported on datetime as well, and I guess with facets on integers the double, float or long facet range method can be used.
It would be great to have an example of how to enable facets for an existing index like `ExternalIndex`.
I think it is something like this, but when I set `UseTaxonomyIndex` here, it fails on startup.
j
Here is how I've done it Bjarne
b
@Jemayn yeah, I looked at your screenshot and I have this:
public sealed class ConfigureIndexOptions : IConfigureNamedOptions<LuceneDirectoryIndexOptions>
{
    public void Configure(string name, LuceneDirectoryIndexOptions options)
    {
        switch (name)
        {
            case "ExternalIndex":

                var priceFields = new List<string>
                {
                    "price"
                };

                // Create a config
                var facetsConfig = new FacetsConfig();

                foreach (var field in priceFields)
                {
                    //options.FieldDefinitions.AddOrUpdate(new FieldDefinition(field, FieldDefinitionTypes.FacetDouble));
                    facetsConfig.SetIndexFieldName(field, $"facet_{field}");
                }

                options.FacetsConfig = facetsConfig;
                //options.UseTaxonomyIndex = true;

                break;
        }
    }

    public void Configure(LuceneDirectoryIndexOptions options)
        => Configure(string.Empty, options);
}
However, it still seems to break when I set `UseTaxonomyIndex`.
When I call `AddOrUpdate()` on the existing "price" field in Umbraco Commerce, the field is lost from the index. I guess it is because the price property is indexed like this by default, where the GUID is a reference to the currency.
I included this in `TransformIndexValues`:
if (e.ValueSet.Values.ContainsKey("price"))
{
    var prices = JsonConvert.DeserializeObject<Dictionary<Guid, string>>(e.ValueSet.GetValue("price").ToString());

    foreach (var price in prices)
    {
        var currency = _currencyService.GetCurrency(price.Key);

        values.Add($"price_{currency.Code}", new[] { price.Value });
    }
}
and updated to this:
public void Configure(string name, LuceneDirectoryIndexOptions options)
{
    switch (name)
    {
        case "ExternalIndex":

            var priceFields = new List<string>
            {
                "price_GBP"
            };

            // Create a config
            var facetsConfig = new FacetsConfig();

            foreach (var field in priceFields)
            {
                options.FieldDefinitions.AddOrUpdate(new FieldDefinition(field, FieldDefinitionTypes.FacetDouble));
                facetsConfig.SetIndexFieldName(field, $"facet_{field}");
            }

            options.FacetsConfig = facetsConfig;
            //options.UseTaxonomyIndex = true;

            break;
    }
}
Is it expected for the `Value` property to have the value 0 here?
n
If using the taxonomy index, the field types should be taxonomyfacetinteger etc.
The directory factories also need to be the new ones from Examine 4; the ones in Umbraco won't work with the taxonomy index.
This is covered in the documentation.
Facet ranges are also available as Int64Range and DoubleRange.
n
If you want to keep using an existing index like the external index, you can't use taxonomy faceting. Taxonomy is faster but requires its own index, to my understanding. But @nzdev is really the expert on taxonomy faceting. For just standard out-of-the-box faceting you can use the Facet[Something] indexing types, but you should not add the use taxonomy setting. And for taxonomy, follow what @nzdev wrote above 😉
b
I get facet ranges for string, double, float and long, but I don't see one for integer or datetime. Not sure if datetime works though, or whether it may be handled as long (ticks) instead.
Makes sense regarding taxonomy and the taxonomyfacet field types. Currently I am just playing with it in the Umbraco Commerce demostore, which uses the `ExternalIndex` by default as the products are based on Umbraco nodes. I think the price fields need to be split into separate fields to make it work with facets: https://github.com/umbraco/Umbraco.Commerce.DemoStore/pull/2
Regarding e.g. `DoubleRange()`: should `minInclusive` and `maxInclusive` be `true` by default, like they are with `RangeQuery<T>()`?
query.And().RangeQuery<long>(new[] { "myField" }, 0, 10);
I get some hits in facets now... I forgot to change the field name after extracting the fields from the original "price" field in Umbraco Commerce.
public sealed class ConfigureIndexOptions : IConfigureNamedOptions<LuceneDirectoryIndexOptions>
{
    public void Configure(string name, LuceneDirectoryIndexOptions options)
    {
        switch (name)
        {
            case Constants.UmbracoIndexes.ExternalIndexName:

                var priceFields = new List<string>
                {
                    "price_GBP"
                };

                // Create a config
                var facetsConfig = new FacetsConfig();

                foreach (var field in priceFields)
                {
                    options.FieldDefinitions.TryAdd(new FieldDefinition(field, FieldDefinitionTypes.FacetDouble));
                    facetsConfig.SetIndexFieldName(field, $"facet_{field}");
                }

                options.FacetsConfig = facetsConfig;
                //options.UseTaxonomyIndex = true;

                break;
        }
    }

    public void Configure(LuceneDirectoryIndexOptions options)
        => Configure(string.Empty, options);
}
n
It's not a good idea to set optional params, as it makes it harder for libraries to keep backwards compatibility. V4 has been analyzed to be V3 compatible and to not cause issues for a v5 being backwards compatible with v4.
I used Microsoft.CodeAnalysis.PublicApiAnalyzers to track compatibility. Pretty neat.
Might make sense to get Umbraco docs updated for how to do this as the Examine docs are generic to Examine.
Examine docs are way easy to contribute to now that they are on docfx. I'd encourage those in the thread to consider adding to the docs, easy as clicking the "Improve this doc" link on each page top right and clicking the pencil in github, make an edit and raise a pr
It would really help to have more examples of these faceting issues that have appeared in this thread documented.
n
Lucene doesn't have a facet API that takes ints, so it doesn't make sense to make one in Examine as it would call the same thing. You should just use the one that takes longs. As for datetime, you should use ticks / long values for faceting; then you can format them back into datetimes after retrieving them from Examine.
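A small sketch of the ticks round trip described above, using only the .NET base library. It assumes the dates are handled as UTC and that the facet field holds the tick count as a long; `FacetDateTicks` and its members are hypothetical helper names.

```csharp
using System;

public static class FacetDateTicks
{
    // Convert a date to the long value that would be indexed / used as a range bound.
    public static long ToFacetValue(DateTime value) => value.ToUniversalTime().Ticks;

    // Convert a long coming back from a facet or field into a UTC DateTime again.
    public static DateTime FromFacetValue(long ticks) => new DateTime(ticks, DateTimeKind.Utc);

    // Bounds for one calendar year, intended as min inclusive / max exclusive.
    public static (long Min, long Max) YearBounds(int year) =>
        (new DateTime(year, 1, 1, 0, 0, 0, DateTimeKind.Utc).Ticks,
         new DateTime(year + 1, 1, 1, 0, 0, 0, DateTimeKind.Utc).Ticks);
}
```

The `Min` and `Max` from `YearBounds` could then be fed into a long range such as the `Int64Range` used elsewhere in this thread, with the minimum inclusive and the maximum exclusive.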
b
Yeah, I was mostly wondering because the field definition has int and datetime.
From `FacetResult`, is there any way to get the label/field/alias?
var results = query.OrderBy(new SortableField("name", SortType.String))
    .WithFacets(facets => facets
        .FacetLongRange("isGiftCard", new Int64Range[] {
            new Int64Range("no", 0, true, 1, false),
            new Int64Range("yes", 0, false, 1, true)
        })
        .FacetDoubleRange("price_GBP", new DoubleRange[] {
            new DoubleRange("0-10", 0, true, 10, true),
            new DoubleRange("11-20", 11, true, 20, true),
            new DoubleRange("20-30", 21, true, 30, true)
        })) // Get facets of the price field
    .Execute(QueryOptions.SkipTake(pageSize * (page - 1), pageSize));

IEnumerable<IFacetResult> facets = results.GetFacets();
In the model it is cast to `IEnumerable`:
E.g. mapping the facets to facet groups with their values. In a UCommerce project, we had something like this:
private IEnumerable<SearchFacetGroup> MapFacets(IList<Ucommerce.Search.Facets.Facet> facets)
{
    var mappedFacets = facets
        .Select(x => new SearchFacetGroup()
        {
            Name = GetFacetName(x),
            Facets = x.FacetValues.Select(f => new SearchFacet()
            {
                Count = f.Count,
                Name = f.Value
            })
            .OrderBy(f => f.Name)
        });
    return mappedFacets;
}
From `FacetResult` it doesn't seem we can tell which field it is related to? Or am I missing something? 😄
FYI I have something like this: https://github.com/umbraco/Umbraco.Commerce.DemoStore/pull/3 But using `GetFacets()` it seems I can't find which are related to `isGiftCard` and `price_GBP`... except maybe I can rely on the index in the collection?
I also noticed the `Value` property on `IFacetValue`, which is the occurrence count, but wouldn't `Hits` or `Count` make more sense? Not sure if there's a specific reason the type is `float` (decimal value) instead of `int` (without decimals)?
n
You should be able to enumerate the FacetResult by default without casting it https://github.com/Shazwazza/Examine/blob/release/4.0/src/Examine.Core/Search/FacetResult.cs#L26
But you are probably right that there should be a way to get the name of the field the facets are on. One way to go about it is to use this method on the search result; then you know exactly which values come from which field: https://github.com/Shazwazza/Examine/blob/release/4.0/src/Examine.Lucene/FacetExtensions.cs#L16
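A rough sketch of that idea: instead of enumerating all facet results and guessing by position, loop over the field names you asked for and look each one up, so every group carries its field name. The code below is plain C# with no Examine dependency. The `lookup` delegate stands in for a per-field lookup on the search result, and `SearchFacet`, `SearchFacetGroup` and `FacetGroupMapper` are hypothetical types for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed record SearchFacet(string Name, int Count);
public sealed record SearchFacetGroup(string Field, IReadOnlyList<SearchFacet> Facets);

public static class FacetGroupMapper
{
    // fields: the facet field names used in the query.
    // lookup: returns the (label, value) pairs for a field, or null when the field has no facet result.
    public static IReadOnlyList<SearchFacetGroup> Map(
        IEnumerable<string> fields,
        Func<string, IEnumerable<(string Label, float Value)>> lookup)
    {
        return fields
            .Select(field => (field, values: lookup(field)))
            .Where(x => x.values != null) // skip fields without a result
            .Select(x => new SearchFacetGroup(
                x.field,
                x.values
                    // Counts arrive as float, so round them to whole numbers for display.
                    .Select(v => new SearchFacet(v.Label, (int)Math.Round(v.Value)))
                    .OrderBy(f => f.Name, StringComparer.Ordinal)
                    .ToList()))
            .ToList();
    }
}
```

With Examine, the lookup could wrap `results.GetFacet(field)` from the extension linked above and project each facet value's label and `Value`; the exact property names on the facet value are an assumption here and should be checked against the package.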
b
@Nikcio feel free to try this out https://github.com/umbraco/Umbraco.Commerce.DemoStore/pull/3 It also seems a facet is only available when the field exists on the document, e.g. with the facet on `isGiftCard`. So you need to ensure the field exists on the document; in this case I just needed to re-publish these product nodes.
n
Yeah that is how I would expect it to work. If there's no data in a field it can't be part of a facet value range
b
With `FacetString()`, should there be a way to specify a label for each value, as for the ranges? For example the field may store 0/1, but in the frontend one may want "True/False" or "Yes/No".
var results = query.OrderBy(new SortableField("name", SortType.String))
    .WithFacets(facets => facets
        .FacetString("isGiftCard", null, new[] { "1" })
        //.FacetLongRange("isGiftCard", new Int64Range[] {
        //    new Int64Range("no", 0, true, 1, false),
        //    new Int64Range("yes", 0, false, 1, true)
        //})
        .FacetDoubleRange("price_GBP", new DoubleRange[] {
            new DoubleRange("0-10", 0, true, 10, true),
            new DoubleRange("11-20", 11, true, 20, true),
            new DoubleRange("20-30", 21, true, 30, true),
            new DoubleRange("30-40", 31, true, 40, true),
            new DoubleRange("40-50", 41, true, 50, true)
        })) // Get facets of the price field
    .Execute(QueryOptions.SkipTake(0, 1000));

var facets = results.GetFacets();
https://cdn.discordapp.com/attachments/1140878433157652520/1156483133068300378/image.png?ex=65152259&is=6513d0d9&hm=305ef59e08714eecb97b1cb3191c7b745ae4a082060f9885cff7a1dd40a814e6&
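One thing to watch with price ranges like the ones above: with inclusive bounds such as 0 to 10 and 11 to 20 on a double field, a price of 10.50 falls into neither bucket. Here is a sketch of contiguous half-open buckets (min inclusive, max exclusive) that avoids the gap. `RangeBucket` is a hypothetical stand-in that mirrors the (label, min, minInclusive, max, maxInclusive) argument order used above; it is not the Examine `DoubleRange` type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in with the same argument order as the ranges in the query above.
public sealed record RangeBucket(string Label, double Min, bool MinInclusive, double Max, bool MaxInclusive)
{
    public bool Contains(double value) =>
        (MinInclusive ? value >= Min : value > Min) &&
        (MaxInclusive ? value <= Max : value < Max);
}

public static class PriceBuckets
{
    // Builds [0, step), [step, 2*step), ... so every non-negative value below count*step
    // lands in exactly one bucket.
    public static IReadOnlyList<RangeBucket> HalfOpen(double step, int count) =>
        Enumerable.Range(0, count)
            .Select(i => new RangeBucket(
                FormattableString.Invariant($"{i * step}-{(i + 1) * step}"),
                i * step, true,
                (i + 1) * step, false))
            .ToList();
}
```

The same bounds and inclusivity flags could be passed to the real range type when building the facet query; labels such as "10-20" then describe a range that includes 10 and stops just short of 20.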
@nzdev Would it be possible to use taxonomy faceting with e.g. the existing `ExternalIndex`, possibly with changes in Umbraco core? Sure, we can add a new index, but it would basically hold the same data that already exists in `ExternalIndex`, since Umbraco Commerce is based on product/content nodes and therefore uses `ExternalIndex`. The same goes if one wanted to use taxonomy faceting on data in the content tree, e.g. employees, or media listed on a Press page with options to filter these.
n
Exactly my interest in this. The answer is yes, with a rebuild and a change of directory factory
Should be possible already
r
I am trying to follow the thread. Please, when somebody writes the blog posts, share them here as well 😄
Hey, I am interested in your raw lucene approach! If you could share some details about it it would be great! 🙂
d
I'll see if I can share some snippets with you later today 😄
r
Hey, what's the summary of this one? Is there a blog post somewhere? Have facets been added to Examine in the latest version?
I found it: https://shazwazza.github.io/Examine/articles/configuration.html also info in Searching section