Any special considerations for large sites (65k+ d...
# help-with-umbraco
m
I am working on a project that should be finished in a month or so. This project will import a ton of data into Umbraco 13 as document types with a lot of cross-referencing to each other. It turns out this is going to create something like 65,000 documents (yes, really, 65 thousand). Are there any special considerations I should keep in mind once I get all this data imported? How about dealing with indexing it all for search?
m
You probably want to review https://docs.umbraco.com/umbraco-cms/reference/common-pitfalls#using-umbraco-content-items-for-volatile-data and some other useful items there. (Examine might be your friend for querying the large dataset, though I have been proved wrong in the past, with optimised LINQ queries faring better.)
m
Thanks for the link. It looks like none of that is something I will have to keep in mind. All the docs are static, and while some may get small updates down the road, for the most part the data is static once the import is done. I will keep the lookups in mind, but I don't think that will be a problem either. The worst case for my lookups is, say, a recipe that has a dozen items in it, so I use the multi tree picker to link to those items and in the template read the names of the items for display. Examine is the default search provider, right? If that is the case, then that is what I am using.
b
@Mike Chambers LINQ is much slower if you filter in more than one dimension. Examine is your friend, but the default Umbraco Examine indexes are poorly optimised 😬 so you might not get as much benefit as you would with optimisation. I also recommend switching the underlying Lucene engine through ExamineX or my Examine providers for stability reasons.
m
Can you elaborate on what you mean by switching the underlying Lucene engine? I am basically a hobbyist programmer, and beyond surface-level things with Umbraco I don't know much. I do know that there is a problem with Examine already hitting my sites, but they are supposed to be working on a fix. If the default search is not really optimised, I would want to at least add getting a more optimised search to my to-do list, as I do see visitors to my site using the search function a lot.
b
It is not possible to fix Examine in the first place, we should start from that... Shannon and I each introduced similar products which switch it to Azure Search/Elastic; they work similarly but differently. The issue relates to using a virtual path in apps, which can be reused by other apps when the underlying hardware is switched in Azure (slot swapping, Azure relocations etc.), and this is not possible to fix. What Shannon did is mitigate the issue and make the index rebuild (or allow it to rebuild) every time it fails... 🙂 It is better to have documents with fewer fields than Umbraco produces 😉 so to optimise it you need to override quite a lot in Umbraco.
m
Sorry, I am confused. You said you recommend switching it, but now you are saying Examine is too integrated into Umbraco to be switched out? I am running on Umbraco Cloud, so I know it runs on Azure, but I can't spin up special things, so it would all have to be self-contained if I do anything.
@bielu Were you talking about https://examinex.online/ ? I ran across it while looking at some issues on GitHub. Maybe this is the better option for me to be using? Edit: Never mind on that. I can't afford $1000.
b
It is not possible to switch Examine out in Umbraco; both ExamineX and my providers hack around that and allow you to use other engines instead.
m
After reading the ExamineX page a bit more, I understood what you were saying. Unfortunately for me, I can't afford the package, so I am probably going to have to pray Umbraco HQ figures out something to make things work on their Cloud hosting.
b
My version is free, but I do not provide support unless someone donates 😂
They both do the same thing but are implemented differently. I also do aliasing, which Shannon does not, and I require Elastic 8 and the newest Azure Search, which Shannon does not 🙂
m
Don't misunderstand. I have no issues paying for software or support. But between my normal bills and the cost of trying to get my business started, I fall about $500 further into debt each month. My big hope is that after this data import and an upcoming project that is going to eat up my entire bonus from work in October, I can start to make some income from my business.
b
@Matthew Alexandros I am saying that you have a choice here: - paid version from Shannon, - free version from me, - use base Examine and spend time fixing index issues 🙂
m
If the data is static, why make it doc types at all? You could instead store it in the DB and do the routing as you need. Then if you need to edit it, use https://umbraco.com/products/add-ons/ui-builder/
m
OK, I think I understand now, thanks. I will have to dig into your package more on my next day off to see if it is something I can tackle myself. Thanks for the help understanding the limitations I am probably going to run into with the built-in search.
I don't think that package applies to what I am doing. While the data is pretty much static, it does not live anywhere yet, and my understanding of that package is that it gives you an Umbraco-style editor for an existing database of data. In my case the data I am importing does not live anywhere accessible by Umbraco. Also, with that package we once again run into costs, which I can't afford right now. I may have also used the wrong words in my first post. When I say "documents" I mean pages on the "Content" tab that get URLs assigned to them for access from the front end. There are only going to be about a dozen or so "Document Types" (those in the folder "Document Types" under the Settings tab).
s
I think people are overcomplicating things in this thread and assuming things. What are your requirements? If I read it correctly, you say this is a recipe site? And for each recipe you want to do a link picker that picks the ingredients? It is really no problem to have 65k+ nodes in your site, but usually that means you're just putting nodes in because you can, not because they have to be editable. It is more performant to put items that will never be edited somewhere else; even if they need rare edits, that can be managed by something like UI Builder (which I believe still has a free tier for one "collection").
m
The site is actually a game info site, and the data I am importing is achievements, quests, items, shops, crafting recipes etc. Not sure what you mean by requirements in this context though.
s
These are your requirements: a gaming info site with documents linked to documents. See, that sounds fine, no need to worry about advanced searching if there is not going to be advanced searching. If it's just a site with a document structure that people can browse, there should be no problem with that many nodes. Building the cache will be a bit more memory intensive but should be fine. Is there advanced searching? Any need to cobble different documents together on the fly?
m
No advanced searching at this point. Right now all I can ever see is searching on like 3 properties at most. The most complex I could think of is "type=x&class=x&lvl=betweenX/Y" to, say, pull all bracelets for Paladins between level 80 and 90. As for cobbling together different documents, yes, but I don't think it is complex. The most complex cobble would be to read 2 properties (name, icon URL) from the linked document. So if an item could be crafted and bought, then it might make that read on up to a dozen pages selected through the multi URL picker. That design is not finalized but is probably what I am going with in the end. (I might have the picker name wrong. I am still at work so can't confirm; the name might be tree picker or node picker.)
s
A "between" search might be the most interesting part here, but as Examine supports range queries that should be fine. The trickiest part will be constructing some good Examine queries, I think (I haven't done enough with Examine for a long while). I found a recent question about range queries, which may be helpful to you: https://discord-chats.umbraco.com/t/16668102/solved-help-with-examine-rangequery-for-custom-date-field-in
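To make the shape of that search concrete: the "bracelets for Paladins between level 80 and 90" example boils down to two exact matches plus one inclusive range. Below is a minimal sketch in plain Python, not Examine's API. The field names (`type`, `cls`, `lvl`), the `search` helper and the sample items are all hypothetical, made up purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical stand-in for an imported item document."""
    name: str
    type: str
    cls: str
    lvl: int


def search(items, type_, cls, lvl_min, lvl_max):
    """Return items matching type and class with lvl in [lvl_min, lvl_max]."""
    return [
        item
        for item in items
        if item.type == type_
        and item.cls == cls
        and lvl_min <= item.lvl <= lvl_max  # inclusive "between"
    ]


# Made-up sample data, only here to exercise the filter.
ITEMS = [
    Item("Bronze Bracelet", "bracelet", "paladin", 75),
    Item("Silver Bracelet", "bracelet", "paladin", 80),
    Item("Gold Bracelet", "bracelet", "paladin", 90),
    Item("Mage Bracelet", "bracelet", "mage", 85),
    Item("Holy Shield", "shield", "paladin", 85),
]

# Equivalent of "type=bracelet&class=paladin&lvl=between80/90".
results = search(ITEMS, "bracelet", "paladin", 80, 90)
for item in results:
    print(item.name, item.lvl)
```

An index-backed search would express the same three conditions as a query instead of scanning every item, but the logic it has to answer is the same.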
m
Thanks for the link. I have bookmarked it for reference in case I run into problems when it comes time to deal with user search. Now that I am at home, I double-checked the picker I was thinking of using, and it is the MultiNodeTreePicker. But from the sounds of everything, my structure and design are simple enough that I should not see any issues, which I am happy to hear. I do appreciate everyone's feedback on this.
s
Once you have a prototype of a basic setup you could always come back to show the setup and ask for some more feedback.
m
Thanks, I will probably do that. I have not given too much thought to the design on the Umbraco side, as most of my brain power has gone into writing a C# application to do the data scraping and compiling to prep the info for import. This data project has made me go to bed one too many times with a headache trying to get this all figured out.