Generation time makes Windows and Windows+WSL local site generation unusable #7609
Comments
I run on Windows only and it hasn't traditionally taken 20 mins for me, but I've not yet tried it after the Ruby upgrade. I do suspect something in the Ruby version upgrades is the likely cause, as I was hitting similar issues in the past when trying to run on the later Ruby version. This is definitely worth further investigation but I'm not sure when I'll get around to it.
@weshaggard I did try with Ruby version 2.7 as well and got the same timings. It seems to be a known issue that Jekyll becomes slower as you keep adding posts. I found a few blog posts about folks eventually migrating from Jekyll to Hugo or Zola because of the generation slowness. It seems you can also set --incremental for Jekyll, which will still take ~20 min the first time but will be faster after that.
Fair. I do typically use --incremental but I don't remember it taking 20 mins initially. At some point I'll see what options we have. I'm definitely not opposed to pruning some of the older release stuff but we should consult with the PMs first. @ronniegeraghty do you know if there is any strong reason to keep all the release history?
Everything in the Were there other files/directories you noticed slowing down the process?
Would you consider exploring Zola, @weshaggard? It might take some time to migrate everything, but it would let us avoid removing old data (if we want to keep years of SDK data there). I'm happy to start a dev branch and see what it would look like with Zola. @ronniegeraghty, the top offenders are: Filename | Count | Bytes | Time The content of those files iterates over all the web content. For example, for For a non-static webpage (say WordPress), generation is requested on demand and paged. There are actually 2 things to consider:
For the first one, we can consider cleaning up release notes from past years, but for the second one we can't really reduce the number of libraries we ship 📦. For example: Look at the JS and .NET numbers per SDK release:
Considering 10 languages with shippable libraries and ~500 libs each on average (some languages like C have fewer), we would still be looking at around 5k libraries per release. Jekyll has no option for parallel generation. It creates pages one by one, running 5k iterations of disk I/O. The more libraries we add to the release (which is likely to keep happening), the more time it will take to generate the front page table: https://azure.github.io/azure-sdk/ So another option to explore is migrating the entire site to a single-page application with a backend API (Azure SWA could be a cheap option), or at least updating the static HTML page to generate the HTML on the client side from static JSON files (that's what we do on Awesome azd). The current approximate size of the main HTML of the azure-sdk root page is 2.89 MB, which for an HTML (text) file is HUGE!! (GitHub can't even display it)
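The client-side option above depends on the build emitting package data as static JSON instead of inlining every row into the HTML. A minimal sketch of that build step, assuming hypothetical record fields (language, name, version) and a one-file-per-language layout, neither of which is the repo's actual schema:

```python
import json
import os
import tempfile


def write_package_json(packages, out_dir):
    """Group package records by language and write one JSON file per language.

    Returns a dict mapping language -> output path. The record fields are
    hypothetical placeholders, not the site's real data model.
    """
    by_language = {}
    for pkg in packages:
        by_language.setdefault(pkg["language"], []).append(pkg)

    os.makedirs(out_dir, exist_ok=True)
    written = {}
    for language, rows in sorted(by_language.items()):
        path = os.path.join(out_dir, f"{language}.json")
        with open(path, "w", encoding="utf-8") as fh:
            json.dump(rows, fh, indent=2)
        written[language] = path
    return written


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        {"language": "js", "name": "example-lib-a", "version": "1.0.0"},
        {"language": "dotnet", "name": "Example.Lib.B", "version": "2.1.0"},
        {"language": "js", "name": "example-lib-c", "version": "0.3.0"},
    ]
    with tempfile.TemporaryDirectory() as tmp:
        for language, path in write_package_json(sample, tmp).items():
            print(language, os.path.getsize(path), "bytes")
```

The page would then fetch only the JSON for the language being viewed, so the root HTML no longer grows with every release.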
For the Package Table that shows on the Release Site, I believe this is using the For the Monthly Release notes, I could see us removing monthly release notes pages after a certain amount of time. (How long to wait before removing them is uncertain.)
I'm not sure it is worth the effort, and I would prefer to stick with the standard GitHub recommendation as it will likely keep working longer term. It seems very odd to me that the common shared md files aren't cached in some way, as they shouldn't need to be read off disk for every page like they seem to be. I wonder if there is some other option to enable caching or something. That said, @vhvb1989, if you want to take this on as a pet project, go for it, but I want to try and keep it as static as possible. At the end of the day I would also be fine saying you have to work in Codespaces/devcontainer if that is what it takes to make it efficient.
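To make the caching idea concrete outside Jekyll, here is a rough Python sketch of reading a shared include once and reusing it for every page. It is not Jekyll's internals or an existing Jekyll option; read_shared_file and render_pages are hypothetical names used only for illustration.

```python
import functools
import os
import tempfile


@functools.lru_cache(maxsize=None)
def read_shared_file(path):
    """Read a shared include from disk once; later calls return the cached text."""
    with open(path, encoding="utf-8") as fh:
        return fh.read()


def render_pages(include_path, page_count):
    """Hypothetical stand-in for rendering pages that all embed one shared include."""
    return [f"page {i}\n{read_shared_file(include_path)}" for i in range(page_count)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        include = os.path.join(tmp, "shared.md")
        with open(include, "w", encoding="utf-8") as fh:
            fh.write("shared content")
        pages = render_pages(include, 5000)
        info = read_shared_file.cache_info()
        # misses = actual disk reads, hits = reads served from memory
        print(len(pages), "pages,", info.misses, "disk read(s),", info.hits, "cache hits")
```

With a cache like this, 5k pages sharing one include cost a single disk read, which matters most on file systems where each open is slow.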
Run bundle exec jekyll serve --profile to measure the site generation time.
It takes a little more than a minute in Codespaces (Linux Debian 11) -> fast file system
But using Windows+WSL, it takes a little more than 20 minutes to generate the site!!!
I didn't even try it in Windows only.
The generation time is not related to CPU or memory, but to disk I/O.
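One rough way to check the disk I/O explanation independently of Jekyll is to time reads of many small files in the directory where the repo lives. The helper below is a hypothetical micro-benchmark, not part of the repo; the file count and size are arbitrary assumptions.

```python
import os
import tempfile
import time


def time_small_file_reads(base_dir, file_count=2000, size_bytes=4096):
    """Write, then read back, `file_count` small files under `base_dir`.

    Returns (files_read, elapsed_seconds) for the read phase only.
    The defaults are arbitrary assumptions, not measured site values.
    """
    payload = b"x" * size_bytes
    with tempfile.TemporaryDirectory(dir=base_dir) as tmp:
        paths = []
        for i in range(file_count):
            path = os.path.join(tmp, f"page_{i}.md")
            with open(path, "wb") as fh:
                fh.write(payload)
            paths.append(path)

        start = time.perf_counter()
        files_read = 0
        for path in paths:
            with open(path, "rb") as fh:
                fh.read()
            files_read += 1
        elapsed = time.perf_counter() - start
    return files_read, elapsed


if __name__ == "__main__":
    # Run once from a directory on the WSL file system and once from a
    # mounted Windows path, then compare the two timings.
    count, seconds = time_small_file_reads(os.getcwd())
    print(f"read {count} files in {seconds:.3f}s")
```

Because the files were just written, reads may be served from cache, so the result mostly reflects per-file open overhead rather than raw disk throughput. That overhead is still the cost a page-by-page generator pays thousands of times.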
I can get the same roughly one-minute time in Windows+WSL by cloning the repo inside the WSL file system instead of mounting the Windows path into WSL for the repo.
Should we consider removing or filtering data older than 2 years from the site?
Is there any value in listing and generating data going back to 2019? @weshaggard