chore(scripts): replace cld with franc to detect language #25538

Merged
merged 1 commit into from
Feb 17, 2025

yin1999
Member

@yin1999 yin1999 commented Jan 22, 2025

Description

The cld package works well, but it requires users to install MSVC to compile its C source code. Some contributors are not C/C++ developers and may not want to install such a heavy compiler on their computers (the MSVC build tools require 8GB of disk space). Additionally, the cld package itself requires 100MB of disk space (see the warning).

After comparing several language detectors (cld, franc, cld3-asm), I believe franc is a good substitute for us, as it has more users and is lighter (implemented in pure JavaScript).
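To illustrate what swapping detectors involves, here is a minimal sketch of a check that accepts any detector function. It is not the code from this pull request: the `checkLanguage` helper and the `LOCALE_TO_ISO` table are hypothetical names, and the locale entries are illustrative assumptions. The only behavior assumed of the detector is that it returns an ISO 639-3 code, or `"und"` when it cannot decide, which is how franc reports results.

```javascript
// Hypothetical mapping from locale folder names to ISO 639-3 codes.
// The entries are illustrative assumptions, not the project's real list.
const LOCALE_TO_ISO = new Map([
  ["es", "spa"],
  ["fr", "fra"],
  ["ja", "jpn"],
  ["ko", "kor"],
  ["pt-br", "por"],
  ["ru", "rus"],
]);

// `detect` is any function that takes a string and returns an ISO 639-3
// code, or "und" when the language is undetermined.
function checkLanguage(text, locale, detect) {
  const expected = LOCALE_TO_ISO.get(locale);
  if (expected === undefined) {
    throw new Error(`Unknown locale: ${locale}`);
  }
  const detected = detect(text);
  // An undetermined result is treated as a pass, so that very short
  // texts do not produce false alarms.
  const ok = detected === "und" || detected === expected;
  return { detected, expected, ok };
}
```

With franc installed, the detector could be passed in directly, for example `import { franc } from "franc"` followed by `checkLanguage(text, "fr", franc)`. Keeping the detector as a parameter means that changing libraries only touches the call site.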

Note that the replacement increases the time spent on language detection. I tested both packages by running them over all the files:

| Package | Time used (seconds) |
| ------- | ------------------- |
| cld     | 16.863              |
| franc   | 95.401              |
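For a sense of scale, the two measurements above can be turned into a ratio and an absolute difference. This is only arithmetic on the reported numbers; the variable names are made up for the example.

```javascript
// Timings reported in this pull request, in seconds.
const timings = { cld: 16.863, franc: 95.401 };

// How many times longer franc takes than cld over the same files.
const slowdown = timings.franc / timings.cld;

// Additional wall-clock time per full run, in seconds.
const extraSeconds = timings.franc - timings.cld;

console.log(
  `franc is about ${slowdown.toFixed(2)}x slower (+${extraSeconds.toFixed(3)} s)`
);
// prints "franc is about 5.66x slower (+78.538 s)"
```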

Motivation

Makes it easier for contributors (especially Windows users) to set up the environment.

@github-actions github-actions bot added the system Infrastructure and configuration for the project label Jan 22, 2025
@yin1999
Member Author

yin1999 commented Jan 22, 2025

Hey @queengooborg, this is a draft to replace the cld package (the motivation is described above). Are you interested in taking a look?

@github-actions github-actions bot added the merge conflicts 🚧 This pull request has merge conflicts that must be resolved. label Jan 23, 2025
Contributor

This pull request has merge conflicts that must be resolved before it can be merged.

@github-actions github-actions bot removed the merge conflicts 🚧 This pull request has merge conflicts that must be resolved. label Feb 12, 2025
@yin1999 yin1999 marked this pull request as ready for review February 12, 2025 06:58
@yin1999 yin1999 requested a review from a team as a code owner February 12, 2025 06:58
@yin1999 yin1999 requested a review from bsmth February 12, 2025 06:58
@caugner
Contributor

caugner commented Feb 17, 2025

LGTM if @queengooborg has no objections.

@queengooborg
Collaborator

No objections here!

@yin1999
Member Author

yin1999 commented Feb 17, 2025

Thanks for your review @caugner, @queengooborg. Let's try the new language detector :)

@yin1999 yin1999 merged commit 6df86e4 into mdn:main Feb 17, 2025
8 of 9 checks passed
@yin1999 yin1999 deleted the replace-cld-franc branch February 17, 2025 12:30
kirbeee pushed a commit to kirbeee/mdn-translated-content that referenced this pull request Feb 18, 2025