[css-fonts-4] Privacy and I18n issues around user-installed fonts, and user selection of them #5421
Comments
If I remember correctly, the font based fingerprinting discussions started last year, but this text has been in the spec for at least 2 years. Perhaps it's for another reason? |
@litherum knows better, but my understanding is that this sentence was added mostly to describe reality - in that Safari had already implemented this behavior and continues to behave this way. The subsequent fingerprinting discussion has happened in part to figure out how to address problems with this behavior. Two of the suggestions we have considered are:
Would either or both of these requirements make the behavior more acceptable to the i18n WG? |
But Safari is only one of the 3 main browser engines, so i don't believe the spec should reflect that approach yet, because it doesn't represent reality. (And btw, i am no longer able to use or recommend Safari for a great deal of multilingual content for this reason, esp. when it concerns support for endangered scripts or scripts that people are promoting for wider use. Which is a nuisance, since it writes off use on iPads and iPhones as well as desktop.) (1) above sounds problematic because i don't believe it's possible to create and maintain in a timely fashion a centralised list that will respond to the needs of the content developers, especially for endangered and long-tail scripts. Installing fonts currently avoids that problem, of course. (2) is a given, to my mind, if any restrictions are to be placed on the ability to render user-installed fonts, but it adds complications for non-computer-literate users, so it must be easy for them to understand how to work around the restriction. However, both of those are solutions to a problem that we don't yet have, since the majority of major browser engines do not restrict visibility of user-installed fonts. So if we are to describe reality, we should remove that sentence. If we are to describe some future possibility, which to my mind is not yet certain enough to include in the spec, then such a statement must be accompanied by another that says that users must be able to work around the restriction when needed. |
One argument that (2) could be an acceptable workaround is that someone who can figure out how to install a font can likely figure out how to change a browser preference. And one of the options we have discussed for (1) is to flip the maintenance requirement. So we would allow a UA to ignore user-installed fonts ONLY for languages where it can be demonstrated that available OS fonts are adequate for displaying web content. |
Web fonts are the only sure way for a web author to use these less-supported languages. Naming a specific font and hoping the user has it installed is less likely to work than using web fonts. There’s a trade-off here. International support is clearly important, but privacy is also definitely important. Different UAs’ policies have different implications for different users. I chose the word “may” here on purpose, simply to describe the trade-off. There are many different points on the spectrum between blocking fonts for privacy and allowing fonts for internationalization, and all of those points have reasonable arguments for them, based on the specific considerations of the UA and their users. The sentence “User Agents may choose to ignore User-Installed Fonts” is true (de facto). We shouldn’t remove it from the spec. |
In general, using a Web font is more reliable provided the license allows such use. This is not the case for all fonts, including some which are "free to use" but are required to be distributed in an intact zip file with a readme and license, for example.
It is a very poor approach for Latin, agreed. It is the standard approach in some linguistic communities, because they came onto the Web well before Web fonts were a realistic thing to use. Thus, user-agents which suddenly stop working on content which has worked for one or two decades will simply be seen as broken. |
What is the criterion against which you'd measure adequacy? I'm not sure how that would work. Here are some additional thoughts about things that concern me...
I'm curious about how you reached that conclusion. Do you base this on experience, or is it a logical deduction based on the assumption in the sentence "Naming a specific font and hoping the user has it installed is less likely to work than using web fonts."? I ask because, as someone who has worked with a wide range of less-supported languages on a daily basis for several years, it doesn't match the reality that i experience. In fact, it appears to be the opposite. Less-supported languages typically have only a small number of freely available Unicode fonts, and those are often obtained by users from the Web. Users expect to use them with applications such as text editors or for texting, as well as for Web pages, and they are therefore often packaged along with keyboards. Webfonts are not useful for those scenarios. It's very rare for such packages or download locations to serve the user with a webfont. Webfonts are a tool for content authors, not for users. It may be possible to persuade content authors to create and use webfonts for new content, but they aren't going to help with the content that's currently out there. Not that webfonts are a panacea, anyway. They require additional bandwidth, when people using less-supported languages may find that a problem. And for documents containing multiple languages that can sometimes mount up. Access to user-installed fonts can also give the user the flexibility to set preferred fonts as the default. For example, i've just been writing about the Javanese script as used for the Javanese language. There is a Noto font for Javanese, although it has some bugs still, but it also uses a noto-harmonised style that doesn't look like normal Javanese book fonts (although a recent redesign now at least makes it possible to actually read the letters). Otherwise, there is very little to choose from wrt Javanese fonts. That I know of, the only font that actually works well and looks the way i'd expect is Yogyakarta. 
It's all rights reserved, so i can't as a content author create a webfont for it. I can, however, prepend it to my CSS font-family values so that those who also happen to have it can also obtain the benefits of a well designed font, rather than having to settle for an alternative system font. At the very least, browsers shouldn't block content authors from trying out different fonts they have installed while developing their content. Even if they are going to send the content out with a single webfont, they need to be able to experiment quickly with alternatives in order to choose that font. Compiling system fonts into webfonts and incorporating them into the content to test the effect is way too slow. Localhost and file URLs should be exempt from these restrictions, since there is no privacy risk there anyway. |
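The two points in the comment above (prepending a locally installed font to the font-family list, and exempting localhost and file URLs from any restriction) can be sketched as a small model. This is an illustrative sketch only, not the css-fonts-4 font matching algorithm: the function names, the font sets, and the "Noto Sans Javanese" fallback name are assumptions made for the example.

```python
from typing import List, Optional
from urllib.parse import urlparse

# Hypothetical font sets, for illustration only.
PREINSTALLED = {"Noto Sans Javanese", "Arial"}
USER_INSTALLED = {"Yogyakarta"}


def is_local_origin(url: str) -> bool:
    """True for file: URLs and loopback hosts (content on the user's own machine)."""
    parts = urlparse(url)
    if parts.scheme == "file":
        return True
    return parts.hostname in {"localhost", "127.0.0.1", "::1"}


def pick_font(families: List[str], page_url: str,
              block_user_fonts: bool = True) -> Optional[str]:
    """Return the first family in the author's list that the UA is willing to use."""
    available = set(PREINSTALLED)
    # A restricting UA hides user-installed fonts, except for local content,
    # which the comment argues should be exempt.
    if not block_user_fonts or is_local_origin(page_url):
        available |= USER_INSTALLED
    for family in families:
        if family in available:
            return family
    return None  # nothing matched; a real UA would continue with system fallback
```

With the author's stack `["Yogyakarta", "Noto Sans Javanese"]`, a restricting UA in this model would fall through to the second family on a remote page but use the preferred font on `http://localhost:8000/` or a `file:` URL.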
The CSS Working Group just discussed
The full IRC log of that discussion |
@litherum the only way I personally would consider your user stylesheet hack to be sufficient is if the user agent automated it for the user. It would be fine to have it as the mechanism for opting in, but there would need to be relatively-simple UI presented to the user. |
I'm particularly troubled by @astearns's comment in #5421 (comment) about the original edit being premature. Previously, the spec said nothing about the set of exposed fonts. Adding the sentence "User Agents may choose to ignore User-Installed Fonts" didn't actually change anything - user agents were already free to ignore these fonts because nothing was defined before. The edit takes something that was entirely undefined and simply calls out the fact that it was undefined. I wouldn't call this "premature." I would call it "toothless." One valid criticism would be that the original edit doesn't describe the fact that there are many different places in the middle of the spectrum of [forbid all user-installed fonts, allow all user-installed fonts] that UAs can choose, and that these places are (generally) more valuable than either of the extremes. Saying "UAs may pick one extreme" doesn't capture this nuance. Fixing this is an improvement to the spec, and is the resolution the group just agreed on. |
I made a pull request with proposed text. Please take a look! #5625 |
The spec said nothing, but there was (and is) a common understanding that user agents could and would match user-installed fonts. Yes, this was under-specified. But we had rough interoperability on user-installed fonts until WebKit changed. Adding the possibility that user-installed fonts could be ignored was a new thing. Since the edit was pointed out, we have had lots of discussions about the benefits and drawbacks of what’s in the spec. I think there is consensus that allowing a user agent to ignore user-installed fonts with no usable opt-in is too extreme. I do think the current MAY statement was premature, because it explicitly allows that extreme behavior. So I support the edit you have made. And I support continuing to discuss usable opt-in scenarios so we can get back to having a normative statement we can all agree on. |
I don't agree with this. WebKit is certainly interested in refining its balance between internationalization and privacy, but that isn't the same as saying no UA should ever pick either of the extremes. Some UAs have very strong internationalization goals, and other UAs have very strong privacy goals. I would agree with you if some content was becoming unreadable for no benefit. But there is a benefit here, and so it becomes a cost/benefit analysis problem. All options should be on the table, even if we expect that most UAs won't choose one of the options. |
For me, i think it comes down to this: User agents may pick a position along the internationalisation<->privacy tension line to propose to the user by default, but they shouldn't take decisions for the user that the user can't easily change. And in the case of fonts, although an all-on/all-off switch might be useful too, i think this generally means the user being able to choose which user-installed fonts work on a font-by-font basis. One problem is that the user is not likely to be aware what font the content author is trying to apply. That's why i had suggested elsewhere an approach like we have for cookies, that warns the user if the page is trying to use a font that is not already enabled by them, then asks, and remembers, whether they want to enable it for this and future pages. |
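The cookie-style, font-by-font opt-in described above could look roughly like the following. This is a minimal sketch under stated assumptions: the `FontPermissions` class and the `ask_user` callback are hypothetical names standing in for whatever prompt UI a UA might present, and nothing here is taken from the spec or from any browser.

```python
from typing import Callable, Dict, Set


class FontPermissions:
    """Remembers, per font family, whether the user allowed pages to use it."""

    def __init__(self, ask_user: Callable[[str], bool]) -> None:
        self._ask_user = ask_user              # hypothetical UI prompt callback
        self._decisions: Dict[str, bool] = {}  # family -> allowed?

    def may_use(self, family: str, user_installed: Set[str]) -> bool:
        # Fonts that are not installed cannot be enabled, so never prompt for them.
        if family not in user_installed:
            return False
        # Ask once per font, then reuse the answer for this and future pages.
        if family not in self._decisions:
            self._decisions[family] = self._ask_user(family)
        return self._decisions[family]
```

A per-site variant would key the stored decisions on a (site, family) pair instead of on the family alone.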
@r12a there was a proposal by @felipeerias did you see it? It seems to satisfy the "user can easily change" aspect that you mentioned; I particularly liked the per-site/everywhere option. It also makes fingerprinting slower, even if the user has authorized a large number of fonts. |
Oh, i thought i had replied to this long ago. The proposal by @felipeerias very much reflects what i had already proposed, and my reasoning in proposing it, so yes i'm interested in further discussion around that. (I did actually say that in the other thread a year ago.) |
So, do the changes made in b0ac8dd satisfy the competing constraints on privacy vs. digital access, or is there more to do? |
…d into Font Matching Algorithm, #5421
The large initial change block seems to be heading in a useful direction. However, the 3rd option (hybrid approach) assumes that the UA implementer is able to choose adequate fonts to be exposed for a language on behalf of the user.
I don't see how they can possibly do that in a way that meets users' requirements. I still think that if a UA is going to prevent use of installed fonts, then the user needs to be able to unblock those fonts if they wish. The user also needs to be able to view text in fonts that the UA implementer has never even heard of. And it also doesn't address the need i mentioned earlier for content developers to be able to view content in whatever font they want in situations that are not security risks, such as reading a file from their own hard disk or local server while creating content. (Slightly tangential, but indicative of the problems we are opening the door for here: users should also be able to overwrite or delete any font that is pre-installed. For example, Kashmiri users can't currently use Safari because they need access to very recent versions of nastaliq fonts like Noto in order to correctly write their language (and fix the non-standardised and ugly hacks that users have resorted to so far), but macOS won't allow a new version to delete or overwrite the pre-installed version. This is really preventing users from creating readable content. See also an example of the problems caused for many writing systems for users of Firefox on the mac, for similar reasons.) |
Thanks for the detailed feedback @r12a, let's keep working to resolve this. |
… available for rendering web pages #5421
Please take a look at d7193b9 where this is made explicit.
Not that tangential; the wording I just added talks about removing as well as adding fonts from the initial default set provided by the UA. Removing from a set used for rendering seemed a better way to express it than "must allow uninstalling". |
LGTM, thanks. |
@pes10k could you look over from a privacy perspective? |
This text does not address the privacy concern. Unless I'm looking at the wrong version, this text just makes the existing privacy problem more explicit: the defined behavior could be used to harm user privacy, and it's up to vendors to figure out how to implement the spec without the privacy harm. In order for the privacy concern to be addressed, it should be the case that a correct and complete implementation of the standard ensures that the defined functionality is privacy preserving, not just that a vendor could implement it in such a way that it could be privacy preserving. I appreciate the importance of the I18n issues here, but I think it's critical to find a way to address those concerns without risking user privacy. FWIW, here's another suggestion you all may have already considered and discarded (but maybe not!): discarding the OS vs installed distinction, and instead limiting the number of font-faces that a site / eTLD+1 can reference within the lifetime of a storage partition. Like I said, maybe you all have considered and discarded the idea, but this kind of "budgeting approach" might be successful for fonts in a way I don't think it's practical for addressing fingerprinting concerns in general (for example, benign font-face selection might be consistent in a way that a site's overall "set of Web APIs used" is not). |
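The "budgeting approach" suggested above can be sketched as a per-site counter. This is a rough illustration under assumptions, not a proposed design: the `FontBudget` class is hypothetical, the `site` argument is assumed to already be the eTLD+1, the budget counts distinct font families, and the limit is an arbitrary parameter because the comment does not propose a number.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class FontBudget:
    """Caps how many distinct local font families one site may reference."""

    def __init__(self, limit: int) -> None:
        self.limit = limit  # hypothetical cap, chosen by the UA
        # site (assumed to be an eTLD+1) -> families it has referenced so far
        self._seen: DefaultDict[str, Set[str]] = defaultdict(set)

    def allow(self, site: str, family: str) -> bool:
        seen = self._seen[site]
        if family in seen:
            return True   # repeat references do not consume budget
        if len(seen) >= self.limit:
            return False  # budget spent: treat the font as unavailable
        seen.add(family)
        return True

    def clear_partition(self, site: str) -> None:
        """Reset the budget when the site's storage partition is cleared."""
        self._seen.pop(site, None)
```

In this model a page that always names the same few families never hits the limit, while a script probing hundreds of names exhausts its budget after `limit` distinct lookups, which matches the comment's point that benign font selection tends to be consistent.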
The CSS Working Group just discussed The full IRC log of that discussion |
I’ve created a new issue to explore making the current spec text more prescriptive |
10.3. Preinstalled Fonts and User-Installed Fonts
https://drafts.csswg.org/css-fonts-4/#preinstalled-and-user-installed-fonts
!!!! This will create consternation for a great many international users around the world. A huge number of people use languages which are not supported, or are not supported well, by preinstalled fonts. The almost universal approach, currently, for people who use less common or endangered languages on the Web is to find a page where someone has created a font (and often keyboard) that you are expected to download in order to see or create content. (Those fonts are often used by applications other than browsers, too.)
Furthermore, quite often there are also particular font styles that content authors want to see used, often contrastively, and there's a likelihood that a user will have installed additional fonts (such as using a Kano font for Nigerian ajami text, or a Mool font style to distinguish Khmer headings, etc.) in order to make the text readable/locally relevant. For many languages, the system may provide a Noto font, but usually only the one (and generally the Noto fonts don't reflect the design users would want to see for their text, since it focuses on producing Noto-harmonised, large, sans-serif, monoline glyphs).
Of course, this plays into the issues related to privacy, but if the motivation for the sentence is that, it should say so, at least. But that discussion is not yet concluded, so in the meantime the i18n WG would like to see that sentence removed.