What should happen if a site disagrees with the topics assigned to it by the browser? #2

Closed
jkarlin opened this issue Jan 21, 2022 · 6 comments

Comments

@jkarlin
Collaborator

jkarlin commented Jan 21, 2022

Should there be a process to alter the assignment? What should the process be?

An alternative is to allow for sites to set their own topics via response header, as in #1.

@dmarti
Contributor

dmarti commented Jan 26, 2022

Or a site could take pages that are causing its topics to be misidentified and put them in their own section: #17. That way the classifier can still classify them, but their off-topic topics don't affect the top-level topics for the site.

@AramZS

AramZS commented Feb 15, 2022

Yeah, a topic blocklist would be very useful, I think.

@StacySager77

gi

@jkarlin
Collaborator Author

jkarlin commented Sep 6, 2024

Mislabeled topics can degrade the quality of the user's topics on other sites, and we wish to iteratively improve the classifier, but providing overrides via the platform seems to create more problems (e.g., intentionally incorrect topics) than it solves. Note that mislabeled topics should not harm the mislabeled site itself, as the site can provide its own contextual information to buyers.

@jkarlin closed this as completed Sep 6, 2024

@dmarti
Contributor

dmarti commented Sep 6, 2024

Patterns of repeatable mislabeling -- which can be recognized and trained on by server-side ML but not by individual users without significant cooperative research -- are examples of Topics API working normally in cooperation with server-side ad decisioning. The mislabeling is not a priority to fix in the browser because a known pattern of mislabeling can be trained around on a server that has sufficient data.
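
To make the "trained around" point concrete, here is a minimal sketch of how a server with enough paired data could learn what a consistently mislabeled topic really signals. The topic names, the observation pairs, and the function `learn_label_corrections` are all hypothetical, and a simple majority vote stands in for whatever ML a real ad server would use.

```python
from collections import Counter, defaultdict


def learn_label_corrections(observations):
    """Map each browser-assigned topic to the context it most often co-occurs with.

    `observations` is an iterable of (assigned_topic, observed_context) pairs.
    A majority vote is an illustrative stand-in for real server-side ML.
    """
    counts = defaultdict(Counter)
    for assigned_topic, observed_context in observations:
        counts[assigned_topic][observed_context] += 1
    # Keep the most frequent context seen for each assigned topic.
    return {topic: seen.most_common(1)[0][0] for topic, seen in counts.items()}


# Hypothetical data: (topic the browser reported, context the server observed).
observations = [
    ("Knitting", "Network hardware"),
    ("Knitting", "Network hardware"),
    ("Knitting", "Crafts"),
    ("Autos", "Autos"),
]

corrections = learn_label_corrections(observations)
print(corrections)
```

In this toy data the label "Knitting" mostly appears alongside network hardware content, so the server can treat it as a network hardware signal without the browser ever fixing the label.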

Topics API is a way for a browser to pass an obfuscated cohort identifier to a classifier on a server -- it's essentially FLoC, but the mystery numbers are in base 629 instead of base 10.

An individual user can't tell which Topics API cohort Google's ML has assigned them to based only on the individual topics in the set -- that's like saying you know what your FLoC cohort means because you know the digits 0-9.
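
As a rough illustration of the "base 629" analogy, the arithmetic below counts how many distinct values a small set of topics can encode. It assumes 629 possible topic values (the figure used above) and three topics per caller; the three-topic figure is an assumption not stated in this thread.

```python
from math import comb

TAXONOMY_SIZE = 629  # figure taken from the "base 629" remark above
TOPICS_PER_CALL = 3  # assumption: three topics are visible to a caller

# If order matters (e.g., one topic per time period), each position is a "digit".
ordered_sequences = TAXONOMY_SIZE ** TOPICS_PER_CALL

# If only the set of distinct topics matters, count combinations instead.
unordered_sets = comb(TAXONOMY_SIZE, TOPICS_PER_CALL)

print(f"ordered sequences: {ordered_sequences:,}")
print(f"unordered sets:    {unordered_sets:,}")
```

Either way the count runs to tens or hundreds of millions of possible values, so knowing what each individual topic means says little about what a server-side model infers from the combination.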

Without access to the server, it's hard to say whether labeling different sites about the same topic differently actually gives the server-side logic more information about the user's cohort membership than correct labeling would. Allowing sites to correct their topics at arbitrary times could make server-side ML less useful.

@Safwanni2212

Safwanni2212 commented Sep 7, 2024 via email

5 participants