Add unit tests - Exclusion Test by cyfung1031 · Pull Request #618 · scriptscat/scriptcat

cyfung1031 · 2025-08-08T23:03:21Z

https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Match_patterns#host

host must not include a port number.

https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Match_patterns#path

Note: The path pattern string should not include a port number. Adding a port, as in: http://localhost:1234/* causes the match pattern to be ignored. However, http://localhost:1234 will match with http://localhost/*.
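
To make the MDN note concrete, here is a minimal sketch of how a rule could be checked for a port in its host part before it is handed to the browser. The helper name `hostHasPort` is hypothetical and is not part of ScriptCat or any browser API.

```typescript
// Hypothetical helper: reports whether the host part of a
// <scheme>://<host>/<path> pattern ends with a port (digits or * after a colon).
function hostHasPort(pattern: string): boolean {
  const schemeEnd = pattern.indexOf("://");
  if (schemeEnd < 0) return false; // not in <scheme>://<host>/<path> form
  const hostStart = schemeEnd + 3;
  const pathStart = pattern.indexOf("/", hostStart);
  const host =
    pathStart < 0 ? pattern.substring(hostStart) : pattern.substring(hostStart, pathStart);
  return /:[0-9*]+$/.test(host);
}

const withPort = hostHasPort("http://localhost:1234/*");
const withoutPort = hostHasPort("http://localhost/*");
```

A rule flagged this way could be kept out of the registered match patterns and matched by the extension's own matcher instead.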

cyfung1031 · 2025-08-08T23:48:59Z

parsePatternMatchesURL

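~~host is hostname:port (if a port is specified)~~

port is the TCP port
0 - 65535

~ So :5545* could be equivalent to :5545/*, :55450/*, :55451/*, :55452/*, :55453/*, ... :55459/*, ~

-> See the MDN docs. MV3 patterns ignore the port.
-> ** It is unclear how TM interprets this. The TM docs do not say. **

The current code is split too finely.
It should simply take two parameters, matches (@match[]) and excludes (@exclude[]) (custom excludes also go into excludes),

and then directly generate the urlMatcher, match patterns, and exclude patterns.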

cyfung1031 · 2025-08-08T23:57:00Z

 new RegExp('^.+?://[^/]*?(:5244[^/]*?)?.*?$').test("https://foo.api.bar/baz")

This rule is wrong, isn't it?
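
A short sketch of why the generated expression is suspicious: the port group is optional, so the expression accepts URLs that have no `:5244` at all. The stricter pattern `strictRule` below is only an illustrative assumption, not the fix that was adopted in the project.

```typescript
// The generated rule from the comment above: the (:5244...) group is optional.
const looseRule = new RegExp("^.+?://[^/]*?(:5244[^/]*?)?.*?$");

// Hypothetical stricter variant: the host part must contain ":5244".
const strictRule = new RegExp("^.+?://[^/]*?:5244[^/]*(/.*)?$");

const samples = ["https://foo.api.bar/baz", "http://localhost:5244/list"];

const looseResults = samples.map((url) => looseRule.test(url));
const strictResults = samples.map((url) => strictRule.test(url));

console.log(looseResults, strictResults);
```

The loose rule accepts both sample URLs, while the strict variant accepts only the one that actually carries port 5244.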

cyfung1031 · 2025-08-09T00:01:46Z

I will submit a PR later to rewrite this part.

When processing rules, separate glob * (**) from regexp.
MV3 patterns follow the glob * (**) rules (for simple cases, prefer excludeGlobs).
Build an efficient matcher, UrlMatcher.

Reference: https://www.tampermonkey.net/documentation.php?locale=en#meta:include
https://www.tampermonkey.net/documentation.php?locale=en#meta:match

MV3 UserScriptAPI URL Matching

https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/manifest.json/content_scripts#matching_url_patterns

matches is mandatory

MV3 Pattern

https://developer.chrome.com/docs/extensions/develop/concepts/match-patterns?hl=en
Exclude case: if an @exclude is a simple glob rule (e.g. google), do not use excludeMatches; pass it directly as excludeGlobs.
Other cases: excludeMatches in the form <scheme>://<host>/<path>

The same applies to includeGlobs.

Deciding match vs glob

If the rule contains none of . / : use a glob (except on Safari)

MV3 Pattern detection

If a rule is a valid MV3 glob rule with no regular expression, the script can be injected directly through the userScripts API without any messaging (skipping the pageLoad communication).
@include -> matches, includeGlobs
@exclude -> excludeMatches, excludeGlobs

See https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Match_patterns#invalid_match_patterns
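
The heuristic above could be sketched as follows. The function name `preferGlob` and the `isSafari` flag are assumptions made for illustration; how the browser is detected is left out.

```typescript
// Hypothetical heuristic: a rule with none of ".", "/" or ":" is treated as a
// simple glob, unless the browser (assumed: Safari) should not be given globs.
function preferGlob(rule: string, isSafari: boolean = false): boolean {
  if (isSafari) return false;
  return !/[./:]/.test(rule);
}

const simpleRule = preferGlob("*google*");
const fullPattern = preferGlob("*://*.google.com/*");
const simpleRuleOnSafari = preferGlob("*google*", true);
```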

Glob Pattern

https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/manifest.json/content_scripts#globs

The MV3 docs state that both glob * (**) and glob ? are supported.
The TM docs only mention glob * (**).
ScriptCat should be able to support both glob * (**) and glob ?

Note: glob * (**) here means glob ** (directory structure is not considered)
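
A minimal sketch of these glob semantics, assuming `*` matches any run of characters (including `/`) and `?` matches exactly one character. `globToRegExp` is a hypothetical helper, not an existing ScriptCat function.

```typescript
// Hypothetical converter from a glob with * and ? to an anchored RegExp.
function globToRegExp(glob: string): RegExp {
  let source = "";
  for (let i = 0; i < glob.length; i++) {
    const ch = glob[i];
    if (ch === "*") {
      source += ".*"; // any run of characters, "/" included
    } else if (ch === "?") {
      source += "."; // exactly one character
    } else {
      // Escape every other character that is special in a regular expression.
      source += ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    }
  }
  return new RegExp(`^${source}$`);
}

const shopGlob = globToRegExp("*shop*");
const questionGlob = globToRegExp("https://example.com/a?c");
```

Note that under Tampermonkey-style rules `?` would more likely be a literal query-string separator, so whether `?` is a wildcard has to be decided per rule type.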

Match Pattern

https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Match_patterns
scheme has only two forms

| Form | Matches |
| --- | --- |
| `*` | Only "http" and "https" and in some browsers also "ws" and "wss". |
| A complete scheme, without wildcards. | Only the given scheme. |

file is not covered by *, which is consistent with the TM docs

host has only three forms

| Form | Matches |
| --- | --- |
| `*` | Any host. |
| `*.` followed by part of the hostname. | The given host and any of its subdomains. |
| A complete hostname, without wildcards. | Only the given host. |

The host definition is consistent with the TM docs

pathname supports glob * (**) and glob ?
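
The three host forms in the table can be expressed in a few lines. This is a sketch under the table's definitions only; `hostMatches` is a hypothetical name.

```typescript
// Hypothetical host check following the three forms in the MDN table.
function hostMatches(patternHost: string, hostname: string): boolean {
  // Form 1: "*" matches any host.
  if (patternHost === "*") return true;
  // Form 2: "*.name" matches the given host and any of its subdomains.
  if (patternHost.startsWith("*.")) {
    const base = patternHost.substring(2);
    return hostname === base || hostname.endsWith(`.${base}`);
  }
  // Form 3: a complete hostname matches only itself.
  return hostname === patternHost;
}

const bareDomain = hostMatches("*.jd.com", "jd.com");
const subDomain = hostMatches("*.jd.com", "www.jd.com");
const lookalikeDomain = hostMatches("*.jd.com", "notjd.com");
```

Checking `hostname.endsWith("." + base)` rather than `endsWith(base)` is what keeps `notjd.com` from matching `*.jd.com`.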

Compatibility

Mind scheme compatibility (Safari): https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/Match_patterns#browser_compatibility
Mind glob compatibility (Safari): https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/manifest.json/content_scripts#browser_compatibility

CodFrm · 2025-08-09T02:01:29Z

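@match and @include have really been abused. I have seen all kinds of expressions, and I have had to add some messy logic to accommodate them.

Chrome's Match_patterns and TM's @match are basically two different things. That said, @match and @include could be handled separately, since @match does not support regular expressions.

@match and @include could also be refactored; there should be a simpler way to accommodate them.

This unit test does not pass and would break the normal CI flow, so I will not merge it for now.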

cyfung1031 · 2025-08-09T02:11:13Z

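> This unit test does not pass and would break the normal CI flow, so I will not merge it for now.

I am not sure how you plan to fix #617.
Once that is fixed, the test will pass.

(I did not expect expect(excludeMatches.includes("*://*/*")).toEqual(false); to fail as well. If it cannot be fixed, comment it out for now.)

Most of what I wrote above is just notes.
I do not have time to submit a PR for now.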

CodFrm · 2025-08-09T02:36:01Z

src/pkg/utils/match.test.ts

+    const uuid = uuidv4();
+    const { urlMatcher, excludeMatches } = makeUrlMatcher(uuid, matchesList, excludeMatchesList);
+    expect(urlMatcher.match("https://foo.api.bar/baz")).toEqual([uuid]);
+    expect(excludeMatches.includes("*://*/*")).toEqual(false);


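This does not seem able to pass. The three expressions you gave cannot be converted to PatternMatches, so they can only be converted to *://*/* and the original expressions are then applied at match time.

'*://*.amazon.tld/*', // tld stands for the top-level domain, so it cannot be converted
'*shop*', // no context, so it is unclear whether this is a domain or a path
'/.*(?<!jav)store.*/', // regular expressions cannot be handled

'*shop*' can be handled, but there is no such feature yet; leave it for a later PR.

In the current code, all of these are handled as regular expressions, right?
So excludeMatches should contain none of them, and they go into .rule for regular-expression matching?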

CodFrm · 2025-08-09T02:42:15Z

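I revised it: I removed some special-case handling and simply replaced every * with a regular expression.

Originally I wanted to follow the extension Match patterns spec as closely as possible, but UserScript match rules are just too messy.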

CodFrm · 2025-08-09T02:45:36Z

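> > This unit test does not pass and would break the normal CI flow, so I will not merge it for now.
>
> I am not sure how you plan to fix #617. Once that is fixed, the test will pass.
>
> (I did not expect expect(excludeMatches.includes("*://*/*")).toEqual(false); to fail as well. If it cannot be fixed, comment it out for now.)
>
> Most of what I wrote above is just notes. I do not have time to submit a PR for now.

I commented it out. Is this what you meant?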

cyfung1031 · 2025-08-09T03:22:05Z

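> > > This unit test does not pass and would break the normal CI flow, so I will not merge it for now.
> >
> > I am not sure how you plan to fix #617. Once that is fixed, the test will pass.
> > (I did not expect expect(excludeMatches.includes("*://*/*")).toEqual(false); to fail as well. If it cannot be fixed, comment it out for now.)
> > Most of what I wrote above is just notes. I do not have time to submit a PR for now.
>
> I commented it out. Is this what you meant?

Shouldn't match.ts remove *://*/* from excludeMatches?
I do not see how an excludeMatches containing *://*/* currently runs correctly.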

scriptcat/src/pkg/utils/match.ts

Lines 351 to 376 in 71e97d5

    
    export function dealPatternMatches(
      matches: string[],
      options?: {
        exclude?: boolean;
      }
    ) {
      const patternResult: string[] = [];
      const result: string[] = [];
      for (let i = 0; i < matches.length; i++) {
        const url = parsePatternMatchesURL(matches[i], options);
        if (url) {
          // If a search (query) part exists, make the path end with *
          if (matches[i].includes("?")) {
            if (!url.path.endsWith("*")) {
              url.path += "*";
            }
          }
          patternResult.push(`${url.scheme}://${url.host}/${url.path}`);
          result.push(matches[i]);
        }
      }
      return {
        patternResult,
        result,
      };
    }

cyfung1031 · 2025-08-09T03:25:09Z

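> I revised it: I removed some special-case handling and simply replaced every * with a regular expression.
>
> Originally I wanted to follow the extension Match patterns spec as closely as possible, but UserScript match rules are just too messy.

Isn't the idea that everything convertible to the extension Match patterns spec gets converted,
and whatever cannot be converted is handled as a regular expression in .rules?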

CodFrm · 2025-08-09T03:51:57Z

Because it returns two things: result and patternResult. *://*/* is in patternResult, which is passed to chrome.userScripts.register; result is what UrlMatch uses.

I think I understand what you mean now. Indeed, *://*/* should not be returned in patternResult (for exclude).
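
The conclusion above can be sketched as a small filter. An exclude list containing `*://*/*` would exclude every http and https page, so a catch-all fallback only makes sense on the match side. The names `CATCH_ALL` and `withoutCatchAll` are hypothetical and not taken from the ScriptCat code base.

```typescript
// Catch-all pattern used as a fallback when a rule cannot be converted.
const CATCH_ALL = "*://*/*";

// Hypothetical filter: drop the catch-all fallback from patterns that are
// going to be used as excludes, leaving real exclude patterns untouched.
function withoutCatchAll(patterns: string[]): string[] {
  return patterns.filter((pattern) => pattern !== CATCH_ALL);
}

const excludePatterns = withoutCatchAll(["*://*/*", "*://example.com/*"]);
```

Rules dropped this way would still need to be enforced by the extension's own matcher using the original expressions.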

CodFrm · 2025-08-09T03:54:09Z

scriptcat/src/app/service/service_worker/runtime.ts

Lines 754 to 787 in e206562

    
    if (script.metadata["exclude"]) {
      // concat makes a shallow copy to avoid mutating the original array
      const excludeMatches = script.metadata["exclude"].concat();
      const result = dealPatternMatches(excludeMatches, {
        exclude: true,
      });
      // registerScript.excludeMatches = result.patternResult;
      scriptMatchInfo.excludeMatches = result.result;
    }
    // Custom excludes
    if (script.selfMetadata && script.selfMetadata.exclude) {
      const excludeMatches = script.selfMetadata.exclude;
      const result = dealPatternMatches(excludeMatches, {
        exclude: true,
      });
      // registerScript.excludeMatches.push(...result.patternResult);
      scriptMatchInfo.customizeExcludeMatches = result.result;
    }
    // Blacklist excludes
    const blacklist = await this.systemConfig.getBlacklist();
    if (blacklist) {
      const list = blacklist
        .split("\n")
        .map((item) => item.trim())
        .filter((item) => item);
      const result = dealPatternMatches(list, {
        exclude: true,
      });
      // scriptMatchInfo.excludeMatches.push(...result.result);
      registerScript.excludeMatches!.push(...result.patternResult);
    }

This is the exclude handling logic. It uses result; only the blacklist exclusion uses patternResult.

CodFrm · 2025-08-09T03:57:48Z

For the match logic, returning *://*/* is fine, which is why registration uses patternResult.

scriptcat/src/app/service/service_worker/runtime.ts

Lines 744 to 751 in e206562

    const registerScript: chrome.userScripts.RegisteredUserScript = {
      id: scriptRes.uuid,
      js: [{ code: scriptRes.code }],
      matches: patternMatches.patternResult,
      allFrames: !scriptRes.metadata["noframes"],
      world: "MAIN",
      excludeMatches: [],
    };

cyfung1031 · 2025-08-09T04:10:41Z

> For the match logic, returning *://*/* is fine, which is why registration uses patternResult.
>
> scriptcat/src/app/service/service_worker/runtime.ts, lines 744 to 751 in e206562 (the registerScript snippet quoted in the previous comment)

@match, @include and @exclude are all handled differently.

@match cannot be a regular expression. It is actually much the same as the extension Match patterns spec.
With https://*.jd.com/*, both jd.com and www.jd.com are matched.

@include takes a regular expression, or something like *shop*.
@exclude automatically determines whether a rule is the match kind or the include kind.

https://www.tampermonkey.net/documentation.php?locale=en#meta:include

So TM works this way too: if an @include looks like @match syntax, it is treated as @match syntax.

  `"*://steamcommunity.com/*"` is `@match` syntax
  `"*.jd.com/*"` is `@include` syntax (matches www.jd.com, does not match jd.com)
  `"*docs.google.com/*"` is `@include` syntax
  `"*://*.amazon.tld/*"` is `@match` syntax (matches www.amazon.tld, matches amazon.tld)
  `"*shop*"` is `@include` syntax
  `"/.*(?<!test)store.*/"` is `@include` syntax
  `"*/releases"` is `@include` syntax
  `"*/releases/*"` is `@include` syntax
  `"*:5244*"` is `@include` syntax...?
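
The classification in the list above can be sketched roughly as follows. This is an illustrative assumption about the decision order, not Tampermonkey's or ScriptCat's actual logic; `classifyRule` and `RuleKind` are hypothetical names.

```typescript
type RuleKind = "regex" | "match" | "glob";

// Hypothetical classifier for a single @include / @exclude rule.
function classifyRule(rule: string): RuleKind {
  const s = rule.trim();
  // A rule wrapped in slashes is taken to be a regular expression.
  if (s.length > 2 && s.startsWith("/") && s.endsWith("/")) return "regex";
  // <scheme>://<host>/<path>: scheme is * or a complete scheme; host is *,
  // *.name, or a complete name with no wildcard and no port.
  const matchShape = /^(\*|[a-z][a-z0-9+.-]*):\/\/(\*|(\*\.)?[^*\/:]+)\/.*$/i;
  if (matchShape.test(s)) return "match";
  // Everything else is treated as an @include-style glob.
  return "glob";
}

const kinds = [
  "*://steamcommunity.com/*",
  "*.jd.com/*",
  "*://*.amazon.tld/*",
  "*shop*",
  "/.*(?<!test)store.*/",
  "*:5244*",
].map(classifyRule);
```

Under these assumptions the six sample rules classify as match, glob, match, glob, regex and glob, in line with the list above.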

cyfung1031 · 2025-08-09T04:18:06Z

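> I revised it: I removed some special-case handling and simply replaced every * with a regular expression.

The conversion does not seem quite right...?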

CodFrm · 2025-08-09T04:25:03Z

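> @match, @include and @exclude are all handled differently.
> @match cannot be a regular expression. It is actually much the same as the extension Match patterns spec.

Under MV2, UrlInclude and UrlMatch were separate. After MV3 they were simply merged into one. Splitting them again is worth considering, but the logic needs sorting out first.

https://github.com/scriptscat/scriptcat/blob/v0.16.10/src/pkg/utils/match.ts#L215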

CodFrm · 2025-08-09T04:26:31Z

Which one does "the conversion does not seem quite right" refer to?

cyfung1031 · 2025-08-09T04:42:13Z

I wrote a function that determines whether a rule is a standard match pattern or needs regular-expression conversion.

function checkUrlMatch(s: string) {
  s = s.trim();

  const idx1 = s.indexOf("://");
  let idx2 = -1;
  if (idx1 > 0) {
    idx2 = s.indexOf("/", idx1 + 3);
  }
  let extMatch: string[] | null = null;
  if (idx1 > 0 && idx2 > 0) {
    const scheme = s.substring(0, idx1);
    if (/^(\*|[-a-z]+)$/.test(scheme)) {
      let host = s.substring(idx1 + 3, idx2);
      if (!host.includes(":") && !host.startsWith(".")) {
        if (/^(\*|\*\..+)$/.test(host)) host = host.substring(1);
        if (!host.includes("*")) {
          extMatch = [scheme, host, s.substring(idx2 + 1)];
        }
      }
    }
  }
  return extMatch;
}

For checking a URL against the parsed pattern:

function isUrlMatch(s: string, m: string[]) {
  const url = new URL(s);
  if (m[0] !== "*" && url.protocol !== `${m[0]}:`) return false;
  if (m[1]) {
    if (m[1].startsWith(".")) {
      if (!`.${url.hostname}`.endsWith(`${m[1]}`)) return false;
    } else {
      if (`${url.hostname}` !== `${m[1]}`) return false;
    }
  }
  // url.search already starts with "?" when present; append a bare "?" otherwise
  const path = `${url.pathname}${url.search || "?"}`;
  const arr = m[2].split("*");
  let idx = 0;
  let k = 0;
  const l = arr.length;
  while (k < l) {
    if (arr[k]) {
      const jdx = path.indexOf(arr[k], idx);
      if (jdx < 0) return false;
      idx = jdx + arr[k].length;
    }
    k++;
  }
  // Exact full-path matching is not handled yet
  return true;
}

A second version of isUrlMatch, which also requires the whole path to match:

function isUrlMatch(s: string, m: string[]) {
  const url = new URL(s);
  if (m[0] !== "*" && url.protocol !== `${m[0]}:`) return false;
  if (m[1]) {
    if (m[1].startsWith(".")) {
      if (!`.${url.hostname}`.endsWith(`${m[1]}`)) return false;
    } else {
      if (`${url.hostname}` !== `${m[1]}`) return false;
    }
  }
  // url.search already starts with "?" when present; append a bare "?" otherwise
  const path = `${url.pathname}${url.search || "?"}`.substring(1);
  const arr = m[2].split("*");
  let idx = 0;
  let k = 0;
  const l = arr.length;

  const pathMatches = new Array(arr.length - 1);

  if (!path.startsWith(`${arr[0]}`)) return false;
  idx = `${arr[0]}`.length;

  k = 1;
  while (k < l) {
    if (k === l - 1 && arr[k] === "") {
      pathMatches[k - 1] = path.substring(idx);
      idx = path.length;
      break;
    }
    const jdx = path.indexOf(arr[k], idx);
    if (jdx < 0) return false;
    pathMatches[k - 1] = path.substring(idx, jdx);
    idx = jdx + arr[k].length;
    k++;
  }
  return idx === path.length || (path.endsWith("?") && idx === path.length - 1);
}

cyfung1031 · 2025-08-09T04:42:50Z

> Which one does "the conversion does not seem quite right" refer to?

.jd.com/* -> .*.jd.com/.* and the like

The ..* inside /..*(?<!test)store..*/ and the like

CodFrm · 2025-08-09T04:43:44Z

> .jd.com/* -> .*.jd.com/.* and the like

That looks fine to me; it matches.

cyfung1031 · 2025-08-09T04:46:04Z

> > .jd.com/* -> .*.jd.com/.* and the like
>
> That looks fine to me; it matches.

It should be .*\\.jd\\.com/.* , right?
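
A small sketch of the difference: with an unescaped `.`, the expression also accepts hosts that merely end in similar characters. The sample URLs are made up for illustration.

```typescript
// Unescaped: each "." matches any character.
const unescapedRule = new RegExp("^.*.jd.com/.*$");
// Escaped: each "." must be a literal dot.
const escapedRule = new RegExp("^.*\\.jd\\.com/.*$");

const intendedUrl = "https://www.jd.com/";
const lookalikeUrl = "https://notjd.com/";

const unescapedHits = [unescapedRule.test(intendedUrl), unescapedRule.test(lookalikeUrl)];
const escapedHits = [escapedRule.test(intendedUrl), escapedRule.test(lookalikeUrl)];
```

Both expressions accept the intended URL, but only the unescaped one also accepts the lookalike host.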

CodFrm · 2025-08-09T04:47:48Z

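I see what you mean. The . does need escaping. But if the original expression is already a regular expression that contains . , how do we tell?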

cyfung1031 · 2025-08-09T04:54:10Z

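> I see what you mean. The . does need escaping. But if the original expression is already a regular expression that contains . , how do we tell?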

function escapeRegExp(string) {
  return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

CodFrm · 2025-08-09T05:03:19Z

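> > I see what you mean. The . does need escaping. But if the original expression is already a regular expression that contains . , how do we tell?
>
> function escapeRegExp(string) {
>   return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
> }

Suppose your expression is itself a regular expression, .*?example.com. Handling it this way would break it.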

cyfung1031 · 2025-08-09T05:11:23Z

> .*?example.com

Then it has to be written as /.*?example.com/

https://www.tampermonkey.net/documentation.php?locale=en#meta:include

Without the / / wrapper, the rule has to be converted.
// @include /^https:\/\/www\.tampermonkey\.net\/.*$/ is already in converted form

.jd.com/*
-> ".*" + escapeRegExp(".jd.com/*") + ".*"
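
A minimal sketch of the rule described above, assuming a rule wrapped in `/ /` is used as a regular expression as-is and anything else is escaped with `*` turned into `.*`. `includeToRegExp` is a hypothetical name. Unlike the one-line formula above, it splits on `*` before escaping so that the wildcard is not escaped into a literal asterisk, and it anchors the result.

```typescript
// Escape every regular-expression metacharacter in a literal string.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical converter for an @include-style rule.
function includeToRegExp(rule: string): RegExp {
  const s = rule.trim();
  // /.../ means the rule is already a regular expression.
  if (s.length > 2 && s.startsWith("/") && s.endsWith("/")) {
    return new RegExp(s.slice(1, -1));
  }
  // Otherwise escape the literal pieces and join them with ".*".
  const source = s
    .split("*")
    .map((piece) => escapeRegExp(piece))
    .join(".*");
  return new RegExp(`^${source}$`);
}

const jdRule = includeToRegExp("*.jd.com/*");
const regexRule = includeToRegExp("/.*?example.com/");
```

With this split, `.*?example.com` without slashes would be treated as a glob with a literal `?`, which is the distinction the comment above is making.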

CodFrm · 2025-08-11T01:44:00Z

Later, when there is time, the @match, @include and @exclude rules will be sorted out and refactored along the lines of this PR.

cyfung1031 force-pushed the test_399 branch 2 times, most recently from 7492b3a to 92114bf Compare August 8, 2025 23:44

Exclusion Test

54aa30d

cyfung1031 force-pushed the test_399 branch from 92114bf to 54aa30d Compare August 9, 2025 00:06

CodFrm reviewed Aug 9, 2025


Optimize URL match logic

0b158ba

CodFrm force-pushed the test_399 branch from bcea0b1 to d63e3e2 Compare August 9, 2025 02:45

Pass unit tests

392cfd3

CodFrm force-pushed the test_399 branch from d63e3e2 to 392cfd3 Compare August 9, 2025 02:46

Adjust unit tests

f6bc832

CodFrm force-pushed the test_399 branch from c67976c to f6bc832 Compare August 9, 2025 02:53

CodFrm merged commit 0046bb7 into scriptscat:main Aug 11, 2025
4 checks passed

cyfung1031 deleted the test_399 branch August 23, 2025 20:04
