AGTree Integration
AGTree Integration
This document describes the integration of @adguard/agtree into the bloqr-backend project.
Overview
AGTree is AdGuard’s official tool set for working with adblock filter lists. It provides:
- Adblock rule parser - Parses rules into Abstract Syntax Trees (AST)
- Rule converter - Converts rules between different adblock syntaxes
- Rule validator - Validates rules against known modifier definitions
- Compatibility tables - Maps modifiers/features across different ad blockers
Why AGTree?
Before AGTree
The compiler used custom regex-based parsing in RuleUtils.ts:
- Limited to basic pattern matching
- No formal grammar or AST representation
- Manual modifier validation
- No syntax detection for different ad blockers
- Prone to edge-case parsing errors
After AGTree
| Feature | Before | After |
|---|---|---|
| Rule Parsing | Custom regex | Full AST with location info |
| Syntax Support | Basic adblock | AdGuard, uBlock Origin, Adblock Plus |
| Modifier Validation | Hardcoded list | Compatibility tables |
| Error Handling | String matching | Structured errors with positions |
| Rule Types | Network + hosts | All cosmetic, network, comments |
| Maintainability | Manual updates | Upstream library updates |
Architecture
Module Structure
mindmap
root((src/utils/))
agtree["AGTreeParser.ts — Wrapper module for AGTree"]
ruleUtils["RuleUtils.ts — Refactored to use AGTreeParser"]
indexTs["index.ts — Exports AGTreeParser types"]
AGTreeParser Wrapper
The AGTreeParser class provides a simplified interface to AGTree:
import { AGTreeParser } from '@/utils/AGTreeParser.ts';
// Parse a single ruleconst result = AGTreeParser.parse('||example.com^$third-party');if (result.success && AGTreeParser.isNetworkRule(result.ast!)) { const props = AGTreeParser.extractNetworkRuleProperties(result.ast); console.log(props.pattern); // '||example.com^' console.log(props.modifiers); // [{ name: 'third-party', value: null, exception: false }]}
// Parse an entire filter listconst filterList = AGTreeParser.parseFilterList(rawFilterListText);for (const rule of filterList.children) { if (AGTreeParser.isNetworkRule(rule)) { // Process network rule }}
// Detect syntaxconst syntax = AGTreeParser.detectSyntax('example.com##+js(aopr, ads)');// Returns: AdblockSyntax.UboKey Features
1. Type Guards
AGTreeParser provides comprehensive type guards for all rule types:
AGTreeParser.isEmpty(rule) // Empty linesAGTreeParser.isComment(rule) // All comment typesAGTreeParser.isSimpleComment(rule) // ! or # commentsAGTreeParser.isMetadataComment(rule) // ! Title: ...AGTreeParser.isHintComment(rule) // !+ NOT_OPTIMIZEDAGTreeParser.isPreProcessorComment(rule) // !#if, !#includeAGTreeParser.isNetworkRule(rule) // ||domain^ styleAGTreeParser.isHostRule(rule) // /etc/hosts styleAGTreeParser.isCosmeticRule(rule) // ##, #@#, etc.AGTreeParser.isElementHidingRule(rule)AGTreeParser.isCssInjectionRule(rule)AGTreeParser.isScriptletRule(rule)AGTreeParser.isExceptionRule(rule) // @@ or #@# rules2. Property Extraction
Extract structured data from parsed rules:
// Network rulesconst props = AGTreeParser.extractNetworkRuleProperties(networkRule);// Returns: { pattern, isException, modifiers, syntax, ruleText }
// Host rulesconst hostProps = AGTreeParser.extractHostRuleProperties(hostRule);// Returns: { ip, hostnames, comment, ruleText }
// Cosmetic rulesconst cosmeticProps = AGTreeParser.extractCosmeticRuleProperties(cosmeticRule);// Returns: { domains, separator, isException, body, type, syntax, ruleText }3. Modifier Utilities
Work with network rule modifiers:
// Find a specific modifierconst mod = AGTreeParser.findModifier(rule, 'domain');
// Check if modifier existsconst hasThirdParty = AGTreeParser.hasModifier(rule, 'third-party');
// Get modifier valueconst domainValue = AGTreeParser.getModifierValue(rule, 'domain');// Returns: 'example.com|~example.org' or null4. Validation
Validate rules and modifiers:
// Validate a single modifierconst result = AGTreeParser.validateModifier('important', undefined, AdblockSyntax.Adg);// Returns: { valid: boolean, errors: string[] }
// Validate all modifiers in a network ruleconst validation = AGTreeParser.validateNetworkRuleModifiers(rule);if (!validation.valid) { console.log(validation.errors);}5. Syntax Detection
Automatically detect which ad blocker syntax a rule uses:
const syntax = AGTreeParser.detectSyntax(ruleText);// Returns: AdblockSyntax.Adg | Ubo | Abp | Common
// Check specific syntaxAGTreeParser.isAdGuardSyntax(rule) // AdGuard-specificAGTreeParser.isUBlockSyntax(rule) // uBlock Origin-specificAGTreeParser.isAbpSyntax(rule) // Adblock Plus-specificIntegration Points
RuleUtils
RuleUtils now uses AGTree internally while maintaining the same public API:
// These methods now use AGTree parsing internally:RuleUtils.isComment(ruleText)RuleUtils.isAllowRule(ruleText)RuleUtils.isEtcHostsRule(ruleText)RuleUtils.loadAdblockRuleProperties(ruleText)RuleUtils.loadEtcHostsRuleProperties(ruleText)
// New AGTree-powered methods:RuleUtils.parseToAST(ruleText) // Get raw ASTRuleUtils.isValidRule(ruleText) // Check parseabilityRuleUtils.isNetworkRule(ruleText) // Network rule checkRuleUtils.isCosmeticRule(ruleText) // Cosmetic rule checkRuleUtils.detectSyntax(ruleText) // Syntax detectionValidateTransformation
The validation transformation uses AGTree for robust rule validation:
- Parses rules once and reuses the AST
- Uses structured type checking instead of regex
- Validates modifiers against AGTree’s compatibility tables
- Properly handles all rule categories (network, host, cosmetic, comment)
- Provides better error messages with context
// Before: String-based validationif (RuleUtils.isEtcHostsRule(ruleText)) { return this.validateEtcHostsRule(ruleText);}
// After: AST-based validationif (AGTreeParser.isHostRule(ast)) { return this.validateHostRule(ast as HostRule, ruleText);}Configuration
AGTree is configured in deno.json:
{ "imports": { "@adguard/agtree": "npm:@adguard/agtree@^3.4.3" }}Performance Considerations
- Parsing Once: Parse each rule once and pass the AST to multiple validation functions
- Tolerant Mode: Use
tolerant: trueto getInvalidRulenodes instead of exceptions - Include Raws: Use
includeRaws: trueto preserve original rule text in AST
const DEFAULT_PARSER_OPTIONS: ParserOptions = { parseHostRules: true, includeRaws: true, tolerant: true,};Error Handling
AGTree provides structured error information:
const result = AGTreeParser.parse(ruleText);
if (!result.success) { console.log(result.error); // Error message console.log(result.ruleText); // Original rule
// In tolerant mode, ast may be an InvalidRule if (result.ast?.category === RuleCategory.Invalid) { // Access error details from the InvalidRule node }}Supported Rule Types
AGTree supports parsing all major adblock rule types:
Network Rules
- Basic blocking:
||example.com^ - Exception:
@@||example.com^ - With modifiers:
||example.com^$third-party,script
Host Rules
- Standard:
127.0.0.1 example.com - Multiple hosts:
0.0.0.0 ad1.com ad2.com - With comments:
127.0.0.1 example.com # block ads
Cosmetic Rules
- Element hiding:
example.com##.ad-banner - Extended CSS:
example.com#?#.ad:has(> .text) - CSS injection:
example.com#$#.ad { display: none !important; } - Scriptlet injection:
example.com#%#//scriptlet('abort-on-property-read', 'ads')
Comment Rules
- Simple:
! This is a comment - Metadata:
! Title: My Filter List - Hints:
!+ NOT_OPTIMIZED PLATFORM(windows) - Preprocessor:
!#if (adguard)
Future Improvements
- Rule Conversion: Use AGTree’s converter to transform rules between syntaxes
- Batch Parsing: Use
FilterListParserfor bulk operations - Streaming: Process large filter lists without loading all into memory
- Diagnostics: Leverage AGTree’s location info for better error reporting