# Execution
## Crawling
- Test VV8.
  - Docker just works.
  - Log format in `tests/README.md`. Need a parser. They have `post-processor/`, but it is too specific.
    - Write a log parser library in Rust.
- Does https://github.com/wspr-ncsu/visiblev8-crawler work?
  - Over-engineered. Would prefer a simpler single script to this monstrosity of SQLite, Celery, Mongo, and PostgreSQL.
  - Eventually runs Puppeteer (`crawler.js`). ⇒ Let's just use Puppeteer's successor, Playwright.
- Figure out Playwright.
- Prevent being blocked by USC OIT: ask John / add a link to the research description page in the `User-Agent` header.
- Make Playwright and monkey testing work.
- Mounting: write directly to `headless_browser/target/` on the host.
  - Need sysadmin capability & root in Docker to run Playwright & create the directory, or else spurious errors occur.
  - Need `sudo setenforce 0` on Fedora due to SELinux.
- File management: each site has its own directory named by `encodeURIComponent(url)`, under which:
  - share the browser cache in `user_data/`;
  - each trial (`N` out of 0~4) launches a separate VV8 and writes:
    - `$N/vv8-*.log`
    - `$N.har`
    - `reachable$N.json`
- Disable the content security policy (CSP) for `eval`.
- Prevent navigation: go back in browser history immediately when navigating away (see the Playwright sketch after this list).
  - Browser bug: sometimes goes back too far, to `about:blank`. Detect whether the page has the horde on the `load` event, and reload if not.
  - Fewer navigations when headless??
  - Some sites (e.g., YouTube) change the URL w/o navigation; cannot do anything about them.
- Visit 3 + 9 clicked pages like Snyder did.
  - Some secondary URLs' host names vary by the `www.` prefix, e.g., google.com.
- Split out each visit to a separate browser page, so that each VV8 log can be split at gremlins injection into "loading" vs. "interacting" parts.
- Save space: remove `user_data/` after all trials.
- Crawl only the first 100 sites.
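A minimal Playwright sketch of one trial, tying the above together. The target URL, the VV8 executable path, and the research-page link are placeholders, not the crawler's real values:

```ts
import { chromium } from 'playwright';

const url = 'https://example.com/'; // illustrative target
const siteDir = `headless_browser/target/${encodeURIComponent(url)}`;
const trial = 0; // N out of 0~4

const context = await chromium.launchPersistentContext(`${siteDir}/user_data`, {
  // executablePath: '/path/to/vv8-chromium', // the VV8-instrumented build
  bypassCSP: true, // disable CSP so injected code may call `eval`
  userAgent: 'Mozilla/5.0 (research crawler; +<research description page>)',
  recordHar: { path: `${siteDir}/${trial}.har` }, // flushed on context.close()
});
const page = await context.newPage();

// Go back in browser history immediately whenever the main frame navigates away.
page.on('framenavigated', async (frame) => {
  if (frame === page.mainFrame() && frame.url() !== url) {
    await page.goBack().catch(() => {}); // may overshoot to about:blank (browser bug)
  }
});

await page.goto(url, { waitUntil: 'load' });
// ... inject gremlins, interact, then:
await context.close();
```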
## Analyze API call traces
- Separate site load & interaction.
  - Make a single gremlins injection split each VV8 log into a part w/o interaction and a part w/ interaction: use a separate browser page for each load.
  - When aggregating records, split by the gremlins injection point in the VV8 log.
- Find anchor APIs, the most popular APIs overall and per script.
  - Filter out most internal, user-defined, and injected calls.
  - Analysis of API popularity in `popular_api_calls_analysis.md`.
    - Tail-heavy distribution: it takes 1.75% (318) of APIs to cover 80% of all API calls, and 3.74% (678) to cover 90% (see the coverage sketch after this list).
    - Many calls happen before interaction begins.
    - DOM & event APIs dominate absolute counts.
    - Popularity per script is useless.
    - APIs called out in the proposal are somewhat popular.
  - Manually pick among the 678 APIs that make up 90% of calls; details in `notable_apis.md`.
- Figure out the frontend interaction / DOM element generation API classification.
  - `HTMLDocument.createElement` before interaction is clearly DOM element generation.
  - Various `addEventListener` calls are frontend processing.
  - More potential heuristics in `notable_apis.md`.
  - We only somewhat know what spheres a script belongs to, but how do we know it does not belong to another sphere?
    - We can probably only claim to detect which spheres it does belong to.
- Split scripts fine-grained!!
  - `eval` per function or per 1 kB chunk of code. Details in `eval_trick.md`.
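A sketch of the coverage computation behind the tail-heaviness numbers above; the input shape (API name → call count) is illustrative:

```ts
// How many of the most popular APIs are needed to cover `fraction` of all calls?
export function apisToCover(callCounts: Map<string, number>, fraction: number): number {
  const sorted = [...callCounts.values()].sort((a, b) => b - a);
  const total = sorted.reduce((sum, n) => sum + n, 0);
  let covered = 0;
  for (let i = 0; i < sorted.length; i++) {
    covered += sorted[i];
    if (covered >= fraction * total) return i + 1; // e.g., 318 at 0.8, 678 at 0.9
  }
  return sorted.length;
}
```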
## Log file interpretation
VV8 creates a log file per thread, roughly equivalent to a browser page we create plus some junk background workers. Each of `$N/vv8-*.log` contains:
- Before gremlins injection:
  - JS contexts created & their source code.
  - API calls in each context.
  - Guaranteed not to be for interactions.
- After gremlins injection:
  - All of the above, but possibly for interactions.
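A sketch of the injection that creates this split, assuming gremlins.js is loaded from a CDN and the split point is marked by a sentinel `Set` on `window` (VV8 records custom attributes on `window`; the sentinel name is made up here):

```ts
import type { Page } from 'playwright';

export async function injectGremlins(page: Page): Promise<void> {
  // Everything logged before this point is guaranteed not to be for interactions.
  await page.evaluate(() => {
    (window as any).__gremlins_injected__ = true; // sentinel Set, visible in the VV8 log
  });
  // The gremlins.js UMD bundle exposes a global `gremlins`.
  await page.addScriptTag({ url: 'https://unpkg.com/gremlins.js' });
  await page.evaluate(() => (window as any).gremlins.createHorde().unleash());
}
```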
## Observations when manually inspecting aggregated logs for YouTube
Details in `youtube_scripts_api_calls_overview.md`.
- Strong indicators: popular APIs like `addEventListener` and `appendChild` strongly indicate specific spheres.
- API pollution: getting and setting custom attributes on `window`, etc. are recorded, but they are not browser APIs. `Function`s generally seem more useful because we can and do filter out user-defined ones.
  - Largely dealt with by filtering by API names (alphanumeric or space, at least 3 characters for `this`, 2 characters for `attr`, at most 3 consecutive digits); see the filter sketch after this list.
- Useless information: getting and setting from `window`, calling `Array`, etc. generally mean nothing. API types (function, get, etc.) also seem useless once we consider `this` and `attr`.
  - Just track anchor APIs, and pick Function over Get for anchor APIs.
- Difficult scripts: some scripts only call a few APIs, so they are difficult to classify.
  - Do we care about every script, just big ones, or just ones that call many APIs?
- Many scripts are in the HTML, so how do we aggregate their stats over the 5 trials?
  - Aggregate multiple runs of the same scripts.
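A sketch of the name filter above, assuming each parsed VV8 record yields the receiver type (`this`) and the attribute/function name (`attr`); the field names are illustrative:

```ts
const NAME_CHARS = /^[0-9A-Za-z ]+$/; // alphanumeric or space only
const LONG_DIGIT_RUN = /\d{4,}/; // at most 3 consecutive digits allowed

export function looksLikeBrowserApi(thisType: string, attr: string): boolean {
  return (
    thisType.length >= 3 && // at least 3 characters for `this`
    attr.length >= 2 && // at least 2 characters for `attr`
    NAME_CHARS.test(thisType) &&
    NAME_CHARS.test(attr) &&
    !LONG_DIGIT_RUN.test(thisType) &&
    !LONG_DIGIT_RUN.test(attr)
  );
}
```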
## Classification heuristics
By manually inspecting the 678 most popular APIs, which make up 90% of all API calls on the top 100 sites, we spot "anchor" APIs (list in `notable_apis.md`). See the classification results in `classification_results.md`.
### Certain indicators
- Frontend processing
  - Get: `.*Event`, `Location` (some attributes), `HTML(Input|TextArea)Element.(value|checked)`
  - Function: `addEventListener`, `getBoundingClientRect`
    - These can also be used to trigger DOM element generation?
  - Set: `textContent`, and anything on `URLSearchParams`, `DOMRect`, `DOMRectReadOnly`
- DOM element generation, before interaction begins
  - Function: `createElement`, `createElementNS`, `createTextNode`, `appendChild`, `insertBefore`, `CSSStyleDeclaration.setProperty`
  - Set: `CSSStyleDeclaration`, `style`
- UX enhancement
  - Function: `removeAttribute`, `matchMedia`, `removeChild`, `requestAnimationFrame`, `cancelAnimationFrame`, `FontFaceSet.load`, `MediaQueryList.matches`
  - Set: `hidden`, `disabled`
- Extensional features
  - `Performance`, `PerformanceTiming`, `PerformanceResourceTiming`, `Navigator.sendBeacon`
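A minimal sketch of how these anchors could drive classification. The anchor table is abbreviated (the full list is in `notable_apis.md`) and the record shape is illustrative:

```ts
type Sphere =
  | 'frontend processing'
  | 'DOM element generation'
  | 'UX enhancement'
  | 'extensional features';

// Abbreviated anchor table, keyed by access type + API name.
const ANCHORS: Record<string, Sphere> = {
  'Function:addEventListener': 'frontend processing',
  'Function:getBoundingClientRect': 'frontend processing',
  'Function:createElement': 'DOM element generation',
  'Function:appendChild': 'DOM element generation',
  'Function:requestAnimationFrame': 'UX enhancement',
  'Set:hidden': 'UX enhancement',
  'Function:Navigator.sendBeacon': 'extensional features',
};

export function classifyCall(
  accessType: 'Get' | 'Set' | 'Function',
  api: string,
  beforeInteraction: boolean,
): Sphere | undefined {
  const sphere = ANCHORS[`${accessType}:${api}`];
  // DOM element generation anchors are only certain before interaction begins.
  return sphere === 'DOM element generation' && !beforeInteraction ? undefined : sphere;
}
```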
### Intermediate indicators
- `XMLHttpRequest` (and `Window.fetch`): send/fetch data to/from the server; one of:
  - Form submission, CRUD → frontend processing.
  - Auth, tracking, telemetry → extensional features.
  - Loading data onto the page → DOM element generation (but that will be detected through other API calls)?
- `SVGGraphicsElement` subclasses and canvas elements: graphics for UX enhancement, but you can render them and send SVG, so maybe DOM element generation?
- `CSSStyleRule`, `CSSRuleList`: UX enhancement or DOM element generation.
- `Window.scrollY`: UX enhancement or frontend processing.
### Uncertain indicators
- `querySelector[All]`, `getElement[s]By.*`: get a node, but then what?
- `.*Element`'s `contains`, `matches`: search for a node or string, but then what?
- `Storage`, `HTMLDocument.cookie`: local storage, but then what?
- `DOMTokenList`: store/retrieve info on a node, but then what?
- `IntersectionObserverEntry`: viewport and visibility, but then what?
- `ShadowRoot`: web components, but then what?
- `Crypto.getRandomValues`
- `frames`: iframes
## Deferred
- Would like:
  - Clean up the APIs better.
  - Separate out the 5 trials.
  - Save space: compress logs.
  - Proper logging.
  - Checkpointing and resuming.
  - Concatenate `chrome.1`, `chrome.2`, and other such logs after their preceding logs (`chrome.0`) when analyzing, to avoid unknown execution context IDs (see the sketch at the end).
- Just thoughts:
  - If the top 1000 sites yield poor results, try sampling other sites.
  - Targeted event listener tests instead of chaos testing?
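For the log concatenation item above, a sketch that groups shards of the same thread and orders them by shard index, assuming names matching `*chrome.<N>.log` (the exact naming is illustrative):

```ts
import { readdirSync } from 'node:fs';

// Group VV8 log shards by thread and order them by shard index, so execution
// context IDs defined in chrome.0 are known when chrome.1, chrome.2, ... are read.
export function groupLogShards(dir: string): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const name of readdirSync(dir)) {
    const match = name.match(/^(.*chrome)\.(\d+)\.log$/);
    if (!match) continue;
    const shards = groups.get(match[1]) ?? [];
    shards.push(name);
    groups.set(match[1], shards);
  }
  for (const shards of groups.values()) {
    shards.sort(
      (a, b) => Number(a.match(/(\d+)\.log$/)![1]) - Number(b.match(/(\d+)\.log$/)![1]),
    );
  }
  return groups;
}
```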