matchpointR turns the public pages of wtatennis.com into tidy data
frames:
| Function | Purpose |
|---|---|
| `wta_player_url()` | Build canonical player URLs. |
| `wta_get_player_basics()` | One-row tibble with bio parsed from the page's JSON-LD. |
| `wta_get_player_overview()` | Career highlights (ranks, titles, prize money). |
| `wta_get_player_matches()` | One row per match across the full career. |
| `wta_get_rankings()` | Current singles / doubles leaderboard. |

Every function that hits the network opens (and closes) its own
headless Chrome session through chromote. Where the WTA
site exposes structured schema.org JSON-LD data,
matchpointR reads from that in preference to CSS selectors,
which is substantially more resilient against site redesigns.
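
To illustrate why JSON-LD is a convenient target, here is a minimal sketch of pulling a schema.org block out of a page with rvest and jsonlite. It is not matchpointR's internal code: the HTML fragment, the player name, and the field values are hypothetical placeholders, and the real WTA pages may expose different fields.

```r
library(rvest)     # read_html(), html_element(), html_text()
library(jsonlite)  # fromJSON()

# Hypothetical page fragment for illustration; not copied from wtatennis.com.
html <- '<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Person",
 "name": "Example Player", "nationality": "Exampleland"}
</script>
</head><body><div class="css-abc123">Example Player</div></body></html>'

# Locate the JSON-LD script tag by its type attribute, which does not
# depend on the page's (frequently changing) CSS class names.
page <- read_html(html)
node <- html_element(page, "script[type='application/ld+json']")
json_text <- html_text(node)

# Parse the JSON payload into a named list.
person <- fromJSON(json_text)
person$name
person[["@type"]]
```

The selector keys on the script type rather than on styling classes, so a redesign that renames classes leaves this lookup untouched.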

```r
url <- wta_player_url(320301, "katerina-siniakova")
bio <- wta_get_player_basics(url, download_images = FALSE)
bio
```

`download_images = TRUE` (the default) additionally
downloads the headshot into a magick-image list-column;
set it to `FALSE` when you only need the metadata.

`wta_get_player_overview()` returns a long tibble with one row per metric:
`singles_rank`, `doubles_rank`,
`singles_career_titles`, `doubles_career_titles`,
`career_prize_money`, `career_high`.
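
A long table like this is easy to query by metric name or to spread into a single wide row. The sketch below uses base R on a stand-in data frame; the column names `metric` and `value` are assumptions (the package's actual column names may differ), and the values are made-up placeholders rather than real statistics.

```r
# Stand-in for an overview result. Column names `metric` and `value`
# are assumed for illustration, and every value is a made-up placeholder.
overview <- data.frame(
  metric = c("singles_rank", "doubles_rank", "singles_career_titles"),
  value  = c("11", "22", "33"),
  stringsAsFactors = FALSE
)

# Look up one metric by name.
get_metric <- function(df, name) df$value[df$metric == name]
get_metric(overview, "doubles_rank")

# Spread the metrics into one wide row, one column per metric.
wide <- as.data.frame(
  as.list(setNames(overview$value, overview$metric)),
  stringsAsFactors = FALSE
)
names(wide)
```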

```r
matches_url <- wta_player_url(320301, "katerina-siniakova", "matches")
matches <- wta_get_player_matches(matches_url)
head(matches)
```

`wta_get_player_matches()` clicks the Show more
button repeatedly until no more matches are loaded. Raise
`max_clicks` only if you hit the safety cap on a very long
career.
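
The click-until-exhausted pattern with a safety cap can be sketched in plain R as follows. This is a generic illustration, not the package's implementation: `click_until_done()`, `click_once`, the stub, and the default cap of 50 are all hypothetical names and values.

```r
# Generic "click until nothing more loads" loop with a safety cap.
# `click_once` is a hypothetical callback that returns TRUE when a click
# loaded more rows and FALSE when there is nothing left to load.
# The default cap of 50 is an arbitrary value chosen for this sketch.
click_until_done <- function(click_once, max_clicks = 50) {
  clicks <- 0
  while (clicks < max_clicks) {
    if (!isTRUE(click_once())) {
      # Nothing left to load: finished before reaching the cap.
      return(list(clicks = clicks, capped = FALSE))
    }
    clicks <- clicks + 1
  }
  # Cap reached: the result may be incomplete.
  list(clicks = clicks, capped = TRUE)
}

# Stub standing in for the browser: pretends `pages` extra pages exist.
make_stub <- function(pages) {
  remaining <- pages
  function() {
    if (remaining == 0) return(FALSE)
    remaining <<- remaining - 1
    TRUE
  }
}

done   <- click_until_done(make_stub(3))                  # stops on its own
capped <- click_until_done(make_stub(3), max_clicks = 2)  # stops at the cap
```

The cap guarantees the loop terminates even if the button never disappears, at the cost of possibly truncating a very long match history.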

chromote hits the site
as a real browser; be considerate if you are iterating over many
players.

Tour-wide statistics leaderboards (aces, winners, break-points
converted, …) are tracked in issue #1
for a future release. The WTA site reshuffled its `/stats`
hub and we are waiting for a stable URL pattern before committing to an
API.
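
When iterating over many players, one way to stay considerate is to pause between requests and keep going if a single page fails. The helper below is a minimal sketch under assumptions: `fetch_politely()` is a hypothetical name that is not part of matchpointR, and the two-second default delay is an arbitrary choice.

```r
# Hypothetical helper (not part of matchpointR): call `fetch` on each URL
# with a pause between requests. The 2-second default is arbitrary.
fetch_politely <- function(urls, fetch, delay = 2) {
  results <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    if (i > 1) Sys.sleep(delay)  # pause between requests, not before the first
    # Wrap in list() so a NULL from a failed fetch keeps its slot
    # instead of deleting the list element.
    results[i] <- list(tryCatch(fetch(urls[[i]]), error = function(e) NULL))
  }
  results
}

# Offline demonstration with a stub fetcher that fails on one input.
stub_fetch <- function(u) if (u == "b") stop("simulated failure") else toupper(u)
res <- fetch_politely(c("a", "b", "c"), stub_fetch, delay = 0)
```

With the package loaded, the same helper could be called as `fetch_politely(urls, wta_get_player_basics)`, where `urls` is a character vector built with `wta_player_url()`.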