Page view counts
wx_page_views(
  project,
  page_name,
  access_method = c("all", "desktop", "mobile web", "mobile app"),
  agent_type = c("all", "user", "bot", "spider", "automated"),
  granularity = c("daily", "monthly"),
  start_date = "20191101",
  end_date = "20191231",
  include_redirects = FALSE
)
project | The name of any Wikimedia project formatted like
|
---|---|
page_name | The title of any article in the specified project.
The function takes care of replacing spaces with underscores and
URI-encoding, so that non-URI-safe characters like %, / or ? are accepted
-- e.g. "Are You the One?" becomes "Are_You_the_One%3F". Internally this
is done with a non-exported |
access_method | If you want to filter by access method, use one of: "desktop", "mobile app", or "mobile web". If you are interested in pageviews regardless of access method, use "all" (default). |
agent_type | If you want to filter by agent type, use "user", "bot"/"spider", or "automated" (refer to wikitech:Analytics/Data Lake/Traffic/BotDetection). If you are interested in pageviews regardless of agent type, use "all" (default). |
granularity | The time unit for the response data. Supported values are "daily" (default) and "monthly". |
start_date | The date of the first day to include, in YYYYMMDD format.
Can also be a |
end_date | The date of the last day to include, in YYYYMMDD format.
Can also be a |
include_redirects | Whether to include redirects to requested pages. Currently, only article (mainspace) redirects are supported. See "Redirects" section below for more details. |
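The title normalization described for page_name can be approximated outside the package. The sketch below uses base R only; `encode_title()` is a hypothetical helper written for illustration and is not the package's own non-exported function, whose exact behavior may differ.

```r
# Hypothetical helper (not part of the package): approximate the documented
# normalization of page titles using base R only.
encode_title <- function(titles) {
  # Replace spaces with underscores
  underscored <- gsub(" ", "_", titles, fixed = TRUE)
  # Percent-encode reserved characters such as "/" and "?".
  # repeated = TRUE forces encoding even when a title already contains "%";
  # vapply() keeps this working where URLencode() is not vectorized.
  vapply(
    underscored,
    function(x) utils::URLencode(x, reserved = TRUE, repeated = TRUE),
    character(1),
    USE.NAMES = FALSE
  )
}

encode_title("Are You the One?")
#> [1] "Are_You_the_One%3F"
```

Since the function already performs this step, titles should be passed to it in their human-readable form; encoding them beforehand would likely encode them twice.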
A tibble data frame with the following columns:
project | the project |
page_name | the page_name provided by the user |
redirect_name | the name of the redirect to the page if include_redirects = TRUE; NA for the page itself |
date | Date; beginning of each month if granularity = "monthly" |
views | total number of views for the page |
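To relate the YYYYMMDD strings used for start_date and end_date to the Date values in the date column, a minimal base R sketch follows. The variable names are illustrative only, and the month-start step simply mirrors the description of monthly rows; it is not taken from the package's code.

```r
# Parse a YYYYMMDD string (the format used by start_date / end_date)
start <- as.Date("20191115", format = "%Y%m%d")

# Convert a Date back to the YYYYMMDD string form
start_string <- format(start, "%Y%m%d")
start_string
#> [1] "20191115"

# First day of the month containing `start`, matching how monthly rows
# are dated according to the description of the date column
month_start <- as.Date(format(start, "%Y-%m-01"))
month_start
#> [1] "2019-11-01"
```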
By default, include_redirects = FALSE for performance reasons. The pageviews API does not roll up view counts for redirects into the total view counts of the target page, so set include_redirects to TRUE if you want this function to automatically locate the redirects via the MediaWiki API and request their pageview counts. This makes the function much slower, especially if the number of redirects to the page(s) is high.
For example, if the user visits "2019-20 coronavirus pandemic" (with a hyphen-minus), they are redirected to the actual article "2019–20 coronavirus pandemic" (with an en dash). Visits to the redirect (the title with the hyphen-minus instead of the en dash) are not counted toward the page view count of the target article, although once the client is taken to the target page, that counts as a separate page view.
In some cases, you may want to include page views of the redirects in your total, or you may want the ability to separate the portion of traffic that comes from users arriving at a page via a redirect from users arriving at the page directly.
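Both uses can be sketched with base R on a result that has the documented columns. The data frame below is a made-up stand-in for the output of a call with include_redirects = TRUE: the titles and view counts are illustrative, not real data, and the name `via_redirect` is introduced here only for the example.

```r
# Made-up stand-in for wx_page_views(..., include_redirects = TRUE) output;
# the titles and view counts are illustrative, not real data.
pv <- data.frame(
  project = "en.wikipedia",
  page_name = "Example page",
  redirect_name = c(NA, NA, "Example redirect", "Example redirect"),
  date = as.Date(c("2020-03-01", "2020-03-02", "2020-03-01", "2020-03-02")),
  views = c(100L, 120L, 10L, 30L),
  stringsAsFactors = FALSE
)

# Rows with a non-missing redirect_name are views of a redirect
pv$via_redirect <- !is.na(pv$redirect_name)

# Total views per page and day, with redirects rolled into the target page
totals <- aggregate(views ~ page_name + date, data = pv, FUN = sum)

# Share of all views that were recorded on a redirect
redirect_share <- sum(pv$views[pv$via_redirect]) / sum(pv$views)
```

Keeping the redirect rows separate until the final aggregation preserves the option of reporting direct and redirect traffic side by side.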
Note on performance: finding and fetching view counts for redirects is considerably slower. The function has been optimized for multiple pages, since the redirects API supports up to 50 pages per call. If you need traffic for multiple pages within the same project, provide the full vector of page names in a single call rather than retrieving one page at a time, to minimize the burden on the MediaWiki API.
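To see why a single call with many pages is cheaper, consider how a vector of titles divides into batches of at most 50. The sketch below is only an illustration of that arithmetic: `chunk_titles()` is a hypothetical helper, the titles are placeholders, and the package's internal batching may be implemented differently.

```r
# Hypothetical helper: split a vector of page names into batches of at
# most `size` titles, mirroring the 50-pages-per-call limit.
chunk_titles <- function(titles, size = 50L) {
  unname(split(titles, ceiling(seq_along(titles) / size)))
}

titles <- paste("Page", 1:120)  # placeholder titles
batches <- chunk_titles(titles)
lengths(batches)
#> [1] 50 50 20
```

Here 120 titles need three redirect lookups when batched, versus 120 when requested one page at a time. In practice the whole vector would be passed to wx_page_views() in one call, leaving any batching to the function.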
Data retrieved from the API endpoint is available under the CC0 1.0 license.
Other traffic data & metrics: wx_project_views(), wx_top_viewed_pages(), wx_unique_devices()
wx_page_views(
  "en.wikipedia",
  c("New Year's Eve", "New Year's Day"),
  start_date = "20191231",
  end_date = "20200101"
)
#> # A tibble: 4 x 4
#>   project      page_name      date        views
#>   <chr>        <chr>          <date>      <int>
#> 1 en.wikipedia New Year's Day 2019-12-31  61379
#> 2 en.wikipedia New Year's Day 2020-01-01 281166
#> 3 en.wikipedia New Year's Eve 2019-12-31 202638
#> 4 en.wikipedia New Year's Eve 2020-01-01  51715

if (FALSE) {
wx_page_views(
  "en.wikipedia",
  "COVID-19 pandemic",
  start_date = "20200301",
  end_date = "20200501",
  include_redirects = TRUE
)
}