Daily and monthly aggregation of hits to media files, split by referrer. A referrer is either external, internal, unknown, or any Wikimedia wiki that the resource was requested from.

wx_mediareqs_referer(
  referer,
  media_type = c("all", "image", "video", "audio", "document", "other"),
  agent_type = c("all", "user", "bot"),
  granularity = c("daily", "monthly"),
  start_date = "20191201",
  end_date = "20200101"
)

Arguments

referer

The place that the request was made from. It can be any Wikimedia project (e.g. de.wikipedia), "all", "internal", "external", "search-engine", "unknown" or "none".

media_type

The media type that each file belongs to. It can be image, audio, video, document, or other. Default is "all" media types. See Media section below.

agent_type

If you want to filter by agent type, use "user", "bot"/"spider", or "automated" (refer to wikitech:Analytics/Data Lake/Traffic/BotDetection). If you are interested in media requests regardless of agent type, use "all" (default).

granularity

The time unit for the response data. As of today, supported values are daily (default) and monthly.

start_date

The date of the first day to include, in YYYYMMDD format. Can also be a Date or a POSIXt object, which will be auto-formatted.

end_date

The date of the last day to include, in YYYYMMDD format. Can also be a Date or a POSIXt object, which will be auto-formatted.
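
To illustrate the accepted date inputs, the sketch below converts a Date or POSIXt value to the YYYYMMDD string form. The helper name to_yyyymmdd is hypothetical and not part of the package; the package's own auto-formatting may work differently.

```r
# Hypothetical helper (not part of the package): normalize a date-like
# input to the YYYYMMDD character format described above.
to_yyyymmdd <- function(x) {
  if (inherits(x, c("Date", "POSIXt"))) {
    # format() renders Date and POSIXt objects with strftime-style codes
    return(format(x, "%Y%m%d"))
  }
  # Assume character input is already in YYYYMMDD form; only check its shape
  stopifnot(is.character(x), grepl("^[0-9]{8}$", x))
  x
}

to_yyyymmdd(as.Date("2019-12-01"))
to_yyyymmdd(as.POSIXct("2020-01-01 12:00:00", tz = "UTC"))
to_yyyymmdd("20191201")
```

With a helper like this, a Date object and its YYYYMMDD string would be interchangeable as start_date or end_date.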

Value

A tibble data frame with the following columns:

project

project

date

Date; beginning of each month if granularity = "monthly"

requests

number of media file requests

Media

File types are obtained by parsing the file extension, and then classified according to the following table:

Extensions | Media type
svg, png, tif, tiff, jpg, jpeg, gif, xcf, webp, bmp | image
mp3, ogg, oga, flac, wav, mid, midi | audio
webm, ogv | video
pdf, djvu, srt, txt | document
(all other extensions) | other
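
The classification above can be sketched in a few lines of R. This is only an illustration of the lookup, not the code the API uses: the function name media_type_of is hypothetical, and the short spellings tif, jpg, and mid are assumed to be among the recognized extensions.

```r
# Hypothetical helper (not part of the package): classify file names into
# media types by extension, following the table above.
media_type_of <- function(filenames) {
  extensions <- list(
    # tif, jpg and mid are assumed short spellings of tiff, jpeg and midi
    image    = c("svg", "png", "tif", "tiff", "jpg", "jpeg", "gif", "xcf", "webp", "bmp"),
    audio    = c("mp3", "ogg", "oga", "flac", "wav", "mid", "midi"),
    video    = c("webm", "ogv"),
    document = c("pdf", "djvu", "srt", "txt")
  )
  # Build a lookup vector: names are extensions, values are media types
  lookup <- rep(names(extensions), lengths(extensions))
  names(lookup) <- unlist(extensions, use.names = FALSE)

  # Extract the extension and compare case-insensitively
  ext <- tolower(tools::file_ext(filenames))
  result <- unname(lookup[ext])

  # Anything not in the table, including files with no extension, is "other"
  result[is.na(result)] <- "other"
  result
}

media_type_of(c("Map.SVG", "clip.ogv", "song.mp3", "paper.pdf", "data.zip"))
```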

Limitations

  • Splitting and filtering by referrer is only available for data from May 2019 onward. Before that, referrer is split only into internal, external, and unknown.

  • Mediarequest data begins on 1 January 2015.

  • Splitting and filtering by agent type (user, spider) is only available for data from May 2019 onward.

  • About 0.7% of mediarequests are prefetches coming from Media Viewer (more details in wikitech:Analytics/AQS/Media_metrics).
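
The date-related limitations above can be checked before making a request. The following is a minimal sketch: the helper name check_mediareqs_range is hypothetical and not part of the package, and "May 2019" is interpreted as 1 May 2019, which is an assumption about the exact cutoff day.

```r
# Hypothetical helper (not part of the package): list which documented
# limitations apply to a start date given as a YYYYMMDD string.
check_mediareqs_range <- function(start_date) {
  start <- as.Date(start_date, format = "%Y%m%d")
  notes <- character(0)
  if (start < as.Date("2015-01-01")) {
    notes <- c(notes, "no mediarequest data before 2015-01-01")
  }
  # Assumption: the "May 2019" cutoff is taken to be 2019-05-01
  if (start < as.Date("2019-05-01")) {
    notes <- c(notes, "detailed referrer and agent type splits unavailable before May 2019")
  }
  notes
}

check_mediareqs_range("20191201")  # no limitations apply
check_mediareqs_range("20180101")  # referrer and agent type splits are limited
```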

License

Data retrieved from the API endpoint is available under the CC0 1.0 license.

Examples

wx_mediareqs_referer("de.wikipedia", granularity = "monthly")
#> Error: Problem with `mutate()` input `project`.
#> x object 'project' not found
#> Input `project` is `project`.