Daily and monthly counts of mediarequests for each media file stored in the wiki servers (as long as the count is higher than one).

wx_mediareqs_file(
  file_path,
  referer,
  media_type = c("all", "image", "video", "audio", "document", "other"),
  agent_type = c("all", "user", "bot"),
  granularity = c("daily", "monthly"),
  start_date = "20191201",
  end_date = "20200101"
)

Arguments

file_path

One or more paths to the file on upload.wikimedia.org, which is the file storage for all media in any wiki.
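
As a rough illustration, and assuming the full URL of a file on upload.wikimedia.org is already known, the path portion can be extracted with base R. The helper name `upload_path` is hypothetical and not part of the package; the URL is built from the path used in the Examples section.

```r
# Hypothetical helper (not part of the package): drop the scheme and host
# from an upload.wikimedia.org URL, leaving only the path.
upload_path <- function(url) {
  sub("^https?://upload\\.wikimedia\\.org", "", url)
}

upload_path(
  "https://upload.wikimedia.org/wikipedia/commons/f/fa/Hadley-wickham2016-02-04.jpg"
)
```

Input that does not start with the upload.wikimedia.org host is returned unchanged, so strings that are already paths pass through as-is.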

referer

The place that the request was made from. It can be any Wikimedia project (e.g. de.wikipedia), "all", "internal", "external", "search-engine", "unknown" or "none".

media_type

The media type that each file belongs to. It can be image, audio, video, document, or other. Default is "all" media types. See Media section below.

agent_type

To filter by agent type, use "user", "bot"/"spider", or "automated" (refer to wikitech:Analytics/Data Lake/Traffic/BotDetection). To count mediarequests regardless of agent type, use "all" (default).

granularity

The time unit for the response data. As of today, supported values are daily (default) and monthly.

start_date

The date of the first day to include, in YYYYMMDD format. Can also be a Date or a POSIXt object, which will be auto-formatted.

end_date

The date of the last day to include, in YYYYMMDD format. Can also be a Date or a POSIXt object, which will be auto-formatted.
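
The auto-formatting of Date and POSIXt values can be pictured with base R's `format()`. The helper `as_yyyymmdd` below is hypothetical and only sketches the conversion described above; how the package implements it internally may differ.

```r
# Hypothetical helper (not part of the package): normalise a date argument
# to the YYYYMMDD string format described above.
as_yyyymmdd <- function(x) {
  if (inherits(x, c("Date", "POSIXt"))) {
    # Date and POSIXt objects are formatted as year, month, day with no separators
    return(format(x, "%Y%m%d"))
  }
  # Character input is assumed to be YYYYMMDD already; only its shape is checked
  stopifnot(is.character(x), all(grepl("^[0-9]{8}$", x)))
  x
}

as_yyyymmdd(as.Date("2020-01-01"))
as_yyyymmdd("20200301")
```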

Value

A tibble data frame with the following columns:

referer

referer

file_path

media file path provided by the user

date

Date; beginning of each month if granularity = "monthly"

requests

number of requests

Example usage

The example below retrieves the request counts for File:Hadley-wickham2016-02-04.jpg, which is used as the main infobox image in the English Wikipedia article Hadley Wickham.

Media

File types are obtained by parsing the file extension, and then classified according to the following table:

Extensions                                             Media type
svg, png, tiff, tif, jpeg, jpg, gif, xcf, webp, bmp    image
mp3, ogg, oga, flac, wav, midi, mid                    audio
webm, ogv                                              video
pdf, djvu, srt, txt                                    document
(all other extensions)                                 other
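
The classification can be sketched as a lookup on the lower-cased file extension. This is an illustration built from the table, not the API's own code: `media_types` and `classify_media` are hypothetical names, and the short forms tif, jpg, and mid are assumed alongside tiff, jpeg, and midi.

```r
# Illustrative lookup from extension to media type (hypothetical names).
media_types <- c(
  svg = "image", png = "image", tiff = "image", tif = "image",
  jpeg = "image", jpg = "image", gif = "image", xcf = "image",
  webp = "image", bmp = "image",
  mp3 = "audio", ogg = "audio", oga = "audio", flac = "audio",
  wav = "audio", midi = "audio", mid = "audio",
  webm = "video", ogv = "video",
  pdf = "document", djvu = "document", srt = "document", txt = "document"
)

classify_media <- function(file_path) {
  # Parse the extension and ignore case, e.g. "JPG" and "jpg" are the same
  ext <- tolower(tools::file_ext(file_path))
  type <- unname(media_types[ext])
  # Extensions missing from the lookup (or no extension at all) are "other"
  type[is.na(type)] <- "other"
  type
}

classify_media("/wikipedia/commons/f/fa/Hadley-wickham2016-02-04.jpg")
```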

Limitations

  • Splitting and filtering by referer is only possible for data from May 2019 onward. Before that, referer is only split into internal, external, and unknown.

  • Mediarequest data begins on the 1st of January 2015.

  • Splitting and filtering by agent type (user, spider) is only possible for data from May 2019 onward.

  • About 0.7% of mediarequests are prefetches coming from Media Viewer (more details in wikitech:Analytics/AQS/Media_metrics).

License

Data retrieved from the API endpoint is available under the CC0 1.0 license.

Examples

wx_mediareqs_file(
  "/wikipedia/commons/f/fa/Hadley-wickham2016-02-04.jpg",
  "en.wikipedia",
  agent_type = "user",
  granularity = "monthly",
  start_date = "20200101",
  end_date = "20200301"
)
#> Error: Problem with `mutate()` input `project`.
#> x object 'project' not found
#> Input `project` is `project`.